Basics:
- botocore is the low-level interface that practically all the other clients build on (see the sketch after the Async list).
- boto3 is the official AWS SDK for Python, a higher-level client built on top of botocore.
- minio is a standalone alternative to botocore/boto3; as of now it does not natively support asynchronous operations.
Async:
- aiobotocore adds async support to botocore.
- aioboto3 is a boto3-like wrapper around aiobotocore.
- s3fs is a filesystem-style wrapper around aiobotocore.
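For completeness, a minimal sketch of using botocore directly; the endpoint and credential values are placeholders, as in the examples below:
import botocore.session
# Create a low-level client against a custom endpoint
session = botocore.session.get_session()
client = session.create_client(
    "s3",
    endpoint_url="https://custom-endpoint.com",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
)
# Low-level calls map one-to-one onto the S3 API
client.put_object(Bucket="bucket-name", Key="destination/path/file.txt", Body=b"data")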
Boto3
import boto3
from botocore.config import Config
# Initialize the S3 client with a custom endpoint
s3 = boto3.client(
    "s3",
    endpoint_url="https://custom-endpoint.com",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
    config=Config(signature_version="s3v4"),
)
# Upload a file
s3.upload_file("local/path/file.txt", "bucket-name", "destination/path/file.txt")
# Download a file
s3.download_file("bucket-name", "source/path/file.txt", "local/path/file.txt")
Minio
from minio import Minio
# Initialize the Minio client with a custom endpoint
client = Minio(
    "custom-endpoint.com",
    access_key="your-access-key",
    secret_key="your-secret-key",
    secure=True,  # Set to False if not using HTTPS
)
# Upload a file
client.fput_object("bucket-name", "destination/path/file.txt", "local/path/file.txt")
# Download a file
client.fget_object("bucket-name", "source/path/file.txt", "local/path/file.txt")
aioboto3
import asyncio
import aioboto3
# Create an async session with credentials
session = aioboto3.Session(
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
)
async def main():
    # Open a client against the custom endpoint and upload a file
    async with session.client("s3", endpoint_url="https://custom-endpoint.com") as s3:
        with open("local/path/file.txt", "rb") as file:
            await s3.upload_fileobj(file, "bucket-name", "destination/path/file.txt")
asyncio.run(main())
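The download direction is symmetric; a short sketch reusing the same session and placeholder names:
async def download():
    async with session.client("s3", endpoint_url="https://custom-endpoint.com") as s3:
        with open("local/path/file.txt", "wb") as file:
            await s3.download_fileobj("bucket-name", "destination/path/file.txt", file)
asyncio.run(download())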
S3fs
This looks the cleanest:
import s3fs
s3_file = s3fs.S3FileSystem(
    key="your-access-key",
    secret="your-secret-key",
    endpoint_url="https://custom-endpoint.com",
)
s3_file.put("local/path/file.txt", "bucket-name/destination/path/file.txt")
And that’s not all: it also works fully recursively, in batches:
s3_file.put("local/path/", "bucket-name/destination/path/", recursive=True, batch_size=5)
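Downloads mirror this with get; a sketch using the same placeholder paths:
s3_file.get("bucket-name/destination/path/", "local/path/", recursive=True)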
Dask
Dask uses s3fs under the hood:
import dask.dataframe as dd
storage_options = {
    "key": "your-access-key",
    "secret": "your-secret-key",
    "endpoint_url": "https://custom-endpoint.com",
}
ddf = dd.read_csv("s3://bucket-name/test.csv", storage_options=storage_options)
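Writing back out works the same way; a sketch assuming the same bucket and storage_options (the "*" is replaced by the partition index):
ddf.to_csv("s3://bucket-name/output/*.csv", storage_options=storage_options)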