Basics:

  • botocore is the low-level interface that practically all other clients build on (a minimal sketch follows this list).
  • boto3 is the official AWS SDK for Python, built on top of botocore.
  • minio is a standalone alternative to botocore/boto3; as of now it does not natively support asynchronous operations.
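
A minimal sketch of talking to S3 through botocore directly, using the same placeholder endpoint and credentials as the examples below:

import botocore.session

# Create a low-level S3 client straight from botocore (boto3 does this for you)
session = botocore.session.get_session()
client = session.create_client(
    "s3",
    endpoint_url="https://custom-endpoint.com",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
)

# Upload a small payload; bucket and key are placeholders
client.put_object(Bucket="bucket-name", Key="destination/path/file.txt", Body=b"hello")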

Async: boto3 and minio are synchronous; aioboto3 (below) provides an asynchronous interface on top of boto3.

Boto3

import boto3
from botocore.config import Config

# Initialize the S3 client with a custom endpoint
s3 = boto3.client(
    "s3",
    endpoint_url="https://custom-endpoint.com",
    aws_access_key_id="your-access-key",
    aws_secret_access_key="your-secret-key",
    config=Config(signature_version="s3v4"),
)

# Upload a file
s3.upload_file("local/path/file.txt", "bucket-name", "destination/path/file.txt")

# Download a file
s3.download_file("bucket-name", "source/path/file.txt", "local/path/file.txt")

Minio

from minio import Minio

# Initialize the Minio client with a custom endpoint
client = Minio(
    "custom-endpoint.com",
    access_key="your-access-key",
    secret_key="your-secret-key",
    secure=True,  # Set to False if not using HTTPS
)

# Upload a file
client.fput_object("bucket-name", "destination/path/file.txt", "local/path/file.txt")

# Download a file
client.fget_object("bucket-name", "source/path/file.txt", "local/path/file.txt")

aioboto3

import aioboto3


async def upload(file_path, bucket_name, object_name):
    # Create an async session and client
    session = aioboto3.Session(
        aws_access_key_id="your-access-key",
        aws_secret_access_key="your-secret-key",
    )
    async with session.client(
        "s3",
        endpoint_url="https://custom-endpoint.com",
    ) as s3:
        # Upload the file
        with open(file_path, "rb") as file:
            await s3.upload_fileobj(file, bucket_name, object_name)
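
Since await only runs inside a coroutine, the upload has to be driven by an event loop. A minimal sketch, assuming the upload function above and placeholder paths:

import asyncio

# Run the coroutine defined above; bucket and paths are placeholders
asyncio.run(upload("local/path/file.txt", "bucket-name", "destination/path/file.txt"))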

S3fs

It looks the cleanest:

import s3fs

# Initialize the filesystem with a custom endpoint
s3_file = s3fs.S3FileSystem(
    key="your-access-key",
    secret="your-secret-key",
    endpoint_url="https://custom-endpoint.com",
)

# Upload a file
s3_file.put("local/path/file.txt", "bucket-name/destination/path/file.txt")

And that’s not all: it also works recursively, in batches:

s3_file.put("local/path/", "bucket-name/destination/path/", recursive=True, batch_size=5)
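
Downloading is symmetric; a quick sketch, assuming the same s3_file instance and placeholder paths:

# get mirrors put, including recursive=True for whole prefixes
s3_file.get("bucket-name/source/path/file.txt", "local/path/file.txt")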

Dask

Dask uses s3fs under the hood, so the same credentials are passed through storage_options:

import dask.dataframe as dd

storage_options = {
    "key": access_key,
    "secret": access_secret,
    "endpoint_url": endpoint_url,
}
ddf = dd.read_csv(f"s3://{bucket_name}/test.csv", storage_options=storage_options)
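
Writing back goes through the same machinery. A minimal sketch, assuming the same storage_options, a Parquet engine (e.g. pyarrow) installed, and a placeholder output prefix:

# Write the dataframe back to S3 as Parquet, reusing the same credentials
ddf.to_parquet(f"s3://{bucket_name}/output/", storage_options=storage_options)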