S3#
AWS S3 and MinIO-compatible object storage connector for state proxy.
Available Variants#
| Image Suffix | Consistency | Write Mode | Use Case |
|---|---|---|---|
s3-buffered-lww |
Last-Write-Wins | Buffered | General-purpose, small-to-medium files |
s3-buffered-cas |
Compare-And-Set | Buffered | Multi-process writes, conflict detection |
s3-passthrough |
Last-Write-Wins | Streaming | Large files, no buffering |
Configuration#
Environment Variables#
| Variable | Required | Description | Default |
|---|---|---|---|
STATE_BUCKET |
✅ | S3 bucket name | — |
STATE_PREFIX |
❌ | Key prefix inside bucket | "" (root) |
AWS_REGION |
❌ | AWS region | us-east-1 |
AWS_ENDPOINT_URL |
❌ | Custom endpoint for MinIO/LocalStack | — |
IAM authentication: Connectors use the AWS SDK's default credential chain (IRSA, Pod Identity, instance role, or environment variables AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).
AsyncActor Example#
apiVersion: asya.sh/v1alpha1
kind: AsyncActor
metadata:
name: model-inference
namespace: prod
spec:
actor: model-inference
stateProxy:
- name: weights
mount:
path: /state/weights
writeMode: buffered
connector:
image: ghcr.io/deliveryhero/asya-state-proxy-s3-buffered-lww:v1.0.0
env:
- name: STATE_BUCKET
value: my-model-weights
- name: STATE_PREFIX
value: models/v2
- name: AWS_REGION
value: us-west-2
resources:
requests:
cpu: 50m
memory: 64Mi
limits:
memory: 128Mi
IAM Permissions#
Sidecar connector permissions (via IRSA, Pod Identity, or instance role):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket"
],
"Resource": [
"arn:aws:s3:::my-model-weights/*",
"arn:aws:s3:::my-model-weights"
]
}
]
}
For CAS variant: no additional permissions needed — ETag-based conflict detection uses standard S3 operations.
Key Patterns#
Handler path /state/weights/model.bin maps to S3 key:
- Without prefix:
model.bin - With prefix (
STATE_PREFIX=models/v2):models/v2/model.bin
Directory semantics are simulated: os.listdir("/state/weights/subdir/") uses prefix-based listing to return entries under subdir/.
Bucket Setup#
Pre-create the bucket before deploying actors. The connector does not create buckets.
Recommended bucket settings:
- Versioning: Optional (connector does not use versions; beneficial for recovery)
- Lifecycle policies: Recommended for expired data cleanup
- Encryption: Recommended (SSE-S3 or SSE-KMS)
- Public access: Blocked (actors use IAM credentials)
MinIO Compatibility#
S3 connectors work with MinIO via AWS_ENDPOINT_URL.
connector:
image: ghcr.io/deliveryhero/asya-state-proxy-s3-buffered-lww:v1.0.0
env:
- name: STATE_BUCKET
value: actor-state
- name: AWS_ENDPOINT_URL
value: http://minio.default.svc.cluster.local:9000
- name: AWS_REGION
value: us-east-1
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: minio-credentials
key: access-key
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: minio-credentials
key: secret-key
MinIO requirements:
- Bucket must exist before deployment
- Endpoint URL must be accessible from actor pods
- Credentials can be static keys (not IRSA)
Consistency Guarantees#
s3-buffered-lww / s3-passthrough#
No conflict detection. Concurrent writes to the same key result in the last write winning. ETag changes are ignored.
Safe for: Single-writer per key, or when the latest write is always correct.
s3-buffered-cas#
Check-And-Set with ETag-based conflict detection. On read, the ETag is cached in memory. On write, a conditional PutObject with IfMatch: {cached_etag} is sent.
If the object was modified externally since the last read, S3 returns PreconditionFailed (412), which the connector maps to FileExistsError.
On a CAS conflict (FileExistsError), the sidecar requeues the message with
exponential backoff, and the handler runs again from scratch with a fresh read.
The handler does not need explicit retry logic.
Handler code:
import json
async def handler(payload):
# Read (connector caches ETag internally)
with open("/state/cache/result.json") as f:
data = json.load(f)
data["count"] += 1
# Write (connector uses cached ETag for conditional PutObject)
with open("/state/cache/result.json", "w") as f:
json.dump(data, f)
return payload
Write is unconditional for keys that have never been read (new key path).
Per the ADR on CAS safety, CAS is safe across pod crashes: a crash before write leaves no state change, a crash during write is atomic, and a crash after write but before message ack triggers a redelivery where CAS detects the version conflict.
Write Modes#
buffered (s3-buffered-lww, s3-buffered-cas)#
Collects all writes into memory (spills to disk above 4 MiB via tempfile.SpooledTemporaryFile). On close(), sends a single PutObject with Content-Length.
Advantages: Supports seek() and tell(), known Content-Length for S3 API.
Limitations: Memory overhead for large files.
passthrough (s3-passthrough)#
Opens HTTP connection immediately, streams each write() call to S3 via upload_fileobj. Does not buffer in memory.
Advantages: No memory overhead, suitable for files > 100 MB.
Limitations: Does not support seek() or tell().
Performance Characteristics#
- Latency: Typical S3 GET/PUT latency (10-100 ms in same region)
- Throughput: Limited by S3 API rate limits (3500 PUT/s, 5500 GET/s per prefix)
- Concurrency: No connection pooling; one HTTP request per file operation
- Caching: None — every
open()reads from S3
Optimization: Minimize reads by caching in actor memory when possible.
Best Practices#
- Use IRSA or Pod Identity for pod-level IAM permissions (avoid hardcoded keys)
- Set
STATE_PREFIXto namespace models by environment or version - Use lifecycle policies to delete expired data
- Monitor S3 request metrics (CloudWatch or MinIO metrics)
- For large files (> 100 MB), use
writeMode: passthrough - Pre-warm buckets in new regions to avoid throttling on first deploy
Troubleshooting#
FileNotFoundError on read: Verify bucket name, key path, and IAM s3:GetObject permission.
PermissionError: Check IAM policy allows s3:PutObject / s3:DeleteObject on bucket ARN.
ConnectionError: Verify AWS_ENDPOINT_URL is reachable from actor pod (for MinIO).
FileExistsError with s3-buffered-cas: Another process modified the key since last read. Retry or merge changes.
Slow writes: Large payloads + buffered mode = memory spill to disk. Use passthrough for files > 100 MB.
Analytics Queries via /query#
s3-buffered-lww and s3-buffered-cas connectors expose POST /query for
Mango-style analytics over stored objects. Intended for querying historical
pipeline outputs — e.g. finding all completed inference jobs for a given model.
Request#
{
"prefix": "runs/2024-01/",
"filter": {"status": "done", "model": "sdxl-xl"},
"sort": ["-created_at"],
"limit": 100,
"offset": 0
}
All fields are optional. filter key names must match [a-zA-Z_][a-zA-Z0-9_.]*
and are compared as strings. Sort prefix - means descending.
Response#
{"rows": [{...}, ...], "total": 42}
total is the count before limit is applied (useful for paging).
Per-call limits#
All limits are overridable via environment variables on the connector container.
| Limit | Env var | Default | Behaviour when exceeded |
|---|---|---|---|
| Listed objects | QUERY_MAX_KEYS |
10 000 | Returns 400 — narrow the prefix |
| Fetched objects | QUERY_MAX_FETCH_KEYS |
1 000 | Only the first N listed keys are downloaded |
| Downloaded bytes | QUERY_MAX_FETCH_BYTES |
268 435 456 (256 MiB) | Stop after the file that crossed the threshold; remaining keys skipped with a warning |
| Returned rows | QUERY_MAX_RESULT_ROWS |
10 000 | Enforced as SQL LIMIT; use paging for more |
The bytes budget fires after writing the current file so a single object larger
than the budget is always returned; subsequent objects in the same call are skipped.
Tighten QUERY_MAX_FETCH_BYTES for containers with limited ephemeral storage.
Helm configuration#
# deploy/helm-charts/asya-crew values.yaml
persistence:
config:
query:
maxFetchBytes: "67108864" # 64 MiB — reduce for resource-constrained pods
maxFetchKeys: "500"
maxKeys: "5000"
maxResultRows: "1000"