S3

AWS S3 and MinIO-compatible object storage connector for state proxy.

Available Variants#

Image Suffix	Consistency	Write Mode	Use Case
`s3-buffered-lww`	Last-Write-Wins	Buffered	General-purpose, small-to-medium files
`s3-buffered-cas`	Compare-And-Set	Buffered	Multi-process writes, conflict detection
`s3-passthrough`	Last-Write-Wins	Streaming	Large files, no buffering

Configuration#

Environment Variables#

Variable	Required	Description	Default
`STATE_BUCKET`	✅	S3 bucket name	—
`STATE_PREFIX`	❌	Key prefix inside bucket	`""` (root)
`AWS_REGION`	❌	AWS region	`us-east-1`
`AWS_ENDPOINT_URL`	❌	Custom endpoint for MinIO/LocalStack	—

IAM authentication: Connectors use the AWS SDK's default credential chain (IRSA, Pod Identity, instance role, or environment variables AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY).

AsyncActor Example#

apiVersion: asya.sh/v1alpha1
kind: AsyncActor
metadata:
  name: model-inference
  namespace: prod
spec:
  actor: model-inference
  stateProxy:
    - name: weights
      mount:
        path: /state/weights
      writeMode: buffered
      connector:
        image: ghcr.io/deliveryhero/asya-state-proxy-s3-buffered-lww:v1.0.0
        env:
          - name: STATE_BUCKET
            value: my-model-weights
          - name: STATE_PREFIX
            value: models/v2
          - name: AWS_REGION
            value: us-west-2
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            memory: 128Mi

IAM Permissions#

Sidecar connector permissions (via IRSA, Pod Identity, or instance role):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-model-weights/*",
        "arn:aws:s3:::my-model-weights"
      ]
    }
  ]
}

For CAS variant: no additional permissions needed — ETag-based conflict detection uses standard S3 operations.

Key Patterns#

Handler path /state/weights/model.bin maps to S3 key:

Without prefix: model.bin
With prefix (STATE_PREFIX=models/v2): models/v2/model.bin

Directory semantics are simulated: os.listdir("/state/weights/subdir/") uses prefix-based listing to return entries under subdir/.

Bucket Setup#

Pre-create the bucket before deploying actors. The connector does not create buckets.

Recommended bucket settings:

Versioning: Optional (connector does not use versions; beneficial for recovery)
Lifecycle policies: Recommended for expired data cleanup
Encryption: Recommended (SSE-S3 or SSE-KMS)
Public access: Blocked (actors use IAM credentials)

MinIO Compatibility#

S3 connectors work with MinIO via AWS_ENDPOINT_URL.

connector:
  image: ghcr.io/deliveryhero/asya-state-proxy-s3-buffered-lww:v1.0.0
  env:
    - name: STATE_BUCKET
      value: actor-state
    - name: AWS_ENDPOINT_URL
      value: http://minio.default.svc.cluster.local:9000
    - name: AWS_REGION
      value: us-east-1
    - name: AWS_ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: minio-credentials
          key: access-key
    - name: AWS_SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: minio-credentials
          key: secret-key

MinIO requirements:

Bucket must exist before deployment
Endpoint URL must be accessible from actor pods
Credentials can be static keys (not IRSA)

Consistency Guarantees#

s3-buffered-lww / s3-passthrough#

No conflict detection. Concurrent writes to the same key result in the last write winning. ETag changes are ignored.

Safe for: Single-writer per key, or when the latest write is always correct.

s3-buffered-cas#

Check-And-Set with ETag-based conflict detection. On read, the ETag is cached in memory. On write, a conditional PutObject with IfMatch: {cached_etag} is sent.

If the object was modified externally since the last read, S3 returns PreconditionFailed (412), which the connector maps to FileExistsError.

On a CAS conflict (FileExistsError), the sidecar requeues the message with exponential backoff, and the handler runs again from scratch with a fresh read. The handler does not need explicit retry logic.

Handler code:

import json

async def handler(payload):
    # Read (connector caches ETag internally)
    with open("/state/cache/result.json") as f:
        data = json.load(f)

    data["count"] += 1

    # Write (connector uses cached ETag for conditional PutObject)
    with open("/state/cache/result.json", "w") as f:
        json.dump(data, f)

    return payload

Write is unconditional for keys that have never been read (new key path).

Per the ADR on CAS safety, CAS is safe across pod crashes: a crash before write leaves no state change, a crash during write is atomic, and a crash after write but before message ack triggers a redelivery where CAS detects the version conflict.

Write Modes#

buffered (s3-buffered-lww, s3-buffered-cas)#

Collects all writes into memory (spills to disk above 4 MiB via tempfile.SpooledTemporaryFile). On close(), sends a single PutObject with Content-Length.

Advantages: Supports seek() and tell(), known Content-Length for S3 API.

Limitations: Memory overhead for large files.

passthrough (s3-passthrough)#

Opens HTTP connection immediately, streams each write() call to S3 via upload_fileobj. Does not buffer in memory.

Advantages: No memory overhead, suitable for files > 100 MB.

Limitations: Does not support seek() or tell().

Performance Characteristics#

Latency: Typical S3 GET/PUT latency (10-100 ms in same region)
Throughput: Limited by S3 API rate limits (3500 PUT/s, 5500 GET/s per prefix)
Concurrency: No connection pooling; one HTTP request per file operation
Caching: None — every open() reads from S3

Optimization: Minimize reads by caching in actor memory when possible.

Best Practices#

Use IRSA or Pod Identity for pod-level IAM permissions (avoid hardcoded keys)
Set STATE_PREFIX to namespace models by environment or version
Use lifecycle policies to delete expired data
Monitor S3 request metrics (CloudWatch or MinIO metrics)
For large files (> 100 MB), use writeMode: passthrough
Pre-warm buckets in new regions to avoid throttling on first deploy

Troubleshooting#

FileNotFoundError on read: Verify bucket name, key path, and IAM s3:GetObject permission.

PermissionError: Check IAM policy allows s3:PutObject / s3:DeleteObject on bucket ARN.

ConnectionError: Verify AWS_ENDPOINT_URL is reachable from actor pod (for MinIO).

FileExistsError with s3-buffered-cas: Another process modified the key since last read. Retry or merge changes.

Slow writes: Large payloads + buffered mode = memory spill to disk. Use passthrough for files > 100 MB.

Analytics Queries via `/query`#

s3-buffered-lww and s3-buffered-cas connectors expose POST /query for Mango-style analytics over stored objects. Intended for querying historical pipeline outputs — e.g. finding all completed inference jobs for a given model.

Request#

{
  "prefix": "runs/2024-01/",
  "filter": {"status": "done", "model": "sdxl-xl"},
  "sort": ["-created_at"],
  "limit": 100,
  "offset": 0
}

All fields are optional. filter key names must match [a-zA-Z_][a-zA-Z0-9_.]* and are compared as strings. Sort prefix - means descending.

Response#

{"rows": [{...}, ...], "total": 42}

total is the count before limit is applied (useful for paging).

Per-call limits#

All limits are overridable via environment variables on the connector container.

Limit	Env var	Default	Behaviour when exceeded
Listed objects	`QUERY_MAX_KEYS`	10 000	Returns 400 — narrow the prefix
Fetched objects	`QUERY_MAX_FETCH_KEYS`	1 000	Only the first N listed keys are downloaded
Downloaded bytes	`QUERY_MAX_FETCH_BYTES`	268 435 456 (256 MiB)	Stop after the file that crossed the threshold; remaining keys skipped with a warning
Returned rows	`QUERY_MAX_RESULT_ROWS`	10 000	Enforced as SQL `LIMIT`; use paging for more

The bytes budget fires after writing the current file so a single object larger than the budget is always returned; subsequent objects in the same call are skipped. Tighten QUERY_MAX_FETCH_BYTES for containers with limited ephemeral storage.

Helm configuration#

# deploy/helm-charts/asya-crew values.yaml
persistence:
  config:
    query:
      maxFetchBytes: "67108864"   # 64 MiB — reduce for resource-constrained pods
      maxFetchKeys: "500"
      maxKeys: "5000"
      maxResultRows: "1000"

Asya Docs

S3#

Available Variants#

Configuration#

Environment Variables#

AsyncActor Example#

IAM Permissions#

Key Patterns#

Bucket Setup#

MinIO Compatibility#

Consistency Guarantees#

s3-buffered-lww / s3-passthrough#

s3-buffered-cas#

Write Modes#

buffered (s3-buffered-lww, s3-buffered-cas)#

passthrough (s3-passthrough)#

Performance Characteristics#

Best Practices#

Troubleshooting#

Analytics Queries via `/query`#

Request#

Response#

Per-call limits#

Helm configuration#

S3#

Available Variants#

Configuration#

Environment Variables#

AsyncActor Example#

IAM Permissions#

Key Patterns#

Bucket Setup#

MinIO Compatibility#

Consistency Guarantees#

s3-buffered-lww / s3-passthrough#

s3-buffered-cas#

Write Modes#

buffered (s3-buffered-lww, s3-buffered-cas)#

passthrough (s3-passthrough)#

Performance Characteristics#

Best Practices#

Troubleshooting#

Analytics Queries via /query#

Request#

Response#

Per-call limits#

Helm configuration#

Related Documentation#

Analytics Queries via `/query`#