Production deployment on Amazon EKS.

For a full map of all secrets, service accounts, and namespaces, see Credentials Reference.

Prerequisites#

  • AWS CLI configured
  • kubectl 1.24+
  • Helm 3.0+
  • eksctl (optional, for cluster creation)
  • EKS cluster 1.24+

Required Components#

1. VPC and Networking#

Requirements:

  • VPC with public and private subnets
  • NAT gateway for private subnet internet access
  • Security groups allowing pod-to-pod communication

See: AWS VPC Best Practices

2. IAM Roles and Permissions#

EKS Pod Identity (recommended):

Crossplane AWS provider role (crossplane-provider-aws-role):

{
  "Effect": "Allow",
  "Action": [
    "sqs:CreateQueue",
    "sqs:DeleteQueue",
    "sqs:GetQueueAttributes",
    "sqs:SetQueueAttributes",
    "sqs:TagQueue",
    "sqs:GetQueueUrl"
  ],
  "Resource": "arn:aws:sqs:*:*:asya-*"
}

Actor role (asya-actor-role) - a shared IAM role for all actor sidecars. It grants access to the SQS queues and to the S3 bucket used for persisting messages. The role is assigned via IRSA (or EKS Pod Identity) to a shared asya-actors ServiceAccount in each actor namespace, so no static AWS credentials are stored in the cluster:

Note: For local development with LocalStack, IRSA is unavailable. Use a static aws-creds Secret with AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY instead. See Quickstart for the dev setup.
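For that dev setup, the Secret can be created from a manifest like the sketch below. The test/test values are conventional LocalStack placeholders, not real credentials; the Quickstart may use different values or namespaces:

apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
stringData:
  AWS_ACCESS_KEY_ID: test
  AWS_SECRET_ACCESS_KEY: test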

{
  "Effect": "Allow",
  "Action": [
    "sqs:ReceiveMessage",
    "sqs:SendMessage",
    "sqs:DeleteMessage",
    "sqs:ChangeMessageVisibility",
    "sqs:GetQueueAttributes"
  ],
  "Resource": "arn:aws:sqs:*:*:asya-*"
},
{
  "Effect": "Allow",
  "Action": [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:ListBucket"
  ],
  "Resource": [
    "arn:aws:s3:::asya-results-bucket",
    "arn:aws:s3:::asya-results-bucket/*"
  ]
}
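With IRSA, the role is bound by annotating the shared ServiceAccount in each actor namespace. A minimal sketch, assuming the account ID and role name used elsewhere in this guide:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: asya-actors
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/asya-actor-role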

KEDA role (keda-operator-role):

{
  "Effect": "Allow",
  "Action": [
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ListQueues"
  ],
  "Resource": "arn:aws:sqs:*:*:asya-*"
}
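When using EKS Pod Identity, each of the roles above also needs a trust policy allowing the Pod Identity agent to assume it. The same shape applies to all three roles:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "pods.eks.amazonaws.com"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession"
      ]
    }
  ]
}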

3. EKS Addons#

# Install Pod Identity Agent
eksctl create addon --cluster my-cluster \
  --name eks-pod-identity-agent

# Install VPC CNI
eksctl create addon --cluster my-cluster \
  --name vpc-cni --version v1.16.2

4. KEDA Operator#

# Create namespace
kubectl create namespace keda

# Add Helm repo
helm repo add kedacore https://kedacore.github.io/charts
helm repo update

# Install KEDA
helm install keda kedacore/keda \
  --namespace keda \
  --version 2.15.1

Configure Pod Identity for KEDA:

aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace keda \
  --service-account keda-operator \
  --role-arn arn:aws:iam::ACCOUNT:role/keda-operator-role

5. S3 Bucket for Results#

Create an S3 bucket for persisting result messages (bucket names are globally unique, so substitute your own):

aws s3 mb s3://asya-results-bucket --region us-east-1
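Result payloads may be sensitive, so blocking public access on the bucket is a reasonable hardening step (standard S3 practice, not specific to Asya):

aws s3api put-public-access-block \
  --bucket asya-results-bucket \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true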

Optional Components#

GPU Node Group#

For AI/ML workloads:

eksctl create nodegroup \
  --cluster my-cluster \
  --name gpu-nodes \
  --node-type g4dn.xlarge \
  --nodes-min 0 \
  --nodes-max 10 \
  --node-ami-family AmazonLinux2 \
  --node-taints nvidia.com/gpu=true:NoSchedule

Install NVIDIA Device Plugin:

kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
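Because the node group is tainted, GPU pods must tolerate the taint and request a GPU. A pod-spec sketch (the exact AsyncActor fields for this may differ; shown here as plain Kubernetes pod-level fields):

tolerations:
  - key: nvidia.com/gpu
    operator: Equal
    value: "true"
    effect: NoSchedule
resources:
  limits:
    nvidia.com/gpu: 1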

Cluster Autoscaler#

For automatic node provisioning:

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm install cluster-autoscaler autoscaler/cluster-autoscaler \
  --namespace kube-system \
  --set autoDiscovery.clusterName=my-cluster \
  --set awsRegion=us-east-1

Metrics Server#

For resource metrics:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

CloudWatch Container Insights#

For centralized logging:

eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace amazon-cloudwatch \
  --name cloudwatch-agent \
  --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
  --approve

# Install CloudWatch agent
kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml

Asya🎭 Deployment#

1. Install Crossplane#

helm repo add crossplane-stable https://charts.crossplane.io/stable
helm install crossplane crossplane-stable/crossplane \
  --namespace crossplane-system --create-namespace

Custom namespace: If you install Crossplane in a namespace other than crossplane-system (e.g. asya-system), set crossplaneNamespace in your crossplane-values.yaml (see Step 2). The chart creates the required ClusterRoleBinding automatically.

Without this, the provider cannot create Deployments in actor namespaces and AsyncActors will remain in Creating state.

2. Configure Crossplane Values#

# crossplane-values.yaml
providers:
  aws:
    enabled: true   # opt-in: disabled by default

awsRegion: us-east-1
awsAccountId: "123456789012"    # required for KEDA SQS trigger queue URLs
actorNamespace: default         # namespace where AsyncActors will be created
# crossplaneNamespace: asya-system  # set if Crossplane is not in crossplane-system

irsa:
  enabled: true     # opt-in: disabled by default
  roleArnPattern: "arn:aws:iam::123456789012:role/asya-actors-{namespace}"

awsProviderConfig:
  name: default
  credentialsSource: Secret
  secretRef:
    namespace: crossplane-system
    name: aws-creds
    key: credentials

3. Install Asya Crossplane Chart (two-step)#

Crossplane providers must become Healthy before their CRDs (including ProviderConfig) exist, so ProviderConfigs cannot be created in the same release. Install with providerConfigs.install=false first.

helm repo add asya https://asya.sh/charts
helm repo update asya

# Step 1: install providers, XRDs, and compositions (skip ProviderConfigs)
helm install asya-crossplane asya/asya-crossplane --version $ASYA_VERSION \
  -n crossplane-system \
  -f crossplane-values.yaml \
  --set providerConfigs.install=false

Wait for providers to register their CRDs:

kubectl wait --for=condition=Healthy providers --all --timeout=300s

Then enable ProviderConfigs:

# Step 2: enable ProviderConfigs (CRDs now exist)
helm upgrade asya-crossplane asya/asya-crossplane --version $ASYA_VERSION \
  -n crossplane-system \
  --reuse-values \
  --set providerConfigs.install=true

4. Install Gateway (Optional)#

# gateway-values.yaml
config:
  sqsRegion: us-east-1
  s3Bucket: asya-results-bucket
  postgresHost: postgres.default.svc.cluster.local

serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/asya-gateway-role

routes:
  tools:
  - name: example
    description: Example tool
    parameters:
      text:
        type: string
        required: true
    route: [example-actor]

helm install asya-gateway asya/asya-gateway --version $ASYA_VERSION \
  -n default \
  -f gateway-values.yaml

5. Install Crew Actors#

Crew actors are pre-defined system actors for common scenarios. For example, x-sink and x-sump are flow finalizers that persist messages to S3-compatible storage.

Suppose we want to save all messages to the bucket s3://asya-results-bucket (as noted above, bucket names must be globally unique, so substitute your own).

# crew-values.yaml
x-sink:
  enabled: true
  env:
    ASYA_PERSISTENCE_MOUNT: /state/checkpoints

x-sump:
  enabled: true
  env:
    ASYA_PERSISTENCE_MOUNT: /state/checkpoints

helm install asya-crew asya/asya-crew --version $ASYA_VERSION \
  -n default \
  -f crew-values.yaml

Note: the IRSA annotation can also be set per actor in the AsyncActor spec if needed.

6. Deploy Your Actors#

apiVersion: asya.sh/v1alpha1
kind: AsyncActor
metadata:
  name: my-actor
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/asya-actor-role
spec:
  scaling:
    minReplicaCount: 0
    maxReplicaCount: 50
  image: my-actor:v1
  handler: handler.process

kubectl apply -f my-actor.yaml

Verification#

# Check Crossplane
kubectl get pods -n crossplane-system

# Check KEDA
kubectl get pods -n keda

# Check actor
kubectl get asyncactor my-actor
kubectl get pods -l asya.sh/actor=my-actor

# Check queue created
aws sqs list-queues | grep asya-my-actor
kubectl get sqsqueue

Troubleshooting#

ProviderConfig CRD not found during install#

The chart uses a two-step install. If you see no matches for kind "ProviderConfig", install with --set providerConfigs.install=false first, wait for providers, then upgrade with providerConfigs.install=true. See Step 3 above.

AsyncActor stuck in Creating#

Check the Crossplane Object resource for RBAC errors:

kubectl get objects.kubernetes.crossplane.io -l crossplane.io/composite=$(
  kubectl get asyncactor <name> -n <ns> -o jsonpath='{.spec.resourceRef.name}'
) -o yaml | grep -A5 "message:"

If you see "deployments" is forbidden, set crossplaneNamespace in your values to match the namespace where Crossplane is installed. See the note under Step 1 above.

RabbitMQ: sidecar stuck in backoff#

Known issue (#384): if the RabbitMQ queue does not exist when the sidecar starts, the AMQP channel breaks on the first 404 and subsequent retries cannot recover. Restart the pod after the queue is created. Fix tracked in #372 and #384.

Cost Optimization#

  • Use Spot Instances for GPU nodes
  • Enable cluster autoscaler scale-to-zero
  • Use KEDA scale-to-zero (minReplicaCount: 0)
  • Set appropriate queueLength for scaling efficiency
  • Monitor SQS costs (first 1M requests free)
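For reference, queueLength is the target number of messages per replica in the KEDA SQS trigger. A sketch of what such a trigger looks like (the queue URL and target value are illustrative):

triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/asya-my-actor
      queueLength: "5"
      awsRegion: us-east-1

Lower values scale out more aggressively at higher cost; tune against your actors' per-message processing time.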

See: AWS EKS Best Practices