Debugging#
This guide covers practical steps for tracing an envelope through the Asya mesh, diagnosing failures, and identifying performance bottlenecks.
Find the envelope by trace ID#
Every envelope carries a trace_id in its headers. Use it to grep across pod
logs:
```shell
# Search all actor pods in the namespace for a trace ID
kubectl logs -n my-project -l asya.sh/actor --all-containers | grep "t-42"
```
If you know which actor the envelope should have reached:
```shell
kubectl logs -n my-project deployment/inference -c asya-sidecar | grep "t-42"
kubectl logs -n my-project deployment/inference -c asya-runtime | grep "t-42"
```
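When an envelope crosses several containers, interleaving the per-container grep results by timestamp makes the path easier to follow. A minimal offline sketch, assuming you have saved each container's logs to a list of lines and that lines start with an ISO-8601 timestamp (an assumption for illustration, not the sidecar's documented log format):

```python
def trace_timeline(container_logs, trace_id):
    """Merge log lines mentioning trace_id from all containers,
    sorted by the leading timestamp and tagged with the container."""
    hits = []
    for container, lines in container_logs.items():
        for line in lines:
            if trace_id in line:
                # Assume the first whitespace-delimited token is the timestamp.
                hits.append((line.split(" ", 1)[0], container, line))
    return [f"[{container}] {line}" for _, container, line in sorted(hits)]

# Sample data standing in for `kubectl logs ... > file` output.
logs = {
    "asya-sidecar": [
        "2024-05-01T10:00:02Z sent envelope t-42 to next queue",
        "2024-05-01T10:00:00Z received envelope t-42",
    ],
    "asya-runtime": ["2024-05-01T10:00:01Z invoked handler for t-42"],
}
for entry in trace_timeline(logs, "t-42"):
    print(entry)
```

Sorting on the timestamp string works here because ISO-8601 timestamps in the same zone sort lexicographically.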
Check the gateway task status#
If the envelope was submitted through the gateway, query the task by ID:
```shell
curl http://<gateway-api>/tasks/<task-id>
```
The response shows the current status, which actor is processing, and how many actors have completed:
```json
{
  "id": "5e6fdb2d-...",
  "status": "running",
  "current_actor_name": "inference",
  "actors_completed": 1,
  "total_actors": 3
}
```
Terminal statuses are succeeded, failed, paused, and canceled.
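For scripted debugging, the status check above can be wrapped in a small polling loop that stops once the task reaches a terminal status. A sketch using only the stdlib; the endpoint and response fields come from the example above, while the function name and polling parameters are illustrative:

```python
import json
import time
import urllib.request

# Terminal statuses as listed in this guide.
TERMINAL = {"succeeded", "failed", "paused", "canceled"}

def wait_for_task(gateway, task_id, interval=2.0, timeout=300.0):
    """Poll GET /tasks/<task-id> until the task reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{gateway}/tasks/{task_id}") as resp:
            task = json.load(resp)
        if task["status"] in TERMINAL:
            return task
        # Progress line built from the fields shown in the example response.
        print(f'{task["current_actor_name"]}: '
              f'{task["actors_completed"]}/{task["total_actors"]} actors done')
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} not terminal after {timeout}s")
```

Call it as `wait_for_task("http://<gateway-api>", "<task-id>")` with your actual gateway address and task ID.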
Inspect the runtime directly with curl#
You can bypass the sidecar and invoke the runtime's /invoke endpoint
directly from within the pod. This isolates handler issues from queue/routing
issues.
```shell
kubectl exec -n my-project deployment/inference -c asya-runtime -- \
  curl --unix-socket /var/run/asya/asya-runtime.sock \
  -X POST http://localhost/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "id": "dbg-1",
    "route": {"prev": [], "curr": "inference", "next": []},
    "payload": {"text": "test input"}
  }'
```
Interpreting responses:
| HTTP status | Meaning | Next step |
|---|---|---|
| 200 | Handler returned successfully | Inspect frames array for output |
| 204 | Handler returned None (abort) | Intentional pipeline exit |
| 400 | Malformed input | Check envelope JSON structure |
| 500 | Handler exception | Read details.traceback for the stack trace |
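When scripting repeated /invoke calls, the table above folds naturally into a lookup. A hypothetical triage helper (the function and dictionary names are illustrative, but the status-to-action mapping restates the table):

```python
# HTTP status -> suggested next step, taken from the table above.
NEXT_STEP = {
    200: "handler succeeded -- inspect the frames array for output",
    204: "handler returned None -- intentional pipeline exit",
    400: "malformed input -- check the envelope JSON structure",
    500: "handler exception -- read details.traceback for the stack trace",
}

def triage(status):
    """Map an /invoke HTTP status to the suggested debugging action."""
    return NEXT_STEP.get(status, f"unexpected status {status} -- check sidecar logs")
```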
Check handler readiness:
```shell
kubectl exec -n my-project deployment/inference -c asya-runtime -- \
  curl --unix-socket /var/run/asya/asya-runtime.sock http://localhost/healthz
```
See Sidecar-Runtime Protocol for the full endpoint specification.
Check x-sink and x-sump#
x-sink (successful completions)#
Envelopes that finish the pipeline land in x-sink. Check its logs for the
final payload:
```shell
kubectl logs -n my-project deployment/x-sink -c asya-runtime --tail=50
```
x-sump (errors)#
Envelopes that raised an exception or timed out are routed to x-sump. The
error details (type, message, traceback) are included in the envelope:
```shell
kubectl logs -n my-project deployment/x-sump -c asya-runtime --tail=50
```
A growing x-sump queue signals systematic handler failures. Monitor its
queue depth:
```
keda_scaler_metrics_value{scaledObject=~".*x-sump.*"}
```
Identify bottlenecks with Prometheus metrics#
The sidecar exposes Prometheus metrics on :8080/metrics. Use these queries
to find where envelopes are slow or failing.
Throughput per actor#
```
rate(asya_actor_messages_processed_total{queue="asya-my-project-inference"}[5m])
```
P95 processing latency (total)#
Includes queue receive, runtime execution, and queue send:
```
histogram_quantile(0.95,
  rate(asya_actor_processing_duration_seconds_bucket{queue="asya-my-project-inference"}[5m])
)
```
P95 runtime latency (handler only)#
Isolates handler execution time from infrastructure overhead:
```
histogram_quantile(0.95,
  rate(asya_actor_runtime_execution_duration_seconds_bucket{queue="asya-my-project-inference"}[5m])
)
```
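To make the p95 numbers easier to interpret, here is what histogram_quantile computes: it finds the bucket containing the target rank and linearly interpolates inside it. A simplified plain-Python rendition (the bucket bounds and counts are made-up sample data, and real Prometheus histograms use le-labeled cumulative buckets including +Inf, which this sketch glosses over):

```python
def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative histogram buckets.

    buckets: sorted list of (upper_bound_seconds, cumulative_count).
    Linearly interpolates within the bucket holding the target rank,
    as Prometheus does.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            frac = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * frac
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Sample cumulative buckets: 40 requests under 0.1s, 70 under 0.5s, ...
buckets = [(0.1, 40), (0.5, 70), (1.0, 90), (2.5, 100)]
print(histogram_quantile(0.95, buckets))  # 1.75
```

Because the result is interpolated, p95 accuracy depends on bucket boundaries: with the sample buckets above, any true p95 between 1.0s and 2.5s maps to a value interpolated inside that one wide bucket.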
Error rate by reason#
```
sum by (reason) (
  rate(asya_actor_messages_failed_total{queue="asya-my-project-inference"}[5m])
)
```
Reasons include: parse_error, runtime_error, transport_error,
validation_error, route_mismatch.
Queue depth#
```
keda_scaler_metrics_value{scaledObject="inference"}
```
A high queue depth with max replicas running suggests the actor is under-provisioned.
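A quick way to sanity-check provisioning is a Little's-law-style estimate: the replicas needed so that aggregate service rate at least matches the arrival rate. A back-of-the-envelope sketch with assumed numbers (the function name, rates, and per-replica concurrency are illustrative, not Asya parameters):

```python
import math

def required_replicas(arrival_rate_per_s, p95_latency_s, per_replica_concurrency=1):
    """Replicas needed so service rate >= arrival rate.

    Each replica drains roughly per_replica_concurrency / latency
    envelopes per second; divide the arrival rate by that and round up.
    """
    per_replica_rate = per_replica_concurrency / p95_latency_s
    return math.ceil(arrival_rate_per_s / per_replica_rate)

# 20 envelopes/s arriving, ~0.8s p95 per envelope, one at a time per replica.
print(required_replicas(arrival_rate_per_s=20, p95_latency_s=0.8))  # 16
```

If the result exceeds the ScaledObject's max replica count, the queue will grow no matter how well KEDA scales; either raise the ceiling or reduce handler latency.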
See Monitoring for the full metrics catalog, ServiceMonitor configuration, and alerting rules.
Common failure patterns#
Envelope stuck in "running"#
- Check the actor's sidecar logs for errors
- Check if the runtime timed out (look for context.DeadlineExceeded)
- Verify the actor pod is healthy:

```shell
kubectl get pods -n my-project -l asya.sh/actor=inference
```
Envelope landed in x-sump#
- Read the x-sump logs for the error type and traceback
- Reproduce the failure by curling /invoke with the same payload
- Fix the handler and redeploy
Envelope disappeared#
- Check if the actor returned None (routes to x-sink, not x-sump)
- Check if the SLA deadline expired (routes to x-sink with phase=failed, reason=Timeout)
- Verify queue connectivity: check sidecar logs for transport errors
Timeout crashes#
When the runtime exceeds ASYA_RESILIENCY_ACTOR_TIMEOUT, the sidecar sends
the envelope to x-sump and crashes the pod. Look for:
```shell
kubectl logs -n my-project deployment/inference -c asya-sidecar --previous | grep "deadline exceeded"
```
The pod restarts automatically. To prevent repeated crashes, either increase the timeout or optimize the handler.
Next steps#
- Monitoring -- Prometheus alerting and Grafana dashboards
- Sidecar-Runtime Protocol -- protocol details
- Troubleshooting -- common operational issues