TL;DR

Temporal is a centralized workflow engine that replays deterministic functions to recover state. Asya is a decentralized actor mesh where stateless handlers communicate through durable message queues. Temporal gives you effectively-once workflow execution and complex state machines; Asya gives you independent per-actor scaling, scale-to-zero, and zero SDK lock-in.

At a Glance

| | Asya 🎭 | Temporal |
|---|---|---|
| One-liner | Actor mesh on Kubernetes | Durable execution engine |
| Execution model | Choreography: message carries the route | Orchestration: server replays workflow history |
| Handler UX | Pure dict -> dict Python function | SDK-wrapped activity/workflow with decorators |
| Scaling | Per-actor via KEDA (queue depth) | Per-task-queue worker pools |
| Scale to zero | 🟢 Native (KEDA scales pods 0-N) | 🔴 Workers must stay running to poll |
| Failure isolation | ✅ Per-actor: crashed actor affects only its queue | ⚠️ Per-worker: crashed worker affects all its workflows |
| State management | Stateless handlers; state travels in the envelope | Workflow state rebuilt via event-sourced replay |
| K8s native | 🟢 CRDs, Helm, Crossplane, GitOps | 🟡 Runs on K8s but not K8s-native (no CRDs) |
| Conceptual simplicity | One abstraction: actor | Workflows, activities, workers, task queues, signals, queries |
| Dynamic routing | ✅ Actors rewrite route.next at runtime | ✅ Workflow code uses conditionals and signals |
| SDK requirement | ✅ None (plain Python functions) | 🔴 Required (Go, Java, Python, TypeScript, .NET) |
| Transport | SQS, RabbitMQ, GCP Pub/Sub | Temporal Server (Cassandra/MySQL/PostgreSQL) |
| Agentic support | ✅ A2A, MCP, pause/resume, streaming | ⚠️ Via workflow signals and queries |
| Maturity | 🟡 Alpha (production at Delivery Hero) | 🟢 Mature (GA since 2020, thousands of production users) |

Architecture Comparison

Temporal

Temporal runs a server cluster (frontend, history, matching, worker services) backed by Cassandra or a SQL database. Application code is split into workflows (deterministic orchestration logic) and activities (side effects). Workers are long-running processes that poll task queues. When a worker crashes, Temporal replays the workflow's event history on a new worker -- this requires workflow code to be deterministic (no random numbers, no wall-clock reads, no direct I/O inside the workflow function).
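To make the replay requirement concrete, here is a toy sketch of event-sourced recovery in plain Python. It is not Temporal's implementation and uses none of its SDK; the `Replayer` class and the `roll_dice`, `double`, and `workflow` functions are hypothetical names invented for the illustration. It only shows the idea that recorded activity results are reused on replay, which is why the orchestration code itself must make the same decisions every time.

```python
import random


class Replayer:
    """Toy event-sourced executor: records activity results, reuses them on replay."""

    def __init__(self, history=None):
        self.history = list(history or [])  # recorded activity results, in order
        self.cursor = 0                     # next history entry to consume
        self.executed = []                  # activities actually run by this instance

    def execute_activity(self, fn, arg):
        if self.cursor < len(self.history):
            # Replay: reuse the recorded result instead of re-running the side effect.
            result = self.history[self.cursor]
        else:
            # First execution: run the activity and append its result to the history.
            result = fn(arg)
            self.history.append(result)
            self.executed.append(fn.__name__)
        self.cursor += 1
        return result


def roll_dice(_):
    # Non-deterministic work belongs in an activity, not in workflow code.
    return random.randint(1, 6)


def double(x):
    return x * 2


def workflow(ctx):
    # Deterministic orchestration: the same history always yields the same decisions.
    n = ctx.execute_activity(roll_dice, None)
    return ctx.execute_activity(double, n)


first = Replayer()
result = workflow(first)

# Simulate a crash after the first activity: a new worker receives the partial history.
recovered = Replayer(history=first.history[:1])
replayed = workflow(recovered)
```

The recovered run skips `roll_dice`, reads its recorded value, and only executes `double`, so it reaches the same result as the original run. If `workflow` itself called `random.randint`, the replay could diverge from the recorded history.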

Asya

Each step is an independent actor pod on Kubernetes. A Go sidecar handles queue I/O, retries, and routing. The handler is a plain Python function. Messages (envelopes) carry their own route -- no central coordinator needed. Each actor scales independently via KEDA based on its own queue depth.
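The choreography described above can be sketched with in-memory queues. This is an illustration only: the real sidecar is written in Go and talks to a message broker, and the envelope layout used here (a `payload` dict plus a `route.next` list of remaining actor names) is an assumption, as are the `sidecar_step` helper and the `enrich` actor.

```python
from collections import deque


# Handlers are plain dict -> dict functions, as in the actor model described above.
def validate(state):
    return {**state, "validated": True}


def enrich(state):
    return {**state, "enriched": True}


HANDLERS = {"validate": validate, "enrich": enrich}  # actor name -> handler
QUEUES = {name: deque() for name in HANDLERS}         # one queue per actor
DONE = []                                             # envelopes whose route is exhausted


def sidecar_step(actor):
    """Pop one envelope from the actor's queue, run its handler, forward by route."""
    envelope = QUEUES[actor].popleft()
    payload = HANDLERS[actor](envelope["payload"])
    remaining = envelope["route"]["next"]
    forwarded = {"payload": payload, "route": {"next": remaining[1:]}}
    if remaining:
        QUEUES[remaining[0]].append(forwarded)  # hand off to the next actor's queue
    else:
        DONE.append(forwarded)                  # no hops left: the pipeline is finished


# The envelope names its remaining hops; no coordinator tracks progress.
QUEUES["validate"].append({"payload": {"prompt": "hi"}, "route": {"next": ["enrich"]}})
sidecar_step("validate")
sidecar_step("enrich")
```

Each actor only knows its own queue and handler. The order of steps lives in the message, so adding or removing a step changes the route, not the actors.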

Developer Experience

Consider a 3-step pipeline: validate input, call an LLM, store the result. Each step retries up to 3 times with exponential backoff.
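As a quick check on what "up to 3 times with exponential backoff" means in wait time, the sketch below computes the delay before each retry. It assumes the attempt count includes the first try, a backoff coefficient of 2.0, an initial interval of 1 second, and a 60-second cap, matching the values used in the examples that follow; `backoff_delays` is a hypothetical helper, not part of either system.

```python
def backoff_delays(max_attempts, initial, maximum, coefficient=2.0):
    """Return the wait in seconds before each retry; the first attempt has no wait."""
    delays = []
    for retry in range(max_attempts - 1):
        # Each retry waits `coefficient` times longer than the last, up to the cap.
        delays.append(min(initial * coefficient ** retry, maximum))
    return delays


print(backoff_delays(3, 1.0, 60.0))   # [1.0, 2.0]
print(backoff_delays(10, 1.0, 60.0))  # the 60-second cap applies from the seventh retry on
```

With three attempts the cap never comes into play; it only matters for policies that allow many more retries.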

Temporal

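# activities.py -- each step needs @activity.defn
from temporalio import activity

@activity.defn
async def validate(data: dict) -> dict: ...

@activity.defn
async def call_llm(data: dict) -> dict: ...

@activity.defn
async def store_result(data: dict) -> dict: ...

# workflow.py -- orchestration logic, must be deterministic
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy

# Import activity code through the workflow sandbox pass-through
with workflow.unsafe.imports_passed_through():
    from activities import validate, call_llm, store_result

@workflow.defn
class PipelineWorkflow:
    @workflow.run
    async def run(self, data: dict) -> dict:
        # RetryPolicy lives in temporalio.common, not temporalio.workflow
        retry = RetryPolicy(maximum_attempts=3,
            initial_interval=timedelta(seconds=1),
            maximum_interval=timedelta(seconds=60),
            backoff_coefficient=2.0)
        for act in [validate, call_llm, store_result]:
            data = await workflow.execute_activity(
                act, data,
                start_to_close_timeout=timedelta(seconds=300),
                retry_policy=retry)
        return data

# worker.py -- long-running process that polls Temporal server
import asyncio

from temporalio.client import Client
from temporalio.worker import Worker

from activities import validate, call_llm, store_result
from workflow import PipelineWorkflow

async def main():
    client = await Client.connect("temporal:7233")
    worker = Worker(client, task_queue="pipeline-queue",
        workflows=[PipelineWorkflow],
        activities=[validate, call_llm, store_result])
    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())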

Three files, three concepts (activity, workflow, worker), SDK decorators throughout, and a running Temporal server cluster.

Asya 🎭

# handler.py -- each actor gets one of these
# `model` and `db` stand in for your own LLM client and storage layer.
def validate(state: dict) -> dict:
    return {**state, "validated": True}

def call_llm(state: dict) -> dict:
    state["response"] = model.generate(state["prompt"])
    return state

def store_result(state: dict) -> dict:
    db.save(state)
    return state

# asyncactor.yaml -- one per actor, or use flavors for shared defaults
apiVersion: asya.sh/v1alpha1
kind: AsyncActor
metadata:
  name: call-llm
spec:
  image: my-pipeline:latest
  handler: handler.call_llm
  scaling:
    minReplicaCount: 0
    maxReplicaCount: 10
  resiliency:
    actorTimeout: 300s
    policies:
      default:
        maxAttempts: 3
        backoff: exponential
        initialInterval: 1s
        maxInterval: 60s

Plain Python functions, no SDK, no decorators. Retry policies, timeouts, and scaling live in the Kubernetes manifest, not in application code. The route connecting the three actors is carried by the envelope itself.
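To illustrate how the scaling bounds in the manifest interact with queue depth, here is a rough sketch of queue-length autoscaling arithmetic. It follows the general shape of the Kubernetes autoscaling rule that KEDA feeds (backlog divided by a per-replica target, rounded up, then clamped), but it is a simplification: `desired_replicas` and the `target_per_replica` value are assumptions, and the manifest above does not show where such a target would be configured.

```python
import math


def desired_replicas(queue_depth, target_per_replica, min_replicas=0, max_replicas=10):
    """Queue-depth scaling: enough replicas to cover the backlog, within the bounds."""
    wanted = math.ceil(queue_depth / target_per_replica)
    # Clamp to the manifest's minReplicaCount / maxReplicaCount.
    return max(min_replicas, min(wanted, max_replicas))


for depth in (0, 1, 25, 500):
    print(depth, desired_replicas(depth, target_per_replica=5))
```

With a target of 5 messages per replica, an empty queue yields 0 replicas, a single message wakes 1 replica, 25 messages yield 5, and 500 messages hit the ceiling of 10.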

When to Choose Temporal

Temporal is a mature, battle-tested platform. It is the stronger choice when you need:

  • Complex state machines -- workflows with branching, compensation (sagas), and long-running human approvals that span days or weeks. Temporal's replay model excels at resuming exactly where it left off.
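  • Exactly-once semantics -- Temporal's deterministic replay advances each workflow step effectively once, a stronger guarantee than at-least-once queue-based systems provide. Activities can still be retried after a failure or timeout, so their side effects should be idempotent.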
  • Timer-based scheduling -- cron workflows, delayed execution, and durable timers are first-class features in Temporal.
  • Multi-language orchestration -- Temporal has official SDKs for Go, Java, Python, TypeScript, and .NET, with consistent behavior across languages.
  • Mature observability -- Temporal's Web UI provides workflow-level visibility, searchable execution history, and debugging tools out of the box.
  • Non-Kubernetes environments -- Temporal runs anywhere; Asya requires Kubernetes.
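The exactly-once point in the list above has a practical consequence for queue-based handlers: under at-least-once delivery the same message can arrive twice, so a handler with side effects usually needs to be idempotent. The sketch below shows one common approach, deduplicating on a message key. The `message_id` field and the in-memory `processed` dict are assumptions made for the illustration; a real deployment would use a durable store.

```python
processed = {}  # stands in for a durable store keyed by message id
writes = []     # records each time the side effect actually happens


def store_result(state):
    """Idempotent handler: a redelivered message returns the earlier result."""
    key = state["message_id"]  # hypothetical field carrying a unique message id
    if key not in processed:
        writes.append(key)     # the side effect runs only on first delivery
        processed[key] = {**state, "stored": True}
    return processed[key]


msg = {"message_id": "m-1", "response": "ok"}
first = store_result(msg)
second = store_result(msg)  # simulated redelivery of the same message
```

Both calls return the same result, and the write is recorded once.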

When to Choose Asya

Asya is purpose-built for AI/ML workloads on Kubernetes:

  • Heterogeneous hardware -- GPU actors scale independently from CPU actors. An LLM inference actor on A100 GPUs scales 0-5 while a preprocessing actor on CPU scales 0-50, each based on its own queue depth.
  • Scale-to-zero -- KEDA scales actor pods to zero when queues are empty. GPU pods cost nothing between batches. Temporal workers must stay running to poll.
  • No SDK lock-in -- handlers are plain Python functions (dict -> dict). No decorators, no base classes, no determinism constraints. Swap the function, redeploy.
  • Simpler mental model -- one abstraction (actor) instead of five (workflow, activity, worker, task queue, signal). Platform engineers own the YAML; data scientists own the Python function.
  • K8s-native operations -- AsyncActor is a CRD. Deploy with kubectl apply, manage with Helm, integrate with GitOps. Crossplane compositions render the full pod spec including sidecars.
  • Dynamic routing -- actors can rewrite the message route at runtime. An LLM judge can send high-confidence results to storage and low-confidence results to human review -- without rebuilding a DAG.
  • Agentic AI patterns -- built-in A2A and MCP gateway, pause/resume for human-in-the-loop, FLY streaming for live token output.
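The dynamic routing item in the list above can be sketched as a handler that picks the next hop from its own output. This document does not show how a handler reads or rewrites the route, so the sketch assumes the route is exposed to the handler as a `route` key with a `next` list; the actor names `store-result` and `human-review`, the `confidence` field, and the 0.8 threshold are likewise hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off for skipping human review


def judge(state):
    """Route by confidence: confident answers go to storage, the rest to review."""
    if state["confidence"] >= CONFIDENCE_THRESHOLD:
        next_actor = "store-result"
    else:
        next_actor = "human-review"
    # Return a new dict with the rewritten route rather than mutating the input.
    return {**state, "route": {"next": [next_actor]}}


confident = judge({"answer": "42", "confidence": 0.95})
uncertain = judge({"answer": "maybe", "confidence": 0.40})
```

The branching decision lives in the handler's return value, so changing the threshold or adding a third destination is a code change in one actor rather than a change to a pipeline definition.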