vs Google ADK#
TL;DR#
Google ADK is an in-process agent framework: agents, tools, and orchestration all run in a single Python process sharing memory. 🎭 Asya is a distributed actor mesh where each agent is a separate pod on Kubernetes, communicating through durable message queues. ADK gives you fast prototyping with Gemini and Vertex AI integration; 🎭 Asya gives you independent scaling, failure isolation, and production-grade deployment for multi-agent workloads.
At a Glance#
| | 🎭 Asya | Google ADK |
|---|---|---|
| One-liner | Actor mesh on Kubernetes | In-process agent framework |
| Execution model | Choreography: route embedded in each message | Orchestration: Runner drives a ReAct loop (while True over async generator) |
| Handler UX | Pure dict -> dict function, zero imports | Plain function + Agent() config; tool_context param for state access |
| State passing | Full state travels in the message payload | Deltas written to shared session; output_key saves agent output to state dict |
| Scaling | Per-actor via KEDA (queue depth, 0-N) | Single process; scale the whole app |
| Failure isolation | 🟢 Per-actor: one crash affects only its queue | 🔴 Per-process: one crash takes down all agents |
| Multi-agent | Separate pods, queue-connected | In-process: Sequential, Parallel, Loop, AgentTool, transfer_to_agent |
| Streaming | FLY events via sidecar to gateway (SSE) | Async generator yields partial events (not persisted to session) |
| Human-in-the-loop | 🟢 Checkpoint to S3, resume from any replica | 🟡 LongRunningFunctionTool pauses in memory; lost on crash |
| Tool callbacks | 🔴 Not built-in (use pre/post-processing actors) | 🟢 before/after callbacks on both LLM calls and tool executions |
| SDK lock-in | 🟢 None (plain Python) | 🔴 google-adk package required |
| LLM coupling | 🟢 None; handler calls any LLM | 🟡 Gemini-first; other models via LiteLLM adapter |
| Protocols | 🟢 A2A + MCP gateway built-in | 🟡 A2A server built-in; MCP tool consumption only |
| Deployment | Kubernetes CRDs, Helm, GitOps | adk web, adk api_server, or Vertex AI Agent Engine |
| Transport | SQS, RabbitMQ, GCP Pub/Sub | In-memory (no transport; direct function calls) |
| Maturity | 🟡 Alpha | 🟡 Preview |
Architecture#
Google ADK runs everything inside one Python process. The Runner drives a
ReAct loop (while True) over an async generator (BaseLlmFlow.run_async).
Each iteration: preprocess (build LLM request with tools) -> call LLM ->
postprocess (execute tool calls via asyncio.gather if any). Tool results feed
back through session history for the next LLM turn. The loop breaks when
is_final_response() returns true (text-only response, no pending tool calls).
State lives in a delta-tracked State object -- every mutation records a
state_delta on the response event, persisted by the Runner.
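This loop can be sketched in plain Python. Everything below is a simplified model -- fake_llm, the event dicts, and TOOLS are illustrative stand-ins, not ADK's actual classes:

```python
import asyncio

# Illustrative stand-in for an LLM: emits one tool call, then a final text answer.
def fake_llm(history):
    if not any(e.get("tool_result") for e in history):
        return {"tool_calls": [{"name": "search", "args": {"q": "topic"}}]}
    return {"text": "final answer"}

TOOLS = {"search": lambda q: f"results for {q}"}

async def run_react(history):
    while True:  # the Runner-style loop
        response = fake_llm(history)  # call LLM with accumulated history
        history.append(response)
        calls = response.get("tool_calls")
        if not calls:  # is_final_response(): text only, no pending tool calls
            return response["text"]
        # postprocess: execute tool calls concurrently, feed results back via history
        results = await asyncio.gather(
            *[asyncio.to_thread(TOOLS[c["name"]], **c["args"]) for c in calls]
        )
        history.append({"tool_result": results})

print(asyncio.run(run_react([])))
```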
```
┌─────────────────────────────────────────┐
│              Python Process             │
│                                         │
│  Runner ─► Agent (async generator)      │
│             │                           │
│             ├─► LLM call (Gemini API)   │
│             ├─► Tool A (in-process)     │
│             ├─► Tool B (in-process)     │
│             └─► Sub-agent (in-process)  │
│                                         │
│  Session ◄── delta-tracked state ──►    │
└─────────────────────────────────────────┘
```
🎭 Asya decomposes each agent into a separate pod. The sidecar reads from a queue, invokes the handler over a Unix socket, and routes the result to the next queue. State travels in the message payload -- no shared memory, no central session store.
```
┌──────────┐     ┌──────────┐     ┌──────────┐
│ Agent A  │────►│ Agent B  │────►│ Agent C  │
│ pod + q  │ msg │ pod + q  │ msg │ pod + q  │
└──────────┘     └──────────┘     └──────────┘
      ▲                                 │
      │          ┌──────────┐           │
      └──────────│ Gateway  │◄──────────┘
                 │ A2A/MCP  │
                 └──────────┘
```
Developer Experience: The Same Task in Both#
Task: research a topic, then write an article based on the research.
Google ADK#
```python
from google.adk.agents import LlmAgent, SequentialAgent

researcher = LlmAgent(
    name="Researcher",
    model="gemini-2.0-flash",
    instruction="Research the topic: {topic}",
    output_key="research",  # result saved to state["research"]
)

writer = LlmAgent(
    name="Writer",
    model="gemini-2.0-flash",
    instruction="Write an article based on: {research}",  # reads from state
)

pipeline = SequentialAgent(
    name="Pipeline",
    sub_agents=[researcher, writer],
)
```
State flows via output_key: Researcher writes to state["research"],
Writer reads it via the {research} template variable. Both agents share
the same in-memory session. State keys support scope prefixes (app:,
user:, temp:) for cross-session and per-invocation data.
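The output_key wiring can be modeled with a plain dict (a schematic of the data flow, not ADK internals):

```python
state = {"topic": "llm agents"}

# Researcher runs; the Runner saves its output under output_key="research".
state["research"] = "key findings..."

# Writer's instruction template reads the same shared state dict.
instruction = "Write an article based on: {research}".format(**state)
print(instruction)
```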
🎭 Asya#
```python
async def research_pipeline(payload: dict) -> dict:  # asya: flow
    payload = await researcher(payload)  # asya: actor
    payload = await writer(payload)  # asya: actor
    return payload

async def researcher(payload: dict) -> dict:
    """Research the topic."""
    result = call_any_llm(payload["topic"])
    payload["research"] = result
    return payload

async def writer(payload: dict) -> dict:
    """Write an article from research."""
    payload["article"] = call_any_llm(payload["research"])
    return payload
```
asya flow compile research_pipeline.py generates one AsyncActor manifest per
function. Each deploys as a separate pod with its own queue and scaling policy.
Handlers are plain Python -- no ADK import, no Gemini dependency.
Key Differences in Practice#
State: Deltas vs Full Payload#
ADK tracks changes as deltas on events (EventActions.state_delta), applied by
the Runner to a central session service. 🎭 Asya carries the full state in every
message -- no central store, no delta merging, naturally distributed.
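The contrast in miniature, using plain dicts (schematic only; neither framework's real API):

```python
# ADK-style: agents emit deltas; a central Runner merges them into one session.
session_state = {"topic": "llm agents"}
delta_from_researcher = {"research": "notes..."}
session_state.update(delta_from_researcher)  # Runner applies state_delta

# Asya-style: the full state travels inside each message; no central store.
message_to_writer = {**session_state}   # payload carries everything it needs
message_to_writer["article"] = "draft..."  # next actor returns an enriched payload
```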
Composition: In-Process vs Distributed#
ADK offers five composition modes: SequentialAgent, ParallelAgent,
LoopAgent, AgentTool (encapsulated sub-agent as a tool), and
transfer_to_agent (LLM-driven dynamic routing). All share one asyncio loop.
🎭 Asya's flow compiler transforms the same sequential/parallel/loop patterns into
distributed actors connected by queues, each scaling independently.
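In-process parallel composition ultimately reduces to gathering coroutines on one event loop -- a minimal sketch, with agent() standing in for a real sub-agent:

```python
import asyncio

async def agent(name):
    await asyncio.sleep(0)  # stand-in for an LLM call
    return f"{name} done"

async def parallel():
    # ParallelAgent-style: all sub-agents share one asyncio loop and one process.
    return await asyncio.gather(agent("a"), agent("b"), agent("c"))

print(asyncio.run(parallel()))
```

A crash or OOM in any one of these coroutines takes down the whole process, which is the isolation trade-off the table above describes.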
Streaming: Generator vs FLY#
ADK yields partial events (partial=True) from async generators; these are
forwarded to the UI but not persisted to session history. 🎭 Asya emits FLY events
via the sidecar, streamed over SSE through the gateway -- works across pod
boundaries and survives network partitions.
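The generator-based model in miniature (the event shape is illustrative, not ADK's Event class):

```python
import asyncio

async def agent_stream():
    # Partial chunks are forwarded to the UI but not kept in session history.
    for chunk in ["Par", "tial ", "text"]:
        yield {"partial": True, "text": chunk}
    yield {"partial": False, "text": "Partial text"}  # only this event is persisted

async def main():
    history = []
    async for event in agent_stream():
        if not event["partial"]:
            history.append(event)  # persist final events only
    return history

print(asyncio.run(main()))
```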
Human-in-the-Loop#
ADK pauses invocations in memory via LongRunningFunctionTool and tool
confirmation (require_confirmation=True). State is lost on crash. 🎭 Asya
checkpoints the full envelope to S3 via x-pause; any replica can restore
it, surviving pod evictions and node failures.
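The durable-checkpoint idea, sketched with a local file standing in for S3 (the envelope fields here are assumptions for illustration):

```python
import json, os, tempfile

# Hypothetical paused-workflow envelope: payload plus routing metadata.
envelope = {"payload": {"topic": "llm agents"}, "route": ["writer"], "paused_at": "reviewer"}

# Pause: persist the full envelope durably (Asya uses S3; a temp file here).
path = os.path.join(tempfile.mkdtemp(), "envelope.json")
with open(path, "w") as f:
    json.dump(envelope, f)

# ...pod is evicted; any replica can later resume from the checkpoint...
with open(path) as f:
    restored = json.load(f)
assert restored == envelope  # state survives the crash
```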
When to Choose Google ADK#
- Prototyping with Gemini -- ADK's Gemini integration is first-class; output_key, session state, and the ReAct loop work out of the box
- Single-process agents -- if all agents fit in one process and you do not need independent scaling or failure isolation
- Vertex AI deployment -- ADK agents deploy to Vertex AI Agent Engine with minimal configuration for managed hosting
- Rich tool callbacks -- before/after hooks on both LLM calls and tool executions for guardrails, logging, and short-circuiting
- Dynamic agent routing -- transfer_to_agent lets the LLM decide which sub-agent to hand off to at runtime
- Rapid iteration -- adk web gives you a local UI for testing agents interactively
When to Choose 🎭 Asya#
- Production multi-agent workloads -- each agent scales independently based on its own queue depth (including GPU-bound agents that scale differently from CPU agents)
- Failure isolation -- one agent crashing, OOMing, or timing out does not affect any other agent
- No SDK lock-in -- handlers are plain Python functions with zero framework imports; portable to any runtime
- Durable human-in-the-loop -- envelope checkpointed to S3 survives pod evictions, node drains, and cluster migrations
- Transport flexibility -- choose SQS, RabbitMQ, or GCP Pub/Sub per environment without changing handler code
- Kubernetes-native GitOps -- actors are CRDs managed by Crossplane; kubectl apply and ArgoCD work natively
- Scale to zero -- KEDA scales actor pods to zero when queues are empty; no idle compute cost