
Handoffs and the Swarm Pattern: Peer-to-Peer Agent Transfer


Supervisor patterns put a central agent in charge. Every request passes through the supervisor; every specialist returns to the supervisor. This works, but it introduces a round-trip at every layer and forces all inter-agent communication through a single bottleneck. For some workloads, especially customer service flows where the conversation naturally moves from triage to a specialist to a human, the supervisor is needless overhead. The right topology is not a tree; it is a graph of peers.

The swarm pattern, popularized by OpenAI's experimental Swarm project and now a first-class concept in its Agents SDK, replaces the supervisor with handoffs. A handoff is a tool call that returns a reference to another agent. When the runtime sees a handoff, it swaps the active agent for the target, keeping the conversation history intact. The target agent reads what came before, continues the conversation, and can hand off again. No single agent holds coordination responsibility; agents are peers that know which other agents they can transfer control to (OpenAI, 2024). The rest of this article describes the mechanics, the input filter that controls what the next agent sees, and the failure modes specific to a leaderless multi-agent system.

How handoffs work

The implementation detail that makes the pattern work is small but important. A handoff looks like a regular tool call in the agent's output, but its return value is not a data object; it is an Agent reference. The harness intercepts this specific return type, swaps the active agent, and re-invokes the loop with the new agent bound. The new agent sees the full conversation history by default; it continues from where the previous agent left off.

Because handoffs are just tool calls, they benefit from the same tool-calling infrastructure covered earlier in this series. Adding a handoff is usually a one-line edit: define a tool that returns the target agent's reference, and include it in the source agent's toolset. The model learns that this tool transfers the conversation, and it invokes the tool when its judgment says the conversation belongs with a different specialist.
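The interception step can be sketched without a model in the loop. This is a minimal illustration under stated assumptions, not any SDK's implementation: the Agent class, the dispatch helper, and the tool names are hypothetical, and tools are simplified to zero-argument callables.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    # Maps tool name -> zero-argument callable (a hypothetical simplification).
    tools: dict[str, Callable[[], object]] = field(default_factory=dict)

def dispatch(active: Agent, tool_name: str) -> tuple[Agent, str]:
    """Run one tool call; return (next active agent, text for the tool message)."""
    result = active.tools[tool_name]()
    if isinstance(result, Agent):
        # Handoff: the return value is an agent reference, so swap the active agent.
        return result, f"transferred to {result.name}"
    # Ordinary tool: the active agent stays in place and sees the result.
    return active, str(result)

refund = Agent("refund", tools={"lookup_order": lambda: {"order": 123, "status": "shipped"}})
triage = Agent("triage", tools={"transfer_to_refund": lambda: refund})

active, note = dispatch(triage, "transfer_to_refund")
print(active.name, "|", note)  # refund | transferred to refund
active, note = dispatch(active, "lookup_order")
print(active.name, "|", note)
```

The only special case is the isinstance check: everything else about the tool call is handled the way an ordinary tool result would be.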

One shape, four agents

flowchart LR
    U([User]) --> T[Triage agent]
    T -->|classify + handoff| R[Refund agent]
    T -->|classify + handoff| S[Sales agent]
    R -->|handoff if stuck| H[Human agent]
    S -->|handoff if needed| T
    H --> OUT([Response])
    R --> OUT
    S --> OUT

The diagram shows four agents: a triage agent, two specialists (refund and sales), and a human-escalation agent. Each arrow marks a permitted transfer. The triage agent classifies the incoming request and hands off to the appropriate specialist. The specialist either completes the task or hands off further: in this diagram, the sales agent can return a misrouted conversation to triage, and the refund agent can escalate to a human when a case falls outside the model's mandate.
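One way to keep the permitted transfers explicit is to store the diagram's edges as data and check them before the swarm runs. The sketch below is an assumption-laden illustration: the HANDOFFS map, validate_topology, and can_transfer are hypothetical names, and the edges are copied from the diagram above.

```python
# Handoff edges copied from the diagram; "human" has no outgoing handoffs.
HANDOFFS: dict[str, set[str]] = {
    "triage": {"refund", "sales"},
    "refund": {"human"},
    "sales": {"triage"},
    "human": set(),
}

def validate_topology(handoffs: dict[str, set[str]], entry: str) -> list[str]:
    """Return a list of problems: unknown targets and agents unreachable from entry."""
    problems = []
    known = set(handoffs)
    for source, targets in handoffs.items():
        for target in sorted(targets - known):
            problems.append(f"{source} hands off to unknown agent {target}")
    # Depth-first walk from the entry agent to find unreachable agents.
    seen, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node in seen or node not in handoffs:
            continue
        seen.add(node)
        stack.extend(handoffs[node])
    for agent in sorted(known - seen):
        problems.append(f"{agent} is unreachable from {entry}")
    return problems

def can_transfer(source: str, target: str) -> bool:
    """True if the topology allows source to hand off to target."""
    return target in HANDOFFS.get(source, set())

print(validate_topology(HANDOFFS, "triage"))  # []
print(can_transfer("refund", "sales"))        # False
```

A runtime could call can_transfer before honoring a handoff, so a model that invents a transfer outside the graph is refused rather than followed.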

Two versions in code

The excerpt below shows the pattern without a framework. Each agent is represented by a small dataclass carrying its name, its system prompt, and its tools, which include one handoff tool per agent it may transfer to. The runtime loops: call the current agent, check whether the response includes a handoff, swap if so.

from dataclasses import dataclass, field
from openai import OpenAI

client = OpenAI()

@dataclass
class Agent:
    name: str
    system: str
    tools: list = field(default_factory=list)

def handoff_tool(target: Agent) -> dict:
    return {"type": "function", "function": {
        "name": f"transfer_to_{target.name}",
        "description": f"Transfer the conversation to {target.name}.",
        "parameters": {"type": "object", "properties": {}, "required": []}}}

def run_swarm(initial: Agent, agents: dict, user_msg: str,
              max_turns: int = 8) -> str:
    messages = [{"role": "user", "content": user_msg}]
    active = initial
    for _ in range(max_turns):
        # An empty tools list is rejected by the API, so pass tools only when present.
        extra = {"tools": active.tools} if active.tools else {}
        r = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "system", "content": active.system}] + messages,
            **extra)
        msg = r.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)
        next_agent = active
        # Answer every tool call; the API expects one tool message per call id.
        for call in msg.tool_calls:
            name = call.function.name
            if name.startswith("transfer_to_"):
                target = agents.get(name[len("transfer_to_"):])
                if target is None:
                    result = f"unknown agent in {name}"
                else:
                    next_agent, result = target, f"transferred to {target.name}"
            else:
                # Regular tool call; execute it here and use its real output.
                result = "tool result"
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": result})
        # Swap only after all calls in this turn have been answered.
        active = next_agent
    return "halted: swarm turn budget exceeded"

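The LangGraph version expresses each handoff as a tool that returns a Command. Each agent lists its handoff tools; the runtime manages the swap. LangGraph has no input_filter argument; the update carried by the Command determines what state the next agent receives, and that is where stripping or summarizing prior conversation would happen.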

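from typing import Annotated

from langchain.chat_models import init_chat_model
from langchain_core.tools import InjectedToolCallId, tool
from langgraph.graph import START, MessagesState, StateGraph
from langgraph.prebuilt import InjectedState, create_react_agent
from langgraph.types import Command

model = init_chat_model("gpt-4o-mini")

def make_handoff(agent_name: str):
    """Build a tool that transfers control to a sibling node in the parent graph."""
    tool_name = f"transfer_to_{agent_name}"

    @tool(tool_name, description=f"Transfer the conversation to {agent_name}.")
    def handoff(state: Annotated[MessagesState, InjectedState],
                tool_call_id: Annotated[str, InjectedToolCallId]) -> Command:
        # Every tool call needs a matching tool message in the history.
        ack = {"role": "tool", "name": tool_name, "tool_call_id": tool_call_id,
               "content": f"transferred to {agent_name}"}
        # The update decides what the parent graph records at handoff time.
        # With the default add_messages reducer a shortened list is merged by
        # message ID, not substituted, so real trimming needs RemoveMessage
        # entries or a Send carrying a custom payload.
        return Command(goto=agent_name, graph=Command.PARENT,
                       update={"messages": state["messages"] + [ack]})
    return handoff

# process_refund_tool, quote_tool and discount_tool are domain tools defined
# elsewhere. Older LangGraph releases spell the prompt argument state_modifier.
triage = create_react_agent(
    model=model, tools=[make_handoff("refund"), make_handoff("sales")],
    prompt="You are a triage agent. Route to refund or sales."
)

refund = create_react_agent(
    model=model, tools=[process_refund_tool],
    prompt="You are a refund specialist."
)

sales = create_react_agent(
    model=model, tools=[quote_tool, discount_tool],
    prompt="You are a sales specialist."
)

# The parent graph hosts the agents as sibling nodes so Command.PARENT can route.
graph = (StateGraph(MessagesState)
         .add_node("triage", triage)
         .add_node("refund", refund)
         .add_node("sales", sales)
         .add_edge(START, "triage")
         .compile())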

Full runnable versions will live at github.com/subodhjena/agentic-patterns under examples/18_handoffs_swarm.py as that lesson lands in the repo.

The input filter

Handoffs inherit the full conversation by default, but that default is often wrong. A sales agent handed a refund conversation sees a large amount of context it cannot use and might misread. A human escalation receiving the full trace drowns in transcript. The input filter on handoffs is the mechanism that solves this.

An input filter is a function that runs at handoff time. Its input is the current conversation; its output is what the target agent will see. Common filters include: keep only the last N turns, keep only the user's original question plus the current agent's summary, or strip tool-call traces and retain only assistant responses. OpenAI's Agents SDK and AutoGen both support filters directly; LangGraph achieves the same effect by modifying the state in the Command that executes the handoff.
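The three common filters named above can be sketched as plain functions over a chat-completions-style message list. This is a framework-neutral illustration, not the filter signature of any particular SDK; the function names and the sample history are hypothetical.

```python
def keep_last_n(messages: list[dict], n: int) -> list[dict]:
    """Keep only the last n messages.

    Caution: a raw slice can separate a tool message from the assistant
    message that requested it, which chat APIs typically reject.
    """
    return messages[-n:] if n > 0 else []

def strip_tool_traces(messages: list[dict]) -> list[dict]:
    """Drop tool results and tool-call requests; keep user and assistant text."""
    kept = []
    for m in messages:
        if m["role"] == "tool":
            continue
        if m["role"] == "assistant" and m.get("tool_calls"):
            if not m.get("content"):
                continue  # message carried only a tool call
            m = {"role": "assistant", "content": m["content"]}
        kept.append(m)
    return kept

def question_plus_summary(messages: list[dict], summary: str) -> list[dict]:
    """Keep the user's first message plus the source agent's summary."""
    first_user = next((m for m in messages if m["role"] == "user"), None)
    note = {"role": "assistant", "content": f"Handoff summary: {summary}"}
    return ([first_user] if first_user else []) + [note]

history = [
    {"role": "user", "content": "I was charged twice for order 123."},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "c1"}]},
    {"role": "tool", "tool_call_id": "c1", "content": "2 charges found"},
    {"role": "assistant", "content": "I see two charges. Transferring you to refunds."},
]

print(len(strip_tool_traces(history)))  # 2
print(question_plus_summary(history, "duplicate charge on order 123"))
```

Stripping tool traces is usually the safest of the three for a model-to-model handoff because it never leaves a dangling tool message behind.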

Filters are critical for privacy in multi-tenant systems. A filter that ensures a cross-tenant handoff scrubs identifiers and PII is a guardrail, not an optimization.
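A scrubbing filter might look like the sketch below. It is illustrative only: the regular expressions are deliberately simple, the acct_ identifier format is a hypothetical tenant-scoped scheme, and pattern matching alone should not be assumed sufficient for real PII removal.

```python
import re

# Deliberately simple patterns for illustration; real PII detection needs more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# Hypothetical tenant-scoped ID format such as "acct_8F3K2".
ACCOUNT_ID = re.compile(r"\bacct_[A-Za-z0-9]+\b")

def scrub(text: str) -> str:
    """Replace emails and account IDs with neutral placeholders."""
    text = EMAIL.sub("[email]", text)
    return ACCOUNT_ID.sub("[account]", text)

def scrub_filter(messages: list[dict]) -> list[dict]:
    """Return copies of the messages with string content scrubbed."""
    return [
        {**m, "content": scrub(m["content"])} if isinstance(m.get("content"), str) else m
        for m in messages
    ]

sample = [{"role": "user", "content": "Contact jane.doe@example.com about acct_8F3K2."}]
print(scrub_filter(sample)[0]["content"])  # Contact [email] about [account].
```

Because the filter returns copies, the source agent's own history keeps the original text while the target sees only the scrubbed version.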

Where the swarm wins

The pattern fits specific operational shapes well.

Conversations with handoff structure. Customer service is the canonical case: triage to specialist to human. Each agent has a clear mandate and a clear transfer criterion.

Agents that do not need to coordinate. When only one specialist is active at any time, a supervisor adds no value. Swarm lets the conversation move directly.

Situations where a central coordinator is overhead. Supervisor-and-router doubles the round-trips when the supervisor's only job is to pass requests through.

Latency-sensitive flows. A swarm handoff is a single tool call; no second agent call is needed to "authorize" the transfer.

Where the swarm goes wrong

The absence of a supervisor means the absence of a central enforcer. Several failure modes are specific to leaderless topologies.

Handoff loops. Agent A hands to B, which hands to A, which hands to B. Without a cycle detector, the conversation ping-pongs until the turn budget runs out. Detect repeated handoffs between the same pair and break the cycle.
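A pair-based cycle detector can be very small. The sketch below assumes the runtime calls record on every handoff; the class name and the max_per_pair threshold are hypothetical choices, not part of any framework.

```python
from collections import Counter

class HandoffLoopDetector:
    """Counts handoffs per unordered agent pair and flags excessive repeats."""

    def __init__(self, max_per_pair: int = 2):
        self.max_per_pair = max_per_pair
        self.counts: Counter = Counter()

    def record(self, source: str, target: str) -> bool:
        """Record a handoff; return True if it should be blocked as a loop."""
        pair = frozenset((source, target))  # A->B and B->A count as the same pair
        self.counts[pair] += 1
        return self.counts[pair] > self.max_per_pair

detector = HandoffLoopDetector(max_per_pair=2)
print(detector.record("refund", "triage"))  # False
print(detector.record("triage", "refund"))  # False
print(detector.record("refund", "triage"))  # True
```

When record returns True, one option is to refuse the transfer and escalate to the human agent instead of letting the turn budget absorb the loop.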

Wrong-specialist handoffs. The source agent's judgment about which agent to hand off to is wrong. The target agent either processes the request badly or hands it off again. Calibrate handoff triggers carefully and include a triage-back option for genuine misroutes.

Context leaks. Without an input filter, the target agent sees everything the source agent did, including context that should have been scoped to the source. Use filters by default; configure them per handoff.

Shared state confusion. In the absence of a supervisor, no single agent owns the final answer. A conversation that touches three agents can produce three candidate final answers. Designate a termination criterion: the agent that produces the user-ready response returns it, and the runtime accepts no further handoffs after that point.

Tool duplication across agents. A refund agent and a billing agent both need a refund API. If each carries its own copy of the tool, the two agents can compute different refund amounts for the same case. Factor shared tools into a distinct agent and hand off to it.

Debugging difficulty. Swarm traces are harder to reconstruct than supervisor traces because no single agent owns the conversation. Invest in structured logging at every handoff boundary from day one.
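A handoff log record could carry enough fields to rebuild the path of a conversation afterwards. The field names below are assumptions chosen for illustration, not a standard schema; the logger name is likewise hypothetical.

```python
import json
import logging
import time

logger = logging.getLogger("swarm.handoff")

def handoff_record(conversation_id: str, source: str, target: str, turn: int,
                   reason: str, messages_in: int, messages_out: int) -> dict:
    """Build one structured record describing a single handoff."""
    return {
        "event": "handoff",
        "conversation_id": conversation_id,
        "source": source,
        "target": target,
        "turn": turn,
        "reason": reason,
        # Message counts before and after the input filter ran.
        "messages_before_filter": messages_in,
        "messages_after_filter": messages_out,
        "ts": time.time(),
    }

def log_handoff(**fields) -> dict:
    """Emit the record as one JSON line and return it for further use."""
    record = handoff_record(**fields)
    logger.info(json.dumps(record, sort_keys=True))
    return record

record = log_handoff(conversation_id="c-1", source="triage", target="refund",
                     turn=2, reason="duplicate charge", messages_in=9, messages_out=3)
print(record["source"], "->", record["target"])  # triage -> refund
```

Grouping these records by conversation_id and sorting by turn yields the graph trace that a supervisor would otherwise have provided for free.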

Trade-offs against supervisor patterns

Swarm and supervisor sit on opposite ends of the coordination spectrum. The axes below name the choice.

| Axis | Supervisor | Swarm |
| --- | --- | --- |
| Coordinator | Explicit central agent | Implicit, distributed |
| Extra round-trips per specialist | One (to and from supervisor) | Zero |
| Control over flow | Centralized | Per-agent judgment |
| Debugging | Clear trace through supervisor | Graph trace across peers |
| Loop detection | Easier, supervisor sees all | Harder, distributed |
| Fit | Parallel delegation with synthesis | Linear escalation flows |

Supervisor is better when the top-level agent genuinely coordinates multiple specialists. Swarm is better when the conversation moves linearly from one specialist to the next and coordination amounts to routing.

Neighbors in the series

Supervisor and router, two articles ago, is the opposite topology. Hierarchical teams, the previous article, scales the supervisor pattern further. The shared scratchpad pattern, next in this stage, is a third coordination style: no handoffs, no supervisor, all agents read and write a common workspace. Group chat patterns, covered after that, name several swarm-style topologies including round-robin and LLM-selected next-speaker. Guardrails, in the Safety stage, apply at handoff boundaries in swarms as they do at supervisor boundaries.

References

  1. OpenAI. Swarm: orchestrating lightweight multi-agent systems. 2024.
  2. Anthropic. Building effective agents. December 2024.
  3. Wu, Qingyun, et al. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. 2023.
  4. LangChain. Handoffs in LangGraph with Command. 2024.
  5. Google. Agent Development Kit: peer agents. 2024.