Choosing the Right Agentic Pattern: A Decision Framework
This is the last article in the Agentic Patterns series. The preceding thirty-three articles covered individual patterns: how each works, when it fits, where it fails, what it costs. This article collapses them into the decision a practitioner actually faces. A task arrives. Which pattern should the team reach for first?
The guidance that follows is drawn from the same primary sources this series has referenced throughout: Anthropic's engineering writeups, OpenAI's agent guides, Google's orchestration documentation, the Microsoft MagenticOne paper, the DeepMind scaling study, and the academic literature on specific patterns. The framework is opinionated. The opinion is the same one that has run through the whole series: the default should be the simplest thing that could possibly work, and complexity should be added deliberately, not reflexively.
Four decision trees
The Anthropic agent-building guide and the agentic-patterns reference each organize pattern choice as a sequence of questions. The four trees below reproduce that structure, each covering one family of decisions.
Workflow versus agent
```text
Start here: Can a single LLM call solve it?
|
+-- Yes --> Single LLM call. Done.
|
+-- No --> Are the steps fixed and known?
    |
    +-- Yes --> Is it sequential?
    |   |
    |   +-- Yes --> Prompt Chaining
    |   |
    |   +-- No --> Parallelization (sectioning or voting)
    |
    +-- No --> Does input determine the path?
        |
        +-- Yes, into categories --> Routing
        |
        +-- Yes, into dynamic subtasks --> Orchestrator-Workers
        |
        +-- No --> Does it need iterative refinement?
            |
            +-- Yes --> Evaluator-Optimizer
            |
            +-- No --> Is the task fully open-ended?
                |
                +-- Yes --> Agent with ReAct loop
```
The tree terminates at the first pattern that fits. A single LLM call is the assumed default; everything else is an earned upgrade.
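The same first-fit logic can be written as an ordinary function. The sketch below is illustrative only: the `Task` fields and the `choose_pattern` name are hypothetical, and the boolean answers would in practice come from a human reading the task, not from code. Because the tree has no branch for a task that answers "no" to every question, the sketch raises an error in that case, which is an assumption rather than something the tree specifies.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical yes/no answers to the questions in the workflow-versus-agent tree."""
    single_call_suffices: bool = False
    steps_fixed: bool = False
    sequential: bool = False
    input_selects_category: bool = False
    input_spawns_subtasks: bool = False
    needs_refinement: bool = False
    open_ended: bool = False


def choose_pattern(task: Task) -> str:
    """Return the first pattern that fits, checking questions in tree order."""
    if task.single_call_suffices:
        return "Single LLM call"
    if task.steps_fixed:
        return "Prompt Chaining" if task.sequential else "Parallelization"
    if task.input_selects_category:
        return "Routing"
    if task.input_spawns_subtasks:
        return "Orchestrator-Workers"
    if task.needs_refinement:
        return "Evaluator-Optimizer"
    if task.open_ended:
        return "Agent with ReAct loop"
    # Assumption: the tree defines no outcome here, so surface it explicitly.
    raise ValueError("No branch of the tree matches; revisit the task definition.")
```

The early returns mirror the rule that the tree stops at the first match: a task with fixed steps never reaches the routing or agent questions.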
Multi-agent coordination
```text
Do you need multiple specialized agents?
|
+-- No --> Single agent is fine.
|
+-- Yes --> How should they coordinate?
    |
    +-- Central control --> Supervisor / Router
    +-- Peer-to-peer --> Handoffs / Swarm
    +-- Deep hierarchy --> Hierarchical Teams
    +-- Tight collaboration --> Shared Scratchpad
    +-- Turn-taking dialogue --> Group Chat Patterns
```
The question "do you need multiple specialized agents?" is genuine. DeepMind's 2024 scaling study found that sequential planning tasks degrade 39 to 70 percent under multi-agent architectures compared with single-agent baselines. Multi-agent is the right answer on parallelizable tasks with clear specialization; on sequential or loosely-structured tasks, a single agent usually wins.
Reasoning pattern
```text
How important is correctness, and at what cost?
|
+-- Standard multi-step reasoning --> Chain-of-Thought
+-- Need exploration / backtracking --> Tree-of-Thought
+-- Need synthesis of partial solutions --> Graph-of-Thought
+-- Can retry the whole task on failure --> Reflexion
+-- Maximum correctness, cost not an issue --> LATS
```
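The cost axis matters. Chain-of-thought is close to free: it stays within a single call and adds only output tokens. Self-consistency, which samples N chain-of-thought answers and takes a majority vote, costs N calls. Tree-of-thought costs roughly k^depth calls for branching factor k. LATS costs 5 to 10x a ReAct baseline. Match the pattern to the correctness the task needs and the budget the team has.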
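To make the cost comparison concrete, the rough figures above can be turned into a back-of-the-envelope call budget. This is a sketch under stated assumptions: the function name, the default sample count, branching factor, depth, and ReAct baseline are all hypothetical, and the numbers are order-of-magnitude estimates, not measurements.

```python
def call_budget(
    n_samples: int = 5,
    branching: int = 3,
    depth: int = 3,
    react_baseline_calls: int = 10,
) -> dict[str, tuple[int, int]]:
    """Return a rough (low, high) range of LLM calls per task for each pattern."""
    tot_calls = branching ** depth  # the article's rough k^depth figure
    return {
        "Chain-of-Thought": (1, 1),  # one call, more output tokens
        "Self-Consistency": (n_samples, n_samples),
        "Tree-of-Thought": (tot_calls, tot_calls),
        # 5 to 10x a ReAct baseline, as quoted in the text
        "LATS": (5 * react_baseline_calls, 10 * react_baseline_calls),
    }


for pattern, (low, high) in call_budget().items():
    print(f"{pattern}: {low} to {high} calls")
```

Multiplying each range by the team's per-call cost gives a first estimate of whether the extra correctness is affordable.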
Framework choice
| If you need | Use |
|---|---|
| Maximum control over agent flow | LangGraph |
| Distributed agents across machines | AutoGen |
| Quick prototyping with roles | CrewAI |
| OpenAI-native with guardrails | OpenAI Agents SDK |
| Claude-native agent building | Claude Agent SDK |
| Gemini-optimized, code-first | Google ADK |
| Document-centric or RAG-heavy | LlamaIndex |
| Algorithmic prompt optimization | DSPy |
| AWS-managed, zero-ops | AWS Bedrock Agents |
| No framework (recommended start) | Direct API calls plus asyncio |
The last row is not a joke. For a team building its first production agent, starting with direct API calls plus the model provider's SDK is a reasonable default. A framework can be added later, once the team knows what it actually needs from one.
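As an illustration of how little is needed, the sketch below implements parallelization (sectioning) with nothing but `asyncio`. The `call_model` coroutine is a hypothetical stand-in for one request through the provider's async SDK, and the three review prompts are invented for the example.

```python
import asyncio


async def call_model(prompt: str) -> str:
    """Hypothetical stand-in for one provider SDK call; replace with a real request."""
    await asyncio.sleep(0)  # yields control, as a network call would
    return f"[model output for: {prompt}]"


async def review_in_sections(document: str) -> list[str]:
    """Run independent review prompts concurrently and return results in order."""
    prompts = [
        f"Check this text for factual errors:\n{document}",
        f"Check this text for tone problems:\n{document}",
        f"Check this text for security issues:\n{document}",
    ]
    # gather preserves input order, so results line up with prompts.
    results = await asyncio.gather(*(call_model(p) for p in prompts))
    return list(results)


if __name__ == "__main__":
    for result in asyncio.run(review_in_sections("Draft release notes.")):
        print(result)
```

Nothing here depends on a framework: swapping the stub for a real SDK call is the only change needed to run it against a model.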
Anti-patterns to avoid
Anthropic's and Google's production retrospectives name the same mistakes. The tables below consolidate the architecture and execution anti-patterns; each has a recurring symptom and a known fix.
Architecture
| Anti-pattern | Symptom | Fix |
|---|---|---|
| Agent when a workflow suffices | Unpredictable behavior, runaway cost | Start with workflows; upgrade only when needed |
| Too many agents | Coordination overhead exceeds benefit | Start with one; split only when a single agent's context or tools overflow |
| Shared everything | Agents drown in irrelevant context | Use independent scratchpads; share only final results |
| No stopping condition | Infinite loops, cost incidents | Always set max_iterations; detect repeated identical calls |
| Framework worship | Abstraction hides bugs | Understand the underlying API first |
| More than sixteen tools on one agent | Error amplification (DeepMind finding) | Split tools across specialist agents; add tool search |
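The fix for a missing stopping condition is small enough to sketch. The example below is one possible shape, not a standard API: `run_with_guards`, `next_call`, and `max_repeats` are hypothetical names, and `next_call` stands in for whatever produces the agent's next tool call (returning `None` when the agent is done).

```python
import json
from collections import Counter
from typing import Callable, Optional, Tuple

ToolCall = Tuple[str, dict]  # (tool name, arguments)


def run_with_guards(
    next_call: Callable[[int], Optional[ToolCall]],
    max_iterations: int = 10,
    max_repeats: int = 2,
) -> str:
    """Drive an agent loop with a hard iteration cap and repeat-call detection."""
    seen = Counter()
    for iteration in range(max_iterations):
        call = next_call(iteration)
        if call is None:
            return "finished"
        name, args = call
        # Identical tool + arguments map to the same key regardless of dict order.
        key = name + json.dumps(args, sort_keys=True)
        seen[key] += 1
        if seen[key] > max_repeats:
            return "stopped: repeated call"
        # A real loop would execute the tool here and feed the result back.
    return "stopped: max_iterations"
```

Both guards return a reason string instead of raising, so the caller can log why the loop ended and decide whether to escalate.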
Execution
| Anti-pattern | Symptom | Fix |
|---|---|---|
| No evaluation | Cannot tell if the system works | Measure performance; add complexity only when metrics improve |
| Trusting output blindly | Hallucinations compound | Add verification steps; use the evaluator pattern |
| Self-evaluation | Agent praises its own mediocre work | Use a separate evaluator tuned to be skeptical |
| Mixing concerns | Confused prompts, tool overload | One agent, one clear responsibility |
| Skipping human oversight | Dangerous actions or bad outputs | Add approval gates for irreversible actions |
| Over-engineering tools | Agent cannot figure out how to use them | Keep tools simple; one tool, one action |
These are not edge cases. They are the failure modes that teams ship repeatedly across organizations and across frameworks. Knowing the list is most of the battle.
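An approval gate for irreversible actions can be a few lines placed in front of tool execution. The sketch below assumes hypothetical tool names and a caller-supplied `approve` callback (for example, a prompt to an operator); none of these names come from a specific framework.

```python
from typing import Callable

# Hypothetical tool names; a real system would list its own irreversible actions.
IRREVERSIBLE_TOOLS = {"delete_record", "send_payment"}


def execute_tool(
    tool: str,
    args: dict,
    run_tool: Callable[[str, dict], str],
    approve: Callable[[str, dict], bool],
) -> str:
    """Run a tool call, asking for human approval first if it cannot be undone."""
    if tool in IRREVERSIBLE_TOOLS and not approve(tool, args):
        return f"blocked: {tool} requires human approval"
    return run_tool(tool, args)
```

Reversible tools bypass the gate entirely, so oversight costs attention only where a mistake cannot be rolled back.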
The simplicity test
Anthropic's agent-building guide ends with a short test that this series has repeatedly pointed back to. Before adding any agentic complexity, ask three questions.
First: does a single LLM call with good prompting solve this? If yes, stop. Every pattern in this series can introduce unnecessary complexity; the discipline is to add one only when the task demands it.
Second: does adding this pattern measurably improve outcomes? If the team cannot measure the improvement, they cannot defend the complexity. Build the evaluation harness before the architecture.
Third: can the architecture be explained in one paragraph? If not, it is more complex than the task needs. Simplify until it can be.
The test is not about being minimalist for its own sake. It is about making complexity a deliberate choice rather than an accumulating default. A team that passes the test has an architecture they can defend, operate, and improve. A team that fails it has an architecture that will get worse over time.
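The second question presupposes a way to measure. A minimal harness, sketched below, is enough to start: a labeled dataset, an accuracy function, and a threshold for how much gain justifies an upgrade. The function names, the exact-match scoring, and the 5-point `min_gain` default are all assumptions chosen for illustration; real tasks often need a softer metric.

```python
from typing import Callable, List, Tuple

Dataset = List[Tuple[str, str]]  # (input, expected output) pairs


def accuracy(system: Callable[[str], str], dataset: Dataset) -> float:
    """Fraction of examples where the system's output exactly matches the label."""
    correct = sum(1 for query, expected in dataset if system(query) == expected)
    return correct / len(dataset)


def upgrade_is_justified(
    baseline: Callable[[str], str],
    candidate: Callable[[str], str],
    dataset: Dataset,
    min_gain: float = 0.05,
) -> bool:
    """True only if the candidate beats the baseline by at least min_gain."""
    return accuracy(candidate, dataset) - accuracy(baseline, dataset) >= min_gain
```

Here `baseline` might wrap a single LLM call and `candidate` the more complex pattern under consideration; if `upgrade_is_justified` returns `False`, the simpler system stays.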
A minimal reference example
The excerpt below shows the smallest pattern that could possibly work for a classification-and-response task: a workflow with two calls and a validation gate. No agent. No multi-agent. No reasoning pattern. No framework.
```python
from typing import Literal

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()  # reads OPENAI_API_KEY from the environment


class Route(BaseModel):
    category: Literal["billing", "technical", "general"]
    confidence: float


def answer(query: str) -> str:
    # Call 1: classify the query into a typed Route.
    route = client.responses.parse(
        model="gpt-4o-mini",
        instructions="Classify as billing, technical, or general.",
        input=query,
        text_format=Route,
    ).output_parsed

    # Validation gate: unparsed or low-confidence routes go to a human.
    if route is None or route.confidence < 0.5:
        return "Forwarding to a human agent."

    persona = {
        "billing": "billing specialist",
        "technical": "technical support engineer",
        "general": "general assistant",
    }[route.category]

    # Call 2: answer in the persona selected by the route.
    return client.responses.create(
        model="gpt-4o-mini",
        instructions=f"Answer as a {persona}.",
        input=query,
    ).output_text
```
This solves a real problem. It has one gate (a confidence threshold). It has no agent. It ships in roughly two dozen lines of code. Most production workloads should start here.
Teams that upgrade from this to a routing workflow with dedicated per-category handlers, and later to a multi-agent system, follow a measurable path: each upgrade is justified by a specific failure of the previous version. Teams that skip to multi-agent architectures on day one usually end up rebuilding toward something closer to this excerpt.
Where to look next
The thirty-three preceding articles cover each pattern named in the decision trees above. The index at /blog lists all of them. The runnable code examples live at github.com/subodhjena/agentic-patterns; the examples/ directory contains one numbered file per pattern, with both raw API and LangChain or LangGraph implementations from lesson 05 onward.
Three closing recommendations apply across the series.
Invest in evaluation before architecture. The only reliable way to know whether an upgrade helps is to measure it. Without an evaluation harness, every decision in the decision trees above is a guess.
Invest in tool design before prompt tuning. Anthropic reports that its SWE-bench team spent more time optimizing tools than the overall prompt, and separately that iteratively rewriting tool descriptions cut task completion time by 40 percent. Tool design has higher leverage than almost any other intervention.
Revisit harness assumptions as models improve. Every component in an agent harness encodes an assumption about what the model cannot do. When the model can do more, the harness can do less. Once a year, or after each major model upgrade, audit whether each component still earns its cost.
The series has tried to be specific where specific advice exists and honest where it does not. Choosing the right pattern is a judgment call that benefits from knowing the options. The goal of this framework is not to remove the judgment but to clarify what is being judged.
Neighbors in the series
This is the capstone article; every previous article in the series is a neighbor. The most directly related are the two articles that anchor the first and last stages: "Workflows Versus Agents in LLM Systems" (article 1) argues for the workflow default that this framework inherits; "Scaling and Cost Optimization" (article 32) provides the empirical grounding for the multi-agent warnings.
References
- Anthropic. Building effective agents. December 2024.
- Anthropic. A practical guide to building agents. March 2025.
- OpenAI. Practices for deploying LLM-based agents. 2024.
- Han, Joshua, et al. Towards a Science of Scaling Agent Systems. Google DeepMind, 2024.
- LangChain. LangGraph concepts. 2024.
