Writing

Life, tech, and everything in between.

Reflexion: Verbal Self-Critique After Failure for LLM Agents

Apr 1, 2026 · 8 min · AI

When an agent fails a task, a separate critique call can write down what went wrong in words, store the critique as a memory, and let the next attempt condition on it. No gradient updates. A model that learns from text about its own past.

Graph-of-Thought: Non-Linear Reasoning with Merge and Refine

Mar 30, 2026 · 7 min · AI

Trees branch and backtrack but never merge or loop. Graph-of-Thought extends the search to a directed graph, adding aggregation of parallel branches and refinement of earlier thoughts. The payoff is on problems that need synthesis rather than pure search.

Tree-of-Thought: Branching Reasoning with Search for LLMs

Mar 25, 2026 · 7 min · AI

Chain-of-thought commits to one path. Tree-of-thought explores several, evaluates them, and backtracks. On problems that need exploration rather than sequential derivation, the difference is dramatic.

Chain-of-Thought Prompting for LLM Reasoning

Mar 23, 2026 · 7 min · AI

Asking a large language model to reason step by step before answering reliably improves performance on multi-step problems. The change to the prompt is small and the effect is large, but only on models above a certain scale.

The Agent-Computer Interface: Designing Tools for LLM Agents

Mar 20, 2026 · 7 min · AI

Anthropic reports spending more time tuning tool descriptions than tuning prompts for a SWE-bench agent. Tool design is where agentic systems actually succeed or fail, and its returns compound more than those of almost any other investment.

Plan-and-Execute: Two-Phase Agents That Plan First, Then Act

Mar 18, 2026 · 7 min · AI

ReAct interleaves planning with acting at every step. Plan-and-execute produces a complete plan first, then executes it. The separation trades adaptivity for token efficiency and is often the better choice when tasks have clear structure.

The Tool-Calling Agent Loop: ReAct as It Actually Ships

Mar 16, 2026 · 7 min · AI

The 2022 ReAct paper parsed thought-action-observation from free text. Production agents today use native tool-calling, a step budget, input and output guardrails, and explicit handoff handling. The simple loop earns its hardening.

ReAct: Reasoning and Acting in One LLM Loop

Mar 11, 2026 · 6 min · AI

The foundational agent pattern from Yao et al. interleaves a short reasoning trace with a single tool call, then folds the observation back into the next step. Simple in shape, strong on grounded reasoning benchmarks.

Evaluator-Optimizer: Iterative Refinement with a Separate Critic LLM

Mar 9, 2026 · 7 min · AI

Some tasks improve on a second pass, but only when the critic is a separate LLM with its own mandate. Asking a generator to grade its own work produces confident praise for mediocre output.

Orchestrator-Workers: Dynamic Task Decomposition for LLM Agents

Mar 6, 2026 · 6 min · AI

Routing picks from a fixed list. Parallelization runs a fixed fan-out. The orchestrator-workers pattern lets a planning LLM decide, at runtime, what the subtasks are and how many workers to spawn.

Parallelizing LLM Calls: Sectioning and Voting

Mar 4, 2026 · 6 min · AI

Parallelizing LLM calls is not a single pattern. Sectioning runs different subtasks on the same input; voting runs the same task several times and aggregates. Choosing the wrong shape produces the wrong kind of improvement.

Routing: Classify and Dispatch LLM Requests to Specialists

Mar 2, 2026 · 6 min · AI

A prompt tuned to handle billing refunds well is almost never the same prompt that handles technical outages well. Route the input instead, and give each category a prompt, a toolset, and sometimes a model of its own.