Planner-executor splits: when to separate them

A single model doing both planning and execution feels elegant on day one. By month three, the trace logs tell a different story: the planner part of the prompt drifts under tool-call context, and the executor part starts hallucinating steps that were never planned. Splitting the two is rarely the first instinct. It is often the right one.

What the split actually buys you

A dedicated planner runs on a clean context — just the user request and the available tool schemas — and produces a plan it cannot pollute with execution detail. A dedicated executor receives one step at a time, runs it, and reports back. Each component gets a smaller, sharper prompt. Each one is independently swappable: a cheap executor with an expensive planner is a real cost lever, and you cannot pull it without the split.
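A minimal sketch of that division of labor, assuming each model is reachable as a plain function from prompt text to response text. The names `ModelFn`, `make_plan`, `run_plan`, and the stub models are hypothetical, and the one-step-per-line plan format is an assumption, not something a particular framework requires.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical model interface: prompt text in, response text out.
ModelFn = Callable[[str], str]


@dataclass
class StepResult:
    step: str
    output: str


def make_plan(planner: ModelFn, request: str, tool_schemas: list[str]) -> list[str]:
    """The planner sees only the request and the tool schemas."""
    prompt = "Request: " + request + "\nTools:\n" + "\n".join(tool_schemas)
    # Assumption: the planner answers with one step per line.
    return [line.strip() for line in planner(prompt).splitlines() if line.strip()]


def run_plan(executor: ModelFn, steps: list[str]) -> list[StepResult]:
    """The executor receives one step at a time and reports back."""
    return [StepResult(step=s, output=executor(s)) for s in steps]


# Stub models stand in for real calls so the sketch runs on its own.
def stub_planner(prompt: str) -> str:
    return "search docs\nsummarize findings\n"


def stub_executor(step: str) -> str:
    return "done: " + step


plan = make_plan(stub_planner, "Explain the outage", ["search(query)", "summarize(text)"])
results = run_plan(stub_executor, plan)
```

Because the planner and executor are passed in as separate callables, swapping a cheaper executor under the same planner is a one-argument change.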

Where the split costs you

Latency. The split makes two model calls per step, sometimes three when the planner needs to revise. For interactive use cases with a response budget under two seconds, that is often too expensive. The honest answer is to keep the joint loop for short tasks and split only when the task horizon exceeds five steps or when you’ve already seen the joint loop drift in production.
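The rule of thumb above can be written down as a small routing check. This is a sketch under stated assumptions: the function names are hypothetical, the step estimate and the drift flag have to come from your own task analysis and monitoring, and the call count reads "sometimes three" as one extra call per revised step.

```python
# Threshold taken from the text: split when the horizon exceeds five steps.
SPLIT_HORIZON_STEPS = 5


def should_split(estimated_steps: int, drift_observed: bool) -> bool:
    """Split for long horizons or once drift has been seen in production."""
    return estimated_steps > SPLIT_HORIZON_STEPS or drift_observed


def split_call_count(steps: int, revised_steps: int = 0) -> int:
    """Rough model-call count for the split loop.

    Assumption: two calls per step, plus one extra call for each step
    where the planner has to revise.
    """
    return 2 * steps + revised_steps
```

For a four-step task with one revision, `split_call_count(4, 1)` gives nine model calls, which shows why short interactive tasks usually stay on the joint loop.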

The planner-executor split is not architectural purity. It is a response to a specific failure mode that single-model loops exhibit at scale.
