ReAct in production: reasoning that survives sidetracks

ReAct is a clean idea: think, act, observe, repeat. In production, the loop is the part that breaks. The model thinks reasonably for the first few steps, then either over-explains, gets stuck second-guessing a tool call, or convinces itself the task is already done. The textbook diagram doesn’t show any of this.

Where the loop drifts

The most common failure isn’t a wrong tool call — it’s a redundant one. The model retries an action that already succeeded because it lost track of state in the conversation history. Step counts climb, latency climbs with them, and the user sees a slow response that’s secretly a chain of identical lookups. The next most common failure is premature termination: the model declares success on a partial result because it pattern-matched to “I have an answer” rather than “I have the right answer.”
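One way to make the redundant-call failure visible is to count it in your traces. The sketch below is illustrative only: it assumes each trace step is a dict with hypothetical `tool`, `args`, and `ok` keys, which is not a format from any particular framework. It flags any call that repeats a (tool, arguments) pair that has already succeeded earlier in the same run.

```python
import json
from collections import Counter


def redundant_calls(trace):
    """Count calls that repeat a (tool, args) pair that already succeeded.

    `trace` is assumed to be a list of dicts with hypothetical keys
    "tool" (str), "args" (JSON-serializable dict), and "ok" (bool).
    """
    succeeded = Counter()  # how many times each call has succeeded so far
    repeats = Counter()    # how many times a succeeded call was issued again
    for step in trace:
        # Canonicalize args so key order does not hide a duplicate.
        key = (step["tool"], json.dumps(step["args"], sort_keys=True))
        if succeeded[key]:
            repeats[key] += 1
        if step["ok"]:
            succeeded[key] += 1
    return dict(repeats)


trace = [
    {"tool": "lookup_order", "args": {"id": 42}, "ok": True},
    {"tool": "lookup_order", "args": {"id": 42}, "ok": True},   # redundant
    {"tool": "lookup_user", "args": {"id": 7}, "ok": False},
    {"tool": "lookup_user", "args": {"id": 7}, "ok": True},     # retry after failure, not redundant
    {"tool": "lookup_order", "args": {"id": 42}, "ok": True},   # redundant
]

report = redundant_calls(trace)
```

A retry after a failed call is not counted, since that is the loop doing its job. Only repeats of a call that already succeeded show up in the report.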

Patches that keep it on the rails

A short scratchpad that summarizes prior actions (outcomes only, not the full history) cuts redundant calls more than any prompt rewrite. A separate “are we done?” check, run by the same model with a different prompt, catches premature termination far better than letting the loop self-judge. Cap the loop. Always cap the loop.
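The three patches fit together in a few lines. What follows is a minimal sketch, not a reference implementation: `propose_action`, `run_tool`, and `is_done` are hypothetical callables standing in for your model call, your tool dispatcher, and the separate completion check, and the stub versions at the bottom exist only so the example runs without a model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scratchpad:
    """Outcome-only summary of prior actions, not the full history."""
    outcomes: List[str] = field(default_factory=list)

    def record(self, action: str, result: str) -> None:
        self.outcomes.append(f"{action} -> {result}")

    def render(self) -> str:
        return "\n".join(self.outcomes) or "(no actions yet)"


def run_loop(task, propose_action, run_tool, is_done, max_steps=6):
    """ReAct-style loop with a scratchpad, a separate done check, and a hard cap.

    All three callables are assumptions for illustration:
      propose_action(task, summary) -> action name
      run_tool(action)              -> short result string
      is_done(task, summary)        -> bool, ideally a separate prompt
    """
    pad = Scratchpad()
    for step in range(1, max_steps + 1):
        # The model sees only the outcome summary, which keeps state easy to track.
        action = propose_action(task, pad.render())
        pad.record(action, run_tool(action))
        # Completion is judged outside the acting prompt.
        if is_done(task, pad.render()):
            return {"status": "done", "steps": step, "scratchpad": pad.outcomes}
    # Cap reached: stop and report rather than keep looping.
    return {"status": "step_cap_reached", "steps": max_steps, "scratchpad": pad.outcomes}


# Stubs standing in for real model and tool calls.
def propose_action(task, summary):
    return "fetch_invoice" if "fetch_order" in summary else "fetch_order"


def run_tool(action):
    return "ok"


def is_done(task, summary):
    return "fetch_order -> ok" in summary and "fetch_invoice -> ok" in summary


result = run_loop("refund order 42", propose_action, run_tool, is_done)
capped = run_loop("refund order 42", propose_action, run_tool,
                  lambda task, summary: False, max_steps=3)
```

Returning a distinct `step_cap_reached` status, rather than raising or silently returning the last observation, lets the caller decide whether to surface a partial result or escalate.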

ReAct works in production. The published version of the prompt is rarely the version that ships.
