Agent memory: episodic, semantic, and what to keep

The first agent you build has no memory beyond the current conversation, and that works for about a week. Then users come back expecting continuity, and you start bolting on memory: a database table, a vector store, a summary of past sessions stuffed into the system prompt. By month three, the memory layer has more failure modes than the agent itself.

The two kinds worth distinguishing

Episodic memory — what happened, when, in this conversation thread — is what most teams build first because it’s obvious. Semantic memory — what the user prefers, what facts persist across sessions — is what makes an agent feel intelligent. They’re different storage problems, different retrieval problems, and conflating them produces an agent that remembers everything and uses none of it well.
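To make the split concrete, here is a minimal in-process sketch of the two stores. Every name in it (`Episode`, `EpisodicStore`, `SemanticStore`, and their methods) is hypothetical and chosen for illustration; a real system would likely back these with a database or vector store. The point is only that the two stores have different shapes: one is an append-only log read by thread and recency, the other is a keyed table read by user and fact name.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass
class Episode:
    """One event in a conversation thread: what happened, and when."""
    thread_id: str
    timestamp: datetime
    text: str


@dataclass
class EpisodicStore:
    """Append-only log, retrieved by thread and recency (illustrative only)."""
    episodes: List[Episode] = field(default_factory=list)

    def append(self, thread_id: str, text: str) -> None:
        self.episodes.append(Episode(thread_id, datetime.now(timezone.utc), text))

    def recent(self, thread_id: str, limit: int = 5) -> List[Episode]:
        # Keep insertion order; return only the newest `limit` events of this thread.
        in_thread = [e for e in self.episodes if e.thread_id == thread_id]
        return in_thread[-limit:] if limit > 0 else []


@dataclass
class SemanticStore:
    """Keyed facts about a user that persist across sessions (illustrative only)."""
    facts: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def set_fact(self, user_id: str, key: str, value: str) -> None:
        # A newer value overwrites the older one; there is no history here.
        self.facts[(user_id, key)] = value

    def get_fact(self, user_id: str, key: str) -> Optional[str]:
        return self.facts.get((user_id, key))
```

Keeping the two behind separate interfaces means retrieval can differ as well: `recent()` answers "what just happened in this thread", while `get_fact()` answers "what do we know about this user", and neither query has to wade through the other's data.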

What to actually keep

Aggressively forget. The temptation is to log everything in case it’s useful later; the consequence is a context window full of noise that crowds out the model’s reasoning. Summarize old episodes into salient facts. Promote facts to semantic memory only when they’ve been confirmed across multiple sessions. Single-mention “facts” are mostly user mistakes that the agent should not encode.
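The promotion rule can be sketched in a few lines. This is one possible reading, not a prescribed implementation: `FactPromoter`, `observe`, and `forget_unconfirmed` are hypothetical names, and the threshold of two sessions is an assumption, since the text only says "multiple sessions". Summarizing episodes into candidate facts is left out because it would normally need a model call.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

MIN_SESSIONS = 2  # assumed threshold; the text only says "multiple sessions"


class FactPromoter:
    """Tracks candidate facts and promotes those seen in enough distinct sessions."""

    def __init__(self, min_sessions: int = MIN_SESSIONS) -> None:
        self.min_sessions = min_sessions
        # (user_id, key, value) -> ids of the sessions in which it was observed
        self._sightings: Dict[Tuple[str, str, str], Set[str]] = defaultdict(set)
        # Promoted facts: (user_id, key) -> value
        self.semantic: Dict[Tuple[str, str], str] = {}

    def observe(self, user_id: str, session_id: str, key: str, value: str) -> bool:
        """Record a sighting; return True once the fact counts as promoted."""
        sessions = self._sightings[(user_id, key, value)]
        sessions.add(session_id)  # repeats within one session count once
        if len(sessions) >= self.min_sessions:
            self.semantic[(user_id, key)] = value
            return True
        return False

    def forget_unconfirmed(self) -> int:
        """Drop candidates below the threshold; return how many were dropped."""
        stale = [k for k, s in self._sightings.items() if len(s) < self.min_sessions]
        for k in stale:
            del self._sightings[k]
        return len(stale)
```

Counting distinct sessions rather than raw mentions is the design choice that matters here: a user repeating something five times in one conversation is still a single-session claim. When `forget_unconfirmed()` should run (on a timer, at session close, after some number of sessions) is a policy decision this sketch leaves open.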

Agent memory is a curation problem, not a storage problem. The teams that get this right have a forgetting policy, not just a remembering policy.
