File-Based Agents Don't Need a Build Step

The investment banking analyst who spends Friday night formatting a pitch deck isn’t doing analysis. They’re drawing boxes around numbers — numbers that were already pulled from CapIQ, already cross-referenced in the model, already approved by the VP. The only thing standing between that data and a formatted slide is a human being who’d rather be sleeping.

Anthropic’s financial-services repository replaces that workflow with a file. A markdown file containing a system prompt and domain instructions. No build step. No CI/CD. No Docker container. Just a file, and an agent that knows how to read it.

The repo ships 11 agents, 30+ skills, and 11 data connectors — all from one source tree that deploys two ways: as a Claude Cowork plugin and as a headless managed agent behind the API. Everything is markdown and JSON. Everything is composable.

What makes this different

Most “AI for finance” products wrap a chat interface around a data terminal and call it a day. This isn’t that.

The agents own workflows end to end. The Pitch Agent doesn’t just answer questions — it runs comps, pulls precedents, builds an LBO, and populates a branded deck. The Earnings Reviewer ingests a call transcript plus filings and produces a model update and a note draft. The GL Reconciler finds breaks, traces root cause, and routes for sign-off. Each agent is a complete workflow, not a chatbot that hands off halfway through.

The architecture is file-based all the way down. A system prompt in “agents/slug.md”. Domain instructions in “skills/” as markdown files. Data connectors in “.mcp.json”. No build step means no divergence between what you test and what you ship — the files you edit are the files that run.
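"No build step" means a runtime can compose an agent entirely by reading those files at launch. The loader below is an illustrative sketch of that idea, not the repo's actual code; only the directory names (agents/, skills/) come from the layout described above.

```python
# Hypothetical loader: compose a runnable system prompt from the
# file layout described above (agents/slug.md plus skills/*.md).
from pathlib import Path

def compose_system_prompt(root: Path, slug: str) -> str:
    """Concatenate the agent's system prompt with every skill file."""
    parts = [(root / "agents" / f"{slug}.md").read_text()]
    # Sorted for a deterministic prompt across runs.
    for skill in sorted((root / "skills").glob("*.md")):
        parts.append(skill.read_text())
    return "\n\n".join(parts)
```

Because the prompt is assembled from the files on disk at run time, editing a skill file changes the next run immediately; there is no compiled artifact to drift out of sync.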

Most agent platforms force you into their deployment model. This one ships the same files as a desktop plugin and as an API-callable agent. Same source, same behavior, two runtimes.

Eleven MCP connectors come pre-integrated: Daloopa, Morningstar, S&P Global, FactSet, Moody’s, MT Newswires, Aiera, LSEG, PitchBook, Chronograph, and Egnyte. These wire Claude to real financial data — not a sandbox, not a demo dataset. Partner-built plugins from LSEG and S&P Global add bond relative-value analysis, swap curves, and Capital IQ tear sheets.

The Microsoft 365 add-in brings Claude into Excel, PowerPoint, Word, and Outlook — pointing at your own cloud (Vertex AI, Bedrock, or internal gateway) instead of Anthropic’s API.

Who this is for and when it works

Investment banking. Pitch books, CIM drafting, teaser creation, buyer list building, merger modeling, process letters, deal tracking. The repo has a named agent or slash command for each step of the deal lifecycle.

Equity research. Post-earnings notes, initiation reports, model updates, morning meeting prep, sector overviews, thesis tracking, idea generation. Run “/earnings” after the call drops and get a draft while the market is still digesting.

Private equity. Deal sourcing with CRM integration, CIM screening, due diligence checklists by workstream, IC memo drafting, portfolio KPI monitoring, value creation planning. The “/ic-memo” command pulls financials, comps, returns analysis, and risk flags into one document.

Wealth management. Client review prep, financial plans, portfolio rebalancing, tax-loss harvesting, investment proposals. Less glamorous than M&A, but the throughput gains are the same.

Fund administration. GL reconciliation, break tracing, accruals, roll-forwards, variance commentary, NAV tie-outs, LP statement audits. The workflows nobody wants to touch — and that’s exactly why they’re in here.

Where it breaks and when you shouldn’t use it

Every output needs human review. The repo is explicit about this: nothing here constitutes investment, legal, tax, or accounting advice. These agents draft analyst work product for a qualified professional to review. Skip the human sign-off and you’re not being efficient — you’re being negligent.

MCP connectors require subscriptions. Daloopa, PitchBook, LSEG — these are paid data services. The connectors are pre-built, but the data doesn’t flow without an active license. Budget the data costs alongside the compute costs.

The file-based architecture is powerful but puts maintenance on you. Swap connectors by editing “.mcp.json”. Add firm context by editing skill files. Bring templates through “/ppt-template”. This is flexibility, not magic — someone needs to keep those files aligned with how your team actually works.
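Swapping a connector is a small config edit. The sketch below assumes the standard MCP config shape (a top-level "mcpServers" object keyed by connector name); the connector names and entry fields are placeholders for whatever your providers require.

```python
# Minimal sketch of swapping one data connector in .mcp.json.
# Assumes the standard "mcpServers" layout; entry contents vary
# by provider and are illustrative here.
import json
from pathlib import Path

def swap_connector(config_path: Path, old: str, new: str, entry: dict) -> None:
    config = json.loads(config_path.read_text())
    servers = config.get("mcpServers", {})
    servers.pop(old, None)   # drop the connector you don't license
    servers[new] = entry     # wire in your own provider
    config["mcpServers"] = servers
    config_path.write_text(json.dumps(config, indent=2))
```

Because the config is plain JSON, the swap is reviewable in a normal pull request — which is exactly the maintenance burden (and the audit trail) the file-based approach gives you.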

Agent quality degrades when scope sprawls. The Pitch Agent does pitch decks. The Earnings Reviewer does earnings. Don’t ask one agent to cover the other’s territory — the system prompt is narrow for a reason. Edit scope to match your workflow, not to make one agent do everything.

Don’t use these agents for regulatory filings, final client deliverables without review, or any output where a mistake carries legal liability. The model will occasionally hallucinate a multiple or misattribute a data point. Your job is to catch it.

Getting started

Install the marketplace and pick your agents:

# Add the marketplace
claude plugin marketplace add anthropics/claude-for-financial-services

# Core skills + connectors (install first)
claude plugin install financial-analysis@claude-for-financial-services

# Named agents — pick what you need
claude plugin install pitch-agent@claude-for-financial-services
claude plugin install earnings-reviewer@claude-for-financial-services
claude plugin install market-researcher@claude-for-financial-services
claude plugin install gl-reconciler@claude-for-financial-services

# Vertical skill bundles
claude plugin install investment-banking@claude-for-financial-services
claude plugin install equity-research@claude-for-financial-services
claude plugin install private-equity@claude-for-financial-services

In Claude Cowork, open Settings → Plugins → Add plugin and paste the repo URL. Agents appear in the dispatch menu, skills activate automatically, and slash commands become available in your session.

For headless deployment:

export ANTHROPIC_API_KEY=sk-ant-...
scripts/deploy-managed-agent.sh gl-reconciler

Each template under “managed-agent-cookbooks/” references the same system prompt and skills as its plugin counterpart. The deploy script resolves file references, uploads skills, creates sub-agents, and POSTs the orchestrator to “/v1/agents”.
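To make "resolves file references" concrete: the deploy script's internals aren't shown here, so the sketch below assumes a hypothetical {{file:path}} placeholder syntax purely to illustrate the inlining step — a template referencing an agent prompt or skill file gets the file's contents substituted in before upload.

```python
# Illustrative only: a resolver for a *hypothetical* {{file:path}}
# placeholder syntax, showing the kind of inlining a deploy step
# performs before POSTing a fully self-contained agent definition.
import re
from pathlib import Path

def resolve_file_refs(template: str, root: Path) -> str:
    """Replace {{file:relative/path}} placeholders with file contents."""
    def _inline(match: re.Match) -> str:
        return (root / match.group(1)).read_text()
    return re.sub(r"\{\{file:([^}]+)\}\}", _inline, template)
```

The payoff of this design is the one-source-tree guarantee: the headless agent is built from the very same markdown files the desktop plugin loads.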

Making it yours in 30 minutes

Swap the connectors first. Point “.mcp.json” at your data providers and internal systems. The pre-built connectors are starting points — your data stack is different.

Add firm context next. Drop your terminology, process conventions, and formatting standards into the skill files under the relevant vertical. Five bullet points of firm-specific instruction beats fifty pages of generic guidance.

Teach Claude your templates. Run “/ppt-template” with your branded deck, and it learns your layouts. Same for Excel models — show it your comps template once, and subsequent runs match your formatting.

Edit agent scope last. Each agent’s system prompt lives in a markdown file. If your team runs comps before precedents instead of after, swap the order. If you use a different valuation methodology, describe it.

The repo is designed to be forked. Copy the structure for workflows not yet covered — the pattern is the same: system prompt, skills, commands, and you’re done.

An end-to-end example: from CIM to IC memo

A PE associate receives an inbound CIM on Tuesday morning. Here’s what the next 90 minutes look like with the agent stack running alongside them.

The associate drops the CIM into Claude and runs “/screen-deal”. The agent parses the document, extracts financials and KPIs, and produces a pass/fail assessment against the fund’s criteria — revenue threshold, margin profile, market position, growth rate. It flags what’s missing. The associate reads the output, agrees with the judgment, and decides to proceed.

They run “/comps” to pull comparable public companies and precedent transactions. The agent hits PitchBook and CapIQ through the MCP connectors, builds a trading comps table with EV/EBITDA and P/E multiples, then runs a sensitivity table at ±1x. The associate adjusts the peer group — the agent’s selection was reasonable but missed two names.
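The ±1x sensitivity is simple arithmetic: implied enterprise value at the peer-median EV/EBITDA multiple, one turn below, and one turn above. A sketch with illustrative figures:

```python
# Implied EV at the median multiple and one turn either side.
# All figures are illustrative, not drawn from any dataset.
def ev_sensitivity(ebitda: float, median_multiple: float, step: float = 1.0):
    return {
        f"{m:.1f}x": round(ebitda * m, 1)
        for m in (median_multiple - step, median_multiple, median_multiple + step)
    }

# e.g. $50m EBITDA at a 9.0x peer median:
# ev_sensitivity(50.0, 9.0) -> {"8.0x": 400.0, "9.0x": 450.0, "10.0x": 500.0}
```

The value of the agent isn't the arithmetic — it's pulling the peer set and multiples from PitchBook and CapIQ and laying the table out in the fund's format.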

Finally, “/ic-memo” pulls everything together: investment thesis, company overview, financial summary, comps output, returns analysis with IRR and MOIC across cases, key risks, diligence questions, and a recommendation. The output is in the fund’s branded format because the template was taught once and reused.

The associate spends 30 minutes reviewing, adjusting assumptions, and tightening the prose. The other 60 minutes would have been mechanical work — pulling data, formatting tables, populating the template. That part is gone.

Total time saved: roughly an hour per deal screening. At 200 deals screened per year per associate, that’s about 200 hours a year — five working weeks. The math isn’t subtle.

What to build on

Start with one agent and one vertical. The Pitch Agent or the Earnings Reviewer, depending on whether your team is deal-facing or market-facing. Get the data connectors working, get the firm context right, and build the review loop before adding more agents.

The repo is Apache 2.0 licensed and open to contributions. New skills go in “plugins/vertical-plugins/skills/”. New agents follow the “agents/slug.md” + “skills/” pattern. Run “python3 scripts/check.py” before pushing — it lints every manifest and verifies all cross-file references resolve.
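To show the shape of a cross-file reference check, here is a toy version (the real script's internals aren't shown in this post, and the path pattern below is an assumption): scan every markdown file for agents/ and skills/ paths and flag any that don't exist on disk.

```python
# Toy cross-file reference check, in the spirit of scripts/check.py
# (not its actual code): find agents/*.md and skills/*.md paths
# mentioned in any markdown file and report the ones that are missing.
import re
from pathlib import Path

def broken_references(root: Path) -> list[str]:
    missing = []
    pattern = re.compile(r"\b(?:skills|agents)/[\w\-/]+\.md\b")
    for md in root.rglob("*.md"):
        for ref in pattern.findall(md.read_text()):
            if not (root / ref).exists():
                missing.append(f"{md.relative_to(root)}: {ref}")
    return missing
```

A check like this is what keeps a file-based system honest: since there's no compiler to catch a dangling reference, the lint step has to.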

File-based agents aren’t a stopgap until something better comes along. They’re the right abstraction: composable, auditable, forkable, and deployable without a build pipeline. For domain-specific workflows where the process is known but labor-intensive, they’re already the best option.

If your team spends more than five hours a week on mechanical analyst work, the math works in your favor. Try one agent. Give it your template and your firm context. Then decide.

GitHub: https://github.com/anthropics/financial-services
