Math-To-Manim: Six AI Agents Turn Math Questions Into Cinematic Animations
- Smars
- Opensource, AI, Animation
- 23 Jun, 2026
The Problem
You’re a math teacher or science communicator. You have an idea — “animate how Fourier epicycles are rotating vectors that sum to a curve.” You open Manim, realize it’s 200 lines of Python, spend three hours debugging camera angles and font sizes, and the exported GIF still looks blurry.
Or you ask ChatGPT to generate Manim code. It runs, but the MathTex objects overlap, formulas collide with captions, and the camera zooms out too far to read anything. You don’t lack coding skills — you lack patience for visual parameter tuning.
Math-To-Manim solves this by taking a one-sentence prompt and returning a complete animated film.
What It Is
Math-To-Manim is an open-source project that lets AI agents handle the entire pipeline from “understand the concept” to “render the MP4.” It has 2.4k GitHub stars, 255 forks, and 339 commits, is MIT-licensed, and is written in Python.
Christian H. Cooper created the repo on January 20, 2025 — the same day DeepSeek R1 dropped. His tweet said it all: “I asked R1 to visually explain the Pythagorean theorem. This was done in one shot with no errors in less than 30 seconds. Wrap it up, its over.”
The core insight: LLMs can already generate runnable Manim code. The problem is quality — camera language, pacing, and visual polish are random. Math-To-Manim adds six reasoning stages before code generation so the AI “thinks it through” first.
Why It Stands Out
Six-Stage Reasoning Chain: Not Writing Code, Directing a Film
A normal text-to-code demo skips all instructional design. Math-To-Manim takes the longer path:
- Intent — Clarify what the learner actually wants to understand
- Prerequisite Graph — Reverse-engineer the knowledge needed before the target concept
- Curriculum — Turn the graph into a teachable sequence
- Math Director — Select definitions, equations, assumptions, and examples
- Cinematographer — Design camera language, text layout, and animation timing per frame
- Scene Composer — Compile everything into Manim objects and animation sequences
Only then do code generation, static validation, rendering, and self-repair kick in.
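To make the Prerequisite Graph to Curriculum step concrete, here is a minimal sketch that orders concepts with a topological sort. The graph contents, the dictionary shape, and the `build_curriculum` name are illustrative assumptions, not the project's actual schema or code.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite graph: each concept maps to the concepts a
# learner should see first. The project's real schema may look different.
prerequisites = {
    "fourier epicycles": {"complex exponentials", "vector addition"},
    "complex exponentials": {"complex numbers", "trigonometry"},
    "complex numbers": set(),
    "trigonometry": set(),
    "vector addition": set(),
}


def build_curriculum(graph):
    """Order concepts so every prerequisite comes before what depends on it."""
    # static_order() yields predecessors first and raises CycleError
    # if the graph contains a circular dependency.
    return list(TopologicalSorter(graph).static_order())


curriculum = build_curriculum(prerequisites)
for step, concept in enumerate(curriculum, start=1):
    print(step, concept)
```

The target concept always comes last, because every other node in this toy graph is one of its direct or indirect prerequisites.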
What this means: It’s not “writing Manim code.” It’s “figuring out how to teach this concept, then filming the teaching process as a movie.” Every frame has a pedagogical purpose.
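The static validation step could, for instance, check generated code without running it. The sketch below parses the source with Python's `ast` module and confirms that some class derives from a `*Scene` base and defines `construct()`, which is how Manim scenes are written. The function name and the specific checks are assumptions for illustration; the project's own validator is not shown in this article.

```python
import ast


def validate_scene_source(source):
    """Return a list of problems found without executing the code."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []
    classes = [n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    # Treat any class whose base name ends in "Scene" as a Manim scene.
    scene_classes = [
        c for c in classes
        if any(isinstance(b, ast.Name) and b.id.endswith("Scene") for b in c.bases)
    ]
    if not scene_classes:
        problems.append("no class deriving from a *Scene base was found")
    for cls in scene_classes:
        methods = {n.name for n in cls.body if isinstance(n, ast.FunctionDef)}
        if "construct" not in methods:
            problems.append(f"{cls.name} has no construct() method")
    return problems


good = """
from manim import Scene, MathTex, Write

class Demo(Scene):
    def construct(self):
        self.play(Write(MathTex("a^2 + b^2 = c^2")))
"""
bad = "class Demo(Scene):\n    pass\n"

print(validate_scene_source(good))
print(validate_scene_source(bad))
```

Because the source is only parsed, never imported, this check runs even on a machine without Manim or LaTeX installed.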
Prime Intellect RL: Training Models to Fix Their Own Mistakes
Math-To-Manim is also building a reinforcement learning environment on Prime Intellect. The first RL target isn’t “make the perfect video in one shot” — it’s the edit move after a plausible but flawed scene: text overlapping formulas, equations too small, camera angle hiding the point.
The RL environment publishes the typed scene plan, generated Python, validation/render evidence, and a human request like “fix the overlap” or “zoom into the formulas.” The model learns to return sparse code edits that preserve the scene while making it more readable.
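One way to picture a "sparse code edit" is as a small set of exact-match replacements applied to the generated scene. The sketch below assumes a hypothetical `SparseEdit` format and an `apply_edits` helper; the real environment's edit representation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SparseEdit:
    """Hypothetical edit format: replace one exact snippet with another."""
    old: str
    new: str


def apply_edits(source, edits):
    """Apply each edit only if its target occurs exactly once in the source."""
    for edit in edits:
        count = source.count(edit.old)
        if count != 1:
            raise ValueError(f"expected one match for {edit.old!r}, found {count}")
        source = source.replace(edit.old, edit.new)
    return source


# A fragment of generated scene code, held as text rather than executed.
scene = (
    "title = Text('Pythagorean theorem').to_edge(UP)\n"
    "formula = MathTex('a^2 + b^2 = c^2').scale(0.5)\n"
    "caption = Text('Right triangles only').next_to(formula, DOWN, buff=0.1)\n"
)

# Human request: "the formula is too small and the caption crowds it".
edits = [
    SparseEdit(".scale(0.5)", ".scale(1.2)"),
    SparseEdit("buff=0.1", "buff=0.6"),
]

patched = apply_edits(scene, edits)
print(patched)
```

Requiring exactly one match keeps each edit unambiguous, so the rest of the scene is left untouched.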
The hub environment is live at harleycooper/math-to-manim@0.1.1, with a 25-step pilot completed on Qwen/Qwen3.5-35B-A3B. This isn’t a code generator — it’s a trainable animation production environment.
Dual Backend: Claude Mythos + Codex CLI
Two code generation backends are supported:
- Claude Mythos — Six-stage reasoning chain, cinematic output, best for high-quality scenes
- Codex CLI — Locally authenticated OpenAI sessions, incremental migration, best for fast iteration
Planning stages use typed adapters; only code generation and repair stages switch to Codex. The migration is incremental, not all-or-nothing.
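The routing described above can be sketched as a per-stage lookup table in which only the code generation and repair stages point at the swappable provider. The stage keys and backend functions here are placeholders invented for illustration, not the project's real identifiers.

```python
from typing import Callable, Dict

Backend = Callable[[str], str]


def typed_adapter(task: str) -> str:
    """Stand-in for the planning backend (hypothetical)."""
    return f"[typed-adapter] {task}"


def codex_cli(task: str) -> str:
    """Stand-in for the Codex CLI backend (hypothetical)."""
    return f"[codex-cli] {task}"


# Stage names follow the article's list; the exact keys are assumptions.
PLANNING_STAGES = (
    "intent", "prerequisite_graph", "curriculum",
    "math_director", "cinematographer", "scene_composer",
)
CODE_STAGES = ("codegen", "repair")


def build_routes(codegen_provider: Backend) -> Dict[str, Backend]:
    """Planning stays on typed adapters; only code stages use the provider."""
    routes: Dict[str, Backend] = {stage: typed_adapter for stage in PLANNING_STAGES}
    routes.update({stage: codegen_provider for stage in CODE_STAGES})
    return routes


routes = build_routes(codex_cli)
for stage, backend in routes.items():
    print(f"{stage:18} -> {backend.__name__}")
```

Keeping the switch confined to two entries is what makes an incremental migration possible: the planning stages never need to know which provider writes the code.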
Quick Start
Three steps to your first animation:
# 1. Clone and install
git clone https://github.com/HarleyCoops/Math-To-Manim.git
cd Math-To-Manim
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,render]"
# 2. Install render dependencies (Debian/Ubuntu/WSL)
./scripts/bootstrap-render.sh
# 3. Generate an animation
m2m2 generate \
"Show why the quantum harmonic oscillator only allows discrete energies: start with a springy potential well, zoom into the wavefunctions, then reveal the ladder of allowed energy levels." \
--codegen-provider codex-cli \
--codex-full-auto \
--style cinematic \
--quality l \
--runs-dir runs
Videos and all intermediate artifacts land in runs/<run_id>/ — including JSON contracts, generated code, validation reports, render results, and a manifest.
Run a no-API smoke test to verify the pipeline without spending model credits:
math-to-manim generate "Explain why derivatives are slopes" --deterministic --no-render
The Killer Demo
The QED journey is a 200-second, ~160-animation tour of quantum electrodynamics:
python -m mythos.harness "explain quantum field theory" --render -q m
# or render the flagship scene directly
manim -qh examples/mythos/qft_cinematic.py QFTCinematicJourney
The camera enters the Lagrangian, spotlights each term with a plain-language caption, flies into the exact symbol being explained, then pulls back to restore context. It’s a guided tour through mathematical structure, not a formula recitation.
Every run produces a full bundle:
runs/<run_id>/
  request.json            # original prompt
  intent.json             # parsed intent
  knowledge_graph.json    # prerequisite graph
  curriculum.json         # teachable sequence
  math_packet.json        # math materials
  storyboard.json         # frame-by-frame plan
  scene_spec.json         # Manim object specs
  generated_scene.py      # generated code
  validation_report.json
  render_result.json
  review_report.json
  manifest.json
Every JSON is a versioned contract, readable by downstream tools. Not a black box — a fully observable pipeline.
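Since the bundle is plain files, a downstream tool can audit a run with nothing but the standard library. The sketch below uses the file names from the listing above and assumes only that the `.json` files hold valid JSON; it makes no assumption about their internal fields. The `summarize_run` helper and the demo contents are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# File names taken from the bundle listing; schemas are deliberately not assumed.
EXPECTED_ARTIFACTS = [
    "request.json", "intent.json", "knowledge_graph.json", "curriculum.json",
    "math_packet.json", "storyboard.json", "scene_spec.json",
    "generated_scene.py", "validation_report.json", "render_result.json",
    "review_report.json", "manifest.json",
]


def summarize_run(run_dir):
    """Report which expected artifacts exist and whether JSON files parse."""
    run_dir = Path(run_dir)
    summary = {}
    for name in EXPECTED_ARTIFACTS:
        path = run_dir / name
        if not path.exists():
            summary[name] = "missing"
        elif path.suffix == ".json":
            try:
                json.loads(path.read_text(encoding="utf-8"))
                summary[name] = "ok"
            except json.JSONDecodeError:
                summary[name] = "invalid json"
        else:
            summary[name] = "ok"
    return summary


# Demo against a throwaway directory standing in for runs/<run_id>/.
# The file contents below are placeholders, not real pipeline output.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp)
    (run / "request.json").write_text('{"prompt": "demo"}', encoding="utf-8")
    (run / "manifest.json").write_text("{not json", encoding="utf-8")
    (run / "generated_scene.py").write_text("# placeholder\n", encoding="utf-8")
    summary = summarize_run(run)

for name, status in summary.items():
    print(f"{name:24} {status}")
```

Pointing `summarize_run` at a real `runs/<run_id>/` directory would show at a glance which stage stopped producing output.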
Community and Ecosystem
The numbers look healthy: 2.4k stars, 255 forks, 339 commits, 18 watchers. Documentation is thorough — the docs/ directory has architecture docs, RL integration notes, a roadmap, and a showcase gallery with 16 curated GIFs.
Hermes Agent is integrated as the contributor/operator layer. It’s not a runtime dependency — it interacts with the repo the way a developer does: reads files, searches code, runs tests, reviews rendered output. This makes maintaining the reverse-reasoning pipeline practical.
The Prime Intellect RL integration is the most interesting ecosystem piece. Models are learning to fix visual post-render issues — text overlap, tiny formulas, wrong camera angles. This is a trainable animation environment, not just a code generator.
When to Use (and When Not To)
Good fit:
- Math/physics explainer videos — from concept to rendered animation
- Online education content — course animations, knowledge visualization
- Academic paper figures — dynamic demonstrations for publications
- Social media — quick math beauty shorts
Bad fit:
- Commercial animation with precise timing control — the pipeline is automated, limited room for manual tuning
- Narrative storytelling — focused on instructional explanation, not plot
- Real-time rendering — requires offline rendering, no live preview
- Non-math content — the reasoning chain is designed around mathematical teaching
Heads up:
- Rendering needs FFmpeg and LaTeX — additional system deps on WSL and macOS
- High-quality rendering (-q h) is slow; use --no-render for fast iteration
- Claude Mythos requires a Claude subscription or API access; Codex CLI needs local auth
Verdict
Math-To-Manim proves that code generation is just the starting line. Making AI understand “how to teach” is the real differentiator. Six reasoning stages turn a vague question into a pedagogically paced animated film, with every intermediate artifact observable, verifiable, and repairable.
If you’re working on math visualization or educational content, this project deserves serious attention. It’s not just a code generator — it’s an animation production pipeline that thinks about teaching first.