Superpowers: Engineering Discipline for AI Coding Agents

AI coding agents usually fail in a boring way: they start too fast.

You ask for a login page, and they edit files. You ask them to fix a bug, and they guess. You say the feature is simple, and they believe you. The code may run, but the requirement is fuzzy, tests are missing, and nobody is sure the edge cases survived.

Superpowers fixes that problem by giving AI agents engineering discipline.

It is not a magic prompt. It does not make the model smarter. It gives your agent a process: clarify first, plan before coding, write tests before implementation, verify before declaring victory.

What Superpowers Is

Superpowers is an open-source project maintained by Jesse Vincent and the Prime Radiant team. The repository is https://github.com/obra/superpowers.

It provides a set of composable skills for Claude Code, Codex, Cursor, Gemini CLI, OpenCode, GitHub Copilot CLI, and other coding agents. These skills trigger when they match the task, then guide the agent through a structured software development workflow.

The idea is simple:

Do not let the AI jump straight into code. Make it understand the problem, then execute in verifiable steps.

The README describes Superpowers as a complete software development methodology built on skills and startup instructions. The value is not another tool. The value is a repeatable working style.

What It Changes

Without Superpowers, a typical agent session looks like this:

You describe a feature
The agent edits files
You notice the direction is wrong
The agent adds tests later
You point out missing cases
Nobody fully trusts the result

Superpowers flips the order:

brainstorming: clarify what you actually want
writing-plans: break the design into small executable tasks
test-driven-development: write the failing test before the implementation
systematic-debugging: find the root cause instead of guessing
requesting-code-review: review the work before handing it back
finishing-a-development-branch: verify, clean up, and choose how to finish

This can feel slower at first. In real projects, it often saves time. Two extra questions are cheaper than rebuilding the wrong thing.

Where It Works Best

Superpowers is especially useful for three kinds of work.

Feature work where the requirement is still soft.

“Add CSV import,” “build a dashboard,” or “refactor the settings page” all sound clear. They rarely are. You still need decisions about inputs, errors, permissions, old data, and how users recover from mistakes.

Superpowers triggers brainstorming so those decisions surface before code changes.

Code changes that need stability.

When you change business logic, fix bugs, or refactor shared modules, regression risk is real. The TDD skill forces the agent to write a failing test, watch it fail, then implement the smallest change that passes.

That sequence matters. A test that fails first, and fails for the expected reason, shows that it actually exercises the behavior you are about to change.

Longer tasks with multiple steps.

Superpowers includes writing-plans, executing-plans, dispatching-parallel-agents, and subagent-driven-development. These skills turn a large request into small chunks with checkpoints.

That structure matters when you want an agent to work for more than a few minutes. Without it, long sessions drift.

Where It Does Not Fit

Superpowers is not the right weight for everything.

If you only need a command lookup, a typo fix, or a throwaway snippet, the full workflow can feel heavy.

If your team explicitly does not want TDD, Superpowers will push against that culture. Its test-driven-development skill is intentionally strict. Decide whether you want speed theater or verifiable stability.

It also does not replace product judgment. It can ask better questions, split work into tasks, and verify the result. You still own the scope.

Installation

Superpowers supports several coding environments. Installation depends on the harness.

For Claude Code, install it from the official plugin marketplace:

/plugin install superpowers@claude-plugins-official

Or install it through the Superpowers marketplace:

/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace

For Codex CLI and Codex App, Superpowers is available through the official Codex plugin marketplace.

In Codex CLI, open the plugin interface:

/plugins

Search for Superpowers and install it.

In Codex App, open Plugins from the sidebar, look in the Coding section, and click the plus button next to Superpowers.

If you use Codex native skill discovery, you can also install it manually:

git clone https://github.com/obra/superpowers.git ~/.codex/superpowers
mkdir -p ~/.agents/skills
ln -s ~/.codex/superpowers/skills ~/.agents/skills/superpowers

Restart Codex so it discovers the skills.

For Gemini CLI:

gemini extensions install https://github.com/obra/superpowers

To update later:

gemini extensions update superpowers

Quick Start

After installation, you should not need to memorize every skill name. The point of Superpowers is that relevant skills trigger automatically.

Try it with a real but contained task:

Add CSV import to this project. Confirm the requirements with me before changing code.

If Superpowers is working, the agent should not immediately edit files. It should enter brainstorming and ask about the CSV format, validation rules, duplicate rows, permissions, and error handling.

Once the design is clear, it should write an implementation plan. The plan should break the task into small steps with file paths, test commands, and verification notes.

During implementation, the TDD skill should push the agent into this rhythm: failing test first, minimal code second, refactor only after green. Each step is checked by running the relevant test file, for example:

yarn test path/to/import.test.ts

That is the core Superpowers experience: fewer “I think it works” moments, more “I verified this behavior” moments.

Practical Workflow: Build a CSV Import Feature

Use one concrete example: add “import users from CSV” to an admin area.

The point is not the framework code. The point is how Superpowers changes each stage of the session. You can swap CSV import for payment webhooks, export jobs, permission changes, or a settings-page refactor. The workflow stays similar.

Stage One: Startup and Skill Check

Skill used: using-superpowers

This happens at the start of the session. Its job is to remind the agent to check for relevant skills before doing work.

You usually do not call it manually. Once Superpowers is installed, the agent should check the skill list before it acts.

Start like this:

Add user CSV import to the admin area. Fields are email, name, and role.
Do not write code yet. Confirm requirements and boundaries first.

If Superpowers is working, the agent should not open files immediately. It should move into requirement clarification.

Watch for whether it says it will use brainstorming, or at least starts asking design questions. If it jumps into code, interrupt it:

Use the Superpowers brainstorming workflow first. Do not implement yet.

Stage Two: Requirement Clarification

Skill used: brainstorming

brainstorming is not small talk. It turns a vague request into an implementable design.

For CSV import, it should ask:

Does the first row need headers?
Should duplicate email values be skipped, overwritten, or rejected?
Which role values are allowed?
Is there a maximum row count?
Can the import partially succeed?
Where does the admin see the import result?
Do failed rows need a downloadable error report?

Your job is to make decisions. Do not rush the agent into coding.

A useful answer looks like this:

Rules:
- First row must be headers
- email is required and must be unique
- role can only be admin or member
- Max 1000 rows per import
- Partial success is allowed
- Failed rows must return row number and reason
- Do not build downloadable error reports yet
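Answers like these translate almost directly into types. Here is a minimal sketch in TypeScript. Every name in it (Role, ImportedUser, FailedRow, ImportResult, MAX_ROWS, REQUIRED_HEADERS, isRole) is hypothetical and chosen for illustration; none of them come from Superpowers or from a real project.

```typescript
// Hypothetical contract for the import rules agreed during brainstorming.
// Every name here is illustrative, not taken from Superpowers.

type Role = "admin" | "member"; // role can only be admin or member

const MAX_ROWS = 1000; // max 1000 rows per import
const REQUIRED_HEADERS: readonly string[] = ["email", "name", "role"];

interface ImportedUser {
  email: string; // required, must be unique
  name: string;
  role: Role;
}

interface FailedRow {
  row: number; // row number in the uploaded file
  reason: string; // why the row was rejected
}

// Partial success is allowed, so one result carries both lists.
interface ImportResult {
  imported: ImportedUser[];
  failed: FailedRow[];
}

// Type guard that narrows an untrusted string to a Role.
function isRole(value: string): value is Role {
  return value === "admin" || value === "member";
}

console.log(isRole("admin"), isRole("owner")); // prints: true false
```

Writing the rules down this way also makes the gaps visible. There is no field for a downloadable error report, because that was explicitly left out.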

brainstorming should produce a design draft in readable chunks. It should not dump a huge document at once.

You can ask:

Also list what we are explicitly not building, so implementation does not drift.

That matters. Superpowers pushes YAGNI. Unneeded features should stay out.

Stage Three: Isolated Workspace

Skill used: using-git-worktrees

After design approval, Superpowers may suggest a separate worktree and branch.

The purpose is simple: keep unfinished agent work away from your current working tree.

Say:

The design is approved. Use git worktree to create an isolated branch for implementation.

The agent should:

Check the current git status
Create a branch, such as feature/csv-import
Create a worktree
Move into the new workspace
Run the existing tests to confirm a clean baseline

For a tiny change, you can skip the worktree:

This is small. Do not create a worktree. Continue on the current branch, but check git status first.

Worktrees are not ceremony. They solve isolation. Use them for long tasks, parallel work, or risky refactors.

Stage Four: Implementation Plan

Skill used: writing-plans

writing-plans turns the approved design into executable work. A good plan is not “build import.” It is a set of small tasks.

Superpowers expects the plan to include:

Which files change
What each step does
Which test comes first
Which command verifies the step
What counts as done

Trigger it like this:

Write an implementation plan from the approved design. Keep each task around 2-5 minutes and include test and verification steps.

A reasonable plan might include:

Add CSV parsing tests for empty files, missing fields, and duplicate email values
Add failing tests for the importUsers service
Implement minimal CSV parsing
Implement role validation and row-level errors
Connect the admin API route
Add the upload entry point and result display
Run unit tests, type checks, and manual verification

Review three things:

Did the agent turn unknowns into assumptions?
Did it sneak in unapproved features like import history?
Does every step include verification?
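The last check is mechanical enough to sketch in code. The PlanTask shape and the tasksMissingVerification helper below are assumptions made for illustration. They are not a Superpowers format; they only mirror the checklist of what a plan should include.

```typescript
// Hypothetical shape for one plan task. This is not a Superpowers format;
// it only mirrors the checklist of what a plan should include.
interface PlanTask {
  title: string; // what the step does
  files: string[]; // which files change
  firstTest?: string; // which test comes first
  verifyCommand?: string; // which command verifies the step
  doneWhen?: string; // what counts as done
}

// Return the titles of tasks that cannot be verified as written.
function tasksMissingVerification(tasks: PlanTask[]): string[] {
  return tasks
    .filter((t) => !t.verifyCommand || t.verifyCommand.trim() === "")
    .map((t) => t.title);
}

const plan: PlanTask[] = [
  {
    title: "Add failing tests for the importUsers service",
    files: ["src/server/importUsers.test.ts"],
    firstTest: "empty CSV returns an error",
    verifyCommand: "yarn test src/server/importUsers.test.ts",
    doneWhen: "the new test fails for the expected reason",
  },
  { title: "Connect the admin API route", files: [] },
];

console.log(JSON.stringify(tasksMissingVerification(plan)));
// prints: ["Connect the admin API route"]
```

In practice you review the plan by reading it. The sketch only shows what to look for: a task with no command to run is a task nobody can check.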

If the plan is too broad, say:

This plan is too coarse. Split it into independently testable and committable steps.

Stage Five: Choose Execution Mode

Skills used: executing-plans or subagent-driven-development

After the plan is approved, there are two common execution paths.

executing-plans is best when you want human checkpoints. The agent runs a batch, then stops for feedback.

Say:

Execute the plan. Stop and report after every two tasks.

subagent-driven-development is better for larger tasks with clean boundaries. It dispatches subagents and reviews each task in two stages: spec compliance first, code quality second.

Say:

Use subagent-driven-development for this task. Review each subtask for spec compliance and code quality.

Choose like this:

Situation	Suggested path
Small feature, close supervision	executing-plans
Medium or large feature with clear task boundaries	subagent-driven-development
Independent modules can run in parallel	dispatching-parallel-agents
You do not fully trust the agent yet	executing-plans

If this is your first time using Superpowers, start with executing-plans. It is slower, but easier to inspect.

Stage Six: Test-Driven Implementation

Skill used: test-driven-development

This is the strictest part of Superpowers.

It enforces RED-GREEN-REFACTOR:

RED: write a failing test
GREEN: write the smallest implementation that passes
REFACTOR: clean up while tests stay green

For CSV import, the first step should not be writing a parser. It should be a failing test:

First write a failing test for “empty CSV returns an error.” Run it and confirm it fails before implementing.

Watch whether the agent actually runs the test. Writing a test without running it is not RED.
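To make RED concrete, here is a sketch of that first failing test, written without a test framework so it runs on its own. The names parseUsersCsv and ParseResult and the “file is empty” message are assumptions for illustration. In a real project the test would live in whatever runner the project already uses.

```typescript
// RED: the test exists and runs, but the validation does not exist yet.
type ParseResult =
  | { ok: true; rows: string[][] }
  | { ok: false; error: string };

// Hypothetical placeholder with no empty-file validation.
function parseUsersCsv(_text: string): ParseResult {
  return { ok: true, rows: [] };
}

// Returns null when the test passes, or a failure message when it fails.
function testEmptyCsvReturnsError(): string | null {
  const result = parseUsersCsv("");
  if (result.ok) {
    return "expected an error for an empty CSV, but parsing succeeded";
  }
  if (result.error !== "file is empty") {
    return `unexpected error message: ${result.error}`;
  }
  return null;
}

const failure = testEmptyCsvReturnsError();
console.log(failure === null ? "PASS" : `FAIL: ${failure}`);
// prints: FAIL: expected an error for an empty CSV, but parsing succeeded
```

The failure message is the evidence. It says the test failed because validation is missing, not because of a typo or a broken import.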

A normal command might be:

yarn test src/server/importUsers.test.ts

After the test fails, ask for the smallest implementation:

Now implement only the empty CSV validation needed to pass this test. Do not implement other rules yet.
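Here is a sketch of what “smallest” can mean, using the same hypothetical parseUsersCsv and ParseResult names. Only the empty-file branch does real work. Everything else is deliberately left unimplemented.

```typescript
type ParseResult =
  | { ok: true; rows: string[][] }
  | { ok: false; error: string };

// GREEN: the smallest change that makes the empty-CSV test pass.
function parseUsersCsv(text: string): ParseResult {
  if (text.trim() === "") {
    return { ok: false, error: "file is empty" };
  }
  // Header, role, and duplicate rules wait for their own failing tests.
  return { ok: true, rows: [] };
}

const empty = parseUsersCsv("");
console.log(empty.ok ? "ok" : empty.error); // prints: file is empty
```

The function still returns no rows for a valid file. That is fine at this point, because no test has asked for rows yet.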

Then continue with the next cases:

Missing email column
role is not admin or member
duplicate email inside the same file
partial success returns success count and failed rows

Each behavior gets a failing test before implementation.
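After several red-green cycles, the service might look something like the sketch below. It is an illustration under stated assumptions, not a reference implementation. All names are hypothetical, parsing is a naive split on commas with no support for quoted fields, whole-file problems throw, and row-level problems are collected so that partial success is possible. Uniqueness is checked only inside the file, not against existing users.

```typescript
// Where the red-green cycles could end up. All names are hypothetical.
type Role = "admin" | "member";

interface User { email: string; name: string; role: Role }
interface FailedRow { row: number; reason: string }
interface ImportResult { imported: User[]; failed: FailedRow[] }

const MAX_ROWS = 1000;

function isRole(value: string): value is Role {
  return value === "admin" || value === "member";
}

function importUsers(csvText: string): ImportResult {
  if (csvText.trim() === "") throw new Error("file is empty");

  // Naive parsing: split on line breaks and commas, no quoted fields.
  const lines = csvText.split(/\r?\n/);
  const headers = (lines[0] ?? "").split(",").map((h) => h.trim().toLowerCase());
  const col = (header: string): number => {
    const idx = headers.indexOf(header);
    if (idx === -1) throw new Error(`missing ${header} column`);
    return idx;
  };
  const emailIdx = col("email");
  const nameIdx = col("name");
  const roleIdx = col("role");

  // Keep the file line number so failures can report it (header is row 1).
  const dataRows: { row: number; cells: string[] }[] = [];
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i] ?? "";
    if (line.trim() !== "") dataRows.push({ row: i + 1, cells: line.split(",") });
  }
  if (dataRows.length > MAX_ROWS) throw new Error(`more than ${MAX_ROWS} rows`);

  const seen = new Set<string>(); // emails already accepted from this file
  const result: ImportResult = { imported: [], failed: [] };

  for (const { row, cells } of dataRows) {
    const email = (cells[emailIdx] ?? "").trim();
    const name = (cells[nameIdx] ?? "").trim();
    const role = (cells[roleIdx] ?? "").trim();

    if (email === "") {
      result.failed.push({ row, reason: "email is required" });
    } else if (seen.has(email)) {
      result.failed.push({ row, reason: "duplicate email in file" });
    } else if (!isRole(role)) {
      result.failed.push({ row, reason: "role must be admin or member" });
    } else {
      seen.add(email);
      result.imported.push({ email, name, role });
    }
  }
  return result;
}

const sample = [
  "email,name,role",
  "ann@example.com,Ann,admin",
  ",Bob,member",
  "ann@example.com,Ann Again,member",
  "cy@example.com,Cy,owner",
].join("\n");

console.log(JSON.stringify(importUsers(sample).failed));
```

For the sample input, one user is imported and rows 3, 4, and 5 are reported with a reason each. Every branch in the loop corresponds to one case from the list, and each should have arrived with its own failing test.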

Superpowers pushes back against “write all the code, then add tests.” In a strict TDD flow, code written before tests may be deleted or reverted.

Stage Seven: Root-Cause Debugging

Skill used: systematic-debugging

Suppose one import test flakes: it passes locally but fails in CI.

A normal agent may guess: add a wait, increase a timeout, change a mock. systematic-debugging forces a four-phase process:

Reproduce the issue
Identify the failing condition
Trace the root cause
Add protection against the class of bug

Trigger it like this:

This test flakes in CI. Use systematic-debugging. Do not guess. Gather evidence first.

A good debugging pass checks:

Local versus CI Node version, timezone, and file encoding
The failing input, not just the error message
Whether behavior depends on object key order, line endings, or async timing
Whether the assertion is too brittle
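Line endings are a good example of that kind of evidence. The sketch below assumes a hypothetical parser that splits on "\n" and a CI checkout that produced Windows line endings. Neither detail comes from the scenario above; they are picked only to show what a reproduction looks like.

```typescript
// Hypothetical reproduction: the same CSV content with LF and CRLF endings.
const lfFixture = "email,role\na@example.com,member\n";
const crlfFixture = "email,role\r\na@example.com,member\r\n";

// Suspect code path: splitting on "\n" leaves a trailing "\r" on each line.
function roleOfFirstRowNaive(csv: string): string {
  const cells = (csv.split("\n")[1] ?? "").split(",");
  return cells[1] ?? "";
}

// Root-cause fix: accept both line endings when splitting.
function roleOfFirstRow(csv: string): string {
  const cells = (csv.split(/\r?\n/)[1] ?? "").split(",");
  return cells[1] ?? "";
}

// JSON.stringify makes the invisible carriage return visible.
console.log(JSON.stringify(roleOfFirstRowNaive(lfFixture))); // prints: "member"
console.log(JSON.stringify(roleOfFirstRowNaive(crlfFixture))); // prints: "member\r"
console.log(JSON.stringify(roleOfFirstRow(crlfFixture))); // prints: "member"
```

A longer timeout or a changed mock would not have touched this. A regression test that feeds the CRLF fixture protects against the whole class of bug.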

The value is in preventing fixes that treat only the symptom.

Stage Eight: Verification Before Completion

Skill used: verification-before-completion

Many agents say “should work” after changing code. Superpowers does not accept “should.”

verification-before-completion asks the agent to provide evidence.

Say:

Before saying this is done, run verification-before-completion. List the commands you actually ran and the results.

For CSV import, verification should include:

Unit tests pass
Type check passes
Lint or formatting check passes
Manual upload of a valid CSV
Manual upload of a CSV with failed rows
UI shows success count and error reasons

If something was not run, the agent should say so plainly.
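One way to picture that report is as data. The Check shape and the unverified helper below are assumptions for illustration. Superpowers does not prescribe this format.

```typescript
// Hypothetical report shape; Superpowers does not prescribe this format.
type Status = "passed" | "failed" | "not run";

interface Check {
  label: string;
  command?: string; // omitted for manual checks
  status: Status;
}

// "Done" is only claimable when every check actually ran and passed.
function unverified(checks: Check[]): Check[] {
  return checks.filter((c) => c.status !== "passed");
}

const checks: Check[] = [
  { label: "Unit tests", command: "yarn test src/server/importUsers.test.ts", status: "passed" },
  { label: "Type check", status: "not run" },
  { label: "Manual upload of a CSV with failed rows", status: "passed" },
];

for (const c of unverified(checks)) {
  console.log(`${c.label}: ${c.status}`);
}
// prints: Type check: not run
```

The useful property is the third status. “Not run” is a legitimate answer, and it is different from “passed.”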

Stage Nine: Code Review

Skills used: requesting-code-review and receiving-code-review

requesting-code-review makes the agent review the work before handing it over. It checks whether the code matches the plan, avoids overbuilding, and handles risk.

Say:

Before handing this back, run requesting-code-review. Report issues by severity. Critical issues must be fixed first.

The review should cover:

Did the implementation stay inside the approved scope?
Are there untested branches?
Are errors swallowed?
Is CSV input treated as trusted when it should not be?
Did existing user-management behavior change?

If review feedback exists, receiving-code-review guides the agent through it:

Address the review feedback. For each item, explain what you changed or why you did not change it.

This stage is not about nitpicking. It stops confident half-finished work from reaching you.

Stage Ten: Finish the Branch

Skill used: finishing-a-development-branch

When the task is complete, finishing-a-development-branch handles the closeout.

It should:

Re-run key tests
Check the git diff
Summarize changes
Offer next steps: merge, create PR, keep branch, or discard worktree
Clean up temporary workspace if appropriate

Trigger it like this:

Use finishing-a-development-branch to close this out. Verify first, then give me merge or PR options.

A good final report includes:

What was implemented
What was intentionally not implemented
Which verification commands ran
Remaining risks
Current branch and file status

This gives a long agent session a real ending. Otherwise, the session often stops at “the files changed.”

Useful Trigger Phrases

Superpowers should trigger automatically, but direct language helps pull the agent back into process.

Goal	Say this
Clarify requirements	Use brainstorming first. Do not write code yet
Write a plan	Use writing-plans and split work into 2-5 minute tasks
Isolate the branch	Use using-git-worktrees to create a separate workspace
Execute with checkpoints	Use executing-plans and stop after each batch
Let a larger task run	Use subagent-driven-development with two-stage review
Enforce TDD	Use test-driven-development and show the failing test first
Debug a failure	Use systematic-debugging. Gather evidence before guessing
Verify before done	Use verification-before-completion and list actual results
Review the code	Use requesting-code-review and report by severity
Handle review feedback	Use receiving-code-review and address each item
Finish the branch	Use finishing-a-development-branch and offer merge or PR options

Remember: you are not invoking prompts. You are putting the agent inside an engineering process.

A Concrete Example

Suppose users can upload an empty CSV and the app does not show an error.

Without process, an agent may jump into the upload handler, add an if statement, and say it is fixed.

With Superpowers, the path is slower and safer:

Reproduce the issue
Write a failing test: empty CSV returns “file is empty”
Confirm the test fails because validation is missing
Add the smallest validation change
Run the test and watch it pass
Check nearby cases: header-only files, blank rows, encoding errors
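The nearby-case check in the last step can be sketched as a small table of inputs and expectations. The isEmptyCsv helper is hypothetical, and the expectation for header-only files is an assumption. Whether a file with headers and no rows should be rejected is a product decision, not something the code can settle.

```typescript
// Hypothetical check under test.
function isEmptyCsv(text: string): boolean {
  return text.trim() === "";
}

// Nearby cases as a table: an input plus the behavior we expect.
const cases: { label: string; input: string; empty: boolean }[] = [
  { label: "zero-length file", input: "", empty: true },
  { label: "blank rows only", input: "\n\r\n   \n", empty: true },
  // Assumption: a header-only file is not "empty"; it has zero data rows.
  { label: "header-only file", input: "email,name,role\n", empty: false },
];

const failures = cases.filter((c) => isEmptyCsv(c.input) !== c.empty);
console.log(
  failures.length === 0
    ? "all cases pass"
    : `failing: ${failures.map((c) => c.label).join(", ")}`
);
// prints: all cases pass
```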

This is not theater. It replaces “I guessed the fix” with “this failing case now passes.”

Common Pitfalls

Thinking installation is enough.

Superpowers gives the agent better habits, but you still need to make decisions. It will ask questions. You need to answer. It will propose a plan. You need to approve it.

Treating the workflow as friction.

At first, brainstorming can feel chatty. That pause is often the point. Rework costs more than clarification.

Using it for the wrong task.

Temporary scripts, experiments, and tiny text edits may not need the full workflow. Say clearly when something is a throwaway prototype.

Forgetting project rules.

Superpowers gives general engineering discipline. It does not know your business rules unless they are in the project. Repository instructions such as AGENTS.md, CLAUDE.md, and STYLE_GUIDE.md still matter.

How It Differs From a Normal Agent Skill

A normal skill usually solves one vertical task, such as writing a review, generating an image, or formatting a document.

Superpowers is a workflow layer. It changes the order in which the agent works.

Type	What it solves	Typical result
Single Agent Skill	How to do one specific task	Better output
Superpowers	How the whole development session should run	More controlled work

If you already maintain your own skills, Superpowers can sit underneath them. Your domain skills define what to do. Superpowers defines how to move.

Closing

AI coding agents do not need more speed. They need brakes.

Superpowers gives them those brakes: clarify, plan, test, implement, verify, then finish. Once you use agents on real projects, that structure stops feeling heavy. It becomes the reason you can trust them with more work.

GitHub: https://github.com/obra/superpowers

Codex installation guide: https://raw.githubusercontent.com/obra/superpowers/main/.codex/INSTALL.md