Document an inherited codebase with an agent

Inherited a codebase with no docs? An agent can read it and draft the documentation — if you direct it well.

Stuart Leo | June 9, 2026 | 5 min read

Inheriting an undocumented codebase used to be a slow misery: read it all, guess at the reasoning, write up what you could. An agent changes the economics. It can read the whole codebase and draft the documentation in an afternoon, so the opaque code finally comes with a written explanation. The catch is that "draft" is the operative word: your job is to direct and verify, not to trust blindly.

Here's how to turn an inherited codebase into a usable contextbase with an agent doing the heavy lifting.

The opaque inheritance

An undocumented codebase is opaque in a specific way: the what is recoverable from reading the code, but the why — the decisions, the constraints, the traps — is missing. Any documentation worth having has to capture both, and the why is the hard part, because it isn't written anywhere.

An agent can recover a lot of the what quickly. It can recover some of the why by inference — and it will confidently invent the rest, which is exactly why you stay in the loop. The guides on using agents in legacy code keep landing here: the agent accelerates the documentation enormously, but unverified, it documents its own guesses as fact.

Have the agent map before it documents

Don't ask for documentation cold. Ask for a map first.

Have the agent survey the codebase and report the structure — the main parts, how they connect, the entry points, the data flow. This does two things: it gives you a frame to organise the documentation around, and it surfaces the agent's understanding before it commits that understanding to docs, so you can correct a wrong mental model early rather than finding it baked into fifty files.

Map first, document second. Reverse the order and you get confident, well-formatted documentation of a misunderstanding.
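One way to sanity-check the agent's map is to compare it against a crude map built mechanically. The sketch below assumes, purely for illustration, that the inherited codebase is Python and that the source files have already been read into a dictionary. The name `import_map` and the sample module names are hypothetical, not part of any tool mentioned here. It only records which top-level modules each file imports, so it covers structure and says nothing about behaviour.

```python
import ast


def import_map(sources):
    """Return {file_name: sorted list of top-level modules it imports}.

    `sources` maps a file name to its source text (an assumed input shape).
    Relative imports are skipped, since they point inside the same package.
    """
    result = {}
    for name, code in sources.items():
        imported = set()
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    imported.add(alias.name.split(".")[0])
            elif isinstance(node, ast.ImportFrom):
                if node.module and node.level == 0:
                    imported.add(node.module.split(".")[0])
        result[name] = sorted(imported)
    return result


# Hypothetical sample input: two tiny files from an imagined codebase.
sample = {
    "billing": "import os\nfrom payments.gateway import charge\n",
    "payments": "import json, os.path\n",
}
print(import_map(sample))
```

If the agent's map says a module has no outside dependencies and a mechanical pass like this disagrees, that is a cheap signal to question the map before any documentation is written from it.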

Document behaviour, not just structure

The low-value output is a structural dump — "this folder contains these files." Anyone can get that from ls. The high-value output is behaviour and decisions: how login actually works end to end, why the payment flow is built this way, what the fragile parts are and what they're protecting against.

Structural docs tell you where things are. Behavioural docs tell you how the system works and why — that's the part worth an agent's effort.

So direct the agent toward behaviour: "explain how X works," "what does this module depend on and why," "what would break if I changed this." That's the documentation that makes the next person — and the next agent — actually competent in the code.
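The behaviour-first questions above can be kept as a small reusable template, so every area of the codebase gets asked the same things. This is a minimal sketch: `behaviour_prompts` and the area names are hypothetical, and the wording of the questions is simply adapted from the examples in this section.

```python
# Question templates adapted from the examples in the text above.
BEHAVIOUR_QUESTIONS = (
    "Explain how {area} works end to end.",
    "What does {area} depend on, and why?",
    "What would break if I changed {area}?",
)


def behaviour_prompts(area):
    """Return the behaviour-focused questions for one area of the codebase."""
    return [question.format(area=area) for question in BEHAVIOUR_QUESTIONS]


# Work area by area rather than all at once (area names are placeholders).
for area in ("login", "the payment flow"):
    for prompt in behaviour_prompts(area):
        print(prompt)
```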

Verify against tests and reality

This is the step you cannot skip. The agent's documentation is a fast first draft of understanding, and it will be wrong in places — especially on intent. So verify:

Check claims against the tests — they encode real expected behaviour.
Check against reality — run the thing, trace the flow, confirm the surprising claims.
Correct what the agent misread, and note where even you aren't sure.

The verification is what converts "the agent's plausible guess" into "documentation you can trust." It's also far faster than writing the docs yourself — you're editing a draft, not facing a blank page.
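The verification pass can be tracked claim by claim, so that nothing unchecked slips into the committed docs. The sketch below is one possible bookkeeping structure, not a prescribed tool: `Claim`, `Status`, and `ready_to_commit` are hypothetical names, and the sample claims are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNVERIFIED = "unverified"  # the agent said it, nobody has checked it
    CONFIRMED = "confirmed"    # matches the tests or observed behaviour
    CORRECTED = "corrected"    # the agent misread it; text is now fixed
    UNSURE = "unsure"          # checked, but even the reviewer isn't certain


@dataclass
class Claim:
    text: str
    status: Status = Status.UNVERIFIED
    evidence: str = ""  # e.g. the test name or trace that backs the claim


def ready_to_commit(claims):
    """Keep checked claims, flag unsure ones, and drop unverified ones."""
    kept = []
    for claim in claims:
        if claim.status in (Status.CONFIRMED, Status.CORRECTED):
            kept.append(claim.text)
        elif claim.status is Status.UNSURE:
            kept.append(f"UNSURE: {claim.text}")
    return kept


# Invented sample claims, purely illustrative.
claims = [
    Claim("Sessions expire after a fixed idle period.", Status.CONFIRMED, "session tests"),
    Claim("The retry wrapper exists to handle gateway timeouts.", Status.UNSURE),
    Claim("The cache is never invalidated."),
]
print(ready_to_commit(claims))
```

Dropping unverified claims by default is a deliberate choice in this sketch: a claim earns its place in the docs only after someone has checked it.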

From docs to a living contextbase

One-time documentation starts to rot the day it's written. What makes it last is committing the verified docs into a contextbase the agent reads before it works on that code, and adding to it as you go, the same way you'd add a contextbase to any legacy project. The documentation then stops being a static artifact that drifts out of date. It becomes the agent's working memory of the codebase, kept current by use.
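As a rough illustration of "reads before it works on that code", the sketch below assembles the relevant notes for one area into a single block of text that could be handed to the agent first. The layout it assumes (an `overview.md` plus one file per area under `areas/`) and the name `load_context` are hypothetical; a real contextbase may be organised quite differently.

```python
from pathlib import Path


def load_context(root, area):
    """Join the overview and the notes for one area into a single string.

    Assumed layout (hypothetical): <root>/overview.md and <root>/areas/<area>.md.
    Missing files are skipped, so a new area simply yields the overview alone.
    """
    root = Path(root)
    parts = []
    for path in (root / "overview.md", root / "areas" / f"{area}.md"):
        if path.is_file():
            parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)
```

Because the notes live in plain files next to the code, each verified correction is an ordinary commit, which is what keeps the contextbase current through use.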

An agent can draft the docs a codebase never had — your job is to verify them, not write them from scratch.

Start here: see how to add a contextbase to a legacy project, onboard to any codebase with an agent, or read the method.

FAQ

Can an AI agent write documentation for my codebase?

Yes. It can read the code and draft documentation far faster than you could write it by hand. The catch is that it documents what the code does, not always why, and it can misread intent. So an agent drafts and you verify, rather than trusting the docs blindly. That review is the part that makes them trustworthy.

How do I get an agent to document a legacy codebase well?

Have it map the codebase before documenting, work area by area rather than all at once, and ask it to capture behaviour and decisions, not just structure. Verify each draft against the tests and reality, correct what it misread, and commit the result into a contextbase the agent reads going forward.

What's the difference between code comments and documenting with an agent?

Comments explain a line; a contextbase explains the system: how parts connect, why decisions were made, what traps exist. An agent can generate both, but the high-value output is the system-level understanding written somewhere the next agent and the next person read first.
