Code from Spec
Code from Spec

Code from Spec

A software engineering methodology for the age of AI.

Specifications are the source of truth; code materializes it.

You write specifications. An agent generates the code from them. When a specification changes, the tooling detects which code is now stale, and that code is regenerated. The specification is the thing you author, review, and version; the code is its output.

None of this is asked of anyone's good behavior — the agent's or the team's. Staleness is detected by hash. Generation is confined by the tooling. A hand edit to a generated file is flagged the moment it appears. The sections below describe how that enforcement works.

The specification tree

Specifications are organized as a tree of nodes. Each node is a directory with a _node.md file, and its position in the filesystem is its position in the hierarchy. High-level intent sits at the root; implementation detail sits at the leaves. Only leaf nodes generate artifacts.

A leaf node, in its entirety, can be this small:

---
output: internal/fees/calculation.go
---
# SPEC/payments/fees/calculation

# Public
## Interface
`func CalculateFee(amount int64, isPremium bool) (int64, error)`

# Agent
Fees are computed in integer cents — no floating point.
The standard fee is 2% of the amount, rounded down;
premium accounts pay half.
Reject non-positive amounts with a validation error.

A node inherits the public content of every ancestor, automatically. A constraint written once at an ancestor node — an error-handling rule, a convention, a type — governs every leaf beneath it, including leaves added months later. What a node needs from elsewhere in the tree, it declares. The context for any generation is the path from the root to the node plus what the node declares — not searched for, not guessed. The structure decides what the agent sees.

Specifications are structured natural language — readable and reviewable by anyone who knows the domain, not only by programmers. A compliance officer can read a regulatory constraint in the spec and say "this is wrong" before any code is generated.

Specs live in git, reviewed in pull requests, diffed, and blamed, with the same machinery a team already uses for code.

Confinement

The agent that generates a file sees only that node's chain — the inherited constraints, the declared dependencies, the node's own specification — and writes one declared output file. It does not browse the repository, read neighboring code, or fetch anything else. If the chain is not enough to generate the artifact, the correct output is not code. It is a report of exactly what is missing.

Confinement is enforced by the tooling, not requested of the agent. An unconfined agent facing an ambiguous spec compensates: it reads other files, infers patterns, and builds output that satisfies a model it assembled from context nobody chose. The result looks diligent but is subtly wrong in ways that are hard to trace. A confined agent has two options instead — generate from what it has, or report the gap. The third option, inventing context that looks like research, is the one confinement removes.

Staleness, by hash

Each generated artifact records the hash of the chain that produced it. Change a specification, and the hash of every artifact that inherited or depended on it no longer matches. Those artifacts are stale, and stale is the signal to regenerate. The bookkeeping lives in a single manifest; generated files carry no framework metadata and no comments, so a regeneration that produces identical code produces no diff at all.

Divergence is caught in both directions: a spec that moved ahead of its artifact marks the artifact stale, and a generated file edited by hand is flagged as modified. Either way, the pair fell out of agreement, and the tooling says so.

The hash answers one question: was this artifact produced from this specification. Whether the code is correct is a separate question, for a separate layer.

Verification

An agent can hallucinate, omit a required step, misread an unambiguous spec, or preserve old behavior the spec has changed — and the code will compile and the manifest will read clean. Tests catch these. Tests are specifications too: authored, reviewed, and versioned alongside the rest, expressed as behavior — given this scenario, this expected result. The hash checks where the code came from; the tests check what it does.

A green suite is evidence only if the tests are an independent opinion — a generated implementation and a generated test could otherwise share the same misreading and agree on it. The structure prevents that: a test spec is its own account of expected behavior, and the agent that generates a test sees the component's public contract, never its implementation — the format keeps every node's internals invisible to the rest of the tree. Implementation and test become two independent readings of the same intent. The methodology assumes an imperfect agent, because that is the agent that exists.

Where we are

Code from Spec is on its fifth version, developed and refined through real projects. The spec tree format — how specifications are structured, how context is assembled, how staleness is detected — is approaching a definitive form. The newest piece is the regeneration mechanism: telling the agent exactly which parts of a spec changed, so it updates the code without re-deciding everything that did not. It is built, and early results across several projects are promising. The articles report this work as it happens, including the hypotheses that failed.

Open problems remain. Natural language is ambiguous; the working answer is convergence, with test specs anchoring meaning mechanically — whether it converges fast enough for complex domains is unproven. Generation is non-deterministic; the guarantees are about behavior, verified by tests, not about the generated code itself. Authorship by someone with no software background is the north star, not a current capability — the rationale describes that destination, and the precedent it stands on. The tooling targets Claude Code; the spec format is client-agnostic, the orchestration is not.

Explore the framework on GitHub