From Red-Green-Refactor to Living Documentation: The Shift to Spec-Driven Development

TDD gave us confidence. BDD gave us collaboration. Spec-driven development gives us truth — a single source of verified, executable understanding that product, engineering, and QA all inhabit together.

The industry's relationship with software testing has always been one of earnest striving and inconvenient compromise. We adopted Test-Driven Development and found better-designed code but still suffered from tests that nobody outside engineering could read. We adopted Behaviour-Driven Development and found better conversations but still suffered from scenarios that drifted from the production system. Spec-driven development is the next natural step — not a revolution but a maturation, an acknowledgement that the specification is the test and the documentation simultaneously.

This is the story of that evolution, why the transition makes sense, and how engineering organisations — particularly those running modern, multi-team product models — can make the move without breaking what already works.

A brief genealogy of test-first thinking

TDD (Red · Green · Refactor): Write a failing unit test, make it pass with minimal code, then clean it up. Discipline for engineers, confidence through coverage.

BDD (Given · When · Then): Express behaviour in plain English. Bring product owners into the test conversation. Scenarios become the contract between teams.

Spec-driven (Contract · Verify · Publish): The specification is machine-readable, version-controlled, and executed. Drift between docs and reality fails the build instead of going unnoticed.

TDD, first popularised by Kent Beck as part of Extreme Programming, solved an engineering problem: code without tests degrades quickly because refactoring is risky. Writing the test first forces a design conversation before implementation — you cannot write a test for an interface that doesn't make sense.

BDD, championed by Dan North in the mid-2000s, recognised that TDD was leaving non-technical stakeholders behind. The Given-When-Then vocabulary of Gherkin gave product managers and business analysts a way to express acceptance criteria that a machine could also interpret. Tools like Cucumber, SpecFlow, and Behave turned human-readable scenarios into executable specifications.

But BDD carried a hidden cost: the Gherkin files were usually written by QA, maintained inconsistently, and rarely reflected exactly what the running software did. They became stale documentation — accurate on day one, aspirational on day ninety.

What spec-driven development actually means

Spec-driven development (SDD) treats a formal specification — usually an OpenAPI document, AsyncAPI schema, JSON Schema, or structured Gherkin with traceability — as the authoritative, machine-verifiable definition of a system's expected behaviour. Everything else is derived from it: generated client SDKs, mock servers, contract tests, published API documentation, and CI pipeline gates.

The core principle: If your specification and your implementation can disagree without breaking the build, you do not have a specification; you have a wish list. Spec-driven development closes that gap by making the spec executable and failing the build whenever spec and implementation diverge.
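To make the principle concrete, here is a minimal sketch of a conformance check in Python. The order schema is hypothetical, and the checker hand-rolls a small subset of JSON Schema (type, enum, required, properties) purely for illustration; a real pipeline would more likely lean on an established validator.

```python
from __future__ import annotations

from typing import Any

# Hypothetical spec fragment: the response schema for a "get order" operation.
ORDER_SCHEMA = {
    "type": "object",
    "required": ["id", "status", "total"],
    "properties": {
        "id": {"type": "string"},
        "status": {"type": "string", "enum": ["pending", "shipped", "cancelled"]},
        "total": {"type": "number"},
    },
}

# Maps JSON Schema type names to Python types (a deliberately small subset).
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def violations(schema: dict, value: Any, path: str = "$") -> list[str]:
    """Return human-readable mismatches between a value and a schema."""
    problems: list[str] = []
    expected = schema.get("type")
    if expected is not None:
        wrong_type = not isinstance(value, _PY_TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("number", "integer")
        if wrong_type or bool_as_number:
            return [f"{path}: expected {expected}, got {type(value).__name__}"]
    if "enum" in schema and value not in schema["enum"]:
        problems.append(f"{path}: {value!r} not in {schema['enum']}")
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                problems.append(f"{path}.{name}: required property missing")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                problems.extend(violations(sub_schema, value[name], f"{path}.{name}"))
    return problems


def gate(schema: dict, response_body: Any) -> int:
    """Print every violation and return a process exit code (0 = conformant)."""
    found = violations(schema, response_body)
    for line in found:
        print(line)
    return 1 if found else 0
```

Run in CI, a non-zero result from gate would fail the build, which is the point at which a wish list starts behaving like a specification.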

This is not a new idea. The OpenAPI Initiative has existed since 2015. What is new is the tooling maturity — Prism, Stoplight, Pact, Schemathesis, and a generation of AI-assisted contract generators — that makes enforcing the spec at every layer of the stack tractable for ordinary product teams, not just API-first infrastructure companies.

The three pain points that drive the transition

1. Specification drift

In most BDD setups, the Gherkin file in the repository, the OpenAPI definition in Confluence, and the Postman collection used by QA are three separate artefacts that diverge independently. Spec-driven development replaces that triangle with a single source — the spec — from which all others are generated or validated.
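As a rough illustration of the single-source idea, the sketch below derives two artefacts, a plain-text reference page and a request collection, from one spec held as a Python dictionary. The Orders API, the function names, and the base URL are all invented for the example; real generators work from the full OpenAPI document and produce far richer output.

```python
from __future__ import annotations

HTTP_METHODS = ("get", "put", "post", "delete", "patch", "head", "options")

# Hypothetical spec, trimmed to the fields this sketch reads.
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Orders API", "version": "1.2.0"},
    "paths": {
        "/orders": {
            "get": {"operationId": "listOrders", "summary": "List orders"},
            "post": {"operationId": "createOrder", "summary": "Create an order"},
        },
        "/orders/{orderId}": {
            "get": {"operationId": "getOrder", "summary": "Fetch one order"},
        },
    },
}


def operations(spec: dict):
    """Yield (METHOD, path, operation) for every operation in the spec."""
    for path, item in spec.get("paths", {}).items():
        for method in HTTP_METHODS:
            if method in item:
                yield method.upper(), path, item[method]


def render_docs(spec: dict) -> str:
    """Derived artefact 1: a plain-text reference page."""
    info = spec["info"]
    lines = [f"{info['title']} v{info['version']}"]
    for method, path, op in operations(spec):
        lines.append(f"{method} {path}: {op.get('summary', '')}")
    return "\n".join(lines)


def request_collection(spec: dict, base_url: str) -> list[dict]:
    """Derived artefact 2: a list of requests a QA tool could replay."""
    return [
        {"name": op["operationId"], "method": method, "url": base_url + path}
        for method, path, op in operations(spec)
    ]
```

Because both outputs are computed from SPEC, a renamed path or a new operation shows up in the docs and the collection at the same moment, with no separate artefact to forget.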

2. Cross-team contract failures

In organisations with many product teams — even those following the Spotify model — consumer-provider API failures discovered in integration testing are expensive. Consumer-Driven Contract (CDC) testing with tools like Pact moves the contract into a shared broker, so provider teams know immediately when a change would break a consumer, long before an integration environment is involved. This is spec-driven thinking applied to team boundaries.
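The mechanics of consumer-driven contracts can be sketched without any particular tool. The toy broker below is not Pact's API; the class names, the consumers, and the two provider versions are hypothetical, and the provider is modelled as a plain function so the example stays self-contained.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Tuple

# A provider is modelled as a function: (method, path) -> (status, JSON body).
Handler = Callable[[str, str], Tuple[int, dict]]


@dataclass(frozen=True)
class Expectation:
    consumer: str
    method: str
    path: str
    status: int
    fields: frozenset  # response fields this consumer actually reads


class Broker:
    """Toy stand-in for a contract broker: stores expectations, verifies providers."""

    def __init__(self) -> None:
        self._expectations: list[Expectation] = []

    def publish(self, expectation: Expectation) -> None:
        self._expectations.append(expectation)

    def verify(self, handler: Handler) -> list[str]:
        """Replay every published expectation against a provider build."""
        failures = []
        for exp in self._expectations:
            status, body = handler(exp.method, exp.path)
            label = f"{exp.consumer}: {exp.method} {exp.path}"
            missing = sorted(exp.fields - set(body))
            if status != exp.status:
                failures.append(f"{label} returned {status}, expected {exp.status}")
            elif missing:
                failures.append(f"{label} missing fields {missing}")
        return failures


def provider_v1(method: str, path: str) -> Tuple[int, dict]:
    return 200, {"id": "o-1", "status": "shipped", "total": 12.5}


def provider_v2(method: str, path: str) -> Tuple[int, dict]:
    # Candidate change: renames "total" to "amount".
    return 200, {"id": "o-1", "status": "shipped", "amount": 12.5}
```

If a billing consumer publishes an expectation that reads id and total, verifying provider_v2 reports the missing total field in the provider's own pipeline, while a shipping consumer that reads only id and status is unaffected by the rename.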

3. The documentation theatre problem

Engineers often write OpenAPI YAML to satisfy a documentation requirement, not because they trust it as a source of truth. Automated schema validation in CI, contract test failures that block deploys, and mock servers generated from the same spec together create the feedback loop that makes documentation honest again.

How the transition works in practice

The most important principle for any QA leader overseeing this shift: spec-driven development is not a replacement for TDD or BDD. It is a layer added above them. Unit tests remain. BDD scenarios remain — and in fact become more valuable when they are linked to a formal spec by traceability identifiers. What changes is the authority structure: the spec becomes the constitution, and everything else is derived from or validated against it.

Step 01: Spec-first for new endpoints. Require OpenAPI definitions to be committed and reviewed before any implementation begins. Use Stoplight Studio or a similar editor to make this accessible to non-engineers.

Step 02: Generate mocks immediately. Prism or similar tools serve mock responses from the spec on day zero. Front-end and mobile teams can develop against the contract without waiting for the backend.
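The idea behind spec-backed mocking is simple enough to sketch: look up the operation in the spec and return the example it declares. This is not how Prism is implemented or invoked; the spec fragment and the mock_response function are hypothetical, and the lookup matches literal paths only, with no path templating.

```python
from __future__ import annotations

from typing import Any, Tuple

# Hypothetical spec fragment with an example attached to the 200 response.
SPEC = {
    "paths": {
        "/orders/o-1": {
            "get": {
                "operationId": "getOrder",
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": {"id": "o-1", "status": "shipped", "total": 12.5}
                            }
                        }
                    }
                },
            }
        }
    }
}


def mock_response(spec: dict, method: str, path: str, status: str = "200") -> Tuple[int, Any]:
    """Return (status_code, body) using the example declared in the spec."""
    operation = spec.get("paths", {}).get(path, {}).get(method.lower())
    if operation is None:
        # The contract does not define this operation at all.
        return 404, {"error": f"{method.upper()} {path} is not in the spec"}
    response = operation.get("responses", {}).get(status)
    if response is None:
        # The operation exists, but the spec declares no such response.
        return 501, {"error": f"no {status} response declared for {method.upper()} {path}"}
    media = response.get("content", {}).get("application/json", {})
    return int(status), media.get("example")
```

A front-end team calling mock_response(SPEC, "GET", "/orders/o-1") gets the declared example back before any backend code exists, and a request for an undeclared operation fails loudly instead of returning something invented.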

Step 03: Gate CI on schema validation. Schemathesis or Dredd runs against every pull request. If the implementation deviates from the spec, the build fails there and then, rather than the deviation surfacing in the next QA cycle.
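Tools such as Schemathesis exercise the running service against the schema. A much simpler gate, sketched below under invented names, catches drift one level up: operations that exist in the implementation but not in the spec, and the reverse. It assumes the list of implemented routes can be obtained from the web framework, which is not shown here.

```python
from __future__ import annotations

HTTP_METHODS = ("get", "put", "post", "delete", "patch", "head", "options")

# Hypothetical spec fragment: only the paths and methods matter for this check.
SPEC = {
    "paths": {
        "/orders": {"get": {}, "post": {}},
    }
}


def drift(spec: dict, implemented_routes: list) -> dict:
    """Compare (METHOD, path) pairs in the spec with those the service exposes."""
    documented = {
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method in HTTP_METHODS
    }
    implemented = set(implemented_routes)
    return {
        "undocumented": sorted(implemented - documented),
        "unimplemented": sorted(documented - implemented),
    }


def exit_code(report: dict) -> int:
    """Non-zero when either list is non-empty, so a CI step can fail on it."""
    return 1 if report["undocumented"] or report["unimplemented"] else 0
```

A check like this is cheap enough to run on every pull request, and it complements rather than replaces response-level validation.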

Step 04: Introduce CDC at team boundaries. Use Pact or Specmatic between provider and consumer teams. Contract failures surface in the provider's pipeline, not during integration testing.

Step 05: Link BDD scenarios to spec operations. Annotate Gherkin scenarios with tags such as @spec:operationId. A reporter that understands the convention can then show which spec operations have scenario coverage, and which do not.
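A coverage report of this kind needs little more than a tag scan. The sketch below assumes the tag convention is @spec: followed by the operationId; the feature text, the operation names, and the report shape are hypothetical.

```python
from __future__ import annotations

import re

# Assumed convention: a Gherkin tag of the form @spec:<operationId>.
TAG = re.compile(r"@spec:([A-Za-z0-9_.-]+)")

FEATURE = """\
Feature: Orders

  @spec:listOrders
  Scenario: Customer lists their orders
    Given a customer with two orders
    When they request their order list
    Then both orders are returned

  @spec:getOrder @spec:archiveOrder
  Scenario: Customer opens an archived order
    Given a customer with an archived order
    When they open that order
    Then the order details are shown
"""


def tagged_operations(feature_text: str) -> set:
    """Collect every operationId referenced by a @spec: tag."""
    return set(TAG.findall(feature_text))


def coverage_report(operation_ids: set, feature_texts: list) -> dict:
    """Split operations into covered and uncovered, and flag stale tags."""
    tagged = set()
    for text in feature_texts:
        tagged |= tagged_operations(text)
    return {
        "covered": sorted(operation_ids & tagged),
        "uncovered": sorted(operation_ids - tagged),
        # Tags that point at operations the spec no longer defines.
        "unknown_tags": sorted(tagged - operation_ids),
    }
```

The unknown_tags list is worth surfacing as well: a scenario tagged with an operation that has left the spec is exactly the kind of quiet drift this step is meant to expose.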

Step 06: Publish living documentation. Auto-generate and publish API docs from the same spec. When a spec changes and the pipeline passes, the docs update without human intervention.

The AI acceleration angle

Spec-driven development has an asymmetric advantage in the current AI tooling landscape. A well-structured OpenAPI or AsyncAPI specification is exactly the kind of machine-readable, semantically rich artefact that large language models can reason about reliably. This opens workflows that were previously impractical:

  • An AI agent can read an OpenAPI spec and generate a complete Playwright API test suite covering all defined operations, status codes, and schema shapes — including edge cases drawn from the examples fields
  • The same spec powers a Gherkin scenario generator that proposes BDD coverage gaps
  • A healing agent can watch contract test failures and propose spec amendments when an implementation legitimately diverges from the original design
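Part of what makes a spec tractable for any generator, AI-assisted or not, is that the cases to cover can be enumerated mechanically. The sketch below lists one case per operation and declared status code; the spec fragment and function name are hypothetical, and producing the request data that triggers each status is the part left to the generator.

```python
from __future__ import annotations

HTTP_METHODS = ("get", "put", "post", "delete", "patch", "head", "options")

# Hypothetical spec fragment; response bodies are omitted for brevity.
SPEC = {
    "paths": {
        "/orders": {
            "parameters": [],  # non-operation keys must be skipped
            "get": {
                "operationId": "listOrders",
                "responses": {"200": {}, "401": {}},
            },
            "post": {
                "operationId": "createOrder",
                "responses": {"201": {}, "400": {}, "401": {}},
            },
        }
    }
}


def enumerate_cases(spec: dict) -> list:
    """Return (operationId, METHOD, path, status) for every declared response."""
    cases = []
    for path, item in spec.get("paths", {}).items():
        for method in HTTP_METHODS:
            operation = item.get(method)
            if operation is None:
                continue
            for status in operation.get("responses", {}):
                cases.append((operation["operationId"], method.upper(), path, status))
    return cases
```

Each tuple is a slot that a generated test should fill, which also gives reviewers a simple way to see what a generated suite has left out.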

None of this is science fiction — it is the direction the tooling is moving, and teams with clean, enforced specs are positioned to benefit from it immediately.

For AI augmentation to work: The spec must be honest. An OpenAPI file that describes aspirational behaviour rather than actual behaviour produces AI-generated tests that fail for the wrong reasons. Spec integrity — enforced by CI — is the prerequisite for reliable AI-assisted quality engineering.

What to preserve from TDD and BDD

The temptation in any paradigm shift is to declare the previous approach obsolete. Resist it. TDD's red-green-refactor discipline produces better-designed internal APIs and domain logic, something no amount of contract testing can replicate. BDD's emphasis on shared vocabulary (the ubiquitous language that domain-driven design also champions) prevents the subtle miscommunications that produce correctly implemented wrong features. Both remain essential.

Spec-driven development is the outer layer: it governs the published surface of a system, the contracts between components, and the living documentation that keeps every stakeholder's mental model synchronised with reality. Inside that boundary, TDD and BDD continue to do exactly what they were designed to do.

The bottom line

Organisations that treat specifications as binding, executable, version-controlled artefacts — not as documentation written after the fact — will find that integration failures decrease, cross-team communication improves, and AI-assisted testing tooling delivers its theoretical benefits in practice.

The transition from TDD through BDD to spec-driven development is not a leap of faith. It is a series of deliberate, incremental choices about what counts as truth in a software system.

Start with one service. Write the spec first. Gate the build on it. The rest follows.