
# Reverse-Engineering Spec Extraction

Existing codebases contain implicit specifications embedded in their implementations. When AI agents work on legacy systems, they lack structured context about what the system is, leading to repeated code exploration, wasted context window, and inconsistent understanding across sessions. Spec Extraction is a methodology for reverse-engineering specifications from code to create "existing-fact" context.

**Key Insight**: Specifications extracted from implementations serve as compressed, authoritative context. A well-written spec is 5-20x smaller than the code it describes, dramatically reducing context window usage while preserving the essential knowledge AI agents need.

## Problem Statement

### Current State: Ad-hoc Code Exploration

### Pain Points from Missing Specs

| Issue | Impact on AI Development |
|-------|--------------------------|
| No formal spec | AI re-explores code every session |
| Tribal knowledge | Implementation decisions locked in developers' heads |
| Context window waste | Must load entire files to understand behavior |
| Inconsistent understanding | Different AI sessions interpret code differently |
| Onboarding friction | New AI agents (and humans) start from zero |

### Real-World Examples

**Repeated Discovery:**

An AI agent needs to modify the authentication flow. Each session it reads 15 files to understand the flow; the same ~2000 lines are loaded repeatedly because no spec documents the authentication architecture.

**Lost Institutional Knowledge:**

The original developer leaves and the codebase has no documentation. New team members and AI agents must reverse-engineer intent from the implementation, often guessing incorrectly.

**Context Overflow:**

A feature touches 8 modules, and loading all of the code exceeds the context window. Without specs, the AI cannot get a high-level view and makes changes that break undocumented invariants.

## Proposal: Existing-Fact Specifications

### Target State

### What is an Existing-Fact Spec?

An existing-fact specification documents verified, implemented behavior extracted from existing code. Unlike forward-looking requirements that describe what *should be* built, existing-fact specs describe what *is* built.

### Key Distinction: Existing-Fact vs Requirement

| Aspect | Requirement Spec | Existing-Fact Spec |
|--------|------------------|--------------------|
| Source | Business needs, user stories | Implemented code |
| Purpose | Define what to build | Document what exists |
| Authority | Normative (code should match) | Descriptive (code does match) |
| Creation | Before implementation | After implementation |
| Validation | Tests verify the implementation | The implementation is the truth |
| Use case | Greenfield development | Legacy system understanding |

## Existing-Fact Spec Format

### Frontmatter Schema

```yaml
id: ef-auth-oauth-flow
title: OAuth 2.0 Authentication Flow
type: existing-fact
status: verified
version: 1.0.0
created: 2025-12-02
updated: 2025-12-02
extracted_from:
  - src/auth/oauth.ts
  - src/auth/token-manager.ts
  - src/middleware/auth.ts
extraction_method: ai-assisted
confidence: high
verified_by:
  - test-suite
  - human-review
compression_ratio: 12:1
authors:
  - extraction-agent
reviewers:
  - senior-developer@company.com
tags:
  - authentication
  - oauth
  - security
ai_summary: |
  OAuth 2.0 PKCE flow implementation for web clients.
  Handles authorization, token exchange, refresh, and logout.
  Integrates with identity provider via standard endpoints.
```
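
As a sketch of how tooling could consume this schema, the following TypeScript loader parses and checks the required fields. It is illustrative only: it assumes a Node.js toolchain with the `js-yaml` package and specs that wrap their frontmatter in `---` fences.

```typescript
// Illustrative frontmatter loader; field names mirror the schema above,
// everything else (paths, error wording) is an assumption.
import { readFileSync } from "node:fs";
import { load } from "js-yaml";

interface ExistingFactFrontmatter {
  id: string;
  title: string;
  type: "existing-fact";
  confidence: "high" | "medium" | "low";
  extracted_from: string[];
}

function parseFrontmatter(specPath: string): ExistingFactFrontmatter {
  const text = readFileSync(specPath, "utf8");
  // Frontmatter is assumed to sit between leading `---` fences.
  const match = text.match(/^---\n([\s\S]*?)\n---/);
  if (!match) throw new Error(`${specPath}: no frontmatter block found`);
  const data = load(match[1]) as Partial<ExistingFactFrontmatter>;
  for (const field of ["id", "title", "type", "confidence", "extracted_from"] as const) {
    if (data[field] === undefined) {
      throw new Error(`${specPath}: missing required field "${field}"`);
    }
  }
  return data as ExistingFactFrontmatter;
}
```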

### Content Structure

```markdown
# [Feature Name] - Existing-Fact Specification

## Overview
[1-2 paragraph summary of what this code does]

## System Boundaries
### Entry Points
- [API endpoints, function signatures]

### External Dependencies
- [Services, APIs, databases this code interacts with]

### Data Flow
[Mermaid diagram showing high-level flow]

## Behavioral Specification

### Core Behaviors
#### Behavior: [Name]
- **Trigger**: [What initiates this behavior]
- **Preconditions**: [Required state]
- **Process**: [What happens]
- **Postconditions**: [Resulting state]
- **Extracted from**: `file.ts:line`

### Error Handling
#### Error: [Name]
- **Condition**: [When this error occurs]
- **Response**: [How system responds]
- **Recovery**: [How to recover, if applicable]

## Constraints & Invariants
- [Rules that must always hold]
- [Limits and thresholds]

## Known Technical Debt
- [Acknowledged issues not to replicate]

## Verification Status
| Aspect | Status | Evidence |
|--------|--------|----------|
| Core flow | Verified | Unit tests, integration tests |
| Error handling | Partial | Some edge cases untested |
| Performance | Unverified | No benchmarks available |
```
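
A completeness check can enforce this structure mechanically. The sketch below is hypothetical; the required heading names are taken from the template above:

```typescript
// Report which required sections a spec file is missing.
import { readFileSync } from "node:fs";

const REQUIRED_SECTIONS = [
  "Overview",
  "System Boundaries",
  "Behavioral Specification",
  "Constraints & Invariants",
  "Verification Status",
];

function missingSections(specPath: string): string[] {
  const text = readFileSync(specPath, "utf8");
  // A section counts as present if a `##` heading starts with its name.
  return REQUIRED_SECTIONS.filter(
    (name) => !new RegExp(`^## ${name}`, "m").test(text)
  );
}
```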

## Layered Extraction Methodology

### Four-Layer Approach

Extract specifications in four layers of increasing granularity, adding detail only where it is needed.

### Layer Details

| Layer | What to Extract | Compression Target | When Needed |
|-------|-----------------|--------------------|-------------|
| L1: Boundaries | API contracts, interface definitions, external integration points | 10:1 | Always |
| L2: Structure | Component responsibilities, inter-module dependencies, data flow | 20:1 | Most features |
| L3: Behaviors | Algorithmic patterns, state machines, error-handling strategies | 5:1 | Complex logic |
| L4: Edge Cases | Validation rules, constraints, limits, known quirks | 3:1 | Critical paths |
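
One way to make the "When Needed" column operational is to derive a module's layer list from a few traits. A minimal sketch; the trait names and mapping are illustrative assumptions:

```typescript
// Deriving the extraction plan for a module from a few traits.
// The layer list mirrors the table above (L1 is always extracted).
type Layer = "L1-boundaries" | "L2-structure" | "L3-behaviors" | "L4-edge-cases";

interface ModuleTraits {
  spansMultipleModules: boolean; // "Most features" -> L2
  complexLogic: boolean;         // "Complex logic" -> L3
  criticalPath: boolean;         // "Critical paths" -> L4
}

function layersToExtract(t: ModuleTraits): Layer[] {
  const layers: Layer[] = ["L1-boundaries"];
  if (t.spansMultipleModules) layers.push("L2-structure");
  if (t.complexLogic) layers.push("L3-behaviors");
  if (t.criticalPath) layers.push("L4-edge-cases");
  return layers;
}
```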

### Incremental Value

Layer 1 alone provides significant value; each additional layer adds precision where it is needed.

## Extraction Workflow

### Three-Phase Process

#### Phase 1: Discovery

**Objective**: Understand the codebase structure and identify extraction targets.

**AI Tasks:**

1. Analyze the directory structure and file organization
2. Identify entry points (APIs, CLI commands, event handlers)
3. Map module dependencies
4. Detect architectural patterns (MVC, Clean Architecture, etc.)
5. Generate a discovery report for human review

**Output:**

```markdown
## Discovery Report: [Codebase Name]

### Architecture Overview
[High-level description]

### Key Modules
| Module | Purpose | Dependencies | Priority |
|--------|---------|--------------|----------|
| auth | Authentication | db, identity-provider | High |
| api | REST endpoints | auth, services | High |
| ... | ... | ... | ... |

### Recommended Extraction Order
1. [Module] - [Reason]
2. [Module] - [Reason]

### Complexity Assessment
- Estimated extraction effort: [Hours/Days]
- High-complexity areas: [List]
```
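
Parts of discovery are mechanical and can be scripted. A minimal sketch, assuming a Node.js project with modules under `src/`, that emits file and line counts for the Key Modules table:

```typescript
// Walk each top-level module directory and report its size, as raw
// input for the discovery report. Illustrative only; real discovery
// would also map imports and entry points.
import { readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

function countLines(dir: string): { files: number; lines: number } {
  let files = 0;
  let lines = 0;
  for (const entry of readdirSync(dir)) {
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) {
      const sub = countLines(full);
      files += sub.files;
      lines += sub.lines;
    } else if (/\.(ts|js)$/.test(entry)) {
      files += 1;
      lines += readFileSync(full, "utf8").split("\n").length;
    }
  }
  return { files, lines };
}

// Emit one markdown table row per module under src/.
for (const moduleName of readdirSync("src")) {
  const path = join("src", moduleName);
  if (!statSync(path).isDirectory()) continue;
  const { files, lines } = countLines(path);
  console.log(`| ${moduleName} | ${files} files | ${lines} lines |`);
}
```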

#### Phase 2: Extraction

**Objective**: Generate draft specifications from code.

**AI Tasks:**

1. Read the module code thoroughly
2. Generate a spec following the layer methodology
3. Cross-reference with tests to verify behavior
4. Document uncertainties and assumptions

**Human Tasks:**

1. Review generated specs for accuracy
2. Correct misunderstandings
3. Add context the AI cannot infer (business rationale, historical decisions)
4. Approve or request refinement

**Iteration Pattern**: Extraction and review alternate until the spec is approved; each pass corrects misunderstandings and raises confidence.

#### Phase 3: Validation

**Objective**: Establish spec trustworthiness for AI context consumption.

**Validation Sources:**

| Source | Confidence Boost | Evidence |
|--------|------------------|----------|
| Automated tests pass | +20% | Test coverage report |
| Human review complete | +30% | Reviewer sign-off |
| Production behavior matches | +30% | Monitoring/logs comparison |
| Original developer confirms | +20% | Developer approval |

**Confidence Levels:**

| Level | Criteria | AI Usage Guidance |
|-------|----------|-------------------|
| High | Test-validated + human-reviewed | Use as authoritative context |
| Medium | Human-reviewed, partial test coverage | Use with caution; verify critical paths |
| Low | AI-generated draft, awaiting validation | Treat as a hypothesis; verify before use |
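
The two tables compose naturally into a scoring function. The sketch below is one way to operationalize them; the requirement that "high" include both tests and human review follows the criteria table, while the numeric threshold is an illustrative assumption:

```typescript
// Sum the evidence boosts, then map the total to a confidence level.
type Evidence = "tests" | "human-review" | "production-match" | "developer-confirms";

const BOOST: Record<Evidence, number> = {
  tests: 20,
  "human-review": 30,
  "production-match": 30,
  "developer-confirms": 20,
};

function confidenceLevel(evidence: Evidence[]): "high" | "medium" | "low" {
  const score = evidence.reduce((sum, e) => sum + BOOST[e], 0);
  // "High" requires test validation plus human review, per the criteria table.
  if (score >= 50 && evidence.includes("tests") && evidence.includes("human-review")) {
    return "high";
  }
  return evidence.includes("human-review") ? "medium" : "low";
}

console.log(confidenceLevel(["tests", "human-review"])); // "high"
console.log(confidenceLevel(["human-review"]));          // "medium"
```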

## Implementation Roadmap

### Phase 1: Pilot Extraction (Weeks 1-2)

**Goal**: Extract specs for 1-2 high-value modules.

**Deliverables:**

- [ ] Select pilot modules (high change frequency, well tested)
- [ ] Run the discovery phase
- [ ] Generate Layer 1 + Layer 2 specs
- [ ] Validate with module owners
- [ ] Measure the compression ratio

**Success Criteria:**

- Compression ratio >10:1
- Module owner confirms accuracy
- AI agent can use the spec instead of reading the code

### Phase 2: Tooling & Templates (Weeks 3-4)

**Goal**: Establish a repeatable extraction process.

**Deliverables:**

- [ ] Existing-fact spec template
- [ ] Extraction prompt library
- [ ] Validation checklist
- [ ] CI integration for spec freshness checks

### Phase 3: Systematic Extraction (Weeks 5-8)

**Goal**: Extract specs for all high-priority modules.

**Deliverables:**

- [ ] Prioritized module list
- [ ] Extraction schedule
- [ ] Progress tracking dashboard
- [ ] Spec-to-code freshness monitoring

### Phase 4: Continuous Maintenance (Ongoing)

**Goal**: Keep specs synchronized with code.

**Deliverables:**

- [ ] Change detection that triggers re-extraction
- [ ] Spec diff on code changes
- [ ] Staleness alerts
- [ ] Periodic full-refresh schedule
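
For the staleness alerts in Phase 4, a simple check compares each spec's `updated` date against its `extracted_from` sources. A sketch assuming filesystem mtimes; a real setup might compare git commit timestamps instead:

```typescript
// Flag a spec as stale when any source file it was extracted from
// changed after the spec's `updated` date. Field names follow the
// frontmatter schema; everything else is illustrative.
import { statSync } from "node:fs";

interface SpecMeta {
  id: string;
  updated: string;        // ISO date, e.g. "2025-12-02"
  extracted_from: string[];
}

function isStale(spec: SpecMeta): boolean {
  const specUpdated = new Date(spec.updated).getTime();
  return spec.extracted_from.some(
    (file) => statSync(file).mtimeMs > specUpdated
  );
}
```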

## CLAUDE.md Integration

Add to the project's CLAUDE.md:

```markdown
## Existing-Fact Specifications

### Purpose
Existing-fact specs document verified, implemented behavior.
Use these specs instead of reading source code when available.

### Location
- `docs/specs/existing-facts/` - Extracted specifications
- Each spec includes `extracted_from` references to source files

### Usage Priority
1. Check for existing-fact spec first
2. If spec exists and confidence is high, use spec as context
3. If spec is medium confidence, verify critical assumptions
4. If no spec exists or confidence is low, read source code

### When to Trigger Re-Extraction
If you modify code covered by an existing-fact spec:
1. Note the spec may be stale
2. Update the spec if the change is significant
3. Mark spec as needs-review if uncertain

### Spec Quality Indicators
- `confidence: high` - Trust as authoritative
- `confidence: medium` - Verify critical paths
- `confidence: low` - Treat as hypothesis
- `compression_ratio` - Higher means better context efficiency
```
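
The usage-priority rules above can also be encoded in an agent harness. A hypothetical sketch; the return shape is an assumption:

```typescript
// The usage-priority rules, expressed as a decision function.
type Confidence = "high" | "medium" | "low";

function contextStrategy(spec?: { path: string; confidence: Confidence }) {
  if (!spec || spec.confidence === "low") {
    return { load: "source-code", note: "no trustworthy spec available" };
  }
  if (spec.confidence === "medium") {
    return { load: spec.path, note: "verify critical assumptions against code" };
  }
  return { load: spec.path, note: "use spec as authoritative context" };
}
```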

## Context Compression Analysis

### Compression Targets by Layer

| Layer | Code Example | Spec Equivalent | Ratio |
|-------|--------------|-----------------|-------|
| L1: Boundaries | 500 lines of API handlers | 50 lines of endpoint specs | 10:1 |
| L2: Structure | 2000 lines across 10 modules | 100 lines of architecture | 20:1 |
| L3: Behaviors | 300 lines of algorithm | 60 lines of behavioral spec | 5:1 |
| L4: Edge Cases | 200 lines of validation | 70 lines of constraint spec | 3:1 |

### Real-World Example

**Before: Loading Source Code**

```
Context tokens for the authentication module:
- oauth.ts: 450 lines (~2000 tokens)
- token-manager.ts: 280 lines (~1200 tokens)
- auth-middleware.ts: 180 lines (~800 tokens)
- auth.test.ts: 520 lines (~2300 tokens)
Total: 1430 lines (~6300 tokens)
```

**After: Loading Existing-Fact Spec**

```
Context tokens for the authentication spec:
- ef-auth-oauth-flow.md: 120 lines (~500 tokens)
Total: 120 lines (~500 tokens)
```

**Compression**: 12:1. **Token savings**: ~5800 tokens per session.

## Success Metrics

| Metric | Target | How to Measure |
|--------|--------|----------------|
| Compression ratio (avg) | >10:1 | `code_lines / spec_lines` |
| Spec accuracy | >95% | Human review pass rate |
| Context reduction | >50% | Token usage before/after |
| AI task success rate | +20% | Compare with/without specs |
| Extraction efficiency | <2 hr/module | Time tracking |
| Spec freshness | <30 days | `updated` date monitoring |
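
The first metric is straightforward to compute from a spec's `extracted_from` list. A minimal sketch, assuming Node.js:

```typescript
// Compression ratio as code_lines / spec_lines, per the table above.
import { readFileSync } from "node:fs";

function lineCount(path: string): number {
  return readFileSync(path, "utf8").split("\n").length;
}

function compressionRatio(specPath: string, sourcePaths: string[]): number {
  const codeLines = sourcePaths.reduce((sum, p) => sum + lineCount(p), 0);
  return codeLines / lineCount(specPath);
}

// Example with the authentication spec used above:
// compressionRatio("docs/specs/existing-facts/ef-auth-oauth-flow.md",
//                  ["src/auth/oauth.ts", "src/auth/token-manager.ts"]);
```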

## Anti-Patterns to Avoid

### 1. Extracting Everything

**Problem**: Attempting to spec every line of code.

**Solution**: Focus on high-value, high-change-frequency modules first. Not all code needs specs.

### 2. Specs Without Validation

**Problem**: Generating specs and assuming they're correct.

**Solution**: Every spec requires human review. Confidence levels communicate trustworthiness.

### 3. Stale Specs

**Problem**: Specs drift from the implementation over time.

**Solution**: CI checks, freshness monitoring, and change-triggered re-extraction.

### 4. Over-Detailed Specs

**Problem**: Specs that are as long as the code they describe.

**Solution**: Focus on abstraction. If the compression ratio is below 3:1, the spec is too detailed.

### 5. Ignoring Technical Debt

**Problem**: Extracting specs for code that "shouldn't be this way."

**Solution**: Document known debt in specs. Don't legitimize bad patterns by specifying them.

## Frequently Asked Questions

### When should we extract specs vs. write new ones?

Extract when:

- Code exists but documentation doesn't
- The original developers are unavailable
- You need to understand a legacy system quickly
- You want to reduce context window usage

Write new when:

- Building new features
- Redesigning existing features
- The code doesn't exist yet

### How do we handle code that violates its own patterns?

Document inconsistencies in the spec:

```markdown
## Known Inconsistencies
- Module A uses pattern X
- Module B uses pattern Y for the same purpose
- **Note**: This is legacy debt, not intentional design
```

### Should existing-fact specs live with code or separately?

**Recommended**: A separate `docs/specs/existing-facts/` directory.

Rationale:

- Specs aggregate content from multiple files
- Easier to find and load as context
- Can follow a different review cadence
- Clear distinction from code comments

### How often should we re-extract?

| Code Change Type | Re-extraction Needed |
|------------------|----------------------|
| Bug fix | No (unless it changes behavior) |
| Refactor (same behavior) | Maybe (update file references) |
| Feature addition | Yes (add to spec) |
| Behavior change | Yes (update spec) |
| Major rewrite | Yes (full re-extraction) |
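
For tooling that reacts to commits, the table can be encoded directly. A sketch; the change-type labels and action strings are illustrative:

```typescript
// The re-extraction decision table as a function.
type ChangeType =
  | "bug-fix"
  | "refactor"
  | "feature-addition"
  | "behavior-change"
  | "major-rewrite";

function reExtractionAction(change: ChangeType, behaviorChanged = false): string {
  switch (change) {
    case "bug-fix":
      return behaviorChanged ? "update spec" : "none";
    case "refactor":
      return "update file references";
    case "feature-addition":
    case "behavior-change":
      return "update spec";
    case "major-rewrite":
      return "full re-extraction";
  }
}
```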
