
Context Taxonomy

Understanding, capturing, and leveraging development context.

The Core Insight

Context engineering, not model capability, is the primary bottleneck for AI coding productivity. The same agent with different context produces dramatically different results.

A coding-session-level tool can capture approximately 60-70% of the context that meaningfully impacts coding outcomes. This context falls into six pillars: Intent, Structural, Convention, Historical, Operational, and Verification.


1. The Fundamental Principle

Every piece of context an agent needs ultimately answers one of these core questions:

| Question | Context Type | Traditional SDLC Source |
| --- | --- | --- |
| WHAT should I build? | Intent Context | Requirements, user stories, tickets |
| WHY should I build it this way? | Historical Context | ADRs, design docs, team discussions |
| HOW do we do things here? | Convention Context | Style guides, code review, tribal knowledge |
| WHAT exists already? | Structural Context | Codebase, dependencies, architecture |
| WHAT happened before? | Historical Context | Git history, PR discussions, postmortems |
| WHAT could go wrong? | Operational Context | Past bugs, production incidents |
| HOW do I know it's right? | Verification Context | Test strategies, acceptance criteria, QA |

The revelation: All this context already exists in every software organization. It is scattered across people's heads (tribal knowledge), documents that drift from reality, implicit patterns in code, conversations that disappear, and tickets that become obsolete. The opportunity is to materialize it, version it, and make it queryable.
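
For illustration only, a single captured decision might be materialized as a small, versioned, queryable record. The shape below is a hypothetical sketch, not an actual Buildforce schema; every field name and value is invented for the example.

```typescript
// Hypothetical sketch of a materialized context record; field names and values
// are invented for illustration and are not an actual Buildforce schema.
interface ContextRecord {
  id: string;                        // stable identifier so later sessions can reference it
  pillar:
    | "intent"
    | "structural"
    | "convention"
    | "historical"
    | "operational"
    | "verification";
  summary: string;                   // one distilled statement of the fact or decision
  rationale?: string;                // the WHY, when it surfaced during the session
  rejectedAlternatives?: string[];   // options considered and dropped, to prevent re-litigation
  source: "cold_start" | "session_capture";
  capturedAt: string;                // ISO timestamp; records are append-only and versioned
}

// Example: a decision captured during a coding session.
const rejectionRecord: ContextRecord = {
  id: "hist-0042",
  pillar: "historical",
  summary: "GraphQL rejected for the reporting API",
  rationale: "Caching complexity outweighed query flexibility for our access patterns",
  rejectedAlternatives: ["GraphQL"],
  source: "session_capture",
  capturedAt: "2025-01-15T10:30:00Z",
};
```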


2. The Six Pillars of Development Context

2.1 Intent Context

Intent context answers: WHAT should I build and WHY (at the task level)?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Goal | The desired end state | Users can export reports as PDF | HIGH - Directs implementation |
| Acceptance Criteria | Measurable success conditions | Export completes in <5s for 1000 rows | HIGH - Defines done |
| Non-goals | Explicit scope boundaries | Not supporting CSV in this iteration | MEDIUM - Prevents scope creep |
| Motivation | Business/user need behind goal | Enterprise compliance requirements | LOW - Rarely impacts code |

Captureability: Goal and acceptance criteria are often stated at session start (~70% capturable). Motivation lives outside coding sessions (~20% capturable).

Key Insight: By the time you are in a coding session, motivation context has minimal impact on implementation. The scope should already be defined.
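
For illustration, the goal and acceptance criteria stated at session start could be captured as a small structured record. The sketch below is hypothetical, reusing the examples from the table above; the interface and field names are invented, not a prescribed format.

```typescript
// Hypothetical sketch: intent context captured at the start of a coding session.
// The interface and field names are invented for illustration, not a prescribed format.
interface IntentContext {
  goal: string;                  // the desired end state
  acceptanceCriteria: string[];  // measurable conditions that define "done"
  nonGoals: string[];            // explicit scope boundaries for this iteration
}

const sessionIntent: IntentContext = {
  goal: "Users can export reports as PDF",
  acceptanceCriteria: ["Export completes in under 5 seconds for 1,000 rows"],
  nonGoals: ["CSV export"],
};
```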

2.2 Structural Context

Structural context answers: WHAT exists and HOW is it organized?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Architecture | High-level system organization | Microservices with event-driven comms | HIGH - Where code goes |
| Patterns | Recurring solutions in codebase | Repository pattern for data access | HIGH - Implementation approach |
| Dependencies | What relies on what | ReportService depends on PdfGenerator | HIGH - Change impact |
| Boundaries | Module/domain separations | Billing never imports from Reporting | HIGH - What not to cross |
| Data Flow | How information moves | Events flow through Kafka | MEDIUM - Integration impl |
| API Contracts | Interface agreements | v2 endpoints return pagination metadata | HIGH - Compatibility |

Captureability: Highly extractable from codebase analysis (~75-95% depending on sub-type).

Cold Start Value: This is the primary value from cold start extraction. The agent immediately knows where to put new code and what exists to build on.
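
Part of why structural context is so extractable is that boundaries often already live in configuration. As a hypothetical sketch (the paths, globs, and message are invented for illustration), a boundary like "Billing never imports from Reporting" might appear in an ESLint config that cold start extraction can read directly:

```typescript
// Hypothetical sketch (eslint.config.ts style): the "Billing never imports from
// Reporting" boundary expressed with ESLint's built-in no-restricted-imports rule.
// Paths, globs, and the message are invented for illustration.
export default [
  {
    files: ["src/billing/**/*.ts"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          patterns: [
            {
              group: ["**/reporting/**"],
              message: "Billing must not depend on Reporting (domain boundary).",
            },
          ],
        },
      ],
    },
  },
];
```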

2.3 Convention Context

Convention context answers: HOW do we do things here?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Code Style | Formatting, naming conventions | PascalCase for components, camelCase for functions | HIGH - Syntax choices |
| Design Patterns | Preferred implementation approaches | Use hooks for state, never class components | HIGH - Implementation |
| Error Handling | How failures are managed | All API errors return ErrorResponse type | HIGH - Error code structure |
| Testing Approach | What and how to test | Unit test business logic, integration test APIs | HIGH - Test authoring |
| Documentation | What gets documented where | All public APIs need JSDoc with examples | MEDIUM - Doc writing |
| Commit Conventions | Message format, granularity | Conventional commits; one logical change | HIGH - Git workflow |

Captureability: Highly extractable from existing code patterns and linting configuration (~70-90%).

Key Insight: Convention context ensures code looks like it belongs. Without it, agent output is obviously AI-generated.
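
As a hedged illustration of how a single convention shapes output, the sketch below applies the "All API errors return ErrorResponse type" rule from the table. The ErrorResponse shape, Report type, and endpoint are invented names for the example, not an actual team standard.

```typescript
// Hypothetical sketch: the "All API errors return ErrorResponse type" convention applied.
// ErrorResponse, Report, and the endpoint are invented names for illustration.
interface ErrorResponse {
  code: string;       // machine-readable error code, per team convention
  message: string;    // human-readable summary
}

interface Report {
  id: string;
  title: string;
}

async function fetchReport(reportId: string): Promise<Report | ErrorResponse> {
  try {
    const response = await fetch(`/api/v2/reports/${reportId}`);
    if (!response.ok) {
      // Convention: API failures surface as ErrorResponse values, never as thrown raw errors.
      return { code: "REPORT_FETCH_FAILED", message: `Request failed with status ${response.status}` };
    }
    return (await response.json()) as Report;
  } catch (err) {
    return { code: "NETWORK_ERROR", message: String(err) };
  }
}
```

An agent without this convention would typically throw, wrap, or log errors in a generic way that a reviewer immediately flags as not belonging.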

2.4 Historical Context

Historical context answers: WHY are things the way they are?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Decisions Made | Past architectural choices | Chose Postgres for ACID compliance | MEDIUM - Understanding state |
| Decisions Rejected | What was considered but not chosen | GraphQL rejected for caching complexity | HIGH - Prevents re-litigation |
| Evolution | How things changed over time | Auth moved from JWT to session-based in v2.3 | MEDIUM - Code archaeology |
| Bug Patterns | Past failures and resolutions | Race condition in payment fixed by mutex | HIGH - Avoid repeats |
| Workarounds | Intentional technical debt | setTimeout for library X timing bug | MEDIUM - Don't fix quirks |

Captureability: This is the WHY context that accumulates over time through session capture. The Decisions Rejected sub-type is particularly valuable (~75% capturable during sessions).

Key Insight: Historical context is rarely needed for daily coding but becomes critical for refactoring, migrations, and hard debugging. This is where accumulated context delivers visible magic.
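
A hedged illustration of the Workarounds row above: without captured history, the deliberate delay below looks like a bug to delete. The "chart-widget" library, version, and 50 ms delay are invented for the example; the point is the preserved rationale.

```typescript
// Hypothetical sketch: a deliberate workaround whose rationale is preserved as historical
// context. The "chart-widget" library, version, and 50 ms delay are invented for illustration.

// Captured decision: chart-widget v3.2 fires its resize callback before layout settles,
// producing a blank canvas when drawing immediately. Deferring one tick avoids forking
// the library. Revisit when v4 ships.
function renderChartWhenStable(draw: () => void): void {
  setTimeout(() => {
    // Intentional delay; see the captured decision above before "fixing" this.
    draw();
  }, 50);
}
```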

2.5 Operational Context

Operational context answers: WHAT happens in production?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Constraints | Hard limits | Max 5MB payload; 30s timeout; 10k concurrent | LOW - Usually implicit |
| Failure Modes | How things break | Redis failover takes 30s; cache locally | LOW - Rarely in IDE |
| Performance Baselines | Expected behavior | P95 latency should stay under 200ms | LOW - Optimization targets |
| Security Boundaries | Trust zones | User input never reaches eval() | MEDIUM - Security patterns |

Captureability: Lives outside coding sessions (~20-40% capturable). Production reality is not in the IDE.

This context type is largely out of scope for coding-session tools. It represents future territory: integration with an operational context layer.

2.6 Verification Context

Verification context answers: HOW do I know it's right?

| Sub-type | Description | Example | Coding Impact |
| --- | --- | --- | --- |
| Quality Criteria | What makes code good here | No PR without tests for public methods | HIGH - Self-review |
| Test Strategy | How to prove correctness | Unit + integration + contract tests | HIGH - Test authoring |
| Review Checklist | What reviewers look for | Error handling, edge cases, types | HIGH - Compliance |
| Acceptance Tests | How to validate features | Cucumber scenarios for user journeys | MEDIUM - Feature validation |
| Compliance Rules | What must be followed | GDPR data handling, SOC2 logging | MEDIUM - Regulatory |
| Known Risks | What tends to break | Watch for N+1 queries in UserRepository | HIGH - Proactive avoidance |

Captureability: Partially extractable from test structure and CI config (~55-80% depending on sub-type).

Critical Insight: Verification context is the feedback loop decision maker. Without it, the agent can write code but cannot evaluate whether the code is good; it can only check whether it compiles. This is what separates a coding agent from a competent developer.
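
As one possible illustration, verification context could be surfaced to the agent as a queryable pre-merge checklist. The structure and helper below are a hypothetical sketch, with rule texts echoing the examples in the table; names and path prefixes are invented.

```typescript
// Hypothetical sketch: verification context surfaced as a queryable pre-merge checklist.
// Rule texts echo the examples above; the structure and names are invented for illustration.
interface VerificationRule {
  rule: string;                     // what the team actually checks before approving
  appliesTo: string;                // path prefix the rule covers (kept simple for the sketch)
  severity: "blocker" | "warning";
}

const verificationContext: VerificationRule[] = [
  { rule: "Public methods touched by this change have unit tests", appliesTo: "src/", severity: "blocker" },
  { rule: "API errors use the shared ErrorResponse shape", appliesTo: "src/api/", severity: "blocker" },
  { rule: "Watch for N+1 queries around UserRepository call sites", appliesTo: "src/data/", severity: "warning" },
];

// Before declaring work done, an agent can pull only the rules relevant to the files it changed.
function rulesFor(changedPath: string): VerificationRule[] {
  return verificationContext.filter((r) => changedPath.startsWith(r.appliesTo));
}

// Example: rulesFor("src/api/reports.ts") returns the first two rules.
```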


3. Context Impact on Coding Outcomes

3.1 Impact Ranking

Not all context types have equal impact on coding session quality:

| Impact Level | Context Types | Why |
| --- | --- | --- |
| CRITICAL | Structural, Convention, Intent (Goal), Verification | Core loop enablers - agent cannot function well without these |
| SIGNIFICANT | Historical (Rejected), Historical (Decisions), Intent (Criteria) | Quality multipliers - improve accuracy and reduce iterations |
| MARGINAL | Operational (Constraints), Historical (Bug patterns) | Edge case handlers - matter in specific scenarios |
| MINIMAL | Intent (Motivation), Intent (Stakeholder), Operational (Runtime) | Outside coding scope - rarely change implementation |

3.2 Quality Dimension Mapping

Different context types influence different quality dimensions:

| Quality Dimension | Primary Context Types | Outcome |
| --- | --- | --- |
| Correctness (does it work?) | Intent, Structural | Agent builds the right thing in the right place |
| Compliance (does it fit?) | Convention, Verification | Code follows patterns, meets review criteria |
| Safety (won't break things?) | Historical, Verification | Avoids known pitfalls, respects boundaries |
| Efficiency (is it optimal?) | Structural, Operational | Knows constraints, uses existing solutions |
| Maintainability (can evolve?) | Convention, Historical | Code matches team expectations |
| Verifiability (can prove it?) | Verification, Intent | Knows what done looks like, how to test |

3.3 The Verification Gap

Current agents fail most dramatically at verification. This maps to a specific context deficit:

| Verification Level | Context Required | Without Context | With Context |
| --- | --- | --- | --- |
| Syntactic | Language rules | Linter passes | Linter passes |
| Semantic | Intent + Acceptance | Builds something | Builds the right thing |
| Stylistic | Convention | Generic patterns | Code that belongs |
| Architectural | Structural + Boundaries | May violate boundaries | Respects boundaries |
| Behavioral | Historical + Verification | Hopes it works | Avoids known failures |
| Compliance | Verification criteria | Self-approves | Meets team standards |

4. What Can Be Captured

4.1 Coding Session Observability Window

A tool operating at the coding session level has a specific observability window:

Directly Observable:

  • What developer/agent is building
  • What they are reasoning about
  • What code they are reading/writing
  • What alternatives they explore and reject
  • How they verify their work
  • What tests they write/run

Outside the Window:

  • Business meetings and stakeholder discussions
  • PM motivation for features
  • Production incident history and runtime behavior
  • SRE runbooks and performance baselines

4.2 Captureability by Context Type

| Context Type | Cold Start | Session Capture | Combined |
| --- | --- | --- | --- |
| STRUCTURAL | ~75% | +10% | ~85% |
| CONVENTION | ~70% | +15% | ~85% |
| VERIFICATION | ~55% | +25% | ~80% |
| HISTORICAL | ~25% | +45% | ~70% |
| INTENT (Goal/Criteria) | ~10% | +60% | ~70% |
| OPERATIONAL | ~20% | +15% | ~35% |

Key Insight: When weighted by coding impact, capturable context covers approximately 60-70% of what actually matters for coding sessions. Structural, Convention, and Verification are both high-impact and highly capturable.

4.3 The Fundamental Gaps

Two context types remain fundamentally outside coding session scope:

Gap 1 - Upstream Intent: The business motivation, stakeholder priorities, and full product context that explains why something should be built. This lives in product meetings, customer conversations, and strategy documents.

Gap 2 - Downstream Reality: The operational truth about what happens when code runs in production. This lives in monitoring dashboards, incident reports, and SRE runbooks.


5. Value Accumulation Over Time

With comprehensive context capture in place, an agent's effectiveness compounds over time. Cold start provides ~15-20% improvement on day one by giving the agent structural awareness and convention compliance. As historical context accumulates—particularly the WHY behind decisions and rejected alternatives—improvement reaches 35-45% over months, with the critical unlock occurring when the agent can handle refactoring, migrations, and complex debugging by drawing on accumulated institutional knowledge.


Summary: Context That Matters for Coding

| Context | Day 1 Source | Accumulation Source | Impact | Improvement Contribution |
| --- | --- | --- | --- | --- |
| STRUCTURAL | Cold start extraction | Session navigation patterns | Critical | ~5-8% |
| CONVENTION | Cold start extraction | Code review discussions | Critical | ~5-8% |
| VERIFICATION | Test/CI analysis | Review feedback, test sessions | Critical | ~4-6% |
| INTENT (Goal) | Session start | Stated intent patterns | Critical | ~3-5% |
| HISTORICAL | Git history (partial) | Session capture | Significant | ~8-15% |
| OPERATIONAL | Config analysis | Limited (out of scope) | Marginal | ~2-3% |

Total Expected Improvement Range: 15-20% (Day 1) → 25-35% (Month 1-3) → 35-50% (Month 6+)

The path from baseline to mature context is not about capturing more context types. It is about accumulating depth in the context types that matter most: Historical (WHY) and Verification (HOW TO KNOW).


Extending What Agents Already Do

Modern coding agents are remarkably capable at reverse-engineering context from codebases. Through agentic search, AST analysis, dependency graphs, and pattern recognition, they can discover what exists, how it's organized, and often infer conventions from code alone. This capability is real and continues to improve.

This taxonomy does not aim to replace that discovery process or invent a new paradigm. Instead, it addresses a fundamental limitation: agents can reverse-engineer the WHAT, but they cannot reverse-engineer the WHY.

No amount of codebase analysis will reveal why GraphQL was rejected, what production incident led to that defensive timeout, or which testing patterns the team actually values versus tolerates. This knowledge exists only in human heads, disappearing conversations, and tribal memory that erodes with every team change.

The opportunity is not to build something agents can't already do. It is to capture and persist the context that cannot be discovered—the decisions, rejections, rationale, and verification expectations that shape whether code is merely functional or truly belongs.

This context doesn't need to be exhaustive or perfect. It needs to be:

  • Distilled: Capturing signal, not noise—the decisions that actually matter
  • Opinionated: Reflecting how this team works, not generic best practices
  • Accumulated: Building up naturally through the act of development itself
  • Queryable: Available when the agent needs it, not buried in documents

When an agent combines its native ability to search and analyze code with accumulated context about why things are the way they are, the result is not a different kind of agent. It is the same agent, operating with the institutional knowledge that previously existed only in senior developers' heads.

A common assumption is that better models will eventually make explicit context capture obsolete—that sufficiently advanced agents will simply figure everything out. The opposite is true. Better models amplify the value of good context, they don't replace it.

More capable models are better at capturing context: they recognize what's significant in a conversation, extract cleaner decision rationale, and identify patterns worth preserving. They're also better at utilizing context: they can reason over larger context windows, synthesize information from multiple sources, and apply historical knowledge more precisely to current tasks. The bottleneck was never the model's ability to use context—it was getting the right context to the model in the first place.

This creates a compounding relationship. As models improve, the same context infrastructure delivers progressively better outcomes. The investment in capturing WHY context and verification expectations becomes more valuable over time, not less. Teams that accumulate this context gain increasing advantage as the agents consuming it grow more sophisticated.

The goal is simple: help agents produce code that a thoughtful human teammate would produce—code that fits, that respects history, that anticipates problems, and that meets the team's actual quality bar. Not through more sophisticated models alone, but through the combination of better models with the context those models have always needed.