Decision: Build the question-aware retrieval layer as a local corpus + CLI flow first, not as an always-on service or mandatory MCP tool.
Rationale: Keeps the first version simple, debuggable, and aligned with current project scale.
Consequence: Runtime tool use is deferred, but the core retrieval contract must still be reusable later.
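A minimal sketch of the local CLI flow this decision implies: a thin argparse wrapper over a plain retrieval function, so the same core stays reusable when a service or tool wraps it later. All names here (`retrieve`, `--question`, `--top-k`) are illustrative assumptions, not the real interface.

```python
import argparse
import json

def retrieve(question: str, top_k: int = 5) -> list[dict]:
    # Placeholder core; the real version would search the local corpus.
    # Kept as a plain function so a future service/tool can call it directly.
    return [{"question": question, "rank": i + 1} for i in range(top_k)]

def main(argv=None):
    parser = argparse.ArgumentParser(description="Question-aware retrieval (local CLI flow)")
    parser.add_argument("--question", required=True)
    parser.add_argument("--top-k", type=int, default=5)
    args = parser.parse_args(argv)
    # Print JSON so downstream scripts can consume the output directly.
    print(json.dumps(retrieve(args.question, args.top_k), indent=2))

if __name__ == "__main__":
    main()
```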
ADR-002: Separate retrieval from isolation
Status: Proposed
Decision: Search results are not passed directly to generation. The system must have an explicit isolation step that produces a question-scoped bundle.
Rationale: This is the core lesson from the gap analysis and from systems like LinkAlign.
Consequence: We need both a ranked-search contract and a bundle contract.
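The two contracts could be sketched as dataclasses, with the isolation step as an explicit function between search and generation. Field names and the filtering rule are assumptions for illustration, not the final schema.

```python
from dataclasses import dataclass, field

@dataclass
class RankedHit:
    # One result from the ranked-search contract.
    doc_id: str
    score: float
    kind: str  # e.g. "table", "column", "doc"

@dataclass
class ContextBundle:
    # Question-scoped bundle produced by the isolation step.
    question: str
    records: list
    notes: list = field(default_factory=list)

def isolate(question: str, hits: list[RankedHit], min_score: float = 0.5) -> ContextBundle:
    # Explicit isolation step: raw search hits never flow straight to
    # generation; only hits above a threshold enter the bundle.
    kept = [h for h in hits if h.score >= min_score]
    return ContextBundle(question=question, records=kept)
```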
ADR-003: Reuse existing inspect/dbt artifacts instead of inventing new raw metadata sources
Status: Proposed
Decision: M2 builds on target/inspect.json, dbt schema metadata, and lightweight local docs before expanding to broader external context sources.
Rationale: We already have enough local metadata to improve narrowing meaningfully.
Consequence: M2 focuses on shaping and retrieving context, not on a new ingestion platform.
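Reusing existing artifacts might look like the sketch below: flatten table metadata from a local JSON file into simple records for the corpus. The path and the assumed record shape (a top-level `"tables"` list) are illustrative, not the documented format of target/inspect.json.

```python
import json
from pathlib import Path

def load_inspect_tables(path: Path) -> list[dict]:
    # Flatten whatever table metadata exists into simple records,
    # shaping existing context rather than ingesting new sources.
    data = json.loads(path.read_text())
    # Assumed shape: {"tables": [{"name": ..., "columns": [{"name": ...}, ...]}]}
    return [
        {"table": t.get("name"), "columns": [c.get("name") for c in t.get("columns", [])]}
        for t in data.get("tables", [])
    ]
```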
ADR-004: Use deterministic lexical ranking as the default M2 search path
Status: Proposed
Decision: Start with field-weighted lexical/deterministic ranking rather than embeddings or hybrid vector search as the default.
Rationale: Current schema sizes are manageable, exact name matching is highly valuable, and deterministic scoring is easier to debug and validate.
Consequence: Embeddings may be added later if recall is insufficient, but they are not the baseline contract.
ADR-005: Explicitly defer speed and indexing optimization
Status: Proposed
Decision: M2 will not optimize for search performance. A simple Python function over local JSON/JSONL artifacts is acceptable if it narrows context well for the agent.
Rationale: The current corpus sizes are small enough, and the main risk is overbuilding retrieval infrastructure before proving the narrowing contract is useful.
Consequence: If retrieval quality proves useful and corpus sizes grow, optimization can be a later follow-up rather than a design constraint now.
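The "simple Python function over local JSONL" acceptable under this decision could be as plain as a linear scan with no index or cache; the path and record shape are assumptions.

```python
import json
from pathlib import Path

def scan_corpus(path: Path, predicate) -> list[dict]:
    # Deliberately unoptimized: read every record, keep the matches.
    # At current corpus sizes this is fast enough; indexing is deferred.
    hits = []
    with path.open() as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if predicate(record):
                hits.append(record)
    return hits
```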
ADR-006: Materialize a derived corpus on disk
Status: Proposed
Decision: Build a structured derived corpus, likely JSONL plus manifest files under target/context/.
Rationale: Keeps the retriever inspectable, rebuildable, and easy to consume from both Python and CLI paths.
Consequence: The corpus becomes a cache/build artifact, not a hand-authored source of truth.
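Materializing the corpus might look like the sketch below: JSONL records plus a small manifest, written into a directory that can be deleted and rebuilt at any time. File names and manifest fields are illustrative assumptions.

```python
import json
from pathlib import Path

def write_corpus(records: list, out_dir: Path) -> Path:
    # Write the derived corpus as JSONL plus a manifest. The output is a
    # rebuildable build artifact, never a hand-edited source of truth.
    out_dir.mkdir(parents=True, exist_ok=True)
    corpus_path = out_dir / "corpus.jsonl"
    with corpus_path.open("w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
    manifest = {"files": ["corpus.jsonl"], "record_count": len(records)}
    (out_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return corpus_path
```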
ADR-007: Bundle output must support both structured and text consumers
Status: Proposed
Decision: The question-scoped bundle should include structured records plus a compact text rendering.
Rationale: Current generation code already expects text context; a text form avoids forcing a simultaneous rewrite of every consumer.
Consequence: The bundle contract must preserve enough structure for future tooling while remaining easy to inject into prompts now.
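The dual-consumer requirement could be met by keeping structured records as the source and deriving the text form from them, as in this sketch; the record fields are assumptions.

```python
def render_bundle_text(records: list) -> str:
    # Compact text rendering for consumers that expect prompt text.
    # The structured records stay authoritative; text is derived, not stored.
    lines = []
    for r in records:
        cols = ", ".join(r.get("columns", []))
        lines.append(f"table {r['table']}: {cols}")
    return "\n".join(lines)
```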
ADR-008: Defer runtime search_context tool exposure until after the M2 contract stabilizes
Status: Proposed
Decision: Do not make a new agent tool the primary M2 deliverable. Instead, design the retrieval engine so a future tool can wrap it directly.
Rationale: Tool exposure before the retrieval contract is stable creates two moving parts at once.
Consequence: M2 should still define the likely future tool shapes and keep CLI/core logic reusable.
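One way to keep the core reusable is to define the likely tool shape as a protocol now, with the future tool as a thin wrapper; the names and response envelope below are hypothetical.

```python
from typing import Protocol

class ContextSearcher(Protocol):
    # Contract the future runtime tool would wrap directly.
    def search(self, question: str, top_k: int) -> list[dict]: ...

def search_context_tool(searcher: ContextSearcher, question: str, top_k: int = 5) -> dict:
    # Tool-facing wrapper: no new logic, just a tool-shaped envelope
    # around the same core function the CLI path uses.
    return {"question": question, "results": searcher.search(question, top_k)}
```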
ADR-009: Keep full-schema prompting as a fallback
Status: Proposed
Decision: Retrieved-and-isolated context becomes a preferred path, not an all-or-nothing hard dependency.
Rationale: Some current projects are small enough that full schema context is still acceptable, and we need a safe fallback while the retriever matures.
Consequence: Consumers need a clear fallback rule when corpus or bundle generation is unavailable.
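The fallback rule could be as simple as the sketch below: prefer the isolated bundle, fall back to full-schema text when the bundle is unavailable, and report which path was taken so consumers can log it. Names are illustrative.

```python
def choose_context(bundle_text, full_schema_text: str) -> tuple[str, str]:
    # Return (context, source) so consumers can log which path was used.
    if bundle_text:  # bundle generated successfully: preferred path
        return bundle_text, "bundle"
    return full_schema_text, "full_schema"  # safe fallback while the retriever matures
```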
ADR-010: Measure retrieval quality through downstream and retrieval-specific signals
Status: Proposed
Decision: Judge the system by both retrieval-specific metrics (top-k table/column hit rate, bundle inclusion) and downstream text-to-SQL quality.
Rationale: Retrieval that looks good in isolation but fails to help generation is not enough.
Consequence: This initiative should connect to existing eval work rather than inventing a disconnected success metric.
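A sketch of one retrieval-specific signal named above, top-k table hit rate against gold labels; the input shapes (per-question ranked table lists and a parallel gold list) are assumptions about the eval format.

```python
def top_k_hit_rate(predictions: list, gold: list, k: int = 5) -> float:
    # Fraction of questions whose gold table appears in the top-k results.
    # A retrieval-only signal, to be read alongside downstream SQL quality.
    hits = 0
    for ranked, gold_table in zip(predictions, gold):
        if gold_table in ranked[:k]:
            hits += 1
    return hits / len(gold) if gold else 0.0
```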