Wire question-scoped context bundles into text-to-SQL eval backends
Problem
Teach the shared SQL generation and eval backend layer to consume question-scoped context bundles from the retrieval CLI so local text-to-SQL runs can compare full-schema prompting against retrieved-and-isolated context without building a separate retrieval system inside the model prompt.
Context
The retrieval initiative only matters if generation can actually consume the narrowed context.
Relevant current paths:
dataface/ai/generate_sql.pyaccepts a schema-context stringapps/evals/sql/backends.pyresolves built-in generation backendsapps/evals/sql/context.pyalready has pluggable context-provider patterns- the eval runner can compare backend configurations and metadata across runs
So this task should not build a second retrieval system inside the backend. It should teach the generation/eval layer to consume the bundle artifact produced by the CLI/search layer and compare it honestly with the current full-schema baseline.
This task also should not turn into retrieval optimization work. It should consume whatever simple bundle generator M2 gives us, even if that bundle came from a very naive Python search implementation.
Possible Solutions
- Recommended: consume question-scoped bundles through the existing backend/context-provider seam Add a bundle-aware context provider or backend mode that loads the isolated bundle text/JSON for each question and feeds only that narrowed context to the shared generator.
Why this is recommended:
- reuses the eval backend architecture already in place
- keeps retrieval and generation loosely coupled
- makes A/B comparison against full-context prompting straightforward
- Reimplement retrieval logic directly inside each backend.
Trade-off: duplicates logic and guarantees drift between the CLI retriever and eval paths.
- Change
generate_sql()to own retrieval itself.
Trade-off: collapses retrieval and generation back together, which is the exact architecture problem this initiative is trying to fix.
Plan
- Define how a question maps to a saved or on-demand bundle artifact.
- Add a bundle-backed context-provider mode for eval backends.
- Preserve the current full-schema context mode as the baseline.
- Ensure backend metadata records: - bundle mode - bundle path or strategy - prompt context size or reduction metadata when available
- Run canary eval comparisons between: - full schema context - question-scoped bundle context
- Add focused tests that prove the backend consumes bundle context without duplicating retrieval logic.
Likely files
apps/evals/sql/backends.pyapps/evals/sql/context.pydataface/ai/generate_sql.py- eval tests under
tests/evals/sql/ - possibly the shared text-to-SQL task surfaces that currently call full schema context directly
Explicit anti-goals for this task
- no backend-local search/index implementation
- no search-speed optimization
- no tight coupling between eval backend logic and retrieval internals beyond the bundle contract
Implementation Progress
Not started.
QA Exploration
- [x] QA exploration completed (or N/A for non-UI tasks)
N/A - eval/backend integration task.
Review Feedback
- [ ] Review cleared