Dataface Tasks

Wire question-scoped context bundles into text-to-SQL eval backends

ID: CONTEXT_CATALOG_NIMBLE-WIRE_QUESTION_SCOPED_CONTEXT_BUNDLES_INTO_TEXT_TO_SQL_EVAL_BACKENDS
Status: not_started
Priority: p1
Milestone: m2-internal-adoption-design-partners
Owner: data-ai-engineer-architect
Initiative: question-aware-schema-retrieval-and-narrowing

Problem

Teach the shared SQL generation and eval backend layer to consume question-scoped context bundles from the retrieval CLI. Local text-to-SQL runs can then compare full-schema prompting against retrieved-and-isolated context without building a separate retrieval system inside the model prompt.

Context

The retrieval initiative only matters if generation can actually consume the narrowed context.

Relevant current paths:

  • dataface/ai/generate_sql.py accepts a schema-context string
  • apps/evals/sql/backends.py resolves built-in generation backends
  • apps/evals/sql/context.py already has pluggable context-provider patterns
  • the eval runner can compare backend configurations and metadata across runs

So this task should not build a second retrieval system inside the backend. It should teach the generation/eval layer to consume the bundle artifact produced by the CLI/search layer and compare it honestly with the current full-schema baseline.

This task also should not turn into retrieval optimization work. It should consume whatever simple bundle generator M2 gives us, even if that bundle came from a very naive Python search implementation.

Possible Solutions

  1. Recommended: consume question-scoped bundles through the existing backend/context-provider seam.

Add a bundle-aware context provider or backend mode that loads the isolated bundle text/JSON for each question and feeds only that narrowed context to the shared generator.

Why this is recommended:

  • reuses the eval backend architecture already in place
  • keeps retrieval and generation loosely coupled
  • makes A/B comparison against full-context prompting straightforward

  2. Reimplement retrieval logic directly inside each backend.

Trade-off: duplicates logic and guarantees drift between the CLI retriever and eval paths.

  3. Change generate_sql() to own retrieval itself.

Trade-off: collapses retrieval and generation back together, which is the exact architecture problem this initiative is trying to fix.
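A minimal sketch of the recommended option, to make the seam concrete. Everything here is hypothetical: the `ContextProvider` protocol, both provider classes, the `context_for` method, the one-file-per-question layout, and the `{"question": ..., "context": ...}` bundle shape are assumptions for illustration, not the existing interfaces in apps/evals/sql/context.py or the real bundle contract.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol


class ContextProvider(Protocol):
    """Hypothetical seam: returns the schema-context string for one question."""

    def context_for(self, question_id: str) -> str: ...


@dataclass
class FullSchemaContextProvider:
    """Baseline mode: every question receives the same full-schema text."""

    full_schema: str

    def context_for(self, question_id: str) -> str:
        return self.full_schema


@dataclass
class BundleContextProvider:
    """Bundle mode: loads a pre-built bundle file; performs no retrieval itself."""

    bundle_dir: Path

    def context_for(self, question_id: str) -> str:
        # Assumed layout: one JSON file per question id inside bundle_dir.
        path = self.bundle_dir / f"{question_id}.json"
        bundle = json.loads(path.read_text(encoding="utf-8"))
        # Assumed bundle shape: {"question": "...", "context": "<schema text>"}.
        return bundle["context"]


def demo() -> tuple[str, str]:
    """Return (full, narrowed) context for one question using a temp bundle."""
    full_schema = "TABLE orders(id, total)\nTABLE users(id, email)"
    with tempfile.TemporaryDirectory() as tmp:
        bundle_dir = Path(tmp)
        bundle = {"question": "How many orders?", "context": "TABLE orders(id, total)"}
        (bundle_dir / "q1.json").write_text(json.dumps(bundle), encoding="utf-8")

        providers: list[ContextProvider] = [
            FullSchemaContextProvider(full_schema),
            BundleContextProvider(bundle_dir),
        ]
        full, narrowed = (p.context_for("q1") for p in providers)
    return full, narrowed
```

Because both providers return a plain string, the generator can keep accepting a schema-context string unchanged, and the A/B comparison reduces to swapping the provider.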

Plan

  1. Define how a question maps to a saved or on-demand bundle artifact.
  2. Add a bundle-backed context-provider mode for eval backends.
  3. Preserve the current full-schema context mode as the baseline.
  4. Ensure backend metadata records:
     - bundle mode
     - bundle path or strategy
     - prompt context size or reduction metadata when available
  5. Run canary eval comparisons between:
     - full schema context
     - question-scoped bundle context
  6. Add focused tests that prove the backend consumes bundle context without duplicating retrieval logic.
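Steps 1 and 4 could look roughly like the sketch below. The hashing scheme, the function names `bundle_path_for` and `context_metadata`, and the metadata keys are all assumptions chosen for illustration; the real mapping and field names should follow whatever the bundle contract and eval runner already define.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def bundle_path_for(question: str, bundle_dir: Path) -> Path:
    """Map a question to a bundle file by hashing its normalized text.

    Assumed scheme: lowercase, collapse whitespace, take a short SHA-256 prefix.
    """
    normalized = " ".join(question.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    return bundle_dir / f"{digest}.json"


def context_metadata(
    mode: str,
    context: str,
    full_schema: str,
    bundle_path: Path | None = None,
) -> dict:
    """Build the per-run metadata entries for the context that was used."""
    full_chars = len(full_schema)
    context_chars = len(context)
    # Fraction of the full-schema prompt removed by narrowing (0.0 for baseline).
    reduction = 1 - context_chars / full_chars if full_chars else 0.0
    return {
        "context_mode": mode,  # e.g. "full_schema" or "bundle"
        "bundle_path": str(bundle_path) if bundle_path else None,
        "context_chars": context_chars,
        "full_schema_chars": full_chars,
        "context_reduction": round(reduction, 3),
    }
```

Character counts stand in for prompt size here only because they need no tokenizer; a token count would be a more faithful measure if one is available in the eval runner.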

Likely files

  • apps/evals/sql/backends.py
  • apps/evals/sql/context.py
  • dataface/ai/generate_sql.py
  • eval tests under tests/evals/sql/
  • possibly the shared text-to-SQL task surfaces that currently call full schema context directly

Explicit anti-goals for this task

  • no backend-local search/index implementation
  • no search-speed optimization
  • no tight coupling between eval backend logic and retrieval internals beyond the bundle contract

Implementation Progress

Not started.

QA Exploration

  • [x] QA exploration completed (or N/A for non-UI tasks)

N/A - eval/backend integration task.

Review Feedback

  • [ ] Review cleared