Dataface Tasks

Add join-path grounding to question-scoped context bundles

ID: CONTEXT_CATALOG_NIMBLE-ADD_JOIN_PATH_GROUNDING_TO_QUESTION_SCOPED_CONTEXT_BUNDLES
Status: not_started
Priority: p1
Milestone: m3-public-launch
Owner: data-ai-engineer-architect

Problem

Retrieval can identify the right tables and still leave the model stuck on how those tables connect. Wrong joins are one of the most expensive failure modes in text-to-SQL because the query may still parse and even look plausible while returning the wrong answer.

Question-scoped bundles need to tell the model not just which tables matter, but how they relate.

Context

The inspect and schema context work already produces some relationship signals:

  • foreign-key-like connections
  • joinable columns
  • grain hints
  • relationship metadata inferred during inspect/profile steps

The retrieval initiative already separates:

  1. broad retrieval
  2. isolation into a compact bundle
  3. downstream generation

This task belongs in the isolation step. It should enrich the bundle with likely join paths among retained tables, not create a new universal relationship engine.

Possible Solutions

  1. Recommended: add compact relationship edges and join-path hints to retained-table bundles. For the top retrieved tables, include the likely connecting keys and the most plausible join routes so the generator sees the shape of the narrowed schema.

Why this is recommended:

  • addresses a high-value text-to-SQL failure mode
  • fits naturally into bundle generation
  • reuses inspect relationship signals instead of inventing new infrastructure

  2. Leave join reasoning entirely to the generator.

Trade-off: simpler, but it wastes useful context that Dataface can often infer locally.

  3. Build a full-blown graph search or semantic join planner.

Trade-off: promising later, but too large for this stage.
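
To make the recommended option concrete, here is a minimal sketch of what a compact relationship edge and its rendered hint could look like. Every name in it (RelationshipEdge, render_join_hints, the source labels, the example tables) is hypothetical and chosen for illustration; none of it reflects an existing Dataface structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEdge:
    """One likely join between two retained tables (hypothetical shape)."""
    left_table: str
    left_column: str
    right_table: str
    right_column: str
    source: str        # e.g. "foreign_key" or "inferred"; labels are assumptions
    confidence: float  # 0.0 to 1.0, as scored by whatever produced the edge


def render_join_hints(edges: list[RelationshipEdge]) -> list[str]:
    """Render edges as short text lines, strongest first."""
    ordered = sorted(edges, key=lambda e: e.confidence, reverse=True)
    return [
        f"{e.left_table}.{e.left_column} = {e.right_table}.{e.right_column}"
        f" ({e.source}, {e.confidence:.2f})"
        for e in ordered
    ]


# Illustrative data only; not taken from any real inspect artifact.
edges = [
    RelationshipEdge("orders", "region", "regions", "name", "inferred", 0.60),
    RelationshipEdge("orders", "customer_id", "customers", "id", "foreign_key", 0.95),
]
hints = render_join_hints(edges)
```

One line per edge keeps the hint cheap in tokens, and sorting by confidence lets a later step truncate the list without losing the strongest signals.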

Plan

  1. Review existing relationship and join inference outputs available from inspect artifacts.
  2. Define a compact bundle representation for relationship edges and likely join paths.
  3. Restrict those hints to the retained working set so the bundle stays narrow.
  4. Expose the relationship-aware bundle variant to eval consumers.
  5. Measure whether join-aware bundles reduce wrong-join failures in downstream evals.
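
Steps 2 and 3 of the plan could be prototyped roughly as follows. This is a sketch under stated assumptions: edges are plain tuples, the function names restrict_edges and find_join_path are hypothetical, and a bounded breadth-first search stands in for whatever path selection the task ends up using. It is deliberately not a general join planner.

```python
from collections import deque

# Assumed edge shape: (left_table, left_column, right_table, right_column)
Edge = tuple[str, str, str, str]


def restrict_edges(edges: list[Edge], retained: set[str]) -> list[Edge]:
    """Keep only edges whose two endpoints are both in the retained working set."""
    return [e for e in edges if e[0] in retained and e[2] in retained]


def find_join_path(edges: list[Edge], start: str, goal: str, max_hops: int = 3):
    """Breadth-first search for a shortest join route of at most max_hops edges.

    Returns a list of join conditions, or None if no route is found.
    """
    adjacency: dict[str, list[tuple[str, str]]] = {}
    for left_t, left_c, right_t, right_c in edges:
        condition = f"{left_t}.{left_c} = {right_t}.{right_c}"
        adjacency.setdefault(left_t, []).append((right_t, condition))
        adjacency.setdefault(right_t, []).append((left_t, condition))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not extend routes past the hop limit
        for neighbor, condition in adjacency.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [condition]))
    return None


# Illustrative data only.
all_edges = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
    ("products", "supplier_id", "suppliers", "id"),
]
retained = {"customers", "orders", "order_items", "products"}
kept = restrict_edges(all_edges, retained)
path = find_join_path(kept, "customers", "products")
```

Filtering before searching means a route can never pass through a table the isolation step dropped, which keeps the hints consistent with the rest of the bundle. The hop limit is an arbitrary default here.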

Success criteria

  • question-scoped bundles can include likely relationship edges among retained tables
  • join-heavy benchmark cases show fewer wrong-join failures in downstream evals
  • bundle size remains controlled instead of turning into a full lineage dump
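
One simple way to hold the third criterion is a hard cap on how many hints a bundle may carry. The sketch below assumes hints arrive as (text, confidence) pairs; the function name cap_hints and the default limit are hypothetical, and the right limit would need to come from eval results.

```python
def cap_hints(scored_hints: list[tuple[str, float]], max_hints: int = 8) -> list[str]:
    """Keep the max_hints highest-confidence hints and drop the rest."""
    ranked = sorted(scored_hints, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:max_hints]]


# Illustrative data only.
scored = [
    ("orders.region = regions.name", 0.60),
    ("orders.customer_id = customers.id", 0.95),
    ("orders.note = customers.note", 0.20),
]
top_two = cap_hints(scored, max_hints=2)
```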

Implementation Progress

Not started.

QA Exploration

  • [x] QA exploration completed (or N/A for non-UI tasks)

N/A - retrieval/bundle task.

Review Feedback

No review feedback yet.

  • [ ] Review cleared