Evaluate question-scoped bundle compression strategies

ID	CONTEXT_CATALOG_NIMBLE-EVALUATE_QUESTION_SCOPED_BUNDLE_COMPRESSION_STRATEGIES
Status	not_started
Priority	p2
Milestone	m4-v1-0-launch
Owner	data-ai-engineer-architect

Problem

Once retrieval and bundle generation exist, a new question appears: how much of the narrowed context should the generator actually see? A bundle can still be too verbose, and different compression strategies may preserve or destroy different kinds of signal.

We need to evaluate bundle compression strategies explicitly instead of assuming the first bundle shape is the right one.

Context

The retrieval initiative already separates retrieval from isolation, which makes this task possible. We can compare different isolation/rendering styles without changing the underlying retriever.

Likely compression variants include:

raw narrowed table dumps
selected columns only
selected columns plus relationship summaries
short "why included" explanations
more structured planner-oriented summaries
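To make the variants concrete, here is a minimal sketch of how several of them could be expressed as interchangeable renderers over the same narrowed context. Everything in it is an assumption for illustration: the `TableContext` shape, its field names, the renderer names, and the output layout are hypothetical and not taken from the existing bundle code. The planner-oriented summary variant is left out because its shape is not yet defined.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableContext:
    """Hypothetical shape for one table in a narrowed bundle."""
    name: str
    columns: List[str]   # every column in the narrowed table
    selected: List[str]  # columns judged relevant to the question
    relationships: List[str] = field(default_factory=list)
    why_included: str = ""


def render_raw(tables):
    """Raw narrowed dump: every column of every table."""
    return "\n".join(f"{t.name}({', '.join(t.columns)})" for t in tables)


def render_selected(tables):
    """Selected columns only."""
    return "\n".join(f"{t.name}({', '.join(t.selected)})" for t in tables)


def render_selected_with_relationships(tables):
    """Selected columns plus one line per relationship summary."""
    lines = []
    for t in tables:
        lines.append(f"{t.name}({', '.join(t.selected)})")
        lines.extend(f"  rel: {r}" for r in t.relationships)
    return "\n".join(lines)


def render_with_reasons(tables):
    """Selected columns plus a short "why included" note when present."""
    lines = []
    for t in tables:
        lines.append(f"{t.name}({', '.join(t.selected)})")
        if t.why_included:
            lines.append(f"  why: {t.why_included}")
    return "\n".join(lines)


# Registry of variants; every renderer takes the same input.
VARIANTS = {
    "raw": render_raw,
    "selected": render_selected,
    "selected_plus_relationships": render_selected_with_relationships,
    "selected_plus_reasons": render_with_reasons,
}
```

Because every renderer accepts the same input, adding or dropping a variant does not touch the retriever.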

This task should optimize for downstream SQL usefulness, not for elegance or for compression ratio in the abstract.

Possible Solutions

Recommended: compare a small set of bundle compression/rendering strategies on the same eval slice. Keep retrieval fixed and vary only the bundle shape so the team can see which context presentation best preserves downstream SQL signal.

Why this is recommended:

isolates the compression question cleanly
aligns with the retrieval architecture
prevents accidental prompt bloat from creeping back in

Alternative: keep whatever initial bundle format exists and never compare alternatives.

Trade-off: simplest, but likely leaves prompt-quality gains unexplored.

Alternative: optimize purely for smallest token count.

Trade-off: tempting, but too narrow. The goal is best SQL usefulness, not smallest text.
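One way to keep size in the picture without making it the objective is to treat it as a tie-breaker only. The sketch below is an illustration under assumptions: the report layout, the `sql_quality` and `mean_size` keys, and the tolerance value are all hypothetical and would need to match whatever the eval actually produces.

```python
def pick_default(report, score_key="sql_quality", size_key="mean_size",
                 tolerance=0.01):
    """Pick the variant with the best score; use size only to break near-ties.

    `report` maps a variant name to a dict of aggregate metrics.
    """
    best = max(row[score_key] for row in report.values())
    # Variants whose score is within `tolerance` of the best are near-ties.
    near_ties = [
        name for name, row in report.items()
        if best - row[score_key] <= tolerance
    ]
    # Among near-ties only, prefer the smallest bundle.
    return min(near_ties, key=lambda name: report[name][size_key])
```

With this rule a much smaller bundle still loses if it costs more than the tolerance in SQL quality, which matches the stated goal.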

Plan

Define a small set of bundle compression/rendering variants.
Keep retrieval fixed so only bundle shape changes.
Compare downstream SQL quality, grounding behavior, and context size across variants.
Choose a default bundle format based on downstream usefulness and interpretability.
Document which richer bundle elements are optional add-ons rather than defaults.
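The comparison steps in this plan could be sketched roughly as follows. This is not an existing harness: `compare_variants` and all of its parameters (`retrieve`, `variants`, `generate_sql`, `metrics`) are hypothetical names, each eval example is assumed to be a dict with a `question` key, and context size is approximated by a whitespace word count rather than a real tokenizer.

```python
from statistics import mean


def compare_variants(eval_slice, retrieve, variants, generate_sql, metrics):
    """Score each bundle variant on the same eval slice.

    variants: maps a variant name to a render(context) -> str callable.
    metrics:  maps a metric name to a score(sql, example) -> float callable,
              e.g. one for SQL quality and one for grounding.
    """
    # Retrieve once per question, so every variant renders identical input.
    retrieved = [(ex, retrieve(ex["question"])) for ex in eval_slice]

    report = {}
    for name, render in variants.items():
        sizes = []
        scores = {metric: [] for metric in metrics}
        for ex, context in retrieved:
            bundle = render(context)
            sql = generate_sql(ex["question"], bundle)
            sizes.append(len(bundle.split()))  # crude proxy for token count
            for metric, score in metrics.items():
                scores[metric].append(score(sql, ex))
        report[name] = {"mean_size": mean(sizes)}
        report[name].update({m: mean(v) for m, v in scores.items()})
    return report
```

Caching the retrieval results up front is what keeps retrieval fixed: any difference between rows of the report can then only come from the bundle shape or from nondeterminism in the generator.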

Success criteria

bundle shape decisions are backed by eval evidence
the default bundle remains compact without dropping critical signal
retrieval work stays separated cleanly from bundle rendering work

Implementation Progress

Not started.

QA Exploration

[x] QA exploration completed (or N/A for non-UI tasks)

N/A - retrieval/eval task.

Review Feedback

No review feedback yet.

[ ] Review cleared