Dataface Tasks

Experiment: Schema tool strategy profiled vs filtered vs INFORMATION_SCHEMA vs none

ID: MCP_ANALYST_AGENT-EXPERIMENT_SCHEMA_TOOL_STRATEGY_PROFILED_VS_FILTERED_VS_INFORMATION_SCHEMA_VS_NONE
Status: not_started
Priority: p2
Milestone: m2-internal-adoption-design-partners
Owner: data-ai-engineer-architect
Initiative: ai-quality-experimentation-and-context-optimization

Hypothesis

The profiled catalog strategy should be strongest overall, with filtered fields close behind; INFORMATION_SCHEMA and no tool should trail unless schema freshness matters more than schema richness. This experiment should tell us whether the extra profiling cost is justified by the quality gain.

Method

Use the winning model and winning context level from the earlier waves, then vary only the schema acquisition path: profiled catalog, filtered catalog, live INFORMATION_SCHEMA, or no tool. Hold the prompt and benchmark subset fixed so the result isolates how schema access strategy affects SQL generation.
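The run matrix can be sketched as below. This is a minimal illustration assuming a config-dict harness; every value (model name, context level, seed, temperature) is a placeholder, not a real experiment setting.

```python
# Build the four run configs: everything fixed except the schema path.
# All values here are illustrative placeholders, not real experiment settings.
BASE_CONFIG = {
    "model": "<winning-model>",          # winner of the earlier model wave
    "context_level": "<winning-level>",  # winner of the earlier context wave
    "prompt": "<fixed-prompt>",
    "benchmark": "canary-subset",
    "seed": 42,
    "temperature": 0.0,
}

SCHEMA_PATHS = ["profiled", "filtered", "information_schema", "none"]

def build_runs(base, schema_paths):
    """Return one run config per schema acquisition path, identical otherwise."""
    return [{**base, "schema_path": path} for path in schema_paths]
```

Keeping the base config in one dict makes it hard to accidentally vary anything other than the independent variable.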

Variables

  • Independent: schema acquisition path (profiled, filtered, information_schema, none)
  • Dependent: pass rate, grounding failures, tool-call rate, latency, token cost
  • Controlled: model, prompt, canary subset, context level, scorer, seed, temperature
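The controlled variables are load-bearing: if any of them drift between runs, the metric deltas stop being attributable to the schema path. A minimal guard, assuming run configs are plain dicts with illustrative key names:

```python
# Guard: controlled variables must be identical across all conditions so any
# metric delta is attributable to the schema path alone. Keys are illustrative.
CONTROLLED = ("model", "prompt", "context_level", "seed", "temperature")

def check_controls(runs, controlled=CONTROLLED):
    """True iff every controlled key matches the first run's value."""
    first = runs[0]
    return all(r.get(key) == first.get(key) for r in runs for key in controlled)
```

Running this check before launching the four conditions is cheap insurance against a silently inconsistent config.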

Execution Log

Run 1: profiled catalog

  • Command: <exact command>
  • Config: winning model, chosen context level, profiled catalog path
  • Output: <path to results JSONL>
  • Started: <timestamp>
  • Duration: <time>
  • Notes: <anything unexpected>

Run 2: filtered catalog

  • Command: <exact command>
  • Config: winning model, chosen context level, filtered catalog path
  • Output: <path to results JSONL>
  • Started: <timestamp>
  • Duration: <time>
  • Notes: <anything unexpected>

Run 3: INFORMATION_SCHEMA

  • Command: <exact command>
  • Config: winning model, chosen context level, live INFORMATION_SCHEMA
  • Output: <path to results JSONL>
  • Started: <timestamp>
  • Duration: <time>
  • Notes: <anything unexpected>

Run 4: no tool

  • Command: <exact command>
  • Config: winning model, chosen context level, no schema tool
  • Output: <path to results JSONL>
  • Started: <timestamp>
  • Duration: <time>
  • Notes: <anything unexpected>

Results

| Condition          | Pass Rate | Avg Score | Parse | Grounding | Intent | Latency | Tokens |
|--------------------|-----------|-----------|-------|-----------|--------|---------|--------|
| profiled           |           |           |       |           |        |         |        |
| filtered           |           |           |       |           |        |         |        |
| information_schema |           |           |       |           |        |         |        |
| none               |           |           |       |           |        |         |        |
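Each row of the table can be filled from that condition's results JSONL. A sketch, assuming one JSON object per line with hypothetical field names (`passed`, `score`, `latency_ms`, `tokens`) that should be mapped to whatever the eval harness actually emits:

```python
import json
from statistics import mean

def summarize(jsonl_path):
    """Aggregate one condition's results JSONL into one results-table row.
    Field names (passed, score, latency_ms, tokens) are assumptions; map
    them to the actual harness output before use."""
    with open(jsonl_path) as f:
        rows = [json.loads(line) for line in f if line.strip()]
    return {
        "pass_rate": sum(r["passed"] for r in rows) / len(rows),
        "avg_score": mean(r["score"] for r in rows),
        "avg_latency_ms": mean(r["latency_ms"] for r in rows),
        "avg_tokens": mean(r["tokens"] for r in rows),
    }
```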

Breakdowns

Analysis

What do the results tell you? Was the hypothesis confirmed?
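One way to make the profiling-cost question concrete is a simple decision rule. This is an illustrative sketch, not the team's agreed criterion: the threshold and field names are placeholders to be replaced with real values.

```python
def profiling_justified(profiled, filtered, min_pass_gain=0.02):
    """Illustrative decision rule: the extra profiling cost is justified
    only if the profiled catalog beats the filtered catalog's pass rate
    by at least min_pass_gain (2 points here; pick a real threshold)."""
    return profiled["pass_rate"] - filtered["pass_rate"] >= min_pass_gain
```

A rule like this keeps the Analysis-to-Conclusion step mechanical instead of a judgment call made after seeing the numbers.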

Conclusion

What's the decision? What changes, if any, should go to production?

Follow-up experiments