IDE inspector: use cached inspect.json before querying database
Problem
The IDE inspector panel, server routes, and MCP tools each independently query the database when a user opens a table profile, even when a valid inspect.json cache already exists. This creates two issues: unnecessary warehouse costs (BigQuery inspect queries run $145+ per table scan), and inconsistent behavior where different surfaces may show stale or divergent profile data depending on their own caching logic. Worse, some surfaces auto-profile on cache miss without user consent, turning a passive "view profile" action into an expensive query the analyst didn't ask for. All profiler surfaces need to treat inspect.json as the single source of truth — reading from cache on hit, and prompting the user on miss instead of silently querying.
Context
The following entry points serve inspect data to the IDE and AI agents:
- Server /inspect/model/ route (dataface/core/serve/server.py): renders built-in inspect templates via render_inspect_dashboard. Already cache-aware: the renderer reads from target/inspect.json and shows a "not profiled" prompt on cache miss.
- MCP _profile_table (dataface/ai/mcp/tools.py): returns cached profile on hit, not_profiled response on miss. Already cache-first.
- MCP _list_schema (dataface/ai/mcp/tools.py): lists all tables. Was calling get_table_schema() + get_table_enrichment() per uncached table, i.e. two DB queries per table.
- get_schema_context (dataface/ai/schema_context.py): builds schema text for AI consumption. Was calling get_table_schema() per uncached table and required a live DB connection.
Key files: dataface/ai/mcp/tools.py, dataface/ai/schema_context.py, dataface/core/inspect/storage.py, dataface/core/serve/server.py.
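The cache-first behaviour described above for _profile_table can be sketched roughly as follows. This is an illustration only, not the code in dataface/ai/mcp/tools.py: the helper names load_inspect_cache and profile_table_cache_first are hypothetical, and the inspect.json layout (a top-level "tables" mapping keyed by table name) is an assumption made so the example runs.

```python
import json
from pathlib import Path
from typing import Any


def load_inspect_cache(cache_path: Path) -> dict[str, Any]:
    """Return cached profiles keyed by table name, or {} when no usable cache exists."""
    try:
        data = json.loads(cache_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Missing, unreadable, or corrupt cache is treated as a cache miss.
        return {}
    # ASSUMED layout: {"tables": {"<table name>": {...profile...}}}
    tables = data.get("tables") if isinstance(data, dict) else None
    return tables if isinstance(tables, dict) else {}


def profile_table_cache_first(table: str, cache_path: Path) -> dict[str, Any]:
    """Serve the cached profile on hit; report not_profiled on miss. Never queries the DB."""
    cached = load_inspect_cache(cache_path).get(table)
    if cached is None:
        return {"table": table, "not_profiled": True}
    return {"table": table, "not_profiled": False, "profile": cached}
```

On a miss the caller gets a not_profiled marker it can turn into a prompt; nothing in this path has access to a database connection, so it cannot trigger a scan by accident.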
Possible Solutions
A. Patch individual surfaces — fix each function that queries DB on cache miss. Simple, targeted, but risks drift if new surfaces are added.
B. Extract shared cache-first helper — centralize the cache-then-prompt pattern. More upfront work but prevents future surfaces from accidentally auto-profiling.
Recommended: A — the violations are in two specific functions (_list_schema and get_schema_context). The server and MCP _profile_table already work correctly. A shared helper adds abstraction without current consumers to justify it.
Plan
Files to modify:
- dataface/ai/mcp/tools.py — _list_schema: stop calling get_table_schema/get_table_enrichment for uncached tables
- dataface/ai/schema_context.py — get_schema_context: stop calling get_table_schema for uncached tables; add cache-only fallback when DB is unreachable
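A rough sketch of the intended _list_schema behaviour, assuming the table names and cached profiles are already in hand. The function name list_schema_cache_only, the "columns" key, and the shape of each entry are hypothetical; only the not_profiled flag and the empty column list for uncached tables come from this task.

```python
from typing import Any


def list_schema_cache_only(
    table_names: list[str],
    cached_profiles: dict[str, dict[str, Any]],
) -> list[dict[str, Any]]:
    """List every table; fill columns from cache only, never from the database."""
    entries: list[dict[str, Any]] = []
    for name in table_names:
        profile = cached_profiles.get(name)
        if profile is None:
            # Uncached table: report it, but do not fetch schema or enrichment.
            entries.append({"name": name, "not_profiled": True, "columns": []})
        else:
            # ASSUMED: a cached profile carries its column list under "columns".
            entries.append(
                {
                    "name": name,
                    "not_profiled": False,
                    "columns": list(profile.get("columns", [])),
                }
            )
    return entries
```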
Tests (TDD):
- tests/core/test_inspect_cache_first_schema.py — new file covering both functions
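One way such a test could guard against regressions is to hand the code a database accessor that fails loudly if it is ever called. The sketch below is not taken from tests/core/test_inspect_cache_first_schema.py; list_tables_cache_only is a hypothetical stand-in for the function under test, and a real test would more likely patch get_table_schema in place (for example with unittest.mock.patch) than pass it as an argument.

```python
from unittest.mock import Mock


def list_tables_cache_only(table_names, cached_profiles, get_table_schema):
    """Hypothetical stand-in for the function under test; it never calls get_table_schema."""
    return [
        {"name": name, "not_profiled": name not in cached_profiles}
        for name in table_names
    ]


def test_uncached_tables_do_not_query_database():
    # Any call to this mock raises, so an accidental DB query fails the test at once.
    get_table_schema = Mock(side_effect=AssertionError("unexpected DB query"))

    result = list_tables_cache_only(
        ["orders", "users"],
        {"orders": {"columns": ["id"]}},
        get_table_schema,
    )

    get_table_schema.assert_not_called()
    assert result == [
        {"name": "orders", "not_profiled": False},
        {"name": "users", "not_profiled": True},
    ]
```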
Implementation Progress
Design principles
- inspect.json is the single source of truth. Every profile run writes to it. Every dashboard reads from it. The flow is always: profile → save to inspect.json → render from inspect.json.
- Profiling is always opt-in. No surface should auto-profile. Inspect queries are expensive ($145+ per table scan on BigQuery). On cache miss, prompt the user; don't silently run a query.
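The two principles can be illustrated together with a small sketch of the profile, save, render flow. Everything named here is hypothetical (profile_and_save, render_from_cache, ProfilingNotConfirmed, the confirmed flag, and the "tables" layout of inspect.json); the point is only that the warehouse is touched in exactly one place, and only after explicit consent.

```python
import json
from pathlib import Path
from typing import Any, Callable


class ProfilingNotConfirmed(Exception):
    """Raised when a profile run is requested without explicit user consent."""


def profile_and_save(
    table: str,
    cache_path: Path,
    run_profile: Callable[[str], dict[str, Any]],
    *,
    confirmed: bool,
) -> None:
    """Run the (expensive) profile only when confirmed, then persist it to the cache."""
    if not confirmed:
        raise ProfilingNotConfirmed(f"{table}: profiling is opt-in")
    profile = run_profile(table)  # the only step that touches the warehouse
    data: dict[str, Any] = {"tables": {}}
    if cache_path.exists():
        data = json.loads(cache_path.read_text(encoding="utf-8"))
    data.setdefault("tables", {})[table] = profile
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def render_from_cache(table: str, cache_path: Path) -> str:
    """Render strictly from the cache file; a miss yields a prompt, not a query."""
    if not cache_path.exists():
        return f"{table}: not profiled"
    tables = json.loads(cache_path.read_text(encoding="utf-8")).get("tables", {})
    if table not in tables:
        return f"{table}: not profiled"
    return f"{table}: {json.dumps(tables[table], sort_keys=True)}"
```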
Completed
- [x] _list_schema no longer calls get_table_schema() or get_table_enrichment() for uncached tables. Uncached tables appear with not_profiled: true and empty columns.
- [x] get_schema_context no longer calls get_table_schema() for uncached tables. Lists them with name only.
- [x] get_schema_context falls back to cache-only mode when DB is unreachable or returns no tables, serving cached profiles without a live connection.
- [x] Tests: tests/core/test_inspect_cache_first_schema.py (4 tests covering both functions).
- [x] Server /inspect/model/ route already cache-aware (pre-existing).
- [x] MCP _profile_table already cache-first (pre-existing).
- [x] All 103 related tests pass.
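The cache-only fallback in get_schema_context could look something like the sketch below. The name build_schema_context, the list_tables callable, the output text format, and the use of ConnectionError to signal an unreachable database are all assumptions for illustration; the real function and the exceptions raised by the actual driver may differ.

```python
from typing import Any, Callable, Optional


def build_schema_context(
    cached_profiles: dict[str, dict[str, Any]],
    list_tables: Optional[Callable[[], list[str]]] = None,
) -> str:
    """Build schema text from cache; use the DB only to enumerate table names."""
    live_tables: list[str] = []
    if list_tables is not None:
        try:
            live_tables = list(list_tables())
        except ConnectionError:
            # ASSUMED failure mode: treat an unreachable DB like an empty listing.
            live_tables = []
    # Fall back to cache-only mode when the DB is unreachable or returns no tables.
    names = live_tables or sorted(cached_profiles)
    lines: list[str] = []
    for name in names:
        profile = cached_profiles.get(name)
        if profile is None:
            # Uncached table: name only, no schema query.
            lines.append(f"{name} (not profiled)")
        else:
            lines.append(f"{name}: {', '.join(profile.get('columns', []))}")
    return "\n".join(lines)
```

With a reachable database the listing includes uncached tables by name; without one, the output is limited to whatever inspect.json already holds.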
Reference
- GitHub issue: #489
Review Feedback
- [ ] Review cleared