Dataface Tasks

IDE inspector: use cached inspect.json before querying database

ID: INSPECT_PROFILER-IDE_INSPECTOR_USE_CACHED_INSPECT_JSON_BEFORE_QUERYING_DATABASE
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect
Completed by: dave
Completed: 2026-03-24

Problem

The IDE inspector panel, server routes, and MCP tools each independently query the database when a user opens a table profile, even when a valid inspect.json cache already exists. This creates two issues: unnecessary warehouse costs (BigQuery inspect queries cost $145+ per table scan), and inconsistent behavior, where different surfaces may show stale or divergent profile data depending on their own caching logic. Worse, some surfaces auto-profile on cache miss without user consent, turning a passive "view profile" action into an expensive query the analyst didn't ask for. All profiler surfaces need to treat inspect.json as the single source of truth: read from the cache on a hit, and prompt the user on a miss instead of silently querying.

Context

Four surfaces serve inspect data to the IDE and AI agents:

  1. Server /inspect/model/ route (dataface/core/serve/server.py): renders built-in inspect templates via render_inspect_dashboard. Already cache-aware — the renderer reads from target/inspect.json and shows a "not profiled" prompt on cache miss.
  2. MCP _profile_table (dataface/ai/mcp/tools.py): returns cached profile on hit, not_profiled response on miss. Already cache-first.
  3. MCP _list_schema (dataface/ai/mcp/tools.py): lists all tables. Was calling get_table_schema() + get_table_enrichment() per uncached table — two DB queries per table.
  4. get_schema_context (dataface/ai/schema_context.py): builds schema text for AI consumption. Was calling get_table_schema() per uncached table and required a live DB connection.

Key files: dataface/ai/mcp/tools.py, dataface/ai/schema_context.py, dataface/core/inspect/storage.py, dataface/core/serve/server.py.
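The cache-first behavior described for _profile_table can be sketched roughly as follows. This is a minimal illustration, not the code in dataface/ai/mcp/tools.py: the function name read_cached_profile, the response keys other than not_profiled, and the assumed inspect.json layout (a top-level "tables" mapping keyed by table name) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any


def read_cached_profile(cache_path: Path, table: str) -> dict[str, Any]:
    """Return the cached profile for `table`, or a not-profiled marker.

    Hypothetical helper: assumes inspect.json looks like
    {"tables": {"<table name>": {...profile...}}}.
    """
    try:
        cache = json.loads(cache_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # A missing or unreadable cache is treated as a miss, never as a
        # reason to query the warehouse.
        cache = {}

    profile = cache.get("tables", {}).get(table)
    if profile is None:
        # Cache miss: report it so the caller can prompt the user.
        return {"table": table, "not_profiled": True}
    return {"table": table, "not_profiled": False, "profile": profile}
```

The point of the sketch is that the miss path returns a marker rather than falling through to a database call, so the decision to profile stays with the user.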

Possible Solutions

A. Patch individual surfaces — fix each function that queries DB on cache miss. Simple, targeted, but risks drift if new surfaces are added.

B. Extract shared cache-first helper — centralize the cache-then-prompt pattern. More upfront work but prevents future surfaces from accidentally auto-profiling.

Recommended: A — the violations are in two specific functions (_list_schema and get_schema_context). The server and MCP _profile_table already work correctly. A shared helper adds abstraction without current consumers to justify it.

Plan

Files to modify:

  • dataface/ai/mcp/tools.py, _list_schema: stop calling get_table_schema/get_table_enrichment for uncached tables
  • dataface/ai/schema_context.py, get_schema_context: stop calling get_table_schema for uncached tables; add cache-only fallback when DB is unreachable

Tests (TDD):

  • tests/core/test_inspect_cache_first_schema.py: new file covering both functions
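One way such a test could assert that uncached tables never trigger a database call is to pass in a mock that fails if invoked. The sketch below is an assumption about the testing approach, not the contents of tests/core/test_inspect_cache_first_schema.py; list_schema here is a simplified stand-in for _list_schema, and its signature and the cache shape are hypothetical.

```python
from unittest.mock import Mock


def list_schema(table_names, cached_profiles, fetch_schema):
    """Hypothetical stand-in for _list_schema after the fix.

    `fetch_schema` represents the live DB lookup. It is accepted so the
    test can prove it is never called, but it is deliberately unused.
    """
    entries = []
    for name in table_names:
        profile = cached_profiles.get(name)
        if profile is None:
            # Uncached: flag it and leave columns empty instead of querying.
            entries.append({"name": name, "not_profiled": True, "columns": []})
        else:
            entries.append(
                {"name": name, "not_profiled": False, "columns": profile["columns"]}
            )
    return entries


def test_uncached_tables_do_not_hit_database():
    # The mock raises if it is ever called, and we also check the call count.
    fetch_schema = Mock(side_effect=AssertionError("database must not be queried"))
    cached = {"orders": {"columns": ["id", "total"]}}

    result = list_schema(["orders", "users"], cached, fetch_schema)

    fetch_schema.assert_not_called()
    assert result[0]["columns"] == ["id", "total"]
    assert result[1] == {"name": "users", "not_profiled": True, "columns": []}
```

Injecting the DB lookup as a parameter is only a convenience for this sketch; patching the real get_table_schema and get_table_enrichment would serve the same purpose.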

Implementation Progress

Design principles

  1. inspect.json is the single source of truth. Every profile run writes to it. Every dashboard reads from it. The flow is always: profile → save to inspect.json → render from inspect.json.
  2. Profiling is always opt-in. No surface should auto-profile. Inspect queries are expensive ($145+ per table scan on BigQuery). On cache miss, prompt the user — don't silently run a query.
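The two principles above can be illustrated with a small sketch of the profile, save, render flow. Everything named here is hypothetical: run_profile, render_profile, the user_confirmed flag, the row_count field, and the "tables" layout of inspect.json are assumptions made for the example, not the Dataface API.

```python
import json
from pathlib import Path
from typing import Any, Callable


def _load(cache_path: Path) -> dict[str, Any]:
    """Read inspect.json, treating a missing file as an empty cache."""
    return json.loads(cache_path.read_text()) if cache_path.exists() else {}


def run_profile(
    table: str,
    cache_path: Path,
    profiler: Callable[[str], dict[str, Any]],
    user_confirmed: bool,
) -> None:
    """Profile `table` only with explicit consent, then persist the result."""
    if not user_confirmed:
        # Principle 2: profiling is opt-in, so refuse to run without consent.
        raise PermissionError("profiling requires explicit user confirmation")
    cache = _load(cache_path)
    cache.setdefault("tables", {})[table] = profiler(table)
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(cache, indent=2))


def render_profile(table: str, cache_path: Path) -> str:
    """Principle 1: rendering reads only from inspect.json."""
    profile = _load(cache_path).get("tables", {}).get(table)
    if profile is None:
        return f"{table}: not profiled"
    return f"{table}: {profile['row_count']} rows"
```

Note that render_profile never receives the profiler at all, so a view action cannot start a warehouse query even by accident.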

Completed

  • [x] _list_schema no longer calls get_table_schema() or get_table_enrichment() for uncached tables. Uncached tables appear with not_profiled: true and empty columns.
  • [x] get_schema_context no longer calls get_table_schema() for uncached tables. Lists them with name only.
  • [x] get_schema_context falls back to cache-only mode when DB is unreachable or returns no tables, serving cached profiles without a live connection.
  • [x] Tests: tests/core/test_inspect_cache_first_schema.py — 4 tests covering both functions.
  • [x] Server /inspect/model/ route already cache-aware (pre-existing).
  • [x] MCP _profile_table already cache-first (pre-existing).
  • [x] All 103 related tests pass.
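The cache-only fallback in get_schema_context might look roughly like the sketch below. It is an illustration under stated assumptions: build_schema_context, the list_tables callable, the use of ConnectionError to signal an unreachable database, and the output format are all hypothetical and may differ from dataface/ai/schema_context.py.

```python
from typing import Any, Callable, Iterable


def build_schema_context(
    list_tables: Callable[[], Iterable[str]],
    cached_profiles: dict[str, dict[str, Any]],
) -> str:
    """Build schema text for AI consumption without per-table DB queries."""
    try:
        live_tables = list(list_tables())
    except ConnectionError:
        # Assumption: an unreachable DB surfaces as ConnectionError.
        live_tables = []

    # Fall back to the cached table names when the DB is unreachable
    # or returns no tables.
    names = live_tables or sorted(cached_profiles)

    lines = []
    for name in names:
        profile = cached_profiles.get(name)
        if profile is None:
            # Uncached tables are listed by name only.
            lines.append(f"- {name} (not profiled)")
        else:
            lines.append(f"- {name}: {', '.join(profile['columns'])}")
    return "\n".join(lines)
```

With a reachable database the only live call is the table listing; column detail always comes from the cache.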

Reference

  • GitHub issue: #489

Review Feedback

  • [ ] Review cleared