Dataface Tasks

Wire up batch query prefetch before render

ID: DFT_CORE-WIRE_UP_BATCH_QUERY_PREFETCH_BEFORE_RENDER
Status: not_started
Priority: p2
Milestone: m3-public-launch
Owner: sr-engineer-architect

Problem

Currently, query execution is purely lazy: the renderer walks the layout chart-by-chart, and each executor.execute_chart() call makes a separate database round-trip. For a 10-chart dashboard hitting the same Postgres instance, that's 10 individual connections/queries. If multiple charts derive from the same base query via {% raw %}{{ queries.X }}{% endraw %}, the base query runs multiple times. For cloud databases (Snowflake, BigQuery), each round-trip adds 100-500ms of network latency.
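To make the cost concrete, here is a toy model of the lazy path; `run_query` and the chart list are illustrative stand-ins, not the real executor API:

```python
# Illustrative sketch of the current lazy render path (names are stand-ins).
round_trips = 0

def run_query(sql: str) -> list[dict]:
    """Pretend database call; each invocation is one network round-trip."""
    global round_trips
    round_trips += 1
    return [{"value": 1}]

# Ten charts, four of which share the same base query via {{ queries.base }}.
charts = ["SELECT * FROM base"] * 4 + [f"SELECT {i} FROM t" for i in range(6)]

for sql in charts:   # renderer walks the layout chart-by-chart
    run_query(sql)   # separate round-trip per chart, no reuse

print(round_trips)   # 10 round-trips, and the shared base query ran 4 times
```

At 100-500ms per round-trip on a cloud warehouse, that is 1-5 seconds of pure network latency for a single dashboard.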

Context

Key files:

  • dataface/core/execute/batch.py — Dependency graph, topological sort, temp table SQL generation, profile grouping. All built but not wired into the render path.
  • dataface/core/execute/executor.py — The execute_face_batch() method already exists and populates self._cache per query. The cache is a simple dict[str, list[dict]] — synchronous key lookup, no futures.
  • dataface/core/render/renderer.py — The render() function. The prefetch call would go here, before render_face_svg().
  • ai_notes/archive/features/QUERY_DEPENDENCY_AND_BATCHING_STRATEGIES.md — Design doc covering temp tables, profile grouping, parallel execution, DuckDB caching.

Existing infrastructure ready to use:

  • collect_face_queries(face) — collects all chart queries plus transitive deps
  • group_by_profile(queries) — groups queries by database source
  • build_dependency_graph(queries) — builds the dependency graph for temp table optimization
  • generate_batch_sql(queries, graph, dialect) — generates CREATE TEMP TABLE statements plus rewritten queries
  • create_batch_execution_plan(face) — orchestrates all of the above
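As a rough illustration of how the dependency graph and temp-table rewrite fit together, here is a toy pipeline. The real batch.py signatures and query schema may differ; the stdlib `graphlib` stands in for its topological sort:

```python
# Toy stand-ins for the batch.py helpers (real signatures may differ).
from graphlib import TopologicalSorter

# queries: name -> (sql, set of query names it depends on)
queries = {
    "base":   ("SELECT * FROM events", set()),
    "daily":  ("SELECT day, count(*) FROM {{ queries.base }} GROUP BY day", {"base"}),
    "weekly": ("SELECT week, count(*) FROM {{ queries.base }} GROUP BY week", {"base"}),
}

def build_dependency_graph(qs):
    return {name: deps for name, (_, deps) in qs.items()}

def generate_batch_sql(qs, graph):
    """Emit CREATE TEMP TABLE statements in dependency order, with
    {{ queries.X }} references rewritten to temp table names."""
    stmts = []
    for name in TopologicalSorter(graph).static_order():
        sql, _ = qs[name]
        for dep in graph[name]:
            sql = sql.replace("{{ queries.%s }}" % dep, dep)
        stmts.append(f"CREATE TEMP TABLE {name} AS {sql}")
    return stmts

stmts = generate_batch_sql(queries, build_dependency_graph(queries))
for s in stmts:
    print(s)
```

The key property: the shared base query runs once as a temp table, and the dependents read from it in the same session — one round-trip for the whole batch.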

Dependencies: none. The DFT_CORE-ASYNC_PREFETCH task builds on this one.

Possible Solutions

Option A: Prefetch in render()

Add an executor.execute_face_batch(variables) call as the first step in render(). This runs all queries synchronously and populates the executor cache; render then proceeds as normal, with every execute_chart() call hitting the cache.

  • Simplest change — one line to wire it up
  • No architectural changes needed
  • Lazy fallback still works for edge cases (tab switches, variable changes at render time)
  • Sufficient for CLI, static export, and initial page loads
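A minimal sketch of this option, with stand-in Executor and render implementations (the real classes differ; only the wiring shape matters here):

```python
# Minimal sketch of Option A (stand-in classes; the real Executor differs).
class Executor:
    def __init__(self):
        self._cache: dict[str, list[dict]] = {}
        self.db_calls = 0

    def _run(self, sql: str) -> list[dict]:
        self.db_calls += 1
        return [{"sql": sql}]

    def execute_face_batch(self, face, variables=None):
        """Run each distinct query in the face once and warm the cache."""
        for sql in set(face["charts"]):
            self._cache[sql] = self._run(sql)

    def execute_chart(self, sql: str) -> list[dict]:
        """Lazy path: hit the cache first, fall back to a live query."""
        if sql not in self._cache:
            self._cache[sql] = self._run(sql)
        return self._cache[sql]

def render(face, executor, variables=None):
    executor.execute_face_batch(face, variables)   # the one-line wiring
    return [executor.execute_chart(sql) for sql in face["charts"]]

face = {"charts": ["SELECT * FROM base"] * 4 + ["SELECT 1"]}
ex = Executor()
render(face, ex)
print(ex.db_calls)   # 2 distinct queries executed, not 5
```

Because execute_chart() still falls back to a live query on a cache miss, tab switches and render-time variable changes keep working unchanged.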

Option B: Prefetch in serve layer instead of render

Call execute_face_batch() in the FastAPI route handler before calling render(). This keeps render pure, but duplicates the call site across serve, CLI, and playground.

  • More explicit but more call sites to maintain
  • Doesn't help CLI or playground paths without additional wiring
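A sketch of why this duplicates wiring; the three entry points below are simplified stand-ins for the real serve, CLI, and playground paths:

```python
# Sketch of Option B's drawback: every entry point must remember to prefetch.
prefetch_calls = []

def execute_face_batch(face):
    prefetch_calls.append(face)

def render(face):
    return f"<svg>{face}</svg>"    # render stays pure: no prefetch inside

def serve_route(face):             # FastAPI handler (simplified)
    execute_face_batch(face)
    return render(face)

def cli_export(face):              # CLI path needs the same wiring...
    execute_face_batch(face)
    return render(face)

def playground_preview(face):      # ...and so does the playground
    execute_face_batch(face)
    return render(face)

serve_route("dash"); cli_export("dash"); playground_preview("dash")
print(len(prefetch_calls))   # 3 call sites to keep in sync
```

Any new entry point that forgets the prefetch silently falls back to N round-trips, which is why Option A's single call site inside render() is attractive.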

Plan

  1. Add executor.execute_face_batch(merged_variables) call in render() after variable resolution but before layout calculation
  2. Ensure execute_face_batch handles the case where some queries are non-SQL (CSV, HTTP) gracefully — those should still lazy-execute
  3. Add tests: verify cache is warm after prefetch, verify render produces same output with and without prefetch
  4. Measure latency improvement on a multi-chart dashboard with shared base queries
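Step 2 might look roughly like this; the query schema and source types below are illustrative assumptions, not the real model:

```python
# Sketch of plan step 2: prefetch only SQL queries; CSV/HTTP sources stay lazy.
# The query dicts and "source" values are illustrative, not the real schema.
queries = [
    {"name": "orders", "source": "sql",  "sql": "SELECT * FROM orders"},
    {"name": "fx",     "source": "http", "url": "https://example.com/fx"},
    {"name": "geo",    "source": "csv",  "path": "geo.csv"},
]

cache: dict[str, list[dict]] = {}

def execute_face_batch(qs):
    """Warm the cache for SQL queries; return the names left for lazy execution."""
    skipped = []
    for q in qs:
        if q["source"] != "sql":        # non-SQL sources lazy-execute later
            skipped.append(q["name"])
            continue
        cache[q["name"]] = [{"row": 1}]  # pretend batched SQL result
    return skipped

skipped = execute_face_batch(queries)
print(sorted(skipped))      # ['fx', 'geo'] — left for the lazy path
print("orders" in cache)    # True — cache is warm after prefetch
```

The tests in step 3 reduce to exactly these two assertions plus an output-equality check between a prefetched and a purely lazy render.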

Implementation Progress

Review Feedback

  • [ ] Review cleared