Dataface Tasks

Async prefetch with future-based cache for progressive rendering

ID: DFT_CORE-ASYNC_PREFETCH_WITH_FUTURE_BASED_CACHE_FOR_PROGRESSIVE_RENDERING
Status: not_started
Priority: p3
Milestone: m4-v1-0-launch
Owner: sr-engineer-architect

Problem

With synchronous prefetch (from the batch prefetch task), render blocks until ALL queries complete before rendering ANY chart. For a dashboard with 10 queries across 3 sources, the user waits for the slowest source before seeing anything. In the interactive serve path, we could start rendering charts as their data arrives — a chart whose query finished in 200ms shouldn't wait for another chart's 3-second Snowflake query.

Context

Depends on: DFT_CORE-WIRE_UP_BATCH_QUERY_PREFETCH_BEFORE_RENDER (synchronous prefetch must land first)

Key files:

  • dataface/core/execute/executor.py — self._cache: dict[str, list[dict]] is a simple dict. Needs to accept Future objects.
  • dataface/core/serve/server.py — FastAPI serve layer, already async.
  • dataface/core/render/renderer.py — render() and chart rendering call executor.execute_chart() synchronously.

Current cache behavior: execute_query() checks if cache_key in self._cache — a synchronous key lookup. Either the data is there or it isn't; there is no concept of an in-flight query. If prefetch ran asynchronously and render started before it finished, execute_chart() would cache-miss and fire a duplicate query against the source.
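To make the race concrete, here is a minimal sketch of the current synchronous behavior. The class shape, _run_query helper, and method signatures are assumptions for illustration; only the _cache dict and the execute_query() key lookup come from the task description.

```python
class Executor:
    """Sketch of the current executor: a plain dict cache, no in-flight state."""

    def __init__(self):
        self._cache: dict[str, list[dict]] = {}

    def execute_query(self, cache_key: str, sql: str) -> list[dict]:
        # Synchronous lookup: either the rows are cached or they aren't.
        if cache_key in self._cache:
            return self._cache[cache_key]
        # Cache miss. If an async prefetch were still running this same
        # query, this call would not know — it would fire a duplicate.
        rows = self._run_query(sql)
        self._cache[cache_key] = rows
        return rows

    def _run_query(self, sql: str) -> list[dict]:
        # Stand-in for a real source query.
        return [{"sql": sql}]
```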

Possible Solutions

Option A: ThreadPoolExecutor with Future-based cache

Replace self._cache: dict[str, list[dict]] with dict[str, list[dict] | Future[list[dict]]]. Prefetch submits queries to a ThreadPoolExecutor and stores futures in the cache. execute_query() checks whether the cache value is a Future and calls .result() to block until that specific query completes.

  • Uses stdlib concurrent.futures — no new dependencies
  • Minimal change to executor interface — callers don't need to know about async
  • ThreadPoolExecutor naturally handles parallel execution across source profiles
  • Synchronous mode (CLI) just calls .result() immediately or skips futures entirely
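A sketch of Option A, assuming a hypothetical prefetch() entry point and _run_query helper (the real batch-prefetch API lands in the dependency task):

```python
from concurrent.futures import Future, ThreadPoolExecutor


class Executor:
    """Sketch: cache values are materialized rows OR in-flight futures."""

    def __init__(self, max_workers: int = 4):
        self._cache: dict[str, list[dict] | Future] = {}
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def prefetch(self, queries: dict[str, str]) -> None:
        # Submit every query up front; store the Future under its cache key.
        for key, sql in queries.items():
            if key not in self._cache:
                self._cache[key] = self._pool.submit(self._run_query, sql)

    def execute_query(self, cache_key: str, sql: str) -> list[dict]:
        value = self._cache.get(cache_key)
        if isinstance(value, Future):
            # Block only on this chart's query; other charts are unaffected.
            rows = value.result()
            self._cache[cache_key] = rows  # swap the future for real data
            return rows
        if value is not None:
            return value
        # No prefetch for this key: fall back to a synchronous query.
        rows = self._run_query(sql)
        self._cache[cache_key] = rows
        return rows

    def _run_query(self, sql: str) -> list[dict]:
        # Stand-in for a real source query.
        return [{"sql": sql}]
```

Because .result() is per-key, a chart whose query finished early renders immediately even while slower queries are still running in the pool.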

Option B: Full asyncio integration

Make execute_query and execute_chart async, use asyncio.Future and await. Render becomes async throughout.

  • More invasive — every call site becomes async
  • Better fit for the serve path (already async) but painful for CLI
  • Would need sync wrappers for non-async callers
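A sketch of what Option B's sync wrapper could look like. AsyncExecutor and execute_query_sync are hypothetical names; asyncio.to_thread is used here as a stand-in for a truly async driver.

```python
import asyncio


class AsyncExecutor:
    """Sketch: fully async executor with a sync wrapper for CLI callers."""

    def __init__(self):
        self._cache: dict[str, list[dict]] = {}

    async def execute_query(self, cache_key: str, sql: str) -> list[dict]:
        if cache_key not in self._cache:
            # Run the blocking driver call off the event loop.
            self._cache[cache_key] = await asyncio.to_thread(self._run_query, sql)
        return self._cache[cache_key]

    def execute_query_sync(self, cache_key: str, sql: str) -> list[dict]:
        # Sync wrapper for the CLI; only valid outside a running event loop.
        return asyncio.run(self.execute_query(cache_key, sql))

    def _run_query(self, sql: str) -> list[dict]:
        # Stand-in for a real source query.
        return [{"sql": sql}]
```

The wrapper is exactly the pain point the bullets above describe: every non-async call site needs one, and it cannot be called from inside an already-running loop.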

Option C: SSE streaming with progressive chart delivery

Don't overlap prefetch with render. Instead, render each chart independently and stream results to the client via Server-Sent Events as each chart completes.

  • Best UX — charts appear progressively
  • Most complex — requires client-side assembly, SSE infrastructure
  • Could combine with Option A for both server-side and client-side progressiveness
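A sketch of the server-side half of Option C: a generator that yields one Server-Sent Event per chart as its query completes. The function name and the run_query callback are hypothetical; in the FastAPI serve layer this generator could be wrapped in a StreamingResponse with media_type="text/event-stream".

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterator


def stream_charts_sse(
    chart_queries: dict[str, str],
    run_query: Callable[[str], list[dict]],
) -> Iterator[str]:
    """Yield one SSE message per chart, in completion order."""
    with ThreadPoolExecutor() as pool:
        futures = {
            pool.submit(run_query, sql): chart_id
            for chart_id, sql in chart_queries.items()
        }
        for fut in as_completed(futures):
            payload = json.dumps({"chart": futures[fut], "rows": fut.result()})
            # SSE wire format: "event:" / "data:" lines, blank-line terminated.
            yield f"event: chart\ndata: {payload}\n\n"
```

The client-side assembly (listening for "chart" events and slotting each payload into the dashboard) is the part the bullets above flag as the main added complexity.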

Plan

  1. Extend executor cache type to dict[str, list[dict] | Future[list[dict]]]
  2. Add execute_face_batch_async() that submits queries to ThreadPoolExecutor, stores futures in cache
  3. Update execute_query() cache lookup: if value is a Future, call .result() to block
  4. Wire into serve path: use async prefetch for web requests, sync prefetch for CLI
  5. Measure: compare time-to-first-chart-rendered with sync vs async prefetch

Implementation Progress

Review Feedback

  • [ ] Review cleared