Dataface Tasks

Schema-derived AI prompts from compiled types

ID: MCP_ANALYST_AGENT-SCHEMA_DERIVED_AI_PROMPTS_FROM_COMPILED_TYPES
Status: not_started
Priority: p2
Milestone: m3-public-launch
Owner: data-ai-engineer-architect

Problem

AI prompts describing the Dataface YAML spec are hand-maintained markdown strings in schema.py. When the schema evolves (new chart types, new layout options, new variable features), the prompt templates must be manually updated — and they drift. generate_face_schema_summary() is a 70-line hand-written markdown string that's already incomplete (missing settings, geo charts, advanced layout options). json-render solved this with catalog.prompt() which auto-generates a complete system prompt from the Zod-typed catalog, so adding a component automatically makes it available to AI.

Context

  • Research origin: ai_notes/research/json-render-deep-dive.md — Priority 1 section.
  • Current hand-maintained prompts: dataface/core/compile/schema.py — get_schema_for_prompt(), generate_face_schema_summary(), generate_variable_schema()
  • Pydantic input types: dataface/core/compile/types.py — Face, Chart, Variable, QueryDefinition, etc.
  • Compiled types: dataface/core/compile/compiled_types.py — CompiledFace, CompiledChart, etc.
  • Chart type enum: ChartType in types.py
  • AI integration: dataface/ai/ — MCP server and tools that consume schema prompts
  • Dependency: declarative-schema-definition-outside-python-code — a declarative schema makes auto-generation trivial; without it, we'd introspect Pydantic models directly (possible but messier)
  • Enables: extensible-schema-with-custom-elements-and-chart-types — when users register custom elements, they automatically appear in AI prompts

Possible Solutions

A. Runtime Pydantic introspection

Walk the Pydantic model tree (Face → Chart → ChartType enum, etc.) at runtime; extract field names, types, defaults, docstrings, and enum values. Generate the markdown prompt from the live model structure.

Pros: No new schema format needed. Works today. Always in sync with code by definition. Cons: Pydantic docstrings aren't optimized for AI context. Can't express "when to use this" hints (like json-render's description field on components). May produce verbose/noisy prompts.
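A minimal sketch of the introspection walk. Stdlib dataclasses stand in for the real Pydantic models here (Face, Chart, ChartType are simplified stand-ins, and with Pydantic v2 you would read Model.model_fields instead of dataclasses.fields()), but the traversal has the same shape: fields, types, defaults, docstrings, and enum values become a markdown section.

```python
import enum
import dataclasses
from typing import get_type_hints

class ChartType(enum.Enum):
    LINE = "line"
    BAR = "bar"

@dataclasses.dataclass
class Chart:
    """A single chart in a face."""
    type: ChartType
    title: str = ""

@dataclasses.dataclass
class Face:
    """Top-level Dataface spec."""
    name: str
    charts: list  # list[Chart] in the real models

def model_to_markdown(model) -> str:
    """Render one model's fields as a markdown section for the AI prompt."""
    lines = [f"## {model.__name__}", (model.__doc__ or "").strip(), ""]
    hints = get_type_hints(model)
    for f in dataclasses.fields(model):
        t = hints.get(f.name, f.type)
        if isinstance(t, type) and issubclass(t, enum.Enum):
            desc = "one of: " + ", ".join(repr(m.value) for m in t)
        else:
            desc = getattr(t, "__name__", str(t))
        default = "" if f.default is dataclasses.MISSING else f" (default: {f.default!r})"
        lines.append(f"- `{f.name}`: {desc}{default}")
    return "\n".join(lines)

def generate_prompt(models) -> str:
    return "\n\n".join(model_to_markdown(m) for m in models)

print(generate_prompt([Face, Chart]))
```

Because the prompt is rendered from the live classes, a new ChartType member or Chart field shows up in the output with no manual edit, which is the core of option A.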

B. Declarative schema with AI annotations

Once the declarative schema exists (see dependency), add AI-specific metadata: component descriptions, usage hints, common mistakes, examples. Generate prompts from this enriched schema.

Pros: Rich, curated AI context. Descriptions tuned for LLM consumption. Same schema powers validation, editor tooling, and AI prompts. Cons: Blocked on declarative schema work. Requires maintaining description quality.
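One possible shape for the AI-annotation layer option B adds on top of the declarative schema. All names here (ComponentDoc, usage_hint, common_mistakes, the example registry entry) are illustrative, not an existing Dataface API; the point is that curated, LLM-oriented metadata lives next to each component and the prompt is rendered from it.

```python
from dataclasses import dataclass, field

@dataclass
class ComponentDoc:
    name: str
    description: str                 # tuned for LLM consumption
    usage_hint: str = ""             # "when to use this" guidance
    common_mistakes: list = field(default_factory=list)
    example: str = ""                # minimal YAML snippet

def render_component(doc: ComponentDoc) -> str:
    parts = [f"### {doc.name}", doc.description]
    if doc.usage_hint:
        parts.append(f"Use when: {doc.usage_hint}")
    if doc.common_mistakes:
        parts.append("Common mistakes: " + "; ".join(doc.common_mistakes))
    if doc.example:
        parts.append("Example:\n" + doc.example)
    return "\n".join(parts)

# Hypothetical registry entry; in practice these would come from the
# declarative schema files.
REGISTRY = [
    ComponentDoc(
        name="line_chart",
        description="Time-series line chart over a query result.",
        usage_hint="continuous metrics over time",
        common_mistakes=["using a categorical x-axis"],
        example="type: line\nquery: daily_active_users",
    ),
]

def schema_prompt() -> str:
    return "\n\n".join(render_component(d) for d in REGISTRY)
```

Registering a new component (including the user-defined ones from the "extensible schema" task) then automatically makes it visible to the AI via schema_prompt().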

C. Hybrid: Pydantic introspection now, migrate to declarative later

Start with (A) to eliminate hand-maintenance immediately. When the declarative schema lands, switch the prompt generator to read from it instead. The public API (get_schema_for_prompt()) stays the same.

Pros: Immediate improvement. Clean migration path. No throwaway work. Cons: Two rounds of implementation.

Plan

  1. Identify which parts of the current AI prompt can be generated directly from compiled types and which parts still need curated explanation.
  2. Implement a derivation path that produces the structural schema guidance in a stable prompt-friendly format.
  3. Integrate the derived prompt content into the existing prompt assembly flow with tests that catch schema/prompt drift.
  4. Review the resulting prompt quality on representative agent tasks and refine the layering between generated and hand-authored guidance.
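The drift test from plan step 3 can be as simple as asserting that every chart type the compiler knows about is mentioned in the generated prompt, so adding an enum member without updating the prompt fails CI. Both ChartType and get_schema_for_prompt() below are stand-ins for the real symbols in types.py and schema.py.

```python
import enum

class ChartType(enum.Enum):
    LINE = "line"
    BAR = "bar"
    GEO = "geo"

def get_schema_for_prompt() -> str:
    # Stand-in for the real prompt generator.
    return "Charts may be one of: line, bar, geo."

def test_prompt_covers_all_chart_types():
    prompt = get_schema_for_prompt()
    missing = [t.value for t in ChartType if t.value not in prompt]
    assert not missing, f"prompt missing chart types: {missing}"
```

With full auto-generation this test becomes a tautology; its value is during the transition, while curated prose and generated content coexist.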

Implementation Progress

Review Feedback

  • [ ] Review cleared