Dataface Tasks

AI_CONTEXT metadata contract v1 for pilot

ID: M1-AICONTEXT-001
Status: done
Priority: p0
Milestone: m1-ft-analytics-analyst-pilot
Owner: data-ai-engineer-architect
Initiative: mcp-catalog-agent-tools

Problem

The AI_CONTEXT metadata format has no versioned contract: field names, types, and semantics are implicitly defined by the code that produces them and can change without notice. Consumers (MCP tools, agents, rendering surfaces) cannot tell which fields are stable, what each field means, or whether a schema change will break their integration. During the pilot, the missing contract makes integrations fragile and means the format cannot evolve without risking breakage across all consumers.

Context

  • AI_CONTEXT v1 schema is documented with required/optional fields and examples.
  • Compatibility and migration rules are defined for future schema changes.
  • Validation tests enforce contract compliance in build and runtime paths.

Possible Solutions

Recommended: Thin contract module (context_contract.py) with a version constant, validation functions, and golden fixtures. This approach:

  • Adds AI_CONTEXT_VERSION = "1.0" as a version header on format_table_context() output
  • Provides validate_ai_context() for output validation and validate_table_profile_input() for input validation
  • Uses golden JSON fixtures for representative table profiles
  • Documents the contract in AI_CONTEXT_CONTRACT.md with versioning policy and migration notes
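
A minimal sketch of what output validation in such a module might look like. Only AI_CONTEXT_VERSION, validate_ai_context(), ContractValidationError, and the ai_context_version field come from this task; the "table" and "columns" fields, the REQUIRED_FIELDS mapping, and the function signature are assumptions made for illustration. The real field list lives in AI_CONTEXT_CONTRACT.md.

```python
from typing import Any

AI_CONTEXT_VERSION = "1.0"  # value stated in this task

# Hypothetical field list for illustration only; the real required and
# optional fields are defined in AI_CONTEXT_CONTRACT.md.
REQUIRED_FIELDS: dict[str, type] = {
    "ai_context_version": str,
    "table": str,      # hypothetical field name
    "columns": list,   # hypothetical field name
}


class ContractValidationError(ValueError):
    """Raised when a payload does not satisfy the contract."""


def validate_ai_context(payload: Any) -> None:
    """Raise ContractValidationError unless payload has the expected shape."""
    if not isinstance(payload, dict):
        raise ContractValidationError("payload must be a dict")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in payload:
            raise ContractValidationError(f"missing required field: {name}")
        if not isinstance(payload[name], expected_type):
            raise ContractValidationError(
                f"field {name!r} must be {expected_type.__name__}, "
                f"got {type(payload[name]).__name__}"
            )
    # Exact-match version check; a looser policy is equally plausible.
    if payload["ai_context_version"] != AI_CONTEXT_VERSION:
        raise ContractValidationError(
            f"unsupported version: {payload['ai_context_version']}"
        )
```

Raising a dedicated exception rather than returning a boolean lets both the build path and the runtime path fail with the same message, and keeps the check dependency-free, in line with the plain-dict approach.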

Alternative considered: JSON Schema / Pydantic models. Rejected because the existing codebase uses plain dicts with dataclass backing and test-based contract locking (the same pattern as PROFILER_CONTRACT_VERSION), so adding a schema library would be inconsistent with that pattern.

Plan

  • Define canonical AI_CONTEXT schema and version field strategy.
  • Implement validation checks in context generation and consumption layers.
  • Add golden fixtures for representative tables/columns/relationships.
  • Publish migration notes for pre-v1 payloads and legacy consumers.
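
One way a consumer could apply a version field strategy is sketched below. The rules shown are assumptions, not the published policy: a payload without ai_context_version is treated as pre-v1, and a payload is considered readable when its major version matches the supported one. classify_payload() and VERSION_FIELD are hypothetical names.

```python
AI_CONTEXT_VERSION = "1.0"  # value stated in this task
VERSION_FIELD = "ai_context_version"


def classify_payload(payload: dict) -> str:
    """Return "pre-v1", "compatible", or "incompatible" for a payload.

    Assumed rules (not taken from AI_CONTEXT_CONTRACT.md):
    - no version field means the payload predates the v1 contract
    - a matching major version means the payload can be read
    """
    version = payload.get(VERSION_FIELD)
    if version is None:
        return "pre-v1"
    payload_major = str(version).split(".")[0]
    supported_major = AI_CONTEXT_VERSION.split(".")[0]
    return "compatible" if payload_major == supported_major else "incompatible"
```

A legacy consumer could branch on the "pre-v1" result to apply whatever upgrade path the migration notes describe, instead of failing outright.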

Implementation Progress

  • [x] Define canonical AI_CONTEXT schema and version field strategy
      • AI_CONTEXT_VERSION = "1.0" in dataface/ai/context_contract.py
      • Version field added to format_table_context() output as ai_context_version
  • [x] Implement validation checks in context generation and consumption layers
      • validate_ai_context(): validates output payload shape and types
      • validate_table_profile_input(): validates profiler input before context generation
      • ContractValidationError raised on invalid payloads
  • [x] Add golden fixtures for representative tables/columns/relationships
      • tests/ai/fixtures/golden_minimal_table.json: minimal required fields
      • tests/ai/fixtures/golden_enriched_table.json: full enrichments (descriptions, grain, relationships)
  • [x] Publish migration notes for pre-v1 payloads and legacy consumers
      • dataface/ai/AI_CONTEXT_CONTRACT.md: full contract reference with versioning policy and migration guide
  • [x] 39 contract tests in tests/ai/test_ai_context_contract.py, all passing
  • [x] Zero regressions: full suite (2259 tests) passes
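
The golden-fixture idea can be sketched roughly as follows. This is not the code in tests/ai/test_ai_context_contract.py; load_golden() and diff_against_golden() are hypothetical helpers, and the comparison is limited to top-level keys for brevity.

```python
import json
from pathlib import Path


def load_golden(path: Path) -> dict:
    """Read a golden JSON fixture from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def diff_against_golden(actual: dict, golden: dict) -> list[str]:
    """List top-level differences between a payload and its golden fixture."""
    problems = []
    for key in sorted(golden.keys() - actual.keys()):
        problems.append(f"missing key: {key}")
    for key in sorted(actual.keys() - golden.keys()):
        problems.append(f"unexpected key: {key}")
    for key in sorted(golden.keys() & actual.keys()):
        if actual[key] != golden[key]:
            problems.append(f"value mismatch: {key}")
    return problems
```

A contract test would generate a payload for a known table profile, load the matching fixture, and assert that the returned list is empty. Reporting differences per key, rather than a bare equality failure, makes an unintended format change easier to spot in test output.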

Review Feedback

  • [ ] Review cleared