Dataface Tasks

Ingest dbt schema.yml descriptions into AI_CONTEXT

ID: M1-AICONTEXT-003
Status: done
Priority: p0
Milestone: m1-ft-analytics-analyst-pilot
Owner: data-ai-engineer-architect
Initiative: description-enrichment

Problem

dbt projects contain rich human-authored descriptions for models and columns in schema.yml files, but the AI_CONTEXT pipeline does not ingest them. Pilot analysts using Dataface against dbt-managed warehouses get metadata that ignores the semantic documentation their dbt teams have already written. This is the highest-value description source for dbt-native users and its absence significantly reduces context quality for the pilot's primary audience.

Context

  • dbt model and column descriptions are ingested and linked to AI_CONTEXT entities.
  • Conflicts with inferred descriptions are resolved via documented precedence rules.
  • Pilot agents can retrieve dbt-authored semantics through MCP context outputs.
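
As an illustration of the precedence idea, the sketch below picks one description per entity from several candidate sources. The source names and their ordering (dbt-authored first, then profile-derived, then inferred) are assumptions made for this example; the actual rules are the documented precedence rules, and `pick_description` is a hypothetical helper, not part of the codebase.

```python
from typing import Optional

# Hypothetical source names, highest priority first.
# ASSUMPTION: dbt-authored text outranks profile-derived and inferred text.
PRECEDENCE = ("dbt", "profile", "inferred")


def pick_description(candidates: dict[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty description in precedence order."""
    for source in PRECEDENCE:
        text = candidates.get(source)
        # Blank or whitespace-only descriptions do not count as a match.
        if text and text.strip():
            return text.strip()
    return None


resolved = pick_description(
    {"inferred": "Numeric id column", "dbt": "Primary key of the orders table."}
)
```

Treating whitespace-only descriptions as missing keeps an empty dbt field from masking a usable lower-priority description.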

Possible Solutions

Plan

  • Implement dbt schema parser/mapper for AI_CONTEXT entity IDs.
  • Handle missing/renamed model mappings with explicit warnings.
  • Add tests for merge behavior and precedence with profile-derived metadata.
  • Document ingestion assumptions and required dbt project structure.
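
The first two plan items could be sketched roughly as follows. This is not the implementation in dataface/core/inspect/dbt_schema.py: `extract_descriptions`, the `"table"` / `"table.column"` entity ID format, and the `known_tables` argument are all hypothetical. The sketch assumes the schema.yml has already been loaded into a dict (for example with a YAML parser) and follows the standard dbt layout of a top-level `models` list whose entries carry `name`, `description`, and `columns`.

```python
import warnings


def extract_descriptions(schema: dict, known_tables: set[str]) -> dict[str, str]:
    """Map dbt descriptions to hypothetical entity IDs ("table" or "table.column")."""
    out: dict[str, str] = {}
    for model in schema.get("models") or []:
        name = model.get("name")
        if name not in known_tables:
            # Missing or renamed model: warn explicitly instead of skipping silently.
            warnings.warn(f"dbt model {name!r} has no matching table; skipped")
            continue
        if model.get("description"):
            out[name] = model["description"]
        for col in model.get("columns") or []:
            if col.get("description"):
                out[f"{name}.{col['name']}"] = col["description"]
    return out


# Example input shaped like a parsed schema.yml (illustrative data only).
schema = {
    "version": 2,
    "models": [
        {
            "name": "orders",
            "description": "One row per order.",
            "columns": [
                {"name": "order_id", "description": "Primary key."},
                {"name": "status"},  # no description, so nothing is emitted
            ],
        },
        {"name": "old_customers", "description": "Renamed upstream."},
    ],
}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = extract_descriptions(schema, known_tables={"orders"})
```

Here `result` holds entries for `orders` and `orders.order_id`, and the unmatched `old_customers` model produces one warning rather than a silent drop.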

Implementation Progress

  • Implemented dataface/core/inspect/dbt_schema.py (DbtSchemaParser + merge_dbt_descriptions), wired into MCP catalog via _enrich_with_dbt helper
  • 19 tests in tests/core/test_dbt_schema.py covering parser, mapping, and MCP integration
  • Two cbox review rounds: R1 fixed dead code, swallowed warnings, and the not-profiled path skip, and added a cached-path test; R2 fixed a dead logging import and spurious single-table warnings
  • 77 tests pass (19 new + 58 existing), no regressions
  • Docs checklist item deferred; inline docstrings cover parser/warning semantics
  • Description priority merge engine landed in PR #500 (consumer of these sources)

Review Feedback

  • Two cbox review rounds completed, all blocking issues resolved
  • Review verdict: APPROVED

  • [x] Review cleared