Grain Inference and Fanout Risk
Ready For Eng · M1 - 5T Internal Pilot Ready · 2 / 2 (100%)
Objective
Infer candidate model grain, join multiplicity profiles, and fanout risk scores from profiling stats and the relationship graph. Surface the resulting warnings to agents and through compile-time linting.
Deliverables
- [x] Define scope and decision boundaries.
- [x] Produce implementation plan and execution checkpoints.
- [ ] Capture rollout and validation approach.
Approach
Three-phase implementation adding new modules to dataface/core/inspect/:
- Phase 1: Grain candidate inference (grain_detector.py). Detect per-table grain from existing profiler stats (PK, uniqueness, naming). No new DB queries.
- Phase 2: Join multiplicity profiling (join_multiplicity.py). Classify relationship cardinality (1:1, 1:N, N:1, N:M) and compute fanout factors from existing column stats.
- Phase 3: Fanout risk scoring (fanout_risk.py). Score join risk (none → critical) with actionable recommendations. Wire into MCP tools and compile-time warnings.
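As a rough illustration of Phases 1 and 2, the sketch below infers single-column grain candidates and classifies a join using only column-level stats. The `ColumnStats` shape, the `infer_grain` and `classify_join` names, and the ranking heuristic are assumptions made for this example. They are not the actual profiler schema or the algorithms in spec.md.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical subset of profiler output for one column (not the real schema)."""
    name: str
    row_count: int
    distinct_count: int
    null_count: int = 0
    is_primary_key: bool = False

    @property
    def is_unique(self) -> bool:
        # Unique here means every row holds a distinct, non-null value.
        return (
            self.row_count > 0
            and self.null_count == 0
            and self.distinct_count == self.row_count
        )


def infer_grain(columns: list[ColumnStats]) -> list[str]:
    """Return candidate single-column grain keys, strongest signal first."""
    def rank(col: ColumnStats) -> int:
        # Assumed ordering: declared PK, then id-like naming, then plain uniqueness.
        if col.is_primary_key:
            return 0
        if col.name == "id" or col.name.endswith("_id"):
            return 1
        return 2

    candidates = [c for c in columns if c.is_primary_key or c.is_unique]
    return [c.name for c in sorted(candidates, key=rank)]


def classify_join(left: ColumnStats, right: ColumnStats) -> tuple[str, float]:
    """Classify left-to-right cardinality and estimate the fanout factor.

    The fanout factor is the average number of right-hand rows per distinct
    non-null key, i.e. how many times a matching left row may be repeated.
    """
    if left.is_unique:
        kind = "1:1" if right.is_unique else "1:N"
    else:
        kind = "N:1" if right.is_unique else "N:M"
    non_null = right.row_count - right.null_count
    fanout = non_null / right.distinct_count if right.distinct_count else 0.0
    return kind, fanout


# Illustrative stats for two made-up tables.
orders = [
    ColumnStats("order_number", row_count=100, distinct_count=100),
    ColumnStats("id", row_count=100, distinct_count=100, is_primary_key=True),
    ColumnStats("customer_id", row_count=100, distinct_count=40),
]
item_order_id = ColumnStats("order_id", row_count=250, distinct_count=100)

grain = infer_grain(orders)                               # ["id", "order_number"]
kind, fanout = classify_join(orders[1], item_order_id)    # ("1:N", 2.5)
```

Both functions read only pre-computed stats, which matches the "no new DB queries" constraint. Composite grain keys would need multi-column uniqueness stats and are left out of this sketch.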
Phase 4 (multi-hop path analysis) is deferred to M2+.
See spec.md for full phased plan, output schemas, algorithms, non-goals, and acceptance criteria.
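For orientation only, a Phase 3 scorer might map a cardinality label and fanout factor to a risk level plus a recommendation, along the lines of the sketch below. The intermediate level names, the thresholds, the `score_fanout_risk` name, and the output keys are all assumptions for illustration. The authoritative schema and algorithm are the ones in spec.md.

```python
# Assumed scale: the task only states that risk runs from "none" to "critical".
LEVELS = ["none", "low", "medium", "high", "critical"]


def score_fanout_risk(cardinality: str, fanout_factor: float) -> dict:
    """Score the risk that joining left-to-right duplicates left-hand rows."""
    if cardinality in ("1:1", "N:1"):
        # The right side is unique on the key, so left rows are not repeated.
        idx = 0
    elif fanout_factor < 2:
        idx = 1
    elif fanout_factor < 10:
        idx = 2
    else:
        idx = 3

    # Many-to-many joins are harder to reason about, so bump them one level.
    if cardinality == "N:M" and idx > 0:
        idx = min(idx + 1, len(LEVELS) - 1)

    level = LEVELS[idx]
    recommendation = (
        "No action needed."
        if level == "none"
        else "Aggregate the right-hand table to the join key before joining, "
             "or use distinct-aware measures."
    )
    return {
        "cardinality": cardinality,
        "fanout_factor": fanout_factor,
        "level": level,
        "recommendation": recommendation,
    }


result = score_fanout_risk("1:N", 2.5)
print(result["level"])  # medium
```

A compile-time lint could then emit a warning whenever `level` is above a configurable threshold, while MCP tools return the full dict to agents.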
Tasks
- AI_CONTEXT grain and fanout risk signals (beta subset): completed
- Research deterministic column fanout risk signals and AI context surfacing: completed (research and design input on column-level hints vs. an edge-canonical model)