Regression prevention and quality gates
Problem
Semantic type detection accuracy and profile output quality can silently regress when new detectors are added, detection logic is refactored, or upstream profiling queries change. Currently, there are no automated gates in CI that verify semantic inference results against a known-good baseline — a change that causes 15% of email columns to be misclassified as generic strings would pass all existing tests. Without regression gates that compare detector output, confidence distributions, and profile completeness against reference datasets, every release risks shipping quality degradations that are invisible until users report them. Automated quality gates are essential to make the release process sustainable as the detector library grows.
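As a hedged sketch of what such a gate could look like: compare current detector classifications for a reference dataset against a committed baseline and fail when drift exceeds a tolerance. The function name, fixture data, and tolerance below are illustrative assumptions, not existing APIs or values from the codebase.

```python
# Hypothetical regression gate: compare semantic-type classifications for a
# reference dataset against a committed baseline. gate_semantic_types and the
# fixture data are illustrative, not an existing API.

def gate_semantic_types(baseline: dict, current: dict, max_drift: float = 0.02):
    """Return failure messages; an empty list means the gate passes.

    baseline/current map column name -> detected semantic type.
    max_drift is the tolerated fraction of columns whose type may change.
    """
    mismatches = [
        f"{col}: expected {expected!r}, got {current.get(col)!r}"
        for col, expected in baseline.items()
        if current.get(col) != expected
    ]
    if len(mismatches) / max(len(baseline), 1) > max_drift:
        return mismatches  # drift too large: block the change
    return []  # within tolerance: allow

# Example: an email column misclassified as a generic string trips the gate.
baseline = {"email": "email", "signup_ts": "timestamp", "notes": "text"}
current = {"email": "string", "signup_ts": "timestamp", "notes": "text"}
failures = gate_semantic_types(baseline, current)
```

A real version would load the baseline from a committed fixture and run the current detectors over the same reference dataset; the point is that the comparison is mechanical and can run on every change.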
Context
- Manual review is not enough to protect warehouse profiling, semantic inference, and analyst-facing inspect/context artifacts once the change rate increases; regressions will keep shipping unless the highest-value checks become automatic.
- This task should identify what needs gating in CI or structured review and what evidence is sufficient to block a risky change before it reaches users.
- Expected touchpoints include dataface/core/inspect/, schema-context consumers, inspect docs, core and automated tests, eval/QA checks, and any release or review scripts that can enforce the new gates.
Possible Solutions
- A - Add only a few narrow tests around current bugs: easy to land, but it rarely protects the broader behavior contract.
- B - Recommended: define a regression-gate bundle around the core behavior contract, combining focused tests, snapshots/evals, and required review evidence for risky changes.
- C - Depend on manual smoke testing before each release: better than nothing, but too inconsistent to serve as a durable gate.
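The snapshot/eval piece of option B could be as small as a tolerance check on confidence distributions, sketched below; the bucket names and tolerance are assumptions, not values from the codebase.

```python
# Sketch of option B's snapshot check: compare a detector's confidence
# distribution on a reference dataset against a stored snapshot, within a
# tolerance. Bucket names and the 0.05 tolerance are assumptions.

def distributions_match(snapshot: dict, current: dict, tol: float = 0.05) -> bool:
    """True when every confidence bucket is within tol of the snapshot."""
    buckets = set(snapshot) | set(current)
    return all(abs(snapshot.get(b, 0.0) - current.get(b, 0.0)) <= tol for b in buckets)

snapshot = {"high": 0.72, "medium": 0.21, "low": 0.07}
current = {"high": 0.69, "medium": 0.24, "low": 0.07}
ok = distributions_match(snapshot, current)
```

Keeping the snapshot in the repo makes intentional distribution shifts visible in review: updating the snapshot file becomes the required review evidence for a risky change.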
Plan
- Identify the highest-risk behavior contracts for warehouse profiling, semantic inference, and analyst-facing inspect/context artifacts and the types of changes that should be blocked when they regress.
- Choose the smallest practical set of automated checks and required review evidence that covers those contracts well enough to matter.
- Wire the new gates into the relevant test, review, or release surfaces and document when exceptions are allowed.
- Trial the gates on a few representative changes and tighten the signal-to-noise ratio before expanding the coverage further.
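The wiring step above could reduce to a small runner that CI or a release script invokes, exiting non-zero when any gate fails; the gate names here are placeholders for the checks this task would define.

```python
# Illustrative gate runner for CI: aggregate gate results and exit non-zero
# when any gate fails so the pipeline blocks the change. Gate names are
# placeholders, not checks that exist yet.
import sys

def run_gates(gates):
    """gates is a list of (name, passed) pairs; returns the process exit code."""
    failed = [name for name, passed in gates if not passed]
    for name in failed:
        print(f"GATE FAILED: {name}", file=sys.stderr)
    return 1 if failed else 0

gates = [
    ("semantic-type accuracy vs. baseline", True),
    ("confidence-distribution drift", True),
    ("profile completeness", True),
]
exit_code = run_gates(gates)
```

A real runner would collect results from the individual checks rather than hard-coded booleans, and the exit code is what lets a CI job or release script treat the bundle as a single required status.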
Implementation Progress
Review Feedback
- [ ] Review cleared