Dataface Tasks

Adoption hardening for internal teams

IDM2_INTERNAL_ADOPTION_DESIGN_PARTNERS-MCP_ANALYST_AGENT-01
Status: not_started
Priority: p1
Milestone: m2-internal-adoption-design-partners
Owner: data-ai-engineer-architect

Problem

The MCP server and tools were built for a single-team pilot and break down under broader use. Different teams have different data source configurations, dbt project layouts, and query patterns that expose edge cases in adapter resolution, schema inspection, and YAML compilation. Error handling is inconsistent — some failures return structured error objects while others surface raw Python tracebacks. Connection lifecycle management, caching behavior, and concurrent tool-call handling have not been stress-tested. Without hardening these paths, expanding to additional internal teams or design partners will generate a flood of support tickets and erode trust in the tooling.
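One way to close the structured-vs-traceback gap is to route every tool handler through a single error envelope. The sketch below is illustrative only: the decorator name, the `inspect_schema` tool, and the error schema are hypothetical stand-ins, not the project's actual contract.

```python
import functools
import traceback

def structured_errors(tool_name):
    """Wrap a tool handler so failures return a structured error
    object instead of surfacing a raw Python traceback."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return {"ok": True, "result": fn(*args, **kwargs)}
            except Exception as exc:
                return {
                    "ok": False,
                    "tool": tool_name,
                    "error_type": type(exc).__name__,
                    "message": str(exc),
                    # Truncated traceback kept for triage logs, not for end users.
                    "detail": traceback.format_exc(limit=3),
                }
        return wrapper
    return decorator

@structured_errors("inspect_schema")
def inspect_schema(table):
    # Hypothetical handler standing in for a real schema-inspection tool.
    if table is None:
        raise ValueError("table name is required")
    return {"table": table, "columns": ["id", "created_at"]}
```

Because every handler goes through the same wrapper, adopters see one consistent error shape regardless of which adapter, dbt layout, or YAML path triggered the failure.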

Context

  • Pilot work proved the baseline flow for AI agent tool interfaces, execution workflows, and eval-driven behavior tuning, but repeated use by multiple internal teams and first design partners will surface reliability and UX edge cases that the pilot could work around manually.
  • This milestone is about turning a promising path into something teams can use predictably without bespoke engineering help, support heroics, or undocumented workarounds.
  • Expected touchpoints include dataface/ai/, MCP/tool contracts, cloud chat surfaces, eval runners, and prompt artifacts, plus any onboarding, validation, and issue-triage surfaces that repeatedly fail during broader adoption.

Possible Solutions

  • A - Fix only the currently visible blockers: fast, but it tends to preserve hidden fragility and leaves the next round of adopters to rediscover the same classes of issues.
  • B - Recommended: define an adoption-hardening checklist and close the highest-impact gaps: combine real usage review, targeted fixes, docs/runbook updates, and validation so repeated use becomes predictable.
  • C - Pause broader adoption until a larger redesign is finished: reduces short-term support load, but delays the real feedback this milestone is meant to capture.

Plan

  1. Review prototype/pilot outputs, open issues, and support friction to identify the top blockers to repeated team adoption of AI agent tool interfaces, execution workflows, and eval-driven behavior tuning.
  2. Turn those findings into a prioritized hardening scope that covers product behavior, docs/runbooks, and validation for the highest-risk flows.
  3. Land or queue the highest-impact fixes and document explicit known limits for anything intentionally left out of scope.
  4. Re-run representative multi-team or design-partner scenarios and update follow-up tasks based on what still fails or feels too brittle.

Implementation Progress

Review Feedback

  • [ ] Review cleared