MCP tooling contract for extension + Copilot dashboard/query generation
Problem
The MCP tool schemas (render_dashboard, execute_query, catalog, list_sources) lack formalized input/output contracts, making integrations with IDE extensions and GitHub Copilot fragile. Tool responses vary in structure between success and error cases, required vs. optional parameters are not enforced consistently, and there are no contract tests to catch breaking changes. When an IDE extension or Copilot agent constructs a tool call, minor schema drift can silently produce wrong results or cryptic errors. Without hardened contracts and documented recipes, every new integration is a one-off debugging exercise.
Context
Key files:
- dataface/ai/tool_schemas.py — canonical input schemas (single source of truth for all surfaces)
- dataface/ai/mcp/tools.py — tool implementations (render_dashboard, execute_query, catalog, list_sources, list_dashboards, get_dashboard, get_schema)
- dataface/ai/mcp/search.py — search_dashboards implementation
- dataface/ai/mcp/review.py — review_dashboard implementation
- dataface/ai/tools.py — OpenAI wrapper + dispatch_tool_call() (Playground/Cloud surface)
- dataface/ai/context_contract.py — existing AI_CONTEXT v1 contract (model for this work)
- tests/ai/test_ai_context_contract.py — contract-locking tests for AI_CONTEXT (pattern to follow)
- tests/core/test_mcp.py — existing MCP tool tests (functional, not contract-locking)
- tests/core/test_ai_tools.py — tool definition + dispatch tests
Response shape inconsistencies found:
1. list_dashboards — no success key; error uses error (string) instead of errors (list)
2. list_sources — no success or error keys at all
3. search_dashboards — no success/error keys, just {results: []}
4. execute_query — uses error (singular string) while render_dashboard/get_dashboard use errors (list)
5. render_dashboard success path omits warnings key entirely (present only on error)
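These inconsistencies can be sketched as Python dict literals. The payload values below are hypothetical illustrations of the shape drift, not actual tool output (the real payloads live in mcp/tools.py and mcp/search.py):

```python
# Hypothetical examples of the divergent response shapes described above.

# 1. list_dashboards error path: singular "error" string, no "success" key.
list_dashboards_error = {"error": "workspace not found"}

# 2. list_sources: bare payload, neither "success" nor "error".
list_sources_ok = {"sources": ["duckdb", "postgres"]}

# 3. search_dashboards: results only, no envelope keys.
search_ok = {"results": []}

# 4. execute_query vs. render_dashboard: singular vs. plural error keys.
execute_query_error = {"success": False, "error": "syntax error near SELECT"}
render_dashboard_error = {"success": False, "errors": ["unknown widget type"], "warnings": []}
```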
Constraints:
- Additive changes only (no removing keys consumers may depend on)
- Follow existing context_contract.py pattern: versioned, validated, tested
- Contract tests must lock shapes so breaking changes are caught by CI
Possible Solutions
A. Pydantic response models per tool
Define typed Pydantic models for each tool response. Strong typing, but heavyweight: tool implementations return plain dicts today, and Pydantic is not used in the AI layer (an intentional design choice).
B. Recommended — Lightweight contract module + validation functions
Mirror context_contract.py: define required keys/types per tool in a tool_contracts.py module with validate_tool_response(). Add contract-locking tests. Normalize the three inconsistent tools (list_dashboards, list_sources, search_dashboards) to include success key. Keep error (singular) on execute_query for backward compat but add errors list alongside.
Why recommended: Matches existing patterns, minimal code, no new dependencies, locks contracts without over-engineering.
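A minimal sketch of what such a contract module could look like. The names TOOL_CONTRACT_VERSION and validate_tool_response() come from this worksheet; the TOOL_CONTRACTS table and the exact key/type sets shown here are illustrative assumptions, not the actual contracts:

```python
# Sketch of a lightweight contract module in the spirit of option B,
# mirroring the context_contract.py pattern. Per-tool key tables below
# are examples only; the real dataface/ai/tool_contracts.py may differ.

TOOL_CONTRACT_VERSION = "1.0"

# Required response keys and their expected types, per tool (illustrative subset).
TOOL_CONTRACTS = {
    "list_dashboards": {"success": bool, "dashboards": list},
    "list_sources": {"success": bool, "sources": list},
    "search_dashboards": {"success": bool, "results": list},
    "execute_query": {"success": bool, "errors": list},
    "render_dashboard": {"success": bool, "errors": list, "warnings": list},
}


def validate_tool_response(tool_name: str, response: dict) -> list[str]:
    """Return a list of contract violations; an empty list means conformant."""
    contract = TOOL_CONTRACTS.get(tool_name)
    if contract is None:
        return [f"no contract registered for tool {tool_name!r}"]
    violations = []
    for key, expected_type in contract.items():
        if key not in response:
            violations.append(f"missing required key {key!r}")
        elif not isinstance(response[key], expected_type):
            violations.append(
                f"key {key!r} expected {expected_type.__name__}, "
                f"got {type(response[key]).__name__}"
            )
    return violations
```

Returning a list of violations (rather than raising) lets callers decide whether a violation is fatal, which suits both CI contract tests and lenient runtime checks.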
C. JSON Schema validation
Define JSON Schema per tool response and validate with jsonschema. Heavier dependency for limited benefit — the contract module approach is simpler and already proven in this codebase.
Plan
- Write failing contract tests (tests/ai/test_tool_contracts.py) that lock the expected response shapes for all 8 MCP tools — TDD per CLAUDE.md.
- Create dataface/ai/tool_contracts.py defining required response keys/types per tool, with a validate_tool_response() function.
- Normalize tool responses in mcp/tools.py, mcp/search.py, and mcp/review.py:
  - Add success: True/False to list_dashboards, list_sources, search_dashboards
  - Add errors list to execute_query alongside existing error string
  - Ensure warnings key is present on render_dashboard success path
- Run contract tests green, then run full just test for regressions.
- Document integration guidance in task worksheet.
- Validate task frontmatter, rebase, PR.
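The contract-locking test step above can be sketched as follows. This is a hypothetical shape in the style of tests/ai/test_ai_context_contract.py; the fake_tool_response() helper stands in for calling the real MCP tool implementations, and the locked key sets are an illustrative subset:

```python
# Sketch of a contract-locking test. A real test would invoke the tool
# implementations in dataface/ai/mcp/tools.py instead of the stub below.

# Keys every tool response must carry; locking them here means CI fails
# if a future change drops one (additive-only constraint).
LOCKED_KEYS = {
    "list_sources": {"success", "sources"},
    "execute_query": {"success", "errors"},
}


def fake_tool_response(tool):
    """Stand-in for invoking the real MCP tool (illustrative payloads)."""
    return {
        "list_sources": {"success": True, "sources": []},
        "execute_query": {"success": False, "errors": ["bad SQL"], "error": "bad SQL"},
    }[tool]


def test_responses_keep_contracted_keys():
    for tool, required in LOCKED_KEYS.items():
        response = fake_tool_response(tool)
        missing = required - response.keys()
        assert not missing, f"{tool} dropped contracted keys: {missing}"
```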
Implementation Progress
Completed
- [x] Created dataface/ai/tool_contracts.py — versioned response contracts for all 9 MCP tools (mirrors context_contract.py pattern)
- [x] Created tests/ai/test_tool_contracts.py — 27 contract-locking tests covering all tool response shapes (TDD: wrote failing tests first)
- [x] Normalized list_dashboards — added success: bool to all response paths
- [x] Normalized list_sources — added success: True to response
- [x] Normalized search_dashboards — added success: True to all response paths
- [x] Normalized execute_query — added errors: list[str] alongside existing error: str|None for consistency
- [x] All 27 contract tests pass; 191 existing MCP/AI tests pass with 0 regressions
Integration Guidance (for extension/Copilot consumers)
Envelope guarantee: every MCP tool response is a dict with a success: bool key (except get_schema, which always succeeds and returns {schema_text, version}).
Error handling recipe:
1. Check response["success"] first.
2. On failure, read response["errors"] (list of strings) for structured error messages. Some tools also provide error (singular string) for backward compatibility.
3. Optional keys error_summary, tips, and warnings may be present on error — use them for richer UX.
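The recipe above can be sketched from the consumer side. The helper name and fallback logic here are illustrative, not part of the shipped API:

```python
# Hypothetical consumer-side helper applying the error-handling recipe.

def handle_tool_response(response: dict) -> list[str]:
    """Return error messages; an empty list means the call succeeded."""
    # Step 1: check success first. get_schema carries no success key and
    # always succeeds, hence the default of True.
    if response.get("success", True):
        return []
    # Step 2: prefer the structured "errors" list, falling back to the
    # legacy singular "error" string kept for backward compatibility.
    errors = list(response.get("errors") or [])
    if not errors and response.get("error"):
        errors.append(response["error"])
    # Step 3: "error_summary", "tips", and "warnings" may also be present
    # on failure; surface them in the UI when available.
    return errors
```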
Tool contract version: TOOL_CONTRACT_VERSION = "1.0" in dataface/ai/tool_contracts.py. Consumers can import validate_tool_response() for runtime validation during development.
Review Feedback
- Self-review: all changes are additive (no removed keys), contract tests lock shapes, 0 regressions in 191 existing tests.
- [x] Review cleared