Terminal Agent TUI - dft agent
Problem
Today, Dataface's AI tools are only accessible through external clients (Cursor, Claude Desktop, Codex) via the MCP server (dft mcp serve). This means:
- Setup friction: Users must configure MCP in their IDE, which varies per client and often breaks.
- No standalone experience: There's no way to use Dataface's AI capabilities without a third-party IDE.
- Context fragmentation: External clients don't load all of Dataface's skills and schema context by default — users get a degraded experience compared to what the tools can do.
- Analyst accessibility: Internal analysts at Fivetran need a simple dft agent command they can run in any terminal to explore data and build dashboards conversationally.
A built-in terminal agent (dft agent) would provide a first-class, zero-config AI experience where all MCP tools and skills are pre-loaded.
Context
Research doc: ai_notes/ai/TERMINAL_AGENT_TUI.md — full landscape analysis of open-source terminal agents, TUI frameworks, agent SDKs, and architecture options.
Existing infrastructure (all built):
- MCP tools: catalog, execute_query, render_dashboard, review_dashboard, search_dashboards, list_sources — in dataface/ai/mcp/tools.py
- Tool schemas: dataface/ai/tool_schemas.py — canonical definitions used by MCP + OpenAI formats
- Tool dispatch: dataface/ai/tools.py — dispatch_tool_call() routes any tool name to its implementation
- Prompts/skills: dataface/ai/prompts.py + dataface/ai/skills/ — design guides, workflow, YAML reference
- Schema context: dataface/ai/schema_context.py — token-efficient catalog summary
- MCP server: dataface/ai/mcp/server.py — resources + tools, stdio mode
- CLI: dataface/cli/main.py — typer-based CLI, easy to add new commands
Key external tools evaluated:
- Claude Agent SDK (claude-agent-sdk-python) v0.1.48 — supports custom MCP tools via create_sdk_mcp_server
- Pydantic AI — multi-model agent framework with clai CLI
- Textual — Python TUI framework (Rich creator's project)
- textual-chat v0.1.4 — zero-config chat widget with FastMCP tool integration
- OpenCode (123k stars) — reference for UX patterns
- Goose by Block (33k stars) — extensible via MCP, reference architecture
Possible Solutions
Option A: OpenAI Responses API + Anthropic Messages API — Multi-Model from Day 1 [Selected]
Support both OpenAI and Anthropic as LLM backends. OpenAI is the default. Use the OpenAI Responses API (not Chat Completions) for streaming tool-calling with JSON schema enforcement where applicable. Support Anthropic Messages API as an alternative backend (--model claude-sonnet or config). Both backends share the same tool definitions, system prompt, and agent loop — only the API wire format differs.
Trade-offs: Slightly more plumbing than single-provider, but we need both for evaluation/comparison. The tool layer is already provider-agnostic (tool schemas are plain dicts). The difference is only in how we serialize tool calls/results on the wire.
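The claim that only the wire format differs can be shown concretely. A minimal sketch, assuming an illustrative tool dict (not the real definition in dataface/ai/tool_schemas.py), of serializing one provider-agnostic tool into each provider's shape:

```python
# One provider-agnostic tool definition (illustrative fields only; the
# real schemas live in dataface/ai/tool_schemas.py).
TOOL = {
    "name": "execute_query",
    "description": "Run a read-only SQL query against a source.",
    "parameters": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}

def to_openai(tool: dict) -> dict:
    # OpenAI Responses API function tools are flat dicts with a "type" key.
    return {
        "type": "function",
        "name": tool["name"],
        "description": tool["description"],
        "parameters": tool["parameters"],
    }

def to_anthropic(tool: dict) -> dict:
    # Anthropic Messages API tools carry the JSON schema under "input_schema".
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }
```

Both converters read the same canonical dict, which is what keeps the tool layer provider-agnostic.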
Option B: Claude Agent SDK (Single-Provider)
Use ClaudeSDKClient / query() with our MCP tools injected as in-process MCP servers.
Trade-offs: Fast to build but Claude-only. Locks us into one provider. We'd need to rewrite for multi-model later.
Option C: Pydantic AI Agent Framework
Use Pydantic AI for multi-model support, Textual for terminal UI.
Trade-offs: Multi-model via framework abstraction, but it adds another dependency, and the framework may lag behind provider API features (e.g., OpenAI Responses API, reasoning effort).
Option D: textual-chat Widget (Minimal Build)
Use textual-chat with FastMCP integration. Almost zero code needed for a working chat.
Trade-offs: Fastest prototype. Limited customization. Good for validating UX.
Option E: Toad / ACP Protocol
Implement Agent Client Protocol so dft works as a backend for Toad or any ACP-compatible frontend.
Trade-offs: Standard protocol. Toad's polished UI for free. ACP still new. Less UX control.
Plan
Phased approach — ship Phase 1 for M2, iterate from there.
Phase 1: Multi-Model Agent + Streaming Terminal (M2 target)
LLM backend decision: Support both OpenAI (default) and Anthropic. The Cloud app's AIService already uses OpenAI's Responses API for structured outputs — the terminal agent follows the same pattern. For the streaming agent loop, use the OpenAI Responses API with tool calling (replacing the older Chat Completions pattern). Anthropic support uses the Messages API with tool_use blocks. A thin LLMClient abstraction handles the wire format difference — the agent loop, tool dispatch, and prompt stack are shared.
Why OpenAI Responses API (not Chat Completions):
- JSON schema enforcement for structured tool results
- reasoning.effort parameter for cost control
- Unified API for both streaming and non-streaming
- Already used by AIService.generate_dashboard(), generate_sql(), suggest_chart_type()
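As a sketch of what the OpenAIClient would send, the kwargs for a streaming Responses API call might be assembled like this (values are illustrative defaults, and no network call is made here):

```python
def build_responses_request(
    model: str,
    messages: list[dict],
    tools: list[dict],
    effort: str = "low",
) -> dict:
    """Assemble kwargs for a client.responses.create() call.

    Illustrative only: shows the Responses API shape (streaming,
    function tools, reasoning effort), not the agent's real config.
    """
    return {
        "model": model,
        "input": messages,                # Responses API takes "input", not "messages"
        "tools": tools,                   # flat function-tool dicts
        "reasoning": {"effort": effort},  # cost knob (reasoning models only)
        "stream": True,                   # consume as a server-sent event stream
    }

# Usage sketch: client.responses.create(**build_responses_request(...))
```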
Files to create/modify:
- dataface/ai/agent.py — agent loop with multi-model LLM client
- dataface/ai/llm.py — thin LLMClient abstraction (OpenAI default, Anthropic alt)
- dataface/cli/commands/agent.py — new command module
- dataface/cli/main.py — add agent command
Steps:
1. Add openai and anthropic to optional dependencies ([agent] extra)
2. Create dataface/ai/llm.py:
- LLMClient protocol with stream_with_tools() method
- OpenAIClient — uses Responses API, streams tool calls + text, handles JSON schema enforcement
- AnthropicClient — uses Messages API with tool_use content blocks
- Factory: create_client(provider="openai"|"anthropic", model=...) — reads from config/env with OpenAI as default
3. Create dataface/ai/agent.py:
- Build system prompt from existing resources (schema context, skills, YAML ref)
- run_agent() — manages the tool-calling conversation loop using LLMClient
- Tool dispatch via existing dispatch_tool_call() from dataface/ai/tools.py
- Yield events: thinking, tool_call, tool_result, content, done, error (same event schema as Cloud SSE)
4. Create dataface/cli/commands/agent.py:
- dft agent command — starts interactive session
- dft agent "one-shot prompt" — single query mode
- --model flag to select provider/model (default: OpenAI)
- Rich formatting for tool outputs (tables, syntax highlighting)
5. Wire into dataface/cli/main.py
6. Test with real data sources on both providers
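The event-yielding loop from step 3 can be sketched as follows. This is a simplified shape (it omits the thinking event and uses plain dicts); the real implementation dispatches through dispatch_tool_call() from dataface/ai/tools.py:

```python
from typing import Callable, Iterator

def run_agent(
    client,                                  # any LLMClient-style object
    messages: list[dict],
    tools: list[dict],
    dispatch: Callable[[str, dict], str],    # e.g. dispatch_tool_call
    max_turns: int = 10,
) -> Iterator[dict]:
    """Tool-calling loop: stream model output, run tools, feed results back."""
    for _ in range(max_turns):
        pending_calls = []
        for event in client.stream_with_tools(messages, tools):
            if event["kind"] == "content":
                yield {"type": "content", "text": event["text"]}
            elif event["kind"] == "tool_call":
                pending_calls.append(event)
                yield {"type": "tool_call", "name": event["name"]}
        if not pending_calls:
            # No tool calls this turn: the model has finished answering.
            yield {"type": "done"}
            return
        for call in pending_calls:
            result = dispatch(call["name"], call["args"])
            # Feed the tool result back so the next turn can use it.
            messages.append({"role": "tool", "name": call["name"], "content": result})
            yield {"type": "tool_result", "name": call["name"], "result": result}
    yield {"type": "error", "message": "max turns exceeded"}
```

Keeping the loop a generator of plain events is what lets the CLI renderer and a future Textual TUI consume the same stream.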
The LLMClient abstraction also benefits the Cloud app: AIService.chat_with_tools() currently hard-codes OpenAI Chat Completions. Once LLMClient exists, AIService can use it too — getting Anthropic support and the Responses API upgrade for free. This is shared infrastructure.
Phase 2: Textual TUI (post-M2)
- Chat panel with scrollback and streaming
- Dashboard preview panel (open in browser or terminal render)
- Status bar (sources, table count, session info)
- Slash commands (/inspect, /sources, /render)
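The slash-command layer can stay a trivial prefix dispatch on top of the chat input. A sketch with hypothetical handlers:

```python
def handle_input(line: str, handlers: dict) -> str:
    """Route '/command args' to a handler; anything else is chat text."""
    if not line.startswith("/"):
        return f"chat:{line}"
    name, _, args = line[1:].partition(" ")
    handler = handlers.get(name)
    if handler is None:
        return f"unknown command: /{name}"
    return handler(args)

# Stand-in handlers; the real ones would call the existing MCP tool layer.
handlers = {
    "sources": lambda _: "sources: demo_db",
    "inspect": lambda args: f"inspect {args}",
}
```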
Phase 3: Advanced Features (post-M2)
- Session persistence and resumption
- Multi-turn context compaction for long conversations
- Cost tracking per session
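Per-session cost tracking reduces to accumulating the token usage each provider response reports. A sketch with placeholder per-million-token rates (not real prices; look up current provider pricing):

```python
class SessionCost:
    """Accumulate token usage across turns. Rates are illustrative only."""

    def __init__(self, input_rate: float, output_rate: float):
        # USD per 1M tokens -- placeholders, not actual provider pricing.
        self.input_rate = input_rate
        self.output_rate = output_rate
        self.input_tokens = 0
        self.output_tokens = 0

    def add_turn(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def usd(self) -> float:
        return (
            self.input_tokens * self.input_rate
            + self.output_tokens * self.output_rate
        ) / 1_000_000
```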
Implementation Progress
- Read the task, architecture docs, and anti-slop/core rules before implementation.
- Added dataface/ai/llm.py with a thin LLMClient protocol plus OpenAI Responses API and Anthropic Messages API adapters and a create_client() factory.
- Added dataface/ai/agent.py with a provider-neutral tool loop that builds the system prompt from shared skills, the YAML cheatsheet, and schema context, then dispatches tools through dispatch_tool_call().
- Added dataface/cli/commands/agent.py and wired dft agent into dataface/cli/main.py for one-shot and interactive terminal chat with Rich-formatted tool output.
- Added [agent] optional dependencies in pyproject.toml.
- Added focused tests for prompt construction, provider selection, agent tool execution, and CLI one-shot mode.
- Follow-up review fixes:
  - Interactive mode now preserves conversation history across turns.
  - The agent loop only catches provider errors surfaced as LLMClientError.
  - The Anthropic client now uses streaming output.
  - The OpenAI client now guards against broken incremental tool history.
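The narrowed exception handling amounts to wrapping only provider calls and letting everything else propagate. A sketch (LLMClientError matches the name in the notes above; the rest is illustrative):

```python
class LLMClientError(Exception):
    """Raised by LLM adapters for provider failures (rate limits, auth, ...)."""

def safe_stream(client, messages, tools):
    """Yield events, converting only provider errors into an error event."""
    try:
        yield from client.stream_with_tools(messages, tools)
    except LLMClientError as exc:
        # Provider failure: surfaced to the UI as an error event.
        yield {"type": "error", "message": str(exc)}
    # Programming errors (KeyError, TypeError, ...) propagate unchanged,
    # so real bugs fail loudly instead of being swallowed.
```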
Review Feedback
- Focused verification passed with uv run pytest tests/core/test_ai_agent.py tests/core/test_ai_llm.py tests/core/test_agent_cli.py.
- just ci passed, including lint, typecheck, the full test suite on Python 3.10 and 3.14, and the visual test stage.
- cbox review requested follow-up fixes for interactive state, exception handling, duplicate tool normalization, and Anthropic streaming; those fixes were applied.
- [x] Review cleared