Tasks server universal doc-context chat sidebar

ID	INFRA_TOOLING-TASKS_SERVER_UNIVERSAL_DOC_CONTEXT_CHAT_SIDEBAR
Status	completed
Priority	p1
Milestone	m1-ft-analytics-analyst-pilot
Owner	sr-engineer-architect
Completed by	dave
Completed	2026-03-24

Problem

When browsing pages served by just tasks serve, there is no way to discuss the current document with an assistant. We need a generic chat entry point on every tasks-server page, with strong per-doc context (the path, plus the server-loaded markdown when applicable). The assistant should research and edit content in the current repo through explicit tools, and may delegate heavier reasoning to a Claude subprocess sub-agent. It should not own full implementation: coding work is handed off to the existing task manager (e.g. by queueing or registering intent for pickup, the same family of behavior as heartbeat-ready and dispatch), and the sidebar never spawns worktrees.

Context

Primary code: tasks/tools/tasks_server.py, tasks/stylesheets/tasks_server_shell.css, tasks/javascripts/tasks.js.
Blocking dependency: converge-plans-and-plan-commands-to-a-single-task-cli-surface — tool calls that invoke the task CLI must use the converged task/tasks surface (not legacy plan naming).
Sibling (parallel): task-new-authoring-flow-with-full-worksheet-and-strict-validate improves how tasks are authored; this chat task does not block on it unless you choose to wire strict validate into a tool later.
Secrets: Use OPENAI_API_KEY (and optional OPENAI_MODEL, same as Cloud/playground—see .env.example and apps/cloud/apps/ai/service.py) from the repo .env. tasks/tools/run_tasks_server.py already calls load_dotenv(ROOT / ".env", override=False); the chat route reads keys from os.environ after that. Never send API keys to the browser.
Orchestrator vs Claude: The primary LLM for the sidebar can be OpenAI (streaming + tool calls)—no Anthropic API key required for that path. run_claude_subagent is separate: subprocess claude (Claude Code CLI) for deeper passes; it uses whatever auth that CLI already has locally, independent of the OpenAI orchestrator. Optional later: support ANTHROPIC_API_KEY if you want the orchestrator on Anthropic instead; not required for the OpenAI + CLI-delegation split.
Orchestrator pattern: Main UX = OpenAI-compatible streaming API + SSE with function calling and a small allowlisted tool loop—not tmux polling. Reuse local-only guards like /actions/dispatch: loopback client, Referer check, optional TASKS_SERVER_DISPATCH_TOKEN for any subprocess, file write, or spend.
Manager handoff: “Implementation” from chat means registering work for the manager (e.g. task update --status ready, or a documented equivalent that the heartbeat / operator flow picks up). Creating .worktrees or running full dispatch from the chat UI is explicitly out of scope and deferred.
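
The manager handoff above could be guarded by a small argv builder, sketched here. The status allowlist, the task-ID pattern, and the position of the task ID in the command line are assumptions for illustration; the converged task CLI defines the real contract.

```python
import re

# Hypothetical allowlist: the text above only names "ready" as the handoff status.
ALLOWED_STATUSES = frozenset({"ready"})
# Hypothetical slug shape for task IDs (must not start with "-", so no flag injection).
TASK_ID_RE = re.compile(r"[a-z0-9][a-z0-9_-]*")


def build_register_argv(task_id: str, status: str = "ready") -> list[str]:
    """Return the argv for a manager handoff, or raise ValueError if not allowlisted."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status not allowlisted: {status!r}")
    if not TASK_ID_RE.fullmatch(task_id):
        raise ValueError(f"invalid task id: {task_id!r}")
    # Assumed argument order; the real CLI may take the task ID differently.
    return ["task", "update", task_id, "--status", status]
```

Returning a list (rather than a shell string) means the caller can pass it straight to subprocess.run without shell interpolation.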

Possible Solutions

A — Chat-only, no tools: Prose and pasted diffs only. Too weak for research/edit goals.
B — Full in-browser Claude Code clone: Too large; avoid.
C — Hybrid (Recommended): Streaming chat driven by OpenAI (or another single cloud API) + bounded tools: read file under repo root (path allowlist), ripgrep/search, apply patch or write scoped paths (tasks/docs only if you want tighter safety). Add run_claude_subagent (subprocess claude with a structured prompt, cwd=repo root, timeout, stdout/stderr captured) so the OpenAI orchestrator can delegate to Claude for research-heavy turns—no worktree in this path. Add register_task_for_manager (or equivalent) that runs allowlisted task CLI updates / status transitions so implementation stays on the manager + worker worktrees.
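
A minimal sketch of the run_claude_subagent delegation in option C, assuming the Claude Code CLI is on PATH and accepts the prompt via claude -p. The function name run_subagent, the executable parameter, and the returned dict shape are hypothetical, not the module's actual interface.

```python
import subprocess
from pathlib import Path


def run_subagent(prompt: str, repo_root: Path, timeout_s: int = 120,
                 executable: str = "claude") -> dict:
    """Run `<executable> -p <prompt>` with cwd=repo_root and capture its output."""
    argv = [executable, "-p", prompt]  # list argv, never shell=True
    try:
        proc = subprocess.run(
            argv,
            cwd=repo_root,
            capture_output=True,  # collect stdout and stderr
            text=True,
            timeout=timeout_s,
        )
    except FileNotFoundError:
        return {"ok": False, "error": f"{executable} not found"}
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": f"timed out after {timeout_s}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

No worktree is created here: the subprocess runs in the existing repo root and only its captured output is handed back to the orchestrator.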

Plan

Land after the CLI convergence dependency; wire tool documentation and argv allowlists to the just task … / plans successor entrypoint as implemented there.
UI: right drawer + top control on all _render_docs_shell pages; embed page kind + data-doc-rel (or absent for non-file routes like /status).
Backend: SSE chat endpoint; use OPENAI_API_KEY / OPENAI_MODEL from os.environ after .env load; system prompt includes current doc identity + optional file snapshot; conversation state held server-side and keyed per session and document (e.g. cookie + doc path).
Tools (implement minimally, expand later): read/search/edit within policy; run_claude_subagent subprocess with strict args and resource limits; register_task_for_manager calling documented manager contract (status ready + any register/snapshot hooks the repo already uses—see scripts/task_manager_lib, heartbeat, existing Build/dispatch docs).
Security: same family of checks as task-server POST actions for anything that touches disk, subprocess, or API quota.
Tests: tool allowlist, prompt construction from fixture path; optional Playwright for drawer.
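
The prompt construction named in the Backend and Tests items might look roughly like this. The wording of the prompt, the snapshot cap, and the function name build_system_prompt are illustrative assumptions, not the shipped implementation.

```python
from typing import Optional

MAX_SNAPSHOT_CHARS = 20_000  # hypothetical cap to keep the prompt bounded


def build_system_prompt(page_kind: str, doc_rel: Optional[str],
                        snapshot: Optional[str]) -> str:
    """Describe the current page to the model; doc_rel is None on non-file routes."""
    lines = [
        "You are an assistant embedded in the tasks server.",
        f"Page kind: {page_kind}",
    ]
    if doc_rel:
        lines.append(f"Current document: {doc_rel}")
    else:
        lines.append("Current document: none (non-file route)")
    if doc_rel and snapshot:
        lines.append("Document snapshot:")
        lines.append(snapshot[:MAX_SNAPSHOT_CHARS])
        if len(snapshot) > MAX_SNAPSHOT_CHARS:
            lines.append("[snapshot truncated]")
    return "\n".join(lines)
```

A fixture-based test can then assert that the document path appears in the prompt and that non-file routes such as /status carry no snapshot.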

Implementation Progress

Architecture

tasks/tools/tasks_server_chat.py — new module containing tool definitions, tool execution, system prompt construction, and SSE streaming chat endpoint. Keeps chat logic separate from the main server.
tasks/tools/tasks_server.py — wired POST /actions/chat route (same loopback+referer+token guards as dispatch), added GET /tasks-assets/tasks_chat.js asset route, embedded chat drawer HTML into _render_docs_shell.
tasks/javascripts/tasks_chat.js — client-side SSE consumer: toggle drawer, send messages, stream content/tool_call/tool_result/done events, minimal markdown formatting.
tasks/stylesheets/tasks_server_shell.css — chat drawer styles (right-side panel, 22rem, responsive collapse on mobile).
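
For reference, the four stream event types consumed by tasks_chat.js could be framed on the server along these lines. This is a sketch assuming standard SSE framing with named events and JSON payloads; the actual wire format in tasks_server_chat.py may differ, and sse_frame is a hypothetical helper name.

```python
import json

# Event names taken from the client description above.
EVENT_TYPES = ("content", "tool_call", "tool_result", "done")


def sse_frame(event: str, data: dict) -> bytes:
    """Encode one server-sent event: an event line, a data line, then a blank line."""
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event!r}")
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n".encode("utf-8")
```

Each frame ends with a blank line, which is what lets the client split the stream into discrete events as chunks arrive.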

Tools (5 implemented)

Tool	Description
`read_file`	Read any file in the repo (path-safe, 100 KB limit)
`search_files`	ripgrep search with optional glob filter
`edit_file`	String replace scoped to `tasks/`, `docs/`, `ai_notes/`
`run_claude_subagent`	subprocess `claude -p` with 120s timeout
`register_task_for_manager`	Set task status via converged `task update` CLI
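
The path-safety and edit-scope rules in the table can be sketched as follows. The helper names resolve_in_repo and is_editable are hypothetical; the three prefixes are the ones listed for edit_file.

```python
from pathlib import Path

# Prefixes taken from the edit_file row above.
EDIT_PREFIXES = ("tasks", "docs", "ai_notes")


def resolve_in_repo(repo_root: Path, rel: str) -> Path:
    """Resolve rel under repo_root; raise ValueError if it escapes the root."""
    root = repo_root.resolve()
    target = (root / rel).resolve()  # collapses ".." and follows symlinks
    if not target.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"path escapes repo root: {rel!r}")
    return target


def is_editable(repo_root: Path, rel: str) -> bool:
    """True only if rel stays in the repo and its first component is allowlisted."""
    target = resolve_in_repo(repo_root, rel)
    parts = target.relative_to(repo_root.resolve()).parts
    return bool(parts) and parts[0] in EDIT_PREFIXES
```

Comparing the first path component, rather than using a string prefix match, keeps a sibling directory such as tasks_extra/ from passing as tasks/.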

Security

Same guards as /actions/dispatch: loopback client, Referer origin, optional TASKS_SERVER_DISPATCH_TOKEN.
OPENAI_API_KEY required; returns 503 if not configured.
Path traversal blocked in all file tools.
Edit tool restricted to allowlisted prefixes.
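
A rough sketch of the three request guards listed above. The header name X-Dispatch-Token, the function name is_allowed, and the way the server origin is passed in are assumptions; the real checks live alongside /actions/dispatch in tasks_server.py.

```python
import hmac
import ipaddress
from typing import Mapping, Optional
from urllib.parse import urlsplit


def is_allowed(client_ip: str, headers: Mapping[str, str], server_origin: str,
               expected_token: Optional[str] = None) -> bool:
    """Apply loopback, Referer-origin, and optional token checks in that order."""
    try:
        if not ipaddress.ip_address(client_ip).is_loopback:
            return False
    except ValueError:  # not a parseable IP address
        return False
    ref = urlsplit(headers.get("Referer", ""))
    if f"{ref.scheme}://{ref.netloc}" != server_origin:
        return False
    if expected_token:
        # Hypothetical header name; compare in constant time.
        supplied = headers.get("X-Dispatch-Token", "")
        if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
            return False
    return True
```

This sketch treats headers as a plain mapping with exact-case keys; a real HTTP handler would normally use a case-insensitive header lookup.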

Tests (13 new)

tests/core/test_tasks_server_chat.py: tool definitions, path safety, tool execution (read, search, register), prompt construction, endpoint auth, SSE response, API key requirement, shell HTML includes chat.
All 73 tests pass (60 existing + 13 new).

QA Exploration

[x] N/A — chat drawer requires live OPENAI_API_KEY for interactive testing; UI structure verified via automated tests (shell includes chat HTML, endpoint returns SSE).

Review Feedback

[ ] Review cleared