Tasks server universal doc-context chat sidebar
Problem
There is currently no way, from `just tasks serve`, to discuss the current document with an assistant while browsing. We need a generic chat entry point on all tasks-server pages, with strong per-doc context (path plus server-loaded markdown when applicable). The assistant should research and edit content in the current repo via explicit tools, and may delegate heavier reasoning to a Claude subprocess sub-agent. It should not own full implementation: coding work hands off to the existing task manager (e.g. queue or register intent for pickup, the same family of behavior as heartbeat-ready and dispatch), without the sidebar spawning worktrees.
Context
- Primary code: `tasks/tools/tasks_server.py`, `tasks/stylesheets/tasks_server_shell.css`, `tasks/javascripts/tasks.js`.
- Blocking dependency: converge-plans-and-plan-commands-to-a-single-task-cli-surface. Tool calls that invoke the task CLI must use the converged `task` / `tasks` surface (not legacy `plan` naming).
- Sibling (parallel): task-new-authoring-flow-with-full-worksheet-and-strict-validate improves how tasks are authored; this chat task does not block on it unless you choose to wire strict validate into a tool later.
- Secrets: Use `OPENAI_API_KEY` (and optional `OPENAI_MODEL`, same as Cloud/playground; see `.env.example` and `apps/cloud/apps/ai/service.py`) from the repo `.env`. `tasks/tools/run_tasks_server.py` already calls `load_dotenv(ROOT / ".env", override=False)`; the chat route reads keys from `os.environ` after that. Never send API keys to the browser.
- Orchestrator vs Claude: The primary LLM for the sidebar can be OpenAI (streaming + tool calls); no Anthropic API key is required for that path. `run_claude_subagent` is separate: it runs `claude` (Claude Code CLI) as a subprocess for deeper passes and uses whatever auth that CLI already has locally, independent of the OpenAI orchestrator. Optional later: support `ANTHROPIC_API_KEY` if you want the orchestrator on Anthropic instead; not required for the OpenAI + CLI-delegation split.
- Orchestrator pattern: Main UX = OpenAI-compatible streaming API + SSE with function calling and a small allowlisted tool loop, not tmux polling. Reuse local-only guards like `/actions/dispatch`: loopback client, Referer check, optional `TASKS_SERVER_DISPATCH_TOKEN` for any subprocess, file write, or spend.
- Manager handoff: "Implementation" from chat means registering work for the manager (e.g. `task update --status ready`, or documented equivalent that heartbeat / operator flow picks up), not creating `.worktrees` or running full dispatch from the chat UI unless explicitly out of scope and deferred.
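
The local-only guards described above (loopback client, Referer check, optional token) can be sketched as a single pure function. This is a minimal illustration, not the code behind `/actions/dispatch`: the function names, the returned reason strings, and the `X-Dispatch-Token` header name are all assumptions made for the example.

```python
import hmac
import ipaddress
from typing import Mapping, Optional
from urllib.parse import urlsplit


def is_loopback(client_host: str) -> bool:
    """Return True when the client address is a loopback IP (127.0.0.0/8 or ::1)."""
    try:
        return ipaddress.ip_address(client_host).is_loopback
    except ValueError:
        # Hostnames and malformed values are rejected rather than resolved.
        return False


def referer_matches_host(referer: Optional[str], host_header: Optional[str]) -> bool:
    """Accept only a Referer whose host:port equals the request's Host header."""
    if not referer or not host_header:
        return False
    return urlsplit(referer).netloc.lower() == host_header.lower()


def check_local_guards(
    client_host: str,
    headers: Mapping[str, str],
    expected_token: Optional[str] = None,
) -> Optional[str]:
    """Return None when the request is allowed, else a short rejection reason."""
    if not is_loopback(client_host):
        return "non-loopback client"
    if not referer_matches_host(headers.get("Referer"), headers.get("Host")):
        return "referer mismatch"
    if expected_token:
        # Hypothetical header name; constant-time comparison avoids timing leaks.
        supplied = headers.get("X-Dispatch-Token", "")
        if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
            return "bad token"
    return None
```

In this sketch `expected_token` would come from `os.environ.get("TASKS_SERVER_DISPATCH_TOKEN")`, so the token check is skipped when the variable is unset, matching the "optional" wording above.
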
Possible Solutions
- A (chat-only, no tools): Prose and pasted diffs only. Too weak for research/edit goals.
- B (full in-browser Claude Code clone): Too large; avoid.
- C (hybrid, recommended): Streaming chat driven by OpenAI (or another single cloud API) + bounded tools: read file under repo root (path allowlist), ripgrep/search, apply patch or write scoped paths (tasks/docs only if you want tighter safety). Add `run_claude_subagent` (subprocess `claude` with a structured prompt, cwd=repo root, timeout, stdout/stderr captured) so the OpenAI orchestrator can delegate to Claude for research-heavy turns; no worktree in this path. Add `register_task_for_manager` (or equivalent) that runs allowlisted `task` CLI updates / status transitions so implementation stays on the manager + worker worktrees.
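
A sub-agent delegation of the kind option C describes (subprocess, cwd at the repo root, timeout, stdout/stderr captured) might look roughly like the sketch below. The function name `run_subagent`, the result dictionary shape, and passing the prompt as the final argv item are assumptions for illustration; the default command mirrors the `claude -p` invocation recorded under Tools.

```python
import subprocess
from typing import Sequence


def run_subagent(
    prompt: str,
    cwd: str,
    timeout_s: float = 120.0,
    command: Sequence[str] = ("claude", "-p"),
) -> dict:
    """Run a CLI sub-agent with the prompt as the last argv item and capture its output."""
    try:
        proc = subprocess.run(
            [*command, prompt],   # argv list, never a shell string
            cwd=cwd,              # run from the repo root; no worktree is created
            capture_output=True,  # collect stdout and stderr for the tool result
            text=True,
            timeout=timeout_s,
            check=False,          # a non-zero exit is reported, not raised
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": f"timed out after {timeout_s}s"}
    except FileNotFoundError:
        return {"ok": False, "error": f"{command[0]} not found"}
    return {
        "ok": proc.returncode == 0,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Keeping `command` as a parameter makes the wrapper testable without the Claude CLI installed, and passing an argv list instead of a shell string means the prompt text is never interpreted by a shell.
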
Plan
- Land after CLI convergence dependency; wire tool documentation and argv allowlists to the `just task …` / `plans` successor entrypoint as implemented there.
- UI: right drawer + top control on all `_render_docs_shell` pages; embed page kind + `data-doc-rel` (or absent for non-file routes like `/status`).
- Backend: SSE chat endpoint; use `OPENAI_API_KEY` / `OPENAI_MODEL` from `os.environ` after `.env` load; system prompt includes current doc identity + optional file snapshot; server-side conversation keyed (e.g. cookie + doc path).
- Tools (implement minimally, expand later): read/search/edit within policy; `run_claude_subagent` subprocess with strict args and resource limits; `register_task_for_manager` calling the documented manager contract (status ready + any register/snapshot hooks the repo already uses; see `scripts/task_manager_lib`, heartbeat, existing Build/dispatch docs).
- Security: same family of checks as task-server POST actions for anything that touches disk, subprocess, or API quota.
- Tests: tool allowlist, prompt construction from fixture path; optional Playwright for drawer.
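
The "current doc identity + optional file snapshot" part of the backend step could be assembled along these lines. This is a sketch under stated assumptions: `build_system_prompt`, the prompt wording, and the `MAX_SNAPSHOT_CHARS` cap are hypothetical, and `doc_rel` is assumed to have already passed the server's path-safety check.

```python
from pathlib import Path
from typing import Optional

MAX_SNAPSHOT_CHARS = 20_000  # assumed cap so a large doc cannot flood the prompt


def build_system_prompt(page_kind: str, doc_rel: Optional[str], repo_root: Path) -> str:
    """Build a system prompt naming the current page and, for file routes, its content."""
    lines = [
        "You are a documentation assistant for this repository.",
        f"Page kind: {page_kind}",
    ]
    if doc_rel is None:
        # Non-file routes such as /status carry no document path.
        lines.append("Current document: (none; non-file route)")
        return "\n".join(lines)

    lines.append(f"Current document: {doc_rel}")
    doc_path = repo_root / doc_rel
    if doc_path.is_file():
        text = doc_path.read_text(encoding="utf-8", errors="replace")
        lines.append("--- document snapshot ---")
        lines.append(text[:MAX_SNAPSHOT_CHARS])
        if len(text) > MAX_SNAPSHOT_CHARS:
            lines.append("[snapshot truncated]")
    return "\n".join(lines)
```

Because the function takes the repo root as an argument, it can be exercised against a fixture directory, which fits the "prompt construction from fixture path" test item.
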
Implementation Progress
Architecture
- `tasks/tools/tasks_server_chat.py`: new module containing tool definitions, tool execution, system prompt construction, and the SSE streaming chat endpoint. Keeps chat logic separate from the main server.
- `tasks/tools/tasks_server.py`: wired `POST /actions/chat` route (same loopback + Referer + token guards as dispatch), added `GET /tasks-assets/tasks_chat.js` asset route, embedded chat drawer HTML into `_render_docs_shell`.
- `tasks/javascripts/tasks_chat.js`: client-side SSE consumer that toggles the drawer, sends messages, streams content/tool_call/tool_result/done events, and applies minimal markdown formatting.
- `tasks/stylesheets/tasks_server_shell.css`: chat drawer styles (right-side panel, 22rem, responsive collapse on mobile).
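
For reference, the server-to-client framing for the event types named above (content, tool_call, tool_result, done) can be sketched as follows. The helper names and the JSON payload shapes are assumptions; only the event names come from this document, and the `event:` / `data:` line format follows the standard Server-Sent Events wire format.

```python
import json
from typing import Any, Iterable, Iterator


def sse_event(event: str, data: Any) -> str:
    """Serialize one SSE frame; json.dumps escapes newlines so data stays on one line."""
    payload = json.dumps(data, ensure_ascii=False)
    return f"event: {event}\ndata: {payload}\n\n"


def stream_reply(text_chunks: Iterable[str]) -> Iterator[str]:
    """Yield a content frame per chunk, then a terminating done frame."""
    for chunk in text_chunks:
        yield sse_event("content", {"text": chunk})
    yield sse_event("done", {})
```

A tool round trip would interleave `sse_event("tool_call", ...)` and `sse_event("tool_result", ...)` frames between content frames in the same stream.
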
Tools (5 implemented)
| Tool | Description |
|---|---|
| `read_file` | Read any file in repo (path-safe, 100KB limit) |
| `search_files` | ripgrep search with optional glob filter |
| `edit_file` | String replace scoped to `tasks/`, `docs/`, `ai_notes/` |
| `run_claude_subagent` | subprocess `claude -p` with 120s timeout |
| `register_task_for_manager` | Set task status via converged `task update` CLI |
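
One way to keep the tool loop allowlisted is to dispatch by name through a fixed handler table and return errors as data, so the model sees a tool result instead of the server raising. The sketch below assumes tool arguments arrive as a JSON string, as in OpenAI-style function calling; `execute_tool_call` and the error messages are hypothetical names, not the ones in `tasks_server_chat.py`.

```python
import json
from typing import Any, Callable, Dict

ToolHandler = Callable[..., Dict[str, Any]]


def execute_tool_call(
    name: str,
    arguments_json: str,
    handlers: Dict[str, ToolHandler],
) -> Dict[str, Any]:
    """Run one tool call if the tool is in the allowlist; otherwise return an error dict."""
    handler = handlers.get(name)
    if handler is None:
        # Anything not registered is refused; the model cannot invent tools.
        return {"error": f"unknown tool: {name}"}
    try:
        args = json.loads(arguments_json or "{}")
    except json.JSONDecodeError as exc:
        return {"error": f"invalid arguments JSON: {exc.msg}"}
    if not isinstance(args, dict):
        return {"error": "arguments must be a JSON object"}
    try:
        return handler(**args)
    except TypeError as exc:
        # Typically missing or unexpected keyword arguments from the model.
        return {"error": f"bad arguments: {exc}"}
```

The `handlers` table would hold exactly the five tools listed above, each wrapping its own policy checks.
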
Security
- Same guards as `/actions/dispatch`: loopback client, Referer origin, optional `TASKS_SERVER_DISPATCH_TOKEN`.
- `OPENAI_API_KEY` required; returns 503 if not configured.
- Path traversal blocked in all file tools.
- Edit tool restricted to allowlisted prefixes.
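
The path-traversal and edit-scope rules above can be illustrated with a small resolver. This is a sketch, not the module's actual helpers: `resolve_in_repo`, `is_editable`, and `EDIT_PREFIXES` are assumed names, with the prefixes taken from the `edit_file` row in the tools table.

```python
from pathlib import Path
from typing import Optional

EDIT_PREFIXES = ("tasks", "docs", "ai_notes")  # top-level directories open to edit_file


def resolve_in_repo(repo_root: Path, rel_path: str) -> Optional[Path]:
    """Resolve rel_path against the repo root; return None if it escapes the root."""
    root = repo_root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / rel_path).resolve()
    if candidate != root and root not in candidate.parents:
        return None
    return candidate


def is_editable(repo_root: Path, rel_path: str) -> bool:
    """Allow edits only to files under one of the allowlisted top-level directories."""
    target = resolve_in_repo(repo_root, rel_path)
    if target is None:
        return False
    parts = target.relative_to(repo_root.resolve()).parts
    return len(parts) > 1 and parts[0] in EDIT_PREFIXES
```

Checking the resolved path rather than the raw string matters here: `tasks/../src/main.py` starts with an allowed prefix but resolves outside it, so a plain `startswith` test would wrongly accept it.
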
Tests (13 new)
- `tests/core/test_tasks_server_chat.py`: tool definitions, path safety, tool execution (read, search, register), prompt construction, endpoint auth, SSE response, API key requirement, shell HTML includes chat.
- All 73 tests pass (60 existing + 13 new).
QA Exploration
- [x] N/A — chat drawer requires live OPENAI_API_KEY for interactive testing; UI structure verified via automated tests (shell includes chat HTML, endpoint returns SSE).
Review Feedback
- [ ] Review cleared