Make qa-explorer use local browser subagent without cbox fallback
Problem
Make qa-explorer run through the local subagent/browser path instead of any cbox fallback. Ensure the browser automation stack is available to claude -p style task execution, and isolate concurrent browser sessions so multiple QA runs do not interfere with each other. Before locking the implementation, evaluate whether Claude recommends a better-supported browser automation path than Playwright.
Context
- qa-explorer should validate browser-facing changes from the local task flow, not bounce into any legacy cbox path.
- The task-manager flow already standardizes local worktrees, scripts/dispatch, and per-worktree .worktree-ports.json manifests for concurrent QA targets.
- The current failure mode is twofold:
  - local browser automation is not reliably available to the agent runtime that launches qa-explorer
  - the skill/docs still leave room for legacy fallback behavior instead of failing fast on missing local browser capability
- The desired outcome is a local subagent-driven QA path that works with claude -p style task execution and does not let concurrent browser runs stomp on each other.
- If Claude or the repo now has a better-supported browser automation stack than Playwright, this task should prefer that option instead of preserving Playwright out of habit.
Possible Solutions
- Recommended: keep qa-explorer local-only and run it through a dedicated browser subagent that receives the worktree cloud_url, provisions an isolated browser profile/session per run, and fails fast with a clear blocker when browser tooling is unavailable. This matches the non-cbox task-manager architecture and keeps concurrency concerns inside one runner contract.
- Patch the current Playwright MCP wiring only, keep the rest of the skill contract unchanged. This is smaller, but risks preserving hidden coupling to whichever local browser surface happened to work first.
- Replace Playwright with another Claude-supported browser driver if it has clearly better support for concurrent isolated sessions. This is only worth doing if the newer path is meaningfully more reliable and still scriptable from the local subagent flow.
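
The fail-fast URL handoff in the recommended option could look roughly like the sketch below. It is only an illustration: the manifest layout (a top-level cloud_url key in .worktree-ports.json), the QABlocker exception, and the resolve_target_url helper are all hypothetical names, not the repo's actual contract.

```python
import json
from pathlib import Path


class QABlocker(RuntimeError):
    """Raised when local QA cannot proceed; callers must not fall back to cbox."""


def resolve_target_url(worktree: Path) -> str:
    # Assumed manifest layout: {"cloud_url": "http://127.0.0.1:8100", ...}
    manifest = worktree / ".worktree-ports.json"
    if not manifest.is_file():
        raise QABlocker(f"no manifest at {manifest}; start the worktree stack first")
    try:
        data = json.loads(manifest.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise QABlocker(f"{manifest} is not valid JSON: {exc}") from exc
    url = data.get("cloud_url") if isinstance(data, dict) else None
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise QABlocker(f"{manifest} has no usable cloud_url")
    return url
```

Raising a dedicated blocker type, rather than returning None, keeps a missing URL from being mistaken for a reason to try another execution path.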
Plan
- Audit qa-explorer, task-manager, and any local browser/subagent entrypoints to document the intended local-only execution path.
- Decide the canonical browser automation backend:
  - keep Playwright if it is the best-supported local option
  - otherwise switch to the better-supported Claude/browser runner and update docs/contracts accordingly
- Make the browser QA invocation happen through a local subagent/worker flow that is compatible with claude -p task execution and receives the target URL from .worktree-ports.json or scripts/dispatch-watch.
- Ensure each QA run gets isolated browser state so multiple concurrent runs can execute safely:
  - separate browser profile/session directories
  - separate artifact/output directories
  - no shared mutable global browser cache that causes cross-run interference
- Remove or explicitly reject any cbox fallback in the skill and surrounding docs so failures surface as local-tooling blockers instead of silently taking the old path.
- Add focused validation/tests/docs for:
  - missing-browser-tooling failure mode
  - concurrent QA run isolation
  - worktree URL handoff into the browser runner
  - expected subagent invocation path
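
The per-run isolation step can be sketched as follows. This is a minimal illustration, assuming a run ID built from a timestamp plus the process ID and a runs/ layout with profile/ and artifacts/ subdirectories. RunDirs and allocate_run_dirs are hypothetical names, not the wrapper's real API.

```python
import os
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class RunDirs:
    root: Path
    profile: Path    # browser profile/session state for this run only
    artifacts: Path  # screenshots, traces, and other output for this run only


def allocate_run_dirs(base: Path, run_id: Optional[str] = None) -> RunDirs:
    # Assumed ID shape: <timestamp>-<pid>, so concurrent processes get distinct roots.
    if run_id is None:
        run_id = f"{time.strftime('%Y%m%dT%H%M%S')}-{os.getpid()}"
    root = base / "runs" / run_id
    # exist_ok=False: a colliding run ID is an error, never a shared directory.
    root.mkdir(parents=True, exist_ok=False)
    profile = root / "profile"
    artifacts = root / "artifacts"
    profile.mkdir()
    artifacts.mkdir()
    return RunDirs(root=root, profile=profile, artifacts=artifacts)
```

Refusing to reuse an existing run directory is the design choice that matters here: two runs can never silently share browser state, and a collision shows up as an immediate error.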
Implementation Progress
- Added scripts/qa-explore, a local-only wrapper that runs claude -p with a per-run Playwright MCP config, unique browser profile/output directories, and explicit failure checks for missing browser tooling or an unreachable target URL.
- Added script coverage for qa-explore dry-run URL/MCP wiring and its fail-fast path when no worktree URL is available.
- Updated qa-explorer, task-manager, AGENTS.md, and justfile to use the local wrapper instead of any cbox fallback.
- Switched the wrapper to safe-by-default permissions, moved MCP config JSON generation into Python to avoid shell/JSON injection issues, and tightened the tests around dangerous-mode opt-in plus --allowed-hosts wiring.
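
To illustrate why generating the MCP config in Python avoids injection problems, here is a hedged sketch, not the actual contents of scripts/qa-explore. The server package name, the Playwright MCP flags, and the Claude CLI flags shown are assumptions about those tools. The helper names build_mcp_config and build_claude_argv are hypothetical.

```python
import json
from pathlib import Path


def build_mcp_config(profile_dir: Path, output_dir: Path, allowed_hosts: list[str]) -> str:
    # Assumed Playwright MCP invocation; verify flag names against the installed version.
    args = [
        "@playwright/mcp@latest",
        "--user-data-dir", str(profile_dir),
        "--output-dir", str(output_dir),
    ]
    if allowed_hosts:
        args += ["--allowed-hosts", ",".join(allowed_hosts)]
    config = {"mcpServers": {"playwright": {"command": "npx", "args": args}}}
    # json.dumps escapes quotes and backslashes in paths, unlike string interpolation.
    return json.dumps(config, indent=2)


def build_claude_argv(prompt: str, mcp_config_path: Path, dangerous: bool = False) -> list[str]:
    # An argv list (no shell) keeps the prompt from being reinterpreted by a shell.
    argv = ["claude", "-p", prompt, "--mcp-config", str(mcp_config_path)]
    if dangerous:
        # Dangerous mode is explicit opt-in only; the default stays safe.
        argv.append("--dangerously-skip-permissions")
    return argv
```

The resulting list is meant to be passed to subprocess.run without shell=True, so neither the prompt nor any path is parsed by a shell.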
QA Exploration
A live acceptance run of scripts/qa-explore against the task worktree stack on http://127.0.0.1:8100 successfully launched a dedicated claude -p browser worker, a run-scoped Playwright MCP server, and isolated artifacts under .qa-explorer/runs/20260318T140656-49725/.
Observed UX issues from that run:
- Signed-in suite home leaves too much dead whitespace above the fold.
- Org home/chat landing page pushes meaningful content too far down, so the first screen feels empty.
- Dashboard/project breadcrumbs truncate aggressively and reduce orientation.
- Dashboard right-rail action icons are hard to interpret without labels.
- Boards listing cards leave awkward whitespace and uneven visual weight.
- The suite still 404s on favicon.ico.
- [x] QA exploration completed (or N/A for non-UI tasks)
Review Feedback
- Review initially blocked on scripts/qa-explore defaulting to dangerous permissions plus a few smaller hardening concerns.
- Fixed the wrapper to default to safe permissions, generate JSON config via Python, report connection failures clearly, and keep dangerous mode as explicit opt-in.
- Final just review approved the branch with only non-blocking maintenance notes.
- [x] Review cleared