Make qa-explorer use local browser subagent without cbox fallback

ID	INFRA_TOOLING-MAKE_QA_EXPLORER_USE_LOCAL_BROWSER_SUBAGENT_WITHOUT_CBOX_FALLBACK
Status	completed
Priority	p1
Milestone	m1-ft-analytics-analyst-pilot
Owner	sr-engineer-architect
Completed by	dave
Completed	2026-03-18

Problem

Make qa-explorer run through the local subagent/browser path instead of any cbox fallback. The browser automation stack must be available to claude -p style task execution, and concurrent browser sessions must be isolated so that multiple QA runs do not interfere with each other. Before locking the implementation, evaluate whether Claude recommends a better-supported browser automation path than Playwright.

Context

qa-explorer should validate browser-facing changes from the local task flow, not bounce into any legacy cbox path.
The task-manager flow already standardizes local worktrees, scripts/dispatch, and per-worktree .worktree-ports.json manifests for concurrent QA targets.
The current failure mode is twofold:
- Local browser automation is not reliably available to the agent runtime that launches qa-explorer.
- The skill/docs still leave room for legacy fallback behavior instead of failing fast on missing local browser capability.
The desired outcome is a local subagent-driven QA path that works with claude -p style task execution and does not let concurrent browser runs stomp on each other.
If Claude or the repo now has a better-supported browser automation stack than Playwright, this task should prefer that option instead of preserving Playwright out of habit.

Possible Solutions

Recommended: keep qa-explorer local-only and run it through a dedicated browser subagent that receives the worktree cloud_url, provisions an isolated browser profile/session per run, and fails fast with a clear blocker when browser tooling is unavailable. This matches the non-cbox task-manager architecture and keeps concurrency concerns inside one runner contract.
Patch only the current Playwright MCP wiring and keep the rest of the skill contract unchanged. This is smaller, but risks preserving hidden coupling to whichever local browser surface happened to work first.
Replace Playwright with another Claude-supported browser driver if it has clearly better support for concurrent isolated sessions. This is only worth doing if the newer path is meaningfully more reliable and still scriptable from the local subagent flow.
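
The fail-fast behavior in the recommended option could look roughly like the sketch below. It assumes the wrapper needs the claude and npx executables on PATH; both names, and the helper names missing_tools and require_browser_tooling, are hypothetical, since the exact prerequisites are not listed here.

```python
import shutil


def missing_tools(required=("claude", "npx")):
    """Return the names in `required` that are not found on PATH."""
    return [name for name in required if shutil.which(name) is None]


def require_browser_tooling(required=("claude", "npx")):
    """Stop with a clear blocker message instead of falling back to another path."""
    missing = missing_tools(required)
    if missing:
        # SystemExit with a string prints the message to stderr and exits non-zero.
        raise SystemExit(
            "qa-explore blocked: missing local tooling: " + ", ".join(missing)
        )
```

Calling require_browser_tooling() at the top of the wrapper would surface a missing dependency as a local-tooling blocker before any browser session is started.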

Plan

Audit qa-explorer, task-manager, and any local browser/subagent entrypoints to document the intended local-only execution path.
Decide the canonical browser automation backend:
- keep Playwright if it is the best-supported local option
- otherwise switch to the better-supported Claude/browser runner and update docs/contracts accordingly
Make the browser QA invocation happen through a local subagent/worker flow that is compatible with claude -p task execution and receives the target URL from .worktree-ports.json or scripts/dispatch-watch.
Ensure each QA run gets isolated browser state so multiple concurrent runs can execute safely:
- separate browser profile/session directories
- separate artifact/output directories
- no shared mutable global browser cache that causes cross-run interference
Remove or explicitly reject any cbox fallback in the skill and surrounding docs so failures surface as local-tooling blockers instead of silently taking the old path.
Add focused validation/tests/docs for:
- missing-browser-tooling failure mode
- concurrent QA run isolation
- worktree URL handoff into the browser runner
- expected subagent invocation path
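
The URL handoff and per-run isolation steps of the plan can be sketched as follows. This is an illustration under stated assumptions, not the actual scripts/qa-explore code: it assumes .worktree-ports.json is a JSON object with a top-level cloud_url key, and the profile/output subdirectory names, the random run-ID suffix, and the QABlocker exception are all hypothetical.

```python
import json
import secrets
from datetime import datetime, timezone
from pathlib import Path


class QABlocker(RuntimeError):
    """Raised when a local prerequisite for the QA run is missing."""


def resolve_target_url(worktree: Path) -> str:
    """Read the target URL from the worktree manifest, or fail fast."""
    manifest = worktree / ".worktree-ports.json"
    if not manifest.is_file():
        raise QABlocker(f"no manifest at {manifest}")
    try:
        data = json.loads(manifest.read_text())
    except json.JSONDecodeError as exc:
        raise QABlocker(f"{manifest} is not valid JSON: {exc}") from exc
    # Assumption: the URL lives under a top-level "cloud_url" key.
    url = data.get("cloud_url") if isinstance(data, dict) else None
    if not url:
        raise QABlocker(f"{manifest} has no cloud_url")
    return url


def make_run_dirs(worktree: Path) -> dict[str, Path]:
    """Create a fresh directory tree for one run; never reuse an existing one."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")
    # Assumption: a random suffix keeps runs started in the same second apart.
    run_dir = worktree / ".qa-explorer" / "runs" / f"{stamp}-{secrets.token_hex(4)}"
    run_dir.mkdir(parents=True, exist_ok=False)  # raises if the run dir already exists
    dirs = {"run": run_dir, "profile": run_dir / "profile", "output": run_dir / "output"}
    dirs["profile"].mkdir()
    dirs["output"].mkdir()
    return dirs
```

Because every run gets its own profile and output directories, two concurrent invocations against the same worktree share no mutable browser state on disk.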

Implementation Progress

Added scripts/qa-explore, a local-only wrapper that runs claude -p with a per-run Playwright MCP config, unique browser profile/output directories, and explicit failure checks for missing browser tooling or an unreachable target URL.
Added script coverage for qa-explore dry-run URL/MCP wiring and its fail-fast path when no worktree URL is available.
Updated qa-explorer, task-manager, AGENTS.md, and justfile to use the local wrapper instead of any cbox fallback.
Switched the wrapper to safe-by-default permissions, moved MCP config JSON generation into Python to avoid shell/JSON injection issues, and tightened the tests around dangerous-mode opt-in plus --allowed-hosts wiring.
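
To illustrate why generating the MCP config in Python avoids shell/JSON injection, here is a minimal sketch. The mcpServers layout follows the usual MCP config shape; the @playwright/mcp package name and the --user-data-dir and --output-dir flags are assumptions to verify against the installed Playwright MCP version, and build_mcp_config is a hypothetical helper name rather than the wrapper's real function.

```python
import json
from pathlib import Path


def build_mcp_config(profile_dir: Path, output_dir: Path) -> str:
    """Serialize a run-scoped MCP config; json.dumps handles all escaping."""
    config = {
        "mcpServers": {
            "playwright": {
                "command": "npx",
                "args": [
                    "@playwright/mcp@latest",  # assumed package name
                    "--user-data-dir", str(profile_dir),  # assumed flag
                    "--output-dir", str(output_dir),  # assumed flag
                ],
            }
        }
    }
    return json.dumps(config, indent=2)
```

Paths are passed as data rather than spliced into a shell-built string, so a directory name containing quotes or $(...) ends up as an escaped JSON string instead of altering the config structure.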

QA Exploration

A live acceptance run of scripts/qa-explore against the task worktree stack on http://127.0.0.1:8100 successfully launched a dedicated claude -p browser worker, a run-scoped Playwright MCP server, and isolated artifacts under .qa-explorer/runs/20260318T140656-49725/.

Observed UX issues from that run:

Signed-in suite home leaves too much dead whitespace above the fold.
Org home/chat landing page pushes meaningful content too far down, so the first screen feels empty.
Dashboard/project breadcrumbs truncate aggressively and reduce orientation.
Dashboard right-rail action icons are hard to interpret without labels.
Boards listing cards leave awkward whitespace and uneven visual weight.
The suite still 404s on favicon.ico.
[x] QA exploration completed (or N/A for non-UI tasks)

Review Feedback

Review initially blocked on scripts/qa-explore defaulting to dangerous permissions plus a few smaller hardening concerns.
Fixed the wrapper to default to safe permissions, generate JSON config via Python, report connection failures clearly, and keep dangerous mode as explicit opt-in.
The final run of just review approved the branch with only non-blocking maintenance notes.
[x] Review cleared