Dataface Tasks

Add lightweight qa-explorer verification artifacts and trace capture

ID: INFRA_TOOLING-ADD_LIGHTWEIGHT_QA_EXPLORER_VERIFICATION_ARTIFACTS_AND_TRACE_CAPTURE
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect
Completed by: dave
Completed: 2026-03-18

Problem

Upgrade qa-explorer local visual verification with lightweight, ephemeral, gitignored per-run artifacts instead of repo-committed evidence. Add a stable artifact bundle contract, structured markdown/json QA summaries, console and network error summaries, and optional Playwright trace/session capture when it stays lightweight. Do not auto-link evidence into PRs or task files beyond normal human-written summaries, and skip before/after visual comparison work for now.

Context

The qa-explorer skill (scripts/qa-explore) already creates per-run artifact directories under .qa-explorer/runs/<run_id>/ with browser profiles, Playwright output, and trace/session capture enabled. Before this task, however, there was no stable contract for what a run produces, no structured summary output, and no explicit diagnostics capture for console errors or failed network requests.

Key files:

  • .codex/skills/qa-explorer/SKILL.md — skill documentation and worker instructions
  • scripts/qa-explore — bash wrapper that launches the claude worker
  • tests/scripts/test_dispatch_scripts.py — integration tests for the script

Possible Solutions

Recommended: Prompt + SKILL.md contract approach. Define the artifact bundle contract in SKILL.md (which the worker reads), and add explicit instructions in the prompt to write summary.md and summary.json. This is lightweight — no new dependencies, no new scripts, just documentation and a prompt update.
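As a rough illustration of what the two summary files could contain, here is a minimal Python sketch. The helper name `write_summaries` and every field name it reads (`run_id`, `findings`, `diagnostics`, `console_errors`, `failed_requests`, `suggested_tests`) are assumptions made for illustration. The real schema is whatever SKILL.md defines, and under the recommended approach the worker writes these files itself rather than calling a helper like this.

```python
import json
from pathlib import Path


def write_summaries(run_dir: Path, summary: dict) -> None:
    """Write summary.json and a matching summary.md into run_dir.

    The keys read from `summary` are hypothetical, not the SKILL.md schema.
    """
    run_dir.mkdir(parents=True, exist_ok=True)

    # Machine-readable summary: the dict is stored as-is.
    (run_dir / "summary.json").write_text(json.dumps(summary, indent=2) + "\n")

    # Human-readable summary built from the same data.
    diagnostics = summary.get("diagnostics", {})
    lines = [f"# QA summary: {summary.get('run_id', 'unknown')}", "", "## Findings"]
    lines += [f"- {item}" for item in summary.get("findings", [])] or ["- none"]
    lines += ["", "## Diagnostics"]
    lines.append(f"- console errors: {len(diagnostics.get('console_errors', []))}")
    lines.append(f"- failed requests: {len(diagnostics.get('failed_requests', []))}")
    lines += ["", "## Suggested tests"]
    lines += [f"- {item}" for item in summary.get("suggested_tests", [])] or ["- none"]
    (run_dir / "summary.md").write_text("\n".join(lines) + "\n")
```

Deriving both files from one dictionary keeps the markdown and JSON views from drifting apart, which is one way to honor a stable bundle contract.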

Alternative: Post-processing script. Add a script that parses claude worker output and generates summaries. Rejected: over-engineered for the current need, and the worker already has full context to write the summaries itself.

Plan

  1. Add "Artifact Bundle Contract" section to SKILL.md defining the per-run layout, summary.md format, summary.json schema, diagnostics capture requirements, and trace/session documentation.
  2. Update scripts/qa-explore prompt to instruct the worker to write both summary files and capture console/network diagnostics.
  3. Follow TDD: write the failing test first, then implement.
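The diagnostics capture named in step 2 amounts to filtering browser events down to console errors and failed network requests. The sketch below shows one possible shape for that reduction. The function name `summarize_diagnostics` and the event dictionaries (`type`, `text`, `url`, `status`, `failure`) are hypothetical stand-ins, not Playwright objects or the format used by the skill.

```python
def summarize_diagnostics(console_events, network_events):
    """Reduce raw browser events to the two diagnostics lists.

    Event shapes are assumed for illustration:
      console event: {"type": "error" | "log" | ..., "text": str}
      network event: {"url": str, "status": int | None, "failure": str | None}
    """
    # Keep only console messages reported at error level.
    console_errors = [e["text"] for e in console_events if e.get("type") == "error"]

    # A request counts as failed if it never completed (failure text set)
    # or completed with an HTTP status of 400 or higher.
    failed_requests = [
        {"url": r["url"], "status": r.get("status"), "failure": r.get("failure")}
        for r in network_events
        if r.get("failure") or (r.get("status") or 0) >= 400
    ]
    return {"console_errors": console_errors, "failed_requests": failed_requests}
```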

Implementation Progress

Changes made

  1. .codex/skills/qa-explorer/SKILL.md — Added "Artifact Bundle Contract" section with:
     • Per-run directory layout diagram
     • summary.md — human-readable report persisted to disk
     • summary.json — machine-readable schema with findings, diagnostics, and suggested tests
     • Diagnostics capture instructions (console errors, failed network requests)
     • Playwright trace/session documentation

  2. scripts/qa-explore — Extended the prompt injected into the claude worker with explicit instructions to write summary.md and summary.json, and to capture console errors and failed network requests as diagnostics.

  3. tests/scripts/test_dispatch_scripts.py — Added test_qa_explore_prompt_instructs_structured_artifact_output that verifies the prompt includes references to summary.md, summary.json, and console error capture.
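The check described in item 3 can be pictured as a simple substring assertion over the prompt text. The following is a sketch only, not the actual test in tests/scripts/test_dispatch_scripts.py: `SAMPLE_PROMPT`, `REQUIRED_REFERENCES`, and `missing_references` are hypothetical names, and the real test obtains the prompt from scripts/qa-explore in a way this task file does not describe.

```python
# Hypothetical stand-in for the prompt text that scripts/qa-explore injects.
SAMPLE_PROMPT = (
    "Write summary.md and summary.json into the run directory. "
    "Capture console errors and failed network requests as diagnostics."
)

# Phrases the prompt is expected to mention (assumed wording).
REQUIRED_REFERENCES = ("summary.md", "summary.json", "console error")


def missing_references(prompt: str, required=REQUIRED_REFERENCES) -> list[str]:
    """Return the required phrases that do not appear in the prompt."""
    lowered = prompt.lower()
    return [phrase for phrase in required if phrase.lower() not in lowered]


def test_prompt_instructs_structured_artifact_output():
    # An empty list means every required reference is present.
    assert missing_references(SAMPLE_PROMPT) == []
```

Returning the missing phrases, rather than a bare boolean, makes a failing assertion show exactly which instruction dropped out of the prompt.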

What was intentionally excluded (per task scope)

  • No auto-linking of evidence into PR bodies or task files
  • No before/after visual comparison or visual diff baselines
  • Artifacts remain ephemeral and gitignored under .qa-explorer/

QA Exploration

N/A — this is infrastructure/skill tooling, not a UI change.

  • [x] QA exploration completed (or N/A for non-UI tasks)

Review Feedback

  • just review verdict: APPROVED. Review called out only a minor style note about long assertion lines in the new test, with no material correctness or security issues.
  • scripts/pr-validate pre passed after a clean rebase onto origin/main and a full local CI run.

  • [x] Review cleared