v1.0 stability and defect burn-down

ID	M4_V1_0_LAUNCH-IDE_EXTENSION-01
Status	not_started
Priority	p1
Milestone	m4-v1-0-launch
Owner	ui-design-frontend-dev

Problem

After public launch, defect reports will arrive from a much wider range of environments and usage patterns than internal testing covered. Without a structured stability program — recurring defect triage, severity-based SLAs, reliability metrics, and trend tracking — the team will operate reactively, fixing whatever is loudest rather than what matters most. Known defects will accumulate without a burn-down cadence, reliability will degrade silently between releases, and there will be no data to answer "is the extension getting more or less stable over time?" A v1.0 label requires demonstrable reliability, not just feature completeness.

Context

After launch, recurring defects in the analyst authoring workflow in VS Code/Cursor (preview, diagnostics, and assist) will damage trust faster than new features can restore it, so this phase should prioritize stability over new scope.
The goal is to identify the repeat offenders, remove the highest support burden, and make failure patterns measurable enough that the team knows whether quality is improving.
Expected touchpoints include apps/ide/vscode-extension/, the preview/inspector runtime code, and the extension docs and tests. Inputs include bug history, support or incident notes, and any test or QA gaps that let defects recur.
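
One way to make "is quality improving?" answerable is to compare a normalized defect rate between consecutive releases. The sketch below is illustrative only: the ReleaseStats shape, the per-1,000-installs normalization, and the 5% tolerance are assumptions, not existing project definitions.

```typescript
// Hypothetical shape; the real data source (issue tracker, telemetry) is not specified in this ticket.
interface ReleaseStats {
  release: string;
  defectReports: number;   // defect reports attributed to this release
  activeInstalls: number;  // assumed normalization basis
}

type Trend = "improving" | "degrading" | "flat";

// Defect reports per 1,000 active installs; returns 0 when there are no installs.
function defectRatePerThousand(s: ReleaseStats): number {
  return s.activeInstalls === 0 ? 0 : (s.defectReports / s.activeInstalls) * 1000;
}

// Compares two releases; relative changes within `tolerance` count as "flat".
function stabilityTrend(
  previous: ReleaseStats,
  current: ReleaseStats,
  tolerance = 0.05, // assumed threshold
): Trend {
  const before = defectRatePerThousand(previous);
  const after = defectRatePerThousand(current);
  if (before === 0) return after === 0 ? "flat" : "degrading";
  const change = (after - before) / before;
  if (change < -tolerance) return "improving";
  if (change > tolerance) return "degrading";
  return "flat";
}
```

Normalizing by installs keeps a growing user base from looking like a stability regression; any other denominator (sessions, previews rendered) would work the same way.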

Possible Solutions

A - Keep mixing bug fixes with feature work opportunistically: preserves flexibility, but lets long-tail reliability work stay perpetually unfinished.
B - Recommended: run an explicit stability program that ranks defect classes, burns down the highest-frequency issues, and pairs fixes with validation so regressions stop recurring.
C - Freeze all new work until zero known defects remain: simple in principle, but unrealistic and usually counterproductive.

Plan

Aggregate the recurring failures in the analyst authoring workflow in VS Code/Cursor (preview, diagnostics, and assist) from bugs, support notes, and recent releases, then rank them by user impact and repeat rate.
Turn the top defect classes into a concrete burn-down list with owners, acceptance criteria, and the validation needed to keep each fix from regressing.
Land or schedule the highest-leverage fixes first, including any docs or operator changes that reduce repeat incidents.
Review the remaining defect mix after the first burn-down pass and update the next tranche of work based on actual stability improvements.
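
The ranking in the first step could be sketched roughly as follows. This is a minimal illustration, not an agreed scoring model: the DefectReport shape, the severity labels, the weights, and the example class names are all hypothetical.

```typescript
// Hypothetical severity labels and report shape; adjust to whatever triage actually records.
type Severity = "s1" | "s2" | "s3";

interface DefectReport {
  defectClass: string; // e.g. "preview-blank"
  severity: Severity;
}

interface RankedDefectClass {
  defectClass: string;
  reports: number; // repeat rate: how many reports fell into this class
  score: number;   // sum of severity weights across those reports
}

// Assumed impact weights; s1 is the most severe.
const SEVERITY_WEIGHT: Record<Severity, number> = { s1: 5, s2: 3, s3: 1 };

// Groups reports by class and orders classes by total weighted score (highest first).
// Ties are broken alphabetically so the output is deterministic.
function rankDefectClasses(reports: DefectReport[]): RankedDefectClass[] {
  const byClass = new Map<string, RankedDefectClass>();
  for (const r of reports) {
    const entry =
      byClass.get(r.defectClass) ?? { defectClass: r.defectClass, reports: 0, score: 0 };
    entry.reports += 1;
    entry.score += SEVERITY_WEIGHT[r.severity];
    byClass.set(r.defectClass, entry);
  }
  return [...byClass.values()].sort(
    (a, b) => b.score - a.score || a.defectClass.localeCompare(b.defectClass),
  );
}
```

Summing severity weights per class lets a frequent medium-severity defect outrank a rare high-severity one, which matches the intent of ranking by both user impact and repeat rate; the weights themselves would need tuning against real triage data.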

Implementation Progress

Review Feedback

[ ] Review cleared