# Build tasks DuckDB SQL metrics pipeline for milestone dashboards
## Problem
Milestone progress tracking in `tasks/` is manual and not queryable — status counts are maintained by hand in Markdown macros, and there is no way to run SQL against planning data or generate reproducible visualizations. This means milestone dashboards are brittle, go stale easily, and cannot be validated programmatically. The dashboard factory workstream itself lacks the kind of data-driven progress views it is supposed to enable for others.
## Context
- Source of truth is Markdown frontmatter across `tasks/milestones/*.md`, `tasks/workstreams/*/index.md`, and `tasks/workstreams/*/tasks/*.md` (400 tasks across 11 workstreams, 7 milestones).
- DuckDB and PyArrow are already project dependencies (`duckdb>=0.9.0`, `pyarrow>=14.0.1`).
- `dft render` supports CSV-backed queries and HTML output — the inline `values` type is unsupported in this flow, so metrics must be emitted as CSV files.
- The existing `tasks/frontmatter.py` provides `parse_frontmatter()` for YAML extraction.
- The `just tasks serve` recipe (tasks server) is completed and available — generated HTML can be linked/embedded there. MkDocs embedding remains optional.
- Generated artifacts go under `tasks/.generated/` (git-ignored).
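As a rough illustration of the extraction step, here is a simplified stand-in for the project's `parse_frontmatter()` (the real helper lives in `tasks/frontmatter.py`; this sketch assumes flat `key: value` frontmatter, which is enough to show how task fields become dataset rows):

```python
def extract_frontmatter(text: str) -> dict:
    """Parse a '---'-delimited frontmatter block.

    Simplified stand-in for the project's parse_frontmatter():
    handles only flat `key: value` pairs (no nesting, no lists).
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


doc = """---
id: m1-build-dataset
status: done
milestone: M1
---
# Task body
"""
print(extract_frontmatter(doc))
# {'id': 'm1-build-dataset', 'status': 'done', 'milestone': 'M1'}
```

The real implementation presumably uses a full YAML parser; the point here is only the shape of the output: one flat dict per Markdown file, ready to be collected into a table.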
## Possible Solutions
- **Python script with DuckDB SQL metrics (Recommended):** a single `build_plan_dataset.py` script parses frontmatter → Parquet, runs DuckDB SQL for rollups → CSV, then `dft render` produces static HTML. Minimal moving parts, uses existing project dependencies, deterministic outputs.
- **dbt model approach:** model planning data as dbt sources and run metrics through dbt. Heavier infrastructure; requires dbt project setup for internal data that isn't warehouse-backed.
- **Python-only aggregation (no DuckDB):** compute metrics in pure Python. Loses SQL queryability — the whole point is making planning data SQL-addressable.
## Plan
- [x] Write TDD tests for parsing and metrics (13 tests covering parse, export, DuckDB rollups)
- [x] Implement `tasks/tools/build_plan_dataset.py` with frontmatter → Parquet export
- [x] Add DuckDB SQL metrics: milestone rollups (total/completed/pct_complete/in_progress/blocked/open) and workstream-level breakdowns
- [x] Export metrics as both Parquet and CSV for `dft render` consumption
- [x] Create `tasks/dashboards/milestone_header.yml` dashboard template (progress bar + task status stacked bar)
- [x] Wire `dft render` into the build script to produce `tasks/.generated/charts/milestones/milestone_header.html`
- [x] Add `just plans-data` Justfile recipe
- [x] Add `tasks/.gitignore` for the `.generated/` directory
- [x] Verify end-to-end: `just plans-data` produces all artifacts from live data
## Implementation Progress
- 2026-03-23: Full pipeline implemented and tested.
Files added/changed:

| File | Description |
|---|---|
| `tasks/tools/build_plan_dataset.py` | Frontmatter → Parquet export, DuckDB SQL metrics, `dft render` chart generation |
| `tasks/dashboards/milestone_header.yml` | Dataface dashboard YAML: progress bar + task status stacked bar |
| `tasks/.gitignore` | Ignores the `.generated/` directory |
| `tests/core/test_build_plan_dataset.py` | 13 TDD tests covering parsing, Parquet export, and DuckDB metrics |
| `Justfile` | Added `just plans-data` recipe |
Generated artifacts (from `just plans-data`):

```
tasks/.generated/
├── parquet/
│   ├── milestones.parquet
│   ├── workstreams.parquet
│   ├── tasks.parquet
│   ├── milestone_metrics.parquet
│   ├── milestone_metrics.csv
│   ├── workstream_metrics.parquet
│   └── workstream_metrics.csv
└── charts/milestones/
    └── milestone_header.html
```
Verified against live data: 400 tasks, 7 milestones, 11 workstreams. M1 pilot shows 76.9% complete (90/117 tasks).
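The headline figure is easy to sanity-check with the same rounding the rollup query applies (one decimal place):

```python
# M1 pilot: 90 of 117 tasks complete.
completed, total = 90, 117
pct = round(100 * completed / total, 1)
print(pct)  # 76.9
```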
Out of scope:

- Historical trend snapshots/backfill
- Rich multi-chart dashboards beyond the compact header summary
- MkDocs macro integration (generated HTML can be linked/embedded separately)
## Review Feedback
- [ ] Review cleared
## 2026-03-22 Triage Decision
- Status set to `planned` (active backlog, not obsolete): implementation artifacts (`build_plan_dataset.py`, Parquet pipeline, milestone header render integration) are not in the repo yet, and no task PR exists. Keep in backlog rather than marking done/cancelled.