Evals

Browse eval dashboards and raw run artifacts through the tasks server.

Top-line eval dashboard

Compare SQL runs

Raw run files and directories

Latest runs

Run	Backend	Model	Provider	Context	Cases	Pass rate	Artifacts
`sql/20260325_043540_smoke_local`

Placement

Put generated eval run outputs under `apps/evals/runs/<family>/<run_id>/`. The tasks server picks them up automatically for the landing page and exposes them under `/evals/artifacts/`. Dataface eval faces are served under `/evals/faces/`.
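As a rough illustration of the placement rule above, the sketch below builds a run directory and writes a JSONL artifact into it. The helper names `run_dir` and `write_results` and the file name `results.jsonl` are assumptions made for this example; only the `apps/evals/runs/<family>/<run_id>/` layout comes from the text.

```python
import json
from pathlib import Path

# Layout taken from the text: apps/evals/runs/<family>/<run_id>/
RUNS_ROOT = Path("apps/evals/runs")


def run_dir(family: str, run_id: str, root: Path = RUNS_ROOT) -> Path:
    """Return the directory for one eval run (hypothetical helper)."""
    return root / family / run_id


def write_results(directory: Path, rows: list[dict], name: str = "results.jsonl") -> Path:
    """Write one JSON object per line; the default file name is an assumption."""
    directory.mkdir(parents=True, exist_ok=True)
    out = directory / name
    with out.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row) + "\n")
    return out
```

For the run listed above, `run_dir("sql", "20260325_043540_smoke_local")` resolves to `apps/evals/runs/sql/20260325_043540_smoke_local`. Passing a different `root` makes it easy to try the helpers against a scratch directory first.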

Current assumptions: eval dashboards are provided by the existing Dataface project in `apps/evals/`, and most raw artifacts are JSON or JSONL files, plus any generated static files inside each run directory.
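To check a local checkout against these assumptions, a small script along the following lines could enumerate runs and tally artifact types. It is a sketch only: `list_runs` and `artifact_kinds` are hypothetical helpers, and it does not reflect how the tasks server itself discovers runs.

```python
from collections import Counter
from pathlib import Path


def list_runs(root: Path) -> list[str]:
    """Return '<family>/<run_id>' for every run directory under root, sorted."""
    runs = []
    for family in sorted(p for p in root.iterdir() if p.is_dir()):
        for run in sorted(p for p in family.iterdir() if p.is_dir()):
            runs.append(f"{family.name}/{run.name}")
    return runs


def artifact_kinds(run_path: Path) -> Counter:
    """Count files in one run directory by suffix (e.g. '.json', '.jsonl')."""
    return Counter(p.suffix.lower() for p in run_path.rglob("*") if p.is_file())
```

Calling `list_runs(Path("apps/evals/runs"))` returns identifiers in the same `<family>/<run_id>` form used in the Latest runs table, and `artifact_kinds` shows at a glance whether a run contains anything other than JSON, JSONL, and static files.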