Dataface Tasks

Generalize same-agent repair for conflicted, CI-red, and unreconciled task PRs

ID: INFRA_TOOLING-GENERALIZE_SAME_AGENT_REPAIR_FOR_CONFLICTED_CI_RED_AND_UNRECONCILED_TASK_PRS
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect
Completed by: dave
Completed: 2026-03-27

Problem

Extend the agent-before-human escalation pattern from conflicted PRs to also cover CI-red and dispatch-completed-unreconciled states, so the task manager dispatches a targeted repair agent before escalating to a human.

Context

  • resume_conflicted_pr_via_agent() in scripts/task_manager_lib.py already implements the pattern for conflicted PRs: dispatch a targeted agent to the existing worktree, collect results (agent_resolved/agent_failed/agent_skipped), escalate to human on failure.
  • CI-red PRs are detected by apply_pr_ci_attention(), which adds a needs_attention line but sets no escalation code and makes no agent repair attempt. The task already has a worktree and a PR, so an agent can read the CI failures and attempt fixes.
  • Unreconciled tasks are detected by detect_dispatch_completed_unreconciled() with code dispatch_completed_unreconciled. The dispatch exited 0 but the task wasn't marked completed — an agent can review the worktree and reconcile.
  • The scripts/dispatch script supports --path to reuse an existing worktree, so all three repair types use the same dispatch infrastructure.
  • Agent repair must remain opt-in (--agent-resolve flag) — not every caller wants subprocess spawns.
  • Relevant files:
      • scripts/task_manager_lib.py: repair functions, prompts, _note_attention()
      • scripts/task-manager-reconcile-pr-conflicts: CLI for conflict repair (already has --agent-resolve)
      • scripts/task-manager-heartbeat: main loop (no agent repair yet)
      • tests/scripts/test_task_manager_scripts.py: test suite
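The agent-before-human pattern described above can be sketched as follows. This is a minimal illustration, not the code in scripts/task_manager_lib.py: the Task class, the repair_then_escalate() name, the run_agent callable, and the shape of the needs_attention entries are all assumptions made for the example. Only the three result buckets and the escalate-on-failure behavior come from the notes above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """Hypothetical task record; the real task objects may look different."""
    task_id: str
    worktree: Optional[str] = None
    needs_attention: List[dict] = field(default_factory=list)


def repair_then_escalate(
    tasks: List[Task],
    run_agent: Callable[[Task], bool],
    needs_human_code: str,
) -> Dict[str, List[str]]:
    """Try an agent repair per task; escalate to a human only on failure."""
    results: Dict[str, List[str]] = {
        "agent_resolved": [],
        "agent_failed": [],
        "agent_skipped": [],
    }
    for task in tasks:
        if not task.worktree:
            # No worktree for the agent to work in, so no attempt is made.
            results["agent_skipped"].append(task.task_id)
            continue
        if run_agent(task):
            results["agent_resolved"].append(task.task_id)
        else:
            results["agent_failed"].append(task.task_id)
            # Assumed escalation shape: a code plus a tier on the task.
            task.needs_attention.append(
                {"code": needs_human_code, "tier": "tier1"}
            )
    return results


if __name__ == "__main__":
    demo = [Task("t1", "/wt/t1"), Task("t2", "/wt/t2"), Task("t3")]
    # Stand-in agent: pretend only t1 can be repaired.
    print(repair_then_escalate(demo, lambda t: t.task_id == "t1",
                               "ci_red_needs_human"))
```

Passing run_agent as a callable keeps the subprocess spawn out of the loop, which is one way to make the opt-in behavior easy to test.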

Possible Solutions

  • Option 1: Keep current behavior — agent repair for conflicts only, passive monitoring for CI-red and unreconciled. Simple but leaves two common failure states requiring human intervention for issues agents can fix. Rejected.
  • Option 2 (recommended): Add resume_ci_red_pr_via_agent() and resume_unreconciled_via_agent() following the same pattern as resume_conflicted_pr_via_agent():
      • Each gets a narrow, targeted prompt
      • Each dispatches to the existing worktree via scripts/dispatch
      • On success → recorded as agent_resolved, no human escalation
      • On failure/timeout → escalated with a distinct _needs_human code at tier1
      • Wire into a new scripts/task-manager-reconcile-ci-red script and extend scripts/task-manager-reconcile-pr-conflicts output for the combined view
      • Controlled by --agent-resolve flag (opt-in)
  • Option 3: Build a single generic resume_via_agent() that takes a prompt and filter predicate. Elegant but premature — each failure type has different candidate selection, different prompts, and different escalation codes. Keep them separate for now. Rejected.
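To illustrate why candidate selection differs per failure type, here is a rough sketch of the two filters. The dict-based task shape and the field names (worktree, failing_checks, dispatch_exit_code, status) are hypothetical and chosen only for this example; the real detection lives in apply_pr_ci_attention() and detect_dispatch_completed_unreconciled().

```python
from typing import Any, Dict


def is_ci_red_candidate(task: Dict[str, Any]) -> bool:
    """A CI-red candidate needs a worktree and at least one failing check."""
    return bool(task.get("worktree")) and bool(task.get("failing_checks"))


def is_unreconciled_candidate(task: Dict[str, Any]) -> bool:
    """An unreconciled candidate exited 0 but was never marked completed."""
    return (
        bool(task.get("worktree"))
        and task.get("dispatch_exit_code") == 0
        and task.get("status") != "completed"
    )


if __name__ == "__main__":
    red = {"worktree": "/wt/a", "failing_checks": ["lint"]}
    stale = {"worktree": "/wt/b", "dispatch_exit_code": 0, "status": "in_progress"}
    print(is_ci_red_candidate(red), is_unreconciled_candidate(stale))
```

The two predicates share only the worktree check, which is the kind of small overlap that does not yet justify a generic resume_via_agent().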

Plan

  1. Write failing tests for resume_ci_red_pr_via_agent():
      • Agent succeeds → agent_resolved, no escalation
      • Agent fails → ci_red_needs_human tier1 escalation
      • Agent times out → same escalation
      • No worktree → skipped
  2. Write failing tests for resume_unreconciled_via_agent():
      • Agent succeeds → agent_resolved, no escalation
      • Agent fails → unreconciled_needs_human tier1 escalation
      • No worktree → skipped
  3. Implement resume_ci_red_pr_via_agent() with _CI_RED_RESOLVE_PROMPT.
  4. Implement resume_unreconciled_via_agent() with _UNRECONCILED_RESOLVE_PROMPT.
  5. Add ci_red_needs_human and unreconciled_needs_human to the escalation code in apply_pr_ci_attention().
  6. Wire CI-red repair into scripts/task-manager-reconcile-pr-conflicts behind --agent-resolve.
  7. Run focused tests and validate task file.
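The success, failure, and timeout cases in steps 1 and 2 can be exercised without spawning a real agent by faking subprocess.run. The sketch below shows one way to do that with unittest.mock. The run_agent() helper and its string outcomes are assumptions for illustration, and the command list is a placeholder: only the --path flag is documented here, and the mocked command is never executed.

```python
import subprocess
from unittest import mock


def run_agent(cmd, timeout_s):
    """Run a dispatch command and classify the outcome as a string."""
    try:
        proc = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"


def outcome_with_fake_run(**fake):
    """Call run_agent with subprocess.run replaced by a configured mock."""
    cmd = ["scripts/dispatch", "--path", "/tmp/worktree"]  # placeholder args
    with mock.patch("subprocess.run", **fake):
        return run_agent(cmd, timeout_s=5)


def test_agent_succeeds():
    done = subprocess.CompletedProcess(args=[], returncode=0)
    assert outcome_with_fake_run(return_value=done) == "ok"


def test_agent_fails():
    done = subprocess.CompletedProcess(args=[], returncode=1)
    assert outcome_with_fake_run(return_value=done) == "failed"


def test_agent_times_out():
    boom = subprocess.TimeoutExpired(cmd="dispatch", timeout=5)
    assert outcome_with_fake_run(side_effect=boom) == "timeout"


if __name__ == "__main__":
    test_agent_succeeds()
    test_agent_fails()
    test_agent_times_out()
    print("all outcome tests passed")
```

In a sketch like this, "failed" and "timeout" would both map to the same tier1 escalation, matching the "same escalation" expectation in step 1.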

Implementation Progress

Files modified

  • scripts/task_manager_lib.py: added two new agent-repair functions:
      • resume_ci_red_pr_via_agent(tasks): dispatches an agent to fix CI failures with _CI_RED_RESOLVE_PROMPT. On failure/timeout, escalates with ci_red_needs_human at tier1.
      • resume_unreconciled_via_agent(tasks): dispatches an agent to reconcile tasks where the dispatch exited 0 but the task was not marked completed, using _UNRECONCILED_RESOLVE_PROMPT. On failure/timeout, escalates with unreconciled_needs_human at tier1.
  • scripts/task-manager-reconcile-pr-conflicts: extended to call all three agent-repair functions (conflicts, CI-red, unreconciled) when --agent-resolve is passed. Added helper functions _empty_agent_results() and _merge_agent_results() to combine results.
  • scripts/escalation_lib.py: added CI_RED_NEEDS_HUMAN and UNRECONCILED_NEEDS_HUMAN failure modes, signal-to-mode mappings, classification priority entries, and ESCALATE_HUMAN action recommendations.
  • tests/scripts/test_task_manager_scripts.py: 9 new tests:
      • 5 for resume_ci_red_pr_via_agent: success, failure escalation, timeout escalation, no worktree skip, non-failure skip
      • 4 for resume_unreconciled_via_agent: success, failure escalation, no worktree skip, non-exited skip
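As a rough picture of how the three repair results might be combined, here is a sketch of helpers in the spirit of _empty_agent_results() and _merge_agent_results(). The actual helpers are not shown in these notes, so the signatures and the dict-of-lists result shape are assumptions; only the three bucket names come from the Context section.

```python
from typing import Dict, List

# Bucket names taken from the conflict-repair pattern described above.
RESULT_KEYS = ("agent_resolved", "agent_failed", "agent_skipped")


def empty_agent_results() -> Dict[str, List[str]]:
    """Return a result dict with every bucket present and empty."""
    return {key: [] for key in RESULT_KEYS}


def merge_agent_results(*parts: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Concatenate the buckets of several result dicts, preserving order."""
    merged = empty_agent_results()
    for part in parts:
        for key in RESULT_KEYS:
            # Missing buckets are treated as empty rather than as an error.
            merged[key].extend(part.get(key, []))
    return merged


if __name__ == "__main__":
    conflicts = {"agent_resolved": ["t1"], "agent_failed": [], "agent_skipped": []}
    ci_red = {"agent_resolved": [], "agent_failed": ["t2"], "agent_skipped": ["t3"]}
    print(merge_agent_results(conflicts, ci_red))
```

Starting from an empty result means the CLI can emit the same keys whether or not --agent-resolve was passed.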

Key decisions

  • All agent repair remains opt-in (--agent-resolve flag) — no change to default heartbeat behavior
  • Each failure type gets its own function with a narrow, targeted prompt rather than a generic dispatch
  • New escalation codes (ci_red_needs_human, unreconciled_needs_human) are distinct from existing codes so triage can differentiate "agent tried and failed" from "no agent attempt"
  • CI-red prompt includes the specific failing check names for agent context
  • Unreconciled prompt instructs agent to either complete the task or create the PR
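The decision to include failing check names in the CI-red prompt could look roughly like this. The template wording, the build_ci_red_prompt() name, and its parameters are invented for illustration; the real _CI_RED_RESOLVE_PROMPT text is not reproduced in these notes.

```python
from typing import List

# Hypothetical template; not the actual _CI_RED_RESOLVE_PROMPT.
CI_RED_PROMPT_TEMPLATE = (
    "The PR for task {task_id} has failing CI checks:\n"
    "{check_lines}\n"
    "Fix only these failures in the existing worktree."
)


def build_ci_red_prompt(task_id: str, failing_checks: List[str]) -> str:
    """Render a narrow repair prompt that names each failing check."""
    if not failing_checks:
        raise ValueError("a CI-red prompt needs at least one failing check")
    check_lines = "\n".join(f"- {name}" for name in failing_checks)
    return CI_RED_PROMPT_TEMPLATE.format(task_id=task_id, check_lines=check_lines)


if __name__ == "__main__":
    print(build_ci_red_prompt("example-task", ["lint", "unit-tests"]))
```

Listing the checks by name keeps the prompt narrow, so the agent is less likely to wander into unrelated changes.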

QA Exploration

  • [x] QA exploration completed (or N/A for non-UI tasks)
  • N/A: backend/orchestration task — no UI changes

Review Feedback

  • [ ] Review cleared