Improve cbox recovery from hung in-session tool calls
Problem
CBox sandboxes frequently get stuck in long-running shell tool calls (e.g., hung commit hooks, stalled uv sync, unresponsive git operations) where the Claude session becomes unresponsive to normal interaction. The manager's cbox send --interrupt does not reliably break these hung processes because a single Ctrl+C is often insufficient to terminate a deeply nested shell command. When this happens, the only recovery path is manual worktree takeover, which is slow, error-prone, and defeats the purpose of automated sandbox orchestration. This is one of the most common causes of sandbox workflow stalls.
Context
Key files: libs/cbox/cbox/tmux.py (session primitives), libs/cbox/cbox/cli.py (CLI surface), tests in libs/cbox/test_tmux.py, libs/cbox/test_interactive_stall.py, libs/cbox/test_send.py.
Prior state: send_keys(interrupt=True) sent 2× Ctrl+C. Interactive stall detection handled UI prompts (effort level, workspace trust, permission dialogs) but had no concept of "tool call in progress" vs "idle at prompt" — both returned CLEAR.
Possible Solutions
- Increase Ctrl+C count only — simple but doesn't help when subprocesses ignore SIGINT entirely.
- Add Escape key escalation + tool-call detection (Recommended) — Ctrl+C targets the subprocess; Escape targets Claude Code's own tool-cancellation mechanism. Combined with detecting "tool call active" state, the manager gets actionable status and a two-phase interrupt. Low risk (Escape is a no-op when not in a tool call).
- Kill tmux pane and restart — nuclear option, loses session context. Reserved for manual last resort.
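The recommended two-phase interrupt can be sketched roughly as follows. This is a minimal illustration, not the actual send_keys() implementation: build_interrupt_commands, send_interrupt, INTERRUPT_CTRL_C_COUNT, and both default counts are hypothetical, and only the INTERRUPT_ESCAPE_COUNT name comes from the plan. It assumes keys are delivered with tmux send-keys, where C-c and Escape are standard tmux key names.

```python
import subprocess
import time
from typing import Callable, List

# Assumed values: the plan names INTERRUPT_ESCAPE_COUNT but gives no number,
# and INTERRUPT_CTRL_C_COUNT is a hypothetical name for the Ctrl+C repeat count.
INTERRUPT_CTRL_C_COUNT = 3
INTERRUPT_ESCAPE_COUNT = 2


def build_interrupt_commands(
    target: str,
    ctrl_c_count: int = INTERRUPT_CTRL_C_COUNT,
    escape_count: int = INTERRUPT_ESCAPE_COUNT,
) -> List[List[str]]:
    """Return the tmux commands for a two-phase interrupt, in send order."""
    commands: List[List[str]] = []
    # Phase 1: Ctrl+C targets the hung subprocess in the foreground.
    for _ in range(ctrl_c_count):
        commands.append(["tmux", "send-keys", "-t", target, "C-c"])
    # Phase 2: Escape targets the session's own tool-cancellation handling.
    for _ in range(escape_count):
        commands.append(["tmux", "send-keys", "-t", target, "Escape"])
    return commands


def send_interrupt(
    target: str,
    run: Callable[..., object] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
    pause_seconds: float = 0.2,
) -> int:
    """Send the interrupt sequence and return the number of keys sent."""
    commands = build_interrupt_commands(target)
    for command in commands:
        run(command, check=True)
        # A short pause gives the pane time to react between keystrokes.
        sleep(pause_seconds)
    return len(commands)
```

Keeping command construction separate from execution lets tests assert on the key order without a live tmux server, by passing stub run and sleep callables.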
Plan
Files modified:
- libs/cbox/cbox/tmux.py — Add INTERRUPT_ESCAPE_COUNT, Escape key phase in send_keys(interrupt=True), detect_tool_call_active().
- libs/cbox/cbox/cli.py — Add TOOL_ACTIVE to InteractiveStallStatus, integrate it into check_interactive_stall(), and update the cbox output --check-stall and cbox send undelivered handlers with worktree-takeover guidance.
- Test files — TDD: tests written before implementation.
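A rough sketch of how the planned pieces might fit together is shown below. The names InteractiveStallStatus, TOOL_ACTIVE, CLEAR, detect_tool_call_active, and check_interactive_stall come from the plan, and exit code 3 for the tool-active state comes from the progress notes. Everything else is an assumption made for illustration: the "esc to interrupt" marker, the tail_lines window, the exit_code_for helper, and exit code 0 for CLEAR. The real detection heuristics may differ.

```python
import enum
import re


class InteractiveStallStatus(enum.Enum):
    CLEAR = "clear"              # idle at prompt, nothing blocking
    TOOL_ACTIVE = "tool_active"  # a tool call appears to be running


# Assumed marker: a hint line the session shows while a tool call runs.
TOOL_ACTIVE_MARKER = re.compile(r"esc to interrupt", re.IGNORECASE)


def detect_tool_call_active(pane_text: str, tail_lines: int = 15) -> bool:
    """Return True if the last few pane lines contain the tool-running marker."""
    tail = pane_text.splitlines()[-tail_lines:]
    return any(TOOL_ACTIVE_MARKER.search(line) for line in tail)


def check_interactive_stall(pane_text: str) -> InteractiveStallStatus:
    """Classify captured pane text; UI-prompt checks are omitted in this sketch."""
    if detect_tool_call_active(pane_text):
        return InteractiveStallStatus.TOOL_ACTIVE
    return InteractiveStallStatus.CLEAR


def exit_code_for(status: InteractiveStallStatus) -> int:
    """Map a status to a process exit code (0 for CLEAR is an assumption)."""
    return 3 if status is InteractiveStallStatus.TOOL_ACTIVE else 0
```

Scanning only the tail of the pane avoids matching stale markers left in scrollback from earlier tool calls.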
Implementation Progress
- [x] Detect non-progressing in-session tool calls in sandbox output.
- [x] Make cbox send --interrupt reliably break active hung calls.
- [x] Provide manager guidance when manual worktree takeover is the only fallback.
- [x] Hung commit/hook/install phases are recoverable through cbox controls.
- [x] Recovery path is documented and test-covered where feasible.
- Source issue: m1-infra-016-docs-sync stalled during commit and ignored interrupt.
- 2026-03-05: Implemented multi-Ctrl+C interrupt in tmux.send_keys(..., interrupt=True) and added tests in libs/cbox/test_tmux.py and libs/cbox/test_send.py.
- 2026-03-17: Added Escape-key escalation after Ctrl+C in send_keys(interrupt=True). Added detect_tool_call_active() to distinguish "tool running" from "at prompt". Added TOOL_ACTIVE status to InteractiveStallStatus. cbox output --check-stall now reports tool-active state (exit 3) with interrupt/takeover guidance. cbox send --interrupt undelivered path now suggests worktree takeover. 12 new tests across 3 files, all passing.
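The notes above do not say how "non-progressing" is measured. One possible approach, sketched here purely as an assumption, is to compare successive pane snapshots and flag a tool call as stalled once the output has stayed identical for several polls. ProgressTracker, max_unchanged_polls, and observe are hypothetical names that do not appear in the cbox code described here.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressTracker:
    """Flags a stall once pane output is unchanged for N consecutive polls."""

    max_unchanged_polls: int = 3
    _last_digest: Optional[str] = None
    _unchanged: int = 0

    def observe(self, pane_text: str) -> bool:
        """Record one snapshot; return True if the output looks stalled."""
        digest = hashlib.sha256(pane_text.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            self._unchanged += 1
        else:
            # Output changed, so the tool call is still making progress.
            self._last_digest = digest
            self._unchanged = 0
        return self._unchanged >= self.max_unchanged_polls
```

A manager loop could call observe() on each captured snapshot while the status is TOOL_ACTIVE and escalate to an interrupt only when it returns True, so that slow but healthy commands are not interrupted.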
Review Feedback
- [ ] Review cleared