CBox review prompt context isolation on sandbox restart
Problem
When a cbox sandbox restarts (either by re-attaching to an existing tmux session or resuming over an existing worktree), it can inherit a stale .cbox/.review-prompt.md file left over from a previous review run. Claude reads this file on context switch, injecting unrelated review instructions into what should be a clean task execution session. This causes the sandbox agent to behave as if it's in review mode — ignoring its actual task prompt and producing confusing, off-task output. The one-shot review flow cleans up after itself, but neither sandbox resume path does, creating this contamination window.
Context
Possible Solutions
Plan
Implementation Progress
Implementation
Root cause
When a sandbox resumes (either by attaching to an existing tmux session or by
recreating a session over an existing worktree), the .cbox/.review-prompt.md
file was not cleaned up. This stale file could be read by Claude on context
switch, injecting unrelated review prompt context into the sandbox session.
The one-shot review flow (_run_review_in_tmux) properly deleted the file after
review completion or timeout, but neither sandbox resume path did.
Fix
Added _clean_stale_review_prompt(working_dir) helper in cli.py that removes
.cbox/.review-prompt.md if it exists (using unlink(missing_ok=True)).
Called before sandbox start in all paths:
1. Attach to existing session (before sending any commands or attaching)
2. Resume from existing worktree (before starting a new container)
3. Default sandbox fresh start (before creating a new session)
Files changed
- libs/cbox/cbox/cli.py: added _clean_stale_review_prompt() helper and moved cleanup to run unconditionally before start/resume
- libs/cbox/test_review_prompt_isolation.py: 5 tests covering both resume paths, default sandbox attach/fresh-start, and no-file case
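The stale-file and no-file cases could be exercised along the following lines. This is a hedged sketch, not the contents of test_review_prompt_isolation.py: the test names are hypothetical, and a stand-in helper is defined locally (assumed to match the described behavior) so the snippet runs on its own. The tests take a directory argument compatible with pytest's tmp_path fixture.

```python
from pathlib import Path


def _clean_stale_review_prompt(working_dir):
    # Stand-in for the helper in cli.py (assumed behavior, not the real source).
    (Path(working_dir) / ".cbox" / ".review-prompt.md").unlink(missing_ok=True)


def test_removes_stale_prompt(tmp_path):
    # Simulate a review prompt left behind by an earlier review run.
    prompt = tmp_path / ".cbox" / ".review-prompt.md"
    prompt.parent.mkdir()
    prompt.write_text("stale review instructions")

    _clean_stale_review_prompt(tmp_path)

    assert not prompt.exists()
    # Only the prompt file is removed; the .cbox directory stays in place.
    assert prompt.parent.is_dir()


def test_no_file_is_a_noop(tmp_path):
    # Neither the file nor the .cbox directory exists; the call must not raise.
    _clean_stale_review_prompt(tmp_path)

    assert not (tmp_path / ".cbox").exists()
```

Tests for the attach and resume paths would additionally need to stub out tmux and container startup, which is outside what this write-up describes.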
Risks and follow-ups
- No risk: unlink(missing_ok=True) is safe and idempotent.
- Follow-up: consider whether the .cbox/reviews/ directory should also be cleaned on sandbox resume (lower priority, review outputs are useful for history).
Review Feedback
- [ ] Review cleared