Dataface Tasks

Fix concurrent worktree port allocation collisions

ID: INFRA_TOOLING-FIX_CONCURRENT_WORKTREE_PORT_ALLOCATION_COLLISIONS
Status: completed
Priority: p1
Milestone: m1-ft-analytics-analyst-pilot
Owner: sr-engineer-architect
Completed by: dave
Completed: 2026-03-22

Problem

Investigate and fix the per-worktree port allocator so concurrently created fresh worktrees never reuse the same cloud_url or port bundle while active allocations still exist.

Context

  • scripts/worktree_ports.py currently allocates by probing for bindable ports and returning the first free offset bundle.
  • That implementation ignores already-written .worktree-ports.json files in sibling worktrees, so two fresh worktrees created before either stack binds sockets can receive the same bundle.
  • I reproduced the collision directly with existing active task worktrees:
      • /Users/dave.fowler/.codex/worktrees/0d5f/fix-favicon-and-dashboard-route-404s-found-in-qa-exploration/.worktree-ports.json
      • /Users/dave.fowler/.codex/worktrees/0d5f/fix-concurrent-worktree-port-allocation-collisions/.worktree-ports.json
      • Both files contained cloud_url: http://127.0.0.1:8100
  • The bug is in reservation logic, not in stack startup. The allocator needs to treat existing worktree port bundles as reserved even when nothing is currently listening.
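
The race described above can be sketched in a few lines. This is a minimal illustration and not the actual contents of scripts/worktree_ports.py: the function names, the base ports, the offset step of 100, and the slot limit are all assumptions made for the example.

```python
import socket
from typing import Callable, Dict


def can_bind(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound right now.

    The socket is closed again immediately, so a True result
    reserves nothing.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def probe_only_allocate(
    base_ports: Dict[str, int],
    probe: Callable[[int], bool] = can_bind,
    step: int = 100,       # assumed spacing between bundles
    max_slots: int = 50,   # assumed upper bound on bundles
) -> Dict[str, int]:
    """Return the first offset bundle whose ports all pass the probe."""
    for slot in range(1, max_slots + 1):
        bundle = {name: port + slot * step for name, port in base_ports.items()}
        if all(probe(port) for port in bundle.values()):
            return bundle
    raise RuntimeError("no bindable port bundle found")
```

If two fresh worktrees call probe_only_allocate before either stack starts listening, every probe succeeds both times and both calls return the same bundle. Nothing in the probe records that the first caller already took it.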

Possible Solutions

  • Scan all sibling worktrees from git worktree list --porcelain, read each .worktree-ports.json, and treat those ports as reserved during allocation.
      • Recommended: matches the actual managed worktree set, closes the real duplicate-allocation bug, and keeps the CLI interface simple.
  • Keep bind probing only, but immediately start a placeholder process to reserve sockets until the stack starts.
      • Rejected: too invasive and brittle for local dev tooling.
  • Write a shared lockfile/global registry separate from the worktree port file.
      • Rejected: duplicates the source of truth we already have in .worktree-ports.json.
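
A rough sketch of the recommended option follows. It relies on the porcelain output format of git worktree list, where each entry starts with a line of the form "worktree <path>". The helper names and the assumption that .worktree-ports.json is a flat JSON object with integer port values are illustrative only; the real file schema is not shown in this task.

```python
import json
import subprocess
from pathlib import Path
from typing import Iterable, List, Optional, Set

PORTS_FILENAME = ".worktree-ports.json"


def parse_worktree_paths(porcelain: str) -> List[Path]:
    """Extract worktree paths from `git worktree list --porcelain` output."""
    prefix = "worktree "
    return [
        Path(line[len(prefix):])
        for line in porcelain.splitlines()
        if line.startswith(prefix)
    ]


def list_worktree_paths() -> List[Path]:
    """Ask git for the worktrees attached to the current repository."""
    result = subprocess.run(
        ["git", "worktree", "list", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return parse_worktree_paths(result.stdout)


def reserved_ports(
    worktrees: Iterable[Path],
    exclude_output: Optional[Path] = None,
) -> Set[int]:
    """Collect integer ports recorded in each worktree's port file.

    Assumes a flat JSON object; non-integer values such as URL
    strings are ignored. Missing or unreadable files are skipped.
    """
    excluded = exclude_output.resolve() if exclude_output else None
    reserved: Set[int] = set()
    for worktree in worktrees:
        port_file = worktree / PORTS_FILENAME
        if excluded is not None and port_file.resolve() == excluded:
            continue  # do not let a worktree block its own rewrite
        try:
            data = json.loads(port_file.read_text())
        except (OSError, json.JSONDecodeError):
            continue
        if not isinstance(data, dict):
            continue
        for value in data.values():
            if isinstance(value, int) and not isinstance(value, bool):
                reserved.add(value)
    return reserved
```

Because the port files already live next to each worktree, this approach needs no extra registry: a worktree that is removed from git's list stops reserving its ports automatically.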

Plan

  1. Update scripts/worktree_ports.py to enumerate worktree-local port files and skip already-reserved bundles during allocation.
  2. Allow the allocator to ignore the current output path so rewriting an existing worktree's own port file does not block itself.
  3. Add focused tests for:
      • existing sibling worktree bundle blocks reuse
      • current output file does not block rewriting the same bundle
  4. Validate with the focused script tests and a live CLI allocation repro against the real local worktree set.

Implementation Progress

  • Added worktree-aware reservation scanning in scripts/worktree_ports.py:
      • parses git worktree list --porcelain
      • reads sibling .worktree-ports.json files
      • reserves all integer service ports already assigned in those files
  • Updated allocation to skip reserved bundles before checking socket bind availability.
  • Added an exclude_output path so that allocate --output <existing current-worktree file> can rewrite its own file without treating it as a collision.
  • Added focused tests in tests/scripts/test_worktree_ports.py covering:
      • existing worktree bundle blocks reuse
      • current output file is ignored for self-rewrites
  • Focused validation passed:
      • uv run pytest tests/scripts/test_worktree_ports.py -q (5 passed)
  • Live CLI verification passed:
      • python3 scripts/worktree_ports.py allocate --output <tempfile> returned cloud_url: http://127.0.0.1:8200 while existing sibling worktrees still held 8100
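
The allocation order described in this section (skip reserved bundles first, then check bind availability) might look roughly like the sketch below. The function name, the base port of 8000, and the step of 100 are assumptions chosen so the example lines up with the 8100 and 8200 values above; the real allocator may number its bundles differently.

```python
from typing import Callable, Dict, Set


def allocate_bundle(
    base_ports: Dict[str, int],
    reserved: Set[int],
    probe: Callable[[int], bool],
    step: int = 100,       # assumed spacing between bundles
    max_slots: int = 50,   # assumed upper bound on bundles
) -> Dict[str, int]:
    """Return the first bundle that is neither reserved nor in use.

    `reserved` holds ports already written to other worktrees' port
    files. `probe` reports whether a port can be bound right now.
    """
    for slot in range(1, max_slots + 1):
        bundle = {name: port + slot * step for name, port in base_ports.items()}
        ports = set(bundle.values())
        # Reservation check comes first: a bundle claimed on disk is
        # skipped even if nothing is listening on it yet.
        if ports & reserved:
            continue
        if all(probe(port) for port in ports):
            return bundle
    raise RuntimeError("no free port bundle found")
```

With 8100 held by a sibling worktree and every port bindable, allocate_bundle({"cloud": 8000}, reserved={8100}, probe=lambda port: True) returns {"cloud": 8200}, which mirrors the live CLI result under these assumed numbers.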

QA Exploration

  • N/A: non-UI infra tooling task
  • [x] QA exploration completed (or N/A for non-UI tasks)

Review Feedback

  • [ ] Review cleared