# Fleet coordination lease — proposal

**Status:** OUTLINE (not yet ready to build). Spun out of BooControl P8 (see `openspec/changes/boocontrol/`).

This folder is the separate design pass the BooControl program deferred; it is an outline, not an implementation plan ready for `boo-implementing-changes`. Promote to READY only after the open questions below are resolved.

## Why

Four independent processes dispatch inference to the same llama-swap hosts with no coordination:

- **BooChat** (`apps/server`) — interactive chat turns.
- **BooCoder** (`apps/coder`) — agent dispatches (opencode / ACP / PTY / Claude-SDK).
- **Arena** (`apps/coder`) — head-to-head battles.
- **BooControl** (`apps/control`) — bench + eval runs.

Each host (`sam-desktop`, `embedding`) runs ONE model at a time on a single GPU; llama-swap evicts the loaded model to serve a request for a different one. So an unattended BooControl bench can evict a model mid-chat, and a chat can pollute a bench mid-run.

BooControl P3 made this safe-by-construction for *manual* runs (human clicks "run", takeover confirmation, `concurrent_foreign_requests` recorded), but the underlying `inflight == 0` check is a courtesy gate with a TOCTOU race against the other three writers (design §8, risk table). That race is the single blocker for **unattended bench scheduling and reproducible concurrency sweeps** — the reason this batch exists.

The proper fix is a per-host advisory lease in the shared `boochat` DB that BooControl's scheduler *requires* and the other three writers *honor*.

## What ships (scope)

1. **`control_host_leases` table** (owned by the BooControl schema, since it is the only *required* holder; the others are voluntary honorers): holder id, purpose, `expires_at`, heartbeat timestamp, keyed by `provider_id`.
2. **Lease lifecycle service** in `apps/control`: acquire (atomic, conditional insert/update), heartbeat (extend `expires_at`), release, and expiry sweep (a crashed holder's lease lapses without manual cleanup).
3. **The honor-protocol in all four writers**: before dispatching to a host, check for an active *exclusive* lease held by someone else; if present, queue behind it or fail fast with a clear "host leased for " signal. A shared (non-exclusive) lease for ordinary interactive traffic is the default; bench/eval take an exclusive lease.
4. **BooControl consumes it through the existing seam.** P3 left `acquireHostAccess(providerId, purpose): Promise` in `apps/control/src/services/host-access.ts` as a no-op returning `{ok: true}`. This batch swaps its body for a real lease acquire+heartbeat WITHOUT touching the bench engine (which already gates every run through the seam, design §8).
5. **Unattended bench scheduling + reproducible concurrency sweeps** unlock once the lease exists (the deferred half of BooControl P3).

## Out of scope

- Cross-host scheduling / global GPU arbitration beyond per-host leases (YAGNI: reopen if per-host leases prove insufficient — implementation-plan Deferred section).
- Frontier-provider coordination (no single-GPU contention there).
- Replacing llama-swap's own on-demand eviction; the lease coordinates *callers*, not the swap engine.

## Open questions (resolve before READY)

- **Exclusive vs shared semantics for interactive traffic.** Do BooChat/BooCoder take a shared lease per turn (heavyweight) or only *read* the exclusive-lease flag before dispatch (lightweight, racy on the boundary)? Leaning lightweight: interactive writers read-before-dispatch; only bench/eval take exclusive holds.
- **Honor enforcement granularity.** Per-request check vs per-session hold. A per-request check is cheap but a long chat turn could still straddle a lease acquisition. Acceptable for v1?
- **Heartbeat interval + lease TTL.** Short TTL = fast crash recovery but more DB chatter; long TTL = a crashed bench blocks the host until expiry. Proposed: TTL 60s, heartbeat 20s.
- **Failure mode when the DB is unreachable.** Fail-open (dispatch anyway, current behavior) or fail-closed (refuse)? Fail-open preserves chat availability; document the residual race.

## Risks

| Risk | Mitigation |
|---|---|
| A crashed exclusive holder blocks a host | TTL + heartbeat; expiry sweep reclaims lapsed leases |
| Honor-protocol drift across four services | single shared lease-check helper in `@boocode/contracts`-adjacent shared code, consumed by all four; integration test per writer |
| DB unreachable mid-dispatch | documented fail-open default; lease is advisory, never a hard dependency for interactive chat |
| Lease check adds latency to every chat turn | lightweight read-before-dispatch (one indexed SELECT by `provider_id`); no per-turn write on the interactive path |

## References

- BooControl design `§8 Fleet coordination lease (P8 — cross-service)` and the P3 seam contract (`acquireHostAccess`).
- `apps/control/src/services/host-access.ts` — the seam to swap.
- `apps/control/src/schema.sql` — where `control_host_leases` lands.
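
## Appendix: lease semantics sketch (non-normative)

The lifecycle (acquire, heartbeat, release, expiry sweep) and the read-before-dispatch honor check described in this outline can be modeled in a few lines. This is a minimal in-memory sketch, not the planned implementation: the real state would live in `control_host_leases`, and acquire would need to be a single atomic conditional write in the DB. Every name below (`LeaseStore`, `mayDispatch`, `heartbeatDue`, the field names, the reason string) is hypothetical, and the constants use the proposed but unresolved values of TTL 60s and heartbeat 20s. It models only exclusive leases and the "leaning lightweight" option, where interactive writers read before dispatch and fail open.

```typescript
// Illustrative in-memory model only; all names here are hypothetical.
interface Lease {
  providerId: string;
  holderId: string;
  purpose: string;
  heartbeatAt: number; // epoch ms of the last heartbeat
  expiresAt: number;   // epoch ms at which the lease lapses
}

type AcquireResult =
  | { ok: true; lease: Lease }
  | { ok: false; heldBy: string; purpose: string };

const LEASE_TTL_MS = 60_000;          // proposed TTL (open question)
const HEARTBEAT_INTERVAL_MS = 20_000; // proposed heartbeat (open question)

class LeaseStore {
  private leases: Record<string, Lease | undefined> = {};

  // Succeeds when the host has no lease, a lapsed lease, or a lease already
  // held by the same holder; otherwise reports the current holder.
  acquire(providerId: string, holderId: string, purpose: string, now: number): AcquireResult {
    const current = this.leases[providerId];
    if (current && current.expiresAt > now && current.holderId !== holderId) {
      return { ok: false, heldBy: current.holderId, purpose: current.purpose };
    }
    const lease: Lease = { providerId, holderId, purpose, heartbeatAt: now, expiresAt: now + LEASE_TTL_MS };
    this.leases[providerId] = lease;
    return { ok: true, lease };
  }

  // Extends expiry for the live holder only; a lapsed lease must be re-acquired.
  heartbeat(providerId: string, holderId: string, now: number): boolean {
    const current = this.leases[providerId];
    if (!current || current.holderId !== holderId || current.expiresAt <= now) return false;
    current.heartbeatAt = now;
    current.expiresAt = now + LEASE_TTL_MS;
    return true;
  }

  release(providerId: string, holderId: string): boolean {
    const current = this.leases[providerId];
    if (!current || current.holderId !== holderId) return false;
    delete this.leases[providerId];
    return true;
  }

  // Expiry sweep: drops lapsed leases and returns how many were reclaimed.
  sweep(now: number): number {
    let reclaimed = 0;
    for (const providerId of Object.keys(this.leases)) {
      const lease = this.leases[providerId];
      if (lease && lease.expiresAt <= now) {
        delete this.leases[providerId];
        reclaimed++;
      }
    }
    return reclaimed;
  }

  // Read-only lookup for the honor check; returns only a live lease.
  active(providerId: string, now: number): Lease | undefined {
    const current = this.leases[providerId];
    return current && current.expiresAt > now ? current : undefined;
  }
}

// True once a full heartbeat interval has passed since the last heartbeat.
function heartbeatDue(lease: Lease, now: number): boolean {
  return now - lease.heartbeatAt >= HEARTBEAT_INTERVAL_MS;
}

type DispatchDecision = { dispatch: true } | { dispatch: false; reason: string };

// Honor check for a writer: read before dispatch, fail open if the lookup throws.
function mayDispatch(
  lookup: (providerId: string, now: number) => Lease | undefined,
  providerId: string,
  callerId: string,
  now: number,
): DispatchDecision {
  try {
    const lease = lookup(providerId, now);
    if (lease && lease.holderId !== callerId) {
      return { dispatch: false, reason: `host leased for ${lease.purpose} by ${lease.holderId}` };
    }
    return { dispatch: true };
  } catch {
    return { dispatch: true }; // fail-open: the lease is advisory
  }
}
```

Passing the clock in as `now` keeps the model deterministic, which makes the TTL and heartbeat trade-off easy to exercise in a test. The sketch does not remove the boundary race the open questions call out: a chat turn that passes `mayDispatch` just before a bench acquires the lease will still overlap with it.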