Fleet coordination lease — proposal
Status: OUTLINE (not yet ready to build). Spun out of BooControl P8 (see
openspec/changes/boocontrol/). This folder is the separate design pass the
BooControl program deferred; it is an outline, not an implementation plan ready
for boo-implementing-changes. Promote to READY only after the open questions
below are resolved.
Why
Four independent processes dispatch inference to the same llama-swap hosts with no coordination:
- BooChat (apps/server): interactive chat turns.
- BooCoder (apps/coder): agent dispatches (opencode / ACP / PTY / Claude-SDK).
- Arena (apps/coder): head-to-head battles.
- BooControl (apps/control): bench + eval runs.
Each host (sam-desktop, embedding) runs ONE model at a time on a single GPU;
llama-swap evicts the loaded model to serve a request for a different one. So an
unattended BooControl bench can evict a model mid-chat, and a chat can pollute a
bench mid-run. BooControl P3 made this safe-by-construction for manual runs
(human clicks "run", takeover confirmation, concurrent_foreign_requests
recorded), but the underlying inflight == 0 check is a courtesy gate with a
TOCTOU race against the other three writers (design §8, risk table). That race
is the single blocker for unattended bench scheduling and reproducible
concurrency sweeps — the reason this batch exists.
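
The check-then-dispatch race described above can be reproduced with a deterministic interleaving. This is only an illustration: the Host shape, the model names, and the helper functions are hypothetical and do not reflect how llama-swap or the writers track in-flight requests.

```typescript
// Hypothetical host model: one loaded model, a count of in-flight requests,
// and a counter of how many times the loaded model was swapped out.
interface Host {
  loadedModel: string;
  inflight: number;
  evictions: number;
}

// The courtesy gate: "is the host idle right now?"
function looksIdle(host: Host): boolean {
  return host.inflight === 0;
}

// Dispatching a request for a different model evicts the loaded one.
function dispatch(host: Host, model: string): void {
  if (host.loadedModel !== model) {
    host.loadedModel = model;
    host.evictions += 1;
  }
  host.inflight += 1;
}

// Both writers check before either dispatches, so both see an idle host.
function simulateRace(): Host {
  const host: Host = { loadedModel: "chat-model", inflight: 0, evictions: 0 };
  const benchSawIdle = looksIdle(host); // bench: time of check
  const chatSawIdle = looksIdle(host);  // chat: time of check, before bench acts
  if (benchSawIdle) dispatch(host, "bench-model"); // bench: time of use, evicts chat-model
  if (chatSawIdle) dispatch(host, "chat-model");   // chat: time of use, evicts bench-model mid-run
  return host;
}
```

Each writer's check was true when it ran, yet the host ends up with two in-flight requests and two evictions. Closing that gap requires the check and the claim to be one atomic step, which is what a lease acquire provides.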
The proper fix is a per-host advisory lease in the shared boochat DB that
BooControl's scheduler requires and the other three writers honor.
What ships (scope)
- control_host_leases table (owned by the BooControl schema, since it is the only required holder; the others are voluntary honorers): holder id, purpose, expires_at, heartbeat timestamp, keyed by provider_id.
- Lease lifecycle service in apps/control: acquire (atomic, conditional insert/update), heartbeat (extend expires_at), release, and expiry sweep (a crashed holder's lease lapses without manual cleanup).
- The honor-protocol in all four writers: before dispatching to a host, check for an active exclusive lease held by someone else; if present, queue behind it or fail fast with a clear "host leased for " signal. A shared (non-exclusive) lease for ordinary interactive traffic is the default; bench/eval take an exclusive lease.
- BooControl consumes it through the existing seam. P3 left acquireHostAccess(providerId, purpose): Promise<HostGrant> in apps/control/src/services/host-access.ts as a no-op returning {ok: true}. This batch swaps its body for a real lease acquire+heartbeat WITHOUT touching the bench engine (which already gates every run through the seam, design §8).
- Unattended bench scheduling + reproducible concurrency sweeps unlock once the lease exists (the deferred half of BooControl P3).
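
The lifecycle operations listed above (acquire, heartbeat, release, expiry sweep) can be sketched against an in-memory stand-in for the control_host_leases table. This is a minimal illustration, not the proposed implementation: the LeaseStore class, its method names, and the row shape are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow. A real version would presumably express acquire as a single conditional insert/update in the shared DB, so that atomicity comes from the database rather than from single-threaded code.

```typescript
// Hypothetical row shape; field names follow the proposal's column list.
interface LeaseRow {
  providerId: string;
  holderId: string;
  purpose: string;
  expiresAt: number;   // epoch milliseconds
  heartbeatAt: number; // epoch milliseconds
}

// In-memory stand-in for control_host_leases, keyed by providerId.
class LeaseStore {
  private rows: Record<string, LeaseRow | undefined> = {};
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Succeeds when the host has no lease, the lease has lapsed,
  // or the caller already holds it (re-acquire refreshes the expiry).
  acquire(providerId: string, holderId: string, purpose: string, now: number): boolean {
    const current = this.rows[providerId];
    if (current !== undefined && current.expiresAt > now && current.holderId !== holderId) {
      return false; // live lease held by someone else
    }
    this.rows[providerId] = { providerId, holderId, purpose, expiresAt: now + this.ttlMs, heartbeatAt: now };
    return true;
  }

  // Extends expiresAt, but only for the holder of a lease that is still live.
  heartbeat(providerId: string, holderId: string, now: number): boolean {
    const current = this.rows[providerId];
    if (current === undefined || current.holderId !== holderId || current.expiresAt <= now) {
      return false;
    }
    current.heartbeatAt = now;
    current.expiresAt = now + this.ttlMs;
    return true;
  }

  // Only the holder can release its own lease.
  release(providerId: string, holderId: string): boolean {
    const current = this.rows[providerId];
    if (current === undefined || current.holderId !== holderId) return false;
    delete this.rows[providerId];
    return true;
  }

  // Removes lapsed leases and returns how many were reclaimed.
  sweep(now: number): number {
    let removed = 0;
    for (const id of Object.keys(this.rows)) {
      const row = this.rows[id];
      if (row !== undefined && row.expiresAt <= now) {
        delete this.rows[id];
        removed += 1;
      }
    }
    return removed;
  }
}
```

Because acquire already treats a lapsed row as free, the sweep in this sketch is housekeeping rather than a correctness requirement: a crashed holder stops blocking the host as soon as its expiry passes, whether or not a sweep has run.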
Out of scope
- Cross-host scheduling / global GPU arbitration beyond per-host leases (YAGNI: reopen if per-host leases prove insufficient — implementation-plan Deferred section).
- Frontier-provider coordination (no single-GPU contention there).
- Replacing llama-swap's own on-demand eviction; the lease coordinates callers, not the swap engine.
Open questions (resolve before READY)
- Exclusive vs shared semantics for interactive traffic. Do BooChat/BooCoder take a shared lease per turn (heavyweight) or only read the exclusive-lease flag before dispatch (lightweight, racy on the boundary)? Leaning lightweight: interactive writers read-before-dispatch; only bench/eval take exclusive holds.
- Honor enforcement granularity. Per-request check vs per-session hold. A per-request check is cheap but a long chat turn could still straddle a lease acquisition. Acceptable for v1?
- Heartbeat interval + lease TTL. Short TTL = fast crash recovery but more DB chatter; long TTL = a crashed bench blocks the host until expiry. Proposed: TTL 60s, heartbeat 20s.
- Failure mode when the DB is unreachable. Fail-open (dispatch anyway, current behavior) or fail-closed (refuse)? Fail-open preserves chat availability; document the residual race.
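
The heartbeat/TTL trade-off can be made concrete with a little arithmetic. The sketch below assumes each successful heartbeat resets expires_at to "now + TTL" and that a heartbeat arriving exactly at expiry is too late; the function and field names are illustrative, not part of the proposal.

```typescript
interface LeaseTiming {
  missedHeartbeatsTolerated: number; // consecutive heartbeats that can be lost before the lease lapses
  minBlockAfterCrashMs: number;      // holder dies just before its next heartbeat is due
  maxBlockAfterCrashMs: number;      // holder dies right after a heartbeat lands
  writesPerMinute: number;           // heartbeat writes per held lease
}

function leaseTiming(ttlMs: number, heartbeatMs: number): LeaseTiming {
  if (heartbeatMs <= 0 || ttlMs <= heartbeatMs) {
    throw new Error("TTL must be longer than the heartbeat interval");
  }
  return {
    // Heartbeats due strictly before expiry, minus the one that must still arrive.
    missedHeartbeatsTolerated: Math.ceil(ttlMs / heartbeatMs) - 2,
    minBlockAfterCrashMs: ttlMs - heartbeatMs,
    maxBlockAfterCrashMs: ttlMs,
    writesPerMinute: 60000 / heartbeatMs,
  };
}

// The proposed values: TTL 60s, heartbeat 20s.
const proposed = leaseTiming(60000, 20000);
```

With the proposed values, a crashed bench blocks its host for between 40 and 60 seconds, one lost heartbeat is survivable, and each held lease costs three writes per minute.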
Risks
| Risk | Mitigation |
|---|---|
| A crashed exclusive holder blocks a host | TTL + heartbeat; expiry sweep reclaims lapsed leases |
| Honor-protocol drift across four services | single shared lease-check helper in @boocode/contracts-adjacent shared code, consumed by all four; integration test per writer |
| DB unreachable mid-dispatch | documented fail-open default; lease is advisory, never a hard dependency for interactive chat |
| Lease check adds latency to every chat turn | lightweight read-before-dispatch (one indexed SELECT by provider_id); no per-turn write on the interactive path |
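
The shared lease-check helper named in the table could be as small as a pure function over the row returned by the indexed SELECT. The sketch below is hypothetical: ActiveLease, HonorDecision, and shouldDispatch are illustrative names, the exclusive flag is an assumed column, and the wording of the refusal reason is a placeholder. The read itself (one lookup by provider_id) is left to the caller.

```typescript
// Hypothetical view of a lease row as seen by an honoring writer.
interface ActiveLease {
  holderId: string;
  purpose: string;
  exclusive: boolean;
  expiresAt: number; // epoch milliseconds
}

type HonorDecision =
  | { dispatch: true }
  | { dispatch: false; reason: string };

// Read-before-dispatch: no writes, just a decision on the row that was read.
function shouldDispatch(lease: ActiveLease | null, callerId: string, now: number): HonorDecision {
  if (lease === null) return { dispatch: true };              // no lease on this host
  if (lease.expiresAt <= now) return { dispatch: true };      // lapsed lease, treat as absent
  if (!lease.exclusive) return { dispatch: true };            // shared leases do not block
  if (lease.holderId === callerId) return { dispatch: true }; // the holder may use its own host
  return {
    dispatch: false,
    reason: "host held exclusively by " + lease.holderId + " (" + lease.purpose + ")",
  };
}
```

Keeping the decision pure means each of the four writers can share one implementation and be integration-tested with fixture rows, without a live DB on the interactive path.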
References
- BooControl design §8 Fleet coordination lease (P8 — cross-service) and the P3 seam contract (acquireHostAccess).
- apps/control/src/services/host-access.ts: the seam to swap.
- apps/control/src/schema.sql: where control_host_leases lands.
Recommended resolutions (draft)
These are draft recommendations for operator ratification before this change is promoted to READY.
- Exclusive vs shared semantics for interactive traffic: Use exclusive leases only for bench/eval holders in v1; BooChat, BooCoder, and Arena should read-before-dispatch and avoid writing shared leases. Rationale: this keeps interactive latency and availability close to current behavior while still giving scheduled control work a clear isolation signal.
- Honor enforcement granularity: Use a per-request honor check in v1, not a per-session hold. Rationale: it is the smallest cross-service contract and keeps long-lived chats from pinning a host across unrelated turns; document the residual boundary race.
- Heartbeat interval and lease TTL: Use a 60s TTL with a 20s heartbeat, with expired rows reclaimed during acquire plus an opportunistic sweep. Rationale: this bounds crash recovery to about one minute while keeping write traffic low.
- DB-unreachable failure mode: Fail open for interactive honorers, but fail closed for BooControl work that requires acquiring an exclusive lease. Rationale: chat availability should not depend on the advisory lease table, while unattended bench/eval work should not claim reproducible isolation when the lease cannot be acquired.
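
The asymmetric failure policy in the last recommendation reduces to a small decision table. The following is a hedged sketch with hypothetical names (WriterRole, LeaseLookup, decideOnLookup); it models the outcome of the lease read or acquire as a value rather than performing any I/O, and it collapses "queue behind the lease" and "fail fast" into a single refuse outcome.

```typescript
// "interactive" covers the voluntary honorers; "control" is BooControl bench/eval work.
type WriterRole = "interactive" | "control";

// Hypothetical outcome of the lease read (interactive) or acquire attempt (control).
type LeaseLookup =
  | { status: "free" }
  | { status: "held-by-other" }
  | { status: "db-unreachable" };

type Action = "dispatch" | "refuse";

function decideOnLookup(role: WriterRole, lookup: LeaseLookup): Action {
  switch (lookup.status) {
    case "free":
      return "dispatch";
    case "held-by-other":
      return "refuse";
    case "db-unreachable":
      // Fail open for chat availability; fail closed for work that claims isolation.
      return role === "interactive" ? "dispatch" : "refuse";
  }
}
```

The only cell where the two roles differ is the unreachable-DB row, which is exactly where the residual race for interactive traffic would need to be documented.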