# Fleet coordination lease — tasks

**Status:** OUTLINE. Do not start until the proposal's open questions are
resolved and this folder is promoted to READY. Task granularity here is
deliberately coarse; a full implementation plan (per `boo-planning-changes`) is
the first step once READY.

## L0 — design pass (gate)

- [ ] Resolve the four open questions in `proposal.md` (exclusive vs shared,
  enforcement granularity, TTL/heartbeat, DB-unreachable failure mode).
- [ ] Write `design.md`: lease state machine, the atomic acquire SQL (conditional
  upsert, no check-then-act), the honor-protocol contract shared by all four
  writers, and the integration-test matrix.

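As a starting point for `design.md`, the atomic acquire could look roughly like the sketch below. It assumes PostgreSQL, a unique constraint on `provider_id` (at most one lease row per provider, which only fits the exclusive mode), string identifiers, and a TTL passed in seconds. None of these are settled by the proposal yet, and `buildAcquireQuery` and `wasAcquired` are hypothetical names used only for illustration.

```typescript
// Sketch only: assumes PostgreSQL and a UNIQUE constraint on provider_id.
// The existing row is taken over only when its lease has expired or it is
// already held by the same holder, so there is no separate read step.
const ACQUIRE_SQL = `
INSERT INTO control_host_leases
  (provider_id, holder, purpose, mode, expires_at, heartbeat_at)
VALUES
  ($1, $2, $3, 'exclusive', now() + make_interval(secs => $4), now())
ON CONFLICT (provider_id) DO UPDATE
  SET holder = EXCLUDED.holder,
      purpose = EXCLUDED.purpose,
      mode = EXCLUDED.mode,
      expires_at = EXCLUDED.expires_at,
      heartbeat_at = EXCLUDED.heartbeat_at
  WHERE control_host_leases.expires_at <= now()
     OR control_host_leases.holder = EXCLUDED.holder
RETURNING provider_id, holder, expires_at
`;

interface AcquireQuery {
  text: string;
  values: [string, string, string, number];
}

// Hypothetical helper: pairs the statement with its positional parameters.
function buildAcquireQuery(
  providerId: string,
  holder: string,
  purpose: string,
  ttlSeconds: number,
): AcquireQuery {
  if (!Number.isFinite(ttlSeconds) || ttlSeconds <= 0) {
    throw new RangeError("ttlSeconds must be a positive number");
  }
  return { text: ACQUIRE_SQL, values: [providerId, holder, purpose, ttlSeconds] };
}

// When the WHERE clause rejects the update, RETURNING yields zero rows,
// which the caller reads as "another holder has a live lease".
function wasAcquired(rowCount: number): boolean {
  return rowCount === 1;
}
```

The single statement both inserts a fresh lease and takes over a lapsed one, which is what removes the check-then-act window; a shared mode would need a different key and is not covered here.
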
## L1 — schema + lease service (apps/control)

- [ ] `control_host_leases` in `apps/control/src/schema.sql`: `provider_id`,
  `holder`, `purpose`, `mode` (shared|exclusive), `expires_at`, `heartbeat_at`,
  idempotent DDL. Index for the hot read path (active lease by `provider_id`).
- [ ] Lease service: `acquire` (atomic conditional upsert), `heartbeat`,
  `release`, and an expiry sweep timer (reclaim lapsed leases) following the
  retention-timer pattern.
- [ ] Pure helpers unit-tested (lease-conflict decision, expiry check) per the
  `turn-guard.ts` pattern; DB-gated integration tests `describe.runIf(DATABASE_URL)`.

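The pure helpers named above might take a shape like the following minimal sketch. The `Lease` type mirrors the planned columns, but the conflict policy (any live exclusive lease on either side conflicts, a holder never conflicts with itself) is an assumption pending the L0 design pass, and `isExpired` and `findConflict` are hypothetical names.

```typescript
type LeaseMode = "shared" | "exclusive";

// Mirrors the planned control_host_leases columns (camel-cased for TS).
interface Lease {
  providerId: string;
  holder: string;
  purpose: string;
  mode: LeaseMode;
  expiresAt: Date;
  heartbeatAt: Date;
}

// A lease counts as expired once `now` reaches its expiry time.
function isExpired(lease: Lease, now: Date): boolean {
  return lease.expiresAt.getTime() <= now.getTime();
}

// Returns the first lease that blocks the request, or undefined if none does.
// Assumed policy: expired leases and the requester's own leases never block;
// otherwise an exclusive lease on either side is a conflict.
function findConflict(
  active: Lease[],
  holder: string,
  requested: LeaseMode,
  now: Date,
): Lease | undefined {
  return active.find(
    (lease) =>
      !isExpired(lease, now) &&
      lease.holder !== holder &&
      (lease.mode === "exclusive" || requested === "exclusive"),
  );
}
```

Both functions take `now` as a parameter rather than reading the clock, so they can be unit-tested without a database or fake timers.
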
## L2 — swap the BooControl seam

- [ ] Replace the body of `acquireHostAccess(providerId, purpose)` in
  `apps/control/src/services/host-access.ts` with a real exclusive-lease
  acquire + heartbeat for bench/eval purposes. Do NOT touch the bench engine
  (it already gates through the seam).
- [ ] Return a `HostGrant` that carries a release handle/heartbeat lifecycle the
  bench runner can drive in its `finally`.

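One way the grant lifecycle could look is sketched below. The `LeaseBackend` interface, the `withHostAccess` wrapper, and the extra `backend` and `heartbeatMs` parameters are hypothetical and exist only to keep the sketch self-contained; the real seam keeps its `acquireHostAccess(providerId, purpose)` signature. How a failed heartbeat should be handled is one of the open questions, so the sketch simply ignores it.

```typescript
// Hypothetical backend; in the real service these would be the lease
// service's acquire / heartbeat / release calls against the database.
interface LeaseBackend {
  acquire(providerId: string, purpose: string): Promise<boolean>;
  heartbeat(providerId: string): Promise<void>;
  release(providerId: string): Promise<void>;
}

interface HostGrant {
  providerId: string;
  purpose: string;
  release(): Promise<void>;
}

async function acquireHostAccess(
  backend: LeaseBackend,
  providerId: string,
  purpose: string,
  heartbeatMs = 10_000,
): Promise<HostGrant> {
  if (!(await backend.acquire(providerId, purpose))) {
    throw new Error(`host ${providerId} is already leased`);
  }
  // Keep the lease alive until release; heartbeat errors are ignored here
  // because the DB-unreachable failure mode is still undecided.
  const timer = setInterval(() => {
    backend.heartbeat(providerId).catch(() => {});
  }, heartbeatMs);
  let released = false;
  return {
    providerId,
    purpose,
    // Idempotent: stops the heartbeat, then releases the lease exactly once.
    async release() {
      if (released) return;
      released = true;
      clearInterval(timer);
      await backend.release(providerId);
    },
  };
}

// Shows how a runner could drive the grant from its `finally`.
async function withHostAccess<T>(
  backend: LeaseBackend,
  providerId: string,
  purpose: string,
  run: (grant: HostGrant) => Promise<T>,
): Promise<T> {
  const grant = await acquireHostAccess(backend, providerId, purpose);
  try {
    return await run(grant);
  } finally {
    await grant.release();
  }
}
```

Making `release` idempotent means the runner can call it unconditionally in `finally`, even if an earlier error path already released the grant.
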
## L3 — honor-protocol in the other three writers

- [ ] BooChat (`apps/server`): read-before-dispatch active-exclusive-lease check
  on the inference path; clear "host leased for <purpose>" surfacing.
- [ ] BooCoder (`apps/coder`): same check at the dispatch fetch sites.
- [ ] Arena (`apps/coder`): same check at the battle fetch sites.
- [ ] A single shared lease-check helper consumed by all four (avoid drift); one
  integration test per writer proving it honors an exclusive lease.

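The shared lease-check helper could be as small as the sketch below. The names `ActiveLease`, `HostLeasedError`, and `assertHostAvailable` are hypothetical, and the helper assumes the caller has already read the provider's current lease row (or `null` if there is none) just before dispatch.

```typescript
interface ActiveLease {
  holder: string;
  purpose: string;
  mode: "shared" | "exclusive";
  expiresAt: Date;
}

// Carries the purpose so each writer can surface "host leased for <purpose>".
class HostLeasedError extends Error {
  providerId: string;
  purpose: string;

  constructor(providerId: string, purpose: string) {
    super(`host leased for ${purpose}`);
    this.name = "HostLeasedError";
    this.providerId = providerId;
    this.purpose = purpose;
  }
}

// Throws when someone else holds a live exclusive lease on the provider.
// Shared leases, expired leases, and the caller's own lease pass through.
function assertHostAvailable(
  providerId: string,
  lease: ActiveLease | null,
  self: string,
  now: Date = new Date(),
): void {
  if (lease === null) return;
  const live = lease.expiresAt.getTime() > now.getTime();
  if (live && lease.mode === "exclusive" && lease.holder !== self) {
    throw new HostLeasedError(providerId, lease.purpose);
  }
}
```

A writer would call `assertHostAvailable` immediately before its fetch and turn a `HostLeasedError` into a user-facing message; since this is a read-before-dispatch check, it relies on every writer honoring it rather than on enforcement at the host.
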
## L4 — unlock unattended scheduling

- [ ] Unattended bench scheduling (the deferred half of BooControl P3): a
  scheduler that acquires the exclusive lease, runs, releases.
- [ ] Reproducible concurrency sweeps behind the lease (no foreign traffic).
- [ ] Smoke: schedule an overnight bench; confirm it never evicts a live model
  and that `concurrent_foreign_requests` is 0 for leased runs.