chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00
parent 0ed506f1da
commit b18de2a331
204 changed files with 25344 additions and 867 deletions


@@ -0,0 +1,126 @@
# P2 Code Review — Fix Report
**Date:** 2026-06-12
**Status:** ALL BLOCKING FINDINGS FIXED
---
## B1 (REFUTED by supervisor) — No action taken.
The reviewer claimed routes need prefix changes. The supervisor correctly noted that `control-proxy.ts` rewrites `/api/control/*` to `/api/*`, so the control service routes are correct as-is.
---
## B2 (FIXED) — jobType 'action' as any
**Problem:** `actions.ts:70` used `jobType: 'action' as any`, violating the contract enum `['bench', 'eval']`. The web type guard silently dropped every action job frame.
**Fix:**
- `packages/contracts/src/ws-frames.ts:548` — added `'action'` to `z.enum(['bench', 'eval', 'action'])`
- `apps/web/src/api/types.ts:591` — mirrored: `jobType: 'bench' | 'eval' | 'action'`
- `apps/web/src/hooks/useControlStream.tsx:166` — type guard: `['bench', 'eval', 'action'].includes(...)`
- `apps/web/src/hooks/useControlStream.tsx:180` — ControlStreamState jobs type updated
- `apps/control/src/routes/actions.ts:70` — `as any` removed, now `as const`
- Rebuilt contracts: `pnpm -C packages/contracts build`
**Verification:** contracts test (29 tests), control build, web tsc --noEmit all pass.
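
A minimal sketch of the guard pattern behind this fix. The names `JOB_TYPES`, `isJobType`, and `acceptJobFrame` are hypothetical, and a plain const array stands in for the real `z.enum` in `ws-frames.ts` so the block has no dependencies. Deriving the type and the runtime guard from one list is one way to keep them from drifting apart again.

```typescript
// Hypothetical stand-in for the contract enum (the real one is a z.enum in ws-frames.ts).
const JOB_TYPES = ['bench', 'eval', 'action'] as const;
type JobType = (typeof JOB_TYPES)[number];

// Runtime guard derived from the same list as the compile-time type.
function isJobType(value: unknown): value is JobType {
  return typeof value === 'string' && (JOB_TYPES as readonly string[]).includes(value);
}

// Frames whose jobType fails the guard are dropped. Before the fix,
// 'action' was missing from the list, so every action frame ended up here as null.
function acceptJobFrame(frame: { jobType?: unknown }): JobType | null {
  return isJobType(frame.jobType) ? frame.jobType : null;
}
```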
---
## B3 (FIXED) — rebuildFleetFromDB iteration order
**Problem:** Model events were queried with `ORDER BY ts DESC`, so the oldest row for each model was written to the Map last and overwrote the newest state.
**Fix:** `apps/control/src/index.ts:274` — changed to `ORDER BY ts ASC`. With ASC iteration, `Map.set()` overwrites with the latest state for each model, so the newest event wins.
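
The last-write-wins behaviour can be sketched as follows. The row shape (`modelId`, `state`, `ts`) and the function name are assumptions for illustration, not the actual schema.

```typescript
// Hypothetical row shape; the real model-event columns may differ.
interface ModelEventRow { modelId: string; state: string; ts: string }

// Rows must arrive oldest-first (ORDER BY ts ASC) so that the final
// set() for each key is the newest event.
function rebuildFleet(rowsAsc: ModelEventRow[]): Map<string, ModelEventRow> {
  const fleet = new Map<string, ModelEventRow>();
  for (const row of rowsAsc) fleet.set(row.modelId, row);
  return fleet;
}

const rows: ModelEventRow[] = [
  { modelId: 'm1', state: 'loading', ts: '2026-06-12T10:00:00Z' },
  { modelId: 'm1', state: 'loaded', ts: '2026-06-12T10:00:05Z' },
];
const fleetAsc = rebuildFleet(rows);                  // m1 ends as 'loaded'
const fleetDesc = rebuildFleet([...rows].reverse());  // m1 ends as 'loading' (the old bug)
```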
---
## B4 (FIXED) — ttlDeadline recalculation
**Problem:** Rebuild computed `new Date(Date.now() + ttl * 1000)`, giving models a fresh TTL from rebuild time instead of from event time.
**Fix:** `apps/control/src/index.ts:297-299` — changed to `new Date(eventTs + ttl * 1000)` where `eventTs = new Date(row.ts).getTime()`. This matches the semantic intent: the deadline reflects when the model was actually loaded, not when we rebuild.
**Evidence:** The live handler (`index.ts:57`) does `new Date(Date.now() + ttl * 1000)` relative to event arrival. The rebuild now uses the event timestamp, which is the correct reference point for a historical event.
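
A small sketch of the rebuild-path arithmetic, assuming `ttl` is in seconds (as the `* 1000` implies) and `row.ts` is an ISO timestamp string. The function name is hypothetical.

```typescript
// Deadline anchored to the event's own timestamp rather than to Date.now().
function ttlDeadlineFromEvent(rowTs: string, ttlSeconds: number): Date {
  const eventTs = new Date(rowTs).getTime();
  return new Date(eventTs + ttlSeconds * 1000);
}

// A model loaded at 10:00:00 with a 300 s TTL expires at 10:05:00,
// no matter when the rebuild runs.
const deadline = ttlDeadlineFromEvent('2026-06-12T10:00:00.000Z', 300);
```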
---
## B5 (FIXED) — currentEventType resets between network chunks
**Problem:** `fleet-connector.ts:204` declared `currentEventType` inside the chunk-read loop, so an `event:` line in one network chunk and its `data:` line in the next lost the event type.
**Fix:** `apps/control/src/services/fleet-connector.ts:196-198` — hoisted `let currentEventType: string | null = null` outside the `while (!signal.aborted)` read loop, making it connection-scoped. Added comment explaining the rationale.
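
A simplified sketch of why the variable has to be connection-scoped. This is not the `fleet-connector.ts` implementation: `createSseParser`, the `pending` partial-line buffer, and the event name `model_loaded` are assumptions, and `\r\n` line endings and multi-line `data:` fields are ignored.

```typescript
interface SseEvent { type: string | null; data: string }

function createSseParser(): (chunk: string) => SseEvent[] {
  // Connection-scoped state: declared once, outside the per-chunk function,
  // so it survives from one network chunk to the next.
  let currentEventType: string | null = null;
  let pending = ''; // partial line carried over from the previous chunk (assumption)

  return (chunk: string): SseEvent[] => {
    const events: SseEvent[] = [];
    const lines = (pending + chunk).split('\n');
    pending = lines.pop() ?? ''; // last piece is incomplete until its newline arrives
    for (const line of lines) {
      if (line.startsWith('event:')) {
        currentEventType = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        events.push({ type: currentEventType, data: line.slice('data:'.length).trim() });
      } else if (line === '') {
        currentEventType = null; // a blank line terminates the event
      }
    }
    return events;
  };
}

// The `event:` line and its `data:` line arrive in separate chunks.
const feed = createSseParser();
const firstChunk = feed('event: model_loaded\n');
const secondChunk = feed('data: {"id":"m1"}\n\n');
```

Had `currentEventType` been declared inside the returned function, `secondChunk[0].type` would be `null`, which is the failure described above.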
---
## B6 (FIXED) — late joiners never receive log tail
**Problem:** WS connect sends fleet snapshot but never replays the in-memory LogRelay tail.
**Fix:**
- `apps/control/src/routes/ws.ts` — `registerControlWebSocket` now accepts `logRelay: LogRelay | null` parameter
- After sending the fleet snapshot, iterates `logRelay.getAllTails()` and sends each as a `control_log` frame
- `apps/control/src/index.ts:363` — passes `logRelay` to `registerControlWebSocket`
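
The connect-time ordering can be sketched as below. Everything except `getAllTails()` and the `control_log` frame type is an assumption: the interfaces, the entry shape, the `fleet_snapshot` frame name, and `onConnect` are hypothetical stand-ins for the real types in `apps/control`.

```typescript
// Hypothetical shapes; the real LogRelay and frame contracts may differ.
interface LogTailEntry { source: string; line: string }
interface LogRelayLike { getAllTails(): LogTailEntry[] }
interface SocketLike { send(payload: string): void }

function onConnect(socket: SocketLike, fleetSnapshot: unknown, logRelay: LogRelayLike | null): void {
  // 1. Snapshot first, as before the fix.
  socket.send(JSON.stringify({ type: 'fleet_snapshot', fleet: fleetSnapshot }));
  // 2. Then replay the buffered tail so late joiners see recent log lines.
  if (logRelay === null) return; // relay not configured: nothing to replay
  for (const entry of logRelay.getAllTails()) {
    socket.send(JSON.stringify({ type: 'control_log', ...entry }));
  }
}
```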
---
## B7 (FIXED) — capture string interpolation into ::jsonb
**Problem:** `index.ts:120` did `` ${captureTrimmed ? sql`'${captureTrimmed}'::jsonb` : ...} ``, which interpolates a JSON string into a quoted `::jsonb` fragment, so the capture was stored double-serialized (as a JSON string rather than an object).
**Fix:**
- `apps/control/src/services/retention.ts` — added `parseCaptureJson()` that parses the trimmed string into an object (or null for invalid JSON)
- `apps/control/src/index.ts:118-122` — pipeline: `trimCapture()` -> `parseCaptureJson()` -> `sql.json(parsedObj as never)` per convention
- Added test in `retention.test.ts` asserting the parsed result is an object suitable for `sql.json()`, not a string
- Also fixed `trimCapture` to use `Buffer.byteLength` instead of `length * 2` for accurate byte counting
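
A sketch of the two helpers' behaviour, not the code in `retention.ts`. Two assumptions: this `parseCaptureJson` also returns null for JSON primitives (the report only specifies null for invalid JSON), and `TextEncoder` stands in for `Buffer.byteLength` so the block runs outside Node as well. Both count UTF-8 bytes.

```typescript
// Parse the trimmed capture into an object; null when it is not valid JSON.
// Assumption: primitives such as "42" are also rejected, since only an
// object/array is useful as a JSONB capture.
function parseCaptureJson(trimmed: string): object | null {
  try {
    const parsed: unknown = JSON.parse(trimmed);
    return typeof parsed === 'object' && parsed !== null ? parsed : null;
  } catch {
    return null;
  }
}

// UTF-8 byte length (the same quantity Buffer.byteLength(s, 'utf8') reports).
const utf8Bytes = (s: string): number => new TextEncoder().encode(s).length;

// `length * 2` is wrong in both directions:
const asciiEstimate = 'abc'.length * 2; // 6, but the string is 3 bytes in UTF-8
const euroEstimate = '€'.length * 2;    // 2, but the string is 3 bytes in UTF-8
```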
---
## B8 (CONFIRMED + FIXED) — 'model' source log lines silently dropped
**Trace:**
1. `index.ts:103` — publishes `source: event.data.source as 'proxy' | 'upstream'` (cast is no-op at runtime; 'model' passes through)
2. `ws-frames.ts:540` — contracts enum was `['proxy', 'upstream']` only
3. `useControlStream.tsx:155` — type guard checked `['proxy', 'upstream'].includes(...)` — 'model' fails
4. Frame silently dropped at the JSON parse boundary
**Fix (end-to-end):**
- `packages/contracts/src/ws-frames.ts:540` — `z.enum(['proxy', 'upstream', 'model'])`
- `apps/web/src/api/types.ts:584` — `source: 'proxy' | 'upstream' | 'model'`
- `apps/web/src/hooks/useControlStream.tsx:47` — `ControlLogEntry.source` widened
- `apps/web/src/hooks/useControlStream.tsx:75` — `ControlLogFrame.source` widened
- `apps/web/src/hooks/useControlStream.tsx:155` — type guard: `['proxy', 'upstream', 'model'].includes(...)`
- `apps/control/src/index.ts:103` — source cast widened to include 'model'
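
Step 1 of the trace is easy to miss, so here is a minimal sketch of it. The names are hypothetical. The point is only that a TypeScript `as` cast is erased at compile time and performs no runtime check.

```typescript
type NarrowSource = 'proxy' | 'upstream';

// The cast satisfies the compiler but does nothing at runtime,
// so a 'model' value passes through unchanged.
function publishSource(raw: string): NarrowSource {
  return raw as NarrowSource;
}

// Receiver-side guard lists before and after the fix.
const OLD_SOURCES: readonly string[] = ['proxy', 'upstream'];
const NEW_SOURCES: readonly string[] = ['proxy', 'upstream', 'model'];

const published: string = publishSource('model');   // still 'model' at runtime
const keptBefore = OLD_SOURCES.includes(published);  // false: frame was dropped
const keptAfter = NEW_SOURCES.includes(published);   // true: frame is accepted
```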
---
## A1 (FIXED) — handleReconcile swallows errors
**Problem:** `index.ts:112-114` — `.catch(() => { /* DB failure must not crash the process. */ })`
**Fix:** `apps/control/src/index.ts:112-115` — logs the error: `console.warn({ providerId, err: msg }, 'fleet: reconcile failed')`
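
A sketch of the log-instead-of-swallow pattern. `toErrorMessage`, `reconcileSafely`, and the injected `warn` callback are hypothetical; the real handler calls `console.warn` directly as shown above.

```typescript
// Normalise an unknown rejection value into a loggable string.
function toErrorMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// A DB failure still must not crash the process, but it is now reported.
async function reconcileSafely(
  providerId: string,
  reconcile: () => Promise<void>,
  warn: (fields: object, message: string) => void,
): Promise<void> {
  await reconcile().catch((err: unknown) => {
    warn({ providerId, err: toErrorMessage(err) }, 'fleet: reconcile failed');
  });
}
```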
---
## Test results
```
contracts: 29 tests, 2 passed (29 passed)
control: 74 tests, 10 passed (74 passed)
server: 575 tests, 50 passed | 2 skipped (586 total)
web tsc: 0 errors (clean)
```
## Files changed (this batch)
| File | Change |
|------|--------|
| `packages/contracts/src/ws-frames.ts` | B2: 'action' to jobType; B8: 'model' to source |
| `apps/web/src/api/types.ts` | B2+B8: mirrored enums |
| `apps/web/src/hooks/useControlStream.tsx` | B2+B8: type guards + ControlStreamState |
| `apps/control/src/routes/actions.ts` | B2: removed `as any` |
| `apps/control/src/index.ts` | B3: ASC order; B4: eventTs ttlDeadline; B7: sql.json; A1: error log |
| `apps/control/src/services/fleet-connector.ts` | B5: hoisted currentEventType |
| `apps/control/src/routes/ws.ts` | B6: logRelay replay on connect |
| `apps/control/src/services/retention.ts` | B7: parseCaptureJson + byteLength fix |
| `apps/control/src/services/__tests__/retention.test.ts` | B7: JSONB object test |