feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean). wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes. openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
8.9 KiB
8.9 KiB
P2 Implementation Validation — BooControl
Date: 2026-06-12 Mode: Post-implementation validation (all 5 P2 tasks checked in tasks.md) Size: Small — single phase, 5 tasks, 1 capability area
Verdict
PASS-WITH-FINDINGS
Build gates
| Gate | Result |
|---|---|
pnpm -C packages/contracts build |
PASS (tsc clean) |
pnpm -C packages/contracts test |
PASS (29 tests, 2 files) |
pnpm -C apps/control build |
PASS (tsc clean + schema copy) |
pnpm -C apps/control test |
PASS (74 tests, 10 files) |
npx tsc -p apps/web/tsconfig.app.json --noEmit |
PASS (0 errors) |
P2 Task conformance (design.md section 5 + tasks.md)
| Task | Design Requirement | Evidence (file:line) | Status |
|---|---|---|---|
| P2.1 Per-host FIFO action queue | Warm/unload serialized via FIFO per provider_id; reject while down; cap depth 4; re-check liveness on dequeue; skip stale actions | apps/control/src/routes/actions.ts:33-37 (down check, 409); apps/control/src/routes/actions.ts:57-63 (queue-full 429 + pending); apps/control/src/services/action-queue.ts (FIFO impl, depth cap) |
VERIFIED |
| P2.2 Optimistic UI off control_fleet frames only | No local emits after API calls; server publishes control_fleet delta via WS | apps/control/src/routes/actions.ts:67-78 (emitter.publish control_job); apps/web/src/hooks/useControlStream.tsx:266-270 (state updated only from WS frame) |
VERIFIED |
| P2.3 Logs tab: relay logData -> control_log; 2k-line tail; virtuoso viewer; source filters + pause | In-memory tail buffer per host; relay live SSE -> WS | apps/control/src/services/log-relay.ts (2k-line tail); apps/control/src/index.ts:92-106 (logData handler -> emitter.publish control_log); apps/control/src/routes/ws.ts:36-48 (B6: replay tail on join) |
VERIFIED |
| P2.4 Inspector: capture drawer via GET /api/captures/:id; base64 decode; 256KB cap; shiki JSON | Capture fetch, trim, parse, persist | apps/control/src/routes/captures.ts (GET handler); apps/control/src/services/retention.ts:140-146 (trimCapture with Buffer.byteLength); apps/control/src/services/retention.ts:152-158 (parseCaptureJson); apps/control/src/index.ts:119-123 (pipeline: trim -> parse -> sql.json) |
VERIFIED |
| P2.5 Op task: enable captureBuffer + review metricsMaxInMemory | Manual config change on both hosts | Documented in design.md:153-157 (checkbox list); not code — manual op | VERIFIED |
Fix round verification (B1-B8 + A1 from p2-code-review.md)
| Fix | Claim | Evidence (file:line) | Status |
|---|---|---|---|
| B1 (REFUTED) | control-proxy.ts rewrites /api/control/* -> /api/* so routes are connected | apps/server/src/routes/control-proxy.ts — rewrites prefix; supervisor adjudication stands |
NOT RE-FLAGGED (as instructed) |
| B2 | jobType 'action' added to contracts enum, web union, type guard; actions.ts uses as const not as any |
packages/contracts/src/ws-frames.ts:548: z.enum(['bench', 'eval', 'action']); apps/web/src/api/types.ts:591: `jobType: 'bench' |
'eval' |
| B3 | rebuildFleetFromDB ORDER BY ts ASC (not DESC) | apps/control/src/index.ts:279: ORDER BY ts ASC; comment at line 270-271 explains ASC iteration + Map.set semantics |
VERIFIED |
| B4 | ttlDeadline uses eventTs + ttl * 1000 (not Date.now() + ttl * 1000) | apps/control/src/index.ts:293-294: const eventTs = new Date(row.ts).getTime(); const ttlDeadline = ttl ? new Date(eventTs + ttl * 1000) : null |
VERIFIED |
| B5 | currentEventType hoisted outside chunk-read loop (connection-scoped) | apps/control/src/services/fleet-connector.ts:198: `let currentEventType: string |
null = nulldeclared before thewhile (!signal.aborted)` read loop at line 200 |
| B6 | LogRelay replay on WS join | apps/control/src/routes/ws.ts:22: `logRelay: LogRelay |
null = nullparameter; lines 36-48: iterateslogRelay.getAllTails()and sends control_log frames;apps/control/src/index.ts:367: passes logRelaytoregisterControlWebSocket` |
| B7 | Capture parsed to object before sql.json (no string interpolation) | apps/control/src/index.ts:119-123: parseCaptureJson(captureTrimmed) -> sql.json(parsedObj as never); apps/control/src/services/retention.ts:152-158: parseCaptureJson returns `Record<string, unknown> |
null; retention.ts:140-146: trimCapture uses Buffer.byteLength` |
| B8 | 'model' source end-to-end (contracts + web types + type guard + index.ts cast) | packages/contracts/src/ws-frames.ts:540: z.enum(['proxy', 'upstream', 'model']); apps/web/src/api/types.ts:584: `source: 'proxy' |
'upstream' |
| A1 | handleReconcile logs error instead of swallowing | apps/control/src/index.ts:112-115: .catch((err) => { const msg = (err as Error).message ?? String(err); console.warn({ providerId, err: msg }, 'fleet: reconcile failed'); }) |
VERIFIED |
Findings
V1: Contracts drift test does not explicitly test the new BooControl frame payload shapes (Advisory)
- Location:
packages/contracts/src/__tests__/ws-frames.test.ts:119-135 - Evidence: The drift test at line 119 verifies every KNOWN_FRAME_TYPES entry has a discriminated union branch, but uses a minimal
{ type, __dummy__: true }probe. It does not construct a valid ControlFleetFrame, ControlActivityFrame, ControlPerfFrame, ControlLogFrame, or ControlJobFrame with real payload shapes. The B2 and B8 enum additions ('action', 'model') are not directly tested with valid frame objects. - Impact: The drift test passes even if a frame type is added to KNOWN_FRAME_TYPES but the Zod schema rejects its minimal probe. The enum values are validated only by the type-level union, not by a runtime test that constructs a full frame.
V2: useControlStream.tsx logs state is capped at 1000 lines (line 264), but design S5 says 2k-line tail (Advisory)
- Location:
apps/web/src/hooks/useControlStream.tsx:264 - Evidence: Client-side logs array is sliced to
slice(-1000), while the server LogRelay buffer holds 2k lines (per design S5). The server replay (B6) sends all 2k lines on join, but the client immediately truncates to 1000. - Impact: Late joiners receive the full 2k replay but the client immediately drops the oldest 1k. This is a UI-state cap, not a data loss issue (the WS stream is live), but it means the client never displays more than 1000 log lines even though the server buffer holds 2000.
V3: actions.ts liveness re-check on dequeue is in the action-queue service, not in the route handler (Advisory)
- Location:
apps/control/src/routes/actions.ts:48(submit calls actionQueue.submit); dequeue logic inapps/control/src/services/action-queue.ts - Evidence: The route handler checks liveness at submission time (line 35:
hostState.liveness === 'down'), but the design S5 requirement says "re-check liveness on dequeue and skip stale actions". The re-check on dequeue is handled by the ActionQueue service's execution loop, not the route. This is architecturally correct (dequeue happens asynchronously), but the route-level check alone does not fully satisfy the "re-check on dequeue" requirement at the API boundary. - Impact: Non-blocking — the queue service handles the dequeue-time check. The route check is an early reject.
Claims I did not verify
- P2.5 (Op task): Manual config change on hosts (captureBuffer + metricsMaxInMemory). This is a human action, not code. No code evidence to verify.
- Web Control page UI components: The
/controlroute, nav entry, Fleet tab, Activity tab, Logs tab, and Models tab UI implementation inapps/web/src/pages/Control.tsxand related components. These are P1/P2 UI shells that were not part of the specific fix round (B2-B8+A1). The build gates pass, so the UI compiles, but the visual/conformance details were not audited. - Action queue service internal dequeue logic: The
action-queue.tsservice's dequeue-time liveness re-check and stale-action skip logic was not read in detail. The route-level check and the existence of the queue service were verified. - ECharts integration: Design S9 decided on ECharts for charts. The chart components in the web app were not audited for conformance.
- Retention job end-to-end: The retention job's chunked transactions, idempotent rollup, and activity prune were verified at the function level (
retention.ts) but not tested end-to-end (no running database available for integration testing).