boocode/openspec/changes/boocontrol/artifacts/p2-impl-validation.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00


P2 Implementation Validation — BooControl

Date: 2026-06-12
Mode: Post-implementation validation (all 5 P2 tasks checked in tasks.md)
Size: Small (single phase, 5 tasks, 1 capability area)

Verdict

PASS-WITH-FINDINGS

Build gates

| Gate | Result |
| --- | --- |
| `pnpm -C packages/contracts build` | PASS (tsc clean) |
| `pnpm -C packages/contracts test` | PASS (29 tests, 2 files) |
| `pnpm -C apps/control build` | PASS (tsc clean + schema copy) |
| `pnpm -C apps/control test` | PASS (74 tests, 10 files) |
| `npx tsc -p apps/web/tsconfig.app.json --noEmit` | PASS (0 errors) |

P2 Task conformance (design.md section 5 + tasks.md)

| Task | Design Requirement | Evidence (file:line) | Status |
| --- | --- | --- | --- |
| P2.1 Per-host FIFO action queue | Warm/unload serialized via FIFO per provider_id; reject while down; cap depth 4; re-check liveness on dequeue; skip stale actions | apps/control/src/routes/actions.ts:33-37 (down check, 409); apps/control/src/routes/actions.ts:57-63 (queue-full 429 + pending); apps/control/src/services/action-queue.ts (FIFO impl, depth cap) | VERIFIED |
| P2.2 Optimistic UI off control_fleet frames only | No local emits after API calls; server publishes control_fleet delta via WS | apps/control/src/routes/actions.ts:67-78 (emitter.publish control_job); apps/web/src/hooks/useControlStream.tsx:266-270 (state updated only from WS frame) | VERIFIED |
| P2.3 Logs tab: relay logData -> control_log; 2k-line tail; virtuoso viewer; source filters + pause | In-memory tail buffer per host; relay live SSE -> WS | apps/control/src/services/log-relay.ts (2k-line tail); apps/control/src/index.ts:92-106 (logData handler -> emitter.publish control_log); apps/control/src/routes/ws.ts:36-48 (B6: replay tail on join) | VERIFIED |
| P2.4 Inspector: capture drawer via `GET /api/captures/:id`; base64 decode; 256KB cap; shiki JSON | Capture fetch, trim, parse, persist | apps/control/src/routes/captures.ts (GET handler); apps/control/src/services/retention.ts:140-146 (trimCapture with Buffer.byteLength); apps/control/src/services/retention.ts:152-158 (parseCaptureJson); apps/control/src/index.ts:119-123 (pipeline: trim -> parse -> sql.json) | VERIFIED |
| P2.5 Op task: enable captureBuffer + review metricsMaxInMemory | Manual config change on both hosts | Documented in design.md:153-157 (checkbox list); not code, manual op | VERIFIED |

Fix round verification (B1-B8 + A1 from p2-code-review.md)

| Fix | Claim | Evidence (file:line) | Status |
| --- | --- | --- | --- |
| B1 (REFUTED) | control-proxy.ts rewrites `/api/control/*` -> `/api/*` so routes are connected | apps/server/src/routes/control-proxy.ts: rewrites prefix; supervisor adjudication stands | NOT RE-FLAGGED (as instructed) |
| B2 | jobType 'action' added to contracts enum, web union, type guard; actions.ts uses `as const` not `as any` | packages/contracts/src/ws-frames.ts:548: `z.enum(['bench', 'eval', 'action'])`; apps/web/src/api/types.ts:591: `jobType: 'bench' \| 'eval' \| …` | |
| B3 | rebuildFleetFromDB ORDER BY ts ASC (not DESC) | apps/control/src/index.ts:279: `ORDER BY ts ASC`; comment at line 270-271 explains ASC iteration + Map.set semantics | VERIFIED |
| B4 | ttlDeadline uses `eventTs + ttl * 1000` (not `Date.now() + ttl * 1000`) | apps/control/src/index.ts:293-294: `const eventTs = new Date(row.ts).getTime(); const ttlDeadline = ttl ? new Date(eventTs + ttl * 1000) : null` | VERIFIED |
| B5 | currentEventType hoisted outside chunk-read loop (connection-scoped) | apps/control/src/services/fleet-connector.ts:198: `let currentEventType: string \| null = null` declared before the `while (!signal.aborted)` read loop at line 200 | |
| B6 | LogRelay replay on WS join | apps/control/src/routes/ws.ts:22: `logRelay: LogRelay \| null = null` parameter; lines 36-48: iterates `logRelay.getAllTails()` and sends control_log frames; apps/control/src/index.ts:367: passes `logRelay` to `registerControlWebSocket` | |
| B7 | Capture parsed to object before sql.json (no string interpolation) | apps/control/src/index.ts:119-123: `parseCaptureJson(captureTrimmed)` -> `sql.json(parsedObj as never)`; apps/control/src/services/retention.ts:152-158: parseCaptureJson returns `Record<string, unknown> \| null`; retention.ts:140-146: trimCapture uses `Buffer.byteLength` | |
| B8 | 'model' source end-to-end (contracts + web types + type guard + index.ts cast) | packages/contracts/src/ws-frames.ts:540: `z.enum(['proxy', 'upstream', 'model'])`; apps/web/src/api/types.ts:584: `source: 'proxy' \| 'upstream' \| …` | |
| A1 | handleReconcile logs error instead of swallowing | apps/control/src/index.ts:112-115: `.catch((err) => { const msg = (err as Error).message ?? String(err); console.warn({ providerId, err: msg }, 'fleet: reconcile failed'); })` | VERIFIED |
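
To make the B3 and B4 reasoning concrete, here is a minimal sketch of a rebuild loop with the same two properties: rows are consumed oldest-first so that `Map.set` leaves the newest row in place, and the TTL deadline is anchored to the event timestamp rather than to the rebuild time. The row shape, the `rebuildFleet` name, and the sample data are assumptions made for illustration; only the `eventTs + ttl * 1000` expression mirrors the evidence quoted above.

```typescript
// Hypothetical row shape: the real table schema is not shown in this report.
interface FleetEventRow {
  providerId: string;
  ts: string;          // ISO-8601 event timestamp
  ttl: number | null;  // seconds, as in `ttl * 1000`
  state: string;
}

interface HostSnapshot {
  state: string;
  ttlDeadline: Date | null;
}

// Rows must arrive oldest-first (ORDER BY ts ASC): Map.set overwrites,
// so the last row seen for a provider, i.e. the newest, is the one kept.
function rebuildFleet(rowsAsc: readonly FleetEventRow[]): Map<string, HostSnapshot> {
  const fleet = new Map<string, HostSnapshot>();
  for (const row of rowsAsc) {
    // Anchor the deadline to when the event happened, not to when we rebuild.
    const eventTs = new Date(row.ts).getTime();
    const ttlDeadline = row.ttl ? new Date(eventTs + row.ttl * 1000) : null;
    fleet.set(row.providerId, { state: row.state, ttlDeadline });
  }
  return fleet;
}

const rebuilt = rebuildFleet([
  { providerId: "host-a", ts: "2026-06-12T00:00:00Z", ttl: 60, state: "loaded" },
  { providerId: "host-b", ts: "2026-06-12T00:00:30Z", ttl: 300, state: "loaded" },
  { providerId: "host-a", ts: "2026-06-12T00:01:00Z", ttl: null, state: "unloaded" },
]);

console.log(rebuilt.get("host-a")?.state);                      // unloaded
console.log(rebuilt.get("host-b")?.ttlDeadline?.toISOString()); // 2026-06-12T00:05:30.000Z
```

With DESC ordering the same loop would keep the oldest row per provider, and with `Date.now()` as the base every restart would push each deadline later than the event warrants.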

Findings

V1: Contracts drift test does not explicitly test the new BooControl frame payload shapes (Advisory)

  • Location: packages/contracts/src/__tests__/ws-frames.test.ts:119-135
  • Evidence: The drift test at line 119 verifies every KNOWN_FRAME_TYPES entry has a discriminated union branch, but uses a minimal { type, __dummy__: true } probe. It does not construct a valid ControlFleetFrame, ControlActivityFrame, ControlPerfFrame, ControlLogFrame, or ControlJobFrame with real payload shapes. The B2 and B8 enum additions ('action', 'model') are not directly tested with valid frame objects.
  • Impact: Because the probe carries no payload, the drift test confirms only that a union branch exists for each entry in KNOWN_FRAME_TYPES; it still passes when the Zod schema rejects the minimal probe. The new enum values are therefore validated only by the type-level union, not by a runtime test that constructs and parses a full frame.

V2: useControlStream.tsx logs state is capped at 1000 lines (line 264), but design S5 says 2k-line tail (Advisory)

  • Location: apps/web/src/hooks/useControlStream.tsx:264
  • Evidence: The client-side logs array is truncated with slice(-1000), while the server LogRelay buffer holds 2k lines (per design S5). The server replay (B6) sends the full 2k-line tail on join.
  • Impact: A late joiner receives the full 2k-line replay, but the client keeps only the newest 1000 lines. This is a UI-state cap rather than data loss (the WS stream is live), but the client can never display more than 1000 log lines even though the server buffer holds 2000.
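
The mismatch can be illustrated with a small sketch of a capped append, assuming the client merges incoming lines and keeps only the newest ones. `appendCapped` and `LOG_TAIL_LIMIT` are hypothetical names, not identifiers from useControlStream.tsx; the point is only that one shared limit would keep the client cap and the server tail in step.

```typescript
// Hypothetical shared constant: 2000 mirrors the "2k-line tail" in the design.
const LOG_TAIL_LIMIT = 2000;

// Append incoming lines and keep only the newest `limit` entries.
function appendCapped<T>(current: readonly T[], incoming: readonly T[], limit: number): T[] {
  const merged = [...current, ...incoming];
  return merged.length > limit ? merged.slice(merged.length - limit) : merged;
}

// A 2000-line replay, as a late joiner would receive it.
const replay = Array.from({ length: 2000 }, (_, i) => `line ${i + 1}`);

const cappedAt1000 = appendCapped<string>([], replay, 1000);
const cappedAt2000 = appendCapped<string>([], replay, LOG_TAIL_LIMIT);

console.log(cappedAt1000.length, cappedAt1000[0]); // 1000 line 1001
console.log(cappedAt2000.length, cappedAt2000[0]); // 2000 line 1
```

Whether the client should hold 2000 lines is a product decision; the sketch only shows the effect of the two limits on the same replay.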

V3: actions.ts liveness re-check on dequeue is in the action-queue service, not in the route handler (Advisory)

  • Location: apps/control/src/routes/actions.ts:48 (submit calls actionQueue.submit); dequeue logic in apps/control/src/services/action-queue.ts
  • Evidence: The route handler checks liveness at submission time (line 35: hostState.liveness === 'down'), but design S5 also requires the queue to "re-check liveness on dequeue and skip stale actions". That re-check is handled by the ActionQueue service's execution loop, not by the route. The placement is architecturally correct (dequeue happens asynchronously), but it means the route-level check alone does not satisfy the "re-check on dequeue" requirement at the API boundary.
  • Impact: Non-blocking — the queue service handles the dequeue-time check. The route check is an early reject.
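
The split between the two checks can be sketched as follows. This is an illustrative model only, not the action-queue.ts implementation, which was not read in detail: the class name, method names, and the `maxAgeMs` staleness window are assumptions, while the 409/429 statuses and the depth cap of 4 come from the conformance table above.

```typescript
// Illustrative model only: these names are hypothetical, not the real service API.
type Liveness = "up" | "down";

interface QueuedAction {
  id: string;
  kind: "warm" | "unload";
  submittedAt: number; // epoch milliseconds
}

type SubmitResult =
  | { ok: true; pending: number }
  | { ok: false; status: 409 | 429; pending: number };

class HostActionQueue {
  private readonly pending: QueuedAction[] = [];
  private readonly getLiveness: () => Liveness;
  private readonly maxDepth: number;
  private readonly maxAgeMs: number;

  // maxAgeMs is an assumed definition of "stale"; the report does not define one.
  constructor(getLiveness: () => Liveness, maxDepth = 4, maxAgeMs = 60_000) {
    this.getLiveness = getLiveness;
    this.maxDepth = maxDepth;
    this.maxAgeMs = maxAgeMs;
  }

  // Submission-time check: the early reject performed at the API boundary.
  submit(action: QueuedAction): SubmitResult {
    if (this.getLiveness() === "down") {
      return { ok: false, status: 409, pending: this.pending.length };
    }
    if (this.pending.length >= this.maxDepth) {
      return { ok: false, status: 429, pending: this.pending.length };
    }
    this.pending.push(action);
    return { ok: true, pending: this.pending.length };
  }

  // Dequeue-time check: liveness may have changed while the action waited.
  // Returns the ids that actually ran, in FIFO order; skipped actions are dropped.
  drain(now: number, run: (action: QueuedAction) => void): string[] {
    const ran: string[] = [];
    let next: QueuedAction | undefined;
    while ((next = this.pending.shift()) !== undefined) {
      const stale = now - next.submittedAt > this.maxAgeMs;
      if (stale || this.getLiveness() === "down") continue;
      run(next);
      ran.push(next.id);
    }
    return ran;
  }
}
```

In this model an action accepted while the host is up is still skipped if the host goes down before `drain` reaches it, which is the behaviour the route-level check cannot provide on its own.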

Claims I did not verify

  • P2.5 (Op task): Manual config change on hosts (captureBuffer + metricsMaxInMemory). This is a human action, not code. No code evidence to verify.
  • Web Control page UI components: The /control route, nav entry, Fleet tab, Activity tab, Logs tab, and Models tab UI implementation in apps/web/src/pages/Control.tsx and related components. These are P1/P2 UI shells that were not part of the specific fix round (B2-B8+A1). The build gates pass, so the UI compiles, but the visual/conformance details were not audited.
  • Action queue service internal dequeue logic: The action-queue.ts service's dequeue-time liveness re-check and stale-action skip logic was not read in detail. The route-level check and the existence of the queue service were verified.
  • ECharts integration: Design S9 decided on ECharts for charts. The chart components in the web app were not audited for conformance.
  • Retention job end-to-end: The retention job's chunked transactions, idempotent rollup, and activity prune were verified at the function level (retention.ts) but not tested end-to-end (no running database available for integration testing).