Validation: boocontrol (plan mode)
Date: 2026-06-12
Mode: Adversarial plan validation (pre-implementation)
Size: Large -- 51 tasks across 10 phases, 4 apps + contracts, ~12 new DB tables, 5 new WS frames, new host service, routing gateway, eval sandbox
Verdict
BUILDABLE-WITH-FIXES
The plan is thorough and mostly accurate. Two blocking findings (F1, F2) require correction before implementation; five advisory findings (F3-F7) should be addressed. The core architecture, data model, and cross-app contracts are sound.
openspec validate
`openspec --help` was not available in this environment, so CLI validation was skipped. All artifacts exist under `openspec/changes/boocontrol/`: `proposal.md`, `design.md`, `tasks.md`, `artifacts/implementation-plan.md`. No `specs/` directory exists (not required for this change format).
Traceability
| Requirement / Task | Evidence (file:line or command) | Status |
|---|---|---|
| LlamaProvider contract shape | `packages/contracts/src/llama-providers.ts:7-12` -- `{id, label, baseUrl, kind}` | Verified |
| P0 gate: multi-provider batch in working tree | `openspec/changes/multi-llama-swap-providers-model-favorites/tasks.md` referenced; `CLAUDE.md` confirms working tree state | Verified (uncommitted by design) |
| InferenceRoute union current state | `apps/server/src/services/inference/provider.ts:61` -- `'swap' \| 'deepseek'` | Verified |
| resolveModelProvider 5 callers (P7) | `provider.ts:96`, `model-context.ts:85,160`, `stream-phase-adapter.ts:309`, `compaction.ts:357`, `task-model.ts:22`, `system-prompt.ts:195` | Verified (6 direct callers, not 5) |
| opencode-sse backoff+jitter claim | `apps/coder/src/services/backends/opencode-sse.ts:83-90` -- exponential backoff, NO jitter | Verified; plan correctly identifies this as V1 |
| coder-proxy pattern | `apps/server/src/routes/coder-proxy.ts:16-91` -- WS + HTTP catch-all | Verified |
| coder db.ts applySchema pattern | `apps/coder/src/db.ts:25-29` -- `readFile(schemaPath)` + `sql.unsafe(ddl)` | Verified |
| coder schema.sql owner | `apps/coder/src/schema.sql:1-3` -- applied by `apps/coder/src/db.ts:applySchema()` | Verified |
| Drift test scope | `packages/contracts/src/__tests__/ws-frames.test.ts:119-135` -- checks `KNOWN_FRAME_TYPES` vs `WsFrameSchema` only | Verified; no web strict union check |
| Web strict WsFrame union | `apps/web/src/api/types.ts:534-734` -- hand-maintained discriminated union | Verified |
| waitForTable does not exist | grep for `waitForTable` across repo: 0 results | Verified |
| upstreamModel blast radius | 1 production importer (`stream-phase-adapter.ts:16`), not "~5" as plan claims | Finding F1 |
| local-gateway.ts X-Boo-Source | `apps/coder/src/services/local-gateway.ts:69` -- forwards `Authorization` only, no `X-Boo-Source` | Verified; plan correctly identifies this |
Findings
**F1: upstreamModel blast radius is significantly overstated** (Blocking)
- Location: `openspec/changes/boocontrol/artifacts/implementation-plan.md:177` (P4.1)
- Evidence: `grep -rn 'import.*upstreamModel' apps/server/src/ | grep -v test` returns exactly 1 file: `stream-phase-adapter.ts:16`. The plan claims "~5 importers in model-context.ts, stream-phase-adapter.ts, compaction.ts, task-model.ts, system-prompt.ts" -- only `stream-phase-adapter.ts` actually imports `upstreamModel`. The other four files import `resolveModelProvider`, `resolveModelEndpoint`, or `resolveRoute` (different functions from the same module).
- Impact: P4.1 says "upstreamModel signature change must be additive (optional source param -- its blast radius is ~5 importers)". The actual blast radius for `upstreamModel` is 1 importer. This makes the additive constraint even easier to satisfy (one call site), but the inflated number could mislead an implementer about the scope of change. The 8-file blast radius of `resolveModelProvider` itself is the real concern for P7, not `upstreamModel`'s.
- Fix: Correct P4.1 to state the actual blast radius: `upstreamModel` has 1 production importer (`stream-phase-adapter.ts:309`). The broader concern is that `resolveModelProvider` (called by `upstreamModel`, `getModelContext`, `invalidateModelContext`) has 6 direct production callers across 5 files -- P7 must audit all of them.
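As an illustration of what "additive" means for the P4.1 constraint, the sketch below adds an optional trailing parameter so that an existing call site keeps compiling unchanged. This is a minimal sketch under stated assumptions: `upstreamModelSketch`, its return shape, and the `'boocontrol'` source value are hypothetical stand-ins, not the real `upstreamModel` signature from `provider.ts`.

```typescript
// Hypothetical stand-in for the real function; the actual signature and
// return type in provider.ts were not reproduced in this report.
interface UpstreamSketch {
  model: string;
  source: string | null; // null when the caller did not pass a source
}

// The new `source` parameter is optional and trailing, so the change is
// additive: callers that pass only `modelId` are unaffected.
function upstreamModelSketch(modelId: string, source?: string): UpstreamSketch {
  return { model: modelId, source: source ?? null };
}

// Existing call site: compiles and behaves as before.
const legacyCall = upstreamModelSketch('some-model');

// New call site: opts in to the source tag.
const taggedCall = upstreamModelSketch('some-model', 'boocontrol');
```

With a single production importer, only one call site would need to decide whether to pass the new argument.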
**F2: P7 resolveModelProvider caller count is "5" but actual count is 6** (Blocking)
- Location: `openspec/changes/boocontrol/artifacts/implementation-plan.md:220-229` (P7.3)
- Evidence: Direct callers of `resolveModelProvider` in production code:
  - `provider.ts:175` (`resolveRoute`) -- internal, but exported
  - `provider.ts:184` (`upstreamModel`) -- internal, but exported
  - `provider.ts:201` (`resolveModelEndpoint`) -- internal, but exported
  - `model-context.ts:85` (`getModelContext`)
  - `model-context.ts:160` (`invalidateModelContext`)

  Plus the three wrapper functions that call `resolveModelProvider` internally are themselves called from: `stream-phase-adapter.ts` (via `upstreamModel`), `compaction.ts` + `task-model.ts` (via `resolveModelEndpoint`), `system-prompt.ts` (via `resolveRoute`), `error-handler.ts` + `tool-phase.ts` (via `getModelContext`), `chats.ts` (via `getModelContext`), `stream-phase.ts` (via `getModelContext`).
- Impact: The P7 plan's 5-caller audit list is actually correct in its detail (it lists the 5 files/functions that directly import from `inference/provider.js` and need code changes). But the count "5 callers" in V12 is confusing because `resolveRoute` is both a caller of `resolveModelProvider` AND itself exported/called by `system-prompt.ts`. The implementer needs to understand that modifying `resolveModelProvider`'s fallback behavior affects the entire chain: `resolveRoute` -> `system-prompt.ts`, `upstreamModel` -> `stream-phase-adapter.ts`, `resolveModelEndpoint` -> `compaction.ts` + `task-model.ts`, plus `getModelContext` -> 4 downstream callers, plus `invalidateModelContext`.
- Fix: The P7.3 per-caller change specs (lines 223-228) are accurate and complete. Add a note that the 5 direct callers propagate to ~10 downstream production call sites; none require signature changes (gateway handling is internal to each function), but all must be tested.
**F3: Design S4 references jitter as part of the opencode-sse pattern; source has none** (Advisory)
- Location: `openspec/changes/boocontrol/design.md:125`, `apps/coder/src/services/backends/opencode-sse.ts:83-90`
- Evidence: Design S4 says "SSE consumer... reconnect with backoff + jitter (pattern: `apps/coder/src/services/backends/opencode-sse.ts` -- backoff, jitter, circuit breaker)". The actual `reconnectDecision` function (lines 83-90) computes `baseMs * 2^(failures-1)` with a cap -- pure exponential backoff. No jitter. The plan correctly identified this as V1 and folded it (adding explicit jitter to the BooControl copy). However, the design.md still references "backoff + jitter" as if the pattern includes jitter.
- Impact: An implementer reading design.md S4 but not V1 would assume the `opencode-sse.ts` pattern already has jitter and skip adding it. The plan folding is correct but the design.md reference is misleading.
- Fix: Update design.md S4 to say "backoff (no jitter in source -- add explicitly, random 0-50% of computed delay)" or similar. This is a minor doc fix, not a plan blocker.
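For reference, capped exponential backoff with the suggested "random 0-50% of computed delay" jitter could look roughly like the sketch below. It is not the repo's `reconnectDecision`: the names `reconnectDelayMs` and `BackoffOptions`, and the injectable `random` hook, are assumptions made for illustration. Only the `baseMs * 2^(failures-1)` formula with a cap comes from the evidence above.

```typescript
// Hypothetical option bag; field names are illustrative, not from opencode-sse.ts.
interface BackoffOptions {
  baseMs: number;          // delay after the first failure
  capMs: number;           // upper bound on the un-jittered delay
  jitterRatio: number;     // 0.5 means "add up to 50% of the computed delay"
  random?: () => number;   // injectable for tests; defaults to Math.random
}

function reconnectDelayMs(failures: number, opts: BackoffOptions): number {
  const rand = opts.random ?? Math.random;
  // baseMs * 2^(failures-1), clamped so failures <= 1 yields baseMs.
  const exponent = Math.max(0, failures - 1);
  const computed = Math.min(opts.capMs, opts.baseMs * 2 ** exponent);
  // Additive jitter in [0, jitterRatio * computed). It is applied after the
  // cap, so the final delay can exceed capMs by up to jitterRatio * capMs.
  const jitter = computed * opts.jitterRatio * rand();
  return Math.round(computed + jitter);
}
```

Applying jitter after the cap is a design choice in this sketch; if the cap must be a hard ceiling, the `Math.min` would instead wrap the jittered value.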
**F4: V3 folded finding inaccurately counts upstreamModel callers** (Advisory)
- Location: `openspec/changes/boocontrol/artifacts/implementation-plan.md:38`
- Evidence: Finding V3 says "upstreamModel actually has ~5 importers, not 28/13". The actual count is 1 production importer. V3's correction is itself wrong by a factor of 5, though in the right direction (down from 28).
- Impact: Minor -- the additive-change constraint is still correct, and the implementer will discover the actual blast radius immediately. But the folded finding's "correction" is itself inaccurate.
- Fix: Note in V3 that `upstreamModel` has 1 production importer (`stream-phase-adapter.ts`), not ~5.
**F5: No specs/ directory -- change folder uses proposal/design/tasks directly** (Advisory)
- Location: `openspec/changes/boocontrol/` directory listing
- Evidence: No `specs/` subdirectory exists. The skill says "Empty specs/: nothing to validate conformance against." For plan mode, this is acceptable -- the design.md serves as the conformance target. But the boo-validating-changes skill expects a `specs/` directory for requirement traceability.
- Impact: Plan mode validation can proceed against design.md. No blocker.
- Fix: None needed; document that design.md serves as the spec for this change.
**F6: P7.3 line number references may drift** (Advisory)
- Location: `openspec/changes/boocontrol/artifacts/implementation-plan.md:224-228`
- Evidence: P7.3 references specific line numbers: `getModelContext` (`model-context.ts:85`), `invalidateModelContext` (`model-context.ts:160`), `resolveRoute` (`provider.ts:175`), `upstreamModel` (`provider.ts:184`) with "line 192" for the swap fallback, `resolveModelEndpoint` (`provider.ts:201`). Verified against current code -- these line numbers are accurate as of this validation. However, P1-P6 work will modify these files, so P7 line numbers will drift.
- Impact: Low -- the function names are stable identifiers. Line numbers are convenience references.
- Fix: P7 implementer should grep for function names, not rely on line numbers.
**F7: The system-prompt.ts resolveRoute call has a subtle signature mismatch** (Advisory)
- Location: `apps/server/src/services/system-prompt.ts:195`
- Evidence: `resolveRoute(agent).route` -- this call passes only `agent` (no `config`, no `modelId`). Looking at `resolveRoute`'s signature: `(agent: AgentLike | null, config?: ConfigLike, modelId?: string)`. With only `agent` and no `config`/`modelId`, it returns `{ route: 'swap' }` (the default at line 174: `if (!modelId || !config) return { route: 'swap' }`). This is a hardcoded fallback, not a real routing resolution. P7 must ensure that adding `'gateway'` to `InferenceRoute` doesn't break this call path -- it won't (it returns the default), but the implementer should note that `system-prompt.ts` never actually resolves through the provider registry.
- Impact: No blocker -- the call is a no-op resolver that always returns `'swap'`. But it means `system-prompt.ts` does NOT need gateway handling (it never resolves a gateway model). P7's audit list should clarify this.
- Fix: P7.3 audit note: `resolveRoute` in `system-prompt.ts:195` always returns `{route: 'swap'}` (no config/modelId passed); no gateway handling needed there.
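The early-return behaviour described in F7 can be reproduced in isolation. In the sketch below, only the parameter list and the `if (!modelId || !config)` guard are taken from the evidence; the shapes of `AgentLike` and `ConfigLike`, the `routes` lookup, and the function name `resolveRouteSketch` are hypothetical. `'gateway'` is included as the member P7 plans to add.

```typescript
// 'swap' | 'deepseek' is the current union; 'gateway' is the planned P7 addition.
type InferenceRoute = 'swap' | 'deepseek' | 'gateway';

// Hypothetical minimal shapes; the real interfaces live in provider.ts.
interface AgentLike { id: string }
interface ConfigLike { routes: Record<string, InferenceRoute | undefined> }

function resolveRouteSketch(
  _agent: AgentLike | null,
  config?: ConfigLike,
  modelId?: string,
): { route: InferenceRoute } {
  // Guard quoted in F7: without both config and modelId, no registry lookup happens.
  if (!modelId || !config) return { route: 'swap' };
  // Hypothetical lookup standing in for the real provider-registry resolution.
  return { route: config.routes[modelId] ?? 'swap' };
}

// Mirrors the system-prompt.ts call shape: agent only.
const promptRoute = resolveRouteSketch({ id: 'agent-1' }).route;
```

Because the agent-only call never reaches the lookup, widening the union cannot change its result, which is why no gateway handling is needed at that call site.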
Claims I did not verify
- openspec CLI validation: `openspec --help` not available; could not probe CLI surface
- Task sizing (5-20 min each): Not timed; tasks are well-scoped and independently verifiable, consistent with the claimed range
- P0 multi-provider batch completeness: Referenced but not audited against its own tasks.md; trust the batch's own validation
- `/opt/forks/openevals` sandbox patterns: Plan verified directory exists (V16); did not read the actual sandbox code for pattern fidelity
- ECharts bundle size claim (~60-100KB): Not verified against actual echarts/core imports; accepted as reasonable estimate
- llama-swap `/api/events` SSE envelope shape: Not verified against the llama-swap fork source; accepted from design
- `arena-runner.ts` `advanceChain` pattern: Referenced as action queue pattern; not verified against actual code
- `getSwapProvider` cache invalidation with source keying: P4 plan says cache keyed by `baseURL+source`; actual `swapCache` at `provider.ts:17` keys by `baseURL` only. The P4 change would need to either invalidate/extend the cache or use a separate cache. This is a known P4 design detail, not a plan gap.
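To make the cache-keying point concrete, the sketch below shows one way a provider cache could be keyed by `baseURL` plus an optional source instead of `baseURL` alone. It is an assumption-laden illustration: `SwapProviderSketch`, `swapCacheSketch`, `cacheKey`, and `getSwapProviderSketch` are hypothetical names, and the real `swapCache` structure at `provider.ts:17` was not inspected beyond its key.

```typescript
// Hypothetical provider shape; the real provider object is not described in this report.
interface SwapProviderSketch {
  baseURL: string;
  source: string | null;
}

// Separate cache keyed by the (baseURL, source) pair.
const swapCacheSketch = new Map<string, SwapProviderSketch>();

// JSON-encode the pair so a baseURL containing a delimiter cannot collide
// with a different (baseURL, source) combination.
function cacheKey(baseURL: string, source?: string): string {
  return JSON.stringify([baseURL, source ?? null]);
}

function getSwapProviderSketch(baseURL: string, source?: string): SwapProviderSketch {
  const key = cacheKey(baseURL, source);
  let provider = swapCacheSketch.get(key);
  if (!provider) {
    provider = { baseURL, source: source ?? null };
    swapCacheSketch.set(key, provider);
  }
  return provider;
}
```

Calls without a source share one entry per `baseURL`, so existing callers would keep their current caching behaviour while source-tagged callers get distinct entries.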