boocode/openspec/changes/boocontrol/artifacts/plan-validation.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00

Validation: boocontrol (plan mode)

Date: 2026-06-12
Mode: Adversarial plan validation (pre-implementation)
Size: Large -- 51 tasks across 10 phases, 4 apps + contracts, ~12 new DB tables, 5 new WS frames, new host service, routing gateway, eval sandbox

Verdict

BUILDABLE-WITH-FIXES

The plan is thorough and mostly accurate. Two blocking findings (F1, F2) require correction before implementation; five advisory findings (F3-F7) should be addressed. The core architecture, data model, and cross-app contracts are sound.

openspec validate

The openspec CLI was not available in this environment (openspec --help could not be run), so CLI validation was skipped. All artifacts exist under openspec/changes/boocontrol/: proposal.md, design.md, tasks.md, artifacts/implementation-plan.md. No specs/ directory exists (not required for this change format).

Traceability

| Requirement / Task | Evidence (file:line or command) | Status |
| --- | --- | --- |
| LlamaProvider contract shape | packages/contracts/src/llama-providers.ts:7-12 -- {id, label, baseUrl, kind} | Verified |
| P0 gate: multi-provider batch in working tree | openspec/changes/multi-llama-swap-providers-model-favorites/tasks.md referenced; CLAUDE.md confirms working tree state | Verified (uncommitted by design) |
| InferenceRoute union current state | apps/server/src/services/inference/provider.ts:61 -- `'swap' \| 'deepseek'` | Verified |
| resolveModelProvider 5 callers (P7) | provider.ts:96, model-context.ts:85,160, stream-phase-adapter.ts:309, compaction.ts:357, task-model.ts:22, system-prompt.ts:195 | Verified (6 direct callers, not 5) |
| opencode-sse backoff+jitter claim | apps/coder/src/services/backends/opencode-sse.ts:83-90 -- exponential backoff, NO jitter | Verified; plan correctly identifies this as V1 |
| coder-proxy pattern | apps/server/src/routes/coder-proxy.ts:16-91 -- WS + HTTP catch-all | Verified |
| coder db.ts applySchema pattern | apps/coder/src/db.ts:25-29 -- readFile(schemaPath) + sql.unsafe(ddl) | Verified |
| coder schema.sql owner | apps/coder/src/schema.sql:1-3 -- applied by apps/coder/src/db.ts:applySchema() | Verified |
| Drift test scope | `packages/contracts/src/__tests__/ws-frames.test.ts:119-135` -- checks KNOWN_FRAME_TYPES vs WsFrameSchema only | Verified; no web strict union check |
| Web strict WsFrame union | apps/web/src/api/types.ts:534-734 -- hand-maintained discriminated union | Verified |
| waitForTable does not exist | grep for waitForTable across repo: 0 results | Verified |
| upstreamModel blast radius | 1 production importer (stream-phase-adapter.ts:16), not "~5" as plan claims | Finding F1 |
| local-gateway.ts X-Boo-Source | apps/coder/src/services/local-gateway.ts:69 -- forwards Authorization only, no X-Boo-Source | Verified; plan correctly identifies this |

Findings

F1: upstreamModel blast radius is significantly overstated (Blocking)

  • Location: openspec/changes/boocontrol/artifacts/implementation-plan.md:177 (P4.1)
  • Evidence: grep -rn 'import.*upstreamModel' apps/server/src/ | grep -v test returns exactly 1 file: stream-phase-adapter.ts:16. The plan claims "~5 importers in model-context.ts, stream-phase-adapter.ts, compaction.ts, task-model.ts, system-prompt.ts" -- only stream-phase-adapter.ts actually imports upstreamModel. The other four files import resolveModelProvider, resolveModelEndpoint, or resolveRoute (different functions from the same module).
  • Impact: P4.1 says "upstreamModel signature change must be additive (optional source param -- its blast radius is ~5 importers)". The actual blast radius for upstreamModel is 1 importer. This makes the additive constraint even easier to satisfy (one call site), but the inflated number could mislead an implementer about the scope of change. The 8-file blast radius of resolveModelProvider itself is the real concern for P7, not upstreamModel's.
  • Fix: Correct P4.1 to state the actual blast radius: upstreamModel has 1 production importer (stream-phase-adapter.ts; import at line 16, call site at line 309). The broader concern is that resolveModelProvider (called by upstreamModel, getModelContext, invalidateModelContext) has 6 direct production callers across 5 files -- P7 must audit all of them.

F2: P7 resolveModelProvider caller count is "5" but actual count is 6 (Blocking)

  • Location: openspec/changes/boocontrol/artifacts/implementation-plan.md:220-229 (P7.3)
  • Evidence: Direct callers of resolveModelProvider in production code:
    1. provider.ts:175 (resolveRoute) -- internal, but exported
    2. provider.ts:184 (upstreamModel) -- internal, but exported
    3. provider.ts:201 (resolveModelEndpoint) -- internal, but exported
    4. model-context.ts:85 (getModelContext)
    5. model-context.ts:160 (invalidateModelContext)

    The three wrapper functions (resolveRoute, upstreamModel, resolveModelEndpoint) and getModelContext are themselves called from:
      • stream-phase-adapter.ts (via upstreamModel)
      • compaction.ts and task-model.ts (via resolveModelEndpoint)
      • system-prompt.ts (via resolveRoute)
      • error-handler.ts, tool-phase.ts, chats.ts, and stream-phase.ts (via getModelContext)
  • Impact: The P7 plan's 5-caller audit list is actually correct in its detail (it lists the 5 files/functions that directly import from inference/provider.js and need code changes). But the count "5 callers" in V12 is confusing because resolveRoute is both a caller of resolveModelProvider AND itself exported/called by system-prompt.ts. The implementer needs to understand that modifying resolveModelProvider's fallback behavior affects the entire chain: resolveRoute -> system-prompt.ts, upstreamModel -> stream-phase-adapter.ts, resolveModelEndpoint -> compaction.ts + task-model.ts, plus getModelContext -> 4 downstream callers, plus invalidateModelContext.
  • Fix: The P7.3 per-caller change specs (lines 223-228) are accurate and complete. Add a note that the 5 direct callers propagate to ~10 downstream production call sites; none require signature changes (gateway handling is internal to each function), but all must be tested.
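
The propagation chain described in F2 can be written down as a small lookup table, which may help when turning "all must be tested" into a concrete checklist. The sketch below only transcribes the file names listed in the evidence above; `wrapperCallers` and `downstreamFiles` are hypothetical names, not identifiers from the repo, and downstream callers of invalidateModelContext are left empty because this validation does not enumerate them.

```typescript
// Transcribed from the F2 evidence; file names only, no line numbers.
// `wrapperCallers` and `downstreamFiles` are illustrative names (assumptions).
const wrapperCallers: Record<string, string[]> = {
  resolveRoute: ["system-prompt.ts"],
  upstreamModel: ["stream-phase-adapter.ts"],
  resolveModelEndpoint: ["compaction.ts", "task-model.ts"],
  getModelContext: ["error-handler.ts", "tool-phase.ts", "chats.ts", "stream-phase.ts"],
  invalidateModelContext: [], // downstream callers not enumerated in this validation
};

// Collect the de-duplicated, sorted list of files that sit downstream of
// resolveModelProvider and therefore need regression coverage in P7.
function downstreamFiles(map: Record<string, string[]>): string[] {
  const seen: Record<string, true> = {};
  for (const fn of Object.keys(map)) {
    for (const file of map[fn]) seen[file] = true;
  }
  return Object.keys(seen).sort();
}

console.log(downstreamFiles(wrapperCallers).length); // 8
```

The count of 8 files matches the "8-file blast radius" quoted in F1; it counts files, not individual call sites.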

F3: Design S4 references jitter as part of the opencode-sse pattern; source has none (Advisory)

  • Location: openspec/changes/boocontrol/design.md:125, apps/coder/src/services/backends/opencode-sse.ts:83-90
  • Evidence: Design S4 says "SSE consumer... reconnect with backoff + jitter (pattern: apps/coder/src/services/backends/opencode-sse.ts -- backoff, jitter, circuit breaker)". The actual reconnectDecision function (lines 83-90) computes baseMs * 2^(failures-1) with a cap -- pure exponential backoff, with no jitter. The plan correctly identified this as V1 and folded it in (adding explicit jitter to the BooControl copy). However, design.md still references "backoff + jitter" as if the pattern already includes jitter.
  • Impact: An implementer reading design.md S4 but not V1 would assume the opencode-sse.ts pattern already has jitter and skip adding it. The plan folding is correct but the design.md reference is misleading.
  • Fix: Update design.md S4 to say "backoff (no jitter in source -- add explicitly, random 0-50% of computed delay)" or similar. This is a minor doc fix, not a plan blocker.
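
For reference, the delay computation suggested in the F3 fix could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in opencode-sse.ts: `ReconnectOptions` and `computeDelayMs` are hypothetical names, and the injectable `random` parameter exists only to make the function testable. The backoff term follows the formula quoted above (baseMs * 2^(failures-1), capped), and the jitter term adds a random 0 to `jitterRatio` share of that delay.

```typescript
// Hypothetical names; not the opencode-sse.ts API.
interface ReconnectOptions {
  baseMs: number;      // delay after the first failure
  capMs: number;       // upper bound on the un-jittered delay
  jitterRatio: number; // max jitter as a fraction of the delay (0.5 = 0-50%)
}

function computeDelayMs(
  failures: number,
  opts: ReconnectOptions,
  random: () => number = Math.random,
): number {
  if (failures < 1) return 0;
  // Pure exponential backoff: baseMs * 2^(failures - 1), capped at capMs.
  const backoff = Math.min(opts.capMs, opts.baseMs * Math.pow(2, failures - 1));
  // Additive jitter on top of the capped delay, so the result can reach
  // capMs * (1 + jitterRatio) at most.
  const jitter = backoff * opts.jitterRatio * random();
  return Math.round(backoff + jitter);
}
```

Because the jitter is applied after the cap, reconnect attempts from several clients stay spread out even once every client has reached the cap. Whether BooControl should instead cap the jittered value is a design choice this sketch does not settle.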

F4: V3 folded finding inaccurately counts upstreamModel importers (Advisory)

  • Location: openspec/changes/boocontrol/artifacts/implementation-plan.md:38
  • Evidence: Finding V3 says "upstreamModel actually has ~5 importers, not 28/13". The actual count is 1 production importer. V3's correction is itself wrong by a factor of 5, though in the right direction (down from 28).
  • Impact: Minor -- the additive-change constraint is still correct, and the implementer will discover the actual blast radius immediately. But the folded finding's "correction" is itself inaccurate.
  • Fix: Note in V3 that upstreamModel has 1 production importer (stream-phase-adapter.ts), not ~5.

F5: No specs/ directory -- change folder uses proposal/design/tasks directly (Advisory)

  • Location: openspec/changes/boocontrol/ directory listing
  • Evidence: No specs/ subdirectory exists. The skill says "Empty specs/: nothing to validate conformance against." For plan mode, this is acceptable -- the design.md serves as the conformance target. But the boo-validating-changes skill expects a specs/ directory for requirement traceability.
  • Impact: Plan mode validation can proceed against design.md. No blocker.
  • Fix: None needed; document that design.md serves as the spec for this change.

F6: P7.3 line number references may drift (Advisory)

  • Location: openspec/changes/boocontrol/artifacts/implementation-plan.md:224-228
  • Evidence: P7.3 references specific line numbers: getModelContext (model-context.ts:85), invalidateModelContext (model-context.ts:160), resolveRoute (provider.ts:175), upstreamModel (provider.ts:184) with "line 192" for the swap fallback, resolveModelEndpoint (provider.ts:201). Verified against current code -- these line numbers are accurate as of this validation. However, P1-P6 work will modify these files, so P7 line numbers will drift.
  • Impact: Low -- the function names are stable identifiers. Line numbers are convenience references.
  • Fix: P7 implementer should grep for function names, not rely on line numbers.

F7: The system-prompt.ts resolveRoute call has a subtle signature mismatch (Advisory)

  • Location: apps/server/src/services/system-prompt.ts:195
  • Evidence: resolveRoute(agent).route -- this call passes only agent (no config, no modelId). Looking at resolveRoute's signature: (agent: AgentLike | null, config?: ConfigLike, modelId?: string). With only agent and no config/modelId, it returns { route: 'swap' } (the default at line 174: if (!modelId || !config) return { route: 'swap' }). This is a hardcoded fallback, not a real routing resolution. P7 must ensure that adding 'gateway' to InferenceRoute doesn't break this call path -- it won't (it returns the default), but the implementer should note that system-prompt.ts never actually resolves through the provider registry.
  • Impact: No blocker -- the call is a no-op resolver that always returns 'swap'. But it means system-prompt.ts does NOT need gateway handling (it never resolves a gateway model). P7's audit list should clarify this.
  • Fix: P7.3 audit note: resolveRoute in system-prompt.ts:195 always returns {route: 'swap'} (no config/modelId passed); no gateway handling needed there.
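
The early-return behavior described in F7 can be illustrated with a reduced model of resolveRoute. This is a sketch only: the parameter list and the `if (!modelId || !config)` guard are taken from the evidence above, while `resolveRouteSketch`, the field shapes of `AgentLike` and `ConfigLike`, and the `gatewayModels` lookup are assumptions standing in for the real provider registry.

```typescript
// 'gateway' is the member P7 proposes to add to the union.
type InferenceRoute = "swap" | "deepseek" | "gateway";

// Hypothetical minimal shapes; the real types live in the server app.
interface AgentLike { id: string }
interface ConfigLike { gatewayModels: string[] } // assumed field, for illustration

function resolveRouteSketch(
  agent: AgentLike | null,
  config?: ConfigLike,
  modelId?: string,
): { route: InferenceRoute } {
  void agent; // not needed for the part of the logic shown here
  // Guard quoted in F7: without both modelId and config, default to 'swap'.
  if (!modelId || !config) return { route: "swap" };
  // Stand-in for the registry lookup, which this sketch does not reproduce.
  const isGateway = config.gatewayModels.indexOf(modelId) !== -1;
  return { route: isGateway ? "gateway" : "swap" };
}

// Mirrors the system-prompt.ts call shape: agent only, so the guard always fires.
console.log(resolveRouteSketch({ id: "agent-1" }).route); // swap
```

Under these assumptions, widening the union does not change the result for an agent-only call, which is why the audit note can exempt system-prompt.ts from gateway handling.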

Claims I did not verify

  • openspec CLI validation: openspec --help not available; could not probe CLI surface
  • Task sizing (5-20 min each): Not timed; tasks are well-scoped and independently verifiable, consistent with the claimed range
  • P0 multi-provider batch completeness: Referenced but not audited against its own tasks.md; trust the batch's own validation
  • /opt/forks/openevals sandbox patterns: Plan verified directory exists (V16); did not read the actual sandbox code for pattern fidelity
  • ECharts bundle size claim (~60-100KB): Not verified against actual echarts/core imports; accepted as reasonable estimate
  • llama-swap /api/events SSE envelope shape: Not verified against the llama-swap fork source; accepted from design
  • arena-runner.ts advanceChain pattern: Referenced as action queue pattern; not verified against actual code
  • getSwapProvider cache invalidation with source keying: P4 plan says cache keyed by baseURL+source; actual swapCache at provider.ts:17 keys by baseURL only. The P4 change would need to either invalidate/extend the cache or use a separate cache. This is a known P4 design detail, not a plan gap.
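
One of the two options mentioned for the getSwapProvider cache, extending the key from baseURL to baseURL+source, might be sketched as follows. Everything here is illustrative: `SwapProviderSketch`, `providerCache`, `cacheKey`, and `getProviderSketch` are hypothetical names, and the provider object is a placeholder for whatever the real swapCache stores.

```typescript
// Placeholder for the cached provider object (assumption).
interface SwapProviderSketch { baseURL: string; source: string }

const providerCache = new Map<string, SwapProviderSketch>();

// Serialize the pair as a JSON tuple so that no choice of delimiter inside
// baseURL or source can make two different pairs collide.
function cacheKey(baseURL: string, source: string): string {
  return JSON.stringify([baseURL, source]);
}

function getProviderSketch(baseURL: string, source: string): SwapProviderSketch {
  const key = cacheKey(baseURL, source);
  let provider = providerCache.get(key);
  if (!provider) {
    provider = { baseURL, source }; // the real code would construct a client here
    providerCache.set(key, provider);
  }
  return provider;
}
```

With this keying, two sources that share a baseURL get separate cache entries, so the cache grows with the number of distinct sources. The alternative noted above, a separate cache for source-aware providers, would leave the existing baseURL-only entries untouched.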