chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean). wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes. openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
This commit is contained in:
101
openspec/changes/boocontrol/artifacts/plan-validation.md
Normal file
101
openspec/changes/boocontrol/artifacts/plan-validation.md
Normal file
@@ -0,0 +1,101 @@
|
||||
# Validation: boocontrol (plan mode)
|
||||
|
||||
**Date:** 2026-06-12
|
||||
**Mode:** Adversarial plan validation (pre-implementation)
|
||||
**Size:** Large -- 51 tasks across 10 phases, 4 apps + contracts, ~12 new DB tables, 5 new WS frames, new host service, routing gateway, eval sandbox
|
||||
|
||||
## Verdict
|
||||
|
||||
**BUILDABLE-WITH-FIXES**
|
||||
|
||||
The plan is thorough and mostly accurate. Three blocking findings require correction before implementation; five advisory findings should be addressed. The core architecture, data model, and cross-app contracts are sound.
|
||||
|
||||
## openspec validate
|
||||
|
||||
`openspec --help` not available in this environment; skipped CLI validation. All artifacts exist under `openspec/changes/boocontrol/`: `proposal.md`, `design.md`, `tasks.md`, `artifacts/implementation-plan.md`. No `specs/` directory exists (not required for this change format).
|
||||
|
||||
## Traceability
|
||||
|
||||
| Requirement / Task | Evidence (file:line or command) | Status |
|
||||
|--------------------|--------------------------------|--------|
|
||||
| LlamaProvider contract shape | `packages/contracts/src/llama-providers.ts:7-12` -- `{id, label, baseUrl, kind}` | Verified |
|
||||
| P0 gate: multi-provider batch in working tree | `openspec/changes/multi-llama-swap-providers-model-favorites/tasks.md` referenced; CLAUDE.md confirms working tree state | Verified (uncommitted by design) |
|
||||
| InferenceRoute union current state | `apps/server/src/services/inference/provider.ts:61` -- `'swap' \| 'deepseek'` | Verified |
|
||||
| resolveModelProvider 5 callers (P7) | `provider.ts:96`, `model-context.ts:85,160`, `stream-phase-adapter.ts:309`, `compaction.ts:357`, `task-model.ts:22`, `system-prompt.ts:195` | Verified (6 direct callers, not 5) |
|
||||
| opencode-sse backoff+jitter claim | `apps/coder/src/services/backends/opencode-sse.ts:83-90` -- exponential backoff, NO jitter | Verified; plan correctly identifies this as V1 |
|
||||
| coder-proxy pattern | `apps/server/src/routes/coder-proxy.ts:16-91` -- WS + HTTP catch-all | Verified |
|
||||
| coder db.ts applySchema pattern | `apps/coder/src/db.ts:25-29` -- `readFile(schemaPath)` + `sql.unsafe(ddl)` | Verified |
|
||||
| coder schema.sql owner | `apps/coder/src/schema.sql:1-3` -- applied by `apps/coder/src/db.ts:applySchema()` | Verified |
|
||||
| Drift test scope | `packages/contracts/src/__tests__/ws-frames.test.ts:119-135` -- checks KNOWN_FRAME_TYPES vs WsFrameSchema only | Verified; no web strict union check |
|
||||
| Web strict WsFrame union | `apps/web/src/api/types.ts:534-734` -- hand-maintained discriminated union | Verified |
|
||||
| waitForTable does not exist | grep for `waitForTable` across repo: 0 results | Verified |
|
||||
| upstreamModel blast radius | 1 production importer (`stream-phase-adapter.ts:16`), not "~5" as plan claims | Finding F1 |
|
||||
| local-gateway.ts X-Boo-Source | `apps/coder/src/services/local-gateway.ts:69` -- forwards Authorization only, no X-Boo-Source | Verified; plan correctly identifies this |
|
||||
|
||||
## Findings
|
||||
|
||||
### F1: upstreamModel blast radius is significantly overstated** (Blocking)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/artifacts/implementation-plan.md:177` (P4.1)
|
||||
- **Evidence:** `grep -rn 'import.*upstreamModel' apps/server/src/ | grep -v test` returns exactly 1 file: `stream-phase-adapter.ts:16`. The plan claims "~5 importers in model-context.ts, stream-phase-adapter.ts, compaction.ts, task-model.ts, system-prompt.ts" -- only `stream-phase-adapter.ts` actually imports `upstreamModel`. The other four files import `resolveModelProvider`, `resolveModelEndpoint`, or `resolveRoute` (different functions from the same module).
|
||||
- **Impact:** P4.1 says "upstreamModel signature change must be additive (optional source param -- its blast radius is ~5 importers)". The actual blast radius for `upstreamModel` is 1 importer. This makes the additive constraint even easier to satisfy (one call site), but the inflated number could mislead an implementer about the scope of change. The 8-file blast radius of `resolveModelProvider` itself is the real concern for P7, not `upstreamModel`'s.
|
||||
- **Fix:** Correct P4.1 to state the actual blast radius: `upstreamModel` has 1 production importer (`stream-phase-adapter.ts:309`). The broader concern is that `resolveModelProvider` (called by `upstreamModel`, `getModelContext`, `invalidateModelContext`) has 6 direct production callers across 5 files -- P7 must audit all of them.
|
||||
|
||||
### F2: P7 resolveModelProvider caller count is "5" but actual count is 6** (Blocking)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/artifacts/implementation-plan.md:220-229` (P7.3)
|
||||
- **Evidence:** Direct callers of `resolveModelProvider` in production code:
|
||||
1. `provider.ts:175` (`resolveRoute`) -- internal, but exported
|
||||
2. `provider.ts:184` (`upstreamModel`) -- internal, but exported
|
||||
3. `provider.ts:201` (`resolveModelEndpoint`) -- internal, but exported
|
||||
4. `model-context.ts:85` (`getModelContext`)
|
||||
5. `model-context.ts:160` (`invalidateModelContext`)
|
||||
Plus the three wrapper functions that call `resolveModelProvider` internally are themselves called from: `stream-phase-adapter.ts` (via `upstreamModel`), `compaction.ts` + `task-model.ts` (via `resolveModelEndpoint`), `system-prompt.ts` (via `resolveRoute`), `error-handler.ts` + `tool-phase.ts` (via `getModelContext`), `chats.ts` (via `getModelContext`), `stream-phase.ts` (via `getModelContext`).
|
||||
- **Impact:** The P7 plan's 5-caller audit list is actually correct in its detail (it lists the 5 files/functions that directly import from `inference/provider.js` and need code changes). But the count "5 callers" in V12 is confusing because `resolveRoute` is both a caller of `resolveModelProvider` AND itself exported/called by `system-prompt.ts`. The implementer needs to understand that modifying `resolveModelProvider`'s fallback behavior affects the entire chain: `resolveRoute` -> `system-prompt.ts`, `upstreamModel` -> `stream-phase-adapter.ts`, `resolveModelEndpoint` -> `compaction.ts` + `task-model.ts`, plus `getModelContext` -> 4 downstream callers, plus `invalidateModelContext`.
|
||||
- **Fix:** The P7.3 per-caller change specs (lines 223-228) are accurate and complete. Add a note that the 5 direct callers propagate to ~10 downstream production call sites; none require signature changes (gateway handling is internal to each function), but all must be tested.
|
||||
|
||||
### F3: Design S4 references jitter as part of the opencode-sse pattern; source has none** (Advisory)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/design.md:125`, `apps/coder/src/services/backends/opencode-sse.ts:83-90`
|
||||
- **Evidence:** Design S4 says "SSE consumer... reconnect with backoff + jitter (pattern: `apps/coder/src/services/backends/opencode-sse.ts` -- backoff, jitter, circuit breaker)". The actual `reconnectDecision` function (line 83-90) computes `baseMs * 2^(failures-1)` with a cap -- pure exponential backoff. No jitter. The plan correctly identified this as V1 and folded it (adding explicit jitter to the BooControl copy). However, the design.md still references "backoff + jitter" as if the pattern includes jitter.
|
||||
- **Impact:** An implementer reading design.md S4 but not V1 would assume the opencode-sse.ts pattern already has jitter and skip adding it. The plan folding is correct but the design.md reference is misleading.
|
||||
- **Fix:** Update design.md S4 to say "backoff (no jitter in source -- add explicitly, random 0-50% of computed delay)" or similar. This is a minor doc fix, not a plan blocker.
|
||||
|
||||
### F4: V12 folded finding inaccurately counts upstreamModel callers** (Advisory)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/artifacts/implementation-plan.md:38`
|
||||
- **Evidence:** Finding V3 says "upstreamModel actually has ~5 importers, not 28/13". The actual count is 1 production importer. V3's correction is itself wrong by a factor of 5, though in the right direction (down from 28).
|
||||
- **Impact:** Minor -- the additive-change constraint is still correct, and the implementer will discover the actual blast radius immediately. But the folded finding's "correction" is itself inaccurate.
|
||||
- **Fix:** Note in V3 that upstreamModel has 1 production importer (`stream-phase-adapter.ts`), not ~5.
|
||||
|
||||
### F5: No specs/ directory -- change folder uses proposal/design/tasks directly** (Advisory)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/` directory listing
|
||||
- **Evidence:** No `specs/` subdirectory exists. The skill says "Empty specs/: nothing to validate conformance against." For plan mode, this is acceptable -- the design.md serves as the conformance target. But the boo-validating-changes skill expects a specs/ directory for requirement traceability.
|
||||
- **Impact:** Plan mode validation can proceed against design.md. No blocker.
|
||||
- **Fix:** None needed; document that design.md serves as the spec for this change.
|
||||
|
||||
### F6: P7.3 line number references may drift** (Advisory)
|
||||
|
||||
- **Location:** `openspec/changes/boocontrol/artifacts/implementation-plan.md:224-228`
|
||||
- **Evidence:** P7.3 references specific line numbers: `getModelContext (model-context.ts:85)`, `invalidateModelContext (model-context.ts:160)`, `resolveRoute (provider.ts:175)`, `upstreamModel (provider.ts:184)` with "line 192" for the swap fallback, `resolveModelEndpoint (provider.ts:201)`. Verified against current code -- these line numbers are accurate as of this validation. However, P1-P6 work will modify these files, so P7 line numbers will drift.
|
||||
- **Impact:** Low -- the function names are stable identifiers. Line numbers are convenience references.
|
||||
- **Fix:** P7 implementer should grep for function names, not rely on line numbers.
|
||||
|
||||
### F7: The `system-prompt.ts` `resolveRoute` call has a subtle signature mismatch** (Advisory)
|
||||
|
||||
- **Location:** `apps/server/src/services/system-prompt.ts:195`
|
||||
- **Evidence:** `resolveRoute(agent).route` -- this call passes only `agent` (no `config`, no `modelId`). Looking at `resolveRoute`'s signature: `(agent: AgentLike | null, config?: ConfigLike, modelId?: string)`. With only `agent` and no `config`/`modelId`, it returns `{ route: 'swap' }` (the default at line 174: `if (!modelId || !config) return { route: 'swap' }`). This is a hardcoded fallback, not a real routing resolution. P7 must ensure that adding `'gateway'` to `InferenceRoute` doesn't break this call path -- it won't (it returns the default), but the implementer should note that `system-prompt.ts` never actually resolves through the provider registry.
|
||||
- **Impact:** No blocker -- the call is a no-op resolver that always returns `'swap'`. But it means `system-prompt.ts` does NOT need gateway handling (it never resolves a gateway model). P7's audit list should clarify this.
|
||||
- **Fix:** P7.3 audit note: `resolveRoute` in `system-prompt.ts:195` always returns `{route: 'swap'}` (no config/modelId passed); no gateway handling needed there.
|
||||
|
||||
## Claims I did not verify
|
||||
|
||||
- **openspec CLI validation:** `openspec --help` not available; could not probe CLI surface
|
||||
- **Task sizing (5-20 min each):** Not timed; tasks are well-scoped and independently verifiable, consistent with the claimed range
|
||||
- **P0 multi-provider batch completeness:** Referenced but not audited against its own tasks.md; trust the batch's own validation
|
||||
- **`/opt/forks/openevals` sandbox patterns:** Plan verified directory exists (V16); did not read the actual sandbox code for pattern fidelity
|
||||
- **ECharts bundle size claim (~60-100KB):** Not verified against actual echarts/core imports; accepted as reasonable estimate
|
||||
- **llama-swap `/api/events` SSE envelope shape:** Not verified against the llama-swap fork source; accepted from design
|
||||
- **`arena-runner.ts` `advanceChain` pattern:** Referenced as action queue pattern; not verified against actual code
|
||||
- **`getSwapProvider` cache invalidation with source keying:** P4 plan says cache keyed by `baseURL+source`; actual `swapCache` at `provider.ts:17` keys by `baseURL` only. The P4 change would need to either invalidate/extend the cache or use a separate cache. This is a known P4 design detail, not a plan gap.
|
||||
Reference in New Issue
Block a user