Synthesis input — Round 1 aggregation + dispositions
Deterministic aggregation of the Round-1 specialist review (on-call-engineer, behavioral-analyst, software-architect, test-engineer, user-experience-designer, junior-developer). This is the consolidated record the project-manager synthesizes into the three plan files. Evidence (file:line) is preserved inline.
Team size: large (cross-subsystem, user chose "everything"). Round cap 3; converged in 1 round (the remaining unknowns are spec-level, not resolvable by more specialist rounds).
Per-feature dispositions (the decisions)
READY TO BUILD
F1 — external task cancel kills child + finalizes message. Strong 4-way consensus (on-call B1, behavioral B1, architect A1, junior).
- Root cause CONFIRMED: `routes/tasks.ts:130-138` calls `inference.cancel` (native-only); dispatcher has no `Map<taskId,AbortController>`; the four private `ac` (dispatcher.ts:316/655/991/1248) are unreachable; `cancelExternalTask` does not exist anywhere.
- Design (architect A1): add `taskControllers = new Map<string,AbortController>()` inside `createDispatcher`; `taskControllers.set(taskId, ac)` at each of the 4 run-functions; delete in the existing `.finally()` at dispatcher.ts:117; export `cancelExternalTask(taskId): boolean` (idempotent: `ac.abort()` is a no-op when already aborted, so double-Stop and cancel-after-exit are safe). Pass a narrow `ExternalCancelFn` (NOT the whole dispatcher) into `registerTaskRoutes`; wire in `index.ts:254`.
- TWO pre-existing bugs F1 makes reachable, MUST be fixed in the same batch (on-call OCE-001/OCE-002, behavioral B2/B3): (1) the four catch blocks update only `tasks` state, never the `messages` row → an aborted/thrown turn leaves the assistant message `status='streaming'` (BooChat's 5-min sweep can't recover it: different process); (2) the warm-backend success path writes `messages.status='complete'` unconditionally before checking abort (dispatcher.ts ~853/1122/1377) → a cancelled turn is recorded `complete`. Fix: after `await backend.prompt(...)`, `if (ac.signal.aborted)` → write `status='cancelled'`, publish the terminal `message_complete` frame, emit idle, return; and in each catch finalize the message with `WHERE status='streaming'` (idempotent) distinguishing AbortError→cancelled vs error→failed.
- UX (UX agent): disable the Stop button while the cancel POST is in flight (mobile double-tap); extend the coder `message_complete` frame with an optional `status` field (Option A: minimal, no new frame type) and map it in the reducer (`CoderPane.tsx:299`, `MessageStatus` already includes `'cancelled'`); render a muted "Stopped" label (not red, not a toast).
- Tests (test-engineer T1-T3): extract a pure `CancelRegistry` (register/cancel/delete/has), 4 unit cases, no DB/child; one DB-integration test for the route → row lands `'cancelled'`; warm-worktree-preserved held as a code comment, not a spy.
- Resolved OQs: terminal state = `cancelled` (not `failed`) for user Stop; registry keyed by `taskId` (route receives taskId); `session/stop` route: CoderPane already calls `cancelTask` for external tasks so the session-stop path "never fires for external from UI" (on-call); wire it best-effort via a `SELECT id FROM tasks WHERE session_id=$ AND state='running'` lookup OR defer that leg (low value); use a shared `cancelAndFinalize` helper across the 4 paths (TDD precedent).
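The registry and the cancelled-vs-failed decision described above can be sketched as follows. This is a minimal illustration, not the dispatcher code: the `CancelRegistry` method names follow the test-engineer list (register/cancel/delete/has), while `terminalStatus` and the `MessageStatus` union shown here are hypothetical stand-ins for whatever the shared finalize helper ends up using.

```typescript
// Hypothetical sketch. Only taskId and AbortController come from the review;
// terminalStatus and this MessageStatus union are illustrative assumptions.
type MessageStatus = "streaming" | "complete" | "cancelled" | "failed";

class CancelRegistry {
  private readonly controllers = new Map<string, AbortController>();

  /** Track the controller for a running task (start of each run-function). */
  register(taskId: string, ac: AbortController): void {
    this.controllers.set(taskId, ac);
  }

  /** Abort the task if it is still tracked; false for unknown or finished tasks. */
  cancel(taskId: string): boolean {
    const ac = this.controllers.get(taskId);
    if (!ac) return false;
    ac.abort(); // no-op when already aborted, so a double Stop is safe
    return true;
  }

  /** Drop the entry (called from the run's finally handler). */
  delete(taskId: string): boolean {
    return this.controllers.delete(taskId);
  }

  has(taskId: string): boolean {
    return this.controllers.has(taskId);
  }
}

/** Pick the terminal message status for a turn that returned or threw. */
function terminalStatus(signal: AbortSignal, error?: unknown): MessageStatus {
  const isAbortError = error instanceof Error && error.name === "AbortError";
  if (signal.aborted || isAbortError) return "cancelled"; // user Stop wins over success
  if (error !== undefined) return "failed";
  return "complete";
}
```

Checking `signal.aborted` before treating a returned prompt as `complete` is what closes the second bug: a turn that finishes after the user pressed Stop is still recorded as `cancelled`.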
F2 — tool-call-parser prune (option a: prune-now-minimal). DECISION (architect A2, confirms junior
OQ-F2c): do NOT do the flag-gated full retirement (option b). KEEP `extractToolCallBlocks` + `stripToolMarkup` + their types (`ToolCallExtraction`, `ParsedCall`): load-bearing `<invoke>`-as-text guard (the only guard for that case; `experimental_repairToolCall` doesn't cover it; sidecar `--jinja` unconfirmed so keeping the guard is correct). REMOVE the `export` keyword (not the implementations) from the 8 zero-external-caller symbols: `isPlaceholderArgValue`, `parseXmlToolCall`, `parseInvokeToolCall`, `partialXmlOpenerStart`, and the 4 consts `XML_TOOL_OPEN/CLOSE`, `INVOKE_TOOL_OPEN/CLOSE`. Zero runtime effect; public surface 11→4 exports.
- Test gap (test-engineer T6): the `<invoke>`-text fallback in `stream-phase.ts:263-284` is currently NOT exercised by any test → add a gate test (stub `streamText` to emit a text-delta containing a complete `<invoke>` block; assert it lands in `result.toolCalls` and the markup is NOT in `result.content`). Must stay green through the prune and fail if `extractToolCallBlocks` is ever removed from the text-delta path.
F3 — xml-parser structured logging. Trivial. tool-call-parser.ts:65 console.debug → pass an optional
log?: { debug } param to extractToolCallBlocks from its one call site (stream-phase.ts executeStreamPhase)
and use it. No interface (architect: one site, one impl). SEQUENCING: same file as F2; F2 keeps
extractToolCallBlocks (decided), so F3 is safe; do F2+F3 in one batch. Confirm executeStreamPhase
signature/test-stubs tolerate the param (junior).
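The optional-logger shape for F3 could look like the sketch below. The real `extractToolCallBlocks` signature and return type are not shown in this document, so `extractBlocksSketch`, `DebugLog`, and `ExtractionSketch` are hypothetical stand-ins; the regex is a toy and not the parser's actual matching logic.

```typescript
// Hypothetical shapes: stand-ins for the real parser types, for illustration only.
interface DebugLog {
  debug: (msg: string, meta?: Record<string, unknown>) => void;
}

interface ExtractionSketch {
  calls: string[]; // raw bodies of extracted <invoke> blocks
  content: string; // text with the tool markup stripped
}

function extractBlocksSketch(text: string, log?: DebugLog): ExtractionSketch {
  const calls: string[] = [];
  const content = text.replace(
    /<invoke\b[^>]*>([\s\S]*?)<\/invoke>/g,
    (_match: string, body: string): string => {
      calls.push(body.trim());
      return "";
    },
  );
  // Optional chaining keeps existing callers and test stubs working unchanged.
  log?.debug("tool-call blocks extracted", { count: calls.length });
  return { calls, content };
}
```

Because the parameter is optional and trailing, callers that do not pass a logger need no changes, which is the property the junior reviewer asked to confirm for `executeStreamPhase` and its stubs.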
F6 — BooChat stall-timeout ONLY (retry deferred). on-call: wrap the stream-phase.ts:261 fullStream loop
with a per-chunk stall deadline: a local stallAc = new AbortController(), effectiveSignal = AbortSignal.any([signal, stallAc.signal]) passed to streamText; bump a setTimeout(STALL_TIMEOUT_MS=90_000)
on each chunk; clear it in the existing finally; at the post-loop check (stream-phase.ts:337) test
signal?.aborted || stallAc.signal.aborted and throw AbortError (→ handleAbortOrError writes
cancelled). Tests (test-engineer T8-T10): pure classifyStreamError(err) helper (5 cases, no I/O) + a
vi.useFakeTimers() stall test on a fake hanging stream + a regression pin on the existing signal?.aborted
post-loop check.
- YAGNI DEFER (on-call, strong): NO retry at `executeStreamPhase`/`streamCompletion`. A retry after partial stream re-emits already-streamed deltas (`state.accumulated` + live `delta` frames are non-idempotent), which is worse than current. Reopen trigger: llama-swap gains restart-in-place-with-clear-partial, or a second instance for failover. The user re-sending is the correct recovery at single-instance scale.
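A minimal sketch of the per-chunk stall deadline, assuming a runtime where `AbortSignal.any` is available. `createStallGuard` and `consumeWithStallGuard` are hypothetical names, `openStream` stands in for the `streamText` call, and arming the timer before the first chunk is an assumption the review does not spell out.

```typescript
const STALL_TIMEOUT_MS = 90_000;

interface StallGuard {
  effectiveSignal: AbortSignal; // pass this to the stream instead of the caller's signal
  bump: () => void;             // restart the deadline; call on every chunk
  clear: () => void;            // stop the timer; call in finally
  stalled: () => boolean;       // true only when the stall timer fired
}

function createStallGuard(signal?: AbortSignal, timeoutMs: number = STALL_TIMEOUT_MS): StallGuard {
  const stallAc = new AbortController();
  const sources = signal ? [signal, stallAc.signal] : [stallAc.signal];
  const effectiveSignal = AbortSignal.any(sources);
  let timer: ReturnType<typeof setTimeout> | undefined;

  const clear = (): void => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };
  const bump = (): void => {
    clear();
    timer = setTimeout(() => stallAc.abort(), timeoutMs);
  };
  return { effectiveSignal, bump, clear, stalled: () => stallAc.signal.aborted };
}

// Hypothetical consumer: openStream must honor the signal it is given.
async function consumeWithStallGuard<T>(
  openStream: (signal: AbortSignal) => AsyncIterable<T>,
  signal?: AbortSignal,
  timeoutMs: number = STALL_TIMEOUT_MS,
): Promise<T[]> {
  const guard = createStallGuard(signal, timeoutMs);
  const chunks: T[] = [];
  guard.bump(); // assumption: arm the deadline before the first chunk too
  try {
    for await (const chunk of openStream(guard.effectiveSignal)) {
      guard.bump();
      chunks.push(chunk);
    }
  } finally {
    guard.clear();
  }
  // Post-loop check: a stream that ended quietly after an abort still counts as aborted.
  if (signal?.aborted || guard.stalled()) {
    const err = new Error("stream aborted or stalled");
    err.name = "AbortError";
    throw err;
  }
  return chunks;
}
```

Keeping `stallAc` separate from the caller's signal lets the post-loop check tell a user Stop from a stall, even though both surface as an `AbortError`.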
F7 — view_session_history MCP tool. architect A4: add tool 7 inline in mcp-server.ts (follows the
existing 6-tool inline pattern, textResult + direct sql). Reads messages_with_parts, WHERE role != 'system' (strips sentinels), params session_id + optional chat_id + limit (default 50, max 200),
ORDER BY created_at ASC. No interface, no pagination beyond limit. Returns {role,content,...}[].
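The parameter handling for this tool can be sketched as a pure query builder. This is an illustration under assumptions: the `session_id` and `chat_id` column names are inferred from the parameter names, the `$n` placeholder style is a guess at the SQL client, and `clampLimit`/`buildHistoryQuery` are hypothetical helpers rather than functions in `mcp-server.ts`.

```typescript
const DEFAULT_LIMIT = 50;
const MAX_LIMIT = 200;

interface HistoryParams {
  session_id: string;
  chat_id?: string;
  limit?: number;
}

/** Default 50, capped at 200, never below 1. */
function clampLimit(limit?: number): number {
  if (limit === undefined || !Number.isFinite(limit)) return DEFAULT_LIMIT;
  return Math.min(MAX_LIMIT, Math.max(1, Math.floor(limit)));
}

/** Build a parameterized query; column names here are assumptions. */
function buildHistoryQuery(params: HistoryParams): { text: string; values: Array<string | number> } {
  const values: Array<string | number> = [params.session_id];
  let where = "session_id = $1 AND role != 'system'"; // strips sentinel rows
  if (params.chat_id !== undefined) {
    values.push(params.chat_id);
    where += ` AND chat_id = $${values.length}`;
  }
  values.push(clampLimit(params.limit));
  const text =
    `SELECT role, content FROM messages_with_parts WHERE ${where} ` +
    `ORDER BY created_at ASC LIMIT $${values.length}`;
  return { text, values };
}
```

For example, `buildHistoryQuery({ session_id: "s1" })` binds `["s1", 50]`, and a caller asking for `limit: 500` is capped at 200.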
F9 — retire apps/coder/web :9502 SPA. architect A5: the if (existsSync(webRoot)) block in index.ts
(~269-289) already no-ops when the dist is absent. Delete that block, keep the inline 404 handler
({error:'not found'}); remove apps/coder/web from pnpm-workspace.yaml, the coder build step, and the
Dockerfile copy; remove the now-unused fastifyStatic import (verify it's only used there). KEEP all
/api/coder/* REST + WS + /api/health + --mcp routes (CoderPane depends on them). OQ-F9a RESOLVED:
nothing probes GET / on :9502 (health is /api/health; compose healthcheck is the boocode container, not
the host-systemd coder) → safe to 404 or add a 2-line GET / redirect-to-BooChat (no fastifyStatic).
BLOCKED — need a spec or a capability check before building (gate-trip items)
F4 — notify-hook config injection. SPEC-LEVEL gaps (junior OQ-F4a-e, behavioral B4, UX). The core premise
is UNVERIFIED: do claude / qwen / goose actually fire their native lifecycle hooks in unattended mode
(claude -p / SDK, qwen --acp / --output-format stream-json, goose)? goose's hook file/format is unknown
(not in repo). Idempotent per-agent settings.json merge strategy unspecified. boocoder.service run-user /
homedir() resolution unconfirmed. The inbound POST is a new unauthenticated localhost route (acceptable
single-user, note it). Double-publish dedup with the v2.7.6 turn-boundary publish: behavioral B4 +
architect A3 agree on the rule — inbound route calls normalizeAgentEvent (returns bucket
working|blocked|done), confirms tasks.state='running' before publishing blocked, and SUPPRESSES done
(the dispatcher already emits idle); done→drop, never re-publish. UI side already exists (AgentStatusDot,
all 4 buckets — UX: F4 is server-side only). RECOMMENDATION: own plan-a-feature — the dedup rule + module
shapes are settled, but the hook-firing-in-unattended-mode premise and goose hook mechanism must be verified
first or the whole feature is built on sand.
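The agreed dedup rule reduces to a small predicate, sketched here. `shouldPublishHookEvent` is a hypothetical name, and letting `working` events pass through without a state check is an assumption: the review only specifies the behavior for `blocked` and `done`.

```typescript
// Buckets as returned by normalizeAgentEvent, per the review.
type Bucket = "working" | "blocked" | "done";

/** Decide whether an inbound hook event should be published to the UI. */
function shouldPublishHookEvent(bucket: Bucket, taskState: string | undefined): boolean {
  if (bucket === "done") return false; // dispatcher already emits idle at the turn boundary
  if (bucket === "blocked") return taskState === "running"; // ignore stale hooks
  return true; // assumption: working events are not gated
}
```

Dropping `done` unconditionally is what prevents the double-publish with the v2.7.6 turn-boundary publish.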
F5 — opencode compaction surfacing. BLOCKED on a capability check. The installed @opencode-ai/sdk
exposes NO compaction event arm (current arms confirmed: session.next.{text,reasoning,tool,step}.*,
message.part.*, session.idle/error at opencode-server.ts:379-491). The review's "consume
compaction.{started,delta,ended}" assumed events from opencode's CORE event.ts, which the pinned SDK may not
surface. MUST confirm the SDK emits a compaction signal + its exact event name (or an SDK bump is needed)
before building. DISPUTED UI treatment (behavioral B5 = persistent sentinel row metadata.kind='compaction',
survives refresh; UX = ephemeral inline divider via a new agent_compacted frame, no DB row) — settle once
the event exists. Only compaction.ended is in scope (YAGNI: started/delta/step.failed/tool.progress out).
Cross-app WS-frame parity is certain if a frame is added.
F8 — diff-line → agent re-prompt. SPEC-LEVEL (UX + junior, firm). The "DiffPanel" is inline in
CoderPane.tsx:478-619, rendering pending_changes rows as a static <pre> (CoderPane.tsx:607-610) — NO
line-selection infrastructure exists. Diff source ambiguous (pending_changes.diff = BooCode write-tools only
vs the external-agent worktree git diff). "Send to new agent" needs coordinated workspace-pane + chat creation + pre-population
across 3 surfaces with no existing contract. Selection diverges by modality (desktop line-select vs mobile
long-press → bottom sheet). RECOMMENDATION: own `plan-a-feature` (the scope-brief already hedged this; treat as
firm). MVP-if-pushed: "comment to current agent" only, block-level selection, pre-populate `ChatInput`; still wants a spec.
Claim ledger (consolidated, deduped)
| # | Claim | State | Spec-maturity | Supporting |
|---|---|---|---|---|
| C1 | F1 cancel route never aborts external child; no registry/export | Evidenced | plan-level | on-call,behavioral,architect,junior |
| C2 | F1 catch blocks leave message streaming; success path writes complete on abort; fix in same batch | Evidenced | plan-level | on-call,behavioral |
| C3 | F2 = prune-now-minimal: unexport 8 zero-caller symbols, keep extractToolCallBlocks+stripToolMarkup | Evidenced | plan-level | architect (test-engineer guard) |
| C4 | F2 `<invoke>`-text fallback is untested → add gate test before prune | Evidenced | plan-level | test-engineer |
| C5 | F3 optional logger param, do with F2 (same file) | Evidenced | plan-level | architect,junior |
| C6 | F6 stall-timeout via AbortSignal.any, 90s; NO retry (non-idempotent deltas) | Evidenced | plan-level | on-call,behavioral,test-engineer |
| C7 | F7 inline MCP tool, messages_with_parts, role!='system', limit 50/200 | Evidenced | plan-level | architect,UX |
| C8 | F9 delete SPA block, keep routes; GET / unprobed → safe | Evidenced | plan-level | architect (+ verified) |
| C9 | F4 hook-firing in unattended mode UNVERIFIED; goose hook mechanism unknown | Anecdotal (premise) | spec-level | junior,behavioral,UX |
| C10 | F4 dedup rule: confirm running before blocked; suppress hook done | Evidenced | plan-level | behavioral,architect |
| C11 | F5 pinned @opencode-ai/sdk exposes no compaction arm → blocked on capability check | Evidenced | spec-level | (verified) + junior |
| C12 | F5 UI treatment sentinel-row vs ephemeral-frame | Disputed | spec-level | behavioral vs UX |
| C13 | F8 no line-selection infra; diff source ambiguous; needs own spec | Evidenced | spec-level | UX,junior |
Open Questions — resolutions
- OQ (F1 terminal state) → RESOLVED: `cancelled`. OQ (F1 registry key) → RESOLVED: `taskId`. OQ (F1 shared finalize helper) → RESOLVED: yes, pure helper. OQ (F1 warm re-throw on abort) → RESOLVED: short-circuit on `ac.signal.aborted`.
- OQ-F2a (sidecar jinja) → RESOLVED moot: option a keeps the guard. OQ-F2c (a vs b) → RESOLVED: option a.
- OQ-F6a/b/c → RESOLVED: AbortSignal.any (not Promise.race); no retry; 90s.
- OQ-F7a (session vs chat id) → RESOLVED: both (chat_id optional) + limit.
- OQ-F9a (GET / probe) → RESOLVED: unprobed, safe.
- OQ-F4a (hooks fire unattended?), OQ-F4b (goose hook format) → UNRESOLVED, spec-level → route to F4 spec.
- OQ-F5a (SDK compaction event name/existence) → UNRESOLVED, capability check → blocks F5.
- OQ-F5b (sentinel vs ephemeral UI) → UNRESOLVED → settle in F5 once event confirmed.
- OQ-F8a/b/c (diff source, serialization, new viewer) → UNRESOLVED, spec-level → route to F8 spec.
Spec-maturity gate
TRIPPED (≥5 spec-level findings — C9, C11, C12, C13, plus OQ-F4b/F8a — across ≥3 specialists: junior,
behavioral, UX). The trip is CONCENTRATED in the three WANT items F4/F5/F8; F1/F2/F3/F6/F7/F9 are all
plan-level and ready. Per skill: gate-trip → recommend the user route F4/F8 to plan-a-feature and F5 to a
capability check. USER OVERRIDE STANDING: Sam chose scope "everything we discussed" having pre-acknowledged the
WANT items would be planned more shallowly — so the plan proceeds, documenting F4/F5/F8 as Blocked/own-spec
rather than halting. Decision deferred to Step 9 user presentation.
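The trip condition (at least 5 spec-level findings across at least 3 specialists) can be checked mechanically, as in this sketch. `Finding` and `gateTripped` are hypothetical names; the specialist attributions for OQ-F4b and OQ-F8a are read off the F4 and F8 sections rather than stated in the claim ledger.

```typescript
interface Finding {
  id: string;
  maturity: "plan-level" | "spec-level";
  specialists: string[];
}

/** True when enough spec-level findings come from enough distinct specialists. */
function gateTripped(findings: Finding[], minFindings: number = 5, minSpecialists: number = 3): boolean {
  const specLevel = findings.filter((f) => f.maturity === "spec-level");
  const who = new Set<string>();
  for (const f of specLevel) {
    for (const s of f.specialists) who.add(s);
  }
  return specLevel.length >= minFindings && who.size >= minSpecialists;
}

const round1: Finding[] = [
  { id: "C1", maturity: "plan-level", specialists: ["on-call", "behavioral", "architect", "junior"] },
  { id: "C9", maturity: "spec-level", specialists: ["junior", "behavioral", "UX"] },
  { id: "C11", maturity: "spec-level", specialists: ["junior"] },
  { id: "C12", maturity: "spec-level", specialists: ["behavioral", "UX"] },
  { id: "C13", maturity: "spec-level", specialists: ["UX", "junior"] },
  { id: "OQ-F4b", maturity: "spec-level", specialists: ["junior"] },     // attribution assumed
  { id: "OQ-F8a", maturity: "spec-level", specialists: ["UX", "junior"] }, // attribution assumed
];

const tripped = gateTripped(round1);
```

With the Round-1 data above the gate trips on six spec-level findings from three specialists, all concentrated in F4/F5/F8.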
YAGNI ledger
- F6 retry logic → DEFER (non-idempotent re-emit of streamed deltas). Reopen: llama-swap restart-in-place or second instance. Source: on-call R1.
- F2 option b (flag-gated full retirement of extractToolCallBlocks/stripToolMarkup) → DEFER (no evidence qwen3.6 stopped emitting `<invoke>` text on live; sidecar jinja unconfirmed). Reopen: documented multi-session live probe shows zero text-delta tool calls. Source: architect/test-engineer R1.
- F4 `NotifyHookInjection` interface → REPLACE with one concrete function switching on agent name (3 agents, identical read-merge-write). Source: architect R1.
- F5 handling of compaction.started/delta + step.failed + tool.progress → DEFER, only compaction.ended is user-actionable. Source: behavioral R1.
- F7 SessionHistoryReader interface / pagination → REPLACE with inline query + limit. Source: architect R1.
- Provider tier-2 follow-ups (snapshot frame, enabled column, shared types, MCP list_providers) → already DEFER/DROP per scope-brief; not re-planned.