Move 13 shipped openspec change docs under openspec/changes/archived/. Add docs/features/git-diff-panel, docs/plans/post-review-backlog, and docs/research/cross-app-contract-ssot.md (the research behind the @boocode/contracts SSOT work). Update BOOCHAT.md, BOOCODER.md, and boocode_roadmap.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
36 KiB
Feature Implementation Plan: Post-Review Backlog (F1–F9)
This is a multi-item backlog, not a single feature. It commits to shipping six READY items as independent sequential patch tags (D-9), one batch per coherent unit, honoring deploy-by-surface and stage-commits-by-path; and to documenting three BLOCKED WANT items (F4/F5/F8) with their exact blocking open questions and recommended resolution paths (D-12) rather than halting on the tripped spec-maturity gate.
Source Specification
- Feature specification: No formal
feature-specification.mdexists. The ground-truth source is the scope brief, which captures the items, decisions, and live-verified current state from the 2026-06-02 backlog review conversation plusboocode_code_review_v2.md,docs/DEFERRED-WORK.md, andboocode_roadmap.md. Origin is conversational + review-doc, so the WANT items are intentionally planned more shallowly. - Consolidated specialist record: artifacts/synthesis-input.md — the Round-1 aggregation of all six specialists (on-call-engineer, behavioral-analyst, software-architect, test-engineer, user-experience-designer, junior-developer) with file:line evidence, the claim ledger, OQ resolutions, the spec-maturity gate, and the YAGNI ledger. There is no separate per-specialist file; this digest is authoritative.
- Discovery notes: artifacts/.discovery-notes.md — project context and per-item code touch points.
Outcome
When this plan's READY cluster is executed: hitting Stop on an external agent task actually kills the
running child and finalizes the assistant message in a correct terminal state (no more streaming-stuck or
falsely-complete rows); the tool-call parser's public surface is pruned to its four load-bearing exports
while the plain-text <invoke> guard stays intact and is for the first time test-pinned; the parser's
rejection logging flows through pino/LOG_LEVEL; a hung BooChat stream is detected and finalized
server-side after 90s instead of relying solely on the frontend watchdog; an MCP client can read a
session's transcript through a new BooCoder MCP tool; and the unused :9502 fallback SPA and its build step
are gone. F4/F5/F8 leave this plan with a documented, evidence-backed route to their own spec/capability
work.
Context
- Driving constraint: A post-review backlog with two items live-verified as active correctness bugs (F1 cancel is a no-op that leaves messages stuck; F6 has zero server-side stall guard). The review surfaced them; Sam chose scope "everything we discussed." No external deadline — value is correctness and simplification debt paydown.
- Stakeholders: Sam (sole user/operator — wants Stop to work, wants the parser simplified, wants resilience and the MCP history tool, greenlit the :9502 removal). On-call posture is single-operator: Sam is the post-ship owner for every item.
- Future-state concern: F1 touches the dispatcher's message-finalization paths across four backends — the risk to watch is finalization races/regressions there. F6 introduces a server-side timer that must not fire on a healthy-but-slow stream. F5 is a standing capability dependency on the pinned
@opencode-ai/sdk. - Out-of-scope boundary: F2 Option B (flag-gated full parser retirement), F6 retry/backoff, F4's superset code (clean-room pattern only), and everything in the scope brief's "Deferred (Sam's explicit dispositions)" list (subagent permission demux, tier-2 provider follow-ups, large-file splits, PR-resolver, etc.). These are recorded, not planned.
Team Composition and Participation
Round-by-round detail lives in artifacts/implementation-iteration-history.md. All six specialists converged in one round; remaining unknowns are spec-level, not resolvable by more specialist rounds.
| Specialist | Status | Key Input |
|---|---|---|
project-manager |
Coordinator | Facilitated R1, applied the spec-maturity gate + YAGNI gate, synthesized this plan. |
on-call-engineer |
Active | F1 finalization bugs (OCE-001/002); F6 stall-timeout design + no-retry YAGNI call. |
behavioral-analyst |
Active | F1 message-state corruption (B2/B3); F5 sentinel-row UI position (disputed); F4 dedup rule. |
software-architect |
Active | F1 registry/ExternalCancelFn shape (A1); F2 option-A prune (A2); F7 inline tool (A4); F9 removal (A5). |
test-engineer |
Active | F1 CancelRegistry + DB test (T1-T3); F2 fallback gate test (T6); F6 classifyStreamError + fake-timer test (T8-T10). |
user-experience-designer |
Active | F1 disable-Stop-while-in-flight + Option-A frame extension + muted "Stopped" label; F5 ephemeral-divider UI (disputed); F8 no line-selection infra. |
junior-developer |
Active | OQ reframing across all items; F2c (option a), F4a/b unverified premise, F8 spec-level. |
Implementation Approach
ALTITUDE: this section names code artifacts and inlines only decision-bearing values (flag names, the 90s stall timeout, the 50/200 limits, the export list to unexport). Full implementations belong in the files themselves; mechanics live in the cited decision-log entries.
TIER 1 — READY TO BUILD
F1 — external task cancel kills the child + finalizes the message (apps/coder, standalone, do first).
The highest-value correctness fix. A taskControllers = new Map<string,AbortController>() registry inside
createDispatcher, populated at the four run-functions and deleted in the existing .finally(), plus an
exported idempotent cancelExternalTask(taskId): boolean wired into POST /api/tasks/:id/cancel via a
narrow ExternalCancelFn
(D-1).
This batch must also fix the two pre-existing finalization bugs the abort wiring newly makes reachable —
the catch blocks that leave messages streaming and the warm success path that writes complete on an
aborted turn — via a shared cancelAndFinalize helper, or it ships a new bug
(D-1).
User Stop finalizes to cancelled, not failed
(D-7);
the terminal state reaches the web reducer via an optional status field on the existing coder
message_complete frame, not a new frame type
(D-8),
rendered as a muted "Stopped" label (D-10). The
session/stop leg "never fires for external from UI" (CoderPane already calls cancelTask), so it is wired
best-effort via a WHERE session_id=$ AND state='running' lookup or deferred as a low-value leg.
F2 + F3 together (apps/server, same file). F2 is option-A prune-now-minimal: KEEP extractToolCallBlocks
stripToolMarkup(the only guard for a tool call emitted as plain text), unexport the 8 zero-caller symbols —isPlaceholderArgValue,parseXmlToolCall,parseInvokeToolCall,partialXmlOpenerStart, and the constsXML_TOOL_OPEN/CLOSE,INVOKE_TOOL_OPEN/CLOSE— taking the public surface 11 → 4 with zero runtime effect (D-3). A gate test pins the untested<invoke>-as-text fallback before the prune (D-4). Because F2 keepsextractToolCallBlocks, F3 is safe to land in the same file/batch: thread an optionallog?: { debug }param intoextractToolCallBlocksfrom its one call site so rejection logging flows through pino/LOG_LEVELinstead ofconsole.debug(D-2).
F6 — BooChat stall-timeout ONLY (apps/server, standalone). Wrap the stream-phase.ts:261 fullStream
loop with a per-chunk stall deadline using AbortSignal.any([signal, stallAc.signal]) and a
setTimeout(STALL_TIMEOUT_MS = 90_000) bumped on each chunk, cleared in the existing finally; the
post-loop check throws AbortError so the existing finalize path writes cancelled
(D-5). No
retry — partial-stream re-emit is non-idempotent and the instance is single+local (see Deferred (YAGNI)).
F7 — view_session_history MCP tool (apps/coder MCP, standalone). Tool 7 added inline in
mcp-server.ts following the existing 6-tool inline pattern (textResult + direct sql). Reads
messages_with_parts WHERE role != 'system', params session_id + optional chat_id + limit (default
50, max 200), ORDER BY created_at ASC (D-6).
No interface, no pagination beyond limit.
F9 — retire :9502 SPA (apps/coder, standalone cleanup). Delete the if (existsSync(webRoot)) serve
block in index.ts (~269-289), keep the inline 404; remove apps/coder/web from pnpm-workspace.yaml, the
coder build step, and the Dockerfile copy; drop the now-unused fastifyStatic import. KEEP every
/api/coder/* REST + WS + /api/health + --mcp route
(D-11).
TIER 2 — BLOCKED (need their own spec / a capability check)
These three are recorded, not built. They do not block the READY cluster (D-12).
F4 — notify-hook config injection. The module shapes and the double-publish dedup rule are settled
(inbound route calls normalizeAgentEvent → bucket; confirms tasks.state='running' before publishing
blocked; SUPPRESSES done since the dispatcher already emits idle). But the core premise is
unverified: it is unknown whether claude / qwen / goose actually fire their native lifecycle hooks in
unattended mode (claude -p/SDK, qwen --acp/stream-json, goose), and goose's hook file/format is
unknown (not in repo). Recommended path: route to plan-a-feature; verify hook-firing-in-unattended-mode
and the goose hook mechanism first, or the feature is built on sand. Blocking OQs: OQ-F4a (hooks fire
unattended?), OQ-F4b (goose hook format?).
F5 — opencode compaction surfacing. Blocked on a capability check: the pinned @opencode-ai/sdk
exposes NO compaction event arm (confirmed arms: session.next.{text,reasoning,tool,step}.*,
message.part.*, session.idle/error). The review assumed events from opencode's core event.ts that the
pinned SDK may not surface. Recommended path: confirm the SDK emits a compaction signal and its exact
event name (or an SDK bump is needed) before building. The UI treatment is disputed (behavioral:
persistent sentinel row metadata.kind='compaction' that survives refresh; UX: ephemeral inline divider via
a new agent_compacted frame, no DB row) — settle once the event is confirmed to exist. Only
compaction.ended is in scope. Blocking OQs: OQ-F5a (SDK event existence/name), OQ-F5b (sentinel vs
ephemeral UI).
F8 — diff-line → agent re-prompt UX. Spec-level. The "DiffPanel" is inline in
CoderPane.tsx:478-619, rendering pending_changes rows as a static <pre> — no line-selection
infrastructure exists. The diff source is ambiguous (pending_changes.diff = BooCode write-tools only vs the
external-agent worktree git diff). "Send to new agent" needs coordinated workspace-pane + chat creation +
pre-population across three surfaces with no existing contract; selection diverges by modality.
Recommended path: route to plan-a-feature (the scope brief already hedged this; treat as firm).
MVP-if-pushed: "comment to current agent" only, block-level selection, pre-populate ChatInput — still
wants a spec. Blocking OQs: OQ-F8a (diff source), OQ-F8b (selection serialization), OQ-F8c (new-viewer
routing).
Architecture and Integration Points
F1 lives entirely in apps/coder (dispatcher.ts, routes/tasks.ts, index.ts:254 wiring) with a
single-field touch on the web reducer (CoderPane.tsx:299). F2/F3 are confined to
apps/server/.../tool-call-parser.ts and its one call site stream-phase.ts. F6 is confined to
stream-phase.ts. F7 is one inline tool in apps/coder/.../mcp-server.ts. F9 removes a serve block and a
workspace package. No new interface is introduced anywhere in TIER 1 — every item follows the existing inline
/ pure-helper precedent (D-2,
D-6).
Runtime Behavior
F1's cancel path: Stop → POST /api/tasks/:id/cancel → cancelExternalTask(taskId) → ac.abort() →
backend honors the signal (child.kill / session/cancel / session.abort / interrupt) → the run-function
hits its abort short-circuit or catch → cancelAndFinalize writes the terminal messages.status
(cancelled on abort, failed on error) WHERE status='streaming' and publishes the terminal frame
(D-1,
D-7).
F6's stall path: each chunk bumps the 90s timer; on stall, stallAc.abort() flows through AbortSignal.any
into the same finalize-as-cancelled path
(D-5).
Data Model and Persistence
No schema changes in any TIER 1 item. F1 and F6 only change which terminal value gets written to existing
messages.status / tasks.state columns. F7 is a read against the existing messages_with_parts view.
External Interfaces
F1 extends the existing coder message_complete WS frame with an optional status field — no new frame
type, so no paired strict-union arm is forced in apps/web/src/api/types.ts beyond the optional field
(D-8).
F7 exposes one new read-only MCP tool on BooCoder's MCP server
(D-6). F4 (blocked) would add a new
unauthenticated localhost POST route — see Security Posture. F5 (blocked) would add a new WS frame and thus
trigger the full cross-app parity rule.
Decomposition and Sequencing
Each READY item is its own patch tag, one batch per coherent unit
(D-9). F2+F3 are a single unit (same file; F2 keeping
extractToolCallBlocks unblocks F3). F1 goes first as the highest-value correctness fix.
| # | Work Unit | Surface / Deploy | Delivers | Depends On | Verification |
|---|---|---|---|---|---|
| 1 | F1 cancel registry + finalization fixes | apps/coder → systemctl restart boocoder |
Stop kills child; message finalizes cancelled/failed; no stuck streaming (D-1) |
— | CancelRegistry unit (4 cases) + DB-integration route test → row cancelled (T1-T3) |
| 2 | F2+F3 parser prune + logger | apps/server → docker rebuild | Public surface 11→4; rejection logging via pino; guard pinned (D-3, D-2) | F2's keep-decision unblocks F3 | Fallback gate test green through prune (T6, D-4) |
| 3 | F6 stall-timeout | apps/server → docker rebuild | 90s server-side stall detection → finalize cancelled (D-5) |
— | classifyStreamError unit (5 cases) + vi.useFakeTimers() stall test + post-loop abort regression pin (T8-T10) |
| 4 | F7 session-history MCP tool | apps/coder → systemctl restart boocoder |
Read-only transcript tool; /api/health tool count +1 (D-6) |
— | Manual MCP call + sentinel-strip assertion (role != 'system') |
| 5 | F9 retire :9502 SPA | apps/coder → systemctl restart boocoder |
Serve block + build step + workspace pkg removed; routes kept (D-11) | — | /api/coder/* + /api/health still 200; GET / 404-or-redirect |
F4/F5/F8 are NOT in this table — they route out per D-12.
RAID Log
Risks
| ID | Risk | Likelihood | Severity | Blast Radius | Reversibility | Owner | Mitigation |
|---|---|---|---|---|---|---|---|
| R1 | F1 message-finalization races across the 4 backend paths leave a row in the wrong terminal state | Medium | Medium | One assistant message per affected turn | Reversible (re-run) | on-call-engineer | Shared cancelAndFinalize helper, WHERE status='streaming' idempotency, abort short-circuit before the unconditional complete write (D-1) |
| R2 | F2 prune accidentally removes the <invoke>-as-text fallback guard (regression: plain-text tool calls silently dropped) |
Low | High | Any qwen3.6 turn that emits tool-call markup as text | Reversible (re-add export/call) | software-architect | Option-A keeps the impl; gate test fails if extractToolCallBlocks leaves the text-delta path (D-3, D-4) |
| R3 | F6 stall-timeout fires on a healthy-but-slow stream (false abort) | Low | Low | One in-flight chat turn | Reversible (re-send) | on-call-engineer | 90s per-chunk (not per-turn) deadline, bumped on every chunk; tune only on observed false-fire (D-5) |
| R4 | F5 premise fails — the pinned SDK never surfaces a compaction event, so the feature is unbuildable without an SDK bump | High | N/A (blocked) | F5 only | N/A | (capability check) | Confirm event existence/name before any build; F5 stays Blocked until then (D-12) |
| R5 | F4 premise fails — agents do not fire native hooks in unattended mode, so injection yields no signals | High | N/A (blocked) | F4 only | N/A | (plan-a-feature) | Verify hook-firing + goose format in the F4 spec before any build (D-12) |
Assumptions
| ID | Assumption | What Changes If Wrong | Verifier | Status |
|---|---|---|---|---|
| A1 | llama-swap native --jinja parsing stays ON (structured tool_calls), so the text-scrape fallback remains dormant defense-in-depth |
If off, the kept guard becomes hot — fine; if a future config silently changes parsing, F2 Option B reopens | live probe of :8401 (not in-repo) |
Unverified-in-repo (probe-only); guard kept regardless (D-3) |
| A2 | All four external backends honor ctx.signal as cited (child.kill / session/cancel / session.abort / interrupt) |
If one does not, that backend's Stop stays a no-op even after wiring | F1 DB-integration test per backend path | Evidenced by file:line (scope-brief F1); test confirms |
| A3 | 90s is the right stall deadline for the local llama-swap workload shape | Too low → false aborts on slow models; too high → sluggish recovery | Production observation | Committed value, tunable (D-5) |
| A4 | F1's session/stop leg "never fires for external from UI" (CoderPane calls cancelTask) |
If a code path does hit session-stop for an external task, that leg's finalization is best-effort only | on-call review of CoderPane wiring | Evidenced (synthesis-input F1 resolved OQs) |
Dependencies
| ID | Dependency | Owner | Status |
|---|---|---|---|
| Dep1 | F5 requires @opencode-ai/sdk to surface a compaction event (or an SDK bump) |
(capability check) | BLOCKING F5 — unresolved (OQ-F5a) |
| Dep2 | F4 requires confirmed unattended hook-firing for claude/qwen/goose + the goose hook format | (plan-a-feature) | BLOCKING F4 — unresolved (OQ-F4a, OQ-F4b) |
| Dep3 | F8 requires a chosen diff source + a line-selection approach | (plan-a-feature) | BLOCKING F8 — unresolved (OQ-F8a/b/c) |
Testing Strategy
Sourced from test-engineer (T1-T3, T6, T8-T10) following the established pure-helper-then-wire precedent
(turn-guard.ts, lifecycle-decisions.ts, mistake-tracker.ts). Coder tests are globals:false (import
describe/it/expect); include glob src/**/__tests__/**/*.test.ts; DB-integration tests are opt-in via
DATABASE_URL + describe.runIf.
- Observable behaviors to test:
- F1: a cancelled external task lands
messages.status='cancelled'(notstreaming, notcomplete); the registry register/cancel/delete/has behaves idempotently (D-1, D-7). - F2: a text-delta carrying a complete
<invoke>block lands inresult.toolCallswith the markup absent fromresult.content— green through the prune, red if the guard is removed (D-4). - F6: a fake hanging stream is aborted after the 90s deadline under
vi.useFakeTimers(); the existing post-loopsignal?.abortedcheck still passes (regression pin) (D-5). - F7: results exclude
role='system'rows (sentinel strip) and respect thelimitcap.
- F1: a cancelled external task lands
- Test doubles posture: extract pure helpers (
CancelRegistry,classifyStreamError) with no DB/child/I/O for unit coverage; stubstreamTextfor the F2 gate and F6 stall tests; one DB-integration test for the F1 route. Warm-worktree-preserved is held as a code comment, not a spy (test-engineer T1-T3). - Edge cases requiring coverage: double-Stop / cancel-after-exit (idempotent
ac.abort()); abort vs thrown error in the catch (cancelled vs failed mapping); F6 stall vs healthy-slow stream. - Test levels: unit (registry, classifier, gate) → integration (F1 route → DB row) → manual (F7 MCP call, F9 route-still-200 smoke).
Security Posture
Thin. The single TIER 1 surface is BooCoder's new read-only view_session_history MCP tool (F7), which
reads only messages_with_parts WHERE role != 'system' — no write verbs, no PII beyond what the operator
already authored, single-user, Authelia at the reverse proxy, no app-layer auth by design.
The one genuinely new auth-relevant surface — F4's inbound unauthenticated localhost POST route for hook callbacks — belongs to a BLOCKED item and is not built in this plan. When F4 is planned, it must be recorded explicitly as a new unauthenticated localhost route (acceptable under the single-user / Authelia posture, but called out, not silent).
Operational Readiness
- Deploy by surface: apps/coder items (F1, F7, F9) →
sudo systemctl restart boocoder; apps/server items (F2+F3, F6) →docker compose up --build -d boocode. Stage commits explicitly by path; nevergit add -A(Sam has uncommitted web WIP) (D-9). - Build-step change (F9): removing
apps/coder/webdeletes a coder build step and a workspace package; the coder Dockerfile copy andpnpm-workspace.yamlentry must be removed in the same batch or the build references a missing package (D-11). - Health check after F7:
/api/healthreports a tool count; expect it to increment by one. - Tagging: sequential patch tags, one per unit; no v2.8.0 (D-9).
- Rollback: each unit is an independent patch tag, so rollback is per-item (revert the tag, redeploy by surface). No schema migrations means no expand/contract sequencing to unwind.
On-Call Resilience Posture
Application-source resilience, sourced from on-call-engineer. Infrastructure concerns live in Operational Readiness above.
- Timeouts and deadlines: F6 adds a 90s per-chunk stall deadline on the BooChat
fullStreamloop, the first server-side guard against a hung llama-swap stream (today only the frontend 60sdiscard_stalewatchdog exists) (D-5). - Retry strategy: none, deliberately. A retry after a partial stream re-emits already-streamed deltas (non-idempotent); the user re-sending is the correct recovery at single-local-instance scale (see Deferred (YAGNI)) (D-5).
- Idempotency: F1's
cancelExternalTaskis idempotent (ac.abort()no-ops when already aborted → double-Stop and cancel-after-exit are safe); message finalization is idempotent viaWHERE status='streaming'(D-1). - Kill switches: F1 is itself the kill switch for a runaway external agent — making Stop actually stop is the operator's manual control.
- Graceful degradation: F1's
session/stopleg is best-effort (does not fire for external tasks from the UI); F6 falls back to finalizing ascancelledon stall rather than hanging indefinitely. - Observability of failure paths: F1 distinguishes
cancelled(user/stall) fromfailed(thrown error) so the human-inbox surface stays honest (D-7); F3 routes parser-rejection logging through pino/LOG_LEVEL(D-2). - Data integrity: no monetary/rate columns touched; F1/F6 only write existing terminal-state enum values into existing columns.
- Migration safety: no schema changes in any TIER 1 item — nothing to expand/contract.
Definition of Done
- Hitting Stop on an external task kills the child and the assistant message finalizes
cancelled(never leftstreaming, never falselycomplete) (D-1, D-7). - Stop button is disabled while the cancel POST is in flight; a muted "Stopped" label renders on cancel (D-8, D-10).
- Parser public surface is 4 exports; the
<invoke>-as-text gate test is green and fails if the guard is removed (D-3, D-4). - Parser rejection logging appears via pino and respects
LOG_LEVEL(D-2). - A fake hanging BooChat stream is finalized
cancelledafter 90s under fake timers; the post-loop abort regression pin passes (D-5). view_session_historyreturns a session transcript excludingrole='system'rows, capped atlimit(default 50, max 200);/api/healthtool count +1 (D-6).- :9502 serve block + build step + workspace package removed; all
/api/coder/*+/api/healthroutes still 200 (D-11). - Each shipped item has its own sequential patch tag + CHANGELOG entry; deployed by surface (D-9).
Specialist Handoffs for Implementation
test-engineer— dispatch alongside each READY unit; needs the touch-point file:lines (already in discovery notes) to author T1-T3 (F1), T6 (F2), T8-T10 (F6) before/with the wiring.software-architect— dispatch if F1'scancelAndFinalizehelper shape needs settling across the four backend paths during implementation; needs the four catch-block + success-path line ranges.plan-a-feature(F4) — dispatch when Sam wants F4; input: the settled dedup rule + module shapes from synthesis-input, and the two blocking OQs (unattended hook-firing, goose format) to resolve first.plan-a-feature(F8) — dispatch when Sam wants F8; input:CoderPane.tsx:478-619DiffPanel location, the diff-source ambiguity, and the cross-surface routing gap.- Capability check (F5) — dispatch before any F5 build; input: the pinned
@opencode-ai/sdkarm list and OQ-F5a (does it emit a compaction event, and what is its exact name?).
Deferred (YAGNI)
Items considered during planning and deferred under the YAGNI rule. Each carries a concrete reopen trigger.
F6 retry / backoff classifier
- Why deferred: Evidence test — a retry after a partial stream re-emits already-streamed deltas (
state.accumulated+ livedeltaframes are non-idempotent), which is strictly worse than the current behavior; no second instance exists to fail over to. Single-local-instance scale makes operator re-send the correct recovery. - Reopen when: llama-swap gains restart-in-place-with-clear-partial semantics, OR a second llama-swap instance is added for failover.
- Source: R1, on-call-engineer.
F2 Option B — flag-gated full retirement of extractToolCallBlocks / stripToolMarkup
- Why deferred: Evidence test — no evidence qwen3.6 stopped emitting
<invoke>-as-text on live, and the sidecar--jinjastate is unconfirmed in-repo; deleting the only plain-text-tool-call guard on that basis is unsafe (named anti-pattern: removing a load-bearing guard without a measured signal). - Reopen when: a documented multi-session live probe shows zero text-delta tool calls from qwen3.6.
- Source: R1, software-architect / test-engineer.
F4 NotifyHookInjection interface
- Why deferred: Simpler-version test — three agents with an identical read-merge-write settings.json flow do not need an interface; one concrete function switching on agent name satisfies the same need (named anti-pattern: single-implementation interface before three divergent uses exist).
- Reopen when: a fourth agent with a genuinely different injection contract appears (and only after F4's premise is verified).
- Source: R1, software-architect.
F5 handling of compaction.started / compaction.delta + step.failed + tool.progress
- Why deferred: Evidence test — only
compaction.endedis user-actionable (it is what closes the silent context gap); the other arms add UI noise with no user-described need. Compounded by F5 being capability-blocked anyway. - Reopen when: the SDK compaction event is confirmed AND a user-described need for in-progress compaction feedback emerges.
- Source: R1, behavioral-analyst.
F7 SessionHistoryReader interface / pagination beyond limit
- Why deferred: Simpler-version test — one read tool with an inline query + a
limitcap satisfies the read need; an interface and cursor pagination are unused machinery (named anti-pattern: abstraction before a second concrete use). - Reopen when: a second history-read consumer needs a different read path, or transcripts routinely exceed the 200-row cap in a way that breaks the use case.
- Source: R1, software-architect.
Open Items
- OI-1 (F4 premise): Do claude / qwen / goose fire native lifecycle hooks in unattended mode, and what is goose's hook file/format?
- Resolves when: verified in a
plan-a-featurespec for F4 (OQ-F4a, OQ-F4b). - Blocks implementation: Yes for F4 — No for the READY cluster (F4 is independent).
- Resolves when: verified in a
- OI-2 (F5 capability): Does the pinned
@opencode-ai/sdksurface a compaction event, and what is its exact name?- Resolves when: a capability check confirms the event (or an SDK bump is scoped) — OQ-F5a.
- Blocks implementation: Yes for F5 — No for the READY cluster.
- OI-3 (F5 UI treatment): persistent sentinel row vs ephemeral inline divider for "context compacted."
- Resolves when: OI-2 resolves and the team picks the UI shape (OQ-F5b); currently disputed (behavioral vs UX).
- Blocks implementation: Yes for F5 — No for the READY cluster.
- OI-4 (F8 spec): diff source (
pending_changes.diffvs worktree git diff), selection serialization, and new-viewer routing.- Resolves when: a
plan-a-featurespec for F8 settles OQ-F8a/b/c. - Blocks implementation: Yes for F8 — No for the READY cluster.
- Resolves when: a
- OI-5 (F1 session-stop leg): whether to wire the best-effort
session/stoplookup or defer it.- Resolves when: decided at F1 implementation; it does not fire for external tasks from the UI today, so it is low-value.
- Blocks implementation: No — F1 ships with or without this leg.
Summary
- Outcome delivered: Stop actually stops external agents and finalizes their messages correctly; the parser is pruned to its load-bearing surface with the guard test-pinned; BooChat gains a 90s server-side stall guard; a session-history MCP tool and the :9502 retirement land; F4/F5/F8 have a documented route to their own spec/capability work.
- Team size: 7 specialists (incl. project-manager) — see artifacts/implementation-iteration-history.md
- Rounds of facilitation: 1 — see artifacts/implementation-iteration-history.md
- Decisions committed: 12 — see artifacts/implementation-decision-log.md
- Decisions settled by evidence: 11 — see artifacts/implementation-decision-log.md
- Decisions settled by junior-developer reframing: 0 (junior confirmed several, but each rests on specialist evidence) — see artifacts/implementation-decision-log.md
- Decisions settled by user input: 1 (D-12, the standing-override disposition) — see artifacts/implementation-decision-log.md
- Rejected alternatives recorded: 13 — see artifacts/implementation-decision-log.md
- Open items remaining: 5 (4 blocking only their own BLOCKED item; OI-5 non-blocking)
- Recommendation: Ship the READY cluster as planned — build F1 first (apps/coder, highest-value correctness fix, must include the OCE-001/002 finalization fixes), then F2+F3 (apps/server, one batch), F6 (apps/server), F7 (apps/coder MCP), F9 (apps/coder cleanup), each as its own sequential patch tag. Route F4 and F8 to
plan-a-featureand F5 to an@opencode-ai/sdkcapability check before any build.