Files

indifferentketchup 2a05d2f9fe docs: archive shipped openspec batches; add feature/plan/research notes

Move 13 shipped openspec change docs under openspec/changes/archived/.
Add docs/features/git-diff-panel, docs/plans/post-review-backlog, and
docs/research/cross-app-contract-ssot.md (the research behind the
@boocode/contracts SSOT work). Update BOOCHAT.md, BOOCODER.md, and
boocode_roadmap.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-02 21:20:33 +00:00

36 KiB

Raw Blame History

Feature Implementation Plan: Post-Review Backlog (F1–F9)

This is a multi-item backlog, not a single feature. It commits to shipping six READY items as independent sequential patch tags (D-9), one batch per coherent unit, honoring deploy-by-surface and stage-commits-by-path; and to documenting three BLOCKED WANT items (F4/F5/F8) with their exact blocking open questions and recommended resolution paths (D-12) rather than halting on the tripped spec-maturity gate.

Source Specification

Feature specification: No formal feature-specification.md exists. The ground-truth source is the scope brief, which captures the items, decisions, and live-verified current state from the 2026-06-02 backlog review conversation plus boocode_code_review_v2.md, docs/DEFERRED-WORK.md, and boocode_roadmap.md. Origin is conversational + review-doc, so the WANT items are intentionally planned more shallowly.
Consolidated specialist record: artifacts/synthesis-input.md — the Round-1 aggregation of all six specialists (on-call-engineer, behavioral-analyst, software-architect, test-engineer, user-experience-designer, junior-developer) with file:line evidence, the claim ledger, OQ resolutions, the spec-maturity gate, and the YAGNI ledger. There is no separate per-specialist file; this digest is authoritative.
Discovery notes: artifacts/.discovery-notes.md — project context and per-item code touch points.

Outcome

When this plan's READY cluster is executed: hitting Stop on an external agent task actually kills the running child and finalizes the assistant message in a correct terminal state (no more streaming-stuck or falsely-complete rows); the tool-call parser's public surface is pruned to its four load-bearing exports while the plain-text <invoke> guard stays intact and is for the first time test-pinned; the parser's rejection logging flows through pino/LOG_LEVEL; a hung BooChat stream is detected and finalized server-side after 90s instead of relying solely on the frontend watchdog; an MCP client can read a session's transcript through a new BooCoder MCP tool; and the unused :9502 fallback SPA and its build step are gone. F4/F5/F8 leave this plan with a documented, evidence-backed route to their own spec/capability work.

Context

Driving constraint: A post-review backlog with two items live-verified as active correctness bugs (F1 cancel is a no-op that leaves messages stuck; F6 has zero server-side stall guard). The review surfaced them; Sam chose scope "everything we discussed." No external deadline — value is correctness and simplification debt paydown.
Stakeholders: Sam (sole user/operator — wants Stop to work, wants the parser simplified, wants resilience and the MCP history tool, greenlit the :9502 removal). On-call posture is single-operator: Sam is the post-ship owner for every item.
Future-state concern: F1 touches the dispatcher's message-finalization paths across four backends — the risk to watch is finalization races/regressions there. F6 introduces a server-side timer that must not fire on a healthy-but-slow stream. F5 is a standing capability dependency on the pinned @opencode-ai/sdk.
Out-of-scope boundary: F2 Option B (flag-gated full parser retirement), F6 retry/backoff, F4's superset code (clean-room pattern only), and everything in the scope brief's "Deferred (Sam's explicit dispositions)" list (subagent permission demux, tier-2 provider follow-ups, large-file splits, PR-resolver, etc.). These are recorded, not planned.

Team Composition and Participation

Round-by-round detail lives in artifacts/implementation-iteration-history.md. All six specialists converged in one round; remaining unknowns are spec-level, not resolvable by more specialist rounds.

Specialist	Status	Key Input
`project-manager`	Coordinator	Facilitated R1, applied the spec-maturity gate + YAGNI gate, synthesized this plan.
`on-call-engineer`	Active	F1 finalization bugs (OCE-001/002); F6 stall-timeout design + no-retry YAGNI call.
`behavioral-analyst`	Active	F1 message-state corruption (B2/B3); F5 sentinel-row UI position (disputed); F4 dedup rule.
`software-architect`	Active	F1 registry/`ExternalCancelFn` shape (A1); F2 option-A prune (A2); F7 inline tool (A4); F9 removal (A5).
`test-engineer`	Active	F1 `CancelRegistry` + DB test (T1-T3); F2 fallback gate test (T6); F6 `classifyStreamError` + fake-timer test (T8-T10).
`user-experience-designer`	Active	F1 disable-Stop-while-in-flight + Option-A frame extension + muted "Stopped" label; F5 ephemeral-divider UI (disputed); F8 no line-selection infra.
`junior-developer`	Active	OQ reframing across all items; F2c (option a), F4a/b unverified premise, F8 spec-level.

Implementation Approach

ALTITUDE: this section names code artifacts and inlines only decision-bearing values (flag names, the 90s stall timeout, the 50/200 limits, the export list to unexport). Full implementations belong in the files themselves; mechanics live in the cited decision-log entries.

TIER 1 — READY TO BUILD

F1 — external task cancel kills the child + finalizes the message (apps/coder, standalone, do first). The highest-value correctness fix. A taskControllers = new Map<string,AbortController>() registry inside createDispatcher, populated at the four run-functions and deleted in the existing .finally(), plus an exported idempotent cancelExternalTask(taskId): boolean wired into POST /api/tasks/:id/cancel via a narrow ExternalCancelFn (D-1). This batch must also fix the two pre-existing finalization bugs the abort wiring newly makes reachable — the catch blocks that leave messages streaming and the warm success path that writes complete on an aborted turn — via a shared cancelAndFinalize helper, or it ships a new bug (D-1). User Stop finalizes to cancelled, not failed (D-7); the terminal state reaches the web reducer via an optional status field on the existing coder message_complete frame, not a new frame type (D-8), rendered as a muted "Stopped" label (D-10). The session/stop leg "never fires for external from UI" (CoderPane already calls cancelTask), so it is wired best-effort via a WHERE session_id=$ AND state='running' lookup or deferred as a low-value leg.

F2 + F3 together (apps/server, same file). F2 is option-A prune-now-minimal: KEEP extractToolCallBlocks

stripToolMarkup (the only guard for a tool call emitted as plain text), unexport the 8 zero-caller symbols — isPlaceholderArgValue, parseXmlToolCall, parseInvokeToolCall, partialXmlOpenerStart, and the consts XML_TOOL_OPEN/CLOSE, INVOKE_TOOL_OPEN/CLOSE — taking the public surface 11 → 4 with zero runtime effect (D-3). A gate test pins the untested <invoke>-as-text fallback before the prune (D-4). Because F2 keeps extractToolCallBlocks, F3 is safe to land in the same file/batch: thread an optional log?: { debug } param into extractToolCallBlocks from its one call site so rejection logging flows through pino/LOG_LEVEL instead of console.debug (D-2).

F6 — BooChat stall-timeout ONLY (apps/server, standalone). Wrap the stream-phase.ts:261 fullStream loop with a per-chunk stall deadline using AbortSignal.any([signal, stallAc.signal]) and a setTimeout(STALL_TIMEOUT_MS = 90_000) bumped on each chunk, cleared in the existing finally; the post-loop check throws AbortError so the existing finalize path writes cancelled (D-5). No retry — partial-stream re-emit is non-idempotent and the instance is single+local (see Deferred (YAGNI)).

F7 — view_session_history MCP tool (apps/coder MCP, standalone). Tool 7 added inline in mcp-server.ts following the existing 6-tool inline pattern (textResult + direct sql). Reads messages_with_parts WHERE role != 'system', params session_id + optional chat_id + limit (default 50, max 200), ORDER BY created_at ASC (D-6). No interface, no pagination beyond limit.

F9 — retire :9502 SPA (apps/coder, standalone cleanup). Delete the if (existsSync(webRoot)) serve block in index.ts (~269-289), keep the inline 404; remove apps/coder/web from pnpm-workspace.yaml, the coder build step, and the Dockerfile copy; drop the now-unused fastifyStatic import. KEEP every /api/coder/* REST + WS + /api/health + --mcp route (D-11).

TIER 2 — BLOCKED (need their own spec / a capability check)

These three are recorded, not built. They do not block the READY cluster (D-12).

F4 — notify-hook config injection. The module shapes and the double-publish dedup rule are settled (inbound route calls normalizeAgentEvent → bucket; confirms tasks.state='running' before publishing blocked; SUPPRESSES done since the dispatcher already emits idle). But the core premise is unverified: it is unknown whether claude / qwen / goose actually fire their native lifecycle hooks in unattended mode (claude -p/SDK, qwen --acp/stream-json, goose), and goose's hook file/format is unknown (not in repo). Recommended path: route to plan-a-feature; verify hook-firing-in-unattended-mode and the goose hook mechanism first, or the feature is built on sand. Blocking OQs: OQ-F4a (hooks fire unattended?), OQ-F4b (goose hook format?).

F5 — opencode compaction surfacing. Blocked on a capability check: the pinned @opencode-ai/sdk exposes NO compaction event arm (confirmed arms: session.next.{text,reasoning,tool,step}.*, message.part.*, session.idle/error). The review assumed events from opencode's core event.ts that the pinned SDK may not surface. Recommended path: confirm the SDK emits a compaction signal and its exact event name (or an SDK bump is needed) before building. The UI treatment is disputed (behavioral: persistent sentinel row metadata.kind='compaction' that survives refresh; UX: ephemeral inline divider via a new agent_compacted frame, no DB row) — settle once the event is confirmed to exist. Only compaction.ended is in scope. Blocking OQs: OQ-F5a (SDK event existence/name), OQ-F5b (sentinel vs ephemeral UI).

F8 — diff-line → agent re-prompt UX. Spec-level. The "DiffPanel" is inline in CoderPane.tsx:478-619, rendering pending_changes rows as a static <pre> — no line-selection infrastructure exists. The diff source is ambiguous (pending_changes.diff = BooCode write-tools only vs the external-agent worktree git diff). "Send to new agent" needs coordinated workspace-pane + chat creation + pre-population across three surfaces with no existing contract; selection diverges by modality. Recommended path: route to plan-a-feature (the scope brief already hedged this; treat as firm). MVP-if-pushed: "comment to current agent" only, block-level selection, pre-populate ChatInput — still wants a spec. Blocking OQs: OQ-F8a (diff source), OQ-F8b (selection serialization), OQ-F8c (new-viewer routing).

Architecture and Integration Points

F1 lives entirely in apps/coder (dispatcher.ts, routes/tasks.ts, index.ts:254 wiring) with a single-field touch on the web reducer (CoderPane.tsx:299). F2/F3 are confined to apps/server/.../tool-call-parser.ts and its one call site stream-phase.ts. F6 is confined to stream-phase.ts. F7 is one inline tool in apps/coder/.../mcp-server.ts. F9 removes a serve block and a workspace package. No new interface is introduced anywhere in TIER 1 — every item follows the existing inline / pure-helper precedent (D-2, D-6).

Runtime Behavior

F1's cancel path: Stop → POST /api/tasks/:id/cancel → cancelExternalTask(taskId) → ac.abort() → backend honors the signal (child.kill / session/cancel / session.abort / interrupt) → the run-function hits its abort short-circuit or catch → cancelAndFinalize writes the terminal messages.status (cancelled on abort, failed on error) WHERE status='streaming' and publishes the terminal frame (D-1, D-7). F6's stall path: each chunk bumps the 90s timer; on stall, stallAc.abort() flows through AbortSignal.any into the same finalize-as-cancelled path (D-5).

Data Model and Persistence

No schema changes in any TIER 1 item. F1 and F6 only change which terminal value gets written to existing messages.status / tasks.state columns. F7 is a read against the existing messages_with_parts view.

External Interfaces

F1 extends the existing coder message_complete WS frame with an optional status field — no new frame type, so no paired strict-union arm is forced in apps/web/src/api/types.ts beyond the optional field (D-8). F7 exposes one new read-only MCP tool on BooCoder's MCP server (D-6). F4 (blocked) would add a new unauthenticated localhost POST route — see Security Posture. F5 (blocked) would add a new WS frame and thus trigger the full cross-app parity rule.

Decomposition and Sequencing

Each READY item is its own patch tag, one batch per coherent unit (D-9). F2+F3 are a single unit (same file; F2 keeping extractToolCallBlocks unblocks F3). F1 goes first as the highest-value correctness fix.

#	Work Unit	Surface / Deploy	Delivers	Depends On	Verification
1	F1 cancel registry + finalization fixes	apps/coder → `systemctl restart boocoder`	Stop kills child; message finalizes `cancelled`/`failed`; no stuck `streaming` (D-1)	—	`CancelRegistry` unit (4 cases) + DB-integration route test → row `cancelled` (T1-T3)
2	F2+F3 parser prune + logger	apps/server → docker rebuild	Public surface 11→4; rejection logging via pino; guard pinned (D-3, D-2)	F2's keep-decision unblocks F3	Fallback gate test green through prune (T6, D-4)
3	F6 stall-timeout	apps/server → docker rebuild	90s server-side stall detection → finalize `cancelled` (D-5)	—	`classifyStreamError` unit (5 cases) + `vi.useFakeTimers()` stall test + post-loop abort regression pin (T8-T10)
4	F7 session-history MCP tool	apps/coder → `systemctl restart boocoder`	Read-only transcript tool; `/api/health` tool count +1 (D-6)	—	Manual MCP call + sentinel-strip assertion (`role != 'system'`)
5	F9 retire :9502 SPA	apps/coder → `systemctl restart boocoder`	Serve block + build step + workspace pkg removed; routes kept (D-11)	—	`/api/coder/*` + `/api/health` still 200; `GET /` 404-or-redirect

F4/F5/F8 are NOT in this table — they route out per D-12.

RAID Log

Risks

ID	Risk	Likelihood	Severity	Blast Radius	Reversibility	Owner	Mitigation
R1	F1 message-finalization races across the 4 backend paths leave a row in the wrong terminal state	Medium	Medium	One assistant message per affected turn	Reversible (re-run)	on-call-engineer	Shared `cancelAndFinalize` helper, `WHERE status='streaming'` idempotency, abort short-circuit before the unconditional `complete` write (D-1)
R2	F2 prune accidentally removes the `<invoke>`-as-text fallback guard (regression: plain-text tool calls silently dropped)	Low	High	Any qwen3.6 turn that emits tool-call markup as text	Reversible (re-add export/call)	software-architect	Option-A keeps the impl; gate test fails if `extractToolCallBlocks` leaves the text-delta path (D-3, D-4)
R3	F6 stall-timeout fires on a healthy-but-slow stream (false abort)	Low	Low	One in-flight chat turn	Reversible (re-send)	on-call-engineer	90s per-chunk (not per-turn) deadline, bumped on every chunk; tune only on observed false-fire (D-5)
R4	F5 premise fails — the pinned SDK never surfaces a compaction event, so the feature is unbuildable without an SDK bump	High	N/A (blocked)	F5 only	N/A	(capability check)	Confirm event existence/name before any build; F5 stays Blocked until then (D-12)
R5	F4 premise fails — agents do not fire native hooks in unattended mode, so injection yields no signals	High	N/A (blocked)	F4 only	N/A	(plan-a-feature)	Verify hook-firing + goose format in the F4 spec before any build (D-12)

Assumptions

ID	Assumption	What Changes If Wrong	Verifier	Status
A1	llama-swap native `--jinja` parsing stays ON (structured `tool_calls`), so the text-scrape fallback remains dormant defense-in-depth	If off, the kept guard becomes hot — fine; if a future config silently changes parsing, F2 Option B reopens	live probe of `:8401` (not in-repo)	Unverified-in-repo (probe-only); guard kept regardless (D-3)
A2	All four external backends honor `ctx.signal` as cited (`child.kill` / `session/cancel` / `session.abort` / interrupt)	If one does not, that backend's Stop stays a no-op even after wiring	F1 DB-integration test per backend path	Evidenced by file:line (scope-brief F1); test confirms
A3	90s is the right stall deadline for the local llama-swap workload shape	Too low → false aborts on slow models; too high → sluggish recovery	Production observation	Committed value, tunable (D-5)
A4	F1's `session/stop` leg "never fires for external from UI" (CoderPane calls `cancelTask`)	If a code path does hit session-stop for an external task, that leg's finalization is best-effort only	on-call review of CoderPane wiring	Evidenced (synthesis-input F1 resolved OQs)

Dependencies

ID	Dependency	Owner	Status
Dep1	F5 requires `@opencode-ai/sdk` to surface a compaction event (or an SDK bump)	(capability check)	BLOCKING F5 — unresolved (OQ-F5a)
Dep2	F4 requires confirmed unattended hook-firing for claude/qwen/goose + the goose hook format	(plan-a-feature)	BLOCKING F4 — unresolved (OQ-F4a, OQ-F4b)
Dep3	F8 requires a chosen diff source + a line-selection approach	(plan-a-feature)	BLOCKING F8 — unresolved (OQ-F8a/b/c)

Testing Strategy

Sourced from test-engineer (T1-T3, T6, T8-T10) following the established pure-helper-then-wire precedent (turn-guard.ts, lifecycle-decisions.ts, mistake-tracker.ts). Coder tests are globals:false (import describe/it/expect); include glob src/**/__tests__/**/*.test.ts; DB-integration tests are opt-in via DATABASE_URL + describe.runIf.

Observable behaviors to test:
- F1: a cancelled external task lands messages.status='cancelled' (not streaming, not complete); the registry register/cancel/delete/has behaves idempotently (D-1, D-7).
- F2: a text-delta carrying a complete <invoke> block lands in result.toolCalls with the markup absent from result.content — green through the prune, red if the guard is removed (D-4).
- F6: a fake hanging stream is aborted after the 90s deadline under vi.useFakeTimers(); the existing post-loop signal?.aborted check still passes (regression pin) (D-5).
- F7: results exclude role='system' rows (sentinel strip) and respect the limit cap.
Test doubles posture: extract pure helpers (CancelRegistry, classifyStreamError) with no DB/child/I/O for unit coverage; stub streamText for the F2 gate and F6 stall tests; one DB-integration test for the F1 route. Warm-worktree-preserved is held as a code comment, not a spy (test-engineer T1-T3).
Edge cases requiring coverage: double-Stop / cancel-after-exit (idempotent ac.abort()); abort vs thrown error in the catch (cancelled vs failed mapping); F6 stall vs healthy-slow stream.
Test levels: unit (registry, classifier, gate) → integration (F1 route → DB row) → manual (F7 MCP call, F9 route-still-200 smoke).

Security Posture

Thin. The single TIER 1 surface is BooCoder's new read-only view_session_history MCP tool (F7), which reads only messages_with_parts WHERE role != 'system' — no write verbs, no PII beyond what the operator already authored, single-user, Authelia at the reverse proxy, no app-layer auth by design.

The one genuinely new auth-relevant surface — F4's inbound unauthenticated localhost POST route for hook callbacks — belongs to a BLOCKED item and is not built in this plan. When F4 is planned, it must be recorded explicitly as a new unauthenticated localhost route (acceptable under the single-user / Authelia posture, but called out, not silent).

Operational Readiness

Deploy by surface: apps/coder items (F1, F7, F9) → sudo systemctl restart boocoder; apps/server items (F2+F3, F6) → docker compose up --build -d boocode. Stage commits explicitly by path; never git add -A (Sam has uncommitted web WIP) (D-9).
Build-step change (F9): removing apps/coder/web deletes a coder build step and a workspace package; the coder Dockerfile copy and pnpm-workspace.yaml entry must be removed in the same batch or the build references a missing package (D-11).
Health check after F7: /api/health reports a tool count; expect it to increment by one.
Tagging: sequential patch tags, one per unit; no v2.8.0 (D-9).
Rollback: each unit is an independent patch tag, so rollback is per-item (revert the tag, redeploy by surface). No schema migrations means no expand/contract sequencing to unwind.

On-Call Resilience Posture

Application-source resilience, sourced from on-call-engineer. Infrastructure concerns live in Operational Readiness above.

Timeouts and deadlines: F6 adds a 90s per-chunk stall deadline on the BooChat fullStream loop, the first server-side guard against a hung llama-swap stream (today only the frontend 60s discard_stale watchdog exists) (D-5).
Retry strategy: none, deliberately. A retry after a partial stream re-emits already-streamed deltas (non-idempotent); the user re-sending is the correct recovery at single-local-instance scale (see Deferred (YAGNI)) (D-5).
Idempotency: F1's cancelExternalTask is idempotent (ac.abort() no-ops when already aborted → double-Stop and cancel-after-exit are safe); message finalization is idempotent via WHERE status='streaming' (D-1).
Kill switches: F1 is itself the kill switch for a runaway external agent — making Stop actually stop is the operator's manual control.
Graceful degradation: F1's session/stop leg is best-effort (does not fire for external tasks from the UI); F6 falls back to finalizing as cancelled on stall rather than hanging indefinitely.
Observability of failure paths: F1 distinguishes cancelled (user/stall) from failed (thrown error) so the human-inbox surface stays honest (D-7); F3 routes parser-rejection logging through pino/LOG_LEVEL (D-2).
Data integrity: no monetary/rate columns touched; F1/F6 only write existing terminal-state enum values into existing columns.
Migration safety: no schema changes in any TIER 1 item — nothing to expand/contract.

Definition of Done

Hitting Stop on an external task kills the child and the assistant message finalizes cancelled (never left streaming, never falsely complete) (D-1, D-7).
Stop button is disabled while the cancel POST is in flight; a muted "Stopped" label renders on cancel (D-8, D-10).
Parser public surface is 4 exports; the <invoke>-as-text gate test is green and fails if the guard is removed (D-3, D-4).
Parser rejection logging appears via pino and respects LOG_LEVEL (D-2).
A fake hanging BooChat stream is finalized cancelled after 90s under fake timers; the post-loop abort regression pin passes (D-5).
view_session_history returns a session transcript excluding role='system' rows, capped at limit (default 50, max 200); /api/health tool count +1 (D-6).
:9502 serve block + build step + workspace package removed; all /api/coder/* + /api/health routes still 200 (D-11).
Each shipped item has its own sequential patch tag + CHANGELOG entry; deployed by surface (D-9).

Specialist Handoffs for Implementation

test-engineer — dispatch alongside each READY unit; needs the touch-point file:lines (already in discovery notes) to author T1-T3 (F1), T6 (F2), T8-T10 (F6) before/with the wiring.
software-architect — dispatch if F1's cancelAndFinalize helper shape needs settling across the four backend paths during implementation; needs the four catch-block + success-path line ranges.
plan-a-feature (F4) — dispatch when Sam wants F4; input: the settled dedup rule + module shapes from synthesis-input, and the two blocking OQs (unattended hook-firing, goose format) to resolve first.
plan-a-feature (F8) — dispatch when Sam wants F8; input: CoderPane.tsx:478-619 DiffPanel location, the diff-source ambiguity, and the cross-surface routing gap.
Capability check (F5) — dispatch before any F5 build; input: the pinned @opencode-ai/sdk arm list and OQ-F5a (does it emit a compaction event, and what is its exact name?).

Deferred (YAGNI)

Items considered during planning and deferred under the YAGNI rule. Each carries a concrete reopen trigger.

F6 retry / backoff classifier

Why deferred: Evidence test — a retry after a partial stream re-emits already-streamed deltas (state.accumulated + live delta frames are non-idempotent), which is strictly worse than the current behavior; no second instance exists to fail over to. Single-local-instance scale makes operator re-send the correct recovery.
Reopen when: llama-swap gains restart-in-place-with-clear-partial semantics, OR a second llama-swap instance is added for failover.
Source: R1, on-call-engineer.

F2 Option B — flag-gated full retirement of `extractToolCallBlocks` / `stripToolMarkup`

Why deferred: Evidence test — no evidence qwen3.6 stopped emitting <invoke>-as-text on live, and the sidecar --jinja state is unconfirmed in-repo; deleting the only plain-text-tool-call guard on that basis is unsafe (named anti-pattern: removing a load-bearing guard without a measured signal).
Reopen when: a documented multi-session live probe shows zero text-delta tool calls from qwen3.6.
Source: R1, software-architect / test-engineer.

F4 `NotifyHookInjection` interface

Why deferred: Simpler-version test — three agents with an identical read-merge-write settings.json flow do not need an interface; one concrete function switching on agent name satisfies the same need (named anti-pattern: single-implementation interface before three divergent uses exist).
Reopen when: a fourth agent with a genuinely different injection contract appears (and only after F4's premise is verified).
Source: R1, software-architect.

F5 handling of `compaction.started` / `compaction.delta` + `step.failed` + `tool.progress`

Why deferred: Evidence test — only compaction.ended is user-actionable (it is what closes the silent context gap); the other arms add UI noise with no user-described need. Compounded by F5 being capability-blocked anyway.
Reopen when: the SDK compaction event is confirmed AND a user-described need for in-progress compaction feedback emerges.
Source: R1, behavioral-analyst.

F7 `SessionHistoryReader` interface / pagination beyond `limit`

Why deferred: Simpler-version test — one read tool with an inline query + a limit cap satisfies the read need; an interface and cursor pagination are unused machinery (named anti-pattern: abstraction before a second concrete use).
Reopen when: a second history-read consumer needs a different read path, or transcripts routinely exceed the 200-row cap in a way that breaks the use case.
Source: R1, software-architect.

Open Items

OI-1 (F4 premise): Do claude / qwen / goose fire native lifecycle hooks in unattended mode, and what is goose's hook file/format?
- Resolves when: verified in a plan-a-feature spec for F4 (OQ-F4a, OQ-F4b).
- Blocks implementation: Yes for F4 — No for the READY cluster (F4 is independent).
OI-2 (F5 capability): Does the pinned @opencode-ai/sdk surface a compaction event, and what is its exact name?
- Resolves when: a capability check confirms the event (or an SDK bump is scoped) — OQ-F5a.
- Blocks implementation: Yes for F5 — No for the READY cluster.
OI-3 (F5 UI treatment): persistent sentinel row vs ephemeral inline divider for "context compacted."
- Resolves when: OI-2 resolves and the team picks the UI shape (OQ-F5b); currently disputed (behavioral vs UX).
- Blocks implementation: Yes for F5 — No for the READY cluster.
OI-4 (F8 spec): diff source (pending_changes.diff vs worktree git diff), selection serialization, and new-viewer routing.
- Resolves when: a plan-a-feature spec for F8 settles OQ-F8a/b/c.
- Blocks implementation: Yes for F8 — No for the READY cluster.
OI-5 (F1 session-stop leg): whether to wire the best-effort session/stop lookup or defer it.
- Resolves when: decided at F1 implementation; it does not fire for external tasks from the UI today, so it is low-value.
- Blocks implementation: No — F1 ships with or without this leg.

Summary

Outcome delivered: Stop actually stops external agents and finalizes their messages correctly; the parser is pruned to its load-bearing surface with the guard test-pinned; BooChat gains a 90s server-side stall guard; a session-history MCP tool and the :9502 retirement land; F4/F5/F8 have a documented route to their own spec/capability work.
Team size: 7 specialists (incl. project-manager) — see artifacts/implementation-iteration-history.md
Rounds of facilitation: 1 — see artifacts/implementation-iteration-history.md
Decisions committed: 12 — see artifacts/implementation-decision-log.md
Decisions settled by evidence: 11 — see artifacts/implementation-decision-log.md
Decisions settled by junior-developer reframing: 0 (junior confirmed several, but each rests on specialist evidence) — see artifacts/implementation-decision-log.md
Decisions settled by user input: 1 (D-12, the standing-override disposition) — see artifacts/implementation-decision-log.md
Rejected alternatives recorded: 13 — see artifacts/implementation-decision-log.md
Open items remaining: 5 (4 blocking only their own BLOCKED item; OI-5 non-blocking)
Recommendation: Ship the READY cluster as planned — build F1 first (apps/coder, highest-value correctness fix, must include the OCE-001/002 finalization fixes), then F2+F3 (apps/server, one batch), F6 (apps/server), F7 (apps/coder MCP), F9 (apps/coder cleanup), each as its own sequential patch tag. Route F4 and F8 to plan-a-feature and F5 to an @opencode-ai/sdk capability check before any build.

36 KiB Raw Blame History Unescape Escape