Merge contracts-ssot-pkg: v2.7.13 single-source cross-app wire contracts in @boocode/contracts

feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace package, @boocode/contracts, consumed by server/web/coder/coder-web via workspace:* + a per-subpath exports map. The ws-frames and provider-config Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason, AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are each single-sourced. Deletes the byte-identical copies and their parity tests, fixes a live AgentSessionConfig drift (coder dead copy removed, unified to the web required/nullable shape), removes the dead pending_change WS arms in the fallback SPA, and inverts the build order (contracts builds first) across root build, Dockerfile, and the coder deploy docs. Reverses the shared-package decision declined in v2.5.12. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:14 +00:00 · 2026-06-02 21:24:08 +00:00 · 2026-06-02 21:20:33 +00:00 · 2026-06-02 18:03:10 +00:00 · 2026-06-02 17:27:59 +00:00 · 2026-06-02 17:26:27 +00:00
84 changed files with 3089 additions and 1182 deletions
--- a/BOOCHAT.md
+++ b/BOOCHAT.md
@@ -28,6 +28,11 @@
 - Prefer codecontext (`search_symbols`, `get_symbol_info`, `get_dependencies`) over `grep` for symbol-level questions. Fall back to `grep` / `view_file` when codecontext returns degraded or empty results — that signals an unsupported language or parse failure.
 - Verify before reporting work complete: run the relevant test/build/smoke command and confirm output matches the claim. Evidence first, assertion second.
 ## Recovery and context (v2.7)
 - **Heed the recovery nudge.** Native inference tracks consecutive tool **failures** (`mistake-tracker.ts`): after 3 in a row with no successful step between, a `mistake_recovery` sentinel is injected telling you to re-read tool schemas, verify a path exists before acting, and try a *different* approach — not retry variations of the same failing call. Ignoring it (a second failure run with the nudge still outstanding) **escalates and stops the turn** to protect the step budget. This complements the doom-loop guard, which only catches *identical* repeats.
 - **Files-read provenance survives compaction.** Paths you read via `view_file` / `grep` / `find_files` / `list_dir` are accumulated and merged into a cumulative `## Files Read` ledger in the rolling summary, so a file read long ago stays in context across compactions. You don't manage this — but it means you usually don't need to re-read a file just because the raw turn scrolled out of the window.
 ## Output format
 - Stay in Markdown by default for every reply, short or long.
--- a/BOOCODER.md
+++ b/BOOCODER.md
@@ -23,6 +23,8 @@ You are BooCoder, a write-capable coding agent. You can read AND modify files wi
 Every file modification queues in `pending_changes` before touching disk. The user sees a diff preview and approves/rejects each change. Never bypass this queue — it is the safety boundary between inference and the filesystem.
 `edit_file`'s `old_string` match is **fuzzy** (`fuzzy-match.ts`, v2.7.1): an exact → per-line-whitespace → unicode-canonicalization (curly quotes/dashes/nbsp) → Levenshtein-≥0.66 ladder, so minor whitespace/indentation/unicode drift in `old_string` still lands on the right span. Two consequences: a near-miss `old_string` may still apply (verify the queued diff is what you intended), and an `old_string` matching **more than one** place is rejected as **ambiguous** rather than editing the first — add surrounding context to disambiguate. A genuine non-match returns a clear failure, not a thrown error.
 ## Behavior
 - Show diffs clearly. Explain what you're changing and why.
@@ -102,7 +104,7 @@ Either way, **adding to config does NOT install the binary.** Until the CLI is o
 ### Deploy + smoke
 Two deploy targets:
- **Routes (host service):** `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
+- **Routes (host service):** `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
 - **Web UI (container):** `docker compose up --build -d boocode`
 Green gate (verified across phases 1–5): `pnpm -C apps/coder test` (134 passing) `&& pnpm -C apps/coder build`.
@@ -115,3 +117,35 @@ curl http://100.114.205.53:9500/api/coder/providers/config   # raw config, throu
 # Settings → Providers: disable goose → it leaves the composer picker, stays in the tab
 # POST refresh → models repopulate; Add a catalog entry → it appears after refresh (unavailable until its CLI is installed)
 ```
 ## Persistent agent sessions (v2.6)
 When you `dispatch_external_agent` to a chat-tab provider, BooCoder keeps that agent **warm and resumable** instead of spawning a fresh process per turn. This is mostly transparent — but the model below explains why turn 2 is fast, why an external agent remembers earlier turns, and how edits flow.
 ### Backends and keying
 - One live backend per **`(chat_id, agent)`** pair, owned by the `agent-pool` (`agent-pool.ts`). State lives in `agent_sessions` (the resumable session id) and `worktrees` (the per-chat working copy).
 - **opencode** runs a long-lived `opencode serve` (`backends/opencode-server.ts`) with per-session SSE; turns after the first reuse the same session (memory intact, ~9× faster).
 - **goose / qwen** run a warm ACP connection (`backends/warm-acp.ts`) — `initialize` + `session/new` once per `(chat,agent)`, then `session/prompt` per turn. Interrupt cancels the prompt (`session/cancel`), never the child.
 - **claude** runs the Claude Agent SDK backend (`backends/claude-sdk.ts`) over a clean-room Postgres session store.
 - Arena, MCP `new_task`, and one-shot dispatches still use the cold `runExternalAgent` path — warm reuse needs both a `session_id` and a `chat_id`.
 ### Worktrees
 - External agents write **directly into a persistent per-chat worktree** (`/tmp/booworktrees/sess-<id>`), not into the project root via `pending_changes`. The worktree is created once, base commit captured, and **reused across turns and across agents in the same chat** — so opencode and goose in one chat share one worktree.
 - Each turn's worktree diff supersedes the prior `pending_changes` row for that `(chat,agent)` (latest-wins) and is badged with the authoring agent in the DiffPanel.
 - **Staging boundary:** a provider only sees another agent's edits once they are **applied**. Unapplied worktree edits from a different agent are invisible to you — the DiffPanel shows a muted hint when that's the case.
 ### Lifecycle (v2.6.10–v2.6.11)
 - **Idle eviction:** a backend idle past `AGENT_POOL_IDLE_TTL_MS` (default 30 min) is disposed; an LRU cap of `AGENT_POOL_MAX_LIVE` (default 10) bounds live backends. A busy backend is never evicted, and the next turn transparently re-attaches or re-creates from `agent_sessions`/`worktrees`.
 - **Crash recovery:** a health monitor restarts a crashed server (opencode → fresh sessions; ACP → re-`session/new`) and reclaims its port.
 - **Close cleanup:** closing/deleting a chat or session evicts its backends, archives the `worktrees` row, and removes the worktree. An hourly reaper sweeps orphaned worktrees (dirty/unpushed preflight before removal).
 ### Checkpoints (v2.7.1)
 Because external agents write the worktree directly (outside `pending_changes`), a worktree **checkpoint** is shadow-committed before each external-agent turn (tracked + untracked, into `refs/boocode/checkpoints/<id>`), anchored to that turn's assistant message. The per-message **"Restore to here"** affordance resets the worktree (`reset --hard` + `clean -fd`), trims the transcript past that message, and resets the `(chat,agent)` backend session — so files, transcript, and agent context land consistent at the restore point. `rewind` still only reverses BooCoder's own applied `pending_changes`; checkpoints are what cover external-agent worktree edits.
 ### Normalized status (v2.6 / v2.7.6)
 Turn boundaries publish a normalized per-`(chat,agent)` status — `working | blocked | idle | error` — to the UI (`agent_status_updated` frame), so blocked-on-permission and crash/idle are visible, not just WS liveness.
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,18 @@
 All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
 ## v2.7.13-contracts-ssot — 2026-06-02
 Creates `@boocode/contracts` (`packages/contracts`), a new workspace package that becomes the single source of truth for every cross-app wire contract — reversing the decision recorded in `v2.5.12-provider-lifecycle-phase4` that declined a shared types package as not worth the Docker/build-order risk at solo scale; a live `AgentSessionConfig` drift that had since appeared between `apps/coder` and `apps/web` justified the investment. Six contracts are now defined exactly once: the `WsFrameSchema` Zod runtime schema, the provider snapshot types (`ProviderSnapshotEntry` and family), the Zod provider-config schemas, `MessageMetadata` + `ErrorReason`, `AgentSessionConfig`, and `WorktreeRiskReport`; both Zod-backed contracts use `z.infer` so validator and type derive from the same definition and cannot drift independently. All four consumers — `apps/server`, `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web` — import via `workspace:*` through a per-subpath exports map consuming built dist only (no tsconfig project references); the hand-synced copies and their parity tests (`provider-types-parity.test.ts`; the ws-frames byte-parity assertion) are deleted while the KNOWN_FRAME_TYPES drift test and broker fail-closed tests are preserved. Build order is inverted in the root build script, Dockerfile, and coder deploy docs; `apps/coder/web`'s migration also removed dead `pending_change_*` reducer arms (no frame publisher exists for these — pending changes are HTTP-delivered), closing a latent missing-default-arm crash, and reconciled field-type conflicts with the canonical `WsFrame`; zod is pinned to a single version across the workspace. Server 543 / coder 293 / contracts 11 tests passing; human smoke verified on the live stack 2026-06-02.
 ## v2.7.11-coder-model-snapshot — 2026-06-02
 Hotfix for the coder model-attribution chip vanishing on refresh. The chip showed during a live turn (the `message_complete` frame carries `model`) but disappeared when a BooCoder session was reloaded — only in the coder, not BooChat. Root cause: `CoderPane`'s `useCoderMessages` hydrates from two sources on load — the HTTP `listMessages` fetch (whose SELECT includes `model`, added `v2.7.8`) AND the WS `snapshot` frame — and the WS snapshot's query in `apps/coder/src/routes/ws.ts` had its own column list that omitted `model`. The client's `snapshot` handler `setMessages`-overwrites the HTTP load, so the model-less rows won, and with no later `message_complete` for historical messages the chip stayed gone. Fix is one column: add `model` to the WS snapshot SELECT so both hydration paths agree. The `apps/coder/CLAUDE.md` "update every mapper" note now lists the WS snapshot SELECT explicitly (it was the one place not enumerated). apps/server + apps/coder builds green; deployed via `systemctl restart boocoder` (host service — the earlier `v2.7.10` docker deploy rebuilt only the container, never this route). Fixes the chip shipped in `v2.7.8-ember-coder-tabs-model-chips` / completed in `v2.7.9-mcp-keys-docs-coder-fixes`.
 ## v2.7.10-composer-chips — 2026-06-02
 A composer control-row refresh shared by BooChat and BooCoder via `ChatInput`. The slash-commands menu moves out of the full-width `AgentCommandsHint` disclosure (now removed) into a compact chip in the message box's bottom controls row — clicking it opens the existing `SlashCommandPicker` anchored to the chip and selecting inserts `/<name> `, while the typed-`/` autocomplete is unchanged. A new attach-file button sits beside it, opening a native multi-file picker that funnels picks through the same drag-drop pipeline (5 MB / binary gate, 10-attachment cap, chips + preview, `source:'drop'`). On mobile both collapse to icon-only — the slash count is `max-md:hidden` and the paperclip is icon-only — so the row stays on one line per the no-scroll toolbar rule. Web tsc + build green; deployed (docker). Builds on the BooCode 2.0 composer work in `v2.7.8-ember-coder-tabs-model-chips`.
 ## v2.7.9-mcp-keys-docs-coder-fixes — 2026-06-02
 The MCP-key hygiene feature plus accumulated in-flight coder fixes and a docs refactor. **MCP `{env:VAR}` substitution** (`mcp-config.ts:substituteEnvVars`, opencode-compatible) recursively resolves `{env:NAME}` references in any string value of `data/mcp.json` from `process.env` *before* Zod validation, so real keys live in `.env` (`env_file`) instead of the gitignored config — an unset var resolves to `''` with a boot-log warning, and on a validation failure the loader names the unset vars alongside the field errors (an empty `{env:VAR}` in a strict url/command field invalidates the whole config, an otherwise-disconnected warning). `data/mcp.json` is now untracked (`.gitignore` flips `!data/mcp.json` → `!data/mcp.example.json`); the tracked template `data/mcp.example.json` carries `"CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"` and `.env.example` documents the key (9 mcp-config tests). **Two coder bug fixes** ride along: the `message_complete` frame's `model` is widened `string` → `string | null` in both ws-frames copies (server + web parity) and the dispatcher now publishes `model: task.model` at all four external assistant-completion points — without the nullable widen a null model would fail-closed in `publishFrame` and drop the entire frame including the `status:'complete'` transition (regression test added); and Claude-SDK `mapUserToolResults` now maps `user`-message `tool_result` blocks → terminal `tool_update` events (completed/failed with output) so external-agent tool snapshots resolve instead of spinning forever (the SDK feeds tool output back as a user message, previously unmapped). On the view side the `AgentComposerBar` drops the §9b resumed/history/new-session chip and token-usage readout and loses `flex-wrap` so the control row stays on one line, while `CoderPane` gains a per-chat `localStorage` agent-config cache (provider/model/mode/thinking keyed by chat id, restoring the last model on reopen) and threads the new `model` field into the timeline + attribution chip. **Docs refactor**: the root `CLAUDE.md` is slimmed (~190 lines) with per-app deep references split into `apps/{coder,server,web}/CLAUDE.md` (auto-loaded in-subtree), plus a new 372-line `docs/coder-backends.md` dispatch reference, a `docs/project-discovery.md` stack inventory, and a `docs/coding-standards/` set (the `cross-app-contract-parity` standard, fronted by `.claude/rules` path-scoped indexes) — `ARCHITECTURE.md` links the backends doc. Server 555 + coder 299 tests passing (incl. new mcp-config, ws-frames, and claude-sdk-map suites), web tsc + server + coder builds green. Builds on `v2.7.8-ember-coder-tabs-model-chips`.
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -73,7 +73,7 @@ Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ...
 Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only add-existing scope), `BOOTSTRAP_ROOT` (/opt/projects, writable bootstrap mkdir target — host must `mkdir -p` it before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale; the public host is behind Authelia, unusable from server context), `BOOCODE_TOOLS` (`core`|`standard`|`all`, default `all`; a ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (default `/data/mcp.json`, opencode `mcpServers` shape; missing = no MCP), `CONTEXT7_API_KEY` (the Context7 MCP key, referenced from `data/mcp.json` as `"{env:CONTEXT7_API_KEY}"`). `data/mcp.json` is **gitignored** but no longer holds secrets — string values support opencode-style `{env:VAR}` substitution (`mcp-config.ts:substituteEnvVars`, applied before Zod validation; unset var → `''` + warn), so real keys live in `.env`; template `data/mcp.example.json`. A config-only edit there needs only `docker compose restart boocode` (data/ is bind-mounted); changing a referenced secret edits `.env`. MCP loads at server startup with per-server graceful degradation; the coder does NOT load MCP (BooChat only).
-BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
+BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
 - `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL. Set to a small llama-swap model (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls.
 - Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — `-p` runs autonomously without prompts. ACP bridge is an HTTP daemon (not stdio); use PTY dispatch.
@@ -113,9 +113,9 @@ Cross-cutting only. Per-app conventions live in the matching `apps/*/CLAUDE.md`.
 - No app-layer auth. Authelia handles auth at the reverse proxy. All `broker.publishUser`/`subscribeUser` calls use `'default'` as the user key.
 - TypeScript strict mode. Both apps share `tsconfig.base.json`. Server + coder use NodeNext module resolution (`.js` extensions in imports).
 - Discriminated unions for type narrowing: `Pane` (by `kind`), `SessionEvent` (by `type`), `InferenceFrame` (by `type`).
- **Adding a new WS frame type** (cross-app) requires updating BOTH the server's `InferenceFrame` (loose `type:` union + optional fields in `services/inference/turn.ts`) AND the web `WsFrame` (strict discriminated union in `apps/web/src/api/types.ts`). Server publish is permissive; the frontend type is the wire-format gate — missing the web side silently drops the frame at JSON-parse.
+- **Adding a new WS frame type** (cross-app): add it to `WsFrameSchema` in `packages/contracts/src/ws-frames.ts` (single source of truth; rebuild with `pnpm -C packages/contracts build`). The server's `InferenceFrame` loose union (`services/inference/turn.ts`) and the web's strict `WsFrame` discriminated union (`apps/web/src/api/types.ts`) still exist separately and also need updating. Server publish is permissive; the frontend type is the wire-format gate — missing the web side silently drops the frame at JSON-parse.
- **Sentinels** (cross-app) are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. A new kind requires arms in `MessageMetadata` in BOTH `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts`, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
+- **Sentinels** (cross-app) are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. `MessageMetadata` is single-sourced in `@boocode/contracts` (`packages/contracts/src/message-metadata.ts`). A new kind requires updating that file and rebuilding the package, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- **Coder↔web provider-type parity** (`apps/coder/src/services/provider-types.ts` ↔ `apps/web/src/api/types.ts`): enforced by runtime `provider-types-parity.test.ts` (compile-time cross-import is blocked by TS6307 on web's composite tsconfig). Mirror of the ws-frames parity pattern — edit both copies together.
+- **Provider snapshot types** (`ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`, `ThinkingOption`, `AgentCommand`, `ProviderSnapshotStatus`) are single-sourced in `@boocode/contracts` (`packages/contracts/src/provider-snapshot.ts`); `apps/coder/src/services/provider-types.ts` re-exports them. Edit the package source; there is no hand-synced web copy to update.
 - **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of an object/array). Pattern in `parts.ts`, `settings.ts`.
 - Skills live in `data/skills/<vendor>/`; Sam's own namespace is `boocode/` (`committing-changes`, `using-worktrees`, `improving-boocode-guidance`, `systematic-debugging`) — `SKILL.md` + optional `eval.yaml` (gerund names; eval = `skill:` + `tasks:` of `prompt`+`grader`, incl. a negative-trigger task). `data/skills/` is canonical; a divergent mirror at `/opt/skills/` exists.
--- a/4
+++ b/4
@@ -5,11 +5,15 @@ RUN corepack enable
 WORKDIR /build
 COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
 COPY packages/contracts/package.json ./packages/contracts/
 COPY apps/server/package.json ./apps/server/
 COPY apps/web/package.json ./apps/web/
 RUN pnpm install --frozen-lockfile
 # @boocode/contracts must be present before `pnpm build`, which builds it FIRST
 # (root build script) so apps/web can resolve its compiled dist via the exports map.
 COPY packages/contracts ./packages/contracts
 COPY apps/server ./apps/server
 COPY apps/web ./apps/web
--- a/README.md
+++ b/README.md
@@ -58,7 +58,7 @@ upstream and inject `Remote-User`. Postgres binds loopback only.
 BooCoder runs as a **host systemd service** (`boocoder.service`, port `:9502`), not in Docker:
 ```bash
-pnpm -C apps/server build && pnpm -C apps/coder build
+pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build
 sudo systemctl restart boocoder
 curl http://100.114.205.53:9502/api/health
 ```
--- a/apps/coder/CLAUDE.md
+++ b/apps/coder/CLAUDE.md
@@ -13,7 +13,7 @@
 ## Build, deploy, dispatch
 - **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. **apps/server must build FIRST.**
- Build + deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
+- Build + deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
 - After `pnpm -C apps/coder build` the host service keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler shape). Restart, don't re-debug.
 - `:9502/api/health` is down ~15–20s after a boocoder restart while the startup agent-probe scan runs — retry; an early connection-refused is not a failed deploy.
 - Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
@@ -31,4 +31,4 @@
 - **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; breaks cross-restart resume). Keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` resolves-or-creates a chat when null. The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
 - **Claude SDK backend tool RESULTS arrive as `type:'user'` SDK messages** (tool_result content blocks): `mapSdkMessage` (`claude-sdk-map.ts`) MUST map the `user` case → a terminal `tool_update` (completed/failed + output), else the tool_call persists `status:'running'` and the UI spinner never stops. The dispatcher's `tool_update` path then publishes + persists it.
 - **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in.
- **A new per-message coder field silently drops unless you update every mapper**: server read SELECT + `mapCoderMessageRow` (`apps/coder/src/routes/messages.ts`), `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` WS reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields — easiest to forget (this is how the `model` chip silently vanished in the coder).
+- **A new per-message coder field silently drops unless you update every mapper**: the HTTP read SELECT + `mapCoderMessageRow` (`apps/coder/src/routes/messages.ts`), **the WS `snapshot` SELECT (`apps/coder/src/routes/ws.ts`)** — it has its OWN column list and the client's `snapshot` handler `setMessages`-overwrites the HTTP load, so a field present in the HTTP route but absent here shows live yet vanishes on refresh — `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` WS reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields — easiest to forget. This bit `model` twice: the client chain (`v2.7.9`) and then the WS snapshot SELECT (`v2.7.11`) — the chip showed live but vanished on coder refresh until both were fixed.
--- a/apps/coder/package.json
+++ b/apps/coder/package.json
@@ -13,6 +13,7 @@
    "test": "vitest run"
  },
  "dependencies": {
    "@boocode/contracts": "workspace:*",
    "@agentclientprotocol/sdk": "^0.22.1",
    "@anthropic-ai/claude-agent-sdk": "^0.3.159",
    "@boocode/server": "workspace:*",
--- a/apps/coder/src/index.ts
+++ b/apps/coder/src/index.ts
@@ -16,7 +16,7 @@ import { createInferenceRunner } from '@boocode/server/inference';
 import { createBroker } from '@boocode/server/broker';
 import { appendMcpTools, ALL_TOOLS } from '@boocode/server/tools';
 import type { Config as ServerConfig } from '@boocode/server/config';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 // v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
 import { WRITE_TOOLS } from './services/tools/index.js';
 import { adaptWriteTool } from './services/tools/adapter.js';
--- a/apps/coder/src/routes/messages.ts
+++ b/apps/coder/src/routes/messages.ts
@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { resolveChatId } from './chat-resolve.js';
 const AnswerUserInputBody = z.object({
--- a/apps/coder/src/routes/skills.ts
+++ b/apps/coder/src/routes/skills.ts
@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { getSkillBody } from '@boocode/server/skills';
 import {
  buildSkillInvokeSyntheticFrames,
--- a/apps/coder/src/routes/ws.ts
+++ b/apps/coder/src/routes/ws.ts
@@ -25,7 +25,7 @@ export function registerWebSocket(
      // Send snapshot of existing messages so client can hydrate
      const messages = await sql<Record<string, unknown>[]>`
-        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, last_seq,
+        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, model, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
        FROM messages_with_parts
--- a/apps/coder/src/services/tests/provider-types-parity.test.ts
+++ b/apps/coder/src/services/tests/provider-types-parity.test.ts
@@ -1,64 +0,0 @@
 import { describe, it, expect } from 'vitest';
 import { readFileSync } from 'node:fs';
 import { resolve, dirname } from 'node:path';
 import { fileURLToPath } from 'node:url';
 /**
 * Parity guard between the two copies of the provider snapshot types:
 *   apps/coder/src/services/provider-types.ts  (backend source of truth)
 *   apps/web/src/api/types.ts                  (web wire copy)
 *
 * APPROACH: text-identity of each shared type block (mirrors the repo's existing
 * ws-frames.test.ts byte-parity convention). A compile-time bidirectional-
 * assignability check was attempted first (a web-side file importing coder's
 * import-free provider-types.ts), but apps/web/tsconfig.app.json is a composite
 * project and rejects out-of-include files with TS6307 — so cross-project type
 * import is structurally blocked. This runtime guard FAILS on any field
 * add/remove/rename/loosen in either copy, including the nested model/mode/
 * command types that ProviderSnapshotEntry references. Single-source-of-truth
 * (shared workspace package) is deferred as a Tier-2 follow-up.
 */
 const here = dirname(fileURLToPath(import.meta.url));
 const coderSrc = readFileSync(resolve(here, '../provider-types.ts'), 'utf8');
 const webSrc = readFileSync(resolve(here, '../../../../web/src/api/types.ts'), 'utf8');
 function extractBlock(src: string, name: string): string {
  const iface = src.match(new RegExp(`export interface ${name} \\{[\\s\\S]*?\\n\\}`));
  const alias = src.match(new RegExp(`export type ${name} =[^;]*;`));
  const block = iface?.[0] ?? alias?.[0];
  if (!block) throw new Error(`type block '${name}' not found`);
  // Normalize to type structure: drop blank + comment lines (//, /* */, *),
  // trim each line. Field add/remove/rename/loosen still changes a field line.
  return block
    .split('\n')
    .map((l) => l.trim())
    .filter(
      (l) =>
        l.length > 0 &&
        !l.startsWith('//') &&
        !l.startsWith('/*') &&
        !l.startsWith('*'),
    )
    .join('\n');
 }
 describe('provider snapshot type parity (coder ↔ web)', () => {
  // Includes the nested types ProviderSnapshotEntry references, so structural
  // drift anywhere in the snapshot surface is caught.
  const names = [
    'ProviderSnapshotStatus',
    'ProviderSnapshotEntry',
    'ProviderModel',
    'ProviderMode',
    'ThinkingOption',
    'AgentCommand',
  ];
  for (const name of names) {
    it(`${name} is identical in both copies`, () => {
      expect(
        extractBlock(webSrc, name),
        `${name} drifted between apps/coder/src/services/provider-types.ts and apps/web/src/api/types.ts`,
      ).toBe(extractBlock(coderSrc, name));
    });
  }
 });
--- a/apps/coder/src/services/acp-dispatch.ts
+++ b/apps/coder/src/services/acp-dispatch.ts
@@ -23,7 +23,7 @@ import {
  type ClientSideConnection as ConnectionType,
 } from '@agentclientprotocol/sdk';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { spawn } from 'node:child_process';
 import { findThoughtLevelConfigId } from './acp-derive.js';
 import { resolveLaunchSpec } from './acp-spawn.js';
--- a/apps/coder/src/services/agent-status-publish.ts
+++ b/apps/coder/src/services/agent-status-publish.ts
@@ -8,7 +8,7 @@
 * (`AgentStatusUpdatedFrame`) and mirrored byte-identical in apps/web.
 */
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import type { AgentStatus } from './normalize-agent-status.js';
 // The exact slice of Broker we need — accepting just the bound method keeps call
--- a/apps/coder/src/services/dispatcher.ts
+++ b/apps/coder/src/services/dispatcher.ts
@@ -1,7 +1,7 @@
 import type { Sql } from '../db.js';
 import type { FastifyBaseLogger } from 'fastify';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import type { Config } from '../config.js';
 import { createWorktree, diffWorktree, cleanupWorktree, ensureSessionWorktree } from './worktrees.js';
 import { createCheckpoint } from './checkpoints.js';
--- a/apps/coder/src/services/provider-config.ts
+++ b/apps/coder/src/services/provider-config.ts
@@ -5,42 +5,28 @@
 * (see provider-config-registry.ts). Loading NEVER throws at startup (design.md
 * §2.1): a missing file, invalid JSON, or schema mismatch all fall back to
 * `{ providers: {} }` (built-ins only, all enabled).
 *
 * Schemas are defined once in @boocode/contracts/provider-config and re-exported
 * here so existing importers (routes, tests, registry) don't need path changes.
 */
 import { readFileSync, writeFileSync } from 'node:fs';
-import { z } from 'zod';
+import {
  ProviderOverrideSchema,
  CoderProvidersFileSchema,
  ProviderConfigPatchSchema,
  type ProviderOverride,
  type CoderProvidersFile,
  type ProviderConfigPatch,
 } from '@boocode/contracts/provider-config';
-// Schemas verbatim from design.md §2.2.
+export {
-export const ProviderOverrideSchema = z.object({
+  ProviderOverrideSchema,
-  extends: z.enum(['acp']).optional(), // v2.3: only 'acp' for custom; built-ins omit extends
+  CoderProvidersFileSchema,
-  label: z.string().min(1).optional(),
+  ProviderConfigPatchSchema,
-  description: z.string().optional(),
+  type ProviderOverride,
-  command: z.array(z.string().min(1)).min(1).optional(), // [binary, ...args]
+  type CoderProvidersFile,
-  env: z.record(z.string()).optional(),
+  type ProviderConfigPatch,
-  enabled: z.boolean().optional(), // default true
+};
  order: z.number().int().optional(), // UI sort key
  models: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
  additionalModels: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
 });
 export const CoderProvidersFileSchema = z.object({
  providers: z.record(ProviderOverrideSchema).default({}),
 });
 export type ProviderOverride = z.infer<typeof ProviderOverrideSchema>;
 export type CoderProvidersFile = z.infer<typeof CoderProvidersFileSchema>;
 /**
 * PATCH body schema (design.md §6.2). A partial providers map where each value
 * is either a full override object (REPLACES that id's override) or `null`
 * (DELETES the override → revert to the built-in default). Ids absent from the
 * patch are left untouched. The route validates the body against this first
 * (malformed → 422) so a bad shape can never reach the merge/save step.
 */
 export const ProviderConfigPatchSchema = z.object({
  providers: z.record(ProviderOverrideSchema.nullable()).default({}),
 });
 export type ProviderConfigPatch = z.infer<typeof ProviderConfigPatchSchema>;
 /**
 * Shallow per-id merge (design.md §6.2 / Paseo `patchConfig`). Each key in
--- a/apps/coder/src/services/provider-types.ts
+++ b/apps/coder/src/services/provider-types.ts
@@ -1,61 +1,10 @@
-/** Shared provider / snapshot types (Paseo-shaped, BooCoder-native). */
+/** Provider snapshot types — re-exported from @boocode/contracts for local consumers. */
-export interface ProviderMode {
+export type {
-  id: string;
+  ProviderMode,
-  label: string;
+  ThinkingOption,
-  description?: string;
+  ProviderModel,
-  /** Auto-approve tool permissions when this mode is selected. */
+  ProviderSnapshotStatus,
-  isUnattended?: boolean;
+  AgentCommand,
-}
+  ProviderSnapshotEntry,
-
+} from '@boocode/contracts/provider-snapshot';
 export interface ThinkingOption {
  id: string;
  label: string;
  isDefault?: boolean;
 }
 export interface ProviderModel {
  id: string;
  label: string;
  description?: string;
  isDefault?: boolean;
  thinkingOptions?: ThinkingOption[];
  defaultThinkingOptionId?: string;
 }
 // v2.3 phase 2: 'loading' (cache-miss, probe in flight) + 'unavailable'
 // (disabled or not installed) restored alongside the terminal 'ready' | 'error'.
 export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
 export interface AgentCommand {
  name: string;
  description?: string;
  // v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
  // Drives the icon split in the coder slash menu. Undefined → command.
  kind?: 'command' | 'skill';
 }
 // KEEP IN SYNC with apps/web/src/api/types.ts ProviderSnapshotEntry — parity is
 // enforced by __tests__/provider-types-parity.test.ts (fails on any field drift).
 export interface ProviderSnapshotEntry {
  name: string;
  label: string;
  description?: string;
  transport: string;
  status: ProviderSnapshotStatus;
  enabled: boolean;
  installed: boolean;
  models: ProviderModel[];
  modes: ProviderMode[];
  defaultModeId: string | null;
  commands: AgentCommand[];
  error?: string;
  fetchedAt?: string;
 }
 export interface AgentSessionConfig {
  provider: string;
  model?: string;
  modeId?: string;
  thinkingOptionId?: string;
 }
--- a/apps/coder/src/services/worktrees.ts
+++ b/apps/coder/src/services/worktrees.ts
@@ -8,6 +8,7 @@
 */
 import type { Sql } from '../db.js';
 import { hostExec } from './host-exec.js';
 import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
 export const WORKTREE_BASE = '/tmp/booworktrees';
@@ -379,22 +380,8 @@ export async function rebaselineWorktreeAfterApply(
 }
 // ─── Session-delete work-loss guard ─────────────────────────────────────────
-
+// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
-/**
+export type { WorktreeRiskReport };
 * Risk report for a single worktree, returned by checkWorktreeWorkAtRisk.
 * `atRisk` is the gate the server reads before allowing a session delete.
 * A git error never silently passes — it forces `atRisk` true and surfaces
 * the message in `error` (fail-closed).
 */
 export interface RiskReport {
  worktreePath: string;
  branch: string;
  dirty: boolean;   // uncommitted working-tree changes (incl. untracked)
  unpushed: number; // commits ahead of upstream, or -1 if no upstream is set
  unmerged: number; // commits on this branch not in the project default branch
  atRisk: boolean;  // dirty || unmerged > 0 || (upstream && unpushed > 0) || git error
  error?: string;   // populated on a git failure; presence forces atRisk
 }
 /**
 * Resolve the project's default branch as a git-usable ref (e.g. "origin/main").
@@ -448,7 +435,7 @@ async function detectDefaultBranchRef(
 export async function checkWorktreeWorkAtRisk(
  worktreePath: string,
  opts?: { signal?: AbortSignal },
-): Promise<RiskReport> {
+): Promise<WorktreeRiskReport> {
  // Branch name — also doubles as the "is this still a git worktree?" probe.
  const br = await hostExec(
    `git -C ${shellEscape(worktreePath)} rev-parse --abbrev-ref HEAD`,
--- a/apps/coder/web/package.json
+++ b/apps/coder/web/package.json
@@ -10,6 +10,7 @@
    "preview": "vite preview"
  },
  "dependencies": {
    "@boocode/contracts": "workspace:*",
    "lucide-react": "^1.16.0",
    "react": "^18.3.1",
    "react-dom": "^18.3.1",
--- a/apps/coder/web/src/api/types.ts
+++ b/apps/coder/web/src/api/types.ts
@@ -1,5 +1,22 @@
 // Minimal types for the BooCoder frontend.
 // Shared DB entities (same schema as BooChat).
 //
 // WS wire contracts are single-sourced from @boocode/contracts (the canonical
 // Zod-backed schema). The DB entity types below (Project/Session/Chat/Message/
 // ToolCall/ToolResult/PendingChange) are an intentional minimal SPA-local subset
 // and are NOT cross-app contracts — they stay defined here.
 import type { WsFrame } from '@boocode/contracts/ws-frames';
 // Re-export the canonical WebSocket frame union (single source of truth). The
 // coder backend publishes the full frame set; this SPA's reducer handles the
 // subset it renders and ignores the rest.
 export type { WsFrame };
 // The error frame's `reason`, single-sourced from the canonical schema's
 // frame-level reason enum (derived from WsFrame so it cannot drift from the
 // wire). Distinct from message-metadata's ErrorReason, which is a different set.
 export type ErrorReason = NonNullable<Extract<WsFrame, { type: 'error' }>['reason']>;
 export interface Project {
  id: string;
@@ -39,7 +56,9 @@ export interface ToolResult {
  tool_call_id: string;
  output: unknown;
  truncated?: boolean;
-  error?: boolean;
+  // Canonical wire shape: the failure message string (present only on error),
  // not a boolean. ToolResultBubble treats it as truthy → renders error styling.
  error?: string;
 }
 // Batch 9.7: ask_user_input shapes. The tool_call.args is { questions: AskUserQuestion[] }
@@ -96,15 +115,3 @@ export interface PendingChange {
  created_at: string;
  applied_at: string | null;
 }
 // WebSocket frame types (subset of what the coder backend publishes)
 export type WsFrame =
  | { type: 'snapshot'; messages: Message[] }
  | { type: 'message_started'; message_id: string; chat_id: string; role: Message['role'] }
  | { type: 'delta'; message_id: string; chat_id: string; content: string }
  | { type: 'tool_call'; message_id: string; chat_id: string; tool_call: ToolCall }
  | { type: 'tool_result'; tool_message_id: string; chat_id: string; tool_call_id: string; output: string; truncated?: boolean; error?: boolean }
  | { type: 'message_complete'; message_id: string; chat_id: string; tokens_used?: number; ctx_used?: number; ctx_max?: number; started_at?: string; finished_at?: string; metadata?: unknown }
  | { type: 'error'; message_id?: string; error: string; reason?: string }
  | { type: 'pending_change_added'; change: PendingChange }
  | { type: 'pending_change_updated'; change: PendingChange };
--- a/apps/coder/web/src/components/DiffPane.tsx
+++ b/apps/coder/web/src/components/DiffPane.tsx
@@ -5,10 +5,9 @@ import { api } from '@/api/client';
 interface Props {
  sessionId: string;
  onPendingChange: (cb: (change: PendingChange) => void) => () => void;
 }
-export function DiffPane({ sessionId, onPendingChange }: Props) {
+export function DiffPane({ sessionId }: Props) {
  const [changes, setChanges] = useState<PendingChange[]>([]);
  const [loading, setLoading] = useState(true);
  const [expandedId, setExpandedId] = useState<string | null>(null);
@@ -24,27 +23,13 @@ export function DiffPane({ sessionId, onPendingChange }: Props) {
    }
  }, [sessionId]);
-  // Initial load
+  // Initial load. Pending changes are delivered over HTTP (list + apply/reject/
  // rewind below); there is no WS pending-change frame, so the list refreshes on
  // mount, on the Refresh button, and optimistically as the user acts on it.
  useEffect(() => {
    fetchPending();
  }, [fetchPending]);
  // Listen for WS pending change events
  useEffect(() => {
    const unsub = onPendingChange((change) => {
      setChanges((prev) => {
        const idx = prev.findIndex((c) => c.id === change.id);
        if (idx >= 0) {
          const next = [...prev];
          next[idx] = change;
          return next;
        }
        return [...prev, change];
      });
    });
    return unsub;
  }, [onPendingChange]);
  const pendingChanges = changes.filter((c) => c.status === 'pending');
  const resolvedChanges = changes.filter((c) => c.status !== 'pending');
--- a/apps/coder/web/src/hooks/useSessionStream.ts
+++ b/apps/coder/web/src/hooks/useSessionStream.ts
@@ -1,5 +1,5 @@
-import { useEffect, useRef, useState, useCallback } from 'react';
+import { useEffect, useRef, useState } from 'react';
-import type { Message, WsFrame, PendingChange } from '@/api/types';
+import type { Message, WsFrame } from '@/api/types';
 interface State {
  messages: Message[];
@@ -10,7 +10,9 @@ interface State {
 function applyFrame(state: State, frame: WsFrame): State {
  switch (frame.type) {
    case 'snapshot': {
-      return { ...state, messages: frame.messages };
+      // Canonical SnapshotFrame.messages is opaque (z.array(z.unknown())); the
      // coder backend sends Message-shaped rows, so cast to the SPA's local type.
      return { ...state, messages: frame.messages as Message[] };
    }
    case 'message_started': {
      const exists = state.messages.some((m) => m.id === frame.message_id);
@@ -18,7 +20,7 @@ function applyFrame(state: State, frame: WsFrame): State {
      const newMsg: Message = {
        id: frame.message_id,
        session_id: '',
-        chat_id: frame.chat_id,
+        chat_id: frame.chat_id ?? '',
        role: frame.role,
        content: '',
        kind: 'message',
@@ -72,7 +74,7 @@ function applyFrame(state: State, frame: WsFrame): State {
      const newMsg: Message = {
        id: frame.tool_message_id,
        session_id: '',
-        chat_id: frame.chat_id,
+        chat_id: frame.chat_id ?? '',
        role: 'tool',
        content: '',
        kind: 'message',
@@ -119,9 +121,12 @@ function applyFrame(state: State, frame: WsFrame): State {
        : state.messages;
      return { ...state, messages: next, error: frame.error };
    }
-    case 'pending_change_added':
+    default:
-    case 'pending_change_updated':
+      // The canonical WsFrame carries the full set of frames the coder backend
-      // These are handled by the pending changes listener, not the message state
+      // can publish; this SPA only renders the subset handled above and safely
      // ignores the rest (reasoning_delta, usage, permission_*, agent_*, and the
      // per-user sidebar frames). pending_change_* frames have no publisher —
      // pending changes are delivered over HTTP, so there is nothing to handle.
      return state;
  }
 }
@@ -134,14 +139,11 @@ interface SessionStreamResult {
  connected: boolean;
  error: string | null;
  isStreaming: boolean;
  /** Listeners for pending change frames */
  onPendingChange: (cb: (change: PendingChange) => void) => () => void;
 }
 export function useSessionStream(sessionId: string | undefined): SessionStreamResult {
  const [state, setState] = useState<State>({ messages: [], connected: false, error: null });
  const wsRef = useRef<WebSocket | null>(null);
  const pendingListenersRef = useRef<Set<(change: PendingChange) => void>>(new Set());
  useEffect(() => {
    if (!sessionId) return;
@@ -172,13 +174,6 @@ export function useSessionStream(sessionId: string | undefined): SessionStreamRe
          return;
        }
        // Notify pending change listeners
        if (frame.type === 'pending_change_added' || frame.type === 'pending_change_updated') {
          for (const cb of pendingListenersRef.current) {
            cb(frame.change);
          }
        }
        setState((s) => applyFrame(s, frame));
      };
@@ -213,18 +208,10 @@ export function useSessionStream(sessionId: string | undefined): SessionStreamRe
  const isStreaming = state.messages.some((m) => m.status === 'streaming');
  const onPendingChange = useCallback((cb: (change: PendingChange) => void) => {
    pendingListenersRef.current.add(cb);
    return () => {
      pendingListenersRef.current.delete(cb);
    };
  }, []);
  return {
    messages: state.messages,
    connected: state.connected,
    error: state.error,
    isStreaming,
    onPendingChange,
  };
 }
--- a/apps/coder/web/src/pages/Session.tsx
+++ b/apps/coder/web/src/pages/Session.tsx
@@ -14,8 +14,7 @@ export function Session() {
  const [chat, setChat] = useState<Chat | null>(null);
  const [loading, setLoading] = useState(true);
-  const { messages, connected, isStreaming, onPendingChange } =
+  const { messages, connected, isStreaming } = useSessionStream(sessionId);
    useSessionStream(sessionId);
  // Get or create a chat for this session
  useEffect(() => {
@@ -78,9 +77,7 @@ export function Session() {
          connected={connected}
        />
      }
-      diffPane={
+      diffPane={<DiffPane sessionId={sessionId} />}
        <DiffPane sessionId={sessionId} onPendingChange={onPendingChange} />
      }
    />
  );
 }
--- a/apps/server/CLAUDE.md
+++ b/apps/server/CLAUDE.md
@@ -21,7 +21,7 @@
 - **`experimental_repairToolCall`** wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through: logs the bad call, returns it unmodified; `executeToolPhase`'s zod-reject path routes it back to the model next turn.
 - **`chat_status` frame** (via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'`. Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders beside `StatusDot` only when streaming/tool_running, fed by 500ms-throttled `'usage'` frames (`completion_tokens` + `ctx_used` + `ctx_max`). `POST /api/chats/:id/discard_stale` marks a stuck-streaming row `failed` when the frontend's 60s no-token timer gives up.
 - **Stale-streaming sweeps** (`apps/server/src/index.ts`): a boot-time pass after `applySchema()` and a periodic 60s `setInterval` both flip `messages.status='streaming'` older than 5 min to `failed` (publishing `chat_status='idle'`); the interval also runs `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `onClose` hook clears the timer. Recovers from a container restart mid-stream.
- **`services/broker.ts`** — In-memory pub/sub, two channel types: per-session (message streaming) and per-user (sidebar). No persistence; clients reconnect on restart. Every WS publish goes through `broker.publishFrame(sessionId, frame)` / `publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). Schema duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; `ws-frames.test.ts` enforces parity. Don't add raw `broker.publish()`/`publishUser()` calls.
+- **`services/broker.ts`** — In-memory pub/sub, two channel types: per-session (message streaming) and per-user (sidebar). No persistence; clients reconnect on restart. Every WS publish goes through `broker.publishFrame(sessionId, frame)` / `publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). Schema single-sourced in `@boocode/contracts` (`packages/contracts/src/ws-frames.ts`); the package's `ws-frames.test.ts` enforces schema correctness. Don't add raw `broker.publish()`/`publishUser()` calls.
 - **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) pass three guards: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). Web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (falls back to `project.default_web_search_enabled`) and filtered out of the LLM tool schema when false. Truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs (`BOOCODE_TRUNCATION_DIR`, default `/tmp/boocode-truncations`, 0o700) keyed by `tr_<12 base32>`; `view_truncated_output(id)` retrieves it. 5MB cap, 7-day TTL, reaped by the sweeper. Container restart loses retrieval — acceptable.
 - **`services/compaction.ts`** + **`services/model-context.ts`** — Anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself each compaction). Triggered when `chats.needs_compaction` is set after a turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)`. **`ctx_max` comes from `model-context.getModelContext()` fetching `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx`. First inferences after boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model; negative cache TTL 60s, recovers next turn. `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on assistant `content` (OpenAI wire shape has no structured reasoning field); standalone tag when content is empty. `buildHeadPayload` + `OpenAiMessage` exported for tests — keep them exported.
 - **`services/system-prompt.ts`** — `buildSystemPrompt` is the string shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. SHA-256 of the assembled prefix is logged per `buildMessagesPayload` (`prefix-fingerprint`, info); a `Map<sessionId, lastHash>` fires `prefix-drift` (warn) on change with a `changed_inputs` diff. The prefix is byte-stable in steady-state, so prefix caching is left to the input-layer mtime caches (BOOCHAT.md + AGENTS.md global/per-project in `agents.ts:safeStat`).
--- a/apps/server/package.json
+++ b/apps/server/package.json
@@ -53,10 +53,6 @@
      "types": "./dist/types/api.d.ts",
      "default": "./dist/types/api.js"
    },
    "./ws-frames": {
      "types": "./dist/types/ws-frames.d.ts",
      "default": "./dist/types/ws-frames.js"
    },
    "./db": {
      "types": "./dist/db.d.ts",
      "default": "./dist/db.js"
@@ -81,6 +77,7 @@
    "test": "vitest run"
  },
  "dependencies": {
    "@boocode/contracts": "workspace:*",
    "@ai-sdk/openai-compatible": "^2.0.47",
    "@fastify/static": "^7.0.4",
    "@fastify/websocket": "^10.0.1",
--- a/apps/server/src/index.ts
+++ b/apps/server/src/index.ts
@@ -140,7 +140,7 @@ async function main() {
      publish: (sessionId, frame) => {
        // v1.13.11-b: route through the typed publishFrame so the broker's
        // Zod gate validates every inference frame before delivery.
-        broker.publishFrame(sessionId, frame as unknown as import('./types/ws-frames.js').WsFrame);
+        broker.publishFrame(sessionId, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
      },
      // v1.11: broker handle for compaction.process to publish 'compacted'
      // frames on the per-session channel. Inference's regular publish path
@@ -149,7 +149,7 @@ async function main() {
      broker,
    },
    (user, frame) => {
-      broker.publishUserFrame(user, frame as unknown as import('./types/ws-frames.js').WsFrame);
+      broker.publishUserFrame(user, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
    }
  );
  registerMessageRoutes(app, sql, config, broker, {
@@ -194,7 +194,7 @@ async function main() {
      });
    },
    publishSessionFrame: (sessionId, frame) => {
-      broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
+      broker.publishFrame(sessionId, frame as import('@boocode/contracts/ws-frames').WsFrame);
    },
  });
  registerArtifactRoutes(app, sql);
@@ -222,7 +222,7 @@ async function main() {
      });
    },
    publishSessionFrame: (sessionId, frame) => {
-      broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
+      broker.publishFrame(sessionId, frame as import('@boocode/contracts/ws-frames').WsFrame);
    },
  });
  registerWebSocket(app, sql, broker);
--- a/apps/server/src/services/tests/ws-frames.test.ts
+++ b/apps/server/src/services/tests/ws-frames.test.ts
@@ -1,159 +1,13 @@
 import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
 import { readFileSync } from 'node:fs';
 import { resolve } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import {
  WsFrameSchema,
  KNOWN_FRAME_TYPES,
  type WsFrame,
-} from '../../types/ws-frames.js';
+} from '@boocode/contracts/ws-frames';
 import { createBroker } from '../broker.js';
 const VALID_UUID_A = '00000000-0000-0000-0000-000000000001';
 const VALID_UUID_B = '00000000-0000-0000-0000-000000000002';
 const VALID_UUID_C = '00000000-0000-0000-0000-000000000003';
 const VALID_TIMESTAMP = '2026-05-22T14:30:00.000Z';
 describe('WsFrameSchema (v1.13.11-a)', () => {
  it('accepts a well-formed chat_status frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'streaming',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(true);
  });
  it('rejects an unknown frame type', () => {
    const result = WsFrameSchema.safeParse({
      type: 'cosmic_ray_strike',
      chat_id: VALID_UUID_A,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a chat_status frame with invalid status enum', () => {
    // v1.12.1 dropped the legacy 'working' status. Any frame still emitting it
    // should fail validation — that's a drift catcher.
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'working',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a UUID field with a non-UUID string', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: 'not-a-uuid',
      status: 'idle',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects negative token counts in usage frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: -1,
      ctx_used: 100,
      ctx_max: 1000,
    });
    expect(result.success).toBe(false);
  });
  it('accepts a usage frame with nullable token counts (pre-v1.13.7 history)', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: null,
      ctx_used: null,
      ctx_max: null,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a tool_result frame with non-UUID tool_call_id (model-emitted)', () => {
    // Model-emitted tool_call_ids look like "call_abc123", not UUIDs.
    const result = WsFrameSchema.safeParse({
      type: 'tool_result',
      tool_message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      tool_call_id: 'call_abc123',
      output: { whatever: true },
      truncated: false,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a compacted frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'compacted',
      session_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      summary_message_id: VALID_UUID_C,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a session_workspace_updated frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'session_workspace_updated',
      session_id: VALID_UUID_A,
      workspace_panes: [{ id: 'p1', kind: 'chat', chatIds: [], activeChatIdx: 0 }],
    });
    expect(result.success).toBe(true);
  });
  it('accepts a message_complete frame with a null model (external coder, no model selected)', () => {
    // Regression guard: the dispatcher publishes `model: task.model` (string |
    // null). When null, this MUST validate or publishFrame fail-closes and drops
    // the whole frame, incl. the status:'complete' transition.
    const result = WsFrameSchema.safeParse({
      type: 'message_complete',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      model: null,
    });
    expect(result.success).toBe(true);
  });
  it('every KNOWN_FRAME_TYPES entry has a discriminated branch', () => {
    // Probe each known type by attempting a minimal valid construction.
    // Failure here means the union and the KNOWN_FRAME_TYPES list drifted.
    for (const type of KNOWN_FRAME_TYPES) {
      const probe = WsFrameSchema.safeParse({ type, __dummy__: true });
      // We expect FAILURE on every type because we're missing required fields,
      // but the failure must be ABOUT the missing fields, not about an unknown
      // type. A "Invalid discriminator value" error means the type isn't in
      // the union — that's a drift.
      if (probe.success) continue;
      const issues = probe.error.issues;
      const hasInvalidDiscriminator = issues.some(
        (i) => i.code === 'invalid_union_discriminator',
      );
      expect(hasInvalidDiscriminator, `frame type '${type}' is missing from the discriminated union`).toBe(false);
    }
  });
 });
 describe('ws-frames.ts file mirror parity', () => {
  it('apps/server and apps/web copies are byte-identical', () => {
    const here = fileURLToPath(import.meta.url);
    const serverPath = resolve(here, '../../../types/ws-frames.ts');
    const webPath = resolve(here, '../../../../../web/src/api/ws-frames.ts');
    const serverContent = readFileSync(serverPath, 'utf8');
    const webContent = readFileSync(webPath, 'utf8');
    expect(webContent, 'apps/web/src/api/ws-frames.ts must be byte-identical to apps/server/src/types/ws-frames.ts').toBe(serverContent);
  });
 });
 describe('broker.publishFrame / publishUserFrame fail-closed behavior', () => {
  let logErrors: Array<{ obj: unknown; msg: string }>;
  let mockLog: Parameters<typeof createBroker>[0];
--- a/apps/server/src/services/broker.ts
+++ b/apps/server/src/services/broker.ts
@@ -1,5 +1,5 @@
 import type { FastifyBaseLogger } from 'fastify';
-import { WsFrameSchema, type WsFrame } from '../types/ws-frames.js';
+import { WsFrameSchema, type WsFrame } from '@boocode/contracts/ws-frames';
 export type Frame = Record<string, unknown> & { type: string };
 export type Listener = (frame: Frame) => void;
--- a/apps/server/src/types/api.ts
+++ b/apps/server/src/types/api.ts
@@ -25,19 +25,9 @@ export interface AvailableProject {
 export type SessionStatus = 'open' | 'archived';
-// Session-delete work-loss guard. Returned (as `reports`) in the 409 body when
+// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
-// a delete is blocked because the session's worktree holds work at risk. The
+import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
-// shape is produced by BooCoder's checkWorktreeWorkAtRisk and passed through
+export type { WorktreeRiskReport };
 // verbatim; mirrored byte-for-byte in apps/web/src/api/types.ts for the dialog.
 export interface WorktreeRiskReport {
  worktreePath: string;
  branch: string;
  dirty: boolean;
  unpushed: number; // commits ahead of upstream, or -1 if no upstream
  unmerged: number; // commits not in the project default branch
  atRisk: boolean;
  error?: string;
 }
 export interface Session {
  id: string;
@@ -198,51 +188,10 @@ export interface ToolResult {
  error?: string;
 }
-// v1.8.2: structured reason codes for failed inferences. `error` carries the
+// v1.8.2 / v1.11.6: ErrorReason + MessageMetadata single-sourced in
-// human text; `reason` is the machine-readable discriminator the UI matches
+// @boocode/contracts — edit the package, not here.
-// on (with `error` as fallback when reason is absent or unrecognized).
+import type { ErrorReason, MessageMetadata } from '@boocode/contracts/message-metadata';
-export type ErrorReason =
+export type { ErrorReason, MessageMetadata };
  | 'llm_provider_error'
  | 'tool_execution_failed'
  | 'summary_after_cap_failed';
 // v1.8.2 / v1.11.6: shapes stored in messages.metadata. Discriminated on `kind`.
 //   cap_hit          — system sentinel emitted when tool budget is exhausted
 //   doom_loop        — system sentinel emitted when the model called the same
 //                      tool with the same args DOOM_LOOP_THRESHOLD times in a row
 //   mistake_recovery — system sentinel emitted when a run of consecutive
 //                      *heterogeneous* tool failures is detected (#12). A nudge
 //                      (escalated:false) injects model-facing recovery guidance
 //                      and continues; an escalate (escalated:true) stops the
 //                      turn after the nudge failed to break the failure run.
 //   error            — attached to a failed assistant message so UI can show reason
 export type MessageMetadata =
  | {
      kind: 'cap_hit';
      used: number;
      limit: number;
      agent_name: string | null;
      can_continue: boolean;
    }
  | {
      kind: 'doom_loop';
      tool_name: string;
      args: Record<string, unknown>;
      threshold: number;
    }
  | {
      // PINNED CONTRACT (#12) — mirrored byte-for-byte in apps/web/src/api/types.ts.
      kind: 'mistake_recovery';
      failure_kinds: string[];
      count: number;
      escalated: boolean;
      can_continue?: boolean;
    }
  | {
      kind: 'error';
      error_reason: ErrorReason;
      error_text: string;
    };
 export interface Message {
  id: string;
--- a/apps/web/package.json
+++ b/apps/web/package.json
@@ -10,6 +10,7 @@
    "typecheck": "tsc -b --noEmit"
  },
  "dependencies": {
    "@boocode/contracts": "workspace:*",
    "@fontsource-variable/inter": "^5.2.8",
    "@fontsource-variable/jetbrains-mono": "^5.2.8",
    "@xterm/addon-fit": "0.10.0",
--- a/apps/web/src/api/types.ts
+++ b/apps/web/src/api/types.ts
@@ -34,18 +34,8 @@ export interface AvailableProject {
 export type SessionStatus = 'open' | 'archived';
-// Session-delete work-loss guard. Mirror of WorktreeRiskReport in
+// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
-// apps/server/src/types/api.ts — edit both copies together. Arrives as the
+export type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
 // `reports` field of the 409 body when a delete is blocked.
 export interface WorktreeRiskReport {
  worktreePath: string;
  branch: string;
  dirty: boolean;
  unpushed: number; // commits ahead of upstream, or -1 if no upstream
  unmerged: number; // commits not in the project default branch
  atRisk: boolean;
  error?: string;
 }
 export interface Session {
  id: string;
@@ -143,49 +133,10 @@ export interface ToolResult {
  error?: string;
 }
-// v1.8.2: structured reason codes that flow through error frames / metadata.
+// v1.8.2 / v1.11.6: ErrorReason + MessageMetadata single-sourced in
-// `error` text stays human; `reason` is the discriminator the UI matches on.
+// @boocode/contracts — edit the package, not here.
-export type ErrorReason =
+import type { ErrorReason, MessageMetadata } from '@boocode/contracts/message-metadata';
-  | 'llm_provider_error'
+export type { ErrorReason, MessageMetadata };
  | 'tool_execution_failed'
  | 'summary_after_cap_failed';
 // v1.8.2 / v1.11.6: shapes stored in Message.metadata. Discriminated on `kind`.
 //   cap_hit    — sentinel emitted when the tool budget is hit; carries the
 //                budget + agent name + whether Continue is still allowed.
 //   doom_loop  — sentinel emitted when the model called the same tool with
 //                the same arguments threshold times in a row.
 //   mistake_recovery — sentinel emitted when the model hit repeated *different*
 //                errors; non-escalated means recovery guidance was injected and
 //                the turn continues, escalated means the turn was stopped.
 //   error      — attached to a failed assistant message so the bubble can show
 //                a specific reason on reload (WS error frame is one-shot).
 export type MessageMetadata =
  | {
      kind: 'cap_hit';
      used: number;
      limit: number;
      agent_name: string | null;
      can_continue: boolean;
    }
  | {
      kind: 'doom_loop';
      tool_name: string;
      args: Record<string, unknown>;
      threshold: number;
    }
  | {
      kind: 'mistake_recovery';
      failure_kinds: string[];
      count: number;
      escalated: boolean;
      can_continue?: boolean;
    }
  | {
      kind: 'error';
      error_reason: ErrorReason;
      error_text: string;
    };
 export interface Message {
  id: string;
@@ -239,80 +190,23 @@ export interface ModelInfo {
  [key: string]: unknown;
 }
-export interface ProviderModel {
+export type {
-  id: string;
+  ProviderModel,
-  label: string;
+  ProviderMode,
-  description?: string;
+  ThinkingOption,
-  isDefault?: boolean;
+  ProviderSnapshotStatus,
-  thinkingOptions?: ThinkingOption[];
+  AgentCommand,
-  defaultThinkingOptionId?: string;
+  ProviderSnapshotEntry,
-}
+} from '@boocode/contracts/provider-snapshot';
-export interface ProviderMode {
+export type {
-  id: string;
+  ProviderOverride,
-  label: string;
+  CoderProvidersFile,
-  description?: string;
+  ProviderConfigPatch,
-  isUnattended?: boolean;
+} from '@boocode/contracts/provider-config';
 }
-export interface ThinkingOption {
+// AgentSessionConfig single-sourced in @boocode/contracts — edit the package, not here.
-  id: string;
+export type { AgentSessionConfig } from '@boocode/contracts/message-metadata';
  label: string;
  isDefault?: boolean;
 }
 // v2.3 phase 2: 'loading' + 'unavailable' restored alongside 'ready' | 'error'.
 export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
 // KEEP IN SYNC with apps/coder/src/services/provider-types.ts ProviderSnapshotEntry
 // — parity is enforced by coder __tests__/provider-types-parity.test.ts (field drift fails it).
 export interface ProviderSnapshotEntry {
  name: string;
  label: string;
  description?: string;
  transport: string;
  status: ProviderSnapshotStatus;
  enabled: boolean;
  installed: boolean;
  models: ProviderModel[];
  modes: ProviderMode[];
  defaultModeId: string | null;
  commands: AgentCommand[];
  error?: string;
  fetchedAt?: string;
 }
 // v2.3 Phase 4: provider config file wire types. Mirror of the Zod-inferred
 // ProviderOverride / CoderProvidersFile in apps/coder/src/services/provider-config.ts
 // (web can't cross-import the coder package — TS6307 on the composite project).
 export interface ProviderOverride {
  extends?: 'acp';
  label?: string;
  description?: string;
  command?: string[];
  env?: Record<string, string>;
  enabled?: boolean;
  order?: number;
  models?: Array<{ id: string; label: string }>;
  additionalModels?: Array<{ id: string; label: string }>;
 }
 export interface CoderProvidersFile {
  providers: Record<string, ProviderOverride>;
 }
 // PATCH body: a partial providers map. A `null` value deletes that id's
 // override (revert to built-in default); an object replaces it wholesale.
 export interface ProviderConfigPatch {
  providers: Record<string, ProviderOverride | null>;
 }
 export interface AgentSessionConfig {
  provider: string;
  model: string;
  modeId: string | null;
  thinkingOptionId: string | null;
 }
 export type PermissionKind = 'tool' | 'question' | 'plan' | 'elicitation';
@@ -324,14 +218,6 @@ export interface PermissionPrompt {
  options: Array<{ optionId: string; label: string }>;
 }
 export interface AgentCommand {
  name: string;
  description?: string;
  // v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
  // Drives the icon split in the coder slash menu. Undefined → command.
  kind?: 'command' | 'skill';
 }
 export interface CoderSendMessageBody {
  content: string;
  pane_id: string;
--- a/apps/web/src/api/ws-frames.ts
+++ b/apps/web/src/api/ws-frames.ts
@@ -1,408 +0,0 @@
 // v1.13.11-a: Zod schemas for every WebSocket frame published by the server.
 // Validation runs both on send (broker.publishFrame / publishUserFrame) and
 // on receive (apps/web/src/hooks/useSessionStream + useUserEvents). Catches
 // silent protocol drift between publisher and consumer.
 //
 // IMPORTANT: This file is duplicated byte-identical at
 // apps/web/src/api/ws-frames.ts. The two apps have separate tsconfigs and
 // no path alias; the duplication is sync-by-hand. A test asserts the two
 // files match. If you change one, change the other.
 //
 // Per-kind payload schemas (tool_call args, message_parts payloads, etc.)
 // stay z.unknown() in v1.13.11. Frame-level drift detection is the goal;
 // deep payload validation is follow-up work.
 import { z } from 'zod';
 // ---- shared primitives -----------------------------------------------------
 const Uuid = z.string().uuid();
 // Tool call IDs are model-emitted (e.g. "call_abc123") — not UUIDs.
 const ToolCallId = z.string().min(1);
 // v1.13.12 fix: postgres returns timestamp columns as JS Date objects, not
 // strings. The publish sites pass them through unchanged, so the schema must
 // tolerate both. preprocess converts Date → ISO string before string-validation;
 // on the web side (where frames arrive via JSON.parse) it's a no-op. Before
 // this fix, every message_complete / session_updated / chat_updated frame
 // failed validation and got dropped — symptoms: token tracking blank in UI,
 // status stuck at 'streaming' tripping the 60s stale-stream banner.
 const IsoTimestamp = z.preprocess(
  (v) => (v instanceof Date ? v.toISOString() : v),
  z.string().min(1),
 );
 const ChatStatusValue = z.enum([
  'streaming',
  'tool_running',
  'waiting_for_input',
  'idle',
  'error',
 ]);
 // agent-status-normalize (#10): normalized per-(chat,agent) lifecycle status for
 // external coding agents (warm-acp / opencode / claude-sdk / pty). Distinct from
 // ChatStatusValue (native-inference chat lifecycle) — published by BooCoder's
 // dispatcher + permission flow on the per-session channel.
 const AgentStatusValue = z.enum(['working', 'blocked', 'idle', 'error']);
 const ErrorReasonValue = z.enum([
  'llm_provider_error',
  'doom_loop',
  'doom_loop_summary_failed',
  'cap_hit',
  'cap_hit_summary_failed',
 ]);
 const MessageRoleValue = z.enum(['user', 'assistant', 'system', 'tool']);
 const ToolCallShape = z.object({
  id: ToolCallId,
  name: z.string().min(1),
  args: z.record(z.string(), z.unknown()),
 });
 // Free-form bags: opaque to the frame schema; deep validation is out of
 // scope for v1.13.11 (frame-level drift detection is the goal; per-kind
 // payload narrowing is follow-up work). z.unknown() means the consumer
 // must narrow before reading — TypeScript-side this is fine because every
 // consumer already operates on the hand-maintained Project / Chat / Session
 // / WorkspacePane types (the brief's "Don't strip existing types yet"
 // rule), and the Zod-typed shape is only used at the publishFrame boundary.
 const OpaqueObject = z.unknown();
 // ---- per-session channel frames --------------------------------------------
 export const SnapshotFrame = z.object({
  type: z.literal('snapshot'),
  messages: z.array(OpaqueObject),
 });
 export const MessageStartedFrame = z.object({
  type: z.literal('message_started'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  role: MessageRoleValue,
 });
 export const DeltaFrame = z.object({
  type: z.literal('delta'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  content: z.string(),
 });
 export const ReasoningDeltaFrame = z.object({
  type: z.literal('reasoning_delta'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  content: z.string(),
 });
 export const ToolCallFrame = z.object({
  type: z.literal('tool_call'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call: ToolCallShape,
 });
 export const ToolResultFrame = z.object({
  type: z.literal('tool_result'),
  tool_message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call_id: ToolCallId,
  output: z.unknown(),
  truncated: z.boolean(),
  error: z.string().optional(),
 });
 export const MessageCompleteFrame = z.object({
  type: z.literal('message_complete'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tokens_used: z.number().int().nonnegative().nullable().optional(),
  ctx_used: z.number().int().nonnegative().nullable().optional(),
  ctx_max: z.number().int().positive().nullable().optional(),
  started_at: IsoTimestamp.nullable().optional(),
  finished_at: IsoTimestamp.nullable().optional(),
  // nullable: external-coder turns carry task.model, which is null when no
  // model was selected. This frame is published through the same fail-closed
  // publishFrame, so null MUST validate or the entire frame (incl. the
  // status:'complete' transition) is dropped.
  model: z.string().nullable().optional(),
  metadata: OpaqueObject.nullable().optional(),
 });
 export const UsageFrame = z.object({
  type: z.literal('usage'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  completion_tokens: z.number().int().nonnegative().nullable(),
  ctx_used: z.number().int().nonnegative().nullable(),
  ctx_max: z.number().int().positive().nullable(),
 });
 export const MessagesDeletedFrame = z.object({
  type: z.literal('messages_deleted'),
  message_ids: z.array(Uuid),
  chat_id: Uuid.optional(),
 });
 export const ChatRenamedFrame = z.object({
  type: z.literal('chat_renamed'),
  chat_id: Uuid,
  name: z.string(),
 });
 export const CompactedFrame = z.object({
  type: z.literal('compacted'),
  session_id: Uuid,
  chat_id: Uuid,
  summary_message_id: Uuid,
 });
 export const ErrorFrame = z.object({
  type: z.literal('error'),
  message_id: Uuid.optional(),
  chat_id: Uuid.optional(),
  error: z.string(),
  reason: ErrorReasonValue.optional(),
 });
 // ---- per-user channel frames (sidebar refresh) -----------------------------
 export const ChatStatusFrame = z.object({
  type: z.literal('chat_status'),
  chat_id: Uuid,
  status: ChatStatusValue,
  at: IsoTimestamp,
  reason: ErrorReasonValue.optional(),
 });
 export const SessionUpdatedFrame = z.object({
  type: z.literal('session_updated'),
  session_id: Uuid,
  project_id: Uuid,
  name: z.string(),
  updated_at: IsoTimestamp,
 });
 export const SessionRenamedFrame = z.object({
  type: z.literal('session_renamed'),
  session_id: Uuid,
  name: z.string(),
 });
 export const SessionCreatedFrame = z.object({
  type: z.literal('session_created'),
  session: OpaqueObject,
  project_id: Uuid,
 });
 export const SessionArchivedFrame = z.object({
  type: z.literal('session_archived'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionDeletedFrame = z.object({
  type: z.literal('session_deleted'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionWorkspaceUpdatedFrame = z.object({
  type: z.literal('session_workspace_updated'),
  session_id: Uuid,
  // v2.6.x: widened from z.array — the payload is now either the legacy bare
  // WorkspacePane[] OR the WorkspaceState envelope object (panes + tabNumbers +
  // nextTabNumber + closedPaneStack). z.array alone would fail-closed and drop
  // every envelope frame at validation. MUST be mirrored in the server's
  // byte-identical copy (parity test).
  workspace_panes: z.union([z.array(OpaqueObject), z.record(z.unknown())]),
 });
 export const ChatCreatedFrame = z.object({
  type: z.literal('chat_created'),
  chat: OpaqueObject,
  session_id: Uuid,
 });
 export const ChatUpdatedFrame = z.object({
  type: z.literal('chat_updated'),
  chat_id: Uuid,
  session_id: Uuid,
  name: z.string().nullable(),
  updated_at: IsoTimestamp,
 });
 export const ChatArchivedFrame = z.object({
  type: z.literal('chat_archived'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ChatUnarchivedFrame = z.object({
  type: z.literal('chat_unarchived'),
  chat: OpaqueObject,
 });
 export const ChatDeletedFrame = z.object({
  type: z.literal('chat_deleted'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ProjectCreatedFrame = z.object({
  type: z.literal('project_created'),
  project: OpaqueObject,
 });
 export const ProjectArchivedFrame = z.object({
  type: z.literal('project_archived'),
  project_id: Uuid,
 });
 export const ProjectUnarchivedFrame = z.object({
  type: z.literal('project_unarchived'),
  project: OpaqueObject,
 });
 export const ProjectUpdatedFrame = z.object({
  type: z.literal('project_updated'),
  project_id: Uuid,
  name: z.string(),
 });
 export const ProjectDeletedFrame = z.object({
  type: z.literal('project_deleted'),
  project_id: Uuid,
 });
 const PermissionOptionShape = z.object({
  option_id: z.string(),
  label: z.string(),
 });
 export const PermissionRequestedFrame = z.object({
  type: z.literal('permission_requested'),
  task_id: Uuid,
  session_id: Uuid,
  kind: z.enum(['tool', 'question', 'plan', 'elicitation']).optional(),
  tool_title: z.string().optional(),
  input: z.record(z.unknown()).optional(),
  options: z.array(PermissionOptionShape),
 });
 export const PermissionResolvedFrame = z.object({
  type: z.literal('permission_resolved'),
  task_id: Uuid,
  session_id: Uuid,
 });
 const AgentCommandShape = z.object({
  name: z.string(),
  description: z.string().optional(),
 });
 export const AgentCommandsFrame = z.object({
  type: z.literal('agent_commands'),
  task_id: Uuid,
  session_id: Uuid,
  commands: z.array(AgentCommandShape),
 });
 // agent-status-normalize (#10): published by BooCoder on the per-session channel
 // when an external agent's normalized status changes (turn start/end, permission
 // block/unblock). Keyed per (chat_id, agent); the frontend tracks the latest per
 // pair and resets on chat switch. `reason` is a free-form discriminator
 // (turn_start / turn_complete / failed / crashed / permission_request /
 // permission_resolved).
 export const AgentStatusUpdatedFrame = z.object({
  type: z.literal('agent_status_updated'),
  chat_id: Uuid,
  agent: z.string().min(1),
  status: AgentStatusValue,
  reason: z.string().optional(),
  at: IsoTimestamp,
 });
 // ---- discriminated union ---------------------------------------------------
 export const WsFrameSchema = z.discriminatedUnion('type', [
  // per-session
  SnapshotFrame,
  MessageStartedFrame,
  DeltaFrame,
  ReasoningDeltaFrame,
  ToolCallFrame,
  ToolResultFrame,
  MessageCompleteFrame,
  UsageFrame,
  MessagesDeletedFrame,
  ChatRenamedFrame,
  CompactedFrame,
  ErrorFrame,
  PermissionRequestedFrame,
  PermissionResolvedFrame,
  AgentCommandsFrame,
  AgentStatusUpdatedFrame,
  // per-user
  ChatStatusFrame,
  SessionUpdatedFrame,
  SessionRenamedFrame,
  SessionCreatedFrame,
  SessionArchivedFrame,
  SessionDeletedFrame,
  SessionWorkspaceUpdatedFrame,
  ChatCreatedFrame,
  ChatUpdatedFrame,
  ChatArchivedFrame,
  ChatUnarchivedFrame,
  ChatDeletedFrame,
  ProjectCreatedFrame,
  ProjectArchivedFrame,
  ProjectUnarchivedFrame,
  ProjectUpdatedFrame,
  ProjectDeletedFrame,
 ]);
 export type WsFrame = z.infer<typeof WsFrameSchema>;
 // Convenience: the set of known frame types. Useful for the publishFrame
 // helper to log the offending type name when validation fails. Kept in sync
 // by hand with the discriminated union above.
 export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
  'snapshot',
  'message_started',
  'delta',
  'reasoning_delta',
  'tool_call',
  'tool_result',
  'message_complete',
  'usage',
  'messages_deleted',
  'chat_renamed',
  'compacted',
  'error',
  'permission_requested',
  'permission_resolved',
  'agent_commands',
  'agent_status_updated',
  'chat_status',
  'session_updated',
  'session_renamed',
  'session_created',
  'session_archived',
  'session_deleted',
  'session_workspace_updated',
  'chat_created',
  'chat_updated',
  'chat_archived',
  'chat_unarchived',
  'chat_deleted',
  'project_created',
  'project_archived',
  'project_unarchived',
  'project_updated',
  'project_deleted',
 ] as const;
--- a/apps/web/src/components/AgentCommandsHint.tsx
+++ b/apps/web/src/components/AgentCommandsHint.tsx
@@ -1,49 +0,0 @@
 import { ChevronDown } from 'lucide-react';
 import { useState } from 'react';
 import type { AgentCommand } from '@/api/types';
 import { cn } from '@/lib/utils';
 interface Props {
  commands: AgentCommand[];
 }
 export function AgentCommandsHint({ commands }: Props) {
  const [open, setOpen] = useState(false);
  const [expanded, setExpanded] = useState<string | null>(null);
  if (commands.length === 0) return null;
  return (
    <div className="mx-2 mb-1 rounded-md border border-border/60 bg-muted/30 text-xs">
      <button
        type="button"
        onClick={() => setOpen((v) => !v)}
        className="w-full flex items-center justify-between px-2 py-1.5 text-muted-foreground hover:text-foreground max-md:min-h-[44px]"
      >
        <span>Slash commands ({commands.length})</span>
        <ChevronDown className={cn('size-3.5 transition-transform', open && 'rotate-180')} />
      </button>
      {open && (
        <ul className="px-2 pb-2 space-y-1 border-t border-border/40 max-h-48 overflow-y-auto overscroll-contain touch-pan-y">
          {commands.map((cmd) => (
            <li
              key={cmd.name}
              className="cursor-pointer"
              onClick={() => setExpanded((v) => v === cmd.name ? null : cmd.name)}
            >
              <span className="font-mono text-primary/80">/{cmd.name}</span>
              {cmd.description && (
                <span className={cn(
                  'ml-1.5 text-muted-foreground font-sans',
                  expanded === cmd.name ? '' : 'line-clamp-2',
                )}>
                  {cmd.description}
                </span>
              )}
            </li>
          ))}
        </ul>
      )}
    </div>
  );
 }
--- a/apps/web/src/components/ChatInput.tsx
+++ b/apps/web/src/components/ChatInput.tsx
@@ -1,5 +1,5 @@
 import { useCallback, useEffect, useMemo, useRef, useState, type DragEvent, type KeyboardEvent } from 'react';
-import { Globe, ListPlus, Send, Square } from 'lucide-react';
+import { Globe, ListPlus, Paperclip, Send, Square, SquareSlash } from 'lucide-react';
 import { toast } from 'sonner';
 import { Textarea } from '@/components/ui/textarea';
 import { Button } from '@/components/ui/button';
@@ -16,7 +16,6 @@ import { AttachmentPreviewModal } from '@/components/AttachmentPreviewModal';
 import { FileMentionPopover } from '@/components/FileMentionPopover';
 import { DropOverlay } from '@/components/DropOverlay';
 import { AgentPicker } from '@/components/AgentPicker';
 import { AgentCommandsHint } from '@/components/AgentCommandsHint';
 import { ContextBar } from '@/components/ContextBar';
 import { SlashCommandPicker, type SlashCommandGroup } from '@/components/SlashCommandPicker';
 import { isSlashCommandToken, parseSlashInput, slashQuery } from '@/lib/slash-command';
@@ -117,6 +116,11 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
  );
  const [fileIndex, setFileIndex] = useState<string[] | null>(null);
  const textareaRef = useRef<HTMLTextAreaElement | null>(null);
  // Attach-file button → hidden native picker (same File→Attachment path as drop).
  const fileInputRef = useRef<HTMLInputElement | null>(null);
  // Slash-commands chip → click-to-open command menu, anchored to the chip.
  const cmdChipRef = useRef<HTMLButtonElement | null>(null);
  const [cmdMenuOpen, setCmdMenuOpen] = useState(false);
  function addAttachment(a: Attachment) {
    setAttachments(prev => {
@@ -174,6 +178,23 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
    setAttachments(prev => prev.filter(a => a.id !== id));
  }
  // Attach-file button: funnel picked files through the same size/binary gate +
  // chip pipeline as drag-drop. Reset value so re-picking the same file fires.
  async function onPickFiles(e: React.ChangeEvent<HTMLInputElement>) {
    const files = Array.from(e.target.files ?? []);
    e.target.value = '';
    if (files.length === 0) return;
    let remaining = MAX_ATTACHMENTS - attachments.length;
    for (const file of files) {
      if (remaining <= 0) {
        toast.error(`Attachment limit reached (${MAX_ATTACHMENTS}).`);
        break;
      }
      await processDroppedFile(file);
      remaining -= 1;
    }
  }
  async function submit() {
    const text = value.trim();
    if (!text && attachments.length === 0) return;
@@ -576,9 +597,6 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
          ))}
        </div>
      )}
      {slashItems.length > 0 && (
        <AgentCommandsHint commands={slashItems} />
      )}
      {/* Batch 9 toolbar — agent picker + quick-toggle menu. v1.11.5.1
          inlines ContextBar in the same row so the bar lives next to the
          picker rather than as a separate header above it. The row renders
@@ -623,8 +641,34 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
            rows={3}
            className="resize-none min-h-[56px] max-h-[240px] border-0 bg-transparent px-3 pt-2.5 shadow-none focus-visible:ring-0 dark:bg-transparent"
          />
-          {/* bottom controls row: Web toggle on the left, Send/Stop on the right */}
+          {/* bottom controls row: attach + slash chip + Web on the left, Send/Stop on the right */}
          <div className="flex items-center gap-1.5 px-2 pb-2 pt-0.5">
            <input ref={fileInputRef} type="file" multiple className="hidden" onChange={onPickFiles} />
            <button
              type="button"
              onClick={() => fileInputRef.current?.click()}
              disabled={disabled || busy || attachments.length >= MAX_ATTACHMENTS}
              aria-label="Attach file"
              title="Attach file"
              className="inline-flex items-center justify-center rounded-full border border-border px-2.5 py-1 text-muted-foreground transition-colors hover:bg-muted hover:text-foreground disabled:opacity-50 max-md:min-h-[36px] max-md:min-w-[36px]"
            >
              <Paperclip className="size-3.5" />
            </button>
            {slashItems.length > 0 && (
              <button
                ref={cmdChipRef}
                type="button"
                onMouseDown={(e) => e.stopPropagation()}
                onClick={() => setCmdMenuOpen((v) => !v)}
                aria-expanded={cmdMenuOpen}
                aria-label="Slash commands"
                title="Slash commands"
                className="inline-flex items-center gap-1.5 rounded-full border border-border px-2.5 py-1 text-xs text-muted-foreground transition-colors hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground max-md:min-h-[36px] max-md:min-w-[36px]"
              >
                <SquareSlash className="size-3.5" />
                <span className="max-md:hidden">{slashItems.length}</span>
              </button>
            )}
            {sessionId && (
              <button
                type="button"
@@ -710,6 +754,21 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
          emptyLabel={slashGroups ? 'No commands available' : 'No skills available'}
        />
      )}
      {/* Slash-commands chip menu (click-opened); anchored to the chip. */}
      {cmdMenuOpen && slashItems.length > 0 && (
        <SlashCommandPicker
          query=""
          items={slashItems}
          groups={slashGroups}
          inputRef={cmdChipRef}
          onSelect={(name) => {
            setCmdMenuOpen(false);
            handleSlashSelect(name);
          }}
          onClose={() => setCmdMenuOpen(false)}
          emptyLabel={slashGroups ? 'No commands available' : 'No skills available'}
        />
      )}
    </div>
  );
 }
--- a/apps/web/src/hooks/useSessionStream.ts
+++ b/apps/web/src/hooks/useSessionStream.ts
@@ -1,7 +1,7 @@
 import { useEffect, useRef, useState } from 'react';
 import { toast } from 'sonner';
 import type { Message, WsFrame } from '@/api/types';
-import { WsFrameSchema } from '@/api/ws-frames';
+import { WsFrameSchema } from '@boocode/contracts/ws-frames';
 import { api } from '@/api/client';
 import { sessionEvents } from './sessionEvents';
 import { recordUsage } from './useChatThroughput';
--- a/apps/web/src/hooks/useUserEvents.ts
+++ b/apps/web/src/hooks/useUserEvents.ts
@@ -1,5 +1,5 @@
 import { useEffect } from 'react';
-import { WsFrameSchema } from '@/api/ws-frames';
+import { WsFrameSchema } from '@boocode/contracts/ws-frames';
 import { sessionEvents } from './sessionEvents';
 import { createWsReconnectToast } from './wsReconnectToast';
--- a/apps/web/src/main.tsx
+++ b/apps/web/src/main.tsx
@@ -7,7 +7,6 @@ import React from 'react';
 import ReactDOM from 'react-dom/client';
 import App from './App';
 import './styles/globals.css';
 ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
    <App />
--- a/boocode_roadmap.md
+++ b/boocode_roadmap.md
@@ -399,25 +399,11 @@ All tags `vMAJOR.MINOR.PATCH-slug`, monotonic per minor, assigned at ship time (
 -----
-## v2.4 — BooCoder as ACP agent (driveable from external editors)
+## v2.4 — BooCoder as ACP agent (driveable from external editors) — DROPPED
-**Status: not shipped.** This is a conceptual milestone, not yet built. The `v2.4.0`/`v2.4.1` *patch tags* shipped unrelated content (Unsloth Studio parser/HTML-to-md lift, llama-sidecar routing) — patch numbers are assigned at ship time and have outrun the milestone plan. The outbound ACP-agent surface below is still future work.
+**Status: DROPPED (Sam, 2026-06-02).** Will not be built. Sam only ever drives BooCoder through BooCoder's own surface (CoderPane) — never from an external editor (Zed/JetBrains/Avante/CodeCompanion). The outbound `boocoder acp` exposure solves a problem he doesn't have, so the milestone is closed as won't-do rather than deferred. (The `v2.4.0`/`v2.4.1` *patch tags* shipped unrelated content — Unsloth Studio parser/HTML-to-md lift, llama-sidecar routing — and never had anything to do with this milestone; patch numbers are assigned at ship time.)
-**Goal:** expose `boocoder acp` so Zed, JetBrains, Avante.nvim, CodeCompanion.nvim can drive BooCoder as their agent. Outbound exposure of the BooCoder write-tool surface to ACP-compatible editors.
+The original plan (kept for record): expose `boocoder acp` (JSON-RPC over stdio) so ACP-compatible editors could drive BooCoder's write-tool surface, pending-changes queue still gating writes; lift source `zed-industries/codex-acp` (Apache-2.0), ~400 LoC. If an external-editor use case ever appears, re-open from here.
 **Scope:**
 1. New ACP server entry point: `boocoder acp` reads JSON-RPC over stdio, exposes BooCoder's task primitives as ACP sessions.
 1. BooCoder UI features remain optional: editor drives session via ACP; pending-changes queue still gates writes; user can approve/reject from either BooCoder's web UI or the editor's permission dialog (whichever responds first).
 1. Same auth model as the rest of BooCoder — editor must be reachable on the Tailscale mesh, or BooCoder is invoked with a short-lived token.
 **Why v2.4, not v2.0:** outbound ACP-agent role is cheap once the inbound ACP-client side is implemented (same protocol library, server side), but it's a *different product surface* — driving BooCoder from external editors. Ship it after BooCoder's own surface stabilizes. (The v2.2 version number was used for the Paseo provider/dispatch batch shipped 2026-05-26.)
 **Lift source:** `zed-industries/codex-acp` (Apache-2.0) as a server-side ACP reference implementation.
 **Dependencies:** v2.0 + v2.2 (recommended; v2.1 runtime isolation optional).
 **Estimated:** ~400 LoC.
 -----
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -120,5 +120,5 @@ See [coder-backends.md](./coder-backends.md) for the full dispatch-backend refer
 ## Deploy topology
 - **BooChat + BooTerm + Postgres + codecontext:** `docker compose up --build -d` from `/opt/boocode`
- **BooCoder:** `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
+- **BooCoder:** `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
 - **Ports bind to Tailscale IP** `100.114.205.53`, not `0.0.0.0` — use that IP for host smoke curls
--- a/docs/DEFERRED-WORK.md
+++ b/docs/DEFERRED-WORK.md
@@ -152,16 +152,18 @@ These were explicitly scoped out of v2.3 (see `design.md` §11) and remain open:
 - **`provider_snapshot_updated` WS frame** — the loading state uses a capped client poll / one-shot refetch instead of a server-pushed frame (design §4.4, §11; tasks O.1).
 - **`available_agents.enabled` DB column** — `enabled` is read from the in-memory resolved registry only; no DB mirror, so settings state after a coder restart re-derives from the JSON config rather than the DB (design §3.3; tasks O.2).
- **Single-source-of-truth shared types package** — the provider snapshot types are duplicated across `apps/coder/.../provider-types.ts` and `apps/web/src/api/types.ts`, guarded by the text-identity `provider-types-parity.test.ts` rather than a shared package (see §3 below).
+- ~~**Single-source-of-truth shared types package**~~ — **shipped as `@boocode/contracts`** (branch `contracts-ssot-pkg`): all duplicated cross-app contracts (ws-frames schema, provider-snapshot types, provider-config schemas, `MessageMetadata`, `WorktreeRiskReport`) are now single-sourced in `packages/contracts/`; `provider-types-parity.test.ts` and the byte-parity test were deleted. See §3 below (now historical).
 - **MCP `list_providers` / `inspect_provider` tools** — provider introspection over MCP is not wired (design §11).
 ---
-## 3. Unified `packages/types` for provider snapshot JSON
+## 3. ~~Unified `packages/types` for provider snapshot JSON~~ (resolved — shipped as `@boocode/contracts`)
-### Current behavior
+> **Status: resolved.** All contracts described below are now single-sourced in `packages/contracts/` (`@boocode/contracts`), shipped on branch `contracts-ssot-pkg`. The rest of this section is historical.
-Provider snapshot shapes are **duplicated** (not byte-identical exports):
+### Former behavior (pre-`@boocode/contracts`)
 Provider snapshot shapes were **duplicated** (not byte-identical exports):
 | Location | Types |
 |----------|-------|
--- a/docs/STALE-DEPRECATED.md
+++ b/docs/STALE-DEPRECATED.md
@@ -74,4 +74,4 @@ _No open stale items from the 2026-05-26 review._
 - **Task cancel → abort external ACP/PTY child** — `AbortController` in dispatcher not wired to cancel route
 - **Skip ACP cold probe when DB models fresh** — perf; changes snapshot semantics
- **Unified `packages/types`** for provider snapshot JSON — Zod parity test may suffice
+- ~~**Unified `packages/types`** for provider snapshot JSON~~ — **shipped as `@boocode/contracts`** (`packages/contracts/`); all duplicated contracts are now single-sourced there
--- a/docs/coder-backends.md
+++ b/docs/coder-backends.md
@@ -140,7 +140,7 @@ No matter which backend runs, the turn streams the same way. Each backend emits
 | `apps/coder/.env.host` | Production env (DATABASE_URL, LLAMA_SWAP_URL, CODER_PROVIDERS_PATH, CLAUDE_SDK_BACKEND, …) |
 | `data/coder-providers.json` | Live runtime provider overrides (gitignored); template is `data/coder-providers.example.json` |
-**Build & deploy.** `apps/coder` imports the server's compiled `dist/` (`createInferenceRunner`, `createBroker`, `ALL_TOOLS`), so **`apps/server` must build first**: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. The server's `package.json` `exports` map needs both `types` and `default` conditions per subpath (and `declaration: true` in its tsconfig) or NodeNext can't find the `.d.ts` and tsc fails "Cannot find module" here. Agent dispatch spawns binaries **directly** — `spawn(fullBinaryPath, argsArray, { cwd })` using `install_path` — never `spawn('sh', ['-c', ...])`, which fails under systemd.
+**Build & deploy.** `apps/coder` imports from both `@boocode/contracts` and the server's compiled `dist/` (`createInferenceRunner`, `createBroker`, `ALL_TOOLS`), so **`packages/contracts` and `apps/server` must build first**: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. The server's `package.json` `exports` map needs both `types` and `default` conditions per subpath (and `declaration: true` in its tsconfig) or NodeNext can't find the `.d.ts` and tsc fails "Cannot find module" here. Agent dispatch spawns binaries **directly** — `spawn(fullBinaryPath, argsArray, { cwd })` using `install_path` — never `spawn('sh', ['-c', ...])`, which fails under systemd.
 ## Configuration
@@ -267,7 +267,7 @@ interface AgentBackend {
 `AcpToolSnapshot` (`apps/coder/src/services/acp-tool-snapshot.ts`) is the accumulating shape for a tool call — `{ toolCallId, title, kind?, status?, rawInput?, rawOutput? }` — merged incrementally and rendered via `snapshotToWireToolCall`.
-The provider picker is driven by `ProviderSnapshotEntry` / `AgentCommand` in `apps/coder/src/services/provider-types.ts`, which must stay byte-identical to the web copy in `apps/web/src/api/types.ts` (see Testing).
+The provider picker is driven by `ProviderSnapshotEntry` / `AgentCommand` single-sourced in `@boocode/contracts` (`packages/contracts/src/provider-snapshot.ts`). `apps/coder/src/services/provider-types.ts` re-exports them; the web imports them directly. There is no hand-synced copy to keep in sync.
 ### Constants
@@ -332,7 +332,7 @@ The picker is built by a four-stage pipeline: `provider-config.ts` (never-throws
 - `services/backends/__tests__/turn-guard.test.ts` — abort orphan-terminal suppression
 - `services/backends/__tests__/lifecycle-decisions.test.ts` — idle/LRU/restart eviction
 - `services/__tests__/acp-event-map.test.ts` / `acp-tool-snapshot.test.ts` — ACP normalization + snapshot merge
- `services/__tests__/provider-types-parity.test.ts` — text-identity parity between `provider-types.ts` and the web `api/types.ts` copy (compile-time cross-import is blocked by TS6307 on web's composite tsconfig)
+- `services/__tests__/provider-types-parity.test.ts` — **deleted**: provider snapshot types are now single-sourced in `@boocode/contracts/provider-snapshot`; parity is enforced by the single definition
 - `services/__tests__/write_guard.test.ts` (+ `_fuzz`) — path escape + secret-file blocking
 ### Adding a new backend
--- a/docs/coding-standards/cross-app-contract-parity.md
+++ b/docs/coding-standards/cross-app-contract-parity.md
@@ -1,7 +1,8 @@
 ---
 paths:
-  - "apps/server/src/types/ws-frames.ts"
+  - "packages/contracts/src/ws-frames.ts"
-  - "apps/web/src/api/ws-frames.ts"
+  - "packages/contracts/src/provider-snapshot.ts"
  - "packages/contracts/src/message-metadata.ts"
  - "apps/server/src/types/api.ts"
  - "apps/web/src/api/types.ts"
  - "apps/coder/src/services/provider-types.ts"
@@ -11,7 +12,7 @@ paths:
 # Cross-App Contract Parity
-*Reach for this when a parity test goes red (`ws-frames.test.ts`, `provider-types-parity.test.ts`), a reviewer flags a "half-synced" type, or a frame/sentinel "does nothing" at runtime — i.e. one copy of a duplicated cross-app contract drifted from the other. The fix-it path is [When to Apply](#when-to-apply) + its Verification step.*
+*Reach for this when adding a new WS frame type, a new sentinel `kind`, or touching the server's `InferenceFrame` loose union or the web's strict `WsFrame` union — contracts whose primary definition now lives in `@boocode/contracts` but that still have split secondary representations in the apps. The fix-it path is [When to Apply](#when-to-apply) + its Verification step.*
 - **Status:** proposed
 - **Date Created:** 2026-06-02 00:00
@@ -20,19 +21,19 @@ paths:
  - indifferentketchup (samkintop@gmail.com)
 - **Reviewers:**
 - **Applies To:**
-  - Every hand-synced type/schema contract that crosses the `apps/server` ↔ `apps/web` ↔ `apps/coder` boundary in the files under `paths:`. The primary examples are the WS-frame Zod schema, the provider-snapshot types, and the sentinel `MessageMetadata` union plus its `MessageBubble` render arm — but the same rule governs the other duplicated pairs in these files (`WorktreeRiskReport`, the provider-config wire types, and the interface-typed `WsFrame` union that mirrors the Zod schema).
+  - The remaining split contracts that cross app boundaries: the server's `InferenceFrame` loose publish union (`services/inference/turn.ts`), the web's strict `WsFrame` discriminated union (`apps/web/src/api/types.ts`), and the `MessageBubble` sentinel render arm (`apps/web/src/components/MessageBubble.tsx`). The primary contracts (WS-frame Zod schema, provider-snapshot types, `MessageMetadata`, `WorktreeRiskReport`, provider-config schemas) are now **single-sourced in `@boocode/contracts`** — those are governed by editing the package, not by this sync-both-copies rule.
 ## Introduction
-Several wire contracts in BooCode exist as **two or three hand-synced copies** in different apps, because the apps have separate `tsconfig`s with no shared path alias and a composite-project restriction (TS6307) that structurally blocks importing one app's types from another. There is no shared workspace package for these types yet. This standard governs what you must do when you touch one of those copies: **change every copy in the same commit** — and, where the contract has no compile-time consumer guarantee (the sentinel render arm), the consumer too.
+The primary wire contracts (WS-frame Zod schema, provider-snapshot types, provider-config schemas, `MessageMetadata`, `WorktreeRiskReport`, `AgentSessionConfig`) are now **single-sourced in `@boocode/contracts`** — edit `packages/contracts/src/<subpath>.ts` and rebuild the package. There is no second copy to sync for those contracts.
-The three families in [Coding Standard](#coding-standard) are the primary examples, but the rule applies to **every** hand-synced pair in the files under `paths:`, each of which carries its own in-code `edit both copies` / `Mirror of …` / `KEEP IN SYNC` marker. Beyond the three: `WorktreeRiskReport` (`apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`), the provider-config wire types (`ProviderOverride` / `CoderProvidersFile`, web mirror of the coder's Zod-inferred shapes), and — note this one — a **second** representation of the WS wire shape: the interface-typed `WsFrame` union in `apps/web/src/api/types.ts` plus the `*Frame` interfaces in `apps/server/src/types/api.ts`, which is distinct from the byte-identical Zod `ws-frames.ts` pair and is **not** covered by the byte-parity test. A WS frame's shape therefore lives in more than one place; treat all of them as one contract.
+What this standard still governs: adding a new WS frame type or a new sentinel `kind` touches contracts whose primary definition is in the package but that also have split secondary representations in the apps. The server's `InferenceFrame` loose publish union (`services/inference/turn.ts`) and the web's strict `WsFrame` discriminated union (`apps/web/src/api/types.ts`) both still exist separately from the canonical `WsFrameSchema` in the package, and must be updated together with it. The sentinel `MessageBubble` render arm (`apps/web/src/components/MessageBubble.tsx`) has no compile-time guard and still needs updating when a new `MessageMetadata` kind is added to the package.
 ### Purpose
 - **Primary:** prevent *silent runtime* contract breakage. Nothing at compile time links the copies — each app type-checks against its own copy, so `tsc` stays green when they drift. The failure surfaces only at runtime, and silently: a WS frame whose `type` exists on one side but not the other is **dropped at JSON-parse** with no error; a sentinel `kind` added without a render arm shows nothing. Editing every copy in lockstep is the only thing that keeps the contract whole.
- **Secondary:** two of the three contracts have runtime parity tests (`ws-frames.test.ts`, `provider-types-parity.test.ts`) that catch drift in the test run — but they are a backstop, not the mechanism, and the sentinel triple has no test at all.
+- **Secondary:** the `@boocode/contracts` package's `ws-frames.test.ts` tests schema correctness (accept/reject behavior) and the `KNOWN_FRAME_TYPES` drift probe. `provider-types-parity.test.ts` was deleted when the provider-snapshot types moved to the package. The sentinel render arm still has no automated test.
- **Side effect:** keeping the copies byte- or text-identical makes a contract change reviewable as a matched diff across files.
+- **Side effect:** for the remaining split representations (server `InferenceFrame`, web `WsFrame`), updating them together with the package source in a single commit makes the change reviewable as a matched diff across files.
 ### Scope
@@ -43,33 +44,33 @@ The specific duplicated contracts listed in `paths:` above, inside the `apps/ser
 Walk this before editing a type, schema, enum, or metadata union:
 1. **Does this shape exist as a copy in another app?** — Check: `grep -rn "<TypeOrFieldName>" apps/*/src`. If it appears under two or more of `apps/server`, `apps/web`, `apps/coder` → continue. If it lives in exactly one app → see "When NOT to Apply".
-2. **Are you changing its wire shape?** — adding, removing, renaming, or re-typing a field; adding/removing a frame `type`; adding an enum value or a sentinel `kind`. If yes → apply this standard: edit **every** copy, plus every consumer that switches on the shape, in the **same commit**. If no (a comment or formatting change that the contract's parity test normalizes away) → see "When NOT to Apply".
+2. **Are you changing its wire shape?** — adding, removing, renaming, or re-typing a field; adding/removing a frame `type`; adding an enum value or a sentinel `kind`. If yes → apply this standard: update the `@boocode/contracts` package source, rebuild, and also update every secondary app-side representation (the server `InferenceFrame` loose union, the web `WsFrame` strict union, and the `MessageBubble` render arm for sentinel kinds), in the **same commit**. If no (a comment or formatting change) → see "When NOT to Apply".
-**Exception — the sentinel/consumer triple:** `MessageMetadata` (`apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`) has **no parity test**, and a new `kind` is inert until it gets a render branch in `apps/web/src/components/MessageBubble.tsx`. When the shape you are editing is `MessageMetadata`, "every copy" includes that render arm — there is no test to remind you.
+**Exception — the sentinel render arm:** `MessageMetadata` is single-sourced in `@boocode/contracts/message-metadata` (one definition, no second copy). However, a new `kind` is inert until it gets a render branch in `apps/web/src/components/MessageBubble.tsx`. There is no test to catch a missing render arm — when adding a `kind`, the render branch in `MessageBubble.tsx` is the one consumer you must update manually.
 **Verification step:** run the guards that exist *now*, before you commit:
 ```bash
-# The trailing arg is a FILE-PATH substring filter for `vitest run` (not a test
+# Build the contracts package first, then run its tests.
-# name). A typo matches zero files and still exits 0 — a false green — so confirm
+pnpm -C packages/contracts build
-# the run actually executed the file (look for "1 passed" on the named file).
+pnpm -C packages/contracts test ws-frames.test     # WS-frame schema correctness + KNOWN_FRAME_TYPES drift
-pnpm -C apps/server test ws-frames.test            # WS-frame byte-parity + KNOWN_FRAME_TYPES drift
+# provider-types-parity.test.ts was deleted — provider snapshot types are in @boocode/contracts
-pnpm -C apps/coder  test provider-types-parity     # provider-snapshot text-parity (incl. nested blocks)
+
-# Sentinel triple has no test — grep all copies for a NEW rendering kind:
+# Sentinel render arm has no test — grep for the new kind in the package definition and MessageBubble:
-grep -rn "<new-kind>" apps/server/src/types/api.ts apps/web/src/api/types.ts apps/web/src/components/MessageBubble.tsx
+grep -rn "<new-kind>" packages/contracts/src/message-metadata.ts apps/web/src/components/MessageBubble.tsx
 ```
-For a **rendering** sentinel kind (`cap_hit` / `doom_loop` / `mistake_recovery`) the new `kind` must appear in all three files. The non-rendering `error` arm of `MessageMetadata` lives in the two type copies only — it has no `MessageBubble` branch — so for it the grep should match the two `api.ts`/`types.ts` copies, not `MessageBubble.tsx`.
+For a **rendering** sentinel kind (`cap_hit` / `doom_loop` / `mistake_recovery`) the new `kind` must appear in `packages/contracts/src/message-metadata.ts` and have a render branch in `MessageBubble.tsx`. The non-rendering `error` arm has no `MessageBubble` branch — for it, only the package definition needs updating.
 ## When NOT to Apply
 - **The type lives in a single app.** Internal server types, web-only view models, coder-only helpers — there is no second copy, so there is nothing to sync. Edit the one definition directly; do **not** manufacture a duplicate in another app "for symmetry." A new cross-app contract should prefer the eventual shared package or, at minimum, ship with its own parity test — not a third hand-synced copy.
- **A comment- or whitespace-only edit to a *text-parity* file.** `provider-types-parity.test.ts` strips comments and blank lines before comparing, so a comment-only change to one provider-types copy is tolerated and you needn't chase the other. (This relief does **not** apply to `ws-frames.ts`, which is compared **byte-for-byte** — every character, including comments, must match.)
+- **A comment- or whitespace-only edit to a `@boocode/contracts` source file.** The package's `ws-frames.test.ts` tests behavior and structure, not formatting, so a comment-only change does not require secondary app-side updates.
- **The shared workspace package lands.** This standard exists *only* because the single source of truth was deferred (a Tier-2 follow-up noted in `provider-types-parity.test.ts`). Once these types move into one shared package, delete the hand-syncing rule rather than keep paying it — the SSOT supersedes this standard.
+- **The type lives in `@boocode/contracts` and has no split secondary representations.** For the migrated contracts (WS-frame Zod schema, provider-snapshot, provider-config, `MessageMetadata`, `WorktreeRiskReport`) the package IS the single source of truth. Edit the package source; there is nothing to sync in the apps. The split representations that still require multi-file changes are the server `InferenceFrame` loose union, the web `WsFrame` strict union, and the `MessageBubble` render arm — all documented in [When to Apply](#when-to-apply).
 ## Background
-The duplication is deliberate, not accidental. A compile-time bidirectional-assignability check was attempted first — a web-side file importing the coder's import-free `provider-types.ts` — but `apps/web/tsconfig.app.json` is a composite project and rejects out-of-include files with **TS6307**, so cross-project type import is structurally blocked. The team chose hand-synced copies guarded by runtime tests over a premature shared package. The WS-frame copies go further and are kept **byte-identical** so a single `readFileSync` equality test can guard them; the provider-snapshot copies are kept **text-identical per named type block** (comments normalized away) because they sit among unrelated types. The cost of this choice is exactly what this standard manages: a copy can drift, and because each app compiles independently, only a runtime test — or a runtime bug — reveals it.
+*The history below explains why the duplication existed, not the current state.* The primary contracts (WS-frame Zod schema, provider-snapshot types, provider-config schemas, `MessageMetadata`, `WorktreeRiskReport`) were formerly hand-synced copies guarded by runtime parity tests — because `apps/web/tsconfig.app.json` is a composite project and rejects out-of-include files with **TS6307**, blocking cross-project type import. The WS-frame copies were kept **byte-identical** so a single `readFileSync` equality test could guard them; provider-snapshot copies were kept **text-identical per named block**. These contracts have since moved to `@boocode/contracts` (the `packages/contracts` workspace package), the hand-sync discipline and the byte-parity test are retired, and drift is prevented by there being exactly one definition. The split representations that remain (server `InferenceFrame` loose union, web `WsFrame` strict union, `MessageBubble` render arm) still require lockstep edits when frame types or sentinel kinds are added — that is what this standard now governs.
 ## Coding Standard
@@ -77,41 +78,35 @@ The duplication is deliberate, not accidental. A compile-time bidirectional-assi
 When you change one copy of a duplicated contract, change the others in the same commit. Each contract family has its own home files and its own (or no) guard.
-**WS frame schema — `apps/server/src/types/ws-frames.ts` ↔ `apps/web/src/api/ws-frames.ts` (byte-identical):**
+**WS frame schema — single-sourced at `packages/contracts/src/ws-frames.ts` (imported as `@boocode/contracts/ws-frames`):**
 ```typescript
-// PRIMARY: no compile-time link exists across apps (separate tsconfigs, TS6307
+// Single source of truth: WsFrameSchema (Zod), WsFrame (z.infer), KNOWN_FRAME_TYPES.
-// blocks cross-import). A frame type added to one copy but not the other breaks
+// No second copy to sync. When adding a frame type:
-// silently at runtime — the frontend drops the frame at JSON-parse. So this file
+//   1. Add it here (the canonical schema definition) and rebuild the package.
-// and apps/web/src/api/ws-frames.ts MUST stay byte-identical, in the same commit.
+//   2. Also add to the server's InferenceFrame loose union (services/inference/turn.ts).
-//
+//   3. Also add to the web's strict WsFrame discriminated union (apps/web/src/api/types.ts).
-//   IMPORTANT: This file is duplicated byte-identical at
+// Adding a frame also means adding its `type` to KNOWN_FRAME_TYPES — the package
-//   apps/web/src/api/ws-frames.ts. ... If you change one, change the other.
+// test probes every entry for a discriminated branch.
 //
 // Adding a frame also means adding its `type` to KNOWN_FRAME_TYPES (a drift test
 // probes every entry for a discriminated branch).
 ```
-**Provider snapshot types — `apps/coder/src/services/provider-types.ts` ↔ `apps/web/src/api/types.ts`, text-identical per block.** By convention you author on the coder side and mirror to web (the in-code `KEEP IN SYNC` markers point that way), but the parity test is **symmetric** — it fails on drift in *either* file and names no authoritative copy, so "fix the red test" means re-sync the two, not edit one in particular:
+**Provider snapshot types — single-sourced at `packages/contracts/src/provider-snapshot.ts` (imported as `@boocode/contracts/provider-snapshot`):**
 ```typescript
-// PRIMARY: nothing links these two copies at compile time — a field added here
+// Single source of truth for ProviderSnapshotEntry, ProviderModel, ProviderMode,
-// but not in apps/web/src/api/types.ts breaks silently at runtime (the web side
+// ThinkingOption, AgentCommand, ProviderSnapshotStatus. No second copy to sync.
-// drops or mis-reads the snapshot). The in-file marker, with its test backstop:
+// apps/coder/src/services/provider-types.ts re-exports from this package.
-//   KEEP IN SYNC with apps/web/src/api/types.ts ProviderSnapshotEntry — parity
+// provider-types-parity.test.ts was deleted — drift is prevented by the single definition.
 //   is enforced by __tests__/provider-types-parity.test.ts (fails on field drift).
 // Applies to the nested ProviderModel / ProviderMode / ThinkingOption /
 // AgentCommand / ProviderSnapshotStatus blocks the entry references, too.
 export interface ProviderSnapshotEntry { /* ...fields... */ }
 ```
-**Sentinel metadata — `apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`, plus the render arm in `apps/web/src/components/MessageBubble.tsx` (no parity test):**
+**Sentinel metadata — `packages/contracts/src/message-metadata.ts` (single source) plus the render arm in `apps/web/src/components/MessageBubble.tsx` (no automated test):**
 ```typescript
-// A new *rendering* sentinel kind is a THREE-file change with NO test to catch a miss:
+// MessageMetadata is single-sourced in @boocode/contracts/message-metadata.
-//   1. apps/server/src/types/api.ts  — add the arm to MessageMetadata
+// A new *rendering* sentinel kind is a TWO-step change with NO test to catch a miss:
-//   2. apps/web/src/api/types.ts     — add the identical arm
+//   1. packages/contracts/src/message-metadata.ts — add the arm to MessageMetadata, rebuild
-//   3. MessageBubble.tsx             — add the render branch, else it shows nothing
+//   2. MessageBubble.tsx             — add the render branch, else it shows nothing
 // The real union has FOUR arms; show it whole so nobody reads two as the full set:
 export type MessageMetadata =
  | { kind: 'cap_hit';          /* used, limit, agent_name, can_continue */ }
@@ -129,37 +124,37 @@ export type MessageMetadata =
 **What to avoid:**
 ```typescript
-// ANTI-PATTERN: editing one copy only.
+// ANTI-PATTERN: editing only the package schema without updating the app-side representations.
-// Add a new frame type to apps/web/src/api/ws-frames.ts but not the server copy
+// Add a new frame type to packages/contracts/src/ws-frames.ts but not to the
-// (or vice versa): tsc stays green — they're separate projects — but the parity
+// web's strict WsFrame union (apps/web/src/api/types.ts): tsc stays green because
-// test fails, and had it not existed, the server would publish a frame the
+// they're separate projects, but the frontend silently discards the frame at
-// frontend silently discards at JSON-parse. A half-edited contract is invisible
+// JSON-parse. A half-edited contract is invisible to the type-checker; never land one.
 // to the type-checker; never land one.
 ```
 **Project references:**
- `apps/server/src/types/ws-frames.ts` — the byte-identical sync comment (top of file) and `KNOWN_FRAME_TYPES`.
+- `packages/contracts/src/ws-frames.ts` — `WsFrameSchema`, `WsFrame`, `KNOWN_FRAME_TYPES` (via `@boocode/contracts/ws-frames`). The two former app-level `ws-frames.ts` copies are deleted.
- `apps/web/src/api/ws-frames.ts` — the web copy that must match it byte-for-byte.
+- `packages/contracts/src/provider-snapshot.ts` — provider snapshot types (via `@boocode/contracts/provider-snapshot`). The former web mirror block in `apps/web/src/api/types.ts` and the former coder copy in `provider-types.ts` are deleted; `provider-types.ts` now re-exports from the package.
- `apps/coder/src/services/provider-types.ts` — the `KEEP IN SYNC` comment above `ProviderSnapshotEntry`.
+- `packages/contracts/src/message-metadata.ts` — `MessageMetadata`, `ErrorReason` (via `@boocode/contracts/message-metadata`). The former copies in `apps/server/src/types/api.ts` and `apps/web/src/api/types.ts` are deleted.
- `apps/web/src/api/types.ts` — the provider-snapshot wire copy and the `MessageMetadata` copy.
+- `apps/web/src/api/types.ts` — the web-local strict `WsFrame` discriminated union (still maintained separately from the canonical schema in the package).
 - `apps/web/src/components/MessageBubble.tsx` — the sentinel render arms (`metadata?.kind` branches).
 ### A wire-shape change passes through the gate, then a consumer
 A frame is published by the server's permissive `InferenceFrame` union (`apps/server/src/services/inference/turn.ts`) but only reaches the UI if the strict schema/union accepts it — permissive publish, strict receive. Keep the **type/schema copies** (this standard's scope) in lockstep so the frame survives validation; then make sure something consumes it.
-> **Where consumer-wiring fits.** This standard governs the duplicated *type/schema* copies and the one consumer with no compile-time guard — the sentinel `MessageBubble` render arm. A new WS frame additionally needs a runtime handler to *do* anything: `applyFrame` in `apps/web/src/hooks/useSessionStream.ts` (per-session frames) and `useUserEvents` (user-channel frames), plus the sidebar reducer. That wiring — and the event-dedup discipline around it — is governed by `apps/web/CLAUDE.md`, not by this parity standard. A frame that passes the byte-parity test but has no reducer `case` validates and is then silently ignored.
+> **Where consumer-wiring fits.** This standard governs the duplicated *type/schema* copies and the one consumer with no compile-time guard — the sentinel `MessageBubble` render arm. A new WS frame additionally needs a runtime handler to *do* anything: `applyFrame` in `apps/web/src/hooks/useSessionStream.ts` (per-session frames) and `useUserEvents` (user-channel frames), plus the sidebar reducer. That wiring — and the event-dedup discipline around it — is governed by `apps/web/CLAUDE.md`, not by this parity standard. A frame that passes `WsFrameSchema` validation but has no reducer `case` is then silently ignored.
 **Correct usage:**
 ```typescript
 // Adding a WS frame type, all in one commit:
 //   - packages/contracts/src/ws-frames.ts         — WsFrameSchema + WsFrame + KNOWN_FRAME_TYPES (rebuild package)
 //   - apps/server/src/services/inference/turn.ts  — loose InferenceFrame publish union (+ optional fields)
-//   - apps/server/src/types/ws-frames.ts          — strict WsFrameSchema + WsFrame + KNOWN_FRAME_TYPES
+//   - apps/web/src/api/types.ts                   — strict WsFrame discriminated union (the web gate)
-//   - apps/web/src/api/ws-frames.ts               — byte-identical copy of the strict gate
+// The web strict WsFrame is the wire-format gate: a frame whose type isn't in
 // The strict web-side type is the wire-format gate: a frame whose type isn't in
 // it is dropped at JSON-parse. The loose publish union and the strict gate are
-// BOTH required — permissive publish, strict receive.
+// BOTH required — permissive publish, strict receive. The canonical schema in the
 // package is validated by the broker fail-closed (Zod) on every publish.
 ```
 **What to avoid:**
@@ -172,25 +167,26 @@ A frame is published by the server's permissive `InferenceFrame` union (`apps/se
 ```
 **Project references:**
 - `packages/contracts/src/ws-frames.ts` — `WsFrameSchema` (the broker's fail-closed validation gate) + `WsFrame` + `KNOWN_FRAME_TYPES`.
 - `apps/server/src/services/inference/turn.ts` — the loose `InferenceFrame` publish union.
- `apps/server/src/types/ws-frames.ts` — `WsFrameSchema` (the broker's fail-closed validation gate) + `KNOWN_FRAME_TYPES`.
+- `apps/web/src/api/types.ts` — the web-local strict `WsFrame` discriminated union.
 - `apps/web/src/components/MessageBubble.tsx` — the consumer for sentinel `MessageMetadata` kinds.
-### Sync the copies; never weaken the parity test
+### Keep the package tests; never weaken them
-When a parity test fails, the fix is to make the copies match — not to make the test stop checking. The corollary also holds: when you add a **new** nested type that `ProviderSnapshotEntry` references, add its name to the `names` array in `provider-types-parity.test.ts`, or the new type is hand-synced but **unguarded**.
+When a package test fails, the fix is to update the package source — not to delete the assertion or skip the test. When adding a **new** nested type to `ProviderSnapshotEntry`, add it to `packages/contracts/src/provider-snapshot.ts` — there is no `names` array to update since `provider-types-parity.test.ts` was deleted; the single definition is the guard.
 **What to avoid:**
 ```typescript
-// ANTI-PATTERN: a red parity test "fixed" by deleting the assertion, skipping
+// ANTI-PATTERN: "fixing" a red package test by deleting the assertion, skipping
-// the it(), or trimming a type out of the compared `names` list. That converts a
+// the it(), or removing a frame type from the tested set. That converts caught
-// caught drift into a shipped, silent contract break. Re-sync the copies instead.
+// drift into a shipped, silent contract break. Fix the package source instead.
 ```
 **Project references:**
- `apps/server/src/services/__tests__/ws-frames.test.ts` — `ws-frames.ts file mirror parity` (byte-identical) and the `KNOWN_FRAME_TYPES` drift probe.
+- `packages/contracts/src/__tests__/ws-frames.test.ts` — schema correctness (accept/reject) and the `KNOWN_FRAME_TYPES` drift probe. (The former server-side `ws-frames.ts file mirror parity` byte-identical test was deleted when the schema moved to the package.)
- `apps/coder/src/services/__tests__/provider-types-parity.test.ts` — text-identity of each shared block across the coder ↔ web copies.
+- `apps/coder/src/services/__tests__/provider-types-parity.test.ts` — **deleted**: provider snapshot types moved to `@boocode/contracts/provider-snapshot`.
 ## Additional Resources
@@ -198,8 +194,8 @@ When a parity test fails, the fix is to make the copies match — not to make th
 - [BooCoder Dispatch Backends](../coder-backends.md) — the provider-snapshot contract and the WS-frame mapping in their runtime context (see "Core Types" and the parity notes).
 - [Architecture overview](../ARCHITECTURE.md) — the three surfaces and the shared database the contracts cross.
- Root `CLAUDE.md` → "Conventions" — the cross-app contract rules (WS frame, sentinels, provider-type parity, JSONB) this standard formalizes.
+- Root `CLAUDE.md` → "Conventions" — the cross-app contract rules (WS frame schema, sentinels, provider snapshot types, JSONB) this standard formalizes; updated to reflect `@boocode/contracts` SSOT.
- `apps/server/CLAUDE.md` (`services/broker.ts`) and `apps/coder/CLAUDE.md` — per-app notes on the broker validation and the provider-type mirror.
+- `apps/server/CLAUDE.md` (`services/broker.ts`) — broker validation against the `@boocode/contracts` schema. `apps/coder/CLAUDE.md` — `provider-types.ts` re-exports from the package.
 ### External Resources
--- a/docs/features/git-diff-panel/artifacts/decision-log.md
+++ b/docs/features/git-diff-panel/artifacts/decision-log.md
@@ -0,0 +1,361 @@
 # Decision log — Git diff panel
 Decisions behind [`feature-specification.md`](../feature-specification.md). Full decisions carry rationale,
 evidence, and rejected alternatives; trivial decisions are one-liners. Shared D# counter.
 ## Full decisions
 ### D1 — Placement: a tab inside the file browser
 **Question:** Where does the diff view live in the workspace?
 **Decision:** The diff view lives in the right-side file panel as a Files / Git tab, occupying the same
 slot as the file tree, rather than as a new standalone workspace pane.
 **Rationale:** The request was "instead of the file browser," and the file browser is a right-side sidebar,
 not a workspace-grid pane. The reference design (Paseo) puts Changes / Files tabs in one sidebar slot. A
 new workspace pane would require new pane-kind plumbing for an affordance the user described as a
 replacement, not an addition.
 **Evidence:** User answer (2026-06-02). Codebase: the file browser is the right-rail sidebar; the legacy
 "file_browser" pane kind is unused. Paseo `explorer-sidebar.tsx` (Changes/Files tabs in one slot).
 **Rejected alternatives:**
 - A standalone git-diff workspace pane — rejected: it is an addition, not a replacement, and adds pane
  plumbing the user did not ask for.
 **Driven by findings:** —
 **Linked technical notes:** —
 **Dependent decisions:** D9, D10.
 **Referenced in spec:** Actors and triggers, Primary flow, User interactions.
 ### D2 — Scope: the project repository, with two comparison modes
 **Question:** What repository and comparison modes does the panel cover?
 **Decision:** The panel shows the **project repository's** changes, with a selector between **Uncommitted**
 (working tree vs. last commit) and **Committed** (current branch vs. its upstream tracking branch when
 set, otherwise the repository's default branch). It does not show the session agent's separate
 working-copy diff — that remains the pending-changes panel's job. In Committed mode the view labels the
 base it resolved ("Git — branch vs &lt;base&gt;"); when no base resolves the panel falls back to
 Uncommitted and labels the mode as a fallback.
 **Rationale:** The file browser is scoped to the project repository, so the diff "instead of" it should
 share that scope. The user chose both comparison modes (Paseo-style) over a single mode. The agent
 working-copy diff is already surfaced by the pending-changes panel; duplicating it here would create two
 overlapping surfaces. Labeling the base prevents silent ambiguity (F11).
 **Evidence:** User answer (2026-06-02, "Both, with a selector"). Codebase: the file browser and project
 git-metadata are scoped to the project path; agent worktree diffs already flow to the pending-changes
 panel. Paseo uncommitted/committed mode selector. F11 (base unlabeled finding).
 **Rejected alternatives:**
 - Project working tree only — rejected: user wanted both modes.
 - Session agent worktree — rejected: overlaps the pending-changes panel and is not the file browser's scope.
 - No base labeling — rejected (F11): the base is ambiguous between upstream tracking branch and default
  branch; unlabeled output invites confusion.
 **Driven by findings:** F11.
 **Linked technical notes:** —
 **Dependent decisions:** D3, D4, D5, D11, D13.
 **Referenced in spec:** Outcome, Primary flow, Edge cases and failure modes, Coordinations.
 ### D3 — Mode auto-selection and session pinning
 **Question:** How is the initial mode chosen and how does it behave across refreshes?
 **Decision:** Auto mode-selection applies on first open only: Uncommitted when the working tree is dirty,
 Committed when it is clean. Once the user selects a mode explicitly it is pinned for the session;
 refreshes do not override it. If a refresh would change the auto-selected mode (e.g. the tree went clean
 while Uncommitted was pinned), the panel briefly notes the change rather than swapping silently.
 **Rationale:** Paseo's convention is auto-select by state. However, a refresh-triggered silent mode swap
 would dislocate the user's view without warning — they could be mid-review and suddenly see a different
 file list (F12). Pinning after explicit selection preserves the user's intent.
 **Evidence:** Paseo convention (auto-select by dirty state). F12 (silent-dislocation finding;
 design-judgment resolution).
 **Rejected alternatives:**
 - Always auto-select on every refresh — rejected (F12): silently dislocates the view when the tree state
  changes mid-session.
 - No auto-selection (always start in a fixed mode) — rejected: ignores the most useful starting point.
 **Driven by findings:** F12.
 **Linked technical notes:** —
 **Dependent decisions:** D14.
 **Referenced in spec:** Primary flow, Alternate flows and states.
 ### D5 — Binary and large-file handling
 **Question:** What does the panel show for files it cannot diff or whose diff is too large?
 **Decision:** Binary files show a "Binary file" placeholder instead of a diff body. Files over a display-
 size threshold show "Diff too large to display" in place of the diff body. A git read that does not
 complete within a deadline exits the loading state, shows an error, and offers Refresh. The total
 response payload is bounded so a huge change set cannot stall the panel.
 **Rationale:** Paseo-style caps prevent the panel from hanging or overflowing on large repos. The read-
 deadline (F7) is a distinct concern from the large-result cap: a slow git process can stall the panel
 even when individual files are small.
 **Evidence:** Paseo codebase (display-size caps). F7 (hanging git-read finding; evidence-backed addition).
 **Rejected alternatives:**
 - No cap — rejected: a huge change set can stall or overflow the panel.
 - Combine deadline and size cap into one mechanism — rejected (F7): they address different failure modes
  (slow process vs. large output); both are needed.
 **Driven by findings:** F7.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Edge cases and failure modes.
 ### D6 — v1 actions: stage, unstage, commit, discard (no push)
 **Question:** Which write actions are included in v1, and in which modes are they available?
 **Decision:** v1 includes staging/unstaging files, committing staged files with a message, and discarding a
 file's changes — all available only in Uncommitted mode. Committed mode is read-only review with no write
 actions. Pushing, pulling, PRs, and merges are excluded. Commit author/committer identity is derived
 server-side; the request cannot set or influence it.
 **Rationale:** The user chose to include stage/commit over a read-only view. Remote operations were not
 requested and the assistant-level rule already treats remote writes as out of band, so v1 stops at local
 history. Allowing write actions in Committed mode would mean reverting committed history (per-file resets
 of committed commits), which was not requested and creates a different risk profile. Committer identity
 must be server-derived to prevent the request body from spoofing authorship (F3).
 **Evidence:** User answer (2026-06-02, "Include stage/commit"). Convention: the assistant cannot push to
 remotes (project docs) — signals remotes are deliberately out of band. F3 (committer-identity
 finding). F14 (mode-scoping finding; design-judgment resolution).
 **Rejected alternatives:**
 - Read-only review — rejected by the user.
 - Full git actions incl. push/pull/PR (Paseo's set) — deferred (YAGNI): not requested.
 - Write actions in Committed mode too — rejected (F14): reverting committed history is a distinct,
  unrequested capability with a different risk profile.
 - Request-supplied commit identity — rejected (F3): allows spoofing; server-derived identity is the only
  safe source.
 **Driven by findings:** F3, F14.
 **Linked technical notes:** —
 **Dependent decisions:** D7, D8, D12.
 **Referenced in spec:** Primary flow, User interactions.
 ### D7 — Discard requires a plain confirmation
 **Question:** What confirmation does discard require, and what wording?
 **Decision:** Discarding a file's changes prompts a plain Cancel / Discard confirmation with wording that
 distinguishes the two cases: "Discard changes to X?" for a tracked file (reverts to committed content)
 and "Delete X? It has never been committed and cannot be recovered" for an untracked file (permanently
 removed). Stage, unstage, and commit do not prompt.
 **Rationale:** Discard is the only irreversible action in the set; a confirmation guards an accidental tap,
 especially on mobile. The project's stated preference is plain confirm dialogs, never type-the-name
 patterns. A tracked revert and an untracked permanent delete are different losses — the user deserves to
 know which one they are confirming (F4).
 **Evidence:** Convention: destructive actions use plain Cancel/Confirm dialogs (no type-to-confirm).
 Stage/unstage/commit are reversible (commits can be amended/reset), so they need no prompt. F4
 (tracked-vs-untracked and affordance-separation finding; design-judgment resolution).
 **Rejected alternatives:**
 - No confirmation — rejected: irreversible data loss on a stray tap.
 - Type-the-filename-to-confirm — rejected: against the project's confirmation convention.
 - Single generic confirmation for tracked and untracked — rejected (F4): hides the difference in
  consequence (revert vs. permanent delete) from the user.
 **Driven by findings:** F4.
 **Linked technical notes:** —
 **Dependent decisions:** D15.
 **Referenced in spec:** Alternate flows and states.
 ### D8 — Git write is a user action, not an assistant tool
 **Question:** Are the panel's write actions gated by session type (e.g. read-only-assistant sessions)?
 **Decision:** The diff panel's write actions (stage/commit/discard) are available wherever the file panel
 appears, including read-only-assistant sessions, because they are the human user's own UI actions, not
 the AI's. The git-write endpoints are never registered as assistant tools, and the artifact sandbox
 prevents a rendered artifact from invoking them.
 **Rationale:** The "read-only" rule constrains what the AI assistant's tools may do. A human committing
 their own repository through a panel is a different actor; gating the panel by session type would be a
 category error and produce inconsistent behavior across sessions. The artifact-sandbox commitment (F1)
 closes the indirect path an artifact might otherwise exploit.
 **Evidence:** Convention: the read-only invariant is defined over the assistant's tool surface; the file
 browser (also user-driven) already appears in all sessions. F1 (artifact-sandbox finding; evidenced by
 `connect-src 'none'` in the artifact iframe sandbox per BOOCHAT.md output-format section).
 **Rejected alternatives:**
 - Restrict the write actions to write-capable (coder) sessions only — rejected: conflates the assistant's
  tool permissions with the user's UI affordances; produces inconsistent behavior.
 **Driven by findings:** F1.
 **Linked technical notes:** —
 **Dependent decisions:** D12.
 **Referenced in spec:** Actors and triggers, Coordinations.
 ### D10 — Refresh on open, on mutation, on turn completion, on demand, with coalescence
 **Question:** When does the panel re-read the repository, and how are concurrent triggers handled?
 **Decision:** The panel re-reads the repository state when the Git tab is opened, after any stage /
 unstage / commit / discard it performs, after an agent turn completes, after the user applies or discards
 a queued change in the pending-changes panel, and on an explicit Refresh control. Concurrent refresh
 triggers are coalesced — a refresh already in flight absorbs later triggers rather than spawning a
 second concurrent read, so the panel settles to a single final snapshot. Continuous file-watching is
 excluded.
 **Rationale:** These triggers cover every event that can change the project repository's state within the
 app's single-user workflow. Coalescence (F8) prevents a burst of triggers (e.g. multiple rapid mutations)
 from causing redundant concurrent reads or a stale intermediate result overwriting a fresher one. A
 continuous file watcher adds cost without a multi-user need.
 **Evidence:** Convention: event-driven refresh follows the session-event / broker model already used for
 other panels. F8 (concurrent-refresh finding; evidence-backed addition).
 **Rejected alternatives:**
 - Continuous file-watch stream — rejected (YAGNI): event- and demand-driven covers the single-user case;
  deferred under YAGNI.
 - No coalescence (each trigger spawns its own read) — rejected (F8): can produce concurrent reads and
  stale-snapshot overwrites.
 **Driven by findings:** F8.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Alternate flows and states.
 ### D11 — All git operations scoped to the project repository path
 **Question:** How is the git operation target scoped and validated?
 **Decision:** Every read and write the panel performs is confined to the project's own repository. The
 repository root is derived server-side from the session's project record — never from the request. Per-
 file arguments are validated as repo-relative paths and rejected if they escape the repository root.
 User-supplied text (commit message, file arguments) is passed as discrete arguments and never
 interpolated into a shell string.
 **Rationale:** The panel acts on the project repo only. Deriving the root server-side and validating
 per-file arguments closes the path-escape and command-injection vectors (F2). Passing text as discrete
 arguments (not a shell string) ensures user-supplied content cannot be interpreted as git flags or shell
 syntax.
 **Evidence:** Convention: project file operations resolve and scope to the project path via the existing
 path-scoping guard; git-metadata reads already do this. F2 (derivation + argument-safety finding;
 evidenced by the existing path-scoping guard).
 **Rejected alternatives:**
 - Accept a caller-supplied repository path — rejected: needless write surface, no use case.
 - Validate path only at the root level (not per-file arguments) — rejected (F2): per-file arguments can
  escape the repo root via `../` traversal if not independently validated.
 - Build git invocations as a shell string — rejected (F2): user-supplied content (commit message, file
  names with special characters) can be interpreted as flags or shell syntax.
 **Driven by findings:** F2.
 **Linked technical notes:** —
 **Dependent decisions:** D12.
 **Referenced in spec:** Coordinations.
 ### D12 — Git-write security posture
 **Question:** What are the combined security commitments for the git-write surface?
 **Decision:** The git-write surface (stage / commit / discard) has three security commitments: (1) these
 actions are user-initiated UI actions only and are never registered as assistant tools; the artifact
 sandbox prevents a rendered artifact from invoking them; (2) all git operations target only the project's
 own repository, with the root derived server-side and per-file paths validated inside it; (3) commit
 author/committer identity is derived from a server-side source (host git configuration) and cannot be set
 by the request.
 **Rationale:** F1, F2, and F3 each attacked a distinct vector — artifact-driven invocation, path/argument
 injection, and identity spoofing — that D8 and D11 individually did not close. D12 records all three
 commitments together as the complete security posture of the write surface.
 **Evidence:** F1 (artifact-sandbox; `connect-src 'none'` per BOOCHAT.md output-format section).
 F2 (path-scoping guard in the codebase; derivation and validation commitments). F3 (server-derived
 identity commitment; design-judgment that no request field should influence authorship).
 **Rejected alternatives:**
 - Trust the client-supplied repository path — rejected (F2): see D11.
 - Allow request-supplied commit identity — rejected (F3): allows spoofing; no legitimate use case in a
  single-user app.
 - Rely on session-type gating instead of endpoint-level exclusion from tool registry — rejected (F1):
  session type is the wrong layer; artifact-sandbox closes the actual indirect path.
 **Driven by findings:** F1, F2, F3.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Actors and triggers, Coordinations.
 ### D13 — Committed-mode base resolution and labeling
 **Question:** What is the base for Committed mode, and how is it surfaced when resolution fails?
 **Decision:** Committed mode compares the current branch against its upstream tracking branch when one is
 set, falling back to the repository's default branch (main/master). The panel labels the base it used in
 the mode header ("Git — branch vs &lt;base&gt;"). When no base resolves (no tracking branch and no
 discoverable default branch), the panel falls back to showing uncommitted changes and labels the mode as
 a fallback, rather than erroring or silently swapping.
 **Rationale:** "Base" was undefined in D2, leaving the committed comparison ambiguous (F11). The tracking-
 branch-first resolution matches git's own upstream model and is the most useful default for contributors
 tracking a remote. Labeling the resolved base makes the comparison unambiguous to the user. A labeled
 fallback is more informative than an error and does not leave the panel empty.
 **Evidence:** F11 (base-unlabeled finding; UX-002, JD-002; design-judgment resolution). Git upstream
 model (tracking branch as natural "base" for a contributor's branch).
 **Rejected alternatives:**
 - Always compare against the default branch, ignoring tracking — rejected (F11): wrong for contributors
  whose tracking branch is a personal fork or a PR target branch, not the default.
 - Error when no base resolves — rejected: leaves the panel useless; an unlabeled fallback is more
  helpful.
 - Silently swap to uncommitted without a label — rejected (F11): the original spec's behavior; confusing
  because the mode selector still shows "Committed".
 **Driven by findings:** F11.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Primary flow, Edge cases and failure modes, User interactions.
 ### D14 — Mode pinning and first-open auto-selection
 **Question:** Does auto mode-selection persist across refreshes after the user has acted?
 **Decision:** Auto mode-selection applies on first open only. Once the user selects a mode explicitly (via
 the selector), that choice is pinned for the session. Refreshes do not override a pinned mode. If a
 refresh would change the auto-selected mode (e.g. the tree transitioned from dirty to clean while
 Uncommitted was pinned), the panel briefly notes the change rather than swapping silently.
 **Rationale:** Auto-select on every refresh would dislocate the user mid-review without warning —
 illustrated by the scenario where the tree goes clean while the user is reading the uncommitted diff (F12).
 A brief note on a state change preserves awareness without overriding intent.
 **Evidence:** F12 (silent-dislocation finding; design-judgment resolution). D3 (auto-selection origin).
 **Rejected alternatives:**
 - Re-run auto-selection on every refresh — rejected (F12): dislocates the user's active view.
 - No notification on a would-be mode change — rejected: leaves the user unaware that the repository
  state changed in a way that would normally affect the view.
 **Driven by findings:** F12.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Primary flow.
 ### D15 — Discard is irrecoverable; tracked vs. untracked confirmation; separated affordance
 **Question:** What are the full discard semantics and the UI placement of the discard control?
 **Decision:** Discard is hard-delete and irrecoverable. The confirmation dialog uses two distinct wordings:
 "Discard changes to X?" for a tracked file (which reverts to its committed content; the work is lost but
 the file remains in history) and "Delete X? It has never been committed and cannot be recovered" for an
 untracked file (permanent deletion with no recovery path). The Discard affordance is placed in an
 overflow or secondary position rather than as an equal-weight sibling of Stage/Unstage.
 **Rationale:** The spec previously called discard "irrecoverable" but left the git mechanic ambiguous.
 Owning the word and spelling out the two cases (F4) ensures the confirmation is honest. Separating the
 affordance from Stage/Unstage reduces the risk of an accidental tap on mobile (F4, UX concern).
 **Evidence:** F4 (discard-semantics and affordance-separation finding; on-call-engineer OCE-002,
 UX-005, adversarial-security-analyst; design-judgment resolution). Convention: plain
 Cancel/Confirm dialogs.
 **Rejected alternatives:**
 - Single generic confirmation for tracked and untracked cases — rejected (F4): obscures the difference
  in consequence.
 - Discard at equal weight alongside Stage/Unstage — rejected (F4): accidental-tap risk on mobile.
 **Driven by findings:** F4.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Alternate flows and states, User interactions.
 ### D16 — Tab named "Git"
 **Question:** What is the new tab called?
 **Decision:** The new tab is named **Git**, giving a Files / Git tab pair. The existing "Pending Changes"
 panel is not renamed; that rename is out of scope.
 **Rationale:** "Changes" (the working name in the initial spec) collides with "Pending Changes" — the name
 of an existing distinct panel — creating discoverability confusion (F10, UX-001, UX-008, JD-001). "Git"
 is shorter, unambiguous, and describes the surface (the project's git state) without implying overlap
 with the pending-changes panel.
 **Evidence:** F10 (naming-collision finding; design-judgment resolution). Existing surface name: "Pending
 Changes" panel in the codebase.
 **Rejected alternatives:**
 - "Changes" — rejected (F10): collides with "Pending Changes"; confusion in context-switching.
 - Rename "Pending Changes" to disambiguate — rejected (F10): out of scope; would require changes to an
  existing surface the user did not ask to rename.
 **Driven by findings:** F10.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Actors and triggers, User interactions, Out of scope.
 ### D17 — Ambient dirty indicator and empty-state hint
 **Question:** How does the user discover the Git tab when the panel defaults to Files?
 **Decision:** An ambient indicator on the file-panel toggle/header signals the repository is dirty (derived
 from the refresh data already gathered), making the Git tab findable without opening it. When the Git
 view is empty but the session has unapplied pending changes, the empty state hints that those live in the
 pending-changes panel.
 **Rationale:** Without a visual signal the Git tab is invisible until the user already knows to look for it
 (F10, UX-001). The indicator reuses state already gathered by the refresh cycle — no additional read
 needed. The empty-state hint prevents the user from concluding the panel is broken when what they are
 looking for is actually in the adjacent pending-changes panel.
 **Evidence:** F10 (discoverability finding; design-judgment resolution). Refresh cycle already produces
 dirty/clean state (D10).
 **Rejected alternatives:**
 - No ambient indicator (rely on the user knowing the tab exists) — rejected (F10): undiscoverable by
  new users.
 - Always show dirty indicator (not just when dirty) — rejected: misleading on clean repos.
 **Driven by findings:** F10.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** Actors and triggers, Alternate flows and states.
 ### D18 — Mobile tap-target and header-fit
 **Question:** What are the layout and accessibility constraints for the new tab and controls?
 **Decision:** All interactive controls in the diff panel follow the app's existing mobile tap-target
 minimum. The Files / Git tab strip and header fit on one line without horizontal scroll or wrapping;
 existing header elements are condensed if needed to maintain fit.
 **Rationale:** The app has an existing toolbar-fit rule (no scroll/wrap on crowded control bars) and a
 mobile-first posture. The new Git tab and its in-panel controls must not break either. Condensing
 existing elements rather than scrolling is the project's established pattern (F15, UX-009, JD-008).
 **Evidence:** F15 (mobile-fit finding; convention). Project convention: toolbars must fit one line (no
 scroll or wrapping); MEMORY.md toolbar-fit rule.
 **Rejected alternatives:**
 - Allow horizontal scroll if the header gets crowded — rejected: against the project's toolbar-fit rule.
 - Wrap the header to a second line — rejected: against the project's toolbar-fit rule.
 **Driven by findings:** F15.
 **Linked technical notes:** —
 **Dependent decisions:** —
 **Referenced in spec:** User interactions.
 ## Trivial decisions
 - D4: Untracked files included in Uncommitted view — untracked files appear in the Uncommitted file list as additions (considered tracked-only; rejected because the user's new files are part of "what changed"). — Referenced in spec: Primary flow.
 - D9: Unified layout, syntax-highlighted — diffs render in a single-column unified layout reusing the existing code highlighter (considered side-by-side; deferred under YAGNI as a desktop-only enhancement). — Referenced in spec: User interactions.
--- a/docs/features/git-diff-panel/artifacts/team-findings.md
+++ b/docs/features/git-diff-panel/artifacts/team-findings.md
@@ -0,0 +1,168 @@
 # Team findings — Git diff panel
 Review-team findings on [`feature-specification.md`](../feature-specification.md) and resolutions. Team
 (medium): junior-developer, user-experience-designer, adversarial-security-analyst, on-call-engineer.
 Dispatched 2026-06-02. All findings resolved by evidence/convention/design-judgment; none required new
 user input beyond the three foundational answers already given. Shared F# counter. All resolutions
 applied to spec and decision log 2026-06-02.
 ## Major findings
 ### F1 — Assistant could drive the new git-write endpoints via a rendered artifact (security)
 - **Raised by:** adversarial-security-analyst (D8). Highest priority.
 - **Concern:** D8 says the read-only-assistant rule covers the AI's tools, not the user's UI. But the AI can
  emit HTML artifacts; if an artifact could POST to the new stage/commit/discard endpoints using the user's
  Authelia cookie, the assistant would gain a write path it is forbidden.
 - **Resolved by:** evidence. The HTML artifact iframe is sandboxed with `connect-src 'none'` (per
  `BOOCHAT.md` output-format section — fetch/WebSocket do not work in artifacts), so an artifact cannot reach
  the endpoints. Spec gains an explicit commitment: the git-write actions are user-initiated UI actions only,
  are never registered as an assistant tool, and the artifact sandbox prevents an artifact from invoking them.
 - **Affected decisions:** D8 (expanded), D12 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Actors and triggers, Coordinations.
 ### F2 — D11 scoping needs explicit derivation + argument-safety commitments (security)
 - **Raised by:** adversarial-security-analyst (D11).
 - **Concern:** "confined to the project path" is a destination constraint that doesn't say the path is
  derived server-side, that per-file arguments are validated to resolve inside the repo, or that the commit
  message and file arguments are passed as discrete arguments (not shell-interpolated).
 - **Resolved by:** evidence (the existing project path-scoping guard derives roots from the project record,
  never from the request). Spec/D11 gain three commitments: repository root derived server-side from the
  session's project; per-file arguments validated as repo-relative and rejected if they escape; commit
  message and file arguments passed as discrete arguments, never built into a shell string.
 - **Affected decisions:** D11 (expanded), D12 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Coordinations.
 ### F3 — Commit author identity must be server-derived, not request-supplied (security + clarity)
 - **Raised by:** adversarial-security-analyst, junior-developer (JD-004; Open items).
 - **Concern:** identity was deferred entirely to implementation; an unauthenticated local request could set an
  arbitrary author, and the codebase already hardcodes differing identities elsewhere.
 - **Resolved by:** evidence + commitment. Spec commits: a panel commit's author/committer is derived from a
  server-side source (host git config or a configured value); the request body cannot set or influence it.
  The exact source is left to plan-implementation.
 - **Affected decisions:** D6 (expanded), D12 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Primary flow, Open items (closed).
 ### F4 — Discard semantics: own "irrecoverable", distinguish tracked vs untracked, separate the affordance
 - **Raised by:** adversarial-security-analyst (D7), on-call-engineer (OCE-002), user-experience-designer (UX-005).
 - **Concern:** the spec calls discard "irrecoverable" but defers the git mechanic, creating a tension; a
  tracked-file revert and an untracked-file permanent delete are different losses; and Discard sitting next to
  Stage at equal weight invites accidental taps on mobile.
 - **Resolved by:** design-judgment. Discard hard-deletes (own the word "irrecoverable"). The confirmation uses
  two variants — "Discard changes to X?" (tracked, reverts to committed content) vs "Delete X? It has never
  been committed and cannot be recovered" (untracked). The Discard affordance is separated from Stage/Unstage
  (an overflow/secondary placement), not an equal-weight sibling button.
 - **Affected decisions:** D7 (expanded), D15 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Alternate flows and states, User interactions.
 ### F5 — Index-lock contention with concurrent agent turns is unnamed (resilience)
 - **Raised by:** on-call-engineer (OCE-001).
 - **Resolved by:** evidence. Spec names the case: when a write fails because the repository is busy (its index
  is locked by another process, e.g. a concurrent agent turn), the panel communicates "the repository is busy,
  try again" rather than a raw git error.
 - **Affected decisions:** —
 - **Affected tech-notes:** —
 - **Changed in spec:** Edge cases and failure modes.
 ### F6 — "Leave state unchanged" is unenforceable for partial failures (resilience wording)
 - **Raised by:** on-call-engineer (OCE-002).
 - **Resolved by:** reword. "Leaves the repository state unchanged" → "leaves the repository as close to its
  pre-action state as the git layer allows, and the list refreshes to reflect the repository's true state";
  an untracked-directory discard that fails partway may leave a partially-removed tree, surfaced on refresh.
 - **Affected decisions:** —
 - **Affected tech-notes:** —
 - **Changed in spec:** Edge cases and failure modes.
 ### F7 — No deadline on a hanging git read (resilience)
 - **Raised by:** on-call-engineer (OCE-003).
 - **Resolved by:** add commitment. If a git read does not complete within a deadline, the panel leaves the
  loading state, shows an error, and offers Refresh (distinct from the large-result cap in D5).
 - **Affected decisions:** D5 (expanded).
 - **Affected tech-notes:** —
 - **Changed in spec:** Edge cases and failure modes.
 ### F8 — Concurrent refresh triggers have no coalescence commitment (resilience)
 - **Raised by:** on-call-engineer (OCE-004).
 - **Resolved by:** add commitment. Concurrent refresh triggers are coalesced — a refresh already in flight
  absorbs later triggers instead of spawning a second concurrent read; the panel settles to a single final
  snapshot.
 - **Affected decisions:** D10 (expanded).
 - **Affected tech-notes:** —
 - **Changed in spec:** Alternate flows and states.
 ### F9 — In-progress git states (merge/rebase/cherry-pick/bisect) make writes fail opaquely (resilience)
 - **Raised by:** on-call-engineer (OCE-005).
 - **Resolved by:** add commitment. When the repository is mid-operation (merge, rebase, cherry-pick, or bisect),
  the panel disables its write affordances and shows the repository's state rather than letting stage/commit/
  discard fail with raw git errors.
 - **Affected decisions:** —
 - **Affected tech-notes:** —
 - **Changed in spec:** Edge cases and failure modes.
 ### F10 — The Changes tab is undiscoverable and collides with "Pending Changes" naming (UX)
 - **Raised by:** user-experience-designer (UX-001, UX-008), junior-developer (JD-001).
 - **Resolved by:** design-judgment. (a) The new tab is named **Git** (Files / Git), distinct from the existing
  "Pending Changes" panel; the existing panel is NOT renamed (out of scope). (b) An ambient indicator on the
  file-panel toggle/header signals the repository is dirty (derived from the refresh data already gathered),
  so the tab is findable. (c) When the Git view is empty but the session has unapplied pending changes, the
  empty state hints that those live in the pending-changes panel.
 - **Affected decisions:** D16 (new tab name), D17 (new dirty indicator + empty-state hint).
 - **Affected tech-notes:** —
 - **Changed in spec:** Actors and triggers, Alternate flows and states, User interactions, Coordinations, Out of scope.
 ### F11 — Committed-mode base is undefined and unlabeled (UX + correctness)
 - **Raised by:** user-experience-designer (UX-002), junior-developer (JD-002).
 - **Resolved by:** decision. Committed mode compares the current branch against its **base** — the upstream
  tracking branch when set, otherwise the repository's default branch (main/master). The view labels the base
  it used ("Git — branch vs &lt;base&gt;"). When no base resolves, the panel shows uncommitted changes and labels
  the mode as falling back, rather than silently swapping.
 - **Affected decisions:** D2 (expanded), D13 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Primary flow, Edge cases and failure modes, User interactions.
 ### F12 — Auto-mode selection silently dislocates the view on refresh (UX)
 - **Raised by:** user-experience-designer (UX-003).
 - **Resolved by:** design-judgment. Auto mode-selection applies on first open only; once the user picks a mode
  it is pinned for the session and refreshes do not override it. A refresh that would change the mode (e.g. the
  tree went clean) briefly notes the change rather than swapping silently.
 - **Affected decisions:** D3 (expanded), D14 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** Primary flow, Alternate flows and states.
 ### F13 — Staged vs unstaged distinction must not be color-only (UX/accessibility)
 - **Raised by:** user-experience-designer (UX-004).
 - **Resolved by:** add commitment. Staged and unstaged files are distinguished by more than color (a label/icon,
  and grouping into staged/unstaged sections); each stage/unstage control carries an accessible name that
  includes the file path.
 - **Affected decisions:** —
 - **Affected tech-notes:** —
 - **Changed in spec:** Primary flow, User interactions.
 ### F14 — Discard's availability across modes is unspecified (clarity)
 - **Raised by:** junior-developer (JD-003).
 - **Resolved by:** decision. Stage, unstage, commit, and discard are available only in Uncommitted mode;
  Committed mode is read-only review (no per-file revert of committed history in v1).
 - **Affected decisions:** D6 (scoped), D15 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** User interactions, Alternate flows and states.
 ### F15 — Mobile fit + tap-target convention for the new tab and controls (UX)
 - **Raised by:** user-experience-designer (UX-009), junior-developer (JD-008).
 - **Resolved by:** convention. All interactive controls in the panel follow the app's existing mobile tap-target
  minimum; the Files / Git tab strip and header fit one line without horizontal scroll or wrapping (the project's
  toolbar-fit rule), condensing existing header elements if needed.
 - **Affected decisions:** D18 (new).
 - **Affected tech-notes:** —
 - **Changed in spec:** User interactions.
 ## Minor edits
 - F16: Successful commit shows a brief, non-blocking success confirmation (not just files disappearing) — user-experience-designer (UX-006) — **Affected decisions:** — **Affected tech-notes:** — **Changed in spec:** Primary flow.
 - F17: Error placement — commit-area errors appear by the commit control, per-file action errors appear in the affected file row — user-experience-designer (UX-007) — **Affected decisions:** — **Affected tech-notes:** — **Changed in spec:** Edge cases and failure modes.
 - F18: Which service runs the git read vs write operations (read-only server vs write-capable host service) is an architecture/module-boundary decision routed to plan-implementation; the spec stays behavioral ("the system performs…") — junior-developer (JD-005) — **Affected decisions:** — **Affected tech-notes:** — **Changed in spec:** — (plan-implementation input).
 - F19: The git-write surface is the larger half of v1; implementation should sequence diff-display before the write actions even within v1 — junior-developer (JD-007) — **Affected decisions:** — **Affected tech-notes:** — **Changed in spec:** — (plan-implementation note).
 - F20: The refresh-on-pending-apply and refresh-on-turn-complete triggers require an event/frame wiring that must follow the project's WS-frame / sessionEvents parity steps — junior-developer (JD-009) — **Affected decisions:** — **Affected tech-notes:** — **Changed in spec:** — (plan-implementation note).
 - F21: D8's commit-button-in-a-read-only-session affordance needs no extra label; the "Git" tab name and dirty indicator make it clearly the user's own git surface, not assistant output — junior-developer (JD-006) — **Affected decisions:** D8 confirmed, no extra label needed — **Affected tech-notes:** — **Changed in spec:** — (D8 confirmed).
--- a/docs/features/git-diff-panel/feature-specification.md
+++ b/docs/features/git-diff-panel/feature-specification.md
@@ -0,0 +1,203 @@
 # Feature specification — Git diff panel
 ## Source
 No formal upstream PRD. Authored from a direct request (2026-06-02): "add a git diff panel that can be
 shown instead of the file browser, similar to Paseo," plus a read-only review of the Paseo reference at
 `/opt/forks/paseo`. Ground-truth conventions drawn from the project's own docs and existing surfaces.
 ## Outcome
 A person working in a session can review the uncommitted (and on-branch) changes of the project's
 repository in a diff view that lives in the same right-side panel slot as the file browser, switching
 between the two with a tab. From that view they can read each changed file's diff, stage and unstage
 files, commit staged work with a message, and discard a file's changes — without leaving the session or
 opening a terminal. The view tells them, at a glance, which files changed and by how much, and stays
 current as agents and the user make edits.
 ## Actors and triggers
 - **The session user** (single user) opens the right-side file panel and switches its tab from Files to
  Git. The Git tab is the trigger; switching back to Files restores the file tree in the same slot
  ([D1](artifacts/decision-log.md#d1--placement-a-tab-inside-the-file-browser)).
 - The panel is available in any session that has the file panel, because the stage / commit / discard
  actions are the human user's own UI actions, not assistant tools — the AI has no access to these
  endpoints, and the artifact sandbox prevents a rendered artifact from reaching them
  ([D8](artifacts/decision-log.md#d8--git-write-is-a-user-action-not-an-assistant-tool),
  [D12](artifacts/decision-log.md#d12--git-write-security-posture)).
 - An ambient indicator on the file-panel header signals the repository is dirty, making the Git tab
  findable without opening it ([D17](artifacts/decision-log.md#d17--ambient-dirty-indicator-and-empty-state-hint)).
 ## Primary flow
 1. The user opens the right-side file panel and selects the **Git** tab.
 2. The panel shows the changes of the **project's repository**, defaulting its comparison mode by the
   repository state on first open: **Uncommitted** (working tree vs. the last commit) when the working
   tree is dirty, **Committed** (the current branch vs. its base) when it is clean. Once the user
   selects a mode explicitly, that choice is pinned for the session and subsequent refreshes do not
   override it; if a refresh would change the auto-selected mode (e.g. the tree went clean while
   Uncommitted was pinned), the panel briefly notes the change rather than swapping silently
   ([D2](artifacts/decision-log.md#d2--scope-the-project-repository-with-two-comparison-modes),
   [D3](artifacts/decision-log.md#d3--mode-auto-selection-and-session-pinning),
   [D14](artifacts/decision-log.md#d14--mode-pinning-and-first-open-auto-selection)).
 3. The panel presents a list of changed files, each with its path, change type (added / modified /
   deleted / renamed / untracked), and an added/removed line count. In Committed mode the header labels
   the base used ("Git — branch vs &lt;base&gt;")
   ([D13](artifacts/decision-log.md#d13--committed-mode-base-resolution-and-labeling)).
 4. The user expands a file to read its diff inline; collapsing hides it again. A control expands or
   collapses all files at once.
 5. The user **stages** or **unstages** individual files. Staged and unstaged files are grouped into
   separate sections and distinguished by both a label/icon and grouping — not color alone — and each
   stage/unstage control carries an accessible name that includes the file path.
 6. The user writes a commit message and **commits** the staged files. The commit's author and committer
   identity is derived from a server-side source; the request cannot set or influence it
   ([D12](artifacts/decision-log.md#d12--git-write-security-posture)). On success the committed files
   leave the list, the panel refreshes to the new repository state, and a brief non-blocking confirmation
   is shown ([D6](artifacts/decision-log.md#d6--v1-actions-stage-unstage-commit-discard-no-push)).
 7. The user may switch the comparison mode explicitly (Uncommitted ↔ Committed) at any time; the file
   list and counts update to the selected mode, and the choice is pinned for the remainder of the session.
 ## Alternate flows and states
 - **Loading:** while the first difference is being computed the panel shows a brief loading indicator in
  place of the file list.
 - **Empty (no changes):** when the selected mode has no changes the panel shows a mode-specific empty
  message ("No uncommitted changes" / "No changes vs. the base branch") instead of a file list. When
  the Git view is empty but the session has unapplied pending changes, the empty state hints that those
  live in the pending-changes panel
  ([D17](artifacts/decision-log.md#d17--ambient-dirty-indicator-and-empty-state-hint)).
 - **Discard a file:** the user discards a single file's changes from an overflow or secondary affordance,
  separated from the Stage/Unstage controls (not an equal-weight sibling). Because discard is
  irrecoverable, the panel asks for a plain confirmation before acting, with wording that distinguishes
  the two cases: "Discard changes to X?" for a tracked file (which returns to its committed content) and
  "Delete X? It has never been committed and cannot be recovered" for an untracked file (which is
  permanently removed). On confirmation the file's changes are applied and the list refreshes
  ([D7](artifacts/decision-log.md#d7--discard-requires-a-plain-confirmation),
  [D15](artifacts/decision-log.md#d15--discard-is-irrecoverable-tracked-vs-untracked-confirmation-separated-affordance)).
 - **Commit with an empty message or nothing staged:** the commit control is unavailable until at least one
  file is staged and a non-empty message is present.
 - **Refresh:** the panel re-reads the repository state when the Git tab is opened, after any stage /
  unstage / commit / discard it performs, after an agent turn completes, after the user applies or
  discards a queued change in the pending-changes panel, and on an explicit Refresh control. Concurrent
  refresh triggers are coalesced — a refresh already in flight absorbs later triggers rather than
  spawning a second concurrent read, so the panel settles to a single final snapshot
  ([D10](artifacts/decision-log.md#d10--refresh-on-open-on-mutation-on-turn-completion-on-demand-with-coalescence)).
 ## Edge cases and failure modes
 - **Not a git repository:** when the project's path is not a git repository the Git tab is not offered;
  the file panel stays on Files only.
 - **Binary files:** a changed binary file appears in the list with its path and change type but its body
  shows a "Binary file" placeholder instead of a diff.
 - **Very large diffs:** a file whose diff exceeds a display threshold appears in the list with its path and
  counts but its body shows a "Diff too large to display" placeholder; a git read that does not complete
  within a deadline exits the loading state, shows an error, and offers Refresh; the overall response is
  bounded so a huge change set cannot stall the panel
  ([D5](artifacts/decision-log.md#d5--binary-and-large-file-handling)).
 - **A git action fails:** the panel surfaces the failure as an inline error — commit-area errors appear
  near the commit control; per-file action errors appear in the affected file row. The panel leaves the
  repository as close to its pre-action state as the git layer allows; the list refreshes to reflect the
  repository's true state.
 - **Repository busy (index locked):** when a write fails because the repository's index is locked by
  another process (e.g. a concurrent agent turn), the panel communicates "the repository is busy, try
  again" rather than a raw error.
 - **In-progress git operations:** when the repository is mid-operation (merge, rebase, cherry-pick, or
  bisect), the panel disables its write affordances and shows the repository's current state, rather than
  allowing stage / commit / discard to fail with raw errors.
 - **Concurrent edits during a read:** the displayed diff is a snapshot at read time; a later edit is picked
  up on the next refresh rather than mutating the view mid-read.
 - **The base branch cannot be resolved** (Committed mode, no discoverable base): the panel falls back to
  showing uncommitted changes and labels the mode as a fallback, rather than silently swapping or erroring
  ([D13](artifacts/decision-log.md#d13--committed-mode-base-resolution-and-labeling)).
 ## User interactions
 - A **Files / Git** tab switch in the file panel header. The diff view occupies the same slot as the
  file tree; only one is shown at a time. The tab strip and header fit on one line without horizontal
  scroll or wrapping; all interactive controls meet the app's existing mobile tap-target minimum. The
  tab affordance and the panel work the same on mobile (where the panel is a slide-in drawer) as on
  desktop ([D16](artifacts/decision-log.md#d16--tab-named-git),
  [D18](artifacts/decision-log.md#d18--mobile-tap-target-and-header-fit)).
 - An **ambient dirty indicator** on the file-panel toggle/header when the repository is dirty
  ([D17](artifacts/decision-log.md#d17--ambient-dirty-indicator-and-empty-state-hint)).
 - A **comparison-mode** selector (Uncommitted / Committed) at the top of the Git view.
 - Per-file rows showing path, change type, and an added/removed count, each expandable to reveal a
  syntax-highlighted unified diff, with an expand-all / collapse-all control.
 - Files grouped into **staged** and **unstaged** sections; each section is labeled and the grouping
  itself is the primary distinction (supplemented by a per-file label/icon), not color alone; each
  stage/unstage control carries an accessible name including the file path.
 - Per-file **Stage / Unstage** affordances and a **Discard** affordance in an overflow or secondary
  position (not an equal-weight sibling of Stage/Unstage), and a **commit message** field with a
  **Commit** action. Stage / Unstage / Commit / Discard are available only in Uncommitted mode;
  Committed mode is read-only review
  ([D6](artifacts/decision-log.md#d6--v1-actions-stage-unstage-commit-discard-no-push),
  [D15](artifacts/decision-log.md#d15--discard-is-irrecoverable-tracked-vs-untracked-confirmation-separated-affordance)).
 - A **Refresh** control.
 - Diffs render in a single (unified) layout; additions and removals are visually distinguished with line
  numbers ([D9](artifacts/decision-log.md)).
 ## Coordinations
 - **The file panel** hosts the view and owns the Files ↔ Git tab state.
 - **The project repository** is the single source of truth for the diff and the target of stage / commit /
  discard. The repository root is derived server-side from the session's project record; per-file
  arguments are validated to resolve inside the repository and rejected if they escape it; user-supplied
  text (commit message, file targets) is passed as discrete arguments and never interpreted as commands.
  The git-write actions are never registered as assistant tools; the artifact sandbox prevents a rendered
  artifact from invoking them
  ([D11](artifacts/decision-log.md#d11--all-git-operations-scoped-to-the-project-repository-path),
  [D12](artifacts/decision-log.md#d12--git-write-security-posture)).
 - **The pending-changes panel** remains the place where unapplied agent edits (held in a separate working
  copy) are reviewed and applied; the diff panel reflects the project repository's real state, so applying
  or discarding a pending change is one of the events that refreshes the diff panel. The two panels are
  complementary, not duplicates ([D2](artifacts/decision-log.md#d2--scope-the-project-repository-with-two-comparison-modes)).
 - **Agent turns and the user's own edits** change the repository; turn completion is a refresh trigger.
 ## Out of scope
 - Pushing, pulling, creating pull requests, merging, or any operation that talks to a remote.
 - Per-line or per-hunk review comments and "send selected lines to an agent" — that is a separate feature
  (the diff-line re-prompt item), and this panel deliberately does not build a line-selection/commenting
  surface.
 - Side-by-side (split) diff layout.
 - Staging or discarding individual hunks/lines (stage and discard operate at whole-file granularity).
 - A live file-system watcher that streams diffs as files change on disk; refresh is event- and
  demand-driven, not continuous.
 - Showing the session agent's separate working-copy diff in this panel; that remains the pending-changes
  panel's job.
 - Renaming the existing pending-changes panel; naming and scope changes to that panel are out of scope
  for this feature.
 ## Deferred (YAGNI)
 - **Push / pull / pull-request / merge actions.** Deferred — not requested (the request was "stage/commit"),
  and the assistant-level no-remote-write rule signals remotes are out of band for now. Reopening trigger: a
  stated need to publish commits from the panel. Evidence gate failed: no user-described need.
 - **Side-by-side diff layout.** Deferred — the primary surface is mobile-first and unified reads well there;
  a split layout is a desktop-only enhancement. Reopening trigger: a request to compare wide files
  side-by-side on desktop. Evidence gate: simpler unified version satisfies the stated need.
 - **Per-hunk staging / discarding.** Deferred — whole-file granularity covers the stated stage/commit need.
  Reopening trigger: a request to commit part of a file.
 - **Continuous file-watch streaming of the diff.** Deferred — event- and demand-driven refresh covers a
  single-user workflow without a watcher's cost. Reopening trigger: the diff is observed to feel stale
  between refresh events in practice.
 ## Open items
 - None. Commit author/committer identity is settled: derived server-side from the host git configuration;
  the request cannot set or influence it (F3, [D12](artifacts/decision-log.md#d12--git-write-security-posture)).
 ## Summary
 A Files / Git tab in the right-side file panel that shows the project repository's diff in two modes
 (uncommitted vs. HEAD, and the branch vs. its upstream/default base, auto-selected by repo state on
 first open and then pinned to the user's choice), lets the user stage, unstage, commit (with
 server-derived identity), and discard whole files (with irrecoverable-discard confirmation distinguishing
 tracked and untracked cases), and stays current via coalesced event- and demand-driven refresh. Single
 actor (the session user); the panel is complementary to the existing pending-changes panel. The tab is
 named "Git" (Files / Git), distinct from the pending-changes panel. An ambient dirty indicator makes it
 findable. Write affordances are disabled during in-progress git operations. 18 decisions recorded. Four
 items deferred under YAGNI (remote actions, split layout, per-hunk granularity, file-watch streaming).
 Review team: junior-developer, user-experience-designer, adversarial-security-analyst, on-call-engineer.
 No load-bearing mechanics required a technical-notes file (the git mechanics are discoverable from the
 codebase's existing git-metadata path).
--- a/docs/plans/post-review-backlog/artifacts/.discovery-notes.md
+++ b/docs/plans/post-review-backlog/artifacts/.discovery-notes.md
@@ -0,0 +1,76 @@
 # Discovery notes — post-review-backlog plan
 Single source of truth for project context. Specialists: read this first, do NOT re-grep what is here.
 Search further only for what your domain needs that is not covered.
 ## Tech stack
 - Monorepo, pnpm workspaces: `apps/server` (BooChat — Fastify + postgres, native inference, read-only tools),
  `apps/web` (React + Vite SPA), `apps/coder` (BooCoder — host systemd service, write tools + external-agent
  dispatch, port 9502), `apps/booterm` (PTY/tmux). TypeScript strict, NodeNext (`.js` import suffixes) on
  server + coder.
 - Tests: vitest (pinned ^3). server `pnpm -C apps/server test`; coder `pnpm -C apps/coder test`
  (`globals:false` — import describe/it/expect). Include glob `src/**/__tests__/**/*.test.ts`. No web test
  harness, no linters. DB-integration tests opt-in via `DATABASE_URL` + `describe.runIf`.
 - Deploy: apps/coder → `sudo systemctl restart boocoder`; apps/web|server → `docker compose up --build -d boocode`.
 - Postgres 16, DB `boochat`. Two schema files: `apps/server/src/schema.sql` (sessions/chats/messages/
  message_parts) + `apps/coder/src/schema.sql` (agent_sessions, worktrees, pending_changes, available_agents,
  checkpoints, claude_session_entries, tasks extension).
 ## ADRs / coding standards found
 - No `docs/adr/` directory. Architectural decisions live in `boocode_roadmap.md` (Decisions log) +
  per-app `CLAUDE.md` files (auto-loaded when editing that subtree) + `openspec/changes/archived/`.
 - Coding standards: `docs/coding-standards/` (canonical), surfaced via `.claude/rules/coding-standards/`
  path-scoped index files. Not loaded automatically; open on demand.
 - Cross-cutting conventions in root `CLAUDE.md` "Conventions" section (WS-frame parity, sentinels, JSONB
  via `sql.json`, event dedup discipline, deploy-by-surface).
 ## Code touch points (per scope-brief item)
 - **F1 task-cancel:** `apps/coder/src/services/dispatcher.ts` (private `ac=new AbortController()` at
  ~316/655/991/1248; `inflight=Map<sessionId,Promise>` at :56 — no per-task AbortController registry);
  `apps/coder/src/routes/tasks.ts:110-148` (cancel route — inference.cancel + DB only); `routes/messages.ts:388`
  (session stop); backends honor `ctx.signal`: `pty-dispatch.ts:159` (child.kill), `backends/warm-acp.ts:318`
  (session/cancel), `backends/opencode-server.ts:775` (session.abort), `backends/claude-sdk.ts:209` (interrupt).
  Frontend `apps/web/src/components/panes/CoderPane.tsx:987` handleStop; `api/client.ts:395` cancelTask.
 - **F2 parser prune:** `apps/server/src/services/inference/tool-call-parser.ts` (exports extractToolCallBlocks,
  stripToolMarkup, parseXmlToolCall, parseInvokeToolCall, isPlaceholderArgValue, XML_/INVOKE_ consts).
  Live consumers: `stream-phase.ts:263-284` (extractToolCallBlocks, text-delta fallback path), `:285-294`
  (structured tool-call path — authoritative today), `tool-phase.ts:122` + `error-handler.ts:25,106`
  (stripToolMarkup). llama-swap native `--jinja` parsing confirmed ON (external host `:8401`).
 - **F3 xml logging:** `tool-call-parser.ts:65` console.debug; one call site in `stream-phase.ts` executeStreamPhase.
 - **F4 notify-hook:** agent config paths `~/.claude/settings.json`, `~/.qwen/settings.json`,
  `~/.config/goose/`; existing readers `claude-command-discovery.ts:84`, `qwen-settings.ts`,
  `provider-registry.ts`. Existing normalize helper `apps/coder/src/services/normalize-agent-status.ts`
  (`normalizeAgentEvent`). Existing status publish wired at dispatcher turn boundaries (v2.7.6);
  `index.ts:86` references #10. `permission-waiter.ts:47` has a `PermissionHooks` registry.
 - **F5 compaction surfacing:** `apps/coder/src/services/backends/opencode-server.ts` SSE arm handling
  (~215-311 region); WS frame parity → server `ws-frames.ts` + web `apps/web/src/api/types.ts` (`WsFrame`).
 - **F6 resilience:** `apps/server/src/services/inference/stream-phase.ts:261` (`for await ... result.fullStream`),
  abort check at :333, usage at :343. Frontend 60s `discard_stale` watchdog is the only stall guard today.
 - **F7 session-history MCP tool:** `apps/coder/src/services/mcp-server.ts` (existing BooCoder MCP tools);
  read path `messages_with_parts` view.
 - **F8 diff-line UX:** diff UI component NOT located by `git ls-files apps/web | grep -i diff` (returned
  empty) — UX specialist must locate the actual diff/changes panel component (BOOCODER.md calls it
  "DiffPanel"; may be named differently or nested). Routes through dispatcher + AgentComposerBar.
 - **F9 retire :9502:** `apps/coder/web/` package + static serve in `apps/coder/src/index.ts` + coder build
  scripts/Dockerfile. KEEP WS + REST routes.
 ## Recent activity / precedent
 - HEAD `e5ce01a` (v2.7.11). v2.7.x line (relicense, write-edit-robustness, sampling-streamjson-tokens,
  mistake-tracker-ledger, claude-sdk-sessionstore, agent-status-normalize, UI batches) all shipped 2026-06-01.
 - Pure-helper + TDD precedent for extraction: `backends/turn-guard.ts`, `backends/lifecycle-decisions.ts`,
  `mistake-tracker.ts` (pure module + unit test, then wire). F1/F2/F6 should follow it.
 - Parallel-disjoint-file agent precedent: v2.7.0/v2.7.1/v2.7.3 each built by 3 parallel agents over disjoint
  files — relevant to decomposition/sequencing.
 ## Enumerated gaps (searched, not found)
 - No `feature-specification.md` — `scope-brief.md` is the ground-truth stand-in.
 - No `docs/adr/`.
 - Diff UI component filename not found via naive grep (F8) — needs UX specialist location.
 - llama-swap config is NOT in-repo (external host `:8401`); native-jinja state confirmed by live probe only.
 - F8 (diff-line UX) and F4 (notify-hook) are the two items most likely to need their own plan-a-feature; they
  have no behavioral spec beyond the review-doc pattern description.
--- a/docs/plans/post-review-backlog/artifacts/implementation-decision-log.md
+++ b/docs/plans/post-review-backlog/artifacts/implementation-decision-log.md
@@ -0,0 +1,147 @@
 # Implementation Decision Log: Post-Review Backlog (F1–F9)
 <!--
 This file records every implementation decision committed while planning the post-review backlog.
 Behavioral and implementation statements live in [../feature-implementation-plan.md](../feature-implementation-plan.md) —
 this file captures the question, rationale, evidence, and rejected alternatives for each decision.
 Round-by-round history lives in [implementation-iteration-history.md](implementation-iteration-history.md).
 -->
 Source of truth: [../scope-brief.md](../scope-brief.md) (ground-truth spec stand-in) and
 [synthesis-input.md](synthesis-input.md) (the consolidated Round-1 specialist aggregation; its file:line
 evidence is treated as verbatim). The D-N counter is shared across the Trivial and Full sections below.
 ## Trivial decisions
 - D-2: F3 logger threading shape — pass an optional `log?: { debug }` param to `extractToolCallBlocks` from its single call site in `stream-phase.ts` `executeStreamPhase`; no interface (one site, one impl). — Referenced in plan: Implementation Approach (F2+F3), Decomposition and Sequencing.
 - D-6: F7 query shape — read `messages_with_parts` `WHERE role != 'system'` (strips sentinels), params `session_id` + optional `chat_id` + `limit` (default 50, max 200), `ORDER BY created_at ASC`, returns `{role,content,...}[]`. — Referenced in plan: Implementation Approach (F7), External Interfaces.
 - D-9: Patch-tag-per-unit — each ready item ships as its own sequential patch tag (one batch per coherent unit), not a minor bump; Sam declined v2.8.0 twice. — Referenced in plan: Decomposition and Sequencing, Operational Readiness.
 - D-10: F1 Stop-button terminal label — render a muted "Stopped" label (not red, not a toast) for `status='cancelled'`. — Referenced in plan: Implementation Approach (F1), Testing Strategy.
 ## Full decisions
 ### D-1: F1 cancel registry shape and finalization-fix scope
 - **Question:** How does a Stop on an external agent task actually abort the running child, and what message-state corruption does wiring that abort newly expose?
 - **Decision:** Add `taskControllers = new Map<string, AbortController>()` inside `createDispatcher`; `taskControllers.set(taskId, ac)` at each of the four run-functions (`dispatcher.ts` ~316/655/991/1248) and `.delete` in the existing `.finally()` (`dispatcher.ts:117`); export `cancelExternalTask(taskId): boolean` (idempotent — `ac.abort()` no-ops when already aborted, so double-Stop and cancel-after-exit are safe). Pass a narrow `ExternalCancelFn` (not the whole dispatcher) into `registerTaskRoutes`, wired in `index.ts:254`. **In the same batch**, fix the two pre-existing finalization bugs this newly makes reachable: (1) the four catch blocks update only `tasks` state and leave the assistant `messages` row `status='streaming'` (the BooChat 5-min sweep is a different process and cannot recover it); (2) the warm-backend success path writes `messages.status='complete'` unconditionally before checking abort (`dispatcher.ts` ~853/1122/1377). Fix via a shared `cancelAndFinalize` helper across all four paths: after `await backend.prompt(...)`, `if (ac.signal.aborted)` → write `status='cancelled'`, publish the terminal `message_complete` frame, emit idle, return; in each catch finalize the message with `WHERE status='streaming'` (idempotent), mapping `AbortError → cancelled` vs `error → failed`.
 - **Rationale:** 4-way specialist consensus (on-call B1, behavioral B1, architect A1, junior). The frontend and all four backends already honor the abort signal correctly (PTY `child.kill` `pty-dispatch.ts:159`; warm-ACP `session/cancel` `warm-acp.ts:318`; opencode `session.abort` `opencode-server.ts:775`; claude-sdk interrupt `claude-sdk.ts:209`); the only missing link is the registry + export + route wiring. Shipping the abort wiring **without** the finalization fixes would convert a silent no-op into a new bug (cancelled turns recorded `complete`, or messages stuck `streaming`), so the two are inseparable in one batch (on-call OCE-001/OCE-002, behavioral B2/B3).
 - **Evidence:** `routes/tasks.ts:110-148` (cancel route calls `cancelPendingPermission` + `inference.cancel` native-only + DB cancelled; never reaches dispatcher); `dispatcher.ts:316/655/991/1248` (four private `ac` unreachable); `cancelExternalTask` absent anywhere (synthesis-input C1); finalization bugs at `dispatcher.ts` catch blocks + ~853/1122/1377 (synthesis-input C2); backend signal honoring cited above (scope-brief F1).
 - **Rejected alternatives:**
  - Abort wiring only, defer the finalization fixes — rejected because wiring abort makes the `streaming`/`complete` corruption reachable from the UI for the first time; deferring ships a new bug (synthesis-input C2, on-call/behavioral).
  - Recover stuck messages via the BooChat 5-min sweep — rejected because the sweep runs in a different process (BooChat) and does not see BooCoder's `agent_sessions`/`tasks` finalization (on-call OCE-001).
  - Pass the whole dispatcher into `registerTaskRoutes` — rejected for over-coupling; a narrow `ExternalCancelFn` is sufficient (architect A1).
 - **Specialist owner:** on-call-engineer (resilience) with software-architect (registry shape).
 - **Revisit criterion:** a fifth external backend is added whose abort contract differs from signal-based cancellation, or the warm-vs-one-shot worktree-cleanup split changes.
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** D-7 (F1 terminal state), D-8 (F1 frame-extension over new frame type), D-10.
 - **Referenced in plan:** Implementation Approach (F1), Runtime Behavior, On-Call Resilience Posture, Decomposition and Sequencing, RAID Log (R1).
 ### D-3: F2 prune scope — option A (prune-now-minimal), keep the load-bearing guard
 - **Question:** How far does the tool-call-parser prune go — unexport the dead symbols only, or the full flag-gated retirement of the text-scrape fallback?
 - **Decision:** Option A only. KEEP `extractToolCallBlocks` + `stripToolMarkup` and their types (`ToolCallExtraction`, `ParsedCall`) — the load-bearing `<invoke>`-as-text guard. REMOVE only the `export` keyword (not the implementations) from the 8 zero-external-caller symbols: `isPlaceholderArgValue`, `parseXmlToolCall`, `parseInvokeToolCall`, `partialXmlOpenerStart`, and the 4 consts `XML_TOOL_OPEN/CLOSE`, `INVOKE_TOOL_OPEN/CLOSE`. Zero runtime effect; public export surface goes 11 → 4.
 - **Rationale:** The TS parser is dormant defense-in-depth but the `<invoke>`-as-text path is the *only* guard for "tool call emitted as plain text" — `experimental_repairToolCall` does not cover that case, and the sidecar `--jinja` state is confirmed only by a live probe, not pinned in-repo, so keeping the guard is correct (architect A2, confirms junior OQ-F2c). Unexporting is pure simplification with no behavior change; the relicense already removed the AGPL-dead exports, so there is no license pressure forcing the larger move.
 - **Evidence:** live consumers `stream-phase.ts:263-284` (extractToolCallBlocks text-delta fallback), `tool-phase.ts:122` + `error-handler.ts:25,106` (stripToolMarkup); structured path `stream-phase.ts:285` does all real work today; llama-swap `--jinja` confirmed ON by live probe of `:8401` only (scope-brief F2, synthesis-input C3).
 - **Rejected alternatives:**
  - Option B — flag-gate the text-scrape fallback, validate native parsing on live qwen3.6 for one release, then delete — rejected (deferred) because there is no evidence qwen3.6 stopped emitting `<invoke>` text on live and the sidecar `--jinja` state is unconfirmed in-repo; deleting the only plain-text-tool-call guard on that basis is unsafe (architect/test-engineer R1). See Deferred (YAGNI).
  - Delete the 8 symbols' implementations outright — rejected because three of them (`parseXmlToolCall`, `parseInvokeToolCall`, `isPlaceholderArgValue`) are called internally by the retained `extractToolCallBlocks`; only the `export` keyword is dead.
 - **Specialist owner:** software-architect.
 - **Revisit criterion:** a documented multi-session live probe shows zero text-delta tool calls from qwen3.6 (then Option B reopens).
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** D-2 (F3 logger — safe because F2 keeps `extractToolCallBlocks`), D-4 (F2 gate test).
 - **Referenced in plan:** Implementation Approach (F2+F3), Decomposition and Sequencing, Testing Strategy, RAID Log (R2), Deferred (YAGNI).
 ### D-4: F2 fallback gate test — pin the untested guard before pruning
 - **Question:** The `<invoke>`-as-text fallback is currently exercised by no test; how do we prevent the prune from silently removing it?
 - **Decision:** Add a gate test before the prune: stub `streamText` to emit a text-delta containing a complete `<invoke>` block; assert the call lands in `result.toolCalls` and the markup is NOT present in `result.content`. The test must stay green through the prune and fail if `extractToolCallBlocks` is ever removed from the text-delta path.
 - **Rationale:** Pruning around an untested load-bearing path risks a silent regression; the gate test converts D-3's "keep the guard" commitment into an enforced invariant (test-engineer T6).
 - **Evidence:** untested fallback at `stream-phase.ts:263-284` (synthesis-input C4, test-engineer T6).
 - **Rejected alternatives:**
  - Prune without the gate test, relying on review — rejected because the fallback has no current coverage, so a future removal would pass CI silently (test-engineer T6).
 - **Specialist owner:** test-engineer.
 - **Revisit criterion:** the fallback path is intentionally retired under Option B (then this test is rewritten, not deleted).
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** none.
 - **Referenced in plan:** Testing Strategy, Decomposition and Sequencing.
 ### D-5: F6 stall-timeout via AbortSignal.any; no retry
 - **Question:** How does BooChat detect and recover a hung llama-swap stream server-side, and does it retry?
 - **Decision:** Wrap the `stream-phase.ts:261` `fullStream` loop with a per-chunk stall deadline. Create a local `stallAc = new AbortController()`, pass `effectiveSignal = AbortSignal.any([signal, stallAc.signal])` to `streamText`, bump a `setTimeout(STALL_TIMEOUT_MS = 90_000)` on each chunk, clear it in the existing `finally`. At the post-loop check (`stream-phase.ts:337`) test `signal?.aborted || stallAc.signal.aborted` and throw `AbortError` (→ `handleAbortOrError` writes `cancelled`). **No retry** at `executeStreamPhase`/`streamCompletion`.
 - **Rationale:** Today a hung stream relies entirely on the frontend 60s `discard_stale` watchdog with zero server-side guard; the 90s server stall-timeout closes that gap and reuses the existing abort/finalize path. Retry is deferred (YAGNI): a retry after a partial stream re-emits already-streamed deltas (`state.accumulated` + live `delta` frames are non-idempotent), which is worse than the current behavior; at single-local-instance scale the user re-sending is the correct recovery (on-call, strong).
 - **Evidence:** `stream-phase.ts:261` fullStream loop, `:333` abort check, `:343` usage; frontend 60s `discard_stale` is the only stall guard today (scope-brief F6, synthesis-input C6).
 - **Rejected alternatives:**
  - `Promise.race` of the loop against a timeout — rejected in favor of `AbortSignal.any`, which threads cancellation through `streamText` and the existing finalize path cleanly (OQ-F6a, on-call).
  - Retry/backoff classifier (transient-5xx / stall) — rejected (deferred) because partial-stream re-emit is non-idempotent and llama-swap is a single local instance (synthesis-input YAGNI ledger). See Deferred (YAGNI).
 - **Specialist owner:** on-call-engineer.
 - **Revisit criterion:** llama-swap gains restart-in-place-with-clear-partial, or a second instance is added for failover (then retry reopens).
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** none.
 - **Referenced in plan:** Implementation Approach (F6), On-Call Resilience Posture, Testing Strategy, RAID Log (R3), Deferred (YAGNI).
 ### D-7: F1 terminal state for user Stop — `cancelled`, not `failed`
 - **Question:** When a user hits Stop, what terminal `messages.status` does the finalized assistant message land in?
 - **Decision:** `cancelled` for a user-initiated Stop (`AbortError`); `failed` only for a genuine thrown error in the catch path.
 - **Rationale:** A user Stop is a deliberate, non-error outcome; `MessageStatus` already includes `'cancelled'` and the web reducer can map it without a new enum value. Distinguishing `AbortError → cancelled` vs `error → failed` keeps the human-inbox / failure surfaces honest (resolved OQ, on-call/behavioral).
 - **Evidence:** `MessageStatus` includes `'cancelled'`; reducer map point `CoderPane.tsx:299` (synthesis-input F1 UX + resolved OQs).
 - **Rejected alternatives:**
  - Record user Stop as `failed` — rejected because it pollutes failure surfaces with deliberate user actions (resolved OQ F1 terminal state).
 - **Specialist owner:** behavioral-analyst.
 - **Revisit criterion:** product decides a user Stop should count against a failure/alerting budget.
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** D-8, D-10.
 - **Referenced in plan:** Implementation Approach (F1), Runtime Behavior.
 ### D-8: F1 status surfacing — extend the existing frame, no new frame type
 - **Question:** How does the cancelled terminal state reach the web reducer — a new WS frame type, or an extension of the existing one?
 - **Decision:** Extend the coder `message_complete` frame with an optional `status` field (Option A — minimal); map it in the reducer (`CoderPane.tsx:299`). No new frame type, so the cross-app `WsFrame` parity rule does not force a paired strict-union arm beyond the optional field.
 - **Rationale:** Adding a whole new frame type triggers the full cross-app parity dance (server `InferenceFrame`/`ws-frames.ts` + web `WsFrame`) for a single optional value already carried on a terminal frame; an optional field on the existing frame is the smaller change with the same observable result (UX agent).
 - **Evidence:** reducer map point `CoderPane.tsx:299`; cross-app frame parity rule (CLAUDE.md Conventions; scope-brief cross-cutting constraints).
 - **Rejected alternatives:**
  - New `agent_cancelled` frame type — rejected because it forces a paired strict-union arm in two files for a single optional status value (UX agent, Option A vs B).
 - **Specialist owner:** user-experience-designer.
 - **Revisit criterion:** a second distinct terminal sub-state needs carrying that does not fit `message_complete.status`.
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** D-10.
 - **Referenced in plan:** Implementation Approach (F1), External Interfaces.
 ### D-11: F9 retire :9502 SPA — delete the serve block, keep all API/WS routes
 - **Question:** What exactly is removed when retiring the :9502 fallback SPA, and what must stay?
 - **Decision:** Delete the `if (existsSync(webRoot))` block in `index.ts` (~269-289) which already no-ops when the dist is absent; keep the inline 404 handler (`{error:'not found'}`). Remove `apps/coder/web` from `pnpm-workspace.yaml`, the coder build step, and the Dockerfile copy; remove the now-unused `fastifyStatic` import (verify it is only used there). KEEP all `/api/coder/*` REST + WS + `/api/health` + `--mcp` routes — CoderPane depends on them. Optionally add a 2-line `GET /` redirect-to-BooChat (no `fastifyStatic`).
 - **Rationale:** Sam confirmed "I don't use 9502"; primary UI is CoderPane inside BooChat. OQ-F9a resolved: nothing probes `GET /` on :9502 (health is `/api/health`; the compose healthcheck targets the boocode container, not the host-systemd coder), so 404-or-redirect at `/` is safe (architect A5, verified).
 - **Evidence:** serve block `index.ts` ~269-289; `GET /` unprobed (synthesis-input C8, OQ-F9a resolved); scope-brief F9 / DEFERRED #5 removal checklist.
 - **Rejected alternatives:**
  - Keep the SPA — rejected; Sam greenlit removal and the build step is dead weight (scope-brief F9).
  - Remove the REST/WS routes too — rejected because CoderPane inside BooChat depends on every `/api/coder/*` route (architect A5).
 - **Specialist owner:** software-architect.
 - **Revisit criterion:** a standalone :9502 UI is ever wanted again (would be a fresh feature, not a revert).
 - **Dissent (if any):** none.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** none.
 - **Referenced in plan:** Implementation Approach (F9), Operational Readiness, Decomposition and Sequencing.
 ### D-12: F4/F5/F8 disposition under the standing override — document as Blocked, do not halt
 - **Question:** The spec-maturity gate tripped on F4/F5/F8; the skill says recommend routing them out. Sam issued a standing override to plan everything. How are they recorded?
 - **Decision:** Proceed with the plan but record F4, F5, F8 in a structurally separate **BLOCKED** tier with their exact blocking open question(s) and recommended resolution path, rather than halting synthesis. F4 → route to `plan-a-feature` (hook-firing-in-unattended-mode premise UNVERIFIED + goose hook mechanism unknown). F5 → SDK capability check (pinned `@opencode-ai/sdk` exposes no compaction event arm); UI treatment (sentinel-row vs ephemeral-frame) stays disputed until the event is confirmed to exist. F8 → route to `plan-a-feature` (no line-selection infra exists, diff source ambiguous). These three do NOT block the ready cluster (F1/F2/F3/F6/F7/F9).
 - **Rationale:** The spec-maturity gate tripped with ≥5 spec-level findings (C9, C11, C12, C13, OQ-F4b, OQ-F8a) concentrated in F4/F5/F8 across junior, behavioral, and UX. Sam pre-acknowledged the WANT items would be planned more shallowly when choosing scope "everything we discussed," so the honest synthesis records them as Blocked with explicit reopen paths rather than fabricating plan-level resolutions or stalling the ready work.
 - **Evidence:** spec-maturity gate TRIPPED (synthesis-input "Spec-maturity gate"); USER OVERRIDE STANDING (same section); blocking OQs OQ-F4a/F4b, OQ-F5a/F5b, OQ-F8a/b/c (synthesis-input Open Questions); claims C9, C11, C12, C13.
 - **Rejected alternatives:**
  - Halt synthesis and route F4/F5/F8 out before any planning — rejected because Sam's standing override directs the plan to proceed and document (synthesis-input gate section).
  - Plan F4/F5/F8 at plan-level alongside the ready cluster — rejected because their core premises are unverified (F4) / capability-blocked (F5) / infra-absent (F8); plan-level decisions would rest on unproven assumptions (C9/C11/C13).
 - **Specialist owner:** project-manager.
 - **Revisit criterion:** the named blocking OQ for an item resolves (F4: hooks confirmed to fire unattended + goose format known; F5: SDK compaction event confirmed; F8: diff source chosen + line-selection approach specified) — then that item graduates to its own plan.
 - **Dissent (if any):** none; the gate-trip is acknowledged rather than overridden silently.
 - **Driven by rounds:** R1.
 - **Dependent decisions:** none.
 - **Referenced in plan:** Implementation Approach (Blocked tier), RAID Log (R4, R5, assumptions), Open Items, Deferred (YAGNI).
--- a/docs/plans/post-review-backlog/artifacts/implementation-iteration-history.md
+++ b/docs/plans/post-review-backlog/artifacts/implementation-iteration-history.md
@@ -0,0 +1,48 @@
 # Implementation Iteration History: Post-Review Backlog (F1–F9)
 <!--
 This file records how the implementation plan for the post-review backlog evolved across discussion rounds.
 Committed decisions live in [implementation-decision-log.md](implementation-decision-log.md) and the primary
 plan lives in [../feature-implementation-plan.md](../feature-implementation-plan.md). It also consolidates
 the project-manager's per-round facilitation output (claim ledger, Open Questions, spec-maturity tags).
 The loop converged in one round; remaining unknowns are spec-level, not resolvable by more specialist rounds.
 -->
 ## R1: Parallel six-specialist backlog review
 - **Specialists engaged:** on-call-engineer, behavioral-analyst, software-architect, test-engineer, user-experience-designer, junior-developer, project-manager (coordinator). Team size: large (cross-subsystem; user chose scope "everything we discussed"). Round cap 3; converged in 1.
 - **New input provided:** Initial inputs — the [scope brief](../scope-brief.md) (ground-truth spec stand-in; two items live-verified 2026-06-02 with file:line evidence) and the [discovery notes](.discovery-notes.md) (per-item code touch points). No prior round; this is the initial sweep.
 - **Claim ledger:** (consolidated, deduped — see [synthesis-input.md](synthesis-input.md) for the full table)
  | # | Claim | State | Spec-maturity |
  |---|-------|-------|---------------|
  | C1 | F1 cancel route never aborts external child; no registry/export | Evidenced | plan-level |
  | C2 | F1 catch blocks leave message `streaming`; success path writes `complete` on abort — fix same batch | Evidenced | plan-level |
  | C3 | F2 = prune-now-minimal: unexport 8 zero-caller symbols, keep extractToolCallBlocks+stripToolMarkup | Evidenced | plan-level |
  | C4 | F2 `<invoke>`-text fallback untested → add gate test before prune | Evidenced | plan-level |
  | C5 | F3 optional logger param, do with F2 (same file) | Evidenced | plan-level |
  | C6 | F6 stall-timeout via AbortSignal.any, 90s; NO retry (non-idempotent deltas) | Evidenced | plan-level |
  | C7 | F7 inline MCP tool, messages_with_parts, role!='system', limit 50/200 | Evidenced | plan-level |
  | C8 | F9 delete SPA block, keep routes; GET / unprobed → safe | Evidenced | plan-level |
  | C9 | F4 hook-firing in unattended mode UNVERIFIED; goose hook mechanism unknown | Anecdotal (premise) | spec-level |
  | C10 | F4 dedup rule: confirm running before `blocked`; suppress hook `done` | Evidenced | plan-level |
  | C11 | F5 pinned @opencode-ai/sdk exposes no compaction arm → blocked on capability check | Evidenced | spec-level |
  | C12 | F5 UI treatment sentinel-row vs ephemeral-frame | Disputed | spec-level |
  | C13 | F8 no line-selection infra; diff source ambiguous; needs own spec | Evidenced | spec-level |
 - **Open Questions raised:**
  - F1: terminal state (→ D-7, `cancelled`); registry key (→ D-1, `taskId`); shared finalize helper (→ D-1, yes); warm re-throw on abort (→ D-1, short-circuit on `ac.signal.aborted`).
  - OQ-F2a (sidecar jinja) → moot under D-3 option A; OQ-F2c (a vs b) → D-3 option A.
  - OQ-F6a/b/c → D-5 (AbortSignal.any, no retry, 90s).
  - OQ-F7a (session vs chat id) → D-6 (both, chat_id optional, + limit).
  - OQ-F9a (GET / probe) → D-11 (unprobed, safe).
  - OQ-F4a (hooks fire unattended?), OQ-F4b (goose hook format) → UNRESOLVED, spec-level → OI-1, route F4 to plan-a-feature (D-12).
  - OQ-F5a (SDK compaction event existence/name) → UNRESOLVED, capability check → OI-2, blocks F5 (D-12).
  - OQ-F5b (sentinel vs ephemeral UI) → UNRESOLVED → OI-3, settle once event confirmed (D-12).
  - OQ-F8a/b/c (diff source, serialization, new viewer) → UNRESOLVED, spec-level → OI-4, route F8 to plan-a-feature (D-12).
  - OI-5 (F1 best-effort session-stop leg) → non-blocking, decided at implementation.
 - **Spec-maturity tags:** plan-level — C1-C8, C10 (9 claims). spec-level — C9, C11, C12, C13, plus OQ-F4b and OQ-F8a (≥5 across junior, behavioral, UX). **Spec-maturity gate TRIPPED**, concentrated in the three WANT items F4/F5/F8; F1/F2/F3/F6/F7/F9 are all plan-level and ready. No T#-contradictions.
 - **Resolution source:** evidence (Step 6 specialist findings) for every plan-level OQ (F1, F2, F3, F6, F7, F9); user input for the gate disposition (Sam's standing override → D-12); deferred-to-spec for OQ-F4a/F4b, OQ-F5a/F5b, OQ-F8a/b/c (recorded as OI-1..OI-4, routed out rather than resolved in this loop). The YAGNI gate ran during synthesis: F6 retry, F2 option B, F4 interface, F5 extra compaction arms, F7 reader interface all deferred.
 - **Decisions produced:** D-1, D-2, D-3, D-4, D-5, D-6, D-7, D-8, D-9, D-10, D-11, D-12 (all 12; the loop converged in one round so every decision originates here).
 - **Changed in plan:** all sections (initial authoring) — Source Specification, Outcome, Context, Implementation Approach (TIER 1 READY / TIER 2 BLOCKED), Decomposition and Sequencing, RAID Log, Testing Strategy, Security Posture, Operational Readiness, On-Call Resilience Posture, Definition of Done, Specialist Handoffs, Deferred (YAGNI), Open Items, Summary.
 - **Project-manager next-step recommendation:** Go to synthesis (done — this plan). Build the READY cluster in order F1 → F2+F3 → F6 → F7 → F9 as sequential patch tags; route F4 and F8 to `plan-a-feature` and F5 to an `@opencode-ai/sdk` capability check before any build on those three.
--- a/docs/plans/post-review-backlog/artifacts/synthesis-input.md
+++ b/docs/plans/post-review-backlog/artifacts/synthesis-input.md
@@ -0,0 +1,181 @@
 # Synthesis input — Round 1 aggregation + dispositions
 Deterministic aggregation of the Round-1 specialist review (on-call-engineer, behavioral-analyst,
 software-architect, test-engineer, user-experience-designer, junior-developer). This is the consolidated
 record the project-manager synthesizes into the three plan files. Evidence (file:line) is preserved inline.
 Team size: large (cross-subsystem, user chose "everything"). Round cap 3; converged in 1 round (the
 remaining unknowns are spec-level, not resolvable by more specialist rounds).
 ---
 ## Per-feature dispositions (the decisions)
 ### READY TO BUILD
 **F1 — external task cancel kills child + finalizes message.** Strong 4-way consensus (on-call B1, behavioral
 B1, architect A1, junior).
 - Root cause CONFIRMED: `routes/tasks.ts:130-138` calls `inference.cancel` (native-only); dispatcher has no
  `Map<taskId,AbortController>`; the four private `ac` (dispatcher.ts:316/655/991/1248) are unreachable;
  `cancelExternalTask` does not exist anywhere.
 - Design (architect A1): add `taskControllers = new Map<string,AbortController>()` inside `createDispatcher`;
  `taskControllers.set(taskId, ac)` at each of the 4 run-functions; delete in the existing `.finally()` at
  dispatcher.ts:117; export `cancelExternalTask(taskId): boolean` (idempotent — `ac.abort()` is a no-op when
  already aborted, so double-Stop and cancel-after-exit are safe). Pass a narrow `ExternalCancelFn`
  (NOT the whole dispatcher) into `registerTaskRoutes`; wire in `index.ts:254`.
 - TWO pre-existing bugs F1 makes reachable, MUST be fixed in the same batch (on-call OCE-001/OCE-002,
  behavioral B2/B3): (1) the four catch blocks update only `tasks` state, never the `messages` row → an
  aborted/thrown turn leaves the assistant message `status='streaming'` (BooChat's 5-min sweep can't recover
  it — different process); (2) the warm-backend success path writes `messages.status='complete'`
  unconditionally before checking abort (dispatcher.ts ~853/1122/1377) → a cancelled turn is recorded
  `complete`. Fix: after `await backend.prompt(...)`, `if (ac.signal.aborted)` → write `status='cancelled'`,
  publish the terminal `message_complete` frame, emit idle, return; and in each catch finalize the message
  with `WHERE status='streaming'` (idempotent) distinguishing AbortError→cancelled vs error→failed.
 - UX (UX agent): disable the Stop button while the cancel POST is in flight (mobile double-tap); extend the
  coder `message_complete` frame with an optional `status` field (Option A — minimal, no new frame type) and
  map it in the reducer (`CoderPane.tsx:299`, `MessageStatus` already includes `'cancelled'`); render a muted
  "Stopped" label (not red, not a toast).
 - Tests (test-engineer T1-T3): extract a pure `CancelRegistry` (register/cancel/delete/has) — 4 unit cases,
  no DB/child; one DB-integration test for the route → row lands `'cancelled'`; warm-worktree-preserved held
  as a code comment, not a spy.
 - Resolved OQs: terminal state = `cancelled` (not `failed`) for user Stop; registry keyed by `taskId`
  (route receives taskId); `session/stop` route — CoderPane already calls `cancelTask` for external tasks so
  the session-stop path "never fires for external from UI" (on-call) — wire it best-effort via a
  `SELECT id FROM tasks WHERE session_id=$ AND state='running'` lookup OR defer that leg (low value); use a
  shared `cancelAndFinalize` helper across the 4 paths (TDD precedent).
 **F2 — tool-call-parser prune (option a: prune-now-minimal).** DECISION (architect A2, confirms junior
 OQ-F2c): do NOT do the flag-gated full retirement (option b). KEEP `extractToolCallBlocks` + `stripToolMarkup`
 + their types (`ToolCallExtraction`, `ParsedCall`) — load-bearing `<invoke>`-as-text guard (the only guard for
 that case; `experimental_repairToolCall` doesn't cover it; sidecar `--jinja` unconfirmed so keeping the guard
 is correct). REMOVE the `export` keyword (not the implementations) from the 8 zero-external-caller symbols:
 `isPlaceholderArgValue`, `parseXmlToolCall`, `parseInvokeToolCall`, `partialXmlOpenerStart`, and the 4 consts
 `XML_TOOL_OPEN/CLOSE`, `INVOKE_TOOL_OPEN/CLOSE`. Zero runtime effect; public surface 11→4 exports.
 - Test gap (test-engineer T6): the `<invoke>`-text fallback in `stream-phase.ts:263-284` is currently NOT
  exercised by any test → add a gate test (stub `streamText` to emit a text-delta containing a complete
  `<invoke>` block; assert it lands in `result.toolCalls` and the markup is NOT in `result.content`). Must
  stay green through the prune and fail if `extractToolCallBlocks` is ever removed from the text-delta path.
 **F3 — xml-parser structured logging.** Trivial. `tool-call-parser.ts:65` `console.debug` → pass an optional
 `log?: { debug }` param to `extractToolCallBlocks` from its one call site (`stream-phase.ts` executeStreamPhase)
 and use it. No interface (architect: one site, one impl). SEQUENCING: same file as F2; F2 keeps
 `extractToolCallBlocks` (decided), so F3 is safe; do F2+F3 in one batch. Confirm `executeStreamPhase`
 signature/test-stubs tolerate the param (junior).
 **F6 — BooChat stall-timeout ONLY (retry deferred).** on-call: wrap the `stream-phase.ts:261` fullStream loop
 with a per-chunk stall deadline: a local `stallAc = new AbortController()`, `effectiveSignal =
 AbortSignal.any([signal, stallAc.signal])` passed to `streamText`; bump a `setTimeout(STALL_TIMEOUT_MS=90_000)`
 on each chunk; clear it in the existing `finally`; at the post-loop check (stream-phase.ts:337) test
 `signal?.aborted || stallAc.signal.aborted` and throw `AbortError` (→ `handleAbortOrError` writes
 `cancelled`). Tests (test-engineer T8-T10): pure `classifyStreamError(err)` helper (5 cases, no I/O) + a
 `vi.useFakeTimers()` stall test on a fake hanging stream + a regression pin on the existing `signal?.aborted`
 post-loop check.
 - YAGNI DEFER (on-call, strong): NO retry at `executeStreamPhase`/`streamCompletion`. A retry after partial
  stream re-emits already-streamed deltas (`state.accumulated` + live `delta` frames are non-idempotent) —
  worse than current. Reopen trigger: llama-swap gains restart-in-place-with-clear-partial, or a second
  instance for failover. The user re-sending is the correct recovery at single-instance scale.
 **F7 — view_session_history MCP tool.** architect A4: add tool 7 inline in `mcp-server.ts` (follows the
 existing 6-tool inline pattern, `textResult` + direct `sql`). Reads `messages_with_parts`, `WHERE role !=
 'system'` (strips sentinels), params `session_id` + optional `chat_id` + `limit` (default 50, max 200),
 `ORDER BY created_at ASC`. No interface, no pagination beyond limit. Returns `{role,content,...}[]`.
 **F9 — retire apps/coder/web :9502 SPA.** architect A5: the `if (existsSync(webRoot))` block in `index.ts`
 (~269-289) already no-ops when the dist is absent. Delete that block, keep the inline 404 handler
 (`{error:'not found'}`); remove `apps/coder/web` from `pnpm-workspace.yaml`, the coder build step, and the
 Dockerfile copy; remove the now-unused `fastifyStatic` import (verify it's only used there). KEEP all
 `/api/coder/*` REST + WS + `/api/health` + `--mcp` routes (CoderPane depends on them). OQ-F9a RESOLVED:
 nothing probes `GET /` on :9502 (health is `/api/health`; compose healthcheck is the boocode container, not
 the host-systemd coder) → safe to 404 or add a 2-line `GET /` redirect-to-BooChat (no fastifyStatic).
 ### BLOCKED — need a spec or a capability check before building (gate-trip items)
 **F4 — notify-hook config injection.** SPEC-LEVEL gaps (junior OQ-F4a-e, behavioral B4, UX). The core premise
 is UNVERIFIED: do claude / qwen / goose actually fire their native lifecycle hooks in unattended mode
 (`claude -p` / SDK, `qwen --acp` / `--output-format stream-json`, goose)? goose's hook file/format is unknown
 (not in repo). Idempotent per-agent settings.json merge strategy unspecified. `boocoder.service` run-user /
 `homedir()` resolution unconfirmed. The inbound POST is a new unauthenticated localhost route (acceptable
 single-user, note it). Double-publish dedup with the v2.7.6 turn-boundary publish: behavioral B4 +
 architect A3 agree on the rule — inbound route calls `normalizeAgentEvent` (returns bucket
 `working|blocked|done`), confirms `tasks.state='running'` before publishing `blocked`, and SUPPRESSES `done`
 (the dispatcher already emits `idle`); `done`→drop, never re-publish. UI side already exists (AgentStatusDot,
 all 4 buckets — UX: F4 is server-side only). RECOMMENDATION: own `plan-a-feature` — the dedup rule + module
 shapes are settled, but the hook-firing-in-unattended-mode premise and goose hook mechanism must be verified
 first or the whole feature is built on sand.
 **F5 — opencode compaction surfacing.** BLOCKED on a capability check. The installed `@opencode-ai/sdk`
 exposes NO compaction event arm (current arms confirmed: `session.next.{text,reasoning,tool,step}.*`,
 `message.part.*`, `session.idle/error` at opencode-server.ts:379-491). The review's "consume
 compaction.{started,delta,ended}" assumed events from opencode's CORE `event.ts`, which the pinned SDK may not
 surface. MUST confirm the SDK emits a compaction signal + its exact event name (or an SDK bump is needed)
 before building. DISPUTED UI treatment (behavioral B5 = persistent sentinel row `metadata.kind='compaction'`,
 survives refresh; UX = ephemeral inline divider via a new `agent_compacted` frame, no DB row) — settle once
 the event exists. Only `compaction.ended` is in scope (YAGNI: started/delta/step.failed/tool.progress out).
 Cross-app WS-frame parity is certain if a frame is added.
 **F8 — diff-line → agent re-prompt.** SPEC-LEVEL (UX + junior, firm). The "DiffPanel" is inline in
 `CoderPane.tsx:478-619`, rendering `pending_changes` rows as a static `<pre>` (CoderPane.tsx:607-610) — NO
 line-selection infrastructure exists. Diff source ambiguous (`pending_changes.diff` = BooCode write-tools only
 vs the external-agent worktree git diff). "Send to new agent" needs coordinated workspace-pane + chat creation
 + pre-population across 3 surfaces with no existing contract. Selection diverges by modality (desktop line-
 select vs mobile long-press → bottom sheet). RECOMMENDATION: own `plan-a-feature` (the scope-brief already
 hedged this; treat as firm). MVP-if-pushed: "comment to current agent" only, block-level selection,
 pre-populate `ChatInput` — still wants a spec.
 ---
 ## Claim ledger (consolidated, deduped)
 | # | Claim | State | Spec-maturity | Supporting |
 |---|-------|-------|---------------|-----------|
 | C1 | F1 cancel route never aborts external child; no registry/export | Evidenced | plan-level | on-call,behavioral,architect,junior |
 | C2 | F1 catch blocks leave message `streaming`; success path writes `complete` on abort — fix in same batch | Evidenced | plan-level | on-call,behavioral |
 | C3 | F2 = prune-now-minimal: unexport 8 zero-caller symbols, keep extractToolCallBlocks+stripToolMarkup | Evidenced | plan-level | architect (test-engineer guard) |
 | C4 | F2 `<invoke>`-text fallback is untested → add gate test before prune | Evidenced | plan-level | test-engineer |
 | C5 | F3 optional logger param, do with F2 (same file) | Evidenced | plan-level | architect,junior |
 | C6 | F6 stall-timeout via AbortSignal.any, 90s; NO retry (non-idempotent deltas) | Evidenced | plan-level | on-call,behavioral,test-engineer |
 | C7 | F7 inline MCP tool, messages_with_parts, role!='system', limit 50/200 | Evidenced | plan-level | architect,UX |
 | C8 | F9 delete SPA block, keep routes; GET / unprobed → safe | Evidenced | plan-level | architect (+ verified) |
 | C9 | F4 hook-firing in unattended mode UNVERIFIED; goose hook mechanism unknown | Anecdotal (premise) | spec-level | junior,behavioral,UX |
 | C10 | F4 dedup rule: confirm running before `blocked`; suppress hook `done` | Evidenced | plan-level | behavioral,architect |
 | C11 | F5 pinned @opencode-ai/sdk exposes no compaction arm → blocked on capability check | Evidenced | spec-level | (verified) + junior |
 | C12 | F5 UI treatment sentinel-row vs ephemeral-frame | Disputed | spec-level | behavioral vs UX |
 | C13 | F8 no line-selection infra; diff source ambiguous; needs own spec | Evidenced | spec-level | UX,junior |
 ## Open Questions — resolutions
 - OQ (F1 terminal state) → RESOLVED: `cancelled`. OQ (F1 registry key) → RESOLVED: `taskId`. OQ (F1 shared
  finalize helper) → RESOLVED: yes, pure helper. OQ (F1 warm re-throw on abort) → RESOLVED: short-circuit on
  `ac.signal.aborted`.
 - OQ-F2a (sidecar jinja) → RESOLVED moot: option a keeps the guard. OQ-F2c (a vs b) → RESOLVED: option a.
 - OQ-F6a/b/c → RESOLVED: AbortSignal.any (not Promise.race); no retry; 90s.
 - OQ-F7a (session vs chat id) → RESOLVED: both (chat_id optional) + limit.
 - OQ-F9a (GET / probe) → RESOLVED: unprobed, safe.
 - OQ-F4a (hooks fire unattended?), OQ-F4b (goose hook format) → UNRESOLVED, spec-level → route to F4 spec.
 - OQ-F5a (SDK compaction event name/existence) → UNRESOLVED, capability check → blocks F5.
 - OQ-F5b (sentinel vs ephemeral UI) → UNRESOLVED → settle in F5 once event confirmed.
 - OQ-F8a/b/c (diff source, serialization, new viewer) → UNRESOLVED, spec-level → route to F8 spec.
 ## Spec-maturity gate
 TRIPPED (≥5 spec-level findings — C9, C11, C12, C13, plus OQ-F4b/F8a — across ≥3 specialists: junior,
 behavioral, UX). The trip is CONCENTRATED in the three WANT items F4/F5/F8; F1/F2/F3/F6/F7/F9 are all
 plan-level and ready. Per skill: gate-trip → recommend the user route F4/F8 to `plan-a-feature` and F5 to a
 capability check. USER OVERRIDE STANDING: Sam chose scope "everything we discussed" having pre-acknowledged the
 WANT items would be planned more shallowly — so the plan proceeds, documenting F4/F5/F8 as Blocked/own-spec
 rather than halting. Decision deferred to Step 9 user presentation.
 ## YAGNI ledger
 - F6 retry logic → DEFER (non-idempotent re-emit of streamed deltas). Reopen: llama-swap restart-in-place or
  second instance. Source: on-call R1.
 - F2 option b (flag-gated full retirement of extractToolCallBlocks/stripToolMarkup) → DEFER (no evidence
  qwen3.6 stopped emitting `<invoke>` text on live; sidecar jinja unconfirmed). Reopen: documented multi-
  session live probe shows zero text-delta tool calls. Source: architect/test-engineer R1.
 - F4 `NotifyHookInjection` interface → REPLACE with one concrete function switching on agent name (3 agents,
  identical read-merge-write). Source: architect R1.
 - F5 handling of compaction.started/delta + step.failed + tool.progress → DEFER, only compaction.ended is
  user-actionable. Source: behavioral R1.
 - F7 SessionHistoryReader interface / pagination → REPLACE with inline query + limit. Source: architect R1.
 - Provider tier-2 follow-ups (snapshot frame, enabled column, shared types, MCP list_providers) → already
  DEFER/DROP per scope-brief; not re-planned.
--- a/docs/plans/post-review-backlog/feature-implementation-plan.md
+++ b/docs/plans/post-review-backlog/feature-implementation-plan.md
@@ -0,0 +1,350 @@
 # Feature Implementation Plan: Post-Review Backlog (F1–F9)
 This is a multi-item backlog, not a single feature. It commits to shipping six **READY** items as
 independent sequential patch tags ([D-9](artifacts/implementation-decision-log.md#trivial-decisions)), one batch per
 coherent unit, honoring deploy-by-surface and stage-commits-by-path; and to documenting three **BLOCKED**
 WANT items (F4/F5/F8) with their exact blocking open questions and recommended resolution paths
 ([D-12](artifacts/implementation-decision-log.md#d-12-f4f5f8-disposition-under-the-standing-override--document-as-blocked-do-not-halt))
 rather than halting on the tripped spec-maturity gate.
 ## Source Specification
 - **Feature specification:** No formal `feature-specification.md` exists. The ground-truth source is the [scope brief](scope-brief.md), which captures the items, decisions, and live-verified current state from the 2026-06-02 backlog review conversation plus `boocode_code_review_v2.md`, `docs/DEFERRED-WORK.md`, and `boocode_roadmap.md`. Origin is conversational + review-doc, so the WANT items are intentionally planned more shallowly.
 - **Consolidated specialist record:** [artifacts/synthesis-input.md](artifacts/synthesis-input.md) — the Round-1 aggregation of all six specialists (on-call-engineer, behavioral-analyst, software-architect, test-engineer, user-experience-designer, junior-developer) with file:line evidence, the claim ledger, OQ resolutions, the spec-maturity gate, and the YAGNI ledger. There is no separate per-specialist file; this digest is authoritative.
 - **Discovery notes:** [artifacts/.discovery-notes.md](artifacts/.discovery-notes.md) — project context and per-item code touch points.
 ## Outcome
 When this plan's READY cluster is executed: hitting Stop on an external agent task actually kills the
 running child and finalizes the assistant message in a correct terminal state (no more `streaming`-stuck or
 falsely-`complete` rows); the tool-call parser's public surface is pruned to its four load-bearing exports
 while the plain-text `<invoke>` guard stays intact and is for the first time test-pinned; the parser's
 rejection logging flows through pino/`LOG_LEVEL`; a hung BooChat stream is detected and finalized
 server-side after 90s instead of relying solely on the frontend watchdog; an MCP client can read a
 session's transcript through a new BooCoder MCP tool; and the unused :9502 fallback SPA and its build step
 are gone. F4/F5/F8 leave this plan with a documented, evidence-backed route to their own spec/capability
 work.
 ## Context
 - **Driving constraint:** A post-review backlog with two items live-verified as active correctness bugs (F1 cancel is a no-op that leaves messages stuck; F6 has zero server-side stall guard). The review surfaced them; Sam chose scope "everything we discussed." No external deadline — value is correctness and simplification debt paydown.
 - **Stakeholders:** Sam (sole user/operator — wants Stop to work, wants the parser simplified, wants resilience and the MCP history tool, greenlit the :9502 removal). On-call posture is single-operator: Sam is the post-ship owner for every item.
 - **Future-state concern:** F1 touches the dispatcher's message-finalization paths across four backends — the risk to watch is finalization races/regressions there. F6 introduces a server-side timer that must not fire on a healthy-but-slow stream. F5 is a standing capability dependency on the pinned `@opencode-ai/sdk`.
 - **Out-of-scope boundary:** F2 Option B (flag-gated full parser retirement), F6 retry/backoff, F4's superset code (clean-room pattern only), and everything in the scope brief's "Deferred (Sam's explicit dispositions)" list (subagent permission demux, tier-2 provider follow-ups, large-file splits, PR-resolver, etc.). These are recorded, not planned.
 ## Team Composition and Participation
 Round-by-round detail lives in [artifacts/implementation-iteration-history.md](artifacts/implementation-iteration-history.md). All six specialists converged in one round; remaining unknowns are spec-level, not resolvable by more specialist rounds.
 | Specialist | Status | Key Input |
 |------------|--------|-----------|
 | `project-manager` | Coordinator | Facilitated R1, applied the spec-maturity gate + YAGNI gate, synthesized this plan. |
 | `on-call-engineer` | Active | F1 finalization bugs (OCE-001/002); F6 stall-timeout design + no-retry YAGNI call. |
 | `behavioral-analyst` | Active | F1 message-state corruption (B2/B3); F5 sentinel-row UI position (disputed); F4 dedup rule. |
 | `software-architect` | Active | F1 registry/`ExternalCancelFn` shape (A1); F2 option-A prune (A2); F7 inline tool (A4); F9 removal (A5). |
 | `test-engineer` | Active | F1 `CancelRegistry` + DB test (T1-T3); F2 fallback gate test (T6); F6 `classifyStreamError` + fake-timer test (T8-T10). |
 | `user-experience-designer` | Active | F1 disable-Stop-while-in-flight + Option-A frame extension + muted "Stopped" label; F5 ephemeral-divider UI (disputed); F8 no line-selection infra. |
 | `junior-developer` | Active | OQ reframing across all items; F2c (option a), F4a/b unverified premise, F8 spec-level. |
 ## Implementation Approach
 ALTITUDE: this section names code artifacts and inlines only decision-bearing values (flag names, the 90s
 stall timeout, the 50/200 limits, the export list to unexport). Full implementations belong in the files
 themselves; mechanics live in the cited decision-log entries.
 ### TIER 1 — READY TO BUILD
 **F1 — external task cancel kills the child + finalizes the message (apps/coder, standalone, do first).**
 The highest-value correctness fix. A `taskControllers = new Map<string,AbortController>()` registry inside
 `createDispatcher`, populated at the four run-functions and deleted in the existing `.finally()`, plus an
 exported idempotent `cancelExternalTask(taskId): boolean` wired into `POST /api/tasks/:id/cancel` via a
 narrow `ExternalCancelFn`
 ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope)).
 This batch **must** also fix the two pre-existing finalization bugs the abort wiring newly makes reachable —
 the catch blocks that leave messages `streaming` and the warm success path that writes `complete` on an
 aborted turn — via a shared `cancelAndFinalize` helper, or it ships a new bug
 ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope)).
 User Stop finalizes to `cancelled`, not `failed`
 ([D-7](artifacts/implementation-decision-log.md#d-7-f1-terminal-state-for-user-stop--cancelled-not-failed));
 the terminal state reaches the web reducer via an optional `status` field on the existing coder
 `message_complete` frame, not a new frame type
 ([D-8](artifacts/implementation-decision-log.md#d-8-f1-status-surfacing--extend-the-existing-frame-no-new-frame-type)),
 rendered as a muted "Stopped" label ([D-10](artifacts/implementation-decision-log.md#trivial-decisions)). The
 `session/stop` leg "never fires for external from UI" (CoderPane already calls `cancelTask`), so it is wired
 best-effort via a `WHERE session_id=$ AND state='running'` lookup or deferred as a low-value leg.
 **F2 + F3 together (apps/server, same file).** F2 is option-A prune-now-minimal: KEEP `extractToolCallBlocks`
 + `stripToolMarkup` (the only guard for a tool call emitted as plain text), unexport the 8 zero-caller
 symbols — `isPlaceholderArgValue`, `parseXmlToolCall`, `parseInvokeToolCall`, `partialXmlOpenerStart`, and
 the consts `XML_TOOL_OPEN/CLOSE`, `INVOKE_TOOL_OPEN/CLOSE` — taking the public surface 11 → 4 with zero
 runtime effect
 ([D-3](artifacts/implementation-decision-log.md#d-3-f2-prune-scope--option-a-prune-now-minimal-keep-the-load-bearing-guard)).
 A gate test pins the untested `<invoke>`-as-text fallback before the prune
 ([D-4](artifacts/implementation-decision-log.md#d-4-f2-fallback-gate-test--pin-the-untested-guard-before-pruning)).
 Because F2 keeps `extractToolCallBlocks`, F3 is safe to land in the same file/batch: thread an optional
 `log?: { debug }` param into `extractToolCallBlocks` from its one call site so rejection logging flows
 through pino/`LOG_LEVEL` instead of `console.debug` ([D-2](artifacts/implementation-decision-log.md#trivial-decisions)).
 **F6 — BooChat stall-timeout ONLY (apps/server, standalone).** Wrap the `stream-phase.ts:261` `fullStream`
 loop with a per-chunk stall deadline using `AbortSignal.any([signal, stallAc.signal])` and a
 `setTimeout(STALL_TIMEOUT_MS = 90_000)` bumped on each chunk, cleared in the existing `finally`; the
 post-loop check throws `AbortError` so the existing finalize path writes `cancelled`
 ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)). **No
 retry** — partial-stream re-emit is non-idempotent and the instance is single+local (see Deferred (YAGNI)).
 **F7 — `view_session_history` MCP tool (apps/coder MCP, standalone).** Tool 7 added inline in
 `mcp-server.ts` following the existing 6-tool inline pattern (`textResult` + direct `sql`). Reads
 `messages_with_parts` `WHERE role != 'system'`, params `session_id` + optional `chat_id` + `limit` (default
 50, max 200), `ORDER BY created_at ASC` ([D-6](artifacts/implementation-decision-log.md#trivial-decisions)).
 No interface, no pagination beyond `limit`.
 **F9 — retire :9502 SPA (apps/coder, standalone cleanup).** Delete the `if (existsSync(webRoot))` serve
 block in `index.ts` (~269-289), keep the inline 404; remove `apps/coder/web` from `pnpm-workspace.yaml`, the
 coder build step, and the Dockerfile copy; drop the now-unused `fastifyStatic` import. KEEP every
 `/api/coder/*` REST + WS + `/api/health` + `--mcp` route
 ([D-11](artifacts/implementation-decision-log.md#d-11-f9-retire-9502-spa--delete-the-serve-block-keep-all-apiws-routes)).
 ### TIER 2 — BLOCKED (need their own spec / a capability check)
 These three are recorded, not built. They do **not** block the READY cluster
 ([D-12](artifacts/implementation-decision-log.md#d-12-f4f5f8-disposition-under-the-standing-override--document-as-blocked-do-not-halt)).
 **F4 — notify-hook config injection.** The module shapes and the double-publish dedup rule are settled
 (inbound route calls `normalizeAgentEvent` → bucket; confirms `tasks.state='running'` before publishing
 `blocked`; SUPPRESSES `done` since the dispatcher already emits `idle`). But the **core premise is
 unverified**: it is unknown whether claude / qwen / goose actually fire their native lifecycle hooks in
 unattended mode (`claude -p`/SDK, `qwen --acp`/`stream-json`, goose), and goose's hook file/format is
 unknown (not in repo). **Recommended path:** route to `plan-a-feature`; verify hook-firing-in-unattended-mode
 and the goose hook mechanism first, or the feature is built on sand. Blocking OQs: OQ-F4a (hooks fire
 unattended?), OQ-F4b (goose hook format?).
 **F5 — opencode compaction surfacing.** **Blocked on a capability check**: the pinned `@opencode-ai/sdk`
 exposes NO compaction event arm (confirmed arms: `session.next.{text,reasoning,tool,step}.*`,
 `message.part.*`, `session.idle/error`). The review assumed events from opencode's core `event.ts` that the
 pinned SDK may not surface. **Recommended path:** confirm the SDK emits a compaction signal and its exact
 event name (or an SDK bump is needed) before building. The UI treatment is **disputed** (behavioral:
 persistent sentinel row `metadata.kind='compaction'` that survives refresh; UX: ephemeral inline divider via
 a new `agent_compacted` frame, no DB row) — settle once the event is confirmed to exist. Only
 `compaction.ended` is in scope. Blocking OQs: OQ-F5a (SDK event existence/name), OQ-F5b (sentinel vs
 ephemeral UI).
 **F8 — diff-line → agent re-prompt UX.** **Spec-level.** The "DiffPanel" is inline in
 `CoderPane.tsx:478-619`, rendering `pending_changes` rows as a static `<pre>` — no line-selection
 infrastructure exists. The diff source is ambiguous (`pending_changes.diff` = BooCode write-tools only vs the
 external-agent worktree git diff). "Send to new agent" needs coordinated workspace-pane + chat creation +
 pre-population across three surfaces with no existing contract; selection diverges by modality.
 **Recommended path:** route to `plan-a-feature` (the scope brief already hedged this; treat as firm).
 MVP-if-pushed: "comment to current agent" only, block-level selection, pre-populate `ChatInput` — still
 wants a spec. Blocking OQs: OQ-F8a (diff source), OQ-F8b (selection serialization), OQ-F8c (new-viewer
 routing).
 ### Architecture and Integration Points
 F1 lives entirely in apps/coder (`dispatcher.ts`, `routes/tasks.ts`, `index.ts:254` wiring) with a
 single-field touch on the web reducer (`CoderPane.tsx:299`). F2/F3 are confined to
 `apps/server/.../tool-call-parser.ts` and its one call site `stream-phase.ts`. F6 is confined to
 `stream-phase.ts`. F7 is one inline tool in `apps/coder/.../mcp-server.ts`. F9 removes a serve block and a
 workspace package. No new interface is introduced anywhere in TIER 1 — every item follows the existing inline
 / pure-helper precedent ([D-2](artifacts/implementation-decision-log.md#trivial-decisions),
 [D-6](artifacts/implementation-decision-log.md#trivial-decisions)).
 ### Runtime Behavior
 F1's cancel path: Stop → `POST /api/tasks/:id/cancel` → `cancelExternalTask(taskId)` → `ac.abort()` →
 backend honors the signal (`child.kill` / `session/cancel` / `session.abort` / interrupt) → the run-function
 hits its abort short-circuit or catch → `cancelAndFinalize` writes the terminal `messages.status`
 (`cancelled` on abort, `failed` on error) `WHERE status='streaming'` and publishes the terminal frame
 ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope),
 [D-7](artifacts/implementation-decision-log.md#d-7-f1-terminal-state-for-user-stop--cancelled-not-failed)).
 F6's stall path: each chunk bumps the 90s timer; on stall, `stallAc.abort()` flows through `AbortSignal.any`
 into the same finalize-as-`cancelled` path
 ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)).
 ### Data Model and Persistence
 No schema changes in any TIER 1 item. F1 and F6 only change *which* terminal value gets written to existing
 `messages.status` / `tasks.state` columns. F7 is a read against the existing `messages_with_parts` view.
 ### External Interfaces
 F1 extends the existing coder `message_complete` WS frame with an optional `status` field — no new frame
 type, so no paired strict-union arm is forced in `apps/web/src/api/types.ts` beyond the optional field
 ([D-8](artifacts/implementation-decision-log.md#d-8-f1-status-surfacing--extend-the-existing-frame-no-new-frame-type)).
 F7 exposes one new read-only MCP tool on BooCoder's MCP server
 ([D-6](artifacts/implementation-decision-log.md#trivial-decisions)). F4 (blocked) would add a new
 unauthenticated localhost POST route — see Security Posture. F5 (blocked) would add a new WS frame and thus
 trigger the full cross-app parity rule.
 ## Decomposition and Sequencing
 Each READY item is its own patch tag, one batch per coherent unit
 ([D-9](artifacts/implementation-decision-log.md#trivial-decisions)). F2+F3 are a single unit (same file; F2 keeping
 `extractToolCallBlocks` unblocks F3). F1 goes first as the highest-value correctness fix.
 | # | Work Unit | Surface / Deploy | Delivers | Depends On | Verification |
 |---|-----------|------------------|----------|------------|--------------|
 | 1 | **F1** cancel registry + finalization fixes | apps/coder → `systemctl restart boocoder` | Stop kills child; message finalizes `cancelled`/`failed`; no stuck `streaming` ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope)) | — | `CancelRegistry` unit (4 cases) + DB-integration route test → row `cancelled` (T1-T3) |
 | 2 | **F2+F3** parser prune + logger | apps/server → docker rebuild | Public surface 11→4; rejection logging via pino; guard pinned ([D-3](artifacts/implementation-decision-log.md#d-3-f2-prune-scope--option-a-prune-now-minimal-keep-the-load-bearing-guard), [D-2](artifacts/implementation-decision-log.md#trivial-decisions)) | F2's keep-decision unblocks F3 | Fallback gate test green through prune (T6, [D-4](artifacts/implementation-decision-log.md#d-4-f2-fallback-gate-test--pin-the-untested-guard-before-pruning)) |
 | 3 | **F6** stall-timeout | apps/server → docker rebuild | 90s server-side stall detection → finalize `cancelled` ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)) | — | `classifyStreamError` unit (5 cases) + `vi.useFakeTimers()` stall test + post-loop abort regression pin (T8-T10) |
 | 4 | **F7** session-history MCP tool | apps/coder → `systemctl restart boocoder` | Read-only transcript tool; `/api/health` tool count +1 ([D-6](artifacts/implementation-decision-log.md#trivial-decisions)) | — | Manual MCP call + sentinel-strip assertion (`role != 'system'`) |
 | 5 | **F9** retire :9502 SPA | apps/coder → `systemctl restart boocoder` | Serve block + build step + workspace pkg removed; routes kept ([D-11](artifacts/implementation-decision-log.md#d-11-f9-retire-9502-spa--delete-the-serve-block-keep-all-apiws-routes)) | — | `/api/coder/*` + `/api/health` still 200; `GET /` 404-or-redirect |
 F4/F5/F8 are NOT in this table — they route out per
 [D-12](artifacts/implementation-decision-log.md#d-12-f4f5f8-disposition-under-the-standing-override--document-as-blocked-do-not-halt).
 ## RAID Log
 ### Risks
 | ID | Risk | Likelihood | Severity | Blast Radius | Reversibility | Owner | Mitigation |
 |----|------|------------|----------|--------------|---------------|-------|------------|
 | R1 | F1 message-finalization races across the 4 backend paths leave a row in the wrong terminal state | Medium | Medium | One assistant message per affected turn | Reversible (re-run) | on-call-engineer | Shared `cancelAndFinalize` helper, `WHERE status='streaming'` idempotency, abort short-circuit before the unconditional `complete` write ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope)) |
 | R2 | F2 prune accidentally removes the `<invoke>`-as-text fallback guard (regression: plain-text tool calls silently dropped) | Low | High | Any qwen3.6 turn that emits tool-call markup as text | Reversible (re-add export/call) | software-architect | Option-A keeps the impl; gate test fails if `extractToolCallBlocks` leaves the text-delta path ([D-3](artifacts/implementation-decision-log.md#d-3-f2-prune-scope--option-a-prune-now-minimal-keep-the-load-bearing-guard), [D-4](artifacts/implementation-decision-log.md#d-4-f2-fallback-gate-test--pin-the-untested-guard-before-pruning)) |
 | R3 | F6 stall-timeout fires on a healthy-but-slow stream (false abort) | Low | Low | One in-flight chat turn | Reversible (re-send) | on-call-engineer | 90s per-*chunk* (not per-turn) deadline, bumped on every chunk; tune only on observed false-fire ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)) |
 | R4 | F5 premise fails — the pinned SDK never surfaces a compaction event, so the feature is unbuildable without an SDK bump | High | N/A (blocked) | F5 only | N/A | (capability check) | Confirm event existence/name before any build; F5 stays Blocked until then ([D-12](artifacts/implementation-decision-log.md#d-12-f4f5f8-disposition-under-the-standing-override--document-as-blocked-do-not-halt)) |
 | R5 | F4 premise fails — agents do not fire native hooks in unattended mode, so injection yields no signals | High | N/A (blocked) | F4 only | N/A | (plan-a-feature) | Verify hook-firing + goose format in the F4 spec before any build ([D-12](artifacts/implementation-decision-log.md#d-12-f4f5f8-disposition-under-the-standing-override--document-as-blocked-do-not-halt)) |
 ### Assumptions
 | ID | Assumption | What Changes If Wrong | Verifier | Status |
 |----|------------|-----------------------|----------|--------|
 | A1 | llama-swap native `--jinja` parsing stays ON (structured `tool_calls`), so the text-scrape fallback remains dormant defense-in-depth | If off, the kept guard becomes hot — fine; if a future config silently changes parsing, F2 Option B reopens | live probe of `:8401` (not in-repo) | Unverified-in-repo (probe-only); guard kept regardless ([D-3](artifacts/implementation-decision-log.md#d-3-f2-prune-scope--option-a-prune-now-minimal-keep-the-load-bearing-guard)) |
 | A2 | All four external backends honor `ctx.signal` as cited (`child.kill` / `session/cancel` / `session.abort` / interrupt) | If one does not, that backend's Stop stays a no-op even after wiring | F1 DB-integration test per backend path | Evidenced by file:line (scope-brief F1); test confirms |
 | A3 | 90s is the right stall deadline for the local llama-swap workload shape | Too low → false aborts on slow models; too high → sluggish recovery | Production observation | Committed value, tunable ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)) |
 | A4 | F1's `session/stop` leg "never fires for external from UI" (CoderPane calls `cancelTask`) | If a code path does hit session-stop for an external task, that leg's finalization is best-effort only | on-call review of CoderPane wiring | Evidenced (synthesis-input F1 resolved OQs) |
 ### Dependencies
 | ID | Dependency | Owner | Status |
 |----|------------|-------|--------|
 | Dep1 | F5 requires `@opencode-ai/sdk` to surface a compaction event (or an SDK bump) | (capability check) | BLOCKING F5 — unresolved (OQ-F5a) |
 | Dep2 | F4 requires confirmed unattended hook-firing for claude/qwen/goose + the goose hook format | (plan-a-feature) | BLOCKING F4 — unresolved (OQ-F4a, OQ-F4b) |
 | Dep3 | F8 requires a chosen diff source + a line-selection approach | (plan-a-feature) | BLOCKING F8 — unresolved (OQ-F8a/b/c) |
 ## Testing Strategy
 Sourced from test-engineer (T1-T3, T6, T8-T10) following the established pure-helper-then-wire precedent
 (`turn-guard.ts`, `lifecycle-decisions.ts`, `mistake-tracker.ts`). Coder tests are `globals:false` (import
 `describe`/`it`/`expect`); include glob `src/**/__tests__/**/*.test.ts`; DB-integration tests are opt-in via
 `DATABASE_URL` + `describe.runIf`.
 - **Observable behaviors to test:**
  - F1: a cancelled external task lands `messages.status='cancelled'` (not `streaming`, not `complete`); the registry register/cancel/delete/has behaves idempotently ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope), [D-7](artifacts/implementation-decision-log.md#d-7-f1-terminal-state-for-user-stop--cancelled-not-failed)).
  - F2: a text-delta carrying a complete `<invoke>` block lands in `result.toolCalls` with the markup absent from `result.content` — green through the prune, red if the guard is removed ([D-4](artifacts/implementation-decision-log.md#d-4-f2-fallback-gate-test--pin-the-untested-guard-before-pruning)).
  - F6: a fake hanging stream is aborted after the 90s deadline under `vi.useFakeTimers()`; the existing post-loop `signal?.aborted` check still passes (regression pin) ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)).
  - F7: results exclude `role='system'` rows (sentinel strip) and respect the `limit` cap.
 - **Test doubles posture:** extract pure helpers (`CancelRegistry`, `classifyStreamError`) with no DB/child/I/O for unit coverage; stub `streamText` for the F2 gate and F6 stall tests; one DB-integration test for the F1 route. Warm-worktree-preserved is held as a code comment, not a spy (test-engineer T1-T3).
 - **Edge cases requiring coverage:** double-Stop / cancel-after-exit (idempotent `ac.abort()`); abort vs thrown error in the catch (cancelled vs failed mapping); F6 stall vs healthy-slow stream.
 - **Test levels:** unit (registry, classifier, gate) → integration (F1 route → DB row) → manual (F7 MCP call, F9 route-still-200 smoke).
 ## Security Posture
 Thin. The single TIER 1 surface is BooCoder's new read-only `view_session_history` MCP tool (F7), which
 reads only `messages_with_parts` `WHERE role != 'system'` — no write verbs, no PII beyond what the operator
 already authored, single-user, Authelia at the reverse proxy, no app-layer auth by design.
 The one genuinely new auth-relevant surface — F4's inbound unauthenticated localhost POST route for hook
 callbacks — belongs to a **BLOCKED** item and is not built in this plan. When F4 is planned, it must be
 recorded explicitly as a new unauthenticated localhost route (acceptable under the single-user / Authelia
 posture, but called out, not silent).
 ## Operational Readiness
 - **Deploy by surface:** apps/coder items (F1, F7, F9) → `sudo systemctl restart boocoder`; apps/server items (F2+F3, F6) → `docker compose up --build -d boocode`. Stage commits explicitly by path; never `git add -A` (Sam has uncommitted web WIP) ([D-9](artifacts/implementation-decision-log.md#trivial-decisions)).
 - **Build-step change (F9):** removing `apps/coder/web` deletes a coder build step and a workspace package; the coder Dockerfile copy and `pnpm-workspace.yaml` entry must be removed in the same batch or the build references a missing package ([D-11](artifacts/implementation-decision-log.md#d-11-f9-retire-9502-spa--delete-the-serve-block-keep-all-apiws-routes)).
 - **Health check after F7:** `/api/health` reports a tool count; expect it to increment by one.
 - **Tagging:** sequential patch tags, one per unit; no v2.8.0 ([D-9](artifacts/implementation-decision-log.md#trivial-decisions)).
 - **Rollback:** each unit is an independent patch tag, so rollback is per-item (revert the tag, redeploy by surface). No schema migrations means no expand/contract sequencing to unwind.
 ## On-Call Resilience Posture
 Application-source resilience, sourced from on-call-engineer. Infrastructure concerns live in Operational
 Readiness above.
 - **Timeouts and deadlines:** F6 adds a 90s per-chunk stall deadline on the BooChat `fullStream` loop, the first server-side guard against a hung llama-swap stream (today only the frontend 60s `discard_stale` watchdog exists) ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)).
 - **Retry strategy:** none, deliberately. A retry after a partial stream re-emits already-streamed deltas (non-idempotent); the user re-sending is the correct recovery at single-local-instance scale (see Deferred (YAGNI)) ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)).
 - **Idempotency:** F1's `cancelExternalTask` is idempotent (`ac.abort()` no-ops when already aborted → double-Stop and cancel-after-exit are safe); message finalization is idempotent via `WHERE status='streaming'` ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope)).
 - **Kill switches:** F1 is itself the kill switch for a runaway external agent — making Stop actually stop is the operator's manual control.
 - **Graceful degradation:** F1's `session/stop` leg is best-effort (does not fire for external tasks from the UI); F6 falls back to finalizing as `cancelled` on stall rather than hanging indefinitely.
 - **Observability of failure paths:** F1 distinguishes `cancelled` (user/stall) from `failed` (thrown error) so the human-inbox surface stays honest ([D-7](artifacts/implementation-decision-log.md#d-7-f1-terminal-state-for-user-stop--cancelled-not-failed)); F3 routes parser-rejection logging through pino/`LOG_LEVEL` ([D-2](artifacts/implementation-decision-log.md#trivial-decisions)).
 - **Data integrity:** no monetary/rate columns touched; F1/F6 only write existing terminal-state enum values into existing columns.
 - **Migration safety:** no schema changes in any TIER 1 item — nothing to expand/contract.
 ## Definition of Done
 - [ ] Hitting Stop on an external task kills the child and the assistant message finalizes `cancelled` (never left `streaming`, never falsely `complete`) ([D-1](artifacts/implementation-decision-log.md#d-1-f1-cancel-registry-shape-and-finalization-fix-scope), [D-7](artifacts/implementation-decision-log.md#d-7-f1-terminal-state-for-user-stop--cancelled-not-failed)).
 - [ ] Stop button is disabled while the cancel POST is in flight; a muted "Stopped" label renders on cancel ([D-8](artifacts/implementation-decision-log.md#d-8-f1-status-surfacing--extend-the-existing-frame-no-new-frame-type), [D-10](artifacts/implementation-decision-log.md#trivial-decisions)).
 - [ ] Parser public surface is 4 exports; the `<invoke>`-as-text gate test is green and fails if the guard is removed ([D-3](artifacts/implementation-decision-log.md#d-3-f2-prune-scope--option-a-prune-now-minimal-keep-the-load-bearing-guard), [D-4](artifacts/implementation-decision-log.md#d-4-f2-fallback-gate-test--pin-the-untested-guard-before-pruning)).
 - [ ] Parser rejection logging appears via pino and respects `LOG_LEVEL` ([D-2](artifacts/implementation-decision-log.md#trivial-decisions)).
 - [ ] A fake hanging BooChat stream is finalized `cancelled` after 90s under fake timers; the post-loop abort regression pin passes ([D-5](artifacts/implementation-decision-log.md#d-5-f6-stall-timeout-via-abortsignalany-no-retry)).
 - [ ] `view_session_history` returns a session transcript excluding `role='system'` rows, capped at `limit` (default 50, max 200); `/api/health` tool count +1 ([D-6](artifacts/implementation-decision-log.md#trivial-decisions)).
 - [ ] :9502 serve block + build step + workspace package removed; all `/api/coder/*` + `/api/health` routes still 200 ([D-11](artifacts/implementation-decision-log.md#d-11-f9-retire-9502-spa--delete-the-serve-block-keep-all-apiws-routes)).
 - [ ] Each shipped item has its own sequential patch tag + CHANGELOG entry; deployed by surface ([D-9](artifacts/implementation-decision-log.md#trivial-decisions)).
 ## Specialist Handoffs for Implementation
 - **`test-engineer`** — dispatch alongside each READY unit; needs the touch-point file:lines (already in discovery notes) to author T1-T3 (F1), T6 (F2), T8-T10 (F6) before/with the wiring.
 - **`software-architect`** — dispatch if F1's `cancelAndFinalize` helper shape needs settling across the four backend paths during implementation; needs the four catch-block + success-path line ranges.
 - **`plan-a-feature` (F4)** — dispatch when Sam wants F4; input: the settled dedup rule + module shapes from synthesis-input, and the two blocking OQs (unattended hook-firing, goose format) to resolve first.
 - **`plan-a-feature` (F8)** — dispatch when Sam wants F8; input: `CoderPane.tsx:478-619` DiffPanel location, the diff-source ambiguity, and the cross-surface routing gap.
 - **Capability check (F5)** — dispatch before any F5 build; input: the pinned `@opencode-ai/sdk` arm list and OQ-F5a (does it emit a compaction event, and what is its exact name?).
 ## Deferred (YAGNI)
 Items considered during planning and deferred under the YAGNI rule. Each carries a concrete reopen trigger.
 ### F6 retry / backoff classifier
 - **Why deferred:** Evidence test — a retry after a partial stream re-emits already-streamed deltas (`state.accumulated` + live `delta` frames are non-idempotent), which is strictly worse than the current behavior; no second instance exists to fail over to. Single-local-instance scale makes operator re-send the correct recovery.
 - **Reopen when:** llama-swap gains restart-in-place-with-clear-partial semantics, OR a second llama-swap instance is added for failover.
 - **Source:** R1, on-call-engineer.
 ### F2 Option B — flag-gated full retirement of `extractToolCallBlocks` / `stripToolMarkup`
 - **Why deferred:** Evidence test — no evidence qwen3.6 stopped emitting `<invoke>`-as-text on live, and the sidecar `--jinja` state is unconfirmed in-repo; deleting the only plain-text-tool-call guard on that basis is unsafe (named anti-pattern: removing a load-bearing guard without a measured signal).
 - **Reopen when:** a documented multi-session live probe shows zero text-delta tool calls from qwen3.6.
 - **Source:** R1, software-architect / test-engineer.
 ### F4 `NotifyHookInjection` interface
 - **Why deferred:** Simpler-version test — three agents with an identical read-merge-write settings.json flow do not need an interface; one concrete function switching on agent name satisfies the same need (named anti-pattern: single-implementation interface before three divergent uses exist).
 - **Reopen when:** a fourth agent with a genuinely different injection contract appears (and only after F4's premise is verified).
 - **Source:** R1, software-architect.
 ### F5 handling of `compaction.started` / `compaction.delta` + `step.failed` + `tool.progress`
 - **Why deferred:** Evidence test — only `compaction.ended` is user-actionable (it is what closes the silent context gap); the other arms add UI noise with no user-described need. Compounded by F5 being capability-blocked anyway.
 - **Reopen when:** the SDK compaction event is confirmed AND a user-described need for in-progress compaction feedback emerges.
 - **Source:** R1, behavioral-analyst.
 ### F7 `SessionHistoryReader` interface / pagination beyond `limit`
 - **Why deferred:** Simpler-version test — one read tool with an inline query + a `limit` cap satisfies the read need; an interface and cursor pagination are unused machinery (named anti-pattern: abstraction before a second concrete use).
 - **Reopen when:** a second history-read consumer needs a different read path, or transcripts routinely exceed the 200-row cap in a way that breaks the use case.
 - **Source:** R1, software-architect.
 ## Open Items
 - **OI-1 (F4 premise):** Do claude / qwen / goose fire native lifecycle hooks in unattended mode, and what is goose's hook file/format?
  - **Resolves when:** verified in a `plan-a-feature` spec for F4 (OQ-F4a, OQ-F4b).
  - **Blocks implementation:** Yes for F4 — No for the READY cluster (F4 is independent).
 - **OI-2 (F5 capability):** Does the pinned `@opencode-ai/sdk` surface a compaction event, and what is its exact name?
  - **Resolves when:** a capability check confirms the event (or an SDK bump is scoped) — OQ-F5a.
  - **Blocks implementation:** Yes for F5 — No for the READY cluster.
 - **OI-3 (F5 UI treatment):** persistent sentinel row vs ephemeral inline divider for "context compacted."
  - **Resolves when:** OI-2 resolves and the team picks the UI shape (OQ-F5b); currently disputed (behavioral vs UX).
  - **Blocks implementation:** Yes for F5 — No for the READY cluster.
 - **OI-4 (F8 spec):** diff source (`pending_changes.diff` vs worktree git diff), selection serialization, and new-viewer routing.
  - **Resolves when:** a `plan-a-feature` spec for F8 settles OQ-F8a/b/c.
  - **Blocks implementation:** Yes for F8 — No for the READY cluster.
 - **OI-5 (F1 session-stop leg):** whether to wire the best-effort `session/stop` lookup or defer it.
  - **Resolves when:** decided at F1 implementation; it does not fire for external tasks from the UI today, so it is low-value.
  - **Blocks implementation:** No — F1 ships with or without this leg.
 ## Summary
 - **Outcome delivered:** Stop actually stops external agents and finalizes their messages correctly; the parser is pruned to its load-bearing surface with the guard test-pinned; BooChat gains a 90s server-side stall guard; a session-history MCP tool and the :9502 retirement land; F4/F5/F8 have a documented route to their own spec/capability work.
 - **Team size:** 7 specialists (incl. project-manager) — see [artifacts/implementation-iteration-history.md](artifacts/implementation-iteration-history.md)
 - **Rounds of facilitation:** 1 — see [artifacts/implementation-iteration-history.md](artifacts/implementation-iteration-history.md)
 - **Decisions committed:** 12 — see [artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
 - **Decisions settled by evidence:** 11 — see [artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
 - **Decisions settled by junior-developer reframing:** 0 (junior confirmed several, but each rests on specialist evidence) — see [artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
 - **Decisions settled by user input:** 1 (D-12, the standing-override disposition) — see [artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
 - **Rejected alternatives recorded:** 13 — see [artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
 - **Open items remaining:** 5 (4 blocking only their own BLOCKED item; OI-5 non-blocking)
 - **Recommendation:** Ship the READY cluster as planned — build F1 first (apps/coder, highest-value correctness fix, must include the OCE-001/002 finalization fixes), then F2+F3 (apps/server, one batch), F6 (apps/server), F7 (apps/coder MCP), F9 (apps/coder cleanup), each as its own sequential patch tag. Route F4 and F8 to `plan-a-feature` and F5 to an `@opencode-ai/sdk` capability check before any build.
--- a/docs/plans/post-review-backlog/scope-brief.md
+++ b/docs/plans/post-review-backlog/scope-brief.md
@@ -0,0 +1,169 @@
 # Scope brief — post-review backlog (spec stand-in)
 **Status:** ground-truth source for `plan-implementation`. No `feature-specification.md` exists; this brief
 captures the items, decisions, and verified current state from the 2026-06-02 backlog review conversation
 plus `boocode_code_review_v2.md`, `docs/DEFERRED-WORK.md`, and `boocode_roadmap.md`. Sam chose scope =
 "everything we discussed" (one plan covering all actionable items; WANT items have no per-feature spec, so
 their planning is intentionally shallower).
 Two items were verified live during the review (file:line evidence below). The plan must honor those
 findings as committed mechanics, not re-debate them.
 ---
 ## Items to implement
 ### F1 — External task cancel actually kills the child (DEFERRED #1)
 **Problem (VERIFIED 2026-06-02):** Hitting Stop on an external agent task (opencode/goose/qwen/claude)
 marks `tasks.state='cancelled'` but does NOT stop the agent. The frontend is wired
 (`CoderPane.tsx:987` `handleStop` → `api.coder.cancelTask` → `POST /api/coder/tasks/:id/cancel`, fired only
 for external tasks since `activeTaskId` is set only when `messages.ts:238` returns a `task_id`). But the
 route (`apps/coder/src/routes/tasks.ts:110-148`) only calls `cancelPendingPermission` + `inference.cancel`
 (native-only) + sets DB cancelled. It never reaches the dispatcher. The dispatcher
 (`apps/coder/src/services/dispatcher.ts`) has no `Map<taskId, AbortController>` and no exported cancel; each
 external run-function creates a private `const ac = new AbortController()` (lines ~316/655/991/1248) and
 threads `ac.signal` into every backend, but `ac.abort()` is never called by any route. All four backends
 honor the signal correctly (PTY `child.kill` SIGTERM→SIGKILL at `pty-dispatch.ts:159`; warm-ACP
 `session/cancel` at `warm-acp.ts:318`; opencode `session.abort` at `opencode-server.ts:775`; claude-sdk
 interrupt at `claude-sdk.ts:209`) — but the signal is never fired from Stop, and the assistant message is
 left `status='streaming'`.
 **Decision / approach:** Implement Phase A + B of the DEFERRED #1 proposal — a `Map<taskId,AbortController>`
 registry in the dispatcher, an exported `cancelExternalTask(taskId)`, wire `POST /api/tasks/:id/cancel` (and
 `POST /api/sessions/:id/stop`) into it for external tasks, and finalize the streaming assistant message
 (`complete`/`cancelled`) + publish terminal frame on cancel. Warm backends keep their persistent worktree
 (by design); one-shot PTY cleans its worktree in `finally`. Native boocode cancel path unchanged.
 **Surface:** apps/coder (dispatcher, routes/tasks.ts, routes/messages.ts). Sam: implement.
 ### F2 — Tool-call parser clean/prune (review §4a, optional retirement)
 **Problem (VERIFIED 2026-06-02):** The TS tool-call parser
 (`apps/server/src/services/inference/tool-call-parser.ts`) is STILL load-bearing in code —
 `extractToolCallBlocks` is called in the live text-delta path (`stream-phase.ts:263-284`) and
 `stripToolMarkup` in `tool-phase.ts:122` + `error-handler.ts:25,106`. But a live probe of llama-swap
 (`http://100.101.41.16:8401`) confirmed native `--jinja` parsing is ON: the server returns structured
 `tool_calls` + `finish_reason:tool_calls`, content is just `"\n"`. The structured path
 (`stream-phase.ts:285` `case 'tool-call'`) does all real work today; the TS parser is dormant
 defense-in-depth. qwen3.6 is specifically the model that drifts to `<invoke>` markup, and the TS fallback is
 the only guard for "tool call emitted as plain text" (`experimental_repairToolCall` does NOT cover that
 case).
 **Decision / approach:** Sam wants to clean/prune. Honor the safe path: this is NOT a zero-risk delete.
 The plan must decide between (a) prune only what is provably dead now while keeping the
 `<invoke>`-text-fallback guard, vs (b) the full retirement = gate the text-scrape fallback behind a flag,
 validate native parsing on live qwen3.6 for one release, then delete. The relicense already removed the
 AGPL-dead exports; this is pure simplification, no license pressure. Identify exactly which exports/branches
 are safely removable today vs which are the load-bearing guard.
 **Surface:** apps/server inference. Sam: clean/prune (safe path).
 ### F3 — xml-parser structured logging (DEFERRED #6)
 **Problem:** `tool-call-parser.ts:65` uses `console.debug` for placeholder-rejection logging instead of the
 pino `log.debug({...})` pattern, so it skips the structured pipeline and ignores `LOG_LEVEL`. Cosmetic.
 **Decision / approach:** Pass an optional logger into `extractToolCallBlocks` from its one call site
 (`stream-phase.ts` executeStreamPhase), or a module-level debug hook. ~5-minute edit. Sam: implement (will
 be done directly, included here for completeness).
 ### F4 — Notify-hook config injection → out-of-band agent status (review #10 follow-on)
 **Context:** v2.7.6 shipped the SCOPED status-publish (normalized `working|blocked|idle|error` frame from
 BooCoder's OWN turn boundaries via `normalize-agent-status.ts`). The follow-on — the superset clean-room
 pattern (ELv2, PATTERN-ONLY) — injects a notify hook into each PTY agent's native config
 (`~/.claude/settings.json`, `~/.qwen/settings.json`, `~/.config/goose/`), the hook POSTs
 `{terminalId/taskId, eventType, agent}` back to BooCoder, which normalizes ~30 vendor event names → the
 existing status buckets via the already-built `normalizeAgentEvent` helper. This gives goose/qwen/claude
 real out-of-band working/blocked-on-permission/done signals the turn-boundary publish can't see.
 **Decision / approach:** Sam wants it. Clean-room only (no superset code). Reuse the existing
 `normalize-agent-status.ts`. Plan the injection mechanism (idempotent settings.json merge per agent),
 the inbound POST route, and how it composes with the existing scoped publish without double-publishing.
 **Surface:** apps/coder (agent config injection, new route, normalize reuse). Sam: want.
 ### F5 — opencode compaction surfacing (review §3 #1)
 **Problem (VERIFIED earlier):** `opencode-server.ts` consumes ~5 SSE arms but NOT `compaction.{started,delta,ended}`
 — so when the warm opencode server auto-compacts mid-conversation it shows as a silent context gap in the
 UI. Only `step.ended` (tokens, v2.7.3) was taken from the fuller `session.next.*` set.
 **Decision / approach:** Sam wants it. Consume the compaction arms (and consider `step.failed`,
 `tool.progress`) in `opencode-server.ts`, map to a UI signal so the user sees "context compacted" instead of
 a gap. May need a new WS frame + web `WsFrame` arm (cross-app parity rule applies).
 **Surface:** apps/coder (opencode-server.ts) + apps/web (frame + render). Sam: want.
 ### F6 — BooChat resilience: stall-timeout + retry/backoff (review §3 #2/#3)
 **Problem (VERIFIED 2026-06-02):** `apps/server/src/services/inference/stream-phase.ts` has ZERO retry and
 NO server-side stall timeout. A hung llama-swap stream relies entirely on the frontend 60s `discard_stale`
 watchdog. The `for await (const part of result.fullStream)` loop (`stream-phase.ts:261`) is the wrap point.
 **Decision / approach:** Sam wants it. Add a per-chunk stall timeout wrapping the fullStream loop that fires
 the existing abort path (~40-60 LoC), and a minimal retry/backoff classifier (transient-5xx / stall;
 llama-swap rarely emits retry-after, so value is mostly transient-5xx + stall retry — strip the cloud-billing
 arms). YAGNI-gate the retry scope hard: single local instance, so keep it minimal.
 **Surface:** apps/server inference (stream-phase.ts). Sam: want.
 ### F7 — view_session_history MCP tool
 **Context:** Roadmap v2.x optional. Expose chat/session history as a read-only MCP tool on BooCoder's MCP
 server (`apps/coder/src/services/mcp-server.ts`) so an MCP client can retrieve the transcript of a session.
 **Decision / approach:** Sam wants it. Read-only tool over the existing `messages_with_parts` read path,
 keyed by session/chat id. Reference shape: memov `snap`/`mem_history`. Keep minimal — one read tool, no
 write/validate verbs unless evidence demands.
 **Surface:** apps/coder MCP server. Sam: want.
 ### F8 — diff-line → agent re-prompt UX (review §5j, superset pattern)
 **Context:** Select lines in a diff, send them as a comment to the existing agent session or to a new agent
 ("fix this here"). superset has the pattern (ELv2, PATTERN-ONLY clean-room). This is a review/diff UX
 frontier, the heaviest WANT and the one most needing its own feature spec.
 **Decision / approach:** Sam wants it. Plan at a coarse level (it likely warrants a follow-on plan-a-feature):
 the diff-line selection affordance, the compose-to-session/new-agent action, and how it routes through the
 existing dispatcher + AgentComposerBar. The diff UI component must be located (grep for the diff panel did not
 surface a `*diff*` filename — UX specialist to locate the actual component).
 **Surface:** apps/web (diff UI) + apps/coder (route to existing/new session). Sam: want.
 ### F9 — Retire apps/coder/web :9502 fallback SPA (DEFERRED #5)
 **Context:** Standalone Vite SPA (`@boocode/coder-web`) served by BooCoder at `:9502` when no BooChat proxy
 is in front. Sam confirmed: "I don't use 9502." Primary UI is CoderPane inside BooChat.
 **Decision / approach:** Remove `apps/coder/web/` + its static serve from `apps/coder/src/index.ts` + the
 coder build step; KEEP all WS + REST routes (CoderPane depends on them); optional minimal static
 "open in BooChat" page at `/`. Per DEFERRED #5 removal checklist. Low risk. Sam: greenlit.
 ---
 ## Deferred (Sam's explicit dispositions — record, do not plan)
 - **Subagent permission demux** — DEFER. Condition met (VERIFIED 2026-06-02: `opencode-server.ts` has zero
  permission handling; parent and child sub-sessions run without BooCode gating — effectively bypassed).
 - **Tier-2 provider follow-ups** — `available_agents.enabled` column and MCP `list_providers` tools DROP
  (redundant / moot now that v2.4 external-driving is dead); `provider_snapshot_updated` WS frame + shared
  `packages/types` PARK (revisit only if status feels laggy / a third type consumer appears).
 - **Large-file splits** — DEFER (refactor-when-touched).
 - **PR-resolver, multi-provider LLM, HMAC audit log, verify-gate ensembler, taste-skill, workflow graphs,
  record/replay test harness** — DEFER (optional/far-future).
 - **v2.4 BooCoder-as-ACP-agent** — DROPPED (won't-do, never drive BooCoder from external editors).
 - **Native llama-server tool-call-parser full retirement** — covered by F2's safe-path decision.
 ## Cross-cutting constraints (from CLAUDE.md — must honor)
 - New WS frame type → update BOTH server `InferenceFrame`/`ws-frames.ts` AND web `WsFrame`
  (`apps/web/src/api/types.ts`); the web type is the wire-format gate (applies to F5, maybe F4).
 - Sentinels are `role='system'` rows with `metadata.kind`; `buildMessagesPayload` strips them (relevant if
  F4/F5 add a UI-only signal).
 - Deploy by surface: apps/coder change → `systemctl restart boocoder`; apps/web|server → docker rebuild.
 - Stage commits explicitly by path; never `git add -A` (Sam has uncommitted web WIP).
 - Coder↔web provider-type parity + ws-frames parity tests must stay green.
 - Default to next patch tags, one batch per coherent unit (Sam declined v2.8.0 twice).
--- a/docs/plans/post-review-backlog/test-plan.md
+++ b/docs/plans/post-review-backlog/test-plan.md
@@ -0,0 +1,261 @@
 # Test Plan: post-review-backlog (F1, F2, F6, F5/F4/F7 brief)
 ## Scope
 `apps/coder/src/services/dispatcher.ts`,
 `apps/coder/src/routes/tasks.ts:110-148`,
 `apps/server/src/services/inference/tool-call-parser.ts`,
 `apps/server/src/services/inference/stream-phase.ts:261-341`,
 and their existing test companions.
 Branch: main (HEAD e5ce01a, v2.7.11).
 ---
 ## Summary
 Three features — F1 (task cancel), F2 (parser prune), F6 (stall-timeout + retry) — have behavioral contracts not yet protected by any test. F1 is the highest risk because the whole-point of the feature is a side-effect (child termination) that is trivially skippable in the implementation. F2 is the whole point of a prune: keeping the existing `tool-call-parser.test.ts` suite green IS the contract, but one gap (the `<invoke>`-fallback path exercised from the call site in `stream-phase.ts`) needs a new test before any deletion proceeds. F6 benefits most from a pure-helper extraction (following the `lifecycle-decisions.ts` / `mistake-tracker.ts` precedent) so the classifier logic is unit-testable without a real stream.
 | Priority | Count |
 |----------|-------|
 | High     | 4     |
 | Medium   | 3     |
 | Low      | 2     |
 | Skipped  | 3     |
 Full analysis written to: /home/samkintop/opt/boocode/docs/plans/post-review-backlog/test-plan.md
 ---
 ## Coverage Assessment
 **Well-tested:** `tool-call-parser.ts` — every export has thorough unit coverage (`tool-call-parser.test.ts`); the inline fallback behavior of `extractToolCallBlocks` is the best-tested surface in the inference layer. Coder backend lifecycle decisions (`lifecycle-decisions.test.ts`, `turn-guard.test.ts`, `normalize-agent-status.test.ts`) follow the pure-helper extraction pattern and have complete unit coverage.
 **Significant gaps:**
 - `dispatcher.ts` — the `cancelExternalTask` function does not exist yet. Once written it will have zero tests. The cancel-registry decision logic (the pure "does this taskId map to an AbortController?" portion) is the highest-value extraction target.
 - `routes/tasks.ts:110-148` (cancel route) — currently the route never reaches the dispatcher for external tasks; after F1 the new code path has no test.
 - `stream-phase.ts` stall / retry behavior — entirely absent; the `for await` loop has no timeout and no retry. Both will be new code with no tests.
 - `stream-phase.ts` abort-after-stream behavior — the critical post-loop `signal?.aborted` check at line 337 is already documented as a hard-won correctness property but has no regression test.
 **Overall:** the inference layer has solid unit tests for data-transformation helpers but none for control-flow behavior (abort, stall, retry). The coder dispatcher has no tests at all. This is appropriate given the DB-heavy nature of the dispatcher, but the pure decision layers (cancel-registry, retry classifier) can and should be extracted and tested.
 ---
 ## Findings
 ### F1 — Task Cancel
 **T1: cancel-registry decisions — pure unit**
 - **Priority:** High
 - **Test level:** Unit
 - **Entry point:** `apps/coder/src/services/dispatcher.ts` — the new `cancelExternalTask(taskId)` export (to be created at F1 implementation time; the registry is the `Map<taskId, AbortController>` that does not exist yet)
 - **Gap type:** Untested (code not yet written — test is the contract spec)
 - **Test approach:**
  - **Behavior:** Extract a pure `CancelRegistry` module (following the `lifecycle-decisions.ts` / `turn-guard.ts` / `mistake-tracker.ts` precedent) that owns register/cancel/has/delete operations over a `Map<taskId, AbortController>`. The dispatcher creates one and calls these; routes call `cancel`. The unit tests pin what the registry commits to.
  - **Stubs:** None — the module is pure (no I/O, no DB)
  - **Cases:**
    1. `register(id, ac)` then `cancel(id)` → `ac.signal.aborted` is `true`, returns `true`
    2. `cancel(id)` on unknown id → returns `false`, no throw
    3. `register` + natural completion calls `delete(id)` → `cancel(id)` now returns `false` (idempotent double-cancel safety)
    4. `has(id)` returns `true` while registered, `false` after delete
  - **Expected output:** Signal aborted state, return values as above
  - **Expected commands:** None (pure)
 - **Brittleness assessment:** Durable — asserts the behavioral contract of the new module regardless of how the dispatcher wires it. The four cases directly map to the F1 correctness requirements (signal fired, idempotent, no-op after exit). No mock expectations.
 ---
 **T2: cancel route reaches dispatcher for external task — integration**
 - **Priority:** High
 - **Test level:** Integration (DB-opt-in via `describe.runIf(!!process.env.DATABASE_URL)`)
 - **Entry point:** `apps/coder/src/routes/tasks.ts:110` — `POST /api/tasks/:id/cancel`
 - **Gap type:** Untested
 - **Test approach:**
  - **Behavior:** After F1, `POST /api/tasks/:id/cancel` for a running external task must call `cancelExternalTask(taskId)` (which fires the AbortController), set `tasks.state='cancelled'`, and leave the previously-streaming assistant message with `status` not `'streaming'` (it must be `'cancelled'` or `'complete'`).
  - **Stubs:** Stub the `cancelExternalTask` function (or inject it as a dep) so the test does not require a live agent child. Stub returns `true` (task was in-flight).
  - **Input/Action:** Insert a task row in state `'running'` with a known `session_id`. POST to the cancel route.
  - **Expected output:** Response `{ cancelled: true }`, DB `tasks.state = 'cancelled'`
  - **Expected commands:** `cancelExternalTask(taskId)` called once (spy on injected dep)
 - **Brittleness assessment:** The assertion on `cancelExternalTask` being called is legitimate here — it IS the behavioral contract of the route (the whole missing piece in the current implementation). Only assert call happened, not argument order or internal mechanics. The DB state assertion (`tasks.state`) is the durable anchor. Follow the `tool_cost_stats.test.ts` + `describe.runIf` pattern.
 ---
 **T3: natural-exit does not re-cancel — pure unit on registry**
 - **Priority:** High
 - **Test level:** Unit (part of T1's file, not a separate file)
 - **Entry point:** Cancel registry `delete(taskId)` — called in the dispatcher's `finally` block on clean completion
 - **Gap type:** Untested
 - **Test approach:**
  - **Behavior:** When the dispatcher's `runExternalAgent` / `runOpenCodeServerTask` `finally` block deletes the taskId from the registry, a subsequent route-level `cancelExternalTask(taskId)` call is a no-op (returns `false`). This is the "cancel-after-natural-exit no-op" behavior from the F1 scope.
  - **Stubs:** None
  - **Input/Action:** `registry.register(id, ac)` → `registry.delete(id)` → `registry.cancel(id)`
  - **Expected output:** returns `false`, `ac.signal.aborted` remains `false`
 - **Brittleness assessment:** Trivially durable. This is the critical property that prevents spurious aborts arriving late after a task completes.
 ---
 **T4: warm worktree preserved on cancel (opencode path) — contract assertion on existing behavior**
 - **Priority:** Medium
 - **Test level:** Unit (pure behavioral assertion — no DB or child needed)
 - **Entry point:** `apps/coder/src/services/dispatcher.ts` — `runOpenCodeServerTask` catch/cancel branch (~line 921-933)
 - **Gap type:** Untested
 - **Test approach:**
  - **Behavior:** The scope-brief (F1 "Warm backends keep their persistent worktree by design") requires that when `ac.abort()` is called during `runOpenCodeServerTask`, the function does NOT call `cleanupWorktree`. The one-shot PTY path (`runExternalAgent`) does call `cleanupWorktree` in its finally. This distinction is the behavioral contract.
  - **Stubs:** This is most cleanly verified by reading the implementation post-F1 and confirming via a test that passes a spy `cleanupWorktree` to the cancel path. However — this is better expressed as a **code-review assertion** rather than a test, because the behavior is structural (code path absence), not data-dependent.
  - **Recommendation:** Capture this as a doc-comment contract in the implementation ("no cleanupWorktree on cancel — worktree is persistent") rather than a brittle spy assertion. If Sam disagrees, add an integration test that confirms the worktree directory still exists on disk after an abort, using a tempdir fixture (follows `pending_changes_integration.test.ts` pattern).
 - **Brittleness assessment:** A spy on `cleanupWorktree` would break if the function is renamed or inlined. The structural guarantee is better held by the code comment + the existing lifecycle tests. Mark as Medium since the scope-brief calls it "by design" — a regression here would be caught by the warm-backend integration path anyway.
 ---
 ### F2 — Parser Prune
 **T5: existing tool-call-parser.test.ts suite stays green — regression gate**
 - **Priority:** High
 - **Test level:** Unit
 - **Entry point:** `apps/server/src/services/inference/tool-call-parser.ts` — all exports
 - **Gap type:** Partially tested (all current behaviors are tested; this is a "must not break" constraint during the prune)
 - **Test approach:**
  - **Behavior:** Every test in `apps/server/src/services/__tests__/tool-call-parser.test.ts` pins a behavior the prune must not change. The test suite covers: `parseXmlToolCall`, `parseInvokeToolCall`, `partialXmlOpenerStart`, `extractToolCallBlocks` (all cases including placeholder rejection and streaming accumulation), `stripToolMarkup` (closed/unclosed/final), delimiter constants.
  - **Action:** No new tests needed here — run `pnpm -C apps/server test` after each prune step and verify 0 failures. This suite is the contract.
  - **Gap to close:** The one behavior NOT currently tested in the pure module's test file is the `<invoke>`-fallback being triggered from `stream-phase.ts` when a text-delta chunk arrives. See T6.
 - **Brittleness assessment:** These tests are extremely durable — they test pure string-in/struct-out transforms. Zero brittleness risk. They are the reason to keep `tool-call-parser.ts` — they prove the guard is real.
 ---
 **T6: invoke-fallback fires from the stream-phase call site — call-site integration test**
 - **Priority:** High
 - **Test level:** Unit (test `streamCompletion` or `executeStreamPhase` with a fake `fullStream`)
 - **Entry point:** `apps/server/src/services/inference/stream-phase.ts:263-283` — the `text-delta` case calling `extractToolCallBlocks`
 - **Gap type:** Untested
 - **Test approach:**
  - **Behavior:** When `result.fullStream` emits a `text-delta` chunk containing a complete `<invoke>` block, `streamCompletion` must return that block's parsed tool call in `result.toolCalls` (not in `result.content`). This is the live guard for the "qwen3.6 drifts to `<invoke>` markup" scenario from the scope-brief.
  - **Stubs:** Stub `streamText` (import spy or dependency injection) to return a fake `fullStream` that yields:
    ```
    { type: 'text-delta', text: 'Thinking...\n<invoke name="view_file"><parameter name="path">/tmp/x</parameter></invoke>' }
    { type: 'finish', finishReason: 'stop' }
    ```
    Stub `result.usage` resolving to `{ inputTokens: 10, outputTokens: 5 }`. Follow the `ReadableStream test stubs` guidance in `apps/server/CLAUDE.md` (use `pull()` not `start()`). The `upstreamModel` call can be stubbed to return a dummy provider.
  - **Input/Action:** Call `streamCompletion(ctx, model, messages, opts, onDelta, onUsage, signal)` with the stubbed `streamText`.
  - **Expected output:**
    - `result.toolCalls` has length 1 with `name='view_file'`, `args.path='/tmp/x'`
    - `result.content` is `'Thinking...\n'` (markup stripped)
    - `onDelta` was called with `'Thinking...\n'` and NOT with the `<invoke>` markup
  - **Expected commands:** None (no DB, no WS)
 - **Brittleness assessment:** Medium brittleness — this test depends on the `streamText` API shape. Inject `streamText` as a dep rather than `vi.spyOn` a module export, so renaming the import doesn't break the test. The behavioral assertion (tool in `toolCalls`, markup NOT in `content`) is durable and directly pinned to the F2 safety concern. This test must stay green during the prune and must FAIL if `extractToolCallBlocks` is removed from the `text-delta` path.
 **Important:** If the F2 decision is option (b) — full retirement behind a flag — this test validates the flag-off path (native-parsing-only) for one release. Write the test against the flag-off behavior first, confirm it passes without the parser, then delete the flag and the test in the follow-on batch.
 ---
 **T7: stripToolMarkup still called in tool-phase path — regression guard**
 - **Priority:** Medium
 - **Test level:** Unit (test `tool-call-parser.ts` export directly — already covered)
 - **Entry point:** `apps/server/src/services/inference/tool-phase.ts:122` and `apps/server/src/services/inference/error-handler.ts:25,106`
 - **Gap type:** Partially tested
 - **Test approach:**
  - **Behavior:** `stripToolMarkup` is the second consumer (tool-phase, error-handler). The scope-brief identifies it as load-bearing for the case where the model puts raw `<tool_call>` or `<invoke>` markup in its "final" text content. The existing `stripToolMarkup` tests in `tool-call-parser.test.ts` already cover this behavior at the pure level.
  - **Action:** No additional test needed. The existing `stripToolMarkup` tests (lines 297-342 of `tool-call-parser.test.ts`) are the contract. Do not write call-site integration tests for `tool-phase.ts` — they would require mocking the entire inference context and test the wrong level.
  - **Note for prune:** Before deleting `stripToolMarkup`, confirm none of `tool-phase.ts:122`, `error-handler.ts:25`, `error-handler.ts:106` call it. If they do, the deletion is not safe regardless of the test.
 - **Brittleness assessment:** N/A — this is a "stay green" check, not a new test.
 ---
 ### F6 — Stall-Timeout + Retry
 **T8: stall-timeout classifier — pure unit**
 - **Priority:** High
 - **Test level:** Unit
 - **Entry point:** New `apps/server/src/services/inference/stream-resilience.ts` (to be created, following `mistake-tracker.ts` / `lifecycle-decisions.ts` pattern)
 - **Gap type:** Untested (code not yet written — test is the contract spec)
 - **Test approach:**
  - **Behavior:** Extract a pure `classifyStreamError(err: unknown): 'stall' | 'transient-5xx' | 'non-retryable'` function (or equivalent enum). The retry loop in `stream-phase.ts` delegates the retry decision to this function. Unit tests pin the classification without touching a real stream or timer.
  - **Stubs:** None
  - **Cases:**
    1. Error whose message contains "stall" or whose name is `'StallTimeoutError'` → `'stall'`
    2. Error with HTTP status 500, 502, 503, 504 → `'transient-5xx'`
    3. AbortError (name `'AbortError'`) → `'non-retryable'`
    4. Error with HTTP status 400 → `'non-retryable'`
    5. Error with HTTP status 429 → `'non-retryable'` (YAGNI-gated: scope-brief says "strip the cloud-billing arms"; 429 is a cloud concern, not a single local llama-swap instance)
  - **Expected output:** Classification string for each case
 - **Brittleness assessment:** Durable — pure function, no I/O. The classification is the decision surface; the retry loop can be refactored freely as long as it consults this function. This follows the exact `mistake-tracker.ts` + `lifecycle-decisions.ts` pattern.
 ---
 **T9: stall-timeout aborts the stream — unit with fake timer and fake stream**
 - **Priority:** Medium
 - **Test level:** Unit
 - **Entry point:** `apps/server/src/services/inference/stream-phase.ts:261` — the `for await (const part of result.fullStream)` loop (post-F6, wrapped with a per-chunk deadline timer)
 - **Gap type:** Untested (code not yet written)
 - **Test approach:**
  - **Behavior:** When `fullStream` emits no chunk for longer than the stall deadline, the wrapping code must throw a `StallTimeoutError` (or equivalent) that propagates out of `streamCompletion`. This is the new abort path distinct from the user-initiated `signal.aborted` path.
  - **Stubs:**
    - Stub `streamText` to return a fake `fullStream` that yields one chunk then hangs (never yields the next). Use a `ReadableStream` with a `pull()` controller that enqueues one chunk and then never calls the resolve/reject until after the test's fake timer fires.
    - Use `vi.useFakeTimers()` (vitest fake timer) to advance time past the stall deadline without waiting real milliseconds.
  - **Input/Action:** Call `streamCompletion` with fake timers active and a stream that stalls after the first chunk. Advance fake time by stall-deadline + 1ms.
  - **Expected output:** `streamCompletion` rejects with a `StallTimeoutError`
  - **Expected commands:** None
 - **Brittleness assessment:** Medium — depends on fake-timer interacting correctly with the async stream loop. Vitest fake timers work with `setTimeout`-based deadlines but NOT with `Promise`-based deadlines. The implementation must use `setTimeout` (not `Promise.race` with a raw `sleep`) for the stall timer, or the test will be structurally incompatible. Note this constraint to the implementer. If the implementation uses `Promise.race`, adjust the test to use a `setTimeout`-resolving promise that can be controlled via `vi.runAllTimersAsync()`.
 ---
 **T10: abort-after-stream regression — unit**
 - **Priority:** Medium
 - **Test level:** Unit
 - **Entry point:** `apps/server/src/services/inference/stream-phase.ts:337` — `if (signal?.aborted) throw AbortError`
 - **Gap type:** Untested (existing behavior, not new — but this is a documented regression risk per the inline comment "Smoke D caught this in v1.13.1-A — don't refactor it away")
 - **Test approach:**
  - **Behavior:** When the `AbortSignal` fires during a stream that has already consumed all chunks and exited the `for await` loop normally, `streamCompletion` must throw an `AbortError` rather than returning `StreamResult`. Without this throw the message would finalize as `complete` instead of `cancelled`.
  - **Stubs:** Stub `streamText` to return a fake `fullStream` that yields a text chunk then `finish`, then resolves normally. The `AbortSignal` is pre-aborted before the call (or aborted concurrently using a resolved-promise microtask so it fires before the post-loop check).
  - **Input/Action:** Create an `AbortController`, abort it, pass `ac.signal` to `streamCompletion`.
  - **Expected output:** `streamCompletion` rejects with an error whose `name === 'AbortError'`
  - **Expected commands:** None
 - **Brittleness assessment:** Very durable — asserts a precise, documented behavioral contract. This test protects the most subtle correctness invariant in the inference layer. It should be written whether or not F6 proceeds, as a regression pin on existing code.
 ---
 ## F5, F4, F7 — Brief Notes
 **F5 (opencode compaction surfacing):** The behavior being added is consuming new SSE event types in `opencode-server.ts` and emitting new WS frames. The primary test vector is the cross-app parity check: `apps/coder/src/services/__tests__/provider-types-parity.test.ts` is the precedent. A new `ws-frames-parity.test.ts` (or extension of `ws-frames.test.ts`) that asserts the new compaction frame type exists in BOTH `apps/coder` and `apps/web` `WsFrame` schemas is the right test. This is a Medium-priority addition once the frame types are defined. No new unit test is needed for the SSE arm parsing itself — it is structurally identical to the existing `step.ended` arm.
 **F4 (notify-hook config injection):** The testable surface is the idempotent `settings.json` merge logic. Extract that as a pure `mergeNotifyHook(existing: object, hookConfig: object): object` function and unit-test it (same pattern as `mistake-tracker.ts`). The inbound POST route is testable via Fastify's `inject()` method if the handler is kept thin. However, F4 has no behavioral spec beyond the review description and is a WANT item — defer test planning until a feature spec exists. Testability is high when the merge logic is extracted.
 **F7 (view_session_history MCP tool):** The new MCP tool is read-only over `messages_with_parts`. The testable behavior is the DB query returning the correct shape. Pattern: `describe.runIf(!!process.env.DATABASE_URL)` with a real DB, following `tool_cost_stats.test.ts`. Unit-level: the `mcp-client.test.ts` pattern (mock the DB, verify the query shape). Medium priority, straightforward once the tool implementation exists.
 ---
 ## Deferred / Skipped Tests
 **S1: dispatcher.ts full integration test (any path)**
 - **Entry point:** `apps/coder/src/services/dispatcher.ts:46` — `createDispatcher`
 - **Reason:** The dispatcher is DB-heavy, timer-driven, and spawns child processes. An integration test for the full dispatch loop would require a real DB, a real (or fake) agent binary, and process lifecycle management. The value is low relative to the brittleness: the cancel-registry unit tests (T1, T3) and the route integration test (T2) together cover the observable behavioral gap at much lower maintenance cost. The full dispatch loop is better validated by end-to-end smoke testing.
 **S2: one-shot PTY worktree cleanup on cancel**
 - **Entry point:** `apps/coder/src/services/dispatcher.ts:600-606` — `cleanupWorktree` in the PTY error path
 - **Reason:** Testing that `cleanupWorktree` is called requires either a spy (over-specified double) or a real filesystem side effect (requires a real git repo tempdir). The behavioral contract ("PTY worktree cleaned up on cancel") is better held by the existing `write_guard.test.ts` + `pending_changes_integration.test.ts` precedents, which validate the write-side contract. A spy assertion on `cleanupWorktree` call count is an over-specified double that breaks on any rename or refactor of the cleanup path.
 **S3: retry backoff timing**
 - **Entry point:** `apps/server/src/services/inference/stream-phase.ts` — post-F6 retry loop
 - **Reason:** Testing that backoff delays are e.g. 1s, 2s, 4s requires fake timers and multi-attempt orchestration. The scope-brief says "keep it minimal" and YAGNI-gates the retry scope hard for a single local instance. The classifier (T8) pins the correctness decision; the timer values are implementation details. A test that asserts specific delay durations would break on any tuning and has no behavioral anchor in the F6 spec.
 ---
 ## Coverage Estimate
 After all recommended tests are written:
 - **F1 behavioral contract:** Fully covered at unit level (registry decisions) + opt-in DB integration (route → dispatcher). The warm-vs-PTY worktree divergence is held by a code-comment contract rather than a test (see T4 recommendation).
 - **F2 behavioral contract:** Fully covered — existing `tool-call-parser.test.ts` suite is the regression gate; T6 closes the one gap (call-site fallback from `stream-phase.ts`). Prune can proceed safely with T6 green.
 - **F6 behavioral contract:** Covered for the classifier (T8, unit) and the stall-throw path (T9, unit with fake timers). The existing abort regression (T10) closes a pre-existing gap independent of F6.
 - **F5/F4/F7:** Testability is high but deferred to implementation time; no test can be written before the frame types and route handlers exist.
 Behaviors intentionally left without tests: full dispatcher orchestration loop (S1), PTY cleanup spy (S2), backoff timing (S3).
--- a/docs/project-discovery.md
+++ b/docs/project-discovery.md
@@ -100,7 +100,7 @@
 - cli: `pnpm -C apps/coder cli` (`tsx src/cli.ts`)
 - typecheck: `pnpm -C apps/coder typecheck`
 - test: `pnpm -C apps/coder test` (vitest run)
- deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
+- deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
 - tsconfig: `apps/coder/tsconfig.json` (NodeNext, `declaration: false`)
 - test config: `apps/coder/vitest.config.ts` (vitest ^3.0.0, env=node, `globals: false`, `fileParallelism: false`)
 - test pattern: `src/**/__tests__/**/*.test.ts`
--- a/docs/research/cross-app-contract-ssot.md
+++ b/docs/research/cross-app-contract-ssot.md
@@ -0,0 +1,204 @@
 # Research: Long-term fix for BooCode's hand-synced cross-app type contracts
 How to eliminate the duplicated, hand-synced TypeScript wire contracts shared across `apps/server` ↔ `apps/web` ↔ `apps/coder` (a single source of truth), and which approach to choose. **Evidence mode: strict** (default — sourced claims only; the recommendation rests on corroborated or codebase evidence, never reasoning alone).
 ## Summary
 The textbook fix is real and well-understood: put each shared type in **one place** that all three apps import, instead of keeping two or three hand-copied versions in sync. The strongest version of that is a small shared package (e.g. `@boocode/contracts`) where each contract is defined once — and for the contracts that are already validation schemas, define the schema once and derive the type from it so the validator and the type literally cannot disagree. This is the mainstream industry pattern and it is technically possible here: the project already does exactly this between two of its apps (`apps/coder` imports a built `@boocode/server` package).
 But "do it now" did not survive scrutiny. The team already looked at this and **deliberately chose not to build the shared package** at its current (solo/small) scale, judging it "not worth the Docker/build-order risk," and that decision is the most recent word on the matter. The web app — the hardest consumer — has **never** imported a workspace package, so the path is unproven for it specifically. And the real duplication is messier than a clean "move five types" job: there is a *live, undetected* mismatch in one contract today, a whole extra copy of types in a second web app nobody counted, and a Zod-version tension lurking. So the honest answer is staged, not a single big-bang migration.
 **Recommended:** do the cheap, high-value cleanup now (close the actual drift gaps — including the one that's already broken — and extend the existing guard-rails); treat the full shared-package unification as a *deliberate, de-risked investment* gated behind a small proof-of-concept, not a foregone conclusion. Whether to spend that larger effort now is a judgment call about your scale that only you can make.
 - **Confidence:** Medium
 ## Research Results
 **The end-state the industry converges on is a shared package, ideally schema-first.** Across independent sources, the portable way to share types across a pnpm monorepo that mixes a Node app (NodeNext resolution) and a browser app (Vite/Bundler resolution) is a workspace package whose `package.json` `exports` map points at compiled `dist/` with a `types` condition listed before `default` (A11, A12, A14, A15). For contracts that are *also* runtime validators, the schema-first pattern — define a Zod schema once and derive the static type via `z.infer` — makes type/validator drift structurally impossible, because the type and the validator are the same definition; `z.discriminatedUnion` infers a correct narrowing union, which is exactly the shape of a WS-frame contract (A18, A19). This is the dominant production pattern (T3/tRPC-style `packages/contracts`) (A19).
 **This is technically feasible in BooCode — the "blocker" is narrower than it looks.** The codebase already proves the built-package path: `apps/coder` consumes a compiled `@boocode/server` via `workspace:*`, and `apps/server` ships a `package.json` `exports` map with `types`+`default` conditions per subpath (including `./ws-frames`) (A3). The `TS6307` error cited in the code as "structurally blocking" cross-import (A5) applies to importing another app's *raw source file* — **not** a properly built `node_modules` package. And contrary to a claim in the codebase inventory, `moduleResolution: "Bundler"` (what `apps/web` uses) **does** honor `exports` maps and the `types` condition per the TypeScript primary docs (A11) — so a built `@boocode/contracts` is, in principle, consumable by all three apps.
 **But the codebase — the authoritative record of current intent — pushes back on doing it now.** The most recent explicit decision (CHANGELOG, v2.5.12) records that a shared package was "considered and declined (drift is already prevented; not worth the Docker/build-order risk at solo scale)" (A2). The prior design note (`DEFERRED-WORK.md §3`) recommended the lightweight options (Zod-inferred, or a parity test) "unless planning a broader shared-types initiative," with full `packages/types` "justified when a third consumer appears or WS frame duplication becomes painful" (A1). Critically, that "third consumer" trigger is **not cleanly met**: each duplicated contract still has exactly two hand-synced consumers — `apps/coder` already consumes the server's frames as a *built package*, not as a third hand-copy [validation V1].
 **The real duplication surface is wider and messier than a clean five-type list.** Codebase evidence found, beyond the six enumerated contracts: a **live, uncaught drift** — `AgentSessionConfig` is `model?/modeId?/thinkingOptionId?` (all optional) on the coder side but `model: string; modeId: string | null; …` (required/nullable) on the web side, and it is **not** in the parity test's coverage (A7) [validation V3, confirmed]; a **fourth duplication site**, `apps/coder/web/src/api/types.ts`, a fallback SPA with its own `WsFrame`/`Message`/`ToolCall` copies and no parity guard (A8) [V4, confirmed]; and the web's hand-written `WsFrame` union in `types.ts` is already *partially superseded* by the Zod-inferred type and is missing at least one frame type (`session_renamed`) (A10) [V6]. The web app has also **never** consumed a workspace package (`apps/web/package.json` has no `@boocode/*` dependency), so the build-tooling path is unproven for the hardest consumer (A9) [V2].
 **If unification is deferred, the lightweight guard-rails have documented prior art.** A structural type-parity test (`expectTypeOf().toEqualTypeOf()` / `Expect<Equal<>>`, run under `vitest --typecheck`) catches add/remove/rename drift between two copies with no new toolchain (A20) — this is the same family as the project's existing parity tests, and the natural way to close the `AgentSessionConfig`-style gaps. Generated-copy approaches (OpenAPI/Orval/Zod codegen) exist but trade the duplication for a schema artifact and a codegen pipeline (A23).
 **One source conflict worth flagging:** Zod v4's browser bundle size is reported inconsistently — ~5.36 KB gzipped (official) vs ~14 KB (independent), likely core-only vs realistic-usage measurement (A21, single-source on each figure). It does not bear on the recommendation (server/web already pin Zod v3.23.8), but a contracts package shipping Zod schemas to the browser would make this a real number to measure, and would add a third Zod-version-sync site against the latent v3→v4 pressure from the claude-agent-sdk (A21) [V8].
 ## Options to Consider
 ### O1: Shared compiled workspace package — `@boocode/contracts` (full unification)
 - **What it is:** A new `packages/contracts` package built like `@boocode/server` (`declaration:true`, `exports` map with `types`+`default` per subpath). All three apps depend on it via `workspace:*`. Zod-backed contracts (ws-frames, provider-config) live there as the single schema (`z.infer` for types); plain-type contracts move there as plain TS. Hand-synced copies and parity tests are deleted.
 - **Trade-offs:** Eliminates drift by construction — the strongest guarantee. Costs: a new workspace package + inverted build order (contracts must build *before* server/web/coder; today the root script builds web→server) and a Docker-build change; an unproven web-consumer path (composite + `noEmit` + Bundler reading a built `.d.ts` via exports has no in-repo precedent); a third Zod-version-pin site; and it does not, by itself, address the `apps/coder/web` SPA copy or the web `types.ts`-vs-`ws-frames.ts` dual `WsFrame` representation. Explicitly declined at v2.5.12 for solo-scale cost.
 - **Rests on:** (A1, A2, A3, A4, A11, A18, A19) corroborated; web-consumer feasibility (A11) is primary-source but unexercised here.
 - **Evidence status:** corroborated as an end-state; "proven for web" is **refuted** as currently unproven (V2).
 ### O2: Schema-first SSOT for the Zod-backed contracts only (partial unification)
 - **What it is:** A narrower O1 — share only the already-Zod contracts (the WS-frame schema, the provider-config schema) as single Zod definitions in a small shared package; leave the plain-type contracts under parity tests. Targets the highest-pain, lowest-friction subset (ws-frames is already byte-identical-maintained Zod in two files).
 - **Trade-offs:** Captures most of the drift-elimination value (the WS frames are the most painful duplicate) while touching fewer types and deferring the plain-type migration. Still needs the shared-package build wiring and the web-consumer proof; still leaves plain-type contracts duplicated (but guarded).
 - **Rests on:** (A3, A6, A11, A18) corroborated; (A19) pattern.
 - **Evidence status:** corroborated; the same unproven-web-path caveat applies (V2, V5).
 ### O3: Extend the status quo — parity tests + the new coding standard (close the gaps)
 - **What it is:** Keep hand-synced copies; (1) fix the live `AgentSessionConfig` drift, (2) add structural type-parity tests for the currently-unguarded contracts (`AgentSessionConfig`, `MessageMetadata`, `WorktreeRiskReport`, `ProviderOverride`/`CoderProvidersFile`, the interface `WsFrame` union), (3) decide the fate of the `apps/coder/web` SPA copy. Pairs with the `cross-app-contract-parity` coding standard already written.
 - **Trade-offs:** Lowest cost and risk; matches the recorded v2.5.12 decision; closes real, demonstrated holes (including one already broken). Does **not** eliminate duplication — it makes drift *loud* rather than *impossible*, and the guard-rail is only as good as remembering to add new types to the `names` arrays (the exact gap that let `AgentSessionConfig` drift).
 - **Rests on:** (A1, A2, A7, A20) corroborated; (A7) is a confirmed live defect.
 - **Evidence status:** corroborated; matches the authoritative codebase decision.
 ### O4: Codegen from a canonical schema (generated copies)
 - **What it is:** A canonical schema (Zod source, or OpenAPI) generates the per-app type files; CI regenerates and `git diff --exit-code` fails on staleness.
 - **Trade-offs:** Strong drift guarantee without a runtime shared import; but adds a schema artifact + codegen pipeline, and the guarantee degrades to "CI must run the generator." Heaviest setup for the least fit here — most BooCode contracts are not described by an OpenAPI spec, and the server is TypeScript (so a TS-source SSOT, i.e. O1/O2, dominates codegen).
 - **Rests on:** (A23) corroborated as a pattern.
 - **Evidence status:** corroborated, but weakly fit to this codebase.
 ## Recommendation
 - **Recommendation:** **No single big-bang winner survives scrutiny — take the staged path.** (1) **Now, regardless of the larger decision (O3):** close the demonstrated gaps — fix the live `AgentSessionConfig` mismatch and bring the unguarded contracts under parity coverage (or the new `cross-app-contract-parity` standard), and explicitly decide whether `apps/coder/web` is retired or guarded. This is low-risk, matches the recorded decision, and fixes something that is *already broken*. (2) **The long-term unification you asked for (O2 → O1) is the right end-state but is a deliberate investment, not a proven drop-in.** Gate it behind a tracer-bullet proof-of-concept: stand up a minimal `@boocode/contracts`, wire it into **`apps/web` first** (the unproven consumer), migrate **only the ws-frames Zod schema**, and verify that `tsc`, Vite dev (HMR), and `vite build` all resolve the built `.d.ts`/`.js` through the exports map. If green → proceed to migrate the rest and delete the copies/tests (O1). If it hits the composite/`noEmit`/Vite-resolution wall → fall back to the extended O3. Preconditions for O1/O2: pin Zod in `contracts` to the workspace version, invert the build order (contracts first) and update the Docker build, resolve the web `types.ts`-vs-`ws-frames.ts` dual `WsFrame` representation, and handle `apps/coder/web`.
 - **Evidence basis:** The end-state (shared package, schema-first for Zod contracts) rests on **corroborated** web evidence (TypeScript primary docs A11; multiple independent monorepo/Zod sources A12–A19) **plus codebase precedent** (`@boocode/server` consumed by `apps/coder`, A3). The decision *not* to treat full unification as proven/justified-now rests on **codebase evidence**, which is authoritative on current state: the explicit v2.5.12 decline (A2), the unmet "third consumer" trigger (A1, V1), the absent web workspace-package precedent (A9, V2), and the live `AgentSessionConfig` drift (A7, V3 — confirmed by direct inspection). The immediate O3 action rests on the confirmed defect (A7) and the recorded decision (A1, A2). No part of the recommendation rests on reasoning alone. **The one judgment that is genuinely yours:** whether the WS-frame byte-sync pain is now "painful enough" to spend the O1/O2 investment despite the solo-scale cost the team previously weighed — the evidence frames that trade-off but cannot settle it for you.
 ## Validation
 ### V1: Is the "third consumer" deferral trigger met?
 - **Strategy:** Challenge the Evidence
 - **Investigation:** Read `DEFERRED-WORK.md` §3 trigger and enumerated actual consumers; `apps/coder` consumes `@boocode/server` (built package), not a third hand-copy.
 - **Result:** Refuted (trigger not met — still two hand-synced consumers per contract).
 - **Impact:** Removed "the trigger is now met" from the rationale; the case for O1 now rests on WS-frame pain judged on its own merit, not a satisfied precondition.
 ### V2: Is the built-package path proven for the web consumer?
 - **Strategy:** Challenge the Evidence
 - **Investigation:** `apps/web/package.json` has no `@boocode/*` dependency; no `@boocode` import anywhere in `apps/web/src`. Coder→server (NodeNext) is proven; web (Bundler + composite + `noEmit`) is not.
 - **Result:** Partially Refuted (NodeNext path proven; web path unproven).
 - **Impact:** Reframed O1/O2 as gated behind a tracer-bullet probe; removed the "proven for web too" claim.
 ### V3: Is the parity-test safety net complete?
 - **Strategy:** Challenge the Evidence
 - **Investigation:** `AgentSessionConfig` differs between coder (all optional) and web (required/nullable) and is absent from the parity test `names` array — directly verified.
 - **Result:** Confirmed (a live, uncaught drift exists today).
 - **Impact:** Promoted "fix this now" into the recommendation's immediate action; weakens the "drift is already prevented" basis of the v2.5.12 decision.
 ### V4: Is the duplication scope fully enumerated?
 - **Strategy:** Challenge the Evidence
 - **Investigation:** `apps/coder/web/src/api/types.ts` (12 exported types, no parity test) is a fourth duplication site not in the inventory.
 - **Result:** Confirmed (uncounted site).
 - **Impact:** Added "decide the fate of `apps/coder/web`" as an explicit scope item / precondition.
 ### V5: Does the web tsconfig flag combo break package consumption?
 - **Strategy:** Challenge the Fix
 - **Investigation:** `composite:true`+`noEmit:true`+`allowImportingTsExtensions:true`+Bundler reading a `dist/*.d.ts` via exports is feasible but unexercised; Vite must resolve the JS at runtime, not just tsc the types.
 - **Result:** Partially Refuted (feasible, under-specified).
 - **Impact:** Made the probe verify `tsc` + Vite dev + `vite build` explicitly; added build-order/Docker preconditions.
 ### V6: Is the interface `WsFrame` union a simple plain-type move?
 - **Strategy:** Challenge the Assumptions
 - **Investigation:** Web `types.ts` `WsFrame` is missing `session_renamed` and is already partially superseded by the Zod-inferred type from `ws-frames.ts`.
 - **Result:** Confirmed (more complex than a move).
 - **Impact:** Added "resolve the web dual-`WsFrame` representation" as an O1 precondition.
 ### V7: Is the prior decision stale, or authoritative?
 - **Strategy:** Challenge the Evidence-Gathering Integrity
 - **Investigation:** CHANGELOG v2.5.12 explicitly *declined* a shared package — more recent than `DEFERRED-WORK.md` and consistent with it (option C was chosen).
 - **Result:** Partially Refuted (the "pending decision awaiting triggers" framing was wrong; it was an explicit rejection).
 - **Impact:** Recommendation now states O1 needs *new* justification over a recorded "declined," not just "triggers met."
 ### V8: Any hidden version/coupling cost?
 - **Strategy:** Challenge the Fix
 - **Investigation:** Server/web pin Zod `^3.23.8`; claude-agent-sdk exerts latent v4 pressure; a contracts package adds a third Zod-version-sync site.
 - **Result:** Confirmed (manageable, underspecified).
 - **Impact:** Added "pin Zod in contracts to the workspace version" as a precondition.
 ### Adjustments Made
 The original recommendation ("build `@boocode/contracts` now; the path is proven") **did not survive** and was rewritten into the staged / no-single-winner form above: an evidence-backed immediate action (O3 gap-closing, including the confirmed `AgentSessionConfig` defect) plus full unification (O2→O1) reframed as a deliberate, probe-gated investment with explicit preconditions.
 ### Confidence Assessment
 - **Confidence:** Medium
 - **Remaining Risks:** The web build-tooling path (composite + `noEmit` + Bundler consuming a built workspace package) is unproven in-repo — a failed probe is the main scope risk. The "is WS-frame pain worth the investment now" decision is a solo-scale judgment the evidence cannot settle. Web bundle-size of shipping Zod schemas to the browser is unmeasured (A21 conflict). Scope is wider than first enumerated (coder/web SPA, dual `WsFrame`, latent Zod v4) — discovery may not be complete (the `AgentSessionConfig` miss is evidence the inventory can lag reality).
 ## Sources
 | ID | Source | Link / location | Retrieved | Trust class | Summary (one line) | Evidence status |
 |---|---|---|---|---|---|---|
 | A1 | DEFERRED-WORK §3 — prior options A/B/C | `docs/DEFERRED-WORK.md:160-225` | n/a | codebase | Weighed Zod-inferred / shared `packages/types` / parity test; "A or C unless broader initiative; full package when 3rd consumer or WS pain" | recommendation-bearing |
 | A2 | CHANGELOG v2.5.12 decline note | `CHANGELOG.md` (~:99) | n/a | codebase | Shared package "considered and declined… not worth the Docker/build-order risk at solo scale" | recommendation-bearing |
 | A3 | `@boocode/server` built-package precedent | `apps/server/package.json` (exports), `apps/coder/package.json` (`workspace:*`), `apps/coder/src/index.ts:19` | n/a | codebase | Coder consumes a compiled server package via exports map + NodeNext; build server first | recommendation-bearing |
 | A4 | Web build constraints | `apps/web/tsconfig.app.json` | n/a | codebase | `composite:true`, `moduleResolution:"Bundler"`, `noEmit:true`, `allowImportingTsExtensions:true`, `include:["src"]` | recommendation-bearing |
 | A5 | TS6307 cross-import block (raw source) | `apps/coder/src/services/__tests__/provider-types-parity.test.ts:11-19` | n/a | codebase | Web-side import of coder's *source* file blocked by TS6307 on the composite project | corroborated by A11 |
 | A6 | WS-frame Zod byte-identical pair | `apps/server/src/types/ws-frames.ts` ↔ `apps/web/src/api/ws-frames.ts` (test `…/ws-frames.test.ts`) | n/a | codebase | The most-painful duplicate; already Zod; byte-parity test | single source (codebase) |
 | A7 | Live `AgentSessionConfig` drift | `apps/coder/src/services/provider-types.ts:56` vs `apps/web/src/api/types.ts:310`; absent from parity `names` | n/a | codebase | Optional (coder) vs required/nullable (web), uncaught | recommendation-bearing |
 | A8 | Uncounted 4th duplication site | `apps/coder/web/src/api/types.ts` | n/a | codebase | Fallback SPA, 12 exported types incl. own `WsFrame`/`Message`, no parity test | single source (codebase) |
 | A9 | Web has no workspace-package dep | `apps/web/package.json` | n/a | codebase | No `@boocode/*` dependency — built-package consumption unexercised for web | recommendation-bearing |
 | A10 | Web dual `WsFrame` representation | `apps/web/src/api/types.ts:560-622` vs `ws-frames.ts` | n/a | codebase | Interface union partially superseded by the Zod type; missing `session_renamed` | single source (codebase) |
 | A11 | TypeScript Modules Reference | https://www.typescriptlang.org/docs/handbook/modules/reference.html | 2026-06-02 | web | `bundler` AND `node16`/`nodenext` honor `exports`+`types` condition; `node10`/`classic` don't | recommendation-bearing (primary) |
 | A12 | Live Types in a TS Monorepo (C. McDonnell) | https://colinhacks.com/essays/live-types-typescript-monorepo | 2026-06-02 | web | Five sharing strategies; custom export conditions for source-pointing dev | corroborated by A11, A14 |
 | A13 | TypeScript Project References (docs) | https://www.typescriptlang.org/docs/handbook/project-references.html | 2026-06-02 | web | `composite`/`declaration`/`declarationMap`; `tsc -b` build ordering | corroborated by A14 |
 | A14 | Minimal TS monorepo lib (manzt gist) | https://gist.github.com/manzt/222c8e8f4ed35e74514eb756e4ba09bc | 2026-06-02 | web | `publishConfig` flip; source-pointing exports; nodenext authoring | corroborated by A12, A15 |
 | A15 | Sharing types React/NestJS pnpm (lico) | https://dev.to/lico/step-by-step-guide-sharing-types-and-values-between-react-esm-and-nestjs-cjs-in-a-pnpm-monorepo-2o2j | 2026-06-02 | web | Dual ESM/CJS exports map; `verbatimModuleSyntax` needs `import type` | corroborated by A11 |
 | A16 | TS6307 issue / tsup repro | https://github.com/microsoft/TypeScript/issues/27887 ; https://github.com/egoist/tsup/issues/1364 | 2026-06-02 | web | TS6307 on composite transitive re-exports; surfaces via bundler tools too | corroborated by A5 |
 | A17 | Vite TS Monorepo RFC / vite-tsconfig-paths | https://github.com/vitejs/vite-ts-monorepo-rfc ; https://www.npmjs.com/package/vite-tsconfig-paths | 2026-06-02 | web | Vite source-consumption gap; paths-alias workaround for HMR | single-source on the RFC stats |
 | A18 | Zod v4 — `z.infer` / `z.discriminatedUnion` | https://zod.dev/v4 ; https://zod.dev/api | 2026-06-02 | web | One schema yields a narrowing discriminated-union type; validator+type can't drift | recommendation-bearing |
 | A19 | Shared-Zod monorepo pattern (T3/tRPC) | https://calmops.com/programming/web/type-safe-fullstack-trpc-zod-monorepo/ ; https://www.ruthvikdev.com/blog/3-shared-zod-schemas | 2026-06-02 | web | `packages/contracts` Zod shared by server+client; version-pin caveat | corroborated |
 | A20 | Structural type-parity testing | https://vitest.dev/guide/testing-types ; https://www.totaltypescript.com/how-to-test-your-types | 2026-06-02 | web | `expectTypeOf().toEqualTypeOf()` / `Expect<Equal<>>` catches drift in CI, no new tooling | recommendation-bearing |
 | A21 | Zod v4 size/perf + alternatives | https://zod.dev/v4 ; https://pockit.tools/blog/zod-valibot-arktype-comparison-2026/ ; https://www.pkgpulse.com/blog/zod-vs-typebox-2026 | 2026-06-02 | web | Zod v4 ~5.36 KB (official) vs ~14 KB (independent); Valibot smaller, ArkType faster | single-source per figure; conflict flagged |
 | A22 | Standard Schema | https://standardschema.dev/schema | 2026-06-02 | web | Common interface across Zod/Valibot/ArkType; reduces library lock-in | corroborated by A21 |
 | A23 | OpenAPI/Zod codegen (generated copies) | https://openapi-ts.dev/ ; https://www.npmjs.com/package/ts-to-zod | 2026-06-02 | web | Generate type/validator copies from one source; guarantee = CI must regenerate | corroborated |
 ### A1: DEFERRED-WORK §3 — prior options and recommendation — recommendation-bearing
 - **Link / location:** `docs/DEFERRED-WORK.md:160-225`
 - **Retrieved:** n/a (codebase)
 - **Trust class:** codebase (trusted current-state anchor)
 - **Summary:** The project's own prior analysis of this exact question. Lays out option A (Zod + inferred types), B (shared `packages/types`), C (status-quo parity test), with a trade-off table, and recommends "Start with A or C unless planning a broader shared-types initiative. Full `packages/types` is justified when a third consumer appears or WS frame duplication becomes painful again."
 - **Evidence status:** authoritative on intent; corroborated by A2.
 ### A2: CHANGELOG v2.5.12 decline note — recommendation-bearing
 - **Link / location:** `CHANGELOG.md` (~line 99, v2.5.12 provider-lifecycle entry)
 - **Retrieved:** n/a (codebase)
 - **Trust class:** codebase
 - **Summary:** The most recent explicit decision: "a shared package was considered and declined (drift is already prevented; not worth the Docker/build-order risk at solo scale)." Records that option C was chosen.
 - **Evidence status:** authoritative; the "drift is already prevented" premise is weakened by A7.
 ### A3: `@boocode/server` built-package precedent — recommendation-bearing
 - **Link / location:** `apps/server/package.json` (exports map with `types`+`default` per subpath, incl. `./ws-frames`), `apps/coder/package.json` (`"@boocode/server": "workspace:*"`), `apps/coder/src/index.ts:19`
 - **Retrieved:** n/a (codebase)
 - **Trust class:** codebase
 - **Summary:** Demonstrates the exact built-package mechanism a `@boocode/contracts` would use, working today for a NodeNext consumer (coder). `apps/server` has `declaration:true`; build order is server-first.
 - **Evidence status:** proves the NodeNext path; does not prove the Bundler/web path (A9, V2).
 ### A4: Web build constraints — recommendation-bearing
 - **Link / location:** `apps/web/tsconfig.app.json`
 - **Retrieved:** n/a (codebase)
 - **Trust class:** codebase
 - **Summary:** `composite:true`, `moduleResolution:"Bundler"`, `noEmit:true`, `allowImportingTsExtensions:true`, `include:["src"]`. These permit consuming a built `node_modules` package (Bundler honors exports, A11) but the combination is unexercised against a workspace package here.
 - **Evidence status:** feasibility is primary-source (A11) but unexercised (V2, V5).
 ### A7: Live `AgentSessionConfig` drift — recommendation-bearing
 - **Link / location:** `apps/coder/src/services/provider-types.ts:56-61` vs `apps/web/src/api/types.ts:310-315`; not in `apps/coder/src/services/__tests__/provider-types-parity.test.ts` `names`
 - **Retrieved:** n/a (codebase)
 - **Trust class:** codebase
 - **Summary:** Coder: `model?/modeId?/thinkingOptionId?` (optional). Web: `model: string; modeId: string | null; thinkingOptionId: string | null` (required/nullable). Structurally incompatible and uncaught by any parity test — a live defect.
 - **Evidence status:** confirmed by direct inspection (V3); the concrete immediate action.
 ### A11: TypeScript Modules Reference — recommendation-bearing (primary)
 - **Link / location:** https://www.typescriptlang.org/docs/handbook/modules/reference.html
 - **Retrieved:** 2026-06-02
 - **Trust class:** web (primary source — TypeScript team)
 - **Summary:** `moduleResolution: bundler` and `node16`/`nodenext` both resolve `package.json` `exports` and match the `types` condition; `node10`/`classic` ignore `exports`. This is why a built `@boocode/contracts` is consumable by the Vite (Bundler) web app — and corrects the codebase-inventory claim that "Bundler ignores exports."
 - **Evidence status:** primary source; corroborated by A12, A14.
 ### A18: Zod v4 — `z.infer` / `z.discriminatedUnion` — recommendation-bearing
 - **Link / location:** https://zod.dev/v4 ; https://zod.dev/api
 - **Retrieved:** 2026-06-02
 - **Trust class:** web
 - **Summary:** `z.infer<typeof Schema>` over a `z.discriminatedUnion` yields a correct narrowing TypeScript discriminated union; the validator and the static type are one definition, so they cannot drift. This is the schema-first SSOT mechanism for the WS-frame and provider-config contracts (which are already Zod).
 - **Evidence status:** corroborated by A19; directly applicable since A6 shows ws-frames is already Zod.
 ### A20: Structural type-parity testing — recommendation-bearing
 - **Link / location:** https://vitest.dev/guide/testing-types ; https://www.totaltypescript.com/how-to-test-your-types
 - **Retrieved:** 2026-06-02
 - **Trust class:** web
 - **Summary:** `expectTypeOf().toEqualTypeOf()` / `Expect<Equal<A,B>>` under `vitest --typecheck` asserts structural identity (not just assignability) between two copies and fails CI on drift, with no new tooling beyond what the repo already runs. The natural mechanism for O3's gap-closing (e.g. guarding `AgentSessionConfig`).
 - **Evidence status:** corroborated (Vitest docs + Total TypeScript).
--- a/openspec/changes/archived/agent-status-normalize/proposal.md
+++ b/openspec/changes/archived/agent-status-normalize/proposal.md
--- a/openspec/changes/archived/claude-sdk-sessionstore/proposal.md
+++ b/openspec/changes/archived/claude-sdk-sessionstore/proposal.md
--- a/openspec/changes/archived/license-debt-mit/proposal.md
+++ b/openspec/changes/archived/license-debt-mit/proposal.md
--- a/openspec/changes/archived/license-debt-mit/tasks.md
+++ b/openspec/changes/archived/license-debt-mit/tasks.md
--- a/openspec/changes/archived/mistake-tracker-file-ledger/proposal.md
+++ b/openspec/changes/archived/mistake-tracker-file-ledger/proposal.md
--- a/openspec/changes/archived/sampling-streamjson-tokens/proposal.md
+++ b/openspec/changes/archived/sampling-streamjson-tokens/proposal.md
--- a/openspec/changes/archived/v2-3-provider-lifecycle/design.md
+++ b/openspec/changes/archived/v2-3-provider-lifecycle/design.md
--- a/openspec/changes/archived/v2-3-provider-lifecycle/proposal.md
+++ b/openspec/changes/archived/v2-3-provider-lifecycle/proposal.md
--- a/openspec/changes/archived/v2-3-provider-lifecycle/tasks.md
+++ b/openspec/changes/archived/v2-3-provider-lifecycle/tasks.md
--- a/openspec/changes/archived/v2-6-persistent-agent-sessions/design.md
+++ b/openspec/changes/archived/v2-6-persistent-agent-sessions/design.md
--- a/openspec/changes/archived/v2-6-persistent-agent-sessions/proposal.md
+++ b/openspec/changes/archived/v2-6-persistent-agent-sessions/proposal.md
--- a/openspec/changes/archived/v2-6-persistent-agent-sessions/tasks.md
+++ b/openspec/changes/archived/v2-6-persistent-agent-sessions/tasks.md
--- a/openspec/changes/archived/write-edit-robustness/proposal.md
+++ b/openspec/changes/archived/write-edit-robustness/proposal.md
--- a/openspec/changes/contracts-ssot/proposal.md
+++ b/openspec/changes/contracts-ssot/proposal.md
@@ -0,0 +1,102 @@
 # @boocode/contracts — cross-app wire contract SSOT
 **Status:** shipped (2026-06-02)
 Eliminate BooCode's hand-synced, duplicated cross-app TypeScript wire contracts by
 creating one workspace package, `@boocode/contracts`, that every consumer imports.
 Each contract is defined exactly once; the two Zod-backed contracts (`ws-frames`,
 `provider-config`) use `z.infer` so validator and type derive from the same definition
 and cannot drift independently.
 ## Why
 BooCode maintained hand-synced copies of cross-app contracts across up to four
 locations (`apps/server`, `apps/web`, `apps/coder`, `apps/coder/web`), verified only
 by byte-parity tests — `provider-types-parity.test.ts` and the ws-frames byte-parity
 assertion in `ws-frames.test.ts`. A live drift had appeared: `AgentSessionConfig`
 existed in two incompatible shapes — `apps/coder` held an all-optional dead copy
 (zero live importers) while `apps/web` held the real required/nullable shape — making
 the parity-test regime insufficient for type-only contracts.
 `v2.5.12-provider-lifecycle-phase4` had explicitly deferred a shared types package as
 "not worth the Docker/build-order risk at solo scale"; the observed drift made the
 investment worth taking.
 ## What shipped
 **Package:** `packages/contracts` (`@boocode/contracts`), `declaration:true`, zod
 pinned `^3.23.8`, per-subpath exports map with `types`-then-`default` conditions.
 `packages/*` added to `pnpm-workspace.yaml`; `pnpm-lock.yaml` regenerated.
 **Six contracts single-sourced:**
 - `./ws-frames` — `WsFrameSchema` (Zod runtime), `KNOWN_FRAME_TYPES`, `WsFrame`
  (`z.infer` from the schema).
 - `./provider-snapshot` — `ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`,
  `ThinkingOption`, `AgentCommand`, `ProviderSnapshotStatus` (plain TS; the coder's
  `provider-types.ts` re-exports them so internal importers are unchanged).
 - `./provider-config` — `ProviderOverrideSchema`, `CoderProvidersFileSchema`,
  `ProviderConfigPatchSchema` and their `z.infer` types.
 - `./message-metadata` — `MessageMetadata`, `ErrorReason`, `AgentSessionConfig`.
 - `./worktree-risk` — `WorktreeRiskReport` (unified from three copies that differed
  only in name; the coder called it `RiskReport`).
 **Four consumers** import via `workspace:*` through the exports map: `apps/server`,
 `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web`. No tsconfig project
 references; built dist only.
 **Deleted:**
 - `apps/server/src/types/ws-frames.ts` (server `./ws-frames` exports subpath dropped)
 - `apps/web/src/api/ws-frames.ts`
 - Provider-snapshot mirror block in web (`apps/web/src/api/types.ts`)
 - Provider-config 17-line hand-mirror in web
 - `apps/coder/src/services/__tests__/provider-types-parity.test.ts` (6 parity tests)
 - ws-frames byte-parity assertion (server test suite)
 - All duplicate `MessageMetadata`, `ErrorReason`, `AgentSessionConfig` copies
 - `WorktreeRiskReport` / `RiskReport` duplicates (3 copies)
 - `apps/coder/web` dead `pending_change_added`/`pending_change_updated` reducer arms
  and associated WS plumbing (`DiffPane` prop, `Session.tsx` listener)
 **Preserved:**
 - KNOWN_FRAME_TYPES drift test — moved into the package (11/11)
 - Broker fail-closed tests — kept in `apps/server`, importing from the package (4/4)
 - Web strict `SessionFrame`/`UserFrame` discriminated union — web-local, untouched
 ## Key decisions
 **F1 (ws-frames):** repoint all 8 server/coder importers + 2 web validators to the
 package; drop the server `./ws-frames` re-export subpath — one path per contract, no
 shim.
 **F2 (web strict union):** the package exports the runtime schema only (`WsFrameSchema`,
 `KNOWN_FRAME_TYPES`, loose `WsFrame`). The web's rich discriminated union stays
 web-local — it references entity types (`Message`, `ToolCall`, etc.) that are
 intentionally web-local and not cross-app duplicated. Zero entity-type scope expansion.
 **AgentSessionConfig drift:** unified to the web required/nullable shape; the coder's
 all-optional copy confirmed dead (zero live importers) and deleted.
 **`apps/coder/web` (fallback SPA):** its hand-copied 9-arm `WsFrame` union replaced by
 the canonical import; dead `pending_change_*` arms removed (no publisher exists for
 these frames anywhere in the codebase — they were HTTP-delivered, not WS); field
 conflicts reconciled per-field (`tool_result.error` boolean→string, `tokens_used`
 number→number|null, `snapshot.messages` cast).
 **Two `ErrorReason` concepts (intentional, not duplication):**
 `message-metadata`'s `ErrorReason` is the DB-persisted 3-value set; the ws-frames
 frame-level `reason` is the wire 5-value set. Different value sets, different
 semantics, confirmed during audit.
 ## Build-order inversion
 Contracts builds before all consumers in:
 - Root `package.json` build script
 - `Dockerfile` — new `COPY packages/*` block + contracts build step before web/server
 - Coder deploy command — updated in all 7 doc sites (CLAUDE.md, apps/coder/CLAUDE.md,
  BOOCODER.md, docs/ARCHITECTURE.md, docs/project-discovery.md, README.md,
  docs/coder-backends.md)
 ## Test counts at ship
 Server 543 / coder 293 / contracts 11. Clean `docker compose build --no-cache boocode`
 green. Human smoke verified 2026-06-02.
--- a/openspec/changes/contracts-ssot/tasks.md
+++ b/openspec/changes/contracts-ssot/tasks.md
@@ -0,0 +1,109 @@
 # Tasks — @boocode/contracts SSOT
 Nine phases, each independently verifiable. Phases 1–7 are the migration units (one
 contract group each after a proven tracer); Phase 8 is the audit gate; Phase 9 is the
 human smoke test. All shipped 2026-06-02.
 ## Phase 1 — Tracer: scaffold package + build-order inversion + web proof ✅ SHIPPED
 - [x] Create `packages/contracts` (`@boocode/contracts`): `declaration:true`, per-subpath
      exports map, zod `^3.23.8`. Placeholder `src/index.ts`.
 - [x] Add `packages/*` to `pnpm-workspace.yaml`.
 - [x] Invert build order everywhere: root `package.json` build script, `Dockerfile`
      (`COPY packages/*` + contracts build before web/server), coder deploy command.
 - [x] Regenerate `pnpm-lock.yaml` (Dockerfile uses `--frozen-lockfile`).
 - [x] Prove the web consumption path end-to-end: `tsc -b` (composite+Bundler), `vite dev`
      HMR, and `vite build` all resolve the built `.d.ts`+`.js` via the exports map.
 - [x] Verify all consumer builds + `docker compose build boocode` green.
 - [x] Remove Phase 1 probe artifacts before Phase 2.
 ## Phase 2a — Single-source the ws-frames runtime schema ✅ SHIPPED
 - [x] Move ws-frames schema to `packages/contracts/src/ws-frames.ts` (`./ws-frames` subpath).
 - [x] Repoint 8 server/coder importers + 2 web validators to `@boocode/contracts/ws-frames`.
 - [x] Delete `apps/server/src/types/ws-frames.ts`; drop server `./ws-frames` exports subpath.
 - [x] Delete `apps/web/src/api/ws-frames.ts`.
 - [x] Move KNOWN_FRAME_TYPES drift + accept/reject tests into the package (11/11); delete
      byte-parity test; keep broker fail-closed tests in server importing from the package.
 - [x] Container smoke: `docker compose up`, `/api/health` 200, broker imports from package.
 ## Phase 3 — Single-source provider snapshot types ✅ SHIPPED
 - [x] Move `ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`, `ThinkingOption`,
      `AgentCommand`, `ProviderSnapshotStatus` to `packages/contracts/src/provider-snapshot.ts`
      (`./provider-snapshot` subpath).
 - [x] `apps/coder/src/services/provider-types.ts` re-exports them (importers unchanged).
 - [x] Delete web mirror block; delete `provider-types-parity.test.ts` (coder −6 tests).
 - [x] All builds and typechecks green; server 543 / coder 293 unchanged.
 ## Phase 4 — Single-source the Zod provider-config schemas ✅ SHIPPED
 - [x] Move `ProviderOverrideSchema`, `CoderProvidersFileSchema`, `ProviderConfigPatchSchema`
      + `z.infer` types to `packages/contracts/src/provider-config.ts` (`./provider-config`
      subpath).
 - [x] `apps/coder/src/services/provider-config.ts` imports + re-exports (importers unchanged).
 - [x] Delete 17-line web hand-mirror.
 - [x] Coder provider-config tests 13/13; all builds green.
 ## Phase 5 — Single-source type-only contracts (MessageMetadata + AgentSessionConfig) ✅ SHIPPED
 - [x] Move `MessageMetadata`, `ErrorReason`, `AgentSessionConfig` to
      `packages/contracts/src/message-metadata.ts` (`./message-metadata` subpath).
 - [x] Unify `AgentSessionConfig` to the web required/nullable shape; delete the coder's
      dead all-optional copy (zero live importers confirmed).
 - [x] Delete all duplicate `MessageMetadata` + `ErrorReason` copies (confirmed byte-identical).
 - [x] Repoint server `api.ts`, web types, and `MessageBubble.tsx` to the package.
 - [x] Server 543 / coder 293 unchanged; all builds green.
 ## Phase 6 — Single-source WorktreeRiskReport ✅ SHIPPED
 - [x] Move `WorktreeRiskReport` to `packages/contracts/src/worktree-risk.ts`
      (`./worktree-risk` subpath); unify name (coder `RiskReport` → `WorktreeRiskReport`).
 - [x] Delete all three copies (shapes identical, names differed).
 - [x] Repoint `sessions.ts`, `ProjectSidebar.tsx`, `orphan-worktree-reaper.ts`,
      `worktree-safety.ts` via re-exports; `checkWorktreeWorkAtRisk` returns shared type.
 - [x] Server 543 / coder 293 unchanged; all builds green.
 ## Phase 7 — Migrate apps/coder/web (fallback SPA) ✅ SHIPPED
 - [x] Add `@boocode/contracts` `workspace:*` dep to `apps/coder/web/package.json`.
 - [x] Delete hand-copied 9-arm `WsFrame` union; import canonical `WsFrame` from the package.
 - [x] Reconcile field conflicts: `tool_result.error` boolean→string; `tokens_used`
      number→number|null; `snapshot.messages` cast `as Message[]`; `chat_id ?? ''`.
 - [x] Delete dead `pending_change_added`/`pending_change_updated` reducer arms + the entire
      dead `onPendingChange` WS plumbing (`DiffPane` prop, `Session.tsx` listener).
 - [x] Confirm HTTP pending-change apply/reject path untouched.
 - [x] `apps/coder/web` Vite build green; root build + server 543 / coder 293 green.
 ## Phase 8 — Audit every requirement bullet ✅ PASSED (12/12)
 - [x] Verify one definition per contract (file + line evidence for each).
 - [x] Verify `z.infer` for both Zod contracts (ws-frames, provider-config).
 - [x] Verify all four consumers wired via exports map (no project refs, no src imports).
 - [x] Verify all hand-copies + parity tests deleted; drift/broker tests preserved.
 - [x] Verify single zod version; build-order inverted in root + Dockerfile + deploy docs.
 - [x] Verify `packages/*` in workspace; dead `pending_change_*` arms gone.
 - [x] Verify web strict union preserved + coherent post-migration.
 - [x] Close gap G: update coder deploy command in all 7 doc sites (CLAUDE.md,
      apps/coder/CLAUDE.md, BOOCODER.md, docs/ARCHITECTURE.md, docs/project-discovery.md,
      README.md, docs/coder-backends.md).
 - [x] Correct now-false byte-parity/duplication claims in CLAUDE.md conventions,
      apps/server/CLAUDE.md broker note, docs/coder-backends.md, and
      docs/coding-standards/cross-app-contract-parity.md (rewritten to describe the SSOT).
 - [x] Mark DEFERRED-WORK §3 + STALE-DEPRECATED item as shipped.
 - [x] Clean `docker compose build --no-cache boocode` green; server 543 / coder 293 /
      contracts 11 at exact baselines; nothing staged (HEAD e5ce01a).
 ## Phase 9 — Human smoke test ✅ PASSED (Sam, 2026-06-02)
 - [x] Web dev HMR, web prod build at :9500, live WS stream rendering.
 - [x] Coder restart + a turn; fallback SPA; pending-change apply.
 ## Verify (all runs green)
 - `pnpm -C packages/contracts build`
 - `pnpm -C apps/server test` (543)
 - `pnpm -C apps/coder test` (293)
 - `pnpm -C apps/server build && pnpm -C apps/coder build`
 - `npx tsc -p apps/web/tsconfig.app.json --noEmit`
 - `docker compose build --no-cache boocode`
--- a/package.json
+++ b/package.json
@@ -5,7 +5,7 @@
  "scripts": {
    "dev:server": "pnpm --filter ./apps/server dev",
    "dev:web": "pnpm --filter ./apps/web dev",
-    "build": "pnpm --filter ./apps/web build && pnpm --filter ./apps/server build",
+    "build": "pnpm --filter ./packages/contracts build && pnpm --filter ./apps/web build && pnpm --filter ./apps/server build",
    "start": "node apps/server/dist/index.js"
  },
  "devDependencies": {
--- a/packages/contracts/package.json
+++ b/packages/contracts/package.json
@@ -0,0 +1,46 @@
 {
  "name": "@boocode/contracts",
  "version": "0.0.0",
  "private": true,
  "type": "module",
  "main": "dist/index.js",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "default": "./dist/index.js"
    },
    "./ws-frames": {
      "types": "./dist/ws-frames.d.ts",
      "default": "./dist/ws-frames.js"
    },
    "./provider-snapshot": {
      "types": "./dist/provider-snapshot.d.ts",
      "default": "./dist/provider-snapshot.js"
    },
    "./provider-config": {
      "types": "./dist/provider-config.d.ts",
      "default": "./dist/provider-config.js"
    },
    "./message-metadata": {
      "types": "./dist/message-metadata.d.ts",
      "default": "./dist/message-metadata.js"
    },
    "./worktree-risk": {
      "types": "./dist/worktree-risk.d.ts",
      "default": "./dist/worktree-risk.js"
    }
  },
  "scripts": {
    "build": "tsc",
    "typecheck": "tsc --noEmit",
    "test": "vitest run"
  },
  "dependencies": {
    "zod": "^3.23.8"
  },
  "devDependencies": {
    "typescript": "^5.5.0",
    "vitest": "^3.2.4"
  },
  "license": "MIT"
 }
--- a/packages/contracts/src/tests/ws-frames.test.ts
+++ b/packages/contracts/src/tests/ws-frames.test.ts
@@ -0,0 +1,136 @@
 import { describe, it, expect } from 'vitest';
 import { WsFrameSchema, KNOWN_FRAME_TYPES } from '../ws-frames.js';
 const VALID_UUID_A = '00000000-0000-0000-0000-000000000001';
 const VALID_UUID_B = '00000000-0000-0000-0000-000000000002';
 const VALID_UUID_C = '00000000-0000-0000-0000-000000000003';
 const VALID_TIMESTAMP = '2026-05-22T14:30:00.000Z';
 describe('WsFrameSchema (v1.13.11-a)', () => {
  it('accepts a well-formed chat_status frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'streaming',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(true);
  });
  it('rejects an unknown frame type', () => {
    const result = WsFrameSchema.safeParse({
      type: 'cosmic_ray_strike',
      chat_id: VALID_UUID_A,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a chat_status frame with invalid status enum', () => {
    // v1.12.1 dropped the legacy 'working' status. Any frame still emitting it
    // should fail validation — that's a drift catcher.
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'working',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a UUID field with a non-UUID string', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: 'not-a-uuid',
      status: 'idle',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects negative token counts in usage frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: -1,
      ctx_used: 100,
      ctx_max: 1000,
    });
    expect(result.success).toBe(false);
  });
  it('accepts a usage frame with nullable token counts (pre-v1.13.7 history)', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: null,
      ctx_used: null,
      ctx_max: null,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a tool_result frame with non-UUID tool_call_id (model-emitted)', () => {
    // Model-emitted tool_call_ids look like "call_abc123", not UUIDs.
    const result = WsFrameSchema.safeParse({
      type: 'tool_result',
      tool_message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      tool_call_id: 'call_abc123',
      output: { whatever: true },
      truncated: false,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a compacted frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'compacted',
      session_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      summary_message_id: VALID_UUID_C,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a session_workspace_updated frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'session_workspace_updated',
      session_id: VALID_UUID_A,
      workspace_panes: [{ id: 'p1', kind: 'chat', chatIds: [], activeChatIdx: 0 }],
    });
    expect(result.success).toBe(true);
  });
  it('accepts a message_complete frame with a null model (external coder, no model selected)', () => {
    // Regression guard: the dispatcher publishes `model: task.model` (string |
    // null). When null, this MUST validate or publishFrame fail-closes and drops
    // the whole frame, incl. the status:'complete' transition.
    const result = WsFrameSchema.safeParse({
      type: 'message_complete',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      model: null,
    });
    expect(result.success).toBe(true);
  });
  it('every KNOWN_FRAME_TYPES entry has a discriminated branch', () => {
    // Probe each known type by attempting a minimal valid construction.
    // Failure here means the union and the KNOWN_FRAME_TYPES list drifted.
    for (const type of KNOWN_FRAME_TYPES) {
      const probe = WsFrameSchema.safeParse({ type, __dummy__: true });
      // We expect FAILURE on every type because we're missing required fields,
      // but the failure must be ABOUT the missing fields, not about an unknown
      // type. A "Invalid discriminator value" error means the type isn't in
      // the union — that's a drift.
      if (probe.success) continue;
      const issues = probe.error.issues;
      const hasInvalidDiscriminator = issues.some(
        (i) => i.code === 'invalid_union_discriminator',
      );
      expect(hasInvalidDiscriminator, `frame type '${type}' is missing from the discriminated union`).toBe(false);
    }
  });
 });
--- a/packages/contracts/src/index.ts
+++ b/packages/contracts/src/index.ts
@@ -0,0 +1,5 @@
 // @boocode/contracts — single source of truth for cross-app wire contracts.
 // Each contract is exported from its own subpath (e.g. @boocode/contracts/ws-frames).
 // This root module is intentionally empty; import from the subpath directly.
 export {};
--- a/packages/contracts/src/message-metadata.ts
+++ b/packages/contracts/src/message-metadata.ts
@@ -0,0 +1,45 @@
 // Single source of truth for cross-app message metadata contracts.
 // ErrorReason + MessageMetadata: sentinel shapes stored in messages.metadata
 // and carried on WS frames. AgentSessionConfig: the required/nullable shape
 // used by CoderPane/AgentComposerBar for provider dispatch.
 export type ErrorReason =
  | 'llm_provider_error'
  | 'tool_execution_failed'
  | 'summary_after_cap_failed';
 export type MessageMetadata =
  | {
      kind: 'cap_hit';
      used: number;
      limit: number;
      agent_name: string | null;
      can_continue: boolean;
    }
  | {
      kind: 'doom_loop';
      tool_name: string;
      args: Record<string, unknown>;
      threshold: number;
    }
  | {
      kind: 'mistake_recovery';
      failure_kinds: string[];
      count: number;
      escalated: boolean;
      can_continue?: boolean;
    }
  | {
      kind: 'error';
      error_reason: ErrorReason;
      error_text: string;
    };
 // Unified definition is the web required/nullable shape (the coder's all-optional
 // copy was dead — zero importers in apps/coder/src).
 export interface AgentSessionConfig {
  provider: string;
  model: string;
  modeId: string | null;
  thinkingOptionId: string | null;
 }
--- a/packages/contracts/src/provider-config.ts
+++ b/packages/contracts/src/provider-config.ts
@@ -0,0 +1,25 @@
 import { z } from 'zod';
 export const ProviderOverrideSchema = z.object({
  extends: z.enum(['acp']).optional(),
  label: z.string().min(1).optional(),
  description: z.string().optional(),
  command: z.array(z.string().min(1)).min(1).optional(),
  env: z.record(z.string()).optional(),
  enabled: z.boolean().optional(),
  order: z.number().int().optional(),
  models: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
  additionalModels: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
 });
 export const CoderProvidersFileSchema = z.object({
  providers: z.record(ProviderOverrideSchema).default({}),
 });
 export const ProviderConfigPatchSchema = z.object({
  providers: z.record(ProviderOverrideSchema.nullable()).default({}),
 });
 export type ProviderOverride = z.infer<typeof ProviderOverrideSchema>;
 export type CoderProvidersFile = z.infer<typeof CoderProvidersFileSchema>;
 export type ProviderConfigPatch = z.infer<typeof ProviderConfigPatchSchema>;
--- a/packages/contracts/src/provider-snapshot.ts
+++ b/packages/contracts/src/provider-snapshot.ts
@@ -0,0 +1,52 @@
 /** Provider snapshot types — single source of truth. Plain TS, no runtime. */
 export interface ProviderMode {
  id: string;
  label: string;
  description?: string;
  /** Auto-approve tool permissions when this mode is selected. */
  isUnattended?: boolean;
 }
 export interface ThinkingOption {
  id: string;
  label: string;
  isDefault?: boolean;
 }
 export interface ProviderModel {
  id: string;
  label: string;
  description?: string;
  isDefault?: boolean;
  thinkingOptions?: ThinkingOption[];
  defaultThinkingOptionId?: string;
 }
 // v2.3 phase 2: 'loading' (cache-miss, probe in flight) + 'unavailable'
 // (disabled or not installed) restored alongside the terminal 'ready' | 'error'.
 export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
 export interface AgentCommand {
  name: string;
  description?: string;
  // v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
  // Drives the icon split in the coder slash menu. Undefined → command.
  kind?: 'command' | 'skill';
 }
 export interface ProviderSnapshotEntry {
  name: string;
  label: string;
  description?: string;
  transport: string;
  status: ProviderSnapshotStatus;
  enabled: boolean;
  installed: boolean;
  models: ProviderModel[];
  modes: ProviderMode[];
  defaultModeId: string | null;
  commands: AgentCommand[];
  error?: string;
  fetchedAt?: string;
 }
--- a/packages/contracts/src/worktree-risk.ts
+++ b/packages/contracts/src/worktree-risk.ts
@@ -0,0 +1,13 @@
 // Single source of truth for the worktree work-loss guard type.
 // WorktreeRiskReport: returned by BooCoder's checkWorktreeWorkAtRisk, passed
 // through the server, and consumed by the web dialog. Three-way contract.
 export interface WorktreeRiskReport {
  worktreePath: string;
  branch: string;
  dirty: boolean;
  unpushed: number; // commits ahead of upstream, or -1 if no upstream is set
  unmerged: number; // commits on this branch not in the project default branch
  atRisk: boolean;  // dirty || unmerged > 0 || (upstream && unpushed > 0) || git error
  error?: string;   // populated on a git failure; presence forces atRisk
 }
--- a/packages/contracts/src/ws-frames.ts
+++ b/packages/contracts/src/ws-frames.ts
@@ -1,16 +1,10 @@
-// v1.13.11-a: Zod schemas for every WebSocket frame published by the server.
+// Single source of truth for the WebSocket frame Zod runtime schema.
-// Validation runs both on send (broker.publishFrame / publishUserFrame) and
+// Validation runs on send (broker.publishFrame / publishUserFrame) and
-// on receive (apps/web/src/hooks/useSessionStream + useUserEvents). Catches
+// on receive (apps/web hooks useSessionStream + useUserEvents). Catches
 // silent protocol drift between publisher and consumer.
 //
-// IMPORTANT: This file is duplicated byte-identical at
+// Per-kind payload schemas stay z.unknown() — frame-level drift detection
-// apps/web/src/api/ws-frames.ts. The two apps have separate tsconfigs and
+// is the goal; deep payload validation is follow-up work.
 // no path alias; the duplication is sync-by-hand. A test asserts the two
 // files match. If you change one, change the other.
 //
 // Per-kind payload schemas (tool_call args, message_parts payloads, etc.)
 // stay z.unknown() in v1.13.11. Frame-level drift detection is the goal;
 // deep payload validation is follow-up work.
 import { z } from 'zod';
@@ -66,8 +60,8 @@ const ToolCallShape = z.object({
 // payload narrowing is follow-up work). z.unknown() means the consumer
 // must narrow before reading — TypeScript-side this is fine because every
 // consumer already operates on the hand-maintained Project / Chat / Session
-// / WorkspacePane types (the brief's "Don't strip existing types yet"
+// / WorkspacePane types, and the Zod-typed shape is only used at the
-// rule), and the Zod-typed shape is only used at the publishFrame boundary.
+// publishFrame boundary.
 const OpaqueObject = z.unknown();
 // ---- per-session channel frames --------------------------------------------
@@ -216,8 +210,7 @@ export const SessionWorkspaceUpdatedFrame = z.object({
  // v2.6.x: widened from z.array — the payload is now either the legacy bare
  // WorkspacePane[] OR the WorkspaceState envelope object (panes + tabNumbers +
  // nextTabNumber + closedPaneStack). z.array alone would fail-closed and drop
-  // every envelope frame at validation. MUST be mirrored in the server's
+  // every envelope frame at validation.
  // byte-identical copy (parity test).
  workspace_panes: z.union([z.array(OpaqueObject), z.record(z.unknown())]),
 });
@@ -370,7 +363,7 @@ export type WsFrame = z.infer<typeof WsFrameSchema>;
 // Convenience: the set of known frame types. Useful for the publishFrame
 // helper to log the offending type name when validation fails. Kept in sync
-// by hand with the discriminated union above.
+// by the drift test in src/__tests__/ws-frames.test.ts.
 export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
  'snapshot',
  'message_started',
--- a/packages/contracts/tsconfig.json
+++ b/packages/contracts/tsconfig.json
@@ -0,0 +1,15 @@
 {
  "extends": "../../tsconfig.base.json",
  "compilerOptions": {
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "outDir": "dist",
    "rootDir": "src",
    "lib": ["ES2022"],
    "types": [],
    "declaration": true,
    "sourceMap": true
  },
  "include": ["src/**/*"],
  "exclude": ["src/**/__tests__/**", "**/*.test.ts"]
 }
--- a/packages/contracts/vitest.config.ts
+++ b/packages/contracts/vitest.config.ts
@@ -0,0 +1,9 @@
 import { defineConfig } from 'vitest/config';
 export default defineConfig({
  test: {
    environment: 'node',
    globals: false,
    include: ['src/**/__tests__/**/*.test.ts'],
  },
 });
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -54,6 +54,9 @@ importers:
      '@anthropic-ai/claude-agent-sdk':
        specifier: ^0.3.159
        version: 0.3.159(@anthropic-ai/sdk@0.100.1(zod@3.25.76))(@modelcontextprotocol/sdk@1.29.0(zod@3.25.76))(zod@3.25.76)
      '@boocode/contracts':
        specifier: workspace:*
        version: link:../../packages/contracts
      '@boocode/server':
        specifier: workspace:*
        version: link:../server
@@ -100,6 +103,9 @@ importers:
  apps/coder/web:
    dependencies:
      '@boocode/contracts':
        specifier: workspace:*
        version: link:../../../packages/contracts
      lucide-react:
        specifier: ^1.16.0
        version: 1.16.0(react@18.3.1)
@@ -146,6 +152,9 @@ importers:
      '@ai-sdk/openai-compatible':
        specifier: ^2.0.47
        version: 2.0.47(zod@3.25.76)
      '@boocode/contracts':
        specifier: workspace:*
        version: link:../../packages/contracts
      '@fastify/static':
        specifier: ^7.0.4
        version: 7.0.4
@@ -192,6 +201,9 @@ importers:
  apps/web:
    dependencies:
      '@boocode/contracts':
        specifier: workspace:*
        version: link:../../packages/contracts
      '@fontsource-variable/inter':
        specifier: ^5.2.8
        version: 5.2.8
@@ -287,6 +299,19 @@ importers:
        specifier: ^5.3.4
        version: 5.4.21(@types/node@20.19.41)(lightningcss@1.32.0)
  packages/contracts:
    dependencies:
      zod:
        specifier: ^3.23.8
        version: 3.25.76
    devDependencies:
      typescript:
        specifier: ^5.5.0
        version: 5.9.3
      vitest:
        specifier: ^3.2.4
        version: 3.2.4(@types/debug@4.1.13)(@types/node@20.19.41)(lightningcss@1.32.0)(msw@2.14.6(@types/node@20.19.41)(typescript@5.9.3))
 packages:
  '@agentclientprotocol/sdk@0.22.1':
--- a/pnpm-workspace.yaml
+++ b/pnpm-workspace.yaml
@@ -1,3 +1,4 @@
 packages:
  - "packages/*"
  - "apps/*"
  - "apps/coder/web"
Author	SHA1	Message	Date
indifferentketchup	cc4bd04aa4	Merge contracts-ssot-pkg: v2.7.13 single-source cross-app wire contracts in @boocode/contracts	2026-06-02 21:24:14 +00:00
indifferentketchup	649ce71eff	feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13) Move all hand-synced cross-app wire contracts into one built workspace package, @boocode/contracts, consumed by server/web/coder/coder-web via workspace:* + a per-subpath exports map. The ws-frames and provider-config Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason, AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are each single-sourced. Deletes the byte-identical copies and their parity tests, fixes a live AgentSessionConfig drift (coder dead copy removed, unified to the web required/nullable shape), removes the dead pending_change WS arms in the fallback SPA, and inverts the build order (contracts builds first) across root build, Dockerfile, and the coder deploy docs. Reverses the shared-package decision declined in v2.5.12. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 21:24:08 +00:00
indifferentketchup	2a05d2f9fe	docs: archive shipped openspec batches; add feature/plan/research notes Move 13 shipped openspec change docs under openspec/changes/archived/. Add docs/features/git-diff-panel, docs/plans/post-review-backlog, and docs/research/cross-app-contract-ssot.md (the research behind the @boocode/contracts SSOT work). Update BOOCHAT.md, BOOCODER.md, and boocode_roadmap.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-02 21:20:33 +00:00
indifferentketchup	e5ce01ae72	fix(coder): include model in WS snapshot SELECT so the attribution chip survives refresh CoderPane hydrates from the HTTP listMessages fetch (SELECT has model) AND the WS snapshot frame, and the snapshot handler setMessages-overwrites the HTTP load. The snapshot query in apps/coder/src/routes/ws.ts had its own column list that omitted model, so on coder refresh the chip's model was lost (it showed live via the message_complete frame). One-column fix: add model to that SELECT. CLAUDE.md mapper-chain note updated to list the WS snapshot SELECT. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 18:03:10 +00:00
indifferentketchup	81470f5a77	Merge composer-chips: v2.7.10 composer attach-file button + slash-commands chip (icon-only on mobile) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 17:27:59 +00:00
indifferentketchup	35dba828e1	feat: composer attach-file button + slash-commands chip (icon-only on mobile) Move the slash-commands menu out of the full-width AgentCommandsHint disclosure into a compact chip in the composer's bottom controls row, and add an attach-file button that reuses the existing drag-drop pipeline (5MB/binary gate, 10-attachment cap, chips + preview). On mobile both collapse to icon-only (count hidden). Shared ChatInput, so it applies to both BooChat and BooCoder; typed-/ autocomplete is unchanged. Removes the now-unused AgentCommandsHint component. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 17:26:27 +00:00