feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)

- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points; a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:01:03 +00:00
parent 7ca4a6b344
commit afaca9e426
23 changed files with 1284 additions and 256 deletions
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -96,6 +96,8 @@ flowchart LR

 Since v2.1.0, BooCoder runs on the host (not Docker). Agent binaries spawn directly — no SSH tunnel.

+See [coder-backends.md](./coder-backends.md) for the full dispatch-backend reference: routing predicates, the warm vs. one-shot lifecycle, agent-session resume, and the provider-discovery pipeline.
+
 ## Supporting services

 | Service | Reachability | Purpose |
--- a/docs/coder-backends.md
+++ b/docs/coder-backends.md
@@ -0,0 +1,372 @@
+# BooCoder Dispatch Backends
+
+<!-- How BooCoder turns a coding request into work performed by one of several pluggable agent backends, streams the turn back to the browser, and resumes each agent's context across turns. -->
+
+- **Last Updated:** 2026-06-02 00:00
+- **Authors:**
+  - indifferentketchup (samkintop@gmail.com)
+
+## Summary
+
+BooCoder is the write-capable surface of BooCode: it takes a message you type into a coder tab and gets an AI coding agent to actually do the work in an isolated copy of your project. Unlike the read-only chat surface, BooCoder can edit, create, and delete files — every change is staged for your review before it touches disk. The interesting part is that BooCoder does not have one agent; it has several, each spoken to over a different protocol (a local model, OpenCode, Goose, Claude Code, Qwen Code), and it hides those differences behind a single internal contract so the rest of the system streams them all the same way.
+
+When you send a message, BooCoder decides which backend should handle it, keeps that agent "warm" so a follow-up message reuses the same conversation, streams the agent's text, reasoning, and tool calls back to your browser live, and records what files changed. The picker you see (provider, model, mode, slash commands) is built from a discovery pipeline that probes which agents are installed on the host.
+
+- **Five providers, four transports.** `boocode` (native llama-swap inference), `opencode` (warm HTTP server), `goose`/`qwen` (ACP), `claude` (PTY or — behind a flag — the Claude Agent SDK).
+- **One internal contract.** Every external backend implements the same `AgentBackend` interface and emits the same transport-agnostic `AgentEvent`s; the dispatcher maps those to WebSocket frames identically regardless of which agent produced them.
+- **Warm vs. one-shot.** Tasks that come from a real chat tab get a long-lived, resumable agent session; session-less tasks (arena, MCP, raw API) get a fresh one-shot process per turn.
+- **The tab is the unit of context.** Agent sessions are keyed `(chat_id, agent)` — two tabs in one workspace are two independent conversations that happen to share one worktree.
+- **Changes are staged, never auto-applied.** Write tools queue rows in `pending_changes`; nothing hits disk until you apply.
+- **BooCoder runs on the host**, not in Docker — a `boocoder.service` systemd unit on port 9502, so it can spawn agent binaries with full filesystem access.
+
+Key files:
+- `apps/coder/src/services/dispatcher.ts` — the task loop; routes each task to a backend and maps its events to WS frames
+- `apps/coder/src/services/agent-backend.ts` — the `AgentBackend` interface and `AgentEvent` contract every backend implements
+- `apps/coder/src/services/backends/` — the four backend implementations plus routing predicates
+- `apps/coder/src/services/provider-snapshot.ts` — the provider discovery / probe pipeline that builds the picker
+- `apps/coder/src/schema.sql` — `agent_sessions`, `worktrees`, `tasks`, `available_agents`, `pending_changes`
+
+## Architecture
+
+```mermaid
+flowchart TD
+    Msg["User message<br/>(CoderPane)"] --> Route{"provider?"}
+    Route -->|boocode| Inf["runNativeInference<br/>in-process llama-swap"]
+    Route -->|external| Task[("tasks row")]
+    Task --> Disp["dispatcher.ts<br/>LISTEN/NOTIFY + 2s poll"]
+
+    Disp --> Pred{"routing<br/>predicates"}
+    Pred -->|opencode| OC["OpenCodeServerBackend<br/>(warm HTTP server)"]
+    Pred -->|"goose / qwen + tab"| WA["WarmAcpBackend<br/>(warm ACP process)"]
+    Pred -->|"claude + tab + flag"| CS["ClaudeSdkBackend<br/>(warm SDK query)"]
+    Pred -->|"session-less / flag off"| OS["runExternalAgent<br/>(one-shot ACP / PTY)"]
+
+    OC -->|AgentEvent| Map["onEvent → WS frames"]
+    WA -->|AgentEvent| Map
+    CS -->|AgentEvent| Map
+    OS -->|AgentEvent| Map
+
+    Map -->|"delta / reasoning_delta / tool_call"| Broker["broker.publishFrame"]
+    Map -->|"message_complete + model"| Broker
+    Broker -->|WebSocket| Web["apps/web CoderPane"]
+
+    OC -.session.-> AS[("agent_sessions<br/>(chat_id, agent)")]
+    WA -.session.-> AS
+    CS -.session.-> AS
+    Map -->|"diff worktree"| PC[("pending_changes")]
+    Inf --> PC
+```
+
+## How It Works
+
+Every external-agent turn starts from a row in the `tasks` table. When you send a message to a coder tab with an external provider selected, the message route writes a `tasks` row carrying the provider, model, chat id, and session id. A long-running **dispatcher** notices the row (instantly via a Postgres `LISTEN/NOTIFY` signal, with a 2-second poll as a safety net) and runs it. Only one turn runs at a time per session, so two messages in the same workspace queue rather than collide. If the provider is the native `boocode`, there is no task and no external agent: the message route enqueues llama-swap inference in-process instead, the same way the chat surface does.
+
+For an external provider, the dispatcher picks **which backend** should handle the task using small pure predicates. The deciding factors are: which agent it is, whether the task came from a real chat tab (has both a session id and a chat id), and — for Claude — whether a feature flag is on. OpenCode always uses its warm HTTP server. Goose and Qwen use a warm, long-lived ACP process when they come from a tab, and a fresh one-shot process otherwise. Claude uses the warm Claude Agent SDK backend only when the `CLAUDE_SDK_BACKEND` flag is set; by default it falls through to a one-shot PTY process. Anything session-less — arena contestants, MCP-created tasks, raw `POST /api/tasks` — always takes the one-shot path.
+
+A **warm backend** keeps the agent's process and conversation alive between turns. The first turn in a tab spawns the process and creates the agent's session; later turns reuse it, so the agent remembers the conversation without re-sending it. Each backend persists a small row in `agent_sessions` keyed on the tab and agent, including a resume token and a `config_hash` of the model. If you switch models in the same tab, the hash changes and the backend transparently starts a fresh agent session while keeping the same worktree. If the process crashes, the row is marked `crashed` and the next turn re-spawns it.
+
+No matter which backend runs, the turn streams the same way. Each backend emits a small set of **transport-agnostic events** — text, reasoning, tool-call-started, tool-call-updated, commands — and the dispatcher maps every one of them to a WebSocket frame, identically across all backends. That uniformity is the whole point of the design: OpenCode's server-sent events, an ACP process's notifications, and the Claude SDK's message stream all arrive in your browser as the same `delta`, `reasoning_delta`, and `tool_call` frames. When the turn finishes, the dispatcher publishes a `message_complete` frame (now carrying the model id, so the UI can show a model attribution chip), diffs the worktree, and queues the file changes into `pending_changes` for you to review.
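+
+A minimal sketch of that mapping, assuming simplified event and frame shapes. The frame field names (`text`, `toolCall`) and the `toFrame` helper are illustrative, not the dispatcher's actual `onEvent` code, and the `commands` event is left out for brevity:
+
+```typescript
+// Hypothetical, trimmed-down shapes for illustration only. The real
+// AgentEvent carries an AcpToolSnapshot and the real frames carry more fields.
+type ToolSnapshot = { toolCallId: string; title: string; status?: string };
+
+type AgentEvent =
+  | { type: 'text'; text: string }
+  | { type: 'reasoning'; text: string }
+  | { type: 'tool_call'; toolCall: ToolSnapshot }
+  | { type: 'tool_update'; toolCall: ToolSnapshot };
+
+type WsFrame =
+  | { type: 'delta'; text: string }
+  | { type: 'reasoning_delta'; text: string }
+  | { type: 'tool_call'; toolCall: ToolSnapshot };
+
+// One mapping for every backend: the frame depends only on the event type,
+// never on which agent produced it.
+function toFrame(event: AgentEvent): WsFrame {
+  switch (event.type) {
+    case 'text':
+      return { type: 'delta', text: event.text };
+    case 'reasoning':
+      return { type: 'reasoning_delta', text: event.text };
+    case 'tool_call':
+    case 'tool_update':
+      // Tool start and tool result both surface as a tool_call frame.
+      return { type: 'tool_call', toolCall: event.toolCall };
+  }
+}
+```
+
+Because `tool_call` and `tool_update` collapse into one frame type, the browser handles the same three streaming frame kinds regardless of backend.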
+
+## Primary Flows
+
+### A warm external-agent turn
+
+**Trigger:** You type a message in a coder tab with OpenCode, Goose, Qwen, or (flag-on) Claude selected.
+
+1. **The message becomes a task.** The coder message route creates your user message row and a `tasks` row stamped with the agent, model, `chat_id`, and `session_id`, then returns `202 { task_id, dispatched: true }`. Nothing is computed yet.
+2. **The dispatcher picks it up.** A Postgres `NOTIFY` wakes the dispatcher immediately (a 2-second poll is the backstop). It enforces one-turn-per-session concurrency, then evaluates the routing predicates and lands on a warm backend.
+3. **The session is ensured or resumed.** The backend looks up `agent_sessions` for this `(chat_id, agent)`. If a healthy session exists and the model's `config_hash` matches, it resumes — the agent still has the conversation. Otherwise it spawns the process (or, for OpenCode, reuses the one shared server) and creates a fresh agent session against the tab's worktree.
+4. **The turn streams.** The backend sends your message and emits `AgentEvent`s as the agent works. The dispatcher's `onEvent` maps each to a WebSocket frame — `text` → `delta`, `reasoning` → `reasoning_delta`, `tool_call`/`tool_update` → `tool_call` — and publishes them through the broker. Your browser renders the agent thinking, talking, and calling tools live. Tool snapshots accumulate so the final transcript persists with each tool's input, output, and status.
+5. **Outcome.** On completion the dispatcher publishes `message_complete` (with the model id for the attribution chip), records token/context usage on the message, diffs the worktree against its base commit, and supersedes any prior pending changes with one `pending_changes` set for the turn. You review the diff and apply it when ready.
+
+**When it fails:** If the agent stalls and emits no events for 180 seconds, an inactivity watchdog reconciles the session — it asks the server whether the turn actually finished and, if not, marks the `agent_sessions` row `crashed`. A crashed or exited backend is re-spawned on your next message. An aborted turn (you hit stop) cancels the prompt on the warm connection without killing the process, and a guard swallows any late "turn done" signal so it cannot accidentally settle your *next* turn.
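+
+The watchdog decision can be pictured as a small pure function. This is a sketch under assumptions: `reconcileStalledTurn` and its action names are hypothetical, and the `settle` branch (what happens when the server reports the turn did finish) is a guess at the behavior rather than something this document specifies:
+
+```typescript
+// 180 seconds without any event counts as a stall.
+const INACTIVITY_LIMIT_MS = 180000;
+
+type WatchdogAction = 'wait' | 'settle' | 'mark_crashed';
+
+// Hypothetical helper: decides what the watchdog should do for one turn.
+// lastEventAt and now are epoch milliseconds; serverSaysFinished is the
+// answer from asking the backend whether the turn actually completed.
+function reconcileStalledTurn(
+  lastEventAt: number,
+  now: number,
+  serverSaysFinished: boolean,
+): WatchdogAction {
+  if (now - lastEventAt < INACTIVITY_LIMIT_MS) return 'wait';
+  // Assumption: a finished-but-silent turn is settled normally.
+  return serverSaysFinished ? 'settle' : 'mark_crashed';
+}
+```
+
+Keeping the decision free of timers and I/O means it can be unit-tested with plain numbers.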
+
+### A one-shot dispatch
+
+**Trigger:** A task with no chat tab behind it (arena contestant, MCP-created task, raw `POST /api/tasks`), or a Claude task while `CLAUDE_SDK_BACKEND` is off.
+
+1. **No warm session.** The routing predicates return false (missing `session_id`/`chat_id`, or the Claude flag is off), so the dispatcher calls `runExternalAgent`.
+2. **A fresh process per turn.** It creates a per-task worktree, then spawns the agent once — over ACP for OpenCode/Goose, or over a PTY with `--output-format stream-json` for Claude/Qwen — runs the single turn, and tears the process down.
+3. **Outcome.** Events stream and persist exactly as in the warm flow (same `onEvent` mapping, same `message_complete` with model), and the worktree diff queues one `pending_changes` row. Nothing is kept warm; the next such task starts over.
+
+### Native boocode inference
+
+**Trigger:** You send a message with the native `boocode` provider selected.
+
+1. **No task, no agent.** The message route sees a non-external provider, creates a streaming assistant message row, and enqueues in-process inference — there is no `tasks` row and no external process.
+2. **The shared inference loop runs.** BooCoder reuses the chat surface's inference runner against llama-swap; deltas and tool calls publish through the broker just like the chat surface, and write tools queue into `pending_changes`.
+3. **Outcome.** The assistant message is finalized in place with its content and token counts. (This is the only path that tracks context fill natively; external one-shot agents report no context usage.)
+
+**When it fails:** If inference is already running for the session, the route returns `409` rather than starting a second concurrent turn.
+
+## Key Files
+
+### Backend
+| File | Purpose |
+|------|---------|
+| `apps/coder/src/services/dispatcher.ts` | Task loop (LISTEN/NOTIFY + poll), per-session concurrency, routes to each backend, maps `AgentEvent`→WS frames, publishes `message_complete` at the four external completion points, diffs worktree → `pending_changes` |
+| `apps/coder/src/services/agent-backend.ts` | The `AgentBackend` interface, `AgentEvent` union, `EnsureSessionOpts`/`AgentSessionHandle`/`PromptCtx`/`TurnResult` |
+| `apps/coder/src/services/backends/opencode-server.ts` | Warm OpenCode HTTP server backend: one `opencode serve` per process, per-session SSE loop, `config_hash` resume, inactivity watchdog, orphan-terminal guard |
+| `apps/coder/src/services/backends/warm-acp.ts` | Warm ACP backend (goose/qwen): one persistent ACP process per `(chat, agent)`, reused across turns |
+| `apps/coder/src/services/backends/claude-sdk.ts` | Warm Claude Agent SDK backend: one streaming-input `query()` per `(chat, agent)`; transcript via `PostgresSessionStore` |
+| `apps/coder/src/services/backends/claude-sdk-map.ts` | Maps `SDKMessage` (stream_event / assistant / user) → `AgentEvent`s; the `user` tool_result → terminal `tool_update` mapping |
+| `apps/coder/src/services/backends/warm-acp-routing.ts` | `shouldUseWarmBackend` predicate + ACP `stopReason`→ok mapping |
+| `apps/coder/src/services/backends/claude-sdk-routing.ts` | `shouldUseClaudeSdk` + `claudeSdkBackendEnabled` (the `CLAUDE_SDK_BACKEND` flag) |
+| `apps/coder/src/services/backends/lifecycle-decisions.ts` | Pure idle/LRU/restart eviction decisions for the agent pool |
+| `apps/coder/src/services/backends/turn-guard.ts` | Post-abort orphan-terminal suppression |
+| `apps/coder/src/services/acp-dispatch.ts` / `pty-dispatch.ts` | One-shot ACP / PTY dispatch used by `runExternalAgent` |
+| `apps/coder/src/services/acp-event-map.ts` | Shared ACP `session/update` → `AgentEvent` normalization (warm + one-shot) |
+| `apps/coder/src/services/acp-tool-snapshot.ts` | `AcpToolSnapshot` shape, `mergeToolSnapshot`, `snapshotToWireToolCall`, lifecycle mapping |
+| `apps/coder/src/services/agent-pool.ts` | Holds live backends keyed `(primaryKey, agent)`; lazy spawn, idle/LRU eviction, never evicts a busy backend |
+| `apps/coder/src/services/provider-registry.ts` | Static `PROVIDERS` registry (label/transport/model source) |
+| `apps/coder/src/services/provider-snapshot.ts` | Two-tier probe → the snapshot the picker renders; `persistProbedModels` |
+| `apps/coder/src/services/agent-probe.ts` | Startup discovery of installed agents/versions/ACP/models → `available_agents` |
+| `apps/coder/src/services/write_guard.ts` / `pending_changes.ts` | Write-path validation (escape + secret-file block) and the stage/apply/rewind queue |
+| `apps/coder/src/routes/providers.ts` / `messages.ts` | Provider snapshot/config/refresh endpoints; coder message read/post |
+| `apps/coder/src/schema.sql` | `agent_sessions`, `worktrees`, `tasks`, `available_agents`, `pending_changes`, `claude_session_entries` |
+
+### Frontend
+| File | Purpose |
+|------|---------|
+| `apps/web/src/components/AgentComposerBar.tsx` | Renders the provider/model/mode/command picker from the provider snapshot |
+| `apps/web/src/components/panes/CoderPane.tsx` | Coder tab: live `message_complete` reducer, slash-command groups, message timeline mapping |
+| `apps/web/src/components/panes/CoderMessageList.tsx` | Message rendering (`CoderMessageWire`), model-attribution chip |
+| `apps/web/src/api/types.ts` | Web wire copy of `ProviderSnapshotEntry` / `AgentCommand` (parity with `provider-types.ts`) |
+
+### Infrastructure
+| File | Purpose |
+|------|---------|
+| `/etc/systemd/system/boocoder.service` | Host service (port 9502); only `NoNewPrivileges=true` is safe — `ProtectSystem`/`ProtectHome`/`PrivateTmp` break agent dispatch |
+| `apps/coder/.env.host` | Production env (DATABASE_URL, LLAMA_SWAP_URL, CODER_PROVIDERS_PATH, CLAUDE_SDK_BACKEND, …) |
+| `data/coder-providers.json` | Live runtime provider overrides (gitignored); template is `data/coder-providers.example.json` |
+
+**Build & deploy.** `apps/coder` imports the server's compiled `dist/` (`createInferenceRunner`, `createBroker`, `ALL_TOOLS`), so **`apps/server` must build first**: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. The server's `package.json` `exports` map needs both `types` and `default` conditions per subpath (and `declaration: true` in its tsconfig) or NodeNext can't find the `.d.ts` and tsc fails "Cannot find module" here. Agent dispatch spawns binaries **directly** — `spawn(fullBinaryPath, argsArray, { cwd })` using `install_path` — never `spawn('sh', ['-c', ...])`, which fails under systemd.
+
+## Configuration
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `DATABASE_URL` | Postgres connection (shared `boochat` DB) | _required_ |
+| `LLAMA_SWAP_URL` | llama-swap base; `/v1/models` for native + opencode model discovery | _required_ |
+| `CLAUDE_SDK_BACKEND` | Truthy opts a deployment into the warm Claude Agent SDK backend; otherwise Claude uses one-shot PTY | _off_ |
+| `CODER_PROVIDERS_PATH` | Provider override config file | `/data/coder-providers.json` |
+| `PROVIDER_PROBE_TTL_MS` | Tier-2 cold ACP probe staleness threshold | `86400000` (24h) |
+| `DEFAULT_MODEL` | Fallback model when a task carries none | _deployment-specific_ |
+| `FAST_MODEL` | Cheaper model for titles/summaries; falls back to session/DEFAULT_MODEL | _unset_ |
+| `AGENT_POOL_IDLE_TTL_MS` | Idle timeout before a warm backend is evicted | `1800000` (30m) |
+| `AGENT_POOL_MAX_LIVE` | LRU cap on simultaneously-live warm backends | `10` |
+| `LIFECYCLE_SWEEP_INTERVAL_MS` | Cadence of the idle/LRU/health sweep | `60000` |
+| `ORPHAN_WORKTREE_GRACE_MS` | Grace before an untouched worktree dir is reaped | `3600000` (1h) |
+| `PORT` / `HOST` | Service bind (production binds the Tailscale IP) | `9502` / `0.0.0.0` |
+
+> BooCoder does **not** load MCP (that is BooChat only). Default values above are the documented fallbacks; production overrides live in `apps/coder/.env.host`. A config-only edit to `data/coder-providers.json` needs only the appropriate restart, not a rebuild.
+
+## Error Handling
+
+This table is the lookup for failure behavior; step-by-step recovery recipes for the common ones live under [Troubleshooting](#troubleshooting) in Technical Reference.
+
+### Backend
+| Scenario | Result | Behavior |
+|----------|--------|----------|
+| Warm turn stalls (no events ≥180s) | inactivity watchdog | Reconciles the session; if the turn isn't actually finished, marks `agent_sessions.status='crashed'` |
+| Backend process exits / crashes | next turn | Row marked `crashed`; `ensureSession` re-spawns and re-initializes on the next message |
+| User aborts a turn | cancel, not kill | Prompt cancelled on the warm connection (process kept); turn-guard swallows a late terminal so it can't settle the next turn |
+| Model changed in a tab | `config_hash` mismatch | Fresh agent session created, worktree preserved |
+| OpenCode SSE missing `directory` | zero session events | Events scope to the server `cwd` → empty turn → 180s timeout (the subscribe MUST pass the worktree dir) |
+| Native inference already running | `409 Conflict` | Second concurrent turn refused |
+| Write target escapes project / is a secret file | `WriteGuardError` | Change is not queued (`.env`, `*.pem`, `id_rsa`, `credentials.json`, … blocked) |
+| Invalid provider config PATCH | `422` | In-memory registry untouched; on disk-write failure, `500` and state left unchanged to avoid divergence |
+
+### Frontend
+| Scenario | Handling | Behavior |
+|----------|----------|----------|
+| Unknown WS frame type | wire-format gate | Frames whose `type` isn't in the web `WsFrame` union drop silently at JSON-parse — add new frame types to both sides |
+| New per-message field not whitelisted | `mapCoderTimelineRow` | The field silently vanishes in the coder unless every mapper is updated (this is how the model chip once disappeared) |
+
+---
+
+## Technical Reference
+
+*Below this point is code-level lookup detail — schema, types, constants, endpoints, and extension recipes. Stop here if you only need to understand the backends' behavior.*
+
+### Data Model
+
+`apps/coder/src/schema.sql` owns the coder-side tables and extends `tasks`. (The chat-side `sessions`/`chats`/`messages` live in `apps/server/src/schema.sql`.) The defining choice: a backend session is keyed on the **tab** (`chat_id`), not the session — two tabs in one workspace are two independent contexts sharing one worktree.
+
+```sql
+-- One resumable backend session per (tab, agent). Re-keyed to (chat_id, agent)
+-- in P1.5-b; session_id/worktree_id are informational SET NULL links.
+CREATE TABLE agent_sessions (
+  session_id       UUID REFERENCES sessions(id) ON DELETE SET NULL,
+  agent            TEXT NOT NULL,
+  backend          TEXT NOT NULL,   -- CHECK IN ('opencode_server','acp_warm','claude_sdk')
+  agent_session_id TEXT,            -- provider's resume token; null until assigned
+  server_port      INTEGER,         -- opencode HTTP port; null for ACP/SDK
+  status           TEXT NOT NULL DEFAULT 'idle', -- 'idle'|'active'|'crashed'|'closed'
+  config_hash      TEXT,            -- sha256('opencode_server|<model>').slice(0,16); stale-detect
+  worktree_id      UUID REFERENCES worktrees(id) ON DELETE SET NULL,
+  input_tokens     BIGINT NOT NULL DEFAULT 0,   -- accumulated per (chat_id, agent)
+  output_tokens    BIGINT NOT NULL DEFAULT 0,
+  cost             DOUBLE PRECISION NOT NULL DEFAULT 0,
+  chat_id          UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
+  PRIMARY KEY (chat_id, agent)      -- closing a tab CASCADEs its context away
+);
+
+-- First-class worktree entity: one per session, survives session delete.
+CREATE TABLE worktrees (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  session_id UUID REFERENCES sessions(id) ON DELETE SET NULL,
+  project_id UUID, path TEXT NOT NULL, branch TEXT, base_commit TEXT, slug TEXT,
+  status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active','archived'))
+);
+
+-- Dispatcher work units (external agents only; native boocode uses messages+inference).
+-- tasks is created by apps/server schema; the coder adds these columns.
+-- Key coder columns: agent, model, execution_path ('native'|'acp'|'pty'|'qwen'),
+-- session_id (REFERENCES sessions), chat_id (REFERENCES chats ON DELETE SET NULL),
+-- state ('pending'|'running'|'completed'|'failed'|'blocked'|'cancelled').
+
+-- Probed agent registry (UPSERT by name at startup).
+-- available_agents(name PK, install_path, version, supports_acp, models JSONB,
+--   label, transport, commands JSONB, last_probed_at)
+
+-- Claude Agent SDK transcript store (resume materialization).
+-- claude_session_entries(id BIGSERIAL, project_key, session_id, subpath, ... )
+
+-- Staged file changes — nothing hits disk until apply_pending.
+-- pending_changes(id, session_id, task_id, file_path, operation('create'|'edit'|'delete'),
+--   diff, status('pending'|'applied'|'rejected'|'reverted'), agent)
+```
+
+Idempotent FK-action flips and PK swaps in this file guard on `pg_constraint` so re-runs are no-ops — see the P1.5-b re-key block.
+
+### Core Types
+
+The contract every external backend implements. Backends emit `AgentEvent`s without any WS envelope; the dispatcher owns the mapping to frames. `tool_call` and `tool_update` are kept distinct because OpenCode's SSE distinguishes tool-start from tool-result.
+
+See `apps/coder/src/services/agent-backend.ts` for the full definitions. Key shapes:
+
+```typescript
+type AgentEvent =
+  | { type: 'text'; text: string }
+  | { type: 'reasoning'; text: string }
+  | { type: 'tool_call'; toolCall: AcpToolSnapshot }   // tool started
+  | { type: 'tool_update'; toolCall: AcpToolSnapshot }  // tool result / status change
+  | { type: 'commands'; commands: AgentCommand[] };      // ACP available_commands_update
+
+interface AgentBackend {
+  ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle>;
+  prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult>;
+  closeSession(handle: AgentSessionHandle): Promise<void>;
+  dispose(): Promise<void>;
+  health(): 'up' | 'down';
+  isBusy?(): boolean;              // pool never evicts a busy backend
+  tickHealth?(now?: number): Promise<void>;  // proactive restart (opencode only)
+}
+```
+
+`AcpToolSnapshot` (`apps/coder/src/services/acp-tool-snapshot.ts`) is the accumulating shape for a tool call — `{ toolCallId, title, kind?, status?, rawInput?, rawOutput? }` — merged incrementally and rendered via `snapshotToWireToolCall`.
+
+The provider picker is driven by `ProviderSnapshotEntry` / `AgentCommand` in `apps/coder/src/services/provider-types.ts`, which must stay byte-identical to the web copy in `apps/web/src/api/types.ts` (see Testing).
+
+### Constants
+
+| Constant | Value | Description |
+|----------|-------|-------------|
+| poll interval | `2000` ms | Dispatcher fallback poll between `NOTIFY` signals |
+| turn inactivity watchdog | `180000` ms | OpenCode turn with no events → reconcile |
+| snapshot cache TTL | `5 min` | Per-cwd provider snapshot memory cache |
+| `PROVIDER_PROBE_TTL_MS` | `86400000` ms (24h) | Tier-2 cold ACP probe staleness |
+| `config_hash` length | 16 hex chars | `sha256('opencode_server\|<model>')` prefix; excludes the server port |
+| agent pool idle TTL | `1800000` ms (30m) | Default warm-backend eviction |
+| agent pool max live | `10` | Default LRU cap |
+
+### Implementation Notes
+
+#### Routing predicates
+
+`shouldUseWarmBackend(task)` (`backends/warm-acp-routing.ts`) returns true only for `goose`/`qwen` tasks that carry **both** a `session_id` and a `chat_id` — i.e. they came from a real chat tab. `shouldUseClaudeSdk(task, env)` (`backends/claude-sdk-routing.ts`) is the same shape for `claude`, additionally gated behind `claudeSdkBackendEnabled` (the `CLAUDE_SDK_BACKEND` flag, default off). OpenCode tasks always route to the warm server. Everything that fails these predicates — arena, MCP, raw `POST /api/tasks`, flag-off Claude — falls through to `runExternalAgent`'s one-shot path. Both predicates are pure so they unit-test without a live process.
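+
+As a rough illustration of how small these predicates can be, here is a sketch assuming a minimal task shape. `TaskLike`, `fromChatTab`, and the set of strings treated as truthy for `CLAUDE_SDK_BACKEND` are assumptions; only the predicate names and the rules they encode come from the text above:
+
+```typescript
+// Hypothetical minimal task shape; the real tasks row has more columns.
+interface TaskLike {
+  agent: string;
+  session_id: string | null;
+  chat_id: string | null;
+}
+
+type Env = Record<string, string | undefined>;
+
+// A task "came from a real chat tab" when it carries both ids.
+const fromChatTab = (task: TaskLike): boolean =>
+  task.session_id !== null && task.chat_id !== null;
+
+function shouldUseWarmBackend(task: TaskLike): boolean {
+  return (task.agent === 'goose' || task.agent === 'qwen') && fromChatTab(task);
+}
+
+// Assumption: these strings count as truthy; unset or anything else is off.
+function claudeSdkBackendEnabled(env: Env): boolean {
+  const raw = (env.CLAUDE_SDK_BACKEND ?? '').trim().toLowerCase();
+  return raw === '1' || raw === 'true' || raw === 'yes' || raw === 'on';
+}
+
+function shouldUseClaudeSdk(task: TaskLike, env: Env): boolean {
+  return task.agent === 'claude' && fromChatTab(task) && claudeSdkBackendEnabled(env);
+}
+```
+
+Passing `env` in as a parameter, rather than reading `process.env` inside the predicate, is what lets a test flip the flag without touching global state.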
+
+#### The `user` tool_result mapping (Claude SDK)
+
+The Claude Agent SDK feeds tool **results** back in as `type:'user'` messages containing `tool_result` blocks. `mapSdkMessage` must map the `user` case to a terminal `tool_update` (completed, or failed on `is_error`) carrying the tool's output. Without it, the tool call persists `status:'running'` forever and the UI spinner never stops. See `mapUserToolResults` / `toolResultText` in `apps/coder/src/services/backends/claude-sdk-map.ts`. The same mapper dedups text/thinking already streamed via partials and assembles buffered `input_json_delta` fragments on `content_block_stop`.
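+
+The shape of that mapping might look like the following sketch. The function names match the ones cited above, but the bodies, the `ToolUpdate` type, and the loose `unknown` typing are assumptions made for illustration; the block fields `tool_use_id`, `content`, and `is_error` follow the Anthropic tool_result block format:
+
+```typescript
+// Hypothetical, simplified event shape (the real one uses AcpToolSnapshot).
+interface ToolUpdate {
+  type: 'tool_update';
+  toolCall: { toolCallId: string; status: 'completed' | 'failed'; rawOutput: string };
+}
+
+// Flattens a tool_result's content (a string or a list of parts) to text.
+function toolResultText(content: unknown): string {
+  if (typeof content === 'string') return content;
+  if (!Array.isArray(content)) return '';
+  return content
+    .map((part: unknown) => {
+      if (typeof part !== 'object' || part === null) return '';
+      const text = (part as { text?: unknown }).text;
+      return typeof text === 'string' ? text : '';
+    })
+    .join('');
+}
+
+// Turns every tool_result block of a user message into a terminal update.
+function mapUserToolResults(content: unknown): ToolUpdate[] {
+  if (!Array.isArray(content)) return [];
+  const updates: ToolUpdate[] = [];
+  for (const raw of content) {
+    if (typeof raw !== 'object' || raw === null) continue;
+    const block = raw as Record<string, unknown>;
+    const id = block.tool_use_id;
+    if (block.type !== 'tool_result' || typeof id !== 'string') continue;
+    updates.push({
+      type: 'tool_update',
+      toolCall: {
+        toolCallId: id,
+        status: block.is_error === true ? 'failed' : 'completed',
+        rawOutput: toolResultText(block.content),
+      },
+    });
+  }
+  return updates;
+}
+```
+
+Every emitted update carries a terminal status, which is what lets the matching tool snapshot leave `running`.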
+
+#### OpenCode SSE and `config_hash` resume
+
+OpenCode runs as one warm `opencode serve` HTTP server per BooCoder process, multiplexed across sessions. Live streaming reads `session.next.text.delta` / `.reasoning.delta` / `.tool.{called,success,failed}` (not the post-hoc `message.part.*`), and the subscribe **must** pass the session's worktree `directory` or events scope to the server cwd and the turn times out. Resume hinges on `config_hash = sha256('opencode_server|<model>').slice(0,16)` — deliberately excluding the random per-boot server port so resume survives restarts; a model change flips the hash and forces a fresh opencode session while keeping the worktree. A `streamedPartKeys` set drops post-hoc duplicate deltas; a `sessionID` demux guard drops cross-session events when two sessions share a server.
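+
+The hash formula above translates directly to Node's `crypto` module. In this sketch `configHash` and `decideResume` are hypothetical helper names, and treating `crashed` or `closed` rows as not resumable is an assumption drawn from the "healthy session" wording in the warm flow:
+
+```typescript
+import { createHash } from 'node:crypto';
+
+// sha256('opencode_server|<model>') truncated to 16 hex chars. The server
+// port is deliberately not part of the input, so the hash survives restarts.
+function configHash(model: string): string {
+  return createHash('sha256')
+    .update(`opencode_server|${model}`)
+    .digest('hex')
+    .slice(0, 16);
+}
+
+type StoredSession = { config_hash: string | null; status: string };
+
+// Hypothetical decision helper: resume only a healthy row whose hash matches.
+function decideResume(
+  stored: StoredSession | undefined,
+  model: string,
+): 'resume' | 'fresh_session' {
+  if (!stored) return 'fresh_session';
+  // Assumption: crashed or closed rows are never resumed as-is.
+  if (stored.status === 'crashed' || stored.status === 'closed') return 'fresh_session';
+  return stored.config_hash === configHash(model) ? 'resume' : 'fresh_session';
+}
+```
+
+Switching models changes the hash input, so `decideResume` returns `fresh_session` even though the stored row and worktree are otherwise intact.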
+
+#### Warm vs. one-shot lifetime
+
+The agent pool (`agent-pool.ts`) holds OpenCode once per process (one server, many sessions) and warm-ACP / Claude-SDK once per `(chat, agent)`. Idle-TTL and LRU eviction are computed by the pure functions in `lifecycle-decisions.ts` and never evict a backend whose `isBusy()` is true. One-shot dispatch holds nothing warm: `runExternalAgent` spawns, runs one turn, and tears down.
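+
+One way to express those eviction rules as a pure function is sketched below. `LiveBackend` and `pickEvictions` are hypothetical names and the two-pass ordering (idle first, then LRU) is an assumption; the defaults mirror `AGENT_POOL_IDLE_TTL_MS` and `AGENT_POOL_MAX_LIVE`:
+
+```typescript
+// Hypothetical view of one pooled backend.
+interface LiveBackend {
+  key: string;
+  lastUsedAt: number; // epoch milliseconds of the last turn
+  busy: boolean;      // mirrors isBusy()
+}
+
+function pickEvictions(
+  backends: LiveBackend[],
+  now: number,
+  idleTtlMs = 1800000,
+  maxLive = 10,
+): string[] {
+  const evict = new Set<string>();
+
+  // Pass 1: anything idle past the TTL, unless it is busy.
+  for (const b of backends) {
+    if (!b.busy && now - b.lastUsedAt >= idleTtlMs) evict.add(b.key);
+  }
+
+  // Pass 2: if still over the cap, drop least-recently-used idle backends.
+  const survivors = backends.filter((b) => !evict.has(b.key));
+  let excess = survivors.length - maxLive;
+  const lruFirst = survivors
+    .filter((b) => !b.busy)
+    .sort((a, b) => a.lastUsedAt - b.lastUsedAt);
+  for (const b of lruFirst) {
+    if (excess <= 0) break;
+    evict.add(b.key);
+    excess -= 1;
+  }
+
+  return Array.from(evict);
+}
+```
+
+If every backend over the cap is busy, the function returns fewer evictions than the excess; the pool stays temporarily above the cap rather than interrupt a running turn.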
+
+#### Worktree diff → pending changes
+
+All paths run in a git worktree (per-session for warm backends, per-task for one-shot). At turn end the dispatcher diffs against the worktree's base commit and queues the result as a single `pending_changes` set, superseding the previous one (latest-wins). Write tools validate every path through `write_guard.ts` (`resolveWritePath` — `resolve` + prefix check, no `realpath` since created files may not exist yet — plus `isSecretPath`).
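+
+A sketch of the resolve-plus-prefix check, assuming a simplified secret-file list. The function and error names come from the text above, but the bodies are illustrative, and `isSecretPath` here covers only the four patterns this document names (the real list is longer):
+
+```typescript
+import * as path from 'node:path';
+
+class WriteGuardError extends Error {}
+
+// Assumption: a basename check against the patterns named in this doc.
+function isSecretPath(filePath: string): boolean {
+  const base = path.basename(filePath);
+  return (
+    base === '.env' ||
+    base === 'id_rsa' ||
+    base === 'credentials.json' ||
+    base.endsWith('.pem')
+  );
+}
+
+// Resolves target against the project root and rejects anything outside it.
+// Uses path.resolve rather than realpath because a file that is about to be
+// created does not exist yet.
+function resolveWritePath(projectRoot: string, target: string): string {
+  const root = path.resolve(projectRoot);
+  const resolved = path.resolve(root, target);
+  // Appending the separator stops "/proj-evil" from passing as inside "/proj".
+  if (!resolved.startsWith(root + path.sep)) {
+    throw new WriteGuardError(`path is outside the project: ${target}`);
+  }
+  if (isSecretPath(resolved)) {
+    throw new WriteGuardError(`secret file is blocked: ${target}`);
+  }
+  return resolved;
+}
+```
+
+A purely lexical check like this does not follow symlinks, which is the trade-off of skipping `realpath`.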
+
+### API Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/api/providers` | Installed providers with models; transport reflects actual capability (`supports_acp`) |
+| `GET` | `/api/providers/snapshot?cwd=` | Full picker shape (providers + models + modes + commands); 5-min cache |
+| `GET` | `/api/providers/config` | Raw `data/coder-providers.json` (`{ providers: {} }` if absent) |
+| `PATCH` | `/api/providers/config` | Per-id **wholesale** override replace → save → reload → clear cache |
+| `POST` | `/api/providers/refresh` | Force cold ACP re-probe of installed providers |
+| `GET` | `/api/providers/:id/diagnostic` | Read-only diagnostics (no probe) |
+| `GET` | `/api/sessions/:sessionId/messages?chat_id=` | Coder message list via `mapCoderMessageRow` |
+| `POST` | `/api/sessions/:sessionId/messages` | Send: external → `202 { task_id, dispatched }`; native → `202 { assistant_message_id }` |
+| `GET` | `/api/health` | `{ ok, db, tools }` (down ~15–20s after restart while the agent probe runs) |
+
+`PATCH /api/providers/config` merges shallowly at the provider-id level: each id in the request replaces that provider's override object wholesale, and fields inside the object are not merged. To flip one field, send `{...existing, field}` or you wipe the rest. A custom ACP entry requires `extends: 'acp'` + `label` + `command` or it drops out of the resolved registry.
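+
+The wholesale-replace behavior is easy to trip over, so here is a small model of it. `applyConfigPatch`, the `myagent` id, and the assumption that the PATCH body mirrors the `{ providers: {} }` shape returned by the GET are all illustrative, not taken from the route's code:
+
+```typescript
+type ProviderOverride = Record<string, unknown>;
+type ProvidersConfig = { providers: Record<string, ProviderOverride> };
+
+// Each id present in the patch replaces the stored override object entirely;
+// ids absent from the patch are left alone.
+function applyConfigPatch(current: ProvidersConfig, patch: ProvidersConfig): ProvidersConfig {
+  return { providers: { ...current.providers, ...patch.providers } };
+}
+
+const current: ProvidersConfig = {
+  providers: { myagent: { extends: 'acp', label: 'My Agent', command: 'my-agent' } },
+};
+
+// Lossy: only the changed field is sent, so `extends` and `command` vanish.
+const lossy = applyConfigPatch(current, {
+  providers: { myagent: { label: 'Renamed' } },
+});
+
+// Safe: spread the existing override first, then the changed field.
+const safe = applyConfigPatch(current, {
+  providers: { myagent: { ...current.providers.myagent, label: 'Renamed' } },
+});
+```
+
+Under the rule above, the `lossy` result would also drop out of the resolved registry, because a custom ACP entry without `extends` and `command` is incomplete.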
+
+### Provider discovery pipeline
+
+The picker is built by a four-stage pipeline: `provider-config.ts` (never-throws Zod load of the overrides file) → `provider-config-registry.ts` (`buildResolvedRegistry`, a singleton merging built-ins with overrides) → `provider-snapshot.ts` (two-tier probe) → `routes/providers.ts`. Tier 1 is a fast presence check; tier 2 is a cold ACP probe, skipped unless forced, stale past `PROVIDER_PROBE_TTL_MS`, or the DB has no models yet. Model sources differ per provider: `boocode`/`opencode` from llama-swap `/v1/models` (opencode IDs prefixed `llama-swap/`), `claude` from static registry entries, `qwen` from `~/.qwen/settings.json`, `goose` from the cold ACP probe. Startup `agent-probe.ts` UPSERTs all of this into `available_agents`. Commands come from the static `PROVIDER_COMMANDS` hints merged with live ACP `available_commands_update` (async — must poll after `newSession`); Claude, a PTY provider, discovers commands from disk via `claude-command-discovery.ts` (`~/.claude/commands` + enabled plugin skills). `AgentCommand.kind` (`'command'` vs `'skill'`) drives the slash-menu icon split in `CoderPane`.
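+
+The tier-2 gate described above reduces to three conditions. In this sketch `shouldColdProbe` and its option names are hypothetical, and treating a provider that has never been probed as stale is an assumption:
+
+```typescript
+interface ColdProbeInput {
+  forced: boolean;            // e.g. an explicit refresh request
+  hasModels: boolean;         // whether the DB already holds models
+  lastProbedAt: number | null; // epoch milliseconds, null if never probed
+  now: number;
+  ttlMs?: number;             // defaults to the 24h PROVIDER_PROBE_TTL_MS
+}
+
+// Returns true when the expensive cold ACP probe should run.
+function shouldColdProbe(input: ColdProbeInput): boolean {
+  const ttl = input.ttlMs ?? 86400000;
+  if (input.forced) return true;
+  if (!input.hasModels) return true;
+  // Assumption: "never probed" counts as stale.
+  if (input.lastProbedAt === null) return true;
+  return input.now - input.lastProbedAt > ttl;
+}
+```
+
+With the default TTL, a provider probed an hour ago that already has models skips tier 2 entirely and is served from the tier-1 presence check.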
+
+### Testing
+
+`apps/coder` has its own vitest suite (`pnpm -C apps/coder test`). Config: `globals: false` (import `describe`/`it`/`expect` from `vitest`), include glob `src/**/__tests__/**/*.test.ts` (files outside it silently don't run), `fileParallelism: false` so DB-integration suites serialize. The pattern is to extract pure helpers and unit-test them in isolation.
+
+- `services/backends/__tests__/claude-sdk-map.test.ts` — SDK stream assembly, text/thinking dedup, tool input buffering, the `user` tool_result mapping
+- `services/backends/__tests__/warm-acp-routing.test.ts` / `claude-sdk-routing.test.ts` — routing predicates
+- `services/backends/__tests__/turn-guard.test.ts` — abort orphan-terminal suppression
+- `services/backends/__tests__/lifecycle-decisions.test.ts` — idle/LRU/restart eviction
+- `services/__tests__/acp-event-map.test.ts` / `acp-tool-snapshot.test.ts` — ACP normalization + snapshot merge
+- `services/__tests__/provider-types-parity.test.ts` — text-identity parity between `provider-types.ts` and the web `api/types.ts` copy (compile-time cross-import is blocked by TS6307 on web's composite tsconfig)
+- `services/__tests__/write_guard.test.ts` (+ `_fuzz`) — path escape + secret-file blocking
+
+### Adding a new backend
+
+1. **Implement `AgentBackend`** — new file under `apps/coder/src/services/backends/`; emit `AgentEvent`s, persist an `agent_sessions` row, honor `isBusy()`.
+2. **Add a routing predicate** — a pure `shouldUseX(task, env?)` sibling to `warm-acp-routing.ts`, plus a unit test.
+3. **Wire the dispatcher** — branch in `runTask` (`dispatcher.ts`) to construct/pool the backend and run `ensureSession`/`prompt`; reuse the shared `onEvent` mapping and `message_complete` publish.
+4. **Extend the schema** — add the backend value to the `agent_sessions_backend_chk` CHECK in `apps/coder/src/schema.sql` (idempotent DROP + re-ADD).
+5. **Register the provider** — entry in `provider-registry.ts` (and `provider-manifest.ts`/`provider-commands.ts` if it has modes/commands).
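+
+Step 2 might look like the sketch below. The task and env shapes are trimmed, hypothetical slices of the real types, and both the `'x'` agent id and the `X_BACKEND_DISABLED` variable are placeholders.
+
+```typescript
+// Only the fields the predicate reads; the real task type is much larger.
+interface TaskSketch { agent: string; }
+type EnvSketch = Record<string, string | undefined>;
+
+// Pure routing predicate: no I/O and no globals, so a unit test can cover
+// every branch by passing plain objects.
+function shouldUseX(task: TaskSketch, env: EnvSketch = {}): boolean {
+  if (env['X_BACKEND_DISABLED'] === '1') return false; // placeholder kill switch
+  return task.agent === 'x';
+}
+```
+
+Keeping `env` a parameter rather than reading `process.env` inside the function is what lets the routing test run without touching the real environment.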
+
+### Adding a new per-message field
+
+A new per-message coder field is silently dropped unless **every** mapper is updated: the server read SELECT + `mapCoderMessageRow` (`routes/messages.ts`), `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields, which makes it the easiest mapper to forget: this is how the `model` chip once vanished, and publishing `model: task.model` on `message_complete` is what restores it.
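+
+Why a whitelisting mapper drops fields can be seen in a reduced sketch. The row shapes here are hypothetical, cut down to three fields; the real `mapCoderTimelineRow` handles many more.
+
+```typescript
+// Hypothetical wire row and client row, trimmed for illustration.
+interface RawRowSketch { id: string; content: string; model?: string | null; }
+interface TimelineRowSketch { id: string; content: string; model: string | null; }
+
+// A whitelisting mapper copies only the fields it names. If `model` were not
+// listed below, the server could send it and the client would still discard it.
+function mapRowSketch(raw: RawRowSketch): TimelineRowSketch {
+  return {
+    id: raw.id,
+    content: raw.content,
+    model: raw.model ?? null, // remove this line and the model chip has no data
+  };
+}
+```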
+
+### Troubleshooting
+
+#### New routes 404 after deploy
+
+The host service keeps running the OLD process until `sudo systemctl restart boocoder`. A stale process answers new routes with `404 {error:'not found'}` while old routes still return `200`. Restart, don't re-debug. `:9502/api/health` is down ~15–20s after a restart while the startup agent probe runs; an early connection-refused is not a failed deploy.
+
+#### Empty turns / 180s timeouts on OpenCode
+
+Confirm the SSE subscribe passes the session's worktree `directory`; without it, events scope to the server cwd and the session sees zero events. Confirm the opencode model string is provider-prefixed (`llama-swap/<model>`) and present in `~/.config/opencode/opencode.json` — not merely loadable by llama-swap.
+
+#### A stuck tool spinner that never resolves
+
+The tool's terminal `tool_update` isn't being emitted. For the Claude SDK backend, verify the `user` tool_result mapping in `claude-sdk-map.ts`; for ACP, verify `tool_call_update` is reaching `mapSessionUpdate`.
+
+## Related Documentation
+
+- [Cross-App Contract Parity](./coding-standards/cross-app-contract-parity.md) — the coding standard for editing the duplicated provider-snapshot / WS-frame contracts this doc references
+- [Architecture overview](./ARCHITECTURE.md) — system diagram; BooCoder execution paths in context
+- [Provider picker backend plan](./superpowers/plans/2026-05-25-provider-picker-backend.md) — historical design of provider discovery (shipped v2.1.0)
+- `apps/coder/CLAUDE.md` — per-app deep engineering reference (auto-loads when editing `apps/coder/`)
+- `apps/server/CLAUDE.md` — the chat-side inference pipeline BooCoder's native path reuses
+- `openspec/changes/v2-6-persistent-agent-sessions/` — `proposal.md` / `design.md` / `tasks.md`: the rationale behind keying sessions on `chat_id`, the warm-vs-one-shot decision, and the backend abstraction
+- `openspec/changes/claude-sdk-sessionstore/` — the Claude Agent SDK backend + the `user` tool_result mapping batch
--- a/docs/coding-standards/cross-app-contract-parity.md
+++ b/docs/coding-standards/cross-app-contract-parity.md
@@ -0,0 +1,206 @@
+---
+paths:
+  - "apps/server/src/types/ws-frames.ts"
+  - "apps/web/src/api/ws-frames.ts"
+  - "apps/server/src/types/api.ts"
+  - "apps/web/src/api/types.ts"
+  - "apps/coder/src/services/provider-types.ts"
+  - "apps/web/src/components/MessageBubble.tsx"
+  - "apps/server/src/services/inference/turn.ts"
+---
+
+# Cross-App Contract Parity
+
+*Reach for this when a parity test goes red (`ws-frames.test.ts`, `provider-types-parity.test.ts`), a reviewer flags a "half-synced" type, or a frame/sentinel "does nothing" at runtime — i.e. one copy of a duplicated cross-app contract drifted from the other. The fix-it path is [When to Apply](#when-to-apply) + its Verification step.*
+
+- **Status:** proposed
+- **Date Created:** 2026-06-02 00:00
+- **Last Updated:** 2026-06-02 00:00
+- **Authors:**
+  - indifferentketchup (samkintop@gmail.com)
+- **Reviewers:**
+- **Applies To:**
+  - Every hand-synced type/schema contract that crosses the `apps/server` ↔ `apps/web` ↔ `apps/coder` boundary in the files under `paths:`. The primary examples are the WS-frame Zod schema, the provider-snapshot types, and the sentinel `MessageMetadata` union plus its `MessageBubble` render arm — but the same rule governs the other duplicated pairs in these files (`WorktreeRiskReport`, the provider-config wire types, and the interface-typed `WsFrame` union that mirrors the Zod schema).
+
+## Introduction
+
+Several wire contracts in BooCode exist as **two or three hand-synced copies** in different apps, because the apps have separate `tsconfig`s with no shared path alias and a composite-project restriction (TS6307) that structurally blocks importing one app's types from another. There is no shared workspace package for these types yet. This standard governs what you must do when you touch one of those copies: **change every copy in the same commit** — and, where the contract has no compile-time consumer guarantee (the sentinel render arm), the consumer too.
+
+The three families in [Coding Standard](#coding-standard) are the primary examples, but the rule applies to **every** hand-synced pair in the files under `paths:`, each of which carries its own in-code `edit both copies` / `Mirror of …` / `KEEP IN SYNC` marker. Beyond the three: `WorktreeRiskReport` (`apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`), the provider-config wire types (`ProviderOverride` / `CoderProvidersFile`, web mirror of the coder's Zod-inferred shapes), and — note this one — a **second** representation of the WS wire shape: the interface-typed `WsFrame` union in `apps/web/src/api/types.ts` plus the `*Frame` interfaces in `apps/server/src/types/api.ts`, which is distinct from the byte-identical Zod `ws-frames.ts` pair and is **not** covered by the byte-parity test. A WS frame's shape therefore lives in more than one place; treat all of them as one contract.
+
+### Purpose
+
+- **Primary:** prevent *silent runtime* contract breakage. Nothing at compile time links the copies — each app type-checks against its own copy, so `tsc` stays green when they drift. The failure surfaces only at runtime, and silently: a WS frame whose `type` exists on one side but not the other is **dropped at JSON-parse** with no error; a sentinel `kind` added without a render arm shows nothing. Editing every copy in lockstep is the only thing that keeps the contract whole.
+- **Secondary:** two of the three contracts have runtime parity tests (`ws-frames.test.ts`, `provider-types-parity.test.ts`) that catch drift in the test run — but they are a backstop, not the mechanism, and the sentinel triple has no test at all.
+- **Side effect:** keeping the copies byte- or text-identical makes a contract change reviewable as a matched diff across files.
+
+### Scope
+
+The specific duplicated contracts listed in `paths:` above, inside the `apps/server`, `apps/web`, and `apps/coder` TypeScript packages. It does **not** govern types that live in a single app.
+
+## When to Apply
+
+Walk this before editing a type, schema, enum, or metadata union:
+
+1. **Does this shape exist as a copy in another app?** — Check: `grep -rn "<TypeOrFieldName>" apps/*/src`. If it appears under two or more of `apps/server`, `apps/web`, `apps/coder` → continue. If it lives in exactly one app → see "When NOT to Apply".
+2. **Are you changing its wire shape?** — adding, removing, renaming, or re-typing a field; adding/removing a frame `type`; adding an enum value or a sentinel `kind`. If yes → apply this standard: edit **every** copy, plus every consumer that switches on the shape, in the **same commit**. If no (a comment or formatting change that the contract's parity test normalizes away) → see "When NOT to Apply".
+
+**Exception — the sentinel/consumer triple:** `MessageMetadata` (`apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`) has **no parity test**, and a new `kind` is inert until it gets a render branch in `apps/web/src/components/MessageBubble.tsx`. When the shape you are editing is `MessageMetadata`, "every copy" includes that render arm — there is no test to remind you.
+
+**Verification step:** run the guards that exist *now*, before you commit:
+
+```bash
+# The trailing arg is a FILE-PATH substring filter for `vitest run` (not a test
+# name). A typo matches zero files; vitest fails that run by default, but exits 0
+# (a false green) when `passWithNoTests` is set, so confirm the run actually
+# executed the file (look for "1 passed" on the named file).
+pnpm -C apps/server test ws-frames.test            # WS-frame byte-parity + KNOWN_FRAME_TYPES drift
+pnpm -C apps/coder  test provider-types-parity     # provider-snapshot text-parity (incl. nested blocks)
+# Sentinel triple has no test — grep all copies for a NEW rendering kind:
+grep -rn "<new-kind>" apps/server/src/types/api.ts apps/web/src/api/types.ts apps/web/src/components/MessageBubble.tsx
+```
+
+For a **rendering** sentinel kind (`cap_hit` / `doom_loop` / `mistake_recovery`) the new `kind` must appear in all three files. The non-rendering `error` arm of `MessageMetadata` lives in the two type copies only — it has no `MessageBubble` branch — so for it the grep should match the two `api.ts`/`types.ts` copies, not `MessageBubble.tsx`.
+
+## When NOT to Apply
+
+- **The type lives in a single app.** Internal server types, web-only view models, coder-only helpers — there is no second copy, so there is nothing to sync. Edit the one definition directly; do **not** manufacture a duplicate in another app "for symmetry." A new cross-app contract should prefer the eventual shared package or, at minimum, ship with its own parity test — not a third hand-synced copy.
+- **A comment- or whitespace-only edit to a *text-parity* file.** `provider-types-parity.test.ts` strips comments and blank lines before comparing, so a comment-only change to one provider-types copy is tolerated and you needn't chase the other. (This relief does **not** apply to `ws-frames.ts`, which is compared **byte-for-byte** — every character, including comments, must match.)
+- **The shared workspace package lands.** This standard exists *only* because the single source of truth was deferred (a Tier-2 follow-up noted in `provider-types-parity.test.ts`). Once these types move into one shared package, delete the hand-syncing rule rather than keep paying it — the SSOT supersedes this standard.
+
+## Background
+
+The duplication is deliberate, not accidental. A compile-time bidirectional-assignability check was attempted first — a web-side file importing the coder's import-free `provider-types.ts` — but `apps/web/tsconfig.app.json` is a composite project and rejects out-of-include files with **TS6307**, so cross-project type import is structurally blocked. The team chose hand-synced copies guarded by runtime tests over a premature shared package. The WS-frame copies go further and are kept **byte-identical** so a single `readFileSync` equality test can guard them; the provider-snapshot copies are kept **text-identical per named type block** (comments normalized away) because they sit among unrelated types. The cost of this choice is exactly what this standard manages: a copy can drift, and because each app compiles independently, only a runtime test — or a runtime bug — reveals it.
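+
+The two guard styles differ only in what they normalize before comparing. The sketch below is a simplified illustration, not the project's test code: `normalizeSketch` handles whole-line `//` comments only and also trims indentation, which the real parity test may not do.
+
+```typescript
+// Byte parity (the ws-frames.ts style): every character must match.
+function byteIdentical(a: string, b: string): boolean {
+  return a === b;
+}
+
+// Text parity (the provider-types style): drop comments and blank lines first.
+// Simplification: only whole-line `//` comments are recognized here.
+function normalizeSketch(src: string): string {
+  return src
+    .split('\n')
+    .map((line) => line.trim())
+    .filter((line) => line !== '' && !line.startsWith('//'))
+    .join('\n');
+}
+
+function textIdentical(a: string, b: string): boolean {
+  return normalizeSketch(a) === normalizeSketch(b);
+}
+```
+
+A comment added to one copy therefore fails `byteIdentical` but passes `textIdentical`, while a changed field fails both.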
+
+## Coding Standard
+
+### Edit all copies of a cross-app contract together (cross-cutting)
+
+When you change one copy of a duplicated contract, change the others in the same commit. Each contract family has its own home files and its own (or no) guard.
+
+**WS frame schema — `apps/server/src/types/ws-frames.ts` ↔ `apps/web/src/api/ws-frames.ts` (byte-identical):**
+
+```typescript
+// PRIMARY: no compile-time link exists across apps (separate tsconfigs, TS6307
+// blocks cross-import). A frame type added to one copy but not the other breaks
+// silently at runtime — the frontend drops the frame at JSON-parse. So this file
+// and apps/web/src/api/ws-frames.ts MUST stay byte-identical, in the same commit.
+//
+//   IMPORTANT: This file is duplicated byte-identical at
+//   apps/web/src/api/ws-frames.ts. ... If you change one, change the other.
+//
+// Adding a frame also means adding its `type` to KNOWN_FRAME_TYPES (a drift test
+// probes every entry for a discriminated branch).
+```
+
+**Provider snapshot types — `apps/coder/src/services/provider-types.ts` ↔ `apps/web/src/api/types.ts`, text-identical per block.** By convention you author on the coder side and mirror to web (the in-code `KEEP IN SYNC` markers point that way), but the parity test is **symmetric** — it fails on drift in *either* file and names no authoritative copy, so "fix the red test" means re-sync the two, not edit one in particular:
+
+```typescript
+// PRIMARY: nothing links these two copies at compile time — a field added here
+// but not in apps/web/src/api/types.ts breaks silently at runtime (the web side
+// drops or mis-reads the snapshot). The in-file marker, with its test backstop:
+//   KEEP IN SYNC with apps/web/src/api/types.ts ProviderSnapshotEntry — parity
+//   is enforced by __tests__/provider-types-parity.test.ts (fails on field drift).
+// Applies to the nested ProviderModel / ProviderMode / ThinkingOption /
+// AgentCommand / ProviderSnapshotStatus blocks the entry references, too.
+export interface ProviderSnapshotEntry { /* ...fields... */ }
+```
+
+**Sentinel metadata — `apps/server/src/types/api.ts` ↔ `apps/web/src/api/types.ts`, plus the render arm in `apps/web/src/components/MessageBubble.tsx` (no parity test):**
+
+```typescript
+// A new *rendering* sentinel kind is a THREE-file change with NO test to catch a miss:
+//   1. apps/server/src/types/api.ts  — add the arm to MessageMetadata
+//   2. apps/web/src/api/types.ts     — add the identical arm
+//   3. MessageBubble.tsx             — add the render branch, else it shows nothing
+// The real union has FOUR arms; show it whole so nobody reads two as the full set:
+export type MessageMetadata =
+  | { kind: 'cap_hit';          /* used, limit, agent_name, can_continue */ }
+  | { kind: 'doom_loop';        /* tool_name, args, threshold */ }
+  | { kind: 'mistake_recovery'; /* failure_kinds, count, escalated */ }  // PINNED CONTRACT (#12), mirrored byte-for-byte
+  | { kind: 'error';            /* error_reason, error_text */ };        // NOT a rendered sentinel → 2-file change
+//
+// CROSS-APP CAVEAT for the MessageBubble render branch: the coder feeds rows in via
+// `CoderMessageWire as unknown as Message`, so `metadata` can be undefined there.
+// Null-guard the loose way — `message.metadata?.kind === 'x'` or `metadata != null` —
+// NEVER `metadata !== null` (undefined !== null is true → `.kind` throws → blank
+// screen, and tsc can't see it). See apps/web/CLAUDE.md.
+```
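+
+The null-guard caveat above can be reproduced in isolation. The shapes are hypothetical minimal stand-ins for the real `Message` type, and both function names are invented for the example.
+
+```typescript
+// Minimal stand-ins; the real Message carries far more fields.
+type MetadataSketch = { kind: string } | null | undefined;
+interface BubbleMessageSketch { metadata?: MetadataSketch; }
+
+// Loose guard: optional chaining treats null and undefined alike.
+function sentinelKindSafe(message: BubbleMessageSketch): string | undefined {
+  return message.metadata?.kind;
+}
+
+// Strict guard, shown for contrast. The declared type says metadata is never
+// undefined, so tsc accepts it; a row cast through `unknown` breaks that
+// promise, `undefined !== null` is true, and the property access throws.
+function sentinelKindUnsafe(message: { metadata: { kind: string } | null }): string | undefined {
+  if (message.metadata !== null) return message.metadata.kind;
+  return undefined;
+}
+```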
+
+**What to avoid:**
+
+```typescript
+// ANTI-PATTERN: editing one copy only.
+// Add a new frame type to apps/web/src/api/ws-frames.ts but not the server copy
+// (or vice versa): tsc stays green — they're separate projects — but the parity
+// test fails, and had it not existed, the server would publish a frame the
+// frontend silently discards at JSON-parse. A half-edited contract is invisible
+// to the type-checker; never land one.
+```
+
+**Project references:**
+- `apps/server/src/types/ws-frames.ts` — the byte-identical sync comment (top of file) and `KNOWN_FRAME_TYPES`.
+- `apps/web/src/api/ws-frames.ts` — the web copy that must match it byte-for-byte.
+- `apps/coder/src/services/provider-types.ts` — the `KEEP IN SYNC` comment above `ProviderSnapshotEntry`.
+- `apps/web/src/api/types.ts` — the provider-snapshot wire copy and the `MessageMetadata` copy.
+- `apps/web/src/components/MessageBubble.tsx` — the sentinel render arms (`metadata?.kind` branches).
+
+### A wire-shape change passes through the gate, then a consumer
+
+A frame is published by the server's permissive `InferenceFrame` union (`apps/server/src/services/inference/turn.ts`) but only reaches the UI if the strict schema/union accepts it — permissive publish, strict receive. Keep the **type/schema copies** (this standard's scope) in lockstep so the frame survives validation; then make sure something consumes it.
+
+> **Where consumer-wiring fits.** This standard governs the duplicated *type/schema* copies and the one consumer with no compile-time guard — the sentinel `MessageBubble` render arm. A new WS frame additionally needs a runtime handler to *do* anything: `applyFrame` in `apps/web/src/hooks/useSessionStream.ts` (per-session frames) and `useUserEvents` (user-channel frames), plus the sidebar reducer. That wiring — and the event-dedup discipline around it — is governed by `apps/web/CLAUDE.md`, not by this parity standard. A frame that passes the byte-parity test but has no reducer `case` validates and is then silently ignored.
+
+**Correct usage:**
+
+```typescript
+// Adding a WS frame type, all in one commit:
+//   - apps/server/src/services/inference/turn.ts  — loose InferenceFrame publish union (+ optional fields)
+//   - apps/server/src/types/ws-frames.ts          — strict WsFrameSchema + WsFrame + KNOWN_FRAME_TYPES
+//   - apps/web/src/api/ws-frames.ts               — byte-identical copy of the strict gate
+// The strict web-side type is the wire-format gate: a frame whose type isn't in
+// it is dropped at JSON-parse. The loose publish union and the strict gate are
+// BOTH required — permissive publish, strict receive.
+```
+
+**What to avoid:**
+
+```typescript
+// ANTI-PATTERN: widening the server publish union but not the strict schema.
+// turn.ts now emits { type: 'my_new_frame', ... }; the broker Zod-validates
+// against WsFrameSchema, which doesn't know the type, and fail-closed drops it.
+// The feature "does nothing" with no error in either app's logs.
+```
+
+**Project references:**
+- `apps/server/src/services/inference/turn.ts` — the loose `InferenceFrame` publish union.
+- `apps/server/src/types/ws-frames.ts` — `WsFrameSchema` (the broker's fail-closed validation gate) + `KNOWN_FRAME_TYPES`.
+- `apps/web/src/components/MessageBubble.tsx` — the consumer for sentinel `MessageMetadata` kinds.
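+
+A fail-closed gate of this kind can be sketched without Zod. This is an illustration of the drop behaviour only: the real gate is `WsFrameSchema`, and the set below is a stand-in for `KNOWN_FRAME_TYPES` with one real frame name and one invented one.
+
+```typescript
+// Stand-in for KNOWN_FRAME_TYPES; 'example_frame' is an invented entry.
+const KNOWN_TYPES_SKETCH = new Set<string>(['message_complete', 'example_frame']);
+
+interface FrameSketch { type: string; [key: string]: unknown; }
+
+// Fail-closed: anything that is not an object with a known `type` is dropped
+// by returning null. No error reaches the caller, which is why a half-synced
+// contract looks like a feature that "does nothing".
+function parseFrameSketch(rawJson: string): FrameSketch | null {
+  let value: unknown;
+  try {
+    value = JSON.parse(rawJson);
+  } catch {
+    return null;
+  }
+  if (typeof value !== 'object' || value === null) return null;
+  const type = (value as { type?: unknown }).type;
+  if (typeof type !== 'string' || !KNOWN_TYPES_SKETCH.has(type)) return null;
+  return value as FrameSketch;
+}
+```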
+
+### Sync the copies; never weaken the parity test
+
+When a parity test fails, the fix is to make the copies match — not to make the test stop checking. The corollary also holds: when you add a **new** nested type that `ProviderSnapshotEntry` references, add its name to the `names` array in `provider-types-parity.test.ts`, or the new type is hand-synced but **unguarded**.
+
+**What to avoid:**
+
+```typescript
+// ANTI-PATTERN: a red parity test "fixed" by deleting the assertion, skipping
+// the it(), or trimming a type out of the compared `names` list. That converts a
+// caught drift into a shipped, silent contract break. Re-sync the copies instead.
+```
+
+**Project references:**
+- `apps/server/src/services/__tests__/ws-frames.test.ts` — `ws-frames.ts file mirror parity` (byte-identical) and the `KNOWN_FRAME_TYPES` drift probe.
+- `apps/coder/src/services/__tests__/provider-types-parity.test.ts` — text-identity of each shared block across the coder ↔ web copies.
+
+## Additional Resources
+
+### Project Documentation
+
+- [BooCoder Dispatch Backends](../coder-backends.md) — the provider-snapshot contract and the WS-frame mapping in their runtime context (see "Core Types" and the parity notes).
+- [Architecture overview](../ARCHITECTURE.md) — the three surfaces and the shared database the contracts cross.
+- Root `CLAUDE.md` → "Conventions" — the cross-app contract rules (WS frame, sentinels, provider-type parity, JSONB) this standard formalizes.
+- `apps/server/CLAUDE.md` (`services/broker.ts`) and `apps/coder/CLAUDE.md` — per-app notes on the broker validation and the provider-type mirror.
+
+### External Resources
+
+- [Claude Code path-scoped rules](https://code.claude.com/docs/en/memory) — how the `.claude/rules/coding-standards/` index that surfaces this standard is loaded.
--- a/docs/project-discovery.md
+++ b/docs/project-discovery.md
@@ -0,0 +1,132 @@
+# Project Discovery
+
+> Auto-generated stack / tooling / command inventory for the BooCode repository. Static reference for skills, agents, and contributors. Deep engineering notes live in the root and per-app `CLAUDE.md` files; this file is the factual "what's installed and how to run it" map.
+
+## Repository
+
+- type: pnpm monorepo (workspaces `apps/*` + `apps/coder/web`)
+- package manager: pnpm 10.15.1 (root `package.json` `packageManager`)
+- lock file: `pnpm-lock.yaml`
+- workspace config: `pnpm-workspace.yaml`
+- shared TS config: `tsconfig.base.json` (ES2022 target, strict, `noUncheckedIndexedAccess`, `isolatedModules`)
+- languages: TypeScript (all Node packages), Go 1.24 (codecontext sidecar)
+- database: PostgreSQL 16 (single `boochat` DB; two schema files applied idempotently — `apps/server/src/schema.sql` + `apps/coder/src/schema.sql`)
+- members: 5 Node packages (`@boocode/server`, `@boocode/web`, `@boocode/coder`, `@boocode/coder-web`, `@boocode/booterm`) + 1 Go sidecar (`codecontext/`)
+- cross-workspace dep: `@boocode/coder` → `@boocode/server` (`workspace:*`); **server must build first** (emits `.d.ts` for consumers)
+- build order: server → coder → web/booterm (independent) → codecontext (Docker, Go)
+
+## Repository-level
+
+### Documentation
+
+- primary guidance: `CLAUDE.md` (root, tracked) + per-app `apps/server/CLAUDE.md`, `apps/coder/CLAUDE.md`, `apps/web/CLAUDE.md` (lazy-loaded per subtree)
+- overview: `README.md`
+- release log: `CHANGELOG.md` (per-tag, newest-first)
+- system prompts: `BOOCHAT.md`, `BOOCODER.md` (bind-mounted into containers)
+- architecture: `docs/ARCHITECTURE.md` (system diagram + overview)
+- planning docs: `docs/codecontext-ts-plan.md`, `docs/DEFERRED-WORK.md`, `docs/STALE-DEPRECATED.md`, `docs/themes_v1.md`, `docs/superpowers/plans/`
+- specs/design: `openspec/README.md` (convention) + `openspec/changes/<slug>/{proposal,tasks,design}.md`; shipped batches snapshot under `openspec/changes/archived/`
+
+### Agent registry & skills
+
+- agent registry: `data/AGENTS.md` (tracked; parsed — each `## <Name>` + `---` frontmatter fence is one agent)
+- skills: `data/skills/<vendor>/` (each skill = `SKILL.md` + optional `eval.yaml`); vendors include `boocode/` (`committing-changes`, `improving-boocode-guidance`, `systematic-debugging`, `using-worktrees`), `anthropics/`, `superpowers/`, `mattpocock/`, others
+
+### Infrastructure
+
+- container build: `Dockerfile` (root — boocode = server + web; `node:20-alpine` builder + runtime; runtime adds ripgrep/git/ssh)
+- container build: `apps/booterm/Dockerfile` (`node:20-alpine` builder → `node:20-bookworm-slim` runtime for node-pty libc parity)
+- container build: `codecontext/Dockerfile` (`golang:1.24-alpine` → `alpine:3.20`)
+- container build: `apps/coder/Dockerfile` (present but unused — BooCoder now runs via host systemd, not Docker)
+- orchestration: `docker-compose.yml` — services `boocode` (:9500), `booterm` (:9501), `boocode_db` (postgres:16-alpine, host :5500), `codecontext` (internal :8080); `boocoder` service commented out (moved to host systemd `boocoder.service`, :9502)
+- CI/CD: none in repo (`.github/workflows`, `.gitea`, `.gitlab-ci.yml` absent); deploy is manual (`docker compose up --build -d`; boocoder via `pnpm build` + `sudo systemctl restart boocoder`)
+- git hooks: none committed (`.git/hooks/` has only `.sample`); a host-side `security_reminder_hook.py` is referenced but not in-repo
+- ADRs: no dedicated directory; decisions live inline in `CLAUDE.md` files, `docs/ARCHITECTURE.md`, and `openspec/changes/*/design.md`
+- linters/formatters: none configured (no eslint/prettier/stylelint config)
+
+### Environment & configuration
+
+- env file: `.env` (gitignored) / `.env.example` (tracked template)
+- boocoder host env: `apps/coder/.env.host`
+- config template: `data/mcp.example.json` (tracked) — live `data/mcp.json` is gitignored; secrets live in `.env` and resolve via `{env:VAR}` substitution (e.g. `CONTEXT7_API_KEY`)
+- config template: `data/coder-providers.example.json` (tracked) — live `data/coder-providers.json` is gitignored (runtime, read+written on UI toggles)
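+
+For orientation, the `{env:VAR}` substitution mentioned above behaves roughly like the sketch below. It is an illustrative re-implementation under assumptions (the accepted variable-name pattern in particular); the project's own `substituteEnvVars` may differ.
+
+```typescript
+// Any JSON value, as found in a parsed mcp.json.
+type JsonSketch = string | number | boolean | null | JsonSketch[] | { [k: string]: JsonSketch };
+
+// Recursively replace {env:NAME} in every string value. Unset variables
+// resolve to '' and are collected so the caller can log a warning.
+function substituteEnvSketch(
+  value: JsonSketch,
+  env: Record<string, string | undefined>,
+  unset: string[] = [],
+): JsonSketch {
+  if (typeof value === 'string') {
+    return value.replace(/\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
+      const resolved = env[name];
+      if (resolved === undefined) unset.push(name);
+      return resolved ?? '';
+    });
+  }
+  if (Array.isArray(value)) return value.map((v) => substituteEnvSketch(v, env, unset));
+  if (value !== null && typeof value === 'object') {
+    const out: { [k: string]: JsonSketch } = {};
+    for (const [k, v] of Object.entries(value)) out[k] = substituteEnvSketch(v, env, unset);
+    return out;
+  }
+  return value; // numbers, booleans, and null pass through untouched
+}
+```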
+
+## Projects
+
+### `boocode` (root coordinator) — `/`
+
+- manifest: `package.json`
+- role: pnpm workspace root; coordinates dev/build across members
+- install: `pnpm install`
+- dev: `pnpm dev:server` (tsx watch, :3000), `pnpm dev:web` (Vite, :5173)
+- build: `pnpm build` (web then server)
+- start: `pnpm start` (`node apps/server/dist/index.js`)
+- typecheck: `npx tsc --noEmit` (project references; per-app `tsc` is authoritative)
+
+### `@boocode/server` — `apps/server/`
+
+- manifest: `apps/server/package.json`
+- runtime: Node.js + TypeScript
+- frameworks: Fastify ^4.28.1 (+ `@fastify/websocket` ^10, `@fastify/static` ^7); postgres ^3.4.4 (porsager, tagged-template SQL, no ORM); Vercel AI SDK — `ai` ^6 + `@ai-sdk/openai-compatible` ^2 (llama-swap); Zod ^3; `@modelcontextprotocol/sdk` ^1.29
+- build: `pnpm -C apps/server build` (`tsc` + copy `schema.sql` → `dist/`) — authoritative for server code
+- dev: `pnpm -C apps/server dev` (tsx watch)
+- typecheck: `pnpm -C apps/server typecheck`
+- test: `pnpm -C apps/server test` (vitest run)
+- tsconfig: `apps/server/tsconfig.json` (NodeNext, `declaration: true` for workspace consumers; `exports` map with `types` conditions)
+- test config: `apps/server/vitest.config.ts` (vitest ^3.2.4, env=node, `globals: false`, `fileParallelism: false`)
+- test pattern: `src/**/__tests__/**/*.test.ts`
+- DB-integration tests: opt-in via `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`
+
+### `@boocode/web` — `apps/web/`
+
+- manifest: `apps/web/package.json`
+- runtime: Node.js + TypeScript (browser SPA; also hosts the BooCoder pane)
+- frameworks: React ^18.3.1 + React DOM; React Router v6.26.0; Tailwind v4.3.0 (+ `@tailwindcss/postcss`); shadcn/radix-ui primitives; Shiki ^1.29 (highlighting); `@xterm/xterm` 5.5 (+ addons fit/web-links/webgl); Vite ^5.3.4 (+ `@vitejs/plugin-react`)
+- build: `pnpm -C apps/web build` (`tsc -b` + `vite build`)
+- dev: `pnpm -C apps/web dev` (Vite, :5173)
+- preview: `pnpm -C apps/web preview`
+- typecheck: `pnpm -C apps/web typecheck` (`tsc -b --noEmit`) — or `npx tsc -p apps/web/tsconfig.app.json --noEmit`
+- tests: none (no test harness by design)
+- tsconfig: `apps/web/tsconfig.json` (composite refs) + `apps/web/tsconfig.app.json` (Bundler resolution, `react-jsx`, path alias `@/` → `src/*`)
+- dev config: `apps/web/vite.config.ts` (proxy order: `/api/term` + `/ws/term` → :9501, `/api/coder` → :9502, `/api` → :3000)
+
+### `@boocode/coder` — `apps/coder/`
+
+- manifest: `apps/coder/package.json`
+- runtime: Node.js + TypeScript; runs as host systemd service (`boocoder.service`, :9502), postgres at `127.0.0.1:5500`
+- frameworks: Fastify ^4.28.1 (+ `@fastify/websocket`); postgres ^3.4.4; agent SDKs — `@agentclientprotocol/sdk` ^0.22, `@anthropic-ai/claude-agent-sdk` ^0.3, `@opencode-ai/sdk` ~1.15 (imported via `@opencode-ai/sdk/v2/client`); `@modelcontextprotocol/sdk` ^1.29; `@boocode/server` (`workspace:*`)
+- build: `pnpm -C apps/coder build` (`tsc` + copy `schema.sql`) — requires server built first
+- dev: `pnpm -C apps/coder dev` (tsx watch)
+- cli: `pnpm -C apps/coder cli` (`tsx src/cli.ts`)
+- typecheck: `pnpm -C apps/coder typecheck`
+- test: `pnpm -C apps/coder test` (vitest run)
+- deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
+- tsconfig: `apps/coder/tsconfig.json` (NodeNext, `declaration: false`)
+- test config: `apps/coder/vitest.config.ts` (vitest ^3.0.0, env=node, `globals: false`, `fileParallelism: false`)
+- test pattern: `src/**/__tests__/**/*.test.ts`
+
+### `@boocode/coder-web` — `apps/coder/web/`
+
+- manifest: `apps/coder/web/package.json`
+- runtime: Node.js + TypeScript; standalone fallback SPA served at :9502 (primary coder UI is the pane in `@boocode/web`)
+- frameworks: React ^18.3.1 + React Router v6.26.0 + Tailwind v4.3.0 + Vite ^5.3.4
+- dev config: `apps/coder/web/vite.config.ts` (port 5174; proxies `/api` → `http://127.0.0.1:3000`)
+
+### `@boocode/booterm` — `apps/booterm/`
+
+- manifest: `apps/booterm/package.json`
+- runtime: Node.js + TypeScript; Docker container (:9501, bookworm-slim+glibc)
+- frameworks: Fastify ^4.28.1 (+ `@fastify/websocket`); node-pty ^1.0.0; `pg` ^8.13 (session persistence); tmux (per-session `bc-<sid>`)
+- build: `pnpm -C apps/booterm build` (`tsc` only)
+- dev: `pnpm -C apps/booterm dev` (tsx watch)
+- typecheck: `pnpm -C apps/booterm typecheck`
+- tests: none
+- tsconfig: `apps/booterm/tsconfig.json` (NodeNext, `declaration: false`)
+
+### codecontext shim — `codecontext/`
+
+- manifest: `codecontext/go.mod`
+- runtime: Go 1.24; standalone binary, MCP stdio↔HTTP adapter (NDJSON framing); Docker sidecar at `http://codecontext:8080/v1/<tool_name>`
+- source: single `shim.go`; wraps the CodeContext fork staged via `codecontext/fork.tar.gz` (gitignored; fork repo at `/opt/forks/codecontext/`, branch `boocode-ts`)
+- build: `go build ./...` (use `/snap/go/current/bin/go`); built into the container in Docker
+- test: `go test ./...`