feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)

- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points; a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc
2026-06-02 17:01:03 +00:00
parent 9b89156c48
commit ba08c63f6a
23 changed files with 1284 additions and 256 deletions
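
The `{env:NAME}` substitution described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code in `mcp-config.ts`: the function name `substituteEnvSketch`, the explicit `env` and `warn` parameters (the real loader reads `process.env` and takes a Fastify logger), and the exact variable-name pattern in the regex are all assumptions made for the example.

```typescript
// Hypothetical sketch of recursive {env:NAME} substitution.
// Names, parameters, and the regex are illustrative assumptions.
type EnvMap = Record<string, string | undefined>;

// Assumed pattern: {env:NAME} where NAME looks like a shell variable name.
const ENV_REF = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;

function substituteEnvSketch(
  value: unknown,
  env: EnvMap,
  warn: (msg: string) => void,
): unknown {
  if (typeof value === 'string') {
    // Resolve every reference in the string; unset variables become ''.
    return value.replace(ENV_REF, (_match: string, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) {
        warn(`mcp config references unset env var ${name}`);
        return '';
      }
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => substituteEnvSketch(item, env, warn));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      // Keys are copied verbatim; only values are transformed.
      out[key] = substituteEnvSketch(inner, env, warn);
    }
    return out;
  }
  // Numbers, booleans, null: returned unchanged.
  return value;
}

// Usage: one variable is set, one is missing.
const envWarnings: string[] = [];
const resolvedConfig = substituteEnvSketch(
  {
    headers: { CONTEXT7_API_KEY: '{env:CONTEXT7_API_KEY}' },
    args: ['--token', '{env:MISSING_VAR}'],
  },
  { CONTEXT7_API_KEY: 'example-key' },
  (msg) => envWarnings.push(msg),
);
```

Running substitution on the parsed JSON before schema validation is what makes an unset variable visible as an empty string in a strict field, which is why the commit also logs the names of the unset variables.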
--- a/apps/coder/CLAUDE.md
+++ b/apps/coder/CLAUDE.md
@@ -0,0 +1,34 @@
+# apps/coder — BooCoder (deep reference)
+
+> Per-app engineering notes for `apps/coder/src/`. BooCoder runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker — Fastify at port 9502, postgres at `127.0.0.1:5500`. Cross-cutting commands, database, environment, workflow, and cross-app contracts live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/coder/`.
+
+## Probe & provider discovery
+
+- **`services/provider-registry.ts`** — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty). `PROBED_AGENT_NAMES` derives from it — adding/removing providers means editing this file, not the frontend.
+- **`services/agent-probe.ts`** — Startup probe via direct `exec()` (not SSH): discovers installed agents, versions, ACP support, models. Qwen models from `~/.qwen/settings.json`; Claude models static from the registry. Persisted to `available_agents`.
+- **`routes/providers.ts`** — `GET /api/providers` returns installed providers with models. Transport reflects actual capability (checks `supports_acp` from DB, not just registry preference). The apps/server side is "Provider picker dispatch" (see `apps/server/CLAUDE.md`).
+- **Provider snapshot lifecycle** (`services/`): `provider-config.ts` (Zod config, never-throws) → `provider-config-registry.ts` (`buildResolvedRegistry`, singleton) → `provider-snapshot.ts` (two-tier probe: tier-1 fast presence, tier-2 cold ACP probe skipped unless force / stale `PROVIDER_PROBE_TTL_MS` 24h / dbEmpty; cached). Verify live: `curl http://100.114.205.53:9502/api/providers/snapshot` — returns providers + models + commands, the exact shape `AgentComposerBar` renders.
+- `PATCH /api/providers/config` replaces a provider id's override object **wholesale** (per-id shallow merge) — to flip one field send `{...existing, enabled}`, or a custom ACP entry's `command`/`label` is wiped and it drops out of the resolved registry. `data/coder-providers.json` is **gitignored** (live runtime config — the coder reads AND writes it on UI toggles); tracked reference is `data/coder-providers.example.json`. The loader falls back to `{providers:{}}` (built-ins only) when absent, so a fresh checkout needs no copy.
+
+## Build, deploy, dispatch
+
+- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. **apps/server must build FIRST.**
+- Build + deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
+- After `pnpm -C apps/coder build` the host service keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler shape). Restart, don't re-debug.
+- `:9502/api/health` is down ~15–20s after a boocoder restart while the startup agent-probe scan runs — retry; an early connection-refused is not a failed deploy.
+- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
+- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
+- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath (`"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`) — without the `types` condition, NodeNext can't find `.d.ts` files and tsc fails "Cannot find module" here.
+- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes`. Nothing hits disk until `apply_pending`. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
+
+## Backends
+
+> Behavioral overview + flows + data model: see [/docs/coder-backends.md](/docs/coder-backends.md). The notes below are the deep per-fact reference.
+
+- **opencode** runs as a warm HTTP server (`services/backends/opencode-server.ts` — `opencode serve` per BooCoder process, one opencode session per BooCode session, resumed via `agent_sessions`). goose/qwen/claude dispatch **one-shot** ACP/PTY with no ctx/token usage; only native `boocode` (llama-swap) tracks ctx.
+- **opencode SSE** (`opencode-server.ts`): live streaming is `session.next.text.delta` / `.reasoning.delta` / `.tool.{called,success,failed}` — NOT `message.part.*` (terminal/post-hoc). `client.event.subscribe({ directory })` MUST pass the session's worktree dir; omit it and opencode scopes events to the server `process.cwd()` → zero session events (empty turns, 180s timeout). Each live session owns its own subscribe loop + AbortController (a `sessionID` demux guard drops cross-session events when two share a dir). Turn completes on `session.idle`; `promptAsync` is fire-and-forget (204).
+- **opencode model strings** must be provider-prefixed (`llama-swap/<model>`) AND exist in `~/.config/opencode/opencode.json` `provider.llama-swap.models` — not merely loadable by llama-swap. `parseModel` infers `llama-swap/` for a bare id; the dispatcher coalesces empty→DEFAULT_MODEL then prefixes. `agent-probe` populates opencode's `available_agents.models` via `mergeLlamaSwap` (fetches `/v1/models`); empty model list → frontend sends `''` → no inference (empty turn).
+- **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; breaks cross-restart resume). Keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` resolves-or-creates a chat when null. The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
+- **Claude SDK backend tool RESULTS arrive as `type:'user'` SDK messages** (tool_result content blocks): `mapSdkMessage` (`claude-sdk-map.ts`) MUST map the `user` case → a terminal `tool_update` (completed/failed + output), else the tool_call persists `status:'running'` and the UI spinner never stops. The dispatcher's `tool_update` path then publishes + persists it.
+- **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in.
+- **A new per-message coder field silently drops unless you update every mapper**: server read SELECT + `mapCoderMessageRow` (`apps/coder/src/routes/messages.ts`), `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` WS reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields — easiest to forget (this is how the `model` chip silently vanished in the coder).
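
The wholesale-replace behaviour of `PATCH /api/providers/config` noted in `apps/coder/CLAUDE.md` above is easy to trip over, so here is a small sketch of it. The `ProviderOverride` shape, the `applyPatch` helper, and the `my-acp-agent` entry are hypothetical stand-ins; the real schema lives in `provider-config.ts` and may carry other fields.

```typescript
// Hypothetical override shape; illustrative only.
interface ProviderOverride {
  enabled?: boolean;
  command?: string;
  label?: string;
}
type ProviderOverrides = Record<string, ProviderOverride>;

// Per-id shallow merge: each id in the patch replaces the stored
// override object for that id wholesale (no field-level merge).
function applyPatch(current: ProviderOverrides, patch: ProviderOverrides): ProviderOverrides {
  return { ...current, ...patch };
}

const stored: ProviderOverrides = {
  'my-acp-agent': { enabled: true, command: '/usr/local/bin/my-agent', label: 'My Agent' },
};

// Lossy: only the changed field is sent, so command and label are wiped.
const lossy = applyPatch(stored, { 'my-acp-agent': { enabled: false } });

// Safe: spread the existing override first, then flip the one field.
const safe = applyPatch(stored, {
  'my-acp-agent': { ...stored['my-acp-agent'], enabled: false },
});
```

In the lossy case the custom entry no longer has a `command`, which matches the note that it then drops out of the resolved registry.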
--- a/apps/coder/src/services/backends/tests/claude-sdk-map.test.ts
+++ b/apps/coder/src/services/backends/tests/claude-sdk-map.test.ts
@@ -179,3 +179,73 @@ describe('mapSdkMessage — non-content messages', () => {
    ).toEqual([]);
  });
 });
+
+describe('mapSdkMessage — user tool results', () => {
+  /** A `user` message carrying tool_result blocks (the SDK feeds tool output back here). */
+  function userMsg(content: unknown): SDKMessage {
+    return msg({ type: 'user', message: { role: 'user', content }, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
+  }
+
+  it('maps a string tool_result to a completed tool_update carrying the output', () => {
+    const state = createClaudeSdkMapState();
+    const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'done' }]), state);
+    expect(out).toEqual<AgentEvent[]>([
+      {
+        type: 'tool_update',
+        toolCall: { toolCallId: 't1', title: 't1', kind: null, status: 'completed', rawInput: undefined, rawOutput: 'done' },
+      },
+    ]);
+  });
+
+  it('marks an is_error result failed', () => {
+    const state = createClaudeSdkMapState();
+    const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'boom', is_error: true }]), state);
+    const ev = out[0]!;
+    if (ev.type !== 'tool_update') throw new Error('expected tool_update');
+    expect(ev.toolCall.status).toBe('failed');
+    expect(ev.toolCall.rawOutput).toBe('boom');
+  });
+
+  it('flattens array text blocks (skipping non-text) and reuses a prior snapshot title', () => {
+    const state = createClaudeSdkMapState();
+    mapSdkMessage(
+      streamEvent({ type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 't2', name: 'view_file', input: {} } }),
+      state,
+    );
+    const out = mapSdkMessage(
+      userMsg([
+        {
+          type: 'tool_result',
+          tool_use_id: 't2',
+          content: [
+            { type: 'text', text: 'line1' },
+            { type: 'image', source: {} },
+            { type: 'text', text: 'line2' },
+          ],
+        },
+      ]),
+      state,
+    );
+    const ev = out[0]!;
+    if (ev.type !== 'tool_update') throw new Error('expected tool_update');
+    expect(ev.toolCall.toolCallId).toBe('t2');
+    expect(ev.toolCall.title).toBe('view_file');
+    expect(ev.toolCall.status).toBe('completed');
+    expect(ev.toolCall.rawOutput).toBe('line1\nline2');
+  });
+
+  it('surfaces a result for an unknown tool_use_id with the id as the title', () => {
+    const state = createClaudeSdkMapState();
+    const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 'orphan-id', content: 'x' }]), state);
+    expect(out[0]).toMatchObject({
+      type: 'tool_update',
+      toolCall: { toolCallId: 'orphan-id', title: 'orphan-id', kind: null, status: 'completed' },
+    });
+  });
+
+  it('ignores non-tool_result blocks and non-array content', () => {
+    const state = createClaudeSdkMapState();
+    expect(mapSdkMessage(userMsg([{ type: 'text', text: 'hi' }]), state)).toEqual([]);
+    expect(mapSdkMessage(userMsg('plain string'), state)).toEqual([]);
+  });
+});
--- a/apps/coder/src/services/backends/claude-sdk-map.ts
+++ b/apps/coder/src/services/backends/claude-sdk-map.ts
@@ -49,6 +49,7 @@ import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
 type StreamEvent = Extract<SDKMessage, { type: 'stream_event' }>['event'];
 type AssistantContent = Extract<SDKMessage, { type: 'assistant' }>['message']['content'];
 type ContentBlock = AssistantContent extends readonly (infer B)[] ? B : never;
+type UserContent = Extract<SDKMessage, { type: 'user' }>['message']['content'];

 /**
 * Caller-owned accumulator threaded across `mapSdkMessage` calls within ONE turn.
@@ -81,6 +82,12 @@ export function mapSdkMessage(msg: SDKMessage, state: ClaudeSdkMapState): AgentE
      return mapStreamEvent(msg.event, state);
    case 'assistant':
      return mapFinalAssistant(msg.message.content, state);
+    case 'user':
+      // Tool RESULTS ride in as user messages (tool_result blocks): the SDK ran
+      // the tool and feeds its output back. Without mapping these, the tool_call
+      // never reaches a terminal snapshot — it persists as status:'running' with
+      // no output and the UI spinner never stops (the bug this fixes).
+      return mapUserToolResults(msg.message.content, state);
    default:
      // system/init, status, result, hooks, task_*, etc. — no turn content here.
      // (The backend reads session_id off the init message and usage/cost off the
@@ -180,6 +187,52 @@ function mapFinalAssistant(content: ContentBlock[], state: ClaudeSdkMapState): A
  return out;
 }

+/**
+ * User-message tool_result blocks → terminal tool_update events. The SDK runs
+ * each tool and feeds the output back in a `user` message; we mark the matching
+ * snapshot completed (or failed, on is_error) WITH its output so the snapshot
+ * persists/renders as resolved instead of spinning. Unknown ids (no prior
+ * snapshot) are still surfaced so a stray result isn't silently lost.
+ */
+function mapUserToolResults(content: UserContent, state: ClaudeSdkMapState): AgentEvent[] {
+  if (!Array.isArray(content)) return [];
+  const out: AgentEvent[] = [];
+  for (const raw of content) {
+    const block = raw as { type?: string; tool_use_id?: string; content?: unknown; is_error?: boolean };
+    if (block.type !== 'tool_result' || !block.tool_use_id) continue;
+    const prev = state.snapshots.get(block.tool_use_id);
+    const snap: AcpToolSnapshot = {
+      toolCallId: block.tool_use_id,
+      title: prev?.title ?? block.tool_use_id,
+      kind: prev?.kind ?? null,
+      status: block.is_error ? 'failed' : 'completed',
+      rawInput: prev?.rawInput,
+      rawOutput: toolResultText(block.content),
+    };
+    state.snapshots.set(block.tool_use_id, snap);
+    out.push({ type: 'tool_update', toolCall: snap });
+  }
+  return out;
+}
+
+/** tool_result content is a string OR an array of content blocks (text/image).
+ *  Flatten text blocks; fall back to the raw value so nothing is lost. */
+function toolResultText(content: unknown): unknown {
+  if (typeof content === 'string') return content;
+  if (Array.isArray(content)) {
+    const text = content
+      .map((c) =>
+        c && typeof c === 'object' && (c as { type?: string }).type === 'text'
+          ? String((c as { text?: unknown }).text ?? '')
+          : '',
+      )
+      .filter(Boolean)
+      .join('\n');
+    return text || content;
+  }
+  return content ?? '';
+}
+
 /** Parse a buffered JSON string; fall back to a prior value on empty/invalid. */
 function parseJsonOr(buf: string, fallback: unknown): unknown {
  const s = buf.trim();
--- a/apps/coder/src/services/dispatcher.ts
+++ b/apps/coder/src/services/dispatcher.ts
@@ -536,6 +536,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
        type: 'message_complete',
        message_id: assistantId,
        chat_id: chatId,
+        model: task.model,
      } as WsFrame);

      if (stopping) {
@@ -864,6 +865,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
        type: 'message_complete',
        message_id: assistantId,
        chat_id: chatId,
+        model: task.model,
      } as WsFrame);

      if (stopping) {
@@ -1128,6 +1130,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
        type: 'message_complete',
        message_id: assistantId,
        chat_id: chatId,
+        model: task.model,
      } as WsFrame);

      if (stopping) {
@@ -1385,6 +1388,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
        type: 'message_complete',
        message_id: assistantId,
        chat_id: chatId,
+        model: task.model,
      } as WsFrame);

      if (stopping) {
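
The reason the four `model: task.model` additions above needed the frame type widened to `string | null` can be sketched as follows. The real broker validates with Zod against `WsFrameSchema`; this example uses hand-written validators as a dependency-free stand-in, and `publishFrameSketch`, `isValidStrict`, and `isValidWidened` are hypothetical names.

```typescript
// Hypothetical, dependency-free stand-in for a fail-closed publish path.
interface MessageCompleteFrame {
  type: 'message_complete';
  message_id: string;
  chat_id: string;
  model: string | null;
}

// Stand-in for the schema before the change: model must be a string.
function isValidStrict(frame: MessageCompleteFrame): boolean {
  return typeof frame.model === 'string';
}

// Stand-in for the widened schema: model may be a string or null.
function isValidWidened(frame: MessageCompleteFrame): boolean {
  return typeof frame.model === 'string' || frame.model === null;
}

// Fail-closed: an invalid frame is logged and dropped, never delivered.
function publishFrameSketch(
  frame: MessageCompleteFrame,
  validate: (f: MessageCompleteFrame) => boolean,
  sink: MessageCompleteFrame[],
  log: (msg: string) => void,
): boolean {
  if (!validate(frame)) {
    log(`dropping invalid ${frame.type} frame`);
    return false;
  }
  sink.push(frame);
  return true;
}

const frameWithNullModel: MessageCompleteFrame = {
  type: 'message_complete',
  message_id: 'm1',
  chat_id: 'c1',
  model: null,
};

const delivered: MessageCompleteFrame[] = [];
const dropLog: string[] = [];
const strictOk = publishFrameSketch(frameWithNullModel, isValidStrict, delivered, (m) => dropLog.push(m));
const widenedOk = publishFrameSketch(frameWithNullModel, isValidWidened, delivered, (m) => dropLog.push(m));
```

Because validation applies to the whole frame, a single rejected field loses the completion signal along with it, which is the failure the commit message describes.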
--- a/apps/server/CLAUDE.md
+++ b/apps/server/CLAUDE.md
@@ -0,0 +1,48 @@
+# apps/server — BooChat backend (deep reference)
+
+> Per-app engineering notes for `apps/server/src/`. Cross-cutting commands, database, environment, workflow, and cross-app contracts (WS-frame / provider-type parity, sentinels) live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/server/`.
+
+## Stack
+
+- **Fastify** with `@fastify/websocket` and `@fastify/static` (serves the built frontend).
+- **postgres** (porsager/postgres) with tagged-template SQL — no ORM. Schema in `schema.sql`, applied on startup. LSP may false-positive on `sql<Type[]>\`...\`` generics; CLI `tsc` / `pnpm build` is authoritative.
+- **Zod** for request validation and config parsing.
+
+## Key services
+
+- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn/runInference/createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`, `MAX_STEPS`); `stream-phase.ts` (streamCompletion AI SDK adapter + executeStreamPhase); `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap); `tool-phase.ts` (executeToolPhase → `ToolPhaseResult`; the turn loop lives in turn.ts, not recursion); `sentinel-summaries.ts` (cap-hit/doom-loop/step-cap summaries + inserters); `error-handler.ts` (handleAbortOrError, finalizeCompletion); `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`); `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`); `budget.ts` (resolveToolBudget); `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls); `parts.ts` (`partsFromAssistantMessage`/`partsFromToolMessage`/`insertParts` — parts are the sole source of truth); `prune.ts` (two-tier compaction; `selectPruneTargets` is the pure helper); `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope, reset in `runInference` at the user-message boundary. Outer loop: `while (stepNumber < effectiveCap)`, `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS=200)`. Per-agent `steps:` in AGENTS.md frontmatter; `steps: 0` = text-only. Step-cap hit writes a `cap_hit` sentinel (`CapHitSentinel.tsx` renders it).
+- **AI SDK v6 streamCompletion adapter** (`services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer (executeStreamPhase, finalize, dual-write) is shape-preserved via an adapter. Five gotchas the LSP/tests won't catch:
+  - **Abort signals are swallowed.** `streamText`'s `fullStream` exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required, else the row finalizes `complete` instead of `cancelled`. Don't refactor away the pinning comment.
+  - **Usage lands only at stream end** via `await result.usage` (v6 `inputTokens`/`outputTokens` → mapped to `promptTokens`/`completionTokens`). No mid-stream tok/s; ChatThroughput shows one value at stream end.
+  - **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop — only `description` + `inputSchema: jsonSchema(parameters)`.
+  - **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `provider.ts`. The adapter defaults it false → no `stream_options.include_usage` → llama-swap emits no usage block → `result.usage` resolves `undefined` (NULL token counts). Don't remove during refactor.
+  - **Tool-call-only turns may emit a leading `\n` text-delta.** `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check, else whitespace-only content renders an empty bubble + ActionRow between tool calls. `buildMessagesPayload` also skips `status='failed'` and complete-but-empty assistant rows (avoids "Cannot have 2 or more assistant messages at the end of the list" upstream rejection after cap-hit + Continue).
+- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart`; BooCode's OpenAI-shape history lacks it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` (v6 `ToolResultOutput`). Reasoning emits a `ReasoningPart` first in the content array.
+- **`experimental_repairToolCall`** wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through: logs the bad call, returns it unmodified; `executeToolPhase`'s zod-reject path routes it back to the model next turn.
+- **`chat_status` frame** (via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'`. Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders beside `StatusDot` only when streaming/tool_running, fed by 500ms-throttled `'usage'` frames (`completion_tokens` + `ctx_used` + `ctx_max`). `POST /api/chats/:id/discard_stale` marks a stuck-streaming row `failed` when the frontend's 60s no-token timer gives up.
+- **Stale-streaming sweeps** (`apps/server/src/index.ts`): a boot-time pass after `applySchema()` and a periodic 60s `setInterval` both flip `messages.status='streaming'` older than 5 min to `failed` (publishing `chat_status='idle'`); the interval also runs `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `onClose` hook clears the timer. Recovers from a container restart mid-stream.
+- **`services/broker.ts`** — In-memory pub/sub, two channel types: per-session (message streaming) and per-user (sidebar). No persistence; clients reconnect on restart. Every WS publish goes through `broker.publishFrame(sessionId, frame)` / `publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). Schema duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; `ws-frames.test.ts` enforces parity. Don't add raw `broker.publish()`/`publishUser()` calls.
+- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) pass three guards: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). Web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (falls back to `project.default_web_search_enabled`) and filtered out of the LLM tool schema when false. Truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs (`BOOCODE_TRUNCATION_DIR`, default `/tmp/boocode-truncations`, 0o700) keyed by `tr_<12 base32>`; `view_truncated_output(id)` retrieves it. 5MB cap, 7-day TTL, reaped by the sweeper. Container restart loses retrieval — acceptable.
+- **`services/compaction.ts`** + **`services/model-context.ts`** — Anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself each compaction). Triggered when `chats.needs_compaction` is set after a turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)`. **`ctx_max` comes from `model-context.getModelContext()` fetching `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx`. First inferences after boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model; negative cache TTL 60s, recovers next turn. `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on assistant `content` (OpenAI wire shape has no structured reasoning field); standalone tag when content is empty. `buildHeadPayload` + `OpenAiMessage` exported for tests — keep them exported.
+- **`services/system-prompt.ts`** — `buildSystemPrompt` is the string shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. SHA-256 of the assembled prefix is logged per `buildMessagesPayload` (`prefix-fingerprint`, info); a `Map<sessionId, lastHash>` fires `prefix-drift` (warn) on change with a `changed_inputs` diff. The prefix is byte-stable in steady-state, so prefix caching is left to the input-layer mtime caches (BOOCHAT.md + AGENTS.md global/per-project in `agents.ts:safeStat`).
+- **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (every `ALL_TOOLS` tool is read-only today, so no-agent shares the read-only cap). Per-agent `max_tool_calls` from AGENTS.md overrides.
+- **`messages_with_parts` view** (`schema.sql`). Read sites needing `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` — the legacy `messages.tool_calls`/`tool_results` JSON columns were dropped; the view reads parts-only subselects. Writes target `message_parts` via `insertParts` (or `partsFromAssistantMessage`/`partsFromToolMessage`). The `Message` wire type still carries `tool_calls?`/`tool_results?` because the view synthesizes them. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` (single object), `reasoning_parts jsonb[]` of `{text}`. To UPDATE a message and return its full shape, do a two-step UPDATE returning `id` then SELECT from the view — RETURNING off bare `messages` no longer carries the tool fields. **`messages.model`** (attribution chip) stamps the model per assistant turn — at `finalizeCompletion` (BooChat + native coder) + the dispatcher's assistant-row INSERT (external coder); read via the view + the `message_complete` frame, rendered by `shortenModelName`.
+- **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
+- **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after the first assistant reply.
+- **Provider picker dispatch**: when `provider !== 'boocode'`, the message route creates a `tasks` row (with `session_id` set) instead of calling `inference.enqueue`. The dispatcher (in `apps/coder`) picks it up and dispatches via ACP or PTY using the agent's `install_path`.
+
+Route registration: all routes registered in `index.ts` via `register*Routes(app, sql, ...)`. Routes live in `routes/*.ts`.
+
+## Server conventions
+
+- **New tools** live in their own `services/<name>.ts` (see `web_search.ts`, `web_fetch.ts`) — a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real deps. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')`.
+- **DB/session-aware tools** take an optional 4th `ToolExecCtx { sql, sessionId }` arg on `ToolDef.execute`, plumbed `executeToolPhase`→`executeToolCall`→`execute`. Optional so filesystem tools and the `apps/coder` `ALL_TOOLS` consumer stay compatible; filesystem tools ignore it. `read_tab_by_number` is the reference.
+- **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and closes before the consumer reads, so a later `reader.cancel()` finds the stream closed and the `cancel()` callback never fires. Provide MORE chunks than the test consumes so the source stays 'readable' when cancel runs.
+- Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded (this drift class hit `services/agents.ts` `ALL_TOOL_NAMES` before).
+- Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo (removed to eliminate two-files-must-stay-in-sync drift); the `getAgentsForProject` per-project override mechanism remains for *other* projects.
+- `data/AGENTS.md` is PARSED (`agents.ts` `splitSections`/`parseAgentSection`): each `## <Name>` is one agent and must be followed by a `---` frontmatter fence or the block throws; content before the first `## ` is discarded. Do NOT add free-form `## ` rule sections — they break the registry. Cross-cutting agent rules go in CLAUDE.md or a parser-ignored preamble.
+- MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers. `codecontext/shim.go` is the reference (per the MCP spec, modelcontextprotocol.io/specification/server/transports).
+- **`payload.ts:loadContext` SELECT** must include every `Session` field downstream code reads. The tool phase reads `session.allowed_read_paths`; if the SELECT omits it, cross-repo read grants silently fail. `sql<Session[]>` doesn't enforce column coverage, so the type doesn't catch it.
+- **Sidecar routing** (`services/inference/provider.ts`): `upstreamModel(config, modelId, agent)` routes to `LLAMA_SIDECAR_URL` when the agent has `llama_extra_args`, else `LLAMA_SWAP_URL`. `resolveRoute(agent)` returns `{route, flags}`. Sidecar provider created fresh per call (not cached) because `X-Agent-Flags` varies per agent. Boot-time guard in `index.ts` refuses to start if any agent has `llama_extra_args` but `LLAMA_SIDECAR_URL` is unset.
+- **Secret guard safe patterns** (`services/secret_guard.ts`): `.env.example`, `.env.sample`, `.env.template`, `.env.defaults` are allowlisted via `SAFE_PATTERNS`. Do NOT add `.env.production`/`.env.development`/`.env.test` — those can hold real secrets.
+- **llama-sidecar** (`/opt/forks/llama-sidecar/`): Go daemon for a per-agent llama-server process pool (routed to via "Sidecar routing" above). Cross-compile: `GOOS=windows GOARCH=amd64 /snap/go/current/bin/go build -o bin/llama-sidecar.exe ./cmd/llama-sidecar`. Gitea: `indifferentketchup/llama-sidecar`. Windows child-process gotchas: `context.Background()` for child lifetime (not request ctx), `os.Open(os.DevNull)` for stdin, `os.Pipe()` for stdout with a drain goroutine, `DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP` flags. SSH to sam-desktop: `ssh samki@100.101.41.16`; use `schtasks` for persistent spawning (SSH `start /B` doesn't survive session close).
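
The forward scan mentioned under "AI SDK ModelMessage conversion" in `apps/server/CLAUDE.md` above can be illustrated with a short sketch. It assumes the standard OpenAI chat message shape; the type `OpenAiMessageSketch`, the helpers `buildToolNameMap` and `toolNameFor`, and the `'unknown_tool'` fallback are hypothetical and not taken from `stream-phase.ts`.

```typescript
// Hypothetical types mirroring the OpenAI chat message shape.
interface OpenAiToolCall {
  id: string;
  function: { name: string; arguments: string };
}
interface OpenAiMessageSketch {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  tool_calls?: OpenAiToolCall[];
  tool_call_id?: string;
}

// Forward scan: collect tool_call_id -> tool name from assistant tool_calls.
function buildToolNameMap(history: OpenAiMessageSketch[]): Map<string, string> {
  const names = new Map<string, string>();
  for (const message of history) {
    if (message.role === 'assistant' && message.tool_calls) {
      for (const call of message.tool_calls) {
        names.set(call.id, call.function.name);
      }
    }
  }
  return names;
}

// Look up the name for a tool message; the fallback value is an assumption.
function toolNameFor(message: OpenAiMessageSketch, names: Map<string, string>): string {
  const id = message.tool_call_id;
  return (id !== undefined ? names.get(id) : undefined) ?? 'unknown_tool';
}

const history: OpenAiMessageSketch[] = [
  { role: 'user', content: 'show me the file' },
  {
    role: 'assistant',
    content: null,
    tool_calls: [{ id: 'call_1', function: { name: 'view_file', arguments: '{"path":"a.ts"}' } }],
  },
  { role: 'tool', content: 'file contents', tool_call_id: 'call_1' },
];

const toolNames = buildToolNameMap(history);
const resolvedToolName = toolNameFor(history[2]!, toolNames);
```

A single pass over the history suffices because a tool message always follows the assistant message that issued the call.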
--- a/apps/server/src/services/tests/mcp-config.test.ts
+++ b/apps/server/src/services/tests/mcp-config.test.ts
@@ -0,0 +1,93 @@
+/**
+ * Unit tests for `{env:VAR}` substitution in the MCP config loader.
+ * Pure — no live MCP server. Verifies secrets resolve from process.env
+ * (so real keys live in `.env`, not the gitignored config file).
+ */
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import { substituteEnvVars } from '../mcp-config.js';
+
+// Minimal FastifyBaseLogger stub — only .warn is exercised here.
+function fakeLog() {
+  const warnings: string[] = [];
+  const log = {
+    warn: (msg: unknown) => {
+      warnings.push(typeof msg === 'string' ? msg : JSON.stringify(msg));
+    },
+  };
+  return { log: log as never, warnings };
+}
+
+describe('substituteEnvVars', () => {
+  const SAVED = process.env.MCP_TEST_SECRET;
+  beforeEach(() => {
+    process.env.MCP_TEST_SECRET = 'resolved-value';
+  });
+  afterEach(() => {
+    if (SAVED === undefined) delete process.env.MCP_TEST_SECRET;
+    else process.env.MCP_TEST_SECRET = SAVED;
+    delete process.env.MCP_TEST_MISSING;
+  });
+
+  it('replaces a {env:VAR} reference in a string value', () => {
+    const { log } = fakeLog();
+    expect(substituteEnvVars('{env:MCP_TEST_SECRET}', log)).toBe('resolved-value');
+  });
+
+  it('substitutes inside nested objects and arrays', () => {
+    const { log } = fakeLog();
+    const out = substituteEnvVars(
+      {
+        headers: { CONTEXT7_API_KEY: '{env:MCP_TEST_SECRET}' },
+        args: ['--token', '{env:MCP_TEST_SECRET}'],
+      },
+      log,
+    );
+    expect(out).toEqual({
+      headers: { CONTEXT7_API_KEY: 'resolved-value' },
+      args: ['--token', 'resolved-value'],
+    });
+  });
+
+  it('leaves object keys untouched, only transforms values', () => {
+    const { log } = fakeLog();
+    const out = substituteEnvVars({ '{env:MCP_TEST_SECRET}': 'literal' }, log) as Record<string, string>;
+    expect(Object.keys(out)).toEqual(['{env:MCP_TEST_SECRET}']);
+  });
+
+  it('resolves an unset var to empty string and warns', () => {
+    const { log, warnings } = fakeLog();
+    expect(substituteEnvVars('{env:MCP_TEST_MISSING}', log)).toBe('');
+    expect(warnings.some((w) => w.includes('MCP_TEST_MISSING'))).toBe(true);
+  });
+
+  it('passes non-string scalars through unchanged', () => {
+    const { log } = fakeLog();
+    expect(substituteEnvVars(true, log)).toBe(true);
+    expect(substituteEnvVars(42, log)).toBe(42);
+    expect(substituteEnvVars(null, log)).toBe(null);
+  });
+
+  it('leaves strings without a reference unchanged', () => {
+    const { log } = fakeLog();
+    expect(substituteEnvVars('https://mcp.context7.com/mcp', log)).toBe('https://mcp.context7.com/mcp');
+  });
+
+  it('resolves multiple references in one string (global flag)', () => {
+    const { log } = fakeLog();
+    expect(substituteEnvVars('{env:MCP_TEST_SECRET}/{env:MCP_TEST_SECRET}', log)).toBe(
+      'resolved-value/resolved-value',
+    );
+  });
+
+  it('passes an empty string through unchanged', () => {
+    const { log } = fakeLog();
+    expect(substituteEnvVars('', log)).toBe('');
+  });
+
+  it('collects unset var names into the optional collector set', () => {
+    const { log } = fakeLog();
+    const unset = new Set<string>();
+    substituteEnvVars({ url: '{env:MCP_TEST_MISSING}', headers: { k: '{env:MCP_TEST_SECRET}' } }, log, unset);
+    expect([...unset]).toEqual(['MCP_TEST_MISSING']);
+  });
+});
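The behaviour these tests pin down, including why an unset variable in a `url` field matters, can be shown in a standalone sketch. It is not the loader itself: `resolveRefs` takes the environment as a plain object instead of reading `process.env`, it omits logging, and `isStrictUrl` is a hypothetical stand-in for the strict Zod check on `url`.

```typescript
// Illustrative stand-ins: env is passed in, and isStrictUrl substitutes
// for the schema's strict url validation.
const ENV_REF = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;

function resolveRefs(
  value: unknown,
  env: Record<string, string | undefined>,
  unset: Set<string>,
): unknown {
  if (typeof value === 'string') {
    return value.replace(ENV_REF, (_match: string, name: string) => {
      const found = env[name];
      if (found === undefined) {
        unset.add(name); // remember the name so a later error can cite it
        return '';
      }
      return found;
    });
  }
  if (Array.isArray(value)) return value.map((v) => resolveRefs(v, env, unset));
  if (value && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    // Keys are copied as-is; only values are transformed.
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = resolveRefs(v, env, unset);
    }
    return out;
  }
  return value;
}

function isStrictUrl(s: unknown): boolean {
  return typeof s === 'string' && /^https?:\/\/\S+$/.test(s);
}

const unset = new Set<string>();
const resolved = resolveRefs(
  { url: '{env:MCP_URL}', headers: { KEY: '{env:API_KEY}' } },
  { API_KEY: 'secret' },
  unset,
) as { url: string; headers: { KEY: string } };
```

Here `resolved.url` becomes an empty string and fails `isStrictUrl`, while `unset` holds `MCP_URL`, which is the name a validation-failure message would want to report.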
--- a/apps/server/src/services/tests/ws-frames.test.ts
+++ b/apps/server/src/services/tests/ws-frames.test.ts
@@ -111,6 +111,19 @@ describe('WsFrameSchema (v1.13.11-a)', () => {
    expect(result.success).toBe(true);
  });

+  it('accepts a message_complete frame with a null model (external coder, no model selected)', () => {
+    // Regression guard: the dispatcher publishes `model: task.model` (string |
+    // null). When null, this MUST validate or publishFrame fail-closes and drops
+    // the whole frame, incl. the status:'complete' transition.
+    const result = WsFrameSchema.safeParse({
+      type: 'message_complete',
+      message_id: VALID_UUID_A,
+      chat_id: VALID_UUID_B,
+      model: null,
+    });
+    expect(result.success).toBe(true);
+  });
+
  it('every KNOWN_FRAME_TYPES entry has a discriminated branch', () => {
    // Probe each known type by attempting a minimal valid construction.
    // Failure here means the union and the KNOWN_FRAME_TYPES list drifted.
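The regression guarded by the null-model test can be reproduced without Zod. The sketch below is a hand-rolled approximation: `publish`, `acceptsNullable`, and `acceptsStringOnly` are hypothetical names, and the real `publishFrame` validates the whole frame through `WsFrameSchema` rather than a single field.

```typescript
// Hypothetical, simplified frame shape for illustration.
interface CompleteFrame {
  type: 'message_complete';
  message_id: string;
  chat_id: string;
  model?: string | null;
}

// Approximates z.string().nullable().optional(): string, null, or absent.
function acceptsNullable(model: unknown): boolean {
  return model === undefined || model === null || typeof model === 'string';
}

// Approximates the previous z.string().optional(): null is rejected.
function acceptsStringOnly(model: unknown): boolean {
  return model === undefined || typeof model === 'string';
}

// Fail-closed publish: an invalid frame is dropped whole, so the
// completion signal it carries is lost along with the bad field.
function publish(
  frame: CompleteFrame,
  validate: (model: unknown) => boolean,
  sink: CompleteFrame[],
): boolean {
  if (!validate(frame.model)) return false;
  sink.push(frame);
  return true;
}
```

With `model: null`, the string-only validator drops the frame and the sink stays empty, while the nullable validator lets it through.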
--- a/apps/server/src/services/mcp-config.ts
+++ b/apps/server/src/services/mcp-config.ts
@@ -4,6 +4,12 @@
 * Reads a JSON config file (default `/data/mcp.json`) that declares MCP
 * servers — their transport type, connection parameters, and enabled state.
 * Schema shape matches opencode's `mcpServers` key for copy-paste compat.
+ *
+ * Secrets stay out of the config file via `{env:VAR}` substitution
+ * (opencode-compatible). Any string value can reference an environment
+ * variable, e.g. a header `"CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"`
+ * resolves from `process.env` at load. This keeps real keys in `.env`
+ * (`env_file` in docker-compose) rather than the gitignored config.
 */
 import { readFileSync } from 'node:fs';
 import { z } from 'zod';
@@ -38,6 +44,49 @@ export interface McpServerEntry {
  config: McpServerConfig;
 }

+// ---- Env-var substitution ----
+
+const ENV_VAR_PATTERN = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;
+
+/**
+ * Recursively replace `{env:VAR}` references in string values with the
+ * matching environment variable (opencode-compatible). Runs before Zod
+ * validation so a field written as `{env:...}` (e.g. a URL) is checked in its resolved form.
+ * An unset var resolves to '' and logs a warning so a missing secret is
+ * visible in the boot log rather than silently sending a literal placeholder.
+ * Pass an optional `unsetVars` set to collect the names that resolved to '';
+ * the loader surfaces them on a validation failure (an empty value in a strict
+ * url/command field invalidates the whole config — see loadMcpConfig).
+ */
+export function substituteEnvVars(
+  value: unknown,
+  log: FastifyBaseLogger,
+  unsetVars?: Set<string>,
+): unknown {
+  if (typeof value === 'string') {
+    return value.replace(ENV_VAR_PATTERN, (_match, name: string) => {
+      const resolved = process.env[name];
+      if (resolved === undefined) {
+        unsetVars?.add(name);
+        log.warn(`mcp: env var ${name} referenced in config is unset; substituting empty string`);
+        return '';
+      }
+      return resolved;
+    });
+  }
+  if (Array.isArray(value)) {
+    return value.map((v) => substituteEnvVars(v, log, unsetVars));
+  }
+  if (value && typeof value === 'object') {
+    const out: Record<string, unknown> = {};
+    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
+      out[k] = substituteEnvVars(v, log, unsetVars);
+    }
+    return out;
+  }
+  return value;
+}
+
 // ---- Loader ----

 /**
@@ -61,9 +110,19 @@ export function loadMcpConfig(configPath: string, log: FastifyBaseLogger): McpSe
    return [];
  }

-  const result = McpConfigSchema.safeParse(json);
+  const unsetVars = new Set<string>();
+  const result = McpConfigSchema.safeParse(substituteEnvVars(json, log, unsetVars));
  if (!result.success) {
-    log.warn({ errors: result.error.flatten().fieldErrors }, `mcp: invalid config at ${configPath}`);
+    // Connect the two otherwise-disconnected warnings: an unset {env:VAR} that
+    // resolved to '' can invalidate a strict field (url/command) and drop the
+    // whole config, so name the unset vars alongside the validation errors.
+    const hint = unsetVars.size
+      ? ` — ${unsetVars.size} referenced env var(s) unset & substituted with '' (${[...unsetVars].join(', ')}); an unset {env:VAR} in a url/command field invalidates the whole config`
+      : '';
+    log.warn(
+      { errors: result.error.flatten().fieldErrors, unsetEnvVars: [...unsetVars] },
+      `mcp: invalid config at ${configPath}${hint}`,
+    );
    return [];
  }

--- a/apps/server/src/types/ws-frames.ts
+++ b/apps/server/src/types/ws-frames.ts
@@ -124,7 +124,11 @@ export const MessageCompleteFrame = z.object({
  ctx_max: z.number().int().positive().nullable().optional(),
  started_at: IsoTimestamp.nullable().optional(),
  finished_at: IsoTimestamp.nullable().optional(),
-  model: z.string().optional(),
+  // nullable: external-coder turns carry task.model, which is null when no
+  // model was selected. This frame is published through the same fail-closed
+  // publishFrame, so null MUST validate or the entire frame (incl. the
+  // status:'complete' transition) is dropped.
+  model: z.string().nullable().optional(),
  metadata: OpaqueObject.nullable().optional(),
 });

--- a/apps/web/CLAUDE.md
+++ b/apps/web/CLAUDE.md
@@ -0,0 +1,47 @@
+# apps/web — BooChat frontend (deep reference)
+
+> Per-app engineering notes for `apps/web/src/`. The frontend is a single React SPA that also hosts the BooCoder `'coder'` pane. Cross-cutting commands, database, environment, workflow, and cross-app contracts (WS-frame / provider-type parity, sentinels) live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/web/`.
+
+## Stack
+
+- **React 18** + React Router v6 + **Tailwind v4** + shadcn/radix-ui primitives.
+- **Shiki** for syntax highlighting (async `codeToHtml` in `CodeBlock.tsx` and `FileViewer` in `FileBrowserPane.tsx`).
+- Path alias: `@/` maps to `src/`.
+- **Mobile interaction primitives**: `useViewport` (matchMedia; mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, synthetic `contextmenu` on `[data-tab-id]`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Tap-target convention: `max-md:min-h-[44px] max-md:min-w-[44px]`. Mobile headers: `border-b px-3 sm:px-4 py-2` + `paddingTop: 'max(0.5rem, env(safe-area-inset-top))'`. Hamburger left, FolderTree right.
+
+## Key patterns
+
+- **`hooks/sessionEvents.ts`** — Module-singleton event bus (Set of listeners) for cross-component communication: session renames, file-open, attachment dispatch. 26-arm discriminated union (and growing). Adding an event type also requires a `case` in the `applyEvent` switch in `useSidebar.ts` (no-op `return prev` is fine), and a subscribe in any hook that needs it (e.g. `useSessionStream` for `refetch_messages`).
+- **`hooks/useSessionStream.ts`** — WebSocket per session; `applyFrame` reducer builds the message list from streaming frames.
+- **`hooks/useUserEvents.ts`** — Single app-level WS to `/api/ws/user` with exponential-backoff reconnect. Forwards frames onto the sessionEvents bus.
+- **`hooks/useSidebar.ts`** — Module-singleton with `Set<setState>` subscriber pattern; one bus subscription guarded by `globalThis.__boocode_sidebar_subscribed` for HMR safety. Every new `SessionEvent` type needs a `case` in `applyEvent`.
+- **`api/client.ts`** — Centralized typed fetch wrapper. All endpoints under `api.*`.
+
+## Font / CSS pipeline
+
+- Tailwind v4's `@import "tailwindcss"` strips font URLs from subsequent CSS `@import`s — `@fontsource*` packages must be JS side-effect imports in `apps/web/src/main.tsx`, not `@import` in `globals.css`, or the woff2 files never reach `dist/`.
+- Lightning CSS (inside `@tailwindcss/postcss` v4) collapses contiguous unicode-ranges to wildcard shorthand (`U+0000-FFFF` → `U+????`), which iOS Safari/Vivaldi mishandles (silently drops the font for those codepoints). Use explicit non-collapsible subranges (e.g. `U+2500-259F`, not `U+2500-25FF`). The `apps/web` build script greps `dist/assets/*.css` for `U+2500-259F` and fails the build if missing — preserve that guard.
+- `@font-face` blocks must live AFTER all `@import` statements (CSS spec). Earlier placement silently breaks every subsequent `@import`.
+- JetBrainsMono Nerd Font self-hosted in `apps/web/src/fonts/` (TTF from ryanoasis/nerd-fonts) — `@fontsource-variable/jetbrains-mono` ships subsetted woff2s that don't cover `U+2500-259F` (box drawing/block elements, used by opencode's banner). "NL" = No Ligatures (matches `font-feature-settings: "liga" 0`); "Mono" = single-cell icon width so TUI layouts don't desync.
+- xterm-addon-webgl rasterizes glyphs via Canvas2D into a GPU atlas; Canvas2D does NOT honor `font-display: block` — it uses whatever font is registered. Gate xterm init on `document.fonts.load(<font-name>)` resolving before `term.open()` (`fontsReady` in `TerminalPane.tsx`). iOS Safari/Vivaldi also reclaim WebGL contexts from backgrounded tabs: keep `webgl.onContextLoss(() => webgl.dispose())` + recreate via visibilitychange. Do NOT manually dispose+recreate the addon after font load — iOS silently fails the second GL context creation and drops to DOM renderer with stale metrics.
+
+## Multi-pane workspace
+
+Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). Pane state lives in `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string (legacy localStorage key seeded once on first hydrate, then no longer written). `validatePanes(validChatIds)` prunes panes referencing deleted chats. Each chat lives in at most one pane; the per-pane tab strip tracks `chatIds[]` + `activeChatIdx`, reorder via native HTML5 drag. `workspace_panes` is a `WorkspaceState` envelope `{panes, tabNumbers, nextTabNumber, closedPaneStack}` (tabNumbers = stable session-scoped chatId→number, never reused; closedPaneStack = reopen LIFO, max 10, persisted); hydrate (`toWorkspaceState`) and the server PATCH validator (`z.union([array, envelope])`) both accept the legacy bare array and normalize. Closing a chat pane relocates its tabs to the oldest chat/empty pane; `reopenPane` strips restored chatIds from all live panes first. `read_tab_by_number` resolves number→chatId through `tabNumbers`.
+
+## Frontend conventions
+
+- Use `overflowWrap`, not `wordWrap`: TypeScript's `CSSStyleDeclaration` marks `wordWrap` deprecated (diagnostic 6385).
+- shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive.
+- `ui/` primitives present: button, card, context-menu, dialog, dropdown-menu, input, label, radio-group, sonner, textarea. No switch/sheet/drawer/badge/checkbox — use a `<button role="switch" aria-checked>` toggle (a hand-rolled `Switch` lives in `SettingsPane.tsx`) and a Dialog-based panel for "drawers".
+- `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension→language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names.
+- Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles.
+- `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
+- Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` does this; `addSplitPane` returns the new pane id.
+- A scrollable list inside a Dialog on mobile: cap `DialogContent` (`max-h-[85vh]` + `grid-rows-[auto_minmax(0,1fr)_auto]`) and make the list the single scroll region with `overscroll-contain` — otherwise touch-scroll drags the whole fixed modal / chains to the page.
+- xterm.js v5 uses canvas rendering — the browser doesn't see xterm's selection, so the native right-click Copy doesn't work for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path.
+- React **StrictMode is on** (`main.tsx`): an updater passed to one `setState` that itself calls another `setState` (e.g. `setClosedPaneStack` inside a `setPanes` updater) is double-invoked in dev. Make such nested updates idempotent — `useWorkspacePanes`'s `appendClosed` dedupes a value-identical top entry for this reason.
+- **CoderPane uses ChatInput** (`components/panes/CoderPane.tsx`): shares BooChat's `ChatInput` for full parity — attachments, paste-to-chip, auto-grow textarea, queued messages during send. `sendOneMessage` is the send callback; queued messages drain via `useEffect` when `sending` goes false.
+- **AgentComposerBar filters `e.installed`**: provider snapshot entries with `installed:false` (loading/unavailable) are dropped from the dropdown. `getProviderSnapshot` must await the full build — returning synchronous `loading` placeholders makes every provider vanish; surfacing loading states needs a client poll.
+- **Pane header architecture (mobile vs desktop)**: desktop coder pane header (BooCode label + [+] [×]) lives in `Workspace.tsx` gated by `isCoder && !isMobile`. Mobile coder controls (● ×) live in `Session.tsx` next to `MobileTabSwitcher`/`NewPaneMenu`. `AgentComposerBar` (provider/mode/model pickers) renders inside `CoderPane.tsx` on both; the ● status dot is passed via `connected` prop.
+- **MessageBubble shared between BooChat and BooCoder** (`components/MessageBubble.tsx`): optional `actions?: MessageActions` + `hideActions?` props; CoderPane overrides via `CoderMessageList`. **`CoderMessageList` passes `CoderMessageWire as unknown as Message`** — the coder shape lacks `metadata`/`kind`/`summary`, so they're `undefined` (not `null`). Null-guards on any `Message` field MUST use loose `!= null`, not `!== null` (`undefined !== null` is `true` → `.kind` throws → blank-screen crash). The cast hides this from tsc; build passes while runtime crashes.
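The null-guard pitfall in the MessageBubble note can be reproduced in isolation. The shapes below are hypothetical and much smaller than the real `Message` and `CoderMessageWire` types; only the double cast and the two comparison styles mirror the note.

```typescript
// Hypothetical, reduced shape; the real Message type has more fields.
interface Message {
  content: string;
  metadata: { kind: string } | null;
}

// The double cast hides the missing field from the compiler, so
// metadata is undefined at runtime even though the type says null-or-object.
const cast = { content: 'hi' } as unknown as Message;

// Strict guard: undefined !== null is true, so the property access runs
// on undefined and throws a TypeError.
function kindStrict(m: Message): string {
  return m.metadata !== null ? m.metadata.kind : 'none';
}

// Loose guard: != null rejects both null and undefined.
function kindLoose(m: Message): string {
  return m.metadata != null ? m.metadata.kind : 'none';
}
```

Both functions type-check identically, which is why the build passes while only the loose guard survives a wire object that omits the field.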
--- a/apps/web/src/api/ws-frames.ts
+++ b/apps/web/src/api/ws-frames.ts
@@ -124,7 +124,11 @@ export const MessageCompleteFrame = z.object({
  ctx_max: z.number().int().positive().nullable().optional(),
  started_at: IsoTimestamp.nullable().optional(),
  finished_at: IsoTimestamp.nullable().optional(),
-  model: z.string().optional(),
+  // nullable: external-coder turns carry task.model, which is null when no
+  // model was selected. This frame is published through the same fail-closed
+  // publishFrame, so null MUST validate or the entire frame (incl. the
+  // status:'complete' transition) is dropped.
+  model: z.string().nullable().optional(),
  metadata: OpaqueObject.nullable().optional(),
 });

--- a/apps/web/src/components/AgentComposerBar.tsx
+++ b/apps/web/src/components/AgentComposerBar.tsx
@@ -5,7 +5,6 @@ import type { AgentSessionConfig, ProviderSnapshotEntry, AgentCommand } from '@/
 import { useProviderSnapshot, refreshProviderSnapshot } from '@/hooks/useProviderSnapshot';
 import type { AgentStatusEntry } from '@/hooks/useAgentStatus';
 import { providerIcon } from '@/components/coder/providerIcons';
-import { useAgentSessions } from '@/hooks/useAgentSessions';
 import {
  DropdownMenu,
  DropdownMenuContent,
@@ -174,16 +173,6 @@ interface Props {
  onChange: (next: AgentSessionConfig) => void;
  onProviderCommandsChange?: (commands: AgentCommand[]) => void;
  connected?: boolean;
-  // v2.6 Phase 1-UX §9b: chat id for the resumed/new-session chip. Optional so
-  // BooChat and any other AgentComposerBar caller renders no chip and is
-  // otherwise unaffected. When present + connected + the chat has ≥1 prior
-  // turn, a chip right of the Provider picker reports whether switching to the
-  // current provider resumes an agent session, replays history (boocode), or
-  // starts fresh.
-  sessionId?: string;
-  // True once the chat has at least one prior turn — gates the chip so it stays
-  // hidden on a brand-new chat. Defaults to false (no chip).
-  hasPriorTurn?: boolean;
  // #10: normalized status (working|blocked|idle|error) for the active external
  // agent in this chat, or null for native boocode / before any frame. Renders
  // a status dot DISTINCT from the WS-liveness `connected` dot. Undefined for
@@ -191,31 +180,6 @@ interface Props {
  agentStatus?: AgentStatusEntry | null;
 }

-// Condensed token count: 950 → "950", 12_400 → "12.4K", 3_200_000 → "3.2M".
-// Sub-1000 stays exact; thousands/millions get one decimal, trailing .0 trimmed.
-function abbrevTokens(n: number): string {
-  if (!Number.isFinite(n) || n < 1000) return String(Math.max(0, Math.round(n)));
-  if (n < 1_000_000) return `${(n / 1000).toFixed(1).replace(/\.0$/, '')}K`;
-  return `${(n / 1_000_000).toFixed(1).replace(/\.0$/, '')}M`;
-}
-
-// Relative-time formatter for the resumed-chip title (e.g. "3m ago").
-function relativeTime(iso: string | null): string {
-  if (!iso) return 'unknown';
-  const then = new Date(iso).getTime();
-  if (Number.isNaN(then)) return 'unknown';
-  const diffMs = Date.now() - then;
-  if (diffMs < 0) return 'just now';
-  const sec = Math.floor(diffMs / 1000);
-  if (sec < 60) return 'just now';
-  const min = Math.floor(sec / 60);
-  if (min < 60) return `${min}m ago`;
-  const hr = Math.floor(min / 60);
-  if (hr < 24) return `${hr}h ago`;
-  const day = Math.floor(hr / 24);
-  return `${day}d ago`;
-}
-
 // #10: normalized external-agent status dot. Mirrors StatusDot's visual
 // language but on the four normalized buckets (working|blocked|idle|error),
 // and is DISTINCT from the WS-liveness `connected` dot beside it:
@@ -251,7 +215,7 @@ function AgentStatusDot({ entry, agent }: { entry: AgentStatusEntry; agent: stri
  );
 }

-export function AgentComposerBar({ projectPath, value, onChange, onProviderCommandsChange, connected, sessionId, hasPriorTurn, agentStatus }: Props) {
+export function AgentComposerBar({ projectPath, value, onChange, onProviderCommandsChange, connected, agentStatus }: Props) {
  const allEntries = useProviderSnapshot(projectPath);
  // 5.5 — the composer picker only offers ENABLED providers that are ready (or
  // still loading). Disabled (enabled:false) and unavailable/error providers are
@@ -263,13 +227,6 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
  );
  const [refreshing, setRefreshing] = useState(false);

-  // v2.6 Phase 1-UX §9b: chat-scoped agent-session rows for the resumed/new
-  // chip. Hook is unconditional (hooks rule); it self-no-ops when sessionId is
-  // undefined or the chat has no prior turn, so BooChat callers cost nothing.
-  const { sessions: agentSessions } = useAgentSessions(
-    sessionId && hasPriorTurn ? sessionId : undefined,
-  );
-
  const hydratedRef = useRef(false);

  useEffect(() => {
@@ -383,42 +340,8 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
  const modelOptions = (currentEntry?.models ?? []).map((m) => ({ id: m.id, label: m.label }));
  const thinkingOpts = thinkingOptions.map((t) => ({ id: t.id, label: t.label }));

-  // v2.6 Phase 1-UX §9b: resumed / history / new-session chip. Only meaningful
-  // when this is a real chat (sessionId), the WS is connected, and the chat has
-  // ≥1 prior turn — otherwise render nothing so fresh chats and non-coder
-  // callers stay clean.
-  const sessionRow = agentSessions.find((s) => s.agent === value.provider);
-  const sessionChip: { label: string; title: string } | null =
-    sessionId && hasPriorTurn && connected
-      ? value.provider === 'boocode'
-        ? // Native boocode never holds an agent_sessions row — it reconstructs
-          // the conversation from the chat transcript each turn.
-          { label: 'history', title: 'BooCode replays the chat transcript each turn' }
-        : sessionRow?.has_session
-          ? {
-              label: 'resumed',
-              title: `Resuming ${value.provider} · last active ${relativeTime(sessionRow.last_active_at)}`,
-            }
-          : { label: 'new session', title: `${value.provider} starts a fresh session this turn` }
-      : null;
-
-  // sampling-streamjson-tokens #8: condensed per-(chat,agent) token/cost readout
-  // beside the session chip. Coerce — input/output are BIGINT (string over wire).
-  // Hidden when no session row or all totals are zero (e.g. native boocode, which
-  // holds no agent_sessions row, or a provider that hasn't run yet).
-  const usageReadout = (() => {
-    if (!sessionChip || !sessionRow) return null;
-    const inTok = Number(sessionRow.input_tokens) || 0;
-    const outTok = Number(sessionRow.output_tokens) || 0;
-    const cost = Number(sessionRow.cost) || 0;
-    if (inTok <= 0 && outTok <= 0 && cost <= 0) return null;
-    const parts = [`${abbrevTokens(inTok)} in`, `${abbrevTokens(outTok)} out`];
-    if (cost > 0) parts.push(`$${cost.toFixed(2)}`);
-    return parts.join(' · ');
-  })();
-
  return (
-    <div className="flex flex-wrap items-center gap-1 px-2 py-1 border-b border-border bg-muted/20 shrink-0">
+    <div className="flex items-center gap-1 px-2 py-1 border-b border-border bg-muted/20 shrink-0">
      <CompactPicker
        label="Provider"
        value={value.provider}
@@ -430,22 +353,6 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
            : providerIcon(value.provider)
        }
      />
-      {sessionChip && (
-        <span
-          title={sessionChip.title}
-          className="inline-flex items-center rounded-full border border-border bg-muted/40 px-1.5 py-0.5 text-[10px] font-medium text-muted-foreground shrink-0"
-        >
-          {sessionChip.label}
-        </span>
-      )}
-      {usageReadout && (
-        <span
-          className="text-[10px] text-muted-foreground tabular-nums whitespace-nowrap shrink-0"
-          title="Tokens in · out · cost for this agent session"
-        >
-          {usageReadout}
-        </span>
-      )}
      <CompactPicker
        label="Mode"
        value={value.modeId ?? ''}
@@ -472,8 +379,7 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
          icon={<Brain className="size-3 shrink-0" />}
        />
      )}
-      {/* Status dot + refresh as one right-aligned unit so the refresh button
-          stays on the top line instead of wrapping past the edge-pinned dot. */}
+      {/* Status dot + refresh — pinned right (ml-auto), never on its own line. */}
      <div className="ml-auto flex items-center gap-1 shrink-0">
        {/* #10: normalized agent status — only for an external agent with a
            live status frame. Distinct from the WS-liveness dot that follows. */}
--- a/apps/web/src/components/panes/CoderMessageList.tsx
+++ b/apps/web/src/components/panes/CoderMessageList.tsx
@@ -11,6 +11,7 @@ export interface CoderMessageWire {
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'streaming' | 'complete' | 'failed';
+  model?: string | null;
  reasoning_text?: string;
  tool_calls?: CoderToolCallWire[];
 }
--- a/apps/web/src/components/panes/CoderPane.tsx
+++ b/apps/web/src/components/panes/CoderPane.tsx
@@ -30,6 +30,8 @@ interface CoderMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
  status?: 'streaming' | 'complete' | 'failed';
+  // model-attribution: which model produced this assistant message (chip).
+  model?: string | null;
  reasoning_text?: string;
  tool_calls?: Array<{
    id: string;
@@ -52,6 +54,46 @@ interface CoderToolMessage {

 type CoderTimelineMessage = CoderMessage | CoderToolMessage;

+// Per-chat agent-config cache (provider/model/mode/thinking). Keyed by chat id
+// so reopening or switching back to a chat restores the model that was loaded
+// last there. Per-device (localStorage) — a UI convenience, not authoritative.
+const DEFAULT_AGENT_CONFIG: AgentSessionConfig = {
+  provider: 'boocode',
+  model: '',
+  modeId: null,
+  thinkingOptionId: null,
+};
+function agentConfigKey(chatId: string | undefined): string | null {
+  return chatId ? `boocode.coder.config.${chatId}` : null;
+}
+function readCachedAgentConfig(chatId: string | undefined): AgentSessionConfig | null {
+  const key = agentConfigKey(chatId);
+  if (!key || typeof localStorage === 'undefined') return null;
+  try {
+    const raw = localStorage.getItem(key);
+    if (!raw) return null;
+    const c = JSON.parse(raw) as Partial<AgentSessionConfig>;
+    if (typeof c?.provider !== 'string') return null;
+    return {
+      provider: c.provider,
+      model: typeof c.model === 'string' ? c.model : '',
+      modeId: c.modeId ?? null,
+      thinkingOptionId: c.thinkingOptionId ?? null,
+    };
+  } catch {
+    return null;
+  }
+}
+function writeCachedAgentConfig(chatId: string | undefined, config: AgentSessionConfig): void {
+  const key = agentConfigKey(chatId);
+  if (!key || typeof localStorage === 'undefined') return;
+  try {
+    localStorage.setItem(key, JSON.stringify(config));
+  } catch {
+    /* quota / disabled storage — non-fatal */
+  }
+}
+
 interface PendingChange {
  id: string;
  file_path: string;
@@ -97,6 +139,7 @@ type RawCoderMessage = {
  chat_id?: string;
  content?: string | null;
  status?: string | null;
+  model?: string | null;
  reasoning_text?: string;
  reasoning_parts?: Array<{ text?: string }> | null;
  tool_results?: {
@@ -144,6 +187,7 @@ function mapCoderTimelineRow(raw: RawCoderMessage): CoderTimelineMessage | null
    role: raw.role as CoderMessage['role'],
    content: raw.content ?? '',
    status: (raw.status ?? 'complete') as CoderMessage['status'],
+    ...(raw.model ? { model: raw.model } : {}),
    ...(reasoning_text ? { reasoning_text } : {}),
    ...(tool_calls?.length ? { tool_calls } : {}),
    ctx_used: raw.ctx_used ?? null,
@@ -253,6 +297,7 @@ function useCoderMessages(sessionId: string, chatId: string | undefined, handler
                ? {
                    ...m,
                    status: 'complete' as const,
+                    model: (frame as any).model ?? (m as any).model ?? null,
                    ctx_used: (frame as any).ctx_used ?? (m as any).ctx_used ?? null,
                    ctx_max: (frame as any).ctx_max ?? (m as any).ctx_max ?? null,
                  }
@@ -586,12 +631,37 @@ export function CoderPane({
  onConnectedChange,
  onAgentLabelChange,
 }: Props) {
-  const [agentConfig, setAgentConfig] = useState<AgentSessionConfig>({
-    provider: 'boocode',
-    model: '',
-    modeId: null,
-    thinkingOptionId: null,
-  });
+  const [agentConfig, setAgentConfig] = useState<AgentSessionConfig>(
+    () => readCachedAgentConfig(chatId) ?? DEFAULT_AGENT_CONFIG,
+  );
+  // Restore the per-chat cached config when the chat changes. The ref guard
+  // skips the initial mount (lazy init already loaded it) + StrictMode double-runs.
+  const lastLoadedChatRef = useRef<string | undefined>(chatId);
+  useEffect(() => {
+    const prev = lastLoadedChatRef.current;
+    if (prev === chatId) return;
+    lastLoadedChatRef.current = chatId;
+    // undefined → real id: the pane just resolved its chat. A selection made
+    // while chatId was undefined could not be persisted (the key was null), so
+    // carry the current in-memory config into the new chat — and persist it —
+    // rather than clobbering the user's pick with DEFAULT on the cache miss.
+    if (prev === undefined && chatId) {
+      const cached = readCachedAgentConfig(chatId);
+      if (cached) setAgentConfig(cached);
+      else writeCachedAgentConfig(chatId, agentConfig);
+      return;
+    }
+    setAgentConfig(readCachedAgentConfig(chatId) ?? DEFAULT_AGENT_CONFIG);
+  }, [chatId, agentConfig]);
+  // Persist on user-driven changes only (not on the restore above), so switching
+  // chats never clobbers the new chat's cached config with the old one.
+  const handleAgentConfigChange = useCallback(
+    (next: AgentSessionConfig) => {
+      setAgentConfig(next);
+      writeCachedAgentConfig(chatId, next);
+    },
+    [chatId],
+  );

  useEffect(() => {
    const parts = [agentConfig.provider || 'boocode'];
@@ -727,13 +797,6 @@ export function CoderPane({
    }
  }, [messages, refresh, refreshCheckpoints, sessionId]);

-  // The §9b chip only shows once the chat has ≥1 prior turn (a completed
-  // assistant message). Hidden on a brand-new chat.
-  const hasPriorTurn = useMemo(
-    () => messages.some((m) => m.role === 'assistant' && (m as CoderMessage).status === 'complete'),
-    [messages],
-  );
-
  // Poll fallbacks when WS is disconnected (reconnect uses WS as source of truth)
  useEffect(() => {
    if (!activeTaskId || connected) return;
@@ -1001,11 +1064,9 @@ export function CoderPane({
      <AgentComposerBar
        projectPath={projectPath}
        value={agentConfig}
-        onChange={setAgentConfig}
+        onChange={handleAgentConfigChange}
        onProviderCommandsChange={handleProviderCommandsChange}
        connected={connected}
-        sessionId={sessionId}
-        hasPriorTurn={hasPriorTurn}
        agentStatus={currentAgentStatus}
      />
      {/* Chat area — BooChat-style timeline (text + tool runs as siblings) */}
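The chat-switch effect in `CoderPane` makes three decisions that are easier to see as a pure function. This is a restatement under assumptions, not the component code: `onChatChange` and its `persist` flag are hypothetical, and the cached value is passed in rather than read from `localStorage`.

```typescript
// Reduced copy of the config shape, for illustration.
interface AgentConfig {
  provider: string;
  model: string;
  modeId: string | null;
  thinkingOptionId: string | null;
}

const DEFAULT_CONFIG: AgentConfig = {
  provider: 'boocode',
  model: '',
  modeId: null,
  thinkingOptionId: null,
};

// Decides which config to show after a chat id change and whether the
// in-memory config should be written to the new chat's cache entry.
function onChatChange(
  prevChatId: string | undefined,
  chatId: string | undefined,
  current: AgentConfig,
  cached: AgentConfig | null,
): { config: AgentConfig; persist: boolean } {
  // Same chat (initial mount or a StrictMode re-run): nothing to do.
  if (prevChatId === chatId) return { config: current, persist: false };
  // undefined -> real id: the pane just resolved its chat. On a cache miss,
  // keep the user's in-memory pick and persist it under the new key.
  if (prevChatId === undefined && chatId) {
    return cached ? { config: cached, persist: false } : { config: current, persist: true };
  }
  // Ordinary switch between chats: restore the cache or fall back to default.
  return { config: cached ?? DEFAULT_CONFIG, persist: false };
}
```

Only the undefined-to-id transition ever persists from the restore path, so switching chats cannot overwrite one chat's cached config with another's.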