v1.13.20-drop-legacy-cols: final phase of v1.13.0 strangler-fig

Removes the dual-write into messages.tool_calls / messages.tool_results JSON columns and drops the columns. message_parts is now the only source of truth for tool calls and tool results. 10 dual-write sites stripped (5 in tool-phase.ts, 2 in routes/skills.ts, 2 in routes/messages.ts, 1 in routes/chats.ts fork-clone). The recon-driven grep caught 2 sites beyond the original v1.13.2 roadmap inventory and an extra fixture file (tool_cost_stats.test.ts) with a direct legacy-column INSERT.

messages_with_parts view rewritten to parts-only subselects (COALESCE fallbacks gone). View runs via CREATE OR REPLACE so it lands before the column DROPs in startup DDL; Postgres rejects column-drop on view-referenced cols. v1.12.1 cleanup DO block (DROP CONSTRAINT messages_status_check / messages_role_check) removed; those one-shots have done their work.

Adversarial review caught a runtime bug the green test suite missed: the discard_stale endpoint (chats.ts) had a RETURNING ... tool_calls, tool_results clause that would have crashed on every 60s-no-token-activity recovery in production. Fixed by switching to two-step UPDATE returning id, then SELECT from messages_with_parts so parts-synthesized fields keep flowing on the wire.

Message API type retains tool_calls? / tool_results?: the view synthesizes those keys from parts so the wire shape is unchanged; frontend reads need no update. Override on the original v1.13.2 plan, captured in the openspec proposal.

339/339 server tests passing (including 7 DB-integration tests that applied the schema migration to a live DB and ran the parts-only view end-to-end). tsc + web build clean.

Pairs with v1.13.0-ai-sdk-v6 (introduced the dual-write) and v1.13.1-B (moved the read path to messages_with_parts). Umbrella v1.13 tag ships on this same commit, marking the strangler-fig closed.

CLAUDE.md picks up Sam's pre-existing edits documenting tag-naming and CHANGELOG conventions, both already in use by v1.13.19 / v1.13.20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.19-html-artifact-panes: pane-based artifact viewer with on-request HTML
2026-05-23 13:03:51 +00:00 · 2026-05-23 12:43:13 +00:00 · 2026-05-22 21:54:16 +00:00 · 2026-05-22 21:45:52 +00:00 · 2026-05-22 20:59:25 +00:00 · 2026-05-22 20:09:39 +00:00
187 changed files with 20144 additions and 2461 deletions
--- a/.codecontextignore
+++ b/.codecontextignore
@@ -0,0 +1,33 @@
 # .codecontextignore — paths codecontext skips during analysis
 # Copy to your project root and customize. Same syntax as .gitignore.
 # Dependencies / vendored code
 node_modules/
 vendor/
 .venv/
 venv/
 __pycache__/
 target/
 # Build artifacts
 dist/
 build/
 out/
 .next/
 .nuxt/
 .svelte-kit/
 # IDE / tooling
 .opencode/
 .vscode/
 .idea/
 # Test artifacts / coverage
 coverage/
 .nyc_output/
 .pytest_cache/
 # Lock files (rarely have meaningful symbols)
 package-lock.json
 yarn.lock
 pnpm-lock.yaml
--- a/.env.example
+++ b/.env.example
@@ -10,3 +10,12 @@ POSTGRES_PASSWORD=CHANGE_ME
 # Internal Tailscale address that bypasses Authelia. Override if you
 # point BooCode at a different SearXNG instance.
 SEARXNG_URL=http://100.114.205.53:8888
 # v1.13.15-tools: BOOCODE_TOOLS narrows the tool whitelist sent to the LLM.
 # Unset (default) → all tools (~21k-token schema). Narrowing is useful primarily
 # for single-purpose sessions where the model only needs read-only filesystem access.
 #
 # core      → view_file, list_dir, grep, find_files                       (~2k)
 # standard  → core + web_*, git_status, all 8 codecontext_* tools         (~10k)
 # all       → every tool in ALL_TOOLS                                     (~21k)
 # BOOCODE_TOOLS=all
--- a/.gitignore
+++ b/.gitignore
@@ -1,9 +1,12 @@
 node_modules
 dist
 .env
 CLAUDE.local.md
 *.log
 .DS_Store
 .vite
 coverage
 secrets/
-data/
+data/*
 !data/AGENTS.md
 !data/skills/
--- a/BOOCHAT.md
+++ b/BOOCHAT.md
@@ -1,7 +1,5 @@
 # BooChat
 You are the assistant running inside BooChat — a self-hosted developer chat app.
 ## Capabilities
 - Read-only file tools: `view_file`, `list_dir`, `grep`, `find_files`
@@ -28,6 +26,18 @@ You are the assistant running inside BooChat — a self-hosted developer chat ap
 - Cite file paths + line numbers for any claim about the codebase
 - When uncertain about scope or intent, surface options via `ask_user_input` rather than guessing
 - Prefer codecontext (`search_symbols`, `get_symbol_info`, `get_dependencies`) over `grep` for symbol-level questions. Fall back to `grep` / `view_file` when codecontext returns degraded or empty results — that signals an unsupported language or parse failure.
 - Verify before reporting work complete: run the relevant test/build/smoke command and confirm output matches the claim. Evidence first, assertion second.
 ## Output format
 - Stay in Markdown by default for every reply, short or long.
 - Switch to a self-contained `<!DOCTYPE html>...</html>` artifact only when the user explicitly asks (e.g. "render this as HTML", "make me a dashboard", "build an interactive diagram"). Detection is opportunistic — the BooChat backend tags the assistant message as an HTML artifact, opens it in a sandboxed pane, and offers Download. Do not emit HTML unprompted; long Markdown is the right answer for most explanatory output.
 - When asked to produce HTML, avoid generic AI aesthetics: no excessive centered layouts, no purple gradients, no uniform rounded corners, no Inter font. Prefer interactive controls (sliders / knobs / SVG / side-by-side diffs) over passive prose-in-HTML. Pattern reference: claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html (Thariq Shihipar, May 2026).
 - The HTML artifact is rendered in a sandboxed iframe with `connect-src 'none'` — `fetch()`, WebSockets, and tracking pixels do not work. All logic must be client-side.
 ## Convention: rules vs recipes
 Always-true rules (process discipline, refusals, behavior contracts) live here in `BOOCHAT.md` — and in `BOOCODER.md` / `CLAUDE.md` per their scopes — where they are 100% present in every turn. On-demand recipes (specific procedures, scaffolds, checklists) live in `/data/skills/` and invoke roughly 6% of the time in clean multi-turn flow (Codeminer42 measurement, 2026). Don't file workflow rules as skills — they silently misfire. See Anthropic agent-skills best-practices (platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for the canonical conventions.
 ## Known limitations
--- a/BOOCODER.md
+++ b/BOOCODER.md
@@ -2,8 +2,6 @@
 > (Stub. v2.0 implementation pending. This file documents the intended contract.)
 You are the assistant running inside BooCoder — the write-capable companion to BooChat.
 ## Capabilities
 - Everything in `BOOCHAT.md`
@@ -22,3 +20,8 @@ You are the assistant running inside BooCoder — the write-capable companion to
 - Show a diff preview before any write
 - Group related edits into a single `/apply` batch
 - If a tool fails, surface the error verbatim — don't paper over it
 - Verify before reporting work complete: run the relevant test/build/smoke and confirm output matches the claim. Evidence first, assertion second.
 ## Convention: rules vs recipes
 Always-true rules live here, in `BOOCHAT.md`, and in `CLAUDE.md` (100% present each turn). On-demand recipes live in `/data/skills/` (roughly 6% invoke rate in multi-turn per Codeminer42, 2026). Don't file workflow rules as skills — they misfire. See Anthropic agent-skills best-practices (platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices).
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,183 @@
 # Changelog
 All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
 ## v1.13.20-drop-legacy-cols — 2026-05-23
 Final phase of the v1.13.0 strangler-fig migration. Removes the dual-write into `messages.tool_calls` / `messages.tool_results` JSON columns and drops the columns themselves; `message_parts` is now the only source of truth for tool-call and tool-result data. 10 dual-write sites stripped (5 in `tool-phase.ts`, 2 in `routes/skills.ts`, 2 in `routes/messages.ts`, 1 in `routes/chats.ts` fork-clone) — recon's grep-driven inventory caught 2 sites beyond the original v1.13.2 roadmap count. `messages_with_parts` view simplified to parts-only subselects (COALESCE fallbacks gone) and rewritten via `CREATE OR REPLACE VIEW` BEFORE the column DROP since Postgres rejects column-drop on view-referenced cols. Adversarial review caught a runtime bug the green test suite missed: `chats.ts:/api/chats/:id/discard_stale` had a `RETURNING ... tool_calls, tool_results, ...` clause referencing the dropped columns; would have crashed on every 60s-no-token-activity recovery in production. Fixed by switching to two-step UPDATE-then-SELECT-from-view so the response keeps the parts-synthesized fields. `Message` API type retains `tool_calls?` / `tool_results?` fields (override on the original v1.13.2 plan) — the view continues to populate them from parts, so the wire shape is unchanged and the frontend needs no updates. v1.12.1 cleanup block (`DROP CONSTRAINT messages_status_check`/`messages_role_check`) removed — those one-shots have done their work. `tool_cost_stats.test.ts` had a direct `INSERT INTO messages` touching the legacy columns that wasn't in the roadmap's inventory; rewritten to parts-table inserts and confirmed semantically faithful. 339/339 server tests passing including the 7 DB-integration tests (live-DB applied the schema migration and ran the parts-only view end-to-end). Pairs with `v1.13.0-ai-sdk-v6` (which introduced the dual-write) and `v1.13.1-B` (which moved the read path to `messages_with_parts`); umbrella `v1.13` tag ships on the same commit.
 ## v1.13.19-html-artifact-panes — 2026-05-23
 Pane-based artifact viewer with on-request HTML support. Every assistant message gets an "Open in pane" icon button (`PanelRightOpen`, mobile 44px tap-target) in `MessageBubble`'s ActionRow; click opens the message in the workspace splitter as either a Markdown pane (Copy raw source + Download `.md`) or an HTML pane (Download `.html` only, no Copy). The HTML path triggers when the model emits a self-contained `<!DOCTYPE html>` or fenced ` ```html` artifact (opt-in only — `BOOCHAT.md` rule says Markdown is default at every length; HTML only on explicit user request like "render this as HTML"). Backend detection in `finalizeCompletion` (`error-handler.ts`) writes a new `message_parts.kind='html_artifact'` row with payload `{html_content, char_count, title}` (`<title>` → first `<h1>` → first 80 chars of inner text). Schema CHECK extended via the v1.13.13 drop-and-re-add pattern. 1MB cap is graceful — over-cap artifacts skip the part write and plain content lands; decision factored into a pure `decideHtmlArtifactWrite` helper so the warn-and-skip branch is unit-testable without mocking the full InferenceContext. Pane state is reference-only (`{chat_id, message_id, title}`) — content is fetched on mount, keeping `sessions.workspace_panes` jsonb small and avoiding 1MB blobs riding the `session_workspace_updated` WS frame. New `services/artifacts.ts` ships slug derivation (Markdown: first `#` heading → first 6 words; HTML: `<title>` → `<h1>` → inner text) and write helpers that realpath the artifacts directory after `mkdir` to close a symlink-escape gap (`assertArtifactsDirSafe`). `routes/artifacts.ts` exposes POST `/api/chats/:id/messages/:msg_id/artifacts/download?fmt=md|html` (writes to `<projectRoot>/.boocode/artifacts/<slug>-<ts>.<ext>`) plus GET `/api/projects/:project_id/artifacts/:filename` with `Content-Disposition: attachment`, `X-Content-Type-Options: nosniff`, and `Content-Security-Policy: sandbox` defense-in-depth on LLM-served HTML. 
 The iframe sandbox locks to `allow-scripts allow-clipboard-write allow-downloads` with no `allow-same-origin` and uses `srcDoc` (not `src`) for opaque-origin isolation. Frontend extracts `MarkdownRenderer.tsx` from `MessageBubble`'s inline `MarkdownBody` for reuse; `MarkdownArtifactPane.tsx` / `HtmlArtifactPane.tsx` render with loading + error states. 404-vs-real-error discrimination in `openInPane`: a real network/500 failure toasts and bails instead of silently masquerading as a Markdown pane. 31 new server unit tests (slug derivation, detection positive/negative, write helpers, symlink-escape, 1MB cap, real-symlink filesystem test); 332/332 server tests passing; `tsc -p apps/web/tsconfig.app.json --noEmit` clean; `pnpm -C apps/web build` green. Smoke deferred to first deploy.
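The title fallback chain described in this entry (`<title>`, then the first `<h1>`, then the first 80 characters of inner text) can be sketched as below. This is a regex-based illustration only; the function name `deriveHtmlTitle` is hypothetical, and the real detection code in `error-handler.ts` may well use a proper HTML parser.

```typescript
// Hypothetical helper: derive a display title for an HTML artifact.
// Order follows the changelog entry: <title>, first <h1>, first 80 chars of text.
function deriveHtmlTitle(html: string): string {
  // Returns the cleaned text of the first capture group, or null if empty/missing.
  const pick = (re: RegExp): string | null => {
    const m = re.exec(html);
    const text = m
      ? m[1].replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim()
      : "";
    return text.length > 0 ? text : null;
  };

  const title =
    pick(/<title[^>]*>([\s\S]*?)<\/title>/i) ??
    pick(/<h1[^>]*>([\s\S]*?)<\/h1>/i);
  if (title !== null) return title;

  // Fallback: drop script/style bodies, strip remaining tags, keep 80 chars.
  const inner = html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, "")
    .replace(/<[^>]*>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
  return inner.slice(0, 80);
}
```

Nested markup inside the heading (for example `<h1>Token <em>usage</em></h1>`) is flattened to plain text before it is used as a title.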
 ## v1.13.18-codecontext-file-path — 2026-05-22
 Fix: four codecontext wrappers (`get_file_analysis`, `get_symbol_info`, `get_dependencies`, `get_semantic_neighborhoods`) forwarded `file_path` to the sidecar unchanged, but the sidecar's index is keyed on absolute paths — every relative path from the model returned "File not found in graph" (three back-to-back failures in one chat at 17:56 UTC, ~48 s of wasted tool budget). New `resolveProjectPath` helper in `codecontext_client.ts:64-89` realpath-resolves the candidate, applies the same escape check as the existing `target_dir` resolver (matching the error template byte-for-byte except the field name), and falls through with the normalised absolute on ENOENT so the sidecar issues its own self-correctable "File not found" error. Wired into `callCodecontext` once at the args-spread site — all four wrappers benefit without per-wrapper edits. `.trim()` added to all four `file_path` Zod schemas to absorb trailing newlines from model output. Adversarial review caught a P2 escape-bypass: an absolute path with `..` (e.g. `<projectRoot>/../etc/passwd`) that ENOENTs at realpath would slip through the literal prefix-check, fixed by `resolve()`-normalising the absolute branch too. 9 new test cases in `codecontext_client.test.ts` (7 spec scenarios + symlink-out-of-root + absolute-with-`..` ENOENT) plus a 1-line update in `codecontext_tools.test.ts` asserting the new resolved-absolute contract. Pairs with `v1.13.17-cross-repo-reads` — both harden path traversal, but v1.13.18 stays inside the project root while v1.13.17 widens access outside it.
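The absolute-with-`..` bypass comes down to comparing a path against the root before normalising it. The sketch below shows only the lexical half of the check, under the assumption of POSIX paths; the name `resolveInsideRoot` and the error text are hypothetical, and the realpath resolution and ENOENT fall-through described above are deliberately left out.

```typescript
import { posix as path } from "node:path";

// Hypothetical helper: normalise a model-supplied file_path against the
// project root and reject anything that lands outside it.
function resolveInsideRoot(projectRoot: string, filePath: string): string {
  const root = path.resolve(projectRoot);
  // resolve() collapses ".." segments for relative AND absolute inputs,
  // so "<root>/../etc/passwd" is normalised before the containment check.
  const candidate = path.resolve(root, filePath.trim());
  const rel = path.relative(root, candidate);
  if (rel === ".." || rel.startsWith("../") || path.isAbsolute(rel)) {
    throw new Error(`file_path escapes project root: ${filePath}`);
  }
  return candidate;
}
```

A literal prefix check on the raw string would accept `/srv/proj/../etc/passwd` because it starts with `/srv/proj`; comparing the normalised path closes that gap.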
 ## v1.13.17-cross-repo-reads — 2026-05-22
 On-demand read access to paths outside the session's primary project root. Closes the dead-end where `pathGuard` rejected every cross-repo read with no recovery path. New `request_read_access(path, reason)` tool emits an `ask_user_input`-style pause; user picks Allow/Deny via inline chips in `RequestReadAccessCard.tsx`; on Allow, the new `POST /api/chats/:id/grant_read_access` endpoint re-resolves the grant root and appends to `sessions.allowed_read_paths` (new `TEXT[]` column, default empty). Grant unit per design D1 = nearest registered `projects.path` ancestor → else nearest repo-shaped ancestor (`.git/` / `package.json` / `go.mod` / `Cargo.toml`) under `PROJECT_ROOT_WHITELIST` → else refuse without prompting. `pathGuard` extended with an optional `extraRoots` argument threaded from `session.allowed_read_paths` through `executeToolCall` to the four filesystem tools (view_file, list_dir, grep, find_files); `view_file` re-anchors the secret-guard check on `basename(real)` whenever the path resolved via a grant root so `.env` / `id_rsa*` deny still fires across grants. `grant_resolver.ts`'s ancestor walk checks the whitelist invariant on every iteration (not just final parent) so a symlinked input can't escape mid-walk. PATCH `/api/sessions/:id` exposes `allowed_read_paths` only for revocation: zod refines paths to absolute + no traversal markers, and a runtime subset guard (`findUnauthorizedAdditions`) rejects any entry not already present in the row, so a malicious `curl -X PATCH -d '{"allowed_read_paths":["/etc"]}'` 400s instead of bypassing the grant flow. Settings pane gains a per-session revoke list; archiving the session clears grants implicitly. 11 grant_resolver tests pin the symlink-escape-mid-walk guard (Sam's checkpoint-1 ask) and the nearest-project disambiguation; 8 path_guard tests cover extraRoots traversal; 8 sessions PATCH tests cover the subset guard including the `/etc` bypass attempt. 
 Pairs with `v1.13.16-xml-parser` (the model now self-recovers both from a wrong tool name and from a refused path).
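The revocation-only PATCH rule is a subset check: every requested path must already be present in the stored grants. A minimal sketch follows; `findUnauthorizedAdditions` is named in the entry above, but its signature here is an assumption.

```typescript
// Assumed signature: returns the requested paths that are NOT already granted.
// An empty result means the PATCH only removes (or keeps) existing grants.
function findUnauthorizedAdditions(
  currentGrants: string[],
  requested: string[],
): string[] {
  const granted = new Set(currentGrants);
  return requested.filter((p) => !granted.has(p));
}

// Hypothetical caller: reject the PATCH with a 400 when anything new appears.
function isRevocationOnly(currentGrants: string[], requested: string[]): boolean {
  return findUnauthorizedAdditions(currentGrants, requested).length === 0;
}
```

With stored grants `["/srv/a", "/srv/b"]`, a request for `["/srv/a"]` is a valid revocation of `/srv/b`, while a request for `["/etc"]` is rejected because `/etc` was never granted.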
 ## v1.13.16-xml-parser — 2026-05-22
 Two-part fix for the model-emitted XML drift the v1.13.15 investigation surfaced. **Parser extension:** `xml-parser.ts` now recognizes the Anthropic `<invoke name="…"><parameter name="…">…</parameter></invoke>` shape alongside the existing Qwen/Hermes `<tool_call><function=…>…</function></tool_call>` shape. qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent (Claude Code documentation in its pre-training corpus). Both formats route through the same synthetic-id `xml_call_${idx}` ToolCall path. The existing Qwen parser was tightened to tolerate whitespace around `=` (`<function = name>` shape) so a stray space doesn't get absorbed into the function name. **Unknown-tool recovery hint:** new `tool-suggestions.ts` exports `levenshtein()` + `suggestToolName()` + `formatUnknownToolError()`. When the dispatcher (`tool-phase.ts:executeToolCall`) receives an unknown tool name, the error returned to the model includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against `Object.keys(TOOLS_BY_NAME)`. Targets the qwen3.6 drift to `read_file` → suggest `view_file`. Test coverage in `xml-parser.test.ts` (46 tests, all green) covers both parsers, the partial-opener detector for both flavors, the unified extraction helper, and the new error formatter.
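A rough sketch of the recovery hint, assuming the rule is exactly "closest name within Levenshtein distance 3, else first substring match". The three function names come from the entry above; their signatures and the error wording are assumptions, and the tool list is illustrative.

```typescript
// Classic two-row dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed rule: nearest name within maxDistance, else a substring match.
function suggestToolName(
  unknown: string,
  known: string[],
  maxDistance = 3,
): string | null {
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const name of known) {
    const d = levenshtein(unknown, name);
    if (d <= maxDistance && d < bestDistance) {
      best = name;
      bestDistance = d;
    }
  }
  if (best !== null) return best;
  if (unknown.length === 0) return null;
  return known.find((n) => n.includes(unknown) || unknown.includes(n)) ?? null;
}

function formatUnknownToolError(unknown: string, known: string[]): string {
  const hint = suggestToolName(unknown, known);
  const base = `Unknown tool "${unknown}".`;
  return hint === null ? base : `${base} Did you mean: ${hint}?`;
}
```

Under this sketch's rules `read_file` is four edits from `view_file` and not a substring of it, so the shipped helper presumably matches more loosely than shown here.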
 ## v1.13.15-codecontext-synth — 2026-05-22
 Forced second-inference synthesis pass for codecontext overview-class tools (`get_codebase_overview`, `get_framework_analysis`, `get_semantic_neighborhoods`). After the tool result lands, the pipeline expands the truncated head via in-process `readTruncation`, extracts referenced file paths from the full content, auto-fetches top-N files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md) under a 32k-token budget with explicit drop-priority order, then streams a synthesis turn that replaces the recursive `runAssistantTurn`. The 32k truncated head still ships to the synth model (token-budget contract preserved); the expansion is reference-extraction-only. Falls through to recursion on timeout (90s), model error, or non-2xx; user-abort marks the synth message `status='failed'` and re-throws (the outer abort handler operates on the parent turn's message, not the new synth row — without explicit marking, the row would sit `streaming` until the 5-min sweeper, tripping the 60s stale-stream banner). Adds `'synthesis'` to `message_parts.kind` CHECK constraint via `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` idempotency-guarded re-add. Smokes #1, #2, #6 all clean; smokes #3–#5 are content-quality checks for UI review.
 ## v1.13.14-skills-audit — 2026-05-22
 Multi-topic batch. **Skills audit (headline):** vendored all 26 skills from `/home/samkintop/opt/skills/` into repo-local `data/skills/` (the `/opt/skills:/data/skills` override mount removed from `docker-compose.yml` so skills are auditable per-batch in git). Audited via 5 parallel Claude Code agent-teams running mgechev's 4-step protocol per skill — 14 survive with gerund-form names + refined triggers; 11 dropped (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively); 1 (`verification-before-completion`) migrated to `BOOCHAT.md`/`BOOCODER.md` as an always-true rule. The Codeminer42 "rules vs recipes" split codified in those files. **Token tracking + stale-stream banner fix:** same root cause — `IsoTimestamp = z.string()` in `ws-frames.ts` was failing on postgres `Date` objects, silently dropping every `message_complete` / `session_updated` / `chat_updated` frame through the `v1.13.13-ws-publish` Zod gate; `z.preprocess(v => v instanceof Date ? v.toISOString() : v, ...)` applied to the primitive on both server + web (parity test still passes). **Codecontext ignore:** `codecontext_client.ts` auto-installs `.codecontextignore.template` into any project's root on first call (stops the upstream empty-source-file parser crash on foreign projects' `node_modules`). **Budget bump:** `BUDGET_READ_ONLY` + `BUDGET_NO_AGENT` 30 → 50 (real recon need ~27 + headroom for codecontext failure-retry turns; doom-loop guard catches the loop class anyway). **UI:** queued-message dropdown → edit / force-send / cancel buttons in `ChatPane.tsx`; `ChatThroughput` removed from desktop tab strip (mobile tab switcher keeps it). Audit decisions in `openspec/changes/v1.13.12-skills-audit/audit-notes.md`.
 ## v1.13.13-ws-publish — 2026-05-22
 Second half of the WebSocket-frame-typing batch. Converts the existing ~50 inference + auto_name publish sites (via the `index.ts` adapter) plus ~30 direct `broker.publish*` call sites in routes + compaction, so every server-emitted frame now goes through Zod validation at the broker boundary. Pairs with `v1.13.12-ws-schemas`.
 ## v1.13.12-ws-schemas — 2026-05-22
 First half of the WebSocket-frame-typing batch. Adds `apps/server/src/types/ws-frames.ts` with Zod schemas for all 27 wire-format frame types (discriminated union `WsFrameSchema` + `KNOWN_FRAME_TYPES` diagnostic lookup), duplicated byte-identical at `apps/web/src/api/ws-frames.ts` with a parity test. Introduces the `publishFrame` / `publishUserFrame` wrappers that fail-closed on schema mismatch.
 ## v1.13.11-tools — 2026-05-22
 Tiered tool loading via `BOOCODE_TOOLS` env var (`core` | `standard` | `all`). Core = 4 read-only fs tools (~2k token schema cost). Standard = +web + git + codecontext (~10k). All (default) = every tool in `ALL_TOOLS` (~21k). The var is a ceiling — narrows agent whitelists, never expands. Pattern lifted from `eyaltoledano/claude-task-master`.
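The "ceiling" behaviour can be sketched as a set intersection: the tier removes tools from an agent's whitelist but never adds any. The tool lists below are abbreviated for illustration, `parseTier` and `effectiveTools` are hypothetical names, and treating an unrecognised value as `all` is an assumption.

```typescript
type ToolTier = "core" | "standard" | "all";

// Abbreviated, illustrative tool lists; the real tiers are larger.
const CORE_TOOLS = ["view_file", "list_dir", "grep", "find_files"];
const STANDARD_TOOLS = [...CORE_TOOLS, "web_search", "web_fetch", "git_status"];
const ALL_TOOLS = [...STANDARD_TOOLS, "ask_user_input", "view_truncated_output"];

const TIERS: Record<ToolTier, Set<string>> = {
  core: new Set(CORE_TOOLS),
  standard: new Set(STANDARD_TOOLS),
  all: new Set(ALL_TOOLS),
};

// Unset falls back to "all" (the documented default); so does any
// unrecognised value, which is an assumption of this sketch.
function parseTier(raw: string | undefined): ToolTier {
  return raw === "core" || raw === "standard" ? raw : "all";
}

// The tier is a ceiling: it filters the agent's whitelist, never extends it.
function effectiveTools(agentWhitelist: string[], tier: ToolTier): string[] {
  const ceiling = TIERS[tier];
  return agentWhitelist.filter((name) => ceiling.has(name));
}
```

For example, an agent whitelisted for `view_file` and `web_search` keeps only `view_file` under `core`, and an agent whitelisted for `grep` alone still gets only `grep` under `all`.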
 ## v1.13.10-openspec — 2026-05-22
 Adopt `Fission-AI/OpenSpec`'s `openspec/changes/<slug>/{proposal,tasks,design}.md` shape for BooCode's own batch docs. Existing batch docs (`boocode_batch10.md`, `handoff_v1.13.8_prefix_verify.md`, `handoff_v1.13.10_per_tool_cost.md`) moved into `openspec/changes/archived/` via `git mv` to preserve history. Zero-dep documentation reformat.
 ## v1.13.9-agentlint — 2026-05-22
 Manual audit of instruction files against `0xmariowu/AgentLint`'s 31-check standard. Removed identity-opener sections from `BOOCHAT.md` and `BOOCODER.md` (emphatic decoration the model doesn't need). Added `CLAUDE.local.md` to `.gitignore` — Claude Code's Glob ignores `.gitignore` by default, so local overrides were otherwise readable by any agent walking the workspace. `CLAUDE.md` passed all 10 checks unchanged.
 ## v1.13.8-tool-cost — 2026-05-22
 Per-tool prompt/completion-token rolling averages surfaced in AgentPicker as at-a-glance cost hints. Implementation is the `tool_cost_stats` SQL view over `messages_with_parts` (`LATERAL jsonb_array_elements` on `tool_calls`), plus a read endpoint and a tooltip extension. Equal-split attribution — multi-tool turn divides tokens N-ways; the 100-call rolling mean absorbs split noise. Filters out `cap_hit` / `doom_loop` sentinels. Source data already lands via existing UPDATEs that `v1.13.5-stability-bundle`'s `includeUsage: true` fix made non-NULL.
 ## v1.13.7-compaction-trigger — 2026-05-22
 Compaction overflow trigger lowered to `floor(0.85 × ctx_max)`, replacing the v1.11.0-era `ctx_max − 20_000` formula. Old formula gave only 7.6% headroom at 262k context and 0 budget for ≤20k contexts (never fired). New formula gives consistent 15% summarizer headroom across all model sizes. Opencode pattern lift from `session/overflow.ts`.
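The two formulas are easy to compare side by side. The helper names below are hypothetical, and 262,144 is assumed as the exact size of the "262k" context.

```typescript
const LEGACY_RESERVE_TOKENS = 20_000;
const TRIGGER_RATIO = 0.85;

// v1.11.0-era trigger: fixed 20k reserve below the context limit.
function legacyTrigger(ctxMax: number): number {
  return ctxMax - LEGACY_RESERVE_TOKENS;
}

// v1.13.7 trigger: a fixed fraction of the context limit.
function ratioTrigger(ctxMax: number): number {
  return Math.floor(TRIGGER_RATIO * ctxMax);
}

// Fraction of the window still free when the trigger fires.
function headroom(ctxMax: number, trigger: number): number {
  return (ctxMax - trigger) / ctxMax;
}

const big = 262_144; // assumed exact size of the "262k" context
console.log(headroom(big, legacyTrigger(big)).toFixed(3)); // 0.076
console.log(headroom(big, ratioTrigger(big)).toFixed(3)); // 0.150
console.log(legacyTrigger(16_384)); // -3616: a threshold that can never be used
console.log(ratioTrigger(16_384)); // 13926
```

The fixed reserve shrinks as a share of large windows and goes negative for small ones, while the ratio keeps the same 15% at every size.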
 ## v1.13.6-prefix-stability — 2026-05-22
 System-prompt prefix stability verify-and-measure. Recon during planning disproved the original DB-cache premise: `buildSystemPrompt` already runs over inputs mtime-cached at the file layer (BOOCHAT.md, AGENTS.md global+per-project), and DB scalars are byte-stable until edited. This batch closes the verification gap with instrumentation, not implementation — `buildSystemPromptWithFingerprint` computes SHA-256 over the assembled prefix and a per-session `Map` observer fires `prefix-drift` (warn) on hash change with field-level `changed_inputs` diff.
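A minimal sketch of the drift observer, assuming the prefix is assembled from named string inputs joined in sorted order. `observePrefix` and the `Fingerprint` shape are hypothetical; only the SHA-256 hash, the per-session `Map`, and the field-level diff come from the entry above.

```typescript
import { createHash } from "node:crypto";

type PromptInputs = Record<string, string>;

interface Fingerprint {
  prefixHash: string; // hash of the assembled prefix
  inputHashes: Record<string, string>; // per-input hashes for the diff
}

const sha256 = (text: string): string =>
  createHash("sha256").update(text, "utf8").digest("hex");

function fingerprint(inputs: PromptInputs): Fingerprint {
  const names = Object.keys(inputs).sort();
  const inputHashes: Record<string, string> = {};
  for (const name of names) inputHashes[name] = sha256(inputs[name]);
  // Assumed assembly order: inputs joined by name, for illustration only.
  const prefixHash = sha256(names.map((n) => inputs[n]).join("\n\n"));
  return { prefixHash, inputHashes };
}

// Per-session memory of the last fingerprint seen.
const lastSeen = new Map<string, Fingerprint>();

// Returns the names of inputs that changed since the previous turn,
// or an empty array on the first turn or when the prefix is byte-stable.
function observePrefix(sessionId: string, inputs: PromptInputs): string[] {
  const next = fingerprint(inputs);
  const prev = lastSeen.get(sessionId);
  lastSeen.set(sessionId, next);
  if (prev === undefined || prev.prefixHash === next.prefixHash) return [];
  const names = Array.from(
    new Set([...Object.keys(prev.inputHashes), ...Object.keys(next.inputHashes)]),
  );
  return names.filter((n) => prev.inputHashes[n] !== next.inputHashes[n]).sort();
}
```

A caller would log a `prefix-drift` warning with the returned names as `changed_inputs` whenever the array is non-empty.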
 ## v1.13.5-stability-bundle — 2026-05-22
 Five fixes for latent regressions surfaced during the cosmetic-revert investigation. (1) `provider.ts` — `includeUsage: true` on `createOpenAICompatible` (default false omitted `stream_options.include_usage`; llama-swap never emitted usage; tokens_used / ctx_used were NULL on every assistant row since `v1.13.0-ai-sdk-v6`). (2) `MessageList.tsx` — `hasText = m.content.trim().length > 0` to skip whitespace-only tool-call-only turns rendering empty bubbles. (3) `BUDGET_NO_AGENT` raised 15 → 30 to match read-only agent cap. (4) `payload.ts` skips status='failed' + complete-but-empty assistant rows so cap-hit + Continue doesn't upstream-reject. (5) Misc UI sanitization.
 ## v1.13.4-reasoning-fix — 2026-05-22
 Compaction head-assembly audit caught one fix: reasoning was omitted from the summarizer's view of tool-bearing turns, silently degrading summary quality for reasoning-channel models (qwen3.6). `v1.13.0-ai-sdk-v6` had wired reasoning end-to-end into inference but missed this one read site. `CompactionMessage` extended with `reasoning_parts`; `buildHeadPayload` embeds it as a `<reasoning>...</reasoning>` prose prefix on the assistant content (OpenAI wire shape has no structured reasoning field).
 ## v1.13.3-truncate — 2026-05-22
 Port of opencode's `truncate.ts`. Full tool output retrievable via opaque `tr_<12 base32 chars>` id (~60 bits entropy) and a new `view_truncated_output(id)` tool. Tmpfs storage at `/tmp/boocode-truncations/` (overridable via `BOOCODE_TRUNCATION_DIR`), 5MB cap, 7-day TTL, orphan-reap on the periodic 60s sweeper. Wired through four tools: `view_file`, `list_dir`, `web_fetch`, `codecontext_client`. Each returns the existing sliced view plus an `outputPath` field when truncation fires.
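The "~60 bits" figure follows from 12 base32 characters at 5 bits each. A sketch of generating such an id is below; the lowercase RFC 4648 alphabet and the name `newTruncationId` are assumptions, since the entry does not say which alphabet the port uses.

```typescript
import { randomBytes } from "node:crypto";

// Assumed alphabet: lowercase RFC 4648 base32 (32 symbols, 5 bits each).
const BASE32 = "abcdefghijklmnopqrstuvwxyz234567";
const ID_LENGTH = 12; // 12 chars x 5 bits = 60 bits of entropy

function newTruncationId(): string {
  const bytes = randomBytes(ID_LENGTH);
  let id = "tr_";
  for (let i = 0; i < ID_LENGTH; i++) {
    // Keep the low 5 bits of each random byte; 256 is a multiple of 32,
    // so every symbol is equally likely.
    id += BASE32[bytes[i] & 31];
  }
  return id;
}
```

Because the id is opaque and random rather than derived from a path, a model cannot guess its way to another session's stored output.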
 ## v1.13.2-compaction-prune — 2026-05-22
 Two-tier compaction prune — opencode pattern that was half-shipped in v1.11.0. New `message_parts.hidden_at` column with partial index on `WHERE hidden_at IS NULL`. `messages_with_parts` view changed from `COALESCE(parts, legacy)` to a CASE that distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → drop the row from the model payload" (smoke caught the `COALESCE` leaking hidden parts back via legacy fallback). `prune.ts` scans `tool_result` parts newest-first, protects the last 40k tokens, marks older candidates hidden once the combined estimate clears 20k.
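The selection rule in `prune.ts` can be sketched as a single newest-first pass. The names and the exact boundary handling below are assumptions; only the 40k protected tail and the 20k minimum come from the entry above.

```typescript
interface ToolResultPart {
  id: string;
  tokens: number; // estimated token count of the tool result
}

// Hypothetical selector: `parts` is ordered oldest-first. Walks newest-first,
// skips the protected tail, and only returns candidates when hiding them
// would free more than `minimum` tokens.
function selectPartsToHide(
  parts: ToolResultPart[],
  protect = 40_000,
  minimum = 20_000,
): string[] {
  let seen = 0;
  let prunable = 0;
  const candidates: string[] = [];
  for (let i = parts.length - 1; i >= 0; i--) {
    const part = parts[i];
    seen += part.tokens;
    if (seen <= protect) continue; // still inside the protected recent window
    prunable += part.tokens;
    candidates.push(part.id);
  }
  return prunable > minimum ? candidates : [];
}
```

With four 15k-token results, the two newest stay visible and the two oldest (30k combined) are hidden; with only three, the single candidate frees 15k, which is below the minimum, so nothing is hidden.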
 ## v1.13.1-cleanup-bundle — 2026-05-22
 Four independent items owed from prior dispatches. (1) `statement_timeout = '30s'` at the database level (documented in `schema.sql` but applied operationally — `ALTER DATABASE` can't run inside a `DO` block). (2) Tool registry alpha-sorted at module load — llama.cpp's prompt cache hits on byte-identical prefixes; reordering tools near the top of the system prompt would invalidate every cached turn. (3) Periodic 60s stuck-row sweeper. (4) `experimental_repairToolCall` to keep streams alive on malformed qwen3.6 tool args (pass-through implementation — logs and forwards unmodified; existing zod-reject path routes back to the model).
 ## v1.13.0-ai-sdk-v6 — 2026-05-22
 Major migration to AI SDK v6. Introduces the `streamCompletion` adapter (`services/inference/stream-phase.ts`) over `streamText`, with five known gotchas the LSP can't catch, including: abort signals are swallowed by `fullStream` (post-iteration throw required), usage lands only at stream end via `await result.usage`, tools have no `execute` field (BooCode dispatches in `tool-phase.ts`), and tool-call-only turns may emit a leading `\n` text-delta. Also ships the `messages_with_parts` view (parts-merge read path) and wires `reasoning_parts` end-to-end via a `ReasoningPart` in the v6 ModelMessage. Ports `ask_user_input` correlation queries from JSON columns to `message_parts` JOINs.
 ## v1.12.4-inference-split — 2026-05-21
 Complete `inference.ts` split into `services/inference/`. Pieces: `turn.ts` (orchestration — `runAssistantTurn` / `runInference` / `createInferenceRunner`), `sentinel-summaries.ts` (`runCapHitSummary`, `runDoomLoopSummary`), `stream-phase.ts`, `tool-phase.ts`, `provider.ts`, `payload.ts`, `prune.ts`, `budget.ts`, `xml-parser.ts`, `error-handler.ts`, `sentinels.ts`, `parts.ts`, `types.ts`. Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution).
 ## v1.12.3-stale-banner — 2026-05-21
 Stale-stream banner with Retry/Discard. When an assistant message sits `status='streaming'` with no token activity for 60+ seconds, the chat shows a banner above the input. Both actions clear the stale row via new `POST /api/chats/:id/discard_stale` (updates `status='failed'`, publishes `chat_status='idle'`). Closes the UX gap from the 2026-05-21 debugging spiral — slow streams and dead streams now look different.
 ## v1.12.2-live-toks — 2026-05-21
 Live tok/s + ctx display next to the status indicator. `ChatThroughput` renders inline beside `StatusDot` while streaming or tool_running. Subscribes to existing `'usage'` WS frames (500ms-throttled, carrying `completion_tokens` + `ctx_used` + `ctx_max`) via `sessionEvents`. Hides when status drops to idle/error or data is older than 10s. Addresses the same UX gap as `v1.12.3-stale-banner` — gives users a live token velocity readout that immediately distinguishes slow from dead.
 ## v1.12.1-stop-handler — 2026-05-21
 `handleAbortOrError` now writes `status='cancelled'` on user stop; rows no longer stuck `streaming` forever. Drops stale `messages_status_check` constraint (only `messages_status_chk` remains, allowing 'cancelled' via TS `MESSAGE_STATUSES`). Removes `detectSameNameLoop` and `DOOM_LOOP_SAME_NAME_THRESHOLD` (added during the 2026-05-21 debugging spike, never fired in any real run) plus 12 verbose `ctx.log.info` diagnostic markers from the same spike. Bundles workspace pane sync + status indicator overhaul + startup hung-row sweep that landed earlier in v1.12.1 work.
 ## v1.12.0-codecontext — 2026-05-21
 Adds the `codecontext` sidecar (Go-based code-graph indexer at `codecontext:8080/v1/<tool_name>` over `boocode_net`) plus container guidance and skills runtime updates. Introduces the `chat_status` WS frame (`streaming | tool_running | waiting_for_input | idle | error`, widened from `working|idle|error`). Drops the deprecated `session_panes` table — workspace pane state moves to `sessions.workspace_panes jsonb` for cross-device sync via `PATCH /api/sessions/:id/workspace`.
 ## v1.11.1-consolidation — 2026-05-21
 Rollup of v1.11.0–v1.11.10 work that was shipped piecemeal. Covers anchored rolling compaction (single `summary=true` row per chat that supersedes itself), doom-loop guard via `detectDoomLoop`, `path_guard` secret-filename deny list, web tools (`web_search` against SearXNG + `web_fetch` with SSRF/private-IP block), and the 5MB stream-cap on response bodies with abort-on-overflow.
 ## v1.11.0-context-bar — 2026-05-20
 Persistent context-window tracker in `ChatPane` + `ctx_max` capture via `${LLAMA_SWAP_URL}/upstream/<model>/props`. First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet — 60s negative cache TTL recovers on next turn. Replaced an earlier dead read of `parsed.timings.n_ctx` which never carried n_ctx.
 ## v1.10.1-booterm-user — 2026-05-19
 Per-user shell privilege drop in the booterm container via `gosu` in `tmux.conf` default-command. Shells launched in browser terminal panes drop privs to `samkintop` rather than running as root inside the container.
 ## v1.10.0-booterm — 2026-05-18
 Second container (`apps/booterm`, port 9501, bookworm-slim+glibc). Fastify + node-pty + tmux. Browser terminal panes connect via WS to `/ws/term/sessions/:sid/panes/:pid`; per-session tmux session `bc-<sid>`, per-pane window `term-<pid>`. xterm-addon-webgl with `document.fonts.load(...)`-gated init (Canvas2D doesn't honor `font-display: block`) and iOS-friendly visibility-change context recreation.
 ## v1.9.2-ask-user-input — 2026-05-18
 `ask_user_input` elicitation tool. Pauses the inference loop and surfaces a prompt to the user; their response routes back as the tool result. Correlation initially via `messages.tool_calls` / `tool_results` JSON columns (later ported to `message_parts` in `v1.13.0-ai-sdk-v6`).
 ## v1.9.1-skills — 2026-05-18
 Skills runtime + `/skill` slash command with autocomplete. Server-side parser, tools, `/api/skills`, and mount. Hardens `.dockerignore` to exclude `secrets/` and `data/`. Drops the type-to-confirm gate on chat delete (plain Cancel/Confirm only — per workspace convention).
 ## v1.9.0-themes-settings — 2026-05-17
 Settings pane + per-project defaults + bulk archive + themes lift. `themes-v1` (18 preset palettes) ships in the same batch with a Settings picker for live theme switching.
 ## v1.8.2-cap-hit — 2026-05-17
 Tool-loop cap-hit summary — when an assistant exceeds the per-turn tool budget, a sentinel `role='system'` row with `metadata.kind='cap_hit'` is inserted and a summary turn runs to give the user a coherent endpoint. Also compacts the tool-call UI rendering.
 ## v1.8.1-agents-global — 2026-05-16
 Global agents (`data/AGENTS.md` bind-mounted at `/data/AGENTS.md`) + parser robustness + WS reconnect toast. Per-project `AGENTS.md` mechanism (`getAgentsForProject`) remains for *other* projects; the BooCode repo itself uses global-only to eliminate two-files-must-stay-in-sync drift.
 ## v1.8.0-agents — 2026-05-16
 Tier 2 agents — `AGENTS.md` registry + per-session agent picker. Also lands mobile tab switcher, branch indicator, and the `git_status` tool.
 ## v1.7.0-drag-drop — 2026-05-16
 Drag-drop + paste-as-attachment for long text in the chat input.
 ## v1.6.0-mobile — 2026-05-16
 Full mobile suite. Adds `useViewport` (matchMedia breakpoints mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, synthetic `contextmenu`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Mobile headers with safe-area padding, hamburger left, FolderTree right. Tap targets at `max-md:min-h-[44px] max-md:min-w-[44px]`. Raises `MAX_TOOL_LOOP_DEPTH` 5 → 15. Right-rail becomes a drawer on mobile.
 ## v1.5.1-bootstrap — 2026-05-16
 Bootstrap fixes — git + ssh installed in the boocode container, Tailscale host rewrite, `/opt/projects` label correction for the create-new-project bootstrap flow.
 ## v1.5.0-refactor-tests — 2026-05-16
 Refactor split (FileBrowserPane / Workspace / `runAssistantTurn`) + vitest harness + unit tests for security-critical pure functions. Scopes the `/opt` mount to `/opt/projects` (writable) plus `PROJECT_ROOT_WHITELIST=/opt` (read-only resolution for add-existing). Surfaces swallowed errors and removes dead `session_renamed` paths.
 ## v1.4.0-fork-header — 2026-05-16
 Fork from message + delete message + header polish + general housekeeping.
 ## v1.3.0-chats-projects — 2026-05-16
 Chats-in-sessions era. Adds force-send, `/compact`, right-rail file browser, archive/rename/Open-in-Gitea sidebar context menu, archived projects landing page, create-project bootstrap with Gitea remote setup, landing-card buttons, 1000px content cap. Dedup audit and chat archive/delete from the sidebar.
 ## v1.2.0-multi-pane — 2026-05-15
 Multi-pane workspace (batch 3, T1–T8). `session_panes` schema (later replaced by `sessions.workspace_panes jsonb` in v1.12.0), `Pane` discriminated union, broker user channel + `/api/ws/user`, `file_ops` + `file_index` services, `PaneShell` / `ChatPane` / `FileBrowserPane` / `PaneTab` / `Workspace` components, `usePanes` hook, Shiki integration in `CodeBlock`. Up to 5 panes per session; default chat pane created on `POST /api/sessions`.
 ## v1.1.0-markdown-sidebar — 2026-05-15
 Markdown rendering, message actions, tok/s + ctx display, AI session naming. Sidebar restructure — chats nested under projects (max 5 + view-all), live updates via WS.
 ## v1.0.0-initial — 2026-05-14
 Initial commit. Skeleton of the monorepo: `apps/server` (Fastify + postgres), `apps/web` (React + Vite), basic chat loop against llama-swap.
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -33,7 +33,7 @@ npx tsc -p apps/web/tsconfig.app.json --noEmit  # web app specifically
 docker compose build --no-cache boocode && docker compose up -d
 ```
-Tests: `pnpm -C apps/server test` runs 23 vitest tests. No test harness on `apps/web` (adding it requires installing vitest as a new devDep). Vitest pinned to `^3` because Vite 5 / vitest 4 are incompatible. No linters configured.
+Tests: `pnpm -C apps/server test` runs the vitest suite. No test harness on `apps/web` (adding it requires installing vitest as a new devDep). Vitest pinned to `^3` because Vite 5 / vitest 4 are incompatible. No linters configured. Vitest include glob is `src/**/__tests__/**/*.test.ts` (see `apps/server/vitest.config.ts`) — tests outside `src/**/__tests__/` silently won't run; match the per-domain convention (`apps/server/src/services/__tests__/foo.test.ts`).
 ## Architecture
@@ -46,9 +46,24 @@ Tests: `pnpm -C apps/server test` runs 23 vitest tests. No test harness on `apps
 - **Zod** for request validation and config parsing.
 Key services:
- **`services/inference.ts`** — Streams LLM responses, executes tool loops (max depth 15, see `MAX_TOOL_LOOP_DEPTH`), flushes to DB every 500ms. Publishes `InferenceFrame` events through the broker.
+- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn / runInference / createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`), `stream-phase.ts` (streamCompletion as a v1.13.1-A AI SDK adapter + executeStreamPhase), `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap), `tool-phase.ts` (executeToolPhase; value back-edges into turn.ts for the runAssistantTurn recursion — cycle safe because deref at call time, not module top-level), `sentinel-summaries.ts` (runCapHitSummary + runDoomLoopSummary + their sentinel inserters), `error-handler.ts` (handleAbortOrError, finalizeCompletion), `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`), `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`, sentinel predicates), `budget.ts` (resolveToolBudget), `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls), `parts.ts` (parts-table write helpers: `partsFromAssistantMessage`, `partsFromToolMessage`, `insertParts` — v1.13.20 made parts the sole source of truth), `prune.ts` (v1.13.4 two-tier compaction; `selectPruneTargets` is the pure decision helper), `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope threaded through the `executeToolPhase → runAssistantTurn` recursion; reset in `runInference` at user-message boundary. Add new per-turn state to `TurnArgs`, not module-level closures.
- **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart.
+- **AI SDK v6 streamCompletion adapter** (v1.13.1-A; `services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer above (executeStreamPhase, finalize, parts writes; the legacy dual-write was removed in v1.13.20) is shape-preserved via an adapter. Five gotchas the LSP/test suite won't catch:
- **`services/tools.ts`** — Four read-only file tools exposed as OpenAI function-calling schemas. All file access goes through `path_guard.ts` which resolves against project root.
+  - **Abort signals are swallowed.** `streamText`'s `fullStream` iterator exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required; without it the row finalizes as `complete` instead of `cancelled`. Comment in stream-phase.ts pins this; don't refactor it away.
  - **Usage lands only at stream end** via `await result.usage` (`inputTokens` / `outputTokens` v6 names → mapped to `promptTokens` / `completionTokens` for the existing onUsage callback). Mid-stream live tok/s is gone vs v1.12.2; ChatThroughput shows a single value at stream end.
  - **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop. Only `description` + `inputSchema: jsonSchema(parameters)` — surfacing tool-call parts via `fullStream` and stopping is what we want.
  - **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `services/inference/provider.ts`. The adapter defaults it false, omitting `stream_options.include_usage` from the request body; llama-swap then never emits the usage block and `result.usage.inputTokens/outputTokens` resolve to `undefined`. Latent regression from v1.13.1-A through v1.13.7 — every assistant row in that window has `tokens_used`/`ctx_used` NULL. Don't remove this flag during refactor.
  - **Tool-call-only turns may emit a leading `\n` text-delta** as the assistant content. `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check — otherwise whitespace-only content renders an empty bubble + ActionRow between every tool call (v1.13.7 fix). `payload.ts:buildMessagesPayload` also skips `status='failed'` AND complete-but-empty (no content, no tool_calls) assistant rows to avoid "Cannot have 2 or more assistant messages at the end of the list" upstream rejections after cap-hit + Continue.
 - **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart` — BooCode's OpenAI-shape history doesn't carry it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` matching the v6 `ToolResultOutput` union. Assistant messages with reasoning emit a `ReasoningPart` first in the content array (v1.13.1-C).
 - **`experimental_repairToolCall`** (v1.13.3) wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through implementation — logs the bad call and returns it unmodified; `executeToolPhase`'s existing zod-reject error path routes it to the model on the next turn.
 - **`chat_status` frame shape** (published via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'` (widened from `working|idle|error` in v1.12.1). Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders inline beside `StatusDot` only when streaming or tool_running, fed by 500ms-throttled `'usage'` WS frames (`completion_tokens` + `ctx_used` + `ctx_max`). The `POST /api/chats/:id/discard_stale` endpoint exists to mark a stuck-streaming row as `failed` when the frontend's 60s no-token-activity timer (`ChatPane` content-length watcher) gives up.
 - **Boot-time stale-streaming sweep** in `apps/server/src/index.ts` after `applySchema()`: any `messages.status='streaming'` older than 5 minutes flips to `'failed'`. Logs only on non-zero count. Recovers from container restart while inference was mid-stream (v1.12.1).
 - **Periodic 60s sweeper** in `apps/server/src/index.ts` (v1.13.3 + v1.13.5). Same `setInterval` runs `sweepStaleStreaming` (marks `messages.status='streaming'` older than 5 min as `failed`, publishes `chat_status='idle'` so the UI dot drops) and `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `app.addHook('onClose')` clears the timer. No-op when nothing to reap.
 - **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart. v1.13.11: every WS publish goes through `broker.publishFrame(sessionId, frame)` or `broker.publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). `ctx.publish` / `ctx.publishUser` in inference + auto_name route through the index.ts adapter that calls publishFrame internally. The schema is duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; a `ws-frames.test.ts` case enforces parity. Don't add new raw `broker.publish()` / `publishUser()` calls.
 - **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) go through three guard layers: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). v1.11.8+ web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (resolved with `project.default_web_search_enabled` fallback) and filtered out of the LLM's tool schema when false. v1.13.5 truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs at `BOOCODE_TRUNCATION_DIR` (default `/tmp/boocode-truncations`, 0o700) keyed by an opaque `tr_<12 base32 chars>` id, and the `view_truncated_output(id)` tool retrieves it. 5MB cap (matches `view_file`'s `MAX_FILE_BYTES`), 7-day TTL, reaped by the periodic sweeper. Tmpfs path means container restart loses retrieval — acceptable, the model usually has moved on.
 - **`services/compaction.ts`** + **`services/model-context.ts`** — v1.11.0 anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself on each compaction). Triggered when `chats.needs_compaction` is set after an inference turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)` (v1.13.9 opencode-pattern early trigger; was `ctx_max - 20k` pre-v1.13.9, which gave only 7.6% headroom at 262k and 0 budget for ≤20k contexts). **`ctx_max` comes from `model-context.getModelContext()` which fetches `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx` (the stream completion's `timings` doesn't carry n_ctx; that read was dead code until v1.11.3 ripped it out). First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet; negative cache TTL is 60s, recovers on next turn. v1.13.6: `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on the assistant `content` (OpenAI wire shape has no structured reasoning field; the summarizer reads text). Standalone tag when content is empty (tool-call-only turn). `buildHeadPayload` + `OpenAiMessage` exported for test access — keep them exported.
 - **`services/system-prompt.ts`** — `buildSystemPrompt` is the string-returning shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. v1.13.8 instrumentation: SHA-256 of the assembled prefix is logged per `buildMessagesPayload` call (msg `prefix-fingerprint`, level=info); a `Map<sessionId, lastHash>` observer fires `prefix-drift` (level=warn) on hash change with a field-level `changed_inputs` diff. Smoke proved the prefix is byte-stable across turns in steady-state — the originally-planned `system_prompt_cache` DB table was dropped as redundant against the v1.12.0 input-layer mtime caches (BOOCHAT.md here + AGENTS.md global+per-project in `agents.ts:safeStat`).
 - **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (v1.13.7; was 15 — every tool in `ALL_TOOLS` is read-only today, so no-agent mode shares the read-only-agent cap). Per-agent `max_tool_calls` from AGENTS.md frontmatter overrides.
 - **`messages_with_parts` view** (v1.13.1-B; `schema.sql`). Read sites that need `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` directly. v1.13.20 dropped the legacy `messages.tool_calls` / `messages.tool_results` JSON columns; the view now reads parts-only subselects. Writes target `message_parts` exclusively via `insertParts` (or via the helpers `partsFromAssistantMessage` / `partsFromToolMessage`). The `Message` wire type still carries `tool_calls?` / `tool_results?` because the view synthesizes them from parts — frontend reads are unchanged. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` single object, `reasoning_parts jsonb[]` of `{text}`. If you ever need to UPDATE a message and return its full Message shape, do a two-step UPDATE returning `id` followed by SELECT from the view — RETURNING off the bare `messages` table no longer carries the tool fields.
 - **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
 - **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after first assistant reply.
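
The compaction trigger in the `services/compaction.ts` entry above comes down to simple arithmetic. A minimal sketch, assuming "262k" means 262144 tokens; `usableLegacy` and `headroom` are hypothetical names used only for this illustration, and the real code may compute these differently:

```typescript
// v1.13.9 rule from the entry above: usable(ctx_max) = floor(0.85 * ctx_max).
function usable(ctxMax: number): number {
  return Math.floor(0.85 * ctxMax);
}

// Hypothetical helper for the pre-v1.13.9 rule: a fixed 20k-token reserve,
// clamped at zero so small contexts do not go negative.
function usableLegacy(ctxMax: number): number {
  return Math.max(0, ctxMax - 20_000);
}

// Hypothetical helper: fraction of the window still free at the threshold.
function headroom(ctxMax: number, threshold: number): number {
  return (ctxMax - threshold) / ctxMax;
}

const large = 262_144; // assumption: "262k" is 2^18 tokens
const small = 16_384;  // assumption: an example context at or below 20k

const largeNew = usable(large);       // threshold under the 85% rule
const largeOld = usableLegacy(large); // threshold under the fixed reserve
const smallNew = usable(small);
const smallOld = usableLegacy(small); // the fixed reserve leaves no budget

const oldHeadroomPct = Math.round(headroom(large, largeOld) * 1000) / 10;
const newHeadroomPct = Math.round(headroom(large, largeNew) * 1000) / 10;
```

Under these assumptions the fixed reserve leaves about 7.6% of a 262144-token window free, while the 85% rule leaves about 15% and still gives a 16384-token context a non-zero budget.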
@@ -86,26 +101,30 @@ Font / CSS pipeline (apps/web):
 ### Multi-pane workspace
-Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). Workspace pane state is **client-side only** (localStorage key `boocode.workspace.panes.<sessionId>`); the legacy `session_panes` table and its REST endpoints are deprecated — no `/api/panes/*` routes exist. Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Sessions 1:N chats; chats own messages. Tab reorder via native HTML5 drag events.
+Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). v1.12.1 moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device watching the session. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string. Legacy localStorage key `boocode.workspace.panes.<sessionId>` is read once on first hydrate (one-time seed-and-delete migration when server is empty but localStorage has data); no longer written. The deprecated `session_panes` table was dropped. `validatePanes(validChatIds)` prunes panes referencing chat IDs that no longer exist (called by `useSessionChats` after the chat list fetch lands). Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Tab reorder via native HTML5 drag events.
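
The pruning and echo-dedup behavior described above can be sketched roughly as follows. This is an illustration only: the `Pane` shape is trimmed down, `validatePanes` here takes the pane list as an explicit argument (the real hook method receives only `validChatIds`), the fallback to an empty pane is an assumption, and `shouldSave` is a hypothetical name.

```typescript
// Hypothetical, reduced pane shape; the real `Pane` union has more variants.
type Pane =
  | { kind: 'chat'; id: string; chatIds: string[]; activeChatIdx: number }
  | { kind: 'empty'; id: string };

// Drop chat ids that no longer exist and keep activeChatIdx in range.
// Assumption: a chat pane left with no chats becomes an empty pane.
function validatePanes(panes: Pane[], validChatIds: Set<string>): Pane[] {
  return panes.map((pane): Pane => {
    if (pane.kind !== 'chat') return pane;
    const chatIds = pane.chatIds.filter((id) => validChatIds.has(id));
    if (chatIds.length === 0) return { kind: 'empty', id: pane.id };
    const activeChatIdx = Math.min(pane.activeChatIdx, chatIds.length - 1);
    return { ...pane, chatIds, activeChatIdx };
  });
}

// Echo dedup by JSON string: skip the save when the serialized pane state
// matches what was last persisted (for example, our own broadcast coming back).
function shouldSave(lastSaved: string | null, panes: Pane[]): boolean {
  return JSON.stringify(panes) !== lastSaved;
}

const before: Pane[] = [
  { kind: 'chat', id: 'p1', chatIds: ['a', 'b', 'c'], activeChatIdx: 2 },
  { kind: 'chat', id: 'p2', chatIds: ['gone'], activeChatIdx: 0 },
];
const after = validatePanes(before, new Set(['a', 'b']));
```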
 ## Database
-PostgreSQL 16. Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `session_panes` (deprecated). Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`.
+PostgreSQL 16. Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts` (v1.13.0). Views: `messages_with_parts` (v1.13.1-B parts-merge read path), `tool_cost_stats` (v1.13.10 per-tool 100-call rolling window). (`session_panes` was dropped in v1.12.1; workspace pane state lives in `sessions.workspace_panes jsonb`.) Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. The older anonymous `messages_status_check` (without 'cancelled') and `messages_role_check` (without 'system') were dropped in v1.12.1; only the `_chk` variants remain.
 Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap new constraint ADD in `DO $$ ... pg_constraint` guard — that block is the only way to get `ADD CONSTRAINT IF NOT EXISTS`.
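
The three-step order above can be sketched as a small statement builder. Everything here is illustrative: `buildCheckMigration` and `lit` are hypothetical helpers, the `error` to `failed` rename and the allowed-value list are example inputs rather than a real migration, and the `conrelid` filter is an extra safeguard added for this sketch.

```typescript
// Hypothetical description of one CHECK migration.
interface CheckMigration {
  table: string;
  column: string;
  systemName: string; // auto-generated name, e.g. <table>_<column>_check
  newName: string;    // explicit name, e.g. <table>_<column>_chk
  renames: Record<string, string>; // old value -> new value
  allowed: string[];  // values permitted once the migration is done
}

// Quote a SQL string literal by doubling embedded single quotes.
function lit(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Returns statements in the required order: drop, update rows, guarded add.
function buildCheckMigration(m: CheckMigration): string[] {
  const drop = `ALTER TABLE ${m.table} DROP CONSTRAINT IF EXISTS ${m.systemName};`;
  const updates = Object.entries(m.renames).map(
    ([from, to]) =>
      `UPDATE ${m.table} SET ${m.column} = ${lit(to)} WHERE ${m.column} = ${lit(from)};`,
  );
  const allowed = m.allowed.map((v) => lit(v)).join(', ');
  // The DO block stands in for the missing ADD CONSTRAINT IF NOT EXISTS.
  const add = [
    'DO $$',
    'BEGIN',
    '  IF NOT EXISTS (',
    '    SELECT 1 FROM pg_constraint',
    `    WHERE conname = ${lit(m.newName)} AND conrelid = ${lit(m.table)}::regclass`,
    '  ) THEN',
    `    ALTER TABLE ${m.table} ADD CONSTRAINT ${m.newName} CHECK (${m.column} IN (${allowed}));`,
    '  END IF;',
    'END $$;',
  ].join('\n');
  return [drop, ...updates, add];
}

// Example inputs only; not the project's actual status list or rename.
const statements = buildCheckMigration({
  table: 'messages',
  column: 'status',
  systemName: 'messages_status_check',
  newName: 'messages_status_chk',
  renames: { error: 'failed' },
  allowed: ['streaming', 'complete', 'failed', 'cancelled'],
});
```

The builder assumes trusted identifiers; only the value literals are quoted.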
 Position-shift pattern for panes (legacy `session_panes` table): negate-and-restore to avoid UNIQUE(session_id, position) collisions during reorder/insert/delete. Sentinel value -100 for the moving pane.
 ## Environment
-Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`.
+Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale Fathom; the public `search.indifferentketchup.com` is behind Authelia and unusable from server context), `BOOCODE_TOOLS` (`core` | `standard` | `all`, default `all`; v1.13.15-tools tier filter — ceiling, never expands an agent's whitelist).
 ## Workflow
 - Sam reviews all diffs and commits manually. Do not commit unless explicitly asked.
 - Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`. Already-shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape; see `openspec/README.md` for the convention.
 - Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`). Monotonic per minor — the slug describes the batch's content so the tag name alone is enough to recall what shipped. No letter suffixes (`-a`/`-b`), no pseudo-ranges (`v1.11.x`), no slug-only sub-versions sharing a number (`v1.13.15-tools` + `-openspec` + `-agentlint` — split into sequential patches instead).
 - `CHANGELOG.md` is the per-tag release log, most-recent on top. When a new tag is created, add a `## <tag> — <YYYY-MM-DD>` section with a 3–6 sentence paragraph summarizing what shipped, drawn from the commit body. Cross-reference other tags by name when the batch builds on, fixes, or pairs with prior work (e.g. "pairs with `v1.13.12-ws-schemas`", "fixed in `v1.13.5-stability-bundle`"). No nested bullets — one paragraph.
 - Deploy: `cd /opt/boocode && docker compose up --build -d` (or `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue).
 - Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`.
 - Don't accumulate `.bak-*` files. Clean them up in the same batch or immediately after merge.
 - DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boocode' pnpm -C apps/server test`. Host port is 5500 (mapped from `boocode_db:5432`); password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL=postgres://boocode:Ketchup1479@boocode_db:5432/...` line. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` with a `beforeAll` that applies the schema via `sql.unsafe(readFileSync(schemaPath))`. Tests skip cleanly when var is unset. `tool_cost_stats.test.ts` is the reference.
 - Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The boocode container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`.
 - Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without setting `Content-Type` tricks on the client.
 - Event dedup discipline: for any mutation the server publishes via `broker.publishUser`, do NOT add a local `sessionEvents.emit(...)` after the API call — `useUserEvents` forwards the WS frame onto the bus. Frontend mutation handlers must be idempotent (dedup by id, no-op on already-present).
 - `node:20-*` base images ship a `node` user at uid/gid 1000 — delete it (`userdel`/`groupdel` on debian, `deluser`/`delgroup` on alpine) before adding samkintop at 1000.
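
The tag-naming rule in the list above lends itself to a mechanical check. A rough sketch, assuming slugs are lowercase alphanumeric words joined by hyphens; `isValidTag` and `findSharedNumbers` are hypothetical helpers, not existing tooling in this repo:

```typescript
// Assumed reading of the rule: vMAJOR.MINOR.PATCH-slug.
const TAG_RE = /^v(\d+)\.(\d+)\.(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$/;

function isValidTag(tag: string): boolean {
  const m = TAG_RE.exec(tag);
  if (!m) return false; // also rejects pseudo-ranges such as v1.11.x
  const slug = m[4];
  // Reject bare letter suffixes such as "-a" / "-b".
  return !/^[a-z]$/.test(slug);
}

// Report version numbers used by more than one tag (slug-only sub-versions).
function findSharedNumbers(tags: string[]): string[] {
  const counts = new Map<string, number>();
  for (const tag of tags) {
    const m = TAG_RE.exec(tag);
    if (!m) continue;
    const version = `${m[1]}.${m[2]}.${m[3]}`;
    counts.set(version, (counts.get(version) ?? 0) + 1);
  }
  return [...counts.entries()].filter(([, n]) => n > 1).map(([v]) => v);
}

const shared = findSharedNumbers([
  'v1.13.13-ws-publish',
  'v1.13.15-tools',
  'v1.13.15-openspec',
]);
```

Older tags that predate the convention would fail this check, so it only makes sense for new tags.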
@@ -124,9 +143,16 @@ Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0
 - TypeScript strict mode. Both apps share `tsconfig.base.json`.
 - Server uses NodeNext module resolution (`.js` extensions in imports).
 - Discriminated unions for type narrowing: `Pane` (by `kind`), `SessionEvent` (by `type`), `InferenceFrame` (by `type`).
 - **Adding a new WS frame type** requires updating BOTH the server's `InferenceFrame` (loose `type:` union + optional fields in `services/inference/turn.ts`) AND the web `WsFrame` (strict discriminated union in `apps/web/src/api/types.ts`). Server publish is permissive; the frontend type is the wire-format gate. The `'usage'` frame added in v1.12.2 needed both sides; missing the web side silently drops the frame at JSON-parse.
 - shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive.
 - `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension-to-language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names.
 - Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles.
 - `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
 - Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` is the wrapper that does this. `addSplitPane` returns the new pane id for callers.
 - xterm.js v5 uses canvas rendering — browser doesn't see xterm's selection; the native right-click menu has no working Copy for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path.
 - **New tools** live in their own `services/<name>.ts` file (see `web_search.ts`, `web_fetch.ts`) — exports a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real dependencies. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')` — cleanup is simpler and the production call site stays unchanged.
 - **Sentinels** are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. A new kind requires arms in `MessageMetadata` in BOTH `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts`, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
 - **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and calls `controller.close()` before the consumer reads, so a subsequent `reader.cancel()` finds the stream already closed and the `cancel()` callback never fires. Also provide MORE chunks than the test will consume so the source stays in 'readable' state when cancel runs (e.g. cap test reads ~6 chunks, stub provides 10).
 - Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded. `services/agents.ts` `ALL_TOOL_NAMES` had this drift class until v1.12 — same pattern applies to any future tool-aware code.
 - Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo — removed in v1.12 to eliminate the two-files-must-stay-in-sync drift. The `getAgentsForProject` per-project override mechanism remains for *other* projects.
 - MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers, per the MCP spec (modelcontextprotocol.io/specification/server/transports). The `codecontext/shim.go` framing implementation is the reference.
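
The sentinel convention in the list above (UI-only `role='system'` rows stripped before the payload reaches the LLM) can be illustrated with a small filter. This is a sketch under assumptions: the `Row` shape is reduced, `stripSentinels` is a hypothetical name, and the real `isAnySentinel` in `sentinels.ts` may be implemented differently.

```typescript
// Hypothetical, reduced row shape; the real Message type has more fields.
interface Row {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  metadata?: { kind?: string };
}

// The two sentinel kinds named in the conventions list.
const SENTINEL_KINDS = new Set<string>(['cap_hit', 'doom_loop']);

// A sentinel is a system row whose metadata.kind is a known sentinel kind.
function isAnySentinel(row: Row): boolean {
  const kind = row.metadata?.kind;
  return row.role === 'system' && kind !== undefined && SENTINEL_KINDS.has(kind);
}

// Hypothetical helper: what the payload builder keeps for the LLM.
function stripSentinels(rows: Row[]): Row[] {
  return rows.filter((row) => !isAnySentinel(row));
}

const history: Row[] = [
  { role: 'user', content: 'list the files' },
  { role: 'assistant', content: 'Listing now.' },
  { role: 'system', content: 'Tool budget reached.', metadata: { kind: 'cap_hit' } },
  { role: 'system', content: 'Plain system note.' },
];
const payload = stripSentinels(history);
```

Adding a new kind to the set is only the server half; the type arms and the render branch listed above still have to be added separately.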
--- a/apps/server/package.json
+++ b/apps/server/package.json
@@ -11,8 +11,10 @@
    "test": "vitest run"
  },
  "dependencies": {
    "@ai-sdk/openai-compatible": "^2.0.47",
    "@fastify/static": "^7.0.4",
    "@fastify/websocket": "^10.0.1",
    "ai": "^6.0.190",
    "fastify": "^4.28.1",
    "postgres": "^3.4.4",
    "ws": "^8.18.0",
--- a/apps/server/src/index.ts
+++ b/apps/server/src/index.ts
@@ -10,17 +10,20 @@ import { registerProjectRoutes } from './routes/projects.js';
 import { registerSessionRoutes } from './routes/sessions.js';
 import { registerSettingsRoutes } from './routes/settings.js';
 import { registerMessageRoutes } from './routes/messages.js';
 import { registerArtifactRoutes } from './routes/artifacts.js';
 import { registerChatRoutes } from './routes/chats.js';
 import { registerSidebarRoutes } from './routes/sidebar.js';
 import { registerWebSocket } from './routes/ws.js';
 import { registerModelRoutes } from './routes/models.js';
 import { registerAgentRoutes } from './routes/agents.js';
 import { registerSkillsRoutes } from './routes/skills.js';
-import { createInferenceRunner } from './services/inference.js';
+import { registerToolsRoutes } from './routes/tools.js';
 import { createInferenceRunner } from './services/inference/index.js';
 import { createBroker } from './services/broker.js';
 import { listSkills } from './services/skills.js';
 import * as compaction from './services/compaction.js';
 import { configureModelContext } from './services/model-context.js';
 import { cleanupTruncations } from './services/truncate.js';
 async function main() {
  const config = loadConfig();
@@ -49,6 +52,18 @@ async function main() {
  await applySchema(sql);
  app.log.info('database schema applied');
  const swept = await sql<{ count: string }[]>`
    WITH swept AS (
      UPDATE messages SET status = 'failed'
      WHERE status = 'streaming' AND created_at < NOW() - INTERVAL '5 minutes'
      RETURNING id
    ) SELECT count(*)::text AS count FROM swept
  `;
  const sweptCount = Number(swept[0]?.count ?? 0);
  if (sweptCount > 0) {
    app.log.info({ sweptCount }, 'swept stale streaming messages to failed');
  }
  // v1.11.3: tell the model-context cache where llama-swap lives. Cache
  // lookups go to ${LLAMA_SWAP_URL}/upstream/<model>/props to read
  // default_generation_settings.n_ctx — the value persisted as messages.ctx_max.
@@ -61,7 +76,7 @@ async function main() {
    return { status: dbOk ? 'ok' : 'degraded', db: dbOk };
  });
-  const broker = createBroker();
+  const broker = createBroker(app.log);
  registerProjectRoutes(app, sql, config, broker);
  registerSessionRoutes(app, sql, config, broker);
@@ -70,6 +85,7 @@ async function main() {
  registerAgentRoutes(app, sql);
  registerSidebarRoutes(app, sql);
  registerChatRoutes(app, sql, broker);
  registerToolsRoutes(app, sql);
  // Batch 9.6: warm the skills cache at boot and surface the count. Empty or
  // missing /data/skills is non-fatal — the skill tools just return empty.
@@ -86,7 +102,9 @@ async function main() {
      config,
      log: app.log,
      publish: (sessionId, frame) => {
-        broker.publish(sessionId, frame as unknown as Record<string, unknown> & { type: string });
+        // v1.13.11-b: route through the typed publishFrame so the broker's
+        // Zod gate validates every inference frame before delivery.
+        broker.publishFrame(sessionId, frame as unknown as import('./types/ws-frames.js').WsFrame);
      },
      // v1.11: broker handle for compaction.process to publish 'compacted'
      // frames on the per-session channel. Inference's regular publish path
@@ -95,10 +113,10 @@ async function main() {
      broker,
    },
    (user, frame) => {
-      broker.publishUser(user, frame as unknown as Record<string, unknown> & { type: string });
+      broker.publishUserFrame(user, frame as unknown as import('./types/ws-frames.js').WsFrame);
    }
  );
-  registerMessageRoutes(app, sql, {
+  registerMessageRoutes(app, sql, config, broker, {
    enqueueInference: (sessionId, chatId, assistantId, user) => {
      inference.enqueue(sessionId, chatId, assistantId, user);
    },
@@ -114,60 +132,61 @@ async function main() {
    },
    hasActiveInference: (chatId) => inference.hasActive(chatId),
    publishUserMessage: (sessionId, chatId, userMessageId, content) => {
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'message_started',
        message_id: userMessageId,
        chat_id: chatId,
        role: 'user',
      });
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'delta',
        message_id: userMessageId,
        chat_id: chatId,
        content,
      });
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'message_complete',
        message_id: userMessageId,
        chat_id: chatId,
      });
    },
    publishMessagesDeleted: (sessionId, chatId, messageIds) => {
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'messages_deleted',
        message_ids: messageIds,
        chat_id: chatId,
      });
    },
    publishSessionFrame: (sessionId, frame) => {
-      broker.publish(sessionId, frame);
+      broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
    },
  });
  registerArtifactRoutes(app, sql);
  registerSkillsRoutes(app, sql, {
    enqueueInference: (sessionId, chatId, assistantId, user) => {
      inference.enqueue(sessionId, chatId, assistantId, user);
    },
    publishUserMessage: (sessionId, chatId, userMessageId, content) => {
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'message_started',
        message_id: userMessageId,
        chat_id: chatId,
        role: 'user',
      });
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'delta',
        message_id: userMessageId,
        chat_id: chatId,
        content,
      });
-      broker.publish(sessionId, {
+      broker.publishFrame(sessionId, {
        type: 'message_complete',
        message_id: userMessageId,
        chat_id: chatId,
      });
    },
    publishSessionFrame: (sessionId, frame) => {
-      broker.publish(sessionId, frame);
+      broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
    },
  });
  registerWebSocket(app, sql, broker);
@@ -189,6 +208,52 @@ async function main() {
    app.log.info(`serving static frontend from ${webDist}`);
  }
  // v1.13.3: periodic in-process sweeper for streaming rows orphaned by a
  // mid-session crash. The boot sweep (above) only fires once at startup;
  // this loop catches the in-flight case. 60s cadence + 5-min threshold
  // matches the boot sweep so behavior is consistent. Publishes
  // chat_status='idle' on the user channel so the UI dot drops without a
  // refresh — same pattern as handleAbortOrError.
  const SWEEP_INTERVAL_MS = 60_000;
  const sweepStaleStreaming = async (): Promise<void> => {
    try {
      const rows = await sql<{ id: string; chat_id: string }[]>`
        UPDATE messages
        SET status = 'failed', finished_at = clock_timestamp()
        WHERE status = 'streaming'
          AND created_at < NOW() - INTERVAL '5 minutes'
        RETURNING id, chat_id
      `;
      if (rows.length === 0) return;
      app.log.warn(
        { swept: rows.length, ids: rows.map((r) => r.id) },
        'swept stale streaming rows',
      );
      const seenChats = new Set<string>();
      const now = new Date().toISOString();
      for (const row of rows) {
        if (seenChats.has(row.chat_id)) continue;
        seenChats.add(row.chat_id);
        broker.publishUserFrame('default', {
          type: 'chat_status',
          chat_id: row.chat_id,
          status: 'idle',
          at: now,
        });
      }
    } catch (err) {
      app.log.error({ err }, 'stuck-row sweeper failed');
    }
  };
  // v1.13.5: truncation cleanup rides the same cadence — 60s tick reaps
  // tmpfs files past the 7-day TTL plus any orphans whose owning part has
  // been pruned (v1.13.4) or deleted. No-op when the dir is empty.
  const sweepTimer = setInterval(() => {
    void sweepStaleStreaming();
    void cleanupTruncations({ sql, log: app.log });
  }, SWEEP_INTERVAL_MS);
  app.addHook('onClose', async () => { clearInterval(sweepTimer); });
  const shutdown = async (signal: string) => {
    app.log.info(`received ${signal}, shutting down`);
    try {
--- a/apps/server/src/routes/tests/sessions.test.ts
+++ b/apps/server/src/routes/tests/sessions.test.ts
@@ -0,0 +1,70 @@
 // v1.13.17-cross-repo-reads: PATCH /api/sessions/:id allowed_read_paths
 // subset enforcement. Sam flagged in the compliance review that without a
 // runtime subset check, a malicious client could PATCH
 //   {"allowed_read_paths":["/etc"]}
 // and bypass the user-consent grant flow entirely. The findUnauthorizedAdditions
 // helper is the guard; tests pin its behavior so a regression in the helper
 // or its callsite (PATCH handler in sessions.ts) trips CI before prod.
 import { describe, it, expect } from 'vitest';
 import { findUnauthorizedAdditions } from '../sessions.js';
 describe('findUnauthorizedAdditions — PATCH allowed_read_paths subset guard', () => {
  it('returns no extras when requested is empty (full revoke)', () => {
    expect(findUnauthorizedAdditions(['/opt/forks/foo'], [])).toEqual([]);
  });
  it('returns no extras when requested is a strict subset (single revoke)', () => {
    expect(
      findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], ['/opt/forks/foo']),
    ).toEqual([]);
  });
  it('returns no extras when requested equals prior (no-op PATCH)', () => {
    expect(
      findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], [
        '/opt/forks/foo',
        '/opt/forks/bar',
      ]),
    ).toEqual([]);
  });
  it('flags an unauthorized addition when prior is empty', () => {
    // The /etc bypass attempt — Sam's specific concern from the compliance
    // review. Without this guard, the PATCH would have written /etc directly.
    expect(findUnauthorizedAdditions([], ['/etc'])).toEqual(['/etc']);
  });
  it('flags a single unauthorized addition mixed in with valid revokes', () => {
    // The attacker still tries to be sneaky: keep one legit entry, drop
    // another, slip in a new one. The guard catches the addition regardless
    // of how the rest of the array shrinks.
    expect(
      findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], [
        '/opt/forks/foo',
        '/var/secrets',
      ]),
    ).toEqual(['/var/secrets']);
  });
  it('flags every unauthorized addition when there are multiple', () => {
    expect(
      findUnauthorizedAdditions(['/opt/forks/foo'], ['/opt/forks/foo', '/etc', '/root']),
    ).toEqual(['/etc', '/root']);
  });
  it('treats requested duplicates correctly (each occurrence checked)', () => {
    // If the requested array has duplicates of an unauthorized entry, the
    // guard surfaces each one. (A frontend would never send duplicates, but
    // the guard's contract shouldn't assume that.)
    expect(findUnauthorizedAdditions([], ['/etc', '/etc'])).toEqual(['/etc', '/etc']);
  });
  it('does not flag entries present in prior even if requested has duplicates', () => {
    // Duplicate of an authorized entry passes — the membership check is by
    // value, not by index. Settled by Set.has semantics.
    expect(
      findUnauthorizedAdditions(['/opt/forks/foo'], ['/opt/forks/foo', '/opt/forks/foo']),
    ).toEqual([]);
  });
 });
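The tests above pin the helper's contract without showing the helper itself. A minimal sketch that would satisfy every case, assuming the real implementation in sessions.ts is a Set-membership filter over the requested array (the test comments mention Set.has semantics). The parameter names and types here are inferred from the call sites, not copied from the source:

```typescript
// Hypothetical reconstruction; the real helper is exported from routes/sessions.ts.
// Returns every requested path that is not already in the prior grant list,
// preserving request order and duplicates.
function findUnauthorizedAdditions(
  prior: readonly string[],
  requested: readonly string[],
): string[] {
  const granted = new Set(prior);
  return requested.filter((path) => !granted.has(path));
}
```

Under this sketch a PATCH can only shrink or keep the grant list: any non-empty return value means the request tried to add a path that never went through the consent flow. Membership is exact string equality, so path normalization (trailing slashes, `..` segments) would have to happen before the call.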
--- a/apps/server/src/routes/artifacts.ts
+++ b/apps/server/src/routes/artifacts.ts
@@ -0,0 +1,231 @@
 // v1.14.x-html-artifact-panes: artifact download routes.
 //
 // Two endpoints:
 //   POST /api/chats/:id/messages/:msg_id/artifacts/download?fmt=md|html
 //     Materialises a file under <projectRoot>/.boocode/artifacts/ and
 //     returns {path, url}. fmt=html requires an existing html_artifact part
 //     on the message (404 otherwise). fmt=md works on any assistant
 //     message with non-empty content.
 //
 //   GET /api/projects/:project_id/artifacts/:filename
 //     Streams a previously-written artifact back with
 //     Content-Disposition: attachment. Path-guarded to the project's
 //     artifacts dir; rejects traversal attempts.
 import { createReadStream } from 'node:fs';
 import { realpath, stat } from 'node:fs/promises';
 import { resolve, sep, basename } from 'node:path';
 import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import {
  writeHtmlArtifact,
  writeMarkdownArtifact,
  type HtmlArtifactPayload,
 } from '../services/artifacts.js';
 const DownloadQuery = z.object({
  fmt: z.enum(['md', 'html']),
 });
 // Filename safety: alnum, dash, dot, underscore only. Blocks slashes, nul
 // bytes, etc. before we even touch the filesystem. A bare `..` or `.` still
 // matches; the resolved-path prefix check in the GET handler rejects those.
 const FilenameRe = /^[A-Za-z0-9._-]+$/;
 interface ChatRow {
  id: string;
  session_id: string;
  project_id: string;
  project_path: string;
 }
 interface MessageRow {
  id: string;
  chat_id: string;
  role: string;
  content: string;
 }
 export function registerArtifactRoutes(app: FastifyInstance, sql: Sql): void {
  app.post<{
    Params: { id: string; msg_id: string };
    Querystring: { fmt?: string };
  }>(
    '/api/chats/:id/messages/:msg_id/artifacts/download',
    async (req, reply) => {
      const parsed = DownloadQuery.safeParse(req.query);
      if (!parsed.success) {
        reply.code(400);
        return { error: 'invalid query', details: parsed.error.flatten() };
      }
      const { fmt } = parsed.data;
      const { id: chatId, msg_id: messageId } = req.params;
      const chatRows = await sql<ChatRow[]>`
        SELECT c.id, c.session_id, s.project_id, p.path AS project_path
        FROM chats c
        JOIN sessions s ON s.id = c.session_id
        JOIN projects p ON p.id = s.project_id
        WHERE c.id = ${chatId}
      `;
      if (chatRows.length === 0) {
        reply.code(404);
        return { error: 'chat not found' };
      }
      const chat = chatRows[0]!;
      const msgRows = await sql<MessageRow[]>`
        SELECT id, chat_id, role, content
        FROM messages
        WHERE id = ${messageId} AND chat_id = ${chatId}
      `;
      if (msgRows.length === 0) {
        reply.code(404);
        return { error: 'message not found' };
      }
      const msg = msgRows[0]!;
      if (msg.role !== 'assistant') {
        reply.code(400);
        return { error: 'only assistant messages produce artifacts' };
      }
      const ctx = { projectId: chat.project_id, projectRoot: chat.project_path };
      try {
        if (fmt === 'md') {
          if (!msg.content || msg.content.trim().length === 0) {
            reply.code(400);
            return { error: 'message has no content to export' };
          }
          const result = await writeMarkdownArtifact(
            { content: msg.content },
            ctx,
          );
          return result;
        }
        // fmt === 'html': require an html_artifact part on the message.
        const partRows = await sql<{ payload: HtmlArtifactPayload }[]>`
          SELECT payload
          FROM message_parts
          WHERE message_id = ${messageId} AND kind = 'html_artifact'
          ORDER BY sequence ASC
          LIMIT 1
        `;
        if (partRows.length === 0) {
          reply.code(404);
          return { error: 'no html_artifact part on this message' };
        }
        const result = await writeHtmlArtifact(partRows[0]!.payload, ctx);
        return result;
      } catch (err) {
        req.log.error({ err, messageId, fmt }, 'artifact write failed');
        reply.code(500);
        return {
          error: err instanceof Error ? err.message : 'artifact write failed',
        };
      }
    },
  );
  // v1.14.x-html-artifact-panes: HtmlArtifactPane needs the payload on click
  // to render its iframe. Returns 404 when the message has no html_artifact
  // sibling part — frontend uses that signal to open the markdown_artifact
  // pane variant instead. Payload shape matches HtmlArtifactPayload in
  // services/artifacts.ts.
  app.get<{ Params: { id: string; msg_id: string } }>(
    '/api/chats/:id/messages/:msg_id/html_artifact',
    async (req, reply) => {
      const { id: chatId, msg_id: messageId } = req.params;
      const partRows = await sql<{ payload: HtmlArtifactPayload }[]>`
        SELECT payload
        FROM message_parts mp
        JOIN messages m ON m.id = mp.message_id
        WHERE mp.message_id = ${messageId}
          AND m.chat_id = ${chatId}
          AND mp.kind = 'html_artifact'
        ORDER BY mp.sequence ASC
        LIMIT 1
      `;
      if (partRows.length === 0) {
        reply.code(404);
        return { error: 'no html_artifact part on this message' };
      }
      return partRows[0]!.payload;
    },
  );
  app.get<{ Params: { project_id: string; filename: string } }>(
    '/api/projects/:project_id/artifacts/:filename',
    async (req, reply) => {
      const { project_id: projectId, filename } = req.params;
      // Strip directory components defensively; only the basename is allowed.
      const base = basename(filename);
      if (base !== filename || !FilenameRe.test(base)) {
        reply.code(400);
        return { error: 'invalid filename' };
      }
      const projectRows = await sql<{ id: string; path: string }[]>`
        SELECT id, path FROM projects WHERE id = ${projectId}
      `;
      if (projectRows.length === 0) {
        reply.code(404);
        return { error: 'project not found' };
      }
      const project = projectRows[0]!;
      let resolvedRoot: string;
      try {
        resolvedRoot = await realpath(project.path);
      } catch {
        reply.code(404);
        return { error: 'project path missing' };
      }
      const artifactsDir = resolve(resolvedRoot, '.boocode/artifacts');
      const absPath = resolve(artifactsDir, base);
      if (!absPath.startsWith(artifactsDir + sep)) {
        reply.code(400);
        return { error: 'path traversal rejected' };
      }
      // Close the symlink-escape gap: if `.boocode/artifacts` (or an
      // ancestor) is a symlink pointing outside resolvedRoot, the lexical
      // prefix check above passes but the actual read lands outside the
      // sandbox. Realpath the artifacts dir and re-verify.
      try {
        const realArtifactsDir = await realpath(artifactsDir);
        if (
          realArtifactsDir !== resolvedRoot &&
          !realArtifactsDir.startsWith(resolvedRoot + sep)
        ) {
          reply.code(400);
          return { error: 'path traversal rejected' };
        }
      } catch {
        reply.code(404);
        return { error: 'artifact not found' };
      }
      try {
        await stat(absPath);
      } catch {
        reply.code(404);
        return { error: 'artifact not found' };
      }
      const ext = base.toLowerCase().endsWith('.html')
        ? 'text/html; charset=utf-8'
        : base.toLowerCase().endsWith('.md')
          ? 'text/markdown; charset=utf-8'
          : 'application/octet-stream';
      reply.header('Content-Type', ext);
      // Defense-in-depth on LLM-generated HTML served through this route.
      // Authelia gates the proxy; these headers limit blast radius if a
      // payload tries to escape that boundary in-browser.
      reply.header('X-Content-Type-Options', 'nosniff');
      reply.header('Content-Security-Policy', 'sandbox');
      reply.header(
        'Content-Disposition',
        `attachment; filename="${base.replace(/"/g, '')}"`,
      );
      return reply.send(createReadStream(absPath));
    },
  );
 }
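The lexical half of the GET route's path guard can be isolated into a pure function, which makes the interaction between the filename regex and the prefix check easier to see. This is a sketch only: `resolveInside` is a hypothetical name, not a helper in this diff, and it deliberately leaves out the realpath step that the route uses to close the symlink-escape gap.

```typescript
import { resolve, sep } from 'node:path';

// Same character class as FilenameRe in the route.
const FILENAME_RE = /^[A-Za-z0-9._-]+$/;

// Hypothetical helper: returns the absolute path of `name` inside `dir`,
// or null when the name is malformed or resolves to `dir` itself or above.
function resolveInside(dir: string, name: string): string | null {
  if (!FILENAME_RE.test(name)) return null;
  const root = resolve(dir);
  const abs = resolve(root, name);
  // Strictly below root: root itself ('.') and its parent ('..') both fail.
  return abs.startsWith(root + sep) ? abs : null;
}
```

Note that `..` and `.` consist only of dots and therefore pass the regex; it is the `startsWith(root + sep)` comparison that rejects them. Names containing a slash never reach `resolve` at all.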
--- a/apps/server/src/routes/chats.ts
+++ b/apps/server/src/routes/chats.ts
@@ -18,6 +18,12 @@ const ForkBody = z.object({
  name: z.string().min(1).max(200).optional(),
 });
 const DiscardStaleBody = z.object({
  message_id: z.string().uuid(),
 });
 const STALE_MIN_AGE_SECONDS = 60;
 export function registerChatRoutes(
  app: FastifyInstance,
  sql: Sql,
@@ -96,7 +102,7 @@ export function registerChatRoutes(
        VALUES (${req.params.id}, ${parsed.data.name ?? null}, 'open')
        RETURNING id, session_id, name, status, created_at, updated_at
      `;
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'chat_created',
        chat: chat!,
        session_id: req.params.id,
@@ -126,7 +132,7 @@ export function registerChatRoutes(
        return { error: 'chat not found' };
      }
      const chat = rows[0]!;
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'chat_updated',
        chat_id: chat.id,
        session_id: chat.session_id,
@@ -156,7 +162,7 @@ export function registerChatRoutes(
      `;
      const ids = rows.map((r) => r.id);
      for (const id of ids) {
-        broker.publishUser('default', {
+        broker.publishUserFrame('default', {
          type: 'chat_archived',
          chat_id: id,
          session_id: req.params.id,
@@ -197,7 +203,7 @@ export function registerChatRoutes(
        return { error: 'chat not found or already archived' };
      }
      const row = rows[0]!;
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'chat_archived',
        chat_id: row.id,
        session_id: row.session_id,
@@ -220,7 +226,7 @@ export function registerChatRoutes(
        return { error: 'chat not found or not archived' };
      }
      const chat = rows[0]!;
-      broker.publishUser('default', { type: 'chat_unarchived', chat });
+      broker.publishUserFrame('default', { type: 'chat_unarchived', chat });
      return chat;
    }
  );
@@ -237,7 +243,7 @@ export function registerChatRoutes(
        return { error: 'chat not found' };
      }
      const row = result[0]!;
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'chat_deleted',
        chat_id: row.id,
        session_id: row.session_id,
@@ -290,13 +296,13 @@ export function registerChatRoutes(
        `;
        await tx`
          INSERT INTO messages (
-            session_id, chat_id, role, content, kind, tool_calls, tool_results,
+            session_id, chat_id, role, content, kind,
            status, tokens_used, ctx_used, ctx_max, started_at, finished_at,
            created_at, metadata
          )
          SELECT
            ${source.session_id}, ${chat!.id}, role, content, kind,
-            tool_calls, tool_results, status,
+            status,
            tokens_used, ctx_used, ctx_max, started_at, finished_at,
            clock_timestamp() + (
              ROW_NUMBER() OVER (ORDER BY created_at ASC, id ASC) * INTERVAL '1 microsecond'
@@ -307,10 +313,32 @@ export function registerChatRoutes(
            AND created_at <= ${target.created_at}::timestamptz
            AND status = 'complete'
        `;
        // v1.13.0: clone message_parts for the forked messages. Source and
        // destination preserve ordering (the INSERT above orders by created_at,
        // id) so a ROW_NUMBER pairing maps source.id → dest.id deterministically.
        await tx`
          WITH src AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY created_at ASC, id ASC) AS rn
            FROM messages
            WHERE chat_id = ${source.id}
              AND created_at <= ${target.created_at}::timestamptz
              AND status = 'complete'
          ),
          dst AS (
            SELECT id, ROW_NUMBER() OVER (ORDER BY created_at ASC, id ASC) AS rn
            FROM messages
            WHERE chat_id = ${chat!.id}
          )
          INSERT INTO message_parts (message_id, sequence, kind, payload)
          SELECT dst.id, p.sequence, p.kind, p.payload
          FROM message_parts p
          JOIN src ON p.message_id = src.id
          JOIN dst ON dst.rn = src.rn
        `;
        return chat!;
      });
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'chat_created',
        chat: newChat,
        session_id: source.session_id,
@@ -320,6 +348,77 @@ export function registerChatRoutes(
    }
  );
  // v1.12.3: explicit recovery from a stuck-streaming assistant row. The
  // frontend gates this behind a 60s no-token-activity timer; the server
  // re-checks the age and current status for safety. Non-streaming rows
  // return 409 (frontend race; idempotent retry is fine).
  app.post<{ Params: { id: string } }>(
    '/api/chats/:id/discard_stale',
    async (req, reply) => {
      const parsed = DiscardStaleBody.safeParse(req.body ?? {});
      if (!parsed.success) {
        reply.code(400);
        return { error: 'invalid body', details: parsed.error.flatten() };
      }
      const rows = await sql<{
        id: string;
        session_id: string;
        chat_id: string;
        status: string;
        age_seconds: number;
      }[]>`
        SELECT id, session_id, chat_id, status,
               EXTRACT(EPOCH FROM (clock_timestamp() - created_at))::int AS age_seconds
        FROM messages
        WHERE id = ${parsed.data.message_id} AND chat_id = ${req.params.id}
      `;
      if (rows.length === 0) {
        reply.code(404);
        return { error: 'message not found in chat' };
      }
      const msg = rows[0]!;
      if (msg.status !== 'streaming') {
        reply.code(409);
        return { error: 'message is no longer streaming', current_status: msg.status };
      }
      if (msg.age_seconds < STALE_MIN_AGE_SECONDS) {
        reply.code(409);
        return { error: 'message is not stale yet', age_seconds: msg.age_seconds };
      }
      const updated = await sql<{ id: string }[]>`
        UPDATE messages
        SET status = 'failed',
            content = COALESCE(content, ''),
            finished_at = clock_timestamp()
        WHERE id = ${msg.id} AND status = 'streaming'
        RETURNING id
      `;
      if (updated.length === 0) {
        // Race: the row flipped out of 'streaming' between our SELECT and UPDATE.
        reply.code(409);
        return { error: 'message status changed mid-request' };
      }
      // v1.13.20: re-fetch via messages_with_parts so the returned shape
      // carries parts-synthesized tool_calls / tool_results. The dropped
      // legacy columns can no longer be selected directly.
      const refreshed = await sql<Message[]>`
        SELECT * FROM messages_with_parts WHERE id = ${msg.id}
      `;
      broker.publishUserFrame('default', {
        type: 'chat_status',
        chat_id: msg.chat_id,
        status: 'idle',
        at: new Date().toISOString(),
      });
      broker.publishFrame(msg.session_id, {
        type: 'message_complete',
        message_id: msg.id,
        chat_id: msg.chat_id,
      });
      return refreshed[0];
    }
  );
  app.get<{ Params: { id: string } }>(
    '/api/chats/:id/messages',
    async (req, reply) => {
@@ -328,11 +427,12 @@ export function registerChatRoutes(
        reply.code(404);
        return { error: 'chat not found' };
      }
      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const rows = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE chat_id = ${req.params.id}
        ORDER BY created_at ASC, id ASC
      `;
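The fork-clone hunk above maps source message ids to destination ids by pairing rows on their ROW_NUMBER rank under the same (created_at, id) ordering. An in-memory analogue of that pairing, as a sketch: `Row`, `createdAt`, and `pairByRank` are illustrative names, and the numeric timestamp stands in for the timestamptz column.

```typescript
interface Row {
  id: string;
  createdAt: number; // stand-in for created_at, e.g. epoch microseconds
}

// Pair the n-th source row with the n-th destination row, where both sides
// are ranked by (createdAt ASC, id ASC), mirroring the two ROW_NUMBER CTEs.
function pairByRank(src: readonly Row[], dst: readonly Row[]): Map<string, string> {
  const byCreatedThenId = (a: Row, b: Row): number =>
    a.createdAt - b.createdAt || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0);
  const s = [...src].sort(byCreatedThenId);
  const d = [...dst].sort(byCreatedThenId);
  if (s.length !== d.length) {
    throw new Error('rank pairing requires the same number of rows on both sides');
  }
  return new Map(s.map((row, i): [string, string] => [row.id, d[i]!.id]));
}
```

One difference from the SQL is intentional: the inner JOIN on rn would silently drop unmatched ranks, whereas this sketch throws on a length mismatch. The pairing is only correct when the destination rows were inserted in source order, which the forked INSERT arranges by stamping created_at with a per-row microsecond offset.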
--- a/apps/server/src/routes/messages.ts
+++ b/apps/server/src/routes/messages.ts
@@ -1,7 +1,13 @@
 import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import type { Config } from '../config.js';
 import type { Broker } from '../services/broker.js';
 import type { Chat, Message, Session, ToolCall } from '../types/api.js';
 // v1.13.17-cross-repo-reads: grant_read_access resolves the grant root at
 // decision time (not at request time) so concurrent project changes don't
 // stale-bind the resolution.
 import { resolveGrantRoot } from '../services/grant_resolver.js';
 const SendBody = z.object({
  content: z.string().min(1).max(64_000),
@@ -47,6 +53,21 @@ const AskUserInputArgs = z.object({
    .max(3),
 });
 // v1.13.17-cross-repo-reads: grant decision body. tool_call_id is the
 // model-emitted id (e.g. "call_abc123"), not a UUID. decision is binary.
 const GrantReadAccessBody = z.object({
  tool_call_id: z.string().min(1),
  decision: z.enum(['allow', 'deny']),
 });
 // Same shape as services/request_read_access.ts RequestReadAccessInput.
 // Re-derived to avoid the services/tools.ts import (matches the
 // AskUserInputArgs pattern above).
 const RequestReadAccessArgs = z.object({
  path: z.string().min(1),
  reason: z.string().min(1).max(500),
 });
 interface MessageHandlers {
  enqueueInference: (sessionId: string, chatId: string, assistantMessageId: string, user: string) => void;
  // v1.11: returns a promise that resolves after compaction.process finishes
@@ -76,6 +97,8 @@ interface MessageHandlers {
 export function registerMessageRoutes(
  app: FastifyInstance,
  sql: Sql,
  config: Config,
  broker: Broker,
  handlers: MessageHandlers
 ): void {
  app.get<{ Params: { id: string } }>(
@@ -91,11 +114,12 @@ export function registerMessageRoutes(
      // SummaryCard) and shows compacted_at-stamped rows inline for context.
      // Internal inference assembly filters compacted_at IS NULL separately —
      // see services/inference.ts loadContext + services/compaction.ts.
      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const rows = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE session_id = ${req.params.id}
        ORDER BY created_at ASC, id ASC
      `;
@@ -469,30 +493,36 @@ export function registerMessageRoutes(
      const chat = chatRows[0]!;
      const sessionId = chat.session_id;
-      // Find the assistant message that emitted this tool_call. Scoped by
-      // chat_id + role to avoid cross-chat lookups; ordered by created_at DESC
-      // because the most recent issuance wins when an LLM reuses call IDs
-      // across turns (the older, already-answered one is a different row with
-      // populated tool_results downstream).
-      const callerRows = await sql<{ id: string; tool_calls: ToolCall[] | null }[]>`
-        SELECT id, tool_calls FROM messages
-        WHERE chat_id = ${chat.id}
-          AND role = 'assistant'
-          AND tool_calls IS NOT NULL
-        ORDER BY created_at DESC
+      // v1.13.1-C: find the assistant's tool_call by indexing message_parts
+      // directly on payload->>'id'. Scoped by chat_id + role via the JOIN.
+      // Pre-v1.13.0 history has no parts rows — those tool_calls become
+      // unreachable here (404). Acceptable per the dispatch decision: any
+      // pending elicitation from before v1.13.0 is long timed out by now;
+      // promote to a hotfix with a JSON-column fallback if it ever surfaces.
+      const callerRows = await sql<{
+        message_id: string;
+        payload: { id: string; name: string; args: Record<string, unknown> };
+      }[]>`
+        SELECT p.message_id, p.payload
+        FROM message_parts p
+        JOIN messages m ON m.id = p.message_id
+        WHERE m.chat_id = ${chat.id}
+          AND m.role = 'assistant'
+          AND p.kind = 'tool_call'
+          AND p.payload->>'id' = ${tool_call_id}
+        ORDER BY m.created_at DESC
        LIMIT 1
      `;
-      let foundCall: ToolCall | null = null;
-      for (const row of callerRows) {
-        const match = row.tool_calls?.find((tc) => tc.id === tool_call_id);
-        if (match) {
-          foundCall = match;
-          break;
-        }
-      }
-      if (!foundCall) {
+      const callerRow = callerRows[0];
+      if (!callerRow) {
        reply.code(404);
        return { error: 'unknown_tool_call_id' };
      }
+      const foundCall: ToolCall = {
+        id: callerRow.payload.id,
+        name: callerRow.payload.name,
+        args: callerRow.payload.args,
+      };
      if (foundCall.name !== 'ask_user_input') {
        reply.code(400);
        return { error: 'tool_call_not_ask_user_input' };
@@ -539,18 +569,21 @@ export function registerMessageRoutes(
        }
      }
-      // Find the pending tool row. ORDER BY created_at DESC + LIMIT 1 picks
-      // the most recent row with this tool_call_id; the already-answered
-      // check below guards against UPDATE-ing a stale answer.
+      // v1.13.1-C: find the pending tool row via message_parts on
+      // payload->>'tool_call_id'. Same fallback caveat as the caller lookup
+      // above — pre-v1.13.0 rows are unreachable here.
      const toolRows = await sql<{
-        id: string;
-        tool_results: { tool_call_id: string; output: unknown } | null;
+        message_id: string;
+        payload: { tool_call_id: string; output: unknown };
      }[]>`
-        SELECT id, tool_results FROM messages
-        WHERE chat_id = ${chat.id}
-          AND role = 'tool'
-          AND tool_results->>'tool_call_id' = ${tool_call_id}
-        ORDER BY created_at DESC
+        SELECT p.message_id, p.payload
+        FROM message_parts p
+        JOIN messages m ON m.id = p.message_id
+        WHERE m.chat_id = ${chat.id}
+          AND m.role = 'tool'
+          AND p.kind = 'tool_result'
+          AND p.payload->>'tool_call_id' = ${tool_call_id}
+        ORDER BY m.created_at DESC
        LIMIT 1
      `;
      const toolRow = toolRows[0];
@@ -558,7 +591,7 @@ export function registerMessageRoutes(
        reply.code(404);
        return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
      }
-      if (toolRow.tool_results && toolRow.tool_results.output !== null) {
+      if (toolRow.payload && toolRow.payload.output !== null) {
        reply.code(409);
        return { error: 'tool_call_already_answered' };
      }
@@ -570,11 +603,17 @@ export function registerMessageRoutes(
        truncated: false,
      };
      const toolMessageId = toolRow.message_id;
      const result = await sql.begin(async (tx) => {
        // v1.13.20: parts-only. Replace the pending tool_result part inserted
        // at message creation (tool-phase.ts) with the answered one. Delete-
        // then-insert is simpler than UPDATE because parts are append-style
        // elsewhere; the UNIQUE (message_id, sequence) constraint blocks
        // plain insert.
        await tx`DELETE FROM message_parts WHERE message_id = ${toolMessageId} AND kind = 'tool_result'`;
        await tx`
-          UPDATE messages
+          INSERT INTO message_parts (message_id, sequence, kind, payload)
-          SET tool_results = ${tx.json(newToolResults as never)}
+          VALUES (${toolMessageId}, 0, 'tool_result', ${tx.json(newToolResults as never)})
-          WHERE id = ${toolRow.id}
        `;
        const [assistantMsg] = await tx<{ id: string }[]>`
          INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
@@ -584,7 +623,7 @@ export function registerMessageRoutes(
        await tx`UPDATE sessions SET updated_at = clock_timestamp() WHERE id = ${sessionId}`;
        await tx`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chat.id}`;
        return {
-          tool_message_id: toolRow.id,
+          tool_message_id: toolMessageId,
          assistant_message_id: assistantMsg!.id,
        };
      });
@@ -606,4 +645,230 @@ export function registerMessageRoutes(
      return result;
    },
  );
  // v1.13.17-cross-repo-reads: resume an awaiting-grant pause. Mirror shape
  // of /answer_user_input (validate, look up via message_parts, UPDATE,
  // publish, enqueue). Differences vs /answer_user_input:
  //   - On 'allow', re-resolves the grant root via grant_resolver (state
  //     may have changed since the prompt fired — concurrent project add,
  //     etc.). Resolution failure auto-falls to a denial with reason text
  //     rather than 500ing.
  //   - On 'allow' with a valid root, appends to sessions.allowed_read_paths
  //     (deduplicated) inside the same transaction.
  //   - On success, also publishes session_updated so an open SettingsPane
  //     refetches the new grant list.
  // Error codes follow /answer:
  //   400 invalid_body / mismatched_answer_shape (bad args on the tool_call)
  //       / tool_call_not_request_read_access
  //   404 chat_not_found / unknown_tool_call_id / session_not_found
  //   409 tool_call_already_answered
  app.post<{ Params: { id: string } }>(
    '/api/chats/:id/grant_read_access',
    async (req, reply) => {
      const parsed = GrantReadAccessBody.safeParse(req.body);
      if (!parsed.success) {
        reply.code(400);
        return { error: 'invalid_body', details: parsed.error.flatten() };
      }
      const { tool_call_id, decision } = parsed.data;
      const chatRows = await sql<Chat[]>`
        SELECT id, session_id FROM chats WHERE id = ${req.params.id} AND status = 'open'
      `;
      if (chatRows.length === 0) {
        reply.code(404);
        return { error: 'chat_not_found' };
      }
      const chat = chatRows[0]!;
      const sessionId = chat.session_id;
      // Mirror the /answer lookup: assistant tool_call by id via message_parts.
      const callerRows = await sql<{
        message_id: string;
        payload: { id: string; name: string; args: Record<string, unknown> };
      }[]>`
        SELECT p.message_id, p.payload
        FROM message_parts p
        JOIN messages m ON m.id = p.message_id
        WHERE m.chat_id = ${chat.id}
          AND m.role = 'assistant'
          AND p.kind = 'tool_call'
          AND p.payload->>'id' = ${tool_call_id}
        ORDER BY m.created_at DESC
        LIMIT 1
      `;
      const callerRow = callerRows[0];
      if (!callerRow) {
        reply.code(404);
        return { error: 'unknown_tool_call_id' };
      }
      const foundCall: ToolCall = {
        id: callerRow.payload.id,
        name: callerRow.payload.name,
        args: callerRow.payload.args,
      };
      if (foundCall.name !== 'request_read_access') {
        reply.code(400);
        return { error: 'tool_call_not_request_read_access' };
      }
      const argsParsed = RequestReadAccessArgs.safeParse(foundCall.args);
      if (!argsParsed.success) {
        reply.code(400);
        return { error: 'mismatched_answer_shape', detail: 'tool_call args invalid' };
      }
      const requestedPath = argsParsed.data.path;
      // Find the pending tool row.
      const toolRows = await sql<{
        message_id: string;
        payload: { tool_call_id: string; output: unknown };
      }[]>`
        SELECT p.message_id, p.payload
        FROM message_parts p
        JOIN messages m ON m.id = p.message_id
        WHERE m.chat_id = ${chat.id}
          AND m.role = 'tool'
          AND p.kind = 'tool_result'
          AND p.payload->>'tool_call_id' = ${tool_call_id}
        ORDER BY m.created_at DESC
        LIMIT 1
      `;
      const toolRow = toolRows[0];
      if (!toolRow) {
        reply.code(404);
        return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
      }
      if (toolRow.payload && toolRow.payload.output !== null) {
        reply.code(409);
        return { error: 'tool_call_already_answered' };
      }
      // Look up session + project so we can re-resolve the grant root and
      // append to allowed_read_paths atomically. We don't need agent or
      // history here — just the project path for the resolver.
      const sessionRows = await sql<{
        id: string;
        project_id: string;
        allowed_read_paths: string[];
        project_path: string;
      }[]>`
        SELECT s.id, s.project_id, s.allowed_read_paths, p.path AS project_path
        FROM sessions s
        JOIN projects p ON p.id = s.project_id
        WHERE s.id = ${sessionId}
      `;
      const sessionRow = sessionRows[0];
      if (!sessionRow) {
        reply.code(404);
        return { error: 'session_not_found' };
      }
      // Decision branch. 'deny' is the easy path: nothing to resolve or
      // persist. 'allow' resolves the grant root; if resolution fails (e.g.
      // path was deleted, project removed since prompt) the tool gets a
      // denial with the resolver's reason text instead of a 500.
      let resultOutput: string;
      let grantRoot: string | null = null;
      if (decision === 'allow') {
        const resolution = await resolveGrantRoot(
          sql,
          requestedPath,
          sessionRow.project_path,
          config.PROJECT_ROOT_WHITELIST,
        );
        if (!resolution.ok) {
          resultOutput = `denied: ${resolution.reason}`;
        } else {
          grantRoot = resolution.root;
          resultOutput = `granted: ${grantRoot}`;
        }
      } else {
        resultOutput = 'denied';
      }
      const newToolResults = {
        tool_call_id,
        output: resultOutput,
        truncated: false,
      };
      const toolMessageId = toolRow.message_id;
      const dbResult = await sql.begin(async (tx) => {
        // v1.13.20: parts-only. Same delete+insert dance as /answer: the
        // UNIQUE (message_id, sequence) constraint blocks a plain INSERT
        // over the pending part.
        await tx`DELETE FROM message_parts WHERE message_id = ${toolMessageId} AND kind = 'tool_result'`;
        await tx`
          INSERT INTO message_parts (message_id, sequence, kind, payload)
          VALUES (${toolMessageId}, 0, 'tool_result', ${tx.json(newToolResults as never)})
        `;
        // Persist the grant if we have one. ARRAY-level dedup — append only
        // when the root isn't already present. The session row gets
        // touched (updated_at) so the post-update publish below has a
        // fresh timestamp.
        let allowedRootsAfter = sessionRow.allowed_read_paths;
        if (grantRoot !== null) {
          if (!sessionRow.allowed_read_paths.includes(grantRoot)) {
            const updated = await tx<{ allowed_read_paths: string[] }[]>`
              UPDATE sessions
              SET allowed_read_paths = array_append(allowed_read_paths, ${grantRoot}),
                  updated_at = clock_timestamp()
              WHERE id = ${sessionId}
              RETURNING allowed_read_paths
            `;
            allowedRootsAfter = updated[0]?.allowed_read_paths ?? sessionRow.allowed_read_paths;
          } else {
            // Already present — touch updated_at so any open settings
            // panel still picks up the no-op via session_updated.
            await tx`UPDATE sessions SET updated_at = clock_timestamp() WHERE id = ${sessionId}`;
          }
        }
        const [assistantMsg] = await tx<{ id: string }[]>`
          INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
          VALUES (${sessionId}, ${chat.id}, 'assistant', '', 'streaming', clock_timestamp())
          RETURNING id
        `;
        await tx`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chat.id}`;
        return {
          tool_message_id: toolMessageId,
          assistant_message_id: assistantMsg!.id,
          allowed_roots_after: allowedRootsAfter,
        };
      });
      // Publish the deferred tool_result frame so the pending card flips to
      // its answered view without a refetch.
      handlers.publishSessionFrame(sessionId, {
        type: 'tool_result',
        tool_message_id: dbResult.tool_message_id,
        tool_call_id,
        chat_id: chat.id,
        output: resultOutput,
        truncated: false,
      });
      // session_updated nudge so any open SettingsPane refetches and sees
      // the new allowed_read_paths. We publish on the user channel to match
      // the existing PATCH /api/sessions/:id behavior — frontend refetches
      // via api.sessions.get on receipt.
      const nowIso = new Date().toISOString();
      broker.publishUserFrame('default', {
        type: 'session_updated',
        session_id: sessionId,
        project_id: sessionRow.project_id,
        // session name doesn't change on grant; we look it up fresh to
        // avoid carrying stale state if a rename raced us.
        name:
          (
            await sql<{ name: string }[]>`SELECT name FROM sessions WHERE id = ${sessionId}`
          )[0]?.name ?? '',
        updated_at: nowIso,
      });
      handlers.enqueueInference(sessionId, chat.id, dbResult.assistant_message_id, 'default');
      reply.code(202);
      return {
        tool_message_id: dbResult.tool_message_id,
        assistant_message_id: dbResult.assistant_message_id,
        allowed_read_paths: dbResult.allowed_roots_after,
      };
    },
  );
 }
--- a/apps/server/src/routes/projects.ts
+++ b/apps/server/src/routes/projects.ts
@@ -129,7 +129,7 @@ export function registerProjectRoutes(
        RETURNING id, name, path, added_at, last_session_id, status, gitea_remote,
                  default_system_prompt, default_web_search_enabled
      `;
-      broker.publishUser('default', { type: 'project_created', project: row as unknown as Project });
+      broker.publishUserFrame('default', { type: 'project_created', project: row as unknown as Project });
      reply.code(201);
      return {
        project: row,
@@ -186,11 +186,11 @@ export function registerProjectRoutes(
    `;
    if (existing.length === 0) {
-      broker.publishUser('default', { type: 'project_created', project: row as unknown as Project });
+      broker.publishUserFrame('default', { type: 'project_created', project: row as unknown as Project });
      reply.code(201);
    } else {
      // existing.status was 'archived' — row has been restored.
-      broker.publishUser('default', { type: 'project_unarchived', project: row as unknown as Project });
+      broker.publishUserFrame('default', { type: 'project_unarchived', project: row as unknown as Project });
      reply.code(200);
    }
    return row;
@@ -243,7 +243,7 @@ export function registerProjectRoutes(
    // v1.9: the project_updated frame still only carries id + name. Clients
    // that need the new fields refetch via api.projects.list() — keeps the
    // frame payload lean, per the locked recon decision (d).
-    broker.publishUser('default', {
+    broker.publishUserFrame('default', {
      type: 'project_updated',
      project_id: project.id,
      name: project.name,
@@ -260,7 +260,7 @@ export function registerProjectRoutes(
      reply.code(404);
      return { error: 'not found or already archived' };
    }
-    broker.publishUser('default', { type: 'project_archived', project_id: req.params.id });
+    broker.publishUserFrame('default', { type: 'project_archived', project_id: req.params.id });
    reply.code(204);
    return null;
  });
@@ -277,7 +277,7 @@ export function registerProjectRoutes(
      return { error: 'not found or not archived' };
    }
    const project = rows[0]!;
-    broker.publishUser('default', { type: 'project_unarchived', project });
+    broker.publishUserFrame('default', { type: 'project_unarchived', project });
    return project;
  });
@@ -288,7 +288,7 @@ export function registerProjectRoutes(
      reply.code(404);
      return { error: 'not found' };
    }
-    broker.publishUser('default', { type: 'project_deleted', project_id: id });
+    broker.publishUserFrame('default', { type: 'project_deleted', project_id: id });
    reply.code(204);
    return null;
  });
--- a/apps/server/src/routes/sessions.ts
+++ b/apps/server/src/routes/sessions.ts
@@ -13,6 +13,43 @@ const CreateBody = z.object({
  agent_id: z.string().min(1).max(200).nullable().optional(),
 });
 // v1.14.x-html-artifact-panes: 'markdown_artifact' + 'html_artifact' added
 // as pane kinds. Pane state is a reference only (chat_id + message_id +
 // title) — the actual artifact body is fetched from the message row or
 // message_parts.payload by the pane component on mount.
 const MarkdownArtifactStateZ = z.object({
  chat_id: z.string().min(1).max(200),
  message_id: z.string().min(1).max(200),
  title: z.string().max(500),
 });
 const HtmlArtifactStateZ = z.object({
  chat_id: z.string().min(1).max(200),
  message_id: z.string().min(1).max(200),
  title: z.string().max(500),
 });
 const WorkspacePaneZ = z.object({
  id: z.string().min(1).max(200),
  kind: z.enum([
    'chat',
    'terminal',
    'agent',
    'empty',
    'settings',
    'markdown_artifact',
    'html_artifact',
  ]),
  chatId: z.string().min(1).max(200).optional(),
  chatIds: z.array(z.string().min(1).max(200)).max(50),
  activeChatIdx: z.number().int(),
  markdown_artifact_state: MarkdownArtifactStateZ.optional(),
  html_artifact_state: HtmlArtifactStateZ.optional(),
 });
 const WorkspacePanesBody = z.object({
  workspace_panes: z.array(WorkspacePaneZ).max(10),
 });
 const PatchBody = z.object({
  name: z.string().min(1).max(200).optional(),
  model: z.string().min(1).max(200).optional(),
@@ -20,6 +57,29 @@ const PatchBody = z.object({
  agent_id: z.string().min(1).max(200).nullable().optional(),
  // v1.9: null = inherit from project default; true/false = explicit override.
  web_search_enabled: z.boolean().nullable().optional(),
  // v1.13.17-cross-repo-reads: revocation pathway. PATCH with a shortened
  // list deletes entries; the grant flow itself APPENDS via the separate
  // grant_read_access endpoint, never via this PATCH. Frontend treats this
  // as "send the new whole array". Per-entry shape validation: must be
  // absolute, no NUL, no `/..` traversal segment. Server doesn't re-validate
  // whitelist membership on PATCH — entries already in the array were
  // placed there by the grant endpoint after a full whitelist+repo-shape
  // check. THE SUBSET CHECK (every entry must already be in the current
  // array) is enforced at runtime in the PATCH handler below, NOT in this
  // zod refinement, because the refinement has no access to the existing
  // session row.
  allowed_read_paths: z
    .array(
      z
        .string()
        .min(1)
        .max(1024)
        .refine((p) => p.startsWith('/') && !p.includes('\0') && !p.includes('/..'), {
          message: 'must be an absolute path without traversal markers',
        }),
    )
    .max(64)
    .optional(),
 });
 async function resolveDefaultModel(sql: Sql, config: Config): Promise<string> {
@@ -28,6 +88,19 @@ async function resolveDefaultModel(sql: Sql, config: Config): Promise<string> {
  return config.DEFAULT_MODEL;
 }
 // v1.13.17-cross-repo-reads: subset enforcement for PATCH allowed_read_paths.
 // The PATCH route can only SHRINK the array; growth happens exclusively via
 // POST /api/chats/:id/grant_read_access (which requires user consent).
 // Returns the list of disallowed-additions; an empty list means the request
 // is a valid shrink-or-no-op. Exported for the unit test.
 export function findUnauthorizedAdditions(
  prior: readonly string[],
  requested: readonly string[],
 ): string[] {
  const priorSet = new Set(prior);
  return requested.filter((p) => !priorSet.has(p));
 }
 export function registerSessionRoutes(
  app: FastifyInstance,
  sql: Sql,
@@ -44,7 +117,7 @@ export function registerSessionRoutes(
      }
      const status = req.query.status === 'archived' ? 'archived' : 'open';
      const rows = await sql<Session[]>`
-        SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+        SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes, allowed_read_paths
        FROM sessions
        WHERE project_id = ${req.params.id} AND status = ${status}
        ORDER BY updated_at DESC
@@ -92,7 +165,7 @@ export function registerSessionRoutes(
        const [session] = await tx<Session[]>`
          INSERT INTO sessions (project_id, name, model, system_prompt, agent_id)
          VALUES (${req.params.id}, ${name}, ${model}, ${systemPrompt}, ${agentId})
-          RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+          RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
        `;
        await tx`
          INSERT INTO chats (session_id, name, status)
@@ -100,7 +173,7 @@ export function registerSessionRoutes(
        `;
        return session!;
      });
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'session_created',
        session: row,
        project_id: row.project_id,
@@ -112,7 +185,7 @@ export function registerSessionRoutes(
  app.get<{ Params: { id: string } }>('/api/sessions/:id', async (req, reply) => {
    const rows = await sql<Session[]>`
-      SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+      SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes, allowed_read_paths
      FROM sessions WHERE id = ${req.params.id}
    `;
    if (rows.length === 0) {
@@ -138,15 +211,53 @@ export function registerSessionRoutes(
      const newAgentId = parsed.data.agent_id ?? null;
      const wseProvided = parsed.data.web_search_enabled !== undefined;
      const newWse = parsed.data.web_search_enabled ?? null;
-      // Read the prior name so the post-update publish can skip no-op renames
+      // v1.13.17-cross-repo-reads: tri-state on the wire (undefined = no
-      // (PATCH { name: "Foo" } where the session is already "Foo"). The window
+      // change, [] = clear). Frontend currently uses this PATCH only for
-      // between SELECT and UPDATE is sub-millisecond in the same request handler;
+      // revocation (delete a single entry from the existing array, send
-      // a concurrent rename in that gap would just mean one stale publish, which
+      // shortened result). Append-style grants go through the dedicated
-      // existing clients dedup by id.
+      // grant_read_access endpoint inside the inference loop.
-      const before = await sql<{ name: string }[]>`
+      const arpProvided = parsed.data.allowed_read_paths !== undefined;
-        SELECT name FROM sessions WHERE id = ${req.params.id}
+      const newArp = parsed.data.allowed_read_paths ?? [];
      // Read the prior name + grants so the post-update publish can skip no-op
      // renames (PATCH { name: "Foo" } where the session is already "Foo") AND
      // so the subset check below has the current grant list to compare against.
      // The window between SELECT and UPDATE is sub-millisecond in the same
      // request handler; a concurrent rename in that gap would just mean one
      // stale publish, which existing clients dedup by id.
      const before = await sql<{ name: string; allowed_read_paths: string[] }[]>`
        SELECT name, allowed_read_paths FROM sessions WHERE id = ${req.params.id}
      `;
      const priorName = before[0]?.name;
      const priorArp = before[0]?.allowed_read_paths ?? [];
      // v1.13.17-cross-repo-reads: subset enforcement. The grant flow is the
      // ONLY path that can add entries to allowed_read_paths — PATCH can only
      // shrink the array, never grow it. Without this guard, a malicious
      // client could PATCH {"allowed_read_paths":["/etc"]} and bypass the
      // user-consent prompt entirely. Sam flagged this in the v1.13.17
      // compliance review (2026-05-22).
      // Race note: a concurrent grant landing between this SELECT and the
      // UPDATE below would briefly make a "shouldn't-have-been-valid" PATCH
      // succeed (the newly-granted root sneaks in). Inverse race — a
      // legitimate revoke happening alongside a concurrent grant — could
      // briefly reject the revoke; the user retries. Both are acceptable
      // given the single-user threat model + sub-millisecond window.
      if (arpProvided) {
        const extras = findUnauthorizedAdditions(priorArp, newArp);
        if (extras.length > 0) {
          reply.code(400);
          return {
            error: 'invalid body',
            details: {
              fieldErrors: {
                allowed_read_paths: [
                  `entries must already be granted; cannot add via PATCH: ${extras.join(', ')}`,
                ],
              },
            },
          };
        }
      }
      const rows = await sql<Session[]>`
        UPDATE sessions
        SET
@@ -155,10 +266,11 @@ export function registerSessionRoutes(
          system_prompt = COALESCE(${system_prompt ?? null}, system_prompt),
          agent_id = CASE WHEN ${agentIdProvided} THEN ${newAgentId} ELSE agent_id END,
          web_search_enabled = CASE WHEN ${wseProvided} THEN ${newWse} ELSE web_search_enabled END,
          allowed_read_paths = CASE WHEN ${arpProvided} THEN ${sql.array(newArp, 25)} ELSE allowed_read_paths END,
          updated_at = clock_timestamp()
        WHERE id = ${req.params.id}
        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
-                  agent_id, web_search_enabled
+                  agent_id, web_search_enabled, workspace_panes, allowed_read_paths
      `;
      if (rows.length === 0) {
        reply.code(404);
@@ -166,7 +278,7 @@ export function registerSessionRoutes(
      }
      const session = rows[0]!;
      if (name !== undefined && session.name !== priorName) {
-        broker.publishUser('default', {
+        broker.publishUserFrame('default', {
          type: 'session_renamed',
          session_id: session.id,
          name: session.name,
@@ -176,7 +288,7 @@ export function registerSessionRoutes(
      // (notably the SettingsPane open in another tab) can refetch and pick
      // up the new fields. Frame stays lean (decision d) — payload is just
      // ids + name + updated_at, the client refetches via api.sessions.get.
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'session_updated',
        session_id: session.id,
        project_id: session.project_id,
@@ -187,6 +299,36 @@ export function registerSessionRoutes(
    }
  );
  app.patch<{ Params: { id: string } }>(
    '/api/sessions/:id/workspace',
    async (req, reply) => {
      const parsed = WorkspacePanesBody.safeParse(req.body);
      if (!parsed.success) {
        reply.code(400);
        return { error: 'invalid body', details: parsed.error.flatten() };
      }
      const rows = await sql<Session[]>`
        UPDATE sessions
        SET workspace_panes = ${sql.json(parsed.data.workspace_panes as never)},
            updated_at = clock_timestamp()
        WHERE id = ${req.params.id}
        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
                  agent_id, web_search_enabled, workspace_panes, allowed_read_paths
      `;
      if (rows.length === 0) {
        reply.code(404);
        return { error: 'session not found' };
      }
      const session = rows[0]!;
      broker.publishUserFrame('default', {
        type: 'session_workspace_updated',
        session_id: session.id,
        workspace_panes: session.workspace_panes,
      });
      return session;
    }
  );
  // v1.9: bulk-archive every open session in a project. Mirrors the
  // single-archive shape (same broker frame type) so the existing useSidebar
  // reducer cases handle it without changes — just N frames instead of 1.
@@ -206,7 +348,7 @@ export function registerSessionRoutes(
      `;
      const ids = rows.map((r) => r.id);
      for (const id of ids) {
-        broker.publishUser('default', {
+        broker.publishUserFrame('default', {
          type: 'session_archived',
          session_id: id,
          project_id: req.params.id,
@@ -247,7 +389,7 @@ export function registerSessionRoutes(
        reply.code(404);
        return { error: 'session not found or already archived' };
      }
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'session_archived',
        session_id: rows[0]!.id,
        project_id: rows[0]!.project_id,
@@ -263,14 +405,14 @@ export function registerSessionRoutes(
      const rows = await sql<Session[]>`
        UPDATE sessions SET status = 'open', updated_at = clock_timestamp()
        WHERE id = ${req.params.id} AND status = 'archived'
-        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
      `;
      if (rows.length === 0) {
        reply.code(404);
        return { error: 'session not found or not archived' };
      }
      const session = rows[0]!;
-      broker.publishUser('default', {
+      broker.publishUserFrame('default', {
        type: 'session_created',
        session: session,
        project_id: session.project_id,
@@ -292,7 +434,7 @@ export function registerSessionRoutes(
        return { error: 'not found' };
      }
      const project_id = deleted[0]!.project_id;
-      broker.publishUser('default', { type: 'session_deleted', session_id: id, project_id });
+      broker.publishUserFrame('default', { type: 'session_deleted', session_id: id, project_id });
      reply.code(204);
      return null;
    }
--- a/apps/server/src/routes/skills.ts
+++ b/apps/server/src/routes/skills.ts
@@ -86,15 +86,30 @@ export function registerSkillsRoutes(
      const result = await sql.begin(async (tx) => {
        const [synthAssistant] = await tx<{ id: string }[]>`
-          INSERT INTO messages (session_id, chat_id, role, content, tool_calls, status, created_at)
+          INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-          VALUES (${sessionId}, ${chat.id}, 'assistant', '', ${sql.json(toolCalls as never)}, 'complete', clock_timestamp())
+          VALUES (${sessionId}, ${chat.id}, 'assistant', '', 'complete', clock_timestamp())
          RETURNING id
        `;
        // v1.13.20: parts-only write. Single skill_use tool_call, no text
        // content, so one part at seq 0.
        await tx`
          INSERT INTO message_parts (message_id, sequence, kind, payload)
          VALUES (${synthAssistant!.id}, 0, 'tool_call', ${tx.json({
            id: toolCallId,
            name: 'skill_use',
            args: { name: skill_name },
          } as never)})
        `;
        const [toolMsg] = await tx<{ id: string }[]>`
-          INSERT INTO messages (session_id, chat_id, role, content, tool_results, status, created_at)
+          INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-          VALUES (${sessionId}, ${chat.id}, 'tool', '', ${sql.json(toolResults as never)}, 'complete', clock_timestamp())
+          VALUES (${sessionId}, ${chat.id}, 'tool', '', 'complete', clock_timestamp())
          RETURNING id
        `;
        // v1.13.20: parts-only write of the synthetic tool result (skill body).
        await tx`
          INSERT INTO message_parts (message_id, sequence, kind, payload)
          VALUES (${toolMsg!.id}, 0, 'tool_result', ${tx.json(toolResults as never)})
        `;
        const [userMsg] = await tx<{ id: string }[]>`
          INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
          VALUES (${sessionId}, ${chat.id}, 'user', ${userText}, 'complete', clock_timestamp())
--- a/apps/server/src/routes/tools.ts
+++ b/apps/server/src/routes/tools.ts
@@ -0,0 +1,40 @@
 import type { FastifyInstance } from 'fastify';
 import type { Sql } from '../db.js';
 export interface ToolCostStat {
  tool_name: string;
  mean_prompt_tokens: number;
  mean_completion_tokens: number;
  n_calls: number;
  updated_at: string;
 }
 // v1.13.10: per-tool token cost rolling window read endpoint. Backed by the
 // tool_cost_stats view in schema.sql (last 100 calls per tool, equal-split
 // attribution across multi-tool turns, sentinel/failed-turn excluded).
 // Consumed by AgentPicker for at-a-glance per-agent cost hints.
 export function registerToolsRoutes(app: FastifyInstance, sql: Sql): void {
  app.get('/api/tools/cost_stats', async () => {
    const rows = await sql<
      {
        tool_name: string;
        prompt_tokens_sum: number;
        completion_tokens_sum: number;
        n_calls: number;
        updated_at: string;
      }[]
    >`
      SELECT tool_name, prompt_tokens_sum, completion_tokens_sum, n_calls, updated_at
      FROM tool_cost_stats
      ORDER BY tool_name ASC
    `;
    const stats: ToolCostStat[] = rows.map((r) => ({
      tool_name: r.tool_name,
      mean_prompt_tokens: Math.round(r.prompt_tokens_sum / r.n_calls),
      mean_completion_tokens: Math.round(r.completion_tokens_sum / r.n_calls),
      n_calls: r.n_calls,
      updated_at: r.updated_at,
    }));
    return { stats };
  });
 }
--- a/apps/server/src/routes/ws.ts
+++ b/apps/server/src/routes/ws.ts
@@ -23,11 +23,12 @@ export function registerWebSocket(
      // v1.11: snapshot includes compaction fields so MessageBubble can
      // render the SummaryCard for summary=true rows on first connect.
      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const messages = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE session_id = ${sessionId}
        ORDER BY created_at ASC, id ASC
      `;
--- a/apps/server/src/schema.sql
+++ b/apps/server/src/schema.sql
@@ -1,3 +1,10 @@
 -- v1.13.3: statement_timeout is set at database level via:
 --   ALTER DATABASE boocode SET statement_timeout = '30s';
 -- ALTER DATABASE can't run inside a DO block, so this is an operational
 -- step rather than schema. Re-apply after a volume reset (the setting
 -- lives in pg_db_role_setting, which survives `docker compose up --build` but NOT a
 -- `docker volume rm boocode_pgdata`).
 CREATE TABLE IF NOT EXISTS projects (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name TEXT NOT NULL,
@@ -32,6 +39,162 @@ CREATE TABLE IF NOT EXISTS messages (
 CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, created_at);
 -- v1.13.0: granular message parts table for AI SDK migration. Old
 -- messages.content / tool_calls / tool_results columns stay authoritative
 -- for reads in v1.13.0; this table is dual-written so the swap can happen
 -- in a later dispatch without a backfill window. ON DELETE CASCADE means
 -- removing a message removes its parts in one go.
 CREATE TABLE IF NOT EXISTS message_parts (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  message_id uuid NOT NULL REFERENCES messages(id) ON DELETE CASCADE,
  sequence int NOT NULL,
  kind text NOT NULL,
  payload jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT clock_timestamp(),
  CONSTRAINT message_parts_kind_chk CHECK (kind IN ('text', 'tool_call', 'tool_result', 'reasoning', 'step_start', 'synthesis', 'html_artifact')),
  CONSTRAINT message_parts_seq_uniq UNIQUE (message_id, sequence)
 );
 CREATE INDEX IF NOT EXISTS message_parts_msg_seq_idx ON message_parts (message_id, sequence);
 -- v1.13.4: prune support. hidden_at marks parts that have been pruned out
 -- of the model payload by the two-tier compaction prune (services/inference/
 -- prune.ts). Rows stay in the DB so frontend can still display them with a
 -- "hidden" indicator (out of scope this dispatch). messages_with_parts
 -- view filters these out — see below. Partial index speeds the common
 -- "visible parts only" filter.
 DO $$
 BEGIN
  IF NOT EXISTS (
    SELECT 1 FROM information_schema.columns
    WHERE table_name = 'message_parts' AND column_name = 'hidden_at'
  ) THEN
    ALTER TABLE message_parts ADD COLUMN hidden_at timestamptz NULL;
  END IF;
 END $$;
 CREATE INDEX IF NOT EXISTS message_parts_hidden_idx
  ON message_parts (message_id) WHERE hidden_at IS NULL;
 -- v1.13.13: extend message_parts.kind to allow 'synthesis'. Existing DBs were
 -- created with the pre-v1.13.13 CHECK constraint that did NOT include
 -- 'synthesis'; drop + re-add the constraint with the extended enum. Fresh
 -- installs hit the inline constraint above (already updated) and skip this
 -- block via the pg_constraint guard.
 -- v1.14.x-html-artifact-panes: extend the same constraint with 'html_artifact'.
 -- DROP IF EXISTS + DO $$ pg_constraint $$ guard remains idempotent across
 -- both v1.13.13 and v1.14.x boots; the IN list below is the union of every
 -- kind ever shipped.
 ALTER TABLE message_parts DROP CONSTRAINT IF EXISTS message_parts_kind_chk;
 DO $$
 BEGIN
  IF NOT EXISTS (
    SELECT 1 FROM pg_constraint WHERE conname = 'message_parts_kind_chk'
  ) THEN
    ALTER TABLE message_parts
      ADD CONSTRAINT message_parts_kind_chk
      CHECK (kind IN ('text', 'tool_call', 'tool_result', 'reasoning', 'step_start', 'synthesis', 'html_artifact'));
  END IF;
 END $$;
 -- v1.13.1-B: read-path view. Read sites SELECT FROM messages_with_parts
 -- instead of messages so tool_calls / tool_results / reasoning_parts come
 -- from the granular message_parts table.
 -- v1.13.20: post column-drop. The legacy COALESCE fallback over
 -- messages.tool_calls / messages.tool_results was removed because those
 -- columns no longer exist on the table (see the ALTER TABLE DROP COLUMN
 -- statements below). Writes continue to target `messages` directly — the
 -- view is read-only. Shapes match the in-memory ToolCall / ToolResult
 -- types: tool_calls is a jsonb array of {id, name, args}, tool_results is
 -- a single jsonb object {tool_call_id, output, truncated, error?}.
 -- reasoning_parts is consumed by the inference history fetch (payload.ts)
 -- for v1.13.1-C reasoning round-tripping. Not surfaced in external APIs.
 CREATE OR REPLACE VIEW messages_with_parts AS
 SELECT
  m.id, m.session_id, m.chat_id, m.role, m.content, m.kind, m.status,
  m.last_seq, m.tokens_used, m.ctx_used, m.ctx_max,
  m.started_at, m.finished_at, m.created_at, m.metadata,
  m.summary, m.tail_start_id, m.compacted_at,
  (SELECT jsonb_agg(p.payload ORDER BY p.sequence)
     FROM message_parts p
    WHERE p.message_id = m.id AND p.kind = 'tool_call' AND p.hidden_at IS NULL) AS tool_calls,
  (SELECT p.payload
     FROM message_parts p
    WHERE p.message_id = m.id AND p.kind = 'tool_result' AND p.hidden_at IS NULL
    ORDER BY p.sequence LIMIT 1) AS tool_results,
  (SELECT jsonb_agg(p.payload ORDER BY p.sequence)
     FROM message_parts p
    WHERE p.message_id = m.id AND p.kind = 'reasoning' AND p.hidden_at IS NULL) AS reasoning_parts
 FROM messages m;
 -- v1.13.20: drop legacy tool_calls/tool_results columns. Reads have routed
 -- through messages_with_parts since v1.13.1-B; dual-writes removed in this
 -- batch. The view above was simplified to remove COALESCE fallbacks before
 -- this drop (Postgres rejects column-drop on view-referenced columns).
 -- Idempotent via IF EXISTS.
 ALTER TABLE messages DROP COLUMN IF EXISTS tool_calls;
 ALTER TABLE messages DROP COLUMN IF EXISTS tool_results;
 -- v1.13.10: per-tool token cost rolling window. Derives from
 -- messages_with_parts (the v1.13.1-B read-path view, parts-only since
 -- v1.13.20), so tool_calls is synthesized from message_parts rather than
 -- the dropped legacy JSON column. No new write site: token counts
 -- already land via the existing tool-phase.ts:94-95 UPDATE.
 --
 -- Attribution model: equal split. A turn emitting N tool calls divides its
 -- prompt/completion tokens by N before attribution. See v1.13.10 dispatch
 -- brief for rationale + rejected alternatives.
 --
 -- Column mapping: messages.ctx_used = prompt (input), messages.tokens_used
 -- = completion (output). Non-obvious naming; pinned via canonical writes at
 -- tool-phase.ts:94-95 et al.
 --
 -- Filtering rationale:
 --   status='complete'                — exclude failed/cancelled (defense in
 --                                      depth; failed-path doesn't write
 --                                      tokens_used so they're filtered
 --                                      indirectly too).
 --   metadata->>'kind' exclusions     — exclude cap_hit / doom_loop sentinels
 --                                      (defense in depth; sentinels are
 --                                      role='system' with tool_calls=NULL
 --                                      so they're filtered indirectly too).
 --   experimental_repairToolCall      — no special handling; retries flow
 --                                      as normal next-turn tool_result
 --                                      errors and count naturally.
 --
 -- Rolling window: last 100 calls per tool_name, ordered by created_at DESC.
 -- Aggregate-on-read is microseconds at BooCode scale (single user, ~30
 -- tools, < 100 calls each). DROP VIEW + recreate to change window size.
 CREATE OR REPLACE VIEW tool_cost_stats AS
 WITH per_call AS (
  SELECT
    (tc->>'name')::text AS tool_name,
    (m.ctx_used::float / NULLIF(jsonb_array_length(m.tool_calls), 0)) AS prompt_tokens,
    (m.tokens_used::float / NULLIF(jsonb_array_length(m.tool_calls), 0)) AS completion_tokens,
    m.created_at,
    ROW_NUMBER() OVER (
      PARTITION BY (tc->>'name')::text
      ORDER BY m.created_at DESC
    ) AS rn
  FROM messages_with_parts m,
    LATERAL jsonb_array_elements(m.tool_calls) AS tc
  WHERE m.tool_calls IS NOT NULL
    AND jsonb_array_length(m.tool_calls) > 0
    AND m.tokens_used IS NOT NULL
    AND m.ctx_used IS NOT NULL
    AND m.status = 'complete'
    AND (m.metadata IS NULL
         OR m.metadata->>'kind' IS NULL
         OR m.metadata->>'kind' NOT IN ('cap_hit', 'doom_loop'))
 )
 SELECT
  tool_name,
  ROUND(SUM(prompt_tokens))::int AS prompt_tokens_sum,
  ROUND(SUM(completion_tokens))::int AS completion_tokens_sum,
  COUNT(*)::int AS n_calls,
  MAX(created_at) AS updated_at
 FROM per_call
 WHERE rn <= 100
 GROUP BY tool_name;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS tokens_used INTEGER;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_used INTEGER;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_max INTEGER;
@@ -47,22 +210,14 @@ CREATE TABLE IF NOT EXISTS settings (
 INSERT INTO settings (key, value) VALUES ('default_model', '"qwen3.6-35b-a3b-mxfp4"') ON CONFLICT (key) DO NOTHING;
--- DEPRECATED: client-side pane state as of v1.2-batch4. Table retained per
--- additive schema rule; no writes. Drop in a future destructive migration.
-CREATE TABLE IF NOT EXISTS session_panes (
-  id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-  session_id   UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
-  position     INTEGER NOT NULL,
-  kind         TEXT NOT NULL CHECK (kind IN ('chat', 'file_browser', 'terminal')),
-  state        JSONB NOT NULL DEFAULT '{}',
-  created_at   TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
-  UNIQUE (session_id, position)
-);
-CREATE INDEX IF NOT EXISTS idx_session_panes_session ON session_panes (session_id);
--- v1.4: backfill removed. Pane layout is client-side (localStorage) since v1.2-batch4.
--- The CREATE TABLE above is retained for additive-schema discipline; drop is a
--- future destructive migration.
+-- v1.12.1: deprecated session_panes table removed. Workspace pane state now
+-- lives in sessions.workspace_panes (jsonb), see below.
+DROP TABLE IF EXISTS session_panes;
+-- v1.12.1: server-side workspace pane layout, replaces localStorage so every
+-- device sees the same panes for a given session. Shape matches
+-- WorkspacePane[] from apps/server/src/types/api.ts.
 ALTER TABLE sessions ADD COLUMN IF NOT EXISTS workspace_panes JSONB NOT NULL DEFAULT '[]'::jsonb;
 -- v1.2: sessions.status (open | archived)
 ALTER TABLE sessions ADD COLUMN IF NOT EXISTS status TEXT NOT NULL DEFAULT 'open';
@@ -159,6 +314,16 @@ END $$;
 -- agent_id is the slugified agent name. NULL means "use BooCode defaults".
 ALTER TABLE sessions ADD COLUMN IF NOT EXISTS agent_id TEXT;
 -- v1.13.17-cross-repo-reads: session-scoped read grants for paths outside the
 -- session's primary project root. Populated only by the request_read_access
 -- tool's approve branch; revoked via PATCH /api/sessions/:id. Values are
 -- absolute paths to project roots OR repo-shaped dirs under
 -- PROJECT_ROOT_WHITELIST (default /opt). No CHECK constraint — validation
 -- happens at write time in services/grant_resolver.ts. Cleared automatically
 -- when the session row is deleted (no cascade needed; the column goes with it).
 ALTER TABLE sessions
  ADD COLUMN IF NOT EXISTS allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[];
 -- v1.8.2: per-message metadata for sentinels (cap-hit) and structured error
 -- reasons. JSONB so future kinds can extend without further schema churn.
 -- Shape for cap_hit:  { kind: 'cap_hit', used: number, limit: number,
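Reviewer note on the tool_cost_stats view above: the equal-split attribution and the 100-call rolling window can be easier to follow outside SQL. The sketch below is an illustrative in-memory mirror, not code from this repo. `TurnRow`, `ToolCostSum`, and `toolCostSums` are hypothetical names, and `Math.round` may differ from Postgres `ROUND` on exact ties.

```typescript
// Hypothetical in-memory mirror of the tool_cost_stats view (illustration only).
interface TurnRow {
  toolNames: string[]; // names taken from the turn's tool_calls array
  ctxUsed: number;     // prompt (input) tokens, i.e. messages.ctx_used
  tokensUsed: number;  // completion (output) tokens, i.e. messages.tokens_used
  createdAt: number;   // epoch millis
}

interface ToolCostSum {
  tool_name: string;
  prompt_tokens_sum: number;
  completion_tokens_sum: number;
  n_calls: number;
}

function toolCostSums(turns: TurnRow[], windowSize = 100): ToolCostSum[] {
  // Equal split: a turn with N tool calls gives each call 1/N of its tokens.
  // Turns with no tool calls contribute nothing (flatMap yields no rows).
  const perCall = turns.flatMap((t) =>
    t.toolNames.map((name) => ({
      name,
      prompt: t.ctxUsed / t.toolNames.length,
      completion: t.tokensUsed / t.toolNames.length,
      createdAt: t.createdAt,
    })),
  );

  // Rolling window: newest first, keep at most windowSize calls per tool.
  perCall.sort((a, b) => b.createdAt - a.createdAt);
  const byTool = new Map<string, { prompt: number; completion: number; n: number }>();
  for (const c of perCall) {
    const acc = byTool.get(c.name) ?? { prompt: 0, completion: 0, n: 0 };
    if (acc.n >= windowSize) continue; // older calls fall outside the window
    acc.prompt += c.prompt;
    acc.completion += c.completion;
    acc.n += 1;
    byTool.set(c.name, acc);
  }

  return [...byTool.entries()].map(([tool_name, a]) => ({
    tool_name,
    prompt_tokens_sum: Math.round(a.prompt),
    completion_tokens_sum: Math.round(a.completion),
    n_calls: a.n,
  }));
}
```

The route handler at the top of this diff then divides each sum by `n_calls` to get the per-tool means.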
--- a/apps/server/src/services/tests/artifacts.test.ts
+++ b/apps/server/src/services/tests/artifacts.test.ts
@@ -0,0 +1,261 @@
 import { mkdtemp, mkdir, readFile, rm, symlink } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { afterEach, beforeEach, describe, expect, it } from 'vitest';
 import {
  decideHtmlArtifactWrite,
  deriveHtmlSlug,
  deriveHtmlTitle,
  deriveMarkdownSlug,
  detectHtmlArtifact,
  HTML_ARTIFACT_MAX_BYTES,
  writeHtmlArtifact,
  writeMarkdownArtifact,
 } from '../artifacts.js';
 import { PathScopeError } from '../path_guard.js';
 describe('deriveMarkdownSlug', () => {
  it('uses the first # heading when present', () => {
    expect(deriveMarkdownSlug('# Hello World\n\nbody')).toBe('hello-world');
  });
  it('falls back to first 6 words', () => {
    const s = deriveMarkdownSlug('the quick brown fox jumps over the lazy dog');
    expect(s).toBe('the-quick-brown-fox-jumps-over');
  });
  it('returns "artifact" for empty input', () => {
    expect(deriveMarkdownSlug('')).toBe('artifact');
  });
  it('caps at 60 chars and lowercases', () => {
    const long = '# ' + 'A'.repeat(200);
    const s = deriveMarkdownSlug(long);
    expect(s.length).toBeLessThanOrEqual(60);
    expect(s).toMatch(/^[a-z0-9-]+$/);
  });
  it('strips trailing punctuation', () => {
    expect(deriveMarkdownSlug('# Hello, World!!!')).toBe('hello-world');
  });
 });
 describe('deriveHtmlSlug', () => {
  it('prefers payload.title when set', () => {
    expect(
      deriveHtmlSlug({ html_content: '<html></html>', title: 'My Title' }),
    ).toBe('my-title');
  });
  it('falls back to <title> tag', () => {
    expect(
      deriveHtmlSlug({
        html_content: '<html><head><title>Page Title</title></head></html>',
        title: null,
      }),
    ).toBe('page-title');
  });
  it('falls back to first <h1> when no <title>', () => {
    expect(
      deriveHtmlSlug({
        html_content: '<html><body><h1>Heading One</h1></body></html>',
        title: null,
      }),
    ).toBe('heading-one');
  });
  it('falls back to inner text words', () => {
    expect(
      deriveHtmlSlug({
        html_content: '<div>one two three four five six seven</div>',
        title: null,
      }),
    ).toBe('one-two-three-four-five-six');
  });
 });
 describe('deriveHtmlTitle', () => {
  it('returns <title> content', () => {
    expect(deriveHtmlTitle('<html><head><title>T</title></head></html>')).toBe('T');
  });
  it('falls back to <h1>', () => {
    expect(deriveHtmlTitle('<body><h1>H</h1></body>')).toBe('H');
  });
  it('falls back to first 80 chars of inner text', () => {
    const html = '<div>' + 'x '.repeat(100) + '</div>';
    const t = deriveHtmlTitle(html);
    expect(t).not.toBeNull();
    expect(t!.length).toBeLessThanOrEqual(80);
  });
  it('returns null for empty html', () => {
    expect(deriveHtmlTitle('')).toBeNull();
  });
 });
 describe('detectHtmlArtifact', () => {
  it('detects <!DOCTYPE html> prefix case-insensitively', () => {
    const html = '<!doctype HTML><html><body>x</body></html>';
    expect(detectHtmlArtifact(html)).toBe(html);
  });
  it('strips leading/trailing whitespace before matching', () => {
    const html = '\n\n<!DOCTYPE html>\n<html></html>\n';
    expect(detectHtmlArtifact(html)).toBe(html.trim());
  });
  it('detects fenced ```html block wrapping entire message', () => {
    const wrapped = '```html\n<!DOCTYPE html>\n<html></html>\n```';
    expect(detectHtmlArtifact(wrapped)).toContain('<!DOCTYPE html>');
  });
  it('rejects plain markdown', () => {
    expect(detectHtmlArtifact('# heading\n\nsome text')).toBeNull();
  });
  it('rejects message with prose before the doctype', () => {
    expect(
      detectHtmlArtifact('Here you go: <!DOCTYPE html><html></html>'),
    ).toBeNull();
  });
  it('rejects empty input', () => {
    expect(detectHtmlArtifact('')).toBeNull();
    expect(detectHtmlArtifact('   \n  ')).toBeNull();
  });
  it('rejects fenced block without doctype/<html>', () => {
    expect(detectHtmlArtifact('```html\n<div>x</div>\n```')).toBeNull();
  });
  it('accepts fenced block containing <html> tag (no doctype)', () => {
    const r = detectHtmlArtifact('```html\n<html><body>x</body></html>\n```');
    expect(r).toContain('<html>');
  });
 });
 describe('writeMarkdownArtifact / writeHtmlArtifact', () => {
  let projectRoot: string;
  beforeEach(async () => {
    projectRoot = await mkdtemp(join(tmpdir(), 'artifacts-test-'));
  });
  afterEach(async () => {
    await rm(projectRoot, { recursive: true, force: true });
  });
  it('writes a markdown artifact under .boocode/artifacts/', async () => {
    const result = await writeMarkdownArtifact(
      { content: '# Hello\n\nbody' },
      { projectId: 'pid', projectRoot },
    );
    expect(result.path).toMatch(/\.boocode\/artifacts\/hello-\d+\.md$/);
    expect(result.url).toMatch(/^\/api\/projects\/pid\/artifacts\/hello-\d+\.md$/);
    const written = await readFile(result.path, 'utf8');
    expect(written).toBe('# Hello\n\nbody');
  });
  it('writes an html artifact', async () => {
    const result = await writeHtmlArtifact(
      {
        html_content: '<!DOCTYPE html><html><head><title>X</title></head></html>',
        char_count: 56,
        title: 'X',
      },
      { projectId: 'pid', projectRoot },
    );
    expect(result.path).toMatch(/\.boocode\/artifacts\/x-\d+\.html$/);
    const written = await readFile(result.path, 'utf8');
    expect(written).toContain('<!DOCTYPE html>');
  });
  it('creates the artifacts directory if absent', async () => {
    // Confirm the writer mkdir-recursive's the artifacts dir on first call.
    const result = await writeMarkdownArtifact(
      { content: '# T' },
      { projectId: 'pid', projectRoot },
    );
    expect(result.path).toContain('.boocode/artifacts');
  });
 });
 describe('1MB cap behavior', () => {
  it('reports the correct byte threshold', () => {
    expect(HTML_ARTIFACT_MAX_BYTES).toBe(1_048_576);
  });
  it('exceeds threshold for oversize payload', () => {
    const oversize = '<!DOCTYPE html>' + 'A'.repeat(HTML_ARTIFACT_MAX_BYTES);
    expect(Buffer.byteLength(oversize, 'utf8')).toBeGreaterThan(
      HTML_ARTIFACT_MAX_BYTES,
    );
  });
  it('detectHtmlArtifact still returns content above the cap (cap is checked by caller)', () => {
    // Detection is content-shape; the cap check lives in finalizeCompletion
    // (error-handler.ts). This test pins that contract: the helper does not
    // silently drop oversize payloads on the floor.
    const big = '<!DOCTYPE html>' + 'x'.repeat(2_000_000);
    expect(detectHtmlArtifact(big)).not.toBeNull();
  });
 });
 describe('decideHtmlArtifactWrite', () => {
  // Pure helper extracted from finalizeCompletion's cap-skip branch. Pins
  // the warn-and-skip decision without mocking the full InferenceContext.
  it('returns write=true for payloads under the cap', () => {
    const html = '<!DOCTYPE html><html></html>';
    const decision = decideHtmlArtifactWrite(html);
    expect(decision.write).toBe(true);
    expect(decision.byteLen).toBe(Buffer.byteLength(html, 'utf8'));
  });
  it('returns write=false with cap_exceeded reason for oversize payloads', () => {
    const big = '<!DOCTYPE html>' + 'x'.repeat(HTML_ARTIFACT_MAX_BYTES);
    const decision = decideHtmlArtifactWrite(big);
    expect(decision.write).toBe(false);
    if (!decision.write) {
      expect(decision.reason).toBe('cap_exceeded');
      expect(decision.byteLen).toBeGreaterThan(HTML_ARTIFACT_MAX_BYTES);
    }
  });
  it('accepts payload exactly at the cap (boundary)', () => {
    // byteLen === cap should write; only strictly greater skips.
    const exact = 'x'.repeat(HTML_ARTIFACT_MAX_BYTES);
    const decision = decideHtmlArtifactWrite(exact);
    expect(decision.write).toBe(true);
    expect(decision.byteLen).toBe(HTML_ARTIFACT_MAX_BYTES);
  });
 });
 describe('symlink escape protection', () => {
  // Closes the gap where `.boocode/artifacts` is a symlink pointing
  // outside the project root. The lexical prefix check on the resolved
  // candidate path passes (it's under projectRoot textually), but the
  // post-mkdir realpath verification must catch the escape.
  let projectRoot: string;
  let outside: string;
  beforeEach(async () => {
    projectRoot = await mkdtemp(join(tmpdir(), 'artifacts-symlink-root-'));
    outside = await mkdtemp(join(tmpdir(), 'artifacts-symlink-outside-'));
  });
  afterEach(async () => {
    await rm(projectRoot, { recursive: true, force: true });
    await rm(outside, { recursive: true, force: true });
  });
  it('throws PathScopeError when .boocode/artifacts is a symlink to outside the project', async () => {
    // Create .boocode dir, then make `artifacts` a symlink pointing outside.
    await mkdir(join(projectRoot, '.boocode'), { recursive: true });
    await symlink(outside, join(projectRoot, '.boocode', 'artifacts'));
    await expect(
      writeMarkdownArtifact(
        { content: '# Hello' },
        { projectId: 'pid', projectRoot },
      ),
    ).rejects.toBeInstanceOf(PathScopeError);
  });
 });
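Reviewer note on the cap tests above: they pin a contract where a payload exactly at the cap is written and only a strictly larger one is skipped. The sketch below shows one way such a decision helper could look. It is an assumption about the shape, not the code in `artifacts.ts`. `decideWriteSketch` and `CAP_BYTES` are hypothetical names, and `TextEncoder` stands in for `Buffer.byteLength(..., 'utf8')`.

```typescript
// Mirrors the 1 MiB value the tests assert for HTML_ARTIFACT_MAX_BYTES.
const CAP_BYTES = 1_048_576;

// Discriminated union so callers must check `write` before reading `reason`.
type WriteDecision =
  | { write: true; byteLen: number }
  | { write: false; reason: 'cap_exceeded'; byteLen: number };

function decideWriteSketch(html: string, cap: number = CAP_BYTES): WriteDecision {
  // Measure UTF-8 bytes, not string length: multi-byte characters count fully.
  const byteLen = new TextEncoder().encode(html).length;
  // Strictly greater than the cap skips; byteLen === cap still writes.
  return byteLen > cap
    ? { write: false, reason: 'cap_exceeded', byteLen }
    : { write: true, byteLen };
}
```

Keeping the decision pure, with no filesystem or logging, is what lets the boundary case be tested without mocking the inference context.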
--- a/apps/server/src/services/tests/codecontext_client.test.ts
+++ b/apps/server/src/services/tests/codecontext_client.test.ts
@@ -1,5 +1,5 @@
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
-import { mkdir, mkdtemp, rm } from 'node:fs/promises';
+import { mkdir, mkdtemp, rm, symlink, writeFile } from 'node:fs/promises';
 import { join } from 'node:path';
 import { tmpdir } from 'node:os';
 import { callCodecontext } from '../codecontext_client.js';
@@ -203,3 +203,197 @@ describe('callCodecontext — error paths', () => {
    ).rejects.toThrow(/timed out after 30000ms/);
  });
 });
 // ---- v1.13.18: file_path resolution tests -----------------------------------
 describe('callCodecontext — file_path resolution', () => {
  // Case 1: relative path resolves to absolute under project root
  it('resolves a relative file_path to an absolute path inside project root', async () => {
    // Create a real file so realpath can canonicalise it
    const fileName = 'src_module.ts';
    await writeFile(join(projectDir, fileName), '// hello');
    const fetcher = vi.fn().mockResolvedValue(
      mockJSONResponse({ result: 'file analysis', error: null }),
    );
    await callCodecontext(
      {
        toolName: 'get_file_analysis',
        args: { file_path: fileName },
        projectPath: projectDir,
      },
      fetcher as unknown as typeof fetch,
    );
    expect(fetcher).toHaveBeenCalledTimes(1);
    const body = JSON.parse(fetcher.mock.calls[0]![1]!.body as string);
    // Should be the resolved absolute path
    expect(body.file_path).toBe(join(projectDir, fileName));
  });
  // Case 2: absolute path inside project root → realpathed → forwarded
  it('passes through an absolute file_path inside project root', async () => {
    const fileName = 'absolute_target.ts';
    const absPath = join(projectDir, fileName);
    await writeFile(absPath, '// absolute');
    const fetcher = vi.fn().mockResolvedValue(
      mockJSONResponse({ result: 'analysis', error: null }),
    );
    await callCodecontext(
      {
        toolName: 'get_file_analysis',
        args: { file_path: absPath },
        projectPath: projectDir,
      },
      fetcher as unknown as typeof fetch,
    );
    const body = JSON.parse(fetcher.mock.calls[0]![1]!.body as string);
    expect(body.file_path).toBe(absPath);
  });
  // Case 3: relative escape path → rejected with same error shape as target_dir escape
  it('rejects a relative file_path that escapes the project root', async () => {
    const fetcher = vi.fn();
    await expect(
      callCodecontext(
        {
          toolName: 'get_file_analysis',
          args: { file_path: '../../etc/passwd' },
          projectPath: projectDir,
        },
        fetcher as unknown as typeof fetch,
      ),
    ).rejects.toThrow(/escapes project root/);
    expect(fetcher).not.toHaveBeenCalled();
  });
  // Case 4: absolute path outside project root → rejected
  it('rejects an absolute file_path outside the project root', async () => {
    const fetcher = vi.fn();
    await expect(
      callCodecontext(
        {
          toolName: 'get_file_analysis',
          // /etc/passwd is outside any tmpdir project root
          args: { file_path: '/etc/passwd' },
          projectPath: projectDir,
        },
        fetcher as unknown as typeof fetch,
      ),
    ).rejects.toThrow(/escapes project root/);
    expect(fetcher).not.toHaveBeenCalled();
  });
  // Case 5: nonexistent file (ENOENT) → forwarded as un-realpath'd absolute
  it('forwards a nonexistent file_path as absolute without throwing', async () => {
    const missingPath = join(projectDir, 'does_not_exist.ts');
    const fetcher = vi.fn().mockResolvedValue(
      mockJSONResponse({ result: null, error: 'File not found in graph: ' + missingPath }),
    );
    // The resolver should NOT throw; the error comes back from the sidecar
    await expect(
      callCodecontext(
        {
          toolName: 'get_file_analysis',
          args: { file_path: 'does_not_exist.ts' },
          projectPath: projectDir,
        },
        fetcher as unknown as typeof fetch,
      ),
    ).rejects.toThrow(/File not found in graph/);
    // Wire was still called — resolver forwarded the path
    expect(fetcher).toHaveBeenCalledTimes(1);
    const body = JSON.parse(fetcher.mock.calls[0]![1]!.body as string);
    // Should receive the absolute (non-realpathed) path
    expect(body.file_path).toBe(missingPath);
  });
  // Case 6: empty string → skipped by guard, reaches wire unmodified
  // Note: Zod .trim().min(1) in get_file_analysis rejects empty before the
  // shim is reached in production. At the shim layer, the guard
  // `file_path.trim() !== ''` skips the resolver for empty strings so that
  // optional-file_path wrappers treat '' as "not provided". This is a
  // deliberate design; callers that require file_path validate at the Zod layer.
  it('skips resolver for empty string file_path (treated as not provided)', async () => {
    const fetcher = vi.fn().mockResolvedValue(
      mockJSONResponse({ result: 'analysis', error: null }),
    );
    // Should succeed — empty string is treated as "no file_path"
    await callCodecontext(
      {
        toolName: 'get_file_analysis',
        args: { file_path: '' },
        projectPath: projectDir,
      },
      fetcher as unknown as typeof fetch,
    );
    expect(fetcher).toHaveBeenCalledTimes(1);
    const body = JSON.parse(fetcher.mock.calls[0]![1]!.body as string);
    // Empty string passes through unchanged (resolver not invoked)
    expect(body.file_path).toBe('');
  });
  // Case 7: wrapper without file_path (e.g. get_codebase_overview) → resolver not invoked
  it('does not invoke file_path resolver when file_path is absent from args', async () => {
    const fetcher = vi.fn().mockResolvedValue(
      mockJSONResponse({ result: 'overview', error: null }),
    );
    await callCodecontext(
      {
        toolName: 'get_codebase_overview',
        args: { include_stats: true },
        projectPath: projectDir,
      },
      fetcher as unknown as typeof fetch,
    );
    expect(fetcher).toHaveBeenCalledTimes(1);
    const body = JSON.parse(fetcher.mock.calls[0]![1]!.body as string);
    // No file_path in the wire body
    expect('file_path' in body).toBe(false);
  });
  // Case 8: absolute path with `..` that resolves outside project root, even
  // when the literal path is ENOENT. Without resolve() in the absolute branch
  // the prefix check false-positives because the raw `<projectDir>/../etc/x`
  // literal starts with `<projectDir>/`.
  it('rejects absolute file_path with `..` resolving outside project root (ENOENT branch)', async () => {
    const fetcher = vi.fn();
    const escapingAbsolute = `${projectDir}/../etc/non_existent_passwd`;
    await expect(
      callCodecontext(
        {
          toolName: 'get_file_analysis',
          args: { file_path: escapingAbsolute },
          projectPath: projectDir,
        },
        fetcher as unknown as typeof fetch,
      ),
    ).rejects.toThrow(/escapes project root/);
    expect(fetcher).not.toHaveBeenCalled();
  });
  // Case 9: in-project symlink targeting outside the project root. This is the
  // canonical realpath defense — realpath must canonicalise the symlink and
  // the escape check must reject. Without this test, a symlink-out hole could
  // regress silently.
  it('rejects file_path that resolves through a symlink leaving project root', async () => {
    const outsideDir = await mkdtemp(join(tmpdir(), 'codecontext-outside-'));
    try {
      const evilTarget = join(outsideDir, 'secrets.txt');
      await writeFile(evilTarget, 'top secret');
      await symlink(evilTarget, join(projectDir, 'evil-link'));
      const fetcher = vi.fn();
      await expect(
        callCodecontext(
          {
            toolName: 'get_file_analysis',
            args: { file_path: 'evil-link' },
            projectPath: projectDir,
          },
          fetcher as unknown as typeof fetch,
        ),
      ).rejects.toThrow(/escapes project root/);
      expect(fetcher).not.toHaveBeenCalled();
    } finally {
      await rm(outsideDir, { recursive: true, force: true });
    }
  });
 });
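Reviewer note on cases 1-9 above: together they describe a resolver that normalises `..` lexically, canonicalises symlinks with realpath, and forwards a missing file as its un-realpathed absolute path. The sketch below is a minimal reading of that behaviour under stated assumptions. It is not the shim in `codecontext_client.ts`. `isInside` and `resolveFilePathSketch` are hypothetical names, and a missing file beneath a symlinked directory is not covered.

```typescript
import { realpath } from 'node:fs/promises';
import { resolve, sep } from 'node:path';

// Containment check on normalised absolute paths. Appending `sep` stops
// `/proj/root-sibling` from matching the prefix `/proj/root`.
function isInside(root: string, candidate: string): boolean {
  return candidate === root || candidate.startsWith(root + sep);
}

async function resolveFilePathSketch(projectRoot: string, filePath: string): Promise<string> {
  const lexicalRoot = resolve(projectRoot);
  // resolve() collapses `..` and lets an absolute filePath override the root,
  // so `<root>/../etc/x` is normalised before the prefix check (case 8).
  const lexical = resolve(lexicalRoot, filePath);
  if (!isInside(lexicalRoot, lexical)) {
    throw new Error(`file_path escapes project root: ${filePath}`);
  }

  let canonical: string;
  try {
    canonical = await realpath(lexical); // follows symlinks (case 9)
  } catch (err) {
    // Missing file: forward the lexical absolute path (case 5).
    if ((err as { code?: string }).code === 'ENOENT') return lexical;
    throw err;
  }

  const canonicalRoot = await realpath(lexicalRoot);
  if (!isInside(canonicalRoot, canonical)) {
    throw new Error(`file_path escapes project root: ${filePath}`);
  }
  return canonical;
}
```

The two-stage check is deliberate: the lexical pass handles paths that do not exist yet, and the realpath pass catches symlinks that the lexical pass cannot see.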
--- a/apps/server/src/services/tests/codecontext_tools.test.ts
+++ b/apps/server/src/services/tests/codecontext_tools.test.ts
@@ -70,7 +70,7 @@ describe('codecontext wrappers — toolName + args forwarding', () => {
    const { url, body } = parsePOST(fetcher);
    expect(url).toMatch(/\/v1\/get_file_analysis$/);
    expect(body).toMatchObject({
-      file_path: 'apps/server/src/index.ts',
+      file_path: join(projectDir, 'apps/server/src/index.ts'),
      target_dir: projectDir,
    });
  });
--- a/apps/server/src/services/tests/compaction.test.ts
+++ b/apps/server/src/services/tests/compaction.test.ts
@@ -6,6 +6,7 @@ import {
  turns,
  select,
  buildPrompt,
  buildHeadPayload,
  type CompactionMessage,
 } from '../compaction.js';
 import { SUMMARY_TEMPLATE } from '../compaction-prompt.js';
@@ -31,6 +32,7 @@ function mkMsg(
    status: 'complete',
    tool_calls: null,
    tool_results: null,
    reasoning_parts: null,
    metadata: null,
    created_at: new Date(counter * 1000).toISOString(),
    ...overrides,
@@ -39,49 +41,58 @@ function mkMsg(
 // ---- usable -----------------------------------------------------------------
-describe('usable', () => {
-  it('returns 0 when contextLimit is 0', () => {
-    expect(usable(0)).toBe(0);
-  });
-  it('returns 0 when contextLimit is below the 20k buffer', () => {
-    // Math.max(0, x - 20000) clamps the subtraction so we never report
-    // negative headroom. A 10k-context model reports 0 usable, which makes
-    // isOverflow short-circuit to false (correct — we can't size the
-    // compaction with no headroom).
-    expect(usable(10_000)).toBe(0);
-    expect(usable(19_999)).toBe(0);
-    expect(usable(20_000)).toBe(0);
-  });
-  it('subtracts the 20k buffer from a normal-sized context window', () => {
-    expect(usable(100_000)).toBe(80_000);
-    expect(usable(32_768)).toBe(12_768);
-  });
+// v1.13.9: ratio-only early trigger at 0.85 × contextLimit. Replaces the
+// v1.11.0-era `contextLimit - 20_000` math, which degenerated to 0 for
+// contexts ≤20k and gave only 7-8% headroom at 262k.
+describe('usable() — ratio-only early trigger (v1.13.9)', () => {
+  it('returns floor(0.85 * limit) for the qwen3.6 daily-driver context', () => {
+    // floor(0.85 * 262144) = floor(222822.4) = 222822 — 15% headroom for
+    // the summarizer to do its turn without itself overflowing.
+    expect(usable(262144)).toBe(222822);
+  });
+  it('returns 0.85× for a mid-sized context', () => {
+    expect(usable(100_000)).toBe(85_000);
+  });
+  it('returns 0.85× for a small context (no degenerate 0)', () => {
+    // floor(0.85 * 8192) = 6963. Under the old formula this returned 0
+    // (8192 - 20_000 clamped to 0), effectively disabling compaction for
+    // small-context models. The ratio keeps the trigger active.
+    expect(usable(8192)).toBe(6963);
+  });
+  it('returns 0 for zero or negative contextLimit', () => {
+    expect(usable(0)).toBe(0);
+    expect(usable(-1)).toBe(0);
+  });
 });
 // ---- isOverflow -------------------------------------------------------------
 describe('isOverflow', () => {
-  it('returns false when usable is 0 (unknown / sub-buffer context)', () => {
+  it('returns false when usable is 0 (unknown contextLimit)', () => {
    expect(isOverflow({ prompt_tokens: 999_999, completion_tokens: 0 }, 0)).toBe(false);
-    expect(isOverflow({ prompt_tokens: 0, completion_tokens: 999_999 }, 10_000)).toBe(false);
+    expect(isOverflow({ prompt_tokens: 0, completion_tokens: 999_999 }, -1)).toBe(false);
  });
  it('returns false at 50% of usable', () => {
-    // usable(100k) = 80k → 50% = 40k.
+    // v1.13.9: usable(100k) = 85k → 50% ≈ 42.5k.
    expect(isOverflow({ prompt_tokens: 30_000, completion_tokens: 10_000 }, 100_000)).toBe(false);
  });
  it('returns false just under usable', () => {
-    expect(isOverflow({ prompt_tokens: 79_000, completion_tokens: 999 }, 100_000)).toBe(false);
+    // v1.13.9: 84_000 + 999 = 84_999 < 85_000 budget.
    expect(isOverflow({ prompt_tokens: 84_000, completion_tokens: 999 }, 100_000)).toBe(false);
  });
  it('returns true exactly at usable (>=, not strict >)', () => {
-    expect(isOverflow({ prompt_tokens: 80_000, completion_tokens: 0 }, 100_000)).toBe(true);
+    // v1.13.9: 85_000 == usable(100_000).
+    expect(isOverflow({ prompt_tokens: 85_000, completion_tokens: 0 }, 100_000)).toBe(true);
  });
  it('returns true above usable', () => {
    // 50_000 + 40_000 = 90_000 > 85_000.
    expect(isOverflow({ prompt_tokens: 50_000, completion_tokens: 40_000 }, 100_000)).toBe(true);
  });
 });
@@ -224,8 +235,9 @@ describe('select', () => {
    const u = mkMsg('user', 'oversized');
    const a = mkMsg('assistant', 'Y'.repeat(40_000));
    const result = select([u, a], 30_000, 1);
-    // usable(30k) = 10k → budget = min(8k, max(2k, floor(10k*0.25))) =
-    // min(8k, max(2k, 2500)) = 2500. 40k chars ≈ 10k tokens. Can't fit.
+    // v1.13.9: usable(30k) = floor(0.85*30k) = 25500 → budget =
+    // min(8k, max(2k, floor(25500*0.25))) = min(8k, max(2k, 6375)) = 6375.
+    // 40k chars ≈ 10k tokens. Still can't fit (10k > 6375).
    expect(result.tail_start_id).toBeUndefined();
    expect(result.head).toEqual([u, a]);
  });
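Reviewer sketch of the arithmetic the v1.13.9 comments above walk through. The 85% fraction and the 2k/8k clamp are read off those comments; the `*Sketch` names are hypothetical and this is not the project's implementation. The 20k-buffer assertions earlier in the same file appear to follow an older rule and are not modeled here.

```typescript
// Sketch only: restates the arithmetic from the v1.13.9 test comments.
// All names ending in "Sketch" are hypothetical, not the real module's API.
interface UsageSketch {
  prompt_tokens: number;
  completion_tokens: number;
}

function usableSketch(contextLimit: number): number {
  // Zero, negative, or NaN limits are treated as "unknown" and yield 0.
  if (!(contextLimit > 0)) return 0;
  // Integer math avoids floating-point surprises from multiplying by 0.85.
  return Math.floor((contextLimit * 85) / 100);
}

function isOverflowSketch(usage: UsageSketch, contextLimit: number): boolean {
  const budget = usableSketch(contextLimit);
  // With no usable budget there is nothing to size a compaction against.
  if (budget === 0) return false;
  // Inclusive comparison: hitting the budget exactly counts as overflow.
  return usage.prompt_tokens + usage.completion_tokens >= budget;
}

function tailBudgetSketch(contextLimit: number): number {
  // A quarter of the usable window, clamped to the [2k, 8k] range.
  const quarter = Math.floor(usableSketch(contextLimit) * 0.25);
  return Math.min(8_000, Math.max(2_000, quarter));
}
```

Under these assumptions a 30k window yields a 6375-token tail budget, which is why the 40k-character (about 10k-token) assistant message in the test still cannot fit.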
@@ -256,3 +268,56 @@ describe('buildPrompt', () => {
    expect(out.endsWith('extra-context-line')).toBe(true);
  });
 });
 // ---- buildHeadPayload (v1.13.6) -----------------------------------------------
 describe('buildHeadPayload reasoning render', () => {
  it('emits reasoning as a <reasoning> tag prefixed onto the assistant content', () => {
    const out = buildHeadPayload([
      mkMsg('user', 'show me the file'),
      mkMsg('assistant', 'reading it now', {
        reasoning_parts: [{ text: 'user wants src/index.ts; I should view it' }],
      }),
    ]);
    expect(out).toHaveLength(2);
    expect(out[1]!.role).toBe('assistant');
    expect(out[1]!.content).toBe(
      '<reasoning>user wants src/index.ts; I should view it</reasoning>\n\nreading it now',
    );
  });
  it('emits a standalone <reasoning> tag when reasoning is present but content is empty (tool-call-only turn)', () => {
    const out = buildHeadPayload([
      mkMsg('assistant', '', {
        reasoning_parts: [{ text: 'jumping straight to grep' }],
        tool_calls: [{ id: 'c1', name: 'grep', args: { pattern: 'foo' } }],
      }),
    ]);
    expect(out).toHaveLength(1);
    expect(out[0]!.content).toBe('<reasoning>jumping straight to grep</reasoning>');
    expect(out[0]!.tool_calls).toHaveLength(1);
    expect(out[0]!.tool_calls![0]!.function.name).toBe('grep');
  });
  it('joins multiple reasoning parts without separators (matches the streaming concat)', () => {
    const out = buildHeadPayload([
      mkMsg('assistant', 'final answer', {
        reasoning_parts: [{ text: 'first thought ' }, { text: 'second thought' }],
      }),
    ]);
    expect(out[0]!.content).toBe(
      '<reasoning>first thought second thought</reasoning>\n\nfinal answer',
    );
  });
  it('omits the reasoning tag entirely when reasoning_parts is null or empty', () => {
    const out = buildHeadPayload([
      mkMsg('assistant', 'plain answer', { reasoning_parts: null }),
      mkMsg('assistant', 'other answer', { reasoning_parts: [] }),
    ]);
    expect(out[0]!.content).toBe('plain answer');
    expect(out[1]!.content).toBe('other answer');
    expect(out[0]!.content).not.toContain('<reasoning>');
    expect(out[1]!.content).not.toContain('<reasoning>');
  });
 });
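Reviewer sketch of the rendering rule the four buildHeadPayload tests above pin down. It only models the string handling; the function name and the part type are hypothetical, and the real buildHeadPayload may be structured differently.

```typescript
// Sketch only: models the <reasoning> prefix behaviour asserted by the tests.
interface ReasoningPartSketch {
  text: string;
}

function renderAssistantContentSketch(
  content: string,
  reasoningParts: ReasoningPartSketch[] | null | undefined,
): string {
  // Parts are concatenated with no separator, matching the streaming concat.
  const reasoning = (reasoningParts ?? []).map((p) => p.text).join('');
  // Null, empty, or all-empty reasoning leaves the content untouched.
  if (reasoning === '') return content;
  const tag = `<reasoning>${reasoning}</reasoning>`;
  // Tool-call-only turns have empty content, so the tag stands alone.
  return content === '' ? tag : `${tag}\n\n${content}`;
}
```

Keeping the tag and the content in one string means the head payload needs no extra message field for reasoning.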
--- a/apps/server/src/services/tests/doom-loop.test.ts
+++ b/apps/server/src/services/tests/doom-loop.test.ts
@@ -1,5 +1,5 @@
 import { describe, it, expect } from 'vitest';
-import { DOOM_LOOP_THRESHOLD, detectDoomLoop } from '../inference.js';
+import { DOOM_LOOP_THRESHOLD, detectDoomLoop } from '../inference/index.js';
 import type { ToolCall } from '../../types/api.js';
 // ---- fixture ----------------------------------------------------------------
--- a/apps/server/src/services/tests/grant_resolver.test.ts
+++ b/apps/server/src/services/tests/grant_resolver.test.ts
@@ -0,0 +1,199 @@
 // v1.13.17-cross-repo-reads: resolveGrantRoot decision tree.
 //
 // Sam's dispatch note (2026-05-22): "in the project-root resolver ancestor
 // walk, stop the moment parent exits PROJECT_ROOT_WHITELIST or hits
 // filesystem root — check on every iteration, not just final parent.
 // Symlinked input must not be able to escape the whitelist during the
 // walk." The symlink-escape-mid-walk test below pins that invariant —
 // without the per-iteration whitelist check, this case would walk OUTSIDE
 // the whitelist root and return a phantom grant.
 import { describe, it, expect, beforeAll, afterAll, vi } from 'vitest';
 import { mkdtemp, rm, mkdir, writeFile, symlink, realpath } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { resolveGrantRoot } from '../grant_resolver.js';
 import type { Sql } from '../../db.js';
 let tmp: string;
 let whitelist: string;
 let project: string;
 let fork: string;
 let outside: string;
 // Fake sql tag — returns the projects rows we want without touching a real
 // database. The resolver only ever does a single SELECT, so a single-shot
 // mock that returns the prepared rows on every invocation is enough.
 function makeSql(rows: Array<{ path: string }>): Sql {
  const tag = ((..._args: unknown[]) => Promise.resolve(rows)) as unknown as Sql;
  return tag;
 }
 beforeAll(async () => {
  tmp = await realpath(await mkdtemp(join(tmpdir(), 'boocode-gr-')));
  whitelist = join(tmp, 'whitelist');
  project = join(whitelist, 'boocode');
  fork = join(whitelist, 'forks', 'codecontext');
  outside = join(tmp, 'outside');
  await mkdir(project, { recursive: true });
  await mkdir(fork, { recursive: true });
  await mkdir(outside, { recursive: true });
  // Mark project as a repo (.git directory).
  await mkdir(join(project, '.git'));
  await writeFile(join(project, 'README.md'), 'project readme');
  // Mark fork as a repo via go.mod (matches the proposal's example).
  await writeFile(join(fork, 'go.mod'), 'module example.com/foo');
  await writeFile(join(fork, 'main.go'), 'package main');
  await writeFile(join(outside, 'secret.txt'), 'forbidden');
 });
 afterAll(async () => {
  await rm(tmp, { recursive: true, force: true });
 });
 describe('resolveGrantRoot — happy paths', () => {
  it('refuses when the requested path is already under projectRoot', async () => {
    const result = await resolveGrantRoot(makeSql([]), join(project, 'README.md'), project, whitelist);
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/already accessible/);
  });
  it('returns the project root when the path falls under a registered project', async () => {
    // Register `fork` as a known project. Resolver should return the project
    // ancestor (LONGEST match wins) rather than the repo-shape fallback.
    const result = await resolveGrantRoot(
      makeSql([{ path: fork }]),
      join(fork, 'main.go'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(true);
    if (result.ok) {
      expect(result.root).toBe(fork);
      expect(result.source).toBe('project');
    }
  });
  it('falls back to the nearest repo-shaped ancestor when no project matches', async () => {
    const result = await resolveGrantRoot(
      makeSql([]),
      join(fork, 'main.go'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(true);
    if (result.ok) {
      expect(result.root).toBe(fork);
      expect(result.source).toBe('whitelist');
    }
  });
 });
 describe('resolveGrantRoot — refusals', () => {
  it('refuses paths outside PROJECT_ROOT_WHITELIST', async () => {
    const result = await resolveGrantRoot(
      makeSql([]),
      join(outside, 'secret.txt'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/outside permitted scope/);
  });
  it('refuses non-absolute paths', async () => {
    const result = await resolveGrantRoot(makeSql([]), 'relative/path', project, whitelist);
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/absolute/);
  });
  it('refuses missing paths without prompting', async () => {
    const result = await resolveGrantRoot(
      makeSql([]),
      join(whitelist, 'nope'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/does not exist/);
  });
  it('refuses when no repo-shape marker is found before hitting the whitelist root', async () => {
    // Build a directory tree under the whitelist that has NO repo markers
    // all the way up to the whitelist root.
    const plain = join(whitelist, 'plain-dir', 'nested');
    await mkdir(plain, { recursive: true });
    await writeFile(join(plain, 'just-a-file.txt'), 'x');
    const result = await resolveGrantRoot(
      makeSql([]),
      join(plain, 'just-a-file.txt'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/no repo-shaped ancestor/);
  });
  it('does not grant the whitelist root itself as a fallback', async () => {
    // Even if .git existed at the whitelist root (it doesn't), we'd refuse.
    // Easier to assert: a path directly under whitelist with no repo marker.
    const direct = join(whitelist, 'lone-file.txt');
    await writeFile(direct, 'x');
    const result = await resolveGrantRoot(makeSql([]), direct, project, whitelist);
    expect(result.ok).toBe(false);
  });
 });
 describe('resolveGrantRoot — symlink-escape-mid-walk guard (Sam 2026-05-22)', () => {
  it('refuses a symlinked input whose realpath sits outside the whitelist', async () => {
    // The symlink lives nominally inside the whitelist, but its target
    // (realpath) is outside. The guard's first realpath() call normalizes
    // and the up-front whitelist check refuses immediately.
    const link = join(whitelist, 'escape-link');
    try {
      await symlink(outside, link);
      const result = await resolveGrantRoot(
        makeSql([]),
        join(link, 'secret.txt'),
        project,
        whitelist,
      );
      expect(result.ok).toBe(false);
      if (!result.ok) expect(result.reason).toMatch(/outside permitted scope/);
    } finally {
      await rm(link, { force: true });
    }
  });
  it('walk loop terminates at the whitelist root, not at filesystem /', async () => {
    // Construct a deep tree with NO repo markers anywhere. Without a bound,
    // the walk would chase parents up to "/". The bound flips the loop into
    // a refusal once the cursor equals the realpath'd whitelist root.
    const deep = join(whitelist, 'a', 'b', 'c', 'd');
    await mkdir(deep, { recursive: true });
    await writeFile(join(deep, 'leaf.txt'), 'x');
    const result = await resolveGrantRoot(makeSql([]), join(deep, 'leaf.txt'), project, whitelist);
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toMatch(/no repo-shaped ancestor/);
  });
 });
 describe('resolveGrantRoot — nearest-project disambiguation', () => {
  it('prefers the longest matching project path over a shorter ancestor', async () => {
    const outer = whitelist;
    const inner = fork; // /whitelist/forks/codecontext, deeper than outer
    const result = await resolveGrantRoot(
      makeSql([{ path: outer }, { path: inner }]),
      join(fork, 'main.go'),
      project,
      whitelist,
    );
    expect(result.ok).toBe(true);
    if (result.ok) expect(result.root).toBe(inner);
  });
 });
 // Belt-and-suspenders: silence a known dynamic-import warning that vitest
 // occasionally emits on transient fs operations in CI but never in dev.
 vi.spyOn(console, 'warn').mockImplementation(() => {});
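Reviewer sketch of the "longest match wins" rule exercised by the nearest-project disambiguation test. It assumes POSIX-style paths that have already been through realpath; the helper names are hypothetical and the real resolveGrantRoot also does the whitelist and repo-marker checks, which are not modeled here.

```typescript
// Sketch only: pick the deepest registered project that contains the target.
function isUnderSketch(root: string, target: string): boolean {
  // Compare against "root/" so /a/code does not match /a/codecontext.
  const base = root.endsWith('/') ? root : `${root}/`;
  return target === root || target.startsWith(base);
}

function nearestProjectSketch(projectPaths: string[], target: string): string | null {
  let best: string | null = null;
  for (const candidate of projectPaths) {
    if (!isUnderSketch(candidate, target)) continue;
    // A longer containing path is a deeper, more specific ancestor.
    if (best === null || candidate.length > best.length) best = candidate;
  }
  return best;
}
```

The trailing-slash comparison matters: a plain prefix test would let a sibling directory with a shared name prefix pass as an ancestor.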
--- a/apps/server/src/services/tests/inference.test.ts
+++ b/apps/server/src/services/tests/inference.test.ts
@@ -1,5 +1,5 @@
 import { describe, it, expect } from 'vitest';
-import { buildMessagesPayload } from '../inference.js';
+import { buildMessagesPayload } from '../inference/index.js';
 import type {
  Message,
  MessageRole,
--- a/apps/server/src/services/tests/parts.test.ts
+++ b/apps/server/src/services/tests/parts.test.ts
@@ -0,0 +1,121 @@
 import { describe, it, expect } from 'vitest';
 import { partsFromAssistantMessage, partsFromToolMessage } from '../inference/parts.js';
 import type { ToolCall, ToolResult } from '../../types/api.js';
 describe('partsFromAssistantMessage', () => {
  it('emits one text part for content-only assistant', () => {
    const parts = partsFromAssistantMessage({ content: 'hello world', tool_calls: null });
    expect(parts).toHaveLength(1);
    expect(parts[0]).toEqual({
      sequence: 0,
      kind: 'text',
      payload: { text: 'hello world' },
    });
  });
  it('emits one tool_call part for empty-content + single tool_call', () => {
    const tc: ToolCall = { id: 'call_1', name: 'view_file', args: { path: 'src/a.ts' } };
    const parts = partsFromAssistantMessage({ content: '', tool_calls: [tc] });
    expect(parts).toHaveLength(1);
    expect(parts[0]).toEqual({
      sequence: 0,
      kind: 'tool_call',
      payload: { id: 'call_1', name: 'view_file', args: { path: 'src/a.ts' } },
    });
  });
  it('emits text then tool_call parts in order when both present', () => {
    const tc: ToolCall = { id: 'call_2', name: 'grep', args: { pattern: 'foo' } };
    const parts = partsFromAssistantMessage({ content: 'let me search', tool_calls: [tc] });
    expect(parts.map((p) => [p.sequence, p.kind])).toEqual([
      [0, 'text'],
      [1, 'tool_call'],
    ]);
  });
  it('preserves tool_call order with multiple calls', () => {
    const calls: ToolCall[] = [
      { id: 'a', name: 'list_dir', args: { path: '.' } },
      { id: 'b', name: 'view_file', args: { path: 'x.ts' } },
      { id: 'c', name: 'grep', args: { pattern: 'y' } },
    ];
    const parts = partsFromAssistantMessage({ content: '', tool_calls: calls });
    expect(parts).toHaveLength(3);
    expect(parts.map((p) => p.payload)).toEqual([
      { id: 'a', name: 'list_dir', args: { path: '.' } },
      { id: 'b', name: 'view_file', args: { path: 'x.ts' } },
      { id: 'c', name: 'grep', args: { pattern: 'y' } },
    ]);
    expect(parts.map((p) => p.sequence)).toEqual([0, 1, 2]);
  });
  it('returns empty array for empty content + null tool_calls', () => {
    expect(partsFromAssistantMessage({ content: '', tool_calls: null })).toEqual([]);
  });
  it('v1.13.1-C: reasoning lands at sequence 0 before text + tool_calls', () => {
    const tc: ToolCall = { id: 'call_r', name: 'view_file', args: { path: 'x.ts' } };
    const parts = partsFromAssistantMessage({
      content: 'inspecting now',
      tool_calls: [tc],
      reasoning: 'user asked about x.ts; I should view it',
    });
    expect(parts.map((p) => [p.sequence, p.kind])).toEqual([
      [0, 'reasoning'],
      [1, 'text'],
      [2, 'tool_call'],
    ]);
    expect(parts[0]!.payload).toEqual({
      text: 'user asked about x.ts; I should view it',
    });
  });
  it('v1.13.1-C: reasoning + empty content + tool_calls preserves seq 0 reasoning', () => {
    const tc: ToolCall = { id: 'call_r2', name: 'grep', args: { pattern: 'foo' } };
    const parts = partsFromAssistantMessage({
      content: '',
      tool_calls: [tc],
      reasoning: 'jumping straight to grep',
    });
    expect(parts.map((p) => [p.sequence, p.kind])).toEqual([
      [0, 'reasoning'],
      [1, 'tool_call'],
    ]);
  });
 });
 describe('partsFromToolMessage', () => {
  it('emits a single tool_result part at sequence 0', () => {
    const tr: ToolResult = {
      tool_call_id: 'call_1',
      output: { contents: 'console.log(1)' },
      truncated: false,
    };
    const parts = partsFromToolMessage({ tool_results: tr });
    expect(parts).toHaveLength(1);
    expect(parts[0]).toEqual({
      sequence: 0,
      kind: 'tool_result',
      payload: {
        tool_call_id: 'call_1',
        output: { contents: 'console.log(1)' },
        truncated: false,
      },
    });
  });
  it('includes error in payload when present', () => {
    const tr: ToolResult = {
      tool_call_id: 'call_2',
      output: null,
      truncated: false,
      error: 'permission denied',
    };
    const parts = partsFromToolMessage({ tool_results: tr });
    expect(parts[0]!.payload).toMatchObject({ error: 'permission denied' });
  });
  it('returns empty array when tool_results is null', () => {
    expect(partsFromToolMessage({ tool_results: null })).toEqual([]);
  });
 });
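Reviewer sketch of the ordering contract the partsFromAssistantMessage tests describe: reasoning first, then text, then tool calls, with sequence numbers assigned in emission order. Types and names are simplified and hypothetical; the real helper lives in inference/parts.ts and may differ.

```typescript
// Sketch only: mirrors the sequence/kind ordering asserted by the tests.
type PartKindSketch = 'reasoning' | 'text' | 'tool_call';

interface ToolCallSketch {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface PartSketch {
  sequence: number;
  kind: PartKindSketch;
  payload: unknown;
}

function partsFromAssistantSketch(msg: {
  content: string;
  tool_calls: ToolCallSketch[] | null;
  reasoning?: string | null;
}): PartSketch[] {
  const parts: PartSketch[] = [];
  // The next sequence number is always the current length of the list.
  const push = (kind: PartKindSketch, payload: unknown): void => {
    parts.push({ sequence: parts.length, kind, payload });
  };
  if (msg.reasoning) push('reasoning', { text: msg.reasoning });
  if (msg.content !== '') push('text', { text: msg.content });
  for (const tc of msg.tool_calls ?? []) {
    push('tool_call', { id: tc.id, name: tc.name, args: tc.args });
  }
  return parts;
}
```

Deriving the sequence from the list length keeps the numbering dense even when reasoning or text is absent.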
--- a/apps/server/src/services/tests/path_guard.test.ts
+++ b/apps/server/src/services/tests/path_guard.test.ts
@@ -0,0 +1,93 @@
 // v1.13.17-cross-repo-reads: pathGuard now accepts an optional extraRoots
 // list. Validates the primary-root path stays the source of truth and that
 // extra roots are consulted when (and only when) the primary rejects.
 import { describe, it, expect, beforeAll, afterAll } from 'vitest';
 import { mkdtemp, rm, mkdir, writeFile, symlink, realpath } from 'node:fs/promises';
 import { tmpdir } from 'node:os';
 import { join } from 'node:path';
 import { pathGuard, PathScopeError } from '../path_guard.js';
 let tmp: string;
 let projectRoot: string;
 let altRoot: string;
 let outsideDir: string;
 beforeAll(async () => {
  tmp = await realpath(await mkdtemp(join(tmpdir(), 'boocode-pg-')));
  projectRoot = join(tmp, 'project');
  altRoot = join(tmp, 'alt');
  outsideDir = join(tmp, 'outside');
  await mkdir(projectRoot, { recursive: true });
  await mkdir(altRoot, { recursive: true });
  await mkdir(outsideDir, { recursive: true });
  await writeFile(join(projectRoot, 'inside.txt'), 'p');
  await writeFile(join(altRoot, 'cross.txt'), 'a');
  await writeFile(join(outsideDir, 'forbidden.txt'), 'x');
 });
 afterAll(async () => {
  await rm(tmp, { recursive: true, force: true });
 });
 describe('pathGuard (v1.13.17 extraRoots)', () => {
  it('accepts paths inside the primary projectRoot', async () => {
    const real = await pathGuard(projectRoot, 'inside.txt');
    expect(real).toBe(join(projectRoot, 'inside.txt'));
  });
  it('rejects paths outside the primary root when no extra roots given', async () => {
    await expect(pathGuard(projectRoot, join(outsideDir, 'forbidden.txt'))).rejects.toBeInstanceOf(
      PathScopeError,
    );
  });
  it('accepts cross-root paths when the matching extra root is provided', async () => {
    const real = await pathGuard(projectRoot, join(altRoot, 'cross.txt'), [altRoot]);
    expect(real).toBe(join(altRoot, 'cross.txt'));
  });
  it('rejects cross-root paths even with extra roots when no root matches', async () => {
    await expect(
      pathGuard(projectRoot, join(outsideDir, 'forbidden.txt'), [altRoot]),
    ).rejects.toBeInstanceOf(PathScopeError);
  });
  it('ignores empty-string extra roots silently', async () => {
    const real = await pathGuard(projectRoot, join(altRoot, 'cross.txt'), ['', altRoot]);
    expect(real).toBe(join(altRoot, 'cross.txt'));
  });
  it('error message contains the request_read_access hint when scope rejects', async () => {
    try {
      await pathGuard(projectRoot, join(outsideDir, 'forbidden.txt'));
      throw new Error('should have thrown');
    } catch (err) {
      expect(err).toBeInstanceOf(PathScopeError);
      expect((err as Error).message).toContain('request_read_access');
    }
  });
  it('still resolves symlinks before the scope check', async () => {
    const linkPath = join(projectRoot, 'link-to-outside');
    await symlink(join(outsideDir, 'forbidden.txt'), linkPath);
    // Symlink target escapes both primary and the single extra root, so
    // even though the surface path "looks" inside projectRoot, the real
    // path resolves outside and the guard rejects.
    await expect(pathGuard(projectRoot, linkPath, [altRoot])).rejects.toBeInstanceOf(
      PathScopeError,
    );
    // But adding outsideDir as an extra root accepts (realpath inside it).
    const real = await pathGuard(projectRoot, linkPath, [altRoot, outsideDir]);
    expect(real).toBe(join(outsideDir, 'forbidden.txt'));
  });
  it('tries extra roots in order until one accepts', async () => {
    const real = await pathGuard(projectRoot, join(altRoot, 'cross.txt'), [
      outsideDir, // rejects
      altRoot,    // accepts
    ]);
    expect(real).toBe(join(altRoot, 'cross.txt'));
  });
 });
--- a/apps/server/src/services/tests/prune.test.ts
+++ b/apps/server/src/services/tests/prune.test.ts
@@ -0,0 +1,96 @@
 import { describe, it, expect, beforeEach } from 'vitest';
 import {
  selectPruneTargets,
  PROTECTED_TOKENS,
  PRUNE_TRIGGER_TOKENS,
  type PartForPrune,
 } from '../inference/prune.js';
 // Test fixture: build a tool_result part whose payload size yields a known
 // token estimate (chars/4). The decision logic only cares about
 // JSON.stringify(payload).length, so a payload that stringifies to `4n`
 // chars produces exactly `n` tokens.
 let seq = 0;
 function part(tokens: number, createdAt: Date): PartForPrune {
  seq += 1;
  // JSON.stringify("xxx...") wraps the string in quotes (adds 2 chars), so
  // the raw text is 4*tokens - 2 chars long. With text length `len`, the
  // estimate Math.ceil((len+2)/4) then equals `tokens` exactly, with no
  // rounding slack. Requires tokens >= 1, otherwise repeat() gets a
  // negative count.
  const text = 'x'.repeat(tokens * 4 - 2);
  return { id: `p${seq}`, payload: text, created_at: createdAt };
 }
 const T_NOW = new Date('2026-05-22T12:00:00Z');
 function ago(secondsBack: number): Date {
  return new Date(T_NOW.getTime() - secondsBack * 1000);
 }
 describe('selectPruneTargets', () => {
  beforeEach(() => {
    seq = 0;
  });
  it('returns nothing when there are no parts', () => {
    expect(selectPruneTargets([], null)).toEqual({ ids: [], freedTokens: 0 });
  });
  it('returns nothing when total tokens are under the protection window', () => {
    const parts: PartForPrune[] = [
      part(10_000, ago(10)),
      part(10_000, ago(20)),
    ]; // 20k total, all protected
    expect(selectPruneTargets(parts, null)).toEqual({ ids: [], freedTokens: 0 });
  });
  it('returns nothing when candidate total is below the prune trigger', () => {
    // Protection fills with ~40k newest, candidates only ~5k. Below 20k trigger.
    const parts: PartForPrune[] = [
      part(20_000, ago(10)),
      part(20_000, ago(20)),
      // Past protection; total ~5k won't trigger.
      part(5_000, ago(30)),
    ];
    const result = selectPruneTargets(parts, null);
    expect(result.ids).toEqual([]);
    expect(result.freedTokens).toBe(0);
  });
  it('hides candidates past protection when their total clears the trigger', () => {
    // Newest 40k protected; older 30k cleanly above the 20k trigger.
    const parts: PartForPrune[] = [
      part(20_000, ago(10)),
      part(20_000, ago(20)),
      // Past protection, total ~30k freed.
      part(15_000, ago(30)),
      part(15_000, ago(40)),
    ];
    const result = selectPruneTargets(parts, null);
    expect(result.ids).toEqual(['p3', 'p4']);
    expect(result.freedTokens).toBeGreaterThanOrEqual(PRUNE_TRIGGER_TOKENS);
  });
  it('stops at the compaction summary boundary', () => {
    // Newest 30k protected (just under PROTECTED_TOKENS=40k); then 30k of
    // older parts. Boundary sits at ago(35), so the ago(40) part is
    // beyond it and gets skipped.
    const parts: PartForPrune[] = [
      part(15_000, ago(10)),
      part(15_000, ago(20)),
      part(15_000, ago(30)), // crosses protection threshold; candidate
      part(15_000, ago(40)), // beyond summary boundary; skipped
    ];
    const tailStart = ago(35);
    const result = selectPruneTargets(parts, tailStart);
    // ago(30) is the only candidate inside the window; 15k is below the
    // 20k trigger so we expect no hides.
    expect(result.ids).toEqual([]);
  });
  it('does not prune when only protected parts exist (no candidates)', () => {
    // Exactly PROTECTED_TOKENS of newest parts; no older candidates.
    const parts: PartForPrune[] = [part(PROTECTED_TOKENS, ago(10))];
    expect(selectPruneTargets(parts, null)).toEqual({ ids: [], freedTokens: 0 });
  });
 });
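Reviewer sketch of the selection rule these tests imply: walk parts newest-first, protect the first 40k tokens, stop at the compaction boundary, and only hide candidates when they free at least 20k. The constants come from the test comments; the function, the inclusive protection threshold, and the newest-first sort are assumptions, not the real inference/prune.ts.

```typescript
// Sketch only: hypothetical re-statement of the behaviour the tests assert.
interface PartForPruneSketch {
  id: string;
  payload: unknown;
  created_at: Date;
}

const PROTECTED_TOKENS_SKETCH = 40_000; // from the "PROTECTED_TOKENS=40k" comment
const PRUNE_TRIGGER_TOKENS_SKETCH = 20_000; // from the "20k trigger" comments

function estimateTokensSketch(payload: unknown): number {
  // chars/4 over the JSON form; undefined payloads count as empty.
  return Math.ceil((JSON.stringify(payload) ?? '').length / 4);
}

function selectPruneTargetsSketch(
  parts: PartForPruneSketch[],
  tailStart: Date | null,
): { ids: string[]; freedTokens: number } {
  const newestFirst = [...parts].sort(
    (a, b) => b.created_at.getTime() - a.created_at.getTime(),
  );
  const ids: string[] = [];
  let seen = 0;
  let freed = 0;
  for (const p of newestFirst) {
    // Anything older than the summary boundary is out of scope.
    if (tailStart !== null && p.created_at.getTime() < tailStart.getTime()) break;
    const tokens = estimateTokensSketch(p.payload);
    seen += tokens;
    // Parts stay protected while the running total fits the window.
    if (seen <= PROTECTED_TOKENS_SKETCH) continue;
    ids.push(p.id);
    freed += tokens;
  }
  // Below the trigger, pruning is not worth the churn: hide nothing.
  if (freed < PRUNE_TRIGGER_TOKENS_SKETCH) return { ids: [], freedTokens: 0 };
  return { ids, freedTokens: freed };
}
```

The all-or-nothing trigger is what makes the summary-boundary test return no ids: the single 15k candidate inside the window is below the 20k threshold.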
--- a/apps/server/src/services/tests/system-prompt.test.ts
+++ b/apps/server/src/services/tests/system-prompt.test.ts
@@ -6,7 +6,9 @@ import {
  loadContainerGuidance,
  getContainerGuidance,
  buildSystemPrompt,
  buildSystemPromptWithFingerprint,
  _resetContainerGuidanceCacheForTests,
  _resetPrefixObserverForTests,
 } from '../system-prompt.js';
 import type { Agent, Project, Session } from '../../types/api.js';
@@ -17,12 +19,14 @@ let tmpDir: string;
 beforeEach(async () => {
  tmpDir = await mkdtemp(join(tmpdir(), 'system-prompt-test-'));
  _resetContainerGuidanceCacheForTests();
  _resetPrefixObserverForTests();
  delete process.env['CONTAINER_GUIDANCE_FILE'];
 });
 afterEach(async () => {
  delete process.env['CONTAINER_GUIDANCE_FILE'];
  _resetContainerGuidanceCacheForTests();
  _resetPrefixObserverForTests();
  await rm(tmpDir, { recursive: true, force: true });
 });
@@ -176,3 +180,75 @@ describe('buildSystemPrompt', () => {
    expect(prompt).not.toContain('--- end container guidance ---');
  });
 });
 // v1.13.8: byte-stability instrumentation surface.
 describe('buildSystemPromptWithFingerprint (v1.13.8)', () => {
  it('returns byte-identical prompts for two consecutive calls with the same inputs', async () => {
    const path = join(tmpDir, 'BOOCHAT.md');
    await writeFile(path, 'stable guidance', 'utf8');
    process.env['CONTAINER_GUIDANCE_FILE'] = path;
    const session = makeSession();
    const project = makeProject({ path: '/tmp/stable-proj' });
    const agent = makeAgent({ system_prompt: 'be terse' });
    const first = await buildSystemPromptWithFingerprint(project, session, agent);
    const second = await buildSystemPromptWithFingerprint(project, session, agent);
    expect(first.prompt).toBe(second.prompt);
    expect(first.fingerprint.prefix_hash).toBe(second.fingerprint.prefix_hash);
    expect(first.fingerprint.prefix_length).toBe(second.fingerprint.prefix_length);
  });
  it('emits drift=null on the first call for a fresh session, then null again when nothing changes', async () => {
    process.env['CONTAINER_GUIDANCE_FILE'] = join(tmpDir, 'absent.md');
    const session = makeSession();
    const project = makeProject({ path: '/tmp/stable-proj' });
    const first = await buildSystemPromptWithFingerprint(project, session, null);
    expect(first.drift).toBeNull();
    const second = await buildSystemPromptWithFingerprint(project, session, null);
    expect(second.drift).toBeNull();
    expect(second.fingerprint.prefix_hash).toBe(first.fingerprint.prefix_hash);
  });
  it('emits drift with prev/new hashes and a changed_inputs entry when an input mutates', async () => {
    // Two BOOCHAT.md contents with different mtimes → guidance cache picks
    // up the change → fingerprint hash flips → drift fires.
    const path = join(tmpDir, 'BOOCHAT.md');
    await writeFile(path, 'first', 'utf8');
    process.env['CONTAINER_GUIDANCE_FILE'] = path;
    const session = makeSession();
    const project = makeProject({ path: '/tmp/stable-proj' });
    const first = await buildSystemPromptWithFingerprint(project, session, null);
    expect(first.drift).toBeNull();
    await writeFile(path, 'second — different content', 'utf8');
    const later = new Date(Date.now() + 60_000);
    await utimes(path, later, later);
    const second = await buildSystemPromptWithFingerprint(project, session, null);
    expect(second.drift).not.toBeNull();
    expect(second.drift!.prev_hash).toBe(first.fingerprint.prefix_hash);
    expect(second.drift!.new_hash).toBe(second.fingerprint.prefix_hash);
    expect(second.drift!.prev_hash).not.toBe(second.drift!.new_hash);
    expect(second.drift!.changed_inputs).toContain('mtime_boochat');
  });
  it('does not fire drift across distinct sessions even if their hashes differ', async () => {
    process.env['CONTAINER_GUIDANCE_FILE'] = join(tmpDir, 'absent.md');
    const sessionA = makeSession({ id: 'sess-A' });
    const sessionB = makeSession({ id: 'sess-B', system_prompt: 'B-only override' });
    const project = makeProject({ path: '/tmp/stable-proj' });
    const a = await buildSystemPromptWithFingerprint(project, sessionA, null);
    const b = await buildSystemPromptWithFingerprint(project, sessionB, null);
    expect(a.drift).toBeNull();
    expect(b.drift).toBeNull();
    expect(a.fingerprint.prefix_hash).not.toBe(b.fingerprint.prefix_hash);
  });
 });
--- a/apps/server/src/services/tests/tool_cost_stats.test.ts
+++ b/apps/server/src/services/tests/tool_cost_stats.test.ts
@@ -0,0 +1,236 @@
 import { describe, it, expect, beforeAll, afterAll } from 'vitest';
 import postgres from 'postgres';
 import { readFileSync } from 'node:fs';
 import { resolve } from 'node:path';
 import { fileURLToPath } from 'node:url';
 // v1.13.10: integration tests for the tool_cost_stats view. Skipped unless
 // DATABASE_URL is set so they don't break `pnpm test` on a fresh checkout.
 // Run with:
 //   DATABASE_URL=postgres://boocode:<pw>@localhost:5500/boocode pnpm -C apps/server test
 //
 // Isolation: every tool name carries the timestamp-based TEST_RUN_ID prefix
 // plus a per-test suffix. The view aggregates globally across all chats, so
 // without unique tool names parallel test runs would interfere. Cleanup
 // deletes the fixture project in afterAll; the FK CASCADE removes the rest.
 const DB_URL = process.env.DATABASE_URL;
 const describeFn = DB_URL ? describe : describe.skip;
 const TEST_RUN_ID = `v13_10_${Date.now()}`;
 const tname = (suffix: string) => `${TEST_RUN_ID}_${suffix}`;
 describeFn('tool_cost_stats view (v1.13.10)', () => {
  let sql: ReturnType<typeof postgres>;
  let projectId: string;
  let sessionId: string;
  let chatId: string;
  beforeAll(async () => {
    if (!DB_URL) return;
    sql = postgres(DB_URL, { max: 2, idle_timeout: 5, connect_timeout: 5, onnotice: () => {} });
    // Apply the schema before fixtures so the view exists. Idempotent via
    // CREATE OR REPLACE VIEW + CREATE TABLE IF NOT EXISTS; safe to run on a
    // pre-populated DB. Mirrors apps/server/src/db.ts:applySchema.
    const here = fileURLToPath(import.meta.url);
    const schemaPath = resolve(here, '../../../schema.sql');
    const ddl = readFileSync(schemaPath, 'utf8');
    await sql.unsafe(ddl);
    // Fixture project + session + chat for all inserts in this file.
    const proj = await sql<{ id: string }[]>`
      INSERT INTO projects (name, path)
      VALUES (${`tool_cost_stats_test_${TEST_RUN_ID}`}, ${`/tmp/${TEST_RUN_ID}`})
      RETURNING id
    `;
    projectId = proj[0]!.id;
    const sess = await sql<{ id: string }[]>`
      INSERT INTO sessions (project_id, name, model)
      VALUES (${projectId}, ${'test'}, ${'test-model'})
      RETURNING id
    `;
    sessionId = sess[0]!.id;
    const chat = await sql<{ id: string }[]>`
      INSERT INTO chats (session_id, name) VALUES (${sessionId}, ${'test'}) RETURNING id
    `;
    chatId = chat[0]!.id;
  });
  afterAll(async () => {
    if (!DB_URL) return;
    // Project FK CASCADE cleans sessions/chats/messages/parts in one shot.
    await sql`DELETE FROM projects WHERE id = ${projectId}`;
    await sql.end({ timeout: 5 });
  });
  async function insertAssistantTurn(opts: {
    toolNames: string[];
    tokensUsed: number | null;
    ctxUsed: number | null;
    status?: 'streaming' | 'complete' | 'failed' | 'cancelled';
    metadata?: { kind: string } | null;
    createdAt?: Date;
  }): Promise<string> {
    const toolCalls = opts.toolNames.map((name, i) => ({
      id: `call_${TEST_RUN_ID}_${name}_${i}`,
      name,
      args: {},
    }));
    const created = opts.createdAt ?? new Date();
    // v1.13.20: parts-only. messages.tool_calls column was dropped; the
    // tool_cost_stats view reads through messages_with_parts which derives
    // tool_calls from message_parts rows.
    const rows = await sql<{ id: string }[]>`
      INSERT INTO messages (
        session_id, chat_id, role, content, kind, status,
        tokens_used, ctx_used,
        metadata, created_at
      )
      VALUES (
        ${sessionId}, ${chatId}, 'assistant', '', 'message',
        ${opts.status ?? 'complete'},
        ${opts.tokensUsed},
        ${opts.ctxUsed},
        ${opts.metadata ? sql.json(opts.metadata as never) : null},
        ${created}
      )
      RETURNING id
    `;
    const messageId = rows[0]!.id;
    for (let i = 0; i < toolCalls.length; i++) {
      await sql`
        INSERT INTO message_parts (message_id, sequence, kind, payload)
        VALUES (${messageId}, ${i}, 'tool_call', ${sql.json(toolCalls[i] as never)})
      `;
    }
    return messageId;
  }
  it('returns empty when no tool calls exist for a tool name', async () => {
    const t = tname('absent');
    const stats = await sql<{ tool_name: string }[]>`
      SELECT * FROM tool_cost_stats WHERE tool_name = ${t}
    `;
    expect(stats).toEqual([]);
  });
  it('attributes single-tool turn fully to that tool', async () => {
    const t = tname('single');
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 300, ctxUsed: 15000 });
    const stats = await sql<{
      tool_name: string;
      prompt_tokens_sum: number;
      completion_tokens_sum: number;
      n_calls: number;
    }[]>`SELECT * FROM tool_cost_stats WHERE tool_name = ${t}`;
    expect(stats[0]).toMatchObject({
      tool_name: t,
      prompt_tokens_sum: 15000,
      completion_tokens_sum: 300,
      n_calls: 1,
    });
  });
  it('splits multi-tool turn equally across tools', async () => {
    const a = tname('multi_a');
    const b = tname('multi_b');
    const c = tname('multi_c');
    // 3 tools, 300 completion / 15000 prompt → each gets 100 / 5000
    await insertAssistantTurn({ toolNames: [a, b, c], tokensUsed: 300, ctxUsed: 15000 });
    const stats = await sql<{
      tool_name: string;
      prompt_tokens_sum: number;
      completion_tokens_sum: number;
      n_calls: number;
    }[]>`
      SELECT * FROM tool_cost_stats
      WHERE tool_name IN (${a}, ${b}, ${c})
      ORDER BY tool_name
    `;
    expect(stats).toHaveLength(3);
    for (const s of stats) {
      expect(s.completion_tokens_sum).toBe(100);
      expect(s.prompt_tokens_sum).toBe(5000);
      expect(s.n_calls).toBe(1);
    }
  });
  it('limits to last 100 calls per tool (FIFO window)', async () => {
    const t = tname('window');
    // Insert 110 turns with monotonically-increasing created_at and tokensUsed.
    // Expect view to keep only the most recent 100.
    const base = Date.now() + 1_000_000; // ~17 min ahead of now, to avoid colliding with other tests
    for (let i = 1; i <= 110; i++) {
      await insertAssistantTurn({
        toolNames: [t],
        tokensUsed: i, // 1..110
        ctxUsed: i * 10,
        createdAt: new Date(base + i),
      });
    }
    const [stat] = await sql<{
      n_calls: number;
      completion_tokens_sum: number;
    }[]>`SELECT n_calls, completion_tokens_sum FROM tool_cost_stats WHERE tool_name = ${t}`;
    expect(stat!.n_calls).toBe(100);
    // Last 100 are tokensUsed=11..110, sum = (11+110)*100/2 = 6050.
    expect(stat!.completion_tokens_sum).toBe(6050);
  });
  it('excludes turns with NULL tokens_used or ctx_used (pre-v1.13.7 latent regression)', async () => {
    const t = tname('null_tokens');
    await insertAssistantTurn({ toolNames: [t], tokensUsed: null, ctxUsed: 1000 });
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: null });
    const stats = await sql`SELECT * FROM tool_cost_stats WHERE tool_name = ${t}`;
    expect(stats).toEqual([]);
  });
  it('excludes failed/cancelled turns and cap_hit/doom_loop sentinel rows', async () => {
    const t = tname('filtered');
    // A: status='failed'                              — excluded
    // B: status='cancelled'                           — excluded
    // C: status='complete', metadata={kind:'cap_hit'} — excluded
    // D: status='complete', metadata={kind:'doom_loop'} — excluded
    // E: status='complete', metadata=null             — included
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: 1000, status: 'failed' });
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: 1000, status: 'cancelled' });
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: 1000, metadata: { kind: 'cap_hit' } });
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: 1000, metadata: { kind: 'doom_loop' } });
    await insertAssistantTurn({ toolNames: [t], tokensUsed: 100, ctxUsed: 1000, metadata: null });
    const [stat] = await sql<{ n_calls: number }[]>`
      SELECT n_calls FROM tool_cost_stats WHERE tool_name = ${t}
    `;
    expect(stat!.n_calls).toBe(1);
  });
  it('reads tool_calls via messages_with_parts (parts-authoritative)', async () => {
    const t = tname('parts');
    // v1.13.20: post-column-drop the only source for tool_calls is
    // message_parts. This test asserts the same path the view always took
    // (parts-derived), now that the legacy column COALESCE fallback is gone.
    const rows = await sql<{ id: string }[]>`
      INSERT INTO messages (
        session_id, chat_id, role, content, kind, status,
        tokens_used, ctx_used
      )
      VALUES (
        ${sessionId}, ${chatId}, 'assistant', '', 'message', 'complete',
        200, 5000
      )
      RETURNING id
    `;
    const messageId = rows[0]!.id;
    await sql`
      INSERT INTO message_parts (message_id, sequence, kind, payload)
      VALUES (
        ${messageId}, 0, 'tool_call',
        ${sql.json({ id: `tc_parts_${TEST_RUN_ID}`, name: t, args: {} } as never)}
      )
    `;
    const [stat] = await sql<{ n_calls: number }[]>`
      SELECT n_calls FROM tool_cost_stats WHERE tool_name = ${t}
    `;
    expect(stat!.n_calls).toBe(1);
  });
 });
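The attribution rules these tests pin down (equal split across the tools in a turn, a last-100 window per tool, and exclusion of NULL-token, failed/cancelled, and sentinel rows) can be modelled in memory. This is a minimal sketch under stated assumptions, not the view's SQL: `modelToolCostStats`, `Turn`, and `ToolStat` are hypothetical names, and treating only `complete` turns as eligible is an assumption (the tests only show `failed` and `cancelled` being excluded).

```typescript
// Hypothetical in-memory model of the rules asserted above; NOT the view's SQL.
type Turn = {
  toolNames: string[];
  tokensUsed: number | null; // completion tokens for the whole turn
  ctxUsed: number | null; // prompt tokens for the whole turn
  status: 'streaming' | 'complete' | 'failed' | 'cancelled';
  metadataKind?: string | null;
  createdAt: number; // epoch milliseconds
};

type ToolStat = {
  n_calls: number;
  prompt_tokens_sum: number;
  completion_tokens_sum: number;
};

type Share = { at: number; prompt: number; completion: number };

function modelToolCostStats(turns: Turn[], windowSize = 100): Record<string, ToolStat> {
  const perTool: Record<string, Share[]> = {};
  for (const t of turns) {
    // Assumption: only 'complete' turns are eligible.
    if (t.status !== 'complete') continue;
    // Turns missing either token count are skipped entirely.
    if (t.tokensUsed === null || t.ctxUsed === null) continue;
    // Sentinel rows are skipped.
    if (t.metadataKind === 'cap_hit' || t.metadataKind === 'doom_loop') continue;
    const n = t.toolNames.length;
    for (const name of t.toolNames) {
      // Each tool in the turn receives an equal share of the turn's tokens.
      const list = (perTool[name] = perTool[name] || []);
      list.push({ at: t.createdAt, prompt: t.ctxUsed / n, completion: t.tokensUsed / n });
    }
  }
  const out: Record<string, ToolStat> = {};
  for (const name of Object.keys(perTool)) {
    // Keep only the most recent `windowSize` calls for this tool.
    const recent = perTool[name]
      .slice()
      .sort((a, b) => b.at - a.at)
      .slice(0, windowSize);
    out[name] = {
      n_calls: recent.length,
      prompt_tokens_sum: recent.reduce((sum, c) => sum + c.prompt, 0),
      completion_tokens_sum: recent.reduce((sum, c) => sum + c.completion, 0),
    };
  }
  return out;
}
```

A model like this is only useful for reasoning about expected values when writing new fixtures; the view itself remains the thing under test.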
--- a/apps/server/src/services/tests/tools.test.ts
+++ b/apps/server/src/services/tests/tools.test.ts
@@ -0,0 +1,76 @@
 import { describe, it, expect } from 'vitest';
 import {
  ALL_TOOLS,
  CORE_TOOL_NAMES,
  STANDARD_TOOL_NAMES,
  TOOLS_BY_NAME,
  resolveToolTier,
 } from '../tools.js';
 describe('ALL_TOOLS registry', () => {
  // v1.13.3: tools must be alpha-sorted at module load. llama.cpp's prompt
  // cache hits on byte-identical prefixes; the tool list lives near the
  // top of the system prompt, so any order drift invalidates every cached
  // turn. The registry sort is the single source of truth; downstream
  // helpers (toolJsonSchemas, TOOLS_BY_NAME, buildAiTools) inherit it.
  it('exports tools in alphabetical order by name', () => {
    const names = ALL_TOOLS.map((t) => t.name);
    expect(names).toEqual([...names].sort((a, b) => a.localeCompare(b)));
  });
 });
 describe('resolveToolTier (v1.13.15-tools)', () => {
  it('returns CORE tools for tier=core', () => {
    expect(resolveToolTier('core')).toEqual(CORE_TOOL_NAMES);
  });
  it('returns STANDARD tools for tier=standard', () => {
    const result = resolveToolTier('standard');
    expect(result.length).toBe(STANDARD_TOOL_NAMES.length);
    expect(result.length).toBeGreaterThan(CORE_TOOL_NAMES.length);
    // STANDARD is a strict superset of CORE.
    expect(result).toEqual(expect.arrayContaining([...CORE_TOOL_NAMES]));
  });
  it('returns ALL tool names for tier=all', () => {
    expect(resolveToolTier('all').length).toBe(ALL_TOOLS.length);
  });
  it('defaults to all when env var is undefined', () => {
    expect(resolveToolTier(undefined).length).toBe(ALL_TOOLS.length);
  });
  it('is case-insensitive', () => {
    expect(resolveToolTier('CORE')).toEqual(CORE_TOOL_NAMES);
    expect(resolveToolTier('Standard').length).toBe(STANDARD_TOOL_NAMES.length);
  });
  it('falls back to all for unknown tier strings', () => {
    expect(resolveToolTier('bogus').length).toBe(ALL_TOOLS.length);
  });
 });
 describe('CORE_TOOL_NAMES + STANDARD_TOOL_NAMES validation', () => {
  // The module-load validation in tools.ts throws if a tier references a
  // tool that doesn't exist in TOOLS_BY_NAME. These tests double-check that
  // invariant from the consumer side so a future tier-list edit can't smuggle
  // in a typo without a test failure.
  it('every CORE name exists in TOOLS_BY_NAME', () => {
    for (const name of CORE_TOOL_NAMES) {
      expect(TOOLS_BY_NAME[name], `CORE references unknown tool '${name}'`).toBeDefined();
    }
  });
  it('every STANDARD name exists in TOOLS_BY_NAME', () => {
    for (const name of STANDARD_TOOL_NAMES) {
      expect(TOOLS_BY_NAME[name], `STANDARD references unknown tool '${name}'`).toBeDefined();
    }
  });
  it('CORE is a subset of STANDARD', () => {
    const standardSet = new Set<string>(STANDARD_TOOL_NAMES);
    for (const name of CORE_TOOL_NAMES) {
      expect(standardSet.has(name), `'${name}' is in CORE but not STANDARD`).toBe(true);
    }
  });
 });
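The tier behaviour asserted above (case-insensitive match, with `undefined` and unknown strings falling back to the full list) could be implemented along these lines. This is a sketch, not the code in tools.ts: the tool names, the tier lists, and the names `resolveTierSketch` and `isSubset` are all hypothetical.

```typescript
// Hypothetical tool names and tier lists, for illustration only.
const ALL_NAMES: readonly string[] = ['edit_file', 'grep', 'view_file', 'web_fetch'];
const STANDARD_NAMES: readonly string[] = ['edit_file', 'grep', 'view_file'];
const CORE_NAMES: readonly string[] = ['grep', 'view_file'];

// Resolve a raw tier string (e.g. from an env var) to a list of tool names.
function resolveTierSketch(raw: string | undefined): readonly string[] {
  switch ((raw || '').toLowerCase()) {
    case 'core':
      return CORE_NAMES;
    case 'standard':
      return STANDARD_NAMES;
    default:
      // Covers 'all', undefined, and any unrecognised string.
      return ALL_NAMES;
  }
}

// True when every name in `inner` also appears in `outer`.
function isSubset(inner: readonly string[], outer: readonly string[]): boolean {
  return inner.every((name) => outer.indexOf(name) !== -1);
}
```

Falling back to the widest tier on unknown input means a typo in configuration never silently removes tools; whether that is the right default is a design choice the tests above lock in.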
--- a/apps/server/src/services/tests/truncate.test.ts
+++ b/apps/server/src/services/tests/truncate.test.ts
@@ -0,0 +1,104 @@
 // v1.13.5: truncate.ts unit coverage. Each test isolates TRUNCATION_DIR
 // under os.tmpdir() so concurrent vitest runs don't collide and the suite
 // stays self-cleaning. Only the file-system half of cleanupTruncations is
 // covered; the orphan-reap branch needs a real Postgres and is tested via
 // the smoke flow rather than vitest.
 import { afterEach, beforeAll, describe, expect, it, vi } from 'vitest';
 import { promises as fs } from 'fs';
 import path from 'path';
 import os from 'os';
 // Set the env var BEFORE importing the module so its module-load constant
 // reads the test directory rather than /tmp/boocode-truncations.
 const testDir = path.join(os.tmpdir(), `boocode-truncate-test-${process.pid}-${Date.now()}`);
 process.env.BOOCODE_TRUNCATION_DIR = testDir;
 const mod = await import('../truncate.js');
 const { storeTruncation, readTruncation, truncateIfNeeded, MAX_TRUNCATION_BYTES } = mod;
 beforeAll(async () => {
  await fs.mkdir(testDir, { recursive: true });
 });
 afterEach(async () => {
  // Drop every file between tests so id-collision asserts and orphan-style
  // counts start from zero.
  const entries = await fs.readdir(testDir).catch(() => [] as string[]);
  await Promise.all(entries.map((n) => fs.unlink(path.join(testDir, n)).catch(() => {})));
 });
 describe('storeTruncation / readTruncation roundtrip', () => {
  it('writes and reads identical content', async () => {
    const original = 'hello\nworld\n' + 'x'.repeat(500);
    const id = await storeTruncation(original);
    expect(id).toMatch(/^tr_[0-9a-v]{12}$/);
    const got = await readTruncation(id);
    expect(got).toBe(original);
  });
  it('readTruncation returns null for unknown ids', async () => {
    const got = await readTruncation('tr_000000000000');
    expect(got).toBeNull();
  });
  it('readTruncation rejects malformed ids (returns null, never escapes dir)', async () => {
    // Path traversal attempt; readTruncation should not even try to open.
    const got = await readTruncation('../../etc/passwd');
    expect(got).toBeNull();
  });
 });
 describe('truncateIfNeeded', () => {
  it('returns sliced content with no outputPath when wasTruncated=false', async () => {
    const out = await truncateIfNeeded({
      fullContent: 'irrelevant',
      slicedContent: 'visible',
      wasTruncated: false,
    });
    expect(out).toEqual({ content: 'visible', truncated: false });
    expect('outputPath' in out).toBe(false);
  });
  it('stashes full content and returns outputPath when wasTruncated=true', async () => {
    const full = 'line1\nline2\nline3\nline4\n';
    const sliced = 'line1\nline2\n[truncated]';
    const out = await truncateIfNeeded({
      fullContent: full,
      slicedContent: sliced,
      wasTruncated: true,
    });
    expect(out.content).toBe(sliced);
    expect(out.truncated).toBe(true);
    expect(out.outputPath).toMatch(/^tr_[0-9a-v]{12}$/);
    const stashed = await readTruncation(out.outputPath!);
    expect(stashed).toBe(full);
  });
  it('skips storage but still reports truncated when fullContent exceeds the cap', async () => {
    // Build content larger than MAX_TRUNCATION_BYTES. Use a Buffer to size
    // it without holding a literal that triggers the gigantic-string lint.
    const oversized = Buffer.alloc(MAX_TRUNCATION_BYTES + 1, 'x').toString('utf8');
    const sliced = 'preview...';
    const out = await truncateIfNeeded({
      fullContent: oversized,
      slicedContent: sliced,
      wasTruncated: true,
    });
    expect(out).toEqual({ content: sliced, truncated: true });
    expect('outputPath' in out).toBe(false);
  });
  it('storage failure surfaces as truncated without outputPath', async () => {
    // Force writeFile to throw. Spy at the fs module level since truncate.ts
    // imports { promises as fs } and storeTruncation calls fs.writeFile.
    const spy = vi.spyOn(fs, 'writeFile').mockRejectedValueOnce(new Error('disk full'));
    const out = await truncateIfNeeded({
      fullContent: 'short',
      slicedContent: 'sliced',
      wasTruncated: true,
    });
    expect(out).toEqual({ content: 'sliced', truncated: true });
    expect('outputPath' in out).toBe(false);
    spy.mockRestore();
  });
 });
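The malformed-id test relies on ids being validated before any path is built. A minimal sketch of that guard, assuming the id pattern the tests match (`tr_` followed by 12 characters from `0-9a-v`); `resolveTruncationPathSketch` and the on-disk file naming are hypothetical, not taken from truncate.ts.

```typescript
// Pattern taken from the assertions above: 'tr_' + 12 chars in [0-9a-v].
const TRUNCATION_ID_RE = /^tr_[0-9a-v]{12}$/;

// Returns a path for a well-formed id, or null without touching the
// file system. Because the pattern admits no '/', '.', or '\\', a
// traversal string such as '../../etc/passwd' can never produce a path.
// Assumption: files are stored directly under `dir`, named by id.
function resolveTruncationPathSketch(dir: string, id: string): string | null {
  if (!TRUNCATION_ID_RE.test(id)) return null;
  return `${dir}/${id}`;
}
```

Validating against an allow-list pattern is simpler to reason about than normalising the path and checking it stays inside the directory, since rejected input is never joined to the directory at all.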
--- a/apps/server/src/services/tests/web_tools.test.ts
+++ b/apps/server/src/services/tests/web_tools.test.ts
@@ -295,9 +295,10 @@ describe('executeWebFetch — size + truncation', () => {
    // 1.5M U+1F600 emojis: each is length 2 in UTF-16 (surrogate pair) and
    // 4 bytes in UTF-8. body.length = 3,000,000 chars (~2.86 MiB by
    // UTF-16 count) but Buffer.byteLength = 6,000,000 bytes (>5 MiB).
-    // Pre-fix the char-count comparison let this through; the byte-count
+    // v1.11.10: streaming reader catches this as body_too_large (was
-    // check now rejects. No Content-Length header so the pre-flight
+    // response_too_large in the post-consumption check). No
-    // guard doesn't fire — we're testing the POST-consumption check.
+    // Content-Length header, so the pre-flight check passes and the streaming
    // path is the one that rejects.
    const heavy = '😀'.repeat(1_500_000);
    const fakeFetch = vi.fn().mockResolvedValue(
      new Response(heavy, { status: 200, headers: { 'content-type': 'text/plain' } }),
@@ -308,9 +309,8 @@ describe('executeWebFetch — size + truncation', () => {
    );
    expect('error' in result).toBe(true);
    if ('error' in result) {
-      expect(result.error).toBe('response_too_large');
+      expect(result.error).toBe('body_too_large');
-      // Error reason should reference bytes, not character count.
+      expect(result.reason).toMatch(/exceeded/);
      expect(result.reason).toMatch(/bytes/);
    }
  });
@@ -453,3 +453,138 @@ describe('executeWebFetch — redirect handling', () => {
    expect(fakeFetch.mock.calls[1]![0]).toBe('https://example.com/foo');
  });
 });
 // ============================================================================
 // v1.11.10: streaming body cap — abort the response stream at MAX_BYTES
 // ============================================================================
 // MAX_BYTES is 5 * 1024 * 1024 = 5_242_880. Repeating this here (rather
 // than importing) so a change to the cap surfaces as a test failure —
 // the limit is part of the public contract.
 const MAX_BYTES_TEST = 5 * 1024 * 1024;
 // Build a Response whose body is a real ReadableStream. Uses pull() (not
 // start()) so chunks are produced lazily — without backpressure, an
 // unbounded start() enqueues everything and calls controller.close()
 // before the consumer reads, which means a subsequent reader.cancel()
 // finds the stream already closed and the cancel callback never fires.
 // `cancelFlag` lets the test observe whether reader.cancel() reached the
 // underlying source mid-stream.
 function streamedResponse(
  chunks: Uint8Array[],
  init: { contentType?: string; contentLength?: number | null; cancelFlag?: { cancelled: boolean } } = {},
 ): Response {
  let idx = 0;
  const stream = new ReadableStream({
    pull(controller) {
      if (idx >= chunks.length) {
        controller.close();
        return;
      }
      controller.enqueue(chunks[idx]!);
      idx += 1;
    },
    cancel() {
      if (init.cancelFlag) init.cancelFlag.cancelled = true;
    },
  });
  const headers: Record<string, string> = {};
  if (init.contentType) headers['content-type'] = init.contentType;
  if (init.contentLength !== undefined && init.contentLength !== null) {
    headers['content-length'] = String(init.contentLength);
  }
  return new Response(stream, { status: 200, headers });
 }
 describe('executeWebFetch — streaming body cap (v1.11.10)', () => {
  it('aborts the stream when a server lies about Content-Length and emits over the cap', async () => {
    // Honest header would have failed the pre-flight check. The lie is
    // the point: pre-flight passes (100 < 5MB) and the streaming reader
    // has to be the thing that catches the oversized body.
    //
    // Chunk count is deliberately higher than what the reader will
    // consume (10 × 1MB available, but the reader cancels once the 6th
    // chunk pushes the total past 5MB). That headroom keeps the stream in
    // 'readable' state at the moment reader.cancel() runs — otherwise
    // a pull-then-close race could make the source close the stream
    // before cancel reaches it, and the cancel() callback wouldn't fire.
    const oneMB = new Uint8Array(1024 * 1024).fill(65); // 'A'
    const tenMBInChunks = Array.from({ length: 10 }, () => oneMB);
    const cancelFlag = { cancelled: false };
    const fakeFetch = vi.fn().mockResolvedValue(
      streamedResponse(tenMBInChunks, {
        contentType: 'text/plain',
        contentLength: 100,
        cancelFlag,
      }),
    );
    const result = await executeWebFetch(
      { url: 'https://example.com/lying-server' },
      fakeFetch as unknown as typeof fetch,
    );
    expect('error' in result).toBe(true);
    if ('error' in result) {
      expect(result.error).toBe('body_too_large');
      expect(result.reason).toMatch(/exceeded/);
    }
    // Critical: reader.cancel() actually fired so the underlying
    // connection / stream got released. Otherwise the abort would be
    // notional and the server could keep streaming.
    expect(cancelFlag.cancelled).toBe(true);
  });
  it('catches an oversized stream when Content-Length is omitted entirely', async () => {
    // Many real servers (chunked transfer-encoding, dynamic responses)
    // never send Content-Length. The pre-flight check has nothing to
    // gate on; the streaming reader is the only line of defense.
    // 10 chunks vs the ~6 the reader will consume — same headroom
    // rationale as the lying-Content-Length test above.
    const oneMB = new Uint8Array(1024 * 1024).fill(66); // 'B'
    const tenMBInChunks = Array.from({ length: 10 }, () => oneMB);
    const fakeFetch = vi.fn().mockResolvedValue(
      streamedResponse(tenMBInChunks, { contentType: 'text/plain' }),
    );
    const result = await executeWebFetch(
      { url: 'https://example.com/no-length' },
      fakeFetch as unknown as typeof fetch,
    );
    expect('error' in result && result.error).toBe('body_too_large');
  });
  it('passes a multi-chunk body that totals just under the cap', async () => {
    // Boundary case: MAX_BYTES - 1 bytes split across N chunks. The
    // streaming reader's `total > maxBytes` check is strict-greater so
    // exactly MAX_BYTES would still succeed; MAX_BYTES + 1 would fail.
    // Using MAX_BYTES - 1 keeps this test off the exact boundary.
    const targetTotal = MAX_BYTES_TEST - 1;
    const chunkSize = 256 * 1024; // 256 KiB chunks
    const chunks: Uint8Array[] = [];
    let remaining = targetTotal;
    while (remaining > 0) {
      const size = Math.min(chunkSize, remaining);
      chunks.push(new Uint8Array(size).fill(67)); // 'C'
      remaining -= size;
    }
    const fakeFetch = vi.fn().mockResolvedValue(
      streamedResponse(chunks, { contentType: 'text/plain' }),
    );
    const result = await executeWebFetch(
      { url: 'https://example.com/right-at-cap' },
      fakeFetch as unknown as typeof fetch,
    );
    // The streaming reader succeeded — we got a content shape, not an
    // error. (Downstream truncate() will clamp the final string to
    // MAX_CHARS_CAP=32000 and set truncated:true; that's the existing
    // truncation logic and is exercised by its own test. The point of
    // THIS test is that readBodyCapped didn't trip on a body that
    // sits just under its byte limit.)
    expect('content' in result).toBe(true);
    if ('content' in result) {
      expect(result.content.length).toBeGreaterThan(0);
      // All ASCII 'C's, so the leading 200 chars before any truncation
      // marker should be all C — proves we read real bytes through the
      // streaming reader rather than getting an empty buffer.
      expect(result.content.slice(0, 200)).toBe('C'.repeat(200));
    }
  });
 });
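The behaviour these tests describe (count bytes as chunks arrive, reject on a strict `total > maxBytes`, and cancel the reader so the source is released) can be sketched as below. This is an illustration of the technique, not the project's `readBodyCapped`: the name `readBodyCappedSketch` and the `CappedResult` shape are assumptions, and it relies on the global `ReadableStream` available in Node 18+ and browsers.

```typescript
// Hypothetical result shape; the real function's return type is not shown in this diff.
type CappedResult =
  | { ok: true; bytes: Uint8Array }
  | { ok: false; error: 'body_too_large' };

async function readBodyCappedSketch(
  body: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<CappedResult> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const step = await reader.read();
    if (step.done) break;
    total += step.value.byteLength;
    // Strictly greater: a body of exactly maxBytes still succeeds.
    if (total > maxBytes) {
      // Propagates to the underlying source's cancel() callback.
      await reader.cancel();
      return { ok: false, error: 'body_too_large' };
    }
    chunks.push(step.value);
  }
  // Concatenate the collected chunks into one buffer.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return { ok: true, bytes };
}
```

Counting `byteLength` per chunk, rather than decoding first and measuring string length, is what makes the cap hold for multi-byte content such as the emoji body in the earlier test.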
--- a/apps/server/src/services/tests/ws-frames.test.ts
+++ b/apps/server/src/services/tests/ws-frames.test.ts
@@ -0,0 +1,218 @@
 import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
 import { readFileSync } from 'node:fs';
 import { resolve } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import {
  WsFrameSchema,
  KNOWN_FRAME_TYPES,
  type WsFrame,
 } from '../../types/ws-frames.js';
 import { createBroker } from '../broker.js';
 const VALID_UUID_A = '00000000-0000-0000-0000-000000000001';
 const VALID_UUID_B = '00000000-0000-0000-0000-000000000002';
 const VALID_UUID_C = '00000000-0000-0000-0000-000000000003';
 const VALID_TIMESTAMP = '2026-05-22T14:30:00.000Z';
 describe('WsFrameSchema (v1.13.11-a)', () => {
  it('accepts a well-formed chat_status frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'streaming',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(true);
  });
  it('rejects an unknown frame type', () => {
    const result = WsFrameSchema.safeParse({
      type: 'cosmic_ray_strike',
      chat_id: VALID_UUID_A,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a chat_status frame with invalid status enum', () => {
    // v1.12.1 dropped the legacy 'working' status. Any frame still emitting it
    // should fail validation — that's a drift catcher.
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'working',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects a UUID field with a non-UUID string', () => {
    const result = WsFrameSchema.safeParse({
      type: 'chat_status',
      chat_id: 'not-a-uuid',
      status: 'idle',
      at: VALID_TIMESTAMP,
    });
    expect(result.success).toBe(false);
  });
  it('rejects negative token counts in usage frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: -1,
      ctx_used: 100,
      ctx_max: 1000,
    });
    expect(result.success).toBe(false);
  });
  it('accepts a usage frame with nullable token counts (pre-v1.13.7 history)', () => {
    const result = WsFrameSchema.safeParse({
      type: 'usage',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      completion_tokens: null,
      ctx_used: null,
      ctx_max: null,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a tool_result frame with non-UUID tool_call_id (model-emitted)', () => {
    // Model-emitted tool_call_ids look like "call_abc123", not UUIDs.
    const result = WsFrameSchema.safeParse({
      type: 'tool_result',
      tool_message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      tool_call_id: 'call_abc123',
      output: { whatever: true },
      truncated: false,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a compacted frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'compacted',
      session_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      summary_message_id: VALID_UUID_C,
    });
    expect(result.success).toBe(true);
  });
  it('accepts a session_workspace_updated frame', () => {
    const result = WsFrameSchema.safeParse({
      type: 'session_workspace_updated',
      session_id: VALID_UUID_A,
      workspace_panes: [{ id: 'p1', kind: 'chat', chatIds: [], activeChatIdx: 0 }],
    });
    expect(result.success).toBe(true);
  });
  it('every KNOWN_FRAME_TYPES entry has a discriminated branch', () => {
    // Probe each known type by attempting a minimal valid construction.
    // Failure here means the union and the KNOWN_FRAME_TYPES list drifted.
    for (const type of KNOWN_FRAME_TYPES) {
      const probe = WsFrameSchema.safeParse({ type, __dummy__: true });
      // We expect FAILURE on every type because we're missing required fields,
      // but the failure must be ABOUT the missing fields, not about an unknown
      // type. An "Invalid discriminator value" error means the type isn't in
      // the union — that's a drift.
      if (probe.success) continue;
      const issues = probe.error.issues;
      const hasInvalidDiscriminator = issues.some(
        (i) => i.code === 'invalid_union_discriminator',
      );
      expect(hasInvalidDiscriminator, `frame type '${type}' is missing from the discriminated union`).toBe(false);
    }
  });
 });
 describe('ws-frames.ts file mirror parity', () => {
  it('apps/server and apps/web copies are byte-identical', () => {
    const here = fileURLToPath(import.meta.url);
    const serverPath = resolve(here, '../../../types/ws-frames.ts');
    const webPath = resolve(here, '../../../../../web/src/api/ws-frames.ts');
    const serverContent = readFileSync(serverPath, 'utf8');
    const webContent = readFileSync(webPath, 'utf8');
    expect(webContent, 'apps/web/src/api/ws-frames.ts must be byte-identical to apps/server/src/types/ws-frames.ts').toBe(serverContent);
  });
 });
 describe('broker.publishFrame / publishUserFrame fail-closed behavior', () => {
  let logErrors: Array<{ obj: unknown; msg: string }>;
  let mockLog: Parameters<typeof createBroker>[0];
  beforeEach(() => {
    logErrors = [];
    mockLog = {
      error: (obj: unknown, msg: string) => {
        logErrors.push({ obj, msg });
      },
      info: () => {},
      warn: () => {},
      debug: () => {},
      trace: () => {},
      fatal: () => {},
      child: () => mockLog as never,
      level: 'info',
      silent: () => {},
    } as unknown as Parameters<typeof createBroker>[0];
  });
  afterEach(() => {
    vi.restoreAllMocks();
  });
  it('publishFrame delivers a valid frame to subscribers', () => {
    const broker = createBroker(mockLog);
    const received: WsFrame[] = [];
    broker.subscribe('sess-1', (f) => received.push(f as WsFrame));
    broker.publishFrame('sess-1', {
      type: 'delta',
      message_id: VALID_UUID_A,
      chat_id: VALID_UUID_B,
      content: 'hello',
    });
    expect(received).toHaveLength(1);
    expect((received[0] as { type: string }).type).toBe('delta');
    expect(logErrors).toHaveLength(0);
  });
  it('publishFrame drops + logs an invalid frame instead of delivering it', () => {
    const broker = createBroker(mockLog);
    const received: WsFrame[] = [];
    broker.subscribe('sess-1', (f) => received.push(f as WsFrame));
    broker.publishFrame('sess-1', {
      type: 'delta',
      message_id: 'not-a-uuid',
      content: 'hello',
    } as never);
    expect(received).toHaveLength(0);
    expect(logErrors).toHaveLength(1);
    expect(logErrors[0]!.msg).toMatch(/ws-frame-validation-failed/);
  });
  it('publishUserFrame drops + logs an invalid user-channel frame', () => {
    const broker = createBroker(mockLog);
    const received: WsFrame[] = [];
    broker.subscribeUser('default', (f) => received.push(f as WsFrame));
    broker.publishUserFrame('default', {
      type: 'chat_status',
      chat_id: VALID_UUID_A,
      status: 'working', // v1.12.1 dropped this enum value
      at: VALID_TIMESTAMP,
    } as never);
    expect(received).toHaveLength(0);
    expect(logErrors).toHaveLength(1);
  });
  it('publishFrame validation failure does not throw (no cascade into stream-phase)', () => {
    const broker = createBroker(mockLog);
    expect(() =>
      broker.publishFrame('sess-1', { type: 'unknown_type' } as never),
    ).not.toThrow();
  });
 });
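The fail-closed contract tested above (validate before delivery, drop and log on failure, never throw into the caller) can be sketched without a schema library. Everything here is hypothetical: `createFailClosedPublisher`, `isDeltaFrame`, and the hand-rolled UUID check stand in for the broker and the schema validation, whose implementations are not part of this diff. Only the log message string is taken from the test assertion.

```typescript
// Hypothetical frame type mirroring the 'delta' frame used in the tests.
type DeltaFrame = { type: 'delta'; message_id: string; chat_id: string; content: string };

// Loose UUID shape check; a stand-in for real schema validation.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function isDeltaFrame(x: unknown): x is DeltaFrame {
  if (typeof x !== 'object' || x === null) return false;
  const f = x as Record<string, unknown>;
  return (
    f.type === 'delta' &&
    typeof f.message_id === 'string' && UUID_RE.test(f.message_id) &&
    typeof f.chat_id === 'string' && UUID_RE.test(f.chat_id) &&
    typeof f.content === 'string'
  );
}

function createFailClosedPublisher<T>(
  isValid: (x: unknown) => x is T,
  logError: (obj: unknown, msg: string) => void,
) {
  const subscribers: Array<(frame: T) => void> = [];
  return {
    subscribe(fn: (frame: T) => void): void {
      subscribers.push(fn);
    },
    // Returns true if delivered, false if dropped. Never throws on bad input.
    publish(frame: unknown): boolean {
      if (!isValid(frame)) {
        logError({ frame }, 'ws-frame-validation-failed');
        return false;
      }
      for (const fn of subscribers) fn(frame);
      return true;
    },
  };
}
```

Returning a boolean instead of throwing keeps a malformed frame from cascading into whatever phase was publishing it, which is the property the last test in the block above checks.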
--- a/apps/server/src/services/tests/xml-parser.test.ts
+++ b/apps/server/src/services/tests/xml-parser.test.ts
@@ -0,0 +1,357 @@
 // v1.13.16: covers the Qwen/Hermes <tool_call> parser, the new Anthropic
 // <invoke> parser, the partial-opener detector for both flavors, the unified
 // extraction helper, and the unknown-tool error formatter that downstream
 // dispatch uses to give the model a recovery hint when it drifts to a
 // Claude Code tool name like read_file instead of BooCode's view_file.
 import { describe, expect, it } from 'vitest';
 import {
  parseXmlToolCall,
  parseInvokeToolCall,
  partialXmlOpenerStart,
  extractToolCallBlocks,
  XML_TOOL_OPEN,
  XML_TOOL_CLOSE,
  INVOKE_TOOL_OPEN,
  INVOKE_TOOL_CLOSE,
 } from '../inference/xml-parser.js';
 import {
  levenshtein,
  suggestToolName,
  formatUnknownToolError,
 } from '../inference/tool-suggestions.js';
 describe('parseXmlToolCall (Qwen/Hermes <tool_call>)', () => {
  it('parses a well-formed single-parameter call', () => {
    const block = '<tool_call><function=view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
    expect(parseXmlToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('parses multi-parameter call', () => {
    const block = '<tool_call><function=grep><parameter=pattern>foo</parameter><parameter=path>src/</parameter></function></tool_call>';
    expect(parseXmlToolCall(block)).toEqual({
      name: 'grep',
      args: { pattern: 'foo', path: 'src/' },
    });
  });
  it('JSON-parses numeric parameter values', () => {
    const block = '<tool_call><function=foo><parameter=count>42</parameter></function></tool_call>';
    expect(parseXmlToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
  });
  it('tolerates whitespace around = in function (v1.13.16 tightening)', () => {
    const block = '<tool_call><function = view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
    expect(parseXmlToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('tolerates whitespace around = in parameter (v1.13.16 tightening)', () => {
    const block = '<tool_call><function=view_file><parameter = path>/tmp/foo</parameter></function></tool_call>';
    expect(parseXmlToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('returns null when function name is missing', () => {
    const block = '<tool_call><parameter=path>/tmp/foo</parameter></tool_call>';
    expect(parseXmlToolCall(block)).toBeNull();
  });
 });
 describe('parseInvokeToolCall (Anthropic <invoke>) — v1.13.16', () => {
  // Spec case 1
  it('parses a well-formed single-parameter call (spec case 1)', () => {
    const block = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  // Spec case 2
  it('parses a multi-parameter call (spec case 2)', () => {
    const block = '<invoke name="grep"><parameter name="pattern">foo</parameter><parameter name="path">src/</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'grep',
      args: { pattern: 'foo', path: 'src/' },
    });
  });
  // Spec case 3
  it('tolerates newlines and spaces in attributes (spec case 3)', () => {
    const block = `<invoke
      name="view_file"
    >
      <parameter
        name="path"
      >/tmp/foo</parameter>
    </invoke>`;
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  // Spec case 4 (parser portion — the not-found enrichment is tested below)
  it('parses a call whose name is not a registered BooCode tool (spec case 4)', () => {
    const block = '<invoke name="read_file"><parameter name="path">/tmp/foo</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'read_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('supports single-quoted attribute values', () => {
    const block = "<invoke name='view_file'><parameter name='path'>/tmp/foo</parameter></invoke>";
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('JSON-parses numeric parameter values', () => {
    const block = '<invoke name="foo"><parameter name="count">42</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
  });
  it('tolerates spaces around = inside name attribute', () => {
    const block = '<invoke name = "view_file"><parameter name = "path">/tmp/foo</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toEqual({
      name: 'view_file',
      args: { path: '/tmp/foo' },
    });
  });
  it('returns null when name attribute is missing', () => {
    const block = '<invoke><parameter name="path">/tmp/foo</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toBeNull();
  });
  it('returns null when name attribute is empty', () => {
    const block = '<invoke name=""><parameter name="path">/tmp/foo</parameter></invoke>';
    expect(parseInvokeToolCall(block)).toBeNull();
  });
  it('exports the expected delimiters', () => {
    expect(INVOKE_TOOL_OPEN).toBe('<invoke');
    expect(INVOKE_TOOL_CLOSE).toBe('</invoke>');
    expect(XML_TOOL_OPEN).toBe('<tool_call>');
    expect(XML_TOOL_CLOSE).toBe('</tool_call>');
  });
 });
 describe('partialXmlOpenerStart (v1.13.16 — both flavors)', () => {
  it('returns -1 when the buffer is empty', () => {
    expect(partialXmlOpenerStart('')).toBe(-1);
  });
  it('returns -1 when the buffer has no openers', () => {
    expect(partialXmlOpenerStart('plain prose, no markup')).toBe(-1);
  });
  it('returns the index of a complete <tool_call> opener (existing)', () => {
    expect(partialXmlOpenerStart('prose <tool_call>more')).toBe(6);
  });
  it('returns the index of a complete <invoke opener (v1.13.16)', () => {
    expect(partialXmlOpenerStart('prose <invoke name=')).toBe(6);
  });
  it('holds a partial <tool_ prefix at end of buffer', () => {
    expect(partialXmlOpenerStart('text <tool_')).toBe(5);
  });
  it('holds a partial <invo prefix at end of buffer (v1.13.16)', () => {
    expect(partialXmlOpenerStart('text <invo')).toBe(5);
  });
  it('holds a bare < at end of buffer', () => {
    expect(partialXmlOpenerStart('text <')).toBe(5);
  });
  it('returns -1 when < is followed by non-opener text', () => {
    expect(partialXmlOpenerStart('text <unknown>')).toBe(-1);
  });
  it('returns the earliest opener when both flavors are present', () => {
    expect(partialXmlOpenerStart('xxx <tool_call>YYY <invoke>')).toBe(4);
    expect(partialXmlOpenerStart('xxx <invoke>YYY <tool_call>')).toBe(4);
  });
 });
 describe('extractToolCallBlocks (v1.13.16 — unified extraction)', () => {
  // Spec case 1 (extraction-level)
  it('extracts a single <invoke> block (spec case 1)', () => {
    const input = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
    expect(result.flushed).toBe('');
    expect(result.remaining).toBe('');
  });
  // Spec case 5: opener arrives in one chunk, closer in the next.
  it('holds the partial <invoke> chunk when the closer has not arrived (spec case 5, first chunk)', () => {
    const firstChunk = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter>';
    const result = extractToolCallBlocks(firstChunk);
    expect(result.calls).toEqual([]);
    expect(result.flushed).toBe('');
    expect(result.remaining).toBe(firstChunk);
  });
  it('extracts the block once the closer arrives in a later chunk (spec case 5, completion)', () => {
    const firstChunk = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter>';
    const r1 = extractToolCallBlocks(firstChunk);
    const combined = r1.remaining + '</invoke>';
    const r2 = extractToolCallBlocks(combined);
    expect(r2.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
    expect(r2.flushed).toBe('');
    expect(r2.remaining).toBe('');
  });
  // Spec case 6: prose interleaving
  it('flushes prose around a recognized block but not the markup itself (spec case 6)', () => {
    const input = 'I will read the file.\n<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>\nThanks.';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
    expect(result.flushed).toBe('I will read the file.\n\nThanks.');
    expect(result.remaining).toBe('');
  });
  // Spec case 7 regression
  it('extracts a <tool_call> Qwen block alongside the new code path (spec case 7 regression)', () => {
    const input = '<tool_call><function=view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
    expect(result.flushed).toBe('');
    expect(result.remaining).toBe('');
  });
  it('extracts mixed-format blocks in source order (hand-back: shared counter)', () => {
    const input =
      '<invoke name="view_file"><parameter name="path">/a</parameter></invoke>' +
      ' middle ' +
      '<tool_call><function=grep><parameter=pattern>foo</parameter></function></tool_call>';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([
      { name: 'view_file', args: { path: '/a' } },
      { name: 'grep', args: { pattern: 'foo' } },
    ]);
    expect(result.flushed).toBe(' middle ');
    expect(result.remaining).toBe('');
  });
  it('drops a malformed <invoke> block silently (matches existing <tool_call> behavior)', () => {
    const input = 'prose <invoke><parameter name="path">/a</parameter></invoke> trailing';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([]);
    expect(result.flushed).toBe('prose  trailing');
    expect(result.remaining).toBe('');
  });
  it('holds a tail with a fresh partial opener after extracting earlier complete blocks', () => {
    const input = '<invoke name="view_file"><parameter name="path">/a</parameter></invoke> next: <tool_';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/a' } }]);
    expect(result.flushed).toBe(' next: ');
    expect(result.remaining).toBe('<tool_');
  });
  it('passes plain prose straight through when no markup is present', () => {
    const input = 'just some text with a < character but no opener';
    const result = extractToolCallBlocks(input);
    expect(result.calls).toEqual([]);
    expect(result.flushed).toBe(input);
    expect(result.remaining).toBe('');
  });
 });
 describe('levenshtein', () => {
  it('returns 0 for identical strings', () => {
    expect(levenshtein('view_file', 'view_file')).toBe(0);
  });
  it('returns the length when one string is empty', () => {
    expect(levenshtein('', 'view_file')).toBe(9);
    expect(levenshtein('view_file', '')).toBe(9);
  });
  it('computes a small distance for a single-character substitution', () => {
    expect(levenshtein('cat', 'bat')).toBe(1);
  });
  it('computes a known case: read_file → view_file is 4', () => {
    // r→v, e→i, a→e, d→w → 4 substitutions, same length
    expect(levenshtein('read_file', 'view_file')).toBe(4);
  });
 });
 describe('suggestToolName (v1.13.16)', () => {
  const tools = [
    'view_file',
    'list_dir',
    'grep',
    'find_files',
    'view_truncated_output',
    'ask_user_input',
    'web_search',
  ];
  it('suggests the closest match when distance is small', () => {
    expect(suggestToolName('view_files', tools)).toBe('view_file');
  });
  it('suggests via substring match when distance alone would miss', () => {
    // 'file' is a substring of multiple tools; closest by distance wins.
    expect(suggestToolName('file', tools)).toBe('view_file');
  });
  it('returns null when nothing is close', () => {
    expect(suggestToolName('xxxx_yyyy_zzzz', tools)).toBeNull();
  });
  it('is case-insensitive in the distance check', () => {
    expect(suggestToolName('VIEW_FILE', tools)).toBe('view_file');
  });
 });
 describe('formatUnknownToolError (v1.13.16)', () => {
  const tools = ['view_file', 'list_dir', 'grep', 'find_files'];
  it('includes the wrong name and the available tools list', () => {
    const msg = formatUnknownToolError('read_file', tools);
    expect(msg).toContain("Tool 'read_file' not found");
    expect(msg).toContain('Available tools:');
    expect(msg).toContain('view_file');
    expect(msg).toContain('find_files');
  });
  it('includes a suggestion when the drifted name is within threshold', () => {
    // distance(view_files, view_file) = 1 (one extra char)
    const msg = formatUnknownToolError('view_files', tools);
    expect(msg).toContain('Did you mean: view_file?');
  });
  it('omits the suggestion clause when no tool is close enough', () => {
    const msg = formatUnknownToolError('zzzzzzz', tools);
    expect(msg).toContain("Tool 'zzzzzzz' not found");
    expect(msg).toContain('Available tools:');
    expect(msg).not.toContain('Did you mean');
  });
  // The drift incident in the recon (chat 30d8…1be7167, msg 7ff558f4) had the
  // model emit <invoke name="read_file">. lev(read_file, view_file) = 4, so
  // the spec's threshold (<=3) doesn't suggest view_file — the model still
  // gets the available-tools list to pick from. This pins that behavior so a
  // future loosening of the threshold is a deliberate choice.
  it('does not suggest view_file for the read_file drift case (distance is 4, over threshold)', () => {
    const msg = formatUnknownToolError('read_file', tools);
    expect(msg).not.toContain('Did you mean');
  });
 });
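
Reviewer note: the tests above pin down the suggestion behavior closely enough to sketch one helper that would satisfy them. This is a minimal sketch, not the implementation in this commit. `editDistance`, `suggestName`, and `MAX_DISTANCE` are hypothetical names, and the threshold of 3 is taken from the `<=3` remark in the test comment.

```typescript
// Hypothetical sketch; the real levenshtein/suggestToolName may differ.
const MAX_DISTANCE = 3; // assumption: threshold quoted in the test comment

// Classic two-row dynamic-programming edit distance.
function editDistance(a: string, b: string): number {
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost);
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// A tool is a candidate when it is within MAX_DISTANCE of the wrong name,
// or when one name contains the other. The closest candidate wins.
function suggestName(wrong: string, tools: readonly string[]): string | null {
  const needle = wrong.toLowerCase();
  if (needle.length === 0) return null;
  let best: { name: string; distance: number } | null = null;
  for (const name of tools) {
    const candidate = name.toLowerCase();
    const distance = editDistance(needle, candidate);
    const related = candidate.includes(needle) || needle.includes(candidate);
    if (distance > MAX_DISTANCE && !related) continue;
    if (best === null || distance < best.distance) best = { name, distance };
  }
  return best ? best.name : null;
}
```

Under these assumptions `read_file` gets no suggestion against `view_file` (distance 4, no substring relation), which matches the pinned drift case.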
--- a/apps/server/src/services/agents.ts
+++ b/apps/server/src/services/agents.ts
@@ -1,7 +1,7 @@
 import { promises as fs } from 'node:fs';
 import { join } from 'node:path';
 import type { Agent, AgentsResponse, AgentParseError } from '../types/api.js';
-import { ALL_TOOLS } from './tools.js';
+import { ALL_TOOLS, resolveToolTier } from './tools.js';
 // v1.8.1: global agents live at /data/AGENTS.md inside the container
 // (./data:/data:ro mount on the host). Per-project AGENTS.md at the project
@@ -186,11 +186,14 @@ function parseAgentSection(section: RawSection): Omit<Agent, 'source'> {
    throw new Error(fmErrors.join('; '));
  }
  // v1.13.15-tools: intersect with BOOCODE_TOOLS tier (ceiling, not expansion).
  // Unset → resolveToolTier returns ALL tool names → no narrowing.
  const tierAllowed = new Set(resolveToolTier(process.env.BOOCODE_TOOLS));
  const filteredTools = Array.isArray(fm.tools)
    ? fm.tools.filter((t): t is string =>
-        (ALL_TOOL_NAMES as readonly string[]).includes(t),
+        (ALL_TOOL_NAMES as readonly string[]).includes(t) && tierAllowed.has(t),
      )
-    : DEFAULT_TOOLS;
+    : DEFAULT_TOOLS.filter((t) => tierAllowed.has(t));
  return {
    id: slugify(section.name),
@@ -252,6 +255,22 @@ export function invalidateAgentsCache(projectPath?: string): void {
  }
 }
 // v1.13.8: cache-read accessor for the system-prompt prefix-fingerprint log.
 // Returns the AGENTS.md mtimes that getAgentsForProject() observed on its
 // last cache fill for this projectPath. Both fields are null when the cache
 // is cold (e.g. tests, fresh boot before the first inference turn). Does no
 // I/O — a fresh stat would race the cache and isn't what the fingerprint
 // wants anyway (we want what was actually used to resolve the agent).
 export function getAgentsMtimes(projectPath: string): {
  global: number | null;
  project: number | null;
 } {
  const key = projectPath || '__none__';
  const entry = cache.get(key);
  if (!entry) return { global: null, project: null };
  return { global: entry.globalMtime, project: entry.projectMtime };
 }
 async function safeStat(path: string): Promise<number | null> {
  try {
    const s = await fs.stat(path);
--- a/apps/server/src/services/artifacts.ts
+++ b/apps/server/src/services/artifacts.ts
@@ -0,0 +1,255 @@
 // v1.14.x-html-artifact-panes: artifact writer + slug derivation.
 //
 // Writes Markdown and HTML artifacts to `<projectRoot>/.boocode/artifacts/`
 // as plain files. Returns `{path, url}` where:
 //   - path is the absolute on-disk path
 //   - url is a project-scoped REST URL pointing at the GET download route
 //     registered in routes/artifacts.ts. The route streams the file with
 //     Content-Disposition: attachment.
 //
 // Path safety: we do NOT use path_guard.ts (it realpaths and throws ENOENT
 // for files that don't exist yet, which artifact creation requires).
 // Instead we mirror the v1.13.18 codecontext_client.ts pattern: resolve
 // the candidate path against the realpath'd projectRoot, then verify the
 // result starts with projectRoot + sep (or equals projectRoot).
 import { mkdir, realpath, writeFile } from 'node:fs/promises';
 import { resolve, sep } from 'node:path';
 import { PathScopeError } from './path_guard.js';
 import type { Message } from '../types/api.js';
 export interface HtmlArtifactPayload {
  html_content: string;
  char_count: number;
  title: string | null;
 }
 export interface ArtifactWriteResult {
  path: string;
  url: string;
 }
 const ARTIFACT_SUBDIR = '.boocode/artifacts';
 // ---- slug helpers ----
 // Lowercase, replace non-alnum runs with '-', trim leading/trailing '-',
 // collapse repeated '-', cap at 60 chars, then trim again in case the cut
 // left a trailing '-'. Empty → 'artifact'.
 function slugify(input: string): string {
  const cleaned = input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')
    .replace(/-{2,}/g, '-')
    .slice(0, 60)
    .replace(/^-+|-+$/g, '');
  return cleaned || 'artifact';
 }
 function firstHeading(md: string): string | null {
  // Match the first level-1 ATX heading (`# Title`); leading spaces or tabs are allowed.
  const m = md.match(/^[ \t]*#[ \t]+(.+?)\s*$/m);
  if (!m) return null;
  const text = m[1]?.trim() ?? '';
  return text.length > 0 ? text : null;
 }
 function firstNWords(s: string, n: number): string {
  const words = s.trim().split(/\s+/).filter(Boolean).slice(0, n);
  return words.join(' ');
 }
 export function deriveMarkdownSlug(messageContent: string): string {
  const heading = firstHeading(messageContent);
  if (heading) return slugify(heading);
  const sixWords = firstNWords(messageContent, 6);
  return slugify(sixWords);
 }
 // Strip HTML tags for inner-text extraction. Crude but sufficient for slug
 // derivation — we're not rendering, just finding readable words.
 function stripTags(html: string): string {
  return html
    .replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, ' ')
    .replace(/<style\b[^<]*(?:(?!<\/style>)<[^<]*)*<\/style>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
 }
 function extractTitleTag(html: string): string | null {
  const m = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  if (!m) return null;
  const text = stripTags(m[1] ?? '').trim();
  return text.length > 0 ? text : null;
 }
 function extractH1(html: string): string | null {
  const m = html.match(/<h1[^>]*>([\s\S]*?)<\/h1>/i);
  if (!m) return null;
  const text = stripTags(m[1] ?? '').trim();
  return text.length > 0 ? text : null;
 }
 export function deriveHtmlSlug(payload: {
  html_content: string;
  title: string | null;
 }): string {
  if (payload.title && payload.title.trim().length > 0) {
    return slugify(payload.title);
  }
  const title = extractTitleTag(payload.html_content);
  if (title) return slugify(title);
  const h1 = extractH1(payload.html_content);
  if (h1) return slugify(h1);
  const inner = stripTags(payload.html_content);
  return slugify(firstNWords(inner, 6));
 }
 // Derive title for the html_artifact part payload: <title> → first <h1> →
 // first 80 chars of inner text. Returns null if nothing useful is found.
 export function deriveHtmlTitle(html: string): string | null {
  const t = extractTitleTag(html);
  if (t) return t;
  const h1 = extractH1(html);
  if (h1) return h1;
  const inner = stripTags(html);
  if (inner.length === 0) return null;
  return inner.slice(0, 80);
 }
 // ---- HTML detection (B4) ----
 // Returns the HTML content if `text` is a recognised HTML artifact:
 //   - starts with <!DOCTYPE html> (case-insensitive, whitespace-trimmed), OR
 //   - wrapped entirely in a fenced ```html ... ``` block whose body starts
 //     with <!DOCTYPE html> or contains an <html> tag.
 // Returns null if neither matches.
 export function detectHtmlArtifact(text: string): string | null {
  const trimmed = text.trim();
  if (trimmed.length === 0) return null;
  if (/^<!doctype\s+html/i.test(trimmed)) {
    return trimmed;
  }
  // Fenced ```html block consuming the entire (trimmed) message. Allow an
  // optional trailing newline before the closing fence.
  const fence = trimmed.match(/^```html\s*\n([\s\S]*?)\n?```\s*$/i);
  if (fence) {
    const inner = fence[1] ?? '';
    if (/^\s*<!doctype\s+html/i.test(inner) || /<html[\s>]/i.test(inner)) {
      return inner.trim();
    }
  }
  return null;
 }
 // ---- path resolution ----
 // Resolve `<projectRoot>/.boocode/artifacts/<filename>` and verify the
 // result stays under projectRoot. Mirrors the v1.13.18 codecontext_client.ts
 // approach: realpath projectRoot first, then prefix-check the candidate.
 // Throws on escape.
 async function resolveArtifactPath(
  projectRoot: string,
  filename: string,
 ): Promise<{ resolvedRoot: string; artifactsDir: string; absPath: string }> {
  const resolvedRoot = await realpath(projectRoot);
  const artifactsDir = resolve(resolvedRoot, ARTIFACT_SUBDIR);
  const absPath = resolve(artifactsDir, filename);
  // Lexical prefix check on the resolved candidates. (The `!== resolvedRoot`
  // branch was dead — ARTIFACT_SUBDIR is non-empty so artifactsDir always
  // differs from resolvedRoot.)
  if (!artifactsDir.startsWith(resolvedRoot + sep)) {
    throw new PathScopeError(
      `artifacts dir escapes project root: ${artifactsDir}`,
    );
  }
  if (!absPath.startsWith(artifactsDir + sep)) {
    throw new PathScopeError(
      `artifact filename escapes artifacts dir: ${filename}`,
    );
  }
  return { resolvedRoot, artifactsDir, absPath };
 }
 // After mkdir, realpath the artifacts dir and re-verify it stays under
 // resolvedRoot. Closes the symlink-escape gap: if `.boocode/artifacts` (or
 // any ancestor below resolvedRoot) is a symlink pointing outside the
 // project, the lexical check in resolveArtifactPath passes but the actual
 // write lands outside the sandbox. Throws PathScopeError on escape.
 async function assertArtifactsDirSafe(
  artifactsDir: string,
  resolvedRoot: string,
 ): Promise<void> {
  const realDir = await realpath(artifactsDir);
  if (realDir !== resolvedRoot && !realDir.startsWith(resolvedRoot + sep)) {
    throw new PathScopeError(
      `artifacts dir resolves outside project root: ${realDir}`,
    );
  }
 }
 // Pure decision helper for whether finalizeCompletion should write the
 // `html_artifact` part. Exported for unit testing the cap-skip branch.
 // Returns `{write: true, byteLen}` when the payload is under the cap, or
 // `{write: false, byteLen, reason: 'cap_exceeded'}` when oversize.
 export type HtmlArtifactDecision =
  | { write: true; byteLen: number }
  | { write: false; byteLen: number; reason: 'cap_exceeded' };
 export function decideHtmlArtifactWrite(
  htmlContent: string,
 ): HtmlArtifactDecision {
  const byteLen = Buffer.byteLength(htmlContent, 'utf8');
  if (byteLen > HTML_ARTIFACT_MAX_BYTES) {
    return { write: false, byteLen, reason: 'cap_exceeded' };
  }
  return { write: true, byteLen };
 }
 function buildUrl(projectId: string, filename: string): string {
  return `/api/projects/${projectId}/artifacts/${encodeURIComponent(filename)}`;
 }
 export interface WriteContext {
  projectId: string;
  projectRoot: string;
 }
 export async function writeMarkdownArtifact(
  message: Pick<Message, 'content'>,
  ctx: WriteContext,
 ): Promise<ArtifactWriteResult> {
  const slug = deriveMarkdownSlug(message.content);
  const filename = `${slug}-${Date.now()}.md`;
  const { resolvedRoot, artifactsDir, absPath } = await resolveArtifactPath(
    ctx.projectRoot,
    filename,
  );
  await mkdir(artifactsDir, { recursive: true });
  await assertArtifactsDirSafe(artifactsDir, resolvedRoot);
  await writeFile(absPath, message.content, 'utf8');
  return { path: absPath, url: buildUrl(ctx.projectId, filename) };
 }
 export async function writeHtmlArtifact(
  payload: HtmlArtifactPayload,
  ctx: WriteContext,
 ): Promise<ArtifactWriteResult> {
  const slug = deriveHtmlSlug(payload);
  const filename = `${slug}-${Date.now()}.html`;
  const { resolvedRoot, artifactsDir, absPath } = await resolveArtifactPath(
    ctx.projectRoot,
    filename,
  );
  await mkdir(artifactsDir, { recursive: true });
  await assertArtifactsDirSafe(artifactsDir, resolvedRoot);
  await writeFile(absPath, payload.html_content, 'utf8');
  return { path: absPath, url: buildUrl(ctx.projectId, filename) };
 }
 // 1MB cap on HTML artifacts (proposal S6). Larger payloads are not written
 // to the `html_artifact` part — the assistant text lands as plain content
 // and a warning is logged. Streaming abort was considered but the graceful
 // "no artifact, plain text falls back" path is simpler and lossless from
 // the user's perspective.
 export const HTML_ARTIFACT_MAX_BYTES = 1_048_576;
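
Reviewer note: the lexical prefix check in `resolveArtifactPath` appends `sep` before calling `startsWith`. The sketch below shows why that separator matters. `isInsideRoot` is a hypothetical helper, not part of this diff, and it uses `path.posix` so the behavior is the same on any host.

```typescript
import { posix } from 'node:path';

// Hypothetical helper: true when `candidate` is `root` itself or sits below it.
// Purely lexical: it normalises dot-segments but does not follow symlinks.
function isInsideRoot(root: string, candidate: string): boolean {
  const normRoot = posix.resolve(root);
  const normCandidate = posix.resolve(normRoot, candidate);
  // Without the trailing separator, '/srv/proj-evil' would pass as a
  // "child" of '/srv/proj' because it shares the string prefix.
  return (
    normCandidate === normRoot ||
    normCandidate.startsWith(normRoot + posix.sep)
  );
}
```

Because the check is lexical only, a symlinked `.boocode/artifacts` would still pass it. The diff closes that gap with the post-`mkdir` `realpath` re-check in `assertArtifactsDirSafe`.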
--- a/apps/server/src/services/auto_name.ts
+++ b/apps/server/src/services/auto_name.ts
@@ -1,4 +1,4 @@
-import type { InferenceContext } from './inference.js';
+import type { InferenceContext } from './inference/index.js';
 const NAMING_SYSTEM_PROMPT =
  'You name chat sessions. Reply directly with no thinking, reasoning, or explanation. Output ONLY the title, 4 words max, no quotes, no punctuation, no prefix like "Title:".';
--- a/apps/server/src/services/broker.ts
+++ b/apps/server/src/services/broker.ts
@@ -1,3 +1,6 @@
 import type { FastifyBaseLogger } from 'fastify';
 import { WsFrameSchema, type WsFrame } from '../types/ws-frames.js';
 export type Frame = Record<string, unknown> & { type: string };
 export type Listener = (frame: Frame) => void;
@@ -6,9 +9,15 @@ export interface Broker {
  subscribe(sessionId: string, listener: Listener): () => void;
  publishUser(user: string, frame: Frame): void;
  subscribeUser(user: string, listener: Listener): () => void;
  // v1.13.11-a: typed publish wrappers. Validate against WsFrameSchema and
  // delegate to publish / publishUser on success; log + drop on failure
  // (fail-closed). Existing publish / publishUser callers stay legal — they
  // get converted to the typed variant in v1.13.11-b.
  publishFrame(sessionId: string, frame: WsFrame): void;
  publishUserFrame(user: string, frame: WsFrame): void;
 }
-export function createBroker(): Broker {
+export function createBroker(log?: FastifyBaseLogger): Broker {
  const topics = new Map<string, Set<Listener>>();
  const userTopics = new Map<string, Set<Listener>>();
@@ -39,6 +48,28 @@ export function createBroker(): Broker {
    };
  }
  // v1.13.11-a: shared validation guard. Returns the parsed/typed frame on
  // success, or null on failure (after logging). Brief mandates fail-closed
  // semantics: invalid frames don't reach subscribers; throwing here could
  // cascade into stream-phase aborts which v1.13.7 already had to defend
  // against, so log + drop is the right shape.
  function validate(channel: 'session' | 'user', key: string, frame: WsFrame): WsFrame | null {
    const parsed = WsFrameSchema.safeParse(frame);
    if (parsed.success) return parsed.data;
    const frameType = (frame as { type?: unknown })?.type;
    const errors = parsed.error.flatten();
    if (log) {
      log.error(
        { channel, key, frame_type: frameType, errors },
        'ws-frame-validation-failed: dropping invalid frame',
      );
    } else {
      // Fallback for callers that didn't pass a logger (e.g. unit tests).
      console.error('ws-frame-validation-failed', { channel, key, frame_type: frameType, errors });
    }
    return null;
  }
  return {
    publish(sessionId, frame) {
      publishTo(topics, sessionId, frame);
@@ -52,5 +83,15 @@ export function createBroker(): Broker {
    subscribeUser(user, listener) {
      return subscribeTo(userTopics, user, listener);
    },
    publishFrame(sessionId, frame) {
      const valid = validate('session', sessionId, frame);
      if (!valid) return;
      publishTo(topics, sessionId, valid as Frame);
    },
    publishUserFrame(user, frame) {
      const valid = validate('user', user, frame);
      if (!valid) return;
      publishTo(userTopics, user, valid as Frame);
    },
  };
 }
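
Reviewer note: the fail-closed shape of `publishFrame` / `publishUserFrame` can be illustrated without the schema library. This is a dependency-free sketch under stated assumptions: `DemoFrame`, `DemoValidator`, `requireType`, and `createGuardedPublisher` are hypothetical names, and `requireType` stands in for `WsFrameSchema.safeParse`.

```typescript
type DemoFrame = Record<string, unknown> & { type: string };

// Hypothetical validator result, loosely modelled on a safeParse-style API.
type DemoValidator = (
  frame: unknown,
) => { ok: true; frame: DemoFrame } | { ok: false; error: string };

// Stand-in validator: only checks that `type` is a string.
const requireType: DemoValidator = (frame) => {
  if (
    typeof frame === 'object' &&
    frame !== null &&
    typeof (frame as { type?: unknown }).type === 'string'
  ) {
    return { ok: true, frame: frame as DemoFrame };
  }
  return { ok: false, error: 'frame.type must be a string' };
};

// Wraps a delivery function so invalid frames are reported and dropped
// instead of thrown back into the caller.
function createGuardedPublisher(
  validate: DemoValidator,
  deliver: (frame: DemoFrame) => void,
  onDrop: (error: string) => void,
): (frame: unknown) => void {
  return (frame) => {
    const result = validate(frame);
    if (!result.ok) {
      onDrop(result.error);
      return;
    }
    deliver(result.frame);
  };
}
```

Dropping rather than throwing keeps a malformed frame from aborting the caller's stream loop, which is the trade-off the `validate` comment in this diff describes.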
--- a/apps/server/src/services/codecontext_client.ts
+++ b/apps/server/src/services/codecontext_client.ts
@@ -16,7 +16,79 @@
 //      file parser bug (upstream issue #37) returns a generic error string,
 //      which we re-surface with a hint to add the file to .codecontextignore.
-import { realpath } from 'node:fs/promises';
+import { access, copyFile, realpath } from 'node:fs/promises';
 import { isAbsolute, join, resolve, sep } from 'node:path';
 import { truncateIfNeeded } from './truncate.js';
 // v1.13.12 fix: codecontext crashes on empty source files (upstream issue #37)
 // when it can't ignore them. The .codecontextignore.template ships with the
 // project at /opt/boocode/codecontext/.codecontextignore.template (path inside
 // the container; the host's /opt is bind-mounted). On the first call to any
 // project, copy the template in if no per-project ignore exists yet. The user
 // can subsequently edit the file to customize. Idempotent: once a
 // .codecontextignore exists at the project root we never overwrite it.
 const IGNORE_TEMPLATE_PATH = '/opt/boocode/codecontext/.codecontextignore.template';
 const ensuredIgnoreProjects = new Set<string>();
 async function ensureIgnoreFile(projectRoot: string): Promise<void> {
  if (ensuredIgnoreProjects.has(projectRoot)) return;
  const ignorePath = join(projectRoot, '.codecontextignore');
  try {
    await access(ignorePath);
    ensuredIgnoreProjects.add(projectRoot);
    return;
  } catch {
    // missing — install the default
  }
  try {
    await copyFile(IGNORE_TEMPLATE_PATH, ignorePath);
    ensuredIgnoreProjects.add(projectRoot);
  } catch {
    // Template missing or project root read-only: proceed without it. The
    // codecontext call may still crash on empty source files; in that case
    // the model gets the existing hint message (via the catch below) telling
    // it to add the file to .codecontextignore manually.
  }
 }
 // v1.13.18: resolve a `file_path` arg to an absolute path anchored within
 // the (already realpath'd) projectRoot. Contract:
 //   - empty/whitespace-only → INVALID_FILE_PATH error
 //   - relative path → resolve(projectRoot, rawPath) (normalises dot-segments)
 //   - absolute path → resolve(rawPath) (also normalises — e.g. /root/../etc
 //     becomes /etc so the prefix-check below rejects it even in the ENOENT
 //     fallthrough where realpath couldn't canonicalise)
 //   - try realpath; on ENOENT fall through with the (normalised) absolute
 //     path (the sidecar issues its own "File not found in graph" that the
 //     model can self-correct on; re-implementing the check here would diverge)
 //   - if the final path doesn't sit inside projectRoot → escape error
 //     (same shape as target_dir escape, only the field name differs)
 async function resolveProjectPath(
  projectRoot: string,
  rawPath: string,
 ): Promise<string> {
  if (rawPath.trim() === '') {
    throw new Error('INVALID_FILE_PATH: file_path must not be empty');
  }
  const candidate = isAbsolute(rawPath) ? resolve(rawPath) : resolve(projectRoot, rawPath);
  let resolved: string;
  try {
    resolved = await realpath(candidate);
  } catch (err: unknown) {
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') {
      // File doesn't exist yet (or was deleted). Forward the absolute path;
      // codecontext will return "File not found in graph" which the model
      // can self-correct on.
      resolved = candidate;
    } else {
      throw err;
    }
  }
  if (resolved !== projectRoot && !resolved.startsWith(projectRoot + sep)) {
    throw new Error(`file_path ${rawPath} escapes project root ${projectRoot}`);
  }
  return resolved;
 }
 export interface CodecontextRequest {
  toolName: string;
@@ -27,6 +99,9 @@ export interface CodecontextRequest {
 export interface CodecontextResponse {
  result: string;
  truncated: boolean;
  // v1.13.5: optional opaque id pointing at the full pre-slice content on
  // tmpfs. Set when truncated=true and storage succeeded.
  outputPath?: string;
 }
 const CODECONTEXT_BASE_URL = process.env['CODECONTEXT_URL'] ?? 'http://codecontext:8080';
@@ -42,6 +117,10 @@ export async function callCodecontext(
  // never pass target_dir; tests can override). A non-existent target_dir
  // throws before we hit the network so the model gets a sharp error.
  const resolvedProject = await realpath(req.projectPath);
  // v1.13.12 fix: install the default .codecontextignore on first call to any
  // project so codecontext doesn't crash on empty node_modules files. One file
  // written per project, idempotent (set-membership check inside).
  await ensureIgnoreFile(resolvedProject);
  const requestedTarget = req.args['target_dir'];
  const targetDir = typeof requestedTarget === 'string' && requestedTarget.length > 0
    ? requestedTarget
@@ -56,7 +135,14 @@ export async function callCodecontext(
  // Step 2: re-build args with the resolved target_dir so codecontext sees
  // the real absolute path, not a symlink or relative form.
-  const argsToSend = { ...req.args, target_dir: resolvedTarget };
+  // v1.13.18: also resolve file_path when present — the sidecar index is keyed
  // on absolute paths, so a relative path from the model yields "File not found
  // in graph". Same escape check as target_dir; ENOENT falls through so the
  // sidecar produces the canonical "File not found in graph" the model can fix.
  const argsToSend: Record<string, unknown> = { ...req.args, target_dir: resolvedTarget };
  if (typeof req.args['file_path'] === 'string' && req.args['file_path'].trim() !== '') {
    argsToSend['file_path'] = await resolveProjectPath(resolvedProject, req.args['file_path']);
  }
  // Step 3: POST with a hard timeout. AbortController + setTimeout pattern
  // matches web_fetch.ts; nothing fancier needed.
@@ -105,13 +191,22 @@ export async function callCodecontext(
  // Step 4: inline truncation. The model gets a clear hint about how to
  // narrow the next call rather than a silent cut. Mirrors web_fetch.ts.
  // v1.13.5: stash the full body on tmpfs when truncating so the model can
  // retrieve more via view_truncated_output(id).
  if (body.result.length > TRUNCATION_LIMIT) {
    const truncated = body.result.slice(0, TRUNCATION_LIMIT);
    const omitted = body.result.length - TRUNCATION_LIMIT;
    const slicedWithMarker =
      `${truncated}\n\n[truncated, ${omitted} chars omitted; narrow with file_path, file_type, or limit]`;
    const wrapped = await truncateIfNeeded({
      fullContent: body.result,
      slicedContent: slicedWithMarker,
      wasTruncated: true,
    });
    return {
-      result:
-        `${truncated}\n\n[truncated, ${omitted} chars omitted; narrow with file_path, file_type, or limit]`,
-      truncated: true,
+      result: wrapped.content,
+      truncated: wrapped.truncated,
+      ...(wrapped.outputPath ? { outputPath: wrapped.outputPath } : {}),
    };
  }
  return { result: body.result, truncated: false };
--- a/apps/server/src/services/compaction.ts
+++ b/apps/server/src/services/compaction.ts
@@ -23,7 +23,13 @@ import type { Broker } from './broker.js';
 import { SUMMARY_TEMPLATE } from './compaction-prompt.js';
 import * as modelContextLookup from './model-context.js';
-const COMPACTION_BUFFER = 20_000;
+// v1.13.9: ratio-only overflow trigger. Fires compaction at 85% of ctx_max
+// (opencode session/overflow.ts pattern). Replaces the v1.11.0-era
+// `ctx_max - 20_000` formula which degenerated to 0 for contexts ≤20k and
+// gave only 7-8% headroom to the summarizer at 262k. Ratio gives consistent
+// 15% headroom at any scale, and small-ctx models no longer get an
+// effectively-disabled trigger.
+const EARLY_TRIGGER_RATIO = 0.85;
 const MIN_PRESERVE_RECENT_TOKENS = 2_000;
 const MAX_PRESERVE_RECENT_TOKENS = 8_000;
 const DEFAULT_TAIL_TURNS = 2;
@@ -39,19 +45,24 @@ export interface CompactionMessage {
  status: 'streaming' | 'complete' | 'failed' | 'cancelled';
  tool_calls: Array<{ id: string; name: string; args: Record<string, unknown> }> | null;
  tool_results: { tool_call_id: string; output: unknown; truncated: boolean; error?: string } | null;
  // v1.13.6: reasoning_parts captured by v1.13.1-C and read back through
  // messages_with_parts. Embedded into the head-assembly payload as prose so
  // the summarizer LLM sees what the model was reasoning through when it
  // chose its tool calls.
  reasoning_parts: Array<{ text: string }> | null;
  metadata: { kind?: string } | null;
  created_at: string;
 }
 // === overflow ===
-// Tokens we hold in reserve for the model's response so a near-full context
-// can still produce a useful turn. Mirrors opencode's COMPACTION_BUFFER.
-// Returns 0 when the context limit is unknown (caller treats 0 as "do not
-// trigger overflow"); avoids dividing-by-zero downstream.
+// Returns the token budget at which overflow fires. Triggers compaction at
+// 85% of contextLimit (opencode session/overflow.ts pattern). Returns 0 when
+// the context limit is unknown — caller treats 0 as "do not trigger overflow",
+// keeping inference flowing rather than compacting a turn we can't size.
 export function usable(contextLimit: number): number {
  if (!contextLimit || contextLimit <= 0) return 0;
-  return Math.max(0, contextLimit - COMPACTION_BUFFER);
+  return Math.floor(EARLY_TRIGGER_RATIO * contextLimit);
 }
 export interface Usage {
@@ -197,7 +208,8 @@ export function buildPrompt(
 // would silently drop pre-legacy-compact history before the LLM sees it.
 // Compaction wants to send the entire head, full stop.) ===
-interface OpenAiMessage {
+// v1.13.6: exported for unit-test access (reasoning render coverage).
+export interface OpenAiMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  tool_calls?: Array<{
@@ -212,7 +224,8 @@ function isCapHitSentinel(m: CompactionMessage): boolean {
  return m.role === 'system' && m.metadata != null && m.metadata.kind === 'cap_hit';
 }
-function buildHeadPayload(head: CompactionMessage[]): OpenAiMessage[] {
+// v1.13.6: exported for unit-test access (reasoning render coverage).
+export function buildHeadPayload(head: CompactionMessage[]): OpenAiMessage[] {
  const out: OpenAiMessage[] = [];
  for (const m of head) {
    if (isCapHitSentinel(m)) continue;
@@ -243,9 +256,22 @@ function buildHeadPayload(head: CompactionMessage[]): OpenAiMessage[] {
      continue;
    }
    if (m.role === 'assistant') {
      // v1.13.6: embed reasoning text as prose prefixed onto the assistant
      // content. OpenAI wire shape doesn't carry reasoning as a structured
      // field, but the summarizer is reading text — a tagged prose block
      // gives it the same signal. We mirror the AI SDK ReasoningPart shape
      // by using a <reasoning>...</reasoning> wrapper so the summarizer can
      // distinguish reasoning from user-visible answer.
      let body = m.content && m.content.length > 0 ? m.content : '';
      if (m.reasoning_parts && m.reasoning_parts.length > 0) {
        const reasoning = m.reasoning_parts.map((r) => r.text).join('');
        body = body.length > 0
          ? `<reasoning>${reasoning}</reasoning>\n\n${body}`
          : `<reasoning>${reasoning}</reasoning>`;
      }
      const msg: OpenAiMessage = {
        role: 'assistant',
-        content: m.content && m.content.length > 0 ? m.content : null,
+        content: body.length > 0 ? body : null,
      };
      if (m.tool_calls && m.tool_calls.length > 0) {
        msg.tool_calls = m.tool_calls.map((tc) => ({
@@ -342,9 +368,14 @@ export async function process(input: ProcessInput): Promise<void> {
  // 2. All currently-active messages in this chat (compacted_at IS NULL).
  // ORDER BY (created_at, id) matches loadContext in inference.ts so the
  // turns() boundary logic sees the same sequence the LLM will.
  // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view so
  // the compaction payload matches what the LLM saw on the original turn.
  // v1.13.6: also pulls reasoning_parts (added in v1.13.1-C) so summaries
  // capture what the model was working through before each tool call.
  const messages = await sql<CompactionMessage[]>`
-    SELECT id, role, content, kind, summary, status, tool_calls, tool_results, metadata, created_at
-    FROM messages
+    SELECT id, role, content, kind, summary, status, tool_calls, tool_results,
+           reasoning_parts, metadata, created_at
+    FROM messages_with_parts
    WHERE chat_id = ${chatId} AND compacted_at IS NULL
    ORDER BY created_at ASC, id ASC
  `;
@@ -400,15 +431,16 @@ export async function process(input: ProcessInput): Promise<void> {
    'compaction: invoking model',
  );
-  // 6a. Flip the chat dot amber for the duration of the LLM call + DB writes.
-  // Same { type: 'chat_status', status: 'working', at } shape inference.ts
-  // emits at runner enqueue. publishUser → broadcasts on the per-user channel
-  // (all devices / tabs see it) since chat_status is a user-channel frame in
-  // BooCode (see useChatStatus.ts, which is the consumer).
-  broker.publishUser('default', {
+  // 6a. Flip the chat dot for the duration of the LLM call + DB writes.
+  // v1.13.11-b: publish status='streaming' (the v1.12.1-widened replacement
+  // for the dropped 'working' value). Compaction's LLM call has the same
+  // semantic as an inference turn for dot-state purposes. The v1.12.1
+  // chat_status widening missed this site; v1.13.11's WsFrame Zod schema
+  // surfaced the drift via the unknown-enum-value check.
+  broker.publishUserFrame('default', {
    type: 'chat_status',
    chat_id: chatId,
-    status: 'working',
+    status: 'streaming',
    at: new Date().toISOString(),
  });
@@ -477,7 +509,7 @@ export async function process(input: ProcessInput): Promise<void> {
    // Always restore the dot. Status='idle' (not 'error') even on failure —
    // the caller logs/re-surfaces the error separately; the dot doesn't
    // need to stay red across reloads for a transient compaction blip.
-    broker.publishUser('default', {
+    broker.publishUserFrame('default', {
      type: 'chat_status',
      chat_id: chatId,
      status: 'idle',
@@ -491,7 +523,7 @@ export async function process(input: ProcessInput): Promise<void> {
  // toast. Order matters: idle must precede 'compacted' so the dot is
  // already green by the time the refetch toast appears.
  if (succeeded) {
-    broker.publish(sessionId, {
+    broker.publishFrame(sessionId, {
      type: 'compacted',
      session_id: sessionId,
      chat_id: chatId,
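
Note on the v1.13.9 trigger change in compaction.ts: the sketch below compares the removed `ctx_max - 20_000` formula with the new 85% ratio at two context sizes. The function names `legacyUsable` and `ratioUsable` are hypothetical labels for this illustration; the constants and function bodies are copied from the diff above. It is a minimal sketch of the arithmetic, not the shipped module.

```typescript
// Constants as they appear in the diff (old and new).
const COMPACTION_BUFFER = 20_000;
const EARLY_TRIGGER_RATIO = 0.85;

// Hypothetical name for the removed v1.11.0-era formula.
function legacyUsable(contextLimit: number): number {
  if (!contextLimit || contextLimit <= 0) return 0;
  return Math.max(0, contextLimit - COMPACTION_BUFFER);
}

// Hypothetical name for the new ratio-based formula (usable() in the diff).
function ratioUsable(contextLimit: number): number {
  if (!contextLimit || contextLimit <= 0) return 0;
  return Math.floor(EARLY_TRIGGER_RATIO * contextLimit);
}

// Compare both formulas for a small and a large context window.
for (const ctx of [16_384, 262_144]) {
  const headroomOld = ctx - legacyUsable(ctx);
  const headroomNew = ctx - ratioUsable(ctx);
  console.log({ ctx, legacy: legacyUsable(ctx), ratio: ratioUsable(ctx), headroomOld, headroomNew });
}
```

At 16,384 tokens the old formula returns 0, which the caller reads as "do not trigger overflow", while the ratio returns 13,926. At 262,144 tokens the old formula leaves 20,000 tokens (about 7.6%) for the summarizer and the ratio leaves 39,322 (15%).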
--- a/apps/server/src/services/file_ops.ts
+++ b/apps/server/src/services/file_ops.ts
@@ -47,8 +47,12 @@ export interface FindFilesResult {
  truncated: boolean;
 }
-export async function listDir(projectRoot: string, relPath: string): Promise<ListDirResult> {
-  const real = await pathGuard(projectRoot, relPath);
+export async function listDir(
+  projectRoot: string,
+  relPath: string,
+  opts?: { extra_roots?: readonly string[] },
+): Promise<ListDirResult> {
+  const real = await pathGuard(projectRoot, relPath, opts?.extra_roots);
  const s = await stat(real);
  if (!s.isDirectory()) {
    throw new PathScopeError(`not a directory: ${relPath}`);
@@ -82,8 +86,12 @@ export async function listDir(projectRoot: string, relPath: string): Promise<Lis
  };
 }
-export async function viewFile(projectRoot: string, relPath: string): Promise<ViewFileResult> {
-  const real = await pathGuard(projectRoot, relPath);
+export async function viewFile(
+  projectRoot: string,
+  relPath: string,
+  opts?: { extra_roots?: readonly string[] },
+): Promise<ViewFileResult> {
+  const real = await pathGuard(projectRoot, relPath, opts?.extra_roots);
  const s = await stat(real);
  if (!s.isFile()) {
    throw new PathScopeError(`not a file: ${relPath}`);
@@ -119,10 +127,10 @@ interface RipgrepMatch {
 export async function grep(
  projectRoot: string,
  pattern: string,
-  opts?: { path?: string; max_matches?: number; case_sensitive?: boolean; hidden?: boolean }
+  opts?: { path?: string; max_matches?: number; case_sensitive?: boolean; hidden?: boolean; extra_roots?: readonly string[] }
 ): Promise<GrepResult> {
  const targetPath = opts?.path ?? projectRoot;
-  const target = await pathGuard(projectRoot, targetPath);
+  const target = await pathGuard(projectRoot, targetPath, opts?.extra_roots);
  const limit = Math.min(
    Math.max(opts?.max_matches ?? DEFAULT_GREP_RESULTS, 1),
    MAX_GREP_RESULTS
@@ -192,14 +200,14 @@ export async function grep(
 export async function findFiles(
  projectRoot: string,
  pattern?: string,
-  opts?: { type?: 'file' | 'dir'; max_results?: number; path?: string }
+  opts?: { type?: 'file' | 'dir'; max_results?: number; path?: string; extra_roots?: readonly string[] }
 ): Promise<FindFilesResult> {
  const limit = Math.min(
    Math.max(opts?.max_results ?? DEFAULT_FIND_RESULTS, 1),
    MAX_FIND_RESULTS
  );
  const target = opts?.path != null
-    ? await pathGuard(projectRoot, opts.path)
+    ? await pathGuard(projectRoot, opts.path, opts?.extra_roots)
    : projectRoot;
  const args = ['--files'];
  if (pattern) args.push('--glob', pattern);
--- a/apps/server/src/services/grant_resolver.ts
+++ b/apps/server/src/services/grant_resolver.ts
@@ -0,0 +1,161 @@
 // v1.13.17-cross-repo-reads: derives the grant root for a path the user is
 // being asked to approve cross-repo read access to.
 //
 // Per design decision D1: grant unit = nearest registered project root,
 // then nearest path-whitelist ancestor that looks like a repo root, then
 // refuse. Granting the literal file path is too narrow (next file in the
 // same repo re-prompts). Granting an arbitrary parent dir over-scopes.
 //
 // The resolver runs in two contexts:
 //   1. request_read_access.execute  — pre-prompt validation (cheap; bails
 //      early if the path can't plausibly be granted so the user is never
 //      asked about /etc/passwd)
 //   2. POST /api/chats/:id/grant_read_access — at decision time, re-derives
 //      the root and persists it on sessions.allowed_read_paths
 //
 // Sam (2026-05-22 dispatch confirmation): "in the project-root resolver
 // ancestor walk, stop the moment parent exits PROJECT_ROOT_WHITELIST or hits
 // filesystem root — check on every iteration, not just final parent.
 // Symlinked input must not be able to escape the whitelist during the
 // walk." Hence the loop here checks both the walk bound AND the still-
 // inside-whitelist invariant every step.
 import { access, realpath } from 'node:fs/promises';
 import { constants } from 'node:fs';
 import { dirname, isAbsolute, sep } from 'node:path';
 import type { Sql } from '../db.js';
 // Files whose presence in a directory marks it as a repo root for grant
 // purposes. Kept narrow on purpose; broader heuristics (e.g. ".project",
 // "pyproject.toml") can be added with measured intent. Each entry is a
 // literal basename — no globs.
 const REPO_MARKERS: ReadonlyArray<string> = [
  '.git',
  'package.json',
  'go.mod',
  'Cargo.toml',
 ];
 export type GrantResolution =
  | { ok: true; root: string; source: 'project' | 'whitelist' }
  | { ok: false; reason: string };
 function isUnder(child: string, parent: string): boolean {
  return child === parent || child.startsWith(parent + sep);
 }
 async function exists(path: string): Promise<boolean> {
  try {
    await access(path, constants.F_OK);
    return true;
  } catch {
    return false;
  }
 }
 async function isRepoShaped(dir: string): Promise<boolean> {
  for (const marker of REPO_MARKERS) {
    if (await exists(`${dir}${sep}${marker}`)) return true;
  }
  return false;
 }
 // Resolves an absolute path to its grant root or refuses with a reason
 // string suitable for surfacing to the model. Pure helper — no DB writes,
 // no broker publishes. Caller persists the root on session.allowed_read_paths
 // if it wants the grant to stick.
 //
 // Arguments:
 //   sql                   — used only to read projects.path (no writes)
 //   requestedPath         — absolute path the model wants to read
 //   projectRoot           — the session's primary project root (already
 //                           realpath'd by caller). Used to short-circuit
 //                           "already in scope".
 //   whitelistRoot         — PROJECT_ROOT_WHITELIST from config (default /opt).
 //                           Walk bound for the repo-shape fallback.
 //
 // Returns { ok: true, root, source } on success; { ok: false, reason } else.
 export async function resolveGrantRoot(
  sql: Sql,
  requestedPath: string,
  projectRoot: string,
  whitelistRoot: string,
 ): Promise<GrantResolution> {
  if (typeof requestedPath !== 'string' || requestedPath.length === 0) {
    return { ok: false, reason: 'path is required' };
  }
  if (!isAbsolute(requestedPath)) {
    return { ok: false, reason: 'path must be absolute' };
  }
  // Resolve symlinks so subsequent ancestor checks compare apples-to-apples
  // with realpath'd projectRoot. If the path doesn't exist at all, bail
  // before bothering the user — the model is asking about a phantom.
  let real: string;
  try {
    real = await realpath(requestedPath);
  } catch {
    return { ok: false, reason: `path does not exist: ${requestedPath}` };
  }
  // Whitelist guard. Symlinked inputs can resolve outside the whitelist
  // even when the surface-form path looks inside it; that's why we test
  // the *real* path here, not the requested one.
  let realWhitelist: string;
  try {
    realWhitelist = await realpath(whitelistRoot);
  } catch {
    return { ok: false, reason: `whitelist root does not exist: ${whitelistRoot}` };
  }
  if (!isUnder(real, realWhitelist)) {
    return { ok: false, reason: 'path outside permitted scope' };
  }
  // Already in scope? No prompt needed; the tool's caller should retry.
  if (isUnder(real, projectRoot)) {
    return { ok: false, reason: 'path already accessible without a grant' };
  }
  // Look for a registered project whose root is an ancestor of the
  // requested path. Pick the LONGEST match (nearest ancestor wins) so
  // sub-projects don't get over-broadened.
  const projectRows = await sql<{ path: string }[]>`
    SELECT path FROM projects WHERE status = 'open'
  `;
  let bestProject: string | null = null;
  for (const row of projectRows) {
    if (!row.path) continue;
    if (!isUnder(real, row.path)) continue;
    if (bestProject === null || row.path.length > bestProject.length) {
      bestProject = row.path;
    }
  }
  if (bestProject !== null) {
    return { ok: true, root: bestProject, source: 'project' };
  }
  // Repo-shape fallback. Walk from the requested path upward toward the
  // whitelist root. At every iteration: confirm we're still inside the
  // whitelist (so a symlinked component can't slip the bound mid-walk)
  // and confirm we haven't hit the filesystem root. The first dir with a
  // REPO_MARKER child is the grant root.
  let cursor = real;
  while (true) {
    // Don't grant the whitelist root itself — that would be far too broad.
    if (cursor === realWhitelist) {
      return { ok: false, reason: 'no repo-shaped ancestor found under whitelist' };
    }
    if (!isUnder(cursor, realWhitelist)) {
      return { ok: false, reason: 'path outside permitted scope' };
    }
    const parent = dirname(cursor);
    if (parent === cursor) {
      // Hit filesystem root without finding a repo marker.
      return { ok: false, reason: 'no repo-shaped ancestor found under whitelist' };
    }
    if (await isRepoShaped(cursor)) {
      return { ok: true, root: cursor, source: 'whitelist' };
    }
    cursor = parent;
  }
 }
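
Note on `isUnder` in grant_resolver.ts: the helper appends the path separator before the prefix test, which matters for sibling directories that share a name prefix. The sketch below contrasts it with a bare `startsWith`. `naiveIsUnder` is a hypothetical counter-example, and `sep` is hard-coded to `/` here as an assumption so the sketch has no imports (the real file takes `sep` from `node:path`).

```typescript
// Assumption for this sketch: POSIX separator. The real module imports sep from node:path.
const sep = '/';

// Same body as isUnder in the diff above.
function isUnder(child: string, parent: string): boolean {
  return child === parent || child.startsWith(parent + sep);
}

// Hypothetical counter-example: a bare prefix test with no separator.
function naiveIsUnder(child: string, parent: string): boolean {
  return child.startsWith(parent);
}

const parent = '/opt/app';
const sibling = '/opt/app2/secrets.txt'; // shares the "/opt/app" prefix but is a different directory

console.log(naiveIsUnder(sibling, parent)); // true: the sibling is wrongly treated as inside
console.log(isUnder(sibling, parent)); // false: "/opt/app/" is not a prefix of the sibling
console.log(isUnder('/opt/app/src/index.ts', parent)); // true
console.log(isUnder(parent, parent)); // true: a path counts as under itself
```

Both checks are purely lexical, which is why `resolveGrantRoot` runs `realpath` on the requested path and the whitelist root before calling `isUnder`.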
--- a/apps/server/src/services/inference.ts
+++ b/apps/server/src/services/inference.ts
--- a/apps/server/src/services/inference/budget.ts
+++ b/apps/server/src/services/inference/budget.ts
@@ -0,0 +1,32 @@
 import type { Agent } from '../../types/api.js';
 import { READ_ONLY_TOOL_NAMES } from '../tools.js';
 // v1.8.2: tool-call budget defaults. Resolved per-turn by resolveToolBudget.
 //   - Agent with explicit max_tool_calls: that value.
 //   - Agent with read-only-only tools:    BUDGET_READ_ONLY (50).
 //   - Agent with any non-read-only tool:  BUDGET_NON_READ_ONLY (10).
 //   - No agent (raw chat):                BUDGET_NO_AGENT (50).
 // v1.13.7: bumped BUDGET_NO_AGENT 15→30 to match BUDGET_READ_ONLY. Every tool
 // in ALL_TOOLS today is read-only (see services/tools.ts comment at
 // READ_ONLY_TOOL_NAMES); the cautious 15-cap was a forward-looking guard for
 // write tools that haven't landed yet. No-agent mode gets the same toolset as
 // an all-read-only agent at runtime, so they should share the same budget.
 // v1.13.12: bumped read-only caps 30→50. Real recon sessions were hitting 30
 // with ~3 turns wasted on codecontext parse failures (empty node_modules
 // files); legitimate need was ~27, and Architect-class system overviews want
 // deeper recon than a 30-cap permits. Headroom of 20 absorbs failure-retry
 // turns + deeper exploration without changing the safety floor materially —
 // the doom-loop guard (3 identical calls → abort) catches the actual failure
 // mode this cap was guarding against.
 export const BUDGET_READ_ONLY = 50;
 export const BUDGET_NON_READ_ONLY = 10;
 export const BUDGET_NO_AGENT = 50;
 const READ_ONLY_SET: ReadonlySet<string> = new Set(READ_ONLY_TOOL_NAMES);
 export function resolveToolBudget(agent: Agent | null): number {
  if (agent?.max_tool_calls != null) return agent.max_tool_calls;
  if (!agent) return BUDGET_NO_AGENT;
  const allReadOnly = agent.tools.every((t) => READ_ONLY_SET.has(t));
  return allReadOnly ? BUDGET_READ_ONLY : BUDGET_NON_READ_ONLY;
 }
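
Note on `resolveToolBudget` in budget.ts: the sketch below walks the four resolution branches listed in the file's header comment. `AgentSketch` and the tool names in `READ_ONLY_SET` are hypothetical stand-ins for the real `Agent` type and `READ_ONLY_TOOL_NAMES`, which this diff does not show; the budget constants and the function body follow the diff.

```typescript
// Hypothetical stand-in for the Agent type (only the fields the resolver reads).
interface AgentSketch {
  tools: string[];
  max_tool_calls?: number | null;
}

// Hypothetical tool names; the real list lives in services/tools.ts.
const READ_ONLY_SET: ReadonlySet<string> = new Set(['view_file', 'list_dir', 'grep']);

const BUDGET_READ_ONLY = 50;
const BUDGET_NON_READ_ONLY = 10;
const BUDGET_NO_AGENT = 50;

// Same branch order as resolveToolBudget in the diff above.
function resolveToolBudget(agent: AgentSketch | null): number {
  if (agent?.max_tool_calls != null) return agent.max_tool_calls;
  if (!agent) return BUDGET_NO_AGENT;
  const allReadOnly = agent.tools.every((t) => READ_ONLY_SET.has(t));
  return allReadOnly ? BUDGET_READ_ONLY : BUDGET_NON_READ_ONLY;
}

console.log(resolveToolBudget(null)); // no agent
console.log(resolveToolBudget({ tools: ['grep', 'view_file'] })); // all read-only
console.log(resolveToolBudget({ tools: ['grep', 'write_file'] })); // one tool outside the read-only set
console.log(resolveToolBudget({ tools: ['write_file'], max_tool_calls: 3 })); // explicit override wins
```

One consequence of using `every`: an agent with an empty `tools` array resolves to `BUDGET_READ_ONLY`, because `Array.prototype.every` returns `true` for an empty array.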
--- a/apps/server/src/services/inference/error-handler.ts
+++ b/apps/server/src/services/inference/error-handler.ts
@@ -0,0 +1,199 @@
 import type { MessageMetadata, Session } from '../../types/api.js';
 import {
  decideHtmlArtifactWrite,
  detectHtmlArtifact,
  deriveHtmlTitle,
  HTML_ARTIFACT_MAX_BYTES,
 } from '../artifacts.js';
 import * as modelContext from '../model-context.js';
 import { maybeFlagForCompaction } from './payload.js';
 import { insertParts, partsFromAssistantMessage } from './parts.js';
 import type { PartInsert } from './parts.js';
 import type { InferenceContext, StreamResult, TurnArgs } from './turn.js';
 export async function handleAbortOrError(
  ctx: InferenceContext,
  args: TurnArgs,
  accumulated: string,
  err: unknown
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId } = args;
  const isAbort = err instanceof Error && err.name === 'AbortError';
  const finalStatus = isAbort ? 'cancelled' : 'failed';
  const errMsg = err instanceof Error ? err.message : String(err);
  // v1.8.2: persist a structured error metadata blob on genuine failures so
  // the bubble can render the reason on reload without re-deriving from the
  // (one-shot) WS error frame. User-initiated abort skips this — there's no
  // "reason" to surface for a stop the user already explicitly chose.
  const errorMetadata: MessageMetadata | null = isAbort
    ? null
    : { kind: 'error', error_reason: 'llm_provider_error', error_text: errMsg };
  if (errorMetadata) {
    await ctx.sql`
      UPDATE messages
      SET status = ${finalStatus},
          content = ${accumulated},
          finished_at = clock_timestamp(),
          metadata = ${ctx.sql.json(errorMetadata as never)}
      WHERE id = ${assistantMessageId}
    `;
  } else {
    await ctx.sql`
      UPDATE messages
      SET status = ${finalStatus},
          content = ${accumulated},
          finished_at = clock_timestamp()
      WHERE id = ${assistantMessageId}
    `;
  }
  const [failSessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
    UPDATE sessions SET updated_at = clock_timestamp()
    WHERE id = ${sessionId}
    RETURNING project_id, name, updated_at
  `;
  ctx.publishUser({ type: 'session_updated', session_id: sessionId, project_id: failSessRow!.project_id, name: failSessRow!.name, updated_at: failSessRow!.updated_at });
  // v1.8 mobile-tabs: cancellation is a user-initiated stop, treat as idle;
  // genuine errors flip the dot red. v1.8.2: error path also carries a
  // machine-readable `reason` so the UI can render specifics inline.
  if (isAbort) {
    // v1.12.1: defensive cancellation write. The status=${finalStatus} UPDATE
    // above already sets 'cancelled' for the AbortError case, but a row can
    // leak as 'streaming' when the abort fires between the post-tool-phase
    // INSERT (executeToolPhase) and the next runAssistantTurn's stream setup,
    // bypassing the try/catch around executeStreamPhase. The status guard
    // makes this a no-op when the earlier write already landed.
    await ctx.sql`
      UPDATE messages
      SET status = 'cancelled', content = ${accumulated}, finished_at = clock_timestamp()
      WHERE id = ${args.assistantMessageId} AND status = 'streaming'
    `;
    ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
    ctx.publish(sessionId, {
      type: 'message_complete',
      message_id: assistantMessageId,
      chat_id: chatId,
    });
    ctx.log.info({ sessionId, chatId, assistantMessageId }, 'inference cancelled');
  } else {
    ctx.publishUser({
      type: 'chat_status',
      chat_id: chatId,
      status: 'error',
      at: new Date().toISOString(),
      reason: 'llm_provider_error',
    });
    ctx.publish(sessionId, {
      type: 'error',
      message_id: assistantMessageId,
      chat_id: chatId,
      error: errMsg,
      reason: 'llm_provider_error',
    });
    ctx.log.error({ err, sessionId, assistantMessageId }, 'inference failed');
  }
 }
 export async function finalizeCompletion(
  ctx: InferenceContext,
  args: TurnArgs,
  result: StreamResult,
  startedAt: string | null,
  session: Session
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId } = args;
  const { content, finishReason, promptTokens, completionTokens } = result;
  // v1.11.3: see executeToolPhase for the rationale.
  const mctx = await modelContext.getModelContext(session.model);
  const nCtx = mctx?.n_ctx ?? null;
  const [updated] = await ctx.sql<
    { tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
  >`
    UPDATE messages
    SET content = ${content},
        status = 'complete',
        tokens_used = ${completionTokens},
        ctx_used = ${promptTokens},
        ctx_max = ${nCtx},
        finished_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
    RETURNING tokens_used, ctx_used, ctx_max, finished_at
  `;
  // v1.13.0: dual-write the text part. finalizeCompletion is the terminal
  // path for text-only assistant turns (no tool calls); tool_calls are null
  // here by construction (the tool-bearing path goes through executeToolPhase).
  // v1.13.1-C: include result.reasoning so reasoning-channel models capture
  // a kind='reasoning' part alongside the text.
  // TODO(v1.13.1): wrap the UPDATE above and this insertParts in a single
  // sql.begin before flipping read authority to message_parts.
  const baseParts: PartInsert[] = partsFromAssistantMessage({
    content,
    tool_calls: null,
    reasoning: result.reasoning,
  }).map((p) => ({
    ...p,
    message_id: assistantMessageId,
  }));
  // v1.14.x-html-artifact-panes: opportunistic HTML detection. Adds a
  // SIBLING html_artifact part — never replaces the text part. 1MB cap is
  // graceful: oversized payloads are skipped and the assistant message
  // lands as plain content (warn logged).
  const htmlContent = detectHtmlArtifact(content);
  if (htmlContent !== null) {
    const decision = decideHtmlArtifactWrite(htmlContent);
    if (!decision.write) {
      ctx.log.warn(
        { assistantMessageId, byteLen: decision.byteLen, cap: HTML_ARTIFACT_MAX_BYTES },
        'html_artifact exceeded 1MB cap; skipping artifact part',
      );
    } else {
      const title = deriveHtmlTitle(htmlContent);
      const nextSeq = baseParts.reduce((m, p) => Math.max(m, p.sequence), -1) + 1;
      baseParts.push({
        message_id: assistantMessageId,
        sequence: nextSeq,
        kind: 'html_artifact',
        payload: {
          html_content: htmlContent,
          char_count: htmlContent.length,
          title,
        },
      });
    }
  }
  await insertParts(ctx.sql, baseParts);
  // v1.11: flag for compaction on the terminal turn too. Catches the common
  // case of a turn that hit the limit without invoking tools.
  await maybeFlagForCompaction(ctx, chatId, updated);
  const [completeSessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
    UPDATE sessions SET updated_at = clock_timestamp()
    WHERE id = ${sessionId}
    RETURNING project_id, name, updated_at
  `;
  ctx.publishUser({ type: 'session_updated', session_id: sessionId, project_id: completeSessRow!.project_id, name: completeSessRow!.name, updated_at: completeSessRow!.updated_at });
  ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
  ctx.publish(sessionId, {
    type: 'message_complete',
    message_id: assistantMessageId,
    chat_id: chatId,
    tokens_used: updated?.tokens_used ?? null,
    ctx_used: updated?.ctx_used ?? null,
    ctx_max: updated?.ctx_max ?? null,
    started_at: startedAt,
    finished_at: updated?.finished_at ?? null,
    model: session.model,
  });
  ctx.log.info(
    {
      sessionId,
      chatId,
      assistantMessageId,
      finishReason,
      chars: content.length,
      tokens_used: updated?.tokens_used,
      ctx_used: updated?.ctx_used,
    },
    'inference complete'
  );
 }
--- a/apps/server/src/services/inference/index.ts
+++ b/apps/server/src/services/inference/index.ts
@@ -0,0 +1,20 @@
 // v1.12.4: re-export shim. Outside callers (apps/server/src/index.ts and the
 // vitest inference tests) import from './services/inference/index.js'. The
 // directory is now the public surface; turn.ts holds runAssistantTurn /
 // runInference / createInferenceRunner while the other inference/*.ts files
 // stay implementation-private.
 export {
  createInferenceRunner,
  runAssistantTurn,
  runInference,
 } from './turn.js';
 export type {
  FramePublisher,
  InferenceContext,
  InferenceFrame,
  StreamResult,
  TurnArgs,
 } from './turn.js';
 export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js';
 export { buildMessagesPayload } from './payload.js';
--- a/apps/server/src/services/inference/parts.ts
+++ b/apps/server/src/services/inference/parts.ts
@@ -0,0 +1,108 @@
 import type { Sql } from '../../db.js';
 import type { ToolCall, ToolResult } from '../../types/api.js';
 // v1.13.0: dual-write helper. Every site that writes the legacy
 // messages.tool_calls / messages.tool_results JSON columns calls into here
 // to mirror the same data into message_parts rows. Reads still go to the
 // JSON columns; the swap to parts-as-source-of-truth happens in a later
 // v1.13 dispatch alongside the AI SDK streamText migration.
 // v1.13.13: 'synthesis' added. Schema CHECK constraint is updated in lockstep
 // (schema.sql adds 'synthesis' to message_parts_kind_chk on startup). The
 // dispatch's claim that no schema migration was needed assumed kind was a
 // bare text column — it isn't; the constraint enumerates allowed values.
 // v1.14.x-html-artifact-panes: 'html_artifact' added. Schema CHECK constraint
 // in schema.sql updated in lockstep.
 export type PartKind =
  | 'text'
  | 'tool_call'
  | 'tool_result'
  | 'reasoning'
  | 'step_start'
  | 'synthesis'
  | 'html_artifact';
 export interface PartInsert {
  message_id: string;
  sequence: number;
  kind: PartKind;
  payload: unknown;
 }
 export async function insertParts(sql: Sql, parts: PartInsert[]): Promise<void> {
  if (parts.length === 0) return;
  // postgres-js fans out an array of objects to a multi-row INSERT. Each
  // payload field needs sql.json() so jsonb storage receives a JSON value
  // rather than a quoted string.
  await sql`
    INSERT INTO message_parts ${sql(
      parts.map((p) => ({
        message_id: p.message_id,
        sequence: p.sequence,
        kind: p.kind,
        payload: sql.json(p.payload as never),
      })),
      'message_id',
      'sequence',
      'kind',
      'payload',
    )}
  `;
 }
 // Derive parts from the canonical messages row for an assistant message.
 // reasoning (when non-empty) becomes a 'reasoning' part at sequence 0 —
 // it precedes user-visible content logically. content (when non-empty)
 // becomes a 'text' part next; each tool_call becomes a 'tool_call' part
 // with payload { id, name, args } where args is the parsed object (we
 // use the in-memory ToolCall shape, not the OpenAI stringified one).
 export function partsFromAssistantMessage(args: {
  content: string;
  tool_calls: ToolCall[] | null;
  // v1.13.1-C: optional reasoning text streamed alongside the answer.
  // Most rows have none — only models with separate reasoning channels
  // (qwen3.6 etc.) populate this.
  reasoning?: string;
 }): Omit<PartInsert, 'message_id'>[] {
  const out: Omit<PartInsert, 'message_id'>[] = [];
  let seq = 0;
  if (args.reasoning && args.reasoning.length > 0) {
    out.push({ sequence: seq, kind: 'reasoning', payload: { text: args.reasoning } });
    seq += 1;
  }
  if (args.content && args.content.length > 0) {
    out.push({ sequence: seq, kind: 'text', payload: { text: args.content } });
    seq += 1;
  }
  for (const tc of args.tool_calls ?? []) {
    out.push({
      sequence: seq,
      kind: 'tool_call',
      payload: { id: tc.id, name: tc.name, args: tc.args },
    });
    seq += 1;
  }
  return out;
 }
 // Derive a single tool_result part from a tool message's tool_results JSON.
 // The payload includes the same shape that buildMessagesPayload reads from
 // later: tool_call_id, output, optional error/truncated metadata.
 export function partsFromToolMessage(args: {
  tool_results: ToolResult | null;
 }): Omit<PartInsert, 'message_id'>[] {
  if (!args.tool_results) return [];
  const tr = args.tool_results;
  return [
    {
      sequence: 0,
      kind: 'tool_result',
      payload: {
        tool_call_id: tr.tool_call_id,
        output: tr.output,
        truncated: tr.truncated,
        ...(tr.error ? { error: tr.error } : {}),
      },
    },
  ];
 }
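
Note on part sequencing in parts.ts: the sketch below shows how `partsFromAssistantMessage` numbers its output, with reasoning first, then text, then one part per tool call. The local `ToolCall` and `PartSketch` types and the sample tool call are hypothetical stand-ins for the imported types; the function body follows the diff.

```typescript
// Hypothetical local stand-ins for the types imported in parts.ts.
type PartKind = 'text' | 'tool_call' | 'reasoning';
interface ToolCall { id: string; name: string; args: Record<string, unknown> }
interface PartSketch { sequence: number; kind: PartKind; payload: unknown }

// Same ordering logic as partsFromAssistantMessage in the diff above.
function partsFromAssistantMessage(args: {
  content: string;
  tool_calls: ToolCall[] | null;
  reasoning?: string;
}): PartSketch[] {
  const out: PartSketch[] = [];
  let seq = 0;
  if (args.reasoning && args.reasoning.length > 0) {
    out.push({ sequence: seq, kind: 'reasoning', payload: { text: args.reasoning } });
    seq += 1;
  }
  if (args.content && args.content.length > 0) {
    out.push({ sequence: seq, kind: 'text', payload: { text: args.content } });
    seq += 1;
  }
  for (const tc of args.tool_calls ?? []) {
    out.push({ sequence: seq, kind: 'tool_call', payload: { id: tc.id, name: tc.name, args: tc.args } });
    seq += 1;
  }
  return out;
}

// Render parts as "sequence:kind" for quick inspection.
const describe = (parts: PartSketch[]): string =>
  parts.map((p) => `${p.sequence}:${p.kind}`).join(' ');

const call: ToolCall = { id: 'call_1', name: 'grep', args: { pattern: 'TODO' } };

console.log(describe(partsFromAssistantMessage({ content: 'Searching.', tool_calls: [call], reasoning: 'Need to find TODOs.' })));
// → 0:reasoning 1:text 2:tool_call
console.log(describe(partsFromAssistantMessage({ content: '', tool_calls: [call] })));
// → 0:tool_call
```

Because empty reasoning and empty content are skipped rather than emitted as blank parts, sequence numbers stay dense: a tool-only turn starts its first `tool_call` at sequence 0.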
--- a/apps/server/src/services/inference/payload.ts
+++ b/apps/server/src/services/inference/payload.ts
@@ -0,0 +1,226 @@
 import type { FastifyBaseLogger } from 'fastify';
 import type { Sql } from '../../db.js';
 import type {
  Agent,
  Message,
  Project,
  Session,
 } from '../../types/api.js';
 import * as compaction from '../compaction.js';
 import { buildSystemPromptWithFingerprint } from '../system-prompt.js';
 import { isAnySentinel } from './sentinels.js';
 import { PRUNE_TRIGGER_TOKENS, prune } from './prune.js';
 import type { InferenceContext } from './turn.js';
 export interface OpenAiMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
  tool_calls?: Array<{
    id: string;
    type: 'function';
    function: { name: string; arguments: string };
  }>;
  tool_call_id?: string;
  // v1.13.1-C: reasoning text from a prior assistant turn, sourced from
  // message_parts kind='reasoning' rows joined in via reasoning_parts on
  // the messages_with_parts view. stream-phase.ts/toModelMessages threads
  // this into the AI SDK ReasoningPart when forwarding to the model so
  // reasoning models can resume mid-thought across tool-call boundaries.
  reasoning?: string;
 }
 // v1.12: buildSystemPrompt lives in services/system-prompt.ts. It awaits the
 // container-guidance loader, so this function is async too and every call
 // site in inference.ts awaits the result.
 // v1.13.8: optional log argument. When provided, emit prefix-fingerprint
 // per call + prefix-drift when the same session sees a hash change. Tests
 // omit it and exercise the byte-stability surface directly through
 // buildSystemPromptWithFingerprint. The observer Map in system-prompt.ts
 // updates regardless of whether log is passed.
 export async function buildMessagesPayload(
  session: Session,
  project: Project,
  history: Message[],
  agent: Agent | null = null,
  log?: FastifyBaseLogger,
 ): Promise<OpenAiMessage[]> {
  const out: OpenAiMessage[] = [];
  const { prompt: systemPrompt, fingerprint, drift } =
    await buildSystemPromptWithFingerprint(project, session, agent);
  if (log) {
    log.info(fingerprint);
    if (drift) log.warn(drift);
  }
  out.push({ role: 'system', content: systemPrompt });
  // Find the latest compact marker — only send messages from that point onwards
  let startIdx = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    if (history[i]!.kind === 'compact') {
      startIdx = i;
      break;
    }
  }
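   // Illustration (hypothetical history): if zero-based rows 1 and 3 of a
   // five-row history have kind 'compact', the scan from the end stops at
   // row 3, so startIdx = 3. The loop below then emits row 3 as a 'system'
   // message carrying the summary text, followed by row 4 (subject to the
   // skip rules further down).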
  for (let i = startIdx; i < history.length; i++) {
    const m = history[i]!;
    if (m.kind === 'compact') {
      out.push({ role: 'system', content: m.content });
      continue;
    }
    // v1.8.2 / v1.11.6: cap-hit and doom-loop sentinels are UI-only — never
    // send them to the LLM. The synthetic instruction note lives only inside
    // the summary call's messages array and is never persisted, so on a
    // follow-up turn the model resumes with a clean context.
    if (isAnySentinel(m)) continue;
    if (m.role === 'assistant' && m.status === 'streaming') continue;
    if (m.role === 'assistant' && m.status === 'cancelled') continue;
    // v1.13.7: skip failed assistant turns. A failed row carries no usable
    // content for the model, and leaving it in the payload alongside any
    // following assistant message produces "Cannot have 2 or more assistant
    // messages at the end of the list" from the OpenAI-compatible upstream.
    if (m.role === 'assistant' && m.status === 'failed') continue;
   // v1.13.7: skip "empty" completed assistants: blank content, no tool_calls.
    // These can land when an upstream stream returns finishReason='stop' with
    // no text/tool output (network blip, rate limit recovery, model quirk).
    // Same risk as the failed-status case: a trailing empty assistant plus
    // the next attempt's assistant placeholder = two trailing assistants and
    // the API rejects the whole payload.
    if (
      m.role === 'assistant' &&
      m.status === 'complete' &&
      (m.content == null || m.content.trim().length === 0) &&
      (m.tool_calls == null || m.tool_calls.length === 0)
    ) {
      continue;
    }
    if (m.role === 'tool') {
      const tr = m.tool_results;
      if (!tr) continue;
      const outputText = tr.error
        ? `error: ${tr.error}`
        : typeof tr.output === 'string'
          ? tr.output
          : JSON.stringify(tr.output);
      out.push({
        role: 'tool',
        content: outputText,
        tool_call_id: tr.tool_call_id,
      });
      continue;
    }
    if (m.role === 'assistant') {
      const msg: OpenAiMessage = {
        role: 'assistant',
        content: m.content && m.content.length > 0 ? m.content : null,
      };
      if (m.tool_calls && m.tool_calls.length > 0) {
        msg.tool_calls = m.tool_calls.map((tc) => ({
          id: tc.id,
          type: 'function' as const,
          function: { name: tc.name, arguments: JSON.stringify(tc.args) },
        }));
      }
      // v1.13.1-C: collapse reasoning_parts into a single string. The view
      // returns them ordered by sequence; multiple reasoning parts on one
      // message are rare but concat preserves ordering. Skip when absent.
      if (m.reasoning_parts && m.reasoning_parts.length > 0) {
        msg.reasoning = m.reasoning_parts.map((p) => p.text ?? '').join('');
      }
      out.push(msg);
      continue;
    }
    out.push({ role: 'user', content: m.content });
  }
  return out;
 }
 export async function loadContext(
  sql: Sql,
  sessionId: string,
  chatId: string
 ): Promise<{ session: Session; project: Project; history: Message[] } | null> {
  const sessionRows = await sql<Session[]>`
    SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at,
           agent_id, web_search_enabled
    FROM sessions WHERE id = ${sessionId}
  `;
  if (sessionRows.length === 0) return null;
  const session = sessionRows[0]!;
  const projectRows = await sql<Project[]>`
    SELECT id, name, path, added_at, last_session_id, status, gitea_remote,
           default_system_prompt, default_web_search_enabled
    FROM projects WHERE id = ${session.project_id}
  `;
  if (projectRows.length === 0) return null;
  const project = projectRows[0]!;
  // v1.11: filter compacted messages out of the inference assembly. The GET
  // /api/sessions/:id/messages endpoint still returns everything (so the UI
  // can show history with the summary card inline); only LLM payloads skip
  // compacted rows. compacted_at IS NULL keeps the active summary + tail.
  // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
  // v1.13.1-C: also pull reasoning_parts so assistant messages from
  // reasoning models can be replayed with their reasoning context preserved.
  const history = await sql<Message[]>`
    SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
           tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
           reasoning_parts
    FROM messages_with_parts
    WHERE chat_id = ${chatId} AND compacted_at IS NULL
    ORDER BY created_at ASC, id ASC
  `;
  return { session, project, history };
 }
 // v1.11: shared helper used after both finalizeCompletion and executeToolPhase
 // persist their token counts. Reads tokens off the just-UPDATEd row (which
 // the caller returns from RETURNING), runs compaction.isOverflow, and flips
 // chats.needs_compaction. The next runAssistantTurn invocation acts on it.
 // Silent on missing tokens — llama-swap occasionally omits usage on truncated
 // streams, and we'd rather miss one overflow than crash the inference path.
 export async function maybeFlagForCompaction(
  ctx: InferenceContext,
  chatId: string,
  updated: { tokens_used: number | null; ctx_used: number | null; ctx_max: number | null } | undefined,
 ): Promise<void> {
  if (!updated) return;
  const promptTokens = updated.ctx_used;
  const completionTokens = updated.tokens_used;
  const contextLimit = updated.ctx_max;
  if (typeof promptTokens !== 'number') return;
  if (typeof completionTokens !== 'number') return;
  if (typeof contextLimit !== 'number') return;
  const overflow = compaction.isOverflow(
    { prompt_tokens: promptTokens, completion_tokens: completionTokens },
    contextLimit,
  );
  if (!overflow) return;
  // v1.13.4: try the cheap prune first. If it freed at least
  // PRUNE_TRIGGER_TOKENS (20k) worth of context, we're below the threshold
  // again — skip flagging summarize for the next turn. The next turn's
  // overflow check will re-evaluate from scratch.
  // v1.13.9: the overflow trigger above is now 85% of ctx_max (was
  // ctx_max - 20k). PRUNE_TRIGGER_TOKENS stays at 20k as the prune-freed
  // threshold — independent of the overflow formula.
  // Prune failures (DB errors etc.) propagate so the surrounding inference
  // path sees them; the catch in finalizeCompletion / executeToolPhase
  // doesn't shield this — by design, we want to know if prune is broken.
  const pruned = await prune({ sql: ctx.sql, chatId });
  if (pruned.hidden > 0) {
    ctx.log.info(
      { chatId, hidden: pruned.hidden, freedTokens: pruned.freedTokens },
      'inference: prune freed context budget',
    );
  }
  if (pruned.freedTokens >= PRUNE_TRIGGER_TOKENS) {
    // Prune handled it; skip the (expensive) summarize path.
    return;
  }
  await ctx.sql`UPDATE chats SET needs_compaction = true WHERE id = ${chatId}`;
  ctx.log.info({ chatId, promptTokens, completionTokens, contextLimit }, 'inference: flagged for compaction');
 }
--- a/apps/server/src/services/inference/provider.ts
+++ b/apps/server/src/services/inference/provider.ts
@@ -0,0 +1,34 @@
 import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
 import type { LanguageModel } from 'ai';
 // v1.13.1-A: AI SDK provider against llama-swap. baseURL is threaded from
 // config.LLAMA_SWAP_URL at call time (not module-load) so tests can stub the
 // upstream without touching env vars. No apiKey: llama-swap is
 // unauthenticated in our Tailscale topology, and exposure over the public
 // internet is gated by Authelia at the Caddy layer, not by API keys.
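 // Illustration of the baseURL handling in getProvider (hostnames are
 // hypothetical): 'http://llama-swap:8080' becomes
 // 'http://llama-swap:8080/v1', and 'http://llama-swap:8080/v1' is used
 // unchanged. A trailing slash is not stripped, so 'http://llama-swap:8080/'
 // would become 'http://llama-swap:8080//v1'.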
 const cache = new Map<string, ReturnType<typeof createOpenAICompatible>>();
 function getProvider(baseURL: string): ReturnType<typeof createOpenAICompatible> {
  let provider = cache.get(baseURL);
  if (!provider) {
    provider = createOpenAICompatible({
      name: 'llama-swap',
      baseURL: baseURL.endsWith('/v1') ? baseURL : `${baseURL}/v1`,
      // v1.13.7: @ai-sdk/openai-compatible defaults includeUsage=false, which
      // omits `stream_options.include_usage` from the request body. Without
      // it, llama.cpp / llama-swap never emits the trailing usage block, so
      // `result.usage` resolves with inputTokens=outputTokens=undefined and
      // tokens_used / ctx_used land as NULL in every messages row. Setting
      // true here re-enables the per-stream usage payload across all models
      // served via the llama-swap provider.
      includeUsage: true,
    });
    cache.set(baseURL, provider);
  }
  return provider;
 }
 export function upstreamModel(baseURL: string, modelId: string): LanguageModel {
  return getProvider(baseURL).chatModel(modelId);
 }
--- a/apps/server/src/services/inference/prune.ts
+++ b/apps/server/src/services/inference/prune.ts
@@ -0,0 +1,127 @@
 import type { Sql } from '../../db.js';
 // v1.13.4: two-tier compaction prune. Opencode's prune half (the cheap one);
 // summarize half shipped in v1.11.0 as services/compaction.ts.
 //
 // Algorithm: scan tool_result parts newest-first. Protect the last
 // PROTECTED_TOKENS of content (the model recently saw these — pruning them
 // kills coherence). Older parts are candidates. Mark them hidden_at only
 // if the candidate pool would free at least PRUNE_TRIGGER_TOKENS — pruning
 // 3 small tool_results to recover 500 tokens isn't worth the loss of
 // fidelity for the model's next turn.
 //
 // Stops at the last compaction summary boundary (chats.tail_start_id). The
 // v1.11.0 summary already encodes everything before that point; pruning
 // across the boundary would double-erase.
 export const PROTECTED_TOKENS = 40_000;
 export const PRUNE_TRIGGER_TOKENS = 20_000;
 // Rough char-to-token estimate. Same heuristic compaction's usable() uses
 // implicitly via the buffer constant.
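 // Worked example of the estimate: a 10-character string is counted as
 // ceil(10 / 4) = 3 tokens. payloadTokens measures the JSON encoding, so the
 // string payload 'abcd' is serialized with its surrounding quotes (6
 // characters) and counted as ceil(6 / 4) = 2 tokens, not 1.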
 function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
 }
 function payloadTokens(payload: unknown): number {
  return estimateTokens(JSON.stringify(payload ?? ''));
 }
 export interface PruneResult {
  hidden: number;
  freedTokens: number;
 }
 // Pure algorithmic core, exported for unit-test access. Takes parts already
 // ordered newest-first, plus an optional cutoff (last compaction summary
 // boundary). Returns the part ids to hide and the total token estimate of
 // the candidates. Caller does the DB UPDATE.
 export interface PartForPrune {
  id: string;
  payload: unknown;
  created_at: Date;
 }
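 // Worked example (illustrative numbers, not taken from a real chat): with
 // PROTECTED_TOKENS = 40_000 and PRUNE_TRIGGER_TOKENS = 20_000, suppose the
 // newest-first scan sees tool_result parts estimated at 30k, 15k, 12k and
 // 9k tokens. The 30k part is protected (running total 30k < 40k). The 15k
 // part is also protected; it pushes the running total to 45k, which closes
 // the protection window. The 12k and 9k parts become candidates, and
 // together they would free 21k >= 20k, so both ids are returned. Had the
 // last part been 7k instead, the pool would total 19k < 20k and nothing
 // would be returned.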
 export function selectPruneTargets(
  partsNewestFirst: ReadonlyArray<PartForPrune>,
  tailStartCreatedAt: Date | null,
 ): { ids: string[]; freedTokens: number } {
  let protectedTokens = 0;
  const candidates: { id: string; tokens: number }[] = [];
  let crossedProtection = false;
  for (const part of partsNewestFirst) {
    if (tailStartCreatedAt && part.created_at < tailStartCreatedAt) {
      // Past the last summary boundary; the v1.11.0 anchored summary already
      // covers everything older. Bail rather than double-erase.
      break;
    }
    const tokens = payloadTokens(part.payload);
    if (!crossedProtection) {
      protectedTokens += tokens;
      if (protectedTokens >= PROTECTED_TOKENS) {
        crossedProtection = true;
      }
      continue;
    }
    candidates.push({ id: part.id, tokens });
  }
  const candidateTokens = candidates.reduce((s, c) => s + c.tokens, 0);
  if (candidates.length === 0 || candidateTokens < PRUNE_TRIGGER_TOKENS) {
    return { ids: [], freedTokens: 0 };
  }
  return { ids: candidates.map((c) => c.id), freedTokens: candidateTokens };
 }
 export async function prune(args: {
  sql: Sql;
  chatId: string;
 }): Promise<PruneResult> {
  const { sql, chatId } = args;
  // Newest-first scan of visible tool_result parts in this chat. Pull
  // chats.tail_start_id alongside so we know where the last summary boundary
  // sits (don't prune across it).
  const parts = await sql<{
    id: string;
    payload: unknown;
    created_at: Date;
    tail_start_id: string | null;
  }[]>`
    SELECT p.id, p.payload, m.created_at,
      (SELECT c.tail_start_id FROM chats c WHERE c.id = ${chatId}) AS tail_start_id
    FROM message_parts p
    JOIN messages m ON m.id = p.message_id
    WHERE m.chat_id = ${chatId}
      AND p.kind = 'tool_result'
      AND p.hidden_at IS NULL
    ORDER BY m.created_at DESC, p.sequence DESC
  `;
  if (parts.length === 0) {
    return { hidden: 0, freedTokens: 0 };
  }
  // Read the boundary cutoff timestamp once. Older messages are off-limits.
  let tailStartCreatedAt: Date | null = null;
  const firstTailId = parts[0]?.tail_start_id ?? null;
  if (firstTailId) {
    const tailRow = await sql<{ created_at: Date }[]>`
      SELECT created_at FROM messages WHERE id = ${firstTailId}
    `;
    tailStartCreatedAt = tailRow[0]?.created_at ?? null;
  }
  const decision = selectPruneTargets(parts, tailStartCreatedAt);
  if (decision.ids.length === 0) {
    return { hidden: 0, freedTokens: 0 };
  }
  await sql`
    UPDATE message_parts
    SET hidden_at = clock_timestamp()
    WHERE id = ANY(${decision.ids})
  `;
  return { hidden: decision.ids.length, freedTokens: decision.freedTokens };
 }
--- a/apps/server/src/services/inference/sentinel-summaries.ts
+++ b/apps/server/src/services/inference/sentinel-summaries.ts
@@ -0,0 +1,523 @@
 import type {
  Agent,
  Message,
  MessageMetadata,
  Project,
  Session,
 } from '../../types/api.js';
 import * as modelContext from '../model-context.js';
 import { buildMessagesPayload } from './payload.js';
 import { DOOM_LOOP_THRESHOLD } from './sentinels.js';
 import { streamCompletion } from './stream-phase.js';
 import { DB_FLUSH_INTERVAL_MS } from './types.js';
 import type {
  InferenceContext,
  StreamResult,
  TurnArgs,
 } from './turn.js';
 // Synthetic system note appended to the cap-hit summary call. Verbatim from
 // the v1.8.2 spec — do not paraphrase: the model is more reliable when the
 // instruction is short, declarative, and identical across calls.
 const CAP_HIT_SUMMARY_NOTE = (limit: number) =>
  `You've reached the tool budget (${limit} calls). Produce the best answer you can with what you have. Do not call more tools.`;
 const DOOM_LOOP_NOTE = (name: string) =>
  `You called ${name} with the same arguments ${DOOM_LOOP_THRESHOLD} times in a row. Stop calling it. Produce the best answer you can with what you have.`;
 export async function runCapHitSummary(
  ctx: InferenceContext,
  args: TurnArgs,
  session: Session,
  project: Project,
  history: Message[],
  agent: Agent | null,
  budget: number,
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId, signal } = args;
  const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
  messages.push({ role: 'system', content: CAP_HIT_SUMMARY_NOTE(budget) });
  const startedRow = await ctx.sql<{ started_at: string }[]>`
    UPDATE messages
    SET started_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
    RETURNING started_at
  `;
  const startedAt = startedRow[0]?.started_at ?? null;
  ctx.publish(sessionId, {
    type: 'message_started',
    message_id: assistantMessageId,
    chat_id: chatId,
    role: 'assistant',
  });
  let accumulated = '';
  let pendingFlushTimer: NodeJS.Timeout | null = null;
  let flushPromise: Promise<unknown> = Promise.resolve();
  const flushNow = () => {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    const snapshot = accumulated;
    flushPromise = flushPromise.then(() =>
      ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
    );
  };
  const scheduleFlush = () => {
    if (pendingFlushTimer) return;
    pendingFlushTimer = setTimeout(() => {
      pendingFlushTimer = null;
      flushNow();
    }, DB_FLUSH_INTERVAL_MS);
  };
  let summaryOk = false;
  let summarySoftCancelled = false;
  let summaryError: string | null = null;
  let result: StreamResult | null = null;
  try {
    result = await streamCompletion(
      ctx,
      session.model,
      messages,
      { tools: null, temperature: agent?.temperature },
      (delta) => {
        accumulated += delta;
        ctx.publish(sessionId, {
          type: 'delta',
          message_id: assistantMessageId,
          chat_id: chatId,
          content: delta,
        });
        scheduleFlush();
      },
      undefined,
      signal,
    );
    summaryOk = true;
  } catch (err) {
    if (err instanceof Error && err.name === 'AbortError') {
      summarySoftCancelled = true;
    } else {
      summaryError = err instanceof Error ? err.message : String(err);
    }
  } finally {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    await flushPromise;
  }
  // Finalize the summary message based on the three outcomes. The sentinel
  // is inserted regardless so the user always has the Continue affordance —
  // even on a partial / failed summary the chat history shows where the
  // budget was hit.
  if (summaryOk && result) {
    // v1.11.3: see executeToolPhase for the rationale.
    const mctx = await modelContext.getModelContext(session.model);
    const nCtx = mctx?.n_ctx ?? null;
    const [updated] = await ctx.sql<
      { tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
    >`
      UPDATE messages
      SET content = ${result.content},
          status = 'complete',
          tokens_used = ${result.completionTokens},
          ctx_used = ${result.promptTokens},
          ctx_max = ${nCtx},
          finished_at = clock_timestamp()
      WHERE id = ${assistantMessageId}
      RETURNING tokens_used, ctx_used, ctx_max, finished_at
    `;
    ctx.publish(sessionId, {
      type: 'message_complete',
      message_id: assistantMessageId,
      chat_id: chatId,
      tokens_used: updated?.tokens_used ?? null,
      ctx_used: updated?.ctx_used ?? null,
      ctx_max: updated?.ctx_max ?? null,
      started_at: startedAt,
      finished_at: updated?.finished_at ?? null,
      model: session.model,
    });
  } else if (summarySoftCancelled) {
    await ctx.sql`
      UPDATE messages
      SET content = ${accumulated},
          status = 'cancelled',
          finished_at = clock_timestamp()
      WHERE id = ${assistantMessageId}
    `;
    ctx.publish(sessionId, {
      type: 'message_complete',
      message_id: assistantMessageId,
      chat_id: chatId,
    });
  } else {
    const errMeta: MessageMetadata = {
      kind: 'error',
      error_reason: 'summary_after_cap_failed',
      error_text: summaryError ?? 'summary failed',
    };
    await ctx.sql`
      UPDATE messages
      SET content = ${accumulated},
          status = 'failed',
          finished_at = clock_timestamp(),
          metadata = ${ctx.sql.json(errMeta as never)}
      WHERE id = ${assistantMessageId}
    `;
    ctx.publish(sessionId, {
      type: 'error',
      message_id: assistantMessageId,
      chat_id: chatId,
      error: summaryError ?? 'summary failed',
      reason: 'summary_after_cap_failed',
    });
  }
  // Bump session/chat updated_at exactly once for this turn.
  const [sessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
    UPDATE sessions SET updated_at = clock_timestamp()
    WHERE id = ${sessionId}
    RETURNING project_id, name, updated_at
  `;
  ctx.publishUser({
    type: 'session_updated',
    session_id: sessionId,
    project_id: sessRow!.project_id,
    name: sessRow!.name,
    updated_at: sessRow!.updated_at,
  });
  await insertCapHitSentinel(ctx, sessionId, chatId, agent, budget);
  // Status frame fires last so the dot color reflects the terminal state.
  // Success → idle, abort → idle (user-driven stop), error → error+reason.
   if (summaryOk || summarySoftCancelled) {
     ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
  } else {
    ctx.publishUser({
      type: 'chat_status',
      chat_id: chatId,
      status: 'error',
      at: new Date().toISOString(),
      reason: 'summary_after_cap_failed',
    });
  }
  ctx.log.info(
    { sessionId, chatId, assistantMessageId, budget, summaryOk, summaryCancelled: summarySoftCancelled },
    'inference cap-hit summary finished',
  );
 }
 async function insertCapHitSentinel(
  ctx: InferenceContext,
  sessionId: string,
  chatId: string,
  agent: Agent | null,
  budget: number,
 ): Promise<void> {
  // Hard ceiling: count prior cap_hit sentinels in this chat. After two
  // continues (sentinel count of 2), the next sentinel reports can_continue
  // false and the UI disables the Continue button.
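   // Concretely: the first sentinel sees priorCount 0 and the second sees 1,
   // so both report can_continue true; the third sees 2 and reports false.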
  const priorRows = await ctx.sql<{ count: number }[]>`
    SELECT COUNT(*)::int AS count
    FROM messages
    WHERE chat_id = ${chatId}
      AND role = 'system'
      AND metadata->>'kind' = 'cap_hit'
  `;
  const priorCount = priorRows[0]?.count ?? 0;
  const canContinue = priorCount < 2;
  const metadata: MessageMetadata = {
    kind: 'cap_hit',
    used: budget,
    limit: budget,
    agent_name: agent?.name ?? null,
    can_continue: canContinue,
  };
  const content = `Reached tool budget (${budget}/${budget}). Continue to extend.`;
  const [row] = await ctx.sql<{ id: string }[]>`
    INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
    VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
    RETURNING id
  `;
  // The sentinel content is static, but we still walk the standard frame
  // sequence (started → delta → complete) so useSessionStream's reducer
  // appends it via the same path it uses for streaming assistant messages.
  // The delta carries the full text in one chunk.
  ctx.publish(sessionId, {
    type: 'message_started',
    message_id: row!.id,
    chat_id: chatId,
    role: 'system',
  });
  ctx.publish(sessionId, {
    type: 'delta',
    message_id: row!.id,
    chat_id: chatId,
    content,
  });
  ctx.publish(sessionId, {
    type: 'message_complete',
    message_id: row!.id,
    chat_id: chatId,
    metadata,
  });
 }
 // v1.11.6: doom-loop wrap-up. Mirrors runCapHitSummary structurally — same
 // in-flight-slot reuse, same tools-disabled streaming-summary call, same
 // post-finalize sentinel insert + chat_status drop. Differences:
 //   - synthetic note text comes from DOOM_LOOP_NOTE (names the looping tool)
 //   - sentinel metadata is { kind: 'doom_loop', tool_name, args, threshold }
 //     and has no Continue affordance (manual retry would just re-loop)
 // The error path is not a difference: it reuses reason
 // 'summary_after_cap_failed' and changes only the fallback error_text.
 // Kept as a clone rather than refactored into a shared helper because the
 // two summary paths still differ in note text + sentinel shape; a third
 // sentinel would justify factoring out runWrapUpSummary(opts).
 export async function runDoomLoopSummary(
  ctx: InferenceContext,
  args: TurnArgs,
  session: Session,
  project: Project,
  history: Message[],
  agent: Agent | null,
  loop: { name: string; args: Record<string, unknown> },
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId, signal } = args;
  const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
  messages.push({ role: 'system', content: DOOM_LOOP_NOTE(loop.name) });
  const startedRow = await ctx.sql<{ started_at: string }[]>`
    UPDATE messages
    SET started_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
    RETURNING started_at
  `;
  const startedAt = startedRow[0]?.started_at ?? null;
  ctx.publish(sessionId, {
    type: 'message_started',
    message_id: assistantMessageId,
    chat_id: chatId,
    role: 'assistant',
  });
  let accumulated = '';
  let pendingFlushTimer: NodeJS.Timeout | null = null;
  let flushPromise: Promise<unknown> = Promise.resolve();
  const flushNow = () => {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    const snapshot = accumulated;
    flushPromise = flushPromise.then(() =>
      ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
    );
  };
  const scheduleFlush = () => {
    if (pendingFlushTimer) return;
    pendingFlushTimer = setTimeout(() => {
      pendingFlushTimer = null;
      flushNow();
    }, DB_FLUSH_INTERVAL_MS);
  };
  let summaryOk = false;
  let summarySoftCancelled = false;
  let summaryError: string | null = null;
  let result: StreamResult | null = null;
  try {
    result = await streamCompletion(
      ctx,
      session.model,
      messages,
      { tools: null, temperature: agent?.temperature },
      (delta) => {
        accumulated += delta;
        ctx.publish(sessionId, {
          type: 'delta',
          message_id: assistantMessageId,
          chat_id: chatId,
          content: delta,
        });
        scheduleFlush();
      },
      undefined,
      signal,
    );
    summaryOk = true;
  } catch (err) {
    if (err instanceof Error && err.name === 'AbortError') {
      summarySoftCancelled = true;
    } else {
      summaryError = err instanceof Error ? err.message : String(err);
    }
  } finally {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    await flushPromise;
  }
  if (summaryOk && result) {
    const mctx = await modelContext.getModelContext(session.model);
    const nCtx = mctx?.n_ctx ?? null;
    const [updated] = await ctx.sql<
      { tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
    >`
      UPDATE messages
      SET content = ${result.content},
          status = 'complete',
          tokens_used = ${result.completionTokens},
          ctx_used = ${result.promptTokens},
          ctx_max = ${nCtx},
          finished_at = clock_timestamp()
      WHERE id = ${assistantMessageId}
      RETURNING tokens_used, ctx_used, ctx_max, finished_at
    `;
    ctx.publish(sessionId, {
      type: 'message_complete',
      message_id: assistantMessageId,
      chat_id: chatId,
      tokens_used: updated?.tokens_used ?? null,
      ctx_used: updated?.ctx_used ?? null,
      ctx_max: updated?.ctx_max ?? null,
      started_at: startedAt,
      finished_at: updated?.finished_at ?? null,
      model: session.model,
    });
  } else if (summarySoftCancelled) {
    await ctx.sql`
      UPDATE messages
      SET content = ${accumulated},
          status = 'cancelled',
          finished_at = clock_timestamp()
      WHERE id = ${assistantMessageId}
    `;
    ctx.publish(sessionId, {
      type: 'message_complete',
      message_id: assistantMessageId,
      chat_id: chatId,
    });
  } else {
    // Doom-loop summary failure reuses the existing summary_after_cap_failed
    // error reason — the ErrorReason union is shared between sentinel paths
    // and the UI surfaces a generic "summary failed" line for both. We don't
    // add a new reason code because the user-visible failure mode is the
    // same (model gave up mid-summary). Sentinel below still fires.
    const errMeta: MessageMetadata = {
      kind: 'error',
      error_reason: 'summary_after_cap_failed',
      error_text: summaryError ?? 'doom-loop summary failed',
    };
    await ctx.sql`
      UPDATE messages
      SET content = ${accumulated},
          status = 'failed',
          finished_at = clock_timestamp(),
          metadata = ${ctx.sql.json(errMeta as never)}
      WHERE id = ${assistantMessageId}
    `;
    ctx.publish(sessionId, {
      type: 'error',
      message_id: assistantMessageId,
      chat_id: chatId,
      error: summaryError ?? 'doom-loop summary failed',
      reason: 'summary_after_cap_failed',
    });
  }
  const [sessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
    UPDATE sessions SET updated_at = clock_timestamp()
    WHERE id = ${sessionId}
    RETURNING project_id, name, updated_at
  `;
  ctx.publishUser({
    type: 'session_updated',
    session_id: sessionId,
    project_id: sessRow!.project_id,
    name: sessRow!.name,
    updated_at: sessRow!.updated_at,
  });
  await insertDoomLoopSentinel(ctx, sessionId, chatId, loop);
  if (summaryOk || summarySoftCancelled) {
    ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
  } else {
    ctx.publishUser({
      type: 'chat_status',
      chat_id: chatId,
      status: 'error',
      at: new Date().toISOString(),
      reason: 'summary_after_cap_failed',
    });
  }
  ctx.log.info(
    { sessionId, chatId, assistantMessageId, loopedTool: loop.name, summaryOk, summaryCancelled: summarySoftCancelled },
    'inference doom-loop summary finished',
  );
 }
 async function insertDoomLoopSentinel(
  ctx: InferenceContext,
  sessionId: string,
  chatId: string,
  loop: { name: string; args: Record<string, unknown> },
 ): Promise<void> {
  // No hard-ceiling / can-continue logic here — doom-loop is a different
  // failure mode from cap-hit. Continuing would re-trigger the loop with
  // the same tools available; the user needs to restate their question
  // or switch agents instead.
  const metadata: MessageMetadata = {
    kind: 'doom_loop',
    tool_name: loop.name,
    args: loop.args,
    threshold: DOOM_LOOP_THRESHOLD,
  };
  const content = `Detected ${DOOM_LOOP_THRESHOLD} identical calls to ${loop.name}. Stopping the tool-call loop. Produce the best answer you can with what you have.`;
  const [row] = await ctx.sql<{ id: string }[]>`
    INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
    VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
    RETURNING id
  `;
  // Standard frame sequence — same as cap-hit sentinel — so
  // useSessionStream's reducer appends the row via the existing path.
  ctx.publish(sessionId, {
    type: 'message_started',
    message_id: row!.id,
    chat_id: chatId,
    role: 'system',
  });
  ctx.publish(sessionId, {
    type: 'delta',
    message_id: row!.id,
    chat_id: chatId,
    content,
  });
  ctx.publish(sessionId, {
    type: 'message_complete',
    message_id: row!.id,
    chat_id: chatId,
    metadata,
  });
 }
--- a/apps/server/src/services/inference/sentinels.ts
+++ b/apps/server/src/services/inference/sentinels.ts
@@ -0,0 +1,53 @@
 import type { Message, ToolCall } from '../../types/api.js';
 // v1.11.6: doom-loop guard. When the model calls the same tool with the
 // same arguments DOOM_LOOP_THRESHOLD times in a row within one user-message
 // turn, abort the recursion and run the same wrap-up summary path as the
 // cap-hit case. Ported from opencode (DOOM_LOOP_THRESHOLD in
 // session/processor.ts). Threshold of 3 is the smallest value that doesn't
 // false-positive on a model that retries once after a transient error.
 export const DOOM_LOOP_THRESHOLD = 3;
 // Returns the name + args of the looping tool when the LAST
 // DOOM_LOOP_THRESHOLD entries in `recentToolCalls` are identical (same name
 // AND same JSON.stringify(args) output, so key order matters). Else null.
 // Pure; exported for unit-test access.
 export function detectDoomLoop(
  recentToolCalls: ToolCall[],
 ): { name: string; args: Record<string, unknown> } | null {
  if (recentToolCalls.length < DOOM_LOOP_THRESHOLD) return null;
  const last = recentToolCalls.slice(-DOOM_LOOP_THRESHOLD);
  const ref = last[0]!;
  const refArgs = JSON.stringify(ref.args);
  for (let i = 1; i < last.length; i++) {
    const tc = last[i]!;
    if (tc.name !== ref.name) return null;
    if (JSON.stringify(tc.args) !== refArgs) return null;
  }
  return { name: ref.name, args: ref.args };
 }
 export function isCapHitSentinel(m: Message): boolean {
  return (
    m.role === 'system' &&
    m.metadata !== null &&
    typeof m.metadata === 'object' &&
    (m.metadata as { kind?: unknown }).kind === 'cap_hit'
  );
 }
 // v1.11.6: parallel predicate. Same UI-only semantics as cap-hit sentinels —
 // never sent to the LLM (filtered by buildMessagesPayload through the
 // isAnySentinel check below).
 export function isDoomLoopSentinel(m: Message): boolean {
  return (
    m.role === 'system' &&
    m.metadata !== null &&
    typeof m.metadata === 'object' &&
    (m.metadata as { kind?: unknown }).kind === 'doom_loop'
  );
 }
 export function isAnySentinel(m: Message): boolean {
  return isCapHitSentinel(m) || isDoomLoopSentinel(m);
 }
--- a/apps/server/src/services/inference/stream-phase.ts
+++ b/apps/server/src/services/inference/stream-phase.ts
@@ -0,0 +1,464 @@
 import type {
  Agent,
  Session,
  ToolCall,
 } from '../../types/api.js';
 import * as modelContext from '../model-context.js';
 import { toolJsonSchemas, type ToolJsonSchema } from '../tools.js';
 import type { OpenAiMessage } from './payload.js';
 // v1.13.16: extractToolCallBlocks replaces the inline opener-search loop and
 // recognizes both Qwen <tool_call> and Anthropic <invoke> markup in one pass.
 import { extractToolCallBlocks } from './xml-parser.js';
 import { DB_FLUSH_INTERVAL_MS, type StreamPhaseState } from './types.js';
 import type {
  InferenceContext,
  StreamResult,
  TurnArgs,
 } from './turn.js';
 import { upstreamModel } from './provider.js';
 import {
  jsonSchema,
  streamText,
  tool,
  type JSONValue,
  type ModelMessage,
  type ToolCallRepairFunction,
 } from 'ai';
 interface StreamOptions {
  // null = omit tools entirely (compact phase); [] = caller stripped all tools
  // (rare; we still omit from the request body to avoid OpenAI 400).
  tools: ToolJsonSchema[] | null;
  temperature?: number;
 }
 // v1.13.1-A: convert BooCode's OpenAI-shaped history into AI SDK
 // ModelMessage[]. Tool result messages need a `toolName` field that the
 // OpenAI shape doesn't carry; we look it up by scanning earlier assistant
 // `tool_calls` entries for a matching id.
 function toModelMessages(messages: OpenAiMessage[]): ModelMessage[] {
  const toolNameById = new Map<string, string>();
  for (const m of messages) {
    if (m.role === 'assistant' && m.tool_calls) {
      for (const tc of m.tool_calls) {
        toolNameById.set(tc.id, tc.function.name);
      }
    }
  }
  const out: ModelMessage[] = [];
  for (const m of messages) {
    if (m.role === 'system' || m.role === 'user') {
      out.push({ role: m.role, content: m.content ?? '' });
      continue;
    }
    if (m.role === 'assistant') {
      const hasTools = m.tool_calls && m.tool_calls.length > 0;
      const hasReasoning = typeof m.reasoning === 'string' && m.reasoning.length > 0;
      if (!hasTools && !hasReasoning) {
        // Bare text assistant (string content). null content + no tool_calls
        // is degenerate but harmless to forward.
        out.push({ role: 'assistant', content: m.content ?? '' });
        continue;
      }
      // v1.13.1-C: AI SDK ReasoningPart precedes text + tool-calls in the
      // assistant content array. Reasoning models (qwen3.6) consume their
      // prior reasoning context to resume mid-thought across tool boundaries.
      const parts: Array<
        | { type: 'reasoning'; text: string }
        | { type: 'text'; text: string }
        | { type: 'tool-call'; toolCallId: string; toolName: string; input: unknown }
      > = [];
      if (hasReasoning) {
        parts.push({ type: 'reasoning', text: m.reasoning! });
      }
      if (m.content && m.content.length > 0) {
        parts.push({ type: 'text', text: m.content });
      }
      for (const tc of m.tool_calls ?? []) {
        let input: unknown = {};
        try {
          input = tc.function.arguments.length > 0 ? JSON.parse(tc.function.arguments) : {};
        } catch {
          // Malformed args from a prior turn: pass through as a raw blob so
          // the model sees the same shape it emitted. Wraps the string under
          // _raw to match the buildMessagesPayload upstream convention.
          input = { _raw: tc.function.arguments };
        }
        parts.push({ type: 'tool-call', toolCallId: tc.id, toolName: tc.function.name, input });
      }
      out.push({ role: 'assistant', content: parts });
      continue;
    }
    if (m.role === 'tool') {
      const toolCallId = m.tool_call_id ?? '';
      const toolName = toolNameById.get(toolCallId) ?? 'unknown';
      const raw = m.content ?? '';
      let output: { type: 'text'; value: string } | { type: 'json'; value: JSONValue };
      try {
        // JSON.parse returns `any`; the cast to JSONValue is safe because
        // every value JSON.parse can produce is a valid JSON value.
        output = { type: 'json', value: JSON.parse(raw) as JSONValue };
      } catch {
        output = { type: 'text', value: raw };
      }
      out.push({
        role: 'tool',
        content: [{ type: 'tool-result', toolCallId, toolName, output }],
      });
      continue;
    }
  }
  return out;
 }
 // Build the AI SDK tools record from BooCode's JSON-schema tool definitions.
 // No `execute` field: BooCode runs tools itself in tool-phase.ts; streamText
 // surfaces the tool-call parts via fullStream and we capture them for the
 // outer loop to dispatch.
 function buildAiTools(schemas: ToolJsonSchema[]): Record<string, ReturnType<typeof tool>> {
  const out: Record<string, ReturnType<typeof tool>> = {};
  for (const s of schemas) {
    out[s.function.name] = tool({
      description: s.function.description,
      inputSchema: jsonSchema(s.function.parameters),
    });
  }
  return out;
 }
 // v1.10.5 Qwen-coder XML fallback. Some local models (notably qwen3-coder via
 // llama-swap) emit tool calls as inline XML inside delta.content rather than
 // the structured tool_calls field. We extract them out of the streamed text
 // before flushing it to the client.
 //
 // Qwen shape:
 //   <tool_call>
 //   <function=NAME>
 //   <parameter=KEY>VALUE</parameter>
 //   ...
 //   </function>
 //   </tool_call>
 //
 // v1.13.16: also recognize Anthropic <invoke> markup that qwen3.6-35b-a3b-mxfp4
 // drifts to (training-data residue from Claude Code documentation):
 //   <invoke name="NAME">
 //   <parameter name="KEY">VALUE</parameter>
 //   </invoke>
 // Both formats share the synthetic xml_call_${idx} ID space; idx is the
 // length of the toolCalls list at push time, whichever opener appears first.
 // Multiple blocks may appear back-to-back in either format; they never nest.
 export async function streamCompletion(
  ctx: InferenceContext,
  model: string,
  messages: OpenAiMessage[],
  opts: StreamOptions,
  onDelta: (content: string) => void,
  onUsage: ((prompt: number | null, completion: number | null) => void) | undefined,
  signal?: AbortSignal
 ): Promise<StreamResult> {
  const aiMessages = toModelMessages(messages);
  const hasTools = opts.tools !== null && opts.tools.length > 0;
  const aiTools = hasTools ? buildAiTools(opts.tools!) : undefined;
  const startedAt = Date.now();
  // v1.13.1-C: accumulate reasoning text across reasoning-delta parts.
  // qwen3.6 emits these on a separate channel from text content; we capture
  // them per stream so finalizeCompletion can persist a 'reasoning' part.
  // Replaces the v1.13.1-A counter-only diagnostic.
  let reasoningAccumulated = '';
  // v1.13.3: experimental_repairToolCall keeps the stream alive when the
  // model emits a malformed tool call (bad JSON args, unknown name, etc.).
  // Without a repair function streamText throws and the WHOLE stream dies;
  // with one, the SDK invokes us and we route the bad call through normally.
  // Strategy: pass through unmodified. executeToolPhase's existing error
  // path (unknown tool name → "unknown tool: X" result; zod-reject → tool
  // 'X' rejected — fieldname: required) already gives the model a clean
  // recovery surface on the next turn. Logging gives us visibility into
  // how often qwen3.6 actually emits broken calls.
  const repairToolCall: ToolCallRepairFunction<NonNullable<typeof aiTools>> = async ({
    toolCall,
    error,
  }) => {
    ctx.log.warn(
      {
        toolCallId: toolCall.toolCallId,
        toolName: toolCall.toolName,
        error: error.message,
      },
      'malformed tool call surfaced via repairToolCall',
    );
    return toolCall;
  };
  const result = streamText({
    model: upstreamModel(ctx.config.LLAMA_SWAP_URL, model),
    messages: aiMessages,
    ...(aiTools
      ? { tools: aiTools, toolChoice: 'auto' as const, experimental_repairToolCall: repairToolCall }
      : {}),
    ...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}),
    abortSignal: signal,
  });
  let content = '';
  let pendingBuffer = '';
  let finishReason: string | null = null;
  // v1.13.1-A: AI SDK emits one `tool-call` part per fully-aggregated call,
  // so we no longer need the OpenAI-index reassembly map the manual SSE
  // parser used. XML tool calls extracted from text content go into the
  // same flat list and keep the v1.10.5 synthetic id convention.
  const toolCalls: ToolCall[] = [];
  for await (const part of result.fullStream) {
    switch (part.type) {
      case 'text-delta': {
        pendingBuffer += part.text;
        // v1.13.16: unified extraction. The helper finds the earliest-opening
        // complete <tool_call> or <invoke> block, flushes prose between/around
        // them, holds any partial opener for the next chunk, and silently
        // drops blocks that fail to parse (matches pre-v1.13.16 behavior).
        const extracted = extractToolCallBlocks(pendingBuffer);
        if (extracted.flushed.length > 0) {
          content += extracted.flushed;
          onDelta(extracted.flushed);
        }
        for (const call of extracted.calls) {
          const synthIdx = toolCalls.length;
          toolCalls.push({
            id: `xml_call_${synthIdx}`,
            name: call.name,
            args: call.args,
          });
        }
        pendingBuffer = extracted.remaining;
        break;
      }
      case 'tool-call': {
        // AI SDK has already parsed the input into an object. Match the
        // ToolCall shape BooCode passes around in toolCallsBuffer downstream.
        toolCalls.push({
          id: part.toolCallId,
          name: part.toolName,
          args: (part.input ?? {}) as Record<string, unknown>,
        });
        break;
      }
      case 'reasoning-delta': {
        // v1.13.1-C: accumulate; finalizeCompletion / executeToolPhase
        // persist the resulting text as a kind='reasoning' part.
        if (typeof part.text === 'string') {
          reasoningAccumulated += part.text;
        }
        break;
      }
      case 'finish': {
        if (typeof part.finishReason === 'string') {
          finishReason = part.finishReason;
        }
        break;
      }
      case 'error': {
        const err = part.error;
        throw err instanceof Error ? err : new Error(String(err));
      }
      // Intentional no-op: start, start-step, text-start, text-end,
      // reasoning-start, reasoning-end, source, file, tool-input-start,
      // tool-input-delta, tool-input-end, tool-result, tool-error,
      // finish-step, raw. We only care about the aggregated tool-call and
      // text-delta paths above; the rest are AI SDK lifecycle/streaming
      // breadcrumbs that don't change BooCode's persistence or WS contract.
      default:
        break;
    }
  }
  // v1.13.1-A: drain any buffered partial XML opener as plain text. The
  // pre-AI-SDK path did this on stream end too — better to leak `<tool_c`
  // than vanish the text.
  if (pendingBuffer.length > 0) {
    content += pendingBuffer;
    onDelta(pendingBuffer);
    pendingBuffer = '';
  }
  // AI SDK v6 fullStream returns normally on abort; check signal explicitly.
  // Without this throw the row would land as status='complete' with partial
  // content instead of going through handleAbortOrError → status='cancelled'.
  // Smoke D caught this in v1.13.1-A — don't refactor it away.
  if (signal?.aborted) {
    const abortErr = new Error('aborted');
    abortErr.name = 'AbortError';
    throw abortErr;
  }
  // Usage lands as a promise on the result; awaiting after fullStream is
  // drained is safe. AI SDK v6 names: `inputTokens` / `outputTokens`.
  let promptTokens: number | null = null;
  let completionTokens: number | null = null;
  try {
    const usage = await result.usage;
    if (typeof usage.inputTokens === 'number') promptTokens = usage.inputTokens;
    if (typeof usage.outputTokens === 'number') completionTokens = usage.outputTokens;
  } catch {
    // Some providers omit usage on partial streams; leave both null.
  }
  if (onUsage && (promptTokens !== null || completionTokens !== null)) {
    onUsage(promptTokens, completionTokens);
  }
  if (reasoningAccumulated.length > 0) {
    ctx.log.debug(
      { reasoningChars: reasoningAccumulated.length, model, elapsed_ms: Date.now() - startedAt },
      'streamCompletion: captured reasoning',
    );
  }
  return {
    finishReason,
    content,
    toolCalls,
    promptTokens,
    completionTokens,
    reasoning: reasoningAccumulated,
  };
 }
 export async function executeStreamPhase(
  ctx: InferenceContext,
  args: TurnArgs,
  session: Session,
  messages: OpenAiMessage[],
  state: StreamPhaseState,
  agent: Agent | null,
  // v1.11.8: when false, web_search and web_fetch are stripped from the
  // tool list sent to the LLM, so the model can't even attempt them.
  webToolsEnabled: boolean,
 ): Promise<StreamResult> {
  const { sessionId, chatId, assistantMessageId, signal } = args;
  const startedRow = await ctx.sql<{ started_at: string }[]>`
    UPDATE messages
    SET started_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
    RETURNING started_at
  `;
  state.startedAt = startedRow[0]?.started_at ?? null;
  ctx.publish(sessionId, {
    type: 'message_started',
    message_id: assistantMessageId,
    chat_id: chatId,
    role: 'assistant',
  });
  let pendingFlushTimer: NodeJS.Timeout | null = null;
  let flushPromise: Promise<unknown> = Promise.resolve();
  const flushNow = () => {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    const snapshot = state.accumulated;
    flushPromise = flushPromise.then(() =>
      ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
    );
  };
  const scheduleFlush = () => {
    if (pendingFlushTimer) return;
    pendingFlushTimer = setTimeout(() => {
      pendingFlushTimer = null;
      flushNow();
    }, DB_FLUSH_INTERVAL_MS);
  };
  // Tool whitelist: if an agent is set, filter the global tool list to only the
  // tool names it allows. Unknown names in agent.tools are dropped silently
  // (handled here by intersection). When no agent: send all tools.
  // v1.11.8: a second filter strips web_search + web_fetch unless the chat
  // has them explicitly enabled. Counts as an opt-in security boundary: the
  // model can't summon a tool that wasn't offered to it.
  const WEB_TOOL_NAMES: ReadonlySet<string> = new Set(['web_search', 'web_fetch']);
  const effectiveTools: ToolJsonSchema[] = (agent
    ? toolJsonSchemas().filter((t) => agent.tools.includes(t.function.name))
    : toolJsonSchemas()
  ).filter((t) => webToolsEnabled || !WEB_TOOL_NAMES.has(t.function.name));
  const effectiveTemperature = agent?.temperature;
  // v1.12.2: ctx_max lookup is cached after the first hit per model, so this
  // is a Map probe in steady state. We capture nCtx once at the top of the
  // stream so the throttled usage publish doesn't refetch each tick.
  const mctxForStream = await modelContext.getModelContext(session.model);
  const nCtxForStream = mctxForStream?.n_ctx ?? null;
  // v1.12.2 → v1.13.1-A: live usage publishes were throttled to ~500ms when
  // the manual SSE parser saw `parsed.usage` per chunk. AI SDK v6 surfaces
  // usage only at stream end (result.usage promise), so the throttle is
  // effectively a single trailing publish. ChatThroughput will tick once at
  // stream completion rather than mid-stream — known regression vs v1.12.2,
  // recovered if a future dispatch interpolates from delta cadence.
  const USAGE_THROTTLE_MS = 500;
  let lastUsageAt = 0;
  let pendingUsage: { p: number | null; c: number | null } | null = null;
  let usageTimer: NodeJS.Timeout | null = null;
  const flushUsage = () => {
    if (!pendingUsage) return;
    const { p, c } = pendingUsage;
    pendingUsage = null;
    lastUsageAt = Date.now();
    ctx.publish(sessionId, {
      type: 'usage',
      message_id: assistantMessageId,
      chat_id: chatId,
      completion_tokens: c,
      ctx_used: p,
      ctx_max: nCtxForStream,
    });
  };
  try {
    return await streamCompletion(
      ctx,
      session.model,
      messages,
      { tools: effectiveTools, temperature: effectiveTemperature },
      (delta) => {
        state.accumulated += delta;
        ctx.publish(sessionId, {
          type: 'delta',
          message_id: assistantMessageId,
          chat_id: chatId,
          content: delta,
        });
        ctx.log.debug({ sessionId, delta }, 'inference delta');
        scheduleFlush();
      },
      (prompt, completion) => {
        pendingUsage = { p: prompt, c: completion };
        const elapsed = Date.now() - lastUsageAt;
        if (elapsed >= USAGE_THROTTLE_MS) {
          flushUsage();
        } else if (!usageTimer) {
          usageTimer = setTimeout(() => {
            usageTimer = null;
            flushUsage();
          }, USAGE_THROTTLE_MS - elapsed);
        }
      },
      signal
    );
  } finally {
    if (pendingFlushTimer) {
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    if (usageTimer) {
      clearTimeout(usageTimer);
      usageTimer = null;
    }
    await flushPromise;
  }
 }
--- a/apps/server/src/services/inference/tool-phase.ts
+++ b/apps/server/src/services/inference/tool-phase.ts
@@ -0,0 +1,357 @@
 import type { Session, ToolCall } from '../../types/api.js';
 import * as modelContext from '../model-context.js';
 import { PathScopeError } from '../path_guard.js';
 import { TOOLS_BY_NAME } from '../tools.js';
 import { maybeFlagForCompaction } from './payload.js';
 import { insertParts, partsFromAssistantMessage, partsFromToolMessage } from './parts.js';
 // v1.13.16: richer unknown-tool error so the model can self-correct when it
 // drifts to a Claude Code tool name (e.g. read_file → suggest view_file).
 // Applies to all unknown tool names, not just <invoke>-derived ones — at the
 // dispatch layer we no longer know which format produced the call, and the
 // extra signal is harmless for Qwen-derived calls.
 import { formatUnknownToolError } from './tool-suggestions.js';
 // v1.13.17-cross-repo-reads: pre-prompt validation for request_read_access.
 // Resolves the grant root before pausing the loop so the user is never
 // prompted about paths we couldn't grant anyway (e.g. /etc/passwd).
 import { resolveGrantRoot } from '../grant_resolver.js';
 import type {
  InferenceContext,
  StreamResult,
  TurnArgs,
 } from './turn.js';
 // v1.12.4: ESM value-import cycle. executeToolPhase recurses into
 // runAssistantTurn, which lives in turn.ts. The cycle is safe because
 // the reference is read at call time (inside an async function body), not
 // at module top-level. Node + tsc resolve this cleanly.
 import { runAssistantTurn } from './turn.js';
 // v1.13.13: synthesis pipeline — replaces the immediate recursive turn when
 // any of this batch's tool calls is in SYNTHESIS_TOOLS. Falls through to
 // recursion on synthesis failure (timeout / model error). See module header
 // in synthesisPipeline.ts for the auto-fetch + token-budget rules.
 import { SYNTHESIS_TOOLS, runSynthesisPass } from '../synthesisPipeline.js';
 async function executeToolCall(
  projectRoot: string,
  toolCall: ToolCall,
  extraRoots: readonly string[],
 ): Promise<{ output: unknown; truncated: boolean; error?: string }> {
  const tool = TOOLS_BY_NAME[toolCall.name];
  if (!tool) {
    return {
      output: null,
      truncated: false,
      error: formatUnknownToolError(toolCall.name, Object.keys(TOOLS_BY_NAME)),
    };
  }
  const parsed = tool.inputSchema.safeParse(toolCall.args);
  if (!parsed.success) {
    // v1.12 Track B.2: enrich the zod-reject path so the model sees a
    // one-line, tool-named hint ("tool 'search_symbols' rejected — query:
    // Required") instead of a JSON blob of flatten output. Higher recovery
    // rate on the next turn; doom-loop guard still bounds infinite retries.
    // The cast is because tool.inputSchema is ZodType<unknown>, so zod can't
    // statically narrow flatten()'s fieldErrors key set — but the runtime
    // shape is the standard { formErrors: string[]; fieldErrors: Record<...> }.
    const flatten = parsed.error.flatten() as {
      formErrors: string[];
      fieldErrors: Record<string, string[] | undefined>;
    };
    const fieldErrors = Object.entries(flatten.fieldErrors)
      .map(([field, errs]) => `${field}: ${errs?.[0] ?? 'invalid'}`)
      .join('; ');
    const formError = flatten.formErrors[0];
    const hint = fieldErrors || formError || 'unknown validation error';
    return {
      output: null,
      truncated: false,
      error: `tool '${toolCall.name}' rejected — ${hint}`,
    };
  }
  try {
    const output = await tool.execute(parsed.data, projectRoot, extraRoots);
    const truncated =
      typeof output === 'object' && output !== null && 'truncated' in output
        ? Boolean((output as { truncated: unknown }).truncated)
        : false;
    return { output, truncated };
  } catch (err) {
    if (err instanceof PathScopeError) {
      return { output: null, truncated: false, error: err.message };
    }
    return {
      output: null,
      truncated: false,
      error: err instanceof Error ? err.message : String(err),
    };
  }
 }
 export async function executeToolPhase(
  ctx: InferenceContext,
  args: TurnArgs,
  result: StreamResult,
  startedAt: string | null,
  session: Session,
  projectRoot: string
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId, toolsUsed, signal } = args;
  const { content, toolCalls, promptTokens, completionTokens } = result;
  // v1.11.3: ctx_max comes from llama-swap /upstream/<model>/props, not the
  // streaming completion (which doesn't emit n_ctx). getModelContext caches
  // the positive lookup for the process lifetime, so this is a single Map
  // hit after the first invocation per model.
  const mctx = await modelContext.getModelContext(session.model);
  const nCtx = mctx?.n_ctx ?? null;
  const [updated] = await ctx.sql<
    { tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
  >`
    UPDATE messages
    SET content = ${content},
        status = 'complete',
        tokens_used = ${completionTokens},
        ctx_used = ${promptTokens},
        ctx_max = ${nCtx},
        finished_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
    RETURNING tokens_used, ctx_used, ctx_max, finished_at
  `;
  // v1.13.20: message_parts is the sole source of truth for tool_calls.
  // Legacy messages.tool_calls column was dropped; reads route through the
  // messages_with_parts view.
  // v1.13.1-C: include result.reasoning so models with separate reasoning
  // channels (qwen3.6) get a kind='reasoning' part at sequence 0.
  await insertParts(
    ctx.sql,
    partsFromAssistantMessage({
      content,
      tool_calls: toolCalls,
      reasoning: result.reasoning,
    }).map((p) => ({
      ...p,
      message_id: assistantMessageId,
    })),
  );
  // v1.11: flag for compaction if this turn pushed us over the usable budget.
  // We never compact mid-loop (the recursive runAssistantTurn keeps tools
  // flowing); the flag fires on the NEXT turn's pre-fetch hook above.
  await maybeFlagForCompaction(ctx, chatId, updated);
  const [toolSessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
    UPDATE sessions SET updated_at = clock_timestamp()
    WHERE id = ${sessionId}
    RETURNING project_id, name, updated_at
  `;
  ctx.publishUser({ type: 'session_updated', session_id: sessionId, project_id: toolSessRow!.project_id, name: toolSessRow!.name, updated_at: toolSessRow!.updated_at });
  for (const tc of toolCalls) {
    ctx.publish(sessionId, {
      type: 'tool_call',
      message_id: assistantMessageId,
      chat_id: chatId,
      tool_call: tc,
    });
  }
  ctx.publish(sessionId, {
    type: 'message_complete',
    message_id: assistantMessageId,
    chat_id: chatId,
    tokens_used: updated?.tokens_used ?? null,
    ctx_used: updated?.ctx_used ?? null,
    ctx_max: updated?.ctx_max ?? null,
    started_at: startedAt,
    finished_at: updated?.finished_at ?? null,
    model: session.model,
  });
  // Batch 9.7: ask_user_input pauses the loop. The tool row is still inserted
  // (the answer endpoint needs a target row to update), but its tool-result
  // part is pre-stamped with output=null as a "pending" sentinel and no
  // tool_result frame goes out; the card renders from the tool_call frame
  // alone. Mixed batches still execute the other tools normally.
  ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'tool_running', at: new Date().toISOString() });
  let pausingForUserInput = false;
  // v1.13.13: capture synth-tool outputs so the synthesis pipeline below
  // doesn't have to re-fetch from DB. Array (not single) because a batch
  // could theoretically include multiple synthesis tools; we take the first
  // successful entry, in completion order rather than call order. Race-free
  // under Promise.all because each callback pushes its own captured value.
  const synthEntries: Array<{ tc: ToolCall; output: unknown; error?: string }> = [];
  await Promise.all(
    toolCalls.map(async (tc) => {
      const [toolRow] = await ctx.sql<{ id: string }[]>`
        INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
        VALUES (${sessionId}, ${chatId}, 'tool', '', 'complete', clock_timestamp())
        RETURNING id
      `;
      const toolMessageId = toolRow!.id;
      if (tc.name === 'ask_user_input') {
        pausingForUserInput = true;
        const sentinel = { tool_call_id: tc.id, output: null, truncated: false };
        // v1.13.20: parts-only. The answer endpoint (messages.ts) later
        // deletes and re-inserts this part when the user submits their
        // answer.
        await insertParts(
          ctx.sql,
          partsFromToolMessage({ tool_results: sentinel }).map((p) => ({
            ...p,
            message_id: toolMessageId,
          })),
        );
        return;
      }
      // v1.13.17-cross-repo-reads: request_read_access pauses identically to
      // ask_user_input EXCEPT for an up-front validation pass — if the path
      // can't be granted under the whitelist / repo-shape rules, surface an
      // immediate denial without prompting the user. Per design D1, we never
      // ask the user about /etc/passwd or paths outside PROJECT_ROOT_WHITELIST.
      if (tc.name === 'request_read_access') {
        const tcArgs = tc.args as { path?: unknown; reason?: unknown };
        const requested =
          typeof tcArgs.path === 'string' ? tcArgs.path : '';
        const resolution = await resolveGrantRoot(
          ctx.sql,
          requested,
          projectRoot,
          ctx.config.PROJECT_ROOT_WHITELIST,
        );
        if (!resolution.ok) {
          // Auto-deny without pausing. The model sees the reason on its
          // next turn and decides what to do.
          const stored = {
            tool_call_id: tc.id,
            output: `denied: ${resolution.reason}`,
            truncated: false,
          };
          // v1.13.20: parts-only write.
          await insertParts(
            ctx.sql,
            partsFromToolMessage({ tool_results: stored }).map((p) => ({
              ...p,
              message_id: toolMessageId,
            })),
          );
          ctx.publish(sessionId, {
            type: 'tool_result',
            tool_message_id: toolMessageId,
            chat_id: chatId,
            tool_call_id: tc.id,
            output: stored.output,
            truncated: false,
          });
          return;
        }
        // Path is plausibly grantable — install the pending sentinel and
        // pause. The grant endpoint re-derives the root at decision time
        // (state may have changed in the meantime) so we don't stash it here.
        pausingForUserInput = true;
        const sentinel = { tool_call_id: tc.id, output: null, truncated: false };
        // v1.13.20: parts-only write.
        await insertParts(
          ctx.sql,
          partsFromToolMessage({ tool_results: sentinel }).map((p) => ({
            ...p,
            message_id: toolMessageId,
          })),
        );
        return;
      }
      const tres = await executeToolCall(projectRoot, tc, session.allowed_read_paths);
      if (SYNTHESIS_TOOLS.has(tc.name)) {
        synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) });
      }
      const stored = {
        tool_call_id: tc.id,
        output: tres.output,
        truncated: tres.truncated,
        ...(tres.error ? { error: tres.error } : {}),
      };
      // v1.13.20: parts-only write. Reads route through messages_with_parts.
      await insertParts(
        ctx.sql,
        partsFromToolMessage({ tool_results: stored }).map((p) => ({
          ...p,
          message_id: toolMessageId,
        })),
      );
      ctx.publish(sessionId, {
        type: 'tool_result',
        tool_message_id: toolMessageId,
        chat_id: chatId,
        tool_call_id: tc.id,
        output: tres.output,
        truncated: tres.truncated,
        ...(tres.error ? { error: tres.error } : {}),
      });
    })
  );
  if (pausingForUserInput) {
    ctx.publishUser({
      type: 'chat_status',
      chat_id: chatId,
      status: 'waiting_for_input',
      at: new Date().toISOString(),
    });
    ctx.log.info(
      { sessionId, chatId, assistantMessageId },
      'inference paused awaiting user input',
    );
    return;
  }
  // v1.13.13: synthesis-pipeline branch. When any of this batch's tool calls
  // is a codecontext overview/analysis tool that produced a non-error result,
  // run a forced second-inference synthesis pass with auto-fetched files +
  // project docs instead of the normal recursive runAssistantTurn. Falls
  // through to the recursive call on synthesis failure (timeout, model
  // error). User-abort re-throws so the outer handler runs.
  const synthEntry = synthEntries.find((e) => !e.error && e.output != null);
  if (synthEntry) {
    // codecontext wrappers return { result: string, truncated: boolean, ... }.
    // Defensive: stringify the output if it isn't the expected shape so the
    // synthesis still has something to chew on rather than crashing on
    // missing `.result`.
    const out = synthEntry.output as { result?: unknown; truncated?: boolean; outputPath?: string };
    const toolResultText =
      typeof out?.result === 'string'
        ? out.result
        : JSON.stringify(synthEntry.output);
    // v1.13.15-b: forward the wrapper's truncation flag + opaque tmpfs id so
    // synthesisPipeline can re-read the full content for reference extraction.
    const ran = await runSynthesisPass({
      ctx,
      args,
      session,
      projectRoot,
      toolName: synthEntry.tc.name,
      toolResultText,
      ...(typeof out?.truncated === 'boolean' ? { truncated: out.truncated } : {}),
      ...(typeof out?.outputPath === 'string' ? { outputPath: out.outputPath } : {}),
    });
    if (ran) return;
    // ran === false → synthesis failed (timeout / model error) → fall through
    // to the standard recursive turn below. The synth message (if created)
    // was already marked status='failed' inside runSynthesisPass.
  }
  const [nextAssistant] = await ctx.sql<{ id: string }[]>`
    INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
    VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
    RETURNING id
  `;
  await runAssistantTurn(ctx, {
    sessionId,
    chatId,
    assistantMessageId: nextAssistant!.id,
    // v1.8.2: charge this turn's actual tool invocations against the budget.
    // One assistant message can emit multiple tool_calls, so we add the run
    // count, not 1. The next turn's budget check sees the cumulative total.
    toolsUsed: toolsUsed + result.toolCalls.length,
    // v1.11.6: append the just-executed tool calls to the per-turn history
    // so the next runAssistantTurn's doom-loop check can see them. We don't
    // cap the array length here — per-turn budgets keep it bounded
    // (typically <30 entries), and slicing happens inside detectDoomLoop.
    recentToolCalls: [...args.recentToolCalls, ...result.toolCalls],
    signal,
  });
 }
--- a/apps/server/src/services/inference/tool-suggestions.ts
+++ b/apps/server/src/services/inference/tool-suggestions.ts
@@ -0,0 +1,63 @@
 // v1.13.16: Levenshtein + suggestion + formatter for the unknown-tool error
 // returned to the model when an XML-extracted tool call references a name
 // that isn't in TOOLS_BY_NAME. The drift incident this targets: qwen3.6
 // emitting <invoke name="read_file"> from its Claude Code training residue
 // when BooCode's actual file-read tool is view_file. Hand-rolled distance
 // function — no new dep.
 export function levenshtein(a: string, b: string): number {
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  const dp: number[][] = Array.from(
    { length: a.length + 1 },
    () => new Array<number>(b.length + 1).fill(0),
  );
  for (let i = 0; i <= a.length; i++) dp[i]![0] = i;
  for (let j = 0; j <= b.length; j++) dp[0]![j] = j;
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dp[i]![j] = Math.min(
        dp[i - 1]![j]! + 1,
        dp[i]![j - 1]! + 1,
        dp[i - 1]![j - 1]! + cost,
      );
    }
  }
  return dp[a.length]![b.length]!;
 }
 // Threshold per the v1.13.16 dispatch: distance <= 3 OR substring match
 // (either direction). Smallest distance wins; ties break alphabetically.
 export function suggestToolName(
  name: string,
  available: readonly string[],
 ): string | null {
  const lower = name.toLowerCase();
  let best: { name: string; dist: number } | null = null;
  for (const tool of available) {
    const tlower = tool.toLowerCase();
    const dist = levenshtein(lower, tlower);
    const isSubstr = tlower.includes(lower) || lower.includes(tlower);
    if (dist > 3 && !isSubstr) continue;
    if (
      best === null ||
      dist < best.dist ||
      (dist === best.dist && tool.localeCompare(best.name) < 0)
    ) {
      best = { name: tool, dist };
    }
  }
  return best?.name ?? null;
 }
 export function formatUnknownToolError(
  name: string,
  available: readonly string[],
 ): string {
  const sorted = [...available].sort();
  const suggestion = suggestToolName(name, sorted);
  const list = sorted.join(', ');
  const tail = suggestion ? ` Did you mean: ${suggestion}?` : '';
  return `Tool '${name}' not found. Available tools: [${list}].${tail}`;
 }
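Reviewer note on the suggestion threshold above. The following is a minimal standalone sketch, not the shipped module: `editDistance` and `suggest` are hypothetical restatements of `levenshtein` and `suggestToolName` (a two-row variant of the same recurrence), and the tool list is made up for illustration. It shows the three outcomes of the rule: a near-typo, a substring match, and a name that misses both tests.

```typescript
// Two-row variant of the recurrence used by levenshtein() above.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const cur: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur.push(Math.min(prev[j]! + 1, cur[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = cur;
  }
  return prev[b.length]!;
}

// Same acceptance rule as suggestToolName: distance <= 3 OR substring match.
function suggest(name: string, available: readonly string[]): string | null {
  const lower = name.toLowerCase();
  let best: { name: string; dist: number } | null = null;
  for (const tool of available) {
    const t = tool.toLowerCase();
    const dist = editDistance(lower, t);
    const isSubstr = t.includes(lower) || lower.includes(t);
    if (dist > 3 && !isSubstr) continue;
    if (
      best === null ||
      dist < best.dist ||
      (dist === best.dist && tool.localeCompare(best.name) < 0)
    ) {
      best = { name: tool, dist };
    }
  }
  return best?.name ?? null;
}

// Hypothetical tool list for illustration.
const tools = ['find_files', 'grep', 'list_dir', 'view_file'];
const typo = suggest('view_fil', tools);   // 'view_file' (distance 1)
const substring = suggest('file', tools);  // 'view_file' (substring, distance 5)
const drift = suggest('read_file', tools); // null (distance 4, no substring)
console.log({ typo, substring, drift });
```

Under these assumptions, `read_file` against `view_file` has edit distance 4, so the rule yields no "Did you mean" hint for that pair; the model would still see the full tool list from `formatUnknownToolError`.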
--- a/apps/server/src/services/inference/turn.ts
+++ b/apps/server/src/services/inference/turn.ts
@@ -0,0 +1,329 @@
 import type { FastifyBaseLogger } from 'fastify';
 import type { Sql } from '../../db.js';
 import type { Config } from '../../config.js';
 import type {
  Agent,
  ErrorReason,
  Message,
  MessageMetadata,
  Project,
  Session,
  ToolCall,
  UserStreamFrame,
 } from '../../types/api.js';
 import { ALL_TOOLS } from '../tools.js';
 import { resolveProjectRoot } from '../path_guard.js';
 import { maybeAutoNameChat } from '../auto_name.js';
 import { getAgentById } from '../agents.js';
 import * as compaction from '../compaction.js';
 import * as modelContext from '../model-context.js';
 import type { Broker } from '../broker.js';
 import { resolveToolBudget } from './budget.js';
 import {
  DOOM_LOOP_THRESHOLD,
  detectDoomLoop,
 } from './sentinels.js';
 import {
  buildMessagesPayload,
  loadContext,
 } from './payload.js';
 import {
  finalizeCompletion,
  handleAbortOrError,
 } from './error-handler.js';
 import {
  executeStreamPhase,
  streamCompletion,
 } from './stream-phase.js';
 import { executeToolPhase } from './tool-phase.js';
 import { DB_FLUSH_INTERVAL_MS, type StreamPhaseState } from './types.js';
 import {
  runCapHitSummary,
  runDoomLoopSummary,
 } from './sentinel-summaries.js';
 // v1.12.4: re-exported so external callers (tests, future consumers) keep
 // importing from services/inference.js as the public surface.
 export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js';
 export { buildMessagesPayload } from './payload.js';
 export interface InferenceFrame {
  type:
    | 'message_started'
    | 'delta'
    | 'tool_call'
    | 'tool_result'
    | 'message_complete'
    | 'usage'
    | 'messages_deleted'
    | 'session_renamed'
    | 'chat_renamed'
    | 'error';
  message_id?: string;
  message_ids?: string[];
  chat_id?: string;
  tool_message_id?: string;
  tool_call_id?: string;
  // v1.8.2: 'system' added so cap-hit sentinel messages can announce themselves
  // through the normal message_started → delta → message_complete sequence.
  role?: 'assistant' | 'tool' | 'user' | 'system';
  content?: string;
  tool_call?: ToolCall;
  output?: unknown;
  truncated?: boolean;
  error?: string;
  // v1.8.2: structured error reason. Set on `type: 'error'` so the UI can
  // surface a specific message; `error` stays the human-readable text.
  reason?: ErrorReason;
  // v1.8.2: piggybacks on `message_complete` so static or terminally-resolved
  // messages can carry their persisted metadata to the live stream without a
  // refetch (sentinels carry { kind: 'cap_hit', ... }; failed messages carry
  // { kind: 'error', ... }).
  metadata?: MessageMetadata | null;
  tokens_used?: number | null;
  ctx_used?: number | null;
  ctx_max?: number | null;
  completion_tokens?: number | null;
  started_at?: string | null;
  finished_at?: string | null;
  model?: string;
  session_id?: string;
  name?: string;
 }
 export type FramePublisher = (sessionId: string, frame: InferenceFrame) => void;
 export interface InferenceContext {
  sql: Sql;
  config: Config;
  log: FastifyBaseLogger;
  publish: FramePublisher;
  publishUser: (frame: UserStreamFrame) => void;
  // v1.11: passed through so compaction.process can publish 'compacted'
  // frames on the same session WS channel useSessionStream subscribes to.
  // Compaction is the only path that needs the raw broker handle (regular
  // inference goes through `publish`); keeping a separate field avoids
  // tempting other code paths into bypassing the session-id binding.
  broker: Broker;
 }
 // v1.12.4: payload assembly extracted to ./payload.ts (tests import
 // buildMessagesPayload from this module, so the re-export above
 // preserves the public surface). Stream + tool phases extracted to
 // ./stream-phase.ts and ./tool-phase.ts.
 export interface StreamResult {
  finishReason: string | null;
  content: string;
  toolCalls: ToolCall[];
  promptTokens: number | null;
  completionTokens: number | null;
  // v1.13.1-C: reasoning text accumulated across reasoning-delta parts.
  // Empty string when the model doesn't emit reasoning (most cases).
  reasoning: string;
 }
 export interface TurnArgs {
  sessionId: string;
  chatId: string;
  assistantMessageId: string;
  // v1.8.2: cumulative tool calls executed this run. Compared against the
  // resolved budget at the top of each turn. Replaces the older `depth`
  // counter (which counted iterations, not invocations).
  toolsUsed: number;
  // v1.11.6: ordered tool calls executed in this user-message turn (across
  // recursive runAssistantTurn invocations). Reset to [] at user-message
  // boundaries by runInference, same as toolsUsed. Doom-loop check at the
  // top of runAssistantTurn slices the last DOOM_LOOP_THRESHOLD entries.
  recentToolCalls: ToolCall[];
  signal: AbortSignal | undefined;
 }
 export async function runAssistantTurn(
  ctx: InferenceContext,
  args: TurnArgs,
 ): Promise<void> {
  const { sessionId, chatId } = args;
  // v1.11: if the prior turn flagged this chat for compaction, run it first
  // so loadContext below reads the post-compaction history. We swallow
  // compaction failures (clearing the flag so we don't loop) and proceed
  // with the un-compacted history — a slow turn that hits the model's
  // hard limit is recoverable; a dead session is not.
  const chatFlag = await ctx.sql<{ needs_compaction: boolean }[]>`
    SELECT needs_compaction FROM chats WHERE id = ${chatId}
  `;
  if (chatFlag[0]?.needs_compaction) {
    try {
      await compaction.process({
        sql: ctx.sql,
        config: ctx.config,
        log: ctx.log,
        broker: ctx.broker,
        chatId,
      });
    } catch (err) {
      ctx.log.warn({ err, chatId }, 'auto-compaction failed; clearing flag and proceeding');
      await ctx.sql`UPDATE chats SET needs_compaction = false WHERE id = ${chatId}`;
    }
  }
  const loaded = await loadContext(ctx.sql, sessionId, chatId);
  if (!loaded) {
    ctx.log.warn({ sessionId }, 'inference: session or project missing');
    return;
  }
  const { session, project, history } = loaded;
  const projectRoot = await resolveProjectRoot(project.path);
  // Agent resolution is per-turn so PATCH agent_id mid-conversation takes
  // effect on the next message. Unknown agent_id returns null silently —
  // session falls back to base prompt + all tools + default temperature.
  const agent = session.agent_id
    ? await getAgentById(project.path, session.agent_id)
    : null;
  // v1.8.2: cap-hit replaces the older "tool loop depth exceeded" failure.
  // When we've already burned the budget *before* this turn even runs, we
  // skip straight to the summary flow — the in-flight assistant message slot
  // gets reused for the wrap-up reply instead of being marked failed.
  const budget = resolveToolBudget(agent);
  if (args.toolsUsed >= budget) {
    await runCapHitSummary(ctx, args, session, project, history, agent, budget);
    return;
  }
 // v1.11.6: doom-loop guard. Checked after the budget cap above, but it
 // usually trips first (the model can burn through 3 identical calls long
 // before the 15-call budget fires). Same in-flight-slot-reuse pattern as
 // runCapHitSummary: the wrap-up reply lands in args.assistantMessageId,
 // then a doom_loop sentinel is inserted to make the abort visible in the
 // chat history.
  const loop = detectDoomLoop(args.recentToolCalls);
  if (loop) {
    await runDoomLoopSummary(ctx, args, session, project, history, agent, loop);
    return;
  }
  const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
  // v1.11.8: resolve per-chat web-tools opt-in. Tri-state on the wire:
  //   - session.web_search_enabled = null → inherit project default
  //   - session.web_search_enabled = true/false → explicit
  // Both web_search and web_fetch are gated by this single flag (the UI
  // label is "Enable web search and fetch" — same store, both tools).
  // Default is false unless explicitly opted in, matching the v1.9
  // plumbing intent ("inert until Batch 8 ships the actual tools").
  const webToolsEnabled =
    session.web_search_enabled ?? project.default_web_search_enabled ?? false;
  const state: StreamPhaseState = { accumulated: '', startedAt: null };
  let result: StreamResult;
  try {
    result = await executeStreamPhase(ctx, args, session, messages, state, agent, webToolsEnabled);
  } catch (err) {
    await handleAbortOrError(ctx, args, state.accumulated, err);
    return;
  }
  if (result.toolCalls.length > 0) {
    await executeToolPhase(ctx, args, result, state.startedAt, session, projectRoot);
    return;
  }
  await finalizeCompletion(ctx, args, result, state.startedAt, session);
 }
 export async function runInference(
  ctx: InferenceContext,
  sessionId: string,
  chatId: string,
  assistantMessageId: string,
  signal?: AbortSignal
 ): Promise<void> {
  // v1.8.2: every fresh inference (initial send, regenerate, force_send,
  // continue) starts with a clean budget. Tool-call accumulation across
  // Continue invocations is what the hard ceiling guards against, not the
  // per-call budget.
  // v1.11.6: recentToolCalls also resets — doom-loop detection is scoped
  // to a single user-message turn, so a Continue starts with no history.
  return runAssistantTurn(ctx, {
    sessionId,
    chatId,
    assistantMessageId,
    toolsUsed: 0,
    recentToolCalls: [],
    signal,
  });
 }
 // v1.8.2: cap-hit summary flow (runCapHitSummary, imported above from
 // ./sentinel-summaries.js). Called instead of erroring when the loop
 // hits its budget. Reuses the in-flight assistant message slot to stream a
 // short wrap-up reply with the synthetic note prepended and tools disabled,
 // then always inserts a cap_hit sentinel afterward (regardless of summary
 // outcome) so the UI can show a Continue affordance.
 interface InferenceRegistration {
  controller: AbortController;
  completed: Promise<void>;
 }
 export function createInferenceRunner(
  ctx: Omit<InferenceContext, 'publishUser'>,
  publishUserFn: (user: string, frame: UserStreamFrame) => void
 ) {
  const registry = new Map<string, InferenceRegistration>();
  return {
    enqueue(sessionId: string, chatId: string, assistantMessageId: string, user: string) {
      const callCtx: InferenceContext = {
        ...ctx,
        publishUser: (frame) => publishUserFn(user, frame),
        // v1.11: broker comes in via ctx (set at registration time). Repeated
        // here so the destructure carries it onto the per-call ctx without
        // having to add it to every enqueue/cancel signature individually.
        broker: ctx.broker,
      };
      // v1.8 mobile-tabs: announce working before the async loop starts so
      // every device subscribed to the user channel sees the amber dot.
      callCtx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'streaming', at: new Date().toISOString() });
      const controller = new AbortController();
      let resolveCompleted!: () => void;
      const completed = new Promise<void>((res) => { resolveCompleted = res; });
      const registration: InferenceRegistration = { controller, completed };
      registry.set(chatId, registration);
      void (async () => {
        try {
          await runInference(callCtx, sessionId, chatId, assistantMessageId, controller.signal);
          setImmediate(() => {
            void maybeAutoNameChat(callCtx, chatId, sessionId).catch((err: Error) => {
              callCtx.log.warn({ err, chatId }, 'auto-name failed');
            });
          });
        } catch (err) {
          callCtx.log.error({ err }, 'unhandled inference error');
        } finally {
          resolveCompleted();
          // Only clear our own registration; a force-send may have replaced it.
          if (registry.get(chatId) === registration) {
            registry.delete(chatId);
          }
        }
      })();
    },
    async cancel(_sessionId: string, chatId: string): Promise<boolean> {
      const reg = registry.get(chatId);
      if (!reg) return false;
      reg.controller.abort();
      // Swallow — we just need to wait for the catch/finally to persist state.
      await reg.completed.catch(() => {});
      return true;
    },
    hasActive(chatId: string): boolean {
      return registry.has(chatId);
    },
  };
 }
 export const _toolNames = ALL_TOOLS.map((t) => t.name);
--- a/apps/server/src/services/inference/types.ts
+++ b/apps/server/src/services/inference/types.ts
@@ -0,0 +1,13 @@
 // v1.12.4: shared inter-phase types/constants for the extracted phase files.
 // Lives here so stream-phase, tool-phase, and the summary functions in
 // sentinel-summaries.ts can all reference the same definitions without
 // circular imports.
 export interface StreamPhaseState {
  accumulated: string;
  startedAt: string | null;
 }
 // 500ms keeps the DB UPDATE rate bounded under heavy streaming. Used by
 // executeStreamPhase, runCapHitSummary, and runDoomLoopSummary — every site
 // that does a debounced content flush during streaming.
 export const DB_FLUSH_INTERVAL_MS = 500;
--- a/apps/server/src/services/inference/xml-parser.ts
+++ b/apps/server/src/services/inference/xml-parser.ts
@@ -0,0 +1,169 @@
 // v1.10.5: XML-tag tool-call fallback. Some models emit
 // <tool_call><function=foo><parameter=key>value</parameter></function></tool_call>
 // in plain content instead of using the OpenAI tool_calls JSON channel.
 // The streaming loop in stream-phase.ts extracts these blocks via these helpers.
 //
 // v1.13.16: also recognize Anthropic <invoke name="..."><parameter name="...">
 // markup. qwen3.6-35b-a3b-mxfp4 drifts to this format when prompted as an
 // "Architect"-style agent because Claude Code documentation in its
 // pre-training data uses this shape. Both formats route through the same
 // synthetic ToolCall path with shared xml_call_${idx} IDs; downstream
 // dispatch handles unknown tool names with a richer error (see
 // tool-suggestions.ts + tool-phase.ts).
 export const XML_TOOL_OPEN = '<tool_call>';
 export const XML_TOOL_CLOSE = '</tool_call>';
 // v1.13.16: Anthropic <invoke> opener is matched by prefix (not the full
 // `<invoke ...>` tag) because attributes follow. Closer is the literal tag.
 export const INVOKE_TOOL_OPEN = '<invoke';
 export const INVOKE_TOOL_CLOSE = '</invoke>';
 export interface ParsedCall {
  name: string;
  args: Record<string, unknown>;
 }
 // v1.10.5: Qwen-flavor parser. Relaxed in v1.13.16 to tolerate whitespace
 // around `=` (e.g. `<function = view_file>`). Name capture is non-whitespace,
 // non-`>` so a stray space doesn't get absorbed into the function name.
 const QWEN_FUNCTION_RE = /<function\s*=\s*([^>\s]+)\s*>/;
 const QWEN_PARAM_RE = /<parameter\s*=\s*([^>\s]+)\s*>([\s\S]*?)<\/parameter>/g;
 export function parseXmlToolCall(block: string): ParsedCall | null {
  const nameMatch = block.match(QWEN_FUNCTION_RE);
  if (!nameMatch || !nameMatch[1]) return null;
  const name = nameMatch[1].trim();
  if (!name) return null;
  const args: Record<string, unknown> = {};
  for (const m of block.matchAll(QWEN_PARAM_RE)) {
    const key = (m[1] ?? '').trim();
    if (!key) continue;
    const raw = (m[2] ?? '').trim();
    try {
      args[key] = JSON.parse(raw);
    } catch {
      args[key] = raw;
    }
  }
  return { name, args };
 }
 // v1.13.16: Anthropic-flavor parser. Same JSON-parse-with-string-fallback
 // shape as parseXmlToolCall so the dispatch layer doesn't need to care which
 // flavor produced the call.
 const INVOKE_NAME_RE =
  /<invoke\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>/;
 const INVOKE_PARAM_RE =
  /<parameter\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>([\s\S]*?)<\/parameter>/g;
 export function parseInvokeToolCall(block: string): ParsedCall | null {
  const nameMatch = block.match(INVOKE_NAME_RE);
  if (!nameMatch) return null;
  const name = (nameMatch[2] ?? nameMatch[3] ?? '').trim();
  if (!name) return null;
  const args: Record<string, unknown> = {};
  for (const m of block.matchAll(INVOKE_PARAM_RE)) {
    const key = ((m[2] ?? m[3] ?? '') as string).trim();
    if (!key) continue;
    const raw = (m[4] ?? '').trim();
    try {
      args[key] = JSON.parse(raw);
    } catch {
      args[key] = raw;
    }
  }
  return { name, args };
 }
 // Locate the index in `s` where an unclosed or partially received opener
 // (either flavor) begins. Returns -1 when `s` can be flushed to the
 // client in full without risking a partial tag leak.
 //   Case 1: a full opener (`<tool_call>` or `<invoke`) with no matching
 //           closer — caller must keep everything from that index forward
 //           until the next chunk arrives with the closer.
 //   Case 2: `s` ends with a strict prefix of either opener (e.g. `<tool_c`
 //           or `<invo`). Caller must keep just that suffix in the buffer.
 // Note: case 1 assumes the calling loop already extracted every complete
 // block before reaching this check.
 const ALL_OPENERS = [XML_TOOL_OPEN, INVOKE_TOOL_OPEN] as const;
 export function partialXmlOpenerStart(s: string): number {
  let earliest = -1;
  for (const op of ALL_OPENERS) {
    const idx = s.indexOf(op);
    if (idx === -1) continue;
    if (earliest === -1 || idx < earliest) earliest = idx;
  }
  if (earliest !== -1) return earliest;
  const lastLt = s.lastIndexOf('<');
  if (lastLt === -1) return -1;
  const suffix = s.slice(lastLt);
  for (const op of ALL_OPENERS) {
    if (op.startsWith(suffix) && suffix.length < op.length) return lastLt;
  }
  return -1;
 }
 // v1.13.16: unified extraction. Replaces the inline loop that used to live
 // in stream-phase.ts. Pure function — returns the visible text to flush,
 // the parsed tool-call payloads in source order, and the buffer remainder
 // to retain for the next streaming chunk. Parse failures are silently
 // dropped (matches the pre-v1.13.16 behavior — leaking partial XML to the
 // chat looks worse than swallowing a bad block).
 export interface ToolCallExtraction {
  flushed: string;
  calls: ParsedCall[];
  remaining: string;
 }
 interface OpenerSpec {
  open: string;
  close: string;
  parse: (block: string) => ParsedCall | null;
 }
 const OPENER_SPECS: ReadonlyArray<OpenerSpec> = [
  { open: XML_TOOL_OPEN, close: XML_TOOL_CLOSE, parse: parseXmlToolCall },
  { open: INVOKE_TOOL_OPEN, close: INVOKE_TOOL_CLOSE, parse: parseInvokeToolCall },
 ];
 export function extractToolCallBlocks(buffer: string): ToolCallExtraction {
  let flushed = '';
  const calls: ParsedCall[] = [];
  let pos = 0;
  while (pos < buffer.length) {
    let next: { spec: OpenerSpec; openIdx: number; closeIdx: number } | null = null;
    for (const spec of OPENER_SPECS) {
      const openIdx = buffer.indexOf(spec.open, pos);
      if (openIdx === -1) continue;
      const closeIdx = buffer.indexOf(spec.close, openIdx);
      if (closeIdx === -1) continue;
      if (next === null || openIdx < next.openIdx) {
        next = { spec, openIdx, closeIdx };
      }
    }
    if (next === null) break;
    if (next.openIdx > pos) {
      flushed += buffer.slice(pos, next.openIdx);
    }
    const blockEnd = next.closeIdx + next.spec.close.length;
    const block = buffer.slice(next.openIdx, blockEnd);
    const parsed = next.spec.parse(block);
    if (parsed) calls.push(parsed);
    pos = blockEnd;
  }
  const tail = buffer.slice(pos);
  const partialIdx = partialXmlOpenerStart(tail);
  if (partialIdx === -1) {
    flushed += tail;
    return { flushed, calls, remaining: '' };
  }
  if (partialIdx > 0) {
    flushed += tail.slice(0, partialIdx);
  }
  return { flushed, calls, remaining: tail.slice(partialIdx) };
 }
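Reviewer note on the carry-over contract of `extractToolCallBlocks`. The sketch below is a simplified, single-flavor illustration (only `<tool_call>`, raw blocks instead of parsed calls); `holdBackIndex`, `extract`, and the chunk contents are hypothetical stand-ins, not the shipped helpers. It shows how a caller might feed `remaining` back into the next chunk so that neither a split opener nor an unclosed block leaks to the client.

```typescript
const OPEN = '<tool_call>';
const CLOSE = '</tool_call>';

// Index from which the tail must be held back, or -1 if it is safe to flush.
function holdBackIndex(s: string): number {
  const full = s.indexOf(OPEN);
  if (full !== -1) return full; // case 1: full opener without a closer
  const lastLt = s.lastIndexOf('<');
  if (lastLt === -1) return -1;
  const suffix = s.slice(lastLt);
  // case 2: the string ends with a strict prefix of the opener
  return OPEN.startsWith(suffix) && suffix.length < OPEN.length ? lastLt : -1;
}

function extract(buffer: string): { flushed: string; blocks: string[]; remaining: string } {
  let flushed = '';
  const blocks: string[] = [];
  let pos = 0;
  for (;;) {
    const openIdx = buffer.indexOf(OPEN, pos);
    if (openIdx === -1) break;
    const closeIdx = buffer.indexOf(CLOSE, openIdx);
    if (closeIdx === -1) break;
    flushed += buffer.slice(pos, openIdx);
    pos = closeIdx + CLOSE.length;
    blocks.push(buffer.slice(openIdx, pos));
  }
  const tail = buffer.slice(pos);
  const hold = holdBackIndex(tail);
  if (hold === -1) return { flushed: flushed + tail, blocks, remaining: '' };
  return { flushed: flushed + tail.slice(0, hold), blocks, remaining: tail.slice(hold) };
}

// A response split at awkward chunk boundaries (hypothetical content).
const chunks = [
  'Reading <tool_',
  'call><function=view_file><parameter=path>a.ts</para',
  'meter></function></tool_call> done',
];
let carry = '';
let visible = '';
const seen: string[] = [];
for (const chunk of chunks) {
  const step = extract(carry + chunk);
  visible += step.flushed; // safe to send to the client
  seen.push(...step.blocks); // complete blocks, in source order
  carry = step.remaining; // prepend to the next chunk
}
console.log({ visible, blocks: seen.length, carry });
```

In this run the first chunk flushes only `Reading `, the second flushes nothing because the block is still open, and the third yields one complete block plus the trailing ` done`.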
--- a/apps/server/src/services/path_guard.ts
+++ b/apps/server/src/services/path_guard.ts
@@ -16,9 +16,22 @@ export async function resolveProjectRoot(projectPath: string): Promise<string> {
  }
 }
 function isUnder(real: string, root: string): boolean {
  return real === root || real.startsWith(root + sep);
 }
 // v1.13.17-cross-repo-reads: pathGuard now accepts an optional extraRoots
 // list (typically session.allowed_read_paths). The primary projectRoot is
 // tried first; if the resolved path doesn't sit under it, each extraRoot is
 // tried in turn. Throws PathScopeError if no root accepts. The error message
 // includes a hint pointing the model at the request_read_access tool so it
 // can self-correct on the next turn — extraRoots IS the persistence
 // mechanism for those grants, so we only suggest it when there's a missing
 // grant to ask for (i.e. the path isn't already under any allowed root).
 export async function pathGuard(
  projectRoot: string,
-  requested: string
+  requested: string,
  extraRoots: readonly string[] = [],
 ): Promise<string> {
  if (typeof requested !== 'string' || requested.length === 0) {
    throw new PathScopeError('path is required');
@@ -30,10 +43,13 @@ export async function pathGuard(
  } catch {
    throw new PathScopeError(`path does not exist: ${requested}`);
  }
-  if (real !== projectRoot && !real.startsWith(projectRoot + sep)) {
+  if (isUnder(real, projectRoot)) return real;
-    throw new PathScopeError(
+  for (const extra of extraRoots) {
-      `path escapes project root: ${requested} -> ${real}`
+    if (extra.length === 0) continue;
-    );
+    if (isUnder(real, extra)) return real;
  }
-  return real;
+  throw new PathScopeError(
    `path escapes project root: ${requested} -> ${real}. ` +
      `Use request_read_access(path, reason) to ask the user for permission.`,
  );
 }
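Reviewer note on the multi-root check in `pathGuard`. This is a minimal sketch under stated assumptions: the separator is hard-coded to `/` so the block needs no imports (the real module uses `sep` from `node:path`), `acceptingRoot` is a hypothetical helper that returns which root matched rather than the resolved path, and symlink resolution is left out. It illustrates why the comparison appends the separator and why empty extra roots are skipped.

```typescript
// Assumption: POSIX separator; the real code uses path.sep.
const SEP = '/';

function isUnder(real: string, root: string): boolean {
  return real === root || real.startsWith(root + SEP);
}

// Hypothetical helper: first root that accepts `real`, or null if none does.
function acceptingRoot(
  real: string,
  projectRoot: string,
  extraRoots: readonly string[] = [],
): string | null {
  if (isUnder(real, projectRoot)) return projectRoot;
  for (const extra of extraRoots) {
    // '' + '/' would be a prefix of every absolute path, so skip empties.
    if (extra.length === 0) continue;
    if (isUnder(real, extra)) return extra;
  }
  return null;
}

const primary = acceptingRoot('/opt/boocode/src/a.ts', '/opt/boocode');
const sibling = acceptingRoot('/opt/boocode-evil/x.ts', '/opt/boocode');
const granted = acceptingRoot('/opt/forks/foo/go.mod', '/opt/boocode', ['', '/opt/forks/foo']);
const ungranted = acceptingRoot('/etc/passwd', '/opt/boocode', ['']);
console.log({ primary, sibling, granted, ungranted });
```

Here `sibling` is rejected because `/opt/boocode-evil` only shares a string prefix with the project root, and `ungranted` is rejected because the empty entry is ignored instead of matching everything.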
--- a/apps/server/src/services/request_read_access.ts
+++ b/apps/server/src/services/request_read_access.ts
@@ -0,0 +1,82 @@
 // v1.13.17-cross-repo-reads: tool the model uses to request read access to
 // a path outside its session's primary project root. When the model emits
 // view_file("/opt/forks/foo/go.mod") under a session scoped to /opt/boocode,
 // pathGuard's error message hints at this tool. The model then emits
 //   request_read_access(path="/opt/forks/foo/go.mod",
 //                       reason="investigating foo to write the design doc")
 // The tool's execute does cheap up-front validation: if the requested path
 // can't possibly be granted under the current whitelist + repo-shape rules,
 // it returns a denial immediately without prompting the user. Otherwise, the
 // tool-phase pause branch (parallel of ask_user_input) stores a pending
 // sentinel and waits for the user's allow/deny via the grant_read_access
 // endpoint.
 //
 // The execute body never directly mutates state; the grant endpoint owns
 // the persistence path. This keeps the tool-side logic side-effect-free
 // (it's just a request) and matches ask_user_input's "server-side no-op
 // fallback, pause happens in tool-phase" shape.
 import { z } from 'zod';
 import type { ToolDef } from './tools.js';
 const RequestReadAccessInput = z.object({
  path: z.string().min(1),
  reason: z.string().min(1).max(500),
 });
 type RequestReadAccessInputT = z.infer<typeof RequestReadAccessInput>;
 export const requestReadAccess: ToolDef<RequestReadAccessInputT> = {
  name: 'request_read_access',
  description:
    "Ask the user for read-only access to a path outside the current " +
    "session's project scope. Use when a previous read tool (view_file, " +
    'list_dir, grep, find_files) was refused with a path-escapes-project ' +
    'error and the path is plausibly under another known repository (e.g. ' +
    '/opt/forks/foo). Provide a short reason describing why you need the ' +
    "access. Pauses the conversation until the user picks Allow or Deny; " +
    'the next assistant turn sees the result. On Allow, the tool result ' +
    'is "granted: <root>" — subsequent reads under that root succeed for ' +
    'the rest of the session. On Deny, the tool result is "denied". Do ' +
    'not call this for paths that are already inside the project root.',
  inputSchema: RequestReadAccessInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'request_read_access',
      description:
        "Ask the user for read-only access to a path outside the session's " +
        'project scope. Pauses the conversation until the user picks Allow ' +
        'or Deny. Subsequent reads under the granted root succeed for the ' +
        'rest of the session.',
      parameters: {
        type: 'object',
        properties: {
          path: {
            type: 'string',
            description:
              'Absolute path the model wants to read. Must be under the ' +
              "server's PROJECT_ROOT_WHITELIST (default /opt) and outside " +
              "the session's primary project root.",
          },
          reason: {
            type: 'string',
            description:
              'Short rationale (<=500 chars) shown to the user explaining ' +
              'why the access is needed. The user uses this to decide.',
          },
        },
        required: ['path', 'reason'],
        additionalProperties: false,
      },
    },
  },
  // Server-side no-op. The "execution" of request_read_access is the
  // pause-and-resume cycle managed by tool-phase.ts + the grant endpoint.
  // The inference loop catches this tool name BEFORE executeToolCall fires
  // and inserts a pending sentinel instead — this fallback only runs if
  // something bypasses that branch, in which case we surface the pending
  // shape so downstream code can still detect it. Mirrors ask_user_input.
  async execute(input) {
    return { _pending: true, path: input.path, reason: input.reason };
  },
 };
--- a/apps/server/src/services/synthesisPipeline.ts
+++ b/apps/server/src/services/synthesisPipeline.ts
@@ -0,0 +1,493 @@
 // v1.13.13: forced second-inference synthesis pass for codecontext
 // overview/analysis tools. Triggered from tool-phase.ts after a codecontext
 // tool call lands and BEFORE the normal recursive runAssistantTurn fires.
 //
 // Inputs to the synthesis stream:
 //   1. The codecontext tool's result text.
 //   2. Top-N source files referenced in that text, fetched via view_file.
 //   3. Project documentation auto-fetched from the repo root.
 //   4. The original user message that triggered the turn.
 //
 // Output: a NEW assistant message whose sole part is kind='synthesis'.
 // Streams to the client as deltas exactly like a normal assistant turn.
 //
 // Failure modes (all except user-abort fall through to recursive runAssistantTurn):
 //   - SYNTHESIS_TOOLS membership check fails -> return false immediately.
 //   - File-fetch / doc-fetch errors -> silent skip, continue with what we have.
 //   - Stream error / timeout -> mark synth message status='failed', return false.
 //   - User-abort -> mark synth message status='failed', then re-throw so the outer abort handler runs.
 import { promises as fs } from 'node:fs';
 import { join } from 'node:path';
 import { TOOLS_BY_NAME } from './tools.js';
 import { streamCompletion } from './inference/stream-phase.js';
 import { SYNTHESIS_SYSTEM_PROMPT } from './synthesisPrompt.js';
 import { insertParts } from './inference/parts.js';
 import * as modelContext from './model-context.js';
 import { readTruncation } from './truncate.js';
 import type { Session } from '../types/api.js';
 import type { OpenAiMessage } from './inference/payload.js';
 import type { InferenceContext, TurnArgs } from './inference/turn.js';
 export const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
  'get_codebase_overview',
  'get_framework_analysis',
  'get_semantic_neighborhoods',
 ]);
 const TOP_N_FILES = 5;
 const FILE_LINE_CAP = 200;
 const DOC_LINE_CAP = 500;
 // Token budget for the auto-fetched content (files + docs combined). Estimated
 // via chars/4 — a rough but stable proxy that doesn't require a tokenizer dep.
 const TOKEN_BUDGET = 32_000;
 const CHARS_PER_TOKEN = 4;
 // 90s per synthesis call. Long enough for a thoughtful overview against a
 // large auto-fetched payload; short enough that a hung upstream falls through
 // to the normal recursive turn within a typical user attention window.
 const SYNTH_TIMEOUT_MS = 90_000;
 // File-extension regex for referenced-file extraction. Limited to source-
 // language extensions so we don't pull in lockfiles, images, etc.
 const FILE_PATH_RE =
  /(?:^|[`'"<\s\(\[])([A-Za-z0-9_./@-]+\.(?:ts|tsx|js|jsx|py|go|rs|java|kt|c|cpp|h|hpp|md|json|yaml|yml|sql|sh|html|css))(?=[`'"<\)\]\s,;:]|$)/gm;
 export interface SynthesisParams {
  ctx: InferenceContext;
  args: TurnArgs;
  session: Session;
  projectRoot: string;
  toolName: string;
  toolResultText: string;
  // v1.13.15-b: when codecontext's wrapper hit its 32k inline-truncation
  // limit, we expand the full content via readTruncation for reference-file
  // extraction only. toolResultText (the truncated head) still ships to the
  // synth model — preserves the 32k payload-budget contract.
  truncated?: boolean;
  // opaque id (tr_<…>), not a filesystem path — see truncate.ts naming note
  outputPath?: string;
 }
 interface FetchedFile {
  path: string;
  content: string;
 }
 interface DocsCollection {
  boochat?: string;
  agents?: string;
  context?: string;
  roadmap?: string;
 }
 export async function runSynthesisPass(p: SynthesisParams): Promise<boolean> {
  if (!SYNTHESIS_TOOLS.has(p.toolName)) return false;
  let synthMessageId: string | null = null;
  let accumulated = '';
  let timedOut = false;
  const synthCtrl = new AbortController();
  const timer = setTimeout(() => {
    timedOut = true;
    synthCtrl.abort();
  }, SYNTH_TIMEOUT_MS);
  try {
    const userMessage = await fetchOriginalUserMessage(p.ctx, p.args.chatId);
    if (!userMessage) {
      p.ctx.log.warn({ chatId: p.args.chatId }, 'synthesis: no user message found; falling through');
      return false;
    }
    // v1.13.15-b: when the tool result was inline-truncated by the wrapper
    // (32k cap, see codecontext_client.ts:114), expand the full content from
    // tmpfs for reference-file extraction. The synth payload still ships the
    // truncated head (see buildPayload call below) so the token-budget
    // contract holds. Graceful degradation: if readTruncation returns null
    // (missing id, ENOENT) or throws, fall back to the truncated head.
    let extractionSource = p.toolResultText;
    if (p.truncated && p.outputPath) {
      try {
        const full = await readTruncation(p.outputPath);
        if (full !== null) {
          extractionSource = full;
          p.ctx.log.info(
            {
              chatId: p.args.chatId,
              toolName: p.toolName,
              originalChars: p.toolResultText.length,
              fullChars: full.length,
            },
            'synthesis: expanded truncated tool output',
          );
        }
      } catch (err) {
        p.ctx.log.warn(
          { chatId: p.args.chatId, toolName: p.toolName, err: String(err) },
          'synthesis: readTruncation failed, using truncated output',
        );
      }
    }
    const refFiles = extractReferencedFiles(extractionSource);
    const files = await fetchTopFiles(refFiles, p.projectRoot);
    const docs = await fetchProjectDocs(p.projectRoot);
    const { files: budgetedFiles, docs: budgetedDocs } = applyTokenBudget(files, docs);
    const synthMessages = buildPayload(
      p.toolName,
      // Truncated head only — full content was used for reference extraction above
      p.toolResultText,
      budgetedFiles,
      budgetedDocs,
      userMessage,
    );
    // Insert + announce the synthesis assistant message. From here on, any
    // exception must clean up via the catch block so the row doesn't linger
    // in 'streaming' status (the 5min stale-streaming sweeper catches it
    // eventually, but explicit cleanup is better).
    const [synthRow] = await p.ctx.sql<
      { id: string; started_at: string }[]
    >`
      INSERT INTO messages (session_id, chat_id, role, content, status, started_at, created_at)
      VALUES (${p.args.sessionId}, ${p.args.chatId}, 'assistant', '', 'streaming', clock_timestamp(), clock_timestamp())
      RETURNING id, started_at
    `;
    synthMessageId = synthRow!.id;
    const startedAt = synthRow!.started_at;
    p.ctx.publish(p.args.sessionId, {
      type: 'message_started',
      message_id: synthMessageId,
      chat_id: p.args.chatId,
      role: 'assistant',
    });
    // Combine the user-abort signal with our synthesis-specific timeout so
    // either fires correctly. The `timedOut` flag in scope tells us which one
    // tripped after streamCompletion throws.
    const combinedSignal: AbortSignal | undefined = p.args.signal
      ? AbortSignal.any([p.args.signal, synthCtrl.signal])
      : synthCtrl.signal;
    const onDelta = (delta: string): void => {
      accumulated += delta;
      p.ctx.publish(p.args.sessionId, {
        type: 'delta',
        message_id: synthMessageId!,
        chat_id: p.args.chatId,
        content: delta,
      });
    };
    const streamResult = await streamCompletion(
      p.ctx,
      p.session.model,
      synthMessages,
      { tools: null },
      onDelta,
      undefined,
      combinedSignal,
    );
    const mctx = await modelContext.getModelContext(p.session.model);
    const nCtx = mctx?.n_ctx ?? null;
    const [updated] = await p.ctx.sql<
      {
        tokens_used: number | null;
        ctx_used: number | null;
        ctx_max: number | null;
        finished_at: string | null;
      }[]
    >`
      UPDATE messages
      SET content = ${streamResult.content},
          status = 'complete',
          tokens_used = ${streamResult.completionTokens},
          ctx_used = ${streamResult.promptTokens},
          ctx_max = ${nCtx},
          finished_at = clock_timestamp()
      WHERE id = ${synthMessageId}
      RETURNING tokens_used, ctx_used, ctx_max, finished_at
    `;
    await insertParts(p.ctx.sql, [
      {
        message_id: synthMessageId,
        sequence: 0,
        kind: 'synthesis',
        payload: { text: streamResult.content },
      },
    ]);
    p.ctx.publish(p.args.sessionId, {
      type: 'message_complete',
      message_id: synthMessageId,
      chat_id: p.args.chatId,
      tokens_used: updated?.tokens_used ?? null,
      ctx_used: updated?.ctx_used ?? null,
      ctx_max: updated?.ctx_max ?? null,
      started_at: startedAt,
      finished_at: updated?.finished_at ?? null,
      model: p.session.model,
    });
    p.ctx.publishUser({
      type: 'chat_status',
      chat_id: p.args.chatId,
      status: 'idle',
      at: new Date().toISOString(),
    });
    p.ctx.log.info(
      {
        chatId: p.args.chatId,
        synthMessageId,
        toolName: p.toolName,
        chars: streamResult.content.length,
        files: budgetedFiles.length,
      },
      'synthesis pass complete',
    );
    return true;
  } catch (err) {
    await markSynthFailed(p, synthMessageId, accumulated).catch((cleanupErr) => {
      p.ctx.log.warn({ cleanupErr: String(cleanupErr) }, 'synthesis cleanup UPDATE failed');
    });
    if (err instanceof Error && err.name === 'AbortError') {
      if (timedOut) {
        p.ctx.log.warn(
          { toolName: p.toolName, chatId: p.args.chatId },
          'synthesis pass timed out; falling through to recursive turn',
        );
        return false;
      }
      // User-initiated abort: propagate so the outer error handler marks the
      // parent turn cancelled. The synth message is already marked failed by
      // markSynthFailed above.
      throw err;
    }
    p.ctx.log.warn(
      { err: String(err), toolName: p.toolName, chatId: p.args.chatId },
      'synthesis pass failed; falling through to recursive turn',
    );
    return false;
  } finally {
    clearTimeout(timer);
  }
 }
 async function markSynthFailed(
  p: SynthesisParams,
  synthMessageId: string | null,
  accumulated: string,
 ): Promise<void> {
  if (synthMessageId === null) return;
  await p.ctx.sql`
    UPDATE messages
    SET content = ${accumulated},
        status = 'failed',
        finished_at = clock_timestamp()
    WHERE id = ${synthMessageId}
  `;
  // Republish so the frontend's live state flips from 'streaming' to
  // terminal. message_complete carries no error reason — the row's status
  // column is the truth. The 5-state chat_status dot has 'error' but we
  // don't fire that here because the broader inference is about to retry
  // via recursion; flipping the user-channel status to 'error' would race
  // the recursive turn's 'streaming' announcement.
  p.ctx.publish(p.args.sessionId, {
    type: 'message_complete',
    message_id: synthMessageId,
    chat_id: p.args.chatId,
    model: p.session.model,
  });
 }
 async function fetchOriginalUserMessage(
  ctx: InferenceContext,
  chatId: string,
 ): Promise<string | null> {
  const rows = await ctx.sql<{ content: string }[]>`
    SELECT content FROM messages
    WHERE chat_id = ${chatId} AND role = 'user'
    ORDER BY created_at DESC
    LIMIT 1
  `;
  return rows[0]?.content ?? null;
 }
 function extractReferencedFiles(text: string): string[] {
  const seen = new Set<string>();
  const order: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = FILE_PATH_RE.exec(text)) !== null) {
    const candidate = m[1]!;
    if (seen.has(candidate)) continue;
    if (
      candidate.includes('node_modules') ||
      candidate.includes('/dist/') ||
      candidate.includes('/test/') ||
      candidate.includes('/tests/') ||
      /\.(test|spec)\.[a-z]+$/.test(candidate)
    ) {
      continue;
    }
    seen.add(candidate);
    order.push(candidate);
  }
  return order;
 }
 async function fetchTopFiles(refs: string[], projectRoot: string): Promise<FetchedFile[]> {
  const tool = TOOLS_BY_NAME['view_file'];
  if (!tool) return [];
  const out: FetchedFile[] = [];
  for (const p of refs.slice(0, TOP_N_FILES)) {
    const absPath = p.startsWith('/') ? p : join(projectRoot, p);
    try {
      const r = await tool.execute({ path: absPath, end_line: FILE_LINE_CAP }, projectRoot);
      const content = (r as { content?: string }).content ?? '';
      if (content) out.push({ path: p, content });
    } catch {
      // path-scope blocked, secret-filtered, file too large, or missing —
      // skip silently. The remaining files (or none) still produce a
      // meaningful synthesis input.
    }
  }
  return out;
 }
 async function fetchProjectDocs(projectRoot: string): Promise<DocsCollection> {
  const tool = TOOLS_BY_NAME['view_file'];
  if (!tool) return {};
  const docs: DocsCollection = {};
  for (const [filename, key] of [
    ['BOOCHAT.md', 'boochat'],
    ['AGENTS.md', 'agents'],
    ['CONTEXT.md', 'context'],
  ] as const) {
    try {
      const r = await tool.execute(
        { path: join(projectRoot, filename), end_line: DOC_LINE_CAP },
        projectRoot,
      );
      const content = (r as { content?: string }).content;
      if (content) docs[key] = content;
    } catch {
      // missing doc — skip
    }
  }
   // Case-insensitive *roadmap*.md match. Picks the first hit in readdir()
   // order (OS-dependent, not sorted); typical projects have at most one roadmap doc.
  try {
    const entries = await fs.readdir(projectRoot);
    const roadmap = entries.find(
      (e) => /roadmap/i.test(e) && e.toLowerCase().endsWith('.md'),
    );
    if (roadmap) {
      const r = await tool.execute(
        { path: join(projectRoot, roadmap), end_line: DOC_LINE_CAP },
        projectRoot,
      );
      const content = (r as { content?: string }).content;
      if (content) docs.roadmap = content;
    }
  } catch {
    // unreadable project root — skip
  }
  return docs;
 }
 function estTokens(s: string | undefined): number {
  return s ? Math.ceil(s.length / CHARS_PER_TOKEN) : 0;
 }
 function applyTokenBudget(
  files: FetchedFile[],
  docs: DocsCollection,
 ): { files: FetchedFile[]; docs: DocsCollection } {
  let total = 0;
  for (const f of files) total += estTokens(f.content);
  total += estTokens(docs.boochat) + estTokens(docs.agents) + estTokens(docs.context) + estTokens(docs.roadmap);
  if (total <= TOKEN_BUDGET) return { files, docs };
  // Drop priority (lowest priority dropped first):
  //   1. top-2..N files (keep top-1)
  //   2. top-1 file
   //   3. roadmap, with CONTEXT.md grouped alongside it (the dispatch listed
   //      roadmap above AGENTS.md; CONTEXT.md was not in its priority list)
   //   4. AGENTS.md
   //   5. BOOCHAT.md (never dropped; truncated to the budget if it alone exceeds it)
  let outFiles = files.slice();
  const outDocs: DocsCollection = { ...docs };
  while (total > TOKEN_BUDGET && outFiles.length > 1) {
    const last = outFiles.pop()!;
    total -= estTokens(last.content);
  }
  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
  if (outFiles[0]) {
    total -= estTokens(outFiles[0].content);
    outFiles = [];
  }
  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
  if (outDocs.roadmap) {
    total -= estTokens(outDocs.roadmap);
    delete outDocs.roadmap;
  }
  if (outDocs.context) {
    total -= estTokens(outDocs.context);
    delete outDocs.context;
  }
  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
  if (outDocs.agents) {
    total -= estTokens(outDocs.agents);
    delete outDocs.agents;
  }
  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
  if (outDocs.boochat) {
    const maxChars = TOKEN_BUDGET * CHARS_PER_TOKEN;
    if (outDocs.boochat.length > maxChars) {
      outDocs.boochat = outDocs.boochat.slice(0, maxChars);
    }
  }
  return { files: outFiles, docs: outDocs };
 }
 function buildPayload(
  toolName: string,
  toolResultText: string,
  files: FetchedFile[],
  docs: DocsCollection,
  userMessage: string,
 ): OpenAiMessage[] {
  const sections: string[] = [];
  sections.push(`## Codecontext tool output (${toolName})\n\n${toolResultText}`);
  if (files.length > 0) {
    sections.push(`---\n\n## Auto-fetched source files`);
    for (const f of files) {
      sections.push(`### ${f.path}\n\n\`\`\`\n${f.content}\n\`\`\``);
    }
  }
  const docEntries: Array<[string, string | undefined]> = [
    ['BOOCHAT.md', docs.boochat],
    ['AGENTS.md', docs.agents],
    ['CONTEXT.md', docs.context],
    ['roadmap', docs.roadmap],
  ];
  const presentDocs = docEntries.filter(([, v]) => Boolean(v));
  if (presentDocs.length > 0) {
    sections.push(`---\n\n## Project documentation`);
    for (const [name, v] of presentDocs) {
      sections.push(`### ${name}\n\n${v!}`);
    }
  }
  sections.push(`---\n\n## Original user question\n\n${userMessage}`);
  return [
    { role: 'system', content: SYNTHESIS_SYSTEM_PROMPT },
    { role: 'user', content: sections.join('\n\n') },
  ];
 }
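The chars/4 estimate and the tail-first drop order in applyTokenBudget can be illustrated with a reduced sketch. It assumes only the file-dropping stage (doc handling is omitted), and `trimFilesToBudget` and `Fetched` are hypothetical names rather than functions or types from this patch.

```typescript
// Reduced sketch of the chars/4 budget: estimate tokens from string length,
// then drop the lowest-ranked files until the total fits.
const CHARS_PER_TOKEN = 4;

function estTokens(s: string): number {
  return Math.ceil(s.length / CHARS_PER_TOKEN);
}

interface Fetched {
  path: string;
  content: string;
}

// Hypothetical files-only variant: always keeps the top-ranked file,
// even if that file alone is over budget.
function trimFilesToBudget(files: Fetched[], budget: number): Fetched[] {
  const out = files.slice();
  let total = out.reduce((sum, f) => sum + estTokens(f.content), 0);
  while (total > budget && out.length > 1) {
    const dropped = out.pop()!;
    total -= estTokens(dropped.content);
  }
  return out;
}

const sample: Fetched[] = [
  { path: 'a.ts', content: 'x'.repeat(400) }, // 100 estimated tokens
  { path: 'b.ts', content: 'y'.repeat(400) }, // 100 estimated tokens
  { path: 'c.ts', content: 'z'.repeat(401) }, // 101 estimated tokens (ceil)
];
const kept = trimFilesToBudget(sample, 200).map((f) => f.path);
console.log(kept);
```

Because the estimate rounds up per string, many small files cost slightly more than one large file of the same total length, which errs on the side of staying under the budget.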
--- a/apps/server/src/services/synthesisPrompt.ts
+++ b/apps/server/src/services/synthesisPrompt.ts
@@ -0,0 +1,20 @@
 // v1.13.13: synthesis pipeline system prompt. Verbatim from the v1.13.13
 // dispatch — do not paraphrase. The synthesis pass loads this as its sole
 // system message, followed by a user message that concatenates the
 // codecontext tool result, auto-fetched top files, auto-fetched project
 // docs, and the original user message.
 export const SYNTHESIS_SYSTEM_PROMPT = `You are synthesizing structural data into an accurate, detailed answer about the user's codebase.
 Inputs you have been given:
 1. The output of a codecontext analysis tool (raw structural data — file counts, symbols, dependencies, frameworks).
 2. The contents of the top files referenced in that output.
 3. Any project documentation found in the repo root (BOOCHAT.md, AGENTS.md, roadmap docs, CONTEXT.md).
 Rules:
 - Cite specific files and line numbers when making claims about code.
 - If project docs contradict the code, docs win for questions about state, version, status, or roadmap. Code wins for questions about runtime behavior or implementation.
 - If the codecontext output looks sparse (low symbol count for a TypeScript project, missing dependency edges, empty framework list), explicitly say so — codecontext falls back to the JavaScript grammar for TypeScript and loses interfaces, generics, decorators, and type aliases.
 - Do not invent symbols, files, or relationships that are not present in the inputs.
 - Do not respond with a generic "this looks like a [framework] project" summary. The user has the framework analysis already. Add specifics: what is actually in this codebase, what is shipped, what is planned, what is load-bearing.
 - Length: match the depth the user asked for. Overview questions get structured multi-section answers. Specific questions get focused answers.
 `;
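Input 2 of the prompt above depends on pulling file paths out of the tool's text. A minimal sketch of that extraction follows, assuming a shortened extension list and a reduced skip filter. `PATH_RE` and `extractPaths` are illustrative names; the pipeline's own FILE_PATH_RE and extractReferencedFiles cover more cases.

```typescript
// Reduced extension list for illustration only. A path must be preceded by
// start-of-line or a delimiter and followed by a delimiter or end-of-line.
const PATH_RE =
  /(?:^|[\s`'"(\[])([A-Za-z0-9_.\/@-]+\.(?:tsx?|jsx?|py|md))(?=[\s`'"),;:\]]|$)/gm;

function extractPaths(text: string): string[] {
  const seen = new Set<string>();
  const order: string[] = [];
  PATH_RE.lastIndex = 0; // global regexes carry state between exec() calls
  let m: RegExpExecArray | null;
  while ((m = PATH_RE.exec(text)) !== null) {
    const candidate = m[1];
    if (!candidate || seen.has(candidate)) continue;
    // Skip dependencies and test files, keeping first-mention order.
    if (candidate.includes('node_modules') || /\.(test|spec)\.[a-z]+$/.test(candidate)) {
      continue;
    }
    seen.add(candidate);
    order.push(candidate);
  }
  return order;
}

const toolText =
  'See `src/index.ts` and src/util/io.py, plus src/index.ts again; skip src/io.test.ts';
const refs = extractPaths(toolText);
console.log(refs);
```

Note that a path followed directly by a sentence-ending period does not match under this lookahead, since `.` is also a legal path character.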
--- a/apps/server/src/services/system-prompt.ts
+++ b/apps/server/src/services/system-prompt.ts
@@ -8,9 +8,19 @@
 //   + container guidance (this layer, NEW in v1.12)
 //   + agent.system_prompt          (resolved from data/AGENTS.md by getAgentById)
 //   + session.system_prompt OR project.default_system_prompt
 //
 // v1.13.8: byte-stability instrumentation. buildSystemPromptWithFingerprint
 // returns the assembled string plus a SHA-256 fingerprint and a per-session
 // drift signal. buildSystemPrompt stays a string→string shim for backward
 // compat (tests use it). No cache added — recon proved input-layer mtime
 // caches (this file + agents.ts) already deliver byte-stable inputs in
 // steady state. v1.13.8 measures that claim against production traffic
 // before any cache infrastructure earns its place.
 import { createHash } from 'node:crypto';
 import { readFile, stat } from 'node:fs/promises';
 import type { Agent, Project, Session } from '../types/api.js';
 import { getAgentsMtimes } from './agents.js';
 const BASE_SYSTEM_PROMPT = (projectPath: string) =>
  `You are BooCode Chat, a code investigation assistant. The user is working on a project located at ${projectPath}. Use the file-read tools (view_file, list_dir, grep, find_files) to investigate code when needed. Be concise. Cite file paths and line numbers when discussing code. Do not hallucinate file contents — read the file first. Tool results may be truncated; if so, narrow your query rather than guessing.`;
@@ -60,11 +70,94 @@ export function _resetContainerGuidanceCacheForTests(): void {
  cachedGuidance = null;
 }
-export async function buildSystemPrompt(
+// v1.13.8: expose the mtime currently held in the BOOCHAT cache so the
 // fingerprint log can stamp it without re-statting (no I/O race against
 // getContainerGuidance, which is the canonical mtime source).
 function getCachedGuidanceMtime(): number | null {
  if (!cachedGuidance) return null;
  // mtime=0 is the sentinel for "file is missing" (set in the catch above).
  // Surface it as null so the log/diff doesn't treat absence as a number.
  return cachedGuidance.mtime > 0 ? cachedGuidance.mtime : null;
 }
 // v1.13.8: fingerprint emitted per turn, observer state keyed by session.
 // Field set is intentionally small — we want the diff between two
 // fingerprints to point at the exact input that drifted, not bury the
 // signal in noise.
 export interface PrefixFingerprint {
  msg: 'prefix-fingerprint';
  project_id: string;
  agent_id: string | null;
  agent_name: string | null;
  session_id: string;
  prefix_hash: string;
  prefix_length: number;
  mtime_boochat: number | null;
  mtime_agents_global: number | null;
  mtime_agents_project: number | null;
  has_agent_system_prompt: boolean;
  has_session_override: boolean;
  has_project_override: boolean;
 }
 export interface PrefixDrift {
  msg: 'prefix-drift';
  session_id: string;
  prev_hash: string;
  new_hash: string;
  prev_length: number;
  new_length: number;
  // Names of fields in PrefixFingerprint (excluding the hash + length pair
  // and the session_id key itself) whose values differ between the previous
  // observation and this one. The bug case is `changed_inputs: []` — hash
  // differs but no tracked input moved, which means assembly is
  // nondeterministic somewhere.
  changed_inputs: string[];
 }
 // Fields tracked per-session for the drift diff. Stored alongside the hash
 // so we can recompute changed_inputs without re-running buildSystemPrompt.
 interface ObservedInputs {
  agent_id: string | null;
  mtime_boochat: number | null;
  mtime_agents_global: number | null;
  mtime_agents_project: number | null;
  has_agent_system_prompt: boolean;
  has_session_override: boolean;
  has_project_override: boolean;
 }
 interface ObserverEntry {
  hash: string;
  length: number;
  inputs: ObservedInputs;
 }
 // Unbounded by design for v1.13.8 (instrumentation, short-lived sessions in
 // the smoke test). TODO(v1.13.x follow-up if v1.13.8 surfaces stable):
 // LRU-bound this Map at 1000 sessions when the in-process surface lives long
 // enough to matter.
 const prefixObserver = new Map<string, ObserverEntry>();
 // Test-only: clear the observer so consecutive tests don't share state.
 export function _resetPrefixObserverForTests(): void {
  prefixObserver.clear();
 }
 function computeChangedInputs(prev: ObservedInputs, curr: ObservedInputs): string[] {
  const out: string[] = [];
  const keys = Object.keys(curr) as (keyof ObservedInputs)[];
  for (const k of keys) {
    if (prev[k] !== curr[k]) out.push(k);
  }
  return out;
 }
 export async function buildSystemPromptWithFingerprint(
  project: Project,
  session: Session,
-  agent: Agent | null
+  agent: Agent | null,
-): Promise<string> {
+): Promise<{ prompt: string; fingerprint: PrefixFingerprint; drift: PrefixDrift | null }> {
  let out = BASE_SYSTEM_PROMPT(project.path);
  const guidance = await getContainerGuidance();
  if (guidance) {
@@ -79,5 +172,60 @@ export async function buildSystemPrompt(
  if (userPrompt.length > 0) {
    out += '\n\n' + userPrompt;
  }
-  return out;
+
  const hash = createHash('sha256').update(out, 'utf8').digest('hex');
  const agentsMtimes = getAgentsMtimes(project.path);
  const inputs: ObservedInputs = {
    agent_id: agent?.id ?? null,
    mtime_boochat: getCachedGuidanceMtime(),
    mtime_agents_global: agentsMtimes.global,
    mtime_agents_project: agentsMtimes.project,
    has_agent_system_prompt: !!(agent && agent.system_prompt.trim().length > 0),
    has_session_override: sessionPrompt.length > 0,
    has_project_override: projectPrompt.length > 0,
  };
  const fingerprint: PrefixFingerprint = {
    msg: 'prefix-fingerprint',
    project_id: project.id,
    agent_id: agent?.id ?? null,
    agent_name: agent?.name ?? null,
    session_id: session.id,
    prefix_hash: hash,
    prefix_length: out.length,
    mtime_boochat: inputs.mtime_boochat,
    mtime_agents_global: inputs.mtime_agents_global,
    mtime_agents_project: inputs.mtime_agents_project,
    has_agent_system_prompt: inputs.has_agent_system_prompt,
    has_session_override: inputs.has_session_override,
    has_project_override: inputs.has_project_override,
  };
  let drift: PrefixDrift | null = null;
  const prev = prefixObserver.get(session.id);
  if (prev && prev.hash !== hash) {
    drift = {
      msg: 'prefix-drift',
      session_id: session.id,
      prev_hash: prev.hash,
      new_hash: hash,
      prev_length: prev.length,
      new_length: out.length,
      changed_inputs: computeChangedInputs(prev.inputs, inputs),
    };
  }
  prefixObserver.set(session.id, { hash, length: out.length, inputs });
  return { prompt: out, fingerprint, drift };
 }
 // Backward-compatible string-returning shim. Kept so existing callers
 // (tests, future code paths that don't want to log) work unchanged.
 export async function buildSystemPrompt(
  project: Project,
  session: Session,
  agent: Agent | null,
 ): Promise<string> {
  const { prompt } = await buildSystemPromptWithFingerprint(project, session, agent);
  return prompt;
 }
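The drift check above hashes the assembled prompt and diffs the tracked inputs against the previous observation for the same session. A reduced sketch of that idea follows, assuming only two tracked inputs. `observe`, `Inputs`, and `Entry` are hypothetical names and not the patch's API.

```typescript
import { createHash } from 'node:crypto';

// Reduced input set for illustration; the patch tracks more fields.
interface Inputs {
  agent_id: string | null;
  mtime_boochat: number | null;
}
interface Entry {
  hash: string;
  inputs: Inputs;
}

const observer = new Map<string, Entry>();

function changedKeys(prev: Inputs, curr: Inputs): string[] {
  return (Object.keys(curr) as (keyof Inputs)[]).filter((k) => prev[k] !== curr[k]);
}

// Returns null on the first observation or when the hash is unchanged;
// otherwise returns the names of the tracked inputs that moved.
function observe(sessionId: string, prompt: string, inputs: Inputs): string[] | null {
  const hash = createHash('sha256').update(prompt, 'utf8').digest('hex');
  const prev = observer.get(sessionId);
  observer.set(sessionId, { hash, inputs });
  if (!prev || prev.hash === hash) return null;
  return changedKeys(prev.inputs, inputs);
}

const first = observe('s1', 'prompt A', { agent_id: 'a1', mtime_boochat: 100 });
const same = observe('s1', 'prompt A', { agent_id: 'a1', mtime_boochat: 100 });
const moved = observe('s1', 'prompt B', { agent_id: 'a1', mtime_boochat: 200 });
// Hash changed but no tracked input moved: the case worth investigating.
const suspicious = observe('s1', 'prompt C', { agent_id: 'a1', mtime_boochat: 200 });
console.log(first, same, moved, suspicious);
```

An empty array is deliberately distinct from null here: it signals a hash change that none of the tracked inputs explains.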
--- a/apps/server/src/services/tools.ts
+++ b/apps/server/src/services/tools.ts
@@ -8,6 +8,7 @@ import { getGitMeta } from './git_meta.js';
 import { findSkills, getSkillBody, getSkillResource } from './skills.js';
 import { webSearch } from './web_search.js';
 import { webFetch } from './web_fetch.js';
 import { readTruncation, truncateIfNeeded } from './truncate.js';
 // v1.12 Track B.2: codecontext tools. 8 wrappers re-exported from
 // tools/codecontext/index.ts. Each calls into services/codecontext_client.ts
 // which talks to the codecontext sidecar at http://codecontext:8080.
@@ -21,6 +22,10 @@ import {
  getSemanticNeighborhoods,
  getFrameworkAnalysis,
 } from './tools/codecontext/index.js';
 // v1.13.17-cross-repo-reads: cross-repo read grant request tool. Paired
 // with the pause-on-pending-grant branch in inference/tool-phase.ts and the
 // POST /api/chats/:id/grant_read_access endpoint in routes/messages.ts.
 import { requestReadAccess } from './request_read_access.js';
 const MAX_FILE_BYTES = 5 * 1024 * 1024;
 const DEFAULT_VIEW_LINES = 200;
@@ -44,7 +49,13 @@ export interface ToolDef<TInput> {
  description: string;
  inputSchema: z.ZodType<TInput>;
  jsonSchema: ToolJsonSchema;
-  execute(input: TInput, projectRoot: string): Promise<unknown>;
+  // v1.13.17-cross-repo-reads: extraRoots is the session's
  // allowed_read_paths, threaded through executeToolCall in tool-phase.ts.
  // Only the filesystem tools (view_file, list_dir, grep, find_files,
  // view_truncated_output) forward it to pathGuard; other tools accept the
  // arg and ignore it. The execute signature stays compatible with
  // pre-v1.13.17 callsites because the parameter is optional.
  execute(input: TInput, projectRoot: string, extraRoots?: readonly string[]): Promise<unknown>;
 }
 const ViewFileInput = z.object({
@@ -77,14 +88,19 @@ export const viewFile: ToolDef<ViewFileInputT> = {
      },
    },
  },
-  async execute(input, projectRoot) {
+  async execute(input, projectRoot, extraRoots) {
-    const real = await pathGuard(projectRoot, input.path);
+    const real = await pathGuard(projectRoot, input.path, extraRoots);
    // v1.11.7: secret-file deny check. Test the project-relative path
    // (matches the form continue.dev's patterns expect: basenames + dir
    // segments). Throw a typed error so executeToolCall in inference.ts
    // surfaces a clear "blocked" message to the LLM instead of silently
    // returning content the user wanted hidden.
-    const relPath = relative(projectRoot, real) || basename(real);
+    // v1.13.17: when the resolved path is outside the primary projectRoot
    // (i.e. via an allowed_read_paths grant), `relative()` returns "../…"
    // which won't match secret-file basename patterns. Re-anchor on the
    // file's basename so the secret deny still fires across all grant roots.
    const rel = relative(projectRoot, real);
    const relPath = rel && !rel.startsWith('..') ? rel : basename(real);
    if (isSecretPath(relPath)) {
      throw new SecretBlockedError(relPath);
    }
@@ -109,12 +125,22 @@ export const viewFile: ToolDef<ViewFileInputT> = {
    const slice = lines.slice(start - 1, end);
    const content = slice.join('\n');
    const truncated = total > end || start > 1;
    // v1.13.5: stash the full file on tmpfs so the model can retrieve more
    // via view_truncated_output(id) without re-reading the file (which it
    // may not have project-relative-path access to in future agent setups).
    // raw is bounded by MAX_FILE_BYTES (5MB), within truncateIfNeeded's cap.
    const wrapped = await truncateIfNeeded({
      fullContent: raw,
      slicedContent: content,
      wasTruncated: truncated,
    });
    return {
      path: relative(projectRoot, real) || basename(real),
-      content,
+      content: wrapped.content,
      total_lines: total,
      returned_lines: [start, end],
-      truncated,
+      truncated: wrapped.truncated,
      ...(wrapped.outputPath ? { outputPath: wrapped.outputPath } : {}),
    };
  },
 };
@@ -146,8 +172,8 @@ export const listDir: ToolDef<ListDirInputT> = {
      },
    },
  },
-  async execute(input, projectRoot) {
+  async execute(input, projectRoot, extraRoots) {
-    const real = await pathGuard(projectRoot, input.path);
+    const real = await pathGuard(projectRoot, input.path, extraRoots);
    const s = await stat(real);
    if (!s.isDirectory()) {
      throw new PathScopeError(`not a directory: ${input.path}`);
@@ -157,41 +183,64 @@ export const listDir: ToolDef<ListDirInputT> = {
      ? entries
      : entries.filter((e) => !e.name.startsWith('.'));
    const total = filtered.length;
-    const slice = filtered.slice(0, MAX_DIR_ENTRIES);
+    const wasTruncated = total > MAX_DIR_ENTRIES;
    const out = await Promise.all(
      slice.map(async (e) => {
        const child = resolve(real, e.name);
        let size: number | undefined;
        if (e.isFile()) {
          try {
            const cs = await stat(child);
            size = cs.size;
          } catch {
            /* ignore */
          }
        }
        return {
          name: e.name,
          type: e.isDirectory() ? ('dir' as const) : ('file' as const),
          ...(size != null ? { size } : {}),
        };
      })
    );
    // v1.11.7: filter entries whose project-relative path matches a secret
    // pattern. Each entry is tested using the project-rel dir + its name
    // so the pattern's path/segment semantics work for nested dirs like
    // `.aws/`. The count is surfaced via `pathguard_note` — we never list
    // the hidden paths (defeats the purpose).
    const relDir = relative(projectRoot, real) || '.';
    // v1.13.5: when we'd truncate, render the FULL list to tmpfs so
    // view_truncated_output can serve it. Stat sizes for all entries when
    // truncating so the stored view matches the visible shape; this is the
    // one extra cost for big directories, bounded by total entries (which
    // is itself bounded by filesystem behavior).
    const processOne = async (e: typeof filtered[number]) => {
      const child = resolve(real, e.name);
      let size: number | undefined;
      if (e.isFile()) {
        try {
          const cs = await stat(child);
          size = cs.size;
        } catch { /* ignore */ }
      }
      return {
        name: e.name,
        type: e.isDirectory() ? ('dir' as const) : ('file' as const),
        ...(size != null ? { size } : {}),
      };
    };
    const slice = filtered.slice(0, MAX_DIR_ENTRIES);
    const out = await Promise.all(slice.map(processOne));
    // v1.11.7: filter entries whose project-relative path matches a secret
    // pattern. The same filter applies to the full-list snapshot below so
    // the stashed file never holds entries the slice would have hidden.
    const secretFilter = filterSecretEntries(out, (e) =>
      relDir === '.' ? e.name : `${relDir}/${e.name}`,
    );
    let outputPath: string | undefined;
    if (wasTruncated) {
      const fullProcessed = await Promise.all(filtered.map(processOne));
      const fullFiltered = filterSecretEntries(fullProcessed, (e) =>
        relDir === '.' ? e.name : `${relDir}/${e.name}`,
      );
      // One line per entry, so view_truncated_output's line-slicing semantics
      // map cleanly. Format: "<type>\t<name>[\tsize=N]". Line 1 is a header
      // naming the directory and the entry count, so entries start at line 2.
      const header = `# list_dir ${relDir} — ${fullFiltered.kept.length} entries`;
      const lines = [header, ...fullFiltered.kept.map((e) => {
        const sz = 'size' in e && e.size != null ? `\tsize=${e.size}` : '';
        return `${e.type}\t${e.name}${sz}`;
      })];
      const wrapped = await truncateIfNeeded({
        fullContent: lines.join('\n'),
        slicedContent: '',
        wasTruncated: true,
      });
      outputPath = wrapped.outputPath;
    }
    return {
      path: relDir,
      entries: secretFilter.kept,
      total: secretFilter.kept.length,
-      truncated: total > MAX_DIR_ENTRIES,
+      truncated: wasTruncated,
      ...(secretFilter.note ? { pathguard_note: secretFilter.note } : {}),
      ...(outputPath ? { outputPath } : {}),
    };
  },
 };
@@ -230,7 +279,7 @@ export const grep: ToolDef<GrepInputT> = {
      },
    },
  },
-  async execute(input, projectRoot) {
+  async execute(input, projectRoot, extraRoots) {
    const limit = Math.min(
      Math.max(input.max_results ?? DEFAULT_GREP_RESULTS, 1),
      MAX_GREP_RESULTS
@@ -242,6 +291,7 @@ export const grep: ToolDef<GrepInputT> = {
      max_matches: limit,
      case_sensitive: input.case_sensitive,
      hidden: input.hidden,
      extra_roots: extraRoots,
    });
    const reshaped = result.matches.map((m) => ({
      path: m.path,
@@ -291,7 +341,7 @@ export const findFiles: ToolDef<FindFilesInputT> = {
      },
    },
  },
-  async execute(input, projectRoot) {
+  async execute(input, projectRoot, extraRoots) {
    const limit = Math.min(
      Math.max(input.max_results ?? DEFAULT_FIND_RESULTS, 1),
      MAX_FIND_RESULTS
@@ -301,6 +351,7 @@ export const findFiles: ToolDef<FindFilesInputT> = {
    const result = await fileOpsFindFiles(projectRoot, input.pattern, {
      path: input.path,
      max_results: limit,
      extra_roots: extraRoots,
    });
    // v1.11.7: drop paths matching secret patterns. The original `total`
    // from file_ops includes pre-truncation count; we report the visible
@@ -315,6 +366,74 @@ export const findFiles: ToolDef<FindFilesInputT> = {
  },
 };
 // v1.13.5: retrieves the full content of a previously-truncated tool output
 // via the opaque id stamped on the original tool_result. Line-based slicing
 // matches view_file's mental model so the model uses the same affordances.
 // Tmpfs-backed, 7-day TTL (see services/truncate.ts).
 const VIEW_TRUNCATED_DEFAULT_LINES = 200;
 const ViewTruncatedOutputInput = z.object({
  id: z.string().regex(/^tr_[0-9a-v]{12}$/),
  start_line: z.number().int().positive().optional(),
  end_line: z.number().int().positive().optional(),
 });
 type ViewTruncatedOutputInputT = z.infer<typeof ViewTruncatedOutputInput>;
 export const viewTruncatedOutput: ToolDef<ViewTruncatedOutputInputT> = {
  name: 'view_truncated_output',
  description: `Retrieve the full content of a previously-truncated tool output by its outputPath id. When a tool returns { truncated: true, outputPath: "tr_..." }, call this to view the full content. Defaults to the first ${VIEW_TRUNCATED_DEFAULT_LINES} lines. Use start_line and end_line (1-indexed, inclusive) to slice. Stored for 7 days.`,
  inputSchema: ViewTruncatedOutputInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'view_truncated_output',
      description: `Retrieve the full content of a previously-truncated tool output by its outputPath id. Returns the first ${VIEW_TRUNCATED_DEFAULT_LINES} lines by default; use start_line/end_line to slice. Stored for 7 days.`,
      parameters: {
        type: 'object',
        properties: {
          id: { type: 'string', description: 'The outputPath value from an earlier truncated tool result (e.g. "tr_abc123def456").' },
          start_line: { type: 'integer', description: 'First line (1-indexed). Default 1.' },
          end_line: { type: 'integer', description: `Last line (1-indexed, inclusive). Defaults to start_line + ${VIEW_TRUNCATED_DEFAULT_LINES - 1}, clamped to the last line.` },
        },
        required: ['id'],
        additionalProperties: false,
      },
    },
  },
  // view_truncated_output doesn't touch the project filesystem; it reads a
  // stored blob from tmpfs by opaque id. extraRoots is irrelevant here and is
  // declared only for signature parity with the v1.13.17 ToolDef contract.
  async execute(input, _projectRoot, _extraRoots) {
    const content = await readTruncation(input.id);
    if (content === null) {
      return {
        id: input.id,
        content: '',
        truncated: false,
        error: `No truncation found for id "${input.id}". It may have been pruned (7-day TTL) or never existed.`,
      };
    }
    const lines = content.split('\n');
    const total = lines.length;
    // Clamp start into [1, total] before deriving end; otherwise a start_line
    // past EOF yields an empty slice and a returned_lines range that doesn't exist.
    const start = Math.min(Math.max(input.start_line ?? 1, 1), total);
    let end = input.end_line ?? start + VIEW_TRUNCATED_DEFAULT_LINES - 1;
    if (end > total) end = total;
    if (end < start) end = start;
    const slice = lines.slice(start - 1, end).join('\n');
    // Re-slicing this view isn't truncation in the stash-to-tmpfs sense: the
    // model already has the id, so there's no point stashing the slice again.
    const truncated = total > end || start > 1;
    return {
      id: input.id,
      content: slice,
      total_lines: total,
      returned_lines: [start, end],
      truncated,
    };
  },
 };
 // v1.8 Level 1 branch awareness: gives the model a read-only view of the
 // project's git state. No path input — operates on the inference-resolved
 // project root via getGitMeta. Subprocess runs with a 2s timeout (see git_meta).
@@ -527,8 +646,14 @@ export const askUserInput: ToolDef<AskUserInputInputT> = {
  },
 };
 // v1.13.3: alpha-sorted by tool.name at module load. llama.cpp's prompt
 // cache hits on byte-identical prefixes; the tool list lives near the top
 // of the system prompt, so any order drift would invalidate every cached
 // turn. Single source of truth for ordering lives here — toolJsonSchemas()
 // and TOOLS_BY_NAME inherit it.
 export const ALL_TOOLS: ReadonlyArray<ToolDef<unknown>> = [
  viewFile as ToolDef<unknown>,
  viewTruncatedOutput as ToolDef<unknown>,
  listDir as ToolDef<unknown>,
  grep as ToolDef<unknown>,
  findFiles as ToolDef<unknown>,
@@ -553,7 +678,12 @@ export const ALL_TOOLS: ReadonlyArray<ToolDef<unknown>> = [
  watchChanges as ToolDef<unknown>,
  getSemanticNeighborhoods as ToolDef<unknown>,
  getFrameworkAnalysis as ToolDef<unknown>,
-];
+  // v1.13.17-cross-repo-reads: paired with the pause-on-pending-grant
  // branch in tool-phase.ts. Read-only — only ever READS files; the only
  // state change is appending to sessions.allowed_read_paths via the
  // grant endpoint, gated by user consent.
  requestReadAccess as ToolDef<unknown>,
 ].sort((a, b) => a.name.localeCompare(b.name));
 // v1.8.2: forward-compatible read-only whitelist. An agent whose `tools` is
 // fully contained in this set gets a generous default tool budget (30);
@@ -565,6 +695,7 @@ export const ALL_TOOLS: ReadonlyArray<ToolDef<unknown>> = [
 // project state, so it belongs in the read-only set for budget purposes.
 export const READ_ONLY_TOOL_NAMES = [
  'view_file',
  'view_truncated_output',
  'list_dir',
  'grep',
  'find_files',
@@ -588,12 +719,74 @@ export const READ_ONLY_TOOL_NAMES = [
  'watch_changes',
  'get_semantic_neighborhoods',
  'get_framework_analysis',
  // v1.13.17-cross-repo-reads: pauses execution but doesn't mutate project
  // state directly (the grant endpoint appends to sessions.allowed_read_paths
  // only with user consent). Belongs in the read-only budget tier.
  'request_read_access',
 ] as const;
 export const TOOLS_BY_NAME: Record<string, ToolDef<unknown>> = Object.fromEntries(
  ALL_TOOLS.map((t) => [t.name, t])
 );
 // v1.13.15-tools: tiered tool loading. BOOCODE_TOOLS env var (`core` |
 // `standard` | `all`) filters the agent's tool whitelist before LLM dispatch.
 // Daily-driver token win on qwen3.6-35b-a3b — the 35B-A3B MoE benefits from
 // any prompt-cache stability win (fewer tools = shorter, more stable tool
 // schemas in the system prompt). Pattern lift from eyaltoledano/claude-task-
 // master (MIT + Commons Clause — pattern only, no code lift).
 //
 // The env var is a CEILING. It only narrows; never expands an agent's
 // declared whitelist. Default behavior (var unset) is unchanged: all tools.
 export const CORE_TOOL_NAMES = [
  'view_file',
  'list_dir',
  'grep',
  'find_files',
 ] as const;
 export const STANDARD_TOOL_NAMES = [
  ...CORE_TOOL_NAMES,
  'web_search',
  'web_fetch',
  'git_status',
  'get_codebase_overview',
  'get_file_analysis',
  'get_symbol_info',
  'search_symbols',
  'get_dependencies',
  'watch_changes',
  'get_semantic_neighborhoods',
  'get_framework_analysis',
 ] as const;
 // Module-load validation: every name in CORE / STANDARD must exist in
 // TOOLS_BY_NAME. Catches typos and stale tier definitions before they reach
 // production; server boot fails loudly rather than silently filtering valid
 // tools out of agent whitelists.
 for (const name of CORE_TOOL_NAMES) {
  if (!TOOLS_BY_NAME[name]) {
    throw new Error(`CORE_TOOL_NAMES references unknown tool: '${name}'`);
  }
 }
 for (const name of STANDARD_TOOL_NAMES) {
  if (!TOOLS_BY_NAME[name]) {
    throw new Error(`STANDARD_TOOL_NAMES references unknown tool: '${name}'`);
  }
 }
 export function resolveToolTier(tier: string | undefined): readonly string[] {
  switch ((tier ?? 'all').toLowerCase()) {
    case 'core':
      return CORE_TOOL_NAMES;
    case 'standard':
      return STANDARD_TOOL_NAMES;
    case 'all':
    default:
      return ALL_TOOLS.map((t) => t.name);
  }
 }
 export function toolJsonSchemas(): ToolJsonSchema[] {
  return ALL_TOOLS.map((t) => t.jsonSchema);
 }
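The tier comments above describe BOOCODE_TOOLS as a ceiling that only narrows an agent's declared whitelist. A minimal sketch of that intersection follows, assuming a hypothetical helper `applyTierCeiling` (not part of this diff) and abbreviated stand-in lists; the real filtering site may look different.

```typescript
// Hypothetical helper, not part of this diff: applies a tier as a ceiling.
// The result keeps the agent's declared order and never adds a tool the
// agent did not declare.
function applyTierCeiling(
  agentTools: readonly string[],
  tierTools: readonly string[],
): string[] {
  const allowed = new Set(tierTools);
  return agentTools.filter((name) => allowed.has(name));
}

// Abbreviated stand-ins for CORE_TOOL_NAMES and an agent's whitelist.
const coreTier = ['view_file', 'list_dir', 'grep', 'find_files'];
const agentWhitelist = ['grep', 'web_fetch', 'view_file'];

// grep and view_file survive; web_fetch is filtered out by the core tier.
const effectiveTools = applyTierCeiling(agentWhitelist, coreTier);
console.log(effectiveTools);
```

Filtering the agent's list (rather than the tier's) is what makes the tier a ceiling: a tool missing from the agent's whitelist can never appear in the result, whatever the tier says.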
--- a/apps/server/src/services/tools/codecontext/get_dependencies.ts
+++ b/apps/server/src/services/tools/codecontext/get_dependencies.ts
@@ -5,7 +5,7 @@ import type { ToolDef } from '../../tools.js';
 import { callCodecontext, type CodecontextResponse } from '../../codecontext_client.js';
 export const GetDependenciesInput = z.object({
-  file_path: z.string().optional(),
+  file_path: z.string().trim().optional(),
  direction: z.enum(['incoming', 'outgoing', 'both']).optional(),
 });
 export type GetDependenciesInputT = z.infer<typeof GetDependenciesInput>;
--- a/apps/server/src/services/tools/codecontext/get_file_analysis.ts
+++ b/apps/server/src/services/tools/codecontext/get_file_analysis.ts
@@ -5,7 +5,7 @@ import type { ToolDef } from '../../tools.js';
 import { callCodecontext, type CodecontextResponse } from '../../codecontext_client.js';
 export const GetFileAnalysisInput = z.object({
-  file_path: z.string().min(1),
+  file_path: z.string().trim().min(1),
 });
 export type GetFileAnalysisInputT = z.infer<typeof GetFileAnalysisInput>;
--- a/apps/server/src/services/tools/codecontext/get_semantic_neighborhoods.ts
+++ b/apps/server/src/services/tools/codecontext/get_semantic_neighborhoods.ts
@@ -5,7 +5,7 @@ import type { ToolDef } from '../../tools.js';
 import { callCodecontext, type CodecontextResponse } from '../../codecontext_client.js';
 export const GetSemanticNeighborhoodsInput = z.object({
-  file_path: z.string().optional(),
+  file_path: z.string().trim().optional(),
  include_basic: z.boolean().optional(),
  include_quality: z.boolean().optional(),
  max_results: z.number().int().positive().optional(),
--- a/apps/server/src/services/tools/codecontext/get_symbol_info.ts
+++ b/apps/server/src/services/tools/codecontext/get_symbol_info.ts
@@ -6,7 +6,7 @@ import { callCodecontext, type CodecontextResponse } from '../../codecontext_cli
 export const GetSymbolInfoInput = z.object({
  symbol_name: z.string().min(1),
-  file_path: z.string().optional(),
+  file_path: z.string().trim().optional(),
  framework_type: z.string().optional(),
 });
 export type GetSymbolInfoInputT = z.infer<typeof GetSymbolInfoInput>;
--- a/apps/server/src/services/truncate.ts
+++ b/apps/server/src/services/truncate.ts
@@ -0,0 +1,170 @@
 import { promises as fs } from 'fs';
 import { randomBytes } from 'crypto';
 import path from 'path';
 import type { Sql } from '../db.js';
 // v1.13.5: opencode-style truncation storage. When a tool slice would cut
 // content the model might still want, we store the full text on tmpfs and
 // hand the model an opaque id. view_truncated_output(id) retrieves it.
 //
 // Tmpfs path means full content vanishes on container restart; chats that
 // outlive a restart lose retrieval (acceptable — the user has usually moved
 // on or the data is stale). 7-day TTL + orphan reap bound disk growth via
 // the periodic sweeper in index.ts.
 export const TRUNCATION_DIR = process.env.BOOCODE_TRUNCATION_DIR ?? '/tmp/boocode-truncations';
 export const TRUNCATION_TTL_MS = 7 * 24 * 60 * 60 * 1000;
 // Matches view_file's MAX_FILE_BYTES — anything bigger was already refused
 // at the source tool's size check, so we never see it here.
 export const MAX_TRUNCATION_BYTES = 5 * 1024 * 1024;
 const ID_RE = /^tr_[0-9a-v]{12}$/;
 let dirEnsured = false;
 async function ensureDir(): Promise<void> {
  if (dirEnsured) return;
  await fs.mkdir(TRUNCATION_DIR, { recursive: true, mode: 0o700 });
  dirEnsured = true;
 }
 // 12 base32 chars = 60 bits of entropy (5 bits per char). Collision
 // probability across a 7-day window with ~thousands of truncations is
 // essentially zero.
 function newId(): string {
  // One random byte per output char. Keeping the low 5 bits of each byte is
  // unbiased because 256 is a multiple of 32. Deriving two chars from one
  // byte would overlap bits and yield fewer than 60 bits.
  const buf = randomBytes(12);
  const alphabet = '0123456789abcdefghijklmnopqrstuv';
  let out = 'tr_';
  for (const byte of buf) {
    out += alphabet[byte & 0x1f];
  }
  return out;
 }
 function idToPath(id: string): string {
  // Defense-in-depth: the model never supplies a path component (only ids),
  // but a malformed id from anywhere else shouldn't escape TRUNCATION_DIR.
  if (!ID_RE.test(id)) {
    throw new Error(`Invalid truncation id: ${id}`);
  }
  return path.join(TRUNCATION_DIR, id);
 }
 export async function storeTruncation(fullContent: string): Promise<string> {
  const bytes = Buffer.byteLength(fullContent, 'utf8');
  if (bytes > MAX_TRUNCATION_BYTES) {
    throw new Error(`Truncation content ${bytes}B exceeds ${MAX_TRUNCATION_BYTES}B cap`);
  }
  await ensureDir();
  const id = newId();
  await fs.writeFile(idToPath(id), fullContent, { encoding: 'utf8', mode: 0o600 });
  return id;
 }
 export async function readTruncation(id: string): Promise<string | null> {
  if (!ID_RE.test(id)) return null;
  try {
    return await fs.readFile(idToPath(id), { encoding: 'utf8' });
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') return null;
    throw err;
  }
 }
 // Wrap a tool's output. If wasTruncated, stash the full content on tmpfs
 // and return its id alongside the sliced view the tool would have returned.
 // Storage failure (disk full, permission denied) is non-fatal — the sliced
 // view ships without an outputPath, which is exactly what the tool returned
 // before v1.13.5. Same goes for content over MAX_TRUNCATION_BYTES.
 export async function truncateIfNeeded(args: {
  fullContent: string;
  slicedContent: string;
  wasTruncated: boolean;
 }): Promise<{ content: string; truncated: boolean; outputPath?: string }> {
  if (!args.wasTruncated) {
    return { content: args.slicedContent, truncated: false };
  }
  const bytes = Buffer.byteLength(args.fullContent, 'utf8');
  if (bytes > MAX_TRUNCATION_BYTES) {
    return { content: args.slicedContent, truncated: true };
  }
  try {
    const outputPath = await storeTruncation(args.fullContent);
    return { content: args.slicedContent, truncated: true, outputPath };
  } catch {
    return { content: args.slicedContent, truncated: true };
  }
 }
 // Periodic cleanup. Called from index.ts's sweep interval (v1.13.3 cadence).
 // Pass 1: TTL — anything older than TRUNCATION_TTL_MS is gone.
 // Pass 2: orphans — files with no live message_parts.payload->'output'->>'outputPath'
 // reference. Catches the case where a part referencing an outputPath got
 // hidden by prune (v1.13.4) and the file is now unreachable.
 export async function cleanupTruncations(args: {
  sql: Sql;
  log: { warn: (obj: object, msg: string) => void; error: (obj: object, msg: string) => void };
 }): Promise<{ ttlReaped: number; orphanReaped: number }> {
  await ensureDir();
  const cutoff = Date.now() - TRUNCATION_TTL_MS;
  let ttlReaped = 0;
  let orphanReaped = 0;
  let entries: string[];
  try {
    entries = await fs.readdir(TRUNCATION_DIR);
  } catch (err) {
    args.log.error({ err }, 'cleanupTruncations readdir failed');
    return { ttlReaped, orphanReaped };
  }
  if (entries.length === 0) return { ttlReaped, orphanReaped };
  const survivors: string[] = [];
  for (const name of entries) {
    if (!ID_RE.test(name)) continue;
    const full = path.join(TRUNCATION_DIR, name);
    try {
      const stat = await fs.stat(full);
      if (stat.mtimeMs < cutoff) {
        await fs.unlink(full);
        ttlReaped += 1;
      } else {
        survivors.push(name);
      }
    } catch {
      // File vanished between readdir and stat — fine.
    }
  }
  if (survivors.length === 0) {
    if (ttlReaped > 0) {
      args.log.warn({ ttlReaped, orphanReaped: 0 }, 'cleanupTruncations reaped files');
    }
    return { ttlReaped, orphanReaped: 0 };
  }
  // outputPath rides inside the tool_result part's payload.output object
  // (see partsFromToolMessage in inference/parts.ts), so the json path is
  // payload->'output'->>'outputPath' rather than top-level.
  const referenced = await args.sql<{ output_path: string }[]>`
    SELECT DISTINCT p.payload->'output'->>'outputPath' AS output_path
    FROM message_parts p
    WHERE p.kind = 'tool_result'
      AND p.payload->'output' ? 'outputPath'
      AND p.payload->'output'->>'outputPath' = ANY(${survivors})
  `;
  const live = new Set(referenced.map((r) => r.output_path));
  for (const name of survivors) {
    if (live.has(name)) continue;
    try {
      await fs.unlink(path.join(TRUNCATION_DIR, name));
      orphanReaped += 1;
    } catch {
      // ignore
    }
  }
  if (ttlReaped > 0 || orphanReaped > 0) {
    args.log.warn({ ttlReaped, orphanReaped }, 'cleanupTruncations reaped files');
  }
  return { ttlReaped, orphanReaped };
 }
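The id format used throughout this file is `tr_` plus 12 characters from a 32-symbol alphabet. As a hedged illustration of how 12 bytes can map onto that shape, here is a small sketch; `encodeTruncationId` is a hypothetical name, and it takes the bytes as an argument so the mapping can be checked deterministically.

```typescript
// Sketch only: maps each input byte to one base32 character by keeping its
// low 5 bits. The alphabet matches the /^tr_[0-9a-v]{12}$/ pattern.
const ID_ALPHABET = '0123456789abcdefghijklmnopqrstuv';

function encodeTruncationId(bytes: Uint8Array): string {
  if (bytes.length !== 12) {
    throw new Error(`expected 12 bytes, got ${bytes.length}`);
  }
  let out = 'tr_';
  bytes.forEach((byte) => {
    out += ID_ALPHABET.charAt(byte & 0x1f);
  });
  return out;
}

// Fixed bytes chosen to exercise both ends of the alphabet.
const sampleId = encodeTruncationId(
  new Uint8Array([0, 1, 2, 31, 32, 33, 63, 64, 255, 10, 20, 30]),
);
console.log(sampleId); // tr_012v01v0vaku
```

In the server the bytes would presumably come from `randomBytes(12)` in Node's `crypto` module. Because 256 is a multiple of 32, masking with `0x1f` keeps every character uniformly distributed.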
--- a/apps/server/src/services/web_fetch.ts
+++ b/apps/server/src/services/web_fetch.ts
@@ -11,6 +11,7 @@
 import { z } from 'zod';
 import { isPublicUrl } from './url_guard.js';
 import type { ToolDef } from './tools.js';
 import { truncateIfNeeded } from './truncate.js';
 const WebFetchInput = z.object({
  url: z.string().min(1).max(2048),
@@ -62,6 +63,39 @@ function stripHtml(html: string): { text: string; title: string | undefined } {
  return { text, title };
 }
 // v1.11.10: streaming body reader. Aborts the response stream the instant
 // cumulative bytes cross maxBytes, so a server that lies about
 // Content-Length (or omits it entirely) can't make us buffer gigabytes
 // before the post-read check fires. reader.cancel() releases the
 // underlying connection on the spot.
 async function readBodyCapped(
  res: Response,
  maxBytes: number,
 ): Promise<{ ok: true; body: string } | { ok: false; bytesRead: number }> {
  if (!res.body) return { ok: true, body: '' };
  const reader = res.body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      total += value.byteLength;
      if (total > maxBytes) {
        // Best-effort cancel — surfaces on the server side as a closed
        // connection and (in our tests) fires the ReadableStream's
        // cancel() callback so we can assert the abort happened.
        await reader.cancel();
        return { ok: false, bytesRead: total };
      }
      chunks.push(value);
    }
  } finally {
    try { reader.releaseLock(); } catch { /* already released by cancel() */ }
  }
  return { ok: true, body: Buffer.concat(chunks).toString('utf8') };
 }
 function truncate(text: string, max: number): { content: string; truncated: boolean } {
  if (text.length <= max) return { content: text, truncated: false };
  const omitted = text.length - max;
@@ -159,19 +193,20 @@ export async function executeWebFetch(
    }
  }
  const contentType = (res.headers.get('content-type') ?? '').toLowerCase();
-  // Read body. We rely on the 5MB cap by checking length after consumption
-  // — most malicious or accidental large responses also exceed it via the
-  // Content-Length pre-flight above. A truly hostile server that lies
-  // about length AND streams gigabytes would defeat that; the per-hop
-  // 15s timeout is the secondary fence.
-  const body = await res.text();
-  // v1.11.8 review: byte-count, not char-count. A 5MB cap on body.length
-  // (UTF-16 code units) lets a multi-byte payload (emoji, CJK) pass when
-  // its wire size already exceeded MAX_BYTES.
-  const bodyBytes = Buffer.byteLength(body, 'utf8');
-  if (bodyBytes > MAX_BYTES) {
-    return { error: 'response_too_large', reason: `body ${bodyBytes} bytes > ${MAX_BYTES}` };
+  // v1.11.10: stream the body with a hard byte cap. Previously we read
+  // res.text() in one shot and then byte-length-checked — a server that
+  // lies about Content-Length (or omits it) could make us buffer
+  // gigabytes before the post-check fired. readBodyCapped aborts the
+  // stream the instant total bytes cross MAX_BYTES. The Content-Length
+  // pre-flight above stays as a cheap early reject for honest servers.
+  const read = await readBodyCapped(res, MAX_BYTES);
+  if (!read.ok) {
+    return {
+      error: 'body_too_large',
+      reason: `Response body exceeded ${MAX_BYTES} bytes (read ${read.bytesRead} before abort)`,
+    };
  }
  const body = read.body;
  let textRaw: string;
  let title: string | undefined;
@@ -196,15 +231,24 @@ export async function executeWebFetch(
  }
  const truncated = truncate(textRaw, maxChars);
  // v1.13.5: stash the full pre-slice body when truncation fires so the
  // model can pull more via view_truncated_output(id) without re-fetching.
  // textRaw is already bounded by MAX_BYTES (5MB), within truncate.ts's cap.
  const wrapped = await truncateIfNeeded({
    fullContent: textRaw,
    slicedContent: truncated.content,
    wasTruncated: truncated.truncated,
  });
  // Report the FINAL URL (post-redirects) so the LLM knows where the body
  // came from — useful for citations and for the model to reason about
  // domain trust.
  return {
    url: currentUrl,
    title,
-    content: truncated.content,
+    content: wrapped.content,
    content_type: contentType,
-    truncated: truncated.truncated,
+    truncated: wrapped.truncated,
    ...(wrapped.outputPath ? { outputPath: wrapped.outputPath } : {}),
  };
 }
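The capped read above stops as soon as the running byte total crosses the limit, instead of buffering everything and checking afterwards. The sketch below shows the same accounting in a synchronous form over an in-memory chunk list; `collectCapped` and `CappedResult` are hypothetical names, and the real `readBodyCapped` works on a `ReadableStream` reader and also cancels the connection.

```typescript
// Sketch only: a synchronous stand-in for the streaming loop. It stops at the
// first chunk that pushes the running total past maxBytes and never looks
// at later chunks, mirroring the early abort in readBodyCapped.
type CappedResult =
  | { ok: true; bytes: Uint8Array }
  | { ok: false; bytesRead: number; chunksSeen: number };

function collectCapped(chunks: Uint8Array[], maxBytes: number): CappedResult {
  const kept: Uint8Array[] = [];
  let total = 0;
  let seen = 0;
  for (const chunk of chunks) {
    seen += 1;
    total += chunk.byteLength;
    if (total > maxBytes) {
      // Over the cap: report how far we got and stop consuming.
      return { ok: false, bytesRead: total, chunksSeen: seen };
    }
    kept.push(chunk);
  }
  // Under the cap: concatenate the kept chunks into one buffer.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of kept) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return { ok: true, bytes };
}

// Exactly at the cap is accepted; one byte over is rejected.
const withinCap = collectCapped([new Uint8Array(4), new Uint8Array(4)], 8);
const overCap = collectCapped(
  [new Uint8Array(4), new Uint8Array(6), new Uint8Array(100)],
  8,
);
console.log(withinCap.ok, overCap.ok);
```

Note that the third 100-byte chunk in the second call is never counted: the function returns after the second chunk, which is the property that keeps a server with a missing or false Content-Length from forcing a large buffer.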
--- a/apps/server/src/types/api.ts
+++ b/apps/server/src/types/api.ts
@@ -39,6 +39,53 @@ export interface Session {
  // project.default_web_search_enabled. Plumbed but inert in v1.9 — the
  // actual web_search tool ships in Batch 8.
  web_search_enabled: boolean | null;
  // v1.12.1: server-side workspace pane layout. Replaces per-device
  // localStorage so all devices viewing the session see the same panes.
  workspace_panes: WorkspacePane[];
  // v1.13.17: absolute paths the agent has been granted read access to via
  // the request_read_access tool. Empty by default; populated only by the
  // grant_read_access endpoint's allow branch. Revoked via PATCH session.
  // path_guard's extraRoots check consults this list before refusing reads
  // outside the primary project root.
  allowed_read_paths: string[];
 }
 // v1.14.x-html-artifact-panes: 'markdown_artifact' + 'html_artifact' added.
 // Optional payload state lives on the pane row itself so the jsonb survives
 // a hard reload without needing a re-fetch.
 export type WorkspacePaneKind =
  | 'chat'
  | 'terminal'
  | 'agent'
  | 'empty'
  | 'settings'
  | 'markdown_artifact'
  | 'html_artifact';
 // v1.14.x: reference-only — the actual artifact body lives in the message
 // row (markdown) or message_parts.payload (html_artifact). Pane components
 // fetch on mount.
 export interface MarkdownArtifactState {
  chat_id: string;
  message_id: string;
  title: string;
 }
 export interface HtmlArtifactState {
  chat_id: string;
  message_id: string;
  title: string;
 }
 export interface WorkspacePane {
  id: string;
  kind: WorkspacePaneKind;
  chatId?: string;
  chatIds: string[];
  activeChatIdx: number;
  // v1.14.x: populated only when kind === 'markdown_artifact' / 'html_artifact'.
  markdown_artifact_state?: MarkdownArtifactState;
  html_artifact_state?: HtmlArtifactState;
 }
 // v1.8.1: agents come from two sources. 'global' = /data/AGENTS.md (always
@@ -173,6 +220,11 @@ export interface Message {
  // v1.8.2: per-message metadata. See MessageMetadata for the discriminated
  // shapes currently in use.
  metadata: MessageMetadata | null;
  // v1.13.1-C: reasoning content captured from the model's reasoning stream
  // (qwen3.6 etc.). Populated from message_parts via the messages_with_parts
  // view's reasoning_parts column. Optional — most rows have no reasoning
  // and the API may omit the field on legacy responses.
  reasoning_parts?: Array<{ text: string }> | null;
  // v1.11: anchored rolling compaction. Optional so consumers that SELECT
  // the pre-v1.11 column set still type-check. See compaction.ts +
  // schema.sql for semantics.
@@ -273,6 +325,11 @@ export interface SessionRenamedFrame {
  session_id: string;
  name: string;
 }
 export interface SessionWorkspaceUpdatedFrame {
  type: 'session_workspace_updated';
  session_id: string;
  workspace_panes: WorkspacePane[];
 }
 export interface SessionArchivedFrame {
  type: 'session_archived';
  session_id: string;
@@ -324,7 +381,7 @@ export interface ProjectUpdatedFrame {
 export interface ChatStatusFrame {
  type: 'chat_status';
  chat_id: string;
-  status: 'working' | 'idle' | 'error';
+  status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
  at: string;
  reason?: ErrorReason;
 }
@@ -335,6 +392,7 @@ export type UserStreamFrame =
  | SessionDeletedFrame
  | SessionUpdatedFrame
  | SessionRenamedFrame
  | SessionWorkspaceUpdatedFrame
  | SessionArchivedFrame
  | ChatCreatedFrame
  | ChatUpdatedFrame
--- a/apps/server/src/types/ws-frames.ts
+++ b/apps/server/src/types/ws-frames.ts
@@ -0,0 +1,328 @@
 // v1.13.11-a: Zod schemas for every WebSocket frame published by the server.
 // Validation runs both on send (broker.publishFrame / publishUserFrame) and
 // on receive (apps/web/src/hooks/useSessionStream + useUserEvents). Catches
 // silent protocol drift between publisher and consumer.
 //
 // IMPORTANT: This file is duplicated byte-identical at
 // apps/web/src/api/ws-frames.ts. The two apps have separate tsconfigs and
 // no path alias; the duplication is sync-by-hand. A test asserts the two
 // files match. If you change one, change the other.
 //
 // Per-kind payload schemas (tool_call args, message_parts payloads, etc.)
 // stay z.unknown() in v1.13.11. Frame-level drift detection is the goal;
 // deep payload validation is follow-up work.
 import { z } from 'zod';
 // ---- shared primitives -----------------------------------------------------
 const Uuid = z.string().uuid();
 // Tool call IDs are model-emitted (e.g. "call_abc123") — not UUIDs.
 const ToolCallId = z.string().min(1);
 // v1.13.12 fix: postgres returns timestamp columns as JS Date objects, not
 // strings. The publish sites pass them through unchanged, so the schema must
 // tolerate both. preprocess converts Date → ISO string before string-validation;
 // on the web side (where frames arrive via JSON.parse) it's a no-op. Before
 // this fix, every message_complete / session_updated / chat_updated frame
 // failed validation and got dropped — symptoms: token tracking blank in UI,
 // status stuck at 'streaming' tripping the 60s stale-stream banner.
 const IsoTimestamp = z.preprocess(
  (v) => (v instanceof Date ? v.toISOString() : v),
  z.string().min(1),
 );
 const ChatStatusValue = z.enum([
  'streaming',
  'tool_running',
  'waiting_for_input',
  'idle',
  'error',
 ]);
 const ErrorReasonValue = z.enum([
  'llm_provider_error',
  'doom_loop',
  'doom_loop_summary_failed',
  'cap_hit',
  'cap_hit_summary_failed',
 ]);
 const MessageRoleValue = z.enum(['user', 'assistant', 'system', 'tool']);
 const ToolCallShape = z.object({
  id: ToolCallId,
  name: z.string().min(1),
  args: z.record(z.string(), z.unknown()),
 });
 // Free-form bags: opaque to the frame schema; deep validation is out of
 // scope for v1.13.11 (frame-level drift detection is the goal; per-kind
 // payload narrowing is follow-up work). z.unknown() means the consumer
 // must narrow before reading — TypeScript-side this is fine because every
 // consumer already operates on the hand-maintained Project / Chat / Session
 // / WorkspacePane types (the brief's "Don't strip existing types yet"
 // rule), and the Zod-typed shape is only used at the publishFrame boundary.
 const OpaqueObject = z.unknown();
 // ---- per-session channel frames --------------------------------------------
 export const SnapshotFrame = z.object({
  type: z.literal('snapshot'),
  messages: z.array(OpaqueObject),
 });
 export const MessageStartedFrame = z.object({
  type: z.literal('message_started'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  role: MessageRoleValue,
 });
 export const DeltaFrame = z.object({
  type: z.literal('delta'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  content: z.string(),
 });
 export const ToolCallFrame = z.object({
  type: z.literal('tool_call'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call: ToolCallShape,
 });
 export const ToolResultFrame = z.object({
  type: z.literal('tool_result'),
  tool_message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call_id: ToolCallId,
  output: z.unknown(),
  truncated: z.boolean(),
  error: z.string().optional(),
 });
 export const MessageCompleteFrame = z.object({
  type: z.literal('message_complete'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tokens_used: z.number().int().nonnegative().nullable().optional(),
  ctx_used: z.number().int().nonnegative().nullable().optional(),
  ctx_max: z.number().int().positive().nullable().optional(),
  started_at: IsoTimestamp.nullable().optional(),
  finished_at: IsoTimestamp.nullable().optional(),
  model: z.string().optional(),
  metadata: OpaqueObject.nullable().optional(),
 });
 export const UsageFrame = z.object({
  type: z.literal('usage'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  completion_tokens: z.number().int().nonnegative().nullable(),
  ctx_used: z.number().int().nonnegative().nullable(),
  ctx_max: z.number().int().positive().nullable(),
 });
 export const MessagesDeletedFrame = z.object({
  type: z.literal('messages_deleted'),
  message_ids: z.array(Uuid),
  chat_id: Uuid.optional(),
 });
 export const ChatRenamedFrame = z.object({
  type: z.literal('chat_renamed'),
  chat_id: Uuid,
  name: z.string(),
 });
 export const CompactedFrame = z.object({
  type: z.literal('compacted'),
  session_id: Uuid,
  chat_id: Uuid,
  summary_message_id: Uuid,
 });
 export const ErrorFrame = z.object({
  type: z.literal('error'),
  message_id: Uuid.optional(),
  chat_id: Uuid.optional(),
  error: z.string(),
  reason: ErrorReasonValue.optional(),
 });
 // ---- per-user channel frames (sidebar refresh) -----------------------------
 export const ChatStatusFrame = z.object({
  type: z.literal('chat_status'),
  chat_id: Uuid,
  status: ChatStatusValue,
  at: IsoTimestamp,
  reason: ErrorReasonValue.optional(),
 });
 export const SessionUpdatedFrame = z.object({
  type: z.literal('session_updated'),
  session_id: Uuid,
  project_id: Uuid,
  name: z.string(),
  updated_at: IsoTimestamp,
 });
 export const SessionRenamedFrame = z.object({
  type: z.literal('session_renamed'),
  session_id: Uuid,
  name: z.string(),
 });
 export const SessionCreatedFrame = z.object({
  type: z.literal('session_created'),
  session: OpaqueObject,
  project_id: Uuid,
 });
 export const SessionArchivedFrame = z.object({
  type: z.literal('session_archived'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionDeletedFrame = z.object({
  type: z.literal('session_deleted'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionWorkspaceUpdatedFrame = z.object({
  type: z.literal('session_workspace_updated'),
  session_id: Uuid,
  workspace_panes: z.array(OpaqueObject),
 });
 export const ChatCreatedFrame = z.object({
  type: z.literal('chat_created'),
  chat: OpaqueObject,
  session_id: Uuid,
 });
 export const ChatUpdatedFrame = z.object({
  type: z.literal('chat_updated'),
  chat_id: Uuid,
  session_id: Uuid,
  name: z.string().nullable(),
  updated_at: IsoTimestamp,
 });
 export const ChatArchivedFrame = z.object({
  type: z.literal('chat_archived'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ChatUnarchivedFrame = z.object({
  type: z.literal('chat_unarchived'),
  chat: OpaqueObject,
 });
 export const ChatDeletedFrame = z.object({
  type: z.literal('chat_deleted'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ProjectCreatedFrame = z.object({
  type: z.literal('project_created'),
  project: OpaqueObject,
 });
 export const ProjectArchivedFrame = z.object({
  type: z.literal('project_archived'),
  project_id: Uuid,
 });
 export const ProjectUnarchivedFrame = z.object({
  type: z.literal('project_unarchived'),
  project: OpaqueObject,
 });
 export const ProjectUpdatedFrame = z.object({
  type: z.literal('project_updated'),
  project_id: Uuid,
  name: z.string(),
 });
 export const ProjectDeletedFrame = z.object({
  type: z.literal('project_deleted'),
  project_id: Uuid,
 });
 // ---- discriminated union ---------------------------------------------------
 export const WsFrameSchema = z.discriminatedUnion('type', [
  // per-session
  SnapshotFrame,
  MessageStartedFrame,
  DeltaFrame,
  ToolCallFrame,
  ToolResultFrame,
  MessageCompleteFrame,
  UsageFrame,
  MessagesDeletedFrame,
  ChatRenamedFrame,
  CompactedFrame,
  ErrorFrame,
  // per-user
  ChatStatusFrame,
  SessionUpdatedFrame,
  SessionRenamedFrame,
  SessionCreatedFrame,
  SessionArchivedFrame,
  SessionDeletedFrame,
  SessionWorkspaceUpdatedFrame,
  ChatCreatedFrame,
  ChatUpdatedFrame,
  ChatArchivedFrame,
  ChatUnarchivedFrame,
  ChatDeletedFrame,
  ProjectCreatedFrame,
  ProjectArchivedFrame,
  ProjectUnarchivedFrame,
  ProjectUpdatedFrame,
  ProjectDeletedFrame,
 ]);
 export type WsFrame = z.infer<typeof WsFrameSchema>;
 // Convenience: the set of known frame types. Useful for the publishFrame
 // helper to log the offending type name when validation fails. Kept in sync
 // by hand with the discriminated union above.
 export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
  'snapshot',
  'message_started',
  'delta',
  'tool_call',
  'tool_result',
  'message_complete',
  'usage',
  'messages_deleted',
  'chat_renamed',
  'compacted',
  'error',
  'chat_status',
  'session_updated',
  'session_renamed',
  'session_created',
  'session_archived',
  'session_deleted',
  'session_workspace_updated',
  'chat_created',
  'chat_updated',
  'chat_archived',
  'chat_unarchived',
  'chat_deleted',
  'project_created',
  'project_archived',
  'project_unarchived',
  'project_updated',
  'project_deleted',
 ] as const;
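Review note on the "kept in sync by hand" comment above: the list and the union could be cross-checked by the compiler. The following is a minimal, dependency-free sketch of that idea, not something this commit does. `DemoFrame`, `DEMO_FRAME_TYPES`, `listMatchesUnion` and `isKnownFrameType` are hypothetical names, and the two-member union stands in for `z.infer<typeof WsFrameSchema>`.

```typescript
// Hypothetical two-member union standing in for z.infer<typeof WsFrameSchema>.
type DemoFrame =
  | { type: 'delta'; content: string }
  | { type: 'usage'; ctx_used: number | null };

// No widening annotation here, so the tuple keeps its literal element types.
const DEMO_FRAME_TYPES = ['delta', 'usage'] as const;
type Listed = (typeof DEMO_FRAME_TYPES)[number];

// Union members absent from the list, and list entries absent from the union.
type Missing = Exclude<DemoFrame['type'], Listed>;
type Extra = Exclude<Listed, DemoFrame['type']>;

// Compiles only while both sets are empty; removing 'usage' from the list
// turns the conditional type into `false` and the assignment is rejected.
const listMatchesUnion: [Missing | Extra] extends [never] ? true : false = true;

// Runtime guard a publish helper could use before logging an unknown type.
function isKnownFrameType(t: string): t is DemoFrame['type'] {
  return (DEMO_FRAME_TYPES as readonly string[]).indexOf(t) !== -1;
}

console.log(listMatchesUnion, isKnownFrameType('delta'), isKnownFrameType('bogus'));
```

The check only works if the list is declared without the `readonly WsFrame['type'][]` annotation, since that annotation widens the tuple and discards the literal element types the comparison relies on.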
--- a/apps/web/package.json
+++ b/apps/web/package.json
@@ -31,7 +31,8 @@
    "shiki": "^1.29.2",
    "sonner": "^2.0.7",
    "tailwind-merge": "^3.6.0",
-    "tw-animate-css": "^1.4.0"
+    "tw-animate-css": "^1.4.0",
    "zod": "^3.23.8"
  },
  "devDependencies": {
    "@tailwindcss/postcss": "^4.3.0",
--- a/apps/web/src/api/client.ts
+++ b/apps/web/src/api/client.ts
@@ -12,6 +12,7 @@ import type {
  GitMeta,
  Skill,
  AskUserAnswer,
  ToolCostStat,
 } from './types';
 export class ApiError extends Error {
@@ -122,7 +123,20 @@ export const api = {
    get: (id: string) => request<Session>(`/api/sessions/${id}`),
    update: (
      id: string,
-      body: Partial<Pick<Session, 'name' | 'model' | 'system_prompt' | 'agent_id' | 'web_search_enabled'>>
+      body: Partial<
        Pick<
          Session,
          | 'name'
          | 'model'
          | 'system_prompt'
          | 'agent_id'
          | 'web_search_enabled'
          // v1.13.17: revocation path — frontend sends the shortened list
          // when the user removes a grant. Grants are appended only via the
          // separate grantReadAccess endpoint below.
          | 'allowed_read_paths'
        >
      >
    ) =>
      request<Session>(`/api/sessions/${id}`, {
        method: 'PATCH',
@@ -143,6 +157,11 @@ export const api = {
      ),
    openChatsCount: (id: string) =>
      request<{ count: number }>(`/api/sessions/${id}/chats/open-count`),
    updateWorkspacePanes: (id: string, panes: Session['workspace_panes']) =>
      request<Session>(`/api/sessions/${id}/workspace`, {
        method: 'PATCH',
        body: JSON.stringify({ workspace_panes: panes }),
      }),
  },
  chats: {
@@ -175,6 +194,11 @@ export const api = {
      request<{ ok: true }>(`/api/chats/${chatId}/compact`, { method: 'POST' }),
    stop: (chatId: string) =>
      request<{ stopped: boolean }>(`/api/chats/${chatId}/stop`, { method: 'POST' }),
    discardStale: (chatId: string, messageId: string) =>
      request<Message>(`/api/chats/${chatId}/discard_stale`, {
        method: 'POST',
        body: JSON.stringify({ message_id: messageId }),
      }),
    forceSend: (chatId: string, content: string) =>
      request<{ user_message_id: string; assistant_message_id: string }>(
        `/api/chats/${chatId}/force_send`,
@@ -217,6 +241,19 @@ export const api = {
          body: JSON.stringify({ tool_call_id: toolCallId, answers }),
        },
      ),
    // v1.13.17-cross-repo-reads: resume a paused request_read_access. On
    // 'allow' the server re-resolves the grant root and appends it to
    // sessions.allowed_read_paths; the returned list reflects the post-
    // grant state. On 'deny' the array is unchanged.
    grantReadAccess: (chatId: string, toolCallId: string, decision: 'allow' | 'deny') =>
      request<{
        tool_message_id: string;
        assistant_message_id: string;
        allowed_read_paths: string[];
      }>(`/api/chats/${chatId}/grant_read_access`, {
        method: 'POST',
        body: JSON.stringify({ tool_call_id: toolCallId, decision }),
      }),
  },
  messages: {
@@ -239,6 +276,24 @@ export const api = {
      request<void>(`/api/chats/${chatId}/messages/${messageId}`, {
        method: 'DELETE',
      }),
    // v1.14.x-html-artifact-panes: write the artifact to
    // <projectRoot>/.boocode/artifacts/<slug>-<ts>.<ext> and return the
    // path + a /api/projects/.../artifacts/<filename> URL the browser can
    // GET to download. fmt=html requires the assistant message to carry an
    // html_artifact part (404 otherwise).
    downloadArtifact: (chatId: string, messageId: string, fmt: 'md' | 'html') =>
      request<{ path: string; url: string }>(
        `/api/chats/${chatId}/messages/${messageId}/artifacts/download?fmt=${fmt}`,
        { method: 'POST' },
      ),
    // v1.14.x-html-artifact-panes: fetch the html_artifact part payload so
    // HtmlArtifactPane can render the iframe srcdoc. 404 = no html_artifact
    // part on this message; MessageBubble uses that as a signal to fall back
    // to the markdown pane variant.
    getHtmlArtifact: (chatId: string, messageId: string) =>
      request<{ html_content: string; char_count: number; title: string }>(
        `/api/chats/${chatId}/messages/${messageId}/html_artifact`,
      ),
  },
  models: () => request<ModelInfo[]>('/api/models'),
@@ -252,6 +307,14 @@ export const api = {
    list: () => request<{ skills: Skill[] }>('/api/skills'),
  },
  // v1.13.10: per-tool cost rolling-window stats (last 100 calls per tool,
  // equal-split attribution across multi-tool turns). Read endpoint backed by
  // the tool_cost_stats view. AgentPicker consumes this for per-agent cost
  // hints.
  tools: {
    costStats: () => request<{ stats: ToolCostStat[] }>('/api/tools/cost_stats'),
  },
  settings: {
    get: () => request<Record<string, unknown>>('/api/settings'),
    patch: (body: Record<string, unknown>) =>
--- a/apps/web/src/api/types.ts
+++ b/apps/web/src/api/types.ts
@@ -1,6 +1,18 @@
 export const PROJECT_STATUSES = ['open', 'archived'] as const;
 export type ProjectStatus = typeof PROJECT_STATUSES[number];
 // v1.13.10: per-tool cost rolling-window stat. Returned by
 // GET /api/tools/cost_stats — one entry per tool with mean prompt/completion
 // tokens over the last 100 invocations. AgentPicker sums across an agent's
 // whitelisted tools for per-agent cost hints.
 export interface ToolCostStat {
  tool_name: string;
  mean_prompt_tokens: number;
  mean_completion_tokens: number;
  n_calls: number;
  updated_at: string;
 }
 export interface Project {
  id: string;
  name: string;
@@ -34,6 +46,13 @@ export interface Session {
  agent_id: string | null;
  // v1.9: null = inherit from project.default_web_search_enabled.
  web_search_enabled: boolean | null;
  // v1.12.1: server-authoritative pane layout, replaces localStorage.
  workspace_panes: WorkspacePane[];
  // v1.13.17: paths the agent has been granted read access to via the
  // request_read_access tool. Empty by default. Settings UI surfaces the
  // list with per-row revoke; the grant flow itself appends through the
  // dedicated POST /api/chats/:id/grant_read_access endpoint (not PATCH).
  allowed_read_paths: string[];
 }
 // v1.8.1: 'global' = /data/AGENTS.md (always-on), 'project' = per-project
@@ -159,6 +178,11 @@ export interface Message {
  // v1.8.2: per-message metadata; see MessageMetadata. null for the vast
  // majority of messages.
  metadata: MessageMetadata | null;
  // v1.13.1-C: reasoning content captured from models that stream reasoning
  // tokens separately (qwen3.6 etc.). Backend populates from message_parts;
  // optional on the wire — frontend doesn't render this yet (reserved for
  // a v1.14 UI surface).
  reasoning_parts?: Array<{ text: string }> | null;
  // v1.11: anchored rolling compaction fields. Optional on the wire so that
  // older API responses (or test fixtures) parse without explicit nulls.
  //   summary       — true on the assistant row that holds the active
@@ -292,7 +316,37 @@ export interface AskUserAnswerSet {
 // v1.9: 'settings' is an ephemeral pane kind — never persisted, always
 // singleton per workspace. The pane hook filters it out before writing to
 // localStorage and dedupes on insertion via toggleSettingsPane().
-export type WorkspacePaneKind = 'chat' | 'terminal' | 'agent' | 'empty' | 'settings';
+// v1.14.x-html-artifact-panes: 'markdown_artifact' + 'html_artifact' added.
 // Both carry a reference (chat_id + message_id + title) on the WorkspacePane
 // row itself, so useWorkspacePanes's JSON-string dedup + persisted jsonb stay
 // small; the pane component fetches the content on mount.
 export type WorkspacePaneKind =
  | 'chat'
  | 'terminal'
  | 'agent'
  | 'empty'
  | 'settings'
  | 'markdown_artifact'
  | 'html_artifact';
 // v1.14.x: per-pane artifact payloads. Optional + namespaced so older saved
 // pane rows (without these fields) deserialize unchanged.
 // v1.14.x: pane state is a reference only — the pane component fetches the
 // actual content on mount. This keeps sessions.workspace_panes jsonb small and
 // makes the message body / html_artifact part the single source of truth.
 export interface MarkdownArtifactState {
  // chat_id is needed for the download endpoint
  // (POST /api/chats/:chat_id/messages/:msg_id/artifacts/download).
  chat_id: string;
  message_id: string;
  title: string;
 }
 export interface HtmlArtifactState {
  chat_id: string;
  message_id: string;
  title: string;
 }
 export interface WorkspacePane {
  id: string;
@@ -300,6 +354,9 @@ export interface WorkspacePane {
  chatId?: string;
  chatIds: string[];
  activeChatIdx: number;
  // v1.14.x: populated only when kind === 'markdown_artifact' / 'html_artifact'.
  markdown_artifact_state?: MarkdownArtifactState;
  html_artifact_state?: HtmlArtifactState;
 }
 export type WsFrame =
@@ -330,6 +387,17 @@ export type WsFrame =
      // to the client without a refetch.
      metadata?: MessageMetadata | null;
    }
  // v1.12.2: live throughput frame, published mid-stream every ~500ms with
  // the latest token + ctx counts so ChatThroughput can render tok/s and
  // ctx_used while the model is still generating.
  | {
      type: 'usage';
      message_id: string;
      chat_id?: string;
      completion_tokens: number | null;
      ctx_used: number | null;
      ctx_max: number | null;
    }
  | { type: 'messages_deleted'; message_ids: string[]; chat_id?: string }
  | { type: 'chat_renamed'; chat_id: string; name: string }
  // v1.11: published by services/compaction.ts after the new anchored
--- a/apps/web/src/api/ws-frames.ts
+++ b/apps/web/src/api/ws-frames.ts
@@ -0,0 +1,328 @@
 // v1.13.11-a: Zod schemas for every WebSocket frame published by the server.
 // Validation runs both on send (broker.publishFrame / publishUserFrame) and
 // on receive (apps/web/src/hooks/useSessionStream + useUserEvents). Catches
 // silent protocol drift between publisher and consumer.
 //
 // IMPORTANT: This file is duplicated byte-identical at
 // apps/web/src/api/ws-frames.ts. The two apps have separate tsconfigs and
 // no path alias; the duplication is sync-by-hand. A test asserts the two
 // files match. If you change one, change the other.
 //
 // Per-kind payload schemas (tool_call args, message_parts payloads, etc.)
 // stay z.unknown() in v1.13.11. Frame-level drift detection is the goal;
 // deep payload validation is follow-up work.
 import { z } from 'zod';
 // ---- shared primitives -----------------------------------------------------
 const Uuid = z.string().uuid();
 // Tool call IDs are model-emitted (e.g. "call_abc123") — not UUIDs.
 const ToolCallId = z.string().min(1);
 // v1.13.12 fix: postgres returns timestamp columns as JS Date objects, not
 // strings. The publish sites pass them through unchanged, so the schema must
 // tolerate both. preprocess converts Date → ISO string before string-validation;
 // on the web side (where frames arrive via JSON.parse) it's a no-op. Before
 // this fix, every message_complete / session_updated / chat_updated frame
 // failed validation and got dropped — symptoms: token tracking blank in UI,
 // status stuck at 'streaming' tripping the 60s stale-stream banner.
 const IsoTimestamp = z.preprocess(
  (v) => (v instanceof Date ? v.toISOString() : v),
  z.string().min(1),
 );
 const ChatStatusValue = z.enum([
  'streaming',
  'tool_running',
  'waiting_for_input',
  'idle',
  'error',
 ]);
 const ErrorReasonValue = z.enum([
  'llm_provider_error',
  'doom_loop',
  'doom_loop_summary_failed',
  'cap_hit',
  'cap_hit_summary_failed',
 ]);
 const MessageRoleValue = z.enum(['user', 'assistant', 'system', 'tool']);
 const ToolCallShape = z.object({
  id: ToolCallId,
  name: z.string().min(1),
  args: z.record(z.string(), z.unknown()),
 });
 // Free-form bags: opaque to the frame schema; deep validation is out of
 // scope for v1.13.11 (frame-level drift detection is the goal; per-kind
 // payload narrowing is follow-up work). z.unknown() means the consumer
 // must narrow before reading — TypeScript-side this is fine because every
 // consumer already operates on the hand-maintained Project / Chat / Session
 // / WorkspacePane types (the brief's "Don't strip existing types yet"
 // rule), and the Zod-typed shape is only used at the publishFrame boundary.
 const OpaqueObject = z.unknown();
 // ---- per-session channel frames --------------------------------------------
 export const SnapshotFrame = z.object({
  type: z.literal('snapshot'),
  messages: z.array(OpaqueObject),
 });
 export const MessageStartedFrame = z.object({
  type: z.literal('message_started'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  role: MessageRoleValue,
 });
 export const DeltaFrame = z.object({
  type: z.literal('delta'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  content: z.string(),
 });
 export const ToolCallFrame = z.object({
  type: z.literal('tool_call'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call: ToolCallShape,
 });
 export const ToolResultFrame = z.object({
  type: z.literal('tool_result'),
  tool_message_id: Uuid,
  chat_id: Uuid.optional(),
  tool_call_id: ToolCallId,
  output: z.unknown(),
  truncated: z.boolean(),
  error: z.string().optional(),
 });
 export const MessageCompleteFrame = z.object({
  type: z.literal('message_complete'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  tokens_used: z.number().int().nonnegative().nullable().optional(),
  ctx_used: z.number().int().nonnegative().nullable().optional(),
  ctx_max: z.number().int().positive().nullable().optional(),
  started_at: IsoTimestamp.nullable().optional(),
  finished_at: IsoTimestamp.nullable().optional(),
  model: z.string().optional(),
  metadata: OpaqueObject.nullable().optional(),
 });
 export const UsageFrame = z.object({
  type: z.literal('usage'),
  message_id: Uuid,
  chat_id: Uuid.optional(),
  completion_tokens: z.number().int().nonnegative().nullable(),
  ctx_used: z.number().int().nonnegative().nullable(),
  ctx_max: z.number().int().positive().nullable(),
 });
 export const MessagesDeletedFrame = z.object({
  type: z.literal('messages_deleted'),
  message_ids: z.array(Uuid),
  chat_id: Uuid.optional(),
 });
 export const ChatRenamedFrame = z.object({
  type: z.literal('chat_renamed'),
  chat_id: Uuid,
  name: z.string(),
 });
 export const CompactedFrame = z.object({
  type: z.literal('compacted'),
  session_id: Uuid,
  chat_id: Uuid,
  summary_message_id: Uuid,
 });
 export const ErrorFrame = z.object({
  type: z.literal('error'),
  message_id: Uuid.optional(),
  chat_id: Uuid.optional(),
  error: z.string(),
  reason: ErrorReasonValue.optional(),
 });
 // ---- per-user channel frames (sidebar refresh) -----------------------------
 export const ChatStatusFrame = z.object({
  type: z.literal('chat_status'),
  chat_id: Uuid,
  status: ChatStatusValue,
  at: IsoTimestamp,
  reason: ErrorReasonValue.optional(),
 });
 export const SessionUpdatedFrame = z.object({
  type: z.literal('session_updated'),
  session_id: Uuid,
  project_id: Uuid,
  name: z.string(),
  updated_at: IsoTimestamp,
 });
 export const SessionRenamedFrame = z.object({
  type: z.literal('session_renamed'),
  session_id: Uuid,
  name: z.string(),
 });
 export const SessionCreatedFrame = z.object({
  type: z.literal('session_created'),
  session: OpaqueObject,
  project_id: Uuid,
 });
 export const SessionArchivedFrame = z.object({
  type: z.literal('session_archived'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionDeletedFrame = z.object({
  type: z.literal('session_deleted'),
  session_id: Uuid,
  project_id: Uuid,
 });
 export const SessionWorkspaceUpdatedFrame = z.object({
  type: z.literal('session_workspace_updated'),
  session_id: Uuid,
  workspace_panes: z.array(OpaqueObject),
 });
 export const ChatCreatedFrame = z.object({
  type: z.literal('chat_created'),
  chat: OpaqueObject,
  session_id: Uuid,
 });
 export const ChatUpdatedFrame = z.object({
  type: z.literal('chat_updated'),
  chat_id: Uuid,
  session_id: Uuid,
  name: z.string().nullable(),
  updated_at: IsoTimestamp,
 });
 export const ChatArchivedFrame = z.object({
  type: z.literal('chat_archived'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ChatUnarchivedFrame = z.object({
  type: z.literal('chat_unarchived'),
  chat: OpaqueObject,
 });
 export const ChatDeletedFrame = z.object({
  type: z.literal('chat_deleted'),
  chat_id: Uuid,
  session_id: Uuid,
 });
 export const ProjectCreatedFrame = z.object({
  type: z.literal('project_created'),
  project: OpaqueObject,
 });
 export const ProjectArchivedFrame = z.object({
  type: z.literal('project_archived'),
  project_id: Uuid,
 });
 export const ProjectUnarchivedFrame = z.object({
  type: z.literal('project_unarchived'),
  project: OpaqueObject,
 });
 export const ProjectUpdatedFrame = z.object({
  type: z.literal('project_updated'),
  project_id: Uuid,
  name: z.string(),
 });
 export const ProjectDeletedFrame = z.object({
  type: z.literal('project_deleted'),
  project_id: Uuid,
 });
 // ---- discriminated union ---------------------------------------------------
 export const WsFrameSchema = z.discriminatedUnion('type', [
  // per-session
  SnapshotFrame,
  MessageStartedFrame,
  DeltaFrame,
  ToolCallFrame,
  ToolResultFrame,
  MessageCompleteFrame,
  UsageFrame,
  MessagesDeletedFrame,
  ChatRenamedFrame,
  CompactedFrame,
  ErrorFrame,
  // per-user
  ChatStatusFrame,
  SessionUpdatedFrame,
  SessionRenamedFrame,
  SessionCreatedFrame,
  SessionArchivedFrame,
  SessionDeletedFrame,
  SessionWorkspaceUpdatedFrame,
  ChatCreatedFrame,
  ChatUpdatedFrame,
  ChatArchivedFrame,
  ChatUnarchivedFrame,
  ChatDeletedFrame,
  ProjectCreatedFrame,
  ProjectArchivedFrame,
  ProjectUnarchivedFrame,
  ProjectUpdatedFrame,
  ProjectDeletedFrame,
 ]);
 export type WsFrame = z.infer<typeof WsFrameSchema>;
 // Convenience: the set of known frame types. Useful for the publishFrame
 // helper to log the offending type name when validation fails. Kept in sync
 // by hand with the discriminated union above.
 export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
  'snapshot',
  'message_started',
  'delta',
  'tool_call',
  'tool_result',
  'message_complete',
  'usage',
  'messages_deleted',
  'chat_renamed',
  'compacted',
  'error',
  'chat_status',
  'session_updated',
  'session_renamed',
  'session_created',
  'session_archived',
  'session_deleted',
  'session_workspace_updated',
  'chat_created',
  'chat_updated',
  'chat_archived',
  'chat_unarchived',
  'chat_deleted',
  'project_created',
  'project_archived',
  'project_unarchived',
  'project_updated',
  'project_deleted',
 ] as const;
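Review note on the `IsoTimestamp` preprocess in ws-frames.ts above: the behaviour described in the v1.13.12 comment can be illustrated without Zod. This is a rough sketch under stated assumptions. `normalizeTimestamp` and `isNonEmptyString` are hypothetical stand-ins for the `z.preprocess` callback and the `z.string().min(1)` check, not functions in the codebase.

```typescript
// Stand-in for the z.preprocess callback: Date becomes an ISO string,
// everything else passes through untouched.
function normalizeTimestamp(v: unknown): unknown {
  return v instanceof Date ? v.toISOString() : v;
}

// Stand-in for the z.string().min(1) validation that runs afterwards.
function isNonEmptyString(v: unknown): v is string {
  return typeof v === 'string' && v.length > 0;
}

// Server side: the postgres driver hands back a Date object.
const fromDb = normalizeTimestamp(new Date(Date.UTC(2025, 0, 2, 3, 4, 5)));
// Web side: JSON.parse already produced a string, so this is a no-op.
const fromWire = normalizeTimestamp('2025-01-02T03:04:05.000Z');

console.log(fromDb, fromWire, isNonEmptyString(fromDb), isNonEmptyString(fromWire));
```

Without the normalizing step, the `Date` from the server side would fail the string check, which matches the dropped-frame symptom the comment describes.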
--- a/apps/web/src/components/AgentPicker.tsx
+++ b/apps/web/src/components/AgentPicker.tsx
@@ -1,8 +1,8 @@
-import { useEffect, useState } from 'react';
+import { useEffect, useMemo, useState } from 'react';
 import { Check, ChevronDown } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
-import type { Agent, AgentParseError } from '@/api/types';
+import type { Agent, AgentParseError, ToolCostStat } from '@/api/types';
 import {
  DropdownMenu,
  DropdownMenuContent,
@@ -22,6 +22,10 @@ export function AgentPicker({ projectId, value, onChange }: Props) {
  const [parseErrors, setParseErrors] = useState<AgentParseError[]>([]);
  const [error, setError] = useState<string | null>(null);
  const [open, setOpen] = useState(false);
  // v1.13.10: per-tool cost rolling window. Fetched once on mount; refreshes
  // only on remount or page reload. Acceptable for a decision aid, since the
  // 100-call rolling mean doesn't shift fast.
  const [costStats, setCostStats] = useState<ToolCostStat[]>([]);
  // v1.8.1: per-agent parse errors are non-blocking. Silent if any agents
  // loaded successfully; a gray warning toast fires only when EVERY agent
@@ -52,6 +56,29 @@ export function AgentPicker({ projectId, value, onChange }: Props) {
    };
  }, [projectId]);
  // v1.13.10: cost stats are project-independent — the 100-call rolling
  // window is global across all chats. Fetch once per mount; tolerate failure
  // silently (cost line hides).
  useEffect(() => {
    let cancelled = false;
    api.tools
      .costStats()
      .then((r) => {
        if (!cancelled) setCostStats(r.stats);
      })
      .catch(() => {
        if (!cancelled) setCostStats([]);
      });
    return () => {
      cancelled = true;
    };
  }, []);
  const costByTool = useMemo(
    () => Object.fromEntries(costStats.map((s) => [s.tool_name, s])),
    [costStats],
  );
  const selectedAgent = agents?.find((a) => a.id === value) ?? null;
  const triggerLabel = value === null
    ? 'No agent'
@@ -86,25 +113,33 @@ export function AgentPicker({ projectId, value, onChange }: Props) {
              <span className="font-medium">No agent</span>
            </DropdownMenuItem>
            {agents.length > 0 && <DropdownMenuSeparator />}
-            {agents.map((a) => (
+            {agents.map((a) => {
-              <DropdownMenuItem
+              const cost = agentCost(a, costByTool);
-                key={a.id}
+              return (
-                onSelect={() => void onChange(a.id)}
+                <DropdownMenuItem
-                className="text-xs flex-col items-start gap-0.5"
+                  key={a.id}
-              >
+                  onSelect={() => void onChange(a.id)}
-                <div className="flex items-center gap-1.5">
+                  className="text-xs flex-col items-start gap-0.5"
-                  <Check
+                >
-                    className={`size-3 ${a.id === value ? 'opacity-100' : 'opacity-0'}`}
+                  <div className="flex items-center gap-1.5">
-                  />
+                    <Check
-                  <span className="font-medium">{a.name}</span>
+                      className={`size-3 ${a.id === value ? 'opacity-100' : 'opacity-0'}`}
-                </div>
+                    />
-                {a.description && (
+                    <span className="font-medium">{a.name}</span>
-                  <span className="text-muted-foreground pl-[18px] truncate w-full">
+                  </div>
-                    {a.description}
+                  {a.description && (
-                  </span>
+                    <span className="text-muted-foreground pl-[18px] truncate w-full">
-                )}
+                      {a.description}
-              </DropdownMenuItem>
+                    </span>
-            ))}
+                  )}
                  {cost.nWithData > 0 && (
                    <span className="text-muted-foreground/70 pl-[18px] truncate w-full">
                      ~{formatK(cost.prompt)} prompt / {formatK(cost.completion)} completion · {cost.nWithData}/{cost.nTools} tools{cost.mostRecent ? ` · last call ${formatAgo(cost.mostRecent)}` : ''}
                    </span>
                  )}
                </DropdownMenuItem>
              );
            })}
            {parseErrors.length > 0 && (
              <div
                className="px-2 py-1.5 mt-1 text-xs text-amber-500 border-t border-border"
@@ -119,3 +154,49 @@ export function AgentPicker({ projectId, value, onChange }: Props) {
    </DropdownMenu>
  );
 }
 // v1.13.10: sum the per-tool means across an agent's whitelisted tools.
 // Sum-of-means, not mean-of-sums — we're combining independent rolling
 // averages. nWithData reflects how many of the agent's tools have any
 // history yet; the line hides entirely when zero so a fresh deploy doesn't
 // render "0k / 0 / 0 tools".
 function agentCost(
  agent: Agent,
  costByTool: Record<string, ToolCostStat>,
 ): {
  prompt: number;
  completion: number;
  nTools: number;
  nWithData: number;
  mostRecent: string | null;
 } {
  let prompt = 0;
  let completion = 0;
  let nWithData = 0;
  let mostRecent: string | null = null;
  for (const t of agent.tools) {
    const s = costByTool[t];
    if (!s) continue;
    prompt += s.mean_prompt_tokens;
    completion += s.mean_completion_tokens;
    nWithData++;
    if (!mostRecent || s.updated_at > mostRecent) mostRecent = s.updated_at;
  }
  return { prompt, completion, nTools: agent.tools.length, nWithData, mostRecent };
 }
 function formatK(n: number): string {
  if (n < 1000) return String(Math.round(n));
  if (n < 10_000) return `${(n / 1000).toFixed(1)}k`;
  return `${Math.round(n / 1000)}k`;
 }
 function formatAgo(iso: string): string {
  const then = new Date(iso).getTime();
  if (Number.isNaN(then)) return '—';
  const diff = Date.now() - then;
  if (diff < 60_000) return 'just now';
  if (diff < 3_600_000) return `${Math.floor(diff / 60_000)}m ago`;
  if (diff < 86_400_000) return `${Math.floor(diff / 3_600_000)}h ago`;
  return `${Math.floor(diff / 86_400_000)}d ago`;
 }
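Review note on `agentCost` above: a small worked example of the sum-of-means aggregation may help when reading the cost line. This is an illustrative sketch only. The tool names, numbers, and the condensed helper `sumToolMeans` are hypothetical; the `ToolCostStat` shape is copied from types.ts in this diff.

```typescript
interface ToolCostStat {
  tool_name: string;
  mean_prompt_tokens: number;
  mean_completion_tokens: number;
  n_calls: number;
  updated_at: string;
}

// Hypothetical stats rows; names and values are made up for illustration.
const stats: ToolCostStat[] = [
  { tool_name: 'read_file', mean_prompt_tokens: 1200.5, mean_completion_tokens: 80, n_calls: 100, updated_at: '2025-01-02T00:00:00.000Z' },
  { tool_name: 'grep', mean_prompt_tokens: 300, mean_completion_tokens: 20.25, n_calls: 40, updated_at: '2025-01-03T00:00:00.000Z' },
];

// Index by tool name, as the costByTool memo does in the component.
const byTool: Record<string, ToolCostStat> = {};
for (const s of stats) byTool[s.tool_name] = s;

// Condensed version of agentCost: add up the per-tool means and skip tools
// that have no history yet.
function sumToolMeans(tools: string[], costByTool: Record<string, ToolCostStat>) {
  let prompt = 0;
  let completion = 0;
  let nWithData = 0;
  let mostRecent: string | null = null;
  for (const t of tools) {
    const s = costByTool[t];
    if (!s) continue;
    prompt += s.mean_prompt_tokens;
    completion += s.mean_completion_tokens;
    nWithData++;
    // ISO-8601 strings in the same format compare chronologically.
    if (mostRecent === null || s.updated_at > mostRecent) mostRecent = s.updated_at;
  }
  return { prompt, completion, nTools: tools.length, nWithData, mostRecent };
}

// 'bash' has no stats row, so it counts toward nTools but not nWithData.
const cost = sumToolMeans(['read_file', 'grep', 'bash'], byTool);
console.log(cost);
```

Because the inputs are means, the sums are generally fractional, so the display helpers need to round before rendering.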
--- a/apps/web/src/components/ChatThroughput.tsx
+++ b/apps/web/src/components/ChatThroughput.tsx
@@ -0,0 +1,28 @@
 import { useChatStatus } from '@/hooks/useChatStatus';
 import { useChatThroughput } from '@/hooks/useChatThroughput';
 import { cn } from '@/lib/utils';
 interface Props {
  chatId: string | null | undefined;
  className?: string;
 }
 // v1.12.2: inline throughput readout. Renders next to StatusDot while the
 // chat is streaming or running a tool. Hidden in idle/error/waiting states
 // — the dot already communicates those.
 export function ChatThroughput({ chatId, className }: Props) {
  const status = useChatStatus(chatId);
  const t = useChatThroughput(chatId);
  if (!chatId || !t) return null;
  if (status !== 'streaming' && status !== 'tool_running') return null;
  const tps = t.tps != null && t.tps > 0 ? Math.round(t.tps) : null;
  const showCtx = t.ctx_used != null && t.ctx_max != null;
  if (tps === null && !showCtx) return null;
  return (
    <span className={cn('text-xs text-muted-foreground tabular-nums', className)}>
      {tps !== null && `${tps} tok/s`}
      {tps !== null && showCtx && ' · '}
      {showCtx && `${t.ctx_used!.toLocaleString()}/${t.ctx_max!.toLocaleString()}`}
    </span>
  );
 }
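Review note on `ChatThroughput` above: the label logic could be pulled into a pure function so it is testable without React. The sketch below assumes the hook's value has the three fields the component reads (`tps`, `ctx_used`, `ctx_max`). `ThroughputSample` and `throughputLabel` are hypothetical names, and the explicit `'en-US'` locale is an assumption made here for deterministic output (the component uses the default locale).

```typescript
// Assumed shape of the useChatThroughput value, inferred from the fields
// the component reads.
interface ThroughputSample {
  tps: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
}

// Returns the text the component would render, or null when it would hide.
function throughputLabel(t: ThroughputSample): string | null {
  const tps = t.tps != null && t.tps > 0 ? Math.round(t.tps) : null;
  const parts: string[] = [];
  if (tps !== null) parts.push(`${tps} tok/s`);
  // Narrowing both fields here avoids the non-null assertions in the JSX.
  if (t.ctx_used != null && t.ctx_max != null) {
    parts.push(`${t.ctx_used.toLocaleString('en-US')}/${t.ctx_max.toLocaleString('en-US')}`);
  }
  return parts.length > 0 ? parts.join(' · ') : null;
}

console.log(throughputLabel({ tps: 41.6, ctx_used: null, ctx_max: null }));
console.log(throughputLabel({ tps: null, ctx_used: 512, ctx_max: 800 }));
console.log(throughputLabel({ tps: 0, ctx_used: null, ctx_max: null }));
```

The status gate (`streaming` / `tool_running`) stays in the component; only the formatting moves.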
--- a/apps/web/src/components/HtmlArtifactPane.tsx
+++ b/apps/web/src/components/HtmlArtifactPane.tsx
@@ -0,0 +1,116 @@
 // v1.14.x-html-artifact-panes: full-height HTML artifact viewer. Renders the
 // model's HTML inside a sandboxed iframe — no allow-same-origin, srcdoc only
 // (no separate URL), CSP injected by the backend writer. JS runs inside the
 // iframe (interactive controls work) but fetch / WS / tracking pixels are
 // blocked by connect-src 'none' on the CSP. NO Copy button per the spec.
 //
 // Pane state is a reference only (chat_id + message_id + title); the iframe
 // payload is fetched on mount from
 // GET /api/chats/:chat_id/messages/:msg_id/html_artifact so that
 // sessions.workspace_panes jsonb stays small and message_parts.payload is the
 // single source of truth.
 import { useEffect, useState } from 'react';
 import { Download, X } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import type { HtmlArtifactState } from '@/api/types';
 interface Props {
  chatId: string;
  state: HtmlArtifactState;
  onClose: () => void;
 }
 export function HtmlArtifactPane({ chatId, state, onClose }: Props) {
  const [downloading, setDownloading] = useState(false);
  const [htmlContent, setHtmlContent] = useState<string | null>(null);
  const [loadError, setLoadError] = useState<string | null>(null);
  useEffect(() => {
    let cancelled = false;
    setHtmlContent(null);
    setLoadError(null);
    void (async () => {
      try {
        const payload = await api.messages.getHtmlArtifact(chatId, state.message_id);
        if (cancelled) return;
        setHtmlContent(payload.html_content);
      } catch (err) {
        if (cancelled) return;
        setLoadError(err instanceof Error ? err.message : 'failed to load HTML artifact');
      }
    })();
    return () => {
      cancelled = true;
    };
  }, [chatId, state.message_id]);
  async function download() {
    if (downloading) return;
    setDownloading(true);
    try {
      const { url, path } = await api.messages.downloadArtifact(
        chatId,
        state.message_id,
        'html',
      );
      const a = document.createElement('a');
      a.href = url;
      a.rel = 'noopener';
      a.click();
      toast.success(`Saved to ${path}`);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'download failed');
    } finally {
      setDownloading(false);
    }
  }
  return (
    <div className="flex flex-col h-full min-h-0">
      <div className="flex items-center gap-2 border-b border-border bg-muted/30 px-2 py-1 shrink-0">
        <span className="text-xs text-muted-foreground truncate flex-1" title={state.title}>
          {state.title || 'HTML artifact'}
        </span>
        <button
          type="button"
          onClick={() => void download()}
          disabled={downloading || htmlContent === null}
          className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground disabled:opacity-40 max-md:min-h-[44px] max-md:min-w-[44px]"
          aria-label="Download HTML"
          title="Download"
        >
          <Download size={12} />
        </button>
        <button
          type="button"
          onClick={onClose}
          className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
          aria-label="Close artifact pane"
          title="Close"
        >
          <X size={12} />
        </button>
      </div>
      <div className="flex-1 min-h-0 overflow-hidden bg-background">
        {loadError ? (
          <div className="p-4 text-sm text-destructive">Failed to load: {loadError}</div>
        ) : htmlContent === null ? (
          <div className="p-4 text-sm text-muted-foreground">Loading HTML artifact…</div>
        ) : (
          <iframe
            // Sandbox attributes are non-negotiable per the v1.14.x spec S5:
            // no allow-same-origin → opaque origin → can't reach parent cookies
            // or DOM. srcdoc (not src) means no URL exists to leak. JS runs
            // (allow-scripts) but connect-src 'none' on the CSP inside the
            // payload blocks fetch / WS / pixels.
            srcDoc={htmlContent}
            sandbox="allow-scripts allow-clipboard-write allow-downloads"
            className="w-full h-full border-0"
            title={state.title || 'HTML artifact'}
          />
        )}
      </div>
    </div>
  );
 }
--- a/apps/web/src/components/MarkdownArtifactPane.tsx
+++ b/apps/web/src/components/MarkdownArtifactPane.tsx
@@ -0,0 +1,137 @@
 // v1.14.x-html-artifact-panes: dedicated full-height Markdown viewer used
 // when a user clicks "Open in pane" on an assistant message that has NO
 // html_artifact part. Header carries Copy (raw source) + Download (server-
 // materialised .md under <projectRoot>/.boocode/artifacts/) + close.
 //
 // Pane state is a reference only (chat_id + message_id + title); the markdown
 // body is fetched on mount from GET /api/chats/:chat_id/messages by locating
 // the matching message_id. This keeps sessions.workspace_panes jsonb small
 // and the assistant message row remains the single source of truth.
 import { useEffect, useState } from 'react';
 import { Check, Copy, Download, X } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import type { MarkdownArtifactState } from '@/api/types';
 import { MarkdownRenderer } from './MarkdownRenderer';
 interface Props {
  chatId: string;
  state: MarkdownArtifactState;
  onClose: () => void;
 }
 export function MarkdownArtifactPane({ chatId, state, onClose }: Props) {
  const [justCopied, setJustCopied] = useState(false);
  const [downloading, setDownloading] = useState(false);
  const [content, setContent] = useState<string | null>(null);
  const [loadError, setLoadError] = useState<string | null>(null);
  useEffect(() => {
    let cancelled = false;
    setContent(null);
    setLoadError(null);
    void (async () => {
      try {
        // No single-message GET endpoint exists; the chat-messages list is
        // already cached server-side and the lookup is O(n) over a small
        // window. Cheaper than adding a new route for one call site.
        const messages = await api.chats.messages(chatId);
        if (cancelled) return;
        const msg = messages.find((m) => m.id === state.message_id);
        if (!msg) {
          setLoadError('Message not found');
          return;
        }
        setContent(msg.content ?? '');
      } catch (err) {
        if (cancelled) return;
        setLoadError(err instanceof Error ? err.message : 'failed to load message');
      }
    })();
    return () => {
      cancelled = true;
    };
  }, [chatId, state.message_id]);
  async function copy() {
    if (content === null) return;
    try {
      await navigator.clipboard.writeText(content);
      setJustCopied(true);
      setTimeout(() => setJustCopied(false), 1200);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'copy failed');
    }
  }
  async function download() {
    if (downloading) return;
    setDownloading(true);
    try {
      const { url, path } = await api.messages.downloadArtifact(
        chatId,
        state.message_id,
        'md',
      );
      // Trigger browser download from the returned URL. The endpoint stamps
      // Content-Disposition: attachment so the click lands as a save.
      const a = document.createElement('a');
      a.href = url;
      a.rel = 'noopener';
      a.click();
      toast.success(`Saved to ${path}`);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'download failed');
    } finally {
      setDownloading(false);
    }
  }
  return (
    <div className="flex flex-col h-full min-h-0">
      <div className="flex items-center gap-2 border-b border-border bg-muted/30 px-2 py-1 shrink-0">
        <span className="text-xs text-muted-foreground truncate flex-1" title={state.title}>
          {state.title || 'Markdown artifact'}
        </span>
        <button
          type="button"
          onClick={() => void copy()}
          disabled={content === null}
          className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground disabled:opacity-40 max-md:min-h-[44px] max-md:min-w-[44px]"
          aria-label="Copy markdown source"
          title="Copy"
        >
          {justCopied ? <Check size={12} /> : <Copy size={12} />}
        </button>
        <button
          type="button"
          onClick={() => void download()}
          disabled={downloading || content === null}
          className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground disabled:opacity-40 max-md:min-h-[44px] max-md:min-w-[44px]"
          aria-label="Download markdown"
          title="Download"
        >
          <Download size={12} />
        </button>
        <button
          type="button"
          onClick={onClose}
          className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
          aria-label="Close artifact pane"
          title="Close"
        >
          <X size={12} />
        </button>
      </div>
      <div className="flex-1 min-h-0 overflow-auto px-4 py-3 text-sm">
        {loadError ? (
          <div className="text-destructive">Failed to load: {loadError}</div>
        ) : content === null ? (
          <div className="text-muted-foreground">Loading…</div>
        ) : (
          <MarkdownRenderer content={content} />
        )}
      </div>
    </div>
  );
 }
--- a/apps/web/src/components/MarkdownRenderer.tsx
+++ b/apps/web/src/components/MarkdownRenderer.tsx
@@ -0,0 +1,148 @@
 // v1.14.x-html-artifact-panes: extracted from MessageBubble.tsx so both the
 // in-chat bubble renderer and the MarkdownArtifactPane share the same Shiki +
 // remark-gfm + path-linkifier pipeline. Behavior preserved byte-for-byte from
 // the original MessageBubble.MarkdownBody helper (and its linkify helpers).
 import { Children, cloneElement, isValidElement } from 'react';
 import type { ReactElement, ReactNode } from 'react';
 import Markdown from 'react-markdown';
 import remarkGfm from 'remark-gfm';
 import { CodeBlock } from './CodeBlock';
 import { sessionEvents } from '@/hooks/sessionEvents';
 // Match path-shaped substrings ending in `.ext`. Additionally require a `/`
 // in the match to reduce false positives in prose (e.g. plain `foo.ts` won't
 // match, but `src/foo.ts` will). False positives at the edges are accepted
 // per Sam's design decision (2026-05-14).
 const PATH_REGEX = /([a-zA-Z0-9._/-]+\.[a-zA-Z0-9]+)/g;
 function isPathLike(s: string): boolean {
  return s.includes('/');
 }
 function emitOpenFile(path: string): void {
  sessionEvents.emit({ type: 'open_file_in_browser', path });
 }
 function linkifyPaths(text: string, keyPrefix: string): ReactNode {
  const out: ReactNode[] = [];
  let lastIdx = 0;
  let idx = 0;
  for (const match of text.matchAll(PATH_REGEX)) {
    const matchedText = match[0];
    const start = match.index ?? 0;
    if (!isPathLike(matchedText)) continue;
    if (start > lastIdx) out.push(text.slice(lastIdx, start));
    out.push(
      <button
        key={`${keyPrefix}-${idx}`}
        type="button"
        onClick={() => emitOpenFile(matchedText)}
        className="text-primary underline cursor-pointer hover:text-primary/80"
      >
        {matchedText}
      </button>
    );
    lastIdx = start + matchedText.length;
    idx += 1;
  }
  if (out.length === 0) return text;
  if (lastIdx < text.length) out.push(text.slice(lastIdx));
  return out;
 }
 function linkifyChildren(children: ReactNode, keyPrefix = 'l'): ReactNode {
  const arr = Children.toArray(children);
  return arr.map((child, i) => {
    if (typeof child === 'string') {
      return (
        <span key={`${keyPrefix}-${i}`}>
          {linkifyPaths(child, `${keyPrefix}-${i}`)}
        </span>
      );
    }
    if (isValidElement(child)) {
      const el = child as ReactElement<{ children?: ReactNode }>;
      if (el.type === 'code' || el.type === CodeBlock) return child;
      const grandchildren = el.props.children;
      if (grandchildren === undefined) return child;
      return cloneElement(el, {
        key: el.key ?? `linkified-${i}`,
        children: linkifyChildren(grandchildren, `${keyPrefix}-${i}`),
      });
    }
    return child;
  });
 }
 const codeRenderer = (props: { children?: unknown; className?: string }) => {
  const { children, className, ...rest } = props;
  const text = String(children ?? '').replace(/\n$/, '');
  const langMatch = /language-([\w-]+)/.exec(className ?? '');
  const isBlock = !!langMatch || text.includes('\n');
  if (isBlock) {
    return <CodeBlock code={text} lang={langMatch?.[1]} />;
  }
  return (
    <code
      {...rest}
      className="rounded bg-muted px-1 py-0.5 font-mono text-[0.85em]"
    >
      {children as React.ReactNode}
    </code>
  );
 };
 export function MarkdownRenderer({ content }: { content: string }) {
  return (
    <Markdown
      remarkPlugins={[remarkGfm]}
      components={{
        pre: ({ children }) => <>{children}</>,
        code: codeRenderer,
        a: ({ children, href }) => (
          <a
            href={href}
            target="_blank"
            rel="noreferrer"
            className="underline decoration-muted-foreground/40 underline-offset-2 hover:decoration-foreground"
          >
            {children}
          </a>
        ),
        ul: ({ children }) => (
          <ul className="list-disc pl-5 space-y-1">{children}</ul>
        ),
        ol: ({ children }) => (
          <ol className="list-decimal pl-5 space-y-1">{children}</ol>
        ),
        li: ({ children }) => <li>{linkifyChildren(children)}</li>,
        p: ({ children }) => (
          <p className="leading-relaxed">{linkifyChildren(children)}</p>
        ),
        h1: ({ children }) => <h1 className="text-base font-semibold mt-2">{children}</h1>,
        h2: ({ children }) => <h2 className="text-sm font-semibold mt-2">{children}</h2>,
        h3: ({ children }) => <h3 className="text-sm font-semibold mt-1">{children}</h3>,
        blockquote: ({ children }) => (
          <blockquote className="border-l-2 border-border pl-3 text-muted-foreground">
            {children}
          </blockquote>
        ),
        table: ({ children }) => (
          <div className="overflow-x-auto">
            <table className="border-collapse text-xs">{children}</table>
          </div>
        ),
        th: ({ children }) => (
          <th className="border border-border px-2 py-1 text-left font-medium">{children}</th>
        ),
        td: ({ children }) => (
          <td className="border border-border px-2 py-1">
            {linkifyChildren(children)}
          </td>
        ),
      }}
    >
      {content}
    </Markdown>
  );
 }
--- a/apps/web/src/components/MessageBubble.tsx
+++ b/apps/web/src/components/MessageBubble.tsx
@@ -1,16 +1,14 @@
-import { Children, cloneElement, isValidElement, useEffect, useState } from 'react';
+import { useEffect, useState } from 'react';
-import type { ReactElement, ReactNode } from 'react';
+import type { ReactNode } from 'react';
-import Markdown from 'react-markdown';
-import remarkGfm from 'remark-gfm';
-import { ChevronDown, ChevronRight, Copy, RefreshCw, Check, Share2, RotateCw, GitFork, Trash2 } from 'lucide-react';
+import { ChevronDown, ChevronRight, Copy, RefreshCw, Check, Share2, RotateCw, GitFork, Trash2, PanelRightOpen } from 'lucide-react';
 import { toast } from 'sonner';
 import type { Chat, ErrorReason, Message } from '@/api/types';
-import { api } from '@/api/client';
+import { api, ApiError } from '@/api/client';
 import { sessionEvents } from '@/hooks/sessionEvents';
 import { sendToTerminal, terminalsRegistry, type TerminalRegistration } from '@/lib/events';
 import { CapHitSentinel } from './CapHitSentinel';
 import { DoomLoopSentinel } from './DoomLoopSentinel';
-import { CodeBlock } from './CodeBlock';
+import { MarkdownRenderer } from './MarkdownRenderer';
 import { Button } from '@/components/ui/button';
 import {
  ContextMenu,
@@ -90,76 +88,20 @@ const ERROR_REASON_LABELS: Record<ErrorReason, string> = {
  summary_after_cap_failed: 'Summary after tool budget hit failed',
 };
-// Match path-shaped substrings ending in `.ext`. Additionally require a `/`
-// in the match to reduce false positives in prose (e.g. plain `foo.ts` won't
-// match, but `src/foo.ts` will). False positives at the edges are accepted
-// per Sam's design decision (2026-05-14).
-const PATH_REGEX = /([a-zA-Z0-9._/-]+\.[a-zA-Z0-9]+)/g;
-function isPathLike(s: string): boolean {
-  return s.includes('/');
-}
-
-function emitOpenFile(path: string): void {
-  sessionEvents.emit({ type: 'open_file_in_browser', path });
-}
-
-// Split a plain string into a flat array of strings and clickable button
-// nodes for path-shaped substrings. If no matches, returns the original
-// string verbatim (no array wrapping).
-function linkifyPaths(text: string, keyPrefix: string): ReactNode {
-  const out: ReactNode[] = [];
-  let lastIdx = 0;
-  let idx = 0;
-  for (const match of text.matchAll(PATH_REGEX)) {
-    const matchedText = match[0];
-    const start = match.index ?? 0;
-    if (!isPathLike(matchedText)) continue;
-    if (start > lastIdx) out.push(text.slice(lastIdx, start));
-    out.push(
-      <button
-        key={`${keyPrefix}-${idx}`}
-        type="button"
-        onClick={() => emitOpenFile(matchedText)}
-        className="text-primary underline cursor-pointer hover:text-primary/80"
-      >
-        {matchedText}
-      </button>
-    );
-    lastIdx = start + matchedText.length;
-    idx += 1;
-  }
-  if (out.length === 0) return text;
-  if (lastIdx < text.length) out.push(text.slice(lastIdx));
-  return out;
-}
-// Walk react-markdown children, linkifying string text nodes. Children of
-// <code> nodes (CodeBlock and inline code) are left untouched — the regex
-// shouldn't run inside code spans.
-function linkifyChildren(children: ReactNode, keyPrefix = 'l'): ReactNode {
-  const arr = Children.toArray(children);
-  return arr.map((child, i) => {
-    if (typeof child === 'string') {
-      return (
-        <span key={`${keyPrefix}-${i}`}>
-          {linkifyPaths(child, `${keyPrefix}-${i}`)}
-        </span>
-      );
-    }
-    if (isValidElement(child)) {
-      const el = child as ReactElement<{ children?: ReactNode }>;
-      // Skip inline/block code — paths in code spans aren't link targets.
-      if (el.type === 'code' || el.type === CodeBlock) return child;
-      const grandchildren = el.props.children;
-      if (grandchildren === undefined) return child;
-      return cloneElement(el, {
-        key: el.key ?? `linkified-${i}`,
-        children: linkifyChildren(grandchildren, `${keyPrefix}-${i}`),
-      });
-    }
-    return child;
-  });
-}
+// v1.14.x-html-artifact-panes: MarkdownBody and its path-linkifier helpers
+// moved to apps/web/src/components/MarkdownRenderer.tsx so the new artifact
+// panes can render assistant content with the same Shiki + remark-gfm setup.
+// Pane-header title derivation for a markdown artifact. Order matches the
+// server slug logic in services/artifacts.ts: first `# ` heading → first 6
+// words of the body → 'Markdown artifact'. Truncated to keep the pane header
+// readable.
+function deriveMarkdownTitle(content: string): string {
+  const headingMatch = content.match(/^\s*#\s+(.+?)\s*$/m);
+  if (headingMatch && headingMatch[1]) return headingMatch[1].slice(0, 80);
+  const words = content.trim().split(/\s+/).slice(0, 6).join(' ');
+  if (words) return words.slice(0, 80);
+  return 'Markdown artifact';
+}
 interface Props {
@@ -170,80 +112,6 @@ interface Props {
  capHitInfo?: { position: number; isLatest: boolean };
 }
-function MarkdownBody({ content }: { content: string }) {
-  return (
-    <Markdown
-      remarkPlugins={[remarkGfm]}
-      components={{
-        pre: ({ children }) => <>{children}</>,
-        code: (props) => {
-          const { children, className, ...rest } = props as {
-            children?: unknown;
-            className?: string;
-          };
-          const text = String(children ?? '').replace(/\n$/, '');
-          const langMatch = /language-([\w-]+)/.exec(className ?? '');
-          const isBlock = !!langMatch || text.includes('\n');
-          if (isBlock) {
-            return <CodeBlock code={text} lang={langMatch?.[1]} />;
-          }
-          return (
-            <code
-              {...rest}
-              className="rounded bg-muted px-1 py-0.5 font-mono text-[0.85em]"
-            >
-              {children as React.ReactNode}
-            </code>
-          );
-        },
-        a: ({ children, href }) => (
-          <a
-            href={href}
-            target="_blank"
-            rel="noreferrer"
-            className="underline decoration-muted-foreground/40 underline-offset-2 hover:decoration-foreground"
-          >
-            {children}
-          </a>
-        ),
-        ul: ({ children }) => (
-          <ul className="list-disc pl-5 space-y-1">{children}</ul>
-        ),
-        ol: ({ children }) => (
-          <ol className="list-decimal pl-5 space-y-1">{children}</ol>
-        ),
-        li: ({ children }) => <li>{linkifyChildren(children)}</li>,
-        p: ({ children }) => (
-          <p className="leading-relaxed">{linkifyChildren(children)}</p>
-        ),
-        h1: ({ children }) => <h1 className="text-base font-semibold mt-2">{children}</h1>,
-        h2: ({ children }) => <h2 className="text-sm font-semibold mt-2">{children}</h2>,
-        h3: ({ children }) => <h3 className="text-sm font-semibold mt-1">{children}</h3>,
-        blockquote: ({ children }) => (
-          <blockquote className="border-l-2 border-border pl-3 text-muted-foreground">
-            {children}
-          </blockquote>
-        ),
-        table: ({ children }) => (
-          <div className="overflow-x-auto">
-            <table className="border-collapse text-xs">{children}</table>
-          </div>
-        ),
-        th: ({ children }) => (
-          <th className="border border-border px-2 py-1 text-left font-medium">{children}</th>
-        ),
-        td: ({ children }) => (
-          <td className="border border-border px-2 py-1">
-            {linkifyChildren(children)}
-          </td>
-        ),
-      }}
-    >
-      {content}
-    </Markdown>
-  );
-}
 function StatsLine({ message }: { message: Message }) {
  const tokens = message.tokens_used;
  if (typeof tokens !== 'number' || tokens <= 0) return null;
@@ -337,6 +205,54 @@ function ActionRow({
  const canRegen = isAssistant && message.status !== 'streaming';
  const canFork = message.status === 'complete';
  const canDelete = message.status !== 'streaming';
+  const [openingPane, setOpeningPane] = useState(false);
+  // v1.14.x-html-artifact-panes: probe for an html_artifact part. If present,
+  // open the HTML pane variant; otherwise fall back to the markdown variant.
+  // Title derivation for markdown: first `# ` heading → first 6 words of the
+  // body → 'Markdown artifact' (mirrors the slug logic in
+  // services/artifacts.ts).
+  async function openInPane() {
+    if (openingPane || message.status === 'streaming') return;
+    setOpeningPane(true);
+    try {
+      try {
+        const payload = await api.messages.getHtmlArtifact(
+          message.chat_id,
+          message.id,
+        );
+        sessionEvents.emit({
+          type: 'open_html_artifact_pane',
+          state: {
+            chat_id: message.chat_id,
+            message_id: message.id,
+            title: payload.title,
+          },
+        });
+        return;
+      } catch (err) {
+        // 404 (no html_artifact part) is the expected fall-through path —
+        // markdown variant opens below. Any other error (network, 500) is
+        // a real failure; toast and bail rather than masquerading as markdown.
+        const status = err instanceof ApiError ? err.status : null;
+        if (status !== 404) {
+          toast.error(err instanceof Error ? err.message : 'open in pane failed');
+          return;
+        }
+      }
+      const title = deriveMarkdownTitle(message.content);
+      sessionEvents.emit({
+        type: 'open_markdown_artifact_pane',
+        state: {
+          chat_id: message.chat_id,
+          message_id: message.id,
+          title,
+        },
+      });
+    } finally {
+      setOpeningPane(false);
+    }
+  }
  return (
    <>
@@ -350,6 +266,18 @@ function ActionRow({
        >
          {justCopied ? <Check className="size-3" /> : <Copy className="size-3" />}
        </button>
+        {isAssistant && (
+          <button
+            type="button"
+            onClick={() => void openInPane()}
+            disabled={openingPane || message.status === 'streaming'}
+            className="inline-flex items-center justify-center size-6 rounded text-muted-foreground hover:bg-muted hover:text-foreground disabled:opacity-40 disabled:cursor-not-allowed max-md:min-h-[44px] max-md:min-w-[44px]"
+            aria-label="Open in pane"
+            title="Open in pane"
+          >
+            <PanelRightOpen className="size-3" />
+          </button>
+        )}
        {isAssistant && (
          <button
            type="button"
@@ -588,7 +516,7 @@ function SummaryCard({ message }: { message: Message }) {
      </div>
      {expanded && (
        <div className="px-3 pb-3 text-xs leading-relaxed border-t pt-2">
-          <MarkdownBody content={message.content} />
+          <MarkdownRenderer content={message.content} />
        </div>
      )}
    </div>
@@ -651,7 +579,9 @@ export function MessageBubble({ message, sessionChats, capHitInfo }: Props) {
  const isStreaming = message.status === 'streaming';
  const failed = message.status === 'failed';
-  const hasContent = message.content.length > 0;
+  // v1.13.7: match the MessageList.flatten trim guard so a whitespace-only
+  // assistant turn doesn't render an empty bubble + dangling ActionRow.
+  const hasContent = message.content.trim().length > 0;
  // v1.8.2: if metadata stamps an error reason, surface it inline under the
  // generic "message failed" line. Keeps the user's eye where it already is
  // rather than introducing a separate banner.
@@ -665,7 +595,7 @@ export function MessageBubble({ message, sessionChats, capHitInfo }: Props) {
      {(hasContent || isStreaming) && (
        <SendToTerminalMenu>
          <div className="max-w-[90%] text-sm leading-relaxed space-y-2 break-words min-w-0">
-            {hasContent ? <MarkdownBody content={message.content} /> : null}
+            {hasContent ? <MarkdownRenderer content={message.content} /> : null}
            {isStreaming && (
              <span className="inline-block w-1.5 h-3.5 align-baseline bg-muted-foreground/60 animate-pulse" />
            )}
--- a/apps/web/src/components/MessageList.tsx
+++ b/apps/web/src/components/MessageList.tsx
@@ -4,6 +4,7 @@ import { MessageBubble } from './MessageBubble';
 import { ToolCallGroup } from './ToolCallGroup';
 import { ToolCallLine, type ToolRun } from './ToolCallLine';
 import { AskUserInputCard } from './AskUserInputCard';
+import { RequestReadAccessCard } from './RequestReadAccessCard';
 interface Props {
  messages: Message[];
@@ -45,7 +46,12 @@ function flatten(messages: Message[]): RenderItem[] {
      continue;
    }
    const hasToolCalls = m.tool_calls != null && m.tool_calls.length > 0;
-    const hasText = m.content.length > 0;
+    // v1.13.7: trim before checking. AI SDK v6 streaming occasionally emits a
+    // leading "\n" text-delta on tool-call-only turns, which used to flow into
+    // messages.content with length=1 and render an empty bubble + ActionRow
+    // between each tool call. Whitespace-only content has no visible payload,
+    // so treat it as no-content.
+    const hasText = m.content.trim().length > 0;
    if (m.role === 'assistant' && hasToolCalls) {
      if (hasText || m.status === 'streaming') {
        items.push({ kind: 'message', message: m });
@@ -80,7 +86,9 @@ function group(items: RenderItem[]): RenderItem[] {
      continue;
    }
    const name = item.run.call.name;
-    if (name === 'ask_user_input') {
+    if (name === 'ask_user_input' || name === 'request_read_access') {
+      // v1.13.17: same rationale as ask_user_input — grouping would collapse
+      // the interactive pause card into a non-actionable ToolCallLine.
      out.push(item);
      i += 1;
      continue;
@@ -176,6 +184,16 @@ export function MessageList({ messages, sessionChats }: Props) {
                />
              );
            }
+            if (item.run.call.name === 'request_read_access') {
+              return (
+                <RequestReadAccessCard
+                  key={item.key}
+                  toolCall={item.run.call}
+                  toolResult={item.run.result}
+                  chatId={item.chatId}
+                />
+              );
+            }
            return <ToolCallLine key={item.key} run={item.run} />;
          }
          return <ToolCallGroup key={item.key} runs={item.runs} />;
--- a/apps/web/src/components/MobileTabSwitcher.tsx
+++ b/apps/web/src/components/MobileTabSwitcher.tsx
@@ -13,6 +13,7 @@ import { toast } from 'sonner';
 import type { Chat, WorkspacePane } from '@/api/types';
 import { BottomSheet } from '@/components/BottomSheet';
 import { StatusDot } from '@/components/StatusDot';
+import { ChatThroughput } from '@/components/ChatThroughput';
 import {
  DropdownMenu,
  DropdownMenuContent,
@@ -206,6 +207,7 @@ export function MobileTabSwitcher({
        >
          <span className="shrink-0 text-muted-foreground">{paneIcon(active?.kind ?? 'chat')}</span>
          <StatusDot chatId={activeChatId} />
+          <ChatThroughput chatId={activeChatId} />
          <span className="truncate flex-1 text-left">{activeLabel}</span>
          <ChevronDown size={14} className="opacity-60 shrink-0" />
        </button>
@@ -237,6 +239,7 @@ export function MobileTabSwitcher({
              >
                <span className="shrink-0 text-muted-foreground">{paneIcon(pane.kind)}</span>
                <StatusDot chatId={cid ?? null} />
+                <ChatThroughput chatId={cid ?? null} />
                {renamingChatId === cid && cid ? (
                  <input
                    autoFocus
--- a/apps/web/src/components/RequestReadAccessCard.tsx
+++ b/apps/web/src/components/RequestReadAccessCard.tsx
@@ -0,0 +1,193 @@
 import { useState } from 'react';
 import { Check, FolderOpen, ShieldOff } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import { Button } from '@/components/ui/button';
 import type { ToolCall, ToolResult } from '@/api/types';
 // v1.13.17-cross-repo-reads. Renders an inline allow/deny picker for a
 // paused request_read_access tool call. Mirrors AskUserInputCard's pending
 // vs answered render dance:
 //   - Pending: server pre-stamps a sentinel tool_result with output=null.
 //     The card shows path + reason and lets the user pick Allow or Deny.
 //   - Answered: the eventual WS tool_result frame carries the actual
 //     decision string ("granted: <root>" or "denied" or "denied: <reason>").
 //     The card flips to a read-only summary line.
 //
 // Tool name discrimination lives in MessageList.flatten/group — anything
 // with tc.name === 'request_read_access' bypasses grouping and renders this
 // card directly.
 interface Props {
  toolCall: ToolCall;
  toolResult: ToolResult | null;
  chatId: string;
 }
 interface ParsedArgs {
  path: string;
  reason: string;
 }
 function parseArgs(raw: unknown): ParsedArgs | null {
  if (!raw || typeof raw !== 'object') return null;
  const obj = raw as { path?: unknown; reason?: unknown };
  if (typeof obj.path !== 'string' || obj.path.length === 0) return null;
  if (typeof obj.reason !== 'string' || obj.reason.length === 0) return null;
  return { path: obj.path, reason: obj.reason };
 }
 function decisionVariant(output: unknown): 'granted' | 'denied' | 'unknown' {
  if (typeof output !== 'string') return 'unknown';
  if (output.startsWith('granted:')) return 'granted';
  if (output === 'denied' || output.startsWith('denied:')) return 'denied';
  return 'unknown';
 }
 export function RequestReadAccessCard({ toolCall, toolResult, chatId }: Props) {
  const args = parseArgs(toolCall.args);
  if (!args) {
    return (
      <div className="rounded border border-destructive/40 bg-destructive/10 text-xs px-3 py-2 text-destructive">
        request_read_access: malformed tool args
      </div>
    );
  }
  // Non-null output means the WS tool_result frame arrived (or the row was
  // re-fetched from history).
  const answered = toolResult && toolResult.output !== null;
  if (answered) {
    return <AnsweredView args={args} output={toolResult!.output} />;
  }
  return <PendingView args={args} toolCallId={toolCall.id} chatId={chatId} />;
 }
 function PendingView({
  args,
  toolCallId,
  chatId,
 }: {
  args: ParsedArgs;
  toolCallId: string;
  chatId: string;
 }) {
  const [submitting, setSubmitting] = useState<'allow' | 'deny' | null>(null);
  async function decide(decision: 'allow' | 'deny') {
    if (submitting) return;
    setSubmitting(decision);
    try {
      await api.chats.grantReadAccess(chatId, toolCallId, decision);
      // Card stays mounted; the incoming WS tool_result frame swaps it to
      // AnsweredView via the parent prop change.
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'request failed');
      setSubmitting(null);
    }
  }
  return (
    <div className="rounded-lg border border-amber-500/40 bg-amber-500/5 text-sm">
      <div className="px-4 py-3 space-y-2">
        <div className="flex items-center gap-2 text-xs uppercase tracking-wide text-amber-700 dark:text-amber-300">
          <ShieldOff className="size-3.5" />
          <span>Read-access request</span>
        </div>
        <div className="space-y-1.5">
          <div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">Path</div>
          <div className="font-mono text-xs break-all rounded bg-background/60 border px-2 py-1">
            {args.path}
          </div>
        </div>
        <div className="space-y-1.5">
          <div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">Reason</div>
          <div className="text-sm leading-snug whitespace-pre-wrap">{args.reason}</div>
        </div>
        <div className="text-[11px] text-muted-foreground pt-1">
          Allow grants the agent read access to the matching repository root for
          the rest of this session. Revoke any time from the session settings.
        </div>
      </div>
      <div className="flex justify-end gap-2 border-t border-amber-500/20 px-4 py-2">
        <Button
          type="button"
          size="sm"
          variant="outline"
          disabled={submitting !== null}
          onClick={() => void decide('deny')}
        >
          {submitting === 'deny' ? 'Denying…' : 'Deny'}
        </Button>
        <Button
          type="button"
          size="sm"
          disabled={submitting !== null}
          onClick={() => void decide('allow')}
        >
          {submitting === 'allow' ? 'Allowing…' : 'Allow'}
        </Button>
      </div>
    </div>
  );
 }
 function AnsweredView({ args, output }: { args: ParsedArgs; output: unknown }) {
  const variant = decisionVariant(output);
  const text = typeof output === 'string' ? output : 'unknown';
  return (
    <div
      className={
        variant === 'granted'
          ? 'rounded-lg border border-emerald-500/40 bg-emerald-500/5 text-sm'
          : variant === 'denied'
            ? 'rounded-lg border bg-muted/20 text-sm'
            : 'rounded-lg border border-destructive/40 bg-destructive/5 text-sm'
      }
    >
      <div className="px-4 py-3 space-y-2">
        <div className="flex items-center gap-2 text-xs uppercase tracking-wide">
          {variant === 'granted' ? (
            <>
              <Check className="size-3.5 text-emerald-600" />
              <span className="text-emerald-700 dark:text-emerald-300">Read access granted</span>
            </>
          ) : variant === 'denied' ? (
            <>
              <ShieldOff className="size-3.5 text-muted-foreground" />
              <span className="text-muted-foreground">Read access denied</span>
            </>
          ) : (
            <>
              <ShieldOff className="size-3.5 text-destructive" />
              <span className="text-destructive">Read access request — unknown result</span>
            </>
          )}
        </div>
        <div className="space-y-1.5">
          <div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">Path</div>
          <div className="font-mono text-xs break-all rounded bg-background/60 border px-2 py-1">
            {args.path}
          </div>
        </div>
        {variant === 'granted' && (
          <div className="space-y-1.5">
            <div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">Granted root</div>
            <div className="font-mono text-xs break-all rounded bg-background/60 border px-2 py-1 flex items-center gap-1.5">
              <FolderOpen className="size-3 shrink-0 text-muted-foreground" />
              <span>{text.replace(/^granted:\s*/, '')}</span>
            </div>
          </div>
        )}
        {variant === 'denied' && text !== 'denied' && (
          <div className="text-[11px] text-muted-foreground">
            {text.replace(/^denied:\s*/, '')}
          </div>
        )}
      </div>
    </div>
  );
 }
--- a/apps/web/src/components/StaleStreamBanner.tsx
+++ b/apps/web/src/components/StaleStreamBanner.tsx
@@ -0,0 +1,34 @@
 interface Props {
  onRetry: () => void;
  onDiscard: () => void;
 }
 // v1.12.3: shown when an assistant message has been 'streaming' for 60+
 // seconds without new tokens. Lives above ChatInput in ChatPane. Retry
 // discards the stuck row then resends the last user message; Discard just
 // clears the row and drops the dot to idle.
 export function StaleStreamBanner({ onRetry, onDiscard }: Props) {
  return (
    <div className="border border-amber-500/30 bg-amber-500/5 rounded-md p-3 mb-2 mx-4 flex items-center justify-between gap-2">
      <span className="text-sm text-muted-foreground">
        Previous response didn't complete.
      </span>
      <div className="flex gap-2">
        <button
          type="button"
          onClick={onRetry}
          className="text-xs px-2 py-1 rounded border border-border hover:bg-accent max-md:min-h-[44px] max-md:px-3"
        >
          Retry
        </button>
        <button
          type="button"
          onClick={onDiscard}
          className="text-xs px-2 py-1 rounded border border-border hover:bg-accent max-md:min-h-[44px] max-md:px-3"
        >
          Discard
        </button>
      </div>
    </div>
  );
 }
--- a/apps/web/src/components/StatusDot.tsx
+++ b/apps/web/src/components/StatusDot.tsx
@@ -6,15 +6,10 @@ interface Props {
  className?: string;
 }
-const STATUS_CLASS: Record<DerivedStatus, string> = {
-  working: 'bg-amber-500 animate-pulse',
-  idle_warm: 'bg-emerald-500',
-  idle_cold: 'bg-muted-foreground/40',
-  error: 'bg-destructive',
-};
 const STATUS_LABEL: Record<DerivedStatus, string> = {
-  working: 'working',
+  streaming: 'streaming',
+  tool_running: 'running tool',
+  waiting_for_input: 'waiting for input',
  idle_warm: 'idle',
  idle_cold: 'idle',
  error: 'error',
@@ -22,15 +17,58 @@ const STATUS_LABEL: Record<DerivedStatus, string> = {
 export function StatusDot({ chatId, className }: Props) {
  const status = useChatStatus(chatId);
  if (status === 'streaming') {
    return (
      <span
        aria-label="Status: streaming"
        title="streaming"
        className={cn('inline-block relative w-3 h-3 shrink-0', className)}
      >
        <span className="absolute inset-0 animate-spin-slow">
          <span className="absolute top-0 left-1/2 -translate-x-1/2 w-1 h-1 rounded-full bg-amber-500" />
          <span className="absolute bottom-0 left-1/2 -translate-x-1/2 w-1 h-1 rounded-full bg-amber-500/60" />
        </span>
      </span>
    );
  }
  if (status === 'tool_running') {
    return (
      <span
        aria-label="Status: running tool"
        title="running tool"
        className={cn(
          'inline-block w-3 h-3 rounded-full border-2 border-sky-500 border-t-transparent animate-spin shrink-0',
          className,
        )}
      />
    );
  }
  if (status === 'waiting_for_input') {
    return (
      <span
        aria-label="Status: waiting for input"
        title="waiting for input"
        className={cn(
          'inline-block w-1.5 h-1.5 rounded-full shrink-0 bg-violet-500',
          className,
        )}
      />
    );
  }
  const bg =
    status === 'idle_warm' ? 'bg-emerald-500'
      : status === 'error' ? 'bg-destructive'
      : 'bg-muted-foreground/40';
  return (
    <span
      aria-label={`Status: ${STATUS_LABEL[status]}`}
      title={STATUS_LABEL[status]}
-      className={cn(
-        'inline-block w-1.5 h-1.5 rounded-full shrink-0',
-        STATUS_CLASS[status],
-        className,
-      )}
+      className={cn('inline-block w-1.5 h-1.5 rounded-full shrink-0', bg, className)}
    />
  );
 }
--- a/apps/web/src/components/Workspace.tsx
+++ b/apps/web/src/components/Workspace.tsx
@@ -8,6 +8,8 @@ import { terminalsRegistry } from '@/lib/events';
 import { ChatPane } from '@/components/panes/ChatPane';
 import { SettingsPane } from '@/components/panes/SettingsPane';
 import { TerminalPane } from '@/components/panes/TerminalPane';
 import { MarkdownArtifactPane } from '@/components/MarkdownArtifactPane';
 import { HtmlArtifactPane } from '@/components/HtmlArtifactPane';
 import { ChatTabBar } from '@/components/ChatTabBar';
 import { SessionLandingPage } from '@/components/SessionLandingPage';
 import {
@@ -182,6 +184,7 @@ export function Workspace({
        {panes.map((pane, idx) => {
          const isSettings = pane.kind === 'settings';
          const isTerminal = pane.kind === 'terminal';
          const isArtifact = pane.kind === 'markdown_artifact' || pane.kind === 'html_artifact';
          // v1.9: when maximized, hide every pane except the settings one.
          // display:none keeps the React tree mounted so streams / drafts
          // survive the toggle without re-mount cost.
@@ -195,7 +198,7 @@ export function Workspace({
          }
          // Terminal panes own their tab strip (no chats, no ChatTabBar) and
          // are not drag-reorderable for now — keeps the layout grid simple.
-          const isChromeless = isSettings || isTerminal;
+          const isChromeless = isSettings || isTerminal || isArtifact;
          return (
          <div
            key={pane.id}
@@ -318,6 +321,18 @@ export function Workspace({
                  label={terminalLabels.get(pane.id) ?? 'Terminal'}
                  active={idx === activePaneIdx}
                />
              ) : pane.kind === 'markdown_artifact' && pane.markdown_artifact_state ? (
                <MarkdownArtifactPane
                  chatId={pane.markdown_artifact_state.chat_id}
                  state={pane.markdown_artifact_state}
                  onClose={() => removePane(idx)}
                />
              ) : pane.kind === 'html_artifact' && pane.html_artifact_state ? (
                <HtmlArtifactPane
                  chatId={pane.html_artifact_state.chat_id}
                  state={pane.html_artifact_state}
                  onClose={() => removePane(idx)}
                />
              ) : pane.kind === 'chat' && pane.chatId ? (
                <ChatPane
                  sessionId={sessionId}
--- a/apps/web/src/components/panes/ChatPane.tsx
+++ b/apps/web/src/components/panes/ChatPane.tsx
@@ -1,16 +1,12 @@
 import { useCallback, useEffect, useRef, useState } from 'react';
-import { ChevronDown, Square, X } from 'lucide-react';
+import { Pencil, Send, Square, X } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import { useSessionStream } from '@/hooks/useSessionStream';
 import { MessageList } from '@/components/MessageList';
 import { ChatInput } from '@/components/ChatInput';
-import {
-  DropdownMenu,
-  DropdownMenuContent,
-  DropdownMenuItem,
-  DropdownMenuTrigger,
-} from '@/components/ui/dropdown-menu';
+import { StaleStreamBanner } from '@/components/StaleStreamBanner';
+import { sendToChat } from '@/lib/events';
 interface Props {
  sessionId: string;
@@ -44,6 +40,38 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
  const chatMessages = stream.messages.filter((m) => m.chat_id === chatId);
  const streaming = chatMessages.some((m) => m.status === 'streaming');
  // v1.12.3: stale-stream detection. Watches the (at most one) streaming
  // assistant row. If its content length doesn't grow for STALE_THRESHOLD_MS,
  // assume the upstream call is dead and surface the recovery banner. We use
  // content length as the activity signal because every token delta extends
  // it; last_seq isn't currently bumped per delta.
  const STALE_THRESHOLD_MS = 60_000;
  const streamingMsg = chatMessages.find((m) => m.status === 'streaming' && m.role === 'assistant');
  const streamingId = streamingMsg?.id ?? null;
  const streamingLen = streamingMsg?.content.length ?? 0;
  const lastActivityRef = useRef<{ id: string; len: number; at: number } | null>(null);
  const [stale, setStale] = useState(false);
  useEffect(() => {
    if (!streamingId) {
      lastActivityRef.current = null;
      setStale(false);
      return;
    }
    const prev = lastActivityRef.current;
    if (!prev || prev.id !== streamingId || prev.len !== streamingLen) {
      lastActivityRef.current = { id: streamingId, len: streamingLen, at: Date.now() };
      setStale(false);
    }
    const interval = setInterval(() => {
      const a = lastActivityRef.current;
      if (!a) return;
      if (Date.now() - a.at >= STALE_THRESHOLD_MS) {
        setStale(true);
      }
    }, 5_000);
    return () => clearInterval(interval);
  }, [streamingId, streamingLen]);
  // v1.11.5: per-chat model context limit comes from chat.model_context_limit
  // populated by GET /api/sessions/:id/chats. Threaded into ChatInput so
  // ContextBar can render a zero-state before the first assistant message.
@@ -87,6 +115,45 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
    }
  }
  const handleDiscardStale = useCallback(async () => {
    if (!streamingId) return;
    try {
      await api.chats.discardStale(chatId, streamingId);
      setStale(false);
      lastActivityRef.current = null;
    } catch (err) {
      // 409 (race) is benign — the row already terminated some other way.
      const msg = err instanceof Error ? err.message : 'discard failed';
      if (!msg.includes('409')) toast.error(msg);
      setStale(false);
    }
  }, [chatId, streamingId]);
  const handleRetryStale = useCallback(async () => {
    if (!streamingId) return;
    const lastUser = [...chatMessages].reverse().find((m) => m.role === 'user' && m.kind === 'message');
    if (!lastUser) {
      toast.error('no prior user message to retry');
      return;
    }
    try {
      await api.chats.discardStale(chatId, streamingId);
    } catch (err) {
      const msg = err instanceof Error ? err.message : 'discard failed';
      if (!msg.includes('409')) {
        toast.error(msg);
        return;
      }
    }
    setStale(false);
    lastActivityRef.current = null;
    try {
      await api.messages.send(chatId, lastUser.content);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'retry send failed');
    }
  }, [chatId, streamingId, chatMessages]);
  const handleForceSend = useCallback(async (content: string) => {
    const trimmed = content.trim();
    if (!trimmed) return;
@@ -114,6 +181,16 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
    setQueue((prev) => prev.filter((_, i) => i !== idx));
  }
  // v1.13.12: edit a queued message — pop it off the queue and push its text
  // into ChatInput via sendToChat. ChatInput appends (or sets, if empty) and
  // focuses; user re-sends, which re-queues if streaming is still active.
  function editQueued(idx: number) {
    const msg = queue[idx];
    if (!msg) return;
    setQueue((prev) => prev.filter((_, i) => i !== idx));
    sendToChat.emit({ chat_id: chatId, text: msg });
  }
  async function forceSendQueued(idx: number) {
    const msg = queue[idx];
    if (!msg) return;
@@ -138,30 +215,30 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
            <div key={i} className="flex items-center gap-2 text-xs text-muted-foreground bg-muted/30 rounded px-2 py-1">
              <span className="font-medium shrink-0">Queued:</span>
              <span className="truncate flex-1">{msg}</span>
-              <DropdownMenu>
-                <DropdownMenuTrigger asChild>
-                  <button
-                    type="button"
-                    className="inline-flex items-center justify-center p-0.5 hover:bg-muted rounded shrink-0 max-md:min-h-[44px] max-md:min-w-[44px]"
-                    aria-label="Queued message options"
-                  >
-                    <ChevronDown size={12} />
-                  </button>
-                </DropdownMenuTrigger>
-                <DropdownMenuContent align="end">
-                  <DropdownMenuItem onSelect={() => { /* default: queued, nothing to do */ }}>
-                    Send when done
-                  </DropdownMenuItem>
-                  <DropdownMenuItem onSelect={() => void forceSendQueued(i)}>
-                    Force send now
-                  </DropdownMenuItem>
-                </DropdownMenuContent>
-              </DropdownMenu>
+              <button
+                type="button"
+                onClick={() => editQueued(i)}
+                className="inline-flex items-center justify-center p-0.5 hover:bg-muted rounded shrink-0 max-md:min-h-[44px] max-md:min-w-[44px]"
+                aria-label="Edit queued message"
+                title="Edit"
+              >
+                <Pencil size={12} />
+              </button>
+              <button
+                type="button"
+                onClick={() => void forceSendQueued(i)}
+                className="inline-flex items-center justify-center p-0.5 hover:bg-muted rounded shrink-0 max-md:min-h-[44px] max-md:min-w-[44px]"
+                aria-label="Force send queued message now"
+                title="Force send now"
+              >
+                <Send size={12} />
+              </button>
              <button
                type="button"
                onClick={() => removeQueued(i)}
                className="inline-flex items-center justify-center p-0.5 hover:bg-muted rounded shrink-0 max-md:min-h-[44px] max-md:min-w-[44px]"
                aria-label="Cancel queued message"
                title="Cancel"
              >
                <X size={12} />
              </button>
@@ -187,6 +264,13 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
        </div>
      )}
      {stale && streamingId && (
        <StaleStreamBanner
          onRetry={() => void handleRetryStale()}
          onDiscard={() => void handleDiscardStale()}
        />
      )}
      <ChatInput
        disabled={false}
        projectId={projectId}
--- a/apps/web/src/components/panes/SettingsPane.tsx
+++ b/apps/web/src/components/panes/SettingsPane.tsx
@@ -1,5 +1,5 @@
 import { useEffect, useState } from 'react';
-import { Archive, Maximize2, Minimize2, X } from 'lucide-react';
+import { Archive, FolderOpen, Maximize2, Minimize2, Trash2, X } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import type { Project, Session } from '@/api/types';
@@ -269,6 +269,8 @@ function SessionSection({ session, project }: { session: Session; project: Proje
        </p>
      </div>
      <AllowedReadPathsSection session={session} />
      <div className="space-y-1.5">
        <div className="flex items-center justify-between gap-3">
          <label className="text-xs font-medium uppercase tracking-wide text-muted-foreground">
@@ -337,6 +339,76 @@ function SessionSection({ session, project }: { session: Session; project: Proje
  );
 }
 // v1.13.17-cross-repo-reads: revoke UI for session.allowed_read_paths.
 // Append happens through the inline request_read_access pause flow; this
 // section only shrinks the list. PATCH /api/sessions/:id replaces the
 // whole array, so we send the original list minus the deleted entry.
 function AllowedReadPathsSection({ session }: { session: Session }) {
  const [paths, setPaths] = useState<string[]>(session.allowed_read_paths);
  const [pendingDelete, setPendingDelete] = useState<string | null>(null);
  // Re-sync on session prop change (e.g. WS session_updated after a new
  // grant lands). Without this, a grant approved in this same chat wouldn't
  // appear in the list until the user closes and reopens settings.
  useEffect(() => {
    setPaths(session.allowed_read_paths);
  }, [session.id, session.allowed_read_paths]);
  async function remove(path: string) {
    if (pendingDelete) return;
    setPendingDelete(path);
    const next = paths.filter((p) => p !== path);
    try {
      const updated = await api.sessions.update(session.id, { allowed_read_paths: next });
      setPaths(updated.allowed_read_paths);
      toast.success('Grant revoked');
    } catch (err) {
      toast.error(err instanceof Error ? err.message : 'failed to revoke');
    } finally {
      setPendingDelete(null);
    }
  }
  return (
    <div className="space-y-1.5">
      <label className="text-xs font-medium uppercase tracking-wide text-muted-foreground">
        Cross-repo read grants
      </label>
      {paths.length === 0 ? (
        <p className="text-xs text-muted-foreground italic">
          The agent has no access outside this project. Grants are created when
          the agent asks for them inline.
        </p>
      ) : (
        <ul className="space-y-1">
          {paths.map((p) => (
            <li
              key={p}
              className="flex items-center gap-2 rounded border bg-background/60 px-2 py-1.5"
            >
              <FolderOpen className="size-3.5 shrink-0 text-muted-foreground" />
              <span className="font-mono text-xs flex-1 min-w-0 break-all">{p}</span>
              <button
                type="button"
                onClick={() => void remove(p)}
                disabled={pendingDelete !== null}
                aria-label={`Revoke ${p}`}
                title="Revoke"
                className="inline-flex items-center justify-center size-7 rounded text-muted-foreground hover:bg-muted hover:text-destructive disabled:opacity-40 disabled:cursor-not-allowed max-md:min-h-[44px] max-md:min-w-[44px]"
              >
                <Trash2 className="size-3.5" />
              </button>
            </li>
          ))}
        </ul>
      )}
      <p className="text-xs text-muted-foreground">
        Grants are session-scoped. Archiving the session clears them.
      </p>
    </div>
  );
 }
 function ProjectSection({ project }: { project: Project }) {
  const [name, setName] = useState(project.name);
  const [defaultPrompt, setDefaultPrompt] = useState(project.default_system_prompt);
--- a/apps/web/src/hooks/sessionEvents.ts
+++ b/apps/web/src/hooks/sessionEvents.ts
@@ -2,7 +2,14 @@
 // across hooks (e.g. AI rename arriving via WS in the session view needs to
 // also refresh the sidebar's session list).
-import type { Chat, ErrorReason, Project, Session } from '@/api/types';
+import type {
+  Chat,
+  ErrorReason,
+  HtmlArtifactState,
+  MarkdownArtifactState,
+  Project,
+  Session,
+} from '@/api/types';
 import type { Attachment } from '@/lib/attachments';
 export interface SessionRenamedEvent {
@@ -41,6 +48,12 @@ export interface SessionUpdatedEvent {
  updated_at: string;
 }
 export interface SessionWorkspaceUpdatedEvent {
  type: 'session_workspace_updated';
  session_id: string;
  workspace_panes: import('@/api/types').WorkspacePane[];
 }
 export interface SessionLoadedEvent {
  type: 'session_loaded';
  session_id: string;
@@ -62,6 +75,19 @@ export interface OpenChatInActivePaneEvent {
  chat_id: string;
 }
 // v1.14.x-html-artifact-panes: ActionRow's "Open in pane" button emits one of
 // these; useWorkspacePanes subscribes and inserts the corresponding artifact
 // pane (or focuses an existing one keyed by message_id).
 export interface OpenMarkdownArtifactPaneEvent {
  type: 'open_markdown_artifact_pane';
  state: MarkdownArtifactState;
 }
 export interface OpenHtmlArtifactPaneEvent {
  type: 'open_html_artifact_pane';
  state: HtmlArtifactState;
 }
 // Client-side event fired by the sidebar Settings button when a session is
 // currently mounted. Session.tsx subscribes and calls
 // panesHook.toggleSettingsPane() (open on first click, close on second).
@@ -131,7 +157,7 @@ export interface ProjectUpdatedEvent {
 export interface ChatStatusEvent {
  type: 'chat_status';
  chat_id: string;
-  status: 'working' | 'idle' | 'error';
+  status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
  at: string;
  reason?: ErrorReason;
 }
@@ -143,10 +169,13 @@ export type SessionEvent =
  | SessionCreatedEvent
  | SessionDeletedEvent
  | SessionUpdatedEvent
  | SessionWorkspaceUpdatedEvent
  | SessionLoadedEvent
  | OpenFileInBrowserEvent
  | AttachChatFileEvent
  | OpenChatInActivePaneEvent
  | OpenMarkdownArtifactPaneEvent
  | OpenHtmlArtifactPaneEvent
  | OpenSettingsPaneEvent
  | SessionArchivedEvent
  | ChatCreatedEvent
--- a/apps/web/src/hooks/useChatStatus.ts
+++ b/apps/web/src/hooks/useChatStatus.ts
@@ -1,8 +1,14 @@
 import { useEffect, useState } from 'react';
 import { sessionEvents } from './sessionEvents';
-export type RawStatus = 'working' | 'idle' | 'error';
+export type RawStatus = 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
-export type DerivedStatus = 'working' | 'idle_warm' | 'idle_cold' | 'error';
+export type DerivedStatus =
+  | 'streaming'
+  | 'tool_running'
+  | 'waiting_for_input'
+  | 'idle_warm'
+  | 'idle_cold'
+  | 'error';
 // Window during which an idle dot stays green; after this, it fades to gray.
 const WARM_WINDOW_MS = 30_000;
@@ -53,7 +59,9 @@ if (!G.__boocode_chat_status_subscribed) {
 function derive(entry: Entry | undefined): DerivedStatus {
  if (!entry) return 'idle_cold';
-  if (entry.status === 'working') return 'working';
+  if (entry.status === 'streaming') return 'streaming';
+  if (entry.status === 'tool_running') return 'tool_running';
+  if (entry.status === 'waiting_for_input') return 'waiting_for_input';
  if (entry.status === 'error') return 'error';
  const age = Date.now() - new Date(entry.at).getTime();
  return age < WARM_WINDOW_MS ? 'idle_warm' : 'idle_cold';
--- a/apps/web/src/hooks/useChatThroughput.ts
+++ b/apps/web/src/hooks/useChatThroughput.ts
@@ -0,0 +1,106 @@
 import { useEffect, useState } from 'react';
 // v1.12.2: live throughput stream consumer. Fed by useSessionStream when a
 // 'usage' WS frame lands. Renders next to StatusDot via ChatThroughput.
 //
 // Singleton + Set<setState> pattern mirrors useChatStatus so any component
 // can subscribe to any chatId without prop drilling.
 export interface ThroughputSample {
  tps: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
 }
 interface Entry {
  ctx_used: number | null;
  ctx_max: number | null;
  completion_tokens: number | null;
  recorded_at: number;
  prev_completion_tokens: number | null;
  prev_recorded_at: number | null;
  tps: number | null;
 }
 // Stale window. After this, useChatThroughput returns null — clears the
 // indicator after the stream ends without the next inference turn.
 const STALE_MS = 10_000;
 const entries = new Map<string, Entry>();
 const subscribers = new Set<() => void>();
 function notify(): void {
  for (const s of subscribers) {
    try { s(); } catch { /* swallow */ }
  }
 }
 // v1.12.2: imported by useSessionStream's WS handler. Computes tps from the
 // gap between successive completion_tokens samples; first sample yields null
 // (we need two points). Skips zero-progress samples so a duplicate usage
 // frame doesn't push tps to 0.
 export function recordUsage(
  chatId: string,
  data: { completion_tokens: number | null; ctx_used: number | null; ctx_max: number | null },
 ): void {
  const now = Date.now();
  const prev = entries.get(chatId);
  let tps: number | null = prev?.tps ?? null;
  if (
    prev &&
    data.completion_tokens != null &&
    prev.completion_tokens != null &&
    data.completion_tokens > prev.completion_tokens &&
    now > prev.recorded_at
  ) {
    const dTokens = data.completion_tokens - prev.completion_tokens;
    const dSeconds = (now - prev.recorded_at) / 1000;
    tps = dTokens / dSeconds;
  }
  entries.set(chatId, {
    ctx_used: data.ctx_used,
    ctx_max: data.ctx_max,
    completion_tokens: data.completion_tokens,
    recorded_at: now,
    prev_completion_tokens: prev?.completion_tokens ?? null,
    prev_recorded_at: prev?.recorded_at ?? null,
    tps,
  });
  notify();
 }
 export function clearThroughput(chatId: string): void {
  if (entries.delete(chatId)) notify();
 }
 // Periodic sweep: re-notify so stale entries fall off the UI when the
 // stream ends without a follow-up frame. Light — one timer for the whole app.
 const G = globalThis as Record<string, unknown>;
 if (!G.__boocode_throughput_ticker) {
  G.__boocode_throughput_ticker = true;
  setInterval(() => {
    const now = Date.now();
    let touched = false;
    for (const [k, v] of entries) {
      if (now - v.recorded_at > STALE_MS) {
        entries.delete(k);
        touched = true;
      }
    }
    if (touched) notify();
  }, 2_000);
 }
 export function useChatThroughput(chatId: string | null | undefined): ThroughputSample | null {
  const [, force] = useState({});
  useEffect(() => {
    const sub = () => force({});
    subscribers.add(sub);
    return () => { subscribers.delete(sub); };
  }, []);
  if (!chatId) return null;
  const entry = entries.get(chatId);
  if (!entry) return null;
  if (Date.now() - entry.recorded_at > STALE_MS) return null;
  return { tps: entry.tps, ctx_used: entry.ctx_used, ctx_max: entry.ctx_max };
 }
--- a/apps/web/src/hooks/useSessionChats.ts
+++ b/apps/web/src/hooks/useSessionChats.ts
@@ -12,6 +12,7 @@ export interface UseSessionChatsOpts {
  // about pane indexing.
  openChatInActivePane: (chatId: string) => void;
  initializeFirstChatIfEmpty: (chatId: string) => void;
  validatePanes: (validChatIds: Set<string>) => void;
 }
 export interface UseSessionChatsResult {
@@ -44,12 +45,15 @@ export function useSessionChats(
  openChatInActivePaneRef.current = opts.openChatInActivePane;
  const initializeFirstChatIfEmptyRef = useRef(opts.initializeFirstChatIfEmpty);
  initializeFirstChatIfEmptyRef.current = opts.initializeFirstChatIfEmpty;
  const validatePanesRef = useRef(opts.validatePanes);
  validatePanesRef.current = opts.validatePanes;
  useEffect(() => {
    let cancelled = false;
    api.chats.listForSession(sessionId).then((list) => {
      if (cancelled) return;
      setChats(list);
      validatePanesRef.current(new Set(list.map((c) => c.id)));
      const openChat = list.find((c) => c.status === 'open');
      if (openChat) {
        initializeFirstChatIfEmptyRef.current(openChat.id);
--- a/apps/web/src/hooks/useSessionStream.ts
+++ b/apps/web/src/hooks/useSessionStream.ts
@@ -1,8 +1,10 @@
 import { useEffect, useRef, useState } from 'react';
 import { toast } from 'sonner';
 import type { Message, WsFrame } from '@/api/types';
 import { WsFrameSchema } from '@/api/ws-frames';
 import { api } from '@/api/client';
 import { sessionEvents } from './sessionEvents';
 import { recordUsage } from './useChatThroughput';
 // session_renamed frame removed from WsFrame — it was declared but never
 // published on the per-session WS channel (server publishes via broker.publishUser
@@ -125,6 +127,19 @@ function applyFrame(state: State, frame: WsFrame): State {
      );
      return { ...state, messages: next };
    }
    case 'usage': {
      // v1.12.2: live throughput. Side-effects into the module-level
      // singleton consumed by ChatThroughput; no message-state mutation.
      // chat_id is the optional ws-frame field; usage frames always include it.
      if (frame.chat_id) {
        recordUsage(frame.chat_id, {
          completion_tokens: frame.completion_tokens,
          ctx_used: frame.ctx_used,
          ctx_max: frame.ctx_max,
        });
      }
      return state;
    }
    case 'messages_deleted': {
      const removeSet = new Set(frame.message_ids);
      return {
@@ -202,8 +217,28 @@ export function useSessionStream(sessionId: string | undefined) {
        setState((s) => ({ ...s, connected: true, error: null }));
      };
      ws.onmessage = (ev) => {
        // v1.13.11-a: Zod-validate every inbound frame. Fail-closed — invalid
        // frames are logged and dropped. WsFrameSchema is the runtime guard;
        // the hand-maintained WsFrame type stays as the narrowed dev-time
        // shape (Zod uses OpaqueObject for nested types like Message[]). One
        // cast bridges the two.
        let raw: unknown;
        try {
-          const frame = JSON.parse(typeof ev.data === 'string' ? ev.data : '') as WsFrame;
+          raw = JSON.parse(typeof ev.data === 'string' ? ev.data : '');
+        } catch (err) {
+          console.warn('bad ws frame (parse)', err);
+          return;
+        }
+        const validated = WsFrameSchema.safeParse(raw);
+        if (!validated.success) {
+          console.error('ws-frame-validation-failed (session channel)', {
+            frame_type: (raw as { type?: unknown })?.type,
+            errors: validated.error.flatten(),
+          });
+          return;
+        }
+        try {
+          const frame = validated.data as unknown as WsFrame;
          // v1.11: on a compaction completion, re-fetch the message list so
          // the new summary row + the cohort of compacted_at-stamped older
          // rows render correctly. We dispatch the fresh list as a synthetic
--- a/apps/web/src/hooks/useSidebar.ts
+++ b/apps/web/src/hooks/useSidebar.ts
@@ -143,6 +143,9 @@ function applyEvent(prev: SidebarResponse, event: import('./sessionEvents').Sess
    case 'session_loaded':
      // activeSessionProjectId is updated in the subscribe callback; no data change here.
      return prev;
+    case 'session_workspace_updated':
+      // Pane layout is consumed by useWorkspacePanes; sidebar has no stake.
+      return prev;
    case 'open_file_in_browser':
      // Consumed by Workspace (T7); no sidebar state change needed.
      return prev;
@@ -151,6 +154,11 @@ function applyEvent(prev: SidebarResponse, event: import('./sessionEvents').Sess
    case 'open_chat_in_active_pane':
      // Consumed by Workspace; sidebar has no business with pane state.
      return prev;
+    case 'open_markdown_artifact_pane':
+    case 'open_html_artifact_pane':
+      // v1.14.x-html-artifact-panes: consumed by useWorkspacePanes; sidebar
+      // has no business with pane state.
+      return prev;
    case 'open_settings_pane':
      // Consumed by Session.tsx (calls toggleSettingsPane on its panesHook).
      // Sidebar data is untouched.
--- a/apps/web/src/hooks/useUserEvents.ts
+++ b/apps/web/src/hooks/useUserEvents.ts
@@ -1,4 +1,5 @@
 import { useEffect } from 'react';
+import { WsFrameSchema } from '@/api/ws-frames';
 import { sessionEvents } from './sessionEvents';
 import { createWsReconnectToast } from './wsReconnectToast';
@@ -38,14 +39,33 @@ export function useUserEvents(): void {
      };
      ws.onmessage = (ev) => {
+        // v1.13.11-a: Zod-validate every inbound frame. Fail-closed — invalid
+        // frames are logged and dropped instead of dispatched onto the
+        // sessionEvents bus where a stale or wrong shape would silently
+        // corrupt sidebar / chat state.
+        let raw: unknown;
        try {
-          const parsed: unknown = JSON.parse(ev.data);
-          if (parsed && typeof (parsed as { type?: unknown }).type === 'string') {
-            sessionEvents.emit(parsed as import('./sessionEvents').SessionEvent);
-          }
+          raw = JSON.parse(ev.data);
        } catch (err) {
          console.warn('useUserEvents: failed to parse frame', err);
+          return;
        }
+        const validated = WsFrameSchema.safeParse(raw);
+        if (!validated.success) {
+          console.error('ws-frame-validation-failed (user channel)', {
+            frame_type: (raw as { type?: unknown })?.type,
+            errors: validated.error.flatten(),
+          });
+          return;
+        }
+        // Bridge cast: Zod's union is broader than SessionEvent (it includes
+        // per-session-channel frames too, which never arrive on the user
+        // channel). sessionEvents.emit only dispatches frames whose type
+        // appears in SessionEvent; the narrowing happens via the existing
+        // useSidebar.ts applyEvent switch.
+        sessionEvents.emit(
+          validated.data as unknown as import('./sessionEvents').SessionEvent,
+        );
      };
      ws.onclose = () => {
--- a/apps/web/src/hooks/useWorkspacePanes.ts
+++ b/apps/web/src/hooks/useWorkspacePanes.ts
@@ -2,11 +2,20 @@ import { useCallback, useEffect, useRef, useState } from 'react';
 import type { DragEvent } from 'react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
-import type { WorkspacePane } from '@/api/types';
+import type {
+  HtmlArtifactState,
+  MarkdownArtifactState,
+  WorkspacePane,
+} from '@/api/types';
 import { setActivePaneInfo, clearActivePane } from '@/hooks/useActivePane';
 import { sessionEvents } from '@/hooks/sessionEvents';
 export const MAX_PANES = 5;
-const STORAGE_KEY = 'boocode.workspace.panes';
+// v1.12.1: legacy localStorage key. Read once on mount to seed the server
+// for sessions still on per-device state, then deleted. Server is now
+// authoritative via sessions.workspace_panes.
+const LEGACY_STORAGE_KEY = 'boocode.workspace.panes';
 const SAVE_DEBOUNCE_MS = 300;
 function generateId(): string {
  return crypto.randomUUID();
@@ -38,6 +47,28 @@ function settingsPane(): WorkspacePane {
  return { id: generateId(), kind: 'settings', chatIds: [], activeChatIdx: -1 };
 }
+// v1.14.x-html-artifact-panes: artifact pane factories. Payload travels with
+// the pane row so the sessions.workspace_panes jsonb survives reload.
+function markdownArtifactPane(state: MarkdownArtifactState): WorkspacePane {
+  return {
+    id: generateId(),
+    kind: 'markdown_artifact',
+    chatIds: [],
+    activeChatIdx: -1,
+    markdown_artifact_state: state,
+  };
+}
+function htmlArtifactPane(state: HtmlArtifactState): WorkspacePane {
+  return {
+    id: generateId(),
+    kind: 'html_artifact',
+    chatIds: [],
+    activeChatIdx: -1,
+    html_artifact_state: state,
+  };
+}
 // v1.9: settings panes are ephemeral. Filter them out before persisting so a
 // page reload always returns to a clean workspace; the user re-opens via the
 // sidebar Settings button when needed.
@@ -51,9 +82,11 @@ function nonSettingsCount(panes: WorkspacePane[]): number {
  return panes.reduce((n, p) => n + (p.kind === 'settings' ? 0 : 1), 0);
 }
-function loadPanes(sessionId: string): WorkspacePane[] | null {
+// v1.12.1: read legacy per-device localStorage. If present, the caller seeds
+// the server then deletes the key. One-time migration per session.
+function readLegacyPanes(sessionId: string): WorkspacePane[] | null {
  try {
-    const raw = localStorage.getItem(`${STORAGE_KEY}.${sessionId}`);
+    const raw = localStorage.getItem(`${LEGACY_STORAGE_KEY}.${sessionId}`);
    if (!raw) return null;
    const parsed = JSON.parse(raw) as WorkspacePane[];
    if (!Array.isArray(parsed) || parsed.length === 0) return null;
@@ -63,15 +96,6 @@ function loadPanes(sessionId: string): WorkspacePane[] | null {
  }
 }
-function savePanes(sessionId: string, panes: WorkspacePane[]): void {
-  try {
-    localStorage.setItem(
-      `${STORAGE_KEY}.${sessionId}`,
-      JSON.stringify(persistablePanes(panes)),
-    );
-  } catch { /* quota or disabled */ }
-}
 export interface UseWorkspacePanesResult {
  panes: WorkspacePane[];
  activePaneIdx: number;
@@ -96,6 +120,7 @@ export interface UseWorkspacePanesResult {
  removePane: (idx: number) => void;
  removeChatFromPanes: (chatId: string) => void;
  initializeFirstChatIfEmpty: (chatId: string) => void;
+  validatePanes: (validChatIds: Set<string>) => void;
  handlePaneDragStart: (idx: number) => (e: DragEvent<HTMLDivElement>) => void;
  handlePaneDragOver: (idx: number) => (e: DragEvent<HTMLDivElement>) => void;
  handlePaneDragLeave: () => void;
@@ -106,15 +131,129 @@ export interface UseWorkspacePanesResult {
 }
 export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
-  const [panes, setPanes] = useState<WorkspacePane[]>(() => {
-    return loadPanes(sessionId) ?? [emptyPane()];
-  });
+  const [panes, setPanes] = useState<WorkspacePane[]>(() => [emptyPane()]);
  const [activePaneIdx, setActivePaneIdx] = useState(0);
  const draggingIdxRef = useRef<number | null>(null);
  const [dragOverIdx, setDragOverIdx] = useState<number | null>(null);
+  // v1.12.1: skip PATCH while hydrating from the server. Without this, the
+  // initial [emptyPane()] would be saved over the server's real state before
+  // the GET resolves.
+  const hydratedRef = useRef(false);
+  // Tracks the last value broadcast by another device (or this one's own
+  // round-trip). If a PATCH would echo this exact payload, we skip the call.
+  const lastRemoteJsonRef = useRef<string>('[]');
+  // v1.12.1: hydrate from server on mount, then subscribe to remote updates.
  useEffect(() => {
-    savePanes(sessionId, panes);
+    hydratedRef.current = false;
+    let cancelled = false;
+    void (async () => {
+      try {
+        const session = await api.sessions.get(sessionId);
+        if (cancelled) return;
+        let initial: WorkspacePane[] = Array.isArray(session.workspace_panes)
+          ? session.workspace_panes
+          : [];
+        // One-time migration: if server is empty but legacy localStorage has
+        // a layout, seed the server and delete the local key.
+        if (initial.length === 0) {
+          const legacy = readLegacyPanes(sessionId);
+          if (legacy && legacy.length > 0) {
+            try {
+              const updated = await api.sessions.updateWorkspacePanes(sessionId, legacy);
+              if (cancelled) return;
+              initial = updated.workspace_panes;
+              localStorage.removeItem(`${LEGACY_STORAGE_KEY}.${sessionId}`);
+            } catch {
+              initial = legacy;
+            }
+          }
+        }
+        const next = initial.length > 0 ? initial : [emptyPane()];
+        lastRemoteJsonRef.current = JSON.stringify(persistablePanes(next));
+        setPanes(next);
+        setActivePaneIdx(0);
+      } finally {
+        if (!cancelled) hydratedRef.current = true;
+      }
+    })();
+    return () => { cancelled = true; };
+  }, [sessionId]);
+  // v1.12.1: live cross-device sync. Replace local state when another device
+  // (or our own write echo) lands a session_workspace_updated frame.
+  useEffect(() => {
+    return sessionEvents.subscribe((ev) => {
+      if (ev.type !== 'session_workspace_updated') return;
+      if (ev.session_id !== sessionId) return;
+      const incoming = Array.isArray(ev.workspace_panes) ? ev.workspace_panes : [];
+      const json = JSON.stringify(incoming);
+      if (json === lastRemoteJsonRef.current) return;
+      lastRemoteJsonRef.current = json;
+      setPanes(incoming.length > 0 ? incoming : [emptyPane()]);
+      setActivePaneIdx((prev) => Math.min(prev, Math.max(0, incoming.length - 1)));
+    });
+  }, [sessionId]);
+  // v1.14.x-html-artifact-panes: ActionRow's "Open in pane" emits one of
+  // these per click. If a pane already exists for the same message_id, focus
+  // it instead of stacking a duplicate. Otherwise append (capped at MAX_PANES;
+  // settings panes don't count, matching addSplitPane's rule).
+  useEffect(() => {
+    return sessionEvents.subscribe((ev) => {
+      if (
+        ev.type !== 'open_markdown_artifact_pane' &&
+        ev.type !== 'open_html_artifact_pane'
+      ) {
+        return;
+      }
+      setPanes((prev) => {
+        const targetKind: WorkspacePane['kind'] =
+          ev.type === 'open_html_artifact_pane' ? 'html_artifact' : 'markdown_artifact';
+        const messageId = ev.state.message_id;
+        const existingIdx = prev.findIndex((p) =>
+          p.kind === 'markdown_artifact'
+            ? p.markdown_artifact_state?.message_id === messageId
+            : p.kind === 'html_artifact'
+              ? p.html_artifact_state?.message_id === messageId
+              : false,
+        );
+        if (existingIdx >= 0) {
+          setActivePaneIdx(existingIdx);
+          return prev;
+        }
+        if (nonSettingsCount(prev) >= MAX_PANES) {
+          toast.error(`Maximum ${MAX_PANES} panes`);
+          return prev;
+        }
+        const newPane =
+          ev.type === 'open_html_artifact_pane'
+            ? htmlArtifactPane(ev.state)
+            : markdownArtifactPane(ev.state);
+        // Defensive: assert kind matches for the discriminated union.
+        if (newPane.kind !== targetKind) return prev;
+        const next = [...prev, newPane];
+        setActivePaneIdx(next.length - 1);
+        return next;
+      });
+    });
+  }, []);
+  // v1.12.1: debounced PATCH on every change. Settings panes are stripped
+  // before saving (ephemeral per v1.9).
+  useEffect(() => {
+    if (!hydratedRef.current) return;
+    const payload = persistablePanes(panes);
+    const json = JSON.stringify(payload);
+    if (json === lastRemoteJsonRef.current) return;
+    const timer = setTimeout(() => {
+      lastRemoteJsonRef.current = json;
+      api.sessions.updateWorkspacePanes(sessionId, payload).catch(() => {
+        // Non-fatal: next change retries. Persistent failures surface via
+        // the network layer's existing reconnect toast.
+      });
+    }, SAVE_DEBOUNCE_MS);
+    return () => clearTimeout(timer);
  }, [sessionId, panes]);
  useEffect(() => {
@@ -328,6 +467,23 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
    });
  }, []);
+  const validatePanes = useCallback((validChatIds: Set<string>) => {
+    setPanes((prev) => {
+      const cleaned = prev.map((pane) => {
+        if (pane.kind !== 'chat' || pane.chatIds.length === 0) return pane;
+        const nextIds = pane.chatIds.filter((id) => validChatIds.has(id));
+        if (nextIds.length === pane.chatIds.length) return pane;
+        if (nextIds.length === 0) {
+          return { ...pane, kind: 'empty' as const, chatId: undefined, chatIds: [], activeChatIdx: -1 };
+        }
+        const nextActiveIdx = Math.min(pane.activeChatIdx, nextIds.length - 1);
+        return { ...pane, chatIds: nextIds, activeChatIdx: nextActiveIdx, chatId: nextIds[nextActiveIdx] };
+      });
+      const unchanged = cleaned.every((p, i) => p === prev[i]);
+      return unchanged ? prev : cleaned;
+    });
+  }, []);
  const removeChatFromPanes = useCallback((chatId: string) => {
    setPanes((prev) => prev.map((p) => {
      const idx = p.chatIds.indexOf(chatId);
@@ -411,6 +567,7 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
    removePane,
    removeChatFromPanes,
    initializeFirstChatIfEmpty,
+    validatePanes,
    handlePaneDragStart,
    handlePaneDragOver,
    handlePaneDragLeave,