v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints

Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6 turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent because Claude Code documentation in its pre-training corpus uses that shape). ## Parser extension xml-parser.ts now recognizes BOTH XML tool-call flavors: - Qwen/Hermes: <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call> - Anthropic: <invoke name="NAME"><parameter name="K">V</parameter></invoke> Both route through the same synthetic-id xml_call_${idx} ToolCall path. extractToolCallBlocks() and partialXmlOpenerStart() handle both openers (<tool_call> and <invoke...) so partial buffers don't get prematurely flushed during streaming. The existing Qwen parser was tightened to tolerate whitespace around `=` (<function = name>, <parameter = key>...) so a stray space doesn't get absorbed into the function name. Name capture is non-whitespace, non-`>`. ## Unknown-tool recovery hint New tool-suggestions.ts exports levenshtein() + suggestToolName() + formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the model now includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME). Targets the qwen3.6 drift to read_file → suggest view_file. Applies to all unknown tool names, not just <invoke>-derived ones — at the dispatch layer we no longer know which format produced the call, and the extra signal is harmless for Qwen-derived calls. ## Test coverage xml-parser.test.ts: 46 tests, all green. Covers both parsers (well-formed, malformed, multi-parameter, nested-content), the partial-opener detector for both flavors, the unified extraction helper, and the unknown-tool error formatter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.15-codecontext-synth: remove "tag pending" qualifier in roadmap
2026-05-22 20:59:25 +00:00 · 2026-05-22 20:09:39 +00:00 · 2026-05-22 20:08:47 +00:00
12 changed files with 1543 additions and 84 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,167 @@
+# Changelog
+
+All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
+
+## v1.13.16-xml-parser — 2026-05-22
+
+Two-part fix for the model-emitted XML drift the v1.13.15 investigation surfaced. **Parser extension:** `xml-parser.ts` now recognizes the Anthropic `<invoke name="…"><parameter name="…">…</parameter></invoke>` shape alongside the existing Qwen/Hermes `<tool_call><function=…>…</function></tool_call>` shape. qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent (Claude Code documentation in its pre-training corpus). Both formats route through the same synthetic-id `xml_call_${idx}` ToolCall path. The existing Qwen parser was tightened to tolerate whitespace around `=` (`<function = name>` shape) so a stray space doesn't get absorbed into the function name. **Unknown-tool recovery hint:** new `tool-suggestions.ts` exports `levenshtein()` + `suggestToolName()` + `formatUnknownToolError()`. When the dispatcher (`tool-phase.ts:executeToolCall`) receives an unknown tool name, the error returned to the model includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against `Object.keys(TOOLS_BY_NAME)`. Targets the qwen3.6 drift to `read_file` → suggest `view_file`. Test coverage in `xml-parser.test.ts` (46 tests, all green) covers both parsers, the partial-opener detector for both flavors, the unified extraction helper, and the new error formatter.
+
+## v1.13.15-codecontext-synth — 2026-05-22
+
+Forced second-inference synthesis pass for codecontext overview-class tools (`get_codebase_overview`, `get_framework_analysis`, `get_semantic_neighborhoods`). After the tool result lands, the pipeline expands the truncated head via in-process `readTruncation`, extracts referenced file paths from the full content, auto-fetches top-N files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md) under a 32k-token budget with explicit drop-priority order, then streams a synthesis turn that replaces the recursive `runAssistantTurn`. The 32k truncated head still ships to the synth model (token-budget contract preserved); the expansion is reference-extraction-only. Falls through to recursion on timeout (90s), model error, or non-2xx; user-abort marks the synth message `status='failed'` and re-throws (the outer abort handler operates on the parent turn's message, not the new synth row — without explicit marking, the row would sit `streaming` until the 5-min sweeper, tripping the 60s stale-stream banner). Adds `'synthesis'` to `message_parts.kind` CHECK constraint via `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` idempotency-guarded re-add. Smokes #1, #2, #6 all clean; smokes #3–#5 are content-quality checks for UI review.
+
+## v1.13.14-skills-audit — 2026-05-22
+
+Multi-topic batch. **Skills audit (headline):** vendored all 26 skills from `/home/samkintop/opt/skills/` into repo-local `data/skills/` (the `/opt/skills:/data/skills` override mount removed from `docker-compose.yml` so skills are auditable per-batch in git). Audited via 5 parallel Claude Code agent-teams running mgechev's 4-step protocol per skill — 14 survive with gerund-form names + refined triggers; 11 dropped (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively); 1 (`verification-before-completion`) migrated to `BOOCHAT.md`/`BOOCODER.md` as an always-true rule. The Codeminer42 "rules vs recipes" split codified in those files. **Token tracking + stale-stream banner fix:** same root cause — `IsoTimestamp = z.string()` in `ws-frames.ts` was failing on postgres `Date` objects, silently dropping every `message_complete` / `session_updated` / `chat_updated` frame through the `v1.13.13-ws-publish` Zod gate; `z.preprocess(v => v instanceof Date ? v.toISOString() : v, ...)` applied to the primitive on both server + web (parity test still passes). **Codecontext ignore:** `codecontext_client.ts` auto-installs `.codecontextignore.template` into any project's root on first call (stops the upstream empty-source-file parser crash on foreign projects' `node_modules`). **Budget bump:** `BUDGET_READ_ONLY` + `BUDGET_NO_AGENT` 30 → 50 (real recon need ~27 + headroom for codecontext failure-retry turns; doom-loop guard catches the loop class anyway). **UI:** queued-message dropdown → edit / force-send / cancel buttons in `ChatPane.tsx`; `ChatThroughput` removed from desktop tab strip (mobile tab switcher keeps it). Audit decisions in `openspec/changes/v1.13.12-skills-audit/audit-notes.md`.
+
+## v1.13.13-ws-publish — 2026-05-22
+
+Second half of the WebSocket-frame-typing batch. Converts the existing ~50 inference + auto_name publish sites (via the `index.ts` adapter) plus ~30 direct `broker.publish*` call sites in routes + compaction, so every server-emitted frame now goes through Zod validation at the broker boundary. Pairs with `v1.13.12-ws-schemas`.
+
+## v1.13.12-ws-schemas — 2026-05-22
+
+First half of the WebSocket-frame-typing batch. Adds `apps/server/src/types/ws-frames.ts` with Zod schemas for all 27 wire-format frame types (discriminated union `WsFrameSchema` + `KNOWN_FRAME_TYPES` diagnostic lookup), duplicated byte-identical at `apps/web/src/api/ws-frames.ts` with a parity test. Introduces the `publishFrame` / `publishUserFrame` wrappers that fail-closed on schema mismatch.
+
+## v1.13.11-tools — 2026-05-22
+
+Tiered tool loading via `BOOCODE_TOOLS` env var (`core` | `standard` | `all`). Core = 4 read-only fs tools (~2k token schema cost). Standard = +web + git + codecontext (~10k). All (default) = every tool in `ALL_TOOLS` (~21k). The var is a ceiling — narrows agent whitelists, never expands. Pattern lifted from `eyaltoledano/claude-task-master`.
+
+## v1.13.10-openspec — 2026-05-22
+
+Adopt `Fission-AI/OpenSpec`'s `openspec/changes/<slug>/{proposal,tasks,design}.md` shape for BooCode's own batch docs. Existing batch docs (`boocode_batch10.md`, `handoff_v1.13.8_prefix_verify.md`, `handoff_v1.13.10_per_tool_cost.md`) moved into `openspec/changes/archived/` via `git mv` to preserve history. Zero-dep documentation reformat.
+
+## v1.13.9-agentlint — 2026-05-22
+
+Manual audit of instruction files against `0xmariowu/AgentLint`'s 31-check standard. Removed identity-opener sections from `BOOCHAT.md` and `BOOCODER.md` (emphatic decoration the model doesn't need). Added `CLAUDE.local.md` to `.gitignore` — Claude Code's Glob ignores `.gitignore` by default, so local overrides were otherwise readable by any agent walking the workspace. `CLAUDE.md` passed all 10 checks unchanged.
+
+## v1.13.8-tool-cost — 2026-05-22
+
+Per-tool prompt/completion-token rolling averages surfaced in AgentPicker as at-a-glance cost hints. Implementation is the `tool_cost_stats` SQL view over `messages_with_parts` (`LATERAL jsonb_array_elements` on `tool_calls`), plus a read endpoint and a tooltip extension. Equal-split attribution — multi-tool turn divides tokens N-ways; the 100-call rolling mean absorbs split noise. Filters out `cap_hit` / `doom_loop` sentinels. Source data already lands via existing UPDATEs that `v1.13.5-stability-bundle`'s `includeUsage: true` fix made non-NULL.
+
+## v1.13.7-compaction-trigger — 2026-05-22
+
+Compaction overflow trigger lowered to `floor(0.85 × ctx_max)`, replacing the v1.11.0-era `ctx_max − 20_000` formula. Old formula gave only 7.6% headroom at 262k context and 0 budget for ≤20k contexts (never fired). New formula gives consistent 15% summarizer headroom across all model sizes. Opencode pattern lift from `session/overflow.ts`.
+
+## v1.13.6-prefix-stability — 2026-05-22
+
+System-prompt prefix stability verify-and-measure. Recon during planning disproved the original DB-cache premise: `buildSystemPrompt` already runs over inputs mtime-cached at the file layer (BOOCHAT.md, AGENTS.md global+per-project), and DB scalars are byte-stable until edited. This batch closes the verification gap with instrumentation, not implementation — `buildSystemPromptWithFingerprint` computes SHA-256 over the assembled prefix and a per-session `Map` observer fires `prefix-drift` (warn) on hash change with field-level `changed_inputs` diff.
+
+## v1.13.5-stability-bundle — 2026-05-22
+
+Five fixes for latent regressions surfaced during the cosmetic-revert investigation. (1) `provider.ts` — `includeUsage: true` on `createOpenAICompatible` (default false omitted `stream_options.include_usage`; llama-swap never emitted usage; tokens_used / ctx_used were NULL on every assistant row since `v1.13.0-ai-sdk-v6`). (2) `MessageList.tsx` — `hasText = m.content.trim().length > 0` to skip whitespace-only tool-call-only turns rendering empty bubbles. (3) `BUDGET_NO_AGENT` raised 15 → 30 to match read-only agent cap. (4) `payload.ts` skips status='failed' + complete-but-empty assistant rows so cap-hit + Continue doesn't upstream-reject. (5) Misc UI sanitization.
+
+## v1.13.4-reasoning-fix — 2026-05-22
+
+Compaction head-assembly audit caught one fix: reasoning was omitted from the summarizer's view of tool-bearing turns, silently degrading summary quality for reasoning-channel models (qwen3.6). `v1.13.0-ai-sdk-v6` had wired reasoning end-to-end into inference but missed this one read site. `CompactionMessage` extended with `reasoning_parts`; `buildHeadPayload` embeds it as a `<reasoning>...</reasoning>` prose prefix on the assistant content (OpenAI wire shape has no structured reasoning field).
+
+## v1.13.3-truncate — 2026-05-22
+
+Port of opencode's `truncate.ts`. Full tool output retrievable via opaque `tr_<12 base32 chars>` id (~60 bits entropy) and a new `view_truncated_output(id)` tool. Tmpfs storage at `/tmp/boocode-truncations/` (overridable via `BOOCODE_TRUNCATION_DIR`), 5MB cap, 7-day TTL, orphan-reap on the periodic 60s sweeper. Wired through four tools: `view_file`, `list_dir`, `web_fetch`, `codecontext_client`. Each returns the existing sliced view plus an `outputPath` field when truncation fires.
+
+## v1.13.2-compaction-prune — 2026-05-22
+
+Two-tier compaction prune — opencode pattern that was half-shipped in v1.11.0. New `message_parts.hidden_at` column with partial index on `WHERE hidden_at IS NULL`. `messages_with_parts` view changed from `COALESCE(parts, legacy)` to a CASE that distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → drop the row from the model payload" (smoke caught the `COALESCE` leaking hidden parts back via legacy fallback). `prune.ts` scans `tool_result` parts newest-first, protects the last 40k tokens, marks older candidates hidden once the combined estimate clears 20k.
+
+## v1.13.1-cleanup-bundle — 2026-05-22
+
+Four independent items owed from prior dispatches. (1) `statement_timeout = '30s'` at the database level (documented in `schema.sql` but applied operationally — `ALTER DATABASE` can't run inside a `DO` block). (2) Tool registry alpha-sorted at module load — llama.cpp's prompt cache hits on byte-identical prefixes; reordering tools near the top of the system prompt would invalidate every cached turn. (3) Periodic 60s stuck-row sweeper. (4) `experimental_repairToolCall` to keep streams alive on malformed qwen3.6 tool args (pass-through implementation — logs and forwards unmodified; existing zod-reject path routes back to the model).
+
+## v1.13.0-ai-sdk-v6 — 2026-05-22
+
+Major migration to AI SDK v6. Introduces the `streamCompletion` adapter (`services/inference/stream-phase.ts`) over `streamText`, with five known gotchas the LSP can't catch — abort signals swallowed by `fullStream` (post-iteration throw required), usage lands only at stream end via `await result.usage`, tools have no `execute` field (BooCode dispatches in `tool-phase.ts`), and tool-call-only turns may emit a leading `\n` text-delta. Also ships the `messages_with_parts` view (parts-merge read path) and wires `reasoning_parts` end-to-end via a `ReasoningPart` in the v6 ModelMessage. Ports `ask_user_input` correlation queries from JSON columns to `message_parts` JOINs.
+
+## v1.12.4-inference-split — 2026-05-21
+
+Complete `inference.ts` split into `services/inference/`. Pieces: `turn.ts` (orchestration — `runAssistantTurn` / `runInference` / `createInferenceRunner`), `sentinel-summaries.ts` (`runCapHitSummary`, `runDoomLoopSummary`), `stream-phase.ts`, `tool-phase.ts`, `provider.ts`, `payload.ts`, `prune.ts`, `budget.ts`, `xml-parser.ts`, `error-handler.ts`, `sentinels.ts`, `parts.ts`, `types.ts`. Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution).
+
+## v1.12.3-stale-banner — 2026-05-21
+
+Stale-stream banner with Retry/Discard. When an assistant message sits `status='streaming'` with no token activity for 60+ seconds, the chat shows a banner above the input. Both actions clear the stale row via new `POST /api/chats/:id/discard_stale` (updates `status='failed'`, publishes `chat_status='idle'`). Closes the UX gap from the 2026-05-21 debugging spiral — slow streams and dead streams now look different.
+
+## v1.12.2-live-toks — 2026-05-21
+
+Live tok/s + ctx display next to the status indicator. `ChatThroughput` renders inline beside `StatusDot` while streaming or tool_running. Subscribes to existing `'usage'` WS frames (500ms-throttled, carrying `completion_tokens` + `ctx_used` + `ctx_max`) via `sessionEvents`. Hides when status drops to idle/error or data is older than 10s. Addresses the same UX gap as `v1.12.3-stale-banner` — gives users a live token velocity readout that immediately distinguishes slow from dead.
+
+## v1.12.1-stop-handler — 2026-05-21
+
+`handleAbortOrError` now writes `status='cancelled'` on user stop; rows no longer stuck `streaming` forever. Drops stale `messages_status_check` constraint (only `messages_status_chk` remains, allowing 'cancelled' via TS `MESSAGE_STATUSES`). Removes `detectSameNameLoop` and `DOOM_LOOP_SAME_NAME_THRESHOLD` (added during the 2026-05-21 debugging spike, never fired in any real run) plus 12 verbose `ctx.log.info` diagnostic markers from the same spike. Bundles workspace pane sync + status indicator overhaul + startup hung-row sweep that landed earlier in v1.12.1 work.
+
+## v1.12.0-codecontext — 2026-05-21
+
+Adds the `codecontext` sidecar (Go-based code-graph indexer at `codecontext:8080/v1/<tool_name>` over `boocode_net`) plus container guidance and skills runtime updates. Introduces the `chat_status` WS frame (`streaming | tool_running | waiting_for_input | idle | error`, widened from `working|idle|error`). Drops the deprecated `session_panes` table — workspace pane state moves to `sessions.workspace_panes jsonb` for cross-device sync via `PATCH /api/sessions/:id/workspace`.
+
+## v1.11.1-consolidation — 2026-05-21
+
+Rollup of v1.11.0–v1.11.10 work that was shipped piecemeal. Covers anchored rolling compaction (single `summary=true` row per chat that supersedes itself), doom-loop guard via `detectDoomLoop`, `path_guard` secret-filename deny list, web tools (`web_search` against SearXNG + `web_fetch` with SSRF/private-IP block), and the 5MB stream-cap on response bodies with abort-on-overflow.
+
+## v1.11.0-context-bar — 2026-05-20
+
+Persistent context-window tracker in `ChatPane` + `ctx_max` capture via `${LLAMA_SWAP_URL}/upstream/<model>/props`. First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet — 60s negative cache TTL recovers on next turn. Replaced an earlier dead read of `parsed.timings.n_ctx` which never carried n_ctx.
+
+## v1.10.1-booterm-user — 2026-05-19
+
+Per-user shell privilege drop in the booterm container via `gosu` in `tmux.conf` default-command. Shells launched in browser terminal panes drop privs to `samkintop` rather than running as root inside the container.
+
+## v1.10.0-booterm — 2026-05-18
+
+Second container (`apps/booterm`, port 9501, bookworm-slim+glibc). Fastify + node-pty + tmux. Browser terminal panes connect via WS to `/ws/term/sessions/:sid/panes/:pid`; per-session tmux session `bc-<sid>`, per-pane window `term-<pid>`. xterm-addon-webgl with `document.fonts.load(...)`-gated init (Canvas2D doesn't honor `font-display: block`) and iOS-friendly visibility-change context recreation.
+
+## v1.9.2-ask-user-input — 2026-05-18
+
+`ask_user_input` elicitation tool. Pauses the inference loop and surfaces a prompt to the user; their response routes back as the tool result. Correlation initially via `messages.tool_calls` / `tool_results` JSON columns (later ported to `message_parts` in `v1.13.0-ai-sdk-v6`).
+
+## v1.9.1-skills — 2026-05-18
+
+Skills runtime + `/skill` slash command with autocomplete. Server-side parser, tools, `/api/skills`, and mount. Hardens `.dockerignore` to exclude `secrets/` and `data/`. Drops the type-to-confirm gate on chat delete (plain Cancel/Confirm only — per workspace convention).
+
+## v1.9.0-themes-settings — 2026-05-17
+
+Settings pane + per-project defaults + bulk archive + themes lift. `themes-v1` (18 preset palettes) ships in the same batch with a Settings picker for live theme switching.
+
+## v1.8.2-cap-hit — 2026-05-17
+
+Tool-loop cap-hit summary — when an assistant exceeds the per-turn tool budget, a sentinel `role='system'` row with `metadata.kind='cap_hit'` is inserted and a summary turn runs to give the user a coherent endpoint. Also compacts the tool-call UI rendering.
+
+## v1.8.1-agents-global — 2026-05-16
+
+Global agents (`data/AGENTS.md` bind-mounted at `/data/AGENTS.md`) + parser robustness + WS reconnect toast. Per-project `AGENTS.md` mechanism (`getAgentsForProject`) remains for *other* projects; the BooCode repo itself uses global-only to eliminate two-files-must-stay-in-sync drift.
+
+## v1.8.0-agents — 2026-05-16
+
+Tier 2 agents — `AGENTS.md` registry + per-session agent picker. Also lands mobile tab switcher, branch indicator, and the `git_status` tool.
+
+## v1.7.0-drag-drop — 2026-05-16
+
+Drag-drop + paste-as-attachment for long text in the chat input.
+
+## v1.6.0-mobile — 2026-05-16
+
+Full mobile suite. Adds `useViewport` (matchMedia breakpoints mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, synthetic `contextmenu`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Mobile headers with safe-area padding, hamburger left, FolderTree right. Tap targets at `max-md:min-h-[44px] max-md:min-w-[44px]`. Raises `MAX_TOOL_LOOP_DEPTH` 5 → 15. Right-rail becomes a drawer on mobile.
+
+## v1.5.1-bootstrap — 2026-05-16
+
+Bootstrap fixes — git + ssh installed in the boocode container, Tailscale host rewrite, `/opt/projects` label correction for the create-new-project bootstrap flow.
+
+## v1.5.0-refactor-tests — 2026-05-16
+
+Refactor split (FileBrowserPane / Workspace / `runAssistantTurn`) + vitest harness + unit tests for security-critical pure functions. Scopes the `/opt` mount to `/opt/projects` (writable) plus `PROJECT_ROOT_WHITELIST=/opt` (read-only resolution for add-existing). Surfaces swallowed errors and removes dead `session_renamed` paths.
+
+## v1.4.0-fork-header — 2026-05-16
+
+Fork from message + delete message + header polish + general housekeeping.
+
+## v1.3.0-chats-projects — 2026-05-16
+
+Chats-in-sessions era. Adds force-send, `/compact`, right-rail file browser, archive/rename/Open-in-Gitea sidebar context menu, archived projects landing page, create-project bootstrap with Gitea remote setup, landing-card buttons, 1000px content cap. Dedup audit and chat archive/delete from the sidebar.
+
+## v1.2.0-multi-pane — 2026-05-15
+
+Multi-pane workspace (batch 3, T1–T8). `session_panes` schema (later replaced by `sessions.workspace_panes jsonb` in v1.12.0), `Pane` discriminated union, broker user channel + `/api/ws/user`, `file_ops` + `file_index` services, `PaneShell` / `ChatPane` / `FileBrowserPane` / `PaneTab` / `Workspace` components, `usePanes` hook, Shiki integration in `CodeBlock`. Up to 5 panes per session; default chat pane created on `POST /api/sessions`.
+
+## v1.1.0-markdown-sidebar — 2026-05-15
+
+Markdown rendering, message actions, tok/s + ctx display, AI session naming. Sidebar restructure — chats nested under projects (max 5 + view-all), live updates via WS.
+
+## v1.0.0-initial — 2026-05-14
+
+Initial commit. Skeleton of the monorepo: `apps/server` (Fastify + postgres), `apps/web` (React + Vite), basic chat loop against llama-swap.
--- a/apps/server/src/schema.sql
+++ b/apps/server/src/schema.sql
@@ -51,7 +51,7 @@ CREATE TABLE IF NOT EXISTS message_parts (
  kind text NOT NULL,
  payload jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT clock_timestamp(),
-  CONSTRAINT message_parts_kind_chk CHECK (kind IN ('text', 'tool_call', 'tool_result', 'reasoning', 'step_start')),
+  CONSTRAINT message_parts_kind_chk CHECK (kind IN ('text', 'tool_call', 'tool_result', 'reasoning', 'step_start', 'synthesis')),
  CONSTRAINT message_parts_seq_uniq UNIQUE (message_id, sequence)
 );
 CREATE INDEX IF NOT EXISTS message_parts_msg_seq_idx ON message_parts (message_id, sequence);
@@ -74,6 +74,23 @@ END $$;
 CREATE INDEX IF NOT EXISTS message_parts_hidden_idx
  ON message_parts (message_id) WHERE hidden_at IS NULL;

+-- v1.13.13: extend message_parts.kind to allow 'synthesis'. Existing DBs were
+-- created with the pre-v1.13.13 CHECK constraint that did NOT include
+-- 'synthesis'; drop + re-add the constraint with the extended enum. Fresh
+-- installs hit the inline constraint above (already updated) and skip this
+-- block via the pg_constraint guard.
+ALTER TABLE message_parts DROP CONSTRAINT IF EXISTS message_parts_kind_chk;
+DO $$
+BEGIN
+  IF NOT EXISTS (
+    SELECT 1 FROM pg_constraint WHERE conname = 'message_parts_kind_chk'
+  ) THEN
+    ALTER TABLE message_parts
+      ADD CONSTRAINT message_parts_kind_chk
+      CHECK (kind IN ('text', 'tool_call', 'tool_result', 'reasoning', 'step_start', 'synthesis'));
+  END IF;
+END $$;
+
 -- v1.13.1-B: read-path view. Read sites SELECT FROM messages_with_parts
 -- instead of messages so tool_calls / tool_results / reasoning_parts come
 -- from the granular message_parts table. The COALESCE means pre-v1.13.0
--- a/apps/server/src/services/tests/xml-parser.test.ts
+++ b/apps/server/src/services/tests/xml-parser.test.ts
@@ -0,0 +1,357 @@
+// v1.13.16: covers the Qwen/Hermes <tool_call> parser, the new Anthropic
+// <invoke> parser, the partial-opener detector for both flavors, the unified
+// extraction helper, and the unknown-tool error formatter that downstream
+// dispatch uses to give the model a recovery hint when it drifts to a
+// Claude Code tool name like read_file instead of BooCode's view_file.
+
+import { describe, expect, it } from 'vitest';
+import {
+  parseXmlToolCall,
+  parseInvokeToolCall,
+  partialXmlOpenerStart,
+  extractToolCallBlocks,
+  XML_TOOL_OPEN,
+  XML_TOOL_CLOSE,
+  INVOKE_TOOL_OPEN,
+  INVOKE_TOOL_CLOSE,
+} from '../inference/xml-parser.js';
+import {
+  levenshtein,
+  suggestToolName,
+  formatUnknownToolError,
+} from '../inference/tool-suggestions.js';
+
+describe('parseXmlToolCall (Qwen/Hermes <tool_call>)', () => {
+  it('parses a well-formed single-parameter call', () => {
+    const block = '<tool_call><function=view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
+    expect(parseXmlToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('parses multi-parameter call', () => {
+    const block = '<tool_call><function=grep><parameter=pattern>foo</parameter><parameter=path>src/</parameter></function></tool_call>';
+    expect(parseXmlToolCall(block)).toEqual({
+      name: 'grep',
+      args: { pattern: 'foo', path: 'src/' },
+    });
+  });
+
+  it('JSON-parses numeric parameter values', () => {
+    const block = '<tool_call><function=foo><parameter=count>42</parameter></function></tool_call>';
+    expect(parseXmlToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
+  });
+
+  it('tolerates whitespace around = in function (v1.13.16 tightening)', () => {
+    const block = '<tool_call><function = view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
+    expect(parseXmlToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('tolerates whitespace around = in parameter (v1.13.16 tightening)', () => {
+    const block = '<tool_call><function=view_file><parameter = path>/tmp/foo</parameter></function></tool_call>';
+    expect(parseXmlToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('returns null when function name is missing', () => {
+    const block = '<tool_call><parameter=path>/tmp/foo</parameter></tool_call>';
+    expect(parseXmlToolCall(block)).toBeNull();
+  });
+});
+
+describe('parseInvokeToolCall (Anthropic <invoke>) — v1.13.16', () => {
+  // Spec case 1
+  it('parses a well-formed single-parameter call (spec case 1)', () => {
+    const block = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  // Spec case 2
+  it('parses a multi-parameter call (spec case 2)', () => {
+    const block = '<invoke name="grep"><parameter name="pattern">foo</parameter><parameter name="path">src/</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'grep',
+      args: { pattern: 'foo', path: 'src/' },
+    });
+  });
+
+  // Spec case 3
+  it('tolerates newlines and spaces in attributes (spec case 3)', () => {
+    const block = `<invoke
+      name="view_file"
+    >
+      <parameter
+        name="path"
+      >/tmp/foo</parameter>
+    </invoke>`;
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  // Spec case 4 (parser portion — the not-found enrichment is tested below)
+  it('parses a call whose name is not a registered BooCode tool (spec case 4)', () => {
+    const block = '<invoke name="read_file"><parameter name="path">/tmp/foo</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'read_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('supports single-quoted attribute values', () => {
+    const block = "<invoke name='view_file'><parameter name='path'>/tmp/foo</parameter></invoke>";
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('JSON-parses numeric parameter values', () => {
+    const block = '<invoke name="foo"><parameter name="count">42</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
+  });
+
+  it('tolerates spaces around = inside name attribute', () => {
+    const block = '<invoke name = "view_file"><parameter name = "path">/tmp/foo</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toEqual({
+      name: 'view_file',
+      args: { path: '/tmp/foo' },
+    });
+  });
+
+  it('returns null when name attribute is missing', () => {
+    const block = '<invoke><parameter name="path">/tmp/foo</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toBeNull();
+  });
+
+  it('returns null when name attribute is empty', () => {
+    const block = '<invoke name=""><parameter name="path">/tmp/foo</parameter></invoke>';
+    expect(parseInvokeToolCall(block)).toBeNull();
+  });
+
+  it('exports the expected delimiters', () => {
+    expect(INVOKE_TOOL_OPEN).toBe('<invoke');
+    expect(INVOKE_TOOL_CLOSE).toBe('</invoke>');
+    expect(XML_TOOL_OPEN).toBe('<tool_call>');
+    expect(XML_TOOL_CLOSE).toBe('</tool_call>');
+  });
+});
+
+describe('partialXmlOpenerStart (v1.13.16 — both flavors)', () => {
+  it('returns -1 when the buffer is empty', () => {
+    expect(partialXmlOpenerStart('')).toBe(-1);
+  });
+
+  it('returns -1 when the buffer has no openers', () => {
+    expect(partialXmlOpenerStart('plain prose, no markup')).toBe(-1);
+  });
+
+  it('returns the index of a complete <tool_call> opener (existing)', () => {
+    expect(partialXmlOpenerStart('prose <tool_call>more')).toBe(6);
+  });
+
+  it('returns the index of a complete <invoke opener (v1.13.16)', () => {
+    expect(partialXmlOpenerStart('prose <invoke name=')).toBe(6);
+  });
+
+  it('holds a partial <tool_ prefix at end of buffer', () => {
+    expect(partialXmlOpenerStart('text <tool_')).toBe(5);
+  });
+
+  it('holds a partial <invo prefix at end of buffer (v1.13.16)', () => {
+    expect(partialXmlOpenerStart('text <invo')).toBe(5);
+  });
+
+  it('holds a bare < at end of buffer', () => {
+    expect(partialXmlOpenerStart('text <')).toBe(5);
+  });
+
+  it('returns -1 when < is followed by non-opener text', () => {
+    expect(partialXmlOpenerStart('text <unknown>')).toBe(-1);
+  });
+
+  it('returns the earliest opener when both flavors are present', () => {
+    expect(partialXmlOpenerStart('xxx <tool_call>YYY <invoke>')).toBe(4);
+    expect(partialXmlOpenerStart('xxx <invoke>YYY <tool_call>')).toBe(4);
+  });
+});
+
+describe('extractToolCallBlocks (v1.13.16 — unified extraction)', () => {
+  // Spec case 1 (extraction-level)
+  it('extracts a single <invoke> block (spec case 1)', () => {
+    const input = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
+    expect(result.flushed).toBe('');
+    expect(result.remaining).toBe('');
+  });
+
+  // Spec case 5: opener arrives in one chunk, closer in the next.
+  it('holds the partial <invoke> chunk when the closer has not arrived (spec case 5, first chunk)', () => {
+    const firstChunk = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter>';
+    const result = extractToolCallBlocks(firstChunk);
+    expect(result.calls).toEqual([]);
+    expect(result.flushed).toBe('');
+    expect(result.remaining).toBe(firstChunk);
+  });
+
+  it('extracts the block once the closer arrives in a later chunk (spec case 5, completion)', () => {
+    const firstChunk = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter>';
+    const r1 = extractToolCallBlocks(firstChunk);
+    const combined = r1.remaining + '</invoke>';
+    const r2 = extractToolCallBlocks(combined);
+    expect(r2.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
+    expect(r2.flushed).toBe('');
+    expect(r2.remaining).toBe('');
+  });
+
+  // Spec case 6: prose interleaving
+  it('flushes prose around a recognized block but not the markup itself (spec case 6)', () => {
+    const input = 'I will read the file.\n<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>\nThanks.';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
+    expect(result.flushed).toBe('I will read the file.\n\nThanks.');
+    expect(result.remaining).toBe('');
+  });
+
+  // Spec case 7 regression
+  it('extracts a <tool_call> Qwen block alongside the new code path (spec case 7 regression)', () => {
+    const input = '<tool_call><function=view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]);
+    expect(result.flushed).toBe('');
+    expect(result.remaining).toBe('');
+  });
+
+  it('extracts mixed-format blocks in source order (hand-back: shared counter)', () => {
+    const input =
+      '<invoke name="view_file"><parameter name="path">/a</parameter></invoke>' +
+      ' middle ' +
+      '<tool_call><function=grep><parameter=pattern>foo</parameter></function></tool_call>';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([
+      { name: 'view_file', args: { path: '/a' } },
+      { name: 'grep', args: { pattern: 'foo' } },
+    ]);
+    expect(result.flushed).toBe(' middle ');
+    expect(result.remaining).toBe('');
+  });
+
+  it('drops a malformed <invoke> block silently (matches existing <tool_call> behavior)', () => {
+    const input = 'prose <invoke><parameter name="path">/a</parameter></invoke> trailing';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([]);
+    expect(result.flushed).toBe('prose  trailing');
+    expect(result.remaining).toBe('');
+  });
+
+  it('holds a tail with a fresh partial opener after extracting earlier complete blocks', () => {
+    const input = '<invoke name="view_file"><parameter name="path">/a</parameter></invoke> next: <tool_';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/a' } }]);
+    expect(result.flushed).toBe(' next: ');
+    expect(result.remaining).toBe('<tool_');
+  });
+
+  it('passes plain prose straight through when no markup is present', () => {
+    const input = 'just some text with a < character but no opener';
+    const result = extractToolCallBlocks(input);
+    expect(result.calls).toEqual([]);
+    expect(result.flushed).toBe(input);
+    expect(result.remaining).toBe('');
+  });
+});
+
+describe('levenshtein', () => {
+  it('returns 0 for identical strings', () => {
+    expect(levenshtein('view_file', 'view_file')).toBe(0);
+  });
+
+  it('returns the length when one string is empty', () => {
+    expect(levenshtein('', 'view_file')).toBe(9);
+    expect(levenshtein('view_file', '')).toBe(9);
+  });
+
+  it('computes a small distance for a single-character substitution', () => {
+    expect(levenshtein('cat', 'bat')).toBe(1);
+  });
+
+  it('computes a known case: read_file → view_file is 4', () => {
+    // r→v, e→i, a→e, d→w → 4 substitutions, same length
+    expect(levenshtein('read_file', 'view_file')).toBe(4);
+  });
+});
+
+describe('suggestToolName (v1.13.16)', () => {
+  const tools = [
+    'view_file',
+    'list_dir',
+    'grep',
+    'find_files',
+    'view_truncated_output',
+    'ask_user_input',
+    'web_search',
+  ];
+
+  it('suggests the closest match when distance is small', () => {
+    expect(suggestToolName('view_files', tools)).toBe('view_file');
+  });
+
+  it('suggests via substring match when distance alone would miss', () => {
+    // 'file' is a substring of multiple tools; closest by distance wins.
+    expect(suggestToolName('file', tools)).toBe('view_file');
+  });
+
+  it('returns null when nothing is close', () => {
+    expect(suggestToolName('xxxx_yyyy_zzzz', tools)).toBeNull();
+  });
+
+  it('is case-insensitive in the distance check', () => {
+    expect(suggestToolName('VIEW_FILE', tools)).toBe('view_file');
+  });
+});
+
+describe('formatUnknownToolError (v1.13.16)', () => {
+  const tools = ['view_file', 'list_dir', 'grep', 'find_files'];
+
+  it('includes the wrong name and the available tools list', () => {
+    const msg = formatUnknownToolError('read_file', tools);
+    expect(msg).toContain("Tool 'read_file' not found");
+    expect(msg).toContain('Available tools:');
+    expect(msg).toContain('view_file');
+    expect(msg).toContain('find_files');
+  });
+
+  it('includes a suggestion when the drifted name is within threshold', () => {
+    // distance(view_files, view_file) = 1 (one extra char)
+    const msg = formatUnknownToolError('view_files', tools);
+    expect(msg).toContain('Did you mean: view_file?');
+  });
+
+  it('omits the suggestion clause when no tool is close enough', () => {
+    const msg = formatUnknownToolError('zzzzzzz', tools);
+    expect(msg).toContain("Tool 'zzzzzzz' not found");
+    expect(msg).toContain('Available tools:');
+    expect(msg).not.toContain('Did you mean');
+  });
+
+  // The drift incident in the recon (chat 30d8…1be7167, msg 7ff558f4) had the
+  // model emit <invoke name="read_file">. lev(read_file, view_file) = 4, so
+  // the spec's threshold (<=3) doesn't suggest view_file — the model still
+  // gets the available-tools list to pick from. This pins that behavior so a
+  // future loosening of the threshold is a deliberate choice.
+  it('does not suggest view_file for the read_file drift case (distance is 4, over threshold)', () => {
+    const msg = formatUnknownToolError('read_file', tools);
+    expect(msg).not.toContain('Did you mean');
+  });
+});
--- a/apps/server/src/services/inference/parts.ts
+++ b/apps/server/src/services/inference/parts.ts
@@ -7,7 +7,17 @@ import type { ToolCall, ToolResult } from '../../types/api.js';
 // JSON columns; the swap to parts-as-source-of-truth happens in a later
 // v1.13 dispatch alongside the AI SDK streamText migration.

-export type PartKind = 'text' | 'tool_call' | 'tool_result' | 'reasoning' | 'step_start';
+// v1.13.13: 'synthesis' added. Schema CHECK constraint is updated in lockstep
+// (schema.sql adds 'synthesis' to message_parts_kind_chk on startup). The
+// dispatch's claim that no schema migration was needed assumed kind was a
+// bare text column — it isn't; the constraint enumerates allowed values.
+export type PartKind =
+  | 'text'
+  | 'tool_call'
+  | 'tool_result'
+  | 'reasoning'
+  | 'step_start'
+  | 'synthesis';

 export interface PartInsert {
  message_id: string;
--- a/apps/server/src/services/inference/stream-phase.ts
+++ b/apps/server/src/services/inference/stream-phase.ts
@@ -6,12 +6,9 @@ import type {
 import * as modelContext from '../model-context.js';
 import { toolJsonSchemas, type ToolJsonSchema } from '../tools.js';
 import type { OpenAiMessage } from './payload.js';
-import {
-  XML_TOOL_CLOSE,
-  XML_TOOL_OPEN,
-  parseXmlToolCall,
-  partialXmlOpenerStart,
-} from './xml-parser.js';
+// v1.13.16: extractToolCallBlocks replaces the inline opener-search loop and
+// recognizes both Qwen <tool_call> and Anthropic <invoke> markup in one pass.
+import { extractToolCallBlocks } from './xml-parser.js';
 import { DB_FLUSH_INTERVAL_MS, type StreamPhaseState } from './types.js';
 import type {
  InferenceContext,
@@ -132,16 +129,24 @@ function buildAiTools(schemas: ToolJsonSchema[]): Record<string, ReturnType<type
 // v1.10.5 Qwen-coder XML fallback. Some local models (notably qwen3-coder via
 // llama-swap) emit tool calls as inline XML inside delta.content rather than
 // the structured tool_calls field. We extract them out of the streamed text
-// before flushing it to the client, mirroring the pre-AI-SDK behavior.
+// before flushing it to the client.
 //
-// XML shape:
+// Qwen shape:
 //   <tool_call>
 //   <function=NAME>
 //   <parameter=KEY>VALUE</parameter>
 //   ...
 //   </function>
 //   </tool_call>
-// Multiple <tool_call> blocks may appear back-to-back; they never nest.
+//
+// v1.13.16: also recognize Anthropic <invoke> markup that qwen3.6-35b-a3b-mxfp4
+// drifts to (training-data residue from Claude Code documentation):
+//   <invoke name="NAME">
+//   <parameter name="KEY">VALUE</parameter>
+//   </invoke>
+// Both formats share the synthetic xml_call_${idx} ID space; the counter
+// increments across whichever opener appears first. Multiple blocks may
+// appear back-to-back in either format and they never nest.
 export async function streamCompletion(
  ctx: InferenceContext,
  model: string,
@@ -209,47 +214,24 @@ export async function streamCompletion(
    switch (part.type) {
      case 'text-delta': {
        pendingBuffer += part.text;
-        // Extract any complete <tool_call>...</tool_call> blocks before
-        // flushing visible text.
-        while (true) {
-          const startIdx = pendingBuffer.indexOf(XML_TOOL_OPEN);
-          if (startIdx === -1) break;
-          const closeIdx = pendingBuffer.indexOf(XML_TOOL_CLOSE, startIdx);
-          if (closeIdx === -1) break;
-          const blockEnd = closeIdx + XML_TOOL_CLOSE.length;
-          const block = pendingBuffer.slice(startIdx, blockEnd);
-          if (startIdx > 0) {
-            const before = pendingBuffer.slice(0, startIdx);
-            content += before;
-            onDelta(before);
-          }
-          const parsedCall = parseXmlToolCall(block);
-          if (parsedCall) {
-            const synthIdx = toolCalls.length;
-            toolCalls.push({
-              id: `xml_call_${synthIdx}`,
-              name: parsedCall.name,
-              args: parsedCall.args,
-            });
-          }
-          // Parse failures still drop the block — leaking <tool_call> XML to
-          // the chat would look worse than silently swallowing the bad block.
-          pendingBuffer = pendingBuffer.slice(blockEnd);
+        // v1.13.16: unified extraction. The helper finds the earliest-opening
+        // complete <tool_call> or <invoke> block, flushes prose between/around
+        // them, holds any partial opener for the next chunk, and silently
+        // drops blocks that fail to parse (matches pre-v1.13.16 behavior).
+        const extracted = extractToolCallBlocks(pendingBuffer);
+        if (extracted.flushed.length > 0) {
+          content += extracted.flushed;
+          onDelta(extracted.flushed);
        }
-        // Hold back any (partial or full) unclosed opener; flush the rest.
-        const partialIdx = partialXmlOpenerStart(pendingBuffer);
-        if (partialIdx >= 0) {
-          if (partialIdx > 0) {
-            const flush = pendingBuffer.slice(0, partialIdx);
-            content += flush;
-            onDelta(flush);
-          }
-          pendingBuffer = pendingBuffer.slice(partialIdx);
-        } else if (pendingBuffer.length > 0) {
-          content += pendingBuffer;
-          onDelta(pendingBuffer);
-          pendingBuffer = '';
+        for (const call of extracted.calls) {
+          const synthIdx = toolCalls.length;
+          toolCalls.push({
+            id: `xml_call_${synthIdx}`,
+            name: call.name,
+            args: call.args,
+          });
        }
+        pendingBuffer = extracted.remaining;
        break;
      }
      case 'tool-call': {
--- a/apps/server/src/services/inference/tool-phase.ts
+++ b/apps/server/src/services/inference/tool-phase.ts
@@ -4,6 +4,12 @@ import { PathScopeError } from '../path_guard.js';
 import { TOOLS_BY_NAME } from '../tools.js';
 import { maybeFlagForCompaction } from './payload.js';
 import { insertParts, partsFromAssistantMessage, partsFromToolMessage } from './parts.js';
+// v1.13.16: richer unknown-tool error so the model can self-correct when it
+// drifts to a Claude Code tool name (e.g. read_file → suggest view_file).
+// Applies to all unknown tool names, not just <invoke>-derived ones — at the
+// dispatch layer we no longer know which format produced the call, and the
+// extra signal is harmless for Qwen-derived calls.
+import { formatUnknownToolError } from './tool-suggestions.js';
 import type {
  InferenceContext,
  StreamResult,
@@ -14,6 +20,11 @@ import type {
 // the reference is read at call time (inside an async function body), not
 // at module top-level. Node + tsc resolve this cleanly.
 import { runAssistantTurn } from './turn.js';
+// v1.13.13: synthesis pipeline — replaces the immediate recursive turn when
+// any of this batch's tool calls is in SYNTHESIS_TOOLS. Falls through to
+// recursion on synthesis failure (timeout / model error). See module header
+// in synthesisPipeline.ts for the auto-fetch + token-budget rules.
+import { SYNTHESIS_TOOLS, runSynthesisPass } from '../synthesisPipeline.js';

 async function executeToolCall(
  projectRoot: string,
@@ -21,7 +32,11 @@ async function executeToolCall(
 ): Promise<{ output: unknown; truncated: boolean; error?: string }> {
  const tool = TOOLS_BY_NAME[toolCall.name];
  if (!tool) {
-    return { output: null, truncated: false, error: `unknown tool: ${toolCall.name}` };
+    return {
+      output: null,
+      truncated: false,
+      error: formatUnknownToolError(toolCall.name, Object.keys(TOOLS_BY_NAME)),
+    };
  }
  const parsed = tool.inputSchema.safeParse(toolCall.args);
  if (!parsed.success) {
@@ -155,6 +170,12 @@ export async function executeToolPhase(
  // batches still execute the other tools normally.
  ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'tool_running', at: new Date().toISOString() });
  let pausingForUserInput = false;
+  // v1.13.13: capture synth-tool result text so the synthesis pipeline below
+  // doesn't have to re-fetch from DB. Array (not single) because a batch
+  // could theoretically include multiple synthesis tools — we take the first
+  // for the synthesis input. Race-free under Promise.all because each
+  // callback pushes its own captured value.
+  const synthEntries: Array<{ tc: ToolCall; output: unknown; error?: string }> = [];
  await Promise.all(
    toolCalls.map(async (tc) => {
      const [toolRow] = await ctx.sql<{ id: string }[]>`
@@ -186,6 +207,9 @@ export async function executeToolPhase(
        return;
      }
      const tres = await executeToolCall(projectRoot, tc);
+      if (SYNTHESIS_TOOLS.has(tc.name)) {
+        synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) });
+      }
      const stored = {
        tool_call_id: tc.id,
        output: tres.output,
@@ -233,6 +257,41 @@ export async function executeToolPhase(
    return;
  }

+  // v1.13.13: synthesis-pipeline branch. When any of this batch's tool calls
+  // is a codecontext overview/analysis tool that produced a non-error result,
+  // run a forced second-inference synthesis pass with auto-fetched files +
+  // project docs instead of the normal recursive runAssistantTurn. Falls
+  // through to the recursive call on synthesis failure (timeout, model
+  // error). User-abort re-throws so the outer handler runs.
+  const synthEntry = synthEntries.find((e) => !e.error && e.output != null);
+  if (synthEntry) {
+    // codecontext wrappers return { result: string, truncated: boolean, ... }.
+    // Defensive: stringify the output if it isn't the expected shape so the
+    // synthesis still has something to chew on rather than crashing on
+    // missing `.result`.
+    const out = synthEntry.output as { result?: unknown; truncated?: boolean; outputPath?: string };
+    const toolResultText =
+      typeof out?.result === 'string'
+        ? out.result
+        : JSON.stringify(synthEntry.output);
+    // v1.13.15-b: forward the wrapper's truncation flag + opaque tmpfs id so
+    // synthesisPipeline can re-read the full content for reference extraction.
+    const ran = await runSynthesisPass({
+      ctx,
+      args,
+      session,
+      projectRoot,
+      toolName: synthEntry.tc.name,
+      toolResultText,
+      ...(typeof out?.truncated === 'boolean' ? { truncated: out.truncated } : {}),
+      ...(typeof out?.outputPath === 'string' ? { outputPath: out.outputPath } : {}),
+    });
+    if (ran) return;
+    // ran === false → synthesis failed (timeout / model error) → fall through
+    // to the standard recursive turn below. The synth message (if created)
+    // was already marked status='failed' inside runSynthesisPass.
+  }
+
  const [nextAssistant] = await ctx.sql<{ id: string }[]>`
    INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
    VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
--- a/apps/server/src/services/inference/tool-suggestions.ts
+++ b/apps/server/src/services/inference/tool-suggestions.ts
@@ -0,0 +1,63 @@
+// v1.13.16: Levenshtein + suggestion + formatter for the unknown-tool error
+// returned to the model when an XML-extracted tool call references a name
+// that isn't in TOOLS_BY_NAME. The drift incident this targets: qwen3.6
+// emitting <invoke name="read_file"> from its Claude Code training residue
+// when BooCode's actual file-read tool is view_file. Hand-rolled distance
+// function — no new dep.
+
+export function levenshtein(a: string, b: string): number {
+  if (a.length === 0) return b.length;
+  if (b.length === 0) return a.length;
+  const dp: number[][] = Array.from(
+    { length: a.length + 1 },
+    () => new Array<number>(b.length + 1).fill(0),
+  );
+  for (let i = 0; i <= a.length; i++) dp[i]![0] = i;
+  for (let j = 0; j <= b.length; j++) dp[0]![j] = j;
+  for (let i = 1; i <= a.length; i++) {
+    for (let j = 1; j <= b.length; j++) {
+      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
+      dp[i]![j] = Math.min(
+        dp[i - 1]![j]! + 1,
+        dp[i]![j - 1]! + 1,
+        dp[i - 1]![j - 1]! + cost,
+      );
+    }
+  }
+  return dp[a.length]![b.length]!;
+}
+
+// Threshold per the v1.13.16 dispatch: distance <= 3 OR substring match
+// (either direction). Ties broken by smallest distance, then alphabetical.
+export function suggestToolName(
+  name: string,
+  available: readonly string[],
+): string | null {
+  const lower = name.toLowerCase();
+  let best: { name: string; dist: number } | null = null;
+  for (const tool of available) {
+    const tlower = tool.toLowerCase();
+    const dist = levenshtein(lower, tlower);
+    const isSubstr = tlower.includes(lower) || lower.includes(tlower);
+    if (dist > 3 && !isSubstr) continue;
+    if (
+      best === null ||
+      dist < best.dist ||
+      (dist === best.dist && tool.localeCompare(best.name) < 0)
+    ) {
+      best = { name: tool, dist };
+    }
+  }
+  return best?.name ?? null;
+}
+
+export function formatUnknownToolError(
+  name: string,
+  available: readonly string[],
+): string {
+  const sorted = [...available].sort();
+  const suggestion = suggestToolName(name, sorted);
+  const list = sorted.join(', ');
+  const tail = suggestion ? ` Did you mean: ${suggestion}?` : '';
+  return `Tool '${name}' not found. Available tools: [${list}].${tail}`;
+}
--- a/apps/server/src/services/inference/xml-parser.ts
+++ b/apps/server/src/services/inference/xml-parser.ts
@@ -1,23 +1,42 @@
 // v1.10.5: XML-tag tool-call fallback. Some models emit
 // <tool_call><function=foo><parameter=key>value</parameter></function></tool_call>
 // in plain content instead of using the OpenAI tool_calls JSON channel.
-// The streaming loop in inference.ts extracts these blocks via these helpers.
+// The streaming loop in stream-phase.ts extracts these blocks via these helpers.
+//
+// v1.13.16: also recognize Anthropic <invoke name="..."><parameter name="...">
+// markup. qwen3.6-35b-a3b-mxfp4 drifts to this format when prompted as an
+// "Architect"-style agent because Claude Code documentation in its
+// pre-training data uses this shape. Both formats route through the same
+// synthetic ToolCall path with shared xml_call_${idx} IDs; downstream
+// dispatch handles unknown tool names with a richer error (see
+// tool-suggestions.ts + tool-phase.ts).

 export const XML_TOOL_OPEN = '<tool_call>';
 export const XML_TOOL_CLOSE = '</tool_call>';

-export function parseXmlToolCall(
-  block: string,
-): { name: string; args: Record<string, unknown> } | null {
-  const nameMatch = block.match(/<function=([^>]+)>/);
+// v1.13.16: Anthropic <invoke> opener is matched by prefix (not the full
+// `<invoke ...>` tag) because attributes follow. Closer is the literal tag.
+export const INVOKE_TOOL_OPEN = '<invoke';
+export const INVOKE_TOOL_CLOSE = '</invoke>';
+
+export interface ParsedCall {
+  name: string;
+  args: Record<string, unknown>;
+}
+
+// v1.10.5: Qwen-flavor parser. Tightened in v1.13.16 to tolerate whitespace
+// around `=` (e.g. `<function = view_file>`). Name capture is non-whitespace,
+// non-`>` so a stray space doesn't get absorbed into the function name.
+const QWEN_FUNCTION_RE = /<function\s*=\s*([^>\s]+)\s*>/;
+const QWEN_PARAM_RE = /<parameter\s*=\s*([^>\s]+)\s*>([\s\S]*?)<\/parameter>/g;
+
+export function parseXmlToolCall(block: string): ParsedCall | null {
+  const nameMatch = block.match(QWEN_FUNCTION_RE);
  if (!nameMatch || !nameMatch[1]) return null;
  const name = nameMatch[1].trim();
  if (!name) return null;
  const args: Record<string, unknown> = {};
-  // Non-greedy body so each <parameter=…>…</parameter> pair is matched
-  // independently even when multiple appear in the same block.
-  const paramRe = /<parameter=([^>]+)>([\s\S]*?)<\/parameter>/g;
-  for (const m of block.matchAll(paramRe)) {
+  for (const m of block.matchAll(QWEN_PARAM_RE)) {
    const key = (m[1] ?? '').trim();
    if (!key) continue;
    const raw = (m[2] ?? '').trim();
@@ -30,24 +49,121 @@ export function parseXmlToolCall(
  return { name, args };
 }

+// v1.13.16: Anthropic-flavor parser. Same JSON-parse-with-string-fallback
+// shape as parseXmlToolCall so the dispatch layer doesn't need to care which
+// flavor produced the call.
+const INVOKE_NAME_RE =
+  /<invoke\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>/;
+const INVOKE_PARAM_RE =
+  /<parameter\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>([\s\S]*?)<\/parameter>/g;
+
+export function parseInvokeToolCall(block: string): ParsedCall | null {
+  const nameMatch = block.match(INVOKE_NAME_RE);
+  if (!nameMatch) return null;
+  const name = (nameMatch[2] ?? nameMatch[3] ?? '').trim();
+  if (!name) return null;
+  const args: Record<string, unknown> = {};
+  for (const m of block.matchAll(INVOKE_PARAM_RE)) {
+    const key = ((m[2] ?? m[3] ?? '') as string).trim();
+    if (!key) continue;
+    const raw = (m[4] ?? '').trim();
+    try {
+      args[key] = JSON.parse(raw);
+    } catch {
+      args[key] = raw;
+    }
+  }
+  return { name, args };
+}
+
 // Locate the first character that begins (or completely contains) an
-// unfinished <tool_call> opener in `s`. Returns -1 when `s` can be flushed
-// to the client in full without risking a partial tag leak.
-//   Case 1: a full `<tool_call>` opener with no matching closer — caller
-//           must keep everything from that index forward until the next
-//           chunk arrives with the closer.
-//   Case 2: `s` ends with a strict prefix of `<tool_call>` (e.g. `<tool_c`).
-//           Caller must keep just that suffix in the buffer.
+// unfinished opener (either flavor) in `s`. Returns -1 when `s` can be
+// flushed to the client in full without risking a partial tag leak.
+//   Case 1: a full opener (`<tool_call>` or `<invoke`) with no matching
+//           closer — caller must keep everything from that index forward
+//           until the next chunk arrives with the closer.
+//   Case 2: `s` ends with a strict prefix of either opener (e.g. `<tool_c`
+//           or `<invo`). Caller must keep just that suffix in the buffer.
 // Note: case 1 assumes the calling loop already extracted every complete
-// <tool_call>…</tool_call> pair before reaching this check.
+// block before reaching this check.
+const ALL_OPENERS = [XML_TOOL_OPEN, INVOKE_TOOL_OPEN] as const;
+
 export function partialXmlOpenerStart(s: string): number {
-  const fullOpener = s.indexOf(XML_TOOL_OPEN);
-  if (fullOpener !== -1) return fullOpener;
+  let earliest = -1;
+  for (const op of ALL_OPENERS) {
+    const idx = s.indexOf(op);
+    if (idx === -1) continue;
+    if (earliest === -1 || idx < earliest) earliest = idx;
+  }
+  if (earliest !== -1) return earliest;
  const lastLt = s.lastIndexOf('<');
  if (lastLt === -1) return -1;
  const suffix = s.slice(lastLt);
-  if (XML_TOOL_OPEN.startsWith(suffix) && suffix.length < XML_TOOL_OPEN.length) {
-    return lastLt;
+  for (const op of ALL_OPENERS) {
+    if (op.startsWith(suffix) && suffix.length < op.length) return lastLt;
  }
  return -1;
 }
+
+// v1.13.16: unified extraction. Replaces the inline loop that used to live
+// in stream-phase.ts. Pure function — returns the visible text to flush,
+// the parsed tool-call payloads in source order, and the buffer remainder
+// to retain for the next streaming chunk. Parse failures are silently
+// dropped (matches the pre-v1.13.16 behavior — leaking partial XML to the
+// chat looks worse than swallowing a bad block).
+export interface ToolCallExtraction {
+  flushed: string;
+  calls: ParsedCall[];
+  remaining: string;
+}
+
+interface OpenerSpec {
+  open: string;
+  close: string;
+  parse: (block: string) => ParsedCall | null;
+}
+
+const OPENER_SPECS: ReadonlyArray<OpenerSpec> = [
+  { open: XML_TOOL_OPEN, close: XML_TOOL_CLOSE, parse: parseXmlToolCall },
+  { open: INVOKE_TOOL_OPEN, close: INVOKE_TOOL_CLOSE, parse: parseInvokeToolCall },
+];
+
+export function extractToolCallBlocks(buffer: string): ToolCallExtraction {
+  let flushed = '';
+  const calls: ParsedCall[] = [];
+  let pos = 0;
+
+  while (pos < buffer.length) {
+    let next: { spec: OpenerSpec; openIdx: number; closeIdx: number } | null = null;
+    for (const spec of OPENER_SPECS) {
+      const openIdx = buffer.indexOf(spec.open, pos);
+      if (openIdx === -1) continue;
+      const closeIdx = buffer.indexOf(spec.close, openIdx);
+      if (closeIdx === -1) continue;
+      if (next === null || openIdx < next.openIdx) {
+        next = { spec, openIdx, closeIdx };
+      }
+    }
+    if (next === null) break;
+
+    if (next.openIdx > pos) {
+      flushed += buffer.slice(pos, next.openIdx);
+    }
+    const blockEnd = next.closeIdx + next.spec.close.length;
+    const block = buffer.slice(next.openIdx, blockEnd);
+    const parsed = next.spec.parse(block);
+    if (parsed) calls.push(parsed);
+    pos = blockEnd;
+  }
+
+  const tail = buffer.slice(pos);
+  const partialIdx = partialXmlOpenerStart(tail);
+  if (partialIdx === -1) {
+    flushed += tail;
+    return { flushed, calls, remaining: '' };
+  }
+  if (partialIdx > 0) {
+    flushed += tail.slice(0, partialIdx);
+  }
+  return { flushed, calls, remaining: tail.slice(partialIdx) };
+}
--- a/apps/server/src/services/synthesisPipeline.ts
+++ b/apps/server/src/services/synthesisPipeline.ts
@@ -0,0 +1,493 @@
+// v1.13.13: forced second-inference synthesis pass for codecontext
+// overview/analysis tools. Triggered from tool-phase.ts after a codecontext
+// tool call lands and BEFORE the normal recursive runAssistantTurn fires.
+//
+// Inputs to the synthesis stream:
+//   1. The codecontext tool's result text.
+//   2. Top-N source files referenced in that text, fetched via view_file.
+//   3. Project documentation auto-fetched from the repo root.
+//   4. The original user message that triggered the turn.
+//
+// Output: a NEW assistant message whose sole part is kind='synthesis'.
+// Streams to the client as deltas exactly like a normal assistant turn.
+//
+// Failure modes (all fall through to recursive runAssistantTurn):
+//   - SYNTHESIS_TOOLS membership check fails -> return false immediately.
+//   - File-fetch / doc-fetch errors -> silent skip, continue with what we have.
+//   - Stream error / timeout -> mark synth message status='failed', return false.
+//   - User-abort -> mark cancelled and re-throw so the outer abort handler runs.
+
+import { promises as fs } from 'node:fs';
+import { join } from 'node:path';
+
+import { TOOLS_BY_NAME } from './tools.js';
+import { streamCompletion } from './inference/stream-phase.js';
+import { SYNTHESIS_SYSTEM_PROMPT } from './synthesisPrompt.js';
+import { insertParts } from './inference/parts.js';
+import * as modelContext from './model-context.js';
+import { readTruncation } from './truncate.js';
+
+import type { Session } from '../types/api.js';
+import type { OpenAiMessage } from './inference/payload.js';
+import type { InferenceContext, TurnArgs } from './inference/turn.js';
+
+export const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
+  'get_codebase_overview',
+  'get_framework_analysis',
+  'get_semantic_neighborhoods',
+]);
+
+const TOP_N_FILES = 5;
+const FILE_LINE_CAP = 200;
+const DOC_LINE_CAP = 500;
+// Token budget for the auto-fetched content (files + docs combined). Estimated
+// via chars/4 — a rough but stable proxy that doesn't require a tokenizer dep.
+const TOKEN_BUDGET = 32_000;
+const CHARS_PER_TOKEN = 4;
+// 90s per synthesis call. Long enough for a thoughtful overview against a
+// large auto-fetched payload; short enough that a hung upstream falls through
+// to the normal recursive turn within a typical user attention window.
+const SYNTH_TIMEOUT_MS = 90_000;
+
+// File-extension regex for referenced-file extraction. Limited to source-
+// language extensions so we don't pull in lockfiles, images, etc.
+const FILE_PATH_RE =
+  /(?:^|[`'"<\s\(\[])([A-Za-z0-9_./@-]+\.(?:ts|tsx|js|jsx|py|go|rs|java|kt|c|cpp|h|hpp|md|json|yaml|yml|sql|sh|html|css))(?=[`'"<\)\]\s,;:]|$)/gm;
+
+export interface SynthesisParams {
+  ctx: InferenceContext;
+  args: TurnArgs;
+  session: Session;
+  projectRoot: string;
+  toolName: string;
+  toolResultText: string;
+  // v1.13.15-b: when codecontext's wrapper hit its 32k inline-truncation
+  // limit, we expand the full content via readTruncation for reference-file
+  // extraction only. toolResultText (the truncated head) still ships to the
+  // synth model — preserves the 32k payload-budget contract.
+  truncated?: boolean;
+  // opaque id (tr_<…>), not a filesystem path — see truncate.ts naming note
+  outputPath?: string;
+}
+
+interface FetchedFile {
+  path: string;
+  content: string;
+}
+
+interface DocsCollection {
+  boochat?: string;
+  agents?: string;
+  context?: string;
+  roadmap?: string;
+}
+
+export async function runSynthesisPass(p: SynthesisParams): Promise<boolean> {
+  if (!SYNTHESIS_TOOLS.has(p.toolName)) return false;
+
+  let synthMessageId: string | null = null;
+  let accumulated = '';
+  let timedOut = false;
+  const synthCtrl = new AbortController();
+  const timer = setTimeout(() => {
+    timedOut = true;
+    synthCtrl.abort();
+  }, SYNTH_TIMEOUT_MS);
+
+  try {
+    const userMessage = await fetchOriginalUserMessage(p.ctx, p.args.chatId);
+    if (!userMessage) {
+      p.ctx.log.warn({ chatId: p.args.chatId }, 'synthesis: no user message found; falling through');
+      return false;
+    }
+
+    // v1.13.15-b: when the tool result was inline-truncated by the wrapper
+    // (32k cap, see codecontext_client.ts:114), expand the full content from
+    // tmpfs for reference-file extraction. The synth payload still ships the
+    // truncated head (see buildPayload call below) so the token-budget
+    // contract holds. Graceful degradation: if readTruncation returns null
+    // (missing id, ENOENT) or throws, fall back to the truncated head.
+    let extractionSource = p.toolResultText;
+    if (p.truncated && p.outputPath) {
+      try {
+        const full = await readTruncation(p.outputPath);
+        if (full !== null) {
+          extractionSource = full;
+          p.ctx.log.info(
+            {
+              chatId: p.args.chatId,
+              toolName: p.toolName,
+              originalChars: p.toolResultText.length,
+              fullChars: full.length,
+            },
+            'synthesis: expanded truncated tool output',
+          );
+        }
+      } catch (err) {
+        p.ctx.log.warn(
+          { chatId: p.args.chatId, toolName: p.toolName, err: String(err) },
+          'synthesis: readTruncation failed, using truncated output',
+        );
+      }
+    }
+
+    const refFiles = extractReferencedFiles(extractionSource);
+    const files = await fetchTopFiles(refFiles, p.projectRoot);
+    const docs = await fetchProjectDocs(p.projectRoot);
+    const { files: budgetedFiles, docs: budgetedDocs } = applyTokenBudget(files, docs);
+    const synthMessages = buildPayload(
+      p.toolName,
+      // Truncated head only — full content was used for reference extraction above
+      p.toolResultText,
+      budgetedFiles,
+      budgetedDocs,
+      userMessage,
+    );
+
+    // Insert + announce the synthesis assistant message. From here on, any
+    // exception must clean up via the catch block so the row doesn't linger
+    // in 'streaming' status (the 5min stale-streaming sweeper catches it
+    // eventually, but explicit cleanup is better).
+    const [synthRow] = await p.ctx.sql<
+      { id: string; started_at: string }[]
+    >`
+      INSERT INTO messages (session_id, chat_id, role, content, status, started_at, created_at)
+      VALUES (${p.args.sessionId}, ${p.args.chatId}, 'assistant', '', 'streaming', clock_timestamp(), clock_timestamp())
+      RETURNING id, started_at
+    `;
+    synthMessageId = synthRow!.id;
+    const startedAt = synthRow!.started_at;
+
+    p.ctx.publish(p.args.sessionId, {
+      type: 'message_started',
+      message_id: synthMessageId,
+      chat_id: p.args.chatId,
+      role: 'assistant',
+    });
+
+    // Combine the user-abort signal with our synthesis-specific timeout so
+    // either fires correctly. The `timedOut` flag in scope tells us which one
+    // tripped after streamCompletion throws.
+    const combinedSignal: AbortSignal | undefined = p.args.signal
+      ? AbortSignal.any([p.args.signal, synthCtrl.signal])
+      : synthCtrl.signal;
+
+    const onDelta = (delta: string): void => {
+      accumulated += delta;
+      p.ctx.publish(p.args.sessionId, {
+        type: 'delta',
+        message_id: synthMessageId!,
+        chat_id: p.args.chatId,
+        content: delta,
+      });
+    };
+
+    const streamResult = await streamCompletion(
+      p.ctx,
+      p.session.model,
+      synthMessages,
+      { tools: null },
+      onDelta,
+      undefined,
+      combinedSignal,
+    );
+
+    const mctx = await modelContext.getModelContext(p.session.model);
+    const nCtx = mctx?.n_ctx ?? null;
+    const [updated] = await p.ctx.sql<
+      {
+        tokens_used: number | null;
+        ctx_used: number | null;
+        ctx_max: number | null;
+        finished_at: string | null;
+      }[]
+    >`
+      UPDATE messages
+      SET content = ${streamResult.content},
+          status = 'complete',
+          tokens_used = ${streamResult.completionTokens},
+          ctx_used = ${streamResult.promptTokens},
+          ctx_max = ${nCtx},
+          finished_at = clock_timestamp()
+      WHERE id = ${synthMessageId}
+      RETURNING tokens_used, ctx_used, ctx_max, finished_at
+    `;
+    await insertParts(p.ctx.sql, [
+      {
+        message_id: synthMessageId,
+        sequence: 0,
+        kind: 'synthesis',
+        payload: { text: streamResult.content },
+      },
+    ]);
+    p.ctx.publish(p.args.sessionId, {
+      type: 'message_complete',
+      message_id: synthMessageId,
+      chat_id: p.args.chatId,
+      tokens_used: updated?.tokens_used ?? null,
+      ctx_used: updated?.ctx_used ?? null,
+      ctx_max: updated?.ctx_max ?? null,
+      started_at: startedAt,
+      finished_at: updated?.finished_at ?? null,
+      model: p.session.model,
+    });
+    p.ctx.publishUser({
+      type: 'chat_status',
+      chat_id: p.args.chatId,
+      status: 'idle',
+      at: new Date().toISOString(),
+    });
+    p.ctx.log.info(
+      {
+        chatId: p.args.chatId,
+        synthMessageId,
+        toolName: p.toolName,
+        chars: streamResult.content.length,
+        files: budgetedFiles.length,
+      },
+      'synthesis pass complete',
+    );
+    return true;
+  } catch (err) {
+    await markSynthFailed(p, synthMessageId, accumulated).catch((cleanupErr) => {
+      p.ctx.log.warn({ cleanupErr: String(cleanupErr) }, 'synthesis cleanup UPDATE failed');
+    });
+    if (err instanceof Error && err.name === 'AbortError') {
+      if (timedOut) {
+        p.ctx.log.warn(
+          { toolName: p.toolName, chatId: p.args.chatId },
+          'synthesis pass timed out; falling through to recursive turn',
+        );
+        return false;
+      }
+      // User-initiated abort: propagate so the outer error handler marks the
+      // parent turn cancelled. The synth message is already marked failed by
+      // markSynthFailed above.
+      throw err;
+    }
+    p.ctx.log.warn(
+      { err: String(err), toolName: p.toolName, chatId: p.args.chatId },
+      'synthesis pass failed; falling through to recursive turn',
+    );
+    return false;
+  } finally {
+    clearTimeout(timer);
+  }
+}
+
+async function markSynthFailed(
+  p: SynthesisParams,
+  synthMessageId: string | null,
+  accumulated: string,
+): Promise<void> {
+  if (synthMessageId === null) return;
+  await p.ctx.sql`
+    UPDATE messages
+    SET content = ${accumulated},
+        status = 'failed',
+        finished_at = clock_timestamp()
+    WHERE id = ${synthMessageId}
+  `;
+  // Republish so the frontend's live state flips from 'streaming' to
+  // terminal. message_complete carries no error reason — the row's status
+  // column is the truth. The 5-state chat_status dot has 'error' but we
+  // don't fire that here because the broader inference is about to retry
+  // via recursion; flipping the user-channel status to 'error' would race
+  // the recursive turn's 'streaming' announcement.
+  p.ctx.publish(p.args.sessionId, {
+    type: 'message_complete',
+    message_id: synthMessageId,
+    chat_id: p.args.chatId,
+    model: p.session.model,
+  });
+}
+
+async function fetchOriginalUserMessage(
+  ctx: InferenceContext,
+  chatId: string,
+): Promise<string | null> {
+  const rows = await ctx.sql<{ content: string }[]>`
+    SELECT content FROM messages
+    WHERE chat_id = ${chatId} AND role = 'user'
+    ORDER BY created_at DESC
+    LIMIT 1
+  `;
+  return rows[0]?.content ?? null;
+}
+
+function extractReferencedFiles(text: string): string[] {
+  const seen = new Set<string>();
+  const order: string[] = [];
+  let m: RegExpExecArray | null;
+  while ((m = FILE_PATH_RE.exec(text)) !== null) {
+    const candidate = m[1]!;
+    if (seen.has(candidate)) continue;
+    if (
+      candidate.includes('node_modules') ||
+      candidate.includes('/dist/') ||
+      candidate.includes('/test/') ||
+      candidate.includes('/tests/') ||
+      /\.(test|spec)\.[a-z]+$/.test(candidate)
+    ) {
+      continue;
+    }
+    seen.add(candidate);
+    order.push(candidate);
+  }
+  return order;
+}
+
+async function fetchTopFiles(refs: string[], projectRoot: string): Promise<FetchedFile[]> {
+  const tool = TOOLS_BY_NAME['view_file'];
+  if (!tool) return [];
+  const out: FetchedFile[] = [];
+  for (const p of refs.slice(0, TOP_N_FILES)) {
+    const absPath = p.startsWith('/') ? p : join(projectRoot, p);
+    try {
+      const r = await tool.execute({ path: absPath, end_line: FILE_LINE_CAP }, projectRoot);
+      const content = (r as { content?: string }).content ?? '';
+      if (content) out.push({ path: p, content });
+    } catch {
+      // path-scope blocked, secret-filtered, file too large, or missing —
+      // skip silently. The remaining files (or none) still produce a
+      // meaningful synthesis input.
+    }
+  }
+  return out;
+}
+
+async function fetchProjectDocs(projectRoot: string): Promise<DocsCollection> {
+  const tool = TOOLS_BY_NAME['view_file'];
+  if (!tool) return {};
+  const docs: DocsCollection = {};
+  for (const [filename, key] of [
+    ['BOOCHAT.md', 'boochat'],
+    ['AGENTS.md', 'agents'],
+    ['CONTEXT.md', 'context'],
+  ] as const) {
+    try {
+      const r = await tool.execute(
+        { path: join(projectRoot, filename), end_line: DOC_LINE_CAP },
+        projectRoot,
+      );
+      const content = (r as { content?: string }).content;
+      if (content) docs[key] = content;
+    } catch {
+      // missing doc — skip
+    }
+  }
+  // Case-insensitive *roadmap*.md glob. Picks the first match (alphabetical
+  // by readdir() order); typical projects have at most one roadmap doc.
+  try {
+    const entries = await fs.readdir(projectRoot);
+    const roadmap = entries.find(
+      (e) => /roadmap/i.test(e) && e.toLowerCase().endsWith('.md'),
+    );
+    if (roadmap) {
+      const r = await tool.execute(
+        { path: join(projectRoot, roadmap), end_line: DOC_LINE_CAP },
+        projectRoot,
+      );
+      const content = (r as { content?: string }).content;
+      if (content) docs.roadmap = content;
+    }
+  } catch {
+    // unreadable project root — skip
+  }
+  return docs;
+}
+
+function estTokens(s: string | undefined): number {
+  return s ? Math.ceil(s.length / CHARS_PER_TOKEN) : 0;
+}
+
+function applyTokenBudget(
+  files: FetchedFile[],
+  docs: DocsCollection,
+): { files: FetchedFile[]; docs: DocsCollection } {
+  let total = 0;
+  for (const f of files) total += estTokens(f.content);
+  total += estTokens(docs.boochat) + estTokens(docs.agents) + estTokens(docs.context) + estTokens(docs.roadmap);
+  if (total <= TOKEN_BUDGET) return { files, docs };
+
+  // Drop priority (lowest priority dropped first):
+  //   1. top-2..N files (keep top-1)
+  //   2. top-1 file
+  //   3. roadmap (+ CONTEXT.md grouped here — dispatch listed roadmap above
+  //      AGENTS.md, CONTEXT.md was not in the priority list)
+  //   4. AGENTS.md
+  //   5. BOOCHAT.md (never dropped — truncate to budget if alone exceeds)
+  let outFiles = files.slice();
+  const outDocs: DocsCollection = { ...docs };
+
+  while (total > TOKEN_BUDGET && outFiles.length > 1) {
+    const last = outFiles.pop()!;
+    total -= estTokens(last.content);
+  }
+  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
+
+  if (outFiles[0]) {
+    total -= estTokens(outFiles[0].content);
+    outFiles = [];
+  }
+  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
+
+  if (outDocs.roadmap) {
+    total -= estTokens(outDocs.roadmap);
+    delete outDocs.roadmap;
+  }
+  if (outDocs.context) {
+    total -= estTokens(outDocs.context);
+    delete outDocs.context;
+  }
+  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
+
+  if (outDocs.agents) {
+    total -= estTokens(outDocs.agents);
+    delete outDocs.agents;
+  }
+  if (total <= TOKEN_BUDGET) return { files: outFiles, docs: outDocs };
+
+  if (outDocs.boochat) {
+    const maxChars = TOKEN_BUDGET * CHARS_PER_TOKEN;
+    if (outDocs.boochat.length > maxChars) {
+      outDocs.boochat = outDocs.boochat.slice(0, maxChars);
+    }
+  }
+  return { files: outFiles, docs: outDocs };
+}
+
+function buildPayload(
+  toolName: string,
+  toolResultText: string,
+  files: FetchedFile[],
+  docs: DocsCollection,
+  userMessage: string,
+): OpenAiMessage[] {
+  const sections: string[] = [];
+  sections.push(`## Codecontext tool output (${toolName})\n\n${toolResultText}`);
+  if (files.length > 0) {
+    sections.push(`---\n\n## Auto-fetched source files`);
+    for (const f of files) {
+      sections.push(`### ${f.path}\n\n\`\`\`\n${f.content}\n\`\`\``);
+    }
+  }
+  const docEntries: Array<[string, string | undefined]> = [
+    ['BOOCHAT.md', docs.boochat],
+    ['AGENTS.md', docs.agents],
+    ['CONTEXT.md', docs.context],
+    ['roadmap', docs.roadmap],
+  ];
+  const presentDocs = docEntries.filter(([, v]) => Boolean(v));
+  if (presentDocs.length > 0) {
+    sections.push(`---\n\n## Project documentation`);
+    for (const [name, v] of presentDocs) {
+      sections.push(`### ${name}\n\n${v!}`);
+    }
+  }
+  sections.push(`---\n\n## Original user question\n\n${userMessage}`);
+  return [
+    { role: 'system', content: SYNTHESIS_SYSTEM_PROMPT },
+    { role: 'user', content: sections.join('\n\n') },
+  ];
+}
--- a/apps/server/src/services/synthesisPrompt.ts
+++ b/apps/server/src/services/synthesisPrompt.ts
@@ -0,0 +1,20 @@
+// v1.13.13: synthesis pipeline system prompt. Verbatim from the v1.13.13
+// dispatch — do not paraphrase. The synthesis pass loads this as its sole
+// system message, followed by a user message that concatenates the
+// codecontext tool result, auto-fetched top files, auto-fetched project
+// docs, and the original user message.
+export const SYNTHESIS_SYSTEM_PROMPT = `You are synthesizing structural data into an accurate, detailed answer about the user's codebase.
+
+Inputs you have been given:
+1. The output of a codecontext analysis tool (raw structural data — file counts, symbols, dependencies, frameworks).
+2. The contents of the top files referenced in that output.
+3. Any project documentation found in the repo root (BOOCHAT.md, AGENTS.md, roadmap docs, CONTEXT.md).
+
+Rules:
+- Cite specific files and line numbers when making claims about code.
+- If project docs contradict the code, docs win for questions about state, version, status, or roadmap. Code wins for questions about runtime behavior or implementation.
+- If the codecontext output looks sparse (low symbol count for a TypeScript project, missing dependency edges, empty framework list), explicitly say so — codecontext falls back to the JavaScript grammar for TypeScript and loses interfaces, generics, decorators, and type aliases.
+- Do not invent symbols, files, or relationships that are not present in the inputs.
+- Do not respond with a generic "this looks like a [framework] project" summary. The user has the framework analysis already. Add specifics: what is actually in this codebase, what is shipped, what is planned, what is load-bearing.
+- Length: match the depth the user asked for. Overview questions get structured multi-section answers. Specific questions get focused answers.
+`;
--- a/boocode_roadmap.md
+++ b/boocode_roadmap.md
@@ -72,6 +72,30 @@ External code lifted from / referenced in: see `boocode_code_review.md` for full

 -----

+### Shipped (v1.13.x — written 2026-05-22, retagged same day)
+
+All v1.13.x batches were retagged to the `vMAJOR.MINOR.PATCH-slug` scheme on 2026-05-22. `CHANGELOG.md` is the canonical per-tag record (slug describes what shipped; tag name alone recalls the batch). Tip is `v1.13.14-skills-audit` (`0fa46cd`); the next batch is `v1.13.15-codecontext-synth` (this batch, tag pending). Tags in chronological order:
+
+- `v1.13.0-ai-sdk-v6` — AI SDK v6 migration; `streamCompletion` adapter; `messages_with_parts` view; reasoning_parts end-to-end
+- `v1.13.1-cleanup-bundle` — `statement_timeout='30s'`, alpha-sorted tool registry, 60s stuck-row sweeper, `experimental_repairToolCall` pass-through
+- `v1.13.2-compaction-prune` — two-tier prune; `message_parts.hidden_at` column + partial index; `messages_with_parts` view CASE refinement
+- `v1.13.3-truncate` — opencode `truncate.ts` port; opaque `tr_<…>` id, `view_truncated_output(id)` tool, tmpfs storage
+- `v1.13.4-reasoning-fix` — `<reasoning>` prose-prefix in compaction head-assembly for tool-bearing turns
+- `v1.13.5-stability-bundle` — `includeUsage: true` on provider, `hasText` trim guard, `BUDGET_NO_AGENT` 15→30, trailing-empty-assistant filter
+- `v1.13.6-prefix-stability` — `buildSystemPromptWithFingerprint` SHA-256 + per-session drift observer
+- `v1.13.7-compaction-trigger` — overflow trigger lowered to `floor(0.85 × ctx_max)`
+- `v1.13.8-tool-cost` — `tool_cost_stats` SQL view + per-tool rolling 100-call mean in AgentPicker
+- `v1.13.9-agentlint` — instruction-file AgentLint pass; identity-openers removed; `CLAUDE.local.md` to .gitignore
+- `v1.13.10-openspec` — `openspec/changes/<slug>/{proposal,tasks,design}.md` shape; archived batch docs preserved via `git mv`
+- `v1.13.11-tools` — tiered tool loading via `BOOCODE_TOOLS` env (`core | standard | all`)
+- `v1.13.12-ws-schemas` — Zod schemas for all 27 wire-format frames; `publishFrame` / `publishUserFrame` wrappers; parity test
+- `v1.13.13-ws-publish` — all ~80 publish sites converted to the typed wrappers; every WS frame now Zod-validated at boundary
+- `v1.13.14-skills-audit` — 26 skills vendored + audited via 5 parallel agent teams; 14 kept, 11 dropped, 1 migrated to BOOCHAT.md/BOOCODER.md
+- `v1.13.15-codecontext-synth` — forced second-inference synthesis pass for codecontext overview tools (truncation-aware extraction; auto-fetched top-N files + project docs; 32k payload-budget contract preserved)
+- `v1.13.16-xml-parser` — Anthropic `<invoke>` parser support + Levenshtein-based unknown-tool recovery hints (qwen3.6 drift to Claude Code-style tool names like `read_file`); xml-parser test coverage
+
+The remaining strangler-fig final step (drop `messages.tool_calls` + `tool_results` columns) is still pending under its old `v1.13.2` working name; will get a new tag slug when scoped.
+
 ## In flight / next (v1.13.x cleanup line)

 Five more single-dispatch batches before the strangler-fig closes. Each ships independently with its own smoke and rollback surface. **Do not fold.** Order is locked:
@@ -462,17 +486,23 @@ term.indifferentketchup.com         → booterm   :9501   (or routed under code.
 - **v1.11.7:** none (pathGuard logic, no DB)
 - **v1.12.0:** none (codecontext stateless; truncation in-memory id-map with TTL cleanup)
 - **v1.12.1:** `sessions.workspace_panes jsonb` (workspace sync); drop deprecated `session_panes` table; drop stale `messages_status_check` constraint
- **v1.13.0:** `message_parts (id, message_id, sequence, kind, payload jsonb, created_at)` + unique `(message_id, sequence)` + `kind` CHECK; `ToolDef.category` field (TS type, not DB)
- **v1.13.1-B:** `messages_with_parts` view with COALESCE fallbacks
- **v1.13.3:** `ALTER DATABASE boocode SET statement_timeout = '30s'` (op step, documented in schema.sql; doesn't survive volume reset)
- **v1.13.4:** `message_parts.hidden_at TIMESTAMPTZ` column + partial index `(message_id) WHERE hidden_at IS NULL`; `messages_with_parts` view filters hidden parts
- **v1.13.5:** none (tmpfs id-map stored on disk under `BOOCODE_TRUNCATION_DIR`; no schema)
- **v1.13.6:** none (compaction read-side change; `CompactionMessage` extended in TS, not DB)
- **v1.13.7:** none (provider config + 4 frontend/payload guards + budget constant, no schema change)
- **v1.13.8 (planned):** none — verify-and-measure batch, instrumentation only; drops the originally-planned `system_prompt_cache` table since recon proved input-layer mtime caches already achieve prefix stability
- **v1.13.9 (planned):** none (compaction overflow trigger is a constant change in `services/compaction.ts`, no DB)
- **v1.13.10 (planned):** `tool_cost_stats (tool_name, prompt_tokens_sum, completion_tokens_sum, n_calls, updated_at)` — rolling 100-call window
- **v1.13.2 (planned):** drop `messages.tool_calls`, `messages.tool_results`; simplify `messages_with_parts` view
+- **v1.13.0-ai-sdk-v6:** `message_parts (id, message_id, sequence, kind, payload jsonb, created_at)` + unique `(message_id, sequence)` + `kind` CHECK; `messages_with_parts` view with COALESCE fallbacks; `ToolDef.category` field (TS type, not DB)
+- **v1.13.1-cleanup-bundle:** `ALTER DATABASE boocode SET statement_timeout = '30s'` (op step, documented in schema.sql; doesn't survive volume reset)
+- **v1.13.2-compaction-prune:** `message_parts.hidden_at TIMESTAMPTZ` column + partial index `(message_id) WHERE hidden_at IS NULL`; `messages_with_parts` view filters hidden parts
+- **v1.13.3-truncate:** none (tmpfs id-map stored on disk under `BOOCODE_TRUNCATION_DIR`; no schema)
+- **v1.13.4-reasoning-fix:** none (compaction read-side change; `CompactionMessage` extended in TS, not DB)
+- **v1.13.5-stability-bundle:** none (provider config + 4 frontend/payload guards + budget constant, no schema change)
+- **v1.13.6-prefix-stability:** none — verify-and-measure batch, instrumentation only; drops the originally-planned `system_prompt_cache` table since recon proved input-layer mtime caches already achieve prefix stability
+- **v1.13.7-compaction-trigger:** none (compaction overflow trigger is a constant change in `services/compaction.ts`, no DB)
+- **v1.13.8-tool-cost:** `tool_cost_stats` SQL view over `messages_with_parts` (no new table — view + LATERAL `jsonb_array_elements` on `tool_calls`); rolling 100-call window
+- **v1.13.9-agentlint:** none (instruction-file audit + `.gitignore` add of `CLAUDE.local.md`, no DB)
+- **v1.13.10-openspec:** none (docs reorganization, `git mv` only)
+- **v1.13.11-tools:** none (env-var tier filter at request time, no DB)
+- **v1.13.12-ws-schemas:** none (Zod schemas + wrappers in TS, no DB)
+- **v1.13.13-ws-publish:** none (publish-site conversion + protocol-drift fix in `compaction.ts`, no DB)
+- **v1.13.14-skills-audit:** none (skills + AGENTS.md migration into git via `.gitignore` negation patterns; no DB)
+- **v1.13.15-codecontext-synth (this batch, tag pending):** `message_parts.kind` CHECK constraint extended with `'synthesis'` value (DROP + DO $$ pg_constraint idempotency-guarded re-add)
+- **(column drop, pending — old working name v1.13.2):** drop `messages.tool_calls`, `messages.tool_results`; simplify `messages_with_parts` view
 - **v1.14:** `agents.steps` column (or AGENTS.md parser extension; no DB if file-only)
 - **v1.14.x-mcp (NEW):** none — single-server MCP-client PoC is config-only at first, no schema change
 - **v1.14.x-html (NEW):** `message_parts.kind` CHECK constraint extended with `'html_artifact'` value
@@ -582,7 +612,7 @@ Earlier May 18 chat recommended Option A (thin orchestration shell over OpenCode

 ### v1.13.x cleanup line locked (2026-05-22)

-After v1.13.1-C shipped clean, the cleanup order is **v1.13.3 ✅ → v1.13.4 ✅ → v1.13.5 ✅ → v1.13.6 ✅ → v1.13.7 ✅ → v1.13.8 (verify) → v1.13.9 (overflow) → v1.13.10 → v1.13.11 → v1.13.12 → v1.13.2** (column drop last as rollback insurance). **Do not fold.** Smoke isolation matters: each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches.
+After the 2026-05-22 retag, the v1.13.x cleanup line in `vMAJOR.MINOR.PATCH-slug` form is **v1.13.0-ai-sdk-v6 ✅ → v1.13.1-cleanup-bundle ✅ → v1.13.2-compaction-prune ✅ → v1.13.3-truncate ✅ → v1.13.4-reasoning-fix ✅ → v1.13.5-stability-bundle ✅ → v1.13.6-prefix-stability ✅ → v1.13.7-compaction-trigger ✅ → v1.13.8-tool-cost ✅ → v1.13.9-agentlint ✅ → v1.13.10-openspec ✅ → v1.13.11-tools ✅ → v1.13.12-ws-schemas ✅ → v1.13.13-ws-publish ✅ → v1.13.14-skills-audit ✅ → v1.13.15-codecontext-synth ✅ → v1.13.16-xml-parser ✅ → column drop (final, pending — old working name v1.13.2)**. **Do not fold.** Smoke isolation matters: each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches.

 ### v1.13 retrospective (what shipped)

--- a/openspec/changes/v1.13.15-codecontext-synth/proposal.md
+++ b/openspec/changes/v1.13.15-codecontext-synth/proposal.md
@@ -0,0 +1,145 @@
+# v1.13.13 — codecontext synthesis pipeline
+
+Slots between v1.13.12 (skills audit) and v1.14 (Phase C outer agent loop). Adds a forced second-inference synthesis pass for codecontext overview/analysis tools so the model stops returning shallow first-touch summaries.
+
+Does NOT change the recursion structure, depth cap, or budget — those are v1.14 concerns. The cap-50 patch from v1.13.12 stays; v1.14 supersedes it via per-agent `agent.steps`.
+
+## What ships
+
+- `apps/server/src/services/synthesisPrompt.ts` (NEW, 20 lines) — verbatim system prompt as a const.
+- `apps/server/src/services/synthesisPipeline.ts` (NEW, ~450 lines) — `SYNTHESIS_TOOLS` set + `runSynthesisPass(params) → Promise<boolean>`. Auto-fetches top-N referenced files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md), applies a 32k-token budget with priority drop order, streams a synthesis turn via `streamCompletion`, dual-writes a `kind='synthesis'` part.
+- `apps/server/src/services/inference/parts.ts` — `PartKind` union extended with `'synthesis'`.
+- `apps/server/src/services/inference/tool-phase.ts` — synth-tool result capture during `Promise.all`; post-pause synth check before the recursive `runAssistantTurn`.
+- `apps/server/src/schema.sql` — inline CHECK constraint updated + `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` migration block. Idempotent (drops + re-adds on every startup; per-boot cost is trivial).
+
+SYNTHESIS_TOOLS = `{get_codebase_overview, get_framework_analysis, get_semantic_neighborhoods}`. The other 5 codecontext tools (search_symbols, get_dependencies, get_file_analysis, get_symbol_info, watch_changes) return targeted data the model uses directly — no synthesis pass.
+
+## Decisions
+
+### Schema migration was required (dispatch was wrong)
+
+The original dispatch said "kind is text column, no schema migration needed." Reality: `schema.sql:54` has an explicit `message_parts_kind_chk` CHECK constraint enumerating allowed kinds (`'text', 'tool_call', 'tool_result', 'reasoning', 'step_start'`). Adding `'synthesis'` requires updating the constraint.
+
+Resolution: added a `DROP CONSTRAINT IF EXISTS` + `DO $$ ... pg_constraint` idempotency-guarded migration block in `schema.sql` matching the CLAUDE.md migration pattern, plus updated the inline CREATE TABLE constraint so fresh installs include the new value.
+
+### `view_file` input shape uses `start_line`/`end_line`, not `line_count`
+
+The dispatch's auto-fetch sketch implied a `line_count` parameter. The real `viewFile` tool's input schema (`tools.ts:51-55`) takes `start_line`/`end_line` (1-indexed inclusive) with a 200-line default if both are omitted. The pipeline uses `end_line: FILE_LINE_CAP` for files (200) and `end_line: DOC_LINE_CAP` for docs (500), which gives the first N lines — same effective truncation.
+
+### User-abort during synthesis marks the synth message failed (deviates from review req)
+
+**Decision: option A — mark synth message `status='failed'` on every catch path including user-abort, then re-throw on user-abort.**
+
+Sam's stated review requirement: "User-abort path does NOT mark the message failed (re-throw to outer handler is correct)."
+
+Why this deviation: the outer abort handler (`error-handler.ts:handleAbortOrError`) operates on `args.assistantMessageId` — the *parent* assistant message that triggered the tool call. It does not know about the *new* synth assistant message that `runSynthesisPass` created. If the synth row isn't explicitly marked failed on user-abort, it sits in `status='streaming'` until the 5-min stale-streaming sweeper (`apps/server/src/index.ts`) picks it up — meanwhile the frontend's 60s no-token-activity timer trips the stale-stream banner on the orphan. Same UX bug class the v1.13.3 stuck-row sweeper was added to handle.
+
+Cost: one extra DB write + one `message_complete` republish on the rare user-abort-during-synth path. Worth it to avoid the zombie message + ghost banner.
+
+**Note for v1.14 outer-loop port**: when Phase C migrates the depth cap into `agent.steps` and reworks the recursion, the synth message is a sibling to the parent assistant message — both belong to the same chat. The new outer loop should either (a) preserve this pattern (mark all chat-scoped streaming messages failed on abort) or (b) extend `handleAbortOrError` to sweep chat-scoped streaming rows. Option (b) is a wider blast radius and was rejected here; option (a) is one targeted call site.
+
+### Token budget priority list
+
+Drop order when the 32k cap is exceeded (lowest priority first):
+1. top-2..N files (keep top-1)
+2. top-1 file
+3. `*roadmap*.md` + `CONTEXT.md` (mid-priority — both describe state/intent)
+4. `AGENTS.md`
+5. `BOOCHAT.md` — **never dropped**; truncated to 32k if it alone exceeds
+
+CONTEXT.md wasn't in the original dispatch's priority list; grouped with roadmap as mid-priority (same semantic — both are state/intent docs).
+
+### 90s timeout via `AbortSignal.any`
+
+Synthesis call has its own `AbortController` with a 90s `setTimeout`. Combined with `p.args.signal` (the user-abort signal) via `AbortSignal.any([user, synth])` — either fires correctly. Node 20.3+. A `timedOut` flag in scope disambiguates which signal tripped after `streamCompletion` throws (`AbortError`): timeout → return false (fall through to recursion); user-abort → re-throw (after `markSynthFailed`).
+
+### Race-safe synth-tool capture under `Promise.all`
+
+`synthEntries: Array<{tc, output, error?}>` populated by each parallel callback pushing its own result. After `Promise.all` resolves, `synthEntries.find((e) => !e.error && e.output != null)` picks the first non-error synth entry by call-order (i.e. by `toolCalls` array index in the original LLM emit order). Not result-quality scoring — explicitly call-order, documented inline.
+
+### Known interaction: qwen3.6 `include_stats: "True"` retry loop compounds synth-pass cost
+
+Smoke #1 surfaced a pre-existing qwen3.6 quirk: the model emits `"True"` (string) instead of `true` (bool) for boolean tool args. The `experimental_repairToolCall` + zod-reject retry path (v1.13.3) handles this — the model retries on the next turn with corrected args, then succeeds.
+
+**Synth pass cost interaction:** when the first tool-call fails zod validation, the recursive runAssistantTurn fires *before* the successful synth-tool call lands. The user effectively pays: (1) failed tool-call turn → (2) error tool-result → (3) retry tool-call turn → (4) successful tool-result → (5) synth pass.
+
+Per-fire token cost for an overview question now: ~5 inference calls (turns 1, 3, 5 are model calls; 5 is the synth pass adding ~5k tokens of auto-fetched context). Not a blocker — the synth content is dramatically better than the without-synth case (4920 tokens of cited analysis vs. a 70-token tool-call-only turn). Worth tracking if usage stats start showing it.
+
+### v1.14 outer-loop port — preserve this pattern
+
+Two patterns from this batch the Phase C outer-loop port must preserve:
+
+1. **Chat-scoped abort cleanup**: the synth message is a sibling to the parent assistant message, both belong to the same chat. The new outer loop should either (a) keep `markSynthFailed` (or its equivalent) firing on every catch path including user-abort, or (b) extend `handleAbortOrError` to sweep all chat-scoped streaming rows. This batch chose (a); (b) was rejected as wider blast radius.
+2. **Race-safe `Promise.all` capture**: `synthEntries: Array<...>` instead of a single shared variable. Per-callback push avoids the last-write-wins race when a batch has multiple synth tools.
+
+## Test plan
+
+6-prompt smoke + 1 failure-injection. Sequence:
+
+1. **Default agent** — "What's in this codebase?" → expect `get_codebase_overview` + synthesis pass, response cites BOOCHAT.md + actual files + roadmap state.
+2. **Architect agent** — "Give me a system overview of how BooCode handles tool calls" → expect synthesis with refs to inference/turn.ts, tool-phase.ts, stream-phase.ts.
+3. **Architect agent** — "What's the current state of v1.13?" → synthesis must read `boocode_roadmap.md` and report shipped vs planned correctly. Must NOT infer "v1.13.2 shipped" from code presence — roadmap explicitly defers it.
+4. **Code Reviewer** — "Find all callers of buildSystemPrompt" → `search_symbols` fires, NO synthesis pass (not in SYNTHESIS_TOOLS).
+5. **Debugger** — "Where is detectDoomLoop defined and called from?" → `search_symbols` + `get_dependencies`, NO synthesis pass.
+6. **Failure injection** — temporarily make `streamCompletion` throw inside `runSynthesisPass`; verify fall-through to recursion + log entry visible + non-empty answer.
+
+## Backups in place
+
+```
+apps/server/src/schema.sql.bak-v1.13.13-20260522
+apps/server/src/services/inference/parts.ts.bak-v1.13.13-20260522
+apps/server/src/services/inference/tool-phase.ts.bak-v1.13.13-20260522
+```
+
+To be deleted after merge.
+
+## Smoke results
+
+### Smoke #1 — default agent, "What is in this codebase?"
+
+Synthesis fired on `get_codebase_overview`. Log line:
+```
+{"chatId":"7bb05e54-…","synthMessageId":"44480541-…","toolName":"get_codebase_overview","chars":6727,"files":5,"msg":"synthesis pass complete"}
+```
+
+Token accounting: synth turn = 4920 tokens (vs. 63 + 70 on the preceding tool-call-only turns). Model is using the auto-fetched context, not parroting codecontext output. Synth message has the expected `kind='synthesis'` part dual-write.
+
+Side note: qwen3.6 needed one retry due to the `include_stats: "True"` quirk (see Decisions). `repairToolCall` handled it; synth fired on the successful call.
+
+### Smoke #6 — fault injection
+
+Env-gated throw inserted between the synth-message INSERT and the `streamCompletion` call. Container rebuilt with `V1_13_13_FAULT_INJECT=1`. Sent the same prompt to a new smoke chat.
+
+All 6 expected outcomes confirmed:
+
+| # | Outcome | Evidence |
+|---|---|---|
+| 1 | `runSynthesisPass` throws | log: `err: "Error: v1.13.13 smoke #6 fault injection"` |
+| 2 | Synth message marked `status='failed'` with empty content | msg `7ac9c685-…` role=assistant status=failed content_len=0 |
+| 3 | `message_complete` frame published for the synth message | implicit via `markSynthFailed`; frontend never tripped the 60s timer |
+| 4 | Fall-through to recursive `runAssistantTurn` | log: `synthesis pass failed; falling through to recursive turn` |
+| 5 | User sees normal (non-synthesized) assistant response | final msg `924076a3-…` 453 tokens: `"This is **boocode** — a self-hosted, single-user developer chat app."` |
+| 6 | Stale-stream banner does NOT fire on failed synth | confirmed — terminal `status='failed'` is what `applyFrame` writes |
+
+Fault injection reverted post-test:
+- `grep FAULT_INJECT apps/server/src/services/synthesisPipeline.ts docker-compose.yml` → empty
+- `grep FAULT_INJECT apps/server/dist/services/synthesisPipeline.js` → empty
+- `docker compose exec boocode printenv V1_13_13_FAULT_INJECT` → exit 1 (unset)
+- Boot log clean, `skills loaded: 14`
+
+### Smokes #2–#5
+
+Sam is doing the qualitative reads from the UI in parallel — those verifications are about synthesis content quality (cites correct files, reads roadmap accurately, no-synthesis on `search_symbols`).
+
+## Done when
+
+- ✅ `synthesisPrompt.ts` + `synthesisPipeline.ts` created
+- ✅ `parts.ts` PartKind union extended
+- ✅ `tool-phase.ts` insertion point edited
+- ✅ Schema migration block added (deviation from dispatch acknowledged)
+- ✅ Type-clean (`pnpm -C apps/server build`)
+- ✅ Container rebuilt + migration confirmed via pg_constraint and logs
+- ✅ Smoke #1 (positive synth path) verified
+- ✅ Smoke #6 (fault injection + fall-through) verified, injection reverted
+- ⏳ Smokes #2–#5 (Sam's UI reads)
+- ⏳ Sam commit