Files

indifferentketchup 2e1a81de72 v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints

Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth
investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6
turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted
as an Architect-style agent because Claude Code documentation in its
pre-training corpus uses that shape).

## Parser extension

xml-parser.ts now recognizes BOTH XML tool-call flavors:

  - Qwen/Hermes:   <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call>
  - Anthropic:     <invoke name="NAME"><parameter name="K">V</parameter></invoke>

Both route through the same synthetic-id xml_call_${idx} ToolCall path.
extractToolCallBlocks() and partialXmlOpenerStart() handle both openers
(<tool_call> and <invoke...) so partial buffers don't get prematurely
flushed during streaming.

The existing Qwen parser was tightened to tolerate whitespace around `=`
(<function = name>, <parameter = key>...) so a stray space doesn't get
absorbed into the function name. Name capture is non-whitespace,
non-`>`.

## Unknown-tool recovery hint

New tool-suggestions.ts exports levenshtein() + suggestToolName() +
formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a
toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the
model now includes a "Did you mean: X?" hint based on Levenshtein
distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME).
Targets the qwen3.6 drift to read_file → suggest view_file. Applies to
all unknown tool names, not just <invoke>-derived ones — at the
dispatch layer we no longer know which format produced the call, and
the extra signal is harmless for Qwen-derived calls.

## Test coverage

xml-parser.test.ts: 46 tests, all green. Covers both parsers
(well-formed, malformed, multi-parameter, nested-content), the
partial-opener detector for both flavors, the unified extraction
helper, and the unknown-tool error formatter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-22 20:59:25 +00:00

20 KiB

Raw Permalink Blame History

Changelog

All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow vMAJOR.MINOR.PATCH-slug — the slug describes what shipped, so the tag name alone is enough to recall the batch.

v1.13.16-xml-parser — 2026-05-22

Two-part fix for the model-emitted XML drift the v1.13.15 investigation surfaced. Parser extension: xml-parser.ts now recognizes the Anthropic <invoke name="…"><parameter name="…">…</parameter></invoke> shape alongside the existing Qwen/Hermes <tool_call><function=…>…</function></tool_call> shape. qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent (Claude Code documentation in its pre-training corpus). Both formats route through the same synthetic-id xml_call_${idx} ToolCall path. The existing Qwen parser was tightened to tolerate whitespace around = (<function = name> shape) so a stray space doesn't get absorbed into the function name. Unknown-tool recovery hint: new tool-suggestions.ts exports levenshtein() + suggestToolName() + formatUnknownToolError(). When the dispatcher (tool-phase.ts:executeToolCall) receives an unknown tool name, the error returned to the model includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME). Targets the qwen3.6 drift to read_file → suggest view_file. Test coverage in xml-parser.test.ts (46 tests, all green) covers both parsers, the partial-opener detector for both flavors, the unified extraction helper, and the new error formatter.

v1.13.15-codecontext-synth — 2026-05-22

Forced second-inference synthesis pass for codecontext overview-class tools (get_codebase_overview, get_framework_analysis, get_semantic_neighborhoods). After the tool result lands, the pipeline expands the truncated head via in-process readTruncation, extracts referenced file paths from the full content, auto-fetches top-N files + project docs (BOOCHAT.md, AGENTS.md, roadmap.md, CONTEXT.md) under a 32k-token budget with explicit drop-priority order, then streams a synthesis turn that replaces the recursive runAssistantTurn. The 32k truncated head still ships to the synth model (token-budget contract preserved); the expansion is reference-extraction-only. Falls through to recursion on timeout (90s), model error, or non-2xx; user-abort marks the synth message status='failed' and re-throws (the outer abort handler operates on the parent turn's message, not the new synth row — without explicit marking, the row would sit streaming until the 5-min sweeper, tripping the 60s stale-stream banner). Adds 'synthesis' to message_parts.kind CHECK constraint via DROP CONSTRAINT IF EXISTS + DO $$ pg_constraint idempotency-guarded re-add. Smokes #1, #2, #6 all clean; smokes #3–#5 are content-quality checks for UI review.

v1.13.14-skills-audit — 2026-05-22

Multi-topic batch. Skills audit (headline): vendored all 26 skills from /home/samkintop/opt/skills/ into repo-local data/skills/ (the /opt/skills:/data/skills override mount removed from docker-compose.yml so skills are auditable per-batch in git). Audited via 5 parallel Claude Code agent-teams running mgechev's 4-step protocol per skill — 14 survive with gerund-form names + refined triggers; 11 dropped (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively); 1 (verification-before-completion) migrated to BOOCHAT.md/BOOCODER.md as an always-true rule. The Codeminer42 "rules vs recipes" split codified in those files. Token tracking + stale-stream banner fix: same root cause — IsoTimestamp = z.string() in ws-frames.ts was failing on postgres Date objects, silently dropping every message_complete / session_updated / chat_updated frame through the v1.13.13-ws-publish Zod gate; z.preprocess(v => v instanceof Date ? v.toISOString() : v, ...) applied to the primitive on both server + web (parity test still passes). Codecontext ignore: codecontext_client.ts auto-installs .codecontextignore.template into any project's root on first call (stops the upstream empty-source-file parser crash on foreign projects' node_modules). Budget bump: BUDGET_READ_ONLY + BUDGET_NO_AGENT 30 → 50 (real recon need ~27 + headroom for codecontext failure-retry turns; doom-loop guard catches the loop class anyway). UI: queued-message dropdown → edit / force-send / cancel buttons in ChatPane.tsx; ChatThroughput removed from desktop tab strip (mobile tab switcher keeps it). Audit decisions in openspec/changes/v1.13.12-skills-audit/audit-notes.md.

v1.13.13-ws-publish — 2026-05-22

Second half of the WebSocket-frame-typing batch. Converts the existing ~50 inference + auto_name publish sites (via the index.ts adapter) plus ~30 direct broker.publish* call sites in routes + compaction, so every server-emitted frame now goes through Zod validation at the broker boundary. Pairs with v1.13.12-ws-schemas.

v1.13.12-ws-schemas — 2026-05-22

First half of the WebSocket-frame-typing batch. Adds apps/server/src/types/ws-frames.ts with Zod schemas for all 27 wire-format frame types (discriminated union WsFrameSchema + KNOWN_FRAME_TYPES diagnostic lookup), duplicated byte-identical at apps/web/src/api/ws-frames.ts with a parity test. Introduces the publishFrame / publishUserFrame wrappers that fail-closed on schema mismatch.

v1.13.11-tools — 2026-05-22

Tiered tool loading via BOOCODE_TOOLS env var (core | standard | all). Core = 4 read-only fs tools (~2k token schema cost). Standard = +web + git + codecontext (~10k). All (default) = every tool in ALL_TOOLS (~21k). The var is a ceiling — narrows agent whitelists, never expands. Pattern lifted from eyaltoledano/claude-task-master.

v1.13.10-openspec — 2026-05-22

Adopt Fission-AI/OpenSpec's openspec/changes/<slug>/{proposal,tasks,design}.md shape for BooCode's own batch docs. Existing batch docs (boocode_batch10.md, handoff_v1.13.8_prefix_verify.md, handoff_v1.13.10_per_tool_cost.md) moved into openspec/changes/archived/ via git mv to preserve history. Zero-dep documentation reformat.

v1.13.9-agentlint — 2026-05-22

Manual audit of instruction files against 0xmariowu/AgentLint's 31-check standard. Removed identity-opener sections from BOOCHAT.md and BOOCODER.md (emphatic decoration the model doesn't need). Added CLAUDE.local.md to .gitignore — Claude Code's Glob ignores .gitignore by default, so local overrides were otherwise readable by any agent walking the workspace. CLAUDE.md passed all 10 checks unchanged.

v1.13.8-tool-cost — 2026-05-22

Per-tool prompt/completion-token rolling averages surfaced in AgentPicker as at-a-glance cost hints. Implementation is the tool_cost_stats SQL view over messages_with_parts (LATERAL jsonb_array_elements on tool_calls), plus a read endpoint and a tooltip extension. Equal-split attribution — multi-tool turn divides tokens N-ways; the 100-call rolling mean absorbs split noise. Filters out cap_hit / doom_loop sentinels. Source data already lands via existing UPDATEs that v1.13.5-stability-bundle's includeUsage: true fix made non-NULL.

v1.13.7-compaction-trigger — 2026-05-22

Compaction overflow trigger lowered to floor(0.85 × ctx_max), replacing the v1.11.0-era ctx_max − 20_000 formula. Old formula gave only 7.6% headroom at 262k context and 0 budget for ≤20k contexts (never fired). New formula gives consistent 15% summarizer headroom across all model sizes. Opencode pattern lift from session/overflow.ts.

v1.13.6-prefix-stability — 2026-05-22

System-prompt prefix stability verify-and-measure. Recon during planning disproved the original DB-cache premise: buildSystemPrompt already runs over inputs mtime-cached at the file layer (BOOCHAT.md, AGENTS.md global+per-project), and DB scalars are byte-stable until edited. This batch closes the verification gap with instrumentation, not implementation — buildSystemPromptWithFingerprint computes SHA-256 over the assembled prefix and a per-session Map observer fires prefix-drift (warn) on hash change with field-level changed_inputs diff.

v1.13.5-stability-bundle — 2026-05-22

Five fixes for latent regressions surfaced during the cosmetic-revert investigation. (1) provider.ts — includeUsage: true on createOpenAICompatible (default false omitted stream_options.include_usage; llama-swap never emitted usage; tokens_used / ctx_used were NULL on every assistant row since v1.13.0-ai-sdk-v6). (2) MessageList.tsx — hasText = m.content.trim().length > 0 to skip whitespace-only tool-call-only turns rendering empty bubbles. (3) BUDGET_NO_AGENT raised 15 → 30 to match read-only agent cap. (4) payload.ts skips status='failed' + complete-but-empty assistant rows so cap-hit + Continue doesn't upstream-reject. (5) Misc UI sanitization.

v1.13.4-reasoning-fix — 2026-05-22

Compaction head-assembly audit caught one fix: reasoning was omitted from the summarizer's view of tool-bearing turns, silently degrading summary quality for reasoning-channel models (qwen3.6). v1.13.0-ai-sdk-v6 had wired reasoning end-to-end into inference but missed this one read site. CompactionMessage extended with reasoning_parts; buildHeadPayload embeds it as a <reasoning>...</reasoning> prose prefix on the assistant content (OpenAI wire shape has no structured reasoning field).

v1.13.3-truncate — 2026-05-22

Port of opencode's truncate.ts. Full tool output retrievable via opaque tr_<12 base32 chars> id (~60 bits entropy) and a new view_truncated_output(id) tool. Tmpfs storage at /tmp/boocode-truncations/ (overridable via BOOCODE_TRUNCATION_DIR), 5MB cap, 7-day TTL, orphan-reap on the periodic 60s sweeper. Wired through four tools: view_file, list_dir, web_fetch, codecontext_client. Each returns the existing sliced view plus an outputPath field when truncation fires.

v1.13.2-compaction-prune — 2026-05-22

Two-tier compaction prune — opencode pattern that was half-shipped in v1.11.0. New message_parts.hidden_at column with partial index on WHERE hidden_at IS NULL. messages_with_parts view changed from COALESCE(parts, legacy) to a CASE that distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → drop the row from the model payload" (smoke caught the COALESCE leaking hidden parts back via legacy fallback). prune.ts scans tool_result parts newest-first, protects the last 40k tokens, marks older candidates hidden once the combined estimate clears 20k.

v1.13.1-cleanup-bundle — 2026-05-22

Four independent items owed from prior dispatches. (1) statement_timeout = '30s' at the database level (documented in schema.sql but applied operationally — ALTER DATABASE can't run inside a DO block). (2) Tool registry alpha-sorted at module load — llama.cpp's prompt cache hits on byte-identical prefixes; reordering tools near the top of the system prompt would invalidate every cached turn. (3) Periodic 60s stuck-row sweeper. (4) experimental_repairToolCall to keep streams alive on malformed qwen3.6 tool args (pass-through implementation — logs and forwards unmodified; existing zod-reject path routes back to the model).

v1.13.0-ai-sdk-v6 — 2026-05-22

Major migration to AI SDK v6. Introduces the streamCompletion adapter (services/inference/stream-phase.ts) over streamText, with five known gotchas the LSP can't catch — abort signals swallowed by fullStream (post-iteration throw required), usage lands only at stream end via await result.usage, tools have no execute field (BooCode dispatches in tool-phase.ts), and tool-call-only turns may emit a leading \n text-delta. Also ships the messages_with_parts view (parts-merge read path) and wires reasoning_parts end-to-end via a ReasoningPart in the v6 ModelMessage. Ports ask_user_input correlation queries from JSON columns to message_parts JOINs.

v1.12.4-inference-split — 2026-05-21

Complete inference.ts split into services/inference/. Pieces: turn.ts (orchestration — runAssistantTurn / runInference / createInferenceRunner), sentinel-summaries.ts (runCapHitSummary, runDoomLoopSummary), stream-phase.ts, tool-phase.ts, provider.ts, payload.ts, prune.ts, budget.ts, xml-parser.ts, error-handler.ts, sentinels.ts, parts.ts, types.ts. Public surface re-exported via inference/index.ts; callers import from ./services/inference/index.js explicitly (NodeNext doesn't honor directory-index resolution).

v1.12.3-stale-banner — 2026-05-21

Stale-stream banner with Retry/Discard. When an assistant message sits status='streaming' with no token activity for 60+ seconds, the chat shows a banner above the input. Both actions clear the stale row via new POST /api/chats/:id/discard_stale (updates status='failed', publishes chat_status='idle'). Closes the UX gap from the 2026-05-21 debugging spiral — slow streams and dead streams now look different.

v1.12.2-live-toks — 2026-05-21

Live tok/s + ctx display next to the status indicator. ChatThroughput renders inline beside StatusDot while streaming or tool_running. Subscribes to existing 'usage' WS frames (500ms-throttled, carrying completion_tokens + ctx_used + ctx_max) via sessionEvents. Hides when status drops to idle/error or data is older than 10s. Addresses the same UX gap as v1.12.3-stale-banner — gives users a live token velocity readout that immediately distinguishes slow from dead.

v1.12.1-stop-handler — 2026-05-21

handleAbortOrError now writes status='cancelled' on user stop; rows no longer stuck streaming forever. Drops stale messages_status_check constraint (only messages_status_chk remains, allowing 'cancelled' via TS MESSAGE_STATUSES). Removes detectSameNameLoop and DOOM_LOOP_SAME_NAME_THRESHOLD (added during the 2026-05-21 debugging spike, never fired in any real run) plus 12 verbose ctx.log.info diagnostic markers from the same spike. Bundles workspace pane sync + status indicator overhaul + startup hung-row sweep that landed earlier in v1.12.1 work.

v1.12.0-codecontext — 2026-05-21

Adds the codecontext sidecar (Go-based code-graph indexer at codecontext:8080/v1/<tool_name> over boocode_net) plus container guidance and skills runtime updates. Introduces the chat_status WS frame (streaming | tool_running | waiting_for_input | idle | error, widened from working|idle|error). Drops the deprecated session_panes table — workspace pane state moves to sessions.workspace_panes jsonb for cross-device sync via PATCH /api/sessions/:id/workspace.

v1.11.1-consolidation — 2026-05-21

Rollup of v1.11.0–v1.11.10 work that was shipped piecemeal. Covers anchored rolling compaction (single summary=true row per chat that supersedes itself), doom-loop guard via detectDoomLoop, path_guard secret-filename deny list, web tools (web_search against SearXNG + web_fetch with SSRF/private-IP block), and the 5MB stream-cap on response bodies with abort-on-overflow.

v1.11.0-context-bar — 2026-05-20

Persistent context-window tracker in ChatPane + ctx_max capture via ${LLAMA_SWAP_URL}/upstream/<model>/props. First inferences after a boocode boot may have ctx_max=NULL if llama-swap hasn't loaded the model yet — 60s negative cache TTL recovers on next turn. Replaced an earlier dead read of parsed.timings.n_ctx which never carried n_ctx.

v1.10.1-booterm-user — 2026-05-19

Per-user shell privilege drop in the booterm container via gosu in tmux.conf default-command. Shells launched in browser terminal panes drop privs to samkintop rather than running as root inside the container.

v1.10.0-booterm — 2026-05-18

Second container (apps/booterm, port 9501, bookworm-slim+glibc). Fastify + node-pty + tmux. Browser terminal panes connect via WS to /ws/term/sessions/:sid/panes/:pid; per-session tmux session bc-<sid>, per-pane window term-<pid>. xterm-addon-webgl with document.fonts.load(...)-gated init (Canvas2D doesn't honor font-display: block) and iOS-friendly visibility-change context recreation.

v1.9.2-ask-user-input — 2026-05-18

ask_user_input elicitation tool. Pauses the inference loop and surfaces a prompt to the user; their response routes back as the tool result. Correlation initially via messages.tool_calls / tool_results JSON columns (later ported to message_parts in v1.13.0-ai-sdk-v6).

v1.9.1-skills — 2026-05-18

Skills runtime + /skill slash command with autocomplete. Server-side parser, tools, /api/skills, and mount. Hardens .dockerignore to exclude secrets/ and data/. Drops the type-to-confirm gate on chat delete (plain Cancel/Confirm only — per workspace convention).

v1.9.0-themes-settings — 2026-05-17

Settings pane + per-project defaults + bulk archive + themes lift. themes-v1 (18 preset palettes) ships in the same batch with a Settings picker for live theme switching.

v1.8.2-cap-hit — 2026-05-17

Tool-loop cap-hit summary — when an assistant exceeds the per-turn tool budget, a sentinel role='system' row with metadata.kind='cap_hit' is inserted and a summary turn runs to give the user a coherent endpoint. Also compacts the tool-call UI rendering.

v1.8.1-agents-global — 2026-05-16

Global agents (data/AGENTS.md bind-mounted at /data/AGENTS.md) + parser robustness + WS reconnect toast. Per-project AGENTS.md mechanism (getAgentsForProject) remains for other projects; the BooCode repo itself uses global-only to eliminate two-files-must-stay-in-sync drift.

v1.8.0-agents — 2026-05-16

Tier 2 agents — AGENTS.md registry + per-session agent picker. Also lands mobile tab switcher, branch indicator, and the git_status tool.

v1.7.0-drag-drop — 2026-05-16

Drag-drop + paste-as-attachment for long text in the chat input.

v1.6.0-mobile — 2026-05-16

Full mobile suite. Adds useViewport (matchMedia breakpoints mobile <768 / tablet 768–1023 / desktop ≥1024), useSidebarDrawer / useRightRailDrawer (Context + auto-close on useLocation().pathname change), useLongPress (500ms timer, synthetic contextmenu), usePullToRefresh (80px threshold, 600ms hold), SwipeablePaneTab (60px close, 30px vertical bail). Mobile headers with safe-area padding, hamburger left, FolderTree right. Tap targets at max-md:min-h-[44px] max-md:min-w-[44px]. Raises MAX_TOOL_LOOP_DEPTH 5 → 15. Right-rail becomes a drawer on mobile.

v1.5.1-bootstrap — 2026-05-16

Bootstrap fixes — git + ssh installed in the boocode container, Tailscale host rewrite, /opt/projects label correction for the create-new-project bootstrap flow.

v1.5.0-refactor-tests — 2026-05-16

Refactor split (FileBrowserPane / Workspace / runAssistantTurn) + vitest harness + unit tests for security-critical pure functions. Scopes the /opt mount to /opt/projects (writable) plus PROJECT_ROOT_WHITELIST=/opt (read-only resolution for add-existing). Surfaces swallowed errors and removes dead session_renamed paths.

v1.4.0-fork-header — 2026-05-16

Fork from message + delete message + header polish + general housekeeping.

v1.3.0-chats-projects — 2026-05-16

Chats-in-sessions era. Adds force-send, /compact, right-rail file browser, archive/rename/Open-in-Gitea sidebar context menu, archived projects landing page, create-project bootstrap with Gitea remote setup, landing-card buttons, 1000px content cap. Dedup audit and chat archive/delete from the sidebar.

v1.2.0-multi-pane — 2026-05-15

Multi-pane workspace (batch 3, T1–T8). session_panes schema (later replaced by sessions.workspace_panes jsonb in v1.12.0), Pane discriminated union, broker user channel + /api/ws/user, file_ops + file_index services, PaneShell / ChatPane / FileBrowserPane / PaneTab / Workspace components, usePanes hook, Shiki integration in CodeBlock. Up to 5 panes per session; default chat pane created on POST /api/sessions.

v1.1.0-markdown-sidebar — 2026-05-15

Markdown rendering, message actions, tok/s + ctx display, AI session naming. Sidebar restructure — chats nested under projects (max 5 + view-all), live updates via WS.

v1.0.0-initial — 2026-05-14

Initial commit. Skeleton of the monorepo: apps/server (Fastify + postgres), apps/web (React + Vite), basic chat loop against llama-swap.

20 KiB Raw Permalink Blame History Unescape Escape