diff --git a/CHANGELOG.md b/CHANGELOG.md index 45bc48b..d44e367 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,10 @@ All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch. +## v1.13.16-xml-parser — 2026-05-22 + +Two-part fix for the model-emitted XML drift the v1.13.15 investigation surfaced. **Parser extension:** `xml-parser.ts` now recognizes the Anthropic `<invoke name="...">` shape alongside the existing Qwen/Hermes `<tool_call>` shape. qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent (a residue of Claude Code documentation in its pre-training corpus). Both formats route through the same synthetic-id `xml_call_${idx}` ToolCall path. The existing Qwen parser was tightened to tolerate whitespace around `=` (`<function = NAME>` shape) so a stray space doesn't get absorbed into the function name. **Unknown-tool recovery hint:** new `tool-suggestions.ts` exports `levenshtein()` + `suggestToolName()` + `formatUnknownToolError()`. When the dispatcher (`tool-phase.ts:executeToolCall`) receives an unknown tool name, the error returned to the model lists the available tools and, when a candidate from `Object.keys(TOOLS_BY_NAME)` is within Levenshtein distance ≤3 or is a substring match, adds a "Did you mean: X?" hint. Targets the qwen3.6 drift to Claude Code tool names such as `read_file`. Note that `read_file` → `view_file` is distance 4, over the threshold, so that specific case gets no "Did you mean" hint and the model picks from the available-tools list instead. Test coverage in `xml-parser.test.ts` (46 tests, all green) covers both parsers, the partial-opener detector for both flavors, the unified extraction helper, and the new error formatter. + ## v1.13.15-codecontext-synth — 2026-05-22 Forced second-inference synthesis pass for codecontext overview-class tools (`get_codebase_overview`, `get_framework_analysis`, `get_semantic_neighborhoods`). 
After the tool result lands, the pipeline expands the truncated head via in-process `readTruncation`, extracts referenced file paths from the full content, auto-fetches top-N files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md) under a 32k-token budget with explicit drop-priority order, then streams a synthesis turn that replaces the recursive `runAssistantTurn`. The 32k truncated head still ships to the synth model (token-budget contract preserved); the expansion is reference-extraction-only. Falls through to recursion on timeout (90s), model error, or non-2xx; user-abort marks the synth message `status='failed'` and re-throws (the outer abort handler operates on the parent turn's message, not the new synth row — without explicit marking, the row would sit `streaming` until the 5-min sweeper, tripping the 60s stale-stream banner). Adds `'synthesis'` to `message_parts.kind` CHECK constraint via `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` idempotency-guarded re-add. Smokes #1, #2, #6 all clean; smokes #3–#5 are content-quality checks for UI review. diff --git a/apps/server/src/services/__tests__/xml-parser.test.ts b/apps/server/src/services/__tests__/xml-parser.test.ts new file mode 100644 index 0000000..86d06df --- /dev/null +++ b/apps/server/src/services/__tests__/xml-parser.test.ts @@ -0,0 +1,357 @@ +// v1.13.16: covers the Qwen/Hermes parser, the new Anthropic +// parser, the partial-opener detector for both flavors, the unified +// extraction helper, and the unknown-tool error formatter that downstream +// dispatch uses to give the model a recovery hint when it drifts to a +// Claude Code tool name like read_file instead of BooCode's view_file. 
+ +import { describe, expect, it } from 'vitest'; +import { + parseXmlToolCall, + parseInvokeToolCall, + partialXmlOpenerStart, + extractToolCallBlocks, + XML_TOOL_OPEN, + XML_TOOL_CLOSE, + INVOKE_TOOL_OPEN, + INVOKE_TOOL_CLOSE, +} from '../inference/xml-parser.js'; +import { + levenshtein, + suggestToolName, + formatUnknownToolError, +} from '../inference/tool-suggestions.js'; + +describe('parseXmlToolCall (Qwen/Hermes )', () => { + it('parses a well-formed single-parameter call', () => { + const block = '/tmp/foo'; + expect(parseXmlToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('parses multi-parameter call', () => { + const block = 'foosrc/'; + expect(parseXmlToolCall(block)).toEqual({ + name: 'grep', + args: { pattern: 'foo', path: 'src/' }, + }); + }); + + it('JSON-parses numeric parameter values', () => { + const block = '42'; + expect(parseXmlToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } }); + }); + + it('tolerates whitespace around = in function (v1.13.16 tightening)', () => { + const block = '/tmp/foo'; + expect(parseXmlToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('tolerates whitespace around = in parameter (v1.13.16 tightening)', () => { + const block = '/tmp/foo'; + expect(parseXmlToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('returns null when function name is missing', () => { + const block = '/tmp/foo'; + expect(parseXmlToolCall(block)).toBeNull(); + }); +}); + +describe('parseInvokeToolCall (Anthropic ) — v1.13.16', () => { + // Spec case 1 + it('parses a well-formed single-parameter call (spec case 1)', () => { + const block = '/tmp/foo'; + expect(parseInvokeToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + // Spec case 2 + it('parses a multi-parameter call (spec case 2)', () => { + const block = 'foosrc/'; + 
expect(parseInvokeToolCall(block)).toEqual({ + name: 'grep', + args: { pattern: 'foo', path: 'src/' }, + }); + }); + + // Spec case 3 + it('tolerates newlines and spaces in attributes (spec case 3)', () => { + const block = ` + /tmp/foo + `; + expect(parseInvokeToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + // Spec case 4 (parser portion — the not-found enrichment is tested below) + it('parses a call whose name is not a registered BooCode tool (spec case 4)', () => { + const block = '/tmp/foo'; + expect(parseInvokeToolCall(block)).toEqual({ + name: 'read_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('supports single-quoted attribute values', () => { + const block = "/tmp/foo"; + expect(parseInvokeToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('JSON-parses numeric parameter values', () => { + const block = '42'; + expect(parseInvokeToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } }); + }); + + it('tolerates spaces around = inside name attribute', () => { + const block = '/tmp/foo'; + expect(parseInvokeToolCall(block)).toEqual({ + name: 'view_file', + args: { path: '/tmp/foo' }, + }); + }); + + it('returns null when name attribute is missing', () => { + const block = '/tmp/foo'; + expect(parseInvokeToolCall(block)).toBeNull(); + }); + + it('returns null when name attribute is empty', () => { + const block = '/tmp/foo'; + expect(parseInvokeToolCall(block)).toBeNull(); + }); + + it('exports the expected delimiters', () => { + expect(INVOKE_TOOL_OPEN).toBe(''); + expect(XML_TOOL_OPEN).toBe(''); + expect(XML_TOOL_CLOSE).toBe(''); + }); +}); + +describe('partialXmlOpenerStart (v1.13.16 — both flavors)', () => { + it('returns -1 when the buffer is empty', () => { + expect(partialXmlOpenerStart('')).toBe(-1); + }); + + it('returns -1 when the buffer has no openers', () => { + expect(partialXmlOpenerStart('plain prose, no markup')).toBe(-1); + }); + + 
it('returns the index of a complete opener (existing)', () => { + expect(partialXmlOpenerStart('prose more')).toBe(6); + }); + + it('returns the index of a complete { + expect(partialXmlOpenerStart('prose { + expect(partialXmlOpenerStart('text { + expect(partialXmlOpenerStart('text { + expect(partialXmlOpenerStart('text <')).toBe(5); + }); + + it('returns -1 when < is followed by non-opener text', () => { + expect(partialXmlOpenerStart('text ')).toBe(-1); + }); + + it('returns the earliest opener when both flavors are present', () => { + expect(partialXmlOpenerStart('xxx YYY ')).toBe(4); + expect(partialXmlOpenerStart('xxx YYY ')).toBe(4); + }); +}); + +describe('extractToolCallBlocks (v1.13.16 — unified extraction)', () => { + // Spec case 1 (extraction-level) + it('extracts a single block (spec case 1)', () => { + const input = '/tmp/foo'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]); + expect(result.flushed).toBe(''); + expect(result.remaining).toBe(''); + }); + + // Spec case 5: opener arrives in one chunk, closer in the next. 
+ it('holds the partial chunk when the closer has not arrived (spec case 5, first chunk)', () => { + const firstChunk = '/tmp/foo'; + const result = extractToolCallBlocks(firstChunk); + expect(result.calls).toEqual([]); + expect(result.flushed).toBe(''); + expect(result.remaining).toBe(firstChunk); + }); + + it('extracts the block once the closer arrives in a later chunk (spec case 5, completion)', () => { + const firstChunk = '/tmp/foo'; + const r1 = extractToolCallBlocks(firstChunk); + const combined = r1.remaining + ''; + const r2 = extractToolCallBlocks(combined); + expect(r2.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]); + expect(r2.flushed).toBe(''); + expect(r2.remaining).toBe(''); + }); + + // Spec case 6: prose interleaving + it('flushes prose around a recognized block but not the markup itself (spec case 6)', () => { + const input = 'I will read the file.\n/tmp/foo\nThanks.'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]); + expect(result.flushed).toBe('I will read the file.\n\nThanks.'); + expect(result.remaining).toBe(''); + }); + + // Spec case 7 regression + it('extracts a Qwen block alongside the new code path (spec case 7 regression)', () => { + const input = '/tmp/foo'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([{ name: 'view_file', args: { path: '/tmp/foo' } }]); + expect(result.flushed).toBe(''); + expect(result.remaining).toBe(''); + }); + + it('extracts mixed-format blocks in source order (hand-back: shared counter)', () => { + const input = + '/a' + + ' middle ' + + 'foo'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([ + { name: 'view_file', args: { path: '/a' } }, + { name: 'grep', args: { pattern: 'foo' } }, + ]); + expect(result.flushed).toBe(' middle '); + expect(result.remaining).toBe(''); + }); + + it('drops a malformed block silently (matches existing behavior)', 
() => { + const input = 'prose /a trailing'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([]); + expect(result.flushed).toBe('prose trailing'); + expect(result.remaining).toBe(''); + }); + + it('holds a tail with a fresh partial opener after extracting earlier complete blocks', () => { + const input = '/a next: { + const input = 'just some text with a < character but no opener'; + const result = extractToolCallBlocks(input); + expect(result.calls).toEqual([]); + expect(result.flushed).toBe(input); + expect(result.remaining).toBe(''); + }); +}); + +describe('levenshtein', () => { + it('returns 0 for identical strings', () => { + expect(levenshtein('view_file', 'view_file')).toBe(0); + }); + + it('returns the length when one string is empty', () => { + expect(levenshtein('', 'view_file')).toBe(9); + expect(levenshtein('view_file', '')).toBe(9); + }); + + it('computes a small distance for a single-character substitution', () => { + expect(levenshtein('cat', 'bat')).toBe(1); + }); + + it('computes a known case: read_file → view_file is 4', () => { + // r→v, e→i, a→e, d→w → 4 substitutions, same length + expect(levenshtein('read_file', 'view_file')).toBe(4); + }); +}); + +describe('suggestToolName (v1.13.16)', () => { + const tools = [ + 'view_file', + 'list_dir', + 'grep', + 'find_files', + 'view_truncated_output', + 'ask_user_input', + 'web_search', + ]; + + it('suggests the closest match when distance is small', () => { + expect(suggestToolName('view_files', tools)).toBe('view_file'); + }); + + it('suggests via substring match when distance alone would miss', () => { + // 'file' is a substring of multiple tools; closest by distance wins. 
+ expect(suggestToolName('file', tools)).toBe('view_file'); + }); + + it('returns null when nothing is close', () => { + expect(suggestToolName('xxxx_yyyy_zzzz', tools)).toBeNull(); + }); + + it('is case-insensitive in the distance check', () => { + expect(suggestToolName('VIEW_FILE', tools)).toBe('view_file'); + }); +}); + +describe('formatUnknownToolError (v1.13.16)', () => { + const tools = ['view_file', 'list_dir', 'grep', 'find_files']; + + it('includes the wrong name and the available tools list', () => { + const msg = formatUnknownToolError('read_file', tools); + expect(msg).toContain("Tool 'read_file' not found"); + expect(msg).toContain('Available tools:'); + expect(msg).toContain('view_file'); + expect(msg).toContain('find_files'); + }); + + it('includes a suggestion when the drifted name is within threshold', () => { + // distance(view_files, view_file) = 1 (one extra char) + const msg = formatUnknownToolError('view_files', tools); + expect(msg).toContain('Did you mean: view_file?'); + }); + + it('omits the suggestion clause when no tool is close enough', () => { + const msg = formatUnknownToolError('zzzzzzz', tools); + expect(msg).toContain("Tool 'zzzzzzz' not found"); + expect(msg).toContain('Available tools:'); + expect(msg).not.toContain('Did you mean'); + }); + + // The drift incident in the recon (chat 30d8…1be7167, msg 7ff558f4) had the + // model emit <invoke name="read_file">. lev(read_file, view_file) = 4, so + // the spec's threshold (<=3) doesn't suggest view_file — the model still + // gets the available-tools list to pick from. This pins that behavior so a + // future loosening of the threshold is a deliberate choice. 
+ it('does not suggest view_file for the read_file drift case (distance is 4, over threshold)', () => { + const msg = formatUnknownToolError('read_file', tools); + expect(msg).not.toContain('Did you mean'); + }); +}); diff --git a/apps/server/src/services/inference/stream-phase.ts b/apps/server/src/services/inference/stream-phase.ts index 8ec399b..81efdea 100644 --- a/apps/server/src/services/inference/stream-phase.ts +++ b/apps/server/src/services/inference/stream-phase.ts @@ -6,12 +6,9 @@ import type { import * as modelContext from '../model-context.js'; import { toolJsonSchemas, type ToolJsonSchema } from '../tools.js'; import type { OpenAiMessage } from './payload.js'; -import { - XML_TOOL_CLOSE, - XML_TOOL_OPEN, - parseXmlToolCall, - partialXmlOpenerStart, -} from './xml-parser.js'; +// v1.13.16: extractToolCallBlocks replaces the inline opener-search loop and +// recognizes both Qwen and Anthropic markup in one pass. +import { extractToolCallBlocks } from './xml-parser.js'; import { DB_FLUSH_INTERVAL_MS, type StreamPhaseState } from './types.js'; import type { InferenceContext, @@ -132,16 +129,24 @@ function buildAiTools(schemas: ToolJsonSchema[]): Record // // VALUE // ... // // -// Multiple blocks may appear back-to-back; they never nest. +// +// v1.13.16: also recognize Anthropic markup that qwen3.6-35b-a3b-mxfp4 +// drifts to (training-data residue from Claude Code documentation): +// +// VALUE +// +// Both formats share the synthetic xml_call_${idx} ID space; the counter +// increments across whichever opener appears first. Multiple blocks may +// appear back-to-back in either format and they never nest. export async function streamCompletion( ctx: InferenceContext, model: string, @@ -209,47 +214,24 @@ export async function streamCompletion( switch (part.type) { case 'text-delta': { pendingBuffer += part.text; - // Extract any complete ... blocks before - // flushing visible text. 
- while (true) { - const startIdx = pendingBuffer.indexOf(XML_TOOL_OPEN); - if (startIdx === -1) break; - const closeIdx = pendingBuffer.indexOf(XML_TOOL_CLOSE, startIdx); - if (closeIdx === -1) break; - const blockEnd = closeIdx + XML_TOOL_CLOSE.length; - const block = pendingBuffer.slice(startIdx, blockEnd); - if (startIdx > 0) { - const before = pendingBuffer.slice(0, startIdx); - content += before; - onDelta(before); - } - const parsedCall = parseXmlToolCall(block); - if (parsedCall) { - const synthIdx = toolCalls.length; - toolCalls.push({ - id: `xml_call_${synthIdx}`, - name: parsedCall.name, - args: parsedCall.args, - }); - } - // Parse failures still drop the block — leaking XML to - // the chat would look worse than silently swallowing the bad block. - pendingBuffer = pendingBuffer.slice(blockEnd); + // v1.13.16: unified extraction. The helper finds the earliest-opening + // complete or block, flushes prose between/around + // them, holds any partial opener for the next chunk, and silently + // drops blocks that fail to parse (matches pre-v1.13.16 behavior). + const extracted = extractToolCallBlocks(pendingBuffer); + if (extracted.flushed.length > 0) { + content += extracted.flushed; + onDelta(extracted.flushed); } - // Hold back any (partial or full) unclosed opener; flush the rest. 
- const partialIdx = partialXmlOpenerStart(pendingBuffer); - if (partialIdx >= 0) { - if (partialIdx > 0) { - const flush = pendingBuffer.slice(0, partialIdx); - content += flush; - onDelta(flush); - } - pendingBuffer = pendingBuffer.slice(partialIdx); - } else if (pendingBuffer.length > 0) { - content += pendingBuffer; - onDelta(pendingBuffer); - pendingBuffer = ''; + for (const call of extracted.calls) { + const synthIdx = toolCalls.length; + toolCalls.push({ + id: `xml_call_${synthIdx}`, + name: call.name, + args: call.args, + }); } + pendingBuffer = extracted.remaining; break; } case 'tool-call': { diff --git a/apps/server/src/services/inference/tool-phase.ts b/apps/server/src/services/inference/tool-phase.ts index b9b59b8..2c9d91c 100644 --- a/apps/server/src/services/inference/tool-phase.ts +++ b/apps/server/src/services/inference/tool-phase.ts @@ -4,6 +4,12 @@ import { PathScopeError } from '../path_guard.js'; import { TOOLS_BY_NAME } from '../tools.js'; import { maybeFlagForCompaction } from './payload.js'; import { insertParts, partsFromAssistantMessage, partsFromToolMessage } from './parts.js'; +// v1.13.16: richer unknown-tool error so the model can self-correct when it +// drifts to a Claude Code tool name (e.g. read_file). The error always lists +// the available tools and adds a "Did you mean" hint when a name is within +// the threshold; read_file → view_file is distance 4, so that case relies on +// the list alone. +// Applies to all unknown tool names, not just <invoke>-derived ones — at the +// dispatch layer we no longer know which format produced the call, and the +// extra signal is harmless for Qwen-derived calls. 
+import { formatUnknownToolError } from './tool-suggestions.js'; import type { InferenceContext, StreamResult, @@ -26,7 +32,11 @@ async function executeToolCall( ): Promise<{ output: unknown; truncated: boolean; error?: string }> { const tool = TOOLS_BY_NAME[toolCall.name]; if (!tool) { - return { output: null, truncated: false, error: `unknown tool: ${toolCall.name}` }; + return { + output: null, + truncated: false, + error: formatUnknownToolError(toolCall.name, Object.keys(TOOLS_BY_NAME)), + }; } const parsed = tool.inputSchema.safeParse(toolCall.args); if (!parsed.success) { diff --git a/apps/server/src/services/inference/tool-suggestions.ts b/apps/server/src/services/inference/tool-suggestions.ts new file mode 100644 index 0000000..47a42f8 --- /dev/null +++ b/apps/server/src/services/inference/tool-suggestions.ts @@ -0,0 +1,63 @@ +// v1.13.16: Levenshtein + suggestion + formatter for the unknown-tool error +// returned to the model when an XML-extracted tool call references a name +// that isn't in TOOLS_BY_NAME. The drift incident this targets: qwen3.6 +// emitting <invoke name="read_file"> from its Claude Code training residue +// when BooCode's actual file-read tool is view_file. Hand-rolled distance +// function — no new dep. + +export function levenshtein(a: string, b: string): number { + if (a.length === 0) return b.length; + if (b.length === 0) return a.length; + const dp: number[][] = Array.from( + { length: a.length + 1 }, + () => new Array(b.length + 1).fill(0), + ); + for (let i = 0; i <= a.length; i++) dp[i]![0] = i; + for (let j = 0; j <= b.length; j++) dp[0]![j] = j; + for (let i = 1; i <= a.length; i++) { + for (let j = 1; j <= b.length; j++) { + const cost = a[i - 1] === b[j - 1] ? 0 : 1; + dp[i]![j] = Math.min( + dp[i - 1]![j]! + 1, + dp[i]![j - 1]! + 1, + dp[i - 1]![j - 1]! + cost, + ); + } + } + return dp[a.length]![b.length]!; +} + +// Threshold per the v1.13.16 dispatch: distance <= 3 OR substring match +// (either direction). 
Ties broken by smallest distance, then alphabetical. +export function suggestToolName( + name: string, + available: readonly string[], +): string | null { + const lower = name.toLowerCase(); + let best: { name: string; dist: number } | null = null; + for (const tool of available) { + const tlower = tool.toLowerCase(); + const dist = levenshtein(lower, tlower); + const isSubstr = tlower.includes(lower) || lower.includes(tlower); + if (dist > 3 && !isSubstr) continue; + if ( + best === null || + dist < best.dist || + (dist === best.dist && tool.localeCompare(best.name) < 0) + ) { + best = { name: tool, dist }; + } + } + return best?.name ?? null; +} + +export function formatUnknownToolError( + name: string, + available: readonly string[], +): string { + const sorted = [...available].sort(); + const suggestion = suggestToolName(name, sorted); + const list = sorted.join(', '); + const tail = suggestion ? ` Did you mean: ${suggestion}?` : ''; + return `Tool '${name}' not found. Available tools: [${list}].${tail}`; +} diff --git a/apps/server/src/services/inference/xml-parser.ts b/apps/server/src/services/inference/xml-parser.ts index 61f080b..55d833d 100644 --- a/apps/server/src/services/inference/xml-parser.ts +++ b/apps/server/src/services/inference/xml-parser.ts @@ -1,23 +1,42 @@ // v1.10.5: XML-tag tool-call fallback. Some models emit // value // in plain content instead of using the OpenAI tool_calls JSON channel. -// The streaming loop in inference.ts extracts these blocks via these helpers. +// The streaming loop in stream-phase.ts extracts these blocks via these helpers. +// +// v1.13.16: also recognize Anthropic +// markup. qwen3.6-35b-a3b-mxfp4 drifts to this format when prompted as an +// "Architect"-style agent because Claude Code documentation in its +// pre-training data uses this shape. 
Both formats route through the same +// synthetic ToolCall path with shared xml_call_${idx} IDs; downstream +// dispatch handles unknown tool names with a richer error (see +// tool-suggestions.ts + tool-phase.ts). export const XML_TOOL_OPEN = ''; export const XML_TOOL_CLOSE = ''; -export function parseXmlToolCall( - block: string, -): { name: string; args: Record } | null { - const nameMatch = block.match(/]+)>/); +// v1.13.16: Anthropic opener is matched by prefix (not the full +// `` tag) because attributes follow. Closer is the literal tag. +export const INVOKE_TOOL_OPEN = '; +} + +// v1.10.5: Qwen-flavor parser. Tightened in v1.13.16 to tolerate whitespace +// around `=` (e.g. ``). Name capture is non-whitespace, +// non-`>` so a stray space doesn't get absorbed into the function name. +const QWEN_FUNCTION_RE = /\s]+)\s*>/; +const QWEN_PARAM_RE = /\s]+)\s*>([\s\S]*?)<\/parameter>/g; + +export function parseXmlToolCall(block: string): ParsedCall | null { + const nameMatch = block.match(QWEN_FUNCTION_RE); if (!nameMatch || !nameMatch[1]) return null; const name = nameMatch[1].trim(); if (!name) return null; const args: Record = {}; - // Non-greedy body so each pair is matched - // independently even when multiple appear in the same block. - const paramRe = /]+)>([\s\S]*?)<\/parameter>/g; - for (const m of block.matchAll(paramRe)) { + for (const m of block.matchAll(QWEN_PARAM_RE)) { const key = (m[1] ?? '').trim(); if (!key) continue; const raw = (m[2] ?? '').trim(); @@ -30,24 +49,121 @@ export function parseXmlToolCall( return { name, args }; } +// v1.13.16: Anthropic-flavor parser. Same JSON-parse-with-string-fallback +// shape as parseXmlToolCall so the dispatch layer doesn't need to care which +// flavor produced the call. 
+const INVOKE_NAME_RE = + //; +const INVOKE_PARAM_RE = + /([\s\S]*?)<\/parameter>/g; + +export function parseInvokeToolCall(block: string): ParsedCall | null { + const nameMatch = block.match(INVOKE_NAME_RE); + if (!nameMatch) return null; + const name = (nameMatch[2] ?? nameMatch[3] ?? '').trim(); + if (!name) return null; + const args: Record = {}; + for (const m of block.matchAll(INVOKE_PARAM_RE)) { + const key = ((m[2] ?? m[3] ?? '') as string).trim(); + if (!key) continue; + const raw = (m[4] ?? '').trim(); + try { + args[key] = JSON.parse(raw); + } catch { + args[key] = raw; + } + } + return { name, args }; +} + // Locate the first character that begins (or completely contains) an -// unfinished opener in `s`. Returns -1 when `s` can be flushed -// to the client in full without risking a partial tag leak. -// Case 1: a full `` opener with no matching closer — caller -// must keep everything from that index forward until the next -// chunk arrives with the closer. -// Case 2: `s` ends with a strict prefix of `` (e.g. `` or ` pair before reaching this check. +// block before reaching this check. +const ALL_OPENERS = [XML_TOOL_OPEN, INVOKE_TOOL_OPEN] as const; + export function partialXmlOpenerStart(s: string): number { - const fullOpener = s.indexOf(XML_TOOL_OPEN); - if (fullOpener !== -1) return fullOpener; + let earliest = -1; + for (const op of ALL_OPENERS) { + const idx = s.indexOf(op); + if (idx === -1) continue; + if (earliest === -1 || idx < earliest) earliest = idx; + } + if (earliest !== -1) return earliest; const lastLt = s.lastIndexOf('<'); if (lastLt === -1) return -1; const suffix = s.slice(lastLt); - if (XML_TOOL_OPEN.startsWith(suffix) && suffix.length < XML_TOOL_OPEN.length) { - return lastLt; + for (const op of ALL_OPENERS) { + if (op.startsWith(suffix) && suffix.length < op.length) return lastLt; } return -1; } + +// v1.13.16: unified extraction. Replaces the inline loop that used to live +// in stream-phase.ts. 
Pure function — returns the visible text to flush, +// the parsed tool-call payloads in source order, and the buffer remainder +// to retain for the next streaming chunk. Parse failures are silently +// dropped (matches the pre-v1.13.16 behavior — leaking partial XML to the +// chat looks worse than swallowing a bad block). +export interface ToolCallExtraction { + flushed: string; + calls: ParsedCall[]; + remaining: string; +} + +interface OpenerSpec { + open: string; + close: string; + parse: (block: string) => ParsedCall | null; +} + +const OPENER_SPECS: ReadonlyArray = [ + { open: XML_TOOL_OPEN, close: XML_TOOL_CLOSE, parse: parseXmlToolCall }, + { open: INVOKE_TOOL_OPEN, close: INVOKE_TOOL_CLOSE, parse: parseInvokeToolCall }, +]; + +export function extractToolCallBlocks(buffer: string): ToolCallExtraction { + let flushed = ''; + const calls: ParsedCall[] = []; + let pos = 0; + + while (pos < buffer.length) { + let next: { spec: OpenerSpec; openIdx: number; closeIdx: number } | null = null; + for (const spec of OPENER_SPECS) { + const openIdx = buffer.indexOf(spec.open, pos); + if (openIdx === -1) continue; + const closeIdx = buffer.indexOf(spec.close, openIdx); + if (closeIdx === -1) continue; + if (next === null || openIdx < next.openIdx) { + next = { spec, openIdx, closeIdx }; + } + } + if (next === null) break; + + if (next.openIdx > pos) { + flushed += buffer.slice(pos, next.openIdx); + } + const blockEnd = next.closeIdx + next.spec.close.length; + const block = buffer.slice(next.openIdx, blockEnd); + const parsed = next.spec.parse(block); + if (parsed) calls.push(parsed); + pos = blockEnd; + } + + const tail = buffer.slice(pos); + const partialIdx = partialXmlOpenerStart(tail); + if (partialIdx === -1) { + flushed += tail; + return { flushed, calls, remaining: '' }; + } + if (partialIdx > 0) { + flushed += tail.slice(0, partialIdx); + } + return { flushed, calls, remaining: tail.slice(partialIdx) }; +} diff --git a/boocode_roadmap.md b/boocode_roadmap.md 
index a5d4c4d..4fe592a 100644 --- a/boocode_roadmap.md +++ b/boocode_roadmap.md @@ -92,6 +92,7 @@ All v1.13.x batches were retagged to the `vMAJOR.MINOR.PATCH-slug` scheme on 202 - `v1.13.13-ws-publish` — all ~80 publish sites converted to the typed wrappers; every WS frame now Zod-validated at boundary - `v1.13.14-skills-audit` — 26 skills vendored + audited via 5 parallel agent teams; 14 kept, 11 dropped, 1 migrated to BOOCHAT.md/BOOCODER.md - `v1.13.15-codecontext-synth` — forced second-inference synthesis pass for codecontext overview tools (truncation-aware extraction; auto-fetched top-N files + project docs; 32k payload-budget contract preserved) +- `v1.13.16-xml-parser` — Anthropic `<invoke>` parser support + Levenshtein-based unknown-tool recovery hints (qwen3.6 drift to Claude Code-style tool names like `read_file`); xml-parser test coverage The remaining strangler-fig final step (drop `messages.tool_calls` + `tool_results` columns) is still pending under its old `v1.13.2` working name; will get a new tag slug when scoped. @@ -611,7 +612,7 @@ Earlier May 18 chat recommended Option A (thin orchestration shell over OpenCode ### v1.13.x cleanup line locked (2026-05-22) -After the 2026-05-22 retag, the v1.13.x cleanup line in `vMAJOR.MINOR.PATCH-slug` form is **v1.13.0-ai-sdk-v6 ✅ → v1.13.1-cleanup-bundle ✅ → v1.13.2-compaction-prune ✅ → v1.13.3-truncate ✅ → v1.13.4-reasoning-fix ✅ → v1.13.5-stability-bundle ✅ → v1.13.6-prefix-stability ✅ → v1.13.7-compaction-trigger ✅ → v1.13.8-tool-cost ✅ → v1.13.9-agentlint ✅ → v1.13.10-openspec ✅ → v1.13.11-tools ✅ → v1.13.12-ws-schemas ✅ → v1.13.13-ws-publish ✅ → v1.13.14-skills-audit ✅ → v1.13.15-codecontext-synth ✅ → column drop (final, pending — old working name v1.13.2)**. **Do not fold.** Smoke isolation matters: each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches. 
+After the 2026-05-22 retag, the v1.13.x cleanup line in `vMAJOR.MINOR.PATCH-slug` form is **v1.13.0-ai-sdk-v6 ✅ → v1.13.1-cleanup-bundle ✅ → v1.13.2-compaction-prune ✅ → v1.13.3-truncate ✅ → v1.13.4-reasoning-fix ✅ → v1.13.5-stability-bundle ✅ → v1.13.6-prefix-stability ✅ → v1.13.7-compaction-trigger ✅ → v1.13.8-tool-cost ✅ → v1.13.9-agentlint ✅ → v1.13.10-openspec ✅ → v1.13.11-tools ✅ → v1.13.12-ws-schemas ✅ → v1.13.13-ws-publish ✅ → v1.13.14-skills-audit ✅ → v1.13.15-codecontext-synth ✅ → v1.13.16-xml-parser ✅ → column drop (final, pending — old working name v1.13.2)**. **Do not fold.** Smoke isolation matters: each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches. ### v1.13 retrospective (what shipped)