boocode

Author	SHA1	Message	Date
indifferentketchup	bcfc94fa47	v2.4.1-sidecar-routing: route per-agent flags to llama-sidecar + tool gap fix Batch 3c: when an agent has llama_extra_args in AGENTS.md, provider.ts routes inference through LLAMA_SIDECAR_URL instead of LLAMA_SWAP_URL. X-Agent-Flags header built from the agent's flags. Boot-time guard refuses to start if any agent has llama_extra_args but LLAMA_SIDECAR_URL is unset. PrefixFingerprint gains a route field (swap/sidecar) for per-turn visibility. 9 provider tests. AGENTS.md tool gap: all agents (except Prompt Builder) were missing 8 tools that were added after the original tool lists were written: request_read_access, view_truncated_output, ask_user_input, git_status, get_blast_radius, get_hot_files, get_middleware, get_routes. The missing request_read_access caused silent "permission denied" when reading files outside the project root. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-27 19:28:08 +00:00
indifferentketchup	90a6761b07	v2.4.0-unsloth-studio-lift: port 3 Unsloth Studio AGPL-3.0 modules Batch 1 — tool-call-parser.ts: replaces xml-parser.ts with a port of Unsloth's tool_call_parser.py. Adds balanced-brace JSON scanner, single-param fast path, hasToolSignal/stripToolMarkup/parseToolCallsFromText exports, and stream-finalization stripping at all three final-write sites (error-handler, finalizeCompletion, executeToolPhase). Anthropic <invoke> shape preserved. 75+12 tests. Batch 2 — web/html-to-md.ts: parse5 tree-walking HTML-to-Markdown converter ported from Unsloth's _html_to_md.py. Replaces web_fetch's regex stripHtml with structured markdown output (headings, links, lists, tables, code blocks, blockquotes, entity decoding). 29 tests. Batch 3 — llama-args-validator.ts: port of llama_server_args.py deny-list validator. Wired into AGENTS.md frontmatter parser — llama_extra_args field validated at load time, rejects managed flags (model identity, networking, auth/TLS, server UI). No runtime consumer yet (llama-swap boundary). 76 tests. All three files carry SPDX-License-Identifier: AGPL-3.0-only headers. LICENSE flipped to AGPL-3.0-only in prior commit (`a938cf1`). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-26 23:30:50 +00:00
indifferentketchup	792bbb9da3	v2.3.0-sampling-params-ask-user: agent sampling params, ask_user_input in CoderPane, UX polish Add top_p/top_k/min_p/presence_penalty to AGENTS.md frontmatter and thread through inference (agents.ts parser → Agent type → stream-phase → sentinel summaries). Null means omit from request body, preserving provider defaults. Wire ask_user_input interactive card into both BooCoder frontends: the CoderPane in BooChat's SPA (CoderMessageList now renders AskUserInputCard instead of ToolCallLine for ask_user_input tool calls) and the standalone coder SPA (MessageBubble + new AskUserInputCard + shadcn ui primitives). Additional fixes: SessionLandingPage uses ChatInput with slash-command support and lazy chat creation; Session.tsx hydrate-race fix for empty pane promotion; AgentPicker wider dropdown with line-clamp; ModelPicker min-width; Textarea converted to forwardRef; Recon agent added to AGENTS.md; codecontext host port exposed in docker-compose. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-26 21:02:21 +00:00
indifferentketchup	d27a977d59	v1.15.0-mcp-multi: multi-server MCP client + stdio transport + config file + tool globs Generalizes the v1.14.1 single-server Context7 PoC into a multi-server MCP client registry with per-server graceful degradation. JSON config at /data/mcp.json (bind-mounted alongside AGENTS.md) matches opencode's mcpServers schema shape. Config file missing = no MCP (opt-in by presence). Two transports: Streamable HTTP (remote servers like Context7) and stdio (local subprocess servers like codecontext). Stdio spawns a persistent child via the SDK's StdioClientTransport; shutdown hook closes all transports. Tool prefix generalized from context7_<name> to <serverName>_<toolName> with a toolToServer reverse map for dispatch routing. AGENTS.md tools: field now supports glob patterns (context7_, !web_) via matchToolGlob — last-match-wins with ! deny prefix. Replaces exact-match .includes() in stream-phase.ts. refreshToolNames() in agents.ts rebuilds the DEFAULT_TOOLS snapshot after appendMcpTools so agents without explicit tools: lists see MCP tools — reviewer caught that the module-load-time snapshot would permanently exclude late-registered tools. Read-only invariant: readOnlyHint === false rejected at discovery. Result size capped at 5MB. v1.14.1 env vars removed — superseded by config file. Default data/mcp.json ships with Context7 disabled. 363/363 server tests passing. No schema changes, no frontend changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 04:08:42 +00:00
indifferentketchup	2e1a81de72	v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6 turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent because Claude Code documentation in its pre-training corpus uses that shape). ## Parser extension xml-parser.ts now recognizes BOTH XML tool-call flavors: - Qwen/Hermes: <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call> - Anthropic: <invoke name="NAME"><parameter name="K">V</parameter></invoke> Both route through the same synthetic-id xml_call_${idx} ToolCall path. extractToolCallBlocks() and partialXmlOpenerStart() handle both openers (<tool_call> and <invoke...) so partial buffers don't get prematurely flushed during streaming. The existing Qwen parser was tightened to tolerate whitespace around `=` (<function = name>, <parameter = key>...) so a stray space doesn't get absorbed into the function name. Name capture is non-whitespace, non-`>`. ## Unknown-tool recovery hint New tool-suggestions.ts exports levenshtein() + suggestToolName() + formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the model now includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME). Targets the qwen3.6 drift to read_file → suggest view_file. Applies to all unknown tool names, not just <invoke>-derived ones — at the dispatch layer we no longer know which format produced the call, and the extra signal is harmless for Qwen-derived calls. ## Test coverage xml-parser.test.ts: 46 tests, all green. Covers both parsers (well-formed, malformed, multi-parameter, nested-content), the partial-opener detector for both flavors, the unified extraction helper, and the unknown-tool error formatter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:59:25 +00:00
indifferentketchup	a08d809b73	v1.13.3: cleanup bundle — statement timeout + alpha ordering + stuck-row sweeper + repairToolCall Four independent items, all owed from prior dispatches. - statement_timeout at the database level via: ALTER DATABASE boocode SET statement_timeout = '30s'; Applied operationally; documented as a comment at the top of schema.sql (ALTER DATABASE can't run inside a DO block, so it's not idempotent inside applySchema). Re-apply after a volume reset. - Tool registry alpha-sorted at module load. llama.cpp's prompt cache hits on byte-identical prefixes; any reordering of the tool list near the top of the system prompt would invalidate every cached turn. Single-source sort at the ALL_TOOLS export so toolJsonSchemas() and TOOLS_BY_NAME inherit the order automatically. New tools.test.ts asserts the invariant; total tests 173 (was 172). - Periodic in-process stuck-row sweeper. Runs every 60s, marks 'streaming' rows older than 5 minutes as 'failed', and publishes chat_status='idle' on the user channel so the UI dot drops without a refresh. Closes the mid-session crash UX gap; the v1.12.1 boot sweep only fires once at startup, so sessions used to stay stuck until next reboot. setInterval cleaned up via app.addHook('onClose'). Mirrors handleAbortOrError's publish pattern. - experimental_repairToolCall wired through AI SDK v6 streamText. Pass-through implementation: log + return the original toolCall so the stream keeps going. executeToolPhase's existing error paths (unknown tool name → 'unknown tool: X' result; zod-reject → 'tool X rejected — field: required') already surface bad calls to the model; the value here is preventing the AI SDK from THROWING on parse errors and killing the whole stream. Owed since v1.13.1-A. Smoke verified: - statement_timeout = '30s' confirmed via SHOW. - Tool path normal flow intact (list_dir prompt → tool_call → result → final assistant). No malformed tool calls in the test run; repair log will surface them when qwen3.6 actually emits one. - Alpha order verified at runtime via the dist bundle: match: true. - Sweeper logic not traffic-tested (no stuck rows to find), but the SQL UPDATE + broker.publishUser pattern is identical to handleAbort and the boot sweep — synthesis-only verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:46:03 +00:00
indifferentketchup	ac1a71f583	v1.13.1-C: port ask_user_input correlation to parts + wire reasoning_parts end-to-end Pass 1 — ask_user_input correlation port (messages.ts:478, :549): - The two correlation queries that backed the elicitation flow used to scan messages.tool_calls and messages.tool_results JSON columns directly. They now JOIN message_parts on payload->>'id' (for the caller assistant) and payload->>'tool_call_id' (for the pending tool row). Semantics preserved: ORDER BY m.created_at DESC LIMIT 1 still picks the latest issuance, the already-answered 409 guard now reads payload.output, and the UPDATE + parts replace inside sql.begin is unchanged from v1.13.0. - Pre-v1.13.0 history has no parts rows and is unreachable to this lookup path (404). Acceptable per dispatch decision — no pending elicitation from before v1.13.0 will still be open. JSON-column fallback can land as a hotfix if it ever surfaces. Pass 2 — reasoning_parts wired end-to-end: - types.ts/StreamResult gains `reasoning: string`. stream-phase.ts accumulates reasoning-delta text per stream (replacing the v1.13.1-A counter-only diagnostic) and returns it on the result. - parts.ts/partsFromAssistantMessage gains an optional `reasoning` param. When present it emits a kind='reasoning' part at sequence 0, ahead of the text and tool_call parts. - error-handler.ts/finalizeCompletion and tool-phase.ts/executeToolPhase both thread result.reasoning into the dual-write call so reasoning-channel models (qwen3.6) get persistent reasoning rows. - payload.ts: loadContext SELECT pulls reasoning_parts from the v1.13.1-B view; OpenAiMessage gains an optional `reasoning` field; buildMessagesPayload collapses reasoning_parts into a single string per assistant message. - stream-phase.ts/toModelMessages converts assistant messages with reasoning into an AI SDK ModelMessage content array starting with a ReasoningPart, matching the @ai-sdk/provider-utils AssistantContent union. Reasoning models can now replay prior reasoning context across tool-call boundaries. - types/api.ts and apps/web/src/api/types.ts Message interface gain reasoning_parts (optional, nullable). Frontend doesn't render this yet — field reserved for a v1.14 UI surface. Tests: 2 new in parts.test.ts cover reasoning-at-sequence-0 with and without text content. 172 tests pass (170 prior + 2 new). Smoke verified against the live container: - A reasoning-prompt ("walk through 17 × 23 step by step") produced one message with kind='reasoning' (361 chars) at sequence 0 and kind='text' (429 chars) at sequence 1. Adapter log confirmed reasoning capture. - The new correlation SQL was validated against existing tool_call / tool_result parts: returns the expected message_id + payload shape with pending state correctly identified via payload.output IS NULL. - ask_user_input end-to-end through the UI is Sam's smoke — the Prompt Builder agent does not always trigger ask_user_input for these prompts, so synthetic verification via SQL substituted for traffic-driven cover. Annotation: the v1.13.1-A abort-throw site in stream-phase.ts got a one-liner comment ("AI SDK v6 fullStream returns normally on abort; check signal explicitly.") to prevent a future refactor removing it. v1.13.2 drops the dual-write + the JSON columns + collapses the view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:34:10 +00:00
indifferentketchup	c2c4f78a26	v1.13.1-A: install AI SDK v6 + swap streamText into stream-phase.ts adapter - Add ai@^6 and @ai-sdk/openai-compatible@^2 to apps/server. - New services/inference/provider.ts: createOpenAICompatible against llama-swap (baseURL threaded from config.LLAMA_SWAP_URL, cached per baseURL). No apiKey — Authelia + Tailscale gate llama-swap, not keys. - streamCompletion rewritten as an adapter over streamText. AI SDK fullStream parts (text-delta, tool-call, finish, error) map back to the legacy {content?, tool_calls?, finishReason} StreamResult shape that executeStreamPhase already consumes. No layer above streamCompletion changes. - toModelMessages converts BooCode's OpenAI-shaped history to AI SDK ModelMessage[]; tool messages need toolName which we look up by scanning earlier assistant tool_calls for the matching id. - buildAiTools wraps BooCode's JSON-schema tool defs via tool({ inputSchema: jsonSchema(parameters) }) with NO execute — BooCode dispatches tools in tool-phase.ts, not the AI SDK loop. - XML fallback parser preserved as-is — qwen3.6 still emits XML tool calls in text content that the structured tool-call layer misses. - reasoning-delta parts dropped with a debug-level counter — captured properly in v1.13.1-C. - Abort path: streamText({ abortSignal }) wires ctx.signal through, but AI SDK v6 swallows the abort (fullStream iterator exits cleanly rather than throwing). Post-iteration `if (signal?.aborted) throw` so handleAbortOrError owns the row and writes status='cancelled'. Caught by smoke D; would have shipped as status='complete' on stop otherwise. - Usage frame reads result.usage (inputTokens / outputTokens v6 names) AFTER stream drain. Single trailing publish through the existing 500ms throttle. Known regression: ChatThroughput's live mid-stream tick (v1.12.2) is gone — it now shows a single value at stream end. TODO(v1.13.1-followup): interpolate outputTokens during streaming via a delta-cadence counter (e.g. part.text.length/4 token proxy) and publish every 500ms; reconcile against result.usage at finish. - Write-path dual-write from v1.13.0 unaffected. Read path stays on JSON columns. v1.13.1-B flips reads to message_parts. Smoke verified end-to-end against running container: - A. Plain text: status='complete', 1 text part. - B. Single tool prompt → multi-tool chain (4 calls): every assistant with tool_calls has 2 parts (text+tool_call), every tool row has 1 part (tool_result). - C. Multi-step covered by B's chain. - D. Stop mid-stream: status='cancelled' written via handleAbortOrError after the post-iteration abort throw. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:17:56 +00:00
indifferentketchup	9ef00c0268	v1.12.4: complete inference.ts split into services/inference/ - sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel, runDoomLoopSummary, insertDoomLoopSentinel - inference.ts → inference/turn.ts: residue is runAssistantTurn, runInference, createInferenceRunner orchestration only - inference/index.ts: re-export shim preserves the public surface (createInferenceRunner, runInference, runAssistantTurn, detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/FramePublisher) - src/index.ts + auto_name.ts + the two vitest test files updated to import from ./services/inference/index.js explicitly (NodeNext ESM doesn't honor directory-index resolution) Final tally: 11 files under services/inference/, the largest being sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept side-by-side until a third sentinel justifies factoring out a shared runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is stream-phase.ts at 380. Public import surface unchanged. tool-phase.ts → turn.ts back-edge for runAssistantTurn remains (cycle is safe; resolved at call time). Prepares the file structure for v1.13 AI SDK migration — streamText swap targets stream-phase.ts only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:36:35 +00:00
indifferentketchup	c87df6981a	v1.12.4-rc3: extract stream-phase + tool-phase from inference.ts - stream-phase.ts: streamCompletion, executeStreamPhase (plus sseLines, StreamOptions, ChatCompletionDelta/Chunk as private helpers) - tool-phase.ts: executeToolPhase + private executeToolCall - types.ts: shared StreamPhaseState + DB_FLUSH_INTERVAL_MS so the summary functions still in inference.ts can reference them without pulling from a phase file Cycle: executeToolPhase recurses into runAssistantTurn, which stays in inference.ts. Resolved by direct value back-edge — tool-phase.ts does `import { runAssistantTurn } from '../inference.js'` and runAssistantTurn is now exported. Safe because the dereference happens inside an async function body, after both modules have fully evaluated. No callback-through-args fallback needed. inference.ts shrinks from ~1401 to ~828 LoC. Final Dispatch D moves the sentinel summaries out and renames the residue to inference/turn.ts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:28:23 +00:00
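
The tool-glob rule described in d27a977d59 (patterns evaluated in order, the last matching pattern wins, and a leading `!` denies) can be sketched roughly as below. This is not the repository's `matchToolGlob`. The function names, the use of `*` as the only wildcard, and the `context7_lookup` tool name are assumptions made for illustration.

```typescript
// Hypothetical sketch of last-match-wins glob filtering with a `!` deny prefix.
// Assumption: `*` is the only wildcard and matches any run of characters.

function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters (everything except `*`), then expand `*` to `.*`.
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

function isToolAllowed(toolName: string, patterns: string[]): boolean {
  // A tool that no pattern matches stays disallowed in this sketch.
  let allowed = false;
  for (const raw of patterns) {
    const deny = raw.startsWith("!");
    const glob = deny ? raw.slice(1) : raw;
    if (globToRegExp(glob).test(toolName)) {
      // Later matches override earlier ones (last-match-wins).
      allowed = !deny;
    }
  }
  return allowed;
}

// Example: allow everything, then deny the web_ family.
const patterns = ["*", "!web_*"];
console.log(isToolAllowed("view_file", patterns)); // true
console.log(isToolAllowed("web_fetch", patterns)); // false
```

Because order decides the outcome, a list such as `["!web_*", "web_fetch"]` re-allows `web_fetch` after the broader deny, while the reverse order denies it.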

10 Commits
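
The unknown-tool hint from 2e1a81de72 (suggest a known tool when the Levenshtein distance is at most 3 or one name contains the other) might look something like the sketch below. The function names mirror those listed in the commit message, but the signatures, the tie-breaking, and the exact error wording are assumptions, not the code in tool-suggestions.ts.

```typescript
// Hypothetical sketch of a "Did you mean" hint for unknown tool names.

function levenshtein(a: string, b: string): number {
  // Single-row dynamic programming over the edit-distance table.
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prevDiag = row[0]; // table[i-1][j-1] for the upcoming j
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // table[i-1][j], saved before it is overwritten
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, prevDiag + cost);
      prevDiag = above;
    }
  }
  return row[b.length];
}

function suggestToolName(
  unknown: string,
  known: string[],
  maxDistance = 3,
): string | null {
  let best: string | null = null;
  let bestScore = Infinity;
  for (const name of known) {
    // Assumption: a substring match in either direction counts as score 0.
    const isSubstring = name.includes(unknown) || unknown.includes(name);
    const score = isSubstring ? 0 : levenshtein(unknown, name);
    if (score <= maxDistance && score < bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return best;
}

function formatUnknownToolError(unknown: string, known: string[]): string {
  const hint = suggestToolName(unknown, known);
  const base = `unknown tool: ${unknown}`;
  return hint ? `${base}. Did you mean: ${hint}?` : base;
}

const tools = ["git_status", "list_dir", "view_file", "web_fetch"];
console.log(formatUnknownToolError("veiw_file", tools));
```

In this sketch the first candidate with the lowest score wins, so the result depends on the order of the known-tool list when several names tie.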