boocode

Author	SHA1	Message	Date
indifferentketchup	2e1a81de72	v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6 turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent because Claude Code documentation in its pre-training corpus uses that shape). ## Parser extension xml-parser.ts now recognizes BOTH XML tool-call flavors: - Qwen/Hermes: <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call> - Anthropic: <invoke name="NAME"><parameter name="K">V</parameter></invoke> Both route through the same synthetic-id xml_call_${idx} ToolCall path. extractToolCallBlocks() and partialXmlOpenerStart() handle both openers (<tool_call> and <invoke...) so partial buffers don't get prematurely flushed during streaming. The existing Qwen parser was tightened to tolerate whitespace around `=` (<function = name>, <parameter = key>...) so a stray space doesn't get absorbed into the function name. Name capture is non-whitespace, non-`>`. ## Unknown-tool recovery hint New tool-suggestions.ts exports levenshtein() + suggestToolName() + formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the model now includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME). Targets the qwen3.6 drift to read_file → suggest view_file. Applies to all unknown tool names, not just <invoke>-derived ones — at the dispatch layer we no longer know which format produced the call, and the extra signal is harmless for Qwen-derived calls. ## Test coverage xml-parser.test.ts: 46 tests, all green. 
Covers both parsers (well-formed, malformed, multi-parameter, nested-content), the partial-opener detector for both flavors, the unified extraction helper, and the unknown-tool error formatter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:59:25 +00:00
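The unknown-tool hint from v1.13.16-xml-parser can be sketched roughly as follows. This is an illustration only, not the contents of tool-suggestions.ts: the three function names come from the commit message, while the signatures, the "closest qualifier wins" ranking, and the exact error wording are assumptions.

```typescript
// Hypothetical sketch of the "Did you mean" hint; signatures and ranking are assumptions.

// Classic dynamic-programming edit distance, keeping two rows in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// A candidate qualifies when it is within maxDistance edits of the requested
// name, or when one name contains the other. The closest qualifier wins.
function suggestToolName(name: string, known: string[], maxDistance = 3): string | null {
  const needle = name.trim().toLowerCase();
  if (needle === "") return null;
  let best: string | null = null;
  let bestDistance = Number.POSITIVE_INFINITY;
  for (const candidate of known) {
    const hay = candidate.toLowerCase();
    const distance = levenshtein(needle, hay);
    const isSubstring = hay.includes(needle) || needle.includes(hay);
    if ((distance <= maxDistance || isSubstring) && distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Error text returned to the model; the wording here is illustrative.
function formatUnknownToolError(name: string, known: string[]): string {
  const hint = suggestToolName(name, known);
  return hint ? `unknown tool: ${name}. Did you mean: ${hint}?` : `unknown tool: ${name}`;
}
```

Running the check at the dispatch layer, as the commit describes, means the same hint applies whichever XML flavor produced the call.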
indifferentketchup	8b568b36d3	v1.13.11-a: WS frame schemas + frontend receive validation First half of the WebSocket-frame-typing batch (split per recon — total scope was ~535 LoC, larger than the roadmap's ~300 estimate, so the server-side publish-site conversion lands separately in v1.13.11-b). Phase A scope: (1) apps/server/src/types/ws-frames.ts (NEW) — Zod schemas for all 27 wire-format WS frame types. Discriminated union (WsFrameSchema) plus KNOWN_FRAME_TYPES const for diagnostic lookup. UUIDs are z.string().uuid(); model-emitted tool_call_id stays z.string().min(1) since OpenAI-compatible APIs emit "call_<random>" not UUID. Per-kind payload narrowing (tool args, message_parts payloads) intentionally stays z.unknown() — frame-level drift detection is the goal; deep payload validation is follow-up work. (2) apps/web/src/api/ws-frames.ts (NEW) — byte-identical mirror of the authoritative server file. No path alias from web→server in the existing tsconfig setup; sync-by-hand was chosen over a new packages/shared/ dir. A ws-frames.test.ts test asserts the two files match. (3) apps/server/src/services/broker.ts — adds publishFrame() and publishUserFrame() methods to the Broker interface. Both validate via WsFrameSchema and fail-closed: log + drop on invalid. createBroker now accepts an optional FastifyBaseLogger so validation failures land in the pino stream (with console.error fallback for unit tests). The existing publish() / publishUser() raw methods stay legal — they get converted to the typed variants in v1.13.11-b. (4) apps/web/src/hooks/useSessionStream.ts + useUserEvents.ts — wrap ws.onmessage with WsFrameSchema.safeParse. Fail-closed: invalid frames log + return without dispatching. Hand-maintained WsFrame and SessionEvent types stay in place; one cast bridges Zod-typed → narrowed shape (Zod uses OpaqueObject for nested Message[] / WorkspacePane[] etc., which are dev-time-narrowed via the existing hand-maintained types). 
(5) apps/web/package.json — adds zod ^3.23.8 as a direct dep. Was a transitive dep via ai-sdk / postgres; promotion makes the import legal. (6) Tests: 15 new in ws-frames.test.ts covering happy-path per major frame type, drift-catchers (unknown type, invalid enum, non-UUID, negative tokens), parts-authoritative read variants, the mirror-file diff check, and four broker fail-closed scenarios. 219/219 server tests pass (was 204; +15 new). Two recon corrections to the dispatch brief, both flagged before implementation: - No 'parts_appended' frame exists. The brief assumed one; the codebase reads parts via the messages_with_parts view after message_complete triggers a refetch. MessagePartSchema is therefore unused this batch. - No 'tool_running' frame exists. The brief listed it as standalone; it is in fact a 'chat_status' variant ({ status: 'tool_running' }), already covered by ChatStatusFrame. Smoke: clean container boot, no validation errors in the server log. Real production frames pass validation (the schemas were derived from the existing hand-maintained types in api/types.ts and sessionEvents.ts). v1.13.11-b will follow immediately: convert all ~85 raw broker.publish / ctx.publish call sites across 11 server files to publishFrame / publishUserFrame. Mechanical edit; the wiring done here means the diff in -b is just the call-site swaps. ~310 LoC across 9 files (4 new + 5 modified).	2026-05-22 15:48:32 +00:00
indifferentketchup	34cbecf975	v1.13.15-tools: tiered tool loading via BOOCODE_TOOLS env var Pattern lift from eyaltoledano/claude-task-master (MIT + Commons Clause — pattern only, no code lift). Adds BOOCODE_TOOLS env var with three tiers: - core (4 tools): view_file, list_dir, grep, find_files. ~2k token schema cost. - standard (15 tools): core + web_search, web_fetch, git_status, all 8 codecontext_* tools. ~10k token schema cost. - all (default; current behavior): every tool in ALL_TOOLS (20). ~21k token schema cost. The env var is a CEILING — narrows agent whitelists, never expands. Default behavior unchanged when var is unset. resolveToolTier is case-insensitive and falls back to 'all' on unknown values. CORE_TOOL_NAMES + STANDARD_TOOL_NAMES validated at module load against TOOLS_BY_NAME via two top-level for-loops that throw on the first missing name. Module fails to import if a tier references a tool that doesn't exist in the registry — catches typos and stale tier definitions at boot rather than silently filtering valid tools out of agent whitelists. Wiring: agents.ts parseAgentBlock now reads BOOCODE_TOOLS from process.env per parse, intersects with the agent's declared frontmatter tools (or DEFAULT_TOOLS when frontmatter omits the field). Per-parse read is fine — agents are re-parsed on the existing 60s cache TTL. Tests: tools.test.ts grows from 1 to 10 tests. Covers resolveToolTier across tiers/case/unknown values + the CORE-subset-of-STANDARD invariant + TOOLS_BY_NAME existence for both tier sets. 204/204 pass (was 195; +9 new). Deviation from the brief: the codecontext tools in the actual registry have NO codecontext_* prefix (the brief's STANDARD list assumed it). Used the actual names (get_codebase_overview, search_symbols, etc.). Module-load validation would have failed boot with the prefixed names. Smoke: with BOOCODE_TOOLS unset, agents return their full 12-tool whitelists. 
With BOOCODE_TOOLS=core in .env + container restart, the same agents narrow to 4 tools (find_files, grep, list_dir, view_file) — intersection of declared whitelist ∩ core tier. Reverted after confirmation. CLAUDE.md updated with BOOCODE_TOOLS in the Environment section's Optional list. .env.example gained a commented BOOCODE_TOOLS=all line with the per-tier token-cost table. ~110 LoC across 5 files (4 modified + 1 test expansion). Under the brief's ~30 LoC estimate for code; the test suite expansion drove most of the growth.	2026-05-22 14:59:01 +00:00
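The "ceiling" behavior of BOOCODE_TOOLS in v1.13.15-tools can be pictured with the short sketch below. It is a sketch under assumptions: resolveToolTier and the four core tool names come from the commit message, but the signatures and the applyTierCeiling helper are hypothetical, and the standard tier is left out because the commit does not list all of its tool names.

```typescript
// Hypothetical sketch of the BOOCODE_TOOLS ceiling; signatures and helper names are assumptions.
type ToolTier = "core" | "standard" | "all";

// Case-insensitive; an unset or unknown value falls back to "all".
function resolveToolTier(raw: string | undefined): ToolTier {
  const value = (raw ?? "").trim().toLowerCase();
  return value === "core" || value === "standard" ? value : "all";
}

// The four core tools named in the commit message.
const CORE_TOOL_NAMES: string[] = ["view_file", "list_dir", "grep", "find_files"];

// The tier is a ceiling: it can remove tools from an agent's declared
// whitelist but never add one. Passing null means "no ceiling" (tier "all").
function applyTierCeiling(declared: string[], allowed: string[] | null): string[] {
  if (allowed === null) return [...declared];
  return declared.filter((name) => allowed.includes(name));
}

// Example: an agent that declares five tools, run under BOOCODE_TOOLS=Core.
const tier = resolveToolTier("Core");
const declaredTools = ["find_files", "git_status", "grep", "view_file", "web_fetch"];
const effectiveTools = applyTierCeiling(declaredTools, tier === "core" ? CORE_TOOL_NAMES : null);
console.log(tier, effectiveTools);
```

Because the result is an intersection, a tool that appears in the tier but not in the agent's own whitelist never becomes available.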
indifferentketchup	9ce638c916	v1.13.10: per-tool token cost accounting (rolling 100-call view) Surfaces per-tool prompt/completion-token rolling averages in AgentPicker for at-a-glance agent-cost hints. Implementation is a SQL view on top of messages_with_parts plus a read endpoint and AgentPicker tooltip extension. No new write site; all source data already lands via the existing tool-phase.ts:94-95 / error-handler.ts:109-110 / sentinel-summaries.ts UPDATEs that v1.13.7's includeUsage: true fix made non-NULL. (1) schema.sql — new tool_cost_stats view. Window-functions over messages_with_parts.tool_calls with LATERAL jsonb_array_elements. Attribution: equal split — multi-tool turn divides tokens N-ways; the 100-call rolling mean absorbs split noise. Filters: status='complete' + metadata.kind NOT IN ('cap_hit','doom_loop') exclude failed turns and sentinels respectively; tool_calls IS NOT NULL is defense-in-depth since sentinels are role='system' rows. CREATE OR REPLACE means schema apply is idempotent. (2) routes/tools.ts NEW + index.ts wire-in. GET /api/tools/cost_stats returns { stats: ToolCostStat[] } with mean_prompt_tokens / mean_completion_tokens computed at read time (sum / n_calls). Sorted by tool_name ASC. No pagination — ≤30 tools. (3) __tests__/tool_cost_stats.test.ts NEW — 7 integration tests keyed off DATABASE_URL env var. Tests skip gracefully when unset (no-DB default). beforeAll applies the schema via sql.unsafe(readFileSync(schema.sql)) for self-contained runs. Helper insertAssistantTurn shared across cases. Covers: empty state, single-tool attribution, multi-tool equal split, 100-call FIFO window, NULL-tokens exclusion, parts-authoritative read via messages_with_parts, failed/sentinel exclusion. (4) web/api/types.ts + client.ts — ToolCostStat interface + api.tools.costStats() method binding. 
(5) AgentPicker.tsx — fetch costStats on mount, compute per-agent sum-of-means across whitelisted tools, render muted cost line below description: "~5.2k prompt / 280 completion · 6/8 tools · last call 3h ago". Skips line entirely when no tool history; preserves existing native title= for layout backward-compat. formatK/formatAgo colocated. Tests: 202/202 pass (195 prior + 7 new view-integration). Server + web tsc clean. Smoke: schema applied cleanly; GET /api/tools/cost_stats returns canonical JSON; view + endpoint agree. Single-row result expected given the v1.13.1-A → v1.13.7 NULL latent regression window; new traffic populates organically. Roadmap row at boocode_roadmap.md:114 plus schema row at :474 both match. View vs table decision documented in handoff_v1.13.10_per_tool_cost.md (rollback-safe, microsecond-fast at BooCode scale). ~270 LoC across 8 files (5 modified + 3 new).	2026-05-22 14:42:09 +00:00
indifferentketchup	b06a4a8e55	v1.13.9: compaction overflow trigger — 0.85 × ctx_max early trigger Opencode pattern (session/overflow.ts): fire compaction at 85% of ctx_max, replacing the v1.11.0-era `ctx_max - 20_000` formula. Old formula: usable = ctx_max - 20_000 - ctx=262144 → trigger at 242144 (92.4%) — only 7.6% headroom - ctx=100000 → trigger at 80000 (80.0%) - ctx= 32000 → trigger at 12000 (37.5%) — over-eager - ctx<=20000 → trigger at 0 — never fires New formula: usable = floor(0.85 * ctx_max) - ctx=262144 → trigger at 222822 (85.0%) — 15% headroom for summarizer - ctx=100000 → trigger at 85000 (85.0%) - ctx= 32000 → trigger at 27200 (85.0%) - ctx= 8192 → trigger at 6963 (85.0%) Ratio gives consistent headroom at any context scale. The qwen3.6 daily driver gets ~19k tokens more breathing room before overflow; small-ctx models no longer degenerate to never-triggering. usable() is the only consumer of COMPACTION_BUFFER → constant deleted. New EARLY_TRIGGER_RATIO constant takes its place. isOverflow() and the maybeFlagForCompaction() call site at payload.ts:184 are unchanged — formula swap is internal to compaction.ts. payload.ts comment touched only to drop the stale COMPACTION_BUFFER reference (PRUNE_TRIGGER_TOKENS stays at 20k as the prune-freed threshold; independent of the overflow formula). Tests: 4 new usable() corner cases (262k/100k/8k/zero+negative), plus 5 isOverflow() numbers shifted to match the 85k budget at ctx=100k. 195/195 server tests pass (was 194). Smoke: ratio math verified by unit tests at all four corners. Live cap-hit verification deferred — requires accumulating >222k tokens in a session under qwen3.6-35b-a3b-mxfp4 (was >242k pre-fix); will surface organically in extended use.	2026-05-22 13:59:14 +00:00
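The formula swap in v1.13.9 is small enough to restate as code. The sketch below reproduces the numbers quoted in the commit message; EARLY_TRIGGER_RATIO, usable, and isOverflow are names from the commit, while the signatures, the guard for non-positive input, and the `>=` comparison are assumptions.

```typescript
// Sketch of the v1.13.9 trigger math; signatures and edge-case handling are assumptions.
const EARLY_TRIGGER_RATIO = 0.85;

// New formula: a fixed fraction of the context window.
function usable(ctxMax: number): number {
  if (!Number.isFinite(ctxMax) || ctxMax <= 0) return 0;
  return Math.floor(EARLY_TRIGGER_RATIO * ctxMax);
}

// Old formula, kept only for comparison: a fixed 20k-token buffer.
function usableLegacy(ctxMax: number): number {
  return Math.max(0, ctxMax - 20_000);
}

// Assumed semantics: overflow once the token count reaches the usable budget;
// a zero budget never fires.
function isOverflow(tokens: number, ctxMax: number): boolean {
  const budget = usable(ctxMax);
  return budget > 0 && tokens >= budget;
}

for (const ctx of [262144, 100000, 32000, 8192]) {
  console.log(ctx, "old:", usableLegacy(ctx), "new:", usable(ctx));
}
```

For ctx=262144 the two budgets differ by 242144 - 222822 = 19322 tokens, which is the "~19k tokens more breathing room" the commit mentions.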
indifferentketchup	a0c8d212cb	v1.13.8: system-prompt prefix stability verify-and-measure Recon during planning disproved the original v1.13.7 (DB-cache) premise: buildSystemPrompt already runs over inputs mtime-cached at the file layer (BOOCHAT.md in system-prompt.ts:25, AGENTS.md global+per-project in agents.ts:245), and DB scalars are byte-stable until edited. The output is microsecond pure-string concat with no I/O. Skills aren't in the prefix; tools live in a separate request body field alpha-sorted by v1.13.3. This batch closes the verification gap with instrumentation, not implementation: - system-prompt.ts: buildSystemPromptWithFingerprint canonical impl computes SHA-256 over the assembled prefix, runs a per-session Map<sessionId, lastHash> observer, emits PrefixFingerprint per call and PrefixDrift (with field-level changed_inputs) on hash change. buildSystemPrompt is now a thin shim returning .prompt. - agents.ts: getAgentsMtimes accessor — cache-read only, no I/O. - payload.ts: buildMessagesPayload takes optional log argument; when passed, emits prefix-fingerprint (info) + prefix-drift (warn). - turn.ts + sentinel-summaries.ts: pass ctx.log at 3 production call sites; sentinel summaries log too so any drift across cap-hit / doom-loop paths surfaces. - system-prompt.test.ts: 4 new tests (byte-identical, no-drift-on-stable, drift-fires-with-changed-inputs, cross-session-no-drift). 194/194 tests pass (was 190). Smoke: 5 messages in a fresh session produced 7 prefix-fingerprint logs (extras from buildMessagesPayload being called from sentinel summary paths), all with identical prefix_hash and prefix_length=2907, zero prefix-drift. Prefix is byte-stable in steady-state. Decision: original system_prompt_cache DB table from the roadmap is permanently dropped. The v1.12.0 mtime caches at the input layer plus alpha tool ordering at the request body (v1.13.3) already address the load-bearing cache-stability surfaces. 
Instrumentation stays so the claim can be re-verified at any time.	2026-05-22 13:42:18 +00:00
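The fingerprint-and-drift observer from v1.13.8 might look roughly like the sketch below. It assumes Node's built-in crypto module; observePrefix and the PrefixObservation shape are hypothetical names, and the field-level changed_inputs reporting described in the commit is omitted.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape; the real PrefixFingerprint / PrefixDrift types are not shown in the log.
interface PrefixObservation {
  prefixHash: string;
  prefixLength: number;
  drifted: boolean;
}

// Last hash seen per session, mirroring the Map<sessionId, lastHash> observer.
const lastHashBySession = new Map<string, string>();

// Hash the assembled prefix and report drift only when this session has a
// previous hash and it differs from the new one.
function observePrefix(sessionId: string, prompt: string): PrefixObservation {
  const prefixHash = createHash("sha256").update(prompt, "utf8").digest("hex");
  const previous = lastHashBySession.get(sessionId);
  lastHashBySession.set(sessionId, prefixHash);
  return {
    prefixHash,
    prefixLength: prompt.length,
    drifted: previous !== undefined && previous !== prefixHash,
  };
}
```

Keying the map by session is what makes the cross-session case a non-event: two sessions with different prompts each compare only against their own history.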
indifferentketchup	81d837c04e	v1.13.6: compaction head-assembly audit + reasoning fix Audit traced compaction's summary path post-v1.13.1-B read flip: - Q1: reads from messages_with_parts (view) — clean - Q2: parts shape correctly threaded through buildHeadPayload — clean - Q3: reasoning omitted from summary input — FIX NEEDED v1.13.1-C wired reasoning end-to-end into inference/payload.ts but missed this read site. Summarizer model couldn't see the reasoning trail for tool-bearing turns, quietly degrading summary quality for reasoning-channel models (qwen3.6). Fix: - CompactionMessage extended with reasoning_parts field - SELECT pulls reasoning_parts from messages_with_parts - buildHeadPayload (now exported for tests) prefixes assistant content with <reasoning>...</reasoning>\n\n<content>... when reasoning is present; standalone <reasoning>...</reasoning> for tool-call-only turns; omits the tag when reasoning is null or empty 4 new render branch tests (190 total). Smoke deferred: forcing real compaction requires either threshold pollution or building up a >40k-token chat with reasoning_parts. Render branches are unit-covered; integration would only re-prove structural correctness.	2026-05-22 08:18:47 +00:00
indifferentketchup	f8fc5db929	v1.13.5: opencode truncate.ts port — full tool output retrievable via opaque id - New services/truncate.ts. Tmpfs storage at /tmp/boocode-truncations/ (BOOCODE_TRUNCATION_DIR env var overrides for tests). 12-char base32 opaque ids (~60 bits entropy, "tr_<id>"). Three exports: storeTruncation, readTruncation, truncateIfNeeded (wrap-or-passthrough helper). cleanupTruncations does TTL-pass (7 days) + orphan-reap (parts query on payload->'output'->>'outputPath') in one shot. - Wired four tools through truncateIfNeeded: view_file (raw full file), list_dir (full filtered+secret-filtered entries serialized one-per-line), web_fetch (textRaw pre-slice), codecontext_client (body.result pre-slice). Each returns the existing sliced view plus an optional outputPath field when truncation fires. - New view_truncated_output ToolDef. Resolves opaque id → on-disk content internally; model never sees the truncation dir. Same start_line / end_line slicing semantics as view_file. Registered in ALL_TOOLS (alpha sort places it after view_file automatically) and READ_ONLY_TOOL_NAMES. - cleanupTruncations piggybacks on the v1.13.3 stuck-row sweeper's 60s setInterval. No-op when truncation dir is empty. Not wired (TODO follow-up): grep and find_files. file_ops returns post-cap results to the tool execute path, so the "full content" isn't recoverable without a refactor of fileOps.grep / fileOps.findFiles to expose the uncapped result. web_search is silent-slice (no truncated flag); outside scope. Five sites of seven covered; the remaining two are the only ones needing a file_ops change. Tests: 7 new in truncate.test.ts (roundtrip, unknown id, malformed id, truncateIfNeeded false/true/over-cap/storage-failure paths). 186 total (was 179). cleanupTruncations file-system half implicitly via TTL pass; orphan-reap branch covered by the live container smoke. 
Smoke verified end-to-end against the live container: - view_file with start_line=1, end_line=3 on CLAUDE.md → tool_result part carried outputPath "tr_cdpn1o04k6ma" + truncated=true. - /tmp/boocode-truncations/tr_cdpn1o04k6ma exists, 15876 bytes, mode 0o600, parent dir mode 0o700. - Follow-up view_truncated_output(id, start_line=50, end_line=55) returned the actual lines 50-55 of CLAUDE.md (the 808notes/BooCode bullets). - ALL_TOOLS count=20 (was 19); alpha sort places view_truncated_output between view_file and watch_changes. Closes a v1.12 catalog row that was scoped but deferred. The v1.13 parts table made outputPath ride on the existing tool_result payload with no schema change beyond the storage helper itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:55:55 +00:00
indifferentketchup	ec8593cf77	v1.13.4: two-tier compaction prune — opencode pattern half-shipped in v1.11.0 - message_parts.hidden_at timestamptz column (NULL by default) with a partial index on (message_id) WHERE hidden_at IS NULL for the common visible-parts filter. - messages_with_parts view changed from COALESCE(parts, legacy) to CASE WHEN EXISTS(any parts of kind) THEN visible-parts ELSE legacy. COALESCE would have leaked hidden parts back via the legacy fallback when every part was pruned (smoke caught it pre-commit). The CASE distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → return null/empty so the row drops out of the model payload" exactly. - prune.ts: scans tool_result parts newest-first, protects the last 40k tokens (PROTECTED_TOKENS), marks older candidates hidden when their combined estimate clears 20k (PRUNE_TRIGGER_TOKENS — equal to COMPACTION_BUFFER from v1.11.0, so a successful prune is exactly the budget the summary path would have freed). Stops at chats.tail_start_id so it doesn't double-erase across the last summary boundary. Pure decision helper selectPruneTargets exported separately for unit tests. - Wired into maybeFlagForCompaction: prune runs synchronously when overflow is detected; if it freed >= PRUNE_TRIGGER_TOKENS, the needs_compaction flag is NOT set and the (expensive) summary inference call is skipped this turn. The next turn's overflow check re-evaluates from scratch. - 6 new unit tests in prune.test.ts cover: empty input, protection-only (no candidates), candidates below trigger, candidates above trigger, candidates straddling a summary boundary, exactly-protection-tokens. 179 tests total (was 173). Smoke verified post-rebuild: - \d message_parts shows hidden_at + partial index. - View definition shows AND p.hidden_at IS NULL filters on all three subselects. 
- Synthetic hide-then-restore confirmed the view drops the tool_result jsonb to null when its only part is hidden, and restores when un-hidden. - EXPLAIN ANALYZE on the 42-message stress chat: 0.325ms (faster than v1.13.1-B's 1.018ms — EXISTS short-circuits cleanly for the common no-parts case). - Normal turn (plain text prompt) completes unaffected. Closes a v1.11.0 design item that was scoped but never implemented. With v1.13's parts table the prune is dramatically cheaper to write — pre-parts it would have meant editing JSON blobs in-place; now it's a hidden_at flag and a view subselect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:02:17 +00:00
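The prune decision in v1.13.4 can be sketched as a pure function along the lines below. The constants and the selectPruneTargets name come from the commit message; the input shape, the handling of the part that straddles the protection boundary, and the omission of the tail_start_id stop are assumptions made to keep the example short.

```typescript
// Hypothetical sketch of the prune decision; input shape and boundary handling are assumptions.
const PROTECTED_TOKENS = 40_000;
const PRUNE_TRIGGER_TOKENS = 20_000;

interface PartEstimate {
  id: string;
  tokens: number; // estimated token count of one tool_result part
}

// Input is ordered newest-first. Parts are protected until the running total
// reaches PROTECTED_TOKENS (the part that crosses the line stays protected).
// Older parts become candidates, and they are returned only when together
// they would free at least PRUNE_TRIGGER_TOKENS.
function selectPruneTargets(partsNewestFirst: PartEstimate[]): string[] {
  let protectedSoFar = 0;
  let candidateTokens = 0;
  const candidates: string[] = [];
  for (const part of partsNewestFirst) {
    if (protectedSoFar < PROTECTED_TOKENS) {
      protectedSoFar += part.tokens;
      continue;
    }
    candidates.push(part.id);
    candidateTokens += part.tokens;
  }
  return candidateTokens >= PRUNE_TRIGGER_TOKENS ? candidates : [];
}
```

Returning an empty list when the candidates fall short of the trigger keeps the caller's rule simple: a non-empty result means the summary inference call can be skipped for this turn.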
indifferentketchup	a08d809b73	v1.13.3: cleanup bundle — statement timeout + alpha ordering + stuck-row sweeper + repairToolCall Four independent items, all owed from prior dispatches. - statement_timeout at the database level via: ALTER DATABASE boocode SET statement_timeout = '30s'; Applied operationally; documented as a comment at the top of schema.sql (ALTER DATABASE can't run inside a DO block, so it's not idempotent inside applySchema). Re-apply after a volume reset. - Tool registry alpha-sorted at module load. llama.cpp's prompt cache hits on byte-identical prefixes; any reordering of the tool list near the top of the system prompt would invalidate every cached turn. Single-source sort at the ALL_TOOLS export so toolJsonSchemas() and TOOLS_BY_NAME inherit the order automatically. New tools.test.ts asserts the invariant; total tests 173 (was 172). - Periodic in-process stuck-row sweeper. Runs every 60s, marks 'streaming' rows older than 5 minutes as 'failed', and publishes chat_status='idle' on the user channel so the UI dot drops without a refresh. Closes the mid-session crash UX gap; the v1.12.1 boot sweep only fires once at startup, so sessions used to stay stuck until next reboot. setInterval cleaned up via app.addHook('onClose'). Mirrors handleAbortOrError's publish pattern. - experimental_repairToolCall wired through AI SDK v6 streamText. Pass-through implementation: log + return the original toolCall so the stream keeps going. executeToolPhase's existing error paths (unknown tool name → 'unknown tool: X' result; zod-reject → 'tool X rejected — field: required') already surface bad calls to the model; the value here is preventing the AI SDK from THROWING on parse errors and killing the whole stream. Owed since v1.13.1-A. Smoke verified: - statement_timeout = '30s' confirmed via SHOW. - Tool path normal flow intact (list_dir prompt → tool_call → result → final assistant). 
No malformed tool calls in the test run; repair log will surface them when qwen3.6 actually emits one. - Alpha order verified at runtime via the dist bundle: match: true. - Sweeper logic not traffic-tested (no stuck rows to find), but the SQL UPDATE + broker.publishUser pattern is identical to handleAbort and the boot sweep — synthesis-only verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:46:03 +00:00
indifferentketchup	ac1a71f583	v1.13.1-C: port ask_user_input correlation to parts + wire reasoning_parts end-to-end Pass 1 — ask_user_input correlation port (messages.ts:478, :549): - The two correlation queries that backed the elicitation flow used to scan messages.tool_calls and messages.tool_results JSON columns directly. They now JOIN message_parts on payload->>'id' (for the caller assistant) and payload->>'tool_call_id' (for the pending tool row). Semantics preserved: ORDER BY m.created_at DESC LIMIT 1 still picks the latest issuance, the already-answered 409 guard now reads payload.output, and the UPDATE + parts replace inside sql.begin is unchanged from v1.13.0. - Pre-v1.13.0 history has no parts rows and is unreachable to this lookup path (404). Acceptable per dispatch decision — no pending elicitation from before v1.13.0 will still be open. JSON-column fallback can land as a hotfix if it ever surfaces. Pass 2 — reasoning_parts wired end-to-end: - types.ts/StreamResult gains `reasoning: string`. stream-phase.ts accumulates reasoning-delta text per stream (replacing the v1.13.1-A counter-only diagnostic) and returns it on the result. - parts.ts/partsFromAssistantMessage gains an optional `reasoning` param. When present it emits a kind='reasoning' part at sequence 0, ahead of the text and tool_call parts. - error-handler.ts/finalizeCompletion and tool-phase.ts/executeToolPhase both thread result.reasoning into the dual-write call so reasoning-channel models (qwen3.6) get persistent reasoning rows. - payload.ts: loadContext SELECT pulls reasoning_parts from the v1.13.1-B view; OpenAiMessage gains an optional `reasoning` field; buildMessagesPayload collapses reasoning_parts into a single string per assistant message. - stream-phase.ts/toModelMessages converts assistant messages with reasoning into an AI SDK ModelMessage content array starting with a ReasoningPart, matching the @ai-sdk/provider-utils AssistantContent union. 
Reasoning models can now replay prior reasoning context across tool-call boundaries. - types/api.ts and apps/web/src/api/types.ts Message interface gain reasoning_parts (optional, nullable). Frontend doesn't render this yet — field reserved for a v1.14 UI surface. Tests: 2 new in parts.test.ts cover reasoning-at-sequence-0 with and without text content. 172 tests pass (170 prior + 2 new). Smoke verified against the live container: - A reasoning-prompt ("walk through 17 × 23 step by step") produced one message with kind='reasoning' (361 chars) at sequence 0 and kind='text' (429 chars) at sequence 1. Adapter log confirmed reasoning capture. - The new correlation SQL was validated against existing tool_call / tool_result parts: returns the expected message_id + payload shape with pending state correctly identified via payload.output IS NULL. - ask_user_input end-to-end through the UI is Sam's smoke — the Prompt Builder agent does not always trigger ask_user_input for these prompts, so synthetic verification via SQL substituted for traffic-driven cover. Annotation: the v1.13.1-A abort-throw site in stream-phase.ts got a one-liner comment ("AI SDK v6 fullStream returns normally on abort; check signal explicitly.") to prevent a future refactor removing it. v1.13.2 drops the dual-write + the JSON columns + collapses the view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:34:10 +00:00
indifferentketchup	1cb6eee24c	v1.13.0: message_parts table + dual-write at every tool_calls/tool_results site Adds a granular message_parts table (one row per text/tool_call/tool_result chunk) without changing any read path. Old messages.content / tool_calls / tool_results columns remain authoritative for v1.13.0; this dispatch is write-only mirroring so the AI SDK migration in v1.13.1 can flip read authority without a backfill window. Schema: CREATE TABLE message_parts (id, message_id FK ON DELETE CASCADE, sequence int, kind text CHECK (text|tool_call|tool_result|reasoning|step_start), payload jsonb, created_at, UNIQUE (message_id, sequence)) New module services/inference/parts.ts with two pure derive helpers (partsFromAssistantMessage, partsFromToolMessage) and insertParts that fans out a multi-row INSERT via postgres-js. Wired dual-write at every site that writes tool_calls or tool_results: - tool-phase.ts: assistant finalize UPDATE, executed-tool UPDATE, ask_user_input sentinel UPDATE - messages.ts answer flow: DELETE pending tool_result part + INSERT answered one inside the existing sql.begin - skills.ts: synthetic assistant + tool INSERTs both inside existing tx - chats.ts fork: CTE clones parts via ROW_NUMBER pairing (source→dest message id mapping in one statement, no N+1) - error-handler.ts finalizeCompletion: text part for plain text-only assistant turns Deviation: tool-phase.ts finalize UPDATEs and finalizeCompletion text-part write are not wrapped in fresh sql.begin transactions. Safe in v1.13.0 because JSON columns are authoritative for reads. v1.13.1 must wrap these sites before flipping read authority — TODO comments added at each unwrapped site referencing v1.13.1. Tests: 8 new unit tests for the derive helpers in services/__tests__/parts.test.ts. Existing 162 tests untouched. 170 total. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 05:46:29 +00:00
indifferentketchup	9ef00c0268	v1.12.4: complete inference.ts split into services/inference/ - sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel, runDoomLoopSummary, insertDoomLoopSentinel - inference.ts → inference/turn.ts: residue is runAssistantTurn, runInference, createInferenceRunner orchestration only - inference/index.ts: re-export shim preserves the public surface (createInferenceRunner, runInference, runAssistantTurn, detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/FramePublisher) - src/index.ts + auto_name.ts + the two vitest test files updated to import from ./services/inference/index.js explicitly (NodeNext ESM doesn't honor directory-index resolution) Final tally: 11 files under services/inference/, the largest being sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept side-by-side until a third sentinel justifies factoring out a shared runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is stream-phase.ts at 380. Public import surface unchanged. tool-phase.ts → turn.ts back-edge for runAssistantTurn remains (cycle is safe; resolved at call time). Prepares the file structure for v1.13 AI SDK migration — streamText swap targets stream-phase.ts only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:36:35 +00:00
indifferentketchup	fce8c06932	Merge v1.11.10 + doc refinements onto v1.12.0 main # Conflicts: # CLAUDE.md	2026-05-21 15:22:46 +00:00
indifferentketchup	16c69a38a1	Merge v1.12 track B: codecontext sidecar # Conflicts: # apps/web/src/components/ToolCallLine.tsx # docker-compose.yml	2026-05-21 15:12:30 +00:00
indifferentketchup	a2e2481ef9	v1.12 track A: container guidance + skills	2026-05-21 15:11:04 +00:00
indifferentketchup	136e9538aa	v1.12 track B.2: codecontext tool wrappers + tests	2026-05-21 13:35:44 +00:00
indifferentketchup	3e1e17ecf6	v1.11.10: stream-cap response body at 5MB, abort on overflow	2026-05-21 02:27:31 +00:00
indifferentketchup	ab01e04d77	v1.11.9: manual redirect handling — re-run URL guard on each hop	2026-05-21 00:37:35 +00:00
indifferentketchup	4e67a265ac	v1.11.8: address review — inject fetcher, byte-count limit, redirect TODO	2026-05-20 21:40:11 +00:00
indifferentketchup	2fdbb05477	v1.11.8: web_search + web_fetch tools via SearXNG Adds two new tools registered through the existing ALL_TOOLS registry: - web_search hits SearXNG's JSON API (Fathom, internal Tailscale URL, no auth) and returns top results - web_fetch retrieves a URL's text content, gated by isPublicUrl (url_guard.ts) which blocks loopback / RFC1918 / Tailscale CGNAT / link-local / .local / .internal / non-http schemes Both tools are opt-in via the existing session.web_search_enabled flag (plumbed in v1.9, activated here). Default off. UI labels updated to "Enable web search and fetch" / "Web search and fetch" since fetch joins the same store. Counts against the v1.8.2 per-turn budget; covered by the v1.11.6 doom-loop guard. Native Node 20 fetch — no new prod dep. HTML stripping via regex (script and style content elided wholesale). 5MB body cap, 15s fetch timeout, 8000-char default output, 32000-char cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 21:38:02 +00:00
indifferentketchup	863452ae07	v1.11.7: secret-file deny list for codebase tools Ports continue.dev's DEFAULT_SECURITY_IGNORE_FILETYPES + ignored-dir lists into apps/server/src/services/secret_guard.ts plus a small BooCode additions block (id_rsa, credentials, .netrc, .kdbx). Tiny glob-to-regex matcher; no new prod dep. view_file hard-refuses via SecretBlockedError. list_dir / grep / find_files filter their results and surface a pathguard_note string field with the hidden count — never list the offending paths back. Named secret_guard.ts (not safety/pathGuard.ts) to avoid collision with the existing path_guard.ts which already exports a pathGuard() function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:55:50 +00:00
indifferentketchup	f92b0810c3	v1.11.6: doom-loop guard (3 identical tool calls aborts recursion)	2026-05-20 20:28:45 +00:00
indifferentketchup	89dcfb95dc	v1.11.3: fix ctx_max capture via /props endpoint - llama-server does not emit n_ctx in timings (confirmed empirically); dead code at inference.ts:479 and compaction.ts:300 never fired - New model-context.ts: cached fetch of /upstream/<model>/props with positive-cache (no TTL) and 60s negative-cache - Wired into all 4 ctx_max write sites: 3 in inference.ts (executeToolPhase, finalizeCompletion, runCapHitSummary) and 1 in compaction.ts (summary row INSERT) - AbortController 3s timeout, lenient parsing with sensible defaults - 12 new vitest cases for the cache module (59 total) - 7 historical assistant rows backfilled manually (see notes)	2026-05-20 19:29:26 +00:00
indifferentketchup	dc43dd44f9	v1.11: opencode-style compaction port - compaction.ts: usable/isOverflow/estimate/turns/select/buildPrompt/process - compaction-prompt.ts: SUMMARY_TEMPLATE verbatim from opencode - schema: messages.{compacted_at,summary,tail_start_id} + chats.needs_compaction - inference: auto-trigger on overflow, pre-fetch compaction before next turn - /compact slash command rewired to new path - WS: chat_status working/idle around compaction + compacted frame - frontend: SummaryCard + sonner toast on compacted - 24 unit tests for pure functions	2026-05-20 19:05:35 +00:00
indifferentketchup	09aecc4ee9	v1.9: settings pane + per-project defaults + bulk archive + themes lift Adds a singleton, ephemeral 'settings' pane kind to the workspace. Opened via a new bottom-pinned button in ProjectSidebar (emits an open_settings_pane event when a session is mounted; navigates to /settings otherwise). Pane has three sections — Session, Project, Theme — and a maximize toggle that hides sibling pane columns via display:none on desktop only. Settings panes don't count toward MAX_PANES and are filtered out of the localStorage persistence layer so reload always restores a clean workspace. Schema (additive): - projects.default_system_prompt TEXT NOT NULL DEFAULT '' - projects.default_web_search_enabled BOOLEAN NOT NULL DEFAULT false - sessions.web_search_enabled BOOLEAN (nullable; null = inherit) Inference resolves user_prompt = session.system_prompt.trim() || project.default_system_prompt.trim() — empty/whitespace at either layer means "no override". Keeps the columns NOT NULL and matches the existing inherit semantics. Server routes: - GET /api/projects/:id (new; settings pane refetches on project_updated) - PATCH /api/projects/:id accepts default_system_prompt, default_web_search_enabled - PATCH /api/sessions/:id accepts web_search_enabled (tri-state) - POST /api/projects/:id/sessions/archive-all + GET /api/projects/:id/sessions/open-count - POST /api/sessions/:id/chats/archive-all + GET /api/sessions/:id/chats/open-count - PATCH /api/sessions/:id now broadcasts session_updated on every successful PATCH (was rename-only). Lets SettingsPane open in another tab pick up edits without a refetch. Bulk-archive publishes one session_archived / chat_archived frame per affected id so useSidebar's existing reducer cases handle them incrementally — no new frame type, no payload widening. ModelPicker refactored: shared ModelList inside a responsive shell. Desktop = labeled trigger + DropdownMenu, mobile = icon-only Cpu button + BottomSheet. 
Header in Session.tsx drops the pill wrap on mobile since the new trigger is the visual. ChatInput gains an icon-only '+' DropdownMenu next to AgentPicker when sessionId + webSearchEnabled props are provided. One item for now — Web search — with a checkmark reflecting the stored value (true), not the effective one. Click PATCHes the override; to restore inherit-from-project the user opens SettingsPane. ThemePicker lifted out of pages/Settings.tsx into a reusable component. The standalone /settings route is now a thin wrapper that mounts <ThemePicker /> with a Back button on top (navigate(-1) with fallback to '/'); the SettingsPane Theme tab renders the same picker bare. Project section delete-flow removed (button + confirm dialog + handler). Replaced with "Archive all sessions" using the same two-step count → confirm → fire pattern as "Archive all chats" in the Session section. api.projects.remove() stays in the client because useProjects.ts still uses it. Hand-rolled Switch primitive in SettingsPane (no shadcn switch in the project; spec said no new deps). Section nav is plain buttons (no shadcn Tabs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:37:29 +00:00
indifferentketchup	5c61cc7281	v1.8.2: tool loop cap-hit summary + tool call UI compaction Old hardcoded MAX_TOOL_LOOP_DEPTH=15 replaced by per-agent max_tool_calls (1-100, AGENTS.md frontmatter) with defaults: 30 for read-only-only agents, 10 for agents that include any non-read-only tool, 15 for raw chat. When the loop hits cap, fire one final summary call with tools disabled, stream the wrap-up into the in-flight assistant message, then insert a system sentinel with metadata.kind='cap_hit'. The sentinel renders an amber bubble with a Continue button (latest sentinel only) that POSTs to a new /api/chats/:id/continue route to extend. Hard ceiling: 3 cap-hits per chat (2 continues max) — third sentinel reports can_continue=false. Error frames carry a machine-readable reason code alongside human error text. Failed messages persist the reason via metadata.kind='error' so the bubble renders specifics on reload (WS error frame is one-shot). Tool call UI rewired: ToolCallLine renders inline (↳ name args spinner/check/✗, expand-on-tap for args+result); ToolCallGroup collapses 3+ consecutive same-tool runs into a compact card. MessageList owns a three-pass pre-render (flatten + fold tool results onto matching runs by id + group same-tool runs + number sentinels). MessageBubble drops tool rendering and adds the sentinel / error-reason branches. ToolCallCard deleted. Roadmap follow-up logged: add explicit max_tool_calls: 30 to the 6 agents in /data/AGENTS.md and /opt/boocode/AGENTS.md post-ship for discoverability (defaults handle behavior identically). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 10:31:32 +00:00
indifferentketchup	1ecb79476e	test: vitest harness + unit tests for security-critical pure functions Adds vitest 3.x (pinned to ^3 because vitest 4 requires Vite 6, while the web app pins Vite 5). Tests live under src//__tests__/. Three target functions: - sanitizeFolderName (project_bootstrap.ts): 8 cases covering happy path, path-traversal stripping, empty-after-sanitize, control chars, truncation at 64, null bytes, leading/trailing dot/slash stripping. - resolveProjectPath (projects.ts): 7 cases including symlink-escape via realpath, outside-whitelist rejection, nonexistent path, AND a flagged BEHAVIOR GAP: passing the whitelist path itself currently returns success rather than erroring out (function early-exits the scope check when real === whitelistReal). Test asserts current behavior with explicit comment flagging the spec violation — function NOT silently patched. Function made exportable for testing (single keyword change). - buildMessagesPayload (inference.ts): 8 cases for compact-marker logic (no marker, marker present, multiple compacts, tool-message position). tsconfig.json excludes __tests__ + *.test.ts from emit so dist/ stays clean. pnpm -C apps/server test => 23 passed in ~340ms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 04:35:31 +00:00
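
The v1.9 entry above states the prompt resolution rule (`session.system_prompt.trim() || project.default_system_prompt.trim()`) and a nullable `sessions.web_search_enabled` where null means inherit. A minimal sketch of how those two rules could be written follows. The row interfaces and both function names are hypothetical, and using `??` for the web-search flag is an assumption drawn from the "null = inherit" note, not something the log confirms.

```typescript
// Hypothetical row shapes; only the column names come from the v1.9 commit message.
interface ProjectRow {
  default_system_prompt: string;        // TEXT NOT NULL DEFAULT ''
  default_web_search_enabled: boolean;  // BOOLEAN NOT NULL DEFAULT false
}

interface SessionRow {
  system_prompt: string;
  web_search_enabled: boolean | null;   // null = inherit from the project
}

// Empty or whitespace-only text at either layer counts as "no override",
// so the session prompt wins only when it has visible content.
function resolveUserPrompt(session: SessionRow, project: ProjectRow): string {
  return session.system_prompt.trim() || project.default_system_prompt.trim();
}

// Assumed tri-state handling: an explicit true/false on the session wins,
// null falls back to the project default.
function resolveWebSearch(session: SessionRow, project: ProjectRow): boolean {
  return session.web_search_enabled ?? project.default_web_search_enabled;
}
```

Note that `||` is used for the prompt because an empty string must fall through, while `??` is used for the flag because an explicit `false` on the session should not fall through to the project default.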

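The v1.11.6 entry describes a doom-loop guard that aborts recursion after 3 identical tool calls. One way such a check might look is sketched below. Everything here is an assumption for illustration: the `ToolCallRecord` shape, the helper names, and the reading of "identical" as the same tool name plus the same JSON-serialized arguments in consecutive calls. It is not the project's `detectDoomLoop`.

```typescript
// Hypothetical record of one tool invocation within an assistant turn.
interface ToolCallRecord {
  name: string;
  args: Record<string, unknown>;
}

// The commit message gives 3 as the number of identical calls that triggers the guard.
const ASSUMED_THRESHOLD = 3;

// Assumed identity key: tool name plus serialized arguments.
// JSON.stringify is sensitive to property order, so {a, b} and {b, a} differ here.
function callKey(call: ToolCallRecord): string {
  return `${call.name}:${JSON.stringify(call.args)}`;
}

// Returns true when the last `threshold` calls all share one identity key.
function looksLikeDoomLoop(
  history: ToolCallRecord[],
  threshold: number = ASSUMED_THRESHOLD,
): boolean {
  if (history.length < threshold) return false;
  const keys = history.slice(-threshold).map(callKey);
  return keys.every((k) => k === keys[0]);
}
```

A caller would run this check before dispatching the next tool call and stop recursing when it returns true.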
28 Commits