boocode

Author	SHA1	Message	Date
indifferentketchup	d27a977d59	v1.15.0-mcp-multi: multi-server MCP client + stdio transport + config file + tool globs Generalizes the v1.14.1 single-server Context7 PoC into a multi-server MCP client registry with per-server graceful degradation. JSON config at /data/mcp.json (bind-mounted alongside AGENTS.md) matches opencode's mcpServers schema shape. Config file missing = no MCP (opt-in by presence). Two transports: Streamable HTTP (remote servers like Context7) and stdio (local subprocess servers like codecontext). Stdio spawns a persistent child via the SDK's StdioClientTransport; shutdown hook closes all transports. Tool prefix generalized from context7_<name> to <serverName>_<toolName> with a toolToServer reverse map for dispatch routing. AGENTS.md tools: field now supports glob patterns (context7_, !web_) via matchToolGlob: last-match-wins with ! deny prefix. Replaces exact-match .includes() in stream-phase.ts. refreshToolNames() in agents.ts rebuilds the DEFAULT_TOOLS snapshot after appendMcpTools so agents without explicit tools: lists see MCP tools; reviewer caught that the module-load-time snapshot would permanently exclude late-registered tools. Read-only invariant: readOnlyHint === false rejected at discovery. Result size capped at 5MB. v1.14.1 env vars removed, superseded by config file. Default data/mcp.json ships with Context7 disabled. 363/363 server tests passing. No schema changes, no frontend changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-24 04:08:42 +00:00
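The last-match-wins glob rule described in the v1.15.0 entry can be sketched as follows. This is a minimal illustration, not the project's matchToolGlob: it assumes `*` is the only wildcard, that a name matching no pattern is denied, and that the signature takes a tool name plus an ordered pattern list. The tool name `context7_lookup` is hypothetical.

```typescript
// Hypothetical sketch of last-match-wins glob matching with a "!" deny prefix.
// Assumption: "*" is the only wildcard; everything else matches literally.
function globToRegExp(glob: string): RegExp {
  const escapeLiteral = (s: string): string =>
    s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Split on "*", escape each literal piece, and rejoin with ".*".
  const body = glob.split("*").map(escapeLiteral).join(".*");
  return new RegExp("^" + body + "$");
}

// Walks the patterns in order; the last pattern that matches decides.
// Assumption: a tool that matches no pattern is not allowed.
function matchToolGlob(toolName: string, patterns: readonly string[]): boolean {
  let allowed = false;
  for (const raw of patterns) {
    const deny = raw.startsWith("!");
    const glob = deny ? raw.slice(1) : raw;
    if (globToRegExp(glob).test(toolName)) {
      allowed = !deny;
    }
  }
  return allowed;
}

console.log(matchToolGlob("context7_lookup", ["context7_*", "!web_*"]));
console.log(matchToolGlob("web_fetch", ["*", "!web_*"]));
```

Because later patterns override earlier ones, a list such as `["!web_*", "web_fetch"]` re-allows a single tool after a broad deny.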
indifferentketchup	5692e99a5d	v1.14.1-mcp-poc: single-server MCP client against Context7 Validates the MCP-client loop end-to-end against one real MCP server before the full v1.15 port. New services/mcp-client.ts wraps @modelcontextprotocol/sdk v1.29.0 with Streamable HTTP transport. On startup (when MCP_CONTEXT7_URL is set), connects to Context7, discovers tools via tools/list, wraps each as a ToolDef prefixed context7_<name>, and appends to ALL_TOOLS via appendMcpTools. Read-only invariant guard rejects any tool with readOnlyHint: false. Tool dispatch is transparent — executeToolCall routes MCP calls through the ToolDef execute wrapper, which strips the prefix before calling the MCP server. Result size capped at 5MB with truncation. Graceful degradation: server down at startup → zero tools; server down mid-session → error result, model self-corrects. Adversarial review caught that a Zod .default() on the URL config made MCP always-on instead of opt-in — fixed by removing the default. MCP_CONTEXT7_URL must be explicitly set to enable. ALL_TOOLS changed from ReadonlyArray to mutable to support late-registration. appendMcpTools re-sorts and rebuilds TOOLS_BY_NAME after append. 348/348 server tests passing (16 new mcp-client tests). No schema changes, no frontend changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 21:58:09 +00:00
indifferentketchup	211e903620	v1.13.20-drop-legacy-cols: final phase of v1.13.0 strangler-fig Removes the dual-write into messages.tool_calls / messages.tool_results JSON columns and drops the columns. message_parts is now the only source of truth for tool calls and tool results. 10 dual-write sites stripped (5 in tool-phase.ts, 2 in routes/skills.ts, 2 in routes/messages.ts, 1 in routes/chats.ts fork-clone). The recon-driven grep caught 2 sites beyond the original v1.13.2 roadmap inventory and an extra fixture file (tool_cost_stats.test.ts) with a direct legacy-column INSERT. messages_with_parts view rewritten to parts-only subselects (COALESCE fallbacks gone). View runs via CREATE OR REPLACE so it lands before the column DROPs in startup DDL — Postgres rejects column-drop on view-referenced cols. v1.12.1 cleanup DO block (DROP CONSTRAINT messages_status_check / messages_role_check) removed; those one-shots have done their work. Adversarial review caught a runtime bug the green test suite missed: the discard_stale endpoint (chats.ts) had a RETURNING ... tool_calls, tool_results clause that would have crashed on every 60s-no-token-activity recovery in production. Fixed by switching to two-step UPDATE returning id, then SELECT from messages_with_parts so parts-synthesized fields keep flowing on the wire. Message API type retains tool_calls? / tool_results? — the view synthesizes those keys from parts so the wire shape is unchanged; frontend reads need no update. Override on the original v1.13.2 plan, captured in the openspec proposal. 339/339 server tests passing (including 7 DB-integration tests that applied the schema migration to a live DB and ran the parts-only view end-to-end). tsc + web build clean. Pairs with v1.13.0-ai-sdk-v6 (introduced the dual-write) and v1.13.1-B (moved the read path to messages_with_parts). Umbrella v1.13 tag ships on this same commit, marking the strangler-fig closed. 
CLAUDE.md picks up Sam's pre-existing edits documenting tag-naming and CHANGELOG conventions — both already in use by v1.13.19 / v1.13.20. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 13:03:51 +00:00
indifferentketchup	ad45b28250	v1.13.19-html-artifact-panes: pane-based artifact viewer with on-request HTML Every assistant message gets an "Open in pane" affordance that opens the message in the workspace splitter — Markdown pane (Copy + Download .md) by default; HTML pane (Download .html only) when the model emits a self-contained <!DOCTYPE html> or fenced ```html artifact. BOOCHAT.md rule keeps Markdown default at every length; HTML opt-in on explicit user request. Backend: services/artifacts.ts (slug derivation + write helpers with symlink-escape guard via realpath-after-mkdir), routes/artifacts.ts (POST download + GET stream with nosniff + CSP sandbox defense-in-depth), HTML detection in finalizeCompletion writing a new message_parts.kind='html_artifact' row (schema CHECK extended via v1.13.13 pattern), graceful 1MB cap via the pure decideHtmlArtifactWrite helper. PartKind union extended. Frontend: MarkdownRenderer.tsx extracted from MessageBubble's inline MarkdownBody for reuse; MarkdownArtifactPane.tsx + HtmlArtifactPane.tsx with loading/error states; pane state is reference-only ({chat_id, message_id, title}) — content fetched on mount to keep workspace_panes jsonb small and avoid 1MB blobs riding session_workspace_updated frames. iframe sandbox locked to allow-scripts allow-clipboard-write allow-downloads with no allow-same-origin, srcDoc not src. openInPane discriminates 404 (expected fallback) from real errors (toast + bail). PanelRightOpen icon button with mobile 44px tap-target. 31 new server unit tests including a real-symlink filesystem case; 332/332 server tests passing, tsc clean both sides, pnpm -C apps/web build green. Smoke deferred to first deploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 12:43:13 +00:00
indifferentketchup	1a889dcde3	v1.13.18-codecontext-file-path: resolve file_path against project root in codecontext wrappers Four codecontext sidecar wrappers — get_file_analysis (required file_path), get_symbol_info, get_dependencies, and get_semantic_neighborhoods (optional) — forwarded file_path to the HTTP sidecar unchanged. The sidecar's internal file index is keyed on absolute paths, so any relative path from the model returned "File not found in graph". Three back-to-back failures observed in one chat on 2026-05-22 17:56 UTC, ~48 s of wasted tool budget. ## Resolver Add resolveProjectPath(projectRoot, rawPath) in codecontext_client.ts: trim check → absolute/relative branch (both go through resolve() so dot-segments normalise) → realpath with ENOENT fallthrough → escape check using the realpathed value. Error shape mirrors the existing target_dir escape error byte-for-byte; only the field name differs. Wired into callCodecontext at the args-spread site, guarded on file_path presence + non-empty. All four wrappers benefit from one call site; wrappers without file_path (overview, framework, watch, search) are unaffected. ## Schema trim .trim() added to all four file_path Zod schemas: get_file_analysis: z.string().trim().min(1) get_symbol_info: z.string().trim().optional() get_dependencies: z.string().trim().optional() get_semantic_neighborhoods: z.string().trim().optional() Absorbs trailing newlines / whitespace from model output before the resolver sees the value. ## Adversarial review fixes Adversarial pass surfaced two P2 findings: 1. Absolute path with `..` resolving outside the project root (e.g. `<projectRoot>/../etc/passwd`) that ENOENTs at realpath would slip through the literal prefix-check: the raw string starts with `<projectRoot>/`. Fix: resolve() the absolute branch's candidate too, so dot-segments normalise before the prefix check. 2. No symlink-escape test coverage. 
Realpath's stated purpose (catching in-project symlinks pointing outside the project) was never tested. Added: create a tmpdir outside projectRoot, symlink projectRoot/evil-link → outside file, assert rejection. ## Tests codecontext_client.test.ts: 19 tests (10 baseline + 9 new file_path resolution cases). Cases cover: relative→absolute, absolute-inside, relative-escape, absolute-outside, ENOENT-fallthrough, empty-string, wrapper-without-file_path, absolute-with-`..`-ENOENT, symlink-leaving-root. codecontext_tools.test.ts: one assertion updated to expect the resolved-absolute file_path on the wire (previously asserted the raw relative path passed through, which is exactly the bug being fixed). Full suite: 301 passed, 7 skipped. ## Affected / unaffected - get_codebase_overview, get_framework_analysis, watch_changes, search_symbols: no file_path arg → resolver guard skips them. No behavior change. - get_semantic_neighborhoods IS in SYNTHESIS_TOOLS — previously-failing relative-path calls will now successfully synthesize. Desirable, not a regression. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 21:54:16 +00:00
indifferentketchup	b52c5df705	v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root When the agent needed context from another repo, pathGuard rejected every read with no recovery path. This batch adds a reactive request_read_access flow: pathGuard's error now hints at the tool, the model emits a structured request, the inference loop pauses (same mechanism as ask_user_input), the user picks Allow/Deny via inline chips, and subsequent reads under the granted root succeed for the rest of the session. Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[] (idempotent ADD COLUMN IF NOT EXISTS). Grant unit (design D1): nearest registered projects.path ancestor → nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml) under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks ancestors with a per-iteration whitelist invariant check so symlinked input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask). Path-guard: optional extraRoots arg threaded from session.allowed_read_paths through executeToolCall to view_file / list_dir / grep / find_files. The ToolDef.execute signature gets an optional third param; non-FS tools ignore it. view_file re-anchors the secret-guard check on basename(real) whenever a relative path starts with "../" so .env / id_rsa* etc. still deny across grant roots. Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input. On 'allow' it re-resolves the grant root (state may have changed since prompt — auto-falls to denial reason text on failure, not 500), array_appends to sessions.allowed_read_paths with in-memory dedup, then publishes tool_result + session_updated frames and enqueues the next assistant turn. PATCH /api/sessions/:id allowed_read_paths supports revocation only. 
Zod refines absolute + no traversal markers; runtime findUnauthorizedAdditions guard rejects any entry not already present in the row, so a malicious curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of bypassing the grant flow (Sam's compliance-review action item). Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny) and answered (granted/denied summary with the resolved root) variants; MessageList.flatten/group special-cases the tool name; SettingsPane adds a per-session grants list with per-row revoke that PATCHes the shortened array. Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including explicit cases for symlink escape mid-walk, walk-bound termination at whitelist root, /etc bypass attempt via PATCH, and nearest-project disambiguation. 292 total server tests green. Pairs with v1.13.16-xml-parser — the model now self-recovers from both a wrong tool name AND from a refused path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 21:45:52 +00:00
indifferentketchup	2e1a81de72	v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6 turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent because Claude Code documentation in its pre-training corpus uses that shape). ## Parser extension xml-parser.ts now recognizes BOTH XML tool-call flavors: - Qwen/Hermes: <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call> - Anthropic: <invoke name="NAME"><parameter name="K">V</parameter></invoke> Both route through the same synthetic-id xml_call_${idx} ToolCall path. extractToolCallBlocks() and partialXmlOpenerStart() handle both openers (<tool_call> and <invoke...) so partial buffers don't get prematurely flushed during streaming. The existing Qwen parser was tightened to tolerate whitespace around `=` (<function = name>, <parameter = key>...) so a stray space doesn't get absorbed into the function name. Name capture is non-whitespace, non-`>`. ## Unknown-tool recovery hint New tool-suggestions.ts exports levenshtein() + suggestToolName() + formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the model now includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME). Targets the qwen3.6 drift to read_file → suggest view_file. Applies to all unknown tool names, not just <invoke>-derived ones — at the dispatch layer we no longer know which format produced the call, and the extra signal is harmless for Qwen-derived calls. ## Test coverage xml-parser.test.ts: 46 tests, all green. 
Covers both parsers (well-formed, malformed, multi-parameter, nested-content), the partial-opener detector for both flavors, the unified extraction helper, and the unknown-tool error formatter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 20:59:25 +00:00
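The unknown-tool hint in the v1.13.16 entry can be sketched roughly as below. The function names come from the commit message, but the bodies, signatures, tie-breaking rule, and exact error wording are assumptions for illustration only.

```typescript
// Classic row-by-row dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumption: a candidate qualifies when its distance is <= 3 or one name
// contains the other; among qualifying candidates the smallest distance wins.
function suggestToolName(
  unknown: string,
  known: readonly string[],
): string | undefined {
  let best: { name: string; distance: number } | undefined;
  for (const name of known) {
    const distance = levenshtein(unknown, name);
    const substring = name.includes(unknown) || unknown.includes(name);
    if (distance > 3 && !substring) continue;
    if (best === undefined || distance < best.distance) {
      best = { name, distance };
    }
  }
  return best?.name;
}

// Assumption: the hint is appended to the existing "unknown tool: X" text.
function formatUnknownToolError(name: string, known: readonly string[]): string {
  const hint = suggestToolName(name, known);
  return hint === undefined
    ? `unknown tool: ${name}`
    : `unknown tool: ${name}. Did you mean: ${hint}?`;
}

const KNOWN_TOOLS = ["find_files", "grep", "list_dir", "view_file"];
console.log(formatUnknownToolError("view_fil", KNOWN_TOOLS));
```

Running the check at the dispatch layer means the hint applies regardless of which XML flavor produced the call, which matches the rationale given in the entry.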
indifferentketchup	8b568b36d3	v1.13.11-a: WS frame schemas + frontend receive validation First half of the WebSocket-frame-typing batch (split per recon: total scope was ~535 LoC, larger than the roadmap's ~300 estimate, so the server-side publish-site conversion lands separately in v1.13.11-b). Phase A scope: (1) apps/server/src/types/ws-frames.ts (NEW): Zod schemas for all 27 wire-format WS frame types. Discriminated union (WsFrameSchema) plus KNOWN_FRAME_TYPES const for diagnostic lookup. UUIDs are z.string().uuid(); model-emitted tool_call_id stays z.string().min(1) since OpenAI-compatible APIs emit "call_<random>" not UUID. Per-kind payload narrowing (tool args, message_parts payloads) intentionally stays z.unknown(); frame-level drift detection is the goal, and deep payload validation is follow-up work. (2) apps/web/src/api/ws-frames.ts (NEW): byte-identical mirror of the authoritative server file. No path alias from web→server in the existing tsconfig setup; sync-by-hand was chosen over a new packages/shared/ dir. A ws-frames.test.ts test asserts the two files match. (3) apps/server/src/services/broker.ts: adds publishFrame() and publishUserFrame() methods to the Broker interface. Both validate via WsFrameSchema and fail-closed: log + drop on invalid. createBroker now accepts an optional FastifyBaseLogger so validation failures land in the pino stream (with console.error fallback for unit tests). The existing publish() / publishUser() raw methods stay legal; they get converted to the typed variants in v1.13.11-b. (4) apps/web/src/hooks/useSessionStream.ts + useUserEvents.ts: wrap ws.onmessage with WsFrameSchema.safeParse. Fail-closed: invalid frames log + return without dispatching. Hand-maintained WsFrame and SessionEvent types stay in place; one cast bridges Zod-typed → narrowed shape (Zod uses OpaqueObject for nested Message[] / WorkspacePane[] etc., which are dev-time-narrowed via the existing hand-maintained types). 
(5) apps/web/package.json — adds zod ^3.23.8 as a direct dep. Was a transitive dep via ai-sdk / postgres; promotion makes the import legal. (6) Tests: 15 new in ws-frames.test.ts covering happy-path per major frame type, drift-catchers (unknown type, invalid enum, non-UUID, negative tokens), parts-authoritative read variants, the mirror-file diff check, and four broker fail-closed scenarios. 219/219 server tests pass (was 204; +15 new). Two recon corrections to the dispatch brief, both flagged before implementation: - No 'parts_appended' frame exists. The brief assumed one; the codebase reads parts via the messages_with_parts view after message_complete triggers a refetch. MessagePartSchema is therefore unused this batch. - No 'tool_running' frame exists. The brief listed it as standalone; it is in fact a 'chat_status' variant ({ status: 'tool_running' }), already covered by ChatStatusFrame. Smoke: clean container boot, no validation errors in the server log. Real production frames pass validation (the schemas were derived from the existing hand-maintained types in api/types.ts and sessionEvents.ts). v1.13.11-b will follow immediately: convert all ~85 raw broker.publish / ctx.publish call sites across 11 server files to publishFrame / publishUserFrame. Mechanical edit; the wiring done here means the diff in -b is just the call-site swaps. ~310 LoC across 9 files (4 new + 5 modified).	2026-05-22 15:48:32 +00:00
indifferentketchup	34cbecf975	v1.13.15-tools: tiered tool loading via BOOCODE_TOOLS env var Pattern lift from eyaltoledano/claude-task-master (MIT + Commons Clause — pattern only, no code lift). Adds BOOCODE_TOOLS env var with three tiers: - core (4 tools): view_file, list_dir, grep, find_files. ~2k token schema cost. - standard (15 tools): core + web_search, web_fetch, git_status, all 8 codecontext_* tools. ~10k token schema cost. - all (default; current behavior): every tool in ALL_TOOLS (20). ~21k token schema cost. The env var is a CEILING — narrows agent whitelists, never expands. Default behavior unchanged when var is unset. resolveToolTier is case-insensitive and falls back to 'all' on unknown values. CORE_TOOL_NAMES + STANDARD_TOOL_NAMES validated at module load against TOOLS_BY_NAME via two top-level for-loops that throw on the first missing name. Module fails to import if a tier references a tool that doesn't exist in the registry — catches typos and stale tier definitions at boot rather than silently filtering valid tools out of agent whitelists. Wiring: agents.ts parseAgentBlock now reads BOOCODE_TOOLS from process.env per parse, intersects with the agent's declared frontmatter tools (or DEFAULT_TOOLS when frontmatter omits the field). Per-parse read is fine — agents are re-parsed on the existing 60s cache TTL. Tests: tools.test.ts grows from 1 to 10 tests. Covers resolveToolTier across tiers/case/unknown values + the CORE-subset-of-STANDARD invariant + TOOLS_BY_NAME existence for both tier sets. 204/204 pass (was 195; +9 new). Deviation from the brief: the codecontext tools in the actual registry have NO codecontext_* prefix (the brief's STANDARD list assumed it). Used the actual names (get_codebase_overview, search_symbols, etc.). Module-load validation would have failed boot with the prefixed names. Smoke: with BOOCODE_TOOLS unset, agents return their full 12-tool whitelists. 
With BOOCODE_TOOLS=core in .env + container restart, the same agents narrow to 4 tools (find_files, grep, list_dir, view_file) — intersection of declared whitelist ∩ core tier. Reverted after confirmation. CLAUDE.md updated with BOOCODE_TOOLS in the Environment section's Optional list. .env.example gained a commented BOOCODE_TOOLS=all line with the per-tier token-cost table. ~110 LoC across 5 files (4 modified + 1 test expansion). Under the brief's ~30 LoC estimate for code; the test suite expansion drove most of the growth.	2026-05-22 14:59:01 +00:00
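The ceiling behaviour of BOOCODE_TOOLS in the v1.13.15 entry amounts to a set intersection. The sketch below uses the tier contents listed in the commit message; the helper name `applyTierCeiling` and both signatures are assumptions, not the project's code.

```typescript
type ToolTier = "core" | "standard" | "all";

// Tier contents as listed in the commit message (4 core, 15 standard).
const CORE_TOOL_NAMES: ReadonlySet<string> = new Set([
  "view_file", "list_dir", "grep", "find_files",
]);
const STANDARD_TOOL_NAMES: ReadonlySet<string> = new Set([
  ...CORE_TOOL_NAMES,
  "web_search", "web_fetch", "git_status",
  "get_codebase_overview", "get_framework_analysis", "get_file_analysis",
  "get_symbol_info", "get_dependencies", "get_semantic_neighborhoods",
  "search_symbols", "watch_changes",
]);

// Case-insensitive; unset or unknown values fall back to "all".
function resolveToolTier(raw: string | undefined): ToolTier {
  const value = (raw ?? "").trim().toLowerCase();
  return value === "core" || value === "standard" ? value : "all";
}

// Hypothetical helper: the tier only narrows the declared list, never expands it.
function applyTierCeiling(declared: readonly string[], tier: ToolTier): string[] {
  if (tier === "all") return [...declared];
  const allowed = tier === "core" ? CORE_TOOL_NAMES : STANDARD_TOOL_NAMES;
  return declared.filter((name) => allowed.has(name));
}

const declaredTools = ["find_files", "grep", "list_dir", "view_file", "web_fetch"];
console.log(applyTierCeiling(declaredTools, resolveToolTier("CORE")));
```

Filtering the declared list, rather than returning the tier set, is what keeps the variable a ceiling: an agent that declares only `grep` still gets only `grep` under `standard`.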
indifferentketchup	9ce638c916	v1.13.10: per-tool token cost accounting (rolling 100-call view) Surfaces per-tool prompt/completion-token rolling averages in AgentPicker for at-a-glance agent-cost hints. Implementation is a SQL view on top of messages_with_parts plus a read endpoint and AgentPicker tooltip extension. No new write site; all source data already lands via the existing tool-phase.ts:94-95 / error-handler.ts:109-110 / sentinel-summaries.ts UPDATEs that v1.13.7's includeUsage: true fix made non-NULL. (1) schema.sql: new tool_cost_stats view. Window-functions over messages_with_parts.tool_calls with LATERAL jsonb_array_elements. Attribution: equal split; a multi-tool turn divides tokens N-ways, and the 100-call rolling mean absorbs split noise. Filters: status='complete' + metadata.kind NOT IN ('cap_hit','doom_loop') exclude failed turns and sentinels respectively; tool_calls IS NOT NULL is defense-in-depth since sentinels are role='system' rows. CREATE OR REPLACE means schema apply is idempotent. (2) routes/tools.ts NEW + index.ts wire-in. GET /api/tools/cost_stats returns { stats: ToolCostStat[] } with mean_prompt_tokens / mean_completion_tokens computed at read time (sum / n_calls). Sorted by tool_name ASC. No pagination (≤30 tools). (3) __tests__/tool_cost_stats.test.ts NEW: 7 integration tests keyed off DATABASE_URL env var. Tests skip gracefully when unset (no-DB default). beforeAll applies the schema via sql.unsafe(readFileSync(schema.sql)) for self-contained runs. Helper insertAssistantTurn shared across cases. Covers: empty state, single-tool attribution, multi-tool equal split, 100-call FIFO window, NULL-tokens exclusion, parts-authoritative read via messages_with_parts, failed/sentinel exclusion. (4) web/api/types.ts + client.ts: ToolCostStat interface + api.tools.costStats() method binding. 
(5) AgentPicker.tsx: fetch costStats on mount, compute per-agent sum-of-means across whitelisted tools, render muted cost line below description: "~5.2k prompt / 280 completion · 6/8 tools · last call 3h ago". Skips line entirely when no tool history; preserves existing native title= for layout backward-compat. formatK/formatAgo colocated. Tests: 202/202 pass (195 prior + 7 new view-integration). Server + web tsc clean. Smoke: schema applied cleanly; GET /api/tools/cost_stats returns canonical JSON; view + endpoint agree. Single-row result expected given the v1.13.1-A → v1.13.7 NULL latent regression window; new traffic populates organically. Roadmap row at boocode_roadmap.md:114 plus schema row at :474 both match. View vs table decision documented in handoff_v1.13.10_per_tool_cost.md (rollback-safe, microsecond-fast at BooCode scale). ~270 LoC across 8 files (5 modified + 3 new).	2026-05-22 14:42:09 +00:00
indifferentketchup	b06a4a8e55	v1.13.9: compaction overflow trigger — 0.85 × ctx_max early trigger Opencode pattern (session/overflow.ts): fire compaction at 85% of ctx_max, replacing the v1.11.0-era `ctx_max - 20_000` formula. Old formula: usable = ctx_max - 20_000 - ctx=262144 → trigger at 242144 (92.4%) — only 7.6% headroom - ctx=100000 → trigger at 80000 (80.0%) - ctx= 32000 → trigger at 12000 (37.5%) — over-eager - ctx<=20000 → trigger at 0 — never fires New formula: usable = floor(0.85 * ctx_max) - ctx=262144 → trigger at 222822 (85.0%) — 15% headroom for summarizer - ctx=100000 → trigger at 85000 (85.0%) - ctx= 32000 → trigger at 27200 (85.0%) - ctx= 8192 → trigger at 6963 (85.0%) Ratio gives consistent headroom at any context scale. The qwen3.6 daily driver gets ~19k tokens more breathing room before overflow; small-ctx models no longer degenerate to never-triggering. usable() is the only consumer of COMPACTION_BUFFER → constant deleted. New EARLY_TRIGGER_RATIO constant takes its place. isOverflow() and the maybeFlagForCompaction() call site at payload.ts:184 are unchanged — formula swap is internal to compaction.ts. payload.ts comment touched only to drop the stale COMPACTION_BUFFER reference (PRUNE_TRIGGER_TOKENS stays at 20k as the prune-freed threshold; independent of the overflow formula). Tests: 4 new usable() corner cases (262k/100k/8k/zero+negative), plus 5 isOverflow() numbers shifted to match the 85k budget at ctx=100k. 195/195 server tests pass (was 194). Smoke: ratio math verified by unit tests at all four corners. Live cap-hit verification deferred — requires accumulating >222k tokens in a session under qwen3.6-35b-a3b-mxfp4 (was >242k pre-fix); will surface organically in extended use.	2026-05-22 13:59:14 +00:00
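The formula swap in the v1.13.9 entry can be checked with a few lines of arithmetic. `EARLY_TRIGGER_RATIO` and `usable` are named in the commit message; `legacyUsable` and the clamping of non-positive inputs to zero are assumptions made for this comparison.

```typescript
const EARLY_TRIGGER_RATIO = 0.85;
const LEGACY_COMPACTION_BUFFER = 20_000; // the deleted v1.11.0-era constant

// New trigger: a fixed fraction of the context window.
// Assumption: zero and negative inputs clamp to 0.
function usable(ctxMax: number): number {
  if (!(ctxMax > 0)) return 0;
  return Math.floor(EARLY_TRIGGER_RATIO * ctxMax);
}

// Old trigger, kept here only for comparison (hypothetical helper name).
function legacyUsable(ctxMax: number): number {
  return Math.max(0, ctxMax - LEGACY_COMPACTION_BUFFER);
}

for (const ctx of [262_144, 100_000, 32_000, 8_192]) {
  const pct = (n: number): string => ((100 * n) / ctx).toFixed(1) + "%";
  console.log(
    `ctx=${ctx} old=${legacyUsable(ctx)} (${pct(legacyUsable(ctx))}) ` +
      `new=${usable(ctx)} (${pct(usable(ctx))})`,
  );
}
```

At ctx=262144 the difference between the two triggers is 242144 - 222822 = 19322 tokens, which is the "~19k tokens more breathing room" quoted in the entry.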
indifferentketchup	a0c8d212cb	v1.13.8: system-prompt prefix stability verify-and-measure Recon during planning disproved the original v1.13.7 (DB-cache) premise: buildSystemPrompt already runs over inputs mtime-cached at the file layer (BOOCHAT.md in system-prompt.ts:25, AGENTS.md global+per-project in agents.ts:245), and DB scalars are byte-stable until edited. The output is microsecond pure-string concat with no I/O. Skills aren't in the prefix; tools live in a separate request body field alpha-sorted by v1.13.3. This batch closes the verification gap with instrumentation, not implementation: - system-prompt.ts: buildSystemPromptWithFingerprint canonical impl computes SHA-256 over the assembled prefix, runs a per-session Map<sessionId, lastHash> observer, emits PrefixFingerprint per call and PrefixDrift (with field-level changed_inputs) on hash change. buildSystemPrompt is now a thin shim returning .prompt. - agents.ts: getAgentsMtimes accessor (cache-read only, no I/O). - payload.ts: buildMessagesPayload takes optional log argument; when passed, emits prefix-fingerprint (info) + prefix-drift (warn). - turn.ts + sentinel-summaries.ts: pass ctx.log at 3 production call sites; sentinel summaries log too so any drift across cap-hit / doom-loop paths surfaces. - system-prompt.test.ts: 4 new tests (byte-identical, no-drift-on-stable, drift-fires-with-changed-inputs, cross-session-no-drift). 194/194 tests pass (was 190). Smoke: 5 messages in a fresh session produced 7 prefix-fingerprint logs (extras from buildMessagesPayload being called from sentinel summary paths), all with identical prefix_hash and prefix_length=2907, zero prefix-drift. Prefix is byte-stable in steady-state. Decision: original system_prompt_cache DB table from the roadmap is permanently dropped. The v1.12.0 mtime caches at the input layer plus alpha tool ordering at the request body (v1.13.3) already address the load-bearing cache-stability surfaces. 
Instrumentation stays so the claim can be re-verified at any time.	2026-05-22 13:42:18 +00:00
indifferentketchup	81d837c04e	v1.13.6: compaction head-assembly audit + reasoning fix Audit traced compaction's summary path post-v1.13.1-B read flip: - Q1: reads from messages_with_parts (view) — clean - Q2: parts shape correctly threaded through buildHeadPayload — clean - Q3: reasoning omitted from summary input — FIX NEEDED v1.13.1-C wired reasoning end-to-end into inference/payload.ts but missed this read site. Summarizer model couldn't see the reasoning trail for tool-bearing turns, quietly degrading summary quality for reasoning-channel models (qwen3.6). Fix: - CompactionMessage extended with reasoning_parts field - SELECT pulls reasoning_parts from messages_with_parts - buildHeadPayload (now exported for tests) prefixes assistant content with <reasoning>...</reasoning>\n\n<content>... when reasoning is present; standalone <reasoning>...</reasoning> for tool-call-only turns; omits the tag when reasoning is null or empty 4 new render branch tests (190 total). Smoke deferred: forcing real compaction requires either threshold pollution or building up a >40k-token chat with reasoning_parts. Render branches are unit-covered; integration would only re-prove structural correctness.	2026-05-22 08:18:47 +00:00
indifferentketchup	f8fc5db929	v1.13.5: opencode truncate.ts port — full tool output retrievable via opaque id - New services/truncate.ts. Tmpfs storage at /tmp/boocode-truncations/ (BOOCODE_TRUNCATION_DIR env var overrides for tests). 12-char base32 opaque ids (~60 bits entropy, "tr_<id>"). Three exports: storeTruncation, readTruncation, truncateIfNeeded (wrap-or-passthrough helper). cleanupTruncations does TTL-pass (7 days) + orphan-reap (parts query on payload->'output'->>'outputPath') in one shot. - Wired four tools through truncateIfNeeded: view_file (raw full file), list_dir (full filtered+secret-filtered entries serialized one-per-line), web_fetch (textRaw pre-slice), codecontext_client (body.result pre-slice). Each returns the existing sliced view plus an optional outputPath field when truncation fires. - New view_truncated_output ToolDef. Resolves opaque id → on-disk content internally; model never sees the truncation dir. Same start_line / end_line slicing semantics as view_file. Registered in ALL_TOOLS (alpha sort places it after view_file automatically) and READ_ONLY_TOOL_NAMES. - cleanupTruncations piggybacks on the v1.13.3 stuck-row sweeper's 60s setInterval. No-op when truncation dir is empty. Not wired (TODO follow-up): grep and find_files. file_ops returns post-cap results to the tool execute path, so the "full content" isn't recoverable without a refactor of fileOps.grep / fileOps.findFiles to expose the uncapped result. web_search is silent-slice (no truncated flag); outside scope. Five sites of seven covered; the remaining two are the only ones needing a file_ops change. Tests: 7 new in truncate.test.ts (roundtrip, unknown id, malformed id, truncateIfNeeded false/true/over-cap/storage-failure paths). 186 total (was 179). cleanupTruncations file-system half implicitly via TTL pass; orphan-reap branch covered by the live container smoke. 
Smoke verified end-to-end against the live container: - view_file with start_line=1, end_line=3 on CLAUDE.md → tool_result part carried outputPath "tr_cdpn1o04k6ma" + truncated=true. - /tmp/boocode-truncations/tr_cdpn1o04k6ma exists, 15876 bytes, mode 0o600, parent dir mode 0o700. - Follow-up view_truncated_output(id, start_line=50, end_line=55) returned the actual lines 50-55 of CLAUDE.md (the 808notes/BooCode bullets). - ALL_TOOLS count=20 (was 19); alpha sort places view_truncated_output between view_file and watch_changes. Closes a v1.12 catalog row that was scoped but deferred. The v1.13 parts table made outputPath ride on the existing tool_result payload with no schema change beyond the storage helper itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:55:55 +00:00
indifferentketchup	ec8593cf77	v1.13.4: two-tier compaction prune — opencode pattern half-shipped in v1.11.0 - message_parts.hidden_at timestamptz column (NULL by default) with a partial index on (message_id) WHERE hidden_at IS NULL for the common visible-parts filter. - messages_with_parts view changed from COALESCE(parts, legacy) to CASE WHEN EXISTS(any parts of kind) THEN visible-parts ELSE legacy. COALESCE would have leaked hidden parts back via the legacy fallback when every part was pruned (smoke caught it pre-commit). The CASE distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → return null/empty so the row drops out of the model payload" exactly. - prune.ts: scans tool_result parts newest-first, protects the last 40k tokens (PROTECTED_TOKENS), marks older candidates hidden when their combined estimate clears 20k (PRUNE_TRIGGER_TOKENS — equal to COMPACTION_BUFFER from v1.11.0, so a successful prune is exactly the budget the summary path would have freed). Stops at chats.tail_start_id so it doesn't double-erase across the last summary boundary. Pure decision helper selectPruneTargets exported separately for unit tests. - Wired into maybeFlagForCompaction: prune runs synchronously when overflow is detected; if it freed >= PRUNE_TRIGGER_TOKENS, the needs_compaction flag is NOT set and the (expensive) summary inference call is skipped this turn. The next turn's overflow check re-evaluates from scratch. - 6 new unit tests in prune.test.ts cover: empty input, protection-only (no candidates), candidates below trigger, candidates above trigger, candidates straddling a summary boundary, exactly-protection-tokens. 179 tests total (was 173). Smoke verified post-rebuild: - \\d message_parts shows hidden_at + partial index. - View definition shows AND p.hidden_at IS NULL filters on all three subselects. 
- Synthetic hide-then-restore confirmed the view drops the tool_result jsonb to null when its only part is hidden, and restores when un-hidden. - EXPLAIN ANALYZE on the 42-message stress chat: 0.325ms (faster than v1.13.1-B's 1.018ms — EXISTS short-circuits cleanly for the common no-parts case). - Normal turn (plain text prompt) completes unaffected. Closes a v1.11.0 design item that was scoped but never implemented. With v1.13's parts table the prune is dramatically cheaper to write — pre-parts it would have meant editing JSON blobs in-place; now it's a hidden_at flag and a view subselect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:02:17 +00:00
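The prune decision described in v1.13.4 (ec8593cf77) can be sketched as a pure function. This is only an illustration under stated assumptions: the `ToolResultPart` shape is hypothetical, the boundary handling (stop as soon as a part from the `tail_start_id` message is reached) is a guess at what "stops at chats.tail_start_id" means, and a part counts as protected while the running total before it is still under PROTECTED_TOKENS. The real prune.ts may differ on all three points.

```typescript
// Hypothetical part shape; the real prune.ts types are not shown in the log.
interface ToolResultPart {
  id: string;
  messageId: string;
  tokens: number; // estimated token count for this part
}

const PROTECTED_TOKENS = 40_000;
const PRUNE_TRIGGER_TOKENS = 20_000;

// `parts` must be ordered newest-first. `tailStartId` is the message id at
// the last summary boundary, or null when the chat was never compacted.
function selectPruneTargets(
  parts: readonly ToolResultPart[],
  tailStartId: string | null,
): ToolResultPart[] {
  const candidates: ToolResultPart[] = [];
  let protectedSoFar = 0;
  let candidateTokens = 0;

  for (const part of parts) {
    // Assumption: never look at or past the summary boundary.
    if (tailStartId !== null && part.messageId === tailStartId) break;

    // The newest parts are protected until the protection budget is used up.
    if (protectedSoFar < PROTECTED_TOKENS) {
      protectedSoFar += part.tokens;
      continue;
    }

    candidates.push(part);
    candidateTokens += part.tokens;
  }

  // Only prune when doing so frees at least the trigger budget.
  return candidateTokens >= PRUNE_TRIGGER_TOKENS ? candidates : [];
}
```

Returning an empty list below the trigger keeps the caller simple: it hides whatever comes back and falls through to the summary path when nothing does.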
indifferentketchup	a08d809b73	v1.13.3: cleanup bundle — statement timeout + alpha ordering + stuck-row sweeper + repairToolCall Four independent items, all owed from prior dispatches. - statement_timeout at the database level via: ALTER DATABASE boocode SET statement_timeout = '30s'; Applied operationally; documented as a comment at the top of schema.sql (ALTER DATABASE can't run inside a DO block, so it's not idempotent inside applySchema). Re-apply after a volume reset. - Tool registry alpha-sorted at module load. llama.cpp's prompt cache hits on byte-identical prefixes; any reordering of the tool list near the top of the system prompt would invalidate every cached turn. Single-source sort at the ALL_TOOLS export so toolJsonSchemas() and TOOLS_BY_NAME inherit the order automatically. New tools.test.ts asserts the invariant; total tests 173 (was 172). - Periodic in-process stuck-row sweeper. Runs every 60s, marks 'streaming' rows older than 5 minutes as 'failed', and publishes chat_status='idle' on the user channel so the UI dot drops without a refresh. Closes the mid-session crash UX gap; the v1.12.1 boot sweep only fires once at startup, so sessions used to stay stuck until next reboot. setInterval cleaned up via app.addHook('onClose'). Mirrors handleAbortOrError's publish pattern. - experimental_repairToolCall wired through AI SDK v6 streamText. Pass- through implementation: log + return the original toolCall so the stream keeps going. executeToolPhase's existing error paths (unknown tool name → 'unknown tool: X' result; zod-reject → 'tool X rejected — field: required') already surface bad calls to the model; the value here is preventing the AI SDK from THROWING on parse errors and killing the whole stream. Owed since v1.13.1-A. Smoke verified: - statement_timeout = '30s' confirmed via SHOW. - Tool path normal flow intact (list_dir prompt → tool_call → result → final assistant). 
No malformed tool calls in the test run; repair log will surface them when qwen3.6 actually emits one. - Alpha order verified at runtime via the dist bundle: match: true. - Sweeper logic not traffic-tested (no stuck rows to find), but the SQL UPDATE + broker.publishUser pattern is identical to handleAbort and the boot sweep — synthesis-only verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:46:03 +00:00
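The alpha-ordering invariant from v1.13.3 (a08d809b73) amounts to sorting once at the export and deriving everything else from the sorted array. A minimal sketch, assuming a simplified `ToolDef` shape and placeholder descriptions (both hypothetical); it compares by code unit rather than locale so the order is byte-stable across machines, which is a choice made for this sketch and not confirmed by the log.

```typescript
// Simplified, hypothetical tool shape for illustration only.
interface ToolDef {
  name: string;
  description: string;
}

// Code-unit comparison: deterministic regardless of the host locale.
function sortToolsByName(tools: readonly ToolDef[]): ToolDef[] {
  return [...tools].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}

// Single-source sort: every derived structure inherits this order.
const ALL_TOOLS: ToolDef[] = sortToolsByName([
  { name: "watch_changes", description: "placeholder" },
  { name: "view_file", description: "placeholder" },
  { name: "list_dir", description: "placeholder" },
  { name: "view_truncated_output", description: "placeholder" },
]);

// Map preserves insertion order, so lookups and iteration stay sorted.
const TOOLS_BY_NAME = new Map(ALL_TOOLS.map((t) => [t.name, t] as const));
```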
indifferentketchup	ac1a71f583	v1.13.1-C: port ask_user_input correlation to parts + wire reasoning_parts end-to-end Pass 1 — ask_user_input correlation port (messages.ts:478, :549): - The two correlation queries that backed the elicitation flow used to scan messages.tool_calls and messages.tool_results JSON columns directly. They now JOIN message_parts on payload->>'id' (for the caller assistant) and payload->>'tool_call_id' (for the pending tool row). Semantics preserved: ORDER BY m.created_at DESC LIMIT 1 still picks the latest issuance, the already-answered 409 guard now reads payload.output, and the UPDATE + parts replace inside sql.begin is unchanged from v1.13.0. - Pre-v1.13.0 history has no parts rows and is unreachable to this lookup path (404). Acceptable per dispatch decision — no pending elicitation from before v1.13.0 will still be open. JSON-column fallback can land as a hotfix if it ever surfaces. Pass 2 — reasoning_parts wired end-to-end: - types.ts/StreamResult gains `reasoning: string`. stream-phase.ts accumulates reasoning-delta text per stream (replacing the v1.13.1-A counter-only diagnostic) and returns it on the result. - parts.ts/partsFromAssistantMessage gains an optional `reasoning` param. When present it emits a kind='reasoning' part at sequence 0, ahead of the text and tool_call parts. - error-handler.ts/finalizeCompletion and tool-phase.ts/executeToolPhase both thread result.reasoning into the dual-write call so reasoning-channel models (qwen3.6) get persistent reasoning rows. - payload.ts: loadContext SELECT pulls reasoning_parts from the v1.13.1-B view; OpenAiMessage gains an optional `reasoning` field; buildMessagesPayload collapses reasoning_parts into a single string per assistant message. - stream-phase.ts/toModelMessages converts assistant messages with reasoning into an AI SDK ModelMessage content array starting with a ReasoningPart, matching the @ai-sdk/provider-utils AssistantContent union. 
Reasoning models can now replay prior reasoning context across tool-call boundaries. - types/api.ts and apps/web/src/api/types.ts Message interface gain reasoning_parts (optional, nullable). Frontend doesn't render this yet — field reserved for a v1.14 UI surface. Tests: 2 new in parts.test.ts cover reasoning-at-sequence-0 with and without text content. 172 tests pass (170 prior + 2 new). Smoke verified against the live container: - A reasoning-prompt ("walk through 17 × 23 step by step") produced one message with kind='reasoning' (361 chars) at sequence 0 and kind='text' (429 chars) at sequence 1. Adapter log confirmed reasoning capture. - The new correlation SQL was validated against existing tool_call / tool_result parts: returns the expected message_id + payload shape with pending state correctly identified via payload.output IS NULL. - ask_user_input end-to-end through the UI is Sam's smoke — the Prompt Builder agent does not always trigger ask_user_input for these prompts, so synthetic verification via SQL substituted for traffic-driven cover. Annotation: the v1.13.1-A abort-throw site in stream-phase.ts got a one-liner comment ("AI SDK v6 fullStream returns normally on abort; check signal explicitly.") to prevent a future refactor removing it. v1.13.2 drops the dual-write + the JSON columns + collapses the view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:34:10 +00:00
indifferentketchup	1cb6eee24c	v1.13.0: message_parts table + dual-write at every tool_calls/tool_results site Adds a granular message_parts table (one row per text/tool_call/tool_result chunk) without changing any read path. Old messages.content / tool_calls / tool_results columns remain authoritative for v1.13.0; this dispatch is write-only mirroring so the AI SDK migration in v1.13.1 can flip read authority without a backfill window. Schema: CREATE TABLE message_parts (id, message_id FK ON DELETE CASCADE, sequence int, kind text CHECK (text|tool_call|tool_result|reasoning|step_start), payload jsonb, created_at, UNIQUE (message_id, sequence)) New module services/inference/parts.ts with two pure derive helpers (partsFromAssistantMessage, partsFromToolMessage) and insertParts that fans out a multi-row INSERT via postgres-js. Wired dual-write at every site that writes tool_calls or tool_results: - tool-phase.ts: assistant finalize UPDATE, executed-tool UPDATE, ask_user_input sentinel UPDATE - messages.ts answer flow: DELETE pending tool_result part + INSERT answered one inside the existing sql.begin - skills.ts: synthetic assistant + tool INSERTs both inside existing tx - chats.ts fork: CTE clones parts via ROW_NUMBER pairing (source→dest message id mapping in one statement, no N+1) - error-handler.ts finalizeCompletion: text part for plain text-only assistant turns Deviation: tool-phase.ts finalize UPDATEs and finalizeCompletion text-part write are not wrapped in fresh sql.begin transactions. Safe in v1.13.0 because JSON columns are authoritative for reads. v1.13.1 must wrap these sites before flipping read authority; TODO comments added at each unwrapped site referencing v1.13.1. Tests: 8 new unit tests for the derive helpers in services/__tests__/parts.test.ts. Existing 162 tests untouched. 170 total. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 05:46:29 +00:00
indifferentketchup	9ef00c0268	v1.12.4: complete inference.ts split into services/inference/ - sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel, runDoomLoopSummary, insertDoomLoopSentinel - inference.ts → inference/turn.ts: residue is runAssistantTurn, runInference, createInferenceRunner orchestration only - inference/index.ts: re-export shim preserves the public surface (createInferenceRunner, runInference, runAssistantTurn, detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/ FramePublisher) - src/index.ts + auto_name.ts + the two vitest test files updated to import from ./services/inference/index.js explicitly (NodeNext ESM doesn't honor directory-index resolution) Final tally: 11 files under services/inference/, the largest being sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept side-by-side until a third sentinel justifies factoring out a shared runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is stream-phase.ts at 380. Public import surface unchanged. tool-phase.ts → turn.ts back-edge for runAssistantTurn remains (cycle is safe; resolved at call time). Prepares the file structure for v1.13 AI SDK migration — streamText swap targets stream-phase.ts only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:36:35 +00:00
indifferentketchup	fce8c06932	Merge v1.11.10 + doc refinements onto v1.12.0 main # Conflicts: # CLAUDE.md	2026-05-21 15:22:46 +00:00
indifferentketchup	16c69a38a1	Merge v1.12 track B: codecontext sidecar # Conflicts: # apps/web/src/components/ToolCallLine.tsx # docker-compose.yml	2026-05-21 15:12:30 +00:00
indifferentketchup	a2e2481ef9	v1.12 track A: container guidance + skills	2026-05-21 15:11:04 +00:00
indifferentketchup	136e9538aa	v1.12 track B.2: codecontext tool wrappers + tests	2026-05-21 13:35:44 +00:00
indifferentketchup	3e1e17ecf6	v1.11.10: stream-cap response body at 5MB, abort on overflow	2026-05-21 02:27:31 +00:00
indifferentketchup	ab01e04d77	v1.11.9: manual redirect handling — re-run URL guard on each hop	2026-05-21 00:37:35 +00:00
indifferentketchup	4e67a265ac	v1.11.8: address review — inject fetcher, byte-count limit, redirect TODO	2026-05-20 21:40:11 +00:00
indifferentketchup	2fdbb05477	v1.11.8: web_search + web_fetch tools via SearXNG Adds two new tools registered through the existing ALL_TOOLS registry: - web_search hits SearXNG's JSON API (Fathom, internal Tailscale URL, no auth) and returns top results - web_fetch retrieves a URL's text content, gated by isPublicUrl (url_guard.ts) which blocks loopback / RFC1918 / Tailscale CGNAT / link-local / .local / .internal / non-http schemes Both tools are opt-in via the existing session.web_search_enabled flag (plumbed in v1.9, activated here). Default off. UI labels updated to "Enable web search and fetch" / "Web search and fetch" since fetch joins the same store. Counts against the v1.8.2 per-turn budget; covered by the v1.11.6 doom-loop guard. Native Node 20 fetch — no new prod dep. HTML stripping via regex (script and style content elided wholesale). 5MB body cap, 15s fetch timeout, 8000-char default output, 32000-char cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 21:38:02 +00:00
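The URL gate that v1.11.8 (2fdbb05477) describes for web_fetch could look roughly like the sketch below. Only the function name `isPublicUrl` and the list of blocked categories come from the log; the body is an assumption. In particular, this version does no DNS resolution and conservatively rejects every IPv6 literal, which the real url_guard.ts may or may not do.

```typescript
// Sketch of a public-URL check; not the actual url_guard.ts implementation.
function isPublicUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // unparseable input is never fetched
  }

  // Non-http schemes are blocked outright.
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".local") || host.endsWith(".internal")) {
    return false;
  }

  // Assumption: reject all IPv6 literals (hostname is bracketed, e.g. "[::1]").
  if (host.startsWith("[")) return false;

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return true; // a DNS name; this sketch does not resolve it

  const a = Number(m[1]);
  const b = Number(m[2]);
  if (a === 0 || a === 127) return false; // "this network" and loopback
  if (a === 10) return false; // RFC1918 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return false; // RFC1918 172.16.0.0/12
  if (a === 192 && b === 168) return false; // RFC1918 192.168.0.0/16
  if (a === 100 && b >= 64 && b <= 127) return false; // CGNAT 100.64.0.0/10 (Tailscale)
  if (a === 169 && b === 254) return false; // link-local 169.254.0.0/16
  return true;
}
```

Because a DNS name can still resolve to a private address, a guard like this is only one layer; re-running it on each redirect hop, as v1.11.9 does, covers a different gap.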
indifferentketchup	863452ae07	v1.11.7: secret-file deny list for codebase tools Ports continue.dev's DEFAULT_SECURITY_IGNORE_FILETYPES + ignored-dir lists into apps/server/src/services/secret_guard.ts plus a small BooCode additions block (id_rsa, credentials, .netrc, .kdbx). Tiny glob-to- regex matcher; no new prod dep. view_file hard-refuses via SecretBlockedError. list_dir / grep / find_files filter their results and surface a pathguard_note string field with the hidden count — never list the offending paths back. Named secret_guard.ts (not safety/pathGuard.ts) to avoid collision with the existing path_guard.ts which already exports a pathGuard() function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-20 20:55:50 +00:00
indifferentketchup	f92b0810c3	v1.11.6: doom-loop guard (3 identical tool calls aborts recursion)	2026-05-20 20:28:45 +00:00
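A doom-loop check like the one v1.11.6 (f92b0810c3) names can be sketched in a few lines. The names `detectDoomLoop` and `DOOM_LOOP_THRESHOLD` appear elsewhere in this log; the `ToolCall` shape and the definition of "identical" (same tool name plus same JSON-serialized arguments, on consecutive calls) are assumptions made for this sketch.

```typescript
// Hypothetical call shape for illustration.
interface ToolCall {
  name: string;
  args: unknown;
}

const DOOM_LOOP_THRESHOLD = 3;

// True when the most recent DOOM_LOOP_THRESHOLD calls are all identical.
function detectDoomLoop(calls: readonly ToolCall[]): boolean {
  if (calls.length < DOOM_LOOP_THRESHOLD) return false;

  // Note: JSON.stringify is key-order sensitive, so {a,b} and {b,a} differ here.
  const keys = calls
    .slice(-DOOM_LOOP_THRESHOLD)
    .map((c) => `${c.name}:${JSON.stringify(c.args)}`);

  return keys.every((k) => k === keys[0]);
}
```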
indifferentketchup	89dcfb95dc	v1.11.3: fix ctx_max capture via /props endpoint - llama-server does not emit n_ctx in timings (confirmed empirically); dead code at inference.ts:479 and compaction.ts:300 never fired - New model-context.ts: cached fetch of /upstream/<model>/props with positive-cache (no TTL) and 60s negative-cache - Wired into all 4 ctx_max write sites: 3 in inference.ts (executeToolPhase, finalizeCompletion, runCapHitSummary) and 1 in compaction.ts (summary row INSERT) - AbortController 3s timeout, lenient parsing with sensible defaults - 12 new vitest cases for the cache module (59 total) - 7 historical assistant rows backfilled manually (see notes)	2026-05-20 19:29:26 +00:00
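The caching policy in v1.11.3 (89dcfb95dc), where successes are cached forever and failures for 60 seconds, can be illustrated without any network code. The class and method names below are hypothetical, and the caller is assumed to pass in the current time and to perform the actual /props fetch (with its 3s AbortController timeout) on a miss.

```typescript
const NEGATIVE_TTL_MS = 60_000;

type CacheEntry =
  | { kind: "ok"; ctxMax: number }
  | { kind: "failed"; at: number };

type Lookup =
  | { status: "hit"; ctxMax: number }
  | { status: "known-bad" } // recent failure: do not refetch yet
  | { status: "miss" }; // caller should fetch and record the outcome

// Hypothetical bookkeeping class; the real model-context.ts API is not shown.
class ModelContextCache {
  private readonly entries = new Map<string, CacheEntry>();

  lookup(model: string, nowMs: number): Lookup {
    const entry = this.entries.get(model);
    if (entry === undefined) return { status: "miss" };
    if (entry.kind === "ok") return { status: "hit", ctxMax: entry.ctxMax }; // no TTL
    if (nowMs - entry.at < NEGATIVE_TTL_MS) return { status: "known-bad" };
    this.entries.delete(model); // negative entry expired
    return { status: "miss" };
  }

  recordSuccess(model: string, ctxMax: number): void {
    this.entries.set(model, { kind: "ok", ctxMax });
  }

  recordFailure(model: string, nowMs: number): void {
    this.entries.set(model, { kind: "failed", at: nowMs });
  }
}
```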
indifferentketchup	dc43dd44f9	v1.11: opencode-style compaction port - compaction.ts: usable/isOverflow/estimate/turns/select/buildPrompt/process - compaction-prompt.ts: SUMMARY_TEMPLATE verbatim from opencode - schema: messages.{compacted_at,summary,tail_start_id} + chats.needs_compaction - inference: auto-trigger on overflow, pre-fetch compaction before next turn - /compact slash command rewired to new path - WS: chat_status working/idle around compaction + compacted frame - frontend: SummaryCard + sonner toast on compacted - 24 unit tests for pure functions	2026-05-20 19:05:35 +00:00
indifferentketchup	09aecc4ee9	v1.9: settings pane + per-project defaults + bulk archive + themes lift Adds a singleton, ephemeral 'settings' pane kind to the workspace. Opened via a new bottom-pinned button in ProjectSidebar (emits an open_settings_pane event when a session is mounted; navigates to /settings otherwise). Pane has three sections (Session, Project, Theme) and a maximize toggle that hides sibling pane columns via display:none on desktop only. Settings panes don't count toward MAX_PANES and are filtered out of the localStorage persistence layer so reload always restores a clean workspace. Schema (additive): - projects.default_system_prompt TEXT NOT NULL DEFAULT '' - projects.default_web_search_enabled BOOLEAN NOT NULL DEFAULT false - sessions.web_search_enabled BOOLEAN (nullable; null = inherit) Inference resolves user_prompt = session.system_prompt.trim() || project.default_system_prompt.trim(); empty/whitespace at either layer means "no override". Keeps the columns NOT NULL and matches the existing inherit semantics. Server routes: - GET /api/projects/:id (new; settings pane refetches on project_updated) - PATCH /api/projects/:id accepts default_system_prompt, default_web_search_enabled - PATCH /api/sessions/:id accepts web_search_enabled (tri-state) - POST /api/projects/:id/sessions/archive-all + GET /api/projects/:id/sessions/open-count - POST /api/sessions/:id/chats/archive-all + GET /api/sessions/:id/chats/open-count - PATCH /api/sessions/:id now broadcasts session_updated on every successful PATCH (was rename-only). Lets SettingsPane open in another tab pick up edits without a refetch. Bulk-archive publishes one session_archived / chat_archived frame per affected id so useSidebar's existing reducer cases handle them incrementally: no new frame type, no payload widening. ModelPicker refactored: shared ModelList inside a responsive shell. Desktop = labeled trigger + DropdownMenu, mobile = icon-only Cpu button + BottomSheet. 
Header in Session.tsx drops the pill wrap on mobile since the new trigger is the visual. ChatInput gains an icon-only '+' DropdownMenu next to AgentPicker when sessionId + webSearchEnabled props are provided. One item for now — Web search — with a checkmark reflecting the stored value (true), not the effective one. Click PATCHes the override; to restore inherit-from-project the user opens SettingsPane. ThemePicker lifted out of pages/Settings.tsx into a reusable component. The standalone /settings route is now a thin wrapper that mounts <ThemePicker /> with a Back button on top (navigate(-1) with fallback to '/'); the SettingsPane Theme tab renders the same picker bare. Project section delete-flow removed (button + confirm dialog + handler). Replaced with "Archive all sessions" using the same two-step count → confirm → fire pattern as "Archive all chats" in the Session section. api.projects.remove() stays in the client because useProjects.ts still uses it. Hand-rolled Switch primitive in SettingsPane (no shadcn switch in the project; spec said no new deps). Section nav is plain buttons (no shadcn Tabs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 17:37:29 +00:00
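The inherit semantics from v1.9 (09aecc4ee9) can be sketched as two small resolvers. The column names come from the schema notes in that commit; the interface names, the function names, and the use of `??` for the tri-state web-search flag are assumptions (the log only states that null means inherit).

```typescript
// Hypothetical row shapes, limited to the columns the commit mentions.
interface ProjectDefaults {
  default_system_prompt: string;
  default_web_search_enabled: boolean;
}

interface SessionSettings {
  system_prompt: string;
  web_search_enabled: boolean | null; // null = inherit from project
}

// Empty or whitespace-only text at either layer means "no override".
function resolveUserPrompt(session: SessionSettings, project: ProjectDefaults): string {
  return session.system_prompt.trim() || project.default_system_prompt.trim();
}

// Assumption: only null inherits; an explicit false is a real override.
function resolveWebSearch(session: SessionSettings, project: ProjectDefaults): boolean {
  return session.web_search_enabled ?? project.default_web_search_enabled;
}
```

Using `||` for the prompt and `??` for the flag is deliberate in this sketch: an empty string should fall through to the project default, while an explicit `false` should not.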
indifferentketchup	5c61cc7281	v1.8.2: tool loop cap-hit summary + tool call UI compaction Old hardcoded MAX_TOOL_LOOP_DEPTH=15 replaced by per-agent max_tool_calls (1-100, AGENTS.md frontmatter) with defaults: 30 for read-only-only agents, 10 for agents that include any non-read-only tool, 15 for raw chat. When the loop hits cap, fire one final summary call with tools disabled, stream the wrap-up into the in-flight assistant message, then insert a system sentinel with metadata.kind='cap_hit'. The sentinel renders an amber bubble with a Continue button (latest sentinel only) that POSTs to a new /api/chats/:id/continue route to extend. Hard ceiling: 3 cap-hits per chat (2 continues max) — third sentinel reports can_continue=false. Error frames carry a machine-readable reason code alongside human error text. Failed messages persist the reason via metadata.kind='error' so the bubble renders specifics on reload (WS error frame is one-shot). Tool call UI rewired: ToolCallLine renders inline (↳ name args spinner/check/✗, expand-on-tap for args+result); ToolCallGroup collapses 3+ consecutive same-tool runs into a compact card. MessageList owns a three-pass pre-render (flatten + fold tool results onto matching runs by id + group same-tool runs + number sentinels). MessageBubble drops tool rendering and adds the sentinel / error-reason branches. ToolCallCard deleted. Roadmap follow-up logged: add explicit max_tool_calls: 30 to the 6 agents in /data/AGENTS.md and /opt/boocode/AGENTS.md post-ship for discoverability (defaults handle behavior identically). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 10:31:32 +00:00
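The per-agent cap defaults listed in v1.8.2 (5c61cc7281) reduce to a small decision function. This is a sketch only: `AgentConfig`, `resolveMaxToolCalls`, and the `isReadOnly` predicate are hypothetical, and clamping an out-of-range value into 1-100 is an assumption (the real parser might reject it instead).

```typescript
// Hypothetical view of an AGENTS.md frontmatter entry.
interface AgentConfig {
  tools: string[];
  max_tool_calls?: number;
}

const RAW_CHAT_CAP = 15;
const READ_ONLY_CAP = 30;
const MIXED_CAP = 10;

// `agent` is null for raw chat (no agent selected).
function resolveMaxToolCalls(
  agent: AgentConfig | null,
  isReadOnly: (tool: string) => boolean,
): number {
  if (agent === null) return RAW_CHAT_CAP;

  if (agent.max_tool_calls !== undefined) {
    // Assumption: clamp into the documented 1-100 range.
    return Math.min(100, Math.max(1, Math.trunc(agent.max_tool_calls)));
  }

  // 30 when every tool is read-only, 10 as soon as any tool can write.
  return agent.tools.every(isReadOnly) ? READ_ONLY_CAP : MIXED_CAP;
}
```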
indifferentketchup	1ecb79476e	test: vitest harness + unit tests for security-critical pure functions Adds vitest 3.x (pinned to ^3 because vitest 4 requires Vite 6, while the web app pins Vite 5). Tests live under src//__tests__/. Three target functions: - sanitizeFolderName (project_bootstrap.ts): 8 cases covering happy path, path-traversal stripping, empty-after-sanitize, control chars, truncation at 64, null bytes, leading/trailing dot/slash stripping. - resolveProjectPath (projects.ts): 7 cases including symlink-escape via realpath, outside-whitelist rejection, nonexistent path, AND a flagged BEHAVIOR GAP: passing the whitelist path itself currently returns success rather than erroring out (function early-exits the scope check when real === whitelistReal). Test asserts current behavior with explicit comment flagging the spec violation — function NOT silently patched. Function made exportable for testing (single keyword change). - buildMessagesPayload (inference.ts): 8 cases for compact-marker logic (no marker, marker present, multiple compacts, tool-message position). tsconfig.json excludes __tests__ + *.test.ts from emit so dist/ stays clean. pnpm -C apps/server test => 23 passed in ~340ms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-16 04:35:31 +00:00

34 Commits