195 Commits

Author SHA1 Message Date
fe52250d78 coder(providers): fix empty picker (loading-state) + config model overrides + current Claude models
Fix: getProviderSnapshot returned synchronous installed:false 'loading' entries on a cache miss (v2.5.5/Phase 2), which AgentComposerBar filters out — with the Phase 5 client poll not yet built, a single fetch stranded on 'loading' and the picker showed no providers. It now awaits the build and returns terminal entries; the sync loading-return is deferred until Phase 5. Builds stay fast via the tier-2 cold-probe skip.

Feature: wire the v2.3 config schema's models/additionalModels — buildResolvedRegistry carries them onto ResolvedProviderDef (models replace, additionalModels merge) and provider-snapshot applies them to every ready model list, so /data/coder-providers.json can edit any provider's models with no code change. Claude staticModels bumped from the stale 2-entry list to opus/sonnet/haiku latest-aliases + pinned claude-opus-4-8 / claude-sonnet-4-6 / claude-haiku-4-5-20251001 (passed verbatim to claude --model). +2 tests (109 total).
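The override rule described above (models replace, additionalModels merge) can be sketched as follows. This is a minimal illustration, not the project's code: `ModelOverride` and `applyModelOverrides` are hypothetical names, and de-duplicating merged entries is an assumption.

```typescript
// Hypothetical shape of one provider's override; the two field names come
// from the commit message, everything else is illustrative.
interface ModelOverride {
  models?: string[];           // when present, replaces the built-in list
  additionalModels?: string[]; // when present, is merged onto the list
}

// Apply an override to a provider's built-in model list.
// Assumption: merged entries are de-duplicated, keeping the first occurrence.
function applyModelOverrides(builtIn: string[], override: ModelOverride = {}): string[] {
  const base = override.models ?? builtIn;
  const merged = [...base, ...(override.additionalModels ?? [])];
  return merged.filter((model, index) => merged.indexOf(model) === index);
}
```

With an empty override the built-in list comes back unchanged, which matches the "no code change" intent of editing models purely through the config file.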

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.7-claude-models-and-picker-fix
2026-05-29 12:37:01 +00:00
4035aa2b98 coder(providers): v2.3 provider-lifecycle phase 3 — generic ACP dispatch
ACP dispatch now spawns from the resolved registry's launch spec instead of a hardcoded per-name switch. acp-spawn.ts gains resolveLaunchSpec(resolved, installPath): launchCommand (config override / custom-ACP command) wins, else the kept resolveAcpSpawnArgs switch is the built-in fallback. acp-dispatch.ts spawns spec.binary/spec.args with env { ...process.env, ...spec.env }; dispatcher.ts loads the resolved def by task.agent and passes it through. Config-defined custom ACP providers dispatch with no new switch case. Built-in dispatch (opencode/goose/qwen) is byte-identical to pre-v2.3 — proven by a regression test (opencode->['acp'], goose->['acp'], qwen->['--acp'], binary=installPath ?? id, empty env -> plain process.env). Deliberate deviation from design's !installPath->null: the installPath ?? id fallback is preserved. setSessionMode/permission/streaming and the dispatcher poll/NOTIFY/running-guard untouched. 7 new acp-spawn.test.ts cases. No routes/UI (Phase 4+).
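The precedence described above (a configured launch command wins, otherwise the built-in switch applies, with `installPath ?? id` as the binary) might look roughly like this. The types and the `resolveLaunchSpecSketch` name are hypothetical simplifications, and returning `null` for an unknown id is an assumption.

```typescript
// Hypothetical, simplified shapes; the real ResolvedProviderDef has more fields.
interface ResolvedProvider {
  id: string;
  launchCommand?: string[]; // config override or custom-ACP command
  env?: Record<string, string>;
}

interface LaunchSpec {
  binary: string;
  args: string[];
  env: Record<string, string>;
}

// Built-in fallback argv, matching the regression test quoted above.
function builtInAcpArgs(id: string): string[] | null {
  switch (id) {
    case "opencode":
    case "goose":
      return ["acp"];
    case "qwen":
      return ["--acp"];
    default:
      return null; // assumption: unknown ids yield no spec
  }
}

function resolveLaunchSpecSketch(def: ResolvedProvider, installPath?: string): LaunchSpec | null {
  // 1. A configured launch command always wins.
  if (def.launchCommand && def.launchCommand.length > 0) {
    const [binary, ...args] = def.launchCommand;
    return { binary, args, env: def.env ?? {} };
  }
  // 2. Otherwise fall back to the built-in switch.
  const args = builtInAcpArgs(def.id);
  if (args === null) return null;
  return { binary: installPath ?? def.id, args, env: def.env ?? {} };
}
```

At spawn time the caller would layer `spec.env` over `process.env`, as the commit describes; an empty `env` then leaves the plain process environment in effect.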

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.6-provider-lifecycle-phase3
2026-05-29 12:06:32 +00:00
35a0aba211 coder(providers): v2.3 provider-lifecycle phase 2 — snapshot lifecycle
provider-snapshot no longer returns null for uninstalled/disabled providers: it emits one entry per registered provider with a lifecycle status (loading|ready|unavailable|error), an enabled flag, and a two-tier probe. Tier-1 is a fast which-style check (command-availability.ts, execFile/no-shell); tier-2 (cold ACP probe) is skipped unless forced, last_probed_at is older than PROVIDER_PROBE_TTL_MS (24h), or DB models are empty — the snapshot-latency win. Cache miss returns status:'loading' synchronously while the build settles via the existing inflight promise. ProviderSnapshotStatus/Entry regain loading/unavailable + gain enabled/description?/fetchedAt? in both coder and web copies, guarded by a runtime parity test (provider-types-parity.test.ts; compile-time cross-project check was blocked by TS6307). Also tracks the data/coder-providers.json seed via a .gitignore exception, completing the Phase 1 config file. No dispatch/route/UI changes (Phase 3+); AgentComposerBar filtering unchanged. 13 snapshot tests (+6) + 6 parity tests.
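The tier-2 skip rule above reduces to a small predicate. The sketch below assumes timestamps in epoch milliseconds and treats a never-probed provider as stale; `ProbeState` and `shouldRunColdProbe` are hypothetical names.

```typescript
const PROVIDER_PROBE_TTL_MS = 24 * 60 * 60 * 1000; // 24h, per the commit message

interface ProbeState {
  lastProbedAt: number | null; // epoch ms; null if never probed (assumption)
  dbModels: string[];          // models already stored for this provider
}

// Returns true when the expensive cold ACP probe should run.
function shouldRunColdProbe(state: ProbeState, now: number, force = false): boolean {
  if (force) return true;                        // caller explicitly asked for it
  if (state.dbModels.length === 0) return true;  // nothing cached to serve
  if (state.lastProbedAt === null) return true;  // never probed
  return now - state.lastProbedAt > PROVIDER_PROBE_TTL_MS; // cache is stale
}
```

In the common case (recent probe, models present) the predicate returns false, so only the fast tier-1 availability check runs.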

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.5-provider-lifecycle-phase2
2026-05-29 11:47:48 +00:00
3730dc9341 coder(providers): v2.3 provider-lifecycle phase 1 — config-backed registry
Adds a config layer merged over the hardcoded built-ins (tasks 1.1-1.6): CODER_PROVIDERS_PATH env (default /data/coder-providers.json); provider-config.ts (Zod schema + never-throw loader — missing/invalid file falls back to built-ins only — + save); provider-config-registry.ts (ResolvedProviderDef + buildResolvedRegistry merge: override built-ins, add custom extends:'acp' entries, boocode always enabled + singleton); agent-probe now iterates the resolved registry, probes custom-ACP command[0] via execFile (no shell), skips disabled providers (keeps the row), reads enabled from memory only (no DB column). No snapshot/dispatch/route/UI changes (Phase 2+). 6 new unit tests; empty config provably yields exactly the built-ins.
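A minimal sketch of the merge described above, under stated assumptions: the config is modeled as a plain id-to-entry record (the real file format and Zod schema are not shown in this log), and `mergeRegistry` plus the two interfaces are hypothetical names.

```typescript
interface ProviderDef {
  id: string;
  enabled: boolean;
  command?: string[];
}

// Hypothetical config entry; only `extends: 'acp'` and `enabled` are named
// in the commit message.
interface ProviderConfigEntry {
  enabled?: boolean;
  extends?: "acp";
  command?: string[];
}
type ProviderConfig = Record<string, ProviderConfigEntry>;

function mergeRegistry(builtIns: ProviderDef[], config: ProviderConfig): ProviderDef[] {
  // Overrides are applied to built-ins; boocode can never be disabled.
  const resolved: ProviderDef[] = builtIns.map((builtIn): ProviderDef => {
    const override = config[builtIn.id];
    return {
      ...builtIn,
      enabled: builtIn.id === "boocode" ? true : (override?.enabled ?? builtIn.enabled),
      command: override?.command ?? builtIn.command,
    };
  });
  // Non-built-in entries are added only when they extend 'acp'.
  for (const id of Object.keys(config)) {
    const entry = config[id];
    const isBuiltIn = builtIns.some((b) => b.id === id);
    if (!isBuiltIn && entry.extends === "acp") {
      resolved.push({ id, enabled: entry.enabled ?? true, command: entry.command });
    }
  }
  return resolved;
}
```

Passing an empty config returns exactly the built-ins, which is the property the commit's unit tests are said to prove. A never-throw loader would hand this function `{}` whenever the file is missing or invalid.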

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.4-provider-lifecycle-phase1
2026-05-29 04:09:34 +00:00
a359a4ab8b coder(providers): remove retired cursor and copilot providers
Drop both retired providers from BooCoder's provider layer: acp-spawn argv cases, provider-manifest mode blocks + manifest keys, provider-commands maps, the provider-snapshot cursor model-CLI branch (+ orphaned exec/promisify imports), the agent-probe copilot ACP-detect branch, and the now-dead cursor-models module + its test. The PROVIDERS registry array already lacked both. Built-ins unchanged: claude, opencode, goose, qwen, native boocode.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.3-remove-cursor-copilot
2026-05-29 04:07:21 +00:00
a8c84ecfe4 chore+docs: config, agent registry, codecontext, v2.6 spec, changelog
Working-tree config/doc changes (.gitignore, CLAUDE.md, AGENTS.md removal + data/AGENTS.md, codecontext Dockerfile/shim — pre-existing) plus this session's v2-6 persistent-agent-sessions openspec proposal/design/tasks (planning only; feature unimplemented, reserves the v2.6.0 tag) and the v2.5.2 CHANGELOG entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
v2.5.2-coder-ux-fixes
2026-05-29 03:12:31 +00:00
547fd70650 server/coder: working-tree backend changes (pre-existing)
Checkpoint of in-progress backend work present in the tree, not authored this session: auto_name, inference tool-phase/turn, secret_guard, provider-registry, plus a new agent-allowlist test (7 tests, passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:16 +00:00
990a615b87 web(coder UI): ChatInput migration + Thinking render + DiffPanel route fix
Bundles in-progress working-tree UI work not authored this session (CoderPane ChatInput migration, AgentComposerBar/CoderMessageList/tab-bar/sidebar/pane refinements, provider icons) with this session's changes to the same files: MessageBubble renders a collapsible 'Thinking' block from reasoning_text/reasoning_parts (surfacing ACP agent_thought_chunk + native reasoning), and the DiffPanel approve/reject calls are repointed to the real /api/coder/pending/:id/apply and /reject routes (the old /sessions/:id/pending/:id/approve|reject paths did not exist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:06 +00:00
5352fd9942 coder(pending): new-file-from-RightRail create endpoint + modal
POST /api/sessions/:sessionId/pending/create queues a pending_changes create via queueCreate (WriteGuardError -> 422 with the guard message). RightRail gains a 'New file from pasted text' modal (path + content) wired through api.coder.createPendingFile; sessionId is threaded down from App.tsx. The staged change shows in the CoderPane DiffPanel for explicit apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:50 +00:00
66df410826 web: fix mobile nav stuck-open on rejoin + paste-chip code fence
useViewport re-syncs the snapshot on pageshow/visibilitychange/resize/orientationchange — iOS reported a stale width on backgrounded-tab restore, leaving isMobile=false so the sidebar rendered as a permanent column with no close affordance. flattenToMessage now inserts pasted-text chips verbatim instead of wrapping them in a triple-backtick fence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:42 +00:00
f89c8f3f15 coder(dispatcher): react to new tasks via LISTEN/NOTIFY, poll as fallback
AFTER INSERT trigger on tasks fires pg_notify('tasks_new'); the dispatcher listens via porsager sql.listen and triggers an immediate poll, with the setInterval poll kept at 2s as a missed-notification safety net. Per-session guard unchanged (no double-dispatch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:34 +00:00
cbef7618b3 v2.5.1-budget-100: raise all tool call budgets to 100 + codecontextignore fix
Budget defaults raised from 50/10/50 to 100/100/100 (read-only,
non-read-only, no-agent). Per-agent max_tool_calls from AGENTS.md
still overrides.
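As a rough sketch of the precedence stated above: the per-agent `max_tool_calls` value, when set, beats the default for the budget class. The names `BudgetKind` and `resolveToolBudget` are hypothetical.

```typescript
type BudgetKind = "readOnly" | "nonReadOnly" | "noAgent";

// Defaults after this commit (previously 50/10/50).
const DEFAULT_TOOL_BUDGETS: Record<BudgetKind, number> = {
  readOnly: 100,
  nonReadOnly: 100,
  noAgent: 100,
};

interface AgentBudget {
  max_tool_calls?: number | null; // from AGENTS.md frontmatter
}

function resolveToolBudget(kind: BudgetKind, agent?: AgentBudget): number {
  // An explicit per-agent value overrides the default for its class.
  return agent?.max_tool_calls ?? DEFAULT_TOOL_BUDGETS[kind];
}
```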

Added .claude/worktrees/ to .codecontextignore to prevent
get_codebase_overview from parsing empty stub files in stale
worktree node_modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.5.1-budget-100
2026-05-28 02:40:26 +00:00
fcc7c5a86e v2.5.0-task-model: lightweight task model services + tasks table
Task model infrastructure for cheap LLM calls (auto-naming, search
rewrite, tags, summaries) via a dedicated llama-server instance at
TASK_MODEL_URL, falling back to LLAMA_SWAP_URL with FAST_MODEL when
unset. Replaces the inline fetch in auto_name.ts with taskModelCompletion.
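The endpoint fallback described above can be sketched as a pure function over the environment. This assumes the dedicated instance needs no explicit model name, which the commit does not state; `resolveTaskModelTarget` and both interfaces are hypothetical.

```typescript
interface TaskModelEnv {
  TASK_MODEL_URL?: string;
  LLAMA_SWAP_URL: string;
  FAST_MODEL?: string;
}

interface TaskModelTarget {
  baseUrl: string;
  model?: string; // assumption: omitted for the dedicated single-model instance
}

function resolveTaskModelTarget(env: TaskModelEnv): TaskModelTarget {
  // Prefer the dedicated llama-server instance when configured.
  if (env.TASK_MODEL_URL) return { baseUrl: env.TASK_MODEL_URL };
  // Otherwise route to llama-swap and ask for the fast model.
  return { baseUrl: env.LLAMA_SWAP_URL, model: env.FAST_MODEL };
}
```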

Adds search query rewriting: on step 0 when web tools are enabled, the
user's message is summarized into a search intent hint appended to the
system prompt, improving web_search relevance.

Schema: tasks table for provider dispatch and arena, sessions.tags column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.5.0-task-model
2026-05-27 21:44:39 +00:00
bcfc94fa47 v2.4.1-sidecar-routing: route per-agent flags to llama-sidecar + tool gap fix
Batch 3c: when an agent has llama_extra_args in AGENTS.md, provider.ts
routes inference through LLAMA_SIDECAR_URL instead of LLAMA_SWAP_URL.
X-Agent-Flags header built from the agent's flags. Boot-time guard
refuses to start if any agent has llama_extra_args but LLAMA_SIDECAR_URL
is unset. PrefixFingerprint gains a route field (swap/sidecar) for
per-turn visibility. 9 provider tests.
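The routing rule and boot-time guard above could be sketched like this. The assumptions are loud ones: `llama_extra_args` is modeled as a string array, and the `X-Agent-Flags` value is a space-joined string, neither of which the commit specifies. All function and type names are hypothetical.

```typescript
interface AgentDef {
  name: string;
  llama_extra_args?: string[]; // assumption: parsed into an argv-style array
}

interface RoutingConfig {
  LLAMA_SWAP_URL: string;
  LLAMA_SIDECAR_URL?: string;
}

type Route = "swap" | "sidecar";

function hasExtraArgs(agent: AgentDef): boolean {
  return (agent.llama_extra_args ?? []).length > 0;
}

// Boot-time guard: refuse to start when any agent needs the sidecar but no
// sidecar URL is configured.
function assertSidecarConfigured(agents: AgentDef[], config: RoutingConfig): void {
  const needing = agents.filter(hasExtraArgs).map((a) => a.name);
  if (needing.length > 0 && !config.LLAMA_SIDECAR_URL) {
    throw new Error(`LLAMA_SIDECAR_URL is unset but these agents set llama_extra_args: ${needing.join(", ")}`);
  }
}

function chooseRoute(
  agent: AgentDef,
  config: RoutingConfig,
): { route: Route; baseUrl: string; headers: Record<string, string> } {
  if (hasExtraArgs(agent) && config.LLAMA_SIDECAR_URL) {
    return {
      route: "sidecar",
      baseUrl: config.LLAMA_SIDECAR_URL,
      // Assumption: flags are space-joined; the real header encoding may differ.
      headers: { "X-Agent-Flags": (agent.llama_extra_args ?? []).join(" ") },
    };
  }
  return { route: "swap", baseUrl: config.LLAMA_SWAP_URL, headers: {} };
}
```

Running the guard once at boot means `chooseRoute` never has to handle an agent with flags but no sidecar at request time.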

AGENTS.md tool gap: all agents (except Prompt Builder) were missing 8
tools that were added after the original tool lists were written:
request_read_access, view_truncated_output, ask_user_input, git_status,
get_blast_radius, get_hot_files, get_middleware, get_routes. The missing
request_read_access caused silent "permission denied" when reading files
outside the project root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.4.1-sidecar-routing
2026-05-27 19:28:08 +00:00
90a6761b07 v2.4.0-unsloth-studio-lift: port 3 Unsloth Studio AGPL-3.0 modules
Batch 1 — tool-call-parser.ts: replaces xml-parser.ts with a port of
Unsloth's tool_call_parser.py. Adds balanced-brace JSON scanner,
single-param fast path, hasToolSignal/stripToolMarkup/parseToolCallsFromText
exports, and stream-finalization stripping at all three final-write sites
(error-handler, finalizeCompletion, executeToolPhase). Anthropic <invoke>
shape preserved. 75+12 tests.
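For readers unfamiliar with the technique, a balanced-brace JSON scanner can be sketched as below. This is a generic illustration, not the ported Unsloth code; `findBalancedJsonEnd` and `extractFirstJsonObject` are hypothetical names.

```typescript
// Scan `text` from `start` (which must point at '{') and return the index just
// past the matching '}', or -1 if the braces never balance. Braces inside JSON
// string literals are ignored, and backslash escapes are honored.
function findBalancedJsonEnd(text: string, start: number): number {
  if (text[start] !== "{") return -1;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}") {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return -1;
}

// Return the first balanced span that parses as JSON, or null if none does.
function extractFirstJsonObject(text: string): unknown {
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    const end = findBalancedJsonEnd(text, start);
    if (end === -1) continue;
    try {
      return JSON.parse(text.slice(start, end));
    } catch {
      // Balanced but not valid JSON; try the next opening brace.
    }
  }
  return null;
}
```

Tracking string state is what separates this from a naive brace counter: an argument value such as `"a}b"` would otherwise close the object early.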

Batch 2 — web/html-to-md.ts: parse5 tree-walking HTML-to-Markdown converter
ported from Unsloth's _html_to_md.py. Replaces web_fetch's regex stripHtml
with structured markdown output (headings, links, lists, tables, code blocks,
blockquotes, entity decoding). 29 tests.

Batch 3 — llama-args-validator.ts: port of llama_server_args.py deny-list
validator. Wired into AGENTS.md frontmatter parser — llama_extra_args field
validated at load time, rejects managed flags (model identity, networking,
auth/TLS, server UI). No runtime consumer yet (llama-swap boundary). 76 tests.
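A deny-list validator of the kind described in Batch 3 might look like the sketch below. The flag list is illustrative only: the entries are real llama-server options in the named categories, but the flags the actual validator rejects are not listed in this log. `validateLlamaExtraArgs` is a hypothetical name.

```typescript
// Illustrative deny-list (model identity, networking, auth/TLS). The project's
// real list is an unknown here.
const MANAGED_FLAGS = [
  "--model", "-m",
  "--host", "--port",
  "--api-key", "--ssl-key-file", "--ssl-cert-file",
];

// Return every managed flag present, treating `--flag=value` like `--flag`.
function findManagedFlags(extraArgs: string[]): string[] {
  return extraArgs
    .map((arg) => arg.split("=")[0])
    .filter((flag) => MANAGED_FLAGS.includes(flag));
}

// Load-time check: throw so a bad AGENTS.md entry fails fast.
function validateLlamaExtraArgs(extraArgs: string[]): void {
  const rejected = findManagedFlags(extraArgs);
  if (rejected.length > 0) {
    throw new Error(`llama_extra_args contains managed flags: ${rejected.join(", ")}`);
  }
}
```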

All three files carry SPDX-License-Identifier: AGPL-3.0-only headers.
LICENSE flipped to AGPL-3.0-only in prior commit (a938cf1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.4.0-unsloth-studio-lift
2026-05-26 23:30:50 +00:00
a938cf1d42 License: AGPL-3.0-only
2026-05-26 23:29:25 +00:00
6f6b3afb5d v2.3.2-coder-answer-endpoint: fix ask_user_input submit in CoderPane
The CoderPane runs its own inference runner and broker on the boocoder
service. The AskUserInputCard was calling /api/chats/:id/answer_user_input
on the main BooChat server, which has a different inference runner — the
answer was accepted but the next turn was enqueued on the wrong runner,
so nothing happened.

Fix: register the same answer_user_input endpoint on the boocoder, and
add an apiPrefix prop to AskUserInputCard so the CoderPane routes
through /api/coder/chats/:id/answer_user_input. BooChat's MessageList
continues to use the default (no prefix) path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.3.2-coder-answer-endpoint
2026-05-26 21:54:08 +00:00
154ef78f7c v2.3.1-permission-questions: enrich ACP permission wire for interactive questions and elicitations
The permission_requested WS frame now carries kind ('tool'|'question'|'plan'|
'elicitation'), input (the tool's rawInput payload), and description fields.
PermissionCard detects question-type permissions (Claude Code's AskUserQuestion)
and renders an interactive radio/checkbox form instead of approve/deny buttons.
Submitting answers auto-selects the first allow option.

Also wires up ACP createElicitation (unstable/experimental) — JSON Schema-driven
forms for structured user input. The same PermissionCard renders elicitation
fields with type-appropriate inputs. Both flows use the existing permission-waiter
blocking pattern with 120s timeout.

The response path (POST /api/coder/tasks/:id/permission) now accepts optional
updated_input alongside option_id, forwarded to the ACP agent as the user's
answer payload. Elicitation responses map to accept/decline/cancel actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.3.1-permission-questions
2026-05-26 21:28:14 +00:00
792bbb9da3 v2.3.0-sampling-params-ask-user: agent sampling params, ask_user_input in CoderPane, UX polish
Add top_p/top_k/min_p/presence_penalty to AGENTS.md frontmatter and thread
through inference (agents.ts parser → Agent type → stream-phase → sentinel
summaries). Null means omit from request body, preserving provider defaults.
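The "null means omit" rule can be sketched as a small body builder. `applySamplingParams` is a hypothetical helper; only the four parameter names come from the commit.

```typescript
const SAMPLING_KEYS = ["top_p", "top_k", "min_p", "presence_penalty"] as const;
type SamplingKey = (typeof SAMPLING_KEYS)[number];
type SamplingParams = Partial<Record<SamplingKey, number | null>>;

// Copy only the sampling params that are actually set, so the provider's own
// defaults apply to everything left out of the request body.
function applySamplingParams(
  body: Record<string, unknown>,
  params: SamplingParams,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...body };
  for (const key of SAMPLING_KEYS) {
    const value = params[key];
    if (value !== null && value !== undefined) out[key] = value;
  }
  return out;
}
```

Omitting the key matters because sending an explicit `null` could be read by a backend as a value rather than "use your default".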

Wire ask_user_input interactive card into both BooCoder frontends: the
CoderPane in BooChat's SPA (CoderMessageList now renders AskUserInputCard
instead of ToolCallLine for ask_user_input tool calls) and the standalone
coder SPA (MessageBubble + new AskUserInputCard + shadcn ui primitives).

Additional fixes: SessionLandingPage uses ChatInput with slash-command
support and lazy chat creation; Session.tsx hydrate-race fix for empty pane
promotion; AgentPicker wider dropdown with line-clamp; ModelPicker min-width;
Textarea converted to forwardRef; Recon agent added to AGENTS.md; codecontext
host port exposed in docker-compose.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
v2.3.0-sampling-params-ask-user
2026-05-26 21:02:21 +00:00
31e1b32be1 v2.2.2-xml-placeholder-reject: drop placeholder XML tool calls at parse time
Reject qwen3.6 spurious <invoke> tails with path "..." or empty args before
they enter toolCalls, preventing duplicate assistant answers. Dropped blocks
append to flushed text; four new xml-parser tests. DEFERRED-WORK §6 for
console.debug → pino cleanup.
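A hedged sketch of the parse-time filter described above. It applies the two stated conditions (a literal `"..."` path or empty args) to every call, whereas the real parser may scope them more narrowly; `ParsedToolCall` and both function names are hypothetical.

```typescript
interface ParsedToolCall {
  name: string;
  args: Record<string, unknown>;
}

// A call is a placeholder if it has no arguments at all, or if its path
// argument is the literal string "...".
function isPlaceholderToolCall(call: ParsedToolCall): boolean {
  if (Object.keys(call.args).length === 0) return true;
  return call.args["path"] === "...";
}

// Split parsed calls into those to execute and those to drop. Per the commit,
// the text of dropped blocks is appended to the flushed assistant text.
function partitionToolCalls(calls: ParsedToolCall[]): {
  kept: ParsedToolCall[];
  dropped: ParsedToolCall[];
} {
  return {
    kept: calls.filter((c) => !isPlaceholderToolCall(c)),
    dropped: calls.filter(isPlaceholderToolCall),
  };
}
```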

Co-authored-by: Cursor <cursoragent@cursor.com>
v2.2.2-xml-placeholder-reject
2026-05-26 16:22:43 +00:00
314adaae48 docs: reconcile roadmap, README, and deferred work for v2.2 ship state
Mark v2.2/v2.2.1 shipped and v2.3 planned in roadmap and README; fix
DEFERRED-WORK §2 (ACP probe skip is planned, not resolved).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 15:27:16 +00:00
93d3f86c2b v2.2-paseo-providers: Paseo provider stack + v2.2.1 pane-scoped chat fixes
Ship Paseo-equivalent provider snapshot, AgentComposerBar, ACP dispatch
rewrite with streaming/persist, permission prompts, and agent commands.
Follow-up: pane-scoped chat resolution, CoderMessageList tool timeline,
WS user-delta replace, and inference orphan tool_call stripping.
Archive openspec v2-2; update CHANGELOG and CURRENT.

Co-authored-by: Cursor <cursoragent@cursor.com>
v2.2-paseo-providers v2.2.1-pane-scoped-chats
2026-05-26 15:18:31 +00:00
04673eaf59 v2.1.1: roadmap cleanup + README update + openspec archive
- Archive all 10 shipped openspec changes to openspec/changes/archived/
- Update boocode_roadmap.md: date, shipped status for v1.14/v1.15/v2.0, add v2.1.0 section
- Update README.md: 3-app monorepo, add services table, add What's shipped section
- Remove stale active openspec folders (all work shipped)
v2.1.1-roadmap-cleanup
2026-05-25 20:23:22 +00:00
d8ffee1950 v2.1.0-provider-picker: BooCoder systemd migration + provider picker
- BooCoder moves from Docker to host systemd service (boocoder.service)
- Agent dispatch (ACP + PTY) switches from SSH to direct spawn/exec
- SSH helpers marked @deprecated (kept for one release cycle)
- Provider registry (5 providers: boocode, opencode, goose, claude, qwen)
- Agent probe with direct which/exec + model discovery (qwen settings, static claude models)
- GET /api/providers route with installed status, models, transport fallback
- ProviderPicker frontend component in CoderPane header
- External provider messages route through tasks row instead of inference enqueue
- Smart scroll: MessageList only auto-scrolls when near bottom (150px threshold)
- DB: available_agents gets models, label, transport columns
- Bug fix: loadContext SELECT includes allowed_read_paths
- Bug fix: cap hit sentinel inserted before buildMessagesPayload
- docker-compose.yml: boocoder service commented out, BOOCODER_URL env var added
- CLAUDE.md: updated docs for systemd, provider registry, JSONB gotcha, loadContext
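The smart-scroll item in the list above comes down to a distance check against the 150px threshold. The sketch below takes plain numbers rather than a DOM element so it stays testable; `isNearBottom` is a hypothetical name, and using `<=` at the boundary is an assumption.

```typescript
const NEAR_BOTTOM_PX = 150; // threshold from the commit message

interface ScrollMetrics {
  scrollTop: number;    // pixels scrolled from the top
  clientHeight: number; // visible height of the container
  scrollHeight: number; // total height of the content
}

// True when the viewport's bottom edge is within `threshold` pixels of the
// end of the content, i.e. the user has not scrolled up to read history.
function isNearBottom(m: ScrollMetrics, threshold: number = NEAR_BOTTOM_PX): boolean {
  return m.scrollHeight - m.scrollTop - m.clientHeight <= threshold;
}
```

A message list would call this before appending new content and auto-scroll only when it returns true.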
v2.1.0-provider-picker
2026-05-25 19:20:53 +00:00
e423579e99 v2.0.5: FAST_MODEL routing + tool-use summaries + Qwen dispatch + Arena
Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:

1. FAST_MODEL config: optional env var routes cheap LLM calls (titles,
   summaries, labeling) to a smaller model on llama-swap. auto_name.ts
   uses ctx.config.FAST_MODEL ?? session.model. Set FAST_MODEL=nemotron-
   nano-4b to avoid loading the 35B model for 20-token title generation.

2. Tool-use summaries (services/inference/tool-summaries.ts): utility
   that generates "git-commit-subject-style" labels for tool batches via
   a fast-model LLM call. System prompt + truncation logic ported from
   Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
   for BooCoder's dispatcher to call after task completion.

3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
   PTY dispatch builds: qwen -p "<task>" --output-format stream-json
   (NDJSON structured events over stdout). Env: OPENAI_BASE_URL +
   OPENAI_API_KEY point Qwen Code at llama-swap. execution_path CHECK
   constraint extended with 'qwen'.

4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
   task to N contestants (2-5, each with different agent/model), each
   getting its own task row linked by arena_id UUID. GET /api/arena/:id
   shows all contestants. POST /api/arena/:id/select/:task_id marks
   winner. Schema: arena_id column added to tasks.
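The arena fan-out in item 4 can be sketched as a pure row builder: one task row per contestant, all sharing one `arena_id`. The row shape and `buildArenaTasks` are hypothetical, and the caller is assumed to generate the UUID.

```typescript
interface Contestant {
  agent: string;
  model: string;
}

// Hypothetical subset of a tasks row; only arena_id is named in the commit.
interface ArenaTaskRow {
  arena_id: string;
  agent: string;
  model: string;
  input: string;
  state: "pending";
}

function buildArenaTasks(arenaId: string, input: string, contestants: Contestant[]): ArenaTaskRow[] {
  // The commit limits an arena to 2-5 contestants.
  if (contestants.length < 2 || contestants.length > 5) {
    throw new RangeError("an arena needs between 2 and 5 contestants");
  }
  return contestants.map((c): ArenaTaskRow => ({
    arena_id: arenaId,
    agent: c.agent,
    model: c.model,
    input, // every contestant receives the same task text
    state: "pending",
  }));
}
```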

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.5
2026-05-25 14:05:59 +00:00
06116f31b3 v2.0.4-hardening: fuzz suite + integration tests + production readiness
Phase 8 of v2.0. Final hardening pass before production tag.

Path-guard fuzz suite (34 tests): traversal attacks (../ all depths,
encoded %2e%2e, null bytes, absolute escapes, prefix-without-separator,
backslash), secret-file deny list (.env, *.pem, id_rsa*, *.key,
credentials.json, *.kdbx, .netrc), valid-path positives, edge cases
(empty, whitespace, very long, triple-dot, multiple slashes).

write_guard.ts hardened: added null-byte rejection and whitespace-only
rejection (previously only checked empty string).

Pending-changes integration test skeleton: 4 tests covering the full
queue→apply→rewind cycle against a real DB + filesystem. Gated on
DATABASE_URL via describe.runIf (same pattern as apps/server's
tool_cost_stats.test.ts). Skips cleanly when unset.

57 tests passing (23 existing + 34 fuzz), 4 integration skipped.
All builds clean. All services healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.4-hardening
2026-05-25 04:31:22 +00:00
47abbb6e3c v2.0.3: CLI client + human inbox + cost tracking + Boomerang new_task
Phase 7 of v2.0. BooCoder gains a terminal-driven UX and subagent
isolation primitive.

CLI (src/cli.ts): standalone entry point for terminal use.
- boocode run "task" [--agent x] [--model y] — create + stream output
- boocode ls [--state x] — formatted task table
- boocode attach <id> — WS stream of running task
- boocode send <id> "msg" — follow-up message to task session
Connects to BOOCODER_URL (default http://100.114.205.53:9502).

Human inbox (routes/inbox.ts): GET /api/inbox (failed/blocked tasks),
POST /api/inbox/:id/retry (reset to pending for re-dispatch).

Cost tracking: dispatcher aggregates tokens_used from all messages in
the task's session after completion, stores in tasks.cost_tokens.
GET /api/stats/costs?group_by=project|agent|day for aggregation.

Boomerang subagent isolation (3 new tools):
- new_task: creates child task with parent_task_id linkage, runs in
  fresh isolated session. Orchestrator sees only output_summary.
- list_tasks: query child tasks of current parent
- check_task_status: read task state + output_summary

The orchestrator pattern: an agent with tools: [new_task, list_tasks,
check_task_status] can ONLY dispatch — can't read files or MCP. This
is the Roo Code Boomerang Tasks capability-restriction principle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.3
2026-05-25 04:25:18 +00:00
f53c6d6cb9 v2.0.2: BooCoder MCP server — 6 tools over stdio
Phase 6 of v2.0. BooCoder exposes its task primitives as MCP tools
so external agents (Sam's opencode in Termius) can drive the task
queue without going through the web UI.

6 MCP tools registered via McpServer + StdioServerTransport:
- boocoder.create_task — INSERT pending task
- boocoder.list_pending_changes — SELECT pending changes
- boocoder.apply — apply a specific pending change to disk
- boocoder.reject — reject a pending change
- boocoder.dispatch_external_agent — create task with agent for Path B
- boocoder.list_worktrees — list active worktrees from running tasks

Activated by --mcp CLI flag: `node dist/index.js --mcp` starts the
MCP server over stdio instead of the HTTP server. Configure in
opencode: {"mcpServers":{"boocoder":{"type":"stdio","command":"docker",
"args":["exec","-i","boocoder","node","dist/index.js","--mcp"]}}}

Uses McpServer class from @modelcontextprotocol/sdk/server/mcp.js
(high-level .tool() registration API). Zod schemas for input
validation. The process blocks until stdin closes, then cleanly shuts down the DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.2
2026-05-25 04:17:28 +00:00
3d6055518b v2.0.1: ACP dispatch + PTY fallback + worktree management
Phase 5 of v2.0. External agent dispatch via SSH to host.

ACP dispatch (acp-dispatch.ts): spawns agent via SSH with JSON-RPC
stdio pipe. Wraps opencode/goose in ACP mode. Captures structured
events (file operations, tool calls) mapped to parts taxonomy.
Falls back to PTY if ACP handshake fails.

PTY dispatch (pty-dispatch.ts): raw SSH spawn for agents without ACP
support (claude, pi). Captures stdout/stderr as plain text. Simpler
but less structured than ACP.

SSH helper (ssh.ts): shared spawn wrapper for SSH commands to
samkintop@100.114.205.53 (Tailscale IP, same as booterm). Uses
openssh-client installed in the runtime Dockerfile stage.

Worktree management (worktrees.ts): createWorktree (git worktree add
via SSH), diffWorktree (git diff HEAD...task-branch), cleanupWorktree
(git worktree remove --force). One worktree per task at
/tmp/booworktrees/<taskId>.
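The three worktree operations above map onto standard git commands. The sketch below only builds argv arrays and does not execute anything. Creating the branch with `-b`, the branch name itself, and the task-id character check are assumptions; `worktreeCommands` is a hypothetical helper.

```typescript
const WORKTREE_ROOT = "/tmp/booworktrees"; // per the commit message

interface WorktreeCommands {
  create: string[];
  diff: string[];
  cleanup: string[];
}

function worktreeCommands(taskId: string, branch: string): WorktreeCommands {
  // Assumption: task ids are restricted so they cannot alter the path.
  if (!/^[A-Za-z0-9-]+$/.test(taskId)) throw new Error("invalid task id");
  const dir = `${WORKTREE_ROOT}/${taskId}`;
  return {
    create: ["git", "worktree", "add", "-b", branch, dir],
    diff: ["git", "diff", `HEAD...${branch}`],
    cleanup: ["git", "worktree", "remove", "--force", dir],
  };
}
```

Keeping the commands as argv arrays rather than one string avoids quoting problems if they are later passed to a spawn-style API.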

Dispatcher updated: checks available_agents.supports_acp to pick
transport. Path B flow: create worktree → dispatch agent → diff
worktree → queue diff into pending_changes → cleanup worktree →
mark task complete.

Agent probe updated: probes via SSH to find host-installed agents
(which opencode && opencode --version over SSH).

Dockerfile: openssh-client added to runtime stage.
Config: SSH_HOST env var (default 100.114.205.53).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.1
2026-05-25 04:10:46 +00:00
752ea74f43 v2.0.0-final: dispatcher + task queue + agent probing
Phase 4 of v2.0. BooCoder can now queue tasks and dispatch them
through the inference loop autonomously.

Dispatcher (services/dispatcher.ts): in-process setInterval(5s) polls
tasks WHERE state='pending', picks one at a time, creates an isolated
session+chat, enqueues inference with the task's input as the user
message, polls for completion, marks state completed/failed with
output_summary. Single-task-at-a-time for v2.0.0; parallel dispatch
is a Phase 5+ concern. Respects onClose hook for graceful shutdown.

Task routes (routes/tasks.ts): POST /api/tasks (create), GET /api/tasks
(list with state/project filters), GET /api/tasks/:id (detail),
POST /api/tasks/:id/cancel (marks cancelled, aborts if running).

Agent probe (services/agent-probe.ts): on startup, probes PATH for
opencode/goose/claude/pi via which + --version. UPSERTs into
available_agents table. Finds nothing inside the container (expected —
Phase 5 addresses host-agent access via ACP/PTY).

Schema: ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id (links
task to its auto-created inference session for isolation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.0-final
2026-05-25 03:55:18 +00:00
73b53089b0 CLAUDE.md: v2.0.0 architecture docs — BooCoder, DB rename, MCP config, workspace deps
Session learnings applied:
- Database renamed boochat (from boocode), new tables documented
- BooCoder architecture section: workspace dep pattern, write tools,
  coder pane integration, proxy routing
- Environment: MCP_CONFIG_PATH, BooCoder health at :9502
- Workflow: Go binary at /snap/go/current/bin, codecontext fork location
- Conventions: workspace exports with types conditions, Docker build order

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:51:24 +00:00
457c59fb06 v2.0.0: BooCoder frontend — chat pane + diff pane + session picker
Integrates BooCoder as a 'coder' workspace pane within the existing
BooChat SPA at code.indifferentketchup.com. Renamed the placeholder
'agent' pane kind to 'coder' across all types, menus, hooks, and
mobile switcher (Icon: Code instead of Bot).

CoderPane.tsx: split layout with chat area (messages via WS to
boocoder:9502, input bar posting to /api/coder/sessions/:id/messages)
and diff panel (pending changes with Approve/Reject per change plus
Approve All/Reject All). Reuses MarkdownRenderer for message content.

Proxy: Vite dev config adds /api/coder → boocoder:9502 (ordered above
/api per CLAUDE.md proxy-ordering rule). Production: Fastify route in
apps/server/src/index.ts proxies /api/coder/* to http://boocoder:3000
via fetch() pass-through. WS connects directly to :9502 (same
Tailscale network, no proxy needed for WebSocket upgrade).

WorkspacePaneKind mirror updated in both apps/web and apps/server
types. useWorkspacePanes gains coderPane() factory (replaces the old
agent toast stub). Workspace.tsx switch renders CoderPane for
pane.kind === 'coder'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.0
2026-05-25 03:24:49 +00:00
78455b7efc v2.0.0: BooCoder frontend — chat pane + diff pane + session picker
Phase 3 of v2.0. React + Vite SPA at apps/coder/web/ served by
the coder Fastify server via @fastify/static with SPA fallback.

Chat pane: message list via WS streaming (useSessionStream hook),
input bar, POST /api/sessions/:id/messages on submit, markdown
rendering via react-markdown + remark-gfm, inline tool-call display.

Diff pane: fetches GET /api/sessions/:id/pending, shows pending
changes with file path + operation badge (create/edit/delete),
before/after diff for edits, Approve/Reject per change and
Approve All/Reject All buttons.

Layout: fixed two-pane split (chat 60%, diff 40%). Dark theme
(bg-zinc-900). Desktop-first for v2.0.0.

Session picker (Home page): lists projects and sessions from the
shared DB. No CRUD — use BooChat's UI for that.

Dockerfile updated: builds web app in builder stage, copies dist
to runtime. index.ts registers fastifyStatic + SPA fallback route.

Tailwind v4, React 18, TypeScript strict. ~20 new files, ~370KB
built output. Functional developer tool UI, not polished consumer
product — Phase 7 (v2.0.3) handles polish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:04:52 +00:00
d2108b2f8d verification discipline rules + chat naming from assistant response
BOOCHAT.md + BOOCODER.md: 4 verification rules added to both —
verify against running container not source files, never count dist/,
run commands before claiming success, derive counts from commands.

auto_name.ts: chat titles now derived from the assistant's first
response only (user message dropped from naming input). System prompt
updated to "summarize the topic or outcome — do NOT copy the first
few words verbatim." Produces titles like "Fastify Route Setup"
instead of echoing the assistant's opening sentence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 02:52:49 +00:00
ce31577d1e v2.0.0-beta: write tools, pending-changes queue, inference loop, API routes
Phase 2 of v2.0. BooCoder is now a functional write-capable chatbot.

Write-path guard: resolveWritePath() uses resolve() (no realpath — files may
not exist for creates) + prefix-check + secret-file deny list (.env, *.pem,
id_rsa*, etc.). 23 unit tests cover traversal attacks.
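A minimal sketch of the guard's three steps (resolve, prefix-check, deny list) follows. To stay import-free it uses a hand-rolled POSIX normalizer as a stand-in for `resolve()`, and the deny patterns are an illustrative subset. It also includes the null-byte and whitespace-only rejection that the later v2.0.4 hardening commit added. `resolveWritePathSketch` is a hypothetical name.

```typescript
// Stand-in for path.resolve on POSIX paths: collapses ".", "..", and repeated
// slashes without touching the filesystem (the file may not exist yet).
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// Illustrative subset of the secret-file deny list.
const SECRET_FILE_PATTERNS = [/^\.env$/, /\.pem$/, /^id_rsa/, /\.key$/];

function resolveWritePathSketch(projectRoot: string, requested: string): string {
  if (requested.trim() === "" || requested.includes("\0")) {
    throw new Error("invalid path");
  }
  const root = normalizePosix(projectRoot);
  const full = requested.startsWith("/")
    ? normalizePosix(requested)
    : normalizePosix(`${root}/${requested}`);
  // Compare against root plus a separator, so /opt/app-evil does not pass
  // as being inside /opt/app.
  const prefix = root === "/" ? "/" : `${root}/`;
  if (full !== root && !full.startsWith(prefix)) {
    throw new Error("path escapes the project root");
  }
  const fileName = full.slice(full.lastIndexOf("/") + 1);
  if (SECRET_FILE_PATTERNS.some((re) => re.test(fileName))) {
    throw new Error("writes to secret files are denied");
  }
  return full;
}
```

Normalizing before the prefix check is the important ordering: checking the raw string first would let `src/../../etc/passwd` through.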

Pending-changes service: queueEdit/Create/Delete → applyOne/All →
rejectOne/All → rewindOne. Edit diffs stored as JSON {old, new}. All writes
queue before touching disk; apply re-validates the path guard.
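The queue-then-apply discipline above can be illustrated with an in-memory sketch. A `Map` stands in for the filesystem, only the edit operation is shown, and the status values are assumptions; `PendingEdits` is a hypothetical class, not the project's service.

```typescript
type ChangeStatus = "pending" | "applied" | "rejected"; // assumed status set

interface EditChange {
  id: number;
  path: string;
  diff: { old: string; new: string }; // stored as {old, new}, per the commit
  status: ChangeStatus;
}

class PendingEdits {
  private changes: EditChange[] = [];
  private readonly guard: (path: string) => void;

  // `guard` throws on a disallowed path (for example, a write-path guard).
  constructor(guard: (path: string) => void) {
    this.guard = guard;
  }

  // Queueing validates the path but never touches the files.
  queueEdit(path: string, oldText: string, newText: string): EditChange {
    this.guard(path);
    const change: EditChange = {
      id: this.changes.length + 1,
      path,
      diff: { old: oldText, new: newText },
      status: "pending",
    };
    this.changes.push(change);
    return change;
  }

  // Applying re-validates the path, then performs the replacement.
  applyOne(id: number, files: Map<string, string>): void {
    const change = this.changes.find((c) => c.id === id);
    if (!change || change.status !== "pending") throw new Error(`no pending change ${id}`);
    this.guard(change.path);
    const current = files.get(change.path) ?? "";
    if (!current.includes(change.diff.old)) throw new Error("old text not found");
    // A function replacer avoids "$" patterns in the new text being interpreted.
    files.set(change.path, current.replace(change.diff.old, () => change.diff.new));
    change.status = "applied";
  }

  rejectOne(id: number): void {
    const change = this.changes.find((c) => c.id === id);
    if (!change || change.status !== "pending") throw new Error(`no pending change ${id}`);
    change.status = "rejected";
  }
}
```

Running the guard at both queue and apply time covers the case where the rules or the resolved target change between the two steps.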

5 write tools: edit_file, create_file, delete_file, apply_pending, rewind.
Registered alongside 25 read-only tools from BooChat (30 total, alpha-sorted).
Write tools use a module-level inference context for sql+sessionId injection.

Inference loop via workspace dependency: apps/coder imports
createInferenceRunner, createBroker, ALL_TOOLS from @boocode/server (dist/).
apps/server gains declaration: true + exports map with typed subpath entries.
No code duplication — one inference engine shared by both apps.

API routes: POST /api/sessions/:id/messages (user msg → inference), POST stop,
GET/POST pending-changes CRUD (5 endpoints), WebSocket session streaming.

Dockerfile updated to build apps/server first (coder depends on its .d.ts).
Health endpoint reports tool count: {"ok":true,"db":true,"tools":30}.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.0-beta
2026-05-25 01:53:38 +00:00
006226cce5 v2.0.0-alpha: BooCoder foundation — container, schema, DB rename
Phase 1 of v2.0. BooCoder is live at port 9502 with a health endpoint.

- Database renamed: ALTER DATABASE boocode RENAME TO boochat (one-time).
  All services updated to connect to /boochat. Docker service name stays
  boocode_db (rename is internal to Postgres, not Docker).

- New apps/coder/ app skeleton: Fastify server with health endpoint,
  postgres connection, schema apply on boot. Mirrors apps/server pattern
  but minimal (no inference loop yet — Phase 2).

- Schema: pending_changes (operation queue before /apply), tasks (dispatch
  DAG with state machine), available_agents (startup-probed agent registry),
  human_inbox view (tasks WHERE state IN blocked/failed). All IF NOT EXISTS,
  idempotent on re-run. Same boochat database, different tables.

- Dockerfile: Node 20 bookworm-slim (glibc for future node-pty in Phase 5).
  Multi-stage build matching the existing boocode image pattern.

- docker-compose.yml: boocoder service on 100.114.205.53:9502, /opt:/opt:rw
  mount (write-capable, policy-gated at tool layer), depends on boocode_db.

- BOOCODER.md: container guidance declaring write-tool capability +
  pending-changes discipline.

All 4 services boot and pass health checks. 9 tables in the shared DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0.0-alpha
2026-05-25 01:20:29 +00:00
62d818af23 v2.0 implementation plan: 8 phases from foundation to production
Detailed execution plan for all v2.0 sub-versions:

Phase 1 (v2.0.0-alpha): container skeleton, DB rename, schema migration
Phase 2 (v2.0.0-beta): write tools + pending-changes service + fuzz tests
Phase 3 (v2.0.0): frontend diff pane + chat pane + Caddy routing
Phase 4 (v2.0.0-final): dispatcher worker + task queue + agent probing
Phase 5 (v2.0.1): ACP client + PTY fallback + worktree management
Phase 6 (v2.0.2): MCP server (6 tools, stdio, 10-question eval)
Phase 7 (v2.0.3): CLI + human inbox + cost tracking + observation hooks + Boomerang
Phase 8 (v2.0.x): path-guard fuzz, integration tests, docs, production deploy

~2050 LoC total. Phases 1-4 sequential, 5-7 parallelizable after 4.
Risk register covers path-guard bypass, ACP instability, worktree cleanup,
DB rename, MCP eval, Boomerang context leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:09:05 +00:00
531d39ace9 v2.0 proposal update: add AGENTS.md extensions, Boomerang pattern, observation hooks, follow-up batches
Additions from second pass of boocode_code_review.md:

- AGENTS.md extensions: output_schema, exit_expression, execution_strategy
  (qodo-ai/agents MIT), expert_model escape hatch (RA.Aid Apache-2.0)
- Subagent isolation via Boomerang Tasks pattern: orchestrator-only-dispatches,
  down-pass/up-pass context discipline, fresh session per subtask
- Observation hooks: 5-event taxonomy from budi (SessionStart, UserPromptSubmit,
  PostToolUse, SubagentStart, Stop) mapped to WS frames
- Follow-up batches table: PR-resolver, HMAC audit log, blind-validation gate,
  majority-vote ensembler, drift detection, anti-slop, globstar gate, Docker
  sandbox, multi-provider LLM
- Additional repo to clone: qodo-ai/agents for agent.toml schema reference

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 23:22:57 +00:00
f2974d6887 v2.0 proposal: BooCoder — write tools, pending changes, ACP dispatch, MCP server
Comprehensive roadmap for the v2.0 major version bump. Covers:
- Schema: pending_changes, tasks, available_agents tables + human_inbox view
- Path A: native write tools (edit_file, create_file, delete_file) queuing
  through pending_changes before /apply flushes to disk
- Path B: external agent dispatch via ACP (opencode, goose) or PTY fallback
  (claude, pi) with per-task git worktrees and automatic diff-on-completion
- BooCoder MCP server: 6 tools exposing task primitives over stdio
- Code lifts: agent-hub (Apache-2.0, task DAG), plandex (MIT, diff UX),
  ACP SDK (Apache-2.0, subprocess protocol), Paseo (AGPL, design-only)
- Sub-versions: v2.0.0 (Path A), v2.0.1 (Path B), v2.0.2 (MCP server),
  v2.0.3 (CLI + polish)
- Estimate: ~2200 LoC total

All v1.x dependencies shipped (v1.13 parts, v1.14 outer loop, v1.15 MCP
client, v1.16 codesight). v2.0 is unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.0-proposal
2026-05-24 15:11:16 +00:00
29c7d051b6 v1.16.0-codesight-merge: 4 new codecontext tools — blast radius, hot files, routes, middleware
BooCode wrapper tools for the 4 new MCP tools added to the codecontext
sidecar (Go side committed separately at /opt/forks/codecontext).

- get_blast_radius: reverse-edge BFS — "what breaks if I change this?"
- get_hot_files: most-imported files by incoming edge count
- get_routes: Fastify/Express route extraction via tree-sitter AST
- get_middleware: middleware detection via import + registration patterns
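
For illustration, the two graph queries above can be sketched as follows.
This is TypeScript written for this note only; the real handlers are Go
code in the sidecar, and the ImportEdge shape and function names here are
assumptions.

```typescript
// Illustrative only: the real handlers are Go code in the codecontext sidecar.
type ImportEdge = { from: string; to: string }; // `from` imports `to`

// Reverse-edge BFS: start at the changed file and walk to every file that
// imports it, directly or transitively.
function blastRadius(edges: ImportEdge[], changed: string): string[] {
  const importers = new Map<string, string[]>();
  for (const edge of edges) {
    const list = importers.get(edge.to) ?? [];
    list.push(edge.from);
    importers.set(edge.to, list);
  }
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  const affected: string[] = [];
  let head = 0;
  while (head < queue.length) {
    const file = queue[head++];
    if (file === undefined) break;
    for (const importer of importers.get(file) ?? []) {
      if (!seen.has(importer)) {
        seen.add(importer);
        affected.push(importer);
        queue.push(importer);
      }
    }
  }
  return affected;
}

// Hot files: rank files by how many incoming import edges they have.
function hotFiles(edges: ImportEdge[], topN: number): Array<{ file: string; importedBy: number }> {
  const counts = new Map<string, number>();
  for (const edge of edges) {
    counts.set(edge.to, (counts.get(edge.to) ?? 0) + 1);
  }
  const rows: Array<{ file: string; importedBy: number }> = [];
  counts.forEach((importedBy, file) => rows.push({ file, importedBy }));
  rows.sort((a, b) => b.importedBy - a.importedBy || (a.file < b.file ? -1 : 1));
  return rows.slice(0, topN);
}
```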

Wrappers follow the existing codecontext pattern: Zod input → callCodecontext
→ ToolDef export. Registered in ALL_TOOLS (alpha-sorted). All 4 are read-only.

codecontext sidecar rebuilt from commit b19e646 with the 4 new Go handlers
(2130 lines, 29 tests). Reviewer fixes applied: defer RUnlock on Tier 2
handlers, extractObjectProperty delegates to extractStringValue for
template-literal route paths.

363/363 server tests passing. No schema changes, no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.16.0-codesight-merge
2026-05-24 05:19:52 +00:00
d27a977d59 v1.15.0-mcp-multi: multi-server MCP client + stdio transport + config file + tool globs
Generalizes the v1.14.1 single-server Context7 PoC into a multi-server MCP
client registry with per-server graceful degradation. JSON config at
/data/mcp.json (bind-mounted alongside AGENTS.md) matches opencode's
mcpServers schema shape. Config file missing = no MCP (opt-in by presence).

Two transports: Streamable HTTP (remote servers like Context7) and stdio
(local subprocess servers like codecontext). Stdio spawns a persistent child
via the SDK's StdioClientTransport; shutdown hook closes all transports.

Tool prefix generalized from context7_<name> to <serverName>_<toolName> with
a toolToServer reverse map for dispatch routing. AGENTS.md tools: field now
supports glob patterns (context7_*, !web_*) via matchToolGlob — last-match-
wins with ! deny prefix. Replaces exact-match .includes() in stream-phase.ts.
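
A sketch of last-match-wins glob matching with a ! deny prefix. The real
matchToolGlob signature is not shown in this log, so isToolAllowed and
globToRegExp are hypothetical names, and the default of denying a tool that
matches no pattern is an assumption.

```typescript
// Hypothetical re-implementation; the real matchToolGlob may differ.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", then turn "*" into ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function isToolAllowed(toolName: string, patterns: string[]): boolean {
  let allowed = false; // Assumption: a tool matching no pattern is denied.
  for (const raw of patterns) {
    const deny = raw.startsWith("!");
    const glob = deny ? raw.slice(1) : raw;
    // Every matching pattern overwrites the verdict, so the last match wins.
    if (globToRegExp(glob).test(toolName)) allowed = !deny;
  }
  return allowed;
}
```

With patterns ["*", "!web_*"], web_search is denied; reversing the order to
["!web_*", "*"] allows it, because the later pattern overrides the earlier
one.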

refreshToolNames() in agents.ts rebuilds the DEFAULT_TOOLS snapshot after
appendMcpTools so agents without explicit tools: lists see MCP tools —
reviewer caught that the module-load-time snapshot would permanently exclude
late-registered tools.

Read-only invariant: readOnlyHint === false rejected at discovery. Result
size capped at 5MB. v1.14.1 env vars removed — superseded by config file.
Default data/mcp.json ships with Context7 disabled.

363/363 server tests passing. No schema changes, no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.15.0-mcp-multi
2026-05-24 04:08:42 +00:00
5692e99a5d v1.14.1-mcp-poc: single-server MCP client against Context7
Validates the MCP-client loop end-to-end against one real MCP server before
the full v1.15 port. New services/mcp-client.ts wraps @modelcontextprotocol/sdk
v1.29.0 with Streamable HTTP transport. On startup (when MCP_CONTEXT7_URL is
set), connects to Context7, discovers tools via tools/list, wraps each as a
ToolDef prefixed context7_<name>, and appends to ALL_TOOLS via appendMcpTools.

Read-only invariant guard rejects any tool with readOnlyHint: false. Tool
dispatch is transparent — executeToolCall routes MCP calls through the ToolDef
execute wrapper, which strips the prefix before calling the MCP server. Result
size capped at 5MB with truncation. Graceful degradation: server down at
startup → zero tools; server down mid-session → error result, model
self-corrects.

Adversarial review caught that a Zod .default() on the URL config made MCP
always-on instead of opt-in — fixed by removing the default. MCP_CONTEXT7_URL
must be explicitly set to enable.

ALL_TOOLS changed from ReadonlyArray to mutable to support late-registration.
appendMcpTools re-sorts and rebuilds TOOLS_BY_NAME after append.
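
A simplified sketch of the wrap, guard, and late-registration steps
described above. The ToolDef shape, the synchronous execute, wrapMcpTools,
and the callRemote callback are assumptions made for this note; only the
readOnlyHint annotation, the context7_ prefix, and the re-sort plus
TOOLS_BY_NAME rebuild come from the text.

```typescript
// Hypothetical shapes; the real ToolDef and MCP client types are not shown in this log.
interface ToolDef {
  name: string;
  execute: (args: Record<string, unknown>) => string; // real tools are likely async
}
interface DiscoveredMcpTool {
  name: string;
  annotations?: { readOnlyHint?: boolean };
}

const ALL_TOOLS: ToolDef[] = [
  { name: "grep", execute: () => "grep result" },
  { name: "view_file", execute: () => "file contents" },
];
const TOOLS_BY_NAME = new Map<string, ToolDef>();

function rebuildIndex(): void {
  TOOLS_BY_NAME.clear();
  for (const tool of ALL_TOOLS) TOOLS_BY_NAME.set(tool.name, tool);
}
rebuildIndex();

// Wrap each discovered tool as <prefix>_<name>; the wrapper strips the
// prefix again before calling the remote server.
function wrapMcpTools(
  prefix: string,
  discovered: DiscoveredMcpTool[],
  callRemote: (remoteName: string, args: Record<string, unknown>) => string,
): ToolDef[] {
  const wrapped: ToolDef[] = [];
  for (const tool of discovered) {
    // Read-only invariant: skip anything that declares readOnlyHint === false.
    if (tool.annotations?.readOnlyHint === false) continue;
    const fullName = `${prefix}_${tool.name}`;
    wrapped.push({
      name: fullName,
      execute: (args) => callRemote(fullName.slice(prefix.length + 1), args),
    });
  }
  return wrapped;
}

// Late registration: append, re-sort alphabetically, rebuild the lookup.
function appendMcpTools(tools: ToolDef[]): void {
  ALL_TOOLS.push(...tools);
  ALL_TOOLS.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  rebuildIndex();
}
```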

348/348 server tests passing (16 new mcp-client tests). No schema changes,
no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.14.1-mcp-poc
2026-05-23 21:58:09 +00:00
f4a97808ad v1.14.0-outer-loop: explicit while loop replaces inference recursion
Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).
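
The cap resolution and loop shape above, as a synchronous sketch. MAX_STEPS,
effectiveCap, and the ToolPhaseResult actions are from this message;
StepOutcome, ExitReason, runTurn, and the runStep callback are hypothetical.
Doom-loop, budget, abort, and the steps: 0 text-only path are left out.

```typescript
const MAX_STEPS = 200;

type ToolPhaseResult = { action: "continue" | "paused" | "synthesis_done" };
// Hypothetical: what one stream-and-tool-execute iteration reports back.
type StepOutcome =
  | { finish: "stop" }
  | { finish: "tool_calls"; phase: ToolPhaseResult };
type ExitReason = "finished" | "step_cap" | "paused" | "synthesis_done";

function resolveEffectiveCap(agentSteps?: number): number {
  return Math.min(agentSteps ?? Infinity, MAX_STEPS);
}

function runTurn(
  agentSteps: number | undefined,
  runStep: (stepNumber: number) => StepOutcome,
): { reason: ExitReason; steps: number } {
  const effectiveCap = resolveEffectiveCap(agentSteps);
  let stepNumber = 0;
  while (stepNumber < effectiveCap) {
    const outcome = runStep(stepNumber);
    stepNumber++;
    // Non-tool finish: the model answered without requesting tools.
    if (outcome.finish === "stop") return { reason: "finished", steps: stepNumber };
    // The tool phase no longer recurses; the caller decides whether to go on.
    if (outcome.phase.action !== "continue") {
      return { reason: outcome.phase.action, steps: stepNumber };
    }
  }
  return { reason: "step_cap", steps: stepNumber };
}
```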

Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.

step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.14.0-outer-loop
2026-05-23 20:29:21 +00:00
211e903620 v1.13.20-drop-legacy-cols: final phase of v1.13.0 strangler-fig
Removes the dual-write into messages.tool_calls / messages.tool_results JSON
columns and drops the columns. message_parts is now the only source of truth
for tool calls and tool results.

10 dual-write sites stripped (5 in tool-phase.ts, 2 in routes/skills.ts, 2 in
routes/messages.ts, 1 in routes/chats.ts fork-clone). The recon-driven grep
caught 2 sites beyond the original v1.13.2 roadmap inventory and an extra
fixture file (tool_cost_stats.test.ts) with a direct legacy-column INSERT.

messages_with_parts view rewritten to parts-only subselects (COALESCE
fallbacks gone). View runs via CREATE OR REPLACE so it lands before the
column DROPs in startup DDL — Postgres rejects column-drop on view-referenced
cols. v1.12.1 cleanup DO block (DROP CONSTRAINT messages_status_check /
messages_role_check) removed; those one-shots have done their work.

Adversarial review caught a runtime bug the green test suite missed: the
discard_stale endpoint (chats.ts) had a RETURNING ... tool_calls, tool_results
clause that would have crashed on every 60s-no-token-activity recovery in
production. Fixed by switching to two-step UPDATE returning id, then SELECT
from messages_with_parts so parts-synthesized fields keep flowing on the wire.

Message API type retains tool_calls? / tool_results? — the view synthesizes
those keys from parts so the wire shape is unchanged; frontend reads need no
update. This overrides the original v1.13.2 plan and is captured in the
openspec proposal.

339/339 server tests passing (including 7 DB-integration tests that applied
the schema migration to a live DB and ran the parts-only view end-to-end).
tsc + web build clean.

Pairs with v1.13.0-ai-sdk-v6 (introduced the dual-write) and v1.13.1-B (moved
the read path to messages_with_parts). Umbrella v1.13 tag ships on this same
commit, marking the strangler-fig closed.

CLAUDE.md picks up Sam's pre-existing edits documenting tag-naming and
CHANGELOG conventions — both already in use by v1.13.19 / v1.13.20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.20-drop-legacy-cols v1.13
2026-05-23 13:03:51 +00:00
ad45b28250 v1.13.19-html-artifact-panes: pane-based artifact viewer with on-request HTML
Every assistant message gets an "Open in pane" affordance that opens the
message in the workspace splitter — Markdown pane (Copy + Download .md) by
default; HTML pane (Download .html only) when the model emits a self-contained
<!DOCTYPE html> or fenced ```html artifact. BOOCHAT.md rule keeps Markdown
default at every length; HTML opt-in on explicit user request.

Backend: services/artifacts.ts (slug derivation + write helpers with
symlink-escape guard via realpath-after-mkdir), routes/artifacts.ts (POST
download + GET stream with nosniff + CSP sandbox defense-in-depth), HTML
detection in finalizeCompletion writing a new message_parts.kind='html_artifact'
row (schema CHECK extended via v1.13.13 pattern), graceful 1MB cap via the
pure decideHtmlArtifactWrite helper. PartKind union extended.

Frontend: MarkdownRenderer.tsx extracted from MessageBubble's inline
MarkdownBody for reuse; MarkdownArtifactPane.tsx + HtmlArtifactPane.tsx with
loading/error states; pane state is reference-only ({chat_id, message_id,
title}) — content fetched on mount to keep workspace_panes jsonb small and
avoid 1MB blobs riding session_workspace_updated frames. iframe sandbox
locked to allow-scripts allow-clipboard-write allow-downloads with no
allow-same-origin, srcDoc not src. openInPane discriminates 404 (expected
fallback) from real errors (toast + bail). PanelRightOpen icon button with
mobile 44px tap-target.

31 new server unit tests including a real-symlink filesystem case; 332/332
server tests passing, tsc clean both sides, pnpm -C apps/web build green.
Smoke deferred to first deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.19-html-artifact-panes
2026-05-23 12:43:13 +00:00
1a889dcde3 v1.13.18-codecontext-file-path: resolve file_path against project root in codecontext wrappers
Four codecontext sidecar wrappers — get_file_analysis (required
file_path), get_symbol_info, get_dependencies, and get_semantic_neighborhoods
(optional) — forwarded file_path to the HTTP sidecar unchanged. The
sidecar's internal file index is keyed on absolute paths, so any
relative path from the model returned "File not found in graph".
Three back-to-back failures observed in one chat on 2026-05-22
17:56 UTC, ~48 s of wasted tool budget.

## Resolver

Add resolveProjectPath(projectRoot, rawPath) in codecontext_client.ts:
trim check → absolute/relative branch (both go through resolve() so
dot-segments normalise) → realpath with ENOENT fallthrough → escape
check using the realpathed value. Error shape mirrors the existing
target_dir escape error byte-for-byte; only the field name differs.

Wired into callCodecontext at the args-spread site, guarded on
file_path presence + non-empty. All four wrappers benefit from one
call site; wrappers without file_path (overview, framework, watch,
search) are unaffected.

## Schema trim

.trim() added to all four file_path Zod schemas:

  get_file_analysis:                  z.string().trim().min(1)
  get_symbol_info:                    z.string().trim().optional()
  get_dependencies:                   z.string().trim().optional()
  get_semantic_neighborhoods:         z.string().trim().optional()

Absorbs trailing newlines / whitespace from model output before the
resolver sees the value.

## Adversarial review fixes

Adversarial pass surfaced two P2 findings:

1. Absolute path with `..` resolving outside the project root (e.g.
   `<projectRoot>/../etc/passwd`) that ENOENTs at realpath would slip
   through the literal prefix-check: the raw string starts with
   `<projectRoot>/`. Fix: resolve() the absolute branch's candidate
   too, so dot-segments normalise before the prefix check.

2. No symlink-escape test coverage. Realpath's stated purpose
   (catching in-project symlinks pointing outside the project) was
   never tested. Added: create a tmpdir outside projectRoot,
   symlink projectRoot/evil-link → outside file, assert rejection.
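
Finding 1 can be illustrated with a pure-string sketch (POSIX paths only).
normalizePosix and resolveInsideRoot are hypothetical names, the error text
is invented for this note, and the realpath step that covers finding 2 is
deliberately omitted; this is not the shipped resolveProjectPath.

```typescript
// Collapse "." and ".." segments without touching the filesystem.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// Hypothetical resolver: both the relative and the absolute branch are
// normalised BEFORE the prefix check, so "<root>/../x" cannot pass on its
// raw string prefix.
function resolveInsideRoot(projectRoot: string, rawPath: string): string {
  const trimmed = rawPath.trim();
  if (trimmed === "") throw new Error("file_path is empty");
  const root = normalizePosix(projectRoot);
  const candidate = normalizePosix(trimmed.startsWith("/") ? trimmed : root + "/" + trimmed);
  if (candidate !== root && !candidate.startsWith(root + "/")) {
    throw new Error("file_path escapes project root: " + trimmed);
  }
  return candidate;
}
```

The raw string /opt/app/../etc/passwd starts with /opt/app/, so a literal
prefix check accepts it; after normalisation it becomes /opt/etc/passwd and
is rejected.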

## Tests

codecontext_client.test.ts: 19 tests (10 baseline + 9 new file_path
resolution cases). Cases cover: relative→absolute, absolute-inside,
relative-escape, absolute-outside, ENOENT-fallthrough, empty-string,
wrapper-without-file_path, absolute-with-`..`-ENOENT,
symlink-leaving-root.

codecontext_tools.test.ts: one assertion updated to expect the
resolved-absolute file_path on the wire (previously asserted the raw
relative path passed through, which is exactly the bug being fixed).

Full suite: 301 passed, 7 skipped.

## Affected / unaffected

- get_codebase_overview, get_framework_analysis, watch_changes,
  search_symbols: no file_path arg → resolver guard skips them. No
  behavior change.
- get_semantic_neighborhoods IS in SYNTHESIS_TOOLS — previously-failing
  relative-path calls will now successfully synthesize. Desirable, not
  a regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.18-codecontext-file-path
2026-05-22 21:54:16 +00:00
b52c5df705 v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root
When the agent needed context from another repo, pathGuard rejected every read
with no recovery path. This batch adds a reactive request_read_access flow:
pathGuard's error now hints at the tool, the model emits a structured request,
the inference loop pauses (same mechanism as ask_user_input), the user picks
Allow/Deny via inline chips, and subsequent reads under the granted root succeed
for the rest of the session.

Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[]
(idempotent ADD COLUMN IF NOT EXISTS).

Grant unit (design D1): nearest registered projects.path ancestor →
nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml)
under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks
ancestors with a per-iteration whitelist invariant check so symlinked
input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask).

Path-guard: optional extraRoots arg threaded from session.allowed_read_paths
through executeToolCall to view_file / list_dir / grep / find_files. The
ToolDef.execute signature gets an optional third param; non-FS tools
ignore it. view_file re-anchors the secret-guard check on basename(real)
whenever a relative path starts with "../" so .env / id_rsa* etc. still
deny across grant roots.

Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input.
On 'allow' it re-resolves the grant root (state may have changed since
prompt — auto-falls to denial reason text on failure, not 500), array_appends
to sessions.allowed_read_paths with in-memory dedup, then publishes
tool_result + session_updated frames and enqueues the next assistant turn.

PATCH /api/sessions/:id allowed_read_paths supports revocation only. Zod
refines absolute + no traversal markers; runtime findUnauthorizedAdditions
guard rejects any entry not already present in the row, so a malicious
curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of
bypassing the grant flow (Sam's compliance-review action item).
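
The revocation-only rule amounts to a subset check, sketched here.
findUnauthorizedAdditions is named in this message but its signature is not
shown, so the parameters, PatchResult, and applyRevocationPatch are
assumptions.

```typescript
// Hypothetical result shape for the PATCH handler.
type PatchResult =
  | { ok: true; next: string[] }
  | { ok: false; status: 400; rejected: string[] };

// Any requested entry that is not already granted counts as an addition.
function findUnauthorizedAdditions(current: string[], requested: string[]): string[] {
  return requested.filter((p) => current.indexOf(p) === -1);
}

// PATCH may only shrink the list; new grants must go through the grant flow.
function applyRevocationPatch(current: string[], requested: string[]): PatchResult {
  const rejected = findUnauthorizedAdditions(current, requested);
  if (rejected.length > 0) return { ok: false, status: 400, rejected };
  return { ok: true, next: requested };
}
```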

Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny)
and answered (granted/denied summary with the resolved root) variants;
MessageList.flatten/group special-cases the tool name; SettingsPane adds a
per-session grants list with per-row revoke that PATCHes the shortened
array.

Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including
explicit cases for symlink escape mid-walk, walk-bound termination at
whitelist root, /etc bypass attempt via PATCH, and nearest-project
disambiguation. 292 total server tests green.

Pairs with v1.13.16-xml-parser — the model now self-recovers from both
a wrong tool name AND from a refused path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.17-cross-repo-reads
2026-05-22 21:45:52 +00:00
2e1a81de72 v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints
Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth
investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6
turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted
as an Architect-style agent because Claude Code documentation in its
pre-training corpus uses that shape).

## Parser extension

xml-parser.ts now recognizes BOTH XML tool-call flavors:

  - Qwen/Hermes:   <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call>
  - Anthropic:     <invoke name="NAME"><parameter name="K">V</parameter></invoke>

Both route through the same synthetic-id xml_call_${idx} ToolCall path.
extractToolCallBlocks() and partialXmlOpenerStart() handle both openers
(<tool_call> and <invoke...) so partial buffers don't get prematurely
flushed during streaming.

The existing Qwen parser was tightened to tolerate whitespace around `=`
(<function = name>, <parameter = key>...) so a stray space doesn't get
absorbed into the function name. Name capture is non-whitespace,
non-`>`.
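
A regex-based sketch of parsing both flavors into one ToolCall list with
synthetic xml_call_${idx} ids. The patterns, helper names, and the choice to
trim parameter values are assumptions written from the two shapes listed
above; the real xml-parser.ts also handles streaming partials, which this
omits.

```typescript
type ToolCall = { id: string; name: string; args: Record<string, string> };

// Assumed patterns. The Qwen ones tolerate whitespace around "=" and capture
// names as runs of non-whitespace, non-">" characters.
const QWEN_BLOCK = /<tool_call>\s*<function\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/function>\s*<\/tool_call>/;
const QWEN_PARAM = /<parameter\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/parameter>/;
const INVOKE_BLOCK = /<invoke\s+name="([^"]+)"\s*>([\s\S]*?)<\/invoke>/;
const INVOKE_PARAM = /<parameter\s+name="([^"]+)"\s*>([\s\S]*?)<\/parameter>/;

// Collect every match using a fresh global copy, so no lastIndex state leaks.
function allMatches(pattern: RegExp, text: string): RegExpExecArray[] {
  const re = new RegExp(pattern.source, "g");
  const out: RegExpExecArray[] = [];
  let m = re.exec(text);
  while (m !== null) {
    out.push(m);
    m = re.exec(text);
  }
  return out;
}

function parseParams(body: string, pattern: RegExp): Record<string, string> {
  const args: Record<string, string> = {};
  for (const m of allMatches(pattern, body)) {
    args[m[1] ?? ""] = (m[2] ?? "").trim();
  }
  return args;
}

function parseXmlToolCalls(text: string): ToolCall[] {
  const found: Array<{ index: number; name: string; args: Record<string, string> }> = [];
  for (const m of allMatches(QWEN_BLOCK, text)) {
    found.push({ index: m.index, name: m[1] ?? "", args: parseParams(m[2] ?? "", QWEN_PARAM) });
  }
  for (const m of allMatches(INVOKE_BLOCK, text)) {
    found.push({ index: m.index, name: m[1] ?? "", args: parseParams(m[2] ?? "", INVOKE_PARAM) });
  }
  // Number calls in document order regardless of which flavor produced them.
  found.sort((a, b) => a.index - b.index);
  return found.map((f, idx) => ({ id: `xml_call_${idx}`, name: f.name, args: f.args }));
}
```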

## Unknown-tool recovery hint

New tool-suggestions.ts exports levenshtein() + suggestToolName() +
formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a
toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the
model now includes a "Did you mean: X?" hint based on Levenshtein
distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME).
Targets the qwen3.6 drift to read_file → suggest view_file. Applies to
all unknown tool names, not just <invoke>-derived ones — at the
dispatch layer we no longer know which format produced the call, and
the extra signal is harmless for Qwen-derived calls.
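
A sketch of the hint logic under stated assumptions: the three function
names come from this message, but their signatures, the tie-breaking order,
the two-way substring rule, and the exact error wording are guesses. With
plain Levenshtein, read_file and view_file are four substitutions apart, so
this sketch would not link that pair on distance alone; how the real helper
reaches that suggestion is not shown here.

```typescript
// Classic two-row dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      const del = (prev[j] ?? 0) + 1;
      const ins = (curr[j - 1] ?? 0) + 1;
      const sub = (prev[j - 1] ?? 0) + cost;
      curr.push(Math.min(del, ins, sub));
    }
    prev = curr;
  }
  return prev[b.length] ?? 0;
}

// Assumed ranking: closest name within distance 3, else first substring hit.
function suggestToolName(requested: string, known: string[]): string | null {
  let best: string | null = null;
  let bestDist = Infinity;
  for (const name of known) {
    const d = levenshtein(requested, name);
    if (d <= 3 && d < bestDist) {
      best = name;
      bestDist = d;
    }
  }
  if (best !== null) return best;
  for (const name of known) {
    if (requested.indexOf(name) !== -1 || name.indexOf(requested) !== -1) return name;
  }
  return null;
}

function formatUnknownToolError(requested: string, known: string[]): string {
  const hint = suggestToolName(requested, known);
  return hint === null
    ? `Unknown tool: ${requested}`
    : `Unknown tool: ${requested}. Did you mean: ${hint}?`;
}
```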

## Test coverage

xml-parser.test.ts: 46 tests, all green. Covers both parsers
(well-formed, malformed, multi-parameter, nested-content), the
partial-opener detector for both flavors, the unified extraction
helper, and the unknown-tool error formatter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.16-xml-parser
2026-05-22 20:59:25 +00:00
61308cf17c v1.13.15-codecontext-synth: remove "tag pending" qualifier in roadmap
Trivial follow-up after the v1.13.15-codecontext-synth tag landed.
Retrospective bullet now describes the shipped state; cleanup-order
tracker marks the batch as done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:09:39 +00:00
3992a9fcb7 v1.13.15-codecontext-synth: forced second-inference synthesis for codecontext overview tools
After a codecontext overview-class tool call lands (get_codebase_overview,
get_framework_analysis, get_semantic_neighborhoods), the pipeline runs a
second inference pass that replaces the recursive runAssistantTurn. The
synth pass auto-fetches the top-N source files referenced in the
codecontext output plus project docs (BOOCHAT.md, AGENTS.md,
*roadmap*.md, CONTEXT.md), applies a 32k-token budget with explicit
drop-priority, and streams a structured response that grounds the model
in real load-bearing code rather than relying on the codecontext summary
alone. Smoke #1 (default) and #2 (Architect) both cite the correct
inference/turn.ts + tool-phase.ts + stream-phase.ts files; smoke #6
(fault injection) verifies the fall-through path marks the synth message
status='failed' and yields cleanly to the recursive turn.

## Truncation-aware extraction

codecontext's wrapper inline-truncates results at 32k chars. Without the
expansion step, the top-N file selection only saw the alphabetical head
of the codebase (apps/booterm/dist/*) and auto-fetched the wrong sources.
The pipeline now calls in-process readTruncation(outputPath) before
extracting referenced files, so top-N selection sees the full 80k+ char
output. The 32k truncated head still ships to the synth model — the
expansion is reference-extraction-only, preserving the token-budget
contract. Graceful degradation on readTruncation null/throw: log warn,
fall back to the truncated head.

## Schema deviation from dispatch

The dispatch claimed no schema migration was needed for the new
'synthesis' part kind. Reality: message_parts.kind has an explicit
CHECK constraint (schema.sql:54) that would reject the new value. Added
a DROP CONSTRAINT IF EXISTS + DO $$ pg_constraint idempotency-guarded
re-add matching the CLAUDE.md migration pattern. The inline CREATE TABLE
constraint also updated so fresh installs land with the extended enum.

## User-abort marks synth-message failed

Deviation from review-time spec ("user-abort path does NOT mark the
message failed"). The outer abort handler in error-handler.ts operates
on the parent turn's assistantMessageId, not the new synth row that
runSynthesisPass created. Without explicit marking, the synth row would
sit in status='streaming' until the 5-min stale-streaming sweeper
(v1.13.1-cleanup-bundle), tripping the frontend's 60s no-token-activity
banner in the meantime — exactly the UX bug class the v1.13.1 sweeper
was added to handle. Marking failed on every catch path (including
user-abort) closes the gap. Cost: one extra DB write + one publish on
the rare user-abort-during-synth path.

## Race-safe synth-tool capture

tool-phase.ts uses synthEntries: Array<{tc, output, error?}> with
per-callback push under Promise.all. find() picks the first non-error
entry by call-order (toolCalls array index). Multiple synth-tools in
one batch are uncommon but handled deterministically.

## Roadmap rebase

Updated boocode_roadmap.md retrospective section + cleanup-order tracker
+ schema-changes summary to use the new vMAJOR.MINOR.PATCH-slug tag
names per the 2026-05-22 retag (CHANGELOG.md is the canonical record).
v1.13.15 listed as "this batch, tag pending"; a one-line follow-up
commit will remove that qualifier after the tag lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.13.15-codecontext-synth
2026-05-22 20:08:47 +00:00