Compare commits

145 Commits

Author SHA1 Message Date
a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00
5651f56039 Merge checkpoint-idor-fix: v2.7.2 close 2 checkpoint IDOR holes 2026-06-01 12:16:08 +00:00
9c7d80e2d8 fix(security): scope checkpoint routes to session — close 2 IDORs (v2.7.2)
Flagged by the automated push security review on v2.7.1.

- GET /checkpoints?chat_id= : the chat_id branch filtered by chat_id alone, so
  a caller in any session could read another session's checkpoints by passing
  its chat_id. Now joins chats and gates on chats.session_id.
- restoreCheckpoint scope guard was fail-open: `cp.session_id && cp.session_id
  !== sessionId` fell through on a null denormalized session_id, allowing a
  cross-session restore (worktree reset + transcript trim). Now resolves the
  owning session via the checkpoint's chat and denies on missing/mismatch.
- Adds a DB-integration regression for the null-session_id cross-session case.

Both scope authoritatively through chats.session_id (checkpoints.session_id is
a nullable hint). Coder suite 234 passing; 7/7 checkpoint tests (incl. the
regression) against live postgres+git; typecheck clean. Hotfix on v2.7.1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:15:54 +00:00
a41a02a62b Merge fuzzy-checkpoints: v2.7.1 write/edit robustness (fuzzy applier + worktree checkpoints) 2026-06-01 12:02:06 +00:00
59f07e8cb8 feat: write/edit robustness — fuzzy patch applier + worktree checkpoints (v2.7.1)
#3 Fuzzy patch applier: new pure fuzzy-match.ts (locateMatch, exact→trim→
unicode-canon→Levenshtein≥0.66, refuse-on-ambiguous) wired into pending_changes
applyOne/rewindOne so local-model whitespace/unicode drift in old_string no
longer loses the edit.
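
A rough sketch of the match ladder described above (exact, then trimmed, then unicode-canonicalized, then Levenshtein similarity of at least 0.66, refusing when more than one location matches). It reuses the name `locateMatch` from the message only for readability; the line-window strategy, the `MatchResult` shape, the quote-folding in `canon`, and the choice to refuse as soon as any stage is ambiguous are assumptions, not the contents of fuzzy-match.ts.

```typescript
type MatchResult =
  | { kind: "match"; startLine: number; stage: string }
  | { kind: "ambiguous"; stage: string }
  | { kind: "none" };

// Classic dynamic-programming edit distance, two rows at a time.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

const trimLines = (s: string): string =>
  s.split("\n").map((line) => line.trim()).join("\n");

// Assumed canonicalization: NFKC plus folding curly quotes to ASCII.
const canon = (s: string): string =>
  trimLines(s.normalize("NFKC").replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"'));

function locateMatch(haystack: string, needle: string): MatchResult {
  const lines = haystack.split("\n");
  const height = needle.split("\n").length;
  const windows: string[] = [];
  for (let i = 0; i + height <= lines.length; i++) {
    windows.push(lines.slice(i, i + height).join("\n"));
  }
  const stages: Array<[string, (w: string) => boolean]> = [
    ["exact", (w) => w === needle],
    ["trim", (w) => trimLines(w) === trimLines(needle)],
    ["unicode-canon", (w) => canon(w) === canon(needle)],
    ["levenshtein", (w) => similarity(canon(w), canon(needle)) >= 0.66],
  ];
  for (const [stage, matches] of stages) {
    const hits: number[] = [];
    windows.forEach((w, i) => {
      if (matches(w)) hits.push(i);
    });
    if (hits.length === 1) return { kind: "match", startLine: hits[0], stage };
    if (hits.length > 1) return { kind: "ambiguous", stage }; // refuse rather than guess
  }
  return { kind: "none" };
}
```

Each stage is tried only when the stricter one before it found nothing, so an edit whose old_string differs only in indentation resolves at the trim stage and never reaches the looser similarity check.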

#4 Worktree checkpoint + conversation-trim: checkpoints table + checkpoints.ts
(shadow-commit of tracked+untracked into refs/boocode/checkpoints, hooked into
the 3 external-agent dispatcher paths) + POST restore route (reset --hard +
clean -fd -> transcript trim -> backend-session reset) + "Restore to here" UI.

Built by 3 parallel agents; DB-integration testing caught a created_at
self-deletion bug. Coder suite 234 passing; server+coder build + web tsc clean.
Builds on v2.7.0-mit. openspec write-edit-robustness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:01:57 +00:00
1108d07fb2 Merge relicense-agpl-to-mit: v2.7.0 AGPL-3.0 → MIT relicense 2026-06-01 08:16:25 +00:00
a8bfde8f8d feat: relicense AGPL-3.0 → MIT (v2.7.0)
Clear the 3 Unsloth-Studio-derived AGPL files and flip LICENSE + 5
package.json from AGPL-3.0-only to MIT.

- html-to-md.ts → MIT node-html-markdown (parse5 dropped)
- llama-args-validator.ts → clean-room (flag denylist = facts)
- tool-call-parser.ts → delete dead Unsloth-ported code; keep
  extractToolCallBlocks/stripToolMarkup byte-identical (no behavior change)
- LICENSE → MIT (Copyright (c) 2026 indifferentketchup); 5 package.json → MIT;
  AGPL SPDX headers removed; README License section; license-mit guard test
- roadmap License-debt batch marked shipped; openspec/changes/license-debt-mit

Decouples the relicense from the native-parsing retirement (the ported parser
was dead code). Server suite 519 passing; build + coder typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 08:16:03 +00:00
9c1ddcaa7c Merge v2611-followups: v2.6.11 apps/server close-hook caller + DiffPanel staging hint 2026-06-01 02:35:21 +00:00
217f487395 docs(changelog): v2.6.11-close-hooks-staging (closes the v2.6 openspec)
CHANGELOG + roadmap (through v2.6.11) + openspec v2-6 Phase 3 fully closed (3.7 + apps/server close-hook caller done).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 02:35:21 +00:00
2dfbef4c41 feat: v2.6 follow-ups — apps/server close-hook caller + DiffPanel staging hint (3.7)
apps/server fire-and-forgets BooCoder's Phase-3 close hooks (new coder-notify.ts, reuses BOOCODER_URL, never-rejects) on session-delete + chat archive/archive-all/delete, so warm backends + worktrees tear down immediately (idle-evict/reaper was the backstop). 3.7: BooCoder DiffPanel shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits (pure derivation from per-change agent + current provider, no new state). 6 new server tests (coder-notify); 537 server tests pass; web+server tsc/build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 02:35:11 +00:00
c7a8128059 Merge phase3-lifecycle: v2.6.10 lifecycle hardening (completes v2.6 persistent agent sessions) 2026-06-01 01:10:16 +00:00
986c8a83a9 docs(changelog): v2.6.10-lifecycle-hardening (completes v2.6)
CHANGELOG + roadmap (through v2.6.10; v2.6 marked complete) + openspec v2-6 Phase 3 checked off (3.1-3.6; 3.7 frontend + apps/server caller as follow-ups).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 01:10:16 +00:00
aa3797e356 feat(coder): v2.6 Phase 3 — lifecycle hardening (idle evict, crash recovery, worktree reaper)
- Idle TTL eviction per (chat,agent) + LRU cap (never a busy backend); pure lifecycle-decisions.ts (TDD).
- Crash recovery lifts openchamber's health-monitor + busy-aware-restart + stale-grace state machine into opencode-server.ts (+ port reclaim) and warm-acp.ts; opencode crash -> fresh sessions, ACP -> re-session/new.
- F.1 turn-guard + U.6 usage preserved (their tests pass).
- Orphan worktree reaper (1h grace, superset-style dirty/unpushed preflight, Paseo soft-delete) + close hooks + diff re-baseline after apply_pending.
- 35 new tests + DB-opt-in reconnect test; 215 coder tests pass; tsc + build clean.
- Completes v2.6. Follow-ups out of scope: apps/server close-hook caller, 3.7 DiffPanel staging hint, live smokes.
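
The idle-TTL plus LRU-cap rule in this commit, with its "never a busy backend" constraint, can be expressed as a pure decision function. The sketch below is an assumption about how such a function might look; `BackendInfo`, `selectEvictions`, and the parameter names are hypothetical and are not taken from lifecycle-decisions.ts.

```typescript
// Hypothetical snapshot of one warm backend, keyed per (chat, agent).
interface BackendInfo {
  key: string;
  busy: boolean;
  lastActiveAt: number; // epoch milliseconds
}

// Returns the keys to evict. Busy backends are never candidates.
function selectEvictions(
  backends: BackendInfo[],
  now: number,
  idleTtlMs: number,
  maxWarm: number,
): string[] {
  const idle = backends.filter((b) => !b.busy);

  // 1. Anything idle for at least the TTL goes first.
  const evict = new Set<string>();
  for (const b of idle) {
    if (now - b.lastActiveAt >= idleTtlMs) evict.add(b.key);
  }

  // 2. If still over the cap, evict idle backends least-recently-used first.
  let remaining = backends.length - evict.size;
  const lru = idle
    .filter((b) => !evict.has(b.key))
    .sort((a, b) => a.lastActiveAt - b.lastActiveAt);
  for (const b of lru) {
    if (remaining <= maxWarm) break;
    evict.add(b.key);
    remaining--;
  }
  return Array.from(evict);
}

const toEvict = selectEvictions(
  [
    { key: "a", busy: true, lastActiveAt: 0 }, // old but busy: kept
    { key: "b", busy: false, lastActiveAt: 0 }, // idle past the TTL
    { key: "c", busy: false, lastActiveAt: 900 },
    { key: "d", busy: false, lastActiveAt: 950 },
  ],
  1000, // now
  500, // idle TTL
  2, // cap on warm backends
);
```

Keeping the decision free of I/O is what makes it easy to cover with unit tests; the caller would then stop whichever backends the function names.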

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 01:10:09 +00:00
850d48853f Merge phase2-warm-acp: v2.6.9 warm ACP backend for goose/qwen 2026-05-31 23:57:14 +00:00
f619ae0978 docs(changelog): v2.6.9-warm-acp
CHANGELOG + roadmap (through v2.6.9) + openspec v2-6 Phase 2 checked off (2.1-2.4; Smoke 2/2b pending live).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:57:09 +00:00
0d3d08f5f2 feat(coder): v2.6 Phase 2 — warm ACP backend for goose/qwen
WarmAcpBackend (AgentBackend) holds one persistent goose acp / qwen --acp child + ClientSideConnection + ACP session per (chat,agent); initialize+session/new once, reused across turns. Abort = session/cancel the prompt only (never kills the child); child exit -> agent_sessions.status='crashed' -> re-spawn next turn. Dispatcher routes goose/qwen chat-tab tasks to the pooled warm backend via pure shouldUseWarmBackend (needs session_id+chat_id); one-shot runExternalAgent kept as fallback for arena/MCP/new_task. handleSessionUpdate extracted to a shared pure acp-event-map.ts (one-shot path byte-identical). SDK: installed @agentclientprotocol/sdk@^0.22.1 has stable resumeSession/loadSession; resume moot in the warm hot path, deferred to Phase 3. 15 new tests (warm-acp-routing, acp-event-map); 180 coder tests pass; tsc + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:57:03 +00:00
0658d19b64 Merge phase1-ux: v2.6.8 agent attribution (DiffPanel badges + composer chip + agent-sessions route + opencode usage) 2026-05-31 22:07:39 +00:00
631af5dd4c docs(changelog): v2.6.8-agent-attribution
CHANGELOG + roadmap shipped record (through v2.6.8) + openspec v2-6 Phase 1-UX checked off (U.1-U.6; Smoke U pending the frontend Docker rebuild).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:32 +00:00
5db6551361 feat(web): Phase 1-UX frontend — DiffPanel agent badges + resumed/new-session chip
DiffPanel renders a per-row agent badge (icon+label; null -> 'manual') + a 'Changes from X, Y' note when the pending set spans >1 agent. AgentComposerBar gains an optional sessionId prop -> resumed/history/new-session chip beside the Provider picker (gated, so BooChat callers are unchanged), driven by a new useAgentSessions hook (refetch on message-complete). providerIcon extracted to shared components/coder/providerIcons.tsx; api.coder gains agentSessions(sessionId); PendingChange type gains agent. web tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:26 +00:00
c060778258 feat(coder): Phase 1-UX backend — agent attribution + agent-sessions route + opencode usage
pending_changes.agent stamped at every queue site (native -> 'boocode', dispatched external -> task.agent, manual RightRail -> NULL) + flows through listPending. New GET /api/sessions/:id/agent-sessions -> [{agent,status,has_session,last_active_at}] per (chat,agent). opencode warm server consumes session.next.step.ended, accumulating input_tokens/output_tokens/cost onto agent_sessions (new idempotent columns) via a pure opencode-usage.ts mapper. Tests: agent-sessions.routes (3) + opencode-usage (6); tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:14 +00:00
48c1d70baf Merge f1-interrupt-guard: F.1 opencode post-interrupt stale-terminal guard + doc reconciliation (v2.6.7) 2026-05-31 21:32:25 +00:00
457010391a docs(changelog): v2.6.7-interrupt-guard + reconcile roadmap/review/openspec
CHANGELOG entry for v2.6.7. Plus the session's doc reconciliation: roadmap shipped record synced through v2.6.7 (v2.3 lifecycle marked shipped, relicense AGPL->MIT batch, fork-sweep lift items, claude-agent-sdk SessionStore, ACP package fix); boocode_code_review_v2 (two fork sweeps, relicense decision = 3 AGPL files, jinja gate green); openspec v2-3 reconciled to shipped (v2.5.4-v2.5.13); openspec v2-6 Phase 0/1 + P1.5 shipped, F.1 done, remaining-phase plan + lift sources.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:31:47 +00:00
372651bcb1 fix(coder): F.1 post-interrupt stale-terminal guard (opencode warm server)
opencode emits one trailing session.idle/error for a turn cancelled via client.session.abort(), carrying only a sessionID (no turn id). The warm-server backend settled activeTurn on that event, so after Stop + an immediate new message the orphan idle settled the NEXT turn early as success (one-click reachable since v2.6.5's Send->Stop composer).

Adds a pure per-session guard (backends/turn-guard.ts: armAbortGuard / noteTurnActivity / consumeTerminal over swallowNextTerminal) wired into opencode-server.ts: abort arms it, the next terminal is swallowed once, and a new turn's first delta self-heals so a never-arriving orphan can't strand a real turn. Test-first; 3 regression tests in turn-guard.test.ts. Paseo parallel: 1d38aac.
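
A minimal sketch of the guard's three operations, reusing the names from the message for readability. The state shape and the exact return convention of `consumeTerminal` are assumptions; the real turn-guard.ts may differ.

```typescript
// Per-session guard state; the flag name follows the commit message.
interface TurnGuard {
  swallowNextTerminal: boolean;
}

const newTurnGuard = (): TurnGuard => ({ swallowNextTerminal: false });

// On abort: the cancelled turn will emit one trailing idle/error event.
function armAbortGuard(guard: TurnGuard): void {
  guard.swallowNextTerminal = true;
}

// On the first delta of a new turn: disarm, so a stale guard cannot
// swallow the new turn's real terminal if the orphan never arrived.
function noteTurnActivity(guard: TurnGuard): void {
  guard.swallowNextTerminal = false;
}

// On a terminal event: returns true when the event should settle the
// active turn, false when it is the orphan being swallowed (once).
function consumeTerminal(guard: TurnGuard): boolean {
  if (guard.swallowNextTerminal) {
    guard.swallowNextTerminal = false;
    return false;
  }
  return true;
}

// Stop, then the orphan idle arrives, then the next turn finishes normally.
const guard = newTurnGuard();
armAbortGuard(guard);
const orphanSettles = consumeTerminal(guard); // false: swallowed
const realSettles = consumeTerminal(guard); // true: settles the new turn
```

Because the guard is a single boolean that is consumed at most once, it cannot swallow more than one terminal per abort.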

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:31:35 +00:00
d66948c925 docs(changelog): v2.6.6-claude-md
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:44:33 +00:00
58d0c0f132 docs(claude-md): v2.6.5 session learnings
Capture four recurring gotchas from the panes/tabs/composer batch: the
workspace_panes WorkspaceState envelope (+ legacy-array migration on hydrate and
the union-accepting server PATCH validator); the optional ToolExecCtx
({ sql, sessionId }) 4th arg on ToolDef.execute for DB/session-aware tools
(read_tab_by_number reference); the two-schema-files-one-DB ownership split
(apps/coder owns agent_sessions/worktrees/pending_changes/available_agents) plus
the idempotent confdeltype FK-action-flip pattern; and that React StrictMode is
on, so a setState called inside another setState's updater double-fires in dev.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:44:20 +00:00
7b4f41b26f docs: roadmap shipping-state update + external code-review v2 findings
Update boocode_roadmap.md's shipped section through v2.6.4 (provider lifecycle,
persistent agent sessions, cursor/copilot retirement) and add
boocode_code_review_v2.md — a point-in-time external-fork lift/cross-check
findings doc (Paseo + opencode + llama.cpp + the second fork sweep), companion
to the standing boocode_code_review.md inventory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:28:13 +00:00
5527e7a5e8 docs(changelog): v2.6.5-panes-tabs-composer
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:46 +00:00
08d6a8fa40 feat(web): morphing send/stop/queue composer button
The composer's primary button now reflects generation state: Send when idle,
Stop while generating with an empty draft, and Queue while generating with a
draft typed (submitting queues it via the existing queue path). Stop is
click-only so a stray Enter never interrupts a run. ChatInput gains generating
+ onStop props.
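
The three button states reduce to a small pure function. This is an illustrative sketch: `primaryAction` and `enterTriggers` are hypothetical names, and treating a whitespace-only draft as empty is an assumption.

```typescript
type ComposerAction = "send" | "stop" | "queue";

// Send when idle; while generating, Stop for an empty draft and Queue otherwise.
function primaryAction(generating: boolean, draft: string): ComposerAction {
  if (!generating) return "send";
  return draft.trim().length === 0 ? "stop" : "queue";
}

// Stop is click-only, so Enter triggers only Send and Queue.
function enterTriggers(action: ComposerAction): boolean {
  return action !== "stop";
}

const idle = primaryAction(false, "hello");
const runningEmpty = primaryAction(true, "");
const runningDraft = primaryAction(true, "follow-up");
```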

BooChat: removes the separate centered "Stop generating" pill and wires
generating={streaming} + onStop={handleStop}. BooCoder: generating now keys on
sending || activeTaskId (the dispatch POST is too brief on its own), which also
fixes the queue gates that previously fired mid-run; onStop cancels the active
task via the new api.coder.cancelTask, and the input is no longer disabled while
a task runs so follow-ups can be queued.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:14 +00:00
2fd7e5bf97 feat(web): workspace panes & tabs overhaul
A cohesive batch of pane/tab UX + the persisted workspace-state model (grouped
because the changes interleave across useWorkspacePanes, ChatTabBar, Workspace,
sessionEvents and the api types/client):

- Open a whole chat in a fresh pane via a new open_chat_in_new_pane event:
  ChatTabBar tab context menu "Open in new pane", and MessageBubble.fork() now
  lands the fork beside the original instead of replacing the active pane.
  openChatInNewPane detaches the chat from any pane already holding it
  (one-chat-per-pane).
- The tab-bar "+" becomes a New BooChat/BooTerm/BooCode menu (chat as a tab,
  term/coder as split panes); the split button is unchanged.
- Drop the per-message "Open in pane" button (it opened a single message's
  artifact) and its dead code; the artifact-pane machinery is left orphaned for
  a later teardown.
- Session history: the empty/landing pane lists the session's open chats plus
  archived chats (fetched separately), click to open / restore-and-open.
- Relocate-on-close: closing a chat pane moves its tabs (in order) into the
  oldest chat/empty pane instead of discarding them; terminal/coder panes close
  as before. Reopen strips the restored chatIds from all live panes first, so a
  relocated-then-reopened pane never duplicates a tab — no stack-shape change.
- Stable global tab numbering: tabNumbers/nextTabNumber assigned on chat-pane
  open, retired on close (never reused), rendered map-keyed (not positional).
- workspace_panes is now a WorkspaceState envelope { panes, tabNumbers,
  nextTabNumber, closedPaneStack }; the reopen stack moved from a module-level
  array into the persisted envelope so it survives reload. Hydrate/persist
  normalize the legacy bare-array shape. appendClosed dedupes a value-identical
  top entry to neutralize the StrictMode double-invoke of the setPanes updater.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:03 +00:00
d05f73be26 feat(server): workspace_panes envelope + read_tab_by_number tool
Widen the sessions.workspace_panes JSONB from a bare WorkspacePane[] to a
WorkspaceState envelope { panes, tabNumbers, nextTabNumber, closedPaneStack }.
The PATCH validator accepts either the legacy array or the envelope (zod union)
and normalizes to a full envelope before storing, so existing array-shaped rows
migrate transparently on next write. The session_workspace_updated WS frame
schema is widened to match (kept byte-identical to the web copy; parity test
passes).
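
The accept-either-shape-then-normalize step can be sketched without a schema library. The commit uses a zod union; the plain TypeScript below only illustrates the normalization. The `WorkspacePane` fields and the default values (empty map, `nextTabNumber` of 1, empty stack) are assumptions.

```typescript
// Hypothetical pane shape; only the envelope field names come from the commit.
interface WorkspacePane {
  id: string;
  kind: string;
  chatIds: string[];
}

interface WorkspaceState {
  panes: WorkspacePane[];
  tabNumbers: Record<string, number>;
  nextTabNumber: number;
  closedPaneStack: WorkspacePane[];
}

// Accepts the legacy bare array or a (possibly partial) envelope and
// always returns a full envelope.
function normalizeWorkspaceState(
  input: WorkspacePane[] | Partial<WorkspaceState>,
): WorkspaceState {
  if (Array.isArray(input)) {
    return { panes: input, tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
  }
  return {
    panes: input.panes ?? [],
    tabNumbers: input.tabNumbers ?? {},
    nextTabNumber: input.nextTabNumber ?? 1,
    closedPaneStack: input.closedPaneStack ?? [],
  };
}

const legacyRow: WorkspacePane[] = [{ id: "p1", kind: "chat", chatIds: ["c1"] }];
const migrated = normalizeWorkspaceState(legacyRow);
```

Normalizing on write is what lets old rows migrate lazily: nothing has to rewrite the table up front.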

Adds read_tab_by_number, a read-only tool that resolves a session-scoped tab
number to its chat via the persisted tabNumbers map and returns that chat's
transcript (oldest-first, sentinels skipped, capped at 20k chars). Tools gain an
optional ToolExecCtx ({ sql, sessionId }) 4th param on ToolDef.execute, threaded
through executeToolCall from executeToolPhase; the param is optional so existing
filesystem tools and the apps/coder consumer stay compatible. Registered in
ALL_TOOLS + READ_ONLY_TOOL_NAMES.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:14:42 +00:00
e857815d79 feat(web): paste chips trail the typed message text
flattenToMessage now places the typed text first and appends pasted-chip
content after it with a single leading space (file/line chips remain fenced
provenance blocks after that), instead of prepending all attachments. A
leading slash command therefore stays first and the paste reads as its
continuation — `/command <pasted>` rather than `<pasted>` then the command.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:13:40 +00:00
12d31a81a0 docs(changelog): v2.6.4-agent-sessions-fk
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:47:40 +00:00
5da6eb2447 docs(claude-md): sync v2.6 engineering notes (P1.5-a/b, skills, AGENTS.md parsing)
Reflect shipped v2.6.1–v2.6.3 work in the deep reference. The opencode SSE
bullet now describes per-session SSE (P1.5-a) instead of the single-stream
Phase-1 limit; the agent_sessions resume bullet describes the (chat_id, agent)
re-key (P1.5-b) — chat_id CASCADEs from chats, session_id/worktree_id are
informational SET NULL, and the worktrees table supersedes the defanged
session_worktrees. Drop the stale root AGENTS.md navigation pointer (removed
in v1.12; data/AGENTS.md is the registry, not navigation). Add two
conventions: data/AGENTS.md is parsed (## headings need a --- fence, no
free-form rule sections) and the data/skills/<vendor>/ layout with the
boocode/ namespace.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:47:16 +00:00
7f6c4780e2 fix(coder): converge agent_sessions.session_id FK to SET NULL (P1.5-b follow-up)
The P1.5-b re-key block (cb1846c) re-adds session_id_fkey as ON DELETE
SET NULL, but the whole block is guarded on chat_id_fkey's absence. A DB
already re-keyed to (chat_id, agent) while session_id_fkey was still
ON DELETE CASCADE never re-enters that block, so applySchema leaves it at
'c' forever — diverging from the schema's stated intent, from worktree_id
(already SET NULL), and from the v2.6.3 changelog's own claim that
session_id is informational SET NULL.

Add a standalone confdeltype-guarded block (mirroring the session_worktrees
defang) that flips session_id_fkey CASCADE -> SET NULL independently of the
re-key gate. Idempotent: fires only while the FK is still 'c' — a no-op on a
fresh deploy (already 'n' from the re-key block) and on every re-run. The
live DB was converged by hand with the identical statements; \d
agent_sessions now shows session_id ... ON DELETE SET NULL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:46:41 +00:00
30b6f70f95 docs(changelog): v2.6.3-chatkey-and-skills
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:06:19 +00:00
c2b3e0a013 skills: committing-changes + using-worktrees judgment skills + AGENTS.md guidance
Two portable agent-judgment skills in data/skills/boocode/, externalizing when/how Opus commits and when it isolates work in a worktree, so weaker agents (opencode build agent, BooCoder) can approximate it. committing-changes: segment by concern, stage explicitly (never git add -A), draft scope-prefix messages, present-and-STOP — commit only on explicit command, never push, identity indifferentketchup. using-worktrees: the when-to-isolate heuristic (just-create-when-clear / propose-when-ambiguous / skip), stable-base mechanics, runtime-isolation caveat — deliberately autonomous vs committing's command-gate. Each has an eval.yaml (matching improving-boocode-guidance) with a negative-trigger task. AGENTS.md gets a parser-safe preamble (the registry throws on bare ## headings) pointing at both skills.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:48 +00:00
cb1846c0d5 feat(coder): re-key agent_sessions to (chat_id, agent) + worktrees table (P1.5-b)
The tab (a chat) is the context unit: two opencode tabs in one session are two independent agent contexts sharing one worktree. agent_sessions re-keys from (session_id, agent) to (chat_id, agent) — chat_id FK ON DELETE CASCADE (closing a tab ends its context); worktree_id and session_id become informational SET NULL columns. New worktrees table (one-per-session, survives session delete via session_id SET NULL) supersedes session_worktrees, which is defanged (CASCADE dropped) not yet removed. chat_id is threaded end-to-end: tasks.chat_id added, written by the coder message + skills routes from the frontend tab, read by runOpenCodeServerTask which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic) so ensureSession never gets a null key. Idempotent migration with a backfill-verify gate (0-row assertion after the test session was deleted). config_hash fingerprint logic preserved; one-worktree-per-session unchanged; runExternalAgent untouched. Column rename worktree_path -> path repointed at all five readers (server delete-guard, risk/stash endpoints, ensureSessionWorktree). Supersedes the earlier (worktree_id) draft.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:35 +00:00
f1a85627e4 fix(coder): strip dcp-message-id tags split across stream chunks
The dcp tag (<dcp-message-id>mNNNN</dcp-message-id>) is streamed token-by-token, so it arrives split across SSE deltas. The existing per-chunk stripDcpTags never sees a complete tag in any single fragment, so fragments pass through and the dispatcher reassembles the tag in textChunks (persisted + shown) — and the terminal message.part.updated path that would strip the full text is suppressed by the dedup gate. Add a stateful cross-chunk stripper (dcp-strip.ts: makeDcpStreamStripper) at the dispatcher's opencode frame boundary: it emits text that cannot be part of a forming tag, holds back only a trailing partial-tag prefix (without swallowing legitimate <…> content), and flushes at turn end. Fixes both live delta frames and persisted content. 11 unit tests incl. split-at-every-boundary and the documented per-chunk-fails case. opencode path only; ACP (goose/qwen/claude) untouched.
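
A sketch of a stateful cross-chunk stripper in the spirit of the one described: it removes complete tags, holds back only a trailing fragment that could still grow into a tag, and releases whatever is held at turn end. The function names, the 64-character hold cap, and the prefix test are assumptions and are not the contents of dcp-strip.ts.

```typescript
const OPEN = "<dcp-message-id>";
const CLOSE = "</dcp-message-id>";
const FULL_TAG = /<dcp-message-id>[^<]*<\/dcp-message-id>/g;
const MAX_HOLD = 64; // assumed cap so a stray opener cannot hold text forever

// True when `tail` could still become a complete tag with more input.
function isPartialTag(tail: string): boolean {
  if (tail.length > MAX_HOLD) return false;
  if (tail.length <= OPEN.length) return OPEN.startsWith(tail);
  if (!tail.startsWith(OPEN)) return false;
  const rest = tail.slice(OPEN.length);
  const lt = rest.indexOf("<");
  return lt === -1 || CLOSE.startsWith(rest.slice(lt));
}

function makeDcpStripperSketch() {
  let held = "";
  return {
    // Feed one stream chunk; returns the text that is safe to emit now.
    push(chunk: string): string {
      const text = (held + chunk).replace(FULL_TAG, "");
      // A forming tag contains at most two "<", so only the last two
      // positions can start it. Check the earlier one first.
      const last = text.lastIndexOf("<");
      const prev = last > 0 ? text.lastIndexOf("<", last - 1) : -1;
      const cut =
        [prev, last].find((i) => i !== -1 && isPartialTag(text.slice(i))) ?? text.length;
      held = text.slice(cut);
      return text.slice(0, cut);
    },
    // Call at turn end: whatever was held never became a tag.
    flush(): string {
      const out = held;
      held = "";
      return out;
    },
  };
}

const stripper = makeDcpStripperSketch();
const shown =
  ["Hi <dcp-mess", "age-id>m0001</dcp-", "message-id>there"]
    .map((chunk) => stripper.push(chunk))
    .join("") + stripper.flush();
```

Ordinary markup such as `<b>` is held for at most one chunk and then released, because it stops matching a tag prefix as soon as its second character arrives.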

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 23:16:47 +00:00
c65daba5dd docs(changelog): v2.6.2-delete-guard-and-sse
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:24:25 +00:00
c9e302da37 fix(coder): no-upstream branch alone no longer flags a session at-risk
Session worktree branches (session-<id>) never get an upstream, so the original atRisk rule (unpushed !== 0) flagged every worktree-backed session as at-risk on delete — even pristine ones — forcing a Stash/Force confirm on each. Gate the unpushed arm behind hasUpstream (unpushed !== -1) so the no-upstream sentinel can't trigger it: atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0). No protection is lost — any genuinely unsafe local commit also shows as unmerged > 0 — and the unpushed > 0 arm stays correct for P1.5's pushable worktree branches. unpushed is still reported (-1 = local-only) as informational. Follow-up to 3a26563.
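
The revised rule, written out as a function. The formula is the one given in the message; the `WorktreeRisk` shape and the `isAtRisk` name are assumptions for illustration.

```typescript
// Hypothetical report shape. unpushed === -1 is the no-upstream sentinel.
interface WorktreeRisk {
  dirty: boolean;
  unpushed: number;
  unmerged: number;
}

function isAtRisk(r: WorktreeRisk): boolean {
  const hasUpstream = r.unpushed !== -1;
  return r.dirty || r.unmerged > 0 || (hasUpstream && r.unpushed > 0);
}

// A pristine session worktree with no upstream is no longer flagged.
const pristineLocalOnly = isAtRisk({ dirty: false, unpushed: -1, unmerged: 0 });
// A local-only branch with real commits is still caught by the unmerged arm.
const localCommits = isAtRisk({ dirty: false, unpushed: -1, unmerged: 2 });
```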

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:19:53 +00:00
f69ea5f494 feat(coder): per-session SSE subscriptions (P1.5-a concurrency prereq)
Replace the single global SSE loop (scoped to the most-recently-used worktree directory) with one subscription per live opencode session, each scoped to that session's worktree dir. Two sessions in different worktrees now stream concurrently instead of the second silently dropping the first's events. Each session owns an AbortController (SessionState.sseAbort) wired into subscribe(..., {signal}); the loop reconnects, reconciles (per-session), and is torn down on closeSession/dispose by aborting the signal — which also fixes a latent Phase-1 bug where switching directories left the old runEventLoop parked forever in its for-await (zombie loops). A sessionID demux guard (eventSessionId) drops events that aren't this loop's own, so two sessions sharing a worktree (possible after P1.5-b) don't double-process each other's deltas. Removed sseRunning/sseDirectory/startEventLoop/runEventLoop/reconcileInFlight and the 'SSE directory changed' collision warning. dispatchEvent/handleUpdatedPart (translation, dedup, dcp-strip) and the watchdog are unchanged — only the subscription topology changed. SDK confirmed: @opencode-ai/sdk Event.subscribe opens an independent SSE connection per call, so N concurrent dir-scoped streams are supported. No schema/dispatcher/frontend changes; runExternalAgent untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:15:55 +00:00
3a26563be2 feat(coder): guard session delete against worktree work loss
Deleting a BooChat session CASCADE-wipes its session_worktrees row, which would silently orphan uncommitted/unpushed/unmerged work in the worktree. Add a pre-DELETE gate:

- The server reads session_worktrees from the shared DB first (no row = chat-only session = delete immediately, zero round-trip), and for worktree-backed sessions calls a new BooCoder endpoint that runs git on the host (only the host systemd service can see /tmp/booworktrees).
- checkWorktreeWorkAtRisk reports dirty/unpushed/unmerged via the audited hostExec+shellEscape path; default branch is detected from refs/remotes/origin/HEAD (not the worktree's own branch), never hardcoded.
- Any at-risk worktree returns 409 with per-worktree RiskReport[]; force=true bypasses the check entirely.
- Fail-closed: coder unreachable/errored also blocks (force still escapes).
- The sidebar renders a block dialog distinguishing work-at-risk (Commit/Stash/Force) from couldn't-verify (Cancel/Force only); stash uses -u and re-blocks on remaining commits with an explanatory message.
- Commit never auto-commits; it routes the user to the session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:01:25 +00:00
937920df06 docs(changelog): v2.6.1-phase1-opencode
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:42:39 +00:00
e05469c6ae docs(claude): v2.6 Phase 1 opencode learnings — SSE, model resolution, resume
- opencode is now a warm HTTP server (was "planned, unshipped").
- SSE: session.next.* event types + subscribe({directory}) requirement.
- Model strings need llama-swap/ prefix + presence in opencode.json.
- config_hash excludes ephemeral port; session FKs are ON DELETE CASCADE.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:40:16 +00:00
0e026be5f8 fix(coder): CASCADE delete on session_worktrees + agent_sessions FKs
Deleting a session with linked session_worktrees or agent_sessions rows
threw a FK violation (500 on DELETE /api/sessions/:id). Both FKs now
ON DELETE CASCADE. Idempotent migration: drops the old constraint and
re-adds with CASCADE only if confdeltype != 'c'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:26:28 +00:00
315cdd23e2 feat: strip dcp-message-id tags from opencode output + reopen closed panes
Two independent fixes:

- opencode-server.ts: stripDcpTags() removes <dcp-message-id>…</dcp-message-id>
  tags from text deltas before they reach the frame/DB. Applied to all three
  text paths (session.next.text.delta, message.part.delta text field,
  handleUpdatedPart text type). Reasoning/tool paths untouched.
- useWorkspacePanes.ts: module-level closedPaneStack (capped at 10) captures
  pane kind + chatIds on removePane and removeTab auto-remove. reopenPane()
  pops the stack and re-attaches a new pane to the existing chat ids (chats
  survive pane close server-side). hasClosedPanes drives conditional render.
- ChatTabBar.tsx: [+] is now instant new-tab (no dropdown); split-pane
  dropdown (Columns2 icon) opens Chat/Term/Code in a new pane; reopen button
  (RotateCcw icon) appears when closed panes exist.
- Workspace.tsx: pass reopenPane + hasClosedPanes through to ChatTabBar.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:26:07 +00:00
6d24726c3a feat: add systematic-debugging slash command for BooChat + BooCoder
/data/skills/boocode/systematic-debugging/SKILL.md — guided root-cause
debugging methodology (investigate before fixing). Available as
/systematic-debugging in both BooChat and BooCoder slash menus via the
shared /api/skills endpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:51 +00:00
1bbeaf95c7 fix: auto-name uses session model + pane auto-remove on last tab close
Two independent UI/UX fixes:

- auto_name.ts: pass the session's own model as fallbackModel to
  taskModelCompletion, so chat rename uses whatever model is already
  loaded on llama-swap instead of forcing a swap to DEFAULT_MODEL
  (which times out at 10s when a different model is active).
- useWorkspacePanes.ts: when the last tab in a pane is closed and
  other panes exist, remove the pane entirely instead of leaving an
  orphaned empty panel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:38 +00:00
e30a9e8b23 feat(coder): v2.6 Phase 1 — OpenCode warm server backend
Persistent multi-turn opencode backend: one `opencode serve` HTTP server per
BooCoder process, one opencode session per BooCode session (resumed on
switch-back), single SSE read loop demuxed by session id.

- backends/opencode-server.ts: AgentBackend implementation — spawn with
  waitForReady, session.next.* SSE event translation (text/reasoning/tool
  deltas), Paseo-ported reasoning dedup (streamedPartKeys), promptAsync
  fire-and-forget settled by session.idle, per-turn inactivity watchdog
  (180s) + reconnect reconciliation via session.messages, stale-session
  guard (crashed-not-resumed + config_hash fingerprint on model).
- dispatcher.ts: opencode routes to pool backend (ensureSession→prompt);
  per-session concurrency Map replaces global running boolean (1.9);
  model coalesce (empty→DEFAULT_MODEL) + llama-swap/ prefix for opencode;
  diff-supersede (DELETE+INSERT pending_changes by session, stamp agent).
- worktrees.ts: ensureSessionWorktree (session-keyed, captures base_commit,
  persists to session_worktrees); diffWorktree gains optional baseRef.
- agent-probe.ts: mergeLlamaSwap branch fetches /v1/models, prefixes with
  llama-swap/, populates opencode's available_agents.models (was 0).
- provider-snapshot.ts: export fetchLlamaSwapModels for probe reuse.
- schema.sql: session_worktrees + agent_sessions tables (Phase 0) +
  config_hash column on agent_sessions, pending_changes.agent column.
- package.json: @opencode-ai/sdk ~1.15.0 (resolved 1.15.12).

Known Phase 1 limitation: single SSE stream scoped to most-recent session's
directory; concurrent opencode sessions in different worktrees collide
(warning logged, watchdog prevents hang). Phase 2 moves to per-session SSE.

Smoke 1 verified: two turns in one session, both produce real tokens, same
agent_session_id reused, same server port, turn 2 is 9x faster (no spawn).
goose/qwen/claude paths untouched (runExternalAgent md5 identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:11 +00:00
140ff26204 feat(coder): v2.6 Phase 0 — AgentBackend foundations (no behavior change)
Schema, interface, and service scaffold for v2.6 persistent agent sessions.
Nothing in this batch alters runtime behavior.

- schema.sql: add session_worktrees (one shared worktree per session, FK
  sessions(id)) and agent_sessions (one backend session per (session, agent),
  with backend/status CHECKs); add pending_changes.agent column for DiffPanel
  attribution. All three statements idempotent (IF NOT EXISTS).
- services/agent-backend.ts: AgentBackend interface + AgentSessionHandle,
  EnsureSessionOpts, PromptCtx, TurnResult, and the normalized transport-agnostic
  AgentEvent union (text/reasoning/tool_call/tool_update/commands). Types only.
- services/agent-pool.ts: lazy get-or-create AgentPool keyed by
  `${sessionId}:${agent}` + shared `agentPool` singleton. Empty in Phase 0.
- index.ts: widen onClose to await dispatcher.stop() then agentPool.dispose()
  (pool empty, so dispose() is inert).
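
Illustrative sketch only, not the shipped agent-pool.ts: one way a lazy get-or-create pool keyed by `${sessionId}:${agent}` could look. `PoolBackend` and the factory argument are hypothetical stand-ins for the real AgentBackend wiring.

```typescript
// Hypothetical minimal backend shape; the real AgentBackend interface is richer.
interface PoolBackend {
  dispose(): void;
}

class AgentPool<B extends PoolBackend> {
  private readonly backends = new Map<string, B>();
  private readonly create: (sessionId: string, agent: string) => B;

  constructor(create: (sessionId: string, agent: string) => B) {
    this.create = create;
  }

  // Return the existing backend for (session, agent), creating it on first use.
  getOrCreate(sessionId: string, agent: string): B {
    const key = `${sessionId}:${agent}`;
    let backend = this.backends.get(key);
    if (backend === undefined) {
      backend = this.create(sessionId, agent);
      this.backends.set(key, backend);
    }
    return backend;
  }

  get size(): number {
    return this.backends.size;
  }

  // Dispose every pooled backend; does nothing while the pool is empty.
  dispose(): void {
    this.backends.forEach((backend) => backend.dispose());
    this.backends.clear();
  }
}
```

With nothing registered, dispose() iterates an empty map, which is why the widened onClose is inert in Phase 0.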

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 02:50:17 +00:00
a97293b5d9 Merge coder-hardening: acp-client-fs path-guard fix + untrack live provider config 2026-05-29 22:23:20 +00:00
63adb218e6 chore(coder): untrack live coder-providers.json, ship example
The live config is read AND written by the coder (UI provider toggles PATCH it),
so tracking it churned `git status`. Untrack it (now gitignored under data/*),
add a tracked data/coder-providers.example.json reference, and update the
.gitignore exception + CLAUDE.md/BOOCODER.md docs. Loader already falls back to
{providers:{}} (built-ins only) when the live file is absent. + CHANGELOG v2.5.15.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 22:23:13 +00:00
d0334ca544 fix(coder): separator-bounded worktree path guard in acp-client-fs
The ACP fs bridge's worktree guard used an unbounded `startsWith(resolve(
worktreePath))`, so a sibling path sharing the worktree as a string prefix
(`<worktree>-evil/...`) escaped the scope. Since writeWorktreeTextFile hits disk
directly (no pending_changes gate), a confused/buggy ACP agent could write
outside its worktree. Now uses a separator-bounded check matching write_guard.ts
(resolve() + `startsWith(root + sep)` / `=== root`) via a shared resolveInWorktree,
with a regression test (../ traversal + the sibling-prefix bug). Symlink-swap
hardening intentionally skipped — consistent with write_guard's no-realpath
stance; the agent runs with host FS access so this is a containment guard, not a
trust boundary. Flagged by the automated push security review.
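
Minimal sketch of the separator-bounded check described above. The `(worktreePath, candidate)` signature and the null return are assumptions; the real resolveInWorktree is not shown in this message.

```typescript
import * as path from "node:path";

// Resolve `candidate` against the worktree and return the absolute path only
// if it is the worktree root itself or lies strictly beneath it.
function resolveInWorktree(worktreePath: string, candidate: string): string | null {
  const root = path.resolve(worktreePath);
  const target = path.resolve(root, candidate);
  // Appending the separator stops "<worktree>-evil" from matching as a prefix.
  if (target === root || target.startsWith(root + path.sep)) {
    return target;
  }
  return null;
}
```

Like write_guard.ts, this does no realpath() call, so a symlink inside the worktree can still point elsewhere.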

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 22:22:51 +00:00
024ffc0b92 Merge claude-md-learnings: session learnings + CHANGELOG v2.5.14 2026-05-29 21:24:18 +00:00
691eef1b30 docs(claude): session learnings — provider lifecycle, deploy + mobile gotchas
Adds to CLAUDE.md: stale boocoder-restart symptom after build (new routes 404 /
old routes 200); boocode container `build: .` deploys the working tree, web
dev≠prod until container rebuild; PATCH provider-config replaces override
wholesale (send full override) + coder-providers.json is live config (don't
commit drift); external agents one-shot with no ctx tracking + OpenCode-as-server
is unshipped v2.6; ui/ primitive inventory + button-role=switch / Dialog
fallbacks; mobile Dialog scroll containment. Also backfills uncommitted doc
bullets for the v2.5.7–v2.5.11 coder work. CHANGELOG v2.5.14 entry. Docs only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 21:24:10 +00:00
e92c51578d Merge v2.3-provider-lifecycle-phase5: provider settings UI + closeout
Phase 5 (Settings → Providers tab, picker filter, ACP catalog) + mobile settings
fix + Phase 6 docs. Completes the v2.3 provider-lifecycle batch
(phases 1–4: v2.5.4 / v2.5.5 / v2.5.6 / v2.5.12).
2026-05-29 20:20:38 +00:00
6d03690a65 docs: v2.3 provider-lifecycle closeout (Phase 6)
BOOCODER.md gains a Provider lifecycle section (config file + schema,
gitignored-with-exception, the 24h PROVIDER_PROBE_TTL_MS refresh contract,
enable/disable via Settings → Providers, custom-ACP add, native boocode
always-on, the honest subset-refresh known limitation, deploy + smoke).
docs/DEFERRED-WORK.md §2 (cold-probe skip) marked ADDRESSED with the
still-deferred Tier-2 follow-ups listed. CHANGELOG gets the v2.5.13 batch-closeout
entry. Docs only — no code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:31 +00:00
21384cce5b web: fix Settings pane unreachable on mobile (push ?pane= atomically)
Opening the settings pane on mobile set activePaneIdx, but the ?pane= URL-sync
effect snapped it back to the chat pane on the panes change, so the pane never
showed. toggleSettingsPane now returns the new pane id (id generated outside the
updater, strict-mode safe); Session's toggleSettingsAndSync pushes ?pane=<id> on
mobile when opening (and drops it on close) so the sync effect keeps it active —
mirrors the existing addPaneAndSwitch pattern. Desktop unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:24 +00:00
920f8b75a6 web(coder): provider settings UI — Settings → Providers tab, picker filter, ACP catalog
v2.3 Phase 5. Provider management lives in Settings → Providers: lists every
registered provider with a status badge, enable/disable toggle (sends the full
override so a custom ACP entry's command survives the wholesale-replace PATCH),
per-provider refresh, and a plaintext diagnostic. The composer provider picker
now filters to enabled && (status==='ready' || status==='loading'). Disabled/unavailable
providers leave the picker and are managed only in settings; native boocode
always shows. Adds a curated ACP catalog + AddProviderModal (PATCH config then
subset refresh; the modal caps to the viewport with a single overscroll-contain
scroll region). Loading state uses a capped client poll (no WS frame).
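
Sketch of the picker filter rule stated above, under assumed type names (`PickerEntry` is hypothetical; the status values come from the Phase 2 snapshot lifecycle).

```typescript
type ProviderStatus = "loading" | "ready" | "unavailable" | "error";

// Hypothetical trimmed-down snapshot entry.
interface PickerEntry {
  id: string;
  enabled: boolean;
  status: ProviderStatus;
}

const NATIVE_PROVIDER_ID = "boocode";

// Native boocode always shows; everything else must be enabled and either
// ready or still loading.
function visibleInPicker(entry: PickerEntry): boolean {
  if (entry.id === NATIVE_PROVIDER_ID) return true;
  return entry.enabled && (entry.status === "ready" || entry.status === "loading");
}
```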

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:18 +00:00
e83d9b7d5b docs(changelog): v2.5.12-provider-lifecycle-phase4
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 17:48:28 +00:00
f302969c71 coder(providers): v2.3 provider-lifecycle phase 4 — config HTTP API (diagnostic returns JSON)
GET/PATCH /api/providers/config, subset POST /refresh, and
GET /api/providers/:id/diagnostic (JSON { diagnostic }, §6.4). PATCH order
is validate→save→reload→clear; a malformed body or invalid merged config
returns 422 without writing, and a save failure returns 500 without
reloading (no file/registry divergence). Web client + types extended.
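
Rough sketch of the validate→save→reload→clear ordering and its two failure exits. All names here (`PatchDeps`, `applyConfigPatch`) are hypothetical, and the steps are synchronous for brevity; the real route presumably awaits file I/O.

```typescript
// Hypothetical injected steps so the ordering can be shown without real I/O.
interface PatchDeps<C> {
  validate(body: unknown): C | null; // null: malformed body or invalid merged config
  save(config: C): void;             // throws on write failure
  reload(config: C): void;           // swap the in-memory registry
  clear(): void;                     // drop cached snapshot state
}

function applyConfigPatch<C>(body: unknown, deps: PatchDeps<C>): { status: 200 | 422 | 500 } {
  const config = deps.validate(body);
  if (config === null) return { status: 422 }; // nothing written
  try {
    deps.save(config);
  } catch {
    return { status: 500 }; // file unchanged, so the registry must not reload
  }
  deps.reload(config);
  deps.clear();
  return { status: 200 };
}
```

Reloading only after a successful save is what keeps the file and the registry from diverging.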

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 17:46:56 +00:00
2d997ecb6c web+coder: discover Claude's enabled commands + plugin skills; icon-split commands vs skills
claude is PTY (no ACP discovery), so claude-command-discovery.ts reads its enabled set from disk (user-global): ~/.claude/commands/*.md + every enabled plugin's skills/<name>/SKILL.md (kind=skill) and commands/*.md (kind=command), from ~/.claude/settings.json:enabledPlugins + installed_plugins.json install paths, frontmatter-parsed, bare names, deduped. The snapshot claude branch discovers these live (snapshot cache rate-limits the reads). The coder / menu now shows up to three icon'd groups: <agent> commands (Terminal), <agent> skills (Puzzle), BooCoder skills (Sparkles) via a new optional icon on SlashCommandGroup. AgentCommand gains a kind field in both coder + web copies (parity test enforces); mergeCommandsByName made generic to preserve it. Invocation unchanged (literal /name -> claude). Project-local plugins deferred. BooChat unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 16:21:32 +00:00
dc3859975d coder(providers): capture + persist opencode's live ACP commands (no dispatch needed)
The cold ACP probe captured available_commands but read probedCommands synchronously right after newSession, racing opencode's async available_commands_update notification -> captured nothing, only the static manifest showed. The probe now waits (poll <=3s + 300ms settle) for the notification. Captured commands persist to a new available_agents.commands column and are served (merged with the manifest) on the tier-2-skip path, so the agent's discovered commands survive once models are warm and show without a dispatch. Boot warms via the force:true startup snapshot. Caveat: relies on opencode emitting available_commands_update on session creation, not only post-prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 14:56:18 +00:00
23a33e893a web+coder: segmented per-agent slash menu (agent commands + skills) + cross-agent skill execution
Coder / menu now shows two groups: the active agent's commands first (manifest + live ACP available_commands), BooCoder skills second. SlashCommandPicker gains an opt-in groups prop (flat items path unchanged -> BooChat byte-identical, parity verified); ChatInput takes slashGroups; CoderPane builds the groups. Skills run under the selected agent: coder skill_invoke accepts a provider and, when external, injects the server-side skill body into a dispatched task instead of native inference. Also folds in the initial-chat skill fix (handleLandingSkill: create chat -> assign to pane -> invoke, same transition as a text send) that resolves the landing-page blank screen. BooChat slash menu + skill invocation unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 14:38:39 +00:00
8bf86ecb92 web(coder): keep composer refresh on the top line + icon-only Mode picker on mobile
The AgentComposerBar refresh button wrapped to a second line on mobile: the status dot had ml-auto (pinned to the far-right edge) and the refresh button followed it in DOM order, overflowing past the edge. Group the dot + refresh into one right-aligned (ml-auto) unit so the refresh stays on the top line. Also add an iconOnly option to CompactPicker and render the Mode (permission) picker icon-only on mobile (shield + chevron, no label; aria-label/title + tap-to-open list still convey the selection) to free row width. Desktop unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:46:40 +00:00
fe52250d78 coder(providers): fix empty picker (loading-state) + config model overrides + current Claude models
Fix: getProviderSnapshot returned synchronous installed:false 'loading' entries on a cache miss (v2.5.5/Phase 2), which AgentComposerBar filters out — with the Phase 5 client poll not yet built, a single fetch stranded on 'loading' and the picker showed no providers. It now awaits the build and returns terminal entries; the sync loading-return is deferred until Phase 5. Builds stay fast via the tier-2 cold-probe skip.

Feature: wire the v2.3 config schema's models/additionalModels — buildResolvedRegistry carries them onto ResolvedProviderDef (models replace, additionalModels merge) and provider-snapshot applies them to every ready model list, so /data/coder-providers.json can edit any provider's models with no code change. Claude staticModels bumped from the stale 2-entry list to opus/sonnet/haiku latest-aliases + pinned claude-opus-4-8 / claude-sonnet-4-6 / claude-haiku-4-5-20251001 (passed verbatim to claude --model). +2 tests (109 total).
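
Sketch of the "models replace, additionalModels merge" rule, assuming model lists are plain string ids and that duplicates are dropped (both are assumptions; the real entries may be richer objects).

```typescript
interface ModelOverrides {
  models?: string[];           // replaces the discovered list outright
  additionalModels?: string[]; // appended to whichever list is in effect
}

function applyModelOverrides(discovered: string[], overrides: ModelOverrides): string[] {
  const base = overrides.models ?? discovered;
  const merged = base.concat(overrides.additionalModels ?? []);
  // Keep the first occurrence of each id so ordering stays stable.
  return merged.filter((id, index) => merged.indexOf(id) === index);
}
```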

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:37:01 +00:00
4035aa2b98 coder(providers): v2.3 provider-lifecycle phase 3 — generic ACP dispatch
ACP dispatch now spawns from the resolved registry's launch spec instead of a hardcoded per-name switch. acp-spawn.ts gains resolveLaunchSpec(resolved, installPath): launchCommand (config override / custom-ACP command) wins, else the kept resolveAcpSpawnArgs switch is the built-in fallback. acp-dispatch.ts spawns spec.binary/spec.args with env { ...process.env, ...spec.env }; dispatcher.ts loads the resolved def by task.agent and passes it through. Config-defined custom ACP providers dispatch with no new switch case. Built-in dispatch (opencode/goose/qwen) is byte-identical to pre-v2.3 — proven by a regression test (opencode->['acp'], goose->['acp'], qwen->['--acp'], binary=installPath ?? id, empty env -> plain process.env). Deliberate deviation from design's !installPath->null: the installPath ?? id fallback is preserved. setSessionMode/permission/streaming and the dispatcher poll/NOTIFY/running-guard untouched. 7 new acp-spawn.test.ts cases. No routes/UI (Phase 4+).
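
Sketch of the resolution order described above: launchCommand wins, else the built-in switch, with binary = installPath ?? id. The field shapes (`launchCommand` as an argv array, `ResolvedDef`, `LaunchSpec`) are assumptions for illustration.

```typescript
interface LaunchSpec {
  binary: string;
  args: string[];
  env: Record<string, string>;
}

// Hypothetical slice of ResolvedProviderDef.
interface ResolvedDef {
  id: string;
  launchCommand?: string[];
  env?: Record<string, string>;
}

// Built-in fallback, matching the argv listed in the regression test.
function builtInAcpArgs(id: string): string[] | null {
  switch (id) {
    case "opencode": return ["acp"];
    case "goose": return ["acp"];
    case "qwen": return ["--acp"];
    default: return null;
  }
}

function resolveLaunchSpec(def: ResolvedDef, installPath: string | null): LaunchSpec | null {
  const env = def.env ?? {};
  const cmd = def.launchCommand;
  if (cmd !== undefined) {
    const binary = cmd[0];
    if (binary !== undefined) return { binary, args: cmd.slice(1), env };
  }
  const args = builtInAcpArgs(def.id);
  if (args === null) return null;
  // Preserved fallback: a missing installPath falls back to the provider id.
  return { binary: installPath ?? def.id, args, env };
}
```

A caller would then spawn spec.binary with spec.args and env { ...process.env, ...spec.env }, as the message describes.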

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:06:32 +00:00
35a0aba211 coder(providers): v2.3 provider-lifecycle phase 2 — snapshot lifecycle
provider-snapshot no longer returns null for uninstalled/disabled providers: it emits one entry per registered provider with a lifecycle status (loading|ready|unavailable|error), an enabled flag, and a two-tier probe. Tier-1 is a fast which-style check (command-availability.ts, execFile/no-shell); tier-2 (cold ACP probe) is skipped unless forced, last_probed_at is older than PROVIDER_PROBE_TTL_MS (24h), or DB models are empty — the snapshot-latency win. Cache miss returns status:'loading' synchronously while the build settles via the existing inflight promise. ProviderSnapshotStatus/Entry regain loading/unavailable + gain enabled/description?/fetchedAt? in both coder and web copies, guarded by a runtime parity test (provider-types-parity.test.ts; compile-time cross-project check was blocked by TS6307). Also tracks the data/coder-providers.json seed via a .gitignore exception, completing the Phase 1 config file. No dispatch/route/UI changes (Phase 3+); AgentComposerBar filtering unchanged. 13 snapshot tests (+6) + 6 parity tests.
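
Sketch of the tier-2 skip decision (forced, stale beyond PROVIDER_PROBE_TTL_MS, or no models in the DB). `ProbeState` and `shouldRunColdProbe` are hypothetical names; treating a never-probed provider as stale is an assumption.

```typescript
const PROVIDER_PROBE_TTL_MS = 24 * 60 * 60 * 1000; // 24h

// Hypothetical view of what the DB row knows about a provider.
interface ProbeState {
  lastProbedAt: number | null; // epoch ms, null if never probed
  modelCount: number;
}

function shouldRunColdProbe(state: ProbeState, now: number, force: boolean): boolean {
  if (force) return true;
  if (state.modelCount === 0) return true;
  if (state.lastProbedAt === null) return true; // assumption: never probed counts as stale
  return now - state.lastProbedAt > PROVIDER_PROBE_TTL_MS;
}
```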

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 11:47:48 +00:00
3730dc9341 coder(providers): v2.3 provider-lifecycle phase 1 — config-backed registry
Adds a config layer merged over the hardcoded built-ins (tasks 1.1-1.6): CODER_PROVIDERS_PATH env (default /data/coder-providers.json); provider-config.ts (Zod schema + never-throw loader — missing/invalid file falls back to built-ins only — + save); provider-config-registry.ts (ResolvedProviderDef + buildResolvedRegistry merge: override built-ins, add custom extends:'acp' entries, boocode always enabled + singleton); agent-probe now iterates the resolved registry, probes custom-ACP command[0] via execFile (no shell), skips disabled providers (keeps the row), reads enabled from memory only (no DB column). No snapshot/dispatch/route/UI changes (Phase 2+). 6 new unit tests; empty config provably yields exactly the built-ins.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:09:34 +00:00
a359a4ab8b coder(providers): remove retired cursor and copilot providers
Drop both retired providers from BooCoder's provider layer: acp-spawn argv cases, provider-manifest mode blocks + manifest keys, provider-commands maps, the provider-snapshot cursor model-CLI branch (+ orphaned exec/promisify imports), the agent-probe copilot ACP-detect branch, and the now-dead cursor-models module + its test. The PROVIDERS registry array already lacked both. Built-ins unchanged: claude, opencode, goose, qwen, native boocode.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:07:21 +00:00
a8c84ecfe4 chore+docs: config, agent registry, codecontext, v2.6 spec, changelog
Working-tree config/doc changes (.gitignore, CLAUDE.md, AGENTS.md removal + data/AGENTS.md, codecontext Dockerfile/shim — pre-existing) plus this session's v2-6 persistent-agent-sessions openspec proposal/design/tasks (planning only; feature unimplemented, reserves the v2.6.0 tag) and the v2.5.2 CHANGELOG entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:31 +00:00
547fd70650 server/coder: working-tree backend changes (pre-existing)
Checkpoint of in-progress backend work present in the tree, not authored this session: auto_name, inference tool-phase/turn, secret_guard, provider-registry, plus a new agent-allowlist test (7 tests, passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:16 +00:00
990a615b87 web(coder UI): ChatInput migration + Thinking render + DiffPanel route fix
Bundles in-progress working-tree UI work not authored this session (CoderPane ChatInput migration, AgentComposerBar/CoderMessageList/tab-bar/sidebar/pane refinements, provider icons) with this session's changes to the same files: MessageBubble renders a collapsible 'Thinking' block from reasoning_text/reasoning_parts (surfacing ACP agent_thought_chunk + native reasoning), and the DiffPanel approve/reject calls are repointed to the real /api/coder/pending/:id/apply and /reject routes (the old /sessions/:id/pending/:id/approve|reject paths did not exist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:06 +00:00
5352fd9942 coder(pending): new-file-from-RightRail create endpoint + modal
POST /api/sessions/:sessionId/pending/create queues a pending_changes create via queueCreate (WriteGuardError -> 422 with the guard message). RightRail gains a 'New file from pasted text' modal (path + content) wired through api.coder.createPendingFile; sessionId is threaded down from App.tsx. The staged change shows in the CoderPane DiffPanel for explicit apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:50 +00:00
66df410826 web: fix mobile nav stuck-open on rejoin + paste-chip code fence
useViewport re-syncs the snapshot on pageshow/visibilitychange/resize/orientationchange — iOS reported a stale width on backgrounded-tab restore, leaving isMobile=false so the sidebar rendered as a permanent column with no close affordance. flattenToMessage now inserts pasted-text chips verbatim instead of wrapping them in a triple-backtick fence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:42 +00:00
f89c8f3f15 coder(dispatcher): react to new tasks via LISTEN/NOTIFY, poll as fallback
AFTER INSERT trigger on tasks fires pg_notify('tasks_new'); the dispatcher listens via porsager sql.listen and triggers an immediate poll, with the setInterval poll kept at 2s as a missed-notification safety net. Per-session guard unchanged (no double-dispatch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:34 +00:00
cbef7618b3 v2.5.1-budget-100: raise all tool call budgets to 100 + codecontextignore fix
Budget defaults raised from 50/10/50 to 100/100/100 (read-only,
non-read-only, no-agent). Per-agent max_tool_calls from AGENTS.md
still overrides.

Added .claude/worktrees/ to .codecontextignore to prevent
get_codebase_overview from parsing empty stub files in stale
worktree node_modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-28 02:40:26 +00:00
fcc7c5a86e v2.5.0-task-model: lightweight task model services + tasks table
Task model infrastructure for cheap LLM calls (auto-naming, search
rewrite, tags, summaries) via a dedicated llama-server instance at
TASK_MODEL_URL, falling back to LLAMA_SWAP_URL with FAST_MODEL when
unset. Replaces the inline fetch in auto_name.ts with taskModelCompletion.

Adds search query rewriting: on step 0 when web tools are enabled, the
user's message is summarized into a search intent hint appended to the
system prompt, improving web_search relevance.

Schema: tasks table for provider dispatch and arena, sessions.tags column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 21:44:39 +00:00
bcfc94fa47 v2.4.1-sidecar-routing: route per-agent flags to llama-sidecar + tool gap fix
Batch 3c: when an agent has llama_extra_args in AGENTS.md, provider.ts
routes inference through LLAMA_SIDECAR_URL instead of LLAMA_SWAP_URL.
X-Agent-Flags header built from the agent's flags. Boot-time guard
refuses to start if any agent has llama_extra_args but LLAMA_SIDECAR_URL
is unset. PrefixFingerprint gains a route field (swap/sidecar) for
per-turn visibility. 9 provider tests.
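
Sketch of the routing rule and boot-time guard, under stated assumptions: the type names are hypothetical, and the X-Agent-Flags value is shown as the flags joined by spaces, which this message does not specify.

```typescript
interface AgentDef {
  name: string;
  llama_extra_args?: string[];
}

interface RoutingConfig {
  LLAMA_SWAP_URL: string;
  LLAMA_SIDECAR_URL?: string;
}

interface InferenceRoute {
  route: "swap" | "sidecar";
  baseUrl: string;
  headers: Record<string, string>;
}

// Boot-time guard: refuse to start if any agent needs the sidecar but no URL is set.
function assertSidecarConfigured(agents: AgentDef[], config: RoutingConfig): void {
  const needy = agents.filter((a) => (a.llama_extra_args ?? []).length > 0).map((a) => a.name);
  if (needy.length > 0 && !config.LLAMA_SIDECAR_URL) {
    throw new Error("LLAMA_SIDECAR_URL is unset but required by: " + needy.join(", "));
  }
}

function resolveRoute(agent: AgentDef, config: RoutingConfig): InferenceRoute {
  const flags = agent.llama_extra_args ?? [];
  if (flags.length === 0) {
    return { route: "swap", baseUrl: config.LLAMA_SWAP_URL, headers: {} };
  }
  if (!config.LLAMA_SIDECAR_URL) {
    throw new Error("agent " + agent.name + " needs LLAMA_SIDECAR_URL");
  }
  // Assumed header encoding: space-joined flags.
  return { route: "sidecar", baseUrl: config.LLAMA_SIDECAR_URL, headers: { "X-Agent-Flags": flags.join(" ") } };
}
```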

AGENTS.md tool gap: all agents (except Prompt Builder) were missing 8
tools that were added after the original tool lists were written:
request_read_access, view_truncated_output, ask_user_input, git_status,
get_blast_radius, get_hot_files, get_middleware, get_routes. The missing
request_read_access caused silent "permission denied" when reading files
outside the project root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 19:28:08 +00:00
90a6761b07 v2.4.0-unsloth-studio-lift: port 3 Unsloth Studio AGPL-3.0 modules
Batch 1 — tool-call-parser.ts: replaces xml-parser.ts with a port of
Unsloth's tool_call_parser.py. Adds balanced-brace JSON scanner,
single-param fast path, hasToolSignal/stripToolMarkup/parseToolCallsFromText
exports, and stream-finalization stripping at all three final-write sites
(error-handler, finalizeCompletion, executeToolPhase). Anthropic <invoke>
shape preserved. 75+12 tests.

Batch 2 — web/html-to-md.ts: parse5 tree-walking HTML-to-Markdown converter
ported from Unsloth's _html_to_md.py. Replaces web_fetch's regex stripHtml
with structured markdown output (headings, links, lists, tables, code blocks,
blockquotes, entity decoding). 29 tests.

Batch 3 — llama-args-validator.ts: port of llama_server_args.py deny-list
validator. Wired into AGENTS.md frontmatter parser — llama_extra_args field
validated at load time, rejects managed flags (model identity, networking,
auth/TLS, server UI). No runtime consumer yet (llama-swap boundary). 76 tests.

All three files carry SPDX-License-Identifier: AGPL-3.0-only headers.
LICENSE flipped to AGPL-3.0-only in prior commit (a938cf1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 23:30:50 +00:00
a938cf1d42 License: AGPL-3.0-only 2026-05-26 23:29:25 +00:00
6f6b3afb5d v2.3.2-coder-answer-endpoint: fix ask_user_input submit in CoderPane
The CoderPane runs its own inference runner and broker on the boocoder
service. The AskUserInputCard was calling /api/chats/:id/answer_user_input
on the main BooChat server, which has a different inference runner — the
answer was accepted but the next turn was enqueued on the wrong runner,
so nothing happened.

Fix: register the same answer_user_input endpoint on the boocoder, and
add an apiPrefix prop to AskUserInputCard so the CoderPane routes
through /api/coder/chats/:id/answer_user_input. BooChat's MessageList
continues to use the default (no prefix) path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 21:54:08 +00:00
154ef78f7c v2.3.1-permission-questions: enrich ACP permission wire for interactive questions and elicitations
The permission_requested WS frame now carries kind ('tool'|'question'|'plan'|
'elicitation'), input (the tool's rawInput payload), and description fields.
PermissionCard detects question-type permissions (Claude Code's AskUserQuestion)
and renders an interactive radio/checkbox form instead of approve/deny buttons.
Submitting answers auto-selects the first allow option.

Also wires up ACP createElicitation (unstable/experimental) — JSON Schema-driven
forms for structured user input. The same PermissionCard renders elicitation
fields with type-appropriate inputs. Both flows use the existing permission-waiter
blocking pattern with 120s timeout.

The response path (POST /api/coder/tasks/:id/permission) now accepts optional
updated_input alongside option_id, forwarded to the ACP agent as the user's
answer payload. Elicitation responses map to accept/decline/cancel actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 21:28:14 +00:00
792bbb9da3 v2.3.0-sampling-params-ask-user: agent sampling params, ask_user_input in CoderPane, UX polish
Add top_p/top_k/min_p/presence_penalty to AGENTS.md frontmatter and thread
through inference (agents.ts parser → Agent type → stream-phase → sentinel
summaries). Null means omit from request body, preserving provider defaults.
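
Sketch of the "null means omit" rule only; `samplingBody` is a hypothetical helper and says nothing about how the provider layer forwards each field.

```typescript
const SAMPLING_KEYS = ["top_p", "top_k", "min_p", "presence_penalty"] as const;

type SamplingKey = (typeof SAMPLING_KEYS)[number];
type SamplingParams = Record<SamplingKey, number | null>;

// Copy only the non-null knobs so provider defaults apply to the rest.
function samplingBody(params: SamplingParams): Record<string, number> {
  const body: Record<string, number> = {};
  SAMPLING_KEYS.forEach((key) => {
    const value = params[key];
    if (value !== null) body[key] = value;
  });
  return body;
}
```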

Wire ask_user_input interactive card into both BooCoder frontends: the
CoderPane in BooChat's SPA (CoderMessageList now renders AskUserInputCard
instead of ToolCallLine for ask_user_input tool calls) and the standalone
coder SPA (MessageBubble + new AskUserInputCard + shadcn ui primitives).

Additional fixes: SessionLandingPage uses ChatInput with slash-command
support and lazy chat creation; Session.tsx hydrate-race fix for empty pane
promotion; AgentPicker wider dropdown with line-clamp; ModelPicker min-width;
Textarea converted to forwardRef; Recon agent added to AGENTS.md; codecontext
host port exposed in docker-compose.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 21:02:21 +00:00
31e1b32be1 v2.2.2-xml-placeholder-reject: drop placeholder XML tool calls at parse time
Reject qwen3.6 spurious <invoke> tails with path "..." or empty args before
they enter toolCalls, preventing duplicate assistant answers. Dropped blocks
append to flushed text; four new xml-parser tests. DEFERRED-WORK §6 for
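
Sketch of the placeholder filter described above, with a hypothetical `ParsedInvoke` shape (the real parser's types are not shown here): calls with empty args or a literal "..." path are dropped and their raw text is kept for the flushed output.

```typescript
// Hypothetical parsed <invoke> block.
interface ParsedInvoke {
  name: string;
  args: Record<string, string>;
  raw: string; // original markup, kept so dropped blocks can be flushed as text
}

function isPlaceholderInvoke(call: ParsedInvoke): boolean {
  return Object.keys(call.args).length === 0 || call.args["path"] === "...";
}

function splitPlaceholders(calls: ParsedInvoke[]): { toolCalls: ParsedInvoke[]; droppedText: string } {
  const toolCalls: ParsedInvoke[] = [];
  let droppedText = "";
  calls.forEach((call) => {
    if (isPlaceholderInvoke(call)) droppedText += call.raw;
    else toolCalls.push(call);
  });
  return { toolCalls, droppedText };
}
```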
console.debug → pino cleanup.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 16:22:43 +00:00
314adaae48 docs: reconcile roadmap, README, and deferred work for v2.2 ship state
Mark v2.2/v2.2.1 shipped and v2.3 planned in roadmap and README; fix
DEFERRED-WORK §2 (ACP probe skip is planned, not resolved).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 15:27:16 +00:00
93d3f86c2b v2.2-paseo-providers: Paseo provider stack + v2.2.1 pane-scoped chat fixes
Ship Paseo-equivalent provider snapshot, AgentComposerBar, ACP dispatch
rewrite with streaming/persist, permission prompts, and agent commands.
Follow-up: pane-scoped chat resolution, CoderMessageList tool timeline,
WS user-delta replace, and inference orphan tool_call stripping.
Archive openspec v2-2; update CHANGELOG and CURRENT.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 15:18:31 +00:00
04673eaf59 v2.1.1: roadmap cleanup + README update + openspec archive
- Archive all 10 shipped openspec changes to openspec/changes/archived/
- Update boocode_roadmap.md: date, shipped status for v1.14/v1.15/v2.0, add v2.1.0 section
- Update README.md: 3-app monorepo, add services table, add What's shipped section
- Remove stale active openspec folders (all work shipped)
2026-05-25 20:23:22 +00:00
d8ffee1950 v2.1.0-provider-picker: BooCoder systemd migration + provider picker
- BooCoder moves from Docker to host systemd service (boocoder.service)
- Agent dispatch (ACP + PTY) switches from SSH to direct spawn/exec
- SSH helpers marked @deprecated (kept for one release cycle)
- Provider registry (5 providers: boocode, opencode, goose, claude, qwen)
- Agent probe with direct which/exec + model discovery (qwen settings, static claude models)
- GET /api/providers route with installed status, models, transport fallback
- ProviderPicker frontend component in CoderPane header
- External provider messages route through tasks row instead of inference enqueue
- Smart scroll: MessageList only auto-scrolls when near bottom (150px threshold)
- DB: available_agents gets models, label, transport columns
- Bug fix: loadContext SELECT includes allowed_read_paths
- Bug fix: cap hit sentinel inserted before buildMessagesPayload
- docker-compose.yml: boocoder service commented out, BOOCODER_URL env var added
- CLAUDE.md: updated docs for systemd, provider registry, JSONB gotcha, loadContext
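
Sketch of the smart-scroll check from the list above (auto-scroll only when within 150px of the bottom). `isNearBottom` is a hypothetical helper, and treating exactly 150px as "near" is an assumption.

```typescript
const NEAR_BOTTOM_PX = 150;

// Structural subset of a scrollable DOM element.
interface ScrollMetrics {
  scrollTop: number;
  clientHeight: number;
  scrollHeight: number;
}

function isNearBottom(el: ScrollMetrics, threshold: number = NEAR_BOTTOM_PX): boolean {
  const distanceFromBottom = el.scrollHeight - el.scrollTop - el.clientHeight;
  return distanceFromBottom <= threshold;
}
```

A message list would call this before appending a new message and only scroll to the end when it returns true, so a user reading older messages is not yanked down.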
2026-05-25 19:20:53 +00:00
e423579e99 v2.0.5: FAST_MODEL routing + tool-use summaries + Qwen dispatch + Arena
Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:

1. FAST_MODEL config: optional env var routes cheap LLM calls (titles,
   summaries, labeling) to a smaller model on llama-swap. auto_name.ts
   uses ctx.config.FAST_MODEL ?? session.model. Set FAST_MODEL=nemotron-
   nano-4b to avoid loading the 35B model for 20-token title generation.

2. Tool-use summaries (services/inference/tool-summaries.ts): utility
   that generates "git-commit-subject-style" labels for tool batches via
   a fast-model LLM call. System prompt + truncation logic ported from
   Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
   for BooCoder's dispatcher to call after task completion.

3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
   PTY dispatch builds: qwen -p "<task>" --output-format stream-json
   (NDJSON structured events over stdout). Env: OPENAI_BASE_URL +
   OPENAI_API_KEY point Qwen Code at llama-swap. execution_path CHECK
   constraint extended with 'qwen'.

4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
   task to N contestants (2-5, each with different agent/model), each
   getting its own task row linked by arena_id UUID. GET /api/arena/:id
   shows all contestants. POST /api/arena/:id/select/:task_id marks
   winner. Schema: arena_id column added to tasks.
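
Sketch of the arena fan-out from item 4, assuming hypothetical row and helper names: one task row per contestant, all sharing the same arena_id, with the 2-5 bound enforced up front.

```typescript
interface Contestant {
  agent: string;
  model: string;
}

// Hypothetical subset of a tasks row.
interface ArenaTaskRow {
  arena_id: string;
  agent: string;
  model: string;
  prompt: string;
}

function buildArenaTasks(arenaId: string, prompt: string, contestants: Contestant[]): ArenaTaskRow[] {
  if (contestants.length < 2 || contestants.length > 5) {
    throw new RangeError("arena needs 2-5 contestants, got " + contestants.length);
  }
  return contestants.map((c) => ({ arena_id: arenaId, agent: c.agent, model: c.model, prompt }));
}
```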

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 14:05:59 +00:00
06116f31b3 v2.0.4-hardening: fuzz suite + integration tests + production readiness
Phase 8 of v2.0. Final hardening pass before production tag.

Path-guard fuzz suite (34 tests): traversal attacks (../ all depths,
encoded %2e%2e, null bytes, absolute escapes, prefix-without-separator,
backslash), secret-file deny list (.env, *.pem, id_rsa*, *.key,
credentials.json, *.kdbx, .netrc), valid-path positives, edge cases
(empty, whitespace, very long, triple-dot, multiple slashes).

write_guard.ts hardened: added null-byte rejection and whitespace-only
rejection (previously only checked empty string).

Pending-changes integration test skeleton: 4 tests covering the full
queue→apply→rewind cycle against a real DB + filesystem. Gated on
DATABASE_URL via describe.runIf (same pattern as apps/server's
tool_cost_stats.test.ts). Skips cleanly when unset.

57 tests passing (23 existing + 34 fuzz), 4 integration skipped.
All builds clean. All services healthy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 04:31:22 +00:00
47abbb6e3c v2.0.3: CLI client + human inbox + cost tracking + Boomerang new_task
Phase 7 of v2.0. BooCoder gains a terminal-driven UX and subagent
isolation primitive.

CLI (src/cli.ts): standalone entry point for terminal use.
- boocode run "task" [--agent x] [--model y] — create + stream output
- boocode ls [--state x] — formatted task table
- boocode attach <id> — WS stream of running task
- boocode send <id> "msg" — follow-up message to task session
Connects to BOOCODER_URL (default http://100.114.205.53:9502).

Human inbox (routes/inbox.ts): GET /api/inbox (failed/blocked tasks),
POST /api/inbox/:id/retry (reset to pending for re-dispatch).

Cost tracking: dispatcher aggregates tokens_used from all messages in
the task's session after completion, stores in tasks.cost_tokens.
GET /api/stats/costs?group_by=project|agent|day for aggregation.

Boomerang subagent isolation (3 new tools):
- new_task: creates child task with parent_task_id linkage, runs in
  fresh isolated session. Orchestrator sees only output_summary.
- list_tasks: query child tasks of current parent
- check_task_status: read task state + output_summary

The orchestrator pattern: an agent with tools: [new_task, list_tasks,
check_task_status] can ONLY dispatch — can't read files or MCP. This
is the Roo Code Boomerang Tasks capability-restriction principle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 04:25:18 +00:00
f53c6d6cb9 v2.0.2: BooCoder MCP server — 6 tools over stdio
Phase 6 of v2.0. BooCoder exposes its task primitives as MCP tools
so external agents (Sam's opencode in Termius) can drive the task
queue without going through the web UI.

6 MCP tools registered via McpServer + StdioServerTransport:
- boocoder.create_task — INSERT pending task
- boocoder.list_pending_changes — SELECT pending changes
- boocoder.apply — apply a specific pending change to disk
- boocoder.reject — reject a pending change
- boocoder.dispatch_external_agent — create task with agent for Path B
- boocoder.list_worktrees — list active worktrees from running tasks

Activated by --mcp CLI flag: `node dist/index.js --mcp` starts the
MCP server over stdio instead of the HTTP server. Configure in
opencode: {"mcpServers":{"boocoder":{"type":"stdio","command":"docker",
"args":["exec","-i","boocoder","node","dist/index.js","--mcp"]}}}

Uses McpServer class from @modelcontextprotocol/sdk/server/mcp.js
(high-level .tool() registration API). Zod schemas for input
validation. The process runs until stdin closes, then cleanly shuts down the DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 04:17:28 +00:00
3d6055518b v2.0.1: ACP dispatch + PTY fallback + worktree management
Phase 5 of v2.0. External agent dispatch via SSH to host.

ACP dispatch (acp-dispatch.ts): spawns agent via SSH with JSON-RPC
stdio pipe. Wraps opencode/goose in ACP mode. Captures structured
events (file operations, tool calls) mapped to parts taxonomy.
Falls back to PTY if ACP handshake fails.

PTY dispatch (pty-dispatch.ts): raw SSH spawn for agents without ACP
support (claude, pi). Captures stdout/stderr as plain text. Simpler
but less structured than ACP.

SSH helper (ssh.ts): shared spawn wrapper for SSH commands to
samkintop@100.114.205.53 (Tailscale IP, same as booterm). Uses
openssh-client installed in the runtime Dockerfile stage.

Worktree management (worktrees.ts): createWorktree (git worktree add
via SSH), diffWorktree (git diff HEAD...task-branch), cleanupWorktree
(git worktree remove --force). One worktree per task at
/tmp/booworktrees/<taskId>.

Dispatcher updated: checks available_agents.supports_acp to pick
transport. Path B flow: create worktree → dispatch agent → diff
worktree → queue diff into pending_changes → cleanup worktree →
mark task complete.

Agent probe updated: probes via SSH to find host-installed agents
(which opencode && opencode --version over SSH).

Dockerfile: openssh-client added to runtime stage.
Config: SSH_HOST env var (default 100.114.205.53).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 04:10:46 +00:00
752ea74f43 v2.0.0-final: dispatcher + task queue + agent probing
Phase 4 of v2.0. BooCoder can now queue tasks and dispatch them
through the inference loop autonomously.

Dispatcher (services/dispatcher.ts): in-process setInterval(5s) polls
tasks WHERE state='pending', picks one at a time, creates an isolated
session+chat, enqueues inference with the task's input as the user
message, polls for completion, marks state completed/failed with
output_summary. Single-task-at-a-time for v2.0.0; parallel dispatch
is a Phase 5+ concern. Respects onClose hook for graceful shutdown.

Task routes (routes/tasks.ts): POST /api/tasks (create), GET /api/tasks
(list with state/project filters), GET /api/tasks/:id (detail),
POST /api/tasks/:id/cancel (marks cancelled, aborts if running).

Agent probe (services/agent-probe.ts): on startup, probes PATH for
opencode/goose/claude/pi via which + --version. UPSERTs into
available_agents table. Finds nothing inside the container (expected —
Phase 5 addresses host-agent access via ACP/PTY).

Schema: ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id (links
task to its auto-created inference session for isolation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:55:18 +00:00
73b53089b0 CLAUDE.md: v2.0.0 architecture docs — BooCoder, DB rename, MCP config, workspace deps
Session learnings applied:
- Database renamed boochat (from boocode), new tables documented
- BooCoder architecture section: workspace dep pattern, write tools,
  coder pane integration, proxy routing
- Environment: MCP_CONFIG_PATH, BooCoder health at :9502
- Workflow: Go binary at /snap/go/current/bin, codecontext fork location
- Conventions: workspace exports with types conditions, Docker build order

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:51:24 +00:00
457c59fb06 v2.0.0: BooCoder frontend — chat pane + diff pane + session picker
Integrates BooCoder as a 'coder' workspace pane within the existing
BooChat SPA at code.indifferentketchup.com. Renamed the placeholder
'agent' pane kind to 'coder' across all types, menus, hooks, and
mobile switcher (Icon: Code instead of Bot).

CoderPane.tsx: split layout with chat area (messages via WS to
boocoder:9502, input bar posting to /api/coder/sessions/:id/messages)
and diff panel (pending changes with Approve/Reject per change plus
Approve All/Reject All). Reuses MarkdownRenderer for message content.

Proxy: Vite dev config adds /api/coder → boocoder:9502 (ordered above
/api per CLAUDE.md proxy-ordering rule). Production: Fastify route in
apps/server/src/index.ts proxies /api/coder/* to http://boocoder:3000
via fetch() pass-through. WS connects directly to :9502 (same
Tailscale network, no proxy needed for WebSocket upgrade).

WorkspacePaneKind mirror updated in both apps/web and apps/server
types. useWorkspacePanes gains coderPane() factory (replaces the old
agent toast stub). Workspace.tsx switch renders CoderPane for
pane.kind === 'coder'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:24:49 +00:00
78455b7efc v2.0.0: BooCoder frontend — chat pane + diff pane + session picker
Phase 3 of v2.0. React + Vite SPA at apps/coder/web/ served by
the coder Fastify server via @fastify/static with SPA fallback.

Chat pane: message list via WS streaming (useSessionStream hook),
input bar, POST /api/sessions/:id/messages on submit, markdown
rendering via react-markdown + remark-gfm, inline tool-call display.

Diff pane: fetches GET /api/sessions/:id/pending, shows pending
changes with file path + operation badge (create/edit/delete),
before/after diff for edits, Approve/Reject per change and
Approve All/Reject All buttons.

Layout: fixed two-pane split (chat 60%, diff 40%). Dark theme
(bg-zinc-900). Desktop-first for v2.0.0.

Session picker (Home page): lists projects and sessions from the
shared DB. No CRUD — use BooChat's UI for that.

Dockerfile updated: builds web app in builder stage, copies dist
to runtime. index.ts registers fastifyStatic + SPA fallback route.

Tailwind v4, React 18, TypeScript strict. ~20 new files, ~370KB
built output. Functional developer tool UI, not polished consumer
product — Phase 7 (v2.0.3) handles polish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 03:04:52 +00:00
d2108b2f8d verification discipline rules + chat naming from assistant response
BOOCHAT.md + BOOCODER.md: 4 verification rules added to both —
verify against running container not source files, never count dist/,
run commands before claiming success, derive counts from commands.

auto_name.ts: chat titles now derived from the assistant's first
response only (user message dropped from naming input). System prompt
updated to "summarize the topic or outcome — do NOT copy the first
few words verbatim." Produces titles like "Fastify Route Setup"
instead of echoing the assistant's opening sentence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 02:52:49 +00:00
ce31577d1e v2.0.0-beta: write tools, pending-changes queue, inference loop, API routes
Phase 2 of v2.0. BooCoder is now a functional write-capable chatbot.

Write-path guard: resolveWritePath() uses resolve() (no realpath — files may
not exist for creates) + prefix-check + secret-file deny list (.env, *.pem,
id_rsa*, etc.). 23 unit tests cover traversal attacks.
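
A minimal sketch of the resolve + prefix-check + deny-list sequence described above. The function name, signature, and error messages are assumptions, and the deny list here is only the subset named in this message.

```typescript
import { basename, resolve, sep } from "node:path";

// Subset of the deny list named above; the real list is longer.
const SECRET_BASENAMES: RegExp[] = [/^\.env$/, /\.pem$/, /^id_rsa/];

// Hypothetical signature: returns the absolute path or throws.
export function resolveWritePathSketch(projectRoot: string, rawPath: string): string {
  const root = resolve(projectRoot);
  // resolve() collapses "." and ".." without touching the filesystem,
  // so it also works for files that do not exist yet.
  const candidate = resolve(root, rawPath);
  // Compare against root + separator so "/opt/proj-evil" is not treated
  // as being inside "/opt/proj".
  if (candidate !== root && !candidate.startsWith(root + sep)) {
    throw new Error(`path escapes project root: ${rawPath}`);
  }
  if (SECRET_BASENAMES.some((pattern) => pattern.test(basename(candidate)))) {
    throw new Error(`path matches secret-file deny list: ${rawPath}`);
  }
  return candidate;
}
```

Because nothing here follows symlinks, a sketch like this cannot catch a symlink that points outside the root; that is the trade-off of skipping realpath for not-yet-created files.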

Pending-changes service: queueEdit/Create/Delete → applyOne/All →
rejectOne/All → rewindOne. Edit diffs stored as JSON {old, new}. All writes
queue before touching disk; apply re-validates the path guard.

5 write tools: edit_file, create_file, delete_file, apply_pending, rewind.
Registered alongside 25 read-only tools from BooChat (30 total, alpha-sorted).
Write tools use a module-level inference context for sql+sessionId injection.

Inference loop via workspace dependency: apps/coder imports
createInferenceRunner, createBroker, ALL_TOOLS from @boocode/server (dist/).
apps/server gains declaration: true + exports map with typed subpath entries.
No code duplication — one inference engine shared by both apps.

API routes: POST /api/sessions/:id/messages (user msg → inference), POST stop,
GET/POST pending-changes CRUD (5 endpoints), WebSocket session streaming.

Dockerfile updated to build apps/server first (coder depends on its .d.ts).
Health endpoint reports tool count: {"ok":true,"db":true,"tools":30}.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:53:38 +00:00
006226cce5 v2.0.0-alpha: BooCoder foundation — container, schema, DB rename
Phase 1 of v2.0. BooCoder is live at port 9502 with a health endpoint.

- Database renamed: ALTER DATABASE boocode RENAME TO boochat (one-time).
  All services updated to connect to /boochat. Docker service name stays
  boocode_db (rename is internal to Postgres, not Docker).

- New apps/coder/ app skeleton: Fastify server with health endpoint,
  postgres connection, schema apply on boot. Mirrors apps/server pattern
  but minimal (no inference loop yet — Phase 2).

- Schema: pending_changes (operation queue before /apply), tasks (dispatch
  DAG with state machine), available_agents (startup-probed agent registry),
  human_inbox view (tasks WHERE state IN blocked/failed). All IF NOT EXISTS,
  idempotent on re-run. Same boochat database, different tables.

- Dockerfile: Node 20 bookworm-slim (glibc for future node-pty in Phase 5).
  Multi-stage build matching the existing boocode image pattern.

- docker-compose.yml: boocoder service on 100.114.205.53:9502, /opt:/opt:rw
  mount (write-capable, policy-gated at tool layer), depends on boocode_db.

- BOOCODER.md: container guidance declaring write-tool capability +
  pending-changes discipline.

All 4 services boot and pass health checks. 9 tables in the shared DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:20:29 +00:00
62d818af23 v2.0 implementation plan: 8 phases from foundation to production
Detailed execution plan for all v2.0 sub-versions:

Phase 1 (v2.0.0-alpha): container skeleton, DB rename, schema migration
Phase 2 (v2.0.0-beta): write tools + pending-changes service + fuzz tests
Phase 3 (v2.0.0): frontend diff pane + chat pane + Caddy routing
Phase 4 (v2.0.0-final): dispatcher worker + task queue + agent probing
Phase 5 (v2.0.1): ACP client + PTY fallback + worktree management
Phase 6 (v2.0.2): MCP server (6 tools, stdio, 10-question eval)
Phase 7 (v2.0.3): CLI + human inbox + cost tracking + observation hooks + Boomerang
Phase 8 (v2.0.x): path-guard fuzz, integration tests, docs, production deploy

~2050 LoC total. Phases 1-4 sequential, 5-7 parallelizable after 4.
Risk register covers path-guard bypass, ACP instability, worktree cleanup,
DB rename, MCP eval, Boomerang context leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:09:05 +00:00
531d39ace9 v2.0 proposal update: add AGENTS.md extensions, Boomerang pattern, observation hooks, follow-up batches
Additions from second pass of boocode_code_review.md:

- AGENTS.md extensions: output_schema, exit_expression, execution_strategy
  (qodo-ai/agents MIT), expert_model escape hatch (RA.Aid Apache-2.0)
- Subagent isolation via Boomerang Tasks pattern: orchestrator-only-dispatches,
  down-pass/up-pass context discipline, fresh session per subtask
- Observation hooks: 5-event taxonomy from budi (SessionStart, UserPromptSubmit,
  PostToolUse, SubagentStart, Stop) mapped to WS frames
- Follow-up batches table: PR-resolver, HMAC audit log, blind-validation gate,
  majority-vote ensembler, drift detection, anti-slop, globstar gate, Docker
  sandbox, multi-provider LLM
- Additional repo to clone: qodo-ai/agents for agent.toml schema reference

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 23:22:57 +00:00
f2974d6887 v2.0 proposal: BooCoder — write tools, pending changes, ACP dispatch, MCP server
Comprehensive roadmap for the v2.0 major version bump. Covers:
- Schema: pending_changes, tasks, available_agents tables + human_inbox view
- Path A: native write tools (edit_file, create_file, delete_file) queuing
  through pending_changes before /apply flushes to disk
- Path B: external agent dispatch via ACP (opencode, goose) or PTY fallback
  (claude, pi) with per-task git worktrees and automatic diff-on-completion
- BooCoder MCP server: 6 tools exposing task primitives over stdio
- Code lifts: agent-hub (Apache-2.0, task DAG), plandex (MIT, diff UX),
  ACP SDK (Apache-2.0, subprocess protocol), Paseo (AGPL, design-only)
- Sub-versions: v2.0.0 (Path A), v2.0.1 (Path B), v2.0.2 (MCP server),
  v2.0.3 (CLI + polish)
- Estimate: ~2200 LoC total

All v1.x dependencies shipped (v1.13 parts, v1.14 outer loop, v1.15 MCP
client, v1.16 codesight). v2.0 is unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 15:11:16 +00:00
29c7d051b6 v1.16.0-codesight-merge: 4 new codecontext tools — blast radius, hot files, routes, middleware
BooCode wrapper tools for the 4 new MCP tools added to the codecontext
sidecar (Go side committed separately at /opt/forks/codecontext).

- get_blast_radius: reverse-edge BFS — "what breaks if I change this?"
- get_hot_files: most-imported files by incoming edge count
- get_routes: Fastify/Express route extraction via tree-sitter AST
- get_middleware: middleware detection via import + registration patterns

Wrappers follow the existing codecontext pattern: Zod input → callCodecontext
→ ToolDef export. Registered in ALL_TOOLS (alpha-sorted). All 4 are read-only.

codecontext sidecar rebuilt from commit b19e646 with the 4 new Go handlers
(2130 lines, 29 tests). Reviewer fixes applied: defer RUnlock on Tier 2
handlers, extractObjectProperty delegates to extractStringValue for
template-literal route paths.

363/363 server tests passing. No schema changes, no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 05:19:52 +00:00
d27a977d59 v1.15.0-mcp-multi: multi-server MCP client + stdio transport + config file + tool globs
Generalizes the v1.14.1 single-server Context7 PoC into a multi-server MCP
client registry with per-server graceful degradation. JSON config at
/data/mcp.json (bind-mounted alongside AGENTS.md) matches opencode's
mcpServers schema shape. Config file missing = no MCP (opt-in by presence).

Two transports: Streamable HTTP (remote servers like Context7) and stdio
(local subprocess servers like codecontext). Stdio spawns a persistent child
via the SDK's StdioClientTransport; shutdown hook closes all transports.

Tool prefix generalized from context7_<name> to <serverName>_<toolName> with
a toolToServer reverse map for dispatch routing. AGENTS.md tools: field now
supports glob patterns (context7_*, !web_*) via matchToolGlob — last-match-
wins with ! deny prefix. Replaces exact-match .includes() in stream-phase.ts.
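
A sketch of last-match-wins glob matching with a ! deny prefix, assuming only `*` is supported as a wildcard and that a tool matching no pattern is excluded. The names below are hypothetical; the real matchToolGlob signature is not shown in this message.

```typescript
// Convert a simple glob ("*" only) into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Walk the patterns in order; the last pattern that matches the tool
// name decides whether it is allowed ("!" prefix means deny).
export function matchToolGlobSketch(patterns: string[], toolName: string): boolean {
  let allowed = false;
  for (const pattern of patterns) {
    const deny = pattern.startsWith("!");
    const glob = deny ? pattern.slice(1) : pattern;
    if (globToRegExp(glob).test(toolName)) {
      allowed = !deny;
    }
  }
  return allowed;
}
```

With last-match-wins, `["*", "!web_*"]` allows everything except web tools, while `["!web_*", "web_search"]` re-allows one specific web tool.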

refreshToolNames() in agents.ts rebuilds the DEFAULT_TOOLS snapshot after
appendMcpTools so agents without explicit tools: lists see MCP tools —
reviewer caught that the module-load-time snapshot would permanently exclude
late-registered tools.

Read-only invariant: readOnlyHint === false rejected at discovery. Result
size capped at 5MB. v1.14.1 env vars removed — superseded by config file.
Default data/mcp.json ships with Context7 disabled.

363/363 server tests passing. No schema changes, no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 04:08:42 +00:00
5692e99a5d v1.14.1-mcp-poc: single-server MCP client against Context7
Validates the MCP-client loop end-to-end against one real MCP server before
the full v1.15 port. New services/mcp-client.ts wraps @modelcontextprotocol/sdk
v1.29.0 with Streamable HTTP transport. On startup (when MCP_CONTEXT7_URL is
set), connects to Context7, discovers tools via tools/list, wraps each as a
ToolDef prefixed context7_<name>, and appends to ALL_TOOLS via appendMcpTools.

Read-only invariant guard rejects any tool with readOnlyHint: false. Tool
dispatch is transparent — executeToolCall routes MCP calls through the ToolDef
execute wrapper, which strips the prefix before calling the MCP server. Result
size capped at 5MB with truncation. Graceful degradation: server down at
startup → zero tools; server down mid-session → error result, model
self-corrects.
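
A sketch of a byte-based result cap, assuming "5MB" means 5 MiB and that truncation keeps the head of the result. The function name and return shape are hypothetical.

```typescript
const MAX_RESULT_BYTES = 5 * 1024 * 1024; // assumption: 5MB read as 5 MiB

export function capResultSketch(
  text: string,
  maxBytes: number = MAX_RESULT_BYTES,
): { text: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) {
    return { text, truncated: false };
  }
  // Decoding a byte slice can leave a U+FFFD replacement character at
  // the cut if a multi-byte character was split.
  const head = new TextDecoder().decode(bytes.subarray(0, maxBytes));
  return { text: head, truncated: true };
}
```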

Adversarial review caught that a Zod .default() on the URL config made MCP
always-on instead of opt-in — fixed by removing the default. MCP_CONTEXT7_URL
must be explicitly set to enable.

ALL_TOOLS changed from ReadonlyArray to mutable to support late-registration.
appendMcpTools re-sorts and rebuilds TOOLS_BY_NAME after append.

348/348 server tests passing (16 new mcp-client tests). No schema changes,
no frontend changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 21:58:09 +00:00
f4a97808ad v1.14.0-outer-loop: explicit while loop replaces inference recursion
Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 is a hard ceiling (4x the old effective limit imposed by the tool budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).
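
The resolution rule above, written out as a small function. The AgentConfig shape is an assumption; only the steps field and MAX_STEPS come from this message.

```typescript
const MAX_STEPS = 200;

// Assumed shape: only the optional steps field matters here.
interface AgentConfig {
  steps?: number;
}

export function effectiveCap(agent: AgentConfig): number {
  // An unset steps field falls back to Infinity, so MAX_STEPS wins.
  return Math.min(agent.steps ?? Infinity, MAX_STEPS);
}
```

Using `??` rather than `||` matters: an explicit `steps: 0` stays 0 instead of falling through to the ceiling.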

executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).
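
A simplified sketch of the explicit loop shape, assuming a synchronous step callback. It omits the doom-loop, budget, and abort exits as well as the steps: 0 text-only path, and every name other than the three action values is hypothetical.

```typescript
type ToolPhaseResult = { action: "continue" | "paused" | "synthesis_done" };
// "finished" stands in for a stream phase that ended without tool calls.
type StepOutcome = ToolPhaseResult | "finished";
type StopReason = "finished" | "paused" | "synthesis_done" | "step_cap";

export function runOuterLoopSketch(
  runStep: (stepNumber: number) => StepOutcome,
  cap: number,
): { steps: number; reason: StopReason } {
  let stepNumber = 0;
  while (stepNumber < cap) {
    const outcome = runStep(stepNumber);
    stepNumber += 1;
    if (outcome === "finished") {
      return { steps: stepNumber, reason: "finished" };
    }
    const action = outcome.action;
    if (action !== "continue") {
      // The caller, not the tool phase, decides to stop.
      return { steps: stepNumber, reason: action };
    }
  }
  return { steps: stepNumber, reason: "step_cap" };
}
```

Returning a result struct instead of recursing keeps the stack flat and puts every exit condition in one place.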

Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.

step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 20:29:21 +00:00
211e903620 v1.13.20-drop-legacy-cols: final phase of v1.13.0 strangler-fig
Removes the dual-write into messages.tool_calls / messages.tool_results JSON
columns and drops the columns. message_parts is now the only source of truth
for tool calls and tool results.

10 dual-write sites stripped (5 in tool-phase.ts, 2 in routes/skills.ts, 2 in
routes/messages.ts, 1 in routes/chats.ts fork-clone). The recon-driven grep
caught 2 sites beyond the original v1.13.2 roadmap inventory and an extra
fixture file (tool_cost_stats.test.ts) with a direct legacy-column INSERT.

messages_with_parts view rewritten to parts-only subselects (COALESCE
fallbacks gone). View runs via CREATE OR REPLACE so it lands before the
column DROPs in startup DDL — Postgres rejects column-drop on view-referenced
cols. v1.12.1 cleanup DO block (DROP CONSTRAINT messages_status_check /
messages_role_check) removed; those one-shots have done their work.

Adversarial review caught a runtime bug the green test suite missed: the
discard_stale endpoint (chats.ts) had a RETURNING ... tool_calls, tool_results
clause that would have crashed on every 60s-no-token-activity recovery in
production. Fixed by switching to two-step UPDATE returning id, then SELECT
from messages_with_parts so parts-synthesized fields keep flowing on the wire.

Message API type retains tool_calls? / tool_results? — the view synthesizes
those keys from parts so the wire shape is unchanged; frontend reads need no
update. This overrides the original v1.13.2 plan and is captured in the
openspec proposal.

339/339 server tests passing (including 7 DB-integration tests that applied
the schema migration to a live DB and ran the parts-only view end-to-end).
tsc + web build clean.

Pairs with v1.13.0-ai-sdk-v6 (introduced the dual-write) and v1.13.1-B (moved
the read path to messages_with_parts). Umbrella v1.13 tag ships on this same
commit, marking the strangler-fig closed.

CLAUDE.md picks up Sam's pre-existing edits documenting tag-naming and
CHANGELOG conventions — both already in use by v1.13.19 / v1.13.20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:03:51 +00:00
ad45b28250 v1.13.19-html-artifact-panes: pane-based artifact viewer with on-request HTML
Every assistant message gets an "Open in pane" affordance that opens the
message in the workspace splitter — Markdown pane (Copy + Download .md) by
default; HTML pane (Download .html only) when the model emits a self-contained
<!DOCTYPE html> or fenced ```html artifact. BOOCHAT.md rule keeps Markdown
default at every length; HTML opt-in on explicit user request.

Backend: services/artifacts.ts (slug derivation + write helpers with
symlink-escape guard via realpath-after-mkdir), routes/artifacts.ts (POST
download + GET stream with nosniff + CSP sandbox defense-in-depth), HTML
detection in finalizeCompletion writing a new message_parts.kind='html_artifact'
row (schema CHECK extended via v1.13.13 pattern), graceful 1MB cap via the
pure decideHtmlArtifactWrite helper. PartKind union extended.

Frontend: MarkdownRenderer.tsx extracted from MessageBubble's inline
MarkdownBody for reuse; MarkdownArtifactPane.tsx + HtmlArtifactPane.tsx with
loading/error states; pane state is reference-only ({chat_id, message_id,
title}) — content fetched on mount to keep workspace_panes jsonb small and
avoid 1MB blobs riding session_workspace_updated frames. iframe sandbox
locked to allow-scripts allow-clipboard-write allow-downloads with no
allow-same-origin, srcDoc not src. openInPane discriminates 404 (expected
fallback) from real errors (toast + bail). PanelRightOpen icon button with
mobile 44px tap-target.

31 new server unit tests including a real-symlink filesystem case; 332/332
server tests passing, tsc clean both sides, pnpm -C apps/web build green.
Smoke deferred to first deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:43:13 +00:00
1a889dcde3 v1.13.18-codecontext-file-path: resolve file_path against project root in codecontext wrappers
Four codecontext sidecar wrappers — get_file_analysis (required
file_path), get_symbol_info, get_dependencies, and get_semantic_neighborhoods
(optional) — forwarded file_path to the HTTP sidecar unchanged. The
sidecar's internal file index is keyed on absolute paths, so any
relative path from the model returned "File not found in graph".
Three back-to-back failures observed in one chat on 2026-05-22
17:56 UTC, ~48 s of wasted tool budget.

## Resolver

Add resolveProjectPath(projectRoot, rawPath) in codecontext_client.ts:
trim check → absolute/relative branch (both go through resolve() so
dot-segments normalise) → realpath with ENOENT fallthrough → escape
check using the realpathed value. Error shape mirrors the existing
target_dir escape error byte-for-byte; only the field name differs.

Wired into callCodecontext at the args-spread site, guarded on
file_path presence + non-empty. All four wrappers benefit from one
call site; wrappers without file_path (overview, framework, watch,
search) are unaffected.

## Schema trim

.trim() added to all four file_path Zod schemas:

  get_file_analysis:                  z.string().trim().min(1)
  get_symbol_info:                    z.string().trim().optional()
  get_dependencies:                   z.string().trim().optional()
  get_semantic_neighborhoods:         z.string().trim().optional()

Absorbs trailing newlines / whitespace from model output before the
resolver sees the value.

## Adversarial review fixes

Adversarial pass surfaced two P2 findings:

1. Absolute path with `..` resolving outside the project root (e.g.
   `<projectRoot>/../etc/passwd`) that ENOENTs at realpath would slip
   through the literal prefix-check: the raw string starts with
   `<projectRoot>/`. Fix: resolve() the absolute branch's candidate
   too, so dot-segments normalise before the prefix check.

2. No symlink-escape test coverage. Realpath's stated purpose
   (catching in-project symlinks pointing outside the project) was
   never tested. Added: create a tmpdir outside projectRoot,
   symlink projectRoot/evil-link → outside file, assert rejection.
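
Finding 1 can be illustrated with a small before/after sketch, assuming POSIX-style paths. Both function names are hypothetical, and the realpath step of the real resolver is left out.

```typescript
import { resolve, sep } from "node:path";

// Pre-fix shape: prefix check on the raw string, dot-segments intact.
export function rawPrefixCheck(projectRoot: string, rawPath: string): boolean {
  return rawPath.startsWith(projectRoot + "/");
}

// Post-fix shape: normalise with resolve() first, then prefix-check.
export function normalisedPrefixCheck(projectRoot: string, rawPath: string): boolean {
  const root = resolve(projectRoot);
  const candidate = resolve(root, rawPath);
  return candidate === root || candidate.startsWith(root + sep);
}
```

For `/opt/proj/../etc/passwd` against root `/opt/proj`, the raw check passes because the string literally starts with the root, while the normalised check sees `/opt/etc/passwd` and rejects it.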

## Tests

codecontext_client.test.ts: 19 tests (10 baseline + 9 new file_path
resolution cases). Cases cover: relative→absolute, absolute-inside,
relative-escape, absolute-outside, ENOENT-fallthrough, empty-string,
wrapper-without-file_path, absolute-with-`..`-ENOENT,
symlink-leaving-root.

codecontext_tools.test.ts: one assertion updated to expect the
resolved-absolute file_path on the wire (previously asserted the raw
relative path passed through, which is exactly the bug being fixed).

Full suite: 301 passed, 7 skipped.

## Affected / unaffected

- get_codebase_overview, get_framework_analysis, watch_changes,
  search_symbols: no file_path arg → resolver guard skips them. No
  behavior change.
- get_semantic_neighborhoods IS in SYNTHESIS_TOOLS — previously-failing
  relative-path calls will now successfully synthesize. Desirable, not
  a regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 21:54:16 +00:00
b52c5df705 v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root
When the agent needed context from another repo, pathGuard rejected every read
with no recovery path. This batch adds a reactive request_read_access flow:
pathGuard's error now hints at the tool, the model emits a structured request,
the inference loop pauses (same mechanism as ask_user_input), the user picks
Allow/Deny via inline chips, and subsequent reads under the granted root succeed
for the rest of the session.

Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[]
(idempotent ADD COLUMN IF NOT EXISTS).

Grant unit (design D1): nearest registered projects.path ancestor →
nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml)
under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks
ancestors with a per-iteration whitelist invariant check so symlinked
input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask).
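
A simplified sketch of the D1 resolution order, assuming POSIX-style paths and a caller-supplied isRepoShaped predicate in place of real filesystem probes. It applies the whitelist check to both lookups and skips realpath entirely; all names are hypothetical.

```typescript
import { dirname, sep } from "node:path";

function isUnder(root: string, p: string): boolean {
  return p === root || p.startsWith(root + sep);
}

// Collect the start path and its ancestors, nearest first, stopping as
// soon as one falls outside every whitelisted root.
function ancestorsWithinWhitelist(start: string, whitelist: string[]): string[] {
  const chain: string[] = [];
  let dir = start;
  while (whitelist.some((root) => isUnder(root, dir))) {
    chain.push(dir);
    const parent = dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return chain;
}

export function resolveGrantRootSketch(
  target: string,
  projectPaths: string[],
  whitelist: string[],
  isRepoShaped: (dir: string) => boolean,
): string | null {
  const chain = ancestorsWithinWhitelist(target, whitelist);
  // 1. Nearest registered project ancestor.
  const project = chain.find((dir) => projectPaths.includes(dir));
  if (project !== undefined) return project;
  // 2. Nearest repo-shaped ancestor; otherwise refuse.
  return chain.find((dir) => isRepoShaped(dir)) ?? null;
}
```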

Path-guard: optional extraRoots arg threaded from session.allowed_read_paths
through executeToolCall to view_file / list_dir / grep / find_files. The
ToolDef.execute signature gets an optional third param; non-FS tools
ignore it. view_file re-anchors the secret-guard check on basename(real)
whenever a relative path starts with "../" so .env / id_rsa* etc. still
deny across grant roots.

Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input.
On 'allow' it re-resolves the grant root (state may have changed since
prompt — auto-falls to denial reason text on failure, not 500), array_appends
to sessions.allowed_read_paths with in-memory dedup, then publishes
tool_result + session_updated frames and enqueues the next assistant turn.

PATCH /api/sessions/:id allowed_read_paths supports revocation only. Zod
refines absolute + no traversal markers; runtime findUnauthorizedAdditions
guard rejects any entry not already present in the row, so a malicious
curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of
bypassing the grant flow (Sam's compliance-review action item).
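
A sketch of the revocation-only guard, assuming it reduces to a set-difference check between the stored array and the requested one. The function name carries a Sketch suffix because the real findUnauthorizedAdditions signature is not shown here.

```typescript
// Returns every requested entry that is not already stored on the row.
// A non-empty result means the PATCH tries to add a path and should 400.
export function findUnauthorizedAdditionsSketch(
  current: string[],
  requested: string[],
): string[] {
  const existing = new Set(current);
  return requested.filter((path) => !existing.has(path));
}
```

Under this rule a PATCH can only shorten the list: `["/opt/a"]` against stored `["/opt/a", "/opt/b"]` passes, while `["/etc"]` is reported as an unauthorized addition.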

Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny)
and answered (granted/denied summary with the resolved root) variants;
MessageList.flatten/group special-cases the tool name; SettingsPane adds a
per-session grants list with per-row revoke that PATCHes the shortened
array.

Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including
explicit cases for symlink escape mid-walk, walk-bound termination at
whitelist root, /etc bypass attempt via PATCH, and nearest-project
disambiguation. 292 total server tests green.

Pairs with v1.13.16-xml-parser — the model now self-recovers from both
a wrong tool name AND from a refused path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 21:45:52 +00:00
2e1a81de72 v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints
Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth
investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6
turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted
as an Architect-style agent because Claude Code documentation in its
pre-training corpus uses that shape).

## Parser extension

xml-parser.ts now recognizes BOTH XML tool-call flavors:

  - Qwen/Hermes:   <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call>
  - Anthropic:     <invoke name="NAME"><parameter name="K">V</parameter></invoke>

Both route through the same synthetic-id xml_call_${idx} ToolCall path.
extractToolCallBlocks() and partialXmlOpenerStart() handle both openers
(<tool_call> and <invoke...) so partial buffers don't get prematurely
flushed during streaming.

The existing Qwen parser was tightened to tolerate whitespace around `=`
(<function = name>, <parameter = key>...) so a stray space doesn't get
absorbed into the function name. Name capture is non-whitespace,
non-`>`.
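
A regex-based sketch of recognising both flavors on a complete buffer, assuming well-formed blocks and trimmed parameter values. It is not the xml-parser.ts implementation: streaming, partial openers, and malformed input are out of scope, and every name except the xml_call_ id prefix is hypothetical.

```typescript
export interface ParsedToolCall {
  id: string;
  name: string;
  args: Record<string, string>;
}

// Qwen/Hermes flavor, tolerating whitespace around "=".
const QWEN_CALL = /<tool_call>\s*<function\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/function>\s*<\/tool_call>/g;
const QWEN_PARAM = /<parameter\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/parameter>/g;
// Anthropic flavor.
const INVOKE_CALL = /<invoke\s+name="([^"]+)"\s*>([\s\S]*?)<\/invoke>/g;
const INVOKE_PARAM = /<parameter\s+name="([^"]+)"\s*>([\s\S]*?)<\/parameter>/g;

function collectParams(body: string, paramPattern: RegExp): Record<string, string> {
  const args: Record<string, string> = {};
  for (const match of body.matchAll(paramPattern)) {
    args[match[1]] = match[2].trim();
  }
  return args;
}

export function parseXmlToolCallsSketch(text: string): ParsedToolCall[] {
  const found: { index: number; name: string; args: Record<string, string> }[] = [];
  for (const m of text.matchAll(QWEN_CALL)) {
    found.push({ index: m.index ?? 0, name: m[1], args: collectParams(m[2], QWEN_PARAM) });
  }
  for (const m of text.matchAll(INVOKE_CALL)) {
    found.push({ index: m.index ?? 0, name: m[1], args: collectParams(m[2], INVOKE_PARAM) });
  }
  // Restore document order, then assign synthetic ids.
  found.sort((a, b) => a.index - b.index);
  return found.map((call, idx) => ({ id: `xml_call_${idx}`, name: call.name, args: call.args }));
}
```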

## Unknown-tool recovery hint

New tool-suggestions.ts exports levenshtein() + suggestToolName() +
formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a
toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the
model now includes a "Did you mean: X?" hint based on Levenshtein
distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME).
Targets the qwen3.6 drift to read_file → suggest view_file. Applies to
all unknown tool names, not just <invoke>-derived ones — at the
dispatch layer we no longer know which format produced the call, and
the extra signal is harmless for Qwen-derived calls.
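
A sketch of the hint logic, assuming a full-name substring test plus a distance threshold of 3. The Sketch-suffixed names and the exact error wording are hypothetical; only the "Did you mean: X?" phrase comes from this message.

```typescript
// Classic single-row Levenshtein distance (insert, delete, substitute).
export function levenshtein(a: string, b: string): number {
  const row: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Returns the closest known tool name, or null when nothing is close.
export function suggestToolNameSketch(
  requested: string,
  known: string[],
  maxDistance = 3,
): string | null {
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const candidate of known) {
    const isSubstring = candidate.includes(requested) || requested.includes(candidate);
    const distance = isSubstring ? 0 : levenshtein(requested, candidate);
    if (distance <= maxDistance && distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

export function formatUnknownToolErrorSketch(name: string, known: string[]): string {
  const hint = suggestToolNameSketch(name, known);
  return hint === null
    ? `Unknown tool: ${name}`
    : `Unknown tool: ${name}. Did you mean: ${hint}?`;
}
```

Under these exact rules read_file is four edits from view_file and neither name contains the other, so this sketch returns no hint for that pair; the shipped matcher presumably applies a looser substring rule.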

## Test coverage

xml-parser.test.ts: 46 tests, all green. Covers both parsers
(well-formed, malformed, multi-parameter, nested-content), the
partial-opener detector for both flavors, the unified extraction
helper, and the unknown-tool error formatter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:59:25 +00:00
61308cf17c v1.13.15-codecontext-synth: remove "tag pending" qualifier in roadmap
Trivial follow-up after the v1.13.15-codecontext-synth tag landed.
Retrospective bullet now describes the shipped state; cleanup-order
tracker marks the batch done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:09:39 +00:00
3992a9fcb7 v1.13.15-codecontext-synth: forced second-inference synthesis for codecontext overview tools
After a codecontext overview-class tool call lands (get_codebase_overview,
get_framework_analysis, get_semantic_neighborhoods), the pipeline runs a
second inference pass that replaces the recursive runAssistantTurn. The
synth pass auto-fetches the top-N source files referenced in the
codecontext output plus project docs (BOOCHAT.md, AGENTS.md,
*roadmap*.md, CONTEXT.md), applies a 32k-token budget with explicit
drop-priority, and streams a structured response that grounds the model
in real load-bearing code rather than relying on the codecontext summary
alone. Smoke #1 (default) and #2 (Architect) both cite the correct
inference/turn.ts + tool-phase.ts + stream-phase.ts files; smoke #6
(fault injection) verifies the fall-through path marks the synth message
status='failed' and yields cleanly to the recursive turn.

## Truncation-aware extraction

codecontext's wrapper inline-truncates results at 32k chars. Without the
expansion step, the top-N file selection only saw the alphabetical head
of the codebase (apps/booterm/dist/*) and auto-fetched the wrong sources.
The pipeline now calls in-process readTruncation(outputPath) before
extracting referenced files, so top-N selection sees the full 80k+ char
output. The 32k truncated head still ships to the synth model — the
expansion is reference-extraction-only, preserving the token-budget
contract. Graceful degradation on readTruncation null/throw: log warn,
fall back to the truncated head.

## Schema deviation from dispatch

The dispatch claimed no schema migration was needed for the new
'synthesis' part kind. Reality: message_parts.kind has an explicit
CHECK constraint (schema.sql:54) that would reject the new value. Added
a DROP CONSTRAINT IF EXISTS + DO $$ pg_constraint idempotency-guarded
re-add matching the CLAUDE.md migration pattern. The inline CREATE TABLE
constraint also updated so fresh installs land with the extended enum.

## User-abort marks synth-message failed

Deviation from review-time spec ("user-abort path does NOT mark the
message failed"). The outer abort handler in error-handler.ts operates
on the parent turn's assistantMessageId, not the new synth row that
runSynthesisPass created. Without explicit marking, the synth row would
sit in status='streaming' until the 5-min stale-streaming sweeper
(v1.13.1-cleanup-bundle), tripping the frontend's 60s no-token-activity
banner in the meantime — exactly the UX bug class the v1.13.1 sweeper
was added to handle. Marking failed on every catch path (including
user-abort) closes the gap. Cost: one extra DB write + one publish on
the rare user-abort-during-synth path.

## Race-safe synth-tool capture

tool-phase.ts uses synthEntries: Array<{tc, output, error?}> with
per-callback push under Promise.all. find() picks the first non-error
entry by call order (toolCalls array index), not by completion order,
so the result does not depend on which promise settles first. Multiple
synth-tools in one batch are uncommon but handled deterministically.
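
One way the call-order selection could look, as a sketch only. `pickSynthEntry` and `runSynthTools` are hypothetical names, and the `ToolCall` shape is reduced to the two fields needed here.

```typescript
interface ToolCall { id: string; name: string; }
interface SynthEntry { tc: ToolCall; output: string; error?: string; }

// Pick the first successful entry by position in toolCalls, ignoring the
// order in which entries were pushed (push order follows completion order).
function pickSynthEntry(toolCalls: ToolCall[], entries: SynthEntry[]): SynthEntry | undefined {
  for (const tc of toolCalls) {
    const hit = entries.find((e) => e.tc.id === tc.id && e.error === undefined);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Run every call concurrently; each callback pushes its own entry.
async function runSynthTools(
  toolCalls: ToolCall[],
  execute: (tc: ToolCall) => Promise<string>,
): Promise<SynthEntry | undefined> {
  const synthEntries: SynthEntry[] = [];
  await Promise.all(
    toolCalls.map(async (tc) => {
      try {
        synthEntries.push({ tc, output: await execute(tc) });
      } catch (err) {
        synthEntries.push({ tc, output: "", error: String(err) });
      }
    }),
  );
  return pickSynthEntry(toolCalls, synthEntries);
}
```

Keeping the selection in a pure function makes the deterministic part easy to unit-test without timing-dependent fixtures.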

## Roadmap rebase

Updated boocode_roadmap.md retrospective section + cleanup-order tracker
+ schema-changes summary to use the new vMAJOR.MINOR.PATCH-slug tag
names per the 2026-05-22 retag (CHANGELOG.md is the canonical record).
v1.13.15 listed as "this batch, tag pending"; a one-line follow-up
commit will remove that qualifier after the tag lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:08:47 +00:00
0fa46cd06c v1.13.12: skills audit + token-tracking fix + codecontext + cap50 + UI cleanups
Multi-topic batch. The big-ticket item is the skills audit; the rest are
smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/
(the boocode-repo-local skill library — see docker-compose change below).
Audited via 5 parallel Claude Code agent-teams running the
mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge
Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the
~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched),
11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-
natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule
(verification-before-completion). Each surviving skill had its description
refined to fix specific trigger gaps surfaced by the protocol — 4
real-bug findings landed (dead refs, stale tags, broken sub-file
references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/
audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs
recipes" sections — future workflow rules go to those files (100%
present), recipes stay in data/skills/ (~6% invoke rate in multi-turn
per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns
timestamp columns as JS Date objects. Every message_complete /
session_updated / chat_updated frame was failing the v1.13.11 Zod gate
and being silently dropped. Symptoms: token tracking blank in the UI
(no usage frames landed); the 60s no-token-activity timer tripped the
stale-stream banner because the frontend's local message state never
saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v,
z.string().min(1)) applied to the IsoTimestamp primitive. Centralized,
no publisher changes, works identically server + web (the parity test
still passes).
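
The preprocess step on its own, written without Zod so it runs standalone. `normalizeTimestamp` and `parseIsoTimestamp` are hypothetical helpers that mirror the `z.preprocess(...)` plus `z.string().min(1)` pair described above; they are not part of ws-frames.ts.

```typescript
// Convert Date objects (as returned by the postgres driver) to ISO strings;
// pass every other value through untouched.
function normalizeTimestamp(value: unknown): unknown {
  return value instanceof Date ? value.toISOString() : value;
}

// Equivalent of the string().min(1) check applied after normalization.
function parseIsoTimestamp(value: unknown): string {
  const normalized = normalizeTimestamp(value);
  if (typeof normalized !== "string" || normalized.length < 1) {
    throw new Error("invalid timestamp");
  }
  return normalized;
}
```

Because the conversion happens inside the primitive, a publisher can hand over either a `Date` or an already-serialized string and get the same wire value.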

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the
codecontext/.codecontextignore.template into any project's root on the
first call to that project if no .codecontextignore exists. One file
written per project, idempotent (in-memory Set guard + access-check),
silent fallback on read-only project. Stops the upstream empty-source-
file parser crash on foreign projects' node_modules — previously
required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT
bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write
tools landed yet). Real recon sessions were hitting 30 with ~3 turns
wasted on codecontext parse failures; legitimate need was ~27, and
Architect-class system overviews want deeper recon. Headroom of 20
absorbs failure-retry turns without changing the safety floor — the
doom-loop guard (3 identical calls → abort) catches the actual
failure mode this cap was guarding against.

v1.14 (Phase C outer agent loop) will supersede this via per-agent
agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now
  has three buttons: edit (pop back into ChatInput via sendToChat
  event), force-send (was the dropdown's only useful action), and
  cancel. Default behavior (send when streaming completes) needs no
  UI — it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx).
  Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation
  patterns so the vendored skill library + agent registry become
  git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override
  mount. Skills now live in the boocode repo at data/skills/,
  auditable per-batch. The host-level /opt/skills/ is preserved
  untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext
  was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper +
  message_parts table + tool_cost_stats view + DB-integration test
  pattern + host-side smoke endpoint quirk. (Pre-existing in working
  tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:58:30 +00:00
bc376c878d v1.13.11-b: convert raw broker.publish call sites to typed publishFrame
Second half of the WebSocket-frame-typing batch. Phase A (8b568b3)
landed the schemas + frontend receive validation + publishFrame /
publishUserFrame wrappers. This commit converts the existing publish
call sites so every server-emitted WS frame now goes through Zod
validation at the broker boundary.

Conversion strategy: change once in the inference / skills adapters in
index.ts (so ctx.publish / ctx.publishUser propagate to publishFrame /
publishUserFrame for ALL ~50 inference + auto_name call sites in one
move), then bulk-replace the ~30 direct broker.publish* call sites in
the routes + compaction.

Files touched:
- index.ts: inference + skills route adapters now call publishFrame /
  publishUserFrame internally; raw broker.publishUser('default', ...)
  call in the stale-row sweeper also converted.
- routes/projects.ts (7 sites), routes/chats.ts (9 sites),
  routes/sessions.ts (8 sites): all broker.publishUser(...) → broker.
  publishUserFrame(...).
- services/compaction.ts (3 sites): 2 publishUser, 1 publish.

Real protocol drift surfaced by Zod, fixed in the same commit:

  services/compaction.ts:442 was publishing chat_status with status:
  'working' — the v1.12.1 chat_status widening (CLAUDE.md:55) dropped
  this enum value in favor of streaming|tool_running|waiting_for_input|
  idle|error. The compaction.ts site was missed during v1.12.1; the
  frame had been published with an unknown enum value ever since (the
  frontend useChatStatus quietly ignored it). Corrected to 'streaming'
  — compaction's LLM call has the same dot-state semantic as an
  inference turn. This is exactly the class of bug v1.13.11 exists to
  catch.

Schema relaxation: OpaqueObject (the bag type for nested entities like
Project / Chat / Session / WorkspacePane embedded in WS frames) was
z.object({}).passthrough(), which Zod outputs as {} & {[k:string]:
unknown}. The strict-typed entities don't have index signatures so
TypeScript rejected them at publishFrame call sites. Relaxed to
z.unknown() — runtime validation still accepts the value, dev-time
narrowing happens via the existing hand-maintained types. Trade-off:
frame-level drift detection stays sharp; nested-payload validation
goes to follow-up work as the brief intended.

Schema audit:
  grep -rn "broker\.publish(\|broker\.publishUser(" apps/server/src \
    --include="*.ts" | grep -v "broker.ts\|__tests__\|.bak"
  → 0 results. Every server publish goes through publishFrame /
  publishUserFrame. The remaining ctx.publish / ctx.publishUser sites
  in services/inference/* + services/auto_name.ts route through the
  index.ts adapter, which calls publishFrame internally.

Tests: 219/219 pass (unchanged from v1.13.11-a; the Phase B conversion
is mechanical and doesn't add test cases).

Smoke: clean container boot, no ws-frame-validation-failed entries
under normal traffic. Sidebar list refresh + agent picker open both
pass through useUserEvents without drops.

~70 LoC across 7 files. v1.13.11 closed.
2026-05-22 15:54:00 +00:00
8b568b36d3 v1.13.11-a: WS frame schemas + frontend receive validation
First half of the WebSocket-frame-typing batch (split per recon — total
scope was ~535 LoC, larger than the roadmap's ~300 estimate, so the
server-side publish-site conversion lands separately in v1.13.11-b).

Phase A scope:

(1) apps/server/src/types/ws-frames.ts (NEW) — Zod schemas for all 27
wire-format WS frame types. Discriminated union (WsFrameSchema) plus
KNOWN_FRAME_TYPES const for diagnostic lookup. UUIDs are z.string().
uuid(); model-emitted tool_call_id stays z.string().min(1) since OpenAI-
compatible APIs emit "call_<random>" not UUID. Per-kind payload narrowing
(tool args, message_parts payloads) intentionally stays z.unknown() —
frame-level drift detection is the goal; deep payload validation is
follow-up work.

(2) apps/web/src/api/ws-frames.ts (NEW) — byte-identical mirror of the
authoritative server file. No path alias from web→server in the existing
tsconfig setup; sync-by-hand was chosen over a new packages/shared/ dir.
A ws-frames.test.ts test asserts the two files match.

(3) apps/server/src/services/broker.ts — adds publishFrame() and
publishUserFrame() methods to the Broker interface. Both validate via
WsFrameSchema and fail-closed: log + drop on invalid. createBroker now
accepts an optional FastifyBaseLogger so validation failures land in
the pino stream (with console.error fallback for unit tests). The
existing publish() / publishUser() raw methods stay legal — they get
converted to the typed variants in v1.13.11-b.

(4) apps/web/src/hooks/useSessionStream.ts + useUserEvents.ts — wrap
ws.onmessage with WsFrameSchema.safeParse. Fail-closed: invalid frames
log + return without dispatching. Hand-maintained WsFrame and
SessionEvent types stay in place; one cast bridges Zod-typed → narrowed
shape (Zod uses OpaqueObject for nested Message[] / WorkspacePane[] etc.,
which are dev-time-narrowed via the existing hand-maintained types).

(5) apps/web/package.json — adds zod ^3.23.8 as a direct dep. Was a
transitive dep via ai-sdk / postgres; promotion makes the import legal.

(6) Tests: 15 new in ws-frames.test.ts covering happy-path per major
frame type, drift-catchers (unknown type, invalid enum, non-UUID, negative
tokens), parts-authoritative read variants, the mirror-file diff check,
and four broker fail-closed scenarios. 219/219 server tests pass (was
204; +15 new).
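
The fail-closed publish path from item (3) could be sketched as below. This uses a hand-rolled stand-in with a Zod-like `safeParse` so the block runs without dependencies; `FrameSchema`, `chatStatusSchema`, and `makePublishFrame` are hypothetical names, and only the `chat_status` frame is modeled.

```typescript
type ParseResult<T> = { success: true; data: T } | { success: false; error: string };
interface FrameSchema<T> { safeParse(input: unknown): ParseResult<T>; }

const CHAT_STATUSES = ["streaming", "tool_running", "waiting_for_input", "idle", "error"] as const;
interface ChatStatusFrame { type: "chat_status"; status: (typeof CHAT_STATUSES)[number]; }

// Stand-in for a real Zod schema; validates one frame type only.
const chatStatusSchema: FrameSchema<ChatStatusFrame> = {
  safeParse(input: unknown): ParseResult<ChatStatusFrame> {
    if (typeof input !== "object" || input === null) {
      return { success: false, error: "frame is not an object" };
    }
    const f = input as { type?: unknown; status?: unknown };
    if (f.type !== "chat_status") return { success: false, error: "unknown frame type" };
    const status = CHAT_STATUSES.find((s) => s === f.status);
    if (status === undefined) return { success: false, error: "invalid status" };
    return { success: true, data: { type: "chat_status", status } };
  },
};

// Wrap a raw publish function: validate first, log and drop on failure.
function makePublishFrame<T>(
  schema: FrameSchema<T>,
  rawPublish: (channel: string, frame: T) => void,
  logError: (msg: string, detail: string) => void,
): (channel: string, frame: unknown) => boolean {
  return (channel, frame) => {
    const parsed = schema.safeParse(frame);
    if (parsed.success === true) {
      rawPublish(channel, parsed.data);
      return true;
    }
    logError("ws-frame-validation-failed", parsed.error);
    return false;
  };
}
```

A frame with an out-of-enum status such as `'working'` is logged and never reaches subscribers, which is the drift-detection behavior the batch is after.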

Two recon corrections to the dispatch brief, both flagged before
implementation:

- No 'parts_appended' frame exists. The brief assumed one; the codebase
  reads parts via the messages_with_parts view after message_complete
  triggers a refetch. MessagePartSchema is therefore unused this batch.
- No 'tool_running' frame exists. The brief listed it as standalone; it
  is in fact a 'chat_status' variant ({ status: 'tool_running' }), already
  covered by ChatStatusFrame.

Smoke: clean container boot, no validation errors in the server log. Real
production frames pass validation (the schemas were derived from the
existing hand-maintained types in api/types.ts and sessionEvents.ts).

v1.13.11-b will follow immediately: convert all ~85 raw broker.publish /
ctx.publish call sites across 11 server files to publishFrame /
publishUserFrame. Mechanical edit; the wiring done here means the diff
in -b is just the call-site swaps.

~310 LoC across 9 files (4 new + 5 modified).
2026-05-22 15:48:32 +00:00
34cbecf975 v1.13.15-tools: tiered tool loading via BOOCODE_TOOLS env var
Pattern lift from eyaltoledano/claude-task-master (MIT + Commons Clause
— pattern only, no code lift). Adds BOOCODE_TOOLS env var with three
tiers:

- core (4 tools): view_file, list_dir, grep, find_files. ~2k token
  schema cost.
- standard (15 tools): core + web_search, web_fetch, git_status, all
  8 codecontext_* tools. ~10k token schema cost.
- all (default; current behavior): every tool in ALL_TOOLS (20). ~21k
  token schema cost.

The env var is a CEILING — narrows agent whitelists, never expands.
Default behavior unchanged when var is unset. resolveToolTier is
case-insensitive and falls back to 'all' on unknown values.

CORE_TOOL_NAMES + STANDARD_TOOL_NAMES validated at module load against
TOOLS_BY_NAME via two top-level for-loops that throw on the first
missing name. Module fails to import if a tier references a tool that
doesn't exist in the registry — catches typos and stale tier
definitions at boot rather than silently filtering valid tools out of
agent whitelists.

Wiring: agents.ts parseAgentBlock now reads BOOCODE_TOOLS from
process.env per parse, intersects with the agent's declared frontmatter
tools (or DEFAULT_TOOLS when frontmatter omits the field). Per-parse
read is fine — agents are re-parsed on the existing 60s cache TTL.
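
A compact sketch of the ceiling behavior, under stated assumptions: the registry below is an illustrative subset built from tool names that appear in these commit messages, the standard tier lists only two of the eight codecontext tools, and the raw env value is passed in as an argument rather than read from process.env. `applyTierCeiling` is a hypothetical helper name.

```typescript
type ToolTier = "core" | "standard" | "all";

// Illustrative registry subset; the real TOOLS_BY_NAME holds full tool definitions.
const TOOLS_BY_NAME: Record<string, { name: string }> = {};
for (const name of [
  "find_files", "get_codebase_overview", "git_status", "grep", "list_dir",
  "search_symbols", "view_file", "watch_changes", "web_fetch", "web_search",
]) {
  TOOLS_BY_NAME[name] = { name };
}

const CORE_TOOL_NAMES = ["view_file", "list_dir", "grep", "find_files"];
const STANDARD_TOOL_NAMES = [
  ...CORE_TOOL_NAMES,
  "web_search", "web_fetch", "git_status",
  "get_codebase_overview", "search_symbols", // partial list, for illustration
];

// Fail at module load if a tier names a tool the registry does not have.
for (const name of CORE_TOOL_NAMES) {
  if (!(name in TOOLS_BY_NAME)) throw new Error(`core tier references unknown tool: ${name}`);
}
for (const name of STANDARD_TOOL_NAMES) {
  if (!(name in TOOLS_BY_NAME)) throw new Error(`standard tier references unknown tool: ${name}`);
}

// Case-insensitive; unset or unknown values fall back to 'all'.
function resolveToolTier(raw: string | undefined): ToolTier {
  const v = (raw ?? "").trim().toLowerCase();
  return v === "core" || v === "standard" ? v : "all";
}

// The tier is a ceiling: it can narrow a declared whitelist, never expand it.
function applyTierCeiling(declared: string[], tier: ToolTier): string[] {
  if (tier === "all") return declared;
  const allowed = tier === "core" ? CORE_TOOL_NAMES : STANDARD_TOOL_NAMES;
  return declared.filter((name) => allowed.indexOf(name) !== -1);
}
```

Since the result is always a filter over the declared list, a tool absent from an agent's frontmatter can never appear because of the env var.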

Tests: tools.test.ts grows from 1 to 10 tests. Covers resolveToolTier
across tiers/case/unknown values + the CORE-subset-of-STANDARD invariant
+ TOOLS_BY_NAME existence for both tier sets. 204/204 pass (was 195;
+9 new).

Deviation from the brief: the codecontext tools in the actual registry
have NO codecontext_* prefix (the brief's STANDARD list assumed it).
Used the actual names (get_codebase_overview, search_symbols, etc.).
Module-load validation would have failed boot with the prefixed names.

Smoke: with BOOCODE_TOOLS unset, agents return their full 12-tool
whitelists. With BOOCODE_TOOLS=core in .env + container restart, the
same agents narrow to 4 tools (find_files, grep, list_dir, view_file)
— intersection of declared whitelist ∩ core tier. Reverted after
confirmation.

CLAUDE.md updated with BOOCODE_TOOLS in the Environment section's
Optional list. .env.example gained a commented BOOCODE_TOOLS=all line
with the per-tier token-cost table.

~110 LoC across 5 files (4 modified + 1 test expansion). Code changes
come in under the brief's ~30 LoC estimate; the test suite expansion
drove most of the growth.
2026-05-22 14:59:01 +00:00
5a3f357ce9 v1.13.15-openspec: reformat batch docs to OpenSpec directory structure
Adopt Fission-AI/OpenSpec's openspec/changes/<change-name>/{proposal,
specs,design,tasks}.md shape for BooCode's own batch docs. Zero-dep
documentation reformat; replaces ad-hoc boocode_batchN.md /
handoff_vN.N.N.md convention.

Existing batch docs moved into openspec/changes/archived/ via git mv
(preserves history):
- boocode_batch10.md
- handoff_v1.13.8_prefix_verify.md
- handoff_v1.13.10_per_tool_cost.md

Pre-v1.13.15 docs were NOT split into proposal/tasks/design files. The
work was already shipped; the originals are preserved as archived
snapshots. New v1.13.15+ batches land directly in
openspec/changes/<slug>/proposal.md (+ tasks.md, + design.md when
applicable) per the convention documented in openspec/README.md.

CLAUDE.md gained a one-line pointer to the convention (workflow
section). File grew from 153 → 154 lines, 27,682 → 27,925 chars; both
remain well under the AgentLint hard caps.

specs/ directory is reserved for future OpenSpec CLI adoption (v1.14+).
No CLI dep added in this batch — directory structure only. If/when the
full OpenSpec lifecycle is adopted, that lands as a separate batch.
2026-05-22 14:54:17 +00:00
fc11e8dc91 v1.13.15-agentlint: instruction-file audit against AgentLint 31-check standard
Manual audit pass against 0xmariowu/AgentLint's evidence-backed checks
(MIT, drawn from 265 versions of Anthropic's internal Claude Code
system prompt).

Findings and fixes:
- Identity sections ("You are the assistant running inside ...") removed
  from BOOCHAT.md (line 3) and BOOCODER.md (line 5). The model already
  knows where it's running; the openers were emphatic decoration.
- CLAUDE.local.md added to .gitignore (.env was already covered).
  Claude Code's Glob tool ignores .gitignore by default, which means
  any local override file was otherwise readable by any agent walking
  the workspace.
- CLAUDE.md unchanged — already passes all 10 checks. Emphasis density
  0.58/1000 words (under Anthropic's 1.4/1000 endpoint); two IMPORTANT/
  MUST references are load-bearing (tsc-noEmit footgun, v1.13.7
  includeUsage invariant); zero identity sections; zero --no-verify
  references; 27,682 chars (under the 40,000-char silent-drop limit).
  Line count (153) is over the 60-120 target band, but the brief
  explicitly forbids structural rewrites in the audit pass.

Targets not in scope:
- /opt/boocode/AGENTS.md does not exist in this repo (removed in v1.12,
  per CLAUDE.md:152). The global agent registry lives at /data/AGENTS.md
  (bind-mounted from outside the repo); can't be touched by this batch.
- No .github/workflows/ directory — SHA-pin audit (step 8) skipped.

Cumulative effect: model spends fewer tokens parsing instruction-file
ceremony in BOOCHAT/BOOCODER and receives sharper priority signal per
Anthropic's measured-evolution data. Zero code changes.
2026-05-22 14:52:37 +00:00
9ce638c916 v1.13.10: per-tool token cost accounting (rolling 100-call view)
Surfaces per-tool prompt/completion-token rolling averages in
AgentPicker for at-a-glance agent-cost hints. Implementation is a
SQL view on top of messages_with_parts plus a read endpoint and
AgentPicker tooltip extension. No new write site; all source data
already lands via the existing tool-phase.ts:94-95 / error-handler.ts:
109-110 / sentinel-summaries.ts UPDATEs that v1.13.7's includeUsage:
true fix made non-NULL.

(1) schema.sql — new tool_cost_stats view. Window-functions over
messages_with_parts.tool_calls with LATERAL jsonb_array_elements.
Attribution: equal split — multi-tool turn divides tokens N-ways;
the 100-call rolling mean absorbs split noise. Filters: status=
'complete' + metadata.kind NOT IN ('cap_hit','doom_loop') exclude
failed turns and sentinels respectively; tool_calls IS NOT NULL is
defense-in-depth since sentinels are role='system' rows. CREATE OR
REPLACE means schema apply is idempotent.

(2) routes/tools.ts NEW + index.ts wire-in. GET /api/tools/cost_stats
returns { stats: ToolCostStat[] } with mean_prompt_tokens / mean_
completion_tokens computed at read time (sum / n_calls). Sorted by
tool_name ASC. No pagination — ≤30 tools.

(3) __tests__/tool_cost_stats.test.ts NEW — 7 integration tests
keyed off DATABASE_URL env var. Tests skip gracefully when unset
(no-DB default). beforeAll applies the schema via sql.unsafe(read
FileSync(schema.sql)) for self-contained runs. Helper insertAssistant
Turn shared across cases. Covers: empty state, single-tool attribution,
multi-tool equal split, 100-call FIFO window, NULL-tokens exclusion,
parts-authoritative read via messages_with_parts, failed/sentinel
exclusion.

(4) web/api/types.ts + client.ts — ToolCostStat interface + api.tools.
costStats() method binding.

(5) AgentPicker.tsx — fetch costStats on mount, compute per-agent
sum-of-means across whitelisted tools, render muted cost line below
description: "~5.2k prompt / 280 completion · 6/8 tools · last call
3h ago". Skips line entirely when no tool history; preserves existing
native title= for layout backward-compat. formatK/formatAgo colocated.
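
The attribution arithmetic from item (1), restated in TypeScript as a sketch. The real logic lives in the SQL view; `Turn` and `toolCostStats` are hypothetical, turns are assumed to arrive oldest first, and status/sentinel filtering is assumed to have happened already.

```typescript
interface Turn {
  promptTokens: number | null;
  completionTokens: number | null;
  toolNames: string[]; // one entry per tool call in the turn
}

interface ToolCostStat {
  tool_name: string;
  n_calls: number;
  mean_prompt_tokens: number;
  mean_completion_tokens: number;
}

function toolCostStats(turns: Turn[], windowSize = 100): ToolCostStat[] {
  const perTool = new Map<string, Array<{ prompt: number; completion: number }>>();
  for (const t of turns) {
    const prompt = t.promptTokens;
    const completion = t.completionTokens;
    const n = t.toolNames.length;
    // Turns with NULL tokens or no tool calls contribute nothing.
    if (prompt === null || completion === null || n === 0) continue;
    for (const name of t.toolNames) {
      const samples = perTool.get(name) ?? [];
      // Equal split: a turn with N tool calls gives each call 1/N of the tokens.
      samples.push({ prompt: prompt / n, completion: completion / n });
      perTool.set(name, samples);
    }
  }
  const stats: ToolCostStat[] = [];
  perTool.forEach((samples, tool_name) => {
    const recent = samples.slice(-windowSize); // rolling window of latest calls
    const sumPrompt = recent.reduce((acc, s) => acc + s.prompt, 0);
    const sumCompletion = recent.reduce((acc, s) => acc + s.completion, 0);
    stats.push({
      tool_name,
      n_calls: recent.length,
      mean_prompt_tokens: sumPrompt / recent.length,
      mean_completion_tokens: sumCompletion / recent.length,
    });
  });
  return stats.sort((a, b) => (a.tool_name < b.tool_name ? -1 : a.tool_name > b.tool_name ? 1 : 0));
}
```

The means are computed at read time from sums and counts, matching the sum / n_calls shape the endpoint returns.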

Tests: 202/202 pass (195 prior + 7 new view-integration). Server +
web tsc clean.

Smoke: schema applied cleanly; GET /api/tools/cost_stats returns
canonical JSON; view + endpoint agree. Single-row result expected
given the v1.13.1-A → v1.13.7 NULL latent regression window; new
traffic populates organically.

Roadmap row at boocode_roadmap.md:114 plus schema row at :474 both
match. View vs table decision documented in handoff_v1.13.10_per_
tool_cost.md (rollback-safe, microsecond-fast at BooCode scale).

~270 LoC across 8 files (5 modified + 3 new).
2026-05-22 14:42:09 +00:00
8126d78b34 docs: capture v1.13.7-v1.13.9 invariants in CLAUDE.md
Five additions surfacing session-discovered constraints future Claude
sessions need:
- AI SDK v6 includeUsage:true requirement (avoids re-introducing the
  v1.13.1-A→v1.13.7 NULL-tokens regression)
- \n text-delta trim guards in MessageList/MessageBubble + payload.ts
  failed/empty-assistant skip rules (avoid undoing v1.13.7)
- 0.85 × ctx_max overflow formula (v1.13.9) replacing the stale
  ctx_max - 20k line
- New services/system-prompt.ts bullet documenting the v1.13.8
  fingerprint instrumentation surface
- New services/inference/budget.ts bullet with current BUDGET_NO_AGENT=30
  and read-only-tools rationale
2026-05-22 14:07:11 +00:00
b06a4a8e55 v1.13.9: compaction overflow trigger — 0.85 × ctx_max early trigger
Opencode pattern (session/overflow.ts): fire compaction at 85% of
ctx_max, replacing the v1.11.0-era `ctx_max - 20_000` formula.

Old formula: usable = ctx_max - 20_000
  - ctx=262144 → trigger at 242144 (92.4%) — only 7.6% headroom
  - ctx=100000 → trigger at  80000 (80.0%)
  - ctx= 32000 → trigger at  12000 (37.5%) — over-eager
  - ctx<=20000 → trigger at      0 — never fires

New formula: usable = floor(0.85 * ctx_max)
  - ctx=262144 → trigger at 222822 (85.0%) — 15% headroom for summarizer
  - ctx=100000 → trigger at  85000 (85.0%)
  - ctx= 32000 → trigger at  27200 (85.0%)
  - ctx=  8192 → trigger at   6963 (85.0%)

Ratio gives consistent headroom at any context scale. The qwen3.6
daily driver gets ~19k tokens more breathing room before overflow;
small-ctx models no longer degenerate to never-triggering.
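
The two formulas side by side, as a sketch. `EARLY_TRIGGER_RATIO` and `usable` are named in this commit; `usableLegacy` is a hypothetical name for the deleted formula, and both the clamp to zero and the `>=` comparison in `isOverflow` are assumptions about edge-case behavior.

```typescript
const EARLY_TRIGGER_RATIO = 0.85;

// New formula: a fixed fraction of the context window.
function usable(ctxMax: number): number {
  return Math.max(0, Math.floor(EARLY_TRIGGER_RATIO * ctxMax));
}

// Old formula, kept here only for comparison: a fixed 20k-token buffer.
function usableLegacy(ctxMax: number): number {
  return Math.max(0, ctxMax - 20_000);
}

// Assumed shape of the overflow check; the real isOverflow may differ.
function isOverflow(tokensUsed: number, ctxMax: number): boolean {
  const budget = usable(ctxMax);
  return budget > 0 && tokensUsed >= budget;
}
```

For ctx=262144 the difference between the two budgets is 242144 - 222822 = 19322 tokens, which is where the ~19k figure above comes from.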

usable() is the only consumer of COMPACTION_BUFFER → constant deleted.
New EARLY_TRIGGER_RATIO constant takes its place.

isOverflow() and the maybeFlagForCompaction() call site at
payload.ts:184 are unchanged — formula swap is internal to compaction.ts.
payload.ts comment touched only to drop the stale COMPACTION_BUFFER
reference (PRUNE_TRIGGER_TOKENS stays at 20k as the prune-freed
threshold; independent of the overflow formula).

Tests: 4 new usable() corner cases (262k/100k/8k/zero+negative), plus
5 isOverflow() numbers shifted to match the 85k budget at ctx=100k.
195/195 server tests pass (was 194).

Smoke: ratio math verified by unit tests at all four corners. Live
cap-hit verification deferred — requires accumulating >222k tokens
in a session under qwen3.6-35b-a3b-mxfp4 (was >242k pre-fix); will
surface organically in extended use.
2026-05-22 13:59:14 +00:00
a0c8d212cb v1.13.8: system-prompt prefix stability verify-and-measure
Recon during planning disproved the original v1.13.7 (DB-cache) premise:
buildSystemPrompt already runs over inputs mtime-cached at the file layer
(BOOCHAT.md in system-prompt.ts:25, AGENTS.md global+per-project in
agents.ts:245), and DB scalars are byte-stable until edited. The output
is microsecond pure-string concat with no I/O. Skills aren't in the
prefix; tools live in a separate request body field alpha-sorted by
v1.13.3.

This batch closes the verification gap with instrumentation, not
implementation:

- system-prompt.ts: buildSystemPromptWithFingerprint canonical impl
  computes SHA-256 over the assembled prefix, runs a per-session
  Map<sessionId, lastHash> observer, emits PrefixFingerprint per call
  and PrefixDrift (with field-level changed_inputs) on hash change.
  buildSystemPrompt is now a thin shim returning .prompt.
- agents.ts: getAgentsMtimes accessor — cache-read only, no I/O.
- payload.ts: buildMessagesPayload takes optional log argument; when
  passed, emits prefix-fingerprint (info) + prefix-drift (warn).
- turn.ts + sentinel-summaries.ts: pass ctx.log at 3 production call
  sites; sentinel summaries log too so any drift across cap-hit /
  doom-loop paths surfaces.
- system-prompt.test.ts: 4 new tests (byte-identical, no-drift-on-
  stable, drift-fires-with-changed-inputs, cross-session-no-drift).

194/194 tests pass (was 190).

Smoke: 5 messages in a fresh session produced 7 prefix-fingerprint
logs (extras from buildMessagesPayload being called from sentinel
summary paths), all with identical prefix_hash and prefix_length=2907,
zero prefix-drift. Prefix is byte-stable in steady-state.

Decision: original system_prompt_cache DB table from the roadmap is
permanently dropped. The v1.12.0 mtime caches at the input layer plus
alpha tool ordering at the request body (v1.13.3) already address the
load-bearing cache-stability surfaces. Instrumentation stays so the
claim can be re-verified at any time.
2026-05-22 13:42:18 +00:00
0ce6115976 docs: renumber v1.13.8 to verify-and-measure, drop system_prompt_cache table, add v1.13.8 dispatch brief 2026-05-22 13:24:29 +00:00
ff29b48e3a v1.13.7: stability bundle — usage capture + payload/UI sanitization
Five fixes for latent regressions surfaced during the v1.13.x.cosmetic
revert investigation. None alter schema or compaction; all cleanup
against the v1.13.1-A AI SDK migration's hidden surface.

(1) provider.ts — includeUsage: true on createOpenAICompatible.
@ai-sdk/openai-compatible defaults this false, omitting
stream_options.include_usage from the request body; llama-swap never
emitted the usage block, so result.usage.inputTokens/outputTokens
resolved undefined and tokens_used / ctx_used landed NULL in every
assistant row since v1.13.1-A. No historical backfill.

(2) MessageList.tsx — hasText = m.content.trim().length > 0.
AI SDK v6 streaming occasionally emits a leading "\n" text-delta on
tool-call-only turns; the literal newline passed length > 0 and
rendered an empty bubble + ActionRow between every tool call. Trim
catches it without changing semantics for genuine content.

(3) MessageBubble.tsx — same trim on hasContent for the no-tool-calls
path. Defensive symmetry with MessageList.flatten.

(4) payload.ts — buildMessagesPayload skips assistant rows with
status='failed' AND assistant rows with status='complete' + empty
content + no tool_calls. Without this, a trailing empty/failed
assistant + the next attempt's placeholder produced "Cannot have 2
or more assistant messages at the end of the list" rejections from
the OpenAI-compatible upstream after cap-hit + Continue.

(5) budget.ts — BUDGET_NO_AGENT 15 → 30. Every tool in ALL_TOOLS is
read-only today; the 15-cap was forward-looking for write tools that
haven't landed. No-agent mode now matches BUDGET_READ_ONLY.
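
The skip rule from item (4) could look roughly like this. `PayloadRow` and `shouldSkipFromPayload` are hypothetical names, and treating whitespace-only content as empty (via trim) is an assumption carried over from the UI fixes in items (2) and (3).

```typescript
interface PayloadRow {
  role: "user" | "assistant" | "system" | "tool";
  status: "streaming" | "complete" | "failed";
  content: string;
  tool_calls: unknown[] | null;
}

// True when an assistant row should be left out of the request payload.
function shouldSkipFromPayload(m: PayloadRow): boolean {
  if (m.role !== "assistant") return false;
  if (m.status === "failed") return true;
  const hasToolCalls = m.tool_calls !== null && m.tool_calls.length > 0;
  const isEmpty = m.content.trim().length === 0; // assumption: whitespace counts as empty
  return m.status === "complete" && isEmpty && !hasToolCalls;
}

function filterPayloadRows(rows: PayloadRow[]): PayloadRow[] {
  return rows.filter((m) => !shouldSkipFromPayload(m));
}
```

Dropping these rows keeps the message list from ending in two consecutive assistant entries, which is the condition the upstream rejected.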

47 LoC across 5 files. 190/190 server tests pass.

Verified live: new assistant turns populate StatsLine token data;
single-tool-call turns no longer render the stray empty-bubble +
ActionRow between tool calls; Continue after cap-hit no longer hits
the trailing-assistant API rejection.
2026-05-22 13:24:19 +00:00
81d837c04e v1.13.6: compaction head-assembly audit + reasoning fix
Audit traced compaction's summary path post-v1.13.1-B read flip:
- Q1: reads from messages_with_parts (view) — clean
- Q2: parts shape correctly threaded through buildHeadPayload — clean
- Q3: reasoning omitted from summary input — FIX NEEDED

v1.13.1-C wired reasoning end-to-end into inference/payload.ts but
missed this read site. Summarizer model couldn't see the reasoning
trail for tool-bearing turns, quietly degrading summary quality for
reasoning-channel models (qwen3.6).

Fix:
- CompactionMessage extended with reasoning_parts field
- SELECT pulls reasoning_parts from messages_with_parts
- buildHeadPayload (now exported for tests) prefixes assistant content
  with <reasoning>...</reasoning>\n\n<content>... when reasoning is
  present; standalone <reasoning>...</reasoning> for tool-call-only
  turns; omits the tag when reasoning is null or empty
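
The three render branches, sketched as a pure function. `renderAssistantForSummary` is a hypothetical name; reasoning is assumed to arrive as a single joined string, and the closing `</content>` tag is an assumption since the description above elides what follows `<content>`.

```typescript
function renderAssistantForSummary(content: string, reasoning: string | null): string {
  const r = reasoning === null ? "" : reasoning.trim();
  // Null or empty reasoning: emit the content with no tags at all.
  if (r.length === 0) return content;
  // Tool-call-only turn: reasoning stands alone.
  if (content.trim().length === 0) return `<reasoning>${r}</reasoning>`;
  // Both present: reasoning prefix, blank line, then tagged content.
  return `<reasoning>${r}</reasoning>\n\n<content>${content}</content>`;
}
```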

4 new render branch tests (190 total).

Smoke deferred: forcing real compaction requires either threshold
pollution or building up a >40k-token chat with reasoning_parts.
Render branches are unit-covered; integration would only re-prove
structural correctness.
2026-05-22 08:18:47 +00:00
f8fc5db929 v1.13.5: opencode truncate.ts port — full tool output retrievable via opaque id
- New services/truncate.ts. Tmpfs storage at /tmp/boocode-truncations/
  (BOOCODE_TRUNCATION_DIR env var overrides for tests). 12-char base32
  opaque ids (~60 bits entropy, "tr_<id>"). Three exports: storeTruncation,
  readTruncation, truncateIfNeeded (wrap-or-passthrough helper).
  cleanupTruncations does TTL-pass (7 days) + orphan-reap (parts query on
  payload->'output'->>'outputPath') in one shot.
- Wired four tools through truncateIfNeeded: view_file (raw full file),
  list_dir (full filtered+secret-filtered entries serialized one-per-line),
  web_fetch (textRaw pre-slice), codecontext_client (body.result pre-slice).
  Each returns the existing sliced view plus an optional outputPath field
  when truncation fires.
- New view_truncated_output ToolDef. Resolves opaque id → on-disk content
  internally; model never sees the truncation dir. Same start_line /
  end_line slicing semantics as view_file. Registered in ALL_TOOLS (alpha
  sort places it after view_file automatically) and READ_ONLY_TOOL_NAMES.
- cleanupTruncations piggybacks on the v1.13.3 stuck-row sweeper's 60s
  setInterval. No-op when truncation dir is empty.
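
A wrap-or-passthrough helper of the kind described above might look like the sketch below. It is illustrative only: the character-based cap, the `TruncateResult` shape, the injected `store` callback (standing in for storeTruncation), and the behaviour on storage failure are all assumptions rather than the contents of services/truncate.ts.

```typescript
// Hypothetical result shape; the real tool payloads may carry more fields.
interface TruncateResult {
  text: string; // the sliced view the tool already returned
  truncated: boolean;
  outputPath?: string; // opaque "tr_<id>" handle, set only when stored
}

// `store` stands in for storeTruncation: it persists the full text and
// returns an opaque id, or throws if storage is unavailable (assumption).
function truncateIfNeeded(
  full: string,
  cap: number,
  store: (fullText: string) => string,
): TruncateResult {
  // Passthrough: nothing to store when the output fits under the cap.
  if (full.length <= cap) return { text: full, truncated: false };

  const text = full.slice(0, cap);
  try {
    return { text, truncated: true, outputPath: store(full) };
  } catch {
    // Assumed fallback: keep the sliced view, just without a handle.
    return { text, truncated: true };
  }
}
```

Keeping the store behind a callback makes the passthrough, wrap, and storage-failure paths testable without touching the filesystem.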

Not wired (TODO follow-up): grep and find_files. file_ops returns post-cap
results to the tool execute path, so the "full content" isn't recoverable
without a refactor of fileOps.grep / fileOps.findFiles to expose the
uncapped result. web_search is silent-slice (no truncated flag); outside
scope. Five sites of seven covered; the remaining two are the only ones
needing a file_ops change.

Tests: 7 new in truncate.test.ts (roundtrip, unknown id, malformed id,
truncateIfNeeded false/true/over-cap/storage-failure paths). 186 total
(was 179). cleanupTruncations file-system half implicitly via TTL pass;
orphan-reap branch covered by the live container smoke.

Smoke verified end-to-end against the live container:
- view_file with start_line=1, end_line=3 on CLAUDE.md → tool_result part
  carried outputPath "tr_cdpn1o04k6ma" + truncated=true.
- /tmp/boocode-truncations/tr_cdpn1o04k6ma exists, 15876 bytes, mode 0o600,
  parent dir mode 0o700.
- Follow-up view_truncated_output(id, start_line=50, end_line=55) returned
  the actual lines 50-55 of CLAUDE.md (the 808notes/BooCode bullets).
- ALL_TOOLS count=20 (was 19); alpha sort places view_truncated_output
  between view_file and watch_changes.

Closes a v1.12 catalog row that was scoped but deferred. The v1.13 parts
table made outputPath ride on the existing tool_result payload with no
schema change beyond the storage helper itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 07:55:55 +00:00
ec8593cf77 v1.13.4: two-tier compaction prune — opencode pattern half-shipped in v1.11.0
- message_parts.hidden_at timestamptz column (NULL by default) with a
  partial index on (message_id) WHERE hidden_at IS NULL for the common
  visible-parts filter.
- messages_with_parts view changed from COALESCE(parts, legacy) to
  CASE WHEN EXISTS(any parts of kind) THEN visible-parts ELSE legacy.
  COALESCE would have leaked hidden parts back via the legacy fallback
  when every part was pruned (smoke caught it pre-commit). The CASE
  distinguishes "no parts at all → fall back to legacy column for
  pre-v1.13.0 history" from "all parts hidden → return null/empty so
  the row drops out of the model payload" exactly.
- prune.ts: scans tool_result parts newest-first, protects the last 40k
  tokens (PROTECTED_TOKENS), marks older candidates hidden when their
  combined estimate clears 20k (PRUNE_TRIGGER_TOKENS — equal to
  COMPACTION_BUFFER from v1.11.0, so a successful prune is exactly the
  budget the summary path would have freed). Stops at chats.tail_start_id
  so it doesn't double-erase across the last summary boundary. Pure
  decision helper selectPruneTargets exported separately for unit tests.
- Wired into maybeFlagForCompaction: prune runs synchronously when
  overflow is detected; if it freed >= PRUNE_TRIGGER_TOKENS, the
  needs_compaction flag is NOT set and the (expensive) summary inference
  call is skipped this turn. The next turn's overflow check re-evaluates
  from scratch.
- 6 new unit tests in prune.test.ts cover: empty input, protection-only
  (no candidates), candidates below trigger, candidates above trigger,
  candidates straddling a summary boundary, exactly-protection-tokens.
  179 tests total (was 173).
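
The prune decision described above can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the shipped selectPruneTargets: the `ToolResultPart` shape, the rule that a part stays protected while the running total before it is under the protected budget, and the choice to include the boundary message itself before stopping are all guesses at details the commit does not spell out.

```typescript
const PROTECTED_TOKENS = 40000;
const PRUNE_TRIGGER_TOKENS = 20000;

// Hypothetical input shape: one entry per tool_result part.
interface ToolResultPart {
  id: string;
  messageId: string;
  tokens: number; // estimated token count
}

// `parts` must be ordered newest-first. Returns the part ids to mark hidden.
function selectPruneTargets(
  parts: ToolResultPart[],
  tailStartId: string | null,
): string[] {
  let protectedSoFar = 0;
  let candidateTokens = 0;
  const candidates: string[] = [];

  for (const part of parts) {
    if (protectedSoFar < PROTECTED_TOKENS) {
      // Still inside the protected recent window (assumed boundary rule).
      protectedSoFar += part.tokens;
    } else {
      candidates.push(part.id);
      candidateTokens += part.tokens;
    }
    // Stop at the last summary boundary so older history is not re-pruned.
    if (tailStartId !== null && part.messageId === tailStartId) break;
  }

  // Prune only when the candidates free at least the trigger amount.
  return candidateTokens >= PRUNE_TRIGGER_TOKENS ? candidates : [];
}
```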

Smoke verified post-rebuild:
- \d message_parts shows hidden_at + partial index.
- View definition shows AND p.hidden_at IS NULL filters on all three
  subselects.
- Synthetic hide-then-restore confirmed the view drops the tool_result
  jsonb to null when its only part is hidden, and restores when un-hidden.
- EXPLAIN ANALYZE on the 42-message stress chat: 0.325ms (faster than
  v1.13.1-B's 1.018ms — EXISTS short-circuits cleanly for the common
  no-parts case).
- Normal turn (plain text prompt) completes unaffected.

Closes a v1.11.0 design item that was scoped but never implemented. With
v1.13's parts table the prune is dramatically cheaper to write — pre-parts
it would have meant editing JSON blobs in-place; now it's a hidden_at
flag and a view subselect.
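
The difference between the CASE-based view and the earlier COALESCE can be modeled in a few lines. This is a simplified sketch for one message and one part kind: the `Part` shape and both function names are hypothetical, and the real view does this in SQL rather than application code.

```typescript
// Hypothetical in-memory stand-in for a message's parts of one kind.
interface Part {
  payload: unknown;
  hiddenAt: Date | null;
}

// Models CASE WHEN EXISTS(any parts) THEN visible-parts ELSE legacy.
function resolveWithCase(parts: Part[], legacy: unknown): unknown {
  // No parts at all: pre-v1.13.0 history, fall back to the legacy column.
  if (parts.length === 0) return legacy;
  const visible = parts.filter((p) => p.hiddenAt === null).map((p) => p.payload);
  // Parts exist but every one is hidden: return null, never the legacy value.
  return visible.length > 0 ? visible : null;
}

// Models COALESCE(visible-parts, legacy): a null aggregate falls through to
// the legacy column even when the parts were hidden on purpose.
function resolveWithCoalesce(parts: Part[], legacy: unknown): unknown {
  const visible = parts.filter((p) => p.hiddenAt === null).map((p) => p.payload);
  const aggregate = visible.length > 0 ? visible : null;
  return aggregate ?? legacy;
}
```

With a single hidden part, `resolveWithCoalesce` hands back the legacy value while `resolveWithCase` returns null, which is the leak the commit describes.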

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 07:02:17 +00:00
a08d809b73 v1.13.3: cleanup bundle — statement timeout + alpha ordering + stuck-row sweeper + repairToolCall
Four independent items, all owed from prior dispatches.

- statement_timeout at the database level via:
    ALTER DATABASE boocode SET statement_timeout = '30s';
  Applied operationally; documented as a comment at the top of schema.sql
  (ALTER DATABASE can't run inside a DO block, so it's not idempotent
  inside applySchema). Re-apply after a volume reset.

- Tool registry alpha-sorted at module load. llama.cpp's prompt cache
  hits on byte-identical prefixes; any reordering of the tool list near
  the top of the system prompt would invalidate every cached turn.
  Single-source sort at the ALL_TOOLS export so toolJsonSchemas() and
  TOOLS_BY_NAME inherit the order automatically. New tools.test.ts
  asserts the invariant; total tests 173 (was 172).

- Periodic in-process stuck-row sweeper. Runs every 60s, marks
  'streaming' rows older than 5 minutes as 'failed', and publishes
  chat_status='idle' on the user channel so the UI dot drops without a
  refresh. Closes the mid-session crash UX gap; the v1.12.1 boot sweep
  only fires once at startup, so sessions used to stay stuck until next
  reboot. setInterval cleaned up via app.addHook('onClose'). Mirrors
  handleAbortOrError's publish pattern.

- experimental_repairToolCall wired through AI SDK v6 streamText. Pass-
  through implementation: log + return the original toolCall so the
  stream keeps going. executeToolPhase's existing error paths (unknown
  tool name → 'unknown tool: X' result; zod-reject → 'tool X rejected
  — field: required') already surface bad calls to the model; the value
  here is preventing the AI SDK from THROWING on parse errors and
  killing the whole stream. Owed since v1.13.1-A.
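
The alpha-ordering invariant from the second item could be enforced and checked along these lines. A minimal sketch, assuming a `ToolDef` with only a `name` field; the helper names and the choice of plain code-unit comparison over a locale-aware compare are illustrative assumptions, not the contents of tools.test.ts.

```typescript
// Hypothetical minimal tool shape; real definitions carry schemas too.
interface ToolDef {
  name: string;
}

// Plain code-unit comparison (assumption): unlike a locale-aware compare,
// the result cannot vary with the host locale, which keeps the serialized
// tool list byte-identical across machines.
function byName(a: ToolDef, b: ToolDef): number {
  return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
}

// Sort once at the export site so every derived structure inherits the order.
function sortTools<T extends ToolDef>(tools: readonly T[]): T[] {
  return [...tools].sort(byName);
}

// The invariant a test can assert against the exported list.
function isAlphaSorted(tools: readonly ToolDef[]): boolean {
  return tools.every((t, i) => i === 0 || tools[i - 1]!.name <= t.name);
}
```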

Smoke verified:
- statement_timeout = '30s' confirmed via SHOW.
- Tool path normal flow intact (list_dir prompt → tool_call → result
  → final assistant). No malformed tool calls in the test run; repair
  log will surface them when qwen3.6 actually emits one.
- Alpha order verified at runtime via the dist bundle: match: true.
- Sweeper logic not traffic-tested (no stuck rows to find), but the
  SQL UPDATE + broker.publishUser pattern is identical to handleAbort
  and the boot sweep — synthesis-only verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:46:03 +00:00
ac1a71f583 v1.13.1-C: port ask_user_input correlation to parts + wire reasoning_parts end-to-end
Pass 1 — ask_user_input correlation port (messages.ts:478, :549):

- The two correlation queries that backed the elicitation flow used to scan
  messages.tool_calls and messages.tool_results JSON columns directly. They
  now JOIN message_parts on payload->>'id' (for the caller assistant) and
  payload->>'tool_call_id' (for the pending tool row). Semantics preserved:
  ORDER BY m.created_at DESC LIMIT 1 still picks the latest issuance, the
  already-answered 409 guard now reads payload.output, and the UPDATE +
  parts replace inside sql.begin is unchanged from v1.13.0.
- Pre-v1.13.0 history has no parts rows, so it is unreachable through this lookup
  path (404). Acceptable per dispatch decision — no pending elicitation
  from before v1.13.0 will still be open. JSON-column fallback can land as
  a hotfix if it ever surfaces.

Pass 2 — reasoning_parts wired end-to-end:

- types.ts/StreamResult gains `reasoning: string`. stream-phase.ts accumulates
  reasoning-delta text per stream (replacing the v1.13.1-A counter-only
  diagnostic) and returns it on the result.
- parts.ts/partsFromAssistantMessage gains an optional `reasoning` param.
  When present it emits a kind='reasoning' part at sequence 0, ahead of
  the text and tool_call parts.
- error-handler.ts/finalizeCompletion and tool-phase.ts/executeToolPhase
  both thread result.reasoning into the dual-write call so reasoning-channel
  models (qwen3.6) get persistent reasoning rows.
- payload.ts: loadContext SELECT pulls reasoning_parts from the v1.13.1-B
  view; OpenAiMessage gains an optional `reasoning` field; buildMessagesPayload
  collapses reasoning_parts into a single string per assistant message.
- stream-phase.ts/toModelMessages converts assistant messages with reasoning
  into an AI SDK ModelMessage content array starting with a ReasoningPart,
  matching the @ai-sdk/provider-utils AssistantContent union. Reasoning
  models can now replay prior reasoning context across tool-call boundaries.
- types/api.ts and apps/web/src/api/types.ts Message interface gain
  reasoning_parts (optional, nullable). Frontend doesn't render this yet —
  field reserved for a v1.14 UI surface.
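
The reasoning-at-sequence-0 ordering can be sketched as a pure derive helper. This is an illustration under assumptions, not the real parts.ts: the part kinds come from the message_parts CHECK constraint and the `{id, name, args}` tool_call payload from the view description, but the `{ text }` payload for text and reasoning parts and the exact signature are guesses.

```typescript
type PartKind = "text" | "tool_call" | "tool_result" | "reasoning" | "step_start";

interface Part {
  sequence: number;
  kind: PartKind;
  payload: Record<string, unknown>;
}

interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

// Sketch of the derive helper with the optional reasoning parameter.
function partsFromAssistantMessage(
  content: string | null,
  toolCalls: ToolCall[],
  reasoning?: string,
): Part[] {
  const parts: Part[] = [];
  // Sequence is simply the insertion index, so ordering follows push order.
  const push = (kind: PartKind, payload: Record<string, unknown>): void => {
    parts.push({ sequence: parts.length, kind, payload });
  };

  // Reasoning, when present, goes first and therefore lands at sequence 0.
  if (reasoning) push("reasoning", { text: reasoning });
  if (content) push("text", { text: content });
  for (const call of toolCalls) {
    push("tool_call", { id: call.id, name: call.name, args: call.args });
  }
  return parts;
}
```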

Tests: 2 new in parts.test.ts cover reasoning-at-sequence-0 with and
without text content. 172 tests pass (170 prior + 2 new).

Smoke verified against the live container:
- A reasoning-prompt ("walk through 17 × 23 step by step") produced one
  message with kind='reasoning' (361 chars) at sequence 0 and kind='text'
  (429 chars) at sequence 1. Adapter log confirmed reasoning capture.
- The new correlation SQL was validated against existing tool_call /
  tool_result parts: returns the expected message_id + payload shape with
  pending state correctly identified via payload.output IS NULL.
- ask_user_input end-to-end through the UI is Sam's smoke — the Prompt
  Builder agent does not always trigger ask_user_input for these prompts,
  so synthetic verification via SQL substituted for traffic-driven cover.

Annotation: the v1.13.1-A abort-throw site in stream-phase.ts got a
one-liner comment ("AI SDK v6 fullStream returns normally on abort; check
signal explicitly.") to prevent a future refactor removing it.

v1.13.2 drops the dual-write + the JSON columns + collapses the view.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:34:10 +00:00
13c3aa5b4e v1.13.1-B: read-path flip from tool_calls/tool_results JSON columns to message_parts
- schema.sql: new messages_with_parts view. tool_calls aggregates parts
  with kind='tool_call' as a jsonb array of {id, name, args}; tool_results
  picks the single sequence=0 part with kind='tool_result' as a jsonb
  {tool_call_id, output, truncated, error?}. COALESCE against the legacy
  jsonb columns means pre-v1.13.0 history (no parts rows) still reads
  correctly via the fallback, and fresh inserts (where parts dual-write
  follows the row INSERT) hit the legacy columns until the parts land.
- reasoning_parts column added to the view but not selected by any caller
  yet — v1.13.1-C extends the Message type and pulls it into the model
  payload alongside the type extension.
- Read sites switched to FROM messages_with_parts:
  - routes/chats.ts:427 (chat history GET)
  - routes/messages.ts:95 (session history GET)
  - routes/ws.ts:27 (WS snapshot on session connect, resume path)
  - services/inference/payload.ts (loadContext for model assembly)
  - services/compaction.ts (compaction's payload assembly)
- chats.ts:394 (discard_stale UPDATE RETURNING) unchanged — UPDATEs target
  messages directly and the returned shape is for a freshly-modified row
  where the legacy column is dual-written and correct.
- messages.ts:478/549 (ask_user_input correlation) intentionally not
  migrated — those query a different shape, ported in v1.13.1-C.
- Writes still target `messages` directly; the view is read-only.

Smoke verified against the live container:
- Equivalence: 5/5 messages with both legacy column and parts row return
  identical tool_calls jsonb between FROM messages and FROM messages_with_parts.
- Perf: EXPLAIN ANALYZE on the 42-message stress chat returns in ~1ms
  (50ms threshold). Bitmap Index Scan on message_parts_msg_seq_idx
  carries the parts lookups.
- API contract: GET /api/chats/:id/messages returns identical
  {id, name, args} tool_calls and {tool_call_id, output, truncated, error}
  tool_results shapes to frontend consumers — no UI changes needed.
- Inference path: sent a view_file prompt; assistant turn 1 emitted the
  tool_call, tool message captured the result, follow-up assistant turn
  read the result back via loadContext (now view-backed) and answered
  correctly. End-to-end loop intact.

v1.13.2 drops the dual-write + the JSON columns + simplifies the view
to just SELECT FROM message_parts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:22:47 +00:00
c2c4f78a26 v1.13.1-A: install AI SDK v6 + swap streamText into stream-phase.ts adapter
- Add ai@^6 and @ai-sdk/openai-compatible@^2 to apps/server.
- New services/inference/provider.ts: createOpenAICompatible against
  llama-swap (baseURL threaded from config.LLAMA_SWAP_URL, cached per
  baseURL). No apiKey — Authelia + Tailscale gate llama-swap, not keys.
- streamCompletion rewritten as an adapter over streamText. AI SDK
  fullStream parts (text-delta, tool-call, finish, error) map back to
  the legacy {content?, tool_calls?, finishReason} StreamResult shape
  that executeStreamPhase already consumes. No layer above
  streamCompletion changes.
- toModelMessages converts BooCode's OpenAI-shaped history to AI SDK
  ModelMessage[]; tool messages need toolName which we look up by
  scanning earlier assistant tool_calls for the matching id.
- buildAiTools wraps BooCode's JSON-schema tool defs via
  tool({ inputSchema: jsonSchema(parameters) }) with NO execute —
  BooCode dispatches tools in tool-phase.ts, not the AI SDK loop.
- XML fallback parser preserved as-is — qwen3.6 still emits XML tool
  calls in text content that the structured tool-call layer misses.
- reasoning-delta parts dropped with a debug-level counter — captured
  properly in v1.13.1-C.
- Abort path: streamText({ abortSignal }) wires ctx.signal through, but
  AI SDK v6 swallows the abort (fullStream iterator exits cleanly
  rather than throwing). Post-iteration `if (signal?.aborted) throw` so
  handleAbortOrError owns the row and writes status='cancelled'. Caught
  by smoke D; would have shipped as status='complete' on stop otherwise.
- Usage frame reads result.usage (inputTokens / outputTokens v6 names)
  AFTER stream drain. Single trailing publish through the existing 500ms
  throttle. Known regression: ChatThroughput's live mid-stream tick
  (v1.12.2) is gone — it now shows a single value at stream end.
  TODO(v1.13.1-followup): interpolate outputTokens during streaming
  via a delta-cadence counter (e.g. part.text.length/4 token proxy)
  and publish every 500ms; reconcile against result.usage at finish.
- Write-path dual-write from v1.13.0 unaffected.
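
The toolName lookup mentioned in the toModelMessages item can be sketched as a backward scan. A minimal sketch, assuming an OpenAI-shaped history: the `HistoryMessage` type, the function name, and the nearest-first scan direction are illustrative choices, not the adapter's actual code.

```typescript
interface ToolCall {
  id: string;
  name: string;
}

// Hypothetical, trimmed-down OpenAI-shaped history entry.
interface HistoryMessage {
  role: "system" | "user" | "assistant" | "tool";
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}

// Resolve the tool name for the tool message at `index` by scanning earlier
// assistant messages for the tool call with the matching id.
function findToolName(history: HistoryMessage[], index: number): string | undefined {
  const target = history[index]?.tool_call_id;
  if (target === undefined) return undefined;

  for (let i = index - 1; i >= 0; i--) {
    const msg = history[i];
    if (msg === undefined || msg.role !== "assistant") continue;
    const hit = msg.tool_calls?.find((call) => call.id === target);
    if (hit) return hit.name;
  }
  return undefined; // no issuing assistant message found
}
```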

Read path stays on JSON columns. v1.13.1-B flips reads to message_parts.

Smoke verified end-to-end against running container:
- A. Plain text: status='complete', 1 text part.
- B. Single tool prompt → multi-tool chain (4 calls): every assistant
     with tool_calls has 2 parts (text+tool_call), every tool row has
     1 part (tool_result).
- C. Multi-step covered by B's chain.
- D. Stop mid-stream: status='cancelled' written via handleAbortOrError
     after the post-iteration abort throw.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:17:56 +00:00
1cb6eee24c v1.13.0: message_parts table + dual-write at every tool_calls/tool_results site
Adds a granular message_parts table (one row per text/tool_call/tool_result
chunk) without changing any read path. Old messages.content / tool_calls /
tool_results columns remain authoritative for v1.13.0; this dispatch is
write-only mirroring so the AI SDK migration in v1.13.1 can flip read
authority without a backfill window.

Schema:
  CREATE TABLE message_parts (id, message_id FK ON DELETE CASCADE,
    sequence int, kind text CHECK (text|tool_call|tool_result|reasoning|step_start),
    payload jsonb, created_at, UNIQUE (message_id, sequence))

New module services/inference/parts.ts with two pure derive helpers
(partsFromAssistantMessage, partsFromToolMessage) and insertParts that
fans out a multi-row INSERT via postgres-js.

Wired dual-write at every site that writes tool_calls or tool_results:
- tool-phase.ts: assistant finalize UPDATE, executed-tool UPDATE,
  ask_user_input sentinel UPDATE
- messages.ts answer flow: DELETE pending tool_result part + INSERT
  answered one inside the existing sql.begin
- skills.ts: synthetic assistant + tool INSERTs both inside existing tx
- chats.ts fork: CTE clones parts via ROW_NUMBER pairing (source→dest
  message id mapping in one statement, no N+1)
- error-handler.ts finalizeCompletion: text part for plain text-only
  assistant turns

Deviation: tool-phase.ts finalize UPDATEs and finalizeCompletion text-part
write are not wrapped in fresh sql.begin transactions. Safe in v1.13.0
because JSON columns are authoritative for reads. v1.13.1 must wrap these
sites before flipping read authority — TODO comments added at each
unwrapped site referencing v1.13.1.

Tests: 8 new unit tests for the derive helpers in
services/__tests__/parts.test.ts. Existing 162 tests untouched. 170 total.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 05:46:29 +00:00
ca64bf9f0a docs: CLAUDE.md updates from /claude-md-management session
- services/inference.ts → services/inference/ directory map (v1.12.4 split)
- workspace_panes server-side jsonb (was: localStorage-only line)
- chat_status 5-state model + ChatThroughput + discard_stale endpoint
- boot-time stale-streaming sweep documented
- WS frame sync gotcha (server InferenceFrame ↔ web WsFrame)
- session_panes table noted as dropped (not deprecated)
- messages_status_check/role_check drift cleanup noted

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 05:46:14 +00:00
9ef00c0268 v1.12.4: complete inference.ts split into services/inference/
- sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel,
  runDoomLoopSummary, insertDoomLoopSentinel
- inference.ts → inference/turn.ts: residue is runAssistantTurn,
  runInference, createInferenceRunner orchestration only
- inference/index.ts: re-export shim preserves the public surface
  (createInferenceRunner, runInference, runAssistantTurn,
  detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus
  type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/
  FramePublisher)
- src/index.ts + auto_name.ts + the two vitest test files updated to
  import from ./services/inference/index.js explicitly (NodeNext ESM
  doesn't honor directory-index resolution)

Final tally: 11 files under services/inference/, the largest being
sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept
side-by-side until a third sentinel justifies factoring out a shared
runWrapUpSummary). The next-largest is stream-phase.ts at 380 LoC;
turn.ts is now 326 LoC. Public import surface unchanged.

tool-phase.ts → turn.ts back-edge for runAssistantTurn remains
(cycle is safe; resolved at call time).

Prepares the file structure for v1.13 AI SDK migration — streamText
swap targets stream-phase.ts only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:36:35 +00:00
c87df6981a v1.12.4-rc3: extract stream-phase + tool-phase from inference.ts
- stream-phase.ts: streamCompletion, executeStreamPhase (plus sseLines,
  StreamOptions, ChatCompletionDelta/Chunk as private helpers)
- tool-phase.ts: executeToolPhase + private executeToolCall
- types.ts: shared StreamPhaseState + DB_FLUSH_INTERVAL_MS so the
  summary functions still in inference.ts can reference them without
  pulling from a phase file

Cycle: executeToolPhase recurses into runAssistantTurn, which stays in
inference.ts. Resolved by direct value back-edge — tool-phase.ts does
`import { runAssistantTurn } from '../inference.js'` and runAssistantTurn
is now exported. Safe because the dereference happens inside an async
function body, after both modules have fully evaluated. No
callback-through-args fallback needed.

inference.ts shrinks from ~1401 to ~828 LoC. Final Dispatch D moves the
sentinel summaries out and renames the residue to inference/turn.ts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:28:23 +00:00
8fa7b7fce9 v1.12.4-rc2: extract payload + error-handler from inference.ts
- payload.ts: buildMessagesPayload (re-exported), loadContext,
  maybeFlagForCompaction
- error-handler.ts: handleAbortOrError, finalizeCompletion

Both new files type-import InferenceContext/StreamResult/TurnArgs from
inference.ts; ESM elides type imports so there's no runtime cycle.
handleAbortOrError turned out not to call the summary functions, so
no back-edge needed.

inference.ts shrinks from ~1676 to ~1401 LoC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:09:50 +00:00
ea468ca7fb v1.12.4-rc1: extract budget, sentinels, xml-parser from inference.ts
Pure file moves. No behavior change. inference.ts retains createInferenceRunner
public surface; new files are internal to services/inference/.

- budget.ts: resolveToolBudget
- sentinels.ts: detectDoomLoop (re-exported through inference.ts),
  isCapHitSentinel, isDoomLoopSentinel, isAnySentinel
- xml-parser.ts: parseXmlToolCall, partialXmlOpenerStart

First of four refactor batches preparing inference.ts for the v1.13
AI SDK migration. inference.ts goes from 1780 LoC to ~1620.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:42:41 +00:00
eef4782383 v1.12.3: stale-stream banner with Retry/Discard
When an assistant message sits status='streaming' with no token activity
for 60+ seconds, the chat shows a banner above the input offering Retry
or Discard. Both clear the stale row via a new backend endpoint
POST /api/chats/:id/discard_stale that updates status='failed' and
publishes chat_status='idle'.
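
The staleness condition for showing the banner might be expressed as a small predicate. This is a sketch only: the `lastTokenAt` field and the function name are hypothetical, and the commit does not say how the client tracks token activity.

```typescript
const STALE_AFTER_MS = 60000; // "60+ seconds" from the description above

// Hypothetical client-side view of an assistant message.
interface AssistantMessage {
  status: string;
  lastTokenAt: number; // epoch ms of the most recent token (assumed field)
}

// True when the message is still marked streaming but has been silent
// for at least the stale threshold.
function isStaleStream(msg: AssistantMessage, nowMs: number): boolean {
  return msg.status === "streaming" && nowMs - msg.lastTokenAt >= STALE_AFTER_MS;
}
```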

Closes the UX gap that caused the 2026-05-21 debugging spiral —
slow streams and dead streams now look different to the user.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:48:22 +00:00
a7104691aa v1.12.2: live tok/s + ctx display next to status indicator
ChatThroughput renders inline beside StatusDot while streaming or
tool_running. Subscribes to existing usage frames via sessionEvents.
Hides when status drops to idle/error or data is older than 10s.

Addresses the 2026-05-21 spike's UX gap where slow streams looked
identical to dead streams — now there's a live token velocity readout
that immediately distinguishes the two.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:45:53 +00:00
1a0a3b1673 v1.12.1: stop-handler writes terminal status + constraint cleanup + dead code removal
- handleAbortOrError now writes status='cancelled' on user stop; rows
  no longer stuck 'streaming' forever
- Drop stale messages_status_check constraint (only messages_status_chk
  remains, allowing 'cancelled' via TS MESSAGE_STATUSES)
- Remove detectSameNameLoop and DOOM_LOOP_SAME_NAME_THRESHOLD (added
  during 2026-05-21 debugging spike, never fired in any real run,
  existing detectDoomLoop covers actual failure modes)
- Remove 12 ctx.log.info diagnostic markers added during the same
  spike (verbose for production)
- Bundles workspace pane sync + status indicator overhaul +
  startup hung-row sweep landed earlier in v1.12.1 work

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:34:40 +00:00
48ee63a286 v1.12.1: rich status indicator + server-side workspace pane sync
Status indicator (StatusDot): drops the flat amber pulse for a richer set
of states — orbiting amber for streaming, spinning sky ring for tool_running,
static violet for waiting_for_input, plus the existing idle/error. Backend
chat_status frame widens from 'working|idle|error' to discriminate streaming
vs tool execution vs paused for user input.

Workspace pane sync: pane layout moves from per-device localStorage to
server-side sessions.workspace_panes jsonb. PATCH /api/sessions/:id/workspace
broadcasts session_workspace_updated on the user channel for cross-device live
sync. Echo dedup via JSON comparison so the round-trip frame doesn't loop.
Legacy localStorage seeds the server on first hydrate, then is deleted.
Deprecated session_panes table dropped.
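
The echo dedup could be as small as the comparison below. A minimal sketch: the function name is hypothetical, and it assumes both sides serialize keys in the same order, since JSON.stringify is sensitive to key order.

```typescript
// Returns true only when the incoming layout differs from local state, so the
// client's own PATCH echoing back over the user channel is ignored.
// Assumption: both sides build pane objects with the same key order.
function shouldApplyRemote<T>(local: T, remote: T): boolean {
  return JSON.stringify(local) !== JSON.stringify(remote);
}
```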

Resilience: startup sweep marks any stale 'streaming' message older than
5 minutes as 'failed' so v1.12.0-style hung rows clear on container restart.
useWorkspacePanes gains validatePanes() to prune dead chatId references from
saved pane state when the chat list lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:32:02 +00:00
d58d553503 v1.12.1: same-name doom-loop guard + runAssistantTurn trace logging
Add detectSameNameLoop (threshold 5) to catch over-verification hangs
where tool args vary but the model is stuck on one tool. Add 12 structured
log points across the inference state machine (runAssistantTurn,
executeToolPhase, runDoomLoopSummary) to diagnose the deterministic hang
surfaced in v1.12.0 smoke testing.
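
One plausible reading of the same-name guard is sketched below; a later commit in this range removes the real one. The interpretation that the threshold means five consecutive calls to the same tool, and the input being a plain list of tool names, are assumptions.

```typescript
const DOOM_LOOP_SAME_NAME_THRESHOLD = 5;

// `toolNames` lists the tool called on each recent step, oldest first.
// Arguments are deliberately ignored: the guard targets the case where the
// args vary but the model keeps reaching for the same tool.
function detectSameNameLoop(toolNames: string[]): boolean {
  if (toolNames.length < DOOM_LOOP_SAME_NAME_THRESHOLD) return false;
  const recent = toolNames.slice(-DOOM_LOOP_SAME_NAME_THRESHOLD);
  return recent.every((name) => name === recent[0]);
}
```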

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-21 17:15:02 +00:00
415 changed files with 49647 additions and 3493 deletions

34
.codecontextignore Normal file

@@ -0,0 +1,34 @@
# .codecontextignore — paths codecontext skips during analysis
# Copy to your project root and customize. Same syntax as .gitignore.
# Dependencies / vendored code
node_modules/
vendor/
.venv/
venv/
__pycache__/
target/
# Build artifacts
dist/
build/
out/
.next/
.nuxt/
.svelte-kit/
# IDE / tooling
.opencode/
.vscode/
.idea/
.claude/worktrees/
# Test artifacts / coverage
coverage/
.nyc_output/
.pytest_cache/
# Lock files (rarely have meaningful symbols)
package-lock.json
yarn.lock
pnpm-lock.yaml

View File

@@ -1,6 +1,6 @@
NODE_ENV=production
PORT=3000
-DATABASE_URL=postgres://boocode:CHANGE_ME@boocode_db:5432/boocode
+DATABASE_URL=postgres://boocode:CHANGE_ME@boocode_db:5432/boochat
LLAMA_SWAP_URL=http://100.101.41.16:8401
PROJECT_ROOT_WHITELIST=/opt
BOOTSTRAP_ROOT=/opt/projects
@@ -10,3 +10,17 @@ POSTGRES_PASSWORD=CHANGE_ME
# Internal Tailscale address that bypasses Authelia. Override if you
# point BooCode at a different SearXNG instance.
SEARXNG_URL=http://100.114.205.53:8888
# Task model: lightweight model for auto-naming, search rewrite, etc.
# Direct llama-server instance (NOT llama-swap). Falls back to LLAMA_SWAP_URL
# with FAST_MODEL when unset.
# TASK_MODEL_URL=http://100.90.172.55:7995
# v1.13.15-tools: BOOCODE_TOOLS narrows the tool whitelist sent to the LLM.
# Unset (default) → all tools (~21k schema). Useful primarily for single-purpose
# sessions where the model only needs read-only filesystem access.
#
# core → view_file, list_dir, grep, find_files (~2k)
# standard → core + web_*, git_status, all 8 codecontext_* tools (~10k)
# all → every tool in ALL_TOOLS (~21k)
# BOOCODE_TOOLS=all

13
.gitignore vendored

@@ -1,9 +1,20 @@
node_modules
dist
.env
# Claude / Cursor (local agent & IDE config — CLAUDE.md and AGENTS.md stay tracked)
.claude/
.cursor/
.cursorignore
CLAUDE.local.md
*.log
.DS_Store
.vite
coverage
secrets/
data/
data/*
!data/AGENTS.md
!data/skills/
!data/mcp.json
!data/coder-providers.example.json
codecontext/fork.tar.gz

View File

@@ -1,7 +1,5 @@
# BooChat
You are the assistant running inside BooChat — a self-hosted developer chat app.
## Capabilities
- Read-only file tools: `view_file`, `list_dir`, `grep`, `find_files`
@@ -28,6 +26,25 @@ You are the assistant running inside BooChat — a self-hosted developer chat ap
- Cite file paths + line numbers for any claim about the codebase
- When uncertain about scope or intent, surface options via `ask_user_input` rather than guessing
- Prefer codecontext (`search_symbols`, `get_symbol_info`, `get_dependencies`) over `grep` for symbol-level questions. Fall back to `grep` / `view_file` when codecontext returns degraded or empty results — that signals an unsupported language or parse failure.
- Verify before reporting work complete: run the relevant test/build/smoke command and confirm output matches the claim. Evidence first, assertion second.
## Output format
- Stay in Markdown by default for every reply, short or long.
- Switch to a self-contained `<!DOCTYPE html>...</html>` artifact only when the user explicitly asks (e.g. "render this as HTML", "make me a dashboard", "build an interactive diagram"). Detection is opportunistic — the BooChat backend tags the assistant message as an HTML artifact, opens it in a sandboxed pane, and offers Download. Do not emit HTML unprompted; long Markdown is the right answer for most explanatory output.
- When asked to produce HTML, avoid generic AI aesthetics: no excessive centered layouts, no purple gradients, no uniform rounded corners, no Inter font. Prefer interactive controls (sliders / knobs / SVG / side-by-side diffs) over passive prose-in-HTML. Pattern reference: claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html (Thariq Shihipar, May 2026).
- The HTML artifact is rendered in a sandboxed iframe with `connect-src 'none'`: `fetch()`, WebSockets, and tracking pixels do not work. All logic must be client-side.
## Convention: rules vs recipes
Always-true rules (process discipline, refusals, behavior contracts) live here in `BOOCHAT.md` — and in `BOOCODER.md` / `CLAUDE.md` per their scopes — where they are 100% present in every turn. On-demand recipes (specific procedures, scaffolds, checklists) live in `/data/skills/` and invoke roughly 6% of the time in clean multi-turn flow (Codeminer42 measurement, 2026). Don't file workflow rules as skills — they silently misfire. See Anthropic agent-skills best-practices (platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices) for the canonical conventions.
## Verification discipline
- When assessing implementation status, verify against the running container (`curl /api/health`) and latest git commit (`git log --oneline -3`), not just source file contents. Source files can be mid-edit. The deployed state is the truth.
- Never count `dist/` directory sizes as source lines. Only count `src/**/*.ts` files. Compiled output is inflated by inlined types and transpilation artifacts.
- Before claiming a feature works, run the actual command and show the output. "Should work" is not verification. Acceptable evidence: test output (`pnpm test`), build output (`pnpm build`), curl response, docker logs, `\d tablename` output. If you can't run it, say so explicitly — don't assert success without evidence.
- When reporting counts (tools, tests, files, routes, lines), derive the number from a command (`grep -c`, `wc -l`, test runner output) — not from memory or approximation.
## Known limitations

View File

@@ -1,24 +1,117 @@
# BooCoder
# BooCoder — Container Guidance
> (Stub. v2.0 implementation pending. This file documents the intended contract.)
You are BooCoder, a write-capable coding agent. You can read AND modify files within the project scope.
You are the assistant running inside BooCoder — the write-capable companion to BooChat.
## You can
## Capabilities
- Read files (view_file, list_dir, grep, find_files)
- Edit files (edit_file, create_file, delete_file) — all changes queue in pending_changes
- Apply pending changes to disk (apply_pending)
- Revert applied changes (rewind)
- Dispatch tasks to external agents (dispatch_external_agent)
- Use MCP tools from configured servers
- Everything in `BOOCHAT.md`
- Write tools (pending): `write_file`, `edit_file`, `delete_file` (all gated through pending-changes sandbox)
- Shell (pending): `run_command` (Docker-isolated per-session)
## You cannot
## Constraints
- Write outside the project root (path-guard enforced)
- Write to secret files (.env, *.pem, id_rsa*, credentials.json)
- Apply changes without explicit user approval (unless auto-apply is enabled per task)
- Push to git remotes
- Access the internet except via configured MCP servers
- All writes land in a pending-changes virtual layer; nothing touches the real filesystem until `/apply`
- `run_command` executes inside the session sandbox, not the host
- No git commits, pushes, or pulls — Sam owns those
- Stop and ask before destructive operations (delete, overwrite, recreate)
## Pending changes discipline
Every file modification queues in `pending_changes` before touching disk. The user sees a diff preview and approves/rejects each change. Never bypass this queue — it is the safety boundary between inference and the filesystem.
## Behavior
- Show a diff preview before any write
- Group related edits into a single `/apply` batch
- If a tool fails, surface the error verbatim — don't paper over it
- Show diffs clearly. Explain what you're changing and why.
- For multi-file changes, organize as a logical unit (one task = one coherent change set).
- If uncertain about scope, use smaller edits and verify between steps.
- Cite file paths + line numbers for context.
- Verify before reporting work complete: run the relevant test/build/smoke and confirm output matches the claim. Evidence first, assertion second.
## Verification discipline
- When assessing implementation status, verify against the running container (`curl /api/health`) and latest git commit (`git log --oneline -3`), not just source file contents. Source files can be mid-edit. The deployed state is the truth.
- Never count `dist/` directory sizes as source lines. Only count `src/**/*.ts` files. Compiled output is inflated by inlined types and transpilation artifacts.
- Before claiming a feature works, run the actual command and show the output. "Should work" is not verification. Acceptable evidence: test output (`pnpm test`), build output (`pnpm build`), curl response, docker logs, `\d tablename` output. If you can't run it, say so explicitly — don't assert success without evidence.
- When reporting counts (tools, tests, files, routes, lines), derive the number from a command (`grep -c`, `wc -l`, test runner output) — not from memory or approximation.
## Provider lifecycle (v2.3)
BooCoder's coding agents are a **config-backed registry**: built-ins live in `provider-registry.ts`, and `data/coder-providers.json` layers overrides + custom entries on top. Registration ≠ installation — the config lists what you *want*; a probe reports what's *ready*.
### Config file: `data/coder-providers.json`
Resolved from `CODER_PROVIDERS_PATH` (default `/data/coder-providers.json`; dev/host path `/opt/boocode/data/coder-providers.json`). It is **gitignored** — it's live runtime config that the coder reads *and writes* (UI toggles `PATCH` it), so tracking it would churn `git status`. The tracked reference is `data/coder-providers.example.json`; copy it to `coder-providers.json` to seed overrides. A missing file, invalid JSON, or a schema mismatch all fall back to built-ins-only — loading never throws at startup.
```json
{
"providers": {
"goose": { "enabled": false },
"amp-acp": {
"extends": "acp",
"label": "Amp",
"description": "ACP wrapper for Amp",
"command": ["amp-acp"],
"enabled": true
}
}
}
```
Per-provider override fields (all optional):
| Field | Meaning |
|-------|---------|
| `extends` | `"acp"` — required for a NEW (custom) provider; built-in overrides omit it |
| `label` | Display name (required for custom) |
| `description` | Sub-label shown in the picker / settings |
| `command` | `[binary, ...args]` to spawn (required for custom; overrides a built-in's default argv) |
| `env` | Extra env vars merged into the spawn |
| `enabled` | Default `true`; `false` hides it from the composer |
| `order` | UI sort key |
| `models` / `additionalModels` | Replace / merge onto the discovered model list |
A PATCH to one provider id **replaces that id's override object wholesale** (the merge is shallow and per-id, not per-field), so to flip a single field, resend the rest of the object with it; a `null` value for an id deletes its override (reverts to the built-in default).
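The wholesale-replace rule is easy to trip over, so here is a minimal sketch of the semantics described above. The type names, the flat `{ id: override | null }` patch shape, and both helper functions are assumptions for illustration, not the coder service's actual code.

```typescript
// Hypothetical shapes for illustration only; the real schema lives in the coder service.
type ProviderOverride = {
  extends?: "acp";
  label?: string;
  description?: string;
  command?: string[];
  env?: Record<string, string>;
  enabled?: boolean;
  order?: number;
};
type ProvidersConfig = { providers: Record<string, ProviderOverride> };
type ProvidersPatch = Record<string, ProviderOverride | null>;

// Apply a patch: each id's override is replaced as a whole, and null deletes it.
function applyProvidersPatch(config: ProvidersConfig, patch: ProvidersPatch): ProvidersConfig {
  const providers: Record<string, ProviderOverride> = { ...config.providers };
  for (const id of Object.keys(patch)) {
    const override = patch[id];
    if (override === null || override === undefined) {
      delete providers[id]; // revert to the built-in default
    } else {
      providers[id] = override; // no per-field merge
    }
  }
  return { providers };
}

// Caller-side helper: flip `enabled` while resending every existing field.
function withEnabled(config: ProvidersConfig, id: string, enabled: boolean): ProvidersPatch {
  return { [id]: { ...(config.providers[id] ?? {}), enabled } };
}
```

Sending `{ "amp-acp": { "enabled": false } }` on its own would drop the custom entry's `command`; building the patch with a helper like `withEnabled` keeps it.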
### Refresh contract
The snapshot is cached and a provider's cold ACP probe (tier-2) is **skipped** while `available_agents.last_probed_at` is younger than `PROVIDER_PROBE_TTL_MS` (default `86400000` = 24h). Opening the composer is therefore fast and does not re-probe. To force a cold re-probe (after installing a CLI or editing models): **`POST /api/providers/refresh`** (the Refresh button in the Providers settings tab), which clears the cache and re-probes.
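The TTL rule above amounts to a single comparison. A minimal sketch, assuming the probe timestamp is available as a `Date` (or `null` when the provider has never been probed); `isProbeFresh` is a hypothetical name, not a function from the codebase.

```typescript
const DEFAULT_PROBE_TTL_MS = 86_400_000; // 24h, the documented default

// True when the cold (tier-2) probe can be skipped for this provider.
function isProbeFresh(
  lastProbedAt: Date | null,
  now: Date,
  ttlMs: number = DEFAULT_PROBE_TTL_MS,
): boolean {
  if (lastProbedAt === null) return false; // never probed: must probe
  return now.getTime() - lastProbedAt.getTime() < ttlMs;
}
```

A forced refresh would bypass this check by clearing the cached timestamp first.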
### Enable / disable
Two ways:
- **Settings → Providers tab**: open the sidebar → **Settings** → **Providers**, then toggle a provider on/off, refresh it, or open its diagnostic. (Earlier builds exposed a gear in the composer; that control was moved into Settings.)
- **Edit the config** (`"enabled": false`) then `POST /api/providers/refresh`.
A **disabled** provider leaves the composer's provider picker but stays listed in the Providers tab (status "Disabled") so you can re-enable it. **Native `boocode` is always-on** — an `enabled:false` on it is ignored (with a warn log) and it is never rendered as toggleable.
### Adding a custom ACP provider
- **Catalog modal**: Providers tab → **Add provider** → pick an entry → it PATCHes the config (`extends:'acp'` + label + command, enabled) and refreshes that provider.
- **Hand-edit** `data/coder-providers.json`: add an id with `extends:'acp'`, `label`, and `command`, then `POST /api/providers/refresh`.
Either way, **adding to config does NOT install the binary.** Until the CLI is on `PATH` the provider shows **"Not installed"** (status `unavailable`) and does not appear in the composer picker.
### Known limitation — subset refresh
`POST /api/providers/refresh` accepts an optional `{ "providers": ["id", ...] }` body and returns a `refreshed` count scoped to that subset — **but the underlying cold re-probe currently covers ALL installed providers**, not just the requested subset. True per-provider force is a future change (it needs a snapshot-internal parameter). This is intentional for now, not a bug: a subset refresh still re-probes everything; only the reported count is scoped.
### Deploy + smoke
Two deploy targets:
- **Routes (host service):** `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
- **Web UI (container):** `docker compose up --build -d boocode`
Green gate (verified across phases 1-5): `pnpm -C apps/coder test && pnpm -C apps/coder build` (134 tests passing).
Smoke (via Tailscale):
```bash
curl http://100.114.205.53:9502/api/providers/snapshot # lists every registered provider
curl http://100.114.205.53:9500/api/coder/providers/config # raw config, through the BooChat proxy
# Settings → Providers: disable goose → it leaves the composer picker, stays in the tab
# POST refresh → models repopulate; Add a catalog entry → it appears after refresh (unavailable until its CLI is installed)
```

CHANGELOG.md Normal file

@@ -0,0 +1,375 @@
# Changelog
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
## v2.7.3-sampling-streamjson-tokens — 2026-06-01
Three small BooCode wins from `boocode_code_review_v2.md` §1 #11/#7/#8. **Sampling knobs:** per-agent `top_n_sigma` + the `dry_*` repetition family (`dry_multiplier`/`dry_base`/`dry_allowed_length`/`dry_penalty_last_n`) are now first-class Agent frontmatter fields, parsed in `agents.ts` and threaded into the llama-swap chat-completion body via `providerOptions.openaiCompatible` (the `@ai-sdk/openai-compatible` extra-body channel). This surfaced and fixed a **latent bug**: `top_k` (rejected by the AI-SDK provider as unsupported) and `min_p` (never passed to `streamText` at all) had been dead on the wire — no agent's `top_k`/`min_p` ever affected sampling; both now route through the same channel, so agents that set them will start using them. `--reasoning-budget` is documented in `data/AGENTS.md` (already works via `llama_extra_args`, permitted by the deny-list validator). **Live PTY stream-json:** qwen/claude PTY dispatch sliced stdout opaque; a new `stream-json-parser.ts` line-buffers the Claude-Code-compatible NDJSON and emits text/reasoning/tool frames live as they arrive (mirroring the ACP/opencode paths) + persists the structured parts, with a clean fallback to the old opaque slice when output isn't NDJSON (claude now runs `--output-format stream-json --verbose`). **Token UI:** the per-`(chat,agent)` `agent_sessions.input_tokens`/`output_tokens`/`cost` columns (accumulated since `v2.6.8` but dropped by the read route + wire type) now flow through and render condensed beside the AgentComposerBar session chip. Built by three parallel agents over disjoint subsystems; server 523 + coder 245 tests passing (incl. 11 new stream-json-parser + new agent-parse tests), all builds + web tsc clean. Builds on `v2.7.2-checkpoint-idor`; openspec `sampling-streamjson-tokens`. The qwen-vs-claude `usage` field names in #7 are best-guess pending a live smoke.
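The line-buffering idea behind the stream-json path can be sketched as follows. This is an illustrative reduction, not the contents of `stream-json-parser.ts`: the `Frame` type and `makeLineBuffer` are hypothetical, and the fallback here is per line, whereas the entry above describes falling back to the old opaque slice.

```typescript
// Hypothetical frame type: parsed NDJSON records, or raw text when a line is not JSON.
type Frame = { kind: "json"; value: unknown } | { kind: "raw"; text: string };

function makeLineBuffer(onFrame: (frame: Frame) => void) {
  let pending = ""; // bytes received since the last newline

  const emitLine = (line: string): void => {
    const trimmed = line.trim();
    if (trimmed === "") return;
    try {
      onFrame({ kind: "json", value: JSON.parse(trimmed) });
    } catch {
      onFrame({ kind: "raw", text: line }); // not JSON: pass through untouched
    }
  };

  return {
    // Feed one stdout chunk; complete lines are emitted immediately.
    push(chunk: string): void {
      pending += chunk;
      let nl = pending.indexOf("\n");
      while (nl !== -1) {
        emitLine(pending.slice(0, nl));
        pending = pending.slice(nl + 1);
        nl = pending.indexOf("\n");
      }
    },
    // Call once at process exit to emit a final unterminated line.
    flush(): void {
      if (pending !== "") emitLine(pending);
      pending = "";
    },
  };
}
```

Because a chunk boundary can fall anywhere, only text up to the last newline is parsed on each `push`; the remainder waits for the next chunk or for `flush`.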
## v2.7.2-checkpoint-idor — 2026-06-01
Closes two IDOR authorization holes in the `v2.7.1-write-edit-robustness` checkpoint routes, flagged by the automated push security review. The `GET /api/sessions/:id/checkpoints?chat_id=` list route scoped its `chat_id` branch by `chat_id` alone — any session's `chat_id` would read its checkpoints; it now joins through `chats` and gates on `chats.session_id` (authoritative; `checkpoints.session_id` is a nullable denormalized hint). The `restoreCheckpoint` scope guard was fail-open — `cp.session_id && cp.session_id !== sessionId` fell through whenever the checkpoint's denormalized `session_id` was null, allowing a cross-session restore (worktree reset + transcript trim) — it now resolves the owning session via the checkpoint's chat and denies on any missing-or-mismatched row. A DB-integration regression covers the exact null-`session_id` cross-session case. Real-world blast radius is small (BooCoder is single-user behind Authelia on loopback), but both are genuine authorization bugs. Coder suite 234 passing (7/7 checkpoint tests incl. the regression against live postgres+git), typecheck clean. Hotfix on `v2.7.1-write-edit-robustness`.
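The difference between the fail-open guard and its fail-closed replacement is clearest side by side. A minimal sketch: the `Checkpoint` shape and the in-memory `chatOwners` map stand in for the real rows and the join through `chats`, and both function names are hypothetical.

```typescript
// Hypothetical row shape; session_id is the nullable denormalized hint.
type Checkpoint = { id: string; chat_id: string; session_id: string | null };

// Old shape: a null session_id short-circuits the check, so the restore is allowed.
function isDeniedFailOpen(cp: Checkpoint, sessionId: string): boolean {
  return Boolean(cp.session_id && cp.session_id !== sessionId);
}

// New shape: resolve the owner through the checkpoint's chat and
// deny when the owner is missing or does not match.
function isDeniedFailClosed(
  cp: Checkpoint,
  sessionId: string,
  chatOwners: Map<string, string>, // chat_id -> owning session_id
): boolean {
  const owner = chatOwners.get(cp.chat_id);
  return owner === undefined || owner !== sessionId;
}
```

With a checkpoint whose `session_id` is null, the first guard lets any session through, while the second denies everyone except the chat's owning session.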
## v2.7.1-write-edit-robustness — 2026-06-01
Two BooCoder hardening features for local quantized models, algorithm-reimplemented (not vendored) from the cline findings in `boocode_code_review_v2.md` §1 #3/#4. **Fuzzy patch applier:** `edit_file`'s apply path was exact-`.includes`-or-throw + first-occurrence `.replace` (`pending_changes.ts`), so a qwen3.6 whitespace/indentation/unicode drift in `old_string` lost the edit; a new pure `fuzzy-match.ts` (`locateMatch`) now runs an exact → per-line-trim → unicode-canon (curly quotes/dashes/nbsp) → Levenshtein-≥0.66 ladder and returns the real file span, refusing multi-exact matches as ambiguous rather than silently editing the first. `applyOne`/`rewindOne` both use it. **Worktree checkpoints + conversation-trim:** `rewind` only reversed BooCode's own `pending_changes`, blind to what external agents (opencode/goose/qwen/claude) write directly into the session worktree — so a new `checkpoints` table + `checkpoints.ts` shadow-commit (tracked **and** untracked, captured via a temp-index `read-tree`/`add`/`write-tree`/`commit-tree` into a GC-safe `refs/boocode/checkpoints/<id>`) snapshots the worktree before each external-agent turn (hooked into all three dispatcher paths), anchored to the turn's assistant message. A new `POST /api/sessions/:id/checkpoints/:cid/restore` resets the worktree (`reset --hard` + `clean -fd`), trims the transcript past that message, and resets the `(chat,agent)` backend session so files, transcript, and agent context land consistent at the restore point; a per-message "Restore to here" affordance in `CoderMessageList` drives it. Built by three parallel agents over disjoint files; DB-integration testing caught a microsecond-`created_at` self-deletion bug in the later-checkpoint cleanup. Full coder suite 234 passing (incl. 17 fuzzy-match + 6 checkpoint tests), server+coder build + web tsc clean. Builds on `v2.7.0-mit`; openspec `write-edit-robustness`. Live host smoke (dispatcher hook + restore UI end-to-end) still to run.
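The matching ladder can be sketched in a few dozen lines. This is an illustrative reimplementation of the idea, not the contents of `fuzzy-match.ts`: the `Span` type, the line-window search, and the similarity formula (`1 - distance / maxLength`) are assumptions; only the rung order, the 0.66 threshold, and the refusal of multiple exact matches come from the entry above.

```typescript
type Span = { start: number; end: number }; // character offsets into the file text

// Canonicalize curly quotes, dashes, and non-breaking spaces.
const canon = (s: string): string =>
  s.replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"')
    .replace(/[\u2013\u2014]/g, "-").replace(/\u00A0/g, " ");

function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const cur = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur[j] = Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = cur;
  }
  return prev[b.length];
}

function locateMatch(file: string, needle: string): Span | null {
  // Rung 1: exact substring; more than one hit is refused as ambiguous.
  const first = file.indexOf(needle);
  if (first !== -1) {
    if (file.indexOf(needle, first + 1) !== -1) throw new Error("ambiguous: multiple exact matches");
    return { start: first, end: first + needle.length };
  }
  const fileLines = file.split("\n");
  const needleLines = needle.split("\n");
  const n = needleLines.length;
  const offsets: number[] = []; // offsets[i] = character offset of line i
  let pos = 0;
  for (const line of fileLines) { offsets.push(pos); pos += line.length + 1; }
  const spanOf = (i: number): Span =>
    ({ start: offsets[i], end: offsets[i + n - 1] + fileLines[i + n - 1].length });

  // Rungs 2 and 3: per-line trim, then unicode canonicalization plus trim.
  const rungs: Array<(a: string, b: string) => boolean> = [
    (a, b) => a.trim() === b.trim(),
    (a, b) => canon(a).trim() === canon(b).trim(),
  ];
  for (const same of rungs) {
    for (let i = 0; i + n <= fileLines.length; i++) {
      if (needleLines.every((line, k) => same(fileLines[i + k], line))) return spanOf(i);
    }
  }
  // Rung 4: best same-height window with similarity >= 0.66.
  const target = canon(needle);
  let best: { i: number; score: number } | null = null;
  for (let i = 0; i + n <= fileLines.length; i++) {
    const win = canon(fileLines.slice(i, i + n).join("\n"));
    const score = 1 - levenshtein(win, target) / Math.max(win.length, target.length, 1);
    if (score >= 0.66 && (best === null || score > best.score)) best = { i, score };
  }
  return best ? spanOf(best.i) : null;
}
```

The function returns the span of the text actually present in the file, so the caller replaces what is really on disk rather than the model's drifted `old_string`.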
## v2.7.0-mit — 2026-06-01
Relicenses BooCode from AGPL-3.0 back to MIT by clearing the three Unsloth-Studio-derived files the `v2.4.0`/`v2.4.1` lifts pulled in — the root `LICENSE` and all five `package.json` had been `AGPL-3.0-only`, making the network-served work AGPL §13-encumbered. The enabling finding decoupled the relicense from the long-planned native-llama-server-parsing retirement: `tool-call-parser.ts`'s Unsloth-ported algorithm (`parseToolCallsFromText`/`scanBalancedBraces` + unused nudge constants) was **dead code** with no production import, so it was simply deleted while the load-bearing `extractToolCallBlocks`/`stripToolMarkup` (BooCode-authored streaming helpers) were kept byte-identical — no behavior change to the live tool-call path. `html-to-md.ts` was swapped to the MIT `node-html-markdown` library (`parse5` dropped; the only behavior delta is column-aligned tables, GFM hard-break `<br>`, and `<ol start>` renumbering, all feeding the LLM via `web_fetch`), and `llama-args-validator.ts` was clean-room rewritten with the managed-flag denylist re-derived from the public llama-server flag list (facts, not copyrightable). The license flip set `LICENSE` to MIT (`Copyright (c) 2026 indifferentketchup`), the five `package.json` to `MIT`, removed every AGPL SPDX header, added a README License section, and added a `license-mit` guard test that fails if AGPL provenance returns. Built by three parallel agents over the disjoint files; full server suite 519 passing (incl. 9 new guard tests), server build + coder typecheck clean. Resolves `boocode_code_review_v2.md` §1 #1 / §5k and the roadmap's `License-debt` batch (openspec `license-debt-mit`); supersedes that batch's original staged plan, which had entangled the flip with a live qwen3.6 validation window.
## v2.6.11-close-hooks-staging — 2026-06-01
The two v2.6 follow-ups left after `v2.6.10-lifecycle-hardening`. **Server close-hook caller:** `apps/server` (BooChat) now fire-and-forgets BooCoder's Phase-3 close hooks so warm agent backends + worktrees tear down *immediately* on delete/archive instead of waiting for the idle-evict/reaper backstop — a new `coder-notify.ts` `notifyCoderClose(kind,id)` (reusing the v2.6.2 `BOOCODER_URL` reach, never-rejects) is `void`-called after the WS frame at session-delete (`POST /api/sessions/:id/close`) and chat archive / archive-all / delete (`POST /api/chats/:id/close`); an unreachable coder can never block or fail the user's delete/archive. **Staging-boundary hint (task 3.7):** the BooCoder DiffPanel now shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits — native boocode selected + external-agent-staged changes (or vice-versa) → "<agent>'s edits live in its worktree — BooCode won't see them until applied" — derived purely from the per-change `agent` + current provider, no new state. 6 new server tests (`coder-notify`), 537 server tests pass; web + server tsc/build clean. **With these the v2.6 openspec is fully closed** — only the live Smoke 2/2b/3 remain (manual exercise).
## v2.6.10-lifecycle-hardening — 2026-06-01
v2.6 Phase 3 (the last phase): lifecycle hardening of the warm-process backends. **Idle eviction + LRU cap:** the agent pool runs a 60s sweep that evicts backends/sessions idle past `AGENT_POOL_IDLE_TTL_MS` (30 min default) and any beyond `AGENT_POOL_MAX_LIVE` (10, LRU), but **never a busy one** (in-flight turn, double-checked via a new `isBusy()` backend hook); the worktree persists (DB-backed) and the next turn re-spawns + reattaches. The eviction/LRU/restart decisions are factored into a pure `lifecycle-decisions.ts` (modeled on the inference `selectPruneTargets` pattern). **Crash recovery:** lifts openchamber's health-monitor + busy-aware-restart + consecutive-failure + stale-busy-grace state machine into `opencode-server.ts` (with port reclaim) and `warm-acp.ts`: an opencode server crash settles in-flight turns as failed, marks the rows `crashed`, and recreates fresh sessions (a fresh server can't hold the old in-memory id), while a warm-ACP child crash re-`session/new`s next turn; the F.1 turn-guard and U.6 usage are preserved (their tests still pass). **Worktree reaper:** a periodic reaper removes orphan on-disk worktrees (no live `worktrees` row, 1h grace) behind a superset-style preflight that skips dirty/unpushed/unmerged work, with Paseo-style soft-delete (`status='archived'`). Plus close hooks (`/api/chats/:id/close`, `/api/sessions/:id/close`, awaiting the apps/server caller) and diff re-baseline after `apply_pending`. Built test-first: 35 new tests (`lifecycle-decisions` 22, `agent-pool` 13) + a DB-opt-in reconnect integration test; 215 coder tests pass, tsc + build clean. **This completes v2.6** (Phases 0-3 + F.1 + Phase 1-UX). Remaining follow-ups (out of v2.6 scope): the apps/server close-hook caller, the 3.7 DiffPanel staging-boundary hint (frontend), and live Smoke 2/2b/3.
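A pure eviction decision of the kind described for the sweep might look like the sketch below. It is not the contents of `lifecycle-decisions.ts`: `PoolEntry` and `selectEvictionTargets` are hypothetical names, and the order of the two passes (idle TTL first, then LRU cap over the survivors) is an assumption.

```typescript
// Hypothetical pool entry; timestamps are epoch milliseconds.
type PoolEntry = { key: string; lastActiveAt: number; busy: boolean };

function selectEvictionTargets(
  entries: PoolEntry[],
  now: number,
  idleTtlMs: number = 30 * 60 * 1000, // documented default: 30 min
  maxLive: number = 10,               // documented default cap
): string[] {
  const evict = new Set<string>();

  // Pass 1: anything idle past the TTL, unless it is busy.
  for (const e of entries) {
    if (!e.busy && now - e.lastActiveAt > idleTtlMs) evict.add(e.key);
  }

  // Pass 2: enforce the cap on what is left, least recently used first,
  // still never touching a busy entry.
  const survivors = entries.filter((e) => !evict.has(e.key));
  let excess = survivors.length - maxLive;
  const idleOldestFirst = survivors
    .filter((e) => !e.busy)
    .sort((a, b) => a.lastActiveAt - b.lastActiveAt);
  for (const e of idleOldestFirst) {
    if (excess <= 0) break;
    evict.add(e.key);
    excess--;
  }
  return Array.from(evict);
}
```

Keeping the decision pure (inputs in, keys out) means the sweep timer and the actual teardown can be tested separately from the policy.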
## v2.6.9-warm-acp — 2026-05-31
v2.6 Phase 2: goose and qwen now run as **warm ACP backends** instead of one-shot-per-task. A new `WarmAcpBackend` (`backends/warm-acp.ts`, implementing the same `AgentBackend` interface as the opencode warm server) holds one persistent `goose acp` / `qwen --acp` child + `ClientSideConnection` + ACP session per `(chat, agent)`, running `initialize` + `session/new` once and reusing the connection across turns; per-turn abort cancels the in-flight prompt (`session/cancel`) without killing the child, and a child exit marks `agent_sessions.status='crashed'` for re-spawn on the next turn. The dispatcher routes `goose`/`qwen` chat-tab tasks to the pooled warm backend via a pure `shouldUseWarmBackend(task)` predicate (warm only when both `session_id` and `chat_id` are set), keeping the one-shot `runExternalAgent` path as the fallback for session-less creators (arena, MCP, `new_task`); broker frames + `persistExternalAgentTurn` + the latest-wins `pending_changes` diff are identical to the opencode path. The `acp-dispatch.ts` `handleSessionUpdate` switch was extracted into a pure shared `acp-event-map.ts` mapper used by both the one-shot and warm paths (one-shot behavior byte-identical, all existing acp tests green). The design's `unstable_resumeSession` concern is resolved — the installed `@agentclientprotocol/sdk@^0.22.1` exposes stable `resumeSession`/`loadSession`, but resume is moot in the hot path (warm reuse needs none); cross-restart resume + idle eviction are deferred to Phase 3. Built test-first (15 new tests: `warm-acp-routing`, `acp-event-map`); 180 coder tests pass, tsc + build clean. **Smoke 2/2b (live two-message warm reuse + the opencode→boocode→opencode switch round-trip) to be run post-deploy.** Phase 3 (lifecycle hardening) is the last v2.6 phase.
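The routing predicate is small enough to show in full. A sketch under assumptions: the `Task` shape is reduced to the three relevant fields, and whether the real `shouldUseWarmBackend` also checks the agent name (as done here) is a guess.

```typescript
// Hypothetical, reduced task shape.
type Task = { agent: string; session_id: string | null; chat_id: string | null };

const WARM_ACP_AGENTS = new Set(["goose", "qwen"]);

// Warm only when the task comes from a chat tab: both ids must be present.
function shouldUseWarmBackend(task: Task): boolean {
  return WARM_ACP_AGENTS.has(task.agent) && task.session_id !== null && task.chat_id !== null;
}
```

Session-less creators (arena, MCP, `new_task`) leave one of the ids null and therefore stay on the one-shot path.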
## v2.6.8-agent-attribution — 2026-05-31
v2.6 Phase 1-UX: agent attribution + switch affordances over the already-shipped `pending_changes.agent` column and `agent_sessions` table (read+display, no new backend capability). **Backend:** `pending_changes.agent` is now stamped at every queue site (native write tools → `'boocode'`, dispatched external agents → the task's agent, manual RightRail create → `NULL`) and flows through `listPending`; a new `GET /api/sessions/:id/agent-sessions` route returns `[{agent,status,has_session,last_active_at}]` per `(chat,agent)` for the session's chats; and the opencode warm-server backend consumes opencode's `session.next.step.ended` events, accumulating `input_tokens`/`output_tokens`/`cost` onto the `agent_sessions` row (new columns, idempotent). **Frontend:** the BooCoder DiffPanel renders a per-row agent badge (provider icon + label; `null` → "manual") with a "Changes from X, Y" note when a pending set spans multiple agents, and the AgentComposerBar shows a resumed / history / new-session chip beside the Provider picker (gated on an optional `sessionId` prop so BooChat is unaffected), driven by a new `useAgentSessions` hook that refetches on message-complete; `providerIcon` was extracted to a shared `components/coder/providerIcons.tsx`. Built by three parallel subagents over disjoint file sets; web + coder typecheck clean, 165 coder tests pass (9 new across `opencode-usage` and `agent-sessions.routes`). U.6's persisted token totals are conversation-cumulative and not yet surfaced in the UI (deferred). Implements the U.1-U.6 "remaining" plan from the v2.6 openspec reconciliation; Phase 2 (warm ACP goose/qwen) + Phase 3 (lifecycle hardening) remain.
## v2.6.7-interrupt-guard — 2026-05-31
Fixes a post-interrupt correctness bug in the `v2.6.1-phase1-opencode` warm-server backend, made one-click reachable by `v2.6.5-panes-tabs-composer`'s Send→Stop composer. `opencode-server.ts` settled an in-flight turn on opencode's `session.idle`/`session.error` by calling `activeTurn.settle()` on whatever turn currently held the session slot — but opencode emits one trailing terminal event for a *cancelled* turn after `client.session.abort()`, and those events carry only a `sessionID` (no turn id). So after the user hit Stop and immediately sent another message, the aborted turn's orphan `session.idle` settled the *new* turn early as success (Paseo hit and fixed the same class in `1d38aac`). The fix adds a small pure guard (`turn-guard.ts`: `armAbortGuard`/`noteTurnActivity`/`consumeTerminal` over a per-session `swallowNextTerminal` flag): abort arms it, the next terminal is swallowed once, and a new turn's first delta self-heals the flag so a never-arriving orphan can't strand a real turn. Implemented test-first — three regression tests in `turn-guard.test.ts` (swallow-the-orphan, settle-when-no-abort, self-heal); full coder suite green (156 passed). This is the F.1 "fix-next" item from the v2.6 openspec reconciliation; Phase 1-UX / Phase 2 / Phase 3 remain.
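The guard described above reduces to one boolean per session. A minimal sketch: the function names and the flag name come from the entry, but the `GuardState` object and the exact signatures are assumptions.

```typescript
// Hypothetical per-session guard state.
type GuardState = { swallowNextTerminal: boolean };

// Called when the user aborts: the next terminal event belongs to the cancelled turn.
function armAbortGuard(state: GuardState): void {
  state.swallowNextTerminal = true;
}

// Called on a new turn's first delta: if the orphan never arrived, stop waiting for it.
function noteTurnActivity(state: GuardState): void {
  state.swallowNextTerminal = false;
}

// Called on session.idle / session.error. Returns true when the active turn should settle.
function consumeTerminal(state: GuardState): boolean {
  if (state.swallowNextTerminal) {
    state.swallowNextTerminal = false; // swallow exactly once
    return false;
  }
  return true;
}
```

The self-heal in `noteTurnActivity` is what keeps a never-arriving orphan from swallowing the new turn's genuine terminal event.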
## v2.6.6-claude-md — 2026-05-31
Docs-only — CLAUDE.md session-learnings update, no code. Captures four recurring gotchas surfaced while shipping `v2.6.5-panes-tabs-composer`: (1) `sessions.workspace_panes` is now a `WorkspaceState` envelope (`panes` + `tabNumbers`/`nextTabNumber` + `closedPaneStack`), migrated from the legacy bare `WorkspacePane[]` on both frontend hydrate (`toWorkspaceState`) and the union-accepting server PATCH validator; (2) DB/session-aware tools take an optional `ToolExecCtx` (`{ sql, sessionId }`) 4th arg on `ToolDef.execute`, plumbed through the tool phase, with `read_tab_by_number` as the reference; (3) the two-schema-files-one-DB ownership split — `apps/coder/src/schema.sql` owns `agent_sessions`/`worktrees`/`pending_changes`/`available_agents` and extends `tasks`, distinct from BooChat's `apps/server/src/schema.sql` — plus the idempotent `confdeltype` FK-action-flip pattern (guard `ON DELETE` changes on `pg_constraint.confdeltype` so re-runs no-op); and (4) React StrictMode is on, so a `setState` called inside another `setState`'s updater double-fires in dev and must be made idempotent. Pairs with `v2.6.5-panes-tabs-composer`.
## v2.6.5-panes-tabs-composer — 2026-05-31
A workspace UX batch across BooChat panes, tabs, and the composer, plus the persistence model that backs them. **Panes & tabs:** a chat can be opened in a fresh pane (the ChatTabBar tab context menu's "Open in new pane", and the fork button — which now lands the fork beside the original via a new `open_chat_in_new_pane` event instead of replacing the active pane); the per-pane "+" became a New BooChat/BooTerm/BooCode menu; closing a chat pane relocates its tabs (in order) into the oldest chat/empty pane instead of discarding them, and reopen strips the restored chatIds from every live pane first so a relocated-then-reopened pane never duplicates a tab (no stack-shape change); each tab carries a stable session-scoped number assigned on open and retired on close (never reused), rendered map-keyed rather than positional. The per-message "Open in pane" artifact button was removed, and the empty/landing pane became a real session history — the session's open chats plus separately-fetched archived chats, click to open or restore-and-open. **Persistence:** `sessions.workspace_panes` was widened from a bare `WorkspacePane[]` to a `WorkspaceState` envelope (`panes` + `tabNumbers`/`nextTabNumber` + `closedPaneStack`) so tab numbers and the reopen stack survive reload; the PATCH validator accepts the legacy array or the envelope (zod union) and migrates on write, and the `session_workspace_updated` WS-frame schema was widened on both web and server (byte-identical, parity test green) — the same schema-drift class as `v2.6.4-agent-sessions-fk`. **Composer:** the send button morphs Send → Stop → Queue with generation state (BooCoder keys on `sending || activeTaskId`, which also corrected its queue gates and added `cancelTask`), the standalone "Stop generating" pill was folded into it, and pasted chips now trail the typed text so a leading slash command stays first. 
**Tooling:** adds the read-only `read_tab_by_number` tool — resolves a session-scoped tab number to its chat via the persisted `tabNumbers` map and returns that chat's transcript; tools gained an optional `ToolExecCtx` (`{ sql, sessionId }`) on `execute` to support DB-reading tools. Builds on `v2.6.4-agent-sessions-fk`.
## v2.6.4-agent-sessions-fk — 2026-05-31
Follow-up to `v2.6.3-chatkey-and-skills` (P1.5-b): the live `agent_sessions.session_id` foreign key is converged from `ON DELETE CASCADE` to `ON DELETE SET NULL`, matching the schema's stated intent. The P1.5-b re-key block re-adds `session_id_fkey` as `SET NULL`, but the whole block is guarded on `chat_id_fkey`'s absence — so a database already re-keyed to `(chat_id, agent)` while `session_id_fkey` was still `CASCADE` never re-enters it, leaving the live FK at `CASCADE` and diverging from both `worktree_id` (already `SET NULL`) and the `v2.6.3` changelog's own claim that `session_id` is informational `SET NULL`. The fix adds a standalone `confdeltype`-guarded `DO` block (mirroring the `session_worktrees` defang) that flips `session_id_fkey` `CASCADE → SET NULL` independently of the re-key gate; it is idempotent — fires only while the FK is still `'c'`, a no-op on a fresh deploy (already `'n'`) and on every re-run. The live DB was converged by hand with the identical statements, so `applySchema` and the hand-applied state match (`\d agent_sessions` now shows `session_id ... ON DELETE SET NULL`). Also bundles a CLAUDE.md doc-sync (committed separately): per-session SSE (P1.5-a) and the `(chat_id, agent)` re-key reflected in the engineering notes, the stale root `AGENTS.md` navigation pointer dropped, and new conventions for `data/AGENTS.md` parsing and the `data/skills/<vendor>/` layout.
## v2.6.3-chatkey-and-skills — 2026-05-31
Three threads. **agent_sessions re-keyed to `(chat_id, agent)` (P1.5-b):** the tab (a chat) is now the agent-context unit, so two opencode tabs in one BooCode session are two independent contexts that share one worktree. `chat_id` is threaded end-to-end: `tasks.chat_id` added, stamped by the coder message + skills routes from the frontend tab, read by `runOpenCodeServerTask` which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic `/api/tasks`) so `ensureSession` never receives a degenerate `(null, agent)` key. A new first-class `worktrees` table (one-per-session, survives session delete via `session_id ON DELETE SET NULL`) supersedes `session_worktrees`, which is defanged (CASCADE dropped, not yet removed); `agent_sessions.chat_id` CASCADEs from `chats` (closing a tab ends its context) while `worktree_id`/`session_id` are informational `SET NULL`. The migration is idempotent with a backfill-verify gate; the live re-key was applied against an empty table after the 35-chat test session `20d28876` was deleted (backed up first). This corrects and supersedes an earlier draft that wrongly keyed on `(worktree_id, agent)`; the delete-guard from `v2.6.2-delete-guard-and-sse` is repointed here from `session_worktrees` to `worktrees` (`worktree_path` → `path`). **dcp-strip cross-chunk fix:** the `<dcp-message-id>` tag streams split across SSE deltas, which the per-chunk strip from `v2.6.1-phase1-opencode` missed, so a stateful `makeDcpStreamStripper` at the dispatcher boundary holds back partial-tag tails so neither live frames nor persisted content carry the tag (11 unit tests). **Agent-judgment skills:** `committing-changes` (segment by concern, stage explicitly, present-and-stop, never push) and `using-worktrees` (the when-to-isolate heuristic, autonomous-when-clear vs committing's command-gate) land in `data/skills/boocode/` with eval.yamls, plus a parser-safe `data/AGENTS.md` preamble pointing at both.
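The hold-back technique behind a cross-chunk stripper can be sketched as below. This is a guess at the shape, not the real `makeDcpStreamStripper`: it assumes the tag arrives as a paired `<dcp-message-id>…</dcp-message-id>` element, which the entry above does not confirm, and the `push`/`flush` interface is hypothetical.

```typescript
// ASSUMPTION: the tag is a paired element; the actual wire format may differ.
const OPEN = "<dcp-message-id>";
const CLOSE = "</dcp-message-id>";

function makeDcpStreamStripper() {
  let held = ""; // text withheld because it may be part of a tag

  return {
    // Feed one delta; returns the text that is safe to emit now.
    push(chunk: string): string {
      let buf = held + chunk;
      held = "";
      let out = "";
      while (buf.length > 0) {
        const open = buf.indexOf(OPEN);
        if (open === -1) {
          // Hold back the longest tail that is a proper prefix of OPEN.
          let keep = 0;
          for (let k = Math.min(OPEN.length - 1, buf.length); k > 0; k--) {
            if (OPEN.startsWith(buf.slice(buf.length - k))) { keep = k; break; }
          }
          out += buf.slice(0, buf.length - keep);
          held = buf.slice(buf.length - keep);
          return out;
        }
        out += buf.slice(0, open);
        const close = buf.indexOf(CLOSE, open + OPEN.length);
        if (close === -1) { held = buf.slice(open); return out; } // wait for the close tag
        buf = buf.slice(close + CLOSE.length);
      }
      return out;
    },
    // End of stream: whatever was withheld never became a complete tag.
    flush(): string {
      const rest = held;
      held = "";
      return rest;
    },
  };
}
```

A per-chunk regex cannot see a tag split as `<dcp-mes` + `sage-id>`; withholding the ambiguous tail until the next delta arrives is what closes that gap.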
## v2.6.2-delete-guard-and-sse — 2026-05-30
Two coder-side batches under one tag. **Session-delete work-loss guard:** deleting a BooChat session CASCADE-wipes its `session_worktrees` row, which would silently orphan uncommitted/unpushed/unmerged work — so the server's `DELETE /api/sessions/:id` now gates before the delete. It reads `session_worktrees` from the shared DB first (no row → chat-only session → delete immediately, zero round-trip), and for worktree-backed sessions calls a new BooCoder endpoint (`/worktree-risk`) that runs git on the host, since the container can't see `/tmp/booworktrees` — only the host systemd service can. `checkWorktreeWorkAtRisk` reports dirty/unpushed/unmerged via the audited `hostExec`+`shellEscape` path, default branch detected from `refs/remotes/origin/HEAD` (never the worktree's own branch, never hardcoded); any at-risk worktree returns 409 with per-worktree `RiskReport[]`, `force=true` bypasses, and the check is fail-closed (BooCoder unreachable also blocks — force still escapes). The sidebar renders a block dialog distinguishing work-at-risk (Commit/Stash/Force; stash uses `-u` and re-blocks on remaining commits) from couldn't-verify (Cancel/Force), and Commit never auto-commits. A follow-up fix gates the `unpushed` arm behind an actual upstream (`atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0)`) so the no-upstream `session-<id>` branches stop flagging every pristine worktree-backed session — no protection lost, since real local work always also surfaces as `unmerged > 0`. **Per-session SSE (P1.5-a):** replaces the single global SSE loop scoped to the most-recent worktree directory — the known limit flagged in `v2.6.1-phase1-opencode` — with one `event.subscribe({directory})` per live opencode session, so sessions in different worktrees stream concurrently instead of the second silently dropping the first's events. 
Each session owns an `AbortController` wired into `subscribe(…, {signal})`, which also fixes a latent Phase-1 bug where switching directories left the old loop parked forever in its `for await` (zombie loops); a `sessionID` demux guard drops cross-session events so two sessions sharing a worktree (possible after P1.5-b) don't double-process deltas. The opencode SDK was confirmed to open an independent SSE connection per `subscribe()` call, so N concurrent dir-scoped streams are supported.
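The at-risk rule quoted in the delete-guard entry translates directly into a predicate. The formula is from the text; the `RiskInputs` shape and the function name are illustrative.

```typescript
// Hypothetical input shape; counts are numbers of commits.
type RiskInputs = { dirty: boolean; unmerged: number; unpushed: number; hasUpstream: boolean };

// atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0)
function isAtRisk(r: RiskInputs): boolean {
  return r.dirty || r.unmerged > 0 || (r.hasUpstream && r.unpushed > 0);
}
```

Gating the `unpushed` arm on `hasUpstream` is what stops a pristine no-upstream `session-<id>` branch from being flagged.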
## v2.6.1-phase1-opencode — 2026-05-30
v2.6 Phase 1: opencode runs as a warm HTTP server (`apps/coder/src/services/backends/opencode-server.ts`) — one `opencode serve` per BooCoder process, one opencode session per BooCode session resumed across turns via the new `agent_sessions` table, with a single SSE read loop, reasoning dedup ported from Paseo, an inactivity watchdog, and a stale-session guard (crashed-not-resumed + a `config_hash` fingerprint over `opencode_server|<model>`, deliberately excluding the ephemeral server port so cross-restart resume survives). Builds on the `v2.6.0-phase0-foundations` schema/interface scaffold. The batch's hard-won fixes: opencode streams `session.next.*` events (not `message.part.*`), and `event.subscribe()` must pass the session's worktree `directory` or events route to the server CWD and turns come back empty; model strings must be `llama-swap/`-prefixed and present in opencode's own config, with `agent-probe` now populating `available_agents.models` via `mergeLlamaSwap` so the frontend stops sending an empty model; `session_worktrees`/`agent_sessions` FKs are `ON DELETE CASCADE` so session deletion no longer 500s. Also bundled: dcp-message-id tag stripping from opencode text output, a reopen-closed-pane control, the `[+]`/split-pane button separation, auto-name using the session's loaded model, and a `systematic-debugging` slash command. Smoke 1 verified end-to-end (two turns, session reuse, turn 2 ~9x faster). Known Phase 1 limit: one SSE stream scoped to the most-recent session's directory — concurrent opencode sessions in different worktrees collide (warns; per-session SSE is Phase 2).
## v2.5.15-acp-path-guard — 2026-05-29
Security fix + repo hygiene. Fixes a path-traversal in the ACP filesystem bridge (`acp-client-fs.ts`, flagged by the automated push security review): the worktree guard used an unbounded `startsWith(resolve(worktreePath))`, so a sibling path sharing the worktree as a string prefix (`<worktree>-evil/…`) escaped the scope — and `writeWorktreeTextFile` writes to disk directly (no `pending_changes` gate), so a confused/buggy ACP agent could write outside its worktree. Now uses a separator-bounded check matching `write_guard.ts` (`resolve()` + `startsWith(root + sep)` / `=== root`) via a shared `resolveInWorktree`, with a regression test covering `../` traversal and the sibling-prefix bug. Symlink-swap/`O_NOFOLLOW` hardening was intentionally skipped — consistent with `write_guard`'s no-realpath stance, and the agent already runs with host FS access so this is a containment guard, not a trust boundary. Separately, stops tracking the live `data/coder-providers.json` (it's runtime config the UI reads *and writes* on provider toggles, which churned `git status`) — it's now gitignored with a tracked `data/coder-providers.example.json` reference; the loader falls back to built-ins-only when the live file is absent. The provider-type duplication (coder ↔ web) stays guarded by the existing text-identity `provider-types-parity.test.ts` — a shared package was considered and declined (drift is already prevented; not worth the Docker/build-order risk at solo scale).
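A separator-bounded check of the kind described can be sketched as follows. The `resolveInWorktree` name comes from the entry; the signature and the `null` return on rejection are assumptions, and the real helper may differ.

```typescript
import { resolve, sep } from "node:path";

// Returns the resolved absolute path when it stays inside the worktree,
// or null when it escapes (assumed rejection convention).
function resolveInWorktree(worktreePath: string, requested: string): string | null {
  const root = resolve(worktreePath);
  const target = resolve(root, requested);
  // `root + sep` bounds the prefix at a path separator, so a sibling such as
  // "<worktree>-evil" no longer passes a bare startsWith(root) check.
  if (target === root || target.startsWith(root + sep)) {
    return target;
  }
  return null;
}
```

As in the entry, this does no `realpath` resolution, so it is a containment guard against string-level escapes rather than protection against symlink swaps.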
## v2.5.14-claude-md — 2026-05-29
Docs-only — CLAUDE.md session-learnings update, no code. Adds gotchas surfaced while shipping the v2.3 provider-lifecycle batch: the host `boocoder.service` keeps running the old process after `pnpm -C apps/coder build` (stale-process tell = new routes 404 while old routes 200; restart, don't re-debug); the `boocode` container `build: .` deploys the working tree, so web edits are live on the Vite dev server but not production until `docker compose up --build -d boocode`; `PATCH /api/providers/config` replaces a provider's override wholesale (send `{...existing, enabled}` or a custom ACP entry's command is wiped) and `data/coder-providers.json` is live config not to be committed as code; external agents dispatch one-shot with no context/token tracking (only native `boocode` tracks ctx; OpenCode-as-server is the unshipped `v2-6-persistent-agent-sessions` plan); the `ui/` primitive inventory with `button role=switch` / Dialog fallbacks for the absent switch/sheet; and the mobile Dialog-with-list scroll-containment recipe. Also backfills previously-uncommitted doc bullets for the `v2.5.7`–`v2.5.11` coder work (provider-type parity test, async ACP command discovery, AgentComposerBar `installed` filter, provider-registry path disambiguation).
## v2.5.13-provider-lifecycle-phase5 — 2026-05-29
Closeout of the v2.3 provider-lifecycle batch — the web UI (Phase 5) plus docs (Phase 6). Provider management moved into **Settings → Providers**: a tab listing every registered provider with a status badge (Available / Disabled / Not installed / Error / Loading), an enable/disable toggle, a per-provider refresh, and a plaintext diagnostic; toggling sends the provider's *full* override (preserving a custom ACP entry's command under the wholesale-replace PATCH merge) then refetches the snapshot. The composer's provider picker now filters to `enabled && (status === 'ready' || 'loading')`, so disabled and unavailable providers drop out of the picker and are managed only in settings (native `boocode` always shows). A curated ACP catalog (`apps/web/src/data/acp-provider-catalog.ts`) + `AddProviderModal` register custom providers via `PATCH /api/providers/config` then a subset refresh, and the web client gained `getProvidersConfig` / `patchProvidersConfig` / `refreshProviders` / `getProviderDiagnostic`. Two mobile fixes ship alongside: the Settings pane is now reachable on phones (opening it pushes `?pane=` atomically so the mobile URL-sync effect keeps it active instead of snapping back to the chat pane), and the Add-provider modal caps to the viewport with a single `overscroll-contain` scroll region so the list scrolls instead of dragging the whole modal. This completes the arc begun in `v2.5.4-provider-lifecycle-phase1` (config-backed registry over the built-ins) → `v2.5.5-provider-lifecycle-phase2` (loading/unavailable snapshot lifecycle + tier-2 probe TTL gate) → `v2.5.6-provider-lifecycle-phase3` (generic `resolveLaunchSpec` ACP dispatch) → `v2.5.12-provider-lifecycle-phase4` (config GET/PATCH, subset refresh, diagnostic HTTP API). 
Docs landed in `BOOCODER.md` (config file, refresh contract, enable/disable, custom ACP, the honest subset-refresh known limitation) and `docs/DEFERRED-WORK.md` §2 is marked addressed; the remaining Tier-2 follow-ups (WS `provider_snapshot_updated` frame, `available_agents.enabled` column, shared types package, MCP provider tools) stay deferred.
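The composer filter described in this entry might look like the sketch below. The entry shape is a trimmed, illustrative subset of `ProviderSnapshotEntry`, and `pickerProviders` is a hypothetical name.

```typescript
type ProviderSnapshotStatus = "loading" | "ready" | "unavailable" | "error";

// Illustrative subset of a snapshot entry.
interface PickerEntry {
  id: string;
  enabled: boolean;
  status: ProviderSnapshotStatus;
}

// Keep enabled providers that are ready or still loading; native boocode always shows.
function pickerProviders(entries: PickerEntry[]): PickerEntry[] {
  return entries.filter(
    (e) => e.id === "boocode" || (e.enabled && (e.status === "ready" || e.status === "loading")),
  );
}
```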
## v2.5.12-provider-lifecycle-phase4 — 2026-05-29
Phase 4 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §6): the HTTP API to read, patch, refresh, and diagnose providers. `routes/providers.ts` gains `GET /api/providers/config` (the raw loaded `CoderProvidersFile`), `PATCH /api/providers/config` (a partial providers map — an id's override object is replaced wholesale, a `null` value deletes it), an optional `{ providers?: string[] }` body on `POST /api/providers/refresh` (the `refreshed` count reflects the requested subset; the force probe itself still covers all installed providers, since per-provider force is a snapshot-internal change left to a later phase), and `GET /api/providers/:id/diagnostic` returning JSON `{ diagnostic: string }` — a read-only report (resolved def, install_path, last_probed_at, enabled, `which` availability, last cached probe error) with no probe spawn. PATCH correctness is the whole story: the order is validate→save→reload→clear, a malformed body or an invalid merged config returns 422 without writing the file, and a `save()` failure returns 500 without reloading the registry or clearing the snapshot cache, so on-disk and in-memory state can never diverge. New pure `mergeProviderConfigPatch` + `ProviderConfigPatchSchema` in `provider-config.ts`, a read-only `peekSnapshotEntry` cache accessor (source of the diagnostic's last-error — no probe/cache logic change), and a new `provider-diagnostic.ts` formatter. The web client gains `api.coder.getProvidersConfig` / `patchProvidersConfig` / `refreshProviders(providers?)` / `getProviderDiagnostic`, with mirrored `ProviderOverride` / `CoderProvidersFile` / `ProviderConfigPatch` types; the existing `/api/coder/*` proxy blanket-forwards the new routes with no change. +28 tests (134 coder total: pure merge/validate, the diagnostic formatter, and `app.inject` route tests proving the 422-no-write and save-fail-no-divergence guards). 
The diagnostic returns JSON rather than the §8 plaintext so it flows through the JSON `request` client helper (reconciling design §6.4's `{ diagnostic }` with §8's string report). No UI (Phase 5). Builds on `v2.5.6-provider-lifecycle-phase3`.
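The wholesale-replace and `null`-deletes semantics of the PATCH merge can be sketched as a pure function. `mergeProviderConfigPatch` is named in the entry, but the signature and the loose types here are assumptions for illustration.

```typescript
// Loose stand-ins for the real Zod-derived types.
type ProviderOverride = Record<string, unknown>;
type ProvidersMap = Record<string, ProviderOverride>;
type ProvidersPatch = Record<string, ProviderOverride | null>;

// Pure merge: does not mutate `current`, so a later validation or save failure
// leaves the loaded config untouched.
function mergeProviderConfigPatch(current: ProvidersMap, patch: ProvidersPatch): ProvidersMap {
  const next: ProvidersMap = { ...current };
  for (const [id, override] of Object.entries(patch)) {
    if (override === null) {
      delete next[id]; // null removes the override
    } else {
      next[id] = override; // replaced wholesale, not deep-merged
    }
  }
  return next;
}
```

Because the replace is wholesale, a client toggling `enabled` has to send the full existing override; sending `{ enabled: false }` alone would drop a custom entry's `command`.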
## v2.5.11-claude-skill-discovery — 2026-05-29
Surface Claude Code's real enabled commands + plugin skills in the coder slash menu, with icons separating commands from plugin skills. New `claude-command-discovery.ts` reads (user-global scope) `~/.claude/commands/*.md` plus every enabled plugin in `~/.claude/settings.json:enabledPlugins` — each plugin's user-scope install path contributes `skills/<name>/SKILL.md` (kind `skill`) and `commands/*.md` (kind `command`), parsed from frontmatter, bare names, deduped. The snapshot's claude branch discovers these **live** (claude is PTY, no ACP probe; the snapshot cache rate-limits the fs reads). The `/` menu now renders up to three icon'd groups: **`<agent> commands`** (Terminal), **`<agent> skills`** (Puzzle — claude's plugin skills / opencode is all commands), and **BooCoder skills** (Sparkles), via a new optional `icon` on `SlashCommandGroup`. `AgentCommand` gains a `kind` field, added identically to the coder and web copies (the `provider-types-parity` test enforces it); `mergeCommandsByName` is now generic so it preserves the tag. Invocation is unchanged — picking a claude command/skill sends `/name` to claude (PTY), which executes it. Project-local plugins + `<cwd>/.claude/commands` deferred. BooChat unaffected (flat skills). Smoke-test the claude skill slash-execution on the host.
## v2.5.10-opencode-live-commands — 2026-05-29
Surface opencode's real (live ACP) command set in the coder slash menu without needing a dispatch. Two fixes: (1) the cold ACP probe (`acp-probe.ts`) captured `available_commands` but read `probedCommands` synchronously right after `newSession` — racing opencode's async `available_commands_update` notification, so it captured **zero** and only the 7-item static manifest showed. The probe now waits briefly (poll up to 3s for the first batch + a 300ms settle, capped under the 30s probe timeout) so the commands are actually captured. (2) Captured commands are persisted to a new `available_agents.commands` JSONB column and served (merged with the manifest) on the tier-2-probe-skip path, so the agent's discovered commands survive once the model list is warm and show without a dispatch. Boot warms this via the `force: true` startup snapshot. apps/coder only (probe + schema + snapshot). Caveat: depends on opencode emitting `available_commands_update` on session creation rather than only after a prompt — to be confirmed on the host. Claude (PTY) disk/plugin discovery deferred.
## v2.5.9-agent-slash-commands — 2026-05-29
Segmented per-agent slash menu in the coder pane, plus cross-agent skills. The `/` menu now shows two labeled groups — **the active agent's commands first** (opencode/claude/qwen manifest + live ACP `available_commands`), **BooCoder skills second** — instead of always showing BooCoder's skills regardless of provider. `SlashCommandPicker` gains an opt-in `groups` prop (the flat `items` path is unchanged, so **BooChat's menu is byte-identical** — parity verified: no BooChat caller passes the grouped prop, and the skills lookup / invocation routing are untouched); `ChatInput` takes `slashGroups`; `CoderPane` builds the groups from the selected provider's commands + skills. Skills now **run under the selected agent**: the coder `skill_invoke` route accepts a `provider` and, when external, injects the server-side skill body into a dispatched task (instead of native inference) — so a skill like brainstorming executes through opencode/claude with the body kept server-side, mirroring the messages-route external dispatch. Also folds in the earlier initial-chat fix: invoking a skill on the landing chat now runs the same create-chat → assign-to-pane → invoke transition as a text send (`handleLandingSkill`) rather than invoking invisibly without a pane transition (the blank-screen repro). Web tsc + coder build clean.
## v2.5.8-mobile-composer-row — 2026-05-29
Mobile fix for the `AgentComposerBar`: the refresh button was wrapping to a second line. Root cause was layout order, not width — the status dot carried `ml-auto` (pinned to the far-right edge) and the refresh button followed it in DOM order, so it overflowed and wrapped. The dot + refresh are now one right-aligned (`ml-auto`) unit, keeping the refresh on the top line. Additionally, `CompactPicker` gained an `iconOnly` option and the Mode (permission) picker now renders icon-only on mobile (shield + chevron, no "Bypass"/"Plan" text label; `aria-label`/`title` and the tap-to-open list still convey the value) to free row width. Desktop is unchanged (full labels). Web-only change.
## v2.5.7-claude-models-and-picker-fix — 2026-05-29
Two provider-layer changes. **(1) Fix the empty provider picker** — a regression from `v2.5.5` (Phase 2): on a cache miss `getProviderSnapshot` returned synchronous `installed:false` `loading` entries, which `AgentComposerBar` filters out (`e.installed && e.status !== 'error'`); with the client-side poll deferred to Phase 5, a single fetch landed on `loading` forever and no providers appeared. `getProviderSnapshot` now awaits the build and returns terminal entries (the sync `loading` return is deferred until Phase 5 ships the poll); builds stay fast via the tier-2 cold-probe skip. **(2) Claude models** — the list was a hardcoded 2-entry static list (Opus 4 / Sonnet 4, May 2025), and the v2.3 config schema's `models`/`additionalModels` were parsed but never wired. `buildResolvedRegistry` now carries config `models` (replace) + `additionalModels` (merge) onto `ResolvedProviderDef`, and `provider-snapshot` applies them to every ready model list — so `/data/coder-providers.json` can add or replace any provider's models with no code change. Claude `staticModels` bumped to `opus`/`sonnet`/`haiku` latest-aliases plus pinned `claude-opus-4-8` / `claude-sonnet-4-6` / `claude-haiku-4-5-20251001` (passed verbatim to `claude --model`; the CLI accepts both aliases and pinned full names). +2 unit tests (109 total). Builds on `v2.5.6-provider-lifecycle-phase3`.
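The replace-versus-merge behaviour of `models` and `additionalModels` can be sketched as below. `applyModelOverrides` is a hypothetical name, and de-duplicating merged entries is an assumption rather than something the entry states.

```typescript
// `models` replaces the discovered list; `additionalModels` is appended to
// whichever list is in effect (duplicates skipped, by assumption).
function applyModelOverrides(
  discovered: string[],
  models?: string[],
  additionalModels?: string[],
): string[] {
  const result = [...(models ?? discovered)];
  for (const model of additionalModels ?? []) {
    if (!result.includes(model)) result.push(model);
  }
  return result;
}
```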
## v2.5.6-provider-lifecycle-phase3 — 2026-05-29
Phase 3 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §5): generic ACP dispatch. `acp-spawn.ts` gains `resolveLaunchSpec(resolved, installPath)` — it consults the resolved registry's `launchCommand` (a config override or a custom-ACP entry's command) first, falling back to the kept `resolveAcpSpawnArgs` switch for built-ins. `acp-dispatch.ts` now spawns `spec.binary`/`spec.args` with `env: { ...process.env, ...spec.env }` instead of the hardcoded per-name argv, and `dispatcher.ts` loads the resolved def by `task.agent` and passes it through. This lets config-defined custom ACP providers dispatch with no new switch case. Built-in dispatch (claude/opencode/goose/qwen) is **byte-identical** to pre-v2.3 — proven by a regression test asserting opencode→`['acp']`, goose→`['acp']`, qwen→`['--acp']`, binary=`installPath ?? id`, and empty config env → plain `process.env`. One deliberate deviation from the spec's literal `!installPath → null`: the `installPath ?? id` fallback is preserved so a missing install path still spawns the bare agent name as before. `setSessionMode`/permission/streaming and the dispatcher poll/NOTIFY/running-guard are untouched. 7 new `acp-spawn.test.ts` cases. No routes/UI (Phase 4+). Builds on `v2.5.5-provider-lifecycle-phase2`.
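The resolution order can be sketched as below. The type shapes and the `builtinArgs` helper are assumptions standing in for the real `ResolvedProviderDef` and `resolveAcpSpawnArgs`; only the argv values and the `installPath ?? id` fallback are taken from the entry.

```typescript
// Illustrative shapes, not the real definitions.
interface ResolvedDef {
  id: string;
  launchCommand?: string[];
  env?: Record<string, string>;
}

interface LaunchSpec {
  binary: string;
  args: string[];
  env: Record<string, string>;
}

// Stand-in for the kept built-in switch (argv per the regression test above).
function builtinArgs(id: string): string[] | null {
  switch (id) {
    case "opencode":
    case "goose":
      return ["acp"];
    case "qwen":
      return ["--acp"];
    default:
      return null;
  }
}

function resolveLaunchSpec(resolved: ResolvedDef, installPath: string | null): LaunchSpec | null {
  const env = resolved.env ?? {};
  const command = resolved.launchCommand;
  // 1. A config override or custom-ACP command wins.
  if (command && command.length > 0) {
    return { binary: command[0] as string, args: command.slice(1), env };
  }
  // 2. Otherwise fall back to the built-in argv; a missing install path
  //    still spawns the bare agent name.
  const args = builtinArgs(resolved.id);
  if (args === null) return null;
  return { binary: installPath ?? resolved.id, args, env };
}
```

The caller would then spawn `spec.binary` with `spec.args` and `{ ...process.env, ...spec.env }`, so an empty config env leaves the environment unchanged.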
## v2.5.5-provider-lifecycle-phase2 — 2026-05-29
Phase 2 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §4). `provider-snapshot.ts` stops returning `null` for uninstalled/disabled providers — it now emits one entry per registered provider with a lifecycle status (`loading | ready | unavailable | error`), an `enabled` flag, and a two-tier probe. Tier-1 is a fast `which`-style availability check (`command-availability.ts`, `execFile`/no-shell); tier-2 — the 5–30s cold ACP probe — is now SKIPPED unless forced (`POST /refresh`), the `available_agents.last_probed_at` row is older than `PROVIDER_PROBE_TTL_MS` (24h default), or the DB model list is empty, which kills snapshot latency on warm reads. A cache miss returns `status:'loading'` synchronously while the build settles in the background (client polling is deferred to Phase 5). `ProviderSnapshotStatus`/`ProviderSnapshotEntry` regained `loading`/`unavailable` and gained `enabled`, `description?`, `fetchedAt?` in both the coder and web copies, guarded by a runtime parity test (`provider-types-parity.test.ts`, mirroring the `ws-frames.test.ts` convention) that fails on any field drift — a compile-time cross-project assignability check was attempted first but blocked by TS6307 (web is a composite tsconfig project). Also tracks the previously-gitignored `data/coder-providers.json` seed via a `.gitignore` exception, completing the Phase 1 config file. No dispatch/route/UI changes (Phase 3+); AgentComposerBar filtering unchanged. Builds on `v2.5.4-provider-lifecycle-phase1`.
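The tier-2 gate reduces to a small predicate. The sketch below assumes a never-probed provider (`lastProbedAt` of `null`) counts as stale; the function name and option shape are hypothetical.

```typescript
const PROVIDER_PROBE_TTL_MS = 24 * 60 * 60 * 1000; // 24h default per the entry

interface ProbeGateInput {
  force: boolean; // e.g. POST /refresh
  lastProbedAt: Date | null; // available_agents.last_probed_at
  modelCount: number; // size of the DB model list
  now?: Date;
  ttlMs?: number;
}

// Run the cold ACP probe only when forced, stale, or when no models are cached.
function shouldRunColdProbe(input: ProbeGateInput): boolean {
  const now = input.now ?? new Date();
  const ttlMs = input.ttlMs ?? PROVIDER_PROBE_TTL_MS;
  if (input.force) return true;
  if (input.modelCount === 0) return true;
  if (input.lastProbedAt === null) return true; // assumption: never probed means stale
  return now.getTime() - input.lastProbedAt.getTime() > ttlMs;
}
```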
## v2.5.4-provider-lifecycle-phase1 — 2026-05-29
Phase 1 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §2–3): a config-backed provider layer merged over the hardcoded built-ins, with no runtime change when no config file exists. Adds `CODER_PROVIDERS_PATH` (default `/data/coder-providers.json`); `provider-config.ts` (Zod `ProviderOverride`/`CoderProvidersFile` schemas + a loader that never throws at startup — a missing file, invalid JSON, or schema mismatch all fall back to built-ins-only — plus `save` for the Phase 4 PATCH route); and `provider-config-registry.ts` (`ResolvedProviderDef` + `buildResolvedRegistry` merge: built-in overrides, custom `extends:'acp'` entries requiring label+command, `boocode` always enabled, plus a module singleton). `agent-probe.ts` now iterates the resolved registry instead of the hardcoded list — custom ACP entries resolve their binary from `command[0]` via `execFile` (no shell), disabled providers skip probing without losing their row, and `enabled` is read from memory only (no DB column this phase). Six unit tests, including a regression proving an empty config yields exactly the built-ins. No snapshot/dispatch/route/UI changes (Phase 2+). The `data/coder-providers.json` seed exists on disk but is gitignored (`data/*`). Lands on top of `v2.5.3-remove-cursor-copilot`.
## v2.5.3-remove-cursor-copilot — 2026-05-29
Retire the cursor and copilot providers from BooCoder entirely. Removes their `acp-spawn` argv cases, `provider-manifest` mode blocks + manifest keys, `provider-commands` command maps, the `provider-snapshot` cursor model-CLI branch (and the now-orphaned `exec`/`promisify` imports), and the `agent-probe` copilot ACP-detect branch; deletes the dead `cursor-models.ts` module and its test. The `PROVIDERS` registry array already lacked both entries, so only the doc comment needed correcting. Built-ins unchanged: claude, opencode, goose, qwen, native boocode. Standalone cleanup; pairs with `v2.5.4-provider-lifecycle-phase1` which builds on it.
## v2.5.2-coder-ux-fixes — 2026-05-29
Working-tree checkpoint bundling this session's fixes with in-progress coder UI work. This session: the BooCoder dispatcher now reacts to new tasks immediately via a Postgres `LISTEN/NOTIFY` (`tasks_new`) AFTER INSERT trigger, with the poll loop kept at 2s as a missed-notification fallback (`dispatcher.ts`, `apps/coder/src/schema.sql`); the mobile nav drawer no longer sticks open after returning to a backgrounded tab — `useViewport` re-syncs on `pageshow`/`visibilitychange`/`resize`/`orientationchange` (iOS reported a stale width on bfcache restore, leaving `isMobile=false`); assistant reasoning renders as a collapsible "Thinking" block in `MessageBubble`, surfacing ACP `agent_thought_chunk` from opencode/goose/qwen and native `reasoning_parts`; paste-to-chip inserts pasted text verbatim instead of wrapping it in a code fence; and a "New file from pasted text" affordance in the RightRail browser queues a `pending_changes` create through the new `POST /api/sessions/:id/pending/create` endpoint, paired with a fix repointing the DiffPanel's dead approve/reject calls to the real `/api/pending/:id/apply` and `/reject` routes. Also carried in the tree but not authored this session: the CoderPane `ChatInput` migration and `AgentComposerBar` refinements, plus backend tweaks to `auto_name`, inference `tool-phase`/`turn`, `secret_guard`, and `provider-registry`. Ships the `v2-6-persistent-agent-sessions` openspec proposal/design/tasks (free agent-switching with per-agent memory, opencode-as-server) as planning docs only — the feature is unimplemented and reserves the `v2.6.0` tag for it. Build green across server/coder/web; server suite 531 passing. (CHANGELOG note: the v2.3–v2.5.1 entries were never backfilled and remain absent above.)
## v2.2.2-xml-placeholder-reject — 2026-05-26
Reject placeholder XML tool args at parse time in `extractToolCallBlocks` (`xml-parser.ts`). Drops calls when any string arg is `...`, empty/whitespace, `<path>`, `<file>`, `placeholder`, or angle-bracket sentinels; appends the raw XML block to flushed prose instead of silently deleting it. Fixes qwen3.6 answer-then-spurious-tools tail that caused duplicate assistant rows (full answer + failed `xml_call_*` tools + regenerated answer). Four new tests in `xml-parser.test.ts`. Known nit: rejection logs via `console.debug` instead of pino — filed in `docs/DEFERRED-WORK.md` §6 for a later cleanup.
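A rough sketch of the placeholder check, under assumptions: the literal list mirrors the entry, but the angle-bracket sentinel regex and both function names are illustrative guesses at what `extractToolCallBlocks` applies.

```typescript
// Literals named in the entry (compared case-insensitively, by assumption).
const PLACEHOLDER_LITERALS = new Set(["...", "<path>", "<file>", "placeholder"]);

function isPlaceholderArg(value: string): boolean {
  const trimmed = value.trim();
  if (trimmed === "") return true; // empty or whitespace-only
  if (PLACEHOLDER_LITERALS.has(trimmed.toLowerCase())) return true;
  // Assumed shape of an "angle-bracket sentinel": a single <token> and nothing else.
  return /^<[^<>]+>$/.test(trimmed);
}

// A call is dropped when any string argument looks like a placeholder.
function hasPlaceholderArgs(args: Record<string, unknown>): boolean {
  return Object.values(args).some((arg) => typeof arg === "string" && isPlaceholderArg(arg));
}
```

In the described fix, a rejected call's raw XML is appended to the flushed prose rather than deleted, so the text the model wrote stays visible.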
## v2.2.1-pane-scoped-chats — 2026-05-26
Follow-up fixes on the v2.2 Paseo provider stack. Pane-scoped chat resolution: `resolveChatId(sql, sessionId, paneId)` reads `sessions.workspace_panes`, requires `pane_id` on coder POST routes, and creates a scoped chat per coder/terminal pane instead of falling back to the session's first open chat (which fused BooCoder writes into the BooChat pane). Client `useWorkspacePanes` seeds new coder/terminal panes with dedicated chats on create, hydrate, and workspace sync; `CoderPane` blocks send until seeded and filters WS frames + `GET /messages?chat_id=` to that chat. External-agent tool UI: new `CoderMessageList` renders BooChat-style `ToolCallLine` timeline (tools before answer text on combined ACP rows). WS user-delta handling replaces content instead of appending (fixes garbled duplicate user messages when optimistic UI met full-body deltas). BooChat inference: `buildMessagesPayload` strips orphan assistant `tool_calls` without matching `tool` rows and skips stray tool rows when the owning assistant turn is incomplete (fixes "Tool results are missing for tool calls" on shared chats with ACP history). Pairs with `v2.2-paseo-providers`.
## v2.2-paseo-providers — 2026-05-26
Paseo-equivalent provider stack for BooCoder. Seven providers (boocode, cursor, claude, opencode, goose, qwen, copilot) with snapshot API (`provider-snapshot.ts`, ACP cold probe, per-provider model merge, cursor models from ACP). Frontend `AgentComposerBar` replaces `ProviderPicker` — provider / mode / model / thinking in the coder composer; `SlashCommandPicker` + `useProviderSnapshot` hook. ACP dispatch rewritten (`acp-dispatch.ts`, `acp-stream.ts`, `acp-spawn.ts`, `agent-turn-persist.ts`, `acp-tool-snapshot.ts`) with Paseo merge/stream/persist pattern, inline `PermissionCard` prompts, and `reasoning_delta` WS frames. Agent slash-command hints via ACP `available_commands_update` cached in `agent-commands-cache.ts` + `AgentCommandsHint`. Arena and MCP entry points accept `mode_id` / `thinking_option_id`. SSH helpers removed; all host exec via `host-exec.ts` direct spawn. Server adds coder proxy route + shared skill invoke. New tests: acp-derive, acp-tool-snapshot, cursor-models, provider-commands, provider-snapshot, agents. Docs: `AGENTS.md`, `docs/ARCHITECTURE.md`, openspec `v2-2-paseo-providers`.
## v2.1.1-roadmap-cleanup — 2026-05-25
Roadmap reconciliation, README updates, and openspec archive housekeeping. No runtime behavior changes.
## v2.1.0-provider-picker — 2026-05-25
Provider picker: BooCoder moves from Docker container to host systemd service (`boocoder.service`). All agent dispatch (ACP + PTY) switches from SSH tunnel to direct `spawn`/`exec` — no more `sshSpawn`/`sshExec`/`sshSpawnWithStdin` (marked `@deprecated`). New provider registry (`provider-registry.ts`) with 5 providers (boocode, opencode, goose, claude, qwen), per-provider model discovery (llama-swap for ACP agents, `~/.qwen/settings.json` for qwen, static for claude), and `agent-probe.ts` runs direct `which`/`exec` instead of SSH. `GET /api/providers` route assembles the provider list with installed status, models, and transport (ACP→PTY fallback if `supports_acp` is false). Frontend `ProviderPicker` component in CoderPane header lets users pick provider/model per message; messages route through `tasks` row for external providers instead of inference enqueue. Smart scroll: `MessageList` only auto-scrolls when user is near bottom (150px threshold). DB schema adds `models`, `label`, `transport` columns to `available_agents`. Bug fixes: `loadContext` SELECT now includes `allowed_read_paths` (cross-repo read grants were silently failing), cap hit sentinel insertion moved before `buildMessagesPayload` call.
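The smart-scroll rule amounts to a distance check against the 150px threshold. This sketch takes plain numbers instead of a DOM element so it runs anywhere; the function name and the inclusive comparison are assumptions.

```typescript
const NEAR_BOTTOM_PX = 150;

// Mirrors the scrollTop / clientHeight / scrollHeight properties of a scroll container.
interface ScrollMetrics {
  scrollTop: number;
  clientHeight: number;
  scrollHeight: number;
}

// Auto-scroll only when the viewport bottom is within the threshold of the content bottom.
function isNearBottom(metrics: ScrollMetrics, thresholdPx: number = NEAR_BOTTOM_PX): boolean {
  const distance = metrics.scrollHeight - metrics.scrollTop - metrics.clientHeight;
  return distance <= thresholdPx;
}
```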
## v2.0.5 — 2026-05-25
FAST_MODEL routing: optional `FAST_MODEL` env var routes cheap auxiliary calls (titles, summaries, labeling) to a small model on llama-swap (e.g. `nemotron-nano-4b`) instead of loading the 35B for 20-token calls. Falls back to session model or DEFAULT_MODEL. Tool-use summaries: `runCapHitSummary` now writes the cap_hit sentinel before building the summary payload (bug fix — sentinel was written after, causing it to appear after the summary text in the message list). Qwen Code dispatch: `qwen -p "<task>" --output-format stream-json` via PTY (non-interactive mode, no `--yolo` flag needed). Arena: `POST /api/arena` dispatches the same task to N models/agents in parallel, each with its own task + worktree; `GET /api/arena/:id` for results; `POST /api/arena/:id/select/:task_id` picks winner.
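The fallback chain for auxiliary calls can be sketched in a few lines. `pickAuxModel` and the env shape are hypothetical; only the order (FAST_MODEL, then session model, then DEFAULT_MODEL) comes from the entry.

```typescript
interface ModelEnv {
  FAST_MODEL?: string;
  DEFAULT_MODEL: string;
}

// Empty strings are treated as unset (an assumption about how the env is parsed).
function pickAuxModel(env: ModelEnv, sessionModel?: string | null): string {
  return env.FAST_MODEL || sessionModel || env.DEFAULT_MODEL;
}
```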
## v2.0.4-hardening — 2026-05-25
Path-guard fuzz suite: 25+ traversal-attack tests covering ../ sequences (all depths), encoded traversal (%2e%2e), null byte injection, absolute path escape, prefix-without-separator, backslash traversal, and the full secret-file deny list (.env, *.pem, id_rsa*, *.key, credentials.json, *.kdbx, .netrc). Plus 5 valid-path positive tests confirming normal writes aren't blocked and 5 edge-case tests (empty, whitespace-only, very long path, triple-dot, multiple slashes). Null-byte and whitespace-only guards added to `resolveWritePath` (previously only checked empty string). DB-integration test skeleton for pending_changes full-cycle (queue create/edit/delete, apply, rewind) gated on DATABASE_URL via `describe.runIf`. Production readiness verified: all services healthy, all builds clean, 57 tests passing (23 existing + 34 new).
## v2.0.3 — 2026-05-25
CLI client (`apps/coder/src/cli.ts`, 249 lines) for headless agent interaction. Human inbox view (`human_inbox` view) surfaces tasks in `blocked`/`failed` state. Cost tracking: `tool_cost_stats` view with per-tool 100-call rolling window. `new_task` tool (Boomerang pattern): creates tasks with project context and optional arena contestants. `check_task_status` and `list_tasks` tools for task lifecycle management. Stats routes (`GET /api/stats`) for cost aggregation. Dispatcher extended to support new task states.
## v2.0.2 — 2026-05-25
BooCoder MCP server (`mcp-server.ts`, 201 lines) exposing 6 write-capable tools over stdio: `edit_file`, `create_file`, `delete_file`, `view_pending_changes`, `apply_pending`, `rewind`. Registered in `apps/coder/src/index.ts` as an MCP stdio server. Enables external agents (opencode, claude, qwen) to call BooCoder's write tools through the MCP protocol.
## v2.0.1 — 2026-05-25
ACP dispatch (`acp-dispatch.ts`, 271 lines): runs ACP-capable agents (opencode, goose) via SSH tunnel wrapping stdio into NDJSON streams for `@agentclientprotocol/sdk` JSON-RPC sessions. PTY dispatch (`pty-dispatch.ts`, 139 lines): runs non-ACP agents (claude, qwen) via SSH with stdin pipe for non-interactive mode. Worktree management (`worktrees.ts`, 118 lines): per-task git worktree creation and cleanup. SSH helper (`ssh.ts`, 126 lines): `sshSpawn`, `sshExec`, `sshSpawnWithStdin` for host command execution. Dispatcher extended to route tasks to ACP vs PTY based on agent capability. Agent probe updated to verify ACP support.
## v2.0.0-final — 2026-05-25
Dispatcher (`dispatcher.ts`, 191 lines): task queue with polling loop, Path A (native inference) and Path B (external agent dispatch). Task routes (`tasks.ts`, 138 lines): CRUD for tasks with state transitions. Agent probe (`agent-probe.ts`, 51 lines): startup scan of host for installed agents (opencode, goose, claude, pi, qwen), version detection, ACP capability verification. Schema adds `tasks` table. CLAUDE.md updated with v2.0.0 architecture docs covering BooCoder, DB rename, MCP config, workspace deps.
## v2.0.0 — 2026-05-25
BooCoder frontend: `CoderPane.tsx` (432 lines) as a `'coder'` pane type within BooChat's SPA — chat pane + diff pane (pending changes) + session picker. Standalone fallback SPA in `apps/coder/web/` (Vite + React) served at `:9502` directly. Session streaming via `useSessionStream` WS hook. API client with typed endpoints. Workspace pane persistence via `useWorkspacePanes`. Server routes for pending changes (`PATCH/POST /api/coder/sessions/:id/pending`). Verification discipline rules + chat naming from assistant response.
## v2.0.0-beta — 2026-05-25
Write tools: `edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind` — queue in `pending_changes` table, nothing hits disk until applied. `write_guard.ts` validates paths (resolve + prefix-check, no realpath for creates). Inference loop integration via `inference_context.ts` (bridges inference turn state to tool execution). API routes: `messages.ts` (POST /api/coder/sessions/:id/messages), `pending.ts` (GET/POST /api/coder/sessions/:id/pending). WebSocket support (`ws.ts`) for real-time pending changes updates. Tool adapter (`adapter.ts`) converts inference tool calls to tool execution. Write guard tests (115 lines). Server-side inference loop wired to BooCoder tools.
## v2.0.0-alpha — 2026-05-25
BooCoder foundation: Docker container (`apps/coder/Dockerfile`), docker-compose service, host env file. Schema: `sessions`, `chats`, `messages`, `pending_changes`, `tasks`, `message_parts` tables. DB renamed from `boocode` to `boochat`. Config module, PostgreSQL connection (porsager/postgres). Initial Fastify server with health endpoint. BOOCODER.md guidance file. Implementation plan (8 phases). Proposal updated with AGENTS.md extensions, Boomerang pattern, observation hooks.
## v2.0-proposal — 2026-05-24
v2.0 proposal: BooCoder write tools, pending-changes queue, ACP dispatch, MCP server. Openspec proposal (`proposal.md`, 274 lines) and task breakdown (`tasks.md`, 130 lines) defining the v2.0 feature scope — write-capable coding agent with file operations, external agent dispatch via ACP/PTY, and MCP server for tool exposure.
## v1.16.0-codesight-merge — 2026-05-24
Ports codesight's highest-value analysis capabilities into the codecontext sidecar as 4 new MCP tools. Tier 1 (graph queries on existing edges, no re-parsing): `get_blast_radius` (BFS reverse-edge traversal — "what breaks if I change this file?", with depth tracking) and `get_hot_files` (most-imported files ranked by incoming edge count — change-risk indicators). Tier 2 (tree-sitter AST re-parsing on demand): `get_routes` (Fastify/Express HTTP route extraction with method, path, file, line, inferred tags for db/auth/cache) and `get_middleware` (middleware registration detection via import-name heuristics and app.register/addHook/setErrorHandler patterns, classifying as auth/cors/rate-limit/security/error-handler/logging/validation). All 4 tools use `defer s.graphMu.RUnlock()` for consistent mutex discipline (reviewer caught that the initial implementation released the lock early on the Tier 2 tools). Route object-property extraction delegates to `extractStringValue` for template-literal handling (reviewer catch). codecontext sidecar rebuilt from `/opt/forks/codecontext` commit `b19e646`, tagged `v1.16.0-codesight-merge`. BooCode wrapper tools follow the existing codecontext pattern — 4 new files in `apps/server/src/services/tools/codecontext/`, registered in ALL_TOOLS. 29 new Go tests + 363/363 BooCode server tests passing. No schema changes, no frontend changes.
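The blast-radius query is a breadth-first walk over reversed import edges. The sidecar itself is written in Go; the TypeScript below is only an illustrative sketch with a hypothetical signature, and it rebuilds the reverse index on each call, which the real tool presumably avoids.

```typescript
// `imports` maps a file to the files it imports. The result maps every file that
// (transitively) depends on `changed` to its distance from the change.
function getBlastRadius(
  imports: Map<string, string[]>,
  changed: string,
  maxDepth: number = Number.POSITIVE_INFINITY,
): Map<string, number> {
  // Build reverse edges: importedBy[target] = files that import target.
  const importedBy = new Map<string, string[]>();
  imports.forEach((targets, from) => {
    for (const target of targets) {
      const dependents = importedBy.get(target) ?? [];
      dependents.push(from);
      importedBy.set(target, dependents);
    }
  });

  const depths = new Map<string, number>();
  const seen = new Set<string>([changed]);
  const queue: Array<{ file: string; depth: number }> = [{ file: changed, depth: 0 }];

  while (queue.length > 0) {
    const next = queue.shift();
    if (!next) break;
    if (next.depth >= maxDepth) continue;
    for (const dependent of importedBy.get(next.file) ?? []) {
      if (seen.has(dependent)) continue; // also guards against import cycles
      seen.add(dependent);
      depths.set(dependent, next.depth + 1);
      queue.push({ file: dependent, depth: next.depth + 1 });
    }
  }
  return depths;
}
```

The same reverse index also gives the hot-files ranking: sorting files by the length of their `importedBy` list orders them by incoming edge count.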
## v1.15.0-mcp-multi — 2026-05-24
Multi-server MCP client with stdio + Streamable HTTP transports, JSON config file, and per-agent tool glob patterns. Generalizes the v1.14.1 single-server Context7 PoC into a registry of named MCP servers with per-server graceful degradation. JSON config at `/data/mcp.json` (bind-mounted alongside `AGENTS.md`) matches opencode's `mcpServers` schema shape so server entries are copy-pasteable. Config file missing = no MCP (opt-in by file presence). Stdio transport spawns a persistent subprocess via the SDK's `StdioClientTransport` with NDJSON framing; Streamable HTTP reuses the v1.14.1 pattern via `StreamableHTTPClientTransport`. Tool prefix generalized from `context7_<name>` to `<serverName>_<toolName>` with a reverse `toolToServer` map for dispatch routing. Per-agent AGENTS.md `tools:` field now supports glob patterns (`context7_*`, `!web_*`) via `matchToolGlob` (last-match-wins, `!` prefix denies); replaces the exact-match `.includes()` in `stream-phase.ts`. Glob patterns bypass `ALL_TOOL_NAMES` validation in the parser since MCP tool names aren't known at parse time. `refreshToolNames()` in `agents.ts` rebuilds the `DEFAULT_TOOLS` snapshot after `appendMcpTools` so agents without explicit `tools:` lists see MCP tools — reviewer caught that the module-load-time snapshot would permanently exclude late-registered tools. Read-only invariant preserved: all MCP tools with `readOnlyHint: false` rejected at discovery. Result size capped at 5MB. Shutdown hook closes all transports. v1.14.1 env vars (`MCP_CONTEXT7_URL`, `MCP_CONTEXT7_API_KEY`) removed — superseded by the config file. Default `data/mcp.json` ships with Context7 disabled; flip `"enabled": true` to activate. 363/363 server tests passing (27 new: multi-server wrapping, glob matching, routing, degradation). No schema changes, no frontend changes.
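Last-match-wins glob matching with `!` denial can be sketched as follows. `matchToolGlob` is named in the entry, but this signature, the `*`-only glob syntax, and denying names that match no pattern are assumptions.

```typescript
// Convert a glob with `*` wildcards into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Walk the patterns in order; the last pattern that matches decides the outcome.
function matchToolGlob(patterns: string[], toolName: string): boolean {
  let allowed = false; // assumption: unmatched names are not allowed
  for (const pattern of patterns) {
    const deny = pattern.startsWith("!");
    const glob = deny ? pattern.slice(1) : pattern;
    if (globToRegExp(glob).test(toolName)) {
      allowed = !deny;
    }
  }
  return allowed;
}
```

Because MCP tool names are only known after discovery, patterns like `context7_*` can be written in AGENTS.md before the server has ever connected.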
## v1.14.1-mcp-poc — 2026-05-23
Single-server MCP client PoC against Context7. New `apps/server/src/services/mcp-client.ts` (~200 lines) wraps `@modelcontextprotocol/sdk` v1.29.0 with Streamable HTTP transport. On startup (when `MCP_CONTEXT7_URL` is set), connects to Context7, discovers tools via `tools/list`, wraps each as a `ToolDef` prefixed `context7_<name>`, and appends to `ALL_TOOLS` (alpha-sorted for prompt-cache stability). `appendMcpTools()` in `tools.ts` handles the late-registration; `ALL_TOOLS` changed from `ReadonlyArray` to mutable to support it. Read-only invariant guard rejects any MCP tool with `readOnlyHint: false` (MCP SDK v1.29.0 uses `readOnlyHint`, not `readOnly`). Tool dispatch is transparent — `executeToolCall` routes MCP tool calls through the `ToolDef.execute` wrapper, which strips the `context7_` prefix before calling the MCP server. Graceful degradation: MCP server down at startup → zero tools, warn log; MCP server down mid-session → error-shaped result, model self-corrects. Result size capped at 5MB with truncation (matches native `view_file`'s `MAX_FILE_BYTES`). Adversarial review caught that the Zod `.default('https://...')` on the URL config made MCP effectively always-on instead of opt-in — fixed by removing the default. 348/348 server tests passing (16 new mcp-client tests covering tool wrapping, read-only guard, name prefixing, content extraction). No schema changes, no frontend changes. Proves the MCP tool-discovery → tool-call → result-render loop end-to-end before the full v1.15 port.
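The discovery step described above (prefix, read-only guard, alpha-sort) might look roughly like this. The `McpTool` and `WrappedTool` shapes are simplified stand-ins, not the SDK's or BooCode's real types, and treating a missing `readOnlyHint` as acceptable is an assumption: the entry only says that `readOnlyHint: false` is rejected.

```typescript
// Simplified stand-ins; both shapes are assumptions for illustration.
type McpTool = { name: string; annotations?: { readOnlyHint?: boolean } };
type WrappedTool = { name: string; server: string; remoteName: string };

function wrapMcpTools(server: string, tools: McpTool[]): { wrapped: WrappedTool[]; rejected: string[] } {
  const wrapped: WrappedTool[] = [];
  const rejected: string[] = [];
  for (const tool of tools) {
    // Read-only invariant: drop anything that declares readOnlyHint: false.
    if (tool.annotations?.readOnlyHint === false) {
      rejected.push(tool.name);
      continue;
    }
    // `remoteName` is what gets sent to the MCP server once the prefix is stripped.
    wrapped.push({ name: `${server}_${tool.name}`, server, remoteName: tool.name });
  }
  // Alpha-sort so the tool list stays byte-stable across restarts.
  wrapped.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  return { wrapped, rejected };
}
```

Keeping `remoteName` next to the prefixed name avoids re-deriving it by string slicing at dispatch time.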
## v1.14.0-outer-loop — 2026-05-23
Converts the inference engine's ad-hoc `executeToolPhase → runAssistantTurn` recursion into an explicit `while` loop with a configurable step cap. A step is one stream-and-tool-execute iteration; the loop terminates on non-tool finish, step-cap hit, doom-loop, budget exhaustion, abort, or synthesis success. `MAX_STEPS = 200` is the hard ceiling (4x the old effective limit from budget); per-agent `steps:` field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5, Architect: 20, others: unset = bounded only by MAX_STEPS). `executeToolPhase` no longer recurses — returns a `ToolPhaseResult` struct (`action: 'continue' | 'paused' | 'synthesis_done'`) so the caller (the while loop) decides whether to continue or break. `steps: 0` is handled as "no tool calls allowed" — one text-only stream phase, tool calls ignored with a warn log. Step-cap hits produce a sentinel summary (reuses `cap_hit` kind so `CapHitSentinel.tsx` renders it without frontend changes; text distinguishes "Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated from pre-recursion position to top of loop body — same predicate (`detectDoomLoop`), same threshold (3 identical calls), `break` instead of `return`. `step_start` parts are in the schema CHECK but not emitted as message_parts in v1.14 — writing to the assistant message before the stream phase creates a sequence-0 collision with `partsFromAssistantMessage`; a structured log line is emitted instead. Adversarial review caught the collision pre-deploy. 332/332 server tests passing; no frontend changes. Pairs with `v1.13.20-drop-legacy-cols` (parts is now the sole source of truth, and this batch's loop operates entirely through parts).
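A stripped-down sketch of the loop shape described above. `Phases`, `StreamResult`, `StopReason`, and `runLoop` are hypothetical names; budget exhaustion and abort are left out; and placing the cap check after the stream phase is a guess that happens to make `steps: 0` behave as one text-only stream phase.

```typescript
type ToolPhaseResult = { action: "continue" | "paused" | "synthesis_done" };
type StreamResult = { hasToolCalls: boolean };
type StopReason = "finished" | "step_cap" | "doom_loop" | "paused" | "synthesis_done";

const MAX_STEPS = 200; // hard ceiling named in the entry

// Injected so the sketch stays self-contained; the real phases take far more context.
interface Phases {
  streamPhase(): Promise<StreamResult>;
  executeToolPhase(): Promise<ToolPhaseResult>;
  detectDoomLoop(): boolean;
}

async function runLoop(phases: Phases, agentSteps?: number): Promise<{ reason: StopReason; steps: number }> {
  // A per-agent `steps:` value can only tighten the cap, never exceed MAX_STEPS.
  const cap = Math.min(agentSteps ?? MAX_STEPS, MAX_STEPS);
  let steps = 0;
  while (true) {
    // Doom-loop check sits at the top of the loop body.
    if (phases.detectDoomLoop()) return { reason: "doom_loop", steps };
    const stream = await phases.streamPhase();
    if (!stream.hasToolCalls) return { reason: "finished", steps };
    // Tool calls were requested but the cap is spent: stop without executing them.
    if (steps >= cap) return { reason: "step_cap", steps };
    const result = await phases.executeToolPhase();
    steps++;
    // The tool phase no longer recurses; it reports back and the loop decides.
    if (result.action !== "continue") return { reason: result.action, steps };
  }
}
```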
## v1.13.20-drop-legacy-cols — 2026-05-23
Final phase of the v1.13.0 strangler-fig migration. Removes the dual-write into `messages.tool_calls` / `messages.tool_results` JSON columns and drops the columns themselves; `message_parts` is now the only source of truth for tool-call and tool-result data. 10 dual-write sites stripped (5 in `tool-phase.ts`, 2 in `routes/skills.ts`, 2 in `routes/messages.ts`, 1 in `routes/chats.ts` fork-clone) — recon's grep-driven inventory caught 2 sites beyond the original v1.13.2 roadmap count. `messages_with_parts` view simplified to parts-only subselects (COALESCE fallbacks gone) and rewritten via `CREATE OR REPLACE VIEW` BEFORE the column DROP since Postgres rejects column-drop on view-referenced cols. Adversarial review caught a runtime bug the green test suite missed: `chats.ts:/api/chats/:id/discard_stale` had a `RETURNING ... tool_calls, tool_results, ...` clause referencing the dropped columns; would have crashed on every 60s-no-token-activity recovery in production. Fixed by switching to two-step UPDATE-then-SELECT-from-view so the response keeps the parts-synthesized fields. `Message` API type retains `tool_calls?` / `tool_results?` fields (override on the original v1.13.2 plan) — the view continues to populate them from parts, so the wire shape is unchanged and the frontend needs no updates. v1.12.1 cleanup block (`DROP CONSTRAINT messages_status_check`/`messages_role_check`) removed — those one-shots have done their work. `tool_cost_stats.test.ts` had a direct `INSERT INTO messages` touching the legacy columns that wasn't in the roadmap's inventory; rewritten to parts-table inserts and confirmed semantically faithful. 339/339 server tests passing including the 7 DB-integration tests (live-DB applied the schema migration and ran the parts-only view end-to-end). Pairs with `v1.13.0-ai-sdk-v6` (which introduced the dual-write) and `v1.13.1-B` (which moved the read path to `messages_with_parts`); umbrella `v1.13` tag ships on the same commit.
## v1.13.19-html-artifact-panes — 2026-05-23
Pane-based artifact viewer with on-request HTML support. Every assistant message gets an "Open in pane" icon button (`PanelRightOpen`, mobile 44px tap-target) in `MessageBubble`'s ActionRow; click opens the message in the workspace splitter as either a Markdown pane (Copy raw source + Download `.md`) or an HTML pane (Download `.html` only, no Copy). The HTML path triggers when the model emits a self-contained `<!DOCTYPE html>` or fenced ` ```html` artifact (opt-in only — `BOOCHAT.md` rule says Markdown is default at every length; HTML only on explicit user request like "render this as HTML"). Backend detection in `finalizeCompletion` (`error-handler.ts`) writes a new `message_parts.kind='html_artifact'` row with payload `{html_content, char_count, title}` (`<title>` → first `<h1>` → first 80 chars of inner text). Schema CHECK extended via the v1.13.13 drop-and-re-add pattern. 1MB cap is graceful — over-cap artifacts skip the part write and plain content lands; decision factored into a pure `decideHtmlArtifactWrite` helper so the warn-and-skip branch is unit-testable without mocking the full InferenceContext. Pane state is reference-only (`{chat_id, message_id, title}`) — content is fetched on mount, keeping `sessions.workspace_panes` jsonb small and avoiding 1MB blobs riding the `session_workspace_updated` WS frame. New `services/artifacts.ts` ships slug derivation (Markdown: first `#` heading → first 6 words; HTML: `<title>` → `<h1>` → inner text) and write helpers that realpath the artifacts directory after `mkdir` to close a symlink-escape gap (`assertArtifactsDirSafe`). `routes/artifacts.ts` exposes POST `/api/chats/:id/messages/:msg_id/artifacts/download?fmt=md|html` (writes to `<projectRoot>/.boocode/artifacts/<slug>-<ts>.<ext>`) plus GET `/api/projects/:project_id/artifacts/:filename` with `Content-Disposition: attachment`, `X-Content-Type-Options: nosniff`, and `Content-Security-Policy: sandbox` defense-in-depth on LLM-served HTML.
iframe sandbox locks to `allow-scripts allow-clipboard-write allow-downloads` with no `allow-same-origin` and uses `srcDoc` (not `src`) for opaque-origin isolation. Frontend extracts `MarkdownRenderer.tsx` from `MessageBubble`'s inline `MarkdownBody` for reuse; `MarkdownArtifactPane.tsx` / `HtmlArtifactPane.tsx` render with loading + error states. 404-vs-real-error discrimination in `openInPane`: a real network/500 failure toasts and bails instead of silently masquerading as a Markdown pane. 31 new server unit tests (slug derivation, detection positive/negative, write helpers, symlink-escape, 1MB cap, real-symlink filesystem test); 332/332 server tests passing; `tsc -p apps/web/tsconfig.app.json --noEmit` clean; `pnpm -C apps/web build` green. Smoke deferred to first deploy.
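The detect-then-decide step for HTML artifacts could be sketched as a pure function like the one below. Everything here is illustrative: the regexes, the `Decision` shape, and the use of a character count as a stand-in for the 1MB cap are assumptions, and `decideHtmlArtifact` is not the real `decideHtmlArtifactWrite` signature.

```typescript
const MAX_HTML_CHARS = 1_000_000; // stand-in for the "1MB cap"; real unit and value are assumptions
const FENCE = "`".repeat(3);
const FENCED_HTML = new RegExp(FENCE + "html\\s*\\n([\\s\\S]*?)" + FENCE, "i");

type Decision =
  | { write: true; payload: { html_content: string; char_count: number; title: string } }
  | { write: false; reason: "not_html" | "over_cap" };

// Pull the HTML out of either a fenced html block or a bare <!DOCTYPE html> document.
function extractHtml(content: string): string | null {
  const fenced = content.match(FENCED_HTML);
  if (fenced) return fenced[1].trim();
  const trimmed = content.trim();
  return /^<!doctype html/i.test(trimmed) ? trimmed : null;
}

// Title fallback chain: <title>, then first <h1>, then first 80 chars of inner text.
function deriveTitle(html: string): string {
  const pick = (re: RegExp) => html.match(re)?.[1].replace(/<[^>]*>/g, "").trim();
  return (
    pick(/<title[^>]*>([\s\S]*?)<\/title>/i) ||
    pick(/<h1[^>]*>([\s\S]*?)<\/h1>/i) ||
    html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim().slice(0, 80)
  );
}

// Pure decision: no DB or logger needed, so the skip branch is easy to unit-test.
function decideHtmlArtifact(content: string, maxChars = MAX_HTML_CHARS): Decision {
  const html = extractHtml(content);
  if (html === null) return { write: false, reason: "not_html" };
  if (html.length > maxChars) return { write: false, reason: "over_cap" };
  return { write: true, payload: { html_content: html, char_count: html.length, title: deriveTitle(html) } };
}
```

On `over_cap` the caller would log a warning and let the plain message content stand, which matches the graceful behaviour the entry describes.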
## v1.13.18-codecontext-file-path — 2026-05-22
Fix: four codecontext wrappers (`get_file_analysis`, `get_symbol_info`, `get_dependencies`, `get_semantic_neighborhoods`) forwarded `file_path` to the sidecar unchanged, but the sidecar's index is keyed on absolute paths — every relative path from the model returned "File not found in graph" (three back-to-back failures in one chat at 17:56 UTC, ~48 s of wasted tool budget). New `resolveProjectPath` helper in `codecontext_client.ts:64-89` realpath-resolves the candidate, applies the same escape check as the existing `target_dir` resolver (matching the error template byte-for-byte except the field name), and falls through with the normalised absolute on ENOENT so the sidecar issues its own self-correctable "File not found" error. Wired into `callCodecontext` once at the args-spread site — all four wrappers benefit without per-wrapper edits. `.trim()` added to all four `file_path` Zod schemas to absorb trailing newlines from model output. Adversarial review caught a P2 escape-bypass: an absolute path with `..` (e.g. `<projectRoot>/../etc/passwd`) that ENOENTs at realpath would slip through the literal prefix-check, fixed by `resolve()`-normalising the absolute branch too. 9 new test cases in `codecontext_client.test.ts` (7 spec scenarios + symlink-out-of-root + absolute-with-`..` ENOENT) plus a 1-line update in `codecontext_tools.test.ts` asserting the new resolved-absolute contract. Pairs with `v1.13.17-cross-repo-reads` — both harden path traversal, but v1.13.18 stays inside the project root while v1.13.17 widens access outside it.
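The normalise-then-check half of that fix can be illustrated without touching the filesystem. This is only a sketch of the idea that both relative and absolute inputs pass through `resolve()` before the containment check; `resolveInsideRoot` is a hypothetical name, the error text is invented, and the realpath and ENOENT handling of the real helper is omitted.

```typescript
import * as nodePath from "node:path";

const p = nodePath.posix; // POSIX semantics, as inside a Linux container

function resolveInsideRoot(projectRoot: string, candidate: string): string {
  const root = p.resolve(projectRoot);
  // resolve() covers both branches: relative input is anchored at the root,
  // absolute input is returned normalised, so `..` segments collapse either way.
  const abs = p.resolve(root, candidate.trim());
  // Compare on the relative path rather than a raw string prefix, so that
  // a sibling such as `/opt/projects/app-evil` cannot pass as `/opt/projects/app`.
  const rel = p.relative(root, abs);
  if (rel === ".." || rel.startsWith("../")) {
    throw new Error(`file_path escapes project root: ${candidate.trim()}`);
  }
  return abs;
}
```

With this shape, `<projectRoot>/../etc/passwd` is rejected even when the file does not exist, because the `..` is collapsed before any comparison happens.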
## v1.13.17-cross-repo-reads — 2026-05-22
On-demand read access to paths outside the session's primary project root. Closes the dead-end where `pathGuard` rejected every cross-repo read with no recovery path. New `request_read_access(path, reason)` tool emits an `ask_user_input`-style pause; user picks Allow/Deny via inline chips in `RequestReadAccessCard.tsx`; on Allow, the new `POST /api/chats/:id/grant_read_access` endpoint re-resolves the grant root and appends to `sessions.allowed_read_paths` (new `TEXT[]` column, default empty). Grant unit per design D1 = nearest registered `projects.path` ancestor → else nearest repo-shaped ancestor (`.git/` / `package.json` / `go.mod` / `Cargo.toml`) under `PROJECT_ROOT_WHITELIST` → else refuse without prompting. `pathGuard` extended with an optional `extraRoots` argument threaded from `session.allowed_read_paths` through `executeToolCall` to the four filesystem tools (view_file, list_dir, grep, find_files); `view_file` re-anchors the secret-guard check on `basename(real)` whenever the path resolved via a grant root so `.env` / `id_rsa*` deny still fires across grants. `grant_resolver.ts`'s ancestor walk checks the whitelist invariant on every iteration (not just final parent) so a symlinked input can't escape mid-walk. PATCH `/api/sessions/:id` exposes `allowed_read_paths` only for revocation: zod refines paths to absolute + no traversal markers, and a runtime subset guard (`findUnauthorizedAdditions`) rejects any entry not already present in the row, so a malicious `curl -X PATCH -d '{"allowed_read_paths":["/etc"]}'` 400s instead of bypassing the grant flow. Settings pane gains a per-session revoke list; archiving the session clears grants implicitly. 11 grant_resolver tests pin the symlink-escape-mid-walk guard (Sam's checkpoint-1 ask) and the nearest-project disambiguation; 8 path_guard tests cover extraRoots traversal; 8 sessions PATCH tests cover the subset guard including the `/etc` bypass attempt. 
Pairs with `v1.13.16-xml-parser` (model now both self-recovers from a wrong tool name AND from a refused path).
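The revocation-only rule on PATCH reduces to a subset check. A minimal sketch, assuming `findUnauthorizedAdditions` takes the current and requested lists (the real signature is not shown in the entry, and `validateRevocationPatch` is a hypothetical wrapper):

```typescript
// Every requested path that is not already granted on the session row.
function findUnauthorizedAdditions(current: string[], requested: string[]): string[] {
  const granted = new Set(current);
  return requested.filter((path) => !granted.has(path));
}

type PatchResult = { ok: true; next: string[] } | { ok: false; rejected: string[] };

// A PATCH may only shrink the list, so any addition means the request is refused (HTTP 400).
function validateRevocationPatch(current: string[], requested: string[]): PatchResult {
  const rejected = findUnauthorizedAdditions(current, requested);
  return rejected.length > 0 ? { ok: false, rejected } : { ok: true, next: requested };
}
```

New grants can then only arrive through the `grant_read_access` flow, where the user has explicitly approved them.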
## v1.13.16-xml-parser — 2026-05-22
Two-part fix for the model-emitted XML drift the v1.13.15 investigation surfaced. **Parser extension:** `xml-parser.ts` now recognizes the Anthropic `<invoke name="…"><parameter name="…">…</parameter></invoke>` shape alongside the existing Qwen/Hermes `<tool_call><function=…>…</function></tool_call>` shape. qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted as an Architect-style agent (Claude Code documentation in its pre-training corpus). Both formats route through the same synthetic-id `xml_call_${idx}` ToolCall path. The existing Qwen parser was tightened to tolerate whitespace around `=` (`<function = name>` shape) so a stray space doesn't get absorbed into the function name. **Unknown-tool recovery hint:** new `tool-suggestions.ts` exports `levenshtein()` + `suggestToolName()` + `formatUnknownToolError()`. When the dispatcher (`tool-phase.ts:executeToolCall`) receives an unknown tool name, the error returned to the model includes a "Did you mean: X?" hint based on Levenshtein distance ≤3 or substring match against `Object.keys(TOOLS_BY_NAME)`. Targets the qwen3.6 drift to `read_file` → suggest `view_file`. Test coverage in `xml-parser.test.ts` (46 tests, all green) covers both parsers, the partial-opener detector for both flavors, the unified extraction helper, and the new error formatter.
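A sketch of the suggestion logic, assuming whole-name Levenshtein distance plus a simple two-way substring test. The signatures are guesses, and letting substring matches rank ahead of near-misses is a choice made here, not something the entry states.

```typescript
// Classic two-row dynamic-programming edit distance.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the processed prefix of `a` and the first j chars of `b`.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Closest known tool within `maxDistance`, or null when nothing is close enough.
function suggestToolName(unknown: string, known: string[], maxDistance = 3): string | null {
  if (unknown.length === 0) return null;
  let best: { name: string; distance: number } | null = null;
  for (const name of known) {
    // Substring matches count as distance 0 so they outrank near-misses.
    const substring = name.includes(unknown) || unknown.includes(name);
    const distance = substring ? 0 : levenshtein(unknown, name);
    if (distance <= maxDistance && (best === null || distance < best.distance)) {
      best = { name, distance };
    }
  }
  return best ? best.name : null;
}

function formatUnknownToolError(unknown: string, known: string[]): string {
  const hint = suggestToolName(unknown, known);
  return hint ? `Unknown tool: ${unknown}. Did you mean: ${hint}?` : `Unknown tool: ${unknown}.`;
}
```

Note that under plain whole-name Levenshtein, `read_file` and `view_file` are 4 edits apart, so this sketch alone would not produce that particular suggestion; the real matcher presumably handles that case some other way.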
## v1.13.15-codecontext-synth — 2026-05-22
Forced second-inference synthesis pass for codecontext overview-class tools (`get_codebase_overview`, `get_framework_analysis`, `get_semantic_neighborhoods`). After the tool result lands, the pipeline expands the truncated head via in-process `readTruncation`, extracts referenced file paths from the full content, auto-fetches top-N files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md) under a 32k-token budget with explicit drop-priority order, then streams a synthesis turn that replaces the recursive `runAssistantTurn`. The 32k truncated head still ships to the synth model (token-budget contract preserved); the expansion is reference-extraction-only. Falls through to recursion on timeout (90s), model error, or non-2xx; user-abort marks the synth message `status='failed'` and re-throws (the outer abort handler operates on the parent turn's message, not the new synth row — without explicit marking, the row would sit `streaming` until the 5-min sweeper, tripping the 60s stale-stream banner). Adds `'synthesis'` to `message_parts.kind` CHECK constraint via `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` idempotency-guarded re-add. Smokes #1, #2, #6 all clean; smokes #3–#5 are content-quality checks for UI review.
## v1.13.14-skills-audit — 2026-05-22
Multi-topic batch. **Skills audit (headline):** vendored all 26 skills from `/home/samkintop/opt/skills/` into repo-local `data/skills/` (the `/opt/skills:/data/skills` override mount removed from `docker-compose.yml` so skills are auditable per-batch in git). Audited via 5 parallel Claude Code agent-teams running mgechev's 4-step protocol per skill — 14 survive with gerund-form names + refined triggers; 11 dropped (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively); 1 (`verification-before-completion`) migrated to `BOOCHAT.md`/`BOOCODER.md` as an always-true rule. The Codeminer42 "rules vs recipes" split codified in those files. **Token tracking + stale-stream banner fix:** same root cause — `IsoTimestamp = z.string()` in `ws-frames.ts` was failing on postgres `Date` objects, silently dropping every `message_complete` / `session_updated` / `chat_updated` frame through the `v1.13.13-ws-publish` Zod gate; `z.preprocess(v => v instanceof Date ? v.toISOString() : v, ...)` applied to the primitive on both server + web (parity test still passes). **Codecontext ignore:** `codecontext_client.ts` auto-installs `.codecontextignore.template` into any project's root on first call (stops the upstream empty-source-file parser crash on foreign projects' `node_modules`). **Budget bump:** `BUDGET_READ_ONLY` + `BUDGET_NO_AGENT` 30 → 50 (real recon need ~27 + headroom for codecontext failure-retry turns; doom-loop guard catches the loop class anyway). **UI:** queued-message dropdown → edit / force-send / cancel buttons in `ChatPane.tsx`; `ChatThroughput` removed from desktop tab strip (mobile tab switcher keeps it). Audit decisions in `openspec/changes/v1.13.12-skills-audit/audit-notes.md`.
## v1.13.13-ws-publish — 2026-05-22
Second half of the WebSocket-frame-typing batch. Converts the existing ~50 inference + auto_name publish sites (via the `index.ts` adapter) plus ~30 direct `broker.publish*` call sites in routes + compaction, so every server-emitted frame now goes through Zod validation at the broker boundary. Pairs with `v1.13.12-ws-schemas`.
## v1.13.12-ws-schemas — 2026-05-22
First half of the WebSocket-frame-typing batch. Adds `apps/server/src/types/ws-frames.ts` with Zod schemas for all 27 wire-format frame types (discriminated union `WsFrameSchema` + `KNOWN_FRAME_TYPES` diagnostic lookup), duplicated byte-identical at `apps/web/src/api/ws-frames.ts` with a parity test. Introduces the `publishFrame` / `publishUserFrame` wrappers that fail-closed on schema mismatch.
## v1.13.11-tools — 2026-05-22
Tiered tool loading via `BOOCODE_TOOLS` env var (`core` | `standard` | `all`). Core = 4 read-only fs tools (~2k token schema cost). Standard = +web + git + codecontext (~10k). All (default) = every tool in `ALL_TOOLS` (~21k). The var is a ceiling — narrows agent whitelists, never expands. Pattern lifted from `eyaltoledano/claude-task-master`.
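The ceiling semantics amount to an intersection. In the sketch below the tier membership lists are illustrative placeholders (the entry gives only categories and counts), and falling back to `all` on an unrecognised value is an assumption.

```typescript
type Tier = "core" | "standard" | "all";

// Illustrative membership only; the real tier lists are not given in the entry.
const CORE_TOOLS = ["find_files", "grep", "list_dir", "view_file"];
const STANDARD_TOOLS = [...CORE_TOOLS, "get_codebase_overview", "git_status", "web_fetch", "web_search"];

// Unset means "all" (the documented default); treating invalid values the same way is a guess.
function parseTier(raw: string | undefined): Tier {
  return raw === "core" || raw === "standard" ? raw : "all";
}

function resolveTools(tier: Tier, registry: string[], agentWhitelist?: string[]): string[] {
  const tierSet = tier === "core" ? CORE_TOOLS : tier === "standard" ? STANDARD_TOOLS : registry;
  // Ceiling: a tool must be registered, inside the tier, and (when the agent
  // declares a whitelist) inside that whitelist. Nothing is ever added.
  return registry
    .filter((tool) => tierSet.includes(tool))
    .filter((tool) => agentWhitelist === undefined || agentWhitelist.includes(tool));
}
```

An agent that whitelists `web_fetch` still loses it under `BOOCODE_TOOLS=core`, because the tier filter runs regardless of what the agent asks for.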
## v1.13.10-openspec — 2026-05-22
Adopt `Fission-AI/OpenSpec`'s `openspec/changes/<slug>/{proposal,tasks,design}.md` shape for BooCode's own batch docs. Existing batch docs (`boocode_batch10.md`, `handoff_v1.13.8_prefix_verify.md`, `handoff_v1.13.10_per_tool_cost.md`) moved into `openspec/changes/archived/` via `git mv` to preserve history. Zero-dep documentation reformat.
## v1.13.9-agentlint — 2026-05-22
Manual audit of instruction files against `0xmariowu/AgentLint`'s 31-check standard. Removed identity-opener sections from `BOOCHAT.md` and `BOOCODER.md` (emphatic decoration the model doesn't need). Added `CLAUDE.local.md` to `.gitignore` — Claude Code's Glob ignores `.gitignore` by default, so local overrides were otherwise readable by any agent walking the workspace. `CLAUDE.md` passed all 10 checks unchanged.
## v1.13.8-tool-cost — 2026-05-22
Per-tool prompt/completion-token rolling averages surfaced in AgentPicker as at-a-glance cost hints. Implementation is the `tool_cost_stats` SQL view over `messages_with_parts` (`LATERAL jsonb_array_elements` on `tool_calls`), plus a read endpoint and a tooltip extension. Equal-split attribution — multi-tool turn divides tokens N-ways; the 100-call rolling mean absorbs split noise. Filters out `cap_hit` / `doom_loop` sentinels. Source data already lands via existing UPDATEs that `v1.13.5-stability-bundle`'s `includeUsage: true` fix made non-NULL.
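The real implementation is a SQL view; the TypeScript below only illustrates the arithmetic of equal-split attribution with a rolling window. The `Turn` shape and function name are hypothetical, turns are assumed to arrive oldest first, and the sentinel filtering is left out.

```typescript
type Turn = { tools: string[]; promptTokens: number; completionTokens: number };
type ToolCost = { calls: number; avgPrompt: number; avgCompletion: number };
type Sample = { prompt: number; completion: number };

function toolCostStats(turns: Turn[], window = 100): Map<string, ToolCost> {
  const samples = new Map<string, Sample[]>();
  for (const turn of turns) {
    const n = turn.tools.length;
    if (n === 0) continue;
    for (const tool of turn.tools) {
      // Equal split: a turn with N tool calls charges each call 1/N of its tokens.
      const list = samples.get(tool) ?? [];
      list.push({ prompt: turn.promptTokens / n, completion: turn.completionTokens / n });
      samples.set(tool, list);
    }
  }
  const stats = new Map<string, ToolCost>();
  samples.forEach((list, tool) => {
    // Rolling mean over the most recent `window` calls.
    const recent = list.slice(-window);
    const mean = (pick: (s: Sample) => number) =>
      recent.reduce((sum, s) => sum + pick(s), 0) / recent.length;
    stats.set(tool, {
      calls: recent.length,
      avgPrompt: mean((s) => s.prompt),
      avgCompletion: mean((s) => s.completion),
    });
  });
  return stats;
}
```

For example, a 300-token turn that called both `grep` and `view_file` contributes 150 prompt tokens to each tool's average.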
## v1.13.7-compaction-trigger — 2026-05-22
Compaction overflow trigger lowered to `floor(0.85 × ctx_max)`, replacing the v1.11.0-era `ctx_max - 20_000` formula. Old formula gave only 7.6% headroom at 262k context and 0 budget for ≤20k contexts (never fired). New formula gives consistent 15% summarizer headroom across all model sizes. Opencode pattern lift from `session/overflow.ts`.
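The headroom figures can be checked with a few lines of arithmetic. This sketch reads the old formula as a subtraction (`ctx_max - 20_000`), which is what the quoted 7.6% implies, and assumes "262k" means 262,144 tokens; the function names are hypothetical.

```typescript
// Token count at which compaction fires, under each formula.
const oldTrigger = (ctxMax: number): number => ctxMax - 20_000;
const newTrigger = (ctxMax: number): number => Math.floor(0.85 * ctxMax);

// Share of the context window left for the summarizer once the trigger fires.
const headroomPct = (ctxMax: number, trigger: number): number =>
  ((ctxMax - trigger) / ctxMax) * 100;

const big = 262_144; // assumed exact size behind "262k"
const small = 16_384; // any window of 20k or less shows the old formula's failure mode

console.log(headroomPct(big, oldTrigger(big)).toFixed(1)); // 7.6
console.log(headroomPct(big, newTrigger(big)).toFixed(1)); // 15.0
console.log(oldTrigger(small)); // negative: the trigger point falls below zero tokens
console.log(newTrigger(small)); // 13926
```

The fixed 20,000-token margin shrinks as a fraction of large windows and swallows small ones entirely, whereas a proportional trigger keeps the same relative headroom at every size.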
## v1.13.6-prefix-stability — 2026-05-22
System-prompt prefix stability verify-and-measure. Recon during planning disproved the original DB-cache premise: `buildSystemPrompt` already runs over inputs mtime-cached at the file layer (BOOCHAT.md, AGENTS.md global+per-project), and DB scalars are byte-stable until edited. This batch closes the verification gap with instrumentation, not implementation — `buildSystemPromptWithFingerprint` computes SHA-256 over the assembled prefix and a per-session `Map` observer fires `prefix-drift` (warn) on hash change with field-level `changed_inputs` diff.
## v1.13.5-stability-bundle — 2026-05-22
Five fixes for latent regressions surfaced during the cosmetic-revert investigation. (1) `provider.ts`: `includeUsage: true` on `createOpenAICompatible` (default false omitted `stream_options.include_usage`; llama-swap never emitted usage; tokens_used / ctx_used were NULL on every assistant row since `v1.13.0-ai-sdk-v6`). (2) `MessageList.tsx`: `hasText = m.content.trim().length > 0` to skip whitespace-only tool-call-only turns rendering empty bubbles. (3) `BUDGET_NO_AGENT` raised 15 → 30 to match read-only agent cap. (4) `payload.ts` skips status='failed' + complete-but-empty assistant rows so cap-hit + Continue doesn't upstream-reject. (5) Misc UI sanitization.
## v1.13.4-reasoning-fix — 2026-05-22
Compaction head-assembly audit caught one fix: reasoning was omitted from the summarizer's view of tool-bearing turns, silently degrading summary quality for reasoning-channel models (qwen3.6). `v1.13.0-ai-sdk-v6` had wired reasoning end-to-end into inference but missed this one read site. `CompactionMessage` extended with `reasoning_parts`; `buildHeadPayload` embeds it as a `<reasoning>...</reasoning>` prose prefix on the assistant content (OpenAI wire shape has no structured reasoning field).
## v1.13.3-truncate — 2026-05-22
Port of opencode's `truncate.ts`. Full tool output retrievable via opaque `tr_<12 base32 chars>` id (~60 bits entropy) and a new `view_truncated_output(id)` tool. Tmpfs storage at `/tmp/boocode-truncations/` (overridable via `BOOCODE_TRUNCATION_DIR`), 5MB cap, 7-day TTL, orphan-reap on the periodic 60s sweeper. Wired through four tools: `view_file`, `list_dir`, `web_fetch`, `codecontext_client`. Each returns the existing sliced view plus an `outputPath` field when truncation fires.
## v1.13.2-compaction-prune — 2026-05-22
Two-tier compaction prune — opencode pattern that was half-shipped in v1.11.0. New `message_parts.hidden_at` column with partial index on `WHERE hidden_at IS NULL`. `messages_with_parts` view changed from `COALESCE(parts, legacy)` to a CASE that distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → drop the row from the model payload" (smoke caught the `COALESCE` leaking hidden parts back via legacy fallback). `prune.ts` scans `tool_result` parts newest-first, protects the last 40k tokens, marks older candidates hidden once the combined estimate clears 20k.
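The newest-first scan might be sketched as below. The part shape, the function name, and the exact boundary behaviour (whether the part that crosses the 40k line is itself protected, and whether "clears 20k" is strict) are assumptions; the entry only gives the two thresholds.

```typescript
type ToolResultPart = { id: string; tokens: number; hidden: boolean };

const PROTECT_TOKENS = 40_000; // the most recent tool output stays visible
const MIN_PRUNE_TOKENS = 20_000; // do nothing unless at least this much would be freed

// `parts` is oldest-first. Returns the ids that should get `hidden_at` set.
function selectPartsToHide(parts: ToolResultPart[]): string[] {
  let seen = 0;
  let prunable = 0;
  const candidates: string[] = [];
  for (let i = parts.length - 1; i >= 0; i--) {
    const part = parts[i];
    if (part.hidden) continue; // already pruned on an earlier pass
    seen += part.tokens;
    // Assumption: the part that crosses the protect line is itself a candidate.
    if (seen > PROTECT_TOKENS) {
      prunable += part.tokens;
      candidates.push(part.id);
    }
  }
  // Only commit when the combined estimate clears the minimum.
  return prunable > MIN_PRUNE_TOKENS ? candidates : [];
}
```

The minimum keeps the pruner from hiding a single small result just because the chat crept past the protect line.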
## v1.13.1-cleanup-bundle — 2026-05-22
Four independent items owed from prior dispatches. (1) `statement_timeout = '30s'` at the database level (documented in `schema.sql` but applied operationally — `ALTER DATABASE` can't run inside a `DO` block). (2) Tool registry alpha-sorted at module load — llama.cpp's prompt cache hits on byte-identical prefixes; reordering tools near the top of the system prompt would invalidate every cached turn. (3) Periodic 60s stuck-row sweeper. (4) `experimental_repairToolCall` to keep streams alive on malformed qwen3.6 tool args (pass-through implementation — logs and forwards unmodified; existing zod-reject path routes back to the model).
## v1.13.0-ai-sdk-v6 — 2026-05-22
Major migration to AI SDK v6. Introduces the `streamCompletion` adapter (`services/inference/stream-phase.ts`) over `streamText`, with five known gotchas the LSP can't catch — abort signals swallowed by `fullStream` (post-iteration throw required), usage lands only at stream end via `await result.usage`, tools have no `execute` field (BooCode dispatches in `tool-phase.ts`), and tool-call-only turns may emit a leading `\n` text-delta. Also ships the `messages_with_parts` view (parts-merge read path) and wires `reasoning_parts` end-to-end via a `ReasoningPart` in the v6 ModelMessage. Ports `ask_user_input` correlation queries from JSON columns to `message_parts` JOINs.
## v1.12.4-inference-split — 2026-05-21
Complete `inference.ts` split into `services/inference/`. Pieces: `turn.ts` (orchestration — `runAssistantTurn` / `runInference` / `createInferenceRunner`), `sentinel-summaries.ts` (`runCapHitSummary`, `runDoomLoopSummary`), `stream-phase.ts`, `tool-phase.ts`, `provider.ts`, `payload.ts`, `prune.ts`, `budget.ts`, `xml-parser.ts`, `error-handler.ts`, `sentinels.ts`, `parts.ts`, `types.ts`. Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution).
## v1.12.3-stale-banner — 2026-05-21
Stale-stream banner with Retry/Discard. When an assistant message sits `status='streaming'` with no token activity for 60+ seconds, the chat shows a banner above the input. Both actions clear the stale row via new `POST /api/chats/:id/discard_stale` (updates `status='failed'`, publishes `chat_status='idle'`). Closes the UX gap from the 2026-05-21 debugging spiral — slow streams and dead streams now look different.
## v1.12.2-live-toks — 2026-05-21
Live tok/s + ctx display next to the status indicator. `ChatThroughput` renders inline beside `StatusDot` while streaming or tool_running. Subscribes to existing `'usage'` WS frames (500ms-throttled, carrying `completion_tokens` + `ctx_used` + `ctx_max`) via `sessionEvents`. Hides when status drops to idle/error or data is older than 10s. Addresses the same UX gap as `v1.12.3-stale-banner` — gives users a live token velocity readout that immediately distinguishes slow from dead.
## v1.12.1-stop-handler — 2026-05-21
`handleAbortOrError` now writes `status='cancelled'` on user stop; rows no longer stuck `streaming` forever. Drops stale `messages_status_check` constraint (only `messages_status_chk` remains, allowing 'cancelled' via TS `MESSAGE_STATUSES`). Removes `detectSameNameLoop` and `DOOM_LOOP_SAME_NAME_THRESHOLD` (added during the 2026-05-21 debugging spike, never fired in any real run) plus 12 verbose `ctx.log.info` diagnostic markers from the same spike. Bundles workspace pane sync + status indicator overhaul + startup hung-row sweep that landed earlier in v1.12.1 work.
## v1.12.0-codecontext — 2026-05-21
Adds the `codecontext` sidecar (Go-based code-graph indexer at `codecontext:8080/v1/<tool_name>` over `boocode_net`) plus container guidance and skills runtime updates. Introduces the `chat_status` WS frame (`streaming | tool_running | waiting_for_input | idle | error`, widened from `working|idle|error`). Drops the deprecated `session_panes` table — workspace pane state moves to `sessions.workspace_panes jsonb` for cross-device sync via `PATCH /api/sessions/:id/workspace`.
## v1.11.1-consolidation — 2026-05-21
Rollup of v1.11.0–v1.11.10 work that was shipped piecemeal. Covers anchored rolling compaction (single `summary=true` row per chat that supersedes itself), doom-loop guard via `detectDoomLoop`, `path_guard` secret-filename deny list, web tools (`web_search` against SearXNG + `web_fetch` with SSRF/private-IP block), and the 5MB stream-cap on response bodies with abort-on-overflow.
## v1.11.0-context-bar — 2026-05-20
Persistent context-window tracker in `ChatPane` + `ctx_max` capture via `${LLAMA_SWAP_URL}/upstream/<model>/props`. First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet — 60s negative cache TTL recovers on next turn. Replaced an earlier dead read of `parsed.timings.n_ctx` which never carried n_ctx.
## v1.10.1-booterm-user — 2026-05-19
Per-user shell privilege drop in the booterm container via `gosu` in `tmux.conf` default-command. Shells launched in browser terminal panes drop privs to `samkintop` rather than running as root inside the container.
## v1.10.0-booterm — 2026-05-18
Second container (`apps/booterm`, port 9501, bookworm-slim+glibc). Fastify + node-pty + tmux. Browser terminal panes connect via WS to `/ws/term/sessions/:sid/panes/:pid`; per-session tmux session `bc-<sid>`, per-pane window `term-<pid>`. xterm-addon-webgl with `document.fonts.load(...)`-gated init (Canvas2D doesn't honor `font-display: block`) and iOS-friendly visibility-change context recreation.
## v1.9.2-ask-user-input — 2026-05-18
`ask_user_input` elicitation tool. Pauses the inference loop and surfaces a prompt to the user; their response routes back as the tool result. Correlation initially via `messages.tool_calls` / `tool_results` JSON columns (later ported to `message_parts` in `v1.13.0-ai-sdk-v6`).
## v1.9.1-skills — 2026-05-18
Skills runtime + `/skill` slash command with autocomplete. Server-side parser, tools, `/api/skills`, and mount. Hardens `.dockerignore` to exclude `secrets/` and `data/`. Drops the type-to-confirm gate on chat delete (plain Cancel/Confirm only — per workspace convention).
## v1.9.0-themes-settings — 2026-05-17
Settings pane + per-project defaults + bulk archive + themes lift. `themes-v1` (18 preset palettes) ships in the same batch with a Settings picker for live theme switching.
## v1.8.2-cap-hit — 2026-05-17
Tool-loop cap-hit summary — when an assistant exceeds the per-turn tool budget, a sentinel `role='system'` row with `metadata.kind='cap_hit'` is inserted and a summary turn runs to give the user a coherent endpoint. Also compacts the tool-call UI rendering.
## v1.8.1-agents-global — 2026-05-16
Global agents (`data/AGENTS.md` bind-mounted at `/data/AGENTS.md`) + parser robustness + WS reconnect toast. Per-project `AGENTS.md` mechanism (`getAgentsForProject`) remains for *other* projects; the BooCode repo itself uses global-only to eliminate two-files-must-stay-in-sync drift.
## v1.8.0-agents — 2026-05-16
Tier 2 agents — `AGENTS.md` registry + per-session agent picker. Also lands mobile tab switcher, branch indicator, and the `git_status` tool.
## v1.7.0-drag-drop — 2026-05-16
Drag-drop + paste-as-attachment for long text in the chat input.
## v1.6.0-mobile — 2026-05-16
Full mobile suite. Adds `useViewport` (matchMedia breakpoints mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, synthetic `contextmenu`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Mobile headers with safe-area padding, hamburger left, FolderTree right. Tap targets at `max-md:min-h-[44px] max-md:min-w-[44px]`. Raises `MAX_TOOL_LOOP_DEPTH` 5 → 15. Right-rail becomes a drawer on mobile.
## v1.5.1-bootstrap — 2026-05-16
Bootstrap fixes — git + ssh installed in the boocode container, Tailscale host rewrite, `/opt/projects` label correction for the create-new-project bootstrap flow.
## v1.5.0-refactor-tests — 2026-05-16
Refactor split (FileBrowserPane / Workspace / `runAssistantTurn`) + vitest harness + unit tests for security-critical pure functions. Scopes the `/opt` mount to `/opt/projects` (writable) plus `PROJECT_ROOT_WHITELIST=/opt` (read-only resolution for add-existing). Surfaces swallowed errors and removes dead `session_renamed` paths.
## v1.4.0-fork-header — 2026-05-16
Fork from message + delete message + header polish + general housekeeping.
## v1.3.0-chats-projects — 2026-05-16
Chats-in-sessions era. Adds force-send, `/compact`, right-rail file browser, archive/rename/Open-in-Gitea sidebar context menu, archived projects landing page, create-project bootstrap with Gitea remote setup, landing-card buttons, 1000px content cap. Dedup audit and chat archive/delete from the sidebar.
## v1.2.0-multi-pane — 2026-05-15
Multi-pane workspace (batch 3, T1–T8). `session_panes` schema (later replaced by `sessions.workspace_panes jsonb` in v1.12.0), `Pane` discriminated union, broker user channel + `/api/ws/user`, `file_ops` + `file_index` services, `PaneShell` / `ChatPane` / `FileBrowserPane` / `PaneTab` / `Workspace` components, `usePanes` hook, Shiki integration in `CodeBlock`. Up to 5 panes per session; default chat pane created on `POST /api/sessions`.
## v1.1.0-markdown-sidebar — 2026-05-15
Markdown rendering, message actions, tok/s + ctx display, AI session naming. Sidebar restructure — chats nested under projects (max 5 + view-all), live updates via WS.
## v1.0.0-initial — 2026-05-14
Initial commit. Skeleton of the monorepo: `apps/server` (Fastify + postgres), `apps/web` (React + Vite), basic chat loop against llama-swap.

@@ -2,6 +2,8 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
**Cursor agents:** start with `docs/ARCHITECTURE.md` (diagram). This file is the deep engineering reference. (Note: the root navigation `AGENTS.md` was removed in v1.12; `data/AGENTS.md` is the agent *registry*, not navigation.)
## What is BooCode
Self-hosted single-user developer chat app. AI assistant with read-only file tools (view_file, list_dir, grep, find_files) running against a local llama-swap inference server. Sessions organized by project, with a multi-pane workspace (chat + file browser side by side).
@@ -46,15 +48,52 @@ Tests: `pnpm -C apps/server test` runs the vitest suite. No test harness on `app
- **Zod** for request validation and config parsing.
Key services:
- **`services/inference.ts`** — Streams LLM responses, executes tool loops (max depth 15, see `MAX_TOOL_LOOP_DEPTH`), flushes to DB every 500ms. Publishes `InferenceFrame` events through the broker. **`TurnArgs`** is the per-turn state envelope threaded through the `executeToolPhase → runAssistantTurn` recursion (`toolsUsed`, `recentToolCalls`, `assistantMessageId`, `signal`); reset to defaults in `runInference` at the user-message boundary. Cap-hit (`toolsUsed >= budget`) and doom-loop (`detectDoomLoop(recentToolCalls)`) checks both read from this envelope. Add new per-turn state here, not in module-level closures.
- **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart.
- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) go through three guard layers: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). v1.11.8+ web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (resolved with `project.default_web_search_enabled` fallback) and filtered out of the LLM's tool schema when false.
- **`services/compaction.ts`** + **`services/model-context.ts`** — v1.11.0 anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself on each compaction). Triggered when `chats.needs_compaction` is set after an inference turn exceeds `usable(ctx_max) = ctx_max - 20k`. **`ctx_max` comes from `model-context.getModelContext()` which fetches `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx` (the stream completion's `timings` doesn't carry n_ctx; that read was dead code until v1.11.3 ripped it out).
- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn / runInference / createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`, `MAX_STEPS`), `stream-phase.ts` (streamCompletion as a v1.13.1-A AI SDK adapter + executeStreamPhase), `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap), `tool-phase.ts` (executeToolPhase → returns `ToolPhaseResult`; no longer recurses into runAssistantTurn — v1.14.0 converted the recursion to an explicit while loop in turn.ts), `sentinel-summaries.ts` (runCapHitSummary + runDoomLoopSummary + runStepCapSummary + their sentinel inserters), `error-handler.ts` (handleAbortOrError, finalizeCompletion), `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`), `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`, sentinel predicates), `budget.ts` (resolveToolBudget), `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls), `parts.ts` (parts-table write helpers: `partsFromAssistantMessage`, `partsFromToolMessage`, `insertParts` — v1.13.20 made parts the sole source of truth), `prune.ts` (v1.13.4 two-tier compaction; `selectPruneTargets` is the pure decision helper), `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope populated from loop locals each iteration; reset in `runInference` at user-message boundary. The outer loop in `runAssistantTurn` (v1.14.0) runs `while (stepNumber < effectiveCap)` where `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS=200)`. Per-agent `steps:` field in AGENTS.md frontmatter. `steps: 0` means text-only (no tool execution). Step-cap hit writes a `cap_hit` sentinel so `CapHitSentinel.tsx` renders it.
- **AI SDK v6 streamCompletion adapter** (v1.13.1-A; `services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer above (executeStreamPhase, finalize, dual-write) is shape-preserved via an adapter. Five gotchas the LSP/test suite won't catch:
- **Abort signals are swallowed.** `streamText`'s `fullStream` iterator exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required; without it the row finalizes as `complete` instead of `cancelled`. Comment in stream-phase.ts pins this; don't refactor it away.
- **Usage lands only at stream end** via `await result.usage` (`inputTokens` / `outputTokens` v6 names → mapped to `promptTokens` / `completionTokens` for the existing onUsage callback). Mid-stream live tok/s is gone vs v1.12.2; ChatThroughput shows a single value at stream end.
- **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop. Only `description` + `inputSchema: jsonSchema(parameters)` — surfacing tool-call parts via `fullStream` and stopping is what we want.
- **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `services/inference/provider.ts`. The adapter defaults it false, omitting `stream_options.include_usage` from the request body; llama-swap then never emits the usage block and `result.usage.inputTokens/outputTokens` resolve to `undefined`. Latent regression from v1.13.1-A through v1.13.7 — every assistant row in that window has `tokens_used`/`ctx_used` NULL. Don't remove this flag during refactor.
- **Tool-call-only turns may emit a leading `\n` text-delta** as the assistant content. `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check — otherwise whitespace-only content renders an empty bubble + ActionRow between every tool call (v1.13.7 fix). `payload.ts:buildMessagesPayload` also skips `status='failed'` AND complete-but-empty (no content, no tool_calls) assistant rows to avoid "Cannot have 2 or more assistant messages at the end of the list" upstream rejections after cap-hit + Continue.
- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart` — BooCode's OpenAI-shape history doesn't carry it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` matching the v6 `ToolResultOutput` union. Assistant messages with reasoning emit a `ReasoningPart` first in the content array (v1.13.1-C).
- **`experimental_repairToolCall`** (v1.13.3) wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through implementation — logs the bad call and returns it unmodified; `executeToolPhase`'s existing zod-reject error path routes it to the model on the next turn.
- **`chat_status` frame shape** (published via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'` (widened from `working|idle|error` in v1.12.1). Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders inline beside `StatusDot` only when streaming or tool_running, fed by 500ms-throttled `'usage'` WS frames (`completion_tokens` + `ctx_used` + `ctx_max`). The `POST /api/chats/:id/discard_stale` endpoint exists to mark a stuck-streaming row as `failed` when the frontend's 60s no-token-activity timer (`ChatPane` content-length watcher) gives up.
- **Boot-time stale-streaming sweep** in `apps/server/src/index.ts` after `applySchema()`: any `messages.status='streaming'` older than 5 minutes flips to `'failed'`. Logs only on non-zero count. Recovers from container restart while inference was mid-stream (v1.12.1).
- **Periodic 60s sweeper** in `apps/server/src/index.ts` (v1.13.3 + v1.13.5). Same `setInterval` runs `sweepStaleStreaming` (marks `messages.status='streaming'` older than 5 min as `failed`, publishes `chat_status='idle'` so the UI dot drops) and `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `app.addHook('onClose')` clears the timer. No-op when nothing to reap.
- **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart. v1.13.11: every WS publish goes through `broker.publishFrame(sessionId, frame)` or `broker.publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). `ctx.publish` / `ctx.publishUser` in inference + auto_name route through the index.ts adapter that calls publishFrame internally. The schema is duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; a `ws-frames.test.ts` case enforces parity. Don't add new raw `broker.publish()` / `publishUser()` calls.
- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) go through three guard layers: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). v1.11.8+ web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (resolved with `project.default_web_search_enabled` fallback) and filtered out of the LLM's tool schema when false. v1.13.5 truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs at `BOOCODE_TRUNCATION_DIR` (default `/tmp/boocode-truncations`, 0o700) keyed by an opaque `tr_<12 base32 chars>` id, and the `view_truncated_output(id)` tool retrieves it. 5MB cap (matches `view_file`'s `MAX_FILE_BYTES`), 7-day TTL, reaped by the periodic sweeper. Tmpfs path means container restart loses retrieval — acceptable, the model usually has moved on.
- **`services/compaction.ts`** + **`services/model-context.ts`** — v1.11.0 anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself on each compaction). Triggered when `chats.needs_compaction` is set after an inference turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)` (v1.13.9 opencode-pattern early trigger; was `ctx_max - 20k` pre-v1.13.9, which gave only 7.6% headroom at 262k and 0 budget for ≤20k contexts). **`ctx_max` comes from `model-context.getModelContext()` which fetches `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx` (the stream completion's `timings` doesn't carry n_ctx; that read was dead code until v1.11.3 ripped it out). First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet; negative cache TTL is 60s, recovers on next turn. v1.13.6: `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on the assistant `content` (OpenAI wire shape has no structured reasoning field; the summarizer reads text). Standalone tag when content is empty (tool-call-only turn). `buildHeadPayload` + `OpenAiMessage` exported for test access — keep them exported.
- **`services/system-prompt.ts`** — `buildSystemPrompt` is the string-returning shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. v1.13.8 instrumentation: SHA-256 of the assembled prefix is logged per `buildMessagesPayload` call (msg `prefix-fingerprint`, level=info); a `Map<sessionId, lastHash>` observer fires `prefix-drift` (level=warn) on hash change with a field-level `changed_inputs` diff. Smoke proved the prefix is byte-stable across turns in steady-state — the originally-planned `system_prompt_cache` DB table was dropped as redundant against the v1.12.0 input-layer mtime caches (BOOCHAT.md here + AGENTS.md global+per-project in `agents.ts:safeStat`).
- **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (v1.13.7; was 15 — every tool in `ALL_TOOLS` is read-only today, so no-agent mode shares the read-only-agent cap). Per-agent `max_tool_calls` from AGENTS.md frontmatter overrides.
- **`messages_with_parts` view** (v1.13.1-B; `schema.sql`). Read sites that need `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` directly. v1.13.20 dropped the legacy `messages.tool_calls` / `messages.tool_results` JSON columns; the view now reads parts-only subselects. Writes target `message_parts` exclusively via `insertParts` (or via the helpers `partsFromAssistantMessage` / `partsFromToolMessage`). The `Message` wire type still carries `tool_calls?` / `tool_results?` because the view synthesizes them from parts — frontend reads are unchanged. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` single object, `reasoning_parts jsonb[]` of `{text}`. If you ever need to UPDATE a message and return its full Message shape, do a two-step UPDATE returning `id` followed by SELECT from the view — RETURNING off the bare `messages` table no longer carries the tool fields.
- **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
- **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after first assistant reply.
- **`apps/coder/src/services/provider-registry.ts`** (BooCoder, NOT apps/server) — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty).
- **`apps/coder/src/services/agent-probe.ts`** (BooCoder) — Startup probe using direct `exec()` (not SSH). Discovers installed agents on host, their versions, ACP support, and models. Qwen models read from `~/.qwen/settings.json`. Claude models are static from the registry. Results persisted to `available_agents` table.
- **`apps/coder/src/routes/providers.ts`** (BooCoder) — `GET /api/providers` returns installed providers with models. Transport field reflects actual capability (checks `supports_acp` from DB, not just registry preference). The apps/server side of this flow is the "Provider picker dispatch" bullet below.
- **Provider picker dispatch**: when `provider !== 'boocode'`, the message route creates a `tasks` row (with `session_id` set) instead of calling `inference.enqueue`. The dispatcher picks it up and dispatches via ACP or PTY using the agent's `install_path`.
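The compaction trigger described in the `services/compaction.ts` bullet above can be checked with quick arithmetic. The sketch below is illustrative only: the function names are hypothetical, "262k" is assumed to mean 262144 tokens, and clamping the legacy formula at zero is an assumption made to match the "0 budget for ≤20k contexts" remark.

```typescript
// Hypothetical helpers; only the two formulas come from the text above.
const LEGACY_RESERVE_TOKENS = 20000; // pre-v1.13.9: usable = ctx_max - 20k
const USABLE_FRACTION = 0.85;        // v1.13.9+:    usable = floor(0.85 * ctx_max)

// Legacy threshold, clamped at 0 (assumption) so small contexts get no budget.
function usableLegacy(ctxMax: number): number {
  return Math.max(0, ctxMax - LEGACY_RESERVE_TOKENS);
}

// Current threshold: compaction is flagged once a turn exceeds this value.
function usableCurrent(ctxMax: number): number {
  return Math.floor(USABLE_FRACTION * ctxMax);
}

// Share of the context window left free when the trigger fires, in percent.
function headroomPct(ctxMax: number, usable: number): number {
  return ((ctxMax - usable) / ctxMax) * 100;
}

// Returns true when the turn's context usage should set needs_compaction.
function shouldFlagForCompaction(ctxUsed: number, ctxMax: number): boolean {
  return ctxUsed > usableCurrent(ctxMax);
}
```

At an assumed 262144-token window the legacy rule leaves 20000 / 262144, roughly 7.6%, of headroom, while the 0.85 rule leaves about 15%. For a 16384-token window the legacy rule yields no usable budget at all, whereas the fractional rule still leaves 13926 tokens.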
Route registration: all routes registered in `index.ts` via `register*Routes(app, sql, ...)` functions. Routes are in `routes/*.ts`.
### BooCoder (`apps/coder/src/`)
- Write-capable coding agent. Runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker. Fastify server at port 9502, connects to postgres at `127.0.0.1:5500`.
- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. apps/server must build FIRST.
- Build + deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
- After `pnpm -C apps/coder build` the host `boocoder.service` keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler returns that shape). Restart, don't re-debug.
- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Follows Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers.
- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes` table. Nothing hits disk until `apply_pending` is called. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
- Frontend: NOT a separate SPA. BooCoder is a `'coder'` pane type within BooChat's SPA (`apps/web/`). `CoderPane.tsx` in `apps/web/src/components/panes/`. API requests go through `/api/coder/*` proxy (Vite dev + Fastify production) which rewrites to the boocoder host service (`BOOCODER_URL` env var, default `http://100.114.205.53:9502`). WS connects directly to `:9502`.
- `apps/coder/web/` is a STANDALONE fallback SPA served at `:9502` directly. The PRIMARY BooCoder frontend is the `CoderPane` in BooChat's SPA (`apps/web/src/components/panes/CoderPane.tsx`), accessible via the "Coder" pane in the workspace at `code.indifferentketchup.com`. Both exist; the pane is what Sam uses.
- **Provider snapshot lifecycle** (`apps/coder/src/services/`): `provider-config.ts` (Zod config, never-throws on bad input) → `provider-config-registry.ts` (`buildResolvedRegistry`, singleton) → `provider-snapshot.ts` (two-tier probe: tier-1 fast presence, tier-2 cold ACP probe skipped unless force / stale `PROVIDER_PROBE_TTL_MS` 24h / dbEmpty; cached). Verify live: `curl http://100.114.205.53:9502/api/providers/snapshot` — returns providers + models + commands, the exact shape `AgentComposerBar` renders.
- `PATCH /api/providers/config` replaces a provider id's override object **wholesale** (per-id shallow merge) — to flip one field send `{...existing, enabled}`, or a custom ACP entry's `command`/`label` is wiped and it drops out of the resolved registry. `data/coder-providers.json` is **gitignored** (it's live runtime config — the coder reads AND writes it on UI toggles); the tracked reference is `data/coder-providers.example.json`. The loader falls back to `{providers:{}}` (built-ins only) when the live file is absent, so a fresh checkout needs no copy.
- **opencode** runs as a warm HTTP server (v2.6 Phase 1, `services/backends/opencode-server.ts`: `opencode serve` per BooCoder process, one opencode session per BooCode session, resumed via `agent_sessions`). goose/qwen/claude still dispatch **one-shot** ACP/PTY with no ctx/token usage; only native `boocode` (llama-swap engine) tracks ctx. Paseo's per-provider native clients (design §12) deliberately not ported.
- **opencode SSE** (`opencode-server.ts`): live streaming arrives as `session.next.text.delta` / `session.next.reasoning.delta` / `session.next.tool.{called,success,failed}` — NOT `message.part.*` (those are terminal/post-hoc). `client.event.subscribe({ directory })` MUST pass the session's worktree directory; omit it and opencode scopes events to the server's `process.cwd()` → zero session events (empty turns, 180s watchdog timeout). Per-session SSE (P1.5-a): each live session owns its own `event.subscribe({directory})` loop + AbortController, so concurrent sessions in different worktrees stream independently; a `sessionID` demux guard drops cross-session events when two share a dir. Turn completes on `session.idle`; `promptAsync` is fire-and-forget (204).
- **opencode model strings** must be provider-prefixed (`llama-swap/<model>`) AND exist in `~/.config/opencode/opencode.json` `provider.llama-swap.models` — not merely loadable by llama-swap. `parseModel` infers `llama-swap/` for a bare id; the dispatcher coalesces empty→DEFAULT_MODEL then prefixes. `agent-probe` populates opencode's `available_agents.models` via `mergeLlamaSwap` (fetches `/v1/models`); empty model list → frontend sends `''` → no inference (`input:0`, empty turn).
- **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; including it breaks cross-restart resume). P1.5-b: `agent_sessions` is keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs in one session = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, `session_id` SET NULL so it survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` falls back to resolve-or-create a chat when it's null (arena/MCP/new_task). The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), imports `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
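The `PATCH /api/providers/config` gotcha above (per-id shallow merge, override object replaced wholesale) is easy to reproduce in isolation. This is a minimal sketch under stated assumptions: the `ProviderOverride` and `ProvidersFile` shapes, the `applyConfigPatch` name, and the `my-acp` entry are hypothetical and only mirror the field names mentioned in the text.

```typescript
// Hypothetical shapes; the real Zod schema lives in provider-config.ts.
type ProviderOverride = { enabled?: boolean; command?: string; label?: string };
type ProvidersFile = { providers: Record<string, ProviderOverride> };

// Per-id shallow merge: ids absent from the patch survive, but an id that is
// present has its whole override object replaced, not merged field by field.
function applyConfigPatch(current: ProvidersFile, patch: ProvidersFile): ProvidersFile {
  return { providers: { ...current.providers, ...patch.providers } };
}

const liveConfig: ProvidersFile = {
  providers: {
    "my-acp": { enabled: true, command: "/usr/local/bin/my-acp", label: "My ACP" },
  },
};

// Lossy: only `enabled` is sent, so `command` and `label` are wiped.
const lossyResult = applyConfigPatch(liveConfig, {
  providers: { "my-acp": { enabled: false } },
});

// Safe: spread the existing override first, then flip the one field.
const safeResult = applyConfigPatch(liveConfig, {
  providers: { "my-acp": { ...liveConfig.providers["my-acp"], enabled: false } },
});
```

In the lossy case the custom entry no longer has a `command`, which is the condition under which the text says it drops out of the resolved registry.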
### Frontend (`apps/web/src/`)
- **React 18** + React Router v6 + **Tailwind v4** + shadcn/radix-ui primitives.
@@ -87,26 +126,38 @@ Font / CSS pipeline (apps/web):
### Multi-pane workspace
Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). Workspace pane state is **client-side only** (localStorage key `boocode.workspace.panes.<sessionId>`); the legacy `session_panes` table and its REST endpoints are deprecated — no `/api/panes/*` routes exist. Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Sessions 1:N chats; chats own messages. Tab reorder via native HTML5 drag events.
Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). v1.12.1 moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device watching the session. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string. Legacy localStorage key `boocode.workspace.panes.<sessionId>` is read once on first hydrate (one-time seed-and-delete migration when server is empty but localStorage has data); no longer written. The deprecated `session_panes` table was dropped. `validatePanes(validChatIds)` prunes panes referencing chat IDs that no longer exist (called by `useSessionChats` after the chat list fetch lands). Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Tab reorder via native HTML5 drag events. v2.6.5: `workspace_panes` is now a `WorkspaceState` envelope `{panes, tabNumbers (chatId→stable session-scoped tab number, assigned on chat-pane open, retired on close, never reused), nextTabNumber, closedPaneStack (reopen LIFO, max 10, persisted so it survives reload)}` — not a bare `WorkspacePane[]`. Hydrate (`toWorkspaceState`) and the server PATCH validator (`z.union([array, envelope])` in `routes/sessions.ts`) both accept the legacy array and normalize to the envelope on read/write. Closing a chat pane relocates its tabs to the oldest chat/empty pane; `reopenPane` strips the restored chatIds from all live panes first (no duplication). `read_tab_by_number` resolves a number→chatId through `tabNumbers`.
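The `WorkspaceState` envelope and its legacy-array normalization can be sketched as follows. Only the envelope field names come from the paragraph above; the `WorkspacePane` fields beyond `chatIds` / `activeChatIdx`, the `Sketch`-suffixed function names, numbering from 1, and the in-order assignment of numbers to legacy tabs are all assumptions for illustration, not the actual `toWorkspaceState` implementation.

```typescript
// Hypothetical pane shape; the real type has more variants.
type WorkspacePane = { id: string; kind: "chat" | "empty"; chatIds: string[]; activeChatIdx: number };

type WorkspaceState = {
  panes: WorkspacePane[];
  tabNumbers: Record<string, number>; // chatId -> stable session-scoped tab number
  nextTabNumber: number;              // only ever increases, so numbers are never reused
  closedPaneStack: WorkspacePane[];   // reopen LIFO, capped at 10
};

const MAX_CLOSED_PANES = 10;

// Accepts either the legacy bare array or the envelope and returns the envelope.
function toWorkspaceStateSketch(raw: WorkspacePane[] | WorkspaceState): WorkspaceState {
  if (!Array.isArray(raw)) return raw;
  let state: WorkspaceState = { panes: raw, tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
  for (const pane of raw) {
    for (const chatId of pane.chatIds) state = assignTabNumberSketch(state, chatId);
  }
  return state;
}

// Assigns the next number on chat-pane open; a chat that already has one keeps it.
function assignTabNumberSketch(state: WorkspaceState, chatId: string): WorkspaceState {
  if (chatId in state.tabNumbers) return state;
  return {
    ...state,
    tabNumbers: { ...state.tabNumbers, [chatId]: state.nextTabNumber },
    nextTabNumber: state.nextTabNumber + 1,
  };
}

// Retires a number on close. nextTabNumber is untouched, so the number stays burned.
function retireTabNumberSketch(state: WorkspaceState, chatId: string): WorkspaceState {
  const remaining: Record<string, number> = {};
  for (const [id, n] of Object.entries(state.tabNumbers)) {
    if (id !== chatId) remaining[id] = n;
  }
  return { ...state, tabNumbers: remaining };
}

// Pushes a closed pane onto the reopen stack, dropping the oldest past the cap.
function pushClosedPaneSketch(state: WorkspaceState, pane: WorkspacePane): WorkspaceState {
  return { ...state, closedPaneStack: [...state.closedPaneStack, pane].slice(-MAX_CLOSED_PANES) };
}
```

Keeping `nextTabNumber` as a separate monotonic counter, rather than deriving it from the largest live entry in `tabNumbers`, is what makes "retired on close, never reused" hold even after the highest-numbered tab is closed.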
## Database
PostgreSQL 16. Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `session_panes` (deprecated). Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`.
PostgreSQL 16. Database name: `boochat` (renamed from `boocode` in v2.0.0-alpha; Docker service name stays `boocode_db`). Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts` (v1.13.0), `pending_changes` (v2.0.0), `tasks` (v2.0.0), `available_agents` (v2.0.0). Views: `messages_with_parts` (v1.13.1-B parts-merge read path), `tool_cost_stats` (v1.13.10 per-tool 100-call rolling window), `human_inbox` (v2.0.0 — tasks WHERE state IN blocked/failed). (`session_panes` was dropped in v1.12.1; workspace pane state lives in `sessions.workspace_panes jsonb`.) Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. The older anonymous `messages_status_check` (without 'cancelled') and `messages_role_check` (without 'system') were dropped in v1.12.1; only the `_chk` variants remain. **Two schema files, one DB:** `apps/server/src/schema.sql` owns `sessions`/`chats`/`messages`/`message_parts`; `apps/coder/src/schema.sql` (applied by the boocoder host service) owns `agent_sessions`, `worktrees`, `pending_changes`, `available_agents` and extends `tasks`. Both apply idempotently to the one `boochat` DB — so e.g. an `agent_sessions` FK change goes in the **coder** schema, not the server one. Idempotent FK-action flips (e.g. `ON DELETE CASCADE` → `SET NULL`) guard on `pg_constraint.confdeltype` so a re-run/fresh-deploy is a no-op (see the `session_worktrees`/`agent_sessions` defang blocks).
Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap new constraint ADD in `DO $$ ... pg_constraint` guard — that block is the only way to get `ADD CONSTRAINT IF NOT EXISTS`.
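The three-step order above can be sketched as a small statement builder. This is an illustration only, not the code in `schema.sql`: `buildCheckMigration` and the `CheckMigration` shape are hypothetical, the values are interpolated without escaping (acceptable only for trusted migration constants), and the example rename and allowed-value list are made up.

```typescript
// Hypothetical description of one CHECK-constraint rename migration.
type CheckMigration = {
  table: string;
  column: string;
  oldConstraint: string;           // system name, e.g. <table>_<column>_check
  newConstraint: string;           // explicit name guarded via pg_constraint
  renames: Record<string, string>; // old value -> new value
  allowed: string[];               // full set of values the new CHECK permits
};

// Returns the SQL statements in the order they must run.
function buildCheckMigration(m: CheckMigration): string[] {
  const quoted = m.allowed.map((v) => `'${v}'`).join(", ");
  // (1) Drop the old constraint first so step 2 is not rejected by it.
  const drop = `ALTER TABLE ${m.table} DROP CONSTRAINT IF EXISTS ${m.oldConstraint};`;
  // (2) Rewrite existing rows to the new values.
  const updates = Object.entries(m.renames).map(
    ([from, to]) => `UPDATE ${m.table} SET ${m.column} = '${to}' WHERE ${m.column} = '${from}';`,
  );
  // (3) Add the new constraint only if it is not already present.
  const add = [
    "DO $$",
    "BEGIN",
    `  IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = '${m.newConstraint}') THEN`,
    `    ALTER TABLE ${m.table} ADD CONSTRAINT ${m.newConstraint} CHECK (${m.column} IN (${quoted}));`,
    "  END IF;",
    "END",
    "$$;",
  ].join("\n");
  return [drop, ...updates, add];
}

// Illustrative input; the rename and the allowed list are assumptions.
const exampleMigration = buildCheckMigration({
  table: "messages",
  column: "status",
  oldConstraint: "messages_status_check",
  newConstraint: "messages_status_chk",
  renames: { done: "complete" },
  allowed: ["streaming", "complete", "failed", "cancelled"],
});
```

Running the resulting statements twice is harmless: the `DROP ... IF EXISTS` and the `UPDATE` are no-ops on the second pass, and the `DO` block skips the `ADD CONSTRAINT` once the named constraint exists.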
Position-shift pattern for panes (legacy `session_panes` table): negate-and-restore to avoid UNIQUE(session_id, position) collisions during reorder/insert/delete. Sentinel value -100 for the moving pane.
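The negate-and-restore idea can be simulated in memory to see why it sidesteps the collision. The sketch below is not the original SQL: `updateRows` mimics an immediate, row-by-row UNIQUE check on `position`, and the exact parking formula `-(position + 1)` is an assumption (chosen so that position 0 also maps to a negative value); only the sentinel `-100` comes from the text.

```typescript
type PaneRow = { id: string; position: number };

const MOVING_SENTINEL = -100; // parking spot for the pane being moved

// Applies an UPDATE row by row and rejects any write that would duplicate a
// position, mimicking an immediate (non-deferrable) UNIQUE check.
function updateRows(rows: PaneRow[], where: (r: PaneRow) => boolean, set: (r: PaneRow) => number): void {
  for (const row of rows) {
    if (!where(row)) continue;
    const next = set(row);
    if (rows.some((other) => other !== row && other.position === next)) {
      throw new Error(`unique violation at position ${next}`);
    }
    row.position = next;
  }
}

// Naive shift: the first row moved lands on its neighbour's position.
function naiveShiftUp(rows: PaneRow[], from: number): void {
  updateRows(rows, (r) => r.position >= from, (r) => r.position + 1);
}

// Negate-and-restore: park the affected rows in the unused negative range,
// then map them back to their shifted positive positions.
function shiftBy(rows: PaneRow[], where: (r: PaneRow) => boolean, delta: number): void {
  updateRows(rows, (r) => r.position >= 0 && where(r), (r) => -(r.position + 1));
  updateRows(rows, (r) => r.position < 0 && r.position !== MOVING_SENTINEL, (r) => -r.position - 1 + delta);
}

// Reorder: sentinel the moving pane, close its gap, open the target slot, land it.
function movePane(rows: PaneRow[], id: string, to: number): void {
  const moving = rows.find((r) => r.id === id);
  if (!moving) throw new Error(`unknown pane ${id}`);
  const from = moving.position;
  updateRows(rows, (r) => r.id === id, () => MOVING_SENTINEL);
  shiftBy(rows, (r) => r.position > from, -1);
  shiftBy(rows, (r) => r.position >= to, 1);
  updateRows(rows, (r) => r.id === id, () => to);
}
```

The sentinel is excluded from the restore step so the moving pane stays parked until its target slot is free; with at most five panes per session, ordinary parked values never come near `-100`.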
## Environment
Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale Fathom; the public `search.indifferentketchup.com` is behind Authelia and unusable from server context).
Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale Fathom; the public `search.indifferentketchup.com` is behind Authelia and unusable from server context), `BOOCODE_TOOLS` (`core` | `standard` | `all`, default `all`; v1.13.15-tools tier filter — ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (optional; default `/data/mcp.json` — JSON config for MCP servers matching opencode's `mcpServers` shape; file missing = no MCP).
BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
- `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL when unset. Set to a small model on llama-swap (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls.
- Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 required on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — non-interactive mode (`-p`) runs autonomously without approval prompts. ACP bridge is HTTP daemon (not stdio); use PTY dispatch.
- Arena (v2.0.5): `POST /api/arena {project_id, input, contestants: [{agent?, model?}]}` dispatches the same task to N models/agents in parallel. Each contestant gets its own task + worktree. `GET /api/arena/:id` for results. `POST /api/arena/:id/select/:task_id` picks winner.
## Workflow
- Sam reviews all diffs and commits manually. Do not commit unless explicitly asked.
- Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`. Already-shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape; see `openspec/README.md` for the convention.
- Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`). Monotonic per minor — the slug describes the batch's content so the tag name alone is enough to recall what shipped. No letter suffixes (`-a`/`-b`), no pseudo-ranges (`v1.11.x`), no slug-only sub-versions sharing a number (`v1.13.15-tools` + `-openspec` + `-agentlint` — split into sequential patches instead).
- `CHANGELOG.md` is the per-tag release log, most-recent on top. When a new tag is created, add a `## <tag> — <YYYY-MM-DD>` section with a 3–6 sentence paragraph summarizing what shipped, drawn from the commit body. Cross-reference other tags by name when the batch builds on, fixes, or pairs with prior work (e.g. "pairs with `v1.13.12-ws-schemas`", "fixed in `v1.13.5-stability-bundle`"). No nested bullets — one paragraph.
- Deploy: `cd /opt/boocode && docker compose up --build -d` (or `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue).
- The `boocode` container is `build: .` — it builds web+server from the **working tree**, so uncommitted changes deploy. Web edits are live on the Vite dev server (HMR) but NOT on production (`:9500` / code.indifferentketchup.com) until `docker compose up --build -d boocode`.
- Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`.
- Don't accumulate `.bak-*` files. Clean them up in the same batch or immediately after merge.
- DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`. Host port is 5500 (mapped from `boocode_db:5432`); password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL=postgres://boocode:Ketchup1479@boocode_db:5432/...` line. `psql` is not on the host PATH — for an interactive query use `docker exec boocode_db psql -U boocode -d boochat -c "..."`. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` with a `beforeAll` that applies the schema via `sql.unsafe(readFileSync(schemaPath))`. Tests skip cleanly when var is unset. `tool_cost_stats.test.ts` is the reference.
- Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The boocode container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`.
- Frontend blank-screen / runtime crash: get the stack-trace column offset from the browser console, then `cut -c <start>-<end> apps/web/dist/assets/index-*.js | sed -n '<line>p'` to read the exact minified expression that threw. Faster than bisecting source. Watch for `=== null`/`!== null` on optional fields fed an `as unknown as` cast — those bypass tsc.
- Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without setting `Content-Type` tricks on the client.
- Event dedup discipline: for any mutation the server publishes via `broker.publishUser`, do NOT add a local `sessionEvents.emit(...)` after the API call — `useUserEvents` forwards the WS frame onto the bus. Frontend mutation handlers must be idempotent (dedup by id, no-op on already-present).
- `node:20-*` base images ship a `node` user at uid/gid 1000 — delete it (`userdel`/`groupdel` on debian, `deluser`/`delgroup` on alpine) before adding samkintop at 1000.
@@ -115,7 +166,9 @@ Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0
- A local PreToolUse hook (`security_reminder_hook.py`) regex-flags Node's older `child_process` spawn helpers as unsafe (false positive even on the File-suffixed variant). Use `spawn` — it's accepted.
- `/opt/boolab` hosts a working sibling BooCode terminal at `boocode.indifferentketchup.com`. Useful for visual side-by-side comparison on the same iPhone when debugging booterm rendering. Boolab uses Tailwind v3 (`@tailwind base`); boocode uses v4 — many subtle build differences. Don't assume parity.
- booterm SSHs to the host as `samkintop@100.114.205.53` (the Tailscale IP). The hostname `ubuntu-homelab` (shown in the bash prompt after login) does NOT resolve from inside the container — only the host's `/etc/hosts` knows it. Override via `BOOTERM_SSH_HOST` / `BOOTERM_SSH_USER` env vars in docker-compose if you ever move the shell to a different machine.
- codecontext sidecar lives at `/opt/boocode/codecontext/`. Sidecar HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore.template` documents recommended ignore patterns; users copy and adapt to project root manually.
- codecontext sidecar lives at `/opt/boocode/codecontext/`. Sidecar HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore` at project root is honored when `--respect-gitignore` is passed (enabled in the shim).
- codecontext fork at `/opt/forks/codecontext/` — separate git repo (branch `boocode-ts`), pushed via the same boocode_gitea SSH key to `indifferentketchup/codecontext`. Build: `go build ./...`. Test: `go test ./...`. Docker rebuild requires staging the fork source first: `tar -czf codecontext/fork.tar.gz -C /opt/forks/codecontext --exclude=.git --exclude=bin .` then `docker compose build --no-cache codecontext`. The Dockerfile COPYs `fork.tar.gz` into the builder stage (Gitea is behind Authelia, no HTTP clone). `fork.tar.gz` is gitignored.
- Go binary: `/snap/go/current/bin/go` (not on PATH by default). Use `export PATH=$PATH:/snap/go/current/bin` or full path for Go commands.
- `os/exec` child supervisors must explicitly call `child.Wait()` in a goroutine and `os.Exit` on child death. `Signal(0)` returns nil on zombies and is NOT a liveness check. Without `Wait()`, docker's `restart: unless-stopped` policy never fires because the parent stays alive. The `codecontext/shim.go` implementation is the reference pattern.
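
The event-dedup rule in the list above (handlers must dedup by id and no-op when the item is already present) can be sketched as follows. This is a minimal illustration, not the app's actual handler code: `HasId`, `addIfAbsent`, and `removeIfPresent` are hypothetical names chosen for the example.

```typescript
// Hypothetical shape; the real session/chat types live in the app.
interface HasId {
  id: string;
}

// Idempotent insert: returns the same array when the id is already present,
// so a second delivery of the same mutation frame is a no-op.
function addIfAbsent<T extends HasId>(list: readonly T[], item: T): readonly T[] {
  return list.some((existing) => existing.id === item.id) ? list : [...list, item];
}

// Idempotent removal: returns the same array when the id is already gone.
function removeIfPresent<T extends HasId>(list: readonly T[], id: string): readonly T[] {
  return list.some((existing) => existing.id === id)
    ? list.filter((existing) => existing.id !== id)
    : list;
}

const chats: readonly HasId[] = [{ id: 'a' }];
const once = addIfAbsent(chats, { id: 'b' });
const twice = addIfAbsent(once, { id: 'b' }); // same frame delivered again
console.log(once.length, twice === once);
```

Returning the original array reference on a no-op also lets a React state updater skip a re-render, which is one reason to prefer this shape over always copying.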
## Conventions
@@ -125,15 +178,36 @@ Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0
- TypeScript strict mode. Both apps share `tsconfig.base.json`.
- Server uses NodeNext module resolution (`.js` extensions in imports).
- Discriminated unions for type narrowing: `Pane` (by `kind`), `SessionEvent` (by `type`), `InferenceFrame` (by `type`).
- **Adding a new WS frame type** requires updating BOTH the server's `InferenceFrame` (loose `type:` union + optional fields in `services/inference/turn.ts`) AND the web `WsFrame` (strict discriminated union in `apps/web/src/api/types.ts`). Server publish is permissive; the frontend type is the wire-format gate. The `'usage'` frame added in v1.12.2 needed both sides; missing the web side silently drops the frame at JSON-parse.
- shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive.
- `ui/` primitives present: button, card, context-menu, dialog, dropdown-menu, input, label, radio-group, sonner, textarea. No switch/sheet/drawer/badge/checkbox — use a `<button role="switch" aria-checked>` toggle (a hand-rolled `Switch` already lives in `SettingsPane.tsx`) and a Dialog-based panel for "drawers".
- `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension-to-language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names.
- Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles.
- `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
- Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` is the wrapper that does this. `addSplitPane` returns the new pane id for callers.
- A scrollable list inside a Dialog on mobile: cap `DialogContent` (`max-h-[85vh]` + `grid-rows-[auto_minmax(0,1fr)_auto]`) and make the list the single scroll region with `overscroll-contain` — otherwise touch-scroll drags the whole fixed modal / chains to the page.
- xterm.js v5 uses canvas rendering — browser doesn't see xterm's selection; the native right-click menu has no working Copy for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path.
- **New tools** live in their own `services/<name>.ts` file (see `web_search.ts`, `web_fetch.ts`) — exports a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real dependencies. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')` — cleanup is simpler and the production call site stays unchanged.
- **DB/session-aware tools** take an optional 4th `ToolExecCtx { sql, sessionId }` arg on `ToolDef.execute`, plumbed `executeToolPhase` → `executeToolCall` → `execute`. It's optional so the filesystem tools and the `apps/coder` `ALL_TOOLS` consumer stay compatible; filesystem tools ignore it. `read_tab_by_number` (reads `sessions.workspace_panes` + the chat's messages via `sql`) is the reference.
- **Sentinels** are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. A new kind requires arms in `MessageMetadata` in BOTH `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts`, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and calls `controller.close()` before the consumer reads, so a subsequent `reader.cancel()` finds the stream already closed and the `cancel()` callback never fires. Also provide MORE chunks than the test will consume so the source stays in 'readable' state when cancel runs (e.g. cap test reads ~6 chunks, stub provides 10).
- React **StrictMode is on** (`main.tsx`): an updater passed to one `setState` that itself calls another `setState` (e.g. `setClosedPaneStack` inside a `setPanes` updater) is double-invoked in dev. Make such nested updates idempotent — `useWorkspacePanes`'s `appendClosed` dedupes a value-identical top entry for exactly this reason.
- Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded. `services/agents.ts` `ALL_TOOL_NAMES` had this drift class until v1.12 — same pattern applies to any future tool-aware code.
- Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo — removed in v1.12 to eliminate the two-files-must-stay-in-sync drift. The `getAgentsForProject` per-project override mechanism remains for *other* projects.
- `data/AGENTS.md` is PARSED (`agents.ts` `splitSections`/`parseAgentSection`): each `## <Name>` is one agent and must be followed by a `---` frontmatter fence or the block throws; content before the first `## ` is discarded. Do NOT add free-form `## ` rule sections — they break the registry. Cross-cutting agent rules go in CLAUDE.md or a parser-ignored preamble.
- Skills live in `data/skills/<vendor>/`; Sam's own namespace is `boocode/` (`committing-changes`, `using-worktrees`, `improving-boocode-guidance`) — `SKILL.md` + optional `eval.yaml` (gerund names; eval = `skill:` + `tasks:` of `prompt`+`grader`, incl. a negative-trigger task). `data/skills/` is canonical; a divergent mirror at `/opt/skills/` exists.
- MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers. The `codecontext/shim.go` framing implementation is the reference; per the MCP spec (modelcontextprotocol.io/specification/server/transports).
- **Workspace dependency pattern** (`apps/coder` → `@boocode/server`): the consuming package adds `"@boocode/server": "workspace:*"` in `package.json`. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath: `"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`. Without the `types` condition, NodeNext resolution can't find `.d.ts` files and tsc fails with "Cannot find module" in the consumer.
- **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of a JSON object/array). Pattern established in `parts.ts`, `settings.ts`.
- **`payload.ts:loadContext` SELECT**: must include every `Session` field that downstream code reads. The tool phase reads `session.allowed_read_paths`; if the SELECT omits it, cross-repo read grants silently fail. The `Session` TypeScript type doesn't catch this because `sql<Session[]>` doesn't enforce column coverage.
- **Sidecar routing** (`services/inference/provider.ts`): `upstreamModel(config, modelId, agent)` routes to `LLAMA_SIDECAR_URL` when agent has `llama_extra_args`, otherwise `LLAMA_SWAP_URL`. `resolveRoute(agent)` returns `{route: 'swap'|'sidecar', flags}`. Sidecar provider created fresh per call (not cached) because `X-Agent-Flags` header varies per agent. Boot-time guard in `index.ts` refuses to start if any agent has `llama_extra_args` but `LLAMA_SIDECAR_URL` is unset.
- **Secret guard safe patterns** (`services/secret_guard.ts`): `.env.example`, `.env.sample`, `.env.template`, `.env.defaults` are allowlisted via `SAFE_PATTERNS` set. Do NOT add `.env.production`/`.env.development`/`.env.test` — those can hold real secrets.
- **CoderPane uses ChatInput** (`components/panes/CoderPane.tsx`): shares the same `ChatInput` component as BooChat for full parity — attachments, paste-to-chip, auto-grow textarea, queued messages during send. CoderPane's `sendOneMessage` is the send callback; queued messages drain via `useEffect` when `sending` goes false.
- **Adding a new `SessionEvent` type**: add the interface, add it to the `SessionEvent` union, add a `case` in `useSidebar.ts` `applyEvent` switch (no-op `return prev` is fine), and subscribe in any hook that needs it (e.g. `useSessionStream` for `refetch_messages`).
- **BooCoder provider registry** (`apps/coder/src/services/provider-registry.ts`): static list of provider defs (boocode, opencode, goose, claude, qwen). `PROBED_AGENT_NAMES` derives from it. Adding/removing providers means editing this file, not the frontend.
- **AgentComposerBar filters `e.installed`**: provider snapshot entries with `installed:false` (loading/unavailable) are dropped from the dropdown. `getProviderSnapshot` must await the full build — returning synchronous `loading` placeholders makes every provider vanish (the v2.5.7 "no providers showing up" regression); surfacing loading states needs a client poll.
- **Coder↔web provider-type parity** (`apps/coder/src/services/provider-types.ts` ↔ `apps/web/src/api/types.ts`): enforced by runtime `provider-types-parity.test.ts` (compile-time cross-import is blocked by TS6307 on web's composite tsconfig). Mirror of the ws-frames parity pattern — edit both copies together or the test fails.
- **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) instead discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins` `skills/`+`commands/`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in — BooChat passes flat `items` (unchanged).
- **Pane header architecture (mobile vs desktop)**: Desktop coder pane header (BooCode label + [+] [×]) lives in `Workspace.tsx` gated by `isCoder && !isMobile`. Mobile coder controls (● ×) live in `Session.tsx` header row next to `MobileTabSwitcher`/`NewPaneMenu`. `AgentComposerBar` (provider/mode/model pickers) renders inside `CoderPane.tsx` on both. The ● status dot is passed via `connected` prop from CoderPane to AgentComposerBar.
- **MessageBubble shared between BooChat and BooCoder** (`components/MessageBubble.tsx`): accepts optional `actions?: MessageActions` callbacks (onRegenerate, onResend, onFork, onDelete) and `hideActions?: ('fork'|'delete'|'openInPane')[]`. Defaults use BooChat API; CoderPane overrides via `CoderMessageList` props. `CoderTextBubble` was removed. **`CoderMessageList` passes `CoderMessageWire as unknown as Message`** — the coder wire shape lacks `metadata`/`kind`/`summary`, so those fields are `undefined` (not `null`) on coder messages. Null-guards on any `Message` field MUST use loose `!= null`, not strict `!== null` (`undefined !== null` is `true` → `.kind` throws → blank-screen crash). The `as unknown as` cast hides this from tsc; build + typecheck pass while runtime crashes.
- **llama-sidecar** (`/opt/forks/llama-sidecar/`): Go daemon for per-agent llama-server process pool. Cross-compile: `GOOS=windows GOARCH=amd64 /snap/go/current/bin/go build -o bin/llama-sidecar.exe ./cmd/llama-sidecar`. Gitea: `indifferentketchup/llama-sidecar`. Windows child process gotchas: use `context.Background()` for child lifetime (not request ctx), `os.Open(os.DevNull)` for stdin, `os.Pipe()` for stdout with drain goroutine, `DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP` creation flags. SSH to sam-desktop: `ssh samki@100.101.41.16`; use `schtasks` for persistent process spawning (SSH `start /B` doesn't survive session close).
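
The loose-null-guard rule from the MessageBubble bullet above can be illustrated with a small sketch. The `Metadata` and `Message` shapes here are hypothetical, trimmed-down stand-ins for the real types, and `kindStrict` / `kindLoose` are names invented for the example.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface Metadata {
  kind: string;
}
interface Message {
  id: string;
  metadata: Metadata | null;
}

// A wire object that lacks `metadata` entirely, forced through the cast.
const coderWire = { id: 'm1' };
const msg = coderWire as unknown as Message;

function kindStrict(m: Message): string | undefined {
  // `undefined !== null` is true, so the guard passes and `.kind` is read off undefined.
  return m.metadata !== null ? m.metadata.kind : undefined;
}

function kindLoose(m: Message): string | undefined {
  // `!= null` rejects both null and undefined.
  return m.metadata != null ? m.metadata.kind : undefined;
}

let strictThrew = false;
try {
  kindStrict(msg);
} catch {
  strictThrew = true;
}
console.log(strictThrew, kindLoose(msg));
```

Both functions typecheck identically, which is the point: the difference only shows up at runtime when the cast has hidden a missing field.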

10
CURRENT.md Normal file

@@ -0,0 +1,10 @@
# Current focus
Last updated: 2026-05-26
- **Batch:** v2.3-provider-lifecycle (openspec drafted; not started)
- **Branch:** `main`
- **Blockers:** none
- **Last shipped:** `v2.2.2-xml-placeholder-reject`
Update this file when starting or finishing a batch. Agents: read this first for session intent; if stale vs `CHANGELOG.md`, trust CHANGELOG for shipped state.

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 indifferentketchup
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


@@ -1,6 +1,10 @@
# boocode
Self-hosted single-user developer chat app. v1: chat only.
Self-hosted single-user developer chat app. 3-app monorepo: BooChat (read-only chat), BooCoder (write tools + agent dispatch), BooTerm (PTY terminals).
**Latest release:** `v2.2.1-pane-scoped-chats` (2026-05-26) · [`CHANGELOG.md`](CHANGELOG.md) · **Current focus:** [`CURRENT.md`](CURRENT.md)
**Agent navigation:** [`AGENTS.md`](AGENTS.md) · **Architecture:** [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) · **Engineering reference:** [`CLAUDE.md`](CLAUDE.md)
## Stack
@@ -13,6 +17,8 @@ Self-hosted single-user developer chat app. v1: chat only.
- `apps/server` — Fastify API + WebSocket + inference loop + file-read tools
- `apps/web` — React frontend; served by Fastify in production, Vite in dev
- `apps/booterm` — Fastify + node-pty + tmux for in-browser terminal panes
- `apps/coder` — Fastify write tools + ACP/PTY dispatcher + MCP server (BooCoder)
## Local dev
@@ -28,7 +34,7 @@ cp .env.example .env
docker compose up -d boocode_db
# run server (port 3000) and web (port 5173) in two shells
DATABASE_URL=postgres://boocode:devpass@127.0.0.1:5500/boocode \
DATABASE_URL=postgres://boocode:devpass@127.0.0.1:5500/boochat \
LLAMA_SWAP_URL=http://100.101.41.16:8401 \
pnpm dev:server
@@ -49,11 +55,36 @@ docker compose up --build -d
Binds to `100.114.205.53:9500` (Tailscale). Authelia is expected to gate the
upstream and inject `Remote-User`. Postgres binds loopback only.
## What v1 has
BooCoder runs as a **host systemd service** (`boocoder.service`, port `:9502`), not in Docker:
Project sidebar, sessions per project, chat with streaming responses over
WebSocket, four file-read tools scoped to the project root (`view_file`,
`list_dir`, `grep`, `find_files`), and a model picker driven by llama-swap's
`/v1/models`.
```bash
pnpm -C apps/server build && pnpm -C apps/coder build
sudo systemctl restart boocoder
curl http://100.114.205.53:9502/api/health
```
What v1 does not have lives in v2 (terminal pane) and v3 (Coder pane).
## Services
|Service|Port|Description|
|---|---|---|
|BooChat|`100.114.205.53:9500`|Read-only chat + SPA |
|BooTerm|`100.114.205.53:9501`|PTY/tmux terminal panes |
|BooCoder|host:9502|Write tools + agent dispatch + MCP server (systemd service, not Docker) |
|Postgres|`127.0.0.1:5500`|Shared database (`boochat`; Docker service `boocode_db`) |
|codecontext|internal `:8080`|Code graph sidecar (Docker network only) |
## What's shipped
See [`boocode_roadmap.md`](boocode_roadmap.md) for full version history. Highlights as of **v2.2.1**:
- **BooChat**: streaming chat, file-read tools, compaction, reasoning support, HTML/Markdown artifact panes, cross-repo read grants, MCP client (multi-server + stdio), tool-cost tracking, skills system, builtin agent registry, multi-pane workspace (chat / terminal / coder)
- **BooTerm**: in-browser terminal panes via tmux + xterm.js, per-session tmux sessions, SSH-out support
- **BooCoder (v2.2)**: write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`), pending-changes queue with diff UI, Paseo-style provider snapshot (7 providers: boocode, cursor, claude, opencode, goose, qwen, copilot), `AgentComposerBar` (provider / mode / model / thinking), ACP dispatch with inline permission prompts + tool/reasoning streaming, PTY fallback, Arena, MCP server (6 tools, stdio), CLI client, human inbox, Boomerang orchestration, path-guard fuzz suite, **pane-scoped chats** (v2.2.1 — each coder/terminal pane owns its chat)
## Planned
- **v2.3 provider lifecycle** — config-backed provider registry (`/data/coder-providers.json`), enable/disable toggles, two-tier probe (openspec drafted). See [`CURRENT.md`](CURRENT.md).
## License
MIT — see [`LICENSE`](LICENSE).


@@ -23,5 +23,6 @@
"@types/pg": "^8.11.10",
"tsx": "^4.16.2",
"typescript": "^5.5.0"
}
},
"license": "MIT"
}

16
apps/coder/.env.host Normal file

@@ -0,0 +1,16 @@
NODE_ENV=production
PORT=9502
HOST=100.114.205.53
DATABASE_URL=postgres://boocode:devpass@127.0.0.1:5500/boochat
LLAMA_SWAP_URL=http://100.101.41.16:8401
PROJECT_ROOT_WHITELIST=/opt
BOOTSTRAP_ROOT=/opt/projects
DEFAULT_MODEL=qwen3.6-35b-a3b-mxfp4
LOG_LEVEL=info
SEARXNG_URL=http://100.114.205.53:8888
GITEA_BASE_URL=https://git.indifferentketchup.com
GITEA_USER=indifferentketchup
GITEA_SSH_HOST=100.114.205.53:2222
MCP_CONFIG_PATH=/data/mcp.json
SKILLS_ROOT=/opt/boocode/data/skills
CODER_PROVIDERS_PATH=/opt/boocode/data/coder-providers.json

35
apps/coder/Dockerfile Normal file

@@ -0,0 +1,35 @@
# syntax=docker/dockerfile:1.7
FROM node:20-alpine AS builder
RUN corepack enable
WORKDIR /build
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
COPY apps/server/package.json ./apps/server/
COPY apps/coder/package.json ./apps/coder/
COPY apps/coder/web/package.json ./apps/coder/web/
RUN pnpm install --frozen-lockfile
# Build server first (coder depends on it via workspace dep for types + inference)
COPY apps/server ./apps/server
RUN pnpm -C apps/server build
COPY apps/coder ./apps/coder
RUN pnpm -C apps/coder/web build
RUN pnpm -C apps/coder build
RUN pnpm deploy --filter=@boocode/coder --prod --legacy /out/coder
FROM node:20-bookworm-slim AS runtime
RUN apt-get update && apt-get install -y --no-install-recommends ripgrep git openssh-client && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY --from=builder /out/coder ./
COPY --from=builder /build/apps/coder/web/dist ./web
ENV NODE_ENV=production
EXPOSE 3000
CMD ["node", "dist/index.js"]

35
apps/coder/package.json Normal file

@@ -0,0 +1,35 @@
{
"name": "@boocode/coder",
"version": "2.0.0",
"private": true,
"type": "module",
"main": "dist/index.js",
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc && node -e \"import('node:fs').then(fs=>fs.copyFileSync('src/schema.sql','dist/schema.sql'))\"",
"start": "node dist/index.js",
"cli": "tsx src/cli.ts",
"typecheck": "tsc --noEmit",
"test": "vitest run"
},
"dependencies": {
"@agentclientprotocol/sdk": "^0.22.1",
"@boocode/server": "workspace:*",
"@fastify/static": "^7.0.4",
"@opencode-ai/sdk": "~1.15.0",
"@fastify/websocket": "^10.0.1",
"@modelcontextprotocol/sdk": "^1.29.0",
"fastify": "^4.28.1",
"postgres": "^3.4.4",
"ws": "^8.18.0",
"zod": "^3.23.8"
},
"devDependencies": {
"@types/node": "^20.14.10",
"@types/ws": "^8.5.10",
"tsx": "^4.16.2",
"typescript": "^5.5.0",
"vitest": "^3.0.0"
},
"license": "MIT"
}

249
apps/coder/src/cli.ts Normal file

@@ -0,0 +1,249 @@
#!/usr/bin/env node
/**
* BooCoder CLI client.
*
* Usage:
* boocode run "task description" [--agent opencode] [--model claude-opus-4-7] [--project <id>]
* boocode ls [--state pending|running|completed|failed]
* boocode attach <task-id>
* boocode send <task-id> "message"
*/
import { WebSocket } from 'ws';
const BASE_URL = process.env.BOOCODER_URL ?? 'http://100.114.205.53:9502';
// ─── Arg parsing ─────────────────────────────────────────────────────────────
function getFlag(args: string[], name: string): string | undefined {
const idx = args.indexOf(name);
if (idx === -1 || idx + 1 >= args.length) return undefined;
return args[idx + 1];
}
function hasFlag(args: string[], name: string): boolean {
return args.includes(name);
}
// ─── HTTP helpers ────────────────────────────────────────────────────────────
async function api(method: string, path: string, body?: unknown): Promise<unknown> {
const url = `${BASE_URL}${path}`;
const res = await fetch(url, {
method,
headers: body ? { 'Content-Type': 'application/json' } : undefined,
body: body ? JSON.stringify(body) : undefined,
});
if (!res.ok) {
const text = await res.text().catch(() => '');
throw new Error(`${method} ${path} → ${res.status}: ${text}`);
}
return res.json();
}
// ─── WS streaming ────────────────────────────────────────────────────────────
function streamSession(sessionId: string): void {
const wsUrl = BASE_URL.replace(/^http/, 'ws') + `/api/ws/sessions/${sessionId}`;
const ws = new WebSocket(wsUrl);
ws.on('message', (data) => {
try {
const frame = JSON.parse(data.toString()) as { type: string; content?: string; name?: string; arguments?: string };
if (frame.type === 'delta' && frame.content) {
process.stdout.write(frame.content);
} else if (frame.type === 'tool_call') {
process.stdout.write(`\n[tool: ${frame.name ?? '?'}(${(frame.arguments ?? '').slice(0, 80)})]\n`);
} else if (frame.type === 'tool_result') {
process.stdout.write(`[tool_result]\n`);
} else if (frame.type === 'status' || frame.type === 'chat_status') {
// Silent
}
} catch {
// Non-JSON frame, ignore
}
});
ws.on('error', (err) => {
process.stderr.write(`WS error: ${err.message}\n`);
});
ws.on('close', () => {
process.stdout.write('\n');
process.exit(0);
});
process.on('SIGINT', () => {
ws.close();
process.exit(0);
});
}
// ─── Commands ────────────────────────────────────────────────────────────────
async function cmdRun(args: string[]): Promise<void> {
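// First argument that is neither a --flag nor the value following one
// (every flag accepted by `run` takes a value).
const input = args.find((a, i) => !a.startsWith('--') && !(args[i - 1] ?? '').startsWith('--'));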
if (!input) {
process.stderr.write('Usage: boocode run "task description" [--agent X] [--model X] [--project X]\n');
process.exit(1);
}
const agent = getFlag(args, '--agent');
const model = getFlag(args, '--model');
const project_id = getFlag(args, '--project');
if (!project_id) {
process.stderr.write('Error: --project <uuid> is required\n');
process.exit(1);
}
const result = (await api('POST', '/api/tasks', {
project_id,
input,
...(agent && { agent }),
...(model && { model }),
})) as { id: string; state: string };
process.stdout.write(`Task created: ${result.id} (state: ${result.state})\n`);
// Poll until task has session_id, then stream; or poll until terminal state
const POLL_MS = 2000;
for (;;) {
await sleep(POLL_MS);
const task = (await api('GET', `/api/tasks/${result.id}`)) as {
id: string; state: string; session_id?: string; output_summary?: string;
};
if (task.session_id) {
process.stdout.write(`Streaming session ${task.session_id}...\n`);
streamSession(task.session_id);
return; // streamSession handles exit
}
if (task.state === 'completed') {
process.stdout.write(`\nCompleted: ${task.output_summary ?? '(no summary)'}\n`);
return;
}
if (task.state === 'failed') {
process.stderr.write(`\nFailed: ${task.output_summary ?? '(no summary)'}\n`);
process.exit(1);
}
if (task.state === 'cancelled') {
process.stderr.write(`\nCancelled.\n`);
process.exit(1);
}
}
}
async function cmdLs(args: string[]): Promise<void> {
const state = getFlag(args, '--state');
const query = state ? `?state=${encodeURIComponent(state)}` : '';
const tasks = (await api('GET', `/api/tasks${query}`)) as Array<{
id: string; state: string; agent: string | null; input: string; created_at: string;
}>;
if (tasks.length === 0) {
process.stdout.write('No tasks.\n');
return;
}
// Table header
process.stdout.write(
pad('ID', 38) + pad('STATE', 12) + pad('AGENT', 14) + pad('INPUT', 52) + 'CREATED\n',
);
process.stdout.write('-'.repeat(120) + '\n');
for (const t of tasks) {
process.stdout.write(
pad(t.id, 38) +
pad(t.state, 12) +
pad(t.agent ?? '-', 14) +
pad(t.input.slice(0, 50), 52) +
(t.created_at?.slice(0, 19) ?? '') + '\n',
);
}
}
async function cmdAttach(args: string[]): Promise<void> {
const taskId = args[0];
if (!taskId) {
process.stderr.write('Usage: boocode attach <task-id>\n');
process.exit(1);
}
const task = (await api('GET', `/api/tasks/${taskId}`)) as { session_id?: string };
if (!task.session_id) {
process.stderr.write('Task has no session yet (still pending?).\n');
process.exit(1);
}
streamSession(task.session_id);
}
async function cmdSend(args: string[]): Promise<void> {
const taskId = args[0];
const message = args[1];
if (!taskId || !message) {
process.stderr.write('Usage: boocode send <task-id> "message"\n');
process.exit(1);
}
const task = (await api('GET', `/api/tasks/${taskId}`)) as { session_id?: string };
if (!task.session_id) {
process.stderr.write('Task has no session yet.\n');
process.exit(1);
}
// Find active chat
const sessionId = task.session_id;
// POST message to the session's chat (the messages route expects session_id in path)
await api('POST', `/api/sessions/${sessionId}/messages`, { content: message });
// Then attach to stream the response
streamSession(sessionId);
}
// ─── Utils ───────────────────────────────────────────────────────────────────
function pad(s: string, width: number): string {
return s.length >= width ? s.slice(0, width) : s + ' '.repeat(width - s.length);
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
// ─── Main ────────────────────────────────────────────────────────────────────
const [cmd, ...rest] = process.argv.slice(2);
switch (cmd) {
case 'run':
cmdRun(rest).catch(fatal);
break;
case 'ls':
cmdLs(rest).catch(fatal);
break;
case 'attach':
cmdAttach(rest).catch(fatal);
break;
case 'send':
cmdSend(rest).catch(fatal);
break;
default:
process.stdout.write(
'BooCoder CLI\n\n' +
'Commands:\n' +
' run "task" [--agent X] [--model X] [--project <id>] Create and stream a task\n' +
' ls [--state pending|running|completed|failed] List tasks\n' +
' attach <task-id> Stream a running task\n' +
' send <task-id> "message" Send input to a task\n' +
'\n' +
`Base URL: ${BASE_URL} (set BOOCODER_URL to override)\n`,
);
if (cmd && cmd !== '--help' && cmd !== '-h') process.exit(1);
}
function fatal(err: unknown): void {
process.stderr.write(`Error: ${err instanceof Error ? err.message : String(err)}\n`);
process.exit(1);
}
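
A minimal sketch of separating positional arguments from `--flag value` pairs, the kind of parsing `getFlag` does one flag at a time. It assumes every flag takes exactly one value; `splitArgs` is a hypothetical helper and is not part of `cli.ts`.

```typescript
// Hypothetical helper: split argv into flag/value pairs and positionals,
// assuming every `--flag` is followed by exactly one value.
function splitArgs(args: string[]): { flags: Map<string, string>; positionals: string[] } {
  const flags = new Map<string, string>();
  const positionals: string[] = [];
  let pendingFlag: string | undefined;
  for (const arg of args) {
    if (pendingFlag !== undefined) {
      // This argument is the value of the previous flag, not a positional.
      flags.set(pendingFlag, arg);
      pendingFlag = undefined;
    } else if (arg.startsWith('--')) {
      pendingFlag = arg;
    } else {
      positionals.push(arg);
    }
  }
  return { flags, positionals };
}

const parsed = splitArgs(['--agent', 'opencode', 'fix the bug', '--project', 'p1']);
console.log(parsed.positionals[0], parsed.flags.get('--agent'));
```

Parsing in a single pass keeps flag values out of the positional list regardless of where the flags appear on the command line. A trailing flag with no value is silently dropped in this sketch; a real CLI would probably want to report it.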

69
apps/coder/src/config.ts Normal file

@@ -0,0 +1,69 @@
import { z } from 'zod';
// BooCoder's config is a superset of the server's Config type so it can be
// passed directly into the inference runner's InferenceContext. Fields the
// inference loop reads: LLAMA_SWAP_URL, PROJECT_ROOT_WHITELIST. The rest
// default to values that satisfy the server's Zod schema without BooCoder
// needing to supply them in its environment.
const ConfigSchema = z.object({
NODE_ENV: z.enum(['development', 'production', 'test']).default('development'),
PORT: z.coerce.number().int().positive().default(3000),
HOST: z.string().default('0.0.0.0'),
DATABASE_URL: z.string().url(),
LLAMA_SWAP_URL: z.string().url(),
PROJECT_ROOT_WHITELIST: z.string().default('/opt'),
BOOTSTRAP_ROOT: z.string().default('/opt/projects'),
DEFAULT_MODEL: z.string().default('qwen3.6-35b-a3b-mxfp4'),
LOG_LEVEL: z.string().default('info'),
CONTAINER_GUIDANCE_FILE: z.string().optional(),
// Fields needed to satisfy the server's Config type but unused by BooCoder:
SEARXNG_URL: z.string().url().default('http://100.114.205.53:8888'),
GITEA_BASE_URL: z.string().url().default('https://git.indifferentketchup.com'),
GITEA_USER: z.string().default('indifferentketchup'),
GITEA_TOKEN: z.string().optional(),
GITEA_SSH_HOST: z.string().default('100.114.205.53:2222'),
MCP_CONFIG_PATH: z.string().optional(),
// v2.3: config-backed provider overrides/custom-ACP entries merged over the
// hardcoded built-ins. Missing file = built-ins only (see provider-config.ts).
CODER_PROVIDERS_PATH: z.string().default('/data/coder-providers.json'),
// v2.3 phase 2: tier-2 (cold ACP probe) is skipped when available_agents was
// probed more recently than this. 24h default — stale model lists self-heal
// on the next snapshot; an explicit /refresh always re-probes.
PROVIDER_PROBE_TTL_MS: z.coerce.number().int().positive().default(86_400_000),
// v2.0.5: cheaper model for titles, summaries, labeling.
FAST_MODEL: z.string().optional(),
// SSH access to the host for external agent dispatch (Phase 5)
BOOCODER_SSH_HOST: z.string().default('100.114.205.53'),
BOOCODER_SSH_USER: z.string().default('samkintop'),
// v2.6 Phase 3 (lifecycle hardening). Idle TTL: evict a non-busy warm backend
// (opencode server / warm-ACP child) after this long with no turn — its worktree
// + agent_sessions row persist, so the next turn re-spawns + reattaches. 30 min
// default (design §6).
AGENT_POOL_IDLE_TTL_MS: z.coerce.number().int().positive().default(1_800_000),
// LRU cap: max live warm backends before the least-recently-used (non-busy) ones
// are evicted. Bounds the long-lived-daemon's per-(chat,agent) Map growth.
AGENT_POOL_MAX_LIVE: z.coerce.number().int().positive().default(10),
// Periodic sweep cadence (idle/LRU pool eviction + orphan-worktree reap). 60s
// mirrors the apps/server truncation/stale-streaming sweeper.
LIFECYCLE_SWEEP_INTERVAL_MS: z.coerce.number().int().positive().default(60_000),
// Orphan-worktree grace: an on-disk worktree dir with no live `worktrees` row is
// only reaped after it's been untouched this long (avoids sweeping a dir mid
// ensureSessionWorktree create). 1h default.
ORPHAN_WORKTREE_GRACE_MS: z.coerce.number().int().positive().default(3_600_000),
});
export type Config = z.infer<typeof ConfigSchema>;
let cached: Config | null = null;
export function loadConfig(): Config {
if (cached) return cached;
const parsed = ConfigSchema.safeParse(process.env);
if (!parsed.success) {
console.error('Invalid environment configuration:');
console.error(parsed.error.flatten().fieldErrors);
process.exit(1);
}
cached = parsed.data;
return cached;
}
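
The numeric settings in this schema all use `z.coerce.number().int().positive().default(...)`: an unset variable falls back to the default, and a set one must coerce to a positive integer. The sketch below illustrates that rule without Zod. `positiveIntFromEnv` is a hypothetical helper, not part of this diff, and it does not reproduce Zod's error reporting.

```typescript
// Hypothetical, dependency-free stand-in for
// z.coerce.number().int().positive().default(fallback).
function positiveIntFromEnv(
  env: Record<string, string | undefined>,
  key: string,
  fallback: number,
): number {
  const raw = env[key];
  // Unset variable: use the default, as `.default()` does.
  if (raw === undefined) return fallback;
  // Coerce the string, then enforce "integer and > 0".
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return n;
}

// Illustrative environment: only one of the two knobs is set.
const env: Record<string, string | undefined> = { AGENT_POOL_MAX_LIVE: '25' };
const maxLive = positiveIntFromEnv(env, 'AGENT_POOL_MAX_LIVE', 10);
const idleTtlMs = positiveIntFromEnv(env, 'AGENT_POOL_IDLE_TTL_MS', 1_800_000);
```

Here `maxLive` comes from the environment while `idleTtlMs` takes the 30 minute default. A value such as `'0'` or `'abc'` throws instead of silently falling back.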

apps/coder/src/db.ts Normal file

@@ -0,0 +1,45 @@
import postgres from 'postgres';
import { readFile } from 'node:fs/promises';
import { fileURLToPath } from 'node:url';
import { dirname, resolve } from 'node:path';
import type { Config } from './config.js';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
export type Sql = ReturnType<typeof postgres>;
let sqlInstance: Sql | null = null;
export function getSql(config: Config): Sql {
if (sqlInstance) return sqlInstance;
sqlInstance = postgres(config.DATABASE_URL, {
max: 10,
idle_timeout: 30,
connect_timeout: 10,
onnotice: () => {},
});
return sqlInstance;
}
export async function applySchema(sql: Sql): Promise<void> {
const schemaPath = resolve(__dirname, 'schema.sql');
const ddl = await readFile(schemaPath, 'utf8');
await sql.unsafe(ddl);
}
export async function pingDb(sql: Sql): Promise<boolean> {
try {
await sql`SELECT 1`;
return true;
} catch {
return false;
}
}
export async function closeDb(): Promise<void> {
if (sqlInstance) {
await sqlInstance.end({ timeout: 5 });
sqlInstance = null;
}
}

apps/coder/src/index.ts Normal file

@@ -0,0 +1,268 @@
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
import { existsSync } from 'node:fs';
import Fastify from 'fastify';
import fastifyWebsocket from '@fastify/websocket';
import fastifyStatic from '@fastify/static';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
import { loadConfig } from './config.js';
import { getSql, applySchema, pingDb, closeDb } from './db.js';
import { startMcpServer } from './services/mcp-server.js';
// v2.0.0 Phase 2B: workspace dependency on @boocode/server — reuse the
// inference loop, broker, and tool registry without duplication.
import { createInferenceRunner } from '@boocode/server/inference';
import { createBroker } from '@boocode/server/broker';
import { appendMcpTools, ALL_TOOLS } from '@boocode/server/tools';
import type { Config as ServerConfig } from '@boocode/server/config';
import type { WsFrame } from '@boocode/server/ws-frames';
// v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
import { WRITE_TOOLS } from './services/tools/index.js';
import { adaptWriteTool } from './services/tools/adapter.js';
import { setInferenceContext, clearInferenceContext } from './services/tools/inference_context.js';
// Routes
import { registerMessageRoutes } from './routes/messages.js';
import { registerSkillRoutes } from './routes/skills.js';
import { registerPendingRoutes } from './routes/pending.js';
import { registerCheckpointRoutes } from './routes/checkpoints.js';
import { registerAgentSessionRoutes } from './routes/agent-sessions.js';
import { registerTaskRoutes } from './routes/tasks.js';
import { registerInboxRoutes } from './routes/inbox.js';
import { registerStatsRoutes } from './routes/stats.js';
import { registerArenaRoutes } from './routes/arena.js';
import { registerProviderRoutes } from './routes/providers.js';
import { registerWorktreeSafetyRoutes } from './routes/worktree-safety.js';
import { registerLifecycleRoutes } from './routes/lifecycle.js';
import { registerWebSocket } from './routes/ws.js';
// Phase 4: dispatcher + agent probe
import { createDispatcher } from './services/dispatcher.js';
import { agentPool } from './services/agent-pool.js';
import { createOrphanWorktreeReaper } from './services/orphan-worktree-reaper.js';
import { probeAgents } from './services/agent-probe.js';
import { getProviderSnapshot, persistProbedModels } from './services/provider-snapshot.js';
import { setPermissionHooks } from './services/permission-waiter.js';
import { homedir } from 'node:os';
async function main() {
// MCP mode: stdio transport, no HTTP server
if (process.argv.includes('--mcp')) {
const config = loadConfig();
const sql = getSql(config);
await applySchema(sql);
await startMcpServer(sql);
return;
}
const config = loadConfig();
const app = Fastify({
logger: { level: config.LOG_LEVEL },
});
// Allow empty JSON bodies (same pattern as apps/server).
app.removeContentTypeParser(['application/json']);
app.addContentTypeParser('application/json', { parseAs: 'string' }, (_req, body, done) => {
const str = (body as string) ?? '';
if (str.trim().length === 0) {
done(null, {});
return;
}
try {
done(null, JSON.parse(str));
} catch (err) {
done(err as Error, undefined);
}
});
const sql = getSql(config);
await applySchema(sql);
app.log.info('database schema applied');
// Broker: in-memory pub/sub for session + user channel streaming.
const broker = createBroker(app.log);
setPermissionHooks({
onPrompt: async (prompt) => {
await sql`
UPDATE tasks SET state = 'blocked' WHERE id = ${prompt.taskId} AND state = 'running'
`;
broker.publishFrame(prompt.sessionId, {
type: 'permission_requested',
task_id: prompt.taskId,
session_id: prompt.sessionId,
kind: prompt.kind,
tool_title: prompt.toolTitle,
...(prompt.input ? { input: prompt.input } : {}),
options: prompt.options.map((o) => ({ option_id: o.optionId, label: o.label })),
} as WsFrame);
},
onResolved: async (taskId, sessionId) => {
await sql`
UPDATE tasks SET state = 'running' WHERE id = ${taskId} AND state = 'blocked'
`;
broker.publishFrame(sessionId, {
type: 'permission_resolved',
task_id: taskId,
session_id: sessionId,
} as WsFrame);
},
});
// --- Tool registry extension ---
// Append BooCoder write tools (adapted to BooChat's ToolDef interface) to
// the shared ALL_TOOLS registry. appendMcpTools re-sorts and rebuilds
// TOOLS_BY_NAME so tool-phase.ts dispatch sees the full set.
const adaptedWriteTools = WRITE_TOOLS.map((t) => adaptWriteTool(t));
appendMcpTools(adaptedWriteTools);
app.log.info(`tool registry: ${ALL_TOOLS.length} tools loaded (${WRITE_TOOLS.length} write tools)`);
// Inference runner: same engine as BooChat, uses ALL_TOOLS (which includes
// the appended write tools) for tool dispatch.
const inference = createInferenceRunner(
{
sql,
config: config as unknown as ServerConfig,
log: app.log,
publish: (sessionId, frame) => {
broker.publishFrame(sessionId, frame as unknown as WsFrame);
},
broker,
},
(user, frame) => {
broker.publishUserFrame(user, frame as unknown as WsFrame);
}
);
// Wrap the inference runner to set/clear the write-tool context around each run.
// The inference runner calls enqueue() which fires asynchronously — we hook
// into the enqueue to set context before the run starts.
const inferenceApi = {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => {
// Set the inference context so write tools can access sql + sessionId.
// The context persists for the duration of the inference run. BooCoder is
// single-user and runs one inference at a time per session, so this
// module-level state is safe as long as runs in different sessions do not
// overlap.
setInferenceContext({ sql, sessionId, taskId: null });
inference.enqueue(sessionId, chatId, assistantId, user);
},
cancel: async (sessionId: string, chatId: string) => {
const result = await inference.cancel(sessionId, chatId);
clearInferenceContext();
return result;
},
hasActive: (chatId: string) => inference.hasActive(chatId),
};
// Register WebSocket support
await app.register(fastifyWebsocket);
// Health endpoint
app.get('/api/health', async (_req, reply) => {
const dbOk = await pingDb(sql);
const status = dbOk ? 200 : 503;
return reply.status(status).send({
ok: dbOk,
db: dbOk,
tools: ALL_TOOLS.length,
});
});
// Phase 4: probe available agents on startup
await probeAgents(sql, app.log);
// Warm provider snapshot in background (ACP cold probes + model merges)
void getProviderSnapshot(sql, config, homedir(), true)
.then((entries) => persistProbedModels(sql, entries, app.log))
.catch((err) => {
app.log.warn(
{ err: err instanceof Error ? err.message : String(err) },
'provider-snapshot: warm failed',
);
});
// Phase 4: dispatcher — polls tasks table and runs inference
const dispatcher = createDispatcher({ sql, inference: inferenceApi, broker, log: app.log, config });
dispatcher.start();
// v2.6 Phase 3: configure + start the agent-pool lifecycle sweep (idle-TTL +
// LRU-cap eviction of warm backends, plus each backend's proactive health probe)
// and the orphan-worktree reaper. Both run on the same periodic timer.
agentPool.configure({
idleTtlMs: config.AGENT_POOL_IDLE_TTL_MS,
maxLive: config.AGENT_POOL_MAX_LIVE,
sweepIntervalMs: config.LIFECYCLE_SWEEP_INTERVAL_MS,
log: app.log,
});
agentPool.startReaper(app.log);
const orphanReaper = createOrphanWorktreeReaper({
sql,
log: app.log,
intervalMs: config.LIFECYCLE_SWEEP_INTERVAL_MS,
graceMs: config.ORPHAN_WORKTREE_GRACE_MS,
});
orphanReaper.start();
app.addHook('onClose', async () => {
// stop() first so in-flight dispatcher turns settle, then stop the reapers and
// drain the pool (kills opencode server + warm ACP children).
await dispatcher.stop();
orphanReaper.stop();
await agentPool.dispose();
});
// Register routes
registerMessageRoutes(app, sql, broker, inferenceApi);
registerSkillRoutes(app, sql, broker, inferenceApi);
registerPendingRoutes(app, sql);
registerCheckpointRoutes(app, sql);
registerAgentSessionRoutes(app, sql);
registerTaskRoutes(app, sql, inferenceApi);
registerInboxRoutes(app, sql);
registerStatsRoutes(app, sql);
registerArenaRoutes(app, sql);
registerProviderRoutes(app, sql, config);
registerWorktreeSafetyRoutes(app, sql);
registerLifecycleRoutes(app, sql);
registerWebSocket(app, sql, broker);
// Serve the static frontend (the built web app). In production the web build
// is copied to ../web relative to this dist/ directory, i.e. /app/web. In dev,
// the same relative path is checked next to the source.
const webRoot = resolve(__dirname, '../web');
if (existsSync(webRoot)) {
await app.register(fastifyStatic, {
root: webRoot,
prefix: '/',
// Don't intercept /api routes — static only serves files that exist.
wildcard: false,
});
// SPA fallback: serve index.html for non-API routes that don't match a file.
app.setNotFoundHandler(async (req, reply) => {
if (req.url.startsWith('/api')) {
reply.code(404);
return { error: 'not found' };
}
return reply.sendFile('index.html');
});
app.log.info(`serving frontend from ${webRoot}`);
}
// Graceful shutdown
const shutdown = async () => {
app.log.info('shutting down');
await app.close();
await closeDb();
process.exit(0);
};
process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
await app.listen({ port: config.PORT, host: config.HOST });
app.log.info(`BooCoder listening on ${config.HOST}:${config.PORT}`);
}
main().catch((err) => {
console.error('fatal:', err);
process.exit(1);
});
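
The custom `application/json` parser registered in `main()` treats an empty or whitespace-only body as `{}` and otherwise defers to `JSON.parse`. A minimal sketch of that rule as a plain function follows. `parseJsonBody` is a hypothetical name used only for illustration; the real code reports errors through Fastify's `done` callback rather than throwing.

```typescript
// Hypothetical helper mirroring the parser's decision logic.
function parseJsonBody(raw: string | undefined): unknown {
  const str = raw ?? '';
  // Empty or whitespace-only bodies are tolerated and become {}.
  if (str.trim().length === 0) return {};
  // Anything else must be valid JSON; JSON.parse throws a SyntaxError if not.
  return JSON.parse(str);
}

const emptyBody = parseJsonBody('');
const blankBody = parseJsonBody('  \n');
const realBody = parseJsonBody('{"providers":["boocode"]}') as { providers: string[] };
```

This is why a bodiless `POST` with a JSON content type reaches the handler with `req.body` equal to `{}` instead of failing in the parser.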


@@ -0,0 +1,75 @@
import { describe, it, expect } from 'vitest';
import Fastify, { type FastifyInstance } from 'fastify';
import { registerAgentSessionRoutes } from '../agent-sessions.js';
import type { Sql } from '../../db.js';
// Mock the porsager surface this route uses: a tagged-template `sql` dispatched by
// query substring. Two queries: the session-existence check and the agent_sessions
// JOIN. We return post-coercion shapes (booleans/strings) exactly as porsager would
// hand them to the route — `has_session` already a JS boolean, `last_active_at` a
// string|null — so the asserted JSON matches the API contract end-to-end.
interface MockState {
sessionExists: boolean;
rows: Array<{ agent: string; status: string; has_session: boolean; last_active_at: string | null }>;
}
function mockSql(state: MockState): Sql {
return ((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('SELECT id FROM sessions')) {
return Promise.resolve(state.sessionExists ? [{ id: 'session-1' }] : []);
}
if (q.includes('FROM agent_sessions')) {
return Promise.resolve(state.rows);
}
return Promise.resolve([]);
}) as unknown as Sql;
}
function buildApp(state: MockState): FastifyInstance {
const app = Fastify();
registerAgentSessionRoutes(app, mockSql(state));
return app;
}
describe('GET /api/sessions/:id/agent-sessions', () => {
it('returns the per-(chat,agent) rows in the contracted shape', async () => {
const app = buildApp({
sessionExists: true,
rows: [
{ agent: 'opencode', status: 'active', has_session: true, last_active_at: '2026-05-31T12:00:00.000Z' },
{ agent: 'goose', status: 'idle', has_session: false, last_active_at: null },
],
});
const res = await app.inject({ method: 'GET', url: '/api/sessions/session-1/agent-sessions' });
expect(res.statusCode).toBe(200);
const body = res.json();
expect(Array.isArray(body)).toBe(true);
expect(body).toEqual([
{ agent: 'opencode', status: 'active', has_session: true, last_active_at: '2026-05-31T12:00:00.000Z' },
{ agent: 'goose', status: 'idle', has_session: false, last_active_at: null },
]);
// Contract field types.
expect(typeof body[0].agent).toBe('string');
expect(typeof body[0].status).toBe('string');
expect(typeof body[0].has_session).toBe('boolean');
expect(body[1].last_active_at).toBeNull();
await app.close();
});
it('returns an empty array when the session has no agent_sessions rows', async () => {
const app = buildApp({ sessionExists: true, rows: [] });
const res = await app.inject({ method: 'GET', url: '/api/sessions/session-1/agent-sessions' });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual([]);
await app.close();
});
it('404s when the session does not exist', async () => {
const app = buildApp({ sessionExists: false, rows: [] });
const res = await app.inject({ method: 'GET', url: '/api/sessions/nope/agent-sessions' });
expect(res.statusCode).toBe(404);
expect(res.json()).toEqual({ error: 'session not found' });
await app.close();
});
});


@@ -0,0 +1,211 @@
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
import Fastify, { type FastifyInstance } from 'fastify';
import { existsSync, readFileSync, writeFileSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { registerProviderRoutes } from '../providers.js';
import { load } from '../../services/provider-config.js';
import { loadProviderConfig } from '../../services/provider-config-registry.js';
import { clearProviderSnapshotCache } from '../../services/provider-snapshot.js';
import type { Config } from '../../config.js';
import type { Sql } from '../../db.js';
/** Minimal sql stub: available_agents reads return []. */
function mockSql(): Sql {
return vi.fn((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('available_agents')) return Promise.resolve([]);
return Promise.resolve([]);
}) as unknown as Sql;
}
let tmpCounter = 0;
function freshPath(): string {
tmpCounter += 1;
return join(tmpdir(), `coder-providers-routes-${process.pid}-${tmpCounter}.json`);
}
function buildApp(providersPath: string): FastifyInstance {
const app = Fastify();
// Mirror index.ts: tolerate empty JSON bodies.
app.removeContentTypeParser(['application/json']);
app.addContentTypeParser('application/json', { parseAs: 'string' }, (_req, body, done) => {
const str = (body as string) ?? '';
if (str.trim().length === 0) return done(null, {});
try {
done(null, JSON.parse(str));
} catch (err) {
done(err as Error, undefined);
}
});
const config = {
CODER_PROVIDERS_PATH: providersPath,
LLAMA_SWAP_URL: 'http://llama-swap.test',
PROVIDER_PROBE_TTL_MS: 86_400_000,
} as unknown as Config;
registerProviderRoutes(app, mockSql(), config);
return app;
}
const JSON_HEADERS = { 'content-type': 'application/json' };
const createdPaths: string[] = [];
beforeEach(() => {
clearProviderSnapshotCache();
loadProviderConfig('/nonexistent-coder-providers.json'); // reset registry to built-ins
vi.restoreAllMocks();
vi.stubGlobal('fetch', vi.fn().mockRejectedValue(new Error('no network in test')));
});
afterEach(() => {
for (const p of createdPaths.splice(0)) {
try {
rmSync(p, { force: true });
} catch {
/* ignore */
}
}
});
describe('GET /api/providers/config', () => {
it('returns the current config file (built-ins-only when missing)', async () => {
const path = freshPath();
createdPaths.push(path);
const app = buildApp(path);
const res = await app.inject({ method: 'GET', url: '/api/providers/config' });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ providers: {} });
await app.close();
});
it('reflects an existing file', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { enabled: false } } }));
const app = buildApp(path);
const res = await app.inject({ method: 'GET', url: '/api/providers/config' });
expect(res.json()).toEqual({ providers: { goose: { enabled: false } } });
await app.close();
});
});
describe('PATCH /api/providers/config', () => {
it('valid patch → 200, writes the merged file (order: validate→save→reload→clear)', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { label: 'Goose' } } }));
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { opencode: { enabled: false } } }),
});
expect(res.statusCode).toBe(200);
expect(res.json()).toMatchObject({ ok: true });
// File written + merged (goose untouched, opencode added).
const onDisk = load(path);
expect(onDisk.providers).toEqual({
goose: { label: 'Goose' },
opencode: { enabled: false },
});
await app.close();
});
it('null value deletes the override', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { enabled: false }, opencode: { enabled: false } } }));
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: null } }),
});
expect(res.statusCode).toBe(200);
expect(load(path).providers).toEqual({ opencode: { enabled: false } });
await app.close();
});
it('INVALID body → 422 and the file is NOT written (validate before save)', async () => {
const path = freshPath();
createdPaths.push(path);
const before = JSON.stringify({ providers: { goose: { enabled: true } } });
writeFileSync(path, before);
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: { enabled: 'yes' } } }), // bad type
});
expect(res.statusCode).toBe(422);
// File must be byte-for-byte unchanged — nothing written on a 422.
expect(readFileSync(path, 'utf8')).toBe(before);
await app.close();
});
it('save failure → 500 and the file is NOT created (no state divergence)', async () => {
const path = join(tmpdir(), `no-such-dir-${process.pid}-${Date.now()}`, 'coder-providers.json');
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: { enabled: false } } }),
});
expect(res.statusCode).toBe(500);
expect(existsSync(path)).toBe(false);
await app.close();
});
});
describe('POST /api/providers/refresh', () => {
it('no body → refreshes all registered providers', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'POST', url: '/api/providers/refresh' });
expect(res.statusCode).toBe(200);
expect(res.json().refreshed).toBeGreaterThan(0);
await app.close();
});
it('subset body → refreshed count reflects only the requested providers', async () => {
const app = buildApp(freshPath());
const res = await app.inject({
method: 'POST',
url: '/api/providers/refresh',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: ['boocode'] }),
});
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ refreshed: 1 });
await app.close();
});
});
describe('GET /api/providers/:id/diagnostic', () => {
it('known provider → 200 JSON { diagnostic }', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'GET', url: '/api/providers/boocode/diagnostic' });
expect(res.statusCode).toBe(200);
expect(res.headers['content-type']).toContain('application/json');
expect(res.json().diagnostic).toContain('provider: boocode');
await app.close();
});
it('unknown provider → 404', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'GET', url: '/api/providers/nope/diagnostic' });
expect(res.statusCode).toBe(404);
await app.close();
});
});


@@ -0,0 +1,59 @@
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
// v2.6 Phase 1-UX (design §9b): chat-scoped "resumed vs new session" indicator.
// `agent_sessions` is keyed (chat_id, agent) — the tab/chat is the agent-context
// unit (P1.5-b). The route param is a SESSION id, so we resolve every chat in the
// session and return the union of their agent_sessions rows. A session with two
// opencode tabs yields two rows (one per chat); the frontend keys the chip per
// chat, but the wire shape is a flat per-(chat,agent) list.
//
// has_session = agent_session_id IS NOT NULL — i.e. a native backend session id
// (opencode/ACP) was created and stored, so switching back resumes rather than
// starts fresh.
export interface AgentSessionRow {
agent: string;
status: string;
has_session: boolean;
last_active_at: string | null;
// v2.6.8 per-(chat,agent) running token/cost totals (sampling-streamjson-tokens
// #8). BIGINT columns arrive as strings over the wire, so the `number`
// annotations below describe the values only after the frontend coerces them.
input_tokens: number;
output_tokens: number;
cost: number;
}
export function registerAgentSessionRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/sessions/:sessionId/agent-sessions — list the agent-session rows for
// every chat in the session (drives the AgentComposerBar resumed/new chip).
app.get<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/agent-sessions',
async (req, reply) => {
const sessionId = req.params.sessionId;
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
// Join through chats so the session-scoped param resolves to its (chat,agent)
// rows. Rows are ordered by last_active_at DESC (NULLs last), so the frontend
// reads the freshest activity first.
const rows = await sql<AgentSessionRow[]>`
SELECT
a.agent AS agent,
a.status AS status,
(a.agent_session_id IS NOT NULL) AS has_session,
a.last_active_at AS last_active_at,
a.input_tokens AS input_tokens,
a.output_tokens AS output_tokens,
a.cost AS cost
FROM agent_sessions a
JOIN chats c ON c.id = a.chat_id
WHERE c.session_id = ${sessionId}
ORDER BY a.last_active_at DESC NULLS LAST, a.agent ASC
`;
return rows;
},
);
}
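
The comment on `AgentSessionRow` says BIGINT totals arrive as strings and that the frontend coerces them. A minimal sketch of such a coercion is below. `toCount` and the sample row are hypothetical and are not taken from the web app's actual code.

```typescript
// Hypothetical frontend helper: accept a string or number total and return a
// finite number, treating missing or unparseable values as 0.
function toCount(value: string | number | null | undefined): number {
  if (value === null || value === undefined) return 0;
  const n = typeof value === 'number' ? value : Number(value);
  return Number.isFinite(n) ? n : 0;
}

// Sample wire payload with BIGINT columns serialized as strings (assumed shape).
const row = { input_tokens: '15230', output_tokens: '982' };
const totalTokens = toCount(row.input_tokens) + toCount(row.output_tokens);
```

Values beyond `Number.MAX_SAFE_INTEGER` would lose precision with this approach, which is unlikely to matter for per-chat token counts.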


@@ -0,0 +1,136 @@
/**
* v2.0.5: Arena routes — competitive dispatch of the same task to multiple agents.
*
* POST /api/arena — create an arena with 2-5 contestants
* GET /api/arena/:id — get all tasks in an arena
* POST /api/arena/:id/select/:task_id — mark a task as the arena winner
*/
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
const ContestantSchema = z.object({
agent: z.string().max(100).optional(),
model: z.string().max(200).optional(),
mode_id: z.string().max(200).optional(),
thinking_option_id: z.string().max(200).optional(),
});
const CreateArenaBody = z.object({
project_id: z.string().uuid(),
input: z.string().min(1).max(64_000),
contestants: z.array(ContestantSchema).min(2).max(5),
});
interface TaskRow {
id: string;
agent: string | null;
model: string | null;
mode_id: string | null;
thinking_option_id: string | null;
state: string;
}
export function registerArenaRoutes(app: FastifyInstance, sql: Sql): void {
// POST /api/arena — create a new arena
app.post('/api/arena', async (req, reply) => {
const parsed = CreateArenaBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const { project_id, input, contestants } = parsed.data;
const arenaId = crypto.randomUUID();
const tasks: TaskRow[] = [];
for (const contestant of contestants) {
const [task] = await sql<TaskRow[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, arena_id)
VALUES (
${project_id},
${input},
${contestant.agent ?? null},
${contestant.model ?? null},
${contestant.mode_id ?? null},
${contestant.thinking_option_id ?? null},
${arenaId}
)
RETURNING id, agent, model, mode_id, thinking_option_id, state
`;
tasks.push(task!);
}
reply.code(201);
return {
arena_id: arenaId,
tasks: tasks.map((t) => ({
id: t.id,
agent: t.agent,
model: t.model,
mode_id: t.mode_id,
thinking_option_id: t.thinking_option_id,
state: t.state,
})),
};
});
// GET /api/arena/:arena_id — list all tasks in an arena
app.get<{ Params: { arena_id: string } }>('/api/arena/:arena_id', async (req, reply) => {
const { arena_id } = req.params;
// Validate UUID format
const uuidRegex = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
if (!uuidRegex.test(arena_id)) {
reply.code(400);
return { error: 'invalid arena_id format' };
}
const tasks = await sql`
SELECT id, project_id, state, input, output_summary, agent, model, mode_id, thinking_option_id, execution_path, session_id, started_at, ended_at, created_at, arena_id
FROM tasks
WHERE arena_id = ${arena_id}
ORDER BY created_at
`;
if (tasks.length === 0) {
reply.code(404);
return { error: 'arena not found' };
}
return { arena_id, tasks };
});
// POST /api/arena/:arena_id/select/:task_id — mark the winner
app.post<{ Params: { arena_id: string; task_id: string } }>(
'/api/arena/:arena_id/select/:task_id',
async (req, reply) => {
const { arena_id, task_id } = req.params;
// Verify the task belongs to this arena
const rows = await sql<{ id: string; state: string; arena_id: string | null }[]>`
SELECT id, state, arena_id FROM tasks WHERE id = ${task_id}
`;
if (rows.length === 0) {
reply.code(404);
return { error: 'task not found' };
}
const task = rows[0]!;
if (task.arena_id !== arena_id) {
reply.code(409);
return { error: 'task does not belong to this arena' };
}
// Mark as selected via output_summary prefix (lightweight — no schema change)
await sql`
UPDATE tasks
SET output_summary = COALESCE('[SELECTED] ' || output_summary, '[SELECTED]')
WHERE id = ${task_id}
`;
return { selected: true, task_id, arena_id };
}
);
}


@@ -0,0 +1,81 @@
import type { Sql } from '../db.js';
interface WorkspacePaneRow {
id: string;
kind: string;
chatId?: string;
chatIds?: string[];
activeChatIdx?: number;
}
function chatNameForKind(kind: string): string {
if (kind === 'coder' || kind === 'agent') return 'BooCoder';
if (kind === 'terminal') return 'Terminal';
return 'Chat';
}
function activeChatIdForPane(pane: WorkspacePaneRow): string | undefined {
const chatIds = pane.chatIds ?? [];
const idx = pane.activeChatIdx ?? 0;
if (idx >= 0 && idx < chatIds.length) return chatIds[idx];
return pane.chatId;
}
/** Resolve the active chat for a workspace pane; auto-seed when empty. */
export async function resolveChatId(
sql: Sql,
sessionId: string,
paneId: string,
): Promise<string | null> {
return sql.begin(async (tx) => {
const sessionRows = await tx<{ workspace_panes: WorkspacePaneRow[] }[]>`
SELECT workspace_panes FROM sessions WHERE id = ${sessionId} FOR UPDATE
`;
if (sessionRows.length === 0) return null;
const panes = sessionRows[0]!.workspace_panes ?? [];
const paneIdx = panes.findIndex((p) => p.id === paneId);
if (paneIdx < 0) return null;
const pane = panes[paneIdx]!;
const existingChatId = activeChatIdForPane(pane);
if (existingChatId) {
const chatRows = await tx<{ id: string }[]>`
SELECT id FROM chats
WHERE id = ${existingChatId}
AND session_id = ${sessionId}
AND status = 'open'
`;
if (chatRows.length > 0) return existingChatId;
}
const [newChat] = await tx<{ id: string }[]>`
INSERT INTO chats (session_id, name, status)
VALUES (${sessionId}, ${chatNameForKind(pane.kind)}, 'open')
RETURNING id
`;
if (!newChat) return null;
const nextChatIds = [...(pane.chatIds ?? []), newChat.id];
const nextActiveIdx = nextChatIds.length - 1;
const nextPanes = panes.map((p, i) =>
i === paneIdx
? {
...p,
chatIds: nextChatIds,
activeChatIdx: nextActiveIdx,
chatId: newChat.id,
}
: p,
);
await tx`
UPDATE sessions
SET workspace_panes = ${tx.json(nextPanes as never)},
updated_at = clock_timestamp()
WHERE id = ${sessionId}
`;
return newChat.id;
});
}
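
`resolveChatId` only seeds a new chat when the pane has no usable active chat. The selection rule it relies on can be exercised in isolation, as in the sketch below. It restates `activeChatIdForPane` under the local name `activeChatId`, and the pane values are made up for illustration.

```typescript
interface Pane {
  id: string;
  kind: string;
  chatId?: string;
  chatIds?: string[];
  activeChatIdx?: number;
}

// Prefer chatIds[activeChatIdx] when the index is in range; otherwise fall
// back to the single chatId field, which may be undefined.
function activeChatId(pane: Pane): string | undefined {
  const chatIds = pane.chatIds ?? [];
  const idx = pane.activeChatIdx ?? 0;
  if (idx >= 0 && idx < chatIds.length) return chatIds[idx];
  return pane.chatId;
}

// In-range index: the second tab wins.
const tabbed = activeChatId({ id: 'p1', kind: 'coder', chatIds: ['c1', 'c2'], activeChatIdx: 1 });
// Out-of-range index: falls back to chatId.
const fallback = activeChatId({ id: 'p2', kind: 'coder', chatIds: ['c1'], activeChatIdx: 3, chatId: 'c0' });
// Nothing set: undefined, which is the case where resolveChatId inserts a chat.
const unset = activeChatId({ id: 'p3', kind: 'terminal' });
```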


@@ -0,0 +1,73 @@
/**
* write-edit-robustness #4 — checkpoint restore + list routes (coder side).
*
* Proxied through the apps/server `/api/coder/*` blanket forwarder (no server-side
* change needed for new routes). Restore rewinds the session worktree to the
* checkpoint's shadow commit, trims the transcript from the anchor message forward,
* and resets the agent backend — see services/checkpoints.ts.
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { restoreCheckpoint, CheckpointNotFoundError } from '../services/checkpoints.js';
export function registerCheckpointRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/sessions/:sessionId/checkpoints?chat_id= — list a chat's checkpoints
// so the frontend can mark which messages have a restore point. When chat_id is
// omitted, returns every checkpoint for the session's chats.
app.get<{ Params: { sessionId: string }; Querystring: { chat_id?: string } }>(
'/api/sessions/:sessionId/checkpoints',
async (req, reply) => {
const sessionId = req.params.sessionId;
const chatId = req.query.chat_id;
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
// Scope authoritatively through chats.session_id (always set) — NOT the
// denormalized checkpoints.session_id (nullable). The chat_id branch must
// still be session-gated or it's an IDOR (any session's chat_id reads its
// checkpoints).
const rows = chatId
? await sql<{ id: string; chat_id: string; message_id: string | null; label: string | null; created_at: Date }[]>`
SELECT cp.id, cp.chat_id, cp.message_id, cp.label, cp.created_at
FROM checkpoints cp
JOIN chats c ON c.id = cp.chat_id
WHERE cp.chat_id = ${chatId} AND c.session_id = ${sessionId}
ORDER BY cp.created_at
`
: await sql<{ id: string; chat_id: string; message_id: string | null; label: string | null; created_at: Date }[]>`
SELECT cp.id, cp.chat_id, cp.message_id, cp.label, cp.created_at
FROM checkpoints cp
JOIN chats c ON c.id = cp.chat_id
WHERE c.session_id = ${sessionId}
ORDER BY cp.created_at
`;
return rows;
},
);
// POST /api/sessions/:sessionId/checkpoints/:checkpointId/restore — restore.
app.post<{ Params: { sessionId: string; checkpointId: string } }>(
'/api/sessions/:sessionId/checkpoints/:checkpointId/restore',
async (req, reply) => {
const { sessionId, checkpointId } = req.params;
try {
const result = await restoreCheckpoint(sql, checkpointId, {
sessionId,
log: app.log,
});
return result;
} catch (err) {
if (err instanceof CheckpointNotFoundError) {
reply.code(404);
return { error: err.message };
}
throw err;
}
},
);
}
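The session-scoping rule in the route above can be illustrated in isolation. The following is a minimal sketch, not the project's `restoreCheckpoint` code: the row shapes and both function names are hypothetical, and the fail-open expression is taken from the v2.7.2 commit message. It shows why resolving ownership through the chat denies a cross-session request that a check on the nullable denormalized column would let through.

```typescript
// Hypothetical row shapes for illustration only; not the project's real types.
interface CheckpointRow {
  id: string;
  chat_id: string;
  session_id: string | null; // denormalized column, may be null
}

interface ChatRow {
  id: string;
  session_id: string; // authoritative, always set
}

// Fail-closed check: resolve the owning session through the checkpoint's chat
// and deny when the chat is missing or belongs to a different session.
function checkpointBelongsToSession(
  cp: CheckpointRow,
  chatsById: Record<string, ChatRow | undefined>,
  sessionId: string,
): boolean {
  const chat = chatsById[cp.chat_id];
  if (!chat) return false;
  return chat.session_id === sessionId;
}

// The fail-open guard described in the commit message, kept only for contrast:
// a null denormalized session_id skips the comparison and the request passes.
function failOpenGuardAllows(cp: CheckpointRow, sessionId: string): boolean {
  return !(cp.session_id && cp.session_id !== sessionId);
}
```

With a checkpoint whose `session_id` is null, the fail-open guard allows any session, while the chat-based check only allows the session that owns the chat.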

@@ -0,0 +1,33 @@
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
export function registerInboxRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/inbox — tasks needing human attention (blocked or failed)
app.get('/api/inbox', async () => {
return sql`
SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, session_id, started_at, ended_at, created_at
FROM human_inbox
ORDER BY created_at DESC
LIMIT 100
`;
});
// POST /api/inbox/:id/retry — reset a blocked/failed task to pending for re-dispatch
app.post<{ Params: { id: string } }>('/api/inbox/:id/retry', async (req, reply) => {
const taskId = req.params.id;
const result = await sql`
UPDATE tasks
SET state = 'pending', started_at = NULL, ended_at = NULL, output_summary = NULL
WHERE id = ${taskId} AND state IN ('blocked', 'failed')
RETURNING id, state
`;
if (result.length === 0) {
reply.code(404);
return { error: 'task not found or not in retryable state' };
}
return { id: result[0]!.id, state: result[0]!.state };
});
}

@@ -0,0 +1,122 @@
/**
* v2.6 Phase 3 (3.3) — chat/session close-or-archive cleanup hook (coder side).
*
* Chat/session close + archive + delete all live in apps/server (Docker), which
* cannot see the host worktree dirs (/tmp/booworktrees), run git on them, or reach
* the warm agent processes the dispatcher pooled in THIS (host systemd) process. So
* — exactly like the `worktree-risk` guard — the server signals the coder when a
* chat/session closes, and the coder does the real teardown:
* 1. close the chat's opencode session on the shared server (`closeSession`),
* 2. dispose the chat's warm-ACP backends (`agentPool.closeChat`), which kills
* the goose/qwen child processes for that chat,
* 3. mark every `agent_sessions` row for the chat 'closed' + (when the session's
* last open chat closes) remove the shared session worktree, preflighting
* work-at-risk so uncommitted/unmerged work is never silently dropped
* (`closeChatBackendState`).
*
* Idempotent: closing an already-closed chat is a no-op (0 rows, no backend).
*
* SERVER WIRING (not done here — apps/server, out of this batch's scope): the
* server's `POST /api/chats/:id/archive`, `DELETE /api/chats/:id`, and the
* session archive/delete routes should fire-and-forget
* fetch(`${BOOCODER_URL}/api/chats/${id}/close`, { method: 'POST' })
* after publishing their WS frame (best-effort; the orphan-worktree reaper +
* idle-pool eviction are the backstop if the call is missed).
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { agentPool, OPENCODE_POOL_KEY } from '../services/agent-pool.js';
import { closeChatBackendState } from '../services/worktrees.js';
import type { AgentSessionHandle } from '../services/agent-backend.js';
export function registerLifecycleRoutes(app: FastifyInstance, sql: Sql): void {
// POST /api/chats/:chatId/close — tear down all warm state for a chat tab.
app.post<{ Params: { chatId: string }; Querystring: { force?: string } }>(
'/api/chats/:chatId/close',
async (req) => {
const chatId = req.params.chatId;
const force = req.query.force === 'true' || req.query.force === '1';
// 1. Close the chat's opencode session on the SHARED server (the server is
// not chat-keyed, so agentPool.closeChat won't touch it). Resolve the
// stored opencode session id and ask the backend to drop it.
const ocRows = await sql<{ agent: string; agent_session_id: string | null; worktree_id: string | null; session_id: string | null }[]>`
SELECT agent, agent_session_id, worktree_id, session_id
FROM agent_sessions
WHERE chat_id = ${chatId} AND backend = 'opencode_server'
`;
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
for (const row of ocRows) {
if (!row.agent_session_id) continue;
const handle: AgentSessionHandle = {
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
};
await ocBackend.closeSession(handle).catch((err) => {
app.log.warn({ err: err instanceof Error ? err.message : String(err), chatId }, 'lifecycle: opencode closeSession threw');
});
}
}
// 2. Dispose any warm-ACP backends pooled under this chat (kills the
// goose/qwen child + marks its agent row closed via the backend).
const disposed = await agentPool.closeChat(chatId);
// 3. DB + worktree truth: mark agent rows closed; remove the shared session
// worktree iff this was the session's last open chat (preflight at-risk).
const result = await closeChatBackendState(sql, chatId, { force });
app.log.info({ chatId, disposed, ...result }, 'lifecycle: chat closed');
return { ok: true, disposed, ...result };
},
);
// POST /api/sessions/:sessionId/close — close every open chat in a session
// (session archive/delete). Loops the chat-close path so the same preflight +
// teardown applies per chat; the worktree is removed on the last one.
app.post<{ Params: { sessionId: string }; Querystring: { force?: string } }>(
'/api/sessions/:sessionId/close',
async (req) => {
const sessionId = req.params.sessionId;
const force = req.query.force === 'true' || req.query.force === '1';
const chats = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE session_id = ${sessionId}
`;
const results: { chatId: string; disposed: string[]; worktreeRemoved: boolean; worktreeAtRisk: boolean }[] = [];
for (const c of chats) {
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
const ocRows = await sql<{ agent: string; agent_session_id: string | null; worktree_id: string | null; session_id: string | null }[]>`
SELECT agent, agent_session_id, worktree_id, session_id
FROM agent_sessions WHERE chat_id = ${c.id} AND backend = 'opencode_server'
`;
for (const row of ocRows) {
if (!row.agent_session_id) continue;
await ocBackend.closeSession({
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId: c.id,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
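}).catch((err) => {
app.log.warn({ err: err instanceof Error ? err.message : String(err), chatId: c.id }, 'lifecycle: opencode closeSession threw');
});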
}
}
const disposed = await agentPool.closeChat(c.id);
const r = await closeChatBackendState(sql, c.id, { force });
results.push({ chatId: c.id, disposed, worktreeRemoved: r.worktreeRemoved, worktreeAtRisk: r.worktreeAtRisk });
}
app.log.info({ sessionId, chats: results.length }, 'lifecycle: session closed');
return { ok: true, results };
},
);
}
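The SERVER WIRING note in the header comment of this file describes a fire-and-forget call from apps/server to the coder. A minimal sketch of that call follows. The helper names, the `FetchLike` type, and the injected `fetchImpl` parameter are all hypothetical and are not part of the diff; only the `/api/chats/:chatId/close` path and the `force` query flag come from the routes above.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a runtime global.
type FetchLike = (url: string, init: { method: string }) => Promise<unknown>;

// Hypothetical helper: builds the coder close URL for one chat.
function coderCloseUrl(baseUrl: string, chatId: string, force = false): string {
  const trimmed = baseUrl.replace(/\/+$/, '');
  const url = `${trimmed}/api/chats/${encodeURIComponent(chatId)}/close`;
  return force ? `${url}?force=true` : url;
}

// Fire-and-forget: start the request and return immediately. A failed call is
// passed to onError and otherwise ignored, since the route comment names the
// orphan-worktree reaper and idle-pool eviction as the backstop.
function notifyCoderChatClosed(
  fetchImpl: FetchLike,
  baseUrl: string,
  chatId: string,
  onError: (err: unknown) => void = () => {},
): void {
  void fetchImpl(coderCloseUrl(baseUrl, chatId), { method: 'POST' }).catch(onError);
}
```

Injecting `fetchImpl` keeps the helper testable without a network; in a server it would typically be the global `fetch`.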

@@ -0,0 +1,402 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import { resolveChatId } from './chat-resolve.js';
const AnswerUserInputBody = z.object({
tool_call_id: z.string().min(1),
answers: z
.array(
z.object({
question: z.string(),
selected_options: z.array(z.string()),
free_text: z.string().nullable(),
}),
)
.min(1)
.max(3),
});
const AskUserInputArgs = z.object({
questions: z
.array(
z.object({
question: z.string(),
type: z.enum(['single_select', 'multi_select']),
options: z.array(z.string()).min(1),
}),
)
.min(1)
.max(3),
});
const SendBody = z.object({
content: z.string().min(1).max(64_000),
pane_id: z.string().min(1).max(200),
chat_id: z.string().uuid().optional(),
provider: z.string().max(100).optional(),
model: z.string().max(200).optional(),
mode_id: z.string().max(200).optional(),
thinking_option_id: z.string().max(200).optional(),
});
interface InferenceApi {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void;
cancel: (sessionId: string, chatId: string) => Promise<boolean>;
hasActive: (chatId: string) => boolean;
}
interface MessageRow {
id: string;
role: string;
content: string | null;
status: string | null;
tool_calls: Array<{ id: string; name: string; args?: Record<string, unknown> }> | null;
tool_results: {
tool_call_id: string;
output: unknown;
truncated?: boolean;
error?: string;
} | null;
reasoning_parts: Array<{ text?: string }> | null;
}
function mapCoderMessageRow(row: MessageRow) {
if (row.role === 'tool') {
if (!row.tool_results?.tool_call_id) return null;
return {
id: row.id,
role: 'tool' as const,
tool_results: row.tool_results,
};
}
if (row.role !== 'user' && row.role !== 'assistant' && row.role !== 'system') {
return null;
}
const tool_calls = row.tool_calls?.map((tc) => ({
id: tc.id,
function: {
name: tc.name,
arguments: JSON.stringify(tc.args ?? {}),
},
}));
const reasoningText = row.reasoning_parts?.map((p) => p.text ?? '').join('') ?? '';
return {
id: row.id,
role: row.role as 'user' | 'assistant' | 'system',
content: row.content ?? '',
status: (row.status ?? 'complete') as 'streaming' | 'complete' | 'failed',
...(reasoningText ? { reasoning_text: reasoningText } : {}),
...(tool_calls?.length ? { tool_calls } : {}),
};
}
export function registerMessageRoutes(
app: FastifyInstance,
sql: Sql,
broker: Broker,
inference: InferenceApi,
): void {
// GET /api/sessions/:sessionId/messages — hydrate CoderPane on load / reconnect
app.get<{ Params: { sessionId: string }; Querystring: { chat_id?: string } }>(
'/api/sessions/:sessionId/messages',
async (req, reply) => {
const sessionId = req.params.sessionId;
const chatId = req.query.chat_id;
const sessionRows = await sql<{ id: string }[]>`
SELECT id FROM sessions WHERE id = ${sessionId}
`;
if (sessionRows.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
if (chatId) {
const chatRows = await sql<{ id: string }[]>`
SELECT id FROM chats
WHERE id = ${chatId} AND session_id = ${sessionId} AND status = 'open'
`;
if (chatRows.length === 0) {
reply.code(404);
return { error: 'chat not found or not open in this session' };
}
}
const rows = chatId
? await sql<MessageRow[]>`
SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
FROM messages_with_parts
WHERE session_id = ${sessionId} AND chat_id = ${chatId}
ORDER BY created_at ASC, id ASC
`
: await sql<MessageRow[]>`
SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
FROM messages_with_parts
WHERE session_id = ${sessionId}
ORDER BY created_at ASC, id ASC
`;
return rows.map(mapCoderMessageRow).filter((m) => m !== null);
},
);
// POST /api/sessions/:sessionId/messages — send a user message + kick off inference
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/messages',
async (req, reply) => {
const parsed = SendBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const sessionId = req.params.sessionId;
const { content, pane_id, chat_id: explicitChatId, provider, model, mode_id, thinking_option_id } =
parsed.data;
const isExternal = provider && provider !== 'boocode';
// Validate session exists
const sessionRows = await sql<{ id: string; project_id: string }[]>`
SELECT id, project_id FROM sessions WHERE id = ${sessionId}
`;
if (sessionRows.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
const resolved = await resolveChatId(sql, sessionId, pane_id);
if (!resolved) {
reply.code(404);
return { error: 'pane not found' };
}
let chatId = resolved;
if (explicitChatId) {
const chatRows = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE id = ${explicitChatId} AND session_id = ${sessionId} AND status = 'open'
`;
if (chatRows.length === 0) {
reply.code(404);
return { error: 'chat not found or not open in this session' };
}
chatId = explicitChatId;
}
if (!isExternal) {
// Reject if inference is already running on this chat
if (inference.hasActive(chatId)) {
reply.code(409);
return { error: 'inference already running on this chat' };
}
}
// Create user message
const [userMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'user', ${content}, 'complete', clock_timestamp())
RETURNING id
`;
await sql`UPDATE sessions SET updated_at = clock_timestamp() WHERE id = ${sessionId}`;
await sql`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chatId}`;
// Publish user message frames
broker.publishFrame(sessionId, {
type: 'message_started',
message_id: userMsg!.id,
chat_id: chatId,
role: 'user',
} as unknown as WsFrame);
broker.publishFrame(sessionId, {
type: 'delta',
message_id: userMsg!.id,
chat_id: chatId,
content,
} as unknown as WsFrame);
broker.publishFrame(sessionId, {
type: 'message_complete',
message_id: userMsg!.id,
chat_id: chatId,
} as unknown as WsFrame);
if (isExternal) {
// External provider: create a task for the dispatcher
const projectId = sessionRows[0]!.project_id;
const [task] = await sql<{ id: string; state: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, session_id, chat_id)
VALUES (${projectId}, ${content}, ${provider}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null}, ${sessionId}, ${chatId})
RETURNING id, state
`;
reply.code(202);
return { user_message_id: userMsg!.id, task_id: task!.id, dispatched: true };
}
// Native provider: create streaming assistant row + enqueue inference
const [assistantMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
RETURNING id
`;
inference.enqueue(sessionId, chatId, assistantMsg!.id, 'default');
reply.code(202);
return { user_message_id: userMsg!.id, assistant_message_id: assistantMsg!.id };
},
);
// POST /api/chats/:id/answer_user_input — answer a pending ask_user_input
app.post<{ Params: { id: string } }>(
'/api/chats/:id/answer_user_input',
async (req, reply) => {
const parsed = AnswerUserInputBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid_body', details: parsed.error.flatten() };
}
const { tool_call_id, answers } = parsed.data;
const chatRows = await sql<{ id: string; session_id: string }[]>`
SELECT id, session_id FROM chats WHERE id = ${req.params.id} AND status = 'open'
`;
if (chatRows.length === 0) {
reply.code(404);
return { error: 'chat_not_found' };
}
const chat = chatRows[0]!;
const sessionId = chat.session_id;
const callerRows = await sql<{
message_id: string;
payload: { id: string; name: string; args: Record<string, unknown> };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'assistant'
AND p.kind = 'tool_call'
AND p.payload->>'id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
if (!callerRows[0]) {
reply.code(404);
return { error: 'unknown_tool_call_id' };
}
const foundCall = callerRows[0].payload;
if (foundCall.name !== 'ask_user_input') {
reply.code(400);
return { error: 'tool_call_not_ask_user_input' };
}
const argsParsed = AskUserInputArgs.safeParse(foundCall.args);
if (!argsParsed.success) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: 'tool_call args invalid' };
}
const questions = argsParsed.data.questions;
if (answers.length !== questions.length) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: `expected ${questions.length} answer(s), got ${answers.length}` };
}
for (let i = 0; i < questions.length; i++) {
const q = questions[i]!;
const a = answers[i]!;
for (const sel of a.selected_options) {
if (!q.options.includes(sel)) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: `answer ${i + 1} option not in question: ${sel}` };
}
}
if (q.type === 'single_select' && a.selected_options.length > 1) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: `answer ${i + 1} multi on single_select` };
}
if (a.selected_options.length === 0 && (!a.free_text || !a.free_text.trim())) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: `answer ${i + 1} is empty` };
}
}
const toolRows = await sql<{
message_id: string;
payload: { tool_call_id: string; output: unknown };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'tool'
AND p.kind = 'tool_result'
AND p.payload->>'tool_call_id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
if (!toolRows[0]) {
reply.code(404);
return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
}
if (toolRows[0].payload?.output !== null) {
reply.code(409);
return { error: 'tool_call_already_answered' };
}
const answerSet = { answers };
const newToolResults = { tool_call_id, output: answerSet, truncated: false };
const toolMessageId = toolRows[0].message_id;
const result = await sql.begin(async (tx) => {
await tx`DELETE FROM message_parts WHERE message_id = ${toolMessageId} AND kind = 'tool_result'`;
await tx`
INSERT INTO message_parts (message_id, sequence, kind, payload)
VALUES (${toolMessageId}, 0, 'tool_result', ${tx.json(newToolResults as never)})
`;
const [assistantMsg] = await tx<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chat.id}, 'assistant', '', 'streaming', clock_timestamp())
RETURNING id
`;
await tx`UPDATE sessions SET updated_at = clock_timestamp() WHERE id = ${sessionId}`;
await tx`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chat.id}`;
return { tool_message_id: toolMessageId, assistant_message_id: assistantMsg!.id };
});
broker.publishFrame(sessionId, {
type: 'tool_result',
tool_message_id: result.tool_message_id,
tool_call_id,
chat_id: chat.id,
output: answerSet,
truncated: false,
} as unknown as WsFrame);
inference.enqueue(sessionId, chat.id, result.assistant_message_id, 'default');
reply.code(202);
return result;
},
);
// POST /api/sessions/:sessionId/stop — cancel active inference
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/stop',
async (req, reply) => {
const sessionId = req.params.sessionId;
// Walk the session's open chats and cancel the first one with active inference.
const chats = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE session_id = ${sessionId} AND status = 'open'
`;
let cancelled = false;
for (const chat of chats) {
if (inference.hasActive(chat.id)) {
cancelled = await inference.cancel(sessionId, chat.id);
break;
}
}
return { cancelled };
},
);
}
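The answer checks in the `answer_user_input` route can be read as one pure function. The sketch below restates them outside the route under that assumption; `findAnswerMismatch` and the two interfaces are hypothetical names, while the rules and detail strings mirror the route code above.

```typescript
interface Question {
  question: string;
  type: 'single_select' | 'multi_select';
  options: string[];
}

interface Answer {
  question: string;
  selected_options: string[];
  free_text: string | null;
}

// Returns null when the answers fit the questions, otherwise the detail
// string the route would report alongside 'mismatched_answer_shape'.
function findAnswerMismatch(questions: Question[], answers: Answer[]): string | null {
  if (answers.length !== questions.length) {
    return `expected ${questions.length} answer(s), got ${answers.length}`;
  }
  for (let i = 0; i < questions.length; i++) {
    const q = questions[i]!;
    const a = answers[i]!;
    // Every selected option must be one the question actually offered.
    const unknown = a.selected_options.filter((sel) => q.options.indexOf(sel) === -1);
    if (unknown.length > 0) {
      return `answer ${i + 1} option not in question: ${unknown[0]}`;
    }
    if (q.type === 'single_select' && a.selected_options.length > 1) {
      return `answer ${i + 1} multi on single_select`;
    }
    // An answer needs at least one selection or non-blank free text.
    if (a.selected_options.length === 0 && !a.free_text?.trim()) {
      return `answer ${i + 1} is empty`;
    }
  }
  return null;
}
```

Keeping the checks in a pure function like this would let them be unit-tested without a database or a Fastify instance.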

@@ -0,0 +1,193 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import {
listPending,
applyOne,
applyAll,
rejectOne,
rewindOne,
queueCreate,
} from '../services/pending_changes.js';
import { WriteGuardError } from '../services/write_guard.js';
import { rebaselineWorktreeAfterApply } from '../services/worktrees.js';
const CreateBody = z.object({
file_path: z.string().min(1),
content: z.string(),
});
/**
* Resolve project root from a session's project path.
*/
async function resolveProjectRoot(sql: Sql, sessionId: string): Promise<string | null> {
const rows = await sql<{ path: string }[]>`
SELECT p.path FROM sessions s
JOIN projects p ON s.project_id = p.id
WHERE s.id = ${sessionId}
`;
return rows.length > 0 ? rows[0]!.path : null;
}
/**
* Resolve project root from a pending change's session.
*/
async function resolveProjectRootForChange(sql: Sql, changeId: string): Promise<string | null> {
const rows = await sql<{ path: string }[]>`
SELECT p.path FROM pending_changes pc
JOIN sessions s ON pc.session_id = s.id
JOIN projects p ON s.project_id = p.id
WHERE pc.id = ${changeId}
`;
return rows.length > 0 ? rows[0]!.path : null;
}
export function registerPendingRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/sessions/:sessionId/pending — list pending changes for a session
app.get<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/pending',
async (req, reply) => {
const sessionId = req.params.sessionId;
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
const pending = await listPending(sql, sessionId);
return pending;
},
);
// POST /api/sessions/:sessionId/pending/create — queue a new-file create
// (manual create from the RightRail file browser; no inference involved).
// queueCreate runs resolveWritePath internally, so a path that escapes the
// project root or hits a secret file throws WriteGuardError → 422 with the
// guard message. Mirrors the { error } 404 shape used by the other routes
// and the 422 status used by apply/rewind on failure.
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/pending/create',
async (req, reply) => {
const sessionId = req.params.sessionId;
const parsed = CreateBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const projectRoot = await resolveProjectRoot(sql, sessionId);
if (!projectRoot) {
reply.code(404);
return { error: 'session or project not found' };
}
try {
const change = await queueCreate(
sql,
sessionId,
null,
parsed.data.file_path,
parsed.data.content,
projectRoot,
// Manual RightRail create — no agent staged it; renders as "manual".
null,
);
return change;
} catch (err) {
if (err instanceof WriteGuardError) {
reply.code(422);
return { error: err.message };
}
throw err;
}
},
);
// POST /api/sessions/:sessionId/pending/apply — apply all pending changes
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/pending/apply',
async (req, reply) => {
const sessionId = req.params.sessionId;
const projectRoot = await resolveProjectRoot(sql, sessionId);
if (!projectRoot) {
reply.code(404);
return { error: 'session or project not found' };
}
const results = await applyAll(sql, sessionId, projectRoot);
// v2.6 Phase 3 (3.5): re-baseline the session worktree's diff to the applied
// state, so the next external-agent turn diffs against applied-not-original
// and doesn't re-surface the just-applied changes. Best-effort: a worktree
// session may not exist (native-only chat), and a re-baseline hiccup must not
// fail the apply the user just requested.
if (results.some((r) => r.success)) {
await rebaselineWorktreeAfterApply(sql, sessionId).catch(() => {});
}
return { results };
},
);
// POST /api/pending/:id/apply — apply a single pending change
app.post<{ Params: { id: string } }>(
'/api/pending/:id/apply',
async (req, reply) => {
const changeId = req.params.id;
const projectRoot = await resolveProjectRootForChange(sql, changeId);
if (!projectRoot) {
reply.code(404);
return { error: 'pending change or project not found' };
}
const result = await applyOne(sql, changeId, projectRoot);
if (!result.success) {
reply.code(422);
} else {
// v2.6 Phase 3 (3.5): re-baseline the session worktree after a successful
// apply so the next external-agent turn diffs against applied-not-original.
// Resolve the change's session; best-effort, never fails the apply.
const sessRows = await sql<{ session_id: string }[]>`
SELECT session_id FROM pending_changes WHERE id = ${changeId}
`;
const sessionId = sessRows[0]?.session_id;
if (sessionId) await rebaselineWorktreeAfterApply(sql, sessionId).catch(() => {});
}
return result;
},
);
// POST /api/pending/:id/reject — reject a single pending change
app.post<{ Params: { id: string } }>(
'/api/pending/:id/reject',
async (req, reply) => {
const changeId = req.params.id;
await rejectOne(sql, changeId);
return { ok: true };
},
);
// POST /api/pending/:id/rewind — rewind (undo) an applied change
app.post<{ Params: { id: string } }>(
'/api/pending/:id/rewind',
async (req, reply) => {
const changeId = req.params.id;
const projectRoot = await resolveProjectRootForChange(sql, changeId);
if (!projectRoot) {
reply.code(404);
return { error: 'pending change or project not found' };
}
const result = await rewindOne(sql, changeId, projectRoot);
if (!result.success) {
reply.code(422);
}
return result;
},
);
}

@@ -0,0 +1,127 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Config } from '../config.js';
import {
getProviderSnapshot,
clearProviderSnapshotCache,
peekSnapshotEntry,
} from '../services/provider-snapshot.js';
import {
load,
save,
CoderProvidersFileSchema,
ProviderConfigPatchSchema,
mergeProviderConfigPatch,
} from '../services/provider-config.js';
import {
reloadProviderConfig,
getResolvedRegistry,
} from '../services/provider-config-registry.js';
import {
getProviderDiagnostic,
type DiagnosticAgentRow,
} from '../services/provider-diagnostic.js';
const RefreshBodySchema = z.object({ providers: z.array(z.string()).optional() });
export function registerProviderRoutes(app: FastifyInstance, sql: Sql, config: Config): void {
app.get<{ Querystring: { cwd?: string } }>('/api/providers/snapshot', async (req, _reply) => {
const cwd = req.query.cwd;
return getProviderSnapshot(sql, config, cwd);
});
// 4.1 — current loaded config file (raw CoderProvidersFile, not the resolved registry).
app.get('/api/providers/config', async (_req, _reply) => {
return load(config.CODER_PROVIDERS_PATH);
});
// 4.2 — patch the config file (design.md §6.2). Strict order is the whole
// correctness story: validate → save → reload → clear. A malformed body or an
// invalid merged result returns 422 and NEVER writes; a save failure returns
// 500 and leaves in-memory state untouched (no file/registry divergence).
app.patch('/api/providers/config', async (req, reply) => {
// 1. Validate the PATCH body shape (malformed → 422, never reaches merge).
const parsed = ProviderConfigPatchSchema.safeParse(req.body);
if (!parsed.success) {
return reply.code(422).send({
error: 'invalid provider config patch',
issues: parsed.error.flatten(),
});
}
// 2. Shallow per-id merge over the current file (null deletes; object replaces).
const current = load(config.CODER_PROVIDERS_PATH);
const merged = mergeProviderConfigPatch(current, parsed.data);
// 3. Validate the merged result — refuse to write a config that won't load.
const validated = CoderProvidersFileSchema.safeParse(merged);
if (!validated.success) {
return reply.code(422).send({
error: 'merged provider config is invalid',
issues: validated.error.flatten(),
});
}
// 4. Persist. If save throws, STOP here — do NOT reload/clear, so the file on
// disk and the in-memory resolved registry can never diverge.
try {
save(config.CODER_PROVIDERS_PATH, validated.data);
} catch (err) {
req.log.error(
{ err: err instanceof Error ? err.message : String(err), path: config.CODER_PROVIDERS_PATH },
'provider-config: save failed — in-memory state untouched',
);
return reply.code(500).send({ error: 'failed to write provider config' });
}
// 5 + 6. Rebuild the in-memory resolved registry from the new file, then drop
// the snapshot cache so the next /snapshot reflects the change.
reloadProviderConfig();
clearProviderSnapshotCache();
// 7. Return the new config (per §6.2 `{ ok: true }`, plus the merged providers
// so the client can update without a follow-up GET).
return { ok: true, providers: validated.data.providers };
});
// 4.3 — force a cold probe. Optional { providers?: string[] } narrows the
// reported subset (design.md §6.3 Paseo pattern). The force=true snapshot is
// the only existing re-probe primitive (per-provider force would be a
// snapshot-internal change, out of Phase 4 scope), so the probe runs for all
// installed providers; the `refreshed` count reflects the requested subset.
app.post('/api/providers/refresh', async (req, reply) => {
const parsed = RefreshBodySchema.safeParse(req.body ?? {});
if (!parsed.success) {
return reply.code(422).send({ error: 'invalid refresh body', issues: parsed.error.flatten() });
}
const subset = parsed.data.providers;
clearProviderSnapshotCache();
const entries = await getProviderSnapshot(sql, config, undefined, true);
const refreshed =
subset && subset.length > 0
? entries.filter((e) => subset.includes(e.name)).length
: entries.length;
return { refreshed };
});
// 4.4 — per-provider diagnostic (design.md §6.4 → JSON `{ diagnostic: string }`).
// Read-only: reports cached state (resolved def + available_agents row + warm
// snapshot cache for the last probe error) plus a `which` PATH check. No probe
// spawn. The report itself is a plaintext block (§8); the route wraps it as JSON.
app.get<{ Params: { id: string } }>('/api/providers/:id/diagnostic', async (req, reply) => {
const id = req.params.id;
const resolved = getResolvedRegistry().get(id);
if (!resolved) {
return reply.code(404).send({ error: `unknown provider '${id}'` });
}
const rows = await sql<DiagnosticAgentRow[]>`
SELECT name, install_path, supports_acp, models, last_probed_at
FROM available_agents WHERE name = ${id}
`;
const report = await getProviderDiagnostic(resolved, rows[0], {
cachedEntry: peekSnapshotEntry(id),
});
return { diagnostic: report };
});
}
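Step 2 of the PATCH route describes a shallow per-id merge in which `null` deletes a provider and an object replaces it. The real `mergeProviderConfigPatch` lives in services/provider-config.ts and is not shown in this diff, so the following is only a sketch of that rule under assumed shapes: `ProvidersFile`, `ProvidersPatch`, and `mergePatchSketch` are hypothetical, and the patch is assumed to carry a `providers` map keyed by id.

```typescript
// Hypothetical, simplified shapes; the real schemas are not shown in the diff.
type ProviderEntry = Record<string, unknown>;

interface ProvidersFile {
  providers: Record<string, ProviderEntry>;
}

interface ProvidersPatch {
  providers: Record<string, ProviderEntry | null>;
}

// Shallow per-id merge: null removes the id, an object replaces the whole
// entry (no deep merge), and ids absent from the patch are left untouched.
// The input file object is not mutated.
function mergePatchSketch(current: ProvidersFile, patch: ProvidersPatch): ProvidersFile {
  const providers: Record<string, ProviderEntry> = { ...current.providers };
  for (const id of Object.keys(patch.providers)) {
    const next = patch.providers[id];
    if (next === null) {
      delete providers[id];
    } else if (next !== undefined) {
      providers[id] = next;
    }
  }
  return { providers };
}
```

Because the merge is shallow, a client that wants to change one field of a provider has to send the full entry for that id; the route then re-validates the merged result before writing it.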

@@ -0,0 +1,124 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import { getSkillBody } from '@boocode/server/skills';
import {
buildSkillInvokeSyntheticFrames,
buildSkillInvokeUserFrames,
DEFAULT_SKILL_USER_MESSAGE,
runSkillInvokeTransaction,
} from '@boocode/server/skill-invoke';
import { resolveChatId } from './chat-resolve.js';
const SkillInvokeBody = z.object({
pane_id: z.string().min(1).max(200),
skill_name: z.string().min(1),
user_message: z.string().max(64_000).nullable().optional(),
// v2.5.9: when set to an external provider, the skill runs UNDER that agent —
// its body is injected into a dispatched task instead of native inference.
provider: z.string().max(100).optional(),
model: z.string().max(200).optional(),
mode_id: z.string().max(200).optional(),
thinking_option_id: z.string().max(200).optional(),
});
interface InferenceApi {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void;
hasActive: (chatId: string) => boolean;
}
export function registerSkillRoutes(
app: FastifyInstance,
sql: Sql,
broker: Broker,
inference: InferenceApi,
): void {
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/skill_invoke',
async (req, reply) => {
const parsed = SkillInvokeBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const sessionId = req.params.sessionId;
const { pane_id, skill_name, provider, model, mode_id, thinking_option_id } = parsed.data;
const sessionRows = await sql<{ id: string; project_id: string }[]>`
SELECT id, project_id FROM sessions WHERE id = ${sessionId}
`;
if (sessionRows.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
const chatId = await resolveChatId(sql, sessionId, pane_id);
if (!chatId) {
reply.code(404);
return { error: 'pane not found' };
}
if (inference.hasActive(chatId)) {
reply.code(409);
return { error: 'inference already running on this chat' };
}
const userText = parsed.data.user_message?.trim()
? parsed.data.user_message
: DEFAULT_SKILL_USER_MESSAGE;
const body = await getSkillBody(skill_name);
if (body === null) {
reply.code(404);
return { error: 'unknown_skill', message: `unknown skill: ${skill_name}` };
}
// v2.5.9: external agent → run the skill UNDER that agent. The skill body
// stays server-side (like the native path's tool message) and is injected
// into a dispatched task; the agent receives the skill instructions + the
// user's text. Mirrors the messages-route external-provider dispatch.
if (provider && provider !== 'boocode') {
const [userMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'user', ${userText}, 'complete', clock_timestamp())
RETURNING id
`;
broker.publishFrame(sessionId, { type: 'message_started', message_id: userMsg!.id, chat_id: chatId, role: 'user' } as WsFrame);
broker.publishFrame(sessionId, { type: 'delta', message_id: userMsg!.id, chat_id: chatId, content: userText } as WsFrame);
broker.publishFrame(sessionId, { type: 'message_complete', message_id: userMsg!.id, chat_id: chatId } as WsFrame);
const taskInput = `${body}\n\n---\n\n${userText}`;
const [task] = await sql<{ id: string; state: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, session_id, chat_id)
VALUES (${sessionRows[0]!.project_id}, ${taskInput}, ${provider}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null}, ${sessionId}, ${chatId})
RETURNING id, state
`;
await sql`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chatId}`;
reply.code(202);
return { user_message_id: userMsg!.id, task_id: task!.id, dispatched: true };
}
const { result, toolCall } = await runSkillInvokeTransaction(sql, {
sessionId,
chatId,
skillName: skill_name,
skillBody: body,
userText,
});
for (const frame of buildSkillInvokeSyntheticFrames(chatId, result, toolCall, body)) {
broker.publishFrame(sessionId, frame as WsFrame);
}
for (const frame of buildSkillInvokeUserFrames(chatId, result.user_message_id, userText)) {
broker.publishFrame(sessionId, frame as WsFrame);
}
inference.enqueue(sessionId, chatId, result.assistant_message_id, 'default');
reply.code(202);
return result;
},
);
}
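
The route above makes two small decisions that can be isolated as pure functions: falling back to a default user message, and composing the task input for an external agent. The following is a minimal sketch of both, not the project's code. `resolveUserText`, `buildTaskInput`, and the `FALLBACK_MESSAGE` value are hypothetical names for illustration; the real default is `DEFAULT_SKILL_USER_MESSAGE`, whose value is not shown in this diff.

```typescript
// Hypothetical stand-in; the real constant lives in @boocode/server/skill-invoke.
const FALLBACK_MESSAGE = 'Run this skill.';

// Mirrors the route: a missing or whitespace-only message falls back to the
// default, while a non-blank message is kept verbatim (it is NOT trimmed).
function resolveUserText(userMessage: string | null | undefined, fallback: string): string {
  return userMessage && userMessage.trim() ? userMessage : fallback;
}

// Mirrors the external-provider branch: skill body, a separator rule, then
// the user's text, joined into a single task input string.
function buildTaskInput(skillBody: string, userText: string): string {
  return `${skillBody}\n\n---\n\n${userText}`;
}

const exampleText = resolveUserText('   ', FALLBACK_MESSAGE);
console.log(buildTaskInput('# Skill\nDo the thing.', exampleText));
```

Keeping these two steps pure would make the fallback and the separator format testable without a database or broker.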


@@ -0,0 +1,48 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
const CostQuery = z.object({
group_by: z.enum(['project', 'agent', 'day']).default('project'),
});
export function registerStatsRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/stats/costs — aggregate cost_tokens by project, agent, or day
app.get('/api/stats/costs', async (req, reply) => {
const parsed = CostQuery.safeParse(req.query);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid query', details: parsed.error.flatten() };
}
const { group_by } = parsed.data;
switch (group_by) {
case 'project':
return sql`
SELECT project_id, COUNT(*)::int AS task_count, COALESCE(SUM(cost_tokens), 0)::int AS total_tokens
FROM tasks
WHERE cost_tokens IS NOT NULL
GROUP BY project_id
ORDER BY total_tokens DESC
`;
case 'agent':
return sql`
SELECT COALESCE(agent, 'native') AS agent, COUNT(*)::int AS task_count, COALESCE(SUM(cost_tokens), 0)::int AS total_tokens
FROM tasks
WHERE cost_tokens IS NOT NULL
GROUP BY COALESCE(agent, 'native')
ORDER BY total_tokens DESC
`;
case 'day':
return sql`
SELECT DATE(created_at) AS day, COUNT(*)::int AS task_count, COALESCE(SUM(cost_tokens), 0)::int AS total_tokens
FROM tasks
WHERE cost_tokens IS NOT NULL
GROUP BY DATE(created_at)
ORDER BY day DESC
LIMIT 90
`;
}
});
}
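
To make the `group_by=agent` aggregation concrete, here is an in-memory sketch of what that query computes, assuming rows are grouped on the coalesced label. `TaskRow`, `AgentCost`, and `costsByAgent` are hypothetical names used only for illustration; the route itself does this work in SQL.

```typescript
// Hypothetical row shapes; only the columns the query touches are modelled.
interface TaskRow {
  agent: string | null;
  cost_tokens: number | null;
}

interface AgentCost {
  agent: string;
  task_count: number;
  total_tokens: number;
}

function costsByAgent(tasks: TaskRow[]): AgentCost[] {
  const byAgent = new Map<string, AgentCost>();
  for (const task of tasks) {
    if (task.cost_tokens === null) continue; // WHERE cost_tokens IS NOT NULL
    const key = task.agent ?? 'native'; // COALESCE(agent, 'native')
    const row = byAgent.get(key) ?? { agent: key, task_count: 0, total_tokens: 0 };
    row.task_count += 1; // COUNT(*)
    row.total_tokens += task.cost_tokens; // SUM(cost_tokens)
    byAgent.set(key, row);
  }
  // ORDER BY total_tokens DESC
  return Array.from(byAgent.values()).sort((a, b) => b.total_tokens - a.total_tokens);
}

console.log(
  costsByAgent([
    { agent: null, cost_tokens: 10 },
    { agent: 'claude', cost_tokens: 30 },
    { agent: null, cost_tokens: 5 },
    { agent: 'qwen', cost_tokens: null },
  ]),
);
```

Tasks with a NULL `cost_tokens` are excluded before grouping, so an agent whose tasks never reported usage does not appear in the result at all.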


@@ -0,0 +1,185 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import { getPendingPermission, respondToPermission, cancelPendingPermission } from '../services/permission-waiter.js';
import { getTaskCommands } from '../services/agent-commands-cache.js';
interface InferenceApi {
cancel: (sessionId: string, chatId: string) => Promise<boolean>;
}
const CreateBody = z.object({
project_id: z.string().uuid(),
input: z.string().min(1).max(64_000),
agent: z.string().max(100).optional(),
model: z.string().max(200).optional(),
mode_id: z.string().max(200).optional(),
thinking_option_id: z.string().max(200).optional(),
});
const PermissionBody = z.object({
option_id: z.string().max(200).nullable(),
updated_input: z.record(z.unknown()).optional(),
});
const ListQuery = z.object({
state: z.enum(['pending', 'running', 'completed', 'failed', 'blocked', 'cancelled']).optional(),
project_id: z.string().uuid().optional(),
});
export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: InferenceApi): void {
// POST /api/tasks — create a new task
app.post('/api/tasks', async (req, reply) => {
const parsed = CreateBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const { project_id, input, agent, model, mode_id, thinking_option_id } = parsed.data;
const [task] = await sql<{ id: string; state: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id)
VALUES (${project_id}, ${input}, ${agent ?? null}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null})
RETURNING id, state
`;
reply.code(201);
return { id: task!.id, state: task!.state };
});
// GET /api/tasks — list tasks with optional filters
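app.get('/api/tasks', async (req, reply) => {
const parsed = ListQuery.safeParse(req.query);
if (!parsed.success) {
// Match the other routes: an invalid query is a 400, not a 200 with an error body.
reply.code(400);
return { error: 'invalid query', details: parsed.error.flatten() };
}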
const { state, project_id } = parsed.data;
// Build query with optional filters
if (state && project_id) {
return sql`
SELECT id, project_id, state, input, output_summary, agent, model, execution_path, session_id, started_at, ended_at, created_at
FROM tasks
WHERE state = ${state} AND project_id = ${project_id}
ORDER BY created_at DESC
LIMIT 100
`;
} else if (state) {
return sql`
SELECT id, project_id, state, input, output_summary, agent, model, execution_path, session_id, started_at, ended_at, created_at
FROM tasks
WHERE state = ${state}
ORDER BY created_at DESC
LIMIT 100
`;
} else if (project_id) {
return sql`
SELECT id, project_id, state, input, output_summary, agent, model, execution_path, session_id, started_at, ended_at, created_at
FROM tasks
WHERE project_id = ${project_id}
ORDER BY created_at DESC
LIMIT 100
`;
} else {
return sql`
SELECT id, project_id, state, input, output_summary, agent, model, execution_path, session_id, started_at, ended_at, created_at
FROM tasks
ORDER BY created_at DESC
LIMIT 100
`;
}
});
// GET /api/tasks/:id — single task detail
app.get<{ Params: { id: string } }>('/api/tasks/:id', async (req, reply) => {
const rows = await sql`
SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, execution_path, worktree_path, session_id, cost_tokens, started_at, ended_at, created_at
FROM tasks
WHERE id = ${req.params.id}
`;
if (rows.length === 0) {
reply.code(404);
return { error: 'task not found' };
}
return rows[0];
});
// POST /api/tasks/:id/cancel — cancel a pending or running task
app.post<{ Params: { id: string } }>('/api/tasks/:id/cancel', async (req, reply) => {
const taskId = req.params.id;
// Get current task state + session info
const rows = await sql<{ id: string; state: string; session_id: string | null }[]>`
SELECT id, state, session_id FROM tasks WHERE id = ${taskId}
`;
if (rows.length === 0) {
reply.code(404);
return { error: 'task not found' };
}
const task = rows[0]!;
if (task.state !== 'pending' && task.state !== 'running' && task.state !== 'blocked') {
reply.code(409);
return { error: `cannot cancel task in state '${task.state}'` };
}
cancelPendingPermission(taskId);
// If running, try to cancel inference
if ((task.state === 'running' || task.state === 'blocked') && task.session_id) {
// Find active chat in the task's session
const chats = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE session_id = ${task.session_id} AND status = 'open'
`;
for (const chat of chats) {
await inference.cancel(task.session_id, chat.id);
}
}
await sql`
UPDATE tasks
SET state = 'cancelled', ended_at = clock_timestamp()
WHERE id = ${taskId} AND state IN ('pending', 'running', 'blocked')
`;
return { cancelled: true };
});
// GET /api/tasks/:id/permission — pending permission prompt (if any)
app.get<{ Params: { id: string } }>('/api/tasks/:id/permission', async (req, reply) => {
const prompt = getPendingPermission(req.params.id);
if (!prompt) {
reply.code(404);
return { error: 'no pending permission' };
}
return prompt;
});
// POST /api/tasks/:id/permission — respond to a pending permission prompt
app.post<{ Params: { id: string } }>('/api/tasks/:id/permission', async (req, reply) => {
const parsed = PermissionBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const ok = respondToPermission(req.params.id, parsed.data.option_id, parsed.data.updated_input as Record<string, unknown> | undefined);
if (!ok) {
reply.code(404);
return { error: 'no pending permission' };
}
return { ok: true };
});
// GET /api/tasks/:id/commands — cached ACP slash commands (if any)
app.get<{ Params: { id: string } }>('/api/tasks/:id/commands', async (req, reply) => {
const commands = getTaskCommands(req.params.id);
if (!commands?.length) {
reply.code(404);
return { error: 'no commands cached' };
}
return { taskId: req.params.id, commands };
});
}
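
The cancel route encodes a small state rule: only `pending`, `running`, and `blocked` tasks may be cancelled, and only `running` or `blocked` tasks with a session need their inference stopped. A minimal sketch of that rule as pure predicates follows. `canCancel` and `needsInferenceCancel` are hypothetical helper names; the state list is taken from `ListQuery` above.

```typescript
type TaskState = 'pending' | 'running' | 'completed' | 'failed' | 'blocked' | 'cancelled';

// States the cancel route accepts; anything else yields a 409 in the route.
const CANCELLABLE: ReadonlySet<TaskState> = new Set<TaskState>(['pending', 'running', 'blocked']);

function canCancel(state: TaskState): boolean {
  return CANCELLABLE.has(state);
}

// A pending task has not started inference, so there is nothing to stop.
// A task without a session has no chats to cancel either.
function needsInferenceCancel(state: TaskState, sessionId: string | null): boolean {
  return (state === 'running' || state === 'blocked') && sessionId !== null && sessionId !== '';
}

console.log(canCancel('completed'), needsInferenceCancel('running', 's1'));
```

The same three states also appear in the route's `UPDATE ... WHERE state IN (...)` guard, so a shared constant would keep the check and the write in step.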


@@ -0,0 +1,45 @@
/**
* Session-delete work-loss guard (coder side).
*
* Session delete itself lives in apps/server (Docker), which CANNOT see the
* host worktree dirs (/tmp/booworktrees) or run git on them. Only BooCoder
* (host systemd) can. So the server's DELETE route calls these endpoints
* pre-delete to learn whether a session's worktree holds work at risk, and to
* stash it. The server owns the gate; coder owns the git truth.
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { checkWorktreeWorkAtRisk, stashWorktree } from '../services/worktrees.js';
export function registerWorktreeSafetyRoutes(app: FastifyInstance, sql: Sql): void {
// GET risk for a session's worktree(s). Expected to be one row per session today
// (by convention; `worktrees` has no key on session_id); the loop already handles
// the Phase-1.5 multi-worktree case.
app.get<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/worktree-risk',
async (req) => {
const rows = await sql<{ worktree_path: string }[]>`
SELECT path AS worktree_path FROM worktrees WHERE session_id = ${req.params.sessionId}
`;
const reports = [];
for (const row of rows) {
reports.push(await checkWorktreeWorkAtRisk(row.worktree_path));
}
return { reports };
},
);
// Stash a session's worktree(s) — clears the dirty risk; recoverable.
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/worktree-stash',
async (req) => {
const rows = await sql<{ worktree_path: string }[]>`
SELECT path AS worktree_path FROM worktrees WHERE session_id = ${req.params.sessionId}
`;
const results = [];
for (const row of rows) {
results.push({ worktreePath: row.worktree_path, ...(await stashWorktree(row.worktree_path)) });
}
return { results };
},
);
}
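
The header comment says the server owns the gate while coder owns the git truth. One way the server-side gate could consume the `reports` array is sketched below. This is an assumption-heavy illustration: the `RiskReport` fields (`worktreePath`, `atRisk`, `reasons`) and the `decideSessionDelete` function are hypothetical, since the real return type of `checkWorktreeWorkAtRisk` is not shown in this diff.

```typescript
// Hypothetical report shape; the real one comes from services/worktrees.js.
interface RiskReport {
  worktreePath: string;
  atRisk: boolean;
  reasons: string[];
}

type DeleteDecision =
  | { action: 'delete' }
  | { action: 'stash_then_delete'; paths: string[] }
  | { action: 'refuse'; paths: string[]; reasons: string[] };

// Decide what the DELETE route should do, given coder's per-worktree reports.
function decideSessionDelete(reports: RiskReport[], stashFirst: boolean): DeleteDecision {
  const risky = reports.filter((r) => r.atRisk);
  if (risky.length === 0) return { action: 'delete' }; // nothing to lose
  const paths = risky.map((r) => r.worktreePath);
  if (stashFirst) return { action: 'stash_then_delete', paths }; // call worktree-stash first
  const reasons = risky.reduce<string[]>((acc, r) => acc.concat(r.reasons), []);
  return { action: 'refuse', paths, reasons };
}

console.log(
  decideSessionDelete(
    [{ worktreePath: '/tmp/booworktrees/a', atRisk: true, reasons: ['uncommitted changes'] }],
    false,
  ),
);
```

Returning a decision value rather than performing the delete keeps the gate testable without Docker or a host worktree.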


@@ -0,0 +1,51 @@
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker';
export function registerWebSocket(
app: FastifyInstance,
sql: Sql,
broker: Broker,
): void {
// Per-session streaming WebSocket. Clients connect here to receive live
// inference frames (deltas, tool_calls, tool_results, message_complete).
app.get<{ Params: { sessionId: string } }>(
'/api/ws/sessions/:sessionId',
{ websocket: true },
async (socket, req) => {
const sessionId = req.params.sessionId;
// Validate session exists
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
socket.send(JSON.stringify({ type: 'error', error: 'session not found' }));
socket.close(1008, 'session not found');
return;
}
// Send snapshot of existing messages so client can hydrate
const messages = await sql<Record<string, unknown>[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, last_seq,
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
summary, tail_start_id, compacted_at
FROM messages_with_parts
WHERE session_id = ${sessionId}
ORDER BY created_at ASC, id ASC
`;
socket.send(JSON.stringify({ type: 'snapshot', messages }));
// Subscribe to broker for live frames
const unsubscribe = broker.subscribe(sessionId, (frame) => {
if (socket.readyState !== socket.OPEN) return;
try {
socket.send(JSON.stringify(frame));
} catch (err) {
app.log.warn({ err, sessionId }, 'ws send failed');
}
});
socket.on('close', () => unsubscribe());
socket.on('error', () => unsubscribe());
},
);
}
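
On the client side, the socket delivers one `snapshot` frame followed by live frames. A minimal sketch of a reducer that hydrates from the snapshot and then applies `message_started`, `delta`, and `message_complete` frames is shown below. The frame fields are the ones published by the skill route in this diff. The `ChatMessage` shape is trimmed to a few columns, and the `'streaming'` status value and the `applyFrame` name are assumptions made for the example.

```typescript
interface ChatMessage {
  id: string;
  chat_id: string;
  role: string;
  content: string;
  status: string;
}

type Frame =
  | { type: 'snapshot'; messages: ChatMessage[] }
  | { type: 'message_started'; message_id: string; chat_id: string; role: string }
  | { type: 'delta'; message_id: string; chat_id: string; content: string }
  | { type: 'message_complete'; message_id: string; chat_id: string };

type MessageState = ReadonlyMap<string, ChatMessage>;

function applyFrame(state: MessageState, frame: Frame): MessageState {
  switch (frame.type) {
    case 'snapshot': {
      // The snapshot replaces whatever the client held before.
      const next = new Map<string, ChatMessage>();
      for (const m of frame.messages) next.set(m.id, m);
      return next;
    }
    case 'message_started': {
      if (state.has(frame.message_id)) return state; // already hydrated
      const next = new Map(state);
      next.set(frame.message_id, {
        id: frame.message_id,
        chat_id: frame.chat_id,
        role: frame.role,
        content: '',
        status: 'streaming', // assumed status name
      });
      return next;
    }
    case 'delta': {
      const current = state.get(frame.message_id);
      // Skip deltas for rows the snapshot already delivered as complete,
      // otherwise their content would be appended twice.
      if (!current || current.status === 'complete') return state;
      const next = new Map(state);
      next.set(frame.message_id, { ...current, content: current.content + frame.content });
      return next;
    }
    case 'message_complete': {
      const current = state.get(frame.message_id);
      if (!current || current.status === 'complete') return state;
      const next = new Map(state);
      next.set(frame.message_id, { ...current, status: 'complete' });
      return next;
    }
  }
}

let demoState: MessageState = new Map();
demoState = applyFrame(demoState, { type: 'message_started', message_id: 'm1', chat_id: 'c1', role: 'assistant' });
demoState = applyFrame(demoState, { type: 'delta', message_id: 'm1', chat_id: 'c1', content: 'hello' });
console.log(demoState.get('m1'));
```

The complete-status check matters because the server sends the snapshot before it subscribes to the broker, so a frame for a message that is already in the snapshot can still arrive afterwards.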

apps/coder/src/schema.sql Normal file

@@ -0,0 +1,281 @@
-- v2.0.0: BooCoder schema — pending changes, tasks, agent registry.
-- Applied on startup by apps/coder/src/db.ts:applySchema().
-- Lives in the same 'boochat' database as BooChat's tables.
CREATE TABLE IF NOT EXISTS pending_changes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL,
task_id UUID,
file_path TEXT NOT NULL,
operation TEXT NOT NULL,
diff TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
CONSTRAINT pending_changes_operation_chk CHECK (operation IN ('create', 'edit', 'delete')),
CONSTRAINT pending_changes_status_chk CHECK (status IN ('pending', 'applied', 'rejected', 'reverted'))
);
CREATE TABLE IF NOT EXISTS tasks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
project_id UUID NOT NULL,
parent_task_id UUID REFERENCES tasks(id),
state TEXT NOT NULL DEFAULT 'pending',
input TEXT NOT NULL,
output_summary TEXT,
agent TEXT,
model TEXT,
execution_path TEXT,
worktree_path TEXT,
cost_tokens INTEGER,
started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
CONSTRAINT tasks_state_chk CHECK (state IN ('pending', 'running', 'completed', 'failed', 'blocked', 'cancelled')),
CONSTRAINT tasks_execution_path_chk CHECK (execution_path IS NULL OR execution_path IN ('native', 'acp', 'pty', 'qwen'))
);
CREATE TABLE IF NOT EXISTS available_agents (
name TEXT PRIMARY KEY,
install_path TEXT,
version TEXT,
supports_acp BOOLEAN NOT NULL DEFAULT false,
supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
last_probed_at TIMESTAMPTZ
);
-- v2.0.0 Phase 4: link tasks to their inference sessions.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id UUID REFERENCES sessions(id);
-- v2.0.5: add 'qwen' to execution_path CHECK + arena_id column.
ALTER TABLE tasks DROP CONSTRAINT IF EXISTS tasks_execution_path_chk;
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'tasks_execution_path_chk') THEN
ALTER TABLE tasks ADD CONSTRAINT tasks_execution_path_chk
CHECK (execution_path IS NULL OR execution_path IN ('native', 'acp', 'pty', 'qwen'));
END IF;
END $$;
-- v2.0.5: arena support — group tasks into competitive arenas.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS arena_id UUID;
-- Human inbox: tasks needing attention
CREATE OR REPLACE VIEW human_inbox AS
SELECT * FROM tasks WHERE state IN ('blocked', 'failed');
-- v2.1.0: provider picker — extend available_agents with model discovery.
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS models JSONB DEFAULT '[]'::jsonb;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS label TEXT;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS transport TEXT DEFAULT 'pty';
-- v2.5.10: persisted ACP available_commands (captured during the cold probe), so
-- an agent's live command set survives the tier-2 probe skip and shows without a
-- dispatch.
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS commands JSONB DEFAULT '[]'::jsonb;
-- v2.2.0: Paseo-style session config on tasks.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS mode_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS thinking_option_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS feature_values JSONB;
-- v2.6: one shared worktree per session (all agents/panes in the session operate in it).
CREATE TABLE IF NOT EXISTS session_worktrees (
session_id UUID PRIMARY KEY REFERENCES sessions(id) ON DELETE CASCADE,
worktree_path TEXT NOT NULL,
base_commit TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
-- P1.5-b: DEFANG the CASCADE — a session delete must no longer wipe its worktree
-- row. This table is SUPERSEDED by `worktrees` below; all readers are repointed
-- this phase, so the row just persists (dead) on session delete until a later
-- cleanup drops the table. session_id is this table's PRIMARY KEY, so it cannot be
-- nullable → SET NULL is invalid and NO ACTION/RESTRICT would block deletes; the
-- only valid defang is to drop the FK with no replacement. Idempotent: only fires
-- while the FK is still ON DELETE CASCADE ('c').
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'session_worktrees_session_id_fkey'
AND confdeltype = 'c'
) THEN
ALTER TABLE session_worktrees DROP CONSTRAINT session_worktrees_session_id_fkey;
END IF;
END $$;
-- v2.6: one backend session per (session, agent); resumed on switch-back.
CREATE TABLE IF NOT EXISTS agent_sessions (
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
agent TEXT NOT NULL,
backend TEXT NOT NULL,
agent_session_id TEXT,
server_port INTEGER,
status TEXT NOT NULL DEFAULT 'idle',
last_active_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
PRIMARY KEY (session_id, agent),
CONSTRAINT agent_sessions_backend_chk CHECK (backend IN ('opencode_server', 'acp_warm')),
CONSTRAINT agent_sessions_status_chk CHECK (status IN ('idle', 'active', 'crashed', 'closed'))
);
-- Migrate existing agent_sessions FK to CASCADE.
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'agent_sessions_session_id_fkey'
AND confdeltype <> 'c'
) THEN
ALTER TABLE agent_sessions DROP CONSTRAINT agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE CASCADE;
END IF;
END $$;
-- v2.6: config fingerprint for stale-session detection (auto-recover on model change).
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS config_hash TEXT;
-- v2.6 Phase 1-UX (U.6): opencode token/cost usage, ACCUMULATED per (chat_id, agent).
-- opencode's warm server emits `session.next.step.ended` once per LLM step (several
-- per multi-tool turn) carrying {tokens{input,output,reasoning,cache},cost}. We sum
-- each step's normalized {input,output,cost} onto the session row — running totals
-- for the whole conversation context, not last-step. Surfaced through the route
-- and the session chip UI as of v2.7.3.
-- input_tokens folds in cache read+write; output_tokens folds in reasoning (see
-- backends/opencode-usage.ts). Defaults 0 so accumulation (col + delta) is well-defined.
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS input_tokens BIGINT NOT NULL DEFAULT 0;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS output_tokens BIGINT NOT NULL DEFAULT 0;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS cost DOUBLE PRECISION NOT NULL DEFAULT 0;
-- ─── P1.5-b (corrected): worktrees entity + re-key agent_sessions to (chat_id, agent) ───
-- The TAB (a chat) is the context unit: two opencode tabs in one session = two
-- independent contexts sharing one worktree. So agent_sessions keys on
-- (chat_id, agent), NOT (worktree_id, agent) or (session_id, agent). The
-- `worktrees` table is one-per-session (selectable later) and only referenced
-- informationally by agent_sessions.worktree_id (SET NULL); chat_id is the key.
--
-- PREREQUISITE: the unmigratable test session (35 chats, 1 agent_sessions row that
-- maps to no single chat) is DELETED before this runs, so agent_sessions is empty
-- and the chat_id backfill is N/A. If a row with NULL chat_id remains, the verify
-- gate below RAISEs and aborts — delete the offending session first.
-- worktree as a first-class entity; survives session delete (session_id SET NULL).
CREATE TABLE IF NOT EXISTS worktrees (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID REFERENCES sessions(id) ON DELETE SET NULL,
project_id UUID,
path TEXT NOT NULL,
branch TEXT,
base_commit TEXT,
slug TEXT,
status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active','archived')),
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE UNIQUE INDEX IF NOT EXISTS worktrees_active_path_uidx ON worktrees(path) WHERE status='active';
-- Migrate any surviving session_worktrees rows → worktrees (idempotent; 0 rows
-- after the test-session delete, kept for generality / fresh-DB safety).
INSERT INTO worktrees (session_id, path, branch, base_commit, status)
SELECT sw.session_id, sw.worktree_path, 'session-' || sw.session_id, sw.base_commit, 'active'
FROM session_worktrees sw
WHERE NOT EXISTS (SELECT 1 FROM worktrees w WHERE w.session_id = sw.session_id AND w.status='active');
-- Dispatch hint: which chat (tab) a task belongs to. The coder message route and
-- skills route set it from the frontend tab; session-less creators (arena, MCP,
-- new_task, generic /api/tasks) leave it NULL and the dispatcher creates a chat.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS chat_id UUID REFERENCES chats(id) ON DELETE SET NULL;
-- Re-key columns on agent_sessions.
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS chat_id UUID;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS worktree_id UUID;
-- BACKFILL-VERIFY GATE: the new PK is (chat_id, agent), so chat_id must be
-- non-null on every row before the swap. With the test session deleted this is a
-- 0-row assertion; if any row has NULL chat_id (an unmigratable pre-existing row),
-- abort loudly rather than create a degenerate (NULL, agent) key.
DO $$
DECLARE n int;
BEGIN
SELECT count(*) INTO n FROM agent_sessions WHERE chat_id IS NULL;
IF n > 0 THEN
RAISE EXCEPTION 'P1.5-b: % agent_sessions row(s) have NULL chat_id — delete the unmigratable session(s) before applying', n;
END IF;
END $$;
-- Swap PK (session_id,agent) → (chat_id,agent) + FKs (run-once, guarded on the new
-- FK's absence). chat_id CASCADEs from chats (closing a tab ends its context);
-- worktree_id is informational SET NULL; session_id defanged to nullable SET NULL.
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'agent_sessions_chat_id_fkey') THEN
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_pkey;
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ALTER COLUMN session_id DROP NOT NULL;
ALTER TABLE agent_sessions ALTER COLUMN chat_id SET NOT NULL;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_pkey PRIMARY KEY (chat_id, agent);
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_chat_id_fkey
FOREIGN KEY (chat_id) REFERENCES chats(id) ON DELETE CASCADE;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE SET NULL;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_worktree_id_fkey
FOREIGN KEY (worktree_id) REFERENCES worktrees(id) ON DELETE SET NULL;
END IF;
END $$;
-- P1.5-b follow-up: converge agent_sessions.session_id FK CASCADE → SET NULL.
-- The re-key block above re-adds session_id_fkey as SET NULL, but it is guarded on
-- chat_id_fkey's ABSENCE — so a DB already re-keyed to (chat_id, agent) while
-- session_id_fkey was still ON DELETE CASCADE never re-enters that block and stays
-- 'c'. This standalone guard flips it to SET NULL ('n'), matching worktree_id.
-- Idempotent (mirrors the session_worktrees defang's confdeltype check): only fires
-- while the FK is still CASCADE — a no-op on a fresh deploy (already 'n' from the
-- re-key block) and on every re-run thereafter.
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'agent_sessions_session_id_fkey'
AND confdeltype = 'c'
) THEN
ALTER TABLE agent_sessions ALTER COLUMN session_id DROP NOT NULL;
ALTER TABLE agent_sessions DROP CONSTRAINT agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE SET NULL;
END IF;
END $$;
-- v2.6: attribution for DiffPanel badges (Phase 1 UX reads this).
ALTER TABLE pending_changes ADD COLUMN IF NOT EXISTS agent TEXT;
-- write-edit-robustness #4: worktree checkpoints. A pre-turn shadow-commit of the
-- session worktree (tracked + untracked, captured without disturbing the real
-- index/working tree) stored in a private GC-safe ref refs/boocode/checkpoints/<id>.
-- Created best-effort before each external-agent turn (opencode / warm-ACP / one-shot
-- ACP+PTY); restore resets the worktree to commit_sha, trims the transcript from
-- message_id forward, and resets the backend session. chat_id CASCADEs from chats
-- (like agent_sessions); worktree_id SET NULL so a checkpoint outlives a reaped
-- worktree row. session_id / message_id are informational (no FK — message rows are
-- trimmed by a checkpoint restore and we must not block that on a dangling ref).
CREATE TABLE IF NOT EXISTS checkpoints (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
session_id UUID,
worktree_id UUID REFERENCES worktrees(id) ON DELETE SET NULL,
message_id UUID, -- anchor: the assistant turn row this checkpoint precedes
commit_sha TEXT NOT NULL, -- shadow-commit capturing the pre-turn worktree tree
label TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS checkpoints_chat_created_idx ON checkpoints(chat_id, created_at);
-- LISTEN/NOTIFY fast path: every tasks INSERT (from any call site — routes,
-- new_task tool, arena, MCP server) fires pg_notify('tasks_new') in the same
-- transaction, so the dispatcher reacts immediately instead of waiting for the
-- fallback poll. Postgres holds the notification until COMMIT, so the listener
-- always sees the committed row. A trigger covers all insert paths with no
-- app-code drift. Idempotent: re-applied on every startup.
CREATE OR REPLACE FUNCTION notify_tasks_new() RETURNS trigger AS $$
BEGIN
PERFORM pg_notify('tasks_new', '');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS tasks_notify_new ON tasks;
CREATE TRIGGER tasks_notify_new
AFTER INSERT ON tasks
FOR EACH ROW
EXECUTE FUNCTION notify_tasks_new();
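
Several migration guards in this schema test `pg_constraint.confdeltype` against `'c'`. For readers unfamiliar with the single-letter codes, the sketch below decodes them and restates the guard condition. The code-to-action mapping follows the PostgreSQL catalog documentation; `decodeConfdeltype` and `needsDefang` are hypothetical helper names, not part of the project.

```typescript
type FkDeleteAction = 'NO ACTION' | 'RESTRICT' | 'CASCADE' | 'SET NULL' | 'SET DEFAULT';

// pg_constraint.confdeltype codes for a foreign key's ON DELETE action.
const CONFDELTYPE: Readonly<Record<string, FkDeleteAction>> = {
  a: 'NO ACTION',
  r: 'RESTRICT',
  c: 'CASCADE',
  n: 'SET NULL',
  d: 'SET DEFAULT',
};

function decodeConfdeltype(code: string): FkDeleteAction {
  const action = CONFDELTYPE[code];
  if (action === undefined) throw new Error(`unknown confdeltype: ${code}`);
  return action;
}

// Mirrors the "only fires while the FK is still CASCADE" guards: they act on
// 'c' and are a no-op for every other code, which is what makes them idempotent.
function needsDefang(code: string): boolean {
  return decodeConfdeltype(code) === 'CASCADE';
}

console.log(decodeConfdeltype('n'), needsDefang('c'));
```

Once a guard has converted a key from `'c'` to `'n'`, the condition is false on every later startup, so re-applying the schema changes nothing.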


@@ -0,0 +1,50 @@
import { describe, it, expect, afterEach } from 'vitest';
import { mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { readWorktreeTextFile, writeWorktreeTextFile } from '../acp-client-fs.js';
const created: string[] = [];
function freshWorktree(): string {
const wt = mkdtempSync(join(tmpdir(), 'acp-wt-'));
created.push(wt);
return wt;
}
afterEach(() => {
for (const d of created.splice(0)) {
try {
rmSync(d, { recursive: true, force: true });
rmSync(`${d}-evil`, { recursive: true, force: true });
} catch {
/* ignore */
}
}
});
describe('acp-client-fs worktree scoping', () => {
it('writes then reads a file inside the worktree', async () => {
const wt = freshWorktree();
await writeWorktreeTextFile(wt, 'sub/dir/note.txt', 'hello');
expect(await readWorktreeTextFile(wt, 'sub/dir/note.txt')).toBe('hello');
});
it('rejects ../ traversal on read', async () => {
const wt = freshWorktree();
await expect(readWorktreeTextFile(wt, '../../etc/passwd')).rejects.toThrow(/escapes worktree/);
});
it('rejects ../ traversal on write', async () => {
const wt = freshWorktree();
await expect(writeWorktreeTextFile(wt, '../escape.txt', 'x')).rejects.toThrow(/escapes worktree/);
});
it('rejects a sibling-prefix path (the unbounded-startsWith bug)', async () => {
const wt = freshWorktree();
// Absolute path that shares the worktree as a STRING prefix but is a sibling
// dir: `<wt>-evil/...`. A bare `startsWith(<wt>)` wrongly admits it.
await expect(readWorktreeTextFile(wt, `${wt}-evil/secret.txt`)).rejects.toThrow(/escapes worktree/);
await expect(writeWorktreeTextFile(wt, `${wt}-evil/secret.txt`, 'x')).rejects.toThrow(
/escapes worktree/,
);
});
});
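
The last test above targets a classic containment bug: a bare `startsWith(worktree)` admits `<wt>-evil/...` because it compares strings, not path segments. A minimal sketch of a segment-aware check using `path.relative` follows. This is not the implementation in `acp-client-fs.js`, which is not shown in this diff; `resolveInsideWorktree` is a hypothetical name, and the sketch does not resolve symlinks.

```typescript
import * as path from 'node:path';

// Resolve `requested` against the worktree and reject anything that lands
// outside it. Comparing via path.relative works on whole path segments, so
// a sibling directory such as `<wt>-evil` is correctly treated as outside.
function resolveInsideWorktree(worktree: string, requested: string): string {
  const root = path.resolve(worktree);
  const target = path.resolve(root, requested); // absolute inputs stay absolute
  const rel = path.relative(root, target);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes worktree: ${requested}`);
  }
  return target;
}

console.log(resolveInsideWorktree('/tmp/wt', 'sub/dir/note.txt'));
```

A file whose name merely begins with two dots (for example `..notes`) is still accepted, because the check looks for a `..` segment rather than a `..` string prefix.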


@@ -0,0 +1,154 @@
import { describe, it, expect } from 'vitest';
import type { SessionConfigOption } from '@agentclientprotocol/sdk';
import {
deriveModesFromACP,
deriveModelDefinitionsFromACP,
findThoughtLevelConfigId,
} from '../acp-derive.js';
describe('deriveModesFromACP', () => {
it('prefers modeState.availableModes when present', () => {
const { modes, currentModeId } = deriveModesFromACP(
[{ id: 'fallback', label: 'Fallback' }],
{
currentModeId: 'plan',
availableModes: [
{ id: 'plan', name: 'Plan', description: 'Read-only planning' },
{ id: 'code', name: 'Code' },
],
},
);
expect(modes).toEqual([
{ id: 'plan', label: 'Plan', description: 'Read-only planning' },
{ id: 'code', label: 'Code', description: undefined },
]);
expect(currentModeId).toBe('plan');
});
it('falls back to configOptions mode select', () => {
const configOptions: SessionConfigOption[] = [
{
type: 'select',
id: 'mode',
category: 'mode',
currentValue: 'auto',
options: [
{ value: 'auto', name: 'Auto' },
{ value: 'manual', name: 'Manual', description: 'Ask first' },
],
},
];
const { modes, currentModeId } = deriveModesFromACP([], null, configOptions);
expect(modes).toEqual([
{ id: 'auto', label: 'Auto', description: undefined },
{ id: 'manual', label: 'Manual', description: 'Ask first' },
]);
expect(currentModeId).toBe('auto');
});
it('uses static fallback when no ACP mode data', () => {
const fallback = [{ id: 'default', label: 'Default' }];
const { modes, currentModeId } = deriveModesFromACP(fallback, null, null);
expect(modes).toEqual(fallback);
expect(currentModeId).toBeNull();
});
});
describe('deriveModelDefinitionsFromACP', () => {
it('maps availableModels with thought_level options', () => {
const configOptions: SessionConfigOption[] = [
{
type: 'select',
id: 'thought',
category: 'thought_level',
currentValue: 'medium',
options: [
{ value: 'low', name: 'Low' },
{ value: 'medium', name: 'Medium' },
],
},
];
const models = deriveModelDefinitionsFromACP(
{
currentModelId: 'gpt-4',
availableModels: [
{ modelId: 'gpt-4', name: 'GPT-4' },
{ modelId: 'gpt-4-mini', name: 'Mini', description: 'Cheaper' },
],
},
configOptions,
);
expect(models).toEqual([
{
id: 'gpt-4',
label: 'GPT-4',
description: undefined,
isDefault: true,
thinkingOptions: [
{ id: 'low', label: 'Low', isDefault: false },
{ id: 'medium', label: 'Medium', isDefault: true },
],
defaultThinkingOptionId: 'medium',
},
{
id: 'gpt-4-mini',
label: 'Mini',
description: 'Cheaper',
isDefault: false,
thinkingOptions: [
{ id: 'low', label: 'Low', isDefault: false },
{ id: 'medium', label: 'Medium', isDefault: true },
],
defaultThinkingOptionId: 'medium',
},
]);
});
it('falls back to model select config when no availableModels', () => {
const configOptions: SessionConfigOption[] = [
{
type: 'select',
id: 'model',
category: 'model',
currentValue: 'sonnet',
options: [
{ value: 'sonnet', name: 'Sonnet' },
{ value: 'opus', name: 'Opus' },
],
},
];
const models = deriveModelDefinitionsFromACP(null, configOptions);
expect(models).toEqual([
{ id: 'sonnet', label: 'Sonnet', isDefault: true, defaultThinkingOptionId: undefined },
{ id: 'opus', label: 'Opus', isDefault: false, defaultThinkingOptionId: undefined },
]);
});
});
describe('findThoughtLevelConfigId', () => {
it('returns thought_level select id', () => {
const configOptions: SessionConfigOption[] = [
{
type: 'select',
id: 'effort',
category: 'thought_level',
currentValue: 'high',
options: [{ value: 'high', name: 'High' }],
},
];
expect(findThoughtLevelConfigId(configOptions)).toBe('effort');
});
it('returns null when missing', () => {
expect(findThoughtLevelConfigId(null)).toBeNull();
});
});


@@ -0,0 +1,110 @@
import { describe, it, expect } from 'vitest';
import type { SessionNotification } from '@agentclientprotocol/sdk';
import { mapSessionUpdate } from '../acp-event-map.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* Pure event-mapping shared by the one-shot ACP dispatch (AcpStreamContext) and
* the warm ACP backend (Phase 2). Mirrors the original handleSessionUpdate switch
* verbatim but returns normalized AgentEvents instead of publishing broker frames.
*/
describe('mapSessionUpdate (shared ACP event mapping)', () => {
function note(update: SessionNotification['update']): SessionNotification {
return { sessionId: 's1', update };
}
it('maps an agent_message_chunk text → a text event', () => {
const events = mapSessionUpdate(
note({ sessionUpdate: 'agent_message_chunk', content: { type: 'text', text: 'hello' } }),
);
expect(events).toEqual([{ type: 'text', text: 'hello' }]);
});
it('maps an agent_thought_chunk text → a reasoning event', () => {
const events = mapSessionUpdate(
note({ sessionUpdate: 'agent_thought_chunk', content: { type: 'text', text: 'thinking' } }),
);
expect(events).toEqual([{ type: 'reasoning', text: 'thinking' }]);
});
it('ignores non-text content on message/thought chunks', () => {
const img = mapSessionUpdate(
note({
sessionUpdate: 'agent_message_chunk',
content: { type: 'image', data: 'x', mimeType: 'image/png' },
} as never),
);
expect(img).toEqual([]);
});
it('maps a tool_call → a tool_call event with a merged snapshot', () => {
const events = mapSessionUpdate(
note({
sessionUpdate: 'tool_call',
toolCallId: 't1',
title: 'read_file',
status: 'pending',
rawInput: { path: 'a.ts' },
} as never),
);
expect(events).toHaveLength(1);
expect(events[0]!.type).toBe('tool_call');
const snap = (events[0] as { type: 'tool_call'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.toolCallId).toBe('t1');
expect(snap.title).toBe('read_file');
expect(snap.status).toBe('pending');
expect(snap.rawInput).toEqual({ path: 'a.ts' });
});
it('maps a tool_call_update → a tool_update event merged over the prior snapshot', () => {
const prior = new Map<string, AcpToolSnapshot>([
['t1', { toolCallId: 't1', title: 'read_file', status: 'pending', rawInput: { path: 'a.ts' } }],
]);
const events = mapSessionUpdate(
note({
sessionUpdate: 'tool_call_update',
toolCallId: 't1',
status: 'completed',
rawOutput: 'file body',
} as never),
prior,
);
expect(events).toHaveLength(1);
expect(events[0]!.type).toBe('tool_update');
const snap = (events[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.toolCallId).toBe('t1');
// merged: title carried from prior, status updated, output added, input retained
expect(snap.title).toBe('read_file');
expect(snap.status).toBe('completed');
expect(snap.rawOutput).toBe('file body');
expect(snap.rawInput).toEqual({ path: 'a.ts' });
});
it('maps available_commands_update → a commands event', () => {
const events = mapSessionUpdate(
note({
sessionUpdate: 'available_commands_update',
availableCommands: [
{ name: 'plan', description: 'make a plan' },
{ name: 'review', description: null },
],
} as never),
);
expect(events).toEqual([
{
type: 'commands',
commands: [
{ name: 'plan', description: 'make a plan' },
{ name: 'review', description: undefined },
],
},
]);
});
it('returns [] for unhandled update kinds (plan, mode change)', () => {
expect(mapSessionUpdate(note({ sessionUpdate: 'plan', entries: [] } as never))).toEqual([]);
expect(
mapSessionUpdate(note({ sessionUpdate: 'current_mode_update', currentModeId: 'code' } as never)),
).toEqual([]);
});
});


@@ -0,0 +1,73 @@
import { describe, it, expect } from 'vitest';
import { resolveLaunchSpec, resolveAcpSpawnArgs } from '../acp-spawn.js';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import type { CoderProvidersFile } from '../provider-config.js';
import { PROVIDERS } from '../provider-registry.js';
/** Resolved def for a provider id under the given config (default: no override). */
function builtin(name: string, providers: CoderProvidersFile['providers'] = {}) {
const def = buildResolvedRegistry(PROVIDERS, { providers }).get(name);
if (!def) throw new Error(`no resolved def for ${name}`);
return def;
}
describe('resolveLaunchSpec', () => {
// --- byte-identical built-in regression (the HARD CONSTRAINT) ---------------
// These argv values are the pre-v2.3 resolveAcpSpawnArgs switch outputs and
// MUST NOT change. spawn() is `spawn(spec.binary, spec.args, ...)`, so argv
// parity here is dispatch parity.
it('opencode (no override) → byte-identical argv ["acp"], binary = installPath', () => {
const spec = resolveLaunchSpec(builtin('opencode'), '/usr/bin/opencode');
expect(spec).not.toBeNull();
expect(spec!.args).toEqual(['acp']); // pre-v2.3 value
expect(spec!.binary).toBe('/usr/bin/opencode');
expect(spec!.env).toBeUndefined();
// cross-check against the switch source-of-truth
expect(spec!.args).toEqual(resolveAcpSpawnArgs('opencode'));
});
it('goose → ["acp"], qwen → ["--acp"] (byte-identical)', () => {
expect(resolveLaunchSpec(builtin('goose'), '/usr/bin/goose')!.args).toEqual(['acp']);
expect(resolveLaunchSpec(builtin('qwen'), '/usr/bin/qwen')!.args).toEqual(['--acp']);
});
it('built-in with null installPath falls back to the bare id (pre-v2.3 `installPath ?? agent`)', () => {
const spec = resolveLaunchSpec(builtin('opencode'), null);
expect(spec!.binary).toBe('opencode');
expect(spec!.args).toEqual(['acp']);
});
it('non-ACP / unknown provider → null (claude has no ACP argv)', () => {
expect(resolveLaunchSpec(builtin('claude'), '/usr/bin/claude')).toBeNull();
expect(resolveLaunchSpec(builtin('boocode'), null)).toBeNull();
});
// --- config-driven launch (the new capability) ------------------------------
it('custom ACP entry → configured command + env reach the spec', () => {
const def = builtin('amp-acp', {
'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp', '--acp'], env: { AMP_KEY: 'x' } },
});
const spec = resolveLaunchSpec(def, '/usr/local/bin/amp-acp');
expect(spec).not.toBeNull();
expect(spec!.binary).toBe('amp-acp'); // command[0], not the resolved install path
expect(spec!.args).toEqual(['--acp']); // command.slice(1)
expect(spec!.env).toEqual({ AMP_KEY: 'x' });
});
it('built-in WITH a config command override uses the override, not the switch default', () => {
const def = builtin('opencode', { opencode: { command: ['opencode', 'acp', '--verbose'], env: { DEBUG: '1' } } });
const spec = resolveLaunchSpec(def, '/usr/bin/opencode');
expect(spec!.binary).toBe('opencode');
expect(spec!.args).toEqual(['acp', '--verbose']);
expect(spec!.env).toEqual({ DEBUG: '1' });
});
});
describe('acp-dispatch spawn wiring (documented pass-through)', () => {
// dispatchViaAcp spawns `spawn(spec.binary, spec.args, { env: { ...process.env, ...spec.env } })`.
// The env merge layers config env over process.env; for a built-in with no
// config env, spec.env is undefined → { ...process.env } (byte-identical).
it('built-in with no config env yields an undefined spec.env (→ plain process.env at spawn)', () => {
expect(resolveLaunchSpec(builtin('opencode'), '/usr/bin/opencode')!.env).toBeUndefined();
});
});


@@ -0,0 +1,66 @@
import { describe, it, expect } from 'vitest';
import {
mergeToolSnapshot,
mapToolLifecycleStatus,
snapshotToWireToolCall,
synthesizeCanceledSnapshots,
} from '../acp-tool-snapshot.js';
describe('mergeToolSnapshot', () => {
it('preserves stable toolCallId across updates', () => {
const first = mergeToolSnapshot('tc-1', {
toolCallId: 'tc-1',
title: 'Read file',
kind: 'read',
status: 'in_progress',
rawInput: { path: 'foo.ts' },
});
const merged = mergeToolSnapshot(
'tc-1',
{
toolCallId: 'tc-1',
title: 'Read file',
status: 'completed',
rawOutput: { content: 'hello' },
},
first,
);
expect(merged.toolCallId).toBe('tc-1');
expect(merged.rawInput).toEqual({ path: 'foo.ts' });
expect(merged.status).toBe('completed');
expect(merged.rawOutput).toEqual({ content: 'hello' });
});
});
describe('snapshotToWireToolCall', () => {
it('embeds ACP lifecycle meta for UI merge', () => {
const wire = snapshotToWireToolCall({
toolCallId: 'tc-42',
title: 'Edit',
kind: 'edit',
status: 'completed',
rawInput: { path: 'a.ts' },
rawOutput: 'ok',
});
expect(wire.id).toBe('tc-42');
expect(wire.name).toBe('edit');
expect(wire.args._acp).toMatchObject({ status: 'completed', title: 'Edit', output: 'ok' });
});
it('maps synthesized cancel to canceled lifecycle', () => {
const [canceled] = synthesizeCanceledSnapshots([
{ toolCallId: 'tc-1', title: 'Run', status: 'in_progress' },
]);
const wire = snapshotToWireToolCall(canceled!);
expect(wire.args._acp).toMatchObject({ status: 'canceled' });
});
});
describe('mapToolLifecycleStatus', () => {
it('maps ACP statuses to UI lifecycle', () => {
expect(mapToolLifecycleStatus('completed')).toBe('completed');
expect(mapToolLifecycleStatus('failed')).toBe('failed');
expect(mapToolLifecycleStatus('in_progress')).toBe('running');
expect(mapToolLifecycleStatus(undefined, 'canceled')).toBe('canceled');
});
});
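
The merge rule these tests pin down (an update overrides a field only when it actually carries a value, so `rawInput` survives a status-only update) can be sketched roughly as follows. This is an illustration under assumptions, not the code in `acp-tool-snapshot.ts`: `ToolSnapshotSketch`, `pick`, and `mergeSnapshotSketch` are hypothetical names, and the real `mergeToolSnapshot` takes the id as a separate first argument and may track more fields.

```typescript
// Hypothetical shape; the project's AcpToolSnapshot may carry more fields.
interface ToolSnapshotSketch {
  toolCallId: string;
  title?: string;
  kind?: string;
  status?: string;
  rawInput?: unknown;
  rawOutput?: unknown;
}

// Prefer the incoming value; fall back to the prior one when the update omits it.
function pick<T>(next: T | undefined, prev: T | undefined): T | undefined {
  return next !== undefined ? next : prev;
}

// Assumed semantics: field-by-field merge where the update wins only if defined.
function mergeSnapshotSketch(
  update: ToolSnapshotSketch,
  prior?: ToolSnapshotSketch,
): ToolSnapshotSketch {
  return {
    toolCallId: update.toolCallId,
    title: pick(update.title, prior ? prior.title : undefined),
    kind: pick(update.kind, prior ? prior.kind : undefined),
    status: pick(update.status, prior ? prior.status : undefined),
    rawInput: pick(update.rawInput, prior ? prior.rawInput : undefined),
    rawOutput: pick(update.rawOutput, prior ? prior.rawOutput : undefined),
  };
}

// Usage mirroring the test: a status-only update keeps the earlier input and kind.
const firstSketch = mergeSnapshotSketch({
  toolCallId: 'tc-1',
  title: 'Read file',
  kind: 'read',
  status: 'in_progress',
  rawInput: { path: 'foo.ts' },
});
const mergedSketch = mergeSnapshotSketch(
  { toolCallId: 'tc-1', status: 'completed', rawOutput: { content: 'hello' } },
  firstSketch,
);
```

An explicit per-field pick is used rather than object spread because a spread would let an explicitly `undefined` field in the update erase the prior value.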


@@ -0,0 +1,233 @@
import { describe, it, expect, vi } from 'vitest';
import { AgentPool, OPENCODE_POOL_KEY } from '../agent-pool.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/**
* v2.6 Phase 3 — AgentPool lifecycle unit test (T.1). No DB / no child process:
* a fake AgentBackend records dispose + reports busy/health, so we exercise
* get-or-create, idle eviction, the LRU cap, the busy-never-evict rule, closeChat,
* and dispose-drains directly. The pure decisions are covered separately in
* backends/__tests__/lifecycle-decisions.test.ts; this verifies the wiring.
*/
class FakeBackend implements AgentBackend {
disposed = 0;
closedSessions = 0;
private busyFlag = false;
tickHealthCalls = 0;
constructor(public readonly name = 'fake') {}
setBusy(b: boolean): void {
this.busyFlag = b;
}
// — AgentBackend —
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
return {
sessionId,
agent: opts.agent,
backend: 'acp_warm',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: 'fake-session',
serverPort: null,
};
}
async prompt(_h: AgentSessionHandle, _input: string, _ctx: PromptCtx): Promise<TurnResult> {
return { ok: true };
}
async closeSession(): Promise<void> {
this.closedSessions++;
}
async dispose(): Promise<void> {
this.disposed++;
}
health(): 'up' | 'down' {
return 'up';
}
isBusy(): boolean {
return this.busyFlag;
}
async tickHealth(): Promise<void> {
this.tickHealthCalls++;
}
}
describe('AgentPool — get/register/touch (3.1)', () => {
it('register then get returns the same backend', () => {
const pool = new AgentPool();
const b = new FakeBackend();
pool.register('chat-1', 'goose', b);
expect(pool.get('chat-1', 'goose')).toBe(b);
expect(pool.get('chat-1', 'qwen')).toBeUndefined();
});
it('peek returns undefined for a missing key', () => {
const pool = new AgentPool();
expect(pool.peek('nope', 'goose')).toBeUndefined();
});
it('health reports size + busy count', () => {
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
b.setBusy(true);
pool.register('c1', 'goose', a);
pool.register('c2', 'qwen', b);
expect(pool.health()).toEqual({ size: 2, busy: 1 });
});
});
describe('AgentPool.sweep — idle TTL eviction (3.1)', () => {
it('evicts an idle backend past the TTL and disposes it', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'goose', b);
// Sweep with now far past the registration → idle → evicted.
const { evicted } = await pool.sweep(Date.now() + 10_000);
expect(evicted).toEqual(['c1:goose']);
expect(b.disposed).toBe(1);
expect(pool.get('c1', 'goose')).toBeUndefined();
});
it('never evicts a busy backend even past the TTL', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000, maxLive: 100 });
const b = new FakeBackend();
b.setBusy(true);
pool.register('c1', 'goose', b);
const { evicted } = await pool.sweep(Date.now() + 10_000);
expect(evicted).toEqual([]);
expect(b.disposed).toBe(0);
expect(pool.get('c1', 'goose')).toBe(b);
});
it('touch keeps a backend warm so the TTL measures from the last turn', async () => {
const pool = new AgentPool({ idleTtlMs: 5_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'goose', b);
const base = Date.now();
// 4s later, touch — resets activity. A sweep at +6s from base is only +2s from
// the touch → still within TTL → not evicted.
vi.spyOn(Date, 'now').mockReturnValue(base + 4_000);
pool.touch('c1', 'goose');
vi.restoreAllMocks();
const { evicted } = await pool.sweep(base + 6_000);
expect(evicted).toEqual([]);
});
});
describe('AgentPool.sweep — LRU cap (3.4)', () => {
it('evicts the least-recently-used beyond the cap', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000_000, maxLive: 2 });
const base = 1_000_000;
const mk = (key: string, regAt: number) => {
vi.spyOn(Date, 'now').mockReturnValue(regAt);
const b = new FakeBackend(key);
const [chat, agent] = key.split(':');
pool.register(chat!, agent!, b);
vi.restoreAllMocks();
return b;
};
const a = mk('c1:goose', base + 100);
const b = mk('c2:goose', base + 300);
const c = mk('c3:goose', base + 200);
// 3 entries, cap 2, all within idle TTL → LRU (oldest = a@+100) evicted.
const { evicted } = await pool.sweep(base + 1_000);
expect(evicted).toEqual(['c1:goose']);
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(0);
expect(c.disposed).toBe(0);
});
});
describe('AgentPool.sweep — proactive health probe (3.2)', () => {
it('drives each backend tickHealth before eviction', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'opencode', b);
await pool.sweep(Date.now());
expect(b.tickHealthCalls).toBe(1);
});
});
describe('AgentPool.closeChat — chat-close teardown (3.3)', () => {
it('disposes only the matching chat keys, leaving others + the shared server', async () => {
const pool = new AgentPool();
const goose = new FakeBackend('goose');
const qwen = new FakeBackend('qwen');
const other = new FakeBackend('other-chat');
const ocServer = new FakeBackend('opencode-server');
pool.register('chat-1', 'goose', goose);
pool.register('chat-1', 'qwen', qwen);
pool.register('chat-2', 'goose', other);
pool.register(OPENCODE_POOL_KEY, 'opencode', ocServer);
const removed = await pool.closeChat('chat-1');
expect(removed.sort()).toEqual(['chat-1:goose', 'chat-1:qwen']);
expect(goose.disposed).toBe(1);
expect(qwen.disposed).toBe(1);
// other chat + shared opencode server untouched.
expect(other.disposed).toBe(0);
expect(ocServer.disposed).toBe(0);
expect(pool.peek('chat-2', 'goose')).toBe(other);
expect(pool.peek(OPENCODE_POOL_KEY, 'opencode')).toBe(ocServer);
});
it('does not dispose a busy backend on closeChat', async () => {
const pool = new AgentPool();
const b = new FakeBackend();
b.setBusy(true);
pool.register('chat-1', 'goose', b);
const removed = await pool.closeChat('chat-1');
expect(removed).toEqual([]);
expect(b.disposed).toBe(0);
});
it('does not match a chat id that is a prefix of another', async () => {
// 'chat-1' must not match 'chat-10' — keys are `${chatId}:${agent}` so the
// colon delimiter prevents the prefix collision.
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
pool.register('chat-1', 'goose', a);
pool.register('chat-10', 'goose', b);
await pool.closeChat('chat-1');
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(0);
expect(pool.peek('chat-10', 'goose')).toBe(b);
});
});
describe('AgentPool.dispose — drain all (T.1)', () => {
it('disposes every backend and clears the map', async () => {
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
pool.register('c1', 'goose', a);
pool.register('c2', 'qwen', b);
await pool.dispose();
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(1);
expect(pool.health()).toEqual({ size: 0, busy: 0 });
});
it('tolerates a backend whose dispose throws', async () => {
const pool = new AgentPool();
const good = new FakeBackend();
const bad = new FakeBackend();
bad.dispose = async () => {
throw new Error('boom');
};
pool.register('c1', 'goose', bad);
pool.register('c2', 'qwen', good);
await expect(pool.dispose()).resolves.toBeUndefined();
expect(good.disposed).toBe(1);
});
});
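
The eviction rules exercised above (idle TTL, busy-never-evict, LRU cap) can be summarised as one pure decision function. The sketch below is a guess at the shape of that decision, not the project's implementation: `PoolEntrySketch` and `pickEvictionsSketch` are hypothetical names, the strict `>` TTL comparison is an assumption, and so is applying the cap only after idle eviction.

```typescript
// Hypothetical view of one pool entry; the real pool tracks a backend object.
interface PoolEntrySketch {
  key: string; // `${chatId}:${agent}`
  lastActiveAt: number; // ms timestamp of register() or the last touch()
  busy: boolean;
}

// Returns the keys to evict. Busy entries are never chosen, by either rule.
function pickEvictionsSketch(
  entries: PoolEntrySketch[],
  now: number,
  idleTtlMs: number,
  maxLive: number,
): string[] {
  // Rule 1: anything idle for longer than the TTL goes.
  const evicted = entries
    .filter((e) => !e.busy && now - e.lastActiveAt > idleTtlMs)
    .map((e) => e.key);

  // Rule 2: if survivors still exceed the cap, drop least-recently-used first.
  const survivors = entries.filter((e) => evicted.indexOf(e.key) === -1);
  let over = survivors.length - maxLive;
  const oldestFirst = survivors
    .filter((e) => !e.busy)
    .sort((a, b) => a.lastActiveAt - b.lastActiveAt);
  for (const e of oldestFirst) {
    if (over <= 0) break;
    evicted.push(e.key);
    over--;
  }
  return evicted;
}

// Usage mirroring the LRU test: three entries, cap 2, none past the TTL.
const lruSketch = pickEvictionsSketch(
  [
    { key: 'c1:goose', lastActiveAt: 1_000_100, busy: false },
    { key: 'c2:goose', lastActiveAt: 1_000_300, busy: false },
    { key: 'c3:goose', lastActiveAt: 1_000_200, busy: false },
  ],
  1_001_000,
  1_000_000,
  2,
);
```

Keeping the decision pure (timestamps in, keys out) is what lets tests like the ones above pass an explicit `now` to `sweep` instead of waiting on real timers.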


@@ -0,0 +1,252 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { rm, mkdir } from 'node:fs/promises';
import { resolve } from 'node:path';
import postgres from 'postgres';
import {
buildShadowCommitCommand,
createCheckpoint,
restoreCheckpoint,
CheckpointNotFoundError,
} from '../checkpoints.js';
import { ensureSessionWorktree } from '../worktrees.js';
import { hostExec } from '../host-exec.js';
/**
* write-edit-robustness #4 — worktree checkpoint tests.
*
* Pure-helper coverage (no DB / no host) for the shadow-commit command builder,
* plus a DB+git integration block (DB-opt-in via DATABASE_URL, skips cleanly
* otherwise; mirrors reconnect_integration.test.ts) that exercises the real
* create → restore round trip against a worktree on the host fs.
*/
describe('buildShadowCommitCommand (pure)', () => {
it('parks the commit under refs/boocode/checkpoints/<id> and prints only the SHA', () => {
const cmd = buildShadowCommitCommand('/tmp/booworktrees/sess-abc', 'cp-id-123');
// Uses a temp index so the real working tree/index is untouched.
expect(cmd).toContain('TMP=$(mktemp)');
expect(cmd).toContain('GIT_INDEX_FILE="$TMP" git read-tree HEAD');
expect(cmd).toContain('GIT_INDEX_FILE="$TMP" git add -A');
expect(cmd).toContain('git write-tree');
expect(cmd).toContain("git commit-tree \"$TREE\" -p HEAD -m \"boocode checkpoint\"");
// Ref name matches the row id, and stdout is ONLY the SHA (printf, no newline).
expect(cmd).toContain("update-ref 'refs/boocode/checkpoints/cp-id-123'");
expect(cmd).toContain("printf '%s' \"$SHA\"");
expect(cmd).not.toContain('echo "$SHA"');
});
it('shell-escapes the worktree path and the id', () => {
const cmd = buildShadowCommitCommand("/tmp/it's a path", "id'; rm -rf /");
// Single quotes inside the path/id are escaped via the '\'' wrapping idiom — no
// bare interpolation that could break out of the quoting.
expect(cmd).toContain("cd '/tmp/it'\\''s a path'");
expect(cmd).toContain("refs/boocode/checkpoints/id'\\''; rm -rf /");
});
});
describe.runIf(!!process.env.DATABASE_URL)('checkpoint create + restore (DB + git)', () => {
let sql: ReturnType<typeof postgres>;
const stamp = Date.now();
const projectDir = `/tmp/boocode-checkpoint-proj-${stamp}`;
let projectId: string;
let sessionId: string;
let chatId: string;
let worktreePath: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Server schema first (FK targets), then coder schema (worktrees + checkpoints).
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
await mkdir(projectDir, { recursive: true });
await hostExec(
`cd ${projectDir} && git init -q && git config user.email t@t && git config user.name t ` +
`&& echo hello > README.md && git add -A && git commit -qm init`,
{ timeoutMs: 20_000 },
);
const [project] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('checkpoint-test', ${projectDir}, 'open') RETURNING id
`;
projectId = project!.id;
const [session] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status)
VALUES (${projectId}, 'cp', 'm', 'open') RETURNING id
`;
sessionId = session!.id;
const [chat] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = chat!.id;
const wt = await ensureSessionWorktree(sql, projectDir, sessionId);
worktreePath = wt.worktreePath;
});
afterAll(async () => {
if (sql) {
const rows = await sql<{ path: string }[]>`SELECT path FROM worktrees WHERE session_id = ${sessionId}`.catch(() => []);
for (const r of rows) {
await hostExec(`git -C ${projectDir} worktree remove ${r.path} --force`, { timeoutMs: 10_000 }).catch(() => {});
}
await sql`DELETE FROM checkpoints WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM agent_sessions WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM worktrees WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
}
await rm(projectDir, { recursive: true, force: true });
});
it('createCheckpoint inserts a row + a private ref capturing tracked + untracked', async () => {
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const worktreeId = wt!.id;
// Pre-turn untracked + tracked-edit state the agent will start from.
await hostExec(`cd ${worktreePath} && echo edited >> README.md && echo new > extra.txt`, { timeoutMs: 10_000 });
const [assistantMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming') RETURNING id
`;
const messageId = assistantMsg!.id;
const cp = await createCheckpoint(sql, {
chatId,
sessionId,
worktreeId,
worktreePath,
messageId,
});
expect(cp).not.toBeNull();
expect(cp!.commit_sha).toMatch(/^[0-9a-f]{40}$/);
const [row] = await sql<{ commit_sha: string; worktree_id: string; message_id: string }[]>`
SELECT commit_sha, worktree_id, message_id FROM checkpoints WHERE id = ${cp!.id}
`;
expect(row!.commit_sha).toBe(cp!.commit_sha);
expect(row!.worktree_id).toBe(worktreeId);
expect(row!.message_id).toBe(messageId);
// The ref exists and the captured tree carries the untracked file (proves the
// temp-index `git add -A` snapshotted untracked content).
const refLs = await hostExec(
`git -C ${worktreePath} ls-tree -r --name-only ${cp!.commit_sha}`,
{ timeoutMs: 10_000 },
);
expect(refLs.exitCode).toBe(0);
expect(refLs.stdout).toContain('extra.txt');
// The shadow commit did NOT disturb the real working tree: extra.txt is still
// present + still untracked (status shows it).
const status = await hostExec(`git -C ${worktreePath} status --porcelain`, { timeoutMs: 10_000 });
expect(status.stdout).toContain('extra.txt');
});
it('restoreCheckpoint resets the worktree, trims the transcript, and drops later checkpoints', async () => {
// Clean slate for this test: reset the worktree to HEAD, clear prior rows.
await hostExec(`git -C ${worktreePath} reset --hard HEAD && git -C ${worktreePath} clean -fd`, { timeoutMs: 10_000 });
await sql`DELETE FROM checkpoints WHERE chat_id = ${chatId}`;
await sql`DELETE FROM messages WHERE chat_id = ${chatId}`;
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const worktreeId = wt!.id;
// Turn 1: a user msg, then the assistant turn the checkpoint anchors. The
// worktree is pristine (matches HEAD) when this checkpoint is captured.
await sql`INSERT INTO messages (session_id, chat_id, role, content, status) VALUES (${sessionId}, ${chatId}, 'user', 'do it', 'complete')`;
const [a1] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'turn 1', 'complete') RETURNING id
`;
const cp1 = await createCheckpoint(sql, { chatId, sessionId, worktreeId, worktreePath, messageId: a1!.id });
expect(cp1).not.toBeNull();
// The agent (turn 1) writes a file into the worktree.
await hostExec(`cd ${worktreePath} && echo agent-wrote > agent.txt`, { timeoutMs: 10_000 });
// Turn 2: another user msg + assistant turn, AND a second (later) checkpoint.
await sql`INSERT INTO messages (session_id, chat_id, role, content, status) VALUES (${sessionId}, ${chatId}, 'user', 'more', 'complete')`;
const [a2] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'turn 2', 'complete') RETURNING id
`;
const cp2 = await createCheckpoint(sql, { chatId, sessionId, worktreeId, worktreePath, messageId: a2!.id });
expect(cp2).not.toBeNull();
// An agent_sessions row that restore should mark 'crashed'.
await sql`
INSERT INTO agent_sessions (chat_id, session_id, worktree_id, agent, backend, agent_session_id, status, last_active_at)
VALUES (${chatId}, ${sessionId}, ${worktreeId}, 'goose', 'acp_warm', 'sess-1', 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET status = 'active'
`;
const before = await sql<{ id: string }[]>`SELECT id FROM messages WHERE chat_id = ${chatId} ORDER BY created_at`;
expect(before.length).toBe(4); // user, a1, user, a2
// Restore to cp1 (before turn 1's assistant message).
const result = await restoreCheckpoint(sql, cp1!.id, { sessionId });
expect(result.checkpoint_id).toBe(cp1!.id);
expect(result.worktree_reset).toBe(true);
expect(result.backend_reset).toBe(true);
// a1, user(turn2), a2 deleted (created_at >= a1) → 3 trimmed.
expect(result.messages_deleted).toBe(3);
// Transcript trimmed to just the first user message.
const after = await sql<{ role: string; content: string }[]>`SELECT role, content FROM messages WHERE chat_id = ${chatId} ORDER BY created_at`;
expect(after.length).toBe(1);
expect(after[0]!.role).toBe('user');
// Worktree reset: the agent's file is gone (it was written after cp1).
const ls = await hostExec(`ls ${worktreePath}/agent.txt`, { timeoutMs: 10_000 });
expect(ls.exitCode).not.toBe(0);
// The agent_sessions row was reset to 'crashed'.
const [as] = await sql<{ status: string }[]>`SELECT status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'goose'`;
expect(as!.status).toBe('crashed');
// cp1 survives (re-restorable); cp2 (later) was dropped.
const cps = await sql<{ id: string }[]>`SELECT id FROM checkpoints WHERE chat_id = ${chatId}`;
expect(cps.map((c) => c.id)).toEqual([cp1!.id]);
});
it('restoreCheckpoint throws CheckpointNotFoundError for an unknown id', async () => {
await expect(
restoreCheckpoint(sql, '00000000-0000-0000-0000-000000000000', { sessionId }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
});
it('restoreCheckpoint throws when the checkpoint is not in the requested session', async () => {
// A checkpoint whose session_id differs from the route's sessionId.
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const cp = await createCheckpoint(sql, { chatId, sessionId, worktreeId: wt!.id, worktreePath, messageId: null });
expect(cp).not.toBeNull();
await expect(
restoreCheckpoint(sql, cp!.id, { sessionId: '11111111-1111-1111-1111-111111111111' }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
await sql`DELETE FROM checkpoints WHERE id = ${cp!.id}`;
});
it('restoreCheckpoint denies a NULL-session_id checkpoint from another session (no fail-open IDOR)', async () => {
// Regression for the fail-open authorization bug: a checkpoint row whose
// denormalized session_id is NULL must STILL be scoped via its chat's owning
// session (chats.session_id), not skipped. The old guard `cp.session_id &&
// cp.session_id !== sessionId` fell through on NULL → cross-session restore.
const [row] = await sql<{ id: string }[]>`
INSERT INTO checkpoints (chat_id, session_id, message_id, commit_sha)
VALUES (${chatId}, NULL, NULL, 'deadbeef')
RETURNING id
`;
await expect(
restoreCheckpoint(sql, row!.id, { sessionId: '22222222-2222-2222-2222-222222222222' }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
await sql`DELETE FROM checkpoints WHERE id = ${row!.id}`;
});
});
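
The `'\''` idiom asserted in the shell-escaping test above is the usual POSIX way to embed a single quote inside a single-quoted word: close the quote, emit a backslash-escaped quote, reopen the quote. A minimal sketch follows; `shellQuoteSketch` is a hypothetical name, and the quoting helper that `buildShadowCommitCommand` actually uses is not shown in this diff.

```typescript
// Hypothetical helper: wrap a value in single quotes for a POSIX shell.
// Inside single quotes nothing is special except the quote itself, so each
// embedded ' becomes '\'' (end quote, escaped quote, start quote).
function shellQuoteSketch(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Usage: the hostile id from the test stays one inert shell word.
const quotedPath = shellQuoteSketch("/tmp/it's a path");
const quotedId = shellQuoteSketch("id'; rm -rf /");
```

With this scheme the `; rm -rf /` in the id remains inside the quoted word, which is the property the `toContain` assertions check.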


@@ -0,0 +1,73 @@
import { describe, it, expect } from 'vitest';
import { stripDcpTags, makeDcpStreamStripper } from '../dcp-strip.js';
// Feed chunks through a fresh stripper and return the fully reassembled output
// (everything emitted during streaming + the final flush) — i.e. what the
// dispatcher would accumulate into the persisted message content.
function run(chunks: string[]): string {
const s = makeDcpStreamStripper();
let out = '';
for (const c of chunks) out += s.push(c);
out += s.flush();
return out;
}
describe('stripDcpTags (one-shot)', () => {
it('removes a complete tag', () => {
expect(stripDcpTags('Yes — "Test".\n\n<dcp-message-id>m0019</dcp-message-id>')).toBe(
'Yes — "Test".\n\n',
);
});
it('leaves text without a tag untouched', () => {
expect(stripDcpTags('no tag here')).toBe('no tag here');
});
});
describe('per-chunk strip is INSUFFICIENT (documents the bug)', () => {
it('a tag split across chunks survives a naive per-chunk .replace()', () => {
const chunks = ['Yes.\n\n<dcp', '-message', '-id>m0019</dcp', '-message-id>'];
const naive = chunks.map(stripDcpTags).join('');
// The reassembled content still contains the tag — this is the screenshot bug.
expect(naive).toContain('<dcp-message-id>m0019</dcp-message-id>');
});
});
describe('makeDcpStreamStripper (cross-chunk fix)', () => {
it('strips a tag split across chunks (the real opencode case)', () => {
expect(run(['Yes.\n\n<dcp', '-message', '-id>m0019</dcp', '-message-id>'])).toBe('Yes.\n\n');
});
it('strips a tag split at EVERY character boundary', () => {
const full = 'Answer.<dcp-message-id>m0019</dcp-message-id>';
expect(run([...full])).toBe('Answer.');
});
it('strips a tag delivered whole in one chunk', () => {
expect(run(['Answer.<dcp-message-id>m0019</dcp-message-id>'])).toBe('Answer.');
});
it('passes through text with no tag', () => {
expect(run(['hello ', 'world'])).toBe('hello world');
});
it('does NOT swallow legitimate < content (code/HTML/generics)', () => {
expect(run(['use ', '<div>', ' and ', 'Array<', 'string>'])).toBe('use <div> and Array<string>');
});
it('handles a lone < that is not a dcp tag, split across chunks', () => {
expect(run(['a <', 'b c'])).toBe('a <b c');
});
it('emits surrounding text and strips a mid-text tag', () => {
expect(run(['before ', '<dcp-message-id>', 'm1', '</dcp-message-id>', ' after'])).toBe(
'before after',
);
});
it('flushes a truncated/never-closed partial tag without leaking it as a complete tag', () => {
// If the stream ends mid-tag, flush strips complete tags; an incomplete
// remnant is returned as-is (no complete tag ever existed to render).
const out = run(['done.<dcp-message-id>m00']);
expect(out).not.toContain('</dcp-message-id>');
});
});
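
One way to get the cross-chunk behaviour tested above is to hold back only the text that could still turn into a tag: an unclosed `<dcp-message-id>` or a trailing fragment that is a prefix of it. The sketch below is written under that assumption and is not the code in `dcp-strip.ts`; `makeSketchStripper` and `runSketch` are hypothetical names.

```typescript
const OPEN_TAG = '<dcp-message-id>';
const FULL_TAG = /<dcp-message-id>[\s\S]*?<\/dcp-message-id>/g;

interface SketchStripper {
  push(chunk: string): string;
  flush(): string;
}

function makeSketchStripper(): SketchStripper {
  let buf = '';
  return {
    push(chunk: string): string {
      // Drop every tag that is complete once the new chunk is appended.
      buf = (buf + chunk).replace(FULL_TAG, '');
      // Hold from an unclosed opening tag, if one is present...
      let hold = buf.indexOf(OPEN_TAG);
      if (hold === -1) {
        // ...otherwise from the longest suffix that is a proper prefix of it.
        hold = buf.length;
        const max = Math.min(buf.length, OPEN_TAG.length - 1);
        for (let n = max; n > 0; n--) {
          const tail = buf.slice(buf.length - n);
          if (OPEN_TAG.slice(0, n) === tail) {
            hold = buf.length - n;
            break;
          }
        }
      }
      const out = buf.slice(0, hold);
      buf = buf.slice(hold);
      return out;
    },
    flush(): string {
      // End of stream: strip complete tags, return any remnant as-is.
      const out = buf.replace(FULL_TAG, '');
      buf = '';
      return out;
    },
  };
}

// Reassemble the emitted output the same way the test helper does.
function runSketch(chunks: string[]): string {
  const s = makeSketchStripper();
  let out = '';
  for (const c of chunks) out += s.push(c);
  return out + s.flush();
}
```

Because only a possible tag prefix is withheld, ordinary `<` content such as `<div>` or `Array<string>` is released as soon as the next characters rule the tag out.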


@@ -0,0 +1,173 @@
import { describe, it, expect } from 'vitest';
import { locateMatch, SIMILARITY_THRESHOLD } from '../fuzzy-match.js';
// Helper: assert a resolved span and slice it back out of the content so the
// test pins the EXACT file text the caller would replace.
function span(result: ReturnType<typeof locateMatch>): { start: number; end: number } {
if (result.kind !== 'exact' && result.kind !== 'fuzzy') {
throw new Error(`expected a located span, got ${result.kind}`);
}
return { start: result.start, end: result.end };
}
describe('locateMatch — strategy 1: exact', () => {
it('returns an exact unique span', () => {
const content = 'alpha\nbeta\ngamma\n';
const result = locateMatch(content, 'beta');
expect(result.kind).toBe('exact');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('beta');
});
it('returns the right offsets for a multi-line exact needle', () => {
const content = 'one\ntwo\nthree\nfour\n';
const needle = 'two\nthree';
const result = locateMatch(content, needle);
expect(result.kind).toBe('exact');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(needle);
});
it('refuses when the exact needle occurs more than once', () => {
const content = 'foo\nbar\nfoo\nbar\nfoo\n';
const result = locateMatch(content, 'foo');
expect(result).toEqual({ kind: 'ambiguous', count: 3 });
});
});
describe('locateMatch — strategy 2: per-line whitespace', () => {
it('matches across trailing-whitespace drift at the real span', () => {
// File has trailing spaces the model dropped from a TWO-line copy. A
// single-line needle would be located by exact indexOf (it's a substring),
// so use two lines where line 1's trailing ws breaks an exact substring run.
const content = 'function f() {\n setup(); \n return 1;\n}\n';
const needle = ' setup();\n return 1;'; // line 1 missing trailing spaces
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// The returned span covers the ORIGINAL lines including the trailing spaces.
expect(content.slice(start, end)).toBe(' setup(); \n return 1;');
});
it('matches a multi-line block with per-line trailing-whitespace drift', () => {
// trimEnd does not normalize LEADING whitespace, so leading-indent drift is
// left to the Levenshtein tier. Here the leading indent is identical and only
// the trailing whitespace differs per line.
const content = ['if (x) {', ' doThing(); ', ' doOther();', '}'].join('\n');
const needle = [' doThing();', ' doOther();'].join('\n');
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(' doThing(); \n doOther();');
});
it('ignores leading/trailing blank needle lines', () => {
const content = 'header\nbody line\nfooter\n';
const needle = '\n\nbody line\n\n';
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('body line');
});
it('reports ambiguous when a whitespace-window matches twice', () => {
// Lines 1 and 4 both carry the needle followed by trailing whitespace. The
// needle is still a literal substring of each line, so the two candidate
// locations are reported as ambiguous whichever tier detects them first.
const content = 'x = 1; \ny = 2;\nz = 3;\nx = 1;\t\n';
const needle = 'x = 1;'; // no trailing ws; occurs at two locations
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'ambiguous', count: 2 });
});
});
describe('locateMatch — strategy 3: unicode canonicalization', () => {
it('matches across curly quotes', () => {
const content = "const s = 'hello';\n";
const needle = 'const s = ‘hello’;'; // ‘hello’
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// Span maps back to ORIGINAL (straight-quote) text.
expect(content.slice(start, end)).toBe("const s = 'hello';");
});
it('matches across curly double-quotes', () => {
const content = 'log("done");\n';
const needle = 'log(“done”);'; // “done”
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('log("done");');
});
it('matches across an em-dash drift', () => {
const content = 'range 1-10 inclusive\n';
const needle = 'range 1—10 inclusive'; // em-dash
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('range 1-10 inclusive');
});
it('matches across a non-breaking space drift', () => {
const content = 'a b c\n'; // plain spaces
const needle = 'a\u00a0b\u00a0c'; // nbsp between words
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('a b c');
});
});
describe('locateMatch — strategy 4: Levenshtein', () => {
it('matches a >= threshold near-miss (small typo drift)', () => {
// Needle has a one-char typo ('totals' vs 'total') so it is NOT an exact
// substring and the whitespace/canonical tiers (which require equality) both
// miss; Levenshtein similarity stays well above the 0.66 floor.
const content = 'const total = sum + tax;\n';
const needle = 'const totals = sum + tax;';
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// Span maps to the real (correctly-spelled) file line.
expect(content.slice(start, end)).toBe('const total = sum + tax;');
});
it('matches a line inside a multi-line block with indentation drift via Levenshtein', () => {
const content = ['function g() {', ' return compute(a, b);', '}'].join('\n');
// 6-space indent vs file's 2-space; trimEnd does not fix leading indent, so
// this lands on the Levenshtein tier (joined-trim makes it identical → ~1.0).
const needle = [' return compute(a, b);'].join('\n');
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(' return compute(a, b);');
});
it('returns not_found for a below-threshold miss', () => {
const content = 'the quick brown fox jumps over the lazy dog\n';
const needle = 'completely unrelated string of text here xyz';
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
it('returns not_found for a genuinely-absent needle', () => {
const content = 'alpha\nbeta\ngamma\n';
const needle = 'this content does not exist anywhere at all';
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
});
describe('locateMatch — edge cases', () => {
it('returns not_found for an empty needle', () => {
expect(locateMatch('anything', '')).toEqual({ kind: 'not_found' });
});
it('exposes a sane similarity threshold', () => {
expect(SIMILARITY_THRESHOLD).toBeGreaterThan(0);
expect(SIMILARITY_THRESHOLD).toBeLessThanOrEqual(1);
});
});
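
The first three tiers exercised above (exact substring, per-line `trimEnd` window, unicode canonicalization) can be sketched as a short fallback chain. This is a hedged illustration, not the contents of `fuzzy-match.js`: `locateMatchSketch`, its helpers, and the canonicalization table are assumptions inferred from the tests, and the Levenshtein tier is deliberately left out.

```typescript
type LocateResult =
  | { kind: 'exact' | 'fuzzy'; start: number; end: number }
  | { kind: 'ambiguous'; count: number }
  | { kind: 'not_found' };

interface Span { start: number; end: number }

// One hit resolves, several are ambiguous, none defers to the next tier.
function resolveSpans(kind: 'exact' | 'fuzzy', spans: Span[]): LocateResult | null {
  if (spans.length > 1) return { kind: 'ambiguous', count: spans.length };
  const only = spans[0];
  return only ? { kind, start: only.start, end: only.end } : null;
}

// Every literal occurrence of a non-empty needle.
function substringSpans(haystack: string, needle: string): Span[] {
  const spans: Span[] = [];
  for (let i = haystack.indexOf(needle); i !== -1; i = haystack.indexOf(needle, i + 1)) {
    spans.push({ start: i, end: i + needle.length });
  }
  return spans;
}

// Line windows that equal the needle once trailing whitespace is ignored.
// The returned span covers the ORIGINAL lines, trailing whitespace included.
function whitespaceSpans(content: string, needle: string): Span[] {
  const lines = content.split('\n');
  const want = needle.split('\n').map((l) => l.trimEnd());
  while (want.length > 0 && want[0] === '') want.shift();
  while (want.length > 0 && want[want.length - 1] === '') want.pop();
  if (want.length === 0) return [];
  const spans: Span[] = [];
  let lineStart = 0;
  for (let i = 0; i + want.length <= lines.length; i++) {
    const candidate = lines.slice(i, i + want.length);
    if (candidate.every((line, j) => line.trimEnd() === want[j])) {
      spans.push({ start: lineStart, end: lineStart + candidate.join('\n').length });
    }
    lineStart += (lines[i] ?? '').length + 1;
  }
  return spans;
}

// Length-preserving replacements, so canonical offsets are original offsets.
const CANON: Record<string, string> = {
  '\u2018': "'", '\u2019': "'", '\u201c': '"', '\u201d': '"',
  '\u2013': '-', '\u2014': '-', '\u00a0': ' ',
};
function canonicalize(text: string): string {
  return text.replace(/[\u2018\u2019\u201c\u201d\u2013\u2014\u00a0]/g, (ch) => CANON[ch] ?? ch);
}

function locateMatchSketch(content: string, needle: string): LocateResult {
  if (needle === '') return { kind: 'not_found' };
  return (
    resolveSpans('exact', substringSpans(content, needle)) ??
    resolveSpans('fuzzy', whitespaceSpans(content, needle)) ??
    resolveSpans('fuzzy', substringSpans(canonicalize(content), canonicalize(needle))) ??
    { kind: 'not_found' }
  );
}
```

Because every canonical replacement maps one UTF-16 code unit to one code unit, the sketch can reuse canonical offsets directly as spans in the original text. A canonicalization that changes length would need an explicit offset map.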


@@ -0,0 +1,96 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync, existsSync } from 'node:fs';
import { readFile, rm, mkdir } from 'node:fs/promises';
import { resolve } from 'node:path';
import postgres from 'postgres';
import { queueCreate, queueEdit, queueDelete, applyOne, rewindOne, listPending } from '../pending_changes.js';
/**
* Integration test for the full pending-changes lifecycle.
* Requires DATABASE_URL env var pointing to a running postgres instance.
* Skips cleanly when DATABASE_URL is not set.
*
* Run with:
* DATABASE_URL='postgres://boocode:devpass@localhost:5500/boocode' pnpm -C apps/coder test
*/
describe.runIf(!!process.env.DATABASE_URL)('pending_changes integration', () => {
let sql: ReturnType<typeof postgres>;
const testDir = '/tmp/boocode-pending-changes-test-' + Date.now();
const projectRoot = testDir;
const testSessionId = '00000000-0000-0000-0000-000000000001';
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Apply schema
const schemaPath = resolve(__dirname, '../../schema.sql');
const ddl = readFileSync(schemaPath, 'utf8');
await sql.unsafe(ddl);
// Create temp project directory
await mkdir(testDir, { recursive: true });
});
afterAll(async () => {
// Cleanup test data
await sql`DELETE FROM pending_changes WHERE session_id = ${testSessionId}`;
await sql.end({ timeout: 5 });
// Remove temp directory
await rm(testDir, { recursive: true, force: true });
});
it('queueCreate → listPending → applyOne → verify file exists', async () => {
const change = await queueCreate(sql, testSessionId, null, 'hello.txt', 'hello world', projectRoot);
expect(change.status).toBe('pending');
expect(change.operation).toBe('create');
const pending = await listPending(sql, testSessionId);
expect(pending.some((p) => p.id === change.id)).toBe(true);
const result = await applyOne(sql, change.id, projectRoot);
expect(result.success).toBe(true);
const content = await readFile(resolve(testDir, 'hello.txt'), 'utf8');
expect(content).toBe('hello world');
});
it('queueEdit → apply → verify content changed', async () => {
// Setup: create a file first
const createChange = await queueCreate(sql, testSessionId, null, 'editable.txt', 'original content here', projectRoot);
await applyOne(sql, createChange.id, projectRoot);
// Queue an edit
const editChange = await queueEdit(sql, testSessionId, null, 'editable.txt', 'original', 'modified', projectRoot);
expect(editChange.operation).toBe('edit');
const result = await applyOne(sql, editChange.id, projectRoot);
expect(result.success).toBe(true);
const content = await readFile(resolve(testDir, 'editable.txt'), 'utf8');
expect(content).toBe('modified content here');
});
it('queueDelete → apply → verify file gone', async () => {
// Setup: create a file
const createChange = await queueCreate(sql, testSessionId, null, 'deleteme.txt', 'goodbye', projectRoot);
await applyOne(sql, createChange.id, projectRoot);
expect(existsSync(resolve(testDir, 'deleteme.txt'))).toBe(true);
// Queue a delete
const deleteChange = await queueDelete(sql, testSessionId, null, 'deleteme.txt', projectRoot);
const result = await applyOne(sql, deleteChange.id, projectRoot);
expect(result.success).toBe(true);
expect(existsSync(resolve(testDir, 'deleteme.txt'))).toBe(false);
});
it('rewindOne → verify reverted', async () => {
// Setup: create and apply a file
const createChange = await queueCreate(sql, testSessionId, null, 'rewindable.txt', 'initial', projectRoot);
await applyOne(sql, createChange.id, projectRoot);
// Rewind the create (should delete the file)
const result = await rewindOne(sql, createChange.id, projectRoot);
expect(result.success).toBe(true);
expect(existsSync(resolve(testDir, 'rewindable.txt'))).toBe(false);
});
});


@@ -0,0 +1,26 @@
import { describe, it, expect } from 'vitest';
import { getManifestCommands, mergeCommands, PROVIDER_COMMANDS } from '../provider-commands.js';
describe('provider-commands', () => {
it('defines commands for every external harness', () => {
for (const name of ['claude', 'opencode', 'goose', 'qwen']) {
expect(getManifestCommands(name).length, name).toBeGreaterThan(0);
}
});
it('boocode uses frontend skills — empty manifest', () => {
expect(getManifestCommands('boocode')).toEqual([]);
expect(PROVIDER_COMMANDS.boocode).toEqual([]);
});
it('mergeCommands dedupes by name with later override', () => {
const merged = mergeCommands(
[{ name: 'help', description: 'a' }],
[{ name: 'help', description: 'b' }, { name: 'clear' }],
);
expect(merged).toEqual([
{ name: 'clear' },
{ name: 'help', description: 'b' },
]);
});
});
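
The merge rule asserted above (dedupe by name, later lists win, output ordered `clear` before `help`) is consistent with a map keyed by name followed by a sort. The sketch below is one way to get that behaviour; `mergeCommandsSketch` and the `CommandSketch` shape are hypothetical, and sorting by name is an assumption inferred from the expected order, not confirmed from `provider-commands.js`.

```typescript
// Hypothetical command shape; the real type is not shown in this diff.
interface CommandSketch {
  name: string;
  description?: string;
}

// Later lists override earlier ones by name; result is sorted by name.
function mergeCommandsSketch(...lists: CommandSketch[][]): CommandSketch[] {
  const byName = new Map<string, CommandSketch>();
  for (const list of lists) {
    for (const cmd of list) {
      byName.set(cmd.name, cmd); // overwrite keeps only the latest definition
    }
  }
  return Array.from(byName.values()).sort((a, b) => a.name.localeCompare(b.name));
}
```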


@@ -0,0 +1,93 @@
import { describe, it, expect, vi } from 'vitest';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import { PROVIDERS } from '../provider-registry.js';
import type { CoderProvidersFile } from '../provider-config.js';
describe('buildResolvedRegistry', () => {
it('applies a built-in override (goose label)', () => {
const config: CoderProvidersFile = { providers: { goose: { label: 'Goosey' } } };
const reg = buildResolvedRegistry(PROVIDERS, config);
const goose = reg.get('goose');
expect(goose).toBeDefined();
expect(goose!.label).toBe('Goosey');
expect(goose!.configLabel).toBe('Goosey');
expect(goose!.enabled).toBe(true);
expect(goose!.isBuiltin).toBe(true);
expect(goose!.isCustomAcp).toBe(false);
});
it('adds a custom ACP entry (extends:acp + label + command)', () => {
const config: CoderProvidersFile = {
providers: {
'amp-acp': { extends: 'acp', label: 'Amp', description: 'ACP wrapper', command: ['amp-acp', '--acp'], env: { AMP: '1' } },
},
};
const reg = buildResolvedRegistry(PROVIDERS, config);
const amp = reg.get('amp-acp');
expect(amp).toBeDefined();
expect(amp!.isCustomAcp).toBe(true);
expect(amp!.isBuiltin).toBe(false);
expect(amp!.transport).toBe('acp');
expect(amp!.modelSource).toBe('probe');
expect(amp!.launchCommand).toEqual(['amp-acp', '--acp']);
expect(amp!.env).toEqual({ AMP: '1' });
expect(amp!.enabled).toBe(true);
});
it('keeps a disabled built-in in the registry flagged disabled (goose)', () => {
const config: CoderProvidersFile = { providers: { goose: { enabled: false } } };
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.has('goose')).toBe(true);
expect(reg.get('goose')!.enabled).toBe(false);
});
it('skips a custom id without extends (no throw)', () => {
const config: CoderProvidersFile = { providers: { weird: { label: 'Weird', command: ['weird'] } } };
const warn = vi.spyOn(console, 'warn').mockImplementation(() => {});
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.has('weird')).toBe(false);
// built-ins untouched
expect(reg.size).toBe(PROVIDERS.length);
expect(warn).toHaveBeenCalled();
warn.mockRestore();
});
it('ignores enabled:false on boocode and warns', () => {
const config: CoderProvidersFile = { providers: { boocode: { enabled: false } } };
const warn = vi.spyOn(console, 'warn').mockImplementation(() => {});
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.get('boocode')!.enabled).toBe(true);
expect(warn).toHaveBeenCalled();
warn.mockRestore();
});
it('carries config models + additionalModels onto built-in and custom defs', () => {
const reg = buildResolvedRegistry(PROVIDERS, {
providers: {
claude: { models: [{ id: 'claude-opus-4-8', label: 'Opus 4.8' }] },
'amp-acp': {
extends: 'acp',
label: 'Amp',
command: ['amp-acp'],
additionalModels: [{ id: 'amp-1', label: 'Amp 1' }],
},
},
});
expect(reg.get('claude')!.configModels).toEqual([{ id: 'claude-opus-4-8', label: 'Opus 4.8' }]);
expect(reg.get('amp-acp')!.configAdditionalModels).toEqual([{ id: 'amp-1', label: 'Amp 1' }]);
});
it('REGRESSION: empty config returns exactly the built-ins, all enabled', () => {
const reg = buildResolvedRegistry(PROVIDERS, { providers: {} });
expect(reg.size).toBe(PROVIDERS.length);
expect([...reg.keys()]).toEqual(PROVIDERS.map((p) => p.name));
for (const def of PROVIDERS) {
const r = reg.get(def.name)!;
expect(r.enabled).toBe(true);
expect(r.isBuiltin).toBe(true);
expect(r.isCustomAcp).toBe(false);
expect(r.launchCommand).toBeNull();
expect(r.label).toBe(def.label);
}
});
});


@@ -0,0 +1,96 @@
import { describe, it, expect } from 'vitest';
import {
mergeProviderConfigPatch,
ProviderConfigPatchSchema,
CoderProvidersFileSchema,
type CoderProvidersFile,
} from '../provider-config.js';
describe('ProviderConfigPatchSchema', () => {
it('accepts a per-provider override patch', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: { enabled: false } } });
expect(parsed.success).toBe(true);
});
it('accepts a null value (delete-the-override sentinel)', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: null } });
expect(parsed.success).toBe(true);
});
it('defaults providers to {} on an empty body', () => {
const parsed = ProviderConfigPatchSchema.safeParse({});
expect(parsed.success).toBe(true);
if (parsed.success) expect(parsed.data.providers).toEqual({});
});
it('rejects a malformed override (wrong field type)', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: { enabled: 'yes' } } });
expect(parsed.success).toBe(false);
});
it('rejects a non-object providers map', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: 123 });
expect(parsed.success).toBe(false);
});
});
describe('mergeProviderConfigPatch', () => {
const current: CoderProvidersFile = {
providers: {
goose: { enabled: true, label: 'Goose' },
opencode: { enabled: true },
},
};
it('replaces an existing override object wholesale (not deep-merge)', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: { enabled: false } } });
// Whole override replaced — the prior `label` is gone, only `enabled` remains.
expect(merged.providers.goose).toEqual({ enabled: false });
});
it('adds a brand-new override id', () => {
const merged = mergeProviderConfigPatch(current, {
providers: { 'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp'] } },
});
expect(merged.providers['amp-acp']).toEqual({ extends: 'acp', label: 'Amp', command: ['amp-acp'] });
});
it('deletes an override when the value is null', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: null } });
expect(merged.providers.goose).toBeUndefined();
expect(Object.keys(merged.providers)).toEqual(['opencode']);
});
it('leaves ids absent from the patch untouched', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: { enabled: false } } });
expect(merged.providers.opencode).toEqual({ enabled: true });
});
it('does not mutate the input config', () => {
const snapshot = JSON.parse(JSON.stringify(current));
mergeProviderConfigPatch(current, { providers: { goose: null, opencode: { enabled: false } } });
expect(current).toEqual(snapshot);
});
it('empty patch returns an equivalent config', () => {
const merged = mergeProviderConfigPatch(current, { providers: {} });
expect(merged).toEqual(current);
});
});
describe('CoderProvidersFileSchema (validate-before-save guard)', () => {
it('accepts a clean merged config', () => {
const merged = mergeProviderConfigPatch(
{ providers: {} },
{ providers: { goose: { enabled: false } } },
);
expect(CoderProvidersFileSchema.safeParse(merged).success).toBe(true);
});
it('rejects a config carrying an invalid override (never written)', () => {
// A merged object that somehow holds a bad override must fail validation
// so the PATCH route returns 422 and never calls save().
const invalid = { providers: { goose: { enabled: 'nope' } } };
expect(CoderProvidersFileSchema.safeParse(invalid).success).toBe(false);
});
});
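
The patch semantics tested above (wholesale replace, `null` as a delete sentinel, untouched ids preserved, input never mutated) can be sketched in a few lines. This is an illustration under assumptions: `mergePatchSketch` and the loose `OverrideSketch` type are hypothetical stand-ins, not the schema-validated types in `provider-config.js`.

```typescript
// Hypothetical, loosely typed stand-ins for the real config types.
type OverrideSketch = Record<string, unknown>;
interface ProvidersFileSketch {
  providers: Record<string, OverrideSketch>;
}
interface ProvidersPatchSketch {
  providers: Record<string, OverrideSketch | null>;
}

function mergePatchSketch(
  current: ProvidersFileSketch,
  patch: ProvidersPatchSketch,
): ProvidersFileSketch {
  // Copy the map so the caller's config object is never mutated.
  const providers: Record<string, OverrideSketch> = { ...current.providers };
  for (const [id, override] of Object.entries(patch.providers)) {
    if (override === null) {
      delete providers[id]; // null removes the override entirely
    } else {
      providers[id] = override; // whole-object replace, no deep merge
    }
  }
  return { providers };
}
```

Replacing the override wholesale keeps the PATCH contract simple: a client that wants to keep a field must resend it, and there is no ambiguity about how nested fields combine.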


@@ -0,0 +1,85 @@
import { describe, it, expect } from 'vitest';
import { getProviderDiagnostic, type DiagnosticAgentRow } from '../provider-diagnostic.js';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import { PROVIDERS } from '../provider-registry.js';
import type { ProviderSnapshotEntry } from '../provider-types.js';
const registry = buildResolvedRegistry(PROVIDERS, {
providers: {
goose: { enabled: false },
'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp', '--acp'] },
},
});
const alwaysAvailable = () => Promise.resolve(true);
const neverAvailable = () => Promise.resolve(false);
describe('getProviderDiagnostic', () => {
it('reports a disabled built-in (enabled:false, no install)', async () => {
const report = await getProviderDiagnostic(registry.get('goose')!, undefined, {
checkAvailable: neverAvailable,
});
expect(report).toContain('provider: goose');
expect(report).toContain('enabled: false');
expect(report).toContain('installed: false');
expect(report).toMatch(/command_available:\s*false/);
});
it('reports an installed built-in with its install_path, last_probed_at, model count', async () => {
const agentRow: DiagnosticAgentRow = {
name: 'opencode',
install_path: '/usr/bin/opencode',
supports_acp: true,
models: [
{ id: 'm1', label: 'M1' },
{ id: 'm2', label: 'M2' },
],
last_probed_at: '2026-05-29T12:00:00.000Z',
};
const report = await getProviderDiagnostic(registry.get('opencode')!, agentRow, {
checkAvailable: alwaysAvailable,
});
expect(report).toContain('install_path: /usr/bin/opencode');
expect(report).toContain('2026-05-29T12:00:00.000Z');
expect(report).toContain('installed: true');
expect(report).toMatch(/models_in_db:\s*2/);
expect(report).toMatch(/command_available:\s*true/);
});
it('reports a custom ACP launch command + its binary', async () => {
const report = await getProviderDiagnostic(registry.get('amp-acp')!, undefined, {
checkAvailable: alwaysAvailable,
});
expect(report).toContain('provider: amp-acp');
expect(report).toContain('amp-acp --acp');
expect(report).toContain('customAcp: true');
});
it('surfaces the last probe error from a cached snapshot entry', async () => {
const cachedEntry: ProviderSnapshotEntry = {
name: 'opencode',
label: 'OpenCode',
transport: 'acp',
status: 'error',
enabled: true,
installed: true,
models: [],
modes: [],
defaultModeId: null,
commands: [],
error: 'ACP initialize timed out',
};
const report = await getProviderDiagnostic(registry.get('opencode')!, undefined, {
cachedEntry,
checkAvailable: alwaysAvailable,
});
expect(report).toContain('ACP initialize timed out');
});
it('reports no error when none is cached', async () => {
const report = await getProviderDiagnostic(registry.get('opencode')!, undefined, {
checkAvailable: alwaysAvailable,
});
expect(report).toMatch(/last_probe_error:\s*\(none/);
});
});


@@ -0,0 +1,370 @@
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import {
mergeModels,
prefixLlamaSwapModels,
clearProviderSnapshotCache,
getProviderSnapshot,
peekSnapshotEntry,
} from '../provider-snapshot.js';
import { loadProviderConfig } from '../provider-config-registry.js';
vi.mock('../acp-probe.js', () => ({
probeAcpProvider: vi.fn(),
}));
import { probeAcpProvider } from '../acp-probe.js';
const mockProbe = vi.mocked(probeAcpProvider);
/** Write a temp coder-providers.json and point the resolved registry at it. */
function loadConfigFixture(providers: Record<string, unknown>): void {
const path = join(tmpdir(), `coder-providers-test-${Object.keys(providers).join('-') || 'empty'}.json`);
writeFileSync(path, JSON.stringify({ providers }), 'utf8');
loadProviderConfig(path);
}
function mockSql(agents: Array<{
name: string;
install_path: string | null;
supports_acp: boolean;
models: Array<{ id: string; label: string }> | null;
label: string | null;
transport: string | null;
last_probed_at?: string | null;
}>) {
return vi.fn((strings: TemplateStringsArray) => {
const query = strings.join('');
if (query.includes('FROM available_agents')) {
return Promise.resolve(agents);
}
if (query.includes('UPDATE available_agents')) {
return Promise.resolve([]);
}
return Promise.resolve([]);
}) as unknown as import('../db.js').Sql;
}
const config = {
LLAMA_SWAP_URL: 'http://llama-swap.test',
PROVIDER_PROBE_TTL_MS: 86_400_000,
} as import('../config.js').Config;
describe('prefixLlamaSwapModels', () => {
it('prefixes bare ids', () => {
expect(prefixLlamaSwapModels([{ id: 'qwen3', label: 'qwen3' }])).toEqual([
{ id: 'llama-swap/qwen3', label: 'qwen3' },
]);
});
it('leaves already-prefixed ids unchanged', () => {
expect(prefixLlamaSwapModels([{ id: 'llama-swap/qwen3', label: 'qwen3' }])).toEqual([
{ id: 'llama-swap/qwen3', label: 'qwen3' },
]);
});
});
describe('mergeModels', () => {
it('dedupes by id preserving first occurrence', () => {
const merged = mergeModels(
[{ id: 'a', label: 'A' }],
[{ id: 'a', label: 'A2' }, { id: 'b', label: 'B' }],
);
expect(merged).toEqual([
{ id: 'a', label: 'A' },
{ id: 'b', label: 'B' },
]);
});
});
describe('getProviderSnapshot', () => {
beforeEach(() => {
clearProviderSnapshotCache();
// Reset the resolved registry to built-ins-only (missing path → {} config).
loadProviderConfig('/nonexistent-coder-providers.json');
vi.restoreAllMocks();
vi.stubGlobal(
'fetch',
vi.fn().mockResolvedValue({
ok: true,
json: async () => ({
data: [{ id: 'local-model' }, { id: 'llama-swap/existing' }],
}),
}),
);
});
it('merges opencode ACP models with prefixed llama-swap models', async () => {
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'opencode/big-pickle', label: 'Big Pickle', isDefault: true }],
modes: [{ id: 'build', label: 'Build' }],
defaultModeId: 'build',
commands: [{ name: 'custom', description: 'From ACP probe' }],
});
const sql = mockSql([
{
name: 'opencode',
install_path: '/usr/bin/opencode',
supports_acp: true,
models: null,
label: 'OpenCode',
transport: 'acp',
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const opencode = entries.find((e) => e.name === 'opencode');
expect(opencode?.models.map((m) => m.id)).toEqual([
'opencode/big-pickle',
'llama-swap/local-model',
'llama-swap/existing',
]);
expect(opencode?.commands.some((c) => c.name === 'help')).toBe(true);
expect(opencode?.commands.some((c) => c.name === 'custom')).toBe(true);
});
it('combines qwen-shaped probe and settings model lists via mergeModels', () => {
const merged = mergeModels(
[{ id: 'qwen-probed', label: 'Qwen Probed' }],
[{ id: 'from-settings', label: 'from-settings' }],
);
expect(merged.map((m) => m.id)).toEqual(['qwen-probed', 'from-settings']);
});
it('returns cached entries on second call within TTL', async () => {
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'm1', label: 'M1' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: null,
label: 'Goose',
transport: 'acp',
},
]);
await getProviderSnapshot(sql, config, '/tmp/cwd', true);
await getProviderSnapshot(sql, config, '/tmp/cwd', false);
expect(mockProbe).toHaveBeenCalledTimes(1);
});
it('attaches claude thinking options', async () => {
const sql = mockSql([
{
name: 'claude',
install_path: '/usr/bin/claude',
supports_acp: false,
models: [{ id: 'claude-sonnet', label: 'Sonnet' }],
label: 'Claude Code',
transport: 'pty',
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const claude = entries.find((e) => e.name === 'claude');
expect(claude?.models[0]?.thinkingOptions?.length).toBeGreaterThan(0);
expect(claude?.modes.length).toBeGreaterThan(0);
expect(claude?.commands.some((c) => c.name === 'help')).toBe(true);
});
it('disabled provider → unavailable + enabled:false, WITHOUT spawning a probe', async () => {
loadConfigFixture({ goose: { enabled: false } });
mockProbe.mockResolvedValue({ ok: true, models: [], modes: [], defaultModeId: null, commands: [] });
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'g1', label: 'G1' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(),
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const goose = entries.find((e) => e.name === 'goose');
expect(goose?.status).toBe('unavailable');
expect(goose?.enabled).toBe(false);
expect(goose?.installed).toBe(false);
expect(mockProbe).not.toHaveBeenCalled();
});
it('uninstalled provider → unavailable + enabled:true + installed:false', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({ ok: true, models: [], modes: [], defaultModeId: null, commands: [] });
const sql = mockSql([]); // nothing probed/installed
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const opencode = entries.find((e) => e.name === 'opencode');
expect(opencode?.status).toBe('unavailable');
expect(opencode?.enabled).toBe(true);
expect(opencode?.installed).toBe(false);
expect(mockProbe).not.toHaveBeenCalled();
});
it('fresh DB within TTL → tier-2 cold probe SKIPPED (serves DB models)', async () => {
loadConfigFixture({});
// If the probe were wrongly called, cached-goose would be replaced by
// SHOULD-NOT-APPEAR and the not.toHaveBeenCalled assertion would fail.
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'SHOULD-NOT-APPEAR', label: 'nope' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'cached-goose', label: 'Cached Goose' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(), // fresh
},
]);
// force=false → cache-miss returns loading; second call joins the build / cache.
await getProviderSnapshot(sql, config, '/tmp/cwd', false);
const entries = await getProviderSnapshot(sql, config, '/tmp/cwd', false);
const goose = entries.find((e) => e.name === 'goose');
expect(goose?.status).toBe('ready');
expect(goose?.installed).toBe(true);
expect(goose?.models.map((m) => m.id)).toContain('cached-goose');
expect(goose?.models.map((m) => m.id)).not.toContain('SHOULD-NOT-APPEAR');
expect(mockProbe).not.toHaveBeenCalled();
});
it('force refresh → tier-2 cold probe RUNS even when DB is fresh', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'fresh-probe', label: 'Fresh' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'cached-goose', label: 'Cached' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(), // fresh, but force overrides
},
]);
await getProviderSnapshot(sql, config, '/tmp/cwd', true);
expect(mockProbe).toHaveBeenCalled();
});
it('native boocode → ready, enabled, installed', async () => {
loadConfigFixture({});
const sql = mockSql([]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const boocode = entries.find((e) => e.name === 'boocode');
expect(boocode?.status).toBe('ready');
expect(boocode?.enabled).toBe(true);
expect(boocode?.installed).toBe(true);
});
it('config models REPLACE the claude static list; additionalModels merge (+ thinking)', async () => {
loadConfigFixture({
claude: {
models: [{ id: 'claude-opus-4-8', label: 'Opus 4.8' }],
additionalModels: [{ id: 'sonnet', label: 'Sonnet (latest)' }],
},
});
const sql = mockSql([
{
name: 'claude',
install_path: '/usr/bin/claude',
supports_acp: false,
models: [{ id: 'old-static', label: 'Old' }],
label: 'Claude Code',
transport: 'pty',
last_probed_at: new Date().toISOString(),
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const claude = entries.find((e) => e.name === 'claude');
const ids = claude!.models.map((m) => m.id);
expect(ids).toContain('claude-opus-4-8'); // config models replaced the DB/static list
expect(ids).toContain('sonnet'); // additionalModels merged on top
expect(ids).not.toContain('old-static'); // replaced, not appended
// thinking options still attach to the config-provided models
expect(claude!.models.find((m) => m.id === 'claude-opus-4-8')?.thinkingOptions?.length).toBeGreaterThan(0);
});
it('peekSnapshotEntry returns a cached entry (read-only) and undefined when cold/unknown', async () => {
loadConfigFixture({});
// Cold cache → undefined (no build triggered).
expect(peekSnapshotEntry('boocode', '/tmp/peek')).toBeUndefined();
const sql = mockSql([]);
await getProviderSnapshot(sql, config, '/tmp/peek', true);
expect(peekSnapshotEntry('boocode', '/tmp/peek')?.name).toBe('boocode');
expect(peekSnapshotEntry('does-not-exist', '/tmp/peek')).toBeUndefined();
});
it('2.7 warm cache: a second snapshot within the warm window spawns ZERO probes', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'm1', label: 'M1' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: null,
label: 'Goose',
transport: 'acp',
last_probed_at: null,
},
]);
await getProviderSnapshot(sql, config, '/tmp/cwd', true); // cold populate
const probeCallsAfterFirst = mockProbe.mock.calls.length;
await getProviderSnapshot(sql, config, '/tmp/cwd', false); // warm read
const probeCallsAfterSecond = mockProbe.mock.calls.length;
// Success criterion: second snapshot is served from cache with no ACP spawns.
expect(probeCallsAfterSecond - probeCallsAfterFirst).toBe(0);
});
});
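
The two pure helpers covered at the top of this file, id prefixing and first-wins model merging, are small enough to sketch directly. The names `prefixSketch` and `mergeModelsSketch` and the `ModelSketch` type are hypothetical; only the `llama-swap/` prefix and the expected outputs come from the tests.

```typescript
// Hypothetical model shape matching the fields used in the tests.
interface ModelSketch {
  id: string;
  label: string;
}

const LLAMA_SWAP_PREFIX = 'llama-swap/';

// Prefix bare ids; ids that already carry the prefix are left unchanged.
function prefixSketch(models: ModelSketch[]): ModelSketch[] {
  return models.map((m) =>
    m.id.startsWith(LLAMA_SWAP_PREFIX) ? m : { ...m, id: LLAMA_SWAP_PREFIX + m.id },
  );
}

// Concatenate lists in order, keeping the FIRST entry seen for each id.
function mergeModelsSketch(...lists: ModelSketch[][]): ModelSketch[] {
  const seen = new Set<string>();
  const out: ModelSketch[] = [];
  for (const list of lists) {
    for (const model of list) {
      if (!seen.has(model.id)) {
        seen.add(model.id);
        out.push(model);
      }
    }
  }
  return out;
}
```

First-wins ordering is what lets the probed list take precedence over fallback sources while still appending models the probe did not report.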


@@ -0,0 +1,64 @@
import { describe, it, expect } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
/**
* Parity guard between the two copies of the provider snapshot types:
* apps/coder/src/services/provider-types.ts (backend source of truth)
* apps/web/src/api/types.ts (web wire copy)
*
* APPROACH: text-identity of each shared type block (mirrors the repo's existing
* ws-frames.test.ts byte-parity convention). A compile-time bidirectional-
* assignability check was attempted first (a web-side file importing coder's
* import-free provider-types.ts), but apps/web/tsconfig.app.json is a composite
* project and rejects out-of-include files with TS6307 — so cross-project type
* import is structurally blocked. This runtime guard FAILS on any field
* add/remove/rename/loosen in either copy, including the nested model/mode/
* command types that ProviderSnapshotEntry references. Single-source-of-truth
* (shared workspace package) is deferred as a Tier-2 follow-up.
*/
const here = dirname(fileURLToPath(import.meta.url));
const coderSrc = readFileSync(resolve(here, '../provider-types.ts'), 'utf8');
const webSrc = readFileSync(resolve(here, '../../../../web/src/api/types.ts'), 'utf8');
function extractBlock(src: string, name: string): string {
const iface = src.match(new RegExp(`export interface ${name} \\{[\\s\\S]*?\\n\\}`));
const alias = src.match(new RegExp(`export type ${name} =[^;]*;`));
const block = iface?.[0] ?? alias?.[0];
if (!block) throw new Error(`type block '${name}' not found`);
// Normalize to type structure: drop blank + comment lines (//, /* */, *),
// trim each line. Field add/remove/rename/loosen still changes a field line.
return block
.split('\n')
.map((l) => l.trim())
.filter(
(l) =>
l.length > 0 &&
!l.startsWith('//') &&
!l.startsWith('/*') &&
!l.startsWith('*'),
)
.join('\n');
}
describe('provider snapshot type parity (coder ↔ web)', () => {
// Includes the nested types ProviderSnapshotEntry references, so structural
// drift anywhere in the snapshot surface is caught.
const names = [
'ProviderSnapshotStatus',
'ProviderSnapshotEntry',
'ProviderModel',
'ProviderMode',
'ThinkingOption',
'AgentCommand',
];
for (const name of names) {
it(`${name} is identical in both copies`, () => {
expect(
extractBlock(webSrc, name),
`${name} drifted between apps/coder/src/services/provider-types.ts and apps/web/src/api/types.ts`,
).toBe(extractBlock(coderSrc, name));
});
}
});
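
The normalization step in the parity guard above can be tried on inline strings. This is a minimal sketch, assuming a hypothetical `normalizeInterface` helper that copies only the `export interface` branch of `extractBlock`; the `Demo` type and the three source strings are invented for illustration and do not exist in the repo.

```typescript
// Hypothetical stand-in for the interface branch of extractBlock (illustration only).
function normalizeInterface(src: string, name: string): string {
  const match = src.match(new RegExp(`export interface ${name} \\{[\\s\\S]*?\\n\\}`));
  if (!match) throw new Error(`type block '${name}' not found`);
  return match[0]
    .split('\n')
    .map((l) => l.trim())
    // Drop blank lines and comment lines so only structural lines remain.
    .filter((l) => l.length > 0 && !l.startsWith('//') && !l.startsWith('/*') && !l.startsWith('*'))
    .join('\n');
}

// Invented sample sources: same shape, different comments and whitespace.
const coderCopy = `
export interface Demo {
  // backend-only comment
  id: string;
  label: string;
}
`;
const webCopy = `
export interface Demo {
    id: string;

    label: string;
}
`;
// Invented drifted copy: `label` was loosened to optional.
const driftedCopy = `
export interface Demo {
  id: string;
  label?: string;
}
`;

// Comments and indentation do not count as drift; a loosened field does.
const sameShape = normalizeInterface(coderCopy, 'Demo') === normalizeInterface(webCopy, 'Demo');
const drifted = normalizeInterface(coderCopy, 'Demo') !== normalizeInterface(driftedCopy, 'Demo');
```

Because the comparison is textual, reordering fields or reformatting a union across lines would also register as drift, which seems to be the intended trade-off of a byte-parity guard.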

@@ -0,0 +1,170 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync, existsSync } from 'node:fs';
import { rm, mkdir } from 'node:fs/promises';
import { resolve } from 'node:path';
import postgres from 'postgres';
import {
ensureSessionWorktree,
closeChatBackendState,
rebaselineWorktreeAfterApply,
} from '../worktrees.js';
import { reapOrphanWorktrees } from '../orphan-worktree-reaper.js';
import { hostExec } from '../host-exec.js';
/**
* v2.6 Phase 3 (3.6) — reconnect-after-restart integration test.
*
* Proves the DB-truth side of crash/restart recovery: a BooCoder restart wipes the
* in-memory pool, but the persistent `worktrees` + `agent_sessions` rows survive,
* so the "next turn" re-resolves the SAME worktree (reattach, no new dir) and the
* agent-session row is still there to resume from. Also exercises the chat-close
* hook (3.3), the apply re-baseline (3.5), and the orphan reaper (3.4) end-to-end
* against a real git repo + postgres.
*
* Requires DATABASE_URL (DB-opt-in; skips cleanly otherwise) AND git on PATH. Runs:
* DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/coder test
*/
describe.runIf(!!process.env.DATABASE_URL)('reconnect after restart (Phase 3)', () => {
let sql: ReturnType<typeof postgres>;
const stamp = Date.now();
const projectDir = `/tmp/boocode-reconnect-proj-${stamp}`;
let projectId: string;
let sessionId: string;
let chatId: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Both schemas land in the one boochat DB: server owns sessions/chats/projects,
// coder owns worktrees/agent_sessions (FK targets must pre-exist → server first).
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
// A real git repo with one commit so worktree add / diff / rev-parse work.
await mkdir(projectDir, { recursive: true });
await hostExec(
`cd ${projectDir} && git init -q && git config user.email t@t && git config user.name t ` +
`&& echo hello > README.md && git add -A && git commit -qm init`,
{ timeoutMs: 20_000 },
);
const [project] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('reconnect-test', ${projectDir}, 'open') RETURNING id
`;
projectId = project!.id;
const [session] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status)
VALUES (${projectId}, 'recon', 'm', 'open') RETURNING id
`;
sessionId = session!.id;
const [chat] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = chat!.id;
});
afterAll(async () => {
if (sql) {
// Best-effort worktree cleanup before dropping rows.
const rows = await sql<{ path: string }[]>`SELECT path FROM worktrees WHERE session_id = ${sessionId}`.catch(() => []);
for (const r of rows) {
await hostExec(`git -C ${projectDir} worktree remove ${r.path} --force`, { timeoutMs: 10_000 }).catch(() => {});
}
await sql`DELETE FROM agent_sessions WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM worktrees WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
}
await rm(projectDir, { recursive: true, force: true });
});
it('reattaches the SAME worktree across a simulated restart (no new dir)', async () => {
// "Turn 1" — first ensureSessionWorktree creates the worktree + row.
const first = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(existsSync(first.worktreePath)).toBe(true);
expect(first.baseCommit).toBeTruthy();
// Simulate an agent_sessions row written by turn 1 (opencode).
await sql`
INSERT INTO agent_sessions (chat_id, session_id, worktree_id, agent, backend, agent_session_id, status, last_active_at)
VALUES (${chatId}, ${sessionId}, ${first.worktreeId}, 'opencode', 'opencode_server', 'oc-sess-1', 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO NOTHING
`;
// "Restart" = brand-new resolution with NO in-memory state. ensureSessionWorktree
// must return the EXISTING row (same id + path), proving reattach not re-create.
const second = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(second.worktreeId).toBe(first.worktreeId);
expect(second.worktreePath).toBe(first.worktreePath);
expect(second.baseCommit).toBe(first.baseCommit);
// The agent_sessions row survived the "restart" with its resume handle intact.
const [row] = await sql<{ agent_session_id: string; status: string }[]>`
SELECT agent_session_id, status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'opencode'
`;
expect(row!.agent_session_id).toBe('oc-sess-1');
});
it('re-baselines the worktree diff after apply (3.5)', async () => {
const wt = await ensureSessionWorktree(sql, projectDir, sessionId);
const baseBefore = wt.baseCommit;
// Make a change in the worktree (as an external agent would).
await hostExec(`cd ${wt.worktreePath} && echo change >> README.md`, { timeoutMs: 10_000 });
const r = await rebaselineWorktreeAfterApply(sql, sessionId);
expect(r.rebaselined).toBe(true);
expect(r.newBaseCommit).toBeTruthy();
expect(r.newBaseCommit).not.toBe(baseBefore);
const [row] = await sql<{ base_commit: string }[]>`
SELECT base_commit FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'
`;
expect(row!.base_commit).toBe(r.newBaseCommit);
// Idempotent: a second re-baseline with no new edits is a no-op.
const r2 = await rebaselineWorktreeAfterApply(sql, sessionId);
expect(r2.rebaselined).toBe(false);
});
it('chat-close hook closes agent rows + removes the worktree on the last chat (3.3)', async () => {
// Sanity: an active worktree + agent row exist from the prior tests.
const beforeWt = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
expect(beforeWt.length).toBe(1);
const result = await closeChatBackendState(sql, chatId);
expect(result.agentRowsClosed).toBeGreaterThanOrEqual(1);
// chatId is the session's only chat → worktree removed (it was clean after the
// re-baseline commit), not at-risk.
expect(result.worktreeAtRisk).toBe(false);
expect(result.worktreeRemoved).toBe(true);
const [agentRow] = await sql<{ status: string }[]>`
SELECT status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'opencode'
`;
expect(agentRow!.status).toBe('closed');
const activeWt = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
expect(activeWt.length).toBe(0); // archived, no longer active
});
it('orphan reaper leaves a live worktree alone and reaps a row-less dir (3.4)', async () => {
// Recreate a live worktree for this session (the close test archived the old one).
const live = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(existsSync(live.worktreePath)).toBe(true);
// A live worktree (active row) with grace 0 must NOT be reaped.
const r1 = await reapOrphanWorktrees(sql, console as never, 0, Date.now());
expect(r1.reaped).not.toContain(live.worktreePath);
// Now archive its row (simulating a leaked dir) and reap again — it becomes an
// orphan and is reclaimed (it's clean → not at-risk).
await sql`UPDATE worktrees SET status = 'archived' WHERE id = ${live.worktreeId}`;
const r2 = await reapOrphanWorktrees(sql, console as never, 0, Date.now());
expect(r2.reaped).toContain(live.worktreePath);
expect(existsSync(live.worktreePath)).toBe(false);
});
});

@@ -0,0 +1,189 @@
import { describe, it, expect } from 'vitest';
import {
makeStreamJsonParser,
makeStreamJsonState,
parseStreamJsonLine,
type AgentEventList,
} from '../stream-json-parser.js';
import type { AgentEvent } from '../agent-backend.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
// Helpers to JSON-encode the representative Claude-Code stream-json lines.
const sys = (sessionId: string) =>
JSON.stringify({ type: 'system', subtype: 'init', session_id: sessionId, tools: ['read', 'edit'] });
const streamEvent = (event: unknown) => JSON.stringify({ type: 'stream_event', event });
const textDelta = (index: number, text: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'text_delta', text } });
const thinkingDelta = (index: number, thinking: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } });
const toolStart = (index: number, id: string, name: string) =>
streamEvent({ type: 'content_block_start', index, content_block: { type: 'tool_use', id, name } });
const inputJsonDelta = (index: number, partial: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: partial } });
const blockStop = (index: number) => streamEvent({ type: 'content_block_stop', index });
const resultLine = (input: number, output: number, sessionId?: string) =>
JSON.stringify({ type: 'result', subtype: 'success', session_id: sessionId, usage: { input_tokens: input, output_tokens: output } });
describe('parseStreamJsonLine (pure per-line mapping)', () => {
it('captures session_id from the system init line and emits no events', () => {
const state = makeStreamJsonState();
const events = parseStreamJsonLine(sys('sess-abc'), state);
expect(events).toEqual([]);
expect(state.sessionId).toBe('sess-abc');
});
it('maps a text_delta stream_event → a text event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(textDelta(0, 'Hello'), state)).toEqual([{ type: 'text', text: 'Hello' }]);
});
it('maps a thinking_delta stream_event → a reasoning event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(thinkingDelta(0, 'pondering'), state)).toEqual([
{ type: 'reasoning', text: 'pondering' },
]);
});
it('tolerates a garbage / non-JSON line (returns [], no throw)', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine('not json at all {{{', state)).toEqual([]);
expect(parseStreamJsonLine('', state)).toEqual([]);
expect(parseStreamJsonLine(' ', state)).toEqual([]);
// A truncated/partial JSON object also yields [] rather than throwing.
expect(parseStreamJsonLine('{"type":"stream_event","eve', state)).toEqual([]);
});
it('ignores unknown top-level line types and the user (tool-result) line', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(JSON.stringify({ type: 'user', message: {} }), state)).toEqual([]);
expect(parseStreamJsonLine(JSON.stringify({ type: 'whatever' }), state)).toEqual([]);
});
it('assembles a tool call across input_json_delta chunks (split across lines)', () => {
const state = makeStreamJsonState();
// start → tool_call (running, empty args)
const start = parseStreamJsonLine(toolStart(1, 'toolu_1', 'edit_file'), state);
expect(start).toHaveLength(1);
expect(start[0]!.type).toBe('tool_call');
const startSnap = (start[0] as { type: 'tool_call'; toolCall: AcpToolSnapshot }).toolCall;
expect(startSnap.toolCallId).toBe('toolu_1');
expect(startSnap.title).toBe('edit_file');
expect(startSnap.status).toBe('in_progress');
expect(startSnap.rawInput).toEqual({});
// args streamed in fragments — no events until stop
expect(parseStreamJsonLine(inputJsonDelta(1, '{"path":"a'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '.ts","content":'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '"hi"}'), state)).toEqual([]);
// stop → tool_update with the parsed, fully-assembled input
const stop = parseStreamJsonLine(blockStop(1), state);
expect(stop).toHaveLength(1);
expect(stop[0]!.type).toBe('tool_update');
const stopSnap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(stopSnap.toolCallId).toBe('toolu_1');
expect(stopSnap.status).toBe('completed');
expect(stopSnap.rawInput).toEqual({ path: 'a.ts', content: 'hi' });
});
it('falls back to {_raw} when accumulated tool args are not valid JSON', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(toolStart(0, 'toolu_x', 'run'), state);
parseStreamJsonLine(inputJsonDelta(0, '{"broken'), state);
const stop = parseStreamJsonLine(blockStop(0), state);
const snap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.rawInput).toEqual({ _raw: '{"broken' });
});
it('captures usage from message_delta and result lines', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(streamEvent({ type: 'message_delta', usage: { output_tokens: 42 } }), state);
expect(state.usage.outputTokens).toBe(42);
parseStreamJsonLine(resultLine(100, 250, 'sess-z'), state);
expect(state.usage.inputTokens).toBe(100);
expect(state.usage.outputTokens).toBe(250);
expect(state.sessionId).toBe('sess-z');
});
it('maps a terminal assistant message (fallback) → text + reasoning + tool events', () => {
const state = makeStreamJsonState();
const line = JSON.stringify({
type: 'assistant',
session_id: 'sess-asst',
message: {
content: [
{ type: 'thinking', thinking: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{ type: 'tool_use', id: 'toolu_9', name: 'view_file', input: { path: 'x.ts' } },
],
usage: { input_tokens: 5, output_tokens: 7 },
},
});
const events = parseStreamJsonLine(line, state);
expect(events).toEqual([
{ type: 'reasoning', text: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_9', title: 'view_file', kind: null, status: 'completed', rawInput: { path: 'x.ts' } },
},
]);
expect(state.usage).toEqual({ inputTokens: 5, outputTokens: 7 });
expect(state.sessionId).toBe('sess-asst');
});
});
describe('makeStreamJsonParser (stateful wrapper over a full turn)', () => {
it('streams a representative turn: init → text → thinking → tool → result', () => {
const parser = makeStreamJsonParser();
const all: AgentEvent[] = [];
const feed = (line: string): AgentEventList => {
const evs = parser.push(line);
all.push(...evs);
return evs;
};
feed(sys('sess-1'));
feed(textDelta(0, 'Reading '));
feed(textDelta(0, 'the file. '));
feed(thinkingDelta(0, 'I should edit it'));
feed(toolStart(1, 'toolu_a', 'edit_file'));
feed(inputJsonDelta(1, '{"path":'));
feed(inputJsonDelta(1, '"main.ts"}'));
feed(blockStop(1));
feed(textDelta(0, 'Done.'));
feed(resultLine(120, 80, 'sess-1'));
expect(all).toEqual([
{ type: 'text', text: 'Reading ' },
{ type: 'text', text: 'the file. ' },
{ type: 'reasoning', text: 'I should edit it' },
{
type: 'tool_call',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'in_progress', rawInput: {} },
},
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'completed', rawInput: { path: 'main.ts' } },
},
{ type: 'text', text: 'Done.' },
]);
expect(parser.usage()).toEqual({ inputTokens: 120, outputTokens: 80 });
expect(parser.sessionId()).toBe('sess-1');
});
it('a garbage line interleaved mid-turn does not derail subsequent parsing', () => {
const parser = makeStreamJsonParser();
expect(parser.push(textDelta(0, 'a'))).toEqual([{ type: 'text', text: 'a' }]);
expect(parser.push('>>> not json <<<')).toEqual([]);
expect(parser.push(textDelta(0, 'b'))).toEqual([{ type: 'text', text: 'b' }]);
});
});
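
The tool-argument behavior these tests pin down (buffer `input_json_delta` fragments per block index, parse on `content_block_stop`, fall back to `{ _raw }` on invalid JSON) can be sketched in isolation. This is not the code in `stream-json-parser.ts`; `ToolArgAccumulator` and its method names are hypothetical, and returning `{}` for an empty buffer is an assumption.

```typescript
// Hypothetical accumulator; the real parser's internals are not shown in this diff.
type ToolArgs = Record<string, unknown>;

class ToolArgAccumulator {
  private readonly buffers = new Map<number, string>();

  // Called when a tool_use content block starts at `index`.
  start(index: number): void {
    this.buffers.set(index, '');
  }

  // Called for each input_json_delta fragment; unknown indexes are ignored.
  append(index: number, partialJson: string): void {
    const current = this.buffers.get(index);
    if (current === undefined) return;
    this.buffers.set(index, current + partialJson);
  }

  // Called on content_block_stop; returns undefined if the block was never started.
  stop(index: number): ToolArgs | undefined {
    const raw = this.buffers.get(index);
    if (raw === undefined) return undefined;
    this.buffers.delete(index);
    if (raw.length === 0) return {}; // assumption: no fragments means empty args
    try {
      const parsed: unknown = JSON.parse(raw);
      if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
        return parsed as ToolArgs;
      }
      return { _raw: raw };
    } catch {
      // Invalid or truncated JSON: keep the raw text instead of throwing.
      return { _raw: raw };
    }
  }
}

const acc = new ToolArgAccumulator();
acc.start(1);
acc.append(1, '{"path":"a');
acc.append(1, '.ts","content":');
acc.append(1, '"hi"}');
const assembled = acc.stop(1);

acc.start(0);
acc.append(0, '{"broken');
const fallback = acc.stop(0);
```

Keying the buffer by block index lets text deltas on another index interleave with a tool block without corrupting its arguments.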

@@ -0,0 +1,115 @@
import { describe, it, expect } from 'vitest';
import { resolveWritePath, isSecretPath, WriteGuardError } from '../write_guard.js';
const PROJECT_ROOT = '/opt/projects/my-app';
describe('resolveWritePath', () => {
it('resolves a relative path correctly', () => {
const result = resolveWritePath(PROJECT_ROOT, 'src/index.ts');
expect(result).toBe('/opt/projects/my-app/src/index.ts');
});
it('resolves nested relative path', () => {
const result = resolveWritePath(PROJECT_ROOT, 'src/lib/utils.ts');
expect(result).toBe('/opt/projects/my-app/src/lib/utils.ts');
});
it('throws on ../ escape', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '../../../etc/passwd')).toThrow(WriteGuardError);
expect(() => resolveWritePath(PROJECT_ROOT, '../../../etc/passwd')).toThrow('path escapes project root');
});
it('throws on absolute path outside project root', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '/etc/shadow')).toThrow(WriteGuardError);
expect(() => resolveWritePath(PROJECT_ROOT, '/tmp/exploit')).toThrow('path escapes project root');
});
it('allows absolute path inside project root', () => {
const result = resolveWritePath(PROJECT_ROOT, '/opt/projects/my-app/src/new.ts');
expect(result).toBe('/opt/projects/my-app/src/new.ts');
});
it('denies .env files', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '.env')).toThrow(WriteGuardError);
expect(() => resolveWritePath(PROJECT_ROOT, '.env')).toThrow('cannot write to secret file');
});
it('denies .env.local', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '.env.local')).toThrow(WriteGuardError);
});
it('denies .env.production', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '.env.production')).toThrow(WriteGuardError);
});
it('denies *.pem files', () => {
expect(() => resolveWritePath(PROJECT_ROOT, 'certs/server.pem')).toThrow(WriteGuardError);
expect(() => resolveWritePath(PROJECT_ROOT, 'certs/server.pem')).toThrow('cannot write to secret file');
});
it('denies *.key files', () => {
expect(() => resolveWritePath(PROJECT_ROOT, 'ssl/private.key')).toThrow(WriteGuardError);
});
it('denies id_rsa', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '.ssh/id_rsa')).toThrow(WriteGuardError);
});
it('denies id_ed25519', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '.ssh/id_ed25519')).toThrow(WriteGuardError);
});
it('denies credentials.json', () => {
expect(() => resolveWritePath(PROJECT_ROOT, 'credentials.json')).toThrow(WriteGuardError);
});
it('passes a normal file inside project', () => {
const result = resolveWritePath(PROJECT_ROOT, 'src/components/Button.tsx');
expect(result).toBe('/opt/projects/my-app/src/components/Button.tsx');
});
it('passes a non-existent nested file (no realpath)', () => {
// This is the key difference from BooChat's pathGuard: no realpath means
// files that don't exist yet still pass validation
const result = resolveWritePath(PROJECT_ROOT, 'src/new-dir/new-file.ts');
expect(result).toBe('/opt/projects/my-app/src/new-dir/new-file.ts');
});
it('throws on empty path', () => {
expect(() => resolveWritePath(PROJECT_ROOT, '')).toThrow(WriteGuardError);
expect(() => resolveWritePath(PROJECT_ROOT, '')).toThrow('file path is required');
});
it('normalizes ../ within project root and still allows', () => {
const result = resolveWritePath(PROJECT_ROOT, 'src/../lib/utils.ts');
expect(result).toBe('/opt/projects/my-app/lib/utils.ts');
});
it('rejects path that looks inside root but normalizes outside', () => {
expect(() => resolveWritePath(PROJECT_ROOT, 'src/../../other-project/hack.ts')).toThrow(WriteGuardError);
});
});
describe('isSecretPath', () => {
it('detects .env', () => {
expect(isSecretPath('.env')).toBe(true);
});
it('detects nested .env', () => {
expect(isSecretPath('config/.env')).toBe(true);
});
it('detects *.pfx', () => {
expect(isSecretPath('certs/client.pfx')).toBe(true);
});
it('does not flag normal source files', () => {
expect(isSecretPath('src/index.ts')).toBe(false);
expect(isSecretPath('README.md')).toBe(false);
expect(isSecretPath('package.json')).toBe(false);
});
it('returns false for empty string', () => {
expect(isSecretPath('')).toBe(false);
});
});

@@ -0,0 +1,193 @@
import { describe, it, expect } from 'vitest';
import { resolveWritePath } from '../write_guard.js';
const projectRoot = '/opt/testproject';
describe('write_guard fuzz — traversal attacks', () => {
// Basic traversal
it('rejects ../', () => {
expect(() => resolveWritePath(projectRoot, '../etc/passwd')).toThrow();
});
it('rejects ../../', () => {
expect(() => resolveWritePath(projectRoot, '../../etc/passwd')).toThrow();
});
it('rejects deeply nested ../../../', () => {
expect(() => resolveWritePath(projectRoot, '../../../../../../../etc/shadow')).toThrow();
});
// Encoded traversal: resolve() doesn't decode percent-encoding, so encoded
// dots and slashes stay literal filename characters and cannot escape the root.
it('keeps %2e%2e/ inside root (percent-encoding is not decoded)', () => {
// '%2e%2e' is an ordinary directory name to resolve(), so the result is
// /opt/testproject/%2e%2e/etc/passwd, which IS inside root. No traversal
// occurs, so the guard must not throw.
const result = resolveWritePath(projectRoot, '%2e%2e/etc/passwd');
expect(result).toContain(projectRoot);
});
it('rejects ../ followed by a literal %2f', () => {
// In '../%2fetc/passwd' the leading ../ IS real traversal; the %2f is just
// part of the next segment's name.
expect(() => resolveWritePath(projectRoot, '../%2fetc/passwd')).toThrow();
});
// Null byte injection
it('rejects null bytes', () => {
expect(() => resolveWritePath(projectRoot, 'file.txt\x00.jpg')).toThrow();
});
// Absolute path escape
it('rejects /etc/passwd', () => {
expect(() => resolveWritePath(projectRoot, '/etc/passwd')).toThrow();
});
it('rejects /opt/other-project/file', () => {
expect(() => resolveWritePath(projectRoot, '/opt/other-project/file.ts')).toThrow();
});
// Path that starts with project root as prefix but isn't under it
it('rejects prefix match without separator', () => {
expect(() => resolveWritePath(projectRoot, '/opt/testproject-evil/file.ts')).toThrow();
});
// Traversal after a valid absolute prefix
it('rejects /opt/testproject/../etc/passwd via double-dot after valid prefix', () => {
expect(() => resolveWritePath(projectRoot, '/opt/testproject/../etc/passwd')).toThrow();
});
// Windows-style separators (defense-in-depth on Linux)
it('keeps backslash "traversal" inside root on POSIX', () => {
// On POSIX, path.resolve splits on '/' only, and '..' is special only when it
// is a complete segment. '..\\etc\\passwd' contains no '/', so the whole
// string is one literal filename inside projectRoot. Not a traversal; the
// guard must not throw and the result stays within root.
const result = resolveWritePath(projectRoot, '..\\etc\\passwd');
expect(result).toContain(projectRoot);
});
// Secret files (deny list)
it('rejects .env', () => {
expect(() => resolveWritePath(projectRoot, '.env')).toThrow();
});
it('rejects nested .env', () => {
expect(() => resolveWritePath(projectRoot, 'config/.env')).toThrow();
});
it('rejects .env.local', () => {
expect(() => resolveWritePath(projectRoot, '.env.local')).toThrow();
});
it('rejects id_rsa', () => {
expect(() => resolveWritePath(projectRoot, '.ssh/id_rsa')).toThrow();
});
it('rejects id_ed25519', () => {
expect(() => resolveWritePath(projectRoot, '.ssh/id_ed25519')).toThrow();
});
it('rejects *.pem', () => {
expect(() => resolveWritePath(projectRoot, 'certs/server.pem')).toThrow();
});
it('rejects *.key', () => {
expect(() => resolveWritePath(projectRoot, 'certs/private.key')).toThrow();
});
it('rejects credentials.json', () => {
expect(() => resolveWritePath(projectRoot, 'credentials.json')).toThrow();
});
it('rejects *.p12', () => {
expect(() => resolveWritePath(projectRoot, 'certs/client.p12')).toThrow();
});
it('rejects .netrc', () => {
expect(() => resolveWritePath(projectRoot, '.netrc')).toThrow();
});
it('rejects *.kdbx', () => {
expect(() => resolveWritePath(projectRoot, 'secrets/passwords.kdbx')).toThrow();
});
// Valid paths (should NOT throw)
it('allows simple relative path', () => {
expect(resolveWritePath(projectRoot, 'src/index.ts')).toBe('/opt/testproject/src/index.ts');
});
it('allows nested path', () => {
expect(resolveWritePath(projectRoot, 'src/services/tools/edit_file.ts')).toContain(projectRoot);
});
it('allows dotfile that is not in deny list', () => {
expect(resolveWritePath(projectRoot, '.gitignore')).toContain(projectRoot);
});
it('allows absolute path inside project', () => {
expect(resolveWritePath(projectRoot, '/opt/testproject/new-file.ts')).toBe('/opt/testproject/new-file.ts');
});
it('allows path with safe internal ../', () => {
expect(resolveWritePath(projectRoot, 'src/../lib/utils.ts')).toBe('/opt/testproject/lib/utils.ts');
});
});
describe('write_guard fuzz — edge cases', () => {
it('throws on empty string', () => {
expect(() => resolveWritePath(projectRoot, '')).toThrow();
});
it('throws on whitespace-only', () => {
expect(() => resolveWritePath(projectRoot, ' ')).toThrow();
});
it('allows the project root itself (the write fails later on a directory)', () => {
// resolve() === projectRoot passes the containment check. This is acceptable
// because the filesystem write will fail on a directory. Blocking it in the
// guard would be a separate concern.
expect(() => resolveWritePath(projectRoot, '/opt/testproject')).not.toThrow();
});
it('handles very long path without crashing', () => {
const longSegment = 'a'.repeat(255);
const longPath = Array(20).fill(longSegment).join('/');
// The guard does no filesystem access (no realpath), so OS path-length
// limits do not apply here: the path must resolve without throwing.
expect(() => resolveWritePath(projectRoot, longPath)).not.toThrow();
});
it('handles a leading ./ segment', () => {
// './' is a no-op segment, so the path resolves relative to projectRoot.
const result = resolveWritePath(projectRoot, './src/file.ts');
expect(result).toBe('/opt/testproject/src/file.ts');
});
it('treats ... as a literal directory name, not traversal', () => {
// Only '..' as a complete segment is special; '.../etc' stays inside root.
const result = resolveWritePath(projectRoot, '.../etc');
expect(result).toContain(projectRoot);
});
it('normalizes multiple consecutive slashes', () => {
// resolve() collapses repeated slashes; the result is still inside root.
const result = resolveWritePath(projectRoot, 'src///file.ts');
expect(result).toBe('/opt/testproject/src/file.ts');
});
});
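
The "prefix match without separator" case above is easier to see next to the naive check it guards against. This is a rough sketch, assuming hypothetical helpers `isInsideRoot` and `naivePrefixCheck`; it is not the code in `write_guard.ts`, and it uses `path.posix` so the results do not depend on the host platform.

```typescript
import { posix } from 'node:path';

// Hypothetical helper: separator-bounded containment, as the tests describe it.
function isInsideRoot(root: string, candidate: string): boolean {
  const normalizedRoot = posix.resolve(root);
  const absolute = posix.resolve(normalizedRoot, candidate);
  return absolute === normalizedRoot || absolute.startsWith(normalizedRoot + posix.sep);
}

// Hypothetical counter-example: a bare prefix test with no separator bound.
function naivePrefixCheck(root: string, candidate: string): boolean {
  const normalizedRoot = posix.resolve(root);
  return posix.resolve(normalizedRoot, candidate).startsWith(normalizedRoot);
}

const root = '/opt/testproject';

// A sibling directory shares the string prefix but is not under the root.
const siblingNaive = naivePrefixCheck(root, '/opt/testproject-evil/file.ts');
const siblingBounded = isInsideRoot(root, '/opt/testproject-evil/file.ts');

// Internal ../ that stays inside the root is fine; a leading ../ escapes.
const internalDotDot = isInsideRoot(root, 'src/../lib/utils.ts');
const escaped = isInsideRoot(root, '../etc/passwd');

// Percent-encoded dots are never decoded, so they stay a literal directory name.
const encodedDots = isInsideRoot(root, '%2e%2e/etc/passwd');
```

Here the naive check accepts the sibling path while the bounded check rejects it, which is the difference the fuzz case exercises.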

@@ -0,0 +1,49 @@
import { promises as fs } from 'node:fs';
import { dirname, isAbsolute, resolve, sep } from 'node:path';
/**
* Resolve an ACP-supplied path against the agent worktree and reject anything
* that escapes it. Mirrors `write_guard.ts`'s check: `resolve()` to normalize
* `../` segments, then a **separator-bounded** prefix test — a bare
* `startsWith(root)` wrongly admits a sibling dir like `<root>-evil/...`.
*
* No realpath (consistent with `write_guard.ts`: the target may not exist yet on
* write). This is a containment guard for the ACP fs bridge, not a hard trust
* boundary — the agent process already runs with host FS access; symlink-swap
* hardening (`O_NOFOLLOW`/realpath) is out of scope here.
*/
function resolveInWorktree(worktreePath: string, filePath: string): string {
const root = resolve(worktreePath);
const absolute = isAbsolute(filePath) ? resolve(filePath) : resolve(root, filePath);
if (absolute !== root && !absolute.startsWith(root + sep)) {
throw new Error(`path escapes worktree: ${filePath}`);
}
return absolute;
}
/** Resolve an ACP path against the agent worktree and read a slice of lines. */
export async function readWorktreeTextFile(
worktreePath: string,
filePath: string,
line?: number | null,
limit?: number | null,
): Promise<string> {
const absolute = resolveInWorktree(worktreePath, filePath);
const raw = await fs.readFile(absolute, 'utf8');
if (!line && !limit) return raw;
const lines = raw.split(/\r?\n/);
const start = Math.max((line ?? 1) - 1, 0);
const end = limit ? start + limit : undefined;
return lines.slice(start, end).join('\n');
}
/** Write a file inside the worktree (creates parent dirs). */
export async function writeWorktreeTextFile(
worktreePath: string,
filePath: string,
content: string,
): Promise<void> {
const absolute = resolveInWorktree(worktreePath, filePath);
await fs.mkdir(dirname(absolute), { recursive: true });
await fs.writeFile(absolute, content, 'utf8');
}
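
The `line`/`limit` semantics of `readWorktreeTextFile` can be checked without touching the filesystem. The sketch below copies only the slicing step into a hypothetical pure function, `sliceLines`; the sample text is invented for illustration.

```typescript
// Hypothetical pure copy of the slicing step in readWorktreeTextFile (no file I/O).
function sliceLines(raw: string, line?: number | null, limit?: number | null): string {
  // No window requested: return the content untouched, line endings included.
  if (!line && !limit) return raw;
  const lines = raw.split(/\r?\n/);
  // `line` is 1-based; missing or non-positive values clamp to the first line.
  const start = Math.max((line ?? 1) - 1, 0);
  const end = limit ? start + limit : undefined;
  return lines.slice(start, end).join('\n');
}

// Invented sample with one CRLF and two LF line endings.
const sample = 'one\r\ntwo\nthree\nfour';

const middle = sliceLines(sample, 2, 2); // lines 2 and 3
const tail = sliceLines(sample, 3); // line 3 to the end
const head = sliceLines(sample, null, 1); // first line only
const whole = sliceLines(sample); // untouched, CRLF preserved
```

One consequence worth noting: a sliced read re-joins lines with `\n`, so CRLF endings are normalized only when `line` or `limit` is given.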

@@ -0,0 +1,128 @@
/**
* ACP model/mode derivation — adapted from Paseo acp-agent.ts.
*/
import type {
SessionConfigOption,
SessionModelState,
SessionModeState,
} from '@agentclientprotocol/sdk';
import type { ProviderMode, ProviderModel, ThinkingOption } from './provider-types.js';
type SelectConfigOption = Extract<SessionConfigOption, { type: 'select' }>;
interface SelectConfigChoice {
value: string;
name: string;
description?: string | null;
group?: string;
}
function findSelectConfigOption({
configOptions,
category,
id,
}: {
configOptions: SessionConfigOption[] | null | undefined;
category: string;
id?: string;
}): SelectConfigOption | null {
const option = configOptions?.find(
(entry): entry is SelectConfigOption =>
entry.type === 'select' && entry.category === category && (!id || entry.id === id),
);
return option ?? null;
}
function flattenSelectOptions(options: SelectConfigOption['options']): SelectConfigChoice[] {
const flattened: SelectConfigChoice[] = [];
for (const option of options) {
if ('value' in option) {
flattened.push(option);
continue;
}
for (const groupOption of option.options) {
flattened.push({ ...groupOption, group: option.group });
}
}
return flattened;
}
function deriveSelectorOptions(
configOptions: SessionConfigOption[] | null | undefined,
category: string,
): ThinkingOption[] {
const option = findSelectConfigOption({ configOptions, category });
if (!option) return [];
return flattenSelectOptions(option.options).map((choice) => ({
id: choice.value,
label: choice.name,
isDefault: choice.value === option.currentValue,
}));
}
export function deriveModesFromACP(
fallbackModes: ProviderMode[],
modeState?: SessionModeState | null,
configOptions?: SessionConfigOption[] | null,
): { modes: ProviderMode[]; currentModeId: string | null } {
if (modeState?.availableModes?.length) {
return {
modes: modeState.availableModes.map((mode) => ({
id: mode.id,
label: mode.name,
description: mode.description ?? undefined,
})),
currentModeId: modeState.currentModeId ?? null,
};
}
const modeOption = findSelectConfigOption({ configOptions, category: 'mode' });
if (modeOption) {
const flatOptions = flattenSelectOptions(modeOption.options);
return {
modes: flatOptions.map((option) => ({
id: option.value,
label: option.name,
description: option.description ?? undefined,
})),
currentModeId: modeOption.currentValue,
};
}
return { modes: fallbackModes, currentModeId: null };
}
export function deriveModelDefinitionsFromACP(
models: SessionModelState | null | undefined,
configOptions?: SessionConfigOption[] | null,
): ProviderModel[] {
const thinkingOptions = deriveSelectorOptions(configOptions, 'thought_level');
const defaultThinkingOptionId = thinkingOptions.find((o) => o.isDefault)?.id;
if (models?.availableModels?.length) {
return models.availableModels.map((model) => ({
id: model.modelId,
label: model.name,
description: model.description ?? undefined,
isDefault: model.modelId === models.currentModelId,
thinkingOptions: thinkingOptions.length > 0 ? thinkingOptions : undefined,
defaultThinkingOptionId: defaultThinkingOptionId ?? undefined,
}));
}
const modelOptions = deriveSelectorOptions(configOptions, 'model');
return modelOptions.map((option) => ({
id: option.id,
label: option.label,
isDefault: option.isDefault,
thinkingOptions: thinkingOptions.length > 0 ? thinkingOptions : undefined,
defaultThinkingOptionId: defaultThinkingOptionId ?? undefined,
}));
}
export function findThoughtLevelConfigId(
configOptions: SessionConfigOption[] | null | undefined,
): string | null {
return findSelectConfigOption({ configOptions, category: 'thought_level' })?.id ?? null;
}
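The flatten-then-map step used for the `thought_level` and `model` selectors can be sketched in isolation. This assumes simplified option shapes; `Choice`, `ChoiceGroup`, `flatten` and `toSelector` are hypothetical stand-ins, not the SDK's types.

```typescript
// Hypothetical, simplified shapes; the real ones come from @agentclientprotocol/sdk.
interface Choice { value: string; name: string; group?: string }
interface ChoiceGroup { group: string; options: Choice[] }

// Grouped options are flattened and each nested choice remembers its group.
function flatten(options: Array<Choice | ChoiceGroup>): Choice[] {
  const out: Choice[] = [];
  for (const option of options) {
    if ('value' in option) {
      out.push(option);
    } else {
      for (const nested of option.options) out.push({ ...nested, group: option.group });
    }
  }
  return out;
}

// The entry whose value equals the select's current value is flagged as default.
function toSelector(options: Array<Choice | ChoiceGroup>, currentValue: string) {
  return flatten(options).map((choice) => ({
    id: choice.value,
    label: choice.name,
    isDefault: choice.value === currentValue,
  }));
}

const thoughtLevels = toSelector(
  [
    { value: 'off', name: 'Off' },
    { group: 'extended', options: [{ value: 'low', name: 'Low' }, { value: 'high', name: 'High' }] },
  ],
  'low',
);
```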


@@ -0,0 +1,379 @@
/**
* ACP dispatch — runs ACP-capable agents directly on the host.
*
* v2.3: Paseo-aligned tool lifecycle — stable toolCallId, merge on
* tool_call_update, reasoning stream, worktree FS client, persist-ready snapshots.
*/
import type { FastifyBaseLogger } from 'fastify';
import {
ClientSideConnection,
type Client,
type SessionNotification,
type RequestPermissionRequest,
type RequestPermissionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type CreateElicitationRequest,
type CreateElicitationResponse,
type SessionConfigOption,
type ClientSideConnection as ConnectionType,
} from '@agentclientprotocol/sdk';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import { spawn } from 'node:child_process';
import { findThoughtLevelConfigId } from './acp-derive.js';
import { resolveLaunchSpec } from './acp-spawn.js';
import { getResolvedRegistry, type ResolvedProviderDef } from './provider-config-registry.js';
import { createAcpNdJsonStream } from './acp-stream.js';
import { waitForPermissionResponse, waitForElicitationResponse, cancelPendingPermission } from './permission-waiter.js';
import { mergeTaskCommands, getTaskCommands } from './agent-commands-cache.js';
import { readWorktreeTextFile, writeWorktreeTextFile } from './acp-client-fs.js';
import { mapSessionUpdate } from './acp-event-map.js';
import {
type AcpToolSnapshot,
snapshotToWireToolCall,
synthesizeCanceledSnapshots,
} from './acp-tool-snapshot.js';
export interface AcpDispatchResult {
exitCode: number;
output: string;
toolSnapshots: AcpToolSnapshot[];
reasoningText: string;
stopReason: string;
}
export interface AcpDispatchOpts {
agent: string;
task: string;
worktreePath: string;
model?: string;
modeId?: string;
thinkingOptionId?: string;
taskId?: string;
sessionId?: string;
chatId?: string;
messageId?: string;
broker?: Broker;
installPath?: string;
/** v2.3 phase 3: resolved registry def for launch-spec resolution. The
* dispatcher loads this by task.agent; falls back to a registry lookup here. */
resolved?: ResolvedProviderDef;
signal?: AbortSignal;
log: FastifyBaseLogger;
}
async function applySessionOverrides(
connection: ConnectionType,
acpSessionId: string,
configOptions: SessionConfigOption[] | null | undefined,
opts: Pick<AcpDispatchOpts, 'model' | 'modeId' | 'thinkingOptionId' | 'log'>,
): Promise<void> {
const { model, modeId, thinkingOptionId, log } = opts;
if (modeId) {
try {
await connection.setSessionMode({ sessionId: acpSessionId, modeId });
} catch (err) {
log.warn({ modeId, err: err instanceof Error ? err.message : String(err) }, 'acp-dispatch: setSessionMode failed');
}
}
if (model) {
try {
await connection.unstable_setSessionModel({ sessionId: acpSessionId, modelId: model });
} catch (err) {
log.warn({ model, err: err instanceof Error ? err.message : String(err) }, 'acp-dispatch: setSessionModel failed');
}
}
if (thinkingOptionId) {
const configId = findThoughtLevelConfigId(configOptions);
if (configId) {
try {
await connection.setSessionConfigOption({
sessionId: acpSessionId,
configId,
value: thinkingOptionId,
});
} catch (err) {
log.warn(
{ thinkingOptionId, err: err instanceof Error ? err.message : String(err) },
'acp-dispatch: setSessionConfigOption failed',
);
}
}
}
}
class AcpStreamContext {
readonly textChunks: string[] = [];
readonly reasoningChunks: string[] = [];
readonly toolSnapshots = new Map<string, AcpToolSnapshot>();
private aborted = false;
constructor(
private readonly opts: Pick<
AcpDispatchOpts,
'broker' | 'sessionId' | 'chatId' | 'messageId' | 'taskId'
>,
private readonly worktreePath: string,
) {}
get reasoningText(): string {
return this.reasoningChunks.join('');
}
get output(): string {
return this.textChunks.join('');
}
get snapshots(): AcpToolSnapshot[] {
return [...this.toolSnapshots.values()];
}
markAborted(): void {
this.aborted = true;
for (const snap of synthesizeCanceledSnapshots(this.toolSnapshots.values())) {
this.toolSnapshots.set(snap.toolCallId, snap);
this.publishToolSnapshot(snap);
}
}
private canStream(): boolean {
return !!(this.opts.broker && this.opts.sessionId && this.opts.chatId && this.opts.messageId);
}
private publishToolSnapshot(snapshot: AcpToolSnapshot): void {
if (!this.canStream()) return;
const wire = snapshotToWireToolCall(snapshot);
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'tool_call',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
tool_call: wire,
} as WsFrame);
}
async handleSessionUpdate(params: SessionNotification): Promise<void> {
// v2.6 Phase 2: the case-by-case mapping now lives in the shared, pure
// `mapSessionUpdate` (reused by the warm ACP backend). This method keeps the
// identical broker-publishing side effects — it just translates the normalized
// AgentEvents back into the same frames it always emitted. `this.toolSnapshots`
// is the merge accumulator, so a later tool_call_update merges over its
// tool_call (the prior `handleToolUpdate` behavior, byte-for-byte).
for (const event of mapSessionUpdate(params, this.toolSnapshots)) {
switch (event.type) {
case 'text':
this.textChunks.push(event.text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: event.text,
} as WsFrame);
}
break;
case 'reasoning':
this.reasoningChunks.push(event.text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'reasoning_delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: event.text,
} as WsFrame);
}
break;
case 'tool_call':
case 'tool_update':
// mapSessionUpdate already stored the merged snapshot in this.toolSnapshots.
this.publishToolSnapshot(event.toolCall);
break;
case 'commands':
if (this.opts.taskId && event.commands.length > 0) {
mergeTaskCommands(this.opts.taskId, event.commands);
if (this.canStream() && this.opts.sessionId) {
const all = getTaskCommands(this.opts.taskId) ?? event.commands;
this.opts.broker!.publishFrame(this.opts.sessionId, {
type: 'agent_commands',
task_id: this.opts.taskId,
session_id: this.opts.sessionId,
commands: all,
} as WsFrame);
}
}
break;
}
}
}
buildClient(agent: string, modeId: string | undefined, taskId: string | undefined, sessionId: string | undefined): Client {
return {
sessionUpdate: (params) => this.handleSessionUpdate(params),
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => {
if (taskId && sessionId) {
return waitForPermissionResponse(taskId, sessionId, agent, modeId, params);
}
const firstOption = params.options[0];
if (firstOption) {
return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
}
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(
this.worktreePath,
params.path,
params.line,
params.limit,
);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(this.worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
if (taskId && sessionId) {
return waitForElicitationResponse(taskId, sessionId, agent, modeId, params);
}
return { action: 'decline' };
},
};
}
}
export async function dispatchViaAcp(opts: AcpDispatchOpts): Promise<AcpDispatchResult> {
const {
agent,
task,
worktreePath,
installPath,
signal,
log,
taskId,
modeId,
sessionId,
chatId,
messageId,
broker,
} = opts;
// v2.3 phase 3: launch from the resolved registry def (config override /
// custom-ACP command) with the built-in switch as the fallback. The dispatcher
// passes `resolved`; fall back to a registry lookup if it didn't.
const resolved = opts.resolved ?? getResolvedRegistry().get(agent);
const spec = resolved ? resolveLaunchSpec(resolved, installPath ?? null) : null;
if (!spec) {
return {
exitCode: 1,
output: `Agent '${agent}' does not support ACP.`,
toolSnapshots: [],
reasoningText: '',
stopReason: 'error',
};
}
log.info({ agent, binary: spec.binary, worktreePath, modeId, model: opts.model }, 'acp-dispatch: spawning');
const child = spawn(spec.binary, spec.args, {
cwd: worktreePath,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, ...spec.env },
});
const streamCtx = new AcpStreamContext(
{ broker, sessionId, chatId, messageId, taskId },
worktreePath,
);
let killed = false;
const cleanup = () => {
if (!killed) {
killed = true;
streamCtx.markAborted();
child.kill('SIGTERM');
setTimeout(() => child.kill('SIGKILL'), 5_000);
}
if (taskId) cancelPendingPermission(taskId);
};
if (signal) {
if (signal.aborted) {
cleanup();
return {
exitCode: 130,
output: 'Aborted before start',
toolSnapshots: streamCtx.snapshots,
reasoningText: '',
stopReason: 'cancelled',
};
}
signal.addEventListener('abort', cleanup, { once: true });
}
try {
const stream = createAcpNdJsonStream(child);
const connection = new ClientSideConnection(
() => streamCtx.buildClient(agent, modeId, taskId, sessionId),
stream,
);
await connection.initialize({
protocolVersion: 1,
clientInfo: { name: 'boocoder', version: '2.3.0' },
clientCapabilities: {},
});
const acpSession = await connection.newSession({ cwd: worktreePath, mcpServers: [] });
log.info({ sessionId: acpSession.sessionId }, 'acp-dispatch: session created');
await applySessionOverrides(connection, acpSession.sessionId, acpSession.configOptions, opts);
const promptResult = await connection.prompt({
sessionId: acpSession.sessionId,
prompt: [{ type: 'text', text: task }],
});
const stopReason = promptResult.stopReason ?? 'end_turn';
log.info(
{ agent, stopReason, toolCallCount: streamCtx.snapshots.length, reasoningChars: streamCtx.reasoningText.length },
'acp-dispatch: prompt completed',
);
await connection.closeSession({ sessionId: acpSession.sessionId }).catch(() => {});
return {
exitCode: 0,
output: streamCtx.output,
toolSnapshots: streamCtx.snapshots,
reasoningText: streamCtx.reasoningText,
stopReason,
};
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
log.error({ agent, err: message }, 'acp-dispatch: error');
return {
exitCode: 1,
output: message,
toolSnapshots: streamCtx.snapshots,
reasoningText: streamCtx.reasoningText,
stopReason: 'error',
};
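} finally {
if (signal) signal.removeEventListener('abort', cleanup);
cleanup();
// Wait up to 3 s for the child to close. Skip the wait when it has already
// exited: 'close' would not fire again and the full 3 s would be wasted.
if (child.exitCode === null && child.signalCode === null) {
await new Promise<void>((resolve) => {
const timer = setTimeout(resolve, 3_000);
child.once('close', () => {
clearTimeout(timer);
resolve();
});
});
}
}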
}


@@ -0,0 +1,68 @@
/**
* Shared ACP session-update → normalized AgentEvent mapping.
*
* Extracted verbatim (v2.6 Phase 2) from `AcpStreamContext.handleSessionUpdate`
* in `acp-dispatch.ts` so the warm ACP backend (`backends/warm-acp.ts`) and the
* one-shot dispatch share ONE mapping. The one-shot path translates the returned
* events into broker frames itself (preserving its prior behavior byte-for-byte);
* the warm backend forwards them to the dispatcher's `ctx.onEvent` exactly like
* the opencode-server backend does. No I/O, no broker — pure, so it's unit-testable.
*
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2b.
*/
import type { SessionNotification } from '@agentclientprotocol/sdk';
import type { AgentEvent } from './agent-backend.js';
import { type AcpToolSnapshot, mergeToolSnapshot } from './acp-tool-snapshot.js';
/**
* Map one ACP `session/update` notification to zero-or-more normalized AgentEvents.
*
* `priorSnapshots` is the caller-owned tool-call snapshot accumulator (toolCallId →
* snapshot). For `tool_call` / `tool_call_update` the merged snapshot is written
* back into it (mutated in place, as the former `AcpStreamContext.handleToolUpdate`
* did) so a later `tool_call_update` merges over the earlier `tool_call`. Omit it
* (it defaults to a fresh Map) for a stateless single call.
*
* Returns an array (never throws) so the caller can splat it onto `onEvent`.
*/
export function mapSessionUpdate(
params: SessionNotification,
priorSnapshots: Map<string, AcpToolSnapshot> = new Map(),
): AgentEvent[] {
const update = params.update;
switch (update.sessionUpdate) {
case 'agent_message_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
return [{ type: 'text', text: (content as { text: string }).text }];
}
return [];
}
case 'agent_thought_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
return [{ type: 'reasoning', text: (content as { text: string }).text }];
}
return [];
}
case 'tool_call': {
const snapshot = mergeToolSnapshot(update.toolCallId, update, priorSnapshots.get(update.toolCallId));
priorSnapshots.set(update.toolCallId, snapshot);
return [{ type: 'tool_call', toolCall: snapshot }];
}
case 'tool_call_update': {
const snapshot = mergeToolSnapshot(update.toolCallId, update, priorSnapshots.get(update.toolCallId));
priorSnapshots.set(update.toolCallId, snapshot);
return [{ type: 'tool_update', toolCall: snapshot }];
}
case 'available_commands_update': {
const commands = update.availableCommands.map((cmd) => ({
name: cmd.name,
description: cmd.description ?? undefined,
}));
return [{ type: 'commands', commands }];
}
default:
return [];
}
}


@@ -0,0 +1,166 @@
/**
* Short-lived ACP probe — opens a session and reads models/modes from the response.
*/
import { spawn } from 'node:child_process';
import {
ClientSideConnection,
type Client,
type NewSessionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type RequestPermissionRequest,
type RequestPermissionResponse,
} from '@agentclientprotocol/sdk';
import { deriveModesFromACP, deriveModelDefinitionsFromACP } from './acp-derive.js';
import { getManifestDefaultModeId, getManifestModes } from './provider-manifest.js';
import { resolveAcpSpawnArgs } from './acp-spawn.js';
import { createAcpNdJsonStream } from './acp-stream.js';
import type { ProviderModel, ProviderMode } from './provider-types.js';
import type { AgentCommand } from './agent-commands-cache.js';
const PROBE_TIMEOUT_MS = 30_000;
export interface AcpProbeResult {
ok: boolean;
models: ProviderModel[];
modes: ProviderMode[];
defaultModeId: string | null;
commands: AgentCommand[];
error?: string;
}
function parseSessionResponse(session: NewSessionResponse, agent: string): AcpProbeResult {
const fallbackModes = getManifestModes(agent);
const { modes, currentModeId } = deriveModesFromACP(
fallbackModes,
session.modes,
session.configOptions,
);
const models = deriveModelDefinitionsFromACP(session.models, session.configOptions);
return {
ok: true,
models,
modes,
defaultModeId: currentModeId ?? getManifestDefaultModeId(agent),
commands: [],
};
}
export async function probeAcpProvider(
agent: string,
installPath: string,
cwd: string,
): Promise<AcpProbeResult> {
const args = resolveAcpSpawnArgs(agent);
if (!args) {
return {
ok: false,
models: [],
modes: getManifestModes(agent),
defaultModeId: getManifestDefaultModeId(agent),
commands: [],
error: 'no ACP spawn args',
};
}
const child = spawn(installPath, args, {
cwd,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env },
});
let killed = false;
const kill = () => {
if (!killed) {
killed = true;
child.kill('SIGTERM');
setTimeout(() => child.kill('SIGKILL'), 2_000);
}
};
const timeout = setTimeout(kill, PROBE_TIMEOUT_MS);
const probedCommands: AgentCommand[] = [];
try {
const stream = createAcpNdJsonStream(child);
const connection = new ClientSideConnection(
(_agentInterface): Client => ({
async sessionUpdate(params) {
const update = params.update;
if (update.sessionUpdate === 'available_commands_update') {
for (const cmd of update.availableCommands) {
probedCommands.push({
name: cmd.name,
description: cmd.description ?? undefined,
});
}
}
},
async requestPermission(params: RequestPermissionRequest): Promise<RequestPermissionResponse> {
const first = params.options[0];
if (first) {
return { outcome: { outcome: 'selected', optionId: first.optionId } };
}
return { outcome: { outcome: 'cancelled' } };
},
async readTextFile(_params: ReadTextFileRequest): Promise<ReadTextFileResponse> {
return { content: '' };
},
async writeTextFile(_params: WriteTextFileRequest): Promise<WriteTextFileResponse> {
return {};
},
async createTerminal(_params: CreateTerminalRequest): Promise<CreateTerminalResponse> {
return { terminalId: 'noop' };
},
}),
stream,
);
await connection.initialize({
protocolVersion: 1,
clientInfo: { name: 'boocoder-probe', version: '2.2.0' },
clientCapabilities: {},
});
const session = await connection.newSession({ cwd, mcpServers: [] });
// available_commands_update is an async session notification opencode sends
// shortly AFTER newSession resolves — reading probedCommands synchronously
// here races it and captures nothing. Wait briefly for the first batch, then
// a short settle for any stragglers (capped well under PROBE_TIMEOUT_MS).
const deadline = Date.now() + 3_000;
while (probedCommands.length === 0 && Date.now() < deadline) {
await new Promise((r) => setTimeout(r, 150));
}
if (probedCommands.length > 0) {
await new Promise((r) => setTimeout(r, 300));
}
const result = parseSessionResponse(session, agent);
result.commands = probedCommands;
await connection.closeSession({ sessionId: session.sessionId }).catch(() => {});
return result;
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
return {
ok: false,
models: [],
modes: getManifestModes(agent),
defaultModeId: getManifestDefaultModeId(agent),
commands: probedCommands,
error: message,
};
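} finally {
clearTimeout(timeout);
kill();
// Wait up to 2 s for the child to close. Skip the wait when it has already
// exited: 'close' would not fire again and the full 2 s would be wasted.
if (child.exitCode === null && child.signalCode === null) {
await new Promise<void>((resolve) => {
const timer = setTimeout(resolve, 2_000);
child.once('close', () => {
clearTimeout(timer);
resolve();
});
});
}
}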
}


@@ -0,0 +1,50 @@
import type { ResolvedProviderDef } from './provider-config-registry.js';
/**
* Resolve ACP spawn argv per built-in provider (host-probe verified 2026-05-25).
* Source of truth for built-in default argv — resolveLaunchSpec wraps these; it
* does NOT replace them.
*/
export function resolveAcpSpawnArgs(agent: string): string[] | null {
switch (agent) {
case 'opencode':
case 'goose':
return ['acp'];
case 'qwen':
return ['--acp'];
default:
return null;
}
}
/**
* v2.3 phase 3: resolve the launch spec for an ACP dispatch (design.md §5.1).
* Consults the resolved registry's launchCommand (config override or custom-ACP
* entry) first; otherwise falls back to the built-in default argv above.
*
* Byte-identical to pre-v2.3 for built-ins with no override: binary is
* `installPath ?? id` and args come from resolveAcpSpawnArgs — exactly the
* `binary = installPath ?? agent` + `resolveAcpSpawnArgs(agent)` the dispatcher
* used before. (Deliberate deviation from design §5.1's `!installPath → null`:
* the old path spawned the bare agent name when install_path was missing, so we
* preserve the `?? id` fallback rather than fail.)
*/
export function resolveLaunchSpec(
resolved: ResolvedProviderDef,
installPath: string | null,
): { binary: string; args: string[]; env?: Record<string, string> } | null {
if (resolved.launchCommand) {
return {
binary: resolved.launchCommand[0],
args: resolved.launchCommand.slice(1),
env: resolved.env,
};
}
const args = resolveAcpSpawnArgs(resolved.id);
if (!args) return null;
return { binary: installPath ?? resolved.id, args, env: resolved.env };
}
export function resolveAcpProbeBinaries(agent: string): string[] {
return [agent];
}
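The precedence that `resolveLaunchSpec` describes (override command first, built-in argv second, `null` otherwise) can be sketched as follows. This is a sketch under assumptions: `ProviderDef` is a trimmed-down, hypothetical version of `ResolvedProviderDef`, and unlike the original an empty `launchCommand` array here falls through to the built-in defaults.

```typescript
// Hypothetical, trimmed-down ResolvedProviderDef; the real type lives in provider-config-registry.
interface ProviderDef { id: string; launchCommand?: string[]; env?: Record<string, string> }
interface LaunchSpec { binary: string; args: string[]; env?: Record<string, string> }

// Built-in default argv, copied from resolveAcpSpawnArgs above.
const builtinArgs = new Map<string, string[]>([
  ['opencode', ['acp']],
  ['goose', ['acp']],
  ['qwen', ['--acp']],
]);

function launchSpec(def: ProviderDef, installPath: string | null): LaunchSpec | null {
  const [binary, ...args] = def.launchCommand ?? [];
  if (binary !== undefined) return { binary, args, env: def.env }; // override wins
  const defaults = builtinArgs.get(def.id);
  if (!defaults) return null; // not an ACP-capable built-in
  // A missing install path falls back to the bare agent name on PATH.
  return { binary: installPath ?? def.id, args: defaults, env: def.env };
}

const installed = launchSpec({ id: 'qwen' }, '/usr/local/bin/qwen');
const onPath = launchSpec({ id: 'goose' }, null);
const custom = launchSpec({ id: 'custom', launchCommand: ['my-agent', '--stdio'] }, null);
const unsupported = launchSpec({ id: 'unknown' }, null);
```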


@@ -0,0 +1,44 @@
import { Readable, Writable } from 'node:stream';
import type { ChildProcess } from 'node:child_process';
import { ndJsonStream } from '@agentclientprotocol/sdk';
export function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableStream<Uint8Array> {
return new ReadableStream<Uint8Array>({
start(controller) {
nodeStream.on('data', (chunk: Buffer) => controller.enqueue(new Uint8Array(chunk)));
nodeStream.on('end', () => controller.close());
nodeStream.on('error', (err) => controller.error(err));
},
cancel() {
if ('destroy' in nodeStream && typeof (nodeStream as Readable).destroy === 'function') {
(nodeStream as Readable).destroy();
}
},
});
}
export function nodeWritableToWeb(nodeStream: NodeJS.WritableStream): WritableStream<Uint8Array> {
return new WritableStream<Uint8Array>({
write(chunk) {
return new Promise<void>((resolve, reject) => {
const ok = (nodeStream as Writable).write(chunk, (err) => {
if (err) reject(err);
});
if (ok) resolve();
else (nodeStream as Writable).once('drain', resolve);
});
},
close() {
return new Promise<void>((resolve) => {
(nodeStream as Writable).end(resolve);
});
},
abort() {
(nodeStream as Writable).destroy();
},
});
}
export function createAcpNdJsonStream(child: ChildProcess) {
return ndJsonStream(nodeWritableToWeb(child.stdin!), nodeReadableToWeb(child.stdout!));
}


@@ -0,0 +1,120 @@
/**
* ACP tool snapshot merge + wire mapping — lifted from Paseo acp-agent.ts patterns.
* Stable toolCallId, merge on tool_call_update, status lifecycle for UI + DB.
*/
import type { ToolCall, ToolCallUpdate, ToolCallStatus, ToolKind } from '@agentclientprotocol/sdk';
export type AcpToolLifecycleStatus = 'running' | 'completed' | 'failed' | 'canceled';
export interface AcpToolSnapshot {
toolCallId: string;
title: string;
kind?: ToolKind | null;
status?: ToolCallStatus | null;
rawInput?: unknown;
rawOutput?: unknown;
}
export interface AcpWireMeta {
status: AcpToolLifecycleStatus;
kind?: string | null;
title?: string;
output?: unknown;
error?: string;
}
function coalesceDefined<T>(next: T | null | undefined, previous: T | null | undefined, fallback: T | null): T | null {
if (next !== undefined && next !== null) return next;
if (previous !== undefined && previous !== null) return previous;
return fallback;
}
export function mergeToolSnapshot(
toolCallId: string,
update: ToolCall | ToolCallUpdate,
previous?: AcpToolSnapshot,
): AcpToolSnapshot {
return {
toolCallId,
title: update.title ?? previous?.title ?? toolCallId,
kind: update.kind ?? previous?.kind ?? null,
status: update.status ?? previous?.status ?? null,
rawInput: update.rawInput !== undefined ? update.rawInput : previous?.rawInput,
rawOutput: update.rawOutput !== undefined ? update.rawOutput : previous?.rawOutput,
};
}
export function mapToolLifecycleStatus(
status: ToolCallStatus | null | undefined,
rawOutput?: unknown,
): AcpToolLifecycleStatus {
if (rawOutput === 'canceled') return 'canceled';
switch (status) {
case 'completed':
return 'completed';
case 'failed':
return 'failed';
case 'pending':
case 'in_progress':
default:
return 'running';
}
}
function readErrorMessage(rawOutput: unknown): string | undefined {
if (typeof rawOutput === 'string' && rawOutput.trim()) return rawOutput;
if (rawOutput && typeof rawOutput === 'object' && !Array.isArray(rawOutput)) {
const rec = rawOutput as Record<string, unknown>;
const msg = rec.message ?? rec.error ?? rec.reason;
if (typeof msg === 'string' && msg.trim()) return msg;
}
return undefined;
}
function asRecord(value: unknown): Record<string, unknown> {
if (value && typeof value === 'object' && !Array.isArray(value)) {
return value as Record<string, unknown>;
}
return {};
}
export function snapshotToWireToolCall(snapshot: AcpToolSnapshot): {
id: string;
name: string;
args: Record<string, unknown>;
} {
const lifecycle = mapToolLifecycleStatus(snapshot.status, snapshot.rawOutput);
const input = asRecord(snapshot.rawInput);
const error = lifecycle === 'failed' ? readErrorMessage(snapshot.rawOutput) : undefined;
const meta: AcpWireMeta = {
status: lifecycle,
kind: snapshot.kind ?? null,
title: snapshot.title,
...(snapshot.rawOutput !== undefined ? { output: snapshot.rawOutput } : {}),
...(error ? { error } : {}),
};
return {
id: snapshot.toolCallId,
name: String(snapshot.kind ?? snapshot.title),
args: { ...input, _acp: meta },
};
}
export function snapshotToPartPayload(snapshot: AcpToolSnapshot): {
id: string;
name: string;
args: Record<string, unknown>;
} {
const wire = snapshotToWireToolCall(snapshot);
return { id: wire.id, name: wire.name, args: wire.args };
}
export function synthesizeCanceledSnapshots(snapshots: Iterable<AcpToolSnapshot>): AcpToolSnapshot[] {
const out: AcpToolSnapshot[] = [];
for (const snapshot of snapshots) {
if (mapToolLifecycleStatus(snapshot.status) === 'running') {
out.push({ ...snapshot, status: 'failed', rawOutput: snapshot.rawOutput ?? 'canceled' });
}
}
return out;
}
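The merge-on-update lifecycle can be illustrated with a reduced snapshot shape. This is a sketch, assuming the same coalescing rules as `mergeToolSnapshot` and `mapToolLifecycleStatus`; `Snapshot`, `Update`, `merge`, `lifecycle` and `apply` are hypothetical names.

```typescript
// Hypothetical, reduced snapshot shape (the real AcpToolSnapshot also carries kind and rawInput).
type Status = 'pending' | 'in_progress' | 'completed' | 'failed';
interface Snapshot { toolCallId: string; title: string; status: Status | null; rawOutput?: unknown }
type Update = { title?: string | null; status?: Status | null; rawOutput?: unknown };

// Fields present on the update win; missing ones keep the previous value.
function merge(id: string, update: Update, previous?: Snapshot): Snapshot {
  return {
    toolCallId: id,
    title: update.title ?? previous?.title ?? id,
    status: update.status ?? previous?.status ?? null,
    rawOutput: update.rawOutput !== undefined ? update.rawOutput : previous?.rawOutput,
  };
}

function lifecycle(snapshot: Snapshot): 'running' | 'completed' | 'failed' | 'canceled' {
  if (snapshot.rawOutput === 'canceled') return 'canceled';
  if (snapshot.status === 'completed' || snapshot.status === 'failed') return snapshot.status;
  return 'running'; // pending, in_progress, or unknown
}

const snapshots = new Map<string, Snapshot>();
const apply = (id: string, update: Update): Snapshot => {
  const merged = merge(id, update, snapshots.get(id));
  snapshots.set(id, merged);
  return merged;
};

const started = apply('call-1', { title: 'Read file', status: 'in_progress' });
const finished = apply('call-1', { status: 'completed', rawOutput: { bytes: 12 } });
// Same marker synthesizeCanceledSnapshots writes for a tool still running at abort time.
const canceled: Snapshot = { ...started, status: 'failed', rawOutput: started.rawOutput ?? 'canceled' };
```

One consequence of the `?? 'canceled'` fallback: a running tool that already carries a `rawOutput` keeps it, so it maps to `failed` rather than `canceled` after an abort.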


@@ -0,0 +1,119 @@
/**
* v2.6 — AgentBackend abstraction (Phase 0 scaffold; types only, zero runtime logic).
*
* The core abstraction for persistent agent sessions. Two implementations land
* later: `OpenCodeServerBackend` (Phase 1, opencode HTTP server) and
* `WarmAcpBackend` (Phase 2, long-lived ACP process). Backends emit
* transport-agnostic `AgentEvent`s; the dispatcher maps them to WS frames.
*
* Type-only module: importers such as acp-event-map.ts use `import type`, so it must compile standalone.
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2.
*/
import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
import type { AgentCommand } from './provider-types.js';
/** Backend transport kind. Mirrors `agent_sessions.backend` CHECK in schema.sql. */
export type AgentBackendKind = 'opencode_server' | 'acp_warm';
/**
* Normalized, transport-agnostic events a backend emits during a turn (§2).
* Derived from acp-dispatch's session-update handling, but WITHOUT the WS
* envelope (message_id/chat_id) — the dispatcher owns frame mapping.
*
* `tool_call` vs `tool_update` are kept distinct on purpose: acp-dispatch
* currently merges both into one snapshot frame, but opencode's SSE
* distinguishes tool-start from tool-result, so the contract carries both.
* `commands` mirrors the ACP `available_commands_update` path (v2.5.10).
*/
export type AgentEvent =
| { type: 'text'; text: string }
| { type: 'reasoning'; text: string }
| { type: 'tool_call'; toolCall: AcpToolSnapshot }
| { type: 'tool_update'; toolCall: AcpToolSnapshot }
| { type: 'commands'; commands: AgentCommand[] };
/** Params to establish (or look up) a backend session (§2). */
export interface EnsureSessionOpts {
agent: string;
/** Resolved model id. */
model: string;
/** P1.5-b: the chat (tab) this turn belongs to. agent_sessions is keyed
* (chat_id, agent) — the tab/chat is the context unit. Always non-null:
* the dispatcher creates a chat for session-less tasks before calling. */
chatId: string;
/** Shared per-session worktree (one per `sessions.id`, not per pane). */
worktreePath: string;
/** P1.5-b: the `worktrees.id` for this session's worktree — stored on the
* agent_sessions row informationally (NOT the key). */
worktreeId: string;
projectId: string;
}
/** Opaque handle to a live backend session, persisted to `agent_sessions` (§2). */
export interface AgentSessionHandle {
sessionId: string;
agent: string;
backend: AgentBackendKind;
/** P1.5-b: the chat (tab) this session is keyed on (with agent). */
chatId: string;
/** P1.5-b: the worktree this session's chat runs in (informational link). */
worktreeId: string;
/** Provider's own session id (resume token); null until the backend assigns one. */
agentSessionId: string | null;
/** opencode HTTP server port; null for ACP backends. */
serverPort: number | null;
}
/** Per-turn context passed to `prompt` (§2). */
export interface PromptCtx {
worktreePath: string;
model: string;
signal: AbortSignal;
onEvent: (e: AgentEvent) => void;
/** Phase 2: per-turn task id, so a warm ACP backend can route permission /
* elicitation prompts back to the UI via the permission-waiter. Optional —
* the opencode-server backend (autonomous) ignores it. */
taskId?: string;
/** Phase 2: per-turn mode id (gates autonomous mode in the permission-waiter). */
modeId?: string;
}
/** Result of a completed turn (§2). Diff/persist happen outside the backend. */
export interface TurnResult {
ok: boolean;
error?: string;
}
/**
* The core backend abstraction (§2). Implementations: OpenCodeServerBackend
* (Phase 1), WarmAcpBackend (Phase 2).
*/
export interface AgentBackend {
/** Lazy: spawn server / warm process if not already up for this (session, agent). §2 */
ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle>;
/** Send a prompt; stream events via ctx.onEvent; resolves when the turn completes. §2 */
prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult>;
/** Graceful teardown of one session (session close or idle timeout). §2 */
closeSession(handle: AgentSessionHandle): Promise<void>;
/** Full teardown — kills all spawned servers/processes. §2 */
dispose(): Promise<void>;
/** Liveness for health endpoint + dispatcher fallback decision. §2 */
health(): 'up' | 'down';
/**
* v2.6 Phase 3: true iff a turn is in flight on this backend. The pool's idle
* eviction + LRU cap NEVER evict a busy backend (design §6 busy rule); the
* health-monitor defers a restart while busy (stale-grace). Optional so the
* Phase-0 scaffold and any test double stay compatible — absent ⇒ treated as
* not busy. opencode-server (multi-session) is busy iff ANY session has an
* active turn; warm-acp (single session) iff its one slot is active.
*/
isBusy?(): boolean;
/**
* v2.6 Phase 3: optional proactive health probe + busy-aware self-restart, run
* by the pool's periodic sweep. The opencode-server backend implements it
* (detects a hung-but-not-exited server and restarts when non-busy). Backends
* with no long-lived shared process (warm-ACP recovers lazily on its own child
* exit) can omit it. Must never throw; the sweep catches and logs any rejection.
*/
tickHealth?(now?: number): Promise<void>;
}
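
The optional `isBusy` contract above ("absent ⇒ treated as not busy") can be sketched with a small test double. This is only an illustration: `BusyAwareBackend`, `FakeBackend`, and `readBusy` are hypothetical names that do not appear in the diff, and the interface is a trimmed local stand-in rather than the real `AgentBackend`.

```typescript
// Trimmed, local stand-in for the AgentBackend members a pool would read.
// Only the member names come from the interface above; the rest is illustrative.
interface BusyAwareBackend {
  dispose(): Promise<void>;
  health(): 'up' | 'down';
  isBusy?(): boolean;
}

/** Hypothetical test double that counts in-flight turns. */
class FakeBackend implements BusyAwareBackend {
  private activeTurns = 0;
  disposed = false;
  beginTurn(): void {
    this.activeTurns++;
  }
  endTurn(): void {
    this.activeTurns = Math.max(0, this.activeTurns - 1);
  }
  isBusy(): boolean {
    return this.activeTurns > 0;
  }
  health(): 'up' | 'down' {
    return this.disposed ? 'down' : 'up';
  }
  async dispose(): Promise<void> {
    this.disposed = true;
  }
}

/** A backend without isBusy is treated as "not busy", as the doc comment specifies. */
function readBusy(b: BusyAwareBackend): boolean {
  return b.isBusy?.() ?? false;
}

// A scaffold-style backend that omits isBusy entirely.
const legacyBackend: BusyAwareBackend = { dispose: async () => {}, health: () => 'up' };
const fakeBackend = new FakeBackend();
fakeBackend.beginTurn();
console.log(readBusy(legacyBackend), readBusy(fakeBackend)); // false true
```

Keeping the member optional lets older doubles compile unchanged, at the cost of every caller needing the `?.() ?? false` guard.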

@@ -0,0 +1,28 @@
/** In-memory cache of ACP available_commands_update per task. */
import type { AgentCommand } from './provider-types.js';
import { mergeCommands } from './provider-commands.js';
export type { AgentCommand };
const commandsByTask = new Map<string, AgentCommand[]>();
export function setTaskCommands(taskId: string, commands: AgentCommand[]): void {
if (commands.length === 0) return;
commandsByTask.set(taskId, commands);
}
/** Merge by command name; later lists override earlier entries. */
export function mergeTaskCommands(taskId: string, commands: AgentCommand[]): void {
if (commands.length === 0) return;
const merged = mergeCommands(commandsByTask.get(taskId) ?? [], commands);
commandsByTask.set(taskId, merged);
}
export function getTaskCommands(taskId: string): AgentCommand[] | null {
return commandsByTask.get(taskId) ?? null;
}
export function clearTaskCommands(taskId: string): void {
commandsByTask.delete(taskId);
}
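
`mergeCommands` is imported from `provider-commands.js`, which is not part of this diff. A minimal sketch of a merge that matches the comment "Merge by command name; later lists override earlier entries" might look like the following. The `AgentCommand` shape (`name`, optional `description`) and the ordering behavior are assumptions made for illustration only.

```typescript
// Assumed shape: the real AgentCommand type lives in provider-types.js.
interface AgentCommand {
  name: string;
  description?: string;
}

/**
 * Hypothetical merge-by-name. A Map keeps first-seen insertion order, so an
 * overriding entry replaces the value but keeps the original position.
 */
function mergeCommands(base: AgentCommand[], incoming: AgentCommand[]): AgentCommand[] {
  const byName = new Map<string, AgentCommand>();
  for (const cmd of [...base, ...incoming]) byName.set(cmd.name, cmd);
  return [...byName.values()];
}

const mergedCommands = mergeCommands(
  [{ name: 'plan', description: 'old' }, { name: 'test' }],
  [{ name: 'plan', description: 'new' }, { name: 'review' }],
);
console.log(mergedCommands.map((c) => `${c.name}:${c.description ?? ''}`).join(','));
// plan:new,test:,review:
```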

@@ -0,0 +1,246 @@
/**
* v2.6 — AgentPool.
*
* Lazy get-or-create registry of `AgentBackend` instances keyed by
* `${primary}:${agent}` (primary = chatId for warm-ACP, a fixed sentinel for the
* single shared opencode server). Phase 0 shipped the skeleton (Map + health +
* dispose). Phase 3 adds the LIFECYCLE: per-entry idle tracking, a periodic
* idle-TTL + LRU-cap sweep (the pure decisions live in
* `backends/lifecycle-decisions.ts`), and a `closeChat` helper for the chat-close
* hook. Reattach after eviction is implicit — the next turn's `ensureSession`
* rebuilds the backend from `agent_sessions` / `worktrees` (DB is the source of
* truth; the in-memory pool is a warm cache).
*
* The hard rule (design §6): NEVER evict a busy backend (one with an in-flight
* turn). `selectIdleEvictionTargets` / `selectLruEvictionTargets` enforce it via
* `backend.isBusy()`; a long turn that outlives the TTL is left alone.
*
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2 / §6.
*/
import type { FastifyBaseLogger } from 'fastify';
import type { AgentBackend } from './agent-backend.js';
import {
selectIdleEvictionTargets,
selectLruEvictionTargets,
DEFAULT_IDLE_TTL_MS,
DEFAULT_MAX_LIVE_BACKENDS,
} from './backends/lifecycle-decisions.js';
interface PoolEntry {
primary: string;
agent: string;
backend: AgentBackend;
/** Epoch ms of the last activity (register, get, or touch). Drives idle/LRU. */
lastActiveAt: number;
}
export interface AgentPoolOpts {
/** Idle TTL before a non-busy backend is evicted. Default 30 min. */
idleTtlMs?: number;
/** Max live backends before the LRU cap evicts the least-recently-used. Default 10. */
maxLive?: number;
/** Sweep cadence. Default 60s (mirrors the server's periodic sweeper). */
sweepIntervalMs?: number;
log?: FastifyBaseLogger;
}
const DEFAULT_SWEEP_INTERVAL_MS = 60_000;
export class AgentPool {
private readonly backends = new Map<string, PoolEntry>();
private idleTtlMs: number;
private maxLive: number;
private sweepIntervalMs: number;
private log: FastifyBaseLogger | undefined;
private sweepTimer: ReturnType<typeof setInterval> | null = null;
/** Re-entrancy guard: a tick that fires while a sweep is still running is skipped, so a slow eviction can't overlap the next tick. */
private sweeping = false;
constructor(opts: AgentPoolOpts = {}) {
this.idleTtlMs = opts.idleTtlMs ?? DEFAULT_IDLE_TTL_MS;
this.maxLive = opts.maxLive ?? DEFAULT_MAX_LIVE_BACKENDS;
this.sweepIntervalMs = opts.sweepIntervalMs ?? DEFAULT_SWEEP_INTERVAL_MS;
this.log = opts.log;
}
/** Apply env-derived knobs to the module singleton at bootstrap (before
* startReaper). Only overrides explicitly-provided fields. */
configure(opts: AgentPoolOpts): void {
if (opts.idleTtlMs != null) this.idleTtlMs = opts.idleTtlMs;
if (opts.maxLive != null) this.maxLive = opts.maxLive;
if (opts.sweepIntervalMs != null) this.sweepIntervalMs = opts.sweepIntervalMs;
if (opts.log) this.log = opts.log;
}
private key(primary: string, agent: string): string {
return `${primary}:${agent}`;
}
/** Map lookup only. Spawning happens in the dispatcher (Phase 1/2). A hit also
* marks the entry recently-active so a resolve-without-prompt doesn't get it
* evicted out from under an imminent turn. */
get(primary: string, agent: string): AgentBackend | undefined {
const entry = this.backends.get(this.key(primary, agent));
if (entry) entry.lastActiveAt = Date.now();
return entry?.backend;
}
/** Store a backend instance for this (primary, agent). */
register(primary: string, agent: string, backend: AgentBackend): void {
this.backends.set(this.key(primary, agent), { primary, agent, backend, lastActiveAt: Date.now() });
}
/** Mark a backend recently-active (call at turn start AND settle so a long turn
* keeps its slot warm). No-op if the key isn't pooled. */
touch(primary: string, agent: string): void {
const entry = this.backends.get(this.key(primary, agent));
if (entry) entry.lastActiveAt = Date.now();
}
/** Snapshot for the decision helpers (busy is read live from the backend). */
private snapshots(): { key: string; lastActiveAt: number; busy: boolean }[] {
const out: { key: string; lastActiveAt: number; busy: boolean }[] = [];
for (const [key, e] of this.backends) {
out.push({ key, lastActiveAt: e.lastActiveAt, busy: e.backend.isBusy?.() ?? false });
}
return out;
}
/** Summary for the health endpoint. */
health(): { size: number; busy: number } {
let busy = 0;
for (const e of this.backends.values()) if (e.backend.isBusy?.()) busy++;
return { size: this.backends.size, busy };
}
// ─── Phase 3: idle-TTL + LRU eviction sweep ──────────────────────────────────
/** Start the periodic idle + LRU sweep. Idempotent; unref'd so it never holds
* the process open on its own. */
startReaper(log?: FastifyBaseLogger): void {
if (log) this.log = log;
if (this.sweepTimer) return;
this.sweepTimer = setInterval(() => {
void this.sweep().catch((err) => {
this.log?.warn({ err: errMsg(err) }, 'agent-pool: sweep error');
});
}, this.sweepIntervalMs);
this.sweepTimer.unref?.();
}
stopReaper(): void {
if (this.sweepTimer) {
clearInterval(this.sweepTimer);
this.sweepTimer = null;
}
}
/**
* One sweep pass: evict idle-past-TTL backends, then enforce the LRU cap.
* Deduped (a key can't appear in both lists for one pass). Busy backends are
* excluded by the decision helpers — a live turn is never torn down.
*/
async sweep(now: number = Date.now()): Promise<{ evicted: string[] }> {
if (this.sweeping) return { evicted: [] };
this.sweeping = true;
try {
// Phase 3: drive each backend's optional proactive health probe first (the
// opencode server's busy-aware hung-detect + self-restart). Best-effort —
// a probe must never fail the sweep.
for (const e of this.backends.values()) {
if (e.backend.tickHealth) {
await e.backend.tickHealth(now).catch((err) => {
this.log?.warn({ key: this.key(e.primary, e.agent), err: errMsg(err) }, 'agent-pool: tickHealth threw');
});
}
}
const snaps = this.snapshots();
const idle = selectIdleEvictionTargets(snaps, now, this.idleTtlMs);
// LRU runs on what remains after idle eviction, so the two never double-evict.
const idleSet = new Set(idle);
const remaining = snaps.filter((s) => !idleSet.has(s.key));
const lru = selectLruEvictionTargets(remaining, this.maxLive);
const targets = [...idle, ...lru];
if (targets.length === 0) return { evicted: [] };
const evicted: string[] = [];
for (const key of targets) {
const entry = this.backends.get(key);
if (!entry) continue;
// Re-check busy right before teardown — a turn may have started since the
// snapshot. Defensive; the decision already excluded busy at snapshot time.
if (entry.backend.isBusy?.()) continue;
this.backends.delete(key);
try {
await entry.backend.dispose();
} catch (err) {
this.log?.warn({ key, err: errMsg(err) }, 'agent-pool: backend dispose threw during eviction');
}
evicted.push(key);
}
if (evicted.length > 0) {
this.log?.info({ evicted, size: this.backends.size }, 'agent-pool: evicted idle/over-cap backends');
}
return { evicted };
} finally {
this.sweeping = false;
}
}
// ─── Phase 3: chat-close cleanup (3.3) ───────────────────────────────────────
/**
* Tear down every pooled backend whose key is for this chat. Used by the
* chat-close hook. The opencode server is shared (keyed on a sentinel, not the
* chat), so it is NOT disposed here — only its session is closed via
* `closeSession`, which the hook calls directly with the per-(chat,agent)
* handle. Returns the keys it removed. Skips busy entries (a close mid-turn is
* rare but must not kill a live stream — the idle sweep reaps it shortly after).
*/
async closeChat(chatId: string): Promise<string[]> {
const removed: string[] = [];
const prefix = `${chatId}:`;
for (const [key, entry] of [...this.backends]) {
if (!key.startsWith(prefix)) continue;
if (entry.backend.isBusy?.()) continue;
this.backends.delete(key);
try {
await entry.backend.dispose();
} catch (err) {
this.log?.warn({ key, err: errMsg(err) }, 'agent-pool: dispose threw during closeChat');
}
removed.push(key);
}
return removed;
}
/** Look up a backend by exact key without bumping its activity (for closeSession). */
peek(primary: string, agent: string): AgentBackend | undefined {
return this.backends.get(this.key(primary, agent))?.backend;
}
/** Dispose every backend and clear the map. Tolerates throwing backends. */
async dispose(): Promise<void> {
this.stopReaper();
const entries = [...this.backends.values()];
this.backends.clear();
await Promise.allSettled(entries.map((e) => e.backend.dispose()));
}
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
/**
* The shared opencode server is pooled under a FIXED sentinel (one server per
* BooCoder process, multiplexing all opencode sessions internally) rather than a
* chat id — so it is NOT torn down by `closeChat(chatId)` (only its per-chat
* session is closed). Exported so the dispatcher + the lifecycle close-hook agree
* on the key without drift.
*/
export const OPENCODE_POOL_KEY = '__opencode_server__';
/** Single shared instance — registered by the dispatcher, swept + drained by the
* server's onClose hook. */
export const agentPool = new AgentPool();

@@ -0,0 +1,158 @@
import type { Sql } from '../db.js';
import type { FastifyBaseLogger } from 'fastify';
import { exec as execCb, execFile as execFileCb } from 'node:child_process';
import { promisify } from 'node:util';
import { PROVIDERS_BY_NAME } from './provider-registry.js';
import { resolveAcpProbeBinaries } from './acp-spawn.js';
import { clearProviderSnapshotCache, fetchLlamaSwapModels, prefixLlamaSwapModels } from './provider-snapshot.js';
import { readQwenSettingsModels } from './qwen-settings.js';
import { loadConfig } from '../config.js';
import { loadProviderConfig } from './provider-config-registry.js';
const exec = promisify(execCb);
const execFile = promisify(execFileCb);
// `which` via execFile (no shell) — the binary name can come from the config
// file (custom ACP entries), so avoid interpolating it into a shell string.
async function whichBinary(bin: string): Promise<string | null> {
try {
const { stdout } = await execFile('which', [bin], { timeout: 10_000 });
const path = stdout.trim();
return path || null;
} catch {
return null;
}
}
async function resolveInstallPath(agentName: string): Promise<string | null> {
const candidates = resolveAcpProbeBinaries(agentName);
for (const bin of candidates) {
const path = await whichBinary(bin);
if (path) return path;
}
return null;
}
async function detectAcpSupport(agentName: string, installPath: string): Promise<boolean> {
const transport = PROVIDERS_BY_NAME.get(agentName)?.transport;
if (transport !== 'acp') return false;
if (agentName === 'qwen') {
try {
const { stdout } = await exec(`"${installPath}" --help`, { timeout: 10_000 });
return stdout.includes('--acp');
} catch {
return false;
}
}
try {
await exec(`"${installPath}" acp --help`, { timeout: 10_000 });
return true;
} catch {
return false;
}
}
/**
* Probe for available agents on the HOST.
*
* v2.3: iterates the resolved provider registry (built-ins + config-backed
* custom ACP entries) rather than the hardcoded `PROBED_AGENT_NAMES`. Native
* boocode is not probed; disabled providers are skipped (their `available_agents`
* row is kept, not deleted). `enabled` is read from the in-memory registry only —
* no DB column in Phase 1 (design.md §3.3).
*/
export async function probeAgents(sql: Sql, log: FastifyBaseLogger): Promise<void> {
clearProviderSnapshotCache();
log.info('agent-probe: scanning for known agents');
const registry = loadProviderConfig(loadConfig().CODER_PROVIDERS_PATH);
for (const resolved of registry.values()) {
const agentName = resolved.id;
// Native boocode is not a probed host agent.
if (resolved.transport === 'native') continue;
// Disabled providers: skip the probe, keep any existing row.
if (!resolved.enabled) {
log.info({ agent: agentName }, 'agent-probe: skipping disabled provider');
continue;
}
try {
// Custom ACP entries resolve their binary from command[0]; built-ins use
// the per-agent probe binaries.
const installPath = resolved.isCustomAcp && resolved.launchCommand
? await whichBinary(resolved.launchCommand[0])
: await resolveInstallPath(agentName);
if (!installPath) continue;
let version: string | null = null;
try {
const { stdout: verOut } = await exec(`"${installPath}" --version`, { timeout: 15_000 });
version = verOut.trim().slice(0, 100);
} catch {
/* optional */
}
// Custom ACP entries are ACP by declaration; built-ins detect support.
let supportsAcp: boolean;
if (resolved.isCustomAcp) {
supportsAcp = true;
} else {
supportsAcp = resolved.transport === 'acp';
if (supportsAcp) {
supportsAcp = await detectAcpSupport(agentName, installPath);
}
}
let models: Array<{ id: string; label: string }> = [];
if (!resolved.isCustomAcp) {
const providerDef = PROVIDERS_BY_NAME.get(agentName);
if (providerDef?.modelSource === 'static' && providerDef.staticModels) {
models = providerDef.staticModels;
}
if (agentName === 'qwen') {
models = await readQwenSettingsModels();
}
if (providerDef?.mergeLlamaSwap) {
try {
const config = loadConfig();
const llamaModels = prefixLlamaSwapModels(await fetchLlamaSwapModels(config));
models = [...models, ...llamaModels];
} catch (err) {
log.warn({ agent: agentName, err: err instanceof Error ? err.message : String(err) }, 'agent-probe: llama-swap model fetch failed (non-fatal)');
}
}
}
const label = resolved.configLabel ?? resolved.label;
const transport = resolved.isCustomAcp
? 'acp'
: resolved.transport === 'acp' && !supportsAcp
? 'pty'
: (resolved.transport ?? 'pty');
await sql`
INSERT INTO available_agents (name, install_path, version, supports_acp, last_probed_at, models, label, transport)
VALUES (${agentName}, ${installPath}, ${version}, ${supportsAcp}, clock_timestamp(), ${sql.json(models as never)}, ${label}, ${transport})
ON CONFLICT (name) DO UPDATE SET
install_path = EXCLUDED.install_path,
version = EXCLUDED.version,
supports_acp = EXCLUDED.supports_acp,
last_probed_at = EXCLUDED.last_probed_at,
models = EXCLUDED.models,
label = EXCLUDED.label,
transport = EXCLUDED.transport
`;
log.info({ agent: agentName, version, installPath, supportsAcp, modelCount: models.length }, 'agent-probe: found');
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
log.debug({ agent: agentName, err: msg }, 'agent-probe: not found');
}
}
log.info('agent-probe: scan complete');
}

@@ -0,0 +1,56 @@
import type { Sql } from '../db.js';
import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
import { snapshotToPartPayload } from './acp-tool-snapshot.js';
interface PartInsert {
message_id: string;
sequence: number;
kind: 'reasoning' | 'tool_call';
payload: unknown;
}
async function insertParts(sql: Sql, parts: PartInsert[]): Promise<void> {
if (parts.length === 0) return;
await sql`
INSERT INTO message_parts ${sql(
parts.map((p) => ({
message_id: p.message_id,
sequence: p.sequence,
kind: p.kind,
payload: sql.json(p.payload as never),
})),
'message_id',
'sequence',
'kind',
'payload',
)}
`;
}
/** Persist external-agent reasoning + tool calls into message_parts for reload. */
export async function persistExternalAgentTurn(
sql: Sql,
assistantMessageId: string,
snapshots: AcpToolSnapshot[],
reasoningText: string,
): Promise<void> {
const parts: PartInsert[] = [];
let seq = 0;
if (reasoningText.trim()) {
parts.push({
message_id: assistantMessageId,
sequence: seq++,
kind: 'reasoning',
payload: { text: reasoningText },
});
}
for (const snapshot of snapshots) {
parts.push({
message_id: assistantMessageId,
sequence: seq++,
kind: 'tool_call',
payload: snapshotToPartPayload(snapshot),
});
}
await insertParts(sql, parts);
}

@@ -0,0 +1,176 @@
import { describe, it, expect } from 'vitest';
import {
selectIdleEvictionTargets,
selectLruEvictionTargets,
decideRestart,
selectOrphanWorktreeTargets,
DEFAULT_IDLE_TTL_MS,
DEFAULT_MAX_LIVE_BACKENDS,
type PoolEntrySnapshot,
} from '../lifecycle-decisions.js';
/**
* v2.6 Phase 3: pure lifecycle decisions. No DB, no children, no timers; `now`
* is injected. Modeled on prune.ts:selectPruneTargets; the caller acts on the keys.
*/
const NOW = 1_000_000_000_000;
function entry(key: string, ageMs: number, busy = false): PoolEntrySnapshot {
return { key, lastActiveAt: NOW - ageMs, busy };
}
describe('selectIdleEvictionTargets (3.1)', () => {
it('evicts entries idle past the TTL', () => {
const entries = [
entry('a:opencode', DEFAULT_IDLE_TTL_MS + 1),
entry('b:goose', DEFAULT_IDLE_TTL_MS - 1),
];
expect(selectIdleEvictionTargets(entries, NOW)).toEqual(['a:opencode']);
});
it('never evicts a busy entry even when idle past the TTL', () => {
const entries = [entry('a:opencode', DEFAULT_IDLE_TTL_MS * 10, /* busy */ true)];
expect(selectIdleEvictionTargets(entries, NOW)).toEqual([]);
});
it('respects a custom TTL', () => {
const entries = [entry('a:goose', 5_000), entry('b:qwen', 500)];
expect(selectIdleEvictionTargets(entries, NOW, 1_000)).toEqual(['a:goose']);
});
it('treats exactly-at-TTL as evictable (>=)', () => {
expect(selectIdleEvictionTargets([entry('a:x', 1_000)], NOW, 1_000)).toEqual(['a:x']);
});
it('returns empty for an empty pool', () => {
expect(selectIdleEvictionTargets([], NOW)).toEqual([]);
});
});
describe('selectLruEvictionTargets (3.4)', () => {
it('returns nothing when at or under the cap', () => {
const entries = [entry('a:x', 10), entry('b:y', 20)];
expect(selectLruEvictionTargets(entries, 2)).toEqual([]);
expect(selectLruEvictionTargets(entries, 5)).toEqual([]);
});
it('evicts the least-recently-used beyond the cap', () => {
// oldest first: c (300ms ago) is LRU, then a (100ms), then b (10ms).
const entries = [entry('a:x', 100), entry('b:y', 10), entry('c:z', 300)];
expect(selectLruEvictionTargets(entries, 2)).toEqual(['c:z']);
});
it('evicts multiple LRU entries to reach the cap', () => {
const entries = [
entry('a:x', 100),
entry('b:y', 10),
entry('c:z', 300),
entry('d:w', 200),
];
// cap 1: must remove 3, oldest-first c(300), d(200), a(100).
expect(selectLruEvictionTargets(entries, 1)).toEqual(['c:z', 'd:w', 'a:x']);
});
it('never evicts a busy entry even if it is the LRU', () => {
// c is LRU but busy → it cannot be evicted; fall to the next-oldest (a).
const entries = [entry('a:x', 100), entry('b:y', 10), entry('c:z', 300, true)];
expect(selectLruEvictionTargets(entries, 2)).toEqual(['a:x']);
});
it('can transiently exceed the cap when too many are busy', () => {
// cap 1, but both old entries busy → only the single idle one is evictable.
const entries = [entry('a:x', 100, true), entry('c:z', 300, true), entry('b:y', 10)];
expect(selectLruEvictionTargets(entries, 1)).toEqual(['b:y']);
});
it('uses the default cap when omitted', () => {
const entries = Array.from({ length: DEFAULT_MAX_LIVE_BACKENDS + 1 }, (_, i) =>
entry(`k${String(i).padStart(2, '0')}:a`, (i + 1) * 1000),
);
const evicted = selectLruEvictionTargets(entries);
// exactly one over the default cap → evict the single LRU (largest age).
expect(evicted).toHaveLength(1);
expect(evicted[0]).toBe(`k${String(DEFAULT_MAX_LIVE_BACKENDS).padStart(2, '0')}:a`);
});
});
describe('decideRestart (3.2, busy-aware)', () => {
const base = {
consecutiveFailures: 0,
busy: false,
unhealthyBusySince: 0,
now: NOW,
failureThreshold: 3,
staleBusyGraceMs: 120_000,
};
it('does nothing when healthy', () => {
expect(decideRestart({ ...base, processExited: false, healthy: true }))
.toEqual({ action: 'none', reason: 'healthy' });
});
it('restarts immediately when the process exited', () => {
expect(decideRestart({ ...base, processExited: true, busy: true }))
.toEqual({ action: 'restart', reason: 'process-exited' });
});
it('waits below the failure threshold', () => {
expect(decideRestart({ ...base, processExited: false, consecutiveFailures: 2 }))
.toEqual({ action: 'wait', reason: 'below-threshold' });
});
it('restarts at the threshold when idle', () => {
expect(decideRestart({ ...base, processExited: false, consecutiveFailures: 3 }))
.toEqual({ action: 'restart', reason: 'threshold' });
});
it('defers a restart while busy within the grace window', () => {
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: NOW - 1_000,
})).toEqual({ action: 'wait', reason: 'busy-grace' });
});
it('force-restarts a busy backend after the stale-busy grace', () => {
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: NOW - 120_001,
})).toEqual({ action: 'restart', reason: 'stale-busy-grace' });
});
it('waits (busy-grace) when busy + threshold but the window just started', () => {
// unhealthyBusySince === 0 means the caller is about to stamp it this cycle.
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: 0,
})).toEqual({ action: 'wait', reason: 'busy-grace' });
});
});
describe('selectOrphanWorktreeTargets (3.4)', () => {
it('skips dirs tracked by a live worktrees row', () => {
const onDisk = [{ path: '/wt/sess-a', mtimeMs: NOW - 10_000_000 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(['/wt/sess-a']), NOW, 1000)).toEqual([]);
});
it('reaps an untracked dir older than the grace', () => {
const onDisk = [{ path: '/wt/sess-orphan', mtimeMs: NOW - 5000 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(), NOW, 1000)).toEqual(['/wt/sess-orphan']);
});
it('never reaps a dir younger than the grace (mid-create race)', () => {
const onDisk = [{ path: '/wt/sess-fresh', mtimeMs: NOW - 500 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(), NOW, 1000)).toEqual([]);
});
it('mixes tracked, fresh, and orphaned correctly', () => {
const onDisk = [
{ path: '/wt/sess-live', mtimeMs: NOW - 10_000 },
{ path: '/wt/sess-fresh', mtimeMs: NOW - 100 },
{ path: '/wt/sess-orphan', mtimeMs: NOW - 10_000 },
];
expect(selectOrphanWorktreeTargets(onDisk, new Set(['/wt/sess-live']), NOW, 1000))
.toEqual(['/wt/sess-orphan']);
});
});
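
The orphan-worktree cases above pin down the observable behavior, so a function satisfying them can be sketched as below. This is not the code from `lifecycle-decisions.ts` (that part of the file is not shown here). The `OnDiskDir` name and the `>=` comparison at exactly the grace boundary are assumptions, since the tests do not cover that edge.

```typescript
// Assumed input shape, inferred from the test fixtures above.
interface OnDiskDir {
  path: string;
  mtimeMs: number;
}

/**
 * Sketch: a directory is a reap target when no live worktrees row tracks it
 * AND it is at least `graceMs` old (guards the mid-create race).
 */
function selectOrphanWorktreeTargets(
  onDisk: ReadonlyArray<OnDiskDir>,
  tracked: ReadonlySet<string>,
  now: number,
  graceMs: number,
): string[] {
  return onDisk
    .filter((d) => !tracked.has(d.path) && now - d.mtimeMs >= graceMs)
    .map((d) => d.path);
}

const ORPHAN_NOW = 1_000_000_000_000;
const orphanTargets = selectOrphanWorktreeTargets(
  [
    { path: '/wt/sess-live', mtimeMs: ORPHAN_NOW - 10_000 },
    { path: '/wt/sess-fresh', mtimeMs: ORPHAN_NOW - 100 },
    { path: '/wt/sess-orphan', mtimeMs: ORPHAN_NOW - 10_000 },
  ],
  new Set(['/wt/sess-live']),
  ORPHAN_NOW,
  1000,
);
console.log(orphanTargets); // [ '/wt/sess-orphan' ]
```

Keeping the decision pure means the caller owns the filesystem walk and the deletion, which is what lets the tests run with no disk access.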

@@ -0,0 +1,51 @@
import { describe, it, expect } from 'vitest';
import { stepEndedToUsage } from '../opencode-usage.js';
describe('stepEndedToUsage (U.6)', () => {
it('folds cache read+write into input and reasoning into output', () => {
const u = stepEndedToUsage({
cost: 0.0123,
tokens: { input: 100, output: 50, reasoning: 20, cache: { read: 10, write: 5 } },
});
expect(u).toEqual({ input: 115, output: 70, cost: 0.0123 });
});
it('handles a step with no cache and no reasoning', () => {
const u = stepEndedToUsage({
cost: 0,
tokens: { input: 8, output: 4, reasoning: 0, cache: { read: 0, write: 0 } },
});
expect(u).toEqual({ input: 8, output: 4, cost: 0 });
});
it('is defensive against a missing tokens block', () => {
const u = stepEndedToUsage({ cost: 0.5 } as never);
expect(u).toEqual({ input: 0, output: 0, cost: 0.5 });
});
it('is defensive against undefined props', () => {
expect(stepEndedToUsage(undefined)).toEqual({ input: 0, output: 0, cost: 0 });
});
it('drops NaN / negative noise to zero rather than poisoning the accumulated total', () => {
const u = stepEndedToUsage({
cost: Number.NaN,
tokens: {
input: -5,
output: Number.NaN,
reasoning: 3,
cache: { read: Number.POSITIVE_INFINITY, write: 2 },
},
});
// input: (-5→0) + (Inf→0) + 2 = 2; output: (NaN→0) + 3 = 3; cost: NaN→0
expect(u).toEqual({ input: 2, output: 3, cost: 0 });
});
it('rounds fractional token counts', () => {
const u = stepEndedToUsage({
cost: 1.5,
tokens: { input: 10.6, output: 4.4, reasoning: 0, cache: { read: 0, write: 0 } },
});
expect(u).toEqual({ input: 11, output: 4, cost: 1.5 });
});
});
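
The implementation of `stepEndedToUsage` is not included in this diff. One way to satisfy the six cases above is sketched here. The type names and the `clean` helper are hypothetical, and rounding the summed totals (rather than each component) is an assumption the tests do not distinguish.

```typescript
// Assumed payload shape, inferred from the fixtures in the tests above.
interface StepTokens {
  input?: number;
  output?: number;
  reasoning?: number;
  cache?: { read?: number; write?: number };
}
interface StepEndedProps {
  cost?: number;
  tokens?: StepTokens;
}
interface Usage {
  input: number;
  output: number;
  cost: number;
}

/** Keep finite, positive numbers; map NaN, Infinity, negatives, and non-numbers to 0. */
function clean(n: unknown): number {
  return typeof n === 'number' && Number.isFinite(n) && n > 0 ? n : 0;
}

/** Sketch: cache read+write fold into input, reasoning folds into output. */
function stepEndedToUsage(props?: StepEndedProps): Usage {
  const t = props?.tokens;
  const input = clean(t?.input) + clean(t?.cache?.read) + clean(t?.cache?.write);
  const output = clean(t?.output) + clean(t?.reasoning);
  return { input: Math.round(input), output: Math.round(output), cost: clean(props?.cost) };
}
```

Sanitizing per field before summing is what keeps a single `NaN` from poisoning an accumulated session total.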

@@ -0,0 +1,34 @@
import { describe, it, expect } from 'vitest';
import {
armAbortGuard,
noteTurnActivity,
consumeTerminal,
type AbortTerminalGuard,
} from '../turn-guard.js';
describe('post-abort terminal guard (F.1)', () => {
it('swallows the orphan terminal that follows an abort, then settles the next real one', () => {
// Reproduces the v2.6.5 Stop-button bug: abort turn A, then opencode emits a
// trailing session.idle for A. That orphan must NOT settle the next turn.
const g: AbortTerminalGuard = { swallowNextTerminal: false };
armAbortGuard(g); // user aborts turn A
expect(consumeTerminal(g)).toBe('swallow'); // opencode's orphan idle for A → dropped
expect(consumeTerminal(g)).toBe('settle'); // turn B's real idle → settles B
});
it('settles a terminal when no abort happened', () => {
const g: AbortTerminalGuard = { swallowNextTerminal: false };
expect(consumeTerminal(g)).toBe('settle');
});
it('self-heals if the orphan never arrives: new-turn activity clears the guard', () => {
// If opencode emits no orphan idle (e.g. abort-before-prompt), the next turn's
// real terminal must still settle rather than being swallowed forever.
const g: AbortTerminalGuard = { swallowNextTerminal: false };
armAbortGuard(g); // abort A, but no orphan idle arrives
noteTurnActivity(g); // turn B produces its first delta
expect(consumeTerminal(g)).toBe('settle'); // turn B's idle settles, not swallowed
});
});
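
The three guard functions exercised above live in `turn-guard.ts`, which this diff view does not show. A minimal sketch consistent with the tests follows. Only the `AbortTerminalGuard` shape and the function names come from the test file; the bodies are an assumed one-shot-flag implementation.

```typescript
interface AbortTerminalGuard {
  swallowNextTerminal: boolean;
}

/** User aborted a turn: expect one orphan terminal event to follow. */
function armAbortGuard(g: AbortTerminalGuard): void {
  g.swallowNextTerminal = true;
}

/** A new turn produced output, so any pending orphan never arrived: disarm. */
function noteTurnActivity(g: AbortTerminalGuard): void {
  g.swallowNextTerminal = false;
}

/** One-shot: the first terminal after an abort is dropped, later ones settle. */
function consumeTerminal(g: AbortTerminalGuard): 'swallow' | 'settle' {
  if (g.swallowNextTerminal) {
    g.swallowNextTerminal = false;
    return 'swallow';
  }
  return 'settle';
}

const demoGuard: AbortTerminalGuard = { swallowNextTerminal: false };
armAbortGuard(demoGuard);
console.log(consumeTerminal(demoGuard), consumeTerminal(demoGuard)); // swallow settle
```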

@@ -0,0 +1,59 @@
import { describe, it, expect } from 'vitest';
import { shouldUseWarmBackend, isTurnOkForStopReason } from '../warm-acp-routing.js';
/**
* Phase 2 routing predicate: which goose/qwen tasks go to the warm pool backend
* vs the existing one-shot ACP path.
*
* The warm backend is keyed (chat_id, agent) — the persistent context unit (same
* as opencode-server). A task only routes warm when it carries BOTH a session_id
* and a chat_id, i.e. it originates from a real chat tab (the coder message route
* stamps both). Session-less creators (arena, MCP-created, generic /api/tasks,
* new_task) lack chat_id/session_id and keep the one-shot worktree-per-task path,
* which never spawns a warm process.
*/
describe('shouldUseWarmBackend (Phase 2 routing)', () => {
it('routes a chat-tab task (session_id + chat_id) to the warm backend', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: 's1', chat_id: 'c1' })).toBe(true);
expect(shouldUseWarmBackend({ agent: 'goose', session_id: 's1', chat_id: 'c1' })).toBe(true);
});
it('keeps a session-less arena/MCP task on the one-shot path', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: null, chat_id: null })).toBe(false);
});
it('keeps a task with a session but no chat on the one-shot path', () => {
// chat_id is the warm-key half; without it ensureSession would get a degenerate
// (null, agent) key, so fall back to one-shot rather than synthesize a chat.
expect(shouldUseWarmBackend({ agent: 'goose', session_id: 's1', chat_id: null })).toBe(false);
});
it('keeps a task with a chat but no session on the one-shot path', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: null, chat_id: 'c1' })).toBe(false);
});
it('only applies to warm-capable agents (goose, qwen); others never warm here', () => {
// opencode has its own dedicated warm path; native/claude/etc. are not ACP-warm.
expect(shouldUseWarmBackend({ agent: 'opencode', session_id: 's1', chat_id: 'c1' })).toBe(false);
expect(shouldUseWarmBackend({ agent: 'claude', session_id: 's1', chat_id: 'c1' })).toBe(false);
expect(shouldUseWarmBackend({ agent: null, session_id: 's1', chat_id: 'c1' })).toBe(false);
});
});
describe('isTurnOkForStopReason (ACP stop-reason → ok/fail)', () => {
it('treats normal completions as ok', () => {
expect(isTurnOkForStopReason('end_turn')).toBe(true);
expect(isTurnOkForStopReason('max_tokens')).toBe(true);
expect(isTurnOkForStopReason('max_turn_requests')).toBe(true);
});
it('treats refusal and cancelled as failures', () => {
expect(isTurnOkForStopReason('refusal')).toBe(false);
expect(isTurnOkForStopReason('cancelled')).toBe(false);
});
it('defaults an absent stop reason to a successful end_turn', () => {
expect(isTurnOkForStopReason(undefined)).toBe(true);
expect(isTurnOkForStopReason(null)).toBe(true);
});
});
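
Both predicates tested above could be implemented roughly as follows. This is a sketch under stated assumptions, not the contents of `warm-acp-routing.ts`: the `RoutableTask` type, the `WARM_AGENTS` and `FAILING_STOP_REASONS` sets, and the choice to treat unknown stop reasons as successful are all illustrative.

```typescript
// Assumed task shape: only the three fields the tests pass in.
interface RoutableTask {
  agent: string | null;
  session_id: string | null;
  chat_id: string | null;
}

// Per the tests, only goose and qwen are warm-capable on this path.
const WARM_AGENTS: ReadonlySet<string> = new Set(['goose', 'qwen']);

/** Warm routing needs a warm-capable agent AND both halves of the chat identity. */
function shouldUseWarmBackend(task: RoutableTask): boolean {
  return (
    task.agent !== null &&
    WARM_AGENTS.has(task.agent) &&
    Boolean(task.session_id) &&
    Boolean(task.chat_id)
  );
}

const FAILING_STOP_REASONS: ReadonlySet<string> = new Set(['refusal', 'cancelled']);

/** An absent stop reason defaults to end_turn, which counts as ok. */
function isTurnOkForStopReason(reason: string | null | undefined): boolean {
  return !FAILING_STOP_REASONS.has(reason ?? 'end_turn');
}

console.log(
  shouldUseWarmBackend({ agent: 'qwen', session_id: 's1', chat_id: 'c1' }),
  shouldUseWarmBackend({ agent: 'opencode', session_id: 's1', chat_id: 'c1' }),
  isTurnOkForStopReason('cancelled'),
); // true false false
```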

@@ -0,0 +1,197 @@
/**
* v2.6 Phase 3 — pure lifecycle decision helpers.
*
* The eviction / LRU-cap / busy-aware-restart / reaper-target logic, factored out
* of AgentPool + the backends + the periodic sweeper so it's unit-testable with no
* DB, no child processes, no timers (modeled on
* apps/server/src/services/inference/prune.ts:selectPruneTargets — a pure decision
* core the caller acts on).
*
* Four decisions live here:
* 1. selectIdleEvictionTargets: which warm backends to evict for being idle.
* 2. selectLruEvictionTargets: which warm backends to evict to honour a max-live
* cap (least-recently-used beyond the cap), NEVER a busy one.
* 3. decideRestart (busy-aware): openchamber's skip-while-busy +
* stale-grace state machine, re-implemented for BooCode's per-(chat,agent) pool.
* 4. selectOrphanWorktreeTargets: which untracked on-disk worktree dirs the
* reaper may remove once they are older than a grace period.
*
* "Busy" = the backend has an in-flight turn. The hard rule (design §6, decisions):
* never evict or force-restart a busy backend; defer with a stale-grace.
*/
// ─── Idle TTL eviction (3.1) ─────────────────────────────────────────────────
/** Default idle TTL before a warm backend/session is evicted (design §6 ~30 min). */
export const DEFAULT_IDLE_TTL_MS = 30 * 60 * 1000;
/** A pool entry as the decision helpers see it (no backend internals). */
export interface PoolEntrySnapshot {
/** Pool key `${primary}:${agent}` — opaque to the decision, used for selection. */
key: string;
/** Epoch ms of the last turn activity (start or settle) on this backend. */
lastActiveAt: number;
/** True iff a turn is in flight right now. Busy entries are never evicted. */
busy: boolean;
}
/**
* Idle eviction: an entry is evictable when it has been idle (no turn) for at
* least `ttlMs` (the comparison is `>=`) AND is not currently busy. Returns the keys to evict.
*
* Pure: `now` is injected so tests don't depend on wall-clock. Busy entries are
* categorically excluded — a long-running turn that exceeds the TTL must NOT be
* torn down mid-stream (the §6 / openchamber busy rule).
*/
export function selectIdleEvictionTargets(
entries: ReadonlyArray<PoolEntrySnapshot>,
now: number,
ttlMs: number = DEFAULT_IDLE_TTL_MS,
): string[] {
const out: string[] = [];
for (const e of entries) {
if (e.busy) continue;
if (now - e.lastActiveAt >= ttlMs) out.push(e.key);
}
return out;
}
// ─── LRU cap (3.4) ───────────────────────────────────────────────────────────
/** Default max live warm backends/worktrees before the LRU cap evicts (env-overridable). */
export const DEFAULT_MAX_LIVE_BACKENDS = 10;
/**
* LRU cap: when more than `cap` entries are live (busy ones count toward the
* total), evict the least-recently-used idle entries (oldest `lastActiveAt`
* first) until the total is back down to `cap` or no idle entries remain. Busy
* entries are never evicted, so a burst of concurrent busy turns can
* transiently exceed the cap rather than kill live work.
*
* Returns the keys to evict, least-recently-used first. Pure / deterministic:
* ties broken by key for stable test output.
*/
export function selectLruEvictionTargets(
entries: ReadonlyArray<PoolEntrySnapshot>,
cap: number = DEFAULT_MAX_LIVE_BACKENDS,
): string[] {
if (cap < 0) cap = 0;
if (entries.length <= cap) return [];
// Only idle entries are eligible to be evicted.
const evictable = entries
.filter((e) => !e.busy)
.sort((a, b) => a.lastActiveAt - b.lastActiveAt || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
// We must shrink total live count down to `cap`. Busy entries can't be evicted,
// so the number we CAN remove is bounded by the evictable pool; evict the oldest
// (total - cap) of them, never more than exist.
const overBy = entries.length - cap;
const toEvict = evictable.slice(0, Math.max(0, overBy));
return toEvict.map((e) => e.key);
}
// ─── Busy-aware crash restart (3.2) — openchamber lift ───────────────────────
/**
* Default grace after which a backend that has stayed unhealthy WHILE busy is
* force-restarted anyway (openchamber's STALE_BUSY_GRACE_MS = 2 min). Guards
* against a permanently-stuck "busy" turn wedging recovery forever.
*/
export const DEFAULT_STALE_BUSY_GRACE_MS = 2 * 60 * 1000;
/** Default consecutive health-check failures before a restart is attempted. */
export const DEFAULT_HEALTH_FAILURE_THRESHOLD = 3;
export interface RestartDecisionInput {
/** True iff the process is actually dead (exited). A dead process restarts
* immediately regardless of busy/threshold — there's nothing to protect. */
processExited: boolean;
/** Consecutive failed health probes so far (including the current one). */
consecutiveFailures: number;
/** Whether the backend currently has an in-flight turn. */
busy: boolean;
/** Epoch ms when the unhealthy-while-busy window started, or 0 if not in one. */
unhealthyBusySince: number;
/** Injected clock. */
now: number;
failureThreshold?: number;
staleBusyGraceMs?: number;
}
export type RestartDecision =
| { action: 'restart'; reason: 'process-exited' | 'threshold' | 'stale-busy-grace' }
| { action: 'wait'; reason: 'below-threshold' | 'busy-grace' }
| { action: 'none'; reason: 'healthy' };
/**
* Decide whether to restart a backend after a health probe. Mirrors
* openchamber's `runHealthCheckCycle` + `shouldSkipRestartForBusySessions`,
* re-implemented as a pure function over injected state (the caller owns the
* mutable counters + the actual restart side-effect).
*
* Order (matches openchamber):
* - process exited → restart now (nothing live to protect).
* - below failure threshold → wait (transient blip; the next probe re-checks).
* - threshold reached + idle → restart now.
* - threshold reached + busy → skip UNLESS the unhealthy-busy window exceeded
* the stale grace, then force restart.
*
* Callers that short-circuit on a healthy probe never reach this function; the
* optional `healthy` flag exists so a caller can instead route every probe
* through here and reset its counters on a single code path.
*/
export function decideRestart(input: RestartDecisionInput & { healthy?: boolean }): RestartDecision {
if (input.healthy) return { action: 'none', reason: 'healthy' };
if (input.processExited) return { action: 'restart', reason: 'process-exited' };
const threshold = input.failureThreshold ?? DEFAULT_HEALTH_FAILURE_THRESHOLD;
if (input.consecutiveFailures < threshold) {
return { action: 'wait', reason: 'below-threshold' };
}
if (!input.busy) {
return { action: 'restart', reason: 'threshold' };
}
// Busy + unhealthy at/over threshold: defer, but not forever.
const grace = input.staleBusyGraceMs ?? DEFAULT_STALE_BUSY_GRACE_MS;
if (input.unhealthyBusySince > 0 && input.now - input.unhealthyBusySince >= grace) {
return { action: 'restart', reason: 'stale-busy-grace' };
}
return { action: 'wait', reason: 'busy-grace' };
}
// ─── Orphan worktree reaper target selection (3.4) ───────────────────────────
/** Default TTL: an on-disk worktree dir with no live `worktrees` row is reaped
* only after it's been orphaned at least this long (mtime-based grace so a
* just-created dir mid-`ensureSessionWorktree` race is never swept). */
export const DEFAULT_ORPHAN_WORKTREE_GRACE_MS = 60 * 60 * 1000; // 1h
export interface OnDiskWorktree {
/** Absolute path of the worktree dir on disk. */
path: string;
/** Last-modified epoch ms of the dir (newest of dir + contents, caller's choice). */
mtimeMs: number;
}
/**
* Reaper target selection: which on-disk worktree dirs are orphans safe to
* inspect-and-reap. An orphan is a dir under the worktree base that has NO live
* `worktrees` row (path not in `liveWorktreePaths`) AND whose mtime is older than
* the grace window (so an in-flight create isn't swept).
*
* Pure — the caller (the sweeper) then runs the at-risk preflight (dirty/unpushed)
* on each returned path and only physically removes the SAFE ones. This helper
* never decides to remove work-at-risk; it only narrows the candidate set.
*/
export function selectOrphanWorktreeTargets(
onDisk: ReadonlyArray<OnDiskWorktree>,
liveWorktreePaths: ReadonlySet<string>,
now: number,
graceMs: number = DEFAULT_ORPHAN_WORKTREE_GRACE_MS,
): string[] {
const out: string[] = [];
for (const w of onDisk) {
if (liveWorktreePaths.has(w.path)) continue; // tracked → not an orphan
if (now - w.mtimeMs < graceMs) continue; // too fresh → could be mid-create
out.push(w.path);
}
return out;
}
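
The comment on `decideRestart` says the caller owns the mutable counters. The sketch below shows one way that bookkeeping could look. `ProbeState` and `recordProbe` are hypothetical names that do not appear in the source, and the rule for when the unhealthy-while-busy window opens is an assumption, labeled in the code.

```typescript
// Hypothetical caller-side bookkeeping for the health-probe loop.
// ProbeState and recordProbe are illustrative names, not part of the source.
interface ProbeState {
  /** Failed probes in a row, including the most recent one. */
  consecutiveFailures: number;
  /** Epoch ms when the unhealthy-while-busy window opened, or 0 if closed. */
  unhealthyBusySince: number;
}

const FRESH: ProbeState = { consecutiveFailures: 0, unhealthyBusySince: 0 };

function recordProbe(prev: ProbeState, healthy: boolean, busy: boolean, now: number): ProbeState {
  // A healthy probe clears both the failure streak and any open window.
  if (healthy) return FRESH;
  return {
    consecutiveFailures: prev.consecutiveFailures + 1,
    // Assumption: the window opens at the first failed probe seen while busy
    // and closes as soon as the backend is observed idle.
    unhealthyBusySince: busy ? (prev.unhealthyBusySince > 0 ? prev.unhealthyBusySince : now) : 0,
  };
}

// Three failed probes, 10 s apart, while a turn is in flight.
let state = FRESH;
for (const now of [1_000, 11_000, 21_000]) {
  state = recordProbe(state, false, true, now);
}
console.log(state);
```

Fed into `decideRestart` with the default threshold (3) and grace (2 min), this state at `now = 21_000` would fall into the `busy-grace` wait branch, and would only become a `stale-busy-grace` restart once `now - 1_000` reaches the grace.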

File diff suppressed because it is too large


@@ -0,0 +1,77 @@
/**
* v2.6 Phase 1-UX (U.6) — pure mapper for opencode's per-step usage event.
*
* opencode's warm server emits `session.next.step.ended` once per completed LLM
* step (so a multi-tool turn fires it several times). Its `properties` carry the
* step's token + cost accounting:
*
* {
* timestamp: number;
* sessionID: string;
* finish: string;
* cost: number; // USD for this step
* tokens: {
* input: number; output: number; reasoning: number;
* cache: { read: number; write: number };
* };
* snapshot?: string;
* }
*
* (Verified against @opencode-ai/sdk@1.15.12 — `EventSessionNextStepEnded` in
* `dist/v2/gen/types.gen.d.ts`, a member of the `Event` union the SSE loop
* switches on.)
*
* We normalize to the review's target slice `{input, output, cost}` (the
* provider-agnostic `AgentUsage` shape lands later). Cache read/write tokens are
* folded into `input` so the persisted input count reflects the real context the
* model billed for; reasoning tokens are folded into `output` since that's what
* the provider counts them as for generation. This keeps the persisted totals a
* faithful sum of what opencode reported, without inventing extra columns yet.
*/
/** The `properties` shape of a `session.next.step.ended` event (subset we read). */
export interface StepEndedProps {
cost: number;
tokens: {
input: number;
output: number;
reasoning: number;
cache: { read: number; write: number };
};
}
/** Normalized per-step usage delta persisted onto the agent_sessions row. */
export interface StepUsage {
input: number;
output: number;
cost: number;
}
/** Coerce a possibly-missing/NaN number to a non-negative finite integer (tokens). */
function n(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? Math.round(x) : 0;
}
/** Coerce a possibly-missing/NaN number to a non-negative finite float (cost USD). */
function f(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? x : 0;
}
/**
* Map a `session.next.step.ended` payload → the normalized `{input, output, cost}`
* delta. Defensive against missing/partial token blocks (the wire is trusted but
* we never want a NaN to poison the accumulated DB total). `input` folds in cache
* read+write; `output` folds in reasoning.
*/
export function stepEndedToUsage(props: Partial<StepEndedProps> | undefined): StepUsage {
const t = props?.tokens;
const cacheRead = n(t?.cache?.read);
const cacheWrite = n(t?.cache?.write);
return {
input: n(t?.input) + cacheRead + cacheWrite,
output: n(t?.output) + n(t?.reasoning),
cost: f(props?.cost),
};
}
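
Because `session.next.step.ended` fires once per step, a multi-tool turn yields several `StepUsage` deltas. The sketch below only illustrates that the deltas are additive. `sumUsage` is a hypothetical helper, and how the source actually accumulates onto the `agent_sessions` row is not shown in this file.

```typescript
// StepUsage is restated from the mapper above so the snippet runs standalone.
interface StepUsage {
  input: number;
  output: number;
  cost: number;
}

// Hypothetical helper (not in the source): sum per-step deltas into a turn total.
function sumUsage(steps: ReadonlyArray<StepUsage>): StepUsage {
  return steps.reduce<StepUsage>(
    (total, s) => ({
      input: total.input + s.input,
      output: total.output + s.output,
      cost: total.cost + s.cost,
    }),
    { input: 0, output: 0, cost: 0 },
  );
}

// Two steps of one turn, already normalized (cache folded into input,
// reasoning folded into output).
const turnTotal = sumUsage([
  { input: 1_200, output: 300, cost: 0.25 },
  { input: 1_550, output: 120, cost: 0.5 },
]);
console.log(turnTotal);
```

Since the mapper already clamps missing or NaN values to 0, a partial payload contributes a zero delta instead of poisoning the total.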


@@ -0,0 +1,38 @@
/**
* Guard against opencode's post-abort "orphan" terminal event (F.1).
*
* When a turn is aborted (`client.session.abort`), opencode emits one trailing
* `session.idle` / `session.error` for the cancelled turn. Without a guard that
* orphan settles whatever turn currently holds the session slot — which, after
* the user immediately sends another message, is the NEXT turn, settling it early
* as success (the v2.6.5 Stop-button bug). opencode terminal events carry only a
* `sessionID` (no turn id), so we can't match by id; instead we swallow exactly
* one terminal per abort, and self-heal if that orphan never arrives.
*/
export interface AbortTerminalGuard {
/** True between an abort and the orphan terminal event that follows it. */
swallowNextTerminal: boolean;
}
/** Arm on abort: the next terminal event for this session is the orphan. */
export function armAbortGuard(g: AbortTerminalGuard): void {
g.swallowNextTerminal = true;
}
/**
* A new turn produced activity (delta) → the orphan window is over. Self-heals
* the case where opencode emits no orphan idle (e.g. abort-before-prompt), so a
* real terminal still settles instead of being swallowed forever.
*/
export function noteTurnActivity(g: AbortTerminalGuard): void {
g.swallowNextTerminal = false;
}
/** Decide a terminal (idle/error): swallow the post-abort orphan once, else settle. */
export function consumeTerminal(g: AbortTerminalGuard): 'swallow' | 'settle' {
if (g.swallowNextTerminal) {
g.swallowNextTerminal = false;
return 'swallow';
}
return 'settle';
}
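
To see how the one-flag guard behaves over a whole event trace, the sketch below folds the three operations (arm, note activity, consume) into a single loop. `replay`, `GuardEvent`, and `Verdict` are hypothetical names used only for illustration; the transitions are meant to match the functions above.

```typescript
type GuardEvent = 'abort' | 'delta' | 'terminal';
type Verdict = 'swallow' | 'settle';

// Hypothetical replay helper: folds the three guard operations into one loop
// so an event trace can be checked end to end.
function replay(events: ReadonlyArray<GuardEvent>): Verdict[] {
  let swallowNextTerminal = false;
  const verdicts: Verdict[] = [];
  for (const ev of events) {
    if (ev === 'abort') {
      swallowNextTerminal = true; // arm: the next terminal is the orphan
    } else if (ev === 'delta') {
      swallowNextTerminal = false; // new-turn activity ends the orphan window
    } else {
      verdicts.push(swallowNextTerminal ? 'swallow' : 'settle');
      swallowNextTerminal = false; // a terminal always disarms
    }
  }
  return verdicts;
}

// Abort, orphan terminal arrives, then the next turn runs and settles.
const normalAbort = replay(['abort', 'terminal', 'delta', 'terminal']);
// Abort, no orphan ever arrives, next turn's delta self-heals the guard.
const orphanNeverArrived = replay(['abort', 'delta', 'terminal']);
console.log(normalAbort, orphanNeverArrived);
```

One ordering this sketch makes visible: if an orphan terminal could arrive after the next turn's first delta, the guard would already be disarmed and that terminal would settle the new turn.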


@@ -0,0 +1,41 @@
/**
* v2.6 Phase 2 — warm-vs-one-shot routing predicate for goose/qwen.
*
* The warm ACP backend keys its persistent process + ACP session on (chat_id,
* agent) — exactly like the opencode-server backend. A task therefore only routes
* to the warm pool when it carries BOTH a `session_id` and a `chat_id`, i.e. it
* came from a real chat tab (the coder message route + skills route stamp both).
*
* Session-less creators — arena contestants, MCP-created tasks, generic
* `POST /api/tasks`, `new_task` — leave one or both null. Those keep the existing
* one-shot worktree-per-task ACP path (`runExternalAgent`), which spawns a fresh
* `goose acp` / `qwen --acp` per turn and never holds a warm process. Routing them
* warm would either synthesize a degenerate (null, agent) key or create a chat per
* arena contestant — neither is wanted, so they stay one-shot.
*
* Pure, so it's unit-testable; the dispatcher consumes it.
*/
const WARM_CAPABLE_AGENTS = new Set(['goose', 'qwen']);
export function shouldUseWarmBackend(task: {
agent: string | null;
session_id: string | null;
chat_id: string | null;
}): boolean {
if (!task.agent || !WARM_CAPABLE_AGENTS.has(task.agent)) return false;
return task.session_id != null && task.chat_id != null;
}
/**
* Map an ACP prompt `stopReason` to the backend's ok/fail contract (TurnResult.ok).
*
* ACP's `StopReason` union includes normal completions (`end_turn`, `max_tokens`,
* `max_turn_requests`) and abnormal ones (`refusal`, `cancelled`). Only the latter
* two read as a failed turn; everything else (including an undefined/absent reason,
* which we default to `end_turn`) is a successful completion. Pure so it's testable
* independently of the warm process.
*/
export function isTurnOkForStopReason(stopReason: string | null | undefined): boolean {
const reason = stopReason ?? 'end_turn';
return reason !== 'refusal' && reason !== 'cancelled';
}
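
The routing rule above can be read as "a warm pool key exists only when the task has an agent in the warm set plus both ids". The sketch below expresses that directly. `TaskRef` and `warmPoolKey` are hypothetical, and the `${chat_id}:${agent}` key format is an assumption rather than something this file confirms.

```typescript
// Hypothetical task shape and helper; only the routing rule mirrors the text above.
interface TaskRef {
  agent: string | null;
  session_id: string | null;
  chat_id: string | null;
}

const WARM_AGENTS: ReadonlySet<string> = new Set(['goose', 'qwen']);

// Returns the warm-pool key for a task, or null when it should stay one-shot.
// Assumption: the pool key is `${chat_id}:${agent}`.
function warmPoolKey(task: TaskRef): string | null {
  if (!task.agent || !WARM_AGENTS.has(task.agent)) return null;
  if (task.session_id == null || task.chat_id == null) return null;
  return `${task.chat_id}:${task.agent}`;
}

const fromChatTab = warmPoolKey({ agent: 'goose', session_id: 's1', chat_id: 'c1' });
const arenaTask = warmPoolKey({ agent: 'qwen', session_id: 's1', chat_id: null });
const otherAgent = warmPoolKey({ agent: 'claude', session_id: 's1', chat_id: 'c1' });
console.log(fromChatTab, arenaTask, otherAgent);
```

Returning `null` instead of a degenerate key keeps session-less creators on the one-shot path by construction.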


@@ -0,0 +1,417 @@
/**
* v2.6 Phase 2 — WarmAcpBackend (goose, qwen).
*
* One persistent stdio process + ONE `ClientSideConnection` per (chat, agent),
* `initialize` + `session/new` done ONCE, reused across every turn — the warm
* analogue of the previous one-shot `acp-dispatch.ts` (which spawned and tore down a
* fresh `goose acp` / `qwen --acp` per turn). Mirrors Paseo's `SpawnedACPProcess`.
*
* Implements the Phase 0 `AgentBackend` interface (same contract as
* `OpenCodeServerBackend`). Emits transport-agnostic `AgentEvent`s via the SHARED
* `mapSessionUpdate` (reused verbatim from the one-shot stack); the dispatcher maps
* those to WS frames + `persistExternalAgentTurn`, unchanged.
*
* Lifecycle decisions (design.md §2b / §10):
* - **Child lifetime is the pool's, not a request's.** Spawned once; never tied
* to a per-turn abort signal. Only the in-flight `prompt` gets `ctx.signal` —
* abort = ACP `session/cancel`, NOT killing the child.
* - **Per-turn abort** cancels the prompt on the warm connection so the SAME
* process serves the next turn.
* - **Crash** (child exit) marks `agent_sessions.status='crashed'` + logs; the
* next `ensureSession` re-spawns + re-`session/new` (Phase 3 hardens auto-restart).
* - **Resume across a process restart is NOT attempted in Phase 2.** goose ACP
* advertises no `loadSession`/`session.resume`; qwen does, but cross-restart
* resume is Phase 3. Within ONE live process the ACP session persists across
* turns (the whole point of "warm"); a restart re-`session/new` (memory loss
* across restart, accepted per §10). The agent's resume capabilities ARE
* probed and logged for forward-compat.
*
* Each WarmAcpBackend instance owns exactly one (chat, agent) — the dispatcher
* pools them under `agentPool.register(chatId, agent, backend)`.
*
* SDK note (@agentclientprotocol/sdk@^0.22.1, cross-checked against the design's
* `^0.14` worry): the resume method is the STABLE `resumeSession` (`session/resume`,
* gated by `agentCapabilities.sessionCapabilities.resume`), NOT the `^0.14`
* `unstable_resumeSession`. `loadSession` is gated by `agentCapabilities.loadSession`.
*/
import { spawn, type ChildProcess } from 'node:child_process';
import type { FastifyBaseLogger } from 'fastify';
import {
ClientSideConnection,
type Client,
type SessionNotification,
type RequestPermissionRequest,
type RequestPermissionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type CreateElicitationRequest,
type CreateElicitationResponse,
} from '@agentclientprotocol/sdk';
import type { Sql } from '../../db.js';
import { resolveLaunchSpec } from '../acp-spawn.js';
import { isTurnOkForStopReason } from './warm-acp-routing.js';
import { getResolvedRegistry, type ResolvedProviderDef } from '../provider-config-registry.js';
import { createAcpNdJsonStream } from '../acp-stream.js';
import { mapSessionUpdate } from '../acp-event-map.js';
import { readWorktreeTextFile, writeWorktreeTextFile } from '../acp-client-fs.js';
import { waitForPermissionResponse, waitForElicitationResponse, cancelPendingPermission } from '../permission-waiter.js';
import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from '../acp-tool-snapshot.js';
import type {
AgentBackend,
AgentEvent,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/** State for one in-flight turn (only one at a time per backend — turns serialize). */
interface TurnState {
/** Per-turn task id, for routing permission prompts back to the UI. */
taskId: string | undefined;
/** BooCode session id for permission-waiter's broker frames. */
sessionId: string;
/** Per-turn mode id (autonomous-mode gate in permission-waiter). */
modeId: string | undefined;
onEvent: (e: AgentEvent) => void;
/** Tool-call snapshot accumulator for this turn — merge across tool_call_update. */
snapshots: Map<string, AcpToolSnapshot>;
}
export interface WarmAcpBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** The (chat, agent) this backend serves — its pool identity + DB key. */
chatId: string;
agent: string;
/** Resolved binary for the agent (from available_agents.install_path), or null. */
installPath: string | null;
/** Optional override of the resolved registry def (defaults to a live lookup). */
resolved?: ResolvedProviderDef;
}
export class WarmAcpBackend implements AgentBackend {
readonly backend = 'acp_warm' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly chatId: string;
private readonly agent: string;
private readonly installPath: string | null;
private readonly resolvedOverride: ResolvedProviderDef | undefined;
private child: ChildProcess | null = null;
private connection: ClientSideConnection | null = null;
/** The single ACP session id for this warm process; null until session/new. */
private acpSessionId: string | null = null;
private up = false;
/** Idempotent spawn guard — one warm process per backend, started lazily. */
private starting: Promise<void> | null = null;
/** Resume capabilities probed at initialize, logged for forward-compat (Phase 3). */
private supportsLoadSession = false;
private supportsResumeSession = false;
/** The current in-flight turn; the Client closures read it. Null between turns. */
private activeTurn: TurnState | null = null;
constructor(deps: WarmAcpBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.chatId = deps.chatId;
this.agent = deps.agent;
this.installPath = deps.installPath;
this.resolvedOverride = deps.resolved;
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
/** Phase 3: busy iff this backend's single session has an in-flight turn. The
* pool reads this to skip idle/LRU eviction (never kill the child mid-prompt). */
isBusy(): boolean {
return this.activeTurn != null;
}
// ─── warm-process lifecycle (2.1 spawn + initialize + session/new ONCE) ───────
/** Lazy: spawn the warm process on first use. Idempotent — one process per backend. */
private ensureProcess(worktreePath: string): Promise<void> {
if (this.up && this.connection && this.acpSessionId) return Promise.resolve();
if (!this.starting) {
this.starting = this.startProcess(worktreePath).catch((err) => {
// Reset so a later ensureSession can retry the spawn after a failed start.
this.starting = null;
throw err;
});
}
return this.starting;
}
private async startProcess(worktreePath: string): Promise<void> {
const resolved = this.resolvedOverride ?? getResolvedRegistry().get(this.agent);
const spec = resolved ? resolveLaunchSpec(resolved, this.installPath) : null;
if (!spec) throw new Error(`warm-acp: agent '${this.agent}' does not support ACP (no launch spec)`);
this.log.info({ agent: this.agent, chatId: this.chatId, binary: spec.binary, worktreePath }, 'warm-acp: spawning warm process');
// Child lifetime is the pool's. NOT tied to any per-turn abort signal — only
// the in-flight prompt is cancellable (via ACP session/cancel in prompt()).
const child = spawn(spec.binary, spec.args, {
cwd: worktreePath,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, ...spec.env },
});
this.child = child;
// 2.3: supervise the child; react to its exit, never let a request scope kill it.
child.on('exit', (code, signal) => {
this.up = false;
this.connection = null;
this.acpSessionId = null;
this.starting = null;
this.log.warn({ agent: this.agent, chatId: this.chatId, code, signal }, 'warm-acp: warm process exited — marking crashed (rebuild on next turn)');
void this.markCrashed();
});
// A spawn error (e.g. ENOENT) surfaces here, not as an exit.
child.on('error', (err) => {
this.up = false;
this.log.error({ agent: this.agent, chatId: this.chatId, err: errMsg(err) }, 'warm-acp: warm process error');
});
const stream = createAcpNdJsonStream(child);
const connection = new ClientSideConnection(() => this.buildClient(worktreePath), stream);
const init = await connection.initialize({
protocolVersion: 1,
clientInfo: { name: 'boocoder', version: '2.6.0' },
clientCapabilities: {},
});
const caps = init.agentCapabilities;
this.supportsLoadSession = caps?.loadSession === true;
this.supportsResumeSession = caps?.sessionCapabilities?.resume != null;
const session = await connection.newSession({ cwd: worktreePath, mcpServers: [] });
this.connection = connection;
this.acpSessionId = session.sessionId;
this.up = true;
this.log.info(
{
agent: this.agent,
chatId: this.chatId,
acpSessionId: session.sessionId,
loadSession: this.supportsLoadSession,
resumeSession: this.supportsResumeSession,
},
'warm-acp: warm session ready',
);
}
/** Build the ACP Client callbacks ONCE per connection. They read `this.activeTurn`
* so each turn's events/permissions route to the right place — exactly the
* opencode-server `activeTurn` pattern. Worktree-scoped FS like AcpStreamContext. */
private buildClient(worktreePath: string): Client {
return {
sessionUpdate: async (params: SessionNotification): Promise<void> => {
const turn = this.activeTurn;
if (!turn) return; // between turns — drop (no orphan settles a future turn)
for (const event of mapSessionUpdate(params, turn.snapshots)) {
turn.onEvent(event);
}
},
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => {
const turn = this.activeTurn;
if (turn?.taskId) {
// Route to the UI via the per-turn task id (same as the one-shot path).
return waitForPermissionResponse(turn.taskId, turn.sessionId, this.agent, turn.modeId, params);
}
const firstOption = params.options[0];
if (firstOption) return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(worktreePath, params.path, params.line, params.limit);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
const turn = this.activeTurn;
if (turn?.taskId) {
return waitForElicitationResponse(turn.taskId, turn.sessionId, this.agent, turn.modeId, params);
}
return { action: 'decline' };
},
};
}
// ─── ensureSession: create-or-reuse the warm session (2.1) ───────────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
await this.ensureProcess(opts.worktreePath);
if (!this.acpSessionId) throw new Error('warm-acp: session not ready after ensureProcess');
// P1.5-b: agent_sessions keys on (chat_id, agent). The ACP session id is the
// resume handle WITHIN the live process; across a process restart it's stale,
// so ensureProcess re-`session/new` and we upsert the fresh id here.
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'acp_warm', ${this.acpSessionId}, NULL, 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'acp_warm',
agent_session_id = EXCLUDED.agent_session_id,
server_port = NULL,
status = 'active',
last_active_at = clock_timestamp()
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: opts.chatId, agent: opts.agent }, 'warm-acp: agent_sessions upsert failed (non-fatal)');
});
return {
sessionId,
agent: opts.agent,
backend: 'acp_warm',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: this.acpSessionId,
serverPort: null,
};
}
// ─── prompt: one turn on the warm connection (2.2) ───────────────────────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
// The warm process may have crashed between ensureSession and here, or this
// backend was rebuilt — re-establish before prompting.
await this.ensureProcess(ctx.worktreePath);
const connection = this.connection;
const acpSessionId = this.acpSessionId;
if (!connection || !acpSessionId) {
return { ok: false, error: 'warm-acp: no live ACP connection' };
}
const snapshots = new Map<string, AcpToolSnapshot>();
// taskId routes permission/elicitation prompts back to the UI. The dispatcher
// passes it (plus mode) on the per-turn PromptCtx; permission-waiter keys on it.
const turn: TurnState = {
taskId: ctx.taskId,
sessionId: handle.sessionId,
modeId: ctx.modeId,
onEvent: ctx.onEvent,
snapshots,
};
this.activeTurn = turn;
// Per-turn abort: cancel the in-flight prompt on the SAME connection — never
// kill the child (that's the pool's lifetime). On cancel we also synthesize
// 'canceled' updates for any still-running tool calls so the UI doesn't leave
// them spinning (mirrors AcpStreamContext.markAborted).
let aborted = false;
const onAbort = () => {
if (aborted) return;
aborted = true;
connection.cancel({ sessionId: acpSessionId }).catch(() => {});
if (ctx.taskId) cancelPendingPermission(ctx.taskId);
for (const snap of synthesizeCanceledSnapshots(snapshots.values())) {
snapshots.set(snap.toolCallId, snap);
ctx.onEvent({ type: 'tool_update', toolCall: snap });
}
};
if (ctx.signal.aborted) {
this.activeTurn = null;
return { ok: false, error: 'aborted' };
}
ctx.signal.addEventListener('abort', onAbort, { once: true });
try {
const result = await connection.prompt({
sessionId: acpSessionId,
prompt: [{ type: 'text', text: input }],
});
if (aborted) return { ok: false, error: 'aborted' };
const stopReason = result.stopReason ?? 'end_turn';
return isTurnOkForStopReason(stopReason)
? { ok: true }
: { ok: false, error: `stop_reason: ${stopReason}` };
} catch (err) {
if (aborted) return { ok: false, error: 'aborted' };
return { ok: false, error: errMsg(err) };
} finally {
ctx.signal.removeEventListener('abort', onAbort);
this.activeTurn = null;
await this.sql`
UPDATE agent_sessions SET status = 'idle', last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
}
// ─── teardown ────────────────────────────────────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
// Gracefully close the ACP session if the agent supports it; then kill the child.
if (this.connection && this.acpSessionId) {
await this.connection.closeSession({ sessionId: this.acpSessionId }).catch(() => {});
}
await this.killChild();
await this.sql`
UPDATE agent_sessions SET status = 'closed'
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => {});
}
async dispose(): Promise<void> {
this.up = false;
this.activeTurn = null;
if (this.connection && this.acpSessionId) {
await this.connection.closeSession({ sessionId: this.acpSessionId }).catch(() => {});
}
await this.killChild();
this.connection = null;
this.acpSessionId = null;
this.starting = null;
}
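private async killChild(): Promise<void> {
const child = this.child;
this.child = null;
// `child.killed` only records that a signal was sent, not that the process
// exited, so liveness is judged by exitCode/signalCode instead.
const exited = (c: ChildProcess) => c.exitCode !== null || c.signalCode !== null;
if (!child || exited(child)) return;
child.kill('SIGTERM');
await new Promise<void>((resolve) => {
const t = setTimeout(() => {
// Still running after the grace period: escalate to SIGKILL.
if (!exited(child)) child.kill('SIGKILL');
resolve();
}, 5_000);
t.unref?.();
child.once('close', () => {
clearTimeout(t);
resolve();
});
});
}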
private async markCrashed(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
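
`ensureProcess` relies on a cached in-flight promise so that concurrent callers share one spawn, and it clears the cache on failure so a later call can retry. The sketch below isolates that pattern. `LazyStart` is a hypothetical class written for illustration, not part of the backend.

```typescript
// Hypothetical single-flight starter; LazyStart is an illustrative name.
class LazyStart {
  private starting: Promise<void> | null = null;
  private readonly start: () => Promise<void>;
  /** Number of times the underlying start function actually ran. */
  starts = 0;

  constructor(start: () => Promise<void>) {
    this.start = start;
  }

  ensure(): Promise<void> {
    if (!this.starting) {
      this.starts += 1;
      this.starting = this.start().catch((err: unknown) => {
        // Clear the cached promise so a later call can retry after a failure.
        this.starting = null;
        throw err;
      });
    }
    return this.starting;
  }
}

// Two back-to-back callers share the same in-flight start.
const lazy = new LazyStart(() => Promise.resolve());
const first = lazy.ensure();
const second = lazy.ensure();
console.log(first === second, lazy.starts);
```

The backend additionally clears its cached promise in the child's `exit` handler, which this sketch leaves out.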


@@ -0,0 +1,306 @@
/**
* write-edit-robustness #4 — worktree checkpoints.
*
* External agents (opencode / goose / qwen / claude) write DIRECTLY into the
* shared session worktree (`/tmp/booworktrees/sess-<id>`); BooCode's own `rewind`
* only reverses `pending_changes` against the project root, so it has zero coverage
* there. A checkpoint is a pre-turn shadow-commit of the worktree tree (tracked +
* untracked) captured WITHOUT touching the real index/working tree, stored in a
* private GC-safe ref. `restoreCheckpoint` rewinds the worktree to that commit,
* trims the transcript from the anchor message forward, and resets the agent
* backend so the next turn re-establishes a fresh context consistent with the
* restored files.
*
* All git goes through hostExec + shellEscape (BooCoder runs on the host; the
* worktrees live on the host fs). Checkpoint CREATION is best-effort: a failure
* logs and returns null — it must NEVER throw into the dispatch turn.
*/
import { randomUUID } from 'node:crypto';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../db.js';
import { hostExec } from './host-exec.js';
import { agentPool, OPENCODE_POOL_KEY } from './agent-pool.js';
import type { AgentSessionHandle } from './agent-backend.js';
/** Minimal shell escape for paths/refs (single-quote wrapping). Mirrors worktrees.ts. */
function shellEscape(s: string): string {
return "'" + s.replace(/'/g, "'\\''") + "'";
}
/**
* Pure builder for the shadow-commit command. Captures tracked + untracked files
* in the worktree into a temp index (so the real index/working tree is untouched),
* writes a tree, commits it parented on HEAD, and parks the commit under a private
* ref `refs/boocode/checkpoints/<id>` so git's GC never reclaims it. Prints ONLY
* the resulting SHA on stdout (the trailing `printf '%s'`), so the caller parses
* stdout.trim() directly.
*
* `id` is the row UUID (minted before the ref so the ref name matches the row).
* Both the worktree path and the id are shell-escaped.
*/
export function buildShadowCommitCommand(worktreePath: string, id: string): string {
const wt = shellEscape(worktreePath);
const ref = shellEscape(`refs/boocode/checkpoints/${id}`);
return (
`cd ${wt} && TMP=$(mktemp) && GIT_INDEX_FILE="$TMP" git read-tree HEAD ` +
`&& GIT_INDEX_FILE="$TMP" git add -A ` +
`&& TREE=$(GIT_INDEX_FILE="$TMP" git write-tree) ` +
`&& SHA=$(git commit-tree "$TREE" -p HEAD -m "boocode checkpoint") ` +
`&& git update-ref ${ref} "$SHA" && rm -f "$TMP" && printf '%s' "$SHA"`
);
}
export interface CreateCheckpointArgs {
chatId: string;
sessionId: string | null;
worktreeId: string | null;
worktreePath: string;
messageId: string | null;
label?: string | null;
}
/**
* Capture a pre-turn checkpoint of the session worktree. Best-effort: returns the
* inserted row's { id, commit_sha } on success, or null on any failure (the turn
* proceeds either way — a missing checkpoint just means no restore point for that
* turn). NEVER throws.
*
* The id is minted up front so the git ref name (`refs/boocode/checkpoints/<id>`)
* matches the DB row id, keeping ref and row in lockstep.
*/
export async function createCheckpoint(
sql: Sql,
args: CreateCheckpointArgs,
opts?: { signal?: AbortSignal; log?: FastifyBaseLogger },
): Promise<{ id: string; commit_sha: string } | null> {
const id = randomUUID();
try {
const cmd = buildShadowCommitCommand(args.worktreePath, id);
const res = await hostExec(cmd, { signal: opts?.signal, timeoutMs: 30_000 });
if (res.exitCode !== 0) {
opts?.log?.warn(
{ chatId: args.chatId, worktreePath: args.worktreePath, stderr: res.stderr.trim().slice(0, 500) },
'checkpoint: shadow-commit failed (turn proceeds without a checkpoint)',
);
return null;
}
const commitSha = res.stdout.trim();
if (!commitSha) {
opts?.log?.warn(
{ chatId: args.chatId, worktreePath: args.worktreePath },
'checkpoint: shadow-commit produced no SHA (turn proceeds)',
);
return null;
}
await sql`
INSERT INTO checkpoints (id, chat_id, session_id, worktree_id, message_id, commit_sha, label)
VALUES (${id}, ${args.chatId}, ${args.sessionId}, ${args.worktreeId}, ${args.messageId}, ${commitSha}, ${args.label ?? null})
`;
opts?.log?.info({ checkpointId: id, chatId: args.chatId, commitSha }, 'checkpoint: created');
return { id, commit_sha: commitSha };
} catch (err) {
opts?.log?.warn(
{ chatId: args.chatId, err: err instanceof Error ? err.message : String(err) },
'checkpoint: create threw (turn proceeds without a checkpoint)',
);
return null;
}
}
/** Error the route maps to a 404 when the checkpoint can't be resolved / scoped. */
export class CheckpointNotFoundError extends Error {
constructor(message: string) {
super(message);
this.name = 'CheckpointNotFoundError';
}
}
export interface RestoreCheckpointResult {
checkpoint_id: string;
messages_deleted: number;
worktree_reset: boolean;
backend_reset: boolean;
}
export interface RestoreCheckpointOpts {
signal?: AbortSignal;
log?: FastifyBaseLogger;
/** If set, the checkpoint MUST belong to this session (route scope guard). */
sessionId?: string;
}
interface CheckpointRow {
id: string;
chat_id: string;
session_id: string | null;
worktree_id: string | null;
message_id: string | null;
commit_sha: string;
created_at: Date;
}
/**
* Restore a checkpoint: rewind its worktree to the shadow commit, trim the
* transcript from the anchor message forward, reset the backend session, and drop
* now-orphaned later checkpoints. Throws CheckpointNotFoundError when the
* checkpoint is missing or not in the requested session (route → 404).
*/
export async function restoreCheckpoint(
sql: Sql,
checkpointId: string,
opts?: RestoreCheckpointOpts,
): Promise<RestoreCheckpointResult> {
// 1. Resolve the checkpoint.
const [cp] = await sql<CheckpointRow[]>`
SELECT id, chat_id, session_id, worktree_id, message_id, commit_sha, created_at
FROM checkpoints WHERE id = ${checkpointId}
`;
if (!cp) {
throw new CheckpointNotFoundError('checkpoint not found');
}
// Authorization scope (fail-safe): the checkpoint's chat must belong to the
// requested session. cp.session_id is a denormalized hint that may be null, so
// gating on it directly fails open — resolve the owning session via chats
// (authoritative; chat_id is NOT NULL) and deny on any mismatch or missing row.
if (opts?.sessionId) {
const [owner] = await sql<{ session_id: string | null }[]>`
SELECT session_id FROM chats WHERE id = ${cp.chat_id}
`;
if (!owner || owner.session_id !== opts.sessionId) {
throw new CheckpointNotFoundError('checkpoint not in session');
}
}
// 2. Resolve the worktree path (by worktree_id, else the session's active one).
let worktreePath: string | null = null;
if (cp.worktree_id) {
const [wt] = await sql<{ path: string }[]>`
SELECT path FROM worktrees WHERE id = ${cp.worktree_id}
`;
worktreePath = wt?.path ?? null;
}
if (!worktreePath) {
const sid = cp.session_id ?? opts?.sessionId ?? null;
if (sid) {
const [wt] = await sql<{ path: string }[]>`
SELECT path FROM worktrees WHERE session_id = ${sid} AND status = 'active' LIMIT 1
`;
worktreePath = wt?.path ?? null;
}
}
// 3. Worktree reset — hard-reset to the shadow commit, then clean untracked.
let worktreeReset = false;
if (worktreePath) {
const resetRes = await hostExec(
`git -C ${shellEscape(worktreePath)} reset --hard ${shellEscape(cp.commit_sha)}`,
{ signal: opts?.signal, timeoutMs: 30_000 },
).catch((err) => {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: reset --hard threw',
);
return null;
});
if (resetRes && resetRes.exitCode === 0) {
const cleanRes = await hostExec(
`git -C ${shellEscape(worktreePath)} clean -fd`,
{ signal: opts?.signal, timeoutMs: 30_000 },
).catch(() => null);
worktreeReset = cleanRes != null && cleanRes.exitCode === 0;
if (!worktreeReset) {
opts?.log?.warn({ checkpointId, worktreePath }, 'checkpoint restore: clean -fd did not succeed');
}
} else {
opts?.log?.warn(
{ checkpointId, worktreePath, stderr: resetRes?.stderr?.trim()?.slice(0, 500) },
'checkpoint restore: reset --hard did not succeed',
);
}
} else {
opts?.log?.warn({ checkpointId }, 'checkpoint restore: no worktree path resolved (files not reset)');
}
// 4. Trim the transcript from the anchor message forward. message_parts FK to
// messages is ON DELETE CASCADE (apps/server schema.sql:49), so parts are
// removed with their messages — no explicit parts delete needed.
let messagesDeleted = 0;
if (cp.message_id) {
const deleted = await sql<{ id: string }[]>`
DELETE FROM messages
WHERE chat_id = ${cp.chat_id}
AND created_at >= (SELECT created_at FROM messages WHERE id = ${cp.message_id})
RETURNING id
`;
messagesDeleted = deleted.length;
}
// 5. Backend reset — mark the chat's agent sessions crashed so the next turn
// re-establishes a fresh backend, and evict the live pool session(s) for this
// (chat, agent). Warm backends hold context server-side with no partial
// rewind, so a full reset is the only consistent option (proposal §4).
const agentRows = await sql<{ agent: string; backend: string; agent_session_id: string | null; session_id: string | null; worktree_id: string | null }[]>`
SELECT agent, backend, agent_session_id, session_id, worktree_id
FROM agent_sessions WHERE chat_id = ${cp.chat_id}
`;
await sql`
UPDATE agent_sessions SET status = 'crashed' WHERE chat_id = ${cp.chat_id}
`.catch(() => {});
let backendReset = false;
try {
// opencode runs on the SHARED server (keyed on a sentinel, not the chat) — close
// just this chat's session(s) on it, mirroring the lifecycle close-hook.
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
for (const row of agentRows) {
if (row.backend !== 'opencode_server' || !row.agent_session_id) continue;
const handle: AgentSessionHandle = {
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId: cp.chat_id,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
};
await ocBackend.closeSession(handle).catch((err) => {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: opencode closeSession threw',
);
});
}
}
// Warm-ACP backends are pooled under the chat id — dispose them (kills the
// goose/qwen child). closeChat skips busy backends (a live turn isn't torn down).
const disposed = await agentPool.closeChat(cp.chat_id);
backendReset = true;
opts?.log?.info({ checkpointId, chatId: cp.chat_id, disposed }, 'checkpoint restore: backend reset');
} catch (err) {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: backend reset threw',
);
}
// 6. Drop now-orphaned later checkpoints for this chat (their anchor messages were
// just trimmed). Compare `created_at` SERVER-SIDE via a subquery (NOT the JS
// Date round-trip, which truncates the stored microsecond precision to ms and
// would make this checkpoint delete ITSELF), and exclude this checkpoint's own
// id so it always survives — letting the user re-restore to it.
await sql`
DELETE FROM checkpoints
WHERE chat_id = ${cp.chat_id}
AND id <> ${cp.id}
AND created_at > (SELECT created_at FROM checkpoints WHERE id = ${cp.id})
`.catch(() => {});
return {
checkpoint_id: checkpointId,
messages_deleted: messagesDeleted,
worktree_reset: worktreeReset,
backend_reset: backendReset,
};
}
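
Review note: the scope guard in `restoreCheckpoint` is easier to reason about in isolation. The sketch below contrasts the old fail-open condition quoted in the v2.7.2 commit message with the fail-closed check that resolves ownership through `chats`. The `CheckpointHint` and `ChatOwner` types and both function names are hypothetical stand-ins, not the real row types or helpers.

```typescript
// Hypothetical, simplified stand-ins for the checkpoint and chat rows.
interface CheckpointHint {
  chatId: string;
  sessionId: string | null; // denormalized hint, may be null
}
interface ChatOwner {
  sessionId: string | null; // authoritative owner, read from chats
}

// Pre-fix shape: a null (or empty) hint short-circuits the comparison,
// so the request is never denied and the guard fails open.
function allowedFailOpen(cp: CheckpointHint, requested: string): boolean {
  const denied = Boolean(cp.sessionId && cp.sessionId !== requested);
  return !denied;
}

// Post-fix shape: deny unless the owning chat exists and matches exactly.
function allowedFailClosed(owner: ChatOwner | undefined, requested: string): boolean {
  return owner !== undefined && owner.sessionId === requested;
}

const nullHint: CheckpointHint = { chatId: 'c1', sessionId: null };
const crossSessionOld = allowedFailOpen(nullHint, 'other-session');
const crossSessionNew = allowedFailClosed({ sessionId: 'owner-session' }, 'other-session');
```

With a null hint the old condition lets a cross-session request through (`crossSessionOld` is `true`), while the chat-based check denies it (`crossSessionNew` is `false`), including when the chat row is missing.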

@@ -0,0 +1,108 @@
/**
* v2.5.11: discover Claude Code's real, enabled commands + plugin skills from
* disk so the coder slash menu shows them (claude is PTY — no ACP discovery).
*
* Scope (v1): user-global only — `~/.claude/commands/*.md` plus the enabled
* plugins listed in `~/.claude/settings.json:enabledPlugins` (user-scope install
* paths from `~/.claude/plugins/.../installed_plugins.json`). Project-local
* plugins and `<cwd>/.claude/commands` are deferred. Names are bare.
*/
import { readFileSync, readdirSync, existsSync } from 'node:fs';
import { homedir } from 'node:os';
import { join } from 'node:path';
import type { AgentCommand } from './provider-types.js';
/** Minimal frontmatter reader — single-line `key: value` between `---` fences. */
function frontmatterField(content: string, field: string): string | undefined {
const block = content.match(/^---\r?\n([\s\S]*?)\r?\n---/);
if (!block?.[1]) return undefined;
const m = block[1].match(new RegExp(`^${field}:\\s*(.+)$`, 'm'));
return m?.[1]?.trim().replace(/^["']|["']$/g, '') || undefined;
}
function readCommandDir(dir: string): AgentCommand[] {
if (!existsSync(dir)) return [];
let files: string[];
try {
files = readdirSync(dir);
} catch {
return [];
}
const out: AgentCommand[] = [];
for (const f of files) {
if (!f.endsWith('.md')) continue;
let description: string | undefined;
try {
description = frontmatterField(readFileSync(join(dir, f), 'utf8'), 'description');
} catch {
/* unreadable — still list the command by name */
}
out.push({ name: f.slice(0, -3), kind: 'command', ...(description ? { description } : {}) });
}
return out;
}
function readSkillDir(dir: string): AgentCommand[] {
if (!existsSync(dir)) return [];
let entries: string[];
try {
entries = readdirSync(dir);
} catch {
return [];
}
const out: AgentCommand[] = [];
for (const sub of entries) {
const skillMd = join(dir, sub, 'SKILL.md');
if (!existsSync(skillMd)) continue;
let content: string;
try {
content = readFileSync(skillMd, 'utf8');
} catch {
continue;
}
out.push({
name: frontmatterField(content, 'name') ?? sub,
kind: 'skill',
...(() => {
const d = frontmatterField(content, 'description');
return d ? { description: d } : {};
})(),
});
}
return out;
}
export function discoverClaudeCommands(): AgentCommand[] {
const root = join(homedir(), '.claude');
const out: AgentCommand[] = [];
// User custom commands.
out.push(...readCommandDir(join(root, 'commands')));
// Enabled plugins (user-scope installs).
try {
const settings = JSON.parse(readFileSync(join(root, 'settings.json'), 'utf8')) as {
enabledPlugins?: Record<string, boolean>;
};
const installed = JSON.parse(
readFileSync(join(root, 'plugins', 'installed_plugins.json'), 'utf8'),
) as { plugins?: Record<string, Array<{ scope?: string; installPath?: string }>> };
const enabled = settings.enabledPlugins ?? {};
const plugins = installed.plugins ?? {};
for (const [key, on] of Object.entries(enabled)) {
if (!on) continue;
const installs = plugins[key] ?? [];
const installPath = (installs.find((i) => i.scope === 'user') ?? installs[0])?.installPath;
if (!installPath || !existsSync(installPath)) continue;
out.push(...readSkillDir(join(installPath, 'skills')));
out.push(...readCommandDir(join(installPath, 'commands')));
}
} catch {
/* missing/unreadable plugin config → user commands only */
}
// Dedupe by name (first wins).
const seen = new Set<string>();
return out.filter((c) => (seen.has(c.name) ? false : (seen.add(c.name), true)));
}
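
Review note: a small worked example of the two pure pieces of this file, the single-line frontmatter reader and the first-wins dedupe. `readField` restates `frontmatterField` so the snippet runs on its own; `Cmd`, `dedupeByName`, and the sample document are hypothetical and exist only for illustration.

```typescript
// Restated copy of the frontmatter reader so the example is self-contained.
function readField(content: string, field: string): string | undefined {
  const block = content.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!block?.[1]) return undefined;
  const m = block[1].match(new RegExp(`^${field}:\\s*(.+)$`, 'm'));
  return m?.[1]?.trim().replace(/^["']|["']$/g, '') || undefined;
}

// Hypothetical reduced command shape (the real type is AgentCommand).
interface Cmd {
  name: string;
  kind: 'command' | 'skill';
}

// First occurrence of a name wins, later duplicates are dropped.
function dedupeByName(items: Cmd[]): Cmd[] {
  const seen = new Set<string>();
  return items.filter((c) => {
    if (seen.has(c.name)) return false;
    seen.add(c.name);
    return true;
  });
}

const sampleDoc = [
  '---',
  'name: review',
  'description: "Review the current diff"',
  '---',
  '# Body text',
].join('\n');

const sampleDescription = readField(sampleDoc, 'description');
const sampleMissing = readField(sampleDoc, 'argument-hint');
const deduped = dedupeByName([
  { name: 'review', kind: 'command' },
  { name: 'review', kind: 'skill' },
  { name: 'plan', kind: 'skill' },
]);
```

Because user commands are pushed before plugin entries, a user command named `review` shadows a plugin skill of the same name under this dedupe.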

@@ -0,0 +1,22 @@
/**
* v2.3 phase 2: tier-1 fast availability check — is a binary on PATH?
*
* Uses execFile (NO shell) because the binary name can come from the provider
* config file (custom ACP entries) — mirrors the Phase 1 agent-probe hardening.
* Note: agent-probe's `whichBinary` returns the resolved path (it needs it for
* `install_path`); this returns a boolean. Kept separate rather than over-
* refactored into one helper — different return contracts, two short call sites.
*/
import { execFile as execFileCb } from 'node:child_process';
import { promisify } from 'node:util';
const execFile = promisify(execFileCb);
export async function isCommandAvailable(binary: string): Promise<boolean> {
try {
const { stdout } = await execFile('which', [binary], { timeout: 10_000 });
return stdout.trim().length > 0;
} catch {
return false;
}
}

@@ -0,0 +1,77 @@
/**
* Strip opencode-dcp plugin tags (`<dcp-message-id>mNNNN</dcp-message-id>`) that
* the @tarquinen/opencode-dcp plugin appends to assistant text and which
* otherwise render as literal text in the UI.
*
* Why a streaming stripper and not a per-chunk `.replace()`: opencode streams
* assistant text token-by-token, so the tag arrives SPLIT across many SSE deltas
* (`<dcp`, `-message`, `-id>`, `m0019`, `</dcp`, …). A per-chunk regex never sees
* a complete tag in any single fragment, so the fragments pass through and the
* dispatcher reassembles the full tag in the persisted/displayed content. The
* stripper below buffers across chunks: it emits everything that cannot be part
* of a forming tag and holds back only a trailing partial-tag prefix until the
* next chunk resolves it — without holding back legitimate `<…>` content.
*/
const DCP_TAG_RE = /<dcp-message-id>[^<]*<\/dcp-message-id>/g;
const OPEN = '<dcp-message-id>';
const CLOSE = '</dcp-message-id>';
/** One-shot strip of COMPLETE tags. Safe for non-streaming / final content. */
export function stripDcpTags(s: string): string {
return s.replace(DCP_TAG_RE, '');
}
/**
* Could `tail` (a substring starting at a `<`) still grow into a complete dcp
* tag on a future chunk? If so the caller must hold it back rather than emit it.
* Returns false for unrelated `<` content (`<div>`, `<T>`, …) so those stream
* normally.
*/
function isPartialDcp(tail: string): boolean {
// A prefix of the opening marker: '<', '<d', …, '<dcp-message-id'.
if (OPEN.startsWith(tail)) return true;
// Opening marker fully seen — content (and maybe a forming close) still streaming.
if (tail.startsWith(OPEN)) {
const rest = tail.slice(OPEN.length);
const lt = rest.indexOf('<');
if (lt === -1) return true; // still inside the [^<]* content run
return CLOSE.startsWith(rest.slice(lt)); // a partial close marker forming
}
return false;
}
export interface DcpStreamStripper {
/** Feed one text chunk; returns the portion safe to emit now (may be ''). */
push(chunk: string): string;
/** Stream end: returns whatever was held back, with complete tags stripped. */
flush(): string;
}
/** Stateful, cross-chunk-safe dcp stripper. One instance per turn. */
export function makeDcpStreamStripper(): DcpStreamStripper {
let buf = '';
return {
push(chunk: string): string {
buf += chunk;
buf = buf.replace(DCP_TAG_RE, ''); // drop any now-complete tags
// Find the earliest `<` whose suffix is a forming dcp tag; hold from there,
// emit everything before it (real text, including unrelated `<…>`).
for (let i = buf.indexOf('<'); i !== -1; i = buf.indexOf('<', i + 1)) {
if (isPartialDcp(buf.slice(i))) {
const emit = buf.slice(0, i);
buf = buf.slice(i);
return emit;
}
}
const emit = buf;
buf = '';
return emit;
},
flush(): string {
const out = stripDcpTags(buf);
buf = '';
return out;
},
};
}
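
Review note: the difference between per-chunk stripping and the buffered stripper is easiest to see on a tag that arrives split across deltas. The sketch below is a condensed restatement of the logic above (names `TAG_RE`, `couldBecomeTag`, and `makeStripper` are local to the example), fed with a hypothetical chunk sequence.

```typescript
const TAG_RE = /<dcp-message-id>[^<]*<\/dcp-message-id>/g;
const OPEN_TAG = '<dcp-message-id>';
const CLOSE_TAG = '</dcp-message-id>';

// True when `tail` (starting at a '<') may still grow into a complete tag.
function couldBecomeTag(tail: string): boolean {
  if (OPEN_TAG.startsWith(tail)) return true;
  if (!tail.startsWith(OPEN_TAG)) return false;
  const rest = tail.slice(OPEN_TAG.length);
  const lt = rest.indexOf('<');
  return lt === -1 || CLOSE_TAG.startsWith(rest.slice(lt));
}

function makeStripper(): { push(chunk: string): string; flush(): string } {
  let buf = '';
  return {
    push(chunk: string): string {
      // Drop tags that are now complete, then hold back only a forming tag.
      buf = (buf + chunk).replace(TAG_RE, '');
      for (let i = buf.indexOf('<'); i !== -1; i = buf.indexOf('<', i + 1)) {
        if (couldBecomeTag(buf.slice(i))) {
          const emit = buf.slice(0, i);
          buf = buf.slice(i);
          return emit;
        }
      }
      const emit = buf;
      buf = '';
      return emit;
    },
    flush(): string {
      const out = buf.replace(TAG_RE, '');
      buf = '';
      return out;
    },
  };
}

// Hypothetical stream: ordinary markup plus a dcp tag split over six deltas.
const chunks = ['Hello <b>wor', 'ld</b><dcp', '-message', '-id>', 'm0019', '</dcp', '-message-id>', ' done'];

// Per-chunk replace never sees a whole tag, so the tag survives reassembly.
const naive = chunks.map((c) => c.replace(TAG_RE, '')).join('');

const stripper = makeStripper();
const streamed = chunks.map((c) => stripper.push(c)).join('') + stripper.flush();
```

Here `naive` still contains the full `<dcp-message-id>m0019</dcp-message-id>` tag, while `streamed` is `Hello <b>world</b> done`: the unrelated `<b>` markup is emitted immediately and only the forming dcp tag is held back.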

File diff suppressed because it is too large

@@ -0,0 +1,271 @@
// Fuzzy patch locator for staged edits.
//
// Local quantized models (qwen3.6 and friends) frequently reproduce an
// `old_string` with small, semantically-irrelevant drift: trailing whitespace,
// a different indent width, or "smart" unicode punctuation (curly quotes, an
// en/em-dash, a non-breaking space) where the source has the plain ASCII form.
// An exact `String.includes` then fails and the queued edit is lost even though
// a human would say it obviously matches.
//
// `locateMatch` walks a ladder of progressively looser strategies and returns
// the real `[start, end)` UTF-16 code-unit offset span in the ORIGINAL content so the caller
// can splice in `new_string` over the true file text (preserving the file's own
// whitespace/unicode, not the model's drifted copy). The ladder stops at the
// first strategy that resolves to a single span:
//
// 1. exact — indexOf; >1 hit is reported `ambiguous` (we refuse to
// guess which occurrence the model meant).
// 2. per-line ws — line-window compare ignoring per-line trailing
// whitespace and leading/trailing blank needle lines.
// 3. unicode canon — same line-window compare after folding smart
// punctuation to ASCII on both sides; the match is
// mapped back to original offsets.
// 4. levenshtein — best line-window by normalized edit-distance
// similarity; accepted only at >= SIMILARITY_THRESHOLD.
//
// Pure and dependency-free (Levenshtein is the standard iterative two-row DP),
// reimplemented from the general technique — no vendored source.
export type MatchResult =
| { kind: 'exact' | 'fuzzy'; start: number; end: number } // [start,end) offsets into content
| { kind: 'ambiguous'; count: number }
| { kind: 'not_found' };
/** Levenshtein similarity floor for the final fuzzy fallback (strategy 4). */
export const SIMILARITY_THRESHOLD = 0.66;
export function locateMatch(content: string, needle: string): MatchResult {
// Empty needle has no meaningful match.
if (needle.length === 0) return { kind: 'not_found' };
// --- 1. Exact ----------------------------------------------------------------
const exact = locateExact(content, needle);
if (exact) return exact;
// --- 2. Per-line whitespace-insensitive -------------------------------------
const ws = locateByLineWindow(content, needle);
if (ws) return ws;
// --- 3. Unicode-canonicalized whitespace pass -------------------------------
const canon = locateCanonical(content, needle);
if (canon) return canon;
// --- 4. Levenshtein similarity ----------------------------------------------
const lev = locateByLevenshtein(content, needle);
if (lev) return lev;
return { kind: 'not_found' };
}
// --- Strategy 1: exact -------------------------------------------------------
function locateExact(content: string, needle: string): MatchResult | null {
const first = content.indexOf(needle);
if (first === -1) return null;
const second = content.indexOf(needle, first + 1);
if (second === -1) {
return { kind: 'exact', start: first, end: first + needle.length };
}
// Count all occurrences so the caller can report a useful number.
let count = 2;
let idx = content.indexOf(needle, second + 1);
while (idx !== -1) {
count++;
idx = content.indexOf(needle, idx + 1);
}
return { kind: 'ambiguous', count };
}
// --- Line-window machinery ---------------------------------------------------
interface Line {
/** Raw line text (no trailing newline). */
text: string;
/** Offset of the first char of this line in the original content. */
start: number;
/** Offset one past the last char of this line (before its newline, if any). */
end: number;
}
/**
* Split content into lines, tracking each line's real offset span. The span
* EXCLUDES the trailing newline so consecutive line spans plus their newlines
* exactly reconstruct the content; the match span we hand back covers from the
* first matched line's start through the last matched line's end (i.e. without a
* trailing newline), which is what an in-place splice wants.
*/
function splitLines(content: string): Line[] {
const lines: Line[] = [];
let start = 0;
for (let i = 0; i <= content.length; i++) {
if (i === content.length || content[i] === '\n') {
lines.push({ text: content.slice(start, i), start, end: i });
start = i + 1;
}
}
return lines;
}
/** Strip leading/trailing all-blank lines; returns the trimmed slice. */
function trimBlankLines(lines: string[]): string[] {
let lo = 0;
let hi = lines.length;
while (lo < hi && lines[lo]!.trim() === '') lo++;
while (hi > lo && lines[hi - 1]!.trim() === '') hi--;
return lines.slice(lo, hi);
}
/**
* Find a contiguous window of content lines whose trailing-whitespace-trimmed
* text equals the needle's (blank-trimmed) lines. Returns the real offset span
* over the matched content lines, or null if nothing matches. Multiple matches →
* ambiguous. `normalize` lets the caller fold unicode before comparing.
*/
function locateByLineWindow(
content: string,
needle: string,
normalize: (s: string) => string = (s) => s,
): MatchResult | null {
const contentLines = splitLines(content);
const needleLines = trimBlankLines(needle.split('\n'));
const n = needleLines.length;
if (n === 0) return null;
// A single needle line that is itself blank can't be located meaningfully.
if (n === 1 && needleLines[0]!.trim() === '') return null;
const needleKey = needleLines.map((l) => normalize(l.trimEnd())).join('\n');
const hits: Array<{ start: number; end: number }> = [];
for (let i = 0; i + n <= contentLines.length; i++) {
const windowKey = contentLines
.slice(i, i + n)
.map((l) => normalize(l.text.trimEnd()))
.join('\n');
if (windowKey === needleKey) {
hits.push({ start: contentLines[i]!.start, end: contentLines[i + n - 1]!.end });
}
}
if (hits.length === 0) return null;
if (hits.length > 1) return { kind: 'ambiguous', count: hits.length };
return { kind: 'fuzzy', start: hits[0]!.start, end: hits[0]!.end };
}
// --- Strategy 3: unicode canonicalization ------------------------------------
/**
* Fold smart punctuation to its ASCII equivalent. Crucially this is a
* length-PRESERVING, per-character map (every replacement is one char → one
* char), so an offset into the canonical string is also a valid offset into the
* original — letting strategy 3 reuse the line-window matcher and still hand
* back true original-content offsets.
*/
function canonicalizeChar(ch: string): string {
switch (ch) {
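// single quotes / apostrophes
case '\u2018': // left single quotation mark
case '\u2019': // right single quotation mark
case '\u201A': // single low-9 quotation mark
case '\u201B': // single high-reversed-9 quotation mark
return "'";
// double quotes
case '\u201C': // left double quotation mark
case '\u201D': // right double quotation mark
case '\u201E': // double low-9 quotation mark
case '\u201F': // double high-reversed-9 quotation mark
return '"';
// dashes
case '\u2013': // en dash
case '\u2014': // em dash
case '\u2012': // figure dash
case '\u2015': // horizontal bar
case '\u2212': // minus sign
return '-';
// spaces
case '\u00A0': // nbsp
case '\u2007': // figure space
case '\u202F': // narrow nbsp
return ' ';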
default:
return ch;
}
}
function canonicalize(s: string): string {
let out = '';
for (const ch of s) out += canonicalizeChar(ch);
return out;
}
function locateCanonical(content: string, needle: string): MatchResult | null {
// Only worth running if canonicalization actually changes something on either
// side — otherwise it's identical to strategy 2 which already failed.
const canonContent = canonicalize(content);
const canonNeedle = canonicalize(needle);
if (canonContent === content && canonNeedle === needle) return null;
// Offsets are preserved (length-preserving fold), so a match on the canonical
// content maps directly back to the original.
return locateByLineWindow(canonContent, canonNeedle);
}
// --- Strategy 4: Levenshtein similarity --------------------------------------
/** Standard iterative two-row Levenshtein edit distance. */
function levenshtein(a: string, b: string): number {
if (a === b) return 0;
if (a.length === 0) return b.length;
if (b.length === 0) return a.length;
let prev = new Array<number>(b.length + 1);
let curr = new Array<number>(b.length + 1);
for (let j = 0; j <= b.length; j++) prev[j] = j;
for (let i = 1; i <= a.length; i++) {
curr[0] = i;
const ac = a.charCodeAt(i - 1);
for (let j = 1; j <= b.length; j++) {
const cost = ac === b.charCodeAt(j - 1) ? 0 : 1;
curr[j] = Math.min(
prev[j]! + 1, // deletion
curr[j - 1]! + 1, // insertion
prev[j - 1]! + cost, // substitution
);
}
[prev, curr] = [curr, prev];
}
return prev[b.length]!;
}
/** Normalized similarity in [0,1]: 1 - dist / max(len). */
function similarity(a: string, b: string): number {
const maxLen = Math.max(a.length, b.length);
if (maxLen === 0) return 1;
return 1 - levenshtein(a, b) / maxLen;
}
function locateByLevenshtein(content: string, needle: string): MatchResult | null {
const contentLines = splitLines(content);
const needleLines = trimBlankLines(needle.split('\n'));
const n = needleLines.length;
if (n === 0) return null;
if (contentLines.length < n) return null;
const needleJoined = needleLines.map((l) => l.trim()).join('\n');
let best = -1;
let bestSpan: { start: number; end: number } | null = null;
for (let i = 0; i + n <= contentLines.length; i++) {
const window = contentLines.slice(i, i + n);
const windowJoined = window.map((l) => l.text.trim()).join('\n');
const score = similarity(windowJoined, needleJoined);
if (score > best) {
best = score;
bestSpan = { start: window[0]!.start, end: window[n - 1]!.end };
}
}
if (bestSpan && best >= SIMILARITY_THRESHOLD) {
return { kind: 'fuzzy', start: bestSpan.start, end: bestSpan.end };
}
return null;
}
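
Review note: a worked example of how the strategy-4 floor behaves. The sketch restates the two-row Levenshtein and the `1 - dist / max(len)` similarity so it runs standalone; `THRESHOLD` mirrors `SIMILARITY_THRESHOLD` (0.66), and `accepts` plus the sample strings are hypothetical.

```typescript
// Two-row Levenshtein edit distance (restated so the example is self-contained).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charCodeAt(i - 1) === b.charCodeAt(j - 1) ? 0 : 1;
      curr.push(
        Math.min(
          (prev[j] ?? 0) + 1, // deletion
          (curr[j - 1] ?? 0) + 1, // insertion
          (prev[j - 1] ?? 0) + cost, // substitution
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length] ?? 0;
}

// Normalized similarity in [0, 1].
function sim(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  return maxLen === 0 ? 1 : 1 - editDistance(a, b) / maxLen;
}

const THRESHOLD = 0.66; // same value as SIMILARITY_THRESHOLD above

// Hypothetical helper: would a window/needle pair clear the fuzzy floor?
function accepts(windowText: string, needleText: string): boolean {
  return sim(windowText, needleText) >= THRESHOLD;
}

// One substituted character out of 12: 1 - 1/12, roughly 0.917, accepted.
const nearMiss = accepts('const x = 1;', 'const y = 1;');
// Distance 3 over max length 7: 1 - 3/7, roughly 0.571, rejected.
const tooFar = accepts('kitten', 'sitting');
```

Note that `locateByLevenshtein` keeps only the single best-scoring window, so two windows that tie above the floor are not reported as `ambiguous` the way exact and line-window hits are; the first one found wins because the comparison is a strict `>`.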

@@ -0,0 +1,66 @@
/**
* Local shell exec on the BooCoder host (replaces deprecated ssh.ts for worktrees).
*/
import { spawn } from 'node:child_process';
export interface HostExecResult {
exitCode: number;
stdout: string;
stderr: string;
}
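/**
 * Run `command` through a login bash shell (`bash -lc`) and collect its output.
 * Resolves on close with the exit code and captured stdout/stderr; a null exit
 * code (e.g. the child was killed by a signal) is reported as 1. Rejects on a
 * spawn error, on timeout, or when the signal is already aborted before start.
 * The command is interpreted by the shell, so callers must escape any
 * interpolated values themselves.
 */
export async function hostExec(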
command: string,
opts?: { signal?: AbortSignal; timeoutMs?: number },
): Promise<HostExecResult> {
return new Promise<HostExecResult>((resolve, reject) => {
const child = spawn('bash', ['-lc', command], {
stdio: ['pipe', 'pipe', 'pipe'],
});
let stdout = '';
let stderr = '';
let killed = false;
child.stdout!.on('data', (chunk: Buffer) => { stdout += chunk.toString(); });
child.stderr!.on('data', (chunk: Buffer) => { stderr += chunk.toString(); });
const cleanup = () => {
if (!killed) {
killed = true;
child.kill('SIGTERM');
}
};
if (opts?.signal) {
if (opts.signal.aborted) {
cleanup();
reject(new Error('host exec aborted before start'));
return;
}
opts.signal.addEventListener('abort', cleanup, { once: true });
}
let timer: ReturnType<typeof setTimeout> | undefined;
if (opts?.timeoutMs) {
timer = setTimeout(() => {
cleanup();
reject(new Error(`host exec timed out after ${opts.timeoutMs}ms`));
}, opts.timeoutMs);
}
child.on('close', (code) => {
if (timer) clearTimeout(timer);
if (opts?.signal) opts.signal.removeEventListener('abort', cleanup);
resolve({ exitCode: code ?? 1, stdout, stderr });
});
child.on('error', (err) => {
if (timer) clearTimeout(timer);
if (opts?.signal) opts.signal.removeEventListener('abort', cleanup);
reject(err);
});
child.stdin!.end();
});
}
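
Review note: since `hostExec` hands the whole string to `bash -lc`, every interpolated value has to be quoted by the caller. `restoreCheckpoint` does this with a `shellEscape` helper whose implementation is not part of this diff. The sketch below shows one common POSIX single-quote approach; it is an assumption about how such a helper could look, and `quoteForShell` and `buildResetCommand` are hypothetical names.

```typescript
// Assumed approach: wrap in single quotes and rewrite each embedded
// single quote as '\'' (close quote, escaped quote, reopen quote).
function quoteForShell(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical builder mirroring the reset command used during restore.
function buildResetCommand(worktreePath: string, commitSha: string): string {
  return `git -C ${quoteForShell(worktreePath)} reset --hard ${quoteForShell(commitSha)}`;
}

const quoted = quoteForShell("it's");
const resetCmd = buildResetCommand('/tmp/my repo', 'abc123');
const hostile = quoteForShell('x; rm -rf ~');
```

Inside single quotes the shell performs no expansion, so a value such as `x; rm -rf ~` reaches git as one literal argument rather than as a second command.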

@@ -0,0 +1,232 @@
/**
* BooCoder MCP Server — exposes task primitives as MCP tools.
*
* Started when `--mcp` flag is passed to the entry point. Runs stdio transport
* so external tools (opencode in Termius) can drive the task queue.
*/
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import type { Sql } from '../db.js';
import { applyOne, rejectOne } from './pending_changes.js';
// --- Tool handlers -----------------------------------------------------------
interface TaskRow {
id: string;
state: string;
}
interface PendingRow {
id: string;
file_path: string;
operation: string;
diff: string;
session_id: string;
}
interface WorktreeRow {
id: string;
worktree_path: string;
agent: string;
started_at: string;
}
interface ProjectPathRow {
path: string;
}
function textResult(data: unknown) {
return { content: [{ type: 'text' as const, text: JSON.stringify(data, null, 2) }] };
}
// --- Public entry ------------------------------------------------------------
export async function startMcpServer(sql: Sql): Promise<void> {
const server = new McpServer(
{ name: 'boocoder', version: '2.0.2' },
{ capabilities: { tools: {} } },
);
// 1. boocoder.create_task
server.tool(
'boocoder.create_task',
'Create a new task in the BooCoder task queue',
{
project_id: z.string().describe('Project UUID'),
input: z.string().describe('Task description / prompt for the agent'),
agent: z.string().optional().describe('Agent name (optional — uses default if omitted)'),
model: z.string().optional().describe('Model override (optional)'),
mode_id: z.string().optional().describe('Permission/mode id (optional)'),
thinking_option_id: z.string().optional().describe('Thinking/effort option id (optional)'),
},
async (args) => {
const [row] = await sql<TaskRow[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, state)
VALUES (
${args.project_id},
${args.input},
${args.agent ?? null},
${args.model ?? null},
${args.mode_id ?? null},
${args.thinking_option_id ?? null},
'pending'
)
RETURNING id, state
`;
return textResult({
task_id: row!.id,
state: row!.state,
mode_id: args.mode_id ?? null,
thinking_option_id: args.thinking_option_id ?? null,
});
},
);
// 2. boocoder.list_pending_changes
server.tool(
'boocoder.list_pending_changes',
'List pending changes awaiting review',
{
session_id: z.string().optional().describe('Optional session filter'),
},
async (args) => {
let rows: PendingRow[];
if (args.session_id) {
rows = await sql<PendingRow[]>`
SELECT id, file_path, operation, diff, session_id
FROM pending_changes
WHERE status = 'pending' AND session_id = ${args.session_id}
ORDER BY created_at ASC
`;
} else {
rows = await sql<PendingRow[]>`
SELECT id, file_path, operation, diff, session_id
FROM pending_changes
WHERE status = 'pending'
ORDER BY created_at ASC
`;
}
const items = rows.map((r) => ({
id: r.id,
file_path: r.file_path,
operation: r.operation,
diff_preview: r.diff.slice(0, 200),
}));
return textResult(items);
},
);
// 3. boocoder.apply
server.tool(
'boocoder.apply',
'Apply a pending change (write to disk)',
{
change_id: z.string().describe('Pending change UUID'),
},
async (args) => {
// Resolve projectRoot from the change's session → project path
const [proj] = await sql<ProjectPathRow[]>`
SELECT p.path FROM pending_changes pc
JOIN sessions s ON pc.session_id = s.id
JOIN projects p ON s.project_id = p.id
WHERE pc.id = ${args.change_id}
`;
if (!proj) {
return textResult({ success: false, file_path: '', error: 'change not found or project path unresolved' });
}
const result = await applyOne(sql, args.change_id, proj.path);
return textResult({ success: result.success, file_path: result.file_path, error: result.error });
},
);
// 4. boocoder.reject
server.tool(
'boocoder.reject',
'Reject a pending change (mark as rejected, no disk write)',
{
change_id: z.string().describe('Pending change UUID'),
},
async (args) => {
await rejectOne(sql, args.change_id);
return textResult({ success: true });
},
);
// 5. boocoder.dispatch_external_agent
server.tool(
'boocoder.dispatch_external_agent',
'Create a task targeting a specific external agent (ACP or PTY dispatch)',
{
project_id: z.string().describe('Project UUID'),
input: z.string().describe('Task prompt'),
agent: z.string().describe('Agent name (must match available_agents registry)'),
model: z.string().optional().describe('Model override (optional)'),
mode_id: z.string().optional().describe('Permission/mode id (optional)'),
thinking_option_id: z.string().optional().describe('Thinking/effort option id (optional)'),
},
async (args) => {
const [row] = await sql<TaskRow[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, state)
VALUES (
${args.project_id},
${args.input},
${args.agent},
${args.model ?? null},
${args.mode_id ?? null},
${args.thinking_option_id ?? null},
'pending'
)
RETURNING id, state
`;
// Determine execution path from available_agents
const [agentRow] = await sql<{ supports_acp: boolean }[]>`
SELECT supports_acp FROM available_agents WHERE name = ${args.agent}
`;
const executionPath = agentRow?.supports_acp ? 'acp' : 'pty';
return textResult({
task_id: row!.id,
state: row!.state,
execution_path: executionPath,
mode_id: args.mode_id ?? null,
thinking_option_id: args.thinking_option_id ?? null,
});
},
);
// 6. boocoder.list_worktrees
server.tool(
'boocoder.list_worktrees',
'List active worktrees from running tasks',
{},
async () => {
const rows = await sql<WorktreeRow[]>`
SELECT id, worktree_path, agent, started_at
FROM tasks
WHERE worktree_path IS NOT NULL AND state = 'running'
ORDER BY started_at DESC
`;
const items = rows.map((r) => ({
task_id: r.id,
worktree_path: r.worktree_path,
agent: r.agent,
started_at: r.started_at,
}));
return textResult(items);
},
);
// Connect via stdio
const transport = new StdioServerTransport();
await server.connect(transport);
// Block until stdin closes (transport handles lifecycle)
await new Promise<void>((resolve) => {
process.stdin.on('end', resolve);
process.stdin.on('close', resolve);
});
await sql.end({ timeout: 5 });
}
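
Review note: for anyone driving these tools from a client, the sketch below shows the general shape of an MCP `tools/call` request for `boocoder.create_task` and how the text envelope produced by `textResult` can be decoded. The `buildToolCall` and `parseTextResult` helpers, the placeholder project id, and the sample response are hypothetical; a real client would normally go through an MCP SDK rather than build messages by hand.

```typescript
// JSON-RPC 2.0 request shape used by MCP for invoking a tool.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } };
}

// The server wraps every payload as one text item holding pretty-printed JSON.
interface TextToolResult {
  content: { type: string; text: string }[];
}

function parseTextResult(result: TextToolResult): unknown {
  const first = result.content[0];
  if (!first || first.type !== 'text') throw new Error('unexpected tool result shape');
  return JSON.parse(first.text);
}

// Placeholder project id, for illustration only.
const createTaskRequest = buildToolCall(1, 'boocoder.create_task', {
  project_id: '00000000-0000-0000-0000-000000000000',
  input: 'Add a unit test for the checkpoint scope guard',
});

// Hypothetical response in the same envelope textResult() emits.
const sampleResult: TextToolResult = {
  content: [{ type: 'text', text: JSON.stringify({ task_id: 't1', state: 'pending' }, null, 2) }],
};
const decoded = parseTextResult(sampleResult) as { task_id: string; state: string };
```

Only `project_id` and `input` are required by the `create_task` schema; `agent`, `model`, `mode_id`, and `thinking_option_id` are optional and stored as null when omitted.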

@@ -0,0 +1,170 @@
/**
* v2.6 Phase 3 (3.4) — orphan worktree reaper.
*
* Reclaims on-disk session worktree dirs under WORKTREE_BASE that have NO live
* (`status='active'`) row in the `worktrees` table — leaks from a crash between
* `git worktree add` and the DB insert, a missed chat-close hook, or a manual rm
* of the DB row. Extends the periodic-sweeper pattern (apps/server's truncation +
* stale-streaming reaper).
*
* SAFETY (Paseo worktree-archive cascade + superset destroy-saga lift): before
* removing ANY dir, run `checkWorktreeWorkAtRisk` — a dirty / unpushed / unmerged
* worktree is SKIPPED (logged), never force-removed. The pure orphan-target
* selection (which dirs are candidates) lives in
* `backends/lifecycle-decisions.ts:selectOrphanWorktreeTargets` and is unit-tested;
* this module does the DB read + fs stat + git preflight + removal side-effects.
*
* The mtime grace (default 1h) means a dir mid-`ensureSessionWorktree` (created on
* disk, row not yet committed) is never swept — the grace window covers the gap.
*/
import { readdir, stat } from 'node:fs/promises';
import { join } from 'node:path';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../db.js';
import { WORKTREE_BASE, checkWorktreeWorkAtRisk } from './worktrees.js';
import { hostExec } from './host-exec.js';
import {
selectOrphanWorktreeTargets,
DEFAULT_ORPHAN_WORKTREE_GRACE_MS,
} from './backends/lifecycle-decisions.js';
export interface OrphanWorktreeReaperDeps {
sql: Sql;
log: FastifyBaseLogger;
intervalMs: number;
graceMs?: number;
}
export interface OrphanReaperResult {
scanned: number;
candidates: number;
reaped: string[];
skippedAtRisk: string[];
}
/** Single-pass reap: select orphan candidates, preflight at-risk, remove the safe. */
export async function reapOrphanWorktrees(
sql: Sql,
log: FastifyBaseLogger,
graceMs: number = DEFAULT_ORPHAN_WORKTREE_GRACE_MS,
now: number = Date.now(),
): Promise<OrphanReaperResult> {
// Enumerate on-disk session worktree dirs (`sess-*`). Per-task worktrees
// (arena/new_task/MCP) are cleaned up inline by the one-shot path, so we only
// own the persistent session dirs the warm paths leave behind.
let dirents: string[];
try {
dirents = await readdir(WORKTREE_BASE);
} catch {
return { scanned: 0, candidates: 0, reaped: [], skippedAtRisk: [] }; // base absent → nothing to do
}
const onDisk: { path: string; mtimeMs: number }[] = [];
for (const name of dirents) {
if (!name.startsWith('sess-')) continue; // only persistent session worktrees
const path = join(WORKTREE_BASE, name);
try {
const s = await stat(path);
if (!s.isDirectory()) continue;
onDisk.push({ path, mtimeMs: s.mtimeMs });
} catch {
// vanished between readdir and stat — skip
}
}
// Live worktree paths from the DB (active rows only — archived/removed rows are
// not "live", so their leftover dirs are reapable orphans).
const liveRows = await sql<{ path: string }[]>`
SELECT path FROM worktrees WHERE status = 'active'
`;
const live = new Set(liveRows.map((r) => r.path));
const candidates = selectOrphanWorktreeTargets(onDisk, live, now, graceMs);
const reaped: string[] = [];
const skippedAtRisk: string[] = [];
for (const path of candidates) {
// Preflight: never reap work at risk. A git error forces atRisk=true (fail
// closed), so a half-broken worktree is kept, not silently destroyed.
const risk = await checkWorktreeWorkAtRisk(path);
if (risk.atRisk) {
skippedAtRisk.push(path);
log.warn({ path, dirty: risk.dirty, unmerged: risk.unmerged, error: risk.error }, 'orphan-reaper: skipping at-risk orphan worktree');
continue;
}
const removed = await removeOrphanDir(path);
if (removed) reaped.push(path);
}
if (reaped.length > 0 || skippedAtRisk.length > 0) {
log.info({ scanned: onDisk.length, candidates: candidates.length, reaped, skippedAtRisk }, 'orphan-reaper: pass complete');
}
return { scanned: onDisk.length, candidates: candidates.length, reaped, skippedAtRisk };
}
/**
* Remove a single orphan worktree dir. Resolve its main repo via the git
* common-dir, run `worktree remove --force` from there + prune, then rm the dir as
* a backstop. Best-effort: every step is independently fault-tolerant so a partial
* state (dir present, git untracked) still gets reclaimed.
*/
async function removeOrphanDir(path: string): Promise<boolean> {
// Find the owning repo (the common git dir's parent). When the dir isn't a valid
// worktree anymore, this fails and we fall back to a plain rm.
const common = await hostExec(
`git -C ${shellEscape(path)} rev-parse --path-format=absolute --git-common-dir`,
{ timeoutMs: 10_000 },
).catch(() => null);
const commonDir = common && common.exitCode === 0 ? common.stdout.trim() : '';
// The repo worktree root is the parent of the .git common dir (strip trailing /.git).
const repoRoot = commonDir.replace(/\/\.git\/?$/, '');
if (repoRoot && repoRoot !== commonDir) {
await hostExec(
`git -C ${shellEscape(repoRoot)} worktree remove ${shellEscape(path)} --force`,
{ timeoutMs: 15_000 },
).catch(() => {});
await hostExec(
`git -C ${shellEscape(repoRoot)} worktree prune`,
{ timeoutMs: 10_000 },
).catch(() => {});
}
// Backstop: ensure the dir is gone even if the git remove no-op'd.
const rm = await hostExec(`rm -rf ${shellEscape(path)}`, { timeoutMs: 15_000 }).catch(() => null);
return rm != null && rm.exitCode === 0;
}
/** Minimal single-quote shell escape (mirrors worktrees.ts). */
function shellEscape(s: string): string {
return "'" + s.replace(/'/g, "'\\''") + "'";
}
/** Periodic orphan-worktree reaper, started/stopped by the bootstrap. Unref'd. */
export function createOrphanWorktreeReaper(deps: OrphanWorktreeReaperDeps): { start(): void; stop(): void } {
const { sql, log, intervalMs } = deps;
const graceMs = deps.graceMs ?? DEFAULT_ORPHAN_WORKTREE_GRACE_MS;
let timer: ReturnType<typeof setInterval> | null = null;
let running = false;
return {
start() {
if (timer) return;
timer = setInterval(() => {
if (running) return; // a slow pass must not overlap the next tick
running = true;
void reapOrphanWorktrees(sql, log, graceMs)
.catch((err) => log.warn({ err: err instanceof Error ? err.message : String(err) }, 'orphan-reaper: pass error'))
.finally(() => {
running = false;
});
}, intervalMs);
timer.unref?.();
log.info({ intervalMs, graceMs }, 'orphan-reaper: started');
},
stop() {
if (timer) {
clearInterval(timer);
timer = null;
}
},
};
}
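
The candidate selection used by `reapOrphanWorktrees` lives in `selectOrphanWorktreeTargets`, which is not part of this hunk. The sketch below shows what such a pure selector might look like, assuming it keeps dirs that have no live DB row and whose mtime is older than the grace window. The name `selectOrphansSketch`, the `OnDiskDir` type, and the exact `>=` comparison are assumptions for illustration, not the project's implementation.

```typescript
interface OnDiskDir {
  path: string;
  mtimeMs: number;
}

// Hypothetical stand-in for selectOrphanWorktreeTargets: pure, no fs or DB access.
function selectOrphansSketch(
  onDisk: OnDiskDir[],
  live: Set<string>,
  nowMs: number,
  graceMs: number,
): string[] {
  return onDisk
    .filter((d) => !live.has(d.path)) // a live DB row protects the dir
    .filter((d) => nowMs - d.mtimeMs >= graceMs) // younger dirs may be mid-creation
    .map((d) => d.path);
}

const HOUR_MS = 60 * 60 * 1000;
const sketchNow = 10 * HOUR_MS;
const orphanSketchResult = selectOrphansSketch(
  [
    { path: '/wt/sess-a', mtimeMs: sketchNow - 2 * HOUR_MS }, // old, no row: candidate
    { path: '/wt/sess-b', mtimeMs: sketchNow - 2 * HOUR_MS }, // old, but has a live row
    { path: '/wt/sess-c', mtimeMs: sketchNow - 60_000 }, // inside the grace window
  ],
  new Set(['/wt/sess-b']),
  sketchNow,
  HOUR_MS,
);
console.log(orphanSketchResult); // [ '/wt/sess-a' ]
```

Keeping the selection free of side effects is what lets it be unit-tested with plain arrays, while the at-risk preflight and removal stay in the module above.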


@@ -0,0 +1,254 @@
import { readFile, writeFile, unlink, mkdir } from 'node:fs/promises';
import { dirname } from 'node:path';
import type { Sql } from '../db.js';
import { resolveWritePath } from './write_guard.js';
import { locateMatch } from './fuzzy-match.js';
// --- Types -------------------------------------------------------------------
export interface PendingChange {
id: string;
session_id: string;
task_id: string | null;
file_path: string;
operation: 'create' | 'edit' | 'delete';
diff: string;
status: 'pending' | 'applied' | 'rejected' | 'reverted';
// v2.6 Phase 1-UX: which agent staged this change (DiffPanel attribution).
// Native boocode write tools stamp 'boocode'; the manual RightRail create path
// passes null (renders as "manual"). NULL on legacy rows queued pre-v2.6.
agent: string | null;
created_at: string;
}
export interface ApplyResult {
id: string;
file_path: string;
operation: string;
success: boolean;
error?: string;
}
// --- Queue functions ---------------------------------------------------------
export async function queueEdit(
sql: Sql,
sessionId: string,
taskId: string | null,
filePath: string,
oldString: string,
newString: string,
projectRoot: string,
// v2.6 Phase 1-UX: attribution. Defaults to 'boocode' because the only callers
// that omit it are the native write tools (edit_file/create_file/delete_file).
// Pass null explicitly for the manual RightRail create path.
agent: string | null = 'boocode',
): Promise<PendingChange> {
const resolved = resolveWritePath(projectRoot, filePath);
const diff = JSON.stringify({ old: oldString, new: newString });
const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'edit', ${diff}, ${agent})
RETURNING *
`;
return row!;
}
export async function queueCreate(
sql: Sql,
sessionId: string,
taskId: string | null,
filePath: string,
content: string,
projectRoot: string,
// See queueEdit: defaults to 'boocode' for the native write tools; the manual
// RightRail create route passes null.
agent: string | null = 'boocode',
): Promise<PendingChange> {
const resolved = resolveWritePath(projectRoot, filePath);
const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'create', ${content}, ${agent})
RETURNING *
`;
return row!;
}
export async function queueDelete(
sql: Sql,
sessionId: string,
taskId: string | null,
filePath: string,
projectRoot: string,
// See queueEdit: defaults to 'boocode' for the native write tools.
agent: string | null = 'boocode',
): Promise<PendingChange> {
const resolved = resolveWritePath(projectRoot, filePath);
const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'delete', '', ${agent})
RETURNING *
`;
return row!;
}
// --- Apply functions ---------------------------------------------------------
export async function applyOne(
sql: Sql,
changeId: string,
projectRoot: string,
): Promise<ApplyResult> {
const [change] = await sql<PendingChange[]>`
SELECT * FROM pending_changes WHERE id = ${changeId} AND status = 'pending'
`;
if (!change) {
return { id: changeId, file_path: '', operation: '', success: false, error: 'change not found or not pending' };
}
try {
// Re-validate path in case projectRoot has shifted
resolveWritePath(projectRoot, change.file_path);
switch (change.operation) {
case 'create': {
await mkdir(dirname(change.file_path), { recursive: true });
await writeFile(change.file_path, change.diff, 'utf8');
break;
}
case 'edit': {
const { old: oldStr, new: newStr } = JSON.parse(change.diff) as { old: string; new: string };
const content = await readFile(change.file_path, 'utf8');
const match = locateMatch(content, oldStr);
if (match.kind === 'ambiguous') {
throw new Error(
`old_string matches ${match.count} locations — add surrounding context to disambiguate`,
);
}
if (match.kind === 'not_found') {
throw new Error(
'old_string not found in file (even fuzzily) — file may have changed since the edit was queued',
);
}
const updated = content.slice(0, match.start) + newStr + content.slice(match.end);
await writeFile(change.file_path, updated, 'utf8');
break;
}
case 'delete': {
// Stash current content in diff for potential rewind
try {
const existing = await readFile(change.file_path, 'utf8');
await sql`UPDATE pending_changes SET diff = ${existing} WHERE id = ${changeId}`;
} catch {
// File may already be gone — proceed with status update
}
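await unlink(change.file_path).catch((err: NodeJS.ErrnoException) => {
// An already-missing file counts as deleted; any other failure is real.
if (err.code !== 'ENOENT') throw err;
});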
break;
}
}
await sql`UPDATE pending_changes SET status = 'applied' WHERE id = ${changeId}`;
return { id: change.id, file_path: change.file_path, operation: change.operation, success: true };
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
return { id: change.id, file_path: change.file_path, operation: change.operation, success: false, error: message };
}
}
export async function applyAll(
sql: Sql,
sessionId: string,
projectRoot: string,
): Promise<ApplyResult[]> {
const pending = await sql<PendingChange[]>`
SELECT * FROM pending_changes
WHERE session_id = ${sessionId} AND status = 'pending'
ORDER BY created_at ASC
`;
const results: ApplyResult[] = [];
for (const change of pending) {
results.push(await applyOne(sql, change.id, projectRoot));
}
return results;
}
// --- Reject functions --------------------------------------------------------
export async function rejectOne(sql: Sql, changeId: string): Promise<void> {
await sql`UPDATE pending_changes SET status = 'rejected' WHERE id = ${changeId} AND status = 'pending'`;
}
export async function rejectAll(sql: Sql, sessionId: string): Promise<void> {
await sql`UPDATE pending_changes SET status = 'rejected' WHERE session_id = ${sessionId} AND status = 'pending'`;
}
// --- Rewind functions --------------------------------------------------------
export async function rewindOne(
sql: Sql,
changeId: string,
projectRoot: string,
): Promise<ApplyResult> {
const [change] = await sql<PendingChange[]>`
SELECT * FROM pending_changes WHERE id = ${changeId} AND status = 'applied'
`;
if (!change) {
return { id: changeId, file_path: '', operation: '', success: false, error: 'change not found or not applied' };
}
try {
resolveWritePath(projectRoot, change.file_path);
switch (change.operation) {
case 'create': {
// Reverse a create: delete the file
await unlink(change.file_path);
break;
}
case 'edit': {
// Reverse an edit: swap old and new
const { old: oldStr, new: newStr } = JSON.parse(change.diff) as { old: string; new: string };
const content = await readFile(change.file_path, 'utf8');
const match = locateMatch(content, newStr);
if (match.kind === 'ambiguous') {
throw new Error(
`new_string matches ${match.count} locations — cannot rewind; add surrounding context to disambiguate`,
);
}
if (match.kind === 'not_found') {
throw new Error(
'new_string not found in file (even fuzzily) — cannot rewind; file may have been modified since apply',
);
}
const reverted = content.slice(0, match.start) + oldStr + content.slice(match.end);
await writeFile(change.file_path, reverted, 'utf8');
break;
}
case 'delete': {
// Reverse a delete: recreate the file (diff holds the original content stashed at apply time)
await mkdir(dirname(change.file_path), { recursive: true });
await writeFile(change.file_path, change.diff, 'utf8');
break;
}
}
await sql`UPDATE pending_changes SET status = 'reverted' WHERE id = ${changeId}`;
return { id: change.id, file_path: change.file_path, operation: change.operation, success: true };
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
return { id: change.id, file_path: change.file_path, operation: change.operation, success: false, error: message };
}
}
// --- Query functions ---------------------------------------------------------
export async function listPending(sql: Sql, sessionId: string): Promise<PendingChange[]> {
return sql<PendingChange[]>`
SELECT * FROM pending_changes
WHERE session_id = ${sessionId} AND status = 'pending'
ORDER BY created_at ASC
`;
}
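
The edit path above stores `{ old, new }` as JSON in `diff`, applies by replacing the single match of `old`, and rewinds by replacing the single match of `new`. A minimal in-memory sketch of that round trip follows. It assumes an exact-match locator (`locateExact`, hypothetical) in place of the fuzzy `locateMatch`, and the `'found'` kind name is also an assumption, since only `'ambiguous'` and `'not_found'` appear in this diff.

```typescript
type MatchSketch =
  | { kind: 'found'; start: number; end: number } // kind name assumed
  | { kind: 'not_found' }
  | { kind: 'ambiguous'; count: number };

// Hypothetical exact-match stand-in for locateMatch (the real one also matches fuzzily).
function locateExact(content: string, needle: string): MatchSketch {
  if (needle === '') return { kind: 'not_found' };
  let count = 0;
  let first = -1;
  for (let i = content.indexOf(needle); i !== -1; i = content.indexOf(needle, i + 1)) {
    if (first === -1) first = i;
    count += 1;
  }
  if (count === 0) return { kind: 'not_found' };
  if (count > 1) return { kind: 'ambiguous', count };
  return { kind: 'found', start: first, end: first + needle.length };
}

// Replace the single occurrence of `from` with `to`; throw otherwise, as applyOne does.
function replaceUnique(content: string, from: string, to: string): string {
  const match = locateExact(content, from);
  if (match.kind !== 'found') throw new Error(`cannot replace: ${match.kind}`);
  return content.slice(0, match.start) + to + content.slice(match.end);
}

const originalText = 'const port = 3000;\n';
// Same shape queueEdit stores in pending_changes.diff.
const storedDiff = JSON.stringify({ old: '3000', new: '8080' });
const parsedDiff = JSON.parse(storedDiff) as { old: string; new: string };

const appliedText = replaceUnique(originalText, parsedDiff.old, parsedDiff.new); // apply
const revertedText = replaceUnique(appliedText, parsedDiff.new, parsedDiff.old); // rewind
console.log(appliedText === 'const port = 8080;\n', revertedText === originalText); // true true
```

The rewind only works while `new` is still unique in the file, which is why `rewindOne` reports an error instead of guessing when the file has changed since apply.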


@@ -0,0 +1,207 @@
/**
* Blocks ACP dispatch on permission/elicitation prompts until the user responds via API.
*/
import type { RequestPermissionRequest, RequestPermissionResponse, CreateElicitationRequest, CreateElicitationResponse } from '@agentclientprotocol/sdk';
import { isUnattendedMode } from './provider-manifest.js';
const DEFAULT_TIMEOUT_MS = 120_000;
interface PendingPermission {
type: 'permission';
request: RequestPermissionRequest;
sessionId: string;
resolve: (response: RequestPermissionResponse) => void;
reject: (err: Error) => void;
timer: ReturnType<typeof setTimeout>;
}
interface PendingElicitation {
type: 'elicitation';
request: CreateElicitationRequest;
sessionId: string;
resolve: (response: CreateElicitationResponse) => void;
reject: (err: Error) => void;
timer: ReturnType<typeof setTimeout>;
}
type PendingEntry = PendingPermission | PendingElicitation;
const pendingByTask = new Map<string, PendingEntry>();
export type PermissionKind = 'tool' | 'question' | 'plan' | 'elicitation';
export interface PermissionPrompt {
taskId: string;
kind: PermissionKind;
toolTitle?: string;
description?: string;
input?: Record<string, unknown>;
options: Array<{ optionId: string; label: string }>;
}
export interface PermissionHooks {
onPrompt?: (prompt: PermissionPrompt & { sessionId: string }) => void | Promise<void>;
onResolved?: (taskId: string, sessionId: string) => void | Promise<void>;
}
let hooks: PermissionHooks = {};
export function setPermissionHooks(next: PermissionHooks): void {
hooks = next;
}
function resolveKind(params: RequestPermissionRequest): PermissionKind {
const input = params.toolCall?.rawInput;
if (input && typeof input === 'object' && !Array.isArray(input) && 'questions' in input && Array.isArray((input as Record<string, unknown>).questions)) {
return 'question';
}
return 'tool';
}
function toPrompt(taskId: string, params: RequestPermissionRequest): PermissionPrompt {
const kind = resolveKind(params);
const rawInput = params.toolCall?.rawInput;
const input = rawInput && typeof rawInput === 'object' && !Array.isArray(rawInput)
? rawInput as Record<string, unknown>
: undefined;
return {
taskId,
kind,
toolTitle: params.toolCall?.title ?? undefined,
...(input ? { input } : {}),
options: params.options.map((o) => ({
optionId: o.optionId,
label: o.name,
})),
};
}
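/**
* Park a permission prompt for `taskId` until the user responds. Unattended modes
* auto-select the first option (or cancel when there is none); a timeout resolves
* as cancelled; a newer request for the same task rejects the older one.
*/
export function waitForPermissionResponse(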
taskId: string,
sessionId: string,
provider: string,
modeId: string | undefined,
params: RequestPermissionRequest,
timeoutMs = DEFAULT_TIMEOUT_MS,
): Promise<RequestPermissionResponse> {
if (isUnattendedMode(provider, modeId)) {
const first = params.options[0];
if (first) {
return Promise.resolve({ outcome: { outcome: 'selected', optionId: first.optionId } });
}
return Promise.resolve({ outcome: { outcome: 'cancelled' } });
}
return new Promise((resolve, reject) => {
const existing = pendingByTask.get(taskId);
if (existing) {
clearTimeout(existing.timer);
existing.reject(new Error('superseded by newer permission request'));
}
const timer = setTimeout(() => {
pendingByTask.delete(taskId);
void hooks.onResolved?.(taskId, sessionId);
resolve({ outcome: { outcome: 'cancelled' } });
}, timeoutMs);
pendingByTask.set(taskId, { type: 'permission', request: params, sessionId, resolve, reject, timer });
const prompt = toPrompt(taskId, params);
void hooks.onPrompt?.({ ...prompt, sessionId });
});
}
export function respondToPermission(taskId: string, optionId: string | null, updatedInput?: Record<string, unknown>): boolean {
const pending = pendingByTask.get(taskId);
if (!pending) return false;
clearTimeout(pending.timer);
pendingByTask.delete(taskId);
if (pending.type === 'elicitation') {
if (updatedInput) {
const content = updatedInput as { [key: string]: string | number | boolean | string[] };
pending.resolve({ action: 'accept', content });
} else {
pending.resolve({ action: 'decline' });
}
} else {
if (optionId) {
pending.resolve({ outcome: { outcome: 'selected', optionId } });
} else {
pending.resolve({ outcome: { outcome: 'cancelled' } });
}
}
void hooks.onResolved?.(taskId, pending.sessionId);
return true;
}
export function getPendingPermission(taskId: string): PermissionPrompt | null {
const pending = pendingByTask.get(taskId);
if (!pending) return null;
if (pending.type === 'elicitation') {
return elicitationToPrompt(taskId, pending.request);
}
return toPrompt(taskId, pending.request);
}
function elicitationToPrompt(taskId: string, params: CreateElicitationRequest): PermissionPrompt {
const input: Record<string, unknown> = { message: params.message };
if ('requestedSchema' in params) {
input.requestedSchema = params.requestedSchema;
}
return {
taskId,
kind: 'elicitation',
toolTitle: params.message,
input,
options: [],
};
}
export function waitForElicitationResponse(
taskId: string,
sessionId: string,
provider: string,
modeId: string | undefined,
params: CreateElicitationRequest,
timeoutMs = DEFAULT_TIMEOUT_MS,
): Promise<CreateElicitationResponse> {
if (isUnattendedMode(provider, modeId)) {
return Promise.resolve({ action: 'decline' });
}
return new Promise((resolve, reject) => {
const existing = pendingByTask.get(taskId);
if (existing) {
clearTimeout(existing.timer);
existing.reject(new Error('superseded by newer elicitation request'));
}
const timer = setTimeout(() => {
pendingByTask.delete(taskId);
void hooks.onResolved?.(taskId, sessionId);
resolve({ action: 'cancel' });
}, timeoutMs);
pendingByTask.set(taskId, { type: 'elicitation', request: params, sessionId, resolve, reject, timer });
const prompt = elicitationToPrompt(taskId, params);
void hooks.onPrompt?.({ ...prompt, sessionId });
});
}
export function cancelPendingPermission(taskId: string): void {
const pending = pendingByTask.get(taskId);
if (!pending) return;
clearTimeout(pending.timer);
pendingByTask.delete(taskId);
if (pending.type === 'elicitation') {
pending.resolve({ action: 'cancel' });
} else {
pending.resolve({ outcome: { outcome: 'cancelled' } });
}
void hooks.onResolved?.(taskId, pending.sessionId);
}
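
The module above is a promise registry keyed by task id: one pending entry per task, a timer that resolves it as cancelled, and a responder that resolves it from the API. The sketch below strips that down to its core under simplified types. `Outcome`, `waitFor`, `respond`, and `demoPermissionFlow` are illustrative names, not the ACP SDK types or the module's exports.

```typescript
type Outcome = { outcome: 'selected'; optionId: string } | { outcome: 'cancelled' };

interface Waiter {
  resolve: (o: Outcome) => void;
  reject: (err: Error) => void;
  timer: ReturnType<typeof setTimeout>;
}

const waiters = new Map<string, Waiter>();

// Park a request until respond() is called or the timeout fires.
function waitFor(taskId: string, timeoutMs: number): Promise<Outcome> {
  return new Promise((resolve, reject) => {
    const existing = waiters.get(taskId);
    if (existing) {
      // Only one pending entry per task: the older waiter is rejected.
      clearTimeout(existing.timer);
      existing.reject(new Error('superseded by newer request'));
    }
    const timer = setTimeout(() => {
      waiters.delete(taskId);
      resolve({ outcome: 'cancelled' }); // a timeout is treated as a cancel, not an error
    }, timeoutMs);
    waiters.set(taskId, { resolve, reject, timer });
  });
}

// Resolve the parked request; returns false when nothing is pending.
function respond(taskId: string, optionId: string | null): boolean {
  const w = waiters.get(taskId);
  if (!w) return false;
  clearTimeout(w.timer);
  waiters.delete(taskId);
  w.resolve(optionId ? { outcome: 'selected', optionId } : { outcome: 'cancelled' });
  return true;
}

async function demoPermissionFlow(prefix: string): Promise<[Outcome, Outcome]> {
  const answered = waitFor(`${prefix}-1`, 1_000);
  respond(`${prefix}-1`, 'allow_once'); // user answers right away
  const timedOut = waitFor(`${prefix}-2`, 10); // nobody answers
  return Promise.all([answered, timedOut]);
}

void demoPermissionFlow('demo').then((results) => console.log(results));
```

Clearing the timer before deleting the entry matters: otherwise a late timeout could fire the resolve hook for a prompt the user has already answered.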


@@ -0,0 +1,66 @@
/**
* Static slash-command hints per harness (interactive TUI / agent session).
* Live ACP `available_commands_update` merges on top during dispatch.
*/
import type { AgentCommand } from './provider-types.js';
const CLAUDE_COMMANDS: AgentCommand[] = [
{ name: 'help', description: 'Show available slash commands' },
{ name: 'clear', description: 'Clear conversation history' },
{ name: 'compact', description: 'Compact context window' },
{ name: 'cost', description: 'Show session cost' },
{ name: 'memory', description: 'Manage project memory' },
{ name: 'model', description: 'Switch model' },
{ name: 'permissions', description: 'View or change permission mode' },
{ name: 'review', description: 'Review current changes' },
{ name: 'status', description: 'Show session status' },
{ name: 'vim', description: 'Toggle vim-style input' },
];
const OPENCODE_COMMANDS: AgentCommand[] = [
{ name: 'help', description: 'Show available commands' },
{ name: 'new', description: 'Start a new session' },
{ name: 'models', description: 'List or switch models' },
{ name: 'agents', description: 'List or switch agents' },
{ name: 'compact', description: 'Compact context' },
{ name: 'share', description: 'Share session' },
{ name: 'export', description: 'Export session' },
];
const GOOSE_COMMANDS: AgentCommand[] = [
{ name: 'help', description: 'Show available commands' },
{ name: 'clear', description: 'Clear conversation' },
{ name: 'compact', description: 'Compact context' },
{ name: 'exit', description: 'Exit session' },
];
const QWEN_COMMANDS: AgentCommand[] = [
{ name: 'help', description: 'Show available slash commands' },
{ name: 'clear', description: 'Clear conversation' },
{ name: 'memory', description: 'Manage memory' },
{ name: 'hooks', description: 'Manage hooks' },
{ name: 'review', description: 'Review changes' },
];
/** boocode harness uses /api/skills — merged on the frontend. */
export const PROVIDER_COMMANDS: Record<string, AgentCommand[]> = {
claude: CLAUDE_COMMANDS,
opencode: OPENCODE_COMMANDS,
goose: GOOSE_COMMANDS,
qwen: QWEN_COMMANDS,
boocode: [],
};
export function getManifestCommands(provider: string): AgentCommand[] {
return PROVIDER_COMMANDS[provider] ?? [];
}
/** Merge command lists by name; later lists win on collisions. Result is sorted by name. */
export function mergeCommands(...lists: AgentCommand[][]): AgentCommand[] {
const byName = new Map<string, AgentCommand>();
for (const list of lists) {
for (const cmd of list) {
byName.set(cmd.name, cmd);
}
}
return [...byName.values()].sort((a, b) => a.name.localeCompare(b.name));
}
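
A short usage sketch of the merge above, showing how a live ACP `available_commands_update` would override a static hint with the same name. The function is restated as `mergeByName` so the snippet runs on its own, and the `AgentCommand` shape (just `name` and `description`) plus the sample `init` command are assumptions; the real type in `provider-types.ts` may carry more fields.

```typescript
interface AgentCommand {
  name: string;
  description: string;
}

// Same last-wins merge as mergeCommands, restated so this snippet is self-contained.
function mergeByName(...lists: AgentCommand[][]): AgentCommand[] {
  const byName = new Map<string, AgentCommand>();
  for (const list of lists) {
    for (const cmd of list) byName.set(cmd.name, cmd);
  }
  return Array.from(byName.values()).sort((a, b) => a.name.localeCompare(b.name));
}

const staticHints: AgentCommand[] = [
  { name: 'help', description: 'Show available commands' },
  { name: 'clear', description: 'Clear conversation' },
];
const liveUpdate: AgentCommand[] = [
  { name: 'help', description: 'Help (reported live by the agent)' },
  { name: 'init', description: 'Hypothetical live-only command' },
];

const mergedCommands = mergeByName(staticHints, liveUpdate);
console.log(mergedCommands.map((c) => c.name)); // [ 'clear', 'help', 'init' ]
```

Passing the static list first and the live list last means the agent's own description is the one that survives.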


@@ -0,0 +1,133 @@
/**
* v2.3 resolved provider registry — single in-memory source of truth after
* merging the hardcoded built-ins (provider-registry.ts) with the config file
* (provider-config.ts). Mirrors Paseo's buildProviderRegistry/addDerivedProviders.
*
* Phase 1 scope: build + expose the resolved registry. `launchCommand` is null
* for built-ins (the default argv is resolved at dispatch time in Phase 3) and
* is the config `command` for custom ACP entries. No DB columns (design.md §3.3);
* `enabled` lives in memory only.
*/
import type { ProviderDef } from './provider-registry.js';
import { PROVIDERS } from './provider-registry.js';
import { load, type CoderProvidersFile } from './provider-config.js';
export interface ResolvedProviderDef extends ProviderDef {
id: string;
enabled: boolean;
isBuiltin: boolean;
isCustomAcp: boolean;
/** Full argv for spawn: [binary, ...args]. Null for built-ins (resolved at dispatch). */
launchCommand: [string, ...string[]] | null;
env: Record<string, string> | undefined;
configLabel?: string;
configDescription?: string;
/** Config `models` — REPLACES the discovered/static model list when present. */
configModels?: Array<{ id: string; label: string }>;
/** Config `additionalModels` — MERGED on top of the resolved model list. */
configAdditionalModels?: Array<{ id: string; label: string }>;
}
/**
* Merge built-ins with config overrides into the resolved registry.
* Algorithm verbatim from design.md §3.1.
*/
export function buildResolvedRegistry(
builtins: ProviderDef[],
config: CoderProvidersFile,
): Map<string, ResolvedProviderDef> {
const out = new Map<string, ResolvedProviderDef>();
const overrides = config.providers ?? {};
const builtinNames = new Set(builtins.map((b) => b.name));
// 1. Built-ins, applying a config override if one is present.
for (const def of builtins) {
const ov = overrides[def.name];
let enabled = ov?.enabled !== false;
// 3. boocode is always enabled; an enabled:false override is ignored + warned.
if (def.name === 'boocode' && ov?.enabled === false) {
console.warn("provider-config: ignoring enabled:false for built-in 'boocode' (always enabled)");
enabled = true;
}
const launchCommand =
ov?.command && ov.command.length > 0 ? (ov.command as [string, ...string[]]) : null;
out.set(def.name, {
...def,
label: ov?.label ?? def.label,
id: def.name,
enabled,
isBuiltin: true,
isCustomAcp: false,
launchCommand,
env: ov?.env,
configLabel: ov?.label,
configDescription: ov?.description,
configModels: ov?.models,
configAdditionalModels: ov?.additionalModels,
});
}
// 2. Config ids that are not built-ins → custom ACP entries.
for (const [id, ov] of Object.entries(overrides)) {
if (builtinNames.has(id)) continue;
// §2.2 rules: "New id without extends → Reject at load with log."
if (ov.extends !== 'acp' || !ov.label || !ov.command || ov.command.length === 0) {
console.warn(
`provider-config: skipping custom provider '${id}' — requires extends:'acp', label, and command`,
);
continue;
}
out.set(id, {
name: id,
label: ov.label,
transport: 'acp',
modelSource: 'probe',
id,
enabled: ov.enabled !== false,
isBuiltin: false,
isCustomAcp: true,
launchCommand: ov.command as [string, ...string[]],
env: ov.env,
configLabel: ov.label,
configDescription: ov.description,
configModels: ov.models,
configAdditionalModels: ov.additionalModels,
});
}
return out;
}
// --- Module singleton ---------------------------------------------------------
let cachedRegistry: Map<string, ResolvedProviderDef> | null = null;
let cachedPath: string | null = null;
/** Load the config file at `path`, rebuild, and cache the resolved registry. */
export function loadProviderConfig(path: string): Map<string, ResolvedProviderDef> {
cachedPath = path;
cachedRegistry = buildResolvedRegistry(PROVIDERS, load(path));
return cachedRegistry;
}
/** Re-read the last-loaded config file and rebuild (Phase 4 calls this after PATCH). */
export function reloadProviderConfig(): Map<string, ResolvedProviderDef> {
if (cachedPath == null) {
cachedRegistry = buildResolvedRegistry(PROVIDERS, { providers: {} });
return cachedRegistry;
}
return loadProviderConfig(cachedPath);
}
/** The cached resolved registry (built-ins only if nothing has been loaded yet). */
export function getResolvedRegistry(): Map<string, ResolvedProviderDef> {
return cachedRegistry ?? buildResolvedRegistry(PROVIDERS, { providers: {} });
}
/** Resolved provider ids in registry order. */
export function getResolvedProviderIds(): string[] {
return [...getResolvedRegistry().keys()];
}


@@ -0,0 +1,100 @@
/**
* v2.3 provider config file (`/data/coder-providers.json`) — schema + loader.
*
* Layers config-backed overrides/custom-ACP entries over the hardcoded built-ins
* (see provider-config-registry.ts). Loading NEVER throws at startup (design.md
* §2.1): a missing file, invalid JSON, or schema mismatch all fall back to
* `{ providers: {} }` (built-ins only, all enabled).
*/
import { readFileSync, writeFileSync } from 'node:fs';
import { z } from 'zod';
// Schemas verbatim from design.md §2.2.
export const ProviderOverrideSchema = z.object({
extends: z.enum(['acp']).optional(), // v2.3: only 'acp' for custom; built-ins omit extends
label: z.string().min(1).optional(),
description: z.string().optional(),
command: z.array(z.string().min(1)).min(1).optional(), // [binary, ...args]
env: z.record(z.string()).optional(),
enabled: z.boolean().optional(), // default true
order: z.number().int().optional(), // UI sort key
models: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
additionalModels: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
});
export const CoderProvidersFileSchema = z.object({
providers: z.record(ProviderOverrideSchema).default({}),
});
export type ProviderOverride = z.infer<typeof ProviderOverrideSchema>;
export type CoderProvidersFile = z.infer<typeof CoderProvidersFileSchema>;
/**
* PATCH body schema (design.md §6.2). A partial providers map where each value
* is either a full override object (REPLACES that id's override) or `null`
* (DELETES the override → revert to the built-in default). Ids absent from the
* patch are left untouched. The route validates the body against this first
* (malformed → 422) so a bad shape can never reach the merge/save step.
*/
export const ProviderConfigPatchSchema = z.object({
providers: z.record(ProviderOverrideSchema.nullable()).default({}),
});
export type ProviderConfigPatch = z.infer<typeof ProviderConfigPatchSchema>;
/**
* Shallow per-id merge (design.md §6.2 / Paseo `patchConfig`). Each key in
* `patch.providers` REPLACES that id's override object wholesale (NOT a deep
* field merge); a `null` value DELETES the override. Returns a new object —
* never mutates `current`. The result is a plain CoderProvidersFile (no nulls),
* which the route re-validates against CoderProvidersFileSchema before save.
*/
export function mergeProviderConfigPatch(
current: CoderProvidersFile,
patch: ProviderConfigPatch,
): CoderProvidersFile {
const providers: Record<string, ProviderOverride> = { ...current.providers };
for (const [id, override] of Object.entries(patch.providers)) {
if (override === null) {
delete providers[id];
} else {
providers[id] = override;
}
}
return { providers };
}
/** Read + parse + validate. Falls back to built-ins-only on any failure; never throws. */
export function load(path: string): CoderProvidersFile {
let raw: string;
try {
raw = readFileSync(path, 'utf8');
} catch {
// Missing file → built-ins only. Expected, not an error.
return { providers: {} };
}
let json: unknown;
try {
json = JSON.parse(raw);
} catch (err) {
console.error(`provider-config: invalid JSON in ${path} — using built-ins only`, err);
return { providers: {} };
}
const parsed = CoderProvidersFileSchema.safeParse(json);
if (!parsed.success) {
console.error(
`provider-config: schema validation failed for ${path} — using built-ins only`,
parsed.error.flatten(),
);
return { providers: {} };
}
return parsed.data;
}
/** Write the config back to disk (used by the Phase 4 PATCH route). */
export function save(path: string, config: CoderProvidersFile): void {
writeFileSync(path, `${JSON.stringify(config, null, 2)}\n`, 'utf8');
}
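
The PATCH semantics described for `mergeProviderConfigPatch` (wholesale replace per id, `null` deletes, untouched ids stay) are easy to misread as a deep merge. The sketch below restates the merge with simplified local types, with no zod involved, to show the difference. The `OverrideSketch` fields and the sample provider values are illustrative assumptions.

```typescript
interface OverrideSketch {
  label?: string;
  enabled?: boolean;
  command?: string[];
}
interface FileSketch {
  providers: Record<string, OverrideSketch>;
}
interface PatchSketch {
  providers: Record<string, OverrideSketch | null>;
}

// Shallow per-id merge: replace the whole override, or delete it on null.
function mergePatchSketch(current: FileSketch, patch: PatchSketch): FileSketch {
  const providers: Record<string, OverrideSketch> = { ...current.providers };
  for (const [id, override] of Object.entries(patch.providers)) {
    if (override === null) {
      delete providers[id];
    } else {
      providers[id] = override;
    }
  }
  return { providers };
}

const currentConfig: FileSketch = {
  providers: {
    goose: { enabled: false, label: 'Goose (disabled)' },
    qwen: { enabled: false },
    claude: { label: 'Claude' },
  },
};

const patchedConfig = mergePatchSketch(currentConfig, {
  providers: {
    goose: { enabled: true }, // replaces the whole goose override: the label is dropped
    qwen: null, // deletes the override: qwen reverts to its built-in default
  },
});
console.log(JSON.stringify(patchedConfig));
// {"providers":{"goose":{"enabled":true},"claude":{"label":"Claude"}}}
```

A client that wants to change one field therefore has to send the full override object for that id, not just the changed field.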


@@ -0,0 +1,71 @@
/**
* v2.3 Phase 4 (design.md §8) — per-provider plaintext diagnostic report.
*
* Read-only by default: reports CACHED state (resolved registry def + the
* available_agents row + the warm snapshot-cache entry) plus a `which`-style
* PATH check for the launch binary. It does NOT spawn an ACP probe — §8 lists
* the live initialize probe as optional, and the route defaults to cached state.
*
* A template string is the whole formatter (no Paseo diagnostic-utils port).
*/
import type { ResolvedProviderDef } from './provider-config-registry.js';
import type { ProviderSnapshotEntry, ProviderModel } from './provider-types.js';
import { isCommandAvailable } from './command-availability.js';
/** The subset of an `available_agents` row the diagnostic reads. */
export interface DiagnosticAgentRow {
name: string;
install_path: string | null;
supports_acp?: boolean;
models?: ProviderModel[] | null;
last_probed_at?: string | Date | null;
}
interface DiagnosticOpts {
/** Warm snapshot-cache entry (read-only peek) — source of the last probe error. */
cachedEntry?: ProviderSnapshotEntry;
/** Injectable PATH check (defaults to the real `which`); stubbed in tests. */
checkAvailable?: (binary: string) => Promise<boolean>;
}
/** Resolve the binary the dispatcher would launch (for the PATH check + report). */
function resolveBinary(resolved: ResolvedProviderDef, agentRow: DiagnosticAgentRow | undefined): string {
return resolved.launchCommand?.[0] ?? agentRow?.install_path ?? resolved.id;
}
export async function getProviderDiagnostic(
resolved: ResolvedProviderDef,
agentRow: DiagnosticAgentRow | undefined,
opts: DiagnosticOpts = {},
): Promise<string> {
const checkAvailable = opts.checkAvailable ?? isCommandAvailable;
const installed = agentRow?.install_path != null;
const binary = resolveBinary(resolved, agentRow);
// boocode is native (no binary to launch) — short-circuit the PATH check.
const commandAvailable = resolved.transport === 'native' ? true : await checkAvailable(binary);
const lastProbedAt =
agentRow?.last_probed_at != null ? new Date(agentRow.last_probed_at).toISOString() : '(never)';
const modelCount = agentRow?.models?.length ?? 0;
const launchCommand = resolved.launchCommand
? resolved.launchCommand.join(' ')
: '(built-in default, resolved at dispatch)';
const lastError = opts.cachedEntry?.error ?? '(none recorded)';
return [
`provider: ${resolved.id}`,
`label: ${resolved.configLabel ?? resolved.label}`,
`transport: ${resolved.transport}`,
`enabled: ${resolved.enabled}`,
`builtin: ${resolved.isBuiltin}`,
`customAcp: ${resolved.isCustomAcp}`,
`installed: ${installed}`,
`install_path: ${agentRow?.install_path ?? '(none)'}`,
`binary: ${binary}`,
`command_available: ${commandAvailable}`,
`launch_command: ${launchCommand}`,
`supports_acp: ${agentRow?.supports_acp ?? '(unknown)'}`,
`last_probed_at: ${lastProbedAt}`,
`models_in_db: ${modelCount}`,
`last_probe_error: ${lastError}`,
].join('\n');
}
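
`isCommandAvailable` is imported from `./command-availability.js`, which is not part of this diff. As a rough sketch only, assuming it performs a `which`-style PATH scan, the check might look like the following. `findOnPath` and its injectable `isExecutable` parameter are hypothetical names for illustration, not the module's actual API.

```typescript
import { accessSync, constants } from 'node:fs';
import { delimiter, join } from 'node:path';

/**
 * Default probe: true when the current user may execute `candidate`.
 * Directories also pass X_OK; a stricter version would stat the entry first.
 */
function isExecutableFile(candidate: string): boolean {
  try {
    accessSync(candidate, constants.X_OK);
    return true;
  } catch {
    return false;
  }
}

/**
 * Hypothetical `which`-style lookup (an assumption, not the real helper).
 * Returns the first matching path, or null when nothing on PATH matches.
 */
function findOnPath(
  binary: string,
  pathEnv: string = process.env.PATH ?? '',
  isExecutable: (candidate: string) => boolean = isExecutableFile,
): string | null {
  // A name that already contains a separator is checked as given.
  if (binary.includes('/')) return isExecutable(binary) ? binary : null;
  for (const dir of pathEnv.split(delimiter)) {
    if (!dir) continue; // skip empty PATH segments
    const candidate = join(dir, binary);
    if (isExecutable(candidate)) return candidate;
  }
  return null;
}
```

Passing the probe as a parameter mirrors the `checkAvailable` option on `getProviderDiagnostic`: tests can stub the filesystem instead of depending on what is installed on the host.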


@@ -0,0 +1,75 @@
/**
* Static provider mode metadata — lifted from Paseo provider-manifest.ts patterns.
*/
import type { ProviderMode } from './provider-types.js';
export interface ProviderManifestEntry {
defaultModeId: string | null;
modes: ProviderMode[];
/** Claude effort levels exposed as thinking options on models. */
thinkingOptions?: Array<{ id: string; label: string }>;
}
const CLAUDE_MODES: ProviderMode[] = [
{ id: 'default', label: 'Always Ask', description: 'Prompts for permission the first time a tool is used' },
{ id: 'auto', label: 'Auto mode', description: 'Model classifier reviews permission prompts automatically' },
{ id: 'acceptEdits', label: 'Accept File Edits', description: 'Automatically approves edit-focused tools' },
{ id: 'plan', label: 'Plan Mode', description: 'Analyze without executing tools or edits' },
{ id: 'bypassPermissions', label: 'Bypass', description: 'Skip all permission prompts', isUnattended: true },
];
const OPENCODE_MODES: ProviderMode[] = [
{ id: 'build', label: 'Build', description: 'Allows edits and tool execution' },
{ id: 'plan', label: 'Plan', description: 'Read-only planning mode' },
{ id: 'full-access', label: 'Full Access', description: 'Auto-approves all tool prompts', isUnattended: true },
];
const QWEN_PTY_MODES: ProviderMode[] = [
{ id: 'default', label: 'Default', description: 'Prompt for approval' },
{ id: 'plan', label: 'Plan', description: 'Plan only — no edits' },
{ id: 'auto-edit', label: 'Auto Edit', description: 'Auto-approve edit tools' },
{ id: 'auto', label: 'Auto', description: 'LLM classifier auto-approves safe actions' },
{ id: 'yolo', label: 'YOLO', description: 'Auto-approve all tools', isUnattended: true },
];
const CLAUDE_THINKING = [
{ id: 'low', label: 'Low' },
{ id: 'medium', label: 'Medium' },
{ id: 'high', label: 'High' },
{ id: 'xhigh', label: 'Extra High' },
{ id: 'max', label: 'Max' },
];
export const PROVIDER_MANIFEST: Record<string, ProviderManifestEntry> = {
claude: {
defaultModeId: 'default',
modes: CLAUDE_MODES,
thinkingOptions: CLAUDE_THINKING,
},
opencode: {
defaultModeId: 'build',
modes: OPENCODE_MODES,
},
goose: {
defaultModeId: null,
modes: [],
},
qwen: {
defaultModeId: 'default',
modes: QWEN_PTY_MODES,
},
};
export function getManifestModes(provider: string): ProviderMode[] {
return PROVIDER_MANIFEST[provider]?.modes ?? [];
}
export function getManifestDefaultModeId(provider: string): string | null {
return PROVIDER_MANIFEST[provider]?.defaultModeId ?? null;
}
export function isUnattendedMode(provider: string, modeId: string | undefined): boolean {
if (!modeId) return false;
const modes = getManifestModes(provider);
return modes.some((m) => m.id === modeId && m.isUnattended);
}
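
A usage sketch of the manifest lookups above, using a trimmed copy of the manifest so the block runs on its own. `resolveModeId` is a hypothetical helper (not in the source) showing one way a caller might fall back to the manifest default when a requested mode id is unknown.

```typescript
interface ProviderMode {
  id: string;
  label: string;
  isUnattended?: boolean;
}

// Trimmed copy of the manifest, just enough to exercise the lookups.
const MANIFEST: Record<string, { defaultModeId: string | null; modes: ProviderMode[] }> = {
  claude: {
    defaultModeId: 'default',
    modes: [
      { id: 'default', label: 'Always Ask' },
      { id: 'plan', label: 'Plan Mode' },
      { id: 'bypassPermissions', label: 'Bypass', isUnattended: true },
    ],
  },
  goose: { defaultModeId: null, modes: [] },
};

// Same logic as isUnattendedMode in the file above, against the trimmed manifest.
function isUnattendedMode(provider: string, modeId: string | undefined): boolean {
  if (!modeId) return false;
  const modes = MANIFEST[provider]?.modes ?? [];
  return modes.some((m) => m.id === modeId && m.isUnattended === true);
}

// Hypothetical: keep the requested mode only if the provider declares it.
function resolveModeId(provider: string, requested?: string): string | null {
  const entry = MANIFEST[provider];
  if (!entry) return null;
  if (requested && entry.modes.some((m) => m.id === requested)) return requested;
  return entry.defaultModeId;
}

console.log(isUnattendedMode('claude', 'bypassPermissions')); // true
console.log(isUnattendedMode('claude', 'plan')); // false
console.log(resolveModeId('claude', 'no-such-mode')); // 'default'
```

Unknown providers and providers with an empty mode list (goose) both resolve to "not unattended", so a caller gating auto-approval on this check fails closed.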


@@ -0,0 +1,69 @@
export interface ProviderDef {
name: string;
label: string;
transport: 'native' | 'acp' | 'pty';
modelSource: 'llama-swap' | 'static' | 'probe';
staticModels?: Array<{ id: string; label: string }>;
/** Merge llama-swap models into probed list (OpenCode). */
mergeLlamaSwap?: boolean;
}
/**
* Model discovery rules (see provider-snapshot.ts):
* - boocode: llama-swap only
* - opencode: ACP probe + mergeLlamaSwap (prefixed llama-swap/* ids)
* - qwen: ACP probe + merge ~/.qwen/settings.json; PTY fallback reads settings only
* - goose: ACP probe only
* - claude: static manifest models + thinking options
*/
export const PROVIDERS: ProviderDef[] = [
{
name: 'boocode',
label: 'BooCoder',
transport: 'native',
modelSource: 'llama-swap',
},
{
name: 'opencode',
label: 'OpenCode',
transport: 'acp',
modelSource: 'probe',
mergeLlamaSwap: true,
},
{
name: 'goose',
label: 'Goose',
transport: 'acp',
modelSource: 'probe',
},
{
name: 'claude',
label: 'Claude Code',
transport: 'pty',
modelSource: 'static',
// Passed verbatim to `claude --model <id>` (PTY dispatch). The CLI accepts a
// latest-alias ('opus'/'sonnet'/'haiku') or a pinned full name
// ('claude-opus-4-8'). Aliases never go stale; pinned IDs let you select an
// exact version. Extend/replace per-install via data/coder-providers.json
// (models / additionalModels) without a code change.
staticModels: [
{ id: 'opus', label: 'Opus (latest)' },
{ id: 'claude-opus-4-8', label: 'Opus 4.8' },
{ id: 'sonnet', label: 'Sonnet (latest)' },
{ id: 'claude-sonnet-4-6', label: 'Sonnet 4.6' },
{ id: 'haiku', label: 'Haiku (latest)' },
{ id: 'claude-haiku-4-5-20251001', label: 'Haiku 4.5' },
],
},
{
name: 'qwen',
label: 'Qwen Code',
transport: 'acp',
modelSource: 'probe',
},
];
export const PROVIDERS_BY_NAME = new Map(PROVIDERS.map((p) => [p.name, p]));
/** External agents probed on host (excludes native boocode). */
export const PROBED_AGENT_NAMES = PROVIDERS.filter((p) => p.name !== 'boocode').map((p) => p.name);
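
The opencode rule in the discovery comment above (ACP probe plus prefixed `llama-swap/*` ids) can be sketched in isolation. The two functions are condensed copies of `mergeModels` and `prefixLlamaSwapModels` from provider-snapshot.ts; the model ids and labels are invented for the example.

```typescript
interface ProviderModel {
  id: string;
  label: string;
}

// Namespace llama-swap ids so they cannot collide with provider-native ids.
function prefixLlamaSwapModels(models: ProviderModel[]): ProviderModel[] {
  return models.map((m) => ({
    ...m,
    id: m.id.startsWith('llama-swap/') ? m.id : `llama-swap/${m.id}`,
  }));
}

// First occurrence of an id wins, so earlier lists take precedence.
function mergeModels(...lists: ProviderModel[][]): ProviderModel[] {
  const seen = new Set<string>();
  const out: ProviderModel[] = [];
  for (const list of lists) {
    for (const m of list) {
      if (seen.has(m.id)) continue;
      seen.add(m.id);
      out.push(m);
    }
  }
  return out;
}

// Invented sample data: one probed model, two llama-swap entries that
// collapse to the same id once prefixed.
const probed: ProviderModel[] = [{ id: 'provider/model-a', label: 'Model A' }];
const llamaSwap: ProviderModel[] = [
  { id: 'local-model', label: 'local-model' },
  { id: 'llama-swap/local-model', label: 'duplicate' },
];

const merged = mergeModels(probed, prefixLlamaSwapModels(llamaSwap));
console.log(merged.map((m) => m.id));
```

Because the merge is first-wins, the order of the argument lists decides precedence: probed models come first, and an already-prefixed id is left alone rather than double-prefixed.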


@@ -0,0 +1,334 @@
/**
* Provider snapshot cache — cold ACP probe per provider + static manifest merge.
*/
import { homedir } from 'node:os';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../db.js';
import type { Config } from '../config.js';
import {
getManifestDefaultModeId,
getManifestModes,
PROVIDER_MANIFEST,
} from './provider-manifest.js';
import { probeAcpProvider } from './acp-probe.js';
import type { ProviderModel, ProviderSnapshotEntry, AgentCommand } from './provider-types.js';
import { getManifestCommands, mergeCommands } from './provider-commands.js';
import { readQwenSettingsModels } from './qwen-settings.js';
import { getResolvedRegistry, type ResolvedProviderDef } from './provider-config-registry.js';
import { isCommandAvailable } from './command-availability.js';
import { discoverClaudeCommands } from './claude-command-discovery.js';
interface AgentRow {
name: string;
install_path: string | null;
supports_acp: boolean;
models: ProviderModel[] | null;
commands: AgentCommand[] | null;
label: string | null;
transport: string | null;
last_probed_at: string | Date | null;
}
export async function fetchLlamaSwapModels(config: Config): Promise<ProviderModel[]> {
try {
const res = await fetch(`${config.LLAMA_SWAP_URL}/v1/models`);
if (!res.ok) return [];
const parsed = (await res.json()) as { data?: Array<{ id: string }> };
return (parsed.data ?? []).map((m) => ({ id: m.id, label: m.id }));
} catch {
return [];
}
}
/** Prefix llama-swap model ids so they don't collide with provider-native models. */
export function prefixLlamaSwapModels(models: ProviderModel[]): ProviderModel[] {
return models.map((m) => ({
...m,
id: m.id.startsWith('llama-swap/') ? m.id : `llama-swap/${m.id}`,
}));
}
function attachClaudeThinking(models: ProviderModel[]): ProviderModel[] {
const thinking = PROVIDER_MANIFEST.claude?.thinkingOptions;
if (!thinking?.length) return models;
return models.map((m) => ({
...m,
thinkingOptions: thinking,
defaultThinkingOptionId: 'medium',
}));
}
export function mergeModels(...lists: ProviderModel[][]): ProviderModel[] {
const seen = new Set<string>();
const out: ProviderModel[] = [];
for (const list of lists) {
for (const m of list) {
if (seen.has(m.id)) continue;
seen.add(m.id);
out.push(m);
}
}
return out;
}
async function buildProviderEntry(
resolved: ResolvedProviderDef,
agentRow: AgentRow | undefined,
llamaModels: ProviderModel[],
cwd: string,
ttlMs: number,
force: boolean,
): Promise<ProviderSnapshotEntry> {
const name = resolved.id;
const isNative = resolved.transport === 'native';
const fallbackModes = getManifestModes(name);
const defaultModeId = getManifestDefaultModeId(name);
const manifestCommands = getManifestCommands(name);
// Manifest + persisted live ACP commands (captured on a prior cold probe), so
// the agent's discovered commands show even when the tier-2 probe is skipped.
const dbCommands = mergeCommands(manifestCommands, agentRow?.commands ?? []);
const label = agentRow?.label ?? resolved.configLabel ?? resolved.label;
const descr = resolved.configDescription ? { description: resolved.configDescription } : {};
// v2.3: config `models` REPLACES the discovered/static list; `additionalModels`
// MERGES on top. Applied to every ready/installed model list below.
const withConfigModels = (m: ProviderModel[]): ProviderModel[] => {
let out = resolved.configModels && resolved.configModels.length > 0 ? resolved.configModels : m;
if (resolved.configAdditionalModels && resolved.configAdditionalModels.length > 0) {
out = mergeModels(out, resolved.configAdditionalModels);
}
return out;
};
// ACP built-ins fall back to PTY transport when the installed binary lacks ACP.
let transport = resolved.transport;
if (agentRow && resolved.transport === 'acp' && !agentRow.supports_acp) {
transport = 'pty';
}
// 1. Disabled → unavailable, no probe.
if (!resolved.enabled) {
return {
name, label, ...descr, transport, status: 'unavailable',
enabled: false, installed: false, models: [], modes: fallbackModes,
defaultModeId, commands: manifestCommands,
};
}
// 2. Native boocode → always ready (llama-swap models).
if (isNative) {
return {
name, label: resolved.label, transport, status: 'ready',
enabled: true, installed: true, models: withConfigModels(llamaModels), modes: [],
defaultModeId: null, commands: manifestCommands,
};
}
// 3. Tier-1 fast availability: installed iff a probed install_path exists or
// the launch binary is on PATH. No spawn beyond a `which` for custom entries.
const fast =
agentRow?.install_path != null ||
(resolved.launchCommand ? await isCommandAvailable(resolved.launchCommand[0]) : false);
if (!fast) {
return {
name, label, ...descr, transport, status: 'unavailable',
enabled: true, installed: false, models: [], modes: fallbackModes,
defaultModeId, commands: manifestCommands,
};
}
// Baseline model precedence (used by claude + non-probe fallbacks).
let models: ProviderModel[] = [];
if (resolved.modelSource === 'llama-swap' && resolved.mergeLlamaSwap) {
models = llamaModels;
} else if (agentRow?.models?.length) {
models = agentRow.models;
} else if (resolved.staticModels) {
models = resolved.staticModels.map((m) => ({ id: m.id, label: m.label }));
}
// claude: static models + thinking options, no ACP probe (unchanged from v2.2).
if (name === 'claude') {
// claude is PTY (no ACP discovery) — read its enabled commands + plugin
// skills from disk live (the snapshot cache rate-limits the fs reads).
return {
name, label, transport, status: 'ready', enabled: true, installed: true,
models: attachClaudeThinking(withConfigModels(models)), modes: fallbackModes, defaultModeId,
commands: mergeCommands(manifestCommands, discoverClaudeCommands()),
};
}
const canProbeAcp =
transport === 'acp' &&
((agentRow?.install_path != null && agentRow.supports_acp) ||
(resolved.isCustomAcp && resolved.launchCommand != null));
if (canProbeAcp) {
// Tier-2 gate (§4.3): cold ACP probe only on force, staleness, or empty DB
// models. Otherwise serve DB models + manifest modes/commands — no spawn.
const lastProbedMs =
agentRow?.last_probed_at != null ? new Date(agentRow.last_probed_at).getTime() : NaN;
const stale = Number.isNaN(lastProbedMs) || Date.now() - lastProbedMs > ttlMs;
const dbEmpty = !(agentRow?.models && agentRow.models.length > 0);
const runTier2 = force || stale || dbEmpty;
if (!runTier2) {
let skipModels = agentRow?.models ?? [];
if (resolved.mergeLlamaSwap && resolved.modelSource !== 'llama-swap') {
skipModels = mergeModels(skipModels, prefixLlamaSwapModels(llamaModels));
} else if (resolved.modelSource === 'llama-swap' && skipModels.length === 0) {
skipModels = llamaModels;
}
return {
name, label, transport, status: 'ready', enabled: true, installed: true,
models: withConfigModels(skipModels), modes: fallbackModes, defaultModeId, commands: dbCommands,
};
}
const probeTarget =
resolved.isCustomAcp && resolved.launchCommand
? resolved.launchCommand[0]
: agentRow!.install_path!;
const probe = await probeAcpProvider(name, probeTarget, cwd);
let probeModels = probe.models.length > 0 ? probe.models : models;
if (name === 'qwen') {
probeModels = mergeModels(probeModels, await readQwenSettingsModels());
}
if (resolved.mergeLlamaSwap && resolved.modelSource !== 'llama-swap') {
const nativeModels = probe.models.length > 0 ? probe.models : probeModels;
probeModels = mergeModels(nativeModels, prefixLlamaSwapModels(llamaModels));
}
return {
name, label, transport,
status: probe.ok ? 'ready' : 'error',
enabled: true, installed: true,
models: withConfigModels(probeModels),
modes: probe.modes.length > 0 ? probe.modes : fallbackModes,
defaultModeId: probe.defaultModeId ?? defaultModeId,
commands: mergeCommands(manifestCommands, probe.commands),
...(probe.error ? { error: probe.error } : {}),
fetchedAt: new Date().toISOString(),
};
}
// PTY-only fallback (e.g. qwen without ACP) — installed + ready.
if (name === 'qwen' && models.length === 0) {
models = await readQwenSettingsModels();
}
return {
name, label, transport, status: 'ready', enabled: true, installed: true,
models: withConfigModels(models), modes: fallbackModes, defaultModeId, commands: dbCommands,
};
}
const snapshotCache = new Map<string, { at: number; entries: ProviderSnapshotEntry[] }>();
const snapshotInflight = new Map<string, Promise<ProviderSnapshotEntry[]>>();
const CACHE_TTL_MS = 5 * 60_000;
export async function getProviderSnapshot(
sql: Sql,
config: Config,
cwd?: string,
force = false,
): Promise<ProviderSnapshotEntry[]> {
const resolvedCwd = cwd?.trim() || homedir();
const cacheKey = resolvedCwd;
const cached = snapshotCache.get(cacheKey);
if (!force && cached && Date.now() - cached.at < CACHE_TTL_MS) {
return cached.entries;
}
const inflight = snapshotInflight.get(cacheKey);
if (!force && inflight) {
return inflight;
}
const build = async (): Promise<ProviderSnapshotEntry[]> => {
const llamaModels = await fetchLlamaSwapModels(config);
const agents = await sql<AgentRow[]>`
SELECT name, install_path, supports_acp, models, commands, label, transport, last_probed_at FROM available_agents
`;
const agentMap = new Map(agents.map((a) => [a.name, a]));
const ttlMs = config.PROVIDER_PROBE_TTL_MS;
const entries = await Promise.all(
[...getResolvedRegistry().values()].map((resolved) =>
buildProviderEntry(resolved, agentMap.get(resolved.id), llamaModels, resolvedCwd, ttlMs, force),
),
);
snapshotCache.set(cacheKey, { at: Date.now(), entries });
return entries;
};
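const promise = build().finally(() => {
// Only clear our own slot: a forced rebuild may have replaced it meanwhile.
if (snapshotInflight.get(cacheKey) === promise) snapshotInflight.delete(cacheKey);
});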
snapshotInflight.set(cacheKey, promise);
// Await the build (force or cache-miss) and return terminal entries. The sync
// `loading` return (design §4.4) is DEFERRED until Phase 5 ships the client
// poll that resolves it: without that poll, a single fetch lands on
// installed:false `loading` entries, which AgentComposerBar filters out
// (`e.installed && ...`) → empty picker. Builds stay fast via the tier-2 skip
// once available_agents.models is warm.
return promise;
}
export function clearProviderSnapshotCache(): void {
snapshotCache.clear();
snapshotInflight.clear();
}
/**
* Read-only peek into the warm snapshot cache for one provider (no build, no
* probe). Used by the diagnostic route to report the last computed probe error
* without spawning anything. Returns undefined on a cold cache / unknown name.
*/
export function peekSnapshotEntry(name: string, cwd?: string): ProviderSnapshotEntry | undefined {
const resolvedCwd = cwd?.trim() || homedir();
return snapshotCache.get(resolvedCwd)?.entries.find((e) => e.name === name);
}
/** Persist probed model lists back to available_agents for fast legacy reads. */
export async function persistProbedModels(
sql: Sql,
entries: ProviderSnapshotEntry[],
log: FastifyBaseLogger,
): Promise<void> {
let count = 0;
for (const entry of entries) {
if (entry.name === 'boocode') continue;
let persisted = false;
if (entry.models.length > 0) {
const flatModels = entry.models.map(({ id, label }) => ({ id, label }));
await sql`
UPDATE available_agents
SET models = ${sql.json(flatModels as never)}, last_probed_at = clock_timestamp()
WHERE name = ${entry.name}
`;
persisted = true;
}
// Persist captured ACP commands so they survive the tier-2 probe skip and
// show without a dispatch. Only when non-empty — never clobber a prior set.
if (entry.commands.length > 0) {
const flatCommands = entry.commands.map((c) => ({
name: c.name,
...(c.description ? { description: c.description } : {}),
}));
await sql`
UPDATE available_agents
SET commands = ${sql.json(flatCommands as never)}, last_probed_at = clock_timestamp()
WHERE name = ${entry.name}
`;
persisted = true;
}
if (persisted) count++;
}
if (count > 0) {
log.info({ count }, 'provider-snapshot: persisted models/commands to available_agents');
}
}
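
The tier-2 gate inside `buildProviderEntry` (cold probe only on force, staleness, or empty DB models) can be read as a pure decision. The sketch below extracts it for illustration; `shouldRunTier2`, `Tier2Input`, and the injectable `now` are hypothetical names, not part of the module.

```typescript
interface Tier2Input {
  force: boolean;
  lastProbedAt: string | Date | null | undefined;
  dbModelCount: number;
  ttlMs: number;
  /** Injectable clock for tests; defaults to the wall clock. */
  now?: number;
}

function shouldRunTier2(input: Tier2Input): boolean {
  const { force, lastProbedAt, dbModelCount, ttlMs, now = Date.now() } = input;
  // A missing or unparseable timestamp yields NaN and is treated as stale.
  const lastProbedMs = lastProbedAt != null ? new Date(lastProbedAt).getTime() : NaN;
  const stale = Number.isNaN(lastProbedMs) || now - lastProbedMs > ttlMs;
  const dbEmpty = dbModelCount === 0;
  return force || stale || dbEmpty;
}

const now = Date.parse('2026-06-01T12:00:00Z');
const fiveMinutes = 5 * 60_000;

// Probed one minute ago with models on record: serve from the DB, no spawn.
console.log(
  shouldRunTier2({ force: false, lastProbedAt: '2026-06-01T11:59:00Z', dbModelCount: 3, ttlMs: fiveMinutes, now }),
); // false

// Never probed: the cold probe runs.
console.log(shouldRunTier2({ force: false, lastProbedAt: null, dbModelCount: 3, ttlMs: fiveMinutes, now })); // true
```

Keeping the decision pure makes the three triggers testable without a database row or a spawned probe.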


@@ -0,0 +1,61 @@
/** Shared provider / snapshot types (Paseo-shaped, BooCoder-native). */
export interface ProviderMode {
id: string;
label: string;
description?: string;
/** Auto-approve tool permissions when this mode is selected. */
isUnattended?: boolean;
}
export interface ThinkingOption {
id: string;
label: string;
isDefault?: boolean;
}
export interface ProviderModel {
id: string;
label: string;
description?: string;
isDefault?: boolean;
thinkingOptions?: ThinkingOption[];
defaultThinkingOptionId?: string;
}
// v2.3 phase 2: 'loading' (cache-miss, probe in flight) + 'unavailable'
// (disabled or not installed) restored alongside the terminal 'ready' | 'error'.
export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
export interface AgentCommand {
name: string;
description?: string;
// v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
// Drives the icon split in the coder slash menu. Undefined → command.
kind?: 'command' | 'skill';
}
// KEEP IN SYNC with apps/web/src/api/types.ts ProviderSnapshotEntry — parity is
// enforced by __tests__/provider-types-parity.test.ts (fails on any field drift).
export interface ProviderSnapshotEntry {
name: string;
label: string;
description?: string;
transport: string;
status: ProviderSnapshotStatus;
enabled: boolean;
installed: boolean;
models: ProviderModel[];
modes: ProviderMode[];
defaultModeId: string | null;
commands: AgentCommand[];
error?: string;
fetchedAt?: string;
}
export interface AgentSessionConfig {
provider: string;
model?: string;
modeId?: string;
thinkingOptionId?: string;
}


@@ -0,0 +1,195 @@
/**
* PTY dispatch — runs external agents directly on the host.
*
* claude + qwen run with `--output-format stream-json` and emit Claude-Code's
* stream-json NDJSON on stdout. When an `onEvent` callback is supplied we
* line-buffer that stdout (split on `\n`, hold the partial tail) and feed complete
* lines to `makeStreamJsonParser` so deltas surface live as AgentEvents. The raw
* stdout is still accumulated + returned for back-compat (and the dispatcher's
* fallback when nothing parsed). See `stream-json-parser.ts`.
*/
import type { FastifyBaseLogger } from 'fastify';
import { spawn } from 'node:child_process';
import type { AgentEvent } from './agent-backend.js';
import { makeStreamJsonParser, type StreamJsonUsage } from './stream-json-parser.js';
export interface DispatchResult {
exitCode: number;
stdout: string;
stderr: string;
/** True iff at least one NDJSON AgentEvent was parsed from stdout (feature #7).
* When false, the dispatcher falls back to slicing stdout as the assistant content. */
streamed: boolean;
/** Final usage parsed from the stream-json `result` / `message_delta`, if any. */
usage?: StreamJsonUsage;
/** Provider session id from the stream-json `system` init line, if any. */
agentSessionId?: string | null;
}
export interface PtyDispatchOpts {
agent: string;
task: string;
worktreePath: string;
model?: string;
modeId?: string;
thinkingOptionId?: string;
installPath?: string;
signal?: AbortSignal;
log: FastifyBaseLogger;
/** Optional live event sink. When set, stdout is line-buffered + NDJSON-parsed
* and each AgentEvent is forwarded here as it arrives. Absent → opaque (old)
* behavior: stdout is accumulated and returned, no parsing. */
onEvent?: (e: AgentEvent) => void;
}
interface PtySpawnSpec {
binary: string;
args: string[];
stdin?: string;
}
function buildPtySpawnSpec(
agent: string,
task: string,
model?: string,
modeId?: string,
thinkingOptionId?: string,
installPath?: string,
): PtySpawnSpec | null {
const binary = installPath ?? agent;
switch (agent) {
case 'claude': {
// stream-json on -p requires --verbose (Claude Code rejects stream-json
// print mode without it). qwen needs no such flag.
const args = ['-p', '--output-format', 'stream-json', '--verbose'];
if (model) args.push('--model', model);
if (modeId) args.push('--permission-mode', modeId);
if (thinkingOptionId) args.push('--effort', thinkingOptionId);
return { binary, args, stdin: task };
}
case 'qwen': {
const args = ['-p', task, '--output-format', 'stream-json'];
if (model) args.push('--model', model);
if (modeId) args.push('--approval-mode', modeId);
return { binary, args };
}
case 'opencode':
return {
binary,
args: model ? ['--model', model] : [],
stdin: task,
};
case 'goose':
return {
binary,
args: model ? ['run', '--text', task, '--model', model] : ['run', '--text', task],
};
default:
return null;
}
}
export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchResult> {
const { agent, task, worktreePath, model, modeId, thinkingOptionId, installPath, signal, log, onEvent } = opts;
const cmd = buildPtySpawnSpec(agent, task, model, modeId, thinkingOptionId, installPath);
if (!cmd) {
return {
exitCode: 1,
stdout: '',
stderr: `Agent '${agent}' is not yet supported for PTY dispatch.`,
streamed: false,
};
}
log.info({ agent, binary: cmd.binary, worktreePath, modeId }, 'pty-dispatch: starting');
return new Promise<DispatchResult>((resolve, reject) => {
const child = spawn(cmd.binary, cmd.args, {
cwd: worktreePath,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env },
});
if (cmd.stdin) {
child.stdin!.write(cmd.stdin);
}
child.stdin!.end();
let stdout = '';
let stderr = '';
let killed = false;
// Live NDJSON parsing (only when a sink is supplied). Line-buffer: split on
// '\n', dispatch complete lines, hold the partial tail until the next chunk.
const parser = onEvent ? makeStreamJsonParser() : null;
let lineBuf = '';
let streamed = false;
const feedLine = (line: string): void => {
if (!parser || !onEvent) return;
for (const e of parser.push(line)) {
streamed = true;
onEvent(e);
}
};
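// Decode as UTF-8 at the stream level so a multi-byte character split across
// two chunks is not corrupted (a per-chunk Buffer.toString() would mangle it).
child.stdout!.setEncoding('utf8');
child.stdout!.on('data', (text: string) => {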
stdout += text;
if (!parser) return;
lineBuf += text;
let nl = lineBuf.indexOf('\n');
while (nl !== -1) {
const line = lineBuf.slice(0, nl);
lineBuf = lineBuf.slice(nl + 1);
feedLine(line);
nl = lineBuf.indexOf('\n');
}
});
child.stderr!.setEncoding('utf8');
child.stderr!.on('data', (text: string) => { stderr += text; });
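const cleanup = () => {
if (!killed) {
killed = true;
child.kill('SIGTERM');
// Escalate to SIGKILL if the child ignores SIGTERM; cancel the escalation
// once the child has exited so the timer does not outlive the process.
const forceKill = setTimeout(() => child.kill('SIGKILL'), 5_000);
child.once('close', () => clearTimeout(forceKill));
}
};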
if (signal) {
if (signal.aborted) {
cleanup();
resolve({ exitCode: 130, stdout: '', stderr: 'Aborted before start', streamed: false });
return;
}
signal.addEventListener('abort', cleanup, { once: true });
}
child.on('close', (code) => {
if (signal) signal.removeEventListener('abort', cleanup);
// Flush any final line with no trailing newline.
if (lineBuf.trim()) feedLine(lineBuf);
lineBuf = '';
log.info({ agent, exitCode: code, streamed }, 'pty-dispatch: completed');
resolve({
exitCode: code ?? 1,
stdout,
stderr,
streamed,
usage: parser?.usage(),
agentSessionId: parser?.sessionId() ?? null,
});
});
child.on('error', (err) => {
if (signal) signal.removeEventListener('abort', cleanup);
log.error({ agent, err: err.message }, 'pty-dispatch: spawn error');
reject(err);
});
});
}
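
The line-buffering step in `dispatchViaPty` (split on `\n`, hold the partial tail, flush on close) can be sketched on its own. This version uses Node's `StringDecoder` so a multi-byte character split across two chunks stays intact. `makeLineBuffer` is a hypothetical name for illustration, not part of the module.

```typescript
import { StringDecoder } from 'node:string_decoder';

interface LineBuffer {
  push(chunk: Buffer): void;
  flush(): void;
}

function makeLineBuffer(onLine: (line: string) => void): LineBuffer {
  const decoder = new StringDecoder('utf8');
  let tail = '';

  // Emit every complete line currently buffered; keep the partial tail.
  const drain = (): void => {
    let nl = tail.indexOf('\n');
    while (nl !== -1) {
      onLine(tail.slice(0, nl));
      tail = tail.slice(nl + 1);
      nl = tail.indexOf('\n');
    }
  };

  return {
    push(chunk: Buffer): void {
      // write() holds back an incomplete multi-byte sequence until it completes.
      tail += decoder.write(chunk);
      drain();
    },
    flush(): void {
      // end() releases whatever bytes the decoder was still holding.
      tail += decoder.end();
      drain();
      if (tail.trim()) onLine(tail); // final line without a trailing newline
      tail = '';
    },
  };
}

const lines: string[] = [];
const buf = makeLineBuffer((line) => lines.push(line));
buf.push(Buffer.from('{"a":1}\n{"b"'));
buf.push(Buffer.from(':2}\n{"c":3}'));
buf.flush();
console.log(lines);
```

Each emitted line would then be handed to the NDJSON parser; the buffer itself stays unaware of JSON, which keeps the two concerns separately testable.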


@@ -0,0 +1,21 @@
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';
import { homedir } from 'node:os';
import type { ProviderModel } from './provider-types.js';
const QWEN_SETTINGS_PATH = join(homedir(), '.qwen', 'settings.json');
export async function readQwenSettingsModels(): Promise<ProviderModel[]> {
try {
const raw = await readFile(QWEN_SETTINGS_PATH, 'utf8');
if (!raw.trim()) return [];
const settings = JSON.parse(raw) as {
modelProviders?: { openai?: Array<{ id: string }> };
};
const openaiModels = settings?.modelProviders?.openai;
if (!Array.isArray(openaiModels)) return [];
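// Skip malformed entries so a missing or non-string id never reaches the picker.
return openaiModels
.filter((m) => typeof m?.id === 'string' && m.id.length > 0)
.map((m) => ({ id: m.id, label: m.id }));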
} catch {
return [];
}
}
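
To make the expected settings shape concrete, here is a pure variant of the parsing step that takes the file contents as a string. Only the `modelProviders.openai[].id` path comes from the code above; `extractQwenModels` is a hypothetical name, the sample ids are invented, and any other keys in a real `~/.qwen/settings.json` are simply ignored.

```typescript
interface ProviderModel {
  id: string;
  label: string;
}

function extractQwenModels(raw: string): ProviderModel[] {
  if (!raw.trim()) return [];
  let settings: unknown;
  try {
    settings = JSON.parse(raw);
  } catch {
    return []; // unreadable JSON is treated as "no models"
  }
  const openai = (settings as { modelProviders?: { openai?: unknown } } | null)?.modelProviders?.openai;
  if (!Array.isArray(openai)) return [];
  const out: ProviderModel[] = [];
  for (const entry of openai) {
    const id = (entry as { id?: unknown } | null)?.id;
    // Keep only entries with a usable string id.
    if (typeof id === 'string' && id) out.push({ id, label: id });
  }
  return out;
}

// Invented sample: two valid entries and one malformed entry.
const sample = JSON.stringify({
  modelProviders: {
    openai: [{ id: 'example-model-a' }, { id: 42 }, { id: 'example-model-b' }],
  },
});

console.log(extractQwenModels(sample).map((m) => m.id));
```

Separating the parse from the file read lets the shape handling be tested without touching the home directory.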


@@ -0,0 +1,296 @@
/**
* Claude-Code-compatible stream-json NDJSON parser (feature #7,
* openspec `sampling-streamjson-tokens`).
*
* qwen (`--output-format stream-json`) and claude (`--output-format stream-json --verbose`)
* both emit Claude-Code's stream-json NDJSON on stdout: one JSON object per line.
* This module turns that stream into the same transport-agnostic `AgentEvent`s the
* ACP / opencode-server backends emit, so the PTY dispatch path can publish live
* broker frames + persist structured parts instead of treating stdout as an opaque slice.
*
* Two surfaces:
* - `parseStreamJsonLine(line, state)` — PURE per-line mapping (unit-testable).
* `state` is the caller-owned accumulator (open tool blocks + usage/session_id).
* - `makeStreamJsonParser()` — a thin stateful wrapper holding the state, with a
* `push(line)` that returns the events for that line and getters for the final
* `usage` / `sessionId`.
*
* Defensive by contract: a non-JSON / partial / garbage line yields `[]` and never
* throws. Tool args (`input_json_delta`) arrive fragmented across many lines; we
* accumulate the partial JSON string per content-block index and only surface the
* parsed `rawInput` once the block stops (or, as a fallback, off the terminal
* `assistant` message which carries the fully-assembled `tool_use` blocks).
*
* Schema (keyed on top-level `type`):
* - `system` — init: { session_id, tools, ... }
* - `assistant` — { message: { content: [ {type:'text'|'thinking'|'tool_use', ...} ], usage? } }
* - `user` — tool results (ignored — diffing the worktree captures effects)
* - `result` — final: { usage: { input_tokens, output_tokens }, session_id? }
* - `stream_event` — { event: { type, index?, content_block?, delta?, usage? } }
* event.type:
* content_block_start — { index, content_block: {type, id?, name?} }
* content_block_delta — { index, delta: {type, text?|thinking?|partial_json?} }
* content_block_stop — { index }
* message_delta — { usage: { output_tokens } }
* message_start — { message: { usage } }
*/
import type { AgentEvent } from './agent-backend.js';
import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
/** Convenience alias for the per-line return value. */
export type AgentEventList = AgentEvent[];
export interface StreamJsonUsage {
inputTokens?: number;
outputTokens?: number;
}
/** Per-open-content-block accumulation for tool args assembled across deltas. */
interface OpenToolBlock {
toolCallId: string;
name: string;
/** Concatenated `input_json_delta.partial_json` fragments. */
partialJson: string;
}
export interface StreamJsonState {
/** content-block index → open tool block (only `tool_use` blocks are tracked). */
toolBlocks: Map<number, OpenToolBlock>;
sessionId: string | null;
usage: StreamJsonUsage;
}
export function makeStreamJsonState(): StreamJsonState {
return { toolBlocks: new Map(), sessionId: null, usage: {} };
}
function asRecord(value: unknown): Record<string, unknown> | null {
if (value && typeof value === 'object' && !Array.isArray(value)) {
return value as Record<string, unknown>;
}
return null;
}
function asString(value: unknown): string | undefined {
return typeof value === 'string' ? value : undefined;
}
function asNumber(value: unknown): number | undefined {
return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
}
/** Pull token counts out of an Anthropic-shape `usage` object, mutating state. */
function captureUsage(usage: Record<string, unknown> | null, state: StreamJsonState): void {
if (!usage) return;
const input = asNumber(usage.input_tokens);
const output = asNumber(usage.output_tokens);
if (input !== undefined) state.usage.inputTokens = input;
// output_tokens is reported incrementally on message_delta; keep the latest.
if (output !== undefined) state.usage.outputTokens = output;
}
/** Parse the accumulated tool-arg JSON; tolerate an unparseable/partial body. */
function parseToolInput(partialJson: string): unknown {
const trimmed = partialJson.trim();
if (!trimmed) return {};
try {
return JSON.parse(trimmed);
} catch {
return { _raw: partialJson };
}
}
function toolSnapshot(block: OpenToolBlock, rawInput: unknown, status: AcpToolSnapshot['status']): AcpToolSnapshot {
return {
toolCallId: block.toolCallId,
title: block.name,
kind: null,
status,
rawInput,
};
}
/**
* Map one stream-event sub-object (the `event` field of a `stream_event` line) to
* AgentEvents, mutating `state` for open tool blocks + usage.
*/
function handleStreamEvent(event: Record<string, unknown>, state: StreamJsonState): AgentEvent[] {
const eventType = asString(event.type);
if (!eventType) return [];
switch (eventType) {
case 'content_block_start': {
const index = asNumber(event.index);
const block = asRecord(event.content_block);
if (index === undefined || !block) return [];
if (asString(block.type) !== 'tool_use') return [];
const toolCallId = asString(block.id) ?? `tool_${index}`;
const name = asString(block.name) ?? 'tool';
const open: OpenToolBlock = { toolCallId, name, partialJson: '' };
state.toolBlocks.set(index, open);
// Surface the tool start immediately (running, no args yet) so the UI shows
// the call before the args finish streaming.
return [{ type: 'tool_call', toolCall: toolSnapshot(open, {}, 'in_progress') }];
}
case 'content_block_delta': {
const index = asNumber(event.index);
const delta = asRecord(event.delta);
if (delta === null) return [];
const deltaType = asString(delta.type);
if (deltaType === 'text_delta') {
const text = asString(delta.text);
return text ? [{ type: 'text', text }] : [];
}
if (deltaType === 'thinking_delta') {
const text = asString(delta.thinking);
return text ? [{ type: 'reasoning', text }] : [];
}
if (deltaType === 'input_json_delta') {
// Accumulate tool args; no event until the block stops.
const fragment = asString(delta.partial_json);
if (index !== undefined && fragment) {
const open = state.toolBlocks.get(index);
if (open) open.partialJson += fragment;
}
return [];
}
return [];
}
case 'content_block_stop': {
const index = asNumber(event.index);
if (index === undefined) return [];
const open = state.toolBlocks.get(index);
if (!open) return [];
state.toolBlocks.delete(index);
const rawInput = parseToolInput(open.partialJson);
return [{ type: 'tool_update', toolCall: toolSnapshot(open, rawInput, 'completed') }];
}
case 'message_start': {
const message = asRecord(event.message);
captureUsage(asRecord(message?.usage), state);
return [];
}
case 'message_delta': {
captureUsage(asRecord(event.usage), state);
return [];
}
default:
return [];
}
}
/**
* Map the terminal `assistant` message (post-hoc full message) to AgentEvents. Used
* as a fallback for transports that emit only the assembled `assistant` line and no
* incremental `stream_event`s. When stream_events already streamed a block, the
* caller dedups by toolCallId, so re-emitting the assembled tool_use is harmless.
*/
function handleAssistantMessage(message: Record<string, unknown>, state: StreamJsonState): AgentEvent[] {
captureUsage(asRecord(message.usage), state);
const content = message.content;
if (!Array.isArray(content)) return [];
const out: AgentEvent[] = [];
// Iterate by content-block index so the fallback id matches the `tool_${index}`
// that handleStreamEvent derives for the same block, even when an earlier entry
// is not a record and is skipped.
for (let blockIdx = 0; blockIdx < content.length; blockIdx++) {
const block = asRecord(content[blockIdx]);
if (!block) continue;
const blockType = asString(block.type);
if (blockType === 'text') {
const text = asString(block.text);
if (text) out.push({ type: 'text', text });
} else if (blockType === 'thinking') {
const text = asString(block.thinking);
if (text) out.push({ type: 'reasoning', text });
} else if (blockType === 'tool_use') {
const toolCallId = asString(block.id) ?? `tool_${blockIdx}`;
const name = asString(block.name) ?? 'tool';
const rawInput = 'input' in block ? block.input : {};
out.push({
type: 'tool_update',
toolCall: { toolCallId, title: name, kind: null, status: 'completed', rawInput },
});
}
}
return out;
}
/**
* Pure per-line mapping. `line` is a single complete NDJSON line (no trailing
* newline required; surrounding whitespace tolerated). Returns the AgentEvents the
* line produces and mutates `state` (open tool blocks, usage, session_id). A blank,
* non-JSON, or unrecognized line yields `[]` and never throws.
*/
export function parseStreamJsonLine(line: string, state: StreamJsonState): AgentEvent[] {
const trimmed = line.trim();
if (!trimmed) return [];
let obj: Record<string, unknown> | null;
try {
const parsed: unknown = JSON.parse(trimmed);
obj = asRecord(parsed);
} catch {
return [];
}
if (!obj) return [];
const type = asString(obj.type);
switch (type) {
case 'system': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
return [];
}
case 'stream_event': {
const event = asRecord(obj.event);
return event ? handleStreamEvent(event, state) : [];
}
case 'assistant': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
const message = asRecord(obj.message);
return message ? handleAssistantMessage(message, state) : [];
}
case 'result': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
captureUsage(asRecord(obj.usage), state);
return [];
}
default:
// `user` (tool results) and any unknown line type — ignore.
return [];
}
}
export interface StreamJsonParser {
/** Feed one complete NDJSON line; returns its AgentEvents (never throws). */
push(line: string): AgentEvent[];
/** Usage (input/output tokens) accumulated so far, as a copy; final once the turn has ended. */
usage(): StreamJsonUsage;
/** Provider session id from a `system`, `assistant`, or `result` line, if seen. */
sessionId(): string | null;
}
/**
* Stateful wrapper around `parseStreamJsonLine`. Holds per-tool-block accumulation
* + usage/session_id across the turn. Line-buffering (splitting stdout on `\n` and
* holding the partial tail) is the caller's job — see `pty-dispatch.ts`.
*/
export function makeStreamJsonParser(): StreamJsonParser {
const state = makeStreamJsonState();
return {
push: (line: string) => parseStreamJsonLine(line, state),
usage: () => ({ ...state.usage }),
sessionId: () => state.sessionId,
};
}
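The doc comment above leaves line-buffering to the caller. A minimal sketch of that step follows, assuming stdout arrives as arbitrary string chunks. `makeLineBuffer` and `LineBuffer` are hypothetical names for illustration and are not taken from `pty-dispatch.ts`, whose actual implementation is not shown here.

```typescript
// Hypothetical helper (not from the diff): splits raw stdout chunks into complete
// lines and holds the unterminated tail until more data arrives.
interface LineBuffer {
  /** Feed one raw chunk; returns every line it completes (newline removed). */
  feed(chunk: string): string[];
  /** Return the held partial tail, if non-blank, and clear it. Call at end of stream. */
  flush(): string[];
}

function makeLineBuffer(): LineBuffer {
  let tail = '';
  return {
    feed(chunk: string): string[] {
      const parts = (tail + chunk).split('\n');
      // The last element is '' when the chunk ended on a newline, else a partial line.
      tail = parts.pop() ?? '';
      return parts;
    },
    flush(): string[] {
      const rest = tail;
      tail = '';
      return rest.trim() ? [rest] : [];
    },
  };
}

// A JSON line split across two chunks is only emitted once it is complete.
const buf = makeLineBuffer();
const first = buf.feed('{"type":"system","session_id":"s1"}\n{"type":"res');
const second = buf.feed('ult"}\n');
```

Each string returned by `feed` (and by `flush` at end of stream) would then be passed to `push`. A trailing `\r` from PTY output is left on the line here, which is tolerable because `parseStreamJsonLine` trims its input.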
