Compare commits

...

110 Commits

Author SHA1 Message Date
d10d79399b fix(docker): trust bind-mounted repos via git safe.directory
The container runs as root over uid-1000-owned host repos; git's dubious-
ownership guard made every project read as not-a-repo, hiding the git diff
panel's Git tab and nulling the branch indicator. Bakes safe.directory='*'
into the runtime image. Applied live to the running container too.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 08:29:33 +00:00
aeb2777ad4 docs: changelog for v2.7.14-backlog-hardening + v2.7.15-git-diff-panel
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 03:42:40 +00:00
2c58f2b3d3 Merge epic-backlog-and-gitdiff: v2.7.14 backlog hardening + v2.7.15 git diff panel
Two plans delivered via paseo-epic in an isolated worktree, audited green:
- v2.7.14: post-review backlog (external task-cancel + finalization, tool-call
  parser prune + pino logging, BooChat stall-timeout, view_session_history MCP
  tool, retire the :9502 fallback SPA).
- v2.7.15: git diff panel (Files/Git tab in the file browser with stage/commit/
  discard, server-side argv-safe git, sessionEvent-driven refresh).
2026-06-03 03:41:12 +00:00
d8bb2dabfe feat: git diff panel (Files/Git tab in the file browser)
Adds a Git tab to the right-side file panel that shows the project
repository's diff and lets the user stage, unstage, commit, and discard
whole files in-session. Two comparison modes (Uncommitted vs HEAD, and the
branch vs its base — upstream tracking branch else default branch), auto-
selected by repo state on first open and pinned after explicit choice;
per-file expand/collapse with lazy syntax-highlighted diffs, +/- stats, and
binary/large-file placeholders. All git read and write logic lives in
apps/server via a new git_diff service: argv-safe execFile only (never a
shell), per-file paths validated repo-relative through pathGuard with a
realpath symlink-escape check, server-derived commit identity (the request
carries no author fields), and the write endpoints are deliberately absent
from the assistant tool registry. Reads are bounded (30s deadline, 10MB);
an index lock or an in-progress merge/rebase/cherry-pick/bisect surfaces as
"repository busy" and disables writes. The panel stays current via a client
git_diff_refresh session event (no new wire contract) coalesced across tab
open, mutations, turn completion, and pending-change apply. Discard is an
irrecoverable hard-delete behind a plain confirm that distinguishes
reverting a tracked file from deleting an untracked one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 03:18:41 +00:00
ca028a4024 docs: add git-diff-panel implementation planning artifacts
Implementation decision log, iteration history, synthesis input, the
implementation plan, and discovery notes for the git-diff-panel feature.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 02:26:04 +00:00
3e7115afad docs: record @boocode/contracts SSOT + schema-migration learnings in CLAUDE.md
Add the packages/contracts package to the monorepo list, a consolidated
@boocode/contracts conventions bullet (subpaths, build-first, web-local strict
WsFrame union, built-dist consumption), and the `SELECT *` view / DROP COLUMN
(2BP01) schema gotcha that crash-looped boocoder.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 02:25:59 +00:00
f32fd928b3 feat: post-review backlog hardening (cancel/parser/stall/history/9502)
Five independent items from the post-review backlog. F1: Stop on an external
agent task now aborts the running child via a per-task AbortController registry
reachable from the cancel route, and finalizes the assistant message as
cancelled (fixing two latent bugs — catch blocks left the message streaming,
and warm success-paths wrote complete on an aborted turn); warm pools/worktrees
are preserved and the native path is unchanged. F2/F3: prune the tool-call
parser to its two load-bearing exports (unexport eight zero-caller symbols, add
a gate test for the <invoke>-as-text fallback) and route placeholder-rejection
logging through pino. F6: a 90s per-chunk stall-timeout wraps native inference's
fullStream via AbortSignal.any so a hung stream finalizes the message instead of
hanging — no retry (a pure classifyStreamError helper is added). F7: a read-only
view_session_history MCP tool (newest-N, chronological). F9: retire the unused
apps/coder/web :9502 fallback SPA, keeping every API/WS/health/MCP route.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:23:11 +00:00
9a139633b8 fix(coder): drop human_inbox view before dropping tasks columns
The audit-cleanup migration dropped tasks.feature_values/worktree_path, but
human_inbox is `SELECT * FROM tasks` and pins every column, so the DROP COLUMN
failed (2BP01) on any existing DB and crash-looped boocoder on boot. Drop the
view, drop the columns, then recreate it — idempotent on fresh and existing DBs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 00:29:28 +00:00
2c4ff2063d fix: reconcile audit-cleanup refactor with @boocode/contracts SSOT (worktree-risk type, frame-emitter import)
worktree-risk.ts now returns the package's WorktreeRiskReport (local RiskReport interface removed); frame-emitter.ts imports WsFrame from @boocode/contracts/ws-frames (the deleted @boocode/server/ws-frames subpath).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:25:50 +00:00
ae3f10b19d Merge remote-tracking branch 'origin/main' 2026-06-02 21:30:28 +00:00
cc4bd04aa4 Merge contracts-ssot-pkg: v2.7.13 single-source cross-app wire contracts in @boocode/contracts 2026-06-02 21:24:14 +00:00
649ce71eff feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace
package, @boocode/contracts, consumed by server/web/coder/coder-web via
workspace:* + a per-subpath exports map. The ws-frames and provider-config
Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason,
AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are
each single-sourced. Deletes the byte-identical copies and their parity
tests, fixes a live AgentSessionConfig drift (coder dead copy removed,
unified to the web required/nullable shape), removes the dead pending_change
WS arms in the fallback SPA, and inverts the build order (contracts builds
first) across root build, Dockerfile, and the coder deploy docs. Reverses
the shared-package decision declined in v2.5.12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:08 +00:00
2a05d2f9fe docs: archive shipped openspec batches; add feature/plan/research notes
Move 13 shipped openspec change docs under openspec/changes/archived/.
Add docs/features/git-diff-panel, docs/plans/post-review-backlog, and
docs/research/cross-app-contract-ssot.md (the research behind the
@boocode/contracts SSOT work). Update BOOCHAT.md, BOOCODER.md, and
boocode_roadmap.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:20:33 +00:00
8c200216eb refactor: codebase audit cleanup — dead code, dedup, module splits
Multi-agent audit + aggressive cleanup across server/web/coder/booterm,
delivered behind a DEFER discipline so none of the in-flight files were
touched. Removes dead code/deps/columns, dedups server + coder helpers,
and splits the oversized modules (tools.ts, opencode-server.ts,
sentinel-summaries, turn.ts, TerminalPane.tsx) behind stable contracts.
Adds 78 parity/unit tests (server 587, coder 323); fixes two latent bugs
(ChatPane queue keys, FileViewerOverlay blank-line parity).

Intended tag: v2.7.12-audit-cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 21:12:29 +00:00
e5ce01ae72 fix(coder): include model in WS snapshot SELECT so the attribution chip survives refresh
CoderPane hydrates from the HTTP listMessages fetch (SELECT has model) AND the WS snapshot frame, and the snapshot handler setMessages-overwrites the HTTP load. The snapshot query in apps/coder/src/routes/ws.ts had its own column list that omitted model, so on coder refresh the chip's model was lost (it showed live via the message_complete frame). One-column fix: add model to that SELECT. CLAUDE.md mapper-chain note updated to list the WS snapshot SELECT.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 18:03:10 +00:00
81470f5a77 Merge composer-chips: v2.7.10 composer attach-file button + slash-commands chip (icon-only on mobile)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:27:59 +00:00
35dba828e1 feat: composer attach-file button + slash-commands chip (icon-only on mobile)
Move the slash-commands menu out of the full-width AgentCommandsHint disclosure into a compact chip in the composer's bottom controls row, and add an attach-file button that reuses the existing drag-drop pipeline (5MB/binary gate, 10-attachment cap, chips + preview). On mobile both collapse to icon-only (count hidden). Shared ChatInput, so it applies to both BooChat and BooCoder; typed-/ autocomplete is unchanged. Removes the now-unused AgentCommandsHint component.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:26:27 +00:00
ce621bc003 Merge mcp-env-keys-batch: v2.7.9 MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor 2026-06-02 17:01:11 +00:00
afaca9e426 feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)
- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points — a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:01:03 +00:00
7ca4a6b344 chore: prune unused brand PNGs (keep banner-mascot + banner-wordmark)
Removes boo-badge / boocode-icon / boocode-wordmark / boocode-wordmark-tight —
copied from the design bundle but unreferenced; only the two banner badges are
imported (ProjectSidebar).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 23:10:12 +00:00
27f3a6c463 Merge boocode-ui-ember-coder-model: v2.7.8 Ember theme + brand banner + coder tabs + model-attribution chips 2026-06-01 22:30:58 +00:00
3a646fd6df feat: BooCode 2.0 UI — Ember theme, brand banner, coder tabs, model-attribution chips
- Ember theme (Obsidian charcoal + #ff7a18 orange), now DEFAULT_THEME_ID; server theme_id whitelist gains 'ember'
- Brand banner: transparent Westie mascot + >_BooCode wordmark, big/edge-to-edge (flood-filled to transparency + cropped)
- Coder panes are multi-tab: + opens a BooCode tab, split opens a pane (shared ChatTabBar via tabKind + createCoderTab; closeOtherTabs/tab-numbering extended to coder)
- Model-attribution: new messages.model column stamped at finalizeCompletion (BooChat/native coder) + dispatcher assistant-row creation (external coder); surfaced via view + wire types + live frame; rendered as a subtle shortened-name chip (shortenModelName)
- Composer Web toggle moved into a boxed focus-ringed input; glowing accent dot on tool rows
- Claude SDK follow-ups (1M context, follow-up-message fix, collapsed thinking/tool chips) + CLAUDE_SDK_BACKEND=1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:30:47 +00:00
7098014261 Merge pane-header-shared: v2.7.7 shared pane-header cluster + chat-resolve WorkspaceState fix 2026-06-01 14:29:00 +00:00
c56d169ef9 feat: shared PaneHeaderActions + chat-resolve WorkspaceState fix (v2.7.7)
In-flight workspace UX work.

- Extract a shared PaneHeaderActions cluster (+/Split/Reopen/History/Close)
  used by ChatTabBar + the Workspace coder/terminal pane headers, replacing the
  divergent per-header copies; SessionLandingPage history + useWorkspacePanes
  tweaks.
- Fix coder-side correctness bug: resolveChatId read sessions.workspace_panes as
  a bare WorkspacePane[] but v2.6.5 widened it to a WorkspaceState envelope, so
  it mis-read panes and clobbered tabNumbers/nextTabNumber/closedPaneStack on
  every pane-chat write. New normalizeWorkspaceState handles either shape and
  preserves the envelope (+ regression test).
- CLAUDE.md doc-sync (coder vitest suite, deploy-by-surface, dual-remote push,
  in-flight-web-WIP staging, release-branch naming).

Web tsc + coder build + coder tests green. Builds on v2.7.6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:28:49 +00:00
b7fb254e5d Merge agent-status-dot: v2.7.6 normalized external-agent status (scoped #10) 2026-06-01 14:04:26 +00:00
59cf082e06 feat: normalized external-agent status (#10 scoped) (v2.7.6)
Scoped half of boocode_code_review_v2 §1 #10 — publish the agent status
BooCoder already observes (the config-injection notify-hook is the documented
follow-on, clean-room from superset ELv2).

- agent_status_updated WS frame (working|blocked|idle|error), server+web parity.
- Published from the dispatcher's turn boundaries (warm-acp/opencode/sdk/pty:
  working at start, idle/error at end) + the permission flow (blocked/working).
  Best-effort, never breaks a turn.
- Clean-room normalizeAgentEvent helper (superset's vendor-event -> Start/blocked
  /Stop collapse, event names as facts) + 25 tests — reused by the follow-on.
- AgentComposerBar status dot (distinct from the WS-liveness dot), tracked per
  (chat,agent) by a useAgentStatus map in CoderPane.

Built by 2 parallel agents vs a pinned frame contract. Server 545 + coder 294
tests passing (25 new); web tsc + builds clean; ws-frames parity green. Clears
the actionable review backlog (#1/#3/#4/#6-#12). Builds on v2.7.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:04:04 +00:00
6fc3175730 Merge claude-sdk-backend: v2.7.5 Claude SDK backend + clean-room PostgresSessionStore 2026-06-01 13:38:05 +00:00
f3a0197d6a feat: Claude Agent SDK backend + clean-room PostgresSessionStore (v2.7.5)
Lands the lean-SDK direction (boocode_code_review_v2 §1 #9) behind a flag.
Adds @anthropic-ai/claude-agent-sdk@0.3.159 (Commercial Terms, runtime dep).

- PostgresSessionStore: clean-room impl of the SDK's real SessionStore type
  over a new claude_session_entries table. Typechecks against the SDK type;
  8 DB-integration tests.
- ClaudeSdkBackend (implements AgentBackend): one warm query() per (chat,claude)
  in streaming-input mode via a pushable async-iterable pump, sessionStore +
  resume continuity, pure mapSdkMessage->AgentEvent, session_id from init,
  usage/cost onto agent_sessions (backend CHECK gains 'claude_sdk').
- Routing env-gated by CLAUDE_SDK_BACKEND (default off) -> PTY path UNCHANGED.
- Built against real SDK 0.3.159 types (install paid off: partial=stream_event
  needing includePartialMessages, MessageParam, result error arm).
- Fix latent test-infra deadlock: serialize DB suites (fileParallelism:false).

Coder 269 passing default / 290 with DB; tsc clean vs SDK types; builds clean.
LIVE pump + resume + actual claude turn need a host smoke (CLAUDE_SDK_BACKEND=1
+ claude binary + auth). zod peer-dep wants ^4 (workspace 3.25). Builds on v2.7.4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:37:57 +00:00
7e0ecde83d Merge mistake-tracker-ledger: v2.7.4 heterogeneous-failure recovery + file-read ledger 2026-06-01 13:05:19 +00:00
bcc89d8adc feat: MistakeTracker + file-provenance ledger (v2.7.4)
Two native-inference hardening features from boocode_code_review_v2 §1 #12.

MistakeTracker: new pure mistake-tracker.ts tracks consecutive heterogeneous
tool failures (kinds surfaced per tool from tool-phase.ts). On 3 in a row the
turn loop soft-nudges (model-facing recovery guidance + mistake_recovery
sentinel + reset), then escalates to stopping the turn (cap-hit-style, Continue
affordance) on a re-trip. Complements doom-loop (identical repeats) + cap-hit.

File-provenance ledger: compaction.ts derives a deterministic ## Files Read list
from the head messages' read-tool calls and injects it into the rolling-summary
prompt so provenance survives compaction (no new table; read-only).

mistake_recovery sentinel: MessageMetadata arm (server + web) + MessageBubble
render branch. Built by 2 parallel agents. Server 545 tests passing (23 new);
build + web tsc clean. Native-inference only. Builds on v2.7.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:05:03 +00:00
f53d6a8afd Merge sampling-knobs-streamjson: v2.7.3 sampling knobs + live PTY stream-json + token UI 2026-06-01 12:47:31 +00:00
a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00
5651f56039 Merge checkpoint-idor-fix: v2.7.2 close 2 checkpoint IDOR holes 2026-06-01 12:16:08 +00:00
9c7d80e2d8 fix(security): scope checkpoint routes to session — close 2 IDORs (v2.7.2)
Flagged by the automated push security review on v2.7.1.

- GET /checkpoints?chat_id= : the chat_id branch filtered by chat_id alone
  (any session's chat_id read its checkpoints). Now joins chats and gates on
  chats.session_id.
- restoreCheckpoint scope guard was fail-open: `cp.session_id && cp.session_id
  !== sessionId` fell through on a null denormalized session_id, allowing a
  cross-session restore (worktree reset + transcript trim). Now resolves the
  owning session via the checkpoint's chat and denies on missing/mismatch.
- Adds a DB-integration regression for the null-session_id cross-session case.

Both scope authoritatively through chats.session_id (checkpoints.session_id is
a nullable hint). Coder suite 234 passing; 7/7 checkpoint tests (incl. the
regression) against live postgres+git; typecheck clean. Hotfix on v2.7.1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:15:54 +00:00
a41a02a62b Merge fuzzy-checkpoints: v2.7.1 write/edit robustness (fuzzy applier + worktree checkpoints) 2026-06-01 12:02:06 +00:00
59f07e8cb8 feat: write/edit robustness — fuzzy patch applier + worktree checkpoints (v2.7.1)
#3 Fuzzy patch applier: new pure fuzzy-match.ts (locateMatch, exact→trim→
unicode-canon→Levenshtein≥0.66, refuse-on-ambiguous) wired into pending_changes
applyOne/rewindOne so local-model whitespace/unicode drift in old_string no
longer loses the edit.

#4 Worktree checkpoint + conversation-trim: checkpoints table + checkpoints.ts
(shadow-commit of tracked+untracked into refs/boocode/checkpoints, hooked into
the 3 external-agent dispatcher paths) + POST restore route (reset --hard +
clean -fd -> transcript trim -> backend-session reset) + "Restore to here" UI.

Built by 3 parallel agents; DB-integration testing caught a created_at
self-deletion bug. Coder suite 234 passing; server+coder build + web tsc clean.
Builds on v2.7.0-mit. openspec write-edit-robustness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:01:57 +00:00
1108d07fb2 Merge relicense-agpl-to-mit: v2.7.0 AGPL-3.0 → MIT relicense 2026-06-01 08:16:25 +00:00
a8bfde8f8d feat: relicense AGPL-3.0 → MIT (v2.7.0)
Clear the 3 Unsloth-Studio-derived AGPL files and flip LICENSE + 5
package.json from AGPL-3.0-only to MIT.

- html-to-md.ts → MIT node-html-markdown (parse5 dropped)
- llama-args-validator.ts → clean-room (flag denylist = facts)
- tool-call-parser.ts → delete dead Unsloth-ported code; keep
  extractToolCallBlocks/stripToolMarkup byte-identical (no behavior change)
- LICENSE → MIT (Copyright (c) 2026 indifferentketchup); 5 package.json → MIT;
  AGPL SPDX headers removed; README License section; license-mit guard test
- roadmap License-debt batch marked shipped; openspec/changes/license-debt-mit

Decouples the relicense from the native-parsing retirement (the ported parser
was dead code). Server suite 519 passing; build + coder typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 08:16:03 +00:00
9c1ddcaa7c Merge v2611-followups: v2.6.11 apps/server close-hook caller + DiffPanel staging hint 2026-06-01 02:35:21 +00:00
217f487395 docs(changelog): v2.6.11-close-hooks-staging (closes the v2.6 openspec)
CHANGELOG + roadmap (through v2.6.11) + openspec v2-6 Phase 3 fully closed (3.7 + apps/server close-hook caller done).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 02:35:21 +00:00
2dfbef4c41 feat: v2.6 follow-ups — apps/server close-hook caller + DiffPanel staging hint (3.7)
apps/server fire-and-forgets BooCoder's Phase-3 close hooks (new coder-notify.ts, reuses BOOCODER_URL, never-rejects) on session-delete + chat archive/archive-all/delete, so warm backends + worktrees tear down immediately (idle-evict/reaper was the backstop). 3.7: BooCoder DiffPanel shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits (pure derivation from per-change agent + current provider, no new state). 6 new server tests (coder-notify); 537 server tests pass; web+server tsc/build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 02:35:11 +00:00
c7a8128059 Merge phase3-lifecycle: v2.6.10 lifecycle hardening (completes v2.6 persistent agent sessions) 2026-06-01 01:10:16 +00:00
986c8a83a9 docs(changelog): v2.6.10-lifecycle-hardening (completes v2.6)
CHANGELOG + roadmap (through v2.6.10; v2.6 marked complete) + openspec v2-6 Phase 3 checked off (3.1-3.6; 3.7 frontend + apps/server caller as follow-ups).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 01:10:16 +00:00
aa3797e356 feat(coder): v2.6 Phase 3 — lifecycle hardening (idle evict, crash recovery, worktree reaper)
Idle TTL eviction per (chat,agent) + LRU cap (never a busy backend); pure lifecycle-decisions.ts (TDD). Crash recovery lifts openchamber's health-monitor + busy-aware-restart + stale-grace state machine into opencode-server.ts (+ port reclaim) and warm-acp.ts; opencode crash -> fresh sessions, ACP -> re-session/new. F.1 turn-guard + U.6 usage preserved (their tests pass). Orphan worktree reaper (1h grace, superset-style dirty/unpushed preflight, Paseo soft-delete) + close hooks + diff re-baseline after apply_pending. 35 new tests + DB-opt-in reconnect test; 215 coder tests pass; tsc + build clean. Completes v2.6. Follow-ups out of scope: apps/server close-hook caller, 3.7 DiffPanel staging hint, live smokes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 01:10:09 +00:00
850d48853f Merge phase2-warm-acp: v2.6.9 warm ACP backend for goose/qwen 2026-05-31 23:57:14 +00:00
f619ae0978 docs(changelog): v2.6.9-warm-acp
CHANGELOG + roadmap (through v2.6.9) + openspec v2-6 Phase 2 checked off (2.1-2.4; Smoke 2/2b pending live).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:57:09 +00:00
0d3d08f5f2 feat(coder): v2.6 Phase 2 — warm ACP backend for goose/qwen
WarmAcpBackend (AgentBackend) holds one persistent goose acp / qwen --acp child + ClientSideConnection + ACP session per (chat,agent); initialize+session/new once, reused across turns. Abort = session/cancel the prompt only (never kills the child); child exit -> agent_sessions.status='crashed' -> re-spawn next turn. Dispatcher routes goose/qwen chat-tab tasks to the pooled warm backend via pure shouldUseWarmBackend (needs session_id+chat_id); one-shot runExternalAgent kept as fallback for arena/MCP/new_task. handleSessionUpdate extracted to a shared pure acp-event-map.ts (one-shot path byte-identical). SDK: installed @agentclientprotocol/sdk@^0.22.1 has stable resumeSession/loadSession; resume moot in the warm hot path, deferred to Phase 3. 15 new tests (warm-acp-routing, acp-event-map); 180 coder tests pass; tsc + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:57:03 +00:00
0658d19b64 Merge phase1-ux: v2.6.8 agent attribution (DiffPanel badges + composer chip + agent-sessions route + opencode usage) 2026-05-31 22:07:39 +00:00
631af5dd4c docs(changelog): v2.6.8-agent-attribution
CHANGELOG + roadmap shipped record (through v2.6.8) + openspec v2-6 Phase 1-UX checked off (U.1-U.6; Smoke U pending the frontend Docker rebuild).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:32 +00:00
5db6551361 feat(web): Phase 1-UX frontend — DiffPanel agent badges + resumed/new-session chip
DiffPanel renders a per-row agent badge (icon+label; null -> 'manual') + a 'Changes from X, Y' note when the pending set spans >1 agent. AgentComposerBar gains an optional sessionId prop -> resumed/history/new-session chip beside the Provider picker (gated, so BooChat callers are unchanged), driven by a new useAgentSessions hook (refetch on message-complete). providerIcon extracted to shared components/coder/providerIcons.tsx; api.coder gains agentSessions(sessionId); PendingChange type gains agent. web tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:26 +00:00
c060778258 feat(coder): Phase 1-UX backend — agent attribution + agent-sessions route + opencode usage
pending_changes.agent stamped at every queue site (native -> 'boocode', dispatched external -> task.agent, manual RightRail -> NULL) + flows through listPending. New GET /api/sessions/:id/agent-sessions -> [{agent,status,has_session,last_active_at}] per (chat,agent). opencode warm server consumes session.next.step.ended, accumulating input_tokens/output_tokens/cost onto agent_sessions (new idempotent columns) via a pure opencode-usage.ts mapper. Tests: agent-sessions.routes (3) + opencode-usage (6); tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:14 +00:00
48c1d70baf Merge f1-interrupt-guard: F.1 opencode post-interrupt stale-terminal guard + doc reconciliation (v2.6.7) 2026-05-31 21:32:25 +00:00
457010391a docs(changelog): v2.6.7-interrupt-guard + reconcile roadmap/review/openspec
CHANGELOG entry for v2.6.7. Plus the session's doc reconciliation: roadmap shipped record synced through v2.6.7 (v2.3 lifecycle marked shipped, relicense AGPL->MIT batch, fork-sweep lift items, claude-agent-sdk SessionStore, ACP package fix); boocode_code_review_v2 (two fork sweeps, relicense decision = 3 AGPL files, jinja gate green); openspec v2-3 reconciled to shipped (v2.5.4-v2.5.13); openspec v2-6 Phase 0/1 + P1.5 shipped, F.1 done, remaining-phase plan + lift sources.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:31:47 +00:00
372651bcb1 fix(coder): F.1 post-interrupt stale-terminal guard (opencode warm server)
opencode emits one trailing session.idle/error for a turn cancelled via client.session.abort(), carrying only a sessionID (no turn id). The warm-server backend settled activeTurn on that event, so after Stop + an immediate new message the orphan idle settled the NEXT turn early as success (one-click reachable since v2.6.5's Send->Stop composer).

Adds a pure per-session guard (backends/turn-guard.ts: armAbortGuard / noteTurnActivity / consumeTerminal over swallowNextTerminal) wired into opencode-server.ts: abort arms it, the next terminal is swallowed once, and a new turn's first delta self-heals so a never-arriving orphan can't strand a real turn. Test-first; 3 regression tests in turn-guard.test.ts. Paseo parallel: 1d38aac.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:31:35 +00:00
d66948c925 docs(changelog): v2.6.6-claude-md
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:44:33 +00:00
58d0c0f132 docs(claude-md): v2.6.5 session learnings
Capture four recurring gotchas from the panes/tabs/composer batch: the
workspace_panes WorkspaceState envelope (+ legacy-array migration on hydrate and
the union-accepting server PATCH validator); the optional ToolExecCtx
({ sql, sessionId }) 4th arg on ToolDef.execute for DB/session-aware tools
(read_tab_by_number reference); the two-schema-files-one-DB ownership split
(apps/coder owns agent_sessions/worktrees/pending_changes/available_agents) plus
the idempotent confdeltype FK-action-flip pattern; and that React StrictMode is
on, so a setState called inside another setState's updater double-fires in dev.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:44:20 +00:00
7b4f41b26f docs: roadmap shipping-state update + external code-review v2 findings
Update boocode_roadmap.md's shipped section through v2.6.4 (provider lifecycle,
persistent agent sessions, cursor/copilot retirement) and add
boocode_code_review_v2.md — a point-in-time external-fork lift/cross-check
findings doc (Paseo + opencode + llama.cpp + the second fork sweep), companion
to the standing boocode_code_review.md inventory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:28:13 +00:00
5527e7a5e8 docs(changelog): v2.6.5-panes-tabs-composer
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:46 +00:00
08d6a8fa40 feat(web): morphing send/stop/queue composer button
The composer's primary button now reflects generation state: Send when idle,
Stop while generating with an empty draft, and Queue while generating with a
draft typed (submitting queues it via the existing queue path). Stop is
click-only so a stray Enter never interrupts a run. ChatInput gains generating
+ onStop props.

BooChat: removes the separate centered "Stop generating" pill and wires
generating={streaming} + onStop={handleStop}. BooCoder: generating now keys on
sending || activeTaskId (the dispatch POST is too brief on its own), which also
fixes the queue gates that previously fired mid-run; onStop cancels the active
task via the new api.coder.cancelTask, and the input is no longer disabled while
a task runs so follow-ups can be queued.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:14 +00:00
2fd7e5bf97 feat(web): workspace panes & tabs overhaul
A cohesive batch of pane/tab UX + the persisted workspace-state model (grouped
because the changes interleave across useWorkspacePanes, ChatTabBar, Workspace,
sessionEvents and the api types/client):

- Open a whole chat in a fresh pane via a new open_chat_in_new_pane event:
  ChatTabBar tab context menu "Open in new pane", and MessageBubble.fork() now
  lands the fork beside the original instead of replacing the active pane.
  openChatInNewPane detaches the chat from any pane already holding it
  (one-chat-per-pane).
- The tab-bar "+" becomes a New BooChat/BooTerm/BooCode menu (chat as a tab,
  term/coder as split panes); the split button is unchanged.
- Drop the per-message "Open in pane" button (it opened a single message's
  artifact) and its dead code; the artifact-pane machinery is left orphaned for
  a later teardown.
- Session history: the empty/landing pane lists the session's open chats plus
  archived chats (fetched separately), click to open / restore-and-open.
- Relocate-on-close: closing a chat pane moves its tabs (in order) into the
  oldest chat/empty pane instead of discarding them; terminal/coder panes close
  as before. Reopen strips the restored chatIds from all live panes first, so a
  relocated-then-reopened pane never duplicates a tab — no stack-shape change.
- Stable global tab numbering: tabNumbers/nextTabNumber assigned on chat-pane
  open, retired on close (never reused), rendered map-keyed (not positional).
- workspace_panes is now a WorkspaceState envelope { panes, tabNumbers,
  nextTabNumber, closedPaneStack }; the reopen stack moved from a module-level
  array into the persisted envelope so it survives reload. Hydrate/persist
  normalize the legacy bare-array shape. appendClosed dedupes a value-identical
  top entry to neutralize the StrictMode double-invoke of the setPanes updater.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:15:03 +00:00
d05f73be26 feat(server): workspace_panes envelope + read_tab_by_number tool
Widen the sessions.workspace_panes JSONB from a bare WorkspacePane[] to a
WorkspaceState envelope { panes, tabNumbers, nextTabNumber, closedPaneStack }.
The PATCH validator accepts either the legacy array or the envelope (zod union)
and normalizes to a full envelope before storing, so existing array-shaped rows
migrate transparently on next write. The session_workspace_updated WS frame
schema is widened to match (kept byte-identical to the web copy; parity test
passes).

Adds read_tab_by_number, a read-only tool that resolves a session-scoped tab
number to its chat via the persisted tabNumbers map and returns that chat's
transcript (oldest-first, sentinels skipped, capped at 20k chars). Tools gain an
optional ToolExecCtx ({ sql, sessionId }) 4th param on ToolDef.execute, threaded
through executeToolCall from executeToolPhase; the param is optional so existing
filesystem tools and the apps/coder consumer stay compatible. Registered in
ALL_TOOLS + READ_ONLY_TOOL_NAMES.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:14:42 +00:00
e857815d79 feat(web): paste chips trail the typed message text
flattenToMessage now places the typed text first and appends pasted-chip
content after it with a single leading space (file/line chips remain fenced
provenance blocks after that), instead of prepending all attachments. A
leading slash command therefore stays first and the paste reads as its
continuation — `/command <pasted>` rather than `<pasted>` then the command.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:13:40 +00:00
12d31a81a0 docs(changelog): v2.6.4-agent-sessions-fk
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:47:40 +00:00
5da6eb2447 docs(claude-md): sync v2.6 engineering notes (P1.5-a/b, skills, AGENTS.md parsing)
Reflect shipped v2.6.1–v2.6.3 work in the deep reference. The opencode SSE
bullet now describes per-session SSE (P1.5-a) instead of the single-stream
Phase-1 limit; the agent_sessions resume bullet describes the (chat_id, agent)
re-key (P1.5-b) — chat_id CASCADEs from chats, session_id/worktree_id are
informational SET NULL, and the worktrees table supersedes the defanged
session_worktrees. Drop the stale root AGENTS.md navigation pointer (removed
in v1.12; data/AGENTS.md is the registry, not navigation). Add two
conventions: data/AGENTS.md is parsed (## headings need a --- fence, no
free-form rule sections) and the data/skills/<vendor>/ layout with the
boocode/ namespace.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:47:16 +00:00
7f6c4780e2 fix(coder): converge agent_sessions.session_id FK to SET NULL (P1.5-b follow-up)
The P1.5-b re-key block (cb1846c) re-adds session_id_fkey as ON DELETE
SET NULL, but the whole block is guarded on chat_id_fkey's absence. A DB
already re-keyed to (chat_id, agent) while session_id_fkey was still
ON DELETE CASCADE never re-enters that block, so applySchema leaves it at
'c' forever — diverging from the schema's stated intent, from worktree_id
(already SET NULL), and from the v2.6.3 changelog's own claim that
session_id is informational SET NULL.

Add a standalone confdeltype-guarded block (mirroring the session_worktrees
defang) that flips session_id_fkey CASCADE -> SET NULL independently of the
re-key gate. Idempotent: fires only while the FK is still 'c' — a no-op on a
fresh deploy (already 'n' from the re-key block) and on every re-run. The
live DB was converged by hand with the identical statements; \d
agent_sessions now shows session_id ... ON DELETE SET NULL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:46:41 +00:00
30b6f70f95 docs(changelog): v2.6.3-chatkey-and-skills
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:06:19 +00:00
c2b3e0a013 skills: committing-changes + using-worktrees judgment skills + AGENTS.md guidance
Two portable agent-judgment skills in data/skills/boocode/, externalizing when/how Opus commits and when it isolates work in a worktree, so weaker agents (opencode build agent, BooCoder) can approximate it. committing-changes: segment by concern, stage explicitly (never git add -A), draft scope-prefix messages, present-and-STOP — commit only on explicit command, never push, identity indifferentketchup. using-worktrees: the when-to-isolate heuristic (just-create-when-clear / propose-when-ambiguous / skip), stable-base mechanics, runtime-isolation caveat — deliberately autonomous vs committing's command-gate. Each has an eval.yaml (matching improving-boocode-guidance) with a negative-trigger task. AGENTS.md gets a parser-safe preamble (the registry throws on bare ## headings) pointing at both skills.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:48 +00:00
cb1846c0d5 feat(coder): re-key agent_sessions to (chat_id, agent) + worktrees table (P1.5-b)
The tab (a chat) is the context unit: two opencode tabs in one session are two independent agent contexts sharing one worktree. agent_sessions re-keys from (session_id, agent) to (chat_id, agent) — chat_id FK ON DELETE CASCADE (closing a tab ends its context); worktree_id and session_id become informational SET NULL columns. New worktrees table (one-per-session, survives session delete via session_id SET NULL) supersedes session_worktrees, which is defanged (CASCADE dropped) not yet removed. chat_id is threaded end-to-end: tasks.chat_id added, written by the coder message + skills routes from the frontend tab, read by runOpenCodeServerTask which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic) so ensureSession never gets a null key. Idempotent migration with a backfill-verify gate (0-row assertion after the test session was deleted). config_hash fingerprint logic preserved; one-worktree-per-session unchanged; runExternalAgent untouched. Column rename worktree_path -> path repointed at all five readers (server delete-guard, risk/stash endpoints, ensureSessionWorktree). Supersedes the earlier (worktree_id) draft.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:35 +00:00
f1a85627e4 fix(coder): strip dcp-message-id tags split across stream chunks
The dcp tag (<dcp-message-id>mNNNN</dcp-message-id>) is streamed token-by-token, so it arrives split across SSE deltas. The existing per-chunk stripDcpTags never sees a complete tag in any single fragment, so fragments pass through and the dispatcher reassembles the tag in textChunks (persisted + shown) — and the terminal message.part.updated path that would strip the full text is suppressed by the dedup gate. Add a stateful cross-chunk stripper (dcp-strip.ts: makeDcpStreamStripper) at the dispatcher's opencode frame boundary: it emits text that cannot be part of a forming tag, holds back only a trailing partial-tag prefix (without swallowing legitimate <…> content), and flushes at turn end. Fixes both live delta frames and persisted content. 11 unit tests incl. split-at-every-boundary and the documented per-chunk-fails case. opencode path only; ACP (goose/qwen/claude) untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 23:16:47 +00:00
c65daba5dd docs(changelog): v2.6.2-delete-guard-and-sse
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:24:25 +00:00
c9e302da37 fix(coder): no-upstream branch alone no longer flags a session at-risk
Session worktree branches (session-<id>) never get an upstream, so the original atRisk rule (unpushed !== 0) flagged every worktree-backed session as at-risk on delete — even pristine ones — forcing a Stash/Force confirm on each. Gate the unpushed arm behind hasUpstream (unpushed !== -1) so the no-upstream sentinel can't trigger it: atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0). No protection is lost — any genuinely unsafe local commit also shows as unmerged > 0 — and the unpushed > 0 arm stays correct for P1.5's pushable worktree branches. unpushed is still reported (-1 = local-only) as informational. Follow-up to 3a26563.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:19:53 +00:00
f69ea5f494 feat(coder): per-session SSE subscriptions (P1.5-a concurrency prereq)
Replace the single global SSE loop (scoped to the most-recently-used worktree directory) with one subscription per live opencode session, each scoped to that session's worktree dir. Two sessions in different worktrees now stream concurrently instead of the second silently dropping the first's events. Each session owns an AbortController (SessionState.sseAbort) wired into subscribe(..., {signal}); the loop reconnects, reconciles (per-session), and is torn down on closeSession/dispose by aborting the signal — which also fixes a latent Phase-1 bug where switching directories left the old runEventLoop parked forever in its for-await (zombie loops). A sessionID demux guard (eventSessionId) drops events that aren't this loop's own, so two sessions sharing a worktree (possible after P1.5-b) don't double-process each other's deltas. Removed sseRunning/sseDirectory/startEventLoop/runEventLoop/reconcileInFlight and the 'SSE directory changed' collision warning. dispatchEvent/handleUpdatedPart (translation, dedup, dcp-strip) and the watchdog are unchanged — only the subscription topology changed. SDK confirmed: @opencode-ai/sdk Event.subscribe opens an independent SSE connection per call, so N concurrent dir-scoped streams are supported. No schema/dispatcher/frontend changes; runExternalAgent untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:15:55 +00:00
3a26563be2 feat(coder): guard session delete against worktree work loss
Deleting a BooChat session CASCADE-wipes its session_worktrees row, which would silently orphan uncommitted/unpushed/unmerged work in the worktree. Add a pre-DELETE gate: the server reads session_worktrees from the shared DB first (no row = chat-only session = delete immediately, zero round-trip), and for worktree-backed sessions calls a new BooCoder endpoint that runs git on the host (only the host systemd service can see /tmp/booworktrees). checkWorktreeWorkAtRisk reports dirty/unpushed/unmerged via the audited hostExec+shellEscape path; default branch is detected from refs/remotes/origin/HEAD (not the worktree's own branch), never hardcoded. Any at-risk worktree returns 409 with per-worktree RiskReport[]; force=true bypasses the check entirely. Fail-closed: coder unreachable/errored also blocks (force still escapes). The sidebar renders a block dialog distinguishing work-at-risk (Commit/Stash/Force) from couldn't-verify (Cancel/Force only); stash uses -u and re-blocks on remaining commits with an explanatory message. Commit never auto-commits — it routes the user to the session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:01:25 +00:00
937920df06 docs(changelog): v2.6.1-phase1-opencode
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:42:39 +00:00
e05469c6ae docs(claude): v2.6 Phase 1 opencode learnings — SSE, model resolution, resume
- opencode is now a warm HTTP server (was "planned, unshipped").
- SSE: session.next.* event types + subscribe({directory}) requirement.
- Model strings need llama-swap/ prefix + presence in opencode.json.
- config_hash excludes ephemeral port; session FKs are ON DELETE CASCADE.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:40:16 +00:00
0e026be5f8 fix(coder): CASCADE delete on session_worktrees + agent_sessions FKs
Deleting a session with linked session_worktrees or agent_sessions rows
threw a FK violation (500 on DELETE /api/sessions/:id). Both FKs now
ON DELETE CASCADE. Idempotent migration: drops the old constraint and
re-adds with CASCADE only if confdeltype != 'c'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:26:28 +00:00
315cdd23e2 feat: strip dcp-message-id tags from opencode output + reopen closed panes
Two independent fixes:

- opencode-server.ts: stripDcpTags() removes <dcp-message-id>…</dcp-message-id>
  tags from text deltas before they reach the frame/DB. Applied to all three
  text paths (session.next.text.delta, message.part.delta text field,
  handleUpdatedPart text type). Reasoning/tool paths untouched.
- useWorkspacePanes.ts: module-level closedPaneStack (capped at 10) captures
  pane kind + chatIds on removePane and removeTab auto-remove. reopenPane()
  pops the stack and re-attaches a new pane to the existing chat ids (chats
  survive pane close server-side). hasClosedPanes drives conditional render.
- ChatTabBar.tsx: [+] is now instant new-tab (no dropdown); split-pane
  dropdown (Columns2 icon) opens Chat/Term/Code in a new pane; reopen button
  (RotateCcw icon) appears when closed panes exist.
- Workspace.tsx: pass reopenPane + hasClosedPanes through to ChatTabBar.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 21:26:07 +00:00
6d24726c3a feat: add systematic-debugging slash command for BooChat + BooCoder
/data/skills/boocode/systematic-debugging/SKILL.md — guided root-cause
debugging methodology (investigate before fixing). Available as
/systematic-debugging in both BooChat and BooCoder slash menus via the
shared /api/skills endpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:51 +00:00
1bbeaf95c7 fix: auto-name uses session model + pane auto-remove on last tab close
Two independent UI/UX fixes:

- auto_name.ts: pass the session's own model as fallbackModel to
  taskModelCompletion, so chat rename uses whatever model is already
  loaded on llama-swap instead of forcing a swap to DEFAULT_MODEL
  (which times out at 10s when a different model is active).
- useWorkspacePanes.ts: when the last tab in a pane is closed and
  other panes exist, remove the pane entirely instead of leaving an
  orphaned empty panel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:38 +00:00
e30a9e8b23 feat(coder): v2.6 Phase 1 — OpenCode warm server backend
Persistent multi-turn opencode backend: one `opencode serve` HTTP server per
BooCoder process, one opencode session per BooCode session (resumed on
switch-back), single SSE read loop demuxed by session id.

- backends/opencode-server.ts: AgentBackend implementation — spawn with
  waitForReady, session.next.* SSE event translation (text/reasoning/tool
  deltas), Paseo-ported reasoning dedup (streamedPartKeys), promptAsync
  fire-and-forget settled by session.idle, per-turn inactivity watchdog
  (180s) + reconnect reconciliation via session.messages, stale-session
  guard (crashed-not-resumed + config_hash fingerprint on model).
- dispatcher.ts: opencode routes to pool backend (ensureSession→prompt);
  per-session concurrency Map replaces global running boolean (1.9);
  model coalesce (empty→DEFAULT_MODEL) + llama-swap/ prefix for opencode;
  diff-supersede (DELETE+INSERT pending_changes by session, stamp agent).
- worktrees.ts: ensureSessionWorktree (session-keyed, captures base_commit,
  persists to session_worktrees); diffWorktree gains optional baseRef.
- agent-probe.ts: mergeLlamaSwap branch fetches /v1/models, prefixes with
  llama-swap/, populates opencode's available_agents.models (was 0).
- provider-snapshot.ts: export fetchLlamaSwapModels for probe reuse.
- schema.sql: session_worktrees + agent_sessions tables (Phase 0) +
  config_hash column on agent_sessions, pending_changes.agent column.
- package.json: @opencode-ai/sdk ~1.15.0 (resolved 1.15.12).

Known Phase 1 limitation: single SSE stream scoped to most-recent session's
directory; concurrent opencode sessions in different worktrees collide
(warning logged, watchdog prevents hang). Phase 2 moves to per-session SSE.

Smoke 1 verified: two turns in one session, both produce real tokens, same
agent_session_id reused, same server port, turn 2 is 9x faster (no spawn).
goose/qwen/claude paths untouched (runExternalAgent md5 identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:11 +00:00
140ff26204 feat(coder): v2.6 Phase 0 — AgentBackend foundations (no behavior change)
Schema, interface, and service scaffold for v2.6 persistent agent sessions.
Nothing in this batch alters runtime behavior.

- schema.sql: add session_worktrees (one shared worktree per session, FK
  sessions(id)) and agent_sessions (one backend session per (session, agent),
  with backend/status CHECKs); add pending_changes.agent column for DiffPanel
  attribution. All three statements idempotent (IF NOT EXISTS).
- services/agent-backend.ts: AgentBackend interface + AgentSessionHandle,
  EnsureSessionOpts, PromptCtx, TurnResult, and the normalized transport-agnostic
  AgentEvent union (text/reasoning/tool_call/tool_update/commands). Types only.
- services/agent-pool.ts: lazy get-or-create AgentPool keyed by
  `${sessionId}:${agent}` + shared `agentPool` singleton. Empty in Phase 0.
- index.ts: widen onClose to await dispatcher.stop() then agentPool.dispose()
  (pool empty, so dispose() is inert).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 02:50:17 +00:00
a97293b5d9 Merge coder-hardening: acp-client-fs path-guard fix + untrack live provider config 2026-05-29 22:23:20 +00:00
63adb218e6 chore(coder): untrack live coder-providers.json, ship example
The live config is read AND written by the coder (UI provider toggles PATCH it),
so tracking it churned `git status`. Untrack it (now gitignored under data/*),
add a tracked data/coder-providers.example.json reference, and update the
.gitignore exception + CLAUDE.md/BOOCODER.md docs. Loader already falls back to
{providers:{}} (built-ins only) when the live file is absent. + CHANGELOG v2.5.15.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 22:23:13 +00:00
d0334ca544 fix(coder): separator-bounded worktree path guard in acp-client-fs
The ACP fs bridge's worktree guard used an unbounded `startsWith(resolve(
worktreePath))`, so a sibling path sharing the worktree as a string prefix
(`<worktree>-evil/...`) escaped the scope. Since writeWorktreeTextFile hits disk
directly (no pending_changes gate), a confused/buggy ACP agent could write
outside its worktree. Now uses a separator-bounded check matching write_guard.ts
(resolve() + `startsWith(root + sep)` / `=== root`) via a shared resolveInWorktree,
with a regression test (../ traversal + the sibling-prefix bug). Symlink-swap
hardening intentionally skipped — consistent with write_guard's no-realpath
stance; the agent runs with host FS access so this is a containment guard, not a
trust boundary. Flagged by the automated push security review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 22:22:51 +00:00
024ffc0b92 Merge claude-md-learnings: session learnings + CHANGELOG v2.5.14 2026-05-29 21:24:18 +00:00
691eef1b30 docs(claude): session learnings — provider lifecycle, deploy + mobile gotchas
Adds to CLAUDE.md: stale boocoder-restart symptom after build (new routes 404 /
old routes 200); boocode container build: . deploys the working tree, web
dev≠prod until container rebuild; PATCH provider-config replaces override
wholesale (send full override) + coder-providers.json is live config (don't
commit drift); external agents one-shot with no ctx tracking + OpenCode-as-server
is unshipped v2.6; ui/ primitive inventory + button-role=switch / Dialog
fallbacks; mobile Dialog scroll containment. Also backfills uncommitted doc
bullets for the v2.5.7–v2.5.11 coder work. CHANGELOG v2.5.14 entry. Docs only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 21:24:10 +00:00
e92c51578d Merge v2.3-provider-lifecycle-phase5: provider settings UI + closeout
Phase 5 (Settings → Providers tab, picker filter, ACP catalog) + mobile settings
fix + Phase 6 docs. Completes the v2.3 provider-lifecycle batch
(phases 1–4: v2.5.4 / v2.5.5 / v2.5.6 / v2.5.12).
2026-05-29 20:20:38 +00:00
6d03690a65 docs: v2.3 provider-lifecycle closeout (Phase 6)
BOOCODER.md gains a Provider lifecycle section (config file + schema,
gitignored-with-exception, the 24h PROVIDER_PROBE_TTL_MS refresh contract,
enable/disable via Settings → Providers, custom-ACP add, native boocode
always-on, the honest subset-refresh known limitation, deploy + smoke).
docs/DEFERRED-WORK.md §2 (cold-probe skip) marked ADDRESSED with the still-
deferred Tier-2 follow-ups listed. CHANGELOG gets the v2.5.13 batch-closeout
entry. Docs only — no code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:31 +00:00
21384cce5b web: fix Settings pane unreachable on mobile (push ?pane= atomically)
Opening the settings pane on mobile set activePaneIdx, but the ?pane= URL-sync
effect snapped it back to the chat pane on the panes change, so the pane never
showed. toggleSettingsPane now returns the new pane id (id generated outside the
updater, strict-mode safe); Session's toggleSettingsAndSync pushes ?pane=<id> on
mobile when opening (and drops it on close) so the sync effect keeps it active —
mirrors the existing addPaneAndSwitch pattern. Desktop unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:24 +00:00
920f8b75a6 web(coder): provider settings UI — Settings → Providers tab, picker filter, ACP catalog
v2.3 Phase 5. Provider management lives in Settings → Providers: lists every
registered provider with a status badge, enable/disable toggle (sends the full
override so a custom ACP entry's command survives the wholesale-replace PATCH),
per-provider refresh, and a plaintext diagnostic. The composer provider picker
now filters to enabled && (status==='ready' || 'loading') — disabled/unavailable
providers leave the picker and are managed only in settings; native boocode
always shows. Adds a curated ACP catalog + AddProviderModal (PATCH config then
subset refresh; the modal caps to the viewport with a single overscroll-contain
scroll region). Loading state uses a capped client poll (no WS frame).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 20:20:18 +00:00
e83d9b7d5b docs(changelog): v2.5.12-provider-lifecycle-phase4
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 17:48:28 +00:00
f302969c71 coder(providers): v2.3 provider-lifecycle phase 4 — config HTTP API (diagnostic returns JSON)
GET/PATCH /api/providers/config, subset POST /refresh, and
GET /api/providers/:id/diagnostic (JSON { diagnostic }, §6.4). PATCH order
is validate→save→reload→clear; a malformed body or invalid merged config
returns 422 without writing, and a save failure returns 500 without
reloading (no file/registry divergence). Web client + types extended.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 17:46:56 +00:00
2d997ecb6c web+coder: discover Claude's enabled commands + plugin skills; icon-split commands vs skills
claude is PTY (no ACP discovery), so claude-command-discovery.ts reads its enabled set from disk (user-global): ~/.claude/commands/*.md + every enabled plugin's skills/<name>/SKILL.md (kind=skill) and commands/*.md (kind=command), from ~/.claude/settings.json:enabledPlugins + installed_plugins.json install paths, frontmatter-parsed, bare names, deduped. The snapshot claude branch discovers these live (snapshot cache rate-limits the reads). The coder / menu now shows up to three icon'd groups: <agent> commands (Terminal), <agent> skills (Puzzle), BooCoder skills (Sparkles) via a new optional icon on SlashCommandGroup. AgentCommand gains a kind field in both coder + web copies (parity test enforces); mergeCommandsByName made generic to preserve it. Invocation unchanged (literal /name -> claude). Project-local plugins deferred. BooChat unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 16:21:32 +00:00
dc3859975d coder(providers): capture + persist opencode's live ACP commands (no dispatch needed)
The cold ACP probe captured available_commands but read probedCommands synchronously right after newSession, racing opencode's async available_commands_update notification -> captured nothing, only the static manifest showed. The probe now waits (poll <=3s + 300ms settle) for the notification. Captured commands persist to a new available_agents.commands column and are served (merged with the manifest) on the tier-2-skip path, so the agent's discovered commands survive once models are warm and show without a dispatch. Boot warms via the force:true startup snapshot. Caveat: relies on opencode emitting available_commands_update on session creation, not only post-prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 14:56:18 +00:00
23a33e893a web+coder: segmented per-agent slash menu (agent commands + skills) + cross-agent skill execution
Coder / menu now shows two groups: the active agent's commands first (manifest + live ACP available_commands), BooCoder skills second. SlashCommandPicker gains an opt-in groups prop (flat items path unchanged -> BooChat byte-identical, parity verified); ChatInput takes slashGroups; CoderPane builds the groups. Skills run under the selected agent: coder skill_invoke accepts a provider and, when external, injects the server-side skill body into a dispatched task instead of native inference. Also folds in the initial-chat skill fix (handleLandingSkill: create chat -> assign to pane -> invoke, same transition as a text send) that resolves the landing-page blank screen. BooChat slash menu + skill invocation unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 14:38:39 +00:00
8bf86ecb92 web(coder): keep composer refresh on the top line + icon-only Mode picker on mobile
The AgentComposerBar refresh button wrapped to a second line on mobile: the status dot had ml-auto (pinned to the far-right edge) and the refresh button followed it in DOM order, overflowing past the edge. Group the dot + refresh into one right-aligned (ml-auto) unit so the refresh stays on the top line. Also add an iconOnly option to CompactPicker and render the Mode (permission) picker icon-only on mobile (shield + chevron, no label; aria-label/title + tap-to-open list still convey the selection) to free row width. Desktop unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:46:40 +00:00
fe52250d78 coder(providers): fix empty picker (loading-state) + config model overrides + current Claude models
Fix: getProviderSnapshot returned synchronous installed:false 'loading' entries on a cache miss (v2.5.5/Phase 2), which AgentComposerBar filters out — with the Phase 5 client poll not yet built, a single fetch stranded on 'loading' and the picker showed no providers. It now awaits the build and returns terminal entries; the sync loading-return is deferred until Phase 5. Builds stay fast via the tier-2 cold-probe skip.

Feature: wire the v2.3 config schema's models/additionalModels — buildResolvedRegistry carries them onto ResolvedProviderDef (models replace, additionalModels merge) and provider-snapshot applies them to every ready model list, so /data/coder-providers.json can edit any provider's models with no code change. Claude staticModels bumped from the stale 2-entry list to opus/sonnet/haiku latest-aliases + pinned claude-opus-4-8 / claude-sonnet-4-6 / claude-haiku-4-5-20251001 (passed verbatim to claude --model). +2 tests (109 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:37:01 +00:00
4035aa2b98 coder(providers): v2.3 provider-lifecycle phase 3 — generic ACP dispatch
ACP dispatch now spawns from the resolved registry's launch spec instead of a hardcoded per-name switch. acp-spawn.ts gains resolveLaunchSpec(resolved, installPath): launchCommand (config override / custom-ACP command) wins, else the kept resolveAcpSpawnArgs switch is the built-in fallback. acp-dispatch.ts spawns spec.binary/spec.args with env { ...process.env, ...spec.env }; dispatcher.ts loads the resolved def by task.agent and passes it through. Config-defined custom ACP providers dispatch with no new switch case. Built-in dispatch (opencode/goose/qwen) is byte-identical to pre-v2.3 — proven by a regression test (opencode->['acp'], goose->['acp'], qwen->['--acp'], binary=installPath ?? id, empty env -> plain process.env). Deliberate deviation from design's !installPath->null: the installPath ?? id fallback is preserved. setSessionMode/permission/streaming and the dispatcher poll/NOTIFY/running-guard untouched. 7 new acp-spawn.test.ts cases. No routes/UI (Phase 4+).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 12:06:32 +00:00
35a0aba211 coder(providers): v2.3 provider-lifecycle phase 2 — snapshot lifecycle
provider-snapshot no longer returns null for uninstalled/disabled providers: it emits one entry per registered provider with a lifecycle status (loading|ready|unavailable|error), an enabled flag, and a two-tier probe. Tier-1 is a fast which-style check (command-availability.ts, execFile/no-shell); tier-2 (cold ACP probe) is skipped unless forced, last_probed_at is older than PROVIDER_PROBE_TTL_MS (24h), or DB models are empty — the snapshot-latency win. Cache miss returns status:'loading' synchronously while the build settles via the existing inflight promise. ProviderSnapshotStatus/Entry regain loading/unavailable + gain enabled/description?/fetchedAt? in both coder and web copies, guarded by a runtime parity test (provider-types-parity.test.ts; compile-time cross-project check was blocked by TS6307). Also tracks the data/coder-providers.json seed via a .gitignore exception, completing the Phase 1 config file. No dispatch/route/UI changes (Phase 3+); AgentComposerBar filtering unchanged. 13 snapshot tests (+6) + 6 parity tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 11:47:48 +00:00
3730dc9341 coder(providers): v2.3 provider-lifecycle phase 1 — config-backed registry
Adds a config layer merged over the hardcoded built-ins (tasks 1.1-1.6): CODER_PROVIDERS_PATH env (default /data/coder-providers.json); provider-config.ts (Zod schema + never-throw loader — missing/invalid file falls back to built-ins only — + save); provider-config-registry.ts (ResolvedProviderDef + buildResolvedRegistry merge: override built-ins, add custom extends:'acp' entries, boocode always enabled + singleton); agent-probe now iterates the resolved registry, probes custom-ACP command[0] via execFile (no shell), skips disabled providers (keeps the row), reads enabled from memory only (no DB column). No snapshot/dispatch/route/UI changes (Phase 2+). 6 new unit tests; empty config provably yields exactly the built-ins.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:09:34 +00:00
a359a4ab8b coder(providers): remove retired cursor and copilot providers
Drop both retired providers from BooCoder's provider layer: acp-spawn argv cases, provider-manifest mode blocks + manifest keys, provider-commands maps, the provider-snapshot cursor model-CLI branch (+ orphaned exec/promisify imports), the agent-probe copilot ACP-detect branch, and the now-dead cursor-models module + its test. The PROVIDERS registry array already lacked both. Built-ins unchanged: claude, opencode, goose, qwen, native boocode.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 04:07:21 +00:00
a8c84ecfe4 chore+docs: config, agent registry, codecontext, v2.6 spec, changelog
Working-tree config/doc changes (.gitignore, CLAUDE.md, AGENTS.md removal + data/AGENTS.md, codecontext Dockerfile/shim — pre-existing) plus this session's v2-6 persistent-agent-sessions openspec proposal/design/tasks (planning only; feature unimplemented, reserves the v2.6.0 tag) and the v2.5.2 CHANGELOG entry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:31 +00:00
547fd70650 server/coder: working-tree backend changes (pre-existing)
Checkpoint of in-progress backend work present in the tree, not authored this session: auto_name, inference tool-phase/turn, secret_guard, provider-registry, plus a new agent-allowlist test (7 tests, passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:16 +00:00
990a615b87 web(coder UI): ChatInput migration + Thinking render + DiffPanel route fix
Bundles in-progress working-tree UI work not authored this session (CoderPane ChatInput migration, AgentComposerBar/CoderMessageList/tab-bar/sidebar/pane refinements, provider icons) with this session's changes to the same files: MessageBubble renders a collapsible 'Thinking' block from reasoning_text/reasoning_parts (surfacing ACP agent_thought_chunk + native reasoning), and the DiffPanel approve/reject calls are repointed to the real /api/coder/pending/:id/apply and /reject routes (the old /sessions/:id/pending/:id/approve|reject paths did not exist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:06 +00:00
5352fd9942 coder(pending): new-file-from-RightRail create endpoint + modal
POST /api/sessions/:sessionId/pending/create queues a pending_changes create via queueCreate (WriteGuardError -> 422 with the guard message). RightRail gains a 'New file from pasted text' modal (path + content) wired through api.coder.createPendingFile; sessionId is threaded down from App.tsx. The staged change shows in the CoderPane DiffPanel for explicit apply.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:50 +00:00
66df410826 web: fix mobile nav stuck-open on rejoin + paste-chip code fence
useViewport re-syncs the snapshot on pageshow/visibilitychange/resize/orientationchange — iOS reported a stale width on backgrounded-tab restore, leaving isMobile=false so the sidebar rendered as a permanent column with no close affordance. flattenToMessage now inserts pasted-text chips verbatim instead of wrapping them in a triple-backtick fence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:42 +00:00
f89c8f3f15 coder(dispatcher): react to new tasks via LISTEN/NOTIFY, poll as fallback
AFTER INSERT trigger on tasks fires pg_notify('tasks_new'); the dispatcher listens via porsager sql.listen and triggers an immediate poll, with the setInterval poll kept at 2s as a missed-notification safety net. Per-session guard unchanged (no double-dispatch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:11:34 +00:00
cbef7618b3 v2.5.1-budget-100: raise all tool call budgets to 100 + codecontextignore fix
Budget defaults raised from 50/10/50 to 100/100/100 (read-only,
non-read-only, no-agent). Per-agent max_tool_calls from AGENTS.md
still overrides.

Added .claude/worktrees/ to .codecontextignore to prevent
get_codebase_overview from parsing empty stub files in stale
worktree node_modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-28 02:40:26 +00:00
fcc7c5a86e v2.5.0-task-model: lightweight task model services + tasks table
Task model infrastructure for cheap LLM calls (auto-naming, search
rewrite, tags, summaries) via a dedicated llama-server instance at
TASK_MODEL_URL, falling back to LLAMA_SWAP_URL with FAST_MODEL when
unset. Replaces the inline fetch in auto_name.ts with taskModelCompletion.

Adds search query rewriting: on step 0 when web tools are enabled, the
user's message is summarized into a search intent hint appended to the
system prompt, improving web_search relevance.

Schema: tasks table for provider dispatch and arena, sessions.tags column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 21:44:39 +00:00
bcfc94fa47 v2.4.1-sidecar-routing: route per-agent flags to llama-sidecar + tool gap fix
Batch 3c: when an agent has llama_extra_args in AGENTS.md, provider.ts
routes inference through LLAMA_SIDECAR_URL instead of LLAMA_SWAP_URL.
X-Agent-Flags header built from the agent's flags. Boot-time guard
refuses to start if any agent has llama_extra_args but LLAMA_SIDECAR_URL
is unset. PrefixFingerprint gains a route field (swap/sidecar) for
per-turn visibility. 9 provider tests.

AGENTS.md tool gap: all agents (except Prompt Builder) were missing 8
tools that were added after the original tool lists were written:
request_read_access, view_truncated_output, ask_user_input, git_status,
get_blast_radius, get_hot_files, get_middleware, get_routes. The missing
request_read_access caused silent "permission denied" when reading files
outside the project root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 19:28:08 +00:00
384 changed files with 31980 additions and 11157 deletions

View File

@@ -21,6 +21,7 @@ out/
.opencode/
.vscode/
.idea/
.claude/worktrees/
# Test artifacts / coverage
coverage/

View File

@@ -11,6 +11,15 @@ POSTGRES_PASSWORD=CHANGE_ME
# point BooCode at a different SearXNG instance.
SEARXNG_URL=http://100.114.205.53:8888
# Context7 MCP key. Referenced from data/mcp.json as "{env:CONTEXT7_API_KEY}"
# ({env:VAR} substitution, opencode-compatible). Leave unset to send no key.
# CONTEXT7_API_KEY=ctx7sk-...
# Task model: lightweight model for auto-naming, search rewrite, etc.
# Direct llama-server instance (NOT llama-swap). Falls back to LLAMA_SWAP_URL
# with FAST_MODEL when unset.
# TASK_MODEL_URL=http://100.90.172.55:7995
# v1.13.15-tools: BOOCODE_TOOLS narrows the tool whitelist sent to the LLM.
# Unset (default) → all tools (~21k schema). Useful primarily for single-purpose
# sessions where the model only needs read-only filesystem access.

4
.gitignore vendored
View File

@@ -15,4 +15,6 @@ secrets/
data/*
!data/AGENTS.md
!data/skills/
!data/mcp.json
!data/mcp.example.json
!data/coder-providers.example.json
codecontext/fork.tar.gz

109
AGENTS.md
View File

@@ -1,109 +0,0 @@
# Agent navigation
Cursor/agent entry point for the BooCode monorepo. **Deep engineering reference:** `CLAUDE.md` (Claude Code). This file is navigation + task routing only.
Last updated: 2026-05-25
## Doc map
| Need | Read |
|------|------|
| Commands, gotchas, inference, DB, env | `CLAUDE.md` |
| Read-only chat behavior | `BOOCHAT.md` |
| Write tools, dispatch, pending changes | `BOOCODER.md` |
| Shipped vs planned, version order | `boocode_roadmap.md` |
| Latest release truth | `CHANGELOG.md` (top entry = current) |
| System diagram + data flow | `docs/ARCHITECTURE.md` |
| Current focus / blockers | `CURRENT.md` |
| Batch convention | `openspec/README.md` |
| Shipped batch snapshots | `openspec/changes/archived/` |
| Chat agent personas + tool lists | `data/AGENTS.md` |
| External repo lift inventory | `boocode_code_review.md` |
## Monorepo layout (actual)
Three **surfaces**, four **packages**. There is no `apps/chat/` directory.
| Surface | Packages | Port | Deploy |
|---------|----------|------|--------|
| **BooChat** | `apps/server` (API + inference) + `apps/web` (SPA) | `100.114.205.53:9500` | Docker `boocode` container |
| **BooTerm** | `apps/booterm` | `100.114.205.53:9501` | Docker `booterm` container |
| **BooCoder** | `apps/coder` | host `:9502` | systemd `boocoder.service` (not Docker since v2.1.0) |
Shared: Postgres 16 — Docker service `boocode_db`, **database name `boochat`**, host port `127.0.0.1:5500`.
## Task routing
| Task type | Start here |
|-----------|------------|
| Chat inference / tools / compaction | `apps/server/src/services/inference/` |
| WS frames | `apps/server/src/types/ws-frames.ts` + `apps/web/src/api/ws-frames.ts` (keep in sync) |
| Frontend chat UI | `apps/web/src/components/`, hooks in `apps/web/src/hooks/` |
| BooCoder write tools / dispatch | `apps/coder/src/` — build server first (`pnpm -C apps/server build`) |
| Provider picker / external agents | `apps/coder/src/services/provider-registry.ts`, `dispatcher.ts`, `agent-probe.ts` |
| Terminal panes | `apps/booterm/src/`, frontend `TerminalPane.tsx` |
| Schema changes | `apps/server/src/schema.sql` + sync `*_STATUSES` in `apps/server/src/types/api.ts` |
| New batch / feature | `openspec/changes/<slug>/proposal.md` + `tasks.md` (see below) |
## Verification (before claiming done)
```bash
pnpm -C apps/server test && pnpm -C apps/server build
npx tsc -p apps/web/tsconfig.app.json --noEmit # root tsc can miss web errors
curl http://100.114.205.53:9500/api/health # Tailscale IP, not localhost:9500
curl http://100.114.205.53:9502/api/health # BooCoder on host
```
Deploy truth beats source-only reads — check running health + `git log --oneline -3`.
## Hard rules (from CLAUDE.md)
- **Do not commit or push** unless Sam explicitly asks.
- **No app-layer auth** — Authelia at the reverse proxy.
- **Parts table is source of truth** — read message tool fields from `messages_with_parts` view, write via `insertParts`.
- **New WS frame type** — update server + web schemas; publish via `publishFrame` / `publishUserFrame` only.
- **New tool** — own file in `services/`, register in `tools.ts` `ALL_TOOLS`; whitelists derive from there, never hardcoded.
- **Typecheck web with per-app tsconfig** — root `tsc --noEmit` uses project references and can miss errors.
- **`includeUsage: true`** on `createOpenAICompatible` in `provider.ts` — do not remove.
- **Agent dispatch** — direct `spawn`/`exec` on host via `install_path` (v2.1.0+); SSH helpers deprecated.
- **Event dedup** — server publishes via broker; frontend must not duplicate `sessionEvents.emit` after API calls that already WS-broadcast.
## Using openspec with Cursor
Openspec is a **folder convention**, not a CLI. Use it to give agents a scoped brief before coding.
### When starting a batch
1. Create `openspec/changes/<slug>/` (lowercase-hyphenated, e.g. `v2-2-arena-ui`).
2. Write `proposal.md` — why, scope, non-goals, dependencies.
3. Write `tasks.md` — numbered checkbox steps (build + smoke).
4. Optional `design.md` — schema/API decisions that outlive the batch.
See `openspec/README.md` for the full shape. Shipped pre-v1.13.15 batches live in `openspec/changes/archived/` as snapshots only.
### Prompting an agent
```
@openspec/changes/<slug>/proposal.md @openspec/changes/<slug>/tasks.md
Implement tasks 13. Server tests must pass. Do not commit.
```
Attach the spec files with `@` so they load into context. Point at specific code paths when known:
```
@openspec/changes/v2-x/proposal.md
Extend apps/coder/src/routes/providers.ts — follow provider-registry.ts patterns.
```
### After shipping
- Tag: `vMAJOR.MINOR.PATCH-slug`
- Add entry to top of `CHANGELOG.md`
- Move or snapshot the openspec folder to `archived/` if you want history preserved
- Update `CURRENT.md` and `boocode_roadmap.md` shipped table if the batch was roadmap-tracked
### What not to use openspec for
- One-line bug fixes — just describe the bug + file.
- Exploratory questions — Ask mode + `@CLAUDE.md` is enough.
- Duplicating `CLAUDE.md` — openspec is per-batch scope, not permanent conventions.

View File

@@ -28,6 +28,11 @@
- Prefer codecontext (`search_symbols`, `get_symbol_info`, `get_dependencies`) over `grep` for symbol-level questions. Fall back to `grep` / `view_file` when codecontext returns degraded or empty results — that signals an unsupported language or parse failure.
- Verify before reporting work complete: run the relevant test/build/smoke command and confirm output matches the claim. Evidence first, assertion second.
## Recovery and context (v2.7)
- **Heed the recovery nudge.** Native inference tracks consecutive tool **failures** (`mistake-tracker.ts`): after 3 in a row with no successful step between, a `mistake_recovery` sentinel is injected telling you to re-read tool schemas, verify a path exists before acting, and try a *different* approach — not retry variations of the same failing call. Ignoring it (a second failure run with the nudge still outstanding) **escalates and stops the turn** to protect the step budget. This complements the doom-loop guard, which only catches *identical* repeats.
- **Files-read provenance survives compaction.** Paths you read via `view_file` / `grep` / `find_files` / `list_dir` are accumulated and merged into a cumulative `## Files Read` ledger in the rolling summary, so a file read long ago stays in context across compactions. You don't manage this — but it means you usually don't need to re-read a file just because the raw turn scrolled out of the window.
## Output format
- Stay in Markdown by default for every reply, short or long.

View File

@@ -23,6 +23,8 @@ You are BooCoder, a write-capable coding agent. You can read AND modify files wi
Every file modification queues in `pending_changes` before touching disk. The user sees a diff preview and approves/rejects each change. Never bypass this queue — it is the safety boundary between inference and the filesystem.
`edit_file`'s `old_string` match is **fuzzy** (`fuzzy-match.ts`, v2.7.1): an exact → per-line-whitespace → unicode-canonicalization (curly quotes/dashes/nbsp) → Levenshtein-≥0.66 ladder, so minor whitespace/indentation/unicode drift in `old_string` still lands on the right span. Two consequences: a near-miss `old_string` may still apply (verify the queued diff is what you intended), and an `old_string` matching **more than one** place is rejected as **ambiguous** rather than editing the first — add surrounding context to disambiguate. A genuine non-match returns a clear failure, not a thrown error.
## Behavior
- Show diffs clearly. Explain what you're changing and why.
@@ -37,3 +39,113 @@ Every file modification queues in `pending_changes` before touching disk. The us
- Never count `dist/` directory sizes as source lines. Only count `src/**/*.ts` files. Compiled output is inflated by inlined types and transpilation artifacts.
- Before claiming a feature works, run the actual command and show the output. "Should work" is not verification. Acceptable evidence: test output (`pnpm test`), build output (`pnpm build`), curl response, docker logs, `\d tablename` output. If you can't run it, say so explicitly — don't assert success without evidence.
- When reporting counts (tools, tests, files, routes, lines), derive the number from a command (`grep -c`, `wc -l`, test runner output) — not from memory or approximation.
## Provider lifecycle (v2.3)
BooCoder's coding agents are a **config-backed registry**: built-ins live in `provider-registry.ts`, and `data/coder-providers.json` layers overrides + custom entries on top. Registration ≠ installation — the config lists what you *want*; a probe reports what's *ready*.
### Config file: `data/coder-providers.json`
Resolved from `CODER_PROVIDERS_PATH` (default `/data/coder-providers.json`; dev/host path `/opt/boocode/data/coder-providers.json`). It is **gitignored** — it's live runtime config that the coder reads *and writes* (UI toggles `PATCH` it), so tracking it would churn `git status`. The tracked reference is `data/coder-providers.example.json`; copy it to `coder-providers.json` to seed overrides. A missing file, invalid JSON, or a schema mismatch all fall back to built-ins-only — loading never throws at startup.
```json
{
"providers": {
"goose": { "enabled": false },
"amp-acp": {
"extends": "acp",
"label": "Amp",
"description": "ACP wrapper for Amp",
"command": ["amp-acp"],
"enabled": true
}
}
}
```
Per-provider override fields (all optional):
| Field | Meaning |
|-------|---------|
| `extends` | `"acp"` — required for a NEW (custom) provider; built-in overrides omit it |
| `label` | Display name (required for custom) |
| `description` | Sub-label shown in the picker / settings |
| `command` | `[binary, ...args]` to spawn (required for custom; overrides a built-in's default argv) |
| `env` | Extra env vars merged into the spawn |
| `enabled` | Default `true`; `false` hides it from the composer |
| `order` | UI sort key |
| `models` / `additionalModels` | Replace / merge onto the discovered model list |
A PATCH to one provider id **replaces that id's override object wholesale** (per-id shallow merge), so to flip a single field keep the rest; a `null` value for an id deletes its override (reverts to the built-in default).
### Refresh contract
The snapshot is cached and a provider's cold ACP probe (tier-2) is **skipped** while `available_agents.last_probed_at` is younger than `PROVIDER_PROBE_TTL_MS` (default `86400000` = 24h). Opening the composer is therefore fast and does not re-probe. To force a cold re-probe (after installing a CLI or editing models): **`POST /api/providers/refresh`** (the Refresh button in the Providers settings tab), which clears the cache and re-probes.
### Enable / disable
Two ways:
- **Settings → Providers tab** — open the sidebar → **Settings****Providers**: toggle a provider on/off, refresh it, or open its diagnostic. (Earlier builds exposed a gear in the composer; that control was moved into Settings.)
- **Edit the config** (`"enabled": false`) then `POST /api/providers/refresh`.
A **disabled** provider leaves the composer's provider picker but stays listed in the Providers tab (status "Disabled") so you can re-enable it. **Native `boocode` is always-on** — an `enabled:false` on it is ignored (with a warn log) and it is never rendered as toggleable.
### Adding a custom ACP provider
- **Catalog modal**: Providers tab → **Add provider** → pick an entry → it PATCHes the config (`extends:'acp'` + label + command, enabled) and refreshes that provider.
- **Hand-edit** `data/coder-providers.json`: add an id with `extends:'acp'`, `label`, and `command`, then `POST /api/providers/refresh`.
Either way, **adding to config does NOT install the binary.** Until the CLI is on `PATH` the provider shows **"Not installed"** (status `unavailable`) and does not appear in the composer picker.
### Known limitation — subset refresh
`POST /api/providers/refresh` accepts an optional `{ "providers": ["id", ...] }` body and returns a `refreshed` count scoped to that subset — **but the underlying cold re-probe currently covers ALL installed providers**, not just the requested subset. True per-provider force is a future change (it needs a snapshot-internal parameter). This is intentional for now, not a bug: a subset refresh still re-probes everything; only the reported count is scoped.
### Deploy + smoke
Two deploy targets:
- **Routes (host service):** `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
- **Web UI (container):** `docker compose up --build -d boocode`
Green gate (verified across phases 15): `pnpm -C apps/coder test` (134 passing) `&& pnpm -C apps/coder build`.
Smoke (via Tailscale):
```bash
curl http://100.114.205.53:9502/api/providers/snapshot # lists every registered provider
curl http://100.114.205.53:9500/api/coder/providers/config # raw config, through the BooChat proxy
# Settings → Providers: disable goose → it leaves the composer picker, stays in the tab
# POST refresh → models repopulate; Add a catalog entry → it appears after refresh (unavailable until its CLI is installed)
```
## Persistent agent sessions (v2.6)
When you `dispatch_external_agent` to a chat-tab provider, BooCoder keeps that agent **warm and resumable** instead of spawning a fresh process per turn. This is mostly transparent — but the model below explains why turn 2 is fast, why an external agent remembers earlier turns, and how edits flow.
### Backends and keying
- One live backend per **`(chat_id, agent)`** pair, owned by the `agent-pool` (`agent-pool.ts`). State lives in `agent_sessions` (the resumable session id) and `worktrees` (the per-chat working copy).
- **opencode** runs a long-lived `opencode serve` (`backends/opencode-server.ts`) with per-session SSE; turns after the first reuse the same session (memory intact, ~9× faster).
- **goose / qwen** run a warm ACP connection (`backends/warm-acp.ts`) — `initialize` + `session/new` once per `(chat,agent)`, then `session/prompt` per turn. Interrupt cancels the prompt (`session/cancel`), never the child.
- **claude** runs the Claude Agent SDK backend (`backends/claude-sdk.ts`) over a clean-room Postgres session store.
- Arena, MCP `new_task`, and one-shot dispatches still use the cold `runExternalAgent` path — warm reuse needs both a `session_id` and a `chat_id`.
### Worktrees
- External agents write **directly into a persistent per-chat worktree** (`/tmp/booworktrees/sess-<id>`), not into the project root via `pending_changes`. The worktree is created once, base commit captured, and **reused across turns and across agents in the same chat** — so opencode and goose in one chat share one worktree.
- Each turn's worktree diff supersedes the prior `pending_changes` row for that `(chat,agent)` (latest-wins) and is badged with the authoring agent in the DiffPanel.
- **Staging boundary:** a provider only sees another agent's edits once they are **applied**. Unapplied worktree edits from a different agent are invisible to you — the DiffPanel shows a muted hint when that's the case.
### Lifecycle (v2.6.10v2.6.11)
- **Idle eviction:** a backend idle past `AGENT_POOL_IDLE_TTL_MS` (default 30 min) is disposed; an LRU cap of `AGENT_POOL_MAX_LIVE` (default 10) bounds live backends. A busy backend is never evicted, and the next turn transparently re-attaches or re-creates from `agent_sessions`/`worktrees`.
- **Crash recovery:** a health monitor restarts a crashed server (opencode → fresh sessions; ACP → re-`session/new`) and reclaims its port.
- **Close cleanup:** closing/deleting a chat or session evicts its backends, archives the `worktrees` row, and removes the worktree. An hourly reaper sweeps orphaned worktrees (dirty/unpushed preflight before removal).
### Checkpoints (v2.7.1)
Because external agents write the worktree directly (outside `pending_changes`), a worktree **checkpoint** is shadow-committed before each external-agent turn (tracked + untracked, into `refs/boocode/checkpoints/<id>`), anchored to that turn's assistant message. The per-message **"Restore to here"** affordance resets the worktree (`reset --hard` + `clean -fd`), trims the transcript past that message, and resets the `(chat,agent)` backend session — so files, transcript, and agent context land consistent at the restore point. `rewind` still only reverses BooCoder's own applied `pending_changes`; checkpoints are what cover external-agent worktree edits.
### Normalized status (v2.6 / v2.7.6)
Turn boundaries publish a normalized per-`(chat,agent)` status — `working | blocked | idle | error` — to the UI (`agent_status_updated` frame), so blocked-on-permission and crash/idle are visible, not just WS liveness.

View File

@@ -2,6 +2,174 @@
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
## v2.7.16-container-git-safedir — 2026-06-03
Hotfix that makes the `v2.7.15-git-diff-panel` work in production. The `boocode` container runs as root but bind-mounts host project repos owned by uid 1000, so git rejected them with "detected dubious ownership" and the diff route reported every project as not-a-repo — which hid the Git tab entirely (and had been silently nulling the existing branch indicator too). Adds `git config --system --add safe.directory '*'` to the Dockerfile runtime stage so the container's git trusts the mounted repos; applied live to the running container and baked into the image for future rebuilds. Surfaced by a live smoke immediately after the v2.7.14/v2.7.15 deploy.
## v2.7.15-git-diff-panel — 2026-06-03
A Files / Git tab in the right-side file panel (the file-browser sidebar) that shows the project repository's git diff and lets the user stage, unstage, commit, and discard whole files in-session — modeled on Paseo's diff view, scoped and planned through the `plan-a-feature``plan-implementation` skills, then built and audited via `paseo-epic` in an isolated worktree. Two comparison modes (Uncommitted vs HEAD, and the current branch vs its base — the upstream tracking branch else `origin/HEAD`), auto-selected by repo dirty-state on first open and pinned after an explicit choice; per-file expand/collapse with lazy Shiki `lang:'diff'` highlighting, +/- stats, and binary/too-large placeholders. All git read and write logic lives in `apps/server` (new `git_diff.ts` + routes on `projects.ts`) — the read-only-server posture governs the assistant's tools, not the user's own actions, and the container already mounts `/opt` read-write while `project_bootstrap` already commits via `execFile`. Every write uses the safe `execFile` argv pattern (never a shell string) with `--` operand separators, per-file `pathGuard` + realpath symlink-escape validation, server-derived `-c` commit identity (the request body is `.strict()` and carries no author fields), and the write endpoints are deliberately absent from the assistant tool registry. Reads are bounded (30s deadline, 10MB); an index lock or an in-progress merge/rebase/cherry-pick/bisect surfaces as "repository busy" and disables writes. The panel stays current via a client `git_diff_refresh` sessionEvent (no new wire contract) coalesced across tab-open, mutations, turn completion, and pending-change apply; discard is an irrecoverable hard-delete behind a plain confirm distinguishing a tracked revert from an untracked delete. New `git_diff` pure-helper + temp-repo integration tests (59 cases); server 630 tests green, web tsc clean. Pairs with `v2.7.14-backlog-hardening` (shipped together).
## v2.7.14-backlog-hardening — 2026-06-03
Five independent items from the second external-code-review backlog (`boocode_code_review_v2.md`), each built and audited as its own phase via `paseo-epic`. **External task-cancel** now actually works: Stop on an opencode/goose/qwen/claude task aborts the running child via a per-task `AbortController` registry reachable from the cancel route and finalizes the assistant message as `cancelled` — fixing two latent bugs (catch blocks left the message `streaming`; warm success-paths wrote `complete` on an aborted turn); warm pools/worktrees are preserved (abort the prompt only, never the pooled process) and the native boocode path is unchanged. **Parser prune**: the tool-call parser drops to its two load-bearing exports (eight zero-caller symbols unexported, a gate test added for the `<invoke>`-as-text fallback) with no live-path behavior change, and placeholder-rejection logging moves to pino. **BooChat stall-timeout**: a 90s per-chunk deadline wraps native inference's `fullStream` via `AbortSignal.any` so a hung local stream finalizes the message instead of hanging — no retry, since re-running re-emits already-streamed deltas (a pure `classifyStreamError` helper is added). **view_session_history**: a read-only MCP tool returning the newest-N transcript (role≠system) in chronological order. **Retire :9502**: the unused `apps/coder/web` fallback SPA is removed (package, static-serve block, build step, Dockerfile copy, `@fastify/static`), keeping every API/WS/health/MCP route. F1 added an optional `status` field to the shared `message_complete` contracts frame (so a deploy rebuilds `@boocode/contracts` first, as the sequence already does). Server 630 / coder 360 tests green.
## v2.7.13-contracts-ssot — 2026-06-02
Creates `@boocode/contracts` (`packages/contracts`), a new workspace package that becomes the single source of truth for every cross-app wire contract — reversing the decision recorded in `v2.5.12-provider-lifecycle-phase4` that declined a shared types package as not worth the Docker/build-order risk at solo scale; a live `AgentSessionConfig` drift that had since appeared between `apps/coder` and `apps/web` justified the investment. Six contracts are now defined exactly once: the `WsFrameSchema` Zod runtime schema, the provider snapshot types (`ProviderSnapshotEntry` and family), the Zod provider-config schemas, `MessageMetadata` + `ErrorReason`, `AgentSessionConfig`, and `WorktreeRiskReport`; both Zod-backed contracts use `z.infer` so validator and type derive from the same definition and cannot drift independently. All four consumers — `apps/server`, `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web` — import via `workspace:*` through a per-subpath exports map consuming built dist only (no tsconfig project references); the hand-synced copies and their parity tests (`provider-types-parity.test.ts`; the ws-frames byte-parity assertion) are deleted while the KNOWN_FRAME_TYPES drift test and broker fail-closed tests are preserved. Build order is inverted in the root build script, Dockerfile, and coder deploy docs; `apps/coder/web`'s migration also removed dead `pending_change_*` reducer arms (no frame publisher exists for these — pending changes are HTTP-delivered), closing a latent missing-default-arm crash, and reconciled field-type conflicts with the canonical `WsFrame`; zod is pinned to a single version across the workspace. Server 543 / coder 293 / contracts 11 tests passing; human smoke verified on the live stack 2026-06-02.
## v2.7.12-audit-cleanup — 2026-06-02
A repo-wide audit and aggressive cleanup pass, run as a multi-agent orchestration (five read-only Opus auditors over server/web/coder/booterm + cross-cutting deps/build/parity + a structural-architecture lens) followed by phased, behavior-preserving implementation — every change gated on the per-app test suites and delivered behind a strict DEFER discipline that never touched the files in flight for `v2.7.9``v2.7.11` (`mcp-config`, the `ws-frames` pair, `dispatcher`, `claude-sdk-map`, `AgentComposerBar`/`CoderMessageList`/`CoderPane`), so the branch rebased onto current main with zero conflicts. **Dead code/deps/schema**: removed ~9 dead files and a swathe of dead exports/write-only state across all four apps, dropped dead deps (`next-themes`, `@xterm/addon-webgl`, booterm `tslib`; `shadcn`→devDep), and idempotently dropped dead schema columns/tables (`sessions.tags`, `tasks.worktree_path`/`feature_values`, `available_agents.supports_mcp_client`, the superseded `session_worktrees` table, the always-empty `list_worktrees` MCP tool) — chat/session/message DATA untouched, only never-read columns. **Server dedup + reshapes**: collapsed the dead `budget.ts` tier system (surfacing a latent `READ_ONLY_TOOL_NAMES` drift, then deleted), extracted shared `MESSAGE_COLUMNS`/`selectProject`/`stripQuotes`/`SENTINEL_KINDS`/`samplerOptsFromAgent`/`createContentFlusher`/`insertSentinel`/a `makeCodecontextTool` factory/a pending-tool-call resolver, split `tools.ts` (799→46 barrel + `tools/{types,fs-tools,misc-tools,registry,tiers}`, register-through registry preserved so coder's import contract stays byte-stable), and decomposed the inference pipeline (`sentinel-summaries``runWrapUpSummary`, `turn.ts``turn-config`+`step-decision`, a pure `stream-phase-adapter`, shared finalize atoms — stopping short of fusing synthesis to preserve frame timing). **Coder reshapes**: split the 1062-line `opencode-server.ts` god-class into supervisor / sse-loop / pure event-map / port-utils + extracted `buildAcpClient`/`makeFrameEmitter`/`worktree-risk`, plus happy-path-safe concurrency hardening (reconnect backoff, double-spawn guard; a defensive busy-assert + ensureSession coalescing flagged for review). **Web**: `React.memo` on `MessageBubble`/`MarkdownRenderer` + module-hoisted markdown components (the streaming re-parse was the biggest perf cost), shared `linkifyPaths`/artifact/tab dedup, two latent bug fixes (`ChatPane` index-keys → stable ids; `FileViewerOverlay` blank-line line-number desync), and decomposed the 1298-line `TerminalPane.tsx` into fit/socket/selection hooks + presentational pieces (verbatim move, all ~30 listeners/timers inventoried; the label-dep fix stops a live terminal tearing down on pane renumber). +78 parity/unit tests (server 597, coder 328 green; `apps/web` has no harness, so its changes are typecheck + manual/device QA). Net ≈ 4,600 LOC. Deferred (designed; blueprints in the audit reports): the `tasks` dual-CREATE / `project_id` FK (a cross-service deploy-ordering decision, not a data migration), web structural decomposition of `useWorkspacePanes`/`MessageBubble` (needs a web test harness first), a `@boocode/contracts` shared package, and the `dispatcher.ts` split — the last two now unblocked since their in-flight files shipped in `v2.7.9``v2.7.11`. Rebased clean onto `v2.7.11-coder-model-snapshot`.
## v2.7.11-coder-model-snapshot — 2026-06-02
Hotfix for the coder model-attribution chip vanishing on refresh. The chip showed during a live turn (the `message_complete` frame carries `model`) but disappeared when a BooCoder session was reloaded — only in the coder, not BooChat. Root cause: `CoderPane`'s `useCoderMessages` hydrates from two sources on load — the HTTP `listMessages` fetch (whose SELECT includes `model`, added `v2.7.8`) AND the WS `snapshot` frame — and the WS snapshot's query in `apps/coder/src/routes/ws.ts` had its own column list that omitted `model`. The client's `snapshot` handler `setMessages`-overwrites the HTTP load, so the model-less rows won, and with no later `message_complete` for historical messages the chip stayed gone. Fix is one column: add `model` to the WS snapshot SELECT so both hydration paths agree. The `apps/coder/CLAUDE.md` "update every mapper" note now lists the WS snapshot SELECT explicitly (it was the one place not enumerated). apps/server + apps/coder builds green; deployed via `systemctl restart boocoder` (host service — the earlier `v2.7.10` docker deploy rebuilt only the container, never this route). Fixes the chip shipped in `v2.7.8-ember-coder-tabs-model-chips` / completed in `v2.7.9-mcp-keys-docs-coder-fixes`.
## v2.7.10-composer-chips — 2026-06-02
A composer control-row refresh shared by BooChat and BooCoder via `ChatInput`. The slash-commands menu moves out of the full-width `AgentCommandsHint` disclosure (now removed) into a compact chip in the message box's bottom controls row — clicking it opens the existing `SlashCommandPicker` anchored to the chip and selecting inserts `/<name> `, while the typed-`/` autocomplete is unchanged. A new attach-file button sits beside it, opening a native multi-file picker that funnels picks through the same drag-drop pipeline (5 MB / binary gate, 10-attachment cap, chips + preview, `source:'drop'`). On mobile both collapse to icon-only — the slash count is `max-md:hidden` and the paperclip is icon-only — so the row stays on one line per the no-scroll toolbar rule. Web tsc + build green; deployed (docker). Builds on the BooCode 2.0 composer work in `v2.7.8-ember-coder-tabs-model-chips`.
## v2.7.9-mcp-keys-docs-coder-fixes — 2026-06-02
The MCP-key hygiene feature plus accumulated in-flight coder fixes and a docs refactor. **MCP `{env:VAR}` substitution** (`mcp-config.ts:substituteEnvVars`, opencode-compatible) recursively resolves `{env:NAME}` references in any string value of `data/mcp.json` from `process.env` *before* Zod validation, so real keys live in `.env` (`env_file`) instead of the gitignored config — an unset var resolves to `''` with a boot-log warning, and on a validation failure the loader names the unset vars alongside the field errors (an empty `{env:VAR}` in a strict url/command field invalidates the whole config, an otherwise-disconnected warning). `data/mcp.json` is now untracked (`.gitignore` flips `!data/mcp.json``!data/mcp.example.json`); the tracked template `data/mcp.example.json` carries `"CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"` and `.env.example` documents the key (9 mcp-config tests). **Two coder bug fixes** ride along: the `message_complete` frame's `model` is widened `string``string | null` in both ws-frames copies (server + web parity) and the dispatcher now publishes `model: task.model` at all four external assistant-completion points — without the nullable widen a null model would fail-closed in `publishFrame` and drop the entire frame including the `status:'complete'` transition (regression test added); and Claude-SDK `mapUserToolResults` now maps `user`-message `tool_result` blocks → terminal `tool_update` events (completed/failed with output) so external-agent tool snapshots resolve instead of spinning forever (the SDK feeds tool output back as a user message, previously unmapped). On the view side the `AgentComposerBar` drops the §9b resumed/history/new-session chip and token-usage readout and loses `flex-wrap` so the control row stays on one line, while `CoderPane` gains a per-chat `localStorage` agent-config cache (provider/model/mode/thinking keyed by chat id, restoring the last model on reopen) and threads the new `model` field into the timeline + attribution chip. **Docs refactor**: the root `CLAUDE.md` is slimmed (~190 lines) with per-app deep references split into `apps/{coder,server,web}/CLAUDE.md` (auto-loaded in-subtree), plus a new 372-line `docs/coder-backends.md` dispatch reference, a `docs/project-discovery.md` stack inventory, and a `docs/coding-standards/` set (the `cross-app-contract-parity` standard, fronted by `.claude/rules` path-scoped indexes) — `ARCHITECTURE.md` links the backends doc. Server 555 + coder 299 tests passing (incl. new mcp-config, ws-frames, and claude-sdk-map suites), web tsc + server + coder builds green. Builds on `v2.7.8-ember-coder-tabs-model-chips`.
## v2.7.8-ember-coder-tabs-model-chips — 2026-06-01
The BooCode 2.0 visual identity plus two workflow features. **Ember theme** (`styles/themes/ember.css`, now `DEFAULT_THEME_ID`) is the signature orange-on-near-black look — rebuilt on Obsidian's flat charcoal structure (`#0c0c0e`/`#15151a`/`#1f1f23`) with `#ff7a18` swapped in for the purple, after a Reinvented-direction detour (neon borders + a scanline/glow texture overlay) was dialed back to taste; the server `theme_id` whitelist gains `ember` so it can actually be selected. The **brand banner** (`ProjectSidebar`) shows the eye-patch Westie mascot + the `>_BooCode` wordmark big and edge-to-edge on transparent backgrounds — the source PNGs shipped with baked-white canvases, so they were flood-filled to transparency from the corners (preserving the white dog, which a naive white-key would have destroyed) and cropped to bounds. **Coder panes are now multi-tab**: `+` opens a new BooCode tab (a fresh chat = a new agent context sharing the session worktree) while the split button still opens a pane — coder panes reuse the shared `ChatTabBar` via a kind-aware `tabKind`, backed by a new `createCoderTab` action with `closeOtherTabs`/tab-numbering extended to coder kind. **Model-attribution chips**: a new `messages.model` column (both apps share the table) stamped at `finalizeCompletion` (BooChat + native coder) and at the dispatcher's assistant-row creation (external coder), surfaced through the `messages_with_parts` view + wire types + the live `message_complete` frame (the Zod already allowed `model`; nothing consumed it), and rendered as a subtle accent chip with a shortened label (`shortenModelName``Sonnet 4.6`, `Qwen3.6 35B`) beside the message stats — so swapping models mid-coder-session stays legible. Also the composer moved its Web toggle into a boxed, focus-ringed input, tool rows lead with a glowing accent dot, and the Claude-SDK-backend follow-ups validated live this session (1M context window, follow-up-message fix, collapsed thinking/tool chips) land with `CLAUDE_SDK_BACKEND=1` flipped on. One snag fixed mid-deploy: the view's new `m.model` was first inserted mid-list and `CREATE OR REPLACE VIEW` can't reorder columns (42P16) — appended at the end. Web tsc + server + coder builds green; deployed (docker + boocoder, tools:34). Builds on `v2.7.7-pane-header-actions`.
## v2.7.7-pane-header-actions — 2026-06-01
In-flight workspace UX work, committed alongside the v2.7 review batches. Extracts a shared `PaneHeaderActions` cluster (the +/Split/Reopen-closed-pane/Session-history/Close controls) used across the `ChatTabBar` and the desktop coder + terminal pane headers in `Workspace`, replacing the divergent per-header copies, with `SessionLandingPage` history enhancements and `useWorkspacePanes` tweaks. Also fixes a coder-side correctness bug: `resolveChatId` (`apps/coder/src/routes/chat-resolve.ts`) still read `sessions.workspace_panes` as a bare `WorkspacePane[]`, but `v2.6.5-panes-tabs-composer` widened it to a `WorkspaceState` envelope — so it mis-read the panes and, worse, clobbered `tabNumbers`/`nextTabNumber`/`closedPaneStack` back to a bare array on every pane-chat write; a new `normalizeWorkspaceState` accepts either shape and preserves the envelope (with a regression test). Plus a CLAUDE.md doc-sync (apps/coder vitest suite, deploy-by-surface, dual-remote push, in-flight-web-WIP staging, release-branch naming). Web tsc + coder build + coder tests green. Builds on `v2.7.6-agent-status-normalize`.
## v2.7.6-agent-status-normalize — 2026-06-01
The scoped half of `boocode_code_review_v2.md` §1 #10 — normalized external-agent status, surfaced from BooCoder's own dispatch observation (the heavier config-injection notify-hook, clean-room from superset's ELv2 `agent-setup`, is documented as the follow-on). The review's premise ("PTY agents have no status") had partly aged out — warm-ACP/opencode/SDK already carry working/done — so the real gap was that BooCoder never *published* a normalized per-`(chat,agent)` status (blocked-on-permission was invisible; crash/idle weren't pushed). Adds an `agent_status_updated` WS frame (`working|blocked|idle|error`, server+web parity) published from the dispatcher's turn boundaries across all four external paths (warm-acp/opencode/sdk/pty — `working` at start, `idle`/`error` at end) and the permission flow (`blocked` on request, `working` on resolve), best-effort so it never breaks a turn. A clean-room `normalizeAgentEvent` helper (superset's ~30-vendor-event → Start/blocked/Stop collapse, reimplemented with the event names as facts) ships now with 25 tests so the deferred notify-hook injection reuses it verbatim. The `AgentComposerBar` gains a normalized status dot (working=spinner, blocked=amber, idle=gray, error=red) distinct from the WS-liveness dot, fed by a `useAgentStatus` map `CoderPane` tracks per `(chat,agent)`. Built by two parallel agents (data plane + view plane) against a pinned frame contract; server 545 + coder 294 tests passing (25 new), web tsc + builds clean, ws-frames parity green. Clears the actionable review backlog (#1/#3/#4/#6#12). Builds on `v2.7.5-claude-sdk-sessionstore`; openspec `agent-status-normalize`.
## v2.7.5-claude-sdk-sessionstore — 2026-06-01
Lands the Claude Agent SDK direction (`boocode_code_review_v2.md` §1 #9, §6.2 "lean SDK") behind a flag. Adds `@anthropic-ai/claude-agent-sdk@0.3.159` (Commercial Terms — runtime dep, code reference-only) and builds a warm, resumable claude backend to supersede one-shot PTY dispatch — env-gated (`CLAUDE_SDK_BACKEND`, default off) so production claude stays on the unchanged PTY path until a host smoke. **Clean-room `PostgresSessionStore`** implements the SDK's real `SessionStore` type (`append`/`load`/`listSessions`/`delete`/`listSubkeys`) over a new `claude_session_entries` table — typechecked against the installed SDK type, 8 DB-integration tests. **`ClaudeSdkBackend`** (`implements AgentBackend`, mirroring warm-acp/opencode-server) drives one persistent `query()` per `(chat,'claude')` in streaming-input mode via a pushable async-iterable pump, with `sessionStore` + `resume` for cross-turn/cross-restart continuity, a pure `mapSdkMessage``AgentEvent` mapper, `session_id` captured from the `init` message, and `result.usage`/`total_cost_usd` accumulated onto `agent_sessions` (backend CHECK gains `'claude_sdk'`). Built against the REAL SDK 0.3.159 types after installing it — surfacing shapes a blind build would have missed (`SDKPartialAssistantMessage` is `type:'stream_event'` needing `includePartialMessages`; `SDKUserMessage.message` is `MessageParam`; the `SDKResultMessage` error arm). Also fixes a latent test-infra deadlock — three DB-integration suites applying the full schema in parallel under `DATABASE_URL` deadlocked, now serialized via `fileParallelism:false`. ~32 new tests (8 store + 10 mapper + 8 pushable + 6 routing); coder suite 269 passing default / 290 with DB; tsc clean against the SDK types; builds clean. **The live streaming pump + resume + an actual claude turn need a host smoke (`CLAUDE_SDK_BACKEND=1` + claude binary + ANTHROPIC auth) — cannot run from the dev container.** The zod peer-dep wants `^4` (workspace `3.25`) — watch at runtime. Builds on `v2.7.4-mistake-tracker-ledger`; openspec `claude-sdk-sessionstore`.
## v2.7.4-mistake-tracker-ledger — 2026-06-01
Two native-inference hardening features from `boocode_code_review_v2.md` §1 #12 (cline, algorithm-reimplemented). **MistakeTracker:** complements the doom-loop guard (identical repeats) and cap-hit (budget) by catching a run of consecutive tool *failures*. A new pure `mistake-tracker.ts` tracks heterogeneous failure kinds (`zod_reject`/`tool_not_found`/`exec_error`/`api_error`/`permission_denied`, surfaced per tool from `tool-phase.ts`); after 3 consecutive failures the `turn.ts` loop does a **soft nudge** — injects model-facing recovery guidance into the next step + drops a `mistake_recovery` UI sentinel + resets — then **escalates** to stopping the turn (cap-hit-style, with a Continue affordance) if it re-trips without an intervening success, so heterogeneous failures can't burn the whole step budget. **File-provenance ledger:** `compaction.ts` now derives a deterministic, sorted `## Files Read` list from the head messages' read-tool calls (`view_file`/`grep`/`find_files`/`list_dir`) and injects it into the rolling-summary prompt so file provenance survives compaction (no new table; prompt-driven merge, read-only since BooChat has no write tools). The `mistake_recovery` sentinel adds an arm to `MessageMetadata` in both server + web type copies plus a `MessageBubble` render branch. Built by two parallel agents (backend + frontend sentinel) over disjoint apps; server 545 tests passing (23 new: 12 mistake-tracker + 11 compaction), build + web tsc clean. Native-inference only (external agents run their own loops). Builds on `v2.7.3-sampling-streamjson-tokens`; openspec `mistake-tracker-file-ledger`.
## v2.7.3-sampling-streamjson-tokens — 2026-06-01
Three small BooCode wins from `boocode_code_review_v2.md` §1 #11/#7/#8. **Sampling knobs:** per-agent `top_n_sigma` + the `dry_*` repetition family (`dry_multiplier`/`dry_base`/`dry_allowed_length`/`dry_penalty_last_n`) are now first-class Agent frontmatter fields, parsed in `agents.ts` and threaded into the llama-swap chat-completion body via `providerOptions.openaiCompatible` (the `@ai-sdk/openai-compatible` extra-body channel). This surfaced and fixed a **latent bug**: `top_k` (rejected by the AI-SDK provider as unsupported) and `min_p` (never passed to `streamText` at all) had been dead on the wire — no agent's `top_k`/`min_p` ever affected sampling; both now route through the same channel, so agents that set them will start using them. `--reasoning-budget` is documented in `data/AGENTS.md` (already works via `llama_extra_args`, permitted by the deny-list validator). **Live PTY stream-json:** qwen/claude PTY dispatch sliced stdout opaque; a new `stream-json-parser.ts` line-buffers the Claude-Code-compatible NDJSON and emits text/reasoning/tool frames live as they arrive (mirroring the ACP/opencode paths) + persists the structured parts, with a clean fallback to the old opaque slice when output isn't NDJSON (claude now runs `--output-format stream-json --verbose`). **Token UI:** the per-`(chat,agent)` `agent_sessions.input_tokens`/`output_tokens`/`cost` columns (accumulated since `v2.6.8` but dropped by the read route + wire type) now flow through and render condensed beside the AgentComposerBar session chip. Built by three parallel agents over disjoint subsystems; server 523 + coder 245 tests passing (incl. 11 new stream-json-parser + new agent-parse tests), all builds + web tsc clean. Builds on `v2.7.2-checkpoint-idor`; openspec `sampling-streamjson-tokens`. The qwen-vs-claude `usage` field names in #7 are best-guess pending a live smoke.
## v2.7.2-checkpoint-idor — 2026-06-01
Closes two IDOR authorization holes in the `v2.7.1-write-edit-robustness` checkpoint routes, flagged by the automated push security review. The `GET /api/sessions/:id/checkpoints?chat_id=` list route scoped its `chat_id` branch by `chat_id` alone — any session's `chat_id` would read its checkpoints; it now joins through `chats` and gates on `chats.session_id` (authoritative; `checkpoints.session_id` is a nullable denormalized hint). The `restoreCheckpoint` scope guard was fail-open — `cp.session_id && cp.session_id !== sessionId` fell through whenever the checkpoint's denormalized `session_id` was null, allowing a cross-session restore (worktree reset + transcript trim) — it now resolves the owning session via the checkpoint's chat and denies on any missing-or-mismatched row. A DB-integration regression covers the exact null-`session_id` cross-session case. Real-world blast radius is small (BooCoder is single-user behind Authelia on loopback), but both are genuine authorization bugs. Coder suite 234 passing (7/7 checkpoint tests incl. the regression against live postgres+git), typecheck clean. Hotfix on `v2.7.1-write-edit-robustness`.
## v2.7.1-write-edit-robustness — 2026-06-01
Two BooCoder hardening features for local quantized models, algorithm-reimplemented (not vendored) from the cline findings in `boocode_code_review_v2.md` §1 #3/#4. **Fuzzy patch applier:** `edit_file`'s apply path was exact-`.includes`-or-throw + first-occurrence `.replace` (`pending_changes.ts`), so a qwen3.6 whitespace/indentation/unicode drift in `old_string` lost the edit; a new pure `fuzzy-match.ts` (`locateMatch`) now runs an exact → per-line-trim → unicode-canon (curly quotes/dashes/nbsp) → Levenshtein-≥0.66 ladder and returns the real file span, refusing multi-exact matches as ambiguous rather than silently editing the first. `applyOne`/`rewindOne` both use it. **Worktree checkpoints + conversation-trim:** `rewind` only reversed BooCode's own `pending_changes`, blind to what external agents (opencode/goose/qwen/claude) write directly into the session worktree — so a new `checkpoints` table + `checkpoints.ts` shadow-commit (tracked **and** untracked, captured via a temp-index `read-tree`/`add`/`write-tree`/`commit-tree` into a GC-safe `refs/boocode/checkpoints/<id>`) snapshots the worktree before each external-agent turn (hooked into all three dispatcher paths), anchored to the turn's assistant message. A new `POST /api/sessions/:id/checkpoints/:cid/restore` resets the worktree (`reset --hard` + `clean -fd`), trims the transcript past that message, and resets the `(chat,agent)` backend session so files, transcript, and agent context land consistent at the restore point; a per-message "Restore to here" affordance in `CoderMessageList` drives it. Built by three parallel agents over disjoint files; DB-integration testing caught a microsecond-`created_at` self-deletion bug in the later-checkpoint cleanup. Full coder suite 234 passing (incl. 17 fuzzy-match + 6 checkpoint tests), server+coder build + web tsc clean. Builds on `v2.7.0-mit`; openspec `write-edit-robustness`. Live host smoke (dispatcher hook + restore UI end-to-end) still to run.
## v2.7.0-mit — 2026-06-01
Relicenses BooCode from AGPL-3.0 back to MIT by clearing the three Unsloth-Studio-derived files the `v2.4.0`/`v2.4.1` lifts pulled in — the root `LICENSE` and all five `package.json` had been `AGPL-3.0-only`, making the network-served work AGPL §13-encumbered. The enabling finding decoupled the relicense from the long-planned native-llama-server-parsing retirement: `tool-call-parser.ts`'s Unsloth-ported algorithm (`parseToolCallsFromText`/`scanBalancedBraces` + unused nudge constants) was **dead code** with no production import, so it was simply deleted while the load-bearing `extractToolCallBlocks`/`stripToolMarkup` (BooCode-authored streaming helpers) were kept byte-identical — no behavior change to the live tool-call path. `html-to-md.ts` was swapped to the MIT `node-html-markdown` library (`parse5` dropped; the only behavior delta is column-aligned tables, GFM hard-break `<br>`, and `<ol start>` renumbering, all feeding the LLM via `web_fetch`), and `llama-args-validator.ts` was clean-room rewritten with the managed-flag denylist re-derived from the public llama-server flag list (facts, not copyrightable). The license flip set `LICENSE` to MIT (`Copyright (c) 2026 indifferentketchup`), the five `package.json` to `MIT`, removed every AGPL SPDX header, added a README License section, and added a `license-mit` guard test that fails if AGPL provenance returns. Built by three parallel agents over the disjoint files; full server suite 519 passing (incl. 9 new guard tests), server build + coder typecheck clean. Resolves `boocode_code_review_v2.md` §1 #1 / §5k and the roadmap's `License-debt` batch (openspec `license-debt-mit`); supersedes that batch's original staged plan, which had entangled the flip with a live qwen3.6 validation window.
## v2.6.11-close-hooks-staging — 2026-06-01
The two v2.6 follow-ups left after `v2.6.10-lifecycle-hardening`. **Server close-hook caller:** `apps/server` (BooChat) now fire-and-forgets BooCoder's Phase-3 close hooks so warm agent backends + worktrees tear down *immediately* on delete/archive instead of waiting for the idle-evict/reaper backstop — a new `coder-notify.ts` `notifyCoderClose(kind,id)` (reusing the v2.6.2 `BOOCODER_URL` reach, never-rejects) is `void`-called after the WS frame at session-delete (`POST /api/sessions/:id/close`) and chat archive / archive-all / delete (`POST /api/chats/:id/close`); an unreachable coder can never block or fail the user's delete/archive. **Staging-boundary hint (task 3.7):** the BooCoder DiffPanel now shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits — native boocode selected + external-agent-staged changes (or vice-versa) → "<agent>'s edits live in its worktree — BooCode won't see them until applied" — derived purely from the per-change `agent` + current provider, no new state. 6 new server tests (`coder-notify`), 537 server tests pass; web + server tsc/build clean. **With these the v2.6 openspec is fully closed** — only the live Smoke 2/2b/3 remain (manual exercise).
## v2.6.10-lifecycle-hardening — 2026-06-01
v2.6 Phase 3 (the last phase) — lifecycle hardening of the warm-process backends. **Idle eviction + LRU cap:** the agent pool runs a 60s sweep that evicts backends/sessions idle past `AGENT_POOL_IDLE_TTL_MS` (30 min default) and any beyond `AGENT_POOL_MAX_LIVE` (10, LRU) — **never a busy one** (in-flight turn, double-checked via a new `isBusy()` backend hook); the worktree persists (DB-backed) and the next turn re-spawns + reattaches. The eviction/LRU/restart decisions are factored into a pure `lifecycle-decisions.ts` (modeled on the inference `selectPruneTargets` pattern). **Crash recovery:** lifts openchamber's health-monitor + busy-aware-restart + consecutive-failure + stale-busy-grace state machine into `opencode-server.ts` (with port reclaim) and `warm-acp.ts` — an opencode server crash settles in-flight turns as failed, marks the rows `crashed`, and recreates fresh sessions (a fresh server can't hold the old in-memory id), while a warm-ACP child crash re-`session/new`s next turn; the F.1 turn-guard and U.6 usage are preserved (their tests still pass). **Worktree reaper:** a periodic reaper removes orphan on-disk worktrees (no live `worktrees` row, 1h grace) behind a superset-style preflight that skips dirty/unpushed/unmerged work, with Paseo-style soft-delete (`status='archived'`). Plus close hooks (`/api/chats/:id/close`, `/api/sessions/:id/close`, awaiting the apps/server caller) and diff re-baseline after `apply_pending`. Built test-first — 35 new tests (`lifecycle-decisions` 22, `agent-pool` 13) + a DB-opt-in reconnect integration test; 215 coder tests pass, tsc + build clean. **This completes v2.6** (Phase 03 + F.1 + Phase 1-UX). Remaining follow-ups (out of v2.6 scope): the apps/server close-hook caller, the 3.7 DiffPanel staging-boundary hint (frontend), and live Smoke 2/2b/3.
## v2.6.9-warm-acp — 2026-05-31
v2.6 Phase 2: goose and qwen now run as **warm ACP backends** instead of one-shot-per-task. A new `WarmAcpBackend` (`backends/warm-acp.ts`, implementing the same `AgentBackend` interface as the opencode warm server) holds one persistent `goose acp` / `qwen --acp` child + `ClientSideConnection` + ACP session per `(chat, agent)`, running `initialize` + `session/new` once and reusing the connection across turns; per-turn abort cancels the in-flight prompt (`session/cancel`) without killing the child, and a child exit marks `agent_sessions.status='crashed'` for re-spawn on the next turn. The dispatcher routes `goose`/`qwen` chat-tab tasks to the pooled warm backend via a pure `shouldUseWarmBackend(task)` predicate (warm only when both `session_id` and `chat_id` are set), keeping the one-shot `runExternalAgent` path as the fallback for session-less creators (arena, MCP, `new_task`); broker frames + `persistExternalAgentTurn` + the latest-wins `pending_changes` diff are identical to the opencode path. The `acp-dispatch.ts` `handleSessionUpdate` switch was extracted into a pure shared `acp-event-map.ts` mapper used by both the one-shot and warm paths (one-shot behavior byte-identical, all existing acp tests green). The design's `unstable_resumeSession` concern is resolved — the installed `@agentclientprotocol/sdk@^0.22.1` exposes stable `resumeSession`/`loadSession`, but resume is moot in the hot path (warm reuse needs none); cross-restart resume + idle eviction are deferred to Phase 3. Built test-first (15 new tests: `warm-acp-routing`, `acp-event-map`); 180 coder tests pass, tsc + build clean. **Smoke 2/2b (live two-message warm reuse + the opencode→boocode→opencode switch round-trip) to be run post-deploy.** Phase 3 (lifecycle hardening) is the last v2.6 phase.
## v2.6.8-agent-attribution — 2026-05-31
v2.6 Phase 1-UX: agent attribution + switch affordances over the already-shipped `pending_changes.agent` column and `agent_sessions` table (read+display, no new backend capability). **Backend:** `pending_changes.agent` is now stamped at every queue site (native write tools → `'boocode'`, dispatched external agents → the task's agent, manual RightRail create → `NULL`) and flows through `listPending`; a new `GET /api/sessions/:id/agent-sessions` route returns `[{agent,status,has_session,last_active_at}]` per `(chat,agent)` for the session's chats; and the opencode warm-server backend consumes opencode's `session.next.step.ended` events, accumulating `input_tokens`/`output_tokens`/`cost` onto the `agent_sessions` row (new columns, idempotent). **Frontend:** the BooCoder DiffPanel renders a per-row agent badge (provider icon + label; `null` → "manual") with a "Changes from X, Y" note when a pending set spans multiple agents, and the AgentComposerBar shows a resumed / history / new-session chip beside the Provider picker — gated on an optional `sessionId` prop so BooChat is unaffected — driven by a new `useAgentSessions` hook that refetches on message-complete; `providerIcon` was extracted to a shared `components/coder/providerIcons.tsx`. Built by three parallel subagents over disjoint file sets; web + coder typecheck clean, 165 coder tests pass (9 new across `opencode-usage` and `agent-sessions.routes`). U.6's persisted token totals are conversation-cumulative and not yet surfaced in the UI (deferred). Implements the U.1U.6 "remaining" plan from the v2.6 openspec reconciliation; Phase 2 (warm ACP goose/qwen) + Phase 3 (lifecycle hardening) remain.
## v2.6.7-interrupt-guard — 2026-05-31
Fixes a post-interrupt correctness bug in the `v2.6.1-phase1-opencode` warm-server backend, made one-click reachable by `v2.6.5-panes-tabs-composer`'s Send→Stop composer. `opencode-server.ts` settled an in-flight turn on opencode's `session.idle`/`session.error` by calling `activeTurn.settle()` on whatever turn currently held the session slot — but opencode emits one trailing terminal event for a *cancelled* turn after `client.session.abort()`, and those events carry only a `sessionID` (no turn id). So after the user hit Stop and immediately sent another message, the aborted turn's orphan `session.idle` settled the *new* turn early as success (Paseo hit and fixed the same class in `1d38aac`). The fix adds a small pure guard (`turn-guard.ts`: `armAbortGuard`/`noteTurnActivity`/`consumeTerminal` over a per-session `swallowNextTerminal` flag): abort arms it, the next terminal is swallowed once, and a new turn's first delta self-heals the flag so a never-arriving orphan can't strand a real turn. Implemented test-first — three regression tests in `turn-guard.test.ts` (swallow-the-orphan, settle-when-no-abort, self-heal); full coder suite green (156 passed). This is the F.1 "fix-next" item from the v2.6 openspec reconciliation; Phase 1-UX / Phase 2 / Phase 3 remain.
## v2.6.6-claude-md — 2026-05-31
Docs-only — CLAUDE.md session-learnings update, no code. Captures four recurring gotchas surfaced while shipping `v2.6.5-panes-tabs-composer`: (1) `sessions.workspace_panes` is now a `WorkspaceState` envelope (`panes` + `tabNumbers`/`nextTabNumber` + `closedPaneStack`), migrated from the legacy bare `WorkspacePane[]` on both frontend hydrate (`toWorkspaceState`) and the union-accepting server PATCH validator; (2) DB/session-aware tools take an optional `ToolExecCtx` (`{ sql, sessionId }`) 4th arg on `ToolDef.execute`, plumbed through the tool phase, with `read_tab_by_number` as the reference; (3) the two-schema-files-one-DB ownership split — `apps/coder/src/schema.sql` owns `agent_sessions`/`worktrees`/`pending_changes`/`available_agents` and extends `tasks`, distinct from BooChat's `apps/server/src/schema.sql` — plus the idempotent `confdeltype` FK-action-flip pattern (guard `ON DELETE` changes on `pg_constraint.confdeltype` so re-runs no-op); and (4) React StrictMode is on, so a `setState` called inside another `setState`'s updater double-fires in dev and must be made idempotent. Pairs with `v2.6.5-panes-tabs-composer`.
## v2.6.5-panes-tabs-composer — 2026-05-31
A workspace UX batch across BooChat panes, tabs, and the composer, plus the persistence model that backs them. **Panes & tabs:** a chat can be opened in a fresh pane (the ChatTabBar tab context menu's "Open in new pane", and the fork button — which now lands the fork beside the original via a new `open_chat_in_new_pane` event instead of replacing the active pane); the per-pane "+" became a New BooChat/BooTerm/BooCode menu; closing a chat pane relocates its tabs (in order) into the oldest chat/empty pane instead of discarding them, and reopen strips the restored chatIds from every live pane first so a relocated-then-reopened pane never duplicates a tab (no stack-shape change); each tab carries a stable session-scoped number assigned on open and retired on close (never reused), rendered map-keyed rather than positional. The per-message "Open in pane" artifact button was removed, and the empty/landing pane became a real session history — the session's open chats plus separately-fetched archived chats, click to open or restore-and-open. **Persistence:** `sessions.workspace_panes` was widened from a bare `WorkspacePane[]` to a `WorkspaceState` envelope (`panes` + `tabNumbers`/`nextTabNumber` + `closedPaneStack`) so tab numbers and the reopen stack survive reload; the PATCH validator accepts the legacy array or the envelope (zod union) and migrates on write, and the `session_workspace_updated` WS-frame schema was widened on both web and server (byte-identical, parity test green) — the same schema-drift class as `v2.6.4-agent-sessions-fk`. **Composer:** the send button morphs Send → Stop → Queue with generation state (BooCoder keys on `sending || activeTaskId`, which also corrected its queue gates and added `cancelTask`), the standalone "Stop generating" pill was folded into it, and pasted chips now trail the typed text so a leading slash command stays first. **Tooling:** adds the read-only `read_tab_by_number` tool — resolves a session-scoped tab number to its chat via the persisted `tabNumbers` map and returns that chat's transcript; tools gained an optional `ToolExecCtx` (`{ sql, sessionId }`) on `execute` to support DB-reading tools. Builds on `v2.6.4-agent-sessions-fk`.
## v2.6.4-agent-sessions-fk — 2026-05-31
Follow-up to `v2.6.3-chatkey-and-skills` (P1.5-b): the live `agent_sessions.session_id` foreign key is converged from `ON DELETE CASCADE` to `ON DELETE SET NULL`, matching the schema's stated intent. The P1.5-b re-key block re-adds `session_id_fkey` as `SET NULL`, but the whole block is guarded on `chat_id_fkey`'s absence — so a database already re-keyed to `(chat_id, agent)` while `session_id_fkey` was still `CASCADE` never re-enters it, leaving the live FK at `CASCADE` and diverging from both `worktree_id` (already `SET NULL`) and the `v2.6.3` changelog's own claim that `session_id` is informational `SET NULL`. The fix adds a standalone `confdeltype`-guarded `DO` block (mirroring the `session_worktrees` defang) that flips `session_id_fkey` `CASCADE → SET NULL` independently of the re-key gate; it is idempotent — fires only while the FK is still `'c'`, a no-op on a fresh deploy (already `'n'`) and on every re-run. The live DB was converged by hand with the identical statements, so `applySchema` and the hand-applied state match (`\d agent_sessions` now shows `session_id ... ON DELETE SET NULL`). Also bundles a CLAUDE.md doc-sync (committed separately): per-session SSE (P1.5-a) and the `(chat_id, agent)` re-key reflected in the engineering notes, the stale root `AGENTS.md` navigation pointer dropped, and new conventions for `data/AGENTS.md` parsing and the `data/skills/<vendor>/` layout.
## v2.6.3-chatkey-and-skills — 2026-05-31
Three threads. **agent_sessions re-keyed to `(chat_id, agent)` (P1.5-b):** the tab (a chat) is now the agent-context unit, so two opencode tabs in one BooCode session are two independent contexts that share one worktree. `chat_id` is threaded end-to-end — `tasks.chat_id` added, stamped by the coder message + skills routes from the frontend tab, read by `runOpenCodeServerTask` which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic `/api/tasks`) so `ensureSession` never receives a degenerate `(null, agent)` key. A new first-class `worktrees` table (one-per-session, survives session delete via `session_id ON DELETE SET NULL`) supersedes `session_worktrees`, which is defanged (CASCADE dropped, not yet removed); `agent_sessions.chat_id` CASCADEs from `chats` (closing a tab ends its context) while `worktree_id`/`session_id` are informational `SET NULL`. The migration is idempotent with a backfill-verify gate; the live re-key was applied against an empty table after the 35-chat test session `20d28876` was deleted (backed up first). This corrects and supersedes an earlier draft that wrongly keyed on `(worktree_id, agent)`; the delete-guard from `v2.6.2-delete-guard-and-sse` is repointed here from `session_worktrees` to `worktrees` (`worktree_path``path`). **dcp-strip cross-chunk fix:** the `<dcp-message-id>` tag streams split across SSE deltas, which the per-chunk strip from `v2.6.1-phase1-opencode` missed — a stateful `makeDcpStreamStripper` at the dispatcher boundary holds back partial-tag tails so neither live frames nor persisted content carry the tag (11 unit tests). **Agent-judgment skills:** `committing-changes` (segment by concern, stage explicitly, present-and-stop, never push) and `using-worktrees` (the when-to-isolate heuristic, autonomous-when-clear vs committing's command-gate) land in `data/skills/boocode/` with eval.yamls, plus a parser-safe `data/AGENTS.md` preamble pointing at both.
## v2.6.2-delete-guard-and-sse — 2026-05-30
Two coder-side batches under one tag. **Session-delete work-loss guard:** deleting a BooChat session CASCADE-wipes its `session_worktrees` row, which would silently orphan uncommitted/unpushed/unmerged work — so the server's `DELETE /api/sessions/:id` now gates before the delete. It reads `session_worktrees` from the shared DB first (no row → chat-only session → delete immediately, zero round-trip), and for worktree-backed sessions calls a new BooCoder endpoint (`/worktree-risk`) that runs git on the host, since the container can't see `/tmp/booworktrees` — only the host systemd service can. `checkWorktreeWorkAtRisk` reports dirty/unpushed/unmerged via the audited `hostExec`+`shellEscape` path, default branch detected from `refs/remotes/origin/HEAD` (never the worktree's own branch, never hardcoded); any at-risk worktree returns 409 with per-worktree `RiskReport[]`, `force=true` bypasses, and the check is fail-closed (BooCoder unreachable also blocks — force still escapes). The sidebar renders a block dialog distinguishing work-at-risk (Commit/Stash/Force; stash uses `-u` and re-blocks on remaining commits) from couldn't-verify (Cancel/Force), and Commit never auto-commits. A follow-up fix gates the `unpushed` arm behind an actual upstream (`atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0)`) so the no-upstream `session-<id>` branches stop flagging every pristine worktree-backed session — no protection lost, since real local work always also surfaces as `unmerged > 0`. **Per-session SSE (P1.5-a):** replaces the single global SSE loop scoped to the most-recent worktree directory — the known limit flagged in `v2.6.1-phase1-opencode` — with one `event.subscribe({directory})` per live opencode session, so sessions in different worktrees stream concurrently instead of the second silently dropping the first's events. Each session owns an `AbortController` wired into `subscribe(…, {signal})`, which also fixes a latent Phase-1 bug where switching directories left the old loop parked forever in its `for await` (zombie loops); a `sessionID` demux guard drops cross-session events so two sessions sharing a worktree (possible after P1.5-b) don't double-process deltas. The opencode SDK was confirmed to open an independent SSE connection per `subscribe()` call, so N concurrent dir-scoped streams are supported.
## v2.6.1-phase1-opencode — 2026-05-30
v2.6 Phase 1: opencode runs as a warm HTTP server (`apps/coder/src/services/backends/opencode-server.ts`) — one `opencode serve` per BooCoder process, one opencode session per BooCode session resumed across turns via the new `agent_sessions` table, with a single SSE read loop, reasoning dedup ported from Paseo, an inactivity watchdog, and a stale-session guard (crashed-not-resumed + a `config_hash` fingerprint over `opencode_server|<model>`, deliberately excluding the ephemeral server port so cross-restart resume survives). Builds on the `v2.6.0-phase0-foundations` schema/interface scaffold. The batch's hard-won fixes: opencode streams `session.next.*` events (not `message.part.*`), and `event.subscribe()` must pass the session's worktree `directory` or events route to the server CWD and turns come back empty; model strings must be `llama-swap/`-prefixed and present in opencode's own config, with `agent-probe` now populating `available_agents.models` via `mergeLlamaSwap` so the frontend stops sending an empty model; `session_worktrees`/`agent_sessions` FKs are `ON DELETE CASCADE` so session deletion no longer 500s. Also bundled: dcp-message-id tag stripping from opencode text output, a reopen-closed-pane control, the `[+]`/split-pane button separation, auto-name using the session's loaded model, and a `systematic-debugging` slash command. Smoke 1 verified end-to-end (two turns, session reuse, turn 2 ~9x faster). Known Phase 1 limit: one SSE stream scoped to the most-recent session's directory — concurrent opencode sessions in different worktrees collide (warns; per-session SSE is Phase 2).
## v2.5.15-acp-path-guard — 2026-05-29
Security fix + repo hygiene. Fixes a path-traversal in the ACP filesystem bridge (`acp-client-fs.ts`, flagged by the automated push security review): the worktree guard used an unbounded `startsWith(resolve(worktreePath))`, so a sibling path sharing the worktree as a string prefix (`<worktree>-evil/…`) escaped the scope — and `writeWorktreeTextFile` writes to disk directly (no `pending_changes` gate), so a confused/buggy ACP agent could write outside its worktree. Now uses a separator-bounded check matching `write_guard.ts` (`resolve()` + `startsWith(root + sep)` / `=== root`) via a shared `resolveInWorktree`, with a regression test covering `../` traversal and the sibling-prefix bug. Symlink-swap/`O_NOFOLLOW` hardening was intentionally skipped — consistent with `write_guard`'s no-realpath stance, and the agent already runs with host FS access so this is a containment guard, not a trust boundary. Separately, stops tracking the live `data/coder-providers.json` (it's runtime config the UI reads *and writes* on provider toggles, which churned `git status`) — it's now gitignored with a tracked `data/coder-providers.example.json` reference; the loader falls back to built-ins-only when the live file is absent. The provider-type duplication (coder ↔ web) stays guarded by the existing text-identity `provider-types-parity.test.ts` — a shared package was considered and declined (drift is already prevented; not worth the Docker/build-order risk at solo scale).
## v2.5.14-claude-md — 2026-05-29
Docs-only — CLAUDE.md session-learnings update, no code. Adds gotchas surfaced while shipping the v2.3 provider-lifecycle batch: the host `boocoder.service` keeps running the old process after `pnpm -C apps/coder build` (stale-process tell = new routes 404 while old routes 200, restart don't re-debug); the `boocode` container `build: .` deploys the working tree, so web edits are live on the Vite dev server but not production until `docker compose up --build -d boocode`; `PATCH /api/providers/config` replaces a provider's override wholesale (send `{...existing, enabled}` or a custom ACP entry's command is wiped) and `data/coder-providers.json` is live config not to be committed as code; external agents dispatch one-shot with no context/token tracking (only native `boocode` tracks ctx; OpenCode-as-server is the unshipped `v2-6-persistent-agent-sessions` plan); the `ui/` primitive inventory with `button role=switch` / Dialog fallbacks for the absent switch/sheet; and the mobile Dialog-with-list scroll-containment recipe. Also backfills previously-uncommitted doc bullets for the `v2.5.7``v2.5.11` coder work (provider-type parity test, async ACP command discovery, AgentComposerBar `installed` filter, provider-registry path disambiguation).
## v2.5.13-provider-lifecycle-phase5 — 2026-05-29
Closeout of the v2.3 provider-lifecycle batch — the web UI (Phase 5) plus docs (Phase 6). Provider management moved into **Settings → Providers**: a tab listing every registered provider with a status badge (Available / Disabled / Not installed / Error / Loading), an enable/disable toggle, a per-provider refresh, and a plaintext diagnostic; toggling sends the provider's *full* override (preserving a custom ACP entry's command under the wholesale-replace PATCH merge) then refetches the snapshot. The composer's provider picker now filters to `enabled && (status === 'ready' || 'loading')`, so disabled and unavailable providers drop out of the picker and are managed only in settings (native `boocode` always shows). A curated ACP catalog (`apps/web/src/data/acp-provider-catalog.ts`) + `AddProviderModal` register custom providers via `PATCH /api/providers/config` then a subset refresh, and the web client gained `getProvidersConfig` / `patchProvidersConfig` / `refreshProviders` / `getProviderDiagnostic`. Two mobile fixes ship alongside: the Settings pane is now reachable on phones (opening it pushes `?pane=` atomically so the mobile URL-sync effect keeps it active instead of snapping back to the chat pane), and the Add-provider modal caps to the viewport with a single `overscroll-contain` scroll region so the list scrolls instead of dragging the whole modal. This completes the arc begun in `v2.5.4-provider-lifecycle-phase1` (config-backed registry over the built-ins) → `v2.5.5-provider-lifecycle-phase2` (loading/unavailable snapshot lifecycle + tier-2 probe TTL gate) → `v2.5.6-provider-lifecycle-phase3` (generic `resolveLaunchSpec` ACP dispatch) → `v2.5.12-provider-lifecycle-phase4` (config GET/PATCH, subset refresh, diagnostic HTTP API). Docs landed in `BOOCODER.md` (config file, refresh contract, enable/disable, custom ACP, the honest subset-refresh known limitation) and `docs/DEFERRED-WORK.md` §2 is marked addressed; the remaining Tier-2 follow-ups (WS `provider_snapshot_updated` frame, `available_agents.enabled` column, shared types package, MCP provider tools) stay deferred.
## v2.5.12-provider-lifecycle-phase4 — 2026-05-29
Phase 4 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §6): the HTTP API to read, patch, refresh, and diagnose providers. `routes/providers.ts` gains `GET /api/providers/config` (the raw loaded `CoderProvidersFile`), `PATCH /api/providers/config` (a partial providers map — an id's override object is replaced wholesale, a `null` value deletes it), an optional `{ providers?: string[] }` body on `POST /api/providers/refresh` (the `refreshed` count reflects the requested subset; the force probe itself still covers all installed providers, since per-provider force is a snapshot-internal change left to a later phase), and `GET /api/providers/:id/diagnostic` returning JSON `{ diagnostic: string }` — a read-only report (resolved def, install_path, last_probed_at, enabled, `which` availability, last cached probe error) with no probe spawn. PATCH correctness is the whole story: the order is validate→save→reload→clear, a malformed body or an invalid merged config returns 422 without writing the file, and a `save()` failure returns 500 without reloading the registry or clearing the snapshot cache, so on-disk and in-memory state can never diverge. New pure `mergeProviderConfigPatch` + `ProviderConfigPatchSchema` in `provider-config.ts`, a read-only `peekSnapshotEntry` cache accessor (source of the diagnostic's last-error — no probe/cache logic change), and a new `provider-diagnostic.ts` formatter. The web client gains `api.coder.getProvidersConfig` / `patchProvidersConfig` / `refreshProviders(providers?)` / `getProviderDiagnostic`, with mirrored `ProviderOverride` / `CoderProvidersFile` / `ProviderConfigPatch` types; the existing `/api/coder/*` proxy blanket-forwards the new routes with no change. +28 tests (134 coder total: pure merge/validate, the diagnostic formatter, and `app.inject` route tests proving the 422-no-write and save-fail-no-divergence guards). The diagnostic returns JSON rather than the §8 plaintext so it flows through the JSON `request` client helper (reconciling design §6.4's `{ diagnostic }` with §8's string report). No UI (Phase 5). Builds on `v2.5.6-provider-lifecycle-phase3`.
## v2.5.11-claude-skill-discovery — 2026-05-29
Surface Claude Code's real enabled commands + plugin skills in the coder slash menu, with icons separating commands from plugin skills. New `claude-command-discovery.ts` reads (user-global scope) `~/.claude/commands/*.md` plus every enabled plugin in `~/.claude/settings.json:enabledPlugins` — each plugin's user-scope install path contributes `skills/<name>/SKILL.md` (kind `skill`) and `commands/*.md` (kind `command`), parsed from frontmatter, bare names, deduped. The snapshot's claude branch discovers these **live** (claude is PTY, no ACP probe; the snapshot cache rate-limits the fs reads). The `/` menu now renders up to three icon'd groups: **`<agent> commands`** (Terminal), **`<agent> skills`** (Puzzle — claude's plugin skills / opencode is all commands), and **BooCoder skills** (Sparkles), via a new optional `icon` on `SlashCommandGroup`. `AgentCommand` gains a `kind` field, added identically to the coder and web copies (the `provider-types-parity` test enforces it); `mergeCommandsByName` is now generic so it preserves the tag. Invocation is unchanged — picking a claude command/skill sends `/name` to claude (PTY), which executes it. Project-local plugins + `<cwd>/.claude/commands` deferred. BooChat unaffected (flat skills). Smoke-test the claude skill slash-execution on the host.
## v2.5.10-opencode-live-commands — 2026-05-29
Surface opencode's real (live ACP) command set in the coder slash menu without needing a dispatch. Two fixes: (1) the cold ACP probe (`acp-probe.ts`) captured `available_commands` but read `probedCommands` synchronously right after `newSession` — racing opencode's async `available_commands_update` notification, so it captured **zero** and only the 7-item static manifest showed. The probe now waits briefly (poll up to 3s for the first batch + a 300ms settle, capped under the 30s probe timeout) so the commands are actually captured. (2) Captured commands are persisted to a new `available_agents.commands` JSONB column and served (merged with the manifest) on the tier-2-probe-skip path, so the agent's discovered commands survive once the model list is warm and show without a dispatch. Boot warms this via the `force: true` startup snapshot. apps/coder only (probe + schema + snapshot). Caveat: depends on opencode emitting `available_commands_update` on session creation rather than only after a prompt — to be confirmed on the host. Claude (PTY) disk/plugin discovery deferred.
## v2.5.9-agent-slash-commands — 2026-05-29
Segmented per-agent slash menu in the coder pane, plus cross-agent skills. The `/` menu now shows two labeled groups — **the active agent's commands first** (opencode/claude/qwen manifest + live ACP `available_commands`), **BooCoder skills second** — instead of always showing BooCoder's skills regardless of provider. `SlashCommandPicker` gains an opt-in `groups` prop (the flat `items` path is unchanged, so **BooChat's menu is byte-identical** — parity verified: no BooChat caller passes the grouped prop, and the skills lookup / invocation routing are untouched); `ChatInput` takes `slashGroups`; `CoderPane` builds the groups from the selected provider's commands + skills. Skills now **run under the selected agent**: the coder `skill_invoke` route accepts a `provider` and, when external, injects the server-side skill body into a dispatched task (instead of native inference) — so a skill like brainstorming executes through opencode/claude with the body kept server-side, mirroring the messages-route external dispatch. Also folds in the earlier initial-chat fix: invoking a skill on the landing chat now runs the same create-chat → assign-to-pane → invoke transition as a text send (`handleLandingSkill`) rather than invoking invisibly without a pane transition (the blank-screen repro). Web tsc + coder build clean.
## v2.5.8-mobile-composer-row — 2026-05-29
Mobile fix for the `AgentComposerBar`: the refresh button was wrapping to a second line. Root cause was layout order, not width — the status dot carried `ml-auto` (pinned to the far-right edge) and the refresh button followed it in DOM order, so it overflowed and wrapped. The dot + refresh are now one right-aligned (`ml-auto`) unit, keeping the refresh on the top line. Additionally, `CompactPicker` gained an `iconOnly` option and the Mode (permission) picker now renders icon-only on mobile (shield + chevron, no "Bypass"/"Plan" text label; `aria-label`/`title` and the tap-to-open list still convey the value) to free row width. Desktop is unchanged (full labels). Web-only change.
## v2.5.7-claude-models-and-picker-fix — 2026-05-29
Two provider-layer changes. **(1) Fix the empty provider picker** — a regression from `v2.5.5` (Phase 2): on a cache miss `getProviderSnapshot` returned synchronous `installed:false` `loading` entries, which `AgentComposerBar` filters out (`e.installed && e.status !== 'error'`); with the client-side poll deferred to Phase 5, a single fetch landed on `loading` forever and no providers appeared. `getProviderSnapshot` now awaits the build and returns terminal entries (the sync `loading` return is deferred until Phase 5 ships the poll); builds stay fast via the tier-2 cold-probe skip. **(2) Claude models** — the list was a hardcoded 2-entry static list (Opus 4 / Sonnet 4, May 2025), and the v2.3 config schema's `models`/`additionalModels` were parsed but never wired. `buildResolvedRegistry` now carries config `models` (replace) + `additionalModels` (merge) onto `ResolvedProviderDef`, and `provider-snapshot` applies them to every ready model list — so `/data/coder-providers.json` can add or replace any provider's models with no code change. Claude `staticModels` bumped to `opus`/`sonnet`/`haiku` latest-aliases plus pinned `claude-opus-4-8` / `claude-sonnet-4-6` / `claude-haiku-4-5-20251001` (passed verbatim to `claude --model`; the CLI accepts both aliases and pinned full names). +2 unit tests (109 total). Builds on `v2.5.6-provider-lifecycle-phase3`.
## v2.5.6-provider-lifecycle-phase3 — 2026-05-29
Phase 3 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §5): generic ACP dispatch. `acp-spawn.ts` gains `resolveLaunchSpec(resolved, installPath)` — it consults the resolved registry's `launchCommand` (a config override or a custom-ACP entry's command) first, falling back to the kept `resolveAcpSpawnArgs` switch for built-ins. `acp-dispatch.ts` now spawns `spec.binary`/`spec.args` with `env: { ...process.env, ...spec.env }` instead of the hardcoded per-name argv, and `dispatcher.ts` loads the resolved def by `task.agent` and passes it through. This lets config-defined custom ACP providers dispatch with no new switch case. Built-in dispatch (claude/opencode/goose/qwen) is **byte-identical** to pre-v2.3 — proven by a regression test asserting opencode→`['acp']`, goose→`['acp']`, qwen→`['--acp']`, binary=`installPath ?? id`, and empty config env → plain `process.env`. One deliberate deviation from the spec's literal `!installPath → null`: the `installPath ?? id` fallback is preserved so a missing install path still spawns the bare agent name as before. `setSessionMode`/permission/streaming and the dispatcher poll/NOTIFY/running-guard are untouched. 7 new `acp-spawn.test.ts` cases. No routes/UI (Phase 4+). Builds on `v2.5.5-provider-lifecycle-phase2`.
## v2.5.5-provider-lifecycle-phase2 — 2026-05-29
Phase 2 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §4). `provider-snapshot.ts` stops returning `null` for uninstalled/disabled providers — it now emits one entry per registered provider with a lifecycle status (`loading | ready | unavailable | error`), an `enabled` flag, and a two-tier probe. Tier-1 is a fast `which`-style availability check (`command-availability.ts`, `execFile`/no-shell); tier-2 — the 530s cold ACP probe — is now SKIPPED unless forced (`POST /refresh`), the `available_agents.last_probed_at` row is older than `PROVIDER_PROBE_TTL_MS` (24h default), or the DB model list is empty, which kills snapshot latency on warm reads. A cache miss returns `status:'loading'` synchronously while the build settles in the background (client polling is deferred to Phase 5). `ProviderSnapshotStatus`/`ProviderSnapshotEntry` regained `loading`/`unavailable` and gained `enabled`, `description?`, `fetchedAt?` in both the coder and web copies, guarded by a runtime parity test (`provider-types-parity.test.ts`, mirroring the `ws-frames.test.ts` convention) that fails on any field drift — a compile-time cross-project assignability check was attempted first but blocked by TS6307 (web is a composite tsconfig project). Also tracks the previously-gitignored `data/coder-providers.json` seed via a `.gitignore` exception, completing the Phase 1 config file. No dispatch/route/UI changes (Phase 3+); AgentComposerBar filtering unchanged. Builds on `v2.5.4-provider-lifecycle-phase1`.
## v2.5.4-provider-lifecycle-phase1 — 2026-05-29
Phase 1 of the v2.3 provider-lifecycle batch (`openspec/changes/v2-3-provider-lifecycle/design.md` §23): a config-backed provider layer merged over the hardcoded built-ins, with no runtime change when no config file exists. Adds `CODER_PROVIDERS_PATH` (default `/data/coder-providers.json`); `provider-config.ts` (Zod `ProviderOverride`/`CoderProvidersFile` schemas + a loader that never throws at startup — a missing file, invalid JSON, or schema mismatch all fall back to built-ins-only — plus `save` for the Phase 4 PATCH route); and `provider-config-registry.ts` (`ResolvedProviderDef` + `buildResolvedRegistry` merge: built-in overrides, custom `extends:'acp'` entries requiring label+command, `boocode` always enabled, plus a module singleton). `agent-probe.ts` now iterates the resolved registry instead of the hardcoded list — custom ACP entries resolve their binary from `command[0]` via `execFile` (no shell), disabled providers skip probing without losing their row, and `enabled` is read from memory only (no DB column this phase). Six unit tests, including a regression proving an empty config yields exactly the built-ins. No snapshot/dispatch/route/UI changes (Phase 2+). The `data/coder-providers.json` seed exists on disk but is gitignored (`data/*`). Lands on top of `v2.5.3-remove-cursor-copilot`.
## v2.5.3-remove-cursor-copilot — 2026-05-29
Retire the cursor and copilot providers from BooCoder entirely. Removes their `acp-spawn` argv cases, `provider-manifest` mode blocks + manifest keys, `provider-commands` command maps, the `provider-snapshot` cursor model-CLI branch (and the now-orphaned `exec`/`promisify` imports), and the `agent-probe` copilot ACP-detect branch; deletes the dead `cursor-models.ts` module and its test. The `PROVIDERS` registry array already lacked both entries, so only the doc comment needed correcting. Built-ins unchanged: claude, opencode, goose, qwen, native boocode. Standalone cleanup; pairs with `v2.5.4-provider-lifecycle-phase1` which builds on it.
## v2.5.2-coder-ux-fixes — 2026-05-29
Working-tree checkpoint bundling this session's fixes with in-progress coder UI work. This session: the BooCoder dispatcher now reacts to new tasks immediately via a Postgres `LISTEN/NOTIFY` (`tasks_new`) AFTER INSERT trigger, with the poll loop kept at 2s as a missed-notification fallback (`dispatcher.ts`, `apps/coder/src/schema.sql`); the mobile nav drawer no longer sticks open after returning to a backgrounded tab — `useViewport` re-syncs on `pageshow`/`visibilitychange`/`resize`/`orientationchange` (iOS reported a stale width on bfcache restore, leaving `isMobile=false`); assistant reasoning renders as a collapsible "Thinking" block in `MessageBubble`, surfacing ACP `agent_thought_chunk` from opencode/goose/qwen and native `reasoning_parts`; paste-to-chip inserts pasted text verbatim instead of wrapping it in a code fence; and a "New file from pasted text" affordance in the RightRail browser queues a `pending_changes` create through the new `POST /api/sessions/:id/pending/create` endpoint, paired with a fix repointing the DiffPanel's dead approve/reject calls to the real `/api/pending/:id/apply` and `/reject` routes. Also carried in the tree but not authored this session: the CoderPane `ChatInput` migration and `AgentComposerBar` refinements, plus backend tweaks to `auto_name`, inference `tool-phase`/`turn`, `secret_guard`, and `provider-registry`. Ships the `v2-6-persistent-agent-sessions` openspec proposal/design/tasks (free agent-switching with per-agent memory, opencode-as-server) as planning docs only — the feature is unimplemented and reserves the `v2.6.0` tag for it. Build green across server/coder/web; server suite 531 passing. (CHANGELOG note: the v2.3v2.5.1 entries were never backfilled and remain absent above.)
## v2.2.2-xml-placeholder-reject — 2026-05-26
Reject placeholder XML tool args at parse time in `extractToolCallBlocks` (`xml-parser.ts`). Drops calls when any string arg is `...`, empty/whitespace, `<path>`, `<file>`, `placeholder`, or angle-bracket sentinels; appends the raw XML block to flushed prose instead of silently deleting it. Fixes qwen3.6 answer-then-spurious-tools tail that caused duplicate assistant rows (full answer + failed `xml_call_*` tools + regenerated answer). Four new tests in `xml-parser.test.ts`. Known nit: rejection logs via `console.debug` instead of pino — filed in `docs/DEFERRED-WORK.md` §6 for a later cleanup.

175
CLAUDE.md
View File

@@ -2,11 +2,11 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
**Cursor agents:** start with `AGENTS.md` (navigation) and `docs/ARCHITECTURE.md` (diagram). This file is the deep engineering reference.
**Cursor agents:** start with `docs/ARCHITECTURE.md` (diagram); this file is the deep engineering reference. `data/AGENTS.md` is the agent *registry*, not navigation (the root navigation `AGENTS.md` was removed).
## What is BooCode
Self-hosted single-user developer chat app. AI assistant with read-only file tools (view_file, list_dir, grep, find_files) running against a local llama-swap inference server. Sessions organized by project, with a multi-pane workspace (chat + file browser side by side).
Self-hosted single-user developer chat app. AI assistant with read-only file tools (view_file, list_dir, grep, find_files) against a local llama-swap inference server. Sessions organized by project, multi-pane workspace (chat + file browser side by side).
Plus `apps/booterm` (second container, port 9501, bookworm-slim+glibc): Fastify + node-pty + tmux. Browser terminal panes WS to `/ws/term/sessions/:sid/panes/:pid`; per-session tmux session `bc-<sid>`, per-pane window `term-<pid>`. Shells drop privs to samkintop via `gosu` in `tmux.conf` default-command.
@@ -23,90 +23,33 @@ pnpm -C apps/server build # server only (tsc + copy schema.sql)
pnpm -C apps/web build # web only (vite)
# Type checking (no emit)
npx tsc --noEmit # project references (root)
npx tsc -p apps/web/tsconfig.app.json --noEmit # web app specifically
# IMPORTANT: root tsc --noEmit uses project references and can miss errors
# that the per-app tsconfig catches. Always verify with the per-app command
# when editing web code. The server build (pnpm -C apps/server build) is
# authoritative for server code.
# Per-app is authoritative. There is NO root tsconfig.json (only tsconfig.base.json),
# so a bare `npx tsc --noEmit` at root compiles nothing.
npx tsc -p apps/web/tsconfig.app.json --noEmit # web (authoritative)
pnpm -C apps/server build # server typecheck (tsc + copy schema)
pnpm -C apps/coder build # coder typecheck
pnpm -C apps/booterm typecheck # booterm typecheck
# Production
docker compose build --no-cache boocode && docker compose up -d
```
Tests: `pnpm -C apps/server test` runs the vitest suite. No test harness on `apps/web` (adding it requires installing vitest as a new devDep). Vitest pinned to `^3` because Vite 5 / vitest 4 are incompatible. No linters configured. Vitest include glob is `src/**/__tests__/**/*.test.ts` (see `apps/server/vitest.config.ts`) — tests outside `src/**/__tests__/` silently won't run; match the per-domain convention (`apps/server/src/services/__tests__/foo.test.ts`).
Tests: `pnpm -C apps/server test` (vitest); `apps/coder` has its own suite — `pnpm -C apps/coder test` (`globals:false`, so import `describe`/`it`/`expect` from `vitest`). No `apps/web` test harness, no linters. Vitest pinned to `^3` (Vite 5 / vitest 4 incompatible). Include glob is `src/**/__tests__/**/*.test.ts` — tests outside it silently won't run. Extract pure helpers to unit-test (`backends/turn-guard.ts`, `lifecycle-decisions.ts` are the pattern).
## Architecture
**Monorepo**: pnpm workspaces with `apps/server` (Fastify + postgres), `apps/web` (React + Vite), and `apps/booterm` (Fastify + node-pty + tmux).
**Monorepo**: pnpm workspaces with `apps/server` (Fastify + postgres), `apps/web` (React + Vite), `apps/booterm` (Fastify + node-pty + tmux), `apps/coder` (BooCoder, host service), `packages/contracts` (`@boocode/contracts`, cross-app wire-contract SSOT — builds FIRST).
### Server (`apps/server/src/`)
### Per-app deep references
- **Fastify** with `@fastify/websocket` and `@fastify/static` (serves built frontend)
- **postgres** (porsager/postgres) with tagged-template SQL — no ORM. Schema in `schema.sql`, applied on startup. LSP may false-positive on `sql<Type[]>\`...\`` generics; CLI `tsc` / `pnpm build` is authoritative.
- **Zod** for request validation and config parsing.
Detailed engineering notes live in per-app `CLAUDE.md` files, **auto-loaded when you read/edit files in that subtree** (and worth opening before non-trivial work there):
Key services:
- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn / runInference / createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`, `MAX_STEPS`), `stream-phase.ts` (streamCompletion as a v1.13.1-A AI SDK adapter + executeStreamPhase), `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap), `tool-phase.ts` (executeToolPhase → returns `ToolPhaseResult`; no longer recurses into runAssistantTurn — v1.14.0 converted the recursion to an explicit while loop in turn.ts), `sentinel-summaries.ts` (runCapHitSummary + runDoomLoopSummary + runStepCapSummary + their sentinel inserters), `error-handler.ts` (handleAbortOrError, finalizeCompletion), `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`), `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`, sentinel predicates), `budget.ts` (resolveToolBudget), `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls), `parts.ts` (parts-table write helpers: `partsFromAssistantMessage`, `partsFromToolMessage`, `insertParts` — v1.13.20 made parts the sole source of truth), `prune.ts` (v1.13.4 two-tier compaction; `selectPruneTargets` is the pure decision helper), `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope populated from loop locals each iteration; reset in `runInference` at user-message boundary. The outer loop in `runAssistantTurn` (v1.14.0) runs `while (stepNumber < effectiveCap)` where `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS=200)`. Per-agent `steps:` field in AGENTS.md frontmatter. `steps: 0` means text-only (no tool execution). Step-cap hit writes a `cap_hit` sentinel so `CapHitSentinel.tsx` renders it.
- **AI SDK v6 streamCompletion adapter** (v1.13.1-A; `services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer above (executeStreamPhase, finalize, dual-write) is shape-preserved via an adapter. Five gotchas the LSP/test suite won't catch:
- **Abort signals are swallowed.** `streamText`'s `fullStream` iterator exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required; without it the row finalizes as `complete` instead of `cancelled`. Comment in stream-phase.ts pins this; don't refactor it away.
- **Usage lands only at stream end** via `await result.usage` (`inputTokens` / `outputTokens` v6 names → mapped to `promptTokens` / `completionTokens` for the existing onUsage callback). Mid-stream live tok/s is gone vs v1.12.2; ChatThroughput shows a single value at stream end.
- **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop. Only `description` + `inputSchema: jsonSchema(parameters)` — surfacing tool-call parts via `fullStream` and stopping is what we want.
- **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `services/inference/provider.ts`. The adapter defaults it false, omitting `stream_options.include_usage` from the request body; llama-swap then never emits the usage block and `result.usage.inputTokens/outputTokens` resolve to `undefined`. Latent regression from v1.13.1-A through v1.13.7 — every assistant row in that window has `tokens_used`/`ctx_used` NULL. Don't remove this flag during refactor.
- **Tool-call-only turns may emit a leading `\n` text-delta** as the assistant content. `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check — otherwise whitespace-only content renders an empty bubble + ActionRow between every tool call (v1.13.7 fix). `payload.ts:buildMessagesPayload` also skips `status='failed'` AND complete-but-empty (no content, no tool_calls) assistant rows to avoid "Cannot have 2 or more assistant messages at the end of the list" upstream rejections after cap-hit + Continue.
- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart` — BooCode's OpenAI-shape history doesn't carry it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` matching the v6 `ToolResultOutput` union. Assistant messages with reasoning emit a `ReasoningPart` first in the content array (v1.13.1-C).
- **`experimental_repairToolCall`** (v1.13.3) wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through implementation — logs the bad call and returns it unmodified; `executeToolPhase`'s existing zod-reject error path routes it to the model on the next turn.
- **`chat_status` frame shape** (published via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'` (widened from `working|idle|error` in v1.12.1). Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders inline beside `StatusDot` only when streaming or tool_running, fed by 500ms-throttled `'usage'` WS frames (`completion_tokens` + `ctx_used` + `ctx_max`). The `POST /api/chats/:id/discard_stale` endpoint exists to mark a stuck-streaming row as `failed` when the frontend's 60s no-token-activity timer (`ChatPane` content-length watcher) gives up.
- **Boot-time stale-streaming sweep** in `apps/server/src/index.ts` after `applySchema()`: any `messages.status='streaming'` older than 5 minutes flips to `'failed'`. Logs only on non-zero count. Recovers from container restart while inference was mid-stream (v1.12.1).
- **Periodic 60s sweeper** in `apps/server/src/index.ts` (v1.13.3 + v1.13.5). Same `setInterval` runs `sweepStaleStreaming` (marks `messages.status='streaming'` older than 5 min as `failed`, publishes `chat_status='idle'` so the UI dot drops) and `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `app.addHook('onClose')` clears the timer. No-op when nothing to reap.
- **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart. v1.13.11: every WS publish goes through `broker.publishFrame(sessionId, frame)` or `broker.publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). `ctx.publish` / `ctx.publishUser` in inference + auto_name route through the index.ts adapter that calls publishFrame internally. The schema is duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; a `ws-frames.test.ts` case enforces parity. Don't add new raw `broker.publish()` / `publishUser()` calls.
- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) go through three guard layers: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). v1.11.8+ web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (resolved with `project.default_web_search_enabled` fallback) and filtered out of the LLM's tool schema when false. v1.13.5 truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs at `BOOCODE_TRUNCATION_DIR` (default `/tmp/boocode-truncations`, 0o700) keyed by an opaque `tr_<12 base32 chars>` id, and the `view_truncated_output(id)` tool retrieves it. 5MB cap (matches `view_file`'s `MAX_FILE_BYTES`), 7-day TTL, reaped by the periodic sweeper. Tmpfs path means container restart loses retrieval — acceptable, the model usually has moved on.
- **`services/compaction.ts`** + **`services/model-context.ts`** — v1.11.0 anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself on each compaction). Triggered when `chats.needs_compaction` is set after an inference turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)` (v1.13.9 opencode-pattern early trigger; was `ctx_max - 20k` pre-v1.13.9, which gave only 7.6% headroom at 262k and 0 budget for ≤20k contexts). **`ctx_max` comes from `model-context.getModelContext()` which fetches `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx` (the stream completion's `timings` doesn't carry n_ctx; that read was dead code until v1.11.3 ripped it out). First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet; negative cache TTL is 60s, recovers on next turn. v1.13.6: `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on the assistant `content` (OpenAI wire shape has no structured reasoning field; the summarizer reads text). Standalone tag when content is empty (tool-call-only turn). `buildHeadPayload` + `OpenAiMessage` exported for test access — keep them exported.
- **`services/system-prompt.ts`** — `buildSystemPrompt` is the string-returning shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. v1.13.8 instrumentation: SHA-256 of the assembled prefix is logged per `buildMessagesPayload` call (msg `prefix-fingerprint`, level=info); a `Map<sessionId, lastHash>` observer fires `prefix-drift` (level=warn) on hash change with a field-level `changed_inputs` diff. Smoke proved the prefix is byte-stable across turns in steady-state — the originally-planned `system_prompt_cache` DB table was dropped as redundant against the v1.12.0 input-layer mtime caches (BOOCHAT.md here + AGENTS.md global+per-project in `agents.ts:safeStat`).
- **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (v1.13.7; was 15 — every tool in `ALL_TOOLS` is read-only today, so no-agent mode shares the read-only-agent cap). Per-agent `max_tool_calls` from AGENTS.md frontmatter overrides.
- **`messages_with_parts` view** (v1.13.1-B; `schema.sql`). Read sites that need `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` directly. v1.13.20 dropped the legacy `messages.tool_calls` / `messages.tool_results` JSON columns; the view now reads parts-only subselects. Writes target `message_parts` exclusively via `insertParts` (or via the helpers `partsFromAssistantMessage` / `partsFromToolMessage`). The `Message` wire type still carries `tool_calls?` / `tool_results?` because the view synthesizes them from parts — frontend reads are unchanged. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` single object, `reasoning_parts jsonb[]` of `{text}`. If you ever need to UPDATE a message and return its full Message shape, do a two-step UPDATE returning `id` followed by SELECT from the view — RETURNING off the bare `messages` table no longer carries the tool fields.
- **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
- **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after first assistant reply.
- **`services/provider-registry.ts`** — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty).
- **`services/agent-probe.ts`** — Startup probe using direct `exec()` (not SSH). Discovers installed agents on host, their versions, ACP support, and models. Qwen models read from `~/.qwen/settings.json`. Claude models are static from the registry. Results persisted to `available_agents` table.
- **`routes/providers.ts`** — `GET /api/providers` returns installed providers with models. Transport field reflects actual capability (checks `supports_acp` from DB, not just registry preference).
- **Provider picker dispatch**: when `provider !== 'boocode'`, the message route creates a `tasks` row (with `session_id` set) instead of calling `inference.enqueue`. The dispatcher picks it up and dispatches via ACP or PTY using the agent's `install_path`.
- **`apps/server/CLAUDE.md`** — inference pipeline, AI-SDK adapter gotchas, tools, compaction, broker, the `messages_with_parts` view, sidecar routing, secret guard, the `data/AGENTS.md` registry.
- **`apps/coder/CLAUDE.md`** — BooCoder dispatch, provider registry/probe/snapshot, opencode/ACP/PTY/Claude-SDK backends, `agent_sessions` resume.
- **`apps/web/CLAUDE.md`** — React app, hooks/event buses, font & CSS pipeline, multi-pane workspace, all UI conventions.
- **`docs/project-discovery.md`** — full stack / tooling / command inventory across all packages (read-on-demand).
Route registration: all routes registered in `index.ts` via `register*Routes(app, sql, ...)` functions. Routes are in `routes/*.ts`.
### BooCoder (`apps/coder/src/`)
- Write-capable coding agent. Runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker. Fastify server at port 9502, connects to postgres at `127.0.0.1:5500`.
- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. apps/server must build FIRST.
- Build + deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Follows Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers.
- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes` table. Nothing hits disk until `apply_pending` is called. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
- Frontend: NOT a separate SPA. BooCoder is a `'coder'` pane type within BooChat's SPA (`apps/web/`). `CoderPane.tsx` in `apps/web/src/components/panes/`. API requests go through `/api/coder/*` proxy (Vite dev + Fastify production) which rewrites to the boocoder host service (`BOOCODER_URL` env var, default `http://100.114.205.53:9502`). WS connects directly to `:9502`.
- `apps/coder/web/` is a STANDALONE fallback SPA served at `:9502` directly. The PRIMARY BooCoder frontend is the `CoderPane` in BooChat's SPA (`apps/web/src/components/panes/CoderPane.tsx`), accessible via the "Coder" pane in the workspace at `code.indifferentketchup.com`. Both exist; the pane is what Sam uses.
### Frontend (`apps/web/src/`)
- **React 18** + React Router v6 + **Tailwind v4** + shadcn/radix-ui primitives.
- **Shiki** for syntax highlighting (async `codeToHtml` in `CodeBlock.tsx` and `FileViewer` in `FileBrowserPane.tsx`).
- Path alias: `@/` maps to `src/`.
- **Mobile interaction primitives** (post-v1.6): `useViewport` (matchMedia, breakpoints mobile <768 / tablet 7681023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, dispatches synthetic `contextmenu` on `[data-tab-id]`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Tap-target convention: `max-md:min-h-[44px] max-md:min-w-[44px]`. Mobile headers: `border-b px-3 sm:px-4 py-2` + `style={{ paddingTop: 'max(0.5rem, env(safe-area-inset-top))' }}`. Hamburger left, FolderTree right.
Key patterns:
- **`hooks/sessionEvents.ts`** — Module-singleton event bus (Set of listeners). Used for cross-component communication: session renames, file-open events, attachment dispatch. 9 event types in the discriminated union. When adding a new event type to the `SessionEvent` union, you must also add a case to the `applyEvent` switch in `useSidebar.ts` (even if it's a no-op `return prev`).
- **`hooks/useSessionStream.ts`** — WebSocket per session, `applyFrame` reducer builds message list from streaming frames.
- **`hooks/useUserEvents.ts`** — Single app-level WS to `/api/ws/user` with exponential backoff reconnect. Forwards frames onto the sessionEvents bus.
- **`hooks/useSidebar.ts`** — Module-singleton with Set<setState> subscriber pattern; one bus subscription guarded by `globalThis.__boocode_sidebar_subscribed` for HMR safety. Every new `SessionEvent` type needs a `case` in the `applyEvent` switch (no-op `return prev` is fine).
- **`api/client.ts`** — Centralized typed fetch wrapper. All endpoints under `api.*` namespace.
Font / CSS pipeline (apps/web):
- Tailwind v4's `@import "tailwindcss"` directive strips font URLs from subsequent CSS `@import`s — `@fontsource*` packages must be imported as JS side-effect modules in `apps/web/src/main.tsx`, not via `@import` in `globals.css`. Otherwise the woff2 files never make it to `dist/`.
- Lightning CSS (inside `@tailwindcss/postcss` v4) collapses contiguous unicode-ranges to wildcard shorthand (`U+0000-FFFF``U+????`), which iOS Safari/Vivaldi mishandles (silently drops the font from those codepoints). Use explicit non-wildcard-collapsible subranges (e.g. `U+2500-259F` not `U+2500-25FF`). The `apps/web` build script greps `dist/assets/*.css` for `U+2500-259F` and fails the build if missing — preserve that guard.
- `@font-face` blocks must live AFTER all `@import` statements (CSS spec). Earlier placement silently breaks every subsequent `@import` (this broke the 18 theme palette imports in globals.css for one session).
- JetBrainsMono Nerd Font self-hosted in `apps/web/src/fonts/` (TTF from ryanoasis/nerd-fonts release) — needed because `@fontsource-variable/jetbrains-mono` ships subsetted woff2s that don't cover `U+2500-259F` (box drawing + block elements, used by opencode's banner). "NL" = No Ligatures (matches `font-feature-settings: "liga" 0`); "Mono" = single-cell icon width so TUI layouts don't desync.
- xterm-addon-webgl rasterizes glyphs via Canvas2D into a GPU texture atlas. Canvas2D does NOT honor `font-display: block` — it uses whatever font is currently registered. Gate xterm initialization on `document.fonts.load(<font-name>)` resolving before calling `term.open()` (see `fontsReady` useState in `TerminalPane.tsx`). iOS Safari/Vivaldi also reclaims WebGL contexts from backgrounded tabs: keep `webgl.onContextLoss(() => webgl.dispose())` + recreate via visibilitychange. Do NOT manually dispose+recreate the addon after font load — iOS silently fails the second GL context creation and the terminal drops to DOM renderer with stale metrics.
Cross-app contracts (WS-frame & provider-type parity, sentinels) and everything below stay here.
### Data flow for chat
@@ -117,71 +60,67 @@ Font / CSS pipeline (apps/web):
5. Tool calls: inference executes tools server-side, publishes tool_call/tool_result frames, loops back to LLM
6. Terminal states (complete/error): DB updated with final content + token counts, `session_updated` frame published on user channel
### Multi-pane workspace
Sessions hold 15 panes (chat / empty / placeholder terminal+agent). v1.12.1 moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device watching the session. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string. Legacy localStorage key `boocode.workspace.panes.<sessionId>` is read once on first hydrate (one-time seed-and-delete migration when server is empty but localStorage has data); no longer written. The deprecated `session_panes` table was dropped. `validatePanes(validChatIds)` prunes panes referencing chat IDs that no longer exist (called by `useSessionChats` after the chat list fetch lands). Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Tab reorder via native HTML5 drag events.
## Database
PostgreSQL 16. Database name: `boochat` (renamed from `boocode` in v2.0.0-alpha; Docker service name stays `boocode_db`). Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts` (v1.13.0), `pending_changes` (v2.0.0), `tasks` (v2.0.0), `available_agents` (v2.0.0). Views: `messages_with_parts` (v1.13.1-B parts-merge read path), `tool_cost_stats` (v1.13.10 per-tool 100-call rolling window), `human_inbox` (v2.0.0 — tasks WHERE state IN blocked/failed). (`session_panes` was dropped in v1.12.1; workspace pane state lives in `sessions.workspace_panes jsonb`.) Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. The older anonymous `messages_status_check` (without 'cancelled') and `messages_role_check` (without 'system') were dropped in v1.12.1; only the `_chk` variants remain.
PostgreSQL 16. DB name: `boochat` (Docker service stays `boocode_db`). Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts`, `pending_changes`, `tasks`, `available_agents`. Views: `messages_with_parts` (parts-merge read path), `tool_cost_stats` (per-tool 100-call rolling window), `human_inbox` (tasks WHERE state IN blocked/failed). Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints: `projects_status_chk`/`sessions_status_chk`/`chats_status_chk` ('open'|'archived'), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. **Two schema files, one DB:** `apps/server/src/schema.sql` owns `sessions`/`chats`/`messages`/`message_parts`; `apps/coder/src/schema.sql` (applied by the boocoder host service) owns `agent_sessions`, `worktrees`, `pending_changes`, `available_agents` and extends `tasks` — so e.g. an `agent_sessions` FK change goes in the **coder** schema. Idempotent FK-action flips (e.g. `ON DELETE CASCADE``SET NULL`) guard on `pg_constraint.confdeltype` so re-runs are no-ops.
Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap new constraint ADD in `DO $$ ... pg_constraint` guard — that block is the only way to get `ADD CONSTRAINT IF NOT EXISTS`.
Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap the new constraint ADD in a `DO $$ ... pg_constraint` guard — the only way to get `ADD CONSTRAINT IF NOT EXISTS`.
**`CREATE OR REPLACE VIEW` can't reorder/rename columns** (Postgres `42P16`): append a new `messages_with_parts` column at the END of the SELECT — a mid-list insert shifts an existing column → crash-loops boot. Add it to each explicit read SELECT too (`routes/messages.ts`/`chats.ts`/`ws.ts`).
**A `SELECT *` view pins every column** (`2BP01`): `DROP COLUMN` on the table fails while such a view exists. `human_inbox` is `SELECT * FROM tasks` — to drop a `tasks` column, `DROP VIEW IF EXISTS human_inbox` first, drop the column(s), then recreate the view (idempotent). Bites existing DBs only; a fresh DB never had the column, so fresh-DB testing misses it.
## Environment
Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale Fathom; the public `search.indifferentketchup.com` is behind Authelia and unusable from server context), `BOOCODE_TOOLS` (`core` | `standard` | `all`, default `all`; v1.13.15-tools tier filter — ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (optional; default `/data/mcp.json` — JSON config for MCP servers matching opencode's `mcpServers` shape; file missing = no MCP).
Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only add-existing scope), `BOOTSTRAP_ROOT` (/opt/projects, writable bootstrap mkdir target — host must `mkdir -p` it before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale; the public host is behind Authelia, unusable from server context), `BOOCODE_TOOLS` (`core`|`standard`|`all`, default `all`; a ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (default `/data/mcp.json`, opencode `mcpServers` shape; missing = no MCP), `CONTEXT7_API_KEY` (the Context7 MCP key, referenced from `data/mcp.json` as `"{env:CONTEXT7_API_KEY}"`). `data/mcp.json` is **gitignored** but no longer holds secrets — string values support opencode-style `{env:VAR}` substitution (`mcp-config.ts:substituteEnvVars`, applied before Zod validation; unset var → `''` + warn), so real keys live in `.env`; template `data/mcp.example.json`. A config-only edit there needs only `docker compose restart boocode` (data/ is bind-mounted); changing a referenced secret edits `.env`. MCP loads at server startup with per-server graceful degradation; the coder does NOT load MCP (BooChat only).
BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
- `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL when unset. Set to a small model on llama-swap (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls.
- Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 required on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — non-interactive mode (`-p`) runs autonomously without approval prompts. ACP bridge is HTTP daemon (not stdio); use PTY dispatch.
- Arena (v2.0.5): `POST /api/arena {project_id, input, contestants: [{agent?, model?}]}` dispatches the same task to N models/agents in parallel. Each contestant gets its own task + worktree. `GET /api/arena/:id` for results. `POST /api/arena/:id/select/:task_id` picks winner.
- `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL. Set to a small llama-swap model (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls.
- Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — `-p` runs autonomously without prompts. ACP bridge is an HTTP daemon (not stdio); use PTY dispatch.
- Arena: `POST /api/arena {project_id, input, contestants: [{agent?, model?}]}` dispatches the same task to N models/agents in parallel; each contestant gets its own task + worktree. `GET /api/arena/:id` for results; `POST /api/arena/:id/select/:task_id` picks a winner.
## Workflow
- Sam reviews all diffs and commits manually. Do not commit unless explicitly asked.
- Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`. Already-shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape; see `openspec/README.md` for the convention.
- Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`). Monotonic per minor — the slug describes the batch's content so the tag name alone is enough to recall what shipped. No letter suffixes (`-a`/`-b`), no pseudo-ranges (`v1.11.x`), no slug-only sub-versions sharing a number (`v1.13.15-tools` + `-openspec` + `-agentlint` — split into sequential patches instead).
- `CHANGELOG.md` is the per-tag release log, most-recent on top. When a new tag is created, add a `## <tag> — <YYYY-MM-DD>` section with a 36 sentence paragraph summarizing what shipped, drawn from the commit body. Cross-reference other tags by name when the batch builds on, fixes, or pairs with prior work (e.g. "pairs with `v1.13.12-ws-schemas`", "fixed in `v1.13.5-stability-bundle`"). No nested bullets — one paragraph.
- Deploy: `cd /opt/boocode && docker compose up --build -d` (or `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue).
- Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`.
- Sam often has uncommitted `apps/web` work in flight — stage your own commits **explicitly by path** (never `git add -A`); `docker compose up --build -d boocode` builds the working tree, so a container rebuild also ships his uncommitted web changes.
- **Deploy by surface:** an `apps/coder` change → `sudo systemctl restart boocoder`; an `apps/web` or `apps/server` change → `docker compose up --build -d boocode` (rebuilds web+server from the working tree). The `boocode` container is `build: .`, so uncommitted changes deploy; web edits are live on the Vite dev server (HMR) but NOT on production (`:9500` / code.indifferentketchup.com) until a rebuild. Use `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue.
- Cutting a release: name the feature branch DIFFERENTLY from the tag (branch `f1-interrupt-guard`, tag `v2.6.7-interrupt-guard`) — identical names trigger `warning: refname ... is ambiguous`.
- Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`; shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape (see `openspec/README.md`).
- Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`), monotonic per minor — the slug alone recalls what shipped. No letter suffixes, no pseudo-ranges, no slug-only sub-versions sharing a number (split into sequential patches).
- `CHANGELOG.md` is the per-tag release log, newest on top. New tag → add a `## <tag> — <YYYY-MM-DD>` section, one 36 sentence paragraph (no nested bullets) from the commit body; cross-reference related tags by name when the batch builds on / fixes / pairs with prior work.
- Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`. Keep both remotes synced: push `main` + the release tag to `origin` (Gitea, deploy key above) AND `backup` (`git@github.com:indifferentketchup/boocode.git`, default key).
- Don't accumulate `.bak-*` files. Clean them up in the same batch or immediately after merge.
- DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`. Host port is 5500 (mapped from `boocode_db:5432`); password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL=postgres://boocode:Ketchup1479@boocode_db:5432/...` line. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` with a `beforeAll` that applies the schema via `sql.unsafe(readFileSync(schemaPath))`. Tests skip cleanly when var is unset. `tool_cost_stats.test.ts` is the reference.
- Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The boocode container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`.
- Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without setting `Content-Type` tricks on the client.
- DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`. Host port 5500; password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL` line. `psql` isn't on host PATH — use `docker exec boocode_db psql -U boocode -d boochat -c "..."`. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` + `beforeAll` applying schema via `sql.unsafe(readFileSync(schemaPath))`. `tool_cost_stats.test.ts` is the reference.
- Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`.
- Frontend blank-screen / runtime crash: get the stack-trace column offset from the browser console, then `cut -c <start>-<end> apps/web/dist/assets/index-*.js | sed -n '<line>p'` to read the exact minified expression that threw. Watch for `=== null`/`!== null` on optional fields fed an `as unknown as` cast — those bypass tsc.
- Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without `Content-Type` tricks on the client.
- Event dedup discipline: for any mutation the server publishes via `broker.publishUser`, do NOT add a local `sessionEvents.emit(...)` after the API call — `useUserEvents` forwards the WS frame onto the bus. Frontend mutation handlers must be idempotent (dedup by id, no-op on already-present).
- `node:20-*` base images ship a `node` user at uid/gid 1000 — delete it (`userdel`/`groupdel` on debian, `deluser`/`delgroup` on alpine) before adding samkintop at 1000.
- node-pty's compiled `.node` is libc-specific: proddeps and runtime Dockerfile stages must share libc (alpine↔musl or bookworm-slim↔glibc); the TS-only builder stage can stay alpine for speed.
- pnpm 10 `--frozen-lockfile` skips node-pty's postinstall — the Docker proddeps stage runs `cd node_modules/node-pty && npm run install` to force the native compile.
- A local PreToolUse hook (`security_reminder_hook.py`) regex-flags Node's older `child_process` spawn helpers as unsafe (false positive even on the File-suffixed variant). Use `spawn` — it's accepted.
- `/opt/boolab` hosts a working sibling BooCode terminal at `boocode.indifferentketchup.com`. Useful for visual side-by-side comparison on the same iPhone when debugging booterm rendering. Boolab uses Tailwind v3 (`@tailwind base`); boocode uses v4 — many subtle build differences. Don't assume parity.
- booterm SSHs to the host as `samkintop@100.114.205.53` (the Tailscale IP). The hostname `ubuntu-homelab` (shown in the bash prompt after login) does NOT resolve from inside the container — only the host's `/etc/hosts` knows it. Override via `BOOTERM_SSH_HOST` / `BOOTERM_SSH_USER` env vars in docker-compose if you ever move the shell to a different machine.
- codecontext sidecar lives at `/opt/boocode/codecontext/`. Sidecar HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore.template` documents recommended ignore patterns; users copy and adapt to project root manually.
- codecontext fork at `/opt/forks/codecontext/` — separate git repo (branch `boocode-ts`), pushed via the same boocode_gitea SSH key to `indifferentketchup/codecontext`. Build: `go build ./...`. Test: `go test ./...`. Docker rebuild: `docker compose build --no-cache codecontext`.
- Go binary: `/snap/go/current/bin/go` (not on PATH by default). Use `export PATH=$PATH:/snap/go/current/bin` or full path for Go commands.
- `os/exec` child supervisors must explicitly call `child.Wait()` in a goroutine and `os.Exit` on child death. `Signal(0)` returns nil on zombies and is NOT a liveness check. Without `Wait()`, docker's `restart: unless-stopped` policy never fires because the parent stays alive. The `codecontext/shim.go` implementation is the reference pattern.
- `/opt/boolab` hosts a sibling BooCode at `boocode.indifferentketchup.com` — useful for side-by-side iPhone comparison when debugging booterm rendering. It uses Tailwind v3, boocode uses v4 — don't assume build parity.
- booterm SSHs to the host as `samkintop@100.114.205.53` (the Tailscale IP). The hostname `ubuntu-homelab` (in the bash prompt) does NOT resolve inside the container. Override via `BOOTERM_SSH_HOST` / `BOOTERM_SSH_USER` env vars in docker-compose if the shell moves to a different machine.
- codecontext sidecar lives at `/opt/boocode/codecontext/`. HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore` at project root is honored when `--respect-gitignore` is passed (enabled in the shim).
- codecontext fork at `/opt/forks/codecontext/` — separate git repo (branch `boocode-ts`), pushed via the boocode_gitea SSH key to `indifferentketchup/codecontext`. Build `go build ./...`; test `go test ./...`. Docker rebuild requires staging the fork first: `tar -czf codecontext/fork.tar.gz -C /opt/forks/codecontext --exclude=.git --exclude=bin .` then `docker compose build --no-cache codecontext` (the Dockerfile COPYs `fork.tar.gz` into the builder stage; Gitea is behind Authelia, no HTTP clone). `fork.tar.gz` is gitignored.
- Go binary: `/snap/go/current/bin/go` (not on PATH). Use `export PATH=$PATH:/snap/go/current/bin` or the full path.
- `os/exec` child supervisors must call `child.Wait()` in a goroutine and `os.Exit` on child death. `Signal(0)` returns nil on zombies and is NOT a liveness check. Without `Wait()`, docker's `restart: unless-stopped` never fires because the parent stays alive. `codecontext/shim.go` is the reference.
## Conventions
- `overflowWrap` not `wordWrap` — TypeScript's CSSStyleDeclaration marks `wordWrap` as deprecated (error 6385).
Cross-cutting only. Per-app conventions live in the matching `apps/*/CLAUDE.md`.
- No app-layer auth. Authelia handles auth at the reverse proxy. All `broker.publishUser`/`subscribeUser` calls use `'default'` as the user key.
- TypeScript strict mode. Both apps share `tsconfig.base.json`.
- Server uses NodeNext module resolution (`.js` extensions in imports).
- TypeScript strict mode. Both apps share `tsconfig.base.json`. Server + coder use NodeNext module resolution (`.js` extensions in imports).
- Discriminated unions for type narrowing: `Pane` (by `kind`), `SessionEvent` (by `type`), `InferenceFrame` (by `type`).
- **Adding a new WS frame type** requires updating BOTH the server's `InferenceFrame` (loose `type:` union + optional fields in `services/inference/turn.ts`) AND the web `WsFrame` (strict discriminated union in `apps/web/src/api/types.ts`). Server publish is permissive; the frontend type is the wire-format gate. The `'usage'` frame added in v1.12.2 needed both sides; missing the web side silently drops the frame at JSON-parse.
- shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive.
- `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension-to-language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names.
- Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles.
- `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
- Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` is the wrapper that does this. `addSplitPane` returns the new pane id for callers.
- xterm.js v5 uses canvas rendering — browser doesn't see xterm's selection; the native right-click menu has no working Copy for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path.
- **New tools** live in their own `services/<name>.ts` file (see `web_search.ts`, `web_fetch.ts`) — exports a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real dependencies. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')` — cleanup is simpler and the production call site stays unchanged.
- **Sentinels** are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. A new kind requires arms in `MessageMetadata` in BOTH `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts`, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and calls `controller.close()` before the consumer reads, so a subsequent `reader.cancel()` finds the stream already closed and the `cancel()` callback never fires. Also provide MORE chunks than the test will consume so the source stays in 'readable' state when cancel runs (e.g. cap test reads ~6 chunks, stub provides 10).
- Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded. `services/agents.ts` `ALL_TOOL_NAMES` had this drift class until v1.12 — same pattern applies to any future tool-aware code.
- Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo — removed in v1.12 to eliminate the two-files-must-stay-in-sync drift. The `getAgentsForProject` per-project override mechanism remains for *other* projects.
- MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers. The `codecontext/shim.go` framing implementation is the reference; per the MCP spec (modelcontextprotocol.io/specification/server/transports).
- **Workspace dependency pattern** (`apps/coder``@boocode/server`): the consuming package adds `"@boocode/server": "workspace:*"` in `package.json`. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath: `"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`. Without the `types` condition, NodeNext resolution can't find `.d.ts` files and tsc fails with "Cannot find module" in the consumer.
- **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of a JSON object/array). Pattern established in `parts.ts`, `settings.ts`.
- **`payload.ts:loadContext` SELECT**: must include every `Session` field that downstream code reads. The tool phase reads `session.allowed_read_paths`; if the SELECT omits it, cross-repo read grants silently fail. The `Session` TypeScript type doesn't catch this because `sql<Session[]>` doesn't enforce column coverage.
- **Adding a new WS frame type** (cross-app): add it to `WsFrameSchema` in `packages/contracts/src/ws-frames.ts` (single source of truth; rebuild with `pnpm -C packages/contracts build`). The server's `InferenceFrame` loose union (`services/inference/turn.ts`) and the web's strict `WsFrame` discriminated union (`apps/web/src/api/types.ts`) still exist separately and also need updating. Server publish is permissive; the frontend type is the wire-format gate missing the web side silently drops the frame at JSON-parse.
- **Sentinels** (cross-app) are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. `MessageMetadata` is single-sourced in `@boocode/contracts` (`packages/contracts/src/message-metadata.ts`). A new kind requires updating that file and rebuilding the package, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- **Provider snapshot types** (`ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`, `ThinkingOption`, `AgentCommand`, `ProviderSnapshotStatus`) are single-sourced in `@boocode/contracts` (`packages/contracts/src/provider-snapshot.ts`); `apps/coder/src/services/provider-types.ts` re-exports them. Edit the package source; there is no hand-synced web copy to update.
- **`@boocode/contracts`** single-sources cross-app wire contracts via per-subpath built-dist exports, consumed by all four apps (incl. `apps/coder/web`): `./ws-frames`, `./provider-snapshot`, `./provider-config` (Zod schemas), `./message-metadata` (`MessageMetadata`/`ErrorReason`/`AgentSessionConfig`), `./worktree-risk`. It builds BEFORE every consumer (root build, Dockerfile, coder deploy). Its `WsFrame` is the loose `z.infer` of `WsFrameSchema` (payloads `unknown`); the web's richer strict `WsFrame` union is **deliberately web-local** (`apps/web/src/api/types.ts`), bridged to the validated frame by a cast — don't move it into the package. Consume built `dist` via the exports map; never add the package to a tsconfig `references` array.
- **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of an object/array). Pattern in `parts.ts`, `settings.ts`.
- Skills live in `data/skills/<vendor>/`; Sam's own namespace is `boocode/` (`committing-changes`, `using-worktrees`, `improving-boocode-guidance`, `systematic-debugging`) — `SKILL.md` + optional `eval.yaml` (gerund names; eval = `skill:` + `tasks:` of `prompt`+`grader`, incl. a negative-trigger task). `data/skills/` is canonical; a divergent mirror at `/opt/skills/` exists.
### Coding standards
Coding standards live in `docs/coding-standards/` (canonical, human-readable). They are exposed to Claude Code through per-file-type/subsystem index files under `.claude/rules/coding-standards/`. Each index is a path-scoped rule that lists the standards relevant to its `paths:` glob with a one-line description of each. When Claude reads a file matching an index's `paths:`, it loads only that small index and then decides which (if any) standards to open with Read — the full text of a standard is never loaded automatically, and standards do not appear in the skills picker. Browse `docs/coding-standards/` for the readable form.

View File

@@ -1,10 +1,9 @@
# Current focus
Last updated: 2026-05-26
Last updated: 2026-06-02
- **Batch:** v2.3-provider-lifecycle (openspec drafted; not started)
- **Branch:** `main`
- **Blockers:** none
- **Last shipped:** `v2.2.2-xml-placeholder-reject`
- **Last shipped:** `v2.7.8-ember-coder-tabs-model-chips` (2026-06-01)
- **Branch:** `codebase-audit-cleanup` (audit + cleanup epic, off main HEAD)
- **In progress:** Phase 3 — stale comments + docs refresh
Update this file when starting or finishing a batch. Agents: read this first for session intent; if stale vs `CHANGELOG.md`, trust CHANGELOG for shipped state.
See `CHANGELOG.md` for the full shipped history. That file is always authoritative; this file is a quick orientation pointer only.

View File

@@ -5,11 +5,15 @@ RUN corepack enable
WORKDIR /build
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
COPY packages/contracts/package.json ./packages/contracts/
COPY apps/server/package.json ./apps/server/
COPY apps/web/package.json ./apps/web/
RUN pnpm install --frozen-lockfile
# @boocode/contracts must be present before `pnpm build`, which builds it FIRST
# (root build script) so apps/web can resolve its compiled dist via the exports map.
COPY packages/contracts ./packages/contracts
COPY apps/server ./apps/server
COPY apps/web ./apps/web
@@ -20,6 +24,9 @@ RUN pnpm deploy --filter=@boocode/server --prod --legacy /out/server
FROM node:20-alpine AS runtime
RUN apk add --no-cache ripgrep git openssh-client
# The container runs as root but bind-mounts host project repos owned by uid 1000;
# trust them so git read/write tools (git_status, the git diff panel) work over the mount.
RUN git config --system --add safe.directory '*'
RUN mkdir -p /root/.ssh && ssh-keyscan -p 2222 -H 100.114.205.53 git.indifferentketchup.com >> /root/.ssh/known_hosts && chmod 700 /root/.ssh && chmod 600 /root/.ssh/known_hosts
WORKDIR /app

682
LICENSE
View File

@@ -1,661 +1,21 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.
MIT License
Copyright (c) 2026 indifferentketchup
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -58,7 +58,7 @@ upstream and inject `Remote-User`. Postgres binds loopback only.
BooCoder runs as a **host systemd service** (`boocoder.service`, port `:9502`), not in Docker:
```bash
pnpm -C apps/server build && pnpm -C apps/coder build
pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build
sudo systemctl restart boocoder
curl http://100.114.205.53:9502/api/health
```
@@ -84,3 +84,7 @@ See [`boocode_roadmap.md`](boocode_roadmap.md) for full version history. Highlig
## Planned
- **v2.3 provider lifecycle** — config-backed provider registry (`/data/coder-providers.json`), enable/disable toggles, two-tier probe (openspec drafted). See [`CURRENT.md`](CURRENT.md).
## License
MIT — see [`LICENSE`](LICENSE).

View File

@@ -15,7 +15,6 @@
"fastify": "^4.28.1",
"node-pty": "^1.0.0",
"pg": "^8.13.0",
"tslib": "^2.6.3",
"zod": "^3.23.8"
},
"devDependencies": {
@@ -24,5 +23,5 @@
"tsx": "^4.16.2",
"typescript": "^5.5.0"
},
"license": "AGPL-3.0-only"
"license": "MIT"
}

View File

@@ -9,7 +9,7 @@ const ConfigSchema = z.object({
TMUX_CONF_PATH: z.string().default('/etc/booterm/tmux.conf'),
});
export type Config = z.infer<typeof ConfigSchema>;
type Config = z.infer<typeof ConfigSchema>;
let cached: Config | null = null;

View File

@@ -10,7 +10,7 @@ export function getPool(databaseUrl: string): pg.Pool {
return pool;
}
export interface SessionInfo {
interface SessionInfo {
id: string;
project_id: string;
project_path: string;

View File

@@ -1,7 +1,7 @@
import * as pty from 'node-pty';
import type { IPty } from 'node-pty';
export interface AttachPtyOptions {
interface AttachPtyOptions {
sessionName: string;
projectRoot: string;
cols: number;

View File

@@ -13,3 +13,5 @@ GITEA_USER=indifferentketchup
GITEA_SSH_HOST=100.114.205.53:2222
MCP_CONFIG_PATH=/data/mcp.json
SKILLS_ROOT=/opt/boocode/data/skills
CODER_PROVIDERS_PATH=/opt/boocode/data/coder-providers.json
CLAUDE_SDK_BACKEND=1

34
apps/coder/CLAUDE.md Normal file
View File

@@ -0,0 +1,34 @@
# apps/coder — BooCoder (deep reference)
> Per-app engineering notes for `apps/coder/src/`. BooCoder runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker — Fastify at port 9502, postgres at `127.0.0.1:5500`. Cross-cutting commands, database, environment, workflow, and cross-app contracts live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/coder/`.
## Probe & provider discovery
- **`services/provider-registry.ts`** — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty). `PROBED_AGENT_NAMES` derives from it — adding/removing providers means editing this file, not the frontend.
- **`services/agent-probe.ts`** — Startup probe via direct `exec()` (not SSH): discovers installed agents, versions, ACP support, models. Qwen models from `~/.qwen/settings.json`; Claude models static from the registry. Persisted to `available_agents`.
- **`routes/providers.ts`** — `GET /api/providers` returns installed providers with models. Transport reflects actual capability (checks `supports_acp` from DB, not just registry preference). The apps/server side is "Provider picker dispatch" (see `apps/server/CLAUDE.md`).
- **Provider snapshot lifecycle** (`services/`): `provider-config.ts` (Zod config, never-throws) → `provider-config-registry.ts` (`buildResolvedRegistry`, singleton) → `provider-snapshot.ts` (two-tier probe: tier-1 fast presence, tier-2 cold ACP probe skipped unless force / stale `PROVIDER_PROBE_TTL_MS` 24h / dbEmpty; cached). Verify live: `curl http://100.114.205.53:9502/api/providers/snapshot` — returns providers + models + commands, the exact shape `AgentComposerBar` renders.
- `PATCH /api/providers/config` replaces a provider id's override object **wholesale** (per-id shallow merge) — to flip one field send `{...existing, enabled}`, or a custom ACP entry's `command`/`label` is wiped and it drops out of the resolved registry. `data/coder-providers.json` is **gitignored** (live runtime config — the coder reads AND writes it on UI toggles); tracked reference is `data/coder-providers.example.json`. The loader falls back to `{providers:{}}` (built-ins only) when absent, so a fresh checkout needs no copy.
## Build, deploy, dispatch
- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. **apps/server must build FIRST.**
- Build + deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
- After `pnpm -C apps/coder build` the host service keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler shape). Restart, don't re-debug.
- `:9502/api/health` is down ~1520s after a boocoder restart while the startup agent-probe scan runs — retry; an early connection-refused is not a failed deploy.
- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath (`"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`) — without the `types` condition, NodeNext can't find `.d.ts` files and tsc fails "Cannot find module" here.
- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes`. Nothing hits disk until `apply_pending`. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
## Backends
> Behavioral overview + flows + data model: see [/docs/coder-backends.md](/docs/coder-backends.md). The notes below are the deep per-fact reference.
- **opencode** runs as a warm HTTP server (`services/backends/opencode-server.ts``opencode serve` per BooCoder process, one opencode session per BooCode session, resumed via `agent_sessions`). goose/qwen/claude dispatch **one-shot** ACP/PTY with no ctx/token usage; only native `boocode` (llama-swap) tracks ctx.
- **opencode SSE** (`opencode-server.ts`): live streaming is `session.next.text.delta` / `.reasoning.delta` / `.tool.{called,success,failed}` — NOT `message.part.*` (terminal/post-hoc). `client.event.subscribe({ directory })` MUST pass the session's worktree dir; omit it and opencode scopes events to the server `process.cwd()` → zero session events (empty turns, 180s timeout). Each live session owns its own subscribe loop + AbortController (a `sessionID` demux guard drops cross-session events when two share a dir). Turn completes on `session.idle`; `promptAsync` is fire-and-forget (204).
- **opencode model strings** must be provider-prefixed (`llama-swap/<model>`) AND exist in `~/.config/opencode/opencode.json` `provider.llama-swap.models` — not merely loadable by llama-swap. `parseModel` infers `llama-swap/` for a bare id; the dispatcher coalesces empty→DEFAULT_MODEL then prefixes. `agent-probe` populates opencode's `available_agents.models` via `mergeLlamaSwap` (fetches `/v1/models`); empty model list → frontend sends `''` → no inference (empty turn).
- **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; breaks cross-restart resume). Keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` resolves-or-creates a chat when null. The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
- **Claude SDK backend tool RESULTS arrive as `type:'user'` SDK messages** (tool_result content blocks): `mapSdkMessage` (`claude-sdk-map.ts`) MUST map the `user` case → a terminal `tool_update` (completed/failed + output), else the tool_call persists `status:'running'` and the UI spinner never stops. The dispatcher's `tool_update` path then publishes + persists it.
- **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in.
- **A new per-message coder field silently drops unless you update every mapper**: the HTTP read SELECT + `mapCoderMessageRow` (`apps/coder/src/routes/messages.ts`), **the WS `snapshot` SELECT (`apps/coder/src/routes/ws.ts`)** — it has its OWN column list and the client's `snapshot` handler `setMessages`-overwrites the HTTP load, so a field present in the HTTP route but absent here shows live yet vanishes on refresh — `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` WS reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields — easiest to forget. This bit `model` twice: the client chain (`v2.7.9`) and then the WS snapshot SELECT (`v2.7.11`) — the chip showed live but vanished on coder refresh until both were fixed.

View File

@@ -7,7 +7,6 @@ WORKDIR /build
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
COPY apps/server/package.json ./apps/server/
COPY apps/coder/package.json ./apps/coder/
COPY apps/coder/web/package.json ./apps/coder/web/
RUN pnpm install --frozen-lockfile
@@ -16,7 +15,6 @@ COPY apps/server ./apps/server
RUN pnpm -C apps/server build
COPY apps/coder ./apps/coder
RUN pnpm -C apps/coder/web build
RUN pnpm -C apps/coder build
RUN pnpm deploy --filter=@boocode/coder --prod --legacy /out/coder
@@ -27,7 +25,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends ripgrep git ope
WORKDIR /app
COPY --from=builder /out/coder ./
COPY --from=builder /build/apps/coder/web/dist ./web
ENV NODE_ENV=production
EXPOSE 3000

View File

@@ -13,11 +13,13 @@
"test": "vitest run"
},
"dependencies": {
"@boocode/contracts": "workspace:*",
"@agentclientprotocol/sdk": "^0.22.1",
"@anthropic-ai/claude-agent-sdk": "^0.3.159",
"@boocode/server": "workspace:*",
"@fastify/static": "^7.0.4",
"@fastify/websocket": "^10.0.1",
"@modelcontextprotocol/sdk": "^1.29.0",
"@opencode-ai/sdk": "~1.15.0",
"fastify": "^4.28.1",
"postgres": "^3.4.4",
"ws": "^8.18.0",
@@ -30,5 +32,5 @@
"typescript": "^5.5.0",
"vitest": "^3.0.0"
},
"license": "AGPL-3.0-only"
"license": "MIT"
}

View File

@@ -23,11 +23,33 @@ const ConfigSchema = z.object({
GITEA_TOKEN: z.string().optional(),
GITEA_SSH_HOST: z.string().default('100.114.205.53:2222'),
MCP_CONFIG_PATH: z.string().optional(),
// v2.3: config-backed provider overrides/custom-ACP entries merged over the
// hardcoded built-ins. Missing file = built-ins only (see provider-config.ts).
CODER_PROVIDERS_PATH: z.string().default('/data/coder-providers.json'),
// v2.3 phase 2: tier-2 (cold ACP probe) is skipped when available_agents was
// probed more recently than this. 24h default — stale model lists self-heal
// on the next snapshot; an explicit /refresh always re-probes.
PROVIDER_PROBE_TTL_MS: z.coerce.number().int().positive().default(86_400_000),
// v2.0.5: cheaper model for titles, summaries, labeling.
FAST_MODEL: z.string().optional(),
// SSH access to the host for external agent dispatch (Phase 5)
BOOCODER_SSH_HOST: z.string().default('100.114.205.53'),
BOOCODER_SSH_USER: z.string().default('samkintop'),
// v2.6 Phase 3 (lifecycle hardening). Idle TTL: evict a non-busy warm backend
// (opencode server / warm-ACP child) after this long with no turn — its worktree
// + agent_sessions row persist, so the next turn re-spawns + reattaches. 30 min
// default (design §6).
AGENT_POOL_IDLE_TTL_MS: z.coerce.number().int().positive().default(1_800_000),
// LRU cap: max live warm backends before the least-recently-used (non-busy) ones
// are evicted. Bounds the long-lived-daemon's per-(chat,agent) Map growth.
AGENT_POOL_MAX_LIVE: z.coerce.number().int().positive().default(10),
// Periodic sweep cadence (idle/LRU pool eviction + orphan-worktree reap). 60s
// mirrors the apps/server truncation/stale-streaming sweeper.
LIFECYCLE_SWEEP_INTERVAL_MS: z.coerce.number().int().positive().default(60_000),
// Orphan-worktree grace: an on-disk worktree dir with no live `worktrees` row is
// only reaped after it's been untouched this long (avoids sweeping a dir mid
// ensureSessionWorktree create). 1h default.
ORPHAN_WORKTREE_GRACE_MS: z.coerce.number().int().positive().default(3_600_000),
});
export type Config = z.infer<typeof ConfigSchema>;

View File

@@ -1,12 +1,5 @@
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
import { existsSync } from 'node:fs';
import Fastify from 'fastify';
import fastifyWebsocket from '@fastify/websocket';
import fastifyStatic from '@fastify/static';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
import { loadConfig } from './config.js';
import { getSql, applySchema, pingDb, closeDb } from './db.js';
import { startMcpServer } from './services/mcp-server.js';
@@ -16,7 +9,7 @@ import { createInferenceRunner } from '@boocode/server/inference';
import { createBroker } from '@boocode/server/broker';
import { appendMcpTools, ALL_TOOLS } from '@boocode/server/tools';
import type { Config as ServerConfig } from '@boocode/server/config';
import type { WsFrame } from '@boocode/server/ws-frames';
import type { WsFrame } from '@boocode/contracts/ws-frames';
// v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
import { WRITE_TOOLS } from './services/tools/index.js';
import { adaptWriteTool } from './services/tools/adapter.js';
@@ -25,17 +18,24 @@ import { setInferenceContext, clearInferenceContext } from './services/tools/inf
import { registerMessageRoutes } from './routes/messages.js';
import { registerSkillRoutes } from './routes/skills.js';
import { registerPendingRoutes } from './routes/pending.js';
import { registerCheckpointRoutes } from './routes/checkpoints.js';
import { registerAgentSessionRoutes } from './routes/agent-sessions.js';
import { registerTaskRoutes } from './routes/tasks.js';
import { registerInboxRoutes } from './routes/inbox.js';
import { registerStatsRoutes } from './routes/stats.js';
import { registerArenaRoutes } from './routes/arena.js';
import { registerProviderRoutes } from './routes/providers.js';
import { registerWorktreeSafetyRoutes } from './routes/worktree-safety.js';
import { registerLifecycleRoutes } from './routes/lifecycle.js';
import { registerWebSocket } from './routes/ws.js';
// Phase 4: dispatcher + agent probe
import { createDispatcher } from './services/dispatcher.js';
import { agentPool } from './services/agent-pool.js';
import { createOrphanWorktreeReaper } from './services/orphan-worktree-reaper.js';
import { probeAgents } from './services/agent-probe.js';
import { getProviderSnapshot, persistProbedModels } from './services/provider-snapshot.js';
import { setPermissionHooks } from './services/permission-waiter.js';
import { publishAgentStatus } from './services/agent-status-publish.js';
import { homedir } from 'node:os';
async function main() {
@@ -76,6 +76,21 @@ async function main() {
// Broker: in-memory pub/sub for session + user channel streaming.
const broker = createBroker(app.log);
// agent-status-normalize (#10): the permission hooks carry only taskId +
// sessionId, but the tasks row holds the (chat_id, agent) pair the status frame
// is keyed on. Resolve it best-effort so a blocked/working status accompanies
// every permission_requested/permission_resolved. Returns null when the task
// lacks a chat_id or agent (sessionless creators) — we simply skip the status.
const resolveChatAgent = async (
taskId: string,
): Promise<{ chatId: string; agent: string } | null> => {
const [row] = await sql<{ chat_id: string | null; agent: string | null }[]>`
SELECT chat_id, agent FROM tasks WHERE id = ${taskId}
`;
if (!row?.chat_id || !row.agent) return null;
return { chatId: row.chat_id, agent: row.agent };
};
setPermissionHooks({
onPrompt: async (prompt) => {
await sql`
@@ -90,6 +105,18 @@ async function main() {
...(prompt.input ? { input: prompt.input } : {}),
options: prompt.options.map((o) => ({ option_id: o.optionId, label: o.label })),
} as WsFrame);
// #10: agent is blocked on a human decision.
const ca = await resolveChatAgent(prompt.taskId).catch(() => null);
if (ca) {
publishAgentStatus(
broker.publishFrame,
prompt.sessionId,
ca.chatId,
ca.agent,
'blocked',
'permission_request',
);
}
},
onResolved: async (taskId, sessionId) => {
await sql`
@@ -100,6 +127,18 @@ async function main() {
task_id: taskId,
session_id: sessionId,
} as WsFrame);
// #10: human responded — agent resumes work.
const ca = await resolveChatAgent(taskId).catch(() => null);
if (ca) {
publishAgentStatus(
broker.publishFrame,
sessionId,
ca.chatId,
ca.agent,
'working',
'permission_resolved',
);
}
},
});
@@ -178,41 +217,48 @@ async function main() {
// Phase 4: dispatcher — polls tasks table and runs inference
const dispatcher = createDispatcher({ sql, inference: inferenceApi, broker, log: app.log, config });
dispatcher.start();
app.addHook('onClose', () => dispatcher.stop());
// v2.6 Phase 3: configure + start the agent-pool lifecycle sweep (idle-TTL +
// LRU-cap eviction of warm backends, plus each backend's proactive health probe)
// and the orphan-worktree reaper. Both run on the same periodic timer.
agentPool.configure({
idleTtlMs: config.AGENT_POOL_IDLE_TTL_MS,
maxLive: config.AGENT_POOL_MAX_LIVE,
sweepIntervalMs: config.LIFECYCLE_SWEEP_INTERVAL_MS,
log: app.log,
});
agentPool.startReaper(app.log);
const orphanReaper = createOrphanWorktreeReaper({
sql,
log: app.log,
intervalMs: config.LIFECYCLE_SWEEP_INTERVAL_MS,
graceMs: config.ORPHAN_WORKTREE_GRACE_MS,
});
orphanReaper.start();
app.addHook('onClose', async () => {
// stop() first so in-flight dispatcher turns settle, then stop the reapers and
// drain the pool (kills opencode server + warm ACP children).
await dispatcher.stop();
orphanReaper.stop();
await agentPool.dispose();
});
// Register routes
registerMessageRoutes(app, sql, broker, inferenceApi);
registerSkillRoutes(app, sql, broker, inferenceApi);
registerPendingRoutes(app, sql);
registerTaskRoutes(app, sql, inferenceApi);
registerCheckpointRoutes(app, sql);
registerAgentSessionRoutes(app, sql);
registerTaskRoutes(app, sql, inferenceApi, dispatcher.cancelExternalTask);
registerInboxRoutes(app, sql);
registerStatsRoutes(app, sql);
registerArenaRoutes(app, sql);
registerProviderRoutes(app, sql, config);
registerWorktreeSafetyRoutes(app, sql);
registerLifecycleRoutes(app, sql);
registerWebSocket(app, sql, broker);
// Serve static frontend (built web app). In production, the dist/ is
// copied to ../web relative to the dist/ directory at /app/web. In dev,
// check adjacent to the source.
const webRoot = resolve(__dirname, '../web');
if (existsSync(webRoot)) {
await app.register(fastifyStatic, {
root: webRoot,
prefix: '/',
// Don't intercept /api routes — static only serves files that exist.
wildcard: false,
});
// SPA fallback: serve index.html for non-API routes that don't match a file.
app.setNotFoundHandler(async (req, reply) => {
if (req.url.startsWith('/api')) {
reply.code(404);
return { error: 'not found' };
}
return reply.sendFile('index.html');
});
app.log.info(`serving frontend from ${webRoot}`);
}
// Graceful shutdown
const shutdown = async () => {
app.log.info('shutting down');

View File

@@ -0,0 +1,75 @@
import { describe, it, expect } from 'vitest';
import Fastify, { type FastifyInstance } from 'fastify';
import { registerAgentSessionRoutes } from '../agent-sessions.js';
import type { Sql } from '../../db.js';
// Mock the porsager surface this route uses: a tagged-template `sql` dispatched by
// query substring. Two queries: the session-existence check and the agent_sessions
// JOIN. We return post-coercion shapes (booleans/strings) exactly as porsager would
// hand them to the route — `has_session` already a JS boolean, `last_active_at` a
// string|null — so the asserted JSON matches the API contract end-to-end.
interface MockState {
sessionExists: boolean;
rows: Array<{ agent: string; status: string; has_session: boolean; last_active_at: string | null }>;
}
function mockSql(state: MockState): Sql {
return ((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('SELECT id FROM sessions')) {
return Promise.resolve(state.sessionExists ? [{ id: 'session-1' }] : []);
}
if (q.includes('FROM agent_sessions')) {
return Promise.resolve(state.rows);
}
return Promise.resolve([]);
}) as unknown as Sql;
}
function buildApp(state: MockState): FastifyInstance {
const app = Fastify();
registerAgentSessionRoutes(app, mockSql(state));
return app;
}
describe('GET /api/sessions/:id/agent-sessions', () => {
it('returns the per-(chat,agent) rows in the contracted shape', async () => {
const app = buildApp({
sessionExists: true,
rows: [
{ agent: 'opencode', status: 'active', has_session: true, last_active_at: '2026-05-31T12:00:00.000Z' },
{ agent: 'goose', status: 'idle', has_session: false, last_active_at: null },
],
});
const res = await app.inject({ method: 'GET', url: '/api/sessions/session-1/agent-sessions' });
expect(res.statusCode).toBe(200);
const body = res.json();
expect(Array.isArray(body)).toBe(true);
expect(body).toEqual([
{ agent: 'opencode', status: 'active', has_session: true, last_active_at: '2026-05-31T12:00:00.000Z' },
{ agent: 'goose', status: 'idle', has_session: false, last_active_at: null },
]);
// Contract field types.
expect(typeof body[0].agent).toBe('string');
expect(typeof body[0].status).toBe('string');
expect(typeof body[0].has_session).toBe('boolean');
expect(body[1].last_active_at).toBeNull();
await app.close();
});
it('returns an empty array when the session has no agent_sessions rows', async () => {
const app = buildApp({ sessionExists: true, rows: [] });
const res = await app.inject({ method: 'GET', url: '/api/sessions/session-1/agent-sessions' });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual([]);
await app.close();
});
it('404s when the session does not exist', async () => {
const app = buildApp({ sessionExists: false, rows: [] });
const res = await app.inject({ method: 'GET', url: '/api/sessions/nope/agent-sessions' });
expect(res.statusCode).toBe(404);
expect(res.json()).toEqual({ error: 'session not found' });
await app.close();
});
});

View File

@@ -0,0 +1,110 @@
import { describe, it, expect } from 'vitest';
import { resolveChatId } from '../chat-resolve.js';
import type { Sql } from '../../db.js';
// Mock the porsager/postgres surface that chat-resolve.ts uses: a tagged-template
// `tx` (dispatched by query substring), `tx.json`, and `sql.begin(fn)` which just
// runs fn(tx). Captures the value written back to workspace_panes so we can assert
// the WorkspaceState envelope survives the UPDATE.
interface MockState {
stored: unknown; // initial sessions.workspace_panes value
existingChatOpen: boolean; // whether `SELECT id FROM chats ...` finds the active chat
newChatId: string;
written?: unknown; // captured tx.json(...) payload from `UPDATE sessions`
inserted: boolean; // whether INSERT INTO chats ran
}
interface MockTx {
(strings: TemplateStringsArray): Promise<unknown>;
json: (v: unknown) => unknown;
}
function mockSql(state: MockState): Sql {
const tx = ((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('SELECT workspace_panes FROM sessions')) {
return Promise.resolve([{ workspace_panes: state.stored }]);
}
if (q.includes('FROM chats')) {
return Promise.resolve(state.existingChatOpen ? [{ id: 'placeholder' }] : []);
}
if (q.includes('INSERT INTO chats')) {
state.inserted = true;
return Promise.resolve([{ id: state.newChatId }]);
}
if (q.includes('UPDATE sessions')) {
return Promise.resolve([]);
}
return Promise.resolve([]);
}) as unknown as MockTx;
tx.json = (v: unknown) => {
state.written = v;
return v;
};
const sql = {
begin: (fn: (t: Sql) => Promise<unknown>) => fn(tx as unknown as Sql),
};
return sql as unknown as Sql;
}
const ENVELOPE = () => ({
panes: [{ id: 'pane-1', kind: 'coder', chatIds: [] as string[], activeChatIdx: 0 }],
tabNumbers: { 'chat-x': 3 },
nextTabNumber: 7,
closedPaneStack: [{ kind: 'coder', chatIds: ['old'], activeChatIdx: 0 }],
});
describe('resolveChatId — v2.6.5 WorkspaceState envelope', () => {
it('reads panes from the envelope without crashing (regression: panes.findIndex is not a function)', async () => {
const state: MockState = {
stored: ENVELOPE(),
existingChatOpen: false,
newChatId: 'new-chat-1',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('new-chat-1');
expect(state.inserted).toBe(true);
});
it('preserves the envelope (tabNumbers/nextTabNumber/closedPaneStack) on write-back', async () => {
const state: MockState = {
stored: ENVELOPE(),
existingChatOpen: false,
newChatId: 'new-chat-1',
inserted: false,
};
await resolveChatId(mockSql(state), 'session-1', 'pane-1');
const w = state.written as Record<string, unknown>;
expect(Array.isArray(w.panes)).toBe(true); // envelope, not a bare array
expect(w.tabNumbers).toEqual({ 'chat-x': 3 });
expect(w.nextTabNumber).toBe(7);
expect(w.closedPaneStack).toEqual([{ kind: 'coder', chatIds: ['old'], activeChatIdx: 0 }]);
});
it('returns the existing open chat when the pane already has one', async () => {
const env = ENVELOPE();
env.panes[0]!.chatIds = ['existing-1'];
const state: MockState = {
stored: env,
existingChatOpen: true,
newChatId: 'should-not-be-used',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('existing-1');
expect(state.inserted).toBe(false);
});
it('still accepts a legacy bare WorkspacePane[] array', async () => {
const state: MockState = {
stored: [{ id: 'pane-1', kind: 'coder', chatId: 'legacy-1', chatIds: ['legacy-1'], activeChatIdx: 0 }],
existingChatOpen: true,
newChatId: 'should-not-be-used',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('legacy-1');
expect(state.inserted).toBe(false);
});
});

View File

@@ -0,0 +1,211 @@
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
import Fastify, { type FastifyInstance } from 'fastify';
import { existsSync, readFileSync, writeFileSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { registerProviderRoutes } from '../providers.js';
import { load } from '../../services/provider-config.js';
import { loadProviderConfig } from '../../services/provider-config-registry.js';
import { clearProviderSnapshotCache } from '../../services/provider-snapshot.js';
import type { Config } from '../../config.js';
import type { Sql } from '../../db.js';
/** Minimal sql stub: available_agents reads return []. */
function mockSql(): Sql {
return vi.fn((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('available_agents')) return Promise.resolve([]);
return Promise.resolve([]);
}) as unknown as Sql;
}
let tmpCounter = 0;
function freshPath(): string {
tmpCounter += 1;
return join(tmpdir(), `coder-providers-routes-${process.pid}-${tmpCounter}.json`);
}
function buildApp(providersPath: string): FastifyInstance {
const app = Fastify();
// Mirror index.ts: tolerate empty JSON bodies.
app.removeContentTypeParser(['application/json']);
app.addContentTypeParser('application/json', { parseAs: 'string' }, (_req, body, done) => {
const str = (body as string) ?? '';
if (str.trim().length === 0) return done(null, {});
try {
done(null, JSON.parse(str));
} catch (err) {
done(err as Error, undefined);
}
});
const config = {
CODER_PROVIDERS_PATH: providersPath,
LLAMA_SWAP_URL: 'http://llama-swap.test',
PROVIDER_PROBE_TTL_MS: 86_400_000,
} as unknown as Config;
registerProviderRoutes(app, mockSql(), config);
return app;
}
const JSON_HEADERS = { 'content-type': 'application/json' };
const createdPaths: string[] = [];
beforeEach(() => {
clearProviderSnapshotCache();
loadProviderConfig('/nonexistent-coder-providers.json'); // reset registry to built-ins
vi.restoreAllMocks();
vi.stubGlobal('fetch', vi.fn().mockRejectedValue(new Error('no network in test')));
});
afterEach(() => {
for (const p of createdPaths.splice(0)) {
try {
rmSync(p, { force: true });
} catch {
/* ignore */
}
}
});
describe('GET /api/providers/config', () => {
it('returns the current config file (built-ins-only when missing)', async () => {
const path = freshPath();
createdPaths.push(path);
const app = buildApp(path);
const res = await app.inject({ method: 'GET', url: '/api/providers/config' });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ providers: {} });
await app.close();
});
it('reflects an existing file', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { enabled: false } } }));
const app = buildApp(path);
const res = await app.inject({ method: 'GET', url: '/api/providers/config' });
expect(res.json()).toEqual({ providers: { goose: { enabled: false } } });
await app.close();
});
});
describe('PATCH /api/providers/config', () => {
it('valid patch → 200, writes the merged file (order: validate→save→reload→clear)', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { label: 'Goose' } } }));
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { opencode: { enabled: false } } }),
});
expect(res.statusCode).toBe(200);
expect(res.json()).toMatchObject({ ok: true });
// File written + merged (goose untouched, opencode added).
const onDisk = load(path);
expect(onDisk.providers).toEqual({
goose: { label: 'Goose' },
opencode: { enabled: false },
});
await app.close();
});
it('null value deletes the override', async () => {
const path = freshPath();
createdPaths.push(path);
writeFileSync(path, JSON.stringify({ providers: { goose: { enabled: false }, opencode: { enabled: false } } }));
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: null } }),
});
expect(res.statusCode).toBe(200);
expect(load(path).providers).toEqual({ opencode: { enabled: false } });
await app.close();
});
it('INVALID body → 422 and the file is NOT written (validate before save)', async () => {
const path = freshPath();
createdPaths.push(path);
const before = JSON.stringify({ providers: { goose: { enabled: true } } });
writeFileSync(path, before);
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: { enabled: 'yes' } } }), // bad type
});
expect(res.statusCode).toBe(422);
// File must be byte-for-byte unchanged — nothing written on a 422.
expect(readFileSync(path, 'utf8')).toBe(before);
await app.close();
});
it('save failure → 500 and the file is NOT created (no state divergence)', async () => {
const path = join(tmpdir(), `no-such-dir-${process.pid}-${Date.now()}`, 'coder-providers.json');
const app = buildApp(path);
const res = await app.inject({
method: 'PATCH',
url: '/api/providers/config',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: { goose: { enabled: false } } }),
});
expect(res.statusCode).toBe(500);
expect(existsSync(path)).toBe(false);
await app.close();
});
});
describe('POST /api/providers/refresh', () => {
it('no body → refreshes all registered providers', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'POST', url: '/api/providers/refresh' });
expect(res.statusCode).toBe(200);
expect(res.json().refreshed).toBeGreaterThan(0);
await app.close();
});
it('subset body → refreshed count reflects only the requested providers', async () => {
const app = buildApp(freshPath());
const res = await app.inject({
method: 'POST',
url: '/api/providers/refresh',
headers: JSON_HEADERS,
payload: JSON.stringify({ providers: ['boocode'] }),
});
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ refreshed: 1 });
await app.close();
});
});
describe('GET /api/providers/:id/diagnostic', () => {
it('known provider → 200 JSON { diagnostic }', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'GET', url: '/api/providers/boocode/diagnostic' });
expect(res.statusCode).toBe(200);
expect(res.headers['content-type']).toContain('application/json');
expect(res.json().diagnostic).toContain('provider: boocode');
await app.close();
});
it('unknown provider → 404', async () => {
const app = buildApp(freshPath());
const res = await app.inject({ method: 'GET', url: '/api/providers/nope/diagnostic' });
expect(res.statusCode).toBe(404);
await app.close();
});
});

View File

@@ -0,0 +1,138 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import Fastify, { type FastifyInstance } from 'fastify';
import postgres from 'postgres';
import { registerTaskRoutes } from '../tasks.js';
/**
* F1 — POST /api/tasks/:id/cancel route wiring.
*
* The route's job: reach the in-flight external run via `cancelExternal(taskId)`
* (the new abort hook), keep cancelling native inference for open chats unchanged,
* and land the task row in 'cancelled'. The streaming assistant message is
* finalized by the dispatcher's run-function, not here — that path is covered by
* finalize-message.test.ts. This suite pins the route's behavior against a real DB.
*/
describe.runIf(!!process.env.DATABASE_URL)('POST /api/tasks/:id/cancel (route, F1)', () => {
let sql: ReturnType<typeof postgres>;
let app: FastifyInstance;
let projectId: string;
let sessionId: string;
let chatId: string;
const externalCancelCalls: string[] = [];
const inferenceCancelCalls: Array<[string, string]> = [];
let externalReturns = true;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
const [p] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('f1-cancel-route', '/tmp/f1-cancel-route', 'open') RETURNING id
`;
projectId = p!.id;
const [s] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status) VALUES (${projectId}, 'f1', 'm', 'open') RETURNING id
`;
sessionId = s!.id;
const [c] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = c!.id;
app = Fastify();
registerTaskRoutes(
app,
sql,
{
cancel: async (sid: string, cid: string) => {
inferenceCancelCalls.push([sid, cid]);
return false;
},
},
(taskId: string) => {
externalCancelCalls.push(taskId);
return externalReturns;
},
);
await app.ready();
});
afterAll(async () => {
if (app) await app.close();
if (!sql) return;
await sql`DELETE FROM messages WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM tasks WHERE project_id = ${projectId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
});
async function insertTask(agent: string | null, state: string): Promise<string> {
const [t] = await sql<{ id: string }[]>`
INSERT INTO tasks (project_id, input, agent, session_id, state, started_at)
VALUES (${projectId}, 'do a thing', ${agent}, ${sessionId}, ${state}, clock_timestamp())
RETURNING id
`;
return t!.id;
}
it('reaches cancelExternal and lands the task cancelled for a running external task', async () => {
externalReturns = true;
externalCancelCalls.length = 0;
const taskId = await insertTask('opencode', 'running');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ cancelled: true });
expect(externalCancelCalls).toContain(taskId);
const [row] = await sql<{ state: string; ended_at: Date | null }[]>`
SELECT state, ended_at FROM tasks WHERE id = ${taskId}
`;
expect(row!.state).toBe('cancelled');
expect(row!.ended_at).not.toBeNull();
});
it('still cancels a native boocode task (cancelExternal returns false → inference.cancel path unchanged)', async () => {
externalReturns = false; // native task: no controller registered
externalCancelCalls.length = 0;
inferenceCancelCalls.length = 0;
const taskId = await insertTask(null, 'running');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(200);
// The route calls cancelExternal unconditionally (cheap, returns false here)...
expect(externalCancelCalls).toContain(taskId);
// ...and the native inference.cancel path still fires for the open chat.
expect(inferenceCancelCalls).toContainEqual([sessionId, chatId]);
const [row] = await sql<{ state: string }[]>`SELECT state FROM tasks WHERE id = ${taskId}`;
expect(row!.state).toBe('cancelled');
});
it('rejects cancelling an already-terminal task with 409 and never touches the abort hook', async () => {
externalCancelCalls.length = 0;
const taskId = await insertTask('opencode', 'completed');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(409);
expect(externalCancelCalls).not.toContain(taskId);
});
it('returns 404 for an unknown task', async () => {
const res = await app.inject({
method: 'POST',
url: `/api/tasks/00000000-0000-0000-0000-000000000000/cancel`,
});
expect(res.statusCode).toBe(404);
});
});

View File

@@ -0,0 +1,59 @@
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
// v2.6 Phase 1-UX (design §9b): chat-scoped "resumed vs new session" indicator.
// `agent_sessions` is keyed (chat_id, agent) — the tab/chat is the agent-context
// unit (P1.5-b). The route param is a SESSION id, so we resolve every chat in the
// session and return the union of their agent_sessions rows. A session with two
// opencode tabs yields two rows (one per chat); the frontend keys the chip per
// chat, but the wire shape is a flat per-(chat,agent) list.
//
// has_session = agent_session_id IS NOT NULL — i.e. a native backend session id
// (opencode/ACP) was created and stored, so switching back resumes rather than
// starts fresh.
export interface AgentSessionRow {
agent: string;
status: string;
has_session: boolean;
last_active_at: string | null;
// v2.6.8 per-(chat,agent) running token/cost totals (sampling-streamjson-tokens
// #8). BIGINT columns arrive as strings over the wire; the frontend coerces.
input_tokens: number;
output_tokens: number;
cost: number;
}
export function registerAgentSessionRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/sessions/:sessionId/agent-sessions — list the agent-session rows for
// every chat in the session (drives the AgentComposerBar resumed/new chip).
app.get<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/agent-sessions',
async (req, reply) => {
const sessionId = req.params.sessionId;
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
// Join through chats so the session-scoped param resolves to its (chat,agent)
// rows. last_active_at first → the frontend reads the freshest activity.
const rows = await sql<AgentSessionRow[]>`
SELECT
a.agent AS agent,
a.status AS status,
(a.agent_session_id IS NOT NULL) AS has_session,
a.last_active_at AS last_active_at,
a.input_tokens AS input_tokens,
a.output_tokens AS output_tokens,
a.cost AS cost
FROM agent_sessions a
JOIN chats c ON c.id = a.chat_id
WHERE c.session_id = ${sessionId}
ORDER BY a.last_active_at DESC NULLS LAST, a.agent ASC
`;
return rows;
},
);
}

View File

@@ -8,6 +8,36 @@ interface WorkspacePaneRow {
activeChatIdx?: number;
}
// v2.6.5: sessions.workspace_panes widened from a bare WorkspacePane[] to a
// WorkspaceState envelope { panes, tabNumbers, nextTabNumber, closedPaneStack }.
// (See the union validator in apps/server routes/sessions.ts + normalizeWorkspaceState
// in apps/server read_tab_by_number.ts — this is the coder-side mirror.)
interface WorkspaceStateRow {
panes: WorkspacePaneRow[];
tabNumbers: Record<string, number>;
nextTabNumber: number;
closedPaneStack: unknown[];
}
// MIGRATION: the stored value may be the legacy bare array OR the envelope.
// Normalize to a full envelope so callers always read `.panes` as an array and
// write the envelope back intact (preserving tabNumbers/nextTabNumber/closedPaneStack).
export function normalizeWorkspaceState(v: unknown): WorkspaceStateRow {
if (Array.isArray(v)) {
return { panes: v as WorkspacePaneRow[], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
}
if (v && typeof v === 'object' && Array.isArray((v as { panes?: unknown }).panes)) {
const env = v as Partial<WorkspaceStateRow>;
return {
panes: env.panes ?? [],
tabNumbers: env.tabNumbers ?? {},
nextTabNumber: env.nextTabNumber ?? 1,
closedPaneStack: env.closedPaneStack ?? [],
};
}
return { panes: [], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
}
function chatNameForKind(kind: string): string {
if (kind === 'coder' || kind === 'agent') return 'BooCoder';
if (kind === 'terminal') return 'Terminal';
@@ -28,12 +58,13 @@ export async function resolveChatId(
paneId: string,
): Promise<string | null> {
return sql.begin(async (tx) => {
const sessionRows = await tx<{ workspace_panes: WorkspacePaneRow[] }[]>`
const sessionRows = await tx<{ workspace_panes: unknown }[]>`
SELECT workspace_panes FROM sessions WHERE id = ${sessionId} FOR UPDATE
`;
if (sessionRows.length === 0) return null;
const panes = sessionRows[0]!.workspace_panes ?? [];
const state = normalizeWorkspaceState(sessionRows[0]!.workspace_panes);
const panes = state.panes;
const paneIdx = panes.findIndex((p) => p.id === paneId);
if (paneIdx < 0) return null;
@@ -69,9 +100,10 @@ export async function resolveChatId(
: p,
);
const nextState: WorkspaceStateRow = { ...state, panes: nextPanes };
await tx`
UPDATE sessions
SET workspace_panes = ${tx.json(nextPanes as never)},
SET workspace_panes = ${tx.json(nextState as never)},
updated_at = clock_timestamp()
WHERE id = ${sessionId}
`;

View File

@@ -0,0 +1,73 @@
/**
* write-edit-robustness #4 — checkpoint restore + list routes (coder side).
*
* Proxied through the apps/server `/api/coder/*` blanket forwarder (no server-side
* change needed for new routes). Restore rewinds the session worktree to the
* checkpoint's shadow commit, trims the transcript from the anchor message forward,
* and resets the agent backend — see services/checkpoints.ts.
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { restoreCheckpoint, CheckpointNotFoundError } from '../services/checkpoints.js';
export function registerCheckpointRoutes(app: FastifyInstance, sql: Sql): void {
// GET /api/sessions/:sessionId/checkpoints?chat_id= — list a chat's checkpoints
// so the frontend can mark which messages have a restore point. When chat_id is
// omitted, returns every checkpoint for the session's chats.
app.get<{ Params: { sessionId: string }; Querystring: { chat_id?: string } }>(
'/api/sessions/:sessionId/checkpoints',
async (req, reply) => {
const sessionId = req.params.sessionId;
const chatId = req.query.chat_id;
const session = await sql<{ id: string }[]>`SELECT id FROM sessions WHERE id = ${sessionId}`;
if (session.length === 0) {
reply.code(404);
return { error: 'session not found' };
}
// Scope authoritatively through chats.session_id (always set) — NOT the
// denormalized checkpoints.session_id (nullable). The chat_id branch must
// still be session-gated or it's an IDOR (any session's chat_id reads its
// checkpoints).
const rows = chatId
? await sql<{ id: string; chat_id: string; message_id: string | null; label: string | null; created_at: Date }[]>`
SELECT cp.id, cp.chat_id, cp.message_id, cp.label, cp.created_at
FROM checkpoints cp
JOIN chats c ON c.id = cp.chat_id
WHERE cp.chat_id = ${chatId} AND c.session_id = ${sessionId}
ORDER BY cp.created_at
`
: await sql<{ id: string; chat_id: string; message_id: string | null; label: string | null; created_at: Date }[]>`
SELECT cp.id, cp.chat_id, cp.message_id, cp.label, cp.created_at
FROM checkpoints cp
JOIN chats c ON c.id = cp.chat_id
WHERE c.session_id = ${sessionId}
ORDER BY cp.created_at
`;
return rows;
},
);
// POST /api/sessions/:sessionId/checkpoints/:checkpointId/restore — restore.
app.post<{ Params: { sessionId: string; checkpointId: string } }>(
'/api/sessions/:sessionId/checkpoints/:checkpointId/restore',
async (req, reply) => {
const { sessionId, checkpointId } = req.params;
try {
const result = await restoreCheckpoint(sql, checkpointId, {
sessionId,
log: app.log,
});
return result;
} catch (err) {
if (err instanceof CheckpointNotFoundError) {
reply.code(404);
return { error: err.message };
}
throw err;
}
},
);
}

View File

@@ -0,0 +1,122 @@
/**
* v2.6 Phase 3 (3.3) — chat/session close-or-archive cleanup hook (coder side).
*
* Chat/session close + archive + delete all live in apps/server (Docker), which
* cannot see the host worktree dirs (/tmp/booworktrees), run git on them, or reach
* the warm agent processes the dispatcher pooled in THIS (host systemd) process. So
* — exactly like the `worktree-risk` guard — the server signals the coder when a
* chat/session closes, and the coder does the real teardown:
* 1. dispose the chat's warm-ACP backends (`agentPool.closeChat`) — kills the
* goose/qwen child processes for that chat,
* 2. close the chat's opencode session on the shared server (`closeSession`),
* 3. mark every `agent_sessions` row for the chat 'closed' + (when the session's
* last open chat closes) remove the shared session worktree, preflighting
* work-at-risk so uncommitted/unmerged work is never silently dropped
* (`closeChatBackendState`).
*
* Idempotent: closing an already-closed chat is a no-op (0 rows, no backend).
*
* SERVER WIRING (not done here — apps/server, out of this batch's scope): the
* server's `POST /api/chats/:id/archive`, `DELETE /api/chats/:id`, and the
* session archive/delete routes should fire-and-forget
* fetch(`${BOOCODER_URL}/api/chats/${id}/close`, { method: 'POST' })
* after publishing their WS frame (best-effort; the orphan-worktree reaper +
* idle-pool eviction are the backstop if the call is missed).
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { agentPool, OPENCODE_POOL_KEY } from '../services/agent-pool.js';
import { closeChatBackendState } from '../services/worktrees.js';
import type { AgentSessionHandle } from '../services/agent-backend.js';
export function registerLifecycleRoutes(app: FastifyInstance, sql: Sql): void {
// POST /api/chats/:chatId/close — tear down all warm state for a chat tab.
app.post<{ Params: { chatId: string }; Querystring: { force?: string } }>(
'/api/chats/:chatId/close',
async (req) => {
const chatId = req.params.chatId;
const force = req.query.force === 'true' || req.query.force === '1';
// 1. Close the chat's opencode session on the SHARED server (the server is
// not chat-keyed, so agentPool.closeChat won't touch it). Resolve the
// stored opencode session id and ask the backend to drop it.
const ocRows = await sql<{ agent: string; agent_session_id: string | null; worktree_id: string | null; session_id: string | null }[]>`
SELECT agent, agent_session_id, worktree_id, session_id
FROM agent_sessions
WHERE chat_id = ${chatId} AND backend = 'opencode_server'
`;
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
for (const row of ocRows) {
if (!row.agent_session_id) continue;
const handle: AgentSessionHandle = {
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
};
await ocBackend.closeSession(handle).catch((err) => {
app.log.warn({ err: err instanceof Error ? err.message : String(err), chatId }, 'lifecycle: opencode closeSession threw');
});
}
}
// 2. Dispose any warm-ACP backends pooled under this chat (kills the
// goose/qwen child + marks its agent row closed via the backend).
const disposed = await agentPool.closeChat(chatId);
// 3. DB + worktree truth: mark agent rows closed; remove the shared session
// worktree iff this was the session's last open chat (preflight at-risk).
const result = await closeChatBackendState(sql, chatId, { force });
app.log.info({ chatId, disposed, ...result }, 'lifecycle: chat closed');
return { ok: true, disposed, ...result };
},
);
// POST /api/sessions/:sessionId/close — close every open chat in a session
// (session archive/delete). Loops the chat-close path so the same preflight +
// teardown applies per chat; the worktree is removed on the last one.
app.post<{ Params: { sessionId: string }; Querystring: { force?: string } }>(
'/api/sessions/:sessionId/close',
async (req) => {
const sessionId = req.params.sessionId;
const force = req.query.force === 'true' || req.query.force === '1';
const chats = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE session_id = ${sessionId}
`;
const results: { chatId: string; disposed: string[]; worktreeRemoved: boolean; worktreeAtRisk: boolean }[] = [];
for (const c of chats) {
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
const ocRows = await sql<{ agent: string; agent_session_id: string | null; worktree_id: string | null; session_id: string | null }[]>`
SELECT agent, agent_session_id, worktree_id, session_id
FROM agent_sessions WHERE chat_id = ${c.id} AND backend = 'opencode_server'
`;
for (const row of ocRows) {
if (!row.agent_session_id) continue;
await ocBackend.closeSession({
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId: c.id,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
}).catch(() => {});
}
}
const disposed = await agentPool.closeChat(c.id);
const r = await closeChatBackendState(sql, c.id, { force });
results.push({ chatId: c.id, disposed, worktreeRemoved: r.worktreeRemoved, worktreeAtRisk: r.worktreeAtRisk });
}
app.log.info({ sessionId, chats: results.length }, 'lifecycle: session closed');
return { ok: true, results };
},
);
}

View File

@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import { resolveChatId } from './chat-resolve.js';
const AnswerUserInputBody = z.object({
@@ -53,6 +53,9 @@ interface MessageRow {
role: string;
content: string | null;
status: string | null;
model: string | null;
ctx_used: number | null;
ctx_max: number | null;
tool_calls: Array<{ id: string; name: string; args?: Record<string, unknown> }> | null;
tool_results: {
tool_call_id: string;
@@ -88,6 +91,9 @@ function mapCoderMessageRow(row: MessageRow) {
role: row.role as 'user' | 'assistant' | 'system',
content: row.content ?? '',
status: (row.status ?? 'complete') as 'streaming' | 'complete' | 'failed',
...(row.model ? { model: row.model } : {}),
...(row.ctx_used != null ? { ctx_used: row.ctx_used } : {}),
...(row.ctx_max != null ? { ctx_max: row.ctx_max } : {}),
...(reasoningText ? { reasoning_text: reasoningText } : {}),
...(tool_calls?.length ? { tool_calls } : {}),
};
@@ -126,13 +132,13 @@ export function registerMessageRoutes(
const rows = chatId
? await sql<MessageRow[]>`
SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
SELECT id, role, content, status, model, ctx_used, ctx_max, tool_calls, tool_results, reasoning_parts
FROM messages_with_parts
WHERE session_id = ${sessionId} AND chat_id = ${chatId}
ORDER BY created_at ASC, id ASC
`
: await sql<MessageRow[]>`
SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
SELECT id, role, content, status, model, ctx_used, ctx_max, tool_calls, tool_results, reasoning_parts
FROM messages_with_parts
WHERE session_id = ${sessionId}
ORDER BY created_at ASC, id ASC
@@ -224,8 +230,8 @@ export function registerMessageRoutes(
// External provider: create a task for the dispatcher
const projectId = sessionRows[0]!.project_id;
const [task] = await sql<{ id: string; state: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, session_id)
VALUES (${projectId}, ${content}, ${provider}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null}, ${sessionId})
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, session_id, chat_id)
VALUES (${projectId}, ${content}, ${provider}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null}, ${sessionId}, ${chatId})
RETURNING id, state
`;
reply.code(202);

View File

@@ -1,4 +1,5 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import {
listPending,
@@ -6,7 +7,15 @@ import {
applyAll,
rejectOne,
rewindOne,
queueCreate,
} from '../services/pending_changes.js';
import { WriteGuardError } from '../services/write_guard.js';
import { rebaselineWorktreeAfterApply } from '../services/worktrees.js';
const CreateBody = z.object({
file_path: z.string().min(1),
content: z.string(),
});
/**
* Resolve project root from a session's project path.
@@ -51,6 +60,51 @@ export function registerPendingRoutes(app: FastifyInstance, sql: Sql): void {
},
);
// POST /api/sessions/:sessionId/pending/create — queue a new-file create
// (manual create from the RightRail file browser; no inference involved).
// queueCreate runs resolveWritePath internally, so a path that escapes the
// project root or hits a secret file throws WriteGuardError → 422 with the
// guard message. Mirrors the { error } 404 shape used by the other routes
// and the 422 status used by apply/rewind on failure.
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/pending/create',
async (req, reply) => {
const sessionId = req.params.sessionId;
const parsed = CreateBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid body', details: parsed.error.flatten() };
}
const projectRoot = await resolveProjectRoot(sql, sessionId);
if (!projectRoot) {
reply.code(404);
return { error: 'session or project not found' };
}
try {
const change = await queueCreate(
sql,
sessionId,
null,
parsed.data.file_path,
parsed.data.content,
projectRoot,
// Manual RightRail create — no agent staged it; renders as "manual".
null,
);
return change;
} catch (err) {
if (err instanceof WriteGuardError) {
reply.code(422);
return { error: err.message };
}
throw err;
}
},
);
// POST /api/sessions/:sessionId/pending/apply — apply all pending changes
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/pending/apply',
@@ -64,6 +118,15 @@ export function registerPendingRoutes(app: FastifyInstance, sql: Sql): void {
}
const results = await applyAll(sql, sessionId, projectRoot);
// v2.6 Phase 3 (3.5): re-baseline the session worktree's diff to the applied
// state, so the next external-agent turn diffs against applied-not-original
// and doesn't re-surface the just-applied changes. Best-effort: a worktree
// session may not exist (native-only chat), and a re-baseline hiccup must not
// fail the apply the user just requested.
if (results.some((r) => r.success)) {
await rebaselineWorktreeAfterApply(sql, sessionId).catch(() => {});
}
return { results };
},
);
@@ -83,6 +146,15 @@ export function registerPendingRoutes(app: FastifyInstance, sql: Sql): void {
const result = await applyOne(sql, changeId, projectRoot);
if (!result.success) {
reply.code(422);
} else {
// v2.6 Phase 3 (3.5): re-baseline the session worktree after a successful
// apply so the next external-agent turn diffs against applied-not-original.
// Resolve the change's session; best-effort, never fails the apply.
const sessRows = await sql<{ session_id: string }[]>`
SELECT session_id FROM pending_changes WHERE id = ${changeId}
`;
const sessionId = sessRows[0]?.session_id;
if (sessionId) await rebaselineWorktreeAfterApply(sql, sessionId).catch(() => {});
}
return result;
},

View File

@@ -1,7 +1,29 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Config } from '../config.js';
import { getProviderSnapshot, clearProviderSnapshotCache } from '../services/provider-snapshot.js';
import {
getProviderSnapshot,
clearProviderSnapshotCache,
peekSnapshotEntry,
} from '../services/provider-snapshot.js';
import {
load,
save,
CoderProvidersFileSchema,
ProviderConfigPatchSchema,
mergeProviderConfigPatch,
} from '../services/provider-config.js';
import {
reloadProviderConfig,
getResolvedRegistry,
} from '../services/provider-config-registry.js';
import {
getProviderDiagnostic,
type DiagnosticAgentRow,
} from '../services/provider-diagnostic.js';
const RefreshBodySchema = z.object({ providers: z.array(z.string()).optional() });
export function registerProviderRoutes(app: FastifyInstance, sql: Sql, config: Config): void {
app.get<{ Querystring: { cwd?: string } }>('/api/providers/snapshot', async (req, _reply) => {
@@ -9,9 +31,97 @@ export function registerProviderRoutes(app: FastifyInstance, sql: Sql, config: C
return getProviderSnapshot(sql, config, cwd);
});
app.post('/api/providers/refresh', async (_req, _reply) => {
// 4.1 — current loaded config file (raw CoderProvidersFile, not the resolved registry).
app.get('/api/providers/config', async (_req, _reply) => {
return load(config.CODER_PROVIDERS_PATH);
});
// 4.2 — patch the config file (design.md §6.2). Strict order is the whole
// correctness story: validate → save → reload → clear. A malformed body or an
// invalid merged result returns 422 and NEVER writes; a save failure returns
// 500 and leaves in-memory state untouched (no file/registry divergence).
app.patch('/api/providers/config', async (req, reply) => {
// 1. Validate the PATCH body shape (malformed → 422, never reaches merge).
const parsed = ProviderConfigPatchSchema.safeParse(req.body);
if (!parsed.success) {
return reply.code(422).send({
error: 'invalid provider config patch',
issues: parsed.error.flatten(),
});
}
// 2. Shallow per-id merge over the current file (null deletes; object replaces).
const current = load(config.CODER_PROVIDERS_PATH);
const merged = mergeProviderConfigPatch(current, parsed.data);
// 3. Validate the merged result — refuse to write a config that won't load.
const validated = CoderProvidersFileSchema.safeParse(merged);
if (!validated.success) {
return reply.code(422).send({
error: 'merged provider config is invalid',
issues: validated.error.flatten(),
});
}
// 4. Persist. If save throws, STOP here — do NOT reload/clear, so the file on
// disk and the in-memory resolved registry can never diverge.
try {
save(config.CODER_PROVIDERS_PATH, validated.data);
} catch (err) {
req.log.error(
{ err: err instanceof Error ? err.message : String(err), path: config.CODER_PROVIDERS_PATH },
'provider-config: save failed — in-memory state untouched',
);
return reply.code(500).send({ error: 'failed to write provider config' });
}
// 5 + 6. Rebuild the in-memory resolved registry from the new file, then drop
// the snapshot cache so the next /snapshot reflects the change.
reloadProviderConfig();
clearProviderSnapshotCache();
// 7. Return the new config (per §6.2 `{ ok: true }`, plus the merged providers
// so the client can update without a follow-up GET).
return { ok: true, providers: validated.data.providers };
});
// 4.3 — force a cold probe. Optional { providers?: string[] } narrows the
// reported subset (design.md §6.3 Paseo pattern). The force=true snapshot is
// the only existing re-probe primitive (per-provider force would be a
// snapshot-internal change, out of Phase 4 scope), so the probe runs for all
// installed providers; the `refreshed` count reflects the requested subset.
app.post('/api/providers/refresh', async (req, reply) => {
const parsed = RefreshBodySchema.safeParse(req.body ?? {});
if (!parsed.success) {
return reply.code(422).send({ error: 'invalid refresh body', issues: parsed.error.flatten() });
}
const subset = parsed.data.providers;
clearProviderSnapshotCache();
const entries = await getProviderSnapshot(sql, config, undefined, true);
return { refreshed: entries.length };
const refreshed =
subset && subset.length > 0
? entries.filter((e) => subset.includes(e.name)).length
: entries.length;
return { refreshed };
});
// 4.4 — per-provider diagnostic (design.md §6.4 → JSON `{ diagnostic: string }`).
// Read-only: reports cached state (resolved def + available_agents row + warm
// snapshot cache for the last probe error) plus a `which` PATH check. No probe
// spawn. The report itself is a plaintext block (§8); the route wraps it as JSON.
app.get<{ Params: { id: string } }>('/api/providers/:id/diagnostic', async (req, reply) => {
const id = req.params.id;
const resolved = getResolvedRegistry().get(id);
if (!resolved) {
return reply.code(404).send({ error: `unknown provider '${id}'` });
}
const rows = await sql<DiagnosticAgentRow[]>`
SELECT name, install_path, supports_acp, models, last_probed_at
FROM available_agents WHERE name = ${id}
`;
const report = await getProviderDiagnostic(resolved, rows[0], {
cachedEntry: peekSnapshotEntry(id),
});
return { diagnostic: report };
});
}

View File

@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import { getSkillBody } from '@boocode/server/skills';
import {
buildSkillInvokeSyntheticFrames,
@@ -16,6 +16,12 @@ const SkillInvokeBody = z.object({
pane_id: z.string().min(1).max(200),
skill_name: z.string().min(1),
user_message: z.string().max(64_000).nullable().optional(),
// v2.5.9: when set to an external provider, the skill runs UNDER that agent —
// its body is injected into a dispatched task instead of native inference.
provider: z.string().max(100).optional(),
model: z.string().max(200).optional(),
mode_id: z.string().max(200).optional(),
thinking_option_id: z.string().max(200).optional(),
});
interface InferenceApi {
@@ -39,9 +45,9 @@ export function registerSkillRoutes(
}
const sessionId = req.params.sessionId;
const { pane_id, skill_name } = parsed.data;
const sessionRows = await sql<{ id: string }[]>`
SELECT id FROM sessions WHERE id = ${sessionId}
const { pane_id, skill_name, provider, model, mode_id, thinking_option_id } = parsed.data;
const sessionRows = await sql<{ id: string; project_id: string }[]>`
SELECT id, project_id FROM sessions WHERE id = ${sessionId}
`;
if (sessionRows.length === 0) {
reply.code(404);
@@ -69,6 +75,31 @@ export function registerSkillRoutes(
return { error: 'unknown_skill', message: `unknown skill: ${skill_name}` };
}
// v2.5.9: external agent → run the skill UNDER that agent. The skill body
// stays server-side (like the native path's tool message) and is injected
// into a dispatched task; the agent receives the skill instructions + the
// user's text. Mirrors the messages-route external-provider dispatch.
if (provider && provider !== 'boocode') {
const [userMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'user', ${userText}, 'complete', clock_timestamp())
RETURNING id
`;
broker.publishFrame(sessionId, { type: 'message_started', message_id: userMsg!.id, chat_id: chatId, role: 'user' } as WsFrame);
broker.publishFrame(sessionId, { type: 'delta', message_id: userMsg!.id, chat_id: chatId, content: userText } as WsFrame);
broker.publishFrame(sessionId, { type: 'message_complete', message_id: userMsg!.id, chat_id: chatId } as WsFrame);
const taskInput = `${body}\n\n---\n\n${userText}`;
const [task] = await sql<{ id: string; state: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, thinking_option_id, session_id, chat_id)
VALUES (${sessionRows[0]!.project_id}, ${taskInput}, ${provider}, ${model ?? null}, ${mode_id ?? null}, ${thinking_option_id ?? null}, ${sessionId}, ${chatId})
RETURNING id, state
`;
await sql`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chatId}`;
reply.code(202);
return { user_message_id: userMsg!.id, task_id: task!.id, dispatched: true };
}
const { result, toolCall } = await runSkillInvokeTransaction(sql, {
sessionId,
chatId,

View File

@@ -8,6 +8,12 @@ interface InferenceApi {
cancel: (sessionId: string, chatId: string) => Promise<boolean>;
}
// F1: the dispatcher's reach into an in-flight external-agent run. Narrow by
// design (not the whole dispatcher) — the route only needs to fire the abort.
// Returns true when a controller was registered for the task (an external run was
// in flight), false otherwise (native boocode task, or already finished).
export type ExternalCancelFn = (taskId: string) => boolean;
const CreateBody = z.object({
project_id: z.string().uuid(),
input: z.string().min(1).max(64_000),
@@ -27,7 +33,12 @@ const ListQuery = z.object({
project_id: z.string().uuid().optional(),
});
export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: InferenceApi): void {
export function registerTaskRoutes(
app: FastifyInstance,
sql: Sql,
inference: InferenceApi,
cancelExternal: ExternalCancelFn,
): void {
// POST /api/tasks — create a new task
app.post('/api/tasks', async (req, reply) => {
const parsed = CreateBody.safeParse(req.body);
@@ -95,7 +106,7 @@ export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: In
// GET /api/tasks/:id — single task detail
app.get<{ Params: { id: string } }>('/api/tasks/:id', async (req, reply) => {
const rows = await sql`
SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, execution_path, worktree_path, session_id, cost_tokens, started_at, ended_at, created_at
SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, execution_path, session_id, cost_tokens, started_at, ended_at, created_at
FROM tasks
WHERE id = ${req.params.id}
`;
@@ -127,7 +138,14 @@ export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: In
cancelPendingPermission(taskId);
// If running, try to cancel inference
// F1: abort the in-flight external-agent run (opencode / goose / qwen / claude).
// Idempotent — a double-Stop re-aborts harmlessly; a native boocode task is not
// registered, so this returns false and the inference.cancel path below handles
// it unchanged. The dispatcher's run-function finalizes the streaming assistant
// message as 'cancelled' once the backend honors the signal.
cancelExternal(taskId);
// If running, try to cancel inference (native boocode path — unchanged).
if ((task.state === 'running' || task.state === 'blocked') && task.session_id) {
// Find active chat in the task's session
const chats = await sql<{ id: string }[]>`

View File

@@ -0,0 +1,45 @@
/**
* Session-delete work-loss guard (coder side).
*
* Session delete itself lives in apps/server (Docker), which CANNOT see the
* host worktree dirs (/tmp/booworktrees) or run git on them. Only BooCoder
* (host systemd) can. So the server's DELETE route calls these endpoints
* pre-delete to learn whether a session's worktree holds work at risk, and to
* stash it. The server owns the gate; coder owns the git truth.
*/
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import { checkWorktreeWorkAtRisk, stashWorktree } from '../services/worktree-risk.js';
export function registerWorktreeSafetyRoutes(app: FastifyInstance, sql: Sql): void {
// GET risk for a session's worktree(s). One row per session today (PK on
// session_id); the loop already handles the Phase-1.5 multi-worktree case.
app.get<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/worktree-risk',
async (req) => {
const rows = await sql<{ worktree_path: string }[]>`
SELECT path AS worktree_path FROM worktrees WHERE session_id = ${req.params.sessionId}
`;
const reports = [];
for (const row of rows) {
reports.push(await checkWorktreeWorkAtRisk(row.worktree_path));
}
return { reports };
},
);
// Stash a session's worktree(s) — clears the dirty risk; recoverable.
app.post<{ Params: { sessionId: string } }>(
'/api/sessions/:sessionId/worktree-stash',
async (req) => {
const rows = await sql<{ worktree_path: string }[]>`
SELECT path AS worktree_path FROM worktrees WHERE session_id = ${req.params.sessionId}
`;
const results = [];
for (const row of rows) {
results.push({ worktreePath: row.worktree_path, ...(await stashWorktree(row.worktree_path)) });
}
return { results };
},
);
}

View File

@@ -25,7 +25,7 @@ export function registerWebSocket(
// Send snapshot of existing messages so client can hydrate
const messages = await sql<Record<string, unknown>[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, last_seq,
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, model, last_seq,
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
summary, tail_start_id, compacted_at
FROM messages_with_parts

View File

@@ -25,7 +25,6 @@ CREATE TABLE IF NOT EXISTS tasks (
agent TEXT,
model TEXT,
execution_path TEXT,
worktree_path TEXT,
cost_tokens INTEGER,
started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ,
@@ -39,9 +38,9 @@ CREATE TABLE IF NOT EXISTS available_agents (
install_path TEXT,
version TEXT,
supports_acp BOOLEAN NOT NULL DEFAULT false,
supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
last_probed_at TIMESTAMPTZ
);
ALTER TABLE available_agents DROP COLUMN IF EXISTS supports_mcp_client;
-- v2.0.0 Phase 4: link tasks to their inference sessions.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id UUID REFERENCES sessions(id);
@@ -66,8 +65,226 @@ CREATE OR REPLACE VIEW human_inbox AS
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS models JSONB DEFAULT '[]'::jsonb;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS label TEXT;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS transport TEXT DEFAULT 'pty';
-- v2.5.10: persisted ACP available_commands (captured during the cold probe), so
-- an agent's live command set survives the tier-2 probe skip and shows without a
-- dispatch.
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS commands JSONB DEFAULT '[]'::jsonb;
-- v2.2.0: Paseo-style session config on tasks.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS mode_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS thinking_option_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS feature_values JSONB;
-- tasks.feature_values and tasks.worktree_path were never read or written by any
-- code path; drop them from existing DBs (fresh DBs never had them in the CREATE).
-- human_inbox is `SELECT *` over tasks, so it pins every task column — dropping a
-- column while the view exists fails (2BP01). Drop the view, drop the columns, then
-- recreate it with the current column set (idempotent on fresh + existing DBs).
DROP VIEW IF EXISTS human_inbox;
ALTER TABLE tasks DROP COLUMN IF EXISTS feature_values;
ALTER TABLE tasks DROP COLUMN IF EXISTS worktree_path;
CREATE OR REPLACE VIEW human_inbox AS
SELECT * FROM tasks WHERE state IN ('blocked', 'failed');
-- v2.6: one backend session per (session, agent); resumed on switch-back.
CREATE TABLE IF NOT EXISTS agent_sessions (
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
agent TEXT NOT NULL,
backend TEXT NOT NULL,
agent_session_id TEXT,
server_port INTEGER,
status TEXT NOT NULL DEFAULT 'idle',
last_active_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
PRIMARY KEY (session_id, agent),
CONSTRAINT agent_sessions_backend_chk CHECK (backend IN ('opencode_server', 'acp_warm')),
CONSTRAINT agent_sessions_status_chk CHECK (status IN ('idle', 'active', 'crashed', 'closed'))
);
-- Migrate existing agent_sessions FK to CASCADE.
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'agent_sessions_session_id_fkey'
AND confdeltype <> 'c'
) THEN
ALTER TABLE agent_sessions DROP CONSTRAINT agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE CASCADE;
END IF;
END $$;
-- v2.6: config fingerprint for stale-session detection (auto-recover on model change).
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS config_hash TEXT;
-- v2.6 Phase 1-UX (U.6): opencode token/cost usage, ACCUMULATED per (chat_id, agent).
-- opencode's warm server emits `session.next.step.ended` once per LLM step (several
-- per multi-tool turn) carrying {tokens{input,output,reasoning,cache},cost}. We sum
-- each step's normalized {input,output,cost} onto the session row — running totals
-- for the whole conversation context, not last-step. Backend-only; no route/UI yet.
-- input_tokens folds in cache read+write; output_tokens folds in reasoning (see
-- backends/opencode-usage.ts). Defaults 0 so accumulation (col + delta) is well-defined.
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS input_tokens BIGINT NOT NULL DEFAULT 0;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS output_tokens BIGINT NOT NULL DEFAULT 0;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS cost DOUBLE PRECISION NOT NULL DEFAULT 0;
-- ─── P1.5-b (corrected): worktrees entity + re-key agent_sessions to (chat_id, agent) ───
-- The TAB (a chat) is the context unit: two opencode tabs in one session = two
-- independent contexts sharing one worktree. So agent_sessions keys on
-- (chat_id, agent), NOT (worktree_id, agent) or (session_id, agent). The
-- `worktrees` table is one-per-session (selectable later) and only referenced
-- informationally by agent_sessions.worktree_id (SET NULL); chat_id is the key.
--
-- PREREQUISITE: the unmigratable test session (35 chats, 1 agent_sessions row that
-- maps to no single chat) is DELETED before this runs, so agent_sessions is empty
-- and the chat_id backfill is N/A. If a row with NULL chat_id remains, the verify
-- gate below RAISEs and aborts — delete the offending session first.
-- worktree as a first-class entity; survives session delete (session_id SET NULL).
CREATE TABLE IF NOT EXISTS worktrees (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID REFERENCES sessions(id) ON DELETE SET NULL,
project_id UUID,
path TEXT NOT NULL,
branch TEXT,
base_commit TEXT,
slug TEXT,
status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active','archived')),
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE UNIQUE INDEX IF NOT EXISTS worktrees_active_path_uidx ON worktrees(path) WHERE status='active';
-- session_worktrees was superseded by worktrees (v2.6/P1.5-b); all rows migrated
-- before P2 cleanup. Drop the dead table; no-op on fresh DBs that never had it.
DROP TABLE IF EXISTS session_worktrees;
-- Dispatch hint: which chat (tab) a task belongs to. The coder message route and
-- skills route set it from the frontend tab; session-less creators (arena, MCP,
-- new_task, generic /api/tasks) leave it NULL and the dispatcher creates a chat.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS chat_id UUID REFERENCES chats(id) ON DELETE SET NULL;
-- Re-key columns on agent_sessions.
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS chat_id UUID;
ALTER TABLE agent_sessions ADD COLUMN IF NOT EXISTS worktree_id UUID;
-- BACKFILL-VERIFY GATE: the new PK is (chat_id, agent), so chat_id must be
-- non-null on every row before the swap. With the test session deleted this is a
-- 0-row assertion; if any row has NULL chat_id (an unmigratable pre-existing row),
-- abort loudly rather than create a degenerate (NULL, agent) key.
DO $$
DECLARE n int;
BEGIN
SELECT count(*) INTO n FROM agent_sessions WHERE chat_id IS NULL;
IF n > 0 THEN
RAISE EXCEPTION 'P1.5-b: % agent_sessions row(s) have NULL chat_id — delete the unmigratable session(s) before applying', n;
END IF;
END $$;
-- Swap PK (session_id,agent) → (chat_id,agent) + FKs (run-once, guarded on the new
-- FK's absence). chat_id CASCADEs from chats (closing a tab ends its context);
-- worktree_id is informational SET NULL; session_id defanged to nullable SET NULL.
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'agent_sessions_chat_id_fkey') THEN
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_pkey;
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ALTER COLUMN session_id DROP NOT NULL;
ALTER TABLE agent_sessions ALTER COLUMN chat_id SET NOT NULL;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_pkey PRIMARY KEY (chat_id, agent);
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_chat_id_fkey
FOREIGN KEY (chat_id) REFERENCES chats(id) ON DELETE CASCADE;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE SET NULL;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_worktree_id_fkey
FOREIGN KEY (worktree_id) REFERENCES worktrees(id) ON DELETE SET NULL;
END IF;
END $$;
-- P1.5-b follow-up: converge agent_sessions.session_id FK CASCADE → SET NULL.
-- The re-key block above re-adds session_id_fkey as SET NULL, but it is guarded on
-- chat_id_fkey's ABSENCE — so a DB already re-keyed to (chat_id, agent) while
-- session_id_fkey was still ON DELETE CASCADE never re-enters that block and stays
-- 'c'. This standalone guard flips it to SET NULL ('n'), matching worktree_id.
-- Idempotent (mirrors the session_worktrees defang's confdeltype check): only fires
-- while the FK is still CASCADE — a no-op on a fresh deploy (already 'n' from the
-- re-key block) and on every re-run thereafter.
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'agent_sessions_session_id_fkey'
AND confdeltype = 'c'
) THEN
ALTER TABLE agent_sessions ALTER COLUMN session_id DROP NOT NULL;
ALTER TABLE agent_sessions DROP CONSTRAINT agent_sessions_session_id_fkey;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_session_id_fkey
FOREIGN KEY (session_id) REFERENCES sessions(id) ON DELETE SET NULL;
END IF;
END $$;
-- v2.6: attribution for DiffPanel badges (Phase 1 UX reads this).
ALTER TABLE pending_changes ADD COLUMN IF NOT EXISTS agent TEXT;
-- write-edit-robustness #4: worktree checkpoints. A pre-turn shadow-commit of the
-- session worktree (tracked + untracked, captured without disturbing the real
-- index/working tree) stored in a private GC-safe ref refs/boocode/checkpoints/<id>.
-- Created best-effort before each external-agent turn (opencode / warm-ACP / one-shot
-- ACP+PTY); restore resets the worktree to commit_sha, trims the transcript from
-- message_id forward, and resets the backend session. chat_id CASCADEs from chats
-- (like agent_sessions); worktree_id SET NULL so a checkpoint outlives a reaped
-- worktree row. session_id / message_id are informational (no FK — message rows are
-- trimmed by a checkpoint restore and we must not block that on a dangling ref).
CREATE TABLE IF NOT EXISTS checkpoints (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
session_id UUID,
worktree_id UUID REFERENCES worktrees(id) ON DELETE SET NULL,
message_id UUID, -- anchor: the assistant turn row this checkpoint precedes
commit_sha TEXT NOT NULL, -- shadow-commit capturing the pre-turn worktree tree
label TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS checkpoints_chat_created_idx ON checkpoints(chat_id, created_at);
-- claude-sdk-sessionstore #9 (Part 1): append-only mirror of Claude Agent SDK
-- session transcripts. The SDK's SessionStore adapter writes one JSONL line per
-- entry; PostgresSessionStore (services/backends/claude-session-store.ts) inserts
-- one row per entry and replays them ORDER BY id on resume. The store is generic
-- per the SDK's SessionKey (project_key, session_id, subpath) — chat↔session
-- ownership lives in agent_sessions, not here. subpath '' is the main transcript
-- (the SDK's undefined subpath maps to '' in the column).
CREATE TABLE IF NOT EXISTS claude_session_entries (
id BIGSERIAL PRIMARY KEY,
project_key TEXT NOT NULL,
session_id TEXT NOT NULL,
subpath TEXT NOT NULL DEFAULT '', -- '' = main transcript (SDK's undefined subpath maps here)
entry JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS claude_session_entries_key_idx ON claude_session_entries (project_key, session_id, subpath, id);
-- claude-sdk-sessionstore #9 (Part 2): the warm Claude-SDK backend persists its
-- agent_sessions rows with backend='claude_sdk'. Widen the named CHECK to accept
-- it. Idempotent: DROP the named constraint (the inline CREATE TABLE check above
-- carries this explicit name, so DROP IF EXISTS targets it) + re-ADD the widened
-- list. Re-runs/fresh deploys land on the same final constraint (the table-level
-- CREATE already includes only the old two values on a fresh DB; this block then
-- replaces it with the three-value list).
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_backend_chk;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_backend_chk
CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk'));
-- LISTEN/NOTIFY fast path: every tasks INSERT (from any call site — routes,
-- new_task tool, arena, MCP server) fires pg_notify('tasks_new') in the same
-- transaction, so the dispatcher reacts immediately instead of waiting for the
-- fallback poll. Postgres holds the notification until COMMIT, so the listener
-- always sees the committed row. A trigger covers all insert paths with no
-- app-code drift. Idempotent: re-applied on every startup.
CREATE OR REPLACE FUNCTION notify_tasks_new() RETURNS trigger AS $$
BEGIN
PERFORM pg_notify('tasks_new', '');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS tasks_notify_new ON tasks;
CREATE TRIGGER tasks_notify_new
AFTER INSERT ON tasks
FOR EACH ROW
EXECUTE FUNCTION notify_tasks_new();

View File

@@ -0,0 +1,50 @@
import { describe, it, expect, afterEach } from 'vitest';
import { mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { readWorktreeTextFile, writeWorktreeTextFile } from '../acp-client-fs.js';
const created: string[] = [];
function freshWorktree(): string {
const wt = mkdtempSync(join(tmpdir(), 'acp-wt-'));
created.push(wt);
return wt;
}
afterEach(() => {
for (const d of created.splice(0)) {
try {
rmSync(d, { recursive: true, force: true });
rmSync(`${d}-evil`, { recursive: true, force: true });
} catch {
/* ignore */
}
}
});
describe('acp-client-fs worktree scoping', () => {
it('writes then reads a file inside the worktree', async () => {
const wt = freshWorktree();
await writeWorktreeTextFile(wt, 'sub/dir/note.txt', 'hello');
expect(await readWorktreeTextFile(wt, 'sub/dir/note.txt')).toBe('hello');
});
it('rejects ../ traversal on read', async () => {
const wt = freshWorktree();
await expect(readWorktreeTextFile(wt, '../../etc/passwd')).rejects.toThrow(/escapes worktree/);
});
it('rejects ../ traversal on write', async () => {
const wt = freshWorktree();
await expect(writeWorktreeTextFile(wt, '../escape.txt', 'x')).rejects.toThrow(/escapes worktree/);
});
it('rejects a sibling-prefix path (the unbounded-startsWith bug)', async () => {
const wt = freshWorktree();
// Absolute path that shares the worktree as a STRING prefix but is a sibling
// dir: `<wt>-evil/...`. A bare `startsWith(<wt>)` wrongly admits it.
await expect(readWorktreeTextFile(wt, `${wt}-evil/secret.txt`)).rejects.toThrow(/escapes worktree/);
await expect(writeWorktreeTextFile(wt, `${wt}-evil/secret.txt`, 'x')).rejects.toThrow(
/escapes worktree/,
);
});
});

View File

@@ -0,0 +1,74 @@
import { describe, it, expect, vi } from 'vitest';
import type { RequestPermissionRequest, CreateElicitationRequest, SessionNotification } from '@agentclientprotocol/sdk';
import { buildAcpClient, type AcpTurnContext } from '../acp-client.js';
/**
* buildAcpClient (v2.7 audit reshape): the shared ACP `Client` closures. These
* tests cover the pure routing decisions that don't require the permission-waiter
* broker machinery — the auto-select/decline fallbacks and the between-turns drop.
*/
describe('buildAcpClient — sessionUpdate', () => {
it('drops the update when no turn is active (resolveTurn → null)', async () => {
const client = buildAcpClient('/wt', () => null);
// Must resolve without throwing and without an onSessionUpdate to call.
await expect(client.sessionUpdate({ sessionId: 's', update: {} } as unknown as SessionNotification)).resolves.toBeUndefined();
});
it('forwards the update to the active turn', async () => {
const onSessionUpdate = vi.fn();
const turn: AcpTurnContext = { taskId: 't', sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate };
const client = buildAcpClient('/wt', () => turn);
const note = { sessionId: 's', update: {} } as unknown as SessionNotification;
await client.sessionUpdate(note);
expect(onSessionUpdate).toHaveBeenCalledWith(note);
});
});
describe('buildAcpClient — requestPermission fallback (no UI routing)', () => {
function req(options: Array<{ optionId: string }>): RequestPermissionRequest {
return { options } as unknown as RequestPermissionRequest;
}
it('auto-selects the first option when there is no turn', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.requestPermission(req([{ optionId: 'allow' }, { optionId: 'deny' }]));
expect(res).toEqual({ outcome: { outcome: 'selected', optionId: 'allow' } });
});
it('cancels when there is no turn and no options', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.requestPermission(req([]));
expect(res).toEqual({ outcome: { outcome: 'cancelled' } });
});
it('auto-selects when the turn has no taskId (UI routing gated off)', async () => {
const turn: AcpTurnContext = { taskId: undefined, sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate: () => {} };
const client = buildAcpClient('/wt', () => turn);
const res = await client.requestPermission(req([{ optionId: 'ok' }]));
expect(res).toEqual({ outcome: { outcome: 'selected', optionId: 'ok' } });
});
});
describe('buildAcpClient — elicitation fallback', () => {
it('declines when there is no turn', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.unstable_createElicitation!({} as CreateElicitationRequest);
expect(res).toEqual({ action: 'decline' });
});
it('declines when the turn has no taskId', async () => {
const turn: AcpTurnContext = { taskId: undefined, sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate: () => {} };
const client = buildAcpClient('/wt', () => turn);
const res = await client.unstable_createElicitation!({} as CreateElicitationRequest);
expect(res).toEqual({ action: 'decline' });
});
});
describe('buildAcpClient — createTerminal', () => {
it('returns the noop terminal id', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.createTerminal!({} as never);
expect(res).toEqual({ terminalId: 'noop' });
});
});

View File

@@ -0,0 +1,110 @@
import { describe, it, expect } from 'vitest';
import type { SessionNotification } from '@agentclientprotocol/sdk';
import { mapSessionUpdate } from '../acp-event-map.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* Pure event-mapping shared by the one-shot ACP dispatch (AcpStreamContext) and
* the warm ACP backend (Phase 2). Mirrors the original handleSessionUpdate switch
* verbatim but returns normalized AgentEvents instead of publishing broker frames.
*/
describe('mapSessionUpdate (shared ACP event mapping)', () => {
function note(update: SessionNotification['update']): SessionNotification {
return { sessionId: 's1', update };
}
it('maps an agent_message_chunk text → a text event', () => {
const events = mapSessionUpdate(
note({ sessionUpdate: 'agent_message_chunk', content: { type: 'text', text: 'hello' } }),
);
expect(events).toEqual([{ type: 'text', text: 'hello' }]);
});
it('maps an agent_thought_chunk text → a reasoning event', () => {
const events = mapSessionUpdate(
note({ sessionUpdate: 'agent_thought_chunk', content: { type: 'text', text: 'thinking' } }),
);
expect(events).toEqual([{ type: 'reasoning', text: 'thinking' }]);
});
it('ignores non-text content on message/thought chunks', () => {
const img = mapSessionUpdate(
note({
sessionUpdate: 'agent_message_chunk',
content: { type: 'image', data: 'x', mimeType: 'image/png' },
} as never),
);
expect(img).toEqual([]);
});
it('maps a tool_call → a tool_call event with a merged snapshot', () => {
const events = mapSessionUpdate(
note({
sessionUpdate: 'tool_call',
toolCallId: 't1',
title: 'read_file',
status: 'pending',
rawInput: { path: 'a.ts' },
} as never),
);
expect(events).toHaveLength(1);
expect(events[0]!.type).toBe('tool_call');
const snap = (events[0] as { type: 'tool_call'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.toolCallId).toBe('t1');
expect(snap.title).toBe('read_file');
expect(snap.status).toBe('pending');
expect(snap.rawInput).toEqual({ path: 'a.ts' });
});
it('maps a tool_call_update → a tool_update event merged over the prior snapshot', () => {
const prior = new Map<string, AcpToolSnapshot>([
['t1', { toolCallId: 't1', title: 'read_file', status: 'pending', rawInput: { path: 'a.ts' } }],
]);
const events = mapSessionUpdate(
note({
sessionUpdate: 'tool_call_update',
toolCallId: 't1',
status: 'completed',
rawOutput: 'file body',
} as never),
prior,
);
expect(events).toHaveLength(1);
expect(events[0]!.type).toBe('tool_update');
const snap = (events[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.toolCallId).toBe('t1');
// merged: title carried from prior, status updated, output added, input retained
expect(snap.title).toBe('read_file');
expect(snap.status).toBe('completed');
expect(snap.rawOutput).toBe('file body');
expect(snap.rawInput).toEqual({ path: 'a.ts' });
});
it('maps available_commands_update → a commands event', () => {
const events = mapSessionUpdate(
note({
sessionUpdate: 'available_commands_update',
availableCommands: [
{ name: 'plan', description: 'make a plan' },
{ name: 'review', description: null },
],
} as never),
);
expect(events).toEqual([
{
type: 'commands',
commands: [
{ name: 'plan', description: 'make a plan' },
{ name: 'review', description: undefined },
],
},
]);
});
it('returns [] for unhandled update kinds (plan, mode change)', () => {
expect(mapSessionUpdate(note({ sessionUpdate: 'plan', entries: [] } as never))).toEqual([]);
expect(
mapSessionUpdate(note({ sessionUpdate: 'current_mode_update', currentModeId: 'code' } as never)),
).toEqual([]);
});
});

View File

@@ -0,0 +1,73 @@
import { describe, it, expect } from 'vitest';
import { resolveLaunchSpec, resolveAcpSpawnArgs } from '../acp-spawn.js';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import type { CoderProvidersFile } from '../provider-config.js';
import { PROVIDERS } from '../provider-registry.js';
/** Resolved def for a provider id under the given config (default: no override). */
function builtin(name: string, providers: CoderProvidersFile['providers'] = {}) {
const def = buildResolvedRegistry(PROVIDERS, { providers }).get(name);
if (!def) throw new Error(`no resolved def for ${name}`);
return def;
}
describe('resolveLaunchSpec', () => {
// --- byte-identical built-in regression (the HARD CONSTRAINT) ---------------
// These argv values are the pre-v2.3 resolveAcpSpawnArgs switch outputs and
// MUST NOT change. spawn() is `spawn(spec.binary, spec.args, ...)`, so argv
// parity here is dispatch parity.
it('opencode (no override) → byte-identical argv ["acp"], binary = installPath', () => {
const spec = resolveLaunchSpec(builtin('opencode'), '/usr/bin/opencode');
expect(spec).not.toBeNull();
expect(spec!.args).toEqual(['acp']); // pre-v2.3 value
expect(spec!.binary).toBe('/usr/bin/opencode');
expect(spec!.env).toBeUndefined();
// cross-check against the switch source-of-truth
expect(spec!.args).toEqual(resolveAcpSpawnArgs('opencode'));
});
it('goose → ["acp"], qwen → ["--acp"] (byte-identical)', () => {
expect(resolveLaunchSpec(builtin('goose'), '/usr/bin/goose')!.args).toEqual(['acp']);
expect(resolveLaunchSpec(builtin('qwen'), '/usr/bin/qwen')!.args).toEqual(['--acp']);
});
it('built-in with null installPath falls back to the bare id (pre-v2.3 `installPath ?? agent`)', () => {
const spec = resolveLaunchSpec(builtin('opencode'), null);
expect(spec!.binary).toBe('opencode');
expect(spec!.args).toEqual(['acp']);
});
it('non-ACP / unknown provider → null (claude has no ACP argv)', () => {
expect(resolveLaunchSpec(builtin('claude'), '/usr/bin/claude')).toBeNull();
expect(resolveLaunchSpec(builtin('boocode'), null)).toBeNull();
});
// --- config-driven launch (the new capability) ------------------------------
it('custom ACP entry → configured command + env reach the spec', () => {
const def = builtin('amp-acp', {
'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp', '--acp'], env: { AMP_KEY: 'x' } },
});
const spec = resolveLaunchSpec(def, '/usr/local/bin/amp-acp');
expect(spec).not.toBeNull();
expect(spec!.binary).toBe('amp-acp'); // command[0], not the resolved install path
expect(spec!.args).toEqual(['--acp']); // command.slice(1)
expect(spec!.env).toEqual({ AMP_KEY: 'x' });
});
it('built-in WITH a config command override uses the override, not the switch default', () => {
const def = builtin('opencode', { opencode: { command: ['opencode', 'acp', '--verbose'], env: { DEBUG: '1' } } });
const spec = resolveLaunchSpec(def, '/usr/bin/opencode');
expect(spec!.binary).toBe('opencode');
expect(spec!.args).toEqual(['acp', '--verbose']);
expect(spec!.env).toEqual({ DEBUG: '1' });
});
});
describe('acp-dispatch spawn wiring (documented pass-through)', () => {
// dispatchViaAcp spawns `spawn(spec.binary, spec.args, { env: { ...process.env, ...spec.env } })`.
// The env merge layers config env over process.env; for a built-in with no
// config env, spec.env is undefined → { ...process.env } (byte-identical).
it('built-in with no config env yields an undefined spec.env (→ plain process.env at spawn)', () => {
expect(resolveLaunchSpec(builtin('opencode'), '/usr/bin/opencode')!.env).toBeUndefined();
});
});

View File

@@ -0,0 +1,233 @@
import { describe, it, expect, vi } from 'vitest';
import { AgentPool, OPENCODE_POOL_KEY } from '../agent-pool.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/**
* v2.6 Phase 3 — AgentPool lifecycle unit test (T.1). No DB / no child process:
* a fake AgentBackend records dispose + reports busy/health, so we exercise
* get-or-create, idle eviction, the LRU cap, the busy-never-evict rule, closeChat,
* and dispose-drains directly. The pure decisions are covered separately in
* backends/__tests__/lifecycle-decisions.test.ts; this verifies the wiring.
*/
class FakeBackend implements AgentBackend {
disposed = 0;
closedSessions = 0;
private busyFlag = false;
tickHealthCalls = 0;
constructor(public readonly name = 'fake') {}
setBusy(b: boolean): void {
this.busyFlag = b;
}
// — AgentBackend —
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
return {
sessionId,
agent: opts.agent,
backend: 'acp_warm',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: 'fake-session',
serverPort: null,
};
}
async prompt(_h: AgentSessionHandle, _input: string, _ctx: PromptCtx): Promise<TurnResult> {
return { ok: true };
}
async closeSession(): Promise<void> {
this.closedSessions++;
}
async dispose(): Promise<void> {
this.disposed++;
}
health(): 'up' | 'down' {
return 'up';
}
isBusy(): boolean {
return this.busyFlag;
}
async tickHealth(): Promise<void> {
this.tickHealthCalls++;
}
}
describe('AgentPool — get/register/touch (3.1)', () => {
it('register then get returns the same backend', () => {
const pool = new AgentPool();
const b = new FakeBackend();
pool.register('chat-1', 'goose', b);
expect(pool.get('chat-1', 'goose')).toBe(b);
expect(pool.get('chat-1', 'qwen')).toBeUndefined();
});
it('peek does NOT exist for a missing key', () => {
const pool = new AgentPool();
expect(pool.peek('nope', 'goose')).toBeUndefined();
});
it('health reports size + busy count', () => {
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
b.setBusy(true);
pool.register('c1', 'goose', a);
pool.register('c2', 'qwen', b);
expect(pool.health()).toEqual({ size: 2, busy: 1 });
});
});
describe('AgentPool.sweep — idle TTL eviction (3.1)', () => {
it('evicts an idle backend past the TTL and disposes it', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'goose', b);
// Sweep with now far past the registration → idle → evicted.
const { evicted } = await pool.sweep(Date.now() + 10_000);
expect(evicted).toEqual(['c1:goose']);
expect(b.disposed).toBe(1);
expect(pool.get('c1', 'goose')).toBeUndefined();
});
it('never evicts a busy backend even past the TTL', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000, maxLive: 100 });
const b = new FakeBackend();
b.setBusy(true);
pool.register('c1', 'goose', b);
const { evicted } = await pool.sweep(Date.now() + 10_000);
expect(evicted).toEqual([]);
expect(b.disposed).toBe(0);
expect(pool.get('c1', 'goose')).toBe(b);
});
it('touch keeps a backend warm so the TTL measures from the last turn', async () => {
const pool = new AgentPool({ idleTtlMs: 5_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'goose', b);
const base = Date.now();
// 4s later, touch — resets activity. A sweep at +6s from base is only +2s from
// the touch → still within TTL → not evicted.
vi.spyOn(Date, 'now').mockReturnValue(base + 4_000);
pool.touch('c1', 'goose');
vi.restoreAllMocks();
const { evicted } = await pool.sweep(base + 6_000);
expect(evicted).toEqual([]);
});
});
describe('AgentPool.sweep — LRU cap (3.4)', () => {
it('evicts the least-recently-used beyond the cap', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000_000, maxLive: 2 });
const base = 1_000_000;
const mk = (key: string, regAt: number) => {
vi.spyOn(Date, 'now').mockReturnValue(regAt);
const b = new FakeBackend(key);
const [chat, agent] = key.split(':');
pool.register(chat!, agent!, b);
vi.restoreAllMocks();
return b;
};
const a = mk('c1:goose', base + 100);
const b = mk('c2:goose', base + 300);
const c = mk('c3:goose', base + 200);
// 3 entries, cap 2, all within idle TTL → LRU (oldest = a@+100) evicted.
const { evicted } = await pool.sweep(base + 1_000);
expect(evicted).toEqual(['c1:goose']);
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(0);
expect(c.disposed).toBe(0);
});
});
describe('AgentPool.sweep — proactive health probe (3.2)', () => {
it('drives each backend tickHealth before eviction', async () => {
const pool = new AgentPool({ idleTtlMs: 1_000_000, maxLive: 100 });
const b = new FakeBackend();
pool.register('c1', 'opencode', b);
await pool.sweep(Date.now());
expect(b.tickHealthCalls).toBe(1);
});
});
describe('AgentPool.closeChat — chat-close teardown (3.3)', () => {
it('disposes only the matching chat keys, leaving others + the shared server', async () => {
const pool = new AgentPool();
const goose = new FakeBackend('goose');
const qwen = new FakeBackend('qwen');
const other = new FakeBackend('other-chat');
const ocServer = new FakeBackend('opencode-server');
pool.register('chat-1', 'goose', goose);
pool.register('chat-1', 'qwen', qwen);
pool.register('chat-2', 'goose', other);
pool.register(OPENCODE_POOL_KEY, 'opencode', ocServer);
const removed = await pool.closeChat('chat-1');
expect(removed.sort()).toEqual(['chat-1:goose', 'chat-1:qwen']);
expect(goose.disposed).toBe(1);
expect(qwen.disposed).toBe(1);
// other chat + shared opencode server untouched.
expect(other.disposed).toBe(0);
expect(ocServer.disposed).toBe(0);
expect(pool.peek('chat-2', 'goose')).toBe(other);
expect(pool.peek(OPENCODE_POOL_KEY, 'opencode')).toBe(ocServer);
});
it('does not dispose a busy backend on closeChat', async () => {
const pool = new AgentPool();
const b = new FakeBackend();
b.setBusy(true);
pool.register('chat-1', 'goose', b);
const removed = await pool.closeChat('chat-1');
expect(removed).toEqual([]);
expect(b.disposed).toBe(0);
});
it('does not match a chat id that is a prefix of another', async () => {
// 'chat-1' must not match 'chat-10' — keys are `${chatId}:${agent}` so the
// colon delimiter prevents the prefix collision.
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
pool.register('chat-1', 'goose', a);
pool.register('chat-10', 'goose', b);
await pool.closeChat('chat-1');
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(0);
expect(pool.peek('chat-10', 'goose')).toBe(b);
});
});
describe('AgentPool.dispose — drain all (T.1)', () => {
it('disposes every backend and clears the map', async () => {
const pool = new AgentPool();
const a = new FakeBackend();
const b = new FakeBackend();
pool.register('c1', 'goose', a);
pool.register('c2', 'qwen', b);
await pool.dispose();
expect(a.disposed).toBe(1);
expect(b.disposed).toBe(1);
expect(pool.health()).toEqual({ size: 0, busy: 0 });
});
it('tolerates a backend whose dispose throws', async () => {
const pool = new AgentPool();
const good = new FakeBackend();
const bad = new FakeBackend();
bad.dispose = async () => {
throw new Error('boom');
};
pool.register('c1', 'goose', bad);
pool.register('c2', 'qwen', good);
await expect(pool.dispose()).resolves.toBeUndefined();
expect(good.disposed).toBe(1);
});
});

View File

@@ -0,0 +1,51 @@
import { describe, it, expect } from 'vitest';
import { createCancelRegistry } from '../cancel-registry.js';
/**
* F1 — per-task abort wiring. The registry is the missing link between the Stop
* route and the in-flight external run: register an AbortController per task id,
* cancel(taskId) aborts its signal, the run's .finally deletes it. Pure (no DB /
* child / IO) so the abort + idempotency contract is unit-testable in isolation.
*/
describe('CancelRegistry (F1 abort wiring)', () => {
it('register hands back a fresh controller; cancel aborts its signal', () => {
const reg = createCancelRegistry();
const ac = reg.register('t1');
expect(ac.signal.aborted).toBe(false);
expect(reg.has('t1')).toBe(true);
expect(reg.cancel('t1')).toBe(true);
expect(ac.signal.aborted).toBe(true);
});
it('cancel on an unknown task returns false (native task / cancel-before-register)', () => {
const reg = createCancelRegistry();
expect(reg.has('nope')).toBe(false);
expect(reg.cancel('nope')).toBe(false);
});
it('double-Stop is idempotent: a second cancel never throws and the signal stays aborted', () => {
const reg = createCancelRegistry();
const ac = reg.register('t1');
expect(reg.cancel('t1')).toBe(true);
// The run-function has not hit its .finally yet, so the entry is still
// present — a rapid second Stop re-aborts (abort() no-ops) without throwing.
expect(() => reg.cancel('t1')).not.toThrow();
expect(reg.cancel('t1')).toBe(true);
expect(ac.signal.aborted).toBe(true);
});
it('cancel after delete returns false (cancel-after-natural-exit is safe)', () => {
const reg = createCancelRegistry();
reg.register('t1');
reg.delete('t1');
expect(reg.has('t1')).toBe(false);
expect(reg.cancel('t1')).toBe(false);
});
it('delete of an unknown id is a no-op (never throws)', () => {
const reg = createCancelRegistry();
expect(() => reg.delete('ghost')).not.toThrow();
});
});

View File

@@ -0,0 +1,252 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { rm, mkdir } from 'node:fs/promises';
import { resolve } from 'node:path';
import postgres from 'postgres';
import {
buildShadowCommitCommand,
createCheckpoint,
restoreCheckpoint,
CheckpointNotFoundError,
} from '../checkpoints.js';
import { ensureSessionWorktree } from '../worktrees.js';
import { hostExec } from '../host-exec.js';
/**
* write-edit-robustness #4 — worktree checkpoint tests.
*
* Pure-helper coverage (no DB / no host) for the shadow-commit command builder,
* plus a DB+git integration block (DB-opt-in via DATABASE_URL, skips cleanly
* otherwise; mirrors reconnect_integration.test.ts) that exercises the real
* create → restore round trip against a worktree on the host fs.
*/
describe('buildShadowCommitCommand (pure)', () => {
it('parks the commit under refs/boocode/checkpoints/<id> and prints only the SHA', () => {
const cmd = buildShadowCommitCommand('/tmp/booworktrees/sess-abc', 'cp-id-123');
// Uses a temp index so the real working tree/index is untouched.
expect(cmd).toContain('TMP=$(mktemp)');
expect(cmd).toContain('GIT_INDEX_FILE="$TMP" git read-tree HEAD');
expect(cmd).toContain('GIT_INDEX_FILE="$TMP" git add -A');
expect(cmd).toContain('git write-tree');
expect(cmd).toContain("git commit-tree \"$TREE\" -p HEAD -m \"boocode checkpoint\"");
// Ref name matches the row id, and stdout is ONLY the SHA (printf, no newline).
expect(cmd).toContain("update-ref 'refs/boocode/checkpoints/cp-id-123'");
expect(cmd).toContain("printf '%s' \"$SHA\"");
expect(cmd).not.toContain('echo "$SHA"');
});
it('shell-escapes the worktree path and the id', () => {
const cmd = buildShadowCommitCommand("/tmp/it's a path", "id'; rm -rf /");
// Single quotes inside the path/id are escaped via the '\'' wrapping idiom — no
// bare interpolation that could break out of the quoting.
expect(cmd).toContain("cd '/tmp/it'\\''s a path'");
expect(cmd).toContain("refs/boocode/checkpoints/id'\\''; rm -rf /");
});
});
describe.runIf(!!process.env.DATABASE_URL)('checkpoint create + restore (DB + git)', () => {
let sql: ReturnType<typeof postgres>;
const stamp = Date.now();
const projectDir = `/tmp/boocode-checkpoint-proj-${stamp}`;
let projectId: string;
let sessionId: string;
let chatId: string;
let worktreePath: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Server schema first (FK targets), then coder schema (worktrees + checkpoints).
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
await mkdir(projectDir, { recursive: true });
await hostExec(
`cd ${projectDir} && git init -q && git config user.email t@t && git config user.name t ` +
`&& echo hello > README.md && git add -A && git commit -qm init`,
{ timeoutMs: 20_000 },
);
const [project] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('checkpoint-test', ${projectDir}, 'open') RETURNING id
`;
projectId = project!.id;
const [session] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status)
VALUES (${projectId}, 'cp', 'm', 'open') RETURNING id
`;
sessionId = session!.id;
const [chat] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = chat!.id;
const wt = await ensureSessionWorktree(sql, projectDir, sessionId);
worktreePath = wt.worktreePath;
});
afterAll(async () => {
if (sql) {
const rows = await sql<{ path: string }[]>`SELECT path FROM worktrees WHERE session_id = ${sessionId}`.catch(() => []);
for (const r of rows) {
await hostExec(`git -C ${projectDir} worktree remove ${r.path} --force`, { timeoutMs: 10_000 }).catch(() => {});
}
await sql`DELETE FROM checkpoints WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM agent_sessions WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM worktrees WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
}
await rm(projectDir, { recursive: true, force: true });
});
it('createCheckpoint inserts a row + a private ref capturing tracked + untracked', async () => {
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const worktreeId = wt!.id;
// Pre-turn untracked + tracked-edit state the agent will start from.
await hostExec(`cd ${worktreePath} && echo edited >> README.md && echo new > extra.txt`, { timeoutMs: 10_000 });
const [assistantMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming') RETURNING id
`;
const messageId = assistantMsg!.id;
const cp = await createCheckpoint(sql, {
chatId,
sessionId,
worktreeId,
worktreePath,
messageId,
});
expect(cp).not.toBeNull();
expect(cp!.commit_sha).toMatch(/^[0-9a-f]{40}$/);
const [row] = await sql<{ commit_sha: string; worktree_id: string; message_id: string }[]>`
SELECT commit_sha, worktree_id, message_id FROM checkpoints WHERE id = ${cp!.id}
`;
expect(row!.commit_sha).toBe(cp!.commit_sha);
expect(row!.worktree_id).toBe(worktreeId);
expect(row!.message_id).toBe(messageId);
// The ref exists and the captured tree carries the untracked file (proves the
// temp-index `git add -A` snapshotted untracked content).
const refLs = await hostExec(
`git -C ${worktreePath} ls-tree -r --name-only ${cp!.commit_sha}`,
{ timeoutMs: 10_000 },
);
expect(refLs.exitCode).toBe(0);
expect(refLs.stdout).toContain('extra.txt');
// The shadow commit did NOT disturb the real working tree: extra.txt is still
// present + still untracked (status shows it).
const status = await hostExec(`git -C ${worktreePath} status --porcelain`, { timeoutMs: 10_000 });
expect(status.stdout).toContain('extra.txt');
});
it('restoreCheckpoint resets the worktree, trims the transcript, and drops later checkpoints', async () => {
// Clean slate for this test: reset the worktree to HEAD, clear prior rows.
await hostExec(`git -C ${worktreePath} reset --hard HEAD && git -C ${worktreePath} clean -fd`, { timeoutMs: 10_000 });
await sql`DELETE FROM checkpoints WHERE chat_id = ${chatId}`;
await sql`DELETE FROM messages WHERE chat_id = ${chatId}`;
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const worktreeId = wt!.id;
// Turn 1: a user msg, then the assistant turn the checkpoint anchors. The
// worktree is pristine (matches HEAD) when this checkpoint is captured.
await sql`INSERT INTO messages (session_id, chat_id, role, content, status) VALUES (${sessionId}, ${chatId}, 'user', 'do it', 'complete')`;
const [a1] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'turn 1', 'complete') RETURNING id
`;
const cp1 = await createCheckpoint(sql, { chatId, sessionId, worktreeId, worktreePath, messageId: a1!.id });
expect(cp1).not.toBeNull();
// The agent (turn 1) writes a file into the worktree.
await hostExec(`cd ${worktreePath} && echo agent-wrote > agent.txt`, { timeoutMs: 10_000 });
// Turn 2: another user msg + assistant turn, AND a second (later) checkpoint.
await sql`INSERT INTO messages (session_id, chat_id, role, content, status) VALUES (${sessionId}, ${chatId}, 'user', 'more', 'complete')`;
const [a2] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'turn 2', 'complete') RETURNING id
`;
const cp2 = await createCheckpoint(sql, { chatId, sessionId, worktreeId, worktreePath, messageId: a2!.id });
expect(cp2).not.toBeNull();
// An agent_sessions row that restore should mark 'crashed'.
await sql`
INSERT INTO agent_sessions (chat_id, session_id, worktree_id, agent, backend, agent_session_id, status, last_active_at)
VALUES (${chatId}, ${sessionId}, ${worktreeId}, 'goose', 'acp_warm', 'sess-1', 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET status = 'active'
`;
const before = await sql<{ id: string }[]>`SELECT id FROM messages WHERE chat_id = ${chatId} ORDER BY created_at`;
expect(before.length).toBe(4); // user, a1, user, a2
// Restore to cp1 (before turn 1's assistant message).
const result = await restoreCheckpoint(sql, cp1!.id, { sessionId });
expect(result.checkpoint_id).toBe(cp1!.id);
expect(result.worktree_reset).toBe(true);
expect(result.backend_reset).toBe(true);
// a1, user(turn2), a2 deleted (created_at >= a1) → 3 trimmed.
expect(result.messages_deleted).toBe(3);
// Transcript trimmed to just the first user message.
const after = await sql<{ role: string; content: string }[]>`SELECT role, content FROM messages WHERE chat_id = ${chatId} ORDER BY created_at`;
expect(after.length).toBe(1);
expect(after[0]!.role).toBe('user');
// Worktree reset: the agent's file is gone (it was written after cp1).
const ls = await hostExec(`ls ${worktreePath}/agent.txt`, { timeoutMs: 10_000 });
expect(ls.exitCode).not.toBe(0);
// The agent_sessions row was reset to 'crashed'.
const [as] = await sql<{ status: string }[]>`SELECT status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'goose'`;
expect(as!.status).toBe('crashed');
// cp1 survives (re-restorable); cp2 (later) was dropped.
const cps = await sql<{ id: string }[]>`SELECT id FROM checkpoints WHERE chat_id = ${chatId}`;
expect(cps.map((c) => c.id)).toEqual([cp1!.id]);
});
it('restoreCheckpoint throws CheckpointNotFoundError for an unknown id', async () => {
await expect(
restoreCheckpoint(sql, '00000000-0000-0000-0000-000000000000', { sessionId }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
});
it('restoreCheckpoint throws when the checkpoint is not in the requested session', async () => {
// A checkpoint whose session_id differs from the route's sessionId.
const [wt] = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
const cp = await createCheckpoint(sql, { chatId, sessionId, worktreeId: wt!.id, worktreePath, messageId: null });
expect(cp).not.toBeNull();
await expect(
restoreCheckpoint(sql, cp!.id, { sessionId: '11111111-1111-1111-1111-111111111111' }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
await sql`DELETE FROM checkpoints WHERE id = ${cp!.id}`;
});
it('restoreCheckpoint denies a NULL-session_id checkpoint from another session (no fail-open IDOR)', async () => {
// Regression for the fail-open authorization bug: a checkpoint row whose
// denormalized session_id is NULL must STILL be scoped via its chat's owning
// session (chats.session_id), not skipped. The old guard `cp.session_id &&
// cp.session_id !== sessionId` fell through on NULL → cross-session restore.
const [row] = await sql<{ id: string }[]>`
INSERT INTO checkpoints (chat_id, session_id, message_id, commit_sha)
VALUES (${chatId}, NULL, NULL, 'deadbeef')
RETURNING id
`;
await expect(
restoreCheckpoint(sql, row!.id, { sessionId: '22222222-2222-2222-2222-222222222222' }),
).rejects.toBeInstanceOf(CheckpointNotFoundError);
await sql`DELETE FROM checkpoints WHERE id = ${row!.id}`;
});
});

View File

@@ -1,47 +0,0 @@
import { describe, it, expect } from 'vitest';
import { parseCursorAgentModelsOutput } from '../cursor-models.js';
describe('parseCursorAgentModelsOutput', () => {
it('parses cursor-agent models output with default marker', () => {
const output = `
Available models
claude-4-sonnet - Claude 4 Sonnet (default)
gpt-4.1 - GPT-4.1
Tip: use cursor-agent models for full list
`.trim();
const models = parseCursorAgentModelsOutput(output);
expect(models).toEqual([
{ id: 'claude-4-sonnet', label: 'Claude 4 Sonnet', isDefault: true },
{ id: 'gpt-4.1', label: 'GPT-4.1', isDefault: false },
]);
});
it('uses current marker when no default', () => {
const output = `
model-a - Model A (current)
model-b - Model B
`.trim();
const models = parseCursorAgentModelsOutput(output);
expect(models.find((m) => m.id === 'model-a')?.isDefault).toBe(true);
expect(models.find((m) => m.id === 'model-b')?.isDefault).toBe(false);
});
it('defaults to first model when no markers', () => {
const output = 'alpha - Alpha\nbeta - Beta';
const models = parseCursorAgentModelsOutput(output);
expect(models[0]?.isDefault).toBe(true);
expect(models[1]?.isDefault).toBe(false);
});
it('skips malformed lines', () => {
const output = 'no-separator\nvalid - Valid';
const models = parseCursorAgentModelsOutput(output);
expect(models).toEqual([{ id: 'valid', label: 'Valid', isDefault: true }]);
});
});

View File

@@ -0,0 +1,73 @@
import { describe, it, expect } from 'vitest';
import { stripDcpTags, makeDcpStreamStripper } from '../dcp-strip.js';
// Feed chunks through a fresh stripper and return the fully reassembled output
// (everything emitted during streaming + the final flush) — i.e. what the
// dispatcher would accumulate into the persisted message content.
function run(chunks: string[]): string {
const s = makeDcpStreamStripper();
let out = '';
for (const c of chunks) out += s.push(c);
out += s.flush();
return out;
}
describe('stripDcpTags (one-shot)', () => {
it('removes a complete tag', () => {
expect(stripDcpTags('Yes — "Test".\n\n<dcp-message-id>m0019</dcp-message-id>')).toBe(
'Yes — "Test".\n\n',
);
});
it('leaves text without a tag untouched', () => {
expect(stripDcpTags('no tag here')).toBe('no tag here');
});
});
describe('per-chunk strip is INSUFFICIENT (documents the bug)', () => {
it('a tag split across chunks survives a naive per-chunk .replace()', () => {
const chunks = ['Yes.\n\n<dcp', '-message', '-id>m0019</dcp', '-message-id>'];
const naive = chunks.map(stripDcpTags).join('');
// The reassembled content still contains the tag — this is the screenshot bug.
expect(naive).toContain('<dcp-message-id>m0019</dcp-message-id>');
});
});
describe('makeDcpStreamStripper (cross-chunk fix)', () => {
it('strips a tag split across chunks (the real opencode case)', () => {
expect(run(['Yes.\n\n<dcp', '-message', '-id>m0019</dcp', '-message-id>'])).toBe('Yes.\n\n');
});
it('strips a tag split at EVERY character boundary', () => {
const full = 'Answer.<dcp-message-id>m0019</dcp-message-id>';
expect(run([...full])).toBe('Answer.');
});
it('strips a tag delivered whole in one chunk', () => {
expect(run(['Answer.<dcp-message-id>m0019</dcp-message-id>'])).toBe('Answer.');
});
it('passes through text with no tag', () => {
expect(run(['hello ', 'world'])).toBe('hello world');
});
it('does NOT swallow legitimate < content (code/HTML/generics)', () => {
expect(run(['use ', '<div>', ' and ', 'Array<', 'string>'])).toBe('use <div> and Array<string>');
});
it('handles a lone < that is not a dcp tag, split across chunks', () => {
expect(run(['a <', 'b c'])).toBe('a <b c');
});
it('emits surrounding text and strips a mid-text tag', () => {
expect(run(['before ', '<dcp-message-id>', 'm1', '</dcp-message-id>', ' after'])).toBe(
'before after',
);
});
it('flushes a truncated/never-closed partial tag without leaking it as a complete tag', () => {
// If the stream ends mid-tag, flush strips complete tags; an incomplete
// remnant is returned as-is (no complete tag ever existed to render).
const out = run(['done.<dcp-message-id>m00']);
expect(out).not.toContain('</dcp-message-id>');
});
});

View File

@@ -0,0 +1,163 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import postgres from 'postgres';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import { classifyTerminalStatus, finalizeStreamingMessage } from '../finalize-message.js';
/**
* F1 (D-7 / OCE-001 / OCE-002) — finalizing a Stop'd or errored external turn.
*
* `classifyTerminalStatus` is the pure D-7 decision (user Stop / AbortError →
* cancelled, genuine error → failed). `finalizeStreamingMessage` writes that
* terminal state onto the streaming assistant row and publishes the matching
* message_complete frame — idempotently, guarded by `WHERE status='streaming'`,
* so a double-Stop or an abort-then-catch settles the message exactly once and
* never clobbers a row that already finished cleanly.
*/
describe('classifyTerminalStatus (pure, D-7)', () => {
it('maps a fired abort signal to cancelled (user Stop)', () => {
expect(classifyTerminalStatus({ aborted: true })).toBe('cancelled');
});
it('maps a thrown AbortError to cancelled', () => {
const e = new Error('the operation was aborted');
e.name = 'AbortError';
expect(classifyTerminalStatus({ aborted: false, error: e })).toBe('cancelled');
});
it('maps a genuine thrown error to failed', () => {
expect(classifyTerminalStatus({ aborted: false, error: new Error('boom') })).toBe('failed');
});
it('defaults a no-abort / no-error catch to failed', () => {
expect(classifyTerminalStatus({ aborted: false })).toBe('failed');
});
});
describe.runIf(!!process.env.DATABASE_URL)('finalizeStreamingMessage (DB)', () => {
let sql: ReturnType<typeof postgres>;
let projectId: string;
let sessionId: string;
let chatId: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Server schema owns messages/sessions/chats (FK targets); coder schema after.
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
const [p] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('f1-finalize', '/tmp/f1-finalize', 'open') RETURNING id
`;
projectId = p!.id;
const [s] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status) VALUES (${projectId}, 'f1', 'm', 'open') RETURNING id
`;
sessionId = s!.id;
const [c] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = c!.id;
});
afterAll(async () => {
if (!sql) return;
await sql`DELETE FROM messages WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
});
async function insertStreaming(): Promise<string> {
const [m] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming') RETURNING id
`;
return m!.id;
}
it('finalizes a streaming row to cancelled, persists partial content, publishes one frame', async () => {
const id = await insertStreaming();
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: id,
status: 'cancelled',
model: 'qwen',
content: 'partial answer',
});
expect(did).toBe(true);
const [row] = await sql<{ status: string; content: string; finished_at: Date | null }[]>`
SELECT status, content, finished_at FROM messages WHERE id = ${id}
`;
expect(row!.status).toBe('cancelled');
expect(row!.content).toBe('partial answer');
expect(row!.finished_at).not.toBeNull();
expect(frames).toHaveLength(1);
expect(frames[0]!.type).toBe('message_complete');
expect((frames[0] as { status?: string }).status).toBe('cancelled');
});
it('is idempotent for a double-Stop: second call updates nothing and re-publishes nothing', async () => {
const id = await insertStreaming();
const frames: WsFrame[] = [];
const push = (_s: string, f: WsFrame): void => {
frames.push(f);
};
expect(
await finalizeStreamingMessage(sql, push, { sessionId, chatId, assistantId: id, status: 'cancelled', model: null }),
).toBe(true);
expect(
await finalizeStreamingMessage(sql, push, { sessionId, chatId, assistantId: id, status: 'cancelled', model: null }),
).toBe(false);
expect(frames).toHaveLength(1);
const [row] = await sql<{ status: string }[]>`SELECT status FROM messages WHERE id = ${id}`;
expect(row!.status).toBe('cancelled');
});
it('never clobbers a row that already finished cleanly (abort raced a clean finish)', async () => {
const [m] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'done', 'complete') RETURNING id
`;
const id = m!.id;
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: id,
status: 'cancelled',
model: null,
});
expect(did).toBe(false);
expect(frames).toHaveLength(0);
const [row] = await sql<{ status: string; content: string }[]>`
SELECT status, content FROM messages WHERE id = ${id}
`;
expect(row!.status).toBe('complete');
expect(row!.content).toBe('done');
});
it('no-ops on an empty assistantId (throw happened before the row was created)', async () => {
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: '',
status: 'failed',
model: null,
});
expect(did).toBe(false);
expect(frames).toHaveLength(0);
});
});

View File

@@ -0,0 +1,102 @@
import { describe, it, expect } from 'vitest';
import type { Broker } from '@boocode/server/broker';
import { makeFrameEmitter } from '../frame-emitter.js';
import { makeDcpStreamStripper } from '../dcp-strip.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* makeFrameEmitter (v2.7 audit reshape): the AgentEvent → WS-frame mapping + turn
* accumulators extracted from AcpStreamContext. Pure-ish over an injected broker.
*/
function fakeBroker(): { broker: Broker; frames: Array<{ sid: string; frame: Record<string, unknown> }> } {
const frames: Array<{ sid: string; frame: Record<string, unknown> }> = [];
const broker = {
publishFrame: (sid: string, frame: unknown) => {
frames.push({ sid, frame: frame as Record<string, unknown> });
},
} as unknown as Broker;
return { broker, frames };
}
const toolSnap: AcpToolSnapshot = { toolCallId: 'c1', title: 'grep', status: 'completed', rawOutput: 'x' };
describe('makeFrameEmitter — streaming frames', () => {
it('maps text/reasoning/tool events to delta/reasoning_delta/tool_call frames', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'hello ' });
em.onEvent({ type: 'reasoning', text: 'mulling' });
em.onEvent({ type: 'tool_call', toolCall: toolSnap });
expect(frames.map((f) => f.frame.type)).toEqual(['delta', 'reasoning_delta', 'tool_call']);
expect(frames[0]!.frame).toMatchObject({ message_id: 'm1', chat_id: 'ch1', content: 'hello ' });
expect(frames[2]!.frame).toMatchObject({ message_id: 'm1', chat_id: 'ch1' });
expect(em.output).toBe('hello ');
expect(em.reasoningText).toBe('mulling');
expect(em.snapshots).toHaveLength(1);
});
it('publishes a tool_call frame for BOTH tool_call and tool_update events', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'tool_update', toolCall: toolSnap });
expect(frames).toHaveLength(1);
expect(frames[0]!.frame.type).toBe('tool_call');
});
it('publishes an agent_commands frame and merges the command cache', () => {
const { broker, frames } = fakeBroker();
const taskId = `task-fe-${Math.floor(performance.now())}-${frames.length}`;
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1', taskId });
em.onEvent({ type: 'commands', commands: [{ name: 'plan' }] });
expect(frames).toHaveLength(1);
expect(frames[0]!.frame).toMatchObject({ type: 'agent_commands', task_id: taskId, session_id: 's1' });
});
it('does not publish a commands frame without a taskId', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'commands', commands: [{ name: 'plan' }] });
expect(frames).toHaveLength(0);
});
});
describe('makeFrameEmitter — no broker (one-shot accumulation)', () => {
it('accumulates output/reasoning/snapshots but publishes nothing', () => {
const em = makeFrameEmitter({ sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'abc' });
em.onEvent({ type: 'reasoning', text: 'r' });
em.onEvent({ type: 'tool_call', toolCall: toolSnap });
expect(em.output).toBe('abc');
expect(em.reasoningText).toBe('r');
expect(em.snapshots).toHaveLength(1);
});
});
describe('makeFrameEmitter — dcp stripping (opencode path contract)', () => {
it('strips a split dcp tag across deltas and flushes the tail on finalize', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1', dcp: makeDcpStreamStripper() });
for (const chunk of ['Answer.', '<dcp', '-message', '-id>m1</dcp', '-message-id>', ' tail']) {
em.onEvent({ type: 'text', text: chunk });
}
em.finalize();
expect(em.output).toBe('Answer. tail');
const published = frames.filter((f) => f.frame.type === 'delta').map((f) => f.frame.content).join('');
expect(published).toBe('Answer. tail');
});
it('finalize is a no-op without a dcp stripper', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'raw <dcp-message-id>m</dcp-message-id>' });
em.finalize();
// No stripping without a stripper — verbatim text (prior ACP-path behavior).
expect(em.output).toBe('raw <dcp-message-id>m</dcp-message-id>');
expect(frames).toHaveLength(1);
});
});

View File

@@ -0,0 +1,173 @@
import { describe, it, expect } from 'vitest';
import { locateMatch, SIMILARITY_THRESHOLD } from '../fuzzy-match.js';
// Helper: assert a resolved span and slice it back out of the content so the
// test pins the EXACT file text the caller would replace.
function span(result: ReturnType<typeof locateMatch>): { start: number; end: number } {
if (result.kind !== 'exact' && result.kind !== 'fuzzy') {
throw new Error(`expected a located span, got ${result.kind}`);
}
return { start: result.start, end: result.end };
}
describe('locateMatch — strategy 1: exact', () => {
it('returns an exact unique span', () => {
const content = 'alpha\nbeta\ngamma\n';
const result = locateMatch(content, 'beta');
expect(result.kind).toBe('exact');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('beta');
});
it('returns the right offsets for a multi-line exact needle', () => {
const content = 'one\ntwo\nthree\nfour\n';
const needle = 'two\nthree';
const result = locateMatch(content, needle);
expect(result.kind).toBe('exact');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(needle);
});
it('refuses when the exact needle occurs more than once', () => {
const content = 'foo\nbar\nfoo\nbar\nfoo\n';
const result = locateMatch(content, 'foo');
expect(result).toEqual({ kind: 'ambiguous', count: 3 });
});
});
describe('locateMatch — strategy 2: per-line whitespace', () => {
it('matches across trailing-whitespace drift at the real span', () => {
// File has trailing spaces the model dropped from a TWO-line copy. A
// single-line needle would be located by exact indexOf (it's a substring),
// so use two lines where line 1's trailing ws breaks an exact substring run.
const content = 'function f() {\n setup(); \n return 1;\n}\n';
const needle = ' setup();\n return 1;'; // line 1 missing trailing spaces
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// The returned span covers the ORIGINAL lines including the trailing spaces.
expect(content.slice(start, end)).toBe(' setup(); \n return 1;');
});
it('matches across indentation drift (multi-line block)', () => {
// File indents with 4 spaces; model emitted 2-space indentation. trimEnd
// alone does not normalize LEADING whitespace, so this exercises... actually
// leading-indent drift is a Levenshtein-tier fallback. Here we keep the
// leading indent identical and drift only trailing whitespace per line.
const content = ['if (x) {', ' doThing(); ', ' doOther();', '}'].join('\n');
const needle = [' doThing();', ' doOther();'].join('\n');
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(' doThing(); \n doOther();');
});
it('ignores leading/trailing blank needle lines', () => {
const content = 'header\nbody line\nfooter\n';
const needle = '\n\nbody line\n\n';
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('body line');
});
it('reports ambiguous when a whitespace-window matches twice', () => {
// Both line 1 and line 4 differ from the needle only by trailing whitespace,
// so exact indexOf fails (no exact substring) and the whitespace tier finds
// two equivalent windows → ambiguous.
const content = 'x = 1; \ny = 2;\nz = 3;\nx = 1;\t\n';
const needle = 'x = 1;'; // no trailing ws → not an exact substring of either line
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'ambiguous', count: 2 });
});
});
describe('locateMatch — strategy 3: unicode canonicalization', () => {
it('matches across curly quotes', () => {
const content = "const s = 'hello';\n";
const needle = 'const s = hello;'; // hello
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// Span maps back to ORIGINAL (straight-quote) text.
expect(content.slice(start, end)).toBe("const s = 'hello';");
});
it('matches across curly double-quotes', () => {
const content = 'log("done");\n';
const needle = 'log(“done”);'; // “done”
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('log("done");');
});
it('matches across an em-dash drift', () => {
const content = 'range 1-10 inclusive\n';
const needle = 'range 1—10 inclusive'; // em-dash
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('range 1-10 inclusive');
});
it('matches across a non-breaking space drift', () => {
const content = 'a b c\n'; // plain spaces
const needle = 'a b c'; // nbsp between words
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('a b c');
});
});
describe('locateMatch — strategy 4: Levenshtein', () => {
it('matches a >= threshold near-miss (small typo drift)', () => {
// Needle has a one-char typo ('totals' vs 'total') so it is NOT an exact
// substring and the whitespace/canonical tiers (which require equality) both
// miss; Levenshtein similarity stays well above the 0.66 floor.
const content = 'const total = sum + tax;\n';
const needle = 'const totals = sum + tax;';
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
// Span maps to the real (correctly-spelled) file line.
expect(content.slice(start, end)).toBe('const total = sum + tax;');
});
it('matches a multi-line block with indentation drift via Levenshtein', () => {
const content = ['function g() {', ' return compute(a, b);', '}'].join('\n');
// 6-space indent vs file's 2-space; trimEnd does not fix leading indent, so
// this lands on the Levenshtein tier (joined-trim makes it identical → ~1.0).
const needle = [' return compute(a, b);'].join('\n');
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe(' return compute(a, b);');
});
it('returns not_found for a below-threshold miss', () => {
const content = 'the quick brown fox jumps over the lazy dog\n';
const needle = 'completely unrelated string of text here xyz';
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
it('returns not_found for a genuinely-absent needle', () => {
const content = 'alpha\nbeta\ngamma\n';
const needle = 'this content does not exist anywhere at all';
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
});
describe('locateMatch — edge cases', () => {
it('returns not_found for an empty needle', () => {
expect(locateMatch('anything', '')).toEqual({ kind: 'not_found' });
});
it('exposes a sane similarity threshold', () => {
expect(SIMILARITY_THRESHOLD).toBeGreaterThan(0);
expect(SIMILARITY_THRESHOLD).toBeLessThanOrEqual(1);
});
});

View File

@@ -3,7 +3,7 @@ import { getManifestCommands, mergeCommands, PROVIDER_COMMANDS } from '../provid
describe('provider-commands', () => {
it('defines commands for every external harness', () => {
for (const name of ['claude', 'opencode', 'cursor', 'goose', 'qwen', 'copilot']) {
for (const name of ['claude', 'opencode', 'goose', 'qwen']) {
expect(getManifestCommands(name).length, name).toBeGreaterThan(0);
}
});

View File

@@ -0,0 +1,93 @@
import { describe, it, expect, vi } from 'vitest';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import { PROVIDERS } from '../provider-registry.js';
import type { CoderProvidersFile } from '../provider-config.js';
describe('buildResolvedRegistry', () => {
it('applies a built-in override (goose label)', () => {
const config: CoderProvidersFile = { providers: { goose: { label: 'Goosey' } } };
const reg = buildResolvedRegistry(PROVIDERS, config);
const goose = reg.get('goose');
expect(goose).toBeDefined();
expect(goose!.label).toBe('Goosey');
expect(goose!.configLabel).toBe('Goosey');
expect(goose!.enabled).toBe(true);
expect(goose!.isBuiltin).toBe(true);
expect(goose!.isCustomAcp).toBe(false);
});
it('adds a custom ACP entry (extends:acp + label + command)', () => {
const config: CoderProvidersFile = {
providers: {
'amp-acp': { extends: 'acp', label: 'Amp', description: 'ACP wrapper', command: ['amp-acp', '--acp'], env: { AMP: '1' } },
},
};
const reg = buildResolvedRegistry(PROVIDERS, config);
const amp = reg.get('amp-acp');
expect(amp).toBeDefined();
expect(amp!.isCustomAcp).toBe(true);
expect(amp!.isBuiltin).toBe(false);
expect(amp!.transport).toBe('acp');
expect(amp!.modelSource).toBe('probe');
expect(amp!.launchCommand).toEqual(['amp-acp', '--acp']);
expect(amp!.env).toEqual({ AMP: '1' });
expect(amp!.enabled).toBe(true);
});
it('keeps a disabled built-in in the registry flagged disabled (goose)', () => {
const config: CoderProvidersFile = { providers: { goose: { enabled: false } } };
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.has('goose')).toBe(true);
expect(reg.get('goose')!.enabled).toBe(false);
});
it('skips a custom id without extends (no throw)', () => {
const config: CoderProvidersFile = { providers: { weird: { label: 'Weird', command: ['weird'] } } };
const warn = vi.spyOn(console, 'warn').mockImplementation(() => {});
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.has('weird')).toBe(false);
// built-ins untouched
expect(reg.size).toBe(PROVIDERS.length);
expect(warn).toHaveBeenCalled();
warn.mockRestore();
});
it('ignores enabled:false on boocode and warns', () => {
const config: CoderProvidersFile = { providers: { boocode: { enabled: false } } };
const warn = vi.spyOn(console, 'warn').mockImplementation(() => {});
const reg = buildResolvedRegistry(PROVIDERS, config);
expect(reg.get('boocode')!.enabled).toBe(true);
expect(warn).toHaveBeenCalled();
warn.mockRestore();
});
it('carries config models + additionalModels onto built-in and custom defs', () => {
const reg = buildResolvedRegistry(PROVIDERS, {
providers: {
claude: { models: [{ id: 'claude-opus-4-8', label: 'Opus 4.8' }] },
'amp-acp': {
extends: 'acp',
label: 'Amp',
command: ['amp-acp'],
additionalModels: [{ id: 'amp-1', label: 'Amp 1' }],
},
},
});
expect(reg.get('claude')!.configModels).toEqual([{ id: 'claude-opus-4-8', label: 'Opus 4.8' }]);
expect(reg.get('amp-acp')!.configAdditionalModels).toEqual([{ id: 'amp-1', label: 'Amp 1' }]);
});
it('REGRESSION: empty config returns exactly the built-ins, all enabled', () => {
const reg = buildResolvedRegistry(PROVIDERS, { providers: {} });
expect(reg.size).toBe(PROVIDERS.length);
expect([...reg.keys()]).toEqual(PROVIDERS.map((p) => p.name));
for (const def of PROVIDERS) {
const r = reg.get(def.name)!;
expect(r.enabled).toBe(true);
expect(r.isBuiltin).toBe(true);
expect(r.isCustomAcp).toBe(false);
expect(r.launchCommand).toBeNull();
expect(r.label).toBe(def.label);
}
});
});

View File

@@ -0,0 +1,96 @@
import { describe, it, expect } from 'vitest';
import {
mergeProviderConfigPatch,
ProviderConfigPatchSchema,
CoderProvidersFileSchema,
type CoderProvidersFile,
} from '../provider-config.js';
describe('ProviderConfigPatchSchema', () => {
it('accepts a per-provider override patch', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: { enabled: false } } });
expect(parsed.success).toBe(true);
});
it('accepts a null value (delete-the-override sentinel)', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: null } });
expect(parsed.success).toBe(true);
});
it('defaults providers to {} on an empty body', () => {
const parsed = ProviderConfigPatchSchema.safeParse({});
expect(parsed.success).toBe(true);
if (parsed.success) expect(parsed.data.providers).toEqual({});
});
it('rejects a malformed override (wrong field type)', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: { goose: { enabled: 'yes' } } });
expect(parsed.success).toBe(false);
});
it('rejects a non-object providers map', () => {
const parsed = ProviderConfigPatchSchema.safeParse({ providers: 123 });
expect(parsed.success).toBe(false);
});
});
describe('mergeProviderConfigPatch', () => {
const current: CoderProvidersFile = {
providers: {
goose: { enabled: true, label: 'Goose' },
opencode: { enabled: true },
},
};
it('replaces an existing override object wholesale (not deep-merge)', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: { enabled: false } } });
// Whole override replaced — the prior `label` is gone, only `enabled` remains.
expect(merged.providers.goose).toEqual({ enabled: false });
});
it('adds a brand-new override id', () => {
const merged = mergeProviderConfigPatch(current, {
providers: { 'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp'] } },
});
expect(merged.providers['amp-acp']).toEqual({ extends: 'acp', label: 'Amp', command: ['amp-acp'] });
});
it('deletes an override when the value is null', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: null } });
expect(merged.providers.goose).toBeUndefined();
expect(Object.keys(merged.providers)).toEqual(['opencode']);
});
it('leaves ids absent from the patch untouched', () => {
const merged = mergeProviderConfigPatch(current, { providers: { goose: { enabled: false } } });
expect(merged.providers.opencode).toEqual({ enabled: true });
});
it('does not mutate the input config', () => {
const snapshot = JSON.parse(JSON.stringify(current));
mergeProviderConfigPatch(current, { providers: { goose: null, opencode: { enabled: false } } });
expect(current).toEqual(snapshot);
});
it('empty patch returns an equivalent config', () => {
const merged = mergeProviderConfigPatch(current, { providers: {} });
expect(merged).toEqual(current);
});
});
describe('CoderProvidersFileSchema (validate-before-save guard)', () => {
it('accepts a clean merged config', () => {
const merged = mergeProviderConfigPatch(
{ providers: {} },
{ providers: { goose: { enabled: false } } },
);
expect(CoderProvidersFileSchema.safeParse(merged).success).toBe(true);
});
it('rejects a config carrying an invalid override (never written)', () => {
// A merged object that somehow holds a bad override must fail validation
// so the PATCH route returns 422 and never calls save().
const invalid = { providers: { goose: { enabled: 'nope' } } };
expect(CoderProvidersFileSchema.safeParse(invalid).success).toBe(false);
});
});

View File

@@ -0,0 +1,85 @@
import { describe, it, expect } from 'vitest';
import { getProviderDiagnostic, type DiagnosticAgentRow } from '../provider-diagnostic.js';
import { buildResolvedRegistry } from '../provider-config-registry.js';
import { PROVIDERS } from '../provider-registry.js';
import type { ProviderSnapshotEntry } from '../provider-types.js';
const registry = buildResolvedRegistry(PROVIDERS, {
providers: {
goose: { enabled: false },
'amp-acp': { extends: 'acp', label: 'Amp', command: ['amp-acp', '--acp'] },
},
});
const alwaysAvailable = () => Promise.resolve(true);
const neverAvailable = () => Promise.resolve(false);
describe('getProviderDiagnostic', () => {
it('reports a disabled built-in (enabled:false, no install)', async () => {
const report = await getProviderDiagnostic(registry.get('goose')!, undefined, {
checkAvailable: neverAvailable,
});
expect(report).toContain('provider: goose');
expect(report).toContain('enabled: false');
expect(report).toContain('installed: false');
expect(report).toMatch(/command_available:\s*false/);
});
it('reports an installed built-in with its install_path, last_probed_at, model count', async () => {
const agentRow: DiagnosticAgentRow = {
name: 'opencode',
install_path: '/usr/bin/opencode',
supports_acp: true,
models: [
{ id: 'm1', label: 'M1' },
{ id: 'm2', label: 'M2' },
],
last_probed_at: '2026-05-29T12:00:00.000Z',
};
const report = await getProviderDiagnostic(registry.get('opencode')!, agentRow, {
checkAvailable: alwaysAvailable,
});
expect(report).toContain('install_path: /usr/bin/opencode');
expect(report).toContain('2026-05-29T12:00:00.000Z');
expect(report).toContain('installed: true');
expect(report).toMatch(/models_in_db:\s*2/);
expect(report).toMatch(/command_available:\s*true/);
});
it('reports a custom ACP launch command + its binary', async () => {
const report = await getProviderDiagnostic(registry.get('amp-acp')!, undefined, {
checkAvailable: alwaysAvailable,
});
expect(report).toContain('provider: amp-acp');
expect(report).toContain('amp-acp --acp');
expect(report).toContain('customAcp: true');
});
it('surfaces the last probe error from a cached snapshot entry', async () => {
const cachedEntry: ProviderSnapshotEntry = {
name: 'opencode',
label: 'OpenCode',
transport: 'acp',
status: 'error',
enabled: true,
installed: true,
models: [],
modes: [],
defaultModeId: null,
commands: [],
error: 'ACP initialize timed out',
};
const report = await getProviderDiagnostic(registry.get('opencode')!, undefined, {
cachedEntry,
checkAvailable: alwaysAvailable,
});
expect(report).toContain('ACP initialize timed out');
});
it('reports no error when none is cached', async () => {
const report = await getProviderDiagnostic(registry.get('opencode')!, undefined, {
checkAvailable: alwaysAvailable,
});
expect(report).toMatch(/last_probe_error:\s*\(none/);
});
});

View File

@@ -1,10 +1,15 @@
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import {
mergeModels,
prefixLlamaSwapModels,
clearProviderSnapshotCache,
getProviderSnapshot,
peekSnapshotEntry,
} from '../provider-snapshot.js';
import { loadProviderConfig } from '../provider-config-registry.js';
vi.mock('../acp-probe.js', () => ({
probeAcpProvider: vi.fn(),
@@ -14,6 +19,13 @@ import { probeAcpProvider } from '../acp-probe.js';
const mockProbe = vi.mocked(probeAcpProvider);
/** Write a temp coder-providers.json and point the resolved registry at it. */
function loadConfigFixture(providers: Record<string, unknown>): void {
const path = join(tmpdir(), `coder-providers-test-${providers ? Object.keys(providers).join('-') || 'empty' : 'empty'}.json`);
writeFileSync(path, JSON.stringify({ providers }), 'utf8');
loadProviderConfig(path);
}
function mockSql(agents: Array<{
name: string;
install_path: string | null;
@@ -21,6 +33,7 @@ function mockSql(agents: Array<{
models: Array<{ id: string; label: string }> | null;
label: string | null;
transport: string | null;
last_probed_at?: string | null;
}>) {
return vi.fn((strings: TemplateStringsArray) => {
const query = strings.join('');
@@ -36,6 +49,7 @@ function mockSql(agents: Array<{
const config = {
LLAMA_SWAP_URL: 'http://llama-swap.test',
PROVIDER_PROBE_TTL_MS: 86_400_000,
} as import('../config.js').Config;
describe('prefixLlamaSwapModels', () => {
@@ -68,6 +82,8 @@ describe('mergeModels', () => {
describe('getProviderSnapshot', () => {
beforeEach(() => {
clearProviderSnapshotCache();
// Reset the resolved registry to built-ins-only (missing path → {} config).
loadProviderConfig('/nonexistent-coder-providers.json');
vi.restoreAllMocks();
vi.stubGlobal(
'fetch',
@@ -165,4 +181,190 @@ describe('getProviderSnapshot', () => {
expect(claude?.modes.length).toBeGreaterThan(0);
expect(claude?.commands.some((c) => c.name === 'help')).toBe(true);
});
it('disabled provider → unavailable + enabled:false, WITHOUT spawning a probe', async () => {
loadConfigFixture({ goose: { enabled: false } });
mockProbe.mockResolvedValue({ ok: true, models: [], modes: [], defaultModeId: null, commands: [] });
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'g1', label: 'G1' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(),
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const goose = entries.find((e) => e.name === 'goose');
expect(goose?.status).toBe('unavailable');
expect(goose?.enabled).toBe(false);
expect(goose?.installed).toBe(false);
expect(mockProbe).not.toHaveBeenCalled();
});
it('uninstalled provider → unavailable + enabled:true + installed:false', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({ ok: true, models: [], modes: [], defaultModeId: null, commands: [] });
const sql = mockSql([]); // nothing probed/installed
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const opencode = entries.find((e) => e.name === 'opencode');
expect(opencode?.status).toBe('unavailable');
expect(opencode?.enabled).toBe(true);
expect(opencode?.installed).toBe(false);
expect(mockProbe).not.toHaveBeenCalled();
});
it('fresh DB within TTL → tier-2 cold probe SKIPPED (serves DB models)', async () => {
loadConfigFixture({});
// If this were wrongly called, cached-goose would be replaced and the
// not.toHaveBeenCalled assertion would fail.
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'SHOULD-NOT-APPEAR', label: 'nope' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'cached-goose', label: 'Cached Goose' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(), // fresh
},
]);
// force=false → cache-miss returns loading; second call joins the build / cache.
await getProviderSnapshot(sql, config, '/tmp/cwd', false);
const entries = await getProviderSnapshot(sql, config, '/tmp/cwd', false);
const goose = entries.find((e) => e.name === 'goose');
expect(goose?.status).toBe('ready');
expect(goose?.installed).toBe(true);
expect(goose?.models.map((m) => m.id)).toContain('cached-goose');
expect(goose?.models.map((m) => m.id)).not.toContain('SHOULD-NOT-APPEAR');
expect(mockProbe).not.toHaveBeenCalled();
});
it('force refresh → tier-2 cold probe RUNS even when DB is fresh', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'fresh-probe', label: 'Fresh' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: [{ id: 'cached-goose', label: 'Cached' }],
label: 'Goose',
transport: 'acp',
last_probed_at: new Date().toISOString(), // fresh, but force overrides
},
]);
await getProviderSnapshot(sql, config, '/tmp/cwd', true);
expect(mockProbe).toHaveBeenCalled();
});
it('native boocode → ready, enabled, installed', async () => {
loadConfigFixture({});
const sql = mockSql([]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const boocode = entries.find((e) => e.name === 'boocode');
expect(boocode?.status).toBe('ready');
expect(boocode?.enabled).toBe(true);
expect(boocode?.installed).toBe(true);
});
it('config models REPLACE the claude static list; additionalModels merge (+ thinking)', async () => {
loadConfigFixture({
claude: {
models: [{ id: 'claude-opus-4-8', label: 'Opus 4.8' }],
additionalModels: [{ id: 'sonnet', label: 'Sonnet (latest)' }],
},
});
const sql = mockSql([
{
name: 'claude',
install_path: '/usr/bin/claude',
supports_acp: false,
models: [{ id: 'old-static', label: 'Old' }],
label: 'Claude Code',
transport: 'pty',
last_probed_at: new Date().toISOString(),
},
]);
const entries = await getProviderSnapshot(sql, config, '/tmp/project', true);
const claude = entries.find((e) => e.name === 'claude');
const ids = claude!.models.map((m) => m.id);
expect(ids).toContain('claude-opus-4-8'); // config models replaced the DB/static list
expect(ids).toContain('sonnet'); // additionalModels merged on top
expect(ids).not.toContain('old-static'); // replaced, not appended
// thinking options still attach to the config-provided models
expect(claude!.models.find((m) => m.id === 'claude-opus-4-8')?.thinkingOptions?.length).toBeGreaterThan(0);
});
it('peekSnapshotEntry returns a cached entry (read-only) and undefined when cold/unknown', async () => {
loadConfigFixture({});
// Cold cache → undefined (no build triggered).
expect(peekSnapshotEntry('boocode', '/tmp/peek')).toBeUndefined();
const sql = mockSql([]);
await getProviderSnapshot(sql, config, '/tmp/peek', true);
expect(peekSnapshotEntry('boocode', '/tmp/peek')?.name).toBe('boocode');
expect(peekSnapshotEntry('does-not-exist', '/tmp/peek')).toBeUndefined();
});
it('2.7 warm cache: a second snapshot within the warm window spawns ZERO probes', async () => {
loadConfigFixture({});
mockProbe.mockResolvedValue({
ok: true,
models: [{ id: 'm1', label: 'M1' }],
modes: [],
defaultModeId: null,
commands: [],
});
const sql = mockSql([
{
name: 'goose',
install_path: '/usr/bin/goose',
supports_acp: true,
models: null,
label: 'Goose',
transport: 'acp',
last_probed_at: null,
},
]);
await getProviderSnapshot(sql, config, '/tmp/cwd', true); // cold populate
const probeCallsAfterFirst = mockProbe.mock.calls.length;
await getProviderSnapshot(sql, config, '/tmp/cwd', false); // warm read
const probeCallsAfterSecond = mockProbe.mock.calls.length;
// Success criterion: second snapshot is served from cache with no ACP spawns.
expect(probeCallsAfterSecond - probeCallsAfterFirst).toBe(0);
});
});

View File

@@ -0,0 +1,170 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync, existsSync } from 'node:fs';
import { rm, mkdir } from 'node:fs/promises';
import { resolve } from 'node:path';
import postgres from 'postgres';
import {
ensureSessionWorktree,
closeChatBackendState,
rebaselineWorktreeAfterApply,
} from '../worktrees.js';
import { reapOrphanWorktrees } from '../orphan-worktree-reaper.js';
import { hostExec } from '../host-exec.js';
/**
* v2.6 Phase 3 (3.6) — reconnect-after-restart integration test.
*
* Proves the DB-truth side of crash/restart recovery: a BooCoder restart wipes the
* in-memory pool, but the persistent `worktrees` + `agent_sessions` rows survive,
* so the "next turn" re-resolves the SAME worktree (reattach, no new dir) and the
* agent-session row is still there to resume from. Also exercises the chat-close
* hook (3.3), the apply re-baseline (3.5), and the orphan reaper (3.4) end-to-end
* against a real git repo + postgres.
*
* Requires DATABASE_URL (DB-opt-in; skips cleanly otherwise) AND git on PATH. Runs:
* DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/coder test
*/
describe.runIf(!!process.env.DATABASE_URL)('reconnect after restart (Phase 3)', () => {
let sql: ReturnType<typeof postgres>;
const stamp = Date.now();
const projectDir = `/tmp/boocode-reconnect-proj-${stamp}`;
let projectId: string;
let sessionId: string;
let chatId: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Both schemas land in the one boochat DB: server owns sessions/chats/projects,
// coder owns worktrees/agent_sessions (FK targets must pre-exist → server first).
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
// A real git repo with one commit so worktree add / diff / rev-parse work.
await mkdir(projectDir, { recursive: true });
await hostExec(
`cd ${projectDir} && git init -q && git config user.email t@t && git config user.name t ` +
`&& echo hello > README.md && git add -A && git commit -qm init`,
{ timeoutMs: 20_000 },
);
const [project] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('reconnect-test', ${projectDir}, 'open') RETURNING id
`;
projectId = project!.id;
const [session] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status)
VALUES (${projectId}, 'recon', 'm', 'open') RETURNING id
`;
sessionId = session!.id;
const [chat] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = chat!.id;
});
afterAll(async () => {
if (sql) {
// Best-effort worktree cleanup before dropping rows.
const rows = await sql<{ path: string }[]>`SELECT path FROM worktrees WHERE session_id = ${sessionId}`.catch(() => []);
for (const r of rows) {
await hostExec(`git -C ${projectDir} worktree remove ${r.path} --force`, { timeoutMs: 10_000 }).catch(() => {});
}
await sql`DELETE FROM agent_sessions WHERE chat_id = ${chatId}`.catch(() => {});
await sql`DELETE FROM worktrees WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
}
await rm(projectDir, { recursive: true, force: true });
});
it('reattaches the SAME worktree across a simulated restart (no new dir)', async () => {
// "Turn 1" — first ensureSessionWorktree creates the worktree + row.
const first = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(existsSync(first.worktreePath)).toBe(true);
expect(first.baseCommit).toBeTruthy();
// Simulate an agent_sessions row written by turn 1 (opencode).
await sql`
INSERT INTO agent_sessions (chat_id, session_id, worktree_id, agent, backend, agent_session_id, status, last_active_at)
VALUES (${chatId}, ${sessionId}, ${first.worktreeId}, 'opencode', 'opencode_server', 'oc-sess-1', 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO NOTHING
`;
// "Restart" = brand-new resolution with NO in-memory state. ensureSessionWorktree
// must return the EXISTING row (same id + path), proving reattach not re-create.
const second = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(second.worktreeId).toBe(first.worktreeId);
expect(second.worktreePath).toBe(first.worktreePath);
expect(second.baseCommit).toBe(first.baseCommit);
// The agent_sessions row survived the "restart" with its resume handle intact.
const [row] = await sql<{ agent_session_id: string; status: string }[]>`
SELECT agent_session_id, status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'opencode'
`;
expect(row!.agent_session_id).toBe('oc-sess-1');
});
it('re-baselines the worktree diff after apply (3.5)', async () => {
const wt = await ensureSessionWorktree(sql, projectDir, sessionId);
const baseBefore = wt.baseCommit;
// Make a change in the worktree (as an external agent would).
await hostExec(`cd ${wt.worktreePath} && echo change >> README.md`, { timeoutMs: 10_000 });
const r = await rebaselineWorktreeAfterApply(sql, sessionId);
expect(r.rebaselined).toBe(true);
expect(r.newBaseCommit).toBeTruthy();
expect(r.newBaseCommit).not.toBe(baseBefore);
const [row] = await sql<{ base_commit: string }[]>`
SELECT base_commit FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'
`;
expect(row!.base_commit).toBe(r.newBaseCommit);
// Idempotent: a second re-baseline with no new edits is a no-op.
const r2 = await rebaselineWorktreeAfterApply(sql, sessionId);
expect(r2.rebaselined).toBe(false);
});
it('chat-close hook closes agent rows + removes the worktree on the last chat (3.3)', async () => {
// Sanity: an active worktree + agent row exist from the prior tests.
const beforeWt = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
expect(beforeWt.length).toBe(1);
const result = await closeChatBackendState(sql, chatId);
expect(result.agentRowsClosed).toBeGreaterThanOrEqual(1);
// chatId is the session's only chat → worktree removed (it was clean after the
// re-baseline commit), not at-risk.
expect(result.worktreeAtRisk).toBe(false);
expect(result.worktreeRemoved).toBe(true);
const [agentRow] = await sql<{ status: string }[]>`
SELECT status FROM agent_sessions WHERE chat_id = ${chatId} AND agent = 'opencode'
`;
expect(agentRow!.status).toBe('closed');
const activeWt = await sql<{ id: string }[]>`SELECT id FROM worktrees WHERE session_id = ${sessionId} AND status = 'active'`;
expect(activeWt.length).toBe(0); // archived, no longer active
});
it('orphan reaper leaves a live worktree alone and reaps a row-less dir (3.4)', async () => {
// Recreate a live worktree for this session (the close test archived the old one).
const live = await ensureSessionWorktree(sql, projectDir, sessionId);
expect(existsSync(live.worktreePath)).toBe(true);
// A live worktree (active row) with grace 0 must NOT be reaped.
const r1 = await reapOrphanWorktrees(sql, console as never, 0, Date.now());
expect(r1.reaped).not.toContain(live.worktreePath);
// Now archive its row (simulating a leaked dir) and reap again — it becomes an
// orphan and is reclaimed (it's clean → not at-risk).
await sql`UPDATE worktrees SET status = 'archived' WHERE id = ${live.worktreeId}`;
const r2 = await reapOrphanWorktrees(sql, console as never, 0, Date.now());
expect(r2.reaped).toContain(live.worktreePath);
expect(existsSync(live.worktreePath)).toBe(false);
});
});

View File

@@ -0,0 +1,189 @@
import { describe, it, expect } from 'vitest';
import {
makeStreamJsonParser,
makeStreamJsonState,
parseStreamJsonLine,
type AgentEventList,
} from '../stream-json-parser.js';
import type { AgentEvent } from '../agent-backend.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
// Helpers to JSON-encode the representative Claude-Code stream-json lines.
const sys = (sessionId: string) =>
JSON.stringify({ type: 'system', subtype: 'init', session_id: sessionId, tools: ['read', 'edit'] });
const streamEvent = (event: unknown) => JSON.stringify({ type: 'stream_event', event });
const textDelta = (index: number, text: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'text_delta', text } });
const thinkingDelta = (index: number, thinking: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } });
const toolStart = (index: number, id: string, name: string) =>
streamEvent({ type: 'content_block_start', index, content_block: { type: 'tool_use', id, name } });
const inputJsonDelta = (index: number, partial: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: partial } });
const blockStop = (index: number) => streamEvent({ type: 'content_block_stop', index });
const resultLine = (input: number, output: number, sessionId?: string) =>
JSON.stringify({ type: 'result', subtype: 'success', session_id: sessionId, usage: { input_tokens: input, output_tokens: output } });
describe('parseStreamJsonLine (pure per-line mapping)', () => {
it('captures session_id from the system init line and emits no events', () => {
const state = makeStreamJsonState();
const events = parseStreamJsonLine(sys('sess-abc'), state);
expect(events).toEqual([]);
expect(state.sessionId).toBe('sess-abc');
});
it('maps a text_delta stream_event → a text event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(textDelta(0, 'Hello'), state)).toEqual([{ type: 'text', text: 'Hello' }]);
});
it('maps a thinking_delta stream_event → a reasoning event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(thinkingDelta(0, 'pondering'), state)).toEqual([
{ type: 'reasoning', text: 'pondering' },
]);
});
it('tolerates a garbage / non-JSON line (returns [], no throw)', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine('not json at all {{{', state)).toEqual([]);
expect(parseStreamJsonLine('', state)).toEqual([]);
expect(parseStreamJsonLine(' ', state)).toEqual([]);
// A truncated/partial JSON object also yields [] rather than throwing.
expect(parseStreamJsonLine('{"type":"stream_event","eve', state)).toEqual([]);
});
it('ignores unknown top-level line types and the user (tool-result) line', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(JSON.stringify({ type: 'user', message: {} }), state)).toEqual([]);
expect(parseStreamJsonLine(JSON.stringify({ type: 'whatever' }), state)).toEqual([]);
});
it('assembles a tool call across input_json_delta chunks (split across lines)', () => {
const state = makeStreamJsonState();
// start → tool_call (running, empty args)
const start = parseStreamJsonLine(toolStart(1, 'toolu_1', 'edit_file'), state);
expect(start).toHaveLength(1);
expect(start[0]!.type).toBe('tool_call');
const startSnap = (start[0] as { type: 'tool_call'; toolCall: AcpToolSnapshot }).toolCall;
expect(startSnap.toolCallId).toBe('toolu_1');
expect(startSnap.title).toBe('edit_file');
expect(startSnap.status).toBe('in_progress');
expect(startSnap.rawInput).toEqual({});
// args streamed in fragments — no events until stop
expect(parseStreamJsonLine(inputJsonDelta(1, '{"path":"a'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '.ts","content":'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '"hi"}'), state)).toEqual([]);
// stop → tool_update with the parsed, fully-assembled input
const stop = parseStreamJsonLine(blockStop(1), state);
expect(stop).toHaveLength(1);
expect(stop[0]!.type).toBe('tool_update');
const stopSnap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(stopSnap.toolCallId).toBe('toolu_1');
expect(stopSnap.status).toBe('completed');
expect(stopSnap.rawInput).toEqual({ path: 'a.ts', content: 'hi' });
});
it('falls back to {_raw} when accumulated tool args are not valid JSON', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(toolStart(0, 'toolu_x', 'run'), state);
parseStreamJsonLine(inputJsonDelta(0, '{"broken'), state);
const stop = parseStreamJsonLine(blockStop(0), state);
const snap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.rawInput).toEqual({ _raw: '{"broken' });
});
it('captures usage from message_delta and result lines', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(streamEvent({ type: 'message_delta', usage: { output_tokens: 42 } }), state);
expect(state.usage.outputTokens).toBe(42);
parseStreamJsonLine(resultLine(100, 250, 'sess-z'), state);
expect(state.usage.inputTokens).toBe(100);
expect(state.usage.outputTokens).toBe(250);
expect(state.sessionId).toBe('sess-z');
});
it('maps a terminal assistant message (fallback) → text + reasoning + tool events', () => {
const state = makeStreamJsonState();
const line = JSON.stringify({
type: 'assistant',
session_id: 'sess-asst',
message: {
content: [
{ type: 'thinking', thinking: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{ type: 'tool_use', id: 'toolu_9', name: 'view_file', input: { path: 'x.ts' } },
],
usage: { input_tokens: 5, output_tokens: 7 },
},
});
const events = parseStreamJsonLine(line, state);
expect(events).toEqual([
{ type: 'reasoning', text: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_9', title: 'view_file', kind: null, status: 'completed', rawInput: { path: 'x.ts' } },
},
]);
expect(state.usage).toEqual({ inputTokens: 5, outputTokens: 7 });
expect(state.sessionId).toBe('sess-asst');
});
});
describe('makeStreamJsonParser (stateful wrapper over a full turn)', () => {
it('streams a representative turn: init → text → thinking → tool → result', () => {
const parser = makeStreamJsonParser();
const all: AgentEvent[] = [];
const feed = (line: string): AgentEventList => {
const evs = parser.push(line);
all.push(...evs);
return evs;
};
feed(sys('sess-1'));
feed(textDelta(0, 'Reading '));
feed(textDelta(0, 'the file. '));
feed(thinkingDelta(0, 'I should edit it'));
feed(toolStart(1, 'toolu_a', 'edit_file'));
feed(inputJsonDelta(1, '{"path":'));
feed(inputJsonDelta(1, '"main.ts"}'));
feed(blockStop(1));
feed(textDelta(0, 'Done.'));
feed(resultLine(120, 80, 'sess-1'));
expect(all).toEqual([
{ type: 'text', text: 'Reading ' },
{ type: 'text', text: 'the file. ' },
{ type: 'reasoning', text: 'I should edit it' },
{
type: 'tool_call',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'in_progress', rawInput: {} },
},
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'completed', rawInput: { path: 'main.ts' } },
},
{ type: 'text', text: 'Done.' },
]);
expect(parser.usage()).toEqual({ inputTokens: 120, outputTokens: 80 });
expect(parser.sessionId()).toBe('sess-1');
});
it('a garbage line interleaved mid-turn does not derail subsequent parsing', () => {
const parser = makeStreamJsonParser();
expect(parser.push(textDelta(0, 'a'))).toEqual([{ type: 'text', text: 'a' }]);
expect(parser.push('>>> not json <<<')).toEqual([]);
expect(parser.push(textDelta(0, 'b'))).toEqual([{ type: 'text', text: 'b' }]);
});
});

View File

@@ -1,5 +1,25 @@
import { promises as fs } from 'node:fs';
import { dirname, isAbsolute, join, resolve } from 'node:path';
import { dirname, isAbsolute, resolve, sep } from 'node:path';
/**
* Resolve an ACP-supplied path against the agent worktree and reject anything
* that escapes it. Mirrors `write_guard.ts`'s check: `resolve()` to normalize
* `../` segments, then a **separator-bounded** prefix test — a bare
* `startsWith(root)` wrongly admits a sibling dir like `<root>-evil/...`.
*
* No realpath (consistent with `write_guard.ts`: the target may not exist yet on
* write). This is a containment guard for the ACP fs bridge, not a hard trust
* boundary — the agent process already runs with host FS access; symlink-swap
* hardening (`O_NOFOLLOW`/realpath) is out of scope here.
*/
function resolveInWorktree(worktreePath: string, filePath: string): string {
const root = resolve(worktreePath);
const absolute = isAbsolute(filePath) ? resolve(filePath) : resolve(root, filePath);
if (absolute !== root && !absolute.startsWith(root + sep)) {
throw new Error(`path escapes worktree: ${filePath}`);
}
return absolute;
}
/** Resolve an ACP path against the agent worktree and read a slice of lines. */
export async function readWorktreeTextFile(
@@ -8,10 +28,7 @@ export async function readWorktreeTextFile(
line?: number | null,
limit?: number | null,
): Promise<string> {
const absolute = isAbsolute(filePath) ? filePath : resolve(worktreePath, filePath);
if (!absolute.startsWith(resolve(worktreePath))) {
throw new Error(`path escapes worktree: ${filePath}`);
}
const absolute = resolveInWorktree(worktreePath, filePath);
const raw = await fs.readFile(absolute, 'utf8');
if (!line && !limit) return raw;
const lines = raw.split(/\r?\n/);
@@ -26,10 +43,7 @@ export async function writeWorktreeTextFile(
filePath: string,
content: string,
): Promise<void> {
const absolute = isAbsolute(filePath) ? filePath : resolve(worktreePath, filePath);
if (!absolute.startsWith(resolve(worktreePath))) {
throw new Error(`path escapes worktree: ${filePath}`);
}
const absolute = resolveInWorktree(worktreePath, filePath);
await fs.mkdir(dirname(absolute), { recursive: true });
await fs.writeFile(absolute, content, 'utf8');
}

View File

@@ -0,0 +1,88 @@
/**
* Shared ACP `Client` builder — the callback closures every ACP connection needs
* (worktree-scoped FS bridge + permission/elicitation routing + session updates).
*
* Extracted (v2.7 audit reshape) from the byte-identical `buildClient` closures in
* `acp-dispatch.ts` (one-shot) and `backends/warm-acp.ts` (warm). The two differed
* only in WHERE the per-turn context comes from (a fixed dispatch vs. the warm
* backend's `activeTurn`) and a trivially-equivalent permission gate — both are now
* supplied via the `resolveTurn` callback, so the FS/permission/elicitation wiring
* lives once. Behavior is preserved exactly:
* - `sessionUpdate` drops when `resolveTurn()` returns null (between turns).
* - permission/elicitation route to the UI only when BOTH a taskId AND sessionId
* are present (warm always has a sessionId, so this matches its prior
* `turn?.taskId` gate); otherwise the same auto-select-first / decline fallback.
*/
import type {
Client,
SessionNotification,
RequestPermissionRequest,
RequestPermissionResponse,
ReadTextFileRequest,
ReadTextFileResponse,
WriteTextFileRequest,
WriteTextFileResponse,
CreateTerminalRequest,
CreateTerminalResponse,
CreateElicitationRequest,
CreateElicitationResponse,
} from '@agentclientprotocol/sdk';
import { readWorktreeTextFile, writeWorktreeTextFile } from './acp-client-fs.js';
import { waitForPermissionResponse, waitForElicitationResponse } from './permission-waiter.js';
/** The per-turn context an ACP `Client` closure needs, resolved lazily per call. */
export interface AcpTurnContext {
/** Per-turn task id, for routing permission/elicitation prompts back to the UI. */
taskId: string | undefined;
/** BooCode session id (for permission-waiter's broker frames). */
sessionId: string | undefined;
/** Per-turn mode id (autonomous-mode gate in permission-waiter). */
modeId: string | undefined;
/** The agent name (for permission-waiter routing). */
agent: string;
/** Forward a session/update notification to the turn's event sink. */
onSessionUpdate: (params: SessionNotification) => void | Promise<void>;
}
/**
* Build the ACP `Client` callbacks once per connection. `resolveTurn` is called at
* the moment each callback fires and returns the live turn context (or null when no
* turn is active — `sessionUpdate` then drops, matching the warm backend's
* between-turns behavior). The FS bridge is scoped to `worktreePath`.
*/
export function buildAcpClient(worktreePath: string, resolveTurn: () => AcpTurnContext | null): Client {
return {
sessionUpdate: async (params: SessionNotification): Promise<void> => {
const turn = resolveTurn();
if (!turn) return; // between turns — drop (no orphan settles a future turn)
await turn.onSessionUpdate(params);
},
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => {
const turn = resolveTurn();
if (turn && turn.taskId && turn.sessionId) {
return waitForPermissionResponse(turn.taskId, turn.sessionId, turn.agent, turn.modeId, params);
}
const firstOption = params.options[0];
if (firstOption) return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(worktreePath, params.path, params.line, params.limit);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
const turn = resolveTurn();
if (turn && turn.taskId && turn.sessionId) {
return waitForElicitationResponse(turn.taskId, turn.sessionId, turn.agent, turn.modeId, params);
}
return { action: 'decline' };
},
};
}

View File

@@ -9,34 +9,21 @@ import {
ClientSideConnection,
type Client,
type SessionNotification,
type RequestPermissionRequest,
type RequestPermissionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type CreateElicitationRequest,
type CreateElicitationResponse,
type SessionConfigOption,
type ClientSideConnection as ConnectionType,
} from '@agentclientprotocol/sdk';
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/server/ws-frames';
import { spawn } from 'node:child_process';
import { findThoughtLevelConfigId } from './acp-derive.js';
import { resolveAcpSpawnArgs } from './acp-spawn.js';
import { resolveLaunchSpec } from './acp-spawn.js';
import { getResolvedRegistry, type ResolvedProviderDef } from './provider-config-registry.js';
import { createAcpNdJsonStream } from './acp-stream.js';
import { waitForPermissionResponse, waitForElicitationResponse, cancelPendingPermission } from './permission-waiter.js';
import { mergeTaskCommands, getTaskCommands } from './agent-commands-cache.js';
import { readWorktreeTextFile, writeWorktreeTextFile } from './acp-client-fs.js';
import {
type AcpToolSnapshot,
mergeToolSnapshot,
snapshotToWireToolCall,
synthesizeCanceledSnapshots,
} from './acp-tool-snapshot.js';
import { cancelPendingPermission } from './permission-waiter.js';
import { mapSessionUpdate } from './acp-event-map.js';
import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from './acp-tool-snapshot.js';
import { makeFrameEmitter, type FrameEmitter } from './frame-emitter.js';
import { buildAcpClient } from './acp-client.js';
export interface AcpDispatchResult {
exitCode: number;
@@ -59,6 +46,9 @@ export interface AcpDispatchOpts {
messageId?: string;
broker?: Broker;
installPath?: string;
/** v2.3 phase 3: resolved registry def for launch-spec resolution. The
* dispatcher loads this by task.agent; falls back to a registry lookup here. */
resolved?: ResolvedProviderDef;
signal?: AbortSignal;
log: FastifyBaseLogger;
}
@@ -107,162 +97,61 @@ async function applySessionOverrides(
}
class AcpStreamContext {
readonly textChunks: string[] = [];
readonly reasoningChunks: string[] = [];
readonly toolSnapshots = new Map<string, AcpToolSnapshot>();
private aborted = false;
/** AgentEvent → WS-frame mapping + text/reasoning/tool accumulation (shared
* `makeFrameEmitter`). The one-shot path passes no `dcp` stripper, so text is
* emitted verbatim — byte-identical to the prior inline switch. */
private readonly emitter: FrameEmitter;
constructor(
private readonly opts: Pick<
AcpDispatchOpts,
'broker' | 'sessionId' | 'chatId' | 'messageId' | 'taskId'
>,
opts: Pick<AcpDispatchOpts, 'broker' | 'sessionId' | 'chatId' | 'messageId' | 'taskId'>,
private readonly worktreePath: string,
) {}
) {
this.emitter = makeFrameEmitter({
broker: opts.broker,
sessionId: opts.sessionId,
chatId: opts.chatId,
assistantId: opts.messageId,
taskId: opts.taskId,
});
}
get reasoningText(): string {
return this.reasoningChunks.join('');
return this.emitter.reasoningText;
}
get output(): string {
return this.textChunks.join('');
return this.emitter.output;
}
get snapshots(): AcpToolSnapshot[] {
return [...this.toolSnapshots.values()];
return this.emitter.snapshots;
}
markAborted(): void {
this.aborted = true;
for (const snap of synthesizeCanceledSnapshots(this.toolSnapshots.values())) {
this.toolSnapshots.set(snap.toolCallId, snap);
this.publishToolSnapshot(snap);
// Synthesize 'canceled' updates for still-running tool calls so the UI doesn't
// leave them spinning, then emit them through the same frame path (tool_update
// → the same `tool_call` wire frame the original published).
for (const snap of synthesizeCanceledSnapshots(this.emitter.toolSnapshots.values())) {
this.emitter.onEvent({ type: 'tool_update', toolCall: snap });
}
}
private canStream(): boolean {
return !!(this.opts.broker && this.opts.sessionId && this.opts.chatId && this.opts.messageId);
}
private publishToolSnapshot(snapshot: AcpToolSnapshot): void {
if (!this.canStream()) return;
const wire = snapshotToWireToolCall(snapshot);
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'tool_call',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
tool_call: wire,
} as WsFrame);
}
handleToolUpdate(toolCallId: string, update: Parameters<typeof mergeToolSnapshot>[1]): void {
const previous = this.toolSnapshots.get(toolCallId);
const snapshot = mergeToolSnapshot(toolCallId, update, previous);
this.toolSnapshots.set(toolCallId, snapshot);
this.publishToolSnapshot(snapshot);
}
async handleSessionUpdate(params: SessionNotification): Promise<void> {
const update = params.update;
switch (update.sessionUpdate) {
case 'agent_message_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
const text = (content as { text: string }).text;
this.textChunks.push(text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: text,
} as WsFrame);
}
}
break;
}
case 'agent_thought_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
const text = (content as { text: string }).text;
this.reasoningChunks.push(text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'reasoning_delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: text,
} as WsFrame);
}
}
break;
}
case 'tool_call':
this.handleToolUpdate(update.toolCallId, update);
break;
case 'tool_call_update':
this.handleToolUpdate(update.toolCallId, update);
break;
case 'available_commands_update': {
const commands = update.availableCommands.map((cmd) => ({
name: cmd.name,
description: cmd.description ?? undefined,
}));
if (this.opts.taskId && commands.length > 0) {
mergeTaskCommands(this.opts.taskId, commands);
if (this.canStream() && this.opts.sessionId) {
const all = getTaskCommands(this.opts.taskId) ?? commands;
this.opts.broker!.publishFrame(this.opts.sessionId, {
type: 'agent_commands',
task_id: this.opts.taskId,
session_id: this.opts.sessionId,
commands: all,
} as WsFrame);
}
}
break;
}
default:
break;
handleSessionUpdate(params: SessionNotification): void {
// The merge accumulator (`this.emitter.toolSnapshots`) is the same Map the
// emitter publishes from, so a later tool_call_update merges over its tool_call.
for (const event of mapSessionUpdate(params, this.emitter.toolSnapshots)) {
this.emitter.onEvent(event);
}
}
buildClient(agent: string, modeId: string | undefined, taskId: string | undefined, sessionId: string | undefined): Client {
return {
sessionUpdate: (params) => this.handleSessionUpdate(params),
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => {
if (taskId && sessionId) {
return waitForPermissionResponse(taskId, sessionId, agent, modeId, params);
}
const firstOption = params.options[0];
if (firstOption) {
return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
}
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(
this.worktreePath,
params.path,
params.line,
params.limit,
);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(this.worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
if (taskId && sessionId) {
return waitForElicitationResponse(taskId, sessionId, agent, modeId, params);
}
return { action: 'decline' };
},
};
return buildAcpClient(this.worktreePath, () => ({
taskId,
sessionId,
modeId,
agent,
onSessionUpdate: (params) => this.handleSessionUpdate(params),
}));
}
}
@@ -282,8 +171,12 @@ export async function dispatchViaAcp(opts: AcpDispatchOpts): Promise<AcpDispatch
broker,
} = opts;
const args = resolveAcpSpawnArgs(agent);
if (!args) {
// v2.3 phase 3: launch from the resolved registry def (config override /
// custom-ACP command) with the built-in switch as the fallback. The dispatcher
// passes `resolved`; fall back to a registry lookup if it didn't.
const resolved = opts.resolved ?? getResolvedRegistry().get(agent);
const spec = resolved ? resolveLaunchSpec(resolved, installPath ?? null) : null;
if (!spec) {
return {
exitCode: 1,
output: `Agent '${agent}' does not support ACP.`,
@@ -293,12 +186,11 @@ export async function dispatchViaAcp(opts: AcpDispatchOpts): Promise<AcpDispatch
};
}
const binary = installPath ?? agent;
log.info({ agent, binary, worktreePath, modeId, model: opts.model }, 'acp-dispatch: spawning');
const child = spawn(binary, args, {
log.info({ agent, binary: spec.binary, worktreePath, modeId, model: opts.model }, 'acp-dispatch: spawning');
const child = spawn(spec.binary, spec.args, {
cwd: worktreePath,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env },
env: { ...process.env, ...spec.env },
});
const streamCtx = new AcpStreamContext(

View File

@@ -0,0 +1,68 @@
/**
* Shared ACP session-update → normalized AgentEvent mapping.
*
* Extracted verbatim (v2.6 Phase 2) from `AcpStreamContext.handleSessionUpdate`
* in `acp-dispatch.ts` so the warm ACP backend (`backends/warm-acp.ts`) and the
* one-shot dispatch share ONE mapping. The one-shot path translates the returned
* events into broker frames itself (preserving its prior behavior byte-for-byte);
* the warm backend forwards them to the dispatcher's `ctx.onEvent` exactly like
* the opencode-server backend does. No I/O, no broker — pure, so it's unit-testable.
*
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2b.
*/
import type { SessionNotification } from '@agentclientprotocol/sdk';
import type { AgentEvent } from './agent-backend.js';
import { type AcpToolSnapshot, mergeToolSnapshot } from './acp-tool-snapshot.js';
/**
* Map one ACP `session/update` notification to zero-or-more normalized AgentEvents.
*
* `priorSnapshots` is the caller-owned tool-call snapshot accumulator (toolCallId →
* snapshot). For `tool_call` / `tool_call_update` the merged snapshot is written
* back into it (mutated in place, mirroring `AcpStreamContext.handleToolUpdate`)
* so a later `tool_call_update` merges over the earlier `tool_call`. Pass an empty
* Map for a stateless single call.
*
* Returns an array (never throws) so the caller can splat it onto `onEvent`.
*/
export function mapSessionUpdate(
params: SessionNotification,
priorSnapshots: Map<string, AcpToolSnapshot> = new Map(),
): AgentEvent[] {
const update = params.update;
switch (update.sessionUpdate) {
case 'agent_message_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
return [{ type: 'text', text: (content as { text: string }).text }];
}
return [];
}
case 'agent_thought_chunk': {
const content = update.content;
if (content.type === 'text' && 'text' in content) {
return [{ type: 'reasoning', text: (content as { text: string }).text }];
}
return [];
}
case 'tool_call': {
const snapshot = mergeToolSnapshot(update.toolCallId, update, priorSnapshots.get(update.toolCallId));
priorSnapshots.set(update.toolCallId, snapshot);
return [{ type: 'tool_call', toolCall: snapshot }];
}
case 'tool_call_update': {
const snapshot = mergeToolSnapshot(update.toolCallId, update, priorSnapshots.get(update.toolCallId));
priorSnapshots.set(update.toolCallId, snapshot);
return [{ type: 'tool_update', toolCall: snapshot }];
}
case 'available_commands_update': {
const commands = update.availableCommands.map((cmd) => ({
name: cmd.name,
description: cmd.description ?? undefined,
}));
return [{ type: 'commands', commands }];
}
default:
return [];
}
}

View File

@@ -130,6 +130,17 @@ export async function probeAcpProvider(
});
const session = await connection.newSession({ cwd, mcpServers: [] });
// available_commands_update is an async session notification opencode sends
// shortly AFTER newSession resolves — reading probedCommands synchronously
// here races it and captures nothing. Wait briefly for the first batch, then
// a short settle for any stragglers (capped well under PROBE_TIMEOUT_MS).
const deadline = Date.now() + 3_000;
while (probedCommands.length === 0 && Date.now() < deadline) {
await new Promise((r) => setTimeout(r, 150));
}
if (probedCommands.length > 0) {
await new Promise((r) => setTimeout(r, 300));
}
const result = parseSessionResponse(session, agent);
result.commands = probedCommands;
await connection.closeSession({ sessionId: session.sessionId }).catch(() => {});

View File

@@ -1,15 +1,15 @@
import type { ResolvedProviderDef } from './provider-config-registry.js';
/**
* Resolve ACP spawn argv per provider (host-probe verified 2026-05-25).
* Resolve ACP spawn argv per built-in provider (host-probe verified 2026-05-25).
* Source of truth for built-in default argv — resolveLaunchSpec wraps these; it
* does NOT replace them.
*/
export function resolveAcpSpawnArgs(agent: string): string[] | null {
switch (agent) {
case 'opencode':
case 'goose':
return ['acp'];
case 'cursor':
return ['acp'];
case 'copilot':
return ['--acp'];
case 'qwen':
return ['--acp'];
default:
@@ -17,13 +17,34 @@ export function resolveAcpSpawnArgs(agent: string): string[] | null {
}
}
export function resolveAcpProbeBinaries(agent: string): string[] {
switch (agent) {
case 'cursor':
return ['cursor-agent', 'agent'];
case 'copilot':
return ['copilot'];
default:
return [agent];
/**
* v2.3 phase 3: resolve the launch spec for an ACP dispatch (design.md §5.1).
* Consults the resolved registry's launchCommand (config override or custom-ACP
* entry) first; otherwise falls back to the built-in default argv above.
*
* Byte-identical to pre-v2.3 for built-ins with no override: binary is
* `installPath ?? id` and args come from resolveAcpSpawnArgs — exactly the
* `binary = installPath ?? agent` + `resolveAcpSpawnArgs(agent)` the dispatcher
* used before. (Deliberate deviation from design §5.1's `!installPath → null`:
* the old path spawned the bare agent name when install_path was missing, so we
* preserve the `?? id` fallback rather than fail.)
*/
export function resolveLaunchSpec(
resolved: ResolvedProviderDef,
installPath: string | null,
): { binary: string; args: string[]; env?: Record<string, string> } | null {
if (resolved.launchCommand) {
return {
binary: resolved.launchCommand[0],
args: resolved.launchCommand.slice(1),
env: resolved.env,
};
}
const args = resolveAcpSpawnArgs(resolved.id);
if (!args) return null;
return { binary: installPath ?? resolved.id, args, env: resolved.env };
}
export function resolveAcpProbeBinaries(agent: string): string[] {
return [agent];
}

View File

@@ -2,7 +2,7 @@ import { Readable, Writable } from 'node:stream';
import type { ChildProcess } from 'node:child_process';
import { ndJsonStream } from '@agentclientprotocol/sdk';
export function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableStream<Uint8Array> {
function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableStream<Uint8Array> {
return new ReadableStream<Uint8Array>({
start(controller) {
nodeStream.on('data', (chunk: Buffer) => controller.enqueue(new Uint8Array(chunk)));
@@ -17,7 +17,7 @@ export function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableSt
});
}
export function nodeWritableToWeb(nodeStream: NodeJS.WritableStream): WritableStream<Uint8Array> {
function nodeWritableToWeb(nodeStream: NodeJS.WritableStream): WritableStream<Uint8Array> {
return new WritableStream<Uint8Array>({
write(chunk) {
return new Promise<void>((resolve, reject) => {

View File

@@ -0,0 +1,125 @@
/**
* v2.6 — AgentBackend abstraction (Phase 0 scaffold; types only, zero runtime logic).
*
* The core abstraction for persistent agent sessions. Two implementations land
* later: `OpenCodeServerBackend` (Phase 1, opencode HTTP server) and
* `WarmAcpBackend` (Phase 2, long-lived ACP process). Backends emit
* transport-agnostic `AgentEvent`s; the dispatcher maps them to WS frames.
*
* Nothing imports this file yet — it must compile standalone.
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2.
*/
import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
import type { AgentCommand } from './provider-types.js';
/** Backend transport kind. Mirrors `agent_sessions.backend` CHECK in schema.sql. */
export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk';
/**
* Normalized, transport-agnostic events a backend emits during a turn (§2).
* Derived from acp-dispatch's session-update handling, but WITHOUT the WS
* envelope (message_id/chat_id) — the dispatcher owns frame mapping.
*
* `tool_call` vs `tool_update` are kept distinct on purpose: acp-dispatch
* currently merges both into one snapshot frame, but opencode's SSE
* distinguishes tool-start from tool-result, so the contract carries both.
* `commands` mirrors the ACP `available_commands_update` path (v2.5.10).
*/
export type AgentEvent =
| { type: 'text'; text: string }
| { type: 'reasoning'; text: string }
| { type: 'tool_call'; toolCall: AcpToolSnapshot }
| { type: 'tool_update'; toolCall: AcpToolSnapshot }
| { type: 'commands'; commands: AgentCommand[] };
/** Params to establish (or look up) a backend session (§2). */
export interface EnsureSessionOpts {
agent: string;
/** Resolved model id. */
model: string;
/** P1.5-b: the chat (tab) this turn belongs to. agent_sessions is keyed
* (chat_id, agent) — the tab/chat is the context unit. Always non-null:
* the dispatcher creates a chat for session-less tasks before calling. */
chatId: string;
/** Shared per-session worktree (one per `sessions.id`, not per pane). */
worktreePath: string;
/** P1.5-b: the `worktrees.id` for this session's worktree — stored on the
* agent_sessions row informationally (NOT the key). */
worktreeId: string;
projectId: string;
}
/** Opaque handle to a live backend session, persisted to `agent_sessions` (§2). */
export interface AgentSessionHandle {
sessionId: string;
agent: string;
backend: AgentBackendKind;
/** P1.5-b: the chat (tab) this session is keyed on (with agent). */
chatId: string;
/** P1.5-b: the worktree this session's chat runs in (informational link). */
worktreeId: string;
/** Provider's own session id (resume token); null until the backend assigns one. */
agentSessionId: string | null;
/** opencode HTTP server port; null for ACP backends. */
serverPort: number | null;
}
/** Per-turn context passed to `prompt` (§2). */
export interface PromptCtx {
worktreePath: string;
model: string;
signal: AbortSignal;
onEvent: (e: AgentEvent) => void;
/** Phase 2: per-turn task id, so a warm ACP backend can route permission /
* elicitation prompts back to the UI via the permission-waiter. Optional —
* the opencode-server backend (autonomous) ignores it. */
taskId?: string;
/** Phase 2: per-turn mode id (gates autonomous mode in the permission-waiter). */
modeId?: string;
}
/** Result of a completed turn (§2). Diff/persist happen outside the backend. */
export interface TurnResult {
ok: boolean;
error?: string;
// Optional context-window telemetry (claude SDK): the model's reported window
// (ctxMax, 1M-aware) and the peak request input ≈ current fill (ctxUsed). The
// dispatcher writes these onto the assistant message so the ContextBar renders a
// real fill for the turn. Omitted by backends that don't report a window.
ctxUsed?: number;
ctxMax?: number;
}
/**
* The core backend abstraction (§2). Implementations: OpenCodeServerBackend
* (Phase 1), WarmAcpBackend (Phase 2).
*/
export interface AgentBackend {
/** Lazy: spawn server / warm process if not already up for this (session, agent). §2 */
ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle>;
/** Send a prompt; stream events via ctx.onEvent; resolves when the turn completes. §2 */
prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult>;
/** Graceful teardown of one session (session close or idle timeout). §2 */
closeSession(handle: AgentSessionHandle): Promise<void>;
/** Full teardown — kills all spawned servers/processes. §2 */
dispose(): Promise<void>;
/** Liveness for health endpoint + dispatcher fallback decision. §2 */
health(): 'up' | 'down';
/**
* v2.6 Phase 3: true iff a turn is in flight on this backend. The pool's idle
* eviction + LRU cap NEVER evict a busy backend (design §6 busy rule); the
* health-monitor defers a restart while busy (stale-grace). Optional so the
* Phase-0 scaffold and any test double stay compatible — absent ⇒ treated as
* not busy. opencode-server (multi-session) is busy iff ANY session has an
* active turn; warm-acp (single session) iff its one slot is active.
*/
isBusy?(): boolean;
/**
* v2.6 Phase 3: optional proactive health probe + busy-aware self-restart, run
* by the pool's periodic sweep. The opencode-server backend implements it
* (detects a hung-but-not-exited server and restarts when non-busy). Backends
* with no long-lived shared process (warm-ACP recovers lazily on its own child
* exit) can omit it. Must never throw — the sweep ignores rejections.
*/
tickHealth?(now?: number): Promise<void>;
}

View File

@@ -0,0 +1,246 @@
/**
* v2.6 — AgentPool.
*
* Lazy get-or-create registry of `AgentBackend` instances keyed by
* `${primary}:${agent}` (primary = chatId for warm-ACP, a fixed sentinel for the
* single shared opencode server). Phase 0 shipped the skeleton (Map + health +
* dispose). Phase 3 adds the LIFECYCLE: per-entry idle tracking, a periodic
* idle-TTL + LRU-cap sweep (the pure decisions live in
* `backends/lifecycle-decisions.ts`), and a `closeChat` helper for the chat-close
* hook. Reattach after eviction is implicit — the next turn's `ensureSession`
* rebuilds the backend from `agent_sessions` / `worktrees` (DB is the source of
* truth; the in-memory pool is a warm cache).
*
* The hard rule (design §6): NEVER evict a busy backend (one with an in-flight
* turn). `selectIdleEvictionTargets` / `selectLruEvictionTargets` enforce it via
* `backend.isBusy()`; a long turn that outlives the TTL is left alone.
*
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2 / §6.
*/
import type { FastifyBaseLogger } from 'fastify';
import type { AgentBackend } from './agent-backend.js';
import {
selectIdleEvictionTargets,
selectLruEvictionTargets,
DEFAULT_IDLE_TTL_MS,
DEFAULT_MAX_LIVE_BACKENDS,
} from './backends/lifecycle-decisions.js';
interface PoolEntry {
primary: string;
agent: string;
backend: AgentBackend;
/** Epoch ms of the last turn boundary (register or touch). Drives idle/LRU. */
lastActiveAt: number;
}
export interface AgentPoolOpts {
/** Idle TTL before a non-busy backend is evicted. Default 30 min. */
idleTtlMs?: number;
/** Max live backends before the LRU cap evicts the least-recently-used. */
maxLive?: number;
/** Sweep cadence. Default 60s (mirrors the server's periodic sweeper). */
sweepIntervalMs?: number;
log?: FastifyBaseLogger;
}
const DEFAULT_SWEEP_INTERVAL_MS = 60_000;
export class AgentPool {
private readonly backends = new Map<string, PoolEntry>();
private idleTtlMs: number;
private maxLive: number;
private sweepIntervalMs: number;
private log: FastifyBaseLogger | undefined;
private sweepTimer: ReturnType<typeof setInterval> | null = null;
/** Serializes sweep runs so a slow eviction can't overlap the next tick. */
private sweeping = false;
constructor(opts: AgentPoolOpts = {}) {
this.idleTtlMs = opts.idleTtlMs ?? DEFAULT_IDLE_TTL_MS;
this.maxLive = opts.maxLive ?? DEFAULT_MAX_LIVE_BACKENDS;
this.sweepIntervalMs = opts.sweepIntervalMs ?? DEFAULT_SWEEP_INTERVAL_MS;
this.log = opts.log;
}
/** Apply env-derived knobs to the module singleton at bootstrap (before
* startReaper). Only overrides explicitly-provided fields. */
configure(opts: AgentPoolOpts): void {
if (opts.idleTtlMs != null) this.idleTtlMs = opts.idleTtlMs;
if (opts.maxLive != null) this.maxLive = opts.maxLive;
if (opts.sweepIntervalMs != null) this.sweepIntervalMs = opts.sweepIntervalMs;
if (opts.log) this.log = opts.log;
}
private key(primary: string, agent: string): string {
return `${primary}:${agent}`;
}
/** Map lookup only. Spawning happens in the dispatcher (Phase 1/2). A hit also
* marks the entry recently-active so a resolve-without-prompt doesn't get it
* evicted out from under an imminent turn. */
get(primary: string, agent: string): AgentBackend | undefined {
const entry = this.backends.get(this.key(primary, agent));
if (entry) entry.lastActiveAt = Date.now();
return entry?.backend;
}
/** Store a backend instance for this (primary, agent). */
register(primary: string, agent: string, backend: AgentBackend): void {
this.backends.set(this.key(primary, agent), { primary, agent, backend, lastActiveAt: Date.now() });
}
/** Mark a backend recently-active (call at turn start AND settle so a long turn
* keeps its slot warm). No-op if the key isn't pooled. */
touch(primary: string, agent: string): void {
const entry = this.backends.get(this.key(primary, agent));
if (entry) entry.lastActiveAt = Date.now();
}
/** Snapshot for the decision helpers (busy is read live from the backend). */
private snapshots(): { key: string; lastActiveAt: number; busy: boolean }[] {
const out: { key: string; lastActiveAt: number; busy: boolean }[] = [];
for (const [key, e] of this.backends) {
out.push({ key, lastActiveAt: e.lastActiveAt, busy: e.backend.isBusy?.() ?? false });
}
return out;
}
/** Summary for the health endpoint. */
health(): { size: number; busy: number } {
let busy = 0;
for (const e of this.backends.values()) if (e.backend.isBusy?.()) busy++;
return { size: this.backends.size, busy };
}
// ─── Phase 3: idle-TTL + LRU eviction sweep ──────────────────────────────────
/** Start the periodic idle + LRU sweep. Idempotent; unref'd so it never holds
* the process open on its own. */
startReaper(log?: FastifyBaseLogger): void {
if (log) this.log = log;
if (this.sweepTimer) return;
this.sweepTimer = setInterval(() => {
void this.sweep().catch((err) => {
this.log?.warn({ err: errMsg(err) }, 'agent-pool: sweep error');
});
}, this.sweepIntervalMs);
this.sweepTimer.unref?.();
}
stopReaper(): void {
if (this.sweepTimer) {
clearInterval(this.sweepTimer);
this.sweepTimer = null;
}
}
/**
* One sweep pass: evict idle-past-TTL backends, then enforce the LRU cap.
* Deduped (a key can't appear in both lists for one pass). Busy backends are
* excluded by the decision helpers — a live turn is never torn down.
*/
async sweep(now: number = Date.now()): Promise<{ evicted: string[] }> {
if (this.sweeping) return { evicted: [] };
this.sweeping = true;
try {
// Phase 3: drive each backend's optional proactive health probe first (the
// opencode server's busy-aware hung-detect + self-restart). Best-effort —
// a probe must never fail the sweep.
for (const e of this.backends.values()) {
if (e.backend.tickHealth) {
await e.backend.tickHealth(now).catch((err) => {
this.log?.warn({ key: this.key(e.primary, e.agent), err: errMsg(err) }, 'agent-pool: tickHealth threw');
});
}
}
const snaps = this.snapshots();
const idle = selectIdleEvictionTargets(snaps, now, this.idleTtlMs);
// LRU runs on what remains after idle eviction, so the two never double-evict.
const idleSet = new Set(idle);
const remaining = snaps.filter((s) => !idleSet.has(s.key));
const lru = selectLruEvictionTargets(remaining, this.maxLive);
const targets = [...idle, ...lru];
if (targets.length === 0) return { evicted: [] };
const evicted: string[] = [];
for (const key of targets) {
const entry = this.backends.get(key);
if (!entry) continue;
// Re-check busy right before teardown — a turn may have started since the
// snapshot. Defensive; the decision already excluded busy at snapshot time.
if (entry.backend.isBusy?.()) continue;
this.backends.delete(key);
try {
await entry.backend.dispose();
} catch (err) {
this.log?.warn({ key, err: errMsg(err) }, 'agent-pool: backend dispose threw during eviction');
}
evicted.push(key);
}
if (evicted.length > 0) {
this.log?.info({ evicted, size: this.backends.size }, 'agent-pool: evicted idle/over-cap backends');
}
return { evicted };
} finally {
this.sweeping = false;
}
}
// ─── Phase 3: chat-close cleanup (3.3) ───────────────────────────────────────
/**
* Tear down every pooled backend whose key is for this chat. Used by the
* chat-close hook. The opencode server is shared (keyed on a sentinel, not the
* chat), so it is NOT disposed here — only its session is closed via
* `closeSession`, which the hook calls directly with the per-(chat,agent)
* handle. Returns the keys it removed. Skips busy entries (a close mid-turn is
* rare but must not kill a live stream — the idle sweep reaps it shortly after).
*/
async closeChat(chatId: string): Promise<string[]> {
const removed: string[] = [];
const prefix = `${chatId}:`;
for (const [key, entry] of [...this.backends]) {
if (!key.startsWith(prefix)) continue;
if (entry.backend.isBusy?.()) continue;
this.backends.delete(key);
try {
await entry.backend.dispose();
} catch (err) {
this.log?.warn({ key, err: errMsg(err) }, 'agent-pool: dispose threw during closeChat');
}
removed.push(key);
}
return removed;
}
/** Look up a backend by exact key without bumping its activity (for closeSession). */
peek(primary: string, agent: string): AgentBackend | undefined {
return this.backends.get(this.key(primary, agent))?.backend;
}
/** Dispose every backend and clear the map. Tolerates throwing backends. */
async dispose(): Promise<void> {
this.stopReaper();
const entries = [...this.backends.values()];
this.backends.clear();
await Promise.allSettled(entries.map((e) => e.backend.dispose()));
}
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
/**
* The shared opencode server is pooled under a FIXED sentinel (one server per
* BooCoder process, multiplexing all opencode sessions internally) rather than a
* chat id — so it is NOT torn down by `closeChat(chatId)` (only its per-chat
* session is closed). Exported so the dispatcher + the lifecycle close-hook agree
* on the key without drift.
*/
export const OPENCODE_POOL_KEY = '__opencode_server__';
/** Single shared instance — registered by the dispatcher, swept + drained by the
* server's onClose hook. */
export const agentPool = new AgentPool();

View File

@@ -1,24 +1,34 @@
import type { Sql } from '../db.js';
import type { FastifyBaseLogger } from 'fastify';
import { exec as execCb } from 'node:child_process';
import { exec as execCb, execFile as execFileCb } from 'node:child_process';
import { promisify } from 'node:util';
import { PROVIDERS_BY_NAME, PROBED_AGENT_NAMES } from './provider-registry.js';
import { PROVIDERS_BY_NAME } from './provider-registry.js';
import { resolveAcpProbeBinaries } from './acp-spawn.js';
import { clearProviderSnapshotCache } from './provider-snapshot.js';
import { clearProviderSnapshotCache, fetchLlamaSwapModels, prefixLlamaSwapModels } from './provider-snapshot.js';
import { readQwenSettingsModels } from './qwen-settings.js';
import { loadConfig } from '../config.js';
import { loadProviderConfig } from './provider-config-registry.js';
const exec = promisify(execCb);
const execFile = promisify(execFileCb);
// `which` via execFile (no shell) — the binary name can come from the config
// file (custom ACP entries), so avoid interpolating it into a shell string.
async function whichBinary(bin: string): Promise<string | null> {
try {
const { stdout } = await execFile('which', [bin], { timeout: 10_000 });
const path = stdout.trim();
return path || null;
} catch {
return null;
}
}
async function resolveInstallPath(agentName: string): Promise<string | null> {
const candidates = resolveAcpProbeBinaries(agentName);
for (const bin of candidates) {
try {
const { stdout } = await exec(`which ${bin}`, { timeout: 10_000 });
const path = stdout.trim();
if (path) return path;
} catch {
/* try next */
}
const path = await whichBinary(bin);
if (path) return path;
}
return null;
}
@@ -27,15 +37,6 @@ async function detectAcpSupport(agentName: string, installPath: string): Promise
const transport = PROVIDERS_BY_NAME.get(agentName)?.transport;
if (transport !== 'acp') return false;
if (agentName === 'copilot') {
try {
const { stdout } = await exec(`"${installPath}" --help`, { timeout: 10_000 });
return stdout.includes('--acp');
} catch {
return false;
}
}
if (agentName === 'qwen') {
try {
const { stdout } = await exec(`"${installPath}" --help`, { timeout: 10_000 });
@@ -55,14 +56,37 @@ async function detectAcpSupport(agentName: string, installPath: string): Promise
/**
* Probe for available agents on the HOST.
*
* v2.3: iterates the resolved provider registry (built-ins + config-backed
* custom ACP entries) rather than the hardcoded `PROBED_AGENT_NAMES`. Native
* boocode is not probed; disabled providers are skipped (their `available_agents`
* row is kept, not deleted). `enabled` is read from the in-memory registry only —
* no DB column in Phase 1 (design.md §3.3).
*/
export async function probeAgents(sql: Sql, log: FastifyBaseLogger): Promise<void> {
clearProviderSnapshotCache();
log.info('agent-probe: scanning for known agents');
for (const agentName of PROBED_AGENT_NAMES) {
const registry = loadProviderConfig(loadConfig().CODER_PROVIDERS_PATH);
for (const resolved of registry.values()) {
const agentName = resolved.id;
// Native boocode is not a probed host agent.
if (resolved.transport === 'native') continue;
// Disabled providers: skip the probe, keep any existing row.
if (!resolved.enabled) {
log.info({ agent: agentName }, 'agent-probe: skipping disabled provider');
continue;
}
try {
const installPath = await resolveInstallPath(agentName);
// Custom ACP entries resolve their binary from command[0]; built-ins use
// the per-agent probe binaries.
const installPath = resolved.isCustomAcp && resolved.launchCommand
? await whichBinary(resolved.launchCommand[0])
: await resolveInstallPath(agentName);
if (!installPath) continue;
let version: string | null = null;
@@ -73,24 +97,43 @@ export async function probeAgents(sql: Sql, log: FastifyBaseLogger): Promise<voi
/* optional */
}
const providerDef = PROVIDERS_BY_NAME.get(agentName);
let supportsAcp = providerDef?.transport === 'acp';
if (supportsAcp) {
supportsAcp = await detectAcpSupport(agentName, installPath);
// Custom ACP entries are ACP by declaration; built-ins detect support.
let supportsAcp: boolean;
if (resolved.isCustomAcp) {
supportsAcp = true;
} else {
supportsAcp = resolved.transport === 'acp';
if (supportsAcp) {
supportsAcp = await detectAcpSupport(agentName, installPath);
}
}
let models: Array<{ id: string; label: string }> = [];
if (providerDef?.modelSource === 'static' && providerDef.staticModels) {
models = providerDef.staticModels;
if (!resolved.isCustomAcp) {
const providerDef = PROVIDERS_BY_NAME.get(agentName);
if (providerDef?.modelSource === 'static' && providerDef.staticModels) {
models = providerDef.staticModels;
}
if (agentName === 'qwen') {
models = await readQwenSettingsModels();
}
if (providerDef?.mergeLlamaSwap) {
try {
const config = loadConfig();
const llamaModels = prefixLlamaSwapModels(await fetchLlamaSwapModels(config));
models = [...models, ...llamaModels];
} catch (err) {
log.warn({ agent: agentName, err: err instanceof Error ? err.message : String(err) }, 'agent-probe: llama-swap model fetch failed (non-fatal)');
}
}
}
if (agentName === 'qwen') {
models = await readQwenSettingsModels();
}
const label = providerDef?.label ?? agentName;
const transport =
providerDef?.transport === 'acp' && !supportsAcp ? 'pty' : (providerDef?.transport ?? 'pty');
const label = resolved.configLabel ?? resolved.label;
const transport = resolved.isCustomAcp
? 'acp'
: resolved.transport === 'acp' && !supportsAcp
? 'pty'
: (resolved.transport ?? 'pty');
await sql`
INSERT INTO available_agents (name, install_path, version, supports_acp, last_probed_at, models, label, transport)

View File

@@ -0,0 +1,55 @@
/**
* agent-status-publish (#10) — builds + publishes the `agent_status_updated`
* WS frame on the per-session channel (the same channel CoderPane subscribes to).
*
* Kept separate from normalize-agent-status.ts so that module stays a pure,
* broker-free helper (trivially unit-testable; reused by the config-injection
* follow-on). The frame contract is pinned in apps/server/src/types/ws-frames.ts
* (`AgentStatusUpdatedFrame`) and mirrored byte-identical in apps/web.
*/
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import type { AgentStatus } from './normalize-agent-status.js';
// The exact slice of Broker we need — accepting just the bound method keeps call
// sites flexible (pass `broker.publishFrame.bind(broker)` or, since the broker's
// publishFrame doesn't read `this`, `broker.publishFrame` directly).
type PublishFrame = Broker['publishFrame'];
/**
* Best-effort publish of a normalized agent status. The broker's publishFrame
* already fail-closes (validates + logs + drops on bad input, never throws), but
* we additionally swallow any unexpected error so a publish can NEVER break the
* turn it's reporting on.
*
* @param publishFrame the session channel publisher (broker.publishFrame)
* @param sessionId WS subscription channel (CoderPane subscribes per-session)
* @param chatId the (chat) half of the (chat, agent) status key
* @param agent the (agent) half of the key
* @param status normalized lifecycle status
* @param reason free-form discriminator (turn_start / turn_complete / …)
* @param at ISO timestamp; defaults to now
*/
export function publishAgentStatus(
publishFrame: PublishFrame,
sessionId: string,
chatId: string,
agent: string,
status: AgentStatus,
reason?: string,
at: string = new Date().toISOString(),
): void {
try {
const frame: WsFrame = {
type: 'agent_status_updated',
chat_id: chatId,
agent,
status,
...(reason ? { reason } : {}),
at,
};
publishFrame(sessionId, frame);
} catch {
// never let a status publish break the turn — best-effort only.
}
}

View File

@@ -0,0 +1,251 @@
import { describe, it, expect } from 'vitest';
import type { SDKMessage } from '@anthropic-ai/claude-agent-sdk';
import { mapSdkMessage, createClaudeSdkMapState } from '../claude-sdk-map.js';
import type { AgentEvent } from '../../agent-backend.js';
/**
* Pure mapper for Claude-SDK messages → AgentEvents (claude-sdk-sessionstore #9 Part 2).
* Verifies the partial-stream → live-delta mapping, tool assembly across blocks, and
* the final-assistant dedup, with no live `claude` binary involved.
*
* Messages are cast through `unknown` to `SDKMessage`: the real SDK shapes carry many
* fields (uuid, parent_tool_use_id, …) irrelevant to the mapper, which reads only the
* `type`/`event`/`message.content` it discriminates on. The cast keeps the fixtures
* minimal while the production code path sees the full real types (the backend's
* typecheck against the real SDK is the type-safety proof).
*/
function msg(m: unknown): SDKMessage {
return m as SDKMessage;
}
/** A partial-stream message wrapping one BetaRawMessageStreamEvent. */
function streamEvent(event: unknown): SDKMessage {
return msg({ type: 'stream_event', event, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
describe('mapSdkMessage — partial stream deltas', () => {
it('maps a text_delta to a text event', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Hello' } }),
state,
);
expect(out).toEqual<AgentEvent[]>([{ type: 'text', text: 'Hello' }]);
});
it('maps a thinking_delta to a reasoning event', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
streamEvent({
type: 'content_block_delta',
index: 0,
delta: { type: 'thinking_delta', thinking: 'pondering', estimated_tokens: null },
}),
state,
);
expect(out).toEqual<AgentEvent[]>([{ type: 'reasoning', text: 'pondering' }]);
});
it('drops empty text/thinking deltas', () => {
const state = createClaudeSdkMapState();
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: '' } }), state),
).toEqual([]);
expect(
mapSdkMessage(
streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'thinking_delta', thinking: '', estimated_tokens: null } }),
state,
),
).toEqual([]);
});
it('ignores message framing + signature/citation deltas', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(streamEvent({ type: 'message_start', message: {} }), state)).toEqual([]);
expect(mapSdkMessage(streamEvent({ type: 'message_stop' }), state)).toEqual([]);
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'signature_delta', signature: 'x' } }), state),
).toEqual([]);
});
});
describe('mapSdkMessage — tool assembly across blocks', () => {
it('opens a tool_call on content_block_start, buffers input_json_delta, emits tool_update with parsed input on stop', () => {
const state = createClaudeSdkMapState();
const started = mapSdkMessage(
streamEvent({
type: 'content_block_start',
index: 1,
content_block: { type: 'tool_use', id: 'tool-1', name: 'view_file', input: {} },
}),
state,
);
expect(started).toEqual<AgentEvent[]>([
{ type: 'tool_call', toolCall: { toolCallId: 'tool-1', title: 'view_file', kind: null, status: 'in_progress', rawInput: {}, rawOutput: undefined } },
]);
// args stream in fragments under the same block index
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '{"path":' } }), state),
).toEqual([]);
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '"a.ts"}' } }), state),
).toEqual([]);
const stopped = mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 1 }), state);
expect(stopped).toHaveLength(1);
const ev = stopped[0]!;
expect(ev.type).toBe('tool_update');
if (ev.type === 'tool_update') {
expect(ev.toolCall.toolCallId).toBe('tool-1');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.rawInput).toEqual({ path: 'a.ts' });
}
});
it('content_block_stop for a non-tool block (no tracked index) emits nothing', () => {
const state = createClaudeSdkMapState();
// text block was streamed at index 0 but never tracked as a tool
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'hi' } }), state);
expect(mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 0 }), state)).toEqual([]);
});
it('falls back to the prior input when the buffered tool JSON is invalid', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 2, content_block: { type: 'tool_use', id: 't2', name: 'grep', input: { q: 'seed' } } }),
state,
);
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 2, delta: { type: 'input_json_delta', partial_json: '{not json' } }), state);
const stopped = mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 2 }), state);
const ev = stopped[0]!;
if (ev.type === 'tool_update') {
expect(ev.toolCall.rawInput).toEqual({ q: 'seed' });
} else {
throw new Error('expected tool_update');
}
});
});
describe('mapSdkMessage — final assistant message', () => {
function assistant(content: unknown[]): SDKMessage {
return msg({ type: 'assistant', message: { content }, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
it('dedups text/thinking (already streamed) and emits a completed tool_update per tool_use block', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
assistant([
{ type: 'text', text: 'final answer', citations: null },
{ type: 'thinking', thinking: 'reasoned', signature: 'sig' },
{ type: 'tool_use', id: 'tool-9', name: 'find_files', input: { glob: '**/*.ts' } },
]),
state,
);
expect(out).toEqual<AgentEvent[]>([
{
type: 'tool_update',
toolCall: { toolCallId: 'tool-9', title: 'find_files', kind: null, status: 'completed', rawInput: { glob: '**/*.ts' }, rawOutput: undefined },
},
]);
});
it('preserves a title from a prior partial tool_call snapshot', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 0, content_block: { type: 'tool_use', id: 'tool-x', name: 'view_file', input: {} } }),
state,
);
const out = mapSdkMessage(assistant([{ type: 'tool_use', id: 'tool-x', name: 'view_file', input: { path: 'z' } }]), state);
const ev = out[0]!;
if (ev.type === 'tool_update') {
expect(ev.toolCall.status).toBe('completed');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.rawInput).toEqual({ path: 'z' });
} else {
throw new Error('expected tool_update');
}
});
});
describe('mapSdkMessage — non-content messages', () => {
it('returns [] for system/init, status, result, and other variants', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(msg({ type: 'system', subtype: 'init', session_id: 's', uuid: 'u' }), state)).toEqual([]);
expect(mapSdkMessage(msg({ type: 'system', subtype: 'status', status: null, session_id: 's', uuid: 'u' }), state)).toEqual([]);
expect(
mapSdkMessage(msg({ type: 'result', subtype: 'success', result: 'done', session_id: 's', uuid: 'u' }), state),
).toEqual([]);
});
});
describe('mapSdkMessage — user tool results', () => {
/** A `user` message carrying tool_result blocks (the SDK feeds tool output back here). */
function userMsg(content: unknown): SDKMessage {
return msg({ type: 'user', message: { role: 'user', content }, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
it('maps a string tool_result to a completed tool_update carrying the output', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'done' }]), state);
expect(out).toEqual<AgentEvent[]>([
{
type: 'tool_update',
toolCall: { toolCallId: 't1', title: 't1', kind: null, status: 'completed', rawInput: undefined, rawOutput: 'done' },
},
]);
});
it('marks an is_error result failed', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'boom', is_error: true }]), state);
const ev = out[0]!;
if (ev.type !== 'tool_update') throw new Error('expected tool_update');
expect(ev.toolCall.status).toBe('failed');
expect(ev.toolCall.rawOutput).toBe('boom');
});
it('flattens array text blocks (skipping non-text) and reuses a prior snapshot title', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 't2', name: 'view_file', input: {} } }),
state,
);
const out = mapSdkMessage(
userMsg([
{
type: 'tool_result',
tool_use_id: 't2',
content: [
{ type: 'text', text: 'line1' },
{ type: 'image', source: {} },
{ type: 'text', text: 'line2' },
],
},
]),
state,
);
const ev = out[0]!;
if (ev.type !== 'tool_update') throw new Error('expected tool_update');
expect(ev.toolCall.toolCallId).toBe('t2');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.status).toBe('completed');
expect(ev.toolCall.rawOutput).toBe('line1\nline2');
});
it('surfaces a result for an unknown tool_use_id with the id as the title', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 'orphan-id', content: 'x' }]), state);
expect(out[0]).toMatchObject({
type: 'tool_update',
toolCall: { toolCallId: 'orphan-id', title: 'orphan-id', kind: null, status: 'completed' },
});
});
it('ignores non-tool_result blocks and non-array content', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(userMsg([{ type: 'text', text: 'hi' }]), state)).toEqual([]);
expect(mapSdkMessage(userMsg('plain string'), state)).toEqual([]);
});
});

View File

@@ -0,0 +1,49 @@
import { describe, it, expect } from 'vitest';
import { shouldUseClaudeSdk, claudeSdkBackendEnabled } from '../claude-sdk-routing.js';
/**
* Env-flagged routing for the warm Claude-SDK backend. With CLAUDE_SDK_BACKEND off
* (the production default) every claude task falls through to the unchanged PTY path;
* with it on, only chat-tab claude tasks (session_id + chat_id) route to the SDK.
*/
const ON = { CLAUDE_SDK_BACKEND: '1' } as NodeJS.ProcessEnv;
const OFF = {} as NodeJS.ProcessEnv;
describe('claudeSdkBackendEnabled', () => {
it('is false when unset or falsy', () => {
expect(claudeSdkBackendEnabled({} as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '0' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'false' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'off' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'no' } as NodeJS.ProcessEnv)).toBe(false);
});
it('is true for any other truthy value', () => {
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '1' } as NodeJS.ProcessEnv)).toBe(true);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'true' } as NodeJS.ProcessEnv)).toBe(true);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'on' } as NodeJS.ProcessEnv)).toBe(true);
});
});
describe('shouldUseClaudeSdk', () => {
it('is always false while the env flag is off — production claude stays on PTY', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: 'c1' }, OFF)).toBe(false);
});
it('routes a chat-tab claude task to the SDK when the flag is on', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: 'c1' }, ON)).toBe(true);
});
it('only applies to the claude agent', () => {
expect(shouldUseClaudeSdk({ agent: 'qwen', session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'opencode', session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: null, session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
});
it('requires both session_id and chat_id (session-less creators stay one-shot)', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: null, chat_id: null }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: null }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: null, chat_id: 'c1' }, ON)).toBe(false);
});
});

View File

@@ -0,0 +1,135 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import postgres from 'postgres';
import { PostgresSessionStore } from '../claude-session-store.js';
import type { SessionStoreEntry } from '@anthropic-ai/claude-agent-sdk';
/**
* claude-sdk-sessionstore #9 (Part 1) — PostgresSessionStore tests.
*
* DB-opt-in (DATABASE_URL), mirrors checkpoints.test.ts: skips cleanly when the
* var is unset; otherwise applies the server + coder schemas and exercises the
* real append/load/listSessions/delete/listSubkeys round trips against postgres.
* Rows are namespaced under a unique project_key so concurrent suites / leftover
* data can't collide, and afterAll deletes everything written.
*/
describe.runIf(!!process.env.DATABASE_URL)('PostgresSessionStore (DB)', () => {
let sql: ReturnType<typeof postgres>;
let store: PostgresSessionStore;
const projectKey = `claude-store-test-${Date.now()}`;
const entry = (type: string, extra: Record<string, unknown> = {}): SessionStoreEntry => ({
type,
...extra,
});
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
const serverSchema = resolve(__dirname, '../../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
store = new PostgresSessionStore(sql);
});
afterAll(async () => {
if (sql) {
await sql`DELETE FROM claude_session_entries WHERE project_key = ${projectKey}`.catch(() => {});
await sql.end({ timeout: 5 });
}
});
it('append → load round-trips and preserves order across two appends', async () => {
const key = { projectKey, sessionId: 'sess-order' };
await store.append(key, [entry('user', { uuid: 'u1' }), entry('assistant', { uuid: 'a1' })]);
await store.append(key, [entry('result', { uuid: 'r1' })]);
const loaded = await store.load(key);
expect(loaded).not.toBeNull();
expect(loaded!.map((e) => e.uuid)).toEqual(['u1', 'a1', 'r1']);
expect(loaded!.map((e) => e.type)).toEqual(['user', 'assistant', 'result']);
});
it('append with an empty batch is a no-op (load still null for an otherwise-unseen key)', async () => {
const key = { projectKey, sessionId: 'sess-empty' };
await store.append(key, []);
expect(await store.load(key)).toBeNull();
});
it('load of a key that was never written returns null', async () => {
expect(await store.load({ projectKey, sessionId: 'never-seen' })).toBeNull();
});
it('isolates the main transcript from a subpath (load each independently)', async () => {
const sessionId = 'sess-subpath';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/x' };
await store.append(mainKey, [entry('user', { uuid: 'main-1' })]);
await store.append(subKey, [entry('assistant', { uuid: 'sub-1' })]);
const main = await store.load(mainKey);
const sub = await store.load(subKey);
expect(main!.map((e) => e.uuid)).toEqual(['main-1']);
expect(sub!.map((e) => e.uuid)).toEqual(['sub-1']);
});
it('listSessions returns the session with a numeric mtime (main transcripts only)', async () => {
const sessionId = 'sess-list';
await store.append({ projectKey, sessionId }, [entry('user', { uuid: 'l1' })]);
// A subagent-only session must NOT surface as a main-transcript session.
await store.append(
{ projectKey, sessionId: 'sess-sub-only', subpath: 'subagents/y' },
[entry('user', { uuid: 's1' })],
);
const sessions = await store.listSessions(projectKey);
const ids = sessions.map((s) => s.sessionId);
expect(ids).toContain(sessionId);
expect(ids).not.toContain('sess-sub-only');
const row = sessions.find((s) => s.sessionId === sessionId)!;
expect(typeof row.mtime).toBe('number');
expect(Number.isFinite(row.mtime)).toBe(true);
expect(row.mtime).toBeGreaterThan(0);
});
it('delete with a subpath removes only that subpath', async () => {
const sessionId = 'sess-del-subpath';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/z' };
await store.append(mainKey, [entry('user', { uuid: 'keep-1' })]);
await store.append(subKey, [entry('assistant', { uuid: 'drop-1' })]);
await store.delete(subKey);
expect(await store.load(subKey)).toBeNull();
expect((await store.load(mainKey))!.map((e) => e.uuid)).toEqual(['keep-1']);
});
it('delete without a subpath removes the whole session (all subpaths)', async () => {
const sessionId = 'sess-del-all';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/w' };
await store.append(mainKey, [entry('user', { uuid: 'm' })]);
await store.append(subKey, [entry('assistant', { uuid: 's' })]);
await store.delete({ projectKey, sessionId });
expect(await store.load(mainKey)).toBeNull();
expect(await store.load(subKey)).toBeNull();
expect(await store.listSubkeys({ projectKey, sessionId })).toEqual([]);
});
it('listSubkeys returns the distinct non-main subpaths', async () => {
const sessionId = 'sess-subkeys';
await store.append({ projectKey, sessionId }, [entry('user', { uuid: 'main' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/a' }, [entry('user', { uuid: 'a1' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/a' }, [entry('user', { uuid: 'a2' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/b' }, [entry('user', { uuid: 'b1' })]);
const subkeys = await store.listSubkeys({ projectKey, sessionId });
expect(subkeys.sort()).toEqual(['subagents/a', 'subagents/b']);
});
});

View File

@@ -0,0 +1,176 @@
import { describe, it, expect } from 'vitest';
import {
selectIdleEvictionTargets,
selectLruEvictionTargets,
decideRestart,
selectOrphanWorktreeTargets,
DEFAULT_IDLE_TTL_MS,
DEFAULT_MAX_LIVE_BACKENDS,
type PoolEntrySnapshot,
} from '../lifecycle-decisions.js';
/**
* v2.6 Phase 3 — pure lifecycle decisions. No DB, no children, no timers; `now`
* is injected. Models prune.ts:selectPruneTargets — the caller acts on the keys.
*/
const NOW = 1_000_000_000_000;
function entry(key: string, ageMs: number, busy = false): PoolEntrySnapshot {
return { key, lastActiveAt: NOW - ageMs, busy };
}
describe('selectIdleEvictionTargets (3.1)', () => {
it('evicts entries idle past the TTL', () => {
const entries = [
entry('a:opencode', DEFAULT_IDLE_TTL_MS + 1),
entry('b:goose', DEFAULT_IDLE_TTL_MS - 1),
];
expect(selectIdleEvictionTargets(entries, NOW)).toEqual(['a:opencode']);
});
it('never evicts a busy entry even when idle past the TTL', () => {
const entries = [entry('a:opencode', DEFAULT_IDLE_TTL_MS * 10, /* busy */ true)];
expect(selectIdleEvictionTargets(entries, NOW)).toEqual([]);
});
it('respects a custom TTL', () => {
const entries = [entry('a:goose', 5_000), entry('b:qwen', 500)];
expect(selectIdleEvictionTargets(entries, NOW, 1_000)).toEqual(['a:goose']);
});
it('treats exactly-at-TTL as evictable (>=)', () => {
expect(selectIdleEvictionTargets([entry('a:x', 1_000)], NOW, 1_000)).toEqual(['a:x']);
});
it('returns empty for an empty pool', () => {
expect(selectIdleEvictionTargets([], NOW)).toEqual([]);
});
});
describe('selectLruEvictionTargets (3.4)', () => {
it('returns nothing when at or under the cap', () => {
const entries = [entry('a:x', 10), entry('b:y', 20)];
expect(selectLruEvictionTargets(entries, 2)).toEqual([]);
expect(selectLruEvictionTargets(entries, 5)).toEqual([]);
});
it('evicts the least-recently-used beyond the cap', () => {
// oldest first: c (300ms ago) is LRU, then a (100ms), then b (10ms).
const entries = [entry('a:x', 100), entry('b:y', 10), entry('c:z', 300)];
expect(selectLruEvictionTargets(entries, 2)).toEqual(['c:z']);
});
it('evicts multiple LRU entries to reach the cap', () => {
const entries = [
entry('a:x', 100),
entry('b:y', 10),
entry('c:z', 300),
entry('d:w', 200),
];
// cap 1: must remove 3, oldest-first c(300), d(200), a(100).
expect(selectLruEvictionTargets(entries, 1)).toEqual(['c:z', 'd:w', 'a:x']);
});
it('never evicts a busy entry even if it is the LRU', () => {
// c is LRU but busy → it cannot be evicted; fall to the next-oldest (a).
const entries = [entry('a:x', 100), entry('b:y', 10), entry('c:z', 300, true)];
expect(selectLruEvictionTargets(entries, 2)).toEqual(['a:x']);
});
it('can transiently exceed the cap when too many are busy', () => {
// cap 1, but both old entries busy → only the single idle one is evictable.
const entries = [entry('a:x', 100, true), entry('c:z', 300, true), entry('b:y', 10)];
expect(selectLruEvictionTargets(entries, 1)).toEqual(['b:y']);
});
it('uses the default cap when omitted', () => {
const entries = Array.from({ length: DEFAULT_MAX_LIVE_BACKENDS + 1 }, (_, i) =>
entry(`k${String(i).padStart(2, '0')}:a`, (i + 1) * 1000),
);
const evicted = selectLruEvictionTargets(entries);
// exactly one over the default cap → evict the single LRU (largest age).
expect(evicted).toHaveLength(1);
expect(evicted[0]).toBe(`k${String(DEFAULT_MAX_LIVE_BACKENDS).padStart(2, '0')}:a`);
});
});
describe('decideRestart (3.2, busy-aware)', () => {
const base = {
consecutiveFailures: 0,
busy: false,
unhealthyBusySince: 0,
now: NOW,
failureThreshold: 3,
staleBusyGraceMs: 120_000,
};
it('does nothing when healthy', () => {
expect(decideRestart({ ...base, processExited: false, healthy: true }))
.toEqual({ action: 'none', reason: 'healthy' });
});
it('restarts immediately when the process exited', () => {
expect(decideRestart({ ...base, processExited: true, busy: true }))
.toEqual({ action: 'restart', reason: 'process-exited' });
});
it('waits below the failure threshold', () => {
expect(decideRestart({ ...base, processExited: false, consecutiveFailures: 2 }))
.toEqual({ action: 'wait', reason: 'below-threshold' });
});
it('restarts at the threshold when idle', () => {
expect(decideRestart({ ...base, processExited: false, consecutiveFailures: 3 }))
.toEqual({ action: 'restart', reason: 'threshold' });
});
it('defers a restart while busy within the grace window', () => {
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: NOW - 1_000,
})).toEqual({ action: 'wait', reason: 'busy-grace' });
});
it('force-restarts a busy backend after the stale-busy grace', () => {
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: NOW - 120_001,
})).toEqual({ action: 'restart', reason: 'stale-busy-grace' });
});
it('waits (busy-grace) when busy + threshold but the window just started', () => {
// unhealthyBusySince === 0 means the caller is about to stamp it this cycle.
expect(decideRestart({
...base, processExited: false, consecutiveFailures: 5, busy: true,
unhealthyBusySince: 0,
})).toEqual({ action: 'wait', reason: 'busy-grace' });
});
});
describe('selectOrphanWorktreeTargets (3.4)', () => {
it('skips dirs tracked by a live worktrees row', () => {
const onDisk = [{ path: '/wt/sess-a', mtimeMs: NOW - 10_000_000 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(['/wt/sess-a']), NOW, 1000)).toEqual([]);
});
it('reaps an untracked dir older than the grace', () => {
const onDisk = [{ path: '/wt/sess-orphan', mtimeMs: NOW - 5000 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(), NOW, 1000)).toEqual(['/wt/sess-orphan']);
});
it('never reaps a dir younger than the grace (mid-create race)', () => {
const onDisk = [{ path: '/wt/sess-fresh', mtimeMs: NOW - 500 }];
expect(selectOrphanWorktreeTargets(onDisk, new Set(), NOW, 1000)).toEqual([]);
});
it('mixes tracked, fresh, and orphaned correctly', () => {
const onDisk = [
{ path: '/wt/sess-live', mtimeMs: NOW - 10_000 },
{ path: '/wt/sess-fresh', mtimeMs: NOW - 100 },
{ path: '/wt/sess-orphan', mtimeMs: NOW - 10_000 },
];
expect(selectOrphanWorktreeTargets(onDisk, new Set(['/wt/sess-live']), NOW, 1000))
.toEqual(['/wt/sess-orphan']);
});
});

View File

@@ -0,0 +1,173 @@
import { describe, it, expect, vi } from 'vitest';
import type { Event, OpencodeClient } from '@opencode-ai/sdk/v2/client';
import {
reconnectDecision,
runSessionEventLoop,
DEFAULT_RECONNECT_POLICY,
type SessionState,
type SseLoopDeps,
} from '../opencode-sse.js';
import { shouldStartServer } from '../opencode-server-process.js';
/**
* v2.7 concurrency hardening (Phase 7): the pure decision cores for SSE reconnect
* backoff + the ensureServer double-spawn guard, plus a deterministic exercise of
* the loop's breaker (injected sleep, fake client). Happy path is asserted to be
* unchanged (clean stream end → reset → base-delay reconnect).
*/
function freshState(): SessionState {
return {
boocodeSessionId: 'boo1',
agentSessionId: 'oc1',
worktreePath: '/wt',
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: { onEvent: () => {}, settle: () => {} },
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
};
}
const silentLog = {
warn: () => {},
info: () => {},
error: () => {},
debug: () => {},
} as unknown as SseLoopDeps['log'];
describe('reconnectDecision (pure backoff + breaker)', () => {
it('first failure uses the base delay (matches pre-hardening flat delay)', () => {
expect(reconnectDecision(1)).toEqual({ action: 'reconnect', delayMs: DEFAULT_RECONNECT_POLICY.baseMs });
});
it('grows exponentially and caps at maxMs', () => {
const policy = { baseMs: 1000, maxMs: 30_000, maxAttempts: 10 };
expect(reconnectDecision(2, policy)).toEqual({ action: 'reconnect', delayMs: 2000 });
expect(reconnectDecision(3, policy)).toEqual({ action: 'reconnect', delayMs: 4000 });
expect(reconnectDecision(6, policy)).toEqual({ action: 'reconnect', delayMs: 30_000 }); // 32000 capped
expect(reconnectDecision(9, policy)).toEqual({ action: 'reconnect', delayMs: 30_000 });
});
it('gives up once failures exceed maxAttempts', () => {
const policy = { baseMs: 1, maxMs: 8, maxAttempts: 3 };
expect(reconnectDecision(3, policy).action).toBe('reconnect');
expect(reconnectDecision(4, policy)).toEqual({ action: 'give-up' });
});
});
describe('shouldStartServer (double-spawn guard)', () => {
it('does not start when the server is live', () => {
expect(shouldStartServer({ up: true, hasClient: true, serverStarting: true, childDead: false, startInFlight: false })).toBe(false);
});
it('starts on a fresh process (no start in flight)', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: false, childDead: false, startInFlight: false })).toBe(true);
});
it('re-spawns after a crash once the prior start finished', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: true, startInFlight: false })).toBe(true);
});
it('does NOT double-spawn while a start is already in flight (the race fix)', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: true, startInFlight: true })).toBe(false);
});
it('does NOT double-spawn when a crash nulled serverStarting mid-start', () => {
// The narrow window: a crash during the in-flight start (await freePort) nulls
// serverStarting while startInFlight is still true. The startInFlight guard must
// win over the !serverStarting branch, else a second server spawns on a new port.
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: false, childDead: true, startInFlight: true })).toBe(false);
});
it('waits (no spawn) when a cached start exists and the child is still alive', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: false, startInFlight: false })).toBe(false);
});
});
describe('runSessionEventLoop — happy path (unchanged)', () => {
it('dispatches streamed events, reconciles on clean end, reconnects at base delay', async () => {
const state = freshState();
const abort = new AbortController();
const events = [
{ type: 'session.next.text.delta', properties: { sessionID: 'oc1', delta: 'hi' } },
{ type: 'session.idle', properties: { sessionID: 'oc1' } },
] as unknown as Event[];
const client = {
event: {
subscribe: vi.fn(async () => ({
stream: (async function* () {
for (const ev of events) yield ev;
})(),
})),
},
} as unknown as OpencodeClient;
const dispatched: Event[] = [];
const sleeps: number[] = [];
let reconciles = 0;
const deps: SseLoopDeps = {
isUp: () => true,
getClient: () => client,
dispatchEvent: (ev) => dispatched.push(ev),
reconcile: async () => {
reconciles += 1;
abort.abort(); // stop the loop after the first clean cycle
return false;
},
onReconnectGiveUp: () => {
throw new Error('should not give up on the happy path');
},
log: silentLog,
sleep: async (ms) => {
sleeps.push(ms);
},
};
await runSessionEventLoop(state, abort, deps);
expect(dispatched).toHaveLength(2);
expect(reconciles).toBe(1);
expect(sleeps).toEqual([DEFAULT_RECONNECT_POLICY.baseMs]); // base delay, not backed off
});
});
describe('runSessionEventLoop — circuit breaker', () => {
it('backs off on repeated throws then gives up + fails the turn', async () => {
const state = freshState();
const abort = new AbortController();
const policy = { baseMs: 1, maxMs: 8, maxAttempts: 3 };
const subscribe = vi.fn(async () => {
throw new Error('connection refused');
});
const client = { event: { subscribe } } as unknown as OpencodeClient;
const sleeps: number[] = [];
const gaveUp = vi.fn();
const deps: SseLoopDeps = {
isUp: () => true,
getClient: () => client,
dispatchEvent: () => {},
reconcile: async () => false,
onReconnectGiveUp: gaveUp,
log: silentLog,
sleep: async (ms) => {
sleeps.push(ms);
},
policy,
};
await runSessionEventLoop(state, abort, deps);
// 3 backoff sleeps (1, 2, 4), then the 4th failure trips the breaker.
expect(sleeps).toEqual([1, 2, 4]);
expect(subscribe).toHaveBeenCalledTimes(4);
expect(gaveUp).toHaveBeenCalledTimes(1);
expect(gaveUp).toHaveBeenCalledWith(state);
});
});

View File

@@ -0,0 +1,226 @@
import { describe, it, expect } from 'vitest';
import type { Event, Part } from '@opencode-ai/sdk/v2/client';
import {
stripDcpTags,
eventSessionId,
resolvePartDedupeKey,
mapToolStatus,
toolPartToSnapshot,
toolCalledSnapshot,
toolSuccessSnapshot,
toolFailedSnapshot,
classifyPartDelta,
classifyUpdatedPart,
errToString,
errMsg,
type DedupState,
} from '../opencode-event-map.js';
/**
* Pure opencode Event → AgentEvent translation + dedup gate (v2.7 audit reshape).
* Mirrors the original `dispatchEvent` / `handleUpdatedPart` arms verbatim — no
* I/O, so it's unit-testable. The slimmed backend keeps the routing + side effects.
*/
function freshDedup(): DedupState {
return { streamedPartKeys: new Set(), partTypeById: new Map() };
}
describe('stripDcpTags', () => {
it('removes a complete dcp tag', () => {
expect(stripDcpTags('hi <dcp-message-id>m1</dcp-message-id> there')).toBe('hi there');
});
it('leaves untagged text untouched', () => {
expect(stripDcpTags('plain text <div>')).toBe('plain text <div>');
});
});
describe('eventSessionId', () => {
it('reads properties.sessionID for a normal event', () => {
const ev = { type: 'session.idle', properties: { sessionID: 's1' } } as unknown as Event;
expect(eventSessionId(ev)).toBe('s1');
});
it('reads properties.part.sessionID for message.part.updated', () => {
const ev = {
type: 'message.part.updated',
properties: { part: { sessionID: 's2' } },
} as unknown as Event;
expect(eventSessionId(ev)).toBe('s2');
});
it('returns null when there is no session', () => {
const ev = { type: 'server.connected', properties: {} } as unknown as Event;
expect(eventSessionId(ev)).toBeNull();
});
});
describe('resolvePartDedupeKey', () => {
it('prefers the part id', () => {
expect(resolvePartDedupeKey({ id: 'p1', messageID: 'm1' }, 'text')).toBe('text:p1');
});
it('falls back to the message id', () => {
expect(resolvePartDedupeKey({ id: ' ', messageID: 'm1' }, 'reasoning')).toBe('reasoning:message:m1');
});
it('returns null when neither is present', () => {
expect(resolvePartDedupeKey({ id: '', messageID: '' }, 'text')).toBeNull();
});
});
describe('mapToolStatus', () => {
it('maps the opencode tool states to ACP statuses', () => {
expect(mapToolStatus('pending')).toBe('pending');
expect(mapToolStatus('running')).toBe('in_progress');
expect(mapToolStatus('completed')).toBe('completed');
expect(mapToolStatus('error')).toBe('failed');
expect(mapToolStatus(undefined)).toBeNull();
});
});
describe('session.next.tool.* snapshot builders', () => {
it('toolCalledSnapshot → in_progress with tool title + raw input', () => {
expect(toolCalledSnapshot({ callID: 'c1', tool: 'read_file', input: { path: 'a.ts' } })).toEqual({
toolCallId: 'c1',
title: 'read_file',
kind: null,
status: 'in_progress',
rawInput: { path: 'a.ts' },
rawOutput: undefined,
});
});
it('toolSuccessSnapshot → completed with joined text content', () => {
const snap = toolSuccessSnapshot({ callID: 'c1', content: [{ text: 'foo' }, { text: 'bar' }, { other: 1 }] });
expect(snap.status).toBe('completed');
expect(snap.title).toBe('c1');
expect(snap.rawOutput).toBe('foobar');
});
it('toolSuccessSnapshot → empty output when content is missing', () => {
expect(toolSuccessSnapshot({ callID: 'c1' }).rawOutput).toBe('');
});
it('toolFailedSnapshot → failed with stringified error', () => {
const snap = toolFailedSnapshot({ callID: 'c1', error: 'boom' });
expect(snap.status).toBe('failed');
expect(snap.title).toBe('c1');
expect(snap.rawOutput).toBe('boom');
});
});
describe('toolPartToSnapshot', () => {
it('extracts input/output/title/status from the tool state', () => {
const part = {
type: 'tool',
callID: 'c1',
tool: 'grep',
state: { status: 'completed', input: { q: 'x' }, output: 'result', title: 'Grep run' },
} as unknown as Parameters<typeof toolPartToSnapshot>[0];
expect(toolPartToSnapshot(part)).toEqual({
toolCallId: 'c1',
title: 'Grep run',
kind: null,
status: 'completed',
rawInput: { q: 'x' },
rawOutput: 'result',
});
});
it('falls back to the tool name and uses error as output', () => {
const part = {
type: 'tool',
callID: 'c2',
tool: 'edit',
state: { status: 'error', error: 'nope' },
} as unknown as Parameters<typeof toolPartToSnapshot>[0];
const snap = toolPartToSnapshot(part);
expect(snap.title).toBe('edit');
expect(snap.status).toBe('failed');
expect(snap.rawOutput).toBe('nope');
});
});
describe('classifyPartDelta (message.part.delta dedup recording)', () => {
it('records a reasoning key and emits a reasoning event', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p1', field: 'reasoning', delta: 'thinking' }, st);
expect(e).toEqual({ type: 'reasoning', text: 'thinking' });
expect(st.streamedPartKeys.has('reasoning:p1')).toBe(true);
});
it('records a text key, strips dcp, and emits text', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p2', field: 'text', delta: 'hi <dcp-message-id>m</dcp-message-id>' }, st);
expect(e).toEqual({ type: 'text', text: 'hi ' });
expect(st.streamedPartKeys.has('text:p2')).toBe(true);
});
it('still records the text key even when the cleaned delta is empty', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p3', field: 'text', delta: '<dcp-message-id>m</dcp-message-id>' }, st);
expect(e).toBeNull();
expect(st.streamedPartKeys.has('text:p3')).toBe(true);
});
it('uses the recorded part type when the field is absent', () => {
const st = freshDedup();
st.partTypeById.set('p4', 'reasoning');
const e = classifyPartDelta({ partID: 'p4', delta: 'more' }, st);
expect(e).toEqual({ type: 'reasoning', text: 'more' });
});
it('returns null for an unknown field', () => {
expect(classifyPartDelta({ partID: 'p5', field: 'other', delta: 'x' }, freshDedup())).toBeNull();
});
});
describe('classifyUpdatedPart (message.part.updated dedup gate)', () => {
function textPart(over: Partial<Part> = {}): Part {
return {
type: 'text',
id: 'p1',
messageID: 'm1',
sessionID: 's1',
text: 'final text',
time: { start: 1, end: 2 },
...over,
} as unknown as Part;
}
it('drops a terminal part already streamed via deltas', () => {
const st = freshDedup();
st.streamedPartKeys.add('text:p1');
expect(classifyUpdatedPart(textPart(), st)).toBeNull();
// the key is consumed
expect(st.streamedPartKeys.has('text:p1')).toBe(false);
});
it('emits a finished (ended) text part not seen via deltas', () => {
const st = freshDedup();
expect(classifyUpdatedPart(textPart(), st)).toEqual({ type: 'text', text: 'final text' });
expect(st.partTypeById.get('p1')).toBe('text');
});
it('does not emit a part that has not ended yet', () => {
const st = freshDedup();
expect(classifyUpdatedPart(textPart({ time: { start: 1 } as never }), st)).toBeNull();
});
it('strips dcp tags from the finished text', () => {
const st = freshDedup();
const part = textPart({ text: 'a <dcp-message-id>m</dcp-message-id>b' });
expect(classifyUpdatedPart(part, st)).toEqual({ type: 'text', text: 'a b' });
});
it('maps a running tool part to tool_call', () => {
const st = freshDedup();
const part = { type: 'tool', callID: 'c1', tool: 'grep', state: { status: 'running' } } as unknown as Part;
const e = classifyUpdatedPart(part, st);
expect(e?.type).toBe('tool_call');
});
it('maps a completed tool part to tool_update', () => {
const st = freshDedup();
const part = { type: 'tool', callID: 'c1', tool: 'grep', state: { status: 'completed', output: 'x' } } as unknown as Part;
const e = classifyUpdatedPart(part, st);
expect(e?.type).toBe('tool_update');
});
});
describe('error formatters', () => {
it('errMsg unwraps Error.message', () => {
expect(errMsg(new Error('x'))).toBe('x');
expect(errMsg('plain')).toBe('plain');
});
it('errToString handles null/string/Error/object', () => {
expect(errToString(null)).toBe('unknown error');
expect(errToString('s')).toBe('s');
expect(errToString(new Error('e'))).toBe('e');
expect(errToString({ a: 1 })).toBe('{"a":1}');
});
});

View File

@@ -0,0 +1,51 @@
import { describe, it, expect } from 'vitest';
import { stepEndedToUsage } from '../opencode-usage.js';
describe('stepEndedToUsage (U.6)', () => {
it('folds cache read+write into input and reasoning into output', () => {
const u = stepEndedToUsage({
cost: 0.0123,
tokens: { input: 100, output: 50, reasoning: 20, cache: { read: 10, write: 5 } },
});
expect(u).toEqual({ input: 115, output: 70, cost: 0.0123 });
});
it('handles a step with no cache and no reasoning', () => {
const u = stepEndedToUsage({
cost: 0,
tokens: { input: 8, output: 4, reasoning: 0, cache: { read: 0, write: 0 } },
});
expect(u).toEqual({ input: 8, output: 4, cost: 0 });
});
it('is defensive against a missing tokens block', () => {
const u = stepEndedToUsage({ cost: 0.5 } as never);
expect(u).toEqual({ input: 0, output: 0, cost: 0.5 });
});
it('is defensive against undefined props', () => {
expect(stepEndedToUsage(undefined)).toEqual({ input: 0, output: 0, cost: 0 });
});
it('drops NaN / negative noise to zero rather than poisoning the accumulated total', () => {
const u = stepEndedToUsage({
cost: Number.NaN,
tokens: {
input: -5,
output: Number.NaN,
reasoning: 3,
cache: { read: Number.POSITIVE_INFINITY, write: 2 },
},
});
// input: (-5→0) + (Inf→0) + 2 = 2; output: (NaN→0) + 3 = 3; cost: NaN→0
expect(u).toEqual({ input: 2, output: 3, cost: 0 });
});
it('rounds fractional token counts', () => {
const u = stepEndedToUsage({
cost: 1.5,
tokens: { input: 10.6, output: 4.4, reasoning: 0, cache: { read: 0, write: 0 } },
});
expect(u).toEqual({ input: 11, output: 4, cost: 1.5 });
});
});

View File

@@ -0,0 +1,96 @@
import { describe, it, expect } from 'vitest';
import { createPushable } from '../pushable-iterable.js';
/**
* The pushable async-iterable that feeds the Claude SDK's streaming-input query()
* one message per turn while staying open across turns. Tests cover the ordering
* contract (push/close/async-iterate) without any SDK shape.
*/
describe('createPushable — push/iterate ordering', () => {
it('yields buffered values in FIFO order then parks', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(1);
p.push(2);
expect(await it.next()).toEqual({ value: 1, done: false });
expect(await it.next()).toEqual({ value: 2, done: false });
// No more buffered → next() parks; resolve it by pushing.
const parked = it.next();
p.push(3);
expect(await parked).toEqual({ value: 3, done: false });
});
it('hands a value directly to a parked consumer (push after await)', async () => {
const p = createPushable<string>();
const it = p.iterable[Symbol.asyncIterator]();
const pending = it.next(); // parks immediately (empty buffer)
p.push('hello');
expect(await pending).toEqual({ value: 'hello', done: false });
});
it('close() resolves a parked consumer as done and reports done thereafter', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
const pending = it.next();
p.close();
expect(await pending).toEqual({ value: undefined, done: true });
expect(await it.next()).toEqual({ value: undefined, done: true });
expect(p.closed).toBe(true);
});
it('still drains values buffered BEFORE close', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(10);
p.push(20);
p.close();
expect(await it.next()).toEqual({ value: 10, done: false });
expect(await it.next()).toEqual({ value: 20, done: false });
expect(await it.next()).toEqual({ value: undefined, done: true });
});
it('drops values pushed after close', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.close();
p.push(99); // no-op
expect(await it.next()).toEqual({ value: undefined, done: true });
});
it('close() is idempotent', () => {
const p = createPushable<number>();
p.close();
expect(() => p.close()).not.toThrow();
expect(p.closed).toBe(true);
});
it('works with a for-await loop driven by interleaved pushes', async () => {
const p = createPushable<number>();
const seen: number[] = [];
const consumer = (async () => {
for await (const v of p.iterable) seen.push(v);
})();
p.push(1);
await Promise.resolve();
p.push(2);
await Promise.resolve();
p.close();
await consumer;
expect(seen).toEqual([1, 2]);
});
it('return() on the iterator closes the queue (for-await break)', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(1);
expect(await it.next()).toEqual({ value: 1, done: false });
// Simulate a `break` in for-await: the runtime calls return().
expect(await it.return!()).toEqual({ value: undefined, done: true });
expect(p.closed).toBe(true);
p.push(2); // dropped — queue is closed
expect(await it.next()).toEqual({ value: undefined, done: true });
});
});

View File

@@ -0,0 +1,34 @@
import { describe, it, expect } from 'vitest';
import {
armAbortGuard,
noteTurnActivity,
consumeTerminal,
type AbortTerminalGuard,
} from '../turn-guard.js';
describe('post-abort terminal guard (F.1)', () => {
it('swallows the orphan terminal that follows an abort, then settles the next real one', () => {
// Reproduces the v2.6.5 Stop-button bug: abort turn A, then opencode emits a
// trailing session.idle for A. That orphan must NOT settle the next turn.
const g: AbortTerminalGuard = { swallowNextTerminal: false };
armAbortGuard(g); // user aborts turn A
expect(consumeTerminal(g)).toBe('swallow'); // opencode's orphan idle for A → dropped
expect(consumeTerminal(g)).toBe('settle'); // turn B's real idle → settles B
});
it('settles a terminal when no abort happened', () => {
const g: AbortTerminalGuard = { swallowNextTerminal: false };
expect(consumeTerminal(g)).toBe('settle');
});
it('self-heals if the orphan never arrives: new-turn activity clears the guard', () => {
// If opencode emits no orphan idle (e.g. abort-before-prompt), the next turn's
// real terminal must still settle rather than being swallowed forever.
const g: AbortTerminalGuard = { swallowNextTerminal: false };
armAbortGuard(g); // abort A, but no orphan idle arrives
noteTurnActivity(g); // turn B produces its first delta
expect(consumeTerminal(g)).toBe('settle'); // turn B's idle settles, not swallowed
});
});

View File

@@ -0,0 +1,59 @@
import { describe, it, expect } from 'vitest';
import { shouldUseWarmBackend, isTurnOkForStopReason } from '../warm-acp-routing.js';
/**
* Phase 2 routing predicate: which goose/qwen tasks go to the warm pool backend
* vs the existing one-shot ACP path.
*
* The warm backend is keyed (chat_id, agent) — the persistent context unit (same
* as opencode-server). A task only routes warm when it carries BOTH a session_id
* and a chat_id, i.e. it originates from a real chat tab (the coder message route
* stamps both). Session-less creators (arena, MCP-created, generic /api/tasks,
* new_task) lack chat_id/session_id and keep the one-shot worktree-per-task path,
* which never spawns a warm process.
*/
describe('shouldUseWarmBackend (Phase 2 routing)', () => {
it('routes a chat-tab task (session_id + chat_id) to the warm backend', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: 's1', chat_id: 'c1' })).toBe(true);
expect(shouldUseWarmBackend({ agent: 'goose', session_id: 's1', chat_id: 'c1' })).toBe(true);
});
it('keeps a session-less arena/MCP task on the one-shot path', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: null, chat_id: null })).toBe(false);
});
it('keeps a task with a session but no chat on the one-shot path', () => {
// chat_id is the warm-key half; without it ensureSession would get a degenerate
// (null, agent) key, so fall back to one-shot rather than synthesize a chat.
expect(shouldUseWarmBackend({ agent: 'goose', session_id: 's1', chat_id: null })).toBe(false);
});
it('keeps a task with a chat but no session on the one-shot path', () => {
expect(shouldUseWarmBackend({ agent: 'qwen', session_id: null, chat_id: 'c1' })).toBe(false);
});
it('only applies to warm-capable agents (goose, qwen); others never warm here', () => {
// opencode has its own dedicated warm path; native/claude/etc. are not ACP-warm.
expect(shouldUseWarmBackend({ agent: 'opencode', session_id: 's1', chat_id: 'c1' })).toBe(false);
expect(shouldUseWarmBackend({ agent: 'claude', session_id: 's1', chat_id: 'c1' })).toBe(false);
expect(shouldUseWarmBackend({ agent: null, session_id: 's1', chat_id: 'c1' })).toBe(false);
});
});
describe('isTurnOkForStopReason (ACP stop-reason → ok/fail)', () => {
it('treats normal completions as ok', () => {
expect(isTurnOkForStopReason('end_turn')).toBe(true);
expect(isTurnOkForStopReason('max_tokens')).toBe(true);
expect(isTurnOkForStopReason('max_turn_requests')).toBe(true);
});
it('treats refusal and cancelled as failures', () => {
expect(isTurnOkForStopReason('refusal')).toBe(false);
expect(isTurnOkForStopReason('cancelled')).toBe(false);
});
it('defaults an absent stop reason to a successful end_turn', () => {
expect(isTurnOkForStopReason(undefined)).toBe(true);
expect(isTurnOkForStopReason(null)).toBe(true);
});
});

View File

@@ -0,0 +1,245 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — PURE Claude-SDK message → AgentEvent mapper.
*
* `ClaudeSdkBackend` drives one `query()` per (chat, agent) session and feeds each
* `SDKMessage` it yields through this function, forwarding the returned
* `AgentEvent[]` to the dispatcher's `onEvent` (which maps them to WS frames +
* persists). Kept PURE (one message + a caller-owned accumulator → events) so it's
* unit-testable without a live `claude` binary — the whole point of Part 2's
* typecheck-and-unit-test gate (the live pump needs a host smoke).
*
* SDK shapes (verified against @anthropic-ai/claude-agent-sdk@0.3.159 sdk.d.ts +
* @anthropic-ai/sdk beta messages d.ts):
* - `SDKPartialAssistantMessage` (`type:'stream_event'`) carries a
* `BetaRawMessageStreamEvent` — the LIVE delta stream (only emitted when
* `options.includePartialMessages` is set, which the backend sets). We map:
* · content_block_delta + text_delta → { text }
* · content_block_delta + thinking_delta → { reasoning }
* · content_block_start + tool_use block → { tool_call } (in_progress)
* · content_block_delta + input_json_delta → buffered into the tool's args
* (no event; the assembled input rides the terminal tool_update)
* - `SDKAssistantMessage` (`type:'assistant'`) carries the FINAL `message.content`
* blocks. Text/thinking there are post-hoc repeats of what the partials already
* streamed, so we DROP them (dedup) and only emit a terminal `tool_update`
* (status completed) per `tool_use` block, with its now-complete `input`.
* - All other `SDKMessage` variants (system/init, status, result, hooks, task
* notifications, …) carry no renderable turn content → return [].
*
* Tool assembly spans messages: a tool_use block opens in a partial
* `content_block_start`, its args stream as `input_json_delta` frames keyed by the
* block `index`, and the final assistant message restates the complete block. The
* caller owns a `ClaudeSdkMapState` (snapshot map + per-index tool tracking) that
* threads this across calls, mirroring the `Map<string, AcpToolSnapshot>` the other
* backends pass into `mapSessionUpdate`. The result frames carry the SAME
* `AcpToolSnapshot` shape, so `persistExternalAgentTurn` / `snapshotToWireToolCall`
* are reused unchanged.
*/
import type { SDKMessage } from '@anthropic-ai/claude-agent-sdk';
import type { AgentEvent } from '../agent-backend.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* The underlying `@anthropic-ai/sdk` Beta message types (`BetaRawMessageStreamEvent`,
* `BetaContentBlock`) are a TRANSITIVE dep of `@anthropic-ai/claude-agent-sdk` — not
* a direct dependency of apps/coder — so a `@anthropic-ai/sdk/...` import does NOT
* resolve here under pnpm's strict node_modules. We instead DERIVE both shapes from
* the SDK's own exported message types, which is also more correct (it tracks the
* exact `event` / `content` shapes the SDK yields, not a hand-picked import path).
*/
type StreamEvent = Extract<SDKMessage, { type: 'stream_event' }>['event'];
type AssistantContent = Extract<SDKMessage, { type: 'assistant' }>['message']['content'];
type ContentBlock = AssistantContent extends readonly (infer B)[] ? B : never;
type UserContent = Extract<SDKMessage, { type: 'user' }>['message']['content'];
/**
* Caller-owned accumulator threaded across `mapSdkMessage` calls within ONE turn.
* The backend creates a fresh one per turn and clears it at turn end.
*/
export interface ClaudeSdkMapState {
/** Stable tool-call snapshots by tool_use id, merged across start/delta/stop. */
snapshots: Map<string, AcpToolSnapshot>;
/**
* Partial-stream block index → in-flight tool assembly. Anthropic's stream keys
* blocks by a numeric `index`; tool_use args arrive as `input_json_delta`s under
* that index with no id, so we map index→id to route them and buffer the raw
* JSON fragments until the block closes (or the final assistant message lands).
*/
toolByIndex: Map<number, { id: string; name: string; jsonBuf: string }>;
}
/** Construct a fresh per-turn accumulator. */
export function createClaudeSdkMapState(): ClaudeSdkMapState {
return { snapshots: new Map(), toolByIndex: new Map() };
}
/**
* Map one `SDKMessage` → zero or more `AgentEvent`s, mutating `state` for
* cross-message tool assembly + dedup. Pure w.r.t. its inputs otherwise.
*/
export function mapSdkMessage(msg: SDKMessage, state: ClaudeSdkMapState): AgentEvent[] {
switch (msg.type) {
case 'stream_event':
return mapStreamEvent(msg.event, state);
case 'assistant':
return mapFinalAssistant(msg.message.content, state);
case 'user':
// Tool RESULTS ride in as user messages (tool_result blocks): the SDK ran
// the tool and feeds its output back. Without mapping these, the tool_call
// never reaches a terminal snapshot — it persists as status:'running' with
// no output and the UI spinner never stops (the bug this fixes).
return mapUserToolResults(msg.message.content, state);
default:
// system/init, status, result, hooks, task_*, etc. — no turn content here.
// (The backend reads session_id off the init message and usage/cost off the
// result message directly; neither produces a renderable AgentEvent.)
return [];
}
}
/** Live partial-stream delta → AgentEvent(s). */
function mapStreamEvent(event: StreamEvent, state: ClaudeSdkMapState): AgentEvent[] {
switch (event.type) {
case 'content_block_start': {
const block = event.content_block;
if (block.type === 'tool_use') {
const snap: AcpToolSnapshot = {
toolCallId: block.id,
title: block.name,
kind: null,
status: 'in_progress',
rawInput: block.input ?? undefined,
rawOutput: undefined,
};
state.snapshots.set(block.id, snap);
state.toolByIndex.set(event.index, { id: block.id, name: block.name, jsonBuf: '' });
return [{ type: 'tool_call', toolCall: snap }];
}
return [];
}
case 'content_block_delta': {
const delta = event.delta;
if (delta.type === 'text_delta') {
return delta.text ? [{ type: 'text', text: delta.text }] : [];
}
if (delta.type === 'thinking_delta') {
return delta.thinking ? [{ type: 'reasoning', text: delta.thinking }] : [];
}
if (delta.type === 'input_json_delta') {
// Buffer the tool's streamed args under its block index; no event yet —
// the assembled input rides the terminal tool_update (or the final block).
const t = state.toolByIndex.get(event.index);
if (t) t.jsonBuf += delta.partial_json ?? '';
return [];
}
// signature_delta / citations_delta / compaction_delta — nothing to render.
return [];
}
case 'content_block_stop': {
// Close out a streamed tool block: parse its buffered JSON args and emit a
// tool_update carrying the assembled input. The final assistant message will
// restate the same block, but its snapshot is dedup-merged (same id) so this
// is harmless — we emit here so a tool's input renders even if the assistant
// message is delayed/dropped.
const t = state.toolByIndex.get(event.index);
if (!t) return [];
state.toolByIndex.delete(event.index);
const prev = state.snapshots.get(t.id);
const snap: AcpToolSnapshot = {
toolCallId: t.id,
title: prev?.title ?? t.name,
kind: null,
status: 'in_progress',
rawInput: parseJsonOr(t.jsonBuf, prev?.rawInput),
rawOutput: undefined,
};
state.snapshots.set(t.id, snap);
return [{ type: 'tool_update', toolCall: snap }];
}
default:
// message_start / message_delta / message_stop — turn framing, no content.
return [];
}
}
/**
* Final assistant message content blocks. Text/thinking are post-hoc repeats of
* the partial stream → dropped (dedup). Only tool_use blocks emit a terminal
* tool_update carrying the complete `input`.
*/
function mapFinalAssistant(content: ContentBlock[], state: ClaudeSdkMapState): AgentEvent[] {
const out: AgentEvent[] = [];
for (const block of content) {
if (block.type === 'tool_use') {
const prev = state.snapshots.get(block.id);
const snap: AcpToolSnapshot = {
toolCallId: block.id,
title: prev?.title ?? block.name,
kind: null,
status: 'completed',
rawInput: block.input ?? prev?.rawInput,
rawOutput: undefined,
};
state.snapshots.set(block.id, snap);
out.push({ type: 'tool_update', toolCall: snap });
}
// text / thinking / redacted_thinking blocks: already streamed via partials.
}
return out;
}
/**
* User-message tool_result blocks → terminal tool_update events. The SDK runs
* each tool and feeds the output back in a `user` message; we mark the matching
* snapshot completed (or failed, on is_error) WITH its output so the snapshot
* persists/renders as resolved instead of spinning. Unknown ids (no prior
* snapshot) are still surfaced so a stray result isn't silently lost.
*/
function mapUserToolResults(content: UserContent, state: ClaudeSdkMapState): AgentEvent[] {
if (!Array.isArray(content)) return [];
const out: AgentEvent[] = [];
for (const raw of content) {
const block = raw as { type?: string; tool_use_id?: string; content?: unknown; is_error?: boolean };
if (block.type !== 'tool_result' || !block.tool_use_id) continue;
const prev = state.snapshots.get(block.tool_use_id);
const snap: AcpToolSnapshot = {
toolCallId: block.tool_use_id,
title: prev?.title ?? block.tool_use_id,
kind: prev?.kind ?? null,
status: block.is_error ? 'failed' : 'completed',
rawInput: prev?.rawInput,
rawOutput: toolResultText(block.content),
};
state.snapshots.set(block.tool_use_id, snap);
out.push({ type: 'tool_update', toolCall: snap });
}
return out;
}
/** tool_result content is a string OR an array of content blocks (text/image).
* Flatten text blocks; fall back to the raw value so nothing is lost. */
function toolResultText(content: unknown): unknown {
if (typeof content === 'string') return content;
if (Array.isArray(content)) {
const text = content
.map((c) =>
c && typeof c === 'object' && (c as { type?: string }).type === 'text'
? String((c as { text?: unknown }).text ?? '')
: '',
)
.filter(Boolean)
.join('\n');
return text || content;
}
return content ?? '';
}
/** Parse a buffered JSON string; fall back to a prior value on empty/invalid. */
function parseJsonOr(buf: string, fallback: unknown): unknown {
const s = buf.trim();
if (!s) return fallback;
try {
return JSON.parse(s);
} catch {
return fallback;
}
}

View File

@@ -0,0 +1,38 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — claude-SDK-vs-PTY routing predicate.
*
* Sibling to `shouldUseWarmBackend` (warm-acp-routing.ts). The warm Claude-SDK
* backend keys its persistent `query()` on (chat_id, agent) — exactly like the
* warm-ACP / opencode-server backends — so a task only routes to it when it carries
* BOTH a `session_id` and a `chat_id` (a real chat tab).
*
* CRUCIALLY this is ALSO gated behind the `CLAUDE_SDK_BACKEND` env flag (default
* OFF). While off — the production default — claude always falls through to the
* existing one-shot PTY `runExternalAgent` path, UNCHANGED. The live SDK streaming
* pump + cross-turn resume need a host smoke against the real `claude` binary, so
* we keep the working PTY path as the default until that lands. Flip the env var
* on a host (any truthy value) to opt a deployment into the SDK backend.
*
* Pure (env read injected) so it's unit-testable; the dispatcher consumes it.
*/
/** True iff the `CLAUDE_SDK_BACKEND` env flag is set to a truthy value. */
export function claudeSdkBackendEnabled(env: NodeJS.ProcessEnv = process.env): boolean {
const v = env.CLAUDE_SDK_BACKEND;
if (v == null) return false;
const s = v.trim().toLowerCase();
return s !== '' && s !== '0' && s !== 'false' && s !== 'off' && s !== 'no';
}
export function shouldUseClaudeSdk(
task: {
agent: string | null;
session_id: string | null;
chat_id: string | null;
},
env: NodeJS.ProcessEnv = process.env,
): boolean {
if (!claudeSdkBackendEnabled(env)) return false;
if (task.agent !== 'claude') return false;
return task.session_id != null && task.chat_id != null;
}

View File

@@ -0,0 +1,425 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — ClaudeSdkBackend.
*
* A warm, resumable backend for the `claude` agent built on the Claude Agent SDK
* (`@anthropic-ai/claude-agent-sdk`), implementing the Phase-0 `AgentBackend`
* contract (same shape as `WarmAcpBackend` / `OpenCodeServerBackend`). One
* persistent `query()` per (chat, agent) session, driven in STREAMING-INPUT mode:
* the `prompt` is a pushable `AsyncIterable<SDKUserMessage>` that stays open across
* turns, so the SDK subprocess + conversation stay warm between `prompt()` calls
* until `closeSession`/`dispose`.
*
* ⚠ LIVE PUMP IS HOST-ONLY. The actual streaming turn needs the real `claude`
* binary + ANTHROPIC auth on a host — it CANNOT run in the dev container. This file
* is written against the REAL SDK types so it TYPECHECKS, and the PURE pieces (the
* `mapSdkMessage` mapper + the `createPushable` queue) are unit-tested. Routing to
* this backend is gated behind `CLAUDE_SDK_BACKEND` (default OFF) so production
* claude stays on the working PTY path until a host smoke validates the pump +
* cross-turn resume.
*
* Lifecycle (mirrors warm-acp.ts / opencode-server.ts):
* - `ensureSession`: resolve the resume id from `agent_sessions(chat_id,'claude')`
* and (re)build the single `query()` if not already live. The SDK's own
* `sessionStore` (Part 1 PostgresSessionStore) materializes the transcript on
* resume; `options.resume` carries the provider session id.
* - `prompt`: push ONE user message onto the open queue, iterate the generator,
* map each `SDKMessage` → `AgentEvent`s via `mapSdkMessage`, forward to
* `ctx.onEvent`, and resolve when the turn's `result` message lands. Capture the
* `session_id` from the `init` message and persist it to `agent_sessions`;
* accumulate `result.usage` / `total_cost_usd` onto the row (mirrors opencode U.6).
* - `closeSession` / `dispose`: close the queue + dispose the query generator.
* - A thrown error or `result.subtype==='error*'` marks `agent_sessions.status='crashed'`.
*
* Turn serialization: like warm-acp, exactly one turn is in flight at a time on a
* given backend (the dispatcher's per-session `inflight` map enforces this upstream;
* `isBusy()` reports it so the pool never evicts mid-turn).
*/
import { query, type Query, type SDKMessage, type SDKUserMessage, type Options } from '@anthropic-ai/claude-agent-sdk';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../../db.js';
import { PostgresSessionStore } from './claude-session-store.js';
import { createPushable, type Pushable } from './pushable-iterable.js';
import { mapSdkMessage, createClaudeSdkMapState, type ClaudeSdkMapState } from './claude-sdk-map.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
export interface ClaudeSdkBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** The (chat, agent) this backend serves — its pool identity + DB key. */
chatId: string;
/** Always 'claude' today; kept explicit so the pool key + DB writes stay honest. */
agent: string;
/** Resolved `claude` binary path (available_agents.install_path); null → SDK default. */
installPath: string | null;
}
export class ClaudeSdkBackend implements AgentBackend {
readonly backend = 'claude_sdk' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly chatId: string;
private readonly agent: string;
private readonly installPath: string | null;
private readonly sessionStore: PostgresSessionStore;
/** The single persistent query() generator; null until the first turn builds it. */
private query: Query | null = null;
/** The open input queue feeding the generator one SDKUserMessage per turn. */
private input: Pushable<SDKUserMessage> | null = null;
/** The provider's own session id (resume token), captured from the init message. */
private agentSessionId: string | null = null;
/** Resolved model the live query() was built with; a change forces a rebuild. */
private builtModel: string | null = null;
/** True between prompt() start and settle. */
private busy = false;
private up = false;
constructor(deps: ClaudeSdkBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.chatId = deps.chatId;
this.agent = deps.agent;
this.installPath = deps.installPath;
this.sessionStore = new PostgresSessionStore(deps.sql);
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
/** Phase 3: busy iff a turn is in flight (pool never evicts a busy backend). */
isBusy(): boolean {
return this.busy;
}
// ─── ensureSession: resolve resume id + (re)build the warm query ──────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
// Resolve the resume token from the (chat_id, agent) row. A crashed row is not
// resumed (the SDK would fail to load a dead session); we create fresh.
const [row] = await this.sql<{ agent_session_id: string | null; status: string }[]>`
SELECT agent_session_id, status FROM agent_sessions
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
`;
const resumeId = row && row.status !== 'crashed' ? row.agent_session_id : null;
// (Re)build the warm query if there is none, or the model changed (the SDK can
// change model mid-session via setModel, but a fresh build is simplest + matches
// opencode's config-drift → fresh-session rule). The query stays alive across
// turns; only closeSession/dispose tears it down.
if (!this.query || this.builtModel !== opts.model) {
await this.teardownQuery();
this.buildQuery(opts.worktreePath, opts.model, resumeId);
}
// Seed the in-memory resume id from the DB so a handle built before the first
// turn's init message still carries the last-known token. The init message
// overwrites it with the authoritative current id during the turn.
if (this.agentSessionId == null) this.agentSessionId = resumeId;
// Upsert the agent_sessions row (backend='claude_sdk'). agent_session_id may be
// null until the first turn captures it from the init message; prompt() updates it.
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'claude_sdk', ${this.agentSessionId}, NULL, 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'claude_sdk',
agent_session_id = COALESCE(EXCLUDED.agent_session_id, agent_sessions.agent_session_id),
server_port = NULL,
status = 'active',
last_active_at = clock_timestamp()
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: opts.chatId, agent: opts.agent }, 'claude-sdk: agent_sessions upsert failed (non-fatal)');
});
return {
sessionId,
agent: opts.agent,
backend: 'claude_sdk',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: this.agentSessionId,
serverPort: null,
};
}
/** Build the persistent query() in streaming-input mode. Lazy — no subprocess
* work happens until the generator is first iterated in prompt(). */
private buildQuery(worktreePath: string, model: string, resumeId: string | null): void {
const input = createPushable<SDKUserMessage>();
const options: Options = {
sessionStore: this.sessionStore,
cwd: worktreePath,
// Stream partial assistant messages so text/thinking/tool deltas arrive live
// (the mapper reads them; without this only terminal messages land).
includePartialMessages: true,
// BooCode default: enable the documented 1M-context-window beta. Active on
// models that support it (the SDK lists Sonnet 4/4.5); a non-supporting model
// simply doesn't get the larger window. The TRUE window is read back from
// `result.modelUsage[*].contextWindow` and shown in the ContextBar, so whatever
// window a model actually gets is surfaced truthfully (no guessing).
betas: ['context-1m-2025-08-07'],
...(model ? { model } : {}),
...(resumeId ? { resume: resumeId } : {}),
...(this.installPath ? { pathToClaudeCodeExecutable: this.installPath } : {}),
// ANTHROPIC auth/env must reach the child; inherit the process env (host concern).
env: process.env as Record<string, string>,
};
this.input = input;
this.query = query({ prompt: input.iterable, options });
this.builtModel = model;
this.up = true;
this.log.info({ chatId: this.chatId, agent: this.agent, model, resume: resumeId ?? null }, 'claude-sdk: warm query built');
}
// ─── prompt: push one user message + drain the generator until result ─────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
if (!this.query || !this.input) {
// ensureSession should have built it; rebuild defensively (e.g. evicted/raced).
this.buildQuery(ctx.worktreePath, ctx.model, handle.agentSessionId);
}
const gen = this.query!;
const queue = this.input!;
if (ctx.signal.aborted) return { ok: false, error: 'aborted' };
this.busy = true;
const state: ClaudeSdkMapState = createClaudeSdkMapState();
// Peak per-request input (incl. cache) across the turn ≈ the conversation context
// held in the window. result.usage SUMS input over the turn's internal requests
// (overcounts for multi-tool turns), so the per-request peak is the accurate
// "context used" for the ContextBar (paseo's approach).
let maxInputTokens = 0;
// Per-turn abort: interrupt the in-flight query on the SAME generator (never
// tear down the warm query — that's the pool's lifetime). The generator then
// emits its terminal result and the drain loop exits.
let aborted = false;
const onAbort = () => {
if (aborted) return;
aborted = true;
void gen.interrupt().catch(() => {});
};
ctx.signal.addEventListener('abort', onAbort, { once: true });
// Push the turn's user message onto the open queue. session_id is optional on
// the wire; the SDK manages it via resume + the init message.
const userMsg: SDKUserMessage = {
type: 'user',
message: { role: 'user', content: input },
parent_tool_use_id: null,
...(handle.agentSessionId ? { session_id: handle.agentSessionId } : {}),
};
queue.push(userMsg);
try {
// Manual iteration — NOT `for await (… of gen)`. Returning out of a for-await
// loop calls gen.return(), which CLOSES the async generator; that killed the
// warm streaming-input query after a single turn, so every FOLLOW-UP message
// hit a dead generator and failed. gen.next() leaves the generator suspended
// (alive) for the next pushed user message — the warm query is only closed
// deliberately in teardownQuery()/dispose().
while (true) {
const next = await gen.next();
if (next.done) {
// Generator ended (e.g. disposed) without a result — non-fatal incomplete.
if (aborted) return { ok: false, error: 'aborted' };
return { ok: false, error: 'claude-sdk: query ended before result' };
}
const msg = next.value;
// Track the peak per-request input from message_start usage (delivered by
// includePartialMessages) — the largest single request's input is the real
// context fill, unlike the summed result.usage.
if (msg.type === 'stream_event') {
const sev = msg.event as { type?: string; message?: { usage?: Record<string, unknown> } };
if (sev?.type === 'message_start' && sev.message?.usage) {
const ru = sev.message.usage;
const reqInput =
num(ru.input_tokens) + num(ru.cache_read_input_tokens) + num(ru.cache_creation_input_tokens);
if (reqInput > maxInputTokens) maxInputTokens = reqInput;
}
}
// Capture the provider session id from the init message (authoritative).
if (msg.type === 'system' && msg.subtype === 'init' && msg.session_id) {
if (this.agentSessionId !== msg.session_id) {
this.agentSessionId = msg.session_id;
await this.persistAgentSessionId(msg.session_id);
}
}
// The result message ends THIS turn (it does not close the generator —
// streaming-input keeps it alive for the next pushed message).
if (msg.type === 'result') {
await this.accumulateUsage(msg);
const ok = msg.subtype === 'success' && !aborted;
if (!ok) {
// error_during_execution / error_max_turns / aborted → crashed row.
await this.markCrashed();
} else {
await this.markIdle();
}
if (aborted) return { ok: false, error: 'aborted' };
if (!ok) return { ok: false, error: resultErrorMessage(msg) };
// Context-window telemetry for the ContextBar (paseo's method):
// ctxMax = the model's OWN reported window (1M-aware — reflects the active
// window, so the bar shows the truth per model);
// ctxUsed = peak request input (history in the window) + this turn's output.
const ctxMax = extractMaxContextWindow((msg as { modelUsage?: unknown }).modelUsage);
const fallbackInput =
num(msg.usage?.input_tokens) +
num(msg.usage?.cache_read_input_tokens) +
num(msg.usage?.cache_creation_input_tokens);
const ctxUsed = (maxInputTokens || fallbackInput) + num(msg.usage?.output_tokens);
return {
ok: true,
...(ctxMax > 0 ? { ctxMax } : {}),
...(ctxUsed > 0 ? { ctxUsed } : {}),
};
}
// Map renderable content → AgentEvents for the dispatcher's onEvent.
for (const ev of mapSdkMessage(msg, state)) {
ctx.onEvent(ev);
}
}
} catch (err) {
if (aborted) return { ok: false, error: 'aborted' };
await this.markCrashed();
return { ok: false, error: errMsg(err) };
} finally {
ctx.signal.removeEventListener('abort', onAbort);
this.busy = false;
}
}
// ─── persistence helpers ──────────────────────────────────────────────────────
private async persistAgentSessionId(id: string): Promise<void> {
await this.sql`
UPDATE agent_sessions
SET agent_session_id = ${id}, last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: this.chatId }, 'claude-sdk: failed to persist agent_session_id (non-fatal)');
});
}
/**
* Accumulate the turn's usage/cost onto the (chat_id, agent) row — mirrors the
* opencode U.6 running-total pattern. The SDK reports usage once per turn on the
* result message (not per step), so this fires once per prompt(). Cache read/write
* input tokens fold into `input_tokens`; usage telemetry never fails a turn.
*/
private async accumulateUsage(result: Extract<SDKMessage, { type: 'result' }>): Promise<void> {
const u = result.usage;
const input = num(u?.input_tokens) + num(u?.cache_read_input_tokens) + num(u?.cache_creation_input_tokens);
const output = num(u?.output_tokens);
const cost = numF(result.total_cost_usd);
if (input === 0 && output === 0 && cost === 0) return;
await this.sql`
UPDATE agent_sessions SET
input_tokens = input_tokens + ${input},
output_tokens = output_tokens + ${output},
cost = cost + ${cost}
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: this.chatId }, 'claude-sdk: failed to persist usage (non-fatal)');
});
}
private async markIdle(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'idle', last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
private async markCrashed(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
// ─── teardown ────────────────────────────────────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
await this.teardownQuery();
await this.sql`
UPDATE agent_sessions SET status = 'closed'
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => {});
}
async dispose(): Promise<void> {
await this.teardownQuery();
}
/** Close the input queue + dispose the generator. Idempotent. */
private async teardownQuery(): Promise<void> {
this.up = false;
this.busy = false;
const q = this.query;
const queue = this.input;
this.query = null;
this.input = null;
this.builtModel = null;
queue?.close();
if (q) {
// return() ends the AsyncGenerator and lets the SDK clean up its subprocess.
await q.return(undefined).catch(() => {});
}
}
}
// ─── helpers ──────────────────────────────────────────────────────────────────
/** Coerce to a non-negative finite integer (tokens). */
function num(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? Math.round(x) : 0;
}
/** Coerce to a non-negative finite float (cost USD). */
function numF(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? x : 0;
}
/** Largest context-window the SDK reports across `result.modelUsage` (a
* `Record<model, ModelUsage>`, each with a `contextWindow`). This is the model's
* OWN window — 1M when the 1M model/beta is active, 200K otherwise — so the
* ContextBar shows the true window without us mapping model→size ourselves. */
function extractMaxContextWindow(modelUsage: unknown): number {
if (!modelUsage || typeof modelUsage !== 'object') return 0;
let max = 0;
for (const v of Object.values(modelUsage as Record<string, unknown>)) {
if (v && typeof v === 'object') {
const cw = (v as { contextWindow?: unknown }).contextWindow;
if (typeof cw === 'number' && Number.isFinite(cw) && cw > max) max = cw;
}
}
return max;
}
/** Build a human-readable error from an SDK error-result message. */
function resultErrorMessage(result: Extract<SDKMessage, { type: 'result' }>): string {
if (result.subtype === 'success') return 'ok';
const errs = (result as { errors?: string[] }).errors;
if (Array.isArray(errs) && errs.length > 0) return `${result.subtype}: ${errs.join('; ')}`;
return result.subtype;
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}

View File

@@ -0,0 +1,117 @@
import type { SessionStore, SessionKey, SessionStoreEntry } from '@anthropic-ai/claude-agent-sdk';
import type { Sql } from '../../db.js';
/**
* claude-sdk-sessionstore #9 (Part 1) — clean-room PostgresSessionStore.
*
* A Postgres-backed implementation of the Claude Agent SDK's `SessionStore`
* adapter type. The SDK mirrors each transcript line (a JSON-safe POJO with a
* `type` discriminant) to this store via `append`; on resume it calls `load`
* to materialize the full transcript back. We treat entries as opaque blobs and
* preserve append order via a BIGSERIAL `id` — `load` replays `ORDER BY id`.
*
* Storage shape: one row per entry in `claude_session_entries`, keyed by the
* SDK's `SessionKey` (project_key, session_id, subpath). The SDK uses an
* *undefined* subpath for the main transcript and disallows the empty string;
* we collapse `undefined → ''` so the main transcript and subagent files share
* one table, distinguished by the `subpath` column (`'' = main`).
*
* Clean-room: written against the SDK's published `SessionStore` type contract
* and BooCode's existing SQL conventions (porsager tagged templates, `sql.json`
* for JSONB). No SDK example/reference code was consulted.
*/
export class PostgresSessionStore implements SessionStore {
constructor(private readonly sql: Sql) {}
/**
* Mirror a batch of transcript entries. No-op on an empty batch; otherwise a
* single multi-row INSERT writes them in array order. Because `id` is a
* monotonically-increasing BIGSERIAL, the insert order is the replay order
* `load` reconstructs — entries within one call land in the order given.
*/
async append(key: SessionKey, entries: SessionStoreEntry[]): Promise<void> {
if (entries.length === 0) return;
const subpath = key.subpath ?? '';
const rows = entries.map((entry) => ({
project_key: key.projectKey,
session_id: key.sessionId,
subpath,
entry: this.sql.json(entry as never),
}));
await this.sql`
INSERT INTO claude_session_entries ${this.sql(rows, 'project_key', 'session_id', 'subpath', 'entry')}
`;
}
/**
* Load a full transcript for resume. Returns the entries in append order, or
* `null` for a (project_key, session_id, subpath) key that was never written.
*/
async load(key: SessionKey): Promise<SessionStoreEntry[] | null> {
const subpath = key.subpath ?? '';
const rows = await this.sql<{ entry: SessionStoreEntry }[]>`
SELECT entry
FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath = ${subpath}
ORDER BY id
`;
if (rows.length === 0) return null;
return rows.map((r) => r.entry);
}
/**
* List the main transcripts for a project. `mtime` is the storage write time
* (latest `created_at` for the session) in Unix epoch milliseconds; the SDK
* sorts the result by mtime descending.
*/
async listSessions(projectKey: string): Promise<Array<{ sessionId: string; mtime: number }>> {
const rows = await this.sql<{ session_id: string; mtime: string }[]>`
SELECT session_id, extract(epoch FROM max(created_at)) * 1000 AS mtime
FROM claude_session_entries
WHERE project_key = ${projectKey}
AND subpath = ''
GROUP BY session_id
`;
return rows.map((r) => ({ sessionId: r.session_id, mtime: Number(r.mtime) }));
}
/**
* Delete a session. With a `subpath` set, only that subpath's rows are
* removed; with `subpath` omitted, every row for the session is removed
* (all subpaths, including the main transcript).
*/
async delete(key: SessionKey): Promise<void> {
if (key.subpath !== undefined) {
await this.sql`
DELETE FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath = ${key.subpath}
`;
return;
}
await this.sql`
DELETE FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
`;
}
/**
* List the distinct non-main subpaths under a session (e.g. subagent files).
* Used during resume to discover and materialize subagent transcripts; the
* main transcript (`subpath = ''`) is excluded.
*/
async listSubkeys(key: { projectKey: string; sessionId: string }): Promise<string[]> {
const rows = await this.sql<{ subpath: string }[]>`
SELECT DISTINCT subpath
FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath <> ''
`;
return rows.map((r) => r.subpath);
}
}

View File

@@ -0,0 +1,197 @@
/**
* v2.6 Phase 3 — pure lifecycle decision helpers.
*
* The eviction / LRU-cap / busy-aware-restart / reaper-target logic, factored out
* of AgentPool + the backends + the periodic sweeper so it's unit-testable with no
* DB, no child processes, no timers (modeled on
* apps/server/src/services/inference/prune.ts:selectPruneTargets — a pure decision
* core the caller acts on).
*
* Three decisions live here:
* 1. selectIdleEvictionTargets — which warm backends to evict for being idle.
* 2. selectLruEvictionTargets — which warm backends to evict to honour a max-live
* cap (least-recently-used beyond the cap), NEVER a busy one.
* 3. shouldRestartCrashedBackend (busy-aware) — openchamber's skip-while-busy +
* stale-grace state machine, re-implemented for BooCode's per-(chat,agent) pool.
*
* "Busy" = the backend has an in-flight turn. The hard rule (design §6, decisions):
* never evict or force-restart a busy backend; defer with a stale-grace.
*/
// ─── Idle TTL eviction (3.1) ─────────────────────────────────────────────────
/** Default idle TTL before a warm backend/session is evicted (design §6 ~30 min). */
export const DEFAULT_IDLE_TTL_MS = 30 * 60 * 1000;
/** A pool entry as the decision helpers see it (no backend internals). */
export interface PoolEntrySnapshot {
/** Pool key `${primary}:${agent}` — opaque to the decision, used for selection. */
key: string;
/** Epoch ms of the last turn activity (start or settle) on this backend. */
lastActiveAt: number;
/** True iff a turn is in flight right now. Busy entries are never evicted. */
busy: boolean;
}
/**
* Idle eviction: an entry is evictable when it has been idle (no turn) for longer
* than `ttlMs` AND is not currently busy. Returns the keys to evict.
*
* Pure: `now` is injected so tests don't depend on wall-clock. Busy entries are
* categorically excluded — a long-running turn that exceeds the TTL must NOT be
* torn down mid-stream (the §6 / openchamber busy rule).
*/
export function selectIdleEvictionTargets(
entries: ReadonlyArray<PoolEntrySnapshot>,
now: number,
ttlMs: number = DEFAULT_IDLE_TTL_MS,
): string[] {
const out: string[] = [];
for (const e of entries) {
if (e.busy) continue;
if (now - e.lastActiveAt >= ttlMs) out.push(e.key);
}
return out;
}
// ─── LRU cap (3.4) ───────────────────────────────────────────────────────────
/** Default max live warm backends/worktrees before the LRU cap evicts (env-overridable). */
export const DEFAULT_MAX_LIVE_BACKENDS = 10;
/**
* LRU cap: when more than `cap` non-busy entries are live, evict the
* least-recently-used ones (oldest `lastActiveAt` first) until at most `cap`
* remain. Busy entries are never evicted AND are not counted toward the cap's
* "kept" budget being freed — i.e. we only ever evict idle entries, so a burst of
* concurrent busy turns can transiently exceed the cap rather than kill live work.
*
* Returns the keys to evict, least-recently-used first. Pure / deterministic:
* ties broken by key for stable test output.
*/
export function selectLruEvictionTargets(
entries: ReadonlyArray<PoolEntrySnapshot>,
cap: number = DEFAULT_MAX_LIVE_BACKENDS,
): string[] {
if (cap < 0) cap = 0;
if (entries.length <= cap) return [];
// Only idle entries are eligible to be evicted.
const evictable = entries
.filter((e) => !e.busy)
.sort((a, b) => a.lastActiveAt - b.lastActiveAt || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
// We must shrink total live count down to `cap`. Busy entries can't be evicted,
// so the number we CAN remove is bounded by the evictable pool; evict the oldest
// (total - cap) of them, never more than exist.
const overBy = entries.length - cap;
const toEvict = evictable.slice(0, Math.max(0, overBy));
return toEvict.map((e) => e.key);
}
// ─── Busy-aware crash restart (3.2) — openchamber lift ───────────────────────
/**
* Default grace after which a backend that has stayed unhealthy WHILE busy is
* force-restarted anyway (openchamber's STALE_BUSY_GRACE_MS = 2 min). Guards
* against a permanently-stuck "busy" turn wedging recovery forever.
*/
export const DEFAULT_STALE_BUSY_GRACE_MS = 2 * 60 * 1000;
/** Default consecutive health-check failures before a restart is attempted. */
export const DEFAULT_HEALTH_FAILURE_THRESHOLD = 3;
export interface RestartDecisionInput {
/** True iff the process is actually dead (exited). A dead process restarts
* immediately regardless of busy/threshold — there's nothing to protect. */
processExited: boolean;
/** Consecutive failed health probes so far (including the current one). */
consecutiveFailures: number;
/** Whether the backend currently has an in-flight turn. */
busy: boolean;
/** Epoch ms when the unhealthy-while-busy window started, or 0 if not in one. */
unhealthyBusySince: number;
/** Injected clock. */
now: number;
failureThreshold?: number;
staleBusyGraceMs?: number;
}
export type RestartDecision =
| { action: 'restart'; reason: 'process-exited' | 'threshold' | 'stale-busy-grace' }
| { action: 'wait'; reason: 'below-threshold' | 'busy-grace' }
| { action: 'none'; reason: 'healthy' };
/**
* Decide whether to restart a backend after a health probe. Mirrors
* openchamber's `runHealthCheckCycle` + `shouldSkipRestartForBusySessions`,
* re-implemented as a pure function over injected state (the caller owns the
* mutable counters + the actual restart side-effect).
*
* Order (matches openchamber):
* - process exited → restart now (nothing live to protect).
* - below failure threshold → wait (transient blip; the next probe re-checks).
* - threshold reached + idle → restart now.
* - threshold reached + busy → skip UNLESS the unhealthy-busy window exceeded
* the stale grace, then force restart.
*
* `healthy: true` callers don't reach here; included for completeness so the
* caller can pass through and reset counters on a single code path.
*/
export function decideRestart(input: RestartDecisionInput & { healthy?: boolean }): RestartDecision {
if (input.healthy) return { action: 'none', reason: 'healthy' };
if (input.processExited) return { action: 'restart', reason: 'process-exited' };
const threshold = input.failureThreshold ?? DEFAULT_HEALTH_FAILURE_THRESHOLD;
if (input.consecutiveFailures < threshold) {
return { action: 'wait', reason: 'below-threshold' };
}
if (!input.busy) {
return { action: 'restart', reason: 'threshold' };
}
// Busy + unhealthy at/over threshold: defer, but not forever.
const grace = input.staleBusyGraceMs ?? DEFAULT_STALE_BUSY_GRACE_MS;
if (input.unhealthyBusySince > 0 && input.now - input.unhealthyBusySince >= grace) {
return { action: 'restart', reason: 'stale-busy-grace' };
}
return { action: 'wait', reason: 'busy-grace' };
}
// ─── Orphan worktree reaper target selection (3.4) ───────────────────────────
/** Default TTL: an on-disk worktree dir with no live `worktrees` row is reaped
* only after it's been orphaned at least this long (mtime-based grace so a
* just-created dir mid-`ensureSessionWorktree` race is never swept). */
export const DEFAULT_ORPHAN_WORKTREE_GRACE_MS = 60 * 60 * 1000; // 1h
export interface OnDiskWorktree {
/** Absolute path of the worktree dir on disk. */
path: string;
/** Last-modified epoch ms of the dir (newest of dir + contents, caller's choice). */
mtimeMs: number;
}
/**
* Reaper target selection: which on-disk worktree dirs are orphans safe to
* inspect-and-reap. An orphan is a dir under the worktree base that has NO live
* `worktrees` row (path not in `liveWorktreePaths`) AND whose mtime is older than
* the grace window (so an in-flight create isn't swept).
*
* Pure — the caller (the sweeper) then runs the at-risk preflight (dirty/unpushed)
* on each returned path and only physically removes the SAFE ones. This helper
* never decides to remove work-at-risk; it only narrows the candidate set.
*/
export function selectOrphanWorktreeTargets(
onDisk: ReadonlyArray<OnDiskWorktree>,
liveWorktreePaths: ReadonlySet<string>,
now: number,
graceMs: number = DEFAULT_ORPHAN_WORKTREE_GRACE_MS,
): string[] {
const out: string[] = [];
for (const w of onDisk) {
if (liveWorktreePaths.has(w.path)) continue; // tracked → not an orphan
if (now - w.mtimeMs < graceMs) continue; // too fresh → could be mid-create
out.push(w.path);
}
return out;
}

View File

@@ -0,0 +1,203 @@
/**
* Pure opencode `Event` → normalized `AgentEvent` translation.
*
* Extracted (v2.7 audit reshape) from `OpenCodeServerBackend.dispatchEvent` /
* `handleUpdatedPart` and the file-local helpers. NO I/O, no timers, no DB, no
* `byOpencodeId` — every function here is a deterministic transform over its
* arguments (the dedup state is caller-owned and mutated in place, mirroring the
* `acp-event-map.ts` `priorSnapshots` pattern). This is the unit-testable core; the
* backend keeps the routing + side effects (watchdog, usage persistence, settle).
*
* Depends only on SDK TYPES + AcpToolSnapshot — safe to import anywhere.
*/
import type { Event, Part, ToolPart, ToolState } from '@opencode-ai/sdk/v2/client';
import type { ToolCallStatus } from '@agentclientprotocol/sdk';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
import type { AgentEvent } from '../agent-backend.js';
/** Per-(opencode session) dedup state the part-stream classifiers read + mutate. */
export interface DedupState {
/** dedup gate: `${type}:${id}` added on delta, deleted-and-tested on updated. */
streamedPartKeys: Set<string>;
/** partID → 'text' | 'reasoning', so a delta with a non-'reasoning' field is still classed right. */
partTypeById: Map<string, string>;
}
/** Strip opencode-dcp plugin tags that render as literal text in the UI. */
export function stripDcpTags(s: string): string {
return s.replace(/<dcp-message-id>[^<]*<\/dcp-message-id>/g, '');
}
/** Extract the opencode sessionID an event belongs to, across event shapes.
* Most carry `properties.sessionID`; `message.part.updated` nests it under
* `properties.part.sessionID`. Returns null when the event has no session
* (the per-session loop then leaves it to dispatchEvent, which drops it). */
export function eventSessionId(ev: Event): string | null {
const props = (ev as { properties?: unknown }).properties;
if (!props || typeof props !== 'object') return null;
if (ev.type === 'message.part.updated') {
const part = (props as { part?: { sessionID?: string } }).part;
return part?.sessionID ?? null;
}
return (props as { sessionID?: string }).sessionID ?? null;
}
/** Ported verbatim from Paseo opencode-agent.ts: id → message-id fallback → null. */
export function resolvePartDedupeKey(part: { id: string; messageID: string }, type: string): string | null {
if (part.id.trim().length > 0) return `${type}:${part.id}`;
if (part.messageID.trim().length > 0) return `${type}:message:${part.messageID}`;
return null;
}
export function mapToolStatus(s: ToolState['status'] | undefined): ToolCallStatus | null {
switch (s) {
case 'pending':
return 'pending';
case 'running':
return 'in_progress';
case 'completed':
return 'completed';
case 'error':
return 'failed';
default:
return null;
}
}
/** opencode ToolPart → ACP-shaped snapshot (reuses the existing persist/render path). */
export function toolPartToSnapshot(part: ToolPart): AcpToolSnapshot {
const state = part.state;
let rawInput: unknown;
let rawOutput: unknown;
let title: string | undefined;
if (state) {
if ('input' in state) rawInput = (state as { input?: unknown }).input;
if ('output' in state) rawOutput = (state as { output?: unknown }).output;
else if ('error' in state) rawOutput = (state as { error?: unknown }).error;
if ('title' in state) title = (state as { title?: string }).title;
}
return {
toolCallId: part.callID,
title: title ?? part.tool,
kind: null,
status: mapToolStatus(state?.status),
rawInput,
rawOutput,
};
}
// ─── session.next.tool.* snapshot builders ───────────────────────────────────
/** `session.next.tool.called` → an in-progress tool_call snapshot. */
export function toolCalledSnapshot(p: { callID: string; tool: string; input: unknown }): AcpToolSnapshot {
return {
toolCallId: p.callID,
title: p.tool,
kind: null,
status: 'in_progress',
rawInput: p.input,
rawOutput: undefined,
};
}
/** `session.next.tool.success` → a completed tool snapshot (text content joined). */
export function toolSuccessSnapshot(p: { callID: string; content?: ReadonlyArray<unknown> | null }): AcpToolSnapshot {
const output = p.content?.map((c) => (c && typeof c === 'object' && 'text' in c ? (c as { text: string }).text : '')).join('') ?? '';
return {
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'completed',
rawInput: undefined,
rawOutput: output,
};
}
/** `session.next.tool.failed` → a failed tool snapshot (error stringified). */
export function toolFailedSnapshot(p: { callID: string; error: unknown }): AcpToolSnapshot {
return {
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'failed',
rawInput: undefined,
rawOutput: errToString(p.error),
};
}
// ─── message.part.* dedup gate ────────────────────────────────────────────────
/**
* `message.part.delta`: mark the part as streamed (so a later `message.part.updated`
* for the same part is deduped) and return the AgentEvent to emit, or null when the
* field is neither reasoning nor text, or a text delta strips down to empty. Mutates
* `st.streamedPartKeys` exactly as the original inline arm did (the key is recorded
* for text even when the cleaned delta is empty).
*/
export function classifyPartDelta(
p: { partID: string; field?: string; delta: string },
st: DedupState,
): AgentEvent | null {
const isReasoning = p.field === 'reasoning' || st.partTypeById.get(p.partID) === 'reasoning';
if (isReasoning) {
st.streamedPartKeys.add(`reasoning:${p.partID}`);
return { type: 'reasoning', text: p.delta };
}
if (p.field === 'text') {
st.streamedPartKeys.add(`text:${p.partID}`);
const cleaned = stripDcpTags(p.delta);
return cleaned ? { type: 'text', text: cleaned } : null;
}
return null;
}
/**
* `message.part.updated` terminal part: the dedup gate for text/reasoning (drop a
* part already streamed via deltas; otherwise emit the finished text) plus the
* tool-part → tool_call/tool_update mapping. Returns null when nothing should be
* emitted. Mutates `st.partTypeById` / `st.streamedPartKeys` like the original.
*/
export function classifyUpdatedPart(part: Part, st: DedupState): AgentEvent | null {
if (part.type === 'text' || part.type === 'reasoning') {
st.partTypeById.set(part.id, part.type);
const key = resolvePartDedupeKey(part, part.type);
if (key && st.streamedPartKeys.delete(key)) return null; // already streamed via delta
const raw = part.text ?? '';
const text = part.type === 'text' ? stripDcpTags(raw) : raw;
if (text && part.time?.end != null) {
return { type: part.type, text };
}
return null;
}
if (part.type === 'tool') {
const snap = toolPartToSnapshot(part);
const status = part.state?.status;
// tool_call on start (pending/running), tool_update on terminal (completed/error).
// The current ACP path merges both into one frame; the contract keeps them
// distinct because opencode's SSE distinguishes start from result.
return status === 'completed' || status === 'error'
? { type: 'tool_update', toolCall: snap }
: { type: 'tool_call', toolCall: snap };
}
// NOTE: opencode's SSE payload union carries no available-commands event, so the
// AgentEvent 'commands' arm is intentionally never emitted here.
return null;
}
// ─── shared error formatters (pure) ───────────────────────────────────────────
export function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
export function errToString(e: unknown): string {
if (e == null) return 'unknown error';
if (typeof e === 'string') return e;
if (e instanceof Error) return e.message;
try {
return JSON.stringify(e);
} catch {
return String(e);
}
}

View File

@@ -0,0 +1,325 @@
/**
* OpenCodeServerSupervisor — the opencode `serve` child + HTTP client + port +
* health-counter lifecycle, extracted (v2.7 audit reshape) from the backend
* god-class. Owns spawn / ready / crash / proactive-health restart / dispose and
* exposes `client` / `port` / `health()` / `tickHealth()` to the backend.
*
* Session-level recovery (failing in-flight turns, marking agent_sessions crashed,
* tearing down SSE loops) is NOT a process concern — it's delegated back to the
* backend through the injected `hooks.onServerDown` callback, keeping this module
* free of the demux map / SQL / turn state.
*
* v2.7 concurrency hardening: `ensureServer` is guarded against the crash-window
* double-spawn (two concurrent callers each re-spawning on different ports) via a
* synchronous `startInFlight` flag — see `shouldStartServer`.
*/
import { spawn, type ChildProcess } from 'node:child_process';
import { createOpencodeClient, type OpencodeClient } from '@opencode-ai/sdk/v2/client';
import type { FastifyBaseLogger } from 'fastify';
import { decideRestart, DEFAULT_HEALTH_FAILURE_THRESHOLD } from './lifecycle-decisions.js';
import { reclaimPort, waitForPortRelease, freePort } from '../net/port-utils.js';
const READY_TIMEOUT_MS = 30_000;
/** Info handed to the backend when the server goes down (crash or forced restart). */
export interface ServerDownInfo {
code: number | null;
signal: NodeJS.Signals | null;
port: number;
}
export interface SupervisorHooks {
/** True iff ANY pooled session has an in-flight turn (defers a busy restart). */
isBusy: () => boolean;
/** Session-level recovery: fail in-flight turns, mark crashed, drop demux state. */
onServerDown: (info: ServerDownInfo) => void;
}
export interface OpenCodeServerSupervisorDeps {
/** Absolute path to the opencode binary (resolved from available_agents). */
opencodeBinary: string;
log: FastifyBaseLogger;
hooks: SupervisorHooks;
}
/**
* Pure decision for `ensureServer`: should we (re)spawn the server right now?
*
* - A live, ready server (`up && client`) → no.
* - A start already in flight (`startInFlight`) → no, NEVER double-spawn — join the
* running start instead. This is checked BEFORE `serverStarting` because the crash
* handler can null `serverStarting` mid-start (a crash during `await freePort()`),
* and without this guard the `!serverStarting` branch would spawn a second server
* on a different port while the first is still coming up.
* - No start cached/running → yes (fresh start or post-crash re-spawn, since the
* crash handler nulls `serverStarting`).
* - A cached start that already finished, but the child has since died and the crash
* handler hasn't reset us yet → yes.
*/
export function shouldStartServer(s: {
up: boolean;
hasClient: boolean;
serverStarting: boolean;
childDead: boolean;
startInFlight: boolean;
}): boolean {
if (s.up && s.hasClient) return false;
if (s.startInFlight) return false;
if (!s.serverStarting) return true;
if (!s.up && s.childDead) return true;
return false;
}
export class OpenCodeServerSupervisor {
private readonly opencodeBinary: string;
private readonly log: FastifyBaseLogger;
private readonly hooks: SupervisorHooks;
private childProc: ChildProcess | null = null;
private opencodeClient: OpencodeClient | null = null;
private serverPort: number | null = null;
private up = false;
private serverStarting: Promise<void> | null = null;
/** True from the synchronous head of startServer() until it settles — the
* double-spawn guard reads it so a concurrent ensureServer joins instead of
* kicking a second spawn. */
private startInFlight = false;
// Phase 3 busy-aware health monitor (openchamber lift): consecutive failed
// probes + the start of an unhealthy-while-busy window feed `decideRestart`.
private consecutiveHealthFailures = 0;
private unhealthyBusySince = 0;
private restarting: Promise<void> | null = null;
constructor(deps: OpenCodeServerSupervisorDeps) {
this.opencodeBinary = deps.opencodeBinary;
this.log = deps.log;
this.hooks = deps.hooks;
}
/** The live opencode HTTP client, or null between (re)starts. */
get client(): OpencodeClient | null {
return this.opencodeClient;
}
/** The current server port, or null before the first start. */
get port(): number | null {
return this.serverPort;
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
isUp(): boolean {
return this.up;
}
// ─── lifecycle (spawn once + client + ready; crash-restart) ──────────────────
/**
* Lazy: start the single server on first use; re-spawn after a crash. Idempotent
* within one live server — `serverStarting` caches the in-flight start, reset to
* null by the crash handler so the NEXT ensureServer re-spawns. A dead-but-not-
* yet-reaped child (exit handler raced) is also treated as needing a restart.
* Concurrent callers in a crash window are coalesced via `startInFlight`.
*/
ensureServer(): Promise<void> {
if (this.up && this.opencodeClient) return Promise.resolve();
const childDead =
this.childProc != null && (this.childProc.exitCode !== null || this.childProc.signalCode !== null);
if (
shouldStartServer({
up: this.up,
hasClient: this.opencodeClient != null,
serverStarting: this.serverStarting != null,
childDead,
startInFlight: this.startInFlight,
})
) {
this.serverStarting = this.startServer();
}
return this.serverStarting ?? Promise.resolve();
}
private async startServer(): Promise<void> {
// Set synchronously (before the first await) so a concurrent ensureServer sees
// the in-flight start and joins `serverStarting` instead of double-spawning.
this.startInFlight = true;
try {
const port = await freePort();
// Phase 1: run unsecured on loopback (opencode's documented default — serve.ts
// only WARNS when OPENCODE_SERVER_PASSWORD is unset). The real boundary is the
// 127.0.0.1 bind.
const child = spawn(this.opencodeBinary, ['serve', '--hostname', '127.0.0.1', '--port', String(port)], {
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env },
});
this.childProc = child;
this.serverPort = port;
// Child lifetime is the backend's (the pool's), NOT a request's. On unexpected
// exit we recover: settle in-flight turns, mark sessions crashed (the backend's
// onServerDown), reclaim the port, and reset state so the next ensureServer
// re-spawns.
child.on('exit', (code, signal) => {
// Only react to THIS child's exit (a restart may have swapped in a new one).
if (this.childProc !== child) return;
this.handleCrash(code, signal, port);
});
await waitForReady(child, READY_TIMEOUT_MS);
this.opencodeClient = createOpencodeClient({ baseUrl: `http://127.0.0.1:${port}` });
this.up = true;
this.log.info({ port }, 'opencode-server: ready');
} finally {
this.startInFlight = false;
}
}
/**
* Server down (crash-exit or forced restart): reset process/port state, delegate
* session-level recovery to the backend, and reclaim the port. Mirrors the
* original `handleServerCrash` ordering (up=false → session cleanup → client/
* serverStarting null → reclaimPort).
*/
private handleCrash(code: number | null, signal: NodeJS.Signals | null, port: number): void {
this.up = false;
this.hooks.onServerDown({ code, signal, port });
this.opencodeClient = null;
this.serverStarting = null; // force a re-spawn on the next ensureServer
// Reclaim the port so a re-spawn on a fixed/leaked port isn't blocked. Best
// effort; the next start uses a fresh ephemeral port anyway.
reclaimPort(port);
}
/**
* Phase 3 proactive health monitor (openchamber `runHealthCheckCycle` lift,
* busy-aware). Probes /global/health; on a sustained failure of a NON-busy server,
* force a restart so the next turn isn't blocked by a wedged process. Busy servers
* are deferred via the stale-grace in `decideRestart`. No-op when never started or
* a restart is already in flight.
*/
async tickHealth(now: number = Date.now()): Promise<void> {
if (!this.childProc || this.restarting) return;
const childExited = this.childProc.exitCode !== null || this.childProc.signalCode !== null;
// An exited child is recovered lazily by ensureServer; don't double-restart it.
if (childExited) return;
const healthy = await this.probeHealth();
if (healthy) {
this.consecutiveHealthFailures = 0;
this.unhealthyBusySince = 0;
return;
}
this.consecutiveHealthFailures += 1;
const busy = this.hooks.isBusy();
const decision = decideRestart({
processExited: false,
consecutiveFailures: this.consecutiveHealthFailures,
busy,
unhealthyBusySince: this.unhealthyBusySince,
now,
failureThreshold: DEFAULT_HEALTH_FAILURE_THRESHOLD,
});
// Stamp the start of an unhealthy-while-busy window so the stale-grace can fire.
if (busy && this.unhealthyBusySince === 0) this.unhealthyBusySince = now;
if (decision.action === 'restart') {
this.log.warn(
{ failures: this.consecutiveHealthFailures, busy, reason: decision.reason },
'opencode-server: health monitor forcing restart',
);
this.consecutiveHealthFailures = 0;
this.unhealthyBusySince = 0;
await this.restartServer();
}
}
private async probeHealth(): Promise<boolean> {
if (!this.opencodeClient) return false;
try {
const res = await this.opencodeClient.global.health();
return !res.error;
} catch {
return false;
}
}
/** Force-kill the current server + reclaim its port; the next ensureServer
* re-spawns (lazy). Mirrors handleCrash's state reset but is initiated by the
* health monitor rather than the OS. */
private async restartServer(): Promise<void> {
if (this.restarting) return this.restarting;
this.restarting = (async () => {
const child = this.childProc;
const port = this.serverPort;
this.up = false;
// Fail in-flight turns + mark sessions crashed via the same path as a crash.
if (child) {
this.handleCrash(null, null, port ?? 0);
if (!child.killed) child.kill('SIGTERM');
}
if (port) {
reclaimPort(port);
await waitForPortRelease(port, 3_000);
}
this.childProc = null;
})().finally(() => {
this.restarting = null;
});
return this.restarting;
}
/** Full teardown of the child + client + port state. */
async dispose(): Promise<void> {
this.up = false;
const child = this.childProc;
this.childProc = null;
this.opencodeClient = null;
if (child && !child.killed) {
child.kill('SIGTERM');
const t = setTimeout(() => {
if (!child.killed) child.kill('SIGKILL');
}, 5_000);
t.unref();
}
}
}
/** Resolve when the child prints the ready line; reject on timeout or early exit. */
function waitForReady(child: ChildProcess, timeoutMs: number): Promise<void> {
return new Promise((resolve, reject) => {
let done = false;
let stderrBuf = '';
const finish = (err?: Error) => {
if (done) return;
done = true;
clearTimeout(timer);
child.stdout?.off('data', onOut);
child.stderr?.off('data', onErr);
child.off('exit', onExit);
if (err) reject(err);
else resolve();
};
const onOut = (buf: Buffer) => {
if (buf.toString().includes('opencode server listening on')) finish();
};
const onErr = (buf: Buffer) => {
stderrBuf += buf.toString();
};
const onExit = (code: number | null) =>
finish(new Error(`opencode serve exited before ready (code ${code}); stderr: ${stderrBuf.slice(-2000)}`));
const timer = setTimeout(
() => finish(new Error(`opencode serve not ready in ${timeoutMs}ms; stderr: ${stderrBuf.slice(-2000)}`)),
timeoutMs,
);
child.stdout?.on('data', onOut);
child.stderr?.on('data', onErr);
child.on('exit', onExit);
});
}

View File

@@ -0,0 +1,607 @@
/**
* v2.6 Phase 1 — OpenCodeServerBackend (slimmed, v2.7 audit reshape).
*
* Warm, multi-turn backend for the `opencode` agent. One `opencode serve` HTTP
* server per BooCoder process; one opencode session per BooCode session (resumed
* on switch-back); one SSE read loop PER session, each scoped to that session's
* worktree directory so sessions in different directories stream concurrently.
*
* This file is now just the `AgentBackend` SURFACE — ensureSession / prompt /
* accumulateUsage / closeSession + the per-session demux side effects (watchdog,
* reconcile, usage). It composes three extracted collaborators:
* - `OpenCodeServerSupervisor` (opencode-server-process.ts) — child/client/port/
* health lifecycle, spawn/crash/restart/dispose.
* - the per-session SSE loop (opencode-sse.ts) — subscribe + reconnect/backoff.
* - the pure event map (opencode-event-map.ts) — Event → AgentEvent translation,
* dedup gate, dcp-strip, tool-snapshot.
*
* Implements the Phase 0 `AgentBackend` interface. Emits transport-agnostic
* `AgentEvent`s; the dispatcher maps them to WS frames.
*
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2 / §2a.
*/
import { createHash } from 'node:crypto';
import type { FastifyBaseLogger } from 'fastify';
import type { Event, AssistantMessage } from '@opencode-ai/sdk/v2/client';
import type { Sql } from '../../db.js';
import { armAbortGuard, noteTurnActivity, consumeTerminal } from './turn-guard.js';
import { stepEndedToUsage, type StepUsage } from './opencode-usage.js';
import { OpenCodeServerSupervisor, type ServerDownInfo } from './opencode-server-process.js';
import {
startSessionEventLoop,
type SessionState,
type TurnState,
type SseLoopDeps,
} from './opencode-sse.js';
import {
classifyPartDelta,
classifyUpdatedPart,
toolCalledSnapshot,
toolSuccessSnapshot,
toolFailedSnapshot,
stripDcpTags,
errMsg,
errToString,
} from './opencode-event-map.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/**
* No-activity backstop for an in-flight turn. opencode streams reasoning/text/tool
* deltas continuously while working, so "zero events for this long" means the turn
* is wedged or its terminal event (session.idle) was lost. Generous so a
* legitimately slow turn never trips it.
*/
const TURN_INACTIVITY_MS = 180_000;
export interface OpenCodeServerBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** Absolute path to the opencode binary (resolved from available_agents at wiring time, Phase 1.7). */
opencodeBinary: string;
}
export class OpenCodeServerBackend implements AgentBackend {
readonly backend = 'opencode_server' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly supervisor: OpenCodeServerSupervisor;
/** opencode session id → demux state. Maintained by ensureSession; read by the SSE loop. */
private readonly byOpencodeId = new Map<string, SessionState>();
/** Coalesces concurrent ensureSession calls for the same (chat, agent) key. */
private readonly ensuring = new Map<string, Promise<AgentSessionHandle>>();
constructor(deps: OpenCodeServerBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.supervisor = new OpenCodeServerSupervisor({
opencodeBinary: deps.opencodeBinary,
log: deps.log,
hooks: {
isBusy: () => this.isBusy(),
onServerDown: (info) => this.onServerDown(info),
},
});
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.supervisor.health();
}
/** Phase 3: busy iff ANY pooled opencode session has an in-flight turn. */
isBusy(): boolean {
for (const st of this.byOpencodeId.values()) {
if (st.activeTurn) return true;
}
return false;
}
/** Phase 3 proactive health probe + busy-aware self-restart, run by the pool's
* periodic sweep. Delegates to the supervisor. */
async tickHealth(now: number = Date.now()): Promise<void> {
await this.supervisor.tickHealth(now);
}
/**
* Server down (crash-exit or forced restart): fail every in-flight turn so its
* dispatcher unblocks, mark each session crashed so ensureSession won't resume a
* now-dead native id, and tear down the SSE loops + demux state. Invoked by the
* supervisor (it owns the process/port reset). Mirrors the original
* handleServerCrash session-half byte-for-byte.
*/
private onServerDown(info: ServerDownInfo): void {
const states = [...this.byOpencodeId.values()];
this.log.warn(
{ code: info.code, signal: info.signal, port: info.port, liveSessions: states.length },
'opencode-server: child exited — recovering (fail in-flight, mark crashed, re-spawn next turn)',
);
const crashedIds: string[] = [];
for (const st of states) {
st.sseAbort?.abort();
if (st.activeTurn) {
st.activeTurn.settle({ ok: false, error: 'opencode server crashed mid-turn' });
st.activeTurn = null;
}
if (st.watchdog) {
clearTimeout(st.watchdog);
st.watchdog = null;
}
crashedIds.push(st.agentSessionId);
}
// Drop the demux map: every session id is stale against a fresh server.
this.byOpencodeId.clear();
if (crashedIds.length > 0) {
this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE agent_session_id = ANY(${crashedIds}) AND status <> 'closed'
`.catch((err) => {
this.log.warn({ err: errMsg(err) }, 'opencode-server: failed to mark crashed sessions (non-fatal)');
});
}
}
// ─── SSE loop wiring ─────────────────────────────────────────────────────────
/** The dependency bundle the per-session SSE loop reads. */
private sseDeps(): SseLoopDeps {
return {
isUp: () => this.supervisor.isUp(),
getClient: () => this.supervisor.client,
dispatchEvent: (ev) => this.dispatchEvent(ev),
reconcile: (st) => this.reconcile(st),
onReconnectGiveUp: (st) => this.onReconnectGiveUp(st),
log: this.log,
};
}
/** Demux one event to the owning session's active turn. Unknown/between-turns → drop. */
private dispatchEvent(ev: Event): void {
switch (ev.type) {
// ─── session.next.* — live streaming events (the primary path) ─────────
case 'session.next.text.delta': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
const cleaned = stripDcpTags(p.delta);
if (cleaned) st.activeTurn.onEvent({ type: 'text', text: cleaned });
return;
}
case 'session.next.reasoning.delta': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
st.activeTurn.onEvent({ type: 'reasoning', text: p.delta });
return;
}
case 'session.next.tool.called': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
st.activeTurn.onEvent({ type: 'tool_call', toolCall: toolCalledSnapshot(p) });
return;
}
case 'session.next.tool.success': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
st.activeTurn.onEvent({ type: 'tool_update', toolCall: toolSuccessSnapshot(p) });
return;
}
case 'session.next.tool.failed': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
st.activeTurn.onEvent({ type: 'tool_update', toolCall: toolFailedSnapshot(p) });
return;
}
// ─── per-step usage (U.6) — token/cost accounting for opencode sessions ──
case 'session.next.step.ended': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
// Accumulate this step's normalized usage onto the (chat_id, agent) row.
// Fire-and-forget: a DB hiccup must not stall the turn.
const usage = stepEndedToUsage(p);
void this.accumulateUsage(st, usage);
return;
}
// ─── message.part.* — terminal/post-hoc events (dedup gate) ────────────
case 'message.part.delta': {
const p = ev.properties;
const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
const e = classifyPartDelta(p, st);
if (e) st.activeTurn.onEvent(e);
return;
}
case 'message.part.updated': {
const part = ev.properties.part;
const st = this.byOpencodeId.get(part.sessionID);
if (!st?.activeTurn) return;
this.bumpActivity(st);
const e = classifyUpdatedPart(part, st);
if (e) st.activeTurn.onEvent(e);
return;
}
// ─── lifecycle ─────────────────────────────────────────────────────────
case 'session.idle': {
const st = this.byOpencodeId.get(ev.properties.sessionID);
if (!st) return;
if (consumeTerminal(st) === 'swallow') return; // F.1: drop the post-abort orphan
st.activeTurn?.settle({ ok: true });
return;
}
case 'session.error': {
const sid = ev.properties.sessionID;
if (!sid) return;
const st = this.byOpencodeId.get(sid);
if (!st) return;
if (consumeTerminal(st) === 'swallow') return; // F.1: drop the post-abort orphan
st.activeTurn?.settle({ ok: false, error: errToString(ev.properties.error) });
return;
}
default:
return;
}
}
// ─── turn-completion resilience (watchdog + reconnect reconcile) ─────────────
/** Reset the inactivity backstop on any event routed to a session's active turn. */
private bumpActivity(st: SessionState): void {
if (!st.activeTurn) return;
// A live turn is producing → the post-abort orphan-terminal window is over.
noteTurnActivity(st);
if (st.watchdog) clearTimeout(st.watchdog);
st.watchdog = setTimeout(() => {
void this.onTurnStall(st);
}, TURN_INACTIVITY_MS);
st.watchdog.unref?.();
}
/** Watchdog fired: reconcile once; if still-running we can't tell, so fail closed.
* Also mark the agent_sessions row crashed so a stale session isn't resumed. */
private async onTurnStall(st: SessionState): Promise<void> {
const settled = await this.reconcile(st);
if (!settled) {
this.log.warn({ agentSessionId: st.agentSessionId }, 'opencode-server: turn stalled (no activity), failing + marking crashed');
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE agent_session_id = ${st.agentSessionId}
`.catch(() => {});
st.activeTurn?.settle({ ok: false, error: 'turn timed out (no activity)' });
}
}
/** SSE circuit-breaker fired (reconnect gave up): fail the active turn + mark the
* session crashed so it isn't resumed. The next turn re-creates a fresh session. */
private async onReconnectGiveUp(st: SessionState): Promise<void> {
if (!st.activeTurn) return;
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE agent_session_id = ${st.agentSessionId}
`.catch(() => {});
st.activeTurn?.settle({ ok: false, error: 'opencode SSE stream lost (reconnect gave up)' });
}
/**
* Ask the server whether this session's turn already finished — recovers a
* session.idle/error lost during an SSE gap. Returns true if it settled the turn.
*/
private async reconcile(st: SessionState): Promise<boolean> {
const turn = st.activeTurn;
const client = this.supervisor.client;
if (!turn || !client) return false;
try {
const res = await client.session.messages({
sessionID: st.agentSessionId,
directory: st.worktreePath,
});
if (res.error || !res.data) return false;
let lastAssistant: AssistantMessage | undefined;
for (let i = res.data.length - 1; i >= 0; i--) {
const info = res.data[i]!.info;
if (info.role === 'assistant') {
lastAssistant = info;
break;
}
}
if (!lastAssistant) return false;
if (lastAssistant.error != null) {
turn.settle({ ok: false, error: errToString(lastAssistant.error) });
return true;
}
if (lastAssistant.time.completed != null) {
turn.settle({ ok: true });
return true;
}
return false; // still running — the live stream will deliver session.idle
} catch {
return false; // inconclusive — watchdog backstop covers it
}
}
// ─── per-step usage persistence (U.6) ────────────────────────────────────────
/**
* Accumulate one `session.next.step.ended`'s normalized usage onto the session's
* agent_sessions row. Running totals for the whole conversation context. Zero-delta
* steps are skipped. Errors are swallowed: usage telemetry must never fail a turn.
*/
private async accumulateUsage(st: SessionState, u: StepUsage): Promise<void> {
if (u.input === 0 && u.output === 0 && u.cost === 0) return;
try {
await this.sql`
UPDATE agent_sessions SET
input_tokens = input_tokens + ${u.input},
output_tokens = output_tokens + ${u.output},
cost = cost + ${u.cost}
WHERE agent_session_id = ${st.agentSessionId}
`;
} catch (err) {
this.log.warn(
{ err: errMsg(err), agentSessionId: st.agentSessionId },
'opencode-server: failed to persist step usage (non-fatal)',
);
}
}
// ─── ensureSession: create-or-resume against agent_sessions (1.5) ────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
// Coalesce concurrent first-turns for the same (chat, agent) so the SELECT…
// create…upsert can't race into two opencode sessions (the second orphaning
// the first). A single (non-concurrent) call is unaffected — the entry is set
// and removed within this call. Defensive: the dispatcher already serializes
// turns per (chat, agent) via its inflight map.
const key = `${opts.chatId}:${opts.agent}`;
const existing = this.ensuring.get(key);
if (existing) return existing;
const p = this.ensureSessionInner(sessionId, opts).finally(() => {
if (this.ensuring.get(key) === p) this.ensuring.delete(key);
});
this.ensuring.set(key, p);
return p;
}
private async ensureSessionInner(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
await this.supervisor.ensureServer();
const client = this.supervisor.client;
if (!client) throw new Error('opencode-server: client not ready after ensureServer');
const configHash = sessionConfigHash(opts.model);
// P1.5-b: agent_sessions is keyed (chat_id, agent) — the tab/chat is the
// context unit (two tabs in one session = two contexts sharing one worktree).
const [row] = await this.sql<{ agent_session_id: string | null; status: string; config_hash: string | null }[]>`
SELECT agent_session_id, status, config_hash FROM agent_sessions
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
`;
let agentSessionId = row?.agent_session_id ?? null;
// Don't resume crashed sessions or sessions whose config drifted (model change).
const shouldResume = agentSessionId
&& row!.status !== 'crashed'
&& (row!.config_hash == null || row!.config_hash === configHash);
if (!shouldResume) {
if (agentSessionId) {
this.log.info({ sessionId, oldStatus: row!.status, hashMatch: row!.config_hash === configHash },
'opencode-server: not resuming stale session, creating fresh');
this.byOpencodeId.delete(agentSessionId);
}
const created = await client.session.create({ directory: opts.worktreePath });
if (created.error || !created.data) {
throw new Error(`opencode-server: session.create failed: ${errToString(created.error)}`);
}
agentSessionId = created.data.id;
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at, config_hash)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'opencode_server', ${agentSessionId}, ${this.supervisor.port}, 'active', clock_timestamp(), ${configHash})
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'opencode_server',
agent_session_id = EXCLUDED.agent_session_id,
server_port = EXCLUDED.server_port,
status = 'active',
last_active_at = clock_timestamp(),
config_hash = EXCLUDED.config_hash
`;
} else {
await this.sql`
UPDATE agent_sessions
SET status = 'active', last_active_at = clock_timestamp(), server_port = ${this.supervisor.port}, config_hash = ${configHash}
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
`;
}
// Both branches above guarantee agentSessionId is non-null.
const ocSessionId = agentSessionId!;
// Register / refresh the demux entry the SSE loop keys on. Preserve an existing
// entry (and any in-flight turn) — just refresh the routing fields.
let state = this.byOpencodeId.get(ocSessionId);
if (state) {
state.boocodeSessionId = sessionId;
state.worktreePath = opts.worktreePath;
} else {
state = this.makeSessionState(sessionId, ocSessionId, opts.worktreePath);
this.byOpencodeId.set(ocSessionId, state);
}
// Start this session's own SSE loop, scoped to its worktree directory. Both
// fresh-create and resume reach here; idempotent.
startSessionEventLoop(state, this.sseDeps());
return {
sessionId,
agent: opts.agent,
backend: 'opencode_server',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: ocSessionId,
serverPort: this.supervisor.port,
};
}
/** Fresh per-(opencode session) demux state. */
private makeSessionState(boocodeSessionId: string, agentSessionId: string, worktreePath: string): SessionState {
return {
boocodeSessionId,
agentSessionId,
worktreePath,
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: null,
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
};
}
// ─── prompt: send one turn (1.6) ─────────────────────────────────────────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
const client = this.supervisor.client;
if (!client) throw new Error('opencode-server: client not ready');
const oc = handle.agentSessionId;
if (!oc) throw new Error('opencode-server: handle has no agentSessionId');
let state = this.byOpencodeId.get(oc);
if (!state) {
state = this.makeSessionState(handle.sessionId, oc, ctx.worktreePath);
this.byOpencodeId.set(oc, state);
}
const session = state;
// v2.7 busy-assert: one in-flight turn per session. The dispatcher serializes
// turns per (chat, agent), so this never fires in normal dispatch — but if a
// second prompt arrives while one is live it would silently overwrite the slot
// and orphan the first turn, so reject instead.
if (session.activeTurn) {
return { ok: false, error: 'opencode-server: session already has an in-flight turn' };
}
// Authoritative per-turn directory for SDK routing + reconcile.
session.worktreePath = ctx.worktreePath;
// Defensive: ensureSession normally starts the loop, but if prompt is reached
// with a freshly-created state (no loop yet), start it so the turn streams.
startSessionEventLoop(session, this.sseDeps());
return await new Promise<TurnResult>((resolve) => {
let settled = false;
const cleanup = () => {
session.activeTurn = null;
if (session.watchdog) {
clearTimeout(session.watchdog);
session.watchdog = null;
}
session.streamedPartKeys.clear();
session.partTypeById.clear();
ctx.signal.removeEventListener('abort', onAbort);
};
const settle = (r: TurnResult) => {
if (settled) return;
settled = true;
cleanup();
resolve(r);
};
const onAbort = () => {
// Abort the turn only — never the server.
client.session.abort({ sessionID: oc, directory: ctx.worktreePath }).catch(() => {});
// F.1: opencode emits one trailing session.idle/error for the cancelled
// turn — arm the guard so it's swallowed, not used to settle the next turn.
armAbortGuard(session);
settle({ ok: false, error: 'aborted' });
};
const turn: TurnState = { onEvent: ctx.onEvent, settle };
session.activeTurn = turn;
this.bumpActivity(session); // arm the inactivity backstop
if (ctx.signal.aborted) {
onAbort();
return;
}
ctx.signal.addEventListener('abort', onAbort, { once: true });
const model = parseModel(ctx.model);
client.session
.promptAsync({
sessionID: oc,
directory: ctx.worktreePath,
parts: [{ type: 'text', text: input }],
...(model ? { model } : {}),
})
.then((res) => {
// promptAsync is fire-and-forget (204); the turn completes via session.idle.
// Only a submission error settles here.
if (res.error) settle({ ok: false, error: errToString(res.error) });
})
.catch((err) => settle({ ok: false, error: errMsg(err) }));
});
}
// ─── teardown ────────────────────────────────────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
if (handle.agentSessionId) {
// Stop this session's SSE loop before dropping its demux entry.
this.byOpencodeId.get(handle.agentSessionId)?.sseAbort?.abort();
this.byOpencodeId.delete(handle.agentSessionId);
}
await this.sql`
UPDATE agent_sessions SET status = 'closed'
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => {});
}
async dispose(): Promise<void> {
// Abort every per-session SSE loop so none survive the teardown.
for (const st of this.byOpencodeId.values()) st.sseAbort?.abort();
this.byOpencodeId.clear();
await this.supervisor.dispose();
}
}
// ─── helpers ──────────────────────────────────────────────────────────────────
/** BooCoder model string "provider/model" → opencode's structured {providerID, modelID}. */
function parseModel(model: string | undefined): { providerID: string; modelID: string } | undefined {
if (!model || !model.trim()) return undefined;
const trimmed = model.trim();
const idx = trimmed.indexOf('/');
if (idx > 0 && idx < trimmed.length - 1) {
return { providerID: trimmed.slice(0, idx), modelID: trimmed.slice(idx + 1) };
}
// No slash but non-empty → infer llama-swap (the only configured provider).
if (idx < 0 && trimmed.length > 0) {
return { providerID: 'llama-swap', modelID: trimmed };
}
return undefined;
}
/** Hash of stable config — detects model changes across sessions without
* invalidating on ephemeral state like the random server port. */
function sessionConfigHash(model: string): string {
return createHash('sha256').update(`opencode_server|${model}`).digest('hex').slice(0, 16);
}

View File

@@ -0,0 +1,181 @@
/**
* Per-session SSE subscribe loop + reconnect/backoff + eventSessionId demux.
*
* Extracted (v2.7 audit reshape) from `OpenCodeServerBackend.startSessionEventLoop`
* / `runSessionEventLoop`. opencode scopes events by the `directory` query param, so
* each session runs its own dir-scoped stream and never drops a sibling's events.
*
* The loop is intentionally thin: it owns subscribe + the demux filter + reconnect
* timing only. Translating an event into turn side effects (watchdog, usage,
* settle) stays on the backend via the injected `dispatchEvent` / `reconcile`
* callbacks — `opencode-sse` knows nothing about turns or the DB.
*
* v2.7 concurrency hardening: the throw-driven reconnect path now backs off
* exponentially and trips a circuit-breaker (`onReconnectGiveUp`) after a bounded
* number of consecutive failures, instead of looping forever at a flat 1s. The
* HAPPY PATH is unchanged — a clean stream end (server still up) reconnects after
* `baseMs` (1s, as before) and resets the failure counter, so a long-lived session
* that re-subscribes normally never backs off.
*/
import type { FastifyBaseLogger } from 'fastify';
import type { Event, OpencodeClient } from '@opencode-ai/sdk/v2/client';
import type { AgentEvent } from '../agent-backend.js';
import type { TurnResult } from '../agent-backend.js';
import { eventSessionId, errMsg } from './opencode-event-map.js';
export const SSE_RECONNECT_DELAY_MS = 1_000;
/** One in-flight turn's emitter + completion settler. */
export interface TurnState {
onEvent: (e: AgentEvent) => void;
settle: (r: TurnResult) => void;
}
/** Per-(opencode session) demux state. dedup sets scoped here, cleared per turn. */
export interface SessionState {
boocodeSessionId: string;
agentSessionId: string;
/** Worktree directory for SDK `directory` routing; refreshed each turn from ctx. */
worktreePath: string;
/** dedup gate: `${type}:${id}` added on delta, deleted-and-tested on updated. Cleared at turn end. */
streamedPartKeys: Set<string>;
/** partID → 'text' | 'reasoning', so a delta with a non-'reasoning' field is still classed right. Cleared at turn end. */
partTypeById: Map<string, string>;
activeTurn: TurnState | null;
/** Inactivity backstop timer for the active turn; null when no turn in flight. */
watchdog: ReturnType<typeof setTimeout> | null;
/** Per-session SSE subscription handle. Non-null while the loop is running;
* aborting it tears down the underlying fetch and exits the loop. */
sseAbort: AbortController | null;
/** F.1 post-abort orphan-terminal guard: swallow the one session.idle/error
* opencode emits for an aborted turn so it can't settle the next turn. */
swallowNextTerminal: boolean;
}
// ─── reconnect backoff (pure) ────────────────────────────────────────────────
export interface ReconnectPolicy {
/** First retry delay (and the steady-state clean-reconnect delay). */
baseMs: number;
/** Cap on the exponential delay. */
maxMs: number;
/** Consecutive failures tolerated before the breaker trips (give up). */
maxAttempts: number;
}
export const DEFAULT_RECONNECT_POLICY: ReconnectPolicy = {
baseMs: SSE_RECONNECT_DELAY_MS,
maxMs: 30_000,
maxAttempts: 6,
};
export type ReconnectDecision =
| { action: 'reconnect'; delayMs: number }
| { action: 'give-up' };
/**
* Pure backoff decision after `failures` consecutive throwing reconnect attempts
* (1-based: the first failure passes `failures=1`). Returns an exponentially
* growing delay capped at `maxMs`, or `give-up` once the count exceeds
* `maxAttempts`. `failures=1` yields `baseMs`, so the very first retry matches the
* pre-hardening flat delay (happy-path-preserving).
*/
export function reconnectDecision(
failures: number,
policy: ReconnectPolicy = DEFAULT_RECONNECT_POLICY,
): ReconnectDecision {
if (failures > policy.maxAttempts) return { action: 'give-up' };
const exp = policy.baseMs * 2 ** (failures - 1);
return { action: 'reconnect', delayMs: Math.min(policy.maxMs, exp) };
}
// ─── the loop ────────────────────────────────────────────────────────────────
export interface SseLoopDeps {
/** Live iff the server is up (read each iteration so a crash stops the loop). */
isUp: () => boolean;
/** The current opencode client (null between server restarts). */
getClient: () => OpencodeClient | null;
/** Route one demuxed event to its turn (backend side effects live here). */
dispatchEvent: (ev: Event) => void;
/** Recover an idle/error lost during an SSE gap. Returns true if it settled. */
reconcile: (state: SessionState) => Promise<boolean>;
/** Circuit-breaker: called once the backoff gives up; fail the active turn. */
onReconnectGiveUp: (state: SessionState) => Promise<void> | void;
log: FastifyBaseLogger;
/** Injectable for tests; defaults to a real timer sleep. */
sleep?: (ms: number) => Promise<void>;
policy?: ReconnectPolicy;
}
function defaultSleep(ms: number): Promise<void> {
return new Promise((r) => setTimeout(r, ms));
}
/** Per-session SSE subscription, scoped to the session's worktree directory.
* Idempotent: a no-op if this session's loop is already running. */
export function startSessionEventLoop(state: SessionState, deps: SseLoopDeps): void {
if (state.sseAbort) return; // already running
const abort = new AbortController();
state.sseAbort = abort;
void runSessionEventLoop(state, abort, deps).finally(() => {
// Only clear if this controller is still the live one (a later restart may
// have already installed a new one).
if (state.sseAbort === abort) state.sseAbort = null;
});
}
export async function runSessionEventLoop(
state: SessionState,
abort: AbortController,
deps: SseLoopDeps,
): Promise<void> {
const signal = abort.signal;
const sleep = deps.sleep ?? defaultSleep;
const policy = deps.policy ?? DEFAULT_RECONNECT_POLICY;
let failures = 0;
while (deps.isUp() && deps.getClient() && !signal.aborted) {
const client = deps.getClient()!;
try {
// Re-read worktreePath each (re)subscribe so a directory refresh is picked
// up on reconnect. Passing `signal` lets close/dispose tear down a stream
// that's parked in `for await` between events.
const sub = await client.event.subscribe({ directory: state.worktreePath }, { signal });
for await (const ev of sub.stream) {
if (signal.aborted) break;
// Dir-scoped streams should only carry this session's events, but two
// sessions sharing a worktree (possible post-P1.5-b) each receive BOTH
// sessions' events — so drop anything that isn't ours, else the other
// session's deltas get processed twice (once per loop).
const sid = eventSessionId(ev);
if (sid != null && sid !== state.agentSessionId) continue;
deps.dispatchEvent(ev);
}
// Clean stream end — a healthy reconnect, NOT a failure: recover any lost
// terminal then re-subscribe at the base delay (pre-hardening behavior).
failures = 0;
if (deps.isUp() && !signal.aborted) {
await deps.reconcile(state); // recover an idle/error lost during the gap
await sleep(policy.baseMs);
}
} catch (err) {
if (!deps.isUp() || signal.aborted) break;
failures += 1;
const decision = reconnectDecision(failures, policy);
deps.log.warn(
{ err: errMsg(err), agentSessionId: state.agentSessionId, failures, action: decision.action },
'opencode-server: session event loop error; reconnecting',
);
await deps.reconcile(state);
if (decision.action === 'give-up') {
deps.log.warn(
{ agentSessionId: state.agentSessionId, failures },
'opencode-server: SSE reconnect gave up (circuit breaker) — failing active turn',
);
await deps.onReconnectGiveUp(state);
break;
}
await sleep(decision.delayMs);
}
}
}

View File

@@ -0,0 +1,77 @@
/**
* v2.6 Phase 1-UX (U.6) — pure mapper for opencode's per-step usage event.
*
* opencode's warm server emits `session.next.step.ended` once per completed LLM
* step (so a multi-tool turn fires it several times). Its `properties` carry the
* step's token + cost accounting:
*
* {
* timestamp: number;
* sessionID: string;
* finish: string;
* cost: number; // USD for this step
* tokens: {
* input: number; output: number; reasoning: number;
* cache: { read: number; write: number };
* };
* snapshot?: string;
* }
*
* (Verified against @opencode-ai/sdk@1.15.12 — `EventSessionNextStepEnded` in
* `dist/v2/gen/types.gen.d.ts`, a member of the `Event` union the SSE loop
* switches on.)
*
* We normalize to the review's target slice `{input, output, cost}` (the
* provider-agnostic `AgentUsage` shape lands later). cache read/write tokens are
* folded into `input` so the persisted input count reflects the real context the
* model billed for; reasoning tokens are folded into `output` since that's what
* the provider counts them as for generation. This keeps the persisted totals a
* faithful sum of what opencode reported, without inventing extra columns yet.
*/
/** The `properties` shape of a `session.next.step.ended` event (subset we read). */
export interface StepEndedProps {
cost: number;
tokens: {
input: number;
output: number;
reasoning: number;
cache: { read: number; write: number };
};
}
/** Normalized per-step usage delta persisted onto the agent_sessions row. */
export interface StepUsage {
input: number;
output: number;
cost: number;
}
/** Coerce a possibly-missing/NaN number to a non-negative finite integer (tokens). */
function n(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? Math.round(x) : 0;
}
/** Coerce a possibly-missing/NaN number to a non-negative finite float (cost USD). */
function f(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? x : 0;
}
/**
* Map a `session.next.step.ended` payload → the normalized `{input, output, cost}`
* delta. Defensive against missing/partial token blocks (the wire is trusted but
* we never want a NaN to poison the accumulated DB total). `input` folds in cache
* read+write; `output` folds in reasoning.
*/
export function stepEndedToUsage(props: Partial<StepEndedProps> | undefined): StepUsage {
const t = props?.tokens;
const cacheRead = n(t?.cache?.read);
const cacheWrite = n(t?.cache?.write);
return {
input: n(t?.input) + cacheRead + cacheWrite,
output: n(t?.output) + n(t?.reasoning),
cost: f(props?.cost),
};
}

View File

@@ -0,0 +1,96 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — a tiny PURE pushable async-iterable.
*
* The Claude Agent SDK's streaming-input mode wants `query({ prompt })` where
* `prompt` is an `AsyncIterable<SDKUserMessage>`. To keep ONE `query()` generator
* alive across many turns (the "warm" property), the backend feeds it ONE user
* message per `prompt()` turn through a queue that stays open between turns and is
* only closed at `closeSession`/`dispose`. This is that queue.
*
* Semantics (the bit worth unit-testing — push/close/iterate ordering):
* - `push(v)` enqueues a value. If a consumer is parked in `await next()`, it's
* handed the value immediately; otherwise the value buffers in FIFO order.
* - The async iterator yields buffered/pushed values in push order, and PARKS
* (never busy-loops) when the buffer is empty — so the SDK generator waits for
* the next turn's message instead of seeing end-of-input.
* - `close()` ends the iterable: any parked consumer resolves `{done:true}` and
* all future `next()`s return done. Values pushed after close are dropped.
* - It's single-consumer (one `query()` reads it); concurrent consumers are not a
* supported shape and not needed here.
*
* No SDK import — generic over the pushed value `T` — so the pure push/close/iterate
* ordering is testable without the `SDKUserMessage` shape or a live binary.
*/
export interface Pushable<T> {
/** Enqueue a value (or hand it to a parked consumer). No-op after close. */
push(value: T): void;
/** End the iterable. Idempotent; a parked consumer resolves done. */
close(): void;
/** True once `close()` has been called. */
readonly closed: boolean;
/** The async-iterable the consumer (the SDK `query`) drives. */
readonly iterable: AsyncIterable<T>;
}
export function createPushable<T>(): Pushable<T> {
const buffer: T[] = [];
// A waiting consumer's resolver (null when none is parked). Single-consumer.
let pendingResolve: ((res: IteratorResult<T>) => void) | null = null;
let closed = false;
function push(value: T): void {
if (closed) return;
if (pendingResolve) {
const resolve = pendingResolve;
pendingResolve = null;
resolve({ value, done: false });
return;
}
buffer.push(value);
}
function close(): void {
if (closed) return;
closed = true;
if (pendingResolve) {
const resolve = pendingResolve;
pendingResolve = null;
resolve({ value: undefined, done: true });
}
}
const iterator: AsyncIterator<T> = {
next(): Promise<IteratorResult<T>> {
// Drain the buffer first (FIFO), regardless of close — buffered values
// pushed before close are still delivered.
if (buffer.length > 0) {
return Promise.resolve({ value: buffer.shift() as T, done: false });
}
if (closed) {
return Promise.resolve({ value: undefined, done: true });
}
// Park until the next push/close. Single-consumer: only one waiter at a time.
return new Promise<IteratorResult<T>>((resolve) => {
pendingResolve = resolve;
});
},
return(): Promise<IteratorResult<T>> {
// Consumer abandoned the loop (e.g. `break`) → close so a later push no-ops.
close();
return Promise.resolve({ value: undefined, done: true });
},
};
return {
push,
close,
get closed() {
return closed;
},
iterable: {
[Symbol.asyncIterator]() {
return iterator;
},
},
};
}

View File

@@ -0,0 +1,38 @@
/**
* Guard against opencode's post-abort "orphan" terminal event (F.1).
*
* When a turn is aborted (`client.session.abort`), opencode emits one trailing
* `session.idle` / `session.error` for the cancelled turn. Without a guard that
* orphan settles whatever turn currently holds the session slot — which, after
* the user immediately sends another message, is the NEXT turn, settling it early
* as success (the v2.6.5 Stop-button bug). opencode terminal events carry only a
* `sessionID` (no turn id), so we can't match by id; instead we swallow exactly
* one terminal per abort, and self-heal if that orphan never arrives.
*/
export interface AbortTerminalGuard {
/** True between an abort and the orphan terminal event that follows it. */
swallowNextTerminal: boolean;
}
/** Arm on abort: the next terminal event for this session is the orphan. */
export function armAbortGuard(g: AbortTerminalGuard): void {
g.swallowNextTerminal = true;
}
/**
* A new turn produced activity (delta) → the orphan window is over. Self-heals
* the case where opencode emits no orphan idle (e.g. abort-before-prompt), so a
* real terminal still settles instead of being swallowed forever.
*/
export function noteTurnActivity(g: AbortTerminalGuard): void {
g.swallowNextTerminal = false;
}
/** Decide a terminal (idle/error): swallow the post-abort orphan once, else settle. */
export function consumeTerminal(g: AbortTerminalGuard): 'swallow' | 'settle' {
if (g.swallowNextTerminal) {
g.swallowNextTerminal = false;
return 'swallow';
}
return 'settle';
}

View File

@@ -0,0 +1,41 @@
/**
* v2.6 Phase 2 — warm-vs-one-shot routing predicate for goose/qwen.
*
* The warm ACP backend keys its persistent process + ACP session on (chat_id,
* agent) — exactly like the opencode-server backend. A task therefore only routes
* to the warm pool when it carries BOTH a `session_id` and a `chat_id`, i.e. it
* came from a real chat tab (the coder message route + skills route stamp both).
*
* Session-less creators — arena contestants, MCP-created tasks, generic
* `POST /api/tasks`, `new_task` — leave one or both null. Those keep the existing
* one-shot worktree-per-task ACP path (`runExternalAgent`), which spawns a fresh
* `goose acp` / `qwen --acp` per turn and never holds a warm process. Routing them
* warm would either synthesize a degenerate (null, agent) key or create a chat per
* arena contestant — neither is wanted, so they stay one-shot.
*
* Pure, so it's unit-testable; the dispatcher consumes it.
*/
const WARM_CAPABLE_AGENTS = new Set(['goose', 'qwen']);
export function shouldUseWarmBackend(task: {
agent: string | null;
session_id: string | null;
chat_id: string | null;
}): boolean {
if (!task.agent || !WARM_CAPABLE_AGENTS.has(task.agent)) return false;
return task.session_id != null && task.chat_id != null;
}
/**
* Map an ACP prompt `stopReason` to the backend's ok/fail contract (TurnResult.ok).
*
* ACP's `StopReason` union includes normal completions (`end_turn`, `max_tokens`,
* `max_turn_requests`) and abnormal ones (`refusal`, `cancelled`). Only the latter
* two read as a failed turn; everything else (including an undefined/absent reason,
* which we default to `end_turn`) is a successful completion. Pure so it's testable
* independently of the warm process.
*/
export function isTurnOkForStopReason(stopReason: string | null | undefined): boolean {
const reason = stopReason ?? 'end_turn';
return reason !== 'refusal' && reason !== 'cancelled';
}

View File

@@ -0,0 +1,389 @@
/**
* v2.6 Phase 2 — WarmAcpBackend (goose, qwen).
*
* One persistent stdio process + ONE `ClientSideConnection` per (chat, agent),
* `initialize` + `session/new` done ONCE, reused across every turn — the warm
* analogue of the previous one-shot `acp-dispatch.ts` (which spawned/torn-down a
* fresh `goose acp` / `qwen --acp` per turn). Mirrors Paseo's `SpawnedACPProcess`.
*
* Implements the Phase 0 `AgentBackend` interface (same contract as
* `OpenCodeServerBackend`). Emits transport-agnostic `AgentEvent`s via the SHARED
* `mapSessionUpdate` (reused verbatim from the one-shot stack); the dispatcher maps
* those to WS frames + `persistExternalAgentTurn`, unchanged.
*
* Lifecycle decisions (design.md §2b / §10):
* - **Child lifetime is the pool's, not a request's.** Spawned once; never tied
* to a per-turn abort signal. Only the in-flight `prompt` gets `ctx.signal` —
* abort = ACP `session/cancel`, NOT killing the child.
* - **Per-turn abort** cancels the prompt on the warm connection so the SAME
* process serves the next turn.
* - **Crash** (child exit) marks `agent_sessions.status='crashed'` + logs; the
* next `ensureSession` re-spawns + re-`session/new` (Phase 3 hardens auto-restart).
* - **Resume across a process restart is NOT attempted in Phase 2.** goose ACP
* advertises no `loadSession`/`session.resume`; qwen does, but cross-restart
* resume is Phase 3. Within ONE live process the ACP session persists across
* turns (the whole point of "warm"); a restart re-`session/new` (memory loss
* across restart, accepted per §10). The agent's resume capabilities ARE
* probed and logged for forward-compat.
*
* Each WarmAcpBackend instance owns exactly one (chat, agent) — the dispatcher
* pools them under `agentPool.register(chatId, agent, backend)`.
*
* SDK note (@agentclientprotocol/sdk@^0.22.1, cross-checked against the design's
* `^0.14` worry): the resume method is the STABLE `resumeSession` (`session/resume`,
* gated by `agentCapabilities.sessionCapabilities.resume`), NOT the `^0.14`
* `unstable_resumeSession`. `loadSession` is gated by `agentCapabilities.loadSession`.
*/
import { spawn, type ChildProcess } from 'node:child_process';
import type { FastifyBaseLogger } from 'fastify';
import { ClientSideConnection, type Client } from '@agentclientprotocol/sdk';
import type { Sql } from '../../db.js';
import { resolveLaunchSpec } from '../acp-spawn.js';
import { isTurnOkForStopReason } from './warm-acp-routing.js';
import { getResolvedRegistry, type ResolvedProviderDef } from '../provider-config-registry.js';
import { createAcpNdJsonStream } from '../acp-stream.js';
import { mapSessionUpdate } from '../acp-event-map.js';
import { buildAcpClient } from '../acp-client.js';
import { cancelPendingPermission } from '../permission-waiter.js';
import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from '../acp-tool-snapshot.js';
import type {
AgentBackend,
AgentEvent,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/** State for one in-flight turn (only one at a time per backend — turns serialize). */
interface TurnState {
/** Per-turn task id, for routing permission prompts back to the UI. */
taskId: string | undefined;
/** BooCode session id for permission-waiter's broker frames. */
sessionId: string;
/** Per-turn mode id (autonomous-mode gate in permission-waiter). */
modeId: string | undefined;
onEvent: (e: AgentEvent) => void;
/** Tool-call snapshot accumulator for this turn — merge across tool_call_update. */
snapshots: Map<string, AcpToolSnapshot>;
}
export interface WarmAcpBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** The (chat, agent) this backend serves — its pool identity + DB key. */
chatId: string;
agent: string;
/** Resolved binary for the agent (from available_agents.install_path), or null. */
installPath: string | null;
/** Optional override of the resolved registry def (defaults to a live lookup). */
resolved?: ResolvedProviderDef;
}
export class WarmAcpBackend implements AgentBackend {
readonly backend = 'acp_warm' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly chatId: string;
private readonly agent: string;
private readonly installPath: string | null;
private readonly resolvedOverride: ResolvedProviderDef | undefined;
private child: ChildProcess | null = null;
private connection: ClientSideConnection | null = null;
/** The single ACP session id for this warm process; null until session/new. */
private acpSessionId: string | null = null;
private up = false;
/** Idempotent spawn guard — one warm process per backend, started lazily. */
private starting: Promise<void> | null = null;
/** Resume capabilities probed at initialize, logged for forward-compat (Phase 3). */
private supportsLoadSession = false;
private supportsResumeSession = false;
/** The current in-flight turn; the Client closures read it. Null between turns. */
private activeTurn: TurnState | null = null;
constructor(deps: WarmAcpBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.chatId = deps.chatId;
this.agent = deps.agent;
this.installPath = deps.installPath;
this.resolvedOverride = deps.resolved;
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
/** Phase 3: busy iff this backend's single session has an in-flight turn. The
* pool reads this to skip idle/LRU eviction (never kill the child mid-prompt). */
isBusy(): boolean {
return this.activeTurn != null;
}
// ─── warm-process lifecycle (2.1 spawn + initialize + session/new ONCE) ───────
/** Lazy: spawn the warm process on first use. Idempotent — one process per backend. */
private ensureProcess(worktreePath: string): Promise<void> {
if (this.up && this.connection && this.acpSessionId) return Promise.resolve();
if (!this.starting) {
this.starting = this.startProcess(worktreePath).catch((err) => {
// Reset so a later ensureSession can retry the spawn after a failed start.
this.starting = null;
throw err;
});
}
return this.starting;
}
private async startProcess(worktreePath: string): Promise<void> {
const resolved = this.resolvedOverride ?? getResolvedRegistry().get(this.agent);
const spec = resolved ? resolveLaunchSpec(resolved, this.installPath) : null;
if (!spec) throw new Error(`warm-acp: agent '${this.agent}' does not support ACP (no launch spec)`);
this.log.info({ agent: this.agent, chatId: this.chatId, binary: spec.binary, worktreePath }, 'warm-acp: spawning warm process');
// Child lifetime is the pool's. NOT tied to any per-turn abort signal — only
// the in-flight prompt is cancellable (via ACP session/cancel in prompt()).
const child = spawn(spec.binary, spec.args, {
cwd: worktreePath,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, ...spec.env },
});
this.child = child;
// 2.3: supervise the child; react to its exit, never let a request scope kill it.
child.on('exit', (code, signal) => {
this.up = false;
this.connection = null;
this.acpSessionId = null;
this.starting = null;
this.log.warn({ agent: this.agent, chatId: this.chatId, code, signal }, 'warm-acp: warm process exited — marking crashed (rebuild on next turn)');
void this.markCrashed();
});
// A spawn error (e.g. ENOENT) surfaces here, not as an exit.
child.on('error', (err) => {
this.up = false;
this.log.error({ agent: this.agent, chatId: this.chatId, err: errMsg(err) }, 'warm-acp: warm process error');
});
const stream = createAcpNdJsonStream(child);
const connection = new ClientSideConnection(() => this.buildClient(worktreePath), stream);
const init = await connection.initialize({
protocolVersion: 1,
clientInfo: { name: 'boocoder', version: '2.6.0' },
clientCapabilities: {},
});
const caps = init.agentCapabilities;
this.supportsLoadSession = caps?.loadSession === true;
this.supportsResumeSession = caps?.sessionCapabilities?.resume != null;
const session = await connection.newSession({ cwd: worktreePath, mcpServers: [] });
this.connection = connection;
this.acpSessionId = session.sessionId;
this.up = true;
this.log.info(
{
agent: this.agent,
chatId: this.chatId,
acpSessionId: session.sessionId,
loadSession: this.supportsLoadSession,
resumeSession: this.supportsResumeSession,
},
'warm-acp: warm session ready',
);
}
/** Build the ACP Client callbacks ONCE per connection (shared `buildAcpClient`).
* `resolveTurn` reads `this.activeTurn` at each callback so events/permissions
* route to the live turn — exactly the prior behavior. The warm session always
* has a non-empty `sessionId`, so the shared `taskId && sessionId` permission
* gate is equivalent to the old `turn?.taskId` gate. */
private buildClient(worktreePath: string): Client {
return buildAcpClient(worktreePath, () => {
const turn = this.activeTurn;
if (!turn) return null;
return {
taskId: turn.taskId,
sessionId: turn.sessionId,
modeId: turn.modeId,
agent: this.agent,
onSessionUpdate: (params) => {
for (const event of mapSessionUpdate(params, turn.snapshots)) turn.onEvent(event);
},
};
});
}
// ─── ensureSession: create-or-reuse the warm session (2.1) ───────────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
await this.ensureProcess(opts.worktreePath);
if (!this.acpSessionId) throw new Error('warm-acp: session not ready after ensureProcess');
// P1.5-b: agent_sessions keys on (chat_id, agent). The ACP session id is the
// resume handle WITHIN the live process; across a process restart it's stale,
// so ensureProcess re-`session/new` and we upsert the fresh id here.
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'acp_warm', ${this.acpSessionId}, NULL, 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'acp_warm',
agent_session_id = EXCLUDED.agent_session_id,
server_port = NULL,
status = 'active',
last_active_at = clock_timestamp()
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: opts.chatId, agent: opts.agent }, 'warm-acp: agent_sessions upsert failed (non-fatal)');
});
return {
sessionId,
agent: opts.agent,
backend: 'acp_warm',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: this.acpSessionId,
serverPort: null,
};
}
// ─── prompt: one turn on the warm connection (2.2) ───────────────────────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
// The warm process may have crashed between ensureSession and here, or this
// backend was rebuilt — re-establish before prompting.
await this.ensureProcess(ctx.worktreePath);
const connection = this.connection;
const acpSessionId = this.acpSessionId;
if (!connection || !acpSessionId) {
return { ok: false, error: 'warm-acp: no live ACP connection' };
}
// v2.7 busy-assert: one in-flight turn per warm session. The dispatcher
// serializes turns per (chat, agent), so this never fires in normal dispatch —
// but a second concurrent prompt would silently overwrite `activeTurn` and
// orphan the first turn, so reject instead.
if (this.activeTurn) {
return { ok: false, error: 'warm-acp: session already has an in-flight turn' };
}
const snapshots = new Map<string, AcpToolSnapshot>();
// taskId routes permission/elicitation prompts back to the UI. The dispatcher
// passes it (plus mode) on the per-turn PromptCtx; permission-waiter keys on it.
const turn: TurnState = {
taskId: ctx.taskId,
sessionId: handle.sessionId,
modeId: ctx.modeId,
onEvent: ctx.onEvent,
snapshots,
};
this.activeTurn = turn;
// Per-turn abort: cancel the in-flight prompt on the SAME connection — never
// kill the child (that's the pool's lifetime). On cancel we also synthesize
// 'canceled' updates for any still-running tool calls so the UI doesn't leave
// them spinning (mirrors AcpStreamContext.markAborted).
let aborted = false;
const onAbort = () => {
if (aborted) return;
aborted = true;
connection.cancel({ sessionId: acpSessionId }).catch(() => {});
if (ctx.taskId) cancelPendingPermission(ctx.taskId);
for (const snap of synthesizeCanceledSnapshots(snapshots.values())) {
snapshots.set(snap.toolCallId, snap);
ctx.onEvent({ type: 'tool_update', toolCall: snap });
}
};
if (ctx.signal.aborted) {
this.activeTurn = null;
return { ok: false, error: 'aborted' };
}
ctx.signal.addEventListener('abort', onAbort, { once: true });
try {
const result = await connection.prompt({
sessionId: acpSessionId,
prompt: [{ type: 'text', text: input }],
});
if (aborted) return { ok: false, error: 'aborted' };
const stopReason = result.stopReason ?? 'end_turn';
return isTurnOkForStopReason(stopReason)
? { ok: true }
: { ok: false, error: `stop_reason: ${stopReason}` };
} catch (err) {
if (aborted) return { ok: false, error: 'aborted' };
return { ok: false, error: errMsg(err) };
} finally {
ctx.signal.removeEventListener('abort', onAbort);
this.activeTurn = null;
await this.sql`
UPDATE agent_sessions SET status = 'idle', last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
}
// ─── teardown ────────────────────────────────────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
// Gracefully close the ACP session if the agent supports it; then kill the child.
if (this.connection && this.acpSessionId) {
await this.connection.closeSession({ sessionId: this.acpSessionId }).catch(() => {});
}
await this.killChild();
await this.sql`
UPDATE agent_sessions SET status = 'closed'
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => {});
}
async dispose(): Promise<void> {
this.up = false;
this.activeTurn = null;
if (this.connection && this.acpSessionId) {
await this.connection.closeSession({ sessionId: this.acpSessionId }).catch(() => {});
}
await this.killChild();
this.connection = null;
this.acpSessionId = null;
this.starting = null;
}
private async killChild(): Promise<void> {
const child = this.child;
this.child = null;
if (!child || child.killed) return;
child.kill('SIGTERM');
await new Promise<void>((resolve) => {
const t = setTimeout(() => {
if (!child.killed) child.kill('SIGKILL');
resolve();
}, 5_000);
t.unref?.();
child.once('close', () => {
clearTimeout(t);
resolve();
});
});
}
private async markCrashed(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}

View File

@@ -0,0 +1,50 @@
/**
* F1 — per-task abort registry. A Stop on an external-agent task must reach the
* in-flight run and abort its child / prompt. Each external run-function registers
* its per-turn AbortController here keyed by task id; the cancel route calls
* `cancel(taskId)` to fire it; the run-function's `.finally` deletes the entry.
*
* Idempotent by construction:
* - `cancel()` on an already-aborted controller no-ops (AbortController.abort()
* is idempotent) → a rapid double-Stop is safe.
* - `cancel()` on an unknown / already-finished task returns false → a
* cancel-after-natural-exit (entry already deleted) and a Stop on a native
* boocode task (never registered) are both safe no-ops.
*
* Pure (no DB / child / IO) so the abort wiring + idempotency contract is
* unit-testable in isolation — mirrors the turn-guard / lifecycle-decisions
* pure-helper precedent.
*/
export interface CancelRegistry {
/** Create + store an AbortController for this task, returning it for the run. */
register(taskId: string): AbortController;
/** Abort the task's in-flight run. Returns false when no controller is registered. */
cancel(taskId: string): boolean;
/** Drop the task's entry (called from the run's `.finally`). No-op if absent. */
delete(taskId: string): void;
/** Whether a controller is currently registered for this task. */
has(taskId: string): boolean;
}
export function createCancelRegistry(): CancelRegistry {
const controllers = new Map<string, AbortController>();
return {
register(taskId) {
const ac = new AbortController();
controllers.set(taskId, ac);
return ac;
},
cancel(taskId) {
const ac = controllers.get(taskId);
if (!ac) return false;
ac.abort();
return true;
},
delete(taskId) {
controllers.delete(taskId);
},
has(taskId) {
return controllers.has(taskId);
},
};
}

View File

@@ -0,0 +1,306 @@
/**
* write-edit-robustness #4 — worktree checkpoints.
*
* External agents (opencode / goose / qwen / claude) write DIRECTLY into the
* shared session worktree (`/tmp/booworktrees/sess-<id>`); BooCode's own `rewind`
* only reverses `pending_changes` against the project root, so it has zero coverage
* there. A checkpoint is a pre-turn shadow-commit of the worktree tree (tracked +
* untracked) captured WITHOUT touching the real index/working tree, stored in a
* private GC-safe ref. `restoreCheckpoint` rewinds the worktree to that commit,
* trims the transcript from the anchor message forward, and resets the agent
* backend so the next turn re-establishes a fresh context consistent with the
* restored files.
*
* All git goes through hostExec + shellEscape (BooCoder runs on the host; the
* worktrees live on the host fs). Checkpoint CREATION is best-effort: a failure
* logs and returns null — it must NEVER throw into the dispatch turn.
*/
import { randomUUID } from 'node:crypto';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../db.js';
import { hostExec } from './host-exec.js';
import { agentPool, OPENCODE_POOL_KEY } from './agent-pool.js';
import type { AgentSessionHandle } from './agent-backend.js';
/** Minimal shell escape for paths/refs (single-quote wrapping). Mirrors worktrees.ts. */
function shellEscape(s: string): string {
return "'" + s.replace(/'/g, "'\\''") + "'";
}
/**
* Pure builder for the shadow-commit command. Captures tracked + untracked files
* in the worktree into a temp index (so the real index/working tree is untouched),
* writes a tree, commits it parented on HEAD, and parks the commit under a private
* ref `refs/boocode/checkpoints/<id>` so git's GC never reclaims it. Prints ONLY
* the resulting SHA on stdout (the trailing `printf '%s'`), so the caller parses
* stdout.trim() directly.
*
* `id` is the row UUID (minted before the ref so the ref name matches the row).
* Both the worktree path and the id are shell-escaped.
*/
export function buildShadowCommitCommand(worktreePath: string, id: string): string {
const wt = shellEscape(worktreePath);
const ref = shellEscape(`refs/boocode/checkpoints/${id}`);
return (
`cd ${wt} && TMP=$(mktemp) && GIT_INDEX_FILE="$TMP" git read-tree HEAD ` +
`&& GIT_INDEX_FILE="$TMP" git add -A ` +
`&& TREE=$(GIT_INDEX_FILE="$TMP" git write-tree) ` +
`&& SHA=$(git commit-tree "$TREE" -p HEAD -m "boocode checkpoint") ` +
`&& git update-ref ${ref} "$SHA" && rm -f "$TMP" && printf '%s' "$SHA"`
);
}
export interface CreateCheckpointArgs {
chatId: string;
sessionId: string | null;
worktreeId: string | null;
worktreePath: string;
messageId: string | null;
label?: string | null;
}
/**
* Capture a pre-turn checkpoint of the session worktree. Best-effort: returns the
* inserted row's { id, commit_sha } on success, or null on any failure (the turn
* proceeds either way — a missing checkpoint just means no restore point for that
* turn). NEVER throws.
*
* The id is minted up front so the git ref name (`refs/boocode/checkpoints/<id>`)
* matches the DB row id, keeping ref and row in lockstep.
*/
export async function createCheckpoint(
sql: Sql,
args: CreateCheckpointArgs,
opts?: { signal?: AbortSignal; log?: FastifyBaseLogger },
): Promise<{ id: string; commit_sha: string } | null> {
const id = randomUUID();
try {
const cmd = buildShadowCommitCommand(args.worktreePath, id);
const res = await hostExec(cmd, { signal: opts?.signal, timeoutMs: 30_000 });
if (res.exitCode !== 0) {
opts?.log?.warn(
{ chatId: args.chatId, worktreePath: args.worktreePath, stderr: res.stderr.trim().slice(0, 500) },
'checkpoint: shadow-commit failed (turn proceeds without a checkpoint)',
);
return null;
}
const commitSha = res.stdout.trim();
if (!commitSha) {
opts?.log?.warn(
{ chatId: args.chatId, worktreePath: args.worktreePath },
'checkpoint: shadow-commit produced no SHA (turn proceeds)',
);
return null;
}
await sql`
INSERT INTO checkpoints (id, chat_id, session_id, worktree_id, message_id, commit_sha, label)
VALUES (${id}, ${args.chatId}, ${args.sessionId}, ${args.worktreeId}, ${args.messageId}, ${commitSha}, ${args.label ?? null})
`;
opts?.log?.info({ checkpointId: id, chatId: args.chatId, commitSha }, 'checkpoint: created');
return { id, commit_sha: commitSha };
} catch (err) {
opts?.log?.warn(
{ chatId: args.chatId, err: err instanceof Error ? err.message : String(err) },
'checkpoint: create threw (turn proceeds without a checkpoint)',
);
return null;
}
}
/** Error the route maps to a 404 when the checkpoint can't be resolved / scoped. */
export class CheckpointNotFoundError extends Error {
constructor(message: string) {
super(message);
this.name = 'CheckpointNotFoundError';
}
}
export interface RestoreCheckpointResult {
checkpoint_id: string;
messages_deleted: number;
worktree_reset: boolean;
backend_reset: boolean;
}
export interface RestoreCheckpointOpts {
signal?: AbortSignal;
log?: FastifyBaseLogger;
/** If set, the checkpoint MUST belong to this session (route scope guard). */
sessionId?: string;
}
interface CheckpointRow {
id: string;
chat_id: string;
session_id: string | null;
worktree_id: string | null;
message_id: string | null;
commit_sha: string;
created_at: Date;
}
/**
* Restore a checkpoint: rewind its worktree to the shadow commit, trim the
* transcript from the anchor message forward, reset the backend session, and drop
* now-orphaned later checkpoints. Throws CheckpointNotFoundError when the
* checkpoint is missing or not in the requested session (route → 404).
*/
export async function restoreCheckpoint(
sql: Sql,
checkpointId: string,
opts?: RestoreCheckpointOpts,
): Promise<RestoreCheckpointResult> {
// 1. Resolve the checkpoint.
const [cp] = await sql<CheckpointRow[]>`
SELECT id, chat_id, session_id, worktree_id, message_id, commit_sha, created_at
FROM checkpoints WHERE id = ${checkpointId}
`;
if (!cp) {
throw new CheckpointNotFoundError('checkpoint not found');
}
// Authorization scope (fail-safe): the checkpoint's chat must belong to the
// requested session. cp.session_id is a denormalized hint that may be null, so
// gating on it directly fails open — resolve the owning session via chats
// (authoritative; chat_id is NOT NULL) and deny on any mismatch or missing row.
if (opts?.sessionId) {
const [owner] = await sql<{ session_id: string | null }[]>`
SELECT session_id FROM chats WHERE id = ${cp.chat_id}
`;
if (!owner || owner.session_id !== opts.sessionId) {
throw new CheckpointNotFoundError('checkpoint not in session');
}
}
// 2. Resolve the worktree path (by worktree_id, else the session's active one).
let worktreePath: string | null = null;
if (cp.worktree_id) {
const [wt] = await sql<{ path: string }[]>`
SELECT path FROM worktrees WHERE id = ${cp.worktree_id}
`;
worktreePath = wt?.path ?? null;
}
if (!worktreePath) {
const sid = cp.session_id ?? opts?.sessionId ?? null;
if (sid) {
const [wt] = await sql<{ path: string }[]>`
SELECT path FROM worktrees WHERE session_id = ${sid} AND status = 'active' LIMIT 1
`;
worktreePath = wt?.path ?? null;
}
}
// 3. Worktree reset — hard-reset to the shadow commit, then clean untracked.
let worktreeReset = false;
if (worktreePath) {
const resetRes = await hostExec(
`git -C ${shellEscape(worktreePath)} reset --hard ${shellEscape(cp.commit_sha)}`,
{ signal: opts?.signal, timeoutMs: 30_000 },
).catch((err) => {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: reset --hard threw',
);
return null;
});
if (resetRes && resetRes.exitCode === 0) {
const cleanRes = await hostExec(
`git -C ${shellEscape(worktreePath)} clean -fd`,
{ signal: opts?.signal, timeoutMs: 30_000 },
).catch(() => null);
worktreeReset = cleanRes != null && cleanRes.exitCode === 0;
if (!worktreeReset) {
opts?.log?.warn({ checkpointId, worktreePath }, 'checkpoint restore: clean -fd did not succeed');
}
} else {
opts?.log?.warn(
{ checkpointId, worktreePath, stderr: resetRes?.stderr?.trim()?.slice(0, 500) },
'checkpoint restore: reset --hard did not succeed',
);
}
} else {
opts?.log?.warn({ checkpointId }, 'checkpoint restore: no worktree path resolved (files not reset)');
}
// 4. Trim the transcript from the anchor message forward. message_parts FK to
// messages is ON DELETE CASCADE (apps/server schema.sql:49), so parts are
// removed with their messages — no explicit parts delete needed.
let messagesDeleted = 0;
if (cp.message_id) {
const deleted = await sql<{ id: string }[]>`
DELETE FROM messages
WHERE chat_id = ${cp.chat_id}
AND created_at >= (SELECT created_at FROM messages WHERE id = ${cp.message_id})
RETURNING id
`;
messagesDeleted = deleted.length;
}
// 5. Backend reset — mark the chat's agent sessions crashed so the next turn
// re-establishes a fresh backend, and evict the live pool session(s) for this
// (chat, agent). Warm backends hold context server-side with no partial
// rewind, so a full reset is the only consistent option (proposal §4).
const agentRows = await sql<{ agent: string; backend: string; agent_session_id: string | null; session_id: string | null; worktree_id: string | null }[]>`
SELECT agent, backend, agent_session_id, session_id, worktree_id
FROM agent_sessions WHERE chat_id = ${cp.chat_id}
`;
await sql`
UPDATE agent_sessions SET status = 'crashed' WHERE chat_id = ${cp.chat_id}
`.catch(() => {});
let backendReset = false;
try {
// opencode runs on the SHARED server (keyed on a sentinel, not the chat) — close
// just this chat's session(s) on it, mirroring the lifecycle close-hook.
const ocBackend = agentPool.peek(OPENCODE_POOL_KEY, 'opencode');
if (ocBackend) {
for (const row of agentRows) {
if (row.backend !== 'opencode_server' || !row.agent_session_id) continue;
const handle: AgentSessionHandle = {
sessionId: row.session_id ?? '',
agent: row.agent,
backend: 'opencode_server',
chatId: cp.chat_id,
worktreeId: row.worktree_id ?? '',
agentSessionId: row.agent_session_id,
serverPort: null,
};
await ocBackend.closeSession(handle).catch((err) => {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: opencode closeSession threw',
);
});
}
}
// Warm-ACP backends are pooled under the chat id — dispose them (kills the
// goose/qwen child). closeChat skips busy backends (a live turn isn't torn down).
const disposed = await agentPool.closeChat(cp.chat_id);
backendReset = true;
opts?.log?.info({ checkpointId, chatId: cp.chat_id, disposed }, 'checkpoint restore: backend reset');
} catch (err) {
opts?.log?.warn(
{ checkpointId, err: err instanceof Error ? err.message : String(err) },
'checkpoint restore: backend reset threw',
);
}
// 6. Drop now-orphaned later checkpoints for this chat (their anchor messages were
// just trimmed). Compare `created_at` SERVER-SIDE via a subquery (NOT the JS
// Date round-trip, which truncates the stored microsecond precision to ms and
// would make this checkpoint delete ITSELF), and exclude this checkpoint's own
// id so it always survives — letting the user re-restore to it.
await sql`
DELETE FROM checkpoints
WHERE chat_id = ${cp.chat_id}
AND id <> ${cp.id}
AND created_at > (SELECT created_at FROM checkpoints WHERE id = ${cp.id})
`.catch(() => {});
return {
checkpoint_id: checkpointId,
messages_deleted: messagesDeleted,
worktree_reset: worktreeReset,
backend_reset: backendReset,
};
}

View File

@@ -0,0 +1,108 @@
/**
* v2.5.11: discover Claude Code's real, enabled commands + plugin skills from
* disk so the coder slash menu shows them (claude is PTY — no ACP discovery).
*
* Scope (v1): user-global only — `~/.claude/commands/*.md` plus the enabled
* plugins listed in `~/.claude/settings.json:enabledPlugins` (user-scope install
* paths from `~/.claude/plugins/.../installed_plugins.json`). Project-local
* plugins and `<cwd>/.claude/commands` are deferred. Names are bare.
*/
import { readFileSync, readdirSync, existsSync } from 'node:fs';
import { homedir } from 'node:os';
import { join } from 'node:path';
import type { AgentCommand } from './provider-types.js';
/** Minimal frontmatter reader — single-line `key: value` between `---` fences. */
function frontmatterField(content: string, field: string): string | undefined {
const block = content.match(/^---\r?\n([\s\S]*?)\r?\n---/);
if (!block?.[1]) return undefined;
const m = block[1].match(new RegExp(`^${field}:\\s*(.+)$`, 'm'));
return m?.[1]?.trim().replace(/^["']|["']$/g, '') || undefined;
}
function readCommandDir(dir: string): AgentCommand[] {
if (!existsSync(dir)) return [];
let files: string[];
try {
files = readdirSync(dir);
} catch {
return [];
}
const out: AgentCommand[] = [];
for (const f of files) {
if (!f.endsWith('.md')) continue;
let description: string | undefined;
try {
description = frontmatterField(readFileSync(join(dir, f), 'utf8'), 'description');
} catch {
/* unreadable — still list the command by name */
}
out.push({ name: f.slice(0, -3), kind: 'command', ...(description ? { description } : {}) });
}
return out;
}
function readSkillDir(dir: string): AgentCommand[] {
if (!existsSync(dir)) return [];
let entries: string[];
try {
entries = readdirSync(dir);
} catch {
return [];
}
const out: AgentCommand[] = [];
for (const sub of entries) {
const skillMd = join(dir, sub, 'SKILL.md');
if (!existsSync(skillMd)) continue;
let content: string;
try {
content = readFileSync(skillMd, 'utf8');
} catch {
continue;
}
out.push({
name: frontmatterField(content, 'name') ?? sub,
kind: 'skill',
...(() => {
const d = frontmatterField(content, 'description');
return d ? { description: d } : {};
})(),
});
}
return out;
}
export function discoverClaudeCommands(): AgentCommand[] {
const root = join(homedir(), '.claude');
const out: AgentCommand[] = [];
// User custom commands.
out.push(...readCommandDir(join(root, 'commands')));
// Enabled plugins (user-scope installs).
try {
const settings = JSON.parse(readFileSync(join(root, 'settings.json'), 'utf8')) as {
enabledPlugins?: Record<string, boolean>;
};
const installed = JSON.parse(
readFileSync(join(root, 'plugins', 'installed_plugins.json'), 'utf8'),
) as { plugins?: Record<string, Array<{ scope?: string; installPath?: string }>> };
const enabled = settings.enabledPlugins ?? {};
const plugins = installed.plugins ?? {};
for (const [key, on] of Object.entries(enabled)) {
if (!on) continue;
const installs = plugins[key] ?? [];
const installPath = (installs.find((i) => i.scope === 'user') ?? installs[0])?.installPath;
if (!installPath || !existsSync(installPath)) continue;
out.push(...readSkillDir(join(installPath, 'skills')));
out.push(...readCommandDir(join(installPath, 'commands')));
}
} catch {
/* missing/unreadable plugin config → user commands only */
}
// Dedupe by name (first wins).
const seen = new Set<string>();
return out.filter((c) => (seen.has(c.name) ? false : (seen.add(c.name), true)));
}

View File

@@ -0,0 +1,22 @@
/**
* v2.3 phase 2: tier-1 fast availability check — is a binary on PATH?
*
* Uses execFile (NO shell) because the binary name can come from the provider
* config file (custom ACP entries) — mirrors the Phase 1 agent-probe hardening.
* Note: agent-probe's `whichBinary` returns the resolved path (it needs it for
* `install_path`); this returns a boolean. Kept separate rather than over-
* refactored into one helper — different return contracts, two short call sites.
*/
import { execFile as execFileCb } from 'node:child_process';
import { promisify } from 'node:util';
const execFile = promisify(execFileCb);
export async function isCommandAvailable(binary: string): Promise<boolean> {
try {
const { stdout } = await execFile('which', [binary], { timeout: 10_000 });
return stdout.trim().length > 0;
} catch {
return false;
}
}

View File

@@ -1,39 +0,0 @@
/**
* Cursor model list parser — lifted from Paseo cursor-acp-agent.ts
*/
import type { ProviderModel } from './provider-types.js';
const CURSOR_MODEL_MARKER_PATTERN = /\s+\((?:default|current)\)$/;
export function parseCursorAgentModelsOutput(output: string): ProviderModel[] {
const parsed = output
.split(/\r?\n/)
.map((line) => line.trim())
.filter((line) => line && line !== 'Available models' && !line.startsWith('Tip:'))
.map((line) => {
const separatorIndex = line.indexOf(' - ');
if (separatorIndex <= 0) return null;
const id = line.slice(0, separatorIndex).trim();
const rawLabel = line.slice(separatorIndex + 3).trim();
if (!id || !rawLabel) return null;
let marker: 'default' | 'current' | null = null;
if (rawLabel.endsWith(' (default)')) marker = 'default';
else if (rawLabel.endsWith(' (current)')) marker = 'current';
return { id, label: rawLabel.replace(CURSOR_MODEL_MARKER_PATTERN, ''), marker };
})
.filter((m): m is { id: string; label: string; marker: 'default' | 'current' | null } => m !== null);
const defaultModelId =
parsed.find((m) => m.marker === 'default')?.id ??
parsed.find((m) => m.marker === 'current')?.id ??
parsed[0]?.id;
return parsed.map((model) => ({
id: model.id,
label: model.label,
isDefault: model.id === defaultModelId,
}));
}

View File

@@ -0,0 +1,77 @@
/**
* Strip opencode-dcp plugin tags (`<dcp-message-id>mNNNN</dcp-message-id>`) that
* the @tarquinen/opencode-dcp plugin appends to assistant text and which
* otherwise render as literal text in the UI.
*
* Why a streaming stripper and not a per-chunk `.replace()`: opencode streams
* assistant text token-by-token, so the tag arrives SPLIT across many SSE deltas
* (`<dcp`, `-message`, `-id>`, `m0019`, `</dcp`, …). A per-chunk regex never sees
* a complete tag in any single fragment, so the fragments pass through and the
* dispatcher reassembles the full tag in the persisted/displayed content. The
* stripper below buffers across chunks: it emits everything that cannot be part
* of a forming tag and holds back only a trailing partial-tag prefix until the
* next chunk resolves it — without holding back legitimate `<…>` content.
*/
const DCP_TAG_RE = /<dcp-message-id>[^<]*<\/dcp-message-id>/g;
const OPEN = '<dcp-message-id>';
const CLOSE = '</dcp-message-id>';
/** One-shot strip of COMPLETE tags. Safe for non-streaming / final content. */
export function stripDcpTags(s: string): string {
return s.replace(DCP_TAG_RE, '');
}
/**
* Could `tail` (a substring starting at a `<`) still grow into a complete dcp
* tag on a future chunk? If so the caller must hold it back rather than emit it.
* Returns false for unrelated `<` content (`<div>`, `<T>`, …) so those stream
* normally.
*/
function isPartialDcp(tail: string): boolean {
// A prefix of the opening marker: '<', '<d', …, '<dcp-message-id'.
if (OPEN.startsWith(tail)) return true;
// Opening marker fully seen — content (and maybe a forming close) still streaming.
if (tail.startsWith(OPEN)) {
const rest = tail.slice(OPEN.length);
const lt = rest.indexOf('<');
if (lt === -1) return true; // still inside the [^<]* content run
return CLOSE.startsWith(rest.slice(lt)); // a partial close marker forming
}
return false;
}
export interface DcpStreamStripper {
/** Feed one text chunk; returns the portion safe to emit now (may be ''). */
push(chunk: string): string;
/** Stream end: returns whatever was held back, with complete tags stripped. */
flush(): string;
}
/** Stateful, cross-chunk-safe dcp stripper. One instance per turn. */
export function makeDcpStreamStripper(): DcpStreamStripper {
let buf = '';
return {
push(chunk: string): string {
buf += chunk;
buf = buf.replace(DCP_TAG_RE, ''); // drop any now-complete tags
// Find the earliest `<` whose suffix is a forming dcp tag; hold from there,
// emit everything before it (real text, including unrelated `<…>`).
for (let i = buf.indexOf('<'); i !== -1; i = buf.indexOf('<', i + 1)) {
if (isPartialDcp(buf.slice(i))) {
const emit = buf.slice(0, i);
buf = buf.slice(i);
return emit;
}
}
const emit = buf;
buf = '';
return emit;
},
flush(): string {
const out = stripDcpTags(buf);
buf = '';
return out;
},
};
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,76 @@
import type { Sql } from '../db.js';
import type { WsFrame } from '@boocode/contracts/ws-frames';
export type TerminalMessageStatus = 'cancelled' | 'failed';
/**
* F1 (D-7) — decide the terminal status a Stop'd / errored external turn lands in.
*
* A user Stop (the per-task AbortController fired) or a thrown `AbortError` is a
* deliberate, non-error outcome → `'cancelled'`. A genuine thrown error → `'failed'`.
* Keeping the two distinct keeps the human-inbox / failure surfaces honest.
*
* Pure (no DB / IO) so the mapping is unit-testable in isolation.
*/
export function classifyTerminalStatus(opts: { aborted: boolean; error?: unknown }): TerminalMessageStatus {
if (opts.aborted) return 'cancelled';
if (opts.error instanceof Error && opts.error.name === 'AbortError') return 'cancelled';
return 'failed';
}
/**
* F1 (OCE-001 / OCE-002) — finalize a streaming assistant message into a terminal
* state and publish the matching `message_complete` frame.
*
* Idempotent via `WHERE status = 'streaming'`: a second call (a double-Stop, or an
* abort short-circuit followed by the catch block) updates zero rows and does NOT
* re-publish, so the frontend reducer settles the message exactly once. It also
* never clobbers a row that already finished cleanly (`complete`) — the abort that
* raced a clean finish is a no-op.
*
* Returns `true` iff this call performed the finalization (the row was still
* streaming); `false` if it was already terminal or the id is absent (the throw
* preceded the row's creation).
*/
export async function finalizeStreamingMessage(
sql: Sql,
publishFrame: (sessionId: string, frame: WsFrame) => void,
opts: {
sessionId: string;
chatId: string;
assistantId: string;
status: TerminalMessageStatus;
model: string | null;
/** Partial accumulated text to persist; omit to leave the row's content untouched. */
content?: string;
},
): Promise<boolean> {
const { sessionId, chatId, assistantId, status, model, content } = opts;
if (!assistantId) return false;
const rows =
content !== undefined
? await sql<{ id: string }[]>`
UPDATE messages
SET content = ${content}, status = ${status}, finished_at = clock_timestamp()
WHERE id = ${assistantId} AND status = 'streaming'
RETURNING id
`
: await sql<{ id: string }[]>`
UPDATE messages
SET status = ${status}, finished_at = clock_timestamp()
WHERE id = ${assistantId} AND status = 'streaming'
RETURNING id
`;
if (rows.length === 0) return false;
publishFrame(sessionId, {
type: 'message_complete',
message_id: assistantId,
chat_id: chatId,
model,
status,
} as WsFrame);
return true;
}

Some files were not shown because too many files have changed in this diff Show More