Compare commits

22 Commits

Author SHA1 Message Date
cc4bd04aa4 Merge contracts-ssot-pkg: v2.7.13 single-source cross-app wire contracts in @boocode/contracts 2026-06-02 21:24:14 +00:00
649ce71eff feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace
package, @boocode/contracts, consumed by server/web/coder/coder-web via
workspace:* + a per-subpath exports map. The ws-frames and provider-config
Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason,
AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are
each single-sourced. Deletes the byte-identical copies and their parity
tests, fixes a live AgentSessionConfig drift (coder dead copy removed,
unified to the web required/nullable shape), removes the dead pending_change
WS arms in the fallback SPA, and inverts the build order (contracts builds
first) across root build, Dockerfile, and the coder deploy docs. Reverses
the shared-package decision declined in v2.5.12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:08 +00:00
2a05d2f9fe docs: archive shipped openspec batches; add feature/plan/research notes
Move 13 shipped openspec change docs under openspec/changes/archived/.
Add docs/features/git-diff-panel, docs/plans/post-review-backlog, and
docs/research/cross-app-contract-ssot.md (the research behind the
@boocode/contracts SSOT work). Update BOOCHAT.md, BOOCODER.md, and
boocode_roadmap.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:20:33 +00:00
e5ce01ae72 fix(coder): include model in WS snapshot SELECT so the attribution chip survives refresh
CoderPane hydrates from the HTTP listMessages fetch (SELECT has model) AND the WS snapshot frame, and the snapshot handler setMessages-overwrites the HTTP load. The snapshot query in apps/coder/src/routes/ws.ts had its own column list that omitted model, so on coder refresh the chip's model was lost (it showed live via the message_complete frame). One-column fix: add model to that SELECT. CLAUDE.md mapper-chain note updated to list the WS snapshot SELECT.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 18:03:10 +00:00
81470f5a77 Merge composer-chips: v2.7.10 composer attach-file button + slash-commands chip (icon-only on mobile)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:27:59 +00:00
35dba828e1 feat: composer attach-file button + slash-commands chip (icon-only on mobile)
Move the slash-commands menu out of the full-width AgentCommandsHint disclosure into a compact chip in the composer's bottom controls row, and add an attach-file button that reuses the existing drag-drop pipeline (5MB/binary gate, 10-attachment cap, chips + preview). On mobile both collapse to icon-only (count hidden). Shared ChatInput, so it applies to both BooChat and BooCoder; typed-/ autocomplete is unchanged. Removes the now-unused AgentCommandsHint component.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:26:27 +00:00
ce621bc003 Merge mcp-env-keys-batch: v2.7.9 MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor 2026-06-02 17:01:11 +00:00
afaca9e426 feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)
- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points — a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:01:03 +00:00
7ca4a6b344 chore: prune unused brand PNGs (keep banner-mascot + banner-wordmark)
Removes boo-badge / boocode-icon / boocode-wordmark / boocode-wordmark-tight —
copied from the design bundle but unreferenced; only the two banner badges are
imported (ProjectSidebar).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 23:10:12 +00:00
27f3a6c463 Merge boocode-ui-ember-coder-model: v2.7.8 Ember theme + brand banner + coder tabs + model-attribution chips 2026-06-01 22:30:58 +00:00
3a646fd6df feat: BooCode 2.0 UI — Ember theme, brand banner, coder tabs, model-attribution chips
- Ember theme (Obsidian charcoal + #ff7a18 orange), now DEFAULT_THEME_ID; server theme_id whitelist gains 'ember'
- Brand banner: transparent Westie mascot + >_BooCode wordmark, big/edge-to-edge (flood-filled to transparency + cropped)
- Coder panes are multi-tab: + opens a BooCode tab, split opens a pane (shared ChatTabBar via tabKind + createCoderTab; closeOtherTabs/tab-numbering extended to coder)
- Model-attribution: new messages.model column stamped at finalizeCompletion (BooChat/native coder) + dispatcher assistant-row creation (external coder); surfaced via view + wire types + live frame; rendered as a subtle shortened-name chip (shortenModelName)
- Composer Web toggle moved into a boxed focus-ringed input; glowing accent dot on tool rows
- Claude SDK follow-ups (1M context, follow-up-message fix, collapsed thinking/tool chips) + CLAUDE_SDK_BACKEND=1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:30:47 +00:00
7098014261 Merge pane-header-shared: v2.7.7 shared pane-header cluster + chat-resolve WorkspaceState fix 2026-06-01 14:29:00 +00:00
c56d169ef9 feat: shared PaneHeaderActions + chat-resolve WorkspaceState fix (v2.7.7)
In-flight workspace UX work.

- Extract a shared PaneHeaderActions cluster (+/Split/Reopen/History/Close)
  used by ChatTabBar + the Workspace coder/terminal pane headers, replacing the
  divergent per-header copies; SessionLandingPage history + useWorkspacePanes
  tweaks.
- Fix coder-side correctness bug: resolveChatId read sessions.workspace_panes as
  a bare WorkspacePane[] but v2.6.5 widened it to a WorkspaceState envelope, so
  it mis-read panes and clobbered tabNumbers/nextTabNumber/closedPaneStack on
  every pane-chat write. New normalizeWorkspaceState handles either shape and
  preserves the envelope (+ regression test).
- CLAUDE.md doc-sync (coder vitest suite, deploy-by-surface, dual-remote push,
  in-flight-web-WIP staging, release-branch naming).

Web tsc + coder build + coder tests green. Builds on v2.7.6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:28:49 +00:00
b7fb254e5d Merge agent-status-dot: v2.7.6 normalized external-agent status (scoped #10) 2026-06-01 14:04:26 +00:00
59cf082e06 feat: normalized external-agent status (#10 scoped) (v2.7.6)
Scoped half of boocode_code_review_v2 §1 #10 — publish the agent status
BooCoder already observes (the config-injection notify-hook is the documented
follow-on, clean-room from superset ELv2).

- agent_status_updated WS frame (working|blocked|idle|error), server+web parity.
- Published from the dispatcher's turn boundaries (warm-acp/opencode/sdk/pty:
  working at start, idle/error at end) + the permission flow (blocked/working).
  Best-effort, never breaks a turn.
- Clean-room normalizeAgentEvent helper (superset's vendor-event -> Start/blocked
  /Stop collapse, event names as facts) + 25 tests — reused by the follow-on.
- AgentComposerBar status dot (distinct from the WS-liveness dot), tracked per
  (chat,agent) by a useAgentStatus map in CoderPane.

Built by 2 parallel agents vs a pinned frame contract. Server 545 + coder 294
tests passing (25 new); web tsc + builds clean; ws-frames parity green. Clears
the actionable review backlog (#1/#3/#4/#6-#12). Builds on v2.7.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:04:04 +00:00
6fc3175730 Merge claude-sdk-backend: v2.7.5 Claude SDK backend + clean-room PostgresSessionStore 2026-06-01 13:38:05 +00:00
f3a0197d6a feat: Claude Agent SDK backend + clean-room PostgresSessionStore (v2.7.5)
Lands the lean-SDK direction (boocode_code_review_v2 §1 #9) behind a flag.
Adds @anthropic-ai/claude-agent-sdk@0.3.159 (Commercial Terms, runtime dep).

- PostgresSessionStore: clean-room impl of the SDK's real SessionStore type
  over a new claude_session_entries table. Typechecks against the SDK type;
  8 DB-integration tests.
- ClaudeSdkBackend (implements AgentBackend): one warm query() per (chat,claude)
  in streaming-input mode via a pushable async-iterable pump, sessionStore +
  resume continuity, pure mapSdkMessage->AgentEvent, session_id from init,
  usage/cost onto agent_sessions (backend CHECK gains 'claude_sdk').
- Routing env-gated by CLAUDE_SDK_BACKEND (default off) -> PTY path UNCHANGED.
- Built against real SDK 0.3.159 types (install paid off: partial=stream_event
  needing includePartialMessages, MessageParam, result error arm).
- Fix latent test-infra deadlock: serialize DB suites (fileParallelism:false).

Coder 269 passing default / 290 with DB; tsc clean vs SDK types; builds clean.
LIVE pump + resume + actual claude turn need a host smoke (CLAUDE_SDK_BACKEND=1
+ claude binary + auth). zod peer-dep wants ^4 (workspace 3.25). Builds on v2.7.4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:37:57 +00:00
7e0ecde83d Merge mistake-tracker-ledger: v2.7.4 heterogeneous-failure recovery + file-read ledger 2026-06-01 13:05:19 +00:00
bcc89d8adc feat: MistakeTracker + file-provenance ledger (v2.7.4)
Two native-inference hardening features from boocode_code_review_v2 §1 #12.

MistakeTracker: new pure mistake-tracker.ts tracks consecutive heterogeneous
tool failures (kinds surfaced per tool from tool-phase.ts). On 3 in a row the
turn loop soft-nudges (model-facing recovery guidance + mistake_recovery
sentinel + reset), then escalates to stopping the turn (cap-hit-style, Continue
affordance) on a re-trip. Complements doom-loop (identical repeats) + cap-hit.

File-provenance ledger: compaction.ts derives a deterministic ## Files Read list
from the head messages' read-tool calls and injects it into the rolling-summary
prompt so provenance survives compaction (no new table; read-only).
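
As a rough illustration of the ledger idea described above, the sketch below derives a sorted, de-duplicated `## Files Read` block from a list of tool calls. The `ToolCall` shape, the `path` argument name, and the `buildFilesReadSection` helper are hypothetical; the read-tool names are the ones listed in the AGENTS.md note elsewhere on this page. It is not the code in `compaction.ts`.

```typescript
// Hypothetical tool-call shape; the real message format is not shown here.
interface ToolCall {
  tool: string;
  args: { path?: string };
}

// Read-only tools whose paths feed the ledger (names from the AGENTS.md note).
const READ_TOOLS = new Set(["view_file", "grep", "find_files", "list_dir"]);

// Merge previously recorded paths with paths read in the head messages,
// then emit a deterministic (sorted, de-duplicated) Markdown section.
function buildFilesReadSection(calls: ToolCall[], previous: string[] = []): string {
  const paths = new Set<string>(previous);
  for (const call of calls) {
    if (READ_TOOLS.has(call.tool) && call.args.path) {
      paths.add(call.args.path);
    }
  }
  const sorted = Array.from(paths).sort();
  if (sorted.length === 0) return "";
  return ["## Files Read", ...sorted.map((p) => `- ${p}`)].join("\n");
}
```

Sorting makes the section independent of call order, so the same reads should always produce the same prompt text.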

mistake_recovery sentinel: MessageMetadata arm (server + web) + MessageBubble
render branch. Built by 2 parallel agents. Server 545 tests passing (23 new);
build + web tsc clean. Native-inference only. Builds on v2.7.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:05:03 +00:00
f53d6a8afd Merge sampling-knobs-streamjson: v2.7.3 sampling knobs + live PTY stream-json + token UI 2026-06-01 12:47:31 +00:00
a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.
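
A minimal sketch of the line-buffering step, assuming the PTY delivers arbitrary string chunks that may split a JSON object across reads. The `NdjsonLineBuffer` class and its return shape are illustrative only and are not taken from `stream-json-parser.ts`; lines that fail to parse are handed back as raw text, loosely mirroring the fallback mentioned above.

```typescript
// Accumulates chunks and yields one parsed value per complete NDJSON line.
class NdjsonLineBuffer {
  private pending = "";

  push(chunk: string): { events: unknown[]; opaque: string[] } {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" after a trailing newline).
    this.pending = lines.pop() ?? "";

    const events: unknown[] = [];
    const opaque: string[] = [];
    for (const raw of lines) {
      const line = raw.trim();
      if (line === "") continue;
      try {
        events.push(JSON.parse(line));
      } catch {
        // Not JSON: keep the raw text so the caller can fall back to it.
        opaque.push(raw);
      }
    }
    return { events, opaque };
  }
}
```

A caller would map each parsed event to a text, reasoning, or tool frame; that mapping depends on the agent's schema and is left out here.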

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00
5651f56039 Merge checkpoint-idor-fix: v2.7.2 close 2 checkpoint IDOR holes 2026-06-01 12:16:08 +00:00
153 changed files with 9267 additions and 1658 deletions


@@ -11,6 +11,10 @@ POSTGRES_PASSWORD=CHANGE_ME
# point BooCode at a different SearXNG instance.
SEARXNG_URL=http://100.114.205.53:8888
# Context7 MCP key. Referenced from data/mcp.json as "{env:CONTEXT7_API_KEY}"
# ({env:VAR} substitution, opencode-compatible). Leave unset to send no key.
# CONTEXT7_API_KEY=ctx7sk-...
# Task model: lightweight model for auto-naming, search rewrite, etc.
# Direct llama-server instance (NOT llama-swap). Falls back to LLAMA_SWAP_URL
# with FAST_MODEL when unset.

2
.gitignore vendored

@@ -15,6 +15,6 @@ secrets/
data/*
!data/AGENTS.md
!data/skills/
-!data/mcp.json
+!data/mcp.example.json
!data/coder-providers.example.json
codecontext/fork.tar.gz


@@ -28,6 +28,11 @@
- Prefer codecontext (`search_symbols`, `get_symbol_info`, `get_dependencies`) over `grep` for symbol-level questions. Fall back to `grep` / `view_file` when codecontext returns degraded or empty results — that signals an unsupported language or parse failure.
- Verify before reporting work complete: run the relevant test/build/smoke command and confirm output matches the claim. Evidence first, assertion second.
## Recovery and context (v2.7)
- **Heed the recovery nudge.** Native inference tracks consecutive tool **failures** (`mistake-tracker.ts`): after 3 in a row with no successful step between, a `mistake_recovery` sentinel is injected telling you to re-read tool schemas, verify a path exists before acting, and try a *different* approach — not retry variations of the same failing call. Ignoring it (a second failure run with the nudge still outstanding) **escalates and stops the turn** to protect the step budget. This complements the doom-loop guard, which only catches *identical* repeats.
- **Files-read provenance survives compaction.** Paths you read via `view_file` / `grep` / `find_files` / `list_dir` are accumulated and merged into a cumulative `## Files Read` ledger in the rolling summary, so a file read long ago stays in context across compactions. You don't manage this — but it means you usually don't need to re-read a file just because the raw turn scrolled out of the window.
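
The recovery nudge in the first bullet can be sketched as a small state machine. This is an assumption-laden illustration, not the contents of `mistake-tracker.ts`: the `MistakeTracker` class shape, the `record` method, and the rule that a successful step clears an outstanding nudge are all hypothetical. Only the threshold of 3 and the nudge-then-stop escalation come from the text above.

```typescript
type TrackerAction = "continue" | "nudge" | "stop";

class MistakeTracker {
  private consecutiveFailures = 0;
  private nudgeOutstanding = false;
  private readonly threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Record one tool result and report what the turn loop should do next.
  record(ok: boolean): TrackerAction {
    if (ok) {
      // Assumption: any successful step ends the failure run and clears the nudge.
      this.consecutiveFailures = 0;
      this.nudgeOutstanding = false;
      return "continue";
    }
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures < this.threshold) return "continue";

    // Threshold reached: reset the counter, then nudge once or escalate.
    this.consecutiveFailures = 0;
    if (this.nudgeOutstanding) return "stop";
    this.nudgeOutstanding = true;
    return "nudge";
  }
}
```

In this sketch, three failures produce `"nudge"` and three more without an intervening success produce `"stop"`.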
## Output format
- Stay in Markdown by default for every reply, short or long.


@@ -23,6 +23,8 @@ You are BooCoder, a write-capable coding agent. You can read AND modify files wi
Every file modification queues in `pending_changes` before touching disk. The user sees a diff preview and approves/rejects each change. Never bypass this queue — it is the safety boundary between inference and the filesystem.
`edit_file`'s `old_string` match is **fuzzy** (`fuzzy-match.ts`, v2.7.1): an exact → per-line-whitespace → unicode-canonicalization (curly quotes/dashes/nbsp) → Levenshtein-≥0.66 ladder, so minor whitespace/indentation/unicode drift in `old_string` still lands on the right span. Two consequences: a near-miss `old_string` may still apply (verify the queued diff is what you intended), and an `old_string` matching **more than one** place is rejected as **ambiguous** rather than editing the first — add surrounding context to disambiguate. A genuine non-match returns a clear failure, not a thrown error.
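
To make the ladder concrete, here is a simplified sketch that matches whole-line windows only. It is not the code in `fuzzy-match.ts`. The `findSpan` name, the line-window strategy, and the similarity formula (1 minus edit distance over the longer length) are assumptions made for illustration; only the rung order, the 0.66 threshold, and the ambiguity rule come from the paragraph above.

```typescript
type MatchResult =
  | { kind: "match"; startLine: number; rung: string }
  | { kind: "ambiguous"; count: number; rung: string }
  | { kind: "none" };

interface Rung {
  name: string;
  normalize: (line: string) => string;
  fuzzy: boolean;
}

// Classic two-row Levenshtein edit distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const cur: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur.push(Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = cur;
  }
  return prev[b.length];
}

// Assumed similarity measure: 1.0 for identical strings, 0.0 for disjoint ones.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

const squashWhitespace = (s: string): string => s.trim().replace(/\s+/g, " ");

// Curly quotes, en/em dashes, and non-breaking spaces mapped to ASCII.
const canonicalize = (s: string): string =>
  s
    .replace(/[\u2018\u2019]/g, "'")
    .replace(/[\u201C\u201D]/g, '"')
    .replace(/[\u2013\u2014]/g, "-")
    .replace(/\u00A0/g, " ");

const RUNGS: Rung[] = [
  { name: "exact", normalize: (line) => line, fuzzy: false },
  { name: "whitespace", normalize: squashWhitespace, fuzzy: false },
  { name: "unicode", normalize: (line) => squashWhitespace(canonicalize(line)), fuzzy: false },
  { name: "levenshtein", normalize: (line) => squashWhitespace(canonicalize(line)), fuzzy: true },
];

// Try each rung in order; stop at the first rung that yields any candidate.
function findSpan(haystack: string, needle: string, threshold: number = 0.66): MatchResult {
  const hay = haystack.split("\n");
  const want = needle.split("\n");
  for (const rung of RUNGS) {
    const target = want.map((line) => rung.normalize(line)).join("\n");
    const starts: number[] = [];
    for (let i = 0; i + want.length <= hay.length; i++) {
      const candidate = hay
        .slice(i, i + want.length)
        .map((line) => rung.normalize(line))
        .join("\n");
      const hit = rung.fuzzy ? similarity(candidate, target) >= threshold : candidate === target;
      if (hit) starts.push(i);
    }
    if (starts.length === 1) return { kind: "match", startLine: starts[0], rung: rung.name };
    if (starts.length > 1) return { kind: "ambiguous", count: starts.length, rung: rung.name };
  }
  return { kind: "none" };
}
```

Returning `ambiguous` instead of picking the first hit is what forces the caller to add surrounding context. A fuzzy threshold this low is permissive for short needles, which is one reason to check the queued diff.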
## Behavior
- Show diffs clearly. Explain what you're changing and why.
@@ -102,7 +104,7 @@ Either way, **adding to config does NOT install the binary.** Until the CLI is o
### Deploy + smoke
Two deploy targets:
-- **Routes (host service):** `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
+- **Routes (host service):** `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`
- **Web UI (container):** `docker compose up --build -d boocode`
Green gate (verified across phases 15): `pnpm -C apps/coder test` (134 passing) `&& pnpm -C apps/coder build`.
@@ -115,3 +117,35 @@ curl http://100.114.205.53:9500/api/coder/providers/config # raw config, throu
# Settings → Providers: disable goose → it leaves the composer picker, stays in the tab
# POST refresh → models repopulate; Add a catalog entry → it appears after refresh (unavailable until its CLI is installed)
```
## Persistent agent sessions (v2.6)
When you `dispatch_external_agent` to a chat-tab provider, BooCoder keeps that agent **warm and resumable** instead of spawning a fresh process per turn. This is mostly transparent — but the model below explains why turn 2 is fast, why an external agent remembers earlier turns, and how edits flow.
### Backends and keying
- One live backend per **`(chat_id, agent)`** pair, owned by the `agent-pool` (`agent-pool.ts`). State lives in `agent_sessions` (the resumable session id) and `worktrees` (the per-chat working copy).
- **opencode** runs a long-lived `opencode serve` (`backends/opencode-server.ts`) with per-session SSE; turns after the first reuse the same session (memory intact, ~9× faster).
- **goose / qwen** run a warm ACP connection (`backends/warm-acp.ts`) — `initialize` + `session/new` once per `(chat,agent)`, then `session/prompt` per turn. Interrupt cancels the prompt (`session/cancel`), never the child.
- **claude** runs the Claude Agent SDK backend (`backends/claude-sdk.ts`) over a clean-room Postgres session store.
- Arena, MCP `new_task`, and one-shot dispatches still use the cold `runExternalAgent` path — warm reuse needs both a `session_id` and a `chat_id`.
### Worktrees
- External agents write **directly into a persistent per-chat worktree** (`/tmp/booworktrees/sess-<id>`), not into the project root via `pending_changes`. The worktree is created once, base commit captured, and **reused across turns and across agents in the same chat** — so opencode and goose in one chat share one worktree.
- Each turn's worktree diff supersedes the prior `pending_changes` row for that `(chat,agent)` (latest-wins) and is badged with the authoring agent in the DiffPanel.
- **Staging boundary:** a provider only sees another agent's edits once they are **applied**. Unapplied worktree edits from a different agent are invisible to you — the DiffPanel shows a muted hint when that's the case.
### Lifecycle (v2.6.10–v2.6.11)
- **Idle eviction:** a backend idle past `AGENT_POOL_IDLE_TTL_MS` (default 30 min) is disposed; an LRU cap of `AGENT_POOL_MAX_LIVE` (default 10) bounds live backends. A busy backend is never evicted, and the next turn transparently re-attaches or re-creates from `agent_sessions`/`worktrees`.
- **Crash recovery:** a health monitor restarts a crashed server (opencode → fresh sessions; ACP → re-`session/new`) and reclaims its port.
- **Close cleanup:** closing/deleting a chat or session evicts its backends, archives the `worktrees` row, and removes the worktree. An hourly reaper sweeps orphaned worktrees (dirty/unpushed preflight before removal).
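
The idle-eviction rule in the first bullet might look roughly like the selection function below. The `PoolEntry` shape, the string key standing in for a `(chat_id, agent)` pair, and the `selectEvictions` helper are hypothetical; the defaults mirror the documented 30-minute TTL and cap of 10, and busy entries are never chosen. How the real `agent-pool.ts` behaves when every live backend is busy is not stated, so this sketch simply lets the pool stay over the cap in that case.

```typescript
interface PoolEntry {
  key: string; // e.g. "chatId:agent" (hypothetical encoding of the pair)
  lastUsedMs: number;
  busy: boolean;
}

// Return the keys to dispose: idle-expired entries first, then the least
// recently used idle entries until the pool fits under the live cap.
function selectEvictions(
  entries: PoolEntry[],
  nowMs: number,
  idleTtlMs: number = 30 * 60 * 1000,
  maxLive: number = 10,
): string[] {
  const evict = new Set<string>();

  for (const entry of entries) {
    if (!entry.busy && nowMs - entry.lastUsedMs > idleTtlMs) evict.add(entry.key);
  }

  const survivors = entries.filter((entry) => !evict.has(entry.key));
  let over = survivors.length - maxLive;
  const idleOldestFirst = survivors
    .filter((entry) => !entry.busy)
    .sort((a, b) => a.lastUsedMs - b.lastUsedMs);

  for (const entry of idleOldestFirst) {
    if (over <= 0) break;
    evict.add(entry.key);
    over -= 1;
  }
  return Array.from(evict);
}
```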
### Checkpoints (v2.7.1)
Because external agents write the worktree directly (outside `pending_changes`), a worktree **checkpoint** is shadow-committed before each external-agent turn (tracked + untracked, into `refs/boocode/checkpoints/<id>`), anchored to that turn's assistant message. The per-message **"Restore to here"** affordance resets the worktree (`reset --hard` + `clean -fd`), trims the transcript past that message, and resets the `(chat,agent)` backend session — so files, transcript, and agent context land consistent at the restore point. `rewind` still only reverses BooCoder's own applied `pending_changes`; checkpoints are what cover external-agent worktree edits.
### Normalized status (v2.6 / v2.7.6)
Turn boundaries publish a normalized per-`(chat,agent)` status — `working | blocked | idle | error` — to the UI (`agent_status_updated` frame), so blocked-on-permission and crash/idle are visible, not just WS liveness.


@@ -2,6 +2,46 @@
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
## v2.7.13-contracts-ssot — 2026-06-02
Creates `@boocode/contracts` (`packages/contracts`), a new workspace package that becomes the single source of truth for every cross-app wire contract — reversing the decision recorded in `v2.5.12-provider-lifecycle-phase4` that declined a shared types package as not worth the Docker/build-order risk at solo scale; a live `AgentSessionConfig` drift that had since appeared between `apps/coder` and `apps/web` justified the investment. Six contracts are now defined exactly once: the `WsFrameSchema` Zod runtime schema, the provider snapshot types (`ProviderSnapshotEntry` and family), the Zod provider-config schemas, `MessageMetadata` + `ErrorReason`, `AgentSessionConfig`, and `WorktreeRiskReport`; both Zod-backed contracts use `z.infer` so validator and type derive from the same definition and cannot drift independently. All four consumers — `apps/server`, `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web` — import via `workspace:*` through a per-subpath exports map consuming built dist only (no tsconfig project references); the hand-synced copies and their parity tests (`provider-types-parity.test.ts`; the ws-frames byte-parity assertion) are deleted while the KNOWN_FRAME_TYPES drift test and broker fail-closed tests are preserved. Build order is inverted in the root build script, Dockerfile, and coder deploy docs; `apps/coder/web`'s migration also removed dead `pending_change_*` reducer arms (no frame publisher exists for these — pending changes are HTTP-delivered), closing a latent missing-default-arm crash, and reconciled field-type conflicts with the canonical `WsFrame`; zod is pinned to a single version across the workspace. Server 543 / coder 293 / contracts 11 tests passing; human smoke verified on the live stack 2026-06-02.
## v2.7.11-coder-model-snapshot — 2026-06-02
Hotfix for the coder model-attribution chip vanishing on refresh. The chip showed during a live turn (the `message_complete` frame carries `model`) but disappeared when a BooCoder session was reloaded — only in the coder, not BooChat. Root cause: `CoderPane`'s `useCoderMessages` hydrates from two sources on load — the HTTP `listMessages` fetch (whose SELECT includes `model`, added `v2.7.8`) AND the WS `snapshot` frame — and the WS snapshot's query in `apps/coder/src/routes/ws.ts` had its own column list that omitted `model`. The client's `snapshot` handler `setMessages`-overwrites the HTTP load, so the model-less rows won, and with no later `message_complete` for historical messages the chip stayed gone. Fix is one column: add `model` to the WS snapshot SELECT so both hydration paths agree. The `apps/coder/CLAUDE.md` "update every mapper" note now lists the WS snapshot SELECT explicitly (it was the one place not enumerated). apps/server + apps/coder builds green; deployed via `systemctl restart boocoder` (host service — the earlier `v2.7.10` docker deploy rebuilt only the container, never this route). Fixes the chip shipped in `v2.7.8-ember-coder-tabs-model-chips` / completed in `v2.7.9-mcp-keys-docs-coder-fixes`.
## v2.7.10-composer-chips — 2026-06-02
A composer control-row refresh shared by BooChat and BooCoder via `ChatInput`. The slash-commands menu moves out of the full-width `AgentCommandsHint` disclosure (now removed) into a compact chip in the message box's bottom controls row — clicking it opens the existing `SlashCommandPicker` anchored to the chip and selecting inserts `/<name> `, while the typed-`/` autocomplete is unchanged. A new attach-file button sits beside it, opening a native multi-file picker that funnels picks through the same drag-drop pipeline (5 MB / binary gate, 10-attachment cap, chips + preview, `source:'drop'`). On mobile both collapse to icon-only — the slash count is `max-md:hidden` and the paperclip is icon-only — so the row stays on one line per the no-scroll toolbar rule. Web tsc + build green; deployed (docker). Builds on the BooCode 2.0 composer work in `v2.7.8-ember-coder-tabs-model-chips`.
## v2.7.9-mcp-keys-docs-coder-fixes — 2026-06-02
The MCP-key hygiene feature plus accumulated in-flight coder fixes and a docs refactor. **MCP `{env:VAR}` substitution** (`mcp-config.ts:substituteEnvVars`, opencode-compatible) recursively resolves `{env:NAME}` references in any string value of `data/mcp.json` from `process.env` *before* Zod validation, so real keys live in `.env` (`env_file`) instead of the gitignored config — an unset var resolves to `''` with a boot-log warning, and on a validation failure the loader names the unset vars alongside the field errors (an empty `{env:VAR}` in a strict url/command field invalidates the whole config, an otherwise-disconnected warning). `data/mcp.json` is now untracked (`.gitignore` flips `!data/mcp.json` → `!data/mcp.example.json`); the tracked template `data/mcp.example.json` carries `"CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"` and `.env.example` documents the key (9 mcp-config tests). **Two coder bug fixes** ride along: the `message_complete` frame's `model` is widened `string` → `string | null` in both ws-frames copies (server + web parity) and the dispatcher now publishes `model: task.model` at all four external assistant-completion points — without the nullable widen a null model would fail-closed in `publishFrame` and drop the entire frame including the `status:'complete'` transition (regression test added); and Claude-SDK `mapUserToolResults` now maps `user`-message `tool_result` blocks → terminal `tool_update` events (completed/failed with output) so external-agent tool snapshots resolve instead of spinning forever (the SDK feeds tool output back as a user message, previously unmapped).
On the view side the `AgentComposerBar` drops the §9b resumed/history/new-session chip and token-usage readout and loses `flex-wrap` so the control row stays on one line, while `CoderPane` gains a per-chat `localStorage` agent-config cache (provider/model/mode/thinking keyed by chat id, restoring the last model on reopen) and threads the new `model` field into the timeline + attribution chip. **Docs refactor**: the root `CLAUDE.md` is slimmed (~190 lines) with per-app deep references split into `apps/{coder,server,web}/CLAUDE.md` (auto-loaded in-subtree), plus a new 372-line `docs/coder-backends.md` dispatch reference, a `docs/project-discovery.md` stack inventory, and a `docs/coding-standards/` set (the `cross-app-contract-parity` standard, fronted by `.claude/rules` path-scoped indexes) — `ARCHITECTURE.md` links the backends doc. Server 555 + coder 299 tests passing (incl. new mcp-config, ws-frames, and claude-sdk-map suites), web tsc + server + coder builds green. Builds on `v2.7.8-ember-coder-tabs-model-chips`.
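
The `{env:VAR}` substitution described in this entry can be pictured with the sketch below. It is a from-scratch illustration and not the `substituteEnvVars` in `mcp-config.ts`: the `Json` type, the returned `unset` list, and passing `env` as a parameter (rather than reading `process.env` directly) are choices made here so the example stays self-contained. It does follow the stated rules: recurse through every string value, and resolve an unset variable to `''` while recording its name.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface SubstitutionResult {
  value: Json;
  unset: string[]; // variable names that were referenced but not defined
}

const ENV_REF = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;

function substituteEnvVars(
  input: Json,
  env: Record<string, string | undefined>,
): SubstitutionResult {
  const unset = new Set<string>();

  const walk = (node: Json): Json => {
    if (typeof node === "string") {
      return node.replace(ENV_REF, (_match: string, name: string) => {
        const resolved = env[name];
        if (resolved === undefined) {
          unset.add(name); // caller can log a boot warning naming these
          return "";
        }
        return resolved;
      });
    }
    if (Array.isArray(node)) return node.map(walk);
    if (node !== null && typeof node === "object") {
      const out: { [key: string]: Json } = {};
      for (const [key, child] of Object.entries(node)) out[key] = walk(child);
      return out;
    }
    return node; // numbers, booleans, null pass through untouched
  };

  return { value: walk(input), unset: Array.from(unset) };
}
```

Running substitution before schema validation means an empty result in a strict field surfaces as a validation error, which is why returning the `unset` names alongside the value is useful for diagnostics.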
## v2.7.8-ember-coder-tabs-model-chips — 2026-06-01
The BooCode 2.0 visual identity plus two workflow features. **Ember theme** (`styles/themes/ember.css`, now `DEFAULT_THEME_ID`) is the signature orange-on-near-black look — rebuilt on Obsidian's flat charcoal structure (`#0c0c0e`/`#15151a`/`#1f1f23`) with `#ff7a18` swapped in for the purple, after a Reinvented-direction detour (neon borders + a scanline/glow texture overlay) was dialed back to taste; the server `theme_id` whitelist gains `ember` so it can actually be selected. The **brand banner** (`ProjectSidebar`) shows the eye-patch Westie mascot + the `>_BooCode` wordmark big and edge-to-edge on transparent backgrounds — the source PNGs shipped with baked-white canvases, so they were flood-filled to transparency from the corners (preserving the white dog, which a naive white-key would have destroyed) and cropped to bounds. **Coder panes are now multi-tab**: `+` opens a new BooCode tab (a fresh chat = a new agent context sharing the session worktree) while the split button still opens a pane — coder panes reuse the shared `ChatTabBar` via a kind-aware `tabKind`, backed by a new `createCoderTab` action with `closeOtherTabs`/tab-numbering extended to coder kind. **Model-attribution chips**: a new `messages.model` column (both apps share the table) stamped at `finalizeCompletion` (BooChat + native coder) and at the dispatcher's assistant-row creation (external coder), surfaced through the `messages_with_parts` view + wire types + the live `message_complete` frame (the Zod already allowed `model`; nothing consumed it), and rendered as a subtle accent chip with a shortened label (`shortenModelName` → `Sonnet 4.6`, `Qwen3.6 35B`) beside the message stats — so swapping models mid-coder-session stays legible.
Also the composer moved its Web toggle into a boxed, focus-ringed input, tool rows lead with a glowing accent dot, and the Claude-SDK-backend follow-ups validated live this session (1M context window, follow-up-message fix, collapsed thinking/tool chips) land with `CLAUDE_SDK_BACKEND=1` flipped on. One snag fixed mid-deploy: the view's new `m.model` was first inserted mid-list and `CREATE OR REPLACE VIEW` can't reorder columns (42P16) — appended at the end. Web tsc + server + coder builds green; deployed (docker + boocoder, tools:34). Builds on `v2.7.7-pane-header-actions`.
## v2.7.7-pane-header-actions — 2026-06-01
In-flight workspace UX work, committed alongside the v2.7 review batches. Extracts a shared `PaneHeaderActions` cluster (the +/Split/Reopen-closed-pane/Session-history/Close controls) used across the `ChatTabBar` and the desktop coder + terminal pane headers in `Workspace`, replacing the divergent per-header copies, with `SessionLandingPage` history enhancements and `useWorkspacePanes` tweaks. Also fixes a coder-side correctness bug: `resolveChatId` (`apps/coder/src/routes/chat-resolve.ts`) still read `sessions.workspace_panes` as a bare `WorkspacePane[]`, but `v2.6.5-panes-tabs-composer` widened it to a `WorkspaceState` envelope — so it mis-read the panes and, worse, clobbered `tabNumbers`/`nextTabNumber`/`closedPaneStack` back to a bare array on every pane-chat write; a new `normalizeWorkspaceState` accepts either shape and preserves the envelope (with a regression test). Plus a CLAUDE.md doc-sync (apps/coder vitest suite, deploy-by-surface, dual-remote push, in-flight-web-WIP staging, release-branch naming). Web tsc + coder build + coder tests green. Builds on `v2.7.6-agent-status-normalize`.
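The `normalizeWorkspaceState` fix described above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the field shapes of `WorkspacePane` and `WorkspaceState`, the defaults, and the `setPaneChat` helper are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types live in the BooCode codebase.
interface WorkspacePane { id: string; kind: string; chatId?: string }
interface WorkspaceState {
  panes: WorkspacePane[];
  tabNumbers: Record<string, number>;
  nextTabNumber: number;
  closedPaneStack: WorkspacePane[];
}

// Accept either the legacy bare array or the envelope, and always return an envelope.
function normalizeWorkspaceState(raw: unknown): WorkspaceState {
  if (Array.isArray(raw)) {
    // Legacy shape: wrap the panes and fill assumed defaults for the rest.
    return { panes: raw as WorkspacePane[], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
  }
  if (raw !== null && typeof raw === "object") {
    const env = raw as Partial<WorkspaceState>;
    return {
      panes: Array.isArray(env.panes) ? env.panes : [],
      tabNumbers: env.tabNumbers ?? {},
      nextTabNumber: env.nextTabNumber ?? 1,
      closedPaneStack: Array.isArray(env.closedPaneStack) ? env.closedPaneStack : [],
    };
  }
  return { panes: [], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
}

// Illustrative pane-chat write: only the target pane changes, the envelope survives.
function setPaneChat(raw: unknown, paneId: string, chatId: string): WorkspaceState {
  const state = normalizeWorkspaceState(raw);
  return { ...state, panes: state.panes.map((p) => (p.id === paneId ? { ...p, chatId } : p)) };
}
```

The point of the design is that writers never persist a bare array again, so `tabNumbers`, `nextTabNumber`, and `closedPaneStack` cannot be clobbered by a pane-chat update.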
## v2.7.6-agent-status-normalize — 2026-06-01
The scoped half of `boocode_code_review_v2.md` §1 #10 — normalized external-agent status, surfaced from BooCoder's own dispatch observation (the heavier config-injection notify-hook, clean-room from superset's ELv2 `agent-setup`, is documented as the follow-on). The review's premise ("PTY agents have no status") had partly aged out — warm-ACP/opencode/SDK already carry working/done — so the real gap was that BooCoder never *published* a normalized per-`(chat,agent)` status (blocked-on-permission was invisible; crash/idle weren't pushed). Adds an `agent_status_updated` WS frame (`working|blocked|idle|error`, server+web parity) published from the dispatcher's turn boundaries across all four external paths (warm-acp/opencode/sdk/pty — `working` at start, `idle`/`error` at end) and the permission flow (`blocked` on request, `working` on resolve), best-effort so it never breaks a turn. A clean-room `normalizeAgentEvent` helper (superset's ~30-vendor-event → Start/blocked/Stop collapse, reimplemented with the event names as facts) ships now with 25 tests so the deferred notify-hook injection reuses it verbatim. The `AgentComposerBar` gains a normalized status dot (working=spinner, blocked=amber, idle=gray, error=red) distinct from the WS-liveness dot, fed by a `useAgentStatus` map `CoderPane` tracks per `(chat,agent)`. Built by two parallel agents (data plane + view plane) against a pinned frame contract; server 545 + coder 294 tests passing (25 new), web tsc + builds clean, ws-frames parity green. Clears the actionable review backlog (#1/#3/#4/#6–#12). Builds on `v2.7.5-claude-sdk-sessionstore`; openspec `agent-status-normalize`.
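The status transitions listed above (turn boundaries plus the permission flow) can be sketched as a small pure mapping with a best-effort publish wrapper. The four status values come from the entry; the `DispatchEvent` names, the frame field names (`chat_id`, `agent`), and the `publishStatus` helper are assumptions for illustration only.

```typescript
type AgentStatus = "working" | "blocked" | "idle" | "error";

// Hypothetical dispatch-side events; the real dispatcher's names may differ.
type DispatchEvent =
  | { kind: "turn_start" }
  | { kind: "permission_request" }
  | { kind: "permission_resolved" }
  | { kind: "turn_end"; ok: boolean };

// Map an observed dispatch event to the normalized status.
function statusFor(event: DispatchEvent): AgentStatus {
  switch (event.kind) {
    case "turn_start":
      return "working";
    case "permission_request":
      return "blocked";
    case "permission_resolved":
      return "working";
    case "turn_end":
      return event.ok ? "idle" : "error";
  }
}

// Best-effort publish: a failing publisher must never break the turn.
function publishStatus(
  publish: (frame: Record<string, unknown>) => void,
  chatId: string,
  agent: string,
  event: DispatchEvent,
): void {
  try {
    publish({ type: "agent_status_updated", chat_id: chatId, agent, status: statusFor(event) });
  } catch {
    // Deliberately swallowed: status is advisory, the turn continues.
  }
}
```

Keeping `statusFor` pure makes the mapping easy to unit-test without a broker or a live agent.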
## v2.7.5-claude-sdk-sessionstore — 2026-06-01
Lands the Claude Agent SDK direction (`boocode_code_review_v2.md` §1 #9, §6.2 "lean SDK") behind a flag. Adds `@anthropic-ai/claude-agent-sdk@0.3.159` (Commercial Terms — runtime dep, code reference-only) and builds a warm, resumable claude backend to supersede one-shot PTY dispatch — env-gated (`CLAUDE_SDK_BACKEND`, default off) so production claude stays on the unchanged PTY path until a host smoke. **Clean-room `PostgresSessionStore`** implements the SDK's real `SessionStore` type (`append`/`load`/`listSessions`/`delete`/`listSubkeys`) over a new `claude_session_entries` table — typechecked against the installed SDK type, 8 DB-integration tests. **`ClaudeSdkBackend`** (`implements AgentBackend`, mirroring warm-acp/opencode-server) drives one persistent `query()` per `(chat,'claude')` in streaming-input mode via a pushable async-iterable pump, with `sessionStore` + `resume` for cross-turn/cross-restart continuity, a pure `mapSdkMessage` → `AgentEvent` mapper, `session_id` captured from the `init` message, and `result.usage`/`total_cost_usd` accumulated onto `agent_sessions` (backend CHECK gains `'claude_sdk'`). Built against the REAL SDK 0.3.159 types after installing it — surfacing shapes a blind build would have missed (`SDKPartialAssistantMessage` is `type:'stream_event'` needing `includePartialMessages`; `SDKUserMessage.message` is `MessageParam`; the `SDKResultMessage` error arm). Also fixes a latent test-infra deadlock — three DB-integration suites applying the full schema in parallel under `DATABASE_URL` deadlocked, now serialized via `fileParallelism:false`. ~32 new tests (8 store + 10 mapper + 8 pushable + 6 routing); coder suite 269 passing default / 290 with DB; tsc clean against the SDK types; builds clean. **The live streaming pump + resume + an actual claude turn need a host smoke (`CLAUDE_SDK_BACKEND=1` + claude binary + ANTHROPIC auth) — cannot run from the dev container.** The zod peer-dep wants `^4` (workspace `3.25`) — watch at runtime. 
Builds on `v2.7.4-mistake-tracker-ledger`; openspec `claude-sdk-sessionstore`.
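The "pushable async-iterable pump" mentioned above is a general pattern: a queue that a producer pushes into while a long-lived consumer iterates it with `for await`. The sketch below shows one way to build it in plain TypeScript; the class name `Pushable` and its methods are hypothetical and are not taken from the SDK or from the BooCoder source.

```typescript
// A minimal pushable async iterable: push() feeds values, end() closes the stream.
class Pushable<T> implements AsyncIterable<T> {
  private queue: T[] = [];
  private waiters: Array<(result: IteratorResult<T>) => void> = [];
  private closed = false;

  // Number of values pushed but not yet consumed.
  get buffered(): number {
    return this.queue.length;
  }

  push(value: T): void {
    if (this.closed) throw new Error("pushable is closed");
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter({ value, done: false }); // hand the value straight to a pending next()
    } else {
      this.queue.push(value); // nobody waiting yet, so buffer it
    }
  }

  end(): void {
    this.closed = true;
    // Release every consumer that is still waiting.
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: (): Promise<IteratorResult<T>> => {
        if (this.queue.length > 0) {
          return Promise.resolve({ value: this.queue.shift() as T, done: false });
        }
        if (this.closed) {
          return Promise.resolve({ value: undefined, done: true });
        }
        return new Promise((resolve) => this.waiters.push(resolve));
      },
    };
  }
}
```

With a structure like this, each follow-up user message becomes a `push()` into the same iterable, which is what lets a single streaming-input call stay open across turns.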
## v2.7.4-mistake-tracker-ledger — 2026-06-01
Two native-inference hardening features from `boocode_code_review_v2.md` §1 #12 (cline, algorithm-reimplemented). **MistakeTracker:** complements the doom-loop guard (identical repeats) and cap-hit (budget) by catching a run of consecutive tool *failures*. A new pure `mistake-tracker.ts` tracks heterogeneous failure kinds (`zod_reject`/`tool_not_found`/`exec_error`/`api_error`/`permission_denied`, surfaced per tool from `tool-phase.ts`); after 3 consecutive failures the `turn.ts` loop does a **soft nudge** — injects model-facing recovery guidance into the next step + drops a `mistake_recovery` UI sentinel + resets — then **escalates** to stopping the turn (cap-hit-style, with a Continue affordance) if it re-trips without an intervening success, so heterogeneous failures can't burn the whole step budget. **File-provenance ledger:** `compaction.ts` now derives a deterministic, sorted `## Files Read` list from the head messages' read-tool calls (`view_file`/`grep`/`find_files`/`list_dir`) and injects it into the rolling-summary prompt so file provenance survives compaction (no new table; prompt-driven merge, read-only since BooChat has no write tools). The `mistake_recovery` sentinel adds an arm to `MessageMetadata` in both server + web type copies plus a `MessageBubble` render branch. Built by two parallel agents (backend + frontend sentinel) over disjoint apps; server 545 tests passing (23 new: 12 mistake-tracker + 11 compaction), build + web tsc clean. Native-inference only (external agents run their own loops). Builds on `v2.7.3-sampling-streamjson-tokens`; openspec `mistake-tracker-file-ledger`.
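The nudge-then-escalate policy described above can be sketched as a small state machine. The failure kinds and the threshold of 3 come from the entry; the class shape, method names, and the `"continue" | "nudge" | "stop"` return values are assumptions for illustration, not the contents of `mistake-tracker.ts`.

```typescript
type FailureKind =
  | "zod_reject"
  | "tool_not_found"
  | "exec_error"
  | "api_error"
  | "permission_denied";

type TrackerAction = "continue" | "nudge" | "stop";

class MistakeTracker {
  private consecutive = 0;
  private nudged = false;
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Any successful tool call clears both the streak and the nudge memory.
  recordSuccess(): void {
    this.consecutive = 0;
    this.nudged = false;
  }

  // Failure kinds are heterogeneous on purpose: any kind extends the streak.
  recordFailure(_kind: FailureKind): TrackerAction {
    this.consecutive += 1;
    if (this.consecutive < this.threshold) return "continue";
    this.consecutive = 0; // reset the streak after tripping
    if (this.nudged) return "stop"; // re-tripped with no success in between
    this.nudged = true;
    return "nudge"; // first trip: inject recovery guidance into the next step
  }
}
```

In a turn loop, `"nudge"` would correspond to injecting guidance plus the `mistake_recovery` sentinel, and `"stop"` to ending the turn with a Continue affordance.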
## v2.7.3-sampling-streamjson-tokens — 2026-06-01
Three small BooCode wins from `boocode_code_review_v2.md` §1 #11/#7/#8. **Sampling knobs:** per-agent `top_n_sigma` + the `dry_*` repetition family (`dry_multiplier`/`dry_base`/`dry_allowed_length`/`dry_penalty_last_n`) are now first-class Agent frontmatter fields, parsed in `agents.ts` and threaded into the llama-swap chat-completion body via `providerOptions.openaiCompatible` (the `@ai-sdk/openai-compatible` extra-body channel). This surfaced and fixed a **latent bug**: `top_k` (rejected by the AI-SDK provider as unsupported) and `min_p` (never passed to `streamText` at all) had been dead on the wire — no agent's `top_k`/`min_p` ever affected sampling; both now route through the same channel, so agents that set them will start using them. `--reasoning-budget` is documented in `data/AGENTS.md` (already works via `llama_extra_args`, permitted by the deny-list validator). **Live PTY stream-json:** qwen/claude PTY dispatch sliced stdout opaque; a new `stream-json-parser.ts` line-buffers the Claude-Code-compatible NDJSON and emits text/reasoning/tool frames live as they arrive (mirroring the ACP/opencode paths) + persists the structured parts, with a clean fallback to the old opaque slice when output isn't NDJSON (claude now runs `--output-format stream-json --verbose`). **Token UI:** the per-`(chat,agent)` `agent_sessions.input_tokens`/`output_tokens`/`cost` columns (accumulated since `v2.6.8` but dropped by the read route + wire type) now flow through and render condensed beside the AgentComposerBar session chip. Built by three parallel agents over disjoint subsystems; server 523 + coder 245 tests passing (incl. 11 new stream-json-parser + new agent-parse tests), all builds + web tsc clean. Builds on `v2.7.2-checkpoint-idor`; openspec `sampling-streamjson-tokens`. The qwen-vs-claude `usage` field names in #7 are best-guess pending a live smoke.
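The line-buffering idea behind the live stream-json path can be sketched as below: stdout arrives in arbitrary chunks, complete lines are parsed as JSON, and anything that does not parse is passed through as raw text so a caller can fall back to the opaque slice. The names `NdjsonLineBuffer`, `ParsedLine`, and `parseLine` are hypothetical and are not taken from `stream-json-parser.ts`.

```typescript
type ParsedLine =
  | { kind: "json"; value: Record<string, unknown> }
  | { kind: "raw"; line: string };

// Treat a line as NDJSON only if it parses to a non-null, non-array object.
function parseLine(line: string): ParsedLine {
  try {
    const value: unknown = JSON.parse(line);
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      return { kind: "json", value: value as Record<string, unknown> };
    }
  } catch {
    // Not JSON: fall through to the raw arm.
  }
  return { kind: "raw", line };
}

class NdjsonLineBuffer {
  private pending = "";

  // Feed one stdout chunk; returns every line completed by this chunk.
  feed(chunk: string): ParsedLine[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? ""; // keep the trailing partial line
    return lines.filter((l) => l.trim() !== "").map(parseLine);
  }

  // Call at process exit to drain a final line that had no newline.
  flush(): ParsedLine[] {
    const rest = this.pending;
    this.pending = "";
    return rest.trim() === "" ? [] : [parseLine(rest)];
  }
}
```

A caller could emit live frames for each `json` item and switch to the old opaque handling as soon as it sees a `raw` item.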
## v2.7.2-checkpoint-idor — 2026-06-01
Closes two IDOR authorization holes in the `v2.7.1-write-edit-robustness` checkpoint routes, flagged by the automated push security review. The `GET /api/sessions/:id/checkpoints?chat_id=` list route scoped its `chat_id` branch by `chat_id` alone — any session's `chat_id` would read its checkpoints; it now joins through `chats` and gates on `chats.session_id` (authoritative; `checkpoints.session_id` is a nullable denormalized hint). The `restoreCheckpoint` scope guard was fail-open — `cp.session_id && cp.session_id !== sessionId` fell through whenever the checkpoint's denormalized `session_id` was null, allowing a cross-session restore (worktree reset + transcript trim) — it now resolves the owning session via the checkpoint's chat and denies on any missing-or-mismatched row. A DB-integration regression covers the exact null-`session_id` cross-session case. Real-world blast radius is small (BooCoder is single-user behind Authelia on loopback), but both are genuine authorization bugs. Coder suite 234 passing (7/7 checkpoint tests incl. the regression against live postgres+git), typecheck clean. Hotfix on `v2.7.1-write-edit-robustness`.
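The difference between the fail-open guard and the fail-closed replacement can be shown in miniature. The first function mirrors the quoted condition; the second is only a sketch of the described fix, with an in-memory `Map` standing in for the `chats` lookup. The `Checkpoint` shape and both function names are assumptions.

```typescript
interface Checkpoint {
  id: string;
  chatId: string;
  sessionId: string | null; // nullable denormalized hint
}

// Old guard: denies only when the hint is present AND mismatched.
// A null hint makes the condition falsy, so the request falls through.
function oldGuardDenies(cp: Checkpoint, sessionId: string): boolean {
  return Boolean(cp.sessionId && cp.sessionId !== sessionId);
}

// Fail-closed sketch: resolve the owning session via the checkpoint's chat
// and allow only on a positive match. chatSession stands in for the chats table.
function canRestore(cp: Checkpoint, sessionId: string, chatSession: Map<string, string>): boolean {
  const owner = chatSession.get(cp.chatId);
  return owner !== undefined && owner === sessionId;
}
```

The general rule the fix follows is to authorize on the authoritative relation and to treat a missing row as a denial rather than as "nothing to check".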

185
CLAUDE.md

@@ -2,11 +2,11 @@
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
**Cursor agents:** start with `docs/ARCHITECTURE.md` (diagram). This file is the deep engineering reference. (Note: the root navigation `AGENTS.md` was removed in v1.12; `data/AGENTS.md` is the agent *registry*, not navigation.) **Cursor agents:** start with `docs/ARCHITECTURE.md` (diagram); this file is the deep engineering reference. `data/AGENTS.md` is the agent *registry*, not navigation (the root navigation `AGENTS.md` was removed).
## What is BooCode
Self-hosted single-user developer chat app. AI assistant with read-only file tools (view_file, list_dir, grep, find_files) running against a local llama-swap inference server. Sessions organized by project, with a multi-pane workspace (chat + file browser side by side). Self-hosted single-user developer chat app. AI assistant with read-only file tools (view_file, list_dir, grep, find_files) against a local llama-swap inference server. Sessions organized by project, multi-pane workspace (chat + file browser side by side).
Plus `apps/booterm` (second container, port 9501, bookworm-slim+glibc): Fastify + node-pty + tmux. Browser terminal panes WS to `/ws/term/sessions/:sid/panes/:pid`; per-session tmux session `bc-<sid>`, per-pane window `term-<pid>`. Shells drop privs to samkintop via `gosu` in `tmux.conf` default-command.
@@ -35,85 +35,22 @@ npx tsc -p apps/web/tsconfig.app.json --noEmit # web app specifically
docker compose build --no-cache boocode && docker compose up -d docker compose build --no-cache boocode && docker compose up -d
``` ```
Tests: `pnpm -C apps/server test` runs the vitest suite. No test harness on `apps/web` (adding it requires installing vitest as a new devDep). Vitest pinned to `^3` because Vite 5 / vitest 4 are incompatible. No linters configured. Vitest include glob is `src/**/__tests__/**/*.test.ts` (see `apps/server/vitest.config.ts`) — tests outside `src/**/__tests__/` silently won't run; match the per-domain convention (`apps/server/src/services/__tests__/foo.test.ts`). Tests: `pnpm -C apps/server test` (vitest); `apps/coder` has its own suite — `pnpm -C apps/coder test` (`globals:false`, so import `describe`/`it`/`expect` from `vitest`). No `apps/web` test harness, no linters. Vitest pinned to `^3` (Vite 5 / vitest 4 incompatible). Include glob is `src/**/__tests__/**/*.test.ts` — tests outside it silently won't run. Extract pure helpers to unit-test (`backends/turn-guard.ts`, `lifecycle-decisions.ts` are the pattern).
## Architecture
**Monorepo**: pnpm workspaces with `apps/server` (Fastify + postgres), `apps/web` (React + Vite), and `apps/booterm` (Fastify + node-pty + tmux). **Monorepo**: pnpm workspaces with `apps/server` (Fastify + postgres), `apps/web` (React + Vite), `apps/booterm` (Fastify + node-pty + tmux), `apps/coder` (BooCoder, host service).
### Server (`apps/server/src/`) ### Per-app deep references
- **Fastify** with `@fastify/websocket` and `@fastify/static` (serves built frontend) Detailed engineering notes live in per-app `CLAUDE.md` files, **auto-loaded when you read/edit files in that subtree** (and worth opening before non-trivial work there):
- **postgres** (porsager/postgres) with tagged-template SQL — no ORM. Schema in `schema.sql`, applied on startup. LSP may false-positive on `sql<Type[]>\`...\`` generics; CLI `tsc` / `pnpm build` is authoritative.
- **Zod** for request validation and config parsing.
Key services: - **`apps/server/CLAUDE.md`** — inference pipeline, AI-SDK adapter gotchas, tools, compaction, broker, the `messages_with_parts` view, sidecar routing, secret guard, the `data/AGENTS.md` registry.
- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn / runInference / createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`, `MAX_STEPS`), `stream-phase.ts` (streamCompletion as a v1.13.1-A AI SDK adapter + executeStreamPhase), `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap), `tool-phase.ts` (executeToolPhase → returns `ToolPhaseResult`; no longer recurses into runAssistantTurn — v1.14.0 converted the recursion to an explicit while loop in turn.ts), `sentinel-summaries.ts` (runCapHitSummary + runDoomLoopSummary + runStepCapSummary + their sentinel inserters), `error-handler.ts` (handleAbortOrError, finalizeCompletion), `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`), `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`, sentinel predicates), `budget.ts` (resolveToolBudget), `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls), `parts.ts` (parts-table write helpers: `partsFromAssistantMessage`, `partsFromToolMessage`, `insertParts` — v1.13.20 made parts the sole source of truth), `prune.ts` (v1.13.4 two-tier compaction; `selectPruneTargets` is the pure decision helper), `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope populated from loop locals each iteration; reset in `runInference` at user-message boundary. The outer loop in `runAssistantTurn` (v1.14.0) runs `while (stepNumber < effectiveCap)` where `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS=200)`. Per-agent `steps:` field in AGENTS.md frontmatter. `steps: 0` means text-only (no tool execution). Step-cap hit writes a `cap_hit` sentinel so `CapHitSentinel.tsx` renders it. 
- **`apps/coder/CLAUDE.md`** — BooCoder dispatch, provider registry/probe/snapshot, opencode/ACP/PTY/Claude-SDK backends, `agent_sessions` resume.
- **AI SDK v6 streamCompletion adapter** (v1.13.1-A; `services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer above (executeStreamPhase, finalize, dual-write) is shape-preserved via an adapter. Five gotchas the LSP/test suite won't catch: - **`apps/web/CLAUDE.md`** — React app, hooks/event buses, font & CSS pipeline, multi-pane workspace, all UI conventions.
- **Abort signals are swallowed.** `streamText`'s `fullStream` iterator exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required; without it the row finalizes as `complete` instead of `cancelled`. Comment in stream-phase.ts pins this; don't refactor it away. - **`docs/project-discovery.md`** — full stack / tooling / command inventory across all packages (read-on-demand).
- **Usage lands only at stream end** via `await result.usage` (`inputTokens` / `outputTokens` v6 names → mapped to `promptTokens` / `completionTokens` for the existing onUsage callback). Mid-stream live tok/s is gone vs v1.12.2; ChatThroughput shows a single value at stream end.
- **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop. Only `description` + `inputSchema: jsonSchema(parameters)` — surfacing tool-call parts via `fullStream` and stopping is what we want.
- **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `services/inference/provider.ts`. The adapter defaults it false, omitting `stream_options.include_usage` from the request body; llama-swap then never emits the usage block and `result.usage.inputTokens/outputTokens` resolve to `undefined`. Latent regression from v1.13.1-A through v1.13.7 — every assistant row in that window has `tokens_used`/`ctx_used` NULL. Don't remove this flag during refactor.
- **Tool-call-only turns may emit a leading `\n` text-delta** as the assistant content. `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check — otherwise whitespace-only content renders an empty bubble + ActionRow between every tool call (v1.13.7 fix). `payload.ts:buildMessagesPayload` also skips `status='failed'` AND complete-but-empty (no content, no tool_calls) assistant rows to avoid "Cannot have 2 or more assistant messages at the end of the list" upstream rejections after cap-hit + Continue.
- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart` — BooCode's OpenAI-shape history doesn't carry it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` matching the v6 `ToolResultOutput` union. Assistant messages with reasoning emit a `ReasoningPart` first in the content array (v1.13.1-C).
- **`experimental_repairToolCall`** (v1.13.3) wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through implementation — logs the bad call and returns it unmodified; `executeToolPhase`'s existing zod-reject error path routes it to the model on the next turn.
- **`chat_status` frame shape** (published via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'` (widened from `working|idle|error` in v1.12.1). Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders inline beside `StatusDot` only when streaming or tool_running, fed by 500ms-throttled `'usage'` WS frames (`completion_tokens` + `ctx_used` + `ctx_max`). The `POST /api/chats/:id/discard_stale` endpoint exists to mark a stuck-streaming row as `failed` when the frontend's 60s no-token-activity timer (`ChatPane` content-length watcher) gives up.
- **Boot-time stale-streaming sweep** in `apps/server/src/index.ts` after `applySchema()`: any `messages.status='streaming'` older than 5 minutes flips to `'failed'`. Logs only on non-zero count. Recovers from container restart while inference was mid-stream (v1.12.1).
- **Periodic 60s sweeper** in `apps/server/src/index.ts` (v1.13.3 + v1.13.5). Same `setInterval` runs `sweepStaleStreaming` (marks `messages.status='streaming'` older than 5 min as `failed`, publishes `chat_status='idle'` so the UI dot drops) and `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `app.addHook('onClose')` clears the timer. No-op when nothing to reap.
- **`services/broker.ts`** — In-memory pub/sub with two channel types: per-session (message streaming) and per-user (sidebar updates). No persistence; clients reconnect on restart. v1.13.11: every WS publish goes through `broker.publishFrame(sessionId, frame)` or `broker.publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). `ctx.publish` / `ctx.publishUser` in inference + auto_name route through the index.ts adapter that calls publishFrame internally. The schema is duplicated byte-identical at `apps/web/src/api/ws-frames.ts`; a `ws-frames.test.ts` case enforces parity. Don't add new raw `broker.publish()` / `publishUser()` calls.
- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) go through three guard layers: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). v1.11.8+ web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (resolved with `project.default_web_search_enabled` fallback) and filtered out of the LLM's tool schema when false. v1.13.5 truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs at `BOOCODE_TRUNCATION_DIR` (default `/tmp/boocode-truncations`, 0o700) keyed by an opaque `tr_<12 base32 chars>` id, and the `view_truncated_output(id)` tool retrieves it. 5MB cap (matches `view_file`'s `MAX_FILE_BYTES`), 7-day TTL, reaped by the periodic sweeper. Tmpfs path means container restart loses retrieval — acceptable, the model usually has moved on.
- **`services/compaction.ts`** + **`services/model-context.ts`** — v1.11.0 anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself on each compaction). Triggered when `chats.needs_compaction` is set after an inference turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)` (v1.13.9 opencode-pattern early trigger; was `ctx_max - 20k` pre-v1.13.9, which gave only 7.6% headroom at 262k and 0 budget for ≤20k contexts). **`ctx_max` comes from `model-context.getModelContext()` which fetches `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx` (the stream completion's `timings` doesn't carry n_ctx; that read was dead code until v1.11.3 ripped it out). First inferences after a boocode boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model yet; negative cache TTL is 60s, recovers on next turn. v1.13.6: `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on the assistant `content` (OpenAI wire shape has no structured reasoning field; the summarizer reads text). Standalone tag when content is empty (tool-call-only turn). `buildHeadPayload` + `OpenAiMessage` exported for test access — keep them exported.
- **`services/system-prompt.ts`** — `buildSystemPrompt` is the string-returning shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. v1.13.8 instrumentation: SHA-256 of the assembled prefix is logged per `buildMessagesPayload` call (msg `prefix-fingerprint`, level=info); a `Map<sessionId, lastHash>` observer fires `prefix-drift` (level=warn) on hash change with a field-level `changed_inputs` diff. Smoke proved the prefix is byte-stable across turns in steady-state — the originally-planned `system_prompt_cache` DB table was dropped as redundant against the v1.12.0 input-layer mtime caches (BOOCHAT.md here + AGENTS.md global+per-project in `agents.ts:safeStat`).
- **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (v1.13.7; was 15 — every tool in `ALL_TOOLS` is read-only today, so no-agent mode shares the read-only-agent cap). Per-agent `max_tool_calls` from AGENTS.md frontmatter overrides.
- **`messages_with_parts` view** (v1.13.1-B; `schema.sql`). Read sites that need `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` directly. v1.13.20 dropped the legacy `messages.tool_calls` / `messages.tool_results` JSON columns; the view now reads parts-only subselects. Writes target `message_parts` exclusively via `insertParts` (or via the helpers `partsFromAssistantMessage` / `partsFromToolMessage`). The `Message` wire type still carries `tool_calls?` / `tool_results?` because the view synthesizes them from parts — frontend reads are unchanged. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` single object, `reasoning_parts jsonb[]` of `{text}`. If you ever need to UPDATE a message and return its full Message shape, do a two-step UPDATE returning `id` followed by SELECT from the view — RETURNING off the bare `messages` table no longer carries the tool fields.
- **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
- **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after first assistant reply.
- **`apps/coder/src/services/provider-registry.ts`** (BooCoder, NOT apps/server) — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty).
- **`apps/coder/src/services/agent-probe.ts`** (BooCoder) — Startup probe using direct `exec()` (not SSH). Discovers installed agents on host, their versions, ACP support, and models. Qwen models read from `~/.qwen/settings.json`. Claude models are static from the registry. Results persisted to `available_agents` table.
- **`apps/coder/src/routes/providers.ts`** (BooCoder) — `GET /api/providers` returns installed providers with models. Transport field reflects actual capability (checks `supports_acp` from DB, not just registry preference). The apps/server side of this flow is the "Provider picker dispatch" bullet below.
- **Provider picker dispatch**: when `provider !== 'boocode'`, the message route creates a `tasks` row (with `session_id` set) instead of calling `inference.enqueue`. The dispatcher picks it up and dispatches via ACP or PTY using the agent's `install_path`.
Route registration: all routes registered in `index.ts` via `register*Routes(app, sql, ...)` functions. Routes are in `routes/*.ts`. Cross-app contracts (WS-frame & provider-type parity, sentinels) and everything below stay here.
### BooCoder (`apps/coder/src/`)
- Write-capable coding agent. Runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker. Fastify server at port 9502, connects to postgres at `127.0.0.1:5500`.
- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. apps/server must build FIRST.
- Build + deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
- After `pnpm -C apps/coder build` the host `boocoder.service` keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler returns that shape). Restart, don't re-debug.
- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Follows Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers.
- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes` table. Nothing hits disk until `apply_pending` is called. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
- Frontend: NOT a separate SPA. BooCoder is a `'coder'` pane type within BooChat's SPA (`apps/web/`). `CoderPane.tsx` in `apps/web/src/components/panes/`. API requests go through `/api/coder/*` proxy (Vite dev + Fastify production) which rewrites to the boocoder host service (`BOOCODER_URL` env var, default `http://100.114.205.53:9502`). WS connects directly to `:9502`.
- `apps/coder/web/` is a STANDALONE fallback SPA served at `:9502` directly. The PRIMARY BooCoder frontend is the `CoderPane` in BooChat's SPA (`apps/web/src/components/panes/CoderPane.tsx`), accessible via the "Coder" pane in the workspace at `code.indifferentketchup.com`. Both exist; the pane is what Sam uses.
- **Provider snapshot lifecycle** (`apps/coder/src/services/`): `provider-config.ts` (Zod config, never-throws on bad input) → `provider-config-registry.ts` (`buildResolvedRegistry`, singleton) → `provider-snapshot.ts` (two-tier probe: tier-1 fast presence, tier-2 cold ACP probe skipped unless force / stale `PROVIDER_PROBE_TTL_MS` 24h / dbEmpty; cached). Verify live: `curl http://100.114.205.53:9502/api/providers/snapshot` — returns providers + models + commands, the exact shape `AgentComposerBar` renders.
- `PATCH /api/providers/config` replaces a provider id's override object **wholesale** (per-id shallow merge) — to flip one field send `{...existing, enabled}`, or a custom ACP entry's `command`/`label` is wiped and it drops out of the resolved registry. `data/coder-providers.json` is **gitignored** (it's live runtime config — the coder reads AND writes it on UI toggles); the tracked reference is `data/coder-providers.example.json`. The loader falls back to `{providers:{}}` (built-ins only) when the live file is absent, so a fresh checkout needs no copy.
- **opencode** runs as a warm HTTP server (v2.6 Phase 1, `services/backends/opencode-server.ts` → `opencode serve` per BooCoder process, one opencode session per BooCode session, resumed via `agent_sessions`). goose/qwen/claude still dispatch **one-shot** ACP/PTY with no ctx/token usage; only native `boocode` (llama-swap engine) tracks ctx. Paseo's per-provider native clients (design §12) deliberately not ported.
- **opencode SSE** (`opencode-server.ts`): live streaming arrives as `session.next.text.delta` / `session.next.reasoning.delta` / `session.next.tool.{called,success,failed}` — NOT `message.part.*` (those are terminal/post-hoc). `client.event.subscribe({ directory })` MUST pass the session's worktree directory; omit it and opencode scopes events to the server's `process.cwd()` → zero session events (empty turns, 180s watchdog timeout). Per-session SSE (P1.5-a): each live session owns its own `event.subscribe({directory})` loop + AbortController, so concurrent sessions in different worktrees stream independently; a `sessionID` demux guard drops cross-session events when two share a dir. Turn completes on `session.idle`; `promptAsync` is fire-and-forget (204).
- **opencode model strings** must be provider-prefixed (`llama-swap/<model>`) AND exist in `~/.config/opencode/opencode.json` `provider.llama-swap.models` — not merely loadable by llama-swap. `parseModel` infers `llama-swap/` for a bare id; the dispatcher coalesces empty→DEFAULT_MODEL then prefixes. `agent-probe` populates opencode's `available_agents.models` via `mergeLlamaSwap` (fetches `/v1/models`); empty model list → frontend sends `''` → no inference (`input:0`, empty turn).
- **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; including it breaks cross-restart resume). P1.5-b: `agent_sessions` is keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs in one session = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, `session_id` SET NULL so it survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` falls back to resolve-or-create a chat when it's null (arena/MCP/new_task). The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), imports `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
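The `config_hash` rule above can be sketched as follows. This is a minimal illustration, not the code in `opencode-server.ts`: the helper name `agentSessionConfigHash` is hypothetical, and the hex digest encoding is an assumption (the note only states `sha256('opencode_server|<model>')`).

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: name and signature are illustrative only.
// Assumption: the hash is stored hex-encoded.
export function agentSessionConfigHash(model: string): string {
  // Only the backend kind and the model feed the hash. The server port is
  // left out on purpose: it is random per boot, so including it would
  // change the hash on every restart and break resume.
  return createHash("sha256").update(`opencode_server|${model}`).digest("hex");
}
```

Because the input is stable across restarts, a lookup by `(chat_id, agent)` plus this hash can find the same `agent_sessions` row after the service is restarted.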
### Frontend (`apps/web/src/`)
- **React 18** + React Router v6 + **Tailwind v4** + shadcn/radix-ui primitives.
- **Shiki** for syntax highlighting (async `codeToHtml` in `CodeBlock.tsx` and `FileViewer` in `FileBrowserPane.tsx`).
- Path alias: `@/` maps to `src/`.
- **Mobile interaction primitives** (post-v1.6): `useViewport` (matchMedia, breakpoints mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, dispatches synthetic `contextmenu` on `[data-tab-id]`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Tap-target convention: `max-md:min-h-[44px] max-md:min-w-[44px]`. Mobile headers: `border-b px-3 sm:px-4 py-2` + `style={{ paddingTop: 'max(0.5rem, env(safe-area-inset-top))' }}`. Hamburger left, FolderTree right.
Key patterns:
- **`hooks/sessionEvents.ts`** — Module-singleton event bus (Set of listeners). Used for cross-component communication: session renames, file-open events, attachment dispatch. 9 event types in the discriminated union. When adding a new event type to the `SessionEvent` union, you must also add a case to the `applyEvent` switch in `useSidebar.ts` (even if it's a no-op `return prev`).
- **`hooks/useSessionStream.ts`** — WebSocket per session, `applyFrame` reducer builds message list from streaming frames.
- **`hooks/useUserEvents.ts`** — Single app-level WS to `/api/ws/user` with exponential backoff reconnect. Forwards frames onto the sessionEvents bus.
- **`hooks/useSidebar.ts`** — Module-singleton with Set<setState> subscriber pattern; one bus subscription guarded by `globalThis.__boocode_sidebar_subscribed` for HMR safety. Every new `SessionEvent` type needs a `case` in the `applyEvent` switch (no-op `return prev` is fine).
- **`api/client.ts`** — Centralized typed fetch wrapper. All endpoints under `api.*` namespace.
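The module-singleton bus and the "every event type needs a case" rule described above might look roughly like this. It is a sketch under stated assumptions: the two event shapes, the `SidebarState` type, and the `never` exhaustiveness check are hypothetical stand-ins, not the contents of `sessionEvents.ts` or `useSidebar.ts`.

```typescript
// Hypothetical event shapes; the real SessionEvent union has 9 members.
type SessionEvent =
  | { type: "session_renamed"; sessionId: string; name: string }
  | { type: "file_open"; path: string };

type Listener = (event: SessionEvent) => void;

// Module-level Set: every importer shares the same listeners.
const listeners = new Set<Listener>();

export function subscribe(listener: Listener): () => void {
  listeners.add(listener);
  // Return an unsubscribe function for effect cleanup.
  return () => {
    listeners.delete(listener);
  };
}

export function emit(event: SessionEvent): void {
  // Copy first so a listener that unsubscribes mid-dispatch is safe.
  for (const listener of Array.from(listeners)) listener(event);
}

// Hypothetical sidebar state, reduced to a name map.
type SidebarState = { names: Record<string, string> };

export function applyEvent(prev: SidebarState, event: SessionEvent): SidebarState {
  switch (event.type) {
    case "session_renamed":
      return { names: { ...prev.names, [event.sessionId]: event.name } };
    case "file_open":
      return prev; // no-op case, still required for exhaustiveness
    default: {
      // Assumption: a never-check like this is what forces a new case
      // whenever the union grows.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

Returning `prev` unchanged for irrelevant events keeps the reference stable, so subscribers that compare by identity do not re-render.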
Font / CSS pipeline (apps/web):
- Tailwind v4's `@import "tailwindcss"` directive strips font URLs from subsequent CSS `@import`s — `@fontsource*` packages must be imported as JS side-effect modules in `apps/web/src/main.tsx`, not via `@import` in `globals.css`. Otherwise the woff2 files never make it to `dist/`.
- Lightning CSS (inside `@tailwindcss/postcss` v4) collapses contiguous unicode-ranges to wildcard shorthand (`U+0000-FFFF` → `U+????`), which iOS Safari/Vivaldi mishandles (silently drops the font from those codepoints). Use explicit non-wildcard-collapsible subranges (e.g. `U+2500-259F` not `U+2500-25FF`). The `apps/web` build script greps `dist/assets/*.css` for `U+2500-259F` and fails the build if missing — preserve that guard.
- `@font-face` blocks must live AFTER all `@import` statements (CSS spec). Earlier placement silently breaks every subsequent `@import` (this broke the 18 theme palette imports in globals.css for one session).
- JetBrainsMono Nerd Font self-hosted in `apps/web/src/fonts/` (TTF from ryanoasis/nerd-fonts release) — needed because `@fontsource-variable/jetbrains-mono` ships subsetted woff2s that don't cover `U+2500-259F` (box drawing + block elements, used by opencode's banner). "NL" = No Ligatures (matches `font-feature-settings: "liga" 0`); "Mono" = single-cell icon width so TUI layouts don't desync.
- xterm-addon-webgl rasterizes glyphs via Canvas2D into a GPU texture atlas. Canvas2D does NOT honor `font-display: block` — it uses whatever font is currently registered. Gate xterm initialization on `document.fonts.load(<font-name>)` resolving before calling `term.open()` (see `fontsReady` useState in `TerminalPane.tsx`). iOS Safari/Vivaldi also reclaims WebGL contexts from backgrounded tabs: keep `webgl.onContextLoss(() => webgl.dispose())` + recreate via visibilitychange. Do NOT manually dispose+recreate the addon after font load — iOS silently fails the second GL context creation and the terminal drops to DOM renderer with stale metrics.
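The font gate described in the last bullet can be sketched as a small helper. This is an illustration only: `openAfterFontLoad` and the `FontLoader` interface are hypothetical names, and the real gate is the `fontsReady` state in `TerminalPane.tsx`. In the browser, `document.fonts` satisfies `FontLoader`.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
// In a browser, `document.fonts` (a FontFaceSet) has a compatible load().
interface FontLoader {
  load(font: string): Promise<unknown>;
}

// Hypothetical helper: `openTerminal` stands in for the code that calls
// term.open() and attaches the WebGL addon.
export async function openAfterFontLoad(
  fonts: FontLoader,
  fontSpec: string,
  openTerminal: () => void,
): Promise<void> {
  // Wait until the font is registered; Canvas2D rasterizes with whatever
  // font is available at draw time, so opening earlier bakes fallback
  // glyphs into the texture atlas.
  await fonts.load(fontSpec);
  openTerminal();
}
```

A call site would pass a CSS font shorthand such as a size followed by the quoted family name; the exact family string registered by the app is not given in these notes, so it is left to the caller.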
### Data flow for chat
@@ -124,90 +61,64 @@ Font / CSS pipeline (apps/web):
5. Tool calls: inference executes tools server-side, publishes tool_call/tool_result frames, loops back to LLM
6. Terminal states (complete/error): DB updated with final content + token counts, `session_updated` frame published on user channel
### Multi-pane workspace
Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). v1.12.1 moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device watching the session. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string. Legacy localStorage key `boocode.workspace.panes.<sessionId>` is read once on first hydrate (one-time seed-and-delete migration when server is empty but localStorage has data); no longer written. The deprecated `session_panes` table was dropped. `validatePanes(validChatIds)` prunes panes referencing chat IDs that no longer exist (called by `useSessionChats` after the chat list fetch lands). Each chat lives in at most one pane; tab strip is per-pane and tracks `chatIds[]` + `activeChatIdx`. Tab reorder via native HTML5 drag events. v2.6.5: `workspace_panes` is now a `WorkspaceState` envelope `{panes, tabNumbers (chatId→stable session-scoped tab number, assigned on chat-pane open, retired on close, never reused), nextTabNumber, closedPaneStack (reopen LIFO, max 10, persisted so it survives reload)}` — not a bare `WorkspacePane[]`. Hydrate (`toWorkspaceState`) and the server PATCH validator (`z.union([array, envelope])` in `routes/sessions.ts`) both accept the legacy array and normalize to the envelope on read/write. Closing a chat pane relocates its tabs to the oldest chat/empty pane; `reopenPane` strips the restored chatIds from all live panes first (no duplication). `read_tab_by_number` resolves a number→chatId through `tabNumbers`.
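The legacy-array-to-envelope normalization mentioned above could be sketched like this. It plays the role of `toWorkspaceState` but is not that function: `normalizeWorkspace` is a hypothetical name, the pane type is trimmed down, the element type of `closedPaneStack` is a guess, and numbering legacy tabs in encounter order is an assumption about the migration.

```typescript
// Reduced pane shape; the real WorkspacePane carries more fields.
type WorkspacePane = { id: string; kind: string; chatIds?: string[] };

type WorkspaceState = {
  panes: WorkspacePane[];
  tabNumbers: Record<string, number>; // chatId -> stable tab number
  nextTabNumber: number;
  closedPaneStack: WorkspacePane[]; // assumption: stack stores whole panes
};

export function normalizeWorkspace(raw: WorkspacePane[] | WorkspaceState): WorkspaceState {
  // Already an envelope: pass it through untouched.
  if (!Array.isArray(raw)) return raw;

  // Legacy bare array: wrap it and number tabs in encounter order
  // (assumption; the real migration may assign numbers differently).
  const tabNumbers: Record<string, number> = {};
  let next = 1;
  for (const pane of raw) {
    for (const chatId of pane.chatIds ?? []) {
      if (!(chatId in tabNumbers)) tabNumbers[chatId] = next++;
    }
  }
  return { panes: raw, tabNumbers, nextTabNumber: next, closedPaneStack: [] };
}
```

Normalizing at both the hydrate and PATCH boundaries means the rest of the code only ever handles the envelope shape.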
## Database
PostgreSQL 16. Database name: `boochat` (renamed from `boocode` in v2.0.0-alpha; Docker service name stays `boocode_db`). Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts` (v1.13.0), `pending_changes` (v2.0.0), `tasks` (v2.0.0), `available_agents` (v2.0.0). Views: `messages_with_parts` (v1.13.1-B parts-merge read path), `tool_cost_stats` (v1.13.10 per-tool 100-call rolling window), `human_inbox` (v2.0.0 — tasks WHERE state IN blocked/failed). (`session_panes` was dropped in v1.12.1; workspace pane state lives in `sessions.workspace_panes jsonb`.) Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints in place: `projects_status_chk` ('open'|'archived'), `sessions_status_chk` (same), `chats_status_chk` (same), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. The older anonymous `messages_status_check` (without 'cancelled') and `messages_role_check` (without 'system') were dropped in v1.12.1; only the `_chk` variants remain. **Two schema files, one DB:** `apps/server/src/schema.sql` owns `sessions`/`chats`/`messages`/`message_parts`; `apps/coder/src/schema.sql` (applied by the boocoder host service) owns `agent_sessions`, `worktrees`, `pending_changes`, `available_agents` and extends `tasks`. Both apply idempotently to the one `boochat` DB — so e.g. an `agent_sessions` FK change goes in the **coder** schema, not the server one. Idempotent FK-action flips (e.g. `ON DELETE CASCADE` → `SET NULL`) guard on `pg_constraint.confdeltype` so a re-run/fresh-deploy is a no-op (see the `session_worktrees`/`agent_sessions` defang blocks). PostgreSQL 16. DB name: `boochat` (Docker service stays `boocode_db`). Tables: `projects`, `sessions`, `chats`, `messages`, `settings`, `message_parts`, `pending_changes`, `tasks`, `available_agents`. 
Views: `messages_with_parts` (parts-merge read path), `tool_cost_stats` (per-tool 100-call rolling window), `human_inbox` (tasks WHERE state IN blocked/failed). Schema applied idempotently on startup via `applySchema()`. Use `clock_timestamp()` (not `NOW()`) inside transactions. CHECK constraints: `projects_status_chk`/`sessions_status_chk`/`chats_status_chk` ('open'|'archived'), `messages_role_chk`, `messages_status_chk` — keep in sync with the `*_STATUSES` const arrays in `apps/server/src/types/api.ts`. **Two schema files, one DB:** `apps/server/src/schema.sql` owns `sessions`/`chats`/`messages`/`message_parts`; `apps/coder/src/schema.sql` (applied by the boocoder host service) owns `agent_sessions`, `worktrees`, `pending_changes`, `available_agents` and extends `tasks` — so e.g. an `agent_sessions` FK change goes in the **coder** schema. Idempotent FK-action flips (e.g. `ON DELETE CASCADE` → `SET NULL`) guard on `pg_constraint.confdeltype` so re-runs are no-ops.
Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap new constraint ADD in `DO $$ ... pg_constraint` guard — that block is the only way to get `ADD CONSTRAINT IF NOT EXISTS`. Schema CHECK migration order when renaming allowed values: (1) `ALTER TABLE ... DROP CONSTRAINT IF EXISTS <system_name>` (inline `CREATE TABLE` checks get `<table>_<column>_check`), (2) `UPDATE` rows to new values, (3) wrap the new constraint ADD in a `DO $$ ... pg_constraint` guard — the only way to get `ADD CONSTRAINT IF NOT EXISTS`.
**`CREATE OR REPLACE VIEW` can't reorder/rename columns** (Postgres `42P16`): append a new `messages_with_parts` column at the END of the SELECT — a mid-list insert shifts an existing column → crash-loops boot. Add it to each explicit read SELECT too (`routes/messages.ts`/`chats.ts`/`ws.ts`).
## Environment
Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only scope for add-existing path resolution), `BOOTSTRAP_ROOT` (/opt/projects, writable scope for create-new-project bootstrap mkdir target — host must `mkdir -p /opt/projects` before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale Fathom; the public `search.indifferentketchup.com` is behind Authelia and unusable from server context), `BOOCODE_TOOLS` (`core` | `standard` | `all`, default `all`; v1.13.15-tools tier filter — ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (optional; default `/data/mcp.json` — JSON config for MCP servers matching opencode's `mcpServers` shape; file missing = no MCP). Required: `DATABASE_URL`, `LLAMA_SWAP_URL`. Optional: `PORT` (3000), `HOST` (0.0.0.0), `PROJECT_ROOT_WHITELIST` (/opt, read-only add-existing scope), `BOOTSTRAP_ROOT` (/opt/projects, writable bootstrap mkdir target — host must `mkdir -p` it before container start), `DEFAULT_MODEL`, `LOG_LEVEL`, `SEARXNG_URL` (default `http://100.114.205.53:8888` — internal Tailscale; the public host is behind Authelia, unusable from server context), `BOOCODE_TOOLS` (`core`|`standard`|`all`, default `all`; a ceiling, never expands an agent's whitelist), `MCP_CONFIG_PATH` (default `/data/mcp.json`, opencode `mcpServers` shape; missing = no MCP), `CONTEXT7_API_KEY` (the Context7 MCP key, referenced from `data/mcp.json` as `"{env:CONTEXT7_API_KEY}"`). `data/mcp.json` is **gitignored** but no longer holds secrets — string values support opencode-style `{env:VAR}` substitution (`mcp-config.ts:substituteEnvVars`, applied before Zod validation; unset var → `''` + warn), so real keys live in `.env`; template `data/mcp.example.json`. A config-only edit there needs only `docker compose restart boocode` (data/ is bind-mounted); changing a referenced secret edits `.env`. 
MCP loads at server startup with per-server graceful degradation; the coder does NOT load MCP (BooChat only).
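The `{env:VAR}` substitution described above can be illustrated with a short sketch. It is not the code in `mcp-config.ts`: `substituteEnv`, the `Json` type, and the variable-name pattern are hypothetical, and only the behavior stated in the notes (string values only, unset variable becomes `''` with a warning, applied before validation) is taken from the source.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper: walks a parsed config and expands {env:VAR} inside
// string values. Assumption: variable names match [A-Za-z_][A-Za-z0-9_]*.
export function substituteEnv(
  value: Json,
  env: Record<string, string | undefined>,
  warn: (message: string) => void = console.warn,
): Json {
  if (typeof value === "string") {
    return value.replace(/\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) {
        // Unset variable: substitute an empty string and warn.
        warn(`env var ${name} is not set; substituting empty string`);
        return "";
      }
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((item) => substituteEnv(item, env, warn));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, item] of Object.entries(value)) {
      out[key] = substituteEnv(item, env, warn);
    }
    return out;
  }
  // Numbers, booleans, and null pass through unchanged.
  return value;
}
```

Running the substitution before schema validation means the validator sees the final strings, so a key that resolves to `''` can still be caught by a non-empty constraint if the schema has one.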
BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`. BooCoder at port 9502: `curl http://100.114.205.53:9502/api/health`. Runs as `boocoder.service` on the host (not Docker). Deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Health reports tool count: `{"ok":true,"db":true,"tools":33}`.
- `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL when unset. Set to a small model on llama-swap (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls. - `FAST_MODEL` (optional) — cheaper model for titles, summaries, labeling (auto_name.ts, tool-summaries.ts). Falls back to session model or DEFAULT_MODEL. Set to a small llama-swap model (e.g. `nemotron-nano-4b`) to avoid loading the 35B for 20-token calls.
- Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 required on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — non-interactive mode (`-p`) runs autonomously without approval prompts. ACP bridge is HTTP daemon (not stdio); use PTY dispatch. - Qwen Code dispatch: `OPENAI_BASE_URL=http://100.101.41.16:8401/v1 OPENAI_API_KEY=dummy qwen -p "<task>" --output-format stream-json`. Install: `npm install -g @qwen-code/qwen-code@latest`. Node ≥22 on host (container stays Node 20; BooCoder dispatches via direct spawn on host). No `--yolo` flag — `-p` runs autonomously without prompts. ACP bridge is an HTTP daemon (not stdio); use PTY dispatch.
- Arena (v2.0.5): `POST /api/arena {project_id, input, contestants: [{agent?, model?}]}` dispatches the same task to N models/agents in parallel. Each contestant gets its own task + worktree. `GET /api/arena/:id` for results. `POST /api/arena/:id/select/:task_id` picks winner. - Arena: `POST /api/arena {project_id, input, contestants: [{agent?, model?}]}` dispatches the same task to N models/agents in parallel; each contestant gets its own task + worktree. `GET /api/arena/:id` for results; `POST /api/arena/:id/select/:task_id` picks a winner.
## Workflow
- Sam reviews all diffs and commits manually. Do not commit unless explicitly asked.
- Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`. Already-shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape; see `openspec/README.md` for the convention. - Sam often has uncommitted `apps/web` work in flight — stage your own commits **explicitly by path** (never `git add -A`); `docker compose up --build -d boocode` builds the working tree, so a container rebuild also ships his uncommitted web changes.
- Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`). Monotonic per minor — the slug describes the batch's content so the tag name alone is enough to recall what shipped. No letter suffixes (`-a`/`-b`), no pseudo-ranges (`v1.11.x`), no slug-only sub-versions sharing a number (`v1.13.15-tools` + `-openspec` + `-agentlint` — split into sequential patches instead). - **Deploy by surface:** an `apps/coder` change → `sudo systemctl restart boocoder`; an `apps/web` or `apps/server` change → `docker compose up --build -d boocode` (rebuilds web+server from the working tree). The `boocode` container is `build: .`, so uncommitted changes deploy; web edits are live on the Vite dev server (HMR) but NOT on production (`:9500` / code.indifferentketchup.com) until a rebuild. Use `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue.
- `CHANGELOG.md` is the per-tag release log, most-recent on top. When a new tag is created, add a `## <tag> — <YYYY-MM-DD>` section with a 3–6 sentence paragraph summarizing what shipped, drawn from the commit body. Cross-reference other tags by name when the batch builds on, fixes, or pairs with prior work (e.g. "pairs with `v1.13.12-ws-schemas`", "fixed in `v1.13.5-stability-bundle`"). No nested bullets — one paragraph. - Cutting a release: name the feature branch DIFFERENTLY from the tag (branch `f1-interrupt-guard`, tag `v2.6.7-interrupt-guard`) — identical names trigger `warning: refname ... is ambiguous`.
- Deploy: `cd /opt/boocode && docker compose up --build -d` (or `docker compose build --no-cache boocode && docker compose up -d` if you suspect a layer-cache issue). - Per-batch docs live under `openspec/changes/<slug>/{proposal,tasks,design}.md`; shipped batches are snapshots in `openspec/changes/archived/`. New batches follow the proposal+tasks shape (see `openspec/README.md`).
- The `boocode` container is `build: .` — it builds web+server from the **working tree**, so uncommitted changes deploy. Web edits are live on the Vite dev server (HMR) but NOT on production (`:9500` / code.indifferentketchup.com) until `docker compose up --build -d boocode`. - Tag naming: `vMAJOR.MINOR.PATCH-slug` (e.g. `v1.13.13-ws-publish`), monotonic per minor — the slug alone recalls what shipped. No letter suffixes, no pseudo-ranges, no slug-only sub-versions sharing a number (split into sequential patches).
- Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`. - `CHANGELOG.md` is the per-tag release log, newest on top. New tag → add a `## <tag> — <YYYY-MM-DD>` section, one 3–6 sentence paragraph (no nested bullets) from the commit body; cross-reference related tags by name when the batch builds on / fixes / pairs with prior work.
- Git push to Gitea: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin <branch>`. The default agent identity is rejected; the in-repo deploy key (`secrets/`, gitignored) is the working one. Transient `Connection reset by peer` retries cleanly after `sleep 5`. Keep both remotes synced: push `main` + the release tag to `origin` (Gitea, deploy key above) AND `backup` (`git@github.com:indifferentketchup/boocode.git`, default key).
- Don't accumulate `.bak-*` files. Clean them up in the same batch or immediately after merge.
- DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`. Host port is 5500 (mapped from `boocode_db:5432`); password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL=postgres://boocode:Ketchup1479@boocode_db:5432/...` line. `psql` is not on the host PATH — for an interactive query use `docker exec boocode_db psql -U boocode -d boochat -c "..."`. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` with a `beforeAll` that applies the schema via `sql.unsafe(readFileSync(schemaPath))`. Tests skip cleanly when var is unset. `tool_cost_stats.test.ts` is the reference. - DB-integration tests opt-in via env var: `DATABASE_URL='postgres://boocode:devpass@localhost:5500/boochat' pnpm -C apps/server test`. Host port 5500; password is `${POSTGRES_PASSWORD}` from `.env` (`devpass`), NOT the literal in `.env`'s `DATABASE_URL` line. `psql` isn't on host PATH — use `docker exec boocode_db psql -U boocode -d boochat -c "..."`. Pattern: `describe.runIf(!!process.env.DATABASE_URL)(...)` + `beforeAll` applying schema via `sql.unsafe(readFileSync(schemaPath))`. `tool_cost_stats.test.ts` is the reference.
- Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The boocode container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`. - Host-side smoke endpoint: `curl http://100.114.205.53:9500/api/...`. The container's port mapping binds to the Tailscale IP, not `0.0.0.0`, so `localhost:9500` doesn't work from the host shell. Same for booterm at `:9501`.
- Frontend blank-screen / runtime crash: get the stack-trace column offset from the browser console, then `cut -c <start>-<end> apps/web/dist/assets/index-*.js | sed -n '<line>p'` to read the exact minified expression that threw. Faster than bisecting source. Watch for `=== null`/`!== null` on optional fields fed an `as unknown as` cast — those bypass tsc. - Frontend blank-screen / runtime crash: get the stack-trace column offset from the browser console, then `cut -c <start>-<end> apps/web/dist/assets/index-*.js | sed -n '<line>p'` to read the exact minified expression that threw. Watch for `=== null`/`!== null` on optional fields fed an `as unknown as` cast — those bypass tsc.
- Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without setting `Content-Type` tricks on the client. - Fastify global JSON parser tolerates empty bodies (overridden in `index.ts`); bodyless POSTs (archive, unarchive, stop) work without `Content-Type` tricks on the client.
- Event dedup discipline: for any mutation the server publishes via `broker.publishUser`, do NOT add a local `sessionEvents.emit(...)` after the API call — `useUserEvents` forwards the WS frame onto the bus. Frontend mutation handlers must be idempotent (dedup by id, no-op on already-present).
- `node:20-*` base images ship a `node` user at uid/gid 1000 — delete it (`userdel`/`groupdel` on debian, `deluser`/`delgroup` on alpine) before adding samkintop at 1000.
- node-pty's compiled `.node` is libc-specific: proddeps and runtime Dockerfile stages must share libc (alpine↔musl or bookworm-slim↔glibc); the TS-only builder stage can stay alpine for speed.
- pnpm 10 `--frozen-lockfile` skips node-pty's postinstall — the Docker proddeps stage runs `cd node_modules/node-pty && npm run install` to force the native compile.
- A local PreToolUse hook (`security_reminder_hook.py`) regex-flags Node's older `child_process` spawn helpers as unsafe (false positive even on the File-suffixed variant). Use `spawn` — it's accepted.
- `/opt/boolab` hosts a working sibling BooCode terminal at `boocode.indifferentketchup.com`. Useful for visual side-by-side comparison on the same iPhone when debugging booterm rendering. Boolab uses Tailwind v3 (`@tailwind base`); boocode uses v4 — many subtle build differences. Don't assume parity. - `/opt/boolab` hosts a sibling BooCode at `boocode.indifferentketchup.com` — useful for side-by-side iPhone comparison when debugging booterm rendering. It uses Tailwind v3, boocode uses v4 — don't assume build parity.
- booterm SSHs to the host as `samkintop@100.114.205.53` (the Tailscale IP). The hostname `ubuntu-homelab` (shown in the bash prompt after login) does NOT resolve from inside the container — only the host's `/etc/hosts` knows it. Override via `BOOTERM_SSH_HOST` / `BOOTERM_SSH_USER` env vars in docker-compose if you ever move the shell to a different machine. - booterm SSHs to the host as `samkintop@100.114.205.53` (the Tailscale IP). The hostname `ubuntu-homelab` (in the bash prompt) does NOT resolve inside the container. Override via `BOOTERM_SSH_HOST` / `BOOTERM_SSH_USER` env vars in docker-compose if the shell moves to a different machine.
- codecontext sidecar lives at `/opt/boocode/codecontext/`. Sidecar HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore` at project root is honored when `--respect-gitignore` is passed (enabled in the shim). - codecontext sidecar lives at `/opt/boocode/codecontext/`. HTTP API at `http://codecontext:8080/v1/<tool_name>` over the `boocode_net` bridge (no host port). BooCode wrappers in `apps/server/src/services/tools/codecontext/`. The `.codecontextignore` at project root is honored when `--respect-gitignore` is passed (enabled in the shim).
- codecontext fork at `/opt/forks/codecontext/` — separate git repo (branch `boocode-ts`), pushed via the same boocode_gitea SSH key to `indifferentketchup/codecontext`. Build: `go build ./...`. Test: `go test ./...`. Docker rebuild requires staging the fork source first: `tar -czf codecontext/fork.tar.gz -C /opt/forks/codecontext --exclude=.git --exclude=bin .` then `docker compose build --no-cache codecontext`. The Dockerfile COPYs `fork.tar.gz` into the builder stage (Gitea is behind Authelia, no HTTP clone). `fork.tar.gz` is gitignored. - codecontext fork at `/opt/forks/codecontext/` — separate git repo (branch `boocode-ts`), pushed via the boocode_gitea SSH key to `indifferentketchup/codecontext`. Build `go build ./...`; test `go test ./...`. Docker rebuild requires staging the fork first: `tar -czf codecontext/fork.tar.gz -C /opt/forks/codecontext --exclude=.git --exclude=bin .` then `docker compose build --no-cache codecontext` (the Dockerfile COPYs `fork.tar.gz` into the builder stage; Gitea is behind Authelia, no HTTP clone). `fork.tar.gz` is gitignored.
- Go binary: `/snap/go/current/bin/go` (not on PATH by default). Use `export PATH=$PATH:/snap/go/current/bin` or full path for Go commands. - Go binary: `/snap/go/current/bin/go` (not on PATH). Use `export PATH=$PATH:/snap/go/current/bin` or the full path.
- `os/exec` child supervisors must explicitly call `child.Wait()` in a goroutine and `os.Exit` on child death. `Signal(0)` returns nil on zombies and is NOT a liveness check. Without `Wait()`, docker's `restart: unless-stopped` policy never fires because the parent stays alive. The `codecontext/shim.go` implementation is the reference pattern. - `os/exec` child supervisors must call `child.Wait()` in a goroutine and `os.Exit` on child death. `Signal(0)` returns nil on zombies and is NOT a liveness check. Without `Wait()`, docker's `restart: unless-stopped` never fires because the parent stays alive. `codecontext/shim.go` is the reference.
## Conventions
- `overflowWrap` not `wordWrap` — TypeScript's CSSStyleDeclaration marks `wordWrap` as deprecated (error 6385). Cross-cutting only. Per-app conventions live in the matching `apps/*/CLAUDE.md`.
- No app-layer auth. Authelia handles auth at the reverse proxy. All `broker.publishUser`/`subscribeUser` calls use `'default'` as the user key.
- TypeScript strict mode. Both apps share `tsconfig.base.json`. - TypeScript strict mode. Both apps share `tsconfig.base.json`. Server + coder use NodeNext module resolution (`.js` extensions in imports).
- Server uses NodeNext module resolution (`.js` extensions in imports).
- Discriminated unions for type narrowing: `Pane` (by `kind`), `SessionEvent` (by `type`), `InferenceFrame` (by `type`).
- **Adding a new WS frame type** requires updating BOTH the server's `InferenceFrame` (loose `type:` union + optional fields in `services/inference/turn.ts`) AND the web `WsFrame` (strict discriminated union in `apps/web/src/api/types.ts`). Server publish is permissive; the frontend type is the wire-format gate. The `'usage'` frame added in v1.12.2 needed both sides; missing the web side silently drops the frame at JSON-parse. - **Adding a new WS frame type** (cross-app): add it to `WsFrameSchema` in `packages/contracts/src/ws-frames.ts` (single source of truth; rebuild with `pnpm -C packages/contracts build`). The server's `InferenceFrame` loose union (`services/inference/turn.ts`) and the web's strict `WsFrame` discriminated union (`apps/web/src/api/types.ts`) still exist separately and also need updating. Server publish is permissive; the frontend type is the wire-format gate; missing the web side silently drops the frame at JSON-parse.
- shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive. - **Sentinels** (cross-app) are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. `MessageMetadata` is single-sourced in `@boocode/contracts` (`packages/contracts/src/message-metadata.ts`). A new kind requires updating that file and rebuilding the package, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- `ui/` primitives present: button, card, context-menu, dialog, dropdown-menu, input, label, radio-group, sonner, textarea. No switch/sheet/drawer/badge/checkbox — use a `<button role="switch" aria-checked>` toggle (a hand-rolled `Switch` already lives in `SettingsPane.tsx`) and a Dialog-based panel for "drawers". - **Provider snapshot types** (`ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`, `ThinkingOption`, `AgentCommand`, `ProviderSnapshotStatus`) are single-sourced in `@boocode/contracts` (`packages/contracts/src/provider-snapshot.ts`); `apps/coder/src/services/provider-types.ts` re-exports them. Edit the package source; there is no hand-synced web copy to update.
- `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension-to-language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names. - **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of an object/array). Pattern in `parts.ts`, `settings.ts`.
- Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles. - Skills live in `data/skills/<vendor>/`; Sam's own namespace is `boocode/` (`committing-changes`, `using-worktrees`, `improving-boocode-guidance`, `systematic-debugging`) — `SKILL.md` + optional `eval.yaml` (gerund names; eval = `skill:` + `tasks:` of `prompt`+`grader`, incl. a negative-trigger task). `data/skills/` is canonical; a divergent mirror at `/opt/skills/` exists.
- `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
- Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` is the wrapper that does this. `addSplitPane` returns the new pane id for callers. ### Coding standards
- A scrollable list inside a Dialog on mobile: cap `DialogContent` (`max-h-[85vh]` + `grid-rows-[auto_minmax(0,1fr)_auto]`) and make the list the single scroll region with `overscroll-contain` — otherwise touch-scroll drags the whole fixed modal / chains to the page.
- xterm.js v5 uses canvas rendering — browser doesn't see xterm's selection; the native right-click menu has no working Copy for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path. Coding standards live in `docs/coding-standards/` (canonical, human-readable). They are exposed to Claude Code through per-file-type/subsystem index files under `.claude/rules/coding-standards/`. Each index is a path-scoped rule that lists the standards relevant to its `paths:` glob with a one-line description of each. When Claude reads a file matching an index's `paths:`, it loads only that small index and then decides which (if any) standards to open with Read — the full text of a standard is never loaded automatically, and standards do not appear in the skills picker. Browse `docs/coding-standards/` for the readable form.
- **New tools** live in their own `services/<name>.ts` file (see `web_search.ts`, `web_fetch.ts`) — exports a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real dependencies. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')` — cleanup is simpler and the production call site stays unchanged.
- **DB/session-aware tools** take an optional 4th `ToolExecCtx { sql, sessionId }` arg on `ToolDef.execute`, plumbed `executeToolPhase` → `executeToolCall` → `execute`. It's optional so the filesystem tools and the `apps/coder` `ALL_TOOLS` consumer stay compatible; filesystem tools ignore it. `read_tab_by_number` (reads `sessions.workspace_panes` + the chat's messages via `sql`) is the reference.
- **Sentinels** are `role='system'` rows with structured `metadata.kind` (`cap_hit`, `doom_loop`). UI-only — `buildMessagesPayload` strips them via `isAnySentinel` so the LLM never sees them. A new kind requires arms in `MessageMetadata` in BOTH `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts`, plus a render branch in `apps/web/src/components/MessageBubble.tsx`.
- **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and calls `controller.close()` before the consumer reads, so a subsequent `reader.cancel()` finds the stream already closed and the `cancel()` callback never fires. Also provide MORE chunks than the test will consume so the source stays in 'readable' state when cancel runs (e.g. cap test reads ~6 chunks, stub provides 10).
- React **StrictMode is on** (`main.tsx`): an updater passed to one `setState` that itself calls another `setState` (e.g. `setClosedPaneStack` inside a `setPanes` updater) is double-invoked in dev. Make such nested updates idempotent — `useWorkspacePanes`'s `appendClosed` dedupes a value-identical top entry for exactly this reason.
- Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded. `services/agents.ts` `ALL_TOOL_NAMES` had this drift class until v1.12 — same pattern applies to any future tool-aware code.
- Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo — removed in v1.12 to eliminate the two-files-must-stay-in-sync drift. The `getAgentsForProject` per-project override mechanism remains for *other* projects.
- `data/AGENTS.md` is PARSED (`agents.ts` `splitSections`/`parseAgentSection`): each `## <Name>` is one agent and must be followed by a `---` frontmatter fence or the block throws; content before the first `## ` is discarded. Do NOT add free-form `## ` rule sections — they break the registry. Cross-cutting agent rules go in CLAUDE.md or a parser-ignored preamble.
- Skills live in `data/skills/<vendor>/`; Sam's own namespace is `boocode/` (`committing-changes`, `using-worktrees`, `improving-boocode-guidance`) — `SKILL.md` + optional `eval.yaml` (gerund names; eval = `skill:` + `tasks:` of `prompt`+`grader`, incl. a negative-trigger task). `data/skills/` is canonical; a divergent mirror at `/opt/skills/` exists.
- MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers. The `codecontext/shim.go` framing implementation is the reference; per the MCP spec (modelcontextprotocol.io/specification/server/transports).
- **Workspace dependency pattern** (`apps/coder` → `@boocode/server`): the consuming package adds `"@boocode/server": "workspace:*"` in `package.json`. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath: `"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`. Without the `types` condition, NodeNext resolution can't find `.d.ts` files and tsc fails with "Cannot find module" in the consumer.
- **JSONB columns**: use `sql.json(value as never)` — NOT `${JSON.stringify(value)}::jsonb` which double-serializes (stores a JSON string instead of a JSON object/array). Pattern established in `parts.ts`, `settings.ts`.
- **`payload.ts:loadContext` SELECT**: must include every `Session` field that downstream code reads. The tool phase reads `session.allowed_read_paths`; if the SELECT omits it, cross-repo read grants silently fail. The `Session` TypeScript type doesn't catch this because `sql<Session[]>` doesn't enforce column coverage.
- **Sidecar routing** (`services/inference/provider.ts`): `upstreamModel(config, modelId, agent)` routes to `LLAMA_SIDECAR_URL` when agent has `llama_extra_args`, otherwise `LLAMA_SWAP_URL`. `resolveRoute(agent)` returns `{route: 'swap'|'sidecar', flags}`. Sidecar provider created fresh per call (not cached) because `X-Agent-Flags` header varies per agent. Boot-time guard in `index.ts` refuses to start if any agent has `llama_extra_args` but `LLAMA_SIDECAR_URL` is unset.
- **Secret guard safe patterns** (`services/secret_guard.ts`): `.env.example`, `.env.sample`, `.env.template`, `.env.defaults` are allowlisted via `SAFE_PATTERNS` set. Do NOT add `.env.production`/`.env.development`/`.env.test` — those can hold real secrets.
- **CoderPane uses ChatInput** (`components/panes/CoderPane.tsx`): shares the same `ChatInput` component as BooChat for full parity — attachments, paste-to-chip, auto-grow textarea, queued messages during send. CoderPane's `sendOneMessage` is the send callback; queued messages drain via `useEffect` when `sending` goes false.
- **Adding a new `SessionEvent` type**: add the interface, add it to the `SessionEvent` union, add a `case` in `useSidebar.ts` `applyEvent` switch (no-op `return prev` is fine), and subscribe in any hook that needs it (e.g. `useSessionStream` for `refetch_messages`).
- **BooCoder provider registry** (`apps/coder/src/services/provider-registry.ts`): static list of provider defs (boocode, opencode, goose, claude, qwen). `PROBED_AGENT_NAMES` derives from it. Adding/removing providers means editing this file, not the frontend.
- **AgentComposerBar filters `e.installed`**: provider snapshot entries with `installed:false` (loading/unavailable) are dropped from the dropdown. `getProviderSnapshot` must await the full build — returning synchronous `loading` placeholders makes every provider vanish (the v2.5.7 "no providers showing up" regression); surfacing loading states needs a client poll.
- **Coder↔web provider-type parity** (`apps/coder/src/services/provider-types.ts` ↔ `apps/web/src/api/types.ts`): enforced by runtime `provider-types-parity.test.ts` (compile-time cross-import is blocked by TS6307 on web's composite tsconfig). Mirror of the ws-frames parity pattern: edit both copies together or the test fails.
- **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) instead discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins` `skills/`+`commands/`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in — BooChat passes flat `items` (unchanged).
- **Pane header architecture (mobile vs desktop)**: Desktop coder pane header (BooCode label + [+] [×]) lives in `Workspace.tsx` gated by `isCoder && !isMobile`. Mobile coder controls (● ×) live in `Session.tsx` header row next to `MobileTabSwitcher`/`NewPaneMenu`. `AgentComposerBar` (provider/mode/model pickers) renders inside `CoderPane.tsx` on both. The ● status dot is passed via `connected` prop from CoderPane to AgentComposerBar.
- **MessageBubble shared between BooChat and BooCoder** (`components/MessageBubble.tsx`): accepts optional `actions?: MessageActions` callbacks (onRegenerate, onResend, onFork, onDelete) and `hideActions?: ('fork'|'delete'|'openInPane')[]`. Defaults use BooChat API; CoderPane overrides via `CoderMessageList` props. `CoderTextBubble` was removed. **`CoderMessageList` passes `CoderMessageWire as unknown as Message`**: the coder wire shape lacks `metadata`/`kind`/`summary`, so those fields are `undefined` (not `null`) on coder messages. Null-guards on any `Message` field MUST use loose `!= null`, not strict `!== null` (`undefined !== null` is `true` → `.kind` throws → blank-screen crash). The `as unknown as` cast hides this from tsc; build + typecheck pass while runtime crashes.
- **llama-sidecar** (`/opt/forks/llama-sidecar/`): Go daemon for per-agent llama-server process pool. Cross-compile: `GOOS=windows GOARCH=amd64 /snap/go/current/bin/go build -o bin/llama-sidecar.exe ./cmd/llama-sidecar`. Gitea: `indifferentketchup/llama-sidecar`. Windows child process gotchas: use `context.Background()` for child lifetime (not request ctx), `os.Open(os.DevNull)` for stdin, `os.Pipe()` for stdout with drain goroutine, `DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP` creation flags. SSH to sam-desktop: `ssh samki@100.101.41.16`; use `schtasks` for persistent process spawning (SSH `start /B` doesn't survive session close).
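
The loose null-guard rule for `Message` fields noted above can be illustrated with a minimal sketch. The `MessageLike` type and all three helpers are hypothetical names written only for this example; the inner cast stands in for the `as unknown as Message` cast that hides `undefined` from tsc.

```typescript
// Hypothetical stand-in for a message whose `metadata` may be absent on the wire.
interface MessageLike {
  metadata?: { kind: string } | null;
}

// Mimics the `as unknown as Message` cast: tsc believes the field is
// `{ kind } | null`, but at runtime it can still be `undefined`.
function declaredMetadata(m: MessageLike): { kind: string } | null {
  return m.metadata as { kind: string } | null;
}

// Strict guard: `undefined !== null` is true, so `.kind` is read off undefined.
function kindStrict(m: MessageLike): string | undefined {
  const meta = declaredMetadata(m);
  return meta !== null ? meta.kind : undefined;
}

// Loose guard: `!= null` rejects both null and undefined.
function kindLoose(m: MessageLike): string | undefined {
  const meta = declaredMetadata(m);
  return meta != null ? meta.kind : undefined;
}
```

Both functions typecheck identically, which is why the difference only shows up at runtime when a coder message without `metadata` reaches the strict variant.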


@@ -5,11 +5,15 @@ RUN corepack enable
WORKDIR /build WORKDIR /build
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./ COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
COPY packages/contracts/package.json ./packages/contracts/
COPY apps/server/package.json ./apps/server/ COPY apps/server/package.json ./apps/server/
COPY apps/web/package.json ./apps/web/ COPY apps/web/package.json ./apps/web/
RUN pnpm install --frozen-lockfile RUN pnpm install --frozen-lockfile
# @boocode/contracts must be present before `pnpm build`, which builds it FIRST
# (root build script) so apps/web can resolve its compiled dist via the exports map.
COPY packages/contracts ./packages/contracts
COPY apps/server ./apps/server COPY apps/server ./apps/server
COPY apps/web ./apps/web COPY apps/web ./apps/web


@@ -58,7 +58,7 @@ upstream and inject `Remote-User`. Postgres binds loopback only.
BooCoder runs as a **host systemd service** (`boocoder.service`, port `:9502`), not in Docker: BooCoder runs as a **host systemd service** (`boocoder.service`, port `:9502`), not in Docker:
```bash ```bash
pnpm -C apps/server build && pnpm -C apps/coder build pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build
sudo systemctl restart boocoder sudo systemctl restart boocoder
curl http://100.114.205.53:9502/api/health curl http://100.114.205.53:9502/api/health
``` ```


@@ -14,3 +14,4 @@ GITEA_SSH_HOST=100.114.205.53:2222
MCP_CONFIG_PATH=/data/mcp.json MCP_CONFIG_PATH=/data/mcp.json
SKILLS_ROOT=/opt/boocode/data/skills SKILLS_ROOT=/opt/boocode/data/skills
CODER_PROVIDERS_PATH=/opt/boocode/data/coder-providers.json CODER_PROVIDERS_PATH=/opt/boocode/data/coder-providers.json
CLAUDE_SDK_BACKEND=1

34
apps/coder/CLAUDE.md Normal file

@@ -0,0 +1,34 @@
# apps/coder — BooCoder (deep reference)
> Per-app engineering notes for `apps/coder/src/`. BooCoder runs as a **systemd service on the host** (`boocoder.service`), NOT in Docker — Fastify at port 9502, postgres at `127.0.0.1:5500`. Cross-cutting commands, database, environment, workflow, and cross-app contracts live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/coder/`.
## Probe & provider discovery
- **`services/provider-registry.ts`** — Static registry of provider metadata (label, transport, model source). `PROVIDERS` array, `PROVIDERS_BY_NAME` map. 5 providers: boocode (native), opencode (acp), goose (pty), claude (pty), qwen (pty). `PROBED_AGENT_NAMES` derives from it — adding/removing providers means editing this file, not the frontend.
- **`services/agent-probe.ts`** — Startup probe via direct `exec()` (not SSH): discovers installed agents, versions, ACP support, models. Qwen models from `~/.qwen/settings.json`; Claude models static from the registry. Persisted to `available_agents`.
- **`routes/providers.ts`** — `GET /api/providers` returns installed providers with models. Transport reflects actual capability (checks `supports_acp` from DB, not just registry preference). The apps/server side is "Provider picker dispatch" (see `apps/server/CLAUDE.md`).
- **Provider snapshot lifecycle** (`services/`): `provider-config.ts` (Zod config, never-throws) → `provider-config-registry.ts` (`buildResolvedRegistry`, singleton) → `provider-snapshot.ts` (two-tier probe: tier-1 fast presence, tier-2 cold ACP probe skipped unless force / stale `PROVIDER_PROBE_TTL_MS` 24h / dbEmpty; cached). Verify live: `curl http://100.114.205.53:9502/api/providers/snapshot` — returns providers + models + commands, the exact shape `AgentComposerBar` renders.
- `PATCH /api/providers/config` replaces a provider id's override object **wholesale** (per-id shallow merge) — to flip one field send `{...existing, enabled}`, or a custom ACP entry's `command`/`label` is wiped and it drops out of the resolved registry. `data/coder-providers.json` is **gitignored** (live runtime config — the coder reads AND writes it on UI toggles); tracked reference is `data/coder-providers.example.json`. The loader falls back to `{providers:{}}` (built-ins only) when absent, so a fresh checkout needs no copy.
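
The wholesale-replace behavior of the config PATCH described above can be sketched as follows. This is an illustration under stated assumptions, not the coder's implementation: `ProviderOverride`, `applyPatch`, and `toggleBody` are hypothetical names, and `applyPatch` only simulates the per-id replacement the note describes.

```typescript
// Hypothetical shape of one provider override; only `enabled`, `command`,
// and `label` are named in the notes above.
interface ProviderOverride {
  enabled?: boolean;
  command?: string;
  label?: string;
}

// Simulates the wholesale replacement: the stored override for an id
// becomes exactly what the client sent, with no per-field merge.
function applyPatch(
  stored: Record<string, ProviderOverride>,
  id: string,
  sent: ProviderOverride,
): Record<string, ProviderOverride> {
  return { ...stored, [id]: sent };
}

// Safe toggle: resend the existing fields alongside the changed flag.
function toggleBody(existing: ProviderOverride, enabled: boolean): ProviderOverride {
  return { ...existing, enabled };
}
```

Sending only `{ enabled: false }` through `applyPatch` leaves the entry without `command` or `label`, which is the failure mode the note warns about; `toggleBody` keeps them.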
## Build, deploy, dispatch
- **Workspace dependency on `@boocode/server`**: imports `createInferenceRunner`, `createBroker`, `ALL_TOOLS`, `appendMcpTools` from the server's compiled `dist/`. apps/server's `package.json` has an `exports` map with `types` conditions for NodeNext resolution. **apps/server must build FIRST.**
- Build + deploy: `pnpm -C packages/contracts build && pnpm -C apps/server build && pnpm -C apps/coder build && sudo systemctl restart boocoder`. Env file at `apps/coder/.env.host`. Service file at `/etc/systemd/system/boocoder.service`.
- After `pnpm -C apps/coder build` the host service keeps running the OLD process until `sudo systemctl restart boocoder` — a stale process shows **new routes 404 with `{error:'not found'}` while old routes still 200** (the `/api` not-found handler shape). Restart, don't re-debug.
- `:9502/api/health` is down for ~15-20s after a boocoder restart while the startup agent-probe scan runs. Retry; an early connection-refused is not a failed deploy.
- Agent dispatch spawns binaries directly using `install_path` from `available_agents` — no `spawn('sh', ['-c', ...])` (fails under systemd). Paseo's pattern: `spawn(fullBinaryPath, argsArray, { cwd })`.
- systemd hardening: only `NoNewPrivileges=true` is safe. `ProtectSystem`, `ProtectHome`, `PrivateTmp` all break agent dispatch (agents need full filesystem access to read configs, write to worktrees).
- `apps/server/tsconfig.json` has `declaration: true` so `.d.ts` files exist for workspace consumers. The provider's `package.json` needs `exports` with `types` + `default` conditions per subpath (`"./inference": { "types": "./dist/.../index.d.ts", "default": "./dist/.../index.js" }`) — without the `types` condition, NodeNext can't find `.d.ts` files and tsc fails "Cannot find module" here.
- Write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`) queue in `pending_changes`. Nothing hits disk until `apply_pending`. `write_guard.ts` validates paths (resolve + prefix-check, no realpath since files may not exist for creates).
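
The resolve + prefix-check validation mentioned for `write_guard.ts` can be sketched like this. It is a minimal illustration, not the file's actual code: `isInsideRoot` is a hypothetical name, and the sketch assumes a single allowed root directory.

```typescript
import { resolve, sep } from 'node:path';

// Hypothetical helper: returns true when `candidate` resolves to the root
// itself or to a path beneath it. No realpath is used, so the target file
// does not need to exist yet (as with a create).
function isInsideRoot(root: string, candidate: string): boolean {
  const absRoot = resolve(root);
  const absPath = resolve(absRoot, candidate);
  // Appending the separator keeps a sibling such as `/work/repo-evil`
  // from matching the `/work/repo` prefix.
  return absPath === absRoot || absPath.startsWith(absRoot + sep);
}
```

Because symlinks are not resolved, a check of this kind only guards against `..` traversal and absolute-path escapes.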
## Backends
> Behavioral overview + flows + data model: see [/docs/coder-backends.md](/docs/coder-backends.md). The notes below are the deep per-fact reference.
- **opencode** runs as a warm HTTP server (`services/backends/opencode-server.ts`: `opencode serve` per BooCoder process, one opencode session per BooCode session, resumed via `agent_sessions`). goose/qwen/claude dispatch **one-shot** ACP/PTY with no ctx/token usage; only native `boocode` (llama-swap) tracks ctx.
- **opencode SSE** (`opencode-server.ts`): live streaming is `session.next.text.delta` / `.reasoning.delta` / `.tool.{called,success,failed}` — NOT `message.part.*` (terminal/post-hoc). `client.event.subscribe({ directory })` MUST pass the session's worktree dir; omit it and opencode scopes events to the server `process.cwd()` → zero session events (empty turns, 180s timeout). Each live session owns its own subscribe loop + AbortController (a `sessionID` demux guard drops cross-session events when two share a dir). Turn completes on `session.idle`; `promptAsync` is fire-and-forget (204).
- **opencode model strings** must be provider-prefixed (`llama-swap/<model>`) AND exist in `~/.config/opencode/opencode.json` `provider.llama-swap.models` — not merely loadable by llama-swap. `parseModel` infers `llama-swap/` for a bare id; the dispatcher coalesces empty→DEFAULT_MODEL then prefixes. `agent-probe` populates opencode's `available_agents.models` via `mergeLlamaSwap` (fetches `/v1/models`); empty model list → frontend sends `''` → no inference (empty turn).
- **agent_sessions resume**: `config_hash = sha256('opencode_server|<model>')` — must NOT include the server port (random per boot; breaks cross-restart resume). Keyed `(chat_id, agent)` — the tab/chat is the context unit (two opencode tabs = two contexts sharing one worktree). `chat_id` CASCADEs from `chats`; `session_id`/`worktree_id` are informational `SET NULL`. The `worktrees` table (one-per-session, survives session delete) supersedes the defanged `session_worktrees`. `tasks.chat_id` threads the tab id to the dispatcher; `runOpenCodeServerTask` resolves-or-creates a chat when null. The `@opencode-ai/sdk` v2 client takes flattened params (`{sessionID, directory, parts, model:{providerID,modelID}}`), `createOpencodeClient` from `@opencode-ai/sdk/v2/client`.
- **Claude SDK backend tool RESULTS arrive as `type:'user'` SDK messages** (tool_result content blocks): `mapSdkMessage` (`claude-sdk-map.ts`) MUST map the `user` case → a terminal `tool_update` (completed/failed + output), else the tool_call persists `status:'running'` and the UI spinner never stops. The dispatcher's `tool_update` path then publishes + persists it.
- **ACP command discovery is async**: `acp-probe.ts` must poll after `newSession` for `available_commands_update` (commands arrive in a later notification; reading synchronously captures 0). PTY providers (claude) discover from disk via `claude-command-discovery.ts` (`~/.claude/commands` + `enabledPlugins`, bare names, deduped). `AgentCommand.kind` tags `'command'` vs `'skill'`; `CoderPane`'s `slashGroups` splits them into icon'd groups. `SlashCommandPicker`'s `groups?` prop is opt-in.
- **A new per-message coder field silently drops unless you update every mapper**: the HTTP read SELECT + `mapCoderMessageRow` (`apps/coder/src/routes/messages.ts`), **the WS `snapshot` SELECT (`apps/coder/src/routes/ws.ts`)** — it has its OWN column list and the client's `snapshot` handler `setMessages`-overwrites the HTTP load, so a field present in the HTTP route but absent here shows live yet vanishes on refresh — `CoderPane.tsx` (`RawCoderMessage`/`CoderMessage`/`mapCoderTimelineRow` + the live `message_complete` WS reducer), `CoderMessageWire` (`CoderMessageList.tsx`), and `api/types.ts`. The client `mapCoderTimelineRow` whitelists fields — easiest to forget. This bit `model` twice: the client chain (`v2.7.9`) and then the WS snapshot SELECT (`v2.7.11`) — the chip showed live but vanished on coder refresh until both were fixed.
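
The `config_hash` rule from the agent_sessions note above can be illustrated with a short sketch. The function names and the hex digest encoding are assumptions made for this example; only the `sha256('opencode_server|<model>')` input format comes from the note.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper: hash depends only on the backend tag and the model,
// so it is identical across restarts.
function configHash(model: string): string {
  return createHash('sha256').update(`opencode_server|${model}`).digest('hex');
}

// Counter-example: mixing in the per-boot random port changes the hash on
// every restart, which would break cross-restart resume.
function unstableConfigHash(model: string, port: number): string {
  return createHash('sha256').update(`opencode_server|${model}|${port}`).digest('hex');
}
```

Two boots with different ports produce the same `configHash` for a given model but different `unstableConfigHash` values, which is the reason the port stays out of the input.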


@@ -13,12 +13,14 @@
"test": "vitest run" "test": "vitest run"
}, },
"dependencies": { "dependencies": {
"@boocode/contracts": "workspace:*",
"@agentclientprotocol/sdk": "^0.22.1", "@agentclientprotocol/sdk": "^0.22.1",
"@anthropic-ai/claude-agent-sdk": "^0.3.159",
"@boocode/server": "workspace:*", "@boocode/server": "workspace:*",
"@fastify/static": "^7.0.4", "@fastify/static": "^7.0.4",
"@opencode-ai/sdk": "~1.15.0",
"@fastify/websocket": "^10.0.1", "@fastify/websocket": "^10.0.1",
"@modelcontextprotocol/sdk": "^1.29.0", "@modelcontextprotocol/sdk": "^1.29.0",
"@opencode-ai/sdk": "~1.15.0",
"fastify": "^4.28.1", "fastify": "^4.28.1",
"postgres": "^3.4.4", "postgres": "^3.4.4",
"ws": "^8.18.0", "ws": "^8.18.0",


@@ -16,7 +16,7 @@ import { createInferenceRunner } from '@boocode/server/inference';
import { createBroker } from '@boocode/server/broker'; import { createBroker } from '@boocode/server/broker';
import { appendMcpTools, ALL_TOOLS } from '@boocode/server/tools'; import { appendMcpTools, ALL_TOOLS } from '@boocode/server/tools';
import type { Config as ServerConfig } from '@boocode/server/config'; import type { Config as ServerConfig } from '@boocode/server/config';
import type { WsFrame } from '@boocode/server/ws-frames'; import type { WsFrame } from '@boocode/contracts/ws-frames';
// v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility. // v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
import { WRITE_TOOLS } from './services/tools/index.js'; import { WRITE_TOOLS } from './services/tools/index.js';
import { adaptWriteTool } from './services/tools/adapter.js'; import { adaptWriteTool } from './services/tools/adapter.js';
@@ -42,6 +42,7 @@ import { createOrphanWorktreeReaper } from './services/orphan-worktree-reaper.js
import { probeAgents } from './services/agent-probe.js'; import { probeAgents } from './services/agent-probe.js';
import { getProviderSnapshot, persistProbedModels } from './services/provider-snapshot.js'; import { getProviderSnapshot, persistProbedModels } from './services/provider-snapshot.js';
import { setPermissionHooks } from './services/permission-waiter.js'; import { setPermissionHooks } from './services/permission-waiter.js';
import { publishAgentStatus } from './services/agent-status-publish.js';
import { homedir } from 'node:os'; import { homedir } from 'node:os';
async function main() { async function main() {
@@ -82,6 +83,21 @@ async function main() {
// Broker: in-memory pub/sub for session + user channel streaming. // Broker: in-memory pub/sub for session + user channel streaming.
const broker = createBroker(app.log); const broker = createBroker(app.log);
// agent-status-normalize (#10): the permission hooks carry only taskId +
// sessionId, but the tasks row holds the (chat_id, agent) pair the status frame
// is keyed on. Resolve it best-effort so a blocked/working status accompanies
// every permission_requested/permission_resolved. Returns null when the task
// lacks a chat_id or agent (sessionless creators) — we simply skip the status.
const resolveChatAgent = async (
taskId: string,
): Promise<{ chatId: string; agent: string } | null> => {
const [row] = await sql<{ chat_id: string | null; agent: string | null }[]>`
SELECT chat_id, agent FROM tasks WHERE id = ${taskId}
`;
if (!row?.chat_id || !row.agent) return null;
return { chatId: row.chat_id, agent: row.agent };
};
setPermissionHooks({ setPermissionHooks({
onPrompt: async (prompt) => { onPrompt: async (prompt) => {
await sql` await sql`
@@ -96,6 +112,18 @@ async function main() {
...(prompt.input ? { input: prompt.input } : {}), ...(prompt.input ? { input: prompt.input } : {}),
options: prompt.options.map((o) => ({ option_id: o.optionId, label: o.label })), options: prompt.options.map((o) => ({ option_id: o.optionId, label: o.label })),
} as WsFrame); } as WsFrame);
// #10: agent is blocked on a human decision.
const ca = await resolveChatAgent(prompt.taskId).catch(() => null);
if (ca) {
publishAgentStatus(
broker.publishFrame,
prompt.sessionId,
ca.chatId,
ca.agent,
'blocked',
'permission_request',
);
}
}, },
onResolved: async (taskId, sessionId) => { onResolved: async (taskId, sessionId) => {
await sql` await sql`
@@ -106,6 +134,18 @@ async function main() {
task_id: taskId, task_id: taskId,
session_id: sessionId, session_id: sessionId,
} as WsFrame); } as WsFrame);
// #10: human responded — agent resumes work.
const ca = await resolveChatAgent(taskId).catch(() => null);
if (ca) {
publishAgentStatus(
broker.publishFrame,
sessionId,
ca.chatId,
ca.agent,
'working',
'permission_resolved',
);
}
}, },
}); });


@@ -0,0 +1,110 @@
import { describe, it, expect } from 'vitest';
import { resolveChatId } from '../chat-resolve.js';
import type { Sql } from '../../db.js';
// Mock the porsager/postgres surface that chat-resolve.ts uses: a tagged-template
// `tx` (dispatched by query substring), `tx.json`, and `sql.begin(fn)` which just
// runs fn(tx). Captures the value written back to workspace_panes so we can assert
// the WorkspaceState envelope survives the UPDATE.
interface MockState {
stored: unknown; // initial sessions.workspace_panes value
existingChatOpen: boolean; // whether `SELECT id FROM chats ...` finds the active chat
newChatId: string;
written?: unknown; // captured tx.json(...) payload from `UPDATE sessions`
inserted: boolean; // whether INSERT INTO chats ran
}
interface MockTx {
(strings: TemplateStringsArray): Promise<unknown>;
json: (v: unknown) => unknown;
}
function mockSql(state: MockState): Sql {
const tx = ((strings: TemplateStringsArray) => {
const q = strings.join('');
if (q.includes('SELECT workspace_panes FROM sessions')) {
return Promise.resolve([{ workspace_panes: state.stored }]);
}
if (q.includes('FROM chats')) {
return Promise.resolve(state.existingChatOpen ? [{ id: 'placeholder' }] : []);
}
if (q.includes('INSERT INTO chats')) {
state.inserted = true;
return Promise.resolve([{ id: state.newChatId }]);
}
if (q.includes('UPDATE sessions')) {
return Promise.resolve([]);
}
return Promise.resolve([]);
}) as unknown as MockTx;
tx.json = (v: unknown) => {
state.written = v;
return v;
};
const sql = {
begin: (fn: (t: Sql) => Promise<unknown>) => fn(tx as unknown as Sql),
};
return sql as unknown as Sql;
}
const ENVELOPE = () => ({
panes: [{ id: 'pane-1', kind: 'coder', chatIds: [] as string[], activeChatIdx: 0 }],
tabNumbers: { 'chat-x': 3 },
nextTabNumber: 7,
closedPaneStack: [{ kind: 'coder', chatIds: ['old'], activeChatIdx: 0 }],
});
describe('resolveChatId — v2.6.5 WorkspaceState envelope', () => {
it('reads panes from the envelope without crashing (regression: panes.findIndex is not a function)', async () => {
const state: MockState = {
stored: ENVELOPE(),
existingChatOpen: false,
newChatId: 'new-chat-1',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('new-chat-1');
expect(state.inserted).toBe(true);
});
it('preserves the envelope (tabNumbers/nextTabNumber/closedPaneStack) on write-back', async () => {
const state: MockState = {
stored: ENVELOPE(),
existingChatOpen: false,
newChatId: 'new-chat-1',
inserted: false,
};
await resolveChatId(mockSql(state), 'session-1', 'pane-1');
const w = state.written as Record<string, unknown>;
expect(Array.isArray(w.panes)).toBe(true); // envelope, not a bare array
expect(w.tabNumbers).toEqual({ 'chat-x': 3 });
expect(w.nextTabNumber).toBe(7);
expect(w.closedPaneStack).toEqual([{ kind: 'coder', chatIds: ['old'], activeChatIdx: 0 }]);
});
it('returns the existing open chat when the pane already has one', async () => {
const env = ENVELOPE();
env.panes[0]!.chatIds = ['existing-1'];
const state: MockState = {
stored: env,
existingChatOpen: true,
newChatId: 'should-not-be-used',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('existing-1');
expect(state.inserted).toBe(false);
});
it('still accepts a legacy bare WorkspacePane[] array', async () => {
const state: MockState = {
stored: [{ id: 'pane-1', kind: 'coder', chatId: 'legacy-1', chatIds: ['legacy-1'], activeChatIdx: 0 }],
existingChatOpen: true,
newChatId: 'should-not-be-used',
inserted: false,
};
const chatId = await resolveChatId(mockSql(state), 'session-1', 'pane-1');
expect(chatId).toBe('legacy-1');
expect(state.inserted).toBe(false);
});
});


@@ -16,6 +16,11 @@ export interface AgentSessionRow {
   status: string;
   has_session: boolean;
   last_active_at: string | null;
+  // v2.6.8 per-(chat,agent) running token/cost totals (sampling-streamjson-tokens
+  // #8). BIGINT columns arrive as strings over the wire; the frontend coerces.
+  input_tokens: number;
+  output_tokens: number;
+  cost: number;
 }
 export function registerAgentSessionRoutes(app: FastifyInstance, sql: Sql): void {
@@ -39,7 +44,10 @@ export function registerAgentSessionRoutes(app: FastifyInstance, sql: Sql): void
         a.agent AS agent,
         a.status AS status,
         (a.agent_session_id IS NOT NULL) AS has_session,
-        a.last_active_at AS last_active_at
+        a.last_active_at AS last_active_at,
+        a.input_tokens AS input_tokens,
+        a.output_tokens AS output_tokens,
+        a.cost AS cost
       FROM agent_sessions a
       JOIN chats c ON c.id = a.chat_id
       WHERE c.session_id = ${sessionId}
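
The row comment above notes that BIGINT columns arrive as strings over the wire and that the frontend coerces them. A minimal sketch of what such a coercion could look like follows. The names `WireAgentSessionTotals`, `toFiniteNumber`, and `coerceTotals` are hypothetical and do not appear in this diff, and the fallback to 0 on unparseable input is an assumption rather than the project's documented behavior.

```typescript
// Hypothetical wire shape: BIGINT/NUMERIC columns may be serialized as strings.
interface WireAgentSessionTotals {
  input_tokens: number | string;
  output_tokens: number | string;
  cost: number | string;
}

// Shape the UI wants to work with after coercion.
interface AgentSessionTotals {
  input_tokens: number;
  output_tokens: number;
  cost: number;
}

// Coerce one value to a finite number; anything unparseable becomes 0 (assumption).
function toFiniteNumber(v: number | string | null | undefined): number {
  const n = typeof v === 'string' ? Number(v) : (v ?? 0);
  return Number.isFinite(n) ? n : 0;
}

// Coerce the three running totals of one row.
function coerceTotals(row: WireAgentSessionTotals): AgentSessionTotals {
  return {
    input_tokens: toFiniteNumber(row.input_tokens),
    output_tokens: toFiniteNumber(row.output_tokens),
    cost: toFiniteNumber(row.cost),
  };
}
```

Note that `Number` loses precision above 2^53, so this sketch only suits totals that stay well below that bound.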

@@ -8,6 +8,36 @@ interface WorkspacePaneRow {
   activeChatIdx?: number;
 }
+// v2.6.5: sessions.workspace_panes widened from a bare WorkspacePane[] to a
+// WorkspaceState envelope { panes, tabNumbers, nextTabNumber, closedPaneStack }.
+// (See the union validator in apps/server routes/sessions.ts + normalizeWorkspaceState
+// in apps/server read_tab_by_number.ts — this is the coder-side mirror.)
+interface WorkspaceStateRow {
+  panes: WorkspacePaneRow[];
+  tabNumbers: Record<string, number>;
+  nextTabNumber: number;
+  closedPaneStack: unknown[];
+}
+// MIGRATION: the stored value may be the legacy bare array OR the envelope.
+// Normalize to a full envelope so callers always read `.panes` as an array and
+// write the envelope back intact (preserving tabNumbers/nextTabNumber/closedPaneStack).
+export function normalizeWorkspaceState(v: unknown): WorkspaceStateRow {
+  if (Array.isArray(v)) {
+    return { panes: v as WorkspacePaneRow[], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
+  }
+  if (v && typeof v === 'object' && Array.isArray((v as { panes?: unknown }).panes)) {
+    const env = v as Partial<WorkspaceStateRow>;
+    return {
+      panes: env.panes ?? [],
+      tabNumbers: env.tabNumbers ?? {},
+      nextTabNumber: env.nextTabNumber ?? 1,
+      closedPaneStack: env.closedPaneStack ?? [],
+    };
+  }
+  return { panes: [], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
+}
 function chatNameForKind(kind: string): string {
   if (kind === 'coder' || kind === 'agent') return 'BooCoder';
   if (kind === 'terminal') return 'Terminal';
@@ -28,12 +58,13 @@ export async function resolveChatId(
   paneId: string,
 ): Promise<string | null> {
   return sql.begin(async (tx) => {
-    const sessionRows = await tx<{ workspace_panes: WorkspacePaneRow[] }[]>`
+    const sessionRows = await tx<{ workspace_panes: unknown }[]>`
       SELECT workspace_panes FROM sessions WHERE id = ${sessionId} FOR UPDATE
     `;
     if (sessionRows.length === 0) return null;
-    const panes = sessionRows[0]!.workspace_panes ?? [];
+    const state = normalizeWorkspaceState(sessionRows[0]!.workspace_panes);
+    const panes = state.panes;
     const paneIdx = panes.findIndex((p) => p.id === paneId);
     if (paneIdx < 0) return null;
@@ -69,9 +100,10 @@ export async function resolveChatId(
         : p,
     );
+    const nextState: WorkspaceStateRow = { ...state, panes: nextPanes };
     await tx`
       UPDATE sessions
-      SET workspace_panes = ${tx.json(nextPanes as never)},
+      SET workspace_panes = ${tx.json(nextState as never)},
           updated_at = clock_timestamp()
       WHERE id = ${sessionId}
     `;
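
The change above reads the stored value through a normalizer and writes the whole envelope back, so that `tabNumbers`, `nextTabNumber`, and `closedPaneStack` are not lost. A standalone sketch of that read-modify-write pattern is shown below. The `normalize` function is a local copy of the diff's normalizer so the snippet runs on its own; `appendChat` and the `PaneRow`/`StateRow` names are hypothetical illustrations, not part of the change.

```typescript
interface PaneRow {
  id: string;
  kind: string;
  chatIds?: string[];
  activeChatIdx?: number;
}

interface StateRow {
  panes: PaneRow[];
  tabNumbers: Record<string, number>;
  nextTabNumber: number;
  closedPaneStack: unknown[];
}

// Local copy of the normalizer: accepts the legacy bare array or the envelope.
function normalize(v: unknown): StateRow {
  if (Array.isArray(v)) {
    return { panes: v as PaneRow[], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
  }
  if (v && typeof v === 'object' && Array.isArray((v as { panes?: unknown }).panes)) {
    const env = v as Partial<StateRow>;
    return {
      panes: env.panes ?? [],
      tabNumbers: env.tabNumbers ?? {},
      nextTabNumber: env.nextTabNumber ?? 1,
      closedPaneStack: env.closedPaneStack ?? [],
    };
  }
  return { panes: [], tabNumbers: {}, nextTabNumber: 1, closedPaneStack: [] };
}

// Hypothetical helper: append a chat to one pane and return the FULL envelope,
// so a caller that persists the result keeps every non-pane field intact.
function appendChat(stored: unknown, paneId: string, chatId: string): StateRow | null {
  const state = normalize(stored);
  const idx = state.panes.findIndex((p) => p.id === paneId);
  if (idx < 0) return null;
  const panes = state.panes.map((p, i) => {
    if (i !== idx) return p;
    const chatIds = [...(p.chatIds ?? []), chatId];
    return { ...p, chatIds, activeChatIdx: chatIds.length - 1 };
  });
  return { ...state, panes };
}
```

Spreading `state` before overriding `panes` is the step that keeps the envelope fields; writing back only the pane array would silently downgrade the row to the legacy shape.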

@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { resolveChatId } from './chat-resolve.js';
 const AnswerUserInputBody = z.object({
@@ -53,6 +53,9 @@ interface MessageRow {
   role: string;
   content: string | null;
   status: string | null;
+  model: string | null;
+  ctx_used: number | null;
+  ctx_max: number | null;
   tool_calls: Array<{ id: string; name: string; args?: Record<string, unknown> }> | null;
   tool_results: {
     tool_call_id: string;
@@ -88,6 +91,9 @@ function mapCoderMessageRow(row: MessageRow) {
     role: row.role as 'user' | 'assistant' | 'system',
     content: row.content ?? '',
     status: (row.status ?? 'complete') as 'streaming' | 'complete' | 'failed',
+    ...(row.model ? { model: row.model } : {}),
+    ...(row.ctx_used != null ? { ctx_used: row.ctx_used } : {}),
+    ...(row.ctx_max != null ? { ctx_max: row.ctx_max } : {}),
     ...(reasoningText ? { reasoning_text: reasoningText } : {}),
     ...(tool_calls?.length ? { tool_calls } : {}),
   };
@@ -126,13 +132,13 @@ export function registerMessageRoutes(
     const rows = chatId
       ? await sql<MessageRow[]>`
-          SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
+          SELECT id, role, content, status, model, ctx_used, ctx_max, tool_calls, tool_results, reasoning_parts
           FROM messages_with_parts
           WHERE session_id = ${sessionId} AND chat_id = ${chatId}
           ORDER BY created_at ASC, id ASC
         `
       : await sql<MessageRow[]>`
-          SELECT id, role, content, status, tool_calls, tool_results, reasoning_parts
+          SELECT id, role, content, status, model, ctx_used, ctx_max, tool_calls, tool_results, reasoning_parts
           FROM messages_with_parts
           WHERE session_id = ${sessionId}
           ORDER BY created_at ASC, id ASC

@@ -2,7 +2,7 @@ import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { getSkillBody } from '@boocode/server/skills';
 import {
   buildSkillInvokeSyntheticFrames,

@@ -25,7 +25,7 @@ export function registerWebSocket(
       // Send snapshot of existing messages so client can hydrate
       const messages = await sql<Record<string, unknown>[]>`
-        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, last_seq,
+        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, reasoning_parts, status, model, last_seq,
                tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
                summary, tail_start_id, compacted_at
         FROM messages_with_parts

@@ -261,6 +261,34 @@ CREATE TABLE IF NOT EXISTS checkpoints (
 );
 CREATE INDEX IF NOT EXISTS checkpoints_chat_created_idx ON checkpoints(chat_id, created_at);
+-- claude-sdk-sessionstore #9 (Part 1): append-only mirror of Claude Agent SDK
+-- session transcripts. The SDK's SessionStore adapter writes one JSONL line per
+-- entry; PostgresSessionStore (services/backends/claude-session-store.ts) inserts
+-- one row per entry and replays them ORDER BY id on resume. The store is generic
+-- per the SDK's SessionKey (project_key, session_id, subpath) — chat↔session
+-- ownership lives in agent_sessions, not here. subpath '' is the main transcript
+-- (the SDK's undefined subpath maps to '' in the column).
+CREATE TABLE IF NOT EXISTS claude_session_entries (
+  id BIGSERIAL PRIMARY KEY,
+  project_key TEXT NOT NULL,
+  session_id TEXT NOT NULL,
+  subpath TEXT NOT NULL DEFAULT '', -- '' = main transcript (SDK's undefined subpath maps here)
+  entry JSONB NOT NULL,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
+);
+CREATE INDEX IF NOT EXISTS claude_session_entries_key_idx ON claude_session_entries (project_key, session_id, subpath, id);
+-- claude-sdk-sessionstore #9 (Part 2): the warm Claude-SDK backend persists its
+-- agent_sessions rows with backend='claude_sdk'. Widen the named CHECK to accept
+-- it. Idempotent: DROP the named constraint (the inline CREATE TABLE check above
+-- carries this explicit name, so DROP IF EXISTS targets it) + re-ADD the widened
+-- list. Re-runs/fresh deploys land on the same final constraint (the table-level
+-- CREATE already includes only the old two values on a fresh DB; this block then
+-- replaces it with the three-value list).
+ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_backend_chk;
+ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_backend_chk
+  CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk'));
 -- LISTEN/NOTIFY fast path: every tasks INSERT (from any call site — routes,
 -- new_task tool, arena, MCP server) fires pg_notify('tasks_new') in the same
 -- transaction, so the dispatcher reacts immediately instead of waiting for the

@@ -0,0 +1,83 @@
import { describe, it, expect } from 'vitest';
import { normalizeAgentEvent } from '../normalize-agent-status.js';
describe('normalizeAgentEvent', () => {
describe('working bucket', () => {
const cases = [
'SessionStart',
'UserPromptSubmit',
'UserPromptSubmitted',
'PostToolUse',
'PostToolUseFailure',
'BeforeAgent',
'AfterTool',
'task_started',
];
for (const name of cases) {
it(`maps ${name} → working`, () => {
expect(normalizeAgentEvent(name)).toBe('working');
});
}
});
describe('blocked bucket', () => {
const cases = [
'PreToolUse',
'Notification',
'PermissionRequest',
'exec_approval_request',
'apply_patch_approval_request',
'request_user_input',
];
for (const name of cases) {
it(`maps ${name} → blocked`, () => {
expect(normalizeAgentEvent(name)).toBe('blocked');
});
}
});
describe('done bucket', () => {
const cases = [
'Stop',
'AfterAgent',
'SessionEnd',
'task_complete',
'agent-turn-complete',
];
for (const name of cases) {
it(`maps ${name} → done`, () => {
expect(normalizeAgentEvent(name)).toBe('done');
});
}
});
describe('unknown / nullish → null', () => {
it('returns null for an unrecognized event', () => {
expect(normalizeAgentEvent('SomeRandomEvent')).toBeNull();
});
it('returns null for empty string', () => {
expect(normalizeAgentEvent('')).toBeNull();
});
it('returns null for undefined', () => {
expect(normalizeAgentEvent(undefined)).toBeNull();
});
});
describe('case- and separator-insensitive matching', () => {
it('matches snake_case spelling of a PascalCase event', () => {
expect(normalizeAgentEvent('session_start')).toBe('working');
expect(normalizeAgentEvent('post_tool_use')).toBe('working');
expect(normalizeAgentEvent('pre_tool_use')).toBe('blocked');
});
it('matches camelCase spelling', () => {
expect(normalizeAgentEvent('userPromptSubmitted')).toBe('working');
expect(normalizeAgentEvent('postToolUse')).toBe('working');
expect(normalizeAgentEvent('preToolUse')).toBe('blocked');
expect(normalizeAgentEvent('sessionEnd')).toBe('done');
});
it('matches arbitrary case', () => {
expect(normalizeAgentEvent('STOP')).toBe('done');
expect(normalizeAgentEvent('notification')).toBe('blocked');
});
});
});
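
The tests above pin down three buckets plus case- and separator-insensitive matching, but the module under test is not part of this excerpt. One way such a normalizer could be written is sketched below, assuming a lookup keyed on a canonical form (lowercased, with every non-alphanumeric character removed). The `canon` helper, the `BUCKETS` table, and the three-value `AgentStatus` union are assumptions inferred from the test cases, not the actual implementation.

```typescript
// Assumed status union; the real AgentStatus type may carry more values.
type AgentStatus = 'working' | 'blocked' | 'done';

// Canonical form: lowercase, then drop '_', '-' and any other non-alphanumerics,
// so 'PostToolUse', 'post_tool_use' and 'postToolUse' all collapse to one key.
const canon = (name: string): string => name.toLowerCase().replace(/[^a-z0-9]/g, '');

// Event names taken from the test cases above.
const BUCKETS: Record<AgentStatus, readonly string[]> = {
  working: [
    'SessionStart', 'UserPromptSubmit', 'UserPromptSubmitted', 'PostToolUse',
    'PostToolUseFailure', 'BeforeAgent', 'AfterTool', 'task_started',
  ],
  blocked: [
    'PreToolUse', 'Notification', 'PermissionRequest', 'exec_approval_request',
    'apply_patch_approval_request', 'request_user_input',
  ],
  done: ['Stop', 'AfterAgent', 'SessionEnd', 'task_complete', 'agent-turn-complete'],
};

// Build the canonical-name lookup once at module load.
const LOOKUP = new Map<string, AgentStatus>();
for (const status of Object.keys(BUCKETS) as AgentStatus[]) {
  for (const name of BUCKETS[status]) LOOKUP.set(canon(name), status);
}

// Unknown, empty, or nullish names map to null rather than a default bucket.
function normalizeAgentEvent(name: string | null | undefined): AgentStatus | null {
  if (!name) return null;
  return LOOKUP.get(canon(name)) ?? null;
}
```

Canonicalizing both the table keys and the input keeps the table readable in the vendors' own spellings while still matching snake_case, camelCase, and upper-case variants.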

@@ -1,64 +0,0 @@
import { describe, it, expect } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
/**
* Parity guard between the two copies of the provider snapshot types:
* apps/coder/src/services/provider-types.ts (backend source of truth)
* apps/web/src/api/types.ts (web wire copy)
*
* APPROACH: text-identity of each shared type block (mirrors the repo's existing
* ws-frames.test.ts byte-parity convention). A compile-time bidirectional-
* assignability check was attempted first (a web-side file importing coder's
* import-free provider-types.ts), but apps/web/tsconfig.app.json is a composite
* project and rejects out-of-include files with TS6307 — so cross-project type
* import is structurally blocked. This runtime guard FAILS on any field
* add/remove/rename/loosen in either copy, including the nested model/mode/
* command types that ProviderSnapshotEntry references. Single-source-of-truth
* (shared workspace package) is deferred as a Tier-2 follow-up.
*/
const here = dirname(fileURLToPath(import.meta.url));
const coderSrc = readFileSync(resolve(here, '../provider-types.ts'), 'utf8');
const webSrc = readFileSync(resolve(here, '../../../../web/src/api/types.ts'), 'utf8');
function extractBlock(src: string, name: string): string {
const iface = src.match(new RegExp(`export interface ${name} \\{[\\s\\S]*?\\n\\}`));
const alias = src.match(new RegExp(`export type ${name} =[^;]*;`));
const block = iface?.[0] ?? alias?.[0];
if (!block) throw new Error(`type block '${name}' not found`);
// Normalize to type structure: drop blank + comment lines (//, /* */, *),
// trim each line. Field add/remove/rename/loosen still changes a field line.
return block
.split('\n')
.map((l) => l.trim())
.filter(
(l) =>
l.length > 0 &&
!l.startsWith('//') &&
!l.startsWith('/*') &&
!l.startsWith('*'),
)
.join('\n');
}
describe('provider snapshot type parity (coder ↔ web)', () => {
// Includes the nested types ProviderSnapshotEntry references, so structural
// drift anywhere in the snapshot surface is caught.
const names = [
'ProviderSnapshotStatus',
'ProviderSnapshotEntry',
'ProviderModel',
'ProviderMode',
'ThinkingOption',
'AgentCommand',
];
for (const name of names) {
it(`${name} is identical in both copies`, () => {
expect(
extractBlock(webSrc, name),
`${name} drifted between apps/coder/src/services/provider-types.ts and apps/web/src/api/types.ts`,
).toBe(extractBlock(coderSrc, name));
});
}
});

@@ -0,0 +1,189 @@
import { describe, it, expect } from 'vitest';
import {
makeStreamJsonParser,
makeStreamJsonState,
parseStreamJsonLine,
type AgentEventList,
} from '../stream-json-parser.js';
import type { AgentEvent } from '../agent-backend.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
// Helpers to JSON-encode the representative Claude-Code stream-json lines.
const sys = (sessionId: string) =>
JSON.stringify({ type: 'system', subtype: 'init', session_id: sessionId, tools: ['read', 'edit'] });
const streamEvent = (event: unknown) => JSON.stringify({ type: 'stream_event', event });
const textDelta = (index: number, text: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'text_delta', text } });
const thinkingDelta = (index: number, thinking: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'thinking_delta', thinking } });
const toolStart = (index: number, id: string, name: string) =>
streamEvent({ type: 'content_block_start', index, content_block: { type: 'tool_use', id, name } });
const inputJsonDelta = (index: number, partial: string) =>
streamEvent({ type: 'content_block_delta', index, delta: { type: 'input_json_delta', partial_json: partial } });
const blockStop = (index: number) => streamEvent({ type: 'content_block_stop', index });
const resultLine = (input: number, output: number, sessionId?: string) =>
JSON.stringify({ type: 'result', subtype: 'success', session_id: sessionId, usage: { input_tokens: input, output_tokens: output } });
describe('parseStreamJsonLine (pure per-line mapping)', () => {
it('captures session_id from the system init line and emits no events', () => {
const state = makeStreamJsonState();
const events = parseStreamJsonLine(sys('sess-abc'), state);
expect(events).toEqual([]);
expect(state.sessionId).toBe('sess-abc');
});
it('maps a text_delta stream_event → a text event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(textDelta(0, 'Hello'), state)).toEqual([{ type: 'text', text: 'Hello' }]);
});
it('maps a thinking_delta stream_event → a reasoning event', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(thinkingDelta(0, 'pondering'), state)).toEqual([
{ type: 'reasoning', text: 'pondering' },
]);
});
it('tolerates a garbage / non-JSON line (returns [], no throw)', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine('not json at all {{{', state)).toEqual([]);
expect(parseStreamJsonLine('', state)).toEqual([]);
expect(parseStreamJsonLine(' ', state)).toEqual([]);
// A truncated/partial JSON object also yields [] rather than throwing.
expect(parseStreamJsonLine('{"type":"stream_event","eve', state)).toEqual([]);
});
it('ignores unknown top-level line types and the user (tool-result) line', () => {
const state = makeStreamJsonState();
expect(parseStreamJsonLine(JSON.stringify({ type: 'user', message: {} }), state)).toEqual([]);
expect(parseStreamJsonLine(JSON.stringify({ type: 'whatever' }), state)).toEqual([]);
});
it('assembles a tool call across input_json_delta chunks (split across lines)', () => {
const state = makeStreamJsonState();
// start → tool_call (running, empty args)
const start = parseStreamJsonLine(toolStart(1, 'toolu_1', 'edit_file'), state);
expect(start).toHaveLength(1);
expect(start[0]!.type).toBe('tool_call');
const startSnap = (start[0] as { type: 'tool_call'; toolCall: AcpToolSnapshot }).toolCall;
expect(startSnap.toolCallId).toBe('toolu_1');
expect(startSnap.title).toBe('edit_file');
expect(startSnap.status).toBe('in_progress');
expect(startSnap.rawInput).toEqual({});
// args streamed in fragments — no events until stop
expect(parseStreamJsonLine(inputJsonDelta(1, '{"path":"a'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '.ts","content":'), state)).toEqual([]);
expect(parseStreamJsonLine(inputJsonDelta(1, '"hi"}'), state)).toEqual([]);
// stop → tool_update with the parsed, fully-assembled input
const stop = parseStreamJsonLine(blockStop(1), state);
expect(stop).toHaveLength(1);
expect(stop[0]!.type).toBe('tool_update');
const stopSnap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(stopSnap.toolCallId).toBe('toolu_1');
expect(stopSnap.status).toBe('completed');
expect(stopSnap.rawInput).toEqual({ path: 'a.ts', content: 'hi' });
});
it('falls back to {_raw} when accumulated tool args are not valid JSON', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(toolStart(0, 'toolu_x', 'run'), state);
parseStreamJsonLine(inputJsonDelta(0, '{"broken'), state);
const stop = parseStreamJsonLine(blockStop(0), state);
const snap = (stop[0] as { type: 'tool_update'; toolCall: AcpToolSnapshot }).toolCall;
expect(snap.rawInput).toEqual({ _raw: '{"broken' });
});
it('captures usage from message_delta and result lines', () => {
const state = makeStreamJsonState();
parseStreamJsonLine(streamEvent({ type: 'message_delta', usage: { output_tokens: 42 } }), state);
expect(state.usage.outputTokens).toBe(42);
parseStreamJsonLine(resultLine(100, 250, 'sess-z'), state);
expect(state.usage.inputTokens).toBe(100);
expect(state.usage.outputTokens).toBe(250);
expect(state.sessionId).toBe('sess-z');
});
it('maps a terminal assistant message (fallback) → text + reasoning + tool events', () => {
const state = makeStreamJsonState();
const line = JSON.stringify({
type: 'assistant',
session_id: 'sess-asst',
message: {
content: [
{ type: 'thinking', thinking: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{ type: 'tool_use', id: 'toolu_9', name: 'view_file', input: { path: 'x.ts' } },
],
usage: { input_tokens: 5, output_tokens: 7 },
},
});
const events = parseStreamJsonLine(line, state);
expect(events).toEqual([
{ type: 'reasoning', text: 'let me think' },
{ type: 'text', text: 'Here is the answer' },
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_9', title: 'view_file', kind: null, status: 'completed', rawInput: { path: 'x.ts' } },
},
]);
expect(state.usage).toEqual({ inputTokens: 5, outputTokens: 7 });
expect(state.sessionId).toBe('sess-asst');
});
});
describe('makeStreamJsonParser (stateful wrapper over a full turn)', () => {
it('streams a representative turn: init → text → thinking → tool → result', () => {
const parser = makeStreamJsonParser();
const all: AgentEvent[] = [];
const feed = (line: string): AgentEventList => {
const evs = parser.push(line);
all.push(...evs);
return evs;
};
feed(sys('sess-1'));
feed(textDelta(0, 'Reading '));
feed(textDelta(0, 'the file. '));
feed(thinkingDelta(0, 'I should edit it'));
feed(toolStart(1, 'toolu_a', 'edit_file'));
feed(inputJsonDelta(1, '{"path":'));
feed(inputJsonDelta(1, '"main.ts"}'));
feed(blockStop(1));
feed(textDelta(0, 'Done.'));
feed(resultLine(120, 80, 'sess-1'));
expect(all).toEqual([
{ type: 'text', text: 'Reading ' },
{ type: 'text', text: 'the file. ' },
{ type: 'reasoning', text: 'I should edit it' },
{
type: 'tool_call',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'in_progress', rawInput: {} },
},
{
type: 'tool_update',
toolCall: { toolCallId: 'toolu_a', title: 'edit_file', kind: null, status: 'completed', rawInput: { path: 'main.ts' } },
},
{ type: 'text', text: 'Done.' },
]);
expect(parser.usage()).toEqual({ inputTokens: 120, outputTokens: 80 });
expect(parser.sessionId()).toBe('sess-1');
});
it('a garbage line interleaved mid-turn does not derail subsequent parsing', () => {
const parser = makeStreamJsonParser();
expect(parser.push(textDelta(0, 'a'))).toEqual([{ type: 'text', text: 'a' }]);
expect(parser.push('>>> not json <<<')).toEqual([]);
expect(parser.push(textDelta(0, 'b'))).toEqual([{ type: 'text', text: 'b' }]);
});
});
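
The tool-call tests above describe arguments that arrive as `input_json_delta` fragments keyed by content-block index, are parsed only when the block stops, and fall back to `{ _raw }` when the accumulated text is not valid JSON. A minimal sketch of that buffering step follows. `ToolArgAssembler` and `parseArgs` are hypothetical names, the real parser in `stream-json-parser.ts` is not shown here, and treating an empty buffer as `{}` is an assumption.

```typescript
interface ToolBuffer {
  id: string;
  name: string;
  json: string; // concatenated partial_json fragments
}

// Parse accumulated args; keep the raw text under `_raw` when it is not a JSON object.
function parseArgs(json: string): Record<string, unknown> {
  if (json.trim() === '') return {}; // assumption: no fragments means no args
  try {
    const parsed: unknown = JSON.parse(json);
    if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // fall through to the raw fallback below
  }
  return { _raw: json };
}

// Hypothetical per-turn accumulator keyed by content-block index.
class ToolArgAssembler {
  private readonly open = new Map<number, ToolBuffer>();

  // content_block_start with a tool_use block: begin buffering.
  start(index: number, id: string, name: string): void {
    this.open.set(index, { id, name, json: '' });
  }

  // input_json_delta: append the fragment; ignored for untracked indexes.
  append(index: number, partialJson: string): void {
    const buf = this.open.get(index);
    if (buf) buf.json += partialJson;
  }

  // content_block_stop: return the assembled call, or null for a non-tool block.
  stop(index: number): { id: string; name: string; input: Record<string, unknown> } | null {
    const buf = this.open.get(index);
    if (!buf) return null;
    this.open.delete(index);
    return { id: buf.id, name: buf.name, input: parseArgs(buf.json) };
  }
}
```

Deferring the parse until the stop event means no fragment is ever parsed in isolation, which is why the intermediate delta lines in the tests yield no events.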

@@ -23,7 +23,7 @@ import {
   type ClientSideConnection as ConnectionType,
 } from '@agentclientprotocol/sdk';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import { spawn } from 'node:child_process';
 import { findThoughtLevelConfigId } from './acp-derive.js';
 import { resolveLaunchSpec } from './acp-spawn.js';

@@ -13,7 +13,7 @@ import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
 import type { AgentCommand } from './provider-types.js';
 /** Backend transport kind. Mirrors `agent_sessions.backend` CHECK in schema.sql. */
-export type AgentBackendKind = 'opencode_server' | 'acp_warm';
+export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk';
 /**
  * Normalized, transport-agnostic events a backend emits during a turn (§2).
@@ -82,6 +82,12 @@ export interface PromptCtx {
 export interface TurnResult {
   ok: boolean;
   error?: string;
+  // Optional context-window telemetry (claude SDK): the model's reported window
+  // (ctxMax, 1M-aware) and the peak request input ≈ current fill (ctxUsed). The
+  // dispatcher writes these onto the assistant message so the ContextBar renders a
+  // real fill for the turn. Omitted by backends that don't report a window.
+  ctxUsed?: number;
+  ctxMax?: number;
 }
 /**
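
Because `AgentBackendKind` mirrors a database CHECK constraint, a value read back from `agent_sessions.backend` is only a `string` at runtime. The sketch below shows one possible way to derive the union from a single list and guard untrusted values against it. `AGENT_BACKEND_KINDS` and `isAgentBackendKind` are hypothetical names that do not appear in this diff; the three values are the ones listed in the widened constraint.

```typescript
// Single list of allowed values, matching the widened CHECK constraint.
const AGENT_BACKEND_KINDS = ['opencode_server', 'acp_warm', 'claude_sdk'] as const;

// Union derived from the list, so the type and the runtime check cannot drift apart.
type AgentBackendKind = (typeof AGENT_BACKEND_KINDS)[number];

// Runtime guard for values coming from the database or the wire.
function isAgentBackendKind(v: unknown): v is AgentBackendKind {
  return typeof v === 'string' && AGENT_BACKEND_KINDS.some((k) => k === v);
}
```

The SQL constraint would still have to be edited by hand when a value is added; this only keeps the TypeScript side in one place.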

@@ -0,0 +1,55 @@
/**
* agent-status-publish (#10) — builds + publishes the `agent_status_updated`
* WS frame on the per-session channel (the same channel CoderPane subscribes to).
*
* Kept separate from normalize-agent-status.ts so that module stays a pure,
* broker-free helper (trivially unit-testable; reused by the config-injection
* follow-on). The frame contract (`AgentStatusUpdatedFrame`) is single-sourced in
* the @boocode/contracts/ws-frames module imported below.
*/
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import type { AgentStatus } from './normalize-agent-status.js';
// The exact slice of Broker we need — accepting just the bound method keeps call
// sites flexible (pass `broker.publishFrame.bind(broker)` or, since the broker's
// publishFrame doesn't read `this`, `broker.publishFrame` directly).
type PublishFrame = Broker['publishFrame'];
/**
* Best-effort publish of a normalized agent status. The broker's publishFrame
* already fail-closes (validates + logs + drops on bad input, never throws), but
* we additionally swallow any unexpected error so a publish can NEVER break the
* turn it's reporting on.
*
* @param publishFrame the session channel publisher (broker.publishFrame)
* @param sessionId WS subscription channel (CoderPane subscribes per-session)
* @param chatId the (chat) half of the (chat, agent) status key
* @param agent the (agent) half of the key
* @param status normalized lifecycle status
* @param reason free-form discriminator (turn_start / turn_complete / …)
* @param at ISO timestamp; defaults to now
*/
export function publishAgentStatus(
publishFrame: PublishFrame,
sessionId: string,
chatId: string,
agent: string,
status: AgentStatus,
reason?: string,
at: string = new Date().toISOString(),
): void {
try {
const frame: WsFrame = {
type: 'agent_status_updated',
chat_id: chatId,
agent,
status,
...(reason ? { reason } : {}),
at,
};
publishFrame(sessionId, frame);
} catch {
// never let a status publish break the turn — best-effort only.
}
}
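
The doc comment above promises that a status publish can never break the turn it reports on. The following sketch illustrates that contract in isolation by re-creating the function against stand-in types, so it runs without the broker. `AgentStatusFrame`, the local `PublishFrame` alias, the three-value `AgentStatus` union, and the `'claude'` agent name are simplifying assumptions, not the real `Broker` or `WsFrame` definitions.

```typescript
// Stand-in types (assumptions): the real ones come from the broker and contracts packages.
type AgentStatus = 'working' | 'blocked' | 'done';

interface AgentStatusFrame {
  type: 'agent_status_updated';
  chat_id: string;
  agent: string;
  status: AgentStatus;
  reason?: string;
  at: string;
}

type PublishFrame = (sessionId: string, frame: AgentStatusFrame) => void;

// Same shape as the function in the diff, typed against the stand-ins.
function publishAgentStatus(
  publishFrame: PublishFrame,
  sessionId: string,
  chatId: string,
  agent: string,
  status: AgentStatus,
  reason?: string,
  at: string = new Date().toISOString(),
): void {
  try {
    const frame: AgentStatusFrame = {
      type: 'agent_status_updated',
      chat_id: chatId,
      agent,
      status,
      ...(reason ? { reason } : {}), // omit the key entirely when no reason is given
      at,
    };
    publishFrame(sessionId, frame);
  } catch {
    // best-effort only: a failing publisher must not propagate into the turn
  }
}

// Usage: a recording publisher captures frames; a throwing publisher is swallowed.
const seen: AgentStatusFrame[] = [];
const record: PublishFrame = (_sessionId, frame) => {
  seen.push(frame);
};
const broken: PublishFrame = () => {
  throw new Error('broker down');
};

publishAgentStatus(record, 'session-1', 'chat-1', 'claude', 'working', 'turn_start', '2026-01-01T00:00:00.000Z');
publishAgentStatus(record, 'session-1', 'chat-1', 'claude', 'done');
publishAgentStatus(broken, 'session-1', 'chat-1', 'claude', 'done'); // does not throw
```

Passing the publisher as a plain function, as the original comment suggests, is what makes a fake like `record` or `broken` a drop-in for tests.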

@@ -0,0 +1,251 @@
import { describe, it, expect } from 'vitest';
import type { SDKMessage } from '@anthropic-ai/claude-agent-sdk';
import { mapSdkMessage, createClaudeSdkMapState } from '../claude-sdk-map.js';
import type { AgentEvent } from '../../agent-backend.js';
/**
* Pure mapper for Claude-SDK messages → AgentEvents (claude-sdk-sessionstore #9 Part 2).
* Verifies the partial-stream → live-delta mapping, tool assembly across blocks, and
* the final-assistant dedup, with no live `claude` binary involved.
*
* Messages are cast through `unknown` to `SDKMessage`: the real SDK shapes carry many
* fields (uuid, parent_tool_use_id, …) irrelevant to the mapper, which reads only the
* `type`/`event`/`message.content` it discriminates on. The cast keeps the fixtures
* minimal while the production code path sees the full real types (the backend's
* typecheck against the real SDK is the type-safety proof).
*/
function msg(m: unknown): SDKMessage {
return m as SDKMessage;
}
/** A partial-stream message wrapping one BetaRawMessageStreamEvent. */
function streamEvent(event: unknown): SDKMessage {
return msg({ type: 'stream_event', event, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
describe('mapSdkMessage — partial stream deltas', () => {
it('maps a text_delta to a text event', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'Hello' } }),
state,
);
expect(out).toEqual<AgentEvent[]>([{ type: 'text', text: 'Hello' }]);
});
it('maps a thinking_delta to a reasoning event', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
streamEvent({
type: 'content_block_delta',
index: 0,
delta: { type: 'thinking_delta', thinking: 'pondering', estimated_tokens: null },
}),
state,
);
expect(out).toEqual<AgentEvent[]>([{ type: 'reasoning', text: 'pondering' }]);
});
it('drops empty text/thinking deltas', () => {
const state = createClaudeSdkMapState();
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: '' } }), state),
).toEqual([]);
expect(
mapSdkMessage(
streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'thinking_delta', thinking: '', estimated_tokens: null } }),
state,
),
).toEqual([]);
});
it('ignores message framing + signature/citation deltas', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(streamEvent({ type: 'message_start', message: {} }), state)).toEqual([]);
expect(mapSdkMessage(streamEvent({ type: 'message_stop' }), state)).toEqual([]);
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'signature_delta', signature: 'x' } }), state),
).toEqual([]);
});
});
describe('mapSdkMessage — tool assembly across blocks', () => {
it('opens a tool_call on content_block_start, buffers input_json_delta, emits tool_update with parsed input on stop', () => {
const state = createClaudeSdkMapState();
const started = mapSdkMessage(
streamEvent({
type: 'content_block_start',
index: 1,
content_block: { type: 'tool_use', id: 'tool-1', name: 'view_file', input: {} },
}),
state,
);
expect(started).toEqual<AgentEvent[]>([
{ type: 'tool_call', toolCall: { toolCallId: 'tool-1', title: 'view_file', kind: null, status: 'in_progress', rawInput: {}, rawOutput: undefined } },
]);
// args stream in fragments under the same block index
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '{"path":' } }), state),
).toEqual([]);
expect(
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 1, delta: { type: 'input_json_delta', partial_json: '"a.ts"}' } }), state),
).toEqual([]);
const stopped = mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 1 }), state);
expect(stopped).toHaveLength(1);
const ev = stopped[0]!;
expect(ev.type).toBe('tool_update');
if (ev.type === 'tool_update') {
expect(ev.toolCall.toolCallId).toBe('tool-1');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.rawInput).toEqual({ path: 'a.ts' });
}
});
it('content_block_stop for a non-tool block (no tracked index) emits nothing', () => {
const state = createClaudeSdkMapState();
// text block was streamed at index 0 but never tracked as a tool
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 0, delta: { type: 'text_delta', text: 'hi' } }), state);
expect(mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 0 }), state)).toEqual([]);
});
it('falls back to the prior input when the buffered tool JSON is invalid', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 2, content_block: { type: 'tool_use', id: 't2', name: 'grep', input: { q: 'seed' } } }),
state,
);
mapSdkMessage(streamEvent({ type: 'content_block_delta', index: 2, delta: { type: 'input_json_delta', partial_json: '{not json' } }), state);
const stopped = mapSdkMessage(streamEvent({ type: 'content_block_stop', index: 2 }), state);
const ev = stopped[0]!;
if (ev.type === 'tool_update') {
expect(ev.toolCall.rawInput).toEqual({ q: 'seed' });
} else {
throw new Error('expected tool_update');
}
});
});
describe('mapSdkMessage — final assistant message', () => {
function assistant(content: unknown[]): SDKMessage {
return msg({ type: 'assistant', message: { content }, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
it('dedups text/thinking (already streamed) and emits a completed tool_update per tool_use block', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(
assistant([
{ type: 'text', text: 'final answer', citations: null },
{ type: 'thinking', thinking: 'reasoned', signature: 'sig' },
{ type: 'tool_use', id: 'tool-9', name: 'find_files', input: { glob: '**/*.ts' } },
]),
state,
);
expect(out).toEqual<AgentEvent[]>([
{
type: 'tool_update',
toolCall: { toolCallId: 'tool-9', title: 'find_files', kind: null, status: 'completed', rawInput: { glob: '**/*.ts' }, rawOutput: undefined },
},
]);
});
it('preserves a title from a prior partial tool_call snapshot', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 0, content_block: { type: 'tool_use', id: 'tool-x', name: 'view_file', input: {} } }),
state,
);
const out = mapSdkMessage(assistant([{ type: 'tool_use', id: 'tool-x', name: 'view_file', input: { path: 'z' } }]), state);
const ev = out[0]!;
if (ev.type === 'tool_update') {
expect(ev.toolCall.status).toBe('completed');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.rawInput).toEqual({ path: 'z' });
} else {
throw new Error('expected tool_update');
}
});
});
describe('mapSdkMessage — non-content messages', () => {
it('returns [] for system/init, status, result, and other variants', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(msg({ type: 'system', subtype: 'init', session_id: 's', uuid: 'u' }), state)).toEqual([]);
expect(mapSdkMessage(msg({ type: 'system', subtype: 'status', status: null, session_id: 's', uuid: 'u' }), state)).toEqual([]);
expect(
mapSdkMessage(msg({ type: 'result', subtype: 'success', result: 'done', session_id: 's', uuid: 'u' }), state),
).toEqual([]);
});
});
describe('mapSdkMessage — user tool results', () => {
/** A `user` message carrying tool_result blocks (the SDK feeds tool output back here). */
function userMsg(content: unknown): SDKMessage {
return msg({ type: 'user', message: { role: 'user', content }, parent_tool_use_id: null, uuid: 'u', session_id: 's' });
}
it('maps a string tool_result to a completed tool_update carrying the output', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'done' }]), state);
expect(out).toEqual<AgentEvent[]>([
{
type: 'tool_update',
toolCall: { toolCallId: 't1', title: 't1', kind: null, status: 'completed', rawInput: undefined, rawOutput: 'done' },
},
]);
});
it('marks an is_error result failed', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 't1', content: 'boom', is_error: true }]), state);
const ev = out[0]!;
if (ev.type !== 'tool_update') throw new Error('expected tool_update');
expect(ev.toolCall.status).toBe('failed');
expect(ev.toolCall.rawOutput).toBe('boom');
});
it('flattens array text blocks (skipping non-text) and reuses a prior snapshot title', () => {
const state = createClaudeSdkMapState();
mapSdkMessage(
streamEvent({ type: 'content_block_start', index: 1, content_block: { type: 'tool_use', id: 't2', name: 'view_file', input: {} } }),
state,
);
const out = mapSdkMessage(
userMsg([
{
type: 'tool_result',
tool_use_id: 't2',
content: [
{ type: 'text', text: 'line1' },
{ type: 'image', source: {} },
{ type: 'text', text: 'line2' },
],
},
]),
state,
);
const ev = out[0]!;
if (ev.type !== 'tool_update') throw new Error('expected tool_update');
expect(ev.toolCall.toolCallId).toBe('t2');
expect(ev.toolCall.title).toBe('view_file');
expect(ev.toolCall.status).toBe('completed');
expect(ev.toolCall.rawOutput).toBe('line1\nline2');
});
it('surfaces a result for an unknown tool_use_id with the id as the title', () => {
const state = createClaudeSdkMapState();
const out = mapSdkMessage(userMsg([{ type: 'tool_result', tool_use_id: 'orphan-id', content: 'x' }]), state);
expect(out[0]).toMatchObject({
type: 'tool_update',
toolCall: { toolCallId: 'orphan-id', title: 'orphan-id', kind: null, status: 'completed' },
});
});
it('ignores non-tool_result blocks and non-array content', () => {
const state = createClaudeSdkMapState();
expect(mapSdkMessage(userMsg([{ type: 'text', text: 'hi' }]), state)).toEqual([]);
expect(mapSdkMessage(userMsg('plain string'), state)).toEqual([]);
});
});


@@ -0,0 +1,49 @@
import { describe, it, expect } from 'vitest';
import { shouldUseClaudeSdk, claudeSdkBackendEnabled } from '../claude-sdk-routing.js';
/**
* Env-flagged routing for the warm Claude-SDK backend. With CLAUDE_SDK_BACKEND off
* (the production default) every claude task falls through to the unchanged PTY path;
* with it on, only chat-tab claude tasks (session_id + chat_id) route to the SDK.
*/
const ON = { CLAUDE_SDK_BACKEND: '1' } as NodeJS.ProcessEnv;
const OFF = {} as NodeJS.ProcessEnv;
describe('claudeSdkBackendEnabled', () => {
it('is false when unset or falsy', () => {
expect(claudeSdkBackendEnabled({} as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '0' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'false' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'off' } as NodeJS.ProcessEnv)).toBe(false);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'no' } as NodeJS.ProcessEnv)).toBe(false);
});
it('is true for any other truthy value', () => {
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: '1' } as NodeJS.ProcessEnv)).toBe(true);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'true' } as NodeJS.ProcessEnv)).toBe(true);
expect(claudeSdkBackendEnabled({ CLAUDE_SDK_BACKEND: 'on' } as NodeJS.ProcessEnv)).toBe(true);
});
});
describe('shouldUseClaudeSdk', () => {
it('is always false while the env flag is off — production claude stays on PTY', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: 'c1' }, OFF)).toBe(false);
});
it('routes a chat-tab claude task to the SDK when the flag is on', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: 'c1' }, ON)).toBe(true);
});
it('only applies to the claude agent', () => {
expect(shouldUseClaudeSdk({ agent: 'qwen', session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'opencode', session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: null, session_id: 's1', chat_id: 'c1' }, ON)).toBe(false);
});
it('requires both session_id and chat_id (session-less creators stay one-shot)', () => {
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: null, chat_id: null }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: 's1', chat_id: null }, ON)).toBe(false);
expect(shouldUseClaudeSdk({ agent: 'claude', session_id: null, chat_id: 'c1' }, ON)).toBe(false);
});
});


@@ -0,0 +1,135 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import postgres from 'postgres';
import { PostgresSessionStore } from '../claude-session-store.js';
import type { SessionStoreEntry } from '@anthropic-ai/claude-agent-sdk';
/**
* claude-sdk-sessionstore #9 (Part 1) — PostgresSessionStore tests.
*
* DB-opt-in (DATABASE_URL), mirrors checkpoints.test.ts: skips cleanly when the
* var is unset; otherwise applies the server + coder schemas and exercises the
* real append/load/listSessions/delete/listSubkeys round trips against postgres.
* Rows are namespaced under a unique project_key so concurrent suites / leftover
* data can't collide, and afterAll deletes everything written.
*/
describe.runIf(!!process.env.DATABASE_URL)('PostgresSessionStore (DB)', () => {
let sql: ReturnType<typeof postgres>;
let store: PostgresSessionStore;
const projectKey = `claude-store-test-${Date.now()}`;
const entry = (type: string, extra: Record<string, unknown> = {}): SessionStoreEntry => ({
type,
...extra,
});
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
const serverSchema = resolve(__dirname, '../../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
store = new PostgresSessionStore(sql);
});
afterAll(async () => {
if (sql) {
await sql`DELETE FROM claude_session_entries WHERE project_key = ${projectKey}`.catch(() => {});
await sql.end({ timeout: 5 });
}
});
it('append → load round-trips and preserves order across two appends', async () => {
const key = { projectKey, sessionId: 'sess-order' };
await store.append(key, [entry('user', { uuid: 'u1' }), entry('assistant', { uuid: 'a1' })]);
await store.append(key, [entry('result', { uuid: 'r1' })]);
const loaded = await store.load(key);
expect(loaded).not.toBeNull();
expect(loaded!.map((e) => e.uuid)).toEqual(['u1', 'a1', 'r1']);
expect(loaded!.map((e) => e.type)).toEqual(['user', 'assistant', 'result']);
});
it('append with an empty batch is a no-op (load still null for an otherwise-unseen key)', async () => {
const key = { projectKey, sessionId: 'sess-empty' };
await store.append(key, []);
expect(await store.load(key)).toBeNull();
});
it('load of a key that was never written returns null', async () => {
expect(await store.load({ projectKey, sessionId: 'never-seen' })).toBeNull();
});
it('isolates the main transcript from a subpath (load each independently)', async () => {
const sessionId = 'sess-subpath';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/x' };
await store.append(mainKey, [entry('user', { uuid: 'main-1' })]);
await store.append(subKey, [entry('assistant', { uuid: 'sub-1' })]);
const main = await store.load(mainKey);
const sub = await store.load(subKey);
expect(main!.map((e) => e.uuid)).toEqual(['main-1']);
expect(sub!.map((e) => e.uuid)).toEqual(['sub-1']);
});
it('listSessions returns the session with a numeric mtime (main transcripts only)', async () => {
const sessionId = 'sess-list';
await store.append({ projectKey, sessionId }, [entry('user', { uuid: 'l1' })]);
// A subagent-only session must NOT surface as a main-transcript session.
await store.append(
{ projectKey, sessionId: 'sess-sub-only', subpath: 'subagents/y' },
[entry('user', { uuid: 's1' })],
);
const sessions = await store.listSessions(projectKey);
const ids = sessions.map((s) => s.sessionId);
expect(ids).toContain(sessionId);
expect(ids).not.toContain('sess-sub-only');
const row = sessions.find((s) => s.sessionId === sessionId)!;
expect(typeof row.mtime).toBe('number');
expect(Number.isFinite(row.mtime)).toBe(true);
expect(row.mtime).toBeGreaterThan(0);
});
it('delete with a subpath removes only that subpath', async () => {
const sessionId = 'sess-del-subpath';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/z' };
await store.append(mainKey, [entry('user', { uuid: 'keep-1' })]);
await store.append(subKey, [entry('assistant', { uuid: 'drop-1' })]);
await store.delete(subKey);
expect(await store.load(subKey)).toBeNull();
expect((await store.load(mainKey))!.map((e) => e.uuid)).toEqual(['keep-1']);
});
it('delete without a subpath removes the whole session (all subpaths)', async () => {
const sessionId = 'sess-del-all';
const mainKey = { projectKey, sessionId };
const subKey = { projectKey, sessionId, subpath: 'subagents/w' };
await store.append(mainKey, [entry('user', { uuid: 'm' })]);
await store.append(subKey, [entry('assistant', { uuid: 's' })]);
await store.delete({ projectKey, sessionId });
expect(await store.load(mainKey)).toBeNull();
expect(await store.load(subKey)).toBeNull();
expect(await store.listSubkeys({ projectKey, sessionId })).toEqual([]);
});
it('listSubkeys returns the distinct non-main subpaths', async () => {
const sessionId = 'sess-subkeys';
await store.append({ projectKey, sessionId }, [entry('user', { uuid: 'main' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/a' }, [entry('user', { uuid: 'a1' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/a' }, [entry('user', { uuid: 'a2' })]);
await store.append({ projectKey, sessionId, subpath: 'subagents/b' }, [entry('user', { uuid: 'b1' })]);
const subkeys = await store.listSubkeys({ projectKey, sessionId });
expect(subkeys.sort()).toEqual(['subagents/a', 'subagents/b']);
});
});
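The key semantics exercised by the suite above (ordered appends, `null` for an unseen key, subpath isolation, scoped versus whole-session delete) can be sketched with an in-memory stand-in. This is only an illustration under stated assumptions: `InMemorySessionStoreSketch`, its `Row` shape, and the choice to store the main transcript under an empty subpath are hypothetical and say nothing about how the real `PostgresSessionStore` lays out its table.

```typescript
// Hypothetical in-memory stand-in; NOT the real PostgresSessionStore.
type Entry = { type: string; [field: string]: unknown };
type Key = { projectKey: string; sessionId: string; subpath?: string };
type Row = { projectKey: string; sessionId: string; subpath: string; entry: Entry; mtime: number };

class InMemorySessionStoreSketch {
  private rows: Row[] = [];

  // True when the row belongs to the same (projectKey, sessionId) pair.
  private matches(row: Row, key: Key): boolean {
    return row.projectKey === key.projectKey && row.sessionId === key.sessionId;
  }

  // Assumption: the main transcript lives under the empty subpath ''.
  async append(key: Key, entries: Entry[]): Promise<void> {
    for (const entry of entries) {
      this.rows.push({
        projectKey: key.projectKey,
        sessionId: key.sessionId,
        subpath: key.subpath ?? '',
        entry,
        mtime: Date.now(),
      });
    }
  }

  // Entries come back in append order; a key with no rows loads as null.
  async load(key: Key): Promise<Entry[] | null> {
    const subpath = key.subpath ?? '';
    const found = this.rows
      .filter((r) => this.matches(r, key) && r.subpath === subpath)
      .map((r) => r.entry);
    return found.length > 0 ? found : null;
  }

  // With a subpath: drop only that subpath. Without: drop the whole session.
  async delete(key: Key): Promise<void> {
    this.rows = this.rows.filter(
      (r) => !(this.matches(r, key) && (key.subpath === undefined || r.subpath === key.subpath)),
    );
  }

  // Distinct non-main subpaths for one session.
  async listSubkeys(key: Key): Promise<string[]> {
    const subpaths = this.rows
      .filter((r) => this.matches(r, key) && r.subpath !== '')
      .map((r) => r.subpath);
    return Array.from(new Set(subpaths));
  }

  // Main transcripts only, each with the newest mtime seen for that session.
  async listSessions(projectKey: string): Promise<{ sessionId: string; mtime: number }[]> {
    const latest = new Map<string, number>();
    for (const r of this.rows) {
      if (r.projectKey !== projectKey || r.subpath !== '') continue;
      latest.set(r.sessionId, Math.max(latest.get(r.sessionId) ?? 0, r.mtime));
    }
    return Array.from(latest, ([sessionId, mtime]) => ({ sessionId, mtime }));
  }
}
```

A stand-in like this could let the same contract be checked without `DATABASE_URL`, though it cannot replace the real round trips against postgres.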


@@ -0,0 +1,96 @@
import { describe, it, expect } from 'vitest';
import { createPushable } from '../pushable-iterable.js';
/**
* The pushable async-iterable that feeds the Claude SDK's streaming-input query()
* one message per turn while staying open across turns. Tests cover the ordering
* contract (push/close/async-iterate) without any SDK shape.
*/
describe('createPushable — push/iterate ordering', () => {
it('yields buffered values in FIFO order then parks', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(1);
p.push(2);
expect(await it.next()).toEqual({ value: 1, done: false });
expect(await it.next()).toEqual({ value: 2, done: false });
// No more buffered → next() parks; resolve it by pushing.
const parked = it.next();
p.push(3);
expect(await parked).toEqual({ value: 3, done: false });
});
it('hands a value directly to a parked consumer (push after next() has parked)', async () => {
const p = createPushable<string>();
const it = p.iterable[Symbol.asyncIterator]();
const pending = it.next(); // parks immediately (empty buffer)
p.push('hello');
expect(await pending).toEqual({ value: 'hello', done: false });
});
it('close() resolves a parked consumer as done and reports done thereafter', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
const pending = it.next();
p.close();
expect(await pending).toEqual({ value: undefined, done: true });
expect(await it.next()).toEqual({ value: undefined, done: true });
expect(p.closed).toBe(true);
});
it('still drains values buffered BEFORE close', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(10);
p.push(20);
p.close();
expect(await it.next()).toEqual({ value: 10, done: false });
expect(await it.next()).toEqual({ value: 20, done: false });
expect(await it.next()).toEqual({ value: undefined, done: true });
});
it('drops values pushed after close', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.close();
p.push(99); // no-op
expect(await it.next()).toEqual({ value: undefined, done: true });
});
it('close() is idempotent', () => {
const p = createPushable<number>();
p.close();
expect(() => p.close()).not.toThrow();
expect(p.closed).toBe(true);
});
it('works with a for-await loop driven by interleaved pushes', async () => {
const p = createPushable<number>();
const seen: number[] = [];
const consumer = (async () => {
for await (const v of p.iterable) seen.push(v);
})();
p.push(1);
await Promise.resolve();
p.push(2);
await Promise.resolve();
p.close();
await consumer;
expect(seen).toEqual([1, 2]);
});
it('return() on the iterator closes the queue (for-await break)', async () => {
const p = createPushable<number>();
const it = p.iterable[Symbol.asyncIterator]();
p.push(1);
expect(await it.next()).toEqual({ value: 1, done: false });
// Simulate a `break` in for-await: the runtime calls return().
expect(await it.return!()).toEqual({ value: undefined, done: true });
expect(p.closed).toBe(true);
p.push(2); // dropped — queue is closed
expect(await it.next()).toEqual({ value: undefined, done: true });
});
});
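The ordering contract tested above (FIFO buffering, parking on an empty buffer, draining before reporting done, dropping pushes after close) can be met by a small queue like the one below. This is a minimal sketch, not the contents of `pushable-iterable.ts`: the name `createPushableSketch` and the single-parked-consumer simplification are assumptions made for illustration.

```typescript
// Hypothetical sketch; NOT the real pushable-iterable.ts. It only aims to
// satisfy the push/close/iterate ordering the tests above describe.
interface PushableSketch<T> {
  push(value: T): void;
  close(): void;
  readonly closed: boolean;
  readonly iterable: AsyncIterable<T>;
}

function createPushableSketch<T>(): PushableSketch<T> {
  const buffer: T[] = [];
  // Assumption: at most one consumer is parked at a time.
  let parked: ((result: IteratorResult<T>) => void) | null = null;
  let closed = false;

  const done = (): IteratorResult<T> => ({ value: undefined, done: true });

  const close = (): void => {
    if (closed) return; // idempotent
    closed = true;
    if (parked) {
      const resolve = parked;
      parked = null;
      resolve(done()); // wake the parked consumer as done
    }
  };

  const iterator: AsyncIterator<T> = {
    next(): Promise<IteratorResult<T>> {
      // Buffered values drain first, even after close().
      if (buffer.length > 0) return Promise.resolve({ value: buffer.shift() as T, done: false });
      if (closed) return Promise.resolve(done());
      // Empty and still open: park until push() or close().
      return new Promise((resolve) => {
        parked = resolve;
      });
    },
    return(): Promise<IteratorResult<T>> {
      close(); // a `break` out of for-await lands here
      return Promise.resolve(done());
    },
  };

  return {
    push(value: T): void {
      if (closed) return; // values pushed after close are dropped
      if (parked) {
        const resolve = parked;
        parked = null;
        resolve({ value, done: false }); // hand off directly, skipping the buffer
      } else {
        buffer.push(value);
      }
    },
    close,
    get closed(): boolean {
      return closed;
    },
    iterable: { [Symbol.asyncIterator]: () => iterator },
  };
}
```

Handing a value straight to a parked consumer keeps the buffer empty in the common one-message-per-turn case; the buffer only matters when pushes outpace the consumer.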


@@ -0,0 +1,245 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — PURE Claude-SDK message → AgentEvent mapper.
*
* `ClaudeSdkBackend` drives one `query()` per (chat, agent) session and feeds each
* `SDKMessage` it yields through this function, forwarding the returned
* `AgentEvent[]` to the dispatcher's `onEvent` (which maps them to WS frames +
* persists). Kept PURE (one message + a caller-owned accumulator → events) so it's
* unit-testable without a live `claude` binary — the whole point of Part 2's
* typecheck-and-unit-test gate (the live pump needs a host smoke).
*
* SDK shapes (verified against @anthropic-ai/claude-agent-sdk@0.3.159 sdk.d.ts +
* @anthropic-ai/sdk beta messages d.ts):
* - `SDKPartialAssistantMessage` (`type:'stream_event'`) carries a
* `BetaRawMessageStreamEvent` — the LIVE delta stream (only emitted when
* `options.includePartialMessages` is set, which the backend sets). We map:
* · content_block_delta + text_delta → { text }
* · content_block_delta + thinking_delta → { reasoning }
* · content_block_start + tool_use block → { tool_call } (in_progress)
* · content_block_delta + input_json_delta → buffered into the tool's args
* (no event; the assembled input rides the tool_update emitted at block stop)
* · content_block_stop for a tracked tool block → { tool_update } (still
* in_progress, now carrying the parsed input)
* - `SDKAssistantMessage` (`type:'assistant'`) carries the FINAL `message.content`
* blocks. Text/thinking there are post-hoc repeats of what the partials already
* streamed, so we DROP them (dedup) and only emit a terminal `tool_update`
* (status completed) per `tool_use` block, with its now-complete `input`.
* - `SDKUserMessage` (`type:'user'`) carries `tool_result` blocks: each emits a
* `tool_update` (completed, or failed on `is_error`) carrying the tool's output.
* - All other `SDKMessage` variants (system/init, status, result, hooks, task
* notifications, …) carry no renderable turn content → return [].
*
* Tool assembly spans messages: a tool_use block opens in a partial
* `content_block_start`, its args stream as `input_json_delta` frames keyed by the
* block `index`, and the final assistant message restates the complete block. The
* caller owns a `ClaudeSdkMapState` (snapshot map + per-index tool tracking) that
* threads this across calls, mirroring the `Map<string, AcpToolSnapshot>` the other
* backends pass into `mapSessionUpdate`. The result frames carry the SAME
* `AcpToolSnapshot` shape, so `persistExternalAgentTurn` / `snapshotToWireToolCall`
* are reused unchanged.
*/
import type { SDKMessage } from '@anthropic-ai/claude-agent-sdk';
import type { AgentEvent } from '../agent-backend.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* The underlying `@anthropic-ai/sdk` Beta message types (`BetaRawMessageStreamEvent`,
* `BetaContentBlock`) are a TRANSITIVE dep of `@anthropic-ai/claude-agent-sdk` — not
* a direct dependency of apps/coder — so a `@anthropic-ai/sdk/...` import does NOT
* resolve here under pnpm's strict node_modules. We instead DERIVE both shapes from
* the SDK's own exported message types, which is also more correct (it tracks the
* exact `event` / `content` shapes the SDK yields, not a hand-picked import path).
*/
type StreamEvent = Extract<SDKMessage, { type: 'stream_event' }>['event'];
type AssistantContent = Extract<SDKMessage, { type: 'assistant' }>['message']['content'];
type ContentBlock = AssistantContent extends readonly (infer B)[] ? B : never;
type UserContent = Extract<SDKMessage, { type: 'user' }>['message']['content'];
/**
* Caller-owned accumulator threaded across `mapSdkMessage` calls within ONE turn.
* The backend creates a fresh one per turn and clears it at turn end.
*/
export interface ClaudeSdkMapState {
/** Stable tool-call snapshots by tool_use id, merged across start/delta/stop. */
snapshots: Map<string, AcpToolSnapshot>;
/**
* Partial-stream block index → in-flight tool assembly. Anthropic's stream keys
* blocks by a numeric `index`; tool_use args arrive as `input_json_delta`s under
* that index with no id, so we map index→id to route them and buffer the raw
* JSON fragments until the block closes (or the final assistant message lands).
*/
toolByIndex: Map<number, { id: string; name: string; jsonBuf: string }>;
}
/** Construct a fresh per-turn accumulator. */
export function createClaudeSdkMapState(): ClaudeSdkMapState {
return { snapshots: new Map(), toolByIndex: new Map() };
}
/**
* Map one `SDKMessage` → zero or more `AgentEvent`s, mutating `state` for
* cross-message tool assembly + dedup. Pure w.r.t. its inputs otherwise.
*/
export function mapSdkMessage(msg: SDKMessage, state: ClaudeSdkMapState): AgentEvent[] {
switch (msg.type) {
case 'stream_event':
return mapStreamEvent(msg.event, state);
case 'assistant':
return mapFinalAssistant(msg.message.content, state);
case 'user':
// Tool RESULTS ride in as user messages (tool_result blocks): the SDK ran
// the tool and feeds its output back. Without mapping these, the tool_call
// never reaches a terminal snapshot — it persists as status:'running' with
// no output and the UI spinner never stops (the bug this fixes).
return mapUserToolResults(msg.message.content, state);
default:
// system/init, status, result, hooks, task_*, etc. — no turn content here.
// (The backend reads session_id off the init message and usage/cost off the
// result message directly; neither produces a renderable AgentEvent.)
return [];
}
}
/** Live partial-stream delta → AgentEvent(s). */
function mapStreamEvent(event: StreamEvent, state: ClaudeSdkMapState): AgentEvent[] {
switch (event.type) {
case 'content_block_start': {
const block = event.content_block;
if (block.type === 'tool_use') {
const snap: AcpToolSnapshot = {
toolCallId: block.id,
title: block.name,
kind: null,
status: 'in_progress',
rawInput: block.input ?? undefined,
rawOutput: undefined,
};
state.snapshots.set(block.id, snap);
state.toolByIndex.set(event.index, { id: block.id, name: block.name, jsonBuf: '' });
return [{ type: 'tool_call', toolCall: snap }];
}
return [];
}
case 'content_block_delta': {
const delta = event.delta;
if (delta.type === 'text_delta') {
return delta.text ? [{ type: 'text', text: delta.text }] : [];
}
if (delta.type === 'thinking_delta') {
return delta.thinking ? [{ type: 'reasoning', text: delta.thinking }] : [];
}
if (delta.type === 'input_json_delta') {
// Buffer the tool's streamed args under its block index. No event yet: the
// assembled input rides the tool_update emitted at content_block_stop (or
// the final assistant message's restated block).
const t = state.toolByIndex.get(event.index);
if (t) t.jsonBuf += delta.partial_json ?? '';
return [];
}
// signature_delta / citations_delta / compaction_delta — nothing to render.
return [];
}
case 'content_block_stop': {
// Close out a streamed tool block: parse its buffered JSON args and emit a
// tool_update carrying the assembled input. The final assistant message will
// restate the same block, but its snapshot is dedup-merged (same id) so this
// is harmless — we emit here so a tool's input renders even if the assistant
// message is delayed/dropped.
const t = state.toolByIndex.get(event.index);
if (!t) return [];
state.toolByIndex.delete(event.index);
const prev = state.snapshots.get(t.id);
const snap: AcpToolSnapshot = {
toolCallId: t.id,
title: prev?.title ?? t.name,
kind: null,
status: 'in_progress',
rawInput: parseJsonOr(t.jsonBuf, prev?.rawInput),
rawOutput: undefined,
};
state.snapshots.set(t.id, snap);
return [{ type: 'tool_update', toolCall: snap }];
}
default:
// message_start / message_delta / message_stop — turn framing, no content.
return [];
}
}
/**
* Final assistant message content blocks. Text/thinking are post-hoc repeats of
* the partial stream → dropped (dedup). Only tool_use blocks emit a terminal
* tool_update carrying the complete `input`.
*/
function mapFinalAssistant(content: ContentBlock[], state: ClaudeSdkMapState): AgentEvent[] {
const out: AgentEvent[] = [];
for (const block of content) {
if (block.type === 'tool_use') {
const prev = state.snapshots.get(block.id);
const snap: AcpToolSnapshot = {
toolCallId: block.id,
title: prev?.title ?? block.name,
kind: null,
status: 'completed',
rawInput: block.input ?? prev?.rawInput,
rawOutput: undefined,
};
state.snapshots.set(block.id, snap);
out.push({ type: 'tool_update', toolCall: snap });
}
// text / thinking / redacted_thinking blocks: already streamed via partials.
}
return out;
}
/**
* User-message tool_result blocks → terminal tool_update events. The SDK runs
* each tool and feeds the output back in a `user` message; we mark the matching
* snapshot completed (or failed, on is_error) WITH its output so the snapshot
* persists/renders as resolved instead of spinning. Unknown ids (no prior
* snapshot) are still surfaced so a stray result isn't silently lost.
*/
function mapUserToolResults(content: UserContent, state: ClaudeSdkMapState): AgentEvent[] {
if (!Array.isArray(content)) return [];
const out: AgentEvent[] = [];
for (const raw of content) {
const block = raw as { type?: string; tool_use_id?: string; content?: unknown; is_error?: boolean };
if (block.type !== 'tool_result' || !block.tool_use_id) continue;
const prev = state.snapshots.get(block.tool_use_id);
const snap: AcpToolSnapshot = {
toolCallId: block.tool_use_id,
title: prev?.title ?? block.tool_use_id,
kind: prev?.kind ?? null,
status: block.is_error ? 'failed' : 'completed',
rawInput: prev?.rawInput,
rawOutput: toolResultText(block.content),
};
state.snapshots.set(block.tool_use_id, snap);
out.push({ type: 'tool_update', toolCall: snap });
}
return out;
}
/** tool_result content is a string OR an array of content blocks (text/image).
* Flatten text blocks; fall back to the raw value so nothing is lost. */
function toolResultText(content: unknown): unknown {
if (typeof content === 'string') return content;
if (Array.isArray(content)) {
const text = content
.map((c) =>
c && typeof c === 'object' && (c as { type?: string }).type === 'text'
? String((c as { text?: unknown }).text ?? '')
: '',
)
.filter(Boolean)
.join('\n');
return text || content;
}
return content ?? '';
}
/** Parse a buffered JSON string; fall back to a prior value on empty/invalid. */
function parseJsonOr(buf: string, fallback: unknown): unknown {
const s = buf.trim();
if (!s) return fallback;
try {
return JSON.parse(s);
} catch {
return fallback;
}
}
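The cross-message tool assembly described above (open a tool block, buffer its `input_json_delta` fragments under the block index, parse once the block stops) can be shown in isolation. The sketch below uses deliberately simplified event shapes: `SketchEvent`, `AssembledTool`, and `assembleToolInputs` are hypothetical names for illustration, not the SDK's stream types or the mapper's real signature.

```typescript
// Hypothetical, simplified event shapes; the real ones come from the SDK's stream types.
type SketchEvent =
  | { type: 'start'; index: number; id: string; name: string }
  | { type: 'delta'; index: number; partialJson: string }
  | { type: 'stop'; index: number };

interface AssembledTool {
  id: string;
  name: string;
  input: unknown;
}

function assembleToolInputs(events: SketchEvent[]): AssembledTool[] {
  // Block index → in-flight tool; deltas carry only the index, never the tool id.
  const byIndex = new Map<number, { id: string; name: string; jsonBuf: string }>();
  const out: AssembledTool[] = [];
  for (const ev of events) {
    switch (ev.type) {
      case 'start':
        byIndex.set(ev.index, { id: ev.id, name: ev.name, jsonBuf: '' });
        break;
      case 'delta': {
        const tool = byIndex.get(ev.index);
        // Fragments are not valid JSON on their own, so only concatenate here.
        if (tool) tool.jsonBuf += ev.partialJson;
        break;
      }
      case 'stop': {
        const tool = byIndex.get(ev.index);
        if (!tool) break; // not a tracked tool block (e.g. a text block)
        byIndex.delete(ev.index);
        let input: unknown;
        try {
          input = JSON.parse(tool.jsonBuf);
        } catch {
          input = undefined; // this sketch has no prior input to fall back to
        }
        out.push({ id: tool.id, name: tool.name, input });
        break;
      }
    }
  }
  return out;
}

const assembled = assembleToolInputs([
  { type: 'start', index: 1, id: 'tool-1', name: 'view_file' },
  { type: 'delta', index: 1, partialJson: '{"path":' },
  { type: 'delta', index: 1, partialJson: '"a.ts"}' },
  { type: 'stop', index: 0 }, // untracked index: ignored
  { type: 'stop', index: 1 },
]);
console.log(JSON.stringify(assembled));
// prints [{"id":"tool-1","name":"view_file","input":{"path":"a.ts"}}]
```

Parsing only at block stop avoids attempting `JSON.parse` on every fragment, at the cost of the input not being visible until the block closes.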


@@ -0,0 +1,38 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — claude-SDK-vs-PTY routing predicate.
*
* Sibling to `shouldUseWarmBackend` (warm-acp-routing.ts). The warm Claude-SDK
* backend keys its persistent `query()` on (chat_id, agent) — exactly like the
* warm-ACP / opencode-server backends — so a task only routes to it when it carries
* BOTH a `session_id` and a `chat_id` (a real chat tab).
*
* CRUCIALLY this is ALSO gated behind the `CLAUDE_SDK_BACKEND` env flag (default
* OFF). While off — the production default — claude always falls through to the
* existing one-shot PTY `runExternalAgent` path, UNCHANGED. The live SDK streaming
* pump + cross-turn resume need a host smoke against the real `claude` binary, so
* we keep the working PTY path as the default until that lands. Flip the env var
* on a host (any truthy value) to opt a deployment into the SDK backend.
*
* Pure (env read injected) so it's unit-testable; the dispatcher consumes it.
*/
/** True iff the `CLAUDE_SDK_BACKEND` env flag is set to a truthy value. */
export function claudeSdkBackendEnabled(env: NodeJS.ProcessEnv = process.env): boolean {
const v = env.CLAUDE_SDK_BACKEND;
if (v == null) return false;
const s = v.trim().toLowerCase();
return s !== '' && s !== '0' && s !== 'false' && s !== 'off' && s !== 'no';
}
export function shouldUseClaudeSdk(
task: {
agent: string | null;
session_id: string | null;
chat_id: string | null;
},
env: NodeJS.ProcessEnv = process.env,
): boolean {
if (!claudeSdkBackendEnabled(env)) return false;
if (task.agent !== 'claude') return false;
return task.session_id != null && task.chat_id != null;
}
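How a caller might branch on this predicate can be sketched as follows. Everything named here is an assumption for illustration: `pickClaudePath`, the `Env` alias, and the `'fallthrough'` label are hypothetical, and the flag check is restated locally only so the block is self-contained. The real dispatcher wiring is not shown in this diff.

```typescript
// Hypothetical names: `pickClaudePath`, `Env`, and 'fallthrough' are illustrative.
// Only the CLAUDE_SDK_BACKEND flag and the three task fields come from the source.
type Env = Record<string, string | undefined>;
type Task = { agent: string | null; session_id: string | null; chat_id: string | null };

// Local restatement of the flag rule: unset, '', '0', 'false', 'off', 'no' are off.
function flagEnabled(env: Env): boolean {
  const s = (env.CLAUDE_SDK_BACKEND ?? '').trim().toLowerCase();
  return ['', '0', 'false', 'off', 'no'].indexOf(s) === -1;
}

// 'claude_sdk' only for a chat-tab claude task with the flag on; every other
// task keeps whatever path it used before (labelled 'fallthrough' here).
function pickClaudePath(task: Task, env: Env): 'claude_sdk' | 'fallthrough' {
  const isChatTab = task.session_id != null && task.chat_id != null;
  return flagEnabled(env) && task.agent === 'claude' && isChatTab ? 'claude_sdk' : 'fallthrough';
}

const chatTask: Task = { agent: 'claude', session_id: 's1', chat_id: 'c1' };
console.log(pickClaudePath(chatTask, { CLAUDE_SDK_BACKEND: 'on' })); // prints claude_sdk
console.log(pickClaudePath(chatTask, {})); // prints fallthrough
```

Injecting the environment as a parameter, as the source does, is what lets both outcomes be exercised without mutating `process.env`.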


@@ -0,0 +1,425 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — ClaudeSdkBackend.
*
* A warm, resumable backend for the `claude` agent built on the Claude Agent SDK
* (`@anthropic-ai/claude-agent-sdk`), implementing the Phase-0 `AgentBackend`
* contract (same shape as `WarmAcpBackend` / `OpenCodeServerBackend`). One
* persistent `query()` per (chat, agent) session, driven in STREAMING-INPUT mode:
* the `prompt` is a pushable `AsyncIterable<SDKUserMessage>` that stays open across
* turns, so the SDK subprocess + conversation stay warm between `prompt()` calls
* until `closeSession`/`dispose`.
*
* ⚠ LIVE PUMP IS HOST-ONLY. The actual streaming turn needs the real `claude`
* binary + ANTHROPIC auth on a host — it CANNOT run in the dev container. This file
* is written against the REAL SDK types so it TYPECHECKS, and the PURE pieces (the
* `mapSdkMessage` mapper + the `createPushable` queue) are unit-tested. Routing to
* this backend is gated behind `CLAUDE_SDK_BACKEND` (default OFF) so production
* claude stays on the working PTY path until a host smoke validates the pump +
* cross-turn resume.
*
* Lifecycle (mirrors warm-acp.ts / opencode-server.ts):
* - `ensureSession`: resolve the resume id from `agent_sessions(chat_id,'claude')`
* and (re)build the single `query()` if not already live. The SDK's own
* `sessionStore` (Part 1 PostgresSessionStore) materializes the transcript on
* resume; `options.resume` carries the provider session id.
* - `prompt`: push ONE user message onto the open queue, iterate the generator,
* map each `SDKMessage` → `AgentEvent`s via `mapSdkMessage`, forward to
* `ctx.onEvent`, and resolve when the turn's `result` message lands. Capture the
* `session_id` from the `init` message and persist it to `agent_sessions`;
* accumulate `result.usage` / `total_cost_usd` onto the row (mirrors opencode U.6).
* - `closeSession` / `dispose`: close the queue + dispose the query generator.
* - A thrown error, or a `result` message whose `subtype` is one of the `error*`
* variants, marks `agent_sessions.status='crashed'`.
*
* Turn serialization: like warm-acp, exactly one turn is in flight at a time on a
* given backend (the dispatcher's per-session `inflight` map enforces this upstream;
* `isBusy()` reports it so the pool never evicts mid-turn).
*/
import { query, type Query, type SDKMessage, type SDKUserMessage, type Options } from '@anthropic-ai/claude-agent-sdk';
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../../db.js';
import { PostgresSessionStore } from './claude-session-store.js';
import { createPushable, type Pushable } from './pushable-iterable.js';
import { mapSdkMessage, createClaudeSdkMapState, type ClaudeSdkMapState } from './claude-sdk-map.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
export interface ClaudeSdkBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** The (chat, agent) this backend serves — its pool identity + DB key. */
chatId: string;
/** Always 'claude' today; kept explicit so the pool key + DB writes stay honest. */
agent: string;
/** Resolved `claude` binary path (available_agents.install_path); null → SDK default. */
installPath: string | null;
}
export class ClaudeSdkBackend implements AgentBackend {
readonly backend = 'claude_sdk' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly chatId: string;
private readonly agent: string;
private readonly installPath: string | null;
private readonly sessionStore: PostgresSessionStore;
/** The single persistent query() generator; null until the first turn builds it. */
private query: Query | null = null;
/** The open input queue feeding the generator one SDKUserMessage per turn. */
private input: Pushable<SDKUserMessage> | null = null;
/** The provider's own session id (resume token), captured from the init message. */
private agentSessionId: string | null = null;
/** Resolved model the live query() was built with; a change forces a rebuild. */
private builtModel: string | null = null;
/** True between prompt() start and settle. */
private busy = false;
private up = false;
constructor(deps: ClaudeSdkBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.chatId = deps.chatId;
this.agent = deps.agent;
this.installPath = deps.installPath;
this.sessionStore = new PostgresSessionStore(deps.sql);
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
/** Phase 3: busy iff a turn is in flight (pool never evicts a busy backend). */
isBusy(): boolean {
return this.busy;
}
// ─── ensureSession: resolve resume id + (re)build the warm query ──────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
// Resolve the resume token from the (chat_id, agent) row. A crashed row is not
// resumed (the SDK would fail to load a dead session); we create fresh.
const [row] = await this.sql<{ agent_session_id: string | null; status: string }[]>`
SELECT agent_session_id, status FROM agent_sessions
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
`;
const resumeId = row && row.status !== 'crashed' ? row.agent_session_id : null;
// (Re)build the warm query if there is none, or the model changed (the SDK can
// change model mid-session via setModel, but a fresh build is simplest + matches
// opencode's config-drift → fresh-session rule). The query stays alive across
// turns; only closeSession/dispose tears it down.
if (!this.query || this.builtModel !== opts.model) {
await this.teardownQuery();
this.buildQuery(opts.worktreePath, opts.model, resumeId);
}
// Seed the in-memory resume id from the DB so a handle built before the first
// turn's init message still carries the last-known token. The init message
// overwrites it with the authoritative current id during the turn.
if (this.agentSessionId == null) this.agentSessionId = resumeId;
// Upsert the agent_sessions row (backend='claude_sdk'). agent_session_id may be
// null until the first turn captures it from the init message; prompt() updates it.
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'claude_sdk', ${this.agentSessionId}, NULL, 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'claude_sdk',
agent_session_id = COALESCE(EXCLUDED.agent_session_id, agent_sessions.agent_session_id),
server_port = NULL,
status = 'active',
last_active_at = clock_timestamp()
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: opts.chatId, agent: opts.agent }, 'claude-sdk: agent_sessions upsert failed (non-fatal)');
});
return {
sessionId,
agent: opts.agent,
backend: 'claude_sdk',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: this.agentSessionId,
serverPort: null,
};
}
/** Build the persistent query() in streaming-input mode. Lazy — no subprocess
* work happens until the generator is first iterated in prompt(). */
private buildQuery(worktreePath: string, model: string, resumeId: string | null): void {
const input = createPushable<SDKUserMessage>();
const options: Options = {
sessionStore: this.sessionStore,
cwd: worktreePath,
// Stream partial assistant messages so text/thinking/tool deltas arrive live
// (the mapper reads them; without this only terminal messages land).
includePartialMessages: true,
// BooCode default: enable the documented 1M-context-window beta. Active on
// models that support it (the SDK lists Sonnet 4/4.5); a non-supporting model
// simply doesn't get the larger window. The TRUE window is read back from
// `result.modelUsage[*].contextWindow` and shown in the ContextBar, so whatever
// window a model actually gets is surfaced truthfully (no guessing).
betas: ['context-1m-2025-08-07'],
...(model ? { model } : {}),
...(resumeId ? { resume: resumeId } : {}),
...(this.installPath ? { pathToClaudeCodeExecutable: this.installPath } : {}),
// ANTHROPIC auth/env must reach the child; inherit the process env (host concern).
env: process.env as Record<string, string>,
};
this.input = input;
this.query = query({ prompt: input.iterable, options });
this.builtModel = model;
this.up = true;
this.log.info({ chatId: this.chatId, agent: this.agent, model, resume: resumeId ?? null }, 'claude-sdk: warm query built');
}
// ─── prompt: push one user message + drain the generator until result ─────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
if (!this.query || !this.input) {
// ensureSession should have built it; rebuild defensively (e.g. evicted/raced).
this.buildQuery(ctx.worktreePath, ctx.model, handle.agentSessionId);
}
const gen = this.query!;
const queue = this.input!;
if (ctx.signal.aborted) return { ok: false, error: 'aborted' };
this.busy = true;
const state: ClaudeSdkMapState = createClaudeSdkMapState();
// Peak per-request input (incl. cache) across the turn ≈ the conversation context
// held in the window. result.usage SUMS input over the turn's internal requests
// (overcounts for multi-tool turns), so the per-request peak is the accurate
// "context used" for the ContextBar (paseo's approach).
let maxInputTokens = 0;
// Per-turn abort: interrupt the in-flight query on the SAME generator (never
// tear down the warm query — that's the pool's lifetime). The generator then
// emits its terminal result and the drain loop exits.
let aborted = false;
const onAbort = () => {
if (aborted) return;
aborted = true;
void gen.interrupt().catch(() => {});
};
ctx.signal.addEventListener('abort', onAbort, { once: true });
// Push the turn's user message onto the open queue. session_id is optional on
// the wire; the SDK manages it via resume + the init message.
const userMsg: SDKUserMessage = {
type: 'user',
message: { role: 'user', content: input },
parent_tool_use_id: null,
...(handle.agentSessionId ? { session_id: handle.agentSessionId } : {}),
};
queue.push(userMsg);
try {
// Manual iteration — NOT `for await (… of gen)`. Returning out of a for-await
// loop calls gen.return(), which CLOSES the async generator; that killed the
// warm streaming-input query after a single turn, so every FOLLOW-UP message
// hit a dead generator and failed. gen.next() leaves the generator suspended
// (alive) for the next pushed user message — the warm query is only closed
// deliberately in teardownQuery()/dispose().
while (true) {
const next = await gen.next();
if (next.done) {
// Generator ended (e.g. disposed) without a result — non-fatal incomplete.
if (aborted) return { ok: false, error: 'aborted' };
return { ok: false, error: 'claude-sdk: query ended before result' };
}
const msg = next.value;
// Track the peak per-request input from message_start usage (delivered by
// includePartialMessages) — the largest single request's input is the real
// context fill, unlike the summed result.usage.
if (msg.type === 'stream_event') {
const sev = msg.event as { type?: string; message?: { usage?: Record<string, unknown> } };
if (sev?.type === 'message_start' && sev.message?.usage) {
const ru = sev.message.usage;
const reqInput =
num(ru.input_tokens) + num(ru.cache_read_input_tokens) + num(ru.cache_creation_input_tokens);
if (reqInput > maxInputTokens) maxInputTokens = reqInput;
}
}
// Capture the provider session id from the init message (authoritative).
if (msg.type === 'system' && msg.subtype === 'init' && msg.session_id) {
if (this.agentSessionId !== msg.session_id) {
this.agentSessionId = msg.session_id;
await this.persistAgentSessionId(msg.session_id);
}
}
// The result message ends THIS turn (it does not close the generator —
// streaming-input keeps it alive for the next pushed message).
if (msg.type === 'result') {
await this.accumulateUsage(msg);
const ok = msg.subtype === 'success' && !aborted;
if (!ok) {
// error_during_execution / error_max_turns / aborted → crashed row.
await this.markCrashed();
} else {
await this.markIdle();
}
if (aborted) return { ok: false, error: 'aborted' };
if (!ok) return { ok: false, error: resultErrorMessage(msg) };
// Context-window telemetry for the ContextBar (paseo's method):
// ctxMax = the model's OWN reported window (1M-aware — reflects the active
// window, so the bar shows the truth per model);
// ctxUsed = peak request input (history in the window) + this turn's output.
const ctxMax = extractMaxContextWindow((msg as { modelUsage?: unknown }).modelUsage);
const fallbackInput =
num(msg.usage?.input_tokens) +
num(msg.usage?.cache_read_input_tokens) +
num(msg.usage?.cache_creation_input_tokens);
const ctxUsed = (maxInputTokens || fallbackInput) + num(msg.usage?.output_tokens);
return {
ok: true,
...(ctxMax > 0 ? { ctxMax } : {}),
...(ctxUsed > 0 ? { ctxUsed } : {}),
};
}
// Map renderable content → AgentEvents for the dispatcher's onEvent.
for (const ev of mapSdkMessage(msg, state)) {
ctx.onEvent(ev);
}
}
} catch (err) {
if (aborted) return { ok: false, error: 'aborted' };
await this.markCrashed();
return { ok: false, error: errMsg(err) };
} finally {
ctx.signal.removeEventListener('abort', onAbort);
this.busy = false;
}
}
// ─── persistence helpers ──────────────────────────────────────────────────────
private async persistAgentSessionId(id: string): Promise<void> {
await this.sql`
UPDATE agent_sessions
SET agent_session_id = ${id}, last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: this.chatId }, 'claude-sdk: failed to persist agent_session_id (non-fatal)');
});
}
/**
* Accumulate the turn's usage/cost onto the (chat_id, agent) row — mirrors the
* opencode U.6 running-total pattern. The SDK reports usage once per turn on the
* result message (not per step), so this fires once per prompt(). Cache read/write
* input tokens fold into `input_tokens`; usage telemetry never fails a turn.
*/
private async accumulateUsage(result: Extract<SDKMessage, { type: 'result' }>): Promise<void> {
const u = result.usage;
const input = num(u?.input_tokens) + num(u?.cache_read_input_tokens) + num(u?.cache_creation_input_tokens);
const output = num(u?.output_tokens);
const cost = numF(result.total_cost_usd);
if (input === 0 && output === 0 && cost === 0) return;
await this.sql`
UPDATE agent_sessions SET
input_tokens = input_tokens + ${input},
output_tokens = output_tokens + ${output},
cost = cost + ${cost}
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch((err) => {
this.log.warn({ err: errMsg(err), chatId: this.chatId }, 'claude-sdk: failed to persist usage (non-fatal)');
});
}
private async markIdle(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'idle', last_active_at = clock_timestamp()
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
private async markCrashed(): Promise<void> {
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE chat_id = ${this.chatId} AND agent = ${this.agent}
`.catch(() => {});
}
// ─── teardown ────────────────────────────────────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
await this.teardownQuery();
await this.sql`
UPDATE agent_sessions SET status = 'closed'
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => {});
}
async dispose(): Promise<void> {
await this.teardownQuery();
}
/** Close the input queue + dispose the generator. Idempotent. */
private async teardownQuery(): Promise<void> {
this.up = false;
this.busy = false;
const q = this.query;
const queue = this.input;
this.query = null;
this.input = null;
this.builtModel = null;
queue?.close();
if (q) {
// return() ends the AsyncGenerator and lets the SDK clean up its subprocess.
await q.return(undefined).catch(() => {});
}
}
}
// ─── helpers ──────────────────────────────────────────────────────────────────
/** Coerce to a non-negative finite integer (tokens). */
function num(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? Math.round(x) : 0;
}
/** Coerce to a non-negative finite float (cost USD). */
function numF(v: unknown): number {
const x = typeof v === 'number' ? v : Number(v);
return Number.isFinite(x) && x > 0 ? x : 0;
}
/** Largest context-window the SDK reports across `result.modelUsage` (a
* `Record<model, ModelUsage>`, each with a `contextWindow`). This is the model's
* OWN window — 1M when the 1M model/beta is active, 200K otherwise — so the
* ContextBar shows the true window without us mapping model→size ourselves. */
function extractMaxContextWindow(modelUsage: unknown): number {
if (!modelUsage || typeof modelUsage !== 'object') return 0;
let max = 0;
for (const v of Object.values(modelUsage as Record<string, unknown>)) {
if (v && typeof v === 'object') {
const cw = (v as { contextWindow?: unknown }).contextWindow;
if (typeof cw === 'number' && Number.isFinite(cw) && cw > max) max = cw;
}
}
return max;
}
/** Build a human-readable error from an SDK error-result message. */
function resultErrorMessage(result: Extract<SDKMessage, { type: 'result' }>): string {
if (result.subtype === 'success') return 'ok';
const errs = (result as { errors?: string[] }).errors;
if (Array.isArray(errs) && errs.length > 0) return `${result.subtype}: ${errs.join('; ')}`;
return result.subtype;
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
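
The comment in prompt() explains why the drain loop calls gen.next() by hand instead of using for-await. A minimal sketch of that difference follows. warmTurns is a hypothetical stand-in for the warm query (it is not the SDK's generator), and the helper names are illustrative only.

```typescript
// Hypothetical stand-in for a warm streaming query: yields one result per turn, forever.
async function* warmTurns(): AsyncGenerator<string, void, undefined> {
  let turn = 0;
  while (true) {
    turn += 1;
    yield `result-${turn}`;
  }
}

// Returning out of a for-await loop calls gen.return(), which closes the generator.
async function firstViaForAwait(gen: AsyncGenerator<string, void, undefined>): Promise<string | null> {
  for await (const msg of gen) {
    return msg;
  }
  return null; // reached when the generator is already finished
}

// A manual next() leaves the generator suspended at its yield, ready for the next turn.
async function firstViaNext(gen: AsyncGenerator<string, void, undefined>): Promise<string | null> {
  const step = await gen.next();
  return step.done ? null : step.value;
}

// Run two "turns" against each style and collect what each turn saw.
async function compareIteration(): Promise<{ forAwait: (string | null)[]; manual: (string | null)[] }> {
  const a = warmTurns();
  const forAwait = [await firstViaForAwait(a), await firstViaForAwait(a)];
  const b = warmTurns();
  const manual = [await firstViaNext(b), await firstViaNext(b)];
  await b.return(undefined); // deliberate teardown, as teardownQuery() does
  return { forAwait, manual };
}
```

In this sketch the second for-await turn finds a finished generator, while the second manual turn still receives a value. That is the failure mode the comment describes for follow-up messages.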


@@ -0,0 +1,117 @@
import type { SessionStore, SessionKey, SessionStoreEntry } from '@anthropic-ai/claude-agent-sdk';
import type { Sql } from '../../db.js';
/**
* claude-sdk-sessionstore #9 (Part 1) — clean-room PostgresSessionStore.
*
* A Postgres-backed implementation of the Claude Agent SDK's `SessionStore`
* adapter type. The SDK mirrors each transcript line (a JSON-safe POJO with a
* `type` discriminant) to this store via `append`; on resume it calls `load`
* to materialize the full transcript back. We treat entries as opaque blobs and
* preserve append order via a BIGSERIAL `id` — `load` replays `ORDER BY id`.
*
* Storage shape: one row per entry in `claude_session_entries`, keyed by the
* SDK's `SessionKey` (project_key, session_id, subpath). The SDK uses an
* *undefined* subpath for the main transcript and disallows the empty string;
* we collapse `undefined → ''` so the main transcript and subagent files share
* one table, distinguished by the `subpath` column (`'' = main`).
*
* Clean-room: written against the SDK's published `SessionStore` type contract
* and BooCode's existing SQL conventions (porsager tagged templates, `sql.json`
* for JSONB). No SDK example/reference code was consulted.
*/
export class PostgresSessionStore implements SessionStore {
constructor(private readonly sql: Sql) {}
/**
* Mirror a batch of transcript entries. No-op on an empty batch; otherwise a
* single multi-row INSERT writes them in array order. Because `id` is a
* monotonically-increasing BIGSERIAL, the insert order is the replay order
* `load` reconstructs — entries within one call land in the order given.
*/
async append(key: SessionKey, entries: SessionStoreEntry[]): Promise<void> {
if (entries.length === 0) return;
const subpath = key.subpath ?? '';
const rows = entries.map((entry) => ({
project_key: key.projectKey,
session_id: key.sessionId,
subpath,
entry: this.sql.json(entry as never),
}));
await this.sql`
INSERT INTO claude_session_entries ${this.sql(rows, 'project_key', 'session_id', 'subpath', 'entry')}
`;
}
/**
* Load a full transcript for resume. Returns the entries in append order, or
* `null` for a (project_key, session_id, subpath) key that was never written.
*/
async load(key: SessionKey): Promise<SessionStoreEntry[] | null> {
const subpath = key.subpath ?? '';
const rows = await this.sql<{ entry: SessionStoreEntry }[]>`
SELECT entry
FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath = ${subpath}
ORDER BY id
`;
if (rows.length === 0) return null;
return rows.map((r) => r.entry);
}
/**
* List the main transcripts for a project. `mtime` is the storage write time
* (latest `created_at` for the session) in Unix epoch milliseconds; the SDK
* sorts the result by mtime descending.
*/
async listSessions(projectKey: string): Promise<Array<{ sessionId: string; mtime: number }>> {
const rows = await this.sql<{ session_id: string; mtime: string }[]>`
SELECT session_id, extract(epoch FROM max(created_at)) * 1000 AS mtime
FROM claude_session_entries
WHERE project_key = ${projectKey}
AND subpath = ''
GROUP BY session_id
`;
return rows.map((r) => ({ sessionId: r.session_id, mtime: Number(r.mtime) }));
}
/**
* Delete a session. With a `subpath` set, only that subpath's rows are
* removed; with `subpath` omitted, every row for the session is removed
* (all subpaths, including the main transcript).
*/
async delete(key: SessionKey): Promise<void> {
if (key.subpath !== undefined) {
await this.sql`
DELETE FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath = ${key.subpath}
`;
return;
}
await this.sql`
DELETE FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
`;
}
/**
* List the distinct non-main subpaths under a session (e.g. subagent files).
* Used during resume to discover and materialize subagent transcripts; the
* main transcript (`subpath = ''`) is excluded.
*/
async listSubkeys(key: { projectKey: string; sessionId: string }): Promise<string[]> {
const rows = await this.sql<{ subpath: string }[]>`
SELECT DISTINCT subpath
FROM claude_session_entries
WHERE project_key = ${key.projectKey}
AND session_id = ${key.sessionId}
AND subpath <> ''
`;
return rows.map((r) => r.subpath);
}
}
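
The store's contract (append order preserved, an undefined subpath stored as '', delete scoped by subpath) can be sketched without a database. This is an in-memory illustration only. Key, Entry, Row, and InMemorySessionStore are hypothetical local names, not the SDK's types, and the numeric counter merely stands in for the BIGSERIAL id.

```typescript
// Local stand-ins for the SDK key/entry types; these shapes are assumptions for illustration.
interface Key { projectKey: string; sessionId: string; subpath?: string }
type Entry = { type: string };
interface Row { id: number; projectKey: string; sessionId: string; subpath: string; entry: Entry }

class InMemorySessionStore {
  private rows: Row[] = [];
  private nextId = 1; // stands in for the BIGSERIAL id column

  async append(key: Key, entries: Entry[]): Promise<void> {
    const subpath = key.subpath ?? ''; // undefined (main transcript) is stored as ''
    for (const entry of entries) {
      this.rows.push({ id: this.nextId++, projectKey: key.projectKey, sessionId: key.sessionId, subpath, entry });
    }
  }

  async load(key: Key): Promise<Entry[] | null> {
    const subpath = key.subpath ?? '';
    const hits = this.rows
      .filter((r) => this.sameSession(r, key) && r.subpath === subpath)
      .sort((a, b) => a.id - b.id); // replay in append order
    return hits.length === 0 ? null : hits.map((r) => r.entry);
  }

  async listSubkeys(key: Key): Promise<string[]> {
    const subpaths = this.rows
      .filter((r) => this.sameSession(r, key) && r.subpath !== '')
      .map((r) => r.subpath);
    return Array.from(new Set(subpaths)); // distinct non-main subpaths
  }

  async delete(key: Key): Promise<void> {
    // With a subpath: drop only that subpath. Without: drop every row of the session.
    this.rows = this.rows.filter(
      (r) => !(this.sameSession(r, key) && (key.subpath === undefined || r.subpath === key.subpath)),
    );
  }

  private sameSession(r: Row, key: Key): boolean {
    return r.projectKey === key.projectKey && r.sessionId === key.sessionId;
  }
}

async function demoStore(): Promise<{ main: string[]; subkeys: string[]; afterDelete: Entry[] | null }> {
  const store = new InMemorySessionStore();
  const main: Key = { projectKey: 'p', sessionId: 's' };
  await store.append(main, [{ type: 'user' }, { type: 'assistant' }]);
  await store.append({ ...main, subpath: 'agents/a1' }, [{ type: 'user' }]);
  const loaded = (await store.load(main)) ?? [];
  const subkeys = await store.listSubkeys(main);
  await store.delete(main); // no subpath: removes main and subagent rows
  const afterDelete = await store.load({ ...main, subpath: 'agents/a1' });
  return { main: loaded.map((e) => e.type), subkeys, afterDelete };
}
```

A stand-in like this may be useful for unit-testing callers without Postgres, under the assumption that the real table behaves the same way for these four operations.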


@@ -0,0 +1,96 @@
/**
* claude-sdk-sessionstore #9 (Part 2) — a tiny PURE pushable async-iterable.
*
* The Claude Agent SDK's streaming-input mode wants `query({ prompt })` where
* `prompt` is an `AsyncIterable<SDKUserMessage>`. To keep ONE `query()` generator
* alive across many turns (the "warm" property), the backend feeds it ONE user
* message per `prompt()` turn through a queue that stays open between turns and is
* only closed at `closeSession`/`dispose`. This is that queue.
*
* Semantics (the bit worth unit-testing — push/close/iterate ordering):
* - `push(v)` enqueues a value. If a consumer is parked in `await next()`, it's
* handed the value immediately; otherwise the value buffers in FIFO order.
* - The async iterator yields buffered/pushed values in push order, and PARKS
* (never busy-loops) when the buffer is empty — so the SDK generator waits for
* the next turn's message instead of seeing end-of-input.
* - `close()` ends the iterable: any parked consumer resolves `{done:true}` and
* all future `next()`s return done. Values pushed after close are dropped.
* - It's single-consumer (one `query()` reads it); concurrent consumers are not a
* supported shape and not needed here.
*
* No SDK import — generic over the pushed value `T` — so the pure push/close/iterate
* ordering is testable without the `SDKUserMessage` shape or a live binary.
*/
export interface Pushable<T> {
/** Enqueue a value (or hand it to a parked consumer). No-op after close. */
push(value: T): void;
/** End the iterable. Idempotent; a parked consumer resolves done. */
close(): void;
/** True once `close()` has been called. */
readonly closed: boolean;
/** The async-iterable the consumer (the SDK `query`) drives. */
readonly iterable: AsyncIterable<T>;
}
export function createPushable<T>(): Pushable<T> {
const buffer: T[] = [];
// A waiting consumer's resolver (null when none is parked). Single-consumer.
let pendingResolve: ((res: IteratorResult<T>) => void) | null = null;
let closed = false;
function push(value: T): void {
if (closed) return;
if (pendingResolve) {
const resolve = pendingResolve;
pendingResolve = null;
resolve({ value, done: false });
return;
}
buffer.push(value);
}
function close(): void {
if (closed) return;
closed = true;
if (pendingResolve) {
const resolve = pendingResolve;
pendingResolve = null;
resolve({ value: undefined, done: true });
}
}
const iterator: AsyncIterator<T> = {
next(): Promise<IteratorResult<T>> {
// Drain the buffer first (FIFO), regardless of close — buffered values
// pushed before close are still delivered.
if (buffer.length > 0) {
return Promise.resolve({ value: buffer.shift() as T, done: false });
}
if (closed) {
return Promise.resolve({ value: undefined, done: true });
}
// Park until the next push/close. Single-consumer: only one waiter at a time.
return new Promise<IteratorResult<T>>((resolve) => {
pendingResolve = resolve;
});
},
return(): Promise<IteratorResult<T>> {
// Consumer abandoned the loop (e.g. `break`) → close so a later push no-ops.
close();
return Promise.resolve({ value: undefined, done: true });
},
};
return {
push,
close,
get closed() {
return closed;
},
iterable: {
[Symbol.asyncIterator]() {
return iterator;
},
},
};
}
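
The push/close/iterate ordering described in the header comment can be exercised with a short script. The sketch below inlines a condensed restatement of the queue so it runs on its own. makeQueue and demoQueue are hypothetical names, and the condensed version exposes next() directly rather than an AsyncIterable.

```typescript
// Condensed restatement of the queue above, kept inline so the sketch is self-contained.
function makeQueue<T>() {
  const buffer: T[] = [];
  let waiter: ((r: IteratorResult<T>) => void) | null = null;
  let closed = false;
  const finished = (): IteratorResult<T> => ({ value: undefined, done: true });
  return {
    push(value: T): void {
      if (closed) return; // dropped after close
      if (waiter) {
        const w = waiter;
        waiter = null;
        w({ value, done: false }); // hand straight to the parked consumer
        return;
      }
      buffer.push(value);
    },
    close(): void {
      if (closed) return;
      closed = true;
      if (waiter) {
        const w = waiter;
        waiter = null;
        w(finished());
      }
    },
    next(): Promise<IteratorResult<T>> {
      if (buffer.length > 0) return Promise.resolve({ value: buffer.shift() as T, done: false });
      if (closed) return Promise.resolve(finished());
      return new Promise<IteratorResult<T>>((resolve) => {
        waiter = resolve; // park until the next push or close
      });
    },
  };
}

async function demoQueue(): Promise<string[]> {
  const q = makeQueue<string>();
  const seen: string[] = [];
  q.push('turn-1'); // buffered: nobody is waiting yet
  seen.push((await q.next()).value as string);
  const parked = q.next(); // buffer empty, so the consumer parks
  q.push('turn-2'); // delivered to the parked consumer
  seen.push((await parked).value as string);
  q.push('turn-3'); // buffered before close, still delivered
  q.close();
  q.push('dropped'); // no-op after close
  seen.push((await q.next()).value as string);
  seen.push((await q.next()).done ? 'done' : 'open');
  return seen;
}
```

The parked read is what lets a long-lived consumer wait between turns instead of seeing end-of-input; only close() ends the stream.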


@@ -1,7 +1,7 @@
 import type { Sql } from '../db.js';
 import type { FastifyBaseLogger } from 'fastify';
 import type { Broker } from '@boocode/server/broker';
-import type { WsFrame } from '@boocode/server/ws-frames';
+import type { WsFrame } from '@boocode/contracts/ws-frames';
 import type { Config } from '../config.js';
 import { createWorktree, diffWorktree, cleanupWorktree, ensureSessionWorktree } from './worktrees.js';
 import { createCheckpoint } from './checkpoints.js';
@@ -16,8 +16,12 @@ import { snapshotToWireToolCall, type AcpToolSnapshot } from './acp-tool-snapsho
 import { agentPool, OPENCODE_POOL_KEY } from './agent-pool.js';
 import { OpenCodeServerBackend } from './backends/opencode-server.js';
 import { WarmAcpBackend } from './backends/warm-acp.js';
+import { ClaudeSdkBackend } from './backends/claude-sdk.js';
 import { shouldUseWarmBackend } from './backends/warm-acp-routing.js';
+import { shouldUseClaudeSdk } from './backends/claude-sdk-routing.js';
 import type { AgentBackend, AgentEvent } from './agent-backend.js';
+import { publishAgentStatus } from './agent-status-publish.js';
+import type { AgentStatus } from './normalize-agent-status.js';
 interface InferenceRunner {
 enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void;
@@ -64,6 +68,21 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 return task.session_id ?? `task:${task.id}`;
 }
+// agent-status-normalize (#10): publish a normalized per-(chat,agent) status on
+// the session channel. Every external-agent path (warm-acp / opencode / claude-sdk /
+// pty one-shot) reports `working` at turn start, `idle` on clean completion, and
+// `error` on the failure path through this single helper so the four paths stay
+// DRY and consistent. Best-effort — publishAgentStatus never throws.
+function emitAgentStatus(
+sessionId: string,
+chatId: string,
+agent: string,
+status: AgentStatus,
+reason: string,
+): void {
+publishAgentStatus(broker.publishFrame, sessionId, chatId, agent, status, reason);
+}
 async function poll(): Promise<void> {
 // `polling` serializes poll() execution itself (timer + NOTIFY can fire
 // concurrently) so we never double-select a task. It does NOT serialize task
@@ -131,6 +150,12 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 // existing one-shot worktree-per-task ACP/PTY path untouched.
 if (task.agent === 'opencode') {
 await runOpenCodeServerTask(task, agentRow.install_path);
+} else if (shouldUseClaudeSdk(task)) {
+// claude-sdk-sessionstore #9 (Part 2): env-flagged (CLAUDE_SDK_BACKEND, default
+// OFF) warm Claude-SDK backend for chat-tab claude tasks. When the flag is off
+// (production default) this predicate returns false and claude falls through to
+// the UNCHANGED one-shot PTY runExternalAgent path below.
+await runClaudeSdkTask(task, agentRow.install_path);
 } else if (shouldUseWarmBackend(task)) {
 await runWarmAcpTask(task, agentRow.install_path);
 } else {
@@ -188,8 +213,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 RETURNING id
 `;
 const [assistantMsg] = await sql<{ id: string }[]>`
-INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
+INSERT INTO messages (session_id, chat_id, role, content, status, model, created_at)
+VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
 RETURNING id
 `;
 const assistantId = assistantMsg!.id;
@@ -290,6 +315,11 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 // Create an abort controller for this task
 const ac = new AbortController();
+// #10: hoisted above the try so the catch block can report `error` status with
+// the (chat, agent) key. Empty until resolved below; guarded before use.
+let sessionId = '';
+let chatId = '';
 try {
 // Mark running
 await sql`
@@ -298,9 +328,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 WHERE id = ${taskId}
 `;
-let sessionId: string;
-let chatId: string;
 if (task.session_id) {
 sessionId = task.session_id;
 const chats = await sql<{ id: string }[]>`
@@ -353,8 +380,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 let acpReasoning = '';
 const [assistantMsg] = await sql<{ id: string }[]>`
-INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
+INSERT INTO messages (session_id, chat_id, role, content, status, model, created_at)
+VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
 RETURNING id
 `;
 const assistantId = assistantMsg!.id;
@@ -376,6 +403,9 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 role: 'assistant',
 } as WsFrame);
+// #10: external-agent turn begins.
+emitAgentStatus(sessionId, chatId, agent, 'working', 'turn_start');
 const manifestCommands = getManifestCommands(agent);
 if (manifestCommands.length > 0) {
 setTaskCommands(taskId, manifestCommands);
@@ -410,6 +440,52 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 outputSummary = result.output.slice(0, 500);
 await persistExternalAgentTurn(sql, assistantId, result.toolSnapshots, acpReasoning);
 } else {
+// v#7 (stream-json): claude + qwen run with --output-format stream-json.
+// Parse the NDJSON live in pty-dispatch and forward AgentEvents here so we
+// publish the SAME live frames the warm-ACP / opencode paths emit (text,
+// reasoning, tool) and persist structured parts. Accumulate for the final
+// message content + persistence; fall back to the opaque stdout slice when
+// nothing parsed (agent ran without the flag, or crashed before emitting).
+const ptyTextChunks: string[] = [];
+const ptyReasoningChunks: string[] = [];
+const ptyToolSnaps = new Map<string, AcpToolSnapshot>();
+const onPtyEvent = (e: AgentEvent): void => {
+switch (e.type) {
+case 'text':
+ptyTextChunks.push(e.text);
+broker.publishFrame(sessionId, {
+type: 'delta',
+message_id: assistantId,
+chat_id: chatId,
+content: e.text,
+} as WsFrame);
+break;
+case 'reasoning':
+ptyReasoningChunks.push(e.text);
+broker.publishFrame(sessionId, {
+type: 'reasoning_delta',
+message_id: assistantId,
+chat_id: chatId,
+content: e.text,
+} as WsFrame);
+break;
+case 'tool_call':
+case 'tool_update':
+ptyToolSnaps.set(e.toolCall.toolCallId, e.toolCall);
+broker.publishFrame(sessionId, {
+type: 'tool_call',
+message_id: assistantId,
+chat_id: chatId,
+tool_call: snapshotToWireToolCall(e.toolCall),
+} as WsFrame);
+break;
+case 'commands':
+// stream-json carries no commands today; ignore if it ever does.
+break;
+}
+};
 const result = await dispatchViaPty({
 agent,
 task: task.input,
@@ -420,17 +496,33 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 thinkingOptionId: task.thinking_option_id ?? undefined,
 signal: ac.signal,
 log,
+onEvent: onPtyEvent,
 });
-assistantContent = (result.stdout || result.stderr || '(no output)').slice(0, 50_000);
-outputSummary = (result.stdout || result.stderr).slice(0, 500);
-if (assistantContent) {
-broker.publishFrame(sessionId, {
-type: 'delta',
-message_id: assistantId,
-chat_id: chatId,
-content: assistantContent,
-} as WsFrame);
+if (result.streamed) {
+assistantContent = ptyTextChunks.join('').slice(0, 50_000);
+// stream-json text can be empty for a tool-only turn — surface stderr or a
+// placeholder so the message row isn't blank.
+if (!assistantContent) {
+assistantContent = (result.stderr || '(no text output)').slice(0, 50_000);
+}
+outputSummary = (ptyTextChunks.join('') || result.stderr).slice(0, 500);
+acpReasoning = ptyReasoningChunks.join('').slice(0, 200_000);
+await persistExternalAgentTurn(sql, assistantId, [...ptyToolSnaps.values()], acpReasoning);
+} else {
+// Fallback: agent produced no parseable NDJSON (ran without the flag, or
+// crashed). Preserve today's opaque stdout-slice + single delta behavior.
+assistantContent = (result.stdout || result.stderr || '(no output)').slice(0, 50_000);
+outputSummary = (result.stdout || result.stderr).slice(0, 500);
+if (assistantContent) {
+broker.publishFrame(sessionId, {
+type: 'delta',
+message_id: assistantId,
+chat_id: chatId,
+content: assistantContent,
+} as WsFrame);
+}
 }
 }
@@ -444,6 +536,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 type: 'message_complete',
 message_id: assistantId,
 chat_id: chatId,
+model: task.model,
 } as WsFrame);
 if (stopping) {
@@ -488,6 +581,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
 WHERE id = ${taskId}
 `;
 log.info({ taskId, agent, costTokens: extCostTokens }, 'dispatcher: task completed (external)');
+// #10: external-agent turn completed cleanly.
+emitAgentStatus(sessionId, chatId, agent, 'idle', 'turn_complete');
 clearTaskCommands(taskId);
 } catch (err) {
@@ -500,6 +595,11 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
WHERE id = ${taskId} WHERE id = ${taskId}
`.catch(() => {}); `.catch(() => {});
// #10: external-agent turn failed/crashed. chatId may be unbound if the throw
// preceded its assignment — guard so the status publish never masks the real
// error.
if (chatId) emitAgentStatus(sessionId, chatId, agent, 'error', 'failed');
// Best-effort cleanup // Best-effort cleanup
await cleanupWorktree(projectPath, taskId); await cleanupWorktree(projectPath, taskId);
clearTaskCommands(taskId); clearTaskCommands(taskId);
@@ -554,6 +654,10 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
const ac = new AbortController(); const ac = new AbortController();
// #10: hoisted so the catch can report `error` with the (chat, agent) key.
let sessionId = '';
let chatId = '';
try { try {
// execution_path = 'acp' — the schema CHECK has no 'opencode_server' value // execution_path = 'acp' — the schema CHECK has no 'opencode_server' value
// (schema is frozen at Phase 0); the warm-vs-one-shot distinction lives in // (schema is frozen at Phase 0); the warm-vs-one-shot distinction lives in
@@ -570,8 +674,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
// it directly. Session-less creators (arena, MCP, new_task, generic // it directly. Session-less creators (arena, MCP, new_task, generic
// /api/tasks) leave it null; fall back to resolving/creating a real chat so // /api/tasks) leave it null; fall back to resolving/creating a real chat so
// ensureSession never receives a degenerate (null, agent) key. // ensureSession never receives a degenerate (null, agent) key.
let sessionId: string;
let chatId: string;
if (task.chat_id && task.session_id) { if (task.chat_id && task.session_id) {
sessionId = task.session_id; sessionId = task.session_id;
chatId = task.chat_id; chatId = task.chat_id;
@@ -622,8 +724,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
log.info({ taskId, worktreePath }, 'dispatcher: session worktree ready'); log.info({ taskId, worktreePath }, 'dispatcher: session worktree ready');
const [assistantMsg] = await sql<{ id: string }[]>` const [assistantMsg] = await sql<{ id: string }[]>`
-INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
+INSERT INTO messages (session_id, chat_id, role, content, status, model, created_at)
+VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; const assistantId = assistantMsg!.id;
@@ -644,6 +746,9 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
role: 'assistant', role: 'assistant',
} as WsFrame); } as WsFrame);
// #10: opencode-server turn begins.
emitAgentStatus(sessionId, chatId, agent, 'working', 'turn_start');
const manifestCommands = getManifestCommands(agent); const manifestCommands = getManifestCommands(agent);
if (manifestCommands.length > 0) { if (manifestCommands.length > 0) {
setTaskCommands(taskId, manifestCommands); setTaskCommands(taskId, manifestCommands);
@@ -760,6 +865,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
type: 'message_complete', type: 'message_complete',
message_id: assistantId, message_id: assistantId,
chat_id: chatId, chat_id: chatId,
model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) { if (stopping) {
@@ -803,6 +909,14 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
WHERE id = ${taskId} WHERE id = ${taskId}
`; `;
log.info({ taskId, agent, finalState, costTokens: extCostTokens }, 'dispatcher: task finished (opencode server)'); log.info({ taskId, agent, finalState, costTokens: extCostTokens }, 'dispatcher: task finished (opencode server)');
// #10: clean completion → idle; backend-reported failure → error.
emitAgentStatus(
sessionId,
chatId,
agent,
result.ok ? 'idle' : 'error',
result.ok ? 'turn_complete' : 'failed',
);
clearTaskCommands(taskId); clearTaskCommands(taskId);
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
@@ -812,6 +926,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId}
`.catch(() => {}); `.catch(() => {});
// #10: turn crashed.
if (chatId) emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed');
clearTaskCommands(taskId); clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn. // No worktree cleanup (persistent); backend stays warm for the next turn.
} }
@@ -890,8 +1006,8 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
log.info({ taskId, worktreePath }, 'dispatcher: session worktree ready (warm ACP)'); log.info({ taskId, worktreePath }, 'dispatcher: session worktree ready (warm ACP)');
const [assistantMsg] = await sql<{ id: string }[]>` const [assistantMsg] = await sql<{ id: string }[]>`
-INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
-VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
+INSERT INTO messages (session_id, chat_id, role, content, status, model, created_at)
+VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; const assistantId = assistantMsg!.id;
@@ -912,6 +1028,9 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
role: 'assistant', role: 'assistant',
} as WsFrame); } as WsFrame);
// #10: warm-ACP turn begins.
emitAgentStatus(sessionId, chatId, agent, 'working', 'turn_start');
const manifestCommands = getManifestCommands(agent); const manifestCommands = getManifestCommands(agent);
if (manifestCommands.length > 0) { if (manifestCommands.length > 0) {
setTaskCommands(taskId, manifestCommands); setTaskCommands(taskId, manifestCommands);
@@ -1011,6 +1130,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
type: 'message_complete', type: 'message_complete',
message_id: assistantId, message_id: assistantId,
chat_id: chatId, chat_id: chatId,
model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) { if (stopping) {
@@ -1053,6 +1173,14 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
WHERE id = ${taskId} WHERE id = ${taskId}
`; `;
log.info({ taskId, agent, finalState }, 'dispatcher: task finished (warm ACP)'); log.info({ taskId, agent, finalState }, 'dispatcher: task finished (warm ACP)');
// #10: clean completion → idle; backend-reported failure → error.
emitAgentStatus(
sessionId,
chatId,
agent,
result.ok ? 'idle' : 'error',
result.ok ? 'turn_complete' : 'failed',
);
clearTaskCommands(taskId); clearTaskCommands(taskId);
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
@@ -1062,6 +1190,266 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId}
`.catch(() => {}); `.catch(() => {});
// #10: turn crashed.
emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed');
clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn.
}
}
// ─── Path B (claude SDK): warm Claude-SDK backend (v2.6 #9 Part 2) ───────────
// Claude-SDK backends are per (chat, agent) — each owns ONE persistent query()
// generator driven in streaming-input mode. Pool key = chatId (secondary = agent),
// mirroring agent_sessions' (chat_id, agent) PK + the warm-ACP pooling.
function getClaudeSdkBackend(chatId: string, agent: string, installPath: string | null): ClaudeSdkBackend {
let backend = agentPool.get(chatId, agent);
if (!backend) {
backend = new ClaudeSdkBackend({ sql, log, chatId, agent, installPath });
agentPool.register(chatId, agent, backend);
}
return backend as ClaudeSdkBackend;
}
async function runClaudeSdkTask(
task: {
id: string;
project_id: string;
input: string;
agent: string | null;
model: string | null;
mode_id: string | null;
thinking_option_id: string | null;
session_id: string | null;
chat_id: string | null;
},
installPath: string | null,
): Promise<void> {
const taskId = task.id;
const agent = task.agent!;
// shouldUseClaudeSdk guarantees both non-null before we get here.
const sessionId = task.session_id!;
const chatId = task.chat_id!;
log.info({ taskId, agent, chatId }, 'dispatcher: starting task (path B — claude SDK)');
const [project] = await sql<{ path: string | null }[]>`
SELECT path FROM projects WHERE id = ${task.project_id}
`;
const projectPath = project?.path;
if (!projectPath) {
await sql`
UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = 'Project has no path — cannot create worktree'
WHERE id = ${taskId}
`;
return;
}
const ac = new AbortController();
try {
await sql`
UPDATE tasks
SET state = 'running', started_at = clock_timestamp(), execution_path = 'acp'
WHERE id = ${taskId}
`;
// Persistent, session-keyed worktree (shared across turns + agents; NOT torn
// down per turn — Phase 3 reaps it). Same as the opencode/warm-ACP paths so a
// chat that switches agents shares one worktree.
const { worktreeId, worktreePath, baseCommit } = await ensureSessionWorktree(sql, projectPath, sessionId, {
signal: ac.signal,
});
log.info({ taskId, worktreePath }, 'dispatcher: session worktree ready (claude SDK)');
const [assistantMsg] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, model, created_at)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id
`;
const assistantId = assistantMsg!.id;
// write-edit-robustness #4: pre-turn checkpoint of the persistent session
// worktree (best-effort; never breaks dispatch).
await createCheckpoint(
sql,
{ chatId, sessionId, worktreeId, worktreePath, messageId: assistantId },
{ signal: ac.signal, log },
).catch(() => null);
broker.publishFrame(sessionId, {
type: 'message_started',
message_id: assistantId,
chat_id: chatId,
role: 'assistant',
} as WsFrame);
// #10: claude-SDK turn begins.
emitAgentStatus(sessionId, chatId, agent, 'working', 'turn_start');
const manifestCommands = getManifestCommands(agent);
if (manifestCommands.length > 0) {
setTaskCommands(taskId, manifestCommands);
broker.publishFrame(sessionId, {
type: 'agent_commands',
task_id: taskId,
session_id: sessionId,
commands: manifestCommands,
} as WsFrame);
}
// Accumulate the turn's stream for persistence + the final message content.
const textChunks: string[] = [];
const reasoningChunks: string[] = [];
const toolSnaps = new Map<string, AcpToolSnapshot>();
// Map transport-agnostic AgentEvents → the SAME WS frames the warm-ACP /
// opencode paths emit. This boundary attaches message_id/chat_id.
const onEvent = (e: AgentEvent): void => {
switch (e.type) {
case 'text':
textChunks.push(e.text);
broker.publishFrame(sessionId, {
type: 'delta',
message_id: assistantId,
chat_id: chatId,
content: e.text,
} as WsFrame);
break;
case 'reasoning':
reasoningChunks.push(e.text);
broker.publishFrame(sessionId, {
type: 'reasoning_delta',
message_id: assistantId,
chat_id: chatId,
content: e.text,
} as WsFrame);
break;
case 'tool_call':
case 'tool_update':
toolSnaps.set(e.toolCall.toolCallId, e.toolCall);
broker.publishFrame(sessionId, {
type: 'tool_call',
message_id: assistantId,
chat_id: chatId,
tool_call: snapshotToWireToolCall(e.toolCall),
} as WsFrame);
break;
case 'commands':
if (e.commands.length > 0) {
setTaskCommands(taskId, e.commands);
broker.publishFrame(sessionId, {
type: 'agent_commands',
task_id: taskId,
session_id: sessionId,
commands: e.commands,
} as WsFrame);
}
break;
}
};
const model = task.model ?? undefined;
const backend = getClaudeSdkBackend(chatId, agent, installPath);
const handle = await backend.ensureSession(sessionId, {
agent,
model: model ?? '',
chatId,
worktreePath,
worktreeId,
projectId: task.project_id,
});
const result = await backend.prompt(handle, task.input, {
worktreePath,
model: model ?? '',
signal: ac.signal,
onEvent,
taskId,
modeId: task.mode_id ?? undefined,
});
// Phase 3: keep the pooled (chat,agent) backend warm across the turn.
agentPool.touch(chatId, agent);
const assistantContent = textChunks.join('').slice(0, 50_000);
const reasoningText = reasoningChunks.join('').slice(0, 200_000);
const outputSummary = (result.ok ? textChunks.join('') : result.error ?? 'claude SDK turn failed').slice(0, 500);
await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText);
// ctx_used/ctx_max from the SDK result (1M-aware) → the assistant message, so
// the ContextBar renders a real context-window fill for claude.
await sql`
UPDATE messages
SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp(),
ctx_used = ${result.ctxUsed ?? null}, ctx_max = ${result.ctxMax ?? null}
WHERE id = ${assistantId}
`;
broker.publishFrame(sessionId, {
type: 'message_complete',
message_id: assistantId,
chat_id: chatId,
model: task.model,
} as WsFrame);
if (stopping) {
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
return; // worktree persists (no cleanup); backend stays warm
}
// Diff the persistent worktree against its captured baseline and SUPERSEDE
// the session's prior pending row (latest-wins) — identical to opencode/ACP.
const diff = await diffWorktree(worktreePath, projectPath, {
signal: ac.signal,
baseRef: baseCommit ?? 'HEAD',
});
if (diff) {
await sql`
DELETE FROM pending_changes WHERE session_id = ${sessionId} AND status = 'pending'
`;
await sql`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${projectPath}, 'edit', ${diff}, ${agent})
`;
log.info({ taskId, diffLength: diff.length }, 'dispatcher: diff superseded prior pending change (claude SDK)');
} else {
log.info({ taskId }, 'dispatcher: no changes detected in session worktree (claude SDK)');
}
// NO worktree cleanup — persistent (Phase 3 reaps it). Backend stays warm.
const [extCostRow] = await sql<{ total: number | null }[]>`
SELECT SUM(tokens_used)::int AS total
FROM messages
WHERE session_id = ${sessionId} AND tokens_used IS NOT NULL
`;
const extCostTokens = extCostRow?.total ?? null;
const finalState = result.ok ? 'completed' : 'failed';
await sql`
UPDATE tasks
SET state = ${finalState}, ended_at = clock_timestamp(), output_summary = ${outputSummary}, cost_tokens = ${extCostTokens}
WHERE id = ${taskId}
`;
log.info({ taskId, agent, finalState }, 'dispatcher: task finished (claude SDK)');
// #10: clean completion → idle; backend-reported failure → error.
emitAgentStatus(
sessionId,
chatId,
agent,
result.ok ? 'idle' : 'error',
result.ok ? 'turn_complete' : 'failed',
);
clearTaskCommands(taskId);
} catch (err) {
const errMsg = err instanceof Error ? err.message : String(err);
log.error({ taskId, agent, err: errMsg }, 'dispatcher: claude SDK error');
await sql`
UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId}
`.catch(() => {});
// #10: turn crashed.
emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed');
clearTaskCommands(taskId); clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn. // No worktree cleanup (persistent); backend stays warm for the next turn.
} }


@@ -0,0 +1,92 @@
/**
* normalize-agent-status (#10) — clean-room vendor-event → bucket mapping.
*
* Different coding agents (claude, opencode, codex/gemini, goose, qwen) emit
* lifecycle hook events under inconsistent names: PascalCase (`SessionStart`),
* snake_case (`session_start`), camelCase (`sessionStart`), and a handful of
* provider-specific approval events (`exec_approval_request`). This module
* collapses every known event name into one of three coarse signals:
*
* working — the agent is actively progressing a turn
* blocked — the agent is waiting on a human (permission / approval / question)
* done — the turn / session ended cleanly
*
* `null` is returned for anything unrecognized so callers can ignore noise.
*
* Built now for the scoped status-publish, but specifically shaped for reuse by
* the documented config-injection follow-on: a future notify-hook injected into
* each agent's native config will POST the RAW vendor event name to a BooCoder
* endpoint, which runs this helper to derive the normalized status. The names
* below are facts about each agent's hook surface — not copied vendor code.
*/
export type AgentStatus = 'working' | 'blocked' | 'idle' | 'error';
/** The coarse signal a raw vendor event collapses to. */
export type AgentEventBucket = 'working' | 'blocked' | 'done';
// Each bucket lists the canonical vendor event names. Lookup is
// case-insensitive AND separator-insensitive (snake_case / camelCase /
// PascalCase all fold to the same key), so we normalize the raw input the same
// way before matching rather than enumerating every spelling here.
const WORKING_EVENTS = [
'SessionStart',
'UserPromptSubmit',
'UserPromptSubmitted',
'PostToolUse',
'PostToolUseFailure',
'BeforeAgent',
'AfterTool',
'task_started',
] as const;
const BLOCKED_EVENTS = [
'PreToolUse',
'Notification',
'PermissionRequest',
'exec_approval_request',
'apply_patch_approval_request',
'request_user_input',
] as const;
const DONE_EVENTS = [
'Stop',
'AfterAgent',
'SessionEnd',
'task_complete',
'agent-turn-complete',
] as const;
/**
* Fold a raw event name to a separator/case-insensitive key:
* strip every non-alphanumeric character and lowercase. So `post_tool_use`,
* `postToolUse`, `PostToolUse`, and `POST-TOOL-USE` all map to `posttooluse`.
*/
function foldKey(raw: string): string {
return raw.replace(/[^a-z0-9]/gi, '').toLowerCase();
}
function buildLookup(
groups: ReadonlyArray<readonly [AgentEventBucket, readonly string[]]>,
): Map<string, AgentEventBucket> {
const map = new Map<string, AgentEventBucket>();
for (const [bucket, names] of groups) {
for (const name of names) map.set(foldKey(name), bucket);
}
return map;
}
const EVENT_LOOKUP = buildLookup([
['working', WORKING_EVENTS],
['blocked', BLOCKED_EVENTS],
['done', DONE_EVENTS],
]);
/**
* Map a raw vendor hook-event name to its normalized bucket, or `null` when the
* name is unknown / undefined. Case- and separator-insensitive.
*/
export function normalizeAgentEvent(raw: string | undefined): AgentEventBucket | null {
if (!raw) return null;
return EVENT_LOOKUP.get(foldKey(raw)) ?? null;
}
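
The folding lookup above can be exercised with a small standalone sketch. This is a condensed, illustrative copy that carries only three of the event names; `fold`, `normalize`, and `bucketToStatus` are hypothetical names, and the `done` to `idle` mapping is an assumption inferred from the dispatcher emitting `idle` on `turn_complete`, not something this file defines.

```typescript
// Illustrative, condensed copy of the lookup (subset of event names only).
type Bucket = 'working' | 'blocked' | 'done';

// Strip every non-alphanumeric character, then lowercase.
const fold = (raw: string): string => raw.replace(/[^a-z0-9]/gi, '').toLowerCase();

const lookup = new Map<string, Bucket>([
  [fold('PostToolUse'), 'working'],
  [fold('exec_approval_request'), 'blocked'],
  [fold('agent-turn-complete'), 'done'],
]);

function normalize(raw: string | undefined): Bucket | null {
  if (!raw) return null;
  return lookup.get(fold(raw)) ?? null;
}

// Assumption (not in the diff): a finished turn is published as 'idle'.
function bucketToStatus(bucket: Bucket): 'working' | 'blocked' | 'idle' {
  return bucket === 'done' ? 'idle' : bucket;
}

const samples = [
  'post_tool_use',
  'postToolUse',
  'POST-TOOL-USE',
  'ExecApprovalRequest',
  'agentTurnComplete',
  'unknown_event',
];
const results = samples.map((name) => normalize(name));
```

Every spelling of a known name folds to the same key, so the three `PostToolUse` variants share a bucket, and the unrecognized name yields `null` for the caller to ignore.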


@@ -5,42 +5,28 @@
* (see provider-config-registry.ts). Loading NEVER throws at startup (design.md * (see provider-config-registry.ts). Loading NEVER throws at startup (design.md
* §2.1): a missing file, invalid JSON, or schema mismatch all fall back to * §2.1): a missing file, invalid JSON, or schema mismatch all fall back to
* `{ providers: {} }` (built-ins only, all enabled). * `{ providers: {} }` (built-ins only, all enabled).
*
* Schemas are defined once in @boocode/contracts/provider-config and re-exported
* here so existing importers (routes, tests, registry) don't need path changes.
*/ */
import { readFileSync, writeFileSync } from 'node:fs'; import { readFileSync, writeFileSync } from 'node:fs';
-import { z } from 'zod';
+import {
+  ProviderOverrideSchema,
+  CoderProvidersFileSchema,
+  ProviderConfigPatchSchema,
+  type ProviderOverride,
+  type CoderProvidersFile,
+  type ProviderConfigPatch,
+} from '@boocode/contracts/provider-config';
-// Schemas verbatim from design.md §2.2.
-export const ProviderOverrideSchema = z.object({
-  extends: z.enum(['acp']).optional(), // v2.3: only 'acp' for custom; built-ins omit extends
-  label: z.string().min(1).optional(),
-  description: z.string().optional(),
-  command: z.array(z.string().min(1)).min(1).optional(), // [binary, ...args]
-  env: z.record(z.string()).optional(),
-  enabled: z.boolean().optional(), // default true
-  order: z.number().int().optional(), // UI sort key
-  models: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
-  additionalModels: z.array(z.object({ id: z.string(), label: z.string() })).optional(),
-});
-export const CoderProvidersFileSchema = z.object({
-  providers: z.record(ProviderOverrideSchema).default({}),
-});
-export type ProviderOverride = z.infer<typeof ProviderOverrideSchema>;
-export type CoderProvidersFile = z.infer<typeof CoderProvidersFileSchema>;
-/**
- * PATCH body schema (design.md §6.2). A partial providers map where each value
- * is either a full override object (REPLACES that id's override) or `null`
- * (DELETES the override → revert to the built-in default). Ids absent from the
- * patch are left untouched. The route validates the body against this first
- * (malformed → 422) so a bad shape can never reach the merge/save step.
- */
-export const ProviderConfigPatchSchema = z.object({
-  providers: z.record(ProviderOverrideSchema.nullable()).default({}),
-});
-export type ProviderConfigPatch = z.infer<typeof ProviderConfigPatchSchema>;
+export {
+  ProviderOverrideSchema,
+  CoderProvidersFileSchema,
+  ProviderConfigPatchSchema,
+  type ProviderOverride,
+  type CoderProvidersFile,
+  type ProviderConfigPatch,
+};
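
The PATCH semantics described in the comment (an object replaces that id's override, `null` deletes it, absent ids are untouched) can be sketched without Zod. This is an illustration under simplified assumptions: `Override` is a cut-down stand-in for `ProviderOverride`, and `applyPatch` is a hypothetical name, not the merge function in this file.

```typescript
// Sketch only: simplified stand-ins for the real schema-derived types.
type Override = { label?: string; enabled?: boolean };
type Providers = Record<string, Override>;
type Patch = Record<string, Override | null>;

// Shallow per-id merge: each patch value either replaces the whole override
// for that id, or (when null) removes it so the built-in default applies.
function applyPatch(current: Providers, patch: Patch): Providers {
  const next: Providers = { ...current };
  for (const [id, value] of Object.entries(patch)) {
    if (value === null) delete next[id];
    else next[id] = value;
  }
  return next;
}

const before: Providers = {
  claude: { enabled: false },
  qwen: { label: 'Qwen', enabled: true },
  goose: { enabled: true },
};
const after = applyPatch(before, { claude: null, qwen: { label: 'Qwen Code' } });
```

Because the merge is shallow, the new `qwen` override replaces the old one wholesale (its `enabled` field is not carried over), `claude` loses its override, and `goose` is left as it was.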
/** /**
* Shallow per-id merge (design.md §6.2 / Paseo `patchConfig`). Each key in * Shallow per-id merge (design.md §6.2 / Paseo `patchConfig`). Each key in


@@ -38,6 +38,12 @@ export const PROVIDERS: ProviderDef[] = [
}, },
{ {
name: 'claude', name: 'claude',
// transport stays 'pty' — the DEFAULT dispatch path (one-shot `claude
// --output-format stream-json`). claude-sdk-sessionstore #9 (Part 2) adds a warm
// Claude-Agent-SDK backend (services/backends/claude-sdk.ts) routed ONLY when the
// `CLAUDE_SDK_BACKEND` env flag is truthy AND the task is a chat tab; with the flag
// off (production default) claude always uses this PTY path, so the transport label
// is left unchanged. Flip the env var on a host (after a live smoke) to opt in.
label: 'Claude Code', label: 'Claude Code',
transport: 'pty', transport: 'pty',
modelSource: 'static', modelSource: 'static',


@@ -1,61 +1,10 @@
-/** Shared provider / snapshot types (Paseo-shaped, BooCoder-native). */
+/** Provider snapshot types — re-exported from @boocode/contracts for local consumers. */
-export interface ProviderMode {
-  id: string;
-  label: string;
-  description?: string;
-  /** Auto-approve tool permissions when this mode is selected. */
-  isUnattended?: boolean;
-}
-export interface ThinkingOption {
-  id: string;
-  label: string;
-  isDefault?: boolean;
-}
-export interface ProviderModel {
-  id: string;
-  label: string;
-  description?: string;
-  isDefault?: boolean;
-  thinkingOptions?: ThinkingOption[];
-  defaultThinkingOptionId?: string;
-}
-// v2.3 phase 2: 'loading' (cache-miss, probe in flight) + 'unavailable'
-// (disabled or not installed) restored alongside the terminal 'ready' | 'error'.
-export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
-export interface AgentCommand {
-  name: string;
-  description?: string;
-  // v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
-  // Drives the icon split in the coder slash menu. Undefined → command.
-  kind?: 'command' | 'skill';
-}
-// KEEP IN SYNC with apps/web/src/api/types.ts ProviderSnapshotEntry — parity is
-// enforced by __tests__/provider-types-parity.test.ts (fails on any field drift).
-export interface ProviderSnapshotEntry {
-  name: string;
-  label: string;
-  description?: string;
-  transport: string;
-  status: ProviderSnapshotStatus;
-  enabled: boolean;
-  installed: boolean;
-  models: ProviderModel[];
-  modes: ProviderMode[];
-  defaultModeId: string | null;
-  commands: AgentCommand[];
-  error?: string;
-  fetchedAt?: string;
-}
-export interface AgentSessionConfig {
-  provider: string;
-  model?: string;
-  modeId?: string;
-  thinkingOptionId?: string;
-}
+export type {
+  ProviderMode,
+  ThinkingOption,
+  ProviderModel,
+  ProviderSnapshotStatus,
+  AgentCommand,
+  ProviderSnapshotEntry,
+} from '@boocode/contracts/provider-snapshot';


@@ -1,13 +1,29 @@
/** /**
* PTY dispatch — runs external agents directly on the host. * PTY dispatch — runs external agents directly on the host.
*
* claude + qwen run with `--output-format stream-json` and emit Claude-Code's
* stream-json NDJSON on stdout. When an `onEvent` callback is supplied we
* line-buffer that stdout (split on `\n`, hold the partial tail) and feed complete
* lines to `makeStreamJsonParser` so deltas surface live as AgentEvents. The raw
* stdout is still accumulated + returned for back-compat (and the dispatcher's
* fallback when nothing parsed). See `stream-json-parser.ts`.
*/ */
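
The line-buffering described in the header comment (split on `\n`, hold the partial tail, flush at close) can be sketched in isolation. `makeLineBuffer` is a hypothetical helper written for illustration; the diff inlines this logic in the stdout `data` handler rather than factoring it out, and this sketch takes strings instead of `Buffer` chunks.

```typescript
// Sketch only: `makeLineBuffer` is a hypothetical helper, not part of the diff.
function makeLineBuffer(onLine: (line: string) => void) {
  let tail = ''; // text after the last newline seen so far
  return {
    // Append a chunk and hand every complete line to the sink.
    push(chunk: string): void {
      tail += chunk;
      let nl = tail.indexOf('\n');
      while (nl !== -1) {
        onLine(tail.slice(0, nl));
        tail = tail.slice(nl + 1);
        nl = tail.indexOf('\n');
      }
    },
    // At stream close, emit a final line that had no trailing newline.
    flush(): void {
      if (tail.trim()) onLine(tail);
      tail = '';
    },
  };
}

const lines: string[] = [];
const buf = makeLineBuffer((line) => lines.push(line));
buf.push('{"type":"sys'); // chunk boundary falls mid-object
buf.push('tem"}\n{"type":"result"}'); // completes line 1, starts line 2
buf.flush(); // line 2 had no trailing newline
```

A JSON object split across two chunks is delivered to the sink exactly once and whole, which is what lets the NDJSON parser treat each line independently.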
import type { FastifyBaseLogger } from 'fastify'; import type { FastifyBaseLogger } from 'fastify';
import { spawn } from 'node:child_process'; import { spawn } from 'node:child_process';
import type { AgentEvent } from './agent-backend.js';
import { makeStreamJsonParser, type StreamJsonUsage } from './stream-json-parser.js';
export interface DispatchResult { export interface DispatchResult {
exitCode: number; exitCode: number;
stdout: string; stdout: string;
stderr: string; stderr: string;
/** True iff at least one NDJSON AgentEvent was parsed from stdout (feature #7). When
* false the dispatcher falls back to slicing stdout as the assistant content. */
streamed: boolean;
/** Final usage parsed from the stream-json `result` / `message_delta`, if any. */
usage?: StreamJsonUsage;
/** Provider session id from the stream-json `system` init line, if any. */
agentSessionId?: string | null;
} }
export interface PtyDispatchOpts { export interface PtyDispatchOpts {
@@ -20,6 +36,10 @@ export interface PtyDispatchOpts {
installPath?: string; installPath?: string;
signal?: AbortSignal; signal?: AbortSignal;
log: FastifyBaseLogger; log: FastifyBaseLogger;
/** Optional live event sink. When set, stdout is line-buffered + NDJSON-parsed
* and each AgentEvent is forwarded here as it arrives. Absent → opaque (old)
* behavior: stdout is accumulated and returned, no parsing. */
onEvent?: (e: AgentEvent) => void;
} }
interface PtySpawnSpec { interface PtySpawnSpec {
@@ -40,7 +60,9 @@ function buildPtySpawnSpec(
switch (agent) { switch (agent) {
case 'claude': { case 'claude': {
-const args = ['-p'];
+// stream-json on -p requires --verbose (Claude Code rejects stream-json
+// print mode without it). qwen needs no such flag.
+const args = ['-p', '--output-format', 'stream-json', '--verbose'];
if (model) args.push('--model', model); if (model) args.push('--model', model);
if (modeId) args.push('--permission-mode', modeId); if (modeId) args.push('--permission-mode', modeId);
if (thinkingOptionId) args.push('--effort', thinkingOptionId); if (thinkingOptionId) args.push('--effort', thinkingOptionId);
@@ -73,7 +95,7 @@ function buildPtySpawnSpec(
} }
export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchResult> { export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchResult> {
-const { agent, task, worktreePath, model, modeId, thinkingOptionId, installPath, signal, log } = opts;
+const { agent, task, worktreePath, model, modeId, thinkingOptionId, installPath, signal, log, onEvent } = opts;
const cmd = buildPtySpawnSpec(agent, task, model, modeId, thinkingOptionId, installPath); const cmd = buildPtySpawnSpec(agent, task, model, modeId, thinkingOptionId, installPath);
if (!cmd) { if (!cmd) {
@@ -81,6 +103,7 @@ export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchRes
exitCode: 1, exitCode: 1,
stdout: '', stdout: '',
stderr: `Agent '${agent}' is not yet supported for PTY dispatch.`, stderr: `Agent '${agent}' is not yet supported for PTY dispatch.`,
streamed: false,
}; };
} }
@@ -102,7 +125,32 @@ export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchRes
let stderr = ''; let stderr = '';
let killed = false; let killed = false;
-child.stdout!.on('data', (chunk: Buffer) => { stdout += chunk.toString(); });
+// Live NDJSON parsing (only when a sink is supplied). Line-buffer: split on
+// '\n', dispatch complete lines, hold the partial tail until the next chunk.
+const parser = onEvent ? makeStreamJsonParser() : null;
+let lineBuf = '';
+let streamed = false;
+const feedLine = (line: string): void => {
+  if (!parser || !onEvent) return;
+  for (const e of parser.push(line)) {
+    streamed = true;
+    onEvent(e);
+  }
+};
+child.stdout!.on('data', (chunk: Buffer) => {
+  const text = chunk.toString();
+  stdout += text;
+  if (!parser) return;
+  lineBuf += text;
+  let nl = lineBuf.indexOf('\n');
+  while (nl !== -1) {
+    const line = lineBuf.slice(0, nl);
+    lineBuf = lineBuf.slice(nl + 1);
+    feedLine(line);
+    nl = lineBuf.indexOf('\n');
+  }
+});
child.stderr!.on('data', (chunk: Buffer) => { stderr += chunk.toString(); }); child.stderr!.on('data', (chunk: Buffer) => { stderr += chunk.toString(); });
const cleanup = () => { const cleanup = () => {
@@ -116,7 +164,7 @@ export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchRes
if (signal) { if (signal) {
if (signal.aborted) { if (signal.aborted) {
cleanup(); cleanup();
-resolve({ exitCode: 130, stdout: '', stderr: 'Aborted before start' });
+resolve({ exitCode: 130, stdout: '', stderr: 'Aborted before start', streamed: false });
return; return;
} }
signal.addEventListener('abort', cleanup, { once: true }); signal.addEventListener('abort', cleanup, { once: true });
@@ -124,8 +172,18 @@ export async function dispatchViaPty(opts: PtyDispatchOpts): Promise<DispatchRes
child.on('close', (code) => { child.on('close', (code) => {
if (signal) signal.removeEventListener('abort', cleanup); if (signal) signal.removeEventListener('abort', cleanup);
-log.info({ agent, exitCode: code }, 'pty-dispatch: completed');
-resolve({ exitCode: code ?? 1, stdout, stderr });
+// Flush any final line with no trailing newline.
+if (lineBuf.trim()) feedLine(lineBuf);
+lineBuf = '';
+log.info({ agent, exitCode: code, streamed }, 'pty-dispatch: completed');
+resolve({
+  exitCode: code ?? 1,
+  stdout,
+  stderr,
+  streamed,
+  usage: parser?.usage(),
+  agentSessionId: parser?.sessionId() ?? null,
+});
}); });
child.on('error', (err) => { child.on('error', (err) => {


@@ -0,0 +1,296 @@
/**
* Claude-Code-compatible stream-json NDJSON parser (feature #7,
* openspec `sampling-streamjson-tokens`).
*
* qwen and claude are both run with `--output-format stream-json` (claude also
* needs `--verbose` in `-p` print mode) and emit Claude-Code's stream-json NDJSON
* on stdout: one JSON object per line. This module turns that stream into the
* same transport-agnostic `AgentEvent`s the ACP / opencode-server backends emit,
* so the PTY dispatch path can publish live broker frames and persist structured
* parts instead of treating stdout as an opaque slice.
*
* Two surfaces:
* - `parseStreamJsonLine(line, state)` — PURE per-line mapping (unit-testable).
* `state` is the caller-owned accumulator (open tool blocks + usage/session_id).
* - `makeStreamJsonParser()` — a thin stateful wrapper holding the state, with a
* `push(line)` that returns the events for that line and getters for the final
* `usage` / `sessionId`.
*
* Defensive by contract: a non-JSON / partial / garbage line yields `[]` and never
* throws. Tool args (`input_json_delta`) arrive fragmented across many lines; we
* accumulate the partial JSON string per content-block index and only surface the
* parsed `rawInput` once the block stops (or, as a fallback, off the terminal
* `assistant` message which carries the fully-assembled `tool_use` blocks).
*
* Schema (keyed on top-level `type`):
* - `system` — init: { session_id, tools, ... }
* - `assistant` — { message: { content: [ {type:'text'|'thinking'|'tool_use', ...} ], usage? } }
* - `user` — tool results (ignored — diffing the worktree captures effects)
* - `result` — final: { usage: { input_tokens, output_tokens }, session_id? }
* - `stream_event` — { event: { type, index?, content_block?, delta?, usage? } }
* event.type:
* content_block_start — { index, content_block: {type, id?, name?} }
* content_block_delta — { index, delta: {type, text?|thinking?|partial_json?} }
* content_block_stop — { index }
* message_delta — { usage: { output_tokens } }
* message_start — { message: { usage } }
*/
import type { AgentEvent } from './agent-backend.js';
import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
/** Convenience alias for the per-line return value. */
export type AgentEventList = AgentEvent[];
export interface StreamJsonUsage {
inputTokens?: number;
outputTokens?: number;
}
/** Per-open-content-block accumulation for tool args assembled across deltas. */
interface OpenToolBlock {
toolCallId: string;
name: string;
/** Concatenated `input_json_delta.partial_json` fragments. */
partialJson: string;
}
export interface StreamJsonState {
/** content-block index → open tool block (only `tool_use` blocks are tracked). */
toolBlocks: Map<number, OpenToolBlock>;
sessionId: string | null;
usage: StreamJsonUsage;
}
export function makeStreamJsonState(): StreamJsonState {
return { toolBlocks: new Map(), sessionId: null, usage: {} };
}
function asRecord(value: unknown): Record<string, unknown> | null {
if (value && typeof value === 'object' && !Array.isArray(value)) {
return value as Record<string, unknown>;
}
return null;
}
function asString(value: unknown): string | undefined {
return typeof value === 'string' ? value : undefined;
}
function asNumber(value: unknown): number | undefined {
return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
}
/** Pull token counts out of an Anthropic-shape `usage` object, mutating state. */
function captureUsage(usage: Record<string, unknown> | null, state: StreamJsonState): void {
if (!usage) return;
const input = asNumber(usage.input_tokens);
const output = asNumber(usage.output_tokens);
if (input !== undefined) state.usage.inputTokens = input;
// output_tokens is reported incrementally on message_delta; keep the latest.
if (output !== undefined) state.usage.outputTokens = output;
}
/** Parse the accumulated tool-arg JSON; tolerate an unparseable/partial body. */
function parseToolInput(partialJson: string): unknown {
const trimmed = partialJson.trim();
if (!trimmed) return {};
try {
return JSON.parse(trimmed);
} catch {
return { _raw: partialJson };
}
}
function toolSnapshot(block: OpenToolBlock, rawInput: unknown, status: AcpToolSnapshot['status']): AcpToolSnapshot {
return {
toolCallId: block.toolCallId,
title: block.name,
kind: null,
status,
rawInput,
};
}
/**
* Map one stream-event sub-object (the `event` field of a `stream_event` line) to
* AgentEvents, mutating `state` for open tool blocks + usage.
*/
function handleStreamEvent(event: Record<string, unknown>, state: StreamJsonState): AgentEvent[] {
const eventType = asString(event.type);
if (!eventType) return [];
switch (eventType) {
case 'content_block_start': {
const index = asNumber(event.index);
const block = asRecord(event.content_block);
if (index === undefined || !block) return [];
if (asString(block.type) !== 'tool_use') return [];
const toolCallId = asString(block.id) ?? `tool_${index}`;
const name = asString(block.name) ?? 'tool';
const open: OpenToolBlock = { toolCallId, name, partialJson: '' };
state.toolBlocks.set(index, open);
// Surface the tool start immediately (running, no args yet) so the UI shows
// the call before the args finish streaming.
return [{ type: 'tool_call', toolCall: toolSnapshot(open, {}, 'in_progress') }];
}
case 'content_block_delta': {
const index = asNumber(event.index);
const delta = asRecord(event.delta);
if (delta === null) return [];
const deltaType = asString(delta.type);
if (deltaType === 'text_delta') {
const text = asString(delta.text);
return text ? [{ type: 'text', text }] : [];
}
if (deltaType === 'thinking_delta') {
const text = asString(delta.thinking);
return text ? [{ type: 'reasoning', text }] : [];
}
if (deltaType === 'input_json_delta') {
// Accumulate tool args; no event until the block stops.
const fragment = asString(delta.partial_json);
if (index !== undefined && fragment) {
const open = state.toolBlocks.get(index);
if (open) open.partialJson += fragment;
}
return [];
}
return [];
}
case 'content_block_stop': {
const index = asNumber(event.index);
if (index === undefined) return [];
const open = state.toolBlocks.get(index);
if (!open) return [];
state.toolBlocks.delete(index);
const rawInput = parseToolInput(open.partialJson);
return [{ type: 'tool_update', toolCall: toolSnapshot(open, rawInput, 'completed') }];
}
case 'message_start': {
const message = asRecord(event.message);
captureUsage(asRecord(message?.usage), state);
return [];
}
case 'message_delta': {
captureUsage(asRecord(event.usage), state);
return [];
}
default:
return [];
}
}
/**
* Map the terminal `assistant` message (post-hoc full message) to AgentEvents. Used
* as a fallback for transports that emit only the assembled `assistant` line and no
* incremental `stream_event`s. When stream_events already streamed a block, the
* caller dedups by toolCallId, so re-emitting the assembled tool_use is harmless.
*/
function handleAssistantMessage(message: Record<string, unknown>, state: StreamJsonState): AgentEvent[] {
captureUsage(asRecord(message.usage), state);
const content = message.content;
if (!Array.isArray(content)) return [];
const out: AgentEvent[] = [];
let toolIdx = 0;
for (const rawBlock of content) {
const block = asRecord(rawBlock);
if (!block) continue;
const blockType = asString(block.type);
if (blockType === 'text') {
const text = asString(block.text);
if (text) out.push({ type: 'text', text });
} else if (blockType === 'thinking') {
const text = asString(block.thinking);
if (text) out.push({ type: 'reasoning', text });
} else if (blockType === 'tool_use') {
const toolCallId = asString(block.id) ?? `tool_${toolIdx}`;
const name = asString(block.name) ?? 'tool';
const rawInput = 'input' in block ? block.input : {};
out.push({
type: 'tool_update',
toolCall: { toolCallId, title: name, kind: null, status: 'completed', rawInput },
});
}
toolIdx++;
}
return out;
}
/**
* Pure per-line mapping. `line` is a single complete NDJSON line (no trailing
* newline required; surrounding whitespace tolerated). Returns the AgentEvents the
* line produces and mutates `state` (open tool blocks, usage, session_id). A blank,
* non-JSON, or unrecognized line yields `[]` and never throws.
*/
export function parseStreamJsonLine(line: string, state: StreamJsonState): AgentEvent[] {
const trimmed = line.trim();
if (!trimmed) return [];
let obj: Record<string, unknown> | null;
try {
const parsed: unknown = JSON.parse(trimmed);
obj = asRecord(parsed);
} catch {
return [];
}
if (!obj) return [];
const type = asString(obj.type);
switch (type) {
case 'system': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
return [];
}
case 'stream_event': {
const event = asRecord(obj.event);
return event ? handleStreamEvent(event, state) : [];
}
case 'assistant': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
const message = asRecord(obj.message);
return message ? handleAssistantMessage(message, state) : [];
}
case 'result': {
const sid = asString(obj.session_id);
if (sid) state.sessionId = sid;
captureUsage(asRecord(obj.usage), state);
return [];
}
default:
// `user` (tool results) and any unknown line type — ignore.
return [];
}
}
export interface StreamJsonParser {
/** Feed one complete NDJSON line; returns its AgentEvents (never throws). */
push(line: string): AgentEvent[];
/** Final usage (input/output tokens) accumulated so far. */
usage(): StreamJsonUsage;
/** Provider session id from the init `system` line / `result`, if seen. */
sessionId(): string | null;
}
/**
* Stateful wrapper around `parseStreamJsonLine`. Holds per-tool-block accumulation
* + usage/session_id across the turn. Line-buffering (splitting stdout on `\n` and
* holding the partial tail) is the caller's job — see `pty-dispatch.ts`.
*/
export function makeStreamJsonParser(): StreamJsonParser {
const state = makeStreamJsonState();
return {
push: (line: string) => parseStreamJsonLine(line, state),
usage: () => ({ ...state.usage }),
sessionId: () => state.sessionId,
};
}
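The line-buffering that the `makeStreamJsonParser` doc comment leaves to the caller can be sketched roughly as follows. This is a minimal illustration, not the code in `pty-dispatch.ts`: `makeLineBuffer`, `LineSink`, and `LineBuffer` are hypothetical names, and the sink would typically be the parser's `push`.

```typescript
// Hypothetical helper (not the code in pty-dispatch.ts): splits arbitrary stdout
// chunks into complete lines and holds the unterminated tail between chunks.
type LineSink = (line: string) => void;

interface LineBuffer {
  /** Feed one raw stdout chunk; every complete line is passed to the sink. */
  write(chunk: string): void;
  /** Flush a final line that has no trailing newline (call on process close). */
  flush(): void;
}

function makeLineBuffer(onLine: LineSink): LineBuffer {
  let tail = '';
  return {
    write(chunk: string): void {
      tail += chunk;
      let nl = tail.indexOf('\n');
      while (nl !== -1) {
        onLine(tail.slice(0, nl)); // complete line, newline excluded
        tail = tail.slice(nl + 1); // keep the remainder for the next chunk
        nl = tail.indexOf('\n');
      }
    },
    flush(): void {
      if (tail.trim()) onLine(tail);
      tail = '';
    },
  };
}

// Usage: a JSON object split across two chunks is delivered as one line, and
// the last line (no trailing newline) is delivered by flush().
const lines: string[] = [];
const buf = makeLineBuffer((line) => lines.push(line));
buf.write('{"type":"sys');
buf.write('tem"}\n{"type":"result"}');
buf.flush();
```

Holding the tail in the caller keeps the per-line parser pure: it only ever sees complete lines, so a chunk boundary in the middle of a JSON object cannot produce a spurious parse failure.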

View File

@@ -8,6 +8,7 @@
*/ */
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import { hostExec } from './host-exec.js'; import { hostExec } from './host-exec.js';
import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
export const WORKTREE_BASE = '/tmp/booworktrees'; export const WORKTREE_BASE = '/tmp/booworktrees';
@@ -379,22 +380,8 @@ export async function rebaselineWorktreeAfterApply(
} }
// ─── Session-delete work-loss guard ───────────────────────────────────────── // ─── Session-delete work-loss guard ─────────────────────────────────────────
// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
/** export type { WorktreeRiskReport };
* Risk report for a single worktree, returned by checkWorktreeWorkAtRisk.
* `atRisk` is the gate the server reads before allowing a session delete.
* A git error never silently passes — it forces `atRisk` true and surfaces
* the message in `error` (fail-closed).
*/
export interface RiskReport {
worktreePath: string;
branch: string;
dirty: boolean; // uncommitted working-tree changes (incl. untracked)
unpushed: number; // commits ahead of upstream, or -1 if no upstream is set
unmerged: number; // commits on this branch not in the project default branch
atRisk: boolean; // dirty || unmerged > 0 || (upstream && unpushed > 0) || git error
error?: string; // populated on a git failure; presence forces atRisk
}
/** /**
* Resolve the project's default branch as a git-usable ref (e.g. "origin/main"). * Resolve the project's default branch as a git-usable ref (e.g. "origin/main").
@@ -448,7 +435,7 @@ async function detectDefaultBranchRef(
export async function checkWorktreeWorkAtRisk( export async function checkWorktreeWorkAtRisk(
worktreePath: string, worktreePath: string,
opts?: { signal?: AbortSignal }, opts?: { signal?: AbortSignal },
): Promise<RiskReport> { ): Promise<WorktreeRiskReport> {
// Branch name — also doubles as the "is this still a git worktree?" probe. // Branch name — also doubles as the "is this still a git worktree?" probe.
const br = await hostExec( const br = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --abbrev-ref HEAD`, `git -C ${shellEscape(worktreePath)} rev-parse --abbrev-ref HEAD`,

View File

@@ -5,5 +5,11 @@ export default defineConfig({
environment: 'node', environment: 'node',
globals: false, globals: false,
include: ['src/**/__tests__/**/*.test.ts'], include: ['src/**/__tests__/**/*.test.ts'],
// DB-integration suites (checkpoints, claude-session-store, reconnect, etc.)
// each apply the full schema in beforeAll against the one shared dev DB; running
// test files in parallel makes those concurrent DDL applies deadlock under
// DATABASE_URL. Serialize file execution — the suites are fast, so the cost is
// negligible and the default (no-DATABASE_URL) run is unaffected.
fileParallelism: false,
}, },
}); });

View File

@@ -10,6 +10,7 @@
"preview": "vite preview" "preview": "vite preview"
}, },
"dependencies": { "dependencies": {
"@boocode/contracts": "workspace:*",
"lucide-react": "^1.16.0", "lucide-react": "^1.16.0",
"react": "^18.3.1", "react": "^18.3.1",
"react-dom": "^18.3.1", "react-dom": "^18.3.1",

View File

@@ -1,5 +1,22 @@
// Minimal types for the BooCoder frontend. // Minimal types for the BooCoder frontend.
// Shared DB entities (same schema as BooChat). // Shared DB entities (same schema as BooChat).
//
// WS wire contracts are single-sourced from @boocode/contracts (the canonical
// Zod-backed schema). The DB entity types below (Project/Session/Chat/Message/
// ToolCall/ToolResult/PendingChange) are an intentional minimal SPA-local subset
// and are NOT cross-app contracts — they stay defined here.
import type { WsFrame } from '@boocode/contracts/ws-frames';
// Re-export the canonical WebSocket frame union (single source of truth). The
// coder backend publishes the full frame set; this SPA's reducer handles the
// subset it renders and ignores the rest.
export type { WsFrame };
// The error frame's `reason`, single-sourced from the canonical schema's
// frame-level reason enum (derived from WsFrame so it cannot drift from the
// wire). Distinct from message-metadata's ErrorReason, which is a different set.
export type ErrorReason = NonNullable<Extract<WsFrame, { type: 'error' }>['reason']>;
export interface Project { export interface Project {
id: string; id: string;
@@ -39,7 +56,9 @@ export interface ToolResult {
tool_call_id: string; tool_call_id: string;
output: unknown; output: unknown;
truncated?: boolean; truncated?: boolean;
error?: boolean; // Canonical wire shape: the failure message string (present only on error),
// not a boolean. ToolResultBubble treats it as truthy → renders error styling.
error?: string;
} }
// Batch 9.7: ask_user_input shapes. The tool_call.args is { questions: AskUserQuestion[] } // Batch 9.7: ask_user_input shapes. The tool_call.args is { questions: AskUserQuestion[] }
@@ -96,15 +115,3 @@ export interface PendingChange {
created_at: string; created_at: string;
applied_at: string | null; applied_at: string | null;
} }
// WebSocket frame types (subset of what the coder backend publishes)
export type WsFrame =
| { type: 'snapshot'; messages: Message[] }
| { type: 'message_started'; message_id: string; chat_id: string; role: Message['role'] }
| { type: 'delta'; message_id: string; chat_id: string; content: string }
| { type: 'tool_call'; message_id: string; chat_id: string; tool_call: ToolCall }
| { type: 'tool_result'; tool_message_id: string; chat_id: string; tool_call_id: string; output: string; truncated?: boolean; error?: boolean }
| { type: 'message_complete'; message_id: string; chat_id: string; tokens_used?: number; ctx_used?: number; ctx_max?: number; started_at?: string; finished_at?: string; metadata?: unknown }
| { type: 'error'; message_id?: string; error: string; reason?: string }
| { type: 'pending_change_added'; change: PendingChange }
| { type: 'pending_change_updated'; change: PendingChange };

View File

@@ -5,10 +5,9 @@ import { api } from '@/api/client';
interface Props { interface Props {
sessionId: string; sessionId: string;
onPendingChange: (cb: (change: PendingChange) => void) => () => void;
} }
export function DiffPane({ sessionId, onPendingChange }: Props) { export function DiffPane({ sessionId }: Props) {
const [changes, setChanges] = useState<PendingChange[]>([]); const [changes, setChanges] = useState<PendingChange[]>([]);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const [expandedId, setExpandedId] = useState<string | null>(null); const [expandedId, setExpandedId] = useState<string | null>(null);
@@ -24,27 +23,13 @@ export function DiffPane({ sessionId, onPendingChange }: Props) {
} }
}, [sessionId]); }, [sessionId]);
// Initial load // Initial load. Pending changes are delivered over HTTP (list + apply/reject/
// rewind below); there is no WS pending-change frame, so the list refreshes on
// mount, on the Refresh button, and optimistically as the user acts on it.
useEffect(() => { useEffect(() => {
fetchPending(); fetchPending();
}, [fetchPending]); }, [fetchPending]);
// Listen for WS pending change events
useEffect(() => {
const unsub = onPendingChange((change) => {
setChanges((prev) => {
const idx = prev.findIndex((c) => c.id === change.id);
if (idx >= 0) {
const next = [...prev];
next[idx] = change;
return next;
}
return [...prev, change];
});
});
return unsub;
}, [onPendingChange]);
const pendingChanges = changes.filter((c) => c.status === 'pending'); const pendingChanges = changes.filter((c) => c.status === 'pending');
const resolvedChanges = changes.filter((c) => c.status !== 'pending'); const resolvedChanges = changes.filter((c) => c.status !== 'pending');

View File

@@ -1,5 +1,5 @@
import { useEffect, useRef, useState, useCallback } from 'react'; import { useEffect, useRef, useState } from 'react';
import type { Message, WsFrame, PendingChange } from '@/api/types'; import type { Message, WsFrame } from '@/api/types';
interface State { interface State {
messages: Message[]; messages: Message[];
@@ -10,7 +10,9 @@ interface State {
function applyFrame(state: State, frame: WsFrame): State { function applyFrame(state: State, frame: WsFrame): State {
switch (frame.type) { switch (frame.type) {
case 'snapshot': { case 'snapshot': {
return { ...state, messages: frame.messages }; // Canonical SnapshotFrame.messages is opaque (z.array(z.unknown())); the
// coder backend sends Message-shaped rows, so cast to the SPA's local type.
return { ...state, messages: frame.messages as Message[] };
} }
case 'message_started': { case 'message_started': {
const exists = state.messages.some((m) => m.id === frame.message_id); const exists = state.messages.some((m) => m.id === frame.message_id);
@@ -18,7 +20,7 @@ function applyFrame(state: State, frame: WsFrame): State {
const newMsg: Message = { const newMsg: Message = {
id: frame.message_id, id: frame.message_id,
session_id: '', session_id: '',
chat_id: frame.chat_id, chat_id: frame.chat_id ?? '',
role: frame.role, role: frame.role,
content: '', content: '',
kind: 'message', kind: 'message',
@@ -72,7 +74,7 @@ function applyFrame(state: State, frame: WsFrame): State {
const newMsg: Message = { const newMsg: Message = {
id: frame.tool_message_id, id: frame.tool_message_id,
session_id: '', session_id: '',
chat_id: frame.chat_id, chat_id: frame.chat_id ?? '',
role: 'tool', role: 'tool',
content: '', content: '',
kind: 'message', kind: 'message',
@@ -119,9 +121,12 @@ function applyFrame(state: State, frame: WsFrame): State {
: state.messages; : state.messages;
return { ...state, messages: next, error: frame.error }; return { ...state, messages: next, error: frame.error };
} }
case 'pending_change_added': default:
case 'pending_change_updated': // The canonical WsFrame carries the full set of frames the coder backend
// These are handled by the pending changes listener, not the message state // can publish; this SPA only renders the subset handled above and safely
// ignores the rest (reasoning_delta, usage, permission_*, agent_*, and the
// per-user sidebar frames). pending_change_* frames have no publisher —
// pending changes are delivered over HTTP, so there is nothing to handle.
return state; return state;
} }
} }
@@ -134,14 +139,11 @@ interface SessionStreamResult {
connected: boolean; connected: boolean;
error: string | null; error: string | null;
isStreaming: boolean; isStreaming: boolean;
/** Listeners for pending change frames */
onPendingChange: (cb: (change: PendingChange) => void) => () => void;
} }
export function useSessionStream(sessionId: string | undefined): SessionStreamResult { export function useSessionStream(sessionId: string | undefined): SessionStreamResult {
const [state, setState] = useState<State>({ messages: [], connected: false, error: null }); const [state, setState] = useState<State>({ messages: [], connected: false, error: null });
const wsRef = useRef<WebSocket | null>(null); const wsRef = useRef<WebSocket | null>(null);
const pendingListenersRef = useRef<Set<(change: PendingChange) => void>>(new Set());
useEffect(() => { useEffect(() => {
if (!sessionId) return; if (!sessionId) return;
@@ -172,13 +174,6 @@ export function useSessionStream(sessionId: string | undefined): SessionStreamRe
return; return;
} }
// Notify pending change listeners
if (frame.type === 'pending_change_added' || frame.type === 'pending_change_updated') {
for (const cb of pendingListenersRef.current) {
cb(frame.change);
}
}
setState((s) => applyFrame(s, frame)); setState((s) => applyFrame(s, frame));
}; };
@@ -213,18 +208,10 @@ export function useSessionStream(sessionId: string | undefined): SessionStreamRe
const isStreaming = state.messages.some((m) => m.status === 'streaming'); const isStreaming = state.messages.some((m) => m.status === 'streaming');
const onPendingChange = useCallback((cb: (change: PendingChange) => void) => {
pendingListenersRef.current.add(cb);
return () => {
pendingListenersRef.current.delete(cb);
};
}, []);
return { return {
messages: state.messages, messages: state.messages,
connected: state.connected, connected: state.connected,
error: state.error, error: state.error,
isStreaming, isStreaming,
onPendingChange,
}; };
} }

View File

@@ -14,8 +14,7 @@ export function Session() {
const [chat, setChat] = useState<Chat | null>(null); const [chat, setChat] = useState<Chat | null>(null);
const [loading, setLoading] = useState(true); const [loading, setLoading] = useState(true);
const { messages, connected, isStreaming, onPendingChange } = const { messages, connected, isStreaming } = useSessionStream(sessionId);
useSessionStream(sessionId);
// Get or create a chat for this session // Get or create a chat for this session
useEffect(() => { useEffect(() => {
@@ -78,9 +77,7 @@ export function Session() {
connected={connected} connected={connected}
/> />
} }
diffPane={ diffPane={<DiffPane sessionId={sessionId} />}
<DiffPane sessionId={sessionId} onPendingChange={onPendingChange} />
}
/> />
); );
} }

48
apps/server/CLAUDE.md Normal file
View File

@@ -0,0 +1,48 @@
# apps/server — BooChat backend (deep reference)
> Per-app engineering notes for `apps/server/src/`. Cross-cutting commands, database, environment, workflow, and cross-app contracts (WS-frame / provider-type parity, sentinels) live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/server/`.
## Stack
- **Fastify** with `@fastify/websocket` and `@fastify/static` (serves the built frontend).
- **postgres** (porsager/postgres) with tagged-template SQL — no ORM. Schema in `schema.sql`, applied on startup. LSP may false-positive on `sql<Type[]>\`...\`` generics; CLI `tsc` / `pnpm build` is authoritative.
- **Zod** for request validation and config parsing.
## Key services
- **`services/inference/`** — Public surface re-exported via `inference/index.ts`; callers import from `./services/inference/index.js` explicitly (NodeNext doesn't honor directory-index resolution). Layout: `turn.ts` (runAssistantTurn/runInference/createInferenceRunner; exports `InferenceFrame`, `InferenceContext`, `TurnArgs`, `StreamResult`, `MAX_STEPS`); `stream-phase.ts` (streamCompletion AI SDK adapter + executeStreamPhase); `provider.ts` (`upstreamModel(baseURL, modelId)` wrapping `createOpenAICompatible` against llama-swap); `tool-phase.ts` (executeToolPhase → `ToolPhaseResult`; the turn loop lives in turn.ts, not recursion); `sentinel-summaries.ts` (cap-hit/doom-loop/step-cap summaries + inserters); `error-handler.ts` (handleAbortOrError, finalizeCompletion); `payload.ts` (buildMessagesPayload, loadContext, maybeFlagForCompaction, `OpenAiMessage`); `sentinels.ts` (`detectDoomLoop`, `DOOM_LOOP_THRESHOLD`); `budget.ts` (resolveToolBudget); `xml-parser.ts` (qwen3.6 XML tool-call fallback — KEEP, AI SDK doesn't handle inline-XML tool calls); `parts.ts` (`partsFromAssistantMessage`/`partsFromToolMessage`/`insertParts` — parts are the sole source of truth); `prune.ts` (two-tier compaction; `selectPruneTargets` is the pure helper); `types.ts` (`StreamPhaseState`, `DB_FLUSH_INTERVAL_MS`). **`TurnArgs`** is the per-turn state envelope, reset in `runInference` at the user-message boundary. Outer loop: `while (stepNumber < effectiveCap)`, `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS=200)`. Per-agent `steps:` in AGENTS.md frontmatter; `steps: 0` = text-only. Step-cap hit writes a `cap_hit` sentinel (`CapHitSentinel.tsx` renders it).
- **AI SDK v6 streamCompletion adapter** (`services/inference/stream-phase.ts`). `streamText` is the underlying call; the BooCode layer (executeStreamPhase, finalize, dual-write) is shape-preserved via an adapter. Five gotchas the LSP/tests won't catch:
- **Abort signals are swallowed.** `streamText`'s `fullStream` exits cleanly when `abortSignal` fires — no throw. Post-iteration `if (signal?.aborted) throw <AbortError>` is required, else the row finalizes `complete` instead of `cancelled`. Don't refactor away the pinning comment.
- **Usage lands only at stream end** via `await result.usage` (v6 `inputTokens`/`outputTokens` → mapped to `promptTokens`/`completionTokens`). No mid-stream tok/s; ChatThroughput shows one value at stream end.
- **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop — only `description` + `inputSchema: jsonSchema(parameters)`.
- **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `provider.ts`. The adapter defaults it false → no `stream_options.include_usage` → llama-swap emits no usage block → `result.usage` resolves `undefined` (NULL token counts). Don't remove during refactor.
- **Tool-call-only turns may emit a leading `\n` text-delta.** `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check, else whitespace-only content renders an empty bubble + ActionRow between tool calls. `buildMessagesPayload` also skips `status='failed'` and complete-but-empty assistant rows (avoids "Cannot have 2 or more assistant messages at the end of the list" upstream rejection after cap-hit + Continue).
- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart`; BooCode's OpenAI-shape history lacks it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs wrapped as `{ type: 'json' | 'text', value }` (v6 `ToolResultOutput`). Reasoning emits a `ReasoningPart` first in the content array.
- **`experimental_repairToolCall`** wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through: logs the bad call, returns it unmodified; `executeToolPhase`'s zod-reject path routes it back to the model next turn.
- **`chat_status` frame** (via `broker.publishUser`) — `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'`. Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders beside `StatusDot` only when streaming/tool_running, fed by 500ms-throttled `'usage'` frames (`completion_tokens` + `ctx_used` + `ctx_max`). `POST /api/chats/:id/discard_stale` marks a stuck-streaming row `failed` when the frontend's 60s no-token timer gives up.
- **Stale-streaming sweeps** (`apps/server/src/index.ts`): a boot-time pass after `applySchema()` and a periodic 60s `setInterval` both flip `messages.status='streaming'` older than 5 min to `failed` (publishing `chat_status='idle'`); the interval also runs `cleanupTruncations` (TTL + orphan reap of tmpfs truncation files). `onClose` hook clears the timer. Recovers from a container restart mid-stream.
- **`services/broker.ts`** — In-memory pub/sub, two channel types: per-session (message streaming) and per-user (sidebar). No persistence; clients reconnect on restart. Every WS publish goes through `broker.publishFrame(sessionId, frame)` / `publishUserFrame(user, frame)` — both Zod-validate against `WsFrameSchema` (`types/ws-frames.ts`) and fail-closed (log + drop). Schema single-sourced in `@boocode/contracts` (`packages/contracts/src/ws-frames.ts`); the package's `ws-frames.test.ts` enforces schema correctness. Don't add raw `broker.publish()`/`publishUser()` calls.
- **`services/tools.ts`** — Tool registry (`ALL_TOOLS`, `READ_ONLY_TOOL_NAMES`, `TOOLS_BY_NAME`). Filesystem tools (view_file/list_dir/grep/find_files) pass three guards: `path_guard.ts` (workspace scope), `secret_guard.ts` (filename deny list), `url_guard.ts` (SSRF/private-IP block for web_fetch). Web tools (`web_search`, `web_fetch`) are opt-in per chat via `session.web_search_enabled` (falls back to `project.default_web_search_enabled`) and filtered out of the LLM tool schema when false. Truncation: when a tool slice cuts content, `services/truncate.ts` stashes the full text on tmpfs (`BOOCODE_TRUNCATION_DIR`, default `/tmp/boocode-truncations`, 0o700) keyed by `tr_<12 base32>`; `view_truncated_output(id)` retrieves it. 5MB cap, 7-day TTL, reaped by the sweeper. Container restart loses retrieval — acceptable.
- **`services/compaction.ts`** + **`services/model-context.ts`** — Anchored rolling summary (single `summary=true` assistant row per chat, supersedes itself each compaction). Triggered when `chats.needs_compaction` is set after a turn exceeds `usable(ctx_max) = floor(0.85 × ctx_max)`. **`ctx_max` comes from `model-context.getModelContext()` fetching `${LLAMA_SWAP_URL}/upstream/<model>/props`** — NOT from `parsed.timings.n_ctx`. First inferences after boot may have `ctx_max=NULL` if llama-swap hasn't loaded the model; negative cache TTL 60s, recovers next turn. `buildHeadPayload` embeds `reasoning_parts` as a `<reasoning>...</reasoning>` prose prefix on assistant `content` (OpenAI wire shape has no structured reasoning field); standalone tag when content is empty. `buildHeadPayload` + `OpenAiMessage` exported for tests — keep them exported.
- **`services/system-prompt.ts`** — `buildSystemPrompt` is the string shim; `buildSystemPromptWithFingerprint` is the canonical impl returning `{prompt, fingerprint, drift}`. SHA-256 of the assembled prefix is logged per `buildMessagesPayload` (`prefix-fingerprint`, info); a `Map<sessionId, lastHash>` fires `prefix-drift` (warn) on change with a `changed_inputs` diff. The prefix is byte-stable in steady-state, so prefix caching is left to the input-layer mtime caches (BOOCHAT.md + AGENTS.md global/per-project in `agents.ts:safeStat`).
- **`services/inference/budget.ts`** — tool-call budgets: `BUDGET_READ_ONLY = 30`, `BUDGET_NON_READ_ONLY = 10` (forward-looking; no write tools yet), `BUDGET_NO_AGENT = 30` (every `ALL_TOOLS` tool is read-only today, so no-agent shares the read-only cap). Per-agent `max_tool_calls` from AGENTS.md overrides.
- **`messages_with_parts` view** (`schema.sql`). Read sites needing `tool_calls` / `tool_results` / `reasoning_parts` SELECT from this view, NOT `messages` — the legacy `messages.tool_calls`/`tool_results` JSON columns were dropped; the view reads parts-only subselects. Writes target `message_parts` via `insertParts` (or `partsFromAssistantMessage`/`partsFromToolMessage`). The `Message` wire type still carries `tool_calls?`/`tool_results?` because the view synthesizes them. Shapes: `tool_calls jsonb[]`, `tool_results jsonb` (single object), `reasoning_parts jsonb[]` of `{text}`. To UPDATE a message and return its full shape, do a two-step UPDATE returning `id` then SELECT from the view — RETURNING off bare `messages` no longer carries the tool fields. **`messages.model`** (attribution chip) stamps the model per assistant turn — at `finalizeCompletion` (BooChat + native coder) + the dispatcher's assistant-row INSERT (external coder); read via the view + the `message_complete` frame, rendered by `shortenModelName`.
- **`services/file_ops.ts`** — Shared file operation implementations used by both inference tools and HTTP routes.
- **`services/auto_name.ts`** — Non-streaming LLM call to generate 4-word session titles after the first assistant reply.
- **Provider picker dispatch**: when `provider !== 'boocode'`, the message route creates a `tasks` row (with `session_id` set) instead of calling `inference.enqueue`. The dispatcher (in `apps/coder`) picks it up and dispatches via ACP or PTY using the agent's `install_path`.
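
The forward scan mentioned in the **AI SDK ModelMessage conversion** bullet can be sketched roughly as below. This is an illustration under assumptions, not the `toModelMessages` implementation: `HistoryMessage` is a hypothetical, trimmed-down stand-in for the OpenAI-shape history rows, and the `'unknown'` fallback is assumed.

```typescript
// Hypothetical, trimmed-down message shape for illustration only; the real
// OpenAiMessage type in payload.ts may differ.
interface HistoryMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string;
  tool_calls?: { id: string; function: { name: string } }[];
  tool_call_id?: string;
}

/** Forward-scan the history once, recording which tool each call id belongs to. */
function buildToolNameMap(history: HistoryMessage[]): Map<string, string> {
  const names = new Map<string, string>();
  for (const msg of history) {
    if (msg.role !== 'assistant' || !msg.tool_calls) continue;
    for (const call of msg.tool_calls) names.set(call.id, call.function.name);
  }
  return names;
}

/** Resolve the tool name for a tool message; 'unknown' is an assumed fallback. */
function toolNameFor(msg: HistoryMessage, names: Map<string, string>): string {
  const id = msg.tool_call_id;
  return (id !== undefined ? names.get(id) : undefined) ?? 'unknown';
}

// Usage: the tool row carries only tool_call_id; the name comes from the map.
const assistantMsg: HistoryMessage = {
  role: 'assistant',
  content: '',
  tool_calls: [{ id: 'call_1', function: { name: 'grep' } }],
};
const toolMsg: HistoryMessage = { role: 'tool', content: '3 matches', tool_call_id: 'call_1' };
const toolNames = buildToolNameMap([assistantMsg, toolMsg]);
const resolvedName = toolNameFor(toolMsg, toolNames);
```

Because assistant `tool_calls` always precede their tool results in the history, a single pass before conversion is enough to name every result.
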
Route registration: all routes registered in `index.ts` via `register*Routes(app, sql, ...)`. Routes live in `routes/*.ts`.
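The budget rules in the `services/inference/budget.ts` note above can be sketched as a small resolver. This is a minimal sketch, not the actual module: the three constants are quoted from the note, while `AgentLike`, `resolveToolBudget`, and the rule that an agent counts as read-only when every one of its tools is in a read-only set are assumptions made for illustration.

```typescript
// Constants quoted from the budget.ts note.
const BUDGET_READ_ONLY = 30;
const BUDGET_NON_READ_ONLY = 10;
const BUDGET_NO_AGENT = 30;

// Hypothetical agent shape; the real Agent type is not shown in this diff.
interface AgentLike {
  tools: string[];
  max_tool_calls: number | null;
}

// Hypothetical resolver: per-agent max_tool_calls wins, otherwise the cap
// depends on whether the agent only has read-only tools (assumed rule).
function resolveToolBudget(
  agent: AgentLike | null,
  readOnlyToolNames: ReadonlySet<string>,
): number {
  if (agent === null) return BUDGET_NO_AGENT;
  if (agent.max_tool_calls !== null) return agent.max_tool_calls;
  const allReadOnly = agent.tools.every((t) => readOnlyToolNames.has(t));
  return allReadOnly ? BUDGET_READ_ONLY : BUDGET_NON_READ_ONLY;
}
```

Passing the read-only set in as an argument keeps the sketch in line with the convention that tool-name lists derive from `ALL_TOOLS` rather than being hardcoded.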
## Server conventions
- **New tools** live in their own `services/<name>.ts` (see `web_search.ts`, `web_fetch.ts`) — a pure `executeFoo(input, ...deps)` for direct test access plus a `ToolDef` wrapper that `loadConfig()`s its real deps. Register the ToolDef in `tools.ts` `ALL_TOOLS` (and `READ_ONLY_TOOL_NAMES` if applicable). Inject `fetcher: typeof fetch = fetch` rather than `vi.spyOn(globalThis, 'fetch')`.
- **DB/session-aware tools** take an optional 4th `ToolExecCtx { sql, sessionId }` arg on `ToolDef.execute`, plumbed through `executeToolPhase` → `executeToolCall` → `execute`. It is optional so that filesystem tools and the `apps/coder` `ALL_TOOLS` consumer stay compatible; filesystem tools ignore it. `read_tab_by_number` is the reference.
- **ReadableStream test stubs** use `pull()` (not `start()`) so chunks are produced lazily — `start()` enqueues everything and closes before the consumer reads, so a later `reader.cancel()` finds the stream closed and the `cancel()` callback never fires. Provide MORE chunks than the test consumes so the source stays 'readable' when cancel runs.
- Tool-name whitelists must derive from `ALL_TOOLS` in `services/tools.ts`, never hardcoded (this drift class hit `services/agents.ts` `ALL_TOOL_NAMES` before).
- Agent registry lives at `data/AGENTS.md` (global, bind-mounted at `/data/AGENTS.md`). No per-project `AGENTS.md` in this repo (removed to eliminate two-files-must-stay-in-sync drift); the `getAgentsForProject` per-project override mechanism remains for *other* projects.
- `data/AGENTS.md` is PARSED (`agents.ts` `splitSections`/`parseAgentSection`): each `## <Name>` is one agent and must be followed by a `---` frontmatter fence or the block throws; content before the first `## ` is discarded. Do NOT add free-form `## ` rule sections — they break the registry. Cross-cutting agent rules go in CLAUDE.md or a parser-ignored preamble.
- MCP stdio transport uses newline-delimited JSON (NDJSON), NOT LSP-style `Content-Length` headers. `codecontext/shim.go` is the reference (per the MCP spec, modelcontextprotocol.io/specification/server/transports).
- **`payload.ts:loadContext` SELECT** must include every `Session` field downstream code reads. The tool phase reads `session.allowed_read_paths`; if the SELECT omits it, cross-repo read grants silently fail. `sql<Session[]>` doesn't enforce column coverage, so the type doesn't catch it.
- **Sidecar routing** (`services/inference/provider.ts`): `upstreamModel(config, modelId, agent)` routes to `LLAMA_SIDECAR_URL` when the agent has `llama_extra_args`, else `LLAMA_SWAP_URL`. `resolveRoute(agent)` returns `{route, flags}`. Sidecar provider created fresh per call (not cached) because `X-Agent-Flags` varies per agent. Boot-time guard in `index.ts` refuses to start if any agent has `llama_extra_args` but `LLAMA_SIDECAR_URL` is unset.
- **Secret guard safe patterns** (`services/secret_guard.ts`): `.env.example`, `.env.sample`, `.env.template`, `.env.defaults` are allowlisted via `SAFE_PATTERNS`. Do NOT add `.env.production`/`.env.development`/`.env.test` — those can hold real secrets.
- **llama-sidecar** (`/opt/forks/llama-sidecar/`): Go daemon for a per-agent llama-server process pool (routed to via "Sidecar routing" above). Cross-compile: `GOOS=windows GOARCH=amd64 /snap/go/current/bin/go build -o bin/llama-sidecar.exe ./cmd/llama-sidecar`. Gitea: `indifferentketchup/llama-sidecar`. Windows child-process gotchas: `context.Background()` for child lifetime (not request ctx), `os.Open(os.DevNull)` for stdin, `os.Pipe()` for stdout with a drain goroutine, `DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP` flags. SSH to sam-desktop: `ssh samki@100.101.41.16`; use `schtasks` for persistent spawning (SSH `start /B` doesn't survive session close).
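The NDJSON framing mentioned in the MCP stdio bullet above can be illustrated as follows. This is a hedged sketch of newline-delimited JSON in general, not code from `codecontext/shim.go`; `encodeNdjson` and `NdjsonDecoder` are hypothetical names.

```typescript
// Encode one message as a single line. JSON.stringify escapes newlines inside
// strings, so the trailing '\n' is the only raw newline in the frame.
function encodeNdjson(message: object): string {
  return JSON.stringify(message) + '\n';
}

// Incremental decoder: buffers partial chunks and emits one parsed value per
// complete line. There is no Content-Length header to read.
class NdjsonDecoder {
  private buffer = '';

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    // The last element is either '' (chunk ended on a newline) or a partial line.
    this.buffer = lines.pop() ?? '';
    const out: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length > 0) out.push(JSON.parse(trimmed));
    }
    return out;
  }
}
```

A reader on the other side of the pipe would feed each stdout chunk to `push` and handle whatever complete messages come back; a message split across two chunks is simply held until its newline arrives.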

@@ -53,10 +53,6 @@
"types": "./dist/types/api.d.ts",
"default": "./dist/types/api.js"
},
"./ws-frames": {
"types": "./dist/types/ws-frames.d.ts",
"default": "./dist/types/ws-frames.js"
},
"./db": {
"types": "./dist/db.d.ts",
"default": "./dist/db.js"
@@ -81,6 +77,7 @@
"test": "vitest run"
},
"dependencies": {
"@boocode/contracts": "workspace:*",
"@ai-sdk/openai-compatible": "^2.0.47",
"@fastify/static": "^7.0.4",
"@fastify/websocket": "^10.0.1",

@@ -140,7 +140,7 @@ async function main() {
publish: (sessionId, frame) => {
// v1.13.11-b: route through the typed publishFrame so the broker's
// Zod gate validates every inference frame before delivery.
-broker.publishFrame(sessionId, frame as unknown as import('./types/ws-frames.js').WsFrame);
+broker.publishFrame(sessionId, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
},
// v1.11: broker handle for compaction.process to publish 'compacted'
// frames on the per-session channel. Inference's regular publish path
@@ -149,7 +149,7 @@ async function main() {
broker,
},
(user, frame) => {
-broker.publishUserFrame(user, frame as unknown as import('./types/ws-frames.js').WsFrame);
+broker.publishUserFrame(user, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
}
);
registerMessageRoutes(app, sql, config, broker, {
@@ -194,7 +194,7 @@ async function main() {
});
},
publishSessionFrame: (sessionId, frame) => {
-broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
+broker.publishFrame(sessionId, frame as import('@boocode/contracts/ws-frames').WsFrame);
},
});
registerArtifactRoutes(app, sql);
@@ -222,7 +222,7 @@ async function main() {
});
},
publishSessionFrame: (sessionId, frame) => {
-broker.publishFrame(sessionId, frame as import('./types/ws-frames.js').WsFrame);
+broker.publishFrame(sessionId, frame as import('@boocode/contracts/ws-frames').WsFrame);
},
});
registerWebSocket(app, sql, broker);

@@ -441,7 +441,7 @@ export function registerChatRoutes(
const rows = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
-summary, tail_start_id, compacted_at
+summary, tail_start_id, compacted_at, model
FROM messages_with_parts
WHERE chat_id = ${req.params.id}
ORDER BY created_at ASC, id ASC

@@ -118,7 +118,7 @@ export function registerMessageRoutes(
const rows = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
-summary, tail_start_id, compacted_at
+summary, tail_start_id, compacted_at, model
FROM messages_with_parts
WHERE session_id = ${req.params.id}
ORDER BY created_at ASC, id ASC

@@ -22,8 +22,9 @@ export async function setSetting(
`;
}
-// themes-v1: whitelist of the 18 preset theme ids. Kept in sync with
+// themes-v1: whitelist of the preset theme ids. Kept in sync with
// docs/themes_v1.md §1 and apps/web/src/lib/theme.ts THEMES.
// (+ 'ember' — the BooCode 2.0 signature, now the default.)
const THEME_IDS = [
'obsidian',
'gunmetal',
@@ -43,6 +44,7 @@ const THEME_IDS = [
'chalk',
'cobalt',
'midnight-sapphire',
'ember',
] as const;
const THEME_MODES = ['dark', 'light', 'system'] as const;

@@ -27,7 +27,7 @@ export function registerWebSocket(
const messages = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
-summary, tail_start_id, compacted_at
+summary, tail_start_id, compacted_at, model
FROM messages_with_parts
WHERE session_id = ${sessionId}
ORDER BY created_at ASC, id ASC

@@ -107,6 +107,11 @@ END $$;
-- a single jsonb object {tool_call_id, output, truncated, error?}.
-- reasoning_parts is consumed by the inference history fetch (payload.ts)
-- for v1.13.1-C reasoning round-tripping. Not surfaced in external APIs.
-- model-attribution: which model produced an assistant message (NULL for
-- user/system rows and pre-existing messages). Stamped at finalize (BooChat /
-- native coder) and at assistant-row creation (external coder dispatcher).
ALTER TABLE messages ADD COLUMN IF NOT EXISTS model TEXT;
CREATE OR REPLACE VIEW messages_with_parts AS
SELECT
m.id, m.session_id, m.chat_id, m.role, m.content, m.kind, m.status,
@@ -122,7 +127,10 @@ SELECT
ORDER BY p.sequence LIMIT 1) AS tool_results,
(SELECT jsonb_agg(p.payload ORDER BY p.sequence)
FROM message_parts p
-WHERE p.message_id = m.id AND p.kind = 'reasoning' AND p.hidden_at IS NULL) AS reasoning_parts
+WHERE p.message_id = m.id AND p.kind = 'reasoning' AND p.hidden_at IS NULL) AS reasoning_parts,
-- NEW columns MUST be appended at the end: CREATE OR REPLACE VIEW can't
-- reorder/rename existing columns (42P16). m.model added last.
m.model
FROM messages m;
-- v1.13.20: drop legacy tool_calls/tool_results columns. Reads have routed

@@ -1,4 +1,4 @@
-import { describe, it, expect } from 'vitest';
+import { describe, it, expect, vi, afterEach } from 'vitest';
import { isAgentRegistryMarkdown, parseAgentsMd } from '../agents.js';
describe('isAgentRegistryMarkdown', () => {
@@ -31,3 +31,87 @@ Start here
expect(r.errors.length).toBeGreaterThan(0);
});
});
// v2.6 sampling-streamjson-tokens (#11): per-agent llama.cpp sampler extensions.
describe('parseAgentsMd: v2.6 sampling knobs', () => {
afterEach(() => {
vi.restoreAllMocks();
});
const withFrontmatter = (lines: string) => `# Agents
## Sampler
---
temperature: 0.6
${lines}
tools: [view_file]
description: test
---
You sample.
`;
it('parses top_n_sigma and the dry_* family from frontmatter', () => {
const md = withFrontmatter(
[
'top_n_sigma: 1.5',
'dry_multiplier: 0.8',
'dry_base: 1.75',
'dry_allowed_length: 2',
'dry_penalty_last_n: -1',
].join('\n'),
);
const { agents, errors } = parseAgentsMd(md);
expect(errors).toHaveLength(0);
expect(agents).toHaveLength(1);
const a = agents[0]!;
expect(a.top_n_sigma).toBe(1.5);
expect(a.dry_multiplier).toBe(0.8);
expect(a.dry_base).toBe(1.75);
expect(a.dry_allowed_length).toBe(2);
expect(a.dry_penalty_last_n).toBe(-1);
});
it('defaults the new sampler fields to null when omitted', () => {
const { agents } = parseAgentsMd(withFrontmatter('top_p: 0.95'));
const a = agents[0]!;
expect(a.top_n_sigma).toBeNull();
expect(a.dry_multiplier).toBeNull();
expect(a.dry_base).toBeNull();
expect(a.dry_allowed_length).toBeNull();
expect(a.dry_penalty_last_n).toBeNull();
});
it('warns (does not error) on out-of-range top_n_sigma / dry_* values', () => {
const warn = vi.spyOn(console, 'warn').mockImplementation(() => {});
const md = withFrontmatter(
[
'top_n_sigma: -1',
'dry_multiplier: -0.5',
'dry_base: -2',
'dry_allowed_length: -3',
'dry_penalty_last_n: -5',
].join('\n'),
);
const { agents, errors } = parseAgentsMd(md);
expect(errors).toHaveLength(0);
expect(agents).toHaveLength(1);
// Mirrors top_k/min_p: out-of-range still stored, with a warning.
expect(warn).toHaveBeenCalled();
const warnings = warn.mock.calls.map((c) => String(c[0])).join('\n');
expect(warnings).toContain('top_n_sigma');
expect(warnings).toContain('dry_multiplier');
expect(warnings).toContain('dry_base');
expect(warnings).toContain('dry_allowed_length');
expect(warnings).toContain('dry_penalty_last_n');
});
it('errors on non-numeric / non-integer sampler values', () => {
const md = withFrontmatter(
['top_n_sigma: high', 'dry_allowed_length: 2.5'].join('\n'),
);
const { errors } = parseAgentsMd(md);
const joined = errors.map((e) => e.reason).join('\n');
expect(joined).toContain('top_n_sigma must be a number');
expect(joined).toContain('dry_allowed_length must be an integer');
});
});
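The behavior these tests pin down (store the value, warn when it is out of range, hard-fail on non-numeric or non-integer input) could be expressed as one shared helper. The following is a sketch under assumptions: `parseNumericKnob`, `KnobResult`, and the `min` option are hypothetical and do not appear in `agents.ts`, which handles each key in its own branch; the warning text here is illustrative only.

```typescript
// Hypothetical result shape: exactly one of value or error is set;
// warning may accompany a stored value.
interface KnobResult {
  value?: number;
  error?: string;
  warning?: string;
}

// Hypothetical helper mirroring the per-key branches: parse, then either
// return an error (block fails) or store the value with an optional warning.
function parseNumericKnob(
  key: string,
  valueRaw: string,
  opts: { integer?: boolean; min?: number } = {},
): KnobResult {
  const min = opts.min ?? 0; // assumed lower bound; real bounds are per key
  const n = Number(valueRaw);
  const ok = opts.integer ? Number.isInteger(n) : Number.isFinite(n);
  if (!ok) {
    const kind = opts.integer ? 'an integer' : 'a number';
    return { error: `${key} must be ${kind} (got "${valueRaw}")` };
  }
  const result: KnobResult = { value: n };
  if (n < min) {
    result.warning = `agents: ${key} ${n} out of range (>= ${min})`;
  }
  return result;
}
```

One caveat of relying on `Number(...)`: an empty string parses as `0`, so a caller that wants to reject blank values would need a separate check.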

@@ -7,6 +7,8 @@ import {
select,
buildPrompt,
buildHeadPayload,
deriveFilesRead,
buildFilesReadContext,
type CompactionMessage,
} from '../compaction.js';
import { SUMMARY_TEMPLATE } from '../compaction-prompt.js';
@@ -321,3 +323,105 @@ describe('buildHeadPayload reasoning render', () => {
expect(out[1]!.content).not.toContain('<reasoning>');
});
});
// ---- buildHeadPayload sentinel stripping (#12) -------------------------------
describe('buildHeadPayload strips all UI sentinels', () => {
it('drops cap_hit, doom_loop, and mistake_recovery system rows', () => {
const out = buildHeadPayload([
mkMsg('user', 'do the thing'),
mkMsg('system', 'budget reached', { metadata: { kind: 'cap_hit' } }),
mkMsg('system', 'looping', { metadata: { kind: 'doom_loop' } }),
mkMsg('system', 'repeated errors', { metadata: { kind: 'mistake_recovery' } }),
mkMsg('assistant', 'answer'),
]);
// Only the user + assistant rows survive; all three sentinels stripped.
expect(out).toHaveLength(2);
expect(out[0]!.role).toBe('user');
expect(out[1]!.role).toBe('assistant');
});
it('keeps a non-sentinel system row (e.g. compact bridge) untouched', () => {
const out = buildHeadPayload([
mkMsg('system', 'legacy compact', { kind: 'compact', metadata: null }),
mkMsg('user', 'q'),
]);
expect(out[0]!.role).toBe('system');
expect(out[0]!.content).toBe('legacy compact');
});
});
// ---- file-provenance ledger (#12, Part B) -----------------------------------
describe('deriveFilesRead', () => {
it('returns [] when the head has no read-tool calls', () => {
expect(deriveFilesRead([mkMsg('user', 'hi'), mkMsg('assistant', 'hello')])).toEqual([]);
});
it('extracts the path arg from view_file / list_dir / grep / find_files', () => {
const head = [
mkMsg('assistant', '', {
tool_calls: [
{ id: 'c1', name: 'view_file', args: { path: 'src/index.ts' } },
{ id: 'c2', name: 'list_dir', args: { path: 'src' } },
{ id: 'c3', name: 'grep', args: { pattern: 'TODO', path: 'apps' } },
{ id: 'c4', name: 'find_files', args: { pattern: '**/*.ts', path: 'lib' } },
],
}),
];
expect(deriveFilesRead(head)).toEqual(['apps', 'lib', 'src', 'src/index.ts']);
});
it('dedupes and sorts paths across multiple assistant turns', () => {
const head = [
mkMsg('assistant', '', { tool_calls: [{ id: 'c1', name: 'view_file', args: { path: 'b.ts' } }] }),
mkMsg('assistant', '', { tool_calls: [{ id: 'c2', name: 'view_file', args: { path: 'a.ts' } }] }),
mkMsg('assistant', '', { tool_calls: [{ id: 'c3', name: 'view_file', args: { path: 'b.ts' } }] }),
];
expect(deriveFilesRead(head)).toEqual(['a.ts', 'b.ts']);
});
it('ignores non-read tools and grep calls without a path arg', () => {
const head = [
mkMsg('assistant', '', {
tool_calls: [
{ id: 'c1', name: 'web_search', args: { query: 'x' } },
{ id: 'c2', name: 'grep', args: { pattern: 'foo' } }, // no path → root, skipped
{ id: 'c3', name: 'view_file', args: { path: 'kept.ts' } },
],
}),
];
expect(deriveFilesRead(head)).toEqual(['kept.ts']);
});
it('ignores read-tool calls on non-assistant rows', () => {
const head = [
mkMsg('user', '', { tool_calls: [{ id: 'c1', name: 'view_file', args: { path: 'nope.ts' } }] }),
];
expect(deriveFilesRead(head)).toEqual([]);
});
});
describe('buildFilesReadContext', () => {
it('returns null when nothing was read (no empty section injected)', () => {
expect(buildFilesReadContext([mkMsg('user', 'hi')])).toBeNull();
});
it('formats a ## Files Read block with sorted bullet paths', () => {
const head = [
mkMsg('assistant', '', {
tool_calls: [
{ id: 'c1', name: 'view_file', args: { path: 'z.ts' } },
{ id: 'c2', name: 'view_file', args: { path: 'a.ts' } },
],
}),
];
expect(buildFilesReadContext(head)).toBe('## Files Read\n- a.ts\n- z.ts');
});
});
describe('SUMMARY_TEMPLATE includes the Files Read section (#12)', () => {
it('declares a ## Files Read section the model must maintain', () => {
expect(SUMMARY_TEMPLATE).toContain('## Files Read');
});
});
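For orientation, here is one way the file-provenance helpers exercised above might be written. It is a sketch inferred only from the test expectations, not the code in `compaction.ts`: the `HeadMessage` and `ToolCall` shapes are simplified stand-ins for `CompactionMessage`, and `READ_TOOLS` is an assumed constant.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// Simplified stand-in for CompactionMessage (assumption).
interface HeadMessage {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  tool_calls?: ToolCall[];
}

// Tools whose `path` argument counts as a read, per the test names.
const READ_TOOLS = new Set<string>(['view_file', 'list_dir', 'grep', 'find_files']);

// Collect unique, sorted paths from read-tool calls on assistant rows only.
function deriveFilesRead(head: HeadMessage[]): string[] {
  const paths = new Set<string>();
  for (const msg of head) {
    if (msg.role !== 'assistant') continue;
    for (const call of msg.tool_calls ?? []) {
      if (!READ_TOOLS.has(call.name)) continue;
      const p = call.args['path'];
      // Calls without a string path (e.g. grep over the root) are skipped.
      if (typeof p === 'string' && p.length > 0) paths.add(p);
    }
  }
  return Array.from(paths).sort();
}

// Render the ledger as a markdown section, or null when nothing was read.
function buildFilesReadContext(head: HeadMessage[]): string | null {
  const files = deriveFilesRead(head);
  if (files.length === 0) return null;
  return ['## Files Read', ...files.map((f) => `- ${f}`)].join('\n');
}
```

Returning `null` rather than an empty section matches the test that no empty `## Files Read` block is injected.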

@@ -0,0 +1,93 @@
/**
* Unit tests for `{env:VAR}` substitution in the MCP config loader.
* Pure — no live MCP server. Verifies secrets resolve from process.env
* (so real keys live in `.env`, not the gitignored config file).
*/
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { substituteEnvVars } from '../mcp-config.js';
// Minimal FastifyBaseLogger stub — only .warn is exercised here.
function fakeLog() {
const warnings: string[] = [];
const log = {
warn: (msg: unknown) => {
warnings.push(typeof msg === 'string' ? msg : JSON.stringify(msg));
},
};
return { log: log as never, warnings };
}
describe('substituteEnvVars', () => {
const SAVED = process.env.MCP_TEST_SECRET;
beforeEach(() => {
process.env.MCP_TEST_SECRET = 'resolved-value';
});
afterEach(() => {
if (SAVED === undefined) delete process.env.MCP_TEST_SECRET;
else process.env.MCP_TEST_SECRET = SAVED;
delete process.env.MCP_TEST_MISSING;
});
it('replaces a {env:VAR} reference in a string value', () => {
const { log } = fakeLog();
expect(substituteEnvVars('{env:MCP_TEST_SECRET}', log)).toBe('resolved-value');
});
it('substitutes inside nested objects and arrays', () => {
const { log } = fakeLog();
const out = substituteEnvVars(
{
headers: { CONTEXT7_API_KEY: '{env:MCP_TEST_SECRET}' },
args: ['--token', '{env:MCP_TEST_SECRET}'],
},
log,
);
expect(out).toEqual({
headers: { CONTEXT7_API_KEY: 'resolved-value' },
args: ['--token', 'resolved-value'],
});
});
it('leaves object keys untouched, only transforms values', () => {
const { log } = fakeLog();
const out = substituteEnvVars({ '{env:MCP_TEST_SECRET}': 'literal' }, log) as Record<string, string>;
expect(Object.keys(out)).toEqual(['{env:MCP_TEST_SECRET}']);
});
it('resolves an unset var to empty string and warns', () => {
const { log, warnings } = fakeLog();
expect(substituteEnvVars('{env:MCP_TEST_MISSING}', log)).toBe('');
expect(warnings.some((w) => w.includes('MCP_TEST_MISSING'))).toBe(true);
});
it('passes non-string scalars through unchanged', () => {
const { log } = fakeLog();
expect(substituteEnvVars(true, log)).toBe(true);
expect(substituteEnvVars(42, log)).toBe(42);
expect(substituteEnvVars(null, log)).toBe(null);
});
it('leaves strings without a reference unchanged', () => {
const { log } = fakeLog();
expect(substituteEnvVars('https://mcp.context7.com/mcp', log)).toBe('https://mcp.context7.com/mcp');
});
it('resolves multiple references in one string (global flag)', () => {
const { log } = fakeLog();
expect(substituteEnvVars('{env:MCP_TEST_SECRET}/{env:MCP_TEST_SECRET}', log)).toBe(
'resolved-value/resolved-value',
);
});
it('passes an empty string through unchanged', () => {
const { log } = fakeLog();
expect(substituteEnvVars('', log)).toBe('');
});
it('collects unset var names into the optional collector set', () => {
const { log } = fakeLog();
const unset = new Set<string>();
substituteEnvVars({ url: '{env:MCP_TEST_MISSING}', headers: { k: '{env:MCP_TEST_SECRET}' } }, log, unset);
expect([...unset]).toEqual(['MCP_TEST_MISSING']);
});
});
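The substitution behavior these tests describe can be sketched as a recursive walk. This is an illustration under assumptions, not the code in `mcp-config.ts`: the function name `substituteEnvSketch`, the explicit `env` parameter (the real loader reads `process.env`), the `WarnLogger` shape, and the variable-name pattern in `ENV_REF` are all hypothetical.

```typescript
interface WarnLogger {
  warn: (msg: string) => void;
}

// Assumed reference syntax: {env:NAME} with a conventional identifier.
const ENV_REF = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;

function substituteEnvSketch(
  value: unknown,
  env: Record<string, string | undefined>,
  log: WarnLogger,
  unset?: Set<string>,
): unknown {
  if (typeof value === 'string') {
    // The global flag resolves every reference in the string.
    return value.replace(ENV_REF, (_match: string, name: string) => {
      const resolved = env[name];
      if (resolved === undefined) {
        log.warn(`env var ${name} is not set; substituting empty string`);
        unset?.add(name);
        return '';
      }
      return resolved;
    });
  }
  if (Array.isArray(value)) {
    return value.map((v) => substituteEnvSketch(v, env, log, unset));
  }
  if (value !== null && typeof value === 'object') {
    // Only values are transformed; keys are copied verbatim.
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = substituteEnvSketch(v, env, log, unset);
    }
    return out;
  }
  // Numbers, booleans, and null pass through unchanged.
  return value;
}
```

Taking `env` as a parameter is a choice made here so the sketch runs without touching the process environment; it also makes the unset-variable path easy to exercise.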

@@ -0,0 +1,164 @@
import { describe, it, expect } from 'vitest';
import {
MISTAKE_THRESHOLD,
freshMistakeState,
recordStep,
detectMistakePattern,
MISTAKE_RECOVERY_NOTE,
type FailureKind,
} from '../inference/mistake-tracker.js';
// ---- helpers ----------------------------------------------------------------
// Replays a sequence of outcomes against a fresh state, returning the final
// state so assertions can read .run / .nudges. The caller mimics turn.ts: after
// each recordStep we consult detectMistakePattern and, if it returns 'nudge',
// bump nudges + reset run (the loop's nudge-handling side effect).
function replay(
outcomes: (FailureKind | 'success')[],
{ applyNudge = false }: { applyNudge?: boolean } = {},
) {
const state = freshMistakeState();
const decisions: (ReturnType<typeof detectMistakePattern>)[] = [];
for (const o of outcomes) {
recordStep(state, o);
const decision = detectMistakePattern(state);
decisions.push(decision);
if (applyNudge && decision === 'nudge') {
// Mirror turn.ts's nudge side effect: bump the counter, reset the streak.
state.nudges += 1;
state.run = [];
}
}
return { state, decisions };
}
// ---- fresh state ------------------------------------------------------------
describe('freshMistakeState', () => {
it('starts with an empty run and zero nudges', () => {
const s = freshMistakeState();
expect(s.run).toEqual([]);
expect(s.nudges).toBe(0);
});
});
// ---- below threshold --------------------------------------------------------
describe('detectMistakePattern — below threshold', () => {
it('returns null on a fresh state', () => {
expect(detectMistakePattern(freshMistakeState())).toBeNull();
});
it('returns null after fewer than MISTAKE_THRESHOLD failures', () => {
const { decisions } = replay(['zod_reject', 'exec_error']);
expect(decisions).toEqual([null, null]);
});
});
// ---- success reset ----------------------------------------------------------
describe('recordStep — success resets', () => {
it("'success' clears both the run streak and the nudge counter", () => {
const state = freshMistakeState();
recordStep(state, 'zod_reject');
recordStep(state, 'exec_error');
state.nudges = 2; // simulate prior nudges
recordStep(state, 'success');
expect(state.run).toEqual([]);
expect(state.nudges).toBe(0);
});
it('a success mid-streak prevents the threshold from tripping', () => {
// fail, fail, success, fail, fail → streak never reaches 3.
const { decisions } = replay([
'zod_reject',
'exec_error',
'success',
'tool_not_found',
'permission_denied',
]);
expect(decisions.every((d) => d === null)).toBe(true);
});
});
// ---- 3-streak nudge ---------------------------------------------------------
describe('detectMistakePattern — nudge on 3-streak', () => {
it("returns 'nudge' the first time the streak reaches MISTAKE_THRESHOLD", () => {
const { decisions } = replay(['zod_reject', 'exec_error', 'tool_not_found']);
expect(decisions).toEqual([null, null, 'nudge']);
});
it("fires 'nudge' for a streak of identical kinds too (kind-agnostic)", () => {
const { decisions } = replay(['exec_error', 'exec_error', 'exec_error']);
expect(decisions[2]).toBe('nudge');
});
});
// ---- re-trip escalate -------------------------------------------------------
describe('detectMistakePattern — escalate on re-trip', () => {
it("escalates when the streak re-trips after a nudge with no intervening success", () => {
// 3 fails → nudge (run reset, nudges=1), then 3 more fails → escalate.
const { decisions } = replay(
[
'zod_reject',
'exec_error',
'tool_not_found',
'permission_denied',
'exec_error',
'zod_reject',
],
{ applyNudge: true },
);
expect(decisions[2]).toBe('nudge');
expect(decisions[5]).toBe('escalate');
});
it("does NOT escalate if a success lands between the nudge and the next streak", () => {
const { decisions } = replay(
[
'zod_reject',
'exec_error',
'tool_not_found', // nudge here
'success', // clears nudges back to 0
'exec_error',
'zod_reject',
'tool_not_found', // 3-streak again → nudge, NOT escalate
],
{ applyNudge: true },
);
expect(decisions[2]).toBe('nudge');
expect(decisions[6]).toBe('nudge');
expect(decisions).not.toContain('escalate');
});
});
// ---- mixed kinds ------------------------------------------------------------
describe('detectMistakePattern — mixed failure kinds', () => {
it('counts a streak of distinct failure kinds toward the threshold', () => {
const { state, decisions } = replay([
'zod_reject',
'tool_not_found',
'exec_error',
]);
expect(decisions[2]).toBe('nudge');
expect(state.run).toEqual(['zod_reject', 'tool_not_found', 'exec_error']);
});
});
// ---- contract ---------------------------------------------------------------
describe('MISTAKE_THRESHOLD + MISTAKE_RECOVERY_NOTE', () => {
it('threshold is a positive integer (tests assume 3)', () => {
expect(MISTAKE_THRESHOLD).toBeGreaterThan(0);
expect(Number.isInteger(MISTAKE_THRESHOLD)).toBe(true);
});
it('recovery note is a non-empty model-facing string', () => {
expect(typeof MISTAKE_RECOVERY_NOTE).toBe('string');
expect(MISTAKE_RECOVERY_NOTE.length).toBeGreaterThan(0);
});
});
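A tracker consistent with the assertions above might look like the following. This is a minimal sketch reconstructed from the tests alone; the real `inference/mistake-tracker.ts` is not shown in this diff. The `FailureKind` union lists only the kinds the tests use, the threshold value of 3 follows the "tests assume 3" remark, and `Decision` and `MistakeState` are hypothetical names.

```typescript
// Only the kinds seen in the tests; the real union may have more members.
type FailureKind = 'zod_reject' | 'exec_error' | 'tool_not_found' | 'permission_denied';
type Decision = 'nudge' | 'escalate' | null;

interface MistakeState {
  run: FailureKind[]; // current streak of consecutive failures
  nudges: number;     // nudges issued since the last success
}

const MISTAKE_THRESHOLD = 3; // assumed from the test comment

function freshMistakeState(): MistakeState {
  return { run: [], nudges: 0 };
}

// A success clears both the streak and the nudge counter; a failure extends the streak.
function recordStep(state: MistakeState, outcome: FailureKind | 'success'): void {
  if (outcome === 'success') {
    state.run = [];
    state.nudges = 0;
    return;
  }
  state.run.push(outcome);
}

// Kind-agnostic: only the streak length matters. A streak that re-trips
// after a nudge, with no success in between, escalates.
function detectMistakePattern(state: MistakeState): Decision {
  if (state.run.length < MISTAKE_THRESHOLD) return null;
  return state.nudges > 0 ? 'escalate' : 'nudge';
}
```

As in the `replay` helper, the caller is expected to bump `nudges` and reset `run` after acting on a `'nudge'`; the detector itself stays free of side effects.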

@@ -1,146 +1,13 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import { fileURLToPath } from 'node:url';
import {
WsFrameSchema,
KNOWN_FRAME_TYPES,
type WsFrame,
-} from '../../types/ws-frames.js';
+} from '@boocode/contracts/ws-frames';
import { createBroker } from '../broker.js';
const VALID_UUID_A = '00000000-0000-0000-0000-000000000001';
const VALID_UUID_B = '00000000-0000-0000-0000-000000000002';
const VALID_UUID_C = '00000000-0000-0000-0000-000000000003';
const VALID_TIMESTAMP = '2026-05-22T14:30:00.000Z';
describe('WsFrameSchema (v1.13.11-a)', () => {
it('accepts a well-formed chat_status frame', () => {
const result = WsFrameSchema.safeParse({
type: 'chat_status',
chat_id: VALID_UUID_A,
status: 'streaming',
at: VALID_TIMESTAMP,
});
expect(result.success).toBe(true);
});
it('rejects an unknown frame type', () => {
const result = WsFrameSchema.safeParse({
type: 'cosmic_ray_strike',
chat_id: VALID_UUID_A,
});
expect(result.success).toBe(false);
});
it('rejects a chat_status frame with invalid status enum', () => {
// v1.12.1 dropped the legacy 'working' status. Any frame still emitting it
// should fail validation — that's a drift catcher.
const result = WsFrameSchema.safeParse({
type: 'chat_status',
chat_id: VALID_UUID_A,
status: 'working',
at: VALID_TIMESTAMP,
});
expect(result.success).toBe(false);
});
it('rejects a UUID field with a non-UUID string', () => {
const result = WsFrameSchema.safeParse({
type: 'chat_status',
chat_id: 'not-a-uuid',
status: 'idle',
at: VALID_TIMESTAMP,
});
expect(result.success).toBe(false);
});
it('rejects negative token counts in usage frame', () => {
const result = WsFrameSchema.safeParse({
type: 'usage',
message_id: VALID_UUID_A,
chat_id: VALID_UUID_B,
completion_tokens: -1,
ctx_used: 100,
ctx_max: 1000,
});
expect(result.success).toBe(false);
});
it('accepts a usage frame with nullable token counts (pre-v1.13.7 history)', () => {
const result = WsFrameSchema.safeParse({
type: 'usage',
message_id: VALID_UUID_A,
chat_id: VALID_UUID_B,
completion_tokens: null,
ctx_used: null,
ctx_max: null,
});
expect(result.success).toBe(true);
});
it('accepts a tool_result frame with non-UUID tool_call_id (model-emitted)', () => {
// Model-emitted tool_call_ids look like "call_abc123", not UUIDs.
const result = WsFrameSchema.safeParse({
type: 'tool_result',
tool_message_id: VALID_UUID_A,
chat_id: VALID_UUID_B,
tool_call_id: 'call_abc123',
output: { whatever: true },
truncated: false,
});
expect(result.success).toBe(true);
});
it('accepts a compacted frame', () => {
const result = WsFrameSchema.safeParse({
type: 'compacted',
session_id: VALID_UUID_A,
chat_id: VALID_UUID_B,
summary_message_id: VALID_UUID_C,
});
expect(result.success).toBe(true);
});
it('accepts a session_workspace_updated frame', () => {
const result = WsFrameSchema.safeParse({
type: 'session_workspace_updated',
session_id: VALID_UUID_A,
workspace_panes: [{ id: 'p1', kind: 'chat', chatIds: [], activeChatIdx: 0 }],
});
expect(result.success).toBe(true);
});
it('every KNOWN_FRAME_TYPES entry has a discriminated branch', () => {
// Probe each known type by attempting a minimal valid construction.
// Failure here means the union and the KNOWN_FRAME_TYPES list drifted.
for (const type of KNOWN_FRAME_TYPES) {
const probe = WsFrameSchema.safeParse({ type, __dummy__: true });
// We expect FAILURE on every type because we're missing required fields,
// but the failure must be ABOUT the missing fields, not about an unknown
// type. A "Invalid discriminator value" error means the type isn't in
// the union — that's a drift.
if (probe.success) continue;
const issues = probe.error.issues;
const hasInvalidDiscriminator = issues.some(
(i) => i.code === 'invalid_union_discriminator',
);
expect(hasInvalidDiscriminator, `frame type '${type}' is missing from the discriminated union`).toBe(false);
}
});
});
describe('ws-frames.ts file mirror parity', () => {
it('apps/server and apps/web copies are byte-identical', () => {
const here = fileURLToPath(import.meta.url);
const serverPath = resolve(here, '../../../types/ws-frames.ts');
const webPath = resolve(here, '../../../../../web/src/api/ws-frames.ts');
const serverContent = readFileSync(serverPath, 'utf8');
const webContent = readFileSync(webPath, 'utf8');
expect(webContent, 'apps/web/src/api/ws-frames.ts must be byte-identical to apps/server/src/types/ws-frames.ts').toBe(serverContent);
});
});
describe('broker.publishFrame / publishUserFrame fail-closed behavior', () => {
let logErrors: Array<{ obj: unknown; msg: string }>;
let mockLog: Parameters<typeof createBroker>[0];

@@ -88,6 +88,12 @@ interface ParsedFrontmatter {
top_k?: number;
min_p?: number;
presence_penalty?: number;
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions.
top_n_sigma?: number;
dry_multiplier?: number;
dry_base?: number;
dry_allowed_length?: number;
dry_penalty_last_n?: number;
tools?: string[];
description?: string;
model?: string;
@@ -178,6 +184,63 @@ function parseFrontmatter(yaml: string): { data: ParsedFrontmatter; errors: stri
} else { } else {
errors.push(`presence_penalty must be a number (got "${valueRaw}")`); errors.push(`presence_penalty must be a number (got "${valueRaw}")`);
} }
} else if (key === 'top_n_sigma') {
// v2.6 #11: llama.cpp top-n-sigma sampler. Float ≥ 0 (typical 0-3).
// Mirrors top_p/min_p: store then warn on out-of-range (non-numeric
// hard-fails the block).
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.top_n_sigma = n;
if (n < 0) {
console.warn(`agents: top_n_sigma ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`top_n_sigma must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_multiplier') {
// v2.6 #11: DRY repetition-penalty multiplier. Float ≥ 0 (0 disables DRY).
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.dry_multiplier = n;
if (n < 0) {
console.warn(`agents: dry_multiplier ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_multiplier must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_base') {
// v2.6 #11: DRY penalty growth base. Float ≥ 0.
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.dry_base = n;
if (n < 0) {
console.warn(`agents: dry_base ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_base must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_allowed_length') {
// v2.6 #11: DRY max sequence length not penalized. Integer ≥ 0.
const n = Number(valueRaw);
if (Number.isInteger(n)) {
data.dry_allowed_length = n;
if (n < 0) {
console.warn(`agents: dry_allowed_length ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_allowed_length must be an integer (got "${valueRaw}")`);
}
} else if (key === 'dry_penalty_last_n') {
// v2.6 #11: DRY lookback window. Integer ≥ -1 (-1 = whole context, 0 = off).
const n = Number(valueRaw);
if (Number.isInteger(n)) {
data.dry_penalty_last_n = n;
if (n < -1) {
console.warn(`agents: dry_penalty_last_n ${n} out of range (≥-1), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_penalty_last_n must be an integer (got "${valueRaw}")`);
}
} else if (key === 'tools') { } else if (key === 'tools') {
if (valueRaw === '') { if (valueRaw === '') {
data.tools = []; data.tools = [];
@@ -354,6 +417,11 @@ function parseAgentSection(section: RawSection): Omit<Agent, 'source'> {
top_k: typeof fm.top_k === 'number' ? fm.top_k : null, top_k: typeof fm.top_k === 'number' ? fm.top_k : null,
min_p: typeof fm.min_p === 'number' ? fm.min_p : null, min_p: typeof fm.min_p === 'number' ? fm.min_p : null,
presence_penalty: typeof fm.presence_penalty === 'number' ? fm.presence_penalty : null, presence_penalty: typeof fm.presence_penalty === 'number' ? fm.presence_penalty : null,
top_n_sigma: typeof fm.top_n_sigma === 'number' ? fm.top_n_sigma : null,
dry_multiplier: typeof fm.dry_multiplier === 'number' ? fm.dry_multiplier : null,
dry_base: typeof fm.dry_base === 'number' ? fm.dry_base : null,
dry_allowed_length: typeof fm.dry_allowed_length === 'number' ? fm.dry_allowed_length : null,
dry_penalty_last_n: typeof fm.dry_penalty_last_n === 'number' ? fm.dry_penalty_last_n : null,
tools: filteredTools, tools: filteredTools,
model: typeof fm.model === 'string' && fm.model.length > 0 ? fm.model : null, model: typeof fm.model === 'string' && fm.model.length > 0 ? fm.model : null,
max_tool_calls: typeof fm.max_tool_calls === 'number' ? fm.max_tool_calls : null, max_tool_calls: typeof fm.max_tool_calls === 'number' ? fm.max_tool_calls : null,
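The five new frontmatter branches above share one shape: parse the raw value, hard-fail on a non-number (or non-integer), and warn when the value falls below a lower bound. A minimal table-driven sketch of that same rule set follows. It is not the code in this commit; `SAMPLER_RULES`, `ParseResult`, and `parseSamplerValue` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical rule table: the bounds are copied from the comments in the diff.
interface NumericRule {
  integer: boolean; // true -> Number.isInteger, false -> Number.isFinite
  min: number; // inclusive lower bound
}

const SAMPLER_RULES: Record<string, NumericRule> = {
  top_n_sigma: { integer: false, min: 0 },
  dry_multiplier: { integer: false, min: 0 },
  dry_base: { integer: false, min: 0 },
  dry_allowed_length: { integer: true, min: 0 },
  dry_penalty_last_n: { integer: true, min: -1 },
};

interface ParseResult {
  value?: number;
  warning?: string;
  error?: string;
}

// Hypothetical helper: returns the parsed value plus an optional warning or error.
function parseSamplerValue(key: string, valueRaw: string): ParseResult {
  const rule = SAMPLER_RULES[key];
  if (!rule) return { error: `unknown sampler key "${key}"` };
  const n = Number(valueRaw);
  const ok = rule.integer ? Number.isInteger(n) : Number.isFinite(n);
  if (!ok) {
    const kind = rule.integer ? 'an integer' : 'a number';
    return { error: `${key} must be ${kind} (got "${valueRaw}")` };
  }
  // As in the diff, an out-of-range value is still stored; only a warning is produced.
  if (n < rule.min) {
    return { value: n, warning: `${key} ${n} out of range (≥${rule.min})` };
  }
  return { value: n };
}

const wholeContext = parseSamplerValue('dry_penalty_last_n', '-1');
const fractional = parseSamplerValue('dry_allowed_length', '2.5');
const negativeSigma = parseSamplerValue('top_n_sigma', '-0.5');
```

Here `wholeContext` carries the value with no warning, `fractional` carries only an error, and `negativeSigma` carries both the stored value and a warning.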

@@ -1,5 +1,5 @@
import type { FastifyBaseLogger } from 'fastify'; import type { FastifyBaseLogger } from 'fastify';
-import { WsFrameSchema, type WsFrame } from '../types/ws-frames.js';
+import { WsFrameSchema, type WsFrame } from '@boocode/contracts/ws-frames';
export type Frame = Record<string, unknown> & { type: string }; export type Frame = Record<string, unknown> & { type: string };
export type Listener = (frame: Frame) => void; export type Listener = (frame: Frame) => void;

@@ -31,10 +31,16 @@ export const SUMMARY_TEMPLATE = `Output exactly the Markdown structure shown ins
## Relevant Files ## Relevant Files
- [file or directory path: why it matters, or "(none)"] - [file or directory path: why it matters, or "(none)"]
## Files Read
- [file or directory path that has been read/searched this session, or "(none)"]
</template> </template>
Rules: Rules:
- Keep every section, even when empty. - Keep every section, even when empty.
- Use terse bullets, not prose paragraphs. - Use terse bullets, not prose paragraphs.
- Preserve exact file paths, commands, error strings, and identifiers when known. - Preserve exact file paths, commands, error strings, and identifiers when known.
- For ## Files Read: this is a cumulative provenance ledger. MERGE the paths
listed in any "## Files Read" block provided below with those already in the
previous summary — never drop a previously-recorded path. Sort and dedupe.
- Do not mention the summary process or that context was compacted.`; - Do not mention the summary process or that context was compacted.`;

@@ -181,6 +181,54 @@ export function select(
}; };
} }
// === file-provenance ledger (#12, Part B) ===
// Read tools whose path/target arg names a file or directory that was read.
// BooChat (apps/server) is read-only — there are no write tools, so the ledger
// only ever has a "Files Read" side (apps/coder can add "Modified" later).
const READ_TOOL_ARG: Record<string, string> = {
view_file: 'path',
list_dir: 'path',
grep: 'path',
find_files: 'path',
};
// Derive a deterministic, deduped, sorted list of file/dir paths read by the
// HEAD messages being summarized. Pure — scans assistant tool_calls only; the
// boundary (which messages are "head") is decided by select() at the call site.
// We derive at compaction time rather than via a live accumulator because
// TurnArgs resets per turn and would miss reads on non-compacting turns; the
// head messages are the authoritative record of what was read in the window
// being summarized. The result propagates forward as summary text across
// compactions (the LLM merges it into ## Files Read), so a path read long ago
// survives even after its originating messages are compacted out.
export function deriveFilesRead(head: CompactionMessage[]): string[] {
const paths = new Set<string>();
for (const m of head) {
if (m.role !== 'assistant') continue;
if (!m.tool_calls) continue;
for (const tc of m.tool_calls) {
const argName = READ_TOOL_ARG[tc.name];
if (!argName) continue;
const raw = (tc.args as Record<string, unknown> | null)?.[argName];
if (typeof raw === 'string' && raw.trim().length > 0) {
paths.add(raw.trim());
}
}
}
return [...paths].sort();
}
// Format the derived paths as a deterministic ## Files Read block for injection
// into buildPrompt's context array. Returns null when nothing was read (so we
// don't inject an empty section). The summarizer merges this into the rolling
// summary's ## Files Read section per the SUMMARY_TEMPLATE instructions.
export function buildFilesReadContext(head: CompactionMessage[]): string | null {
const paths = deriveFilesRead(head);
if (paths.length === 0) return null;
return ['## Files Read', ...paths.map((p) => `- ${p}`)].join('\n');
}
// === prompt assembly === // === prompt assembly ===
// Build the final user message that asks the model to (re)produce the // Build the final user message that asks the model to (re)produce the
@@ -220,15 +268,26 @@ export interface OpenAiMessage {
tool_call_id?: string; tool_call_id?: string;
} }
-function isCapHitSentinel(m: CompactionMessage): boolean {
-return m.role === 'system' && m.metadata != null && m.metadata.kind === 'cap_hit';
+// #12: mirror inference/sentinels.ts:isAnySentinel over the CompactionMessage
+// shape (which carries metadata as { kind?: string } | null, not the full
+// Message type isAnySentinel expects). All UI-only sentinels are stripped from
+// the head payload — they never go to the summarizer LLM. Keep the kind list in
+// sync with isAnySentinel in sentinels.ts.
+const SENTINEL_KINDS = new Set(['cap_hit', 'doom_loop', 'mistake_recovery']);
+function isAnySentinel(m: CompactionMessage): boolean {
+return (
+m.role === 'system' &&
+m.metadata != null &&
+typeof m.metadata.kind === 'string' &&
+SENTINEL_KINDS.has(m.metadata.kind)
+);
} }
// v1.13.6: exported for unit-test access (reasoning render coverage). // v1.13.6: exported for unit-test access (reasoning render coverage).
export function buildHeadPayload(head: CompactionMessage[]): OpenAiMessage[] { export function buildHeadPayload(head: CompactionMessage[]): OpenAiMessage[] {
const out: OpenAiMessage[] = []; const out: OpenAiMessage[] = [];
for (const m of head) { for (const m of head) {
-if (isCapHitSentinel(m)) continue;
+if (isAnySentinel(m)) continue;
if (m.role === 'assistant' && (m.status === 'streaming' || m.status === 'cancelled')) continue; if (m.role === 'assistant' && (m.status === 'streaming' || m.status === 'cancelled')) continue;
if (m.kind === 'compact') { if (m.kind === 'compact') {
// Legacy compact row — pass through as system context. The new // Legacy compact row — pass through as system context. The new
@@ -417,7 +476,14 @@ export async function process(input: ProcessInput): Promise<void> {
// user message carrying buildPrompt(previousSummary, []). No system prompt // user message carrying buildPrompt(previousSummary, []). No system prompt
// — matches opencode (`system: []`); the template + anchor are sufficient. // — matches opencode (`system: []`); the template + anchor are sufficient.
const headPayload = buildHeadPayload(sel.head); const headPayload = buildHeadPayload(sel.head);
-const finalUser: OpenAiMessage = { role: 'user', content: buildPrompt(previousSummary, []) };
+// #12 Part B: derive the file-provenance ledger from the head's read-tool
+// calls and inject it as a deterministic ## Files Read context block so the
+// summarizer merges it into the rolling summary. Empty → no injection.
+const filesReadCtx = buildFilesReadContext(sel.head);
+const finalUser: OpenAiMessage = {
+role: 'user',
+content: buildPrompt(previousSummary, filesReadCtx ? [filesReadCtx] : []),
+};
const payload = [...headPayload, finalUser]; const payload = [...headPayload, finalUser];
log.info( log.info(
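To make the ledger derivation above concrete, here is a minimal, self-contained sketch of `deriveFilesRead` and `buildFilesReadContext` run against a small sample. `ToolCallLike` and `MessageLike` are hypothetical, trimmed-down stand-ins for the real `CompactionMessage` shape, and the sample paths are invented for illustration.

```typescript
// Hypothetical minimal shapes; the real CompactionMessage carries more fields.
interface ToolCallLike {
  name: string;
  args: Record<string, unknown> | null;
}
interface MessageLike {
  role: 'user' | 'assistant' | 'system' | 'tool';
  tool_calls?: ToolCallLike[];
}

// Read tools and the argument that names the path they read (from the diff).
const READ_TOOL_ARG: Record<string, string> = {
  view_file: 'path',
  list_dir: 'path',
  grep: 'path',
  find_files: 'path',
};

// Collect trimmed, deduped, sorted paths from assistant tool calls only.
function deriveFilesRead(head: MessageLike[]): string[] {
  const paths = new Set<string>();
  for (const m of head) {
    if (m.role !== 'assistant' || !m.tool_calls) continue;
    for (const tc of m.tool_calls) {
      const argName = READ_TOOL_ARG[tc.name];
      if (!argName) continue; // not a read tool
      const raw = tc.args?.[argName];
      if (typeof raw === 'string' && raw.trim().length > 0) {
        paths.add(raw.trim());
      }
    }
  }
  return Array.from(paths).sort();
}

// Format as a "## Files Read" block, or null when nothing was read.
function buildFilesReadContext(head: MessageLike[]): string | null {
  const paths = deriveFilesRead(head);
  if (paths.length === 0) return null;
  return ['## Files Read', ...paths.map((p) => `- ${p}`)].join('\n');
}

const sampleHead: MessageLike[] = [
  { role: 'user' },
  {
    role: 'assistant',
    tool_calls: [
      { name: 'view_file', args: { path: 'src/b.ts' } },
      { name: 'grep', args: { path: ' src/a.ts ' } },
      { name: 'view_file', args: { path: 'src/b.ts' } }, // duplicate, deduped
      { name: 'unknown_tool', args: { path: 'ignored.ts' } }, // not a read tool
    ],
  },
];

const filesReadBlock = buildFilesReadContext(sampleHead);
const emptyBlock = buildFilesReadContext([{ role: 'user' }]);
```

Because the output is sorted and deduped, the same set of reads yields the same block regardless of call order, which is what keeps the injected context deterministic.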

@@ -119,6 +119,7 @@ export async function finalizeCompletion(
tokens_used = ${completionTokens}, tokens_used = ${completionTokens},
ctx_used = ${promptTokens}, ctx_used = ${promptTokens},
ctx_max = ${nCtx}, ctx_max = ${nCtx},
model = ${session.model},
finished_at = clock_timestamp() finished_at = clock_timestamp()
WHERE id = ${assistantMessageId} WHERE id = ${assistantMessageId}
RETURNING tokens_used, ctx_used, ctx_max, finished_at RETURNING tokens_used, ctx_used, ctx_max, finished_at

@@ -19,6 +19,14 @@ export type {
} from './turn.js'; } from './turn.js';
export type { ToolPhaseResult } from './tool-phase.js'; export type { ToolPhaseResult } from './tool-phase.js';
export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js'; export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js';
export {
detectMistakePattern,
freshMistakeState,
recordStep,
MISTAKE_THRESHOLD,
MISTAKE_RECOVERY_NOTE,
} from './mistake-tracker.js';
export type { FailureKind, MistakeState } from './mistake-tracker.js';
export { buildMessagesPayload } from './payload.js'; export { buildMessagesPayload } from './payload.js';
export { generateToolUseSummary } from './tool-summaries.js'; export { generateToolUseSummary } from './tool-summaries.js';
export type { ToolInfo } from './tool-summaries.js'; export type { ToolInfo } from './tool-summaries.js';

@@ -0,0 +1,69 @@
// v#12 MistakeTracker: heterogeneous-failure recovery. Complements the
// doom-loop guard (sentinels.ts:detectDoomLoop, which only catches *identical*
// repeats) by catching a run of consecutive tool FAILURES the model isn't
// recovering from — even when each failure is a *different* error. Algorithm
// reimplemented from cline's mistake-counting pattern (NOT vendored).
//
// Pure module — mirrors sentinels.ts:detectDoomLoop. No DB, no I/O. The state
// lives loop-local in TurnArgs (reset per runInference, like recentToolCalls).
// FailureKind is the failure taxonomy already distinguished in tool-phase.ts:executeToolCall.
// 'api_error' is reserved for upstream-model failures surfaced as tool outcomes
// (no current emit site on apps/server, but the union mirrors the design doc
// so a future caller can record it without a type change).
export type FailureKind =
| 'zod_reject'
| 'tool_not_found'
| 'exec_error'
| 'api_error'
| 'permission_denied';
// Smallest streak that doesn't false-positive on a model that retries once
// after a transient error. Matches DOOM_LOOP_THRESHOLD's rationale.
export const MISTAKE_THRESHOLD = 3;
export interface MistakeState {
// The current consecutive-failure streak (any successful tool step clears it).
run: FailureKind[];
// How many recovery nudges have fired without an intervening success. Used to
// escalate (stop the turn) on the second trip rather than nudging forever.
nudges: number;
}
export function freshMistakeState(): MistakeState {
return { run: [], nudges: 0 };
}
// Record one tool step's outcome. A 'success' clears BOTH the streak and the
// nudge counter (the model recovered). A FailureKind pushes onto the streak.
export function recordStep(
state: MistakeState,
outcome: FailureKind | 'success',
): void {
if (outcome === 'success') {
state.run = [];
state.nudges = 0;
return;
}
state.run.push(outcome);
}
// Decide whether to intervene given the current streak. When the streak has
// reached MISTAKE_THRESHOLD: 'nudge' the first time (no nudge fired yet),
// 'escalate' if it trips again while a nudge is already outstanding (no
// intervening success cleared `nudges`). Below threshold → null.
//
// Pure — the caller is responsible for mutating `nudges`/`run` after acting on
// the decision (mirrors how turn.ts consumes detectDoomLoop's result).
export function detectMistakePattern(
state: MistakeState,
): 'nudge' | 'escalate' | null {
if (state.run.length < MISTAKE_THRESHOLD) return null;
return state.nudges === 0 ? 'nudge' : 'escalate';
}
// Model-facing guidance injected (transiently, for the next step only) when a
// nudge fires. Short + declarative for the same reliability reason as the
// cap-hit / doom-loop notes.
export const MISTAKE_RECOVERY_NOTE =
"You've hit several different errors in a row. Stop retrying variations — re-read the tool schemas, verify file paths and arguments exist before calling, and try a fundamentally different approach.";
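A minimal sketch of how a turn loop might consume this module. The tracker functions are re-declared so the block runs on its own. `foldOutcomes` and the sample outcome batches are hypothetical, and resetting `run` after a nudge is an assumption: the module only says the caller is responsible for mutating `nudges`/`run`, and the real wiring in turn.ts is not shown in this diff.

```typescript
type FailureKind =
  | 'zod_reject'
  | 'tool_not_found'
  | 'exec_error'
  | 'api_error'
  | 'permission_denied';
type Outcome = FailureKind | 'success';

interface MistakeState {
  run: FailureKind[];
  nudges: number;
}

const MISTAKE_THRESHOLD = 3;

function freshMistakeState(): MistakeState {
  return { run: [], nudges: 0 };
}

// A success clears both the streak and the nudge counter; a failure extends the streak.
function recordStep(state: MistakeState, outcome: Outcome): void {
  if (outcome === 'success') {
    state.run = [];
    state.nudges = 0;
    return;
  }
  state.run.push(outcome);
}

function detectMistakePattern(state: MistakeState): 'nudge' | 'escalate' | null {
  if (state.run.length < MISTAKE_THRESHOLD) return null;
  return state.nudges === 0 ? 'nudge' : 'escalate';
}

// Hypothetical caller: fold one tool phase's outcomes, then act on the decision.
function foldOutcomes(
  state: MistakeState,
  outcomes: Outcome[],
): 'nudge' | 'escalate' | null {
  for (const o of outcomes) recordStep(state, o);
  const decision = detectMistakePattern(state);
  if (decision === 'nudge') {
    // Assumption: count the nudge and restart the streak, so escalation
    // requires a fresh run of MISTAKE_THRESHOLD failures.
    state.nudges += 1;
    state.run = [];
  }
  return decision;
}

const tracker = freshMistakeState();
const afterTwoFailures = foldOutcomes(tracker, ['zod_reject', 'exec_error']);
const afterThirdFailure = foldOutcomes(tracker, ['tool_not_found']);
const afterNudgeIgnored = foldOutcomes(tracker, ['exec_error', 'exec_error', 'permission_denied']);
const afterRecovery = foldOutcomes(tracker, ['success']);
```

Under these assumptions the four calls yield `null`, `'nudge'`, `'escalate'`, and `null` in turn, and the final success leaves the tracker back in its fresh state.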

@@ -86,7 +86,7 @@ export async function runCapHitSummary(
ctx, ctx,
session.model, session.model,
messages, messages,
-{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined },
+{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined },
(delta) => { (delta) => {
accumulated += delta; accumulated += delta;
ctx.publish(sessionId, { ctx.publish(sessionId, {
@@ -346,7 +346,7 @@ export async function runDoomLoopSummary(
ctx, ctx,
session.model, session.model,
messages, messages,
-{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined },
+{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined },
(delta) => { (delta) => {
accumulated += delta; accumulated += delta;
ctx.publish(sessionId, { ctx.publish(sessionId, {
@@ -545,7 +545,7 @@ export async function runStepCapSummary(
ctx, ctx,
session.model, session.model,
messages, messages,
-{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined },
+{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined },
(delta) => { (delta) => {
accumulated += delta; accumulated += delta;
ctx.publish(sessionId, { ctx.publish(sessionId, {
@@ -717,3 +717,57 @@ async function insertDoomLoopSentinel(
metadata, metadata,
}); });
} }
// #12 MistakeTracker: heterogeneous-failure recovery sentinel. Mirrors
// insertDoomLoopSentinel structurally — a role='system', status='complete' row
// firing the standard message_started → delta → message_complete frame
// sequence. Two variants distinguished by `escalated`:
// - escalated:false → a nudge fired; recovery guidance was injected into the
// model's next step and the loop continued. can_continue is true (the turn
// is still live).
// - escalated:true → the nudge didn't break the failure run; the turn was
// stopped (cap-hit-style). can_continue is true so the UI can still offer a
// Continue affordance — a fresh user turn resets the tracker.
export async function insertMistakeRecoverySentinel(
ctx: InferenceContext,
sessionId: string,
chatId: string,
opts: { failureKinds: string[]; count: number; escalated: boolean; canContinue: boolean },
): Promise<void> {
const metadata: MessageMetadata = {
kind: 'mistake_recovery',
failure_kinds: opts.failureKinds,
count: opts.count,
escalated: opts.escalated,
can_continue: opts.canContinue,
};
const content = opts.escalated
? `Repeated different errors persisted after a recovery nudge (${opts.count} in a row). Stopping the tool-call loop.`
: `Hit ${opts.count} different errors in a row. Injected recovery guidance and continuing.`;
const [row] = await ctx.sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
RETURNING id
`;
// Standard frame sequence — same as cap-hit / doom-loop sentinels.
ctx.publish(sessionId, {
type: 'message_started',
message_id: row!.id,
chat_id: chatId,
role: 'system',
});
ctx.publish(sessionId, {
type: 'delta',
message_id: row!.id,
chat_id: chatId,
content,
});
ctx.publish(sessionId, {
type: 'message_complete',
message_id: row!.id,
chat_id: chatId,
metadata,
});
}

@@ -48,6 +48,18 @@ export function isDoomLoopSentinel(m: Message): boolean {
); );
} }
-export function isAnySentinel(m: Message): boolean {
-return isCapHitSentinel(m) || isDoomLoopSentinel(m);
+// #12: mistake-recovery sentinel. Same UI-only semantics as cap-hit /
+// doom-loop — never sent to the LLM (filtered via the isAnySentinel check
+// below, which buildMessagesPayload + buildHeadPayload both consult).
+export function isMistakeRecoverySentinel(m: Message): boolean {
+return (
+m.role === 'system' &&
+m.metadata !== null &&
+typeof m.metadata === 'object' &&
+(m.metadata as { kind?: unknown }).kind === 'mistake_recovery'
+);
+}
+export function isAnySentinel(m: Message): boolean {
+return isCapHitSentinel(m) || isDoomLoopSentinel(m) || isMistakeRecoverySentinel(m);
} }

@@ -33,6 +33,39 @@ interface StreamOptions {
top_k?: number | null; top_k?: number | null;
min_p?: number | null; min_p?: number | null;
presence_penalty?: number | null; presence_penalty?: number | null;
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions. These
// are NOT standard AI-SDK streamText options and are NOT serialized by the
// openai-compatible provider's standardized-settings path (topK is even
// explicitly dropped with an "unsupported feature: topK" warning). They reach
// llama-server only via providerOptions.openaiCompatible (see buildSamplerProviderOptions).
top_n_sigma?: number | null;
dry_multiplier?: number | null;
dry_base?: number | null;
dry_allowed_length?: number | null;
dry_penalty_last_n?: number | null;
}
// v2.6 #11: build the providerOptions.openaiCompatible extraBody object for the
// llama.cpp sampler extensions. @ai-sdk/openai-compatible (2.0.47) merges every
// non-reserved key under providerOptions.openaiCompatible straight into the
// chat-completion request body (see its getArgs: the Object.fromEntries spread
// filtered against openaiCompatibleLanguageModelChatOptions.shape). This is the
// ONLY working passthrough for these params:
// - top_k / min_p were latently dropped before this: top_k was passed as the
// AI-SDK `topK` setting which the openai-compatible provider rejects as
// unsupported; min_p was never passed to streamText at all.
// - top_n_sigma + the dry_* family have no AI-SDK equivalent.
// Keys use llama-server's snake_case body names so they land verbatim.
function buildSamplerProviderOptions(opts: StreamOptions): Record<string, number> | undefined {
const body: Record<string, number> = {};
if (typeof opts.top_k === 'number') body.top_k = opts.top_k;
if (typeof opts.min_p === 'number') body.min_p = opts.min_p;
if (typeof opts.top_n_sigma === 'number') body.top_n_sigma = opts.top_n_sigma;
if (typeof opts.dry_multiplier === 'number') body.dry_multiplier = opts.dry_multiplier;
if (typeof opts.dry_base === 'number') body.dry_base = opts.dry_base;
if (typeof opts.dry_allowed_length === 'number') body.dry_allowed_length = opts.dry_allowed_length;
if (typeof opts.dry_penalty_last_n === 'number') body.dry_penalty_last_n = opts.dry_penalty_last_n;
return Object.keys(body).length > 0 ? body : undefined;
} }
// v1.13.1-A: convert BooCode's OpenAI-shaped history into AI SDK // v1.13.1-A: convert BooCode's OpenAI-shaped history into AI SDK
@@ -195,6 +228,14 @@ export async function streamCompletion(
return toolCall; return toolCall;
}; };
// v2.6 #11: llama.cpp sampler extensions (top_k, min_p, top_n_sigma, dry_*)
// ride providerOptions.openaiCompatible — they are NOT standardized streamText
// settings. NB: top_k used to be passed below as the AI-SDK `topK` setting;
// the openai-compatible provider dropped it with an "unsupported feature: topK"
// warning and min_p was never wired at all, so both were dead on the wire
// before this. They now go through the same extraBody path as the new params.
const samplerBody = buildSamplerProviderOptions(opts);
const result = streamText({ const result = streamText({
model: upstreamModel(ctx.config, model, agent ?? null), model: upstreamModel(ctx.config, model, agent ?? null),
messages: aiMessages, messages: aiMessages,
@@ -203,8 +244,8 @@ export async function streamCompletion(
: {}), : {}),
...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}), ...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}),
...(typeof opts.top_p === 'number' ? { topP: opts.top_p } : {}), ...(typeof opts.top_p === 'number' ? { topP: opts.top_p } : {}),
...(typeof opts.top_k === 'number' ? { topK: opts.top_k } : {}),
...(typeof opts.presence_penalty === 'number' ? { presencePenalty: opts.presence_penalty } : {}), ...(typeof opts.presence_penalty === 'number' ? { presencePenalty: opts.presence_penalty } : {}),
...(samplerBody ? { providerOptions: { openaiCompatible: samplerBody } } : {}),
abortSignal: signal, abortSignal: signal,
}); });
@@ -398,6 +439,12 @@ export async function executeStreamPhase(
const effectiveTopK = agent?.top_k ?? undefined; const effectiveTopK = agent?.top_k ?? undefined;
const effectiveMinP = agent?.min_p ?? undefined; const effectiveMinP = agent?.min_p ?? undefined;
const effectivePresencePenalty = agent?.presence_penalty ?? undefined; const effectivePresencePenalty = agent?.presence_penalty ?? undefined;
// v2.6 #11: llama.cpp sampler extensions, threaded the same way as top_k/min_p.
const effectiveTopNSigma = agent?.top_n_sigma ?? undefined;
const effectiveDryMultiplier = agent?.dry_multiplier ?? undefined;
const effectiveDryBase = agent?.dry_base ?? undefined;
const effectiveDryAllowedLength = agent?.dry_allowed_length ?? undefined;
const effectiveDryPenaltyLastN = agent?.dry_penalty_last_n ?? undefined;
// v1.12.2: ctx_max lookup is cached after the first hit per model, so this // v1.12.2: ctx_max lookup is cached after the first hit per model, so this
// is a Map probe in steady state. We capture nCtx once at the top of the // is a Map probe in steady state. We capture nCtx once at the top of the
@@ -435,7 +482,19 @@ export async function executeStreamPhase(
ctx, ctx,
session.model, session.model,
messages, messages,
-{ tools: effectiveTools, temperature: effectiveTemperature, top_p: effectiveTopP, top_k: effectiveTopK, min_p: effectiveMinP, presence_penalty: effectivePresencePenalty },
+{
+tools: effectiveTools,
+temperature: effectiveTemperature,
+top_p: effectiveTopP,
+top_k: effectiveTopK,
+min_p: effectiveMinP,
+presence_penalty: effectivePresencePenalty,
+top_n_sigma: effectiveTopNSigma,
+dry_multiplier: effectiveDryMultiplier,
+dry_base: effectiveDryBase,
+dry_allowed_length: effectiveDryAllowedLength,
+dry_penalty_last_n: effectiveDryPenaltyLastN,
+},
(delta) => { (delta) => {
state.accumulated += delta; state.accumulated += delta;
ctx.publish(sessionId, { ctx.publish(sessionId, {
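As a rough illustration of the sampler passthrough in this file, the sketch below rebuilds `buildSamplerProviderOptions` in a loop-based form (the diff uses an explicit if-chain) and shows the object that would be handed over as `providerOptions.openaiCompatible`. `SAMPLER_KEYS`, `SamplerOptions`, and `settings` are hypothetical names; no request is made, and the claim that these keys reach the request body rests on the comments in the diff rather than on anything verified here.

```typescript
// Hypothetical key list: llama-server's snake_case body names, as listed in the diff.
const SAMPLER_KEYS = [
  'top_k',
  'min_p',
  'top_n_sigma',
  'dry_multiplier',
  'dry_base',
  'dry_allowed_length',
  'dry_penalty_last_n',
] as const;

type SamplerKey = (typeof SAMPLER_KEYS)[number];
type SamplerOptions = Partial<Record<SamplerKey, number | null>>;

// Copy only the numeric sampler settings; null/undefined entries are skipped.
// Returns undefined when nothing is set so the caller can omit providerOptions.
function buildSamplerProviderOptions(
  opts: SamplerOptions,
): Record<string, number> | undefined {
  const body: Record<string, number> = {};
  for (const key of SAMPLER_KEYS) {
    const value = opts[key];
    if (typeof value === 'number') body[key] = value;
  }
  return Object.keys(body).length > 0 ? body : undefined;
}

const samplerBody = buildSamplerProviderOptions({
  top_k: 40,
  min_p: null, // unset on the agent, so it is dropped
  dry_multiplier: 0.8,
});

// Same conditional spread as in the diff, applied to a plain settings object.
const settings = {
  ...(samplerBody ? { providerOptions: { openaiCompatible: samplerBody } } : {}),
};

const noSampler = buildSamplerProviderOptions({ top_k: null });
```

Returning `undefined` rather than an empty object keeps `providerOptions` out of the call entirely for agents that set none of these fields.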

@@ -17,6 +17,7 @@ import { formatUnknownToolError } from './tool-suggestions.js';
// prompted about paths we couldn't grant anyway (e.g. /etc/passwd). // prompted about paths we couldn't grant anyway (e.g. /etc/passwd).
import { resolveGrantRoot } from '../grant_resolver.js'; import { resolveGrantRoot } from '../grant_resolver.js';
import { stripToolMarkup } from './tool-call-parser.js'; import { stripToolMarkup } from './tool-call-parser.js';
import type { FailureKind } from './mistake-tracker.js';
import type { import type {
InferenceContext, InferenceContext,
StreamResult, StreamResult,
@@ -33,13 +34,18 @@ async function executeToolCall(
toolCall: ToolCall, toolCall: ToolCall,
extraRoots: readonly string[], extraRoots: readonly string[],
toolCtx?: ToolExecCtx, toolCtx?: ToolExecCtx,
-): Promise<{ output: unknown; truncated: boolean; error?: string }> {
+): Promise<{ output: unknown; truncated: boolean; error?: string; outcome: FailureKind | 'success' }> {
// v#12 MistakeTracker: every return path carries an `outcome` so the turn
// loop can detect a run of heterogeneous failures. The failure taxonomy
// mirrors mistake-tracker.ts:FailureKind. Does NOT alter the existing
// output/truncated/error shape — outcome is purely additive.
const tool = TOOLS_BY_NAME[toolCall.name]; const tool = TOOLS_BY_NAME[toolCall.name];
if (!tool) { if (!tool) {
return { return {
output: null, output: null,
truncated: false, truncated: false,
error: formatUnknownToolError(toolCall.name, Object.keys(TOOLS_BY_NAME)), error: formatUnknownToolError(toolCall.name, Object.keys(TOOLS_BY_NAME)),
outcome: 'tool_not_found',
}; };
} }
const parsed = tool.inputSchema.safeParse(toolCall.args); const parsed = tool.inputSchema.safeParse(toolCall.args);
@@ -64,6 +70,7 @@ async function executeToolCall(
output: null, output: null,
truncated: false, truncated: false,
error: `tool '${toolCall.name}' rejected — ${hint}`, error: `tool '${toolCall.name}' rejected — ${hint}`,
outcome: 'zod_reject',
}; };
} }
try { try {
@@ -72,15 +79,16 @@ async function executeToolCall(
typeof output === 'object' && output !== null && 'truncated' in output typeof output === 'object' && output !== null && 'truncated' in output
? Boolean((output as { truncated: unknown }).truncated) ? Boolean((output as { truncated: unknown }).truncated)
: false; : false;
-return { output, truncated };
+return { output, truncated, outcome: 'success' };
} catch (err) { } catch (err) {
if (err instanceof PathScopeError) { if (err instanceof PathScopeError) {
-return { output: null, truncated: false, error: err.message };
+return { output: null, truncated: false, error: err.message, outcome: 'permission_denied' };
} }
return { return {
output: null, output: null,
truncated: false, truncated: false,
error: err instanceof Error ? err.message : String(err), error: err instanceof Error ? err.message : String(err),
outcome: 'exec_error',
}; };
} }
} }
@@ -93,6 +101,12 @@ export interface ToolPhaseResult {
toolCallCount: number; toolCallCount: number;
toolCalls: ToolCall[]; toolCalls: ToolCall[];
nextAssistantId: string | null; nextAssistantId: string | null;
// v#12 MistakeTracker: one outcome per executed tool call, in no particular
// order (filled inside the Promise.all callbacks). The turn loop folds these
// into TurnArgs.mistakeTracker via recordStep. Pause/auto-grant control-flow
// tools record 'success' (they aren't model mistakes); the genuine error
// paths record their FailureKind.
outcomes: (FailureKind | 'success')[];
} }
export async function executeToolPhase( export async function executeToolPhase(
@@ -187,6 +201,10 @@ export async function executeToolPhase(
// for the synthesis input. Race-free under Promise.all because each // for the synthesis input. Race-free under Promise.all because each
// callback pushes its own captured value. // callback pushes its own captured value.
const synthEntries: Array<{ tc: ToolCall; output: unknown; error?: string }> = []; const synthEntries: Array<{ tc: ToolCall; output: unknown; error?: string }> = [];
// v#12 MistakeTracker: collect each tool's outcome. Concurrent pushes under
// Promise.all are safe (each callback appends its own value; order is not
// significant to recordStep which folds them sequentially).
const outcomes: (FailureKind | 'success')[] = [];
await Promise.all( await Promise.all(
toolCalls.map(async (tc) => { toolCalls.map(async (tc) => {
const [toolRow] = await ctx.sql<{ id: string }[]>` const [toolRow] = await ctx.sql<{ id: string }[]>`
@@ -197,6 +215,7 @@ export async function executeToolPhase(
const toolMessageId = toolRow!.id; const toolMessageId = toolRow!.id;
if (tc.name === 'ask_user_input') { if (tc.name === 'ask_user_input') {
pausingForUserInput = true; pausingForUserInput = true;
outcomes.push('success');
const sentinel = { tool_call_id: tc.id, output: null, truncated: false }; const sentinel = { tool_call_id: tc.id, output: null, truncated: false };
// v1.13.20: parts-only. The answer-endpoint UPDATE later // v1.13.20: parts-only. The answer-endpoint UPDATE later
// (messages.ts) will delete and re-insert this part when the user // (messages.ts) will delete and re-insert this part when the user
@@ -227,7 +246,10 @@ export async function executeToolPhase(
 );
 if (!resolution.ok) {
   // Auto-deny without pausing. The model sees the reason on its
-  // next turn and decides what to do.
+  // next turn and decides what to do. Counts as a permission_denied
+  // failure for the mistake tracker (the model asked for a path it
+  // can't have — a recoverable mistake it should learn from).
+  outcomes.push('permission_denied');
   const stored = {
     tool_call_id: tc.id,
     output: `denied: ${resolution.reason}`,
@@ -255,6 +277,7 @@ export async function executeToolPhase(
// pause. The grant endpoint re-derives the root at decision time // pause. The grant endpoint re-derives the root at decision time
// (state may have changed in the meantime) so we don't stash it here. // (state may have changed in the meantime) so we don't stash it here.
pausingForUserInput = true; pausingForUserInput = true;
outcomes.push('success');
const sentinel = { tool_call_id: tc.id, output: null, truncated: false }; const sentinel = { tool_call_id: tc.id, output: null, truncated: false };
// v1.13.20: parts-only write. // v1.13.20: parts-only write.
await insertParts( await insertParts(
@@ -267,6 +290,10 @@ export async function executeToolPhase(
return; return;
} }
if (agent && !matchToolGlob(tc.name, agent.tools)) { if (agent && !matchToolGlob(tc.name, agent.tools)) {
// Agent-scope denial — the model called a tool outside its whitelist.
// permission_denied for the mistake tracker (the model should pick a
// tool it's actually allowed to use).
outcomes.push('permission_denied');
const stored = { const stored = {
tool_call_id: tc.id, tool_call_id: tc.id,
output: null, output: null,
@@ -295,6 +322,10 @@ export async function executeToolPhase(
sql: ctx.sql, sql: ctx.sql,
sessionId, sessionId,
}); });
// v#12 MistakeTracker: record the real execution outcome (success or a
// FailureKind). This is the primary signal for heterogeneous-failure
// detection.
outcomes.push(tres.outcome);
if (SYNTHESIS_TOOLS.has(tc.name)) { if (SYNTHESIS_TOOLS.has(tc.name)) {
synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) }); synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) });
} }
@@ -340,6 +371,7 @@ export async function executeToolPhase(
toolCallCount: toolCalls.length, toolCallCount: toolCalls.length,
toolCalls, toolCalls,
nextAssistantId: null, nextAssistantId: null,
outcomes,
}; };
} }
@@ -378,6 +410,7 @@ export async function executeToolPhase(
toolCallCount: toolCalls.length, toolCallCount: toolCalls.length,
toolCalls, toolCalls,
nextAssistantId: null, nextAssistantId: null,
outcomes,
}; };
} }
// ran === false → synthesis failed (timeout / model error) → fall through // ran === false → synthesis failed (timeout / model error) → fall through
@@ -397,5 +430,6 @@ export async function executeToolPhase(
toolCallCount: toolCalls.length, toolCallCount: toolCalls.length,
toolCalls, toolCalls,
nextAssistantId: nextAssistant!.id, nextAssistantId: nextAssistant!.id,
outcomes,
}; };
} }


@@ -22,6 +22,13 @@ import { resolveToolBudget } from './budget.js';
import { import {
detectDoomLoop, detectDoomLoop,
} from './sentinels.js'; } from './sentinels.js';
import {
detectMistakePattern,
freshMistakeState,
recordStep,
MISTAKE_RECOVERY_NOTE,
type MistakeState,
} from './mistake-tracker.js';
import { import {
buildMessagesPayload, buildMessagesPayload,
loadContext, loadContext,
@@ -39,6 +46,7 @@ import {
runCapHitSummary, runCapHitSummary,
runDoomLoopSummary, runDoomLoopSummary,
runStepCapSummary, runStepCapSummary,
insertMistakeRecoverySentinel,
} from './sentinel-summaries.js'; } from './sentinel-summaries.js';
// v1.14.0: hard ceiling on the number of stream-and-tool iterations per // v1.14.0: hard ceiling on the number of stream-and-tool iterations per
@@ -144,6 +152,16 @@ export interface TurnArgs {
// boundaries by runInference, same as toolsUsed. Doom-loop check at the // boundaries by runInference, same as toolsUsed. Doom-loop check at the
// top of runAssistantTurn slices the last DOOM_LOOP_THRESHOLD entries. // top of runAssistantTurn slices the last DOOM_LOOP_THRESHOLD entries.
recentToolCalls: ToolCall[]; recentToolCalls: ToolCall[];
// v#12 MistakeTracker: heterogeneous-failure recovery state. Loop-local,
// reset per runInference (user-message boundary) like recentToolCalls. Folds
// tool-phase outcomes via recordStep each iteration; detectMistakePattern
// gates the nudge/escalate decision.
mistakeTracker: MistakeState;
// v#12: transient model-facing recovery note set when a nudge fires. Consumed
// (appended as a role:'system' message + cleared) on the NEXT payload build.
// Never persisted — mirrors how the cap-hit/doom-loop notes live only inside
// the summary call's messages array.
pendingRecoveryNote?: string;
signal: AbortSignal | undefined; signal: AbortSignal | undefined;
} }
@@ -188,6 +206,12 @@ export async function runAssistantTurn(
let toolsUsed = args.toolsUsed; let toolsUsed = args.toolsUsed;
let recentToolCalls = args.recentToolCalls; let recentToolCalls = args.recentToolCalls;
let assistantMessageId = args.assistantMessageId; let assistantMessageId = args.assistantMessageId;
// v#12 MistakeTracker: the tracker state is carried on `args` (mutated in
// place by recordStep). pendingRecoveryNote is a loop-local because it is a
// single-step transient — set when a nudge fires, consumed (injected into the
// next payload) and cleared on the following iteration.
const mistakeTracker = args.mistakeTracker;
let pendingRecoveryNote: string | undefined = args.pendingRecoveryNote;
while (stepNumber < effectiveCap) { while (stepNumber < effectiveCap) {
// ---- doom-loop check (moved from top-of-function) ---- // ---- doom-loop check (moved from top-of-function) ----
@@ -196,7 +220,7 @@ export async function runAssistantTurn(
 // Need fresh history for the summary.
 const loaded = await loadContext(ctx.sql, sessionId, chatId);
 if (loaded) {
-  const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, signal };
+  const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, mistakeTracker, signal };
   await runDoomLoopSummary(ctx, iterArgs, loaded.session, loaded.project, loaded.history, agent, loop);
 }
 break;
@@ -206,7 +230,7 @@ export async function runAssistantTurn(
 if (toolsUsed >= budget) {
   const loaded = await loadContext(ctx.sql, sessionId, chatId);
   if (loaded) {
-    const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, signal };
+    const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, mistakeTracker, signal };
     await runCapHitSummary(ctx, iterArgs, loaded.session, loaded.project, loaded.history, agent, budget);
   }
   break;
@@ -265,7 +289,16 @@ export async function runAssistantTurn(
 }
 }
-const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, signal };
+// v#12 MistakeTracker: if the prior iteration's nudge fired, append the
+// transient recovery note to THIS payload (consumed exactly once, then
+// cleared). Never persisted — same lifecycle as the cap-hit/doom-loop
+// summary notes, which live only inside the in-memory messages array.
+if (pendingRecoveryNote) {
+  messages.push({ role: 'system', content: pendingRecoveryNote });
+  pendingRecoveryNote = undefined;
+}
+const iterArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, mistakeTracker, signal };
 const state: StreamPhaseState = { accumulated: '', startedAt: null };
 let result: StreamResult;
 try {
@@ -305,10 +338,78 @@ export async function runAssistantTurn(
 recentToolCalls = [...recentToolCalls, ...toolPhaseResult.toolCalls];
 stepNumber++;
+// v#12 MistakeTracker: fold this iteration's tool outcomes into the
+// tracker, in order. recordStep mutates `mistakeTracker` in place (it is
+// the same object referenced by args). A 'success' clears the streak.
+for (const o of toolPhaseResult.outcomes) {
+  recordStep(mistakeTracker, o);
+}
 if (toolPhaseResult.action !== 'continue') {
-  // 'paused' (user input) or 'synthesis_done' — stop the loop.
+  // 'paused' (user input) or 'synthesis_done' — stop the loop. The turn is
+  // already ending, so neither a nudge nor an escalate would change the
+  // control flow; we skip the mistake decision here.
   break;
 }
+// v#12 MistakeTracker: heterogeneous-failure decision. Only evaluated on
+// the 'continue' path (the only case where the loop would otherwise
+// proceed to another step). Complements the doom-loop check above, which
+// only catches *identical* repeats.
+const mistake = detectMistakePattern(mistakeTracker);
+if (mistake === 'nudge') {
+  // Soft intervention: inject model-facing recovery guidance into the NEXT
+  // step's payload, drop a UI sentinel, bump nudges, reset the streak, and
+  // continue. The note is consumed (and cleared) at the top of the next
+  // iteration's payload build.
+  pendingRecoveryNote = MISTAKE_RECOVERY_NOTE;
+  const failureKinds = [...mistakeTracker.run];
+  await insertMistakeRecoverySentinel(ctx, sessionId, chatId, {
+    failureKinds,
+    count: failureKinds.length,
+    escalated: false,
+    canContinue: true,
+  });
+  mistakeTracker.nudges += 1;
+  mistakeTracker.run = [];
+  ctx.log.info(
+    { sessionId, chatId, step: stepNumber, nudges: mistakeTracker.nudges, failureKinds },
+    'mistake_recovery nudge',
+  );
+  assistantMessageId = toolPhaseResult.nextAssistantId!;
+  continue;
+}
+if (mistake === 'escalate') {
+  // The nudge didn't break the failure run — stop the turn (cap-hit-style)
+  // to avoid burning the whole step budget on heterogeneous failures. The
+  // next assistant row is still 'streaming'; finalize it as a short note so
+  // the slot doesn't dangle, then drop the escalate sentinel.
+  const failureKinds = [...mistakeTracker.run];
+  assistantMessageId = toolPhaseResult.nextAssistantId!;
+  await ctx.sql`
+    UPDATE messages
+    SET content = '', status = 'complete', finished_at = clock_timestamp()
+    WHERE id = ${assistantMessageId}
+  `;
+  ctx.publish(sessionId, {
+    type: 'message_complete',
+    message_id: assistantMessageId,
+    chat_id: chatId,
+  });
+  await insertMistakeRecoverySentinel(ctx, sessionId, chatId, {
+    failureKinds,
+    count: failureKinds.length,
+    escalated: true,
+    canContinue: true,
+  });
+  ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
+  ctx.log.info(
+    { sessionId, chatId, step: stepNumber, failureKinds },
+    'mistake_recovery escalate — stopping turn',
+  );
+  break;
+}
 // 'continue' — advance to next assistant message.
 assistantMessageId = toolPhaseResult.nextAssistantId!;
 }
@@ -320,7 +421,7 @@ export async function runAssistantTurn(
 if (stepNumber >= effectiveCap && effectiveCap < Infinity) {
   const loaded = await loadContext(ctx.sql, sessionId, chatId);
   if (loaded) {
-    const capArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, signal };
+    const capArgs: TurnArgs = { sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, mistakeTracker, signal };
     await runStepCapSummary(ctx, capArgs, loaded.session, loaded.project, loaded.history, agent, stepNumber, effectiveCap);
   }
 }
@@ -378,12 +479,16 @@ export async function runInference(
// per-call budget. // per-call budget.
// v1.11.6: recentToolCalls also resets — doom-loop detection is scoped // v1.11.6: recentToolCalls also resets — doom-loop detection is scoped
// to a single user-message turn, so a Continue starts with no history. // to a single user-message turn, so a Continue starts with no history.
// v#12 MistakeTracker: fresh per user-message turn, like recentToolCalls.
// Tracks consecutive heterogeneous tool failures across the loop's
// stream-and-tool iterations within this turn.
return runAssistantTurn(ctx, { return runAssistantTurn(ctx, {
sessionId, sessionId,
chatId, chatId,
assistantMessageId, assistantMessageId,
toolsUsed: 0, toolsUsed: 0,
recentToolCalls: [], recentToolCalls: [],
mistakeTracker: freshMistakeState(),
signal, signal,
}); });
} }
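
The `./mistake-tracker.js` module itself does not appear in this part of the diff. The sketch below shows one way its helpers could behave, inferred only from the call sites above (`recordStep` mutates in place, a `'success'` clears the streak, the caller bumps `nudges` and resets `run` after a nudge). The `FailureKind` members other than `permission_denied`, the `'none'` return value, the `STREAK_THRESHOLD` value, and the function bodies are all assumptions, not the committed implementation.

```typescript
// Hypothetical sketch; the real './mistake-tracker.js' is not shown in this diff.
// Only 'permission_denied' is confirmed by the diff; the other kinds are placeholders.
type FailureKind = 'permission_denied' | 'tool_error' | 'invalid_args';
type Outcome = FailureKind | 'success';
type MistakeDecision = 'none' | 'nudge' | 'escalate'; // 'none' is an assumed name

interface MistakeState {
  run: FailureKind[]; // current streak of consecutive failures
  nudges: number; // nudges already issued during this user-message turn
}

const STREAK_THRESHOLD = 3; // assumed value, not taken from the source

function freshMistakeState(): MistakeState {
  return { run: [], nudges: 0 };
}

// Mutates `state` in place: a success clears the streak, a failure extends it.
function recordStep(state: MistakeState, outcome: Outcome): void {
  if (outcome === 'success') {
    state.run = [];
  } else {
    state.run.push(outcome);
  }
}

// First long streak asks for a nudge; a streak after a nudge asks to escalate.
function detectMistakePattern(state: MistakeState): MistakeDecision {
  if (state.run.length < STREAK_THRESHOLD) return 'none';
  return state.nudges === 0 ? 'nudge' : 'escalate';
}
```

Under these assumptions, three consecutive failures yield `'nudge'`; once the caller increments `nudges` and clears `run`, a second streak of three yields `'escalate'`, which matches the nudge-then-stop flow in `runAssistantTurn`.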


@@ -4,6 +4,12 @@
* Reads a JSON config file (default `/data/mcp.json`) that declares MCP * Reads a JSON config file (default `/data/mcp.json`) that declares MCP
* servers — their transport type, connection parameters, and enabled state. * servers — their transport type, connection parameters, and enabled state.
* Schema shape matches opencode's `mcpServers` key for copy-paste compat. * Schema shape matches opencode's `mcpServers` key for copy-paste compat.
*
* Secrets stay out of the config file via `{env:VAR}` substitution
* (opencode-compatible). Any string value can reference an environment
* variable, e.g. a header `"CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"`
* resolves from `process.env` at load. This keeps real keys in `.env`
* (`env_file` in docker-compose) rather than the gitignored config.
*/ */
import { readFileSync } from 'node:fs'; import { readFileSync } from 'node:fs';
import { z } from 'zod'; import { z } from 'zod';
@@ -38,6 +44,49 @@ export interface McpServerEntry {
config: McpServerConfig; config: McpServerConfig;
} }
// ---- Env-var substitution ----
const ENV_VAR_PATTERN = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;
/**
* Recursively replace `{env:VAR}` references in string values with the
* matching environment variable (opencode-compatible). Runs before Zod
* validation so a resolved value (e.g. a `{env:...}` URL) still validates.
* An unset var resolves to '' and logs a warning so a missing secret is
* visible in the boot log rather than silently sending a literal placeholder.
* Pass an optional `unsetVars` set to collect the names that resolved to '';
* the loader surfaces them on a validation failure (an empty value in a strict
* url/command field invalidates the whole config — see loadMcpConfig).
*/
export function substituteEnvVars(
value: unknown,
log: FastifyBaseLogger,
unsetVars?: Set<string>,
): unknown {
if (typeof value === 'string') {
return value.replace(ENV_VAR_PATTERN, (_match, name: string) => {
const resolved = process.env[name];
if (resolved === undefined) {
unsetVars?.add(name);
log.warn(`mcp: env var ${name} referenced in config is unset; substituting empty string`);
return '';
}
return resolved;
});
}
if (Array.isArray(value)) {
return value.map((v) => substituteEnvVars(v, log, unsetVars));
}
if (value && typeof value === 'object') {
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
out[k] = substituteEnvVars(v, log, unsetVars);
}
return out;
}
return value;
}
// ---- Loader ---- // ---- Loader ----
/** /**
@@ -61,9 +110,19 @@ export function loadMcpConfig(configPath: string, log: FastifyBaseLogger): McpSe
 return [];
 }
-const result = McpConfigSchema.safeParse(json);
+const unsetVars = new Set<string>();
+const result = McpConfigSchema.safeParse(substituteEnvVars(json, log, unsetVars));
 if (!result.success) {
-  log.warn({ errors: result.error.flatten().fieldErrors }, `mcp: invalid config at ${configPath}`);
+  // Connect the two otherwise-disconnected warnings: an unset {env:VAR} that
+  // resolved to '' can invalidate a strict field (url/command) and drop the
+  // whole config, so name the unset vars alongside the validation errors.
+  // The hint starts with '; ' so it does not run into the config path.
+  const hint = unsetVars.size
+    ? `; ${unsetVars.size} referenced env var(s) unset & substituted with '' (${[...unsetVars].join(', ')}); an unset {env:VAR} in a url/command field invalidates the whole config`
+    : '';
+  log.warn(
+    { errors: result.error.flatten().fieldErrors, unsetEnvVars: [...unsetVars] },
+    `mcp: invalid config at ${configPath}${hint}`,
+  );
   return [];
 }
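
To illustrate the `{env:VAR}` substitution described above, here is a simplified, self-contained sketch. It is not the committed `substituteEnvVars`: it takes the environment as a parameter instead of reading `process.env`, drops the logger, and the `CONTEXT7_URL` variable name and the sample server entry are made up for the example.

```typescript
// Simplified sketch of the substitution pass; `env` stands in for process.env.
const ENV_VAR_PATTERN = /\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g;

type Env = Record<string, string | undefined>;

function substitute(value: unknown, env: Env, unsetNames: Set<string>): unknown {
  if (typeof value === 'string') {
    return value.replace(ENV_VAR_PATTERN, (_match: string, name: string) => {
      const found = env[name];
      if (found === undefined) {
        unsetNames.add(name); // remember the name so the caller can report it
        return '';
      }
      return found;
    });
  }
  if (Array.isArray(value)) {
    return value.map((v) => substitute(v, env, unsetNames));
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      out[key] = substitute((value as Record<string, unknown>)[key], env, unsetNames);
    }
    return out;
  }
  return value; // numbers, booleans, null pass through untouched
}

// Hypothetical config fragment: one variable is set, one is deliberately missing.
const unset = new Set<string>();
const resolved = substitute(
  {
    mcpServers: {
      context7: {
        url: '{env:CONTEXT7_URL}',
        headers: { CONTEXT7_API_KEY: '{env:CONTEXT7_API_KEY}' },
      },
    },
  },
  { CONTEXT7_API_KEY: 'secret-key' },
  unset,
);
```

Here the header resolves to the secret while `url` collapses to an empty string, which is the case the loader's hint is meant to explain: a strict `url` field would then fail validation, and `unset` names the variable responsible.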


@@ -25,19 +25,9 @@ export interface AvailableProject {
 export type SessionStatus = 'open' | 'archived';
-// Session-delete work-loss guard. Returned (as `reports`) in the 409 body when
-// a delete is blocked because the session's worktree holds work at risk. The
-// shape is produced by BooCoder's checkWorktreeWorkAtRisk and passed through
-// verbatim; mirrored byte-for-byte in apps/web/src/api/types.ts for the dialog.
-export interface WorktreeRiskReport {
-  worktreePath: string;
-  branch: string;
-  dirty: boolean;
-  unpushed: number; // commits ahead of upstream, or -1 if no upstream
-  unmerged: number; // commits not in the project default branch
-  atRisk: boolean;
-  error?: string;
-}
+// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
+import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
+export type { WorktreeRiskReport };
 export interface Session {
   id: string;
@@ -117,6 +107,15 @@ export interface Agent {
top_k: number | null; // null means omit from request body top_k: number | null; // null means omit from request body
min_p: number | null; // null means omit from request body min_p: number | null; // null means omit from request body
presence_penalty: number | null; // null means omit from request body presence_penalty: number | null; // null means omit from request body
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions.
// null = omit from request body. top_n_sigma + the DRY repetition family
// help the doom-loop-prone local model. All travel via the same
// providerOptions.openaiCompatible extraBody channel as top_k/min_p.
top_n_sigma: number | null;
dry_multiplier: number | null;
dry_base: number | null;
dry_allowed_length: number | null;
dry_penalty_last_n: number | null;
tools: string[]; // whitelist of tool names; empty = no tools allowed tools: string[]; // whitelist of tool names; empty = no tools allowed
model: string | null; // null means "session.model wins" model: string | null; // null means "session.model wins"
source: AgentSource; source: AgentSource;
@@ -189,38 +188,10 @@ export interface ToolResult {
   error?: string;
 }
-// v1.8.2: structured reason codes for failed inferences. `error` carries the
-// human text; `reason` is the machine-readable discriminator the UI matches
-// on (with `error` as fallback when reason is absent or unrecognized).
-export type ErrorReason =
-  | 'llm_provider_error'
-  | 'tool_execution_failed'
-  | 'summary_after_cap_failed';
-// v1.8.2 / v1.11.6: shapes stored in messages.metadata. Discriminated on `kind`.
-// cap_hit — system sentinel emitted when tool budget is exhausted
-// doom_loop — system sentinel emitted when the model called the same
-// tool with the same args DOOM_LOOP_THRESHOLD times in a row
-// error — attached to a failed assistant message so UI can show reason
-export type MessageMetadata =
-  | {
-      kind: 'cap_hit';
-      used: number;
-      limit: number;
-      agent_name: string | null;
-      can_continue: boolean;
-    }
-  | {
-      kind: 'doom_loop';
-      tool_name: string;
-      args: Record<string, unknown>;
-      threshold: number;
-    }
-  | {
-      kind: 'error';
-      error_reason: ErrorReason;
-      error_text: string;
-    };
+// v1.8.2 / v1.11.6: ErrorReason + MessageMetadata single-sourced in
+// @boocode/contracts — edit the package, not here.
+import type { ErrorReason, MessageMetadata } from '@boocode/contracts/message-metadata';
+export type { ErrorReason, MessageMetadata };
 export interface Message {
   id: string;

apps/web/CLAUDE.md

@@ -0,0 +1,47 @@
# apps/web — BooChat frontend (deep reference)
> Per-app engineering notes for `apps/web/src/`. The frontend is a single React SPA that also hosts the BooCoder `'coder'` pane. Cross-cutting commands, database, environment, workflow, and cross-app contracts (WS-frame / provider-type parity, sentinels) live in the **root `CLAUDE.md`**. This file auto-loads when you read/edit files under `apps/web/`.
## Stack
- **React 18** + React Router v6 + **Tailwind v4** + shadcn/radix-ui primitives.
- **Shiki** for syntax highlighting (async `codeToHtml` in `CodeBlock.tsx` and `FileViewer` in `FileBrowserPane.tsx`).
- Path alias: `@/` maps to `src/`.
- **Mobile interaction primitives**: `useViewport` (matchMedia; mobile <768 / tablet 768–1023 / desktop ≥1024), `useSidebarDrawer` / `useRightRailDrawer` (Context + auto-close on `useLocation().pathname` change), `useLongPress` (500ms timer, synthetic `contextmenu` on `[data-tab-id]`), `usePullToRefresh` (80px threshold, 600ms hold), `SwipeablePaneTab` (60px close, 30px vertical bail). Tap-target convention: `max-md:min-h-[44px] max-md:min-w-[44px]`. Mobile headers: `border-b px-3 sm:px-4 py-2` + `paddingTop: 'max(0.5rem, env(safe-area-inset-top))'`. Hamburger left, FolderTree right.
## Key patterns
- **`hooks/sessionEvents.ts`** — Module-singleton event bus (Set of listeners) for cross-component communication: session renames, file-open, attachment dispatch. 26-arm discriminated union (and growing). Adding an event type also requires a `case` in the `applyEvent` switch in `useSidebar.ts` (no-op `return prev` is fine), and a subscribe in any hook that needs it (e.g. `useSessionStream` for `refetch_messages`).
- **`hooks/useSessionStream.ts`** — WebSocket per session; `applyFrame` reducer builds the message list from streaming frames.
- **`hooks/useUserEvents.ts`** — Single app-level WS to `/api/ws/user` with exponential-backoff reconnect. Forwards frames onto the sessionEvents bus.
- **`hooks/useSidebar.ts`** — Module-singleton with `Set<setState>` subscriber pattern; one bus subscription guarded by `globalThis.__boocode_sidebar_subscribed` for HMR safety. Every new `SessionEvent` type needs a `case` in `applyEvent`.
- **`api/client.ts`** — Centralized typed fetch wrapper. All endpoints under `api.*`.
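
The module-singleton bus and the `applyEvent` reducer described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `hooks/sessionEvents.ts` or `useSidebar.ts`: the two-arm `SessionEvent` union, the `session_renamed` arm and its fields, and the `SidebarState` shape are invented for the example (`refetch_messages` is the only event name taken from the notes).

```typescript
// Hypothetical two-arm union; the real SessionEvent has many more arms.
type SessionEvent =
  | { type: 'session_renamed'; sessionId: string; name: string }
  | { type: 'refetch_messages'; sessionId: string };

type Listener = (event: SessionEvent) => void;

// Module-level singleton: every importer shares this one Set.
const listeners = new Set<Listener>();

function subscribe(listener: Listener): () => void {
  listeners.add(listener);
  // Return an unsubscribe function, convenient as a useEffect cleanup.
  return () => {
    listeners.delete(listener);
  };
}

function emit(event: SessionEvent): void {
  listeners.forEach((listener) => listener(event));
}

// Assumed sidebar state shape, for illustration only.
interface SidebarState {
  names: Record<string, string>;
}

// Every event type gets a case; events the sidebar ignores return `prev`.
function applyEvent(prev: SidebarState, event: SessionEvent): SidebarState {
  switch (event.type) {
    case 'session_renamed':
      return { names: { ...prev.names, [event.sessionId]: event.name } };
    case 'refetch_messages':
      return prev; // no-op arm: the sidebar does not care about this event
    default:
      return prev;
  }
}
```

Returning `prev` unchanged from a no-op arm keeps the state reference stable, so a subscriber that compares by identity can skip a re-render.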
## Font / CSS pipeline
- Tailwind v4's `@import "tailwindcss"` strips font URLs from subsequent CSS `@import`s — `@fontsource*` packages must be JS side-effect imports in `apps/web/src/main.tsx`, not `@import` in `globals.css`, or the woff2 files never reach `dist/`.
- Lightning CSS (inside `@tailwindcss/postcss` v4) collapses contiguous unicode-ranges to wildcard shorthand (`U+0000-FFFF` → `U+????`), which iOS Safari/Vivaldi mishandles (silently drops the font for those codepoints). Use explicit non-collapsible subranges (e.g. `U+2500-259F`, not `U+2500-25FF`). The `apps/web` build script greps `dist/assets/*.css` for `U+2500-259F` and fails the build if missing — preserve that guard.
- `@font-face` blocks must live AFTER all `@import` statements (CSS spec). Earlier placement silently breaks every subsequent `@import`.
- JetBrainsMono Nerd Font self-hosted in `apps/web/src/fonts/` (TTF from ryanoasis/nerd-fonts) — `@fontsource-variable/jetbrains-mono` ships subsetted woff2s that don't cover `U+2500-259F` (box drawing/block elements, used by opencode's banner). "NL" = No Ligatures (matches `font-feature-settings: "liga" 0`); "Mono" = single-cell icon width so TUI layouts don't desync.
- xterm-addon-webgl rasterizes glyphs via Canvas2D into a GPU atlas; Canvas2D does NOT honor `font-display: block` — it uses whatever font is registered. Gate xterm init on `document.fonts.load(<font-name>)` resolving before `term.open()` (`fontsReady` in `TerminalPane.tsx`). iOS Safari/Vivaldi also reclaim WebGL contexts from backgrounded tabs: keep `webgl.onContextLoss(() => webgl.dispose())` + recreate via visibilitychange. Do NOT manually dispose+recreate the addon after font load — iOS silently fails the second GL context creation and drops to DOM renderer with stale metrics.
## Multi-pane workspace
Sessions hold 1–5 panes (chat / empty / placeholder terminal+agent). Pane state lives in `sessions.workspace_panes jsonb` for cross-device sync. `PATCH /api/sessions/:id/workspace` persists; `session_workspace_updated` user-channel frame broadcasts to every device. `useWorkspacePanes` debounces saves 300ms and dedups echoes by JSON string (legacy localStorage key seeded once on first hydrate, then no longer written). `validatePanes(validChatIds)` prunes panes referencing deleted chats. Each chat lives in at most one pane; the per-pane tab strip tracks `chatIds[]` + `activeChatIdx`, reorder via native HTML5 drag. `workspace_panes` is a `WorkspaceState` envelope `{panes, tabNumbers, nextTabNumber, closedPaneStack}` (tabNumbers = stable session-scoped chatId→number, never reused; closedPaneStack = reopen LIFO, max 10, persisted); hydrate (`toWorkspaceState`) and the server PATCH validator (`z.union([array, envelope])`) both accept the legacy bare array and normalize. Closing a chat pane relocates its tabs to the oldest chat/empty pane; `reopenPane` strips restored chatIds from all live panes first. `read_tab_by_number` resolves number→chatId through `tabNumbers`.
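
As a rough illustration of the legacy-array normalization and the number→chatId lookup described above, consider the sketch below. The `Pane` fields beyond `chatIds` and `activeChatIdx`, the way a legacy array is assigned tab numbers (in encounter order, starting at 1), and the `readTabByNumber` helper are assumptions for the example, not the actual `toWorkspaceState` or `read_tab_by_number` code.

```typescript
// Assumed shapes; only the envelope's four field names come from the notes.
interface Pane {
  id: string;
  chatIds: string[];
  activeChatIdx: number;
}

interface WorkspaceState {
  panes: Pane[];
  tabNumbers: Record<string, number>; // chatId -> stable tab number
  nextTabNumber: number;
  closedPaneStack: Pane[];
}

// Accept either the legacy bare array or the envelope and return an envelope.
function toWorkspaceState(raw: Pane[] | WorkspaceState): WorkspaceState {
  if (!Array.isArray(raw)) return raw;
  const tabNumbers: Record<string, number> = {};
  let next = 1; // assumed starting number
  for (const pane of raw) {
    for (const chatId of pane.chatIds) {
      if (tabNumbers[chatId] === undefined) {
        tabNumbers[chatId] = next;
        next += 1;
      }
    }
  }
  return { panes: raw, tabNumbers, nextTabNumber: next, closedPaneStack: [] };
}

// Resolve a tab number back to its chatId, or undefined if none matches.
function readTabByNumber(state: WorkspaceState, tabNumber: number): string | undefined {
  return Object.keys(state.tabNumbers).find((chatId) => state.tabNumbers[chatId] === tabNumber);
}
```

Keeping `nextTabNumber` in the envelope, rather than deriving it from the largest live number, is what lets numbers stay unique after a chat is deleted.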
## Frontend conventions
- `overflowWrap` not `wordWrap` — TypeScript's CSSStyleDeclaration marks `wordWrap` deprecated (error 6385).
- shadcn primitives live in `components/ui/`. Don't modify them unless adding a new primitive.
- `ui/` primitives present: button, card, context-menu, dialog, dropdown-menu, input, label, radio-group, sonner, textarea. No switch/sheet/drawer/badge/checkbox — use a `<button role="switch" aria-checked>` toggle (a hand-rolled `Switch` lives in `SettingsPane.tsx`) and a Dialog-based panel for "drawers".
- `inferLanguage()` from `lib/attachments.ts` is the canonical file-extension→language map. `CodeBlock.tsx` keeps its own `LANG_MAP` because it also resolves markdown fence names.
- Two UI event buses: `hooks/sessionEvents.ts` for DB-state events (chat_created, session_updated); `lib/events.ts` for ephemeral UI (`sendToTerminal`, `terminalsRegistry`). Don't merge — different subscriber lifecycles.
- `vite.config.ts` proxy entries are order-sensitive: more-specific prefixes (`/api/term`, `/ws/term`) must come BEFORE `/api`.
- Mobile pane URL sync (`Session.tsx`): the `?pane=<id>` effect resets `activePaneIdx` whenever `panes` changes. New-pane creation on mobile must push `?pane=` atomically — `addPaneAndSwitch` does this; `addSplitPane` returns the new pane id.
- A scrollable list inside a Dialog on mobile: cap `DialogContent` (`max-h-[85vh]` + `grid-rows-[auto_minmax(0,1fr)_auto]`) and make the list the single scroll region with `overscroll-contain` — otherwise touch-scroll drags the whole fixed modal / chains to the page.
- xterm.js v5 uses canvas rendering — the browser doesn't see xterm's selection, so the native right-click Copy doesn't work for terminal text. App keybindings (`Cmd/Ctrl-C`, `Cmd/Ctrl-Shift-C`) are the path.
- React **StrictMode is on** (`main.tsx`): an updater passed to one `setState` that itself calls another `setState` (e.g. `setClosedPaneStack` inside a `setPanes` updater) is double-invoked in dev. Make such nested updates idempotent — `useWorkspacePanes`'s `appendClosed` dedupes a value-identical top entry for this reason.
- **CoderPane uses ChatInput** (`components/panes/CoderPane.tsx`): shares BooChat's `ChatInput` for full parity — attachments, paste-to-chip, auto-grow textarea, queued messages during send. `sendOneMessage` is the send callback; queued messages drain via `useEffect` when `sending` goes false.
- **AgentComposerBar filters `e.installed`**: provider snapshot entries with `installed:false` (loading/unavailable) are dropped from the dropdown. `getProviderSnapshot` must await the full build — returning synchronous `loading` placeholders makes every provider vanish; surfacing loading states needs a client poll.
- **Pane header architecture (mobile vs desktop)**: desktop coder pane header (BooCode label + [+] [×]) lives in `Workspace.tsx` gated by `isCoder && !isMobile`. Mobile coder controls (● ×) live in `Session.tsx` next to `MobileTabSwitcher`/`NewPaneMenu`. `AgentComposerBar` (provider/mode/model pickers) renders inside `CoderPane.tsx` on both; the ● status dot is passed via `connected` prop.
- **MessageBubble shared between BooChat and BooCoder** (`components/MessageBubble.tsx`): optional `actions?: MessageActions` + `hideActions?` props; CoderPane overrides via `CoderMessageList`. **`CoderMessageList` passes `CoderMessageWire as unknown as Message`** — the coder shape lacks `metadata`/`kind`/`summary`, so they're `undefined` (not `null`). Null-guards on any `Message` field MUST use loose `!= null`, not `!== null` (`undefined !== null` is `true` → `.kind` throws → blank-screen crash). The cast hides this from tsc; build passes while runtime crashes.
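
The `!= null` rule in the last bullet can be demonstrated with a small, self-contained sketch. The `Message` and `CoderMessageWire` types here are cut-down stand-ins for the real ones, and the two helper functions are hypothetical.

```typescript
// Cut-down stand-ins for the real types.
interface Message {
  id: string;
  metadata: { kind: string } | null;
}

interface CoderMessageWire {
  id: string; // note: no `metadata` field at all
}

const coderMessage: CoderMessageWire = { id: 'm1' };
// The double cast compiles, but `metadata` is `undefined` at runtime, not `null`.
const asMessage = coderMessage as unknown as Message;

// Strict guard: `undefined !== null` is true, so this dereferences undefined.
function kindStrict(m: Message): string | null {
  return m.metadata !== null ? m.metadata.kind : null;
}

// Loose guard: `!= null` rejects both null and undefined.
function kindLoose(m: Message): string | null {
  return m.metadata != null ? m.metadata.kind : null;
}
```

Both functions type-check identically, which is why the compiler cannot catch the difference; only `kindLoose` is safe to call with `asMessage`, while `kindStrict` throws a `TypeError` at runtime.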


@@ -10,6 +10,7 @@
"typecheck": "tsc -b --noEmit" "typecheck": "tsc -b --noEmit"
}, },
"dependencies": { "dependencies": {
"@boocode/contracts": "workspace:*",
"@fontsource-variable/inter": "^5.2.8", "@fontsource-variable/inter": "^5.2.8",
"@fontsource-variable/jetbrains-mono": "^5.2.8", "@fontsource-variable/jetbrains-mono": "^5.2.8",
"@xterm/addon-fit": "0.10.0", "@xterm/addon-fit": "0.10.0",


@@ -34,6 +34,12 @@ export interface AgentSessionInfo {
status: string; status: string;
has_session: boolean; has_session: boolean;
last_active_at: string | null; last_active_at: string | null;
// v2.6.8 per-(chat,agent) running token/cost totals (sampling-streamjson-tokens
// #8). input_tokens/output_tokens are BIGINT and may arrive as strings; cost is
// DOUBLE. AgentComposerBar coerces with Number(...) before rendering.
input_tokens: number;
output_tokens: number;
cost: number;
} }
// write-edit-robustness #4: a pre-turn worktree snapshot anchored to an // write-edit-robustness #4: a pre-turn worktree snapshot anchored to an


@@ -34,18 +34,8 @@ export interface AvailableProject {
 export type SessionStatus = 'open' | 'archived';
-// Session-delete work-loss guard. Mirror of WorktreeRiskReport in
-// apps/server/src/types/api.ts — edit both copies together. Arrives as the
-// `reports` field of the 409 body when a delete is blocked.
-export interface WorktreeRiskReport {
-  worktreePath: string;
-  branch: string;
-  dirty: boolean;
-  unpushed: number; // commits ahead of upstream, or -1 if no upstream
-  unmerged: number; // commits not in the project default branch
-  atRisk: boolean;
-  error?: string;
-}
+// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
+export type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
 export interface Session {
   id: string;
@@ -143,39 +133,10 @@ export interface ToolResult {
   error?: string;
 }
-// v1.8.2: structured reason codes that flow through error frames / metadata.
-// `error` text stays human; `reason` is the discriminator the UI matches on.
-export type ErrorReason =
-  | 'llm_provider_error'
-  | 'tool_execution_failed'
-  | 'summary_after_cap_failed';
-// v1.8.2 / v1.11.6: shapes stored in Message.metadata. Discriminated on `kind`.
-// cap_hit — sentinel emitted when the tool budget is hit; carries the
-// budget + agent name + whether Continue is still allowed.
-// doom_loop — sentinel emitted when the model called the same tool with
-// the same arguments threshold times in a row.
-// error — attached to a failed assistant message so the bubble can show
-// a specific reason on reload (WS error frame is one-shot).
-export type MessageMetadata =
-  | {
-      kind: 'cap_hit';
-      used: number;
-      limit: number;
-      agent_name: string | null;
-      can_continue: boolean;
-    }
-  | {
-      kind: 'doom_loop';
-      tool_name: string;
-      args: Record<string, unknown>;
-      threshold: number;
-    }
-  | {
-      kind: 'error';
-      error_reason: ErrorReason;
-      error_text: string;
-    };
+// v1.8.2 / v1.11.6: ErrorReason + MessageMetadata single-sourced in
+// @boocode/contracts — edit the package, not here.
+import type { ErrorReason, MessageMetadata } from '@boocode/contracts/message-metadata';
+export type { ErrorReason, MessageMetadata };
 export interface Message {
   id: string;
@@ -191,6 +152,9 @@ export interface Message {
tokens_used: number | null; tokens_used: number | null;
ctx_used: number | null; ctx_used: number | null;
ctx_max: number | null; ctx_max: number | null;
// model-attribution: which model produced this assistant message (null for
// user/system rows + pre-attribution messages). Rendered as a chip.
model: string | null;
started_at: string | null; started_at: string | null;
finished_at: string | null; finished_at: string | null;
created_at: string; created_at: string;
@@ -226,80 +190,23 @@ export interface ModelInfo {
[key: string]: unknown; [key: string]: unknown;
} }
export interface ProviderModel { export type {
id: string; ProviderModel,
label: string; ProviderMode,
description?: string; ThinkingOption,
isDefault?: boolean; ProviderSnapshotStatus,
thinkingOptions?: ThinkingOption[]; AgentCommand,
defaultThinkingOptionId?: string; ProviderSnapshotEntry,
} } from '@boocode/contracts/provider-snapshot';
export interface ProviderMode { export type {
id: string; ProviderOverride,
label: string; CoderProvidersFile,
description?: string; ProviderConfigPatch,
isUnattended?: boolean; } from '@boocode/contracts/provider-config';
}
export interface ThinkingOption { // AgentSessionConfig single-sourced in @boocode/contracts — edit the package, not here.
id: string; export type { AgentSessionConfig } from '@boocode/contracts/message-metadata';
label: string;
isDefault?: boolean;
}
// v2.3 phase 2: 'loading' + 'unavailable' restored alongside 'ready' | 'error'.
export type ProviderSnapshotStatus = 'loading' | 'ready' | 'unavailable' | 'error';
// KEEP IN SYNC with apps/coder/src/services/provider-types.ts ProviderSnapshotEntry
// — parity is enforced by coder __tests__/provider-types-parity.test.ts (field drift fails it).
export interface ProviderSnapshotEntry {
name: string;
label: string;
description?: string;
transport: string;
status: ProviderSnapshotStatus;
enabled: boolean;
installed: boolean;
models: ProviderModel[];
modes: ProviderMode[];
defaultModeId: string | null;
commands: AgentCommand[];
error?: string;
fetchedAt?: string;
}
// v2.3 Phase 4: provider config file wire types. Mirror of the Zod-inferred
// ProviderOverride / CoderProvidersFile in apps/coder/src/services/provider-config.ts
// (web can't cross-import the coder package — TS6307 on the composite project).
export interface ProviderOverride {
extends?: 'acp';
label?: string;
description?: string;
command?: string[];
env?: Record<string, string>;
enabled?: boolean;
order?: number;
models?: Array<{ id: string; label: string }>;
additionalModels?: Array<{ id: string; label: string }>;
}
export interface CoderProvidersFile {
providers: Record<string, ProviderOverride>;
}
// PATCH body: a partial providers map. A `null` value deletes that id's
// override (revert to built-in default); an object replaces it wholesale.
export interface ProviderConfigPatch {
providers: Record<string, ProviderOverride | null>;
}
export interface AgentSessionConfig {
provider: string;
model: string;
modeId: string | null;
thinkingOptionId: string | null;
}
export type PermissionKind = 'tool' | 'question' | 'plan' | 'elicitation'; export type PermissionKind = 'tool' | 'question' | 'plan' | 'elicitation';
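Review note: the removed `ProviderConfigPatch` comment describes the PATCH semantics (a `null` value deletes that id's override, an object replaces it wholesale). A minimal sketch of that rule follows, assuming trimmed local stand-ins for the wire types; `applyProviderPatch` is a hypothetical helper and is not taken from the source.

```typescript
// Trimmed stand-ins for the wire types. The real definitions are
// single-sourced in @boocode/contracts/provider-config.
interface ProviderOverride {
  label?: string;
  enabled?: boolean;
  command?: string[];
}
interface CoderProvidersFile {
  providers: Record<string, ProviderOverride>;
}
interface ProviderConfigPatch {
  providers: Record<string, ProviderOverride | null>;
}

// Hypothetical helper: returns a new file object and leaves the input untouched.
function applyProviderPatch(
  file: CoderProvidersFile,
  patch: ProviderConfigPatch,
): CoderProvidersFile {
  const next: Record<string, ProviderOverride> = { ...file.providers };
  for (const [id, override] of Object.entries(patch.providers)) {
    if (override === null) {
      // null: drop the override so the provider reverts to its built-in default.
      delete next[id];
    } else {
      // object: replace wholesale, with no field-level merge against the old value.
      next[id] = override;
    }
  }
  return { providers: next };
}
```

Because replacement is wholesale, a patch that sends only `{ enabled: false }` for an id also drops any previously stored `label` for that id.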
@@ -311,14 +218,6 @@ export interface PermissionPrompt {
options: Array<{ optionId: string; label: string }>; options: Array<{ optionId: string; label: string }>;
} }
export interface AgentCommand {
name: string;
description?: string;
// v2.5.11: 'skill' (plugin skill) vs 'command' (native/CLI slash command).
// Drives the icon split in the coder slash menu. Undefined → command.
kind?: 'command' | 'skill';
}
export interface CoderSendMessageBody { export interface CoderSendMessageBody {
content: string; content: string;
pane_id: string; pane_id: string;
@@ -341,7 +240,13 @@ export interface CoderMessageWire {
role: 'user' | 'assistant' | 'system'; role: 'user' | 'assistant' | 'system';
content: string; content: string;
status?: 'streaming' | 'complete' | 'failed'; status?: 'streaming' | 'complete' | 'failed';
// model-attribution: which model produced this coder assistant message.
model?: string | null;
reasoning_text?: string; reasoning_text?: string;
// Context-window fill for the ContextBar (claude SDK turns set these from the
// SDK's reported window; other agents omit them). Read via the Message cast.
ctx_used?: number | null;
ctx_max?: number | null;
tool_calls?: Array<{ tool_calls?: Array<{
id: string; id: string;
function: { name: string; arguments: string }; function: { name: string; arguments: string };
@@ -561,6 +466,8 @@ export type WsFrame =
ctx_max?: number | null; ctx_max?: number | null;
started_at?: string | null; started_at?: string | null;
finished_at?: string | null; finished_at?: string | null;
// model-attribution: the model that produced this assistant message.
model?: string | null;
// v1.8.2: piggybacks the persisted metadata onto the terminal frame so // v1.8.2: piggybacks the persisted metadata onto the terminal frame so
// cap-hit sentinels (and any future stamped-on-complete metadata) flow // cap-hit sentinels (and any future stamped-on-complete metadata) flow
// to the client without a refetch. // to the client without a refetch.
@@ -586,4 +493,16 @@ export type WsFrame =
| { type: 'compacted'; session_id: string; chat_id: string; summary_message_id: string } | { type: 'compacted'; session_id: string; chat_id: string; summary_message_id: string }
// v1.8.2: `reason` discriminates structured failures (the UI prefers it // v1.8.2: `reason` discriminates structured failures (the UI prefers it
// over `error` text when present). // over `error` text when present).
| { type: 'error'; message_id?: string; chat_id?: string; error: string; reason?: ErrorReason }; | { type: 'error'; message_id?: string; chat_id?: string; error: string; reason?: ErrorReason }
// agent-status-normalize (#10): BooCoder publishes a normalized per-(chat,agent)
// lifecycle status for external coding agents on the per-session channel. The
// CoderPane tracks the latest status per (chat_id, agent) and resets on chat
// switch; AgentComposerBar renders the dot (distinct from the WS-liveness dot).
| {
type: 'agent_status_updated';
chat_id: string;
agent: string;
status: 'working' | 'blocked' | 'idle' | 'error';
reason?: string;
at: string;
};
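Review note: the `agent_status_updated` comment says the CoderPane tracks the latest status per (chat_id, agent) and resets on chat switch. The sketch below shows one way that bookkeeping could look. `AgentStatusTracker` and the `AgentStatusEntry` shape are assumptions for illustration; the real hook (`useAgentStatus`) is not shown in this diff.

```typescript
type AgentStatus = 'working' | 'blocked' | 'idle' | 'error';

// Mirrors the frame arm added to WsFrame above.
interface AgentStatusUpdatedFrame {
  type: 'agent_status_updated';
  chat_id: string;
  agent: string;
  status: AgentStatus;
  reason?: string;
  at: string;
}

// Assumed entry shape: the composer bar reads `status` and `reason`.
interface AgentStatusEntry {
  status: AgentStatus;
  reason?: string;
  at: string;
}

// Hypothetical tracker: last frame received wins, scoped to the active chat.
class AgentStatusTracker {
  private chatId: string | null = null;
  private byAgent = new Map<string, AgentStatusEntry>();

  // Switching to a different chat discards every tracked status.
  switchChat(chatId: string): void {
    if (chatId !== this.chatId) {
      this.chatId = chatId;
      this.byAgent.clear();
    }
  }

  // Frames for any chat other than the active one are ignored.
  apply(frame: AgentStatusUpdatedFrame): void {
    if (frame.chat_id !== this.chatId) return;
    this.byAgent.set(frame.agent, {
      status: frame.status,
      at: frame.at,
      ...(frame.reason !== undefined ? { reason: frame.reason } : {}),
    });
  }

  // null means "no frame yet", which the UI renders as no dot.
  get(agent: string): AgentStatusEntry | null {
    return this.byAgent.get(agent) ?? null;
  }
}
```

Returning `null` before any frame arrives lines up with the `agentStatus?: AgentStatusEntry | null` prop, where a missing entry renders no status dot.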


@@ -1,381 +0,0 @@
// v1.13.11-a: Zod schemas for every WebSocket frame published by the server.
// Validation runs both on send (broker.publishFrame / publishUserFrame) and
// on receive (apps/web/src/hooks/useSessionStream + useUserEvents). Catches
// silent protocol drift between publisher and consumer.
//
// IMPORTANT: This file is duplicated byte-identical at
// apps/web/src/api/ws-frames.ts. The two apps have separate tsconfigs and
// no path alias; the duplication is sync-by-hand. A test asserts the two
// files match. If you change one, change the other.
//
// Per-kind payload schemas (tool_call args, message_parts payloads, etc.)
// stay z.unknown() in v1.13.11. Frame-level drift detection is the goal;
// deep payload validation is follow-up work.
import { z } from 'zod';
// ---- shared primitives -----------------------------------------------------
const Uuid = z.string().uuid();
// Tool call IDs are model-emitted (e.g. "call_abc123") — not UUIDs.
const ToolCallId = z.string().min(1);
// v1.13.12 fix: postgres returns timestamp columns as JS Date objects, not
// strings. The publish sites pass them through unchanged, so the schema must
// tolerate both. preprocess converts Date → ISO string before string-validation;
// on the web side (where frames arrive via JSON.parse) it's a no-op. Before
// this fix, every message_complete / session_updated / chat_updated frame
// failed validation and got dropped — symptoms: token tracking blank in UI,
// status stuck at 'streaming' tripping the 60s stale-stream banner.
const IsoTimestamp = z.preprocess(
(v) => (v instanceof Date ? v.toISOString() : v),
z.string().min(1),
);
const ChatStatusValue = z.enum([
'streaming',
'tool_running',
'waiting_for_input',
'idle',
'error',
]);
const ErrorReasonValue = z.enum([
'llm_provider_error',
'doom_loop',
'doom_loop_summary_failed',
'cap_hit',
'cap_hit_summary_failed',
]);
const MessageRoleValue = z.enum(['user', 'assistant', 'system', 'tool']);
const ToolCallShape = z.object({
id: ToolCallId,
name: z.string().min(1),
args: z.record(z.string(), z.unknown()),
});
// Free-form bags: opaque to the frame schema; deep validation is out of
// scope for v1.13.11 (frame-level drift detection is the goal; per-kind
// payload narrowing is follow-up work). z.unknown() means the consumer
// must narrow before reading — TypeScript-side this is fine because every
// consumer already operates on the hand-maintained Project / Chat / Session
// / WorkspacePane types (the brief's "Don't strip existing types yet"
// rule), and the Zod-typed shape is only used at the publishFrame boundary.
const OpaqueObject = z.unknown();
// ---- per-session channel frames --------------------------------------------
export const SnapshotFrame = z.object({
type: z.literal('snapshot'),
messages: z.array(OpaqueObject),
});
export const MessageStartedFrame = z.object({
type: z.literal('message_started'),
message_id: Uuid,
chat_id: Uuid.optional(),
role: MessageRoleValue,
});
export const DeltaFrame = z.object({
type: z.literal('delta'),
message_id: Uuid,
chat_id: Uuid.optional(),
content: z.string(),
});
export const ReasoningDeltaFrame = z.object({
type: z.literal('reasoning_delta'),
message_id: Uuid,
chat_id: Uuid.optional(),
content: z.string(),
});
export const ToolCallFrame = z.object({
type: z.literal('tool_call'),
message_id: Uuid,
chat_id: Uuid.optional(),
tool_call: ToolCallShape,
});
export const ToolResultFrame = z.object({
type: z.literal('tool_result'),
tool_message_id: Uuid,
chat_id: Uuid.optional(),
tool_call_id: ToolCallId,
output: z.unknown(),
truncated: z.boolean(),
error: z.string().optional(),
});
export const MessageCompleteFrame = z.object({
type: z.literal('message_complete'),
message_id: Uuid,
chat_id: Uuid.optional(),
tokens_used: z.number().int().nonnegative().nullable().optional(),
ctx_used: z.number().int().nonnegative().nullable().optional(),
ctx_max: z.number().int().positive().nullable().optional(),
started_at: IsoTimestamp.nullable().optional(),
finished_at: IsoTimestamp.nullable().optional(),
model: z.string().optional(),
metadata: OpaqueObject.nullable().optional(),
});
export const UsageFrame = z.object({
type: z.literal('usage'),
message_id: Uuid,
chat_id: Uuid.optional(),
completion_tokens: z.number().int().nonnegative().nullable(),
ctx_used: z.number().int().nonnegative().nullable(),
ctx_max: z.number().int().positive().nullable(),
});
export const MessagesDeletedFrame = z.object({
type: z.literal('messages_deleted'),
message_ids: z.array(Uuid),
chat_id: Uuid.optional(),
});
export const ChatRenamedFrame = z.object({
type: z.literal('chat_renamed'),
chat_id: Uuid,
name: z.string(),
});
export const CompactedFrame = z.object({
type: z.literal('compacted'),
session_id: Uuid,
chat_id: Uuid,
summary_message_id: Uuid,
});
export const ErrorFrame = z.object({
type: z.literal('error'),
message_id: Uuid.optional(),
chat_id: Uuid.optional(),
error: z.string(),
reason: ErrorReasonValue.optional(),
});
// ---- per-user channel frames (sidebar refresh) -----------------------------
export const ChatStatusFrame = z.object({
type: z.literal('chat_status'),
chat_id: Uuid,
status: ChatStatusValue,
at: IsoTimestamp,
reason: ErrorReasonValue.optional(),
});
export const SessionUpdatedFrame = z.object({
type: z.literal('session_updated'),
session_id: Uuid,
project_id: Uuid,
name: z.string(),
updated_at: IsoTimestamp,
});
export const SessionRenamedFrame = z.object({
type: z.literal('session_renamed'),
session_id: Uuid,
name: z.string(),
});
export const SessionCreatedFrame = z.object({
type: z.literal('session_created'),
session: OpaqueObject,
project_id: Uuid,
});
export const SessionArchivedFrame = z.object({
type: z.literal('session_archived'),
session_id: Uuid,
project_id: Uuid,
});
export const SessionDeletedFrame = z.object({
type: z.literal('session_deleted'),
session_id: Uuid,
project_id: Uuid,
});
export const SessionWorkspaceUpdatedFrame = z.object({
type: z.literal('session_workspace_updated'),
session_id: Uuid,
// v2.6.x: widened from z.array — the payload is now either the legacy bare
// WorkspacePane[] OR the WorkspaceState envelope object (panes + tabNumbers +
// nextTabNumber + closedPaneStack). z.array alone would fail-closed and drop
// every envelope frame at validation. MUST be mirrored in the server's
// byte-identical copy (parity test).
workspace_panes: z.union([z.array(OpaqueObject), z.record(z.unknown())]),
});
export const ChatCreatedFrame = z.object({
type: z.literal('chat_created'),
chat: OpaqueObject,
session_id: Uuid,
});
export const ChatUpdatedFrame = z.object({
type: z.literal('chat_updated'),
chat_id: Uuid,
session_id: Uuid,
name: z.string().nullable(),
updated_at: IsoTimestamp,
});
export const ChatArchivedFrame = z.object({
type: z.literal('chat_archived'),
chat_id: Uuid,
session_id: Uuid,
});
export const ChatUnarchivedFrame = z.object({
type: z.literal('chat_unarchived'),
chat: OpaqueObject,
});
export const ChatDeletedFrame = z.object({
type: z.literal('chat_deleted'),
chat_id: Uuid,
session_id: Uuid,
});
export const ProjectCreatedFrame = z.object({
type: z.literal('project_created'),
project: OpaqueObject,
});
export const ProjectArchivedFrame = z.object({
type: z.literal('project_archived'),
project_id: Uuid,
});
export const ProjectUnarchivedFrame = z.object({
type: z.literal('project_unarchived'),
project: OpaqueObject,
});
export const ProjectUpdatedFrame = z.object({
type: z.literal('project_updated'),
project_id: Uuid,
name: z.string(),
});
export const ProjectDeletedFrame = z.object({
type: z.literal('project_deleted'),
project_id: Uuid,
});
const PermissionOptionShape = z.object({
option_id: z.string(),
label: z.string(),
});
export const PermissionRequestedFrame = z.object({
type: z.literal('permission_requested'),
task_id: Uuid,
session_id: Uuid,
kind: z.enum(['tool', 'question', 'plan', 'elicitation']).optional(),
tool_title: z.string().optional(),
input: z.record(z.unknown()).optional(),
options: z.array(PermissionOptionShape),
});
export const PermissionResolvedFrame = z.object({
type: z.literal('permission_resolved'),
task_id: Uuid,
session_id: Uuid,
});
const AgentCommandShape = z.object({
name: z.string(),
description: z.string().optional(),
});
export const AgentCommandsFrame = z.object({
type: z.literal('agent_commands'),
task_id: Uuid,
session_id: Uuid,
commands: z.array(AgentCommandShape),
});
// ---- discriminated union ---------------------------------------------------
export const WsFrameSchema = z.discriminatedUnion('type', [
// per-session
SnapshotFrame,
MessageStartedFrame,
DeltaFrame,
ReasoningDeltaFrame,
ToolCallFrame,
ToolResultFrame,
MessageCompleteFrame,
UsageFrame,
MessagesDeletedFrame,
ChatRenamedFrame,
CompactedFrame,
ErrorFrame,
PermissionRequestedFrame,
PermissionResolvedFrame,
AgentCommandsFrame,
// per-user
ChatStatusFrame,
SessionUpdatedFrame,
SessionRenamedFrame,
SessionCreatedFrame,
SessionArchivedFrame,
SessionDeletedFrame,
SessionWorkspaceUpdatedFrame,
ChatCreatedFrame,
ChatUpdatedFrame,
ChatArchivedFrame,
ChatUnarchivedFrame,
ChatDeletedFrame,
ProjectCreatedFrame,
ProjectArchivedFrame,
ProjectUnarchivedFrame,
ProjectUpdatedFrame,
ProjectDeletedFrame,
]);
export type WsFrame = z.infer<typeof WsFrameSchema>;
// Convenience: the set of known frame types. Useful for the publishFrame
// helper to log the offending type name when validation fails. Kept in sync
// by hand with the discriminated union above.
export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
'snapshot',
'message_started',
'delta',
'reasoning_delta',
'tool_call',
'tool_result',
'message_complete',
'usage',
'messages_deleted',
'chat_renamed',
'compacted',
'error',
'permission_requested',
'permission_resolved',
'agent_commands',
'chat_status',
'session_updated',
'session_renamed',
'session_created',
'session_archived',
'session_deleted',
'session_workspace_updated',
'chat_created',
'chat_updated',
'chat_archived',
'chat_unarchived',
'chat_deleted',
'project_created',
'project_archived',
'project_unarchived',
'project_updated',
'project_deleted',
] as const;
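Review note: the `IsoTimestamp` comment in the deleted file explains that postgres returns timestamp columns as `Date` objects on the publish side, while the web side only ever sees strings after `JSON.parse`. Here is a dependency-free sketch of the same normalize-then-validate step, assuming a hypothetical `parseTimestamp` helper in place of the Zod `preprocess` call.

```typescript
// Same idea as the preprocess step: Date -> ISO string, everything else unchanged.
function normalizeTimestamp(v: unknown): unknown {
  return v instanceof Date ? v.toISOString() : v;
}

// Equivalent of the string().min(1) check applied after normalization.
// Returns null where the schema would reject the value.
function parseTimestamp(v: unknown): string | null {
  const n = normalizeTimestamp(v);
  return typeof n === 'string' && n.length > 0 ? n : null;
}

// Publish side: the value may still be a Date straight from the database driver.
const fromDb = parseTimestamp(new Date(0));

// Receive side: JSON serialization has already turned the Date into a string,
// so normalization is a no-op there.
const overTheWire = JSON.parse(JSON.stringify({ at: new Date(0) })) as { at: unknown };
const fromWire = parseTimestamp(overTheWire.at);
```

Without the normalization step, a string-only check rejects every `Date` value on the publish side, which is the dropped-frame failure the original comment describes.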

Binary file not shown. Size: 910 KiB

Binary file not shown. Size: 685 KiB


@@ -1,49 +0,0 @@
import { ChevronDown } from 'lucide-react';
import { useState } from 'react';
import type { AgentCommand } from '@/api/types';
import { cn } from '@/lib/utils';
interface Props {
commands: AgentCommand[];
}
export function AgentCommandsHint({ commands }: Props) {
const [open, setOpen] = useState(false);
const [expanded, setExpanded] = useState<string | null>(null);
if (commands.length === 0) return null;
return (
<div className="mx-2 mb-1 rounded-md border border-border/60 bg-muted/30 text-xs">
<button
type="button"
onClick={() => setOpen((v) => !v)}
className="w-full flex items-center justify-between px-2 py-1.5 text-muted-foreground hover:text-foreground max-md:min-h-[44px]"
>
<span>Slash commands ({commands.length})</span>
<ChevronDown className={cn('size-3.5 transition-transform', open && 'rotate-180')} />
</button>
{open && (
<ul className="px-2 pb-2 space-y-1 border-t border-border/40 max-h-48 overflow-y-auto overscroll-contain touch-pan-y">
{commands.map((cmd) => (
<li
key={cmd.name}
className="cursor-pointer"
onClick={() => setExpanded((v) => v === cmd.name ? null : cmd.name)}
>
<span className="font-mono text-primary/80">/{cmd.name}</span>
{cmd.description && (
<span className={cn(
'ml-1.5 text-muted-foreground font-sans',
expanded === cmd.name ? '' : 'line-clamp-2',
)}>
{cmd.description}
</span>
)}
</li>
))}
</ul>
)}
</div>
);
}


@@ -3,8 +3,8 @@ import { Check, ChevronDown, RefreshCw, Loader2, Shield, Brain, Bot } from 'luci
import { api } from '@/api/client'; import { api } from '@/api/client';
import type { AgentSessionConfig, ProviderSnapshotEntry, AgentCommand } from '@/api/types'; import type { AgentSessionConfig, ProviderSnapshotEntry, AgentCommand } from '@/api/types';
import { useProviderSnapshot, refreshProviderSnapshot } from '@/hooks/useProviderSnapshot'; import { useProviderSnapshot, refreshProviderSnapshot } from '@/hooks/useProviderSnapshot';
import type { AgentStatusEntry } from '@/hooks/useAgentStatus';
import { providerIcon } from '@/components/coder/providerIcons'; import { providerIcon } from '@/components/coder/providerIcons';
import { useAgentSessions } from '@/hooks/useAgentSessions';
import { import {
DropdownMenu, DropdownMenu,
DropdownMenuContent, DropdownMenuContent,
@@ -173,36 +173,49 @@ interface Props {
onChange: (next: AgentSessionConfig) => void; onChange: (next: AgentSessionConfig) => void;
onProviderCommandsChange?: (commands: AgentCommand[]) => void; onProviderCommandsChange?: (commands: AgentCommand[]) => void;
connected?: boolean; connected?: boolean;
// v2.6 Phase 1-UX §9b: chat id for the resumed/new-session chip. Optional so // #10: normalized status (working|blocked|idle|error) for the active external
// BooChat and any other AgentComposerBar caller renders no chip and is // agent in this chat, or null for native boocode / before any frame. Renders
// otherwise unaffected. When present + connected + the chat has ≥1 prior // a status dot DISTINCT from the WS-liveness `connected` dot. Undefined for
// turn, a chip right of the Provider picker reports whether switching to the // non-coder callers — no dot.
// current provider resumes an agent session, replays history (boocode), or agentStatus?: AgentStatusEntry | null;
// starts fresh.
sessionId?: string;
// True once the chat has at least one prior turn — gates the chip so it stays
// hidden on a brand-new chat. Defaults to false (no chip).
hasPriorTurn?: boolean;
} }
// Relative-time formatter for the resumed-chip title (e.g. "3m ago"). // #10: normalized external-agent status dot. Mirrors StatusDot's visual
function relativeTime(iso: string | null): string { // language but on the four normalized buckets (working|blocked|idle|error),
if (!iso) return 'unknown'; // and is DISTINCT from the WS-liveness `connected` dot beside it:
const then = new Date(iso).getTime(); // working — emerald spinning ring (subtle motion, like chat streaming)
if (Number.isNaN(then)) return 'unknown'; // blocked — amber dot (matches the permission/blocked state colour)
const diffMs = Date.now() - then; // idle — gray dot
if (diffMs < 0) return 'just now'; // error — red dot
const sec = Math.floor(diffMs / 1000); function AgentStatusDot({ entry, agent }: { entry: AgentStatusEntry; agent: string }) {
if (sec < 60) return 'just now'; const title =
const min = Math.floor(sec / 60); `${agent}: ${entry.status}` + (entry.reason ? `: ${entry.reason}` : '');
if (min < 60) return `${min}m ago`;
const hr = Math.floor(min / 60); if (entry.status === 'working') {
if (hr < 24) return `${hr}h ago`; return (
const day = Math.floor(hr / 24); <span
return `${day}d ago`; aria-label={`Agent status: working${entry.reason ? `: ${entry.reason}` : ''}`}
title={title}
className="inline-block w-3 h-3 rounded-full border-2 border-emerald-500 border-t-transparent animate-spin shrink-0"
/>
);
}
const bg =
entry.status === 'blocked' ? 'bg-amber-500'
: entry.status === 'error' ? 'bg-destructive'
: 'bg-muted-foreground/40';
return (
<span
aria-label={`Agent status: ${entry.status}${entry.reason ? `: ${entry.reason}` : ''}`}
title={title}
className={cn('inline-block w-1.5 h-1.5 rounded-full shrink-0', bg)}
/>
);
} }
export function AgentComposerBar({ projectPath, value, onChange, onProviderCommandsChange, connected, sessionId, hasPriorTurn }: Props) { export function AgentComposerBar({ projectPath, value, onChange, onProviderCommandsChange, connected, agentStatus }: Props) {
const allEntries = useProviderSnapshot(projectPath); const allEntries = useProviderSnapshot(projectPath);
// 5.5 — the composer picker only offers ENABLED providers that are ready (or // 5.5 — the composer picker only offers ENABLED providers that are ready (or
// still loading). Disabled (enabled:false) and unavailable/error providers are // still loading). Disabled (enabled:false) and unavailable/error providers are
@@ -214,13 +227,6 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
); );
const [refreshing, setRefreshing] = useState(false); const [refreshing, setRefreshing] = useState(false);
// v2.6 Phase 1-UX §9b: chat-scoped agent-session rows for the resumed/new
// chip. Hook is unconditional (hooks rule); it self-no-ops when sessionId is
// undefined or the chat has no prior turn, so BooChat callers cost nothing.
const { sessions: agentSessions } = useAgentSessions(
sessionId && hasPriorTurn ? sessionId : undefined,
);
const hydratedRef = useRef(false); const hydratedRef = useRef(false);
useEffect(() => { useEffect(() => {
@@ -334,27 +340,8 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
const modelOptions = (currentEntry?.models ?? []).map((m) => ({ id: m.id, label: m.label })); const modelOptions = (currentEntry?.models ?? []).map((m) => ({ id: m.id, label: m.label }));
const thinkingOpts = thinkingOptions.map((t) => ({ id: t.id, label: t.label })); const thinkingOpts = thinkingOptions.map((t) => ({ id: t.id, label: t.label }));
// v2.6 Phase 1-UX §9b: resumed / history / new-session chip. Only meaningful
// when this is a real chat (sessionId), the WS is connected, and the chat has
// ≥1 prior turn — otherwise render nothing so fresh chats and non-coder
// callers stay clean.
const sessionRow = agentSessions.find((s) => s.agent === value.provider);
const sessionChip: { label: string; title: string } | null =
sessionId && hasPriorTurn && connected
? value.provider === 'boocode'
? // Native boocode never holds an agent_sessions row — it reconstructs
// the conversation from the chat transcript each turn.
{ label: 'history', title: 'BooCode replays the chat transcript each turn' }
: sessionRow?.has_session
? {
label: 'resumed',
title: `Resuming ${value.provider} · last active ${relativeTime(sessionRow.last_active_at)}`,
}
: { label: 'new session', title: `${value.provider} starts a fresh session this turn` }
: null;
return ( return (
<div className="flex flex-wrap items-center gap-1 px-2 py-1 border-b border-border bg-muted/20 shrink-0"> <div className="flex items-center gap-1 px-2 py-1 border-b border-border bg-muted/20 shrink-0">
<CompactPicker <CompactPicker
label="Provider" label="Provider"
value={value.provider} value={value.provider}
@@ -366,14 +353,6 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
: providerIcon(value.provider) : providerIcon(value.provider)
} }
/> />
{sessionChip && (
<span
title={sessionChip.title}
className="inline-flex items-center rounded-full border border-border bg-muted/40 px-1.5 py-0.5 text-[10px] font-medium text-muted-foreground shrink-0"
>
{sessionChip.label}
</span>
)}
<CompactPicker <CompactPicker
label="Mode" label="Mode"
value={value.modeId ?? ''} value={value.modeId ?? ''}
@@ -400,9 +379,13 @@ export function AgentComposerBar({ projectPath, value, onChange, onProviderComma
icon={<Brain className="size-3 shrink-0" />} icon={<Brain className="size-3 shrink-0" />}
/> />
)} )}
{/* Status dot + refresh as one right-aligned unit so the refresh button {/* Status dot + refresh — pinned right (ml-auto), never on its own line. */}
stays on the top line instead of wrapping past the edge-pinned dot. */}
<div className="ml-auto flex items-center gap-1 shrink-0"> <div className="ml-auto flex items-center gap-1 shrink-0">
{/* #10: normalized agent status — only for an external agent with a
live status frame. Distinct from the WS-liveness dot that follows. */}
{agentStatus && value.provider !== 'boocode' && (
<AgentStatusDot entry={agentStatus} agent={value.provider} />
)}
{connected !== undefined && ( {connected !== undefined && (
<span <span
className={cn('inline-block w-1.5 h-1.5 rounded-full shrink-0', connected ? 'bg-green-500' : 'bg-red-500')} className={cn('inline-block w-1.5 h-1.5 rounded-full shrink-0', connected ? 'bg-green-500' : 'bg-red-500')}


@@ -1,14 +1,8 @@
import { useCallback, useEffect, useMemo, useRef, useState, type DragEvent, type KeyboardEvent } from 'react'; import { useCallback, useEffect, useMemo, useRef, useState, type DragEvent, type KeyboardEvent } from 'react';
import { Check, ListPlus, Plus, Send, Square } from 'lucide-react'; import { Globe, ListPlus, Paperclip, Send, Square, SquareSlash } from 'lucide-react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import { Textarea } from '@/components/ui/textarea'; import { Textarea } from '@/components/ui/textarea';
import { Button } from '@/components/ui/button'; import { Button } from '@/components/ui/button';
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu';
import { import {
flattenToMessage, flattenToMessage,
inferLanguage, inferLanguage,
@@ -22,7 +16,6 @@ import { AttachmentPreviewModal } from '@/components/AttachmentPreviewModal';
import { FileMentionPopover } from '@/components/FileMentionPopover'; import { FileMentionPopover } from '@/components/FileMentionPopover';
import { DropOverlay } from '@/components/DropOverlay'; import { DropOverlay } from '@/components/DropOverlay';
import { AgentPicker } from '@/components/AgentPicker'; import { AgentPicker } from '@/components/AgentPicker';
import { AgentCommandsHint } from '@/components/AgentCommandsHint';
import { ContextBar } from '@/components/ContextBar'; import { ContextBar } from '@/components/ContextBar';
import { SlashCommandPicker, type SlashCommandGroup } from '@/components/SlashCommandPicker'; import { SlashCommandPicker, type SlashCommandGroup } from '@/components/SlashCommandPicker';
import { isSlashCommandToken, parseSlashInput, slashQuery } from '@/lib/slash-command'; import { isSlashCommandToken, parseSlashInput, slashQuery } from '@/lib/slash-command';
@@ -123,6 +116,11 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
); );
const [fileIndex, setFileIndex] = useState<string[] | null>(null); const [fileIndex, setFileIndex] = useState<string[] | null>(null);
const textareaRef = useRef<HTMLTextAreaElement | null>(null); const textareaRef = useRef<HTMLTextAreaElement | null>(null);
// Attach-file button → hidden native picker (same File→Attachment path as drop).
const fileInputRef = useRef<HTMLInputElement | null>(null);
// Slash-commands chip → click-to-open command menu, anchored to the chip.
const cmdChipRef = useRef<HTMLButtonElement | null>(null);
const [cmdMenuOpen, setCmdMenuOpen] = useState(false);
function addAttachment(a: Attachment) { function addAttachment(a: Attachment) {
setAttachments(prev => { setAttachments(prev => {
@@ -180,6 +178,23 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
setAttachments(prev => prev.filter(a => a.id !== id)); setAttachments(prev => prev.filter(a => a.id !== id));
} }
// Attach-file button: funnel picked files through the same size/binary gate +
// chip pipeline as drag-drop. Reset value so re-picking the same file fires.
async function onPickFiles(e: React.ChangeEvent<HTMLInputElement>) {
const files = Array.from(e.target.files ?? []);
e.target.value = '';
if (files.length === 0) return;
let remaining = MAX_ATTACHMENTS - attachments.length;
for (const file of files) {
if (remaining <= 0) {
toast.error(`Attachment limit reached (${MAX_ATTACHMENTS}).`);
break;
}
await processDroppedFile(file);
remaining -= 1;
}
}
async function submit() { async function submit() {
const text = value.trim(); const text = value.trim();
if (!text && attachments.length === 0) return; if (!text && attachments.length === 0) return;
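Review note: `onPickFiles` caps picked files at the remaining attachment budget and shows a toast once the cap is hit. The pure function below sketches that budget rule in isolation. `takeWithinLimit` is a hypothetical name, and the limit is a parameter because the value of `MAX_ATTACHMENTS` is not shown in this diff.

```typescript
interface PickResult<T> {
  accepted: T[];      // files that fit in the remaining budget, in pick order
  limitHit: boolean;  // true when at least one picked file had to be skipped
}

// Hypothetical helper mirroring the loop in onPickFiles: process files in
// order until the remaining budget reaches zero, then stop.
function takeWithinLimit<T>(picked: T[], alreadyAttached: number, max: number): PickResult<T> {
  const remaining = Math.max(0, max - alreadyAttached);
  return {
    accepted: picked.slice(0, remaining),
    limitHit: picked.length > remaining,
  };
}
```

One difference from the handler in the diff: there the budget is decremented after every `processDroppedFile` call, so a file rejected by the size/binary gate presumably still uses up a slot for that pick, whereas this sketch only counts files.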
@@ -582,9 +597,6 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
))} ))}
</div> </div>
)} )}
{slashItems.length > 0 && (
<AgentCommandsHint commands={slashItems} />
)}
{/* Batch 9 toolbar — agent picker + quick-toggle menu. v1.11.5.1 {/* Batch 9 toolbar — agent picker + quick-toggle menu. v1.11.5.1
inlines ContextBar in the same row so the bar lives next to the inlines ContextBar in the same row so the bar lives next to the
picker rather than as a separate header above it. The row renders picker rather than as a separate header above it. The row renders
@@ -598,39 +610,9 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
onChange={onAgentChange} onChange={onAgentChange}
/> />
)} )}
{sessionId && ( {/* BooCode 2.0: the web-search toggle moved out of this top toolbar
<DropdownMenu> into the composer box's bottom controls row (the Web pill below),
<DropdownMenuTrigger asChild> leaving the top row as just the agent picker + context bar. */}
<button
type="button"
aria-label="Quick toggles"
title="Quick toggles"
className="inline-flex items-center justify-center size-6 rounded text-muted-foreground hover:bg-muted hover:text-foreground"
>
<Plus className="size-3.5" />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="start">
<DropdownMenuItem
onSelect={async () => {
// v1.9: tri-state collapses to two on the wire when toggled
// here. null (inherit) treated as off; click flips to true.
// To restore "inherit" the user opens SettingsPane.
const next = webSearchEnabled === true ? false : true;
try {
await api.sessions.update(sessionId, { web_search_enabled: next });
} catch (err) {
toast.error(err instanceof Error ? err.message : 'failed to toggle web search');
}
}}
className="text-xs"
>
<Check className={`size-3 ${webSearchEnabled === true ? 'opacity-100' : 'opacity-0'}`} />
Enable web search and fetch
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
)}
{/* v1.11.5.1: ContextBar fills the remaining horizontal space. {/* v1.11.5.1: ContextBar fills the remaining horizontal space.
`flex-1 min-w-0` is set inside the component. Mounts only when `flex-1 min-w-0` is set inside the component. Mounts only when
the caller passes `messages` so older call sites (without the the caller passes `messages` so older call sites (without the
@@ -640,54 +622,112 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
)} )}
</div> </div>
)} )}
<div className="px-4 py-3 flex items-end gap-2"> {/* BooCode 2.0 composer: textarea + a bottom controls row live INSIDE one
<Textarea bordered, focus-ringed message box (Refreshed direction). */}
ref={textareaRef} <div className="px-4 py-3">
value={value} <div className="rounded-xl border bg-card transition-colors focus-within:border-primary/50 focus-within:ring-2 focus-within:ring-primary/15">
onChange={handleChange} <Textarea
onKeyDown={onKeyDown} ref={textareaRef}
onPaste={onPaste} value={value}
placeholder={ onChange={handleChange}
isMobile onKeyDown={onKeyDown}
? 'Ask about this project. Tap send to submit.' onPaste={onPaste}
: 'Ask about this project. Enter to send · Shift+Enter for newline.' placeholder={
} isMobile
disabled={disabled || busy} ? 'Ask about this project. Tap send to submit.'
rows={3} : 'Ask about this project. Enter to send · Shift+Enter for newline.'
className="resize-none min-h-[68px] max-h-[240px]" }
/> disabled={disabled || busy}
{(() => { rows={3}
const hasContent = value.trim().length > 0 || attachments.length > 0; className="resize-none min-h-[56px] max-h-[240px] border-0 bg-transparent px-3 pt-2.5 shadow-none focus-visible:ring-0 dark:bg-transparent"
// While generating with an empty draft, the button stops generation. />
if (generating && onStop && !hasContent) { {/* bottom controls row: attach + slash chip + Web on the left, Send/Stop on the right */}
return ( <div className="flex items-center gap-1.5 px-2 pb-2 pt-0.5">
<Button <input ref={fileInputRef} type="file" multiple className="hidden" onChange={onPickFiles} />
onClick={() => void onStop()} <button
size="icon-lg" type="button"
variant="outline" onClick={() => fileInputRef.current?.click()}
aria-label="Stop generating" disabled={disabled || busy || attachments.length >= MAX_ATTACHMENTS}
title="Stop generating" aria-label="Attach file"
> title="Attach file"
<Square className="fill-current size-3.5" /> className="inline-flex items-center justify-center rounded-full border border-border px-2.5 py-1 text-muted-foreground transition-colors hover:bg-muted hover:text-foreground disabled:opacity-50 max-md:min-h-[36px] max-md:min-w-[36px]"
</Button>
);
}
// With a draft, submit. While generating the caller queues it, so the
// button reads as Queue; otherwise it's a normal Send.
const queueing = !!generating && hasContent;
return (
<Button
onClick={() => void submit()}
disabled={disabled || busy || !hasContent}
size="icon-lg"
variant={queueing ? 'secondary' : 'default'}
aria-label={queueing ? 'Queue message' : 'Send'}
title={queueing ? 'Queue message' : 'Send'}
> >
{queueing ? <ListPlus /> : <Send />} <Paperclip className="size-3.5" />
</Button> </button>
); {slashItems.length > 0 && (
})()} <button
ref={cmdChipRef}
type="button"
onMouseDown={(e) => e.stopPropagation()}
onClick={() => setCmdMenuOpen((v) => !v)}
aria-expanded={cmdMenuOpen}
aria-label="Slash commands"
title="Slash commands"
className="inline-flex items-center gap-1.5 rounded-full border border-border px-2.5 py-1 text-xs text-muted-foreground transition-colors hover:bg-muted hover:text-foreground aria-expanded:bg-muted aria-expanded:text-foreground max-md:min-h-[36px] max-md:min-w-[36px]"
>
<SquareSlash className="size-3.5" />
<span className="max-md:hidden">{slashItems.length}</span>
</button>
)}
{sessionId && (
<button
type="button"
onClick={async () => {
// v1.9 tri-state collapses to two on toggle; null (inherit) → on.
const next = webSearchEnabled === true ? false : true;
try {
await api.sessions.update(sessionId, { web_search_enabled: next });
} catch (err) {
toast.error(err instanceof Error ? err.message : 'failed to toggle web search');
}
}}
aria-pressed={webSearchEnabled === true}
title="Web search & fetch"
className={`inline-flex items-center gap-1.5 rounded-full border px-2.5 py-1 text-xs transition-colors max-md:min-h-[36px] ${
webSearchEnabled === true
? 'border-primary/40 bg-primary/10 text-primary'
: 'border-border text-muted-foreground hover:bg-muted hover:text-foreground'
}`}
>
<Globe className="size-3.5" />
Web
</button>
)}
<div className="flex-1" />
{(() => {
const hasContent = value.trim().length > 0 || attachments.length > 0;
// While generating with an empty draft, the button stops generation.
if (generating && onStop && !hasContent) {
return (
<Button
onClick={() => void onStop()}
size="icon"
variant="outline"
aria-label="Stop generating"
title="Stop generating"
>
<Square className="fill-current size-3.5" />
</Button>
);
}
// With a draft, submit. While generating the caller queues it, so the
// button reads as Queue; otherwise it's a normal Send.
const queueing = !!generating && hasContent;
return (
<Button
onClick={() => void submit()}
disabled={disabled || busy || !hasContent}
size="icon"
variant={queueing ? 'secondary' : 'default'}
aria-label={queueing ? 'Queue message' : 'Send'}
title={queueing ? 'Queue message' : 'Send'}
>
{queueing ? <ListPlus /> : <Send />}
</Button>
);
})()}
</div>
</div>
</div> </div>
</div> </div>
<AttachmentPreviewModal <AttachmentPreviewModal
@@ -714,6 +754,21 @@ export function ChatInput({ disabled, projectId, agentId, onAgentChange, session
emptyLabel={slashGroups ? 'No commands available' : 'No skills available'} emptyLabel={slashGroups ? 'No commands available' : 'No skills available'}
/> />
)} )}
{/* Slash-commands chip menu (click-opened); anchored to the chip. */}
{cmdMenuOpen && slashItems.length > 0 && (
<SlashCommandPicker
query=""
items={slashItems}
groups={slashGroups}
inputRef={cmdChipRef}
onSelect={(name) => {
setCmdMenuOpen(false);
handleSlashSelect(name);
}}
onClose={() => setCmdMenuOpen(false)}
emptyLabel={slashGroups ? 'No commands available' : 'No skills available'}
/>
)}
</div> </div>
); );
} }
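The Send / Queue / Stop decision made inside the composer's bottom controls row can be read as a small pure function. The following is a minimal sketch under stated assumptions: `ComposerState`, `ComposerAction`, and `composerAction` are hypothetical names introduced for illustration and do not exist in the diff; `canStop` stands in for "an `onStop` callback was passed".

```typescript
// Hypothetical types for illustration only; not part of the ChatInput diff.
type ComposerAction = 'stop' | 'queue' | 'send';

interface ComposerState {
  draft: string;           // current textarea value
  attachmentCount: number; // number of pending attachments
  generating: boolean;     // a turn is currently streaming
  canStop: boolean;        // assumption: true when an onStop callback exists
}

// Mirrors the branch order in the diff: an empty draft during generation
// stops the turn; a non-empty draft during generation is queued; otherwise
// the button is a plain Send.
function composerAction(s: ComposerState): ComposerAction {
  const hasContent = s.draft.trim().length > 0 || s.attachmentCount > 0;
  if (s.generating && s.canStop && !hasContent) return 'stop';
  return s.generating && hasContent ? 'queue' : 'send';
}

const whileIdle = composerAction({ draft: 'hi', attachmentCount: 0, generating: false, canStop: true });
const whileBusyEmpty = composerAction({ draft: '   ', attachmentCount: 0, generating: true, canStop: true });
const whileBusyDraft = composerAction({ draft: 'hi', attachmentCount: 0, generating: true, canStop: true });
```

Keeping the decision in one place like this would make the three button states testable without rendering; in the diff itself the same logic lives in an inline function inside the JSX.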


@@ -1,7 +1,8 @@
import { useState } from 'react'; import { useState } from 'react';
import { Code, Columns2, History, MessageSquare, Plus, RotateCcw, Terminal, X } from 'lucide-react'; import { Code, History, MessageSquare, X } from 'lucide-react';
import type { Chat, WorkspacePane } from '@/api/types'; import type { Chat, WorkspacePane } from '@/api/types';
import { StatusDot } from '@/components/StatusDot'; import { StatusDot } from '@/components/StatusDot';
import { PaneHeaderActions } from '@/components/PaneHeaderActions';
import { import {
ContextMenu, ContextMenu,
ContextMenuContent, ContextMenuContent,
@@ -9,12 +10,6 @@ import {
ContextMenuSeparator, ContextMenuSeparator,
ContextMenuTrigger, ContextMenuTrigger,
} from '@/components/ui/context-menu'; } from '@/components/ui/context-menu';
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu';
import { useLongPress } from '@/hooks/useLongPress'; import { useLongPress } from '@/hooks/useLongPress';
import { sessionEvents } from '@/hooks/sessionEvents'; import { sessionEvents } from '@/hooks/sessionEvents';
import { cn } from '@/lib/utils'; import { cn } from '@/lib/utils';
@@ -22,6 +17,9 @@ import { cn } from '@/lib/utils';
interface Props { interface Props {
pane: WorkspacePane; pane: WorkspacePane;
tabs: Chat[]; tabs: Chat[];
// Host pane kind — 'coder' shows the Code glyph + routes the "+" to a new
// BooCode tab. Defaults to 'chat' (the BooChat tab bar).
tabKind?: 'chat' | 'coder';
// v2.6.x (Batch 3a): stable session-scoped tab number per chat id. Keyed by // v2.6.x (Batch 3a): stable session-scoped tab number per chat id. Keyed by
// chat.id, NEVER by tab position. // chat.id, NEVER by tab position.
tabNumbers: Record<string, number>; tabNumbers: Record<string, number>;
@@ -41,6 +39,7 @@ interface Props {
export function ChatTabBar({ export function ChatTabBar({
pane, pane,
tabs, tabs,
tabKind = 'chat',
tabNumbers, tabNumbers,
onSwitchTab, onSwitchTab,
onRemoveTab, onRemoveTab,
@@ -56,6 +55,8 @@ export function ChatTabBar({
}: Props) { }: Props) {
const [renamingId, setRenamingId] = useState<string | null>(null); const [renamingId, setRenamingId] = useState<string | null>(null);
const [renameValue, setRenameValue] = useState(''); const [renameValue, setRenameValue] = useState('');
const TabIcon = tabKind === 'coder' ? Code : MessageSquare;
const newLabel = tabKind === 'coder' ? 'New BooCode' : 'New chat';
// Long-press: dispatch a synthetic contextmenu event on the tab so the // Long-press: dispatch a synthetic contextmenu event on the tab so the
// existing Radix ContextMenuTrigger opens at the touch coordinates. Works // existing Radix ContextMenuTrigger opens at the touch coordinates. Works
@@ -109,7 +110,7 @@ export function ChatTabBar({
: 'bg-muted/30 text-muted-foreground hover:bg-muted/60' : 'bg-muted/30 text-muted-foreground hover:bg-muted/60'
)} )}
> >
<MessageSquare size={12} className="shrink-0" /> <TabIcon size={12} className="shrink-0" />
<StatusDot chatId={chat.id} /> <StatusDot chatId={chat.id} />
{renamingId === chat.id ? ( {renamingId === chat.id ? (
<input <input
@@ -147,7 +148,7 @@ export function ChatTabBar({
</ContextMenuTrigger> </ContextMenuTrigger>
<ContextMenuContent> <ContextMenuContent>
<ContextMenuItem onSelect={onNewTab}> <ContextMenuItem onSelect={onNewTab}>
New chat {newLabel}
</ContextMenuItem> </ContextMenuItem>
<ContextMenuItem <ContextMenuItem
onSelect={() => onSelect={() =>
@@ -191,90 +192,16 @@ export function ChatTabBar({
</div> </div>
)} )}
<div className="flex items-center ml-auto gap-0.5 px-1 shrink-0"> <PaneHeaderActions
<DropdownMenu> className="ml-auto px-1"
<DropdownMenuTrigger asChild> onNewTab={onNewTab}
<button tabKind={tabKind}
type="button" onSplitPane={onSplitPane}
className="inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]" onReopenPane={onReopenPane}
aria-label="New chat, terminal, or coder" onShowHistory={onShowHistory}
title="New chat / terminal / coder" onRemovePane={onRemovePane}
> historyActive={pane.kind === 'empty'}
<Plus size={12} /> />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-fit">
{/* New BooChat opens a tab in THIS pane; terminal/coder can't be
tabs, so they split into a new pane (matches the Split menu). */}
<DropdownMenuItem onSelect={onNewTab}>
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
<DropdownMenu>
<DropdownMenuTrigger asChild>
<button
type="button"
className="inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
aria-label="Split pane"
title="Split pane"
>
<Columns2 size={12} />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-fit">
<DropdownMenuItem onSelect={() => onSplitPane('chat')}>
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
{onReopenPane && (
<button
type="button"
onClick={onReopenPane}
className="inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
aria-label="Reopen closed pane"
title="Reopen closed pane"
>
<RotateCcw size={12} />
</button>
)}
<button
type="button"
onClick={onShowHistory}
className={cn(
'inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]',
pane.kind === 'empty' && 'text-foreground bg-muted/50'
)}
aria-label="Session history"
title="Session history"
>
<History size={12} />
</button>
{onRemovePane && (
<button
type="button"
onClick={onRemovePane}
className="inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
aria-label="Close pane"
title="Close pane"
>
<X size={12} />
</button>
)}
</div>
</div> </div>
); );
} }


@@ -19,18 +19,24 @@ interface Props {
// the same boundaries the server's auto-compaction triggers. // the same boundaries the server's auto-compaction triggers.
const COMPACTION_BUFFER = 20_000; const COMPACTION_BUFFER = 20_000;
// Walk newest-first; first message with both ctx_used and ctx_max non-null // Take the latest ctx_used and the latest ctx_max INDEPENDENTLY (newest-first).
// AND ctx_max > 0 wins. Older messages may have ctx_used but missing ctx_max // They needn't be on the same message: ctx_max is the model's context window — a
// (early v1 before llama-swap's n_ctx capture worked) — skip them and keep // constant per model — while some agents report it only intermittently (the claude
// walking. Returns null when no usable pair exists in the chat. // SDK populates modelUsage.contextWindow on some turns, not all) yet report
// ctx_used every turn. Pairing the latest of each gives a correct used/max even
// when the most recent turn omitted the window. Native BooChat sets both on the
// same assistant message, so this is identical there. Returns null until BOTH a
// used and a positive max have been seen at least once.
function latestPair(messages: Message[]): { used: number; max: number } | null { function latestPair(messages: Message[]): { used: number; max: number } | null {
let used: number | null = null;
let max: number | null = null;
for (let i = messages.length - 1; i >= 0; i--) { for (let i = messages.length - 1; i >= 0; i--) {
const m = messages[i]!; const m = messages[i]!;
if (m.ctx_used == null || m.ctx_max == null) continue; if (used === null && m.ctx_used != null) used = m.ctx_used;
if (m.ctx_max <= 0) continue; if (max === null && m.ctx_max != null && m.ctx_max > 0) max = m.ctx_max;
return { used: m.ctx_used, max: m.ctx_max }; if (used !== null && max !== null) break;
} }
return null; return used !== null && max !== null ? { used, max } : null;
} }
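To make the behavioral change in `latestPair` concrete, here is a small self-contained sketch of the new independent scan. `CtxMessage` and `latestPairSketch` are hypothetical stand-ins (the real `Message` type carries many more fields); the sample values are invented for illustration.

```typescript
// Hypothetical minimal message shape; only the two fields the scan reads.
interface CtxMessage {
  ctx_used: number | null;
  ctx_max: number | null;
}

// Newest-first scan that takes the latest ctx_used and the latest positive
// ctx_max independently, so they may come from different messages.
function latestPairSketch(messages: CtxMessage[]): { used: number; max: number } | null {
  let used: number | null = null;
  let max: number | null = null;
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]!;
    if (used === null && m.ctx_used != null) used = m.ctx_used;
    if (max === null && m.ctx_max != null && m.ctx_max > 0) max = m.ctx_max;
    if (used !== null && max !== null) break;
  }
  return used !== null && max !== null ? { used, max } : null;
}

// The newest turn reports usage but omits the context window.
const turns: CtxMessage[] = [
  { ctx_used: 1_000, ctx_max: 200_000 },
  { ctx_used: 5_000, ctx_max: null },
];
const pair = latestPairSketch(turns);
const noWindowYet = latestPairSketch([{ ctx_used: 1_000, ctx_max: null }]);
```

With this input the old same-message rule would have skipped the newest turn and reported the stale usage of 1,000 tokens; the independent scan pairs the fresh 5,000 with the last known window of 200,000.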
interface ColorTier { interface ColorTier {


@@ -1,11 +1,12 @@
import { useEffect, useState } from 'react'; import { useEffect, useState } from 'react';
import type { ReactNode } from 'react'; import type { ReactNode } from 'react';
import { ChevronDown, ChevronRight, Copy, RefreshCw, Check, Share2, RotateCw, GitFork, Trash2, Brain, History } from 'lucide-react'; import { ChevronDown, ChevronRight, Copy, RefreshCw, Check, Share2, RotateCw, GitFork, Trash2, Brain, History, AlertCircle } from 'lucide-react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import type { Chat, ErrorReason, Message } from '@/api/types'; import type { Chat, ErrorReason, Message } from '@/api/types';
import { api } from '@/api/client'; import { api } from '@/api/client';
import { sessionEvents } from '@/hooks/sessionEvents'; import { sessionEvents } from '@/hooks/sessionEvents';
import { sendToTerminal, terminalsRegistry, type TerminalRegistration } from '@/lib/events'; import { sendToTerminal, terminalsRegistry, type TerminalRegistration } from '@/lib/events';
import { shortenModelName } from '@/lib/modelName';
import { CapHitSentinel } from './CapHitSentinel'; import { CapHitSentinel } from './CapHitSentinel';
import { DoomLoopSentinel } from './DoomLoopSentinel'; import { DoomLoopSentinel } from './DoomLoopSentinel';
import { MarkdownRenderer } from './MarkdownRenderer'; import { MarkdownRenderer } from './MarkdownRenderer';
@@ -608,12 +609,12 @@ function SummaryCard({ message }: { message: Message }) {
// Collapsible "Thinking" block for assistant reasoning. Fed by either // Collapsible "Thinking" block for assistant reasoning. Fed by either
// reasoning_text (coder wire / live reasoning_delta stream) or reasoning_parts // reasoning_text (coder wire / live reasoning_delta stream) or reasoning_parts
// (native inference, persisted from message_parts). Auto-expands while the turn // (native inference, persisted from message_parts). Starts COLLAPSED
// is still streaming so the user watches it think (Paseo-style), then stays // (a quiet chip) — for native BooChat/BooCode and the external agents (opencode,
// where the user left it once the turn completes — initial state is captured // claude SDK) alike — so the transcript stays tidy; click to expand. The
// once at mount, so we never fight a manual collapse on later re-renders. // `streaming` pulse still animates while the turn runs.
function ReasoningBlock({ text, streaming }: { text: string; streaming: boolean }) { function ReasoningBlock({ text, streaming }: { text: string; streaming: boolean }) {
const [expanded, setExpanded] = useState(() => streaming); const [expanded, setExpanded] = useState(false);
return ( return (
<div className="max-w-[90%] rounded-lg border bg-muted/30 text-sm"> <div className="max-w-[90%] rounded-lg border bg-muted/30 text-sm">
<button <button
@@ -637,6 +638,76 @@ function ReasoningBlock({ text, streaming }: { text: string; streaming: boolean
); );
} }
// feature #12: mistake-recovery sentinel. Inserted by the backend as a
// role='system', metadata.kind='mistake_recovery' row when the model hit
// repeated *different* errors (distinct from doom_loop, which is the same
// call repeated). Visual treatment mirrors CapHitSentinel / DoomLoopSentinel
// (amber card + alert icon). Non-escalated → recovery guidance was injected
// and the turn continues. Escalated → the turn was stopped; if can_continue
// is set, offer the same Continue affordance as the cap-hit sentinel.
// Loose `!= null` guards per the CLAUDE.md coder-message note (coder rows pass
// metadata as undefined, not null).
function MistakeRecoverySentinel({ message }: { message: Message }) {
const meta = message.metadata;
const isMistakeRecovery =
meta != null && typeof meta === 'object' && meta.kind === 'mistake_recovery';
const failureKinds = isMistakeRecovery ? meta.failure_kinds : [];
const escalated = isMistakeRecovery ? meta.escalated : false;
const canContinue = isMistakeRecovery ? meta.can_continue === true : false;
const [continuing, setContinuing] = useState(false);
async function handleContinue() {
if (continuing || !canContinue) return;
setContinuing(true);
try {
await api.chats.continue(message.chat_id, message.id);
} catch (err) {
toast.error(err instanceof Error ? err.message : 'continue failed');
} finally {
setContinuing(false);
}
}
const kindsLabel =
Array.isArray(failureKinds) && failureKinds.length > 0
? failureKinds.join(', ')
: null;
return (
<div className="rounded-md border border-amber-500/40 bg-amber-500/10 text-sm">
<div className="px-3 py-2 flex items-start gap-2">
<AlertCircle className="size-4 text-amber-500 shrink-0 mt-0.5" />
<div className="flex-1 min-w-0 space-y-1">
<div className="text-xs font-medium text-amber-700 dark:text-amber-300">
{escalated ? 'Repeated errors — turn stopped' : 'Recovering from repeated errors'}
</div>
<div className="text-xs text-muted-foreground">
{escalated
? 'Repeated errors persisted — stopped the turn.'
: kindsLabel
? `Hit repeated different errors (${kindsLabel}) — recovery guidance injected, continuing.`
: 'Hit repeated different errors — recovery guidance injected, continuing.'}
</div>
{escalated && canContinue && (
<div className="pt-1">
<Button
type="button"
size="sm"
variant="outline"
onClick={() => void handleContinue()}
disabled={continuing}
>
{continuing ? 'Continuing…' : 'Continue'}
</Button>
</div>
)}
</div>
</div>
</div>
);
}
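The branching in `MistakeRecoverySentinel` reduces to three display variants plus a Continue flag. A hedged sketch of that reduction follows; `RecoveryMeta`, `RecoveryView`, and `recoveryView` are hypothetical names for illustration, and the metadata shape is inferred only from the fields the component reads.

```typescript
// Hypothetical shape inferred from the fields read in the diff.
interface RecoveryMeta {
  kind: string;
  failure_kinds?: string[];
  escalated?: boolean;
  can_continue?: boolean;
}

interface RecoveryView {
  variant: 'stopped' | 'recovering-with-kinds' | 'recovering';
  kindsLabel: string | null;
  showContinue: boolean;
}

// Loose `!= null` guard so undefined metadata (the coder-row case noted in
// the diff comment) is treated the same as null.
function recoveryView(meta: RecoveryMeta | null | undefined): RecoveryView {
  const isRecovery = meta != null && meta.kind === 'mistake_recovery';
  const kinds = isRecovery && Array.isArray(meta.failure_kinds) ? meta.failure_kinds : [];
  const escalated = isRecovery && meta.escalated === true;
  const kindsLabel = kinds.length > 0 ? kinds.join(', ') : null;
  return {
    variant: escalated ? 'stopped' : kindsLabel ? 'recovering-with-kinds' : 'recovering',
    kindsLabel,
    // Continue is offered only for escalated rows that allow it.
    showContinue: escalated && isRecovery && meta.can_continue === true,
  };
}

const escalatedView = recoveryView({ kind: 'mistake_recovery', escalated: true, can_continue: true });
const guidedView = recoveryView({ kind: 'mistake_recovery', failure_kinds: ['timeout', 'parse'] });
const coderRowView = recoveryView(undefined);
```

One difference from the component: this sketch checks `escalated === true` strictly, whereas the diff uses the raw field value in a truthiness test.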
export function MessageBubble({ export function MessageBubble({
message, message,
sessionChats, sessionChats,
@@ -681,6 +752,13 @@ export function MessageBubble({
return <DoomLoopSentinel message={message} />; return <DoomLoopSentinel message={message} />;
} }
// feature #12: mistake-recovery sentinel. Non-escalated rows narrate that
// recovery guidance was injected mid-turn; escalated rows report the turn
// was stopped and (when can_continue) offer the cap-hit-style Continue.
if (message.role === 'system' && message.metadata?.kind === 'mistake_recovery') {
return <MistakeRecoverySentinel message={message} />;
}
// v1.8.2: tool messages and assistant tool_calls are now rendered by // v1.8.2: tool messages and assistant tool_calls are now rendered by
// MessageList via ToolCallLine / ToolCallGroup. Tool-role messages reach // MessageList via ToolCallLine / ToolCallGroup. Tool-role messages reach
// this point only if MessageList didn't consume them (shouldn't happen, // this point only if MessageList didn't consume them (shouldn't happen,
@@ -691,7 +769,7 @@ export function MessageBubble({
return ( return (
<div className="group flex flex-col items-end gap-1"> <div className="group flex flex-col items-end gap-1">
<SendToTerminalMenu> <SendToTerminalMenu>
<div className="max-w-[80%] rounded-lg bg-primary text-primary-foreground px-3 py-2 text-sm whitespace-pre-wrap break-words min-w-0"> <div className="boo-user-bubble max-w-[80%] rounded-lg bg-primary text-primary-foreground px-3 py-2 text-sm whitespace-pre-wrap break-words min-w-0">
{message.content} {message.content}
</div> </div>
</SendToTerminalMenu> </SendToTerminalMenu>
@@ -705,6 +783,8 @@ export function MessageBubble({
// v1.13.7: match the MessageList.flatten trim guard so a whitespace-only // v1.13.7: match the MessageList.flatten trim guard so a whitespace-only
// assistant turn doesn't render an empty bubble + dangling ActionRow. // assistant turn doesn't render an empty bubble + dangling ActionRow.
const hasContent = message.content.trim().length > 0; const hasContent = message.content.trim().length > 0;
// model-attribution chip: short label for the model that produced this turn.
const modelLabel = shortenModelName(message.model);
// Reasoning arrives as a pre-joined string (coder wire) or as parts (native // Reasoning arrives as a pre-joined string (coder wire) or as parts (native
// inference). Read whichever is present; loose ?? chain tolerates the coder // inference). Read whichever is present; loose ?? chain tolerates the coder
// shape where reasoning_parts is undefined (see CLAUDE.md null-guard note). // shape where reasoning_parts is undefined (see CLAUDE.md null-guard note).
@@ -746,6 +826,14 @@ export function MessageBubble({
)} )}
</div> </div>
)} )}
{!isStreaming && (modelLabel || null) && (
<span
className="inline-flex w-fit items-center rounded-full border border-primary/25 bg-primary/10 px-2 py-0.5 text-[10px] font-mono text-primary/90"
title={message.model ?? undefined}
>
{modelLabel}
</span>
)}
{!isStreaming && <StatsLine message={message} />} {!isStreaming && <StatsLine message={message} />}
{!isStreaming && hasContent && ( {!isStreaming && hasContent && (
<ActionRow <ActionRow


@@ -0,0 +1,148 @@
import { Code, Columns2, History, MessageSquare, Plus, RotateCcw, Terminal, X } from 'lucide-react';
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu';
import { cn } from '@/lib/utils';
// Shared pane-header action cluster: + (new) / Split / Reopen-closed-pane /
// Session history / Close. Rendered in the chat tab bar (ChatTabBar) and the
// desktop coder + terminal pane headers (Workspace) so all pane kinds share one
// control set. Extracted to avoid a divergent copy per header.
interface Props {
// When provided, the "+" menu item matching `tabKind` opens an in-pane tab
// (e.g. chat panes: New BooChat → tab; coder panes: New BooCode → tab). Every
// OTHER kind splits into a new pane. When onNewTab is omitted (terminal
// panes, which can't host tabs) all three items split.
onNewTab?: () => void;
// The host pane's own kind — the "+" item of this kind becomes "new tab".
// Defaults to 'chat' for back-compat with the chat tab bar.
tabKind?: 'chat' | 'terminal' | 'coder';
onSplitPane: (kind: 'chat' | 'terminal' | 'coder') => void;
onReopenPane?: () => void;
onShowHistory: () => void;
onRemovePane?: () => void;
// Highlights the History button when the pane is showing the landing page.
historyActive?: boolean;
// Positioning/spacing supplied by the parent (e.g. "ml-auto px-1").
className?: string;
}
const BTN =
'inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]';
export function PaneHeaderActions({
onNewTab,
tabKind = 'chat',
onSplitPane,
onReopenPane,
onShowHistory,
onRemovePane,
historyActive,
className,
}: Props) {
// The "+" item of the host pane's own kind adds a tab; every other kind
// splits into a new pane. Falls back to split when onNewTab is absent.
const newOrSplit = (kind: 'chat' | 'terminal' | 'coder') =>
onNewTab && tabKind === kind ? onNewTab : () => onSplitPane(kind);
return (
<div className={cn('flex items-center gap-0.5 shrink-0', className)}>
<DropdownMenu>
<DropdownMenuTrigger asChild>
<button
type="button"
onClick={(e) => e.stopPropagation()}
className={BTN}
aria-label="New chat, terminal, or coder"
title="New chat / terminal / coder"
>
<Plus size={12} />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-fit">
{/* The item matching the host pane's kind opens an in-pane tab; the
others split into a new pane. (tabKind defaults to 'chat'.) */}
<DropdownMenuItem onSelect={newOrSplit('chat')}>
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={newOrSplit('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={newOrSplit('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
<DropdownMenu>
<DropdownMenuTrigger asChild>
<button
type="button"
onClick={(e) => e.stopPropagation()}
className={cn(BTN, 'max-md:hidden')}
aria-label="Split pane"
title="Split pane"
>
<Columns2 size={12} />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-fit">
<DropdownMenuItem onSelect={() => onSplitPane('chat')}>
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onSplitPane('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
{onReopenPane && (
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onReopenPane();
}}
className={cn(BTN, 'max-md:hidden')}
aria-label="Reopen closed pane"
title="Reopen closed pane"
>
<RotateCcw size={12} />
</button>
)}
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onShowHistory();
}}
className={cn(BTN, 'max-md:hidden', historyActive && 'text-foreground bg-muted/50')}
aria-label="Session history"
title="Session history"
>
<History size={12} />
</button>
{onRemovePane && (
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onRemovePane();
}}
className={BTN}
aria-label="Close pane"
title="Close pane"
>
<X size={12} />
</button>
)}
</div>
);
}
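The `newOrSplit` routing rule in `PaneHeaderActions` is easy to check in isolation. Below is a minimal sketch outside React; `PaneKind`, `makeNewOrSplit`, and the `log` array are hypothetical helpers for illustration, with the routing expression itself copied from the diff.

```typescript
type PaneKind = 'chat' | 'terminal' | 'coder';

// Builds the same chooser as the component: the "+" item whose kind matches
// the host pane opens a tab (when onNewTab exists); every other item splits.
function makeNewOrSplit(
  tabKind: PaneKind,
  onSplitPane: (kind: PaneKind) => void,
  onNewTab?: () => void,
) {
  return (kind: PaneKind): (() => void) =>
    onNewTab && tabKind === kind ? onNewTab : () => onSplitPane(kind);
}

const log: string[] = [];
const split = (kind: PaneKind): void => { log.push(`split:${kind}`); };
const newTab = (): void => { log.push('tab'); };

// A coder pane that can host tabs.
const coderHeader = makeNewOrSplit('coder', split, newTab);
coderHeader('coder')();    // same kind: in-pane tab
coderHeader('terminal')(); // other kind: split

// A terminal pane passes no onNewTab, so even its own kind splits.
const terminalHeader = makeNewOrSplit('terminal', split);
terminalHeader('terminal')();
```

After these three calls `log` holds `tab`, `split:terminal`, `split:terminal`, which matches the comment in the diff that terminal panes, having no tabs, split for all three items.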


@@ -3,6 +3,8 @@ import { NavLink, useLocation, useNavigate } from 'react-router-dom';
import { ChevronRight, ExternalLink, Folder, MessageSquare, Plus, Settings as SettingsIcon, X, Code } from 'lucide-react'; import { ChevronRight, ExternalLink, Folder, MessageSquare, Plus, Settings as SettingsIcon, X, Code } from 'lucide-react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import { Button } from '@/components/ui/button'; import { Button } from '@/components/ui/button';
import mascot from '@/assets/brand/banner-mascot.png';
import wordmark from '@/assets/brand/banner-wordmark.png';
import { sessionEvents } from '@/hooks/sessionEvents'; import { sessionEvents } from '@/hooks/sessionEvents';
import { import {
ContextMenu, ContextMenu,
@@ -307,9 +309,22 @@ export function ProjectSidebar() {
return ( return (
<aside className={asideCls}> <aside className={asideCls}>
<div className="px-4 py-3 border-b flex items-center justify-between"> <div className="px-2 py-1 border-b flex items-center justify-between gap-1">
<NavLink to="/" className="font-semibold tracking-tight text-base"> {/* BooCode brand banner: mascot badge + >_BooCode wordmark, big and
BooCode visible, on transparent backgrounds (no chip, no blend). */}
<NavLink to="/" aria-label="BooCode home" className="flex items-center gap-0.5 min-w-0 flex-1">
<img
src={mascot}
alt=""
draggable={false}
className="h-12 w-auto select-none shrink-0"
/>
<img
src={wordmark}
alt="BooCode"
draggable={false}
className="h-12 w-auto select-none min-w-0 flex-1 object-contain object-left"
/>
</NavLink> </NavLink>
<div className="flex items-center gap-1"> <div className="flex items-center gap-1">
<Button size="icon-sm" variant="ghost" onClick={() => setAddOpen(true)} aria-label="Add project"> <Button size="icon-sm" variant="ghost" onClick={() => setAddOpen(true)} aria-label="Add project">


@@ -1,7 +1,15 @@
import { useCallback, useEffect, useState } from 'react'; import { useCallback, useEffect, useState } from 'react';
import { Archive, MessageSquare, RotateCcw } from 'lucide-react'; import { Archive, Code, MessageSquare, RotateCcw, Terminal, Trash2 } from 'lucide-react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import { ChatInput } from '@/components/ChatInput'; import { ChatInput } from '@/components/ChatInput';
import { Button } from '@/components/ui/button';
import {
Dialog,
DialogContent,
DialogDescription,
DialogHeader,
DialogTitle,
} from '@/components/ui/dialog';
import { api } from '@/api/client'; import { api } from '@/api/client';
import type { Chat } from '@/api/types'; import type { Chat } from '@/api/types';
@@ -22,6 +30,8 @@ interface Props {
chats: Chat[]; chats: Chat[];
onOpenChat: (chatId: string) => void; onOpenChat: (chatId: string) => void;
onUnarchiveChat: (chatId: string) => Promise<void>; onUnarchiveChat: (chatId: string) => Promise<void>;
onArchiveChat: (chatId: string) => Promise<void>;
onDeleteChat: (chatId: string) => Promise<void>;
} }
function formatRelative(iso: string): string { function formatRelative(iso: string): string {
@@ -42,6 +52,16 @@ function byRecent(a: Chat, b: Chat): number {
return (b.updated_at ?? '').localeCompare(a.updated_at ?? ''); return (b.updated_at ?? '').localeCompare(a.updated_at ?? '');
} }
// Pick the row icon by the chat's seed name: coder and terminal panes create
// placeholder chats named 'BooCoder' / 'Terminal' (see useWorkspacePanes
// chatNameForPaneKind + the coder chat-resolve). A name heuristic keeps this
// frontend-only — matches ProjectSidebar's isCoderSessionName approach.
function iconForChat(name: string | null) {
if (name === 'BooCoder') return Code;
if (name === 'Terminal') return Terminal;
return MessageSquare;
}
export function SessionLandingPage({ export function SessionLandingPage({
projectId, projectId,
sessionId, sessionId,
@@ -53,9 +73,13 @@ export function SessionLandingPage({
chats, chats,
onOpenChat, onOpenChat,
onUnarchiveChat, onUnarchiveChat,
onArchiveChat,
onDeleteChat,
}: Props) { }: Props) {
const [chatId, setChatId] = useState<string | null>(null); const [chatId, setChatId] = useState<string | null>(null);
const [archived, setArchived] = useState<Chat[]>([]); const [archived, setArchived] = useState<Chat[]>([]);
// Plain Cancel/Confirm delete (no type-to-confirm), mirroring ProjectSidebar.
const [deleteConfirm, setDeleteConfirm] = useState<{ id: string; name: string | null } | null>(null);
// Archived chats aren't in the default (open-only) list, so fetch them. One // Archived chats aren't in the default (open-only) list, so fetch them. One
// shot on session change — the history view is transient (pick a chat and // shot on session change — the history view is transient (pick a chat and
@@ -130,25 +154,52 @@ export function SessionLandingPage({
Conversations Conversations
</h3> </h3>
<div className="space-y-0.5 mb-4"> <div className="space-y-0.5 mb-4">
{openChats.map((c) => ( {openChats.map((c) => {
<button const Icon = iconForChat(c.name);
key={c.id} return (
type="button" <div
onClick={() => onOpenChat(c.id)} key={c.id}
className="w-full flex items-center gap-2 text-left px-2 py-1.5 rounded hover:bg-muted text-sm max-md:min-h-[44px]" className="group/row flex items-center gap-2 px-2 py-1.5 rounded hover:bg-muted text-sm max-md:min-h-[44px]"
> >
<MessageSquare size={14} className="shrink-0 text-muted-foreground" /> <button
<span className="truncate shrink-0 max-w-[45%]">{c.name ?? 'New chat'}</span> type="button"
{c.last_message_preview && ( onClick={() => onOpenChat(c.id)}
<span className="truncate flex-1 text-xs text-muted-foreground hidden sm:block"> className="flex items-center gap-2 flex-1 min-w-0 text-left"
{c.last_message_preview} >
</span> <Icon size={14} className="shrink-0 text-muted-foreground" />
)} <span className="truncate shrink-0 max-w-[45%]">{c.name ?? 'New chat'}</span>
<span className="shrink-0 ml-auto text-xs text-muted-foreground"> {c.last_message_preview && (
{formatRelative(c.updated_at)} <span className="truncate flex-1 text-xs text-muted-foreground hidden sm:block">
</span> {c.last_message_preview}
</button> </span>
))} )}
<span className="shrink-0 ml-auto text-xs text-muted-foreground">
{formatRelative(c.updated_at)}
</span>
</button>
<div className="shrink-0 flex items-center gap-0.5 opacity-0 group-hover/row:opacity-100 focus-within:opacity-100 transition-opacity">
<button
type="button"
onClick={(e) => { e.stopPropagation(); void onArchiveChat(c.id); }}
className="inline-flex items-center justify-center size-7 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:size-9"
aria-label="Archive chat"
title="Archive"
>
<Archive size={14} />
</button>
<button
type="button"
onClick={(e) => { e.stopPropagation(); setDeleteConfirm({ id: c.id, name: c.name }); }}
className="inline-flex items-center justify-center size-7 rounded text-muted-foreground hover:bg-destructive/20 hover:text-destructive max-md:size-9"
aria-label="Delete chat"
title="Delete"
>
<Trash2 size={14} />
</button>
</div>
</div>
);
})}
</div> </div>
</> </>
)} )}
@@ -159,21 +210,34 @@ export function SessionLandingPage({
</h3> </h3>
<div className="space-y-0.5"> <div className="space-y-0.5">
{archivedChats.map((c) => ( {archivedChats.map((c) => (
<button <div
key={c.id} key={c.id}
type="button" className="group/arch flex items-center gap-2 px-2 py-1.5 rounded hover:bg-muted text-sm text-muted-foreground max-md:min-h-[44px]"
onClick={() => void restoreAndOpen(c.id)}
title="Restore and open"
className="group/arch w-full flex items-center gap-2 text-left px-2 py-1.5 rounded hover:bg-muted text-sm text-muted-foreground max-md:min-h-[44px]"
> >
<Archive size={14} className="shrink-0" /> <button
<span className="truncate flex-1">{c.name ?? 'New chat'}</span> type="button"
<span className="shrink-0 text-xs">{formatRelative(c.updated_at)}</span> onClick={() => void restoreAndOpen(c.id)}
<RotateCcw title="Restore and open"
size={13} className="flex items-center gap-2 flex-1 min-w-0 text-left"
className="shrink-0 opacity-0 group-hover/arch:opacity-100" >
/> <Archive size={14} className="shrink-0" />
</button> <span className="truncate flex-1">{c.name ?? 'New chat'}</span>
<span className="shrink-0 text-xs">{formatRelative(c.updated_at)}</span>
<RotateCcw
size={13}
className="shrink-0 opacity-0 group-hover/arch:opacity-100"
/>
</button>
<button
type="button"
onClick={(e) => { e.stopPropagation(); setDeleteConfirm({ id: c.id, name: c.name }); }}
className="shrink-0 inline-flex items-center justify-center size-7 rounded hover:bg-destructive/20 hover:text-destructive max-md:size-9 opacity-0 group-hover/arch:opacity-100 focus-within:opacity-100 transition-opacity"
aria-label="Delete chat"
title="Delete"
>
<Trash2 size={14} />
</button>
</div>
))} ))}
</div> </div>
</> </>
@@ -195,6 +259,31 @@ export function SessionLandingPage({
messages={[]} messages={[]}
modelContextLimit={null} modelContextLimit={null}
/> />
<Dialog
open={deleteConfirm !== null}
onOpenChange={(open) => { if (!open) setDeleteConfirm(null); }}
>
<DialogContent>
<DialogHeader>
<DialogTitle>Delete chat?</DialogTitle>
<DialogDescription>
Permanently deletes "{deleteConfirm?.name ?? 'New chat'}" and all its messages. This cannot be undone.
</DialogDescription>
</DialogHeader>
<div className="flex gap-2 justify-end pt-2">
<Button variant="outline" onClick={() => setDeleteConfirm(null)}>Cancel</Button>
<Button
variant="destructive"
onClick={() => {
if (deleteConfirm) void onDeleteChat(deleteConfirm.id);
setDeleteConfirm(null);
}}
>
Delete
</Button>
</div>
</DialogContent>
</Dialog>
</div> </div>
); );
} }
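The seed-name heuristic behind `iconForChat` can be checked without React or lucide-react. This is a minimal sketch, assuming a hypothetical `paneKindForChatName` helper (not part of this diff) that returns a kind string instead of an icon component:

```typescript
// Hypothetical stand-in for iconForChat: returns a kind label rather than
// a lucide-react component so it runs without any UI dependency.
type PaneKind = 'chat' | 'coder' | 'terminal';

function paneKindForChatName(name: string | null): PaneKind {
  // Exact, case-sensitive match on the placeholder names the panes seed.
  if (name === 'BooCoder') return 'coder';
  if (name === 'Terminal') return 'terminal';
  // Unnamed chats and every other name fall back to the chat icon.
  return 'chat';
}

const seededCoder = paneKindForChatName('BooCoder');
const seededTerminal = paneKindForChatName('Terminal');
const renamedCoder = paneKindForChatName('Refactor auth flow');
const unnamed = paneKindForChatName(null);
```

Because the match is on the name alone, a coder chat that has been renamed falls back to the chat icon, and a regular chat renamed to `Terminal` picks up the terminal icon. That is the trade-off of keeping the check frontend-only.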


@@ -149,8 +149,13 @@ export function ToolCallLine({ run, insideGroup }: Props) {
onClick={() => setOpen((v) => !v)} onClick={() => setOpen((v) => !v)}
className="flex items-center gap-1.5 w-full text-left hover:bg-muted/40 rounded px-1 py-0.5 -mx-1" className="flex items-center gap-1.5 w-full text-left hover:bg-muted/40 rounded px-1 py-0.5 -mx-1"
> >
{/* BooCode 2.0: glowing activity indicator (was ↳ / >_) */}
{!insideGroup && ( {!insideGroup && (
<span className="text-muted-foreground/60 select-none shrink-0"></span> <span
className="size-1.5 rounded-full bg-primary shrink-0"
style={{ boxShadow: '0 0 6px var(--primary)' }}
aria-hidden
/>
)} )}
<ChevronRight <ChevronRight
className={`size-3 text-muted-foreground/60 shrink-0 transition-transform ${open ? 'rotate-90' : ''}`} className={`size-3 text-muted-foreground/60 shrink-0 transition-transform ${open ? 'rotate-90' : ''}`}


@@ -1,5 +1,5 @@
import { useEffect, useMemo, useState } from 'react'; import { useEffect, useMemo, useState } from 'react';
import { MessageSquare, Terminal, Code, Clipboard, Plus, X } from 'lucide-react'; import { Terminal, Clipboard } from 'lucide-react';
import { api } from '@/api/client'; import { api } from '@/api/client';
import type { Chat, Project, Session, WorkspacePane } from '@/api/types'; import type { Chat, Project, Session, WorkspacePane } from '@/api/types';
import { MAX_PANES, activePaneChatId, type UseWorkspacePanesResult } from '@/hooks/useWorkspacePanes'; import { MAX_PANES, activePaneChatId, type UseWorkspacePanesResult } from '@/hooks/useWorkspacePanes';
@@ -13,13 +13,8 @@ import { CoderPane } from '@/components/panes/CoderPane';
import { MarkdownArtifactPane } from '@/components/MarkdownArtifactPane'; import { MarkdownArtifactPane } from '@/components/MarkdownArtifactPane';
import { HtmlArtifactPane } from '@/components/HtmlArtifactPane'; import { HtmlArtifactPane } from '@/components/HtmlArtifactPane';
import { ChatTabBar } from '@/components/ChatTabBar'; import { ChatTabBar } from '@/components/ChatTabBar';
import { PaneHeaderActions } from '@/components/PaneHeaderActions';
import { SessionLandingPage } from '@/components/SessionLandingPage'; import { SessionLandingPage } from '@/components/SessionLandingPage';
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger,
} from '@/components/ui/dropdown-menu';
import { cn } from '@/lib/utils'; import { cn } from '@/lib/utils';
interface Props { interface Props {
@@ -65,6 +60,7 @@ export function Workspace({
closeAllTabs, closeAllTabs,
showLandingPage, showLandingPage,
addSplitPane, addSplitPane,
createCoderTab,
removePane, removePane,
reopenPane, reopenPane,
hasClosedPanes, hasClosedPanes,
@@ -219,46 +215,27 @@ export function Workspace({
onRemovePane={panes.length > 1 ? () => removePane(idx) : undefined} onRemovePane={panes.length > 1 ? () => removePane(idx) : undefined}
/> />
)} )}
{/* Coder panes host BooCode tabs (one chat = one agent context,
all sharing the session worktree). "+" adds a tab; the split
button adds a pane. Same tab strip as chat panes (tabKind). */}
{isCoder && !isMobile && ( {isCoder && !isMobile && (
<div className="flex items-center gap-1 border-b border-border px-2 py-1 shrink-0"> <ChatTabBar
<Code size={12} className="text-muted-foreground" /> pane={pane}
<span className="text-xs text-muted-foreground">BooCode</span> tabs={chatsForPane(pane)}
<div className="ml-auto flex items-center gap-1"> tabKind="coder"
<DropdownMenu> tabNumbers={tabNumbers}
<DropdownMenuTrigger asChild> onSwitchTab={(tabIdx) => switchTab(idx, tabIdx)}
<button onRemoveTab={(chatId) => removeTab(idx, chatId)}
type="button" onCloseOthers={(chatId) => closeOtherTabs(idx, chatId)}
onClick={(e) => e.stopPropagation()} onCloseToRight={(chatId) => closeTabsToRight(idx, chatId)}
className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground" onCloseAll={() => closeAllTabs(idx)}
aria-label="New pane" onNewTab={() => void createCoderTab(idx)}
> onSplitPane={(kind) => onAddPane(kind)}
<Plus size={12} /> onReopenPane={hasClosedPanes ? reopenPane : undefined}
</button> onShowHistory={() => showLandingPage(idx)}
</DropdownMenuTrigger> onRename={renameChat}
<DropdownMenuContent align="end" className="w-fit"> onRemovePane={panes.length > 1 ? () => removePane(idx) : undefined}
<DropdownMenuItem onSelect={() => onAddPane('chat')}> />
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onAddPane('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onAddPane('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
{panes.length > 1 && (
<button
type="button"
onClick={(e) => { e.stopPropagation(); removePane(idx); }}
className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground"
aria-label="Close pane"
>
<X size={12} />
</button>
)}
</div>
</div>
)} )}
{isTerminal && ( {isTerminal && (
<div className="flex items-center gap-2 border-b border-border bg-muted/30 px-2 py-1 shrink-0"> <div className="flex items-center gap-2 border-b border-border bg-muted/30 px-2 py-1 shrink-0">
@@ -266,61 +243,31 @@ export function Workspace({
<span className="text-xs text-muted-foreground"> <span className="text-xs text-muted-foreground">
{terminalLabels.get(pane.id) ?? 'Terminal'} {terminalLabels.get(pane.id) ?? 'Terminal'}
</span> </span>
<DropdownMenu> <div className="ml-auto flex items-center gap-0.5">
<DropdownMenuTrigger asChild> {/* v1.10.4: iOS Safari restricts navigator.clipboard.readText
<button outside direct user gestures. A real button click IS a
type="button" gesture, so this works where keystroke-driven paste may
onClick={(e) => e.stopPropagation()} not on iOS. The action lives in TerminalPane behind the
className="ml-auto inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:size-7" registry's paste() callback. */}
aria-label="New pane"
title="New pane"
>
<Plus size={12} />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end" className="w-fit">
<DropdownMenuItem onSelect={() => onAddPane('chat')}>
<MessageSquare size={14} /> New BooChat
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onAddPane('terminal')}>
<Terminal size={14} /> New BooTerm
</DropdownMenuItem>
<DropdownMenuItem onSelect={() => onAddPane('coder')}>
<Code size={14} /> New BooCode
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
{/* v1.10.4: iOS Safari restricts navigator.clipboard.readText
outside direct user gestures. A real button click IS a
gesture, so this works where keystroke-driven paste may
not on iOS. The action lives in TerminalPane behind the
registry's paste() callback. */}
<button
type="button"
onClick={(e) => {
e.stopPropagation();
terminalsRegistry.get(pane.id)?.paste();
}}
className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:size-7"
aria-label="Paste from clipboard"
title="Paste from clipboard"
>
<Clipboard size={12} />
</button>
{panes.length > 1 && (
<button <button
type="button" type="button"
onClick={(e) => { onClick={(e) => {
e.stopPropagation(); e.stopPropagation();
removePane(idx); terminalsRegistry.get(pane.id)?.paste();
}} }}
className="inline-flex items-center justify-center size-5 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:size-7" className="inline-flex items-center justify-center p-1 rounded text-muted-foreground hover:bg-muted hover:text-foreground max-md:min-h-[44px] max-md:min-w-[44px]"
aria-label="Close terminal pane" aria-label="Paste from clipboard"
title="Close terminal pane" title="Paste from clipboard"
> >
<X size={12} /> <Clipboard size={12} />
</button> </button>
)} <PaneHeaderActions
onSplitPane={onAddPane}
onReopenPane={hasClosedPanes ? reopenPane : undefined}
onShowHistory={() => showLandingPage(idx)}
onRemovePane={panes.length > 1 ? () => removePane(idx) : undefined}
/>
</div>
</div> </div>
)} )}
</div> </div>
@@ -395,6 +342,8 @@ export function Workspace({
chats={chats} chats={chats}
onOpenChat={(chatId) => openChatInPane(idx, chatId)} onOpenChat={(chatId) => openChatInPane(idx, chatId)}
onUnarchiveChat={unarchiveChat} onUnarchiveChat={unarchiveChat}
onArchiveChat={archiveChat}
onDeleteChat={deleteChat}
/> />
)} )}
</div> </div>


@@ -11,6 +11,7 @@ export interface CoderMessageWire {
role: 'user' | 'assistant' | 'system'; role: 'user' | 'assistant' | 'system';
content: string; content: string;
status?: 'streaming' | 'complete' | 'failed'; status?: 'streaming' | 'complete' | 'failed';
model?: string | null;
reasoning_text?: string; reasoning_text?: string;
tool_calls?: CoderToolCallWire[]; tool_calls?: CoderToolCallWire[];
} }


@@ -18,6 +18,7 @@ import { mergeWireToolCall } from '@/lib/coder-tools';
import { CoderMessageList, type CoderTimelineWire } from '@/components/panes/CoderMessageList'; import { CoderMessageList, type CoderTimelineWire } from '@/components/panes/CoderMessageList';
import { providerIcon, providerLabel } from '@/components/coder/providerIcons'; import { providerIcon, providerLabel } from '@/components/coder/providerIcons';
import { refreshAgentSessions } from '@/hooks/useAgentSessions'; import { refreshAgentSessions } from '@/hooks/useAgentSessions';
import { useAgentStatus, type AgentStatus, type AgentStatusEntry } from '@/hooks/useAgentStatus';
import { cn } from '@/lib/utils'; import { cn } from '@/lib/utils';
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -29,6 +30,8 @@ interface CoderMessage {
role: 'user' | 'assistant' | 'system'; role: 'user' | 'assistant' | 'system';
content: string; content: string;
status?: 'streaming' | 'complete' | 'failed'; status?: 'streaming' | 'complete' | 'failed';
// model-attribution: which model produced this assistant message (chip).
model?: string | null;
reasoning_text?: string; reasoning_text?: string;
tool_calls?: Array<{ tool_calls?: Array<{
id: string; id: string;
@@ -51,6 +54,46 @@ interface CoderToolMessage {
type CoderTimelineMessage = CoderMessage | CoderToolMessage; type CoderTimelineMessage = CoderMessage | CoderToolMessage;
// Per-chat agent-config cache (provider/model/mode/thinking). Keyed by chat id
// so reopening or switching back to a chat restores the model that was loaded
// last there. Per-device (localStorage) — a UI convenience, not authoritative.
const DEFAULT_AGENT_CONFIG: AgentSessionConfig = {
provider: 'boocode',
model: '',
modeId: null,
thinkingOptionId: null,
};
function agentConfigKey(chatId: string | undefined): string | null {
return chatId ? `boocode.coder.config.${chatId}` : null;
}
function readCachedAgentConfig(chatId: string | undefined): AgentSessionConfig | null {
const key = agentConfigKey(chatId);
if (!key || typeof localStorage === 'undefined') return null;
try {
const raw = localStorage.getItem(key);
if (!raw) return null;
const c = JSON.parse(raw) as Partial<AgentSessionConfig>;
if (typeof c?.provider !== 'string') return null;
return {
provider: c.provider,
model: typeof c.model === 'string' ? c.model : '',
modeId: c.modeId ?? null,
thinkingOptionId: c.thinkingOptionId ?? null,
};
} catch {
return null;
}
}
function writeCachedAgentConfig(chatId: string | undefined, config: AgentSessionConfig): void {
const key = agentConfigKey(chatId);
if (!key || typeof localStorage === 'undefined') return;
try {
localStorage.setItem(key, JSON.stringify(config));
} catch {
/* quota / disabled storage — non-fatal */
}
}
interface PendingChange { interface PendingChange {
id: string; id: string;
file_path: string; file_path: string;
@@ -80,6 +123,14 @@ interface WsHandlers {
onAssistantComplete?: () => void; onAssistantComplete?: () => void;
onAgentCommands?: (taskId: string, commands: AgentCommand[]) => void; onAgentCommands?: (taskId: string, commands: AgentCommand[]) => void;
onConnectedChange?: (connected: boolean) => void; onConnectedChange?: (connected: boolean) => void;
// #10: normalized external-agent status (working|blocked|idle|error) for the
// (chat,agent) carried on the frame. CoderPane records it in a live map and
// feeds the active agent's status to AgentComposerBar's status dot.
onAgentStatus?: (
chatId: string,
agent: string,
entry: AgentStatusEntry,
) => void;
} }
type RawCoderMessage = { type RawCoderMessage = {
@@ -88,6 +139,7 @@ type RawCoderMessage = {
chat_id?: string; chat_id?: string;
content?: string | null; content?: string | null;
status?: string | null; status?: string | null;
model?: string | null;
reasoning_text?: string; reasoning_text?: string;
reasoning_parts?: Array<{ text?: string }> | null; reasoning_parts?: Array<{ text?: string }> | null;
tool_results?: { tool_results?: {
@@ -135,6 +187,7 @@ function mapCoderTimelineRow(raw: RawCoderMessage): CoderTimelineMessage | null
role: raw.role as CoderMessage['role'], role: raw.role as CoderMessage['role'],
content: raw.content ?? '', content: raw.content ?? '',
status: (raw.status ?? 'complete') as CoderMessage['status'], status: (raw.status ?? 'complete') as CoderMessage['status'],
...(raw.model ? { model: raw.model } : {}),
...(reasoning_text ? { reasoning_text } : {}), ...(reasoning_text ? { reasoning_text } : {}),
...(tool_calls?.length ? { tool_calls } : {}), ...(tool_calls?.length ? { tool_calls } : {}),
ctx_used: raw.ctx_used ?? null, ctx_used: raw.ctx_used ?? null,
@@ -244,6 +297,7 @@ function useCoderMessages(sessionId: string, chatId: string | undefined, handler
? { ? {
...m, ...m,
status: 'complete' as const, status: 'complete' as const,
model: (frame as any).model ?? (m as any).model ?? null,
ctx_used: (frame as any).ctx_used ?? (m as any).ctx_used ?? null, ctx_used: (frame as any).ctx_used ?? (m as any).ctx_used ?? null,
ctx_max: (frame as any).ctx_max ?? (m as any).ctx_max ?? null, ctx_max: (frame as any).ctx_max ?? (m as any).ctx_max ?? null,
} }
@@ -326,6 +380,19 @@ function useCoderMessages(sessionId: string, chatId: string | undefined, handler
description: c.description, description: c.description,
})), })),
); );
} else if (frame.type === 'agent_status_updated') {
// #10: { chat_id, agent, status, reason?, at }. The chat_id guard
// above already dropped cross-chat frames; record per (chat,agent).
const chatId = (frame.chat_id ?? scopedChatId) as string | undefined;
const agent = frame.agent as string | undefined;
const status = frame.status as AgentStatus | undefined;
if (chatId && agent && status) {
handlersRef.current.onAgentStatus?.(chatId, agent, {
status,
...(frame.reason ? { reason: frame.reason as string } : {}),
at: (frame.at as string) ?? new Date().toISOString(),
});
}
} }
} catch { } catch {
// ignore unparseable frames // ignore unparseable frames
@@ -564,12 +631,37 @@ export function CoderPane({
onConnectedChange, onConnectedChange,
onAgentLabelChange, onAgentLabelChange,
}: Props) { }: Props) {
const [agentConfig, setAgentConfig] = useState<AgentSessionConfig>({ const [agentConfig, setAgentConfig] = useState<AgentSessionConfig>(
provider: 'boocode', () => readCachedAgentConfig(chatId) ?? DEFAULT_AGENT_CONFIG,
model: '', );
modeId: null, // Restore the per-chat cached config when the chat changes. The ref guard
thinkingOptionId: null, // skips the initial mount (lazy init already loaded it) + StrictMode double-runs.
}); const lastLoadedChatRef = useRef<string | undefined>(chatId);
useEffect(() => {
const prev = lastLoadedChatRef.current;
if (prev === chatId) return;
lastLoadedChatRef.current = chatId;
// undefined → real id: the pane just resolved its chat. A selection made
// while chatId was undefined could not be persisted (the key was null), so
// carry the current in-memory config into the new chat — and persist it —
// rather than clobbering the user's pick with DEFAULT on the cache miss.
if (prev === undefined && chatId) {
const cached = readCachedAgentConfig(chatId);
if (cached) setAgentConfig(cached);
else writeCachedAgentConfig(chatId, agentConfig);
return;
}
setAgentConfig(readCachedAgentConfig(chatId) ?? DEFAULT_AGENT_CONFIG);
}, [chatId, agentConfig]);
// Persist on user-driven changes only (not on the restore above), so switching
// chats never clobbers the new chat's cached config with the old one.
const handleAgentConfigChange = useCallback(
(next: AgentSessionConfig) => {
setAgentConfig(next);
writeCachedAgentConfig(chatId, next);
},
[chatId],
);
useEffect(() => { useEffect(() => {
const parts = [agentConfig.provider || 'boocode']; const parts = [agentConfig.provider || 'boocode'];
@@ -642,6 +734,8 @@ export function CoderPane({
return groups; return groups;
}, [agentCommands, skillItems, agentConfig.provider]); }, [agentCommands, skillItems, agentConfig.provider]);
// #10: live normalized status per (chat,agent), reset on chat switch below.
const agentStatus = useAgentStatus();
const { messages, setMessages, connected, loadMessages } = useCoderMessages(sessionId, chatId, { const { messages, setMessages, connected, loadMessages } = useCoderMessages(sessionId, chatId, {
onConnectedChange, onConnectedChange,
onPermissionRequested: (prompt) => { onPermissionRequested: (prompt) => {
@@ -661,7 +755,21 @@ export function CoderPane({
onAgentCommands: (_taskId, commands) => { onAgentCommands: (_taskId, commands) => {
setLiveTaskCommands(commands); setLiveTaskCommands(commands);
}, },
onAgentStatus: agentStatus.record,
}); });
// Clear any stale status for the previous chat when the pane switches chats so
// a lingering working/blocked dot never carries into the next conversation.
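// Depend on the stable `reset` callback only. The memoized `agentStatus` object
// changes identity on every status update (its `get` depends on the map), so
// listing it here would re-run this cleanup and wipe the entry just recorded.
const resetAgentStatus = agentStatus.reset;
useEffect(() => {
return () => resetAgentStatus(chatId);
}, [chatId, resetAgentStatus]);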
// The active agent's normalized status for this chat. null for native boocode
// (no external status published) or before any frame arrives — gates the dot.
const currentAgentStatus: AgentStatusEntry | null =
agentConfig.provider && agentConfig.provider !== 'boocode'
? agentStatus.get(chatId, agentConfig.provider)
: null;
const { changes, loading, refresh, approve, reject } = usePendingChanges(sessionId); const { changes, loading, refresh, approve, reject } = usePendingChanges(sessionId);
const { checkpointMessageIds, refreshCheckpoints } = useCheckpoints(sessionId, chatId); const { checkpointMessageIds, refreshCheckpoints } = useCheckpoints(sessionId, chatId);
const [input, setInput] = useState(''); const [input, setInput] = useState('');
@@ -689,13 +797,6 @@ export function CoderPane({
} }
}, [messages, refresh, refreshCheckpoints, sessionId]); }, [messages, refresh, refreshCheckpoints, sessionId]);
// The §9b chip only shows once the chat has ≥1 prior turn (a completed
// assistant message). Hidden on a brand-new chat.
const hasPriorTurn = useMemo(
() => messages.some((m) => m.role === 'assistant' && (m as CoderMessage).status === 'complete'),
[messages],
);
// Poll fallbacks when WS is disconnected (reconnect uses WS as source of truth) // Poll fallbacks when WS is disconnected (reconnect uses WS as source of truth)
useEffect(() => { useEffect(() => {
if (!activeTaskId || connected) return; if (!activeTaskId || connected) return;
@@ -963,11 +1064,10 @@ export function CoderPane({
<AgentComposerBar <AgentComposerBar
projectPath={projectPath} projectPath={projectPath}
value={agentConfig} value={agentConfig}
onChange={setAgentConfig} onChange={handleAgentConfigChange}
onProviderCommandsChange={handleProviderCommandsChange} onProviderCommandsChange={handleProviderCommandsChange}
connected={connected} connected={connected}
sessionId={sessionId} agentStatus={currentAgentStatus}
hasPriorTurn={hasPriorTurn}
/> />
{/* Chat area — BooChat-style timeline (text + tool runs as siblings) */} {/* Chat area — BooChat-style timeline (text + tool runs as siblings) */}
<div className="flex-1 min-h-0 flex flex-col"> <div className="flex-1 min-h-0 flex flex-col">
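The per-chat config cache in this file reads defensively: a missing key, corrupt JSON, or a payload without a string `provider` all count as a cache miss. The following is a minimal sketch of that read/write pair, assuming a hypothetical `KeyValueStore` interface and in-memory store in place of `localStorage` so it runs outside a browser. `readConfig`, `writeConfig`, and `createMemoryStore` are illustrative names, not part of the diff.

```typescript
interface AgentSessionConfig {
  provider: string;
  model: string;
  modeId: string | null;
  thinkingOptionId: string | null;
}

// Assumed minimal storage surface (the subset of localStorage used here).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical in-memory stand-in for localStorage.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}

// Same key scheme as the diff; null while the pane has no resolved chat.
const configKey = (chatId: string | undefined): string | null =>
  chatId ? `boocode.coder.config.${chatId}` : null;

function readConfig(store: KeyValueStore, chatId: string | undefined): AgentSessionConfig | null {
  const key = configKey(chatId);
  if (!key) return null;
  try {
    const raw = store.getItem(key);
    if (!raw) return null;
    const c = JSON.parse(raw) as Partial<AgentSessionConfig> | null;
    // A payload without a string provider is treated as a miss.
    if (!c || typeof c.provider !== 'string') return null;
    return {
      provider: c.provider,
      model: typeof c.model === 'string' ? c.model : '',
      modeId: c.modeId ?? null,
      thinkingOptionId: c.thinkingOptionId ?? null,
    };
  } catch {
    return null; // corrupt JSON: fall back to the default config upstream
  }
}

function writeConfig(store: KeyValueStore, chatId: string | undefined, config: AgentSessionConfig): void {
  const key = configKey(chatId);
  if (!key) return; // nothing to persist until the chat id resolves
  try {
    store.setItem(key, JSON.stringify(config));
  } catch {
    /* quota or disabled storage: non-fatal */
  }
}

const store = createMemoryStore();
writeConfig(store, 'chat-1', { provider: 'boocode', model: 'm1', modeId: null, thinkingOptionId: null });
store.setItem('boocode.coder.config.chat-2', '{not json');
const hit = readConfig(store, 'chat-1');
const corrupt = readConfig(store, 'chat-2');
const unresolved = readConfig(store, undefined);
```

Returning `null` for every failure mode lets the caller write `readConfig(...) ?? DEFAULT_AGENT_CONFIG` and never branch on the reason for the miss.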


@@ -0,0 +1,62 @@
import { useCallback, useMemo, useState } from 'react';
// Normalized external-agent status (#10). Consumed from the
// `agent_status_updated` WS frame the coder backend publishes:
// { type: 'agent_status_updated'; chat_id; agent; status; reason?; at }
// BooCoder collapses ~30 vendor lifecycle events into these four buckets:
// working — turn in flight
// blocked — waiting on a permission / approval
// idle — clean completion
// error — crash / failure
export type AgentStatus = 'working' | 'blocked' | 'idle' | 'error';
export interface AgentStatusEntry {
status: AgentStatus;
reason?: string;
at: string;
}
const key = (chatId: string, agent: string): string => `${chatId}:${agent}`;
// Per-(chat,agent) live status map. The dot reflects the latest frame for the
// active agent in the current chat; entries are reset when the chat switches so
// a stale "working"/"blocked" from a previous chat never leaks into the next.
export function useAgentStatus() {
const [map, setMap] = useState<Record<string, AgentStatusEntry>>({});
const record = useCallback(
(chatId: string, agent: string, entry: AgentStatusEntry) => {
setMap((prev) => ({ ...prev, [key(chatId, agent)]: entry }));
},
[],
);
// Drop every entry for a chat (called on chat switch). No-op when nothing
// matches so it's safe to call unconditionally from an effect.
const reset = useCallback((chatId: string | undefined) => {
setMap((prev) => {
if (!chatId) return prev;
const prefix = `${chatId}:`;
let changed = false;
const next: Record<string, AgentStatusEntry> = {};
for (const [k, v] of Object.entries(prev)) {
if (k.startsWith(prefix)) {
changed = true;
continue;
}
next[k] = v;
}
return changed ? next : prev;
});
}, []);
const get = useCallback(
(chatId: string | undefined, agent: string | undefined): AgentStatusEntry | null => {
if (!chatId || !agent) return null;
return map[key(chatId, agent)] ?? null;
},
[map],
);
return useMemo(() => ({ record, reset, get }), [record, reset, get]);
}
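The `record` and `reset` updates can be exercised without React. This is a minimal sketch, assuming hypothetical pure functions `recordStatus` and `resetChat` that apply the same transformations as the `setMap` callbacks above. The agent name `claude` is only an illustrative value.

```typescript
type AgentStatus = 'working' | 'blocked' | 'idle' | 'error';

interface AgentStatusEntry {
  status: AgentStatus;
  reason?: string;
  at: string;
}

type StatusMap = Record<string, AgentStatusEntry>;

// Same composite key as the hook: "<chatId>:<agent>".
const statusKey = (chatId: string, agent: string): string => `${chatId}:${agent}`;

// Hypothetical pure equivalent of `record`: returns a new map with the entry set.
function recordStatus(
  map: StatusMap,
  chatId: string,
  agent: string,
  entry: AgentStatusEntry,
): StatusMap {
  return { ...map, [statusKey(chatId, agent)]: entry };
}

// Hypothetical pure equivalent of `reset`: drops every entry for one chat and
// returns the same object when nothing matched, so a state setter can bail out.
function resetChat(map: StatusMap, chatId: string | undefined): StatusMap {
  if (!chatId) return map;
  const prefix = `${chatId}:`;
  let changed = false;
  const next: StatusMap = {};
  for (const [k, v] of Object.entries(map)) {
    if (k.startsWith(prefix)) {
      changed = true;
      continue;
    }
    next[k] = v;
  }
  return changed ? next : map;
}

const at = '2026-06-02T00:00:00.000Z';
let statuses: StatusMap = {};
statuses = recordStatus(statuses, 'chat-a', 'claude', { status: 'working', at });
statuses = recordStatus(statuses, 'chat-b', 'claude', { status: 'blocked', reason: 'permission', at });
const afterReset = resetChat(statuses, 'chat-a');
const untouched = resetChat(afterReset, 'chat-a'); // nothing left to drop
```

Returning the previous object on a no-op reset matters in the hook: React skips the re-render when a state updater returns the identical value, which is what makes it safe to call `reset` unconditionally from an effect cleanup.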


@@ -1,7 +1,7 @@
import { useEffect, useRef, useState } from 'react'; import { useEffect, useRef, useState } from 'react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import type { Message, WsFrame } from '@/api/types'; import type { Message, WsFrame } from '@/api/types';
import { WsFrameSchema } from '@/api/ws-frames'; import { WsFrameSchema } from '@boocode/contracts/ws-frames';
import { api } from '@/api/client'; import { api } from '@/api/client';
import { sessionEvents } from './sessionEvents'; import { sessionEvents } from './sessionEvents';
import { recordUsage } from './useChatThroughput'; import { recordUsage } from './useChatThroughput';
@@ -40,6 +40,7 @@ function applyFrame(state: State, frame: WsFrame): State {
tokens_used: null, tokens_used: null,
ctx_used: null, ctx_used: null,
ctx_max: null, ctx_max: null,
model: null,
started_at: null, started_at: null,
finished_at: null, finished_at: null,
created_at: new Date().toISOString(), created_at: new Date().toISOString(),
@@ -105,6 +106,7 @@ function applyFrame(state: State, frame: WsFrame): State {
tokens_used: null, tokens_used: null,
ctx_used: null, ctx_used: null,
ctx_max: null, ctx_max: null,
model: null,
started_at: null, started_at: null,
finished_at: null, finished_at: null,
created_at: new Date().toISOString(), created_at: new Date().toISOString(),
@@ -123,6 +125,7 @@ function applyFrame(state: State, frame: WsFrame): State {
...(frame.ctx_max !== undefined ? { ctx_max: frame.ctx_max } : {}), ...(frame.ctx_max !== undefined ? { ctx_max: frame.ctx_max } : {}),
...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}), ...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}),
...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}), ...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}),
...(frame.model !== undefined ? { model: frame.model } : {}),
// v1.8.2: cap-hit sentinels (and future stamped metadata) ride // v1.8.2: cap-hit sentinels (and future stamped metadata) ride
// in on this terminal frame so the reducer can attach it // in on this terminal frame so the reducer can attach it
// without waiting for a refetch. // without waiting for a refetch.
@@ -189,6 +192,12 @@ function applyFrame(state: State, frame: WsFrame): State {
// duplicating async work inside a synchronous reducer. // duplicating async work inside a synchronous reducer.
return state; return state;
} }
case 'agent_status_updated': {
// agent-status-normalize (#10): coder-only frame consumed by CoderPane's
// own WS handler, not BooChat's native message reducer. No-op here to keep
// TS exhaustiveness satisfied (native sessions never emit it).
return state;
}
} }
} }
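The reducer changes in this file rely on two patterns: a conditional spread that only overwrites a field when the frame actually carries it, and an explicit no-op case so the switch stays exhaustive. The following is a minimal sketch under simplified assumptions. The `MessageState` and frame shapes, the `message_id` field, and the `never` default are illustrative and not copied from the real `WsFrame` contract.

```typescript
// Simplified, hypothetical shapes for illustration only.
interface MessageState {
  id: string;
  status: 'streaming' | 'complete';
  model: string | null;
  ctx_used: number | null;
}

interface CompleteFrame {
  type: 'message_complete';
  message_id: string;
  model?: string | null;
  ctx_used?: number | null;
}

interface StatusFrame {
  type: 'agent_status_updated';
  chat_id: string;
  agent: string;
  status: string;
}

type Frame = CompleteFrame | StatusFrame;

function applyFrame(msg: MessageState, frame: Frame): MessageState {
  switch (frame.type) {
    case 'message_complete':
      if (frame.message_id !== msg.id) return msg;
      return {
        ...msg,
        status: 'complete',
        // Absent field: keep the current value. Explicit null: clear it.
        ...(frame.model !== undefined ? { model: frame.model } : {}),
        ...(frame.ctx_used !== undefined ? { ctx_used: frame.ctx_used } : {}),
      };
    case 'agent_status_updated':
      // Handled by a different consumer; a no-op keeps the switch exhaustive.
      return msg;
    default: {
      // Compile-time guard: adding a new frame type without a case fails here.
      const unreachable: never = frame;
      return unreachable;
    }
  }
}

const before: MessageState = { id: 'm1', status: 'streaming', model: 'draft-model', ctx_used: 10 };
const kept = applyFrame(before, { type: 'message_complete', message_id: 'm1' });
const cleared = applyFrame(before, { type: 'message_complete', message_id: 'm1', model: null });
const ignored = applyFrame(before, {
  type: 'agent_status_updated',
  chat_id: 'c1',
  agent: 'claude',
  status: 'working',
});
```

Checking `!== undefined` rather than truthiness is the design choice that lets a frame distinguish "no information" from "explicitly null".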


@@ -1,5 +1,5 @@
import { useEffect } from 'react'; import { useEffect } from 'react';
import { WsFrameSchema } from '@/api/ws-frames'; import { WsFrameSchema } from '@boocode/contracts/ws-frames';
import { sessionEvents } from './sessionEvents'; import { sessionEvents } from './sessionEvents';
import { createWsReconnectToast } from './wsReconnectToast'; import { createWsReconnectToast } from './wsReconnectToast';


@@ -188,6 +188,8 @@ export interface UseWorkspacePanesResult {
// id to update mobile URL state so the URL-sync effect doesn't fight the // id to update mobile URL state so the URL-sync effect doesn't fight the
// freshly-set activePaneIdx. // freshly-set activePaneIdx.
addSplitPane: (kind: 'chat' | 'terminal' | 'coder') => string | null; addSplitPane: (kind: 'chat' | 'terminal' | 'coder') => string | null;
/** Append a new BooCode tab to an existing coder pane (the coder "+"). */
createCoderTab: (paneIdx: number) => Promise<void>;
// Open-on-first-click, close-on-second-click. Singleton — settings panes // Open-on-first-click, close-on-second-click. Singleton — settings panes
// don't count toward MAX_PANES. Closing the only remaining pane (edge case) // don't count toward MAX_PANES. Closing the only remaining pane (edge case)
// falls back to an empty pane to preserve the "always one pane" invariant. // falls back to an empty pane to preserve the "always one pane" invariant.
@@ -265,6 +267,42 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
     [sessionId, attachChatToPane, markPaneChatPending],
   );
+  // Add a new BooCode tab to an existing coder pane (the "+" in the coder pane
+  // header). Creates a fresh chat row (= a new agent context that shares the
+  // session worktree) and APPENDS it to the pane's chatIds, keeping the pane
+  // kind 'coder' and focusing the new tab. Mirrors createChat for chat panes;
+  // the per-pane "split into a new pane" action stays addSplitPane.
+  const createCoderTab = useCallback(
+    async (paneIdx: number) => {
+      const paneId = panes[paneIdx]?.id;
+      if (!paneId) return;
+      markPaneChatPending(paneId, true);
+      try {
+        const chat = await api.chats.create(sessionId, { name: chatNameForPaneKind('coder') });
+        setPanes((prev) => {
+          const idx = prev.findIndex((p) => p.id === paneId);
+          if (idx < 0) return prev;
+          const pane = prev[idx]!;
+          const newIds = [...pane.chatIds, chat.id];
+          const next = [...prev];
+          next[idx] = {
+            ...pane,
+            kind: 'coder',
+            chatId: chat.id,
+            chatIds: newIds,
+            activeChatIdx: newIds.length - 1,
+          };
+          return next;
+        });
+      } catch (err) {
+        toast.error(err instanceof Error ? err.message : 'Failed to create coder tab');
+      } finally {
+        markPaneChatPending(paneId, false);
+      }
+    },
+    [sessionId, panes, markPaneChatPending],
+  );
   const seedEmptyScopedPanes = useCallback(
     (paneList: WorkspacePane[]) => {
       for (const pane of paneList) {
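
Reviewer note: the state update inside `createCoderTab` can be read as a pure function over the pane list, which makes the "append and focus" behaviour easy to check in isolation. The sketch below assumes a simplified pane shape; `PaneSketch` and `appendCoderTab` are hypothetical names, not part of the diff.

```typescript
// Hypothetical, reduced pane shape (the real WorkspacePane has more fields).
interface PaneSketch {
  id: string;
  kind: 'chat' | 'terminal' | 'coder' | 'empty';
  chatId?: string;
  chatIds: string[];
  activeChatIdx: number;
}

// Append newChatId to the pane identified by paneId and focus it.
// The pane is looked up by id, not by index, so the update still lands on the
// right pane if the list was reordered while the create request was in flight.
function appendCoderTab(panes: PaneSketch[], paneId: string, newChatId: string): PaneSketch[] {
  const idx = panes.findIndex((p) => p.id === paneId);
  if (idx < 0) return panes; // pane was closed meanwhile: leave the list untouched
  const pane = panes[idx]!;
  const chatIds = [...pane.chatIds, newChatId];
  const next = [...panes];
  next[idx] = {
    ...pane,
    kind: 'coder',
    chatId: newChatId,
    chatIds,
    activeChatIdx: chatIds.length - 1,
  };
  return next;
}
```

Returning the original array when the pane is missing mirrors the `return prev` early exit in the hunk, so React would see no state change in that case.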
@@ -426,16 +464,16 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
   }, [sessionId, panes, tabNumbers, nextTabNumber, closedPaneStack]);
   // v2.6.x (Batch 3a): maintain stable, session-scoped tab numbers. Collect the
-  // chat ids that appear in CHAT-kind panes in deterministic order (pane index,
-  // then tab index). Assign numbers to any without one (global per session,
-  // only ever increasing, never reused) and prune entries whose chat is no
-  // longer in any chat-kind pane. Guarded against render loops: only setState
-  // when something actually changed.
+  // chat ids that appear in CHAT- or CODER-kind panes in deterministic order
+  // (pane index, then tab index). Assign numbers to any without one (global per
+  // session, only ever increasing, never reused) and prune entries whose chat
+  // is no longer in any tab-hosting pane. Guarded against render loops: only
+  // setState when something actually changed.
   useEffect(() => {
     const liveChatIds: string[] = [];
     const liveSet = new Set<string>();
     for (const pane of panes) {
-      if (pane.kind !== 'chat') continue;
+      if (pane.kind !== 'chat' && pane.kind !== 'coder') continue;
       for (const id of pane.chatIds) {
         if (!liveSet.has(id)) {
           liveSet.add(id);
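
Reviewer note: the hunk is cut off before the assignment and pruning steps that its comment describes. One way those steps could look is sketched below as a pure function; `TabPane`, `TabNumbering`, and `reconcileTabNumbers` are hypothetical names, and the real effect keeps this data in React state (`tabNumbers`, `nextTabNumber`) rather than a single object.

```typescript
interface TabPane {
  kind: 'chat' | 'coder' | 'terminal' | 'empty';
  chatIds: string[];
}

interface TabNumbering {
  numbers: Record<string, number>; // chat id -> stable tab number
  next: number;                    // next number to hand out, never decreases
}

function reconcileTabNumbers(panes: TabPane[], prev: TabNumbering): TabNumbering {
  // 1. Collect live chat ids in deterministic order (pane index, then tab index).
  const live: string[] = [];
  const seen = new Set<string>();
  for (const pane of panes) {
    if (pane.kind !== 'chat' && pane.kind !== 'coder') continue;
    for (const id of pane.chatIds) {
      if (!seen.has(id)) {
        seen.add(id);
        live.push(id);
      }
    }
  }
  // 2. Keep existing numbers, assign fresh ones, and drop ids that are gone.
  const numbers: Record<string, number> = {};
  let next = prev.next;
  let changed = false;
  for (const id of live) {
    const existing = prev.numbers[id];
    if (existing !== undefined) {
      numbers[id] = existing;
    } else {
      numbers[id] = next++;
      changed = true;
    }
  }
  // With no new ids, a differing key count means at least one entry was pruned.
  if (Object.keys(prev.numbers).length !== live.length) changed = true;
  // 3. Render-loop guard: hand back the previous object when nothing changed.
  return changed ? { numbers, next } : prev;
}
```

Because `next` only ever grows, a number freed by a closed tab is not handed out again, which matches the "never reused" wording in the comment.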
@@ -597,9 +635,9 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
       const pane = next[paneIdx]!;
       const keepIdx = pane.chatIds.indexOf(keepChatId);
       if (keepIdx < 0) return prev;
+      // Preserve pane.kind (...pane) — a coder pane stays a coder pane.
       next[paneIdx] = {
         ...pane,
-        kind: 'chat',
         chatId: keepChatId,
         chatIds: [keepChatId],
         activeChatIdx: 0,
@@ -640,13 +678,23 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
   const showLandingPage = useCallback((paneIdx: number) => {
     setPanes((prev) => {
       const pane = prev[paneIdx];
-      // Coder/terminal panes are not chat hosts — history button is chat-only.
-      if (!pane || pane.kind === 'coder' || pane.kind === 'terminal') return prev;
+      if (!pane) return prev;
       const next = [...prev];
-      next[paneIdx] = { ...pane, kind: 'empty', chatId: undefined };
+      if (pane.kind === 'coder' || pane.kind === 'terminal') {
+        // Scoped panes don't host chat tabs. Leaving one for the session
+        // history closes it: drop the pane→chat binding, and for terminals
+        // kill the tmux session (terminals are ephemeral — closing = killing,
+        // mirroring removePane).
+        if (pane.kind === 'terminal') {
+          api.terminals.kill(sessionId, pane.id).catch(() => { /* non-fatal */ });
+        }
+        next[paneIdx] = { ...pane, kind: 'empty', chatId: undefined, chatIds: [], activeChatIdx: -1 };
+      } else {
+        next[paneIdx] = { ...pane, kind: 'empty', chatId: undefined };
+      }
       return next;
     });
-  }, []);
+  }, [sessionId]);
   const addSplitPane = useCallback((kind: 'chat' | 'terminal' | 'coder'): string | null => {
     // Generate the id outside the updater so we can return it deterministically.
@@ -944,6 +992,7 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
     closeAllTabs,
     showLandingPage,
     addSplitPane,
+    createCoderTab,
     toggleSettingsPane,
     removePane,
     reopenPane,


@@ -0,0 +1,32 @@
+// model-attribution: turn a raw model id into a short, friendly label for the
+// per-message model chip (e.g. "claude-sonnet-4-6" → "Sonnet 4.6",
+// "qwen3.6-35b-a3b-mxfp4" → "Qwen3.6 35B"). Strips provider prefixes and maps
+// the common families; falls back to the cleaned id so unknown models still
+// read. Returns null for empty/absent input so the caller can skip the chip.
+export function shortenModelName(model: string | null | undefined): string | null {
+  if (!model) return null;
+  let m = model.trim();
+  if (!m) return null;
+  // opencode / provider-prefixed ids: "llama-swap/qwen…", "anthropic/claude…".
+  const slash = m.lastIndexOf('/');
+  if (slash >= 0) m = m.slice(slash + 1);
+  // claude-{opus,sonnet,haiku}-X-Y[-date] → "Opus X.Y".
+  const claude = /^claude-(opus|sonnet|haiku)-(\d+)-(\d+)/i.exec(m);
+  if (claude) {
+    const tier = claude[1]!.charAt(0).toUpperCase() + claude[1]!.slice(1).toLowerCase();
+    return `${tier} ${claude[2]}.${claude[3]}`;
+  }
+  // qwen3.6-35b-a3b-… → "Qwen3.6 35B".
+  const qwen = /^qwen([\d.]+)-(\d+)b/i.exec(m);
+  if (qwen) return `Qwen${qwen[1]} ${qwen[2]}B`;
+  // gpt-4o, gpt-5-… → "GPT-4o" / "GPT-5".
+  const gpt = /^gpt-([\w.-]+)/i.exec(m);
+  if (gpt) return `GPT-${gpt[1]}`;
+  // Fallback: keep the id readable, cap the length for the chip.
+  return m.length > 26 ? `${m.slice(0, 25)}` : m;
+}
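
Reviewer note: a few sample inputs make the mapping rules concrete. The block below is a condensed copy of the function as it appears in this diff (export keyword dropped, comments shortened) followed by illustrative calls; the `gpt-5-preview` and `mistral-7b` ids are made-up inputs chosen only to exercise the GPT and fallback branches. As written, the GPT pattern captures everything after `gpt-`, so any suffix is kept verbatim.

```typescript
// Condensed copy of shortenModelName from this diff, for experimentation only.
function shortenModelName(model: string | null | undefined): string | null {
  if (!model) return null;
  let m = model.trim();
  if (!m) return null;
  // Drop a provider prefix such as "anthropic/" or "llama-swap/".
  const slash = m.lastIndexOf('/');
  if (slash >= 0) m = m.slice(slash + 1);
  // claude-<tier>-X-Y[-date] -> "<Tier> X.Y"
  const claude = /^claude-(opus|sonnet|haiku)-(\d+)-(\d+)/i.exec(m);
  if (claude) {
    const tier = claude[1]!.charAt(0).toUpperCase() + claude[1]!.slice(1).toLowerCase();
    return `${tier} ${claude[2]}.${claude[3]}`;
  }
  // qwen<version>-<N>b-... -> "Qwen<version> <N>B"
  const qwen = /^qwen([\d.]+)-(\d+)b/i.exec(m);
  if (qwen) return `Qwen${qwen[1]} ${qwen[2]}B`;
  // gpt-<rest> -> "GPT-<rest>" (the whole remainder is kept)
  const gpt = /^gpt-([\w.-]+)/i.exec(m);
  if (gpt) return `GPT-${gpt[1]}`;
  // Fallback: ids longer than 26 characters are cut to their first 25.
  return m.length > 26 ? `${m.slice(0, 25)}` : m;
}

const samples: Array<string | null> = [
  'claude-sonnet-4-6',                  // "Sonnet 4.6"
  'anthropic/claude-opus-4-8-20260101', // "Opus 4.8" (prefix and date dropped)
  'llama-swap/qwen3.6-35b-a3b-mxfp4',   // "Qwen3.6 35B"
  'gpt-4o',                             // "GPT-4o"
  'gpt-5-preview',                      // "GPT-5-preview" (hypothetical id)
  'mistral-7b',                         // "mistral-7b" (fallback, hypothetical id)
  '   ',                                // null (blank input)
  null,                                 // null
];

for (const s of samples) {
  console.log(JSON.stringify(s), '->', JSON.stringify(shortenModelName(s)));
}
```

A caller would typically skip rendering the chip when the result is `null`, as the header comment of the new file describes.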

Some files were not shown because too many files have changed in this diff.