Commit Graph

19 Commits

Author SHA1 Message Date
378e29308e fix: add cache_tokens/reasoning_tokens to Message constructors in useSessionStream 2026-06-08 01:27:31 +00:00
203cfd2fa8 feat: DeepSeek API integration + Whale lift (hooks, tool repair, MCP permissions, token tracking)
DeepSeek API:
- @ai-sdk/deepseek provider replaces openai-compatible for deepseek-* models
- Token tracking: cache_hit/reasoning tokens flow API → DB → WS frames → UI
- thinking effort levels (off/low/medium/high/xhigh/max) via AGENTS.md frontmatter
- V4 models: deepseek-v4-flash, deepseek-v4-pro
- Wired for both chat and coder panes

Whale lifts:
- Tool input repair (schema-based type coercion, markdown link unwrapping)
- Hooks system (6 lifecycle events, shell exec, JSON stdin/stdout contract)
- Per-MCP-server permissions (allow/ask/deny)
- token tracking UI (cache N, think N in message stats line)

Infra:
- New DB columns: messages.cache_tokens, messages.reasoning_tokens
- New WS frame fields: cache_tokens, reasoning_tokens on message_complete
- coder provider snapshot merges DeepSeek models alongside llama-swap
2026-06-08 01:24:23 +00:00
d6d246c15b feat(web,coder): arena pane — compare 2-6 AI competitors on same prompt
Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
local lane (llama-swap models, serial) and cloud lane (parallel).

Added to all three registries: @boocode/contracts WsFrameSchema,
server InferenceFrame, and web WsFrame.

Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
  writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column

Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar

Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 23:25:29 +00:00
1937af8df9 feat: in-app Orchestrator (Phase 2) — multi-agent conductor
Brings the deterministic Han-flow conductor into BooCode: launch any read-only
flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style
run pane, get an evidence-disciplined report — on local Qwen, persisted and
resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator
tasks fail closed if qwen is unavailable; never fall to write-capable native).

Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema,
flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes
(launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames.
Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows
(BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and
split menus, runs history + export. Plan: openspec/changes/orchestrator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 15:22:48 +00:00
d8bb2dabfe feat: git diff panel (Files/Git tab in the file browser)
Adds a Git tab to the right-side file panel that shows the project
repository's diff and lets the user stage, unstage, commit, and discard
whole files in-session. Two comparison modes (Uncommitted vs HEAD, and the
branch vs its base — upstream tracking branch else default branch), auto-
selected by repo state on first open and pinned after explicit choice;
per-file expand/collapse with lazy syntax-highlighted diffs, +/- stats, and
binary/large-file placeholders. All git read and write logic lives in
apps/server via a new git_diff service: argv-safe execFile only (never a
shell), per-file paths validated repo-relative through pathGuard with a
realpath symlink-escape check, server-derived commit identity (the request
carries no author fields), and the write endpoints are deliberately absent
from the assistant tool registry. Reads are bounded (30s deadline, 10MB);
an index lock or an in-progress merge/rebase/cherry-pick/bisect surfaces as
"repository busy" and disables writes. The panel stays current via a client
git_diff_refresh session event (no new wire contract) coalesced across tab
open, mutations, turn completion, and pending-change apply. Discard is an
irrecoverable hard-delete behind a plain confirm that distinguishes
reverting a tracked file from deleting an untracked one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 03:18:41 +00:00
649ce71eff feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace
package, @boocode/contracts, consumed by server/web/coder/coder-web via
workspace:* + a per-subpath exports map. The ws-frames and provider-config
Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason,
AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are
each single-sourced. Deletes the byte-identical copies and their parity
tests, fixes a live AgentSessionConfig drift (coder dead copy removed,
unified to the web required/nullable shape), removes the dead pending_change
WS arms in the fallback SPA, and inverts the build order (contracts builds
first) across root build, Dockerfile, and the coder deploy docs. Reverses
the shared-package decision declined in v2.5.12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:08 +00:00
3a646fd6df feat: BooCode 2.0 UI — Ember theme, brand banner, coder tabs, model-attribution chips
- Ember theme (Obsidian charcoal + #ff7a18 orange), now DEFAULT_THEME_ID; server theme_id whitelist gains 'ember'
- Brand banner: transparent Westie mascot + >_BooCode wordmark, big/edge-to-edge (flood-filled to transparency + cropped)
- Coder panes are multi-tab: + opens a BooCode tab, split opens a pane (shared ChatTabBar via tabKind + createCoderTab; closeOtherTabs/tab-numbering extended to coder)
- Model-attribution: new messages.model column stamped at finalizeCompletion (BooChat/native coder) + dispatcher assistant-row creation (external coder); surfaced via view + wire types + live frame; rendered as a subtle shortened-name chip (shortenModelName)
- Composer Web toggle moved into a boxed focus-ringed input; glowing accent dot on tool rows
- Claude SDK follow-ups (1M context, follow-up-message fix, collapsed thinking/tool chips) + CLAUDE_SDK_BACKEND=1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:30:47 +00:00
59cf082e06 feat: normalized external-agent status (#10 scoped) (v2.7.6)
Scoped half of boocode_code_review_v2 §1 #10 — publish the agent status
BooCoder already observes (the config-injection notify-hook is the documented
follow-on, clean-room from superset ELv2).

- agent_status_updated WS frame (working|blocked|idle|error), server+web parity.
- Published from the dispatcher's turn boundaries (warm-acp/opencode/sdk/pty:
  working at start, idle/error at end) + the permission flow (blocked/working).
  Best-effort, never breaks a turn.
- Clean-room normalizeAgentEvent helper (superset's vendor-event -> Start/blocked
  /Stop collapse, event names as facts) + 25 tests — reused by the follow-on.
- AgentComposerBar status dot (distinct from the WS-liveness dot), tracked per
  (chat,agent) by a useAgentStatus map in CoderPane.

Built by 2 parallel agents vs a pinned frame contract. Server 545 + coder 294
tests passing (25 new); web tsc + builds clean; ws-frames parity green. Clears
the actionable review backlog (#1/#3/#4/#6-#12). Builds on v2.7.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:04:04 +00:00
990a615b87 web(coder UI): ChatInput migration + Thinking render + DiffPanel route fix
Bundles in-progress working-tree UI work not authored this session (CoderPane ChatInput migration, AgentComposerBar/CoderMessageList/tab-bar/sidebar/pane refinements, provider icons) with this session's changes to the same files: MessageBubble renders a collapsible 'Thinking' block from reasoning_text/reasoning_parts (surfacing ACP agent_thought_chunk + native reasoning), and the DiffPanel approve/reject calls are repointed to the real /api/coder/pending/:id/apply and /reject routes (the old /sessions/:id/pending/:id/approve|reject paths did not exist).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:06 +00:00
93d3f86c2b v2.2-paseo-providers: Paseo provider stack + v2.2.1 pane-scoped chat fixes
Ship Paseo-equivalent provider snapshot, AgentComposerBar, ACP dispatch
rewrite with streaming/persist, permission prompts, and agent commands.
Follow-up: pane-scoped chat resolution, CoderMessageList tool timeline,
WS user-delta replace, and inference orphan tool_call stripping.
Archive openspec v2-2; update CHANGELOG and CURRENT.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 15:18:31 +00:00
8b568b36d3 v1.13.11-a: WS frame schemas + frontend receive validation
First half of the WebSocket-frame-typing batch (split per recon — total
scope was ~535 LoC, larger than the roadmap's ~300 estimate, so the
server-side publish-site conversion lands separately in v1.13.11-b).

Phase A scope:

(1) apps/server/src/types/ws-frames.ts (NEW) — Zod schemas for all 27
wire-format WS frame types. Discriminated union (WsFrameSchema) plus
KNOWN_FRAME_TYPES const for diagnostic lookup. UUIDs are z.string().
uuid(); model-emitted tool_call_id stays z.string().min(1) since OpenAI-
compatible APIs emit "call_<random>" not UUID. Per-kind payload narrowing
(tool args, message_parts payloads) intentionally stays z.unknown() —
frame-level drift detection is the goal; deep payload validation is
follow-up work.

(2) apps/web/src/api/ws-frames.ts (NEW) — byte-identical mirror of the
authoritative server file. No path alias from web→server in the existing
tsconfig setup; sync-by-hand was chosen over a new packages/shared/ dir.
A ws-frames.test.ts test asserts the two files match.

(3) apps/server/src/services/broker.ts — adds publishFrame() and
publishUserFrame() methods to the Broker interface. Both validate via
WsFrameSchema and fail-closed: log + drop on invalid. createBroker now
accepts an optional FastifyBaseLogger so validation failures land in
the pino stream (with console.error fallback for unit tests). The
existing publish() / publishUser() raw methods stay legal — they get
converted to the typed variants in v1.13.11-b.

(4) apps/web/src/hooks/useSessionStream.ts + useUserEvents.ts — wrap
ws.onmessage with WsFrameSchema.safeParse. Fail-closed: invalid frames
log + return without dispatching. Hand-maintained WsFrame and
SessionEvent types stay in place; one cast bridges Zod-typed → narrowed
shape (Zod uses OpaqueObject for nested Message[] / WorkspacePane[] etc.,
which are dev-time-narrowed via the existing hand-maintained types).

(5) apps/web/package.json — adds zod ^3.23.8 as a direct dep. Was a
transitive dep via ai-sdk / postgres; promotion makes the import legal.

(6) Tests: 15 new in ws-frames.test.ts covering happy-path per major
frame type, drift-catchers (unknown type, invalid enum, non-UUID, negative
tokens), parts-authoritative read variants, the mirror-file diff check,
and four broker fail-closed scenarios. 219/219 server tests pass (was
204; +15 new).

Two recon corrections to the dispatch brief, both flagged before
implementation:

- No 'parts_appended' frame exists. The brief assumed one; the codebase
  reads parts via the messages_with_parts view after message_complete
  triggers a refetch. MessagePartSchema is therefore unused this batch.
- No 'tool_running' frame exists. The brief listed it as standalone; it
  is in fact a 'chat_status' variant ({ status: 'tool_running' }), already
  covered by ChatStatusFrame.

Smoke: clean container boot, no validation errors in the server log. Real
production frames pass validation (the schemas were derived from the
existing hand-maintained types in api/types.ts and sessionEvents.ts).

v1.13.11-b will follow immediately: convert all ~85 raw broker.publish /
ctx.publish call sites across 11 server files to publishFrame /
publishUserFrame. Mechanical edit; the wiring done here means the diff
in -b is just the call-site swaps.

~310 LoC across 9 files (4 new + 5 modified).
2026-05-22 15:48:32 +00:00
a7104691aa v1.12.2: live tok/s + ctx display next to status indicator
ChatThroughput renders inline beside StatusDot while streaming or
tool_running. Subscribes to existing usage frames via sessionEvents.
Hides when status drops to idle/error or data is older than 10s.

Addresses the 2026-05-21 spike's UX gap where slow streams looked
identical to dead streams — now there's a live token velocity readout
that immediately distinguishes the two.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:45:53 +00:00
dc43dd44f9 v1.11: opencode-style compaction port
- compaction.ts: usable/isOverflow/estimate/turns/select/buildPrompt/process
- compaction-prompt.ts: SUMMARY_TEMPLATE verbatim from opencode
- schema: messages.{compacted_at,summary,tail_start_id} + chats.needs_compaction
- inference: auto-trigger on overflow, pre-fetch compaction before next turn
- /compact slash command rewired to new path
- WS: chat_status working/idle around compaction + compacted frame
- frontend: SummaryCard + sonner toast on compacted
- 24 unit tests for pure functions
2026-05-20 19:05:35 +00:00
5c61cc7281 v1.8.2: tool loop cap-hit summary + tool call UI compaction
Old hardcoded MAX_TOOL_LOOP_DEPTH=15 replaced by per-agent
max_tool_calls (1-100, AGENTS.md frontmatter) with defaults: 30 for
read-only-only agents, 10 for agents that include any non-read-only
tool, 15 for raw chat. When the loop hits cap, fire one final summary
call with tools disabled, stream the wrap-up into the in-flight
assistant message, then insert a system sentinel with
metadata.kind='cap_hit'. The sentinel renders an amber bubble with a
Continue button (latest sentinel only) that POSTs to a new
/api/chats/:id/continue route to extend. Hard ceiling: 3 cap-hits per
chat (2 continues max) — third sentinel reports can_continue=false.

Error frames carry a machine-readable reason code alongside human
error text. Failed messages persist the reason via
metadata.kind='error' so the bubble renders specifics on reload (WS
error frame is one-shot).

Tool call UI rewired: ToolCallLine renders inline (↳ name args
spinner/check/✗, expand-on-tap for args+result); ToolCallGroup
collapses 3+ consecutive same-tool runs into a compact card.
MessageList owns a three-pass pre-render (flatten + fold tool
results onto matching runs by id + group same-tool runs + number
sentinels). MessageBubble drops tool rendering and adds the
sentinel / error-reason branches. ToolCallCard deleted.

Roadmap follow-up logged: add explicit max_tool_calls: 30 to the 6
agents in /data/AGENTS.md and /opt/boocode/AGENTS.md post-ship for
discoverability (defaults handle behavior identically).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 10:31:32 +00:00
12d91c9a12 v1.8.1: global agents + parser robustness + WS reconnect toast
Builtins move out of code into /data/AGENTS.md (always-on, mounted ro
into the container); per-project AGENTS.md is now an optional override.
agents.ts merges global + project entries with project-wins-by-name and
caches per-source mtimes (60s TTL). Parser switches to per-block
try/catch and returns AgentsResponse { agents, errors[] } so one
malformed block no longer fails the file. AgentPicker shows a
non-blocking amber chip listing skipped blocks and only fires a gray
toast when zero agents loaded.

WS reconnect UX (useUserEvents + useSessionStream) now silent on the
first disconnect; createWsReconnectToast escalates to gray after 3
failures or 15 s, then to red with a Retry Now action after 60 s.
useSessionStream also gained the exponential-backoff reconnect it was
missing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 23:16:02 +00:00
2f6be39efd chore: surface swallowed errors + remove dead session_renamed paths
Swallowed-error logging (audit Feature 3):
- file_index.ts:36-37 (git mtime probes): comment — best-effort, project
  may not be a git repo.
- useUserEvents.ts:44 / 53 (ws.close on error / unmount): comments —
  best-effort, socket may already be closing.
- RightRail.tsx:38 (localStorage write): comment — best-effort, quota or
  private mode.
- App.tsx:21 (api.sessions.get for RightRail projectId): replaced silent
  catch with console.warn.
- Session.tsx:38, 41 (session fetch + project list for breadcrumb):
  replaced silent catches with console.warn.

H1: ProjectSidebar.tsx:189 — dropped the local sessionEvents.emit
({type:'session_renamed'}) after PATCH. Server publishes via
broker.publishUser since v1.4; useUserEvents forwards.

H2: useSessionStream.ts session_renamed case removed (dead — no
server code path publishes session_renamed on the per-session WS
channel; only user channel via broker.publishUser). Also dropped the
session_renamed variant from WsFrame (in apps/web/src/api/types.ts)
to keep the discriminated-union switch exhaustive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 04:35:49 +00:00
c35ec65fc4 batch4: chats-in-sessions, force-send, /compact, right-rail file browser
Session 1:N Chat data model with backfill. Workspace switches to client-side
multi-tab pane management. Right-rail file browser with float-over viewer and
click-drag line selection replaces FileBrowserPane. Adds /compact streaming
summarizer (respects compact markers in context builder), force-send (cancels
in-flight, persists partial as 'cancelled', awaits cancellation completion via
deferred Promise + 5s timeout), message queue, stop generation, chat
auto-rename, session archive/unarchive with Closed Sessions section on repo
landing page. CHECK constraints on sessions.status, messages.role,
messages.status with KEEP IN SYNC comments tying to MESSAGE_ROLES /
MESSAGE_STATUSES const arrays. Deletes dead pane routes/hook and the
api.panes.* client block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 20:39:48 +00:00
2464d23bb6 v1.1 batch 1: markdown, message actions, tok/s+ctx, AI naming
Four features land together on this branch:

1. Markdown rendering — assistant messages go through react-markdown +
   remark-gfm. Fenced code blocks render via existing CodeBlock (with copy
   button); inline `code` is styled inline. User messages stay plain text.
   No raw HTML (no rehype-raw).

2. Per-message Copy + Regenerate. New endpoint
   POST /api/sessions/:id/messages/:message_id/regenerate validates the
   target (404/400/409), atomically deletes the target plus any later
   messages in the session, inserts a fresh streaming assistant row, and
   enqueues a normal inference run. The DELETE bound uses a SQL subquery
   (`created_at >= (SELECT created_at FROM messages WHERE id = $1)`)
   instead of a JS round-trip so postgres TIMESTAMPTZ µs precision is
   preserved — otherwise sub-ms clock_timestamp() differences between the
   user row and the assistant row collapsed to the same JS Date, pulling
   the triggering user message into the >= bound. New `messages_deleted`
   WS frame so already-connected clients prune the stale tail without
   needing a full snapshot resend.

3. tok/s + ctx counter. Five new nullable message columns: tokens_used,
   ctx_used, ctx_max, started_at, finished_at. started_at is set right
   before the OpenAI call in services/inference.ts (not in the route, not
   in the frame handler); finished_at + tokens_used + ctx_used + ctx_max
   are committed in the same UPDATE that flips status to 'complete'. The
   inference request now opts into stream_options.include_usage so the
   final chunk carries usage; defensive parsing also picks up timings.n_ctx
   when llama.cpp emits it (currently absent for our llama-swap models, so
   ctx_max stays NULL and the UI just shows `<used> ctx`). message_complete
   frame extended with tokens_used / ctx_used / ctx_max / started_at /
   finished_at / model. Frontend StatsLine in MessageBubble computes tok/s
   client-side from the timestamps and renders muted mono text below the
   body of completed assistant messages.

4. AI chat naming after the first turn. Backend services/auto_name.ts
   runs via setImmediate after the top-level inference resolves; it
   checks that there is exactly one completed assistant message and that
   the session has not been user-renamed (`name IS NULL OR name = '' OR
   name = 'New session'`), then fires a single non-streaming chat
   completion with the spec prompt. Qwen3 chat templates emit chain-of-
   thought into reasoning_content and burn the entire max_tokens budget
   without producing visible output, so the request includes
   `chat_template_kwargs: { enable_thinking: false }` and max_tokens=30.
   Title is trimmed, quote-stripped, "Title:" prefix dropped, and
   truncated to 60 chars before a guarded UPDATE on sessions.name. New
   `session_renamed` WS frame propagates to the open session view
   directly and to the project's session list via a tiny module-scope
   event bus (apps/web/src/hooks/sessionEvents.ts) — kept dumb: one event
   type, two methods, no library.

Cleanups: dropped the now-unused splitCodeBlocks export from CodeBlock.tsx
(react-markdown supersedes it), and added a long-form NOTE in auto_name.ts
documenting the enable_thinking + max_tokens pattern for any future Qwen-
family non-streaming utility calls (planned: fork-message, agent-routing,
web-search summarization).

Schema bootstrap remains idempotent (ADD COLUMN IF NOT EXISTS). Auth,
broker, clock_timestamp() conventions, and zod validation all unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:52:40 +00:00
a7f218e182 initial 2026-05-14 19:24:50 +00:00