Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
local lane (llama-swap models, serial) and cloud lane (parallel).
Added to all three registries: @boocode/contracts WsFrameSchema,
server InferenceFrame, and web WsFrame.
Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column
Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar
Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore
Brings the deterministic Han-flow conductor into BooCode: launch any read-only
flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style
run pane, get an evidence-disciplined report — on local Qwen, persisted and
resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator
tasks fail closed if qwen is unavailable; never fall to write-capable native).
Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema,
flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes
(launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames.
Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows
(BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and
split menus, runs history + export. Plan: openspec/changes/orchestrator.
Second checkpoint of in-flight work (sessions route, api types, ChatTabBar,
PaneHeaderActions, Workspace, useWorkspacePanes) so the Orchestrator branch
can rebase onto current main before merge.
Checkpoint of in-flight work so the orchestrator branch can rebase onto a
clean main: ContextBar → ContextMeter, model-label helper, model/agent picker
+ provider-snapshot/registry changes, inference payload + message-columns.
In-flight workspace UX work.
- Extract a shared PaneHeaderActions cluster (+/Split/Reopen/History/Close)
used by ChatTabBar + the Workspace coder/terminal pane headers, replacing the
divergent per-header copies; SessionLandingPage history + useWorkspacePanes
tweaks.
- Fix coder-side correctness bug: resolveChatId read sessions.workspace_panes as
a bare WorkspacePane[] but v2.6.5 widened it to a WorkspaceState envelope, so
it mis-read panes and clobbered tabNumbers/nextTabNumber/closedPaneStack on
every pane-chat write. New normalizeWorkspaceState handles either shape and
preserves the envelope (+ regression test).
- CLAUDE.md doc-sync (coder vitest suite, deploy-by-surface, dual-remote push,
in-flight-web-WIP staging, release-branch naming).
Web tsc + coder build + coder tests green. Builds on v2.7.6.
A cohesive batch of pane/tab UX + the persisted workspace-state model (grouped
because the changes interleave across useWorkspacePanes, ChatTabBar, Workspace,
sessionEvents and the api types/client):
- Open a whole chat in a fresh pane via a new open_chat_in_new_pane event:
ChatTabBar tab context menu "Open in new pane", and MessageBubble.fork() now
lands the fork beside the original instead of replacing the active pane.
openChatInNewPane detaches the chat from any pane already holding it
(one-chat-per-pane).
- The tab-bar "+" becomes a New BooChat/BooTerm/BooCode menu (chat as a tab,
term/coder as split panes); the split button is unchanged.
- Drop the per-message "Open in pane" button (it opened a single message's
artifact) and its dead code; the artifact-pane machinery is left orphaned for
a later teardown.
- Session history: the empty/landing pane lists the session's open chats plus
archived chats (fetched separately), click to open / restore-and-open.
- Relocate-on-close: closing a chat pane moves its tabs (in order) into the
oldest chat/empty pane instead of discarding them; terminal/coder panes close
as before. Reopen strips the restored chatIds from all live panes first, so a
relocated-then-reopened pane never duplicates a tab — no stack-shape change.
- Stable global tab numbering: tabNumbers/nextTabNumber assigned on chat-pane
open, retired on close (never reused), rendered map-keyed (not positional).
- workspace_panes is now a WorkspaceState envelope { panes, tabNumbers,
nextTabNumber, closedPaneStack }; the reopen stack moved from a module-level
array into the persisted envelope so it survives reload. Hydrate/persist
normalize the legacy bare-array shape. appendClosed dedupes a value-identical
top entry to neutralize the StrictMode double-invoke of the setPanes updater.
Two independent fixes:
- opencode-server.ts: stripDcpTags() removes <dcp-message-id>…</dcp-message-id>
tags from text deltas before they reach the frame/DB. Applied to all three
text paths (session.next.text.delta, message.part.delta text field,
handleUpdatedPart text type). Reasoning/tool paths untouched.
- useWorkspacePanes.ts: module-level closedPaneStack (capped at 10) captures
pane kind + chatIds on removePane and removeTab auto-remove. reopenPane()
pops the stack and re-attaches a new pane to the existing chat ids (chats
survive pane close server-side). hasClosedPanes drives conditional render.
- ChatTabBar.tsx: [+] is now instant new-tab (no dropdown); split-pane
dropdown (Columns2 icon) opens Chat/Term/Code in a new pane; reopen button
(RotateCcw icon) appears when closed panes exist.
- Workspace.tsx: pass reopenPane + hasClosedPanes through to ChatTabBar.
Two independent UI/UX fixes:
- auto_name.ts: pass the session's own model as fallbackModel to
taskModelCompletion, so chat rename uses whatever model is already
loaded on llama-swap instead of forcing a swap to DEFAULT_MODEL
(which times out at 10s when a different model is active).
- useWorkspacePanes.ts: when the last tab in a pane is closed and
other panes exist, remove the pane entirely instead of leaving an
orphaned empty panel.
Opening the settings pane on mobile set activePaneIdx, but the ?pane= URL-sync
effect snapped it back to the chat pane on the panes change, so the pane never
showed. toggleSettingsPane now returns the new pane id (id generated outside the
updater, strict-mode safe); Session's toggleSettingsAndSync pushes ?pane=<id> on
mobile when opening (and drops it on close) so the sync effect keeps it active —
mirrors the existing addPaneAndSwitch pattern. Desktop unaffected.
Integrates BooCoder as a 'coder' workspace pane within the existing
BooChat SPA at code.indifferentketchup.com. Renamed the placeholder
'agent' pane kind to 'coder' across all types, menus, hooks, and
mobile switcher (Icon: Code instead of Bot).
CoderPane.tsx: split layout with chat area (messages via WS to
boocoder:9502, input bar posting to /api/coder/sessions/:id/messages)
and diff panel (pending changes with Approve/Reject per change plus
Approve All/Reject All). Reuses MarkdownRenderer for message content.
Proxy: Vite dev config adds /api/coder → boocoder:9502 (ordered above
/api per CLAUDE.md proxy-ordering rule). Production: Fastify route in
apps/server/src/index.ts proxies /api/coder/* to http://boocoder:3000
via fetch() pass-through. WS connects directly to :9502 (same
Tailscale network, no proxy needed for WebSocket upgrade).
WorkspacePaneKind mirror updated in both apps/web and apps/server
types. useWorkspacePanes gains coderPane() factory (replaces the old
agent toast stub). Workspace.tsx switch renders CoderPane for
pane.kind === 'coder'.
Every assistant message gets an "Open in pane" affordance that opens the
message in the workspace splitter — Markdown pane (Copy + Download .md) by
default; HTML pane (Download .html only) when the model emits a self-contained
<!DOCTYPE html> or fenced ```html artifact. BOOCHAT.md rule keeps Markdown
default at every length; HTML opt-in on explicit user request.
Backend: services/artifacts.ts (slug derivation + write helpers with
symlink-escape guard via realpath-after-mkdir), routes/artifacts.ts (POST
download + GET stream with nosniff + CSP sandbox defense-in-depth), HTML
detection in finalizeCompletion writing a new message_parts.kind='html_artifact'
row (schema CHECK extended via v1.13.13 pattern), graceful 1MB cap via the
pure decideHtmlArtifactWrite helper. PartKind union extended.
Frontend: MarkdownRenderer.tsx extracted from MessageBubble's inline
MarkdownBody for reuse; MarkdownArtifactPane.tsx + HtmlArtifactPane.tsx with
loading/error states; pane state is reference-only ({chat_id, message_id,
title}) — content fetched on mount to keep workspace_panes jsonb small and
avoid 1MB blobs riding session_workspace_updated frames. iframe sandbox
locked to allow-scripts allow-clipboard-write allow-downloads with no
allow-same-origin, srcDoc not src. openInPane discriminates 404 (expected
fallback) from real errors (toast + bail). PanelRightOpen icon button with
mobile 44px tap-target.
31 new server unit tests including a real-symlink filesystem case; 332/332
server tests passing, tsc clean both sides, pnpm -C apps/web build green.
Smoke deferred to first deploy.
Status indicator (StatusDot): drops the flat amber pulse for a richer set
of states — orbiting amber for streaming, spinning sky ring for tool_running,
static violet for waiting_for_input, plus the existing idle/error. Backend
chat_status frame widens from 'working|idle|error' to discriminate streaming
vs tool execution vs paused for user input.
Workspace pane sync: pane layout moves from per-device localStorage to
server-side sessions.workspace_panes jsonb. PATCH /api/sessions/:id/workspace
broadcasts session_workspace_updated on the user channel for cross-device live
sync. Echo dedup via JSON comparison so the round-trip frame doesn't loop.
Legacy localStorage seeds the server on first hydrate, then is deleted.
Deprecated session_panes table dropped.
Resilience: startup sweep marks any stale 'streaming' message older than
5 minutes as 'failed' so v1.12.0-style hung rows clear on container restart.
useWorkspacePanes gains validatePanes() to prune dead chatId references from
saved pane state when the chat list lands.
Five issues + keyboard shortcuts across booterm and the workspace shell.
Auto-switch on create (mobile): addSplitPane now returns the new pane id;
Session.tsx wraps it with addPaneAndSwitch which pushes ?pane=<newId> on
mobile so the URL-sync effect doesn't fight the just-set activePaneIdx.
NewPaneMenu uses the wrapper; desktop Split dropdown is unaffected.
Tab-away reconnect: TerminalPane has a connect()/manualReconnect() state
machine. ws.onclose backs off 500ms/1s/2s × 3 attempts, then surfaces a
[Disconnected] banner with a Reconnect button. visibilitychange listener
calls manualReconnect when the tab returns and the WS isn't OPEN. tmux
session persists server-side so scrollback is intact on resume.
Copy/paste: attachCustomKeyEventHandler binds Cmd/Ctrl-C (copy if
selection, else send ^C), Cmd/Ctrl-Shift-C (always swallow — copy if any,
no-op otherwise — never sends ^C), Cmd/Ctrl-V and Cmd/Ctrl-Shift-V
(navigator.clipboard.readText → ws.send). No custom right-click menu —
browser's native menu is preserved.
Scroll: removed `set -g mouse on` from tmux.conf so xterm.js sees wheel
and touch events natively. scrollback: 10_000, fastScrollModifier: 'shift',
altClickMovesCursor: false. Container has touch-action: pan-y for mobile.
Right-edge gap: inline <style> overrides xterm's defaults to width:100%
height:100% and hides the scrollbar chrome. Host container is
flex-1 min-w-0 self-stretch w-full. Three refit triggers: ResizeObserver
(rAF-wrapped), document.fonts.ready, and useEffect on the new active prop.
Background color matched between outer div, inner div, and xterm theme.
Keyboard shortcuts in Session.tsx (window-level keydown):
Cmd/Ctrl+` focus active terminal, else jump to last
Cmd/Ctrl+Shift+T new terminal pane
Cmd/Ctrl+Shift+C new chat pane (defers to xterm copy if focused)
Cmd/Ctrl+W close active pane
Cmd/Ctrl+Tab/Shift+Tab cycle next / prev pane
Cmd/Ctrl+1..9 jump to pane N
terminalsRegistry gains a focus() callback per registration so Cmd+`
can call term.focus() on the active terminal.
The v1.9 settings pane had no way to dismiss once opened. ChatTabBar
(which owns the per-pane close X for chat panes) is skipped for
settings panes, and the pane header itself only rendered the maximize
toggle (desktop-only). Mobile users had zero controls beyond the
section tabs.
Add three close paths:
- X button in SettingsPane header, visible on mobile + desktop, sits
next to the maximize toggle. Tap-target sized per the v1.6 mobile
convention (max-md:min-h-[44px]).
- Esc when the settings pane is the active pane and no input/textarea/
dialog has focus. Maximize-restore still wins when maximized.
- Sidebar Settings button is now a strict toggle: opens on first click,
closes on second. Renamed openOrFocusSettingsPane →
toggleSettingsPane in the panes hook.
Edge case: removing the settings pane when it's the only pane left
falls back to an empty pane to preserve the "always one pane"
invariant. In normal flow this is unreachable (the toggle only
appends), but defensive against future entry points.
Adds a singleton, ephemeral 'settings' pane kind to the workspace.
Opened via a new bottom-pinned button in ProjectSidebar (emits an
open_settings_pane event when a session is mounted; navigates to
/settings otherwise). Pane has three sections — Session, Project,
Theme — and a maximize toggle that hides sibling pane columns via
display:none on desktop only. Settings panes don't count toward
MAX_PANES and are filtered out of the localStorage persistence layer
so reload always restores a clean workspace.
Schema (additive):
- projects.default_system_prompt TEXT NOT NULL DEFAULT ''
- projects.default_web_search_enabled BOOLEAN NOT NULL DEFAULT false
- sessions.web_search_enabled BOOLEAN (nullable; null = inherit)
Inference resolves user_prompt = session.system_prompt.trim() ||
project.default_system_prompt.trim() — empty/whitespace at either
layer means "no override". Keeps the columns NOT NULL and matches
the existing inherit semantics.
Server routes:
- GET /api/projects/:id (new; settings pane refetches on
project_updated)
- PATCH /api/projects/:id accepts default_system_prompt,
default_web_search_enabled
- PATCH /api/sessions/:id accepts web_search_enabled (tri-state)
- POST /api/projects/:id/sessions/archive-all + GET
/api/projects/:id/sessions/open-count
- POST /api/sessions/:id/chats/archive-all + GET
/api/sessions/:id/chats/open-count
- PATCH /api/sessions/:id now broadcasts session_updated on every
successful PATCH (was rename-only). Lets SettingsPane open in
another tab pick up edits without a refetch.
Bulk-archive publishes one session_archived / chat_archived frame
per affected id so useSidebar's existing reducer cases handle them
incrementally — no new frame type, no payload widening.
ModelPicker refactored: shared ModelList inside a responsive shell.
Desktop = labeled trigger + DropdownMenu, mobile = icon-only Cpu
button + BottomSheet. Header in Session.tsx drops the pill wrap on
mobile since the new trigger is the visual.
ChatInput gains an icon-only '+' DropdownMenu next to AgentPicker
when sessionId + webSearchEnabled props are provided. One item for
now — Web search — with a checkmark reflecting the stored value
(true), not the effective one. Click PATCHes the override; to
restore inherit-from-project the user opens SettingsPane.
ThemePicker lifted out of pages/Settings.tsx into a reusable
component. The standalone /settings route is now a thin wrapper
that mounts <ThemePicker /> with a Back button on top
(navigate(-1) with fallback to '/'); the SettingsPane Theme tab
renders the same picker bare.
Project section delete-flow removed (button + confirm dialog +
handler). Replaced with "Archive all sessions" using the same
two-step count → confirm → fire pattern as "Archive all chats" in
the Session section. api.projects.remove() stays in the client
because useProjects.ts still uses it.
Hand-rolled Switch primitive in SettingsPane (no shadcn switch in
the project; spec said no new deps). Section nav is plain buttons
(no shadcn Tabs).