Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
local lane (llama-swap models, serial) and cloud lane (parallel).
Added to all three registries: @boocode/contracts WsFrameSchema,
server InferenceFrame, and web WsFrame.
Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column
Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar
Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:
1. FAST_MODEL config: optional env var routes cheap LLM calls (titles,
summaries, labeling) to a smaller model on llama-swap. auto_name.ts
uses ctx.config.FAST_MODEL ?? session.model. Set FAST_MODEL=nemotron-
nano-4b to avoid loading the 35B model for 20-token title generation.
2. Tool-use summaries (services/inference/tool-summaries.ts): utility
that generates "git-commit-subject-style" labels for tool batches via
a fast-model LLM call. System prompt + truncation logic ported from
Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
for BooCoder's dispatcher to call after task completion.
3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
PTY dispatch builds: qwen -p "<task>" --output-format stream-json
(NDJSON structured events over stdout). Env: OPENAI_BASE_URL +
OPENAI_API_KEY points Qwen Code at llama-swap. execution_path CHECK
constraint extended with 'qwen'.
4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
task to N contestants (2-5, each with different agent/model), each
getting its own task row linked by arena_id UUID. GET /api/arena/:id
shows all contestants. POST /api/arena/:id/select/:task_id marks
winner. Schema: arena_id column added to tasks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>