feat(web,coder): arena pane — compare 2-6 AI competitors on same prompt

Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
a local lane (llama-swap models, run serially) and a cloud lane (run
in parallel).
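
The two-lane scheduling described above could be sketched roughly as
follows. This is an illustration only, not the arena-runner code: the
Contestant shape, the RunFn callback, and runBattle are hypothetical
names invented for the example.

```typescript
// Hypothetical types; the commit message does not show the real Contestant shape.
type Lane = 'local' | 'cloud';

interface Contestant {
  id: string;
  lane: Lane;
}

// Stand-in for whatever actually sends the prompt to a model.
type RunFn = (c: Contestant) => Promise<string>;

// Run local contestants one at a time and cloud contestants all at once,
// with both lanes in flight concurrently.
async function runBattle(
  contestants: Contestant[],
  run: RunFn,
): Promise<Map<string, string>> {
  if (contestants.length < 2 || contestants.length > 6) {
    throw new Error('a battle needs 2-6 contestants');
  }

  const results = new Map<string, string>();
  const record = async (c: Contestant): Promise<void> => {
    results.set(c.id, await run(c));
  };

  const local = contestants.filter((c) => c.lane === 'local');
  const cloud = contestants.filter((c) => c.lane === 'cloud');

  // Local lane: serial, each contestant waits for the previous one.
  const localLane = (async () => {
    for (const c of local) {
      await record(c);
    }
  })();

  // Cloud lane: parallel, every contestant starts immediately.
  const cloudLane = Promise.all(cloud.map(record));

  await Promise.all([localLane, cloudLane]);
  return results;
}
```

In this sketch a failure in either lane rejects the whole battle; a real
scheduler would more likely record a per-contestant error and carry on.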

The battle_* frames are added to all three frame registries:
@boocode/contracts WsFrameSchema, server InferenceFrame, and web
WsFrame.
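
To show why a frame has to be known to every registry, here is a
minimal sketch of a discriminated-union frame type with a runtime
parser. It does not reproduce the real WsFrameSchema: the payload
fields, the chat_delta frame, parseFrame, and describeFrame are all
assumptions made for the example. Only the battle_updated frame name
comes from this commit.

```typescript
// Hypothetical payload: the commit names the battle_updated frame but not its fields.
interface BattleUpdatedFrame {
  type: 'battle_updated';
  battleId: string;
  winnerId: string | null;
}

// Hypothetical pre-existing frame, included only to make the union non-trivial.
interface ChatDeltaFrame {
  type: 'chat_delta';
  text: string;
}

type WsFrame = BattleUpdatedFrame | ChatDeltaFrame;

// Validate an incoming message; unknown or malformed frames yield null.
function parseFrame(raw: string): WsFrame | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;
  const fields = data as Record<string, unknown>;

  switch (fields['type']) {
    case 'battle_updated': {
      const battleId = fields['battleId'];
      const winnerId = fields['winnerId'];
      if (typeof battleId !== 'string') return null;
      if (typeof winnerId !== 'string' && winnerId !== null) return null;
      return { type: 'battle_updated', battleId, winnerId };
    }
    case 'chat_delta': {
      const text = fields['text'];
      return typeof text === 'string' ? { type: 'chat_delta', text } : null;
    }
    default:
      return null; // a frame missing from this registry is dropped here
  }
}

// Exhaustive handling: adding a frame to WsFrame without a case fails to compile.
function describeFrame(frame: WsFrame): string {
  switch (frame.type) {
    case 'battle_updated':
      return `battle ${frame.battleId} winner: ${frame.winnerId ?? 'none'}`;
    case 'chat_delta':
      return `chat: ${frame.text}`;
    default: {
      const unreachable: never = frame;
      return unreachable;
    }
  }
}
```

A frame that is registered on the server but missing from the parser
falls through to the default branch and is silently discarded, which is
the failure mode that keeping the registries in sync avoids.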

Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
  writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column
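
The status-transition and resume decisions in arena-decisions are
described as pure, unit-tested logic. A sketch of what that kind of
module might look like follows. The status names, the transition
table, and both function names are hypothetical; the commit message
does not list the real ones.

```typescript
// Hypothetical status set; the real one is not shown in this commit message.
type Status = 'pending' | 'running' | 'done' | 'failed' | 'stopped';

// Assumed transition table: terminal states allow no further moves.
const ALLOWED: Record<Status, readonly Status[]> = {
  pending: ['running', 'stopped'],
  running: ['done', 'failed', 'stopped'],
  done: [],
  failed: [],
  stopped: [],
};

// Pure check, easy to unit-test without a database or a running battle.
function canTransition(from: Status, to: Status): boolean {
  return ALLOWED[from].includes(to);
}

// Assumed resume rule: rerun whatever never reached a terminal state.
function contestantsToResume<T extends { status: Status }>(
  contestants: readonly T[],
): T[] {
  return contestants.filter(
    (c) => c.status === 'pending' || c.status === 'running',
  );
}
```

Keeping these decisions free of I/O is what makes them straightforward
to cover with unit tests, as the list above notes.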

Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar

Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 23:25:29 +00:00
parent e04d0fdaa8
commit d6d246c15b
34 changed files with 4581 additions and 146 deletions

@@ -1,4 +1,4 @@
-import { Code, Columns2, History, MessageSquare, Plus, RotateCcw, Terminal, Workflow, X } from 'lucide-react';
+import { Code, Columns2, History, MessageSquare, Plus, RotateCcw, Swords, Terminal, Workflow, X } from 'lucide-react';
import {
DropdownMenu,
DropdownMenuContent,
@@ -19,6 +19,8 @@ interface Props {
// When provided, shows a "New Orchestrator" item that opens the flow launcher.
// Orchestrators are always split (run-bound; can't live as a tab in another pane).
onNewOrchestrator?: () => void;
+// When provided, shows a "New Arena" item that opens the arena launcher.
+onNewArena?: () => void;
onReopenPane?: () => void;
onShowHistory: () => void;
onRemovePane?: () => void;
@@ -35,6 +37,7 @@ export function PaneHeaderActions({
onNewTab,
onSplitPane,
onNewOrchestrator,
+onNewArena,
onReopenPane,
onShowHistory,
onRemovePane,
@@ -71,6 +74,11 @@ export function PaneHeaderActions({
<Workflow size={14} /> New Orchestrator
</DropdownMenuItem>
)}
+{onNewArena && (
+<DropdownMenuItem onSelect={onNewArena}>
+<Swords size={14} /> New Arena
+</DropdownMenuItem>
+)}
</DropdownMenuContent>
</DropdownMenu>
@@ -101,6 +109,11 @@ export function PaneHeaderActions({
<Workflow size={14} /> New Orchestrator
</DropdownMenuItem>
)}
+{onNewArena && (
+<DropdownMenuItem onSelect={onNewArena}>
+<Swords size={14} /> New Arena
+</DropdownMenuItem>
+)}
</DropdownMenuContent>
</DropdownMenu>