feat(conductor): Wave 2 — parallel batch execution + SWITCH branching step

- Parallel batch execution: batch field on Step, batchConfig on Flow, batch-aware readySteps with maxConcurrent gating, getReadyInBatch helper
- SWITCH branching step: new 'switch' StepKind with cases/programmed conditions, resolveSwitch() pure function, switch-excluded steps tracked in SchedulerState, non-selected branches excluded from execution
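A minimal sketch of how a pure `resolveSwitch()` could pick a branch, assuming the `SwitchCase` shape from `conductor/types.ts`. The trimmed `StepContext` (with an `input` field), the `SwitchResolution` return shape, and the `'default'` label are assumptions for illustration, not the committed implementation.

```typescript
// Trimmed-down stand-ins for the conductor types (assumed for this sketch).
interface StepContext {
  readonly input: string;
}

interface SwitchCase {
  label: string;
  condition: (ctx: StepContext) => boolean;
  stepIds: string[];
}

// Hypothetical result shape; the real function may report this differently.
interface SwitchResolution {
  /** Label of the winning case, or 'default' when no case matched. */
  label: string;
  /** stepIds that remain eligible to run. */
  selected: string[];
  /** stepIds from every non-selected branch; a scheduler would skip these. */
  excluded: string[];
}

// Pure: the first case whose condition holds wins; all other branches are excluded.
function resolveSwitch(
  cases: SwitchCase[],
  defaultBranch: string[],
  ctx: StepContext,
): SwitchResolution {
  const winner = cases.find((c) => c.condition(ctx));
  const selected = winner ? winner.stepIds : defaultBranch;
  const keep = new Set(selected);
  const all = [...cases.flatMap((c) => c.stepIds), ...defaultBranch];
  // De-duplicate, and never exclude a step that the selected branch also lists.
  const excluded = [...new Set(all)].filter((id) => !keep.has(id));
  return { label: winner ? winner.label : 'default', selected, excluded };
}

const resolution = resolveSwitch(
  [
    { label: 'has-tests', condition: (c) => c.input.includes('test'), stepIds: ['run-tests'] },
    { label: 'docs-only', condition: (c) => c.input.endsWith('.md'), stepIds: ['lint-docs'] },
  ],
  ['full-review'],
  { input: 'README.md' },
);
console.log(resolution.label, resolution.selected, resolution.excluded);
```

Keeping the function free of side effects means the same inputs always select the same branch, which makes the exclusion set easy to unit-test and to record in scheduler state.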
feat: Wave 1 complete — state machine, Paseo hub, collision detection, PTY search
2026-06-08 03:00:06 +00:00 · 2026-06-08 02:45:17 +00:00 · 2026-06-08 02:43:45 +00:00 · 2026-06-08 02:28:32 +00:00 · 2026-06-08 02:26:47 +00:00 · 2026-06-08 02:16:02 +00:00
104 changed files with 9446 additions and 253 deletions
--- a/.env.example
+++ b/.env.example
@@ -20,6 +20,12 @@ SEARXNG_URL=http://100.114.205.53:8888
 # with FAST_MODEL when unset.
 # TASK_MODEL_URL=http://100.90.172.55:7995
 # DeepSeek API key. When set, models with IDs starting with 'deepseek-'
 # (e.g. deepseek-chat, deepseek-reasoner, deepseek-v4-flash) route through
 # DeepSeek's API instead of llama-swap. Requires a DeepSeek Platform API key.
 # DEEPSEEK_API_KEY=sk-...
 # DEEPSEEK_BASE_URL=https://api.deepseek.com
 # v1.13.15-tools: BOOCODE_TOOLS narrows the tool whitelist sent to the LLM.
 # Unset (default) → all tools (~21k schema). Useful primarily for single-purpose
 # sessions where the model only needs read-only filesystem access.
--- a/.omo/plans/paseo-orchestrator.md
+++ b/.omo/plans/paseo-orchestrator.md
@@ -0,0 +1,239 @@
 # Paseo-like Orchestrator — Implementation Plan
 > **Goal:** Transform BooCode into a Paseo-style thin-client orchestration layer with observability, dynamic workflows, resumability, background subagents, multi-modal, and cache shape telemetry.
 >
 > **Architecture:** Durable agent execution engine beneath thin chat/coder frontends. Trace system as foundation, workflow engine as the structural addition, everything else layered on top.
 >
 > **Inspired by:** Paseo (agent lifecycle, worktree isolation), Whale (workflow engine, cache telemetry), OpenCode (session resume), Claude Code (workflow script format).
 ---
 ## TL;DR
 > **Quick Summary**: Build a durable orchestration layer with trace observability, dynamic JS workflows, session persistence, background subagents, and multi-modal support over 5 phases.
 >
 > **Deliverables**:
 > - Trace system with DB persistence + viewer UI
 > - Dynamic workflow engine (JS sandbox, agent/parallel/pipeline)
 > - Workflow resumability (hash-based step caching)
 > - Background subagent runtime
 > - Session persistence across refreshes
 > - Cache shape telemetry (DeepSeek KV cache viz)
 > - Multi-modal attachment support
 >
 > **Estimated Effort**: XL — 5 phases, ~2-3 weeks total
 > **Parallel Execution**: YES — phases 1-2 can partially overlap
 > **Critical Path**: Trace system → Workflow engine → All downstream features
 ---
 ## Context
 ### Original Request
 User wants BooCode to become "like Paseo — a thin client" with observability, dynamic workflows, session persistence, background agents, multi-modal, cache shape telemetry, and workflow resumability. They invoked skills across model evaluation, long context, SGLang, LangChain, LangSmith, agentic eval, agent harness construction, agent governance, and chat SDKs — indicating broad ambition for a production-quality AI coding platform.
 ### Key Decisions
 - **Trace system first**: Foundation for all debugging and optimization
 - **isolated-vm for workflow sandbox**: in-process V8 isolates via a native npm addon, with no external services to run
 - **DB-backed sessions**: Postgres for trace store + session state
 - **Existing WS frames + new `tool_trace` frame**: Live streaming to frontend
 - **Phase ordering**: Foundation (trace) → UX (persistence) → Power (workflows) → Polish (background/multi-modal/cache)
 ---
 ## Phases
 ### Phase 1: Trace System + Observability
 **Est. effort**: 3-4 days
 Core observability infrastructure. Every tool call gets timed, logged, and persisted.
 **Deliverables**:
 - `tool_traces` DB table (id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome)
 - Instrumentation in `tool-phase.ts` wrapping `executeToolCall` with start/end timing
 - `tool_trace` WS frame type for live streaming to frontend
 - GET `/api/chats/:id/traces` endpoint (paginated)
 - Trace viewer pane (collapsible tree, timing bars, expand/collapse per call)
 **Files to create**: 5-7 files across server + web + contracts
 **Dependencies**: None — standalone feature
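The timing instrumentation described above can be sketched as a small wrapper. This is an illustration only: `withTrace`, the `sink` callback, and the `'ok' | 'error'` outcome values are hypothetical, and the real `executeToolCall` signature is not shown in this plan. Field names follow a subset of the planned `tool_traces` columns.

```typescript
// Subset of the planned tool_traces columns.
interface ToolTrace {
  tool_name: string;
  input: unknown;
  output: unknown;
  started_at: number; // epoch ms
  finished_at: number; // epoch ms
  latency_ms: number;
  error: string | null;
  outcome: 'ok' | 'error'; // assumed value set
}

// Hypothetical wrapper: times an async tool call and hands one trace to `sink`,
// which could persist the row and emit a `tool_trace` WS frame.
async function withTrace<T>(
  toolName: string,
  input: unknown,
  call: () => Promise<T>,
  sink: (trace: ToolTrace) => void,
): Promise<T> {
  const startedAt = Date.now();
  const finish = (output: unknown, error: string | null): void => {
    const finishedAt = Date.now();
    sink({
      tool_name: toolName,
      input,
      output,
      started_at: startedAt,
      finished_at: finishedAt,
      latency_ms: finishedAt - startedAt,
      error,
      outcome: error === null ? 'ok' : 'error',
    });
  };

  let output: T;
  try {
    output = await call();
  } catch (err) {
    finish(null, err instanceof Error ? err.message : String(err));
    throw err; // tracing must not swallow the tool failure
  }
  finish(output, null);
  return output;
}
```

Recording the trace on both the success and failure paths keeps the viewer complete even when a tool throws.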
 ---
 ### Phase 2: Session Persistence + Resume
 **Est. effort**: 2-3 days
 Agent state survives browser refresh. Active sessions can be resumed.
 **Deliverables**:
 - Serialize active agent state to DB on each turn boundary
 - Restore state on WS reconnect (existing `snapshot` frame enhanced)
 - Agent session timeline view (history of all turns in a session)
 - Coder pane rehydrates from persisted state
 **Files to modify**: ws.ts, useSessionStream.ts, session store, dispatcher
 **Dependencies**: None — standalone, but benefits from Phase 1 trace data
 ---
 ### Phase 3: Dynamic Workflow Engine
 **Est. effort**: 5-7 days
 JS sandbox for multi-agent orchestration, compatible with the Claude Code workflow script format.
 **Deliverables**:
 - `isolated-vm` sandbox (or Node `vm` module with restricted context, for trusted scripts only, since `vm` is not a security boundary)
 - Workflow API: `agent()`, `parallel()`, `pipeline()`, `phase()`, `budget()`, `log()`, `args`
 - Workflow file discovery (`.boocode/workflows/*.js` → project, `~/.boocode/workflows/*.js` → global)
 - Built-in workflow catalog (deep-research, multi-review, etc.)
 - Workflow manager with concurrency limits, token budgets
 - Integration with existing Orchestrator panel for UI
 **Files to create**: 10-15 files (workflow runtime, scheduler, tool bridge, manager, catalog)
 **Dependencies**: Phase 1 traces feed into workflow observability
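The `parallel()` and `pipeline()` primitives named above could look roughly like the following. The plan only lists the function names, so the signatures, the `limit` parameter, and the `Task` type here are assumptions.

```typescript
type Task<T> = () => Promise<T>;

// Run tasks with at most `limit` in flight; results keep the input order.
async function parallel<T>(tasks: Task<T>[], limit = 4): Promise<T[]> {
  const results: T[] = new Array<T>(tasks.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++; // claimed synchronously, so workers never share an index
      const task = tasks[index];
      if (task) results[index] = await task();
    }
  }

  const workerCount = Math.min(Math.max(1, limit), tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Feed each stage's output into the next stage, in order.
async function pipeline<T>(seed: T, stages: Array<(value: T) => Promise<T>>): Promise<T> {
  let value = seed;
  for (const stage of stages) {
    value = await stage(value);
  }
  return value;
}
```

A fixed pool of workers pulling from a shared index is one simple way to enforce a concurrency limit; a workflow manager could layer token budgets on top of the same loop.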
 **Workflow Resumability** (within Phase 3):
 - SHA-256 hash of agent spec (prompt + options)
 - Cache completed results by hash
 - On re-run, skip cached agents, only execute new/changed ones
 - In-memory cache for current session, optional DB persistence
 **Est. effort**: 1-2 days within Phase 3
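A minimal sketch of the hash-based step cache, assuming an agent spec is just a prompt plus an options object. `AgentSpec`, `stableStringify`, `specHash`, and `ResultCache` are illustrative names; only the in-memory variant is shown, with no DB persistence.

```typescript
import { createHash } from 'node:crypto';

// Assumed spec shape: the plan hashes "prompt + options".
interface AgentSpec {
  prompt: string;
  options?: Record<string, unknown>;
}

// Serialize with sorted keys so { a, b } and { b, a } hash identically.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((v) => stableStringify(v)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value) ?? 'null';
}

// SHA-256 over the canonical form of the spec, as a hex string.
function specHash(spec: AgentSpec): string {
  return createHash('sha256').update(stableStringify(spec)).digest('hex');
}

class ResultCache {
  private readonly results = new Map<string, string>();

  // Return the cached result for an unchanged spec; otherwise execute and store.
  async run(spec: AgentSpec, execute: (spec: AgentSpec) => Promise<string>): Promise<string> {
    const key = specHash(spec);
    const cached = this.results.get(key);
    if (cached !== undefined) return cached;
    const result = await execute(spec);
    this.results.set(key, result); // only successful runs reach this line
    return result;
  }
}
```

Hashing a canonical serialization rather than raw `JSON.stringify` output avoids spurious cache misses when option keys are written in a different order.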
 ---
 ### Phase 4: Background Subagents
 **Est. effort**: 2-3 days
 Non-blocking subagent execution. `spawn_subagent` returns immediately, results collected later.
 **Deliverables**:
 - Background task queue (reuses existing `tasks` table)
 - `spawn_subagent` tool that creates a task and returns immediately
 - `subagent_status` tool to poll completion
 - `subagent_result` tool to retrieve output
 - Background agent pane showing running/completed subagents
 - Notifications via hooks when background tasks complete
 **Files to create**: 3-5 files across server + web
 **Dependencies**: Phase 1 traces, Phase 2 session persistence
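The spawn, poll, and collect contract above can be sketched with an in-memory queue. This is a simplification under stated assumptions: the plan reuses the existing `tasks` table, whereas `SubagentQueue` below is a hypothetical in-process stand-in, and the status values are assumed.

```typescript
type SubagentStatus = 'running' | 'completed' | 'failed';

interface SubagentTask {
  status: SubagentStatus;
  result?: string;
  error?: string;
}

// Hypothetical in-memory stand-in for a DB-backed background task queue.
class SubagentQueue {
  private readonly tasks = new Map<string, SubagentTask>();
  private nextId = 1;

  // spawn_subagent: start the work and return an id without awaiting it.
  spawn(work: () => Promise<string>): string {
    const id = `sub-${this.nextId++}`;
    const task: SubagentTask = { status: 'running' };
    this.tasks.set(id, task);
    // Promise.resolve().then(work) also captures synchronous throws from `work`.
    Promise.resolve()
      .then(work)
      .then(
        (result) => {
          task.status = 'completed';
          task.result = result;
        },
        (err: unknown) => {
          task.status = 'failed';
          task.error = err instanceof Error ? err.message : String(err);
        },
      );
    return id;
  }

  // subagent_status: poll completion.
  status(id: string): SubagentStatus | undefined {
    return this.tasks.get(id)?.status;
  }

  // subagent_result: retrieve output once completed (undefined otherwise).
  result(id: string): string | undefined {
    return this.tasks.get(id)?.result;
  }
}
```

Because `spawn` never awaits the work, the calling agent turn is not blocked; a completion hook could be fired from the two settle callbacks.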
 ---
 ### Phase 5: Multi-modal + Cache Shape (Polish)
 **Est. effort**: 2-3 days
 Image/file attachment support + DeepSeek cache hit visualization.
 **Deliverables (Multi-modal)**:
 - Image/file attachment storage (tmpfs, referenced in message)
 - Forward image content through DeepSeek API's multimodal support
 - Render attached images in message bubble
 - Model can "see" screenshots, diagrams, UI mocks
 **Deliverables (Cache Shape)**:
 - Extract `prompt_cache_hit_tokens` from DeepSeek provider metadata
 - Build cache segment visualization (system prompt, tool schema, conversation)
 - Per-turn cache hit rate in trace viewer
 - Cumulative cache stats in session view
 **Files to create**: 3-5 files
 **Dependencies**: Phase 1 traces (for cache shape), existing DeepSeek integration
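Per-turn and cumulative cache hit rates can be derived from the usage counters. The sketch below assumes each turn exposes `prompt_cache_hit_tokens` alongside DeepSeek's companion `prompt_cache_miss_tokens` counter; `TurnUsage`, `hitRate`, and `cumulativeHitRate` are illustrative names.

```typescript
// Usage counters for one turn (assumed to be copied from provider metadata).
interface TurnUsage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

// Fraction of prompt tokens served from cache for a single turn.
function hitRate(usage: TurnUsage): number {
  const total = usage.prompt_cache_hit_tokens + usage.prompt_cache_miss_tokens;
  return total === 0 ? 0 : usage.prompt_cache_hit_tokens / total;
}

// Token-weighted hit rate across a whole session.
function cumulativeHitRate(turns: TurnUsage[]): number {
  const totals = turns.reduce(
    (acc, t) => ({
      prompt_cache_hit_tokens: acc.prompt_cache_hit_tokens + t.prompt_cache_hit_tokens,
      prompt_cache_miss_tokens: acc.prompt_cache_miss_tokens + t.prompt_cache_miss_tokens,
    }),
    { prompt_cache_hit_tokens: 0, prompt_cache_miss_tokens: 0 },
  );
  return hitRate(totals);
}

const session: TurnUsage[] = [
  { prompt_cache_hit_tokens: 0, prompt_cache_miss_tokens: 1000 }, // cold first turn
  { prompt_cache_hit_tokens: 900, prompt_cache_miss_tokens: 100 }, // warm prefix
];
console.log(session.map(hitRate), cumulativeHitRate(session));
```

Weighting the cumulative figure by tokens rather than averaging per-turn rates keeps short turns from skewing the session total.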
 ---
 ## Execution Strategy
 ### Parallel Execution Waves
 ```
 Wave 1 (Start Immediately):
 ├── Phase 1: Trace system backend (tool_traces table + instrumentation) [deep]
 ├── Phase 1: Trace viewer frontend [visual-engineering]
 └── Phase 2: Session persistence backbone [deep]
 Wave 2 (After Wave 1):
 ├── Phase 3: Workflow engine sandbox + API surface [deep]
 ├── Phase 3: Workflow file discovery + manager [unspecified-high]
 ├── Phase 3: Workflow resumability cache [quick]
 └── Phase 4: Background subagent queue + tools [unspecified-high]
 Wave 3 (After Wave 2):
 ├── Phase 4: Background agent pane + notifications [visual-engineering]
 ├── Phase 5: Multi-modal attachment pipeline [deep]
 └── Phase 5: Cache shape telemetry UI [visual-engineering]
 Wave FINAL:
 ├── F1: Plan compliance audit (oracle)
 ├── F2: Code quality review (unspecified-high)
 ├── F3: Integration QA (unspecified-high)
 └── F4: Scope fidelity check (deep)
 ```
 ---
 ## TODOs
 > Phase 1: Trace System + Observability
 - [ ] 1. Create tool_traces DB table + migration
 - [ ] 2. Add tool_trace WS frame + contracts schema
 - [ ] 3. Instrument tool-phase.ts with start/end timing
 - [ ] 4. Add GET /api/chats/:id/traces endpoint
 - [ ] 5. Build trace viewer frontend component
 > Phase 2: Session Persistence + Resume
 - [ ] 6. Serialize agent state to DB on turn boundaries
 - [ ] 7. Restore state on WS reconnect
 - [ ] 8. Agent session timeline view
 > Phase 3: Dynamic Workflow Engine
 - [ ] 9. Create isolated-vm workflow sandbox
 - [ ] 10. Implement agent/parallel/pipeline primitives
 - [ ] 11. Workflow file discovery system
 - [ ] 12. Workflow manager + built-in catalog
 - [ ] 13. Workflow resumability (hash-based cache)
 - [ ] 14. Workflow UI integration with Orchestrator panel
 > Phase 4: Background Subagents
 - [ ] 15. Background task queue + spawn_subagent tool
 - [ ] 16. subagent_status + subagent_result tools
 - [ ] 17. Background agent pane
 > Phase 5: Multi-modal + Cache Shape
 - [ ] 18. Multi-modal attachment pipeline
 - [ ] 19. Image render in message bubble
 - [ ] 20. Cache shape telemetry data pipeline
 - [ ] 21. Cache shape visualization in trace viewer
 ---
 ## Success Criteria
 - Tool trace viewer shows every call with timing bars and token costs
 - Browser refresh preserves agent session state
 - Workflow scripts run in isolated sandbox with agent/parallel/pipeline
 - Re-running a workflow skips cached agents (hash-based)
 - Background subagents run independently, results collected later
 - Model can see attached images in chat
 - Cache hit rate visible per-turn and cumulative
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,10 @@
 All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
 ## v2.8.18-deepseek-whale-lift — 2026-06-08
 Integrates the DeepSeek API directly into BooChat and BooCoder via `@ai-sdk/deepseek`, replacing the generic `openai-compatible` wrapper. DeepSeek V4 models (`deepseek-v4-flash`, `deepseek-v4-pro`) with configurable thinking effort levels appear in both chat and coder pane model pickers. Full token tracking (cache hit tokens and reasoning tokens) flows from the API through new DB columns and WS frames into the UI message stats line. Lifts three high-value features from the Whale codebase: a schema-based tool input repair system that coerces types and unwraps markdown autolinks before Zod validation, a shell-based lifecycle hooks system (PreToolUse, PostToolUse, Stop, PreCompact, PostCompact) with a JSON stdin/stdout contract, and per-MCP-server permissions (allow/ask/deny) gating tool execution.
 ## v2.8.0-fork-lifts — 2026-06-07
 Completes the eight fork-lift integrations from `/opt/forks` into BooCode: boocontext sidecar upgrade, LSP code intelligence, DCP clean-room pruning, institutional memory, subagent protocol enhancements, plugin hook host, inference reliability (tool-shim + loop detectors), and TokenScope token breakdown. Backfills edit safety guards (truncation + dropped imports) and the TokenScope analyzer/persist module. Closes the fork-lifts-mit epic.
--- a/apps/booterm/src/index.ts
+++ b/apps/booterm/src/index.ts
@@ -5,6 +5,7 @@ import { getPool, closeDb } from './db.js';
 import { registerHealthRoutes } from './routes/health.js';
 import { registerTerminalRoutes } from './routes/terminals.js';
 import { registerSessionRoutes } from './routes/sessions.js';
 import { registerSearchRoutes } from './routes/search.js';
 import { registerWsAttachRoute } from './ws/attach.js';
 async function main(): Promise<void> {
@@ -35,6 +36,7 @@ async function main(): Promise<void> {
  registerHealthRoutes(app);
  registerTerminalRoutes(app, config.TMUX_CONF_PATH);
  registerSessionRoutes(app);
  registerSearchRoutes(app, config.TMUX_CONF_PATH);
  registerWsAttachRoute(app, config.TMUX_CONF_PATH);
  const shutdown = async (signal: string) => {
--- a/apps/booterm/src/pty/registry.ts
+++ b/apps/booterm/src/pty/registry.ts
@@ -33,6 +33,7 @@ export function register(
 export function unregister(paneId: string): void {
  sessions.delete(paneId);
  ringBuffers.delete(paneId);
 }
 export function list(): SessionMeta[] {
@@ -42,3 +43,120 @@ export function list(): SessionMeta[] {
 export function get(paneId: string): SessionMeta | undefined {
  return sessions.get(paneId);
 }
 // ── Ring buffer for PTY output search ──────────────────────────────────────
 export interface SearchMatch {
  line: number;
  content: string;
  contextBefore: string[];
  contextAfter: string[];
 }
 const ringBuffers = new Map<string, string[]>();
 // Panes whose newest buffered line is still incomplete (the last chunk did not
 // end with '\n'), so the next chunk must continue that line.
 const partialTail = new Set<string>();
 /**
 * Append raw PTY data to the ring buffer for a given pane.
 * Splits incoming data on newlines and pushes each line into the buffer,
 * keeping only the newest `maxLines` lines (default 5000).
 */
 export function appendOutput(
  paneId: string,
  data: string,
  maxLines: number = 5000,
 ): void {
  if (data.length === 0) return;
  let buf = ringBuffers.get(paneId);
  if (!buf) {
    buf = [];
    ringBuffers.set(paneId, buf);
  }
  // Split on newlines. A chunk may hold several complete lines plus a trailing
  // partial line, which is stored as-is and completed by a later chunk.
  const lines = data.split('\n');
  // If `data` ended with '\n', the final element after split is '' and is dropped.
  const endedWithNewline = data.endsWith('\n');
  if (endedWithNewline) {
    lines.pop();
  }
  // Glue the first segment onto the previous line only when that line was left
  // partial. Gluing unconditionally would merge unrelated complete lines. The
  // length check also covers a buffer recreated after clearBuffer/unregister.
  if (partialTail.has(paneId) && buf.length > 0 && lines.length > 0) {
    buf[buf.length - 1] = (buf[buf.length - 1] ?? '') + (lines[0] ?? '');
    lines.shift();
  }
  for (const line of lines) {
    buf.push(line);
  }
  // Remember whether the newest line is still open for the next chunk.
  if (endedWithNewline) {
    partialTail.delete(paneId);
  } else {
    partialTail.add(paneId);
  }
  // Drop the oldest lines if over maxLines
  if (buf.length > maxLines) {
    buf = buf.slice(buf.length - maxLines);
    ringBuffers.set(paneId, buf);
  }
 }
 /**
 * Search the ring buffer for a pane using a regex pattern.
 * Returns matches with optional context lines before and after each match.
 */
 export function searchRingBuffer(
  paneId: string,
  pattern: string,
  opts?: { limit?: number; context?: number },
 ): SearchMatch[] {
  const buf = ringBuffers.get(paneId);
  if (!buf || buf.length === 0) return [];
  const limit = opts?.limit ?? 50;
  const context = opts?.context ?? 0;
  let re: RegExp;
  try {
    re = new RegExp(pattern, 'u');
  } catch {
    return []; // invalid regex — caller should validate, but be defensive
  }
  const results: SearchMatch[] = [];
  for (let i = 0; i < buf.length; i++) {
    if (results.length >= limit) break;
    if (re.test(buf[i]!)) {
      const contextBefore: string[] = [];
      const contextAfter: string[] = [];
      for (let c = 1; c <= context; c++) {
        const ci = i - c;
        if (ci >= 0) contextBefore.unshift(buf[ci]!);
      }
      for (let c = 1; c <= context; c++) {
        const ci = i + c;
        if (ci < buf.length) contextAfter.push(buf[ci]!);
      }
      results.push({
        line: i + 1, // 1-based line number for display
        content: buf[i]!,
        contextBefore,
        contextAfter,
      });
    }
  }
  return results;
 }
 /**
 * Remove the ring buffer for a pane. Called on session kill / pane close.
 */
 export function clearBuffer(paneId: string): void {
  ringBuffers.delete(paneId);
 }
--- a/apps/booterm/src/routes/search.ts
+++ b/apps/booterm/src/routes/search.ts
@@ -0,0 +1,167 @@
 import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import { sanitizeId, tmuxSessionName, capturePane } from '../pty/manager.js';
 import { searchRingBuffer } from '../pty/registry.js';
 const ParamsSchema = z.object({
  sid: z.string(),
  pid: z.string(),
 });
 const MAX_PATTERN_LENGTH = 200;
 // Zod-refined string: reject empty and overly long patterns. The length cap
 // limits, but does not eliminate, ReDoS exposure.
 const PatternQuerySchema = z
  .string()
  .min(1, 'pattern is required')
  .max(MAX_PATTERN_LENGTH, `pattern must not exceed ${MAX_PATTERN_LENGTH} characters`);
 const QuerySchema = z.object({
  pattern: PatternQuerySchema,
  limit: z.coerce.number().int().min(1).max(500).default(50),
  context: z.coerce.number().int().min(0).max(50).default(0),
 });
 interface SearchMatch {
  line: number;
  content: string;
  contextBefore: string[];
  contextAfter: string[];
 }
 interface SearchResponse {
  matches: SearchMatch[];
  total: number;
  truncated: boolean;
  source: 'ring' | 'capture';
 }
 /**
 * Search a captured pane buffer using a regex. This is the fallback path
 * when the ring buffer doesn't have enough matches.
 */
 function grepBuffer(
  text: string,
  pattern: string,
  limit: number,
  context: number,
 ): SearchMatch[] {
  let re: RegExp;
  try {
    re = new RegExp(pattern, 'u');
  } catch {
    return [];
  }
  const lines = text.split('\n');
  const results: SearchMatch[] = [];
  for (let i = 0; i < lines.length; i++) {
    if (results.length >= limit) break;
    if (re.test(lines[i]!)) {
      const contextBefore: string[] = [];
      const contextAfter: string[] = [];
      for (let c = 1; c <= context; c++) {
        const ci = i - c;
        if (ci >= 0) contextBefore.unshift(lines[ci]!);
      }
      for (let c = 1; c <= context; c++) {
        const ci = i + c;
        if (ci < lines.length) contextAfter.push(lines[ci]!);
      }
      results.push({
        line: i + 1,
        content: lines[i]!,
        contextBefore,
        contextAfter,
      });
    }
  }
  return results;
 }
 export function registerSearchRoutes(app: FastifyInstance, tmuxConfPath: string): void {
  app.get<{
    Params: { sid: string; pid: string };
    Querystring: { pattern?: string; limit?: string; context?: string };
  }>(
    '/api/term/sessions/:sid/panes/:pid/search',
    async (req, reply) => {
      const p = ParamsSchema.safeParse(req.params);
      if (!p.success) return reply.code(400).send({ error: 'bad_params' });
      const sid = sanitizeId(p.data.sid);
      const pid = sanitizeId(p.data.pid);
      if (!sid || !pid) return reply.code(400).send({ error: 'bad_id_format' });
      const q = QuerySchema.safeParse(req.query);
      if (!q.success) {
        return reply.code(400).send({
          error: 'bad_query',
          details: q.error.flatten().fieldErrors,
        });
      }
      const { pattern, limit, context } = q.data;
      // ── Path 1: ring buffer search (fast, no tmux interaction) ──
      const ringMatches = searchRingBuffer(pid, pattern, { limit, context });
      if (ringMatches.length >= limit) {
        return reply.code(200).send({
          matches: ringMatches,
          total: ringMatches.length,
          truncated: ringMatches.length >= limit,
          source: 'ring' as const,
        });
      }
      // ── Path 2: capture-pane + grep fallback (10s timeout) ──
      const sessionName = tmuxSessionName(pid);
      let capture: string;
      try {
        capture = await withTimeout(
          capturePane(tmuxConfPath, sessionName, 5000),
          10_000,
        );
      } catch (err) {
        req.log.warn({ err, pid }, 'capture-pane timed out or failed');
        return reply.code(200).send({
          matches: ringMatches,
          total: ringMatches.length,
          truncated: false,
          source: 'ring' as const,
        });
      }
      if (!capture) {
        // tmux pane may no longer exist — return whatever ring had
        return reply.code(200).send({
          matches: ringMatches,
          total: ringMatches.length,
          truncated: false,
          source: 'ring' as const,
        });
      }
      const captureMatches = grepBuffer(capture, pattern, limit, context);
      return reply.code(200).send({
        matches: captureMatches,
        total: captureMatches.length,
        truncated: captureMatches.length >= limit,
        source: 'capture' as const,
      });
    },
  );
 }
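 function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  // Clear the timer once either side settles so it does not linger for `ms`.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
 }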
--- a/apps/booterm/src/ws/attach.ts
+++ b/apps/booterm/src/ws/attach.ts
@@ -9,7 +9,7 @@ import {
 } from '../pty/manager.js';
 import { attachPty } from '../pty/pty.js';
 import { getUser } from '../auth.js';
-import { register, unregister } from '../pty/registry.js';
+import { register, unregister, appendOutput } from '../pty/registry.js';
 export function registerWsAttachRoute(app: FastifyInstance, tmuxConfPath: string): void {
  app.get<{
@@ -106,6 +106,8 @@ export function registerWsAttachRoute(app: FastifyInstance, tmuxConfPath: string
        } catch (err) {
          req.log.warn({ err }, 'ws send failed');
        }
        // Feed the ring buffer for pattern-based search
        appendOutput(pid, data);
      };
      handle.onData(onData);
--- a/apps/coder/src/conductor/types.ts
+++ b/apps/coder/src/conductor/types.ts
@@ -38,10 +38,31 @@ export interface StepContext {
  readonly model?: string;
 }
-export type StepKind = 'agent' | 'code' | 'approval';
+export type StepKind = 'agent' | 'code' | 'approval' | 'switch';
 /**
 * One branch of a SWITCH step. The first case whose condition evaluates to true
 * is selected; all other branches' stepIds are excluded from execution.
 */
 export interface SwitchCase {
  /** Human-readable label for this branch (reported in switch output). */
  label: string;
  /** Pure guard — called with the current step context to decide this branch. */
  condition: (ctx: StepContext) => boolean;
  /** stepIds belonging to this branch. */
  stepIds: string[];
 }
 export type TriggerRule = 'all_success' | 'one_success' | 'all_done';
 /** Possible statuses for a flow step (persisted in flow_steps.status). */
 export type StepStatus = 'pending' | 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'timed_out';
 /** Retry policy for a step that times out. */
 export interface RetryConfig {
  maxRetries: number;
 }
 export interface Step {
  /** unique id within the flow; other steps depend on it by this id */
  id: string;
@@ -55,10 +76,19 @@ export interface Step {
  /**
   * For kind:'agent', returns the worker PROMPT (task + any prior outputs).
   * For kind:'code', returns the step RESULT directly (the fold/transform).
   * For kind:'switch', unused (the runner evaluates cases internally).
   */
  run: (ctx: StepContext) => string | Promise<string>;
  /** optional guard — when it returns false the step is skipped (e.g. no repo) */
  when?: (ctx: StepContext) => boolean;
  /** max retries on timeout (0 or unset = no retry) */
  maxRetries?: number;
  /** batch group id; steps sharing the same batch are gated by batchConfig.maxConcurrent */
  batch?: string;
  /** for kind:'switch' — ordered list of branches evaluated in declaration order */
  cases?: SwitchCase[];
  /** for kind:'switch' — fallback step ids when no case matches */
  defaultBranch?: string[];
 }
 export interface Flow {
@@ -69,6 +99,8 @@ export interface Flow {
  render: (ctx: StepContext) => string;
  /** optional output filename for the artifact, derived from input */
  output?: (ctx: StepContext) => string;
  /** batch parallelism control — gates concurrent dispatch of steps sharing the same batch id */
  batchConfig?: { maxConcurrent: number; timeoutMs?: number; joinRule?: TriggerRule };
 }
 export interface RunResult {
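A minimal sketch of how `maxConcurrent` gating could be applied when picking dispatchable steps, assuming dependency checks have already happened. The `getReadyInBatch` name comes from the commit message, but its signature here, the trimmed `BatchStep` type, and the per-batch-id counting are assumptions; `timeoutMs` and `joinRule` are not modeled.

```typescript
// Trimmed stand-in for Step (assumed for this sketch).
interface BatchStep {
  id: string;
  batch?: string;
  status: 'pending' | 'running' | 'completed';
}

// Return the pending steps that may be dispatched now without pushing any
// batch past `maxConcurrent` running steps.
function getReadyInBatch(steps: BatchStep[], maxConcurrent: number): BatchStep[] {
  // Count steps already running, per batch id.
  const running = new Map<string, number>();
  for (const step of steps) {
    if (step.batch !== undefined && step.status === 'running') {
      running.set(step.batch, (running.get(step.batch) ?? 0) + 1);
    }
  }

  const ready: BatchStep[] = [];
  for (const step of steps) {
    if (step.status !== 'pending') continue;
    if (step.batch === undefined) {
      ready.push(step); // unbatched steps are not gated
      continue;
    }
    const inFlight = running.get(step.batch) ?? 0;
    if (inFlight < maxConcurrent) {
      ready.push(step);
      running.set(step.batch, inFlight + 1); // reserve the slot for this dispatch
    }
  }
  return ready;
}

const readyNow = getReadyInBatch(
  [
    { id: 'a', batch: 'b1', status: 'running' },
    { id: 'c', batch: 'b1', status: 'pending' },
    { id: 'd', batch: 'b1', status: 'pending' },
    { id: 'e', status: 'pending' },
  ],
  2,
);
console.log(readyNow.map((s) => s.id));
```

Reserving a slot as each step is selected prevents a single scheduling pass from releasing more steps than the batch allows.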
--- a/apps/coder/src/config.ts
+++ b/apps/coder/src/config.ts
@@ -50,6 +50,11 @@ const ConfigSchema = z.object({
  // only reaped after it's been untouched this long (avoids sweeping a dir mid
  // ensureSessionWorktree create). 1h default.
  ORPHAN_WORKTREE_GRACE_MS: z.coerce.number().int().positive().default(3_600_000),
  DEEPSEEK_API_KEY: z.string().optional(),
  DEEPSEEK_BASE_URL: z.string().url().default('https://api.deepseek.com'),
  // v2.9.x: flow step timeout (default 5 min). When a 'running' step exceeds
  // this duration, it is marked 'timed_out' and may be retried.
  FLOW_STEP_TIMEOUT_MS: z.coerce.number().int().positive().default(300_000),
 });
 export type Config = z.infer<typeof ConfigSchema>;
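A minimal sketch of the timeout rule described for `FLOW_STEP_TIMEOUT_MS`, assuming a timed-out step with retries left goes back to `'pending'` with its retry counter bumped. `applyTimeout` and the trimmed `TimedStep` shape are hypothetical; the real runner may transition differently.

```typescript
// Trimmed stand-in for a flow_steps row (assumed for this sketch).
interface TimedStep {
  status: 'pending' | 'running' | 'timed_out';
  startedAt: number; // epoch ms
  retryCount: number;
  maxRetries?: number; // 0 or unset = no retry
}

// Pure transition: returns the same object when nothing changes.
function applyTimeout(step: TimedStep, now: number, timeoutMs = 300_000): TimedStep {
  if (step.status !== 'running' || now - step.startedAt <= timeoutMs) return step;
  const retriesLeft = (step.maxRetries ?? 0) - step.retryCount;
  if (retriesLeft > 0) {
    // Assumption: a retry re-queues the step rather than restarting it in place.
    return { ...step, status: 'pending', retryCount: step.retryCount + 1 };
  }
  return { ...step, status: 'timed_out' };
}
```

Passing `now` in as a parameter keeps the rule deterministic, so it can be tested without fake timers.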
--- a/apps/coder/src/index.ts
+++ b/apps/coder/src/index.ts
@@ -29,7 +29,9 @@ import { registerProviderRoutes } from './routes/providers.js';
 import { registerWorktreeSafetyRoutes } from './routes/worktree-safety.js';
 import { registerLifecycleRoutes } from './routes/lifecycle.js';
 import { registerAnalyticsRoutes } from './routes/analytics.js';
 import { registerPlanRoutes } from './routes/plans.js';
 import { registerWebSocket } from './routes/ws.js';
 import { updatePlanFromRun } from './services/plan-store.js';
 // Phase 4: dispatcher + agent probe
 import { createDispatcher } from './services/dispatcher.js';
 // Orchestrator (Phase 2): DB-backed flow-runner; advances on the dispatcher's
@@ -229,8 +231,16 @@ async function main() {
  // Orchestrator (Phase 2): the flow-runner reacts to the dispatcher's
  // onTaskTerminal hook to advance flow_runs. Created before the dispatcher so its
-  // terminal callback can be wired in.
+  // terminal callback can be wired in. onRunTerminal updates linked plans.
-  const flowRunner = createFlowRunner({ sql, broker, log: app.log, config });
+  const flowRunner = createFlowRunner({
    sql, broker, log: app.log, config,
    onRunTerminal: (runId, status) => {
      updatePlanFromRun(sql, runId, status).catch((err) => {
        app.log.error({ err: err instanceof Error ? err.message : String(err), runId },
          'plans: updatePlanFromRun failed');
      });
    },
  });
  // Arena SEAM (a): build the local-model set from the live llama-swap model list.
  // Both bare IDs ('qwen3.6-35b') and prefixed IDs ('llama-swap/qwen3.6-35b') are
@@ -384,6 +394,7 @@ async function main() {
  registerWorktreeSafetyRoutes(app, sql);
  registerLifecycleRoutes(app, sql);
  registerAnalyticsRoutes(app, sql);
  registerPlanRoutes(app, sql);
  registerWebSocket(app, sql, broker);
  // Graceful shutdown
--- a/apps/coder/src/routes/plans.ts
+++ b/apps/coder/src/routes/plans.ts
@@ -0,0 +1,134 @@
 /**
 * Boulder state — plan routes.
 *
 * GET    /api/plans?project_id=         list plans for a project
 * GET    /api/plans/active?project_id=  list active (in-flight) plans
 * GET    /api/plans/:id                 fetch a single plan
 * POST   /api/plans                     create a new plan
 * PATCH  /api/plans/:id                 update plan progress / status
 */
 import type { FastifyInstance } from 'fastify';
 import { z } from 'zod';
 import type { Sql } from '../db.js';
 import {
  createPlan,
  getPlan,
  listPlans,
  listActivePlans,
  updatePlan,
 } from '../services/plan-store.js';
 const CreatePlanBody = z.object({
  project_id: z.string().uuid(),
  title: z.string().min(1).max(500),
  description: z.string().max(10_000).optional(),
  flow_run_id: z.string().uuid().optional(),
  metadata: z.record(z.unknown()).optional(),
 });
 const ListPlansQuery = z.object({
  project_id: z.string().uuid(),
 });
 const UpdatePlanBody = z.object({
  title: z.string().min(1).max(500).optional(),
  description: z.string().max(10_000).nullable().optional(),
  status: z.enum(['active', 'completed', 'cancelled', 'failed']).optional(),
  progress_pct: z.number().int().min(0).max(100).optional(),
  items_total: z.number().int().min(0).optional(),
  items_completed: z.number().int().min(0).optional(),
  metadata: z.record(z.unknown()).nullable().optional(),
 });
 const PlanIdParam = z.string().uuid();
 export function registerPlanRoutes(app: FastifyInstance, sql: Sql): void {
  // GET /api/plans?project_id= — all plans for a project
  app.get('/api/plans', async (req, reply) => {
    const parsed = ListPlansQuery.safeParse(req.query);
    if (!parsed.success) {
      reply.code(400);
      return { error: 'invalid query', details: parsed.error.flatten() };
    }
    const plans = await listPlans(sql, parsed.data.project_id);
    return { plans };
  });
  // GET /api/plans/active?project_id= — active plans only
  app.get('/api/plans/active', async (req, reply) => {
    const parsed = ListPlansQuery.safeParse(req.query);
    if (!parsed.success) {
      reply.code(400);
      return { error: 'invalid query', details: parsed.error.flatten() };
    }
    const plans = await listActivePlans(sql, parsed.data.project_id);
    return { plans };
  });
  // POST /api/plans — create a new plan
  app.post('/api/plans', async (req, reply) => {
    const parsed = CreatePlanBody.safeParse(req.body);
    if (!parsed.success) {
      reply.code(400);
      return { error: 'invalid body', details: parsed.error.flatten() };
    }
    const { project_id, title, description, flow_run_id, metadata } = parsed.data;
    const plan = await createPlan(sql, {
      projectId: project_id,
      title,
      description,
      flowRunId: flow_run_id,
      metadata,
    });
    reply.code(201);
    return { plan };
  });
  // GET /api/plans/:id — single plan
  app.get<{ Params: { id: string } }>('/api/plans/:id', async (req, reply) => {
    const parsedId = PlanIdParam.safeParse(req.params.id);
    if (!parsedId.success) {
      reply.code(400);
      return { error: 'invalid id' };
    }
    const plan = await getPlan(sql, parsedId.data);
    if (!plan) {
      reply.code(404);
      return { error: 'plan not found' };
    }
    return { plan };
  });
  // PATCH /api/plans/:id — update plan
  app.patch<{ Params: { id: string } }>('/api/plans/:id', async (req, reply) => {
    const parsedId = PlanIdParam.safeParse(req.params.id);
    if (!parsedId.success) {
      reply.code(400);
      return { error: 'invalid id' };
    }
    const parsed = UpdatePlanBody.safeParse(req.body);
    if (!parsed.success) {
      reply.code(400);
      return { error: 'invalid body', details: parsed.error.flatten() };
    }
    const { title, description, status, progress_pct, items_total, items_completed, metadata } = parsed.data;
    const plan = await updatePlan(sql, parsedId.data, {
      title,
      // null clears the field; undefined leaves it untouched
      description,
      status,
      progressPct: progress_pct,
      itemsTotal: items_total,
      itemsCompleted: items_completed,
      metadata,
    });
    if (!plan) {
      reply.code(404);
      return { error: 'plan not found' };
    }
    return { plan };
  });
 }
--- a/apps/coder/src/schema.sql
+++ b/apps/coder/src/schema.sql
@@ -266,7 +266,7 @@ CREATE INDEX IF NOT EXISTS claude_session_entries_key_idx ON claude_session_entr
 -- replaces it with the three-value list).
 ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_backend_chk;
 ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_backend_chk
-  CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk'));
+  CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk', 'paseo'));
 -- LISTEN/NOTIFY fast path: every tasks INSERT (from any call site — routes,
 -- new_task tool, MCP server) fires pg_notify('tasks_new') in the same
@@ -340,11 +340,12 @@ CREATE INDEX IF NOT EXISTS flow_steps_task_id_idx ON flow_steps(task_id);
 -- edits above are no-ops on the existing DB (CREATE TABLE IF NOT EXISTS skips an
 -- existing table) — widen via the repo's DROP-IF-EXISTS → guarded-ADD discipline.
 -- Pure ADD of a new allowed value, so no row UPDATE is needed (no value renamed).
 -- v2.9.x: widen status CHECKs to include 'timed_out' for Task State Machine.
 ALTER TABLE flow_runs DROP CONSTRAINT IF EXISTS flow_runs_status_chk;
 DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'flow_runs_status_chk') THEN
    ALTER TABLE flow_runs ADD CONSTRAINT flow_runs_status_chk
-      CHECK (status IN ('running', 'completed', 'failed', 'cancelled'));
+      CHECK (status IN ('running', 'completed', 'failed', 'cancelled', 'timed_out'));
  END IF;
 END $$;
@@ -352,10 +353,14 @@ ALTER TABLE flow_steps DROP CONSTRAINT IF EXISTS flow_steps_status_chk;
 DO $$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'flow_steps_status_chk') THEN
    ALTER TABLE flow_steps ADD CONSTRAINT flow_steps_status_chk
-      CHECK (status IN ('pending', 'running', 'completed', 'failed', 'skipped', 'cancelled'));
+      CHECK (status IN ('pending', 'running', 'completed', 'failed', 'skipped', 'cancelled', 'timed_out'));
  END IF;
 END $$;
 -- Task State Machine: retry columns for flow_steps.
 ALTER TABLE flow_steps ADD COLUMN IF NOT EXISTS retry_count INTEGER NOT NULL DEFAULT 0;
 ALTER TABLE flow_steps ADD COLUMN IF NOT EXISTS max_retries INTEGER;
 -- Arena: battles + contestants + cross_examinations.
 -- project_id carries no FK (matches tasks.project_id + flow_runs.project_id convention).
 -- winner_contestant_id FK is deferred (forward reference): added via guarded ALTER below.
@@ -438,3 +443,31 @@ CREATE TABLE IF NOT EXISTS flow_step_events (
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
 );
 CREATE INDEX IF NOT EXISTS flow_step_events_run_idx ON flow_step_events(run_id);
 -- v2.9.0: Boulder state — cross-session plan persistence with auto-resumption.
 -- project_id carries no FK (matches tasks/flow_runs convention).
 -- flow_run_id links the plan to an in-flight orchestrator run for auto-tracking.
 CREATE TABLE IF NOT EXISTS plans (
  id                UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id        UUID NOT NULL,
  title             TEXT NOT NULL,
  description       TEXT,
  status            TEXT NOT NULL DEFAULT 'active',
  flow_run_id       UUID REFERENCES flow_runs(id) ON DELETE SET NULL,
  progress_pct      INTEGER NOT NULL DEFAULT 0,
  items_total       INTEGER NOT NULL DEFAULT 0,
  items_completed   INTEGER NOT NULL DEFAULT 0,
  metadata          JSONB,
  created_at        TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
  updated_at        TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
  CONSTRAINT plans_status_chk CHECK (status IN ('active', 'completed', 'cancelled', 'failed')),
  CONSTRAINT plans_progress_chk CHECK (progress_pct >= 0 AND progress_pct <= 100),
  CONSTRAINT plans_items_chk CHECK (items_total >= 0 AND items_completed >= 0 AND items_completed <= items_total)
 );
 -- Plan queries by project and status.
 CREATE INDEX IF NOT EXISTS plans_project_status_idx ON plans(project_id, status);
 -- Fast lookup of the plan owning a flow run (for onRunTerminal updates).
 CREATE INDEX IF NOT EXISTS plans_flow_run_id_idx ON plans(flow_run_id);
 -- Plans sorted by recency (for "resume from last" surface).
 CREATE INDEX IF NOT EXISTS plans_project_created_idx ON plans(project_id, created_at DESC);
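Review note: the two counter CHECKs on `plans` could be mirrored in application code so that a bad PATCH is rejected with a 400 before it reaches Postgres. The following is only a sketch under that assumption; `PlanCounters` and `violatedPlanChecks` are hypothetical names, not part of this change, and the integer check that the `INTEGER` columns imply is left out.

```typescript
// Hypothetical shape mirroring the counter columns of the plans table.
interface PlanCounters {
  progressPct: number;
  itemsTotal: number;
  itemsCompleted: number;
}

// Returns the names of the CHECK constraints these counters would violate.
// The constraint names are taken from the schema above; the function itself
// is an illustrative helper, not code from this diff.
function violatedPlanChecks(c: PlanCounters): string[] {
  const violations: string[] = [];
  // plans_progress_chk: progress_pct >= 0 AND progress_pct <= 100
  if (c.progressPct < 0 || c.progressPct > 100) {
    violations.push('plans_progress_chk');
  }
  // plans_items_chk: both counters non-negative, completed never exceeds total
  if (c.itemsTotal < 0 || c.itemsCompleted < 0 || c.itemsCompleted > c.itemsTotal) {
    violations.push('plans_items_chk');
  }
  return violations;
}
```

A route handler could call this on the merged row and return the list in the error body, which keeps the database constraint as the final guard rather than the only one.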
--- a/apps/coder/src/services/tests/flow-runner-decisions.test.ts
+++ b/apps/coder/src/services/tests/flow-runner-decisions.test.ts
@@ -1,16 +1,20 @@
 import { describe, it, expect } from 'vitest';
 import type { Flow, Step, StepContext, TriggerRule } from '../../conductor/types.js';
 import {
  buildBatchState,
  getReadyInBatch,
  manifestSteps,
  readySteps,
  partitionReady,
  isRunComplete,
  isStuck,
  reconcileResumeStep,
  reconcileRun,
  resolveSwitch,
  shouldFailOnMissingAgent,
  type SchedulerState,
 } from '../flow-runner-decisions.js';
 /**
 * The DB-driven flow-runner replaces the Phase-1 in-memory wave scheduler
@@ -52,6 +56,8 @@ const emptyState = (over: Partial<SchedulerState> = {}): SchedulerState => ({
  skipped: new Set(),
  inFlight: new Set(),
  excluded: new Set(),
  timedOut: new Set(),
  switchResults: new Map(),
  ...over,
 });
@@ -237,6 +243,442 @@ describe('isRunComplete / isStuck', () => {
  });
 });
 // ─── SWITCH branching (v2.9) ─────────────────────────────────────────────────
 describe('resolveSwitch', () => {
  const baseCtx: StepContext = { input: { question: 'q', band: 'small' }, results: {} };
  it('selects the first matching case and excludes other branches', () => {
    const step: Step = {
      id: 'router',
      kind: 'switch',
      run: () => '',
      cases: [
        { label: 'a', condition: () => false, stepIds: ['a1', 'a2'] },
        { label: 'b', condition: () => true, stepIds: ['b1', 'b2'] },
        { label: 'c', condition: () => true, stepIds: ['c1', 'c2'] },
      ],
    };
    const result = resolveSwitch(step, baseCtx);
    expect(result.chosenCase).toBe('b');
    expect(result.excluded).toEqual(['a1', 'a2', 'c1', 'c2']);
  });
  it('falls back to defaultBranch when no case matches', () => {
    const step: Step = {
      id: 'router',
      kind: 'switch',
      run: () => '',
      cases: [
        { label: 'x', condition: () => false, stepIds: ['x1'] },
        { label: 'y', condition: () => false, stepIds: ['y1'] },
      ],
      defaultBranch: ['z1', 'z2'],
    };
    const result = resolveSwitch(step, baseCtx);
    expect(result.chosenCase).toBeNull();
    // Only case branch steps are excluded; default steps are not.
    expect(result.excluded).toEqual(['x1', 'y1']);
  });
  it('excludes all branch steps when no case matches and no default', () => {
    const step: Step = {
      id: 'router',
      kind: 'switch',
      run: () => '',
      cases: [
        { label: 'p', condition: () => false, stepIds: ['p1'] },
        { label: 'q', condition: () => false, stepIds: ['q1', 'q2'] },
      ],
    };
    const result = resolveSwitch(step, baseCtx);
    expect(result.chosenCase).toBeNull();
    expect(result.excluded).toEqual(['p1', 'q1', 'q2']);
  });
  it('excludes defaultBranch when a case matched', () => {
    const step: Step = {
      id: 'router',
      kind: 'switch',
      run: () => '',
      cases: [
        { label: 'hit', condition: () => true, stepIds: ['h1'] },
        { label: 'miss', condition: () => false, stepIds: ['m1'] },
      ],
      defaultBranch: ['d1'],
    };
    const result = resolveSwitch(step, baseCtx);
    expect(result.chosenCase).toBe('hit');
    expect(result.excluded).toEqual(['m1', 'd1']);
  });
  it('returns empty excluded for a degenerate switch with no cases and no default', () => {
    const step: Step = {
      id: 'noop',
      kind: 'switch',
      run: () => '',
    };
    const result = resolveSwitch(step, baseCtx);
    expect(result.chosenCase).toBeNull();
    expect(result.excluded).toEqual([]);
  });
  it('uses ctx.results in condition evaluation', () => {
    const step: Step = {
      id: 'router',
      kind: 'switch',
      run: () => '',
      cases: [
        { label: 'has', condition: (ctx) => ctx.results['prev'] === 'yes', stepIds: ['yes-branch'] },
        { label: 'no', condition: () => true, stepIds: ['no-branch'] },
      ],
    };
    const ctxWithResult: StepContext = { input: { question: 'q', band: 'small' }, results: { prev: 'yes' } };
    const result = resolveSwitch(step, ctxWithResult);
    expect(result.chosenCase).toBe('has');
    expect(result.excluded).toEqual(['no-branch']);
  });
 });
 describe('readySteps with switch-excluded steps', () => {
  // Flow: switch router → branch-a/branch-b → fold
  function switchFlow(): Flow {
    const steps: Step[] = [
      {
        id: 'switch', kind: 'switch', run: () => '',
        cases: [
          { label: 'a', condition: () => true, stepIds: ['branch-a'] },
          { label: 'b', condition: () => false, stepIds: ['branch-b'] },
        ],
      },
      { id: 'branch-a', kind: 'agent', agent: 'x', deps: ['switch'], run: () => 'p' },
      { id: 'branch-b', kind: 'agent', agent: 'y', deps: ['switch'], run: () => 'q' },
      { id: 'fold', kind: 'code', deps: ['branch-a', 'branch-b'], run: () => 'r' },
    ];
    return { name: 'switch-demo', description: '', steps, render: () => '' };
  }
  it('excludes non-selected branch steps and treats them as satisfied deps', () => {
    const flow = switchFlow();
    // switch completed, branch-b excluded by switch (branch-a selected)
    const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
      ['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
    ]);
    const state: SchedulerState = {
      done: new Set(['switch']),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: switchResult,
    };
    const ready = readySteps(flow, state).map((s) => s.id);
    // branch-a is ready (dep switch is done), branch-b is excluded
    expect(ready).toContain('branch-a');
    expect(ready).not.toContain('branch-b');
  });
  it('fold unblocks once selected branch completes (excluded branch satisfied)', () => {
    const flow = switchFlow();
    const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
      ['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
    ]);
    const state: SchedulerState = {
      done: new Set(['switch', 'branch-a']),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: switchResult,
    };
    const ready = readySteps(flow, state).map((s) => s.id);
    // fold's deps: branch-a done, branch-b excluded (via switch) → satisfied
    expect(ready).toContain('fold');
  });
  it('fold stays blocked until selected branch completes, even with excluded dep', () => {
    const flow = switchFlow();
    const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
      ['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
    ]);
    const state: SchedulerState = {
      done: new Set(['switch']),
      skipped: new Set(),
      inFlight: new Set(['branch-a']),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: switchResult,
    };
    const ready = readySteps(flow, state).map((s) => s.id);
    // branch-a is still in flight, so fold's dep on it is unmet; branch-b's exclusion alone does not unblock fold
    expect(ready).not.toContain('fold');
  });
  it('isRunComplete returns true when switch-excluded steps are the only unsettled', () => {
    const flow = switchFlow();
    // All non-excluded steps done; branch-b is excluded via switch
    const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
      ['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
    ]);
    const state: SchedulerState = {
      done: new Set(['switch', 'branch-a', 'fold']),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: switchResult,
    };
    expect(isRunComplete(flow, state)).toBe(true);
    expect(isStuck(flow, state)).toBe(false);
  });
  it('combines static excluded with switch-excluded', () => {
    const flow = switchFlow();
    // band gating excludes branch-b at launch, AND switch also excludes it
    const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
      ['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
    ]);
    const state: SchedulerState = {
      done: new Set(['switch', 'branch-a']),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(['branch-b']),
      timedOut: new Set(),
      switchResults: switchResult,
    };
    // branch-b excluded both ways; fold sees branch-a done, branch-b excluded
    const ready = readySteps(flow, state).map((s) => s.id);
    expect(ready).toContain('fold');
  });
 });
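Review note: the `resolveSwitch` tests above pin down a small contract: the first matching case wins, every other case's steps are excluded, and the default branch is excluded only when some case matched. A minimal sketch consistent with those expectations follows. The `Sketch*` types and `resolveSwitchSketch` are hypothetical stand-ins for the real `Step`, `StepContext`, and `resolveSwitch`, whose implementation is not shown in this diff.

```typescript
// Hypothetical local types; the real ones live in conductor/types.
interface SketchContext {
  input: Record<string, unknown>;
  results: Record<string, unknown>;
}
interface SketchCase {
  label: string;
  condition: (ctx: SketchContext) => boolean;
  stepIds: string[];
}
interface SketchSwitchStep {
  id: string;
  cases?: SketchCase[];
  defaultBranch?: string[];
}
interface SketchSwitchResult {
  chosenCase: string | null;
  excluded: string[];
}

function resolveSwitchSketch(step: SketchSwitchStep, ctx: SketchContext): SketchSwitchResult {
  const cases = step.cases ?? [];
  // First matching case wins; later matches count as non-selected.
  const chosen = cases.find((c) => c.condition(ctx)) ?? null;
  const excluded: string[] = [];
  for (const c of cases) {
    if (c !== chosen) excluded.push(...c.stepIds);
  }
  // The default branch runs only when no case matched, so it is excluded
  // exactly when a case was chosen.
  if (chosen !== null && step.defaultBranch) {
    excluded.push(...step.defaultBranch);
  }
  return { chosenCase: chosen ? chosen.label : null, excluded };
}
```

Keeping the function pure (no scheduler state, no I/O) is what lets the tests call it with a literal context and compare the exclusion list directly.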
 // ─── Batch parallelism (v2.8.22) ─────────────────────────────────────────────
 describe('buildBatchState', () => {
  it('returns empty map when flow has no batchConfig', () => {
    const flow: Flow = {
      name: 'no-batch',
      description: '',
      steps: [
        { id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
        { id: 'b', kind: 'code', deps: ['a'], run: () => 'r' },
      ],
      render: () => '',
    };
    const bs = buildBatchState(flow, new Set());
    expect(bs.size).toBe(0);
  });
  it('maps each batch group to its running set and config', () => {
    const flow: Flow = {
      name: 'batched',
      description: '',
      steps: [
        { id: 'a1', kind: 'agent', agent: 'x', batch: 'review', run: () => 'p' },
        { id: 'a2', kind: 'agent', agent: 'y', batch: 'review', run: () => 'q' },
        { id: 'b1', kind: 'agent', agent: 'z', batch: 'check', run: () => 'r' },
        { id: 'fold', kind: 'code', deps: ['a1', 'a2', 'b1'], run: () => 's' },
      ],
      render: () => '',
      batchConfig: { maxConcurrent: 2 },
    };
    // a1 is in flight → review batch has 1 running, check has 0.
    const bs = buildBatchState(flow, new Set(['a1']));
    expect(bs.size).toBe(2);
    const review = bs.get('review');
    expect(review).toBeDefined();
    expect([...review!.running]).toEqual(['a1']);
    expect(review!.maxConcurrent).toBe(2);
    expect(review!.joinRule).toBe('all_success');
    const check = bs.get('check');
    expect(check).toBeDefined();
    expect(check!.running.size).toBe(0);
    expect(check!.maxConcurrent).toBe(2);
  });
  it('uses joinRule from batchConfig when provided', () => {
    const flow: Flow = {
      name: 'join',
      description: '',
      steps: [
        { id: 'x', kind: 'agent', agent: 'a', batch: 'g1', run: () => 'p' },
      ],
      render: () => '',
      batchConfig: { maxConcurrent: 1, joinRule: 'one_success' },
    };
    const bs = buildBatchState(flow, new Set());
    expect(bs.get('g1')!.joinRule).toBe('one_success');
  });
  it('ignores steps without a batch field', () => {
    const flow: Flow = {
      name: 'mixed',
      description: '',
      steps: [
        { id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
        { id: 'b', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
      ],
      render: () => '',
      batchConfig: { maxConcurrent: 3 },
    };
    const bs = buildBatchState(flow, new Set(['a', 'b']));
    // a is inFlight but has no batch — it does not create an entry
    expect(bs.size).toBe(1);
    expect(bs.has('g1')).toBe(true);
    expect(bs.get('g1')!.running.has('b')).toBe(true);
    // a is not in any batch entry
    for (const entry of bs.values()) {
      expect(entry.running.has('a')).toBe(false);
    }
  });
 });
 describe('getReadyInBatch', () => {
  function makeBatchState(
    overrides?: Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>,
  ): Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }> {
    return overrides ?? new Map();
  }
  it('passes all steps through when batchState is empty', () => {
    const steps: Step[] = [
      { id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
      { id: 'b', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState: makeBatchState(),
    };
    const result = getReadyInBatch(steps, state, {} as Flow);
    expect(result.map((s) => s.id)).toEqual(['a', 'b']);
  });
  it('passes non-batched steps through regardless of batch capacity', () => {
    const batchState = new Map();
    batchState.set('g1', { running: new Set(['a']), maxConcurrent: 1, joinRule: 'all_success' });
    const steps: Step[] = [
      { id: 'nobatch', kind: 'agent', agent: 'z', run: () => 'r' },
      { id: 'batched', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(['a']),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState,
    };
    const result = getReadyInBatch(steps, state, {} as Flow);
    // nobatch passes; g1 is at maxConcurrent=1 with step 'a' already running, so batched is blocked
    expect(result.map((s) => s.id)).toEqual(['nobatch']);
  });
  it('allows batch steps up to maxConcurrent', () => {
    const batchState = new Map();
    batchState.set('g1', { running: new Set(), maxConcurrent: 2, joinRule: 'all_success' });
    const steps: Step[] = [
      { id: 's1', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
      { id: 's2', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
      { id: 's3', kind: 'agent', agent: 'z', batch: 'g1', run: () => 'r' },
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState,
    };
    // g1 has 0 running and maxConcurrent=2, so the group has spare capacity and all 3 steps
    // pass the filter on this tick. getReadyInBatch gates on the running count at call time,
    // not on how many candidates it returns; the runner's dispatch loop puts 2 in flight and
    // the next tick then blocks the third.
    const result = getReadyInBatch(steps, state, {} as Flow);
    expect(result.map((s) => s.id)).toEqual(['s1', 's2', 's3']);
  });
  it('blocks batch steps when at capacity', () => {
    const batchState = new Map();
    batchState.set('g1', { running: new Set(['a', 'b']), maxConcurrent: 2, joinRule: 'all_success' });
    const steps: Step[] = [
      { id: 'c', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
      { id: 'd', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(['a', 'b']),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState,
    };
    // g1 is at capacity (2 running, maxConcurrent=2) → both candidates are filtered out
    expect(getReadyInBatch(steps, state, {} as Flow)).toEqual([]);
  });
  it('handles multiple independent batch groups', () => {
    const batchState = new Map();
    batchState.set('g1', { running: new Set(['a']), maxConcurrent: 1, joinRule: 'all_success' });
    batchState.set('g2', { running: new Set(), maxConcurrent: 5, joinRule: 'all_success' });
    const steps: Step[] = [
      { id: 'b', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' }, // g1 at capacity → blocked
      { id: 'c', kind: 'agent', agent: 'y', batch: 'g2', run: () => 'q' }, // g2 has room → passes
      { id: 'd', kind: 'agent', agent: 'z', batch: 'g2', run: () => 'r' }, // g2 has room → passes
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(['a']),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState,
    };
    expect(getReadyInBatch(steps, state, {} as Flow).map((s) => s.id)).toEqual(['c', 'd']);
  });
  it('lets a step pass when its batch group is known but has no running steps yet', () => {
    const batchState = new Map();
    batchState.set('g1', { running: new Set(), maxConcurrent: 2, joinRule: 'all_success' });
    const steps: Step[] = [
      { id: 'first', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
    ];
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState,
    };
    expect(getReadyInBatch(steps, state, {} as Flow).map((s) => s.id)).toEqual(['first']);
  });
  it('handles empty step list gracefully', () => {
    const state: SchedulerState = {
      done: new Set(),
      skipped: new Set(),
      inFlight: new Set(),
      excluded: new Set(),
      timedOut: new Set(),
      switchResults: new Map(),
      batchState: makeBatchState(),
    };
    expect(getReadyInBatch([], state, {} as Flow)).toEqual([]);
  });
 });
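Review note: taken together, the `getReadyInBatch` cases describe a per-tick filter: non-batched steps always pass, steps whose group has no entry pass, and a batched step passes only while its group's running count is below `maxConcurrent`. A rough sketch of that gating is below. `SketchStep`, `SketchBatchEntry`, and `readyInBatchSketch` are hypothetical, and the real helper takes a `SchedulerState` and a `Flow` rather than a bare map.

```typescript
// Hypothetical minimal shapes for illustration only.
interface SketchStep {
  id: string;
  batch?: string;
}
interface SketchBatchEntry {
  running: Set<string>;
  maxConcurrent: number;
}

function readyInBatchSketch(
  steps: SketchStep[],
  batchState: Map<string, SketchBatchEntry>,
): SketchStep[] {
  return steps.filter((step) => {
    // Steps outside any batch are never gated.
    if (!step.batch) return true;
    const entry = batchState.get(step.batch);
    // Unknown group (e.g. flow without batchConfig): nothing to gate on.
    if (!entry) return true;
    // Gate on what is already running, not on how many candidates pass this tick.
    return entry.running.size < entry.maxConcurrent;
  });
}
```

Because the check looks only at the running set, a group with spare capacity can let more candidates through than it has free slots; the tests rely on the dispatch loop to enforce the cap when steps are actually launched.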
 // ─── Resume reconciliation (D-9) ─────────────────────────────────────────────
 describe('reconcileResumeStep', () => {
--- a/apps/coder/src/services/tests/paseo-client.test.ts
+++ b/apps/coder/src/services/tests/paseo-client.test.ts
@@ -0,0 +1,195 @@
 import { describe, it, expect, vi } from 'vitest';
 import { PaseoClient, PaseoClientError } from '../paseo-client.js';
 /**
 * Create a PaseoClient whose runCli method is replaced with a mock.
 * The mock is returned as the second tuple element so tests can
 * control and inspect it directly.
 */
 function makeClient(config?: { paseoBin?: string; cliHost?: string }): {
  client: PaseoClient;
  mockRunCli: ReturnType<typeof vi.fn>;
 } {
  const client = new PaseoClient(config);
  const mockRunCli = vi.fn();
  (client as any).runCli = mockRunCli;
  return { client, mockRunCli };
 }
 describe('PaseoClient', () => {
  describe('listAgents', () => {
    it('returns parsed agent list from paseo ls --json', async () => {
      const agents = [
        { id: 'abc-123', shortId: 'abc', name: 'Agent 1', provider: 'opencode', status: 'running' },
        { id: 'def-456', shortId: 'def', name: 'Agent 2', provider: 'claude', status: 'idle' },
      ];
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue(JSON.stringify(agents));
      const result = await client.listAgents();
      expect(mockRunCli).toHaveBeenCalledWith(['ls', '--json']);
      expect(result).toEqual(agents);
    });
    it('throws PaseoClientError on non-JSON output', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('not json');
      await expect(client.listAgents()).rejects.toThrow(PaseoClientError);
      await expect(client.listAgents()).rejects.toThrow(/invalid JSON/);
    });
    it('propagates runCli rejection as-is', async () => {
      const { client, mockRunCli } = makeClient();
      const err = new PaseoClientError('ls failed: connection refused', 'ls', 1, 'connection refused');
      mockRunCli.mockRejectedValue(err);
      await expect(client.listAgents()).rejects.toThrow(PaseoClientError);
      await expect(client.listAgents()).rejects.toThrow(/ls failed/);
    });
  });
  describe('getAgentStatus', () => {
    it('returns parsed agent detail from paseo inspect --json', async () => {
      const detail = {
        Id: 'abc-123', Name: 'Agent 1', Provider: 'opencode',
        Status: 'idle', Archived: false,
        CreatedAt: '2026-01-01T00:00:00Z', UpdatedAt: '2026-01-01T01:00:00Z',
      };
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue(JSON.stringify(detail));
      const result = await client.getAgentStatus('abc-123');
      expect(mockRunCli).toHaveBeenCalledWith(['inspect', '--json', 'abc-123']);
      expect(result.Id).toBe('abc-123');
      expect(result.Status).toBe('idle');
    });
  });
  describe('health', () => {
    it('returns ok when paseo ls succeeds', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('[]');
      const result = await client.health();
      expect(result).toEqual({ status: 'ok' });
    });
    it('returns error when runCli throws', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockRejectedValue(new Error('connection refused'));
      const result = await client.health();
      expect(result).toEqual({ status: 'error' });
    });
  });
  describe('importAgent', () => {
    it('calls paseo import with provider and labels', async () => {
      const agentResult = { Id: 'new-789', Name: 'Imported', Provider: 'opencode', Status: 'idle' };
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue(JSON.stringify(agentResult));
      const result = await client.importAgent('ses-001', 'opencode', {
        origin: 'boocode',
        project: 'proj-1',
      });
      expect(mockRunCli).toHaveBeenCalledWith([
        'import', '--json',
        '--provider', 'opencode',
        '--label', 'origin=boocode',
        '--label', 'project=proj-1',
        'ses-001',
      ]);
      expect(result.Id).toBe('new-789');
    });
    it('works without labels', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue(JSON.stringify({ Id: 'new-789' }));
      const result = await client.importAgent('ses-001', 'claude');
      expect(mockRunCli).toHaveBeenCalledWith([
        'import', '--json',
        '--provider', 'claude',
        'ses-001',
      ]);
      expect(result.Id).toBe('new-789');
    });
  });
  describe('archiveAgent', () => {
    it('calls paseo archive --json', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('{}');
      await client.archiveAgent('abc-123');
      expect(mockRunCli).toHaveBeenCalledWith(['archive', '--json', 'abc-123']);
    });
  });
  describe('sendPrompt', () => {
    it('sends prompt and parses JSON result', async () => {
      const sendResult = { text: 'Hello!', ok: true };
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue(JSON.stringify(sendResult));
      const result = await client.sendPrompt('abc-123', 'Hello');
      expect(mockRunCli).toHaveBeenCalledWith(['send', '--json', 'abc-123', 'Hello'], undefined);
      expect(result).toEqual(sendResult);
    });
    it('falls back to plain text on non-JSON output', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('plain text response');
      const result = await client.sendPrompt('abc-123', 'Hi');
      expect(result).toEqual({ text: 'plain text response', ok: true });
    });
    it('supports --no-wait flag', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('{}');
      await client.sendPrompt('abc-123', 'Hi', { noWait: true });
      expect(mockRunCli).toHaveBeenCalledWith([
        'send', '--json', '--no-wait',
        'abc-123', 'Hi',
      ], undefined);
    });
  });
  describe('stopAgent', () => {
    it('calls paseo stop', async () => {
      const { client, mockRunCli } = makeClient();
      mockRunCli.mockResolvedValue('');
      await client.stopAgent('abc-123');
      expect(mockRunCli).toHaveBeenCalledWith(['stop', 'abc-123']);
    });
  });
  describe('cliHost config', () => {
    it('includes --host flag in args when cliHost is set', async () => {
      const { client, mockRunCli } = makeClient({ cliHost: 'tcp://localhost:6767?ssl=true' });
      mockRunCli.mockResolvedValue('[]');
      await client.listAgents();
      expect(mockRunCli).toHaveBeenCalledWith([
        'ls', '--json', '--host', 'tcp://localhost:6767?ssl=true',
      ]);
    });
  });
 });
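Review note: the `importAgent` expectations imply a fixed argument order: subcommand and `--json` first, then `--provider`, then one `--label key=value` pair per label, with the session id last. A sketch of how such an argv could be assembled is below. `buildImportArgs` is a hypothetical helper; the flags are only those asserted in the tests above, and how `--host` combines with `import` is not covered there, so it is omitted.

```typescript
// Hypothetical helper: assembles the argv that the importAgent tests expect
// runCli to receive. Not the actual PaseoClient implementation.
function buildImportArgs(
  sessionId: string,
  provider: string,
  labels: Record<string, string> = {},
): string[] {
  const args: string[] = ['import', '--json', '--provider', provider];
  // Object.entries keeps insertion order for non-numeric string keys, so
  // labels appear in the order the caller wrote them.
  for (const [key, value] of Object.entries(labels)) {
    args.push('--label', `${key}=${value}`);
  }
  // The provider session id is the trailing positional argument.
  args.push(sessionId);
  return args;
}
```

Building the array in one place keeps the positional id from drifting ahead of the flags when new options are added.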
--- a/apps/coder/src/services/tests/plan-store.test.ts
+++ b/apps/coder/src/services/tests/plan-store.test.ts
@@ -0,0 +1,16 @@
 import { describe, it, expect } from 'vitest';
 import { planStatusFromRun } from '../plan-store.js';
 describe('planStatusFromRun', () => {
  it('maps completed to completed', () => {
    expect(planStatusFromRun('completed')).toBe('completed');
  });
  it('maps failed to failed', () => {
    expect(planStatusFromRun('failed')).toBe('failed');
  });
  it('maps cancelled to cancelled', () => {
    expect(planStatusFromRun('cancelled')).toBe('cancelled');
  });
 });
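Review note: these three cases leave `'running'` and the newly added `'timed_out'` run status untested, and `plans_status_chk` has no `'timed_out'` value to map onto. The sketch below shows one possible total mapping. Treating `'timed_out'` as `'failed'` and `'running'` as `'active'` is purely an assumption for illustration; the real `planStatusFromRun` is not shown in this diff and may behave differently.

```typescript
// Status sets copied from flow_runs_status_chk and plans_status_chk.
type RunStatus = 'running' | 'completed' | 'failed' | 'cancelled' | 'timed_out';
type PlanStatus = 'active' | 'completed' | 'cancelled' | 'failed';

// Hypothetical total mapping; only the first three arms are backed by tests.
function planStatusFromRunSketch(status: RunStatus): PlanStatus {
  switch (status) {
    case 'completed':
      return 'completed';
    case 'failed':
      return 'failed';
    case 'cancelled':
      return 'cancelled';
    case 'timed_out':
      // ASSUMPTION: plans have no timed_out status, so fold it into failed.
      return 'failed';
    case 'running':
      // ASSUMPTION: a run that is still going leaves the plan active.
      return 'active';
  }
}
```

Typing the parameter as a closed union makes the compiler flag the function if another run status is added to the CHECK later.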
--- a/apps/coder/src/services/agent-backend.ts
+++ b/apps/coder/src/services/agent-backend.ts
@@ -13,7 +13,7 @@ import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
 import type { AgentCommand } from './provider-types.js';
 /** Backend transport kind. Mirrors `agent_sessions.backend` CHECK in schema.sql. */
-export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk';
+export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk' | 'paseo';
 /**
 * Normalized, transport-agnostic events a backend emits during a turn (§2).
--- a/apps/coder/src/services/backends/paseo.ts
+++ b/apps/coder/src/services/backends/paseo.ts
@@ -0,0 +1,254 @@
 /**
 * v2.10 — PaseoBackend: Paseo agent integration for the agent-pool.
 *
 * Wraps the Paseo CLI daemon as an AgentBackend. Each Paseo agent maps to one
 * (chat_id, agent) pair and is persisted via `paseo import` (which registers
 * an agent with the Paseo daemon). Prompts are sent via `paseo send`, and
 * the session is cleaned up via `paseo archive`.
 *
 * Paseo is a meta-agent hub — it wraps provider sessions (opencode, claude,
 * acp, etc.). The `provider` option in `EnsureSessionOpts` selects which
 * provider Paseo delegates to.
 *
 * Backend kind: 'paseo' (must be added to agent_sessions_backend_chk).
 *
 * Spec: openspec/changes/v2-10-paseo-integration/design.md.
 */
 import type { FastifyBaseLogger } from 'fastify';
 import type { Sql } from '../../db.js';
 import { PaseoClient, type PaseoSendResult } from '../paseo-client.js';
 import type {
  AgentBackend,
  AgentSessionHandle,
  EnsureSessionOpts,
  PromptCtx,
  TurnResult,
 } from '../agent-backend.js';
 /** Default provider to use when Paseo wraps a generic agent. */
 const DEFAULT_PASEO_PROVIDER = 'opencode';
 export interface PaseoBackendDeps {
  sql: Sql;
  log: FastifyBaseLogger;
  /** The (chat, agent) this backend serves — its pool identity + DB key. */
  chatId: string;
  /** Agent name (e.g. 'opencode', 'claude', 'paseo'). */
  agent: string;
  /** Resolved PaseoClient instance. */
  client: PaseoClient;
  /** Provider string to pass to `paseo import --provider`. */
  provider: string;
 }
 export class PaseoBackend implements AgentBackend {
  readonly backend = 'paseo' as const;
  private readonly sql: Sql;
  private readonly log: FastifyBaseLogger;
  private readonly chatId: string;
  private readonly agent: string;
  private readonly client: PaseoClient;
  private readonly provider: string;
  /** Map of BooCode sessionId → Paseo agent ID. */
  private readonly agentIds = new Map<string, string>();
  /** True between prompt() start and settle. */
  private busy = false;
  private up = false;
  constructor(deps: PaseoBackendDeps) {
    this.sql = deps.sql;
    this.log = deps.log;
    this.chatId = deps.chatId;
    this.agent = deps.agent;
    this.client = deps.client;
    this.provider = deps.provider || DEFAULT_PASEO_PROVIDER;
  }
  /** §2: liveness for the health endpoint + dispatcher fallback decision. */
  health(): 'up' | 'down' {
    return this.up ? 'up' : 'down';
  }
  /** Phase 3: busy iff a turn is in flight (pool never evicts a busy backend). */
  isBusy(): boolean {
    return this.busy;
  }
  // ─── ensureSession: create/import a Paseo agent ─────────────────────────────
  async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
    // Check if we already have a Paseo agent ID for this session.
    let paseoId = this.agentIds.get(sessionId);
    if (!paseoId) {
      // Resolve existing agent_session_id from DB (e.g. after a restart).
      const [row] = await this.sql<{ agent_session_id: string | null }[]>`
        SELECT agent_session_id FROM agent_sessions
        WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent} AND backend = 'paseo'
      `;
      if (row?.agent_session_id) {
        paseoId = row.agent_session_id;
        this.agentIds.set(sessionId, paseoId);
      }
    }
    if (!paseoId) {
      // Import a new Paseo agent. Use the session UUID as the provider session id.
      const labels: Record<string, string> = {
        origin: 'boocode',
        project: opts.projectId,
        chat: opts.chatId,
        worktree: opts.worktreeId,
        agent: this.agent,
      };
      try {
        const agent = await this.client.importAgent(sessionId, this.provider, labels);
        paseoId = agent.Id;
        this.agentIds.set(sessionId, paseoId);
        this.log.info(
          { paseoId, agent: this.agent, chatId: this.chatId },
          'paseo: imported agent',
        );
      } catch (err) {
        this.log.error(
          { err: String(err), agent: this.agent, chatId: this.chatId },
          'paseo: importAgent failed',
        );
        throw err;
      }
    }
    // Upsert the agent_sessions row.
    await this.sql`
      INSERT INTO agent_sessions
        (chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
      VALUES
        (${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'paseo', ${paseoId}, NULL, 'active', clock_timestamp())
      ON CONFLICT (chat_id, agent) DO UPDATE SET
        session_id = EXCLUDED.session_id,
        worktree_id = EXCLUDED.worktree_id,
        backend = 'paseo',
        agent_session_id = COALESCE(EXCLUDED.agent_session_id, agent_sessions.agent_session_id),
        server_port = NULL,
        status = 'active',
        last_active_at = clock_timestamp()
    `.catch((err) => {
      this.log.warn(
        { err: String(err), chatId: opts.chatId, agent: opts.agent },
        'paseo: agent_sessions upsert failed (non-fatal)',
      );
    });
    this.up = true;
    return {
      sessionId,
      agent: opts.agent,
      backend: 'paseo',
      chatId: opts.chatId,
      worktreeId: opts.worktreeId,
      agentSessionId: paseoId,
      serverPort: null,
    };
  }
  // ─── prompt: send a message to the Paseo agent ─────────────────────────────
  async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
    const paseoId = handle.agentSessionId;
    if (!paseoId) {
      return { ok: false, error: 'paseo: no agent session id in handle' };
    }
    this.busy = true;
    try {
      // Use streamSend for real-time text output via onEvent.
      const result: PaseoSendResult = await this.client.streamSend(
        paseoId,
        input,
        (event) => {
          ctx.onEvent(event);
        },
        ctx.signal,
      );
      // Update last_active_at.
      await this.sql`
        UPDATE agent_sessions
        SET last_active_at = clock_timestamp()
        WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
      `.catch(() => { /* non-fatal */ });
      if (result.error) {
        return { ok: false, error: result.error };
      }
      return { ok: true };
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      // If the caller aborted, report a cancellation rather than a transport error.
      if (ctx.signal.aborted) {
        return { ok: false, error: 'cancelled' };
      }
      return { ok: false, error: `paseo: ${msg}` };
    } finally {
      this.busy = false;
    }
  }
  // ─── closeSession: archive the Paseo agent ─────────────────────────────────
  async closeSession(handle: AgentSessionHandle): Promise<void> {
    const paseoId = handle.agentSessionId;
    if (!paseoId) return;
    try {
      await this.client.archiveAgent(paseoId);
      this.log.info({ paseoId, agent: handle.agent }, 'paseo: archived agent');
    } catch (err) {
      this.log.warn(
        { err: String(err), paseoId, agent: handle.agent },
        'paseo: archiveAgent failed (non-fatal)',
      );
    }
    this.agentIds.delete(handle.sessionId);
    // Update DB row.
    await this.sql`
      UPDATE agent_sessions
      SET status = 'closed', last_active_at = clock_timestamp()
      WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
    `.catch(() => { /* non-fatal */ });
  }
  // ─── dispose: archive all tracked agents ───────────────────────────────────
  async dispose(): Promise<void> {
    const ids = [...this.agentIds.values()];
    this.agentIds.clear();
    for (const paseoId of ids) {
      try {
        await this.client.archiveAgent(paseoId);
      } catch {
        // Best-effort cleanup during shutdown.
      }
    }
    this.up = false;
  }
  /** Phase 3: periodic health tick — probes the Paseo daemon. */
  async tickHealth(_now?: number): Promise<void> {
    try {
      const h = await this.client.health();
      this.up = h.status === 'ok';
    } catch {
      this.up = false;
    }
  }
 }
--- a/apps/coder/src/services/behavioral/generation.ts
+++ b/apps/coder/src/services/behavioral/generation.ts
@@ -0,0 +1,204 @@
 /**
 * Schematic generator for behavioral guideline batches.
 *
 * Port of boocontext-audit/src/generation.ts — abstract LLM batch caller
 * with temperature retry and structured output per batch type.
 */
 import { type GenerationInfo } from './matching.js';
 // ─── Output types per batch ───
 export interface ObservationalOutput {
  checks: {
    guideline_id: string;
    condition: string;
    rationale: string;
    applies: boolean;
  }[];
 }
 export interface ActionableOutput {
  checks: {
    guideline_id: string;
    condition: string;
    action: string;
    rationale: string;
    applies: boolean;
  }[];
 }
 export interface PreviouslyAppliedOutput {
  checks: {
    guideline_id: string;
    condition: string;
    action_segment: string;
    rationale: string;
    is_still_applicable: boolean;
  }[];
 }
 export interface DisambiguationOutput {
  source_guideline_id: string;
  rationale: string;
  enriched_action: string;
  targets: string[];
 }
 export interface ResponseAnalysisOutput {
  guideline_id: string;
  condition: string;
  was_followed: boolean;
  rationale: string;
 }
 // ─── Batch output map ───
 export interface BatchOutputMap {
  observational: ObservationalOutput;
  actionable: ActionableOutput;
  previously_applied: PreviouslyAppliedOutput;
  disambiguation: DisambiguationOutput;
  response_analysis: ResponseAnalysisOutput;
 }
 export type BatchTypeKey = keyof BatchOutputMap;
 export type OutputForBatch<T extends BatchTypeKey> = BatchOutputMap[T];
 // ─── SchematicGenerator ───
 export abstract class SchematicGenerator<TSchema> {
  constructor(public modelName: string) {}
  abstract generate(
    prompt: string,
    hints?: Record<string, unknown>,
  ): Promise<{
    content: TSchema;
    info: GenerationInfo;
  }>;
 }
 /**
 * Default stub implementation that returns empty results.
 * Replace with a real LLM caller in production.
 */
 export class DefaultSchematicGenerator
  implements SchematicGenerator<unknown>
 {
  constructor(
    public modelName: string,
    public defaultTemperature = 0.7,
  ) {}
  async generate(
    _prompt: string,
    hints?: Record<string, unknown>,
  ): Promise<{ content: unknown; info: GenerationInfo }> {
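    // Only accept a numeric hint; anything else falls back to the default.
    const hinted = hints?.temperature;
    const temperature = typeof hinted === 'number' ? hinted : this.defaultTemperature;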
    return {
      content: {},
      info: {
        model: this.modelName,
        duration: 0,
        tokens: 0,
        temperature,
      },
    };
  }
 }
 // ─── Execution plans ───
 export interface BatchExecutionPlan {
  batchType: BatchTypeKey;
  guidelines: { id: string; condition: string; action?: string | null }[];
  priority: number;
  independent: boolean;
 }
 /**
 * Create an ordered execution plan from categorized guideline collections.
 * Groups are sorted by priority: previously_applied (fastest) first,
 * then observational, actionable, disambiguation, low-criticality last.
 */
 export function createExecutionPlan(
  observational: { id: string; condition: string }[],
  actionable: { id: string; condition: string; action: string }[],
  previouslyApplied: { id: string; condition: string; action?: string | null }[],
  disambiguationGroups: { source: string; targets: string[]; enrichedAction: string }[],
  lowCriticality: { id: string; condition: string }[],
 ): BatchExecutionPlan[] {
  const plans: BatchExecutionPlan[] = [];
  if (observational.length > 0) {
    plans.push({
      batchType: 'observational',
      guidelines: observational.map((g) => ({ id: g.id, condition: g.condition })),
      priority: 1,
      independent: true,
    });
  }
  if (actionable.length > 0) {
    plans.push({
      batchType: 'actionable',
      guidelines: actionable.map((g) => ({
        id: g.id,
        condition: g.condition,
        action: g.action,
      })),
      priority: 2,
      independent: true,
    });
  }
  if (previouslyApplied.length > 0) {
    plans.push({
      batchType: 'previously_applied',
      guidelines: previouslyApplied.map((g) => ({
        id: g.id,
        condition: g.condition,
        action: g.action,
      })),
      priority: 0,
      independent: true,
    });
  }
  if (disambiguationGroups.length > 0) {
    plans.push({
      batchType: 'disambiguation',
      guidelines: disambiguationGroups.map((g) => ({
        id: g.source,
        condition: g.enrichedAction,
      })),
      priority: 3,
      independent: true,
    });
  }
  if (lowCriticality.length > 0) {
    plans.push({
      batchType: 'observational',
      guidelines: lowCriticality.map((g) => ({ id: g.id, condition: g.condition })),
      priority: 10,
      independent: true,
    });
  }
  return plans.sort((a, b) => a.priority - b.priority);
 }
 /**
 * Compute retry temperatures: base + 0.2 * attempt.
 * Provides progressive temperature increases for failed calls.
 */
 export function getRetryTemperatures(baseTemp: number, maxAttempts = 3): number[] {
  const temps: number[] = [];
  for (let i = 0; i < maxAttempts; i++) {
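    // Round to two decimals so 0.7 + 0.2 yields 0.9 rather than 0.8999999999999999.
    temps.push(Math.round((baseTemp + i * 0.2) * 100) / 100);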
  }
  return temps;
 }
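A minimal sketch of how the retry temperatures could drive an actual retry loop. `generateWithRetry` and its `call` parameter are hypothetical names used only for illustration; they are not part of this change. The local `getRetryTemperatures` copy mirrors the `base + 0.2 * attempt` formula above so the example stands alone.

```typescript
// Local copy of the formula described above (base + 0.2 * attempt), so this sketch is self-contained.
function getRetryTemperatures(baseTemp: number, maxAttempts = 3): number[] {
  const temps: number[] = [];
  for (let i = 0; i < maxAttempts; i++) {
    temps.push(baseTemp + i * 0.2);
  }
  return temps;
}

// Hypothetical helper: calls `call` once per temperature until one attempt succeeds.
// Assumption: `call` throws (rejects) when the model output is unusable.
async function generateWithRetry<T>(
  call: (temperature: number) => Promise<T>,
  baseTemp = 0.7,
  maxAttempts = 3,
): Promise<{ value: T; temperature: number; attempt: number }> {
  if (maxAttempts < 1) {
    throw new RangeError('maxAttempts must be at least 1');
  }
  const temps = getRetryTemperatures(baseTemp, maxAttempts);
  let lastError: unknown;
  for (let attempt = 0; attempt < temps.length; attempt++) {
    const temperature = temps[attempt]!;
    try {
      const value = await call(temperature);
      return { value, temperature, attempt };
    } catch (err) {
      // Keep the failure and move on to the next, warmer attempt.
      lastError = err;
    }
  }
  throw lastError;
}
```

The zero-based `attempt` in the result could be copied into `GenerationInfo.attempt` by a caller that wants to record which try succeeded.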
--- a/apps/coder/src/services/behavioral/index.ts
+++ b/apps/coder/src/services/behavioral/index.ts
@@ -0,0 +1,77 @@
 /**
 * Behavioral engine — multi-batch matcher and relational resolver.
 *
 * Import from the existing guideline-service.ts:
 *   import { MultiBatchMatcher } from './behavioral/matching.js';
 *   import { RelationalResolver } from './behavioral/resolver.js';
 */
 // matching.ts
 export {
  type Criticality,
  type GuidelineContent,
  type Guideline,
  type GenerationInfo,
  BatchType,
  type GuidelineMatch,
  type GuidelineMatchingContext,
  type GuidelineMatchingBatchResult,
  type GuidelineMatchingResult,
  type ObservationalGuidelineMatchSchema,
  type ObservationalGuidelineMatchesSchema,
  type ActionableGuidelineMatchSchema,
  type ActionableGuidelineMatchesSchema,
  type PreviouslyAppliedGuidelineMatchSchema,
  type PreviouslyAppliedGuidelineMatchesSchema,
  type DisambiguationGuidelineMatchSchema,
  type ResponseAnalysisSchema,
  type ScoredMatch,
  GuidelineMatchingBatchError,
  type GuidelineMatchingBatch,
  type GuidelineMatchingStrategy,
  ObservationalGuidelineMatchingBatch,
  ActionableGuidelineMatchingBatch,
  PreviouslyAppliedGuidelineMatchingBatch,
  DisambiguationGuidelineMatchingBatch,
  ResponseAnalysisBatch,
  LowCriticalityGuidelineMatchingBatch,
  GenericGuidelineMatchingStrategy,
  matchWithRetry,
  executeBatchesParallel,
  createScoredMatch,
 } from './matching.js';
 // resolver.ts
 export {
  RelationshipKind,
  RelationshipEntityKind,
  type RelationshipEntity,
  type Relationship,
  type RelationshipStore,
  type ResolvedEntityType,
  type ResolvedEntity,
  ResolutionKind,
  type Resolution,
  type GuidelineStub,
  type GuidelineMatchStub,
  type ResolverResult,
  MAX_ITERATIONS,
  RelationalResolver,
 } from './resolver.js';
 // generation.ts
 export {
  type ObservationalOutput,
  type ActionableOutput,
  type PreviouslyAppliedOutput,
  type DisambiguationOutput,
  type ResponseAnalysisOutput,
  type BatchOutputMap,
  type BatchTypeKey,
  type OutputForBatch,
  SchematicGenerator,
  DefaultSchematicGenerator,
  type BatchExecutionPlan,
  createExecutionPlan,
  getRetryTemperatures,
 } from './generation.js';
--- a/apps/coder/src/services/behavioral/matching.ts
+++ b/apps/coder/src/services/behavioral/matching.ts
@@ -0,0 +1,435 @@
 /**
 * Multi-batch matcher for behavioral guidelines.
 *
 * Port of boocontext-audit/src/matching.ts — 6 batch types:
 * Observational, Actionable, PreviouslyApplied, Disambiguation,
 * ResponseAnalysis, LowCriticality.
 */
 // ─── Guideline types (compatible with guideline-service.ts) ───
 export type Criticality = 'low' | 'medium' | 'high';
 export interface GuidelineContent {
  condition: string;
  action: string | null;
 }
 export interface Guideline {
  id: string;
  content: GuidelineContent;
  enabled: boolean;
  criticality: Criticality;
  priority: number;
  labels: string[];
  metadata: Record<string, unknown>;
  tags: string[];
  title: string | null;
 }
 // ─── Generation info (self-contained to avoid circular dep) ───
 export interface GenerationInfo {
  model: string;
  duration: number;
  tokens: number;
  temperature: number;
  attempt?: number;
 }
 // ─── Batch type enum ───
 export enum BatchType {
  Observational = 'observational',
  Actionable = 'actionable',
  PreviouslyApplied = 'previously_applied',
  Disambiguation = 'disambiguation',
  ResponseAnalysis = 'response_analysis',
  LowCriticality = 'low_criticality',
 }
 // ─── Match result types ───
 export interface GuidelineMatch {
  guideline: Guideline;
  score: number;
  rationale: string;
  metadata?: Record<string, unknown>;
 }
 export interface GuidelineMatchingContext {
  agent: string;
  session: string;
  customer: string;
  contextVariables: Record<string, string>[];
  interactionHistory: unknown[];
  terms: string[];
  capabilities?: string[];
  stagedEvents?: unknown[];
  activeJourneys?: unknown[];
  journeyPaths?: Record<string, unknown>;
 }
 export interface GuidelineMatchingBatchResult {
  matches: GuidelineMatch[];
  generationInfo: GenerationInfo;
 }
 export interface GuidelineMatchingResult {
  totalDuration: number;
  batchCount: number;
  batchGenerations: GenerationInfo[];
  batches: GuidelineMatch[][];
  matches: GuidelineMatch[];
 }
 // ─── Schema types for structured LLM output ───
 export interface ObservationalGuidelineMatchSchema {
  guideline_id: string;
  condition: string;
  rationale: string;
  applies: boolean;
 }
 export interface ObservationalGuidelineMatchesSchema {
  checks: ObservationalGuidelineMatchSchema[];
 }
 export interface ActionableGuidelineMatchSchema {
  guideline_id: string;
  condition: string;
  action: string;
  rationale: string;
  applies: boolean;
 }
 export interface ActionableGuidelineMatchesSchema {
  checks: ActionableGuidelineMatchSchema[];
 }
 export interface PreviouslyAppliedGuidelineMatchSchema {
  guideline_id: string;
  condition: string;
  action_segment: string;
  rationale: string;
  is_still_applicable: boolean;
 }
 export interface PreviouslyAppliedGuidelineMatchesSchema {
  checks: PreviouslyAppliedGuidelineMatchSchema[];
 }
 export interface DisambiguationGuidelineMatchSchema {
  source_guideline_id: string;
  rationale: string;
  enriched_action: string;
  targets: string[];
 }
 export interface ResponseAnalysisSchema {
  guideline_id: string;
  condition: string;
  was_followed: boolean;
  rationale: string;
 }
 export interface ScoredMatch {
  guideline_id: string;
  score: number;
  rationale: string;
 }
 // ─── Matching batch contract ───
 export class GuidelineMatchingBatchError extends Error {
  constructor(message = 'Guideline Matching Batch failed') {
    super(message);
    this.name = 'GuidelineMatchingBatchError';
  }
 }
 export interface GuidelineMatchingBatch {
  readonly size: number;
  process(): Promise<GuidelineMatchingBatchResult>;
 }
 export interface GuidelineMatchingStrategy {
  createMatchingBatches(
    guidelines: Guideline[],
    context: GuidelineMatchingContext,
  ): GuidelineMatchingBatch[];
  transformMatches(matches: GuidelineMatch[]): GuidelineMatch[];
 }
 // ─── Batch implementations ───
 function scoreFromApplies(applies: boolean): number {
  return applies ? 10 : 1;
 }
 export class ObservationalGuidelineMatchingBatch implements GuidelineMatchingBatch {
  constructor(
    public guidelines: Guideline[],
    public context: GuidelineMatchingContext,
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return this.guidelines.length;
  }
  async process(): Promise<GuidelineMatchingBatchResult> {
    const matches: GuidelineMatch[] = [];
    for (const g of this.guidelines) {
      if (g.content.action !== null && g.content.action !== undefined) continue;
      matches.push({
        guideline: g,
        score: 10,
        rationale: `Observational batch evaluated: "${g.content.condition}"`,
        metadata: { batch_type: BatchType.Observational },
      });
    }
    return { matches, generationInfo: this.generationInfo };
  }
 }
 export class ActionableGuidelineMatchingBatch implements GuidelineMatchingBatch {
  constructor(
    public guidelines: Guideline[],
    public context: GuidelineMatchingContext,
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return this.guidelines.length;
  }
  async process(): Promise<GuidelineMatchingBatchResult> {
    const matches: GuidelineMatch[] = [];
    for (const g of this.guidelines) {
      if (g.content.action === null || g.content.action === undefined) continue;
      if (g.content.action === '') continue;
      matches.push({
        guideline: g,
        score: 10,
        rationale: `Actionable batch evaluated: when "${g.content.condition}", then "${g.content.action}"`,
        metadata: { batch_type: BatchType.Actionable },
      });
    }
    return { matches, generationInfo: this.generationInfo };
  }
 }
 export class PreviouslyAppliedGuidelineMatchingBatch implements GuidelineMatchingBatch {
  constructor(
    public guidelines: Guideline[],
    public context: GuidelineMatchingContext,
    public priorMatches: GuidelineMatch[],
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return this.guidelines.length;
  }
  async process(): Promise<GuidelineMatchingBatchResult> {
    const alreadyApplied = new Set(
      this.priorMatches.filter((m) => m.score >= 10).map((m) => m.guideline.id),
    );
    const matches: GuidelineMatch[] = [];
    for (const g of this.guidelines) {
      if (alreadyApplied.has(g.id)) {
        matches.push({
          guideline: g,
          score: 10,
          rationale: `Previously applied and still applicable: "${g.content.condition}"`,
          metadata: { batch_type: BatchType.PreviouslyApplied },
        });
      }
    }
    return { matches, generationInfo: this.generationInfo };
  }
 }
 export class DisambiguationGuidelineMatchingBatch implements GuidelineMatchingBatch {
  constructor(
    public disambiguationGuideline: Guideline,
    public targets: Guideline[],
    public context: GuidelineMatchingContext,
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return 1 + this.targets.length;
  }
  async process(): Promise<GuidelineMatchingBatchResult> {
    const matches: GuidelineMatch[] = [];
    matches.push({
      guideline: this.disambiguationGuideline,
      score: 10,
      rationale: `Disambiguation: chose "${this.disambiguationGuideline.content.condition}" over targets`,
      metadata: {
        batch_type: BatchType.Disambiguation,
        disambiguation: {
          targets: this.targets.map((t) => t.id),
          enriched_action: this.disambiguationGuideline.content.action ?? '',
        },
      },
    });
    return { matches, generationInfo: this.generationInfo };
  }
 }
 export class ResponseAnalysisBatch {
  constructor(
    public guidelineMatches: GuidelineMatch[],
    public context: Record<string, unknown>,
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return this.guidelineMatches.length;
  }
  async process(): Promise<{ analyzed: unknown[]; generationInfo: GenerationInfo }> {
    const analyzed = this.guidelineMatches.map((m) => ({
      guideline: m.guideline,
      is_previously_applied: m.score >= 10,
    }));
    return { analyzed, generationInfo: this.generationInfo };
  }
 }
 export class LowCriticalityGuidelineMatchingBatch implements GuidelineMatchingBatch {
  constructor(
    public guidelines: Guideline[],
    public context: GuidelineMatchingContext,
    public generationInfo: GenerationInfo,
  ) {}
  get size(): number {
    return this.guidelines.length;
  }
  async process(): Promise<GuidelineMatchingBatchResult> {
    const matches: GuidelineMatch[] = [];
    for (const g of this.guidelines) {
      if (g.criticality !== 'low') continue;
      matches.push({
        guideline: g,
        score: g.content.action ? 10 : 1,
        rationale: `Low-criticality batch: "${g.content.condition}"`,
        metadata: { batch_type: BatchType.LowCriticality },
      });
    }
    return { matches, generationInfo: this.generationInfo };
  }
 }
 // ─── Strategy ───
 export class GenericGuidelineMatchingStrategy implements GuidelineMatchingStrategy {
  constructor(public generationInfo: GenerationInfo) {}
  createMatchingBatches(
    guidelines: Guideline[],
    context: GuidelineMatchingContext,
  ): GuidelineMatchingBatch[] {
    const observational: Guideline[] = [];
    const actionable: Guideline[] = [];
    const lowCriticality: Guideline[] = [];
    for (const g of guidelines) {
      if (g.criticality === 'low') {
        lowCriticality.push(g);
      } else if (g.content.action) {
        actionable.push(g);
      } else {
        // Condition-only guidelines (no action) are observational.
        observational.push(g);
      }
    }
    const batches: GuidelineMatchingBatch[] = [];
    if (observational.length > 0) {
      batches.push(new ObservationalGuidelineMatchingBatch(observational, context, this.generationInfo));
    }
    if (actionable.length > 0) {
      batches.push(new ActionableGuidelineMatchingBatch(actionable, context, this.generationInfo));
    }
    if (lowCriticality.length > 0) {
      batches.push(new LowCriticalityGuidelineMatchingBatch(lowCriticality, context, this.generationInfo));
    }
    return batches;
  }
  transformMatches(matches: GuidelineMatch[]): GuidelineMatch[] {
    const seen = new Set<string>();
    return matches.filter((m) => {
      const key = m.guideline.id;
      if (seen.has(key)) return false;
      seen.add(key);
      return true;
    });
  }
 }
 // ─── Utilities ───
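 /**
 * Run `fn` up to `maxAttempts` times, rethrowing the last error if every attempt fails.
 * `_baseTemperature` is accepted but currently unused.
 */
 export async function matchWithRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  _baseTemperature = 0.7,
 ): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Remember the failure and fall through to the next attempt, if any.
      lastError = err;
    }
  }
  throw lastError;
 }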
 export async function executeBatchesParallel(
  batches: GuidelineMatchingBatch[],
  _generationInfo: GenerationInfo,
 ): Promise<GuidelineMatchingResult> {
  const start = Date.now();
  const results = await Promise.all(
    batches.map((batch) => matchWithRetry(() => batch.process())),
  );
  const allBatches = results.map((r) => r.matches);
  const allMatches = allBatches.flat();
  const allGenInfos = results.map((r) => r.generationInfo);
  return {
    totalDuration: Date.now() - start,
    batchCount: batches.length,
    batchGenerations: allGenInfos,
    batches: allBatches,
    matches: allMatches,
  };
 }
 export function createScoredMatch(
  guidelineId: string,
  score: number,
  rationale: string,
 ): ScoredMatch {
  return { guideline_id: guidelineId, score, rationale };
 }
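A reduced sketch of the run-in-parallel-then-deduplicate flow that `executeBatchesParallel` and `transformMatches` implement between them. `MiniMatch`, `MiniBatch`, and `runBatches` are simplified stand-ins invented for this example, not types from this change.

```typescript
// Simplified stand-ins for GuidelineMatch / GuidelineMatchingBatch (illustrative only).
interface MiniMatch {
  guidelineId: string;
  score: number;
}

interface MiniBatch {
  process(): Promise<MiniMatch[]>;
}

// Runs every batch concurrently, then keeps the first match seen per guideline id.
// Promise.all preserves input order, so "first" means the earliest batch in the array.
async function runBatches(batches: MiniBatch[]): Promise<MiniMatch[]> {
  const perBatch = await Promise.all(batches.map((batch) => batch.process()));
  const all = ([] as MiniMatch[]).concat(...perBatch);
  const seen = new Set<string>();
  return all.filter((match) => {
    if (seen.has(match.guidelineId)) return false;
    seen.add(match.guidelineId);
    return true;
  });
}
```

Because `Promise.all` rejects as soon as any batch rejects, one failing batch discards the results of the others; a caller that prefers partial results could use `Promise.allSettled` instead.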
--- a/apps/coder/src/services/behavioral/resolver.ts
+++ b/apps/coder/src/services/behavioral/resolver.ts
@@ -0,0 +1,355 @@
 /**
 * Relational resolver for behavioral guidelines.
 *
 * Port of boocontext-audit/src/resolver.ts — resolves DEPENDS_ON,
 * PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES relationships
 * with an iterative convergence loop.
 */
 // ─── Relationship types (self-contained) ───
 export enum RelationshipKind {
  DEPENDS_ON = 'depends_on',
  PRIORITIZES = 'prioritizes',
  ENTAILS = 'entails',
  TAG_ALL = 'tag_all',
  TAG_PRIORITIZES = 'tag_prioritizes',
 }
 export enum RelationshipEntityKind {
  GUIDELINE = 'guideline',
  TAG = 'tag',
 }
 export interface RelationshipEntity {
  id: string;
  kind: RelationshipEntityKind;
 }
 export interface Relationship {
  id: string;
  creation_utc: string;
  source: RelationshipEntity;
  target: RelationshipEntity;
  kind: RelationshipKind;
  group_id?: string;
 }
 /**
 * Minimal relationship store interface.
 * The resolver only needs listRelationships. Implementations
 * can back against files, postgres, or in-memory maps.
 */
 export interface RelationshipStore {
  listRelationships(
    kind?: RelationshipKind,
    sourceId?: string,
    targetId?: string,
  ): Promise<Relationship[]>;
 }
 // ─── Resolution types ───
 export type ResolvedEntityType = 'guideline' | 'journey' | 'tag';
 export interface ResolvedEntity {
  entityType: ResolvedEntityType;
  entityId: string;
 }
 export enum ResolutionKind {
  NONE = 'none',
  UNMET_DEPENDENCY = 'unmet_dependency',
  DEPRIORITIZED = 'deprioritized',
  ENTAILED = 'entailed',
 }
 export interface Resolution {
  kind: ResolutionKind;
  description: string;
  relationshipId?: string;
  counterparts?: ResolvedEntity[];
 }
 export interface GuidelineStub {
  id: string;
  priority: number;
  tags: string[];
 }
 export interface GuidelineMatchStub {
  guideline: GuidelineStub;
 }
 export interface ResolverResult {
  matchedIds: Set<string>;
  resolutions: Map<string, Resolution[]>;
  converged: boolean;
  iterations: number;
 }
 // ─── Constants ───
 export const MAX_ITERATIONS = 100;
 // ─── RelationalResolver ───
 export class RelationalResolver {
  private store: RelationshipStore;
  constructor(store: RelationshipStore) {
    this.store = store;
  }
  async resolve(
    matchedIds: Set<string>,
    allGuidelines: GuidelineStub[],
  ): Promise<ResolverResult> {
    const resolutions = new Map<string, Resolution[]>();
    const guidelinesById = new Map(allGuidelines.map((g) => [g.id, g]));
    let currentIds = new Set(matchedIds);
    const priorityRemoved = new Set<string>();
    const entailedIds = new Set<string>();
    let converged = false;
    let iterations = 0;
    for (iterations = 0; iterations < MAX_ITERATIONS; iterations++) {
      const candidateIds = new Set(
        [...currentIds].filter((id) => !priorityRemoved.has(id)),
      );
      const step1Ids = await this.applyDependencies(candidateIds, guidelinesById, resolutions);
      const step2Ids = await this.applyPrioritization(
        step1Ids,
        guidelinesById,
        resolutions,
        priorityRemoved,
      );
      const step3Ids = this.applyNumericalPriority(
        step2Ids,
        guidelinesById,
        resolutions,
        priorityRemoved,
        entailedIds,
      );
      const step4Ids = await this.applyEntailment(
        step3Ids,
        guidelinesById,
        resolutions,
        priorityRemoved,
        entailedIds,
      );
      if (this.setsEqual(step4Ids, currentIds)) {
        converged = true;
        break;
      }
      currentIds = step4Ids;
    }
    for (const id of allGuidelines.map((g) => g.id)) {
      if (!resolutions.has(id)) {
        resolutions.set(id, [
          { kind: ResolutionKind.NONE, description: 'No relational changes' },
        ]);
      }
    }
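    return {
      matchedIds: currentIds,
      resolutions,
      converged,
      // On convergence the loop breaks before the increment, so count the current pass;
      // on exhaustion `iterations` already equals MAX_ITERATIONS.
      iterations: converged ? iterations + 1 : iterations,
    };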
  }
  // ── Private steps ──
  private async applyDependencies(
    candidateIds: Set<string>,
    _guidelinesById: Map<string, GuidelineStub>,
    resolutions: Map<string, Resolution[]>,
  ): Promise<Set<string>> {
    const surviving = new Set(candidateIds);
    const cache = new Map<string, Relationship[]>();
    for (const gid of candidateIds) {
      const rels = await this.getRelationshipsFromCache(cache, gid, RelationshipKind.DEPENDS_ON);
      for (const rel of rels) {
        const targetId = rel.target.id;
        if (!candidateIds.has(targetId)) {
          surviving.delete(gid);
          this.addResolution(resolutions, gid, {
            kind: ResolutionKind.UNMET_DEPENDENCY,
            description: `Depends on ${targetId} which is not matched`,
            relationshipId: rel.id,
            counterparts: [{ entityType: 'guideline' as const, entityId: targetId }],
          });
          break;
        }
      }
    }
    return surviving;
  }
  private async applyPrioritization(
    candidateIds: Set<string>,
    guidelinesById: Map<string, GuidelineStub>,
    resolutions: Map<string, Resolution[]>,
    priorityRemoved: Set<string>,
  ): Promise<Set<string>> {
    const surviving = new Set(candidateIds);
    const cache = new Map<string, Relationship[]>();
    for (const gid of candidateIds) {
      if (priorityRemoved.has(gid)) continue;
      const allRels = await this.getAllRelationships(cache, gid);
      const priorityRels = allRels.filter((r) => r.kind === RelationshipKind.PRIORITIZES);
      for (const rel of priorityRels) {
        const sourceId = rel.source.id;
        if (sourceId !== gid) continue;
        const targetId = rel.target.id;
        if (candidateIds.has(targetId)) {
          surviving.delete(targetId);
          priorityRemoved.add(targetId);
          this.addResolution(resolutions, targetId, {
            kind: ResolutionKind.DEPRIORITIZED,
            description: `Deprioritized by ${gid}`,
            relationshipId: rel.id,
            counterparts: [{ entityType: 'guideline' as const, entityId: gid }],
          });
        }
      }
    }
    return surviving;
  }
  private applyNumericalPriority(
    candidateIds: Set<string>,
    guidelinesById: Map<string, GuidelineStub>,
    resolutions: Map<string, Resolution[]>,
    priorityRemoved: Set<string>,
    entailedIds: Set<string>,
  ): Set<string> {
    if (candidateIds.size === 0) return candidateIds;
    const nonEntailed = [...candidateIds].filter((id) => !entailedIds.has(id));
    const entailed = [...candidateIds].filter((id) => entailedIds.has(id));
    if (nonEntailed.length === 0) return new Set(entailed);
    const priorities = nonEntailed.map((id) => guidelinesById.get(id)?.priority ?? 0);
    const maxPriority = Math.max(...priorities);
    const surviving = new Set<string>();
    for (const id of nonEntailed) {
      const priority = guidelinesById.get(id)?.priority ?? 0;
      if (priority >= maxPriority) {
        surviving.add(id);
      } else {
        priorityRemoved.add(id);
        this.addResolution(resolutions, id, {
          kind: ResolutionKind.DEPRIORITIZED,
          description: `Lower priority (${priority} < ${maxPriority})`,
        });
      }
    }
    for (const id of entailed) {
      surviving.add(id);
    }
    return surviving;
  }
  private async applyEntailment(
    candidateIds: Set<string>,
    guidelinesById: Map<string, GuidelineStub>,
    resolutions: Map<string, Resolution[]>,
    priorityRemoved: Set<string>,
    entailedIds: Set<string>,
  ): Promise<Set<string>> {
    const result = new Set(candidateIds);
    const cache = new Map<string, Relationship[]>();
    for (const gid of candidateIds) {
      if (priorityRemoved.has(gid)) continue;
      const allRels = await this.getAllRelationships(cache, gid);
      const entailRels = allRels.filter((r) => r.kind === RelationshipKind.ENTAILS);
      for (const rel of entailRels) {
        const targetId = rel.target.id;
        if (!guidelinesById.has(targetId)) continue;
        if (priorityRemoved.has(targetId)) continue;
        if (entailedIds.has(targetId)) continue;
        result.add(targetId);
        entailedIds.add(targetId);
        this.addResolution(resolutions, targetId, {
          kind: ResolutionKind.ENTAILED,
          description: `Entailed by ${gid}`,
          relationshipId: rel.id,
          counterparts: [{ entityType: 'guideline' as const, entityId: gid }],
        });
      }
    }
    return result;
  }
  // ── Cache helpers ──
  private async getRelationshipsFromCache(
    cache: Map<string, Relationship[]>,
    gid: string,
    kind: RelationshipKind,
  ): Promise<Relationship[]> {
    const key = `${kind}:${gid}`;
    if (!cache.has(key)) {
      cache.set(key, await this.store.listRelationships(kind, gid));
    }
    return cache.get(key)!;
  }
  private async getAllRelationships(
    cache: Map<string, Relationship[]>,
    gid: string,
  ): Promise<Relationship[]> {
    const result: Relationship[] = [];
    const kinds = Object.values(RelationshipKind) as RelationshipKind[];
    for (const kind of kinds) {
      const rels = await this.getRelationshipsFromCache(cache, gid, kind);
      const targetRels = await this.getRelationshipsFromCache(cache, `target:${gid}`, kind);
      result.push(...rels, ...targetRels);
    }
    return result;
  }
  private addResolution(
    resolutions: Map<string, Resolution[]>,
    id: string,
    resolution: Resolution,
  ): void {
    if (!resolutions.has(id)) resolutions.set(id, []);
    resolutions.get(id)!.push(resolution);
  }
  private setsEqual(a: Set<string>, b: Set<string>): boolean {
    if (a.size !== b.size) return false;
    for (const item of a) if (!b.has(item)) return false;
    return true;
  }
 }
--- a/apps/coder/src/services/collision-detector.ts
+++ b/apps/coder/src/services/collision-detector.ts
@@ -0,0 +1,115 @@
 // v2.8 Collision detection — pure functions that find file overlaps between
 // worktrees/agents editing the same files concurrently. Advisory only; writes
 // are never blocked, but the collision info surfaces in the UI and logs.
 //
 // Severity levels:
 //   same_line     — the same file, exact same line region
 //   adjacent_line — the same file, lines touch or are within 5 lines
 //   different_area — the same file, distant lines
 //
 // Pure functions, no side effects. Testable in isolation.
 export type ConflictSeverity = 'same_line' | 'adjacent_line' | 'different_area';
 export interface ConflictVerdict {
  filePath: string;
  worktrees: string[];
  severity: ConflictSeverity;
  agents: string[];
 }
 /**
 * Registry entry for a single file change recorded by a worktree.
 * Stored in the ConflictIndex Map value for each file path.
 */
 export interface ConflictEntry {
  worktreeId: string;
  agent: string;
  /**
   * Approximate line range touched by the change. undefined when the change
   * creates or deletes the file (full-file collision vs. same-line).
   */
  lineRange?: { start: number; end: number };
  status: 'pending' | 'applied' | 'reverted';
  timestamp: number;
 }
 /**
 * Shape of the conflict index consumed by findConflicts.
 * File path → set of entries from different worktrees/agents.
 */
 export type ConflictIndexData = ReadonlyMap<string, ReadonlySet<ConflictEntry>>;
 /**
 * Find file overlaps between `changedFiles` and the conflict index, excluding
 * the caller's own worktree.
 *
 * Returns one ConflictVerdict per file that has entries from other worktrees.
 * Severity is the highest found (same_line > adjacent_line > different_area).
 */
 export function findConflicts(
  changedFiles: string[],
  worktreeId: string,
  /** Approximate line range for the proposed changes, keyed by file path */
  changedRanges: Map<string, { start: number; end: number }>,
  conflictIndex: ConflictIndexData,
 ): ConflictVerdict[] {
  const verdicts: ConflictVerdict[] = [];
  for (const filePath of changedFiles) {
    const entries = conflictIndex.get(filePath);
    if (!entries || entries.size === 0) continue;
    // Filter to entries from OTHER worktrees
    const otherEntries = [...entries].filter((e) => e.worktreeId !== worktreeId);
    if (otherEntries.length === 0) continue;
    const myRange = changedRanges.get(filePath);
    let severity: ConflictSeverity = 'different_area';
    for (const entry of otherEntries) {
      if (!myRange || !entry.lineRange) {
        // No line range on one side (e.g. file create/delete): no line-level
        // comparison is possible, so this entry leaves the severity unchanged.
        continue;
      }
      const sev = lineOverlapSeverity(myRange, entry.lineRange);
      if (sev === 'same_line') {
        severity = 'same_line';
        break; // Can't get higher than this
      }
      if (sev === 'adjacent_line' && severity === 'different_area') {
        severity = 'adjacent_line';
      }
    }
    const worktrees = [...new Set(otherEntries.map((e) => e.worktreeId))];
    const agents = [...new Set(otherEntries.map((e) => e.agent))];
    verdicts.push({ filePath, worktrees, severity, agents });
  }
  return verdicts;
 }
 const ADJACENT_LINE_THRESHOLD = 5;
 /**
 * Determine severity of overlap between two line ranges.
 */
 function lineOverlapSeverity(
  a: { start: number; end: number },
  b: { start: number; end: number },
 ): ConflictSeverity {
  // same_line: the inclusive ranges intersect
  if (a.start <= b.end && b.start <= a.end) {
    return 'same_line';
  }
  // Adjacent: ranges are within ADJACENT_LINE_THRESHOLD lines of each other
  const gap = a.start > b.end ? a.start - b.end : b.start - a.end;
  if (gap <= ADJACENT_LINE_THRESHOLD) {
    return 'adjacent_line';
  }
  return 'different_area';
 }
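Reviewer note: a minimal standalone sketch of the range classification above, useful for checking the boundary cases by hand. `LineRange`, `Severity`, `THRESHOLD`, and `classify` are hypothetical names for illustration and are not part of this PR; the threshold is assumed to match `ADJACENT_LINE_THRESHOLD`.

```typescript
// Illustrative standalone copy; names here are hypothetical, not part of the PR.
type LineRange = { start: number; end: number };
type Severity = 'same_line' | 'adjacent_line' | 'different_area';

const THRESHOLD = 5; // assumed to match ADJACENT_LINE_THRESHOLD

function classify(a: LineRange, b: LineRange): Severity {
  // Inclusive ranges intersect when each starts no later than the other ends.
  if (a.start <= b.end && b.start <= a.end) return 'same_line';
  // Otherwise measure the distance between the nearer edges.
  const gap = a.start > b.end ? a.start - b.end : b.start - a.end;
  return gap <= THRESHOLD ? 'adjacent_line' : 'different_area';
}

const mine: LineRange = { start: 10, end: 20 };
const overlapping = classify(mine, { start: 15, end: 25 }); // ranges share lines 15-20
const nearby = classify(mine, { start: 25, end: 30 });      // gap of 5
const distant = classify(mine, { start: 26, end: 30 });     // gap of 6
console.log(overlapping, nearby, distant);
```

With a threshold of 5, a gap of exactly 5 lines still counts as adjacent; a gap of 6 falls into `different_area`.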
--- a/apps/coder/src/services/conflict-index.ts
+++ b/apps/coder/src/services/conflict-index.ts
@@ -0,0 +1,151 @@
 // v2.8 In-memory conflict index — tracks which worktrees/agents are editing
 // which files so the collision detector can find overlaps.
 //
 // Singleton exported as `conflictIndex`; imported by pending_changes.ts to
 // register changes at queue time and unregister on worktree teardown.
 //
 // NOT persisted — survives only as long as the BooCoder process. Postgres
 // is the durable record (pending_changes table); this is the hot in-memory
 // probe for concurrent edit warnings.
 import type { ConflictEntry, ConflictVerdict } from './collision-detector.js';
 import { findConflicts } from './collision-detector.js';
 export class ConflictIndex {
  /**
   * filePath → Set of ConflictEntry from various worktrees.
   * A single worktree may have multiple entries for the same file
   * (several pending edits to the same file in one session).
   */
  #map = new Map<string, Set<ConflictEntry>>();
  // ---- mutation -------------------------------------------------------
  /**
   * Register that `worktreeId` (agent) is touching `filePath`.
   * Creates an entry in the index so subsequent callers see it as a conflict.
   */
  registerChange(
    filePath: string,
    worktreeId: string,
    agent: string,
    lineRange?: { start: number; end: number },
  ): void {
    let entries = this.#map.get(filePath);
    if (!entries) {
      entries = new Set();
      this.#map.set(filePath, entries);
    }
    entries.add({
      worktreeId,
      agent,
      lineRange,
      status: 'pending' as const,
      timestamp: Date.now(),
    });
  }
  /**
   * Remove all entries for a given worktree. Called on worktree teardown
   * so stale entries don't trigger false warnings.
   */
  removeWorktree(worktreeId: string): void {
    for (const [filePath, entries] of this.#map) {
      for (const entry of entries) {
        if (entry.worktreeId === worktreeId) {
          entries.delete(entry);
        }
      }
      if (entries.size === 0) {
        this.#map.delete(filePath);
      }
    }
  }
  /**
   * Remove entries older than `maxAgeMs`. Useful as a periodic cleanup
   * when worktree teardown was missed (crash, unclean exit).
   */
  sweepStale(maxAgeMs: number): number {
    const cutoff = Date.now() - maxAgeMs;
    let removed = 0;
    for (const [filePath, entries] of this.#map) {
      for (const entry of entries) {
        if (entry.timestamp < cutoff) {
          entries.delete(entry);
          removed++;
        }
      }
      if (entries.size === 0) {
        this.#map.delete(filePath);
      }
    }
    return removed;
  }
  // ---- query ----------------------------------------------------------
  /**
   * Query the raw ConflictEntry set for a file path. Returns an empty set
   * when no worktree has registered a change to the file.
   */
  getEntriesFor(filePath: string): ReadonlySet<ConflictEntry> {
    return this.#map.get(filePath) ?? new Set();
  }
  /**
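   * Get all conflict verdicts for a given file path, i.e. which other
   * worktrees are touching it. Returns empty when only one worktree
   * has entries (no actual conflict). No line ranges are passed to
   * findConflicts, so the reported severity is always 'different_area',
   * and the first-registered worktree acts as the caller and is omitted
   * from the verdict's `worktrees` list.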
   */
  getConflictsFor(filePath: string): ConflictVerdict[] {
    const entries = this.#map.get(filePath);
    if (!entries || entries.size === 0) return [];
    // Determine distinct worktree IDs. If only one, no conflict.
    const worktreeIds = new Set<string>();
    for (const e of entries) worktreeIds.add(e.worktreeId);
    if (worktreeIds.size <= 1) return [];
    // Use the first worktree as the "caller" so findConflicts excludes
    // its entries and returns only entries from OTHER worktrees.
    const caller = [...worktreeIds][0]!;
    return findConflicts(
      [filePath],
      caller,
      new Map(),
      this.#toIndexData(),
    );
  }
  /**
   * Get conflicts for a set of file changes from a specific worktree.
   * Delegates to the pure findConflicts function.
   */
  query(
    changedFiles: string[],
    worktreeId: string,
    changedRanges: Map<string, { start: number; end: number }>,
  ): ConflictVerdict[] {
    return findConflicts(changedFiles, worktreeId, changedRanges, this.#toIndexData());
  }
  /**
   * Snapshot the current map for testing/inspection.
   */
  snapshot(): Map<string, ReadonlySet<ConflictEntry>> {
    return new Map(this.#map);
  }
  // ---- private --------------------------------------------------------
  #toIndexData(): ReadonlyMap<string, ReadonlySet<ConflictEntry>> {
    return this.#map as ReadonlyMap<string, ReadonlySet<ConflictEntry>>;
  }
 }
 // Singleton — the whole BooCoder process shares one conflict index.
 export const conflictIndex = new ConflictIndex();
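Reviewer note: a rough sketch of the register / query / teardown lifecycle described above, written as a self-contained toy rather than against the real class. `MiniIndex`, `Entry`, `othersTouching`, and the worktree and agent ids are hypothetical; the real `ConflictIndex` also tracks line ranges, status, and timestamps.

```typescript
// Hypothetical toy index; it only models which worktrees touch which files.
interface Entry { worktreeId: string; agent: string }

class MiniIndex {
  private byFile = new Map<string, Entry[]>();

  // Record that a worktree (driven by an agent) is touching a file.
  register(filePath: string, worktreeId: string, agent: string): void {
    const list = this.byFile.get(filePath) ?? [];
    list.push({ worktreeId, agent });
    this.byFile.set(filePath, list);
  }

  // Distinct worktrees other than `self` that have touched `filePath`.
  othersTouching(filePath: string, self: string): string[] {
    const ids = (this.byFile.get(filePath) ?? [])
      .filter((e) => e.worktreeId !== self)
      .map((e) => e.worktreeId);
    return Array.from(new Set(ids));
  }

  // Drop every entry for a worktree, pruning files left with no entries.
  removeWorktree(worktreeId: string): void {
    for (const filePath of Array.from(this.byFile.keys())) {
      const kept = (this.byFile.get(filePath) ?? []).filter((e) => e.worktreeId !== worktreeId);
      if (kept.length === 0) this.byFile.delete(filePath);
      else this.byFile.set(filePath, kept);
    }
  }
}

const index = new MiniIndex();
index.register('src/app.ts', 'wt-a', 'agent-1');
index.register('src/app.ts', 'wt-b', 'agent-2');
const whileBothActive = index.othersTouching('src/app.ts', 'wt-a');
index.removeWorktree('wt-b');
const afterTeardown = index.othersTouching('src/app.ts', 'wt-a');
console.log(whileBothActive, afterTeardown);
```

While both worktrees are registered, `wt-a` sees `wt-b` as a collision candidate; after `wt-b` is torn down the warning disappears, which is the reason teardown must unregister.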
--- a/apps/coder/src/services/dispatcher.ts
+++ b/apps/coder/src/services/dispatcher.ts
@@ -30,6 +30,7 @@ import {
  type TerminalMessageStatus,
 } from './finalize-message.js';
 import { shouldFailOnMissingAgent } from './flow-runner-decisions.js';
 import { emitHook } from '../plugins/host.js';
 interface InferenceRunner {
  enqueue: (
@@ -123,6 +124,22 @@ export function createDispatcher(deps: Deps): {
    publishAgentStatus(broker.publishFrame, sessionId, chatId, agent, status, reason);
  }
  // EmitHook: fire-and-forget turn.end notification. Best-effort — a hook throwing
  // is silently swallowed so it never blocks the dispatch flow.
  function emitTurnEnd(
    sessionId: string,
    taskId: string,
    state: string,
    agent?: string | null,
    model?: string | null,
    outputSummary?: string,
  ): void {
    void emitHook('turn.end', {
      sessionId,
      turnSummary: { taskId, state, agent, model: model ?? undefined, outputSummary },
    });
  }
  // F1 (OCE-001/OCE-002): finalize a streaming assistant message into a terminal
  // state and publish the matching message_complete frame. Best-effort + idempotent
  // (the helper's `WHERE status='streaming'` guard) — a failure here must never mask
@@ -318,6 +335,7 @@ export function createDispatcher(deps: Deps): {
    // Declared before try so the catch block can write it back on the task row.
    let chatId: string | null = null;
    let sessionId: string | undefined;
    try {
      // Mark running
@@ -330,7 +348,6 @@ export function createDispatcher(deps: Deps): {
      // Session setup: reuse a pre-created session (e.g. Q&A arena contestants
      // whose persona is stamped on the session via agent_id) or create a fresh one.
      const model = task.model ?? config.DEFAULT_MODEL;
-      let sessionId: string;
      if (task.session_id) {
        sessionId = task.session_id;
      } else {
@@ -377,6 +394,7 @@ export function createDispatcher(deps: Deps): {
          SET state = 'cancelled', ended_at = clock_timestamp()
          WHERE id = ${taskId}
        `;
        if (sessionId) emitTurnEnd(sessionId, taskId, 'cancelled', null, task.model);
        return;
      }
@@ -399,6 +417,7 @@ export function createDispatcher(deps: Deps): {
          WHERE id = ${taskId}
        `;
        log.info({ taskId, costTokens }, 'dispatcher: task completed (native)');
        emitTurnEnd(sessionId, taskId, 'completed', null, task.model, summary);
      } else {
        const [msg] = await sql<{ content: string | null }[]>`
          SELECT content FROM messages WHERE id = ${assistantId}
@@ -410,6 +429,7 @@ export function createDispatcher(deps: Deps): {
          WHERE id = ${taskId}
        `;
        log.warn({ taskId, finalStatus }, 'dispatcher: task failed (native)');
        emitTurnEnd(sessionId, taskId, 'failed', null, task.model, summary);
      }
    } catch (err) {
      const errMsg = err instanceof Error ? err.message : String(err);
@@ -419,6 +439,7 @@ export function createDispatcher(deps: Deps): {
        SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}, chat_id = ${chatId}
        WHERE id = ${taskId}
      `.catch(() => {});
      if (sessionId) emitTurnEnd(sessionId, taskId, 'failed', null, task.model, errMsg);
    }
  }
@@ -684,6 +705,7 @@ export function createDispatcher(deps: Deps): {
        await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
        await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
        emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
        emitTurnEnd(sessionId, taskId, 'cancelled', agent, task.model);
        await cleanupWorktree(projectPath, taskId);
        clearTaskCommands(taskId);
        return;
@@ -738,6 +760,7 @@ export function createDispatcher(deps: Deps): {
      log.info({ taskId, agent, costTokens: extCostTokens }, 'dispatcher: task completed (external)');
      // #10: external-agent turn completed cleanly.
      emitAgentStatus(sessionId, chatId, agent, 'idle', 'turn_complete');
      emitTurnEnd(sessionId, taskId, 'completed', agent, task.model, outputSummary);
      clearTaskCommands(taskId);
    } catch (err) {
@@ -762,6 +785,7 @@ export function createDispatcher(deps: Deps): {
      // preceded its assignment — guard so the status publish never masks the real
      // error.
      if (chatId) emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'failed');
      if (sessionId) emitTurnEnd(sessionId, taskId, status, agent, task.model, errMsg);
      // Best-effort cleanup
      await cleanupWorktree(projectPath, taskId);
@@ -1030,6 +1054,7 @@ export function createDispatcher(deps: Deps): {
        await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
        await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
        emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
        emitTurnEnd(sessionId, taskId, 'cancelled', agent, task.model);
        clearTaskCommands(taskId);
        return; // worktree persists (no cleanup); backend stays warm
      }
@@ -1090,6 +1115,7 @@ export function createDispatcher(deps: Deps): {
        result.ok ? 'idle' : 'error',
        result.ok ? 'turn_complete' : 'failed',
      );
      emitTurnEnd(sessionId, taskId, finalState, agent, task.model, outputSummary);
      clearTaskCommands(taskId);
    } catch (err) {
      const errMsg = err instanceof Error ? err.message : String(err);
@@ -1104,6 +1130,7 @@ export function createDispatcher(deps: Deps): {
      await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
      // #10: turn crashed.
      if (chatId) emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
      if (sessionId) emitTurnEnd(sessionId, taskId, status, agent, task.model, errMsg);
      clearTaskCommands(taskId);
      // No worktree cleanup (persistent); backend stays warm for the next turn.
    }
@@ -1308,6 +1335,7 @@ export function createDispatcher(deps: Deps): {
        await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
        await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
        emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
        emitTurnEnd(sessionId, taskId, 'cancelled', agent, task.model);
        clearTaskCommands(taskId);
        return; // worktree persists (no cleanup); backend stays warm
      }
@@ -1367,6 +1395,7 @@ export function createDispatcher(deps: Deps): {
        result.ok ? 'idle' : 'error',
        result.ok ? 'turn_complete' : 'failed',
      );
      emitTurnEnd(sessionId, taskId, finalState, agent, task.model, outputSummary);
      clearTaskCommands(taskId);
    } catch (err) {
      const errMsg = err instanceof Error ? err.message : String(err);
@@ -1381,6 +1410,7 @@ export function createDispatcher(deps: Deps): {
      await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
      // #10: turn crashed.
      emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
      emitTurnEnd(sessionId, taskId, status, agent, task.model, errMsg);
      clearTaskCommands(taskId);
      // No worktree cleanup (persistent); backend stays warm for the next turn.
    }
@@ -1576,6 +1606,7 @@ export function createDispatcher(deps: Deps): {
        await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
        await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
        emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
        emitTurnEnd(sessionId, taskId, 'cancelled', agent, task.model);
        clearTaskCommands(taskId);
        return; // worktree persists (no cleanup); backend stays warm
      }
@@ -1638,6 +1669,7 @@ export function createDispatcher(deps: Deps): {
        result.ok ? 'idle' : 'error',
        result.ok ? 'turn_complete' : 'failed',
      );
      emitTurnEnd(sessionId, taskId, finalState, agent, task.model, outputSummary);
      clearTaskCommands(taskId);
    } catch (err) {
      const errMsg = err instanceof Error ? err.message : String(err);
@@ -1652,6 +1684,7 @@ export function createDispatcher(deps: Deps): {
      await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
      // #10: turn crashed.
      emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
      emitTurnEnd(sessionId, taskId, status, agent, task.model, errMsg);
      clearTaskCommands(taskId);
      // No worktree cleanup (persistent); backend stays warm for the next turn.
    }
--- a/apps/coder/src/services/flow-runner-decisions.ts
+++ b/apps/coder/src/services/flow-runner-decisions.ts
@@ -33,11 +33,43 @@ export interface SchedulerState {
  readonly inFlight: ReadonlySet<string>;
  /** step ids pre-skipped at launch (band/when gating) — never given a row */
  readonly excluded: ReadonlySet<string>;
  /** step ids that timed out (terminal — no retries remaining or not retriable) */
  readonly timedOut: ReadonlySet<string>;
  /**
   * Per-batch running sets, populated by buildBatchState from the flow definition
   * and the current inFlight set. Only read by getReadyInBatch; never mutated by
   * decision functions (the caller maintains it across ticks).
   */
  readonly batchState?: Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>;
  /**
   * Per-switch-step routing results. Populated when a SWITCH step completes.
   * Step ids in any result's `excluded` set are treated as excluded for the
   * remainder of the run — they won't execute and won't block dependents.
   */
  readonly switchResults: ReadonlyMap<string, { chosenCase: string | null; excluded: ReadonlySet<string> }>;
 }
-/** A dependency is satisfied once it is done, skipped, or excluded. */
+/** A dependency is satisfied once it is done, skipped, excluded, or timed out. */
 function isSatisfied(state: SchedulerState, id: string): boolean {
-  return state.done.has(id) || state.skipped.has(id) || state.excluded.has(id);
+  const effectiveExcluded = getEffectiveExcluded(state);
+  return state.done.has(id) || state.skipped.has(id) || effectiveExcluded.has(id) || state.timedOut.has(id);
 }
 /**
 * The union of the static `excluded` set and every switch result's excluded
 * step ids. Steps excluded by a SWITCH evaluation act exactly like launch-time
 * excluded steps: they never run and they don't block dependents.
 */
 function getEffectiveExcluded(state: SchedulerState): ReadonlySet<string> {
  // Fast path: no switch results → static excluded only.
  if (state.switchResults.size === 0) return state.excluded;
  const combined = new Set(state.excluded);
  for (const result of state.switchResults.values()) {
    for (const id of result.excluded) {
      combined.add(id);
    }
  }
  return combined;
 }
 /**
@@ -56,13 +88,14 @@ export function manifestSteps(flow: Flow, launchCtx: StepContext): Step[] {
 * Faithful to `conductor/flow.ts:27-36`. Pure.
 */
 export function readySteps(flow: Flow, state: SchedulerState): Step[] {
  const effectiveExcluded = getEffectiveExcluded(state);
  return flow.steps.filter(
    (s) =>
      !state.done.has(s.id) &&
      !state.skipped.has(s.id) &&
      !state.inFlight.has(s.id) &&
-      !state.excluded.has(s.id) &&
+      !effectiveExcluded.has(s.id) &&
-      ((s.deps ?? []).length === 0 || evaluateTriggerRule(s.deps ?? [], state.done, state.skipped, state.excluded, s.trigger_rule)),
+      ((s.deps ?? []).length === 0 || evaluateTriggerRule(s.deps ?? [], state.done, state.skipped, effectiveExcluded, s.trigger_rule)),
  );
 }
@@ -102,6 +135,57 @@ export function isStuck(flow: Flow, state: SchedulerState): boolean {
  );
 }
 // ─── Batch parallelism (v2.8.22) ─────────────────────────────────────────────
 /**
 * Build the batchState Map from the flow definition and the current inFlight set.
 * Only steps with a `batch` field are tracked. Empty map when `flow.batchConfig`
 * is absent or no steps belong to a batch. Pure — no IO.
 */
 export function buildBatchState(
  flow: Flow,
  inFlight: ReadonlySet<string>,
 ): Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }> {
  const result = new Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>();
  if (!flow.batchConfig) return result;
  // Collect every unique batch group referenced by the flow's steps.
  const groups = new Set<string>();
  for (const s of flow.steps) {
    if (s.batch) groups.add(s.batch);
  }
  const { maxConcurrent, joinRule } = flow.batchConfig;
  for (const batch of groups) {
    const running = new Set<string>(
      flow.steps.filter((s) => s.batch === batch && inFlight.has(s.id)).map((s) => s.id),
    );
    result.set(batch, { running, maxConcurrent, joinRule: joinRule ?? 'all_success' });
  }
  return result;
 }
 /**
 * Gate a ready step list by batch parallelism limits. Steps without a `batch`
 * field always pass through. Steps belonging to a batch are only included if
 * that batch's currently-running count is below its `maxConcurrent` cap.
 *
 * This is ADDITIVE to the existing wave scheduler: pure dep-based readiness
 * is computed first (readySteps), then this function applies the batch ceiling.
 * Steps excluded here remain pending and will be picked up on the next tick
 * when a running batch step completes.
 */
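 export function getReadyInBatch(ready: readonly Step[], state: SchedulerState, _flow: Flow): Step[] {
  const batchState = state.batchState;
  if (!batchState || batchState.size === 0) return [...ready];
  // Steps admitted during this call, per batch, so several ready steps from the
  // same batch cannot jointly exceed the cap within a single tick.
  const admitted = new Map<string, number>();
  return ready.filter((s) => {
    if (!s.batch) return true;
    const bs = batchState.get(s.batch);
    if (!bs) return true;
    const taken = admitted.get(s.batch) ?? 0;
    if (bs.running.size + taken >= bs.maxConcurrent) return false;
    admitted.set(s.batch, taken + 1);
    return true;
  });
 }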
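Reviewer note: a minimal sketch of batch gating with a hard per-tick ceiling, assuming the cap should hold even when several steps of one batch become ready in the same tick. `MiniStep`, `BatchSlot`, `gateByBatch`, and the step ids are hypothetical and only model the fields the gate needs.

```typescript
// Hypothetical minimal shapes; the real Step and batch state carry more fields.
interface MiniStep { id: string; batch?: string }
interface BatchSlot { running: Set<string>; maxConcurrent: number }

function gateByBatch(ready: MiniStep[], batches: Map<string, BatchSlot>): MiniStep[] {
  const admitted = new Map<string, number>(); // steps let through in this call
  return ready.filter((s) => {
    if (!s.batch) return true; // unbatched steps always pass
    const slot = batches.get(s.batch);
    if (!slot) return true; // unknown batch: no limit known, so pass
    const taken = admitted.get(s.batch) ?? 0;
    if (slot.running.size + taken >= slot.maxConcurrent) return false;
    admitted.set(s.batch, taken + 1);
    return true;
  });
}

const batches = new Map<string, BatchSlot>([
  ['lint', { running: new Set(['lint-1']), maxConcurrent: 2 }],
]);
const ready: MiniStep[] = [
  { id: 'lint-2', batch: 'lint' },
  { id: 'lint-3', batch: 'lint' },
  { id: 'report' },
];
const launch = gateByBatch(ready, batches).map((s) => s.id);
console.log(launch);
```

With one `lint` step already running and a cap of 2, only `lint-2` is admitted; `lint-3` stays pending for a later tick and the unbatched `report` step passes through.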
 // ─── Resume reconciliation (D-9) ─────────────────────────────────────────────
 /**
@@ -118,12 +202,29 @@ export function isStuck(flow: Flow, state: SchedulerState): boolean {
 * - 'mark-cancelled': task was cancelled before the callback ran; propagate so
 *                     advance() cancels the run.
 */
 /**
 * True when the step definition allows retries on timeout.
 * Pure — no IO.
 */
 export function isRetriable(step: { maxRetries?: number }): boolean {
  return (step.maxRetries ?? 0) > 0;
 }
 /**
 * True when the step has retries remaining.
 * Pure — no IO.
 */
 export function shouldRetry(maxRetries: number | undefined | null, retryCount: number): boolean {
  return retryCount < (maxRetries ?? 0);
 }
 export type ResumeAction =
  | 'keep'
  | 're-dispatch'
  | 'mark-done'
  | 'mark-failed'
-  | 'mark-cancelled';
+  | 'mark-cancelled'
+  | 'retry';
 /**
 * Decide what to do with ONE flow step during startup resume (D-9). Pure.
@@ -131,12 +232,20 @@ export type ResumeAction =
 * @param status     - flow_steps.status
 * @param taskId     - flow_steps.task_id (null for code steps or unstarted agent steps)
 * @param taskState  - tasks.state for taskId, or null if the task row is absent
 * @param retryCount - flow_steps.retry_count (default 0)
 * @param maxRetries - flow_steps.max_retries (null = no retry)
 */
 export function reconcileResumeStep(
  status: string,
  taskId: string | null,
  taskState: string | null,
  retryCount?: number,
  maxRetries?: number | null,
 ): ResumeAction {
  if (status === 'timed_out') {
    if (shouldRetry(maxRetries, retryCount ?? 0)) return 'retry';
    return 'mark-failed';
  }
  if (status !== 'running') return 'keep';
  // Running step: decide by its task's current state.
  if (!taskId || taskState === null) return 're-dispatch'; // task gone or never created
@@ -167,6 +276,60 @@ export function shouldFailOnMissingAgent(agent: string, modeId: string | null):
  return agent === 'qwen' && modeId === 'plan';
 }
 /**
 * Evaluate a SWITCH step: iterate cases in declaration order and return the
 * label of the first matching case plus every step id that belongs to a
 * non-selected branch. When no case matches, the defaultBranch (if present)
 * is the effective choice and only the explicit case branches are excluded.
 * In both no-match outcomes the switch returns `chosenCase: null`; without
 * a default, every branch step is excluded.
 *
 * Pure — no IO. The caller adds the returned `excluded` ids to the scheduler
 * state's switchResults so downstream decision functions see them as excluded.
 */
 export function resolveSwitch(
  step: Step,
  ctx: StepContext,
 ): { chosenCase: string | null; excluded: string[] } {
  const cases = step.cases;
  if (!cases || cases.length === 0) {
    // Degenerate switch — nothing to evaluate.
    return { chosenCase: null, excluded: [] };
  }
  // Evaluate conditions in order.
  for (const c of cases) {
    if (c.condition(ctx)) {
      // This case matches — exclude all OTHER branches.
      const excluded: string[] = [];
      for (const other of cases) {
        if (other.label !== c.label) {
          excluded.push(...other.stepIds);
        }
      }
      // The default branch is also excluded when a case matched.
      if (step.defaultBranch) excluded.push(...step.defaultBranch);
      return { chosenCase: c.label, excluded };
    }
  }
  // No case matched. With or without a default branch, every explicit case
  // branch is excluded; a default branch, when present, stays eligible to run.
  const excluded: string[] = [];
  for (const c of cases) {
    excluded.push(...c.stepIds);
  }
  return { chosenCase: null, excluded };
 }
 /**
 * Evaluate a trigger rule against dependency results.
 * - all_success: every dep must be done (not skipped/failed)
@@ -198,7 +361,7 @@ export function evaluateTriggerRule(
 * decision per step. Pure — no IO.
 */
 export function reconcileRun(
-  steps: ReadonlyArray<{ stepId: string; taskId: string | null; status: string }>,
+  steps: ReadonlyArray<{ stepId: string; taskId: string | null; status: string; retryCount?: number; maxRetries?: number | null }>,
  taskStates: ReadonlyMap<string, string>,
 ): StepResumeDecision[] {
  return steps.map((step) => ({
@@ -207,6 +370,8 @@ export function reconcileRun(
      step.status,
      step.taskId,
      step.taskId ? (taskStates.get(step.taskId) ?? null) : null,
      step.retryCount,
      step.maxRetries,
    ),
  }));
 }
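Reviewer note: a small sketch of the SWITCH routing rule documented on `resolveSwitch`, assuming a context that exposes prior step results by id. `Ctx`, `SwitchCase`, `SwitchStep`, `route`, and the step ids are hypothetical; the real `Step` and `StepContext` types are not shown in this diff.

```typescript
// Hypothetical shapes for illustration; not the PR's Step/StepContext types.
interface Ctx { results: Record<string, string> }
interface SwitchCase { label: string; condition: (ctx: Ctx) => boolean; stepIds: string[] }
interface SwitchStep { cases: SwitchCase[]; defaultBranch?: string[] }

function route(step: SwitchStep, ctx: Ctx): { chosenCase: string | null; excluded: string[] } {
  // First case whose condition holds wins (declaration order).
  const match = step.cases.find((c) => c.condition(ctx));
  const excluded: string[] = [];
  for (const c of step.cases) {
    if (c !== match) excluded.push(...c.stepIds);
  }
  // The default branch only runs when no case matched.
  if (match && step.defaultBranch) excluded.push(...step.defaultBranch);
  return { chosenCase: match ? match.label : null, excluded };
}

const triage: SwitchStep = {
  cases: [
    { label: 'bugfix', condition: (c) => c.results['classify'] === 'bug', stepIds: ['patch'] },
    { label: 'feature', condition: (c) => c.results['classify'] === 'feature', stepIds: ['design', 'build'] },
  ],
  defaultBranch: ['ask-user'],
};

const hit = route(triage, { results: { classify: 'feature' } });
const miss = route(triage, { results: { classify: 'chore' } });
console.log(hit, miss);
```

When `feature` matches, the `patch` and `ask-user` steps are excluded; when nothing matches, `chosenCase` is null, all explicit case steps are excluded, and the default branch remains eligible.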
--- a/apps/coder/src/services/flow-runner.ts
+++ b/apps/coder/src/services/flow-runner.ts
@@ -40,11 +40,14 @@ import { getFlow } from '../conductor/flows/index.js';
 import { loadPersona } from '../conductor/persona-loader.js';
 import type { Band, DispatchFn, Flow, FlowInput, Step, StepContext } from '../conductor/types.js';
 import {
  buildBatchState,
  getReadyInBatch,
  isRunComplete,
  manifestSteps,
  partitionReady,
  readySteps,
  reconcileRun,
  resolveSwitch,
  type SchedulerState,
  type StepResumeDecision,
 } from './flow-runner-decisions.js';
@@ -89,15 +92,20 @@ interface Deps {
  broker: Broker;
  log: FastifyBaseLogger;
  config: Config;
  /** Fired when a flow run reaches a terminal state (for plan-store integration). */
  onRunTerminal?: (runId: string, status: 'completed' | 'failed' | 'cancelled') => void;
 }
 interface FlowStepRow {
  step_id: string;
-  kind: 'agent' | 'code';
+  kind: 'agent' | 'code' | 'switch';
  agent: string | null;
  status: string;
  chat_id: string | null;
  output: string | null;
  updated_at: string | null;
  retry_count: number | null;
  max_retries: number | null;
 }
 export function createFlowRunner(deps: Deps): FlowRunner {
@@ -261,7 +269,8 @@ export function createFlowRunner(deps: Deps): FlowRunner {
    const dispatch: DispatchFn = (agent, task) => dispatchSubAgent(run.project_id, model, agent, task);
    const rows = await sql<FlowStepRow[]>`
-      SELECT step_id, kind, agent, status, chat_id, output FROM flow_steps WHERE run_id = ${runId}
+      SELECT step_id, kind, agent, status, chat_id, output, updated_at, retry_count, max_retries
+      FROM flow_steps WHERE run_id = ${runId}
    `;
    // Re-derive the excluded set (band/when pre-skips) from the flow def + input —
@@ -273,6 +282,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
    const done = new Set<string>();
    const skipped = new Set<string>();
    const inFlight = new Set<string>();
    const timedOut = new Set<string>();
    /** Per-switch routing results — maps switch step id → resolved branch details */
    const switchExcluded = new Map<string, { chosenCase: string | null; excluded: Set<string> }>();
    const results: Record<string, string> = {};
    for (const r of rows) {
      switch (r.status) {
@@ -286,6 +298,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
        case 'running':
          inFlight.add(r.step_id);
          break;
        case 'timed_out':
          timedOut.add(r.step_id);
          break;
        case 'failed':
          // A failed worker makes the deterministic report untrustworthy — fail the
          // whole run (matches the Phase-1 CLI, which throws on a dispatch failure).
@@ -298,17 +313,79 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      }
    }
    // ─── Timeout detection ───────────────────────────────────────────────────────
    // Check running steps. If a step has been 'running' longer than
    // FLOW_STEP_TIMEOUT_MS, mark it timed_out or re-dispatch if retriable.
    // Build a context here so the timeout retry path can re-dispatch the step.
    const timeoutCtx = buildCtx(input, results, model, dispatch);
    const timeoutMs = config.FLOW_STEP_TIMEOUT_MS;
    const nowDate = new Date();
    let detectedTimedOut = false;
    for (const r of rows) {
      if (r.status !== 'running') continue;
      if (!r.updated_at) continue;
      const elapsed = nowDate.getTime() - new Date(r.updated_at).getTime();
      if (elapsed <= timeoutMs) continue;
      // Step has exceeded the timeout
      detectedTimedOut = true;
      const retryCount = r.retry_count ?? 0;
      const maxRetries = r.max_retries ?? 0;
      if (maxRetries > 0 && retryCount < maxRetries) {
        // Retriable: re-dispatch the step with an incremented retry_count
        const step = flow.steps.find((s) => s.id === r.step_id);
        if (!step || step.kind !== 'agent') {
          // Non-agent steps can't be retried via dispatch
          inFlight.delete(r.step_id);
          await failRun(runId, flow, input, model,
            `step '${r.step_id}' timed out (non-retriable kind)`, r.step_id);
          return;
        }
        inFlight.delete(r.step_id);
        await sql`
          UPDATE flow_steps
          SET retry_count = ${retryCount + 1}, updated_at = clock_timestamp()
          WHERE run_id = ${runId} AND step_id = ${r.step_id} AND status = 'running'
        `;
        await dispatchAgentStep(runId, run.project_id, model, step, timeoutCtx);
        inFlight.add(r.step_id);
        log.warn({ runId, stepId: r.step_id, retry: retryCount + 1, maxRetries },
          'flow-runner: step timed out, retrying');
      } else {
        // Not retriable — mark as timed_out, fail the run
        inFlight.delete(r.step_id);
        await sql`
          UPDATE flow_steps SET status = 'timed_out', updated_at = clock_timestamp()
          WHERE run_id = ${runId} AND step_id = ${r.step_id} AND status = 'running'
        `;
        timedOut.add(r.step_id);
        publishStep(runId, r.step_id, 'timed_out');
        await failRun(runId, flow, input, model,
          `step '${r.step_id}' timed out`, r.step_id);
        return;
      }
    }
    // The timeout pass above mutates inFlight/timedOut in place, so the in-memory
    // state sets already reflect the DB updates and no re-query is needed.
    if (detectedTimedOut) {
      // Intentionally empty: kept as a marker for where a re-query would go
      // if the in-place adjustments ever stop being sufficient.
    }
    // Drain ready skips + code steps (synchronous), re-evaluating after each batch,
    // then dispatch the full ready agent wave and wait for their terminal callbacks.
    for (;;) {
-      const state: SchedulerState = { done, skipped, inFlight, excluded };
+      // Build per-batch state from the current inFlight set for batch parallelism gating.
      const batchState = buildBatchState(flow, inFlight);
      const state: SchedulerState = { done, skipped, inFlight, excluded, timedOut, batchState, switchResults: switchExcluded };
      if (isRunComplete(flow, state)) {
        await finishRun(runId, flow, input, results, model, dispatch);
        return;
      }
-      const ready = readySteps(flow, state);
+      const ready = getReadyInBatch(readySteps(flow, state), state, flow);
      if (ready.length === 0) {
        if (inFlight.size > 0) return; // agents in flight will re-enter via the hook
        await failRun(runId, flow, input, model, 'unsatisfiable dependencies / cycle');
@@ -327,6 +404,31 @@ export function createFlowRunner(deps: Deps): FlowRunner {
        continue; // re-evaluate — a skip can settle a fan-in step's deps
      }
      // SWITCH steps run synchronously — evaluate conditions, update the excluded
      // set in SchedulerState, and mark themselves complete. Non-selected branch
      // step ids are excluded from ever running.
      const switchReady = toRun.filter((s) => s.kind === 'switch');
      if (switchReady.length > 0) {
        for (const s of switchReady) {
          let result: { chosenCase: string | null; excluded: string[] };
          try {
            result = resolveSwitch(s, buildCtx(input, results, model, dispatch));
          } catch (err) {
            await failRun(runId, flow, input, model, `switch step '${s.id}' threw: ${errMsg(err)}`, s.id);
            return;
          }
          switchExcluded.set(s.id, {
            chosenCase: result.chosenCase,
            excluded: new Set(result.excluded),
          });
          const outputText = result.chosenCase ? `branch:${result.chosenCase}` : '';
          await markStep(runId, s.id, 'completed', outputText);
          results[s.id] = outputText;
          done.add(s.id);
        }
        continue; // re-evaluate — excluded steps may unblock dependents
      }
      const codeReady = toRun.filter((s) => s.kind === 'code');
      if (codeReady.length > 0) {
        for (const s of codeReady) {
@@ -479,6 +581,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      WHERE id = ${runId} AND status = 'running'
    `;
    if (updated.count === 0) return; // already terminal (e.g. cancelled) — don't publish
    deps.onRunTerminal?.(runId, 'completed');
    publishStep(runId, lastAgentStepId(flow, input, model), 'completed', {
      run_status: 'completed',
      report,
@@ -498,6 +601,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      WHERE id = ${runId} AND status = 'running'
    `;
    if (updated.count === 0) return;
    deps.onRunTerminal?.(runId, 'failed');
    const stepId = failedStepId ?? (flow ? lastAgentStepId(flow, input, model) : 'run');
    log.warn({ runId, error }, 'flow-runner: run failed');
    await appendStepEvent(sql, runId, stepId, 'failed', { error });
@@ -512,6 +616,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      WHERE id = ${runId} AND status = 'running'
    `;
    if (updated.count === 0) return; // idempotent — already terminal
    deps.onRunTerminal?.(runId, 'cancelled');
    // Any remaining pending steps are unreachable; mark + publish them so the
    // pane can show them as cancelled rather than stuck in pending.
    const pending = await sql<{ step_id: string; kind: string }[]>`
@@ -540,7 +645,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
  function publishStep(
    runId: string,
    stepId: string,
-    status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'blocked',
+    status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'blocked' | 'timed_out',
    extra?: { run_status?: 'running' | 'completed' | 'failed' | 'cancelled'; report?: string },
  ): void {
    publishUser({
@@ -678,6 +783,38 @@ export function createFlowRunner(deps: Deps): FlowRunner {
        log.info({ runId, stepId: step.step_id, taskId: task!.id }, 'flow-runner: step re-dispatched on resume');
        break;
      }
      case 'retry': {
        // Like re-dispatch but increments retry_count and sets status to 'running'.
        if (!step.input) {
          await sql`
            UPDATE flow_steps
            SET status = 'failed', error = 'retry: no stored prompt',
                updated_at = clock_timestamp()
            WHERE run_id = ${runId} AND step_id = ${step.step_id}
          `;
          break;
        }
        const chatIdR = step.chat_id;
        const [chatR] = chatIdR
          ? await sql<{ session_id: string }[]>`SELECT session_id FROM chats WHERE id = ${chatIdR}`
          : [];
        const sessionIdR = chatR?.session_id ?? null;
        const [taskR] = await sql<{ id: string }[]>`
          INSERT INTO tasks (project_id, input, agent, model, mode_id, session_id, chat_id)
          VALUES (${projectId}, ${step.input}, 'qwen', ${model}, 'plan', ${sessionIdR}, ${chatIdR})
          RETURNING id
        `;
        await sql`
          UPDATE flow_steps
          SET task_id = ${taskR!.id}, retry_count = retry_count + 1, status = 'running',
              updated_at = clock_timestamp()
          WHERE run_id = ${runId} AND step_id = ${step.step_id}
        `;
        log.info({ runId, stepId: step.step_id, taskId: taskR!.id },
          'flow-runner: step retried on resume');
        break;
      }
    }
  }
@@ -692,7 +829,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      status: string;
      chat_id: string | null;
      input: string | null;
-    }[]>`SELECT step_id, task_id, status, chat_id, input FROM flow_steps WHERE run_id = ${run.id}`;
+      retry_count: number | null;
      max_retries: number | null;
    }[]>`SELECT step_id, task_id, status, chat_id, input, retry_count, max_retries FROM flow_steps WHERE run_id = ${run.id}`;
    // Load task states for all referenced tasks in one query.
    const taskIds = rows.map((r) => r.task_id).filter((id): id is string => id !== null);
@@ -705,7 +844,13 @@ export function createFlowRunner(deps: Deps): FlowRunner {
    }
    const decisions = reconcileRun(
-      rows.map((r) => ({ stepId: r.step_id, taskId: r.task_id, status: r.status })),
+      rows.map((r) => ({
        stepId: r.step_id,
        taskId: r.task_id,
        status: r.status,
        retryCount: r.retry_count ?? undefined,
        maxRetries: r.max_retries,
      })),
      taskStates,
    );
@@ -742,17 +887,18 @@ export function createFlowRunner(deps: Deps): FlowRunner {
      WHERE id = ${runId} AND status = 'running'
    `;
    if (updated.count === 0) return { cancelled: false, taskIds: [] };
    deps.onRunTerminal?.(runId, 'cancelled');
    // Mark all non-terminal steps cancelled and collect in-flight task_ids.
    const steps = await sql<{ step_id: string; task_id: string | null; kind: string }[]>`
      SELECT step_id, task_id, kind FROM flow_steps
-      WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped')
+      WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped', 'timed_out')
    `;
    if (steps.length > 0) {
      await sql`
        UPDATE flow_steps SET status = 'cancelled', updated_at = clock_timestamp()
-        WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped')
+        WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped', 'timed_out')
      `;
      for (const s of steps) {
        if (s.kind === 'agent') publishStep(runId, s.step_id, 'cancelled', { run_status: 'cancelled' });
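
The timeout pass in the flow-runner hunks above reduces to a three-way decision per running step. The following is a minimal sketch of that decision as a pure function, not the code in the diff. The names `StepTimeoutRow`, `TimeoutDecision` and `classifyRunningStep` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical row shape: the subset of flow_steps columns the timeout pass reads.
interface StepTimeoutRow {
  status: string;
  updated_at: string | null;
  retry_count: number | null;
  max_retries: number | null;
}

type TimeoutDecision = "within_budget" | "retry" | "timed_out";

// Follows the branch order in the diff: only 'running' rows that carry a
// timestamp are checked, and NULL retry columns are treated as 0.
function classifyRunningStep(
  row: StepTimeoutRow,
  now: Date,
  timeoutMs: number,
): TimeoutDecision {
  if (row.status !== "running" || !row.updated_at) return "within_budget";
  const elapsed = now.getTime() - new Date(row.updated_at).getTime();
  if (elapsed <= timeoutMs) return "within_budget";
  const retryCount = row.retry_count ?? 0;
  const maxRetries = row.max_retries ?? 0;
  return maxRetries > 0 && retryCount < maxRetries ? "retry" : "timed_out";
}
```

The sketch leaves out one branch the diff has: a step that would be retried but is not of kind `agent` fails the run instead of being re-dispatched.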
--- a/apps/coder/src/services/frame-emitter.ts
+++ b/apps/coder/src/services/frame-emitter.ts
@@ -19,9 +19,10 @@
 import type { Broker } from '@boocode/server/broker';
 import type { WsFrame } from '@boocode/contracts/ws-frames';
 import type { AgentEvent } from './agent-backend.js';
-import { type AcpToolSnapshot, snapshotToWireToolCall } from './acp-tool-snapshot.js';
+import { type AcpToolSnapshot, snapshotToWireToolCall, mapToolLifecycleStatus } from './acp-tool-snapshot.js';
 import { mergeTaskCommands, getTaskCommands } from './agent-commands-cache.js';
 import type { DcpStreamStripper } from './dcp-strip.js';
 import { emitHook } from '../plugins/host.js';
 export interface FrameEmitterOpts {
  broker?: Broker;
@@ -91,8 +92,29 @@ export function makeFrameEmitter(opts: FrameEmitterOpts): FrameEmitter {
        }
        break;
      case 'tool_call':
        toolSnapshots.set(e.toolCall.toolCallId, e.toolCall);
        if (canStream()) {
          broker!.publishFrame(sessionId!, {
            type: 'tool_call',
            message_id: assistantId!,
            chat_id: chatId!,
            tool_call: snapshotToWireToolCall(e.toolCall),
          } as WsFrame);
        }
        break;
      case 'tool_update':
        toolSnapshots.set(e.toolCall.toolCallId, e.toolCall);
        {
          const lifecycle = mapToolLifecycleStatus(e.toolCall.status, e.toolCall.rawOutput);
          if (lifecycle === 'completed' || lifecycle === 'failed') {
            void emitHook('tool.execute.after', {
              toolName: e.toolCall.title,
              args: e.toolCall.rawInput,
              result: e.toolCall.rawOutput,
              duration: undefined,
            });
          }
        }
        if (canStream()) {
          broker!.publishFrame(sessionId!, {
            type: 'tool_call',
--- a/apps/coder/src/services/hashline/constants.ts
+++ b/apps/coder/src/services/hashline/constants.ts
@@ -0,0 +1,10 @@
 export const NIBBLE_STR = "ZPMQVRWSNKTXJBYH"
 export const HASHLINE_DICT = Array.from({ length: 256 }, (_, i) => {
  const high = i >>> 4
  const low = i & 0x0f
  return `${NIBBLE_STR[high]}${NIBBLE_STR[low]}`
 })
 export const HASHLINE_REF_PATTERN = /^([0-9]+)#([ZPMQVRWSNKTXJBYH]{2})$/
 export const HASHLINE_OUTPUT_PATTERN = /^([0-9]+)#([ZPMQVRWSNKTXJBYH]{2})\|(.*)$/
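
As a rough illustration of how these constants fit together, the sketch below rebuilds the dictionary and splits a rendered `line#hash|content` row back into its parts. `parseOutputRow` is a hypothetical helper that does not appear in the diff; the constants are copied from it.

```typescript
const NIBBLE_STR = "ZPMQVRWSNKTXJBYH";

// One two-letter code per byte value: high-nibble letter, then low-nibble letter.
const HASHLINE_DICT = Array.from(
  { length: 256 },
  (_, i) => `${NIBBLE_STR[i >>> 4]}${NIBBLE_STR[i & 0x0f]}`,
);

const HASHLINE_OUTPUT_PATTERN = /^([0-9]+)#([ZPMQVRWSNKTXJBYH]{2})\|(.*)$/;

// Hypothetical helper: split a rendered row into line number, hash and content.
function parseOutputRow(
  row: string,
): { line: number; hash: string; content: string } | null {
  const m = row.match(HASHLINE_OUTPUT_PATTERN);
  if (!m) return null;
  return { line: Number.parseInt(m[1]!, 10), hash: m[2]!, content: m[3]! };
}
```

Byte value `0x1f` maps to `PH` (index 1 is `P`, index 15 is `H`), so a row such as `12#PH|const x = 1` parses to line 12 with hash `PH`.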
--- a/apps/coder/src/services/hashline/hash-computation.ts
+++ b/apps/coder/src/services/hashline/hash-computation.ts
@@ -0,0 +1,31 @@
 import { HASHLINE_DICT } from "./constants.js"
 import { hashXxh32 } from "./xxhash32.js"
 const RE_SIGNIFICANT = /[\p{L}\p{N}]/u
 function computeNormalizedLineHash(lineNumber: number, normalizedContent: string): string {
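  // Lines containing a letter or digit hash with seed 0 (content only); lines with
  // neither (blank, or punctuation such as a lone brace) are seeded with their line number.
  const stripped = normalizedContent
  const seed = RE_SIGNIFICANT.test(stripped) ? 0 : lineNumber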
  const hash = hashXxh32(stripped, seed)
  const index = hash % 256
  return HASHLINE_DICT[index]!
 }
 export function computeLineHash(lineNumber: number, content: string): string {
  return computeNormalizedLineHash(lineNumber, content.replace(/\r/g, "").trimEnd())
 }
 export function computeLegacyLineHash(lineNumber: number, content: string): string {
  return computeNormalizedLineHash(lineNumber, content.replace(/\r/g, "").replace(/\s+/g, ""))
 }
 export function formatHashLine(lineNumber: number, content: string): string {
  const hash = computeLineHash(lineNumber, content)
  return `${lineNumber}#${hash}|${content}`
 }
 export function formatHashLines(content: string): string {
  if (!content) return ""
  const lines = content.split("\n")
  return lines.map((line, index) => formatHashLine(index + 1, line)).join("\n")
 }
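
The seed rule in `computeNormalizedLineHash` can be looked at on its own, separately from the hash function. A minimal sketch follows, assuming the hypothetical helper names `normalizeForHash` and `seedFor`; the regular expression and the normalization are copied from the diff.

```typescript
const RE_SIGNIFICANT = /[\p{L}\p{N}]/u;

// Same normalization as computeLineHash: drop carriage returns, trim trailing whitespace.
function normalizeForHash(content: string): string {
  return content.replace(/\r/g, "").trimEnd();
}

// Hypothetical helper isolating the seed choice: 0 when the line has a letter
// or digit, otherwise the line number.
function seedFor(lineNumber: number, content: string): number {
  return RE_SIGNIFICANT.test(normalizeForHash(content)) ? 0 : lineNumber;
}
```

Under this rule a line such as `const x = 1` keeps the same hash when it moves to another line number, while a lone `}` on line 10 and a lone `}` on line 20 are hashed with different seeds.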
--- a/apps/coder/src/services/hashline/index.ts
+++ b/apps/coder/src/services/hashline/index.ts
@@ -0,0 +1,11 @@
 /**
 * Hashline editing core — content-hash anchors for edit_file stale-patch detection.
 *
 * Ported from oh-my-openagent/packages/hashline-core/.
 * Bundles a runtime-aware xxHash32 (Bun fast-path, pure-JS fallback).
 */
 export { computeLineHash, formatHashLines, formatHashLine, computeLegacyLineHash } from "./hash-computation.js"
 export { parseLineRef, validateLineRef, validateLineRefs, HashlineMismatchError, normalizeLineRef } from "./validation.js"
 export type { LineRef } from "./validation.js"
 export { NIBBLE_STR, HASHLINE_DICT, HASHLINE_REF_PATTERN, HASHLINE_OUTPUT_PATTERN } from "./constants.js"
 export type { ReplaceEdit, AppendEdit, PrependEdit, HashlineEdit } from "./types.js"
--- a/apps/coder/src/services/hashline/types.ts
+++ b/apps/coder/src/services/hashline/types.ts
@@ -0,0 +1,20 @@
 export interface ReplaceEdit {
  op: "replace"
  pos: string
  end?: string
  lines: string | string[]
 }
 export interface AppendEdit {
  op: "append"
  pos?: string
  lines: string | string[]
 }
 export interface PrependEdit {
  op: "prepend"
  pos?: string
  lines: string | string[]
 }
 export type HashlineEdit = ReplaceEdit | AppendEdit | PrependEdit
--- a/apps/coder/src/services/hashline/validation.ts
+++ b/apps/coder/src/services/hashline/validation.ts
@@ -0,0 +1,192 @@
 import { computeLegacyLineHash, computeLineHash } from "./hash-computation.js"
 import { HASHLINE_REF_PATTERN } from "./constants.js"
 export interface LineRef {
  line: number
  hash: string
 }
 interface HashMismatch {
  line: number
  expected: string
 }
 const MISMATCH_CONTEXT = 2
 const LINE_REF_EXTRACT_PATTERN = /([0-9]+#[ZPMQVRWSNKTXJBYH]{2})/
 function isCompatibleLineHash(line: number, content: string, hash: string): boolean {
  return computeLineHash(line, content) === hash || computeLegacyLineHash(line, content) === hash
 }
 export function normalizeLineRef(ref: string): string {
  const originalTrimmed = ref.trim()
  let trimmed = originalTrimmed
  trimmed = trimmed.replace(/^(?:>>>|[+-])\s*/, "")
  trimmed = trimmed.replace(/\s*#\s*/, "#")
  trimmed = trimmed.replace(/\|.*$/, "")
  trimmed = trimmed.trim()
  if (HASHLINE_REF_PATTERN.test(trimmed)) {
    return trimmed
  }
  const extracted = trimmed.match(LINE_REF_EXTRACT_PATTERN)
  if (extracted) {
    return extracted[1]!
  }
  return originalTrimmed
 }
 export function parseLineRef(ref: string): LineRef {
  const normalized = normalizeLineRef(ref)
  const match = normalized.match(HASHLINE_REF_PATTERN)
  if (match) {
    return {
      line: Number.parseInt(match[1]!, 10),
      hash: match[2]!,
    }
  }
  const hashIdx = normalized.indexOf('#')
  if (hashIdx > 0) {
    const prefix = normalized.slice(0, hashIdx)
    const suffix = normalized.slice(hashIdx + 1)
    if (!/^\d+$/.test(prefix) && /^[ZPMQVRWSNKTXJBYH]{2}$/.test(suffix)) {
      throw new Error(
        `Invalid line reference: "${ref}". "${prefix}" is not a line number. ` +
          `Use the actual line number from the read output.`
      )
    }
  }
  throw new Error(
    `Invalid line reference format: "${ref}". Expected format: "{line_number}#{hash_id}"`
  )
 }
 export function validateLineRef(lines: string[], ref: string): void {
  const { line, hash } = parseLineRefWithHint(ref, lines)
  if (line < 1 || line > lines.length) {
    throw new Error(
      `Line number ${line} out of bounds. File has ${lines.length} lines.`
    )
  }
  const content = lines[line - 1]
  if (content === undefined) {
    throw new Error(
      `Line number ${line} out of bounds. File has ${lines.length} lines.`
    )
  }
  if (!isCompatibleLineHash(line, content, hash)) {
    throw new HashlineMismatchError([{ line, expected: hash }], lines)
  }
 }
 export class HashlineMismatchError extends Error {
  readonly remaps: ReadonlyMap<string, string>
  constructor(
    private readonly mismatches: HashMismatch[],
    private readonly fileLines: string[]
  ) {
    super(HashlineMismatchError.formatMessage(mismatches, fileLines))
    this.name = "HashlineMismatchError"
    const remaps = new Map<string, string>()
    for (const mismatch of mismatches) {
      const content = fileLines[mismatch.line - 1]
      const actualLine = content ?? ""
      const actual = computeLineHash(mismatch.line, actualLine)
      remaps.set(`${mismatch.line}#${mismatch.expected}`, `${mismatch.line}#${actual}`)
    }
    this.remaps = remaps
  }
  static formatMessage(mismatches: HashMismatch[], fileLines: string[]): string {
    const mismatchByLine = new Map<number, HashMismatch>()
    for (const mismatch of mismatches) mismatchByLine.set(mismatch.line, mismatch)
    const displayLines = new Set<number>()
    for (const mismatch of mismatches) {
      const low = Math.max(1, mismatch.line - MISMATCH_CONTEXT)
      const high = Math.min(fileLines.length, mismatch.line + MISMATCH_CONTEXT)
      for (let line = low; line <= high; line++) displayLines.add(line)
    }
    const sortedLines = [...displayLines].sort((a, b) => a - b)
    const output: string[] = []
    output.push(
      `${mismatches.length} line${mismatches.length > 1 ? "s have" : " has"} changed since last read. ` +
        "Use updated {line_number}#{hash_id} references below (>>> marks changed lines)."
    )
    output.push("")
    let previousLine = -1
    for (const line of sortedLines) {
      if (previousLine !== -1 && line > previousLine + 1) {
        output.push("    ...")
      }
      previousLine = line
      const content = fileLines[line - 1] ?? ""
      const hash = computeLineHash(line, content)
      const prefix = `${line}#${hash}|${content}`
      if (mismatchByLine.has(line)) {
        output.push(`>>> ${prefix}`)
      } else {
        output.push(`    ${prefix}`)
      }
    }
    return output.join("\n")
  }
 }
 function suggestLineForHash(ref: string, lines: string[]): string | null {
  const hashMatch = ref.trim().match(/#([ZPMQVRWSNKTXJBYH]{2})$/)
  if (!hashMatch) return null
  const hash = hashMatch[1]!
  for (let i = 0; i < lines.length; i++) {
    if (isCompatibleLineHash(i + 1, lines[i] ?? "", hash)) {
      return `Did you mean "${i + 1}#${computeLineHash(i + 1, lines[i] ?? "")}"?`
    }
  }
  return null
 }
 function parseLineRefWithHint(ref: string, lines: string[]): LineRef {
  try {
    return parseLineRef(ref)
  } catch (parseError) {
    const hint = suggestLineForHash(ref, lines)
    if (hint && parseError instanceof Error) {
      throw new Error(`${parseError.message} ${hint}`)
    }
    throw parseError
  }
 }
 export function validateLineRefs(lines: string[], refs: string[]): void {
  const mismatches: HashMismatch[] = []
  for (const ref of refs) {
    const { line, hash } = parseLineRefWithHint(ref, lines)
    if (line < 1 || line > lines.length) {
      throw new Error(`Line number ${line} out of bounds (file has ${lines.length} lines)`)
    }
    const content = lines[line - 1]
    if (content === undefined) {
      throw new Error(`Line number ${line} out of bounds (file has ${lines.length} lines)`)
    }
    if (!isCompatibleLineHash(line, content, hash)) {
      mismatches.push({ line, expected: hash })
    }
  }
  if (mismatches.length > 0) {
    throw new HashlineMismatchError(mismatches, lines)
  }
 }
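
To show which inputs `normalizeLineRef` tolerates, here is a condensed, self-contained restatement of the function from the diff. The chained `replace` calls are a stylistic compression; the behaviour is intended to match, but this is a sketch and not the shipped code.

```typescript
const HASHLINE_REF_PATTERN = /^([0-9]+)#([ZPMQVRWSNKTXJBYH]{2})$/;
const LINE_REF_EXTRACT_PATTERN = /([0-9]+#[ZPMQVRWSNKTXJBYH]{2})/;

function normalizeLineRef(ref: string): string {
  const original = ref.trim();
  const cleaned = original
    .replace(/^(?:>>>|[+-])\s*/, "") // strip a leading ">>>", "+" or "-" marker
    .replace(/\s*#\s*/, "#") // collapse whitespace around the first "#"
    .replace(/\|.*$/, "") // drop "|content" copied from read output
    .trim();
  if (HASHLINE_REF_PATTERN.test(cleaned)) return cleaned;
  // Fall back to the first embedded "N#XX" token, else return the input unchanged.
  const extracted = cleaned.match(LINE_REF_EXTRACT_PATTERN);
  return extracted ? extracted[1]! : original;
}
```

With this, `>>> 12#ZP|const x = 1`, `12 # ZP` and `12#ZP` all normalize to `12#ZP`, and a string with no recognizable reference is returned as-is so that `parseLineRef` can report it in its error message.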
--- a/apps/coder/src/services/hashline/xxhash32.ts
+++ b/apps/coder/src/services/hashline/xxhash32.ts
@@ -0,0 +1,90 @@
 type BunHashRuntime = { hash: { xxHash32(data: string | Uint8Array, seed: number): number } }
 const runtime = globalThis as typeof globalThis & { Bun?: BunHashRuntime }
 const encoder = new TextEncoder()
 const PRIME32_1 = 0x9e3779b1
 const PRIME32_2 = 0x85ebca77
 const PRIME32_3 = 0xc2b2ae3d
 const PRIME32_4 = 0x27d4eb2f
 const PRIME32_5 = 0x165667b1
 function rotateLeft32(value: number, bits: number): number {
  return ((value << bits) | (value >>> (32 - bits))) >>> 0
 }
 function readUint32LittleEndian(input: Uint8Array, offset: number): number {
  return (
    ((input[offset] ?? 0) |
      ((input[offset + 1] ?? 0) << 8) |
      ((input[offset + 2] ?? 0) << 16) |
      ((input[offset + 3] ?? 0) << 24)) >>>
    0
  )
 }
 function round32(accumulator: number, value: number): number {
  const added = (accumulator + Math.imul(value, PRIME32_2)) >>> 0
  return Math.imul(rotateLeft32(added, 13), PRIME32_1) >>> 0
 }
 function xxHash32Js(input: Uint8Array, seed: number): number {
  let offset = 0
  const length = input.length
  let hash: number
  if (length >= 16) {
    const limit = length - 16
    let value1 = (seed + PRIME32_1 + PRIME32_2) >>> 0
    let value2 = (seed + PRIME32_2) >>> 0
    let value3 = seed >>> 0
    let value4 = (seed - PRIME32_1) >>> 0
    while (offset <= limit) {
      value1 = round32(value1, readUint32LittleEndian(input, offset))
      offset += 4
      value2 = round32(value2, readUint32LittleEndian(input, offset))
      offset += 4
      value3 = round32(value3, readUint32LittleEndian(input, offset))
      offset += 4
      value4 = round32(value4, readUint32LittleEndian(input, offset))
      offset += 4
    }
    hash = (rotateLeft32(value1, 1) + rotateLeft32(value2, 7)) >>> 0
    hash = (hash + rotateLeft32(value3, 12)) >>> 0
    hash = (hash + rotateLeft32(value4, 18)) >>> 0
  } else {
    hash = (seed + PRIME32_5) >>> 0
  }
  hash = (hash + length) >>> 0
  while (offset + 4 <= length) {
    hash = (hash + Math.imul(readUint32LittleEndian(input, offset), PRIME32_3)) >>> 0
    hash = Math.imul(rotateLeft32(hash, 17), PRIME32_4) >>> 0
    offset += 4
  }
  while (offset < length) {
    hash = (hash + Math.imul(input[offset] ?? 0, PRIME32_5)) >>> 0
    hash = Math.imul(rotateLeft32(hash, 11), PRIME32_1) >>> 0
    offset += 1
  }
  hash = (hash ^ (hash >>> 15)) >>> 0
  hash = Math.imul(hash, PRIME32_2) >>> 0
  hash = (hash ^ (hash >>> 13)) >>> 0
  hash = Math.imul(hash, PRIME32_3) >>> 0
  return (hash ^ (hash >>> 16)) >>> 0
 }
 export function hashXxh32(input: string, seed: number): number {
  const bun = runtime.Bun
  if (bun !== undefined) {
    return bun.hash.xxHash32(input, seed)
  }
  return xxHash32Js(encoder.encode(input), seed >>> 0)
 }
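
One way to sanity-check the pure-JS fallback is against the published xxHash32 test vector for empty input with seed 0, which is `0x02CC5D05`. The sketch below covers only that zero-length path (the `length < 16` branch with no tail bytes, followed by the final avalanche). `xxHash32Empty` is a hypothetical name; the constants are the ones in the diff.

```typescript
const PRIME32_2 = 0x85ebca77;
const PRIME32_3 = 0xc2b2ae3d;
const PRIME32_5 = 0x165667b1;

// xxHash32 restricted to zero-length input. Adding the length (0) is a no-op,
// and neither tail loop runs, so only the seed setup and avalanche remain.
function xxHash32Empty(seed: number): number {
  let hash = ((seed >>> 0) + PRIME32_5) >>> 0;
  hash = (hash ^ (hash >>> 15)) >>> 0;
  hash = Math.imul(hash, PRIME32_2) >>> 0;
  hash = (hash ^ (hash >>> 13)) >>> 0;
  hash = Math.imul(hash, PRIME32_3) >>> 0;
  return (hash ^ (hash >>> 16)) >>> 0;
}

const emptyHash = xxHash32Empty(0); // 0x02cc5d05
```

`Math.imul` keeps only the low 32 bits of each product, and the trailing `>>> 0` turns its signed result back into an unsigned value, which is the same pattern the full implementation uses.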
--- a/apps/coder/src/services/model-resolution/connected-providers-cache.ts
+++ b/apps/coder/src/services/model-resolution/connected-providers-cache.ts
@@ -0,0 +1,34 @@
 import type { ModelMetadata } from "./provider-cache.js"
 export interface ProviderModelsCache {
  readonly models: Record<string, readonly string[] | readonly ModelMetadata[]>
  readonly connected: readonly string[]
  readonly updatedAt: string
 }
 export interface ConnectedProvidersAdapter {
  readConnectedProvidersCache(): string[] | null
  findProviderModelMetadata(providerID: string, modelID: string): ModelMetadata | undefined
  readProviderModelsCache(): ProviderModelsCache | null
 }
 export function readConnectedProvidersCache(): string[] | null {
  return null
 }
 export function findProviderModelMetadata(
  _providerID: string,
  _modelID: string,
 ): ModelMetadata | undefined {
  return undefined
 }
 export function readProviderModelsCache(): ProviderModelsCache | null {
  return null
 }
 export const connectedProvidersAdapter: ConnectedProvidersAdapter = {
  readConnectedProvidersCache,
  findProviderModelMetadata,
  readProviderModelsCache,
 }
--- a/apps/coder/src/services/model-resolution/fallback-chain-from-models.ts
+++ b/apps/coder/src/services/model-resolution/fallback-chain-from-models.ts
@@ -0,0 +1,128 @@
 import type { FallbackEntry } from "./model-requirement-types.js"
 import type { FallbackModelObject } from "./fallback-model-object.js"
 import { normalizeFallbackModels } from "./model-resolver.js"
 import { KNOWN_VARIANTS } from "./known-variants.js"
 function parseVariantFromModel(rawModel: string): { modelID: string; variant?: string } {
  if (typeof rawModel !== "string") {
    return { modelID: "" }
  }
  const trimmedModel = rawModel.trim()
  if (!trimmedModel) {
    return { modelID: "" }
  }
  const parenthesizedVariant = trimmedModel.match(/^(.*)\(([^()]+)\)\s*$/)
  if (parenthesizedVariant) {
    const modelID = parenthesizedVariant[1]?.trim() ?? ""
    const variant = parenthesizedVariant[2]?.trim()
    return variant ? { modelID, variant } : { modelID }
  }
  const spaceVariant = trimmedModel.match(/^(.*\S)\s+([a-z][a-z0-9_-]*)$/i)
  if (spaceVariant) {
    const modelID = spaceVariant[1]?.trim() ?? ""
    const variant = spaceVariant[2]?.trim().toLowerCase()
    if (variant && KNOWN_VARIANTS.has(variant)) {
      return { modelID, variant }
    }
  }
  return { modelID: trimmedModel }
 }
 export function parseFallbackModelEntry(
  model: string,
  contextProviderID: string | undefined,
  defaultProviderID = "opencode",
 ): FallbackEntry | undefined {
  if (typeof model !== "string") return undefined
  const trimmed = model.trim()
  if (!trimmed) return undefined
  const parts = trimmed.split("/")
  const providerID =
    parts.length >= 2 ? (parts[0]?.trim() ?? "") : (contextProviderID?.trim() || defaultProviderID)
  const rawModelID = parts.length >= 2 ? parts.slice(1).join("/").trim() : trimmed
  if (!providerID || !rawModelID) return undefined
  const parsed = parseVariantFromModel(rawModelID)
  if (!parsed.modelID) return undefined
  return {
    providers: [providerID],
    model: parsed.modelID,
    variant: parsed.variant,
  }
 }
 export function parseFallbackModelObjectEntry(
  obj: FallbackModelObject,
  contextProviderID: string | undefined,
  defaultProviderID = "opencode",
 ): FallbackEntry | undefined {
  const base = parseFallbackModelEntry(obj.model, contextProviderID, defaultProviderID)
  if (!base) return undefined
  return {
    ...base,
    variant: obj.variant ?? base.variant,
    reasoningEffort: obj.reasoningEffort,
    temperature: obj.temperature,
    top_p: obj.top_p,
    maxTokens: obj.maxTokens,
    thinking: obj.thinking,
  }
 }
 /**
 * Find the most specific FallbackEntry whose `provider/model` is a prefix of
 * the resolved `provider/modelID`.  Longest match wins so that e.g.
 * `openai/gpt-5.4-preview` picks the entry for `openai/gpt-5.4-preview` over
 * the shorter `openai/gpt-5.4`.
 */
 export function findMostSpecificFallbackEntry(
  providerID: string,
  modelID: string,
  chain: FallbackEntry[],
 ): FallbackEntry | undefined {
  const resolved = `${providerID}/${modelID}`.toLowerCase()
  // Collect entries whose provider/model is a prefix of the resolved model,
  // together with the length of the matching prefix (longest match wins).
  const matches: { entry: FallbackEntry; matchLen: number }[] = []
  for (const entry of chain) {
    for (const p of entry.providers) {
      const candidate = `${p}/${entry.model}`.toLowerCase()
      if (resolved.startsWith(candidate)) {
        matches.push({ entry, matchLen: candidate.length })
        break // one match per entry is enough
      }
    }
  }
  if (matches.length === 0) return undefined
  matches.sort((a, b) => b.matchLen - a.matchLen)
  return matches[0]!.entry
 }
 export function buildFallbackChainFromModels(
  fallbackModels: string | (string | FallbackModelObject)[] | undefined,
  contextProviderID: string | undefined,
  defaultProviderID = "opencode",
 ): FallbackEntry[] | undefined {
  const normalized = normalizeFallbackModels(fallbackModels)
  if (!normalized || normalized.length === 0) return undefined
  const parsed = normalized
    .map((entry) => {
      if (typeof entry === "string") {
        return parseFallbackModelEntry(entry, contextProviderID, defaultProviderID)
      }
      return parseFallbackModelObjectEntry(entry, contextProviderID, defaultProviderID)
    })
    .filter((entry): entry is FallbackEntry => entry !== undefined)
  if (parsed.length === 0) return undefined
  return parsed
 }
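
The longest-prefix rule described in the `findMostSpecificFallbackEntry` doc comment can be sketched as follows. `MiniFallbackEntry` and `mostSpecific` are hypothetical names, the entry type is cut down to the two fields the match needs, and a single pass tracking the longest match stands in for the collect-and-sort approach in the diff.

```typescript
// Hypothetical minimal shape: the FallbackEntry built in the diff carries more fields.
interface MiniFallbackEntry {
  providers: string[];
  model: string;
}

function mostSpecific(
  providerID: string,
  modelID: string,
  chain: MiniFallbackEntry[],
): MiniFallbackEntry | undefined {
  const resolved = `${providerID}/${modelID}`.toLowerCase();
  let best: MiniFallbackEntry | undefined;
  let bestLen = -1;
  for (const entry of chain) {
    for (const p of entry.providers) {
      const candidate = `${p}/${entry.model}`.toLowerCase();
      if (resolved.startsWith(candidate)) {
        // Strictly longer wins, so on a tie the earlier chain entry is kept.
        if (candidate.length > bestLen) {
          best = entry;
          bestLen = candidate.length;
        }
        break; // one match per entry is enough
      }
    }
  }
  return best;
}
```

Given entries for `openai/gpt-5.4` and `openai/gpt-5.4-preview`, a resolved model of `openai/gpt-5.4-preview` selects the second entry, while `openai/gpt-5.4` selects the first. Matching is case-insensitive.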
--- a/apps/coder/src/services/model-resolution/fallback-model-object.ts
+++ b/apps/coder/src/services/model-resolution/fallback-model-object.ts
@@ -0,0 +1,9 @@
 export type FallbackModelObject = {
  readonly model: string
  readonly variant?: string
  readonly reasoningEffort?: "none" | "minimal" | "low" | "medium" | "high" | "xhigh" | "max"
  readonly temperature?: number
  readonly top_p?: number
  readonly maxTokens?: number
  readonly thinking?: { readonly type: "enabled" | "disabled"; readonly budgetTokens?: number }
 }
--- a/apps/coder/src/services/model-resolution/index.ts
+++ b/apps/coder/src/services/model-resolution/index.ts
@@ -0,0 +1,80 @@
 export type {
  FallbackEntry,
  ModelRequirement,
 } from "./model-requirement-types.js"
 export type {
  FallbackModelObject,
 } from "./fallback-model-object.js"
 export type {
  DelegatedModelConfig,
  ModelResolutionRequest,
  ModelResolutionProvenance,
  ModelResolutionResult,
 } from "./model-resolution-types.js"
 export type {
  ModelResolutionInput,
  ModelSource,
  ExtendedModelResolutionInput,
 } from "./model-resolver.js"
 export {
  resolveModel,
  resolveModelWithFallback,
  normalizeFallbackModels,
  flattenToFallbackModelStrings,
 } from "./model-resolver.js"
 export {
  normalizeModel,
  normalizeModelID,
 } from "./model-normalization.js"
 export {
  fuzzyMatchModel,
  isModelAvailable,
 } from "./model-availability.js"
 export {
  transformModelForProvider,
  transformModelForProviderDisplay,
 } from "./provider-model-id-transform.js"
 export {
  buildFallbackChainFromModels,
  parseFallbackModelEntry,
  parseFallbackModelObjectEntry,
  findMostSpecificFallbackEntry,
 } from "./fallback-chain-from-models.js"
 export {
  KNOWN_VARIANTS,
 } from "./known-variants.js"
 export {
  _setModelResolutionLogImplementationForTesting,
  resolveModelPipeline,
 } from "./model-resolution-pipeline.js"
 export type {
  ModelResolutionRequest as PipelineModelResolutionRequest,
  ModelResolutionProvenance as PipelineModelResolutionProvenance,
  ModelResolutionResult as PipelineModelResolutionResult,
  ModelResolutionDeps,
 } from "./model-resolution-pipeline.js"
 export {
  isRetryableModelError,
  shouldRetryError,
  getNextFallback,
  hasMoreFallbacks,
  selectFallbackProvider,
  selectFallbackProviderWithCache,
 } from "./model-error-classifier.js"
 export type {
  ErrorInfo,
 } from "./model-error-classifier.js"
 export type {
  ProviderCache,
  ModelMetadata,
 } from "./provider-cache.js"
 export type {
  ProviderModelsCache,
  ConnectedProvidersAdapter,
 } from "./connected-providers-cache.js"
 export {
  readConnectedProvidersCache,
  findProviderModelMetadata,
  readProviderModelsCache,
  connectedProvidersAdapter,
 } from "./connected-providers-cache.js"
--- a/apps/coder/src/services/model-resolution/known-variants.ts
+++ b/apps/coder/src/services/model-resolution/known-variants.ts
@@ -0,0 +1,16 @@
 /**
 * Canonical set of recognised variant / effort tokens.
 * Used by parseFallbackModelEntry (space-suffix detection) and
 * flattenToFallbackModelStrings (inline-variant stripping).
 */
 export const KNOWN_VARIANTS = new Set([
  "low",
  "medium",
  "high",
  "xhigh",
  "max",
  "minimal",
  "none",
  "auto",
  "thinking",
 ])
--- a/apps/coder/src/services/model-resolution/model-availability.ts
+++ b/apps/coder/src/services/model-resolution/model-availability.ts
@@ -0,0 +1,64 @@
 function normalizeModelName(name: string): string {
  return name
    .toLowerCase()
    .replace(/claude-(opus|sonnet|haiku)-(\d+)[.-](\d+)/g, "claude-$1-$2.$3")
 }
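 /**
 * Finds the best match for `target` among `available` "provider/model" IDs.
 * Matching is a case-insensitive substring test, optionally restricted to
 * `providers`. Preference order: exact full match, then exact match on the
 * model ID after the provider prefix (shortest wins), then the shortest match.
 */
 export function fuzzyMatchModel(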
  target: string,
  available: Set<string>,
  providers?: string[],
 ): string | null {
  if (available.size === 0) {
    return null
  }
  const targetNormalized = normalizeModelName(target)
  let candidates = Array.from(available)
  if (providers && providers.length > 0) {
    const providerSet = new Set(providers)
    candidates = candidates.filter((model) => {
      const [provider] = model.split("/")
      return providerSet.has(provider!)
    })
  }
  if (candidates.length === 0) {
    return null
  }
  const matches = candidates.filter((model) =>
    normalizeModelName(model).includes(targetNormalized),
  )
  if (matches.length === 0) {
    return null
  }
  const exactMatch = matches.find((model) => normalizeModelName(model) === targetNormalized)
  if (exactMatch) {
    return exactMatch
  }
  const exactModelIdMatches = matches.filter((model) => {
    const modelId = model.split("/").slice(1).join("/")
    return normalizeModelName(modelId) === targetNormalized
  })
  if (exactModelIdMatches.length > 0) {
    return exactModelIdMatches.reduce((shortest, current) =>
      current.length < shortest.length ? current : shortest,
    )
  }
  return matches.reduce((shortest, current) =>
    current.length < shortest.length ? current : shortest,
  )
 }
 export function isModelAvailable(
  targetModel: string,
  availableModels: Set<string>,
 ): boolean {
  return fuzzyMatchModel(targetModel, availableModels) !== null
 }
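The matching order in `fuzzyMatchModel` can be illustrated with a condensed, standalone sketch. `pickModel` and the model IDs below are hypothetical and exist only for illustration; the provider filter is omitted, and the real function takes a `Set<string>` rather than an array.

```typescript
// Sketch only: restates the preference order of fuzzyMatchModel.
// `pickModel` is a hypothetical name, not part of the module.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/claude-(opus|sonnet|haiku)-(\d+)[.-](\d+)/g, "claude-$1-$2.$3")
}

function pickModel(target: string, available: string[]): string | null {
  const wanted = normalizeName(target)
  const matches = available.filter((m) => normalizeName(m).includes(wanted))
  if (matches.length === 0) return null

  // 1) Exact match on the full "provider/model" string.
  const exact = matches.find((m) => normalizeName(m) === wanted)
  if (exact !== undefined) return exact

  // 2) Exact match on the model ID after the provider prefix,
  // 3) otherwise any substring match; the shortest candidate wins in both cases.
  const byId = matches.filter(
    (m) => normalizeName(m.split("/").slice(1).join("/")) === wanted,
  )
  const pool = byId.length > 0 ? byId : matches
  return pool.reduce((shortest, current) =>
    current.length < shortest.length ? current : shortest,
  )
}

// Illustrative IDs only; not taken from any real provider listing.
const available = [
  "anthropic/claude-opus-4-5-20251101",
  "anthropic/claude-opus-4-5",
  "openai/gpt-5",
]
const picked = pickModel("claude-opus-4.5", available)
```

Because both `4.5` and `4-5` normalize to the same dotted form, the dotted target matches the dashed ID, and the undated ID is preferred over the dated one.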
--- a/apps/coder/src/services/model-resolution/model-error-classifier.ts
+++ b/apps/coder/src/services/model-resolution/model-error-classifier.ts
@@ -0,0 +1,261 @@
 import type { FallbackEntry } from "./model-requirement-types.js"
 import type { ProviderCache } from "./provider-cache.js"
 import * as connectedProvidersCache from "./connected-providers-cache.js"
 /**
 * Error names that indicate a retryable model error.
 * These errors halt execution and should trigger fallback retry.
 */
 const RETRYABLE_ERROR_NAMES = new Set([
  "providermodelnotfounderror",
  "ratelimiterror",
  "modelunavailableerror",
  "providerconnectionerror",
  "authenticationerror",
 ])
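 /**
 * Error names that indicate quota or billing exhaustion.
 * isRetryableModelError returns false for these, so they never trigger a retry.
 */
 const STOP_ERROR_NAMES = new Set([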
  "quotaexceedederror",
  "insufficientcreditserror",
  "freeusagelimiterror",
 ])
 /**
 * Error names that should NOT trigger retry.
 * These errors are typically user-induced or fixable without switching models.
 */
 const NON_RETRYABLE_ERROR_NAMES = new Set([
  "messageabortederror",
  "permissiondeniederror",
  "contextlengtherror",
  "timeouterror",
  "validationerror",
  "syntaxerror",
  "usererror",
 ])
 /**
 * Message patterns that indicate a retryable error even without a known error name.
 */
 const RETRYABLE_MESSAGE_PATTERNS = [
  "rate_limit",
  "rate limit",
  "usage_limit_reached",
  "usage limit has been reached",
  "quota",
  "all credentials for model",
  "cooling down",
  "exhausted your capacity",
  "not found",
  "unavailable",
  "insufficient",
  "too many requests",
  "over limit",
  "overloaded",
  "bad gateway",
  "bad request",
  "unknown provider",
  "provider not found",
  "model_not_supported",
  "model not supported",
  "model is not supported",
  "connection error",
  "network error",
  "timeout",
  "service unavailable",
  "internal_server_error",
  "free usage",
  "usage exceeded",
  "credit",
  "balance",
  "temporarily unavailable",
  "try again",
  "请稍后重试",         // "please try again later"
  "503",
  "502",
  "504",
  "429",
  "529",
  "selected provider is forbidden",
  "provider is forbidden",
  // Chinese retryable patterns (Zhipu, etc.)
  "频率限制",           // "rate limit"
  "请求过于频繁",       // "too many requests"
  "暂时不可用",         // "temporarily unavailable"
  "服务不可用",         // "service unavailable"
  // Generic server-side failure messages
  "server_error",
  "an error occurred while processing",
 ]
 /**
 * Message patterns that indicate a non-retryable STOP error (quota/billing exhaustion).
 * These take precedence over RETRYABLE_MESSAGE_PATTERNS.
 */
 const STOP_MESSAGE_PATTERNS = [
  "quota will reset after",
  "quota exceeded",
  "free usage limit",
  "billing limit",
  "billing hard limit",
  "monthly limit",
  "plan limit",
  "subscription quota",
  "subscription limit",
  "payment required",
  "out of credits",
  "credits exhausted",
  "insufficient credits",
  "insufficient balance",
  "credit balance",
  "usage limit for this month",
  "exhausted your capacity",
  // GLM/Z.ai business error codes that indicate permanent quota/billing exhaustion
  "daily call limit",
  "daily limit",
  "usage limit reached for",
  "in arrears",
  "fair use policy",
  "recharge and try",
  "使用上限",           // "usage limit"
  "额度不足",           // "insufficient quota"
  "余额不足",           // "insufficient balance"
  "已耗尽",             // "exhausted"
 ]
 const AUTO_RETRY_GATE_PATTERNS = [
  "rate limit",
  "cooling down",
  "credentials for model",
 ]
 function hasProviderAutoRetrySignal(message: string): boolean {
  if (!message.includes("retrying in")) {
    return false
  }
  return AUTO_RETRY_GATE_PATTERNS.some((pattern) => message.includes(pattern))
 }
 export interface ErrorInfo {
  name?: string
  message?: string
  /** HTTP status code from the provider response (e.g., 429 for rate limit) */
  statusCode?: number
 }
 /**
 * Determines if an error is a retryable model error.
 * Returns true if it's a known retryable type OR matches retryable message patterns.
 */
 export function isRetryableModelError(error: ErrorInfo): boolean {
  // If we have an error name, check against known lists
  if (error.name) {
    const errorNameLower = error.name.toLowerCase()
    // Explicit non-retryable takes precedence
    if (NON_RETRYABLE_ERROR_NAMES.has(errorNameLower)) {
      return false
    }
    if (STOP_ERROR_NAMES.has(errorNameLower)) {
      return false
    }
    // Check if it's a known retryable error
    if (RETRYABLE_ERROR_NAMES.has(errorNameLower)) {
      return true
    }
  }
  // Check message patterns for unknown errors
  const msg = error.message?.toLowerCase() ?? ""
  // STOP patterns take precedence over retryable patterns
  if (STOP_MESSAGE_PATTERNS.some((pattern) => msg.includes(pattern))) {
    return false
  }
  if (hasProviderAutoRetrySignal(msg)) {
    return true
  }
  // HTTP status code check: catches rate-limit (429) and overload/unavailable
  // (503, 529) errors regardless of message format or language.
  // Uses the same codes as runtime-fallback config (400 excluded as it is a permanent client error).
  if (
    error.statusCode != null &&
    (error.statusCode === 429 || error.statusCode === 503 || error.statusCode === 529)
  ) {
    return true
  }
  return RETRYABLE_MESSAGE_PATTERNS.some((pattern) => msg.includes(pattern))
 }
 /**
 * Determines if an error should trigger a fallback retry.
 * Returns true for errors that halt execution.
 */
 export function shouldRetryError(error: ErrorInfo): boolean {
  return isRetryableModelError(error)
 }
 /**
 * Gets the next fallback model from the chain based on attempt count.
 * Returns undefined if all fallbacks have been exhausted.
 */
 export function getNextFallback(
  fallbackChain: FallbackEntry[],
  attemptCount: number,
 ): FallbackEntry | undefined {
  return fallbackChain[attemptCount]
 }
 /**
 * Checks if there are more fallbacks available after the current attempt.
 */
 export function hasMoreFallbacks(
  fallbackChain: FallbackEntry[],
  attemptCount: number,
 ): boolean {
  return attemptCount < fallbackChain.length
 }
 /**
 * Selects the best provider for a fallback entry.
 * Priority:
 * 1) First connected provider in the entry's provider preference order
 * 2) Preferred provider when connected (and entry providers are unavailable)
 * 3) First provider listed in the fallback entry, then the preferred
 *    provider, then "opencode" when neither is set
 */
 export function selectFallbackProvider(
  providers: string[],
  preferredProviderID?: string,
 ): string {
  return selectFallbackProviderWithCache(
    providers,
    connectedProvidersCache,
    preferredProviderID,
  )
 }
 export function selectFallbackProviderWithCache(
  providers: string[],
  providerCache: ProviderCache,
  preferredProviderID?: string,
 ): string {
  const connectedProviders = providerCache.readConnectedProvidersCache()
  if (connectedProviders) {
    const connectedSet = new Set(connectedProviders.map(p => p.toLowerCase()))
    for (const provider of providers) {
      if (connectedSet.has(provider.toLowerCase())) {
        return provider
      }
    }
    if (
      preferredProviderID &&
      connectedSet.has(preferredProviderID.toLowerCase())
    ) {
      return preferredProviderID
    }
  }
  return providers[0] ?? preferredProviderID ?? "opencode"
 }
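The interplay between STOP and retryable patterns, and the way a fallback chain is walked by attempt count, can be sketched as follows. This is a simplified illustration under stated assumptions: the pattern lists are trimmed, the three-way `Verdict` and the names `classify` and `runWithFallback` are hypothetical, and the real module only exposes a boolean `isRetryableModelError` plus the chain helpers.

```typescript
// Sketch only: hypothetical names, trimmed pattern lists.
type Verdict = "stop" | "retry" | "fail"

const STOP_PATTERNS = ["quota exceeded", "insufficient balance"]
const RETRY_PATTERNS = ["quota", "rate limit", "insufficient"]

function classify(message: string): Verdict {
  const msg = message.toLowerCase()
  // STOP patterns are checked first, so "quota exceeded" never counts as
  // retryable even though it also contains the retryable token "quota".
  if (STOP_PATTERNS.some((p) => msg.includes(p))) return "stop"
  if (RETRY_PATTERNS.some((p) => msg.includes(p))) return "retry"
  return "fail"
}

// Walks the chain by attempt index, moving on only after a retryable error.
function runWithFallback(
  chain: string[],
  call: (model: string) => string,
): string {
  let lastError: unknown = new Error("empty fallback chain")
  for (let attempt = 0; attempt < chain.length; attempt++) {
    const model = chain[attempt]
    if (model === undefined) break
    try {
      return call(model)
    } catch (error) {
      lastError = error
      const message = error instanceof Error ? error.message : String(error)
      if (classify(message) !== "retry") throw error
    }
  }
  // Every entry failed with a retryable error: surface the last one.
  throw lastError
}
```

Checking STOP patterns before retryable ones is what keeps a billing failure from burning through every fallback entry.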
--- a/apps/coder/src/services/model-resolution/model-normalization.ts
+++ b/apps/coder/src/services/model-resolution/model-normalization.ts
@@ -0,0 +1,8 @@
 export function normalizeModel(model?: string): string | undefined {
  const trimmed = model?.trim()
  return trimmed || undefined
 }
 /** Replaces every ".<digits>" with "-<digits>", e.g. "4.5" becomes "4-5". */
 export function normalizeModelID(modelID: string): string {
  return modelID.replace(/\.(\d+)/g, "-$1")
 }
--- a/apps/coder/src/services/model-resolution/model-requirement-types.ts
+++ b/apps/coder/src/services/model-resolution/model-requirement-types.ts
@@ -0,0 +1,18 @@
 export type FallbackEntry = {
  providers: string[];
  model: string;
  variant?: string; // Entry-specific variant (e.g., GPT->high, Opus->max)
  reasoningEffort?: string;
  temperature?: number;
  top_p?: number;
  maxTokens?: number;
  thinking?: { type: "enabled" | "disabled"; budgetTokens?: number };
 };
 export type ModelRequirement = {
  fallbackChain: FallbackEntry[];
  variant?: string; // Default variant (used when entry doesn't specify one)
  requiresModel?: string; // If set, only activates when this model is available (fuzzy match)
  requiresAnyModel?: boolean; // If true, requires at least ONE model in fallbackChain to be available (or empty availability treated as unavailable)
  requiresProvider?: string[]; // If set, only activates when any of these providers is connected
 };
--- a/apps/coder/src/services/model-resolution/model-resolution-pipeline.ts
+++ b/apps/coder/src/services/model-resolution/model-resolution-pipeline.ts
@@ -0,0 +1,256 @@
 import { fuzzyMatchModel } from "./model-availability.js"
 import type { FallbackEntry } from "./model-requirement-types.js"
 import { transformModelForProvider } from "./provider-model-id-transform.js"
 import { normalizeModel } from "./model-normalization.js"
 import type { ProviderCache } from "./provider-cache.js"
 type LogImplementation = (message: string, data?: unknown) => void
 let logImplementationForTesting: LogImplementation | undefined
 function log(message: string, data?: unknown): void {
  const logImpl = logImplementationForTesting
  if (!logImpl) {
    return
  }
  if (arguments.length === 1) {
    logImpl(message)
    return
  }
  logImpl(message, data)
 }
 export function _setModelResolutionLogImplementationForTesting(
  logImplementation: LogImplementation | undefined,
 ): void {
  logImplementationForTesting = logImplementation
 }
 export type ModelResolutionRequest = {
  intent?: {
    uiSelectedModel?: string
    userModel?: string
    userFallbackModels?: string[]
    categoryDefaultModel?: string
  }
  constraints: {
    availableModels: Set<string>
    connectedProviders?: string[] | null
  }
  policy?: {
    fallbackChain?: FallbackEntry[]
    systemDefaultModel?: string
  }
 }
 export type ModelResolutionProvenance =
  | "override"
  | "category-default"
  | "provider-fallback"
  | "system-default"
 export type ModelResolutionResult = {
  model: string
  provenance: ModelResolutionProvenance
  variant?: string
  attempted?: string[]
  reason?: string
 }
 export type ModelResolutionDeps = {
  fuzzyMatchModel: (
    target: string,
    available: Set<string>,
    providers?: string[],
  ) => string | null
  transformModelForProvider: (provider: string, model: string) => string
 }
 const DEFAULT_MODEL_RESOLUTION_DEPS: ModelResolutionDeps = {
  fuzzyMatchModel,
  transformModelForProvider,
 }
 export function resolveModelPipeline(
  request: ModelResolutionRequest,
  providerCache: ProviderCache = {
    readConnectedProvidersCache: () => null,
    findProviderModelMetadata: () => undefined,
  },
  deps: ModelResolutionDeps = DEFAULT_MODEL_RESOLUTION_DEPS,
 ): ModelResolutionResult | undefined {
  const attempted: string[] = []
  const { intent, constraints, policy } = request
  const availableModels = constraints.availableModels
  const fallbackChain = policy?.fallbackChain
  const systemDefaultModel = policy?.systemDefaultModel
  const normalizedUiModel = normalizeModel(intent?.uiSelectedModel)
  if (normalizedUiModel) {
    log("Model resolved via UI selection", { model: normalizedUiModel })
    return { model: normalizedUiModel, provenance: "override" }
  }
  const normalizedUserModel = normalizeModel(intent?.userModel)
  if (normalizedUserModel) {
    log("Model resolved via config override", { model: normalizedUserModel })
    return { model: normalizedUserModel, provenance: "override" }
  }
  const normalizedCategoryDefault = normalizeModel(intent?.categoryDefaultModel)
  if (normalizedCategoryDefault) {
    attempted.push(normalizedCategoryDefault)
    if (availableModels.size > 0) {
      const parts = normalizedCategoryDefault.split("/")
      const providerHint = parts.length >= 2 ? [parts[0]!] : undefined
      const match = deps.fuzzyMatchModel(normalizedCategoryDefault, availableModels, providerHint)
      if (match) {
        log("Model resolved via category default (fuzzy matched)", {
          original: normalizedCategoryDefault,
          matched: match,
        })
        return { model: match, provenance: "category-default", attempted }
      }
    } else {
      const connectedProviders = constraints.connectedProviders ?? providerCache.readConnectedProvidersCache()
      if (connectedProviders === null) {
        log("Model resolved via category default (no cache, first run)", {
          model: normalizedCategoryDefault,
        })
        return { model: normalizedCategoryDefault, provenance: "category-default", attempted }
      }
      const parts = normalizedCategoryDefault.split("/")
      if (parts.length >= 2) {
        const provider = parts[0]!
        if (connectedProviders.includes(provider)) {
          const modelName = parts.slice(1).join("/")
          const transformedModel = `${provider}/${deps.transformModelForProvider(provider, modelName)}`
          log("Model resolved via category default (connected provider)", {
            model: transformedModel,
            original: normalizedCategoryDefault,
          })
          return { model: transformedModel, provenance: "category-default", attempted }
        }
      }
    }
    log("Category default model not available, falling through to fallback chain", {
      model: normalizedCategoryDefault,
    })
  }
  // When the user configured fallback_models, try them before the hardcoded fallback chain.
  const userFallbackModels = intent?.userFallbackModels
  if (userFallbackModels && userFallbackModels.length > 0) {
    if (availableModels.size === 0) {
      const connectedProviders = constraints.connectedProviders ?? providerCache.readConnectedProvidersCache()
      const connectedSet = connectedProviders ? new Set(connectedProviders) : null
      if (connectedSet !== null) {
        for (const model of userFallbackModels) {
          attempted.push(model)
          const parts = model.split("/")
          if (parts.length >= 2) {
            const provider = parts[0]!
            if (connectedSet.has(provider)) {
              const modelName = parts.slice(1).join("/")
              const transformedModel = `${provider}/${deps.transformModelForProvider(provider, modelName)}`
              log("Model resolved via user fallback_models (connected provider)", { model: transformedModel, original: model })
              return { model: transformedModel, provenance: "provider-fallback", attempted }
            }
          }
        }
        log("No connected provider found in user fallback_models, falling through to hardcoded chain")
      }
    } else {
      for (const model of userFallbackModels) {
        attempted.push(model)
        const parts = model.split("/")
        const providerHint = parts.length >= 2 ? [parts[0]!] : undefined
        const match = deps.fuzzyMatchModel(model, availableModels, providerHint)
        if (match) {
          log("Model resolved via user fallback_models (availability confirmed)", { model, match })
          return { model: match, provenance: "provider-fallback", attempted }
        }
      }
      log("No available model found in user fallback_models, falling through to hardcoded chain")
    }
  }
  if (fallbackChain && fallbackChain.length > 0) {
    if (availableModels.size === 0) {
      const connectedProviders = constraints.connectedProviders ?? providerCache.readConnectedProvidersCache()
      const connectedSet = connectedProviders ? new Set(connectedProviders) : null
      if (connectedSet === null) {
        log("Model fallback chain skipped (no connected providers cache) - falling through to system default")
      } else {
        for (const entry of fallbackChain) {
          for (const provider of entry.providers) {
            if (connectedSet.has(provider)) {
              const transformedModelId = deps.transformModelForProvider(provider, entry.model)
              const model = `${provider}/${transformedModelId}`
              log("Model resolved via fallback chain (connected provider)", {
                provider,
                model: transformedModelId,
                variant: entry.variant,
              })
              return {
                model,
                provenance: "provider-fallback",
                variant: entry.variant,
                attempted,
              }
            }
          }
        }
        log("No connected provider found in fallback chain, falling through to system default")
      }
    } else {
      for (const entry of fallbackChain) {
        for (const provider of entry.providers) {
          const fullModel = `${provider}/${entry.model}`
          const match = deps.fuzzyMatchModel(fullModel, availableModels, [provider])
          if (match) {
            log("Model resolved via fallback chain (availability confirmed)", {
              provider,
              model: entry.model,
              match,
              variant: entry.variant,
            })
            return {
              model: match,
              provenance: "provider-fallback",
              variant: entry.variant,
              attempted,
            }
          }
        }
        const crossProviderMatch = deps.fuzzyMatchModel(entry.model, availableModels)
        if (crossProviderMatch) {
          log("Model resolved via fallback chain (cross-provider fuzzy match)", {
            model: entry.model,
            match: crossProviderMatch,
            variant: entry.variant,
          })
          return {
            model: crossProviderMatch,
            provenance: "provider-fallback",
            variant: entry.variant,
            attempted,
          }
        }
      }
      log("No available model found in fallback chain, falling through to system default")
    }
  }
  if (systemDefaultModel === undefined) {
    log("No model resolved - systemDefaultModel not configured")
    return undefined
  }
  log("Model resolved via system default", { model: systemDefaultModel })
  return { model: systemDefaultModel, provenance: "system-default", attempted }
 }
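The resolution order implemented by `resolveModelPipeline` can be summarized with a much smaller sketch. It is an approximation only: `resolveSketch` and `SketchRequest` are hypothetical names, availability is checked with exact `Set` membership instead of fuzzy matching, user fallback models and the hardcoded chain are merged into one list, and the connected-provider branches are left out.

```typescript
// Sketch only: hypothetical names, exact matching instead of fuzzy matching.
type Provenance =
  | "override"
  | "category-default"
  | "provider-fallback"
  | "system-default"

interface Resolved {
  model: string
  provenance: Provenance
}

interface SketchRequest {
  uiSelectedModel?: string
  userModel?: string
  categoryDefaultModel?: string
  fallbackModels?: string[]
  systemDefaultModel?: string
  availableModels: Set<string>
}

function resolveSketch(req: SketchRequest): Resolved | undefined {
  // 1) Explicit selections win without any availability check.
  const override = req.uiSelectedModel?.trim() || req.userModel?.trim()
  if (override) return { model: override, provenance: "override" }

  // 2) The category default is used only when it is actually available.
  const categoryDefault = req.categoryDefaultModel?.trim()
  if (categoryDefault && req.availableModels.has(categoryDefault)) {
    return { model: categoryDefault, provenance: "category-default" }
  }

  // 3) Fallback entries are tried in order.
  for (const model of req.fallbackModels ?? []) {
    if (req.availableModels.has(model)) {
      return { model, provenance: "provider-fallback" }
    }
  }

  // 4) The system default is the last resort; undefined when unset.
  if (req.systemDefaultModel === undefined) return undefined
  return { model: req.systemDefaultModel, provenance: "system-default" }
}
```

Returning the provenance alongside the model lets callers log why a given model was chosen, which is what the `attempted` and `provenance` fields support in the full pipeline.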
--- a/apps/coder/src/services/model-resolution/model-resolution-types.ts
+++ b/apps/coder/src/services/model-resolution/model-resolution-types.ts
@@ -0,0 +1,41 @@
 import type { FallbackEntry } from "./model-requirement-types.js"
 export interface DelegatedModelConfig {
  providerID: string
  modelID: string
  variant?: string
  reasoningEffort?: string
  temperature?: number
  top_p?: number
  maxTokens?: number
  thinking?: { type: "enabled" | "disabled"; budgetTokens?: number }
 }
 export type ModelResolutionRequest = {
  intent?: {
    uiSelectedModel?: string
    userModel?: string
    categoryDefaultModel?: string
  }
  constraints: {
    availableModels: Set<string>
  }
  policy?: {
    fallbackChain?: FallbackEntry[]
    systemDefaultModel?: string
  }
 }
 export type ModelResolutionProvenance =
  | "override"
  | "category-default"
  | "provider-fallback"
  | "system-default"
 export type ModelResolutionResult = {
  model: string
  provenance: ModelResolutionProvenance
  variant?: string
  attempted?: string[]
  reason?: string
 }
--- a/apps/coder/src/services/model-resolution/model-resolver.ts
+++ b/apps/coder/src/services/model-resolution/model-resolver.ts
@@ -0,0 +1,109 @@
 import type { FallbackEntry } from "./model-requirement-types.js"
 import type { FallbackModelObject } from "./fallback-model-object.js"
 import { normalizeModel } from "./model-normalization.js"
 import { resolveModelPipeline } from "./model-resolution-pipeline.js"
 import { KNOWN_VARIANTS } from "./known-variants.js"
 import type { ConnectedProvidersAdapter } from "./connected-providers-cache.js"
 import * as connectedProvidersCache from "./connected-providers-cache.js"
 export type ModelResolutionInput = {
  userModel?: string
  inheritedModel?: string
  systemDefault?: string
 }
 export type ModelSource =
  | "override"
  | "category-default"
  | "provider-fallback"
  | "system-default"
 export type ModelResolutionResult = {
  model: string
  source: ModelSource
  variant?: string
 }
 export type ExtendedModelResolutionInput = {
  uiSelectedModel?: string
  userModel?: string
  userFallbackModels?: string[]
  categoryDefaultModel?: string
  fallbackChain?: FallbackEntry[]
  availableModels: Set<string>
  systemDefaultModel?: string
 }
 export function resolveModel(input: ModelResolutionInput): string | undefined {
  return (
    normalizeModel(input.userModel) ??
    normalizeModel(input.inheritedModel) ??
    input.systemDefault
  )
 }
 export function resolveModelWithFallback(
  input: ExtendedModelResolutionInput,
  connectedProvidersAdapter: ConnectedProvidersAdapter = connectedProvidersCache,
 ): ModelResolutionResult | undefined {
  const { uiSelectedModel, userModel, userFallbackModels, categoryDefaultModel, fallbackChain, availableModels, systemDefaultModel } = input
  const resolved = resolveModelPipeline({
    intent: { uiSelectedModel, userModel, userFallbackModels, categoryDefaultModel },
    constraints: { availableModels },
    policy: { fallbackChain, systemDefaultModel },
  }, connectedProvidersAdapter)
  if (!resolved) {
    return undefined
  }
  return {
    model: resolved.model,
    source: resolved.provenance,
    variant: resolved.variant,
  }
 }
 /**
 * Normalizes fallback_models config to a mixed array.
 * Accepts string, string[], or mixed arrays of strings and FallbackModelObject entries.
 */
 export function normalizeFallbackModels(
  models: string | (string | FallbackModelObject)[] | undefined,
 ): (string | FallbackModelObject)[] | undefined {
  if (!models) return undefined
  if (typeof models === "string") return [models]
  return models
 }
 /**
 * Extracts plain model strings from a mixed fallback models array.
 * Object entries are flattened to "model" or "model(variant)" strings.
 * Use this when consumers need string[] (e.g., resolveModelForDelegateTask).
 */
 export function flattenToFallbackModelStrings(
  models: (string | FallbackModelObject)[] | undefined,
 ): string[] | undefined {
  if (!models) return undefined
  return models.map((entry) => {
    if (typeof entry === "string") return entry
    const variant = entry.variant
    if (variant) {
      // Strip any supported inline variant syntax before appending explicit override.
      // Supports both parenthesized and space-suffix forms so we don't emit
      // invalid strings like "provider/model high(low)".
      const model = entry.model
        .replace(/\([^()]+\)\s*$/, "")
        .replace(/\s+([a-z][a-z0-9_-]*)\s*$/i, (_match: string, suffix: string) => {
          const normalized = String(suffix).toLowerCase()
          return KNOWN_VARIANTS.has(normalized)
            ? ""
            : _match
        })
        .trim()
      return `${model}(${variant})`
    }
    return entry.model
  })
 }
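The variant-stripping step in `flattenToFallbackModelStrings` is easier to follow in isolation. The sketch below reuses the same two regular expressions; `withVariant` is a hypothetical helper name and the model strings are illustrative.

```typescript
// Sketch only: `withVariant` is a hypothetical helper, not part of the module.
const VARIANTS = new Set([
  "low", "medium", "high", "xhigh", "max", "minimal", "none", "auto", "thinking",
])

function withVariant(model: string, variant: string): string {
  const base = model
    // Drop a trailing parenthesized variant such as "(high)".
    .replace(/\([^()]+\)\s*$/, "")
    // Drop a trailing space-separated token only if it is a known variant.
    .replace(/\s+([a-z][a-z0-9_-]*)\s*$/i, (match: string, suffix: string) =>
      VARIANTS.has(suffix.toLowerCase()) ? "" : match,
    )
    .trim()
  return `${base}(${variant})`
}

const fromSpaceSuffix = withVariant("openai/gpt-5 high", "low")
const fromParens = withVariant("openai/gpt-5(high)", "low")
const unknownSuffix = withVariant("local/my model", "max")
```

Both inline forms collapse to `openai/gpt-5(low)`, while a trailing word that is not a known variant (`model` in the last case) is kept as part of the model name.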
--- a/apps/coder/src/services/model-resolution/provider-cache.ts
+++ b/apps/coder/src/services/model-resolution/provider-cache.ts
@@ -0,0 +1,27 @@
 export interface ModelMetadata {
  readonly id: string
  readonly provider?: string
  readonly context?: number
  readonly output?: number
  readonly name?: string
  readonly variants?: Record<string, unknown>
  readonly limit?: {
    readonly context?: number
    readonly input?: number
    readonly output?: number
  }
  readonly modalities?: {
    readonly input?: string[]
    readonly output?: string[]
  }
  readonly capabilities?: Record<string, unknown>
  readonly reasoning?: boolean
  readonly temperature?: boolean
  readonly tool_call?: boolean
  readonly [key: string]: unknown
 }
 export interface ProviderCache {
  readConnectedProvidersCache(): string[] | null
  findProviderModelMetadata(providerID: string, modelID: string): ModelMetadata | undefined
 }
--- a/apps/coder/src/services/model-resolution/provider-model-id-transform.ts
+++ b/apps/coder/src/services/model-resolution/provider-model-id-transform.ts
@@ -0,0 +1,69 @@
 function inferSubProvider(model: string): string | undefined {
  if (model.startsWith("claude-")) return "anthropic"
  if (model.startsWith("gpt-")) return "openai"
  if (model.startsWith("gemini-")) return "google"
  if (model.startsWith("grok-")) return "xai"
  if (model.startsWith("minimax-")) return "minimax"
  if (model.startsWith("kimi-")) return "moonshotai"
  if (model.startsWith("glm-")) return "zai"
  return undefined
 }
 const CLAUDE_VERSION_DOT = /claude-(\w+)-(\d+)-(\d+)/g
 const GEMINI_31_PRO_PREVIEW = /gemini-3\.1-pro(?!-)/g
 const GEMINI_3_FLASH_PREVIEW = /gemini-3-flash(?!-)/g
 function claudeVersionDot(model: string): string {
  return model.replace(CLAUDE_VERSION_DOT, "claude-$1-$2.$3")
 }
 function applyGatewayTransforms(model: string): string {
  return claudeVersionDot(model).replace(
    GEMINI_31_PRO_PREVIEW,
    "gemini-3.1-pro-preview",
  )
 }
 function transformModelForProviderUsingAnthropicBehavior(
  provider: string,
  model: string,
 ): string {
  if (provider === "vercel") {
    const slashIndex = model.indexOf("/")
    if (slashIndex !== -1) {
      const subProvider = model.substring(0, slashIndex)
      const subModel = model.substring(slashIndex + 1)
      return `${subProvider}/${applyGatewayTransforms(subModel)}`
    }
    const subProvider = inferSubProvider(model)
    if (subProvider) {
      return `${subProvider}/${applyGatewayTransforms(model)}`
    }
    return model
  }
  if (provider === "github-copilot") {
    return claudeVersionDot(model)
      .replace(GEMINI_31_PRO_PREVIEW, "gemini-3.1-pro-preview")
      .replace(GEMINI_3_FLASH_PREVIEW, "gemini-3-flash-preview")
  }
  if (provider === "google") {
    return model
      .replace(GEMINI_31_PRO_PREVIEW, "gemini-3.1-pro-preview")
      .replace(GEMINI_3_FLASH_PREVIEW, "gemini-3-flash-preview")
  }
  if (provider === "anthropic") {
    return model
  }
  return model
 }
 export function transformModelForProvider(provider: string, model: string): string {
  return transformModelForProviderUsingAnthropicBehavior(provider, model)
 }
 export function transformModelForProviderDisplay(
  provider: string,
  model: string,
 ): string {
  return transformModelForProviderUsingAnthropicBehavior(provider, model)
 }
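The two rewrites applied for gateway-style providers can be tried out with a standalone sketch. `toGatewayId` is a hypothetical name that mirrors `applyGatewayTransforms`; the model IDs are illustrative.

```typescript
// Sketch only: `toGatewayId` is a hypothetical name mirroring applyGatewayTransforms.
const CLAUDE_DASHED_VERSION = /claude-(\w+)-(\d+)-(\d+)/g
const GEMINI_PRO_WITHOUT_SUFFIX = /gemini-3\.1-pro(?!-)/g

function toGatewayId(model: string): string {
  return model
    // "claude-opus-4-5" becomes "claude-opus-4.5".
    .replace(CLAUDE_DASHED_VERSION, "claude-$1-$2.$3")
    // The negative lookahead skips IDs that already carry a suffix.
    .replace(GEMINI_PRO_WITHOUT_SUFFIX, "gemini-3.1-pro-preview")
}

const claudeId = toGatewayId("claude-opus-4-5")
const geminiId = toGatewayId("gemini-3.1-pro")
const alreadySuffixed = toGatewayId("gemini-3.1-pro-preview")
```

The `(?!-)` lookahead is what makes the Gemini rewrite idempotent: applying it to an ID that already ends in `-preview` leaves the ID unchanged.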
--- a/apps/coder/src/services/paseo-client.ts
+++ b/apps/coder/src/services/paseo-client.ts
@@ -0,0 +1,341 @@
 /**
 * v2.10 — PaseoClient: thin CLI-based client for the Paseo daemon.
 *
 * Paseo is a multi-agent hub daemon running at a configurable address
 * (default Unix socket / localhost:6767). This client wraps the `paseo` CLI
 * via child_process spawn for all operations (the daemon does not expose a
 * separate REST API for write operations). Read operations (listAgents,
 * getAgentStatus) use `paseo ls --json` / `paseo inspect --json`; write
 * operations (import, archive, send) use the corresponding subcommands.
 *
 * Spec: openspec/changes/v2-10-paseo-integration/design.md.
 */
 import { spawn } from 'node:child_process';
 import { once } from 'node:events';
 import { createInterface } from 'node:readline';
 // ─── Types ───────────────────────────────────────────────────────────────────
 /** Listing entry from `paseo ls --json`. Fields are lowercase. */
 export interface PaseoAgentListItem {
  id: string;
  shortId: string;
  name: string;
  provider: string;
  status: string;
  cwd?: string;
  created?: string;
  thinking?: string;
 }
 /** Detailed agent info from `paseo inspect --json`. Fields are PascalCase. */
 export interface PaseoAgentDetail {
  Id: string;
  Name: string;
  Provider: string;
  Model?: string;
  Status: string;
  Thinking?: string;
  Archived: boolean;
  ArchivedAt?: string | null;
  Cwd?: string;
  CreatedAt: string;
  UpdatedAt: string;
  Mode?: string;
  AvailableModes?: Array<{ id: string; label: string }>;
  Capabilities?: {
    Streaming?: boolean;
    Persistence?: boolean;
    DynamicModes?: boolean;
    McpServers?: boolean;
  };
  Labels?: Record<string, string>;
  Worktree?: string | null;
  ParentAgentId?: string | null;
 }
 /** Result of `paseo send --json`. */
 export interface PaseoSendResult {
  /** The agent's textual response. */
  text?: string;
  /** Structured output if the agent produced any. */
  output?: unknown;
  /** Error message if the turn failed. */
  error?: string;
  /** True if the turn completed successfully. */
  ok?: boolean;
 }
 export interface PaseoClientConfig {
  /** Path to the paseo binary. Default: auto-resolved from PATH. */
  paseoBin: string;
  /**
   * Explicit `--host <host>` value for CLI calls.
   * Format: `host:port` or `tcp://host:port?ssl=true&password=secret`.
   * Omit to use the CLI default (Unix socket, fallback localhost:6767).
   */
  cliHost?: string;
 }
 const DEFAULT_PASEO_BIN = 'paseo';
 // ─── Client ──────────────────────────────────────────────────────────────────
 export class PaseoClientError extends Error {
  constructor(
    message: string,
    public readonly command: string,
    public readonly exitCode: number | null,
    public readonly stderr: string,
  ) {
    super(message);
    this.name = 'PaseoClientError';
  }
 }
 export class PaseoClient {
  /** @internal visible for testing */
  readonly bin: string;
  private readonly hostArgs: string[];
  constructor(config?: Partial<PaseoClientConfig>) {
    this.bin = config?.paseoBin ?? DEFAULT_PASEO_BIN;
    this.hostArgs = config?.cliHost ? ['--host', config.cliHost] : [];
  }
  // ─── Read operations (CLI `ls --json`, `inspect --json`) ──────────────────
  /** List all non-archived agents. */
  async listAgents(): Promise<PaseoAgentListItem[]> {
    const raw = await this.runJson(['ls', '--json', ...this.hostArgs]);
    return raw as PaseoAgentListItem[];
  }
  /** Get detailed status for a single agent by ID or prefix. */
  async getAgentStatus(agentId: string): Promise<PaseoAgentDetail> {
    const raw = await this.runJson(['inspect', '--json', agentId, ...this.hostArgs]);
    return raw as PaseoAgentDetail;
  }
  /**
   * Quick liveness check — runs `paseo ls --json --limit 1` and returns success.
   * The daemon is healthy if the CLI exits 0.
   */
  async health(): Promise<{ status: string }> {
    try {
      await this.runCli(['ls', '--json', '--limit', '1', ...this.hostArgs]);
      return { status: 'ok' };
    } catch {
      return { status: 'error' };
    }
  }
  // ─── Write operations (CLI subcommands) ───────────────────────────────────
  /**
   * Import a provider session as a Paseo agent.
   * Uses `paseo import <sessionId> --provider <provider> [--label k=v]`.
   */
  async importAgent(
    sessionId: string,
    provider: string,
    labels?: Record<string, string>,
  ): Promise<PaseoAgentDetail> {
    const args: string[] = ['import', '--json', ...this.hostArgs];
    if (provider) {
      args.push('--provider', provider);
    }
    if (labels) {
      for (const [k, v] of Object.entries(labels)) {
        args.push('--label', `${k}=${v}`);
      }
    }
    args.push(sessionId);
    const raw = await this.runJson(args);
    return raw as PaseoAgentDetail;
  }
  /** Archive (soft-delete) a Paseo agent by ID or prefix. */
  async archiveAgent(agentId: string): Promise<void> {
    await this.runCli(['archive', '--json', ...this.hostArgs, agentId]);
  }
  /**
   * Send a prompt to an existing agent.
   *
   * By default waits for the agent to complete the turn and returns the
   * structured result. Pass `noWait: true` to fire-and-forget. Note that
   * `onEvent` is accepted but not invoked here; use streamSend() for
   * real-time output.
   */
  async sendPrompt(
    agentId: string,
    prompt: string,
    options?: {
      noWait?: boolean;
      onEvent?: (event: { type: 'text' | 'reasoning'; text: string }) => void;
      signal?: AbortSignal;
    },
  ): Promise<PaseoSendResult> {
    const args: string[] = ['send', '--json', ...this.hostArgs];
    if (options?.noWait) {
      args.push('--no-wait');
    }
    args.push(agentId, prompt);
    // With --json and no --no-wait, the output is JSON after completion.
    // For real-time text, streamSend() runs without --json and reads stdout.
    const raw = await this.runCli(args, options?.signal);
    try {
      return JSON.parse(raw) as PaseoSendResult;
    } catch {
      return { text: raw, ok: true };
    }
  }
  /**
   * Stream-send: runs `paseo send` WITHOUT `--json` and forwards each stdout
   * line to onEvent as a `text` event in real time. Use when the caller wants
   * to stream agent output as it arrives rather than wait for the full JSON result.
   */
  async streamSend(
    agentId: string,
    prompt: string,
    onEvent: (event: { type: 'text' | 'reasoning'; text: string }) => void,
    signal?: AbortSignal,
  ): Promise<PaseoSendResult> {
    return new Promise<PaseoSendResult>((resolve, reject) => {
      const args = ['send', ...this.hostArgs, agentId, prompt];
      const child = spawn(this.bin, args, {
        stdio: ['ignore', 'pipe', 'pipe'],
        signal,
      });
      let stdout = '';
      let stderr = '';
      if (child.stdout) {
        const rl = createInterface({ input: child.stdout });
        rl.on('line', (line: string) => {
          stdout += line + '\n';
          // Forward as text event for real-time display
          onEvent({ type: 'text', text: line + '\n' });
        });
      }
      if (child.stderr) {
        child.stderr.on('data', (chunk: Buffer) => {
          stderr += chunk.toString();
        });
      }
      once(child, 'close').then((raw) => {
        // A null exit code means the child was killed by a signal; treat it as a failure.
        const exitCode = raw[0] as number | null;
        if (exitCode !== 0) {
          reject(
            new PaseoClientError(
              `paseo send failed (exit ${exitCode}): ${stderr.trim()}`,
              'send',
              exitCode,
              stderr,
            ),
          );
          return;
        }
        resolve({ text: stdout, ok: true });
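      }).catch(() => {
        // events.once() rejects if the child emits 'error' first; the 'error'
        // listener below already settles the outer promise.
      });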
      child.on('error', reject);
    });
  }
  /** Interrupt/stop a running agent. */
  async stopAgent(agentId: string): Promise<void> {
    await this.runCli(['stop', ...this.hostArgs, agentId]);
  }
  // ─── Private helpers ───────────────────────────────────────────────────────
  /**
   * Run a CLI command and return stdout as a string.
   * Throws PaseoClientError on non-zero exit.
   */
  private async runCli(
    args: string[],
    signal?: AbortSignal,
  ): Promise<string> {
    return new Promise<string>((resolve, reject) => {
      const child = spawn(this.bin, args, {
        stdio: ['ignore', 'pipe', 'pipe'],
        signal,
      });
      let stdout = '';
      let stderr = '';
      if (child.stdout) {
        child.stdout.on('data', (chunk: Buffer) => {
          stdout += chunk.toString();
        });
      }
      if (child.stderr) {
        child.stderr.on('data', (chunk: Buffer) => {
          stderr += chunk.toString();
        });
      }
      child.on('error', (err: Error) => {
        // If signal aborted, treat as cancellation not error
        if (signal?.aborted) {
          resolve('');
          return;
        }
        reject(err);
      });
      once(child, 'close').then((raw) => {
        // A null exit code means the child was killed by a signal; treat it as a failure.
        const exitCode = raw[0] as number | null;
        if (signal?.aborted) {
          resolve('');
          return;
        }
        if (exitCode !== 0) {
          const msg = stderr.trim() || `exit code ${exitCode}`;
          reject(
            new PaseoClientError(
              `paseo ${args[0] ?? '?'} failed: ${msg}`,
              args[0] ?? '?',
              exitCode,
              stderr,
            ),
          );
          return;
        }
        resolve(stdout);
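      }).catch(() => {
        // events.once() rejects if the child emits 'error' first; the 'error'
        // listener above already settles the outer promise.
      });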
    });
  }
  /**
   * Run a CLI command and parse stdout as JSON.
   * Throws PaseoClientError on non-zero exit or parse failure.
   */
  private async runJson(args: string[]): Promise<unknown> {
    const stdout = await this.runCli(args);
    try {
      return JSON.parse(stdout);
    } catch (err) {
      throw new PaseoClientError(
        `paseo ${args[0] ?? '?'} returned invalid JSON: ${(stdout || '<empty>').slice(0, 200)}`,
        args[0] ?? '?',
        0,
        stdout,
      );
    }
  }
 }
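
The way `sendPrompt()` assembles its CLI arguments can be sketched as a pure function. This is only an illustration: `buildSendArgs` and `SendArgOptions` are hypothetical names that do not appear in the diff, and the flag order simply mirrors what the code above passes to `paseo send`.

```typescript
// Hypothetical helper (not part of the diff): mirrors the argument order used by sendPrompt().
interface SendArgOptions {
  cliHost?: string; // forwarded as `--host <value>` when set
  noWait?: boolean; // adds `--no-wait` for fire-and-forget sends
}

function buildSendArgs(
  agentId: string,
  prompt: string,
  opts: SendArgOptions = {},
): string[] {
  const hostArgs: string[] = opts.cliHost ? ['--host', opts.cliHost] : [];
  const args: string[] = ['send', '--json', ...hostArgs];
  if (opts.noWait) {
    args.push('--no-wait');
  }
  // Positional arguments go last, as in the client above.
  args.push(agentId, prompt);
  return args;
}

console.log(buildSendArgs('abc123', 'run the tests', { cliHost: 'localhost:6767' }));
```

Keeping argument assembly separate from process spawning makes the flag order testable without a running daemon.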
--- a/apps/coder/src/services/plan-store.ts
+++ b/apps/coder/src/services/plan-store.ts
@@ -0,0 +1,184 @@
 /**
 * Boulder state — cross-session plan persistence for BooCode.
 *
 * Plans live above flow_runs: a plan tracks a user's work goal and can link to
 * a flow run for automatic progress tracking. When the linked flow run reaches
 * a terminal state (completed/failed/cancelled), the plan is auto-updated.
 *
 * Auto-resumption: on startup, plans with a linked in-flight flow_run are
 * surfaced via the GET endpoint so the UI can show a resume prompt. The
 * flow-runner's initResume() re-advances the actual run; this store surfaces
 * the plan-level view.
 */
 import type { Sql } from '../db.js';
 export interface Plan {
  id: string;
  project_id: string;
  title: string;
  description: string | null;
  status: string;
  flow_run_id: string | null;
  progress_pct: number;
  items_total: number;
  items_completed: number;
  metadata: Record<string, unknown> | null;
  created_at: Date;
  updated_at: Date;
 }
 export interface CreatePlanOpts {
  projectId: string;
  title: string;
  description?: string;
  flowRunId?: string;
  metadata?: Record<string, unknown>;
 }
 export interface UpdatePlanOpts {
  title?: string;
  description?: string | null;
  status?: 'active' | 'completed' | 'cancelled' | 'failed';
  progressPct?: number;
  itemsTotal?: number;
  itemsCompleted?: number;
  metadata?: Record<string, unknown> | null;
 }
 export function createPlan(sql: Sql, opts: CreatePlanOpts): Promise<Plan> {
  return sql`
    INSERT INTO plans (project_id, title, description, flow_run_id, metadata)
    VALUES (
      ${opts.projectId},
      ${opts.title},
      ${opts.description ?? null},
      ${opts.flowRunId ?? null},
      ${opts.metadata ? sql.json(opts.metadata as never) : null}
    )
    RETURNING *
  `.then((rows) => rows[0] as unknown as Plan);
 }
 export function getPlan(sql: Sql, planId: string): Promise<Plan | null> {
  return sql`
    SELECT * FROM plans WHERE id = ${planId}
  `.then((rows) => (rows[0] as unknown as Plan) ?? null);
 }
 export function listPlans(sql: Sql, projectId: string): Promise<Plan[]> {
  return sql`
    SELECT * FROM plans
    WHERE project_id = ${projectId}
    ORDER BY created_at DESC
    LIMIT 100
  ` as Promise<Plan[]>;
 }
 export function listActivePlans(sql: Sql, projectId: string): Promise<Plan[]> {
  return sql`
    SELECT * FROM plans
    WHERE project_id = ${projectId} AND status = 'active'
    ORDER BY created_at DESC
  ` as Promise<Plan[]>;
 }
 export async function updatePlan(
  sql: Sql,
  planId: string,
  opts: UpdatePlanOpts,
 ): Promise<Plan | null> {
  const sets: string[] = [];
  const values: unknown[] = [];
  if (opts.title !== undefined) {
    sets.push(`title = $${values.length + 1}`);
    values.push(opts.title);
  }
  if (opts.description !== undefined) {
    sets.push(`description = $${values.length + 1}`);
    values.push(opts.description);
  }
  if (opts.status !== undefined) {
    sets.push(`status = $${values.length + 1}`);
    values.push(opts.status);
  }
  if (opts.progressPct !== undefined) {
    sets.push(`progress_pct = $${values.length + 1}`);
    values.push(opts.progressPct);
  }
  if (opts.itemsTotal !== undefined) {
    sets.push(`items_total = $${values.length + 1}`);
    values.push(opts.itemsTotal);
  }
  if (opts.itemsCompleted !== undefined) {
    sets.push(`items_completed = $${values.length + 1}`);
    values.push(opts.itemsCompleted);
  }
  if (opts.metadata !== undefined) {
    sets.push(`metadata = $${values.length + 1}::jsonb`);
    values.push(opts.metadata !== null ? JSON.stringify(opts.metadata) : null);
  }
  if (sets.length === 0) return getPlan(sql, planId);
  sets.push(`updated_at = clock_timestamp()`);
  const query = `
    UPDATE plans SET ${sets.join(', ')}
    WHERE id = $${values.length + 1}
    RETURNING *
  `;
  values.push(planId);
  const result = await sql.unsafe(query, values as never[]);
  return (result[0] as unknown as Plan) ?? null;
 }
 /**
 * Called when a flow run reaches a terminal state. Updates the linked plan's
 * status based on the run outcome:
 *  - completed → plan completed
 *  - failed    → plan failed
 *  - cancelled → plan cancelled
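 * Only plans still in the 'active' state are touched; in every case
 * progress_pct is set to 100 and items_completed to items_total.
 * Returns true when a plan was updated, false when no active plan is linked
 * to the run.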
 */
 export async function updatePlanFromRun(
  sql: Sql,
  runId: string,
  runStatus: 'completed' | 'failed' | 'cancelled',
 ): Promise<boolean> {
  const planStatus = planStatusFromRun(runStatus);
  const updated = await sql`
    UPDATE plans
    SET status = ${planStatus}, progress_pct = 100,
        items_completed = items_total, updated_at = clock_timestamp()
    WHERE flow_run_id = ${runId} AND status = 'active'
  `;
  return updated.count > 0;
 }
 /** Map a flow-run terminal status to its corresponding plan status. Pure. */
 export function planStatusFromRun(runStatus: 'completed' | 'failed' | 'cancelled'): string {
  // Terminal run statuses map one-to-one onto plan statuses.
  return runStatus;
 }
 /**
 * Find any active plan linked to a running flow run — used by the startup
 * resume path to surface plans that have in-flight orchestrator runs.
 */
 export async function findPlanWithRunningRun(
  sql: Sql,
  projectId: string,
 ): Promise<(Plan & { run_status: string }) | null> {
  const [row] = await sql`
    SELECT p.*, fr.status AS run_status
    FROM plans p
    JOIN flow_runs fr ON fr.id = p.flow_run_id
    WHERE p.project_id = ${projectId}
      AND p.status = 'active'
      AND fr.status = 'running'
    ORDER BY p.created_at DESC
    LIMIT 1
  `;
  return (row as unknown as Plan & { run_status: string }) ?? null;
 }
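
The positional-placeholder bookkeeping in `updatePlan()` can be shown in isolation. The sketch below is a simplified, hypothetical `buildUpdate` helper (not in the diff); it assumes column names come from a trusted allow-list, since they are interpolated into the SQL text while values are bound as parameters.

```typescript
// Hypothetical, simplified version of the SET-clause assembly in updatePlan().
// Column names are interpolated, so they must come from trusted code, never user input.
type Patch = Record<string, unknown>;

function buildUpdate(
  table: string,
  patch: Patch,
  id: string,
): { query: string; values: unknown[] } | null {
  const sets: string[] = [];
  const values: unknown[] = [];
  for (const [column, value] of Object.entries(patch)) {
    if (value === undefined) continue; // undefined means "leave unchanged"
    values.push(value);
    sets.push(`${column} = $${values.length}`); // placeholder index = position in values
  }
  if (sets.length === 0) return null; // nothing to update
  values.push(id);
  const query =
    `UPDATE ${table} SET ${sets.join(', ')} WHERE id = $${values.length} RETURNING *`;
  return { query, values };
}

console.log(buildUpdate('plans', { title: 'New title', progress_pct: 40 }, 'plan-1'));
```

Pushing the value before computing the placeholder keeps each `$n` aligned with its position in `values`, which is the same invariant the `values.length + 1` expressions maintain above.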
--- a/apps/coder/src/services/provider-snapshot.ts
+++ b/apps/coder/src/services/provider-snapshot.ts
@@ -29,6 +29,22 @@ interface AgentRow {
  last_probed_at: string | Date | null;
 }
 export async function fetchDeepSeekModels(config: Config): Promise<ProviderModel[]> {
  if (!config.DEEPSEEK_API_KEY) return [];
  try {
    const baseURL = (config.DEEPSEEK_BASE_URL ?? 'https://api.deepseek.com').replace(/\/+$/, '');
    const res = await fetch(`${baseURL}/v1/models`, {
      headers: { Authorization: `Bearer ${config.DEEPSEEK_API_KEY}` },
      signal: AbortSignal.timeout(5_000),
    });
    if (!res.ok) return [];
    const parsed = (await res.json()) as { data?: Array<{ id: string }> };
    return (parsed.data ?? []).map((m) => ({ id: m.id, label: m.id }));
  } catch {
    return [];
  }
 }
 export async function fetchLlamaSwapModels(config: Config): Promise<ProviderModel[]> {
  try {
    const res = await fetch(`${config.LLAMA_SWAP_URL}/v1/models`);
@@ -256,7 +272,13 @@ export async function getProviderSnapshot(
  }
  const build = async (): Promise<ProviderSnapshotEntry[]> => {
-    const llamaModels = await fetchLlamaSwapModels(config);
+    const [llamaModels, deepseekModels] = await Promise.all([
      fetchLlamaSwapModels(config),
      fetchDeepSeekModels(config),
    ]);
    // Merge DeepSeek models into the llama-swap model pool so the boocode
    // provider (which sources from llama-swap) also includes DeepSeek models.
    const mergedModels = mergeModels(llamaModels, deepseekModels);
    const agents = await sql<AgentRow[]>`
      SELECT name, install_path, supports_acp, models, commands, label, transport, last_probed_at FROM available_agents
    `;
@@ -265,7 +287,7 @@ export async function getProviderSnapshot(
    const entries = await Promise.all(
      [...getResolvedRegistry().values()].map((resolved) =>
-        buildProviderEntry(resolved, agentMap.get(resolved.id), llamaModels, resolvedCwd, ttlMs, force),
+        buildProviderEntry(resolved, agentMap.get(resolved.id), mergedModels, resolvedCwd, ttlMs, force),
      ),
    );
--- a/apps/server/package.json
+++ b/apps/server/package.json
@@ -77,8 +77,9 @@
    "test": "vitest run"
  },
  "dependencies": {
-    "@boocode/contracts": "workspace:*",
+    "@ai-sdk/deepseek": "^2.0.35",
    "@ai-sdk/openai-compatible": "^2.0.47",
    "@boocode/contracts": "workspace:*",
    "@fastify/static": "^7.0.4",
    "@fastify/websocket": "^10.0.1",
    "@modelcontextprotocol/sdk": "^1.29.0",
--- a/apps/server/src/config.ts
+++ b/apps/server/src/config.ts
@@ -26,6 +26,14 @@ const ConfigSchema = z.object({
  FAST_MODEL: z.string().optional(),
  TASK_MODEL_URL: z.string().url().optional(),
  LLAMA_SIDECAR_URL: z.string().url().optional(),
  // vDeepSeek: DeepSeek API key for direct API access. When set, models
  // with IDs starting with 'deepseek-' route through DeepSeek's API instead
  // of llama-swap. Unset by default, which leaves DeepSeek routing disabled.
  DEEPSEEK_API_KEY: z.string().optional(),
  // Optional base URL override for DeepSeek API. Defaults to api.deepseek.com.
  DEEPSEEK_BASE_URL: z.string().url().default('https://api.deepseek.com'),
  // vWhale hooks: path to hooks JSON config file. Missing file = no hooks.
  HOOKS_CONFIG_PATH: z.string().default('/data/hooks.json'),
 });
 export type Config = z.infer<typeof ConfigSchema>;
--- a/apps/server/src/index.ts
+++ b/apps/server/src/index.ts
@@ -18,8 +18,10 @@ import { registerCoderProxy } from './routes/coder-proxy.js';
 import { registerModelRoutes } from './routes/models.js';
 import { registerAgentRoutes } from './routes/agents.js';
 import { registerSkillsRoutes } from './routes/skills.js';
 import { registerTraceRoutes } from './routes/traces.js';
 import { registerToolsRoutes } from './routes/tools.js';
 import { registerAnalyticsRoutes } from './routes/analytics.js';
 import { registerMemoryRoutes } from './routes/memory.js';
 import { registerInferenceSettingsRoutes } from './routes/inference-settings.js';
 import { createInferenceRunner } from './services/inference/index.js';
 import { createBroker } from './services/broker.js';
@@ -31,6 +33,7 @@ import { loadMcpConfig } from './services/mcp-config.js';
 import { initialize as initMcp, getTools as getMcpTools, shutdown as shutdownMcp } from './services/mcp-client.js';
 import { appendMcpTools } from './services/tools.js';
 import { refreshToolNames, getAgentsForProject } from './services/agents.js';
 import { loadHooksConfig, createHookRunner } from './services/hooks.js';
 async function main() {
  const config = loadConfig();
@@ -123,8 +126,10 @@ async function main() {
  registerAgentRoutes(app, sql);
  registerSidebarRoutes(app, sql);
  registerChatRoutes(app, sql, broker);
  registerTraceRoutes(app, sql);
  registerToolsRoutes(app, sql);
  registerAnalyticsRoutes(app, sql);
  registerMemoryRoutes(app, sql);
  registerInferenceSettingsRoutes(app);
  // Batch 9.6: warm the skills cache at boot and surface the count. Empty or
@@ -136,11 +141,17 @@ async function main() {
    app.log.warn({ err }, 'skills boot walk failed');
  }
  // vWhale hooks: load hook config and create runner. Missing file = no hooks.
  const hooksConfig = loadHooksConfig(config.HOOKS_CONFIG_PATH);
  const hookRunner = createHookRunner();
  const hasHooks = Object.keys(hooksConfig.hooks).length > 0;
  const inference = createInferenceRunner(
    {
      sql,
      config,
      log: app.log,
      hooks: hasHooks ? hookRunner : undefined,
      publish: (sessionId, frame) => {
        // v1.13.11-b: route through the typed publishFrame so the broker's
        // Zod gate validates every inference frame before delivery.
@@ -166,7 +177,7 @@ async function main() {
    // bubble up so the route can reply 500 — manual /compact failures
    // should be loud (the user just clicked a button).
    runCompaction: (chatId) =>
-      compaction.process({ sql, config, log: app.log, broker, chatId }),
+      compaction.process({ sql, config, log: app.log, broker, chatId, hooks: hasHooks ? hookRunner : undefined }),
    cancelInference: async (sessionId, chatId) => {
      return inference.cancel(sessionId, chatId);
    },
--- a/apps/server/src/routes/models.ts
+++ b/apps/server/src/routes/models.ts
@@ -2,26 +2,55 @@ import type { FastifyInstance } from 'fastify';
 import type { Config } from '../config.js';
 import type { ModelInfo } from '../types/api.js';
-interface LlamaSwapModelsResponse {
+interface ApiModelsResponse {
  data?: ModelInfo[];
 }
 const DEEPSEEK_STATIC_MODELS: ModelInfo[] = [
  { id: 'deepseek-v4-flash', object: 'model', created: 0, owned_by: 'deepseek' },
  { id: 'deepseek-v4-pro', object: 'model', created: 0, owned_by: 'deepseek' },
 ];
 export function registerModelRoutes(app: FastifyInstance, config: Config): void {
  app.get('/api/models', async (_req, reply) => {
    const models: ModelInfo[] = [];
    // 1. Fetch llama-swap models
    try {
      const res = await fetch(`${config.LLAMA_SWAP_URL}/v1/models`);
-      if (!res.ok) {
-        reply.code(502);
-        return { error: `llama-swap returned ${res.status}` };
+      if (res.ok) {
+        const parsed = (await res.json()) as ApiModelsResponse;
+        if (parsed.data) models.push(...parsed.data);
      }
-      const parsed = (await res.json()) as LlamaSwapModelsResponse;
-      return parsed.data ?? [];
-    } catch (err) {
-      reply.code(502);
-      return {
-        error: 'failed to reach llama-swap',
-        details: err instanceof Error ? err.message : String(err),
-      };
+    } catch {
+      // llama-swap unreachable: proceed with whatever we have
    }
    // 2. If DeepSeek is configured, fetch live models from their API
    if (config.DEEPSEEK_API_KEY) {
      try {
        const baseURL = (config.DEEPSEEK_BASE_URL ?? 'https://api.deepseek.com').replace(/\/+$/, '');
        const res = await fetch(`${baseURL}/v1/models`, {
          headers: { Authorization: `Bearer ${config.DEEPSEEK_API_KEY}` },
          signal: AbortSignal.timeout(5_000),
        });
        if (res.ok) {
          const parsed = (await res.json()) as ApiModelsResponse;
          if (parsed.data) models.push(...parsed.data);
        } else {
          // API call failed — fall back to static model list
          models.push(...DEEPSEEK_STATIC_MODELS);
        }
      } catch {
        // Network error — fall back to static model list
        models.push(...DEEPSEEK_STATIC_MODELS);
      }
    }
    if (models.length === 0) {
      reply.code(502);
      return { error: 'no models available from any provider' };
    }
    return models;
  });
 }
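
The fallback order in the `/api/models` route can be summarised as a pure function. This is a sketch under stated assumptions: `assembleModels` and `STATIC_FALLBACK` are hypothetical names, `null` stands for "the fetch failed or returned a non-2xx status", and the local `ModelInfo` shape is inferred from the static entries above.

```typescript
// Local stand-in for the ModelInfo type, inferred from DEEPSEEK_STATIC_MODELS above.
interface ModelInfo {
  id: string;
  object: string;
  created: number;
  owned_by: string;
}

// Hypothetical fallback list, same entries as the static list in the route.
const STATIC_FALLBACK: ModelInfo[] = [
  { id: 'deepseek-v4-flash', object: 'model', created: 0, owned_by: 'deepseek' },
  { id: 'deepseek-v4-pro', object: 'model', created: 0, owned_by: 'deepseek' },
];

// `null` means the provider could not be reached or answered with an error status.
function assembleModels(
  llama: ModelInfo[] | null,
  deepseekConfigured: boolean,
  deepseekLive: ModelInfo[] | null,
): ModelInfo[] {
  const models: ModelInfo[] = [];
  if (llama) models.push(...llama); // llama-swap failures are tolerated
  if (deepseekConfigured) {
    // Prefer the live list; fall back to the static entries when the call failed.
    models.push(...(deepseekLive ?? STATIC_FALLBACK));
  }
  return models; // the route replies 502 when this ends up empty
}

console.log(assembleModels(null, true, null).map((m) => m.id));
```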
--- a/apps/server/src/routes/traces.ts
+++ b/apps/server/src/routes/traces.ts
@@ -0,0 +1,38 @@
 import type { FastifyInstance } from 'fastify';
 import type { Sql } from '../db.js';
 import type { ToolTrace } from '../services/tool-traces.js';
 export function registerTraceRoutes(app: FastifyInstance, sql: Sql): void {
  app.get<{ Params: { id: string }; Querystring: { limit?: string; offset?: string } }>(
    '/api/chats/:id/traces',
    async (req, reply) => {
      const chat = await sql`SELECT id FROM chats WHERE id = ${req.params.id}`;
      if (chat.length === 0) {
        reply.code(404);
        return { error: 'chat not found' };
      }
      const limit = Math.min(Math.max(Number(req.query.limit) || 50, 1), 200);
      const offset = Math.max(Number(req.query.offset) || 0, 0);
      const rows = await sql<ToolTrace[]>`
        SELECT * FROM tool_traces
        WHERE chat_id = ${req.params.id}
        ORDER BY started_at ASC
        LIMIT ${limit}
        OFFSET ${offset}
      `;
      const [countRow] = await sql<{ count: number }[]>`
        SELECT count(*)::int AS count FROM tool_traces WHERE chat_id = ${req.params.id}
      `;
      return {
        data: rows,
        total: countRow?.count ?? 0,
        limit,
        offset,
      };
    },
  );
 }
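
The `limit`/`offset` clamping in the traces route is easy to misread, so here it is as a standalone sketch. `parsePaging` is a hypothetical name; the arithmetic is copied from the handler above.

```typescript
// Hypothetical helper: same clamping expressions as the /api/chats/:id/traces handler.
function parsePaging(query: { limit?: string; offset?: string }): {
  limit: number;
  offset: number;
} {
  // Missing, non-numeric, or zero limits fall back to 50; the result is clamped to 1..200.
  const limit = Math.min(Math.max(Number(query.limit) || 50, 1), 200);
  // Missing or non-numeric offsets fall back to 0; negatives are clamped to 0.
  const offset = Math.max(Number(query.offset) || 0, 0);
  return { limit, offset };
}

console.log(parsePaging({ limit: '500', offset: '-3' }));
```

Because of the `|| 50` fallback, `limit=0` yields 50 rather than 1. Fractional inputs pass through unchanged, so a `Math.trunc` could be added if integer limits need to be guaranteed.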
--- a/apps/server/src/routes/ws.ts
+++ b/apps/server/src/routes/ws.ts
@@ -3,6 +3,7 @@ import type { Sql } from '../db.js';
 import type { Broker } from '../services/broker.js';
 import type { Message } from '../types/api.js';
 import { MESSAGE_COLUMNS } from '../services/message-columns.js';
 import { loadAgentSnapshot } from '../services/session-snapshots.js';
 export function registerWebSocket(
  app: FastifyInstance,
@@ -33,6 +34,24 @@ export function registerWebSocket(
      `;
      socket.send(JSON.stringify({ type: 'snapshot', messages }));
      // v2.7.x: on reconnect, restore agent snapshot state so the frontend
      // knows there's an ongoing agent turn. Best-effort per chat; most
      // sessions won't have any snapshots.
      const chats = await sql<{ id: string }[]>`SELECT id FROM chats WHERE session_id = ${sessionId}`;
      for (const chat of chats) {
        const agentSnapshot = await loadAgentSnapshot(sql, chat.id).catch(() => null);
        if (agentSnapshot) {
          socket.send(JSON.stringify({
            type: 'agent_snapshot',
            chat_id: chat.id,
            agent: agentSnapshot.agent,
            model: agentSnapshot.model,
            mode: agentSnapshot.mode,
            turn_number: agentSnapshot.turn_number,
          }));
        }
      }
      const unsubscribe = broker.subscribe(sessionId, (frame) => {
        if (socket.readyState !== socket.OPEN) return;
        try {
--- a/apps/server/src/schema.sql
+++ b/apps/server/src/schema.sql
@@ -32,11 +32,18 @@ CREATE TABLE IF NOT EXISTS messages (
  content TEXT NOT NULL DEFAULT '',
  status TEXT NOT NULL DEFAULT 'complete',
  last_seq INT NOT NULL DEFAULT 0,
  cache_tokens INTEGER,
  reasoning_tokens INTEGER,
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
 );
 CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, created_at);
 -- vDeepSeek: add cache/reasoning token columns early so messages_with_parts
 -- view (defined below) can reference them. IF NOT EXISTS guards re-runs.
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS cache_tokens INTEGER;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS reasoning_tokens INTEGER;
 -- v1.13.0: granular message parts table. v1.13.20: legacy tool_calls/
 -- tool_results columns dropped; message_parts is now the sole source of
 -- truth for tool calls, tool results, and reasoning. ON DELETE CASCADE
@@ -126,8 +133,8 @@ SELECT
     FROM message_parts p
    WHERE p.message_id = m.id AND p.kind = 'reasoning' AND p.hidden_at IS NULL) AS reasoning_parts,
  -- NEW columns MUST be appended at the end: CREATE OR REPLACE VIEW can't
-  -- reorder/rename existing columns (42P16). m.model added last.
+  -- reorder/rename existing columns (42P16). cache_tokens and reasoning_tokens added last.
-  m.model
+  m.model, m.cache_tokens, m.reasoning_tokens
 FROM messages m;
 -- v1.13.20: drop legacy tool_calls/tool_results columns. Reads have routed
@@ -407,3 +414,55 @@ END $$;
 -- Remove the v2.0.5 arena_id column (replaced by the new Arena feature).
 ALTER TABLE tasks DROP COLUMN IF EXISTS arena_id;
 -- v2.x-tool-traces: per-call tool execution records for observability.
 CREATE TABLE IF NOT EXISTS tool_traces (
  id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id       UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
  chat_id          UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
  message_id       UUID REFERENCES messages(id) ON DELETE SET NULL,
  turn_number      INTEGER NOT NULL,
  tool_name        TEXT NOT NULL,
  tool_input       JSONB NOT NULL,
  tool_output      TEXT,
  started_at       TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
  finished_at      TIMESTAMPTZ,
  latency_ms       INTEGER,
  tokens_used      INTEGER,
  cache_tokens     INTEGER,
  reasoning_tokens INTEGER,
  error            TEXT,
  outcome          TEXT,
  created_at       TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
 );
 CREATE INDEX IF NOT EXISTS idx_tool_traces_chat ON tool_traces(chat_id, created_at);
 -- v2.x-tool-traces: active tool call state for in-flight instrumentation.
 CREATE TABLE IF NOT EXISTS tool_trace_states (
  id               UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id       UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
  chat_id          UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
  message_id       UUID REFERENCES messages(id) ON DELETE SET NULL,
  turn_number      INTEGER NOT NULL,
  tool_name        TEXT NOT NULL,
  tool_input       JSONB NOT NULL,
  started_at       TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
 );
 -- agent_snapshots: persistent agent session state for cross-refresh resume.
 CREATE TABLE IF NOT EXISTS agent_snapshots (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
  chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
  model TEXT NOT NULL,
  agent TEXT,
  mode TEXT,
  turn_number INTEGER NOT NULL DEFAULT 0,
  messages JSONB NOT NULL DEFAULT '[]'::jsonb,
  tool_states JSONB NOT NULL DEFAULT '[]'::jsonb,
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
 );
 -- One snapshot per chat: the unique index also serves chat_id lookups, so a
 -- separate non-unique index on the same column is not needed.
 CREATE UNIQUE INDEX IF NOT EXISTS idx_agent_snapshots_chat_unique ON agent_snapshots(chat_id);
--- a/apps/server/src/services/agents.ts
+++ b/apps/server/src/services/agents.ts
@@ -106,6 +106,8 @@ interface ParsedFrontmatter {
  // allowed" — the model responds text-only.
  steps?: number;
  llama_extra_args?: string[];
  // vDeepSeek: thinking effort for DeepSeek V4 models.
  reasoning_effort?: string;
 }
 // P5: table-driven validation for the "soft-range" numeric frontmatter fields.
@@ -386,6 +388,7 @@ function parseAgentSection(section: RawSection): Omit<Agent, 'source'> {
    max_tool_calls: typeof fm.max_tool_calls === 'number' ? fm.max_tool_calls : null,
    steps: typeof fm.steps === 'number' ? fm.steps : null,
    llama_extra_args: Array.isArray(fm.llama_extra_args) ? fm.llama_extra_args : null,
    reasoning_effort: typeof fm.reasoning_effort === 'string' ? (fm.reasoning_effort as Agent['reasoning_effort']) : null,
  };
 }
--- a/apps/server/src/services/boocontext_client.ts
+++ b/apps/server/src/services/boocontext_client.ts
@@ -0,0 +1,110 @@
 /**
 * v2.7.18: shared MCP client wrapper for the boocontext sidecar.
 *
 * Calls into the existing multi-server MCP client infrastructure
 * (services/mcp-client.ts) which connects to boocontext as a stdio
 * MCP process defined in data/mcp.json (server name "boocontext",
 * command: `node /opt/forks/boocontext/dist/standalone.js`).
 *
 * The boocontext MCP server is initialized once at app boot in
 * index.ts via initMcp() and the actual MCP tool call routing is
 * handled by mcp-client.ts:callTool() — this module is a thin
 * convenience wrapper that prepends the "boocontext_" server prefix,
 * normalises the response, and applies inline truncation matching
 * the same pattern as codecontext_client.ts.
 *
 * Usage:
 *   import { callBoocontext } from './services/boocontext_client.js';
 *   const resp = await callBoocontext({
 *     toolName: 'codesight_get_summary',
 *     args: { directory: '/opt/boocode' },
 *   });
 */
 import { callTool } from './mcp-client.js';
 import { truncateIfNeeded } from './truncate.js';
 // ---- Exported types ----
 export interface BoocontextRequest {
  /** Unprefixed tool name as defined on the boocontext MCP server
   * (e.g. "codesight_scan", "boocontext_overview", "codesight_get_summary"). */
  toolName: string;
  /** Arguments to pass to the tool. */
  args: Record<string, unknown>;
 }
 export interface BoocontextResponse {
  /** The tool output text. */
  result: string;
  /** Whether the result was truncated to fit the inline limit. */
  truncated: boolean;
  /** Opaque id pointing at the full pre-slice content on tmpfs, set when
   * truncated=true and storage succeeded. */
  outputPath?: string;
 }
 // ---- Constants ----
 /** Must match the server name in data/mcp.json. */
 const BOOCONTEXT_SERVER_NAME = 'boocontext';
 /** Inline truncation limit, matching codecontext_client.ts. */
 const TRUNCATION_LIMIT = 32_000;
 // ---- Public API ----
 /**
 * Call a boocontext MCP tool by its unprefixed name.
 *
 * Prepends the "boocontext_" server prefix, delegates to the
 * multi-server MCP client's callTool(), and normalises the response
 * into a BoocontextResponse with inline truncation.
 *
 * @param req   The tool name and arguments.
 * @param log   Optional Fastify-compatible logger (for debug traces).
 * @returns     The tool result, possibly truncated.
 * @throws      If the boocontext server is not connected or the tool
 *              returns an MCP-level error.
 */
 export async function callBoocontext(
  req: BoocontextRequest,
  log?: { debug?: (obj: object, msg: string) => void; warn?: (obj: object, msg: string) => void },
 ): Promise<BoocontextResponse> {
  const prefixedName = `${BOOCONTEXT_SERVER_NAME}_${req.toolName}`;
  log?.debug?.({ tool: prefixedName }, 'boocontext: calling tool');
  const raw = await callTool(prefixedName, req.args);
  // callTool returns { error: true, output: string } on failure (both
  // for MCP-level isError and for network/protocol exceptions).
  if (typeof raw === 'object' && raw !== null && (raw as Record<string, unknown>).error === true) {
    const errOutput = (raw as Record<string, unknown>).output ?? 'Unknown MCP error';
    throw new Error(`boocontext error: ${String(errOutput)}`);
  }
  const result = typeof raw === 'string' ? raw : JSON.stringify(raw);
  // Inline truncation at 32,000 characters (TRUNCATION_LIMIT), matching codecontext_client.ts.
  // The model gets a clear hint about how to narrow the next call
  // rather than a silent cut.
  if (result.length > TRUNCATION_LIMIT) {
    const truncated = result.slice(0, TRUNCATION_LIMIT);
    const omitted = result.length - TRUNCATION_LIMIT;
    const slicedWithMarker =
      `${truncated}\n\n[truncated, ${omitted} chars omitted; narrow with additional filters]`;
    const wrapped = await truncateIfNeeded({
      fullContent: result,
      slicedContent: slicedWithMarker,
      wasTruncated: true,
    });
    return {
      result: wrapped.content,
      truncated: wrapped.truncated,
      ...(wrapped.outputPath ? { outputPath: wrapped.outputPath } : {}),
    };
  }
  return { result, truncated: false };
 }
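Review note: the inline truncation branch of callBoocontext can be pictured in isolation with the sketch below. It is a minimal illustration only. `truncateInline` and `InlineResult` are hypothetical names, and the sketch leaves out the tmpfs storage that the real code delegates to truncateIfNeeded().

```typescript
// Hypothetical standalone helper, not part of the diff. It mirrors only the
// slice-and-marker step; storing the full content is out of scope here.
interface InlineResult {
  result: string;
  truncated: boolean;
}

function truncateInline(full: string, limit = 32_000): InlineResult {
  // Short results pass through untouched.
  if (full.length <= limit) return { result: full, truncated: false };
  // The limit counts string length (UTF-16 code units), not bytes.
  const omitted = full.length - limit;
  const marker = `[truncated, ${omitted} chars omitted; narrow with additional filters]`;
  return { result: `${full.slice(0, limit)}\n\n${marker}`, truncated: true };
}
```

The marker tells the model how much was cut, so it can retry with narrower filters instead of reasoning over a silently shortened result.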
--- a/apps/server/src/services/codecontext_client.ts
+++ b/apps/server/src/services/codecontext_client.ts
@@ -1,3 +1,10 @@
 // DEPRECATED (Phase 4, Domain 2, v2.8.14): This HTTP client routes through
 // the Go codecontext sidecar (http://codecontext:8080). Superseded by the
 // boocontext MCP server. New callers should use boocontext MCP tool wrappers
 // directly. Keep this file for backward compatibility — the 16 existing
 // codecontext tool wrappers (under tools/codecontext/) still call through
 // callCodecontext(). Remove after full migration.
 //
 // v1.12 Track B.2: shared HTTP client for the codecontext sidecar. The 8
 // per-tool wrappers under tools/codecontext/ all funnel through callCodecontext
 // — they're thin adapters that supply toolName + args + projectPath. The
@@ -19,6 +26,7 @@
 import { access, copyFile, realpath } from 'node:fs/promises';
 import { isAbsolute, join, resolve, sep } from 'node:path';
 import { truncateIfNeeded } from './truncate.js';
 import { callBoocontext } from './boocontext_client.js';
 // v1.13.12 fix: codecontext crashes on empty source files (upstream issue #37)
 // when it can't ignore them. The .codecontextignore.template ships with the
@@ -112,6 +120,16 @@ export async function callCodecontext(
  req: CodecontextRequest,
  fetcher: typeof fetch = fetch,
 ): Promise<CodecontextResponse> {
  // Phase 4: try boocontext MCP first. Falls back to the HTTP sidecar if the
  // MCP server is not available or the tool doesn't exist there.
  try {
    return await callBoocontext({ toolName: req.toolName, args: req.args });
  } catch (err) {
    console.warn(
      `[codecontext_client] boocontext MCP unavailable for "${req.toolName}", falling back to HTTP sidecar: ${err instanceof Error ? err.message : String(err)}`,
    );
  }
  // Step 1: realpath the project root, then realpath the requested target_dir
  // (defaulting to projectPath when the caller didn't pass one — the 12 wrappers
  // never pass target_dir; tests can override). A non-existent target_dir
--- a/apps/server/src/services/compaction.ts
+++ b/apps/server/src/services/compaction.ts
@@ -24,6 +24,8 @@ import { SUMMARY_TEMPLATE } from './compaction-prompt.js';
 import * as modelContextLookup from './model-context.js';
 import { SENTINEL_KINDS } from './inference/sentinels.js';
 import type { OpenAiMessage } from './inference/payload.js';
 import { resolveModelEndpoint } from './inference/provider.js';
 import type { HookRunner } from './hooks.js';
 // v1.13.9: ratio-only overflow trigger. Fires compaction at 85% of ctx_max
 // (opencode session/overflow.ts pattern). Replaces the v1.11.0-era
@@ -346,20 +348,22 @@ interface CompletionResult {
  completionTokens: number;
 }
-async function callLlamaSwap(
+async function callLlm(
  config: Config,
  model: string,
  messages: OpenAiMessage[],
  log: FastifyBaseLogger,
 ): Promise<CompletionResult> {
-  const res = await fetch(`${config.LLAMA_SWAP_URL}/v1/chat/completions`, {
+  const { url, headers, model: resolvedModel } = resolveModelEndpoint(config, model);
  const res = await fetch(`${url}/v1/chat/completions`, {
    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
+    headers,
-    body: JSON.stringify({ model, messages, stream: false }),
+    body: JSON.stringify({ model: resolvedModel, messages, stream: false }),
  });
  if (!res.ok) {
    const text = await res.text().catch(() => '');
-    throw new Error(`llama-swap returned ${res.status}: ${text.slice(0, 200)}`);
+    const prefix = model.startsWith('deepseek-') && config.DEEPSEEK_API_KEY ? 'deepseek' : 'llama-swap';
    throw new Error(`${prefix} returned ${res.status}: ${text.slice(0, 200)}`);
  }
  const json = (await res.json()) as {
    choices?: Array<{ message?: { content?: string } }>;
@@ -383,6 +387,8 @@ export interface ProcessInput {
  log: FastifyBaseLogger;
  broker: Broker;
  chatId: string;
  /** vWhale: lifecycle hooks runner. Undefined when no hooks configured. */
  hooks?: HookRunner;
 }
 // Runs one round of anchored rolling compaction on `chatId`. No-ops cleanly
@@ -497,6 +503,17 @@ export async function process(input: ProcessInput): Promise<void> {
    at: new Date().toISOString(),
  });
  // vWhale: PreCompact hook (best-effort, non-blocking).
  const msgBefore = messages.length;
  if (input.hooks) {
    input.hooks.run('PreCompact', {
      event: 'PreCompact',
      session_id: sessionId,
      chat_id: chatId,
      messages_before: msgBefore,
    }).catch(() => {});
  }
  // try/finally so the dot ALWAYS drops back to idle, even if the LLM call
  // throws or a downstream DB write fails. The succeeded flag gates the
  // 'compacted' frame + final log: we only signal completion to the UI when
@@ -506,7 +523,7 @@ export async function process(input: ProcessInput): Promise<void> {
  let result: CompletionResult | undefined;
  try {
    // 7. Single completion (no tools). Throws when the upstream (llama-swap
    // or DeepSeek) returns a non-2xx response.
-    result = await callLlamaSwap(config, session.model, payload, log);
+    result = await callLlm(config, session.model, payload, log);
    // 7b. v1.11.3: fetch the model's true context window from llama-swap's
    // /upstream/<model>/props (the streaming completion doesn't carry it).
@@ -558,6 +575,18 @@ export async function process(input: ProcessInput): Promise<void> {
    `;
    succeeded = true;
    // vWhale: PostCompact hook (best-effort, non-blocking).
    if (input.hooks) {
      input.hooks.run('PostCompact', {
        event: 'PostCompact',
        session_id: sessionId,
        chat_id: chatId,
        messages_before: msgBefore,
        messages_after: sel.head.length,
        summary: (result?.content ?? '').slice(0, 500),
      }).catch(() => {});
    }
  } finally {
    // Always restore the dot. Status='idle' (not 'error') even on failure —
    // the caller logs/re-surfaces the error separately; the dot doesn't
--- a/apps/server/src/services/hooks.ts
+++ b/apps/server/src/services/hooks.ts
@@ -0,0 +1,299 @@
 /**
 * vWhale: lifecycle hook runner. Hooks are shell commands that fire at key
 * points in the inference pipeline. Each hook receives a JSON payload on
 * stdin and can return JSON on stdout to influence behavior.
 *
 * Inspired by Whale's hook system with 11 lifecycle events. BooCode
 * implements the most relevant subset: PreToolUse, PostToolUse,
 * UserPromptSubmit, Stop, PreCompact, PostCompact.
 *
 * Config: JSON file at HOOKS_CONFIG_PATH (default /data/hooks.json).
 * Format:
 * ```json
 * {
 *   "hooks": {
 *     "PreToolUse": [
 *       { "match": "shell_run", "command": "python3 /data/hooks/check_shell.py", "timeout": 30 }
 *     ],
 *     "Stop": [
 *       { "command": "node /data/hooks/log_turn.mjs" }
 *     ]
 *   }
 * }
 * ```
 */
 import { spawn } from 'node:child_process';
 import { readFileSync, existsSync } from 'node:fs';
 import type { FastifyBaseLogger } from 'fastify';
 // ─── Events ───────────────────────────────────────────────────────────────
 export type HookEvent =
  | 'PreToolUse'
  | 'PostToolUse'
  | 'UserPromptSubmit'
  | 'Stop'
  | 'PreCompact'
  | 'PostCompact';
 const ALL_EVENTS: HookEvent[] = [
  'PreToolUse',
  'PostToolUse',
  'UserPromptSubmit',
  'Stop',
  'PreCompact',
  'PostCompact',
 ];
 // ─── Config ────────────────────────────────────────────────────────────────
 export interface HookConfig {
  /** Exact tool name to match (PreToolUse/PostToolUse only). Omit or '*' for all.
   * The runner compares names with ===; glob patterns are not expanded. */
  match?: string;
  /** Shell command to run. Receives JSON payload on stdin. */
  command: string;
  /** Timeout in seconds (default 30). */
  timeout?: number;
 }
 export interface HooksConfig {
  hooks: Partial<Record<HookEvent, HookConfig[]>>;
 }
 // ─── Payloads ──────────────────────────────────────────────────────────────
 export interface PreToolUsePayload {
  event: 'PreToolUse';
  session_id: string;
  tool_name: string;
  tool_args: Record<string, unknown>;
 }
 export interface PostToolUsePayload {
  event: 'PostToolUse';
  session_id: string;
  tool_name: string;
  tool_args: Record<string, unknown>;
  tool_result: unknown;
  tool_error?: string;
 }
 export interface UserPromptSubmitPayload {
  event: 'UserPromptSubmit';
  session_id: string;
  chat_id: string;
  prompt: string;
 }
 export interface StopPayload {
  event: 'Stop';
  session_id: string;
  chat_id: string;
  last_assistant_text: string;
  turn: number;
 }
 export interface PreCompactPayload {
  event: 'PreCompact';
  session_id: string;
  chat_id: string;
  messages_before: number;
 }
 export interface PostCompactPayload {
  event: 'PostCompact';
  session_id: string;
  chat_id: string;
  messages_before: number;
  messages_after: number;
  summary: string;
 }
 export type HookPayload =
  | PreToolUsePayload
  | PostToolUsePayload
  | UserPromptSubmitPayload
  | StopPayload
  | PreCompactPayload
  | PostCompactPayload;
 // ─── Response ──────────────────────────────────────────────────────────────
 export type HookDecision = 'pass' | 'warn' | 'block';
 export interface HookResponse {
  decision?: HookDecision;
  reason?: string;
  /** When present, replaces the original tool args / user prompt. */
  updated_input?: Record<string, unknown> | string;
  /** Injected into the model's context for the next turn. */
  additional_context?: string;
 }
 // ─── Runner ────────────────────────────────────────────────────────────────
 export interface HookRunner {
  /** Run all hooks for the given event. Returns the effective response. */
  run(event: HookEvent, payload: HookPayload, log?: FastifyBaseLogger): Promise<HookResponse>;
 }
 let hooksConfig: HooksConfig | null = null;
 let hooksPath: string | null = null;
 /** Load hooks config from disk. Missing file = no hooks. Never throws. */
 export function loadHooksConfig(path: string): HooksConfig {
  hooksPath = path;
  if (!existsSync(path)) {
    hooksConfig = { hooks: {} };
    return hooksConfig;
  }
  try {
    const raw = readFileSync(path, 'utf8');
    const parsed = JSON.parse(raw) as HooksConfig;
    hooksConfig = {
      hooks: { ...parsed.hooks },
    };
    // Validate event names
    for (const event of Object.keys(hooksConfig.hooks)) {
      if (!ALL_EVENTS.includes(event as HookEvent)) {
        console.warn(`hooks: unknown event '${event}' in ${path} — ignoring`);
        delete hooksConfig.hooks[event as HookEvent];
      }
    }
  } catch (err) {
    console.error(`hooks: failed to load ${path}`, err);
    hooksConfig = { hooks: {} };
  }
  return hooksConfig;
 }
 /** Reload the config file (call after a PATCH). */
 export function reloadHooksConfig(): HooksConfig {
  if (hooksPath) return loadHooksConfig(hooksPath);
  hooksConfig = { hooks: {} };
  return hooksConfig;
 }
 function getConfig(): HooksConfig {
  return hooksConfig ?? { hooks: {} };
 }
 /** Create a HookRunner for the current config. */
 export function createHookRunner(): HookRunner {
  return {
    async run(event, payload, log): Promise<HookResponse> {
      const configs = getConfig().hooks[event];
      if (!configs || configs.length === 0) return { decision: 'pass' };
      // Pre-filter by match pattern for tool events
      const toolName = 'tool_name' in payload ? (payload as PreToolUsePayload).tool_name : undefined;
      let effective: HookResponse = { decision: 'pass' };
      for (const cfg of configs) {
        // Skip if match doesn't apply
        if (toolName && cfg.match && cfg.match !== '*' && cfg.match !== toolName) continue;
        const result = await runSingleHook(cfg, payload, log);
        // Merge decisions: block > warn > pass
        if (result.decision === 'block') {
          effective = { ...result, decision: 'block' };
          break; // block is terminal
        }
        if (result.decision === 'warn') {
          // Copy only decision and reason. Spreading `result` here would drop
          // context accumulated from earlier hooks and duplicate this hook's
          // additional_context in the merge below.
          effective.decision = 'warn';
          effective.reason = result.reason;
        }
        // Merge additional_context and updated_input
        if (result.additional_context) {
          effective.additional_context = effective.additional_context
            ? effective.additional_context + '\n' + result.additional_context
            : result.additional_context;
        }
        if (result.updated_input && !effective.updated_input) {
          effective.updated_input = result.updated_input;
        }
      }
      return effective;
    },
  };
 }
 async function runSingleHook(
  cfg: HookConfig,
  payload: HookPayload,
  log?: FastifyBaseLogger,
 ): Promise<HookResponse> {
  const timeoutMs = (cfg.timeout ?? 30) * 1000;
  return new Promise((resolve) => {
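    // No spawn-level `timeout` here: the timer below is the single timeout
    // authority, so a timed-out hook always resolves as 'hook timed out'
    // instead of racing Node's own kill and surfacing as 'hook exited null'.
    const child = spawn('sh', ['-c', cfg.command], {
      stdio: ['pipe', 'pipe', 'pipe'],
      env: { ...process.env },
    });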
    const stdout: Buffer[] = [];
    const stderr: Buffer[] = [];
    child.stdout.on('data', (chunk: Buffer) => stdout.push(chunk));
    child.stderr.on('data', (chunk: Buffer) => stderr.push(chunk));
    let settled = false;
    const timer = setTimeout(() => {
      if (!settled) {
        settled = true;
        child.kill('SIGTERM');
        log?.warn({ event: payload.event, command: cfg.command }, 'hooks: timeout');
        resolve({ decision: 'warn', reason: 'hook timed out' });
      }
    }, timeoutMs);
    child.on('error', (err) => {
      if (!settled) {
        settled = true;
        clearTimeout(timer);
        log?.warn({ err, event: payload.event }, 'hooks: spawn error');
        resolve({ decision: 'warn', reason: `hook failed: ${err.message}` });
      }
    });
    child.on('close', (code) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      const out = Buffer.concat(stdout).toString('utf8').trim();
      const errOut = Buffer.concat(stderr).toString('utf8').trim();
      if (code !== 0 && !out) {
        log?.warn({ event: payload.event, code, stderr: errOut.slice(0, 200) }, 'hooks: non-zero exit');
        resolve({ decision: 'warn', reason: `hook exited ${code}` });
        return;
      }
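      // Parse stdout as a JSON response. Non-JSON output, or JSON that is
      // not a plain object, is treated as pass with stdout as context.
      if (out) {
        try {
          const parsed: unknown = JSON.parse(out);
          if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
            resolve(parsed as HookResponse);
            return;
          }
        } catch {
          // Not JSON; fall through to the plain-text path.
        }
        resolve({ decision: 'pass', additional_context: out });
        return;
      }
      resolve({ decision: 'pass' });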
    });
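    // Write payload to stdin. A hook that exits without reading stdin makes
    // the write fail with EPIPE; swallow it so an unhandled stream 'error'
    // cannot crash the server. The 'close' handler still settles the promise.
    child.stdin.on('error', () => {});
    child.stdin.write(JSON.stringify(payload));
    child.stdin.end();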
  });
 }
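Review note: the precedence rule used by the runner (block > warn > pass, with context from every hook concatenated) is easier to check as a pure function. The sketch below is a simplified restatement, not the code in hooks.ts. `mergeResponses` and the trimmed `Resp` type are hypothetical names, and `updated_input` handling is omitted.

```typescript
// Hypothetical pure-function restatement of the merge rule; assumes hook
// results arrive in config order and that 'block' short-circuits.
type Decision = 'pass' | 'warn' | 'block';

interface Resp {
  decision?: Decision;
  reason?: string;
  additional_context?: string;
}

function mergeResponses(results: Resp[]): Resp {
  const effective: Resp = { decision: 'pass' };
  const context: string[] = [];
  for (const r of results) {
    // A block is terminal: later hooks are never consulted.
    if (r.decision === 'block') return { ...r, decision: 'block' };
    // A warn upgrades the decision but keeps context gathered so far.
    if (r.decision === 'warn') {
      effective.decision = 'warn';
      effective.reason = r.reason;
    }
    if (r.additional_context) context.push(r.additional_context);
  }
  if (context.length > 0) effective.additional_context = context.join('\n');
  return effective;
}
```

Keeping the merge separate from process spawning would also make it straightforward to unit-test without shelling out.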
--- a/apps/server/src/services/inference/error-handler.ts
+++ b/apps/server/src/services/inference/error-handler.ts
@@ -122,6 +122,8 @@ export async function finalizeStreamedRow(
    completionTokens: number | null;
    promptTokens: number | null;
    startedAt: string | null;
    cacheTokens?: number | null;
    reasoningTokens?: number | null;
    beforeComplete?: () => Promise<void>;
  },
 ): Promise<void> {
@@ -137,6 +139,8 @@ export async function finalizeStreamedRow(
        tokens_used = ${opts.completionTokens},
        ctx_used = ${opts.promptTokens},
        ctx_max = ${nCtx},
        cache_tokens = ${opts.cacheTokens ?? null},
        reasoning_tokens = ${opts.reasoningTokens ?? null},
        finished_at = clock_timestamp()
    WHERE id = ${opts.messageId}
    RETURNING tokens_used, ctx_used, ctx_max, finished_at
@@ -149,6 +153,8 @@ export async function finalizeStreamedRow(
    tokens_used: updated?.tokens_used ?? null,
    ctx_used: updated?.ctx_used ?? null,
    ctx_max: updated?.ctx_max ?? null,
    cache_tokens: opts.cacheTokens ?? null,
    reasoning_tokens: opts.reasoningTokens ?? null,
    started_at: opts.startedAt,
    finished_at: updated?.finished_at ?? null,
    model: opts.model,
@@ -188,7 +194,7 @@ export async function finalizeCompletion(
 ): Promise<void> {
  const { sessionId, chatId, assistantMessageId } = args;
  const content = stripToolMarkup(result.content, { final: true });
-  const { finishReason, promptTokens, completionTokens } = result;
+  const { finishReason, promptTokens, completionTokens, cacheReadTokens, reasoningTokens } = result;
  // v1.11.3: see executeToolPhase for the rationale.
  const mctx = await modelContext.getModelContext(session.model);
@@ -203,6 +209,8 @@ export async function finalizeCompletion(
        tokens_used = ${completionTokens},
        ctx_used = ${promptTokens},
        ctx_max = ${nCtx},
        cache_tokens = ${cacheReadTokens ?? null},
        reasoning_tokens = ${reasoningTokens ?? null},
        model = ${session.model},
        finished_at = clock_timestamp()
    WHERE id = ${assistantMessageId}
@@ -268,6 +276,8 @@ export async function finalizeCompletion(
    tokens_used: updated?.tokens_used ?? null,
    ctx_used: updated?.ctx_used ?? null,
    ctx_max: updated?.ctx_max ?? null,
    cache_tokens: cacheReadTokens ?? null,
    reasoning_tokens: reasoningTokens ?? null,
    started_at: startedAt,
    finished_at: updated?.finished_at ?? null,
    model: session.model,
--- a/apps/server/src/services/inference/provider.ts
+++ b/apps/server/src/services/inference/provider.ts
@@ -1,4 +1,5 @@
 import { createOpenAICompatible } from '@ai-sdk/openai-compatible';
 import { createDeepSeek } from '@ai-sdk/deepseek';
 import type { LanguageModel } from 'ai';
 // v1.13.1-A: AI SDK provider against llama-swap. baseURL is threaded from
@@ -11,6 +12,12 @@ import type { LanguageModel } from 'ai';
 // llama-sidecar instead. A fresh provider is created per call (not cached)
 // because the X-Agent-Flags header varies per agent. The llama-swap path
 // stays cached since it has no per-request headers.
 //
 // vDeepSeek: when the model ID starts with 'deepseek-' and DEEPSEEK_API_KEY
 // is set, route through the official @ai-sdk/deepseek provider (not
 // openai-compatible) so DeepSeek-specific features work: providerMetadata
 // with promptCacheHitTokens/promptCacheMissTokens, reasoning via
 // LanguageModelV4Usage.outputTokens.reasoning, and thinking-mode options.
 const swapCache = new Map<string, ReturnType<typeof createOpenAICompatible>>();
@@ -41,7 +48,28 @@ function sidecarProvider(
  });
 }
-export type InferenceRoute = 'swap' | 'sidecar';
+const DEEPSEEK_MODEL_PREFIX = 'deepseek-';
 export function isDeepSeekModel(modelId: string): boolean {
  return modelId.startsWith(DEEPSEEK_MODEL_PREFIX);
 }
 let deepseekProviderCache: ReturnType<typeof createDeepSeek> | null = null;
 function getDeepSeekProvider(
  apiKey: string,
  baseURL: string,
 ): ReturnType<typeof createDeepSeek> {
  if (!deepseekProviderCache) {
    deepseekProviderCache = createDeepSeek({
      apiKey,
      baseURL,
    });
  }
  return deepseekProviderCache;
 }
 export type InferenceRoute = 'swap' | 'sidecar' | 'deepseek';
 export interface RoutingInfo {
  route: InferenceRoute;
@@ -55,12 +83,21 @@ interface AgentLike {
 interface ConfigLike {
  LLAMA_SWAP_URL: string;
  LLAMA_SIDECAR_URL?: string;
  DEEPSEEK_API_KEY?: string;
  DEEPSEEK_BASE_URL?: string;
 }
 export function resolveRoute(
  agent: AgentLike | null,
  config?: ConfigLike,
  modelId?: string,
 ): RoutingInfo {
  // vDeepSeek: if the model starts with deepseek- and DEEPSEEK_API_KEY is set,
  // route through the DeepSeek provider. Checked first so DeepSeek models
  // always bypass llama-swap/sidecar even when those are also configured.
  if (modelId?.startsWith(DEEPSEEK_MODEL_PREFIX) && config?.DEEPSEEK_API_KEY) {
    return { route: 'deepseek', flags: null };
  }
  // When llama_extra_args are explicitly set, route through sidecar with them.
  const flags = agent?.llama_extra_args;
  if (flags && flags.length > 0) {
@@ -80,7 +117,13 @@ export function upstreamModel(
  modelId: string,
  agent?: AgentLike | null,
 ): LanguageModel {
-  const { route, flags } = resolveRoute(agent ?? null, config);
+  const { route, flags } = resolveRoute(agent ?? null, config, modelId);
  if (route === 'deepseek') {
    return getDeepSeekProvider(
      config.DEEPSEEK_API_KEY!,
      config.DEEPSEEK_BASE_URL ?? 'https://api.deepseek.com',
    ).chat(modelId);
  }
  if (route === 'sidecar') {
    const url = config.LLAMA_SIDECAR_URL;
    if (!url) {
@@ -90,3 +133,30 @@ export function upstreamModel(
  }
  return getSwapProvider(config.LLAMA_SWAP_URL).chatModel(modelId);
 }
 /** Resolve the API endpoint for non-streaming calls (compaction, task-model).
 *  Returns the URL + model + optional auth header for direct fetch() usage. */
 export function resolveModelEndpoint(
  config: ConfigLike,
  modelId: string,
 ): { url: string; model: string; headers: Record<string, string> } {
  const baseHeaders: Record<string, string> = { 'Content-Type': 'application/json' };
  if (modelId.startsWith(DEEPSEEK_MODEL_PREFIX) && config.DEEPSEEK_API_KEY) {
    const baseURL = (config.DEEPSEEK_BASE_URL ?? 'https://api.deepseek.com').replace(/\/+$/, '');
    return {
      url: baseURL,
      model: modelId,
      headers: { ...baseHeaders, Authorization: `Bearer ${config.DEEPSEEK_API_KEY}` },
    };
  }
  return {
    url: config.LLAMA_SWAP_URL.replace(/\/+$/, ''),
    model: modelId,
    headers: baseHeaders,
  };
 }
 /** Invalidate the cached DeepSeek provider (e.g. when env vars change at runtime). */
 export function resetDeepSeekProvider(): void {
  deepseekProviderCache = null;
 }
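Review note: the routing that resolveModelEndpoint and its caller in compaction.ts perform together can be sketched as one self-contained function. This is an illustration under stated assumptions. `resolveCompletionUrl` and `EndpointConfig` are hypothetical names, and the `/v1/chat/completions` suffix is taken from the callLlm hunk above rather than from provider.ts.

```typescript
// Hypothetical sketch: combines endpoint resolution with the path suffix
// that the compaction caller appends. Not the code in provider.ts.
interface EndpointConfig {
  LLAMA_SWAP_URL: string;
  DEEPSEEK_API_KEY?: string;
  DEEPSEEK_BASE_URL?: string;
}

function resolveCompletionUrl(
  config: EndpointConfig,
  modelId: string,
): { url: string; headers: Record<string, string> } {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  // DeepSeek routing needs both the model prefix and a configured key;
  // without the key, deepseek-* models fall through to llama-swap.
  const useDeepSeek = modelId.startsWith('deepseek-') && Boolean(config.DEEPSEEK_API_KEY);
  const base = useDeepSeek
    ? config.DEEPSEEK_BASE_URL ?? 'https://api.deepseek.com'
    : config.LLAMA_SWAP_URL;
  if (useDeepSeek) headers.Authorization = `Bearer ${config.DEEPSEEK_API_KEY}`;
  // Strip trailing slashes so the joined URL never contains '//'.
  return { url: `${base.replace(/\/+$/, '')}/v1/chat/completions`, headers };
}
```

Note the fallthrough case: a `deepseek-` model ID without DEEPSEEK_API_KEY is sent to llama-swap, so error messages should name the upstream that was actually called.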
--- a/apps/server/src/services/inference/stream-phase-adapter.ts
+++ b/apps/server/src/services/inference/stream-phase-adapter.ts
@@ -13,7 +13,7 @@ import type { OpenAiMessage } from './payload.js';
 import { extractToolCallBlocks } from './tool-call-parser.js';
 import { classifyStreamError } from './stream-error-classifier.js';
 import type { StreamResult } from './types.js';
-import { upstreamModel } from './provider.js';
+import { isDeepSeekModel, upstreamModel } from './provider.js';
 import {
  jsonSchema,
  streamText,
@@ -51,6 +51,9 @@ export interface StreamOptions {
  dry_base?: number | null;
  dry_allowed_length?: number | null;
  dry_penalty_last_n?: number | null;
  // vDeepSeek: thinking/reasoning effort. Maps to DeepSeek's reasoning_effort
  // API param for deepseek-v4-flash / deepseek-v4-pro models.
  reasoning_effort?: 'off' | 'low' | 'medium' | 'high' | 'xhigh' | 'max';
 }
 // P5: the 10-field sampler-options literal that was copy-pasted at 4 sites
@@ -74,6 +77,7 @@ export function samplerOptsFromAgent(agent: Agent | null): SamplerOpts {
    dry_base: agent?.dry_base ?? undefined,
    dry_allowed_length: agent?.dry_allowed_length ?? undefined,
    dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined,
    reasoning_effort: agent?.reasoning_effort ?? undefined,
  };
 }
@@ -272,6 +276,19 @@ export async function streamCompletion(
  // before this. They now go through the same extraBody path as the new params.
  const samplerBody = buildSamplerProviderOptions(opts);
  // vDeepSeek: build providerOptions.deepseek for DeepSeek V4 models.
  let deepseekProviderOptions:
    | { thinking: { type: 'enabled' | 'disabled' }; reasoningEffort?: 'low' | 'medium' | 'high' | 'xhigh' | 'max' }
    | undefined;
  if (isDeepSeekModel(model)) {
    const dsEffort = opts.reasoning_effort;
    const thinkingEnabled = dsEffort && dsEffort !== 'off';
    deepseekProviderOptions = {
      thinking: { type: thinkingEnabled ? 'enabled' : 'disabled' },
      ...(thinkingEnabled ? { reasoningEffort: dsEffort } : {}),
    };
  }
  // F6: per-chunk stall deadline. If the model stops emitting chunks for
  // STALL_TIMEOUT_MS the stallAc fires through AbortSignal.any; the post-loop
  // abort check below then throws AbortError → handleAbortOrError writes
@@ -297,7 +314,14 @@ export async function streamCompletion(
    ...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}),
    ...(typeof opts.top_p === 'number' ? { topP: opts.top_p } : {}),
    ...(typeof opts.presence_penalty === 'number' ? { presencePenalty: opts.presence_penalty } : {}),
-    ...(samplerBody ? { providerOptions: { openaiCompatible: samplerBody } } : {}),
+    ...(samplerBody || deepseekProviderOptions
      ? {
          providerOptions: {
            ...(samplerBody ? { openaiCompatible: samplerBody } : {}),
            ...(deepseekProviderOptions ? { deepseek: deepseekProviderOptions } : {}),
          },
        }
      : {}),
    abortSignal: effectiveSignal,
  });
@@ -401,12 +425,26 @@ export async function streamCompletion(
  // Usage lands as a promise on the result; awaiting after fullStream is
  // drained is safe. AI SDK v6 names: `inputTokens` / `outputTokens`.
  // Some providers (llama-swap via openai-compatible) return plain numbers;
  // others (deepseek via @ai-sdk/deepseek) return {total, cacheRead, noCache, ...}.
  let promptTokens: number | null = null;
  let completionTokens: number | null = null;
  let cacheReadTokens: number | null = null;
  let reasoningTokens: number | null = null;
  try {
    const usage = await result.usage;
-    if (typeof usage.inputTokens === 'number') promptTokens = usage.inputTokens;
+    if (typeof usage.inputTokens === 'number') {
-    if (typeof usage.outputTokens === 'number') completionTokens = usage.outputTokens;
+      promptTokens = usage.inputTokens;
    } else if (usage.inputTokens && typeof usage.inputTokens === 'object') {
      promptTokens = (usage.inputTokens as Record<string, number | undefined>).total ?? null;
      cacheReadTokens = (usage.inputTokens as Record<string, number | undefined>).cacheRead ?? null;
    }
    if (typeof usage.outputTokens === 'number') {
      completionTokens = usage.outputTokens;
    } else if (usage.outputTokens && typeof usage.outputTokens === 'object') {
      completionTokens = (usage.outputTokens as Record<string, number | undefined>).total ?? null;
      reasoningTokens = (usage.outputTokens as Record<string, number | undefined>).reasoning ?? null;
    }
  } catch {
    // Some providers omit usage on partial streams; leave all four counters null.
  }
@@ -422,6 +460,13 @@ export async function streamCompletion(
    );
  }
  if (cacheReadTokens !== null || reasoningTokens !== null) {
    ctx.log.debug(
      { promptTokens, completionTokens, cacheReadTokens, reasoningTokens, model },
      'streamCompletion: deepseek usage breakdown',
    );
  }
  return {
    finishReason,
    content,
@@ -429,6 +474,10 @@ export async function streamCompletion(
    promptTokens,
    completionTokens,
    reasoning: reasoningAccumulated,
    // vDeepSeek: optional usage breakdown populated when the provider returns
    // structured usage (cache hit tokens, reasoning tokens).
    cacheReadTokens: cacheReadTokens ?? undefined,
    reasoningTokens: reasoningTokens ?? undefined,
  };
  } finally {
    // Clear the stall timer whether the stream completes normally, throws, or
--- a/apps/server/src/services/inference/tool-input-repair.ts
+++ b/apps/server/src/services/inference/tool-input-repair.ts
@@ -0,0 +1,179 @@
 /**
 * vWhale: schema-based tool input repair. When the model emits tool call args
 * that don't match the expected types (common with weaker models), apply
 * heuristic repairs before falling through to the Zod parse.
 *
 * Inspired by Whale's RepairToolInputForSpec:
 *   - Coerce string "true"/"false" → boolean
 *   - Unwrap markdown autolinks in string fields: <file:///path> → /path
 *   - Wrap bare values in arrays when schema expects array
 *   - Coerce numeric strings to numbers for integer/number fields: "42.0" → 42
 *   - Recurse into objects to repair nested properties
 */
 export interface ToolInputRepair {
  field: string;
  kind: string;
  detail: string;
 }
 const MARKDOWN_AUTOLINK_RE = /^<(?:file|path):\/\/(.+?)>$/;
 /**
 * Attempt to repair tool call args against the tool's JSON Schema.
 * Returns the (possibly modified) args plus a list of repairs applied.
 */
 export function repairToolInput(
  schema: Record<string, unknown> | undefined,
  args: Record<string, unknown>,
 ): { repaired: Record<string, unknown>; repairs: ToolInputRepair[] } {
  const repairs: ToolInputRepair[] = [];
  if (!schema || typeof schema !== 'object') {
    return { repaired: args, repairs };
  }
  const properties = (schema as Record<string, unknown>).properties as
    Record<string, unknown> | undefined;
  if (!properties) {
    return { repaired: args, repairs };
  }
  const required = new Set<string>(
    Array.isArray((schema as Record<string, unknown>).required)
      ? (schema as Record<string, unknown>).required as string[]
      : [],
  );
  const repaired: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    const propSchema = properties[key] as Record<string, unknown> | undefined;
    if (propSchema && value !== null && value !== undefined) {
      repaired[key] = repairValue(key, propSchema, value, repairs, required.has(key));
    } else {
      repaired[key] = value;
    }
  }
  // Drop every key that is not declared in the schema, so hallucinated
  // params never reach the tool.
  for (const key of Object.keys(repaired)) {
    if (!(key in properties)) {
      repairs.push({ field: key, kind: 'removed_unknown', detail: `Removed unknown parameter '${key}'` });
      delete repaired[key];
    }
  }
  return { repaired, repairs };
 }
 function repairValue(
  field: string,
  schema: Record<string, unknown>,
  value: unknown,
  repairs: ToolInputRepair[],
  required: boolean,
 ): unknown {
  const schemaType = schema.type;
  const isArray =
    schemaType === 'array' || (Array.isArray(schemaType) && schemaType.includes('array'));
  const isObject = schemaType === 'object';
  const isBoolean = schemaType === 'boolean';
  const isInteger = schemaType === 'integer' || schemaType === 'number';
  const isString = schemaType === 'string';
  // --- Array repair: wrap bare value or empty object ---
  if (isArray) {
    if (!Array.isArray(value)) {
      if (typeof value === 'string') {
        // Try parsing as JSON array first
        try {
          const parsed = JSON.parse(value);
          if (Array.isArray(parsed)) {
            repairs.push({ field, kind: 'parsed_json_array', detail: `Parsed string as JSON array for '${field}'` });
            return parsed;
          }
        } catch { /* not JSON */ }
      }
      if (typeof value === 'object' && value !== null && Object.keys(value).length === 0) {
        if (required) {
          repairs.push({ field, kind: 'empty_object_to_array', detail: `Converted empty object to empty array for '${field}'` });
          return [];
        }
        repairs.push({ field, kind: 'empty_object_to_undefined', detail: `Removed empty object for optional array '${field}'` });
        return undefined;
      }
      repairs.push({ field, kind: 'wrapped_in_array', detail: `Wrapped bare value in array for '${field}'` });
      return [value];
    }
    // Recurse into array items
    const itemsSchema = schema.items as Record<string, unknown> | undefined;
    if (itemsSchema) {
      return value.map((item, i) => repairValue(`${field}[${i}]`, itemsSchema, item, repairs, required));
    }
    return value;
  }
  // --- Object repair: recurse into properties ---
  if (isObject && typeof value === 'object' && value !== null && !Array.isArray(value)) {
    const props = (schema.properties as Record<string, unknown>) ?? {};
    // Requiredness of nested keys comes from this object's own `required`
    // list, not from whether the parent field is required.
    const nestedRequired = new Set<string>(
      Array.isArray(schema.required) ? (schema.required as string[]) : [],
    );
    const repaired: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      const propSchema = props[k] as Record<string, unknown> | undefined;
      if (propSchema) {
        repaired[k] = repairValue(`${field}.${k}`, propSchema, v, repairs, nestedRequired.has(k));
      } else {
        repaired[k] = v;
      }
    }
    return repaired;
  }
  // --- String repair: unwrap markdown autolinks ---
  if (isString && typeof value === 'string') {
    const match = value.match(MARKDOWN_AUTOLINK_RE);
    if (match) {
      repairs.push({ field, kind: 'unwrapped_markdown_link', detail: `Unwrapped markdown autolink for '${field}': ${value}` });
      return match[1];
    }
    return value;
  }
  // --- Boolean coercion ---
  if (isBoolean && typeof value === 'string') {
    const lower = value.toLowerCase();
    if (lower === 'true') {
      repairs.push({ field, kind: 'coerced_to_boolean', detail: `Coerced string '${value}' → true for '${field}'` });
      return true;
    }
    if (lower === 'false') {
      repairs.push({ field, kind: 'coerced_to_boolean', detail: `Coerced string '${value}' → false for '${field}'` });
      return false;
    }
    return value;
  }
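  // --- Numeric coercion: "42.0" → 42 ---
  // Blank strings are skipped here because Number('') is 0, not NaN; they fall
  // through to the empty-string handling below.
  if (isInteger && typeof value === 'string' && value.trim() !== '') {
    const num = Number(value);
    if (Number.isFinite(num)) {
      repairs.push({ field, kind: 'coerced_to_number', detail: `Coerced string '${value}' → ${num} for '${field}'` });
      return num;
    }
    return value;
  }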
  // --- Integer coercion: boolean → 0/1 ---
  if (isInteger && typeof value === 'boolean') {
    repairs.push({ field, kind: 'coerced_boolean_to_integer', detail: `Coerced boolean ${value} → ${value ? 1 : 0} for '${field}'` });
    return value ? 1 : 0;
  }
  // --- Empty string to undefined for optional fields ---
  if (value === '' && !required) {
    repairs.push({ field, kind: 'empty_string_to_undefined', detail: `Converted empty string for optional '${field}'` });
    return undefined;
  }
  return value;
 }
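Review note: the coercion rules in repairValue can be illustrated in isolation. The following is a minimal sketch, not the project's implementation; `coerce`, `Repair`, and `Schema` are hypothetical names, and only three of the repairs (array wrapping, boolean coercion, numeric coercion) are modelled.

```typescript
// Hypothetical, simplified stand-in for repairValue (illustration only).
type Repair = { field: string; kind: string };
type Schema = { type?: string };

function coerce(field: string, schema: Schema, value: unknown, repairs: Repair[]): unknown {
  // Wrap a bare value when the schema expects an array.
  if (schema.type === 'array' && !Array.isArray(value)) {
    repairs.push({ field, kind: 'wrapped_in_array' });
    return [value];
  }
  // Accept "true"/"false" in any letter case for boolean fields.
  if (schema.type === 'boolean' && typeof value === 'string') {
    const lower = value.toLowerCase();
    if (lower === 'true' || lower === 'false') {
      repairs.push({ field, kind: 'coerced_to_boolean' });
      return lower === 'true';
    }
  }
  // Parse numeric strings; blank strings are left alone because Number('') is 0.
  if ((schema.type === 'integer' || schema.type === 'number')
      && typeof value === 'string' && value.trim() !== '') {
    const num = Number(value);
    if (Number.isFinite(num)) {
      repairs.push({ field, kind: 'coerced_to_number' });
      return num;
    }
  }
  return value;
}

const repairs: Repair[] = [];
const limit = coerce('limit', { type: 'integer' }, '42.0', repairs);
const paths = coerce('paths', { type: 'array' }, 'src/index.ts', repairs);
const recursive = coerce('recursive', { type: 'boolean' }, 'TRUE', repairs);
const blank = coerce('offset', { type: 'integer' }, '', repairs);
console.log(limit, paths, recursive, blank, repairs.map((r) => r.kind));
```

Recording every repair in a list, rather than coercing silently, keeps the caller free to discard the repaired input when a second schema parse still fails.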
--- a/apps/server/src/services/inference/tool-phase.ts
+++ b/apps/server/src/services/inference/tool-phase.ts
@@ -6,6 +6,7 @@ import type { ToolExecCtx } from '../tools.js';
 import { matchToolGlob } from '../agents.js';
 import { maybeFlagForCompaction } from './payload.js';
 import { insertParts, partsFromAssistantMessage, partsFromToolMessage } from './parts.js';
 import { getServerPermission } from '../mcp-client.js';
 // v1.13.16: richer unknown-tool error so the model can self-correct when it
 // drifts to a Claude Code tool name (e.g. read_file → suggest view_file).
 // Applies to all unknown tool names, not just <invoke>-derived ones — at the
@@ -17,7 +18,9 @@ import { formatUnknownToolError } from './tool-suggestions.js';
 // prompted about paths we couldn't grant anyway (e.g. /etc/passwd).
 import { resolveGrantRoot } from '../grant_resolver.js';
 import { stripToolMarkup } from './tool-call-parser.js';
 import { repairToolInput } from './tool-input-repair.js';
 import type { FailureKind } from './mistake-tracker.js';
 import { insertToolTrace, updateToolTrace } from '../tool-traces.js';
 import type {
  InferenceContext,
  StreamResult,
@@ -34,6 +37,8 @@ async function executeToolCall(
  toolCall: ToolCall,
  extraRoots: readonly string[],
  toolCtx?: ToolExecCtx,
  hooks?: import('../hooks.js').HookRunner,
  sessionId?: string,
 ): Promise<{ output: unknown; truncated: boolean; error?: string; outcome: FailureKind | 'success' }> {
  // v#12 MistakeTracker: every return path carries an `outcome` so the turn
  // loop can detect a run of heterogeneous failures. The failure taxonomy
@@ -48,7 +53,61 @@ async function executeToolCall(
      outcome: 'tool_not_found',
    };
  }
-  const parsed = tool.inputSchema.safeParse(toolCall.args);
+  // MCP permission gate — block deny/ask before any Zod parsing or execution
  const mcpPerm = getServerPermission(toolCall.name);
  if (mcpPerm === 'deny') {
    return { output: null, truncated: false, error: `blocked: MCP server denied tool '${toolCall.name}'`, outcome: 'permission_denied' };
  }
  if (mcpPerm === 'ask') {
    return { output: null, truncated: false, error: `requires approval: tool '${toolCall.name}' needs user approval`, outcome: 'permission_denied' };
  }
  // vWhale: schema-based tool input repair. If the Zod parse fails, attempt
  // heuristic repairs (type coercion, markdown-link unwrapping, array wrapping)
  // and retry. The repaired args are adopted only when the retry parses.
  let args = toolCall.args;
  let parsed = tool.inputSchema.safeParse(args);
  if (!parsed.success) {
    const schema = tool.jsonSchema?.function?.parameters;
    if (schema) {
      const { repaired: repairedArgs, repairs } = repairToolInput(
        schema as Record<string, unknown>,
        args as Record<string, unknown>,
      );
      if (repairs.length > 0) {
        const retry = tool.inputSchema.safeParse(repairedArgs);
        if (retry.success) {
          args = repairedArgs;
          parsed = retry;
        }
      }
    }
  }
  // vWhale: PreToolUse hook — can block execution.
  if (hooks && sessionId) {
    const hookResult = await hooks.run('PreToolUse', {
      event: 'PreToolUse',
      session_id: sessionId,
      tool_name: toolCall.name,
      tool_args: args as Record<string, unknown>,
    });
    if (hookResult.decision === 'block') {
      return {
        output: null,
        truncated: false,
        error: `blocked by hook: ${hookResult.reason ?? 'PreToolUse denied'}`,
        outcome: 'permission_denied',
      };
    }
    // Apply updated_input if the hook rewrote the args
    if (hookResult.updated_input && typeof hookResult.updated_input === 'object') {
      const reParsed = tool.inputSchema.safeParse(hookResult.updated_input);
      if (reParsed.success) {
        args = hookResult.updated_input as Record<string, unknown>;
        parsed = reParsed;
      }
    }
  }
  if (!parsed.success) {
    // v1.12 Track B.2: enrich the zod-reject path so the model sees a
    // one-line, tool-named hint ("tool 'search_symbols' rejected — query:
@@ -117,6 +176,7 @@ export async function executeToolPhase(
  session: Session,
  projectRoot: string,
  agent?: Agent | null,
  turnNumber?: number,
 ): Promise<ToolPhaseResult> {
  const { sessionId, chatId, assistantMessageId } = args;
  const content = stripToolMarkup(result.content, { final: true });
@@ -183,6 +243,8 @@ export async function executeToolPhase(
    tokens_used: updated?.tokens_used ?? null,
    ctx_used: updated?.ctx_used ?? null,
    ctx_max: updated?.ctx_max ?? null,
    cache_tokens: result.cacheReadTokens ?? null,
    reasoning_tokens: result.reasoningTokens ?? null,
    started_at: startedAt,
    finished_at: updated?.finished_at ?? null,
    model: session.model,
@@ -318,10 +380,64 @@ export async function executeToolPhase(
        });
        return;
      }
-      const tres = await executeToolCall(projectRoot, tc, session.allowed_read_paths, {
+      // tool_trace instrumentation - start
-        sql: ctx.sql,
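+      const traceStartTime = Date.now();
-        sessionId,
      const startedAtIso = new Date().toISOString();
      // The tool_traces row id is assigned by the INSERT, so adopt it as the
      // trace id; a locally generated UUID would never match a row in
      // updateToolTrace. Fall back to a random UUID (used only in the
      // published frames) when the insert fails.
      const traceRow = await insertToolTrace(ctx.sql, {
        session_id: sessionId,
        chat_id: chatId,
        message_id: assistantMessageId,
        turn_number: turnNumber ?? 0,
        tool_name: tc.name,
        tool_input: tc.args as Record<string, unknown>,
      }).catch(() => null);
      const traceId = traceRow?.id ?? crypto.randomUUID();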
      ctx.publish(sessionId, {
        type: 'tool_trace_start',
        trace_id: traceId,
        message_id: assistantMessageId,
        chat_id: chatId,
        tool_name: tc.name,
        tool_input: tc.args as Record<string, unknown>,
        started_at: startedAtIso,
      });
      const tres = await executeToolCall(
        projectRoot, tc, session.allowed_read_paths,
        { sql: ctx.sql, sessionId },
        ctx.hooks, sessionId,
      );
      // tool_trace instrumentation - finish
      const finishedAtIso = new Date().toISOString();
      const latencyMs = Date.now() - traceStartTime;
      updateToolTrace(ctx.sql, traceId, {
        finished_at: finishedAtIso,
        ...(tres.outcome === 'success' && tres.output != null ? { tool_output: JSON.stringify(tres.output) } : {}),
        latency_ms: latencyMs,
        outcome: tres.outcome,
        ...(tres.error ? { error: tres.error } : {}),
      }).catch(() => {});
      ctx.publish(sessionId, {
        type: 'tool_trace_finish',
        trace_id: traceId,
        message_id: assistantMessageId,
        chat_id: chatId,
        tool_name: tc.name,
        finished_at: finishedAtIso,
        outcome: tres.outcome,
        latency_ms: latencyMs,
        ...(tres.error ? { error: tres.error } : {}),
      });
      // vWhale: PostToolUse hook (best-effort, non-blocking).
      if (ctx.hooks) {
        ctx.hooks.run('PostToolUse', {
          event: 'PostToolUse',
          session_id: sessionId,
          tool_name: tc.name,
          tool_args: tc.args as Record<string, unknown>,
          tool_result: tres.output,
          tool_error: tres.error,
        }).catch(() => {});
      }
      // v#12 MistakeTracker: record the real execution outcome (success or a
      // FailureKind). This is the primary signal for heterogeneous-failure
      // detection.
--- a/apps/server/src/services/inference/turn.ts
+++ b/apps/server/src/services/inference/turn.ts
@@ -37,6 +37,12 @@ import type {
  StreamResult,
  TurnArgs,
 } from './types.js';
 import { saveAgentSnapshot } from '../session-snapshots.js';
 // vWhale: auto-fix loop — after write tools, build the project and inject
 // errors. Uses execFile (no shell) against the project root.
 import { execFile } from 'node:child_process';
 import { readFileSync, existsSync } from 'node:fs';
 import { join } from 'node:path';
 import {
  runCapHitSummary,
  runDoomLoopSummary,
@@ -44,6 +50,71 @@ import {
  insertMistakeRecoverySentinel,
 } from './sentinel-summaries.js';
 // vWhale: auto-fix — detect build command from package.json, run it, return
 // error text for injection into next iteration. Best-effort, never throws.
 const BUILD_TIMEOUT_MS = 60_000;
 const BUILD_OUTPUT_CAP = 8_000;
 async function detectAndRunBuild(
  ctx: InferenceContext,
  projectRoot: string,
  sessionId: string,
  chatId: string,
  model: string,
  existingNote: string | undefined,
 ): Promise<string | undefined> {
  // Only run for DeepSeek models (local Qwen models don't benefit from build loop).
  if (!model.startsWith('deepseek-')) return undefined;
  // Detect build command from package.json in project root.
  const pkgPath = join(projectRoot, 'package.json');
  if (!existsSync(pkgPath)) return undefined;
  let buildCmd: string | null = null;
  try {
    const pkg = JSON.parse(readFileSync(pkgPath, 'utf8')) as { scripts?: Record<string, string> };
    if (pkg.scripts?.build) buildCmd = 'build';
    else if (pkg.scripts?.compile) buildCmd = 'compile';
    else if (pkg.scripts?.typecheck) buildCmd = 'typecheck';
  } catch {
    return undefined;
  }
  if (!buildCmd) return undefined;
  // Detect package manager.
  const hasPnpm = existsSync(join(projectRoot, 'pnpm-lock.yaml'));
  const hasYarn = existsSync(join(projectRoot, 'yarn.lock'));
  const pm = hasPnpm ? 'pnpm' : hasYarn ? 'yarn' : 'npm';
  // Run the build.
  try {
    const out = await new Promise<string>((resolve) => {
      execFile(pm, ['run', buildCmd!], { cwd: projectRoot, timeout: BUILD_TIMEOUT_MS, maxBuffer: BUILD_OUTPUT_CAP * 2 },
        (err, stdout, stderr) => {
          // No error means the build exited 0; ENOENT means the package
          // manager is not installed. Neither yields errors to inject.
          if (!err || (err as NodeJS.ErrnoException).code === 'ENOENT') {
            resolve('');
            return;
          }
          const merged = (stdout + '\n' + stderr).trim();
          resolve(merged.slice(0, BUILD_OUTPUT_CAP));
        },
      );
    });
    if (!out) return undefined;  // build succeeded or produced no output
    ctx.log.info({ sessionId, chatId, buildCmd, outputLen: out.length }, 'auto-fix: build failed');
    // Cap the combined note at BUILD_OUTPUT_CAP; clamp so that a long existing
    // note cannot produce a negative slice end.
    const combined = existingNote
      ? existingNote + '\n\n--- Build error ---\n' + out.slice(0, Math.max(0, BUILD_OUTPUT_CAP - existingNote.length))
      : '--- Build error ---\n' + out.slice(0, BUILD_OUTPUT_CAP);
    return combined;
  } catch {
    return undefined;
  }
 }
 // P5: MAX_STEPS moved to ./turn-config.ts (with resolveTurnConfig). Re-exported
 // here so the public surface (index.ts → './turn.js') is unchanged.
 export { MAX_STEPS } from './turn-config.js';
@@ -144,6 +215,7 @@ export async function runAssistantTurn(
          log: ctx.log,
          broker: ctx.broker,
          chatId,
          hooks: ctx.hooks,
        });
      } catch (err) {
        ctx.log.warn({ err, chatId }, 'auto-compaction failed; clearing flag and proceeding');
@@ -214,6 +286,16 @@ export async function runAssistantTurn(
    // ---- non-tool finish → finalize and exit ----
    if (result.toolCalls.length === 0) {
      // vWhale: Stop hook (best-effort, non-blocking).
      if (ctx.hooks) {
        ctx.hooks.run('Stop', {
          event: 'Stop',
          session_id: sessionId,
          chat_id: chatId,
          last_assistant_text: result.content.slice(0, 500),
          turn: stepNumber,
        }).catch(() => {});
      }
      await finalizeCompletion(ctx, iterArgs, result, state.startedAt, iterSession);
      break;
    }
@@ -229,7 +311,7 @@ export async function runAssistantTurn(
    // ---- tool phase ----
    let toolPhaseResult: ToolPhaseResult;
    try {
-      toolPhaseResult = await executeToolPhase(ctx, iterArgs, result, state.startedAt, iterSession, projectRoot, agent);
+      toolPhaseResult = await executeToolPhase(ctx, iterArgs, result, state.startedAt, iterSession, projectRoot, agent, stepNumber);
    } catch (err) {
      // Tool phase errors are unexpected (individual tool failures are
      // caught inside executeToolPhase). Log and break.
@@ -249,6 +331,17 @@ export async function runAssistantTurn(
      recordStep(mistakeTracker, o);
    }
    // vWhale: auto-fix. After write tools, run the build and wait for it so
    // that any errors land in pendingRecoveryNote before the next iteration
    // starts.
    const WRITE_TOOLS = new Set(['edit_file', 'create_file', 'delete_file', 'apply_pending']);
    const hasWriteTools = toolPhaseResult.toolCalls.some((tc) => WRITE_TOOLS.has(tc.name));
    if (hasWriteTools) {
      const buildError = await detectAndRunBuild(ctx, projectRoot, sessionId, chatId, iterSession.model, pendingRecoveryNote)
        .catch(() => undefined);
      if (buildError) pendingRecoveryNote = buildError;
    }
    // v#12 MistakeTracker: post-tool decision (pure). 'stop' = the tool phase
    // returned a non-'continue' action ('paused' for user input, or
    // 'synthesis_done') — neither a nudge nor an escalate would change the
@@ -309,6 +402,35 @@ export async function runAssistantTurn(
    assistantMessageId = toolPhaseResult.nextAssistantId!;
  }
  // vWhale: Stop hook at post-loop exit (best-effort, non-blocking).
  if (ctx.hooks) {
    const loaded = await loadContext(ctx.sql, sessionId, chatId).catch(() => null);
    const lastAssistant = loaded?.history?.slice().reverse().find(
      (m: import('../../types/api.js').Message) => m.role === 'assistant',
    );
    const content = lastAssistant?.content ?? '';
    ctx.hooks.run('Stop', {
      event: 'Stop',
      session_id: sessionId,
      chat_id: chatId,
      last_assistant_text: content.slice(0, 500),
      turn: stepNumber,
    }).catch(() => {});
  }
  // ---- persist agent snapshot (best-effort, never blocks inference) ----
  const snapLoaded = await loadContext(ctx.sql, sessionId, chatId).catch(() => null);
  if (snapLoaded) {
    await saveAgentSnapshot(ctx.sql, chatId, {
      session_id: sessionId,
      model: snapLoaded.session.model,
      agent: agent?.name ?? null,
      mode: null,
      turn_number: stepNumber,
      messages: snapLoaded.history.map((m) => ({ role: m.role, content: m.content })),
    }).catch(() => {});
  }
  // ---- post-loop: step-cap sentinel ----
  // When the loop exits because stepNumber reached effectiveCap, the last
  // iteration's tool phase returned 'continue' with a nextAssistantId that
--- a/apps/server/src/services/inference/types.ts
+++ b/apps/server/src/services/inference/types.ts
@@ -19,6 +19,7 @@ import type {
  UserStreamFrame,
 } from '../../types/api.js';
 import type { Broker } from '../broker.js';
 import type { HookRunner } from '../hooks.js';
 import type { MistakeState } from './mistake-tracker.js';
 export interface StreamPhaseState {
@@ -45,6 +46,9 @@ export interface InferenceFrame {
    | 'error'
    | 'flow_run_started'
    | 'flow_run_step_updated'
    // tool trace frames
    | 'tool_trace_start'
    | 'tool_trace_finish'
    // arena frames
    | 'battle_started'
    | 'contestant_updated'
@@ -77,8 +81,19 @@ export interface InferenceFrame {
  started_at?: string | null;
  finished_at?: string | null;
  model?: string;
  cache_tokens?: number | null;
  reasoning_tokens?: number | null;
  session_id?: string;
  name?: string;
  // tool trace frames
  trace_id?: string;
  tool_name?: string;
  tool_input?: Record<string, unknown>;
  tool_output?: string | null;
  latency_ms?: number;
  outcome?: string;
  // agent snapshot restore
  agent?: string | null;
  // orchestrator frames ([D-6])
  run_id?: string;
  flow_name?: string;
@@ -117,6 +132,9 @@ export interface InferenceContext {
  // inference goes through `publish`); keeping a separate field avoids
  // tempting other code paths into bypassing the session-id binding.
  broker: Broker;
  // vWhale: lifecycle hooks runner. Undefined when no hooks configured.
  // Hook calls are best-effort — a failing hook never blocks inference.
  hooks?: HookRunner;
 }
 export interface StreamResult {
@@ -128,6 +146,12 @@ export interface StreamResult {
  // v1.13.1-C: reasoning text accumulated across reasoning-delta parts.
  // Empty string when the model doesn't emit reasoning (most cases).
  reasoning: string;
  // vDeepSeek: optional cache-hit token count from DeepSeek's API.
  // Only populated when using @ai-sdk/deepseek provider (not llama-swap).
  cacheReadTokens?: number;
  // vDeepSeek: optional reasoning token count from DeepSeek's API.
  // Only populated when using @ai-sdk/deepseek provider (not llama-swap).
  reasoningTokens?: number;
 }
 export interface TurnArgs {
--- a/apps/server/src/services/mcp-client.ts
+++ b/apps/server/src/services/mcp-client.ts
@@ -31,11 +31,14 @@ interface McpToolDef {
  annotations?: McpToolAnnotations;
 }
 export type McpPermission = 'allow' | 'ask' | 'deny';
 interface ServerState {
  client: Client;
  transport: StreamableHTTPClientTransport | StdioClientTransport;
  tools: ToolDef<Record<string, unknown>>[];
  type: 'streamableHttp' | 'stdio';
  permission: McpPermission;
 }
 // ---- Module-level state ----
@@ -137,6 +140,14 @@ export async function callTool(
  }
 }
 /**
  * Return the permission level of the MCP server that owns the given prefixed
  * tool name. Defaults to 'allow' when the tool is not an MCP tool or its
  * server is unknown.
  */
 export function getServerPermission(prefixedToolName: string): McpPermission {
  const serverName = toolToServer.get(prefixedToolName);
  if (!serverName) return 'allow';
  const state = servers.get(serverName);
  return state?.permission ?? 'allow';
 }
 /** Return all wrapped ToolDefs from all connected servers, flattened. */
 export function getTools(): ToolDef<Record<string, unknown>>[] {
  const all: ToolDef<Record<string, unknown>>[] = [];
@@ -214,7 +225,8 @@ async function connectServer(entry: McpServerEntry): Promise<void> {
    toolToServer.set(wrapped.name, name);
  }
-  servers.set(name, { client, transport, tools, type: config.type });
+  const permission = (config as { permission?: McpPermission }).permission ?? 'allow';
  servers.set(name, { client, transport, tools, type: config.type, permission });
  log!.info(
    { server: name, type: config.type, count: tools.length, names: tools.map((t) => t.name) },
--- a/apps/server/src/services/mcp-config.ts
+++ b/apps/server/src/services/mcp-config.ts
@@ -17,12 +17,15 @@ import type { FastifyBaseLogger } from 'fastify';
 // ---- Zod schema ----
 const McpPermissionSchema = z.enum(['allow', 'ask', 'deny']).default('allow');
 const McpServerConfigSchema = z.discriminatedUnion('type', [
  z.object({
    type: z.literal('streamableHttp'),
    url: z.string().url(),
    headers: z.record(z.string()).optional(),
    enabled: z.boolean().default(true),
    permission: McpPermissionSchema,
  }),
  z.object({
    type: z.literal('stdio'),
@@ -30,6 +33,7 @@ const McpServerConfigSchema = z.discriminatedUnion('type', [
    args: z.array(z.string()).default([]),
    env: z.record(z.string()).optional(),
    enabled: z.boolean().default(true),
    permission: McpPermissionSchema,
  }),
 ]);
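Review note: the `permission` field added above feeds the gate in tool-phase.ts. The following is a minimal sketch of that lookup-then-gate flow under stated assumptions; the map contents, the tool names, and the `gate` helper are hypothetical and stand in for the module-level state in mcp-client.ts.

```typescript
// Hypothetical illustration of the per-server permission gate.
type McpPermission = 'allow' | 'ask' | 'deny';

// Assumed sample data: prefixed tool name -> server, and server -> permission.
const toolToServer = new Map<string, string>([
  ['docs__search', 'docs'],
  ['shell__run', 'shell'],
  ['db__query', 'db'],
]);
const serverPermission = new Map<string, McpPermission>([
  ['docs', 'allow'],
  ['shell', 'deny'],
  ['db', 'ask'],
]);

// Tools that do not belong to any known server default to 'allow'.
function permissionFor(toolName: string): McpPermission {
  const server = toolToServer.get(toolName);
  if (!server) return 'allow';
  return serverPermission.get(server) ?? 'allow';
}

// Returns an error message when the call must not proceed, otherwise null.
function gate(toolName: string): string | null {
  const perm = permissionFor(toolName);
  if (perm === 'deny') return `blocked: MCP server denied tool '${toolName}'`;
  if (perm === 'ask') return `requires approval: tool '${toolName}' needs user approval`;
  return null;
}

console.log(gate('docs__search'), gate('shell__run'), gate('db__query'), gate('view_file'));
```

Running the gate before schema parsing means a denied tool never has its arguments inspected or repaired.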
--- a/apps/server/src/services/memory/index.ts
+++ b/apps/server/src/services/memory/index.ts
@@ -3,4 +3,9 @@ export { formatMemoryBlock } from './prompt.js';
 export { scanMemoryScopes } from './scan.js';
 export { parseMemoryEntries } from './entries.js';
 export { ensureMemoryScaffold, getMemoryRoot } from './paths.js';
 export { ContextTier } from './context-tier.js';
 export { DeepDream } from './deep-dream.js';
 export { CoreTier } from './core-tier.js';
 export type { MemoryEntry } from './entries.js';
 export type { ContextTierConfig, ConversationTurn } from './context-tier.js';
 export type { CoreTierEntry, CoreTierSearchResult, CoreTierSearchOptions } from './core-tier.js';
--- a/apps/server/src/services/message-columns.ts
+++ b/apps/server/src/services/message-columns.ts
@@ -7,10 +7,12 @@
 export const MESSAGE_COLUMNS =
  'id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, ' +
-  'tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata, ' +
+  'tokens_used, ctx_used, ctx_max, cache_tokens, reasoning_tokens, ' +
  'started_at, finished_at, created_at, metadata, ' +
  'summary, tail_start_id, compacted_at, model';
 export const INFERENCE_MESSAGE_COLUMNS =
  'id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, ' +
-  'tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata, ' +
+  'tokens_used, ctx_used, ctx_max, cache_tokens, reasoning_tokens, ' +
  'started_at, finished_at, created_at, metadata, ' +
  'reasoning_parts, model';
--- a/apps/server/src/services/model-context.ts
+++ b/apps/server/src/services/model-context.ts
@@ -37,7 +37,18 @@ export function configureModelContext(opts: { llamaSwapUrl: string }): void {
  llamaSwapUrl = opts.llamaSwapUrl;
 }
 // vDeepSeek: DeepSeek models don't have a /upstream/<model>/props endpoint.
 // Return a reasonable default context so compaction estimates work.
 const DEEPSEEK_DEFAULT_N_CTX = 131_072;
 const DEEPSEEK_MODEL_PREFIX = 'deepseek-';
 export async function getModelContext(model: string): Promise<ModelContext | null> {
  // vDeepSeek: DeepSeek models have no /upstream/<model>/props. Use a static
  // default so compaction doesn't fall to the buffer-only path with tiny limits.
  if (model.startsWith(DEEPSEEK_MODEL_PREFIX)) {
    return { n_ctx: DEEPSEEK_DEFAULT_N_CTX };
  }
  // 1. Positive cache hit — no TTL check, model n_ctx is invariant.
  const pos = positiveCache.get(model);
  if (pos) return pos;
--- a/apps/server/src/services/session-snapshots.ts
+++ b/apps/server/src/services/session-snapshots.ts
@@ -0,0 +1,51 @@
 import type { Sql } from '../db.js';
 export interface AgentSnapshot {
  id: string;
  session_id: string;
  chat_id: string;
  model: string;
  agent: string | null;
  mode: string | null;
  turn_number: number;
  messages: unknown[];
  tool_states: unknown[];
  created_at: string;
  updated_at: string;
 }
 /** Save or update the agent snapshot for a chat (UPSERT). */
 export async function saveAgentSnapshot(sql: Sql, chatId: string, data: {
  session_id: string;
  model: string;
  agent?: string | null;
  mode?: string | null;
  turn_number: number;
  messages: unknown[];
  tool_states?: unknown[];
 }): Promise<void> {
  await sql`
    INSERT INTO agent_snapshots (session_id, chat_id, model, agent, mode, turn_number, messages, tool_states, updated_at)
    VALUES (${data.session_id}, ${chatId}, ${data.model}, ${data.agent ?? null}, ${data.mode ?? null}, ${data.turn_number}, ${sql.json(data.messages as never)}, ${sql.json((data.tool_states ?? []) as never)}, clock_timestamp())
    ON CONFLICT (chat_id)
    DO UPDATE SET
      model = EXCLUDED.model,
      agent = EXCLUDED.agent,
      mode = EXCLUDED.mode,
      turn_number = EXCLUDED.turn_number,
      messages = EXCLUDED.messages,
      tool_states = EXCLUDED.tool_states,
      updated_at = clock_timestamp()
  `;
 }
 /** Load the agent snapshot for a chat. Returns null if no snapshot exists. */
 export async function loadAgentSnapshot(sql: Sql, chatId: string): Promise<AgentSnapshot | null> {
  const rows = await sql<AgentSnapshot[]>`SELECT * FROM agent_snapshots WHERE chat_id = ${chatId}`;
  return rows[0] ?? null;
 }
 /** Delete the agent snapshot for a chat (call when session ends). */
 export async function deleteAgentSnapshot(sql: Sql, chatId: string): Promise<void> {
  await sql`DELETE FROM agent_snapshots WHERE chat_id = ${chatId}`;
 }
--- a/apps/server/src/services/system-prompt.ts
+++ b/apps/server/src/services/system-prompt.ts
@@ -101,7 +101,7 @@ export interface PrefixFingerprint {
  has_agent_system_prompt: boolean;
  has_session_override: boolean;
  has_project_override: boolean;
-  route: 'swap' | 'sidecar';
+  route: 'swap' | 'sidecar' | 'deepseek';
 }
 export interface PrefixDrift {
@@ -129,7 +129,7 @@ interface ObservedInputs {
  has_agent_system_prompt: boolean;
  has_session_override: boolean;
  has_project_override: boolean;
-  route: 'swap' | 'sidecar';
+  route: 'swap' | 'sidecar' | 'deepseek';
 }
 interface ObserverEntry {
--- a/apps/server/src/services/tool-traces.ts
+++ b/apps/server/src/services/tool-traces.ts
@@ -0,0 +1,92 @@
 import type { Sql } from '../db.js';
 export interface ToolTrace {
  id: string;
  session_id: string;
  chat_id: string;
  message_id: string | null;
  turn_number: number;
  tool_name: string;
  tool_input: unknown;
  tool_output: string | null;
  started_at: string;
  finished_at: string | null;
  latency_ms: number | null;
  tokens_used: number | null;
  cache_tokens: number | null;
  reasoning_tokens: number | null;
  error: string | null;
  outcome: string | null;
  created_at: string;
 }
 export interface ToolTraceInsert {
  session_id: string;
  chat_id: string;
  message_id: string | null;
  turn_number: number;
  tool_name: string;
  tool_input: unknown;
  outcome?: string;
 }
 export interface ToolTraceUpdate {
  finished_at?: string;
  latency_ms?: number;
  tool_output?: string;
  tokens_used?: number;
  cache_tokens?: number;
  reasoning_tokens?: number;
  error?: string;
  outcome?: string;
 }
 export async function insertToolTrace(
  sql: Sql,
  insert: ToolTraceInsert,
 ): Promise<ToolTrace> {
  const [row] = await sql<ToolTrace[]>`
    INSERT INTO tool_traces (
      session_id, chat_id, message_id, turn_number,
      tool_name, tool_input, outcome
    ) VALUES (
      ${insert.session_id}, ${insert.chat_id}, ${insert.message_id},
      ${insert.turn_number}, ${insert.tool_name},
      ${sql.json(insert.tool_input as never)},
      ${insert.outcome ?? null}
    )
    RETURNING *
  `;
  if (!row) throw new Error('insertToolTrace returned no row');
  return row;
 }
 export async function updateToolTrace(
  sql: Sql,
  id: string,
  updates: ToolTraceUpdate,
 ): Promise<ToolTrace | null> {
  const cols: string[] = [];
  const vals: any[] = [];
  if (updates.finished_at !== undefined) { cols.push('finished_at'); vals.push(updates.finished_at); }
  if (updates.latency_ms !== undefined) { cols.push('latency_ms'); vals.push(updates.latency_ms); }
  if (updates.tool_output !== undefined) { cols.push('tool_output'); vals.push(updates.tool_output); }
  if (updates.tokens_used !== undefined) { cols.push('tokens_used'); vals.push(updates.tokens_used); }
  if (updates.cache_tokens !== undefined) { cols.push('cache_tokens'); vals.push(updates.cache_tokens); }
  if (updates.reasoning_tokens !== undefined) { cols.push('reasoning_tokens'); vals.push(updates.reasoning_tokens); }
  if (updates.error !== undefined) { cols.push('error'); vals.push(updates.error); }
  if (updates.outcome !== undefined) { cols.push('outcome'); vals.push(updates.outcome); }
  if (cols.length === 0) {
    const [row] = await sql<ToolTrace[]>`SELECT * FROM tool_traces WHERE id = ${id}`;
    return row ?? null;
  }
  const setClause = cols.map((c, i) => `${c} = $${i + 1}`).join(', ');
  const [row] = await sql.unsafe<ToolTrace[]>(
    `UPDATE tool_traces SET ${setClause} WHERE id = $${cols.length + 1} RETURNING *`,
    [...vals, id],
  );
  return row ?? null;
 }
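Review note: the dynamic UPDATE in updateToolTrace relies on column names coming from a fixed list while values travel as positional parameters. The following is a minimal sketch of that pattern as a pure function; `buildUpdate`, `UPDATABLE`, and the reduced column set are hypothetical and chosen for illustration only.

```typescript
// Hypothetical helper mirroring the shape of updateToolTrace's query building.
const UPDATABLE = ['finished_at', 'latency_ms', 'tool_output', 'error', 'outcome'] as const;
type Column = (typeof UPDATABLE)[number];
type Updates = Partial<Record<Column, string | number>>;

function buildUpdate(
  id: string,
  updates: Updates,
): { text: string; values: (string | number)[] } | null {
  // Only whitelisted column names ever reach the SQL text.
  const cols = UPDATABLE.filter((c) => updates[c] !== undefined);
  if (cols.length === 0) return null; // nothing to update
  const setClause = cols.map((c, i) => `${c} = $${i + 1}`).join(', ');
  const values = cols.map((c) => updates[c] as string | number);
  return {
    text: `UPDATE tool_traces SET ${setClause} WHERE id = $${cols.length + 1} RETURNING *`,
    values: [...values, id],
  };
}

const query = buildUpdate('trace-1', { outcome: 'success', latency_ms: 12 });
const empty = buildUpdate('trace-1', {});
console.log(query, empty);
```

Because the placeholders are numbered from the filtered column list, the id always takes the last position regardless of which fields are present.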
--- a/apps/server/src/services/tools/codecontext/factory.ts
+++ b/apps/server/src/services/tools/codecontext/factory.ts
@@ -2,6 +2,12 @@ import { z } from 'zod';
 import type { ToolDef } from '../types.js';
 import { callCodecontext, type CodecontextResponse } from '../../codecontext_client.js';
 // DEPRECATED (Phase 4, Domain 2, v2.8.14): This factory builds ToolDefs that
 // route through the Go codecontext sidecar via callCodecontext(). Superseded
 // by direct boocontext MCP tool wrappers. Keep functional for backward
 // compatibility — old codecontext tools still use HTTP. New tools should use
 // the boocontext MCP server instead of adding entries here.
 //
 // Shared factory for the 12 codecontext shim ToolDefs.
 // Each shim provides name/schema/description/jsonParameters/mapArgs; the
 // factory builds the ToolDef and returns both the ToolDef and the standalone
--- a/apps/server/src/services/tools/codecontext/get_code_health.ts
+++ b/apps/server/src/services/tools/codecontext/get_code_health.ts
@@ -0,0 +1,62 @@
 import { z } from 'zod';
 import type { ToolDef } from '../types.js';
 import { callBoocontext } from '../../boocontext_client.js';
 export const GetCodeHealthInput = z.object({
  directory: z.string().optional().describe('Directory to analyze (defaults to project root)'),
  file: z.string().optional().describe('Optional: specific file to analyze'),
 });
 export type GetCodeHealthInputT = z.infer<typeof GetCodeHealthInput>;
 const DESCRIPTION =
  'Code health analysis. Returns A–F grades per file across 7 dimensions ' +
  '(cohesion, coupling, complexity, documentation, duplication, unit size, test coverage). ' +
  'Includes project health summary and refactoring candidates.';
 /**
 * Standalone execute function — calls the boocontext MCP server's
 * boocontext_health tool and returns the raw report text.
 *
 * Structured for direct test access: accepts input + projectPath,
 * no side effects beyond the MCP call.
 */
 export async function executeGetCodeHealth(
  input: GetCodeHealthInputT,
  projectPath: string,
 ): Promise<string> {
  const args: Record<string, unknown> = {};
  if (input.directory) args['directory'] = input.directory;
  if (input.file) args['file'] = input.file;
  const resp = await callBoocontext({ toolName: 'boocontext_health', args });
  return resp.result;
 }
 export const getCodeHealth: ToolDef<GetCodeHealthInputT> = {
  name: 'get_code_health',
  description: DESCRIPTION,
  inputSchema: GetCodeHealthInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'get_code_health',
      description: DESCRIPTION,
      parameters: {
        type: 'object',
        properties: {
          directory: {
            type: 'string',
            description: 'Directory to analyze (defaults to project root)',
          },
          file: {
            type: 'string',
            description: 'Optional: specific file to analyze',
          },
        },
        additionalProperties: false,
      },
    },
  },
  async execute(input, projectRoot) {
    return executeGetCodeHealth(input, projectRoot);
  },
 };
--- a/apps/server/src/services/tools/codecontext/get_code_impact.ts
+++ b/apps/server/src/services/tools/codecontext/get_code_impact.ts
@@ -0,0 +1,228 @@
 import { spawn } from 'node:child_process';
 import { resolve } from 'node:path';
 import { z } from 'zod';
 import type { ToolDef } from '../types.js';
 import type { CodecontextResponse } from '../../codecontext_client.js';
 // ======================= MCP Client =======================
 const BOOCONTEXT_PATH = resolve('/opt/forks/boocontext/dist/standalone.js');
 const TOOL_CALL_TIMEOUT_MS = 60_000;
 interface JsonRpcMessage {
  jsonrpc: '2.0';
  id?: number | string;
  result?: {
    content?: Array<{ type: string; text: string }>;
  };
  error?: { code?: number; message: string };
 }
 /**
 * Single-shot MCP JSON-RPC client for boocontext.
 * Spawns the process, sends initialize + tools/call over NDJSON, returns the
 * text result from the content array.  The boocontext MCP server auto-detects
 * newline-delimited JSON transport when the first input lacks Content-Length
 * headers, which is exactly what we send.
 */
 async function callBoocontext(
  toolName: string,
  args: Record<string, unknown>,
 ): Promise<string> {
  return new Promise<string>((resolvePromise, reject) => {
    const child = spawn(process.execPath, [BOOCONTEXT_PATH], {
      stdio: ['pipe', 'pipe', 'pipe'],
      timeout: TOOL_CALL_TIMEOUT_MS,
    });
    let stdout = '';
    let stderr = '';
    let resolved = false;
    function finalize(err?: Error, result?: string): void {
      if (resolved) return;
      resolved = true;
      if (err) reject(err);
      else resolvePromise(result!);
      child.kill();
    }
    child.stdout!.on('data', (chunk: Buffer) => {
      stdout += chunk.toString();
    });
    child.stderr!.on('data', (chunk: Buffer) => {
      stderr += chunk.toString();
    });
    child.on('error', (err: Error) => {
      finalize(new Error(`boocontext spawn error: ${err.message}`));
    });
    child.on('close', (code: number | null) => {
      if (resolved) return;
      // Parse newline-delimited JSON responses from stdout
      const lines = stdout.split('\n').filter((l) => l.trim().length > 0);
      let toolText: string | undefined;
      let toolError: string | undefined;
      for (const line of lines) {
        try {
          const msg = JSON.parse(line) as JsonRpcMessage;
          if (msg.id === 2) {
            if (msg.error) {
              toolError = msg.error.message ?? 'boocontext tool call failed';
            } else if (msg.result?.content?.[0]?.text !== undefined) {
              toolText = msg.result.content[0].text;
            }
          }
        } catch {
          // skip malformed JSON lines
        }
      }
      if (toolError) {
        finalize(new Error(toolError));
      } else if (toolText !== undefined) {
        finalize(undefined, toolText);
      } else {
        const errSuffix =
          stderr.length > 0 ? ` stderr: ${stderr.slice(0, 500)}` : '';
        finalize(
          new Error(`boocontext MCP call failed (exit ${code})${errSuffix}`),
        );
      }
    });
    // Step 1: initialize — establishes MCP protocol version + capabilities
    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: 1,
        method: 'initialize',
        params: {
          protocolVersion: '2024-11-05',
          capabilities: {},
          clientInfo: { name: 'boocode-server', version: '1.0.0' },
        },
      }) + '\n',
    );
    // Step 2: tools/call — invoke the named boocontext tool
    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: 2,
        method: 'tools/call',
        params: { name: toolName, arguments: args },
      }) + '\n',
    );
    child.stdin!.end();
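    // Safety timeout: prevent hung processes. The timer is cleared as soon as
    // the child closes or fails to spawn, so it does not linger after a fast call.
    const timer = setTimeout(() => {
      finalize(
        new Error(
          `boocontext call timed out after ${TOOL_CALL_TIMEOUT_MS}ms`,
        ),
      );
    }, TOOL_CALL_TIMEOUT_MS);
    child.once('close', () => clearTimeout(timer));
    child.once('error', () => clearTimeout(timer));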
  });
 }
 // ======================= Tool Definition =======================
 const TRUNCATION_LIMIT = 32_000;
 export const GetCodeImpactInput = z.object({
  symbol: z.string().min(1).describe('Symbol name for TSA trace_impact'),
  file: z.string().optional().describe('File path for codesight blast_radius'),
  directory: z
    .string()
    .optional()
    .describe('Directory (defaults to project root)'),
  depth: z
    .number()
    .int()
    .min(1)
    .max(5)
    .optional()
    .describe('Max blast-radius traversal depth (default 1)'),
 });
 export type GetCodeImpactInputT = z.infer<typeof GetCodeImpactInput>;
 const DESCRIPTION =
  'Impact analysis. Merges symbol-level call trace with file-level blast radius. ' +
  'Use before making changes to understand change propagation. ' +
  'Single call replaces separate get_symbol_info + get_blast_radius steps.';
 /**
 * Standalone execute function — calls the boocontext MCP `boocontext_impact`
 * tool via a short-lived child process, then wraps the result in the standard
 * CodecontextResponse shape with inline truncation at 32 KB.
 */
 export async function executeGetCodeImpact(
  input: GetCodeImpactInputT,
  projectPath: string,
 ): Promise<CodecontextResponse> {
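  const args: Record<string, unknown> = {
    symbol: input.symbol,
    directory: input.directory ?? projectPath,
  };
  if (input.file) args['file'] = input.file;
  // Forward depth when provided; the schema advertises it, so dropping it
  // silently would mislead callers. Assumes boocontext_impact accepts `depth`.
  if (input.depth !== undefined) args['depth'] = input.depth;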
  const text = await callBoocontext('boocontext_impact', args);
  // Inline truncation matching codecontext_client.ts patterns (32 KB ceiling).
  if (text.length > TRUNCATION_LIMIT) {
    const sliced = text.slice(0, TRUNCATION_LIMIT);
    const omitted = text.length - TRUNCATION_LIMIT;
    return {
      result: `${sliced}\n\n[truncated, ${omitted} chars omitted; narrow with symbol or file parameters]`,
      truncated: true,
    };
  }
  return { result: text, truncated: false };
 }
 export const getCodeImpact: ToolDef<GetCodeImpactInputT> = {
  name: 'get_code_impact',
  description: DESCRIPTION,
  inputSchema: GetCodeImpactInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'get_code_impact',
      description: DESCRIPTION,
      parameters: {
        type: 'object',
        properties: {
          symbol: {
            type: 'string',
            description: 'Symbol name for TSA trace_impact',
          },
          file: {
            type: 'string',
            description: 'File path for codesight blast_radius',
          },
          directory: {
            type: 'string',
            description: 'Directory (defaults to project root)',
          },
          depth: {
            type: 'integer',
            minimum: 1,
            maximum: 5,
            description: 'Max blast-radius traversal depth (default 1)',
          },
        },
        required: ['symbol'],
        additionalProperties: false,
      },
    },
  },
  execute(input, projectRoot) {
    return executeGetCodeImpact(input, projectRoot);
  },
 };
--- a/apps/server/src/services/tools/codecontext/get_code_map.ts
+++ b/apps/server/src/services/tools/codecontext/get_code_map.ts
@@ -0,0 +1,192 @@
 import { spawn } from 'node:child_process';
 import { z } from 'zod';
 import type { ToolDef } from '../types.js';
 export const GetCodeMapInput = z.object({
  directory: z.string().optional().describe('Directory to scan (defaults to project root)'),
  compress: z.boolean().optional().describe('Apply DCP compression if payload exceeds threshold (default: true)'),
 });
 export type GetCodeMapInputT = z.infer<typeof GetCodeMapInput>;
 const DESCRIPTION =
  'DCP-compressed codebase context map. Returns filenames, sizes, import relationships in a compressed format. ' +
  'Use compress=false for full detail, compress=true (default) for token-efficient overview.';
 const BOOCONTEXT_PATH = '/opt/forks/boocontext/dist/standalone.js';
 const TOOL_TIMEOUT_MS = 30_000;
 const MAX_RESULT_BYTES = 32_768;
 export interface CodeMapResponse {
  result: string;
  truncated: boolean;
 }
 /**
 * Calls the boocontext MCP server over stdio JSON-RPC to invoke
 * the boocontext_map tool. Spawns the standalone binary, sends
 * initialize + tools/call, collects NDJSON responses, and kills
 * the child process.
 */
 function callBoocontextMap(args: Record<string, unknown>): Promise<CodeMapResponse> {
  return new Promise((resolve, reject) => {
    const child = spawn('node', [BOOCONTEXT_PATH], {
      stdio: ['pipe', 'pipe', 'pipe'],
    });
    let stdoutBuf = '';
    const lines: string[] = [];
    let timedOut = false;
    let resolved = false;
    const timer = setTimeout(() => {
      timedOut = true;
      child.kill('SIGKILL');
      reject(new Error(`boocontext MCP call timed out after ${TOOL_TIMEOUT_MS}ms`));
    }, TOOL_TIMEOUT_MS);
    function tryParse(): void {
      if (resolved || timedOut) return;
      // Accumulate complete NDJSON lines
      const parts = stdoutBuf.split('\n');
      stdoutBuf = parts.pop() ?? '';
      for (const p of parts) {
        const t = p.trim();
        if (t) lines.push(t);
      }
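      // Wait for the tools/call response (id 2). Matching on id rather than
      // line position tolerates notifications or log lines on stdout.
      let callResponse: any;
      for (const line of lines) {
        try {
          const msg = JSON.parse(line);
          if (msg?.id === 2) {
            callResponse = msg;
            break;
          }
        } catch {
          // Not JSON; ignore this line.
        }
      }
      if (callResponse === undefined) return;
      resolved = true;
      clearTimeout(timer);
      child.kill();
      try {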
        if (callResponse.error) {
          reject(new Error(`MCP error: ${callResponse.error.message}`));
          return;
        }
        const content = callResponse.result?.content;
        if (!content?.[0]?.text) {
          reject(new Error('Unexpected MCP response shape — missing content[0].text'));
          return;
        }
        // content[0].text is JSON-stringified VerdictEnvelope from boocontext
        const envelope = JSON.parse(content[0].text as string);
        const details = envelope.details;
        let result: string;
        if (details && typeof details === 'object' && 'data' in details) {
          // DcpEnvelope shape: { compressed, originalLength, compressedLength, data }
          if (details.compressed) {
            // Return the full DcpEnvelope as JSON so the LLM can pass it
            // transparently to a decompression step
            result = JSON.stringify(details);
          } else {
            // Uncompressed — data is the raw output
            result = details.data;
          }
        } else {
          result = JSON.stringify(details ?? envelope);
        }
        const truncated = Buffer.byteLength(result, 'utf-8') > MAX_RESULT_BYTES;
        if (truncated) {
          result = result.substring(0, MAX_RESULT_BYTES);
        }
        resolve({ result, truncated });
      } catch (e: any) {
        reject(new Error(`Failed to parse boocontext response: ${e.message}`));
      }
    }
    child.stdout!.on('data', (chunk: Buffer) => {
      if (timedOut) return;
      stdoutBuf += chunk.toString('utf-8');
      tryParse();
    });
    child.stderr!.on('data', (_chunk: Buffer) => {
      // Drained and discarded so the child cannot stall on a full stderr pipe.
    });
    child.on('error', (err: Error) => {
      clearTimeout(timer);
      if (!resolved) {
        resolved = true;
        reject(new Error(`boocontext spawn failed: ${err.message}`));
      }
    });
    child.on('close', () => {
      clearTimeout(timer);
      if (!resolved && !timedOut) {
        tryParse();
        if (!resolved) {
          resolved = true;
          reject(new Error('boocontext process closed without producing a valid response'));
        }
      }
    });
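    // Step 1: initialize. MCP expects protocolVersion, capabilities, and
    // clientInfo in params (values mirror get_code_impact.ts).
    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: 1,
        method: 'initialize',
        params: {
          protocolVersion: '2024-11-05',
          capabilities: {},
          clientInfo: { name: 'boocode-server', version: '1.0.0' },
        },
      }) + '\n',
    );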
    // Step 2: tools/call for boocontext_map
    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: 2,
        method: 'tools/call',
        params: { name: 'boocontext_map', arguments: args },
      }) + '\n',
    );
  });
 }
 export const getCodeMap: ToolDef<GetCodeMapInputT> = {
  name: 'get_code_map',
  description: DESCRIPTION,
  inputSchema: GetCodeMapInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'get_code_map',
      description: DESCRIPTION,
      parameters: {
        type: 'object',
        properties: {
          directory: { type: 'string', description: 'Directory to scan (defaults to project root)' },
          compress: {
            type: 'boolean',
            description: 'Apply DCP compression if payload exceeds threshold (default: true)',
          },
        },
        additionalProperties: false,
      },
    },
  },
  async execute(input, projectRoot): Promise<CodeMapResponse> {
    return callBoocontextMap({
      directory: input.directory ?? projectRoot,
      compress: input.compress ?? true,
    });
  },
 };
 export async function executeGetCodeMap(
  input: GetCodeMapInputT,
  projectRoot: string,
 ): Promise<CodeMapResponse> {
  return callBoocontextMap({
    directory: input.directory ?? projectRoot,
    compress: input.compress ?? true,
  });
 }
--- a/apps/server/src/services/tools/codecontext/get_codebase_overview.ts
+++ b/apps/server/src/services/tools/codecontext/get_codebase_overview.ts
@@ -3,6 +3,7 @@ import { makeCodecontextTool } from './factory.js';
 export const GetCodebaseOverviewInput = z.object({
  include_stats: z.boolean().optional(),
  compress: z.boolean().optional().describe('Apply DCP compression for large projects (>50 files)'),
 });
 export type GetCodebaseOverviewInputT = z.infer<typeof GetCodebaseOverviewInput>;
@@ -24,10 +25,18 @@ const { toolDef: getCodebaseOverview, execute: executeGetCodebaseOverview } =
          type: 'boolean',
          description: 'Include file count, symbol count, language stats. Defaults to true.',
        },
        compress: {
          type: 'boolean',
          description: 'Apply DCP compression for large projects (>50 files)',
        },
      },
      additionalProperties: false,
    },
-    mapArgs: (input) => ({ include_stats: input.include_stats ?? true }),
+    mapArgs: (input) => {
      const args: Record<string, unknown> = { include_stats: input.include_stats ?? true };
      if (input.compress) args['compress'] = true;
      return args;
    },
  });
 export { getCodebaseOverview, executeGetCodebaseOverview };
--- a/apps/server/src/services/tools/codecontext/get_type_info.ts
+++ b/apps/server/src/services/tools/codecontext/get_type_info.ts
@@ -0,0 +1,262 @@
 import { z } from 'zod';
 import { spawn } from 'node:child_process';
 import type { ToolDef } from '../types.js';
 import type { CodecontextResponse } from '../../codecontext_client.js';
 const BOOCONTEXT_PATH = '/opt/forks/boocontext/dist/standalone.js';
 const TRUNCATION_LIMIT = 32_000;
 export const GetTypeInfoInput = z.object({
  file: z.string().min(1).describe('File path to resolve types in'),
  symbol: z.string().optional().describe('Symbol name to resolve (supports regex)'),
  directory: z.string().optional().describe('Project directory for type resolution context'),
 });
 export type GetTypeInfoInputT = z.infer<typeof GetTypeInfoInput>;
 const DESCRIPTION =
  'TypeScript type recovery. Returns type signatures, interface definitions, ' +
  'generic constraints, and JSDoc for symbols in a file. Uses type-inject MCP server.';
 // ---- JSON-RPC-over-stdio MCP caller for boocontext --------------------------
 async function callBoocontext(
  toolName: string,
  args: Record<string, unknown>,
 ): Promise<CodecontextResponse> {
  const child = spawn(process.execPath, [BOOCONTEXT_PATH], {
    stdio: ['pipe', 'pipe', 'pipe'],
    timeout: 60_000,
  });
  let stderrBuf = '';
  child.stderr!.on('data', (chunk: Buffer) => {
    stderrBuf += chunk.toString('utf-8');
  });
  let killed = false;
  const killChild = () => {
    if (killed) return;
    killed = true;
    child.kill();
  };
  try {
    // Read one complete JSON-RPC response from stdout (handles both
    // Content-Length framed and newline-delimited transport).
    async function readResponse(timeoutMs = 30_000): Promise<unknown> {
      return new Promise((resolve, reject) => {
        const timer = setTimeout(() => {
          cleanup();
          reject(new Error('Timeout reading boocontext response'));
        }, timeoutMs);
        let buf = '';
        const cleanup = () => {
          clearTimeout(timer);
          child.stdout!.removeListener('data', onData);
          child.stdout!.removeListener('end', onEnd);
          child.stdout!.removeListener('error', onError);
        };
        const onData = (chunk: Buffer) => {
          buf += chunk.toString('utf-8');
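          let msg: unknown;
          try {
            msg = tryExtractMessage(buf);
          } catch (err) {
            // A complete but malformed frame would otherwise throw inside the
            // 'data' handler and surface as an uncaught exception.
            cleanup();
            reject(new Error(`Boocontext sent malformed JSON: ${(err as Error).message}`));
            return;
          }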
          if (msg !== null) {
            cleanup();
            resolve(msg);
            return;
          }
          if (buf.length > 1_024 * 1_024) {
            cleanup();
            reject(new Error('Boocontext response exceeded 1 MB'));
          }
        };
        const onEnd = () => {
          cleanup();
          if (buf.trim()) {
            try {
              resolve(JSON.parse(buf.trim()));
            } catch {
              reject(new Error('Boocontext stream ended with incomplete data'));
            }
          } else {
            reject(new Error('Boocontext stream ended unexpectedly'));
          }
        };
        const onError = (err: Error) => {
          cleanup();
          reject(err);
        };
        child.stdout!.on('data', onData);
        child.stdout!.on('end', onEnd);
        child.stdout!.on('error', onError);
      });
    }
    // Wait for the process to be fully spawned.
    await new Promise<void>((resolve, reject) => {
      child.on('error', reject);
      child.on('spawn', () => resolve());
    });
    // Step 1 — MCP initialize
    let reqId = 0;
    reqId++;
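    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: reqId,
        method: 'initialize',
        // MCP expects these params on initialize (values mirror get_code_impact.ts).
        params: {
          protocolVersion: '2024-11-05',
          capabilities: {},
          clientInfo: { name: 'boocode-server', version: '1.0.0' },
        },
      }) + '\n',
    );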
    const initResp = await readResponse() as { error?: { message: string } };
    if (initResp.error) {
      throw new Error(`Boocontext init failed: ${initResp.error.message}`);
    }
    // Step 2 — tools/call
    reqId++;
    child.stdin!.write(
      JSON.stringify({
        jsonrpc: '2.0',
        id: reqId,
        method: 'tools/call',
        params: { name: toolName, arguments: args },
      }) + '\n',
    );
    const callResp = await readResponse() as {
      error?: { message: string };
      result?: { content?: Array<{ type: string; text: string }> };
    };
    if (callResp.error) {
      throw new Error(`Boocontext tool call failed: ${callResp.error.message}`);
    }
    // Extract text from the MCP tool result shape:
    // { content: [{ type: "text", text: "…" }] }
    const content = callResp.result?.content;
    let text: string;
    if (Array.isArray(content) && content.length > 0 && content[0]!.type === 'text') {
      text = content[0]!.text;
    } else {
      text = JSON.stringify(callResp.result);
    }
    // Inline truncation at 32 KB.
    if (text.length > TRUNCATION_LIMIT) {
      const omitted = text.length - TRUNCATION_LIMIT;
      return {
        result:
          text.slice(0, TRUNCATION_LIMIT) +
          `\n\n[truncated, ${omitted} chars omitted; narrow with file or symbol filter]`,
        truncated: true,
      };
    }
    return { result: text, truncated: false };
  } finally {
    killChild();
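    // Give the process a moment to release resources. Skip the wait when it
    // has already exited: 'exit' would not fire again and we would idle for 2 s.
    if (child.exitCode === null && child.signalCode === null) {
      await new Promise<void>((resolve) => {
        const timer = setTimeout(resolve, 2_000);
        child.once('exit', () => {
          clearTimeout(timer);
          resolve();
        });
      });
    }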
  }
 }
 /**
 * Attempt to extract one complete JSON-RPC message from the head of a
 * buffer.  Handles both Content-Length framed and newline-delimited
 * formats.  Returns `null` when more data is needed.
 */
 function tryExtractMessage(buf: string): unknown | null {
  // --- Content-Length framed ---
  const headerEnd = buf.indexOf('\r\n\r\n');
  if (headerEnd !== -1) {
    const header = buf.substring(0, headerEnd);
    const lengthMatch = header.match(/Content-Length:\s*(\d+)/i);
    if (lengthMatch) {
      const contentLength = parseInt(lengthMatch[1]!, 10);
      const bodyStart = headerEnd + 4;
      if (buf.length >= bodyStart + contentLength) {
        const jsonStr = buf.substring(bodyStart, bodyStart + contentLength);
        return JSON.parse(jsonStr);
      }
      return null; // need more data
    }
    // Has \r\n\r\n but no Content-Length — junk segment; skip and retry.
    return tryExtractMessage(buf.substring(headerEnd + 4));
  }
  // --- Newline-delimited ---
  const nlIndex = buf.indexOf('\n');
  if (nlIndex !== -1) {
    const line = buf.substring(0, nlIndex).trim();
    if (line && line.startsWith('{')) {
      return JSON.parse(line);
    }
    // Non-JSON line (e.g. stderr echo), skip and continue.
    return tryExtractMessage(buf.substring(nlIndex + 1));
  }
  return null; // need more data
 }
 // ---- ToolDef ----------------------------------------------------------------
 export const getTypeInfo: ToolDef<GetTypeInfoInputT> = {
  name: 'get_type_info',
  description: DESCRIPTION,
  inputSchema: GetTypeInfoInput,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'get_type_info',
      description: DESCRIPTION,
      parameters: {
        type: 'object',
        properties: {
          file: { type: 'string', description: 'File path to resolve types in' },
          symbol: {
            type: 'string',
            description: 'Symbol name to resolve (supports regex)',
          },
          directory: {
            type: 'string',
            description: 'Project directory for type resolution context',
          },
        },
        required: ['file'],
        additionalProperties: false,
      },
    },
  },
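  async execute(input): Promise<CodecontextResponse> {
    const args: Record<string, unknown> = { file: input.file };
    if (input.symbol) args['symbol'] = input.symbol;
    // The schema advertises `directory`; forward it rather than dropping it.
    // Assumes boocontext_types accepts a `directory` argument.
    if (input.directory) args['directory'] = input.directory;
    return callBoocontext('boocontext_types', args);
  },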
 };
 /**
 * Standalone execute function matching the `execute` shape returned by
 * `makeCodecontextTool` — useful for direct callers and tests.
 *
 * Note: unlike the HTTP-backed codecontext tools this does NOT accept a
 * `fetcher` override because it communicates over stdio rather than HTTP.
 */
 export async function executeGetTypeInfo(
  input: GetTypeInfoInputT,
  _projectPath?: string,
 ): Promise<CodecontextResponse> {
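  const args: Record<string, unknown> = { file: input.file };
  if (input.symbol) args['symbol'] = input.symbol;
  // The schema advertises `directory`; forward it rather than dropping it.
  // Assumes boocontext_types accepts a `directory` argument.
  if (input.directory) args['directory'] = input.directory;
  return callBoocontext('boocontext_types', args);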
 }
--- a/apps/server/src/services/tools/codecontext/index.ts
+++ b/apps/server/src/services/tools/codecontext/index.ts
@@ -13,3 +13,9 @@ export { getBlastRadius } from './get_blast_radius.js';
 export { getHotFiles } from './get_hot_files.js';
 export { getRoutes } from './get_routes.js';
 export { getMiddleware } from './get_middleware.js';
 // v2.8.14-domain2-phase1: boocontext-backed tools.
 export { getCodeHealth } from './get_code_health.js';
 export { getCodeImpact } from './get_code_impact.js';
 export { getTypeInfo } from './get_type_info.js';
 export { getCodeMap } from './get_code_map.js';
 export { getWikiArticle } from './get_wiki_article.js';
--- a/apps/server/src/services/tools/execute-command.ts
+++ b/apps/server/src/services/tools/execute-command.ts
@@ -0,0 +1,132 @@
 /**
 * vWhale: run_command tool. Executes a command (no shell) in the project worktree
 * and returns stdout/stderr. Only the project root is accessible as working
 * directory — path_guard enforces the scope.
 *
 * Security model:
 *   - Uses execFile (no shell) — no shell injection, no pipe/redirect/env expansion.
 *   - args passed as array, never a string.
 *   - 30s default timeout, configurable per call (max 120s).
 *   - 32KB output cap with truncation (same pattern as web_fetch.ts).
 *   - Working directory restricted to project root via path_guard.
 *   - No background processes allowed (waits for completion).
 */
 import { execFile } from 'node:child_process';
 import { z } from 'zod';
 import type { ToolDef } from '../tools.js';
 const RunCommandInput = z.object({
  command: z.string().min(1).max(256),
  args: z.array(z.string()).default([]),
  description: z.string().max(256).optional(),
  timeout_ms: z.number().int().positive().max(120_000).optional(),
 });
 export type RunCommandInputT = z.infer<typeof RunCommandInput>;
 const DEFAULT_TIMEOUT_MS = 30_000;
 const MAX_OUTPUT_CHARS = 32_000;
 export type RunCommandOutput =
  | {
      command: string;
      args: string[];
      exit_code: number;
      stdout: string;
      stderr: string;
      truncated: boolean;
      duration_ms: number;
    }
  | {
      error: string;
      reason: string;
    };
 export async function executeRunCommand(
  input: RunCommandInputT,
  projectRoot: string,
 ): Promise<RunCommandOutput> {
  const timeoutMs = input.timeout_ms ?? DEFAULT_TIMEOUT_MS;
  const startTime = Date.now();
  return new Promise((resolve) => {
    const child = execFile(
      input.command,
      input.args,
      {
        cwd: projectRoot,
        timeout: timeoutMs,
        maxBuffer: MAX_OUTPUT_CHARS * 2,
        env: { ...process.env },
      },
      (err, stdout, stderr) => {
        const durationMs = Date.now() - startTime;
        // Truncate output if needed
        const truncated = stdout.length + stderr.length > MAX_OUTPUT_CHARS;
        const cappedStdout = truncated ? stdout.slice(0, MAX_OUTPUT_CHARS) : stdout;
        const cappedStderr = truncated ? stderr.slice(0, Math.max(MAX_OUTPUT_CHARS - cappedStdout.length, 0)) : stderr;
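        // A non-null err without a numeric code (timeout kill, maxBuffer
        // overflow) must not be reported as exit code 0.
        const errCode = err?.code;
        const exitCode = !err ? 0 : errCode === 'ENOENT' ? -1 : typeof errCode === 'number' ? errCode : 1;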
        resolve({
          command: input.command,
          args: input.args,
          exit_code: typeof exitCode === 'number' ? exitCode : 1,
          stdout: cappedStdout,
          stderr: cappedStderr,
          truncated,
          duration_ms: durationMs,
        });
      },
    );
  });
 }
 export const runCommand: ToolDef<RunCommandInputT> = {
  name: 'run_command',
  description:
    'Run a command (executed directly, not through a shell) in the project workspace and return stdout + stderr. ' +
    'The command runs in the project root directory. ' +
    'Use for: building, testing, linting, git operations, running scripts. ' +
    'Output is capped at 32KB. Timeout defaults to 30s (max 120s). ' +
    'Security: args are passed as array (no shell injection). No background processes.',
  inputSchema: RunCommandInput as unknown as z.ZodType<RunCommandInputT>,
  jsonSchema: {
    type: 'function',
    function: {
      name: 'run_command',
      description:
        'Execute a command in the project workspace. ' +
        'Use for builds, tests, linting, git commands, and scripts. ' +
        'The process runs with a 30s timeout and 32KB output cap.',
      parameters: {
        type: 'object',
        properties: {
          command: {
            type: 'string',
            description: 'Command to execute (e.g. pnpm, npm, npx, node, git, ls, cat).',
          },
          args: {
            type: 'array',
            items: { type: 'string' },
            description: 'Arguments as array (e.g. ["run", "build"]). Never embedded in a shell string.',
          },
          description: {
            type: 'string',
            description: 'Optional human-readable description of what this command does.',
          },
          timeout_ms: {
            type: 'integer',
            description: 'Timeout in milliseconds. Default 30000, max 120000.',
          },
        },
        required: ['command'],
        additionalProperties: false,
      },
    },
  },
  async execute(input, projectRoot) {
    return await executeRunCommand(input, projectRoot);
  },
 };
--- a/apps/server/src/services/tools/registry.ts
+++ b/apps/server/src/services/tools/registry.ts
@@ -19,6 +19,11 @@ import {
  getHotFiles,
  getRoutes,
  getMiddleware,
  getCodeHealth,
  getCodeImpact,
  getTypeInfo,
  getCodeMap,
  getWikiArticle,
 } from './codecontext/index.js';
 // v1.13.17-cross-repo-reads: cross-repo read grant request tool. Paired
 // with the pause-on-pending-grant branch in inference/tool-phase.ts and the
@@ -27,6 +32,14 @@ import { requestReadAccess } from '../request_read_access.js';
 // v2.6.x: read-only tool that reads a tab's transcript by its session-scoped
 // tab number. Needs DB/session context (ToolExecCtx 4th arg).
 import { readTabByNumber } from '../read_tab_by_number.js';
 // v2.x: memory management tools. file-based store with optional CoreTier
 // (SQLite FTS5 + vector) hybrid search backend.
 import { extractMemoryTool } from './extract_memory.js';
 import { manageMemoryTool } from './manage_memory.js';
 import { searchMemoryTool } from './search_memory.js';
 // vWhale: command execution tool. Spawns processes in the project worktree
 // with timeout and output cap. No shell — args are passed as array.
 import { runCommand } from './execute-command.js';
 // v1.13.3: alpha-sorted by tool.name at module load. llama.cpp's prompt
 // cache hits on byte-identical prefixes; the tool list lives near the top
@@ -75,6 +88,23 @@ export let ALL_TOOLS: ToolDef<unknown>[] = [
  // v2.6.x: read a tab's transcript by its session-scoped tab number.
  // Read-only; uses the ToolExecCtx 4th arg for DB/session access.
  readTabByNumber as ToolDef<unknown>,
  // v2.8.14-domain2-phase1: boocontext-backed tools. Backed by the boocontext
  // MCP server. All read-only. Health, impact, types, map analysis.
  getCodeHealth as ToolDef<unknown>,
  getCodeImpact as ToolDef<unknown>,
  getTypeInfo as ToolDef<unknown>,
  getCodeMap as ToolDef<unknown>,
  // v2.8.14-domain2-phase3: wiki mode + token-efficient scanning.
  getWikiArticle as ToolDef<unknown>,
  // v2.x: memory management tools. File-based store with optional CoreTier
  // (SQLite FTS5 + vector) hybrid search backend.
  extractMemoryTool as ToolDef<unknown>,
  manageMemoryTool as ToolDef<unknown>,
  searchMemoryTool as ToolDef<unknown>,
  // vWhale: command execution. Spawns processes in the project worktree.
  // Read-write; use with guard: restricted to project root via path_guard,
  // no shell injection (execFile, not exec).
  runCommand as ToolDef<unknown>,
 ].sort((a, b) => a.name.localeCompare(b.name));
 export let TOOLS_BY_NAME: Record<string, ToolDef<unknown>> = Object.fromEntries(
--- a/apps/server/src/types/api.ts
+++ b/apps/server/src/types/api.ts
@@ -127,6 +127,9 @@ export interface Agent {
  // bounded only by MAX_STEPS (200). 0 means "no tool calls allowed."
  steps: number | null;
  llama_extra_args: string[] | null;
  // vDeepSeek: thinking/reasoning effort for DeepSeek V4 models.
  // Maps to DeepSeek's reasoning_effort API param.
  reasoning_effort: 'off' | 'low' | 'medium' | 'high' | 'xhigh' | 'max' | null;
 }
 // One entry per malformed `## Name` block. Per-block errors don't fail the
@@ -206,6 +209,8 @@ export interface Message {
  tokens_used: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
  cache_tokens: number | null;
  reasoning_tokens: number | null;
  started_at: string | null;
  finished_at: string | null;
  created_at: string;
--- a/apps/web/src/api/client.ts
+++ b/apps/web/src/api/client.ts
@@ -34,6 +34,10 @@ import type {
  SessionAnalyticsRow,
  ContextWindowStats,
  TokenBreakdownAgg,
  ToolTraceResponse,
  MemoryEntry,
  DailyMemoryEntry,
  DreamEntry,
 } from './types';
 // v2.6 Phase 1-UX §9b: chat-scoped agent-session rows. Returned by
@@ -340,6 +344,10 @@ export const api = {
        method: 'POST',
        body: JSON.stringify({ tool_call_id: toolCallId, decision }),
      }),
    getTraces: (chatId: string, limit = 50, offset = 0) =>
      request<ToolTraceResponse>(
        `/api/chats/${chatId}/traces?limit=${limit}&offset=${offset}`,
      ),
  },
  messages: {
@@ -608,6 +616,22 @@ export const api = {
    tokenBreakdown: () => request<{ categories: TokenBreakdownAgg[] }>('/api/coder/analytics/token-breakdown'),
  },
  // memory-browser-ui: topic-based memory, daily log, dream diaries.
  memory: {
    list: (projectId: string) =>
      request<{ entries: MemoryEntry[] }>(
        `/api/memory?project_id=${encodeURIComponent(projectId)}`,
      ),
    daily: (projectId: string) =>
      request<{ entries: DailyMemoryEntry[] }>(
        `/api/memory/daily?project_id=${encodeURIComponent(projectId)}`,
      ),
    dreams: (projectId: string) =>
      request<{ entries: DreamEntry[] }>(
        `/api/memory/dreams?project_id=${encodeURIComponent(projectId)}`,
      ),
  },
  settings: {
    get: () => request<Record<string, unknown>>('/api/settings'),
    patch: (body: Record<string, unknown>) =>
--- a/apps/web/src/api/types.ts
+++ b/apps/web/src/api/types.ts
@@ -152,6 +152,8 @@ export interface Message {
  tokens_used: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
  cache_tokens: number | null;
  reasoning_tokens: number | null;
  // model-attribution: which model produced this assistant message (null for
  // user/system rows + pre-attribution messages). Rendered as a chip.
  model: string | null;
@@ -530,6 +532,8 @@ export type WsFrame =
      tokens_used?: number | null;
      ctx_used?: number | null;
      ctx_max?: number | null;
      cache_tokens?: number | null;
      reasoning_tokens?: number | null;
      started_at?: string | null;
      finished_at?: string | null;
      // model-attribution: the model that produced this assistant message.
@@ -557,6 +561,14 @@ export type WsFrame =
    }
  | { type: 'messages_deleted'; message_ids: string[]; chat_id?: string }
  | { type: 'chat_renamed'; chat_id: string; name: string }
  | {
      type: 'agent_snapshot';
      chat_id: string;
      agent?: string | null;
      model: string;
      mode?: string | null;
      turn_number: number;
    }
  // v1.11: published by services/compaction.ts after the new anchored
  // summary row lands. Carries the new summary row id for diagnostics; the
  // session-stream handler ignores the id and re-fetches the full message
@@ -600,6 +612,31 @@ export type WsFrame =
      run_status?: 'running' | 'completed' | 'failed' | 'cancelled';
      report?: string;
    }
  // tool trace frames: per-tool-call lifecycle tracking
  | {
      type: 'tool_trace_start';
      trace_id: string;
      message_id: string;
      chat_id: string;
      tool_name: string;
      tool_input: Record<string, unknown>;
      started_at: string;
    }
  | {
      type: 'tool_trace_finish';
      trace_id: string;
      message_id: string;
      chat_id: string;
      tool_name: string;
      tool_output?: string | null;
      latency_ms?: number;
      tokens_used?: number | null;
      cache_tokens?: number | null;
      reasoning_tokens?: number | null;
      error?: string;
      outcome?: string;
      finished_at: string;
    }
  // arena frames: battle lifecycle + per-contestant streaming
  | {
      type: 'battle_started';
@@ -626,8 +663,64 @@ export type WsFrame =
      winner_contestant_id?: string | null;
      analysis_ready?: boolean;
      cross_exam_id?: string;
    }
  // streaming v2: channel-delta frames. Each carries a monotonic seq for
  // out-of-order buffering and a channel discriminator; per-channel payloads
  // map to the equivalent legacy frame types after reordering.
  | {
      type: 'channel_delta';
      seq: number;
      channel: 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';
      message_id?: string;
      chat_id?: string;
      content?: string;
      tool_call?: ToolCall;
      tool_message_id?: string;
      tool_call_id?: string;
      output?: unknown;
      truncated?: boolean;
      error?: string;
      reason?: string;
      status?: 'running' | 'complete' | 'cancelled' | 'failed';
      tokens_used?: number | null;
      ctx_used?: number | null;
      ctx_max?: number | null;
      cache_tokens?: number | null;
      reasoning_tokens?: number | null;
      started_at?: string | null;
      finished_at?: string | null;
      model?: string | null;
      metadata?: MessageMetadata | null;
    };
 // tool traces: per-tool-call record returned by GET /api/chats/:id/traces.
 export interface ToolTrace {
  id: string;
  session_id: string;
  chat_id: string;
  message_id: string | null;
  turn_number: number;
  tool_name: string;
  tool_input: Record<string, unknown>;
  tool_output: string | null;
  started_at: string;
  finished_at: string | null;
  latency_ms: number | null;
  tokens_used: number | null;
  cache_tokens: number | null;
  reasoning_tokens: number | null;
  error: string | null;
  outcome: string | null;
  created_at: string;
 }
 export interface ToolTraceResponse {
  data: ToolTrace[];
  total: number;
  limit: number;
  offset: number;
 }
 // token-analyzer-ui: aggregate token/cost analytics types.
 export interface AnalyticsSummary {
  total_input_tokens: number;
@@ -656,3 +749,21 @@ export interface TokenBreakdownAgg {
  category: string;
  total_tokens: number;
 }
 // ── Memory browser types ────────────────────────────────────────────
 export interface MemoryEntry {
  id: string;
  topic: string;
  title: string;
  content: string;
  tags: string[];
 }
 export interface DailyMemoryEntry extends MemoryEntry {
  date: string;
 }
 export interface DreamEntry {
  date: string;
  content: string;
 }
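The `channel_delta` comment above says each frame carries a monotonic `seq` for out-of-order buffering. A minimal sketch of such a reorder buffer follows. It is not the hook's actual implementation: `DeltaFrame` and `ReorderBuffer` are hypothetical names, and the sketch assumes `seq` starts at a known value and increases by exactly 1 per frame, which the diff does not state.

```typescript
// Hypothetical, trimmed-down frame shape; the real WsFrame variant has more fields.
interface DeltaFrame {
  seq: number;
  channel: 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';
  content?: string;
}

// Holds frames that arrive early and releases them strictly in seq order.
// Assumption: seq values are contiguous (0, 1, 2, ...).
class ReorderBuffer {
  private pending = new Map<number, DeltaFrame>();
  private nextSeq: number;

  constructor(firstSeq = 0) {
    this.nextSeq = firstSeq;
  }

  // Returns the frames that became deliverable, in order (possibly empty).
  push(frame: DeltaFrame): DeltaFrame[] {
    if (frame.seq < this.nextSeq) return []; // duplicate or stale frame: drop
    this.pending.set(frame.seq, frame);

    const ready: DeltaFrame[] = [];
    let next = this.pending.get(this.nextSeq);
    while (next !== undefined) {
      ready.push(next);
      this.pending.delete(this.nextSeq);
      this.nextSeq += 1;
      next = this.pending.get(this.nextSeq);
    }
    return ready;
  }
}
```

A caller would feed every incoming `channel_delta` frame to `push` and apply only the returned frames to its state, so a frame with `seq: 1` that arrives before `seq: 0` is held until the gap closes.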
--- a/apps/web/src/components/MessageBubble.tsx
+++ b/apps/web/src/components/MessageBubble.tsx
@@ -156,9 +156,16 @@ function StatsLine({ message }: { message: Message }) {
        : `${ctxUsed} ctx`
      : null;
  const cacheHit = message.cache_tokens;
  const reasoning = message.reasoning_tokens;
  const cachePart = typeof cacheHit === 'number' && cacheHit > 0 ? `cache ${cacheHit}` : null;
  const reasoningPart = typeof reasoning === 'number' && reasoning > 0 ? `think ${reasoning}` : null;
  const parts: string[] = [`${tokens} tokens`];
  if (tps !== null) parts.push(`${tps.toFixed(1)} tok/s`);
  if (ctxPart) parts.push(ctxPart);
  if (cachePart) parts.push(cachePart);
  if (reasoningPart) parts.push(reasoningPart);
  return (
    <div className="text-[10px] font-mono text-muted-foreground">
--- a/apps/web/src/components/MessageList.tsx
+++ b/apps/web/src/components/MessageList.tsx
@@ -1,10 +1,14 @@
-import { useCallback, useEffect, useMemo, useRef } from 'react';
+import { useCallback, useEffect, useMemo, useRef, useState } from 'react';
 import { motion } from 'framer-motion';
 import { Virtuoso, type VirtuosoHandle } from 'react-virtuoso';
 import { Pin } from 'lucide-react';
 import type { Chat, Message } from '@/api/types';
 import { MessageBubble } from './MessageBubble';
 import { ToolCallGroup } from './ToolCallGroup';
 import { ToolCallLine, type ToolRun } from './ToolCallLine';
 import { AskUserInputCard } from './AskUserInputCard';
 import { RequestReadAccessCard } from './RequestReadAccessCard';
 import { MessageListErrorBoundary } from './MessageListErrorBoundary';
 interface Props {
  messages: Message[];
@@ -142,27 +146,63 @@ function stampCapHits(items: RenderItem[]): RenderItem[] {
  });
 }
 const SCROLL_THRESHOLD_PX = 150;
 export function MessageList({ messages, sessionChats }: Props) {
-  const endRef = useRef<HTMLDivElement>(null);
+  const virtuosoRef = useRef<VirtuosoHandle>(null);
  const scrollContainerRef = useRef<HTMLDivElement>(null);
  const isNearBottomRef = useRef(true);
  const renderedKeysRef = useRef(new Set<string>());
  const prefersReducedMotionRef = useRef(false);
  const [animateEnabled, setAnimateEnabled] = useState(true);
  const [pinMessageId, setPinMessageId] = useState<string | null>(() => {
    if (typeof window !== 'undefined') {
      const hash = window.location.hash;
      if (hash.startsWith('#pin=')) return hash.slice(5);
    }
    return null;
  });
  const renderItems = useMemo(() => stampCapHits(group(flatten(messages))), [messages]);
-  const handleScroll = useCallback(() => {
-    const el = scrollContainerRef.current;
-    if (!el) return;
-    isNearBottomRef.current =
-      el.scrollHeight - el.scrollTop - el.clientHeight < SCROLL_THRESHOLD_PX;
+  const pinIndex = useMemo(() => {
+    if (!pinMessageId) return -1;
+    return renderItems.findIndex(
+      (item) => item.kind === 'message' && item.message.id === pinMessageId,
+    );
  }, [pinMessageId, renderItems]);
  useEffect(() => {
    const mq = window.matchMedia('(prefers-reduced-motion: reduce)');
    prefersReducedMotionRef.current = mq.matches;
    const handler = (e: MediaQueryListEvent) => {
      prefersReducedMotionRef.current = e.matches;
    };
    mq.addEventListener('change', handler);
    return () => mq.removeEventListener('change', handler);
  }, []);
  useEffect(() => {
-    if (isNearBottomRef.current) {
-      endRef.current?.scrollIntoView({ block: 'end' });
+    const handler = () => {
+      const hash = window.location.hash;
      if (hash.startsWith('#pin=')) {
        setPinMessageId(hash.slice(5));
      } else {
        setPinMessageId(null);
      }
-  }, [messages]);
+    };
    window.addEventListener('hashchange', handler);
    return () => window.removeEventListener('hashchange', handler);
  }, []);
  const atBottomStateChange = useCallback((atBottom: boolean) => {
    isNearBottomRef.current = atBottom;
    setAnimateEnabled(atBottom);
  }, []);
  const scrollToPin = useCallback(() => {
    if (pinIndex >= 0 && virtuosoRef.current) {
      virtuosoRef.current.scrollToIndex({ index: pinIndex, align: 'center' });
    }
  }, [pinIndex]);
  if (messages.length === 0) {
    return (
@@ -173,46 +213,78 @@ export function MessageList({ messages, sessionChats }: Props) {
  }
  return (
-    <div className="flex-1 overflow-y-auto" ref={scrollContainerRef} onScroll={handleScroll}>
-      <div className="max-w-[1000px] mx-auto w-full px-6 py-4 space-y-4">
-        {renderItems.map((item) => {
-          if (item.kind === 'message') {
+    <MessageListErrorBoundary>
+    <div className="flex-1 flex flex-col">
+      {pinMessageId && pinIndex >= 0 && (
+        <div className="shrink-0 flex items-center gap-2 px-4 py-1.5 bg-primary/10 border-b border-primary/20 text-xs text-primary">
          <Pin className="size-3" />
          <span>Pinned message</span>
          <button
            type="button"
            onClick={scrollToPin}
            className="ml-auto underline hover:no-underline"
          >
            Jump to pinned
          </button>
        </div>
      )}
      <Virtuoso
        ref={virtuosoRef}
        className="flex-1"
        data={renderItems}
        followOutput="auto"
        overscan={5}
        atBottomStateChange={atBottomStateChange}
        itemContent={(index, item) => {
          const key = item.kind === 'message' ? `msg-${item.message.id}` : item.key;
          const isNew = !renderedKeysRef.current.has(key);
          if (isNew) renderedKeysRef.current.add(key);
          const reducedMotion = prefersReducedMotionRef.current;
          const delay = isNew && !reducedMotion ? Math.min(index * 0.04, 0.5) : 0;
          const shouldAnimate = isNew && animateEnabled;
          return (
            <div
              className="max-w-[1000px] mx-auto w-full px-6 py-2"
              id={item.kind === 'message' ? `msg-${item.message.id}` : undefined}
            >
              <motion.div
                initial={shouldAnimate ? { opacity: 0, y: 8 } : false}
                animate={{ opacity: 1, y: 0 }}
                transition={delay > 0 ? { duration: 0.2, delay } : { duration: 0 }}
              >
                {item.kind === 'message' ? (
                  <MessageBubble
                key={item.message.id}
                    message={item.message}
                    sessionChats={sessionChats}
                    capHitInfo={item.capHitInfo}
                  />
-            );
+                ) : item.kind === 'tool_run' ? (
-          }
+                  item.run.call.name === 'ask_user_input' ? (
          if (item.kind === 'tool_run') {
            if (item.run.call.name === 'ask_user_input') {
              return (
                    <AskUserInputCard
                  key={item.key}
                      toolCall={item.run.call}
                      toolResult={item.run.result}
                      chatId={item.chatId}
                    />
-              );
+                  ) : item.run.call.name === 'request_read_access' ? (
            }
            if (item.run.call.name === 'request_read_access') {
              return (
                    <RequestReadAccessCard
                  key={item.key}
                      toolCall={item.run.call}
                      toolResult={item.run.result}
                      chatId={item.chatId}
                    />
                  ) : (
                    <ToolCallLine run={item.run} />
                  )
                ) : (
                  <ToolCallGroup runs={item.runs} />
                )}
              </motion.div>
            </div>
          );
-            }
+        }}
-            return <ToolCallLine key={item.key} run={item.run} />;
+      />
          }
          return <ToolCallGroup key={item.key} runs={item.runs} />;
        })}
        <div ref={endRef} />
      </div>
    </div>
    </MessageListErrorBoundary>
  );
 }
--- a/apps/web/src/components/SessionTimeline.tsx
+++ b/apps/web/src/components/SessionTimeline.tsx
@@ -0,0 +1,188 @@
 import { useMemo } from 'react';
 import { Clock, Cpu, Hash, Layers, RefreshCw, X } from 'lucide-react';
 import { Button } from '@/components/ui/button';
 import { cn } from '@/lib/utils';
 import type { Message } from '@/api/types';
 interface TurnEntry {
  message: Message;
  turnNumber: number;
  elapsed: string;
  toolCallCount: number;
 }
 interface Props {
  messages: Message[];
  chatId: string;
  onClose: () => void;
  onScrollToMessage: (messageId: string) => void;
 }
 function formatElapsed(startedAt: string | null, finishedAt: string | null): string {
  if (!startedAt || !finishedAt) return '—';
  const start = new Date(startedAt).getTime();
  const end = new Date(finishedAt).getTime();
  if (Number.isNaN(start) || Number.isNaN(end)) return '—';
  const ms = end - start;
  if (ms < 0) return '—';
  if (ms < 1000) return `${ms}ms`;
  // Floor rather than round so 59.6s never renders as "60s" or "1m 60s".
  if (ms < 60_000) return `${Math.floor(ms / 1000)}s`;
  const mins = Math.floor(ms / 60_000);
  const secs = Math.floor((ms % 60_000) / 1000);
  return `${mins}m ${secs}s`;
 }
 /**
 * SessionTimeline — vertical timeline of assistant turns in a chat.
 *
 * Renders a side-panel overlay with each turn's model, tokens, duration,
 * and tool-call count. Clicking a turn scrolls the main chat to that
 * message. The latest turn shows a "Scroll to latest" restore button.
 */
 export function SessionTimeline({ messages, onClose, onScrollToMessage }: Props) {
  const turns = useMemo<TurnEntry[]>(() => {
    const assistantMsgs = messages.filter(
      (m) => m.role === 'assistant' && m.status === 'complete',
    );
    return assistantMsgs.map((message, i) => ({
      message,
      turnNumber: i + 1,
      elapsed: formatElapsed(message.started_at, message.finished_at),
      toolCallCount: message.tool_calls?.length ?? 0,
    }));
  }, [messages]);
  const latestTurnId = turns.length > 0 ? turns[turns.length - 1]!.message.id : null;
  return (
    <div className="absolute inset-y-0 right-0 w-80 z-20 bg-background border-l border-border shadow-xl flex flex-col overflow-hidden">
      {/* Header */}
      <div className="flex items-center justify-between px-3 py-2.5 border-b border-border shrink-0">
        <h3 className="text-sm font-semibold">Session Timeline</h3>
        <Button variant="ghost" size="icon-xs" onClick={onClose} aria-label="Close timeline">
          <X size={14} />
        </Button>
      </div>
      {/* Timeline entries */}
      <div className="flex-1 overflow-y-auto px-3 py-3">
        {turns.length === 0 ? (
          <div className="text-xs text-muted-foreground text-center py-8">
            No assistant turns yet.
          </div>
        ) : (
          <div className="relative">
            {turns.map((turn, i) => {
              const isLatest = turn.message.id === latestTurnId;
              return (
                <div key={turn.message.id} className="relative flex gap-3 pb-4 last:pb-0">
                  {/* Vertical connector line */}
                  {i < turns.length - 1 && (
                    <div className="absolute left-[11px] top-5 bottom-0 w-px bg-border" />
                  )}
                  {/* Timeline dot button */}
                  <button
                    type="button"
                    onClick={() => onScrollToMessage(turn.message.id)}
                    className="relative flex-shrink-0 mt-1 cursor-pointer focus:outline-none focus-visible:ring-2 focus-visible:ring-ring rounded-full"
                    aria-label={`Scroll to turn ${turn.turnNumber}`}
                  >
                    <div
                      className={cn(
                        'size-[22px] rounded-full border-2 flex items-center justify-center',
                        isLatest
                          ? 'border-primary bg-primary/10'
                          : 'border-muted-foreground/30 bg-background',
                      )}
                    >
                      <div
                        className={cn(
                          'size-2 rounded-full',
                          isLatest ? 'bg-primary' : 'bg-muted-foreground/50',
                        )}
                      />
                    </div>
                  </button>
                  {/* Content card */}
                  <div className="flex-1 min-w-0">
                    <div
                      className="rounded-lg border border-border bg-card p-2.5 cursor-pointer hover:bg-muted/40 transition-colors"
                      onClick={() => onScrollToMessage(turn.message.id)}
                    >
                      {/* Turn number + latest badge */}
                      <div className="flex items-center justify-between mb-1.5">
                        <span className="text-xs font-semibold text-foreground">
                          Turn {turn.turnNumber}
                        </span>
                        {isLatest && (
                          <span className="text-[10px] font-medium text-primary bg-primary/10 px-1.5 py-0.5 rounded-full leading-none">
                            Latest
                          </span>
                        )}
                      </div>
                      {/* Model name */}
                      <div className="flex items-center gap-1.5 text-xs text-muted-foreground mb-1.5">
                        <Cpu size={11} className="shrink-0" />
                        <span className="truncate">{turn.message.model ?? 'Unknown model'}</span>
                      </div>
                      {/* Token count with breakdown */}
                      {turn.message.tokens_used != null && (
                        <div className="flex items-center gap-1.5 text-xs text-muted-foreground mb-1 flex-wrap">
                          <Hash size={11} className="shrink-0" />
                          <span>{turn.message.tokens_used.toLocaleString()} total</span>
                          {turn.message.cache_tokens != null && turn.message.cache_tokens > 0 && (
                            <span className="text-blue-500 dark:text-blue-400">
                              ({turn.message.cache_tokens.toLocaleString()} cache)
                            </span>
                          )}
                          {turn.message.reasoning_tokens != null && turn.message.reasoning_tokens > 0 && (
                            <span className="text-amber-500 dark:text-amber-400">
                              ({turn.message.reasoning_tokens.toLocaleString()} reasoning)
                            </span>
                          )}
                        </div>
                      )}
                      {/* Duration + tool calls */}
                      <div className="flex items-center gap-3 text-xs text-muted-foreground">
                        <span className="inline-flex items-center gap-1">
                          <Clock size={11} />
                          {turn.elapsed}
                        </span>
                        {turn.toolCallCount > 0 && (
                          <span className="inline-flex items-center gap-1">
                            <Layers size={11} />
                            {turn.toolCallCount} tool call{turn.toolCallCount !== 1 ? 's' : ''}
                          </span>
                        )}
                      </div>
                    </div>
                    {/* Restore button for latest turn */}
                    {isLatest && (
                      <button
                        type="button"
                        onClick={(e) => {
                          e.stopPropagation();
                          onScrollToMessage(turn.message.id);
                        }}
                        className="mt-1.5 w-full inline-flex items-center justify-center gap-1 text-[11px] font-medium text-primary hover:text-primary/80 transition-colors py-1 rounded-md hover:bg-primary/5"
                      >
                        <RefreshCw size={11} />
                        Scroll to latest
                      </button>
                    )}
                  </div>
                </div>
              );
            })}
          </div>
        )}
      </div>
    </div>
  );
 }
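The duration label in `SessionTimeline` has a few boundary cases worth pinning down: missing timestamps, sub-second turns, and values just under a minute. The sketch below is a standalone variant for illustration only; `formatElapsedSketch` is a hypothetical name, and it truncates seconds with `Math.floor` so a value such as 59.6 s cannot be displayed as `60s`.

```typescript
// Standalone sketch of the elapsed-time label; not the component's own export.
function formatElapsedSketch(
  startedAt: string | null,
  finishedAt: string | null,
): string {
  if (!startedAt || !finishedAt) return '—';
  const start = new Date(startedAt).getTime();
  const end = new Date(finishedAt).getTime();
  if (Number.isNaN(start) || Number.isNaN(end)) return '—';

  const ms = end - start;
  if (ms < 0) return '—'; // clock skew or swapped timestamps
  if (ms < 1000) return `${ms}ms`;
  // Truncate so the seconds part always stays within 0..59.
  if (ms < 60_000) return `${Math.floor(ms / 1000)}s`;
  const mins = Math.floor(ms / 60_000);
  const secs = Math.floor((ms % 60_000) / 1000);
  return `${mins}m ${secs}s`;
}
```

Truncating slightly under-reports durations (by less than one second), which seems an acceptable trade for labels that never show a 60-second component.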
--- a/apps/web/src/components/TraceViewer.tsx
+++ b/apps/web/src/components/TraceViewer.tsx
@@ -0,0 +1,251 @@
 import { useCallback, useEffect, useMemo, useState } from 'react';
 import { ChevronDown, ChevronRight, AlertCircle } from 'lucide-react';
 import { api } from '@/api/client';
 import type { ToolTrace } from '@/api/types';
 interface Props {
  chatId: string;
 }
 // Max latency used as the 100% reference for the bar visualization
 const MAX_LATENCY_REF = 30_000; // 30s
 function latencyBarWidth(latencyMs: number | null): number {
  if (latencyMs == null) return 0;
  return Math.min(latencyMs / MAX_LATENCY_REF, 1);
 }
 function TraceRow({ trace }: { trace: ToolTrace }) {
  const [expanded, setExpanded] = useState(false);
  const isError = trace.outcome !== null && trace.outcome !== 'success';
  const barWidth = latencyBarWidth(trace.latency_ms);
  const latencyLabel =
    trace.latency_ms != null
      ? trace.latency_ms >= 1000
        ? `${(trace.latency_ms / 1000).toFixed(1)}s`
        : `${trace.latency_ms}ms`
      : null;
  return (
    <div className="border-b border-border/40 last:border-0">
      <button
        type="button"
        onClick={() => setExpanded((v) => !v)}
        className="flex items-center gap-2 w-full text-left px-2 py-1.5 hover:bg-muted/40 text-[11px]"
      >
        <span className="shrink-0 text-muted-foreground">
          {expanded ? <ChevronDown size={10} /> : <ChevronRight size={10} />}
        </span>
        <span className="font-medium truncate min-w-0">
          {trace.tool_name}
        </span>
        {isError && (
          <span className="shrink-0 text-destructive" title={trace.error ?? 'error'}>
            <AlertCircle size={10} />
          </span>
        )}
        <span className="shrink-0 text-muted-foreground font-mono tabular-nums min-w-[3rem] text-right">
          {latencyLabel ?? '—'}
        </span>
        <span className="flex-1 h-1.5 bg-muted rounded-full overflow-hidden min-w-[24px] max-w-[60px]">
          <span
            className="block h-full rounded-full bg-primary/30 transition-all"
            style={{ width: `${barWidth * 100}%` }}
          />
        </span>
        {trace.tokens_used != null && trace.tokens_used > 0 && (
          <span className="shrink-0 text-muted-foreground font-mono tabular-nums">
            {trace.tokens_used}t
          </span>
        )}
        {trace.cache_tokens != null && trace.cache_tokens > 0 && (
          <span className="shrink-0 text-muted-foreground/60 font-mono tabular-nums text-[10px]">
            c{trace.cache_tokens}
          </span>
        )}
        {trace.reasoning_tokens != null && trace.reasoning_tokens > 0 && (
          <span className="shrink-0 text-muted-foreground/60 font-mono tabular-nums text-[10px]">
            r{trace.reasoning_tokens}
          </span>
        )}
      </button>
      {expanded && (
        <div className="px-3 pb-2 space-y-1.5 text-[11px] border-t border-border/40 pt-1.5">
          <div>
            <span className="text-muted-foreground font-medium">Input</span>
            <pre className="mt-0.5 font-mono text-[10px] leading-relaxed text-muted-foreground bg-muted/30 rounded p-1.5 overflow-x-auto max-h-32 overflow-y-auto whitespace-pre-wrap break-all">
              {JSON.stringify(trace.tool_input, null, 1)}
            </pre>
          </div>
          {trace.tool_output != null && (
            <div>
              <span className="text-muted-foreground font-medium">Output</span>
              <pre className="mt-0.5 font-mono text-[10px] leading-relaxed text-muted-foreground bg-muted/30 rounded p-1.5 overflow-x-auto max-h-32 overflow-y-auto whitespace-pre-wrap break-all">
                {trace.tool_output.length > 2000
                  ? `${trace.tool_output.slice(0, 2000)}…`
                  : trace.tool_output}
              </pre>
            </div>
          )}
          {trace.error != null && (
            <div className="text-destructive text-[10px] font-mono leading-relaxed bg-destructive/10 rounded p-1.5">
              {trace.error}
            </div>
          )}
        </div>
      )}
    </div>
  );
 }
 function TraceGroup({ toolName, traces }: { toolName: string; traces: ToolTrace[] }) {
  const [collapsed, setCollapsed] = useState(false);
  const totalLatency = traces.reduce((sum, t) => sum + (t.latency_ms ?? 0), 0);
  const totalTokens = traces.reduce((sum, t) => sum + (t.tokens_used ?? 0), 0);
  const errorCount = traces.filter(
    (t) => t.outcome !== null && t.outcome !== 'success',
  ).length;
  return (
    <div>
      <button
        type="button"
        onClick={() => setCollapsed((v) => !v)}
        className="flex items-center gap-1.5 w-full text-left px-2 py-1 text-[11px] font-medium text-muted-foreground hover:bg-muted/30 sticky top-0 bg-background"
      >
        {collapsed ? <ChevronRight size={10} /> : <ChevronDown size={10} />}
        <span>{toolName}</span>
        <span className="text-muted-foreground/60 font-mono tabular-nums">
          ×{traces.length}
        </span>
        {totalTokens > 0 && (
          <span className="text-muted-foreground/60 font-mono tabular-nums text-[10px]">
            {totalTokens}t
          </span>
        )}
        {totalLatency > 0 && (
          <span className="text-muted-foreground/60 font-mono tabular-nums text-[10px]">
            {totalLatency >= 1000
              ? `${(totalLatency / 1000).toFixed(1)}s`
              : `${totalLatency}ms`}
          </span>
        )}
        {errorCount > 0 && (
          <span className="ml-auto text-destructive text-[10px] font-medium">
            {errorCount} error{errorCount > 1 ? 's' : ''}
          </span>
        )}
      </button>
      {!collapsed && traces.map((trace) => (
        <TraceRow key={trace.id} trace={trace} />
      ))}
    </div>
  );
 }
 export function TraceViewer({ chatId }: Props) {
  const [open, setOpen] = useState(false);
  const [traces, setTraces] = useState<ToolTrace[]>([]);
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const fetchTraces = useCallback(async () => {
    setLoading(true);
    setError(null);
    try {
      const res = await api.chats.getTraces(chatId);
      setTraces(res.data);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'failed to load traces');
    } finally {
      setLoading(false);
    }
  }, [chatId]);
  useEffect(() => {
    if (open) {
      void fetchTraces();
    }
  }, [open, fetchTraces]);
  const groups = useMemo(() => {
    const map = new Map<string, ToolTrace[]>();
    for (const t of traces) {
      const existing = map.get(t.tool_name);
      if (existing) {
        existing.push(t);
      } else {
        map.set(t.tool_name, [t]);
      }
    }
    return map;
  }, [traces]);
  const totalCount = traces.length;
  const errorCount = traces.filter(
    (t) => t.outcome !== null && t.outcome !== 'success',
  ).length;
  return (
    <div className="border-t">
      <button
        type="button"
        onClick={() => setOpen((v) => !v)}
        className="flex items-center gap-1.5 w-full px-3 py-1.5 text-[11px] font-medium text-muted-foreground hover:bg-muted/20"
      >
        {open ? <ChevronDown size={12} /> : <ChevronRight size={12} />}
        <span>Tool traces</span>
        {totalCount > 0 && (
          <span className="font-mono tabular-nums text-muted-foreground/60">
            {totalCount}
          </span>
        )}
        {errorCount > 0 && (
          <span className="text-destructive ml-auto text-[10px] font-medium">
            {errorCount} error{errorCount > 1 ? 's' : ''}
          </span>
        )}
        {loading && (
          <span className="ml-auto inline-block w-1.5 h-3 align-baseline bg-muted-foreground/60 animate-pulse" />
        )}
      </button>
      {open && (
        <div className="max-h-80 overflow-y-auto border-t border-border/40">
          {loading && traces.length === 0 && (
            <div className="px-3 py-4 text-[11px] text-muted-foreground text-center">
              Loading traces…
            </div>
          )}
          {error && (
            <div className="px-3 py-2 text-[11px] text-destructive">
              {error}
              <button
                type="button"
                onClick={() => void fetchTraces()}
                className="ml-2 underline hover:no-underline"
              >
                retry
              </button>
            </div>
          )}
          {!loading && !error && traces.length === 0 && (
            <div className="px-3 py-4 text-[11px] text-muted-foreground text-center">
              No tool traces yet.
            </div>
          )}
          {traces.length > 0 && (
            <div className="divide-y divide-border/40">
              {Array.from(groups.entries()).map(([toolName, groupTraces]) => (
                <TraceGroup
                  key={toolName}
                  toolName={toolName}
                  traces={groupTraces}
                />
              ))}
            </div>
          )}
        </div>
      )}
    </div>
  );
 }
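`TraceViewer` groups traces by `tool_name` and then derives per-group totals inside `TraceGroup`. The same aggregation can be sketched as a pure function, which is easier to unit test than the component. `TraceLike`, `GroupSummary`, and `summarizeByTool` are hypothetical names introduced here; only the field names and the error rule (`outcome` is non-null and not `'success'`) come from the diff.

```typescript
// Hypothetical subset of ToolTrace containing only the fields used below.
interface TraceLike {
  tool_name: string;
  latency_ms: number | null;
  outcome: string | null;
}

interface GroupSummary {
  toolName: string;
  count: number;
  totalLatencyMs: number;
  errorCount: number;
}

// Groups traces by tool name, preserving first-seen order (Map insertion order).
function summarizeByTool(traces: TraceLike[]): GroupSummary[] {
  const byTool = new Map<string, GroupSummary>();
  for (const t of traces) {
    let group = byTool.get(t.tool_name);
    if (!group) {
      group = { toolName: t.tool_name, count: 0, totalLatencyMs: 0, errorCount: 0 };
      byTool.set(t.tool_name, group);
    }
    group.count += 1;
    group.totalLatencyMs += t.latency_ms ?? 0; // unfinished traces count as 0 ms
    if (t.outcome !== null && t.outcome !== 'success') group.errorCount += 1;
  }
  return Array.from(byTool.values());
}
```

Note that a trace with `outcome: null` (still running, or never finished) is not counted as an error, matching the check used in the component.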
--- a/apps/web/src/components/panes/ChatPane.tsx
+++ b/apps/web/src/components/panes/ChatPane.tsx
@@ -1,11 +1,13 @@
 import { useCallback, useEffect, useRef, useState } from 'react';
-import { Pencil, Send, X } from 'lucide-react';
+import { History, Pencil, Send, X } from 'lucide-react';
 import { toast } from 'sonner';
 import { api } from '@/api/client';
 import { useSessionStream } from '@/hooks/useSessionStream';
 import { MessageList } from '@/components/MessageList';
 import { ChatInput } from '@/components/ChatInput';
 import { StaleStreamBanner } from '@/components/StaleStreamBanner';
 import { SessionTimeline } from '@/components/SessionTimeline';
 import { TraceViewer } from '@/components/TraceViewer';
 import { sendToChat } from '@/lib/events';
 interface Props {
@@ -25,6 +27,7 @@ interface Props {
 export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange, sessionChats, webSearchEnabled }: Props) {
  const stream = useSessionStream(sessionId);
  const lastErrorRef = useRef<string | null>(null);
  const [showTimeline, setShowTimeline] = useState(false);
  const [queue, setQueue] = useState<{ id: string; text: string }[]>([]);
  const queueIdRef = useRef(0);
  const processingRef = useRef(false);
@@ -203,11 +206,41 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
    }
  }
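  // Note: MessageList renders through Virtuoso, which only mounts rows near
  // the viewport. This lookup is a no-op for messages that are currently
  // virtualized out of the DOM.
  const handleScrollToMessage = useCallback((messageId: string) => {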
    const el = document.getElementById(`msg-${messageId}`);
    if (el) {
      el.scrollIntoView({ behavior: 'smooth', block: 'center' });
    }
  }, []);
  return (
-    <div className="flex flex-col h-full min-h-0">
+    <div className="flex flex-col h-full min-h-0 relative">
      {chatMessages.length > 0 && (
        <div className="absolute top-2 right-2 z-10">
          <button
            type="button"
            onClick={() => setShowTimeline((v) => !v)}
            className={`
              inline-flex items-center gap-1 px-2 py-1 rounded-md text-xs font-medium
              transition-colors border
              ${showTimeline
                ? 'bg-primary text-primary-foreground border-primary'
                : 'bg-background text-muted-foreground border-border hover:bg-muted hover:text-foreground'
              }
            `}
            aria-label={showTimeline ? 'Close timeline' : 'Open timeline'}
          >
            <History size={12} />
            Timeline
          </button>
        </div>
      )}
      {/* v1.11.5: ContextBar moved into ChatInput (above the agent picker). */}
      <MessageList messages={chatMessages} sessionChats={sessionChats} />
      <TraceViewer chatId={chatId} />
      {/* Queued messages */}
      {queue.length > 0 && (
        <div className="border-t">
@@ -275,6 +308,16 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
        messages={chatMessages}
        modelContextLimit={modelContextLimit}
      />
      {/* Timeline overlay panel */}
      {showTimeline && (
        <SessionTimeline
          messages={chatMessages}
          chatId={chatId}
          onClose={() => setShowTimeline(false)}
          onScrollToMessage={handleScrollToMessage}
        />
      )}
    </div>
  );
 }
--- a/apps/web/src/hooks/useSessionStream.ts
+++ b/apps/web/src/hooks/useSessionStream.ts
@@ -16,6 +16,133 @@ interface State {
  error: string | null;
 }
 type Channel = 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';
 // Per-channel out-of-order frame buffer with contiguous-seq flush logic.
 // Stores incoming channel_delta frames and releases them only when seq
 // becomes contiguous with the expected next value.
 class ChannelBuffer {
  private expectedSeq = 0;
  private buffer = new Map<number, ChannelDeltaWsFrame>();
  push(frame: ChannelDeltaWsFrame): ChannelDeltaWsFrame[] {
    if (frame.seq < this.expectedSeq) {
      return [];
    }
    if (frame.seq === this.expectedSeq) {
      this.expectedSeq++;
      const flushed = [frame];
      while (this.buffer.has(this.expectedSeq)) {
        const next = this.buffer.get(this.expectedSeq)!;
        this.buffer.delete(this.expectedSeq);
        this.expectedSeq++;
        flushed.push(next);
      }
      return flushed;
    }
    this.buffer.set(frame.seq, frame);
    return [];
  }
  get expectedNextSeq(): number {
    return this.expectedSeq;
  }
  get bufferedCount(): number {
    return this.buffer.size;
  }
  reset(seq = 0) {
    this.expectedSeq = seq;
    this.buffer.clear();
  }
 }
 type ChannelDeltaWsFrame = WsFrame & { type: 'channel_delta' };
 // Converts a flushed channel_delta into the equivalent legacy frame so the
 // existing applyFrame reducer handles the per-message mutation. Status
 // deltas are handled separately (they may need to create the message first
 // and apply throughput metadata independently of terminal status).
 function channelDeltaToLegacyFrame(delta: ChannelDeltaWsFrame): WsFrame | null {
  switch (delta.channel) {
    case 'text':
      return { type: 'delta', message_id: delta.message_id!, content: delta.content! };
    case 'tool_call':
      return { type: 'tool_call', message_id: delta.message_id!, tool_call: delta.tool_call! };
    case 'tool_result':
      return {
        type: 'tool_result',
        tool_message_id: delta.tool_message_id!,
        chat_id: delta.chat_id,
        tool_call_id: delta.tool_call_id!,
        output: delta.output,
        truncated: delta.truncated!,
        ...(delta.error ? { error: delta.error } : {}),
      };
    case 'error':
      return {
        type: 'error',
        message_id: delta.message_id,
        chat_id: delta.chat_id,
        error: delta.error!,
        ...(delta.reason ? { reason: delta.reason as never } : {}),
      };
    case 'status':
      return null;
  }
 }
 // Apply a flushed status channel_delta to state. Status deltas carry both
 // intermediate throughput metadata (tokens_used, ctx_used, model, etc.)
 // and optional terminal transitions (complete / cancelled / failed).
 function applyStatusDelta(state: State, delta: ChannelDeltaWsFrame): State {
  const { message_id, chat_id, status, channel: _c, seq: _s, type: _t, ...meta } = delta;
  if (!message_id) return state;
  let next = state;
  const exists = next.messages.some((m) => m.id === message_id);
  if (!exists && status === 'running') {
    next = applyFrame(next, {
      type: 'message_started',
      message_id,
      chat_id,
      role: 'assistant',
    });
  }
  const metaFields: Record<string, unknown> = {};
  if (meta.tokens_used !== undefined) metaFields.tokens_used = meta.tokens_used;
  if (meta.ctx_used !== undefined) metaFields.ctx_used = meta.ctx_used;
  if (meta.ctx_max !== undefined) metaFields.ctx_max = meta.ctx_max;
  if (meta.cache_tokens !== undefined) metaFields.cache_tokens = meta.cache_tokens;
  if (meta.reasoning_tokens !== undefined) metaFields.reasoning_tokens = meta.reasoning_tokens;
  if (meta.started_at !== undefined) metaFields.started_at = meta.started_at;
  if (meta.finished_at !== undefined) metaFields.finished_at = meta.finished_at;
  if (meta.model !== undefined) metaFields.model = meta.model;
  if (meta.metadata !== undefined) metaFields.metadata = meta.metadata;
  if (Object.keys(metaFields).length > 0) {
    next = {
      ...next,
      messages: next.messages.map((m) =>
        m.id === message_id ? { ...m, ...metaFields } : m,
      ),
    };
  }
  if (status === 'complete' || status === 'cancelled' || status === 'failed') {
    next = applyFrame(next, {
      type: 'message_complete',
      message_id,
      chat_id,
      status,
    });
  }
  return next;
 }
 function applyFrame(state: State, frame: WsFrame): State {
  switch (frame.type) {
    case 'snapshot': {
@@ -33,13 +160,13 @@ function applyFrame(state: State, frame: WsFrame): State {
        kind: 'message',
        tool_calls: null,
        tool_results: null,
        // v1.8.2: cap-hit sentinels arrive role='system' and are static, so
        // skipping the streaming dot for them keeps the UI accurate.
        status: frame.role === 'system' ? 'complete' : 'streaming',
        last_seq: 0,
        tokens_used: null,
        ctx_used: null,
        ctx_max: null,
        cache_tokens: null,
        reasoning_tokens: null,
        model: null,
        started_at: null,
        finished_at: null,
@@ -63,7 +190,7 @@ function applyFrame(state: State, frame: WsFrame): State {
      const next = state.messages.map((m) =>
        m.id === frame.message_id
          ? { ...m, tool_calls: [...(m.tool_calls ?? []), frame.tool_call] }
-          : m
+          : m,
      );
      return { ...state, messages: next };
    }
@@ -83,7 +210,7 @@ function applyFrame(state: State, frame: WsFrame): State {
                },
                status: 'complete' as const,
              }
-            : m
+            : m,
        );
        return { ...state, messages: next };
      }
@@ -106,6 +233,8 @@ function applyFrame(state: State, frame: WsFrame): State {
        tokens_used: null,
        ctx_used: null,
        ctx_max: null,
        cache_tokens: null,
        reasoning_tokens: null,
        model: null,
        started_at: null,
        finished_at: null,
@@ -123,22 +252,18 @@ function applyFrame(state: State, frame: WsFrame): State {
              ...(frame.tokens_used !== undefined ? { tokens_used: frame.tokens_used } : {}),
              ...(frame.ctx_used !== undefined ? { ctx_used: frame.ctx_used } : {}),
              ...(frame.ctx_max !== undefined ? { ctx_max: frame.ctx_max } : {}),
              ...(frame.cache_tokens !== undefined ? { cache_tokens: frame.cache_tokens } : {}),
              ...(frame.reasoning_tokens !== undefined ? { reasoning_tokens: frame.reasoning_tokens } : {}),
              ...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}),
              ...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}),
              ...(frame.model !== undefined ? { model: frame.model } : {}),
              // v1.8.2: cap-hit sentinels (and future stamped metadata) ride
              // in on this terminal frame so the reducer can attach it
              // without waiting for a refetch.
              ...(frame.metadata !== undefined ? { metadata: frame.metadata } : {}),
            }
-          : m
+          : m,
      );
      return { ...state, messages: next };
    }
    case 'usage': {
      // v1.12.2: live throughput. Side-effects into the module-level
      // singleton consumed by ChatThroughput; no message-state mutation.
      // chat_id is the optional ws-frame field; usage frames always include it.
      if (frame.chat_id) {
        recordUsage(frame.chat_id, {
          completion_tokens: frame.completion_tokens,
@@ -166,10 +291,6 @@ function applyFrame(state: State, frame: WsFrame): State {
      return state;
    }
    case 'error': {
      // v1.8.2: when the frame carries a structured reason, stamp it onto the
      // failed message's metadata so the bubble can render specifics inline
      // (the WS error frame is one-shot; refresh-safe rendering needs the
      // value persisted on the message).
      const errorMeta = frame.reason
        ? { kind: 'error' as const, error_reason: frame.reason, error_text: frame.error }
        : null;
@@ -181,47 +302,53 @@ function applyFrame(state: State, frame: WsFrame): State {
                  status: 'failed' as const,
                  ...(errorMeta ? { metadata: errorMeta } : {}),
                }
-              : m
+              : m,
          )
        : state.messages;
      return { ...state, messages: next, error: frame.error };
    }
    case 'compacted': {
-      // v1.11: side effects (refetch + toast) live in ws.onmessage; the
-      // reducer just no-ops so TS exhaustiveness is satisfied without
-      // duplicating async work inside a synchronous reducer.
+      return state;
+    }
+    case 'agent_snapshot': {
      return state;
    }
    case 'agent_status_updated': {
      // agent-status-normalize (#10): coder-only frame consumed by CoderPane's
      // own WS handler, not BooChat's native message reducer. No-op here to keep
      // TS exhaustiveness satisfied (native sessions never emit it).
      return state;
    }
    case 'flow_run_started':
    case 'flow_run_step_updated': {
      // Orchestrator frames consumed by OrchestratorPane's own subscription.
      // No-op here to keep TS exhaustiveness satisfied.
      return state;
    }
    case 'battle_started':
    case 'contestant_updated':
    case 'battle_updated': {
-      // Arena frames consumed by ArenaPane's own subscription.
-      // No-op here to keep TS exhaustiveness satisfied.
+      return state;
+    }
    case 'channel_delta': {
      return state;
    }
    default: {
      return state;
    }
  }
 }
 // Matches useUserEvents — exponential backoff with the same ceiling so the
 // two channels reconnect on the same cadence after a network handoff.
 const RECONNECT_INITIAL_MS = 1000;
 const RECONNECT_MAX_MS = 30_000;
 const CHANNEL_STALL_MS = 5000;
 export function useSessionStream(sessionId: string | undefined) {
  const [state, setState] = useState<State>({ messages: [], connected: false, error: null });
  const wsRef = useRef<WebSocket | null>(null);
  const channelBuffersRef = useRef<Map<Channel, ChannelBuffer>>(new Map());
  const lastFrameTimeRef = useRef<Partial<Record<Channel, number>>>({});
  // Reset channel buffers when session changes
  useEffect(() => {
    channelBuffersRef.current = new Map();
    lastFrameTimeRef.current = {};
  }, [sessionId]);
  useEffect(() => {
    if (!sessionId) return;
@@ -232,6 +359,73 @@ export function useSessionStream(sessionId: string | undefined) {
    let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
    let reconnectDelay = RECONNECT_INITIAL_MS;
    const getLastSeqPerChannel = () => {
      const seqs: Partial<Record<Channel, number>> = {};
      for (const [ch, buf] of channelBuffersRef.current) {
        seqs[ch] = buf.expectedNextSeq;
      }
      return seqs;
    };
    const flushDeltaToState = (delta: ChannelDeltaWsFrame) => {
      console.debug('FDS', delta.channel, 'flushed');
      if (delta.channel === 'status') {
        setState((s) => applyStatusDelta(s, delta));
      } else {
        const legacy = channelDeltaToLegacyFrame(delta);
        if (legacy) {
          setState((s) => applyFrame(s, legacy));
        }
      }
    };
    const handleChannelDelta = (frame: ChannelDeltaWsFrame) => {
      console.debug('HCD', frame.channel, frame.seq, 'bufs', channelBuffersRef.current.size);
      const buffers = channelBuffersRef.current;
      let buffer = buffers.get(frame.channel);
      if (!buffer) {
        buffer = new ChannelBuffer();
        buffers.set(frame.channel, buffer);
      }
      const flushed = buffer.push(frame);
      if (flushed.length === 0) return;
      for (const delta of flushed) {
        flushDeltaToState(delta);
      }
      let emittedRefresh = false;
      for (const delta of flushed) {
        if (delta.channel === 'status' && (delta.status === 'complete' || delta.status === 'cancelled' || delta.status === 'failed')) {
          emittedRefresh = true;
        }
      }
      if (emittedRefresh) {
        sessionEvents.emit({ type: 'git_diff_refresh' });
      }
      lastFrameTimeRef.current[frame.channel] = Date.now();
    };
    // Periodic channel stall check: if any channel has buffered frames
    // but no progress for 5s, force a snapshot refetch.
    let stallTimer: ReturnType<typeof setInterval> | null = null;
    const startStallTimer = () => {
      stallTimer = setInterval(() => {
        const now = Date.now();
        for (const [channel, buffer] of channelBuffersRef.current) {
          if (buffer.bufferedCount === 0) continue;
          const lastTime = lastFrameTimeRef.current[channel as Channel] ?? 0;
          if (now - lastTime >= CHANNEL_STALL_MS) {
            buffer.reset();
            sessionEvents.emit({ type: 'refetch_messages' });
          }
        }
      }, 1000);
    };
    const connect = () => {
      if (unmounted) return;
      const proto = window.location.protocol === 'https:' ? 'wss' : 'ws';
@@ -242,13 +436,16 @@ export function useSessionStream(sessionId: string | undefined) {
      ws.onopen = () => {
        reconnectDelay = RECONNECT_INITIAL_MS;
        setState((s) => ({ ...s, connected: true, error: null }));
        // Mid-stream reconnection protocol: send last known seq per channel
        // so the server can replay deltas or fall back to a full snapshot.
        const lastSeq = getLastSeqPerChannel();
        ws.send(JSON.stringify({ type: 'reconnect', lastSeqPerChannel: lastSeq }));
        startStallTimer();
      };
      ws.onmessage = (ev) => {
        // v1.13.11-a: Zod-validate every inbound frame. Fail-closed — invalid
        // frames are logged and dropped. WsFrameSchema is the runtime guard;
        // the hand-maintained WsFrame type stays as the narrowed dev-time
        // shape (Zod uses OpaqueObject for nested types like Message[]). One
        // cast bridges the two.
        let raw: unknown;
        try {
          raw = JSON.parse(typeof ev.data === 'string' ? ev.data : '');
@@ -266,13 +463,14 @@ export function useSessionStream(sessionId: string | undefined) {
        }
        try {
          const frame = validated.data as unknown as WsFrame;
-          // v1.11: on a compaction completion, re-fetch the message list so
-          // the new summary row + the cohort of compacted_at-stamped older
-          // rows render correctly. We dispatch the fresh list as a synthetic
-          // 'snapshot' frame so the reducer's existing path handles state
-          // replacement (no need for a parallel "refetched" path).
-          // The toast is purely UX feedback; missing it would still leave
-          // the chat in a valid state.
+
+          if (frame.type === 'channel_delta') {
+            console.debug('RAW_PARSE', JSON.stringify(validated.data).slice(0, 200));
+            console.debug('CD', frame.channel, frame.seq, JSON.stringify(frame).slice(0, 80));
+            handleChannelDelta(frame);
+            return;
+          }
          if (frame.type === 'compacted') {
            toast.success('Context compacted to free space');
            void api.messages
@@ -285,8 +483,9 @@ export function useSessionStream(sessionId: string | undefined) {
              });
            return;
          }
          setState((s) => applyFrame(s, frame));
-          // Trigger git diff refresh after each completed assistant turn.
+
          if (frame.type === 'message_complete') {
            sessionEvents.emit({ type: 'git_diff_refresh' });
          }
@@ -294,15 +493,18 @@ export function useSessionStream(sessionId: string | undefined) {
          console.warn('bad ws frame', err);
        }
      };
-      // v1.8.1: WS errors no longer surface as user-facing toasts here. The
+
      // user-channel hook (useUserEvents) owns the debounced "reconnecting…"
      // UI; this channel just reconnects silently on the same backoff.
      ws.onerror = () => {
        try { ws.close(); } catch {}
      };
      ws.onclose = () => {
        if (unmounted) return;
        setState((s) => ({ ...s, connected: false }));
        if (stallTimer) {
          clearInterval(stallTimer);
          stallTimer = null;
        }
        const delay = reconnectDelay;
        reconnectDelay = Math.min(reconnectDelay * 2, RECONNECT_MAX_MS);
        reconnectTimer = setTimeout(connect, delay);
@@ -314,6 +516,7 @@ export function useSessionStream(sessionId: string | undefined) {
    return () => {
      unmounted = true;
      if (reconnectTimer) clearTimeout(reconnectTimer);
      if (stallTimer) clearInterval(stallTimer);
      const ws = wsRef.current;
      wsRef.current = null;
      if (ws) try { ws.close(); } catch {}
--- a/codecontext/Dockerfile
+++ b/codecontext/Dockerfile
@@ -1,44 +0,0 @@
 # v2.8 — boocontext sidecar container.
 # Multi-stage build: Go shim from golang:1.24-alpine, boocontext MCP aggregator
 # from node:20-alpine, then an alpine:3.20 runtime holding both.
 #
 # The shim spawns boocontext as a child MCP process over stdio NDJSON,
 # translating HTTP requests to MCP tools/call.
 #
 # To stage the fork source for a Docker build:
 #   tar -czf codecontext/fork.tar.gz -C /opt/forks/boocontext \
 #     --exclude=.git --exclude=node_modules --exclude=dist
 # Stage 1: Go shim builder
 FROM golang:1.24-alpine AS shim-builder
 WORKDIR /build/shim
 RUN apk add --no-cache ca-certificates
 COPY go.mod ./
 COPY shim.go ./
 RUN CGO_ENABLED=0 GOOS=linux go build -o /build/shim-bin ./
 # Stage 2: boocontext MCP builder (pnpm project)
 FROM node:20-alpine AS boocontext-builder
 WORKDIR /build/boocontext
 RUN apk add --no-cache git python3 make g++ ca-certificates
 RUN npm install -g pnpm@9 --silent
 COPY fork.tar.gz /build/fork.tar.gz
 RUN mkdir -p /build/boocontext && tar -xzf /build/fork.tar.gz -C /build/boocontext
 WORKDIR /build/boocontext
 RUN pnpm install --frozen-lockfile && pnpm run build
 # Stage 3: Runtime
 FROM alpine:3.20
 # uv intentionally not installed — container network blocks astral.sh.
 # tree-sitter-analyzer child server (uvx) won't start in-container, but
 # boocontext logs a graceful warning; TSA-backed tools fall through.
 RUN apk add --no-cache ca-certificates nodejs
 COPY --from=shim-builder /build/shim-bin /usr/local/bin/shim
 COPY --from=boocontext-builder /build/boocontext/dist /usr/local/lib/boocontext/dist
 COPY --from=boocontext-builder /build/boocontext/node_modules /usr/local/lib/boocontext/node_modules
 COPY --from=boocontext-builder /build/boocontext/package.json /usr/local/lib/boocontext/package.json
 EXPOSE 8080
 HEALTHCHECK --interval=30s --timeout=5s --start-period=30s \
  CMD wget -qO- http://localhost:8080/health || exit 1
 ENTRYPOINT ["/usr/local/bin/shim"]
--- a/codecontext/README.md
+++ b/codecontext/README.md
@@ -0,0 +1,31 @@
 # codecontext — Go sidecar (DEPRECATED)
 > **Deprecated** (Phase 4, Domain 2, v2.8.14).
 >
 > Superseded by the **boocontext MCP server** (`apps/coder`). Do not add new
 > callers. The 16 codecontext tool wrappers still use this sidecar via HTTP at
 > `http://codecontext:8080/v1/{toolName}` for backward compatibility.
 ## Migration path
 1. Existing tool wrappers in `apps/server/src/services/tools/codecontext/` route
   through `callCodecontext()` in `codecontext_client.ts`, which calls this
   Go sidecar over HTTP.
 2. New callers should use the boocontext MCP server instead (reachable via the
   `boocontext` tool wrappers).
 3. After all callers have migrated, remove this directory, the `codecontext`
   service block from `docker-compose.yml`, and the
   `codecontext_client.ts`/`factory.ts` files.
 ## What it does
 A Go HTTP shim wrapping the boocontext MCP server's stdio interface. Provides
 code-graph analysis (symbols, callers, callees, file overview, etc.) over a
 REST API at `/v1/{toolName}`.
 ## Files
 - `shim.go` — HTTP server that wraps the boocontext MCP stdio process
 - `Dockerfile` — container build
 - `fork.tar.gz` — vendored boocontext source (gitignored)
 - `.codecontextignore.template` — default ignore patterns deployed per project
--- a/data/AGENTS.md
+++ b/data/AGENTS.md
@@ -6,7 +6,7 @@ Operating rules for every agent in this registry. Full procedures live in the `c
 **Worktrees** — Isolate work in a worktree when it is parallel to in-progress work, risky/experimental, a hotfix interrupting other work, or splits into independent units — just create when clear, propose in one line when ambiguous, skip quick/small single-stream work. Branch from a stable base (default branch); worktrees persist (never auto-remove or auto-merge); they isolate code state, not runtime (ports/DBs/services still collide). Full heuristic: invoke `using-worktrees`.
-**Sampling knobs** — Each `## Name` frontmatter block accepts these per-agent sampler fields, threaded into the llama-swap chat-completion request: `temperature`, `top_p`, `top_k`, `min_p`, `presence_penalty`, and (v2.6) `top_n_sigma`, `dry_multiplier`, `dry_base`, `dry_allowed_length`, `dry_penalty_last_n`. The `top_n_sigma` + `dry_*` repetition family curb the doom-loop-prone local model. Omit a field to leave it at the server default. Example: `top_n_sigma: 1.0`, `dry_multiplier: 0.8`, `dry_base: 1.75`, `dry_allowed_length: 2`, `dry_penalty_last_n: -1` (-1 = whole context).
+**Sampling knobs** — Each `## Name` frontmatter block accepts these per-agent sampler fields, threaded into the llama-swap chat-completion request: `temperature`, `top_p`, `top_k`, `min_p`, `presence_penalty`, and (v2.6) `top_n_sigma`, `dry_multiplier`, `dry_base`, `dry_allowed_length`, `dry_penalty_last_n`. The `top_n_sigma` + `dry_*` repetition family curb the doom-loop-prone local model. Omit a field to leave it at the server default. Example: `top_n_sigma: 1.0`, `dry_multiplier: 0.8`, `dry_base: 1.75`, `dry_allowed_length: 2`, `dry_penalty_last_n: -1` (-1 = whole context). DeepSeek V4 models also accept `reasoning_effort` (low/medium/high/xhigh/max); omit to disable thinking mode. Example: `reasoning_effort: 'high'`.
 **Reasoning budget** — To cap a reasoning model's thinking tokens, pass `--reasoning-budget` through `llama_extra_args` (already permitted by the deny-list validator; routes the agent to llama-sidecar). Example frontmatter line: `llama_extra_args: ["--reasoning-budget", "2048"]`. This is a sidecar process flag, not a chat-completion body param — distinct from the sampling knobs above.
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -7,7 +7,6 @@ services:
      - "100.114.205.53:9500:3000"
    env_file: .env
    environment:
      CODECONTEXT_URL: http://codecontext:8080
      CONTAINER_GUIDANCE_FILE: /app/BOOCHAT.md
      DATABASE_URL: postgres://boocode:${POSTGRES_PASSWORD}@boocode_db:5432/boochat
      BOOCODER_URL: http://100.114.205.53:9502
@@ -91,41 +90,6 @@ services:
    networks:
      - boocode_net
  # v1.12 Track B: codecontext sidecar. Stdio MCP server wrapped by a small
  # HTTP shim (see ./codecontext/). No host port — reached from boocode at
  # http://codecontext:8080 over the boocode_net bridge.
  #
  # Mounts /opt:/opt:ro (not just /opt/projects:ro): BooCode projects live
  # at /opt/<slug> on the host, not exclusively under /opt/projects. The
  # mount must cover anywhere a project.path could resolve to. Read-only
  # because codecontext only analyzes — never writes. The model can't
  # arbitrarily set target_dir to a sensitive subtree because the B.2
  # wrappers validate target_dir against project.path before calling the
  # shim, and the shim isn't reachable from outside boocode_net.
  codecontext:
    build:
      context: ./codecontext
    container_name: boocode_codecontext
    ports:
      - "127.0.0.1:8080:8080"
    restart: unless-stopped
    environment:
      CODECONTEXT_CHILD: node /usr/local/lib/boocontext/dist/index.js --mcp
      TYPE_INJECT_MCP_PATH: /opt/type-inject/packages/mcp/dist/index.js
      TREE_SITTER_MCP_CMD: uvx
      TREE_SITTER_MCP_ARGS: --from tree-sitter-analyzer[mcp] tree-sitter-analyzer-mcp
    networks:
      - boocode_net
    volumes:
      - /opt:/opt:ro
      - /opt/forks:/opt/forks:ro
    healthcheck:
      test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s
 volumes:
  boocode_pgdata:
--- a/openspec/changes/paseo-orchestrator/proposal.md
+++ b/openspec/changes/paseo-orchestrator/proposal.md
@@ -0,0 +1,107 @@
 # Paseo-like Orchestrator — Trace Observability, Dynamic Workflows & Agent Runtime
 **Status:** Proposed
 **Epic:** paseo-orchestrator
 **Depends on:** v2.7.17-orchestrator
 ## Why
 BooCode's Orchestrator (v2.7.17) runs deterministic Han analysis flows — but it's a fixed pipeline, not a general-purpose agent runtime. Every tool call is opaque: no timing, no cost breakdown, no replay. Sessions evaporate on browser refresh. Workflows are hardcoded. Subagents block until completion. And there's zero visibility into cache efficiency on DeepSeek — despite prompt caching being a major cost lever.
 The current architecture treats the LLM as a black box and the agent as a one-shot transaction. To move from "read-only chat" to a **Paseo-style thin-client orchestration layer**, BooCode needs five capabilities that compound on each other:
 1. **Observability** — Every tool call timed, logged, and live-streamed. Without it, debugging agent behavior is guesswork.
 2. **Persistence** — Agent state survives browser refresh. Active sessions resume where they left off.
 3. **Dynamic Workflows** — User-authored JS scripts using `agent()`, `parallel()`, `pipeline()` instead of hardcoded flows. Hash-based caching skips completed steps on re-run.
 4. **Background Subagents** — `spawn_subagent` returns immediately, results collected later. Unlocks parallel research, long-running analyses, and notification-based workflows.
 5. **Multi-modal + Cache Shape** — Image attachments forwarded to DeepSeek's vision API, plus per-turn cache hit rate visualization to close the cost feedback loop.
 Each phase is independently valuable; together they transform BooCode from a chat UI into a durable agent execution platform.
 ## What Changes
 ### Phase 1: Trace System + Observability (3-4 days)
 1. **Create `tool_traces` DB table** — id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome. Applied idempotently via `applySchema()`.
 2. **Add `tool_trace` WS frame** — new WsFrame variant in `@boocode/contracts` published by the server when a tool call starts and completes. Frontend receives live timing deltas via `useSessionStream`.
 3. **Instrument `tool-phase.ts`** — wrap `executeToolCall` with `clock_timestamp()` start/end, extract token counts from LLM response metadata, publish `tool_trace` frames on start (with input) and finish (with output + metrics).
 4. **Add GET `/api/chats/:id/traces`** — paginated endpoint returning trace rows ordered by turn_number + started_at. Supports cursor-based pagination for large sessions.
 5. **Build trace viewer pane** — collapsible tree per turn, timing bars showing latency relative to turn duration, expand/collapse per tool call showing input/output. Integrates into the existing multi-pane workspace alongside chat, coder, and orchestrator panes.
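
The cursor-based pagination in item 4 could be sketched roughly as below. This is an in-memory illustration only, not the endpoint itself: `TraceRow`, `pageTraces`, and the JSON cursor encoding are hypothetical names and choices, and `started_at` is assumed to be epoch milliseconds. The real endpoint would presumably express the same ordering in SQL.

```typescript
// Hypothetical row shape: a subset of the proposed tool_traces columns.
interface TraceRow {
  id: string;
  turn_number: number;
  started_at: number; // epoch milliseconds, assumed for this sketch
  tool_name: string;
}

interface TracePage {
  rows: TraceRow[];
  nextCursor: string | null; // null once the last page has been served
}

// Order by turn_number, then started_at, with id as a tie-breaker so the
// ordering is total and a cursor always points at exactly one position.
function compareTraces(a: TraceRow, b: TraceRow): number {
  if (a.turn_number !== b.turn_number) return a.turn_number - b.turn_number;
  if (a.started_at !== b.started_at) return a.started_at - b.started_at;
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
}

// Assumed cursor encoding: the sort key of the last row already returned.
function encodeCursor(row: TraceRow): string {
  return JSON.stringify([row.turn_number, row.started_at, row.id]);
}

function decodeCursor(cursor: string): TraceRow {
  const [turn_number, started_at, id] = JSON.parse(cursor) as [number, number, string];
  return { id, turn_number, started_at, tool_name: '' };
}

// Return up to `limit` rows that sort strictly after the cursor position.
function pageTraces(all: TraceRow[], limit: number, cursor: string | null): TracePage {
  const sorted = [...all].sort(compareTraces);
  let start = 0;
  if (cursor !== null) {
    const last = decodeCursor(cursor);
    const idx = sorted.findIndex((row) => compareTraces(row, last) > 0);
    start = idx === -1 ? sorted.length : idx;
  }
  const rows = sorted.slice(start, start + limit);
  const hasMore = start + rows.length < sorted.length;
  const lastRow = rows[rows.length - 1];
  return { rows, nextCursor: hasMore && lastRow ? encodeCursor(lastRow) : null };
}
```

Keying the cursor on the sort key rather than on a row offset means rows inserted while a client is paging do not shift the pages that client has already seen.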
 ### Phase 2: Session Persistence + Resume (2-3 days)
 6. **Serialize agent state to DB** — on each turn boundary (before and after the tool-call loop), snapshot the active `AgentSession` state (provider config, turn history, pending tool calls) to a JSONB column in `agent_sessions`. Uses `clock_timestamp()` for ordering.
 7. **Restore on WS reconnect** — when `snapshot` frame arrives on reconnection, check for a persisted `AgentSession` in `in_progress` or `awaiting_input` state. Rehydrate the coder pane to match the persisted turn, tool call, and pending state.
 8. **Agent session timeline view** — a timeline component in the coder pane showing the history of all turns in the current agent session. Each turn shows start time, tool count, token usage, cache hit rate. Clicking a turn scrolls to that point in the conversation.
 ### Phase 3: Dynamic Workflow Engine (5-7 days)
 9. **Create `isolated-vm` sandbox** — restricted JS execution environment for workflow scripts. No `require`, `fs`, `net`, `child_process`. Only the workflow API surface exposed. Token budget enforcement kills runaway scripts.
 10. **Implement workflow API primitives** — `agent(id, { prompt, model, tools, budget })` defines a sub-agent; `parallel([agent1, agent2])` runs N agents concurrently with a shared token budget; `pipeline([step1, step2])` chains agents sequentially; `phase(name, { agents, budget })` groups agents under a named phase; `budget(limit)` sets token or step limits; `log(msg)` emits structured workflow log. Compatible with Claude Code workflow script format.
 11. **Workflow file discovery** — scan `.boocode/workflows/*.js` (project-local), `~/.boocode/workflows/*.js` (global), and a built-in catalog directory. Each file exports a `workflow` object with `{name, description, run}`. Discovery runs on server start and on file change (optional watch mode).
 12. **Workflow manager + built-in catalog** — `WorkflowManager` class with `list()`, `get(name)`, `run(workflow, args)`, `cancel(runId)`, `status(runId)`. Concurrency limits (configurable max concurrent runs), token budgets per run. Built-in catalog includes: `deep-research` (parallel source search → per-source analysis → synthesis), `multi-review` (code health + security + standards reviews in parallel), `plan-verify` (generate plan → verify plan → generate tasks), `bounty-hunt` (parallel vulnerability scanning with different focuses).
 13. **Workflow resumability** — SHA-256 hash of each agent spec (prompt + options). Before executing an agent, check if a completed result exists with the same hash. Skip cached agents, only execute new/changed ones. In-memory LRU cache for current session, optional DB persistence for cross-session reuse.
 14. **Workflow UI integration** — extend the existing Orchestrator panel (used for Han flows) to support dynamic workflows. Workflow selector dropdown, live run pane with step-by-step progress, cancel button, log output stream, per-agent timing. Reuses the same run-pane component pattern.
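
The hash-based resumability in item 13 might look something like the following sketch. It assumes a synchronous `execute` callback and a plain `Map` in place of the LRU cache the proposal calls for. `AgentSpec`, `specHash`, `stableStringify`, and `AgentResultCache` are illustrative names, not part of any existing API. Keys are sorted before hashing so that two specs differing only in property order share a cache entry.

```typescript
import { createHash } from 'node:crypto';

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical agent spec: the prompt plus whatever options affect the result.
interface AgentSpec {
  prompt: string;
  options: { [key: string]: Json };
}

// Serialize with object keys sorted so property order does not change the hash.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return `{${entries.map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`).join(',')}}`;
}

// SHA-256 over the canonical form of prompt + options, hex-encoded.
function specHash(spec: AgentSpec): string {
  const canonical = stableStringify({ prompt: spec.prompt, options: spec.options });
  return createHash('sha256').update(canonical).digest('hex');
}

// Plain Map for brevity; an LRU with eviction would replace it in practice.
class AgentResultCache {
  private readonly results = new Map<string, string>();

  // Run `execute` only when no completed result exists for this spec's hash.
  runOnce(
    spec: AgentSpec,
    execute: (spec: AgentSpec) => string,
  ): { output: string; cached: boolean } {
    const key = specHash(spec);
    const hit = this.results.get(key);
    if (hit !== undefined) return { output: hit, cached: true };
    const output = execute(spec);
    this.results.set(key, output);
    return { output, cached: false };
  }
}
```

Only successful results are stored here, so a step that throws would be retried on the next run. Whether failed steps should also be cached is a design decision the proposal leaves open.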
 ### Phase 4: Background Subagents (2-3 days)
 15. **Background task queue** — uses the existing `tasks` table with a new `background` type. `spawn_subagent` tool creates a task row and returns immediately. A background worker picks up the task and executes it without blocking the calling agent.
 16. **`subagent_status` + `subagent_result` tools** — `subagent_status(task_id)` returns `running|completed|failed` with optional progress info. `subagent_result(task_id)` returns the full output when completed. Polling-based (no WS push for background tasks initially).
 17. **Background agent pane** — new pane type showing running/completed background agents. Each entry shows name, status, duration, progress. Completed entries show a "View Result" action. Notifications hook into the existing notification system (toast on completion, badge count for active tasks).
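The spawn/poll contract in items 15 and 16 could look roughly like the sketch below. It is illustrative only: an in-memory `Map` stands in for the `tasks` table, `BackgroundQueue` and `workOnce` are hypothetical names, and jobs are synchronous functions so that the worker step can be driven explicitly.

```typescript
type TaskStatus = "running" | "completed" | "failed";

// Stand-in for a row in the `tasks` table with the new `background` type.
interface BackgroundTask {
  id: string;
  type: "background";
  status: TaskStatus;
  output?: string;
  error?: string;
}

class BackgroundQueue {
  private tasks = new Map<string, BackgroundTask>();
  private pending: Array<{ id: string; job: () => string }> = [];
  private nextId = 1;

  // spawn_subagent: record the task and return its id without running the job.
  spawnSubagent(job: () => string): string {
    const id = `task-${this.nextId++}`;
    this.tasks.set(id, { id, type: "background", status: "running" });
    this.pending.push({ id, job });
    return id;
  }

  // One worker step: pick up the oldest pending task and execute it.
  // Returns false when there was nothing to do.
  workOnce(): boolean {
    const next = this.pending.shift();
    if (!next) return false;
    const task = this.tasks.get(next.id);
    if (!task) return false;
    try {
      task.output = next.job();
      task.status = "completed";
    } catch (err) {
      task.error = err instanceof Error ? err.message : String(err);
      task.status = "failed";
    }
    return true;
  }

  // subagent_status: polling-friendly status lookup.
  subagentStatus(id: string): TaskStatus | undefined {
    return this.tasks.get(id)?.status;
  }

  // subagent_result: full output, available only once the task completed.
  subagentResult(id: string): string | undefined {
    const task = this.tasks.get(id);
    return task && task.status === "completed" ? task.output : undefined;
  }
}
```

The calling agent gets a task id back immediately and polls `subagentStatus` until it leaves `running`; in the real system the worker would run independently rather than being stepped by hand.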
 ### Phase 5: Multi-modal + Cache Shape (2-3 days)
 18. **Image/file attachment pipeline** — accept file uploads (drag-drop or file picker), store on tmpfs with a reference in the message row. Forward to DeepSeek's multimodal API as base64-encoded image parts. Size limit enforcement (configurable, default 20MB per attachment).
 19. **Image render in message bubble** — render attached images inline in the chat message bubble. Lightbox on click for expanded view. Thumbnail generation for large images to keep chat scrolling performant.
 20. **Cache shape telemetry** — extract `prompt_cache_hit_tokens` from DeepSeek provider metadata on each turn. Break down by segment: system prompt, tool schemas, conversation history. Store in `tool_traces` columns and/or a dedicated `cache_stats` table.
 21. **Cache hit rate visualization** — per-turn cache hit bar in the trace viewer (showing cached vs non-cached tokens). Cumulative cache hit rate in the session footer. Highlight when a turn achieves high cache reuse (green indicator) or unusually low (yellow/red).
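The per-turn and cumulative figures in items 20 and 21 reduce to simple ratios. The sketch below assumes each turn records its total prompt tokens alongside the `prompt_cache_hit_tokens` value; the `TurnUsage` shape, the function names, and the 0.8 / 0.4 colour thresholds are assumptions made for illustration, not values from the plan.

```typescript
// Hypothetical per-turn record; cacheHitTokens would come from the
// provider's `prompt_cache_hit_tokens` metadata.
interface TurnUsage {
  promptTokens: number;
  cacheHitTokens: number;
}

// Fraction of prompt tokens served from cache for a single turn.
function hitRate(turn: TurnUsage): number {
  return turn.promptTokens === 0 ? 0 : turn.cacheHitTokens / turn.promptTokens;
}

// Session-level rate: total cached tokens over total prompt tokens, so
// large turns weigh more than small ones.
function cumulativeHitRate(turns: TurnUsage[]): number {
  let prompt = 0;
  let cached = 0;
  for (const turn of turns) {
    prompt += turn.promptTokens;
    cached += turn.cacheHitTokens;
  }
  return prompt === 0 ? 0 : cached / prompt;
}

// Assumed thresholds for the green / yellow / red indicator.
function indicator(rate: number): "green" | "yellow" | "red" {
  if (rate >= 0.8) return "green";
  if (rate >= 0.4) return "yellow";
  return "red";
}
```

Summing tokens before dividing (rather than averaging per-turn rates) keeps the session footer consistent with the actual share of tokens that were cached.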
 ## Non-Goals
 - No changes to the existing Han flow orchestrator (runs alongside dynamic workflows)
 - No removal of existing agent dispatch paths (PTY, ACP, Claude SDK — dynamic workflows are additive)
 - No distributed execution (all orchestration is single-node)
 - No persistent workflow file watching by default (manual reload or server restart picks up new workflows; the watch mode in step 11 stays optional)
 - No workflow editing UI (workflows are authored as JS files)
 ## Capabilities
 ### New Capabilities
 - **Tool trace viewer** — every tool call with timing, token costs, cache breakdown, expandable input/output
 - **Agent session resume** — browser refresh preserves active agent state
 - **Dynamic workflows** — user-authored JS scripts with `agent()/parallel()/pipeline()` API
 - **Workflow resumability** — hash-based step caching skips completed agents on re-run
 - **Built-in workflow catalog** — deep-research, multi-review, plan-verify, bounty-hunt
 - **Background subagents** — non-blocking spawn with deferred result collection
 - **Multi-modal support** — image attachments forwarded to DeepSeek vision API
 - **Cache shape telemetry** — per-turn and cumulative cache hit rate visualization
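To make the dynamic-workflow capability concrete, here is a hedged sketch of what a workflow file might contain. The plan names `agent()`, `parallel()`, and `pipeline()` and the `{name, description, run}` export shape; everything else here is an assumption, including the choice to model the primitives as declarative builders that return a plan tree (the real primitives may execute agents directly) and the `example-review` workflow itself.

```typescript
// Hypothetical plan-tree node types.
interface AgentNode {
  kind: "agent";
  id: string;
  prompt: string;
  model?: string;
  budget?: number;
}
interface GroupNode {
  kind: "parallel" | "pipeline";
  children: WorkflowNode[];
}
type WorkflowNode = AgentNode | GroupNode;

// Builders mirroring the primitives named in the plan.
function agent(
  id: string,
  opts: { prompt: string; model?: string; budget?: number },
): AgentNode {
  return { kind: "agent", id, ...opts };
}

function parallel(children: WorkflowNode[]): GroupNode {
  return { kind: "parallel", children };
}

function pipeline(children: WorkflowNode[]): GroupNode {
  return { kind: "pipeline", children };
}

// Shape of a workflow file; a real file would export this object.
const workflow = {
  name: "example-review",
  description: "Three reviews in parallel, then one summary step",
  run(args: { target: string }): WorkflowNode {
    return pipeline([
      parallel([
        agent("code-health", { prompt: `Review code health of ${args.target}` }),
        agent("security", { prompt: `Review security of ${args.target}` }),
        agent("standards", { prompt: `Review standards in ${args.target}` }),
      ]),
      agent("summary", { prompt: "Merge the three reviews into one report" }),
    ]);
  },
};

// Walk the tree and collect agent ids in declaration order.
function agentIds(node: WorkflowNode): string[] {
  if (node.kind === "agent") return [node.id];
  const ids: string[] = [];
  for (const child of node.children) ids.push(...agentIds(child));
  return ids;
}
```

Returning a tree instead of running agents inline would let the manager inspect a workflow (for budgets, caching, or the run pane) before anything executes; that is a design option, not something the plan commits to.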
 ### Modified Capabilities
 - **Orchestrator panel** — extended from fixed Han flows to dynamic workflow selection and streaming run pane
 - **tool-phase.ts** — instrumented with start/end timing and trace publishing
 - **WsFrame contract** — new `tool_trace` frame variant
 - **tasks table** — extended with `background` type for async subagent execution
 ## Metrics
 - Tool call observability: 0% → 100% of calls traced with timing
 - Session continuity: lost on refresh → preserved on reconnect
 - Workflow authoring: hardcoded → user-authored JS scripts
 - Workflow re-run efficiency: 0% cache → hash-based step reuse
 - Background execution: blocking only → blocking + non-blocking
 - Cache visibility: 0% → per-turn + cumulative hit rate
 - Multi-modal: text-only → text + image attachments
Author	SHA1	Message	Date
indifferentketchup	74da084521	feat(conductor): Wave 2 — parallel batch execution + SWITCH branching step - Parallel batch execution: batch field on Step, batchConfig on Flow, batch-aware readySteps with maxConcurrent gating, getReadyInBatch helper - SWITCH branching step: new 'switch' StepKind with cases/programmed conditions, resolveSwitch() pure function, switch-excluded steps tracked in SchedulerState, non-selected branches excluded from execution	2026-06-08 03:00:06 +00:00
indifferentketchup	c860b6c4b7	feat: Wave 1 complete — state machine, Paseo hub, collision detection, PTY search - Task state machine: TIMED_OUT state, retriable steps, timeout detection - Paseo hub: paseo-client.ts (HTTP+CLI), PaseoBackend (AgentBackend), 14 tests - Collision detection: collision-detector.ts, conflict-index.ts, ws-frames type - PTY search: ring buffer, search route, capture-pane fallback	2026-06-08 02:45:17 +00:00
indifferentketchup	c4ee377dbc	feat(conductor): task state machine — TIMED_OUT state and retriable steps - Add 'timed_out' to flow_runs/flow_steps CHECK constraints - Add retry_count and max_retries columns to flow_steps - Add timeout detection in advanceInner loop (configurable FLOW_STEP_TIMEOUT_MS) - Add retriable logic: re-dispatch on timeout if maxRetries > 0 and retryCount < maxRetries - Add isRetriable() + shouldRetry() pure decision functions - Add timed_out handling to reconcileResumeStep and reconcileRun - Add 'timed_out' to ws-frames enum, publishStep status type	2026-06-08 02:43:45 +00:00
indifferentketchup	f2401352a8	chore: update pnpm-lock.yaml for @ai-sdk/deepseek	2026-06-08 02:28:32 +00:00
indifferentketchup	abe9c5a3a8	feat: Paseo-like orchestrator Phase 1-2 — trace system, session persistence, timeline, run_command, auto-fix loop Phase 1: Trace System + Observability - tool_traces DB table + insert/update service - tool_trace_start/tool_trace_finish WS frames (contracts + FE types) - Instrumented tool-phase.ts with timing around every tool call - GET /api/chats/:id/traces paginated endpoint - Trace viewer frontend (collapsible panel with timing bars + token breakdown) Phase 2: Session Persistence + Resume - agent_snapshots table (UPSERT per chat, persisted on turn boundaries) - save/load/delete service functions - Agent snapshot sent on WS reconnect - Session timeline view (vertical timeline with scroll-to + restore) Tooling: - run_command tool (execFile, 30s timeout, 32KB cap, path-guarded) - Auto-fix loop: after write tools, runs pnpm build, injects errors into next turn	2026-06-08 02:26:47 +00:00
indifferentketchup	7cb692d8be	feat: Phase 4 teardown — remove Go codecontext sidecar from deployment - Remove codecontext service block from docker-compose.yml - Remove CODECONTEXT_URL env var - Delete codecontext/Dockerfile - Update callCodecontext() to try boocontext MCP first with HTTP fallback - Graceful degradation: if boocontext MCP unavailable, tools still work via HTTP	2026-06-08 02:16:02 +00:00
indifferentketchup	917a229363	feat: Domain 2 Phase 3-4 — wiki article tool, DCP compress toggle, Go sidecar deprecation Phase 3: get_wiki_article tool wraps codesight_get_wiki_article MCP (cached, persistent codebase wiki). DCP compress toggle on get_codebase_overview (compress=true for large projects >50 files). Phase 4: Deprecation markers on Go codecontext sidecar. Warning log in callCodecontext(), deprecation comments in factory.ts and docker-compose.yml. Sidecar remains functional — removal deferred.	2026-06-08 01:35:40 +00:00
indifferentketchup	39be5ce413	fix: move cache_tokens/reasoning_tokens ALTER TABLE before view creation	2026-06-08 01:32:25 +00:00
indifferentketchup	378e29308e	fix: add cache_tokens/reasoning_tokens to Message constructors in useSessionStream	2026-06-08 01:27:31 +00:00
indifferentketchup	8f6a814ab0	fix: add cache_tokens/reasoning_tokens to web WsFrame union	2026-06-08 01:26:01 +00:00
indifferentketchup	3c019a2281	changelog: v2.8.18-deepseek-whale-lift	2026-06-08 01:24:59 +00:00
indifferentketchup	203cfd2fa8	feat: DeepSeek API integration + Whale lift (hooks, tool repair, MCP permissions, token tracking) DeepSeek API: - @ai-sdk/deepseek provider replaces openai-compatible for deepseek-* models - Token tracking: cache_hit/reasoning tokens flow API → DB → WS frames → UI - thinking effort levels (off/low/medium/high/xhigh/max) via AGENTS.md frontmatter - V4 models: deepseek-v4-flash, deepseek-v4-pro - Wired for both chat and coder panes Whale lifts: - Tool input repair (schema-based type coercion, markdown link unwrapping) - Hooks system (6 lifecycle events, shell exec, JSON stdin/stdout contract) - Per-MCP-server permissions (allow/ask/deny) - token tracking UI (cache N, think N in message stats line) Infra: - New DB columns: messages.cache_tokens, messages.reasoning_tokens - New WS frame fields: cache_tokens, reasoning_tokens on message_complete - coder provider snapshot merges DeepSeek models alongside llama-swap	2026-06-08 01:24:23 +00:00
indifferentketchup	c11e26090f	feat(coder): boulder state — cross-session plan persistence + auto-resumption New plans table (id, project_id, title, description, status, flow_run_id, progress_pct, items_total, items_completed, metadata, timestamps) with CHECK constraints and indexes. Plan store (plan-store.ts): createPlan, getPlan, listPlans, listActivePlans, updatePlan, updatePlanFromRun, findPlanWithRunningRun, planStatusFromRun. Flow-runner integration: onRunTerminal callback fires on every terminal transition (complete/fail/cancel) and updates linked plans automatically. 5 API endpoints: GET /api/plans, GET /api/plans/active, GET /api/plans/:id, POST /api/plans, PATCH /api/plans/:id. 484 tests pass, build clean.	2026-06-08 01:11:07 +00:00
indifferentketchup	e0feb53437	feat: omo-paseo-bridge — auto-register OMO subagents as Paseo agents Bridge script that calls paseo import <session-id> --provider opencode --label omo=true on task() child sessions. Supports import, archive, ls commands with --dry-run verification. Skill at .opencode/skills/ is gitignored (user-level) — copy from scripts/ on setup.	2026-06-08 01:11:00 +00:00
indifferentketchup	3c5b2c2bcf	feat(server): Domain 2 Phase 1 — boocontext MCP client + 4 new code intelligence tools Shared boocontext MCP client (boocontext_client.ts) wrapping the existing mcp-client.ts callTool() infrastructure with 32KB truncation and error handling. Used by get_code_health. 4 new first-class agent tools backed by the boocontext MCP server: - get_code_health — A-F grades per file across 7 dimensions, project health summary, refactoring candidates (wraps boocontext_health) - get_code_impact — merged symbol trace + blast radius in one call (wraps boocontext_impact, replaces two-step get_symbol_info+get_blast_radius) - get_type_info — TypeScript type recovery via type-inject MCP (wraps boocontext_types, returns signatures, interfaces, generics, JSDoc) - get_code_map — DCP-compressed context map with compress toggle (wraps boocontext_map, 10x token reduction vs full scan) All 4 registered in ALL_TOOLS as read-only tools.	2026-06-08 00:45:46 +00:00
indifferentketchup	524a0deaa1	feat(coder): add model resolution core + multi-batch matcher Model resolution (from oh-my-openagent/model-core): 6-step priority resolution pipeline (UI select -> user config -> category default -> user fallback -> policy chain -> system default), provider fallback chains, fuzzy model matching, error classification, provider-specific model ID transforms. 14 files, zero runtime deps. Multi-batch matcher (from boocontext-audit): 6 batch types (Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, LowCriticality) for behavioral guideline evaluation. RelationalResolver with iterative convergence (DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES). SchematicGenerator abstract class with retry and execution plans. 4 files.	2026-06-08 00:17:55 +00:00
indifferentketchup	a7a40c5b46	feat(coder): add hashline editing core + wire audit hooks into dispatch pipeline Hashline editing: content-hash anchors for edit_file stale-patch detection. Pure-JS xxHash32, line hash computation, validation with HashlineMismatchError, 256-entry hash dictionary. 6 files in apps/coder/src/services/hashline/. Audit hooks: emitHook('tool.execute.after') wired in frame-emitter.ts for completed/failed tool results. emitHook('turn.end') wired at terminal points in dispatcher.ts (all 5 run functions: native, external, opencode, warm ACP, claude SDK). Fire-and-forget, non-blocking.	2026-06-07 23:17:47 +00:00