Compare commits


12 Commits

Author SHA1 Message Date
45a1140fd3 feat: phase 3-5 — workflow engine, background subagents, multi-modal, cache shape, inline diff
Phase 3: Dynamic Workflow Engine
- VM sandbox (node:vm) with agent/parallel/pipeline API, Claude Code compatible
- Workflow file discovery (.boocode/workflows/*.js + ~/.boocode/workflows/*.js)
- Workflow manager with session/chat creation and inference dispatch
- Built-in catalog: deep-research, review-code, find-issues
- Resumability cache: SHA-256 hash of agent spec, in-memory Map

Phase 4: Background Subagents
- background-task.ts service: spawn/poll/cancel lifecycle
- spawn_subagent, subagent_status, subagent_result tools in ALL_TOOLS

Phase 5: Multi-modal + Cache Shape
- Multi-modal stub with type defs and hook point in payload.ts
- CacheShapeBadge component in trace viewer (colored bar + %)
2026-06-08 03:11:39 +00:00
74da084521 feat(conductor): Wave 2 — parallel batch execution + SWITCH branching step
- Parallel batch execution: batch field on Step, batchConfig on Flow,
  batch-aware readySteps with maxConcurrent gating, getReadyInBatch helper
- SWITCH branching step: new 'switch' StepKind with cases/programmed conditions,
  resolveSwitch() pure function, switch-excluded steps tracked in
  SchedulerState, non-selected branches excluded from execution
2026-06-08 03:00:06 +00:00
c860b6c4b7 feat: Wave 1 complete — state machine, Paseo hub, collision detection, PTY search
- Task state machine: TIMED_OUT state, retriable steps, timeout detection
- Paseo hub: paseo-client.ts (HTTP+CLI), PaseoBackend (AgentBackend), 14 tests
- Collision detection: collision-detector.ts, conflict-index.ts, ws-frames type
- PTY search: ring buffer, search route, capture-pane fallback
2026-06-08 02:45:17 +00:00
c4ee377dbc feat(conductor): task state machine — TIMED_OUT state and retriable steps
- Add 'timed_out' to flow_runs/flow_steps CHECK constraints
- Add retry_count and max_retries columns to flow_steps
- Add timeout detection in advanceInner loop (configurable FLOW_STEP_TIMEOUT_MS)
- Add retriable logic: re-dispatch on timeout if maxRetries > 0 and retryCount < maxRetries
- Add isRetriable() + shouldRetry() pure decision functions
- Add timed_out handling to reconcileResumeStep and reconcileRun
- Add 'timed_out' to ws-frames enum, publishStep status type
2026-06-08 02:43:45 +00:00
f2401352a8 chore: update pnpm-lock.yaml for @ai-sdk/deepseek 2026-06-08 02:28:32 +00:00
abe9c5a3a8 feat: Paseo-like orchestrator Phase 1-2 — trace system, session persistence, timeline, run_command, auto-fix loop
Phase 1: Trace System + Observability
- tool_traces DB table + insert/update service
- tool_trace_start/tool_trace_finish WS frames (contracts + FE types)
- Instrumented tool-phase.ts with timing around every tool call
- GET /api/chats/:id/traces paginated endpoint
- Trace viewer frontend (collapsible panel with timing bars + token breakdown)

Phase 2: Session Persistence + Resume
- agent_snapshots table (UPSERT per chat, persisted on turn boundaries)
- save/load/delete service functions
- Agent snapshot sent on WS reconnect
- Session timeline view (vertical timeline with scroll-to + restore)

Tooling:
- run_command tool (execFile, 30s timeout, 32KB cap, path-guarded)
- Auto-fix loop: after write tools, runs pnpm build, injects errors into next turn
2026-06-08 02:26:47 +00:00
7cb692d8be feat: Phase 4 teardown — remove Go codecontext sidecar from deployment
- Remove codecontext service block from docker-compose.yml
- Remove CODECONTEXT_URL env var
- Delete codecontext/Dockerfile
- Update callCodecontext() to try boocontext MCP first with HTTP fallback
- Graceful degradation: if boocontext MCP unavailable, tools still work via HTTP
2026-06-08 02:16:02 +00:00
917a229363 feat: Domain 2 Phase 3-4 — wiki article tool, DCP compress toggle, Go sidecar deprecation
Phase 3: get_wiki_article tool wraps codesight_get_wiki_article MCP
(cached, persistent codebase wiki). DCP compress toggle on
get_codebase_overview (compress=true for large projects >50 files).

Phase 4: Deprecation markers on Go codecontext sidecar. Warning log
in callCodecontext(), deprecation comments in factory.ts and
docker-compose.yml. Sidecar remains functional — removal deferred.
2026-06-08 01:35:40 +00:00
39be5ce413 fix: move cache_tokens/reasoning_tokens ALTER TABLE before view creation 2026-06-08 01:32:25 +00:00
378e29308e fix: add cache_tokens/reasoning_tokens to Message constructors in useSessionStream 2026-06-08 01:27:31 +00:00
8f6a814ab0 fix: add cache_tokens/reasoning_tokens to web WsFrame union 2026-06-08 01:26:01 +00:00
3c019a2281 changelog: v2.8.18-deepseek-whale-lift 2026-06-08 01:24:59 +00:00
64 changed files with 7452 additions and 236 deletions

View File

@@ -0,0 +1,55 @@
# Dynamic Workflow Engine — Design
## Architecture
```
User writes workflow JS file:
.boocode/workflows/my-flow.js
Workflow Runtime (apps/server)
├── isolated-vm sandbox (or node:vm)
├── API surface: agent(), parallel(), pipeline(), phase(), budget()
├── Tool bridge → BooCode's existing tool set
├── Workflow manager (concurrency, lifecycle)
├── Resumability cache (SHA-256 of agent spec)
└── Catalog (built-in workflows: deep-research, review-code)
Workflow execution:
1. User triggers workflow (slash command or Orchestrator panel)
2. File discovery finds .boocode/workflows/<name>.js
3. Sandbox compiles and executes the script
4. agent() calls go through tool bridge → existing inference pipeline
5. parallel() spawns concurrent agent calls (max 3 default)
6. Results stream via existing WS frames
7. Completed agents cached by hash for resume
API Surface (Claude Code compatible):
agent(prompt, { label?, schema?, model?, capabilities?, max_tool_calls? })
parallel([() => agent(...), () => agent(...)])
pipeline(items, ...stages)
phase(title)
log(message)
budget.total / budget.spent() / budget.remaining()
args
workflow(name, args?) — one level of nesting
```
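Step 5 of the execution list (concurrent agent calls, max 3 by default) could be sketched roughly as below. This is a minimal illustration, not BooCode's implementation: the worker-pool approach and the `maxConcurrent` parameter name are assumptions; only the `parallel([...thunks])` call shape comes from the API surface above.

```typescript
// Hypothetical sketch: run thunks with at most `maxConcurrent` in flight.
// Results keep the order of the input array.
type Thunk<T> = () => Promise<T>;

async function parallel<T>(
  thunks: Thunk<T>[],
  maxConcurrent = 3, // assumed parameter; the design only states "max 3 default"
): Promise<T[]> {
  const results = new Array<T>(thunks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < thunks.length) {
      const index = next++;
      const thunk = thunks[index];
      if (thunk) results[index] = await thunk();
    }
  }

  const workerCount = Math.min(Math.max(1, maxConcurrent), thunks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In this sketch a rejected thunk rejects the whole `parallel()` call, while thunks already in flight still run to completion; whether the real engine cancels them is not stated in the design.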
## Implementation Plan
### Phase 1: Core Runtime (this session)
- Sandbox using Node's `vm` module (no extra deps)
- `agent()` function that creates a task and waits for completion
- Workflow file discovery
- Basic workflow manager
### Phase 2: Advanced Primitives
- `parallel()` with concurrency limits
- `pipeline()` streaming
- `budget()` token tracking
- Workflow resumability cache
### Phase 3: UI + Polish
- Integration with Orchestrator panel
- Built-in workflow catalog
- Workflow editor
- Error recovery

View File

@@ -0,0 +1,239 @@
# Paseo-like Orchestrator — Implementation Plan
> **Goal:** Transform BooCode into a Paseo-style thin-client orchestration layer with observability, dynamic workflows, resumability, background subagents, multi-modal, and cache shape telemetry.
>
> **Architecture:** Durable agent execution engine beneath thin chat/coder frontends. Trace system as foundation, workflow engine as the structural addition, everything else layered on top.
>
> **Inspired by:** Paseo (agent lifecycle, worktree isolation), Whale (workflow engine, cache telemetry), OpenCode (session resume), Claude Code (workflow script format).
---
## TL;DR
> **Quick Summary**: Build a durable orchestration layer with trace observability, dynamic JS workflows, session persistence, background subagents, and multi-modal support over 5 phases.
>
> **Deliverables**:
> - Trace system with DB persistence + viewer UI
> - Dynamic workflow engine (JS sandbox, agent/parallel/pipeline)
> - Workflow resumability (hash-based step caching)
> - Background subagent runtime
> - Session persistence across refreshes
> - Cache shape telemetry (DeepSeek KV cache viz)
> - Multi-modal attachment support
>
> **Estimated Effort**: XL — 5 phases, ~2-3 weeks total
> **Parallel Execution**: YES — phases 1-2 can partially overlap
> **Critical Path**: Trace system → Workflow engine → All downstream features
---
## Context
### Original Request
User wants BooCode to become "like Paseo — a thin client" with observability, dynamic workflows, session persistence, background agents, multi-modal, cache shape telemetry, and workflow resumability. They invoked skills across model evaluation, long context, SGLang, LangChain, LangSmith, agentic eval, agent harness construction, agent governance, and chat SDKs — indicating broad ambition for a production-quality AI coding platform.
### Key Decisions
- **Trace system first**: Foundation for all debugging and optimization
- **Node `vm` module for workflow sandbox**: Node-native, no external deps (`isolated-vm` is a separate npm package and remains an option)
- **DB-backed sessions**: Postgres for trace store + session state
- **Existing WS frames + new `tool_trace` frame**: Live streaming to frontend
- **Phase ordering**: Foundation (trace) → UX (persistence) → Power (workflows) → Polish (background/multi-modal/cache)
---
## Phases
### Phase 1: Trace System + Observability
**Est. effort**: 3-4 days
Core observability infrastructure. Every tool call gets timed, logged, and persisted.
**Deliverables**:
- `tool_traces` DB table (id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome)
- Instrumentation in `tool-phase.ts` wrapping `executeToolCall` with start/end timing
- `tool_trace` WS frame type for live streaming to frontend
- GET `/api/chats/:id/traces` endpoint (paginated)
- Trace viewer pane (collapsible tree, timing bars, expand/collapse per call)
**Files to create**: 5-7 files across server + web + contracts
**Dependencies**: None — standalone feature
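The timing instrumentation described above might look something like the following wrapper. It is a sketch under stated assumptions: `traced`, `ToolTrace`, the `sink` callback, and the injectable `now` clock are hypothetical names, and the field set is only loosely modeled on the `tool_traces` columns.

```typescript
// Hypothetical trace record; field names loosely follow the tool_traces columns.
interface ToolTrace {
  toolName: string;
  startedAt: number;
  finishedAt: number;
  latencyMs: number;
  outcome: 'ok' | 'error';
  error: string | null;
}

// Wraps an async tool call and reports exactly one trace record,
// whether the call resolves or throws.
async function traced<T>(
  toolName: string,
  call: () => Promise<T>,
  sink: (trace: ToolTrace) => void,
  now: () => number = Date.now, // injectable clock, assumed here for testability
): Promise<T> {
  const startedAt = now();
  let outcome: ToolTrace['outcome'] = 'ok';
  let error: string | null = null;
  try {
    return await call();
  } catch (err) {
    outcome = 'error';
    error = err instanceof Error ? err.message : String(err);
    throw err; // the caller still sees the original failure
  } finally {
    const finishedAt = now();
    sink({ toolName, startedAt, finishedAt, latencyMs: finishedAt - startedAt, outcome, error });
  }
}
```

A real `sink` would presumably insert into the DB and emit the WS frame; here it is left as a plain callback so the timing logic stays independent of storage.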
---
### Phase 2: Session Persistence + Resume
**Est. effort**: 2-3 days
Agent state survives browser refresh. Active sessions can be resumed.
**Deliverables**:
- Serialize active agent state to DB on each turn boundary
- Restore state on WS reconnect (existing `snapshot` frame enhanced)
- Agent session timeline view (history of all turns in a session)
- Coder pane rehydrates from persisted state
**Files to modify**: ws.ts, useSessionStream.ts, session store, dispatcher
**Dependencies**: None — standalone, but benefits from Phase 1 trace data
---
### Phase 3: Dynamic Workflow Engine
**Est. effort**: 5-7 days
JS sandbox for multi-agent orchestration. Claude Code compatible.
**Deliverables**:
- `isolated-vm` sandbox (or Node `vm` module with restricted context)
- Workflow API: `agent()`, `parallel()`, `pipeline()`, `phase()`, `budget()`, `log()`, `args`
- Workflow file discovery (`.boocode/workflows/*.js` → project, `~/.boocode/workflows/*.js` → global)
- Built-in workflow catalog (deep-research, multi-review, etc.)
- Workflow manager with concurrency limits, token budgets
- Integration with existing Orchestrator panel for UI
**Files to create**: 10-15 files (workflow runtime, scheduler, tool bridge, manager, catalog)
**Dependencies**: Phase 1 traces feed into workflow observability
**Workflow Resumability** (within Phase 3):
- SHA-256 hash of agent spec (prompt + options)
- Cache completed results by hash
- On re-run, skip cached agents, only execute new/changed ones
- In-memory cache for current session, optional DB persistence
**Est. effort**: 1-2 days within Phase 3
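The hash-based resumability described in this list can be illustrated with a short sketch. It assumes the agent spec is a prompt plus a plain options object; `AgentSpec`, `stableStringify`, `hashSpec`, and `runCached` are hypothetical names, and key-sorted serialization is one possible way to make logically equal specs hash identically.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical spec shape: the plan says "prompt + options".
interface AgentSpec {
  prompt: string;
  options?: Record<string, unknown>;
}

// Serialize with sorted keys so property order does not change the hash.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .filter(([, v]) => v !== undefined)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value) ?? 'null';
}

function hashSpec(spec: AgentSpec): string {
  return createHash('sha256').update(stableStringify(spec)).digest('hex');
}

// In-memory cache for the current session, keyed by spec hash.
const cache = new Map<string, string>();

async function runCached(
  spec: AgentSpec,
  run: (spec: AgentSpec) => Promise<string>,
): Promise<string> {
  const key = hashSpec(spec);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // unchanged agent: skip execution
  const result = await run(spec);
  cache.set(key, result); // only successful runs are cached
  return result;
}
```

Because a rejected `run` never reaches `cache.set`, failed agents are retried on the next run in this sketch.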
---
### Phase 4: Background Subagents
**Est. effort**: 2-3 days
Non-blocking subagent execution. `spawn_subagent` returns immediately, results collected later.
**Deliverables**:
- Background task queue (reuses existing `tasks` table)
- `spawn_subagent` tool that creates a task and returns immediately
- `subagent_status` tool to poll completion
- `subagent_result` tool to retrieve output
- Background agent pane showing running/completed subagents
- Notifications via hooks when background tasks complete
**Files to create**: 3-5 files across server + web
**Dependencies**: Phase 1 traces, Phase 2 session persistence
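The spawn/poll/collect lifecycle behind `spawn_subagent`, `subagent_status`, and `subagent_result` can be sketched as follows. This is an in-memory illustration only: the plan says the real queue reuses the existing `tasks` table, and the `BackgroundTasks` class, its method names, and the id format are assumptions.

```typescript
type TaskStatus = 'running' | 'completed' | 'failed';

interface BackgroundTask {
  status: TaskStatus;
  result: string | null;
  error: string | null;
}

// Hypothetical in-memory registry standing in for the DB-backed task queue.
class BackgroundTasks {
  private tasks = new Map<string, BackgroundTask>();
  private nextId = 1;

  // Starts the work and returns an id immediately, without awaiting completion.
  spawn(work: () => Promise<string>): string {
    const id = `task-${this.nextId++}`;
    const task: BackgroundTask = { status: 'running', result: null, error: null };
    this.tasks.set(id, task);
    Promise.resolve()
      .then(work)
      .then(
        (result) => {
          task.status = 'completed';
          task.result = result;
        },
        (err: unknown) => {
          task.status = 'failed';
          task.error = err instanceof Error ? err.message : String(err);
        },
      );
    return id;
  }

  status(id: string): TaskStatus | undefined {
    return this.tasks.get(id)?.status;
  }

  result(id: string): string | null {
    return this.tasks.get(id)?.result ?? null;
  }
}
```

Both handlers are attached in a single `.then(onOk, onErr)` call, so a failing subagent is recorded as `failed` rather than surfacing as an unhandled rejection.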
---
### Phase 5: Multi-modal + Cache Shape (Polish)
**Est. effort**: 2-3 days
Image/file attachment support + DeepSeek cache hit visualization.
**Deliverables (Multi-modal)**:
- Image/file attachment storage (tmpfs, referenced in message)
- Forward image content through DeepSeek API's multimodal support
- Render attached images in message bubble
- Model can "see" screenshots, diagrams, UI mocks
**Deliverables (Cache Shape)**:
- Extract `prompt_cache_hit_tokens` from DeepSeek provider metadata
- Build cache segment visualization (system prompt, tool schema, conversation)
- Per-turn cache hit rate in trace viewer
- Cumulative cache stats in session view
**Files to create**: 3-5 files
**Dependencies**: Phase 1 traces (for cache shape), existing DeepSeek integration
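The per-turn and cumulative cache hit rates could be derived as in this sketch. It assumes each turn records both hit and miss prompt tokens; the plan names only `prompt_cache_hit_tokens`, so the miss counter and the camel-cased `TurnUsage` shape are assumptions made for illustration.

```typescript
// Hypothetical per-turn usage record.
interface TurnUsage {
  promptCacheHitTokens: number;
  promptCacheMissTokens: number; // assumed companion counter
}

// Fraction of prompt tokens served from cache for one turn (0 when no prompt tokens).
function hitRate(usage: TurnUsage): number {
  const total = usage.promptCacheHitTokens + usage.promptCacheMissTokens;
  return total === 0 ? 0 : usage.promptCacheHitTokens / total;
}

// Token-weighted hit rate across a whole session.
function cumulativeHitRate(turns: TurnUsage[]): number {
  const summed = turns.reduce<TurnUsage>(
    (acc, t) => ({
      promptCacheHitTokens: acc.promptCacheHitTokens + t.promptCacheHitTokens,
      promptCacheMissTokens: acc.promptCacheMissTokens + t.promptCacheMissTokens,
    }),
    { promptCacheHitTokens: 0, promptCacheMissTokens: 0 },
  );
  return hitRate(summed);
}
```

The cumulative figure is weighted by tokens rather than averaged per turn, so one large uncached turn is not hidden by several small cached ones.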
---
## Execution Strategy
### Parallel Execution Waves
```
Wave 1 (Start Immediately):
├── Phase 1: Trace system backend (tool_traces table + instrumentation) [deep]
├── Phase 1: Trace viewer frontend [visual-engineering]
└── Phase 2: Session persistence backbone [deep]
Wave 2 (After Wave 1):
├── Phase 3: Workflow engine sandbox + API surface [deep]
├── Phase 3: Workflow file discovery + manager [unspecified-high]
├── Phase 3: Workflow resumability cache [quick]
└── Phase 4: Background subagent queue + tools [unspecified-high]
Wave 3 (After Wave 2):
├── Phase 4: Background agent pane + notifications [visual-engineering]
├── Phase 5: Multi-modal attachment pipeline [deep]
└── Phase 5: Cache shape telemetry UI [visual-engineering]
Wave FINAL:
├── F1: Plan compliance audit (oracle)
├── F2: Code quality review (unspecified-high)
├── F3: Integration QA (unspecified-high)
└── F4: Scope fidelity check (deep)
```
---
## TODOs
> Phase 1: Trace System + Observability
- [ ] 1. Create tool_traces DB table + migration
- [ ] 2. Add tool_trace WS frame + contracts schema
- [ ] 3. Instrument tool-phase.ts with start/end timing
- [ ] 4. Add GET /api/chats/:id/traces endpoint
- [ ] 5. Build trace viewer frontend component
> Phase 2: Session Persistence + Resume
- [ ] 6. Serialize agent state to DB on turn boundaries
- [ ] 7. Restore state on WS reconnect
- [ ] 8. Agent session timeline view
> Phase 3: Dynamic Workflow Engine
- [ ] 9. Create isolated-vm workflow sandbox
- [ ] 10. Implement agent/parallel/pipeline primitives
- [ ] 11. Workflow file discovery system
- [ ] 12. Workflow manager + built-in catalog
- [ ] 13. Workflow resumability (hash-based cache)
- [ ] 14. Workflow UI integration with Orchestrator panel
> Phase 4: Background Subagents
- [ ] 15. Background task queue + spawn_subagent tool
- [ ] 16. subagent_status + subagent_result tools
- [ ] 17. Background agent pane
> Phase 5: Multi-modal + Cache Shape
- [ ] 18. Multi-modal attachment pipeline
- [ ] 19. Image render in message bubble
- [ ] 20. Cache shape telemetry data pipeline
- [ ] 21. Cache shape visualization in trace viewer
---
## Success Criteria
- Tool trace viewer shows every call with timing bars and token costs
- Browser refresh preserves agent session state
- Workflow scripts run in isolated sandbox with agent/parallel/pipeline
- Re-running a workflow skips cached agents (hash-based)
- Background subagents run independently, results collected later
- Model can see attached images in chat
- Cache hit rate visible per-turn and cumulative

View File

@@ -2,6 +2,10 @@
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
## v2.8.18-deepseek-whale-lift — 2026-06-08
Integrates the DeepSeek API directly into BooChat and BooCoder via `@ai-sdk/deepseek`, replacing the generic `openai-compatible` wrapper. DeepSeek V4 models (`deepseek-v4-flash`, `deepseek-v4-pro`) with configurable thinking effort levels appear in both the chat and coder pane model pickers. Full token tracking (cache hit tokens and reasoning tokens) flows from the API through new DB columns and WS frames into the UI message stats line. Lifts three high-value features from the Whale codebase: a schema-based tool input repair system that coerces types and unwraps markdown autolinks before Zod validation, a shell-based lifecycle hooks system (PreToolUse, PostToolUse, Stop, PreCompact, PostCompact) with a JSON stdin/stdout contract, and per-MCP-server permissions (allow/ask/deny) gating tool execution.
## v2.8.0-fork-lifts — 2026-06-07
Completes the eight fork-lift integrations from `/opt/forks` into BooCode: boocontext sidecar upgrade, LSP code intelligence, DCP clean-room pruning, institutional memory, subagent protocol enhancements, plugin hook host, inference reliability (tool-shim + loop detectors), and TokenScope token breakdown. Backfills edit safety guards (truncation + dropped imports) and the TokenScope analyzer/persist module. Closes the fork-lifts-mit epic.

View File

@@ -5,6 +5,7 @@ import { getPool, closeDb } from './db.js';
import { registerHealthRoutes } from './routes/health.js';
import { registerTerminalRoutes } from './routes/terminals.js';
import { registerSessionRoutes } from './routes/sessions.js';
import { registerSearchRoutes } from './routes/search.js';
import { registerWsAttachRoute } from './ws/attach.js';
async function main(): Promise<void> {
@@ -35,6 +36,7 @@ async function main(): Promise<void> {
registerHealthRoutes(app);
registerTerminalRoutes(app, config.TMUX_CONF_PATH);
registerSessionRoutes(app);
registerSearchRoutes(app, config.TMUX_CONF_PATH);
registerWsAttachRoute(app, config.TMUX_CONF_PATH);
const shutdown = async (signal: string) => {

View File

@@ -33,6 +33,7 @@ export function register(
export function unregister(paneId: string): void {
sessions.delete(paneId);
ringBuffers.delete(paneId);
}
export function list(): SessionMeta[] {
@@ -42,3 +43,120 @@ export function list(): SessionMeta[] {
export function get(paneId: string): SessionMeta | undefined {
return sessions.get(paneId);
}
// ── Ring buffer for PTY output search ──────────────────────────────────────
export interface SearchMatch {
line: number;
content: string;
contextBefore: string[];
contextAfter: string[];
}
const ringBuffers = new Map<string, string[]>();
/**
* Append raw PTY data to the ring buffer for a given pane.
* Splits incoming data on newlines and pushes each line into the buffer,
* trimming to `maxLines` (default 5000) from the tail.
*/
export function appendOutput(
paneId: string,
data: string,
maxLines: number = 5000,
): void {
let buf = ringBuffers.get(paneId);
if (!buf) {
buf = [];
ringBuffers.set(paneId, buf);
}
// Split on newlines — each chunk may contain multiple complete lines and
// potentially a trailing partial line (which we store as-is; the next chunk
// will either complete it or be another partial).
const lines = data.split('\n');
// The first element of `lines` may be a continuation of the last partial
// line from the previous append. If the buffer is non-empty and the last
// stored entry is a partial (no trailing newline previously), glue them.
// We detect "partial" by checking whether `data` ended with '\n' — if it
// did, the last element after split is '' (empty) which we drop.
const endedWithNewline = data.endsWith('\n');
if (endedWithNewline) {
// The final empty-string element is discarded.
lines.pop();
}
if (buf.length > 0 && lines.length > 0) {
// Concatenate the last partial line in the buffer with the first split
// segment. This avoids splitting ANSI sequences or text across chunks.
buf[buf.length - 1] = (buf[buf.length - 1] ?? '') + (lines[0] ?? '');
lines.shift();
}
for (const line of lines) {
buf.push(line);
}
// Trim from head if over maxLines
if (buf.length > maxLines) {
buf = buf.slice(buf.length - maxLines);
ringBuffers.set(paneId, buf);
}
}
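One thing worth noting about `appendOutput` as shown: whenever the buffer is non-empty, the first segment of a new chunk is concatenated onto the last stored line, regardless of whether the previous chunk ended with a newline. Two complete lines arriving in separate chunks (`"a\n"` then `"b\n"`) would therefore be stored as `"ab"`. The sketch below is one possible variant that tracks whether the last line is still open. The `LineBuffer` class and its member names are hypothetical and not part of the diff.

```typescript
// Hypothetical variant: remember whether the last stored line is unterminated.
class LineBuffer {
  private lines: string[] = [];
  private lastIsPartial = false;
  private readonly maxLines: number;

  constructor(maxLines = 5000) {
    this.maxLines = maxLines;
  }

  append(data: string): void {
    if (data.length === 0) return;
    const parts = data.split('\n');
    const endedWithNewline = data.endsWith('\n');
    if (endedWithNewline) parts.pop(); // drop the trailing '' produced by split

    // Glue onto the previous entry only when it was left without a newline.
    if (this.lastIsPartial && this.lines.length > 0 && parts.length > 0) {
      const last = this.lines.length - 1;
      this.lines[last] = (this.lines[last] ?? '') + (parts.shift() ?? '');
    }
    for (const part of parts) this.lines.push(part);
    this.lastIsPartial = !endedWithNewline;

    // Keep only the newest maxLines entries.
    if (this.lines.length > this.maxLines) {
      this.lines = this.lines.slice(this.lines.length - this.maxLines);
    }
  }

  snapshot(): readonly string[] {
    return this.lines;
  }
}
```

Keeping the flag next to the lines (rather than in a second module-level `Map`) means clearing a pane's buffer also resets its partial-line state.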
/**
* Search the ring buffer for a pane using a regex pattern.
* Returns matches with optional context lines before and after each match.
*/
export function searchRingBuffer(
paneId: string,
pattern: string,
opts?: { limit?: number; context?: number },
): SearchMatch[] {
const buf = ringBuffers.get(paneId);
if (!buf || buf.length === 0) return [];
const limit = opts?.limit ?? 50;
const context = opts?.context ?? 0;
let re: RegExp;
try {
re = new RegExp(pattern, 'u');
} catch {
return []; // invalid regex — caller should validate, but be defensive
}
const results: SearchMatch[] = [];
for (let i = 0; i < buf.length; i++) {
if (results.length >= limit) break;
if (re.test(buf[i]!)) {
const contextBefore: string[] = [];
const contextAfter: string[] = [];
for (let c = 1; c <= context; c++) {
const ci = i - c;
if (ci >= 0) contextBefore.unshift(buf[ci]!);
}
for (let c = 1; c <= context; c++) {
const ci = i + c;
if (ci < buf.length) contextAfter.push(buf[ci]!);
}
results.push({
line: i + 1, // 1-based line number for display
content: buf[i]!,
contextBefore,
contextAfter,
});
}
}
return results;
}
/**
* Remove the ring buffer for a pane. Called on session kill / pane close.
*/
export function clearBuffer(paneId: string): void {
ringBuffers.delete(paneId);
}

View File

@@ -0,0 +1,167 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import { sanitizeId, tmuxSessionName, capturePane } from '../pty/manager.js';
import { searchRingBuffer, clearBuffer } from '../pty/registry.js';
const ParamsSchema = z.object({
sid: z.string(),
pid: z.string(),
});
const MAX_PATTERN_LENGTH = 200;
// Zod-refined string: reject empty and overly-long patterns to limit ReDoS exposure
// (a length cap reduces, but does not eliminate, catastrophic backtracking)
const PatternQuerySchema = z
.string()
.min(1, 'pattern is required')
.max(MAX_PATTERN_LENGTH, `pattern must not exceed ${MAX_PATTERN_LENGTH} characters`);
const QuerySchema = z.object({
pattern: PatternQuerySchema,
limit: z.coerce.number().int().min(1).max(500).default(50),
context: z.coerce.number().int().min(0).max(50).default(0),
});
interface SearchMatch {
line: number;
content: string;
contextBefore: string[];
contextAfter: string[];
}
interface SearchResponse {
matches: SearchMatch[];
total: number;
truncated: boolean;
source: 'ring' | 'capture';
}
/**
* Search a captured pane buffer using a regex. This is the fallback path
* when the ring buffer doesn't have enough matches.
*/
function grepBuffer(
text: string,
pattern: string,
limit: number,
context: number,
): SearchMatch[] {
let re: RegExp;
try {
re = new RegExp(pattern, 'u');
} catch {
return [];
}
const lines = text.split('\n');
const results: SearchMatch[] = [];
for (let i = 0; i < lines.length; i++) {
if (results.length >= limit) break;
if (re.test(lines[i]!)) {
const contextBefore: string[] = [];
const contextAfter: string[] = [];
for (let c = 1; c <= context; c++) {
const ci = i - c;
if (ci >= 0) contextBefore.unshift(lines[ci]!);
}
for (let c = 1; c <= context; c++) {
const ci = i + c;
if (ci < lines.length) contextAfter.push(lines[ci]!);
}
results.push({
line: i + 1,
content: lines[i]!,
contextBefore,
contextAfter,
});
}
}
return results;
}
export function registerSearchRoutes(app: FastifyInstance, tmuxConfPath: string): void {
app.get<{
Params: { sid: string; pid: string };
Querystring: { pattern?: string; limit?: string; context?: string };
}>(
'/api/term/sessions/:sid/panes/:pid/search',
async (req, reply) => {
const p = ParamsSchema.safeParse(req.params);
if (!p.success) return reply.code(400).send({ error: 'bad_params' });
const sid = sanitizeId(p.data.sid);
const pid = sanitizeId(p.data.pid);
if (!sid || !pid) return reply.code(400).send({ error: 'bad_id_format' });
const q = QuerySchema.safeParse(req.query);
if (!q.success) {
return reply.code(400).send({
error: 'bad_query',
details: q.error.flatten().fieldErrors,
});
}
const { pattern, limit, context } = q.data;
// ── Path 1: ring buffer search (fast, no tmux interaction) ──
const ringMatches = searchRingBuffer(pid, pattern, { limit, context });
if (ringMatches.length >= limit) {
return reply.code(200).send({
matches: ringMatches,
total: ringMatches.length,
truncated: ringMatches.length >= limit,
source: 'ring' as const,
});
}
// ── Path 2: capture-pane + grep fallback (10s timeout) ──
const sessionName = tmuxSessionName(pid);
let capture: string;
try {
capture = await withTimeout(
capturePane(tmuxConfPath, sessionName, 5000),
10_000,
);
} catch (err) {
req.log.warn({ err, pid }, 'capture-pane timed out or failed');
return reply.code(200).send({
matches: ringMatches,
total: ringMatches.length,
truncated: false,
source: 'ring' as const,
});
}
if (!capture) {
// tmux pane may no longer exist — return whatever ring had
return reply.code(200).send({
matches: ringMatches,
total: ringMatches.length,
truncated: false,
source: 'ring' as const,
});
}
const captureMatches = grepBuffer(capture, pattern, limit, context);
return reply.code(200).send({
matches: captureMatches,
total: captureMatches.length,
truncated: captureMatches.length >= limit,
source: 'capture' as const,
});
},
);
}
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
return Promise.race([
promise,
new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error('timeout')), ms),
),
]);
}
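As written, `withTimeout` leaves its timer pending after the wrapped promise settles, so each successful search keeps a 10-second timer alive until it fires. A possible variant that cancels the timer is sketched below; the name `withTimeoutCleared` is hypothetical and this is not the code in the diff.

```typescript
// Hypothetical variant of withTimeout that cancels its timer once the race settles.
function withTimeoutCleared<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => {
    // Runs on both resolve and reject; a no-op if the timer already fired.
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Neither version cancels the underlying work: on timeout the `capturePane` call keeps running and its result is discarded.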

View File

@@ -9,7 +9,7 @@ import {
} from '../pty/manager.js';
import { attachPty } from '../pty/pty.js';
import { getUser } from '../auth.js';
- import { register, unregister } from '../pty/registry.js';
+ import { register, unregister, appendOutput } from '../pty/registry.js';
export function registerWsAttachRoute(app: FastifyInstance, tmuxConfPath: string): void {
app.get<{
@@ -106,6 +106,8 @@ export function registerWsAttachRoute(app: FastifyInstance, tmuxConfPath: string
} catch (err) {
req.log.warn({ err }, 'ws send failed');
}
// Feed the ring buffer for pattern-based search
appendOutput(pid, data);
};
handle.onData(onData);

View File

@@ -38,10 +38,31 @@ export interface StepContext {
readonly model?: string;
}
- export type StepKind = 'agent' | 'code' | 'approval';
+ export type StepKind = 'agent' | 'code' | 'approval' | 'switch';
/**
* One branch of a SWITCH step. The first case whose condition evaluates to true
* is selected; all other branches' stepIds are excluded from execution.
*/
export interface SwitchCase {
/** Human-readable label for this branch (reported in switch output). */
label: string;
/** Pure guard — called with the current step context to decide this branch. */
condition: (ctx: StepContext) => boolean;
/** stepIds belonging to this branch. */
stepIds: string[];
}
export type TriggerRule = 'all_success' | 'one_success' | 'all_done';
/** Possible statuses for a flow step (persisted in flow_steps.status). */
export type StepStatus = 'pending' | 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'timed_out';
/** Retry policy for a step that times out. */
export interface RetryConfig {
maxRetries: number;
}
export interface Step {
/** unique id within the flow; other steps depend on it by this id */
id: string;
@@ -55,10 +76,19 @@ export interface Step {
/**
* For kind:'agent', returns the worker PROMPT (task + any prior outputs).
* For kind:'code', returns the step RESULT directly (the fold/transform).
* For kind:'switch', unused (the runner evaluates cases internally).
*/
run: (ctx: StepContext) => string | Promise<string>;
/** optional guard — when it returns false the step is skipped (e.g. no repo) */
when?: (ctx: StepContext) => boolean;
/** max retries on timeout (0 or unset = no retry) */
maxRetries?: number;
/** batch group id; steps sharing the same batch are gated by batchConfig.maxConcurrent */
batch?: string;
/** for kind:'switch' — ordered list of branches evaluated in declaration order */
cases?: SwitchCase[];
/** for kind:'switch' — fallback step ids when no case matches */
defaultBranch?: string[];
}
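The commit message for 74da084521 mentions a pure `resolveSwitch()` function, but its body is not part of this section of the diff. The sketch below shows how the `cases` / `defaultBranch` semantics documented above could be evaluated. The signature, the `SwitchResolution` return type, and the stand-in `StepContext` are assumptions, as is the choice to exclude the default branch whenever a case matches.

```typescript
// Minimal stand-ins for the types in the diff; the real StepContext has other fields.
interface StepContext {
  readonly input: string; // assumed field, for illustration only
}

interface SwitchCase {
  label: string;
  condition: (ctx: StepContext) => boolean;
  stepIds: string[];
}

// Hypothetical result shape.
interface SwitchResolution {
  selected: string | null; // label of the chosen case, 'default', or null
  run: string[];           // step ids allowed to execute
  excluded: string[];      // step ids of every non-selected branch
}

function resolveSwitch(
  cases: SwitchCase[],
  defaultBranch: string[] | undefined,
  ctx: StepContext,
): SwitchResolution {
  // First case whose condition holds wins, in declaration order.
  const chosen = cases.find((c) => c.condition(ctx));
  const run = chosen ? chosen.stepIds : defaultBranch ?? [];
  const runSet = new Set(run);
  const candidates = [...cases.flatMap((c) => c.stepIds), ...(defaultBranch ?? [])];
  const excluded = Array.from(new Set(candidates)).filter((id) => !runSet.has(id));
  const selected = chosen ? chosen.label : defaultBranch ? 'default' : null;
  return { selected, run, excluded };
}
```

A step id shared by the selected branch and another branch stays runnable here, since exclusion is computed against the selected set.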
export interface Flow {
@@ -69,6 +99,8 @@ export interface Flow {
render: (ctx: StepContext) => string;
/** optional output filename for the artifact, derived from input */
output?: (ctx: StepContext) => string;
/** batch parallelism control — gates concurrent dispatch of steps sharing the same batch id */
batchConfig?: { maxConcurrent: number; timeoutMs?: number; joinRule?: TriggerRule };
}
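The `maxConcurrent` gating described for `batchConfig` could work along these lines. This is a sketch, not the scheduler's actual `readySteps` / `getReadyInBatch` code: `gateByBatch` and `BatchStep` are hypothetical names, and it assumes the limit applies separately to each batch id and that steps without a `batch` are never gated.

```typescript
// Hypothetical minimal step shape: only the fields the gate needs.
interface BatchStep {
  id: string;
  batch?: string;
}

// From the dependency-ready steps, pick those that may be dispatched now
// without pushing any batch above maxConcurrent running steps.
function gateByBatch(
  ready: BatchStep[],
  running: BatchStep[],
  maxConcurrent: number,
): BatchStep[] {
  const inFlight = new Map<string, number>();
  for (const step of running) {
    if (step.batch !== undefined) {
      inFlight.set(step.batch, (inFlight.get(step.batch) ?? 0) + 1);
    }
  }

  const dispatch: BatchStep[] = [];
  for (const step of ready) {
    if (step.batch === undefined) {
      dispatch.push(step); // unbatched steps are not gated
      continue;
    }
    const used = inFlight.get(step.batch) ?? 0;
    if (used < maxConcurrent) {
      dispatch.push(step);
      inFlight.set(step.batch, used + 1); // reserve the slot for this tick
    }
  }
  return dispatch;
}
```

Steps held back in one scheduling pass stay ready and are reconsidered on the next pass, once running steps in their batch have finished.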
export interface RunResult {

View File

@@ -52,6 +52,9 @@ const ConfigSchema = z.object({
ORPHAN_WORKTREE_GRACE_MS: z.coerce.number().int().positive().default(3_600_000),
DEEPSEEK_API_KEY: z.string().optional(),
DEEPSEEK_BASE_URL: z.string().url().default('https://api.deepseek.com'),
// v2.9.x: flow step timeout (default 5 min). When a 'running' step exceeds
// this duration, it is marked 'timed_out' and may be retried.
FLOW_STEP_TIMEOUT_MS: z.coerce.number().int().positive().default(300_000),
});
export type Config = z.infer<typeof ConfigSchema>;
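How `FLOW_STEP_TIMEOUT_MS` might interact with the retry fields can be sketched with a few pure functions. The names `isRetriable()` and `shouldRetry()` appear in the commit message for c4ee377dbc, but their signatures are not shown here, so the ones below are assumptions, as are `RunningStep` and `decideOnTick`. The rule itself (retry when `maxRetries > 0` and `retryCount < maxRetries`) follows the commit message.

```typescript
// Hypothetical shape; retryCount/maxRetries mirror the new flow_steps columns.
interface RunningStep {
  startedAt: number; // epoch ms when the step entered 'running'
  retryCount: number;
  maxRetries?: number; // 0 or unset = no retry
}

type TimeoutDecision = 'keep_running' | 'retry' | 'timed_out';

function isRetriable(step: RunningStep): boolean {
  return (step.maxRetries ?? 0) > 0;
}

function shouldRetry(step: RunningStep): boolean {
  return isRetriable(step) && step.retryCount < (step.maxRetries ?? 0);
}

// Decide what to do with a running step on one pass of the advance loop.
function decideOnTick(
  step: RunningStep,
  now: number,
  timeoutMs = 300_000, // matches the FLOW_STEP_TIMEOUT_MS default
): TimeoutDecision {
  if (now - step.startedAt <= timeoutMs) return 'keep_running';
  return shouldRetry(step) ? 'retry' : 'timed_out';
}
```

This collapses "mark `timed_out`, then re-dispatch" into a single decision for brevity; the real loop may record the `timed_out` status before retrying.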

View File

@@ -266,7 +266,7 @@ CREATE INDEX IF NOT EXISTS claude_session_entries_key_idx ON claude_session_entr
-- replaces it with the three-value list).
ALTER TABLE agent_sessions DROP CONSTRAINT IF EXISTS agent_sessions_backend_chk;
ALTER TABLE agent_sessions ADD CONSTRAINT agent_sessions_backend_chk
- CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk'));
+ CHECK (backend IN ('opencode_server', 'acp_warm', 'claude_sdk', 'paseo'));
-- LISTEN/NOTIFY fast path: every tasks INSERT (from any call site — routes,
-- new_task tool, MCP server) fires pg_notify('tasks_new') in the same
@@ -340,11 +340,12 @@ CREATE INDEX IF NOT EXISTS flow_steps_task_id_idx ON flow_steps(task_id);
-- edits above are no-ops on the existing DB (CREATE TABLE IF NOT EXISTS skips an
-- existing table) — widen via the repo's DROP-IF-EXISTS → guarded-ADD discipline.
-- Pure ADD of a new allowed value, so no row UPDATE is needed (no value renamed).
-- v2.9.x: widen status CHECKs to include 'timed_out' for Task State Machine.
ALTER TABLE flow_runs DROP CONSTRAINT IF EXISTS flow_runs_status_chk;
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'flow_runs_status_chk') THEN
ALTER TABLE flow_runs ADD CONSTRAINT flow_runs_status_chk
-CHECK (status IN ('running', 'completed', 'failed', 'cancelled'));
+CHECK (status IN ('running', 'completed', 'failed', 'cancelled', 'timed_out'));
END IF;
END $$;
@@ -352,10 +353,14 @@ ALTER TABLE flow_steps DROP CONSTRAINT IF EXISTS flow_steps_status_chk;
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'flow_steps_status_chk') THEN
ALTER TABLE flow_steps ADD CONSTRAINT flow_steps_status_chk
-CHECK (status IN ('pending', 'running', 'completed', 'failed', 'skipped', 'cancelled'));
+CHECK (status IN ('pending', 'running', 'completed', 'failed', 'skipped', 'cancelled', 'timed_out'));
END IF;
END $$;
-- Task State Machine: retry columns for flow_steps.
ALTER TABLE flow_steps ADD COLUMN IF NOT EXISTS retry_count INTEGER NOT NULL DEFAULT 0;
ALTER TABLE flow_steps ADD COLUMN IF NOT EXISTS max_retries INTEGER;
-- Arena: battles + contestants + cross_examinations.
-- project_id carries no FK (matches tasks.project_id + flow_runs.project_id convention).
-- winner_contestant_id FK is deferred (forward reference): added via guarded ALTER below.


@@ -1,16 +1,20 @@
import { describe, it, expect } from 'vitest';
import type { Flow, Step, StepContext } from '../../conductor/types.js';
import {
buildBatchState,
getReadyInBatch,
manifestSteps,
-readySteps,
partitionReady,
+readySteps,
isRunComplete,
isStuck,
reconcileResumeStep,
reconcileRun,
resolveSwitch,
shouldFailOnMissingAgent,
type SchedulerState,
} from '../flow-runner-decisions.js';
// StepContext is already imported above; re-importing it would be a duplicate identifier.
// TriggerRule is needed by the getReadyInBatch helpers below (assumed to be exported
// from conductor/types, where Flow.batchConfig references it).
import type { TriggerRule } from '../../conductor/types.js';
/**
* The DB-driven flow-runner replaces the Phase-1 in-memory wave scheduler
@@ -52,6 +56,8 @@ const emptyState = (over: Partial<SchedulerState> = {}): SchedulerState => ({
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
...over,
});
@@ -237,6 +243,442 @@ describe('isRunComplete / isStuck', () => {
});
});
// ─── SWITCH branching (v2.9) ─────────────────────────────────────────────────
describe('resolveSwitch', () => {
const baseCtx: StepContext = { input: { question: 'q', band: 'small' }, results: {} };
it('selects the first matching case and excludes other branches', () => {
const step: Step = {
id: 'router',
kind: 'switch',
run: () => '',
cases: [
{ label: 'a', condition: () => false, stepIds: ['a1', 'a2'] },
{ label: 'b', condition: () => true, stepIds: ['b1', 'b2'] },
{ label: 'c', condition: () => true, stepIds: ['c1', 'c2'] },
],
};
const result = resolveSwitch(step, baseCtx);
expect(result.chosenCase).toBe('b');
expect(result.excluded).toEqual(['a1', 'a2', 'c1', 'c2']);
});
it('falls back to defaultBranch when no case matches', () => {
const step: Step = {
id: 'router',
kind: 'switch',
run: () => '',
cases: [
{ label: 'x', condition: () => false, stepIds: ['x1'] },
{ label: 'y', condition: () => false, stepIds: ['y1'] },
],
defaultBranch: ['z1', 'z2'],
};
const result = resolveSwitch(step, baseCtx);
expect(result.chosenCase).toBeNull();
// Only case branch steps are excluded; default steps are not.
expect(result.excluded).toEqual(['x1', 'y1']);
});
it('excludes all branch steps when no case matches and no default', () => {
const step: Step = {
id: 'router',
kind: 'switch',
run: () => '',
cases: [
{ label: 'p', condition: () => false, stepIds: ['p1'] },
{ label: 'q', condition: () => false, stepIds: ['q1', 'q2'] },
],
};
const result = resolveSwitch(step, baseCtx);
expect(result.chosenCase).toBeNull();
expect(result.excluded).toEqual(['p1', 'q1', 'q2']);
});
it('excludes defaultBranch when a case matched', () => {
const step: Step = {
id: 'router',
kind: 'switch',
run: () => '',
cases: [
{ label: 'hit', condition: () => true, stepIds: ['h1'] },
{ label: 'miss', condition: () => false, stepIds: ['m1'] },
],
defaultBranch: ['d1'],
};
const result = resolveSwitch(step, baseCtx);
expect(result.chosenCase).toBe('hit');
expect(result.excluded).toEqual(['m1', 'd1']);
});
it('returns empty excluded for a degenerate switch with no cases and no default', () => {
const step: Step = {
id: 'noop',
kind: 'switch',
run: () => '',
};
const result = resolveSwitch(step, baseCtx);
expect(result.chosenCase).toBeNull();
expect(result.excluded).toEqual([]);
});
it('uses ctx.results in condition evaluation', () => {
const step: Step = {
id: 'router',
kind: 'switch',
run: () => '',
cases: [
{ label: 'has', condition: (ctx) => ctx.results['prev'] === 'yes', stepIds: ['yes-branch'] },
{ label: 'no', condition: () => true, stepIds: ['no-branch'] },
],
};
const ctxWithResult: StepContext = { input: { question: 'q', band: 'small' }, results: { prev: 'yes' } };
const result = resolveSwitch(step, ctxWithResult);
expect(result.chosenCase).toBe('has');
expect(result.excluded).toEqual(['no-branch']);
});
});
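
The behavior pinned down by the `resolveSwitch` tests above can be summarized in a short sketch. This is not the implementation from `flow-runner-decisions.ts` (which is not shown in this diff). `resolveSwitchSketch` and the `Switch*` types are hypothetical names, reduced to the fields the tests exercise.

```typescript
// Hypothetical, trimmed-down shapes; the real Step/StepContext types carry more fields.
interface SwitchContext { input: Record<string, unknown>; results: Record<string, unknown> }
interface SwitchCase { label: string; condition: (ctx: SwitchContext) => boolean; stepIds: string[] }
interface SwitchStep { id: string; cases?: SwitchCase[]; defaultBranch?: string[] }
interface SwitchResolution { chosenCase: string | null; excluded: string[] }

function resolveSwitchSketch(step: SwitchStep, ctx: SwitchContext): SwitchResolution {
  const cases = step.cases ?? [];
  // First matching case wins; later cases are not chosen even if they also match.
  const chosen = cases.find((c) => c.condition(ctx)) ?? null;

  // Every non-selected case contributes its steps to the exclusion list, in case order.
  const excluded: string[] = [];
  for (const c of cases) {
    if (c !== chosen) excluded.push(...c.stepIds);
  }
  // The default branch runs only when nothing matched, so exclude it on a match.
  if (chosen && step.defaultBranch) excluded.push(...step.defaultBranch);

  return { chosenCase: chosen ? chosen.label : null, excluded };
}
```

Because the function takes only the step and its context and returns plain data, the scheduler can store the result and consult it on later ticks without re-evaluating conditions.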
describe('readySteps with switch-excluded steps', () => {
// Flow: switch router → branch-a/branch-b → fold
function switchFlow(): Flow {
const steps: Step[] = [
{
id: 'switch', kind: 'switch', run: () => '',
cases: [
{ label: 'a', condition: () => true, stepIds: ['branch-a'] },
{ label: 'b', condition: () => false, stepIds: ['branch-b'] },
],
},
{ id: 'branch-a', kind: 'agent', agent: 'x', deps: ['switch'], run: () => 'p' },
{ id: 'branch-b', kind: 'agent', agent: 'y', deps: ['switch'], run: () => 'q' },
{ id: 'fold', kind: 'code', deps: ['branch-a', 'branch-b'], run: () => 'r' },
];
return { name: 'switch-demo', description: '', steps, render: () => '' };
}
it('excludes non-selected branch steps and treats them as satisfied deps', () => {
const flow = switchFlow();
// switch completed, branch-b excluded by switch (branch-a selected)
const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
]);
const state: SchedulerState = {
done: new Set(['switch']),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: switchResult,
};
const ready = readySteps(flow, state).map((s) => s.id);
// branch-a is ready (dep switch is done), branch-b is excluded
expect(ready).toContain('branch-a');
expect(ready).not.toContain('branch-b');
});
it('fold unblocks once selected branch completes (excluded branch satisfied)', () => {
const flow = switchFlow();
const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
]);
const state: SchedulerState = {
done: new Set(['switch', 'branch-a']),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: switchResult,
};
const ready = readySteps(flow, state).map((s) => s.id);
// fold's deps: branch-a done, branch-b excluded (via switch) → satisfied
expect(ready).toContain('fold');
});
it('fold stays blocked until selected branch completes, even with excluded dep', () => {
const flow = switchFlow();
const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
]);
const state: SchedulerState = {
done: new Set(['switch']),
skipped: new Set(),
inFlight: new Set(['branch-a']),
excluded: new Set(),
timedOut: new Set(),
switchResults: switchResult,
};
const ready = readySteps(flow, state).map((s) => s.id);
// branch-a is still in flight (not done) and branch-b is excluded, so fold must stay blocked
expect(ready).not.toContain('fold');
});
it('isRunComplete returns true when switch-excluded steps are the only unsettled', () => {
const flow = switchFlow();
// All non-excluded steps done; branch-b is excluded via switch
const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
]);
const state: SchedulerState = {
done: new Set(['switch', 'branch-a', 'fold']),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: switchResult,
};
expect(isRunComplete(flow, state)).toBe(true);
expect(isStuck(flow, state)).toBe(false);
});
it('combines static excluded with switch-excluded', () => {
const flow = switchFlow();
// band gating excludes branch-b at launch, AND switch also excludes it
const switchResult = new Map<string, { chosenCase: string | null; excluded: Set<string> }>([
['switch', { chosenCase: 'a', excluded: new Set(['branch-b']) }],
]);
const state: SchedulerState = {
done: new Set(['switch', 'branch-a']),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(['branch-b']),
timedOut: new Set(),
switchResults: switchResult,
};
// branch-b excluded both ways; fold sees branch-a done, branch-b excluded
const ready = readySteps(flow, state).map((s) => s.id);
expect(ready).toContain('fold');
});
});
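
The dependency rule these tests rely on is that a step excluded either statically or by a switch counts as a satisfied dependency. A minimal sketch of that rule follows. `depsSatisfied`, `allExcluded`, and `DepState` are hypothetical names; the real `readySteps` also has to account for skipped, in-flight, and timed-out steps, which this sketch ignores.

```typescript
// Hypothetical subset of SchedulerState, limited to what the rule needs.
interface SwitchOutcome { chosenCase: string | null; excluded: Set<string> }
interface DepState {
  done: Set<string>;
  excluded: Set<string>;                      // static exclusions (e.g. band gating)
  switchResults: Map<string, SwitchOutcome>;  // per-switch runtime exclusions
}

// Union of static exclusions and every switch's excluded branch steps.
function allExcluded(state: DepState): Set<string> {
  const out = new Set<string>();
  state.excluded.forEach((id) => out.add(id));
  state.switchResults.forEach((r) => r.excluded.forEach((id) => out.add(id)));
  return out;
}

// A dependency is satisfied when it completed or will never run.
function depsSatisfied(deps: string[], state: DepState): boolean {
  const excluded = allExcluded(state);
  return deps.every((d) => state.done.has(d) || excluded.has(d));
}
```

With this rule a join step such as `fold` waits only for the branch that was actually selected.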
// ─── Batch parallelism (v2.8.22) ─────────────────────────────────────────────
describe('buildBatchState', () => {
it('returns empty map when flow has no batchConfig', () => {
const flow: Flow = {
name: 'no-batch',
description: '',
steps: [
{ id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
{ id: 'b', kind: 'code', deps: ['a'], run: () => 'r' },
],
render: () => '',
};
const bs = buildBatchState(flow, new Set());
expect(bs.size).toBe(0);
});
it('maps each batch group to its running set and config', () => {
const flow: Flow = {
name: 'batched',
description: '',
steps: [
{ id: 'a1', kind: 'agent', agent: 'x', batch: 'review', run: () => 'p' },
{ id: 'a2', kind: 'agent', agent: 'y', batch: 'review', run: () => 'q' },
{ id: 'b1', kind: 'agent', agent: 'z', batch: 'check', run: () => 'r' },
{ id: 'fold', kind: 'code', deps: ['a1', 'a2', 'b1'], run: () => 's' },
],
render: () => '',
batchConfig: { maxConcurrent: 2 },
};
// a1 is in flight → review batch has 1 running, check has 0.
const bs = buildBatchState(flow, new Set(['a1']));
expect(bs.size).toBe(2);
const review = bs.get('review');
expect(review).toBeDefined();
expect([...review!.running]).toEqual(['a1']);
expect(review!.maxConcurrent).toBe(2);
expect(review!.joinRule).toBe('all_success');
const check = bs.get('check');
expect(check).toBeDefined();
expect(check!.running.size).toBe(0);
expect(check!.maxConcurrent).toBe(2);
});
it('uses joinRule from batchConfig when provided', () => {
const flow: Flow = {
name: 'join',
description: '',
steps: [
{ id: 'x', kind: 'agent', agent: 'a', batch: 'g1', run: () => 'p' },
],
render: () => '',
batchConfig: { maxConcurrent: 1, joinRule: 'one_success' },
};
const bs = buildBatchState(flow, new Set());
expect(bs.get('g1')!.joinRule).toBe('one_success');
});
it('ignores steps without a batch field', () => {
const flow: Flow = {
name: 'mixed',
description: '',
steps: [
{ id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
{ id: 'b', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
],
render: () => '',
batchConfig: { maxConcurrent: 3 },
};
const bs = buildBatchState(flow, new Set(['a', 'b']));
// a is inFlight but has no batch — it does not create an entry
expect(bs.size).toBe(1);
expect(bs.has('g1')).toBe(true);
expect(bs.get('g1')!.running.has('b')).toBe(true);
// a is not in any batch entry
for (const entry of bs.values()) {
expect(entry.running.has('a')).toBe(false);
}
});
});
describe('getReadyInBatch', () => {
function makeBatchState(
overrides?: Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>,
): Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }> {
return overrides ?? new Map();
}
it('passes all steps through when batchState is empty', () => {
const steps: Step[] = [
{ id: 'a', kind: 'agent', agent: 'x', run: () => 'p' },
{ id: 'b', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState: makeBatchState(),
};
const result = getReadyInBatch(steps, state, {} as Flow);
expect(result.map((s) => s.id)).toEqual(['a', 'b']);
});
it('passes non-batched steps through regardless of batch capacity', () => {
const batchState = new Map();
batchState.set('g1', { running: new Set(['a']), maxConcurrent: 1, joinRule: 'all_success' });
const steps: Step[] = [
{ id: 'nobatch', kind: 'agent', agent: 'z', run: () => 'r' },
{ id: 'batched', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(['a']),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState,
};
const result = getReadyInBatch(steps, state, {} as Flow);
// nobatch passes, batched is at maxConcurrent=1 with a already running → blocked
expect(result.map((s) => s.id)).toEqual(['nobatch']);
});
it('allows batch steps up to maxConcurrent', () => {
const batchState = new Map();
batchState.set('g1', { running: new Set(), maxConcurrent: 2, joinRule: 'all_success' });
const steps: Step[] = [
{ id: 's1', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
{ id: 's2', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
{ id: 's3', kind: 'agent', agent: 'z', batch: 'g1', run: () => 'r' },
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState,
};
// No step is running and maxConcurrent=2, so all 3 candidates pass the filter:
// getReadyInBatch gates only on the batch's current running count, not on how many
// candidates share the tick. The runner's agent dispatch loop puts steps in flight
// one by one; once 2 are running, the next tick's call blocks the remainder.
const result = getReadyInBatch(steps, state, {} as Flow);
expect(result.map((s) => s.id)).toEqual(['s1', 's2', 's3']);
});
it('blocks batch steps when at capacity', () => {
const batchState = new Map();
batchState.set('g1', { running: new Set(['a', 'b']), maxConcurrent: 2, joinRule: 'all_success' });
const steps: Step[] = [
{ id: 'c', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
{ id: 'd', kind: 'agent', agent: 'y', batch: 'g1', run: () => 'q' },
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(['a', 'b']),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState,
};
// g1 is at capacity (2 running, maxConcurrent=2) → both candidates are filtered out
expect(getReadyInBatch(steps, state, {} as Flow)).toEqual([]);
});
it('handles multiple independent batch groups', () => {
const batchState = new Map();
batchState.set('g1', { running: new Set(['a']), maxConcurrent: 1, joinRule: 'all_success' });
batchState.set('g2', { running: new Set(), maxConcurrent: 5, joinRule: 'all_success' });
const steps: Step[] = [
{ id: 'b', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' }, // g1 at capacity → blocked
{ id: 'c', kind: 'agent', agent: 'y', batch: 'g2', run: () => 'q' }, // g2 has room → passes
{ id: 'd', kind: 'agent', agent: 'z', batch: 'g2', run: () => 'r' }, // g2 has room → passes
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(['a']),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState,
};
expect(getReadyInBatch(steps, state, {} as Flow).map((s) => s.id)).toEqual(['c', 'd']);
});
it('lets a step pass when its batch group is known but has no running steps yet', () => {
const batchState = new Map();
batchState.set('g1', { running: new Set(), maxConcurrent: 2, joinRule: 'all_success' });
const steps: Step[] = [
{ id: 'first', kind: 'agent', agent: 'x', batch: 'g1', run: () => 'p' },
];
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState,
};
expect(getReadyInBatch(steps, state, {} as Flow).map((s) => s.id)).toEqual(['first']);
});
it('handles empty step list gracefully', () => {
const state: SchedulerState = {
done: new Set(),
skipped: new Set(),
inFlight: new Set(),
excluded: new Set(),
timedOut: new Set(),
switchResults: new Map(),
batchState: makeBatchState(),
};
expect(getReadyInBatch([], state, {} as Flow)).toEqual([]);
});
});
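
Taken together, the `getReadyInBatch` tests describe a simple capacity filter. The sketch below is one way to write a filter with the same observable behavior. `filterByBatchCapacity`, `BatchEntry`, and `Candidate` are hypothetical names, and the real helper takes a `SchedulerState` and a `Flow` rather than a bare map.

```typescript
// Hypothetical shapes, reduced to what the capacity check needs.
interface BatchEntry { running: Set<string>; maxConcurrent: number }
interface Candidate { id: string; batch?: string }

function filterByBatchCapacity(
  steps: Candidate[],
  batchState: Map<string, BatchEntry>,
): Candidate[] {
  return steps.filter((s) => {
    if (!s.batch) return true;              // non-batched steps are never gated
    const entry = batchState.get(s.batch);
    if (!entry) return true;                // unknown batch group: nothing to gate on
    // Gate on what is already running, not on how many candidates arrive this tick.
    return entry.running.size < entry.maxConcurrent;
  });
}
```

One consequence, visible in the 'allows batch steps up to maxConcurrent' test, is that an idle batch lets every candidate through on the same tick. The limit is enforced as steps move in flight and the filter is re-run.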
// ─── Resume reconciliation (D-9) ─────────────────────────────────────────────
describe('reconcileResumeStep', () => {


@@ -0,0 +1,195 @@
import { describe, it, expect, vi } from 'vitest';
import { PaseoClient, PaseoClientError } from '../paseo-client.js';
/**
* Create a PaseoClient whose runCli method is replaced with a mock.
* The mock is returned alongside the client as `mockRunCli` so tests can
* control and inspect it directly.
*/
function makeClient(config?: { paseoBin?: string; cliHost?: string }): {
client: PaseoClient;
mockRunCli: ReturnType<typeof vi.fn>;
} {
const client = new PaseoClient(config);
const mockRunCli = vi.fn();
(client as any).runCli = mockRunCli;
return { client, mockRunCli };
}
describe('PaseoClient', () => {
describe('listAgents', () => {
it('returns parsed agent list from paseo ls --json', async () => {
const agents = [
{ id: 'abc-123', shortId: 'abc', name: 'Agent 1', provider: 'opencode', status: 'running' },
{ id: 'def-456', shortId: 'def', name: 'Agent 2', provider: 'claude', status: 'idle' },
];
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue(JSON.stringify(agents));
const result = await client.listAgents();
expect(mockRunCli).toHaveBeenCalledWith(['ls', '--json']);
expect(result).toEqual(agents);
});
it('throws PaseoClientError on non-JSON output', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('not json');
await expect(client.listAgents()).rejects.toThrow(PaseoClientError);
await expect(client.listAgents()).rejects.toThrow(/invalid JSON/);
});
it('propagates runCli rejection as-is', async () => {
const { client, mockRunCli } = makeClient();
const err = new PaseoClientError('ls failed: connection refused', 'ls', 1, 'connection refused');
mockRunCli.mockRejectedValue(err);
await expect(client.listAgents()).rejects.toThrow(PaseoClientError);
await expect(client.listAgents()).rejects.toThrow(/ls failed/);
});
});
describe('getAgentStatus', () => {
it('returns parsed agent detail from paseo inspect --json', async () => {
const detail = {
Id: 'abc-123', Name: 'Agent 1', Provider: 'opencode',
Status: 'idle', Archived: false,
CreatedAt: '2026-01-01T00:00:00Z', UpdatedAt: '2026-01-01T01:00:00Z',
};
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue(JSON.stringify(detail));
const result = await client.getAgentStatus('abc-123');
expect(mockRunCli).toHaveBeenCalledWith(['inspect', '--json', 'abc-123']);
expect(result.Id).toBe('abc-123');
expect(result.Status).toBe('idle');
});
});
describe('health', () => {
it('returns ok when paseo ls succeeds', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('[]');
const result = await client.health();
expect(result).toEqual({ status: 'ok' });
});
it('returns error when runCli throws', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockRejectedValue(new Error('connection refused'));
const result = await client.health();
expect(result).toEqual({ status: 'error' });
});
});
describe('importAgent', () => {
it('calls paseo import with provider and labels', async () => {
const agentResult = { Id: 'new-789', Name: 'Imported', Provider: 'opencode', Status: 'idle' };
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue(JSON.stringify(agentResult));
const result = await client.importAgent('ses-001', 'opencode', {
origin: 'boocode',
project: 'proj-1',
});
expect(mockRunCli).toHaveBeenCalledWith([
'import', '--json',
'--provider', 'opencode',
'--label', 'origin=boocode',
'--label', 'project=proj-1',
'ses-001',
]);
expect(result.Id).toBe('new-789');
});
it('works without labels', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue(JSON.stringify({ Id: 'new-789' }));
const result = await client.importAgent('ses-001', 'claude');
expect(mockRunCli).toHaveBeenCalledWith([
'import', '--json',
'--provider', 'claude',
'ses-001',
]);
expect(result.Id).toBe('new-789');
});
});
describe('archiveAgent', () => {
it('calls paseo archive --json', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('{}');
await client.archiveAgent('abc-123');
expect(mockRunCli).toHaveBeenCalledWith(['archive', '--json', 'abc-123']);
});
});
describe('sendPrompt', () => {
it('sends prompt and parses JSON result', async () => {
const sendResult = { text: 'Hello!', ok: true };
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue(JSON.stringify(sendResult));
const result = await client.sendPrompt('abc-123', 'Hello');
expect(mockRunCli).toHaveBeenCalledWith(['send', '--json', 'abc-123', 'Hello'], undefined);
expect(result).toEqual(sendResult);
});
it('falls back to plain text on non-JSON output', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('plain text response');
const result = await client.sendPrompt('abc-123', 'Hi');
expect(result).toEqual({ text: 'plain text response', ok: true });
});
it('supports --no-wait flag', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('{}');
await client.sendPrompt('abc-123', 'Hi', { noWait: true });
expect(mockRunCli).toHaveBeenCalledWith([
'send', '--json', '--no-wait',
'abc-123', 'Hi',
], undefined);
});
});
describe('stopAgent', () => {
it('calls paseo stop', async () => {
const { client, mockRunCli } = makeClient();
mockRunCli.mockResolvedValue('');
await client.stopAgent('abc-123');
expect(mockRunCli).toHaveBeenCalledWith(['stop', 'abc-123']);
});
});
describe('cliHost config', () => {
it('includes --host flag in args when cliHost is set', async () => {
const { client, mockRunCli } = makeClient({ cliHost: 'tcp://localhost:6767?ssl=true' });
mockRunCli.mockResolvedValue('[]');
await client.listAgents();
expect(mockRunCli).toHaveBeenCalledWith([
'ls', '--json', '--host', 'tcp://localhost:6767?ssl=true',
]);
});
});
});
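
The argument lists asserted in the `importAgent` tests, and the JSON fallback asserted in the `sendPrompt` tests, suggest helpers along the following lines. This is a sketch reconstructed from the test expectations only. `buildImportArgs` and `parseSendOutput` are hypothetical names, and the real `paseo-client.ts` may be structured differently.

```typescript
// Hypothetical helper: build the argv for `paseo import`, matching the order the tests expect.
function buildImportArgs(
  sessionId: string,
  provider: string,
  labels: Record<string, string> = {},
): string[] {
  const args = ['import', '--json', '--provider', provider];
  // One `--label key=value` pair per label, in insertion order.
  for (const [key, value] of Object.entries(labels)) {
    args.push('--label', `${key}=${value}`);
  }
  args.push(sessionId); // the provider session id goes last
  return args;
}

// Hypothetical helper: accept JSON output, fall back to treating stdout as plain text.
function parseSendOutput(stdout: string): { text?: string; ok?: boolean; error?: string } {
  try {
    return JSON.parse(stdout);
  } catch {
    return { text: stdout, ok: true };
  }
}
```

Building the argv as an array rather than a shell string avoids quoting problems when label values or prompts contain spaces.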


@@ -13,7 +13,7 @@ import type { AcpToolSnapshot } from './acp-tool-snapshot.js';
import type { AgentCommand } from './provider-types.js';
/** Backend transport kind. Mirrors `agent_sessions.backend` CHECK in schema.sql. */
-export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk';
+export type AgentBackendKind = 'opencode_server' | 'acp_warm' | 'claude_sdk' | 'paseo';
/**
* Normalized, transport-agnostic events a backend emits during a turn (§2).


@@ -0,0 +1,254 @@
/**
* v2.10 — PaseoBackend: Paseo agent integration for the agent-pool.
*
* Wraps the Paseo CLI daemon as an AgentBackend. Each Paseo agent maps to one
* (chat_id, agent) pair and is persisted via `paseo import` (which registers
* an agent with the Paseo daemon). Prompts are sent via `paseo send`, and
* the session is cleaned up via `paseo archive`.
*
* Paseo is a meta-agent hub — it wraps provider sessions (opencode, claude,
* acp, etc.). The `provider` option in `EnsureSessionOpts` selects which
* provider Paseo delegates to.
*
* Backend kind: 'paseo' (must be added to agent_sessions_backend_chk).
*
* Spec: openspec/changes/v2-10-paseo-integration/design.md.
*/
import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../../db.js';
import { PaseoClient, type PaseoSendResult } from '../paseo-client.js';
import type {
AgentBackend,
AgentSessionHandle,
EnsureSessionOpts,
PromptCtx,
TurnResult,
} from '../agent-backend.js';
/** Default provider to use when Paseo wraps a generic agent. */
const DEFAULT_PASEO_PROVIDER = 'opencode';
export interface PaseoBackendDeps {
sql: Sql;
log: FastifyBaseLogger;
/** The (chat, agent) this backend serves — its pool identity + DB key. */
chatId: string;
/** Agent name (e.g. 'opencode', 'claude', 'paseo'). */
agent: string;
/** Resolved PaseoClient instance. */
client: PaseoClient;
/** Provider string to pass to `paseo import --provider`. */
provider: string;
}
export class PaseoBackend implements AgentBackend {
readonly backend = 'paseo' as const;
private readonly sql: Sql;
private readonly log: FastifyBaseLogger;
private readonly chatId: string;
private readonly agent: string;
private readonly client: PaseoClient;
private readonly provider: string;
/** Map of BooCode sessionId → Paseo agent ID. */
private readonly agentIds = new Map<string, string>();
/** True between prompt() start and settle. */
private busy = false;
private up = false;
constructor(deps: PaseoBackendDeps) {
this.sql = deps.sql;
this.log = deps.log;
this.chatId = deps.chatId;
this.agent = deps.agent;
this.client = deps.client;
this.provider = deps.provider || DEFAULT_PASEO_PROVIDER;
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
/** Phase 3: busy iff a turn is in flight (pool never evicts a busy backend). */
isBusy(): boolean {
return this.busy;
}
// ─── ensureSession: create/import a Paseo agent ─────────────────────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
// Check if we already have a Paseo agent ID for this session.
let paseoId = this.agentIds.get(sessionId);
if (!paseoId) {
// Resolve existing agent_session_id from DB (e.g. after a restart).
const [row] = await this.sql<{ agent_session_id: string | null }[]>`
SELECT agent_session_id FROM agent_sessions
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent} AND backend = 'paseo'
`;
if (row?.agent_session_id) {
paseoId = row.agent_session_id;
this.agentIds.set(sessionId, paseoId);
}
}
if (!paseoId) {
// Import a new Paseo agent. Use the session UUID as the provider session id.
const labels: Record<string, string> = {
origin: 'boocode',
project: opts.projectId,
chat: opts.chatId,
worktree: opts.worktreeId,
agent: this.agent,
};
try {
const agent = await this.client.importAgent(sessionId, this.provider, labels);
paseoId = agent.Id;
this.agentIds.set(sessionId, paseoId);
this.log.info(
{ paseoId, agent: this.agent, chatId: this.chatId },
'paseo: imported agent',
);
} catch (err) {
this.log.error(
{ err: String(err), agent: this.agent, chatId: this.chatId },
'paseo: importAgent failed',
);
throw err;
}
}
// Upsert the agent_sessions row.
await this.sql`
INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at)
VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'paseo', ${paseoId}, NULL, 'active', clock_timestamp())
ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id,
backend = 'paseo',
agent_session_id = COALESCE(EXCLUDED.agent_session_id, agent_sessions.agent_session_id),
server_port = NULL,
status = 'active',
last_active_at = clock_timestamp()
`.catch((err) => {
this.log.warn(
{ err: String(err), chatId: opts.chatId, agent: opts.agent },
'paseo: agent_sessions upsert failed (non-fatal)',
);
});
this.up = true;
return {
sessionId,
agent: opts.agent,
backend: 'paseo',
chatId: opts.chatId,
worktreeId: opts.worktreeId,
agentSessionId: paseoId,
serverPort: null,
};
}
// ─── prompt: send a message to the Paseo agent ─────────────────────────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
const paseoId = handle.agentSessionId;
if (!paseoId) {
return { ok: false, error: 'paseo: no agent session id in handle' };
}
this.busy = true;
try {
// Use streamSend for real-time text output via onEvent.
const result: PaseoSendResult = await this.client.streamSend(
paseoId,
input,
(event) => {
ctx.onEvent(event);
},
ctx.signal,
);
// Update last_active_at.
await this.sql`
UPDATE agent_sessions
SET last_active_at = clock_timestamp()
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => { /* non-fatal */ });
if (result.error) {
return { ok: false, error: result.error };
}
return { ok: true };
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
// Distinguish a caller-initiated abort from a genuine failure.
if (ctx.signal.aborted) {
return { ok: false, error: 'cancelled' };
}
return { ok: false, error: `paseo: ${msg}` };
} finally {
this.busy = false;
}
}
// ─── closeSession: archive the Paseo agent ─────────────────────────────────
async closeSession(handle: AgentSessionHandle): Promise<void> {
const paseoId = handle.agentSessionId;
if (!paseoId) return;
try {
await this.client.archiveAgent(paseoId);
this.log.info({ paseoId, agent: handle.agent }, 'paseo: archived agent');
} catch (err) {
this.log.warn(
{ err: String(err), paseoId, agent: handle.agent },
'paseo: archiveAgent failed (non-fatal)',
);
}
this.agentIds.delete(handle.sessionId);
// Update DB row.
await this.sql`
UPDATE agent_sessions
SET status = 'closed', last_active_at = clock_timestamp()
WHERE chat_id = ${handle.chatId} AND agent = ${handle.agent}
`.catch(() => { /* non-fatal */ });
}
// ─── dispose: archive all tracked agents ───────────────────────────────────
async dispose(): Promise<void> {
const ids = [...this.agentIds.values()];
this.agentIds.clear();
for (const paseoId of ids) {
try {
await this.client.archiveAgent(paseoId);
} catch {
// Best-effort cleanup during shutdown.
}
}
this.up = false;
}
/** Phase 3: periodic health tick — probes the Paseo daemon. */
async tickHealth(_now?: number): Promise<void> {
try {
const h = await this.client.health();
this.up = h.status === 'ok';
} catch {
this.up = false;
}
}
}
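
The busy flag and error mapping in `prompt()` follow a common pattern: set the flag, run the turn, map abort versus failure, and clear the flag in `finally`. The sketch below isolates that pattern. `BusyGuard` and `SketchTurnResult` are hypothetical names and are not part of the backend. The sketch assumes a runtime with the standard `AbortController`/`AbortSignal` globals.

```typescript
// Hypothetical result shape mirroring the { ok, error } values returned by prompt().
type SketchTurnResult = { ok: true } | { ok: false; error: string };

class BusyGuard {
  private busy = false;

  isBusy(): boolean {
    return this.busy;
  }

  async run(
    work: (signal: AbortSignal) => Promise<void>,
    signal: AbortSignal,
  ): Promise<SketchTurnResult> {
    this.busy = true;
    try {
      await work(signal);
      return { ok: true };
    } catch (err) {
      // An aborted signal means the caller cancelled; report that instead of the raw error.
      if (signal.aborted) return { ok: false, error: 'cancelled' };
      const msg = err instanceof Error ? err.message : String(err);
      return { ok: false, error: `paseo: ${msg}` };
    } finally {
      // Always clear the flag so a pool can evict or reuse the backend afterwards.
      this.busy = false;
    }
  }
}
```

Clearing the flag in `finally` matters because a pool that never evicts a busy backend would otherwise keep a failed backend around indefinitely.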


@@ -0,0 +1,115 @@
// v2.8 Collision detection — pure functions that find file overlaps between
// worktrees/agents editing the same files concurrently. Advisory only; writes
// are never blocked, but the collision info surfaces in the UI and logs.
//
// Severity levels:
// same_line — the same file, exact same line region
// adjacent_line — the same file, lines touch or are within 5 lines
// different_area — the same file, distant lines
//
// Pure functions, no side effects. Testable in isolation.
export type ConflictSeverity = 'same_line' | 'adjacent_line' | 'different_area';
export interface ConflictVerdict {
filePath: string;
worktrees: string[];
severity: ConflictSeverity;
agents: string[];
}
/**
* Registry entry for a single file change recorded by a worktree.
* Stored in the ConflictIndex Map value for each file path.
*/
export interface ConflictEntry {
worktreeId: string;
agent: string;
/**
* Approximate line range touched by the change. undefined when the change
* creates or deletes the file (full-file collision vs. same-line).
*/
lineRange?: { start: number; end: number };
status: 'pending' | 'applied' | 'reverted';
timestamp: number;
}
/**
* Shape of the conflict index consumed by findConflicts.
* File path → set of entries from different worktrees/agents.
*/
export type ConflictIndexData = ReadonlyMap<string, ReadonlySet<ConflictEntry>>;
/**
* Find file overlaps between `changedFiles` and the conflict index, excluding
* the caller's own worktree.
*
* Returns one ConflictVerdict per file that has entries from other worktrees.
* Severity is the highest found (same_line > adjacent_line > different_area).
*/
export function findConflicts(
changedFiles: string[],
worktreeId: string,
/** Approximate line range for the proposed changes, keyed by file path */
changedRanges: Map<string, { start: number; end: number }>,
conflictIndex: ConflictIndexData,
): ConflictVerdict[] {
const verdicts: ConflictVerdict[] = [];
for (const filePath of changedFiles) {
const entries = conflictIndex.get(filePath);
if (!entries || entries.size === 0) continue;
// Filter to entries from OTHER worktrees
const otherEntries = [...entries].filter((e) => e.worktreeId !== worktreeId);
if (otherEntries.length === 0) continue;
const myRange = changedRanges.get(filePath);
let severity: ConflictSeverity = 'different_area';
for (const entry of otherEntries) {
if (!myRange || !entry.lineRange) {
// Full-file changes (create/delete) always hit at least different_area
continue;
}
const sev = lineOverlapSeverity(myRange, entry.lineRange);
if (sev === 'same_line') {
severity = 'same_line';
break; // Can't get higher than this
}
if (sev === 'adjacent_line' && severity === 'different_area') {
severity = 'adjacent_line';
}
}
const worktrees = [...new Set(otherEntries.map((e) => e.worktreeId))];
const agents = [...new Set(otherEntries.map((e) => e.agent))];
verdicts.push({ filePath, worktrees, severity, agents });
}
return verdicts;
}
const ADJACENT_LINE_THRESHOLD = 5;
/**
* Determine severity of overlap between two line ranges.
*/
function lineOverlapSeverity(
a: { start: number; end: number },
b: { start: number; end: number },
): ConflictSeverity {
// same_line: the two inclusive ranges intersect
if (a.start <= b.end && b.start <= a.end) {
return 'same_line';
}
// Adjacent: ranges are within ADJACENT_LINE_THRESHOLD lines of each other
const gap = a.start > b.end ? a.start - b.end : b.start - a.end;
if (gap <= ADJACENT_LINE_THRESHOLD) {
return 'adjacent_line';
}
return 'different_area';
}
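
The severity rules above can be sketched in isolation. This is a minimal standalone sketch, not the shipped module: `classifyOverlap`, `worst`, `RANK`, and `ADJACENT_THRESHOLD` are hypothetical names, and the threshold of 5 is assumed to match `ADJACENT_LINE_THRESHOLD` in the diff.

```typescript
// Hypothetical standalone sketch; names are illustrative, not the module's exports.
type Range = { start: number; end: number };
type Severity = 'same_line' | 'adjacent_line' | 'different_area';

const ADJACENT_THRESHOLD = 5; // assumed to match ADJACENT_LINE_THRESHOLD in the diff

function classifyOverlap(a: Range, b: Range): Severity {
  // Inclusive ranges intersect when each one starts at or before the other's end.
  if (a.start <= b.end && b.start <= a.end) return 'same_line';
  // Otherwise measure the distance between the two nearest edges.
  const gap = a.start > b.end ? a.start - b.end : b.start - a.end;
  return gap <= ADJACENT_THRESHOLD ? 'adjacent_line' : 'different_area';
}

// Fold several severities into the highest one, as findConflicts does per file.
const RANK: Record<Severity, number> = { different_area: 0, adjacent_line: 1, same_line: 2 };

function worst(severities: Severity[]): Severity {
  let acc: Severity = 'different_area';
  for (const s of severities) {
    if (RANK[s] > RANK[acc]) acc = s;
  }
  return acc;
}

const mine: Range = { start: 10, end: 20 };
const overlapping = classifyOverlap(mine, { start: 20, end: 25 }); // shares line 20
const nearby = classifyOverlap(mine, { start: 25, end: 30 });      // gap of 5 lines
const distant = classifyOverlap(mine, { start: 26, end: 30 });     // gap of 6 lines
```

Because the comparison is inclusive, two ranges that share only a boundary line count as `same_line`, and a gap of exactly 5 still counts as `adjacent_line`.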


@@ -0,0 +1,151 @@
// v2.8 In-memory conflict index — tracks which worktrees/agents are editing
// which files so the collision detector can find overlaps.
//
// Singleton exported as `conflictIndex`; imported by pending_changes.ts to
// register changes at queue time and unregister on worktree teardown.
//
// NOT persisted — survives only as long as the BooCoder process. Postgres
// is the durable record (pending_changes table); this is the hot in-memory
// probe for concurrent edit warnings.
import type { ConflictEntry, ConflictVerdict } from './collision-detector.js';
import { findConflicts } from './collision-detector.js';
export class ConflictIndex {
/**
* filePath → Set of ConflictEntry from various worktrees.
* A single worktree may have multiple entries for the same file
* (several pending edits to the same file in one session).
*/
#map = new Map<string, Set<ConflictEntry>>();
// ---- mutation -------------------------------------------------------
/**
* Register that `worktreeId` (agent) is touching `filePath`.
* Creates an entry in the index so subsequent callers see it as a conflict.
*/
registerChange(
filePath: string,
worktreeId: string,
agent: string,
lineRange?: { start: number; end: number },
): void {
let entries = this.#map.get(filePath);
if (!entries) {
entries = new Set();
this.#map.set(filePath, entries);
}
entries.add({
worktreeId,
agent,
lineRange,
status: 'pending' as const,
timestamp: Date.now(),
});
}
/**
* Remove all entries for a given worktree. Called on worktree teardown
* so stale entries don't trigger false warnings.
*/
removeWorktree(worktreeId: string): void {
for (const [filePath, entries] of this.#map) {
for (const entry of entries) {
if (entry.worktreeId === worktreeId) {
entries.delete(entry);
}
}
if (entries.size === 0) {
this.#map.delete(filePath);
}
}
}
/**
* Remove entries older than `maxAgeMs`. Useful as a periodic cleanup
* when worktree teardown was missed (crash, unclean exit).
*/
sweepStale(maxAgeMs: number): number {
const cutoff = Date.now() - maxAgeMs;
let removed = 0;
for (const [filePath, entries] of this.#map) {
for (const entry of entries) {
if (entry.timestamp < cutoff) {
entries.delete(entry);
removed++;
}
}
if (entries.size === 0) {
this.#map.delete(filePath);
}
}
return removed;
}
// ---- query ----------------------------------------------------------
/**
* Query the raw ConflictEntry set for a file path. Returns an empty set
* when no worktree has registered a change to that file.
*/
getEntriesFor(filePath: string): ReadonlySet<ConflictEntry> {
return this.#map.get(filePath) ?? new Set();
}
/**
* Get all conflict verdicts for a given file path — which other
* worktrees are touching it. Returns empty when only one worktree
* has entries (no actual conflict).
*/
getConflictsFor(filePath: string): ConflictVerdict[] {
const entries = this.#map.get(filePath);
if (!entries || entries.size === 0) return [];
// Determine distinct worktree IDs. If only one, no conflict.
const worktreeIds = new Set<string>();
for (const e of entries) worktreeIds.add(e.worktreeId);
if (worktreeIds.size <= 1) return [];
// Use the first worktree as the "caller" so findConflicts excludes
// its entries and returns only entries from OTHER worktrees. No line
// ranges are passed, so the verdict's severity is always 'different_area'.
const caller = [...worktreeIds][0]!;
return findConflicts(
[filePath],
caller,
new Map(),
this.#toIndexData(),
);
}
/**
* Get conflicts for a set of file changes from a specific worktree.
* Delegates to the pure findConflicts function.
*/
query(
changedFiles: string[],
worktreeId: string,
changedRanges: Map<string, { start: number; end: number }>,
): ConflictVerdict[] {
return findConflicts(changedFiles, worktreeId, changedRanges, this.#toIndexData());
}
/**
* Snapshot the current map for testing/inspection. The copy is shallow:
* the returned Map is new, but its Set values are shared with the live index.
*/
snapshot(): Map<string, ReadonlySet<ConflictEntry>> {
return new Map(this.#map);
}
// ---- private --------------------------------------------------------
#toIndexData(): ReadonlyMap<string, ReadonlySet<ConflictEntry>> {
return this.#map as ReadonlyMap<string, ReadonlySet<ConflictEntry>>;
}
}
// Singleton — the whole BooCoder process shares one conflict index.
export const conflictIndex = new ConflictIndex();
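
The register, query, and sweep lifecycle of the index can be illustrated with a reduced model. This is a sketch under stated assumptions: `MiniIndex`, `othersTouching`, the injected clock, and the worktree and agent names are all hypothetical, and line ranges and status are left out for brevity.

```typescript
// Hypothetical miniature of the conflict index; not the class from the diff.
interface Entry { worktreeId: string; agent: string; timestamp: number }

class MiniIndex {
  private readonly map = new Map<string, Set<Entry>>();
  private readonly now: () => number;

  // The clock is injected (an assumption of this sketch) so staleness is testable.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  register(filePath: string, worktreeId: string, agent: string): void {
    let entries = this.map.get(filePath);
    if (!entries) {
      entries = new Set<Entry>();
      this.map.set(filePath, entries);
    }
    entries.add({ worktreeId, agent, timestamp: this.now() });
  }

  // Distinct worktrees touching the file, excluding the caller's own.
  othersTouching(filePath: string, worktreeId: string): string[] {
    const seen = new Set<string>();
    for (const e of this.map.get(filePath) ?? []) {
      if (e.worktreeId !== worktreeId) seen.add(e.worktreeId);
    }
    return [...seen];
  }

  // Drop entries older than maxAgeMs. Deleting from a Set or Map while
  // iterating it with for...of is well defined in JavaScript.
  sweepStale(maxAgeMs: number): number {
    const cutoff = this.now() - maxAgeMs;
    let removed = 0;
    for (const [filePath, entries] of this.map) {
      for (const e of entries) {
        if (e.timestamp < cutoff) {
          entries.delete(e);
          removed++;
        }
      }
      if (entries.size === 0) this.map.delete(filePath);
    }
    return removed;
  }
}

let clock = 1_000;
const index = new MiniIndex(() => clock);
index.register('src/app.ts', 'wt-a', 'agent-one');
clock = 5_000;
index.register('src/app.ts', 'wt-b', 'agent-two');
const othersBefore = index.othersTouching('src/app.ts', 'wt-b');
const removedCount = index.sweepStale(3_000); // cutoff is 2_000, so the wt-a entry is stale
const othersAfter = index.othersTouching('src/app.ts', 'wt-b');
```

Injecting the clock is a choice made for this sketch only; the class in the diff calls `Date.now()` directly.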


@@ -33,11 +33,43 @@ export interface SchedulerState {
readonly inFlight: ReadonlySet<string>; readonly inFlight: ReadonlySet<string>;
/** step ids pre-skipped at launch (band/when gating) — never given a row */ /** step ids pre-skipped at launch (band/when gating) — never given a row */
readonly excluded: ReadonlySet<string>; readonly excluded: ReadonlySet<string>;
/** step ids that timed out (terminal — no retries remaining or not retriable) */
readonly timedOut: ReadonlySet<string>;
/**
* Per-batch running sets, populated by buildBatchState from the flow definition
* and the current inFlight set. Only read by getReadyInBatch; never mutated by
* decision functions (the caller maintains it across ticks).
*/
readonly batchState?: Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>;
/**
* Per-switch-step routing results. Populated when a SWITCH step completes.
* Step ids in any result's `excluded` set are treated as excluded for the
* remainder of the run — they won't execute and won't block dependents.
*/
readonly switchResults: ReadonlyMap<string, { chosenCase: string | null; excluded: ReadonlySet<string> }>;
} }
/** A dependency is satisfied once it is done, skipped, or excluded. */ /** A dependency is satisfied once it is done, skipped, excluded, or timed out. */
function isSatisfied(state: SchedulerState, id: string): boolean { function isSatisfied(state: SchedulerState, id: string): boolean {
return state.done.has(id) || state.skipped.has(id) || state.excluded.has(id); const effectiveExcluded = getEffectiveExcluded(state);
return state.done.has(id) || state.skipped.has(id) || effectiveExcluded.has(id) || state.timedOut.has(id);
}
/**
* The union of the static `excluded` set and every switch result's excluded
* step ids. Steps excluded by a SWITCH evaluation act exactly like launch-time
* excluded steps: they never run and they don't block dependents.
*/
function getEffectiveExcluded(state: SchedulerState): ReadonlySet<string> {
// Fast path: no switch results → static excluded only.
if (state.switchResults.size === 0) return state.excluded;
const combined = new Set(state.excluded);
for (const result of state.switchResults.values()) {
for (const id of result.excluded) {
combined.add(id);
}
}
return combined;
} }
/** /**
@@ -56,13 +88,14 @@ export function manifestSteps(flow: Flow, launchCtx: StepContext): Step[] {
* Faithful to `conductor/flow.ts:27-36`. Pure. * Faithful to `conductor/flow.ts:27-36`. Pure.
*/ */
export function readySteps(flow: Flow, state: SchedulerState): Step[] { export function readySteps(flow: Flow, state: SchedulerState): Step[] {
const effectiveExcluded = getEffectiveExcluded(state);
return flow.steps.filter( return flow.steps.filter(
(s) => (s) =>
!state.done.has(s.id) && !state.done.has(s.id) &&
!state.skipped.has(s.id) && !state.skipped.has(s.id) &&
!state.inFlight.has(s.id) && !state.inFlight.has(s.id) &&
!state.excluded.has(s.id) && !effectiveExcluded.has(s.id) &&
((s.deps ?? []).length === 0 || evaluateTriggerRule(s.deps ?? [], state.done, state.skipped, state.excluded, s.trigger_rule)), ((s.deps ?? []).length === 0 || evaluateTriggerRule(s.deps ?? [], state.done, state.skipped, effectiveExcluded, s.trigger_rule)),
); );
} }
@@ -102,6 +135,57 @@ export function isStuck(flow: Flow, state: SchedulerState): boolean {
); );
} }
// ─── Batch parallelism (v2.8.22) ─────────────────────────────────────────────
/**
* Build the batchState Map from the flow definition and the current inFlight set.
* Only steps with a `batch` field are tracked. Empty map when `flow.batchConfig`
* is absent or no steps belong to a batch. Pure — no IO.
*/
export function buildBatchState(
flow: Flow,
inFlight: ReadonlySet<string>,
): Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }> {
const result = new Map<string, { running: Set<string>; maxConcurrent: number; joinRule: TriggerRule }>();
if (!flow.batchConfig) return result;
// Collect every unique batch group referenced by the flow's steps.
const groups = new Set<string>();
for (const s of flow.steps) {
if (s.batch) groups.add(s.batch);
}
const { maxConcurrent, joinRule } = flow.batchConfig;
for (const batch of groups) {
const running = new Set<string>(
flow.steps.filter((s) => s.batch === batch && inFlight.has(s.id)).map((s) => s.id),
);
result.set(batch, { running, maxConcurrent, joinRule: joinRule ?? 'all_success' });
}
return result;
}
/**
* Gate a ready step list by batch parallelism limits. Steps without a `batch`
* field always pass through. Steps belonging to a batch are only included if
* that batch's currently-running count is below its `maxConcurrent` cap.
*
* This is ADDITIVE to the existing wave scheduler: pure dep-based readiness
* is computed first (readySteps), then this function applies the batch ceiling.
* Steps excluded here remain pending and will be picked up on the next tick
* when a running batch step completes.
*/
export function getReadyInBatch(ready: readonly Step[], state: SchedulerState, _flow: Flow): Step[] {
const batchState = state.batchState;
if (!batchState || batchState.size === 0) return [...ready];
return ready.filter((s) => {
if (!s.batch) return true;
const bs = batchState.get(s.batch);
if (!bs) return true;
return bs.running.size < bs.maxConcurrent;
});
}
// ─── Resume reconciliation (D-9) ───────────────────────────────────────────── // ─── Resume reconciliation (D-9) ─────────────────────────────────────────────
/** /**
@@ -118,12 +202,29 @@ export function isStuck(flow: Flow, state: SchedulerState): boolean {
* - 'mark-cancelled': task was cancelled before the callback ran; propagate so * - 'mark-cancelled': task was cancelled before the callback ran; propagate so
* advance() cancels the run. * advance() cancels the run.
*/ */
/**
* True when the step definition allows retries on timeout.
* Pure — no IO.
*/
export function isRetriable(step: { maxRetries?: number }): boolean {
return (step.maxRetries ?? 0) > 0;
}
/**
* True when the step has retries remaining.
* Pure — no IO.
*/
export function shouldRetry(maxRetries: number | undefined | null, retryCount: number): boolean {
return retryCount < (maxRetries ?? 0);
}
export type ResumeAction = export type ResumeAction =
| 'keep' | 'keep'
| 're-dispatch' | 're-dispatch'
| 'mark-done' | 'mark-done'
| 'mark-failed' | 'mark-failed'
| 'mark-cancelled'; | 'mark-cancelled'
| 'retry';
/** /**
* Decide what to do with ONE flow step during startup resume (D-9). Pure. * Decide what to do with ONE flow step during startup resume (D-9). Pure.
@@ -131,12 +232,20 @@ export type ResumeAction =
* @param status - flow_steps.status * @param status - flow_steps.status
* @param taskId - flow_steps.task_id (null for code steps or unstarted agent steps) * @param taskId - flow_steps.task_id (null for code steps or unstarted agent steps)
* @param taskState - tasks.state for taskId, or null if the task row is absent * @param taskState - tasks.state for taskId, or null if the task row is absent
* @param retryCount - flow_steps.retry_count (default 0)
* @param maxRetries - flow_steps.max_retries (null = no retry)
*/ */
export function reconcileResumeStep( export function reconcileResumeStep(
status: string, status: string,
taskId: string | null, taskId: string | null,
taskState: string | null, taskState: string | null,
retryCount?: number,
maxRetries?: number | null,
): ResumeAction { ): ResumeAction {
if (status === 'timed_out') {
if (shouldRetry(maxRetries, retryCount ?? 0)) return 'retry';
return 'mark-failed';
}
if (status !== 'running') return 'keep'; if (status !== 'running') return 'keep';
// Running step: decide by its task's current state. // Running step: decide by its task's current state.
if (!taskId || taskState === null) return 're-dispatch'; // task gone or never created if (!taskId || taskState === null) return 're-dispatch'; // task gone or never created
@@ -167,6 +276,60 @@ export function shouldFailOnMissingAgent(agent: string, modeId: string | null):
return agent === 'qwen' && modeId === 'plan'; return agent === 'qwen' && modeId === 'plan';
} }
/**
* Evaluate a SWITCH step: iterate cases in declaration order and return the
* label of the first matching case plus every step id that belongs to a
* non-selected branch. When no case matches, the defaultBranch (if present)
* is the effective choice and only the explicit case branches are excluded.
* If there is no default, all branch steps are excluded. In both no-match
* situations the switch returns `chosenCase: null`.
*
* Pure — no IO. The caller adds the returned `excluded` ids to the scheduler
* state's switchResults so downstream decision functions see them as excluded.
*/
export function resolveSwitch(
step: Step,
ctx: StepContext,
): { chosenCase: string | null; excluded: string[] } {
const cases = step.cases;
if (!cases || cases.length === 0) {
// Degenerate switch — nothing to evaluate.
return { chosenCase: null, excluded: [] };
}
// Evaluate conditions in order.
for (const c of cases) {
if (c.condition(ctx)) {
// This case matches — exclude all OTHER branches.
const excluded: string[] = [];
for (const other of cases) {
if (other.label !== c.label) {
excluded.push(...other.stepIds);
}
}
// The default branch is also excluded when a case matched.
if (step.defaultBranch) excluded.push(...step.defaultBranch);
return { chosenCase: c.label, excluded };
}
}
// No case matched — use default branch if present.
if (step.defaultBranch) {
// Default is the chosen branch: exclude all explicit case branches.
const excluded: string[] = [];
for (const c of cases) {
excluded.push(...c.stepIds);
}
return { chosenCase: null, excluded };
}
// No case matched and no default — exclude everything.
const excluded: string[] = [];
for (const c of cases) {
excluded.push(...c.stepIds);
}
return { chosenCase: null, excluded };
}
/** /**
* Evaluate a trigger rule against dependency results. * Evaluate a trigger rule against dependency results.
* - all_success: every dep must be done (not skipped/failed) * - all_success: every dep must be done (not skipped/failed)
@@ -198,7 +361,7 @@ export function evaluateTriggerRule(
* decision per step. Pure — no IO. * decision per step. Pure — no IO.
*/ */
export function reconcileRun( export function reconcileRun(
steps: ReadonlyArray<{ stepId: string; taskId: string | null; status: string }>, steps: ReadonlyArray<{ stepId: string; taskId: string | null; status: string; retryCount?: number; maxRetries?: number | null }>,
taskStates: ReadonlyMap<string, string>, taskStates: ReadonlyMap<string, string>,
): StepResumeDecision[] { ): StepResumeDecision[] {
return steps.map((step) => ({ return steps.map((step) => ({
@@ -207,6 +370,8 @@ export function reconcileRun(
step.status, step.status,
step.taskId, step.taskId,
step.taskId ? (taskStates.get(step.taskId) ?? null) : null, step.taskId ? (taskStates.get(step.taskId) ?? null) : null,
step.retryCount,
step.maxRetries,
), ),
})); }));
} }
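
The SWITCH routing and the excluded-set union described in this file can be sketched together. This is a minimal sketch with hypothetical shapes: `Ctx`, `SwitchCase`, `SwitchStep`, `pickBranch`, `effectiveExcluded`, and the step ids are illustrative, and the real `Step` and `StepContext` types live in `conductor/types.js`.

```typescript
// Hypothetical shapes for illustration only.
interface Ctx { results: Record<string, string> }
interface SwitchCase { label: string; stepIds: string[]; condition: (ctx: Ctx) => boolean }
interface SwitchStep { id: string; cases: SwitchCase[]; defaultBranch?: string[] }

function pickBranch(step: SwitchStep, ctx: Ctx): { chosenCase: string | null; excluded: string[] } {
  // First matching case in declaration order wins.
  let winner: SwitchCase | null = null;
  for (const c of step.cases) {
    if (c.condition(ctx)) {
      winner = c;
      break;
    }
  }
  // Every explicit branch except the winner's is excluded (compared by label).
  const excluded: string[] = [];
  for (const c of step.cases) {
    if (!winner || c.label !== winner.label) excluded.push(...c.stepIds);
  }
  // The default branch runs only when no explicit case matched.
  if (winner && step.defaultBranch) excluded.push(...step.defaultBranch);
  return { chosenCase: winner ? winner.label : null, excluded };
}

// Union of launch-time exclusions and every switch's excluded step ids.
function effectiveExcluded(
  staticExcluded: ReadonlySet<string>,
  switchExcluded: Iterable<readonly string[]>,
): Set<string> {
  const combined = new Set<string>(staticExcluded);
  for (const ids of switchExcluded) {
    for (const id of ids) combined.add(id);
  }
  return combined;
}

const route: SwitchStep = {
  id: 'route',
  cases: [
    { label: 'bugfix', stepIds: ['repro', 'patch'], condition: (c) => c.results['triage'] === 'bug' },
    { label: 'feature', stepIds: ['design'], condition: (c) => c.results['triage'] === 'feature' },
  ],
  defaultBranch: ['ask-human'],
};

const matched = pickBranch(route, { results: { triage: 'feature' } });
const fallback = pickBranch(route, { results: { triage: 'unknown' } });
const combined = effectiveExcluded(new Set(['lint']), [matched.excluded]);
```

In the matched case the default branch lands in `excluded` alongside the losing case; in the fallback case only the explicit branches are excluded and `chosenCase` is `null`.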


@@ -40,11 +40,14 @@ import { getFlow } from '../conductor/flows/index.js';
import { loadPersona } from '../conductor/persona-loader.js'; import { loadPersona } from '../conductor/persona-loader.js';
import type { Band, DispatchFn, Flow, FlowInput, Step, StepContext } from '../conductor/types.js'; import type { Band, DispatchFn, Flow, FlowInput, Step, StepContext } from '../conductor/types.js';
import { import {
buildBatchState,
getReadyInBatch,
isRunComplete, isRunComplete,
manifestSteps, manifestSteps,
partitionReady, partitionReady,
readySteps, readySteps,
reconcileRun, reconcileRun,
resolveSwitch,
type SchedulerState, type SchedulerState,
type StepResumeDecision, type StepResumeDecision,
} from './flow-runner-decisions.js'; } from './flow-runner-decisions.js';
@@ -95,11 +98,14 @@ interface Deps {
interface FlowStepRow { interface FlowStepRow {
step_id: string; step_id: string;
kind: 'agent' | 'code'; kind: 'agent' | 'code' | 'switch';
agent: string | null; agent: string | null;
status: string; status: string;
chat_id: string | null; chat_id: string | null;
output: string | null; output: string | null;
updated_at: string | null;
retry_count: number | null;
max_retries: number | null;
} }
export function createFlowRunner(deps: Deps): FlowRunner { export function createFlowRunner(deps: Deps): FlowRunner {
@@ -263,7 +269,8 @@ export function createFlowRunner(deps: Deps): FlowRunner {
const dispatch: DispatchFn = (agent, task) => dispatchSubAgent(run.project_id, model, agent, task); const dispatch: DispatchFn = (agent, task) => dispatchSubAgent(run.project_id, model, agent, task);
const rows = await sql<FlowStepRow[]>` const rows = await sql<FlowStepRow[]>`
SELECT step_id, kind, agent, status, chat_id, output FROM flow_steps WHERE run_id = ${runId} SELECT step_id, kind, agent, status, chat_id, output, updated_at, retry_count, max_retries
FROM flow_steps WHERE run_id = ${runId}
`; `;
// Re-derive the excluded set (band/when pre-skips) from the flow def + input — // Re-derive the excluded set (band/when pre-skips) from the flow def + input —
@@ -275,6 +282,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
const done = new Set<string>(); const done = new Set<string>();
const skipped = new Set<string>(); const skipped = new Set<string>();
const inFlight = new Set<string>(); const inFlight = new Set<string>();
const timedOut = new Set<string>();
/** Per-switch routing results — maps switch step id → resolved branch details */
const switchExcluded = new Map<string, { chosenCase: string | null; excluded: Set<string> }>();
const results: Record<string, string> = {}; const results: Record<string, string> = {};
for (const r of rows) { for (const r of rows) {
switch (r.status) { switch (r.status) {
@@ -288,6 +298,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
case 'running': case 'running':
inFlight.add(r.step_id); inFlight.add(r.step_id);
break; break;
case 'timed_out':
timedOut.add(r.step_id);
break;
case 'failed': case 'failed':
// A failed worker makes the deterministic report untrustworthy — fail the // A failed worker makes the deterministic report untrustworthy — fail the
// whole run (matches the Phase-1 CLI, which throws on a dispatch failure). // whole run (matches the Phase-1 CLI, which throws on a dispatch failure).
@@ -300,17 +313,79 @@ export function createFlowRunner(deps: Deps): FlowRunner {
} }
} }
// ─── Timeout detection ───────────────────────────────────────────────────────
// Check running steps. If a step has been 'running' longer than
// FLOW_STEP_TIMEOUT_MS, mark it timed_out or re-dispatch if retriable.
// Build a context here so the timeout retry path can re-dispatch the step.
const timeoutCtx = buildCtx(input, results, model, dispatch);
const timeoutMs = config.FLOW_STEP_TIMEOUT_MS;
const nowDate = new Date();
let detectedTimedOut = false;
for (const r of rows) {
if (r.status !== 'running') continue;
if (!r.updated_at) continue;
const elapsed = nowDate.getTime() - new Date(r.updated_at).getTime();
if (elapsed <= timeoutMs) continue;
// Step has exceeded the timeout
detectedTimedOut = true;
const retryCount = r.retry_count ?? 0;
const maxRetries = r.max_retries ?? 0;
if (maxRetries > 0 && retryCount < maxRetries) {
// Retriable: re-dispatch the step with an incremented retry_count
const step = flow.steps.find((s) => s.id === r.step_id);
if (!step || step.kind !== 'agent') {
// Non-agent steps can't be retried via dispatch
inFlight.delete(r.step_id);
await failRun(runId, flow, input, model,
`step '${r.step_id}' timed out (non-retriable kind)`, r.step_id);
return;
}
inFlight.delete(r.step_id);
await sql`
UPDATE flow_steps
SET retry_count = ${retryCount + 1}, updated_at = clock_timestamp()
WHERE run_id = ${runId} AND step_id = ${r.step_id} AND status = 'running'
`;
await dispatchAgentStep(runId, run.project_id, model, step, timeoutCtx);
inFlight.add(r.step_id);
log.warn({ runId, stepId: r.step_id, retry: retryCount + 1, maxRetries },
'flow-runner: step timed out, retrying');
} else {
// Not retriable — mark as timed_out, fail the run
inFlight.delete(r.step_id);
await sql`
UPDATE flow_steps SET status = 'timed_out', updated_at = clock_timestamp()
WHERE run_id = ${runId} AND step_id = ${r.step_id} AND status = 'running'
`;
timedOut.add(r.step_id);
publishStep(runId, r.step_id, 'timed_out');
await failRun(runId, flow, input, model,
`step '${r.step_id}' timed out`, r.step_id);
return;
}
}
// Steps that timed out were already reconciled in memory above (inFlight and
// timedOut were mutated directly), so no re-query of flow_steps is needed.
if (detectedTimedOut) {
// Intentionally empty.
}
// Drain ready skips + code steps (synchronous), re-evaluating after each batch, // Drain ready skips + code steps (synchronous), re-evaluating after each batch,
// then dispatch the full ready agent wave and wait for their terminal callbacks. // then dispatch the full ready agent wave and wait for their terminal callbacks.
for (;;) { for (;;) {
const state: SchedulerState = { done, skipped, inFlight, excluded }; // Build per-batch state from the current inFlight set for batch parallelism gating.
const batchState = buildBatchState(flow, inFlight);
const state: SchedulerState = { done, skipped, inFlight, excluded, timedOut, batchState, switchResults: switchExcluded };
if (isRunComplete(flow, state)) { if (isRunComplete(flow, state)) {
await finishRun(runId, flow, input, results, model, dispatch); await finishRun(runId, flow, input, results, model, dispatch);
return; return;
} }
const ready = readySteps(flow, state); const ready = getReadyInBatch(readySteps(flow, state), state, flow);
if (ready.length === 0) { if (ready.length === 0) {
if (inFlight.size > 0) return; // agents in flight will re-enter via the hook if (inFlight.size > 0) return; // agents in flight will re-enter via the hook
await failRun(runId, flow, input, model, 'unsatisfiable dependencies / cycle'); await failRun(runId, flow, input, model, 'unsatisfiable dependencies / cycle');
@@ -329,6 +404,31 @@ export function createFlowRunner(deps: Deps): FlowRunner {
continue; // re-evaluate — a skip can settle a fan-in step's deps continue; // re-evaluate — a skip can settle a fan-in step's deps
} }
// SWITCH steps run synchronously — evaluate conditions, update the excluded
// set in SchedulerState, and mark themselves complete. Non-selected branch
// step ids are excluded from ever running.
const switchReady = toRun.filter((s) => s.kind === 'switch');
if (switchReady.length > 0) {
for (const s of switchReady) {
let result: { chosenCase: string | null; excluded: string[] };
try {
result = resolveSwitch(s, buildCtx(input, results, model, dispatch));
} catch (err) {
await failRun(runId, flow, input, model, `switch step '${s.id}' threw: ${errMsg(err)}`, s.id);
return;
}
switchExcluded.set(s.id, {
chosenCase: result.chosenCase,
excluded: new Set(result.excluded),
});
const outputText = result.chosenCase ? `branch:${result.chosenCase}` : '';
await markStep(runId, s.id, 'completed', outputText);
results[s.id] = outputText;
done.add(s.id);
}
continue; // re-evaluate — excluded steps may unblock dependents
}
const codeReady = toRun.filter((s) => s.kind === 'code'); const codeReady = toRun.filter((s) => s.kind === 'code');
if (codeReady.length > 0) { if (codeReady.length > 0) {
for (const s of codeReady) { for (const s of codeReady) {
@@ -545,7 +645,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
function publishStep( function publishStep(
runId: string, runId: string,
stepId: string, stepId: string,
status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'blocked', status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'blocked' | 'timed_out',
extra?: { run_status?: 'running' | 'completed' | 'failed' | 'cancelled'; report?: string }, extra?: { run_status?: 'running' | 'completed' | 'failed' | 'cancelled'; report?: string },
): void { ): void {
publishUser({ publishUser({
@@ -683,6 +783,38 @@ export function createFlowRunner(deps: Deps): FlowRunner {
log.info({ runId, stepId: step.step_id, taskId: task!.id }, 'flow-runner: step re-dispatched on resume'); log.info({ runId, stepId: step.step_id, taskId: task!.id }, 'flow-runner: step re-dispatched on resume');
break; break;
} }
case 'retry': {
// Like re-dispatch but increments retry_count and sets status to 'running'.
if (!step.input) {
await sql`
UPDATE flow_steps
SET status = 'failed', error = 'retry: no stored prompt',
updated_at = clock_timestamp()
WHERE run_id = ${runId} AND step_id = ${step.step_id}
`;
break;
}
const chatIdR = step.chat_id;
const [chatR] = chatIdR
? await sql<{ session_id: string }[]>`SELECT session_id FROM chats WHERE id = ${chatIdR}`
: [];
const sessionIdR = chatR?.session_id ?? null;
const [taskR] = await sql<{ id: string }[]>`
INSERT INTO tasks (project_id, input, agent, model, mode_id, session_id, chat_id)
VALUES (${projectId}, ${step.input}, 'qwen', ${model}, 'plan', ${sessionIdR}, ${chatIdR})
RETURNING id
`;
await sql`
UPDATE flow_steps
SET task_id = ${taskR!.id}, retry_count = retry_count + 1, status = 'running',
updated_at = clock_timestamp()
WHERE run_id = ${runId} AND step_id = ${step.step_id}
`;
log.info({ runId, stepId: step.step_id, taskId: taskR!.id },
'flow-runner: step retried on resume');
break;
}
} }
} }
@@ -697,7 +829,9 @@ export function createFlowRunner(deps: Deps): FlowRunner {
status: string; status: string;
chat_id: string | null; chat_id: string | null;
input: string | null; input: string | null;
}[]>`SELECT step_id, task_id, status, chat_id, input FROM flow_steps WHERE run_id = ${run.id}`; retry_count: number | null;
max_retries: number | null;
}[]>`SELECT step_id, task_id, status, chat_id, input, retry_count, max_retries FROM flow_steps WHERE run_id = ${run.id}`;
// Load task states for all referenced tasks in one query. // Load task states for all referenced tasks in one query.
const taskIds = rows.map((r) => r.task_id).filter((id): id is string => id !== null); const taskIds = rows.map((r) => r.task_id).filter((id): id is string => id !== null);
@@ -710,7 +844,13 @@ export function createFlowRunner(deps: Deps): FlowRunner {
} }
const decisions = reconcileRun( const decisions = reconcileRun(
rows.map((r) => ({ stepId: r.step_id, taskId: r.task_id, status: r.status })), rows.map((r) => ({
stepId: r.step_id,
taskId: r.task_id,
status: r.status,
retryCount: r.retry_count ?? undefined,
maxRetries: r.max_retries,
})),
taskStates, taskStates,
); );
@@ -752,13 +892,13 @@ export function createFlowRunner(deps: Deps): FlowRunner {
// Mark all non-terminal steps cancelled and collect in-flight task_ids. // Mark all non-terminal steps cancelled and collect in-flight task_ids.
const steps = await sql<{ step_id: string; task_id: string | null; kind: string }[]>` const steps = await sql<{ step_id: string; task_id: string | null; kind: string }[]>`
SELECT step_id, task_id, kind FROM flow_steps SELECT step_id, task_id, kind FROM flow_steps
WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped') WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped', 'timed_out')
`; `;
if (steps.length > 0) { if (steps.length > 0) {
await sql` await sql`
UPDATE flow_steps SET status = 'cancelled', updated_at = clock_timestamp() UPDATE flow_steps SET status = 'cancelled', updated_at = clock_timestamp()
WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped') WHERE run_id = ${runId} AND status NOT IN ('completed', 'failed', 'cancelled', 'skipped', 'timed_out')
`; `;
for (const s of steps) { for (const s of steps) {
if (s.kind === 'agent') publishStep(runId, s.step_id, 'cancelled', { run_status: 'cancelled' }); if (s.kind === 'agent') publishStep(runId, s.step_id, 'cancelled', { run_status: 'cancelled' });
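
The timeout branch in the runner above mixes the decision with SQL and dispatch. As a sketch only, the decision can be pulled out as a pure function. `RunningStep`, `TimeoutAction`, and `classifyTimeout` are hypothetical names, and both failure paths from the diff (non-agent kind, retries exhausted) are collapsed into a single `'fail-run'` result here.

```typescript
// Hypothetical row shape and action names; the diff performs these checks inline.
interface RunningStep {
  stepId: string;
  kind: 'agent' | 'code' | 'switch';
  updatedAtMs: number | null;
  retryCount: number | null;
  maxRetries: number | null;
}

type TimeoutAction = 'wait' | 'retry' | 'fail-run';

function classifyTimeout(step: RunningStep, nowMs: number, timeoutMs: number): TimeoutAction {
  // Rows without a timestamp are skipped, as in the runner's loop.
  if (step.updatedAtMs === null) return 'wait';
  // A step has timed out only once elapsed time strictly exceeds the limit.
  if (nowMs - step.updatedAtMs <= timeoutMs) return 'wait';
  const retryCount = step.retryCount ?? 0;
  const maxRetries = step.maxRetries ?? 0;
  // Only agent steps can be re-dispatched; everything else fails the run.
  if (retryCount < maxRetries && step.kind === 'agent') return 'retry';
  return 'fail-run';
}

const base: RunningStep = { stepId: 'analyze', kind: 'agent', updatedAtMs: 0, retryCount: 0, maxRetries: 2 };
const fresh = classifyTimeout(base, 60_000, 60_000);
const expired = classifyTimeout(base, 60_001, 60_000);
const exhausted = classifyTimeout({ ...base, retryCount: 2 }, 60_001, 60_000);
const codeStep = classifyTimeout({ ...base, kind: 'code' }, 60_001, 60_000);
```

Keeping the decision pure would match the style of `flow-runner-decisions.ts`, where the caller performs the IO after the decision is made.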


@@ -0,0 +1,341 @@
/**
* v2.10 — PaseoClient: thin CLI-based client for the Paseo daemon.
*
* Paseo is a multi-agent hub daemon running at a configurable address
* (default Unix socket / localhost:6767). This client wraps the `paseo` CLI
* via child_process spawn for all operations (the daemon does not expose a
* separate REST API for write operations). Read operations (listAgents,
* getAgentStatus) use `paseo ls --json` / `paseo inspect --json`; write
* operations (import, archive, send) use the corresponding subcommands.
*
* Spec: openspec/changes/v2-10-paseo-integration/design.md.
*/
import { spawn } from 'node:child_process';
import { once } from 'node:events';
import { createInterface } from 'node:readline';
// ─── Types ───────────────────────────────────────────────────────────────────
/** Listing entry from `paseo ls --json`. Fields are lowercase. */
export interface PaseoAgentListItem {
id: string;
shortId: string;
name: string;
provider: string;
status: string;
cwd?: string;
created?: string;
thinking?: string;
}
/** Detailed agent info from `paseo inspect --json`. Fields are PascalCase. */
export interface PaseoAgentDetail {
Id: string;
Name: string;
Provider: string;
Model?: string;
Status: string;
Thinking?: string;
Archived: boolean;
ArchivedAt?: string | null;
Cwd?: string;
CreatedAt: string;
UpdatedAt: string;
Mode?: string;
AvailableModes?: Array<{ id: string; label: string }>;
Capabilities?: {
Streaming?: boolean;
Persistence?: boolean;
DynamicModes?: boolean;
McpServers?: boolean;
};
Labels?: Record<string, string>;
Worktree?: string | null;
ParentAgentId?: string | null;
}
/** Result of `paseo send --json`. */
export interface PaseoSendResult {
/** The agent's textual response. */
text?: string;
/** Structured output if the agent produced any. */
output?: unknown;
/** Error message if the turn failed. */
error?: string;
/** True if the turn completed successfully. */
ok?: boolean;
}
export interface PaseoClientConfig {
/** Path to the paseo binary. Default: auto-resolved from PATH. */
paseoBin: string;
/**
* Explicit `--host <host>` value for CLI calls.
* Format: `host:port` or `tcp://host:port?ssl=true&password=secret`.
* Omit to use the CLI default (Unix socket, fallback localhost:6767).
*/
cliHost?: string;
}
const DEFAULT_PASEO_BIN = 'paseo';
// ─── Client ──────────────────────────────────────────────────────────────────
export class PaseoClientError extends Error {
constructor(
message: string,
public readonly command: string,
public readonly exitCode: number | null,
public readonly stderr: string,
) {
super(message);
this.name = 'PaseoClientError';
}
}
export class PaseoClient {
/** @internal visible for testing */
readonly bin: string;
private readonly hostArgs: string[];
constructor(config?: Partial<PaseoClientConfig>) {
this.bin = config?.paseoBin ?? DEFAULT_PASEO_BIN;
this.hostArgs = config?.cliHost ? ['--host', config.cliHost] : [];
}
// ─── Read operations (CLI `ls --json`, `inspect --json`) ──────────────────
/** List all non-archived agents. */
async listAgents(): Promise<PaseoAgentListItem[]> {
const raw = await this.runJson(['ls', '--json', ...this.hostArgs]);
return raw as PaseoAgentListItem[];
}
/** Get detailed status for a single agent by ID or prefix. */
async getAgentStatus(agentId: string): Promise<PaseoAgentDetail> {
const raw = await this.runJson(['inspect', '--json', agentId, ...this.hostArgs]);
return raw as PaseoAgentDetail;
}
/**
* Quick liveness check — runs `paseo ls --json --limit 1` and returns success.
* The daemon is healthy if the CLI exits 0.
*/
async health(): Promise<{ status: string }> {
try {
await this.runCli(['ls', '--json', '--limit', '1', ...this.hostArgs]);
return { status: 'ok' };
} catch {
return { status: 'error' };
}
}
// ─── Write operations (CLI subcommands) ───────────────────────────────────
/**
* Import a provider session as a Paseo agent.
* Runs `paseo import --json [--host <host>] --provider <provider> [--label k=v ...] <sessionId>`.
*/
async importAgent(
sessionId: string,
provider: string,
labels?: Record<string, string>,
): Promise<PaseoAgentDetail> {
const args: string[] = ['import', '--json', ...this.hostArgs];
if (provider) {
args.push('--provider', provider);
}
if (labels) {
for (const [k, v] of Object.entries(labels)) {
args.push('--label', `${k}=${v}`);
}
}
args.push(sessionId);
const raw = await this.runJson(args);
return raw as PaseoAgentDetail;
}
/** Archive (soft-delete) a Paseo agent by ID or prefix. */
async archiveAgent(agentId: string): Promise<void> {
await this.runCli(['archive', '--json', ...this.hostArgs, agentId]);
}
/**
* Send a prompt to an existing agent.
*
* By default waits for the agent to complete the turn and returns the
* structured result. Pass `noWait: true` to fire-and-forget. The `onEvent`
* option is accepted but not invoked by this method; use `streamSend()` to
* receive text as it arrives.
*/
async sendPrompt(
agentId: string,
prompt: string,
options?: {
noWait?: boolean;
onEvent?: (event: { type: 'text' | 'reasoning'; text: string }) => void;
signal?: AbortSignal;
},
): Promise<PaseoSendResult> {
const args: string[] = ['send', '--json', ...this.hostArgs];
if (options?.noWait) {
args.push('--no-wait');
}
args.push(agentId, prompt);
// With --json and no --no-wait, the output is JSON after completion.
// For real-time text, use streamSend(), which omits --json and reads stdout.
const raw = await this.runCli(args, options?.signal);
try {
return JSON.parse(raw) as PaseoSendResult;
} catch {
return { text: raw, ok: true };
}
}
/**
* Stream-send: runs `paseo send` WITHOUT `--json` and forwards each stdout
* line to onEvent as a 'text' event in real time. Use when the caller wants to
* stream agent output as it arrives rather than wait for the full JSON result.
*/
async streamSend(
agentId: string,
prompt: string,
onEvent: (event: { type: 'text' | 'reasoning'; text: string }) => void,
signal?: AbortSignal,
): Promise<PaseoSendResult> {
return new Promise<PaseoSendResult>((resolve, reject) => {
const args = ['send', ...this.hostArgs, agentId, prompt];
const child = spawn(this.bin, args, {
stdio: ['ignore', 'pipe', 'pipe'],
signal,
});
let stdout = '';
let stderr = '';
if (child.stdout) {
const rl = createInterface({ input: child.stdout });
rl.on('line', (line: string) => {
stdout += line + '\n';
// Forward as text event for real-time display
onEvent({ type: 'text', text: line + '\n' });
});
}
if (child.stderr) {
child.stderr.on('data', (chunk: Buffer) => {
stderr += chunk.toString();
});
}
once(child, 'close').then((raw) => {
const exitCode = (raw[0] as number | null) ?? 0;
if (exitCode !== 0) {
reject(
new PaseoClientError(
`paseo send failed (exit ${exitCode}): ${stderr.trim()}`,
'send',
exitCode,
stderr,
),
);
return;
}
resolve({ text: stdout, ok: true });
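})
// events.once() also rejects if the child emits 'error' before 'close';
// forward that rejection instead of leaving it unhandled.
.catch(reject);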
child.on('error', reject);
});
}
/** Interrupt/stop a running agent. */
async stopAgent(agentId: string): Promise<void> {
await this.runCli(['stop', ...this.hostArgs, agentId]);
}
// ─── Private helpers ───────────────────────────────────────────────────────
/**
* Run a CLI command and return stdout as a string.
* Throws PaseoClientError on non-zero exit.
*/
private async runCli(
args: string[],
signal?: AbortSignal,
): Promise<string> {
return new Promise<string>((resolve, reject) => {
const child = spawn(this.bin, args, {
stdio: ['ignore', 'pipe', 'pipe'],
signal,
});
let stdout = '';
let stderr = '';
if (child.stdout) {
child.stdout.on('data', (chunk: Buffer) => {
stdout += chunk.toString();
});
}
if (child.stderr) {
child.stderr.on('data', (chunk: Buffer) => {
stderr += chunk.toString();
});
}
child.on('error', (err: Error) => {
// If signal aborted, treat as cancellation not error
if (signal?.aborted) {
resolve('');
return;
}
reject(err);
});
once(child, 'close').then((raw) => {
const exitCode = (raw[0] as number | null) ?? 0;
if (signal?.aborted) {
resolve('');
return;
}
if (exitCode !== 0) {
const msg = stderr.trim() || `exit code ${exitCode}`;
reject(
new PaseoClientError(
`paseo ${args[0] ?? '?'} failed: ${msg}`,
args[0] ?? '?',
exitCode,
stderr,
),
);
return;
}
resolve(stdout);
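})
// events.once() also rejects if the child emits 'error' before 'close'.
// The 'error' handler above has already settled the promise by then, so
// this only prevents an unhandled rejection.
.catch(reject);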
});
}
/**
* Run a CLI command and parse stdout as JSON.
* Throws PaseoClientError on non-zero exit or parse failure.
*/
private async runJson(args: string[]): Promise<unknown> {
const stdout = await this.runCli(args);
try {
return JSON.parse(stdout);
} catch (err) {
throw new PaseoClientError(
`paseo ${args[0] ?? '?'} returned invalid JSON: ${(stdout || '<empty>').slice(0, 200)}`,
args[0] ?? '?',
0,
stdout,
);
}
}
}
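
The argv assembly in `importAgent` can be pulled into a pure function so the flag ordering is checkable without spawning the `paseo` binary. The sketch below is only an illustration: `buildImportArgs` is a hypothetical helper that does not exist in the file above, and the provider and label values in the usage note are made up. The flags themselves are the ones the method already passes.

```typescript
// Hypothetical helper (not part of paseo-client.ts). It mirrors the order in
// which importAgent() builds its argument list: subcommand, --json, optional
// --host, optional --provider, zero or more --label pairs, then the session ID.
function buildImportArgs(
  sessionId: string,
  provider: string,
  labels: Record<string, string> = {},
  cliHost?: string,
): string[] {
  const hostArgs = cliHost ? ['--host', cliHost] : [];
  const args = ['import', '--json', ...hostArgs];
  if (provider) {
    args.push('--provider', provider);
  }
  for (const [key, value] of Object.entries(labels)) {
    args.push('--label', `${key}=${value}`);
  }
  // The positional session ID goes last, after every flag.
  args.push(sessionId);
  return args;
}
```

Calling `buildImportArgs('sess-1', 'claude', { team: 'core' })` yields an array that can be handed straight to `spawn(bin, args)`. Because the values are passed as separate array elements rather than a shell string, label values containing spaces need no quoting.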


@@ -4,6 +4,8 @@ import { randomBytes } from 'node:crypto';
import type { Sql } from '../db.js';
import { resolveWritePath } from './write_guard.js';
import { locateMatch } from './fuzzy-match.js';
import { conflictIndex } from './conflict-index.js';
import { findConflicts } from './collision-detector.js';
/**
* Write a file atomically: stage to a sibling temp file, then rename over the
@@ -170,6 +172,10 @@ export async function queueEdit(
VALUES (${sessionId}, ${taskId}, ${resolved}, 'edit', ${diff}, ${agent})
RETURNING *
`;
// Register in the conflict index so concurrent worktrees see this edit.
conflictIndex.registerChange(resolved, sessionId, agent ?? 'unknown');
return row!;
}
@@ -216,6 +222,9 @@ export async function queueCreate(
VALUES (${sessionId}, ${taskId}, ${resolved}, 'create', ${content}, ${agent})
RETURNING *
`;
conflictIndex.registerChange(resolved, sessionId, agent ?? 'unknown');
return row!;
}
@@ -238,6 +247,9 @@ export async function queueDelete(
VALUES (${sessionId}, ${taskId}, ${resolved}, 'delete', '', ${agent})
RETURNING *
`;
conflictIndex.registerChange(resolved, sessionId, agent ?? 'unknown');
return row!;
}
@@ -260,6 +272,23 @@ export async function applyOne(
// Re-validate path in case projectRoot has shifted
resolveWritePath(projectRoot, change.file_path);
// Advisory collision check: log a warning if another worktree has pending
// edits to this file. Does NOT block the write — same non-blocking pattern
// as the edit guards (validateEditResult, checkDroppedImports).
{
const conflicts = conflictIndex.query(
[change.file_path],
change.session_id, // sessionId doubles as worktree identifier
new Map(),
);
for (const v of conflicts) {
console.warn(
`[collision] ${v.filePath} — conflict with worktrees [${v.worktrees.join(', ')}] ` +
`agents [${v.agents.join(', ')}] severity=${v.severity}`,
);
}
}
switch (change.operation) {
case 'create': {
await mkdir(dirname(change.file_path), { recursive: true });


@@ -18,11 +18,14 @@ import { registerCoderProxy } from './routes/coder-proxy.js';
import { registerModelRoutes } from './routes/models.js';
import { registerAgentRoutes } from './routes/agents.js';
import { registerSkillsRoutes } from './routes/skills.js';
import { registerTraceRoutes } from './routes/traces.js';
import { registerToolsRoutes } from './routes/tools.js';
import { registerAnalyticsRoutes } from './routes/analytics.js';
import { registerInferenceSettingsRoutes } from './routes/inference-settings.js';
-import { createInferenceRunner } from './services/inference/index.js';
+import { createInferenceRunner, runInferenceWithModel } from './services/inference/index.js';
import { createBroker } from './services/broker.js';
import { setBackgroundInferenceEnqueuer } from './services/background-task.js';
import { listSkills } from './services/skills.js';
import * as compaction from './services/compaction.js';
import { configureModelContext } from './services/model-context.js';
@@ -123,7 +126,35 @@ async function main() {
registerModelRoutes(app, config);
registerAgentRoutes(app, sql);
registerSidebarRoutes(app, sql);
-registerChatRoutes(app, sql, broker);
+registerChatRoutes(app, sql, broker, config, {
enqueueCompare: (sessionId, chatId, assistantMessageId, modelOverride, compareGroupId) => {
// Reuse the inference runner's context pattern for compare mode.
// Each compare run gets its own AbortController; cancellation keyed by
// chatId (cancels ALL parallel runs in that compare group).
const compareCtx: import('./services/inference/types.js').InferenceContext = {
sql,
config,
log: app.log,
publish: (sid, frame) => {
broker.publishFrame(sid, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
},
publishUser: (frame) => {
broker.publishUserFrame('default', frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
},
broker,
hooks: hasHooks ? hookRunner : undefined,
};
compareCtx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'streaming', at: new Date().toISOString() });
void runInferenceWithModel(compareCtx, sessionId, chatId, assistantMessageId, modelOverride, compareGroupId).catch(
(err: Error) => app.log.error({ err, chatId, modelOverride }, 'compare inference failed'),
);
},
cancelInference: async (_sessionId, chatId) => {
return inference.cancel(_sessionId, chatId);
},
hasActiveInference: (chatId) => inference.hasActive(chatId),
});
registerTraceRoutes(app, sql);
registerToolsRoutes(app, sql);
registerAnalyticsRoutes(app, sql);
registerInferenceSettingsRoutes(app);
@@ -163,6 +194,13 @@ async function main() {
broker.publishUserFrame(user, frame as unknown as import('@boocode/contracts/ws-frames').WsFrame);
}
);
// v2.x: wire the background subagent task system to the inference runner.
// Tools (spawn_subagent) dispatch fire-and-forget inference via this
// module-level reference — no import cycle through the tool registry.
setBackgroundInferenceEnqueuer((sessionId, chatId, assistantId, user) => {
inference.enqueue(sessionId, chatId, assistantId, user);
});
registerMessageRoutes(app, sql, config, broker, {
enqueueInference: (sessionId, chatId, assistantId, user) => {
inference.enqueue(sessionId, chatId, assistantId, user);


@@ -0,0 +1,38 @@
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import type { ToolTrace } from '../services/tool-traces.js';
export function registerTraceRoutes(app: FastifyInstance, sql: Sql): void {
app.get<{ Params: { id: string }; Querystring: { limit?: string; offset?: string } }>(
'/api/chats/:id/traces',
async (req, reply) => {
const chat = await sql`SELECT id FROM chats WHERE id = ${req.params.id}`;
if (chat.length === 0) {
reply.code(404);
return { error: 'chat not found' };
}
const limit = Math.min(Math.max(Number(req.query.limit) || 50, 1), 200);
const offset = Math.max(Number(req.query.offset) || 0, 0);
const rows = await sql<ToolTrace[]>`
SELECT * FROM tool_traces
WHERE chat_id = ${req.params.id}
ORDER BY started_at ASC
LIMIT ${limit}
OFFSET ${offset}
`;
const [countRow] = await sql<{ count: number }[]>`
SELECT count(*)::int AS count FROM tool_traces WHERE chat_id = ${req.params.id}
`;
return {
data: rows,
total: countRow?.count ?? 0,
limit,
offset,
};
},
);
}
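
The `limit`/`offset` clamping in the handler has a few edge cases that are easier to see in isolation. The following is a minimal sketch, assuming the same expressions as the route; `parsePagination` is a hypothetical name and is not defined in the file above.

```typescript
// Hypothetical helper: the same clamping expressions as the route handler,
// extracted so the edge cases can be exercised directly.
function parsePagination(query: { limit?: string; offset?: string }): {
  limit: number;
  offset: number;
} {
  // Number(undefined) and Number('abc') are NaN, which is falsy, so `|| 50`
  // supplies the default. Number('0') is also falsy, so limit=0 becomes 50
  // rather than being clamped up to 1.
  const limit = Math.min(Math.max(Number(query.limit) || 50, 1), 200);
  // Negative offsets are clamped to 0.
  const offset = Math.max(Number(query.offset) || 0, 0);
  // Fractional inputs such as '2.5' pass through unchanged here; a stricter
  // handler might apply Math.trunc before building the SQL.
  return { limit, offset };
}
```

With this shape, a missing or non-numeric `limit` yields 50, a negative one yields 1, and anything above 200 is capped at 200.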


@@ -3,6 +3,7 @@ import type { Sql } from '../db.js';
import type { Broker } from '../services/broker.js';
import type { Message } from '../types/api.js';
import { MESSAGE_COLUMNS } from '../services/message-columns.js';
import { loadAgentSnapshot } from '../services/session-snapshots.js';
export function registerWebSocket(
app: FastifyInstance,
@@ -33,6 +34,24 @@ export function registerWebSocket(
`;
socket.send(JSON.stringify({ type: 'snapshot', messages }));
// v2.7.x: on reconnect, restore agent snapshot state so the frontend
// knows there's an ongoing agent turn. Best-effort per chat; most
// sessions won't have any snapshots.
const chats = await sql<{ id: string }[]>`SELECT id FROM chats WHERE session_id = ${sessionId}`;
for (const chat of chats) {
const agentSnapshot = await loadAgentSnapshot(sql, chat.id).catch(() => null);
if (agentSnapshot) {
socket.send(JSON.stringify({
type: 'agent_snapshot',
chat_id: chat.id,
agent: agentSnapshot.agent,
model: agentSnapshot.model,
mode: agentSnapshot.mode,
turn_number: agentSnapshot.turn_number,
}));
}
}
const unsubscribe = broker.subscribe(sessionId, (frame) => {
if (socket.readyState !== socket.OPEN) return;
try {


@@ -32,11 +32,18 @@ CREATE TABLE IF NOT EXISTS messages (
content TEXT NOT NULL DEFAULT '',
status TEXT NOT NULL DEFAULT 'complete',
last_seq INT NOT NULL DEFAULT 0,
cache_tokens INTEGER,
reasoning_tokens INTEGER,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, created_at);
-- vDeepSeek: add cache/reasoning token columns early so messages_with_parts
-- view (defined below) can reference them. IF NOT EXISTS guards re-runs.
ALTER TABLE messages ADD COLUMN IF NOT EXISTS cache_tokens INTEGER;
ALTER TABLE messages ADD COLUMN IF NOT EXISTS reasoning_tokens INTEGER;
-- v1.13.0: granular message parts table. v1.13.20: legacy tool_calls/
-- tool_results columns dropped; message_parts is now the sole source of
-- truth for tool calls, tool results, and reasoning. ON DELETE CASCADE
@@ -204,8 +211,6 @@ ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_used INTEGER;
ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_max INTEGER;
ALTER TABLE messages ADD COLUMN IF NOT EXISTS started_at TIMESTAMPTZ;
ALTER TABLE messages ADD COLUMN IF NOT EXISTS finished_at TIMESTAMPTZ;
-ALTER TABLE messages ADD COLUMN IF NOT EXISTS cache_tokens INTEGER;
-ALTER TABLE messages ADD COLUMN IF NOT EXISTS reasoning_tokens INTEGER;
ALTER TABLE sessions ADD COLUMN IF NOT EXISTS updated_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp();
@@ -409,3 +414,55 @@ END $$;
-- Remove the v2.0.5 arena_id column (replaced by the new Arena feature).
ALTER TABLE tasks DROP COLUMN IF EXISTS arena_id;
-- v2.x-tool-traces: per-call tool execution records for observability.
CREATE TABLE IF NOT EXISTS tool_traces (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
message_id UUID REFERENCES messages(id) ON DELETE SET NULL,
turn_number INTEGER NOT NULL,
tool_name TEXT NOT NULL,
tool_input JSONB NOT NULL,
tool_output TEXT,
started_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
finished_at TIMESTAMPTZ,
latency_ms INTEGER,
tokens_used INTEGER,
cache_tokens INTEGER,
reasoning_tokens INTEGER,
error TEXT,
outcome TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS idx_tool_traces_chat ON tool_traces(chat_id, created_at);
-- v2.x-tool-traces: active tool call state for in-flight instrumentation.
CREATE TABLE IF NOT EXISTS tool_trace_states (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
message_id UUID REFERENCES messages(id) ON DELETE SET NULL,
turn_number INTEGER NOT NULL,
tool_name TEXT NOT NULL,
tool_input JSONB NOT NULL,
started_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
-- agent_snapshots: persistent agent session state for cross-refresh resume.
CREATE TABLE IF NOT EXISTS agent_snapshots (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
chat_id UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
model TEXT NOT NULL,
agent TEXT,
mode TEXT,
turn_number INTEGER NOT NULL DEFAULT 0,
messages JSONB NOT NULL DEFAULT '[]'::jsonb,
tool_states JSONB NOT NULL DEFAULT '[]'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS idx_agent_snapshots_chat ON agent_snapshots(chat_id);
CREATE UNIQUE INDEX IF NOT EXISTS idx_agent_snapshots_chat_unique ON agent_snapshots(chat_id);


@@ -0,0 +1,260 @@
// v2.x: Background subagent task service.
// Creates and tracks background tasks that run as independent inference
// sessions. The spawner creates a session+chat, inserts messages, and
// dispatches inference asynchronously. Callers poll status and retrieve
// results via the companion tools (background-subagent-tools.ts).
//
// Module-level inference enqueuer: set at server startup so tools can
// dispatch background inference without importing the runner directly.
import type { Sql } from '../db.js';
import type { FastifyBaseLogger } from 'fastify';
export interface BackgroundTask {
id: string;
session_id: string;
chat_id: string;
agent: string | null;
model: string;
input: string;
status: 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';
output_summary: string | null;
created_at: string;
finished_at: string | null;
}
// Module-level reference to the inference enqueuer, set at server startup.
let _enqueueInference:
| ((sessionId: string, chatId: string, assistantMessageId: string, user: string) => void)
| null = null;
export function setBackgroundInferenceEnqueuer(
enqueue: (
sessionId: string,
chatId: string,
assistantMessageId: string,
user: string,
) => void,
): void {
_enqueueInference = enqueue;
}
function mapTaskState(state: string): BackgroundTask['status'] {
switch (state) {
case 'pending':
return 'pending';
case 'running':
return 'running';
case 'completed':
return 'completed';
case 'failed':
return 'failed';
case 'blocked':
return 'pending'; // blocked is internal — surface as pending
case 'cancelled':
return 'cancelled';
default:
return 'pending';
}
}
// Spawn a background subagent task: create session + chat + messages + tasks
// row, then fire-and-forget the inference. Returns immediately with the task
// metadata — inference runs asynchronously.
export async function spawnBackgroundTask(
sql: Sql,
log: FastifyBaseLogger,
projectId: string,
input: string,
model: string,
agent?: string,
label?: string,
): Promise<BackgroundTask> {
const sessionName =
label != null && label.length > 0
? `Subagent: ${label}`
: `Background: ${input.slice(0, 50)}${input.length > 50 ? '...' : ''}`;
const result = await sql.begin(async (tx) => {
// 1. Create session for the background task
const [sess] = await tx<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, system_prompt)
VALUES (${projectId}, ${sessionName}, ${model}, '')
RETURNING id
`;
const sessionId = sess!.id;
// 2. Create chat in that session
const [ch] = await tx<{ id: string }[]>`
INSERT INTO chats (session_id, name, status)
VALUES (${sessionId}, ${label ?? null}, 'open')
RETURNING id
`;
const chatId = ch!.id;
// 3. Insert user message with the task input
await tx`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'user', ${input}, 'complete', clock_timestamp())
`;
// 4. Insert streaming assistant message (inference fills it)
const [assistantRow] = await tx<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
RETURNING id
`;
const assistantMessageId = assistantRow!.id;
// 5. Insert tasks row for tracking
const [task] = await tx<{ id: string; created_at: string }[]>`
INSERT INTO tasks (project_id, session_id, state, input, agent, model)
VALUES (${projectId}, ${sessionId}, 'running', ${input}, ${agent ?? null}, ${model})
RETURNING id, created_at
`;
return { sessionId, chatId, assistantMessageId, task: task! };
});
// After the transaction commits, fire-and-forget inference dispatch.
if (_enqueueInference) {
try {
_enqueueInference(result.sessionId, result.chatId, result.assistantMessageId, 'default');
} catch (err) {
log.warn(
{ err, taskId: result.task.id },
'background inference enqueue failed',
);
}
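} else {
// Without a registered enqueuer the task row stays 'running' forever;
// surface that instead of failing silently.
log.warn(
{ taskId: result.task.id },
'no background inference enqueuer registered; task will not run',
);
}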
log.info(
{
taskId: result.task.id,
sessionId: result.sessionId,
chatId: result.chatId,
model,
agent,
},
'spawned background subagent task',
);
return {
id: result.task.id,
session_id: result.sessionId,
chat_id: result.chatId,
agent: agent ?? null,
model,
input,
status: 'running',
output_summary: null,
created_at: result.task.created_at,
finished_at: null,
};
}
// Look up a background task by its tasks.id. Includes the status from the
// tasks table and the chat_id from the linked chat.
export async function getBackgroundTaskStatus(
sql: Sql,
taskId: string,
): Promise<BackgroundTask | null> {
const rows = await sql<
{
id: string;
session_id: string;
state: string;
input: string;
agent: string | null;
model: string | null;
output_summary: string | null;
created_at: string;
ended_at: string | null;
}[]
>`
SELECT id, session_id, state, input, agent, model, output_summary, created_at, ended_at
FROM tasks
WHERE id = ${taskId}
`;
if (rows.length === 0) return null;
const r = rows[0]!;
// Find the chat_id from the session (background sessions have exactly one chat).
const chatRows = await sql<{ id: string }[]>`
SELECT id FROM chats WHERE session_id = ${r.session_id} LIMIT 2
`;
return {
id: r.id,
session_id: r.session_id,
chat_id: chatRows[0]?.id ?? '',
agent: r.agent,
model: r.model ?? '',
input: r.input,
status: mapTaskState(r.state),
output_summary: r.output_summary,
created_at: r.created_at,
finished_at: r.ended_at,
};
}
// Retrieve the full output and token usage from a completed background task.
// Returns null if the task has no completed assistant message.
export async function getBackgroundTaskResult(
sql: Sql,
taskId: string,
chatId: string,
): Promise<{
output: string;
token_usage: { prompt: number; completion: number } | null;
} | null> {
// Verify the task exists and chatId belongs to it.
const taskRows = await sql<{ session_id: string }[]>`
SELECT session_id FROM tasks WHERE id = ${taskId}
`;
if (taskRows.length === 0) return null;
// Read the last complete assistant message (the one with content).
const msgRows = await sql<
{
content: string;
tokens_used: number | null;
ctx_used: number | null;
}[]
>`
SELECT content, tokens_used, ctx_used
FROM messages
WHERE chat_id = ${chatId}
AND session_id = ${taskRows[0]!.session_id}
AND role = 'assistant'
AND status = 'complete'
AND content <> ''
ORDER BY created_at DESC
LIMIT 1
`;
if (msgRows.length === 0) return null;
const m = msgRows[0]!;
return {
output: m.content,
token_usage:
m.tokens_used != null || m.ctx_used != null
? { prompt: m.ctx_used ?? 0, completion: m.tokens_used ?? 0 }
: null,
};
}
// Cancel a pending or running background task. Returns true if a row was
// actually updated (the task existed and was in a cancellable state).
export async function cancelBackgroundTask(
sql: Sql,
taskId: string,
): Promise<boolean> {
const rows = await sql<{ id: string }[]>`
UPDATE tasks
SET state = 'cancelled', ended_at = clock_timestamp()
WHERE id = ${taskId}
AND state IN ('pending', 'running')
RETURNING id
`;
return rows.length > 0;
}
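
The header comment says callers poll status and then fetch results. A rough sketch of such a poll loop follows, under stated assumptions: `pollUntilTerminal`, its options, and the `'not_found'` / `'gave_up'` return values are hypothetical and not part of this service. In practice `getStatus` would presumably wrap `getBackgroundTaskStatus(sql, taskId)` and return its `status` field, or `null` when the task row is missing.

```typescript
type TaskStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// States after which polling can stop.
const TERMINAL: ReadonlySet<TaskStatus> = new Set<TaskStatus>([
  'completed',
  'failed',
  'cancelled',
]);

// Hypothetical poll loop. The status lookup is injected so the loop can be
// exercised without a database.
async function pollUntilTerminal(
  getStatus: () => Promise<TaskStatus | null>,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<TaskStatus | 'not_found' | 'gave_up'> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const status = await getStatus();
    if (status === null) return 'not_found';
    if (TERMINAL.has(status)) return status;
    // Wait before the next lookup so the database is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  // The attempt budget ran out while the task was still pending or running.
  return 'gave_up';
}
```

Bounding the loop with `maxAttempts` matters here because a task whose inference was never enqueued would otherwise be polled indefinitely.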


@@ -1,3 +1,10 @@
// DEPRECATED (Phase 4, Domain 2, v2.8.14): This HTTP client routes through
// the Go codecontext sidecar (http://codecontext:8080). Superseded by the
// boocontext MCP server. New callers should use boocontext MCP tool wrappers
// directly. Keep this file for backward compatibility — the 16 existing
// codecontext tool wrappers (under tools/codecontext/) still call through
// callCodecontext(). Remove after full migration.
//
// v1.12 Track B.2: shared HTTP client for the codecontext sidecar. The 8
// per-tool wrappers under tools/codecontext/ all funnel through callCodecontext
// — they're thin adapters that supply toolName + args + projectPath. The
@@ -19,6 +26,7 @@
import { access, copyFile, realpath } from 'node:fs/promises';
import { isAbsolute, join, resolve, sep } from 'node:path';
import { truncateIfNeeded } from './truncate.js';
import { callBoocontext } from './boocontext_client.js';
// v1.13.12 fix: codecontext crashes on empty source files (upstream issue #37)
// when it can't ignore them. The .codecontextignore.template ships with the
@@ -112,6 +120,16 @@ export async function callCodecontext(
req: CodecontextRequest,
fetcher: typeof fetch = fetch,
): Promise<CodecontextResponse> {
// Phase 4: try boocontext MCP first. Falls back to the HTTP sidecar if the
// MCP server is not available or the tool doesn't exist there.
try {
return await callBoocontext({ toolName: req.toolName, args: req.args });
} catch (err) {
console.warn(
`[codecontext_client] boocontext MCP unavailable for "${req.toolName}", falling back to HTTP sidecar: ${err instanceof Error ? err.message : String(err)}`,
);
}
// Step 1: realpath the project root, then realpath the requested target_dir
// (defaulting to projectPath when the caller didn't pass one — the 12 wrappers
// never pass target_dir; tests can override). A non-existent target_dir


@@ -0,0 +1,132 @@
/**
* Compact unified-diff generator for write-tool results.
*
* Produces a minimal unified diff string (---/+++ header + +/- lines) from
* old/new text pairs so the frontend can render an inline diff snippet
* without pulling in a full diff library.
*/
// Write-tool names that can produce file diffs.
export const WRITE_TOOL_NAMES = new Set([
'edit_file',
'create_file',
'delete_file',
'apply_pending',
]);
/**
* Compute a compact unified diff from old → new text.
*
* @param oldStr The original text (empty for creates)
* @param newStr The replacement text (empty for deletes)
* @param filePath Display path for the file header
* @returns A unified-diff string, or empty string if old === new
*/
export function computeDiff(oldStr: string, newStr: string, filePath: string): string {
if (oldStr === newStr) return '';
const oldLines = oldStr.split('\n');
const newLines = newStr.split('\n');
// For empty old → new file (create), show all lines as additions
if (oldStr.length === 0 && newStr.length > 0) {
const header = `--- /dev/null\n+++ b/${filePath}\n`;
const body = newLines.map((line) => `+${line}`).join('\n');
return header + body;
}
// For old → empty (delete), show all lines as removals
if (newStr.length === 0 && oldStr.length > 0) {
const header = `--- a/${filePath}\n+++ /dev/null\n`;
const body = oldLines.map((line) => `-${line}`).join('\n');
return header + body;
}
// Simple line-by-line diff for edit: collect changed lines with context.
// Uses a straightforward algorithm: find the first differing line and the
// last differing line, then output the block with +/- markers.
const header = `--- a/${filePath}\n+++ b/${filePath}\n`;
const maxLen = Math.max(oldLines.length, newLines.length);
let firstDiff = -1;
let lastDiff = -1;
for (let i = 0; i < maxLen; i++) {
const a = i < oldLines.length ? oldLines[i] : undefined;
const b = i < newLines.length ? newLines[i] : undefined;
if (a !== b) {
if (firstDiff === -1) firstDiff = i;
lastDiff = i;
}
}
if (firstDiff === -1) return '';
// Add context lines around the changed block (up to 2 lines each side)
const contextBefore = 2;
const contextAfter = 2;
const start = Math.max(0, firstDiff - contextBefore);
const end = Math.min(maxLen - 1, lastDiff + contextAfter);
// Build the unified diff hunk
const hunkLines: string[] = [];
const hunkOldStart = start + 1; // 1-indexed
const hunkNewStart = start + 1;
// Count old-side and new-side lines separately: the two sides differ in
// length whenever lines were added or removed inside the hunk.
let hunkOldLen = 0;
let hunkNewLen = 0;
for (let i = start; i <= end; i++) {
const oldLine = i < oldLines.length ? oldLines[i] : undefined;
const newLine = i < newLines.length ? newLines[i] : undefined;
if (oldLine === newLine) {
hunkLines.push(` ${oldLine ?? ''}`);
hunkOldLen++;
hunkNewLen++;
} else {
if (oldLine !== undefined) {
hunkLines.push(`-${oldLine}`);
hunkOldLen++;
}
if (newLine !== undefined) {
hunkLines.push(`+${newLine}`);
hunkNewLen++;
}
}
}
const hunkHeader = `@@ -${hunkOldStart},${hunkOldLen} +${hunkNewStart},${hunkNewLen} @@\n`;
return header + hunkHeader + hunkLines.join('\n');
}
/**
* Check whether a tool name corresponds to a file-modifying write tool
* that should produce a diff in its tool result.
*/
export function isWriteTool(name: string): boolean {
return WRITE_TOOL_NAMES.has(name);
}
/**
* Extract a diff string from tool call args for write tools.
* Returns empty string if the tool doesn't produce diffs or args are missing.
*/
export function diffFromToolArgs(name: string, args: Record<string, unknown>, filePath?: string): string {
switch (name) {
case 'edit_file': {
const oldStr = String(args.old_string ?? '');
const newStr = String(args.new_string ?? '');
const path = filePath ?? String(args.file_path ?? 'file');
return computeDiff(oldStr, newStr, path);
}
case 'create_file': {
const content = String(args.content ?? '');
const path = filePath ?? String(args.file_path ?? 'file');
return computeDiff('', content, path);
}
case 'delete_file':
// No content available at queue time — actual content is read at apply time.
return '';
case 'apply_pending':
// Meta-tool — individual changes produce their own diffs.
return '';
default:
return '';
}
}
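
The first/last-differing-line scan that computeDiff relies on can be looked at in isolation. The sketch below is only an illustration: `changedRange` is a hypothetical helper, not part of this change. It also shows a limitation of comparing lines by position, namely that a single inserted line makes every following line count as changed.

```typescript
// Hypothetical helper (not in the diff): find the first and last line index
// at which two texts differ when compared position by position.
function changedRange(
  oldStr: string,
  newStr: string,
): { first: number; last: number } | null {
  const oldLines = oldStr.split('\n');
  const newLines = newStr.split('\n');
  const maxLen = Math.max(oldLines.length, newLines.length);
  let first = -1;
  let last = -1;
  for (let i = 0; i < maxLen; i++) {
    // Indexing past the end yields undefined, which never equals a real line.
    if (oldLines[i] !== newLines[i]) {
      if (first === -1) first = i;
      last = i;
    }
  }
  return first === -1 ? null : { first, last };
}

// An in-place edit touches a single index.
const edited = changedRange('a\nb\nc\nd', 'a\nB\nc\nd');
// An insertion shifts the tail, so indices 1 through 3 all differ.
const inserted = changedRange('a\nb\nc', 'a\nx\nb\nc');
```

For small `edit_file` snippets this is usually acceptable; a longest-common-subsequence diff would be needed to report an insertion as a single added line.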


@@ -0,0 +1,56 @@
// vDeepSeek (stub): multi-modal (image) attachment support.
//
// When a message carries images, DeepSeek V4 models can process them
// natively via the @ai-sdk/deepseek provider. This module provides the
// helper types and functions to detect and convert image attachments.
//
// FULL INTEGRATION requires:
// 1. Storing image data alongside messages (message_parts with kind='image'
// or a dedicated attachments table with base64-encoded data).
// 2. Extending OpenAiMessage.content from `string | null` to
// `string | null | Array<{ type: 'text'; text: string } | { type: 'image'; image: string }>`
// in apps/server/src/services/inference/payload.ts.
// 3. Updating toModelMessages() in stream-phase-adapter.ts to emit AI SDK
// content arrays with image parts for multimodal user messages.
//
// None of the above is done yet — this file is a type scaffold.
import type { Message } from '../../types/api.js';
/** Shape of a decoded image attachment ready for the AI SDK. */
export interface ImageAttachment {
/** Base64-encoded image data (payload only, without a `data:` URI prefix). */
data: string;
/** MIME type (e.g. 'image/png', 'image/jpeg', 'image/webp'). */
mimeType: string;
}
/**
* Check if a user message has image content that can be forwarded to a
* multimodal model. Currently a stub — always returns false until the
* message-pipeline stores image attachments addressably.
*/
export function hasImageAttachments(_message: Message): boolean {
// TODO(vDeepSeek): scan message_parts for kind='image' or inspect
// message.content for inline data URIs (data:image/...).
return false;
}
/**
* Convert internal image attachments to the format expected by the AI SDK
* ModelMessage content array.
*
* The @ai-sdk/deepseek provider accepts images as:
* { type: 'image'; image: 'data:image/png;base64,...' }
*
* @param attachments — List of decoded image attachments.
* @returns AI SDK image parts suitable for ModelMessage.content.
*/
export function imageAttachmentsToParts(
attachments: ImageAttachment[],
): Array<{ type: 'image'; image: string }> {
return attachments.map((a) => ({
type: 'image' as const,
image: `data:${a.mimeType};base64,${a.data}`,
}));
}
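
Step 2 of the integration notes widens message content from a string to an array of text and image parts. A minimal sketch of how such an array might be assembled is shown below. `buildUserContent` and the `TextPart`/`ImagePart` aliases are hypothetical names introduced for illustration; only the part shapes and the data-URI format come from the file above.

```typescript
// Part shapes as described in the integration notes above.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: string };

interface ImageAttachment {
  data: string; // base64 payload, no data URI prefix
  mimeType: string;
}

// Hypothetical helper: keep plain strings for text-only messages and switch
// to a parts array only when at least one image is attached.
function buildUserContent(
  text: string,
  attachments: ImageAttachment[],
): string | Array<TextPart | ImagePart> {
  if (attachments.length === 0) return text;
  const images = attachments.map(
    (a): ImagePart => ({
      type: 'image',
      image: `data:${a.mimeType};base64,${a.data}`,
    }),
  );
  return [{ type: 'text', text }, ...images];
}

const content = buildUserContent('describe this', [
  { data: 'aGVsbG8=', mimeType: 'image/png' },
]);
```

Returning a plain string when there are no attachments keeps text-only messages on the existing `string | null` path.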


@@ -19,7 +19,9 @@ import { formatUnknownToolError } from './tool-suggestions.js';
import { resolveGrantRoot } from '../grant_resolver.js';
import { stripToolMarkup } from './tool-call-parser.js';
import { repairToolInput } from './tool-input-repair.js';
import { diffFromToolArgs, isWriteTool } from './compute-diff.js';
import type { FailureKind } from './mistake-tracker.js';
import { insertToolTrace, updateToolTrace } from '../tool-traces.js';
import type {
InferenceContext,
StreamResult,
@@ -175,6 +177,7 @@ export async function executeToolPhase(
session: Session,
projectRoot: string,
agent?: Agent | null,
turnNumber?: number,
): Promise<ToolPhaseResult> {
const { sessionId, chatId, assistantMessageId } = args;
const content = stripToolMarkup(result.content, { final: true });
@@ -378,11 +381,53 @@ export async function executeToolPhase(
});
return;
}
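// tool_trace instrumentation - start
// Await the insert and reuse the id of the inserted row so that the later
// updateToolTrace(traceId) call matches it; insertToolTrace does not accept a
// caller-supplied id. If the insert fails, fall back to a fresh UUID, which
// then only identifies the published frames.
const traceRow = await insertToolTrace(ctx.sql, {
session_id: sessionId,
chat_id: chatId,
message_id: assistantMessageId,
turn_number: turnNumber ?? 0,
tool_name: tc.name,
tool_input: tc.args as Record<string, unknown>,
}).catch(() => null);
const traceId = traceRow?.id ?? crypto.randomUUID();
const traceStartTime = Date.now();
const startedAtIso = new Date().toISOString();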
ctx.publish(sessionId, {
type: 'tool_trace_start',
trace_id: traceId,
message_id: assistantMessageId,
chat_id: chatId,
tool_name: tc.name,
tool_input: tc.args as Record<string, unknown>,
started_at: startedAtIso,
});
const tres = await executeToolCall(
projectRoot, tc, session.allowed_read_paths,
{ sql: ctx.sql, sessionId },
ctx.hooks, sessionId,
);
// tool_trace instrumentation - finish
const finishedAtIso = new Date().toISOString();
const latencyMs = Date.now() - traceStartTime;
updateToolTrace(ctx.sql, traceId, {
finished_at: finishedAtIso,
...(tres.outcome === 'success' && tres.output != null ? { tool_output: JSON.stringify(tres.output) } : {}),
latency_ms: latencyMs,
outcome: tres.outcome,
...(tres.error ? { error: tres.error } : {}),
}).catch(() => {});
ctx.publish(sessionId, {
type: 'tool_trace_finish',
trace_id: traceId,
message_id: assistantMessageId,
chat_id: chatId,
tool_name: tc.name,
finished_at: finishedAtIso,
outcome: tres.outcome,
latency_ms: latencyMs,
...(tres.error ? { error: tres.error } : {}),
});
// vWhale: PostToolUse hook (best-effort, non-blocking).
if (ctx.hooks) {
ctx.hooks.run('PostToolUse', {
@@ -401,6 +446,16 @@ export async function executeToolPhase(
if (SYNTHESIS_TOOLS.has(tc.name)) {
synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) });
}
// v2.8: compute a compact unified diff for successful write-tool results.
// The diff is derived from tool call args (old_string/new_string for
// edit_file, content for create_file) and included in the WS frame so
// the frontend can render a DiffSnippet inline. Not persisted to message_parts
// (the args alone are enough to reproduce it on reload if needed).
const toolDiff =
!tres.error && tres.outcome === 'success' && isWriteTool(tc.name)
? diffFromToolArgs(tc.name, tc.args as Record<string, unknown>)
: undefined;
const stored = {
tool_call_id: tc.id,
output: tres.output,
@@ -423,6 +478,7 @@ export async function executeToolPhase(
output: tres.output,
truncated: tres.truncated,
...(tres.error ? { error: tres.error } : {}),
...(toolDiff ? { diff: toolDiff } : {}),
});
})
);


@@ -37,6 +37,12 @@ import type {
StreamResult,
TurnArgs,
} from './types.js';
import { saveAgentSnapshot } from '../session-snapshots.js';
// vWhale: auto-fix loop — after write tools, build the project and inject
// errors. Uses execFile (no shell) against the project root.
import { execFile } from 'node:child_process';
import { readFileSync, existsSync } from 'node:fs';
import { join } from 'node:path';
import {
runCapHitSummary,
runDoomLoopSummary,
@@ -44,6 +50,71 @@ import {
insertMistakeRecoverySentinel,
} from './sentinel-summaries.js';
// vWhale: auto-fix — detect build command from package.json, run it, return
// error text for injection into next iteration. Best-effort, never throws.
const BUILD_TIMEOUT_MS = 60_000;
const BUILD_OUTPUT_CAP = 8_000;
async function detectAndRunBuild(
ctx: InferenceContext,
projectRoot: string,
sessionId: string,
chatId: string,
model: string,
existingNote: string | undefined,
): Promise<string | undefined> {
// Only run for DeepSeek models (local Qwen models don't benefit from build loop).
if (!model.startsWith('deepseek-')) return undefined;
// Detect build command from package.json in project root.
const pkgPath = join(projectRoot, 'package.json');
if (!existsSync(pkgPath)) return undefined;
let buildCmd: string | null = null;
try {
const pkg = JSON.parse(readFileSync(pkgPath, 'utf8')) as { scripts?: Record<string, string> };
if (pkg.scripts?.build) buildCmd = 'build';
else if (pkg.scripts?.compile) buildCmd = 'compile';
else if (pkg.scripts?.typecheck) buildCmd = 'typecheck';
} catch {
return undefined;
}
if (!buildCmd) return undefined;
// Detect package manager.
const hasPnpm = existsSync(join(projectRoot, 'pnpm-lock.yaml'));
const hasYarn = existsSync(join(projectRoot, 'yarn.lock'));
const pm = hasPnpm ? 'pnpm' : hasYarn ? 'yarn' : 'npm';
// Run the build.
try {
const out = await new Promise<string>((resolve) => {
execFile(pm, ['run', buildCmd!], { cwd: projectRoot, timeout: BUILD_TIMEOUT_MS, maxBuffer: BUILD_OUTPUT_CAP * 2 },
(err, stdout, stderr) => {
// No error means the script exited 0: nothing to inject, even if it printed output.
if (!err) {
resolve('');
return;
}
if ((err as NodeJS.ErrnoException).code === 'ENOENT') {
resolve(''); // package manager not found, skip
return;
}
// Non-zero exit, timeout, or maxBuffer overflow: report whatever was captured.
const merged = (stdout + '\n' + stderr).trim();
resolve(merged.slice(0, BUILD_OUTPUT_CAP));
},
);
});
if (!out) return undefined; // build succeeded, was skipped, or failed without output
ctx.log.info({ sessionId, chatId, buildCmd, outputLen: out.length }, 'auto-fix: build failed');
// Keep the combined note within BUILD_OUTPUT_CAP; clamp at 0 so that a long
// existing note cannot turn the slice end negative.
const combined = existingNote
? existingNote + '\n\n--- Build error ---\n' + out.slice(0, Math.max(0, BUILD_OUTPUT_CAP - existingNote.length))
: '--- Build error ---\n' + out.slice(0, BUILD_OUTPUT_CAP);
return combined;
} catch {
return undefined;
}
}
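
The two selection rules inside detectAndRunBuild (which script to run, and which package manager to run it with) can be written as pure functions, which makes them easy to test without touching the file system. This is a sketch under that assumption; `pickBuildScript` and `pickPackageManager` are hypothetical names that do not appear in the change.

```typescript
// Hypothetical helpers (not in the diff): the same priority rules as
// detectAndRunBuild, expressed over already-loaded data.
type Scripts = Record<string, string>;

function pickBuildScript(scripts: Scripts | undefined): string | null {
  // Priority mirrors the source: build, then compile, then typecheck.
  for (const name of ['build', 'compile', 'typecheck']) {
    if (scripts?.[name]) return name;
  }
  return null;
}

function pickPackageManager(
  filesInRoot: ReadonlySet<string>,
): 'pnpm' | 'yarn' | 'npm' {
  // A lockfile decides the manager; npm is the fallback.
  if (filesInRoot.has('pnpm-lock.yaml')) return 'pnpm';
  if (filesInRoot.has('yarn.lock')) return 'yarn';
  return 'npm';
}

// 'compile' wins over 'typecheck' because it comes first in the priority list.
const chosenScript = pickBuildScript({ typecheck: 'tsc --noEmit', compile: 'tsc' });
const chosenPm = pickPackageManager(new Set(['package.json', 'yarn.lock']));
```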
// P5: MAX_STEPS moved to ./turn-config.ts (with resolveTurnConfig). Re-exported
// here so the public surface (index.ts → './turn.js') is unchanged.
export { MAX_STEPS } from './turn-config.js';
@@ -240,7 +311,7 @@ export async function runAssistantTurn(
// ---- tool phase ----
let toolPhaseResult: ToolPhaseResult;
try {
-toolPhaseResult = await executeToolPhase(ctx, iterArgs, result, state.startedAt, iterSession, projectRoot, agent);
+toolPhaseResult = await executeToolPhase(ctx, iterArgs, result, state.startedAt, iterSession, projectRoot, agent, stepNumber);
} catch (err) {
// Tool phase errors are unexpected (individual tool failures are
// caught inside executeToolPhase). Log and break.
@@ -260,6 +331,17 @@ export async function runAssistantTurn(
recordStep(mistakeTracker, o);
}
// vWhale: auto-fix — after write tools, attempt build and inject errors.
const WRITE_TOOLS = new Set(['edit_file', 'create_file', 'delete_file', 'apply_pending']);
const hasWriteTools = toolPhaseResult.toolCalls.some((tc) => WRITE_TOOLS.has(tc.name));
if (hasWriteTools) {
detectAndRunBuild(ctx, projectRoot, sessionId, chatId, iterSession.model, pendingRecoveryNote)
.then((buildError) => {
if (buildError) pendingRecoveryNote = buildError;
})
.catch(() => {});
}
// v#12 MistakeTracker: post-tool decision (pure). 'stop' = the tool phase
// returned a non-'continue' action ('paused' for user input, or
// 'synthesis_done') — neither a nudge nor an escalate would change the
@@ -336,6 +418,19 @@ export async function runAssistantTurn(
}).catch(() => {});
}
// ---- persist agent snapshot (best-effort, never blocks inference) ----
const snapLoaded = await loadContext(ctx.sql, sessionId, chatId).catch(() => null);
if (snapLoaded) {
await saveAgentSnapshot(ctx.sql, chatId, {
session_id: sessionId,
model: snapLoaded.session.model,
agent: agent?.name ?? null,
mode: null,
turn_number: stepNumber,
messages: snapLoaded.history.map((m) => ({ role: m.role, content: m.content })),
}).catch(() => {});
}
// ---- post-loop: step-cap sentinel ----
// When the loop exits because stepNumber reached effectiveCap, the last
// iteration's tool phase returned 'continue' with a nextAssistantId that


@@ -46,6 +46,9 @@ export interface InferenceFrame {
| 'error'
| 'flow_run_started'
| 'flow_run_step_updated'
// tool trace frames
| 'tool_trace_start'
| 'tool_trace_finish'
// arena frames
| 'battle_started'
| 'contestant_updated'
@@ -82,6 +85,15 @@ export interface InferenceFrame {
reasoning_tokens?: number | null;
session_id?: string;
name?: string;
// tool trace frames
trace_id?: string;
tool_name?: string;
tool_input?: Record<string, unknown>;
tool_output?: string | null;
latency_ms?: number;
outcome?: string;
// agent snapshot restore
agent?: string | null;
// orchestrator frames ([D-6])
run_id?: string;
flow_name?: string;


@@ -3,4 +3,9 @@ export { formatMemoryBlock } from './prompt.js';
export { scanMemoryScopes } from './scan.js';
export { parseMemoryEntries } from './entries.js';
export { ensureMemoryScaffold, getMemoryRoot } from './paths.js';
export { ContextTier } from './context-tier.js';
export { DeepDream } from './deep-dream.js';
export { CoreTier } from './core-tier.js';
export type { MemoryEntry } from './entries.js';
export type { ContextTierConfig, ConversationTurn } from './context-tier.js';
export type { CoreTierEntry, CoreTierSearchResult, CoreTierSearchOptions } from './core-tier.js';


@@ -0,0 +1,51 @@
import type { Sql } from '../db.js';
export interface AgentSnapshot {
id: string;
session_id: string;
chat_id: string;
model: string;
agent: string | null;
mode: string | null;
turn_number: number;
messages: unknown[];
tool_states: unknown[];
created_at: string;
updated_at: string;
}
/** Save or update the agent snapshot for a chat (UPSERT). */
export async function saveAgentSnapshot(sql: Sql, chatId: string, data: {
session_id: string;
model: string;
agent?: string | null;
mode?: string | null;
turn_number: number;
messages: unknown[];
tool_states?: unknown[];
}): Promise<void> {
await sql`
INSERT INTO agent_snapshots (session_id, chat_id, model, agent, mode, turn_number, messages, tool_states, updated_at)
VALUES (${data.session_id}, ${chatId}, ${data.model}, ${data.agent ?? null}, ${data.mode ?? null}, ${data.turn_number}, ${sql.json(data.messages as never)}, ${sql.json((data.tool_states ?? []) as never)}, clock_timestamp())
ON CONFLICT (chat_id)
DO UPDATE SET
model = EXCLUDED.model,
agent = EXCLUDED.agent,
mode = EXCLUDED.mode,
turn_number = EXCLUDED.turn_number,
messages = EXCLUDED.messages,
tool_states = EXCLUDED.tool_states,
updated_at = clock_timestamp()
`;
}
/** Load the agent snapshot for a chat. Returns null if no snapshot exists. */
export async function loadAgentSnapshot(sql: Sql, chatId: string): Promise<AgentSnapshot | null> {
const rows = await sql<AgentSnapshot[]>`SELECT * FROM agent_snapshots WHERE chat_id = ${chatId}`;
return rows[0] ?? null;
}
/** Delete the agent snapshot for a chat (call when session ends). */
export async function deleteAgentSnapshot(sql: Sql, chatId: string): Promise<void> {
await sql`DELETE FROM agent_snapshots WHERE chat_id = ${chatId}`;
}


@@ -0,0 +1,92 @@
import type { Sql } from '../db.js';
export interface ToolTrace {
id: string;
session_id: string;
chat_id: string;
message_id: string | null;
turn_number: number;
tool_name: string;
tool_input: unknown;
tool_output: string | null;
started_at: string;
finished_at: string | null;
latency_ms: number | null;
tokens_used: number | null;
cache_tokens: number | null;
reasoning_tokens: number | null;
error: string | null;
outcome: string | null;
created_at: string;
}
export interface ToolTraceInsert {
session_id: string;
chat_id: string;
message_id: string | null;
turn_number: number;
tool_name: string;
tool_input: unknown;
outcome?: string;
}
export interface ToolTraceUpdate {
finished_at?: string;
latency_ms?: number;
tool_output?: string;
tokens_used?: number;
cache_tokens?: number;
reasoning_tokens?: number;
error?: string;
outcome?: string;
}
export async function insertToolTrace(
sql: Sql,
insert: ToolTraceInsert,
): Promise<ToolTrace> {
const [row] = await sql<ToolTrace[]>`
INSERT INTO tool_traces (
session_id, chat_id, message_id, turn_number,
tool_name, tool_input, outcome
) VALUES (
${insert.session_id}, ${insert.chat_id}, ${insert.message_id},
${insert.turn_number}, ${insert.tool_name},
${sql.json(insert.tool_input as never)},
${insert.outcome ?? null}
)
RETURNING *
`;
if (!row) throw new Error('insertToolTrace returned no row');
return row;
}
export async function updateToolTrace(
sql: Sql,
id: string,
updates: ToolTraceUpdate,
): Promise<ToolTrace | null> {
const cols: string[] = [];
const vals: any[] = [];
if (updates.finished_at !== undefined) { cols.push('finished_at'); vals.push(updates.finished_at); }
if (updates.latency_ms !== undefined) { cols.push('latency_ms'); vals.push(updates.latency_ms); }
if (updates.tool_output !== undefined) { cols.push('tool_output'); vals.push(updates.tool_output); }
if (updates.tokens_used !== undefined) { cols.push('tokens_used'); vals.push(updates.tokens_used); }
if (updates.cache_tokens !== undefined) { cols.push('cache_tokens'); vals.push(updates.cache_tokens); }
if (updates.reasoning_tokens !== undefined) { cols.push('reasoning_tokens'); vals.push(updates.reasoning_tokens); }
if (updates.error !== undefined) { cols.push('error'); vals.push(updates.error); }
if (updates.outcome !== undefined) { cols.push('outcome'); vals.push(updates.outcome); }
if (cols.length === 0) {
const [row] = await sql<ToolTrace[]>`SELECT * FROM tool_traces WHERE id = ${id}`;
return row ?? null;
}
const setClause = cols.map((c, i) => `${c} = $${i + 1}`).join(', ');
const [row] = await sql.unsafe<ToolTrace[]>(
`UPDATE tool_traces SET ${setClause} WHERE id = $${cols.length + 1} RETURNING *`,
[...vals, id],
);
return row ?? null;
}
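
updateToolTrace builds its SET clause from whichever fields are present. The sketch below isolates that step as a pure function so the placeholder numbering can be checked without a database. `buildUpdate` and the `UPDATABLE` list are hypothetical names, and the list covers only a subset of the real columns. Column names come from a fixed allowlist, so only values ever travel as parameters.

```typescript
// Hypothetical allowlist (a subset of the real columns); names are never
// taken from caller input.
const UPDATABLE = ['finished_at', 'latency_ms', 'tool_output', 'error', 'outcome'] as const;
type Col = (typeof UPDATABLE)[number];
type Value = string | number;

// Hypothetical helper: turn a partial update into SQL text plus ordered
// parameter values, or null when there is nothing to update.
function buildUpdate(
  id: string,
  updates: Partial<Record<Col, Value>>,
): { text: string; values: Value[] } | null {
  const cols = UPDATABLE.filter((c) => updates[c] !== undefined);
  if (cols.length === 0) return null;
  // Placeholders are 1-indexed; the id takes the slot after the last column.
  const setClause = cols.map((c, i) => `${c} = $${i + 1}`).join(', ');
  const values = cols.map((c) => updates[c] as Value);
  return {
    text: `UPDATE tool_traces SET ${setClause} WHERE id = $${cols.length + 1} RETURNING *`,
    values: [...values, id],
  };
}

const query = buildUpdate('abc', { outcome: 'success', latency_ms: 12 });
```

Column order follows the allowlist rather than the order of keys in `updates`, which keeps the generated SQL stable for a given set of fields.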


@@ -0,0 +1,305 @@
// v2.x: Background subagent tools. Three tools that let the model spawn
// non-blocking subagent tasks, poll their status, and retrieve results.
//
// spawn_subagent — Create a background session+chat, dispatch inference,
// return immediately with a task_id.
// subagent_status — Poll the status of a previously spawned task.
// subagent_result — Retrieve the full output of a completed task.
//
// These tools reuse the existing sessions/chats/messages tables and the
// inference pipeline — no new tables or services needed.
//
// Registered in tools.ts ALL_TOOLS. Lives in its own file so tests can
// import executors without dragging in the full tool registry.
//
// Follows the read_tab_by_number.ts pattern: a pure executor function plus
// a ToolDef wrapper. Type-only import from tools.ts to dodge runtime cycles.
import { z } from 'zod';
import type { Sql } from '../../db.js';
import type { ToolDef, ToolExecCtx } from '../tools.js';
import {
spawnBackgroundTask,
getBackgroundTaskStatus,
getBackgroundTaskResult,
} from '../background-task.js';
// ---------------------------------------------------------------------------
// spawn_subagent
// ---------------------------------------------------------------------------
export const SpawnSubagentInput = z.object({
input: z.string().min(1).describe('The task to execute in the background'),
model: z
.string()
.min(1)
.optional()
.describe('Model to use (defaults to session model)'),
agent: z
.string()
.min(1)
.optional()
.describe('Agent to use (defaults to boocode)'),
label: z
.string()
.max(100)
.optional()
.describe('Human-readable label for display'),
});
export type SpawnSubagentInputT = z.infer<typeof SpawnSubagentInput>;
export async function executeSpawnSubagent(
input: SpawnSubagentInputT,
sql: Sql,
sessionId: string,
): Promise<Record<string, unknown>> {
// Resolve project_id + model from the current session.
const sessRows = await sql<
{ project_id: string; model: string }[]
>`
SELECT project_id, model FROM sessions WHERE id = ${sessionId}
`;
if (sessRows.length === 0) {
return { error: 'current session not found' };
}
const projectId = sessRows[0]!.project_id;
const model = input.model ?? sessRows[0]!.model;
const task = await spawnBackgroundTask(
sql,
// We pass a minimal logger shim — the real logger is wired by the
// inference pipeline. This keeps the tool's execute signature clean.
{ info: () => {}, warn: () => {}, error: () => {} } as unknown as import('fastify').FastifyBaseLogger,
projectId,
input.input,
model,
input.agent,
input.label,
);
// Elapsed time since creation is negligible (task was just spawned).
return {
task_id: task.id,
status: task.status,
session_id: task.session_id,
chat_id: task.chat_id,
created_at: task.created_at,
};
}
export const spawnSubagent: ToolDef<SpawnSubagentInputT> = {
name: 'spawn_subagent',
description:
'Spawn a background subagent task. Creates a new session and chat, dispatches inference asynchronously, and returns immediately with a task_id. Use subagent_status to poll for completion and subagent_result to retrieve the full output. Non-blocking — the model continues while the subagent works in the background.',
inputSchema: SpawnSubagentInput,
jsonSchema: {
type: 'function',
function: {
name: 'spawn_subagent',
description:
'Spawn a background subagent task. Returns immediately with a task_id — poll with subagent_status.',
parameters: {
type: 'object',
properties: {
input: {
type: 'string',
description: 'The task to execute in the background',
},
model: {
type: 'string',
description: 'Model to use (defaults to session model)',
},
agent: {
type: 'string',
description: 'Agent to use (defaults to boocode)',
},
label: {
type: 'string',
maxLength: 100,
description: 'Human-readable label for display',
},
},
required: ['input'],
additionalProperties: false,
},
},
},
async execute(input, _projectRoot, _extraRoots, toolCtx?: ToolExecCtx) {
if (!toolCtx) {
return { error: 'spawn_subagent unavailable: no session context' };
}
try {
return await executeSpawnSubagent(input, toolCtx.sql, toolCtx.sessionId);
} catch (err) {
return {
error: `spawn_subagent failed: ${err instanceof Error ? err.message : String(err)}`,
};
}
},
};
// ---------------------------------------------------------------------------
// subagent_status
// ---------------------------------------------------------------------------
export const SubagentStatusInput = z.object({
task_id: z.string().uuid().describe('Task ID from spawn_subagent'),
});
export type SubagentStatusInputT = z.infer<typeof SubagentStatusInput>;
export async function executeSubagentStatus(
input: SubagentStatusInputT,
sql: Sql,
): Promise<Record<string, unknown>> {
const task = await getBackgroundTaskStatus(sql, input.task_id);
if (!task) {
return { error: 'task not found', task_id: input.task_id };
}
// Compute elapsed time from created_at (ISO string). An unparseable date
// yields NaN rather than throwing, so check the result instead of catching.
const created = new Date(task.created_at).getTime();
const finished = task.finished_at
? new Date(task.finished_at).getTime()
: Date.now();
const elapsedMs = finished - created;
const elapsed_seconds: number | null = Number.isFinite(elapsedMs)
? Math.round(elapsedMs / 1000)
: null;
return {
task_id: task.id,
status: task.status,
output_summary: task.output_summary,
finished_at: task.finished_at,
elapsed_seconds,
};
}
export const subagentStatus: ToolDef<SubagentStatusInputT> = {
name: 'subagent_status',
description:
'Poll the status of a background subagent task by task_id. Returns the current status (running/completed/failed/cancelled), an output summary if completed, and elapsed time. Useful after spawn_subagent to check if work is done.',
inputSchema: SubagentStatusInput,
jsonSchema: {
type: 'function',
function: {
name: 'subagent_status',
description:
'Poll the status of a background subagent task. Returns status, output summary, and elapsed time.',
parameters: {
type: 'object',
properties: {
task_id: {
type: 'string',
format: 'uuid',
description: 'Task ID from spawn_subagent',
},
},
required: ['task_id'],
additionalProperties: false,
},
},
},
async execute(input, _projectRoot, _extraRoots, toolCtx?: ToolExecCtx) {
if (!toolCtx) {
return { error: 'subagent_status unavailable: no session context' };
}
try {
return await executeSubagentStatus(input, toolCtx.sql);
} catch (err) {
return {
error: `subagent_status failed: ${err instanceof Error ? err.message : String(err)}`,
};
}
},
};
// ---------------------------------------------------------------------------
// subagent_result
// ---------------------------------------------------------------------------
export const SubagentResultInput = z.object({
task_id: z.string().uuid().describe('Task ID from spawn_subagent'),
});
export type SubagentResultInputT = z.infer<typeof SubagentResultInput>;
export async function executeSubagentResult(
input: SubagentResultInputT,
sql: Sql,
): Promise<Record<string, unknown>> {
const task = await getBackgroundTaskStatus(sql, input.task_id);
if (!task) {
return { error: 'task not found', task_id: input.task_id };
}
if (task.status !== 'completed') {
return {
task_id: task.id,
status: task.status,
error: `task is not yet completed (status: ${task.status})`,
};
}
if (!task.chat_id) {
return { error: 'task has no chat data', task_id: input.task_id };
}
const result = await getBackgroundTaskResult(sql, input.task_id, task.chat_id);
if (!result) {
return {
task_id: task.id,
status: task.status,
error: 'task completed but no output message found',
};
}
return {
task_id: task.id,
output: result.output,
token_usage: result.token_usage,
};
}
export const subagentResult: ToolDef<SubagentResultInputT> = {
name: 'subagent_result',
description:
'Retrieve the full output of a completed background subagent task by task_id. Returns the response text and token usage. The task must be in completed status — poll with subagent_status first.',
inputSchema: SubagentResultInput,
jsonSchema: {
type: 'function',
function: {
name: 'subagent_result',
description:
'Retrieve the full output of a completed background subagent task. Returns output text and token usage.',
parameters: {
type: 'object',
properties: {
task_id: {
type: 'string',
format: 'uuid',
description: 'Task ID from spawn_subagent',
},
},
required: ['task_id'],
additionalProperties: false,
},
},
},
async execute(input, _projectRoot, _extraRoots, toolCtx?: ToolExecCtx) {
if (!toolCtx) {
return { error: 'subagent_result unavailable: no session context' };
}
try {
return await executeSubagentResult(input, toolCtx.sql);
} catch (err) {
return {
error: `subagent_result failed: ${err instanceof Error ? err.message : String(err)}`,
};
}
},
};
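
The three tools imply a small polling protocol: spawn, poll subagent_status, then call subagent_result only once the task is completed. A caller-side decision function for that loop might look like the sketch below. `nextStep` and the `maxPolls` budget are hypothetical; only the four status values come from the tool description above.

```typescript
// Status values as listed in the subagent_status description.
type TaskStatus = 'running' | 'completed' | 'failed' | 'cancelled';
type NextStep = 'poll_again' | 'fetch_result' | 'stop';

// Hypothetical helper: decide what to do after one subagent_status call.
function nextStep(status: TaskStatus, pollsSoFar: number, maxPolls: number): NextStep {
  // subagent_result rejects anything that is not 'completed'.
  if (status === 'completed') return 'fetch_result';
  // Terminal failures will never produce a result, so stop polling.
  if (status === 'failed' || status === 'cancelled') return 'stop';
  // Still running: keep polling until the budget is spent.
  return pollsSoFar < maxPolls ? 'poll_again' : 'stop';
}

const afterFirstPoll = nextStep('running', 1, 10);
const afterCompletion = nextStep('completed', 4, 10);
```

Capping the number of polls keeps a stuck subagent from holding the parent turn open indefinitely.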


@@ -2,6 +2,12 @@ import { z } from 'zod';
import type { ToolDef } from '../types.js';
import { callCodecontext, type CodecontextResponse } from '../../codecontext_client.js';
// DEPRECATED (Phase 4, Domain 2, v2.8.14): This factory builds ToolDefs that
// route through the Go codecontext sidecar via callCodecontext(). Superseded
// by direct boocontext MCP tool wrappers. Keep functional for backward
// compatibility — old codecontext tools still use HTTP. New tools should use
// the boocontext MCP server instead of adding entries here.
//
// Shared factory for the 12 codecontext shim ToolDefs.
// Each shim provides name/schema/description/jsonParameters/mapArgs; the
// factory builds the ToolDef and returns both the ToolDef and the standalone


@@ -3,6 +3,7 @@ import { makeCodecontextTool } from './factory.js';
export const GetCodebaseOverviewInput = z.object({
include_stats: z.boolean().optional(),
compress: z.boolean().optional().describe('Apply DCP compression for large projects (>50 files)'),
});
export type GetCodebaseOverviewInputT = z.infer<typeof GetCodebaseOverviewInput>;
@@ -24,10 +25,18 @@ const { toolDef: getCodebaseOverview, execute: executeGetCodebaseOverview } =
type: 'boolean',
description: 'Include file count, symbol count, language stats. Defaults to true.',
},
compress: {
type: 'boolean',
description: 'Apply DCP compression for large projects (>50 files)',
},
},
additionalProperties: false,
},
mapArgs: (input) => {
const args: Record<string, unknown> = { include_stats: input.include_stats ?? true };
if (input.compress) args['compress'] = true;
return args;
},
});
export { getCodebaseOverview, executeGetCodebaseOverview };
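
The `mapArgs` change in this hunk forwards `compress` only when it is truthy, so callers that never set the flag produce the same sidecar payload as before. A minimal sketch of that pattern follows; `OverviewInput` and `mapOverviewArgs` are hypothetical names used for illustration and are not part of the diff.

```typescript
// Hypothetical input shape mirroring GetCodebaseOverviewInput from the diff.
interface OverviewInput {
  include_stats?: boolean;
  compress?: boolean;
}

// Default include_stats to true; add compress only when explicitly enabled,
// so `compress: false` and an omitted flag yield the same payload.
function mapOverviewArgs(input: OverviewInput): Record<string, unknown> {
  const args: Record<string, unknown> = { include_stats: input.include_stats ?? true };
  if (input.compress) args['compress'] = true;
  return args;
}

console.log(JSON.stringify(mapOverviewArgs({})));
// {"include_stats":true}
console.log(JSON.stringify(mapOverviewArgs({ include_stats: false, compress: true })));
// {"include_stats":false,"compress":true}
```

Omitting the key instead of sending `compress: false` keeps the request identical for older callers, which is presumably why the diff builds the object conditionally.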


@@ -18,3 +18,4 @@ export { getCodeHealth } from './get_code_health.js';
export { getCodeImpact } from './get_code_impact.js';
export { getTypeInfo } from './get_type_info.js';
export { getCodeMap } from './get_code_map.js';
export { getWikiArticle } from './get_wiki_article.js';


@@ -0,0 +1,132 @@
/**
* vWhale: run_command tool. Executes a command (no shell) in the project worktree
* and returns stdout/stderr. Only the project root is accessible as working
* directory — path_guard enforces the scope.
*
* Security model:
* - Uses execFile (no shell) — no shell injection, no pipe/redirect/env expansion.
* - args passed as array, never a string.
* - 30s timeout by default, configurable per call (max 120s).
* - 32KB output cap with truncation (same pattern as web_fetch.ts).
* - Working directory restricted to project root via path_guard.
* - No background processes allowed (waits for completion).
*/
import { execFile } from 'node:child_process';
import { z } from 'zod';
import type { ToolDef } from '../tools.js';
const RunCommandInput = z.object({
command: z.string().min(1).max(256),
args: z.array(z.string()).default([]),
description: z.string().max(256).optional(),
timeout_ms: z.number().int().positive().max(120_000).optional(),
});
export type RunCommandInputT = z.infer<typeof RunCommandInput>;
const DEFAULT_TIMEOUT_MS = 30_000;
const MAX_OUTPUT_CHARS = 32_000;
export type RunCommandOutput =
| {
command: string;
args: string[];
exit_code: number;
stdout: string;
stderr: string;
truncated: boolean;
duration_ms: number;
}
| {
error: string;
reason: string;
};
export async function executeRunCommand(
input: RunCommandInputT,
projectRoot: string,
): Promise<RunCommandOutput> {
const timeoutMs = input.timeout_ms ?? DEFAULT_TIMEOUT_MS;
const startTime = Date.now();
return new Promise((resolve) => {
const child = execFile(
input.command,
input.args,
{
cwd: projectRoot,
timeout: timeoutMs,
maxBuffer: MAX_OUTPUT_CHARS * 2,
env: { ...process.env },
},
(err, stdout, stderr) => {
const durationMs = Date.now() - startTime;
// Truncate output if needed
const truncated = stdout.length + stderr.length > MAX_OUTPUT_CHARS;
const cappedStdout = truncated ? stdout.slice(0, MAX_OUTPUT_CHARS) : stdout;
const cappedStderr = truncated ? stderr.slice(0, Math.max(MAX_OUTPUT_CHARS - cappedStdout.length, 0)) : stderr;
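// err.code is a number for a non-zero exit, a string such as 'ENOENT' for a
// spawn failure, and null when the child was killed (e.g. on timeout).
// Anything that is not a numeric exit status is reported as a failure, never 0.
const errCode: unknown = err ? err.code : 0;
const exitCode = errCode === 'ENOENT' ? -1 : typeof errCode === 'number' ? errCode : 1;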
resolve({
command: input.command,
args: input.args,
exit_code: typeof exitCode === 'number' ? exitCode : 1,
stdout: cappedStdout,
stderr: cappedStderr,
truncated,
duration_ms: durationMs,
});
},
);
});
}
export const runCommand: ToolDef<RunCommandInputT> = {
name: 'run_command',
description:
'Run a command (no shell) in the project workspace and return stdout + stderr. ' +
'The command runs in the project root directory. ' +
'Use for: building, testing, linting, git operations, running scripts. ' +
'Output is capped at 32KB. Timeout defaults to 30s (max 120s). ' +
'Security: args are passed as array (no shell injection). No background processes.',
inputSchema: RunCommandInput as unknown as z.ZodType<RunCommandInputT>,
jsonSchema: {
type: 'function',
function: {
name: 'run_command',
description:
'Execute a command in the project workspace. ' +
'Use for builds, tests, linting, git commands, and scripts. ' +
'The process runs with a 30s timeout and 32KB output cap.',
parameters: {
type: 'object',
properties: {
command: {
type: 'string',
description: 'Command to execute (e.g. pnpm, npm, npx, node, git, ls, cat).',
},
args: {
type: 'array',
items: { type: 'string' },
description: 'Arguments as array (e.g. ["run", "build"]). Never embedded in a shell string.',
},
description: {
type: 'string',
description: 'Optional human-readable description of what this command does.',
},
timeout_ms: {
type: 'integer',
description: 'Timeout in milliseconds. Default 30000, max 120000.',
},
},
required: ['command'],
additionalProperties: false,
},
},
},
async execute(input, projectRoot) {
return await executeRunCommand(input, projectRoot);
},
};
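
The callback in `executeRunCommand` has two jobs: turn the `execFile` error into a numeric exit code, and cap the combined output. The sketch below isolates both as pure functions so the edge cases are easy to check. `ExecFailure`, `classifyExit`, and `capOutput` are hypothetical names, and the sketch assumes Node's documented behaviour that `error.code` is the exit status for a non-zero exit and `null` when the child is killed by a signal, as happens on timeout.

```typescript
// Reduced view of the error passed to an execFile callback (hypothetical type).
interface ExecFailure {
  code?: string | number | null;
  killed?: boolean;
}

const MAX_OUTPUT_CHARS = 32_000;

// Map an execFile error to a numeric exit code.
function classifyExit(err: ExecFailure | null): number {
  if (!err) return 0; // clean exit
  if (err.code === 'ENOENT') return -1; // executable not found
  if (typeof err.code === 'number') return err.code; // non-zero exit status
  return 1; // killed by signal (timeout), maxBuffer exceeded, or other failure
}

// Cap combined output: stdout gets priority, stderr gets whatever room is left.
function capOutput(stdout: string, stderr: string, max: number = MAX_OUTPUT_CHARS) {
  const truncated = stdout.length + stderr.length > max;
  if (!truncated) return { stdout, stderr, truncated };
  const cappedStdout = stdout.slice(0, max);
  const cappedStderr = stderr.slice(0, Math.max(max - cappedStdout.length, 0));
  return { stdout: cappedStdout, stderr: cappedStderr, truncated };
}

console.log(classifyExit({ code: null, killed: true })); // 1
console.log(capOutput('abc', 'def', 4)); // stdout 'abc', stderr 'd', truncated true
```

Separating the classification from the process call makes the timeout case testable without spawning anything.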


@@ -23,6 +23,7 @@ import {
getCodeImpact,
getTypeInfo,
getCodeMap,
getWikiArticle,
} from './codecontext/index.js';
// v1.13.17-cross-repo-reads: cross-repo read grant request tool. Paired
// with the pause-on-pending-grant branch in inference/tool-phase.ts and the
@@ -31,6 +32,21 @@ import { requestReadAccess } from '../request_read_access.js';
// v2.6.x: read-only tool that reads a tab's transcript by its session-scoped
// tab number. Needs DB/session context (ToolExecCtx 4th arg).
import { readTabByNumber } from '../read_tab_by_number.js';
// v2.x: memory management tools. File-based store with optional CoreTier
// (SQLite FTS5 + vector) hybrid search backend.
import { extractMemoryTool } from './extract_memory.js';
import { manageMemoryTool } from './manage_memory.js';
import { searchMemoryTool } from './search_memory.js';
// vWhale: command execution tool. Spawns processes in the project worktree
// with timeout and output cap. No shell — args are passed as array.
import { runCommand } from './execute-command.js';
// v2.x: background subagent tools. Non-blocking subagent execution with
// spawn/poll/collect lifecycle. Reuses existing sessions/chats/messages/tasks.
import {
spawnSubagent,
subagentStatus,
subagentResult,
} from './background-subagent-tools.js';
// v1.13.3: alpha-sorted by tool.name at module load. llama.cpp's prompt
// cache hits on byte-identical prefixes; the tool list lives near the top
@@ -85,6 +101,21 @@ export let ALL_TOOLS: ToolDef<unknown>[] = [
getCodeImpact as ToolDef<unknown>,
getTypeInfo as ToolDef<unknown>,
getCodeMap as ToolDef<unknown>,
// v2.8.14-domain2-phase3: wiki mode + token-efficient scanning.
getWikiArticle as ToolDef<unknown>,
// v2.x: memory management tools. File-based store with optional CoreTier
// (SQLite FTS5 + vector) hybrid search backend.
extractMemoryTool as ToolDef<unknown>,
manageMemoryTool as ToolDef<unknown>,
searchMemoryTool as ToolDef<unknown>,
// vWhale: command execution. Spawns processes in the project worktree.
// Read-write, so use with care: restricted to project root via path_guard,
// no shell injection (execFile, not exec).
runCommand as ToolDef<unknown>,
// v2.x: background subagent tools. Non-blocking spawn/poll/collect lifecycle.
spawnSubagent as ToolDef<unknown>,
subagentStatus as ToolDef<unknown>,
subagentResult as ToolDef<unknown>,
].sort((a, b) => a.name.localeCompare(b.name));
export let TOOLS_BY_NAME: Record<string, ToolDef<unknown>> = Object.fromEntries(
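
The comment in this hunk explains why the tool list is sorted: the prompt cache only hits on byte-identical prefixes, so the serialised list must not depend on the order in which tools are imported or appended. A minimal sketch of that idea follows. `Tool` and `buildRegistry` are hypothetical, and the name-to-tool lookup is an assumption about what `TOOLS_BY_NAME` holds, since the hunk cuts off before its body.

```typescript
// Minimal stand-in for ToolDef; only the name matters for ordering.
interface Tool {
  name: string;
}

// Sort a copy by name and derive a lookup table from the sorted list.
function buildRegistry<T extends Tool>(tools: T[]): { list: T[]; byName: Record<string, T> } {
  const list = [...tools].sort((a, b) => a.name.localeCompare(b.name));
  const byName = Object.fromEntries(list.map((t) => [t.name, t])) as Record<string, T>;
  return { list, byName };
}

// Two registration orders serialise to the same bytes after sorting.
const first = buildRegistry([{ name: 'subagent_status' }, { name: 'run_command' }, { name: 'spawn_subagent' }]);
const second = buildRegistry([{ name: 'spawn_subagent' }, { name: 'subagent_status' }, { name: 'run_command' }]);
console.log(JSON.stringify(first.list) === JSON.stringify(second.list)); // true
```

Because the sort runs on a copy, appending new tools anywhere in the source array does not reorder the entries that were already there relative to each other.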


@@ -0,0 +1,376 @@
// v2.8.0: Workflow catalog — built-in workflow definitions that ship with
// BooCode. Each workflow is a metadata object with name, description, and a
// factory function that returns the workflow script source code.
//
// Built-in workflows are merged into the discovery list alongside file-based
// workflows from .boocode/workflows/. They take precedence over user-defined
// workflows with the same name.
import { createHash } from 'node:crypto';
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
/**
* A built-in workflow definition shipped with BooCode.
*/
export interface BuiltinWorkflow {
/** Unique workflow name (used to invoke via `WorkflowManager`). */
name: string;
/** Human-readable description of what this workflow does. */
description: string;
/** Optional ordered phases for UI progress display. */
phases?: Array<{ title: string; detail?: string }>;
/**
* Generate the workflow script source code for this workflow.
* The returned string must be valid JS that exports `meta` and a `default`
* async function matching the `WorkflowScript` shape.
*
* @param args - Optional arguments provided when the workflow is started.
*/
generateScript: (args?: Record<string, unknown>) => string;
}
// ---------------------------------------------------------------------------
// Script templates (shared helpers)
// ---------------------------------------------------------------------------
/**
* Stable JSON serialisation for generating deterministic cache keys from
* structured arguments. Keys are sorted so the same data always produces
* the same string regardless of property insertion order.
*/
function stableJson(value: unknown): string {
if (value === null) return 'null';
if (typeof value !== 'object') return JSON.stringify(value);
if (Array.isArray(value)) {
return `[${value.map(stableJson).join(',')}]`;
}
const keys = Object.keys(value as Record<string, unknown>).sort();
const pairs = keys.map((k) => `${JSON.stringify(k)}:${stableJson((value as Record<string, unknown>)[k])}`);
return `{${pairs.join(',')}}`;
}
/**
* Compute a deterministic SHA-256 fingerprint for a combined prompt + spec +
* args payload. Used by the resumability cache to detect unchanged agent tasks.
*
* Exported for testing.
*/
export function fingerprintAgentTask(
prompt: string,
spec: Record<string, unknown>,
args: string,
): string {
return createHash('sha256')
.update(stableJson({ prompt, spec, args }))
.digest('hex');
}
// ---------------------------------------------------------------------------
// Built-in workflow definitions
// ---------------------------------------------------------------------------
function generateDeepResearchScript(_args?: Record<string, unknown>): string {
return `
export const meta = {
name: 'deep-research',
description: 'Multi-phase deep research: scope, search, fetch, verify, synthesise.',
phases: [
{ title: 'Scope', detail: 'Define the research question and search criteria' },
{ title: 'Search', detail: 'Query web sources in parallel' },
{ title: 'Fetch', detail: 'Retrieve full content from top sources' },
{ title: 'Verify', detail: 'Cross-reference and validate findings' },
{ title: 'Synthesise', detail: 'Produce a final structured report' },
],
};
export default async function main(args) {
const query = args?.query ?? 'No query provided';
log('deep-research: starting with query: ' + query);
// Phase 1: Scope
phase('Scope');
const scope = await agent(
'Analyse this research query and produce a search plan with 3-5 key sub-questions: ' + query,
{ label: 'scope-analysis', phase: 'scope' },
);
log('Scope completed');
// Phase 2: Search
phase('Search');
const searchResults = await agent(
'Based on the scope, search for authoritative sources. Return a list of 3-5 URLs with brief annotations.',
{ label: 'web-search', phase: 'search' },
);
log('Search completed');
// Phase 3: Fetch
phase('Fetch');
const fetchedContent = await agent(
'Extract and summarise the key information from these sources: ' + JSON.stringify(searchResults),
{ label: 'content-fetch', phase: 'fetch' },
);
log('Fetch completed');
// Phase 4: Verify
phase('Verify');
const verified = await agent(
'Cross-reference the fetched information. Note any contradictions, gaps, or weak sources: ' + JSON.stringify(fetchedContent),
{ label: 'verification', phase: 'verify' },
);
log('Verify completed');
// Phase 5: Synthesise
phase('Synthesise');
const report = await agent(
'Synthesise the verified information into a structured report with findings, sources, and confidence levels: ' + JSON.stringify(verified),
{ label: 'synthesis', phase: 'synthesise' },
);
log('deep-research: completed');
return {
ok: true,
output: report,
phases: { scope, searchResults, fetchedContent, verified, report },
};
}
`.trim();
}
function generateReviewCodeScript(_args?: Record<string, unknown>): string {
return `
export const meta = {
name: 'review-code',
description: 'Multi-perspective code review: correctness, security, performance, then synthesise.',
phases: [
{ title: 'Correctness', detail: 'Check logic, edge cases, and correctness' },
{ title: 'Security', detail: 'Analyse for vulnerabilities and unsafe patterns' },
{ title: 'Performance', detail: 'Identify performance bottlenecks and optimisation opportunities' },
{ title: 'Synthesise', detail: 'Merge perspectives into a unified review report' },
],
};
export default async function main(args) {
const target = args?.target ?? args?.path ?? '';
log('review-code: starting review of: ' + (target || '(no target specified)'));
const context = await agent(
'Read the code at ' + (target || 'the provided context') + ' and produce a summary of its structure and purpose.',
{ label: 'read-context', phase: 'context' },
);
// Phase 1: Correctness
phase('Correctness');
const correctness = await agent(
'Review this code for correctness. Check logical errors, edge cases, type safety, and concurrency issues:\\n' + JSON.stringify(context),
{ label: 'correctness-review', phase: 'correctness' },
);
// Phase 2: Security
phase('Security');
const security = await agent(
'Review this code for security vulnerabilities. Check for injection, auth bypasses, unsafe deserialisation, secret exposure:\\n' + JSON.stringify(context),
{ label: 'security-review', phase: 'security' },
);
// Phase 3: Performance
phase('Performance');
const performance = await agent(
'Review this code for performance issues. Check algorithmic complexity, unnecessary allocations, I/O patterns, caching opportunities:\\n' + JSON.stringify(context),
{ label: 'performance-review', phase: 'performance' },
);
// Phase 4: Synthesise
phase('Synthesise');
const report = await agent(
'Merge these three review perspectives into one structured report with severity-ranked findings:\\n' +
'--- Correctness ---\\n' + JSON.stringify(correctness) + '\\n' +
'--- Security ---\\n' + JSON.stringify(security) + '\\n' +
'--- Performance ---\\n' + JSON.stringify(performance),
{ label: 'synthesis', phase: 'synthesise' },
);
log('review-code: completed');
return {
ok: true,
output: report,
reviews: { correctness, security, performance },
};
}
`.trim();
}
function generateFindIssuesScript(_args?: Record<string, unknown>): string {
return `
export const meta = {
name: 'find-issues',
description: 'Iterative issue discovery — keep surfacing issues until consecutive rounds find nothing new.',
phases: [
{ title: 'Analyse', detail: 'Analyse the codebase for issues' },
{ title: 'Check dry', detail: 'Verify no new issues remain' },
],
};
export default async function main(args) {
const target = args?.target ?? args?.path ?? '.';
const maxRounds = args?.maxRounds ?? 5;
log('find-issues: starting on ' + target + ' (max ' + maxRounds + ' rounds)');
const allIssues = [];
let dryRounds = 0;
let round = 0;
while (dryRounds < 2 && round < maxRounds) {
round++;
phase('Analyse');
const context = allIssues.length > 0
? 'Previously found issues (exclude these):\\n' + JSON.stringify(allIssues)
: 'No issues found yet.';
const newIssues = await agent(
'Analyse ' + target + ' for bugs, code smells, and anti-patterns.\\n' + context + '\\nReturn a JSON array of issues. If none found, return an empty array.',
{ label: 'round-' + round + '-analysis', phase: 'analyse' },
);
let parsed = [];
try {
const candidate = typeof newIssues === 'string' ? JSON.parse(newIssues) : newIssues;
// Only accept arrays; any other shape counts as "no new issues".
if (Array.isArray(candidate)) parsed = candidate;
} catch {
parsed = [];
}
if (parsed.length === 0) {
dryRounds++;
phase('Check dry');
log('Round ' + round + ': no new issues found (dry run ' + dryRounds + '/2)');
} else {
dryRounds = 0;
for (const issue of parsed) {
allIssues.push(issue);
}
log('Round ' + round + ': found ' + parsed.length + ' new issue(s)');
}
}
log('find-issues: completed after ' + round + ' rounds, ' + allIssues.length + ' total issues');
return {
ok: true,
output: allIssues,
totalRounds: round,
totalIssues: allIssues.length,
};
}
`.trim();
}
// ---------------------------------------------------------------------------
// Registry
// ---------------------------------------------------------------------------
/**
* All built-in workflow definitions shipped with BooCode.
*/
const BUILTIN_WORKFLOWS: BuiltinWorkflow[] = [
{
name: 'deep-research',
description:
'Performs multi-phase deep research: scope the question, search web sources in parallel, fetch full content, verify findings, and synthesise a structured report.',
phases: [
{ title: 'Scope', detail: 'Define the research question and search criteria' },
{ title: 'Search', detail: 'Query web sources in parallel' },
{ title: 'Fetch', detail: 'Retrieve full content from top sources' },
{ title: 'Verify', detail: 'Cross-reference and validate findings' },
{ title: 'Synthesise', detail: 'Produce a final structured report' },
],
generateScript: generateDeepResearchScript,
},
{
name: 'review-code',
description:
'Multi-perspective code review that analyses code for correctness, security vulnerabilities, and performance issues in sequence, then merges findings into a unified severity-ranked report.',
phases: [
{ title: 'Correctness', detail: 'Check logic, edge cases, and correctness' },
{ title: 'Security', detail: 'Analyse for vulnerabilities and unsafe patterns' },
{ title: 'Performance', detail: 'Identify performance bottlenecks' },
{ title: 'Synthesise', detail: 'Merge perspectives into a unified report' },
],
generateScript: generateReviewCodeScript,
},
{
name: 'find-issues',
description:
'Iterative issue discovery that runs analysis rounds until two consecutive passes find nothing new, ensuring comprehensive coverage without infinite loops.',
phases: [
{ title: 'Analyse', detail: 'Analyse the codebase for issues' },
{ title: 'Check dry', detail: 'Verify no new issues remain' },
],
generateScript: generateFindIssuesScript,
},
];
/**
* Read-only map of built-in workflows keyed by name.
*/
const BUILTIN_WORKFLOW_MAP = new Map<string, BuiltinWorkflow>(
BUILTIN_WORKFLOWS.map((w) => [w.name, w]),
);
/**
* Return all built-in workflow definitions.
*/
export function getBuiltinWorkflows(): BuiltinWorkflow[] {
return BUILTIN_WORKFLOWS;
}
/**
* Look up a built-in workflow by name.
*
* @param name - Workflow name (e.g. 'deep-research').
* @returns The built-in workflow, or undefined if not found.
*/
export function getBuiltinWorkflow(name: string): BuiltinWorkflow | undefined {
return BUILTIN_WORKFLOW_MAP.get(name);
}
/**
* Merge built-in workflow metadata into a list of file-discovered workflow
* entries. Built-in entries take precedence — if a user has a file-based
* workflow with the same name, the built-in version wins.
*
* @param fileWorkflows - Workflow metadata discovered from the filesystem.
* @returns Merged array with built-in workflows injected and duplicate names
* resolved (built-in wins).
*/
export function mergeBuiltinWorkflows(
fileWorkflows: Array<{ name: string; description: string; sourceFile?: string }>,
): Array<{ name: string; description: string; sourceFile?: string }> {
const seen = new Set<string>();
const result: Array<{ name: string; description: string; sourceFile?: string }> = [];
// Built-in workflows first (they take precedence)
for (const builtin of BUILTIN_WORKFLOWS) {
seen.add(builtin.name);
result.push({
name: builtin.name,
description: builtin.description,
// No sourceFile — built-in workflows are generated, not read from disk
});
}
// File-discovered workflows — skip any name already claimed by built-in
for (const fw of fileWorkflows) {
if (seen.has(fw.name)) continue;
seen.add(fw.name);
result.push(fw);
}
return result;
}
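
`stableJson` exists so that two argument objects with the same data but different key order hash to the same fingerprint. The sketch below reproduces the serialiser from this file and pairs it with a hypothetical `fingerprint` wrapper (a simplification of `fingerprintAgentTask`) to show the key-order independence that the resumability cache presumably relies on.

```typescript
import { createHash } from 'node:crypto';

// Serialise with object keys sorted, so insertion order never affects the output.
function stableJson(value: unknown): string {
  if (value === null) return 'null';
  if (typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableJson).join(',')}]`;
  const record = value as Record<string, unknown>;
  const pairs = Object.keys(record)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableJson(record[k])}`);
  return `{${pairs.join(',')}}`;
}

// Hypothetical wrapper: SHA-256 over the stable serialisation.
function fingerprint(payload: unknown): string {
  return createHash('sha256').update(stableJson(payload)).digest('hex');
}

const fpA = fingerprint({ prompt: 'p', spec: { label: 'x', phase: 'scope' } });
const fpB = fingerprint({ spec: { phase: 'scope', label: 'x' }, prompt: 'p' });
console.log(fpA === fpB); // true: same data, different key order
console.log(stableJson({ b: [2, 1], a: null })); // {"a":null,"b":[2,1]}
```

Array order is preserved on purpose: only object keys are sorted, so reordering a list of arguments still produces a different fingerprint.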


@@ -0,0 +1,134 @@
// v2.8.0: Workflow file discovery — walks project-local and global workflow
// directories to find runnable scripts. Built-in workflows from the catalog
// are merged into the results (they take precedence over user-defined files).
// All functions exported for testing.
import { readdirSync, existsSync } from 'node:fs';
import { join, basename, extname } from 'node:path';
import { homedir } from 'node:os';
import { getBuiltinWorkflows, getBuiltinWorkflow } from './catalog.js';
/**
* Sentinel prefix used in `sourceFile` for built-in workflows from the
* catalog so callers (e.g. WorkflowManager) can detect and handle them
* by calling `generateScript()` instead of reading a file from disk.
*/
const BUILTIN_PREFIX = 'builtin:';
/**
* Metadata about a discovered workflow file (or built-in workflow).
*/
export interface WorkflowMeta {
/** Workflow name (file stem without .js extension). */
name: string;
/** Description loaded from the workflow module's `meta.description`.
* Empty string until loadWorkflowMeta() resolves it. */
description: string;
/** Absolute path to the .js file.
* For built-in workflows this is `'builtin:<name>'` — the caller
* should use `getBuiltinWorkflow(name)` and `generateScript()`
* instead of reading this path from disk. */
sourceFile: string;
}
/**
* Test whether a `WorkflowMeta.sourceFile` points to a built-in workflow
* (rather than a file on disk).
*
* @param meta - The workflow metadata to check.
*/
export function isBuiltinWorkflow(meta: WorkflowMeta): boolean {
return meta.sourceFile.startsWith(BUILTIN_PREFIX);
}
/**
* Find all workflow .js files in the standard search paths, merged with
* built-in workflows from the catalog.
*
* Priority order (first match wins for same-named workflows):
* 1. Built-in catalog (always takes precedence)
* 2. <projectRoot>/.boocode/workflows/ (project-local)
* 3. ~/.boocode/workflows/ (global, per-user)
*
* @param projectRoot - Absolute path to the current project root.
*/
export function discoverWorkflows(projectRoot: string): WorkflowMeta[] {
const seen = new Set<string>();
const results: WorkflowMeta[] = [];
// 1. Built-in workflows (highest priority)
for (const builtin of getBuiltinWorkflows()) {
seen.add(builtin.name);
results.push({
name: builtin.name,
description: builtin.description,
sourceFile: `${BUILTIN_PREFIX}${builtin.name}`,
});
}
// 2. Project-local + global file-based workflows
const dirs = [
join(projectRoot, '.boocode', 'workflows'),
join(homedir(), '.boocode', 'workflows'),
];
for (const dir of dirs) {
if (!existsSync(dir)) continue;
try {
const entries = readdirSync(dir);
for (const f of entries) {
if (!f.endsWith('.js')) continue;
const name = basename(f, '.js');
if (seen.has(name)) continue; // built-in shadows project-local,
// project-local shadows global
seen.add(name);
results.push({
name,
description: '',
sourceFile: join(dir, f),
});
}
} catch {
// Permission error on directory — skip silently
continue;
}
}
return results;
}
/**
* Find a single workflow by name across built-in catalog and search paths.
*
* Priority: built-in > project-local > global.
*
* @param name - Workflow name (without .js extension).
* @param projectRoot - Absolute path to the current project root.
*/
export function findWorkflow(
name: string,
projectRoot: string,
): WorkflowMeta | undefined {
// Check built-in catalog first
const builtin = getBuiltinWorkflow(name);
if (builtin) {
return {
name: builtin.name,
description: builtin.description,
sourceFile: `${BUILTIN_PREFIX}${builtin.name}`,
};
}
// Fall back to file-based discovery
return discoverWorkflows(projectRoot).find((w) => w.name === name);
}
/**
* Validate a candidate workflow file path.
* Checks that the file exists and has a .js extension.
*
* @param filePath - Absolute path to check.
*/
export function isValidWorkflowPath(filePath: string): boolean {
return extname(filePath) === '.js' && existsSync(filePath);
}
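
`discoverWorkflows` resolves name clashes with a single `seen` set walked in priority order: built-in first, then project-local, then global. The sketch below shows that first-claim-wins rule on in-memory data. `Entry`, `resolveByPrecedence`, and the example paths are hypothetical and stand in for the catalog and the two directory scans.

```typescript
// Minimal stand-in for WorkflowMeta.
interface Entry {
  name: string;
  sourceFile: string;
}

// Sources are listed highest priority first; the first claim on a name wins.
function resolveByPrecedence(sources: Entry[][]): Entry[] {
  const seen = new Set<string>();
  const results: Entry[] = [];
  for (const source of sources) {
    for (const entry of source) {
      if (seen.has(entry.name)) continue; // shadowed by a higher-priority source
      seen.add(entry.name);
      results.push(entry);
    }
  }
  return results;
}

const merged = resolveByPrecedence([
  [{ name: 'review-code', sourceFile: 'builtin:review-code' }],
  [
    { name: 'review-code', sourceFile: '/proj/.boocode/workflows/review-code.js' },
    { name: 'lint-all', sourceFile: '/proj/.boocode/workflows/lint-all.js' },
  ],
  [{ name: 'lint-all', sourceFile: '/home/u/.boocode/workflows/lint-all.js' }],
]);
console.log(merged.map((e) => e.sourceFile));
// [ 'builtin:review-code', '/proj/.boocode/workflows/lint-all.js' ]
```

One consequence worth noting: under this rule a user cannot override a built-in workflow by dropping a same-named file into `.boocode/workflows/`.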


@@ -0,0 +1,54 @@
// v2.8.0: Dynamic Workflow Engine — public surface.
//
// Re-exports all types and classes from the workflow sub-modules so consumers
// import from a single entry point:
//
// ```typescript
// import { WorkflowManager } from './services/workflow/index.js';
// ```
export { WorkflowManager } from './manager.js';
export type { WorkflowMetaInfo } from './manager.js';
export type { WorkflowEventHandler } from './manager.js';
export { discoverWorkflows, findWorkflow, isValidWorkflowPath, isBuiltinWorkflow } from './discovery.js';
export type { WorkflowMeta } from './discovery.js';
export {
loadWorkflowScript,
loadWorkflowScriptFromCode,
executeWorkflowScript,
executeWorkflowScriptFromCode,
buildSandbox,
transformEsmToCjs,
isEsmSyntax,
} from './sandbox.js';
export {
getBuiltinWorkflows,
getBuiltinWorkflow,
mergeBuiltinWorkflows,
fingerprintAgentTask,
} from './catalog.js';
export type { BuiltinWorkflow } from './catalog.js';
export {
cacheKey,
getCachedResult,
setCachedResult,
invalidateRun,
clearCache,
cacheSize,
} from './resumability.js';
export type { CachedResult } from './resumability.js';
export type {
WorkflowScript,
WorkflowScriptMeta,
WorkflowContext,
AgentTaskSpec,
AgentTaskResult,
WorkflowRun,
WorkflowRunStatus,
WorkflowEvent,
} from './types.js';


@@ -0,0 +1,659 @@
// v2.8.0: WorkflowManager — ties discovery, sandbox, and inference dispatch
// together into a single orchestrator for multi-agent workflow scripts.
//
// Creates isolated sessions+chats for each agent() call within a workflow,
// dispatches inference via the existing pipeline, polls for completion, and
// returns structured results. All failures are returned as errors rather than
// thrown exceptions (catch-safe API).
import { randomUUID } from 'node:crypto';
import type { Sql } from '../../db.js';
import type { Config } from '../../config.js';
import type { FastifyBaseLogger } from 'fastify';
import type { Broker } from '../broker.js';
import type { UserStreamFrame } from '../../types/api.js';
import type {
WorkflowRun,
WorkflowRunStatus,
WorkflowContext,
WorkflowEvent,
AgentTaskSpec,
AgentTaskResult,
WorkflowScriptMeta,
} from './types.js';
import { discoverWorkflows, findWorkflow, isBuiltinWorkflow } from './discovery.js';
import { getBuiltinWorkflow } from './catalog.js';
import { cacheKey, getCachedResult, setCachedResult } from './resumability.js';
import {
executeWorkflowScript,
executeWorkflowScriptFromCode,
isEsmSyntax,
transformEsmToCjs,
} from './sandbox.js';
import { runInference } from '../inference/index.js';
import { readFileSync } from 'node:fs';
import vm from 'node:vm';
/**
* Maximum time to wait for a single agent task to complete (5 minutes).
* Beyond this, the task is treated as failed/timed out.
*/
const AGENT_TASK_TIMEOUT_MS = 300_000;
/**
* Polling interval when waiting for an agent task to finish.
*/
const POLL_INTERVAL_MS = 500;
/**
* Maximum time for the entire workflow run (30 minutes).
*/
const WORKFLOW_TIMEOUT_MS = 1_800_000;
/**
* Token budget tracker. Tracks total token spend across agent calls.
*/
class BudgetTracker {
total: number | null;
#spent = 0;
constructor(total: number | null) {
this.total = total;
}
spend(amount: number): void {
this.#spent += amount;
}
spent(): number {
return this.#spent;
}
remaining(): number {
if (this.total === null) return Infinity;
return Math.max(0, this.total - this.#spent);
}
}
/**
* No-op publish functions for background workflow agent tasks, so those tasks
* carry no WebSocket dependency. Messages are still persisted to the DB.
*/
function noopPublish(): void {
/* intentional no-op */
}
function noopPublishUser(): void {
/* intentional no-op */
}
/**
* Callback type for workflow lifecycle events.
*/
export type WorkflowEventHandler = (event: WorkflowEvent) => void;
/**
* WorkflowManager — the orchestrator for sandboxed multi-agent workflows.
*/
export class WorkflowManager {
/** Active workflow runs by run ID. */
readonly #runs = new Map<string, WorkflowRunState>();
/** Registered event listeners. */
readonly #listeners = new Set<WorkflowEventHandler>();
constructor(
private sql: Sql,
private config: Config,
private log: FastifyBaseLogger,
private projectRoot: string,
private projectId: string,
private broker: Broker,
) {}
// ---- public API ----
/**
* Discover all available workflow scripts.
*/
listWorkflows(): WorkflowMetaInfo[] {
return discoverWorkflows(this.projectRoot).map((m) => ({
name: m.name,
sourceFile: m.sourceFile,
}));
}
/**
* Find a specific workflow by name.
*/
getWorkflow(name: string): WorkflowMetaInfo | undefined {
const found = findWorkflow(name, this.projectRoot);
if (!found) return undefined;
return { name: found.name, sourceFile: found.sourceFile };
}
/**
* Load the metadata (name, description, phases) from a workflow file
* without executing it.
*
* @param name - Workflow name.
* @returns The script's meta, or undefined if not found.
*/
async loadWorkflowMeta(name: string): Promise<WorkflowScriptMeta | undefined> {
const found = findWorkflow(name, this.projectRoot);
if (!found) return undefined;
// Built-in workflows: return meta directly from the catalog
if (isBuiltinWorkflow(found)) {
const builtin = getBuiltinWorkflow(name);
if (!builtin) return { name, description: '' };
return {
name: builtin.name,
description: builtin.description,
phases: builtin.phases,
};
}
try {
// Load meta by executing the script in a throwaway context
const context = this.#createMinimalContext('meta-loader');
const code = readFileSync(found.sourceFile, 'utf8');
const finalCode = isEsmSyntax(code) ? transformEsmToCjs(code) : code;
const sandboxData: Record<string, unknown> & {
module: { exports: Record<string, unknown> };
} = {
...context,
console: { log: () => {} },
module: { exports: {} },
exports: {},
};
vm.createContext(sandboxData as unknown as vm.Context);
new vm.Script(finalCode).runInContext(sandboxData as unknown as vm.Context, {
timeout: 10_000,
filename: found.sourceFile,
});
const meta = sandboxData.module.exports.meta as WorkflowScriptMeta | undefined;
return meta ?? { name, description: '' };
} catch {
return { name, description: '' };
}
}
/**
* Execute a workflow by name.
*
* @param name - The workflow name (without .js extension).
* @param args - Optional arguments to pass to the workflow function.
* @returns The run ID for tracking.
*/
async runWorkflow(
name: string,
args?: Record<string, unknown>,
): Promise<{ runId: string }> {
const found = findWorkflow(name, this.projectRoot);
if (!found) {
throw new Error(`Workflow not found: "${name}". ` +
`Check .boocode/workflows/ or ~/.boocode/workflows/ for a ${name}.js file.`);
}
const runId = randomUUID();
const startedAt = new Date().toISOString();
const state: WorkflowRunState = {
id: runId,
name,
status: 'running',
startedAt,
abortController: new AbortController(),
};
this.#runs.set(runId, state);
this.#emit({ type: 'run_started', runId, name });
// Run asynchronously — caller receives the runId immediately.
void this.#executeRun(state, found.sourceFile, args ?? {});
return { runId };
}
/**
* Get the current status of a workflow run.
*/
getRunStatus(runId: string): WorkflowRun | undefined {
const state = this.#runs.get(runId);
if (!state) return undefined;
return {
id: state.id,
name: state.name,
status: state.status,
started_at: state.startedAt,
finished_at: state.finishedAt,
error: state.error,
};
}
/**
* Cancel a running workflow. Best-effort — agent tasks in-flight will be
* aborted via AbortSignal.
*
* @param runId - The workflow run ID.
* @returns true if the workflow was found and cancelled.
*/
cancelRun(runId: string): boolean {
const state = this.#runs.get(runId);
if (!state || state.status !== 'running') return false;
state.status = 'cancelled';
state.finishedAt = new Date().toISOString();
state.abortController.abort();
this.#emit({ type: 'run_cancelled', runId, name: state.name });
return true;
}
/**
* Subscribe to workflow lifecycle events.
* Returns an unsubscribe function.
*/
onEvent(handler: WorkflowEventHandler): () => void {
this.#listeners.add(handler);
return () => {
this.#listeners.delete(handler);
};
}
// ---- internal execution ----
/**
* Execute the workflow script in the sandbox.
*/
async #executeRun(
state: WorkflowRunState,
sourceFile: string,
args: Record<string, unknown>,
): Promise<void> {
const BUILTIN_MARKER = 'builtin:';
const budgetTracker = new BudgetTracker(null); // no fixed total yet
const runId = state.id;
try {
const context: WorkflowContext = {
agent: (prompt, opts) =>
this.#handleAgentCall(runId, prompt, opts ?? { prompt }, state.abortController.signal),
parallel: (thunks) =>
Promise.all(thunks.map((t) => t())),
pipeline: async (items, ...stages) => {
let result = [...items];
for (const stage of stages) {
result = await Promise.all(result.map(stage));
}
return result;
},
phase: (title) => {
this.#emit({ type: 'phase', runId, title });
},
log: (message) => {
this.#emit({ type: 'log', runId, message });
},
budget: {
total: budgetTracker.total,
spent: () => budgetTracker.spent(),
remaining: () => budgetTracker.remaining(),
},
args,
workflow: (nestedName, nestedArgs) =>
this.#handleNestedWorkflow(runId, nestedName, nestedArgs ?? {}, state.abortController.signal),
};
let result: unknown;
if (sourceFile.startsWith(BUILTIN_MARKER)) {
// Built-in workflow: generate script from catalog and execute
const workflowName = sourceFile.slice(BUILTIN_MARKER.length);
const builtin = getBuiltinWorkflow(workflowName);
if (!builtin) {
throw new Error(`Built-in workflow "${workflowName}" not found in catalog`);
}
const scriptCode = builtin.generateScript(args);
result = await executeWorkflowScriptFromCode(scriptCode, context, args, sourceFile);
} else {
result = await executeWorkflowScript(sourceFile, context, args);
}
// A run cancelled mid-flight keeps its 'cancelled' status. cancelRun() has
// already emitted 'run_cancelled', so don't also report completion.
if (state.status === 'cancelled') return;
state.status = 'completed';
state.finishedAt = new Date().toISOString();
state.result = result;
this.#emit({ type: 'run_completed', runId, name: state.name });
} catch (err) {
if (state.status === 'cancelled') return; // already handled
const message = err instanceof Error ? err.message : String(err);
state.status = 'failed';
state.finishedAt = new Date().toISOString();
state.error = message;
this.#emit({ type: 'run_failed', runId, name: state.name, error: message });
}
}
/**
* Handle an `agent()` call from within a workflow.
* Creates a session + chat, dispatches inference, polls for completion.
*/
async #handleAgentCall(
runId: string,
prompt: string,
spec: AgentTaskSpec,
signal: AbortSignal,
): Promise<unknown> {
const label = spec.label ?? `agent-${prompt.slice(0, 40).replace(/\s+/g, '_')}`;
this.#emit({ type: 'agent_task_started', runId, label });
try {
const result = await this.executeAgentTask(prompt, spec, signal);
this.#emit({ type: 'agent_task_completed', runId, label });
return result;
} catch (err) {
this.#emit({ type: 'agent_task_completed', runId, label });
const message = err instanceof Error ? err.message : String(err);
return {
ok: false,
output: null,
error: message,
} satisfies AgentTaskResult;
}
}
/**
* Core agent task execution: create session/chat, dispatch inference, poll.
*
* Exported as a public method for testing.
*/
async executeAgentTask(
prompt: string,
spec: AgentTaskSpec,
signal?: AbortSignal,
): Promise<unknown> {
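// ---- 0. Check resumability cache before creating a new task ----
// Hash the prompt together with the options: `agent(prompt, opts)` callers
// usually omit `opts.prompt`, and two different prompts must not share a key.
const cacheKeyStr = cacheKey({ ...spec, prompt }, '');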
const cached = getCachedResult(cacheKeyStr);
if (cached) {
return { ...cached, cached: true } satisfies AgentTaskResult;
}
const model = spec.model ?? null;
// ---- 1. Create a session for this agent task ----
const sessionName = `workflow-agent-${spec.label ?? 'task'}`;
const sessionResult = await this.sql.begin(async (tx) => {
const [session] = await tx<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model)
VALUES (${this.projectId}, ${sessionName}, ${model ?? 'qwen3.6-35b-a3b-mxfp4'})
RETURNING id
`;
if (!session) throw new Error('Failed to create workflow agent session');
return session;
});
const sessionId = sessionResult.id;
// ---- 2. Create a chat in this session ----
const chatResult = await this.sql.begin(async (tx) => {
const [chat] = await tx<{ id: string }[]>`
INSERT INTO chats (session_id, name)
VALUES (${sessionId}, ${spec.label ?? null})
RETURNING id
`;
if (!chat) throw new Error('Failed to create workflow agent chat');
return chat;
});
const chatId = chatResult.id;
// ---- 3. Insert user message + streaming assistant message ----
const { userMessageId, assistantMessageId } = await this.sql.begin(async (tx) => {
const [userMsg] = await tx<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'user', ${prompt}, 'complete', clock_timestamp())
RETURNING id
`;
const [assistantMsg] = await tx<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', clock_timestamp())
RETURNING id
`;
return {
userMessageId: userMsg!.id,
assistantMessageId: assistantMsg!.id,
};
});
// ---- 4. Dispatch inference ----
// Create a bounded InferenceContext that won't crash on missing WS
const ctx: import('../inference/types.js').InferenceContext = {
sql: this.sql,
config: this.config,
log: this.log,
publish: noopPublish as unknown as import('../inference/types.js').FramePublisher,
publishUser: noopPublishUser as unknown as (frame: UserStreamFrame) => void,
broker: this.broker,
};
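// Derive a task-local controller from the optional caller signal so the
// inference run can be aborted without touching the caller's controller.
const mergedController = new AbortController();
const onAbort = () => mergedController.abort();
if (signal?.aborted) {
// An already-aborted signal never fires 'abort' again, so propagate it now.
mergedController.abort();
} else {
signal?.addEventListener('abort', onAbort, { once: true });
}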
const inferencePromise = runInference(
ctx,
sessionId,
chatId,
assistantMessageId,
mergedController.signal,
).finally(() => {
signal?.removeEventListener('abort', onAbort);
});
// ---- 5. Poll for completion ----
try {
const result = await this.#pollForCompletion(
chatId,
assistantMessageId,
inferencePromise,
mergedController.signal,
);
// Cache successful results for resumability
if (typeof result === 'object' && result !== null && (result as Record<string, unknown>).ok === true) {
setCachedResult(cacheKeyStr, {
ok: true,
output: (result as Record<string, unknown>).output,
token_usage: (result as Record<string, unknown>).token_usage as
| { prompt: number; completion: number }
| undefined,
});
}
return result;
} catch (err) {
if ((err as Error)?.message === 'cancelled') {
return { ok: false, output: null, error: 'Task was cancelled' } satisfies AgentTaskResult;
}
return {
ok: false,
output: null,
error: err instanceof Error ? err.message : String(err),
} satisfies AgentTaskResult;
}
}
/**
* Poll the messages table until the assistant message status changes
* from 'streaming' to 'complete' / 'failed' / 'cancelled'.
*/
async #pollForCompletion(
chatId: string,
assistantMessageId: string,
inferencePromise: Promise<void>,
signal: AbortSignal,
): Promise<unknown> {
// Rejects on timeout or cancellation. The timer and abort listener are
// released in the `finally` below once the race settles.
let rejectRace!: (err: Error) => void;
const timeout = new Promise<never>((_, reject) => {
rejectRace = reject;
});
const timer = setTimeout(() => {
rejectRace(new Error(`Agent task timed out after ${AGENT_TASK_TIMEOUT_MS}ms`));
}, AGENT_TASK_TIMEOUT_MS);
const onAbort = () => rejectRace(new Error('cancelled'));
if (signal.aborted) onAbort();
else signal.addEventListener('abort', onAbort, { once: true });
// Poll loop: runs until the message leaves 'streaming' or the race settles.
let settled = false;
const pollLoop = (async () => {
while (!settled) {
await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
const rows = await this.sql<{
status: string;
content: string;
tool_calls: unknown;
tokens_used: number | null;
}[]>`
SELECT m.status, m.content, m.role,
(SELECT jsonb_agg(p.payload ORDER BY p.sequence)
FROM message_parts p
WHERE p.message_id = m.id AND p.kind = 'tool_call' AND p.hidden_at IS NULL) AS tool_calls,
m.tokens_used
FROM messages m
WHERE m.id = ${assistantMessageId}
`;
const msg = rows[0];
if (!msg) {
throw new Error(`Assistant message ${assistantMessageId} not found`);
}
if (msg.status === 'complete') {
return {
ok: true,
output: msg.content,
token_usage: msg.tokens_used ? { prompt: 0, completion: msg.tokens_used } : undefined,
};
}
if (msg.status === 'failed' || msg.status === 'cancelled') {
return {
ok: false,
output: msg.content || null,
error: `Assistant message ended with status: ${msg.status}`,
};
}
// Still streaming: continue polling
}
return undefined;
})();
// Race: polling vs timeout vs cancellation. An inference failure is not
// raced directly; it is only observed if it updates the message status,
// otherwise the timeout fires.
try {
return await Promise.race([pollLoop, timeout]);
} finally {
// Stop the poll loop and release the timer and listener.
settled = true;
clearTimeout(timer);
signal.removeEventListener('abort', onAbort);
// Don't block on inference, but keep a late rejection from going unhandled.
inferencePromise.catch(() => {});
}
}
/**
* Handle a nested `workflow()` call from within a workflow.
* Runs the named workflow with the given args and returns its result.
*/
async #handleNestedWorkflow(
parentRunId: string,
name: string,
args: Record<string, unknown>,
signal: AbortSignal,
): Promise<unknown> {
const found = findWorkflow(name, this.projectRoot);
if (!found) {
return { ok: false, output: null, error: `Nested workflow not found: "${name}"` };
}
const nestedRunId = randomUUID();
const startedAt = new Date().toISOString();
const nestedState: WorkflowRunState = {
id: nestedRunId,
name,
status: 'running',
startedAt,
abortController: new AbortController(),
};
this.#runs.set(nestedRunId, nestedState);
this.#emit({ type: 'run_started', runId: nestedRunId, name });
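// Link parent cancellation to the nested run. Mark the nested state as
// cancelled too, so the status check below can actually observe it.
const onParentAbort = () => {
if (nestedState.status === 'running') {
nestedState.status = 'cancelled';
nestedState.finishedAt = new Date().toISOString();
this.#emit({ type: 'run_cancelled', runId: nestedRunId, name });
}
nestedState.abortController.abort();
};
if (signal.aborted) onParentAbort();
else signal.addEventListener('abort', onParentAbort, { once: true });
try {
await this.#executeRun(nestedState, found.sourceFile, args);
} finally {
signal.removeEventListener('abort', onParentAbort);
}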
if (nestedState.status === 'cancelled') {
return { ok: false, output: null, error: 'Nested workflow cancelled' };
}
if (nestedState.status === 'failed') {
return { ok: false, output: null, error: nestedState.error };
}
return { ok: true, output: nestedState.result };
}
/**
* Create a minimal WorkflowContext for non-execution purposes
* (e.g. loading meta).
*/
#createMinimalContext(runId: string): Record<string, unknown> {
return {
agent: () => Promise.reject(new Error('Not available in this context')),
parallel: () => Promise.reject(new Error('Not available in this context')),
pipeline: () => Promise.reject(new Error('Not available in this context')),
phase: () => {},
log: () => {},
budget: { total: null, spent: () => 0, remaining: () => Infinity },
args: {},
workflow: () => Promise.reject(new Error('Not available in this context')),
};
}
/**
* Emit a workflow event to all registered listeners.
*/
#emit(event: WorkflowEvent): void {
for (const handler of this.#listeners) {
try {
handler(event);
} catch {
// Swallow listener errors — one bad listener shouldn't break others
}
}
}
}
// ---- internal types ----
/**
* Metadata returned from listWorkflows / getWorkflow.
*/
export interface WorkflowMetaInfo {
name: string;
sourceFile: string;
}
/**
* Internal mutable state for an active workflow run.
*/
interface WorkflowRunState {
id: string;
name: string;
status: WorkflowRunStatus;
startedAt: string;
finishedAt?: string;
error?: string;
result?: unknown;
abortController: AbortController;
}
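
The polling pattern in `#pollForCompletion` (poll, race against a timeout, honour an `AbortSignal`) can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the manager's code: `pollUntil`, `check`, `intervalMs` and `timeoutMs` are hypothetical names, and the database query is replaced by a caller-supplied `check` callback.

```typescript
// Hypothetical helper: poll `check` until it yields a value, the timeout
// elapses, or the AbortSignal fires. Not part of the workflow manager.
async function pollUntil<T>(
  check: () => Promise<T | undefined>,
  opts: { intervalMs: number; timeoutMs: number; signal?: AbortSignal },
): Promise<T> {
  const { intervalMs, timeoutMs, signal } = opts;
  let settled = false;

  // `guard` never resolves; it only rejects on timeout or cancellation.
  let rejectRace!: (err: Error) => void;
  const guard = new Promise<never>((_, reject) => {
    rejectRace = reject;
  });
  const timer = setTimeout(
    () => rejectRace(new Error(`timed out after ${timeoutMs}ms`)),
    timeoutMs,
  );
  const onAbort = () => rejectRace(new Error('cancelled'));
  if (signal?.aborted) onAbort();
  else signal?.addEventListener('abort', onAbort, { once: true });

  const loop = (async (): Promise<T> => {
    while (!settled) {
      const value = await check();
      if (value !== undefined) return value;
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
    throw new Error('polling stopped');
  })();

  try {
    return await Promise.race([loop, guard]);
  } finally {
    // Stop the loop and release the timer and listener whichever side won.
    settled = true;
    clearTimeout(timer);
    signal?.removeEventListener('abort', onAbort);
    loop.catch(() => {});
  }
}
```

The `finally` block is the part worth copying: without it the timer stays armed after a successful poll, and the loop keeps querying after a timeout or cancellation.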

@@ -0,0 +1,195 @@
// v2.8.0: Workflow resumability cache — SHA-256 hash-based in-memory cache
// for completed agent task results. When a workflow re-runs, completed agents
// with unchanged specs skip execution and return cached results.
//
// The cache is purely in-memory (Map). No DB persistence for v1.
// All functions are exported for testing.
import { createHash } from 'node:crypto';
import type { AgentTaskSpec } from './types.js';
// ---------------------------------------------------------------------------
// Types
// ---------------------------------------------------------------------------
/**
* Shape of a cached agent task result. Mirrors the successful fields of
* `AgentTaskResult` without the runtime-only `cached` flag.
*/
export interface CachedResult {
ok: boolean;
output: unknown;
error?: string;
token_usage?: { prompt: number; completion: number };
}
/**
* Internal cache entry with insertion timestamp for TTL support.
*/
interface CacheEntry {
result: CachedResult;
insertedAt: number;
}
// ---------------------------------------------------------------------------
// Cache store
// ---------------------------------------------------------------------------
/**
* Default TTL for cached entries (30 minutes).
* After this period entries are considered stale and are evicted on access.
*/
const DEFAULT_TTL_MS = 1_800_000;
/**
* Maximum number of entries before the cache starts evicting oldest entries.
*/
const MAX_ENTRIES = 500;
/**
* In-memory cache store: SHA-256 hash → cached result.
*/
const cache = new Map<string, CacheEntry>();
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
/**
* Build a deterministic SHA-256 hash for an agent task specification.
*
* The hash is computed from a stable-ordered JSON serialisation of the spec
* (prompt + options) so that identical specs always produce the same key
* regardless of JavaScript property insertion order.
*
* @param spec - The agent task specification (prompt, options, etc.).
* @param args - Additional arguments string (e.g. workflow args fingerprint).
* @returns A 64-character hex SHA-256 digest.
*/
export function cacheKey(spec: AgentTaskSpec, args: string): string {
const hash = createHash('sha256');
// Stable-sorted serialisation of the spec
hash.update(stableJson(spec));
// Append the args fingerprint
hash.update('\0');
hash.update(args);
return hash.digest('hex');
}
/**
* Look up a cached result by its cache key.
*
* Returns `null` when:
* - The key doesn't exist in the cache.
* - The cached entry has exceeded the TTL (evicted silently).
*
* @param key - The SHA-256 hex key returned by `cacheKey()`.
* @returns The cached result, or `null` if not found or expired.
*/
export function getCachedResult(key: string): CachedResult | null {
const entry = cache.get(key);
if (!entry) return null;
// TTL check — stale entries are evicted on access
if (Date.now() - entry.insertedAt > DEFAULT_TTL_MS) {
cache.delete(key);
return null;
}
return entry.result;
}
/**
* Store an agent task result in the cache.
*
* If the cache has reached `MAX_ENTRIES`, the oldest entry (by insertion time)
* is evicted first. This is a simple FIFO eviction — not a full LRU — because
* workflow runs are expected to exhibit high temporal locality (recently
* completed steps in the current run are the most likely to be re-queried).
*
* @param key - The SHA-256 hex key returned by `cacheKey()`.
* @param result - The result to cache.
*/
export function setCachedResult(key: string, result: CachedResult): void {
// Evict oldest entry if at capacity
if (cache.size >= MAX_ENTRIES) {
let oldestKey: string | undefined;
let oldestTime = Infinity;
for (const [k, entry] of cache) {
if (entry.insertedAt < oldestTime) {
oldestTime = entry.insertedAt;
oldestKey = k;
}
}
if (oldestKey !== undefined) {
cache.delete(oldestKey);
}
}
cache.set(key, {
result,
insertedAt: Date.now(),
});
}
/**
* Delete every cached entry whose key starts with `runKey`.
*
* Caveat: keys produced by `cacheKey()` are opaque SHA-256 hex digests, so a
* run token passed as the `args` parameter is hashed into the key and cannot
* be matched as a prefix. This function only removes entries stored under
* keys that carry an explicit, unhashed run prefix.
*
* @param runKey - The key prefix to invalidate.
*/
export function invalidateRun(runKey: string): void {
for (const key of cache.keys()) {
if (key.startsWith(runKey)) {
cache.delete(key);
}
}
}
/**
* Clear the entire cache. Used for testing and manual reset.
*/
export function clearCache(): void {
cache.clear();
}
/**
* Return the current number of entries in the cache.
* Useful for testing assertions.
*/
export function cacheSize(): number {
return cache.size;
}
// ---------------------------------------------------------------------------
// Internal helpers
// ---------------------------------------------------------------------------
/**
* Stable JSON serialisation that produces the same output string for the same
* data regardless of JavaScript object property insertion order.
*
* - Object keys are sorted lexicographically.
* - Arrays preserve their element order.
* - Primitives are serialised via `JSON.stringify`.
*/
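function stableJson(value: unknown): string {
if (value === null) return 'null';
// JSON.stringify returns undefined for undefined, functions and symbols;
// fall back to 'null' so the return value is always a string.
if (typeof value !== 'object') return JSON.stringify(value) ?? 'null';
if (Array.isArray(value)) {
return `[${value.map(stableJson).join(',')}]`;
}
const record = value as Record<string, unknown>;
// Skip undefined-valued keys, as JSON.stringify does, so `{ a: undefined }`
// and `{}` serialise (and hash) identically.
const keys = Object.keys(record).filter((k) => record[k] !== undefined).sort();
const pairs = keys.map((k) => `${JSON.stringify(k)}:${stableJson(record[k])}`);
return `{${pairs.join(',')}}`;
}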
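
To illustrate why the cache key is built from a key-sorted serialisation, here is a small self-contained sketch. `stableStringify` and `demoCacheKey` are hypothetical stand-ins for the module's private `stableJson()` and exported `cacheKey()`, written only for this demonstration; the spec values are made up.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical stand-in for stableJson(): sorts object keys so that property
// insertion order does not affect the serialised string.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value) ?? 'null';
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  const record = value as Record<string, unknown>;
  const pairs = Object.keys(record)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(record[k])}`);
  return `{${pairs.join(',')}}`;
}

// Hypothetical equivalent of cacheKey(spec, args): spec, NUL separator, args.
function demoCacheKey(spec: Record<string, unknown>, args: string): string {
  return createHash('sha256')
    .update(stableStringify(spec))
    .update('\0')
    .update(args)
    .digest('hex');
}

const keyA = demoCacheKey({ prompt: 'Summarise README', label: 'summary' }, '');
const keyB = demoCacheKey({ label: 'summary', prompt: 'Summarise README' }, '');
const keyC = demoCacheKey({ label: 'summary', prompt: 'Summarise CHANGELOG' }, '');

console.log(keyA === keyB); // true: same spec, different property order
console.log(keyA === keyC); // false: the prompt differs
```

The digest is a 64-character hex string, so nothing about the inputs (including any run token passed as `args`) can be recovered from it or matched as a prefix.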

@@ -0,0 +1,284 @@
// v2.8.0: VM sandbox for executing workflow scripts in an isolated Node.js
// context with a restricted global scope. Uses Node's built-in `vm` module
// (zero additional dependencies).
//
// Workflow scripts can use either CommonJS (`module.exports`) or ESM syntax
// (`export const` / `export default`). ESM syntax is automatically transformed
// to CJS before execution via a lightweight regex transform.
import vm from 'node:vm';
import { readFileSync } from 'node:fs';
import type { WorkflowContext } from './types.js';
/**
* Shared timeout for sandboxed script evaluation.
* Node's `vm` timeout only bounds the synchronous top-level run of the
* script; it does not limit the async main function invoked afterwards.
*/
const EXECUTION_TIMEOUT_MS = 30_000;
/**
* Regex-based ESM-to-CJS transform for workflow scripts.
*
* Handles:
* - `export const|let|var <name> = <value>;` → `const|let|var <name> = module.exports.<name> = <value>;`
* - `export default <expression>;` → `module.exports.default = <expression>;`
*   (covers functions, async functions, generators and classes)
* - `export [async] function <name>` / `export class <name>` → `export` keyword removed
* - `export { <name1>, <name2> }` → line removed (the names are not exported)
*
* `default` is a reserved word, so the default export is assigned to
* `module.exports.default` rather than to a bare `default` identifier, which
* would be a syntax error.
*
* @param code - Raw source code (ESM or CJS).
* @returns Code transformed to CJS assignments suitable for vm.Script.
*/
export function transformEsmToCjs(code: string): string {
return code
// export default <anything> → module.exports.default = <anything>
.replace(/^(\s*)export\s+default\s+/gm, '$1module.exports.default = ')
// export const|let|var name = value → const|let|var name = module.exports.name = value
.replace(
/^(\s*)export\s+(const|let|var)\s+(\w+)\s*=/gm,
(_, indent, decl, name) => `${indent}${decl} ${name} = module.exports.${name} =`,
)
// export [async] function / export class → keep the declaration, drop `export`
.replace(/^(\s*)export\s+(?=(?:async\s+)?function\b|class\b)/gm, '$1')
// export { a, b as c } → remove the line
.replace(/^\s*export\s+\{[^}]*\}\s*;?\s*$/gm, '');
}
/**
* Determine whether code uses ESM export syntax (export keyword at line start
* or after optional whitespace).
*/
export function isEsmSyntax(code: string): boolean {
return /^\s*export\s+(const|let|var|function|class|default|\{)/m.test(code);
}
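/**
* Build a restricted sandbox object with the workflow runtime API.
*
* Note: the built-ins below are the host realm's own objects, and Node's
* documentation states that `node:vm` is not a security mechanism. Treat the
* sandbox as scope hygiene for trusted scripts, not as isolation.
*
* @param context - The WorkflowContext methods to expose to the script.
* @returns A plain object suitable for vm.createContext().
*/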
export function buildSandbox(context: WorkflowContext): Record<string, unknown> {
return {
// --- Workflow API (from context) ---
agent: context.agent,
parallel: context.parallel,
pipeline: context.pipeline,
phase: context.phase,
log: context.log,
budget: context.budget,
args: context.args,
workflow: context.workflow,
// --- Safe built-ins ---
console: {
log: context.log,
warn: context.log,
error: context.log,
},
setTimeout,
clearTimeout,
setInterval: undefined, // intentionally disabled
clearInterval: undefined, // intentionally disabled
Promise,
JSON,
Math,
Date,
RegExp,
Error,
Array,
Object,
String,
Number,
Boolean,
Map,
Set,
WeakMap,
WeakSet,
parseInt,
parseFloat,
isNaN,
isFinite,
Symbol,
BigInt,
undefined,
null: null,
true: true,
false: false,
// --- CommonJS interop ---
module: { exports: {} },
exports: {},
require: undefined, // intentionally disabled
global: undefined, // not exposed; inside the context, `globalThis` is this sandbox object
};
}
/**
* Execute a workflow script in the sandbox and return its default export
* (the main async function).
*
* @param sourceFile - Absolute path to the .js workflow file.
* @param context - The WorkflowContext to expose to the script.
* @returns The workflow's default export function.
* @throws {Error} If the script doesn't export a default async function,
* or if execution fails.
*/
export function loadWorkflowScript(
sourceFile: string,
context: WorkflowContext,
): (...args: unknown[]) => Promise<unknown> {
const code = readFileSync(sourceFile, 'utf8');
const finalCode = isEsmSyntax(code) ? transformEsmToCjs(code) : code;
const rawSandbox = buildSandbox(context);
const sandbox = rawSandbox as Record<string, unknown> & {
module: { exports: Record<string, unknown> };
};
vm.createContext(sandbox);
try {
// `filename` belongs to the vm.Script constructor; runInContext() ignores it.
const script = new vm.Script(finalCode, { filename: sourceFile });
script.runInContext(sandbox, { timeout: EXECUTION_TIMEOUT_MS });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
throw new Error(`Workflow script execution failed: ${msg}`);
}
// Check module.exports first (CJS), then sandbox.default (ESM transform)
const exported = sandbox.module.exports.default ?? sandbox.default;
// Also support `module.exports = async function(...)` (direct assignment)
const mainFn =
typeof sandbox.module.exports === 'function'
? sandbox.module.exports
: exported;
if (typeof mainFn !== 'function') {
const exportedKeys = Object.keys({
...sandbox.module.exports,
...(sandbox.default ? { default: true } : {}),
});
throw new Error(
`Workflow script must export a default async function. ` +
`Found exports: ${exportedKeys.join(', ') || '(none)'}. ` +
`Make sure your script has "export default async function main(args) {...}".`,
);
}
// eslint-disable-next-line @typescript-eslint/no-unsafe-return
return mainFn as (...args: unknown[]) => Promise<unknown>;
}
/**
* Load a workflow script from a source code string (rather than a file).
* Useful for built-in workflows from the catalog that don't have a
* corresponding .js file on disk.
*
* @param code - The JavaScript source code of the workflow.
* @param context - The WorkflowContext to expose.
* @param filename - Virtual filename for stack traces (e.g. 'builtin://deep-research').
* @returns The workflow's default export function.
* @throws {Error} If the script doesn't export a default async function.
*/
export function loadWorkflowScriptFromCode(
code: string,
context: WorkflowContext,
filename?: string,
): (...args: unknown[]) => Promise<unknown> {
const finalCode = isEsmSyntax(code) ? transformEsmToCjs(code) : code;
const rawSandbox = buildSandbox(context);
const sandbox = rawSandbox as Record<string, unknown> & {
module: { exports: Record<string, unknown> };
};
vm.createContext(sandbox);
try {
// `filename` belongs to the vm.Script constructor; runInContext() ignores it.
const script = new vm.Script(finalCode, { filename: filename ?? 'workflow:<anonymous>' });
script.runInContext(sandbox, { timeout: EXECUTION_TIMEOUT_MS });
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
throw new Error(`Workflow script execution failed: ${msg}`);
}
const exported = sandbox.module.exports.default ?? sandbox.default;
const mainFn =
typeof sandbox.module.exports === 'function'
? sandbox.module.exports
: exported;
if (typeof mainFn !== 'function') {
const exportedKeys = Object.keys({
...sandbox.module.exports,
...(sandbox.default ? { default: true } : {}),
});
throw new Error(
`Workflow script must export a default async function. ` +
`Found exports: ${exportedKeys.join(', ') || '(none)'}.`,
);
}
// eslint-disable-next-line @typescript-eslint/no-unsafe-return
return mainFn as (...args: unknown[]) => Promise<unknown>;
}
/**
* High-level convenience: load and execute a workflow script in a single call.
*
* @param sourceFile - Absolute path to the .js workflow file.
* @param context - The WorkflowContext to expose.
* @param args - Optional arguments passed to the workflow function.
* @returns The workflow's return value.
*/
export async function executeWorkflowScript(
sourceFile: string,
context: WorkflowContext,
args?: Record<string, unknown>,
): Promise<unknown> {
const mainFn = loadWorkflowScript(sourceFile, context);
return mainFn(args);
}
/**
* Execute a workflow from source code (string) rather than a file.
* Convenience wrapper around `loadWorkflowScriptFromCode`.
*
* @param code - The JavaScript source code of the workflow.
* @param context - The WorkflowContext to expose.
* @param args - Optional arguments passed to the workflow function.
* @param filename - Virtual filename for stack traces.
* @returns The workflow's return value.
*/
export async function executeWorkflowScriptFromCode(
code: string,
context: WorkflowContext,
args?: Record<string, unknown>,
filename?: string,
): Promise<unknown> {
const mainFn = loadWorkflowScriptFromCode(code, context, filename);
return mainFn(args);
}
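
The load-then-invoke flow above can be reproduced with nothing but `node:vm`. This is a hedged sketch rather than the module's code: `fakeAgent`, `source`, `metaName` and `runDemo` are hypothetical names, the script is already in CommonJS form, and the context exposes only the single `agent` function.

```typescript
import * as vm from 'node:vm';

// A CommonJS-style workflow script, as the sandbox would see it after any
// ESM transform. `agent` is supplied by the host through the context object.
const source = `
module.exports.meta = { name: 'demo', description: 'Sandbox demo' };
module.exports.default = async function main(args) {
  const reply = await agent('Say hello to ' + args.who);
  return reply.output;
};
`;

// Hypothetical stub standing in for the real agent dispatcher.
const fakeAgent = async (prompt: string) => ({ ok: true, output: prompt.toUpperCase() });

const sandbox: vm.Context = {
  agent: fakeAgent,
  module: { exports: {} },
};

vm.createContext(sandbox);
// `filename` goes to the Script constructor; `timeout` bounds only this
// synchronous evaluation of the script body.
new vm.Script(source, { filename: 'demo-workflow.js' }).runInContext(sandbox, {
  timeout: 1_000,
});

const exported = sandbox.module.exports as {
  meta?: { name: string };
  default?: (args: { who: string }) => Promise<unknown>;
};
const metaName = exported.meta?.name;

// Invoking the default export happens outside the vm timeout.
async function runDemo(): Promise<unknown> {
  if (typeof exported.default !== 'function') {
    throw new Error('Workflow script must export a default async function');
  }
  return exported.default({ who: 'world' });
}
```

Evaluating the script only defines `meta` and `default`; the agent call happens later, when the host invokes the exported function. That split is why meta can be read without dispatching any agent work.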

@@ -0,0 +1,128 @@
// v2.8.0: Dynamic Workflow Engine — types for the sandboxed multi-agent
// orchestration runtime. All types are exported for testing.
/**
* The expected shape of a workflow script module.
* Workflow files are plain .js files that export `meta` and `default`:
*
* ```js
* export const meta = {
* name: 'my-workflow',
* description: 'Does something useful in phases',
* phases: [
* { title: 'Research', detail: 'Gather context' },
* { title: 'Implement', detail: 'Make changes' },
* ],
* };
*
* export default async function main(args) {
* const result = await agent('...');
* return result;
* }
* ```
*/
export interface WorkflowScriptMeta {
name: string;
description: string;
phases?: Array<{ title: string; detail?: string }>;
}
export interface WorkflowScript {
meta: WorkflowScriptMeta;
default: (args?: Record<string, unknown>) => Promise<unknown>;
}
/**
* Specification for dispatching a single agent task within a workflow.
*/
export interface AgentTaskSpec {
/** The instruction prompt for the agent. */
prompt: string;
/** Optional human-readable label for this task (shown in UI). */
label?: string;
/** Phase identifier for grouping tasks. */
phase?: string;
/** Model override. When omitted, the workflow manager falls back to its built-in default model. */
model?: string;
/** Zod-style JSON schema for structured output validation. */
schema?: Record<string, unknown>;
/** Required capabilities the agent must have. */
capabilities?: string[];
/** Per-agent tool-call budget ceiling. */
max_tool_calls?: number;
/** Per-agent step cap for the inference loop. */
max_tool_iters?: number;
}
/**
* Result returned after an agent task completes.
*/
export interface AgentTaskResult {
ok: boolean;
output: unknown;
error?: string;
token_usage?: { prompt: number; completion: number };
/** True when this result was served from the resumability cache
* rather than re-executing the agent task. */
cached?: boolean;
}
/**
* Runtime context passed into every workflow script's default function.
* Mirrors the Claude Code-compatible API surface.
*/
export interface WorkflowContext {
/** Dispatch a single agent prompt. Resolves to an `AgentTaskResult`; failures resolve with `ok: false` rather than rejecting. */
agent: (prompt: string, opts?: AgentTaskSpec) => Promise<unknown>;
/** Run multiple independent tasks concurrently. Returns results in order. */
parallel: (thunks: Array<() => Promise<unknown>>) => Promise<unknown[]>;
/** Pass items through a sequence of transform stages. */
pipeline: (
items: unknown[],
...stages: Array<(item: unknown) => Promise<unknown>>
) => Promise<unknown[]>;
/** Announce the current execution phase (for UI progress). */
phase: (title: string) => void;
/** Emit a log message for this workflow run. */
log: (message: string) => void;
/** Token budget tracker for the current run. */
budget: {
total: number | null;
spent: () => number;
remaining: () => number;
};
/** The arguments passed when this workflow was started. */
args: Record<string, unknown>;
/** Call another workflow from within a workflow (nested). */
workflow: (name: string, args?: Record<string, unknown>) => Promise<unknown>;
}
/**
* Status of a workflow execution run.
*/
export type WorkflowRunStatus = 'running' | 'completed' | 'failed' | 'cancelled';
/**
* Public snapshot of a workflow run's status, as returned by `getRunStatus()`.
*/
export interface WorkflowRun {
id: string;
name: string;
status: WorkflowRunStatus;
started_at: string;
finished_at?: string;
error?: string;
}
/**
* Event emitted by the workflow manager for subscribers.
*/
export type WorkflowEvent =
| { type: 'run_started'; runId: string; name: string }
| { type: 'run_completed'; runId: string; name: string }
| { type: 'run_failed'; runId: string; name: string; error: string }
| { type: 'run_cancelled'; runId: string; name: string }
| { type: 'phase'; runId: string; title: string }
| { type: 'log'; runId: string; message: string }
| { type: 'agent_task_started'; runId: string; label?: string }
| { type: 'agent_task_completed'; runId: string; label?: string };
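
The ordering guarantees of `parallel` and `pipeline` are easiest to see with plain async functions. The sketch below mirrors the shapes declared in `WorkflowContext`; the `Stage` alias, `double`, `increment` and `demo` are hypothetical names used only for illustration.

```typescript
type Stage = (item: unknown) => Promise<unknown>;

// Same shape as WorkflowContext.parallel: start every thunk at once and
// return the results in input order.
const parallel = (thunks: Array<() => Promise<unknown>>): Promise<unknown[]> =>
  Promise.all(thunks.map((t) => t()));

// Same shape as WorkflowContext.pipeline: each stage runs over all items
// concurrently, and the next stage starts only after the previous one has
// finished for every item.
async function pipeline(items: unknown[], ...stages: Stage[]): Promise<unknown[]> {
  let result = [...items];
  for (const stage of stages) {
    result = await Promise.all(result.map((item) => stage(item)));
  }
  return result;
}

const double: Stage = async (n) => (n as number) * 2;
const increment: Stage = async (n) => (n as number) + 1;

async function demo(): Promise<{ piped: unknown[]; fanned: unknown[] }> {
  const piped = await pipeline([1, 2, 3], double, increment); // [3, 5, 7]
  const fanned = await parallel([async () => 'a', async () => 'b']); // ['a', 'b']
  return { piped, fanned };
}
```

Because both helpers rest on `Promise.all`, a single rejected stage or thunk rejects the whole call. `agent()` resolves with `ok: false` instead of rejecting, so a workflow that fans out agent calls should inspect each result.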

@@ -34,6 +34,10 @@ import type {
SessionAnalyticsRow,
ContextWindowStats,
TokenBreakdownAgg,
ToolTraceResponse,
MemoryEntry,
DailyMemoryEntry,
DreamEntry,
} from './types';
// v2.6 Phase 1-UX §9b: chat-scoped agent-session rows. Returned by
@@ -340,6 +344,10 @@ export const api = {
method: 'POST',
body: JSON.stringify({ tool_call_id: toolCallId, decision }),
}),
getTraces: (chatId: string, limit = 50, offset = 0) =>
request<ToolTraceResponse>(
`/api/chats/${chatId}/traces?limit=${limit}&offset=${offset}`,
),
},
messages: {
@@ -608,6 +616,22 @@ export const api = {
tokenBreakdown: () => request<{ categories: TokenBreakdownAgg[] }>('/api/coder/analytics/token-breakdown'), tokenBreakdown: () => request<{ categories: TokenBreakdownAgg[] }>('/api/coder/analytics/token-breakdown'),
}, },
// memory-browser-ui: topic-based memory, daily log, dream diaries.
memory: {
list: (projectId: string) =>
request<{ entries: MemoryEntry[] }>(
`/api/memory?project_id=${encodeURIComponent(projectId)}`,
),
daily: (projectId: string) =>
request<{ entries: DailyMemoryEntry[] }>(
`/api/memory/daily?project_id=${encodeURIComponent(projectId)}`,
),
dreams: (projectId: string) =>
request<{ entries: DreamEntry[] }>(
`/api/memory/dreams?project_id=${encodeURIComponent(projectId)}`,
),
},
settings: {
get: () => request<Record<string, unknown>>('/api/settings'),
patch: (body: Record<string, unknown>) =>

@@ -101,6 +101,7 @@ export interface Chat {
id: string;
session_id: string;
name: string | null;
model: string | null;
status: ChatStatus;
created_at: string;
updated_at: string;
@@ -131,6 +132,10 @@ export interface ToolResult {
output: unknown;
truncated: boolean;
error?: string;
// v2.8: unified diff snippet for write-tool results. Present when the tool
// modified files (edit_file, create_file, etc.) and the backend computed a
// diff. Rendered inline by DiffSnippet.
diff?: string;
}
// v1.8.2 / v1.11.6: ErrorReason + MessageMetadata single-sourced in
@@ -172,6 +177,10 @@ export interface Message {
// (CoderPane/CoderMessageList) and streams it live via reasoning_delta
// frames. MessageBubble reads whichever of the two is present.
reasoning_text?: string | null;
// v2.8-compare: compare group id. Set when the message is part of a
// multi-model compare response. All assistant messages in the same compare
// group share this id, keyed to the user message that triggered the compare.
compare_group_id?: string;
// v1.11: anchored rolling compaction fields. Optional on the wire so that
// older API responses (or test fixtures) parse without explicit nulls.
// summary — true on the assistant row that holds the active
@@ -513,8 +522,8 @@ export interface WorkspaceState {
export type WsFrame =
| { type: 'snapshot'; messages: Message[] }
- | { type: 'message_started'; message_id: string; chat_id?: string; role: MessageRole }
+ | { type: 'message_started'; message_id: string; chat_id?: string; role: MessageRole; compare_group_id?: string }
- | { type: 'delta'; message_id: string; chat_id?: string; content: string }
+ | { type: 'delta'; message_id: string; chat_id?: string; content: string; compare_group_id?: string }
| { type: 'tool_call'; message_id: string; chat_id?: string; tool_call: ToolCall }
| {
type: 'tool_result';
@@ -524,6 +533,7 @@ export type WsFrame =
output: unknown;
truncated: boolean;
error?: string;
diff?: string;
}
| {
type: 'message_complete';
@@ -532,6 +542,8 @@ export type WsFrame =
tokens_used?: number | null;
ctx_used?: number | null;
ctx_max?: number | null;
cache_tokens?: number | null;
reasoning_tokens?: number | null;
started_at?: string | null;
finished_at?: string | null;
// model-attribution: the model that produced this assistant message.
@@ -545,6 +557,7 @@ export type WsFrame =
// 'cancelled' on a user Stop / stall and 'failed' on a thrown error so the
// reducer renders a muted "Stopped" / failed state — no new frame type.
status?: 'complete' | 'cancelled' | 'failed';
compare_group_id?: string;
}
// v1.12.2: live throughput frame, published mid-stream every ~500ms with
// the latest token + ctx counts so ChatThroughput can render tok/s and
@@ -559,6 +572,14 @@ export type WsFrame =
}
| { type: 'messages_deleted'; message_ids: string[]; chat_id?: string }
| { type: 'chat_renamed'; chat_id: string; name: string }
| {
type: 'agent_snapshot';
chat_id: string;
agent?: string | null;
model: string;
mode?: string | null;
turn_number: number;
}
// v1.11: published by services/compaction.ts after the new anchored
// summary row lands. Carries the new summary row id for diagnostics; the
// session-stream handler ignores the id and re-fetches the full message
@@ -566,7 +587,7 @@ export type WsFrame =
| { type: 'compacted'; session_id: string; chat_id: string; summary_message_id: string }
// v1.8.2: `reason` discriminates structured failures (the UI prefers it
// over `error` text when present).
- | { type: 'error'; message_id?: string; chat_id?: string; error: string; reason?: ErrorReason }
+ | { type: 'error'; message_id?: string; chat_id?: string; error: string; reason?: ErrorReason; compare_group_id?: string }
// agent-status-normalize (#10): BooCoder publishes a normalized per-(chat,agent)
// lifecycle status for external coding agents on the per-session channel. The
// CoderPane tracks the latest status per (chat_id, agent) and resets on chat
@@ -602,6 +623,31 @@ export type WsFrame =
run_status?: 'running' | 'completed' | 'failed' | 'cancelled';
report?: string;
}
// tool trace frames: per-tool-call lifecycle tracking
| {
type: 'tool_trace_start';
trace_id: string;
message_id: string;
chat_id: string;
tool_name: string;
tool_input: Record<string, unknown>;
started_at: string;
}
| {
type: 'tool_trace_finish';
trace_id: string;
message_id: string;
chat_id: string;
tool_name: string;
tool_output?: string | null;
latency_ms?: number;
tokens_used?: number | null;
cache_tokens?: number | null;
reasoning_tokens?: number | null;
error?: string;
outcome?: string;
finished_at: string;
}
// arena frames: battle lifecycle + per-contestant streaming
| {
type: 'battle_started';
@@ -628,8 +674,66 @@ export type WsFrame =
winner_contestant_id?: string | null;
analysis_ready?: boolean;
cross_exam_id?: string;
}
// streaming v2: channel-delta frames. Each carries a monotonic seq for
// out-of-order buffering and a channel discriminator; per-channel payloads
// map to the equivalent legacy frame types after reordering.
| {
type: 'channel_delta';
seq: number;
channel: 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';
message_id?: string;
chat_id?: string;
content?: string;
compare_group_id?: string;
tool_call?: ToolCall;
tool_message_id?: string;
tool_call_id?: string;
output?: unknown;
truncated?: boolean;
diff?: string;
error?: string;
reason?: string;
status?: 'running' | 'complete' | 'cancelled' | 'failed';
tokens_used?: number | null;
ctx_used?: number | null;
ctx_max?: number | null;
cache_tokens?: number | null;
reasoning_tokens?: number | null;
started_at?: string | null;
finished_at?: string | null;
model?: string | null;
metadata?: MessageMetadata | null;
};
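
The `channel_delta` comment says each frame carries a monotonic `seq` for out-of-order buffering. A minimal sketch of such a buffer follows. It is an illustration, not the project's reducer: the `ReorderBuffer` class and the `SeqFrame` stand-in are hypothetical, and the sketch assumes `seq` starts at a known value and increases by exactly 1 per frame, which the diff does not confirm.

```typescript
// Minimal stand-in for a channel_delta frame; only `seq` matters here.
interface SeqFrame {
  seq: number;
  content?: string;
}

// Hypothetical reorder buffer (assumption: contiguous seq, step of 1).
class ReorderBuffer<T extends SeqFrame> {
  private pending = new Map<number, T>();
  private nextSeq: number;

  constructor(firstSeq = 0) {
    this.nextSeq = firstSeq;
  }

  // Accept one frame and return every frame that is now deliverable in order.
  push(frame: T): T[] {
    // Frames older than the delivery cursor are stale or duplicates.
    if (frame.seq < this.nextSeq) return [];
    this.pending.set(frame.seq, frame);

    const ready: T[] = [];
    let next = this.pending.get(this.nextSeq);
    while (next !== undefined) {
      ready.push(next);
      this.pending.delete(this.nextSeq);
      this.nextSeq += 1;
      next = this.pending.get(this.nextSeq);
    }
    return ready;
  }
}
```

A frame that arrives early is held until the gap before it is filled; a production version would likely also need a timeout or gap-skip policy so one lost frame cannot stall the stream.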
// tool traces: per-tool-call record returned by GET /api/chats/:id/traces.
export interface ToolTrace {
id: string;
session_id: string;
chat_id: string;
message_id: string | null;
turn_number: number;
tool_name: string;
tool_input: Record<string, unknown>;
tool_output: string | null;
started_at: string;
finished_at: string | null;
latency_ms: number | null;
tokens_used: number | null;
cache_tokens: number | null;
reasoning_tokens: number | null;
error: string | null;
outcome: string | null;
created_at: string;
}
export interface ToolTraceResponse {
data: ToolTrace[];
total: number;
limit: number;
offset: number;
}
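
The `ToolTraceResponse` envelope carries `total`, `limit`, and `offset`, which suggests offset-based paging. The sketch below shows how a caller might compute the next offset. The `Page` type mirrors the envelope, `nextOffset` is a hypothetical helper, and the sketch assumes `total` is the full row count for the chat, which the diff does not state.

```typescript
// Mirrors the shape of ToolTraceResponse, generic over the row type.
interface Page<T> {
  data: T[];
  total: number;
  limit: number;
  offset: number;
}

// Hypothetical helper (assumption): return the offset of the next page,
// or null when there is nothing more to fetch.
function nextOffset(page: Page<unknown>): number | null {
  const next = page.offset + page.data.length;
  // An empty page ends paging even if `total` claims more rows, which
  // guards against looping forever on an inconsistent count.
  if (page.data.length === 0 || next >= page.total) return null;
  return next;
}
```

A caller would pass the returned value as the `offset` argument of the `getTraces` call added in this change and stop once `nextOffset` returns `null`.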
// token-analyzer-ui: aggregate token/cost analytics types.
export interface AnalyticsSummary {
total_input_tokens: number;
@@ -658,3 +762,21 @@ export interface TokenBreakdownAgg {
category: string;
total_tokens: number;
}
// ── Memory browser types ────────────────────────────────────────────
export interface MemoryEntry {
id: string;
topic: string;
title: string;
content: string;
tags: string[];
}
export interface DailyMemoryEntry extends MemoryEntry {
date: string;
}
export interface DreamEntry {
date: string;
content: string;
}

@@ -0,0 +1,38 @@
// vDeepSeek: cache shape telemetry badge. Displays the cache token count with
// a colored hit-rate bar in the trace viewer. The trace doesn't carry prompt
// miss tokens separately, so output tokens (tokens_used) stand in for them and
// the rate is cacheTokens / (cacheTokens + tokens_used).
// Thresholds: green > 50%, yellow > 10%, red ≤ 10%.
export interface CacheShapeBadgeProps {
cacheTokens: number | null | undefined;
totalTokens: number | null | undefined;
}
function hitRate(cache: number, total: number): number {
if (cache <= 0 || total <= 0) return 0;
return cache / (cache + total);
}
function barColor(rate: number): string {
if (rate > 0.5) return 'bg-green-500';
if (rate > 0.1) return 'bg-yellow-500';
return 'bg-red-500';
}
export function CacheShapeBadge({ cacheTokens, totalTokens }: CacheShapeBadgeProps) {
if (cacheTokens == null || cacheTokens <= 0) return null;
const rate = hitRate(cacheTokens, totalTokens ?? 0);
const pct = Math.round(rate * 100);
const color = barColor(rate);
return (
<span className="shrink-0 inline-flex items-center gap-1 font-mono tabular-nums text-[10px] text-muted-foreground/60" title={`cache hit rate ${pct}%`}>
<span className={`inline-block w-1.5 h-3 rounded-sm ${color}`} />
<span>{cacheTokens}c</span>
{totalTokens != null && totalTokens > 0 && (
<span className="text-muted-foreground/40">{pct}%</span>
)}
</span>
);
}
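
To make the thresholds concrete, the sketch below runs three sample inputs through the two helpers. The functions are copied from the component above so the block stands alone; the sample values are hypothetical and chosen only to land in each color band.

```typescript
// Copied from the component above so the example is self-contained.
function hitRate(cache: number, total: number): number {
  if (cache <= 0 || total <= 0) return 0;
  return cache / (cache + total);
}

function barColor(rate: number): string {
  if (rate > 0.5) return 'bg-green-500';
  if (rate > 0.1) return 'bg-yellow-500';
  return 'bg-red-500';
}

// Hypothetical [cacheTokens, tokens_used] pairs:
//   300 / (300 + 100) = 0.75   -> green
//   100 / (100 + 100) = 0.50   -> yellow (0.5 is not strictly greater than 0.5)
//    10 / ( 10 + 100) ≈ 0.09   -> red
const samples: Array<[number, number]> = [
  [300, 100],
  [100, 100],
  [10, 100],
];
const colors = samples.map(([cache, total]) => barColor(hitRate(cache, total)));
```

Because the denominator is `cache + total`, the bar turns green only when cache tokens strictly exceed `tokens_used`.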

@@ -0,0 +1,88 @@
import { useMemo, useState } from 'react';
import { ChevronDown, ChevronRight, FileCode } from 'lucide-react';
interface Props {
diff: string;
}
const INITIAL_LINES = 10;
export function DiffSnippet({ diff }: Props) {
const [expanded, setExpanded] = useState(false);
const lines = useMemo(() => diff.split('\n'), [diff]);
const totalLines = lines.length;
// Index of the first content line, skipping the leading ---/+++ file headers.
const isHeaderLine = (l: string) => l.startsWith('---') || l.startsWith('+++');
const firstContentIdx = lines.findIndex(
(l) => !isHeaderLine(l) && (l.startsWith('+') || l.startsWith('-') || l.startsWith(' ')),
);
// Count added and removed lines only; headers and context lines are not changes.
const contentLineCount = lines.filter(
(l) => !isHeaderLine(l) && (l.startsWith('+') || l.startsWith('-')),
).length;
// Collapsed view: everything up to the first content line, plus INITIAL_LINES lines.
const displayLines = useMemo(() => {
const start = Math.max(firstContentIdx, 0);
const sliceEnd = expanded ? lines.length : Math.min(start + INITIAL_LINES, lines.length);
return lines.slice(0, sliceEnd);
}, [lines, expanded, firstContentIdx]);
const hasMore = totalLines > displayLines.length;
// ''.split('\n') yields [''], so test the input itself rather than the line count.
if (diff.trim() === '') return null;
return (
<div className="mt-1 rounded border border-border/40 bg-muted/20 overflow-hidden">
<button
type="button"
onClick={() => setExpanded((v) => !v)}
className="flex items-center gap-1.5 w-full px-2 py-1 text-left hover:bg-muted/30 text-[10px] font-mono text-muted-foreground"
>
<FileCode className="size-3 shrink-0" />
<span className="font-medium">diff</span>
<span className="text-muted-foreground/60">
{contentLineCount} line{contentLineCount === 1 ? '' : 's'} changed
</span>
<span className="ml-auto shrink-0">
{expanded ? <ChevronDown className="size-3" /> : <ChevronRight className="size-3" />}
</span>
</button>
<div className="px-0 pb-0.5">
{displayLines.map((line, i) => {
// Classify by prefix; ---/+++ file headers must be tested before +/-.
const isFileHeader = line.startsWith('---') || line.startsWith('+++');
const isAdded = !isFileHeader && line.startsWith('+');
const isRemoved = !isFileHeader && line.startsWith('-');
let colorClass = 'text-muted-foreground/60';
if (isFileHeader) colorClass = 'text-muted-foreground/50';
else if (line.startsWith('@@')) colorClass = 'text-muted-foreground';
else if (isAdded) colorClass = 'text-emerald-600 dark:text-emerald-400';
else if (isRemoved) colorClass = 'text-red-500 dark:text-red-400';
return (
<div
key={i}
className={`leading-[1.3] px-2 text-[10px] font-mono whitespace-pre ${colorClass} ${
isAdded ? 'bg-emerald-500/5' : isRemoved ? 'bg-red-500/5' : ''
}`}
>
{line}
</div>
);
})}
{hasMore && !expanded && (
<button
type="button"
onClick={() => setExpanded(true)}
className="w-full text-left px-2 py-0.5 text-[10px] font-mono text-muted-foreground/60 hover:text-muted-foreground hover:bg-muted/30"
>
Show {totalLines - displayLines.length} more lines
</button>
)}
</div>
</div>
);
}
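
Prefix-based classification of unified diff lines is order-sensitive, because `---` and `+++` file headers also start with `-` and `+`. The sketch below shows one way to classify lines and count changes. It is not the component's code: `classifyDiff` and `countChanged` are hypothetical helpers, and the sketch assumes a single-file diff in which every line before the first `@@` hunk header is header material.

```typescript
type DiffLineKind = 'file_header' | 'hunk_header' | 'added' | 'removed' | 'context';

// Hypothetical classifier (assumption: single-file unified diff).
function classifyDiff(diff: string): DiffLineKind[] {
  let inHunk = false;
  return diff.split('\n').map((line): DiffLineKind => {
    if (line.startsWith('@@')) {
      inHunk = true;
      return 'hunk_header';
    }
    // Everything before the first hunk header (---, +++, index lines) is a header.
    if (!inHunk) return 'file_header';
    if (line.startsWith('+')) return 'added';
    if (line.startsWith('-')) return 'removed';
    return 'context';
  });
}

// Count only lines that were actually added or removed.
function countChanged(kinds: DiffLineKind[]): number {
  return kinds.filter((k) => k === 'added' || k === 'removed').length;
}
```

Tracking whether the first hunk has started avoids misreading a removed line whose content begins with `--` as a file header, at the cost of not supporting multi-file diffs.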

@@ -1,10 +1,14 @@
- import { useCallback, useEffect, useMemo, useRef } from 'react';
+ import { useCallback, useEffect, useMemo, useRef, useState } from 'react';
import { motion } from 'framer-motion';
import { Virtuoso, type VirtuosoHandle } from 'react-virtuoso';
import { Pin } from 'lucide-react';
import type { Chat, Message } from '@/api/types';
import { MessageBubble } from './MessageBubble';
import { ToolCallGroup } from './ToolCallGroup';
import { ToolCallLine, type ToolRun } from './ToolCallLine';
import { AskUserInputCard } from './AskUserInputCard';
import { RequestReadAccessCard } from './RequestReadAccessCard';
import { MessageListErrorBoundary } from './MessageListErrorBoundary';
interface Props {
messages: Message[];
@@ -142,27 +146,63 @@ function stampCapHits(items: RenderItem[]): RenderItem[] {
});
}
const SCROLL_THRESHOLD_PX = 150;
export function MessageList({ messages, sessionChats }: Props) { export function MessageList({ messages, sessionChats }: Props) {
const endRef = useRef<HTMLDivElement>(null); const virtuosoRef = useRef<VirtuosoHandle>(null);
const scrollContainerRef = useRef<HTMLDivElement>(null);
const isNearBottomRef = useRef(true); const isNearBottomRef = useRef(true);
const renderedKeysRef = useRef(new Set<string>());
const prefersReducedMotionRef = useRef(false);
const [animateEnabled, setAnimateEnabled] = useState(true);
const [pinMessageId, setPinMessageId] = useState<string | null>(() => {
if (typeof window !== 'undefined') {
const hash = window.location.hash;
if (hash.startsWith('#pin=')) return hash.slice(5);
}
return null;
});
const renderItems = useMemo(() => stampCapHits(group(flatten(messages))), [messages]); const renderItems = useMemo(() => stampCapHits(group(flatten(messages))), [messages]);
const handleScroll = useCallback(() => { const pinIndex = useMemo(() => {
const el = scrollContainerRef.current; if (!pinMessageId) return -1;
if (!el) return; return renderItems.findIndex(
isNearBottomRef.current = (item) => item.kind === 'message' && item.message.id === pinMessageId,
el.scrollHeight - el.scrollTop - el.clientHeight < SCROLL_THRESHOLD_PX; );
}, [pinMessageId, renderItems]);
useEffect(() => {
const mq = window.matchMedia('(prefers-reduced-motion: reduce)');
prefersReducedMotionRef.current = mq.matches;
const handler = (e: MediaQueryListEvent) => {
prefersReducedMotionRef.current = e.matches;
};
mq.addEventListener('change', handler);
return () => mq.removeEventListener('change', handler);
}, []); }, []);
useEffect(() => { useEffect(() => {
if (isNearBottomRef.current) { const handler = () => {
endRef.current?.scrollIntoView({ block: 'end' }); const hash = window.location.hash;
if (hash.startsWith('#pin=')) {
setPinMessageId(hash.slice(5));
} else {
setPinMessageId(null);
} }
}, [messages]); };
window.addEventListener('hashchange', handler);
return () => window.removeEventListener('hashchange', handler);
}, []);
const atBottomStateChange = useCallback((atBottom: boolean) => {
isNearBottomRef.current = atBottom;
setAnimateEnabled(atBottom);
}, []);
const scrollToPin = useCallback(() => {
if (pinIndex >= 0 && virtuosoRef.current) {
virtuosoRef.current.scrollToIndex({ index: pinIndex, align: 'center' });
}
}, [pinIndex]);
if (messages.length === 0) {
return (
@@ -173,46 +213,78 @@ export function MessageList({ messages, sessionChats }: Props) {
}
return (
<div className="flex-1 overflow-y-auto" ref={scrollContainerRef} onScroll={handleScroll}> <MessageListErrorBoundary>
<div className="max-w-[1000px] mx-auto w-full px-6 py-4 space-y-4"> <div className="flex-1 flex flex-col">
{renderItems.map((item) => { {pinMessageId && pinIndex >= 0 && (
if (item.kind === 'message') { <div className="shrink-0 flex items-center gap-2 px-4 py-1.5 bg-primary/10 border-b border-primary/20 text-xs text-primary">
<Pin className="size-3" />
<span>Pinned message</span>
<button
type="button"
onClick={scrollToPin}
className="ml-auto underline hover:no-underline"
>
Jump to pinned
</button>
</div>
)}
<Virtuoso
ref={virtuosoRef}
className="flex-1"
data={renderItems}
followOutput="auto"
overscan={5}
atBottomStateChange={atBottomStateChange}
itemContent={(index, item) => {
const key = item.kind === 'message' ? `msg-${item.message.id}` : item.key;
const isNew = !renderedKeysRef.current.has(key);
if (isNew) renderedKeysRef.current.add(key);
const reducedMotion = prefersReducedMotionRef.current;
const delay = isNew && !reducedMotion ? Math.min(index * 0.04, 0.5) : 0;
const shouldAnimate = isNew && animateEnabled;
return ( return (
<div
className="max-w-[1000px] mx-auto w-full px-6 py-2"
id={item.kind === 'message' ? `msg-${item.message.id}` : undefined}
>
<motion.div
initial={shouldAnimate ? { opacity: 0, y: 8 } : false}
animate={{ opacity: 1, y: 0 }}
transition={delay > 0 ? { duration: 0.2, delay } : { duration: 0 }}
>
{item.kind === 'message' ? (
<MessageBubble <MessageBubble
key={item.message.id}
message={item.message} message={item.message}
sessionChats={sessionChats} sessionChats={sessionChats}
capHitInfo={item.capHitInfo} capHitInfo={item.capHitInfo}
/> />
); ) : item.kind === 'tool_run' ? (
} item.run.call.name === 'ask_user_input' ? (
if (item.kind === 'tool_run') {
if (item.run.call.name === 'ask_user_input') {
return (
<AskUserInputCard <AskUserInputCard
key={item.key}
toolCall={item.run.call} toolCall={item.run.call}
toolResult={item.run.result} toolResult={item.run.result}
chatId={item.chatId} chatId={item.chatId}
/> />
); ) : item.run.call.name === 'request_read_access' ? (
}
if (item.run.call.name === 'request_read_access') {
return (
<RequestReadAccessCard <RequestReadAccessCard
key={item.key}
toolCall={item.run.call} toolCall={item.run.call}
toolResult={item.run.result} toolResult={item.run.result}
chatId={item.chatId} chatId={item.chatId}
/> />
) : (
<ToolCallLine run={item.run} />
)
) : (
<ToolCallGroup runs={item.runs} />
)}
</motion.div>
</div>
); );
} }}
return <ToolCallLine key={item.key} run={item.run} />; />
}
return <ToolCallGroup key={item.key} runs={item.runs} />;
})}
<div ref={endRef} />
</div>
</div> </div>
</MessageListErrorBoundary>
); );
} }
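
The pinned-message logic reads the `#pin=` URL hash in two places: the `useState` initializer and the `hashchange` handler. One way to keep the two in sync is a small shared parser, sketched below. `parsePinHash` is a hypothetical helper, not part of this change, and unlike the inline code it treats an empty id (`#pin=`) as no pin.

```typescript
// Hypothetical helper (assumption): extract the message id from a "#pin=<id>" hash.
function parsePinHash(hash: string): string | null {
  const prefix = '#pin=';
  if (!hash.startsWith(prefix)) return null;
  const id = hash.slice(prefix.length);
  // Treat "#pin=" with no id as "no pin" rather than pinning an empty string.
  return id.length > 0 ? id : null;
}
```

Both call sites could then reduce to `parsePinHash(window.location.hash)`.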

@@ -0,0 +1,188 @@
import { useMemo } from 'react';
import { Clock, Cpu, Hash, Layers, RefreshCw, X } from 'lucide-react';
import { Button } from '@/components/ui/button';
import { cn } from '@/lib/utils';
import type { Message } from '@/api/types';
interface TurnEntry {
message: Message;
turnNumber: number;
elapsed: string;
toolCallCount: number;
}
interface Props {
messages: Message[];
chatId: string;
onClose: () => void;
onScrollToMessage: (messageId: string) => void;
}
function formatElapsed(startedAt: string | null, finishedAt: string | null): string {
if (!startedAt || !finishedAt) return '—';
const start = new Date(startedAt).getTime();
const end = new Date(finishedAt).getTime();
if (Number.isNaN(start) || Number.isNaN(end)) return '—';
const ms = end - start;
if (ms < 0) return '—';
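if (ms < 1000) return `${ms}ms`;
// Round once to whole seconds so 59.6s renders as "1m 0s", never "60s" or "1m 60s".
const totalSecs = Math.round(ms / 1000);
if (totalSecs < 60) return `${totalSecs}s`;
const mins = Math.floor(totalSecs / 60);
const secs = totalSecs % 60;
return `${mins}m ${secs}s`;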
}
/**
* SessionTimeline — vertical timeline of assistant turns in a chat.
*
* Renders a side-panel overlay with each turn's model, tokens, duration,
* and tool-call count. Clicking a turn scrolls the main chat to that
* message. The latest turn shows a "Scroll to latest" restore button.
*/
export function SessionTimeline({ messages, onClose, onScrollToMessage }: Props) {
const turns = useMemo<TurnEntry[]>(() => {
const assistantMsgs = messages.filter(
(m) => m.role === 'assistant' && m.status === 'complete',
);
return assistantMsgs.map((message, i) => ({
message,
turnNumber: i + 1,
elapsed: formatElapsed(message.started_at, message.finished_at),
toolCallCount: message.tool_calls?.length ?? 0,
}));
}, [messages]);
const latestTurnId = turns.length > 0 ? turns[turns.length - 1]!.message.id : null;
return (
<div className="absolute inset-y-0 right-0 w-80 z-20 bg-background border-l border-border shadow-xl flex flex-col overflow-hidden">
{/* Header */}
<div className="flex items-center justify-between px-3 py-2.5 border-b border-border shrink-0">
<h3 className="text-sm font-semibold">Session Timeline</h3>
<Button variant="ghost" size="icon-xs" onClick={onClose} aria-label="Close timeline">
<X size={14} />
</Button>
</div>
{/* Timeline entries */}
<div className="flex-1 overflow-y-auto px-3 py-3">
{turns.length === 0 ? (
<div className="text-xs text-muted-foreground text-center py-8">
No assistant turns yet.
</div>
) : (
<div className="relative">
{turns.map((turn, i) => {
const isLatest = turn.message.id === latestTurnId;
return (
<div key={turn.message.id} className="relative flex gap-3 pb-4 last:pb-0">
{/* Vertical connector line */}
{i < turns.length - 1 && (
<div className="absolute left-[11px] top-5 bottom-0 w-px bg-border" />
)}
{/* Timeline dot button */}
<button
type="button"
onClick={() => onScrollToMessage(turn.message.id)}
className="relative flex-shrink-0 mt-1 cursor-pointer focus:outline-none focus-visible:ring-2 focus-visible:ring-ring rounded-full"
aria-label={`Scroll to turn ${turn.turnNumber}`}
>
<div
className={cn(
'size-[22px] rounded-full border-2 flex items-center justify-center',
isLatest
? 'border-primary bg-primary/10'
: 'border-muted-foreground/30 bg-background',
)}
>
<div
className={cn(
'size-2 rounded-full',
isLatest ? 'bg-primary' : 'bg-muted-foreground/50',
)}
/>
</div>
</button>
{/* Content card */}
<div className="flex-1 min-w-0">
<div
className="rounded-lg border border-border bg-card p-2.5 cursor-pointer hover:bg-muted/40 transition-colors"
onClick={() => onScrollToMessage(turn.message.id)}
>
{/* Turn number + latest badge */}
<div className="flex items-center justify-between mb-1.5">
<span className="text-xs font-semibold text-foreground">
Turn {turn.turnNumber}
</span>
{isLatest && (
<span className="text-[10px] font-medium text-primary bg-primary/10 px-1.5 py-0.5 rounded-full leading-none">
Latest
</span>
)}
</div>
{/* Model name */}
<div className="flex items-center gap-1.5 text-xs text-muted-foreground mb-1.5">
<Cpu size={11} className="shrink-0" />
<span className="truncate">{turn.message.model ?? 'Unknown model'}</span>
</div>
{/* Token count with breakdown */}
{turn.message.tokens_used != null && (
<div className="flex items-center gap-1.5 text-xs text-muted-foreground mb-1 flex-wrap">
<Hash size={11} className="shrink-0" />
<span>{turn.message.tokens_used.toLocaleString()} total</span>
{turn.message.cache_tokens != null && turn.message.cache_tokens > 0 && (
<span className="text-blue-500 dark:text-blue-400">
({turn.message.cache_tokens.toLocaleString()} cache)
</span>
)}
{turn.message.reasoning_tokens != null && turn.message.reasoning_tokens > 0 && (
<span className="text-amber-500 dark:text-amber-400">
({turn.message.reasoning_tokens.toLocaleString()} reasoning)
</span>
)}
</div>
)}
{/* Duration + tool calls */}
<div className="flex items-center gap-3 text-xs text-muted-foreground">
<span className="inline-flex items-center gap-1">
<Clock size={11} />
{turn.elapsed}
</span>
{turn.toolCallCount > 0 && (
<span className="inline-flex items-center gap-1">
<Layers size={11} />
{turn.toolCallCount} tool call{turn.toolCallCount !== 1 ? 's' : ''}
</span>
)}
</div>
</div>
{/* Restore button for latest turn */}
{isLatest && (
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onScrollToMessage(turn.message.id);
}}
className="mt-1.5 w-full inline-flex items-center justify-center gap-1 text-[11px] font-medium text-primary hover:text-primary/80 transition-colors py-1 rounded-md hover:bg-primary/5"
>
<RefreshCw size={11} />
Scroll to latest
</button>
)}
</div>
</div>
);
})}
</div>
)}
</div>
</div>
);
}

@@ -1,7 +1,9 @@
import { useState } from 'react';
- import { Check, ChevronRight, Loader2, X } from 'lucide-react';
+ import { Check, ChevronRight, Loader2, ShieldAlert, X } from 'lucide-react';
import type { ToolCall, ToolResult } from '@/api/types';
import { linkifyPaths } from '@/lib/linkify-paths';
import { DiffSnippet } from './DiffSnippet';
import { McpPermissionDialog } from './McpPermissionDialog';
// v1.8.2: cap on the inline arg-summary length. Expanded view shows full
// args + full result, so this is purely a single-line render budget.
@@ -105,14 +107,18 @@ interface Props {
// When rendered inside a ToolCallGroup the line is already nested under a
// shared header, so the leading arrow is dropped to avoid double indent.
insideGroup?: boolean;
chatId?: string;
}
- export function ToolCallLine({ run, insideGroup }: Props) {
+ export function ToolCallLine({ run, insideGroup, chatId }: Props) {
const [open, setOpen] = useState(false);
const [approveOpen, setApproveOpen] = useState(false);
const status = runStatus(run);
const args = run.call.args ?? {};
const summary = formatToolArgs(run.call.name, args);
const needsApproval = run.result?.error?.startsWith('requires approval:') === true;
return (
<div className="text-xs">
<button
@@ -129,7 +135,7 @@ export function ToolCallLine({ run, insideGroup }: Props) {
/>
)}
<ChevronRight
- className={`size-3 text-muted-foreground/60 shrink-0 transition-transform ${open ? 'rotate-90' : ''}`}
+ className={`size-3 text-muted-foreground/60 shrink-0 motion-reduce:transition-none transition-transform ${open ? 'rotate-90' : ''}`}
/>
<span className="font-mono text-foreground/90 shrink-0">{run.call.name}</span>
{summary && (
@@ -158,7 +164,27 @@ export function ToolCallLine({ run, insideGroup }: Props) {
{run.result && (
<pre className="text-[11px] font-mono whitespace-pre-wrap bg-muted/30 rounded px-2 py-1 max-h-72 overflow-y-auto">
{run.result.error ? (
needsApproval ? (
<span className="flex flex-col gap-2">
<span className="text-amber-600 dark:text-amber-400">
This tool requires your approval
</span>
{chatId && (
<span>
<button
type="button"
onClick={() => setApproveOpen(true)}
className="inline-flex items-center gap-1 rounded bg-amber-500/10 px-2 py-1 text-xs font-medium text-amber-600 hover:bg-amber-500/20 dark:text-amber-400"
>
<ShieldAlert className="size-3" />
Approve
</button>
</span>
)}
</span>
) : (
<span className="text-destructive">{run.result.error}</span>
)
) : (
linkifyPaths(
typeof run.result.output === 'string'
@@ -171,6 +197,17 @@ export function ToolCallLine({ run, insideGroup }: Props) {
)}
</pre>
)}
{needsApproval && chatId && (
<McpPermissionDialog
toolCallId={run.call.id}
toolName={run.call.name}
toolArgs={run.call.args ?? {}}
chatId={chatId}
open={approveOpen}
onClose={() => setApproveOpen(false)}
/>
)}
{run.result?.diff && <DiffSnippet diff={run.result.diff} />}
</div>
)}
</div>

@@ -0,0 +1,248 @@
import { useCallback, useEffect, useMemo, useState } from 'react';
import { ChevronDown, ChevronRight, AlertCircle } from 'lucide-react';
import { api } from '@/api/client';
import type { ToolTrace } from '@/api/types';
import { CacheShapeBadge } from '@/components/CacheShapeBadge';
interface Props {
chatId: string;
}
// Max latency used as the 100% reference for the bar visualization
const MAX_LATENCY_REF = 30_000; // 30s
function latencyBarWidth(latencyMs: number | null): number {
if (latencyMs == null) return 0;
return Math.min(latencyMs / MAX_LATENCY_REF, 1);
}
function TraceRow({ trace }: { trace: ToolTrace }) {
const [expanded, setExpanded] = useState(false);
const isError = trace.outcome !== null && trace.outcome !== 'success';
const barWidth = latencyBarWidth(trace.latency_ms);
const latencyLabel =
trace.latency_ms != null
? trace.latency_ms >= 1000
? `${(trace.latency_ms / 1000).toFixed(1)}s`
: `${trace.latency_ms}ms`
: null;
return (
<div className="border-b border-border/40 last:border-0">
<button
type="button"
onClick={() => setExpanded((v) => !v)}
className="flex items-center gap-2 w-full text-left px-2 py-1.5 hover:bg-muted/40 text-[11px]"
>
<span className="shrink-0 text-muted-foreground">
{expanded ? <ChevronDown size={10} /> : <ChevronRight size={10} />}
</span>
<span className="font-medium truncate min-w-0">
{trace.tool_name}
</span>
{isError && (
<span className="shrink-0 text-destructive" title={trace.error ?? 'error'}>
<AlertCircle size={10} />
</span>
)}
<span className="shrink-0 text-muted-foreground font-mono tabular-nums min-w-[3rem] text-right">
{latencyLabel ?? '—'}
</span>
<span className="flex-1 h-1.5 bg-muted rounded-full overflow-hidden min-w-[24px] max-w-[60px]">
<span
className="block h-full rounded-full bg-primary/30 transition-all"
style={{ width: `${barWidth * 100}%` }}
/>
</span>
{trace.tokens_used != null && trace.tokens_used > 0 && (
<span className="shrink-0 text-muted-foreground font-mono tabular-nums">
{trace.tokens_used}t
</span>
)}
<CacheShapeBadge cacheTokens={trace.cache_tokens} totalTokens={trace.tokens_used} />
{trace.reasoning_tokens != null && trace.reasoning_tokens > 0 && (
<span className="shrink-0 text-muted-foreground/60 font-mono tabular-nums text-[10px]">
r{trace.reasoning_tokens}
</span>
)}
</button>
{expanded && (
<div className="px-3 pb-2 space-y-1.5 text-[11px] border-t border-border/40 pt-1.5">
<div>
<span className="text-muted-foreground font-medium">Input</span>
<pre className="mt-0.5 font-mono text-[10px] leading-relaxed text-muted-foreground bg-muted/30 rounded p-1.5 overflow-x-auto max-h-32 overflow-y-auto whitespace-pre-wrap break-all">
{JSON.stringify(trace.tool_input, null, 1)}
</pre>
</div>
{trace.tool_output != null && (
<div>
<span className="text-muted-foreground font-medium">Output</span>
<pre className="mt-0.5 font-mono text-[10px] leading-relaxed text-muted-foreground bg-muted/30 rounded p-1.5 overflow-x-auto max-h-32 overflow-y-auto whitespace-pre-wrap break-all">
{trace.tool_output.length > 2000
? `${trace.tool_output.slice(0, 2000)}…`
: trace.tool_output}
</pre>
</div>
)}
{trace.error != null && (
<div className="text-destructive text-[10px] font-mono leading-relaxed bg-destructive/10 rounded p-1.5">
{trace.error}
</div>
)}
</div>
)}
</div>
);
}
function TraceGroup({ toolName, traces }: { toolName: string; traces: ToolTrace[] }) {
const [collapsed, setCollapsed] = useState(false);
const totalLatency = traces.reduce((sum, t) => sum + (t.latency_ms ?? 0), 0);
const totalTokens = traces.reduce((sum, t) => sum + (t.tokens_used ?? 0), 0);
const errorCount = traces.filter(
(t) => t.outcome !== null && t.outcome !== 'success',
).length;
return (
<div>
<button
type="button"
onClick={() => setCollapsed((v) => !v)}
className="flex items-center gap-1.5 w-full text-left px-2 py-1 text-[11px] font-medium text-muted-foreground hover:bg-muted/30 sticky top-0 bg-background"
>
{collapsed ? <ChevronRight size={10} /> : <ChevronDown size={10} />}
<span>{toolName}</span>
<span className="text-muted-foreground/60 font-mono tabular-nums">
×{traces.length}
</span>
{totalTokens > 0 && (
<span className="text-muted-foreground/60 font-mono tabular-nums text-[10px]">
{totalTokens}t
</span>
)}
{totalLatency > 0 && (
<span className="text-muted-foreground/60 font-mono tabular-nums text-[10px]">
{totalLatency >= 1000
? `${(totalLatency / 1000).toFixed(1)}s`
: `${totalLatency}ms`}
</span>
)}
{errorCount > 0 && (
<span className="ml-auto text-destructive text-[10px] font-medium">
{errorCount} error{errorCount > 1 ? 's' : ''}
</span>
)}
</button>
{!collapsed && traces.map((trace) => (
<TraceRow key={trace.id} trace={trace} />
))}
</div>
);
}
export function TraceViewer({ chatId }: Props) {
const [open, setOpen] = useState(false);
const [traces, setTraces] = useState<ToolTrace[]>([]);
const [loading, setLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
const fetchTraces = useCallback(async () => {
setLoading(true);
setError(null);
try {
const res = await api.chats.getTraces(chatId);
setTraces(res.data);
} catch (err) {
setError(err instanceof Error ? err.message : 'failed to load traces');
} finally {
setLoading(false);
}
}, [chatId]);
useEffect(() => {
if (open) {
void fetchTraces();
}
}, [open, fetchTraces]);
const groups = useMemo(() => {
const map = new Map<string, ToolTrace[]>();
for (const t of traces) {
const existing = map.get(t.tool_name);
if (existing) {
existing.push(t);
} else {
map.set(t.tool_name, [t]);
}
}
return map;
}, [traces]);
const totalCount = traces.length;
const errorCount = traces.filter(
(t) => t.outcome !== null && t.outcome !== 'success',
).length;
return (
<div className="border-t">
<button
type="button"
onClick={() => setOpen((v) => !v)}
className="flex items-center gap-1.5 w-full px-3 py-1.5 text-[11px] font-medium text-muted-foreground hover:bg-muted/20"
>
{open ? <ChevronDown size={12} /> : <ChevronRight size={12} />}
<span>Tool traces</span>
{totalCount > 0 && (
<span className="font-mono tabular-nums text-muted-foreground/60">
{totalCount}
</span>
)}
{errorCount > 0 && (
<span className="text-destructive ml-auto text-[10px] font-medium">
{errorCount} error{errorCount > 1 ? 's' : ''}
</span>
)}
{loading && (
<span className="ml-auto inline-block w-1.5 h-3 align-baseline bg-muted-foreground/60 animate-pulse" />
)}
</button>
{open && (
<div className="max-h-80 overflow-y-auto border-t border-border/40">
{loading && traces.length === 0 && (
<div className="px-3 py-4 text-[11px] text-muted-foreground text-center">
Loading traces
</div>
)}
{error && (
<div className="px-3 py-2 text-[11px] text-destructive">
{error}
<button
type="button"
onClick={() => void fetchTraces()}
className="ml-2 underline hover:no-underline"
>
retry
</button>
</div>
)}
{!loading && !error && traces.length === 0 && (
<div className="px-3 py-4 text-[11px] text-muted-foreground text-center">
No tool traces yet.
</div>
)}
{traces.length > 0 && (
<div className="divide-y divide-border/40">
{Array.from(groups.entries()).map(([toolName, groupTraces]) => (
<TraceGroup
key={toolName}
toolName={toolName}
traces={groupTraces}
/>
))}
</div>
)}
</div>
)}
</div>
);
}


@@ -1,11 +1,13 @@
import { useCallback, useEffect, useRef, useState } from 'react'; import { useCallback, useEffect, useRef, useState } from 'react';
import { Pencil, Send, X } from 'lucide-react'; import { History, Pencil, Send, X } from 'lucide-react';
import { toast } from 'sonner'; import { toast } from 'sonner';
import { api } from '@/api/client'; import { api } from '@/api/client';
import { useSessionStream } from '@/hooks/useSessionStream'; import { useSessionStream } from '@/hooks/useSessionStream';
import { MessageList } from '@/components/MessageList'; import { MessageList } from '@/components/MessageList';
import { ChatInput } from '@/components/ChatInput'; import { ChatInput } from '@/components/ChatInput';
import { StaleStreamBanner } from '@/components/StaleStreamBanner'; import { StaleStreamBanner } from '@/components/StaleStreamBanner';
import { SessionTimeline } from '@/components/SessionTimeline';
import { TraceViewer } from '@/components/TraceViewer';
import { sendToChat } from '@/lib/events'; import { sendToChat } from '@/lib/events';
interface Props { interface Props {
@@ -25,6 +27,7 @@ interface Props {
export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange, sessionChats, webSearchEnabled }: Props) { export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange, sessionChats, webSearchEnabled }: Props) {
const stream = useSessionStream(sessionId); const stream = useSessionStream(sessionId);
const lastErrorRef = useRef<string | null>(null); const lastErrorRef = useRef<string | null>(null);
const [showTimeline, setShowTimeline] = useState(false);
const [queue, setQueue] = useState<{ id: string; text: string }[]>([]); const [queue, setQueue] = useState<{ id: string; text: string }[]>([]);
const queueIdRef = useRef(0); const queueIdRef = useRef(0);
const processingRef = useRef(false); const processingRef = useRef(false);
@@ -203,11 +206,41 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
} }
} }
const handleScrollToMessage = useCallback((messageId: string) => {
const el = document.getElementById(`msg-${messageId}`);
if (el) {
el.scrollIntoView({ behavior: 'smooth', block: 'center' });
}
}, []);
return ( return (
<div className="flex flex-col h-full min-h-0"> <div className="flex flex-col h-full min-h-0 relative">
{chatMessages.length > 0 && (
<div className="absolute top-2 right-2 z-10">
<button
type="button"
onClick={() => setShowTimeline((v) => !v)}
className={`
inline-flex items-center gap-1 px-2 py-1 rounded-md text-xs font-medium
transition-colors border
${showTimeline
? 'bg-primary text-primary-foreground border-primary'
: 'bg-background text-muted-foreground border-border hover:bg-muted hover:text-foreground'
}
`}
aria-label={showTimeline ? 'Close timeline' : 'Open timeline'}
>
<History size={12} />
Timeline
</button>
</div>
)}
{/* v1.11.5: ContextBar moved into ChatInput (above the agent picker). */} {/* v1.11.5: ContextBar moved into ChatInput (above the agent picker). */}
<MessageList messages={chatMessages} sessionChats={sessionChats} /> <MessageList messages={chatMessages} sessionChats={sessionChats} />
<TraceViewer chatId={chatId} />
{/* Queued messages */} {/* Queued messages */}
{queue.length > 0 && ( {queue.length > 0 && (
<div className="border-t"> <div className="border-t">
@@ -275,6 +308,16 @@ export function ChatPane({ sessionId, chatId, projectId, agentId, onAgentChange,
messages={chatMessages} messages={chatMessages}
modelContextLimit={modelContextLimit} modelContextLimit={modelContextLimit}
/> />
{/* Timeline overlay panel */}
{showTimeline && (
<SessionTimeline
messages={chatMessages}
chatId={chatId}
onClose={() => setShowTimeline(false)}
onScrollToMessage={handleScrollToMessage}
/>
)}
</div> </div>
); );
} }


@@ -16,6 +16,134 @@ interface State {
error: string | null; error: string | null;
} }
type Channel = 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';
// Per-channel out-of-order frame buffer with contiguous-seq flush logic.
// Stores incoming channel_delta frames and releases them only when seq
// becomes contiguous with the expected next value.
class ChannelBuffer {
private expectedSeq = 0;
private buffer = new Map<number, ChannelDeltaWsFrame>();
push(frame: ChannelDeltaWsFrame): ChannelDeltaWsFrame[] {
if (frame.seq < this.expectedSeq) {
return [];
}
if (frame.seq === this.expectedSeq) {
this.expectedSeq++;
const flushed = [frame];
while (this.buffer.has(this.expectedSeq)) {
const next = this.buffer.get(this.expectedSeq)!;
this.buffer.delete(this.expectedSeq);
this.expectedSeq++;
flushed.push(next);
}
return flushed;
}
this.buffer.set(frame.seq, frame);
return [];
}
get expectedNextSeq(): number {
return this.expectedSeq;
}
get bufferedCount(): number {
return this.buffer.size;
}
reset(seq = 0) {
this.expectedSeq = seq;
this.buffer.clear();
}
}
type ChannelDeltaWsFrame = WsFrame & { type: 'channel_delta' };
// Converts a flushed channel_delta into the equivalent legacy frame so the
// existing applyFrame reducer handles the per-message mutation. Status
// deltas are handled separately (they may need to create the message first
// and apply throughput metadata independently of terminal status).
function channelDeltaToLegacyFrame(delta: ChannelDeltaWsFrame): WsFrame | null {
switch (delta.channel) {
case 'text':
return { type: 'delta', message_id: delta.message_id!, content: delta.content! };
case 'tool_call':
return { type: 'tool_call', message_id: delta.message_id!, tool_call: delta.tool_call! };
case 'tool_result':
return {
type: 'tool_result',
tool_message_id: delta.tool_message_id!,
chat_id: delta.chat_id,
tool_call_id: delta.tool_call_id!,
output: delta.output,
truncated: delta.truncated!,
...(delta.error ? { error: delta.error } : {}),
...(delta.diff ? { diff: delta.diff } : {}),
};
case 'error':
return {
type: 'error',
message_id: delta.message_id,
chat_id: delta.chat_id,
error: delta.error!,
...(delta.reason ? { reason: delta.reason as never } : {}),
};
case 'status':
return null;
}
}
// Apply a flushed status channel_delta to state. Status deltas carry both
// intermediate throughput metadata (tokens_used, ctx_used, model, etc.)
// and optional terminal transitions (complete / cancelled / failed).
function applyStatusDelta(state: State, delta: ChannelDeltaWsFrame): State {
const { message_id, chat_id, status, channel: _c, seq: _s, type: _t, ...meta } = delta;
if (!message_id) return state;
let next = state;
const exists = next.messages.some((m) => m.id === message_id);
if (!exists && status === 'running') {
next = applyFrame(next, {
type: 'message_started',
message_id,
chat_id,
role: 'assistant',
});
}
const metaFields: Record<string, unknown> = {};
if (meta.tokens_used !== undefined) metaFields.tokens_used = meta.tokens_used;
if (meta.ctx_used !== undefined) metaFields.ctx_used = meta.ctx_used;
if (meta.ctx_max !== undefined) metaFields.ctx_max = meta.ctx_max;
if (meta.cache_tokens !== undefined) metaFields.cache_tokens = meta.cache_tokens;
if (meta.reasoning_tokens !== undefined) metaFields.reasoning_tokens = meta.reasoning_tokens;
if (meta.started_at !== undefined) metaFields.started_at = meta.started_at;
if (meta.finished_at !== undefined) metaFields.finished_at = meta.finished_at;
if (meta.model !== undefined) metaFields.model = meta.model;
if (meta.metadata !== undefined) metaFields.metadata = meta.metadata;
if (Object.keys(metaFields).length > 0) {
next = {
...next,
messages: next.messages.map((m) =>
m.id === message_id ? { ...m, ...metaFields } : m,
),
};
}
if (status === 'complete' || status === 'cancelled' || status === 'failed') {
next = applyFrame(next, {
type: 'message_complete',
message_id,
chat_id,
status,
});
}
return next;
}
function applyFrame(state: State, frame: WsFrame): State { function applyFrame(state: State, frame: WsFrame): State {
switch (frame.type) { switch (frame.type) {
case 'snapshot': { case 'snapshot': {
@@ -33,18 +161,19 @@ function applyFrame(state: State, frame: WsFrame): State {
kind: 'message', kind: 'message',
tool_calls: null, tool_calls: null,
tool_results: null, tool_results: null,
// v1.8.2: cap-hit sentinels arrive role='system' and are static, so
// skipping the streaming dot for them keeps the UI accurate.
status: frame.role === 'system' ? 'complete' : 'streaming', status: frame.role === 'system' ? 'complete' : 'streaming',
last_seq: 0, last_seq: 0,
tokens_used: null, tokens_used: null,
ctx_used: null, ctx_used: null,
ctx_max: null, ctx_max: null,
cache_tokens: null,
reasoning_tokens: null,
model: null, model: null,
started_at: null, started_at: null,
finished_at: null, finished_at: null,
created_at: new Date().toISOString(), created_at: new Date().toISOString(),
metadata: null, metadata: null,
...(frame.compare_group_id ? { compare_group_id: frame.compare_group_id } : {}),
}; };
return { ...state, messages: [...state.messages, newMsg] }; return { ...state, messages: [...state.messages, newMsg] };
} }
@@ -63,27 +192,24 @@ function applyFrame(state: State, frame: WsFrame): State {
const next = state.messages.map((m) => const next = state.messages.map((m) =>
m.id === frame.message_id m.id === frame.message_id
? { ...m, tool_calls: [...(m.tool_calls ?? []), frame.tool_call] } ? { ...m, tool_calls: [...(m.tool_calls ?? []), frame.tool_call] }
: m : m,
); );
return { ...state, messages: next }; return { ...state, messages: next };
} }
case 'tool_result': { case 'tool_result': {
const exists = state.messages.some((m) => m.id === frame.tool_message_id); const toolResultsBase = {
if (exists) {
const next = state.messages.map((m) =>
m.id === frame.tool_message_id
? {
...m,
role: 'tool' as const,
tool_results: {
tool_call_id: frame.tool_call_id, tool_call_id: frame.tool_call_id,
output: frame.output, output: frame.output,
truncated: frame.truncated, truncated: frame.truncated,
...(frame.error ? { error: frame.error } : {}), ...(frame.error ? { error: frame.error } : {}),
}, ...(frame.diff ? { diff: frame.diff } : {}),
status: 'complete' as const, };
} const exists = state.messages.some((m) => m.id === frame.tool_message_id);
: m if (exists) {
const next = state.messages.map((m) =>
m.id === frame.tool_message_id
? { ...m, role: 'tool' as const, tool_results: toolResultsBase, status: 'complete' as const }
: m,
); );
return { ...state, messages: next }; return { ...state, messages: next };
} }
@@ -95,17 +221,14 @@ function applyFrame(state: State, frame: WsFrame): State {
content: '', content: '',
kind: 'message', kind: 'message',
tool_calls: null, tool_calls: null,
tool_results: { tool_results: toolResultsBase,
tool_call_id: frame.tool_call_id,
output: frame.output,
truncated: frame.truncated,
...(frame.error ? { error: frame.error } : {}),
},
status: 'complete', status: 'complete',
last_seq: 0, last_seq: 0,
tokens_used: null, tokens_used: null,
ctx_used: null, ctx_used: null,
ctx_max: null, ctx_max: null,
cache_tokens: null,
reasoning_tokens: null,
model: null, model: null,
started_at: null, started_at: null,
finished_at: null, finished_at: null,
@@ -128,19 +251,14 @@ function applyFrame(state: State, frame: WsFrame): State {
...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}), ...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}),
...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}), ...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}),
...(frame.model !== undefined ? { model: frame.model } : {}), ...(frame.model !== undefined ? { model: frame.model } : {}),
// v1.8.2: cap-hit sentinels (and future stamped metadata) ride
// in on this terminal frame so the reducer can attach it
// without waiting for a refetch.
...(frame.metadata !== undefined ? { metadata: frame.metadata } : {}), ...(frame.metadata !== undefined ? { metadata: frame.metadata } : {}),
...(frame.compare_group_id !== undefined ? { compare_group_id: frame.compare_group_id } : {}),
} }
: m : m,
); );
return { ...state, messages: next }; return { ...state, messages: next };
} }
case 'usage': { case 'usage': {
// v1.12.2: live throughput. Side-effects into the module-level
// singleton consumed by ChatThroughput; no message-state mutation.
// chat_id is the optional ws-frame field; usage frames always include it.
if (frame.chat_id) { if (frame.chat_id) {
recordUsage(frame.chat_id, { recordUsage(frame.chat_id, {
completion_tokens: frame.completion_tokens, completion_tokens: frame.completion_tokens,
@@ -168,10 +286,6 @@ function applyFrame(state: State, frame: WsFrame): State {
return state; return state;
} }
case 'error': { case 'error': {
// v1.8.2: when the frame carries a structured reason, stamp it onto the
// failed message's metadata so the bubble can render specifics inline
// (the WS error frame is one-shot; refresh-safe rendering needs the
// value persisted on the message).
const errorMeta = frame.reason const errorMeta = frame.reason
? { kind: 'error' as const, error_reason: frame.reason, error_text: frame.error } ? { kind: 'error' as const, error_reason: frame.reason, error_text: frame.error }
: null; : null;
@@ -182,48 +296,55 @@ function applyFrame(state: State, frame: WsFrame): State {
...m, ...m,
status: 'failed' as const, status: 'failed' as const,
...(errorMeta ? { metadata: errorMeta } : {}), ...(errorMeta ? { metadata: errorMeta } : {}),
...(frame.compare_group_id !== undefined ? { compare_group_id: frame.compare_group_id } : {}),
} }
: m : m,
) )
: state.messages; : state.messages;
return { ...state, messages: next, error: frame.error }; return { ...state, messages: next, error: frame.error };
} }
case 'compacted': { case 'compacted': {
// v1.11: side effects (refetch + toast) live in ws.onmessage; the return state;
// reducer just no-ops so TS exhaustiveness is satisfied without }
// duplicating async work inside a synchronous reducer. case 'agent_snapshot': {
return state; return state;
} }
case 'agent_status_updated': { case 'agent_status_updated': {
// agent-status-normalize (#10): coder-only frame consumed by CoderPane's
// own WS handler, not BooChat's native message reducer. No-op here to keep
// TS exhaustiveness satisfied (native sessions never emit it).
return state; return state;
} }
case 'flow_run_started': case 'flow_run_started':
case 'flow_run_step_updated': { case 'flow_run_step_updated': {
// Orchestrator frames consumed by OrchestratorPane's own subscription.
// No-op here to keep TS exhaustiveness satisfied.
return state; return state;
} }
case 'battle_started': case 'battle_started':
case 'contestant_updated': case 'contestant_updated':
case 'battle_updated': { case 'battle_updated': {
// Arena frames consumed by ArenaPane's own subscription. return state;
// No-op here to keep TS exhaustiveness satisfied. }
case 'channel_delta': {
return state;
}
default: {
return state; return state;
} }
} }
} }
// Matches useUserEvents — exponential backoff with the same ceiling so the
// two channels reconnect on the same cadence after a network handoff.
const RECONNECT_INITIAL_MS = 1000; const RECONNECT_INITIAL_MS = 1000;
const RECONNECT_MAX_MS = 30_000; const RECONNECT_MAX_MS = 30_000;
const CHANNEL_STALL_MS = 5000;
export function useSessionStream(sessionId: string | undefined) { export function useSessionStream(sessionId: string | undefined) {
const [state, setState] = useState<State>({ messages: [], connected: false, error: null }); const [state, setState] = useState<State>({ messages: [], connected: false, error: null });
const wsRef = useRef<WebSocket | null>(null); const wsRef = useRef<WebSocket | null>(null);
const channelBuffersRef = useRef<Map<Channel, ChannelBuffer>>(new Map());
const lastFrameTimeRef = useRef<Partial<Record<Channel, number>>>({});
// Reset channel buffers when session changes
useEffect(() => {
channelBuffersRef.current = new Map();
lastFrameTimeRef.current = {};
}, [sessionId]);
useEffect(() => { useEffect(() => {
if (!sessionId) return; if (!sessionId) return;
@@ -234,6 +355,73 @@ export function useSessionStream(sessionId: string | undefined) {
let reconnectTimer: ReturnType<typeof setTimeout> | null = null; let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
let reconnectDelay = RECONNECT_INITIAL_MS; let reconnectDelay = RECONNECT_INITIAL_MS;
const getLastSeqPerChannel = () => {
const seqs: Partial<Record<Channel, number>> = {};
for (const [ch, buf] of channelBuffersRef.current) {
seqs[ch] = buf.expectedNextSeq;
}
return seqs;
};
const flushDeltaToState = (delta: ChannelDeltaWsFrame) => {
if (delta.channel === 'status') {
setState((s) => applyStatusDelta(s, delta));
} else {
const legacy = channelDeltaToLegacyFrame(delta);
if (legacy) {
setState((s) => applyFrame(s, legacy));
}
}
};
const handleChannelDelta = (frame: ChannelDeltaWsFrame) => {
const buffers = channelBuffersRef.current;
let buffer = buffers.get(frame.channel);
if (!buffer) {
buffer = new ChannelBuffer();
buffers.set(frame.channel, buffer);
}
const flushed = buffer.push(frame);
if (flushed.length === 0) return;
for (const delta of flushed) {
flushDeltaToState(delta);
}
let emittedRefresh = false;
for (const delta of flushed) {
if (delta.channel === 'status' && (delta.status === 'complete' || delta.status === 'cancelled' || delta.status === 'failed')) {
emittedRefresh = true;
}
}
if (emittedRefresh) {
sessionEvents.emit({ type: 'git_diff_refresh' });
}
lastFrameTimeRef.current[frame.channel] = Date.now();
};
// Periodic channel stall check: if any channel has buffered frames
// but no progress for 5s, force a snapshot refetch.
let stallTimer: ReturnType<typeof setInterval> | null = null;
const startStallTimer = () => {
stallTimer = setInterval(() => {
const now = Date.now();
for (const [channel, buffer] of channelBuffersRef.current) {
if (buffer.bufferedCount === 0) continue;
const lastTime = lastFrameTimeRef.current[channel as Channel] ?? 0;
if (now - lastTime >= CHANNEL_STALL_MS) {
buffer.reset();
sessionEvents.emit({ type: 'refetch_messages' });
}
}
}, 1000);
};
const connect = () => { const connect = () => {
if (unmounted) return; if (unmounted) return;
const proto = window.location.protocol === 'https:' ? 'wss' : 'ws'; const proto = window.location.protocol === 'https:' ? 'wss' : 'ws';
@@ -244,13 +432,16 @@ export function useSessionStream(sessionId: string | undefined) {
ws.onopen = () => { ws.onopen = () => {
reconnectDelay = RECONNECT_INITIAL_MS; reconnectDelay = RECONNECT_INITIAL_MS;
setState((s) => ({ ...s, connected: true, error: null })); setState((s) => ({ ...s, connected: true, error: null }));
// Mid-stream reconnection protocol: send last known seq per channel
// so the server can replay deltas or fall back to a full snapshot.
const lastSeq = getLastSeqPerChannel();
ws.send(JSON.stringify({ type: 'reconnect', lastSeqPerChannel: lastSeq }));
startStallTimer();
}; };
ws.onmessage = (ev) => { ws.onmessage = (ev) => {
// v1.13.11-a: Zod-validate every inbound frame. Fail-closed — invalid
// frames are logged and dropped. WsFrameSchema is the runtime guard;
// the hand-maintained WsFrame type stays as the narrowed dev-time
// shape (Zod uses OpaqueObject for nested types like Message[]). One
// cast bridges the two.
let raw: unknown; let raw: unknown;
try { try {
raw = JSON.parse(typeof ev.data === 'string' ? ev.data : ''); raw = JSON.parse(typeof ev.data === 'string' ? ev.data : '');
@@ -268,13 +459,14 @@ export function useSessionStream(sessionId: string | undefined) {
} }
try { try {
const frame = validated.data as unknown as WsFrame; const frame = validated.data as unknown as WsFrame;
// v1.11: on a compaction completion, re-fetch the message list so
// the new summary row + the cohort of compacted_at-stamped older if (frame.type === 'channel_delta') {
// rows render correctly. We dispatch the fresh list as a synthetic
// 'snapshot' frame so the reducer's existing path handles state
// replacement (no need for a parallel "refetched" path). handleChannelDelta(frame);
// The toast is purely UX feedback; missing it would still leave return;
// the chat in a valid state. }
if (frame.type === 'compacted') { if (frame.type === 'compacted') {
toast.success('Context compacted to free space'); toast.success('Context compacted to free space');
void api.messages void api.messages
@@ -287,8 +479,9 @@ export function useSessionStream(sessionId: string | undefined) {
}); });
return; return;
} }
setState((s) => applyFrame(s, frame)); setState((s) => applyFrame(s, frame));
// Trigger git diff refresh after each completed assistant turn.
if (frame.type === 'message_complete') { if (frame.type === 'message_complete') {
sessionEvents.emit({ type: 'git_diff_refresh' }); sessionEvents.emit({ type: 'git_diff_refresh' });
} }
@@ -296,15 +489,18 @@ export function useSessionStream(sessionId: string | undefined) {
console.warn('bad ws frame', err); console.warn('bad ws frame', err);
} }
}; };
// v1.8.1: WS errors no longer surface as user-facing toasts here. The
// user-channel hook (useUserEvents) owns the debounced "reconnecting…"
// UI; this channel just reconnects silently on the same backoff.
ws.onerror = () => { ws.onerror = () => {
try { ws.close(); } catch {} try { ws.close(); } catch {}
}; };
ws.onclose = () => { ws.onclose = () => {
if (unmounted) return; if (unmounted) return;
setState((s) => ({ ...s, connected: false })); setState((s) => ({ ...s, connected: false }));
if (stallTimer) {
clearInterval(stallTimer);
stallTimer = null;
}
const delay = reconnectDelay; const delay = reconnectDelay;
reconnectDelay = Math.min(reconnectDelay * 2, RECONNECT_MAX_MS); reconnectDelay = Math.min(reconnectDelay * 2, RECONNECT_MAX_MS);
reconnectTimer = setTimeout(connect, delay); reconnectTimer = setTimeout(connect, delay);
@@ -316,6 +512,7 @@ export function useSessionStream(sessionId: string | undefined) {
return () => { return () => {
unmounted = true; unmounted = true;
if (reconnectTimer) clearTimeout(reconnectTimer); if (reconnectTimer) clearTimeout(reconnectTimer);
if (stallTimer) clearInterval(stallTimer);
const ws = wsRef.current; const ws = wsRef.current;
wsRef.current = null; wsRef.current = null;
if (ws) try { ws.close(); } catch {} if (ws) try { ws.close(); } catch {}
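The contiguous-seq flush behaviour that `ChannelBuffer` implements in the hunk above can be illustrated in isolation. The following is a minimal sketch, not the code from the commit: `ReorderBuffer` and `SeqFrame` are hypothetical names, and the frame shape is reduced to a `seq` plus a `content` string. It assumes sequence numbers start at 0 per channel, as the `expectedSeq = 0` initializer in the diff suggests.

```typescript
// Hypothetical, reduced frame shape; the real ChannelDeltaWsFrame carries more fields.
interface SeqFrame {
  seq: number;
  content: string;
}

// Generic reorder buffer: holds frames that arrive early and releases them
// only once every lower sequence number has been seen.
class ReorderBuffer<T extends { seq: number }> {
  private expectedSeq = 0;
  private pending = new Map<number, T>();

  push(frame: T): T[] {
    // Stale or duplicate frame: already delivered, drop it.
    if (frame.seq < this.expectedSeq) return [];

    // Early frame: park it until the gap is filled.
    if (frame.seq > this.expectedSeq) {
      this.pending.set(frame.seq, frame);
      return [];
    }

    // Exactly the expected frame: deliver it plus any contiguous successors.
    const flushed: T[] = [frame];
    this.expectedSeq++;
    let next = this.pending.get(this.expectedSeq);
    while (next !== undefined) {
      this.pending.delete(this.expectedSeq);
      this.expectedSeq++;
      flushed.push(next);
      next = this.pending.get(this.expectedSeq);
    }
    return flushed;
  }
}

// Frames arrive out of order (1, 2, 0) followed by a duplicate of seq 1.
const buf = new ReorderBuffer<SeqFrame>();
const order: string[] = [];
const arrivals: SeqFrame[] = [
  { seq: 1, content: 'b' },
  { seq: 2, content: 'c' },
  { seq: 0, content: 'a' },
  { seq: 1, content: 'dup' },
];
for (const frame of arrivals) {
  for (const out of buf.push(frame)) order.push(out.content);
}
console.log(order.join(','));
```

Nothing is released until seq 0 arrives, at which point all three frames flush in order; the late duplicate is dropped. A buffer like this stalls indefinitely if a frame is lost, which is presumably why the hook pairs it with a stall timer and a snapshot refetch.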


@@ -1,44 +0,0 @@
# v2.8 — boocontext sidecar container.
# Multi-stage build: Go shim from golang:1.24-alpine, boocontext MCP aggregator
# from node:20-alpine, then an alpine:3.20 runtime holding both.
#
# The shim spawns boocontext as a child MCP process over stdio NDJSON,
# translating HTTP requests to MCP tools/call.
#
# To stage the fork source for a Docker build:
# tar -czf codecontext/fork.tar.gz -C /opt/forks/boocontext \
# --exclude=.git --exclude=node_modules --exclude=dist
# Stage 1: Go shim builder
FROM golang:1.24-alpine AS shim-builder
WORKDIR /build/shim
RUN apk add --no-cache ca-certificates
COPY go.mod ./
COPY shim.go ./
RUN CGO_ENABLED=0 GOOS=linux go build -o /build/shim-bin ./
# Stage 2: boocontext MCP builder (pnpm project)
FROM node:20-alpine AS boocontext-builder
WORKDIR /build/boocontext
RUN apk add --no-cache git python3 make g++ ca-certificates
RUN npm install -g pnpm@9 --silent
COPY fork.tar.gz /build/fork.tar.gz
RUN mkdir -p /build/boocontext && tar -xzf /build/fork.tar.gz -C /build/boocontext
WORKDIR /build/boocontext
RUN pnpm install --frozen-lockfile && pnpm run build
# Stage 3: Runtime
FROM alpine:3.20
# uv intentionally not installed — container network blocks astral.sh.
# tree-sitter-analyzer child server (uvx) won't start in-container, but
# boocontext logs a graceful warning; TSA-backed tools fall through.
RUN apk add --no-cache ca-certificates nodejs
COPY --from=shim-builder /build/shim-bin /usr/local/bin/shim
COPY --from=boocontext-builder /build/boocontext/dist /usr/local/lib/boocontext/dist
COPY --from=boocontext-builder /build/boocontext/node_modules /usr/local/lib/boocontext/node_modules
COPY --from=boocontext-builder /build/boocontext/package.json /usr/local/lib/boocontext/package.json
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=5s --start-period=30s \
CMD wget -qO- http://localhost:8080/health || exit 1
ENTRYPOINT ["/usr/local/bin/shim"]

codecontext/README.md

@@ -0,0 +1,31 @@
# codecontext — Go sidecar (DEPRECATED)
> **Deprecated** (Phase 4, Domain 2, v2.8.14).
>
> Superseded by the **boocontext MCP server** (`apps/coder`). Do not add new
> callers. The 16 codecontext tool wrappers still use this sidecar via HTTP at
> `http://codecontext:8080/v1/{toolName}` for backward compatibility.
## Migration path
1. Existing tool wrappers in `apps/server/src/services/tools/codecontext/` route
through `callCodecontext()` in `codecontext_client.ts`, which calls this
Go sidecar over HTTP.
2. New callers should use the boocontext MCP server instead (reachable via the
`boocontext` tool wrappers).
3. After all callers have migrated, remove this directory, the `codecontext`
service block from `docker-compose.yml`, and the
`codecontext_client.ts`/`factory.ts` files.
## What it does
A Go HTTP shim wrapping the boocontext MCP server's stdio interface. Provides
code-graph analysis (symbols, callers, callees, file overview, etc.) over a
REST API at `/v1/{toolName}`.
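A call against the `/v1/{toolName}` route might look like the sketch below. Only the URL shape comes from this README; the POST method, the JSON body, the `callSidecar` and `codecontextUrl` helpers, and the `get_file_overview` tool name are all assumptions made for illustration, so check `shim.go` and `codecontext_client.ts` for the actual contract.

```typescript
// Base URL as documented above; reachable only inside the compose network.
const CODECONTEXT_BASE = 'http://codecontext:8080';

// Builds the documented /v1/{toolName} URL. (Hypothetical helper.)
function codecontextUrl(toolName: string, base: string = CODECONTEXT_BASE): string {
  return `${base}/v1/${encodeURIComponent(toolName)}`;
}

// Minimal fetch-compatible signature so the sketch can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// ASSUMPTION: the shim accepts a JSON POST body with the tool arguments.
// The README does not state the method or the payload format.
async function callSidecar(
  toolName: string,
  args: Record<string, unknown>,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const res = await fetchImpl(codecontextUrl(toolName), {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(args),
  });
  if (!res.ok) {
    throw new Error(`codecontext ${toolName} failed: HTTP ${res.status}`);
  }
  return res.json();
}

// 'get_file_overview' is a hypothetical tool name, not one confirmed by this README.
const exampleUrl = codecontextUrl('get_file_overview');
console.log(exampleUrl);
```

In a runtime with a global `fetch`, it can be passed as the third argument to `callSidecar`. Since the sidecar is deprecated, new callers should prefer the boocontext tool wrappers described under "Migration path".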
## Files
- `shim.go` — HTTP server that wraps the boocontext MCP stdio process
- `Dockerfile` — container build
- `fork.tar.gz` — vendored boocontext source (gitignored)
- `.codecontextignore.template` — default ignore patterns deployed per project


@@ -7,7 +7,6 @@ services:
- "100.114.205.53:9500:3000" - "100.114.205.53:9500:3000"
env_file: .env env_file: .env
environment: environment:
CODECONTEXT_URL: http://codecontext:8080
CONTAINER_GUIDANCE_FILE: /app/BOOCHAT.md CONTAINER_GUIDANCE_FILE: /app/BOOCHAT.md
DATABASE_URL: postgres://boocode:${POSTGRES_PASSWORD}@boocode_db:5432/boochat DATABASE_URL: postgres://boocode:${POSTGRES_PASSWORD}@boocode_db:5432/boochat
BOOCODER_URL: http://100.114.205.53:9502 BOOCODER_URL: http://100.114.205.53:9502
@@ -91,41 +90,6 @@ services:
  networks:
  - boocode_net
- # v1.12 Track B: codecontext sidecar. Stdio MCP server wrapped by a small
- # HTTP shim (see ./codecontext/). No host port — reached from boocode at
- # http://codecontext:8080 over the boocode_net bridge.
- #
- # Mounts /opt:/opt:ro (not just /opt/projects:ro): BooCode projects live
- # at /opt/<slug> on the host, not exclusively under /opt/projects. The
- # mount must cover anywhere a project.path could resolve to. Read-only
- # because codecontext only analyzes — never writes. The model can't
- # arbitrarily set target_dir to a sensitive subtree because the B.2
- # wrappers validate target_dir against project.path before calling the
- # shim, and the shim isn't reachable from outside boocode_net.
- codecontext:
- build:
- context: ./codecontext
- container_name: boocode_codecontext
- ports:
- - "127.0.0.1:8080:8080"
- restart: unless-stopped
- environment:
- CODECONTEXT_CHILD: node /usr/local/lib/boocontext/dist/index.js --mcp
- TYPE_INJECT_MCP_PATH: /opt/type-inject/packages/mcp/dist/index.js
- TREE_SITTER_MCP_CMD: uvx
- TREE_SITTER_MCP_ARGS: --from tree-sitter-analyzer[mcp] tree-sitter-analyzer-mcp
- networks:
- - boocode_net
- volumes:
- - /opt:/opt:ro
- - /opt/forks:/opt/forks:ro
- healthcheck:
- test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
- interval: 30s
- timeout: 5s
- retries: 3
- start_period: 30s
  volumes:
  boocode_pgdata:


@@ -0,0 +1,107 @@
# Paseo-like Orchestrator — Trace Observability, Dynamic Workflows & Agent Runtime
**Status:** Proposed
**Epic:** paseo-orchestrator
**Depends on:** v2.7.17-orchestrator
## Why
BooCode's Orchestrator (v2.7.17) runs deterministic Han analysis flows — but it's a fixed pipeline, not a general-purpose agent runtime. Every tool call is opaque: no timing, no cost breakdown, no replay. Sessions evaporate on browser refresh. Workflows are hardcoded. Subagents block until completion. And there's zero visibility into cache efficiency on DeepSeek — despite prompt caching being a major cost lever.
The current architecture treats the LLM as a black box and the agent as a one-shot transaction. To move from "read-only chat" to a **Paseo-style thin-client orchestration layer**, BooCode needs five capabilities that compound on each other:
1. **Observability** — Every tool call timed, logged, and live-streamed. Without it, debugging agent behavior is guesswork.
2. **Persistence** — Agent state survives browser refresh. Active sessions resume where they left off.
3. **Dynamic Workflows** — User-authored JS scripts using `agent()`, `parallel()`, `pipeline()` instead of hardcoded flows. Hash-based caching skips completed steps on re-run.
4. **Background Subagents** — `spawn_subagent` returns immediately, results collected later. Unlocks parallel research, long-running analyses, and notification-based workflows.
5. **Multi-modal + Cache Shape** — Image attachments forwarded to DeepSeek's vision API, plus per-turn cache hit rate visualization to close the cost feedback loop.
Each phase is independently valuable; together they transform BooCode from a chat UI into a durable agent execution platform.
## What Changes
### Phase 1: Trace System + Observability (3-4 days)
1. **Create `tool_traces` DB table** — id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome. Applied idempotently via `applySchema()`.
2. **Add `tool_trace` WS frame** — new WsFrame variant in `@boocode/contracts` published by the server when a tool call starts and completes. Frontend receives live timing deltas via `useSessionStream`.
3. **Instrument `tool-phase.ts`** — wrap `executeToolCall` with `clock_timestamp()` start/end, extract token counts from LLM response metadata, publish `tool_trace` frames on start (with input) and finish (with output + metrics).
4. **Add GET `/api/chats/:id/traces`** — paginated endpoint returning trace rows ordered by turn_number + started_at. Supports cursor-based pagination for large sessions.
5. **Build trace viewer pane** — collapsible tree per turn, timing bars showing latency relative to turn duration, expand/collapse per tool call showing input/output. Integrates into the existing multi-pane workspace alongside chat, coder, and orchestrator panes.
### Phase 2: Session Persistence + Resume (2-3 days)
6. **Serialize agent state to DB** — on each turn boundary (before and after tool call loop), snapshot the active `AgentSession` state (provider config, turn history, pending tool calls) to a JSONB column in `agent_sessions`. Uses `clock_timestamp()` for ordering.
7. **Restore on WS reconnect** — when `snapshot` frame arrives on reconnection, check for a persisted `AgentSession` in `in_progress` or `awaiting_input` state. Rehydrate the coder pane to match the persisted turn, tool call, and pending state.
8. **Agent session timeline view** — a timeline component in the coder pane showing the history of all turns in the current agent session. Each turn shows start time, tool count, token usage, cache hit rate. Clicking a turn scrolls to that point in the conversation.
### Phase 3: Dynamic Workflow Engine (5-7 days)
9. **Create `isolated-vm` sandbox** — restricted JS execution environment for workflow scripts. No `require`, `fs`, `net`, `child_process`. Only the workflow API surface exposed. Token budget enforcement kills runaway scripts.
10. **Implement workflow API primitives** — `agent(id, { prompt, model, tools, budget })` defines a sub-agent; `parallel([agent1, agent2])` runs N agents concurrently with a shared token budget; `pipeline([step1, step2])` chains agents sequentially; `phase(name, { agents, budget })` groups agents under a named phase; `budget(limit)` sets token or step limits; `log(msg)` emits structured workflow log. Compatible with Claude Code workflow script format.
11. **Workflow file discovery** — scan `.boocode/workflows/*.js` (project-local), `~/.boocode/workflows/*.js` (global), and a built-in catalog directory. Each file exports a `workflow` object with `{name, description, run}`. Discovery runs on server start and on file change (optional watch mode).
12. **Workflow manager + built-in catalog** — `WorkflowManager` class with `list()`, `get(name)`, `run(workflow, args)`, `cancel(runId)`, `status(runId)`. Concurrency limits (configurable max concurrent runs), token budgets per run. Built-in catalog includes: `deep-research` (parallel source search → per-source analysis → synthesis), `multi-review` (code health + security + standards reviews in parallel), `plan-verify` (generate plan → verify plan → generate tasks), `bounty-hunt` (parallel vulnerability scanning with different focuses).
13. **Workflow resumability** — SHA-256 hash of each agent spec (prompt + options). Before executing an agent, check if a completed result exists with the same hash. Skip cached agents, only execute new/changed ones. In-memory LRU cache for current session, optional DB persistence for cross-session reuse.
14. **Workflow UI integration** — extend the existing Orchestrator panel (used for Han flows) to support dynamic workflows. Workflow selector dropdown, live run pane with step-by-step progress, cancel button, log output stream, per-agent timing. Reuses the same run-pane component pattern.
### Phase 4: Background Subagents (2-3 days)
15. **Background task queue** — uses the existing `tasks` table with a new `background` type. `spawn_subagent` tool creates a task row and returns immediately. A background worker picks up the task and executes it without blocking the calling agent.
16. **`subagent_status` + `subagent_result` tools** — `subagent_status(task_id)` returns `running|completed|failed` with optional progress info. `subagent_result(task_id)` returns the full output when completed. Polling-based (no WS push for background tasks initially).
17. **Background agent pane** — new pane type showing running/completed background agents. Each entry shows name, status, duration, progress. Completed entries show a "View Result" action. Notifications hook into the existing notification system (toast on completion, badge count for active tasks).
### Phase 5: Multi-modal + Cache Shape (2-3 days)
18. **Image/file attachment pipeline** — accept file uploads (drag-drop or file picker), store on tmpfs with a reference in the message row. Forward to DeepSeek's multimodal API as base64-encoded image parts. Size limit enforcement (configurable, default 20MB per attachment).
19. **Image render in message bubble** — render attached images inline in the chat message bubble. Lightbox on click for expanded view. Thumbnail generation for large images to keep chat scrolling performant.
20. **Cache shape telemetry** — extract `prompt_cache_hit_tokens` from DeepSeek provider metadata on each turn. Break down by segment: system prompt, tool schemas, conversation history. Store in `tool_traces` columns and/or a dedicated `cache_stats` table.
21. **Cache hit rate visualization** — per-turn cache hit bar in the trace viewer (showing cached vs non-cached tokens). Cumulative cache hit rate in the session footer. Highlight when a turn achieves high cache reuse (green indicator) or unusually low (yellow/red).
## Non-Goals
- No changes to the existing Han flow orchestrator (runs alongside dynamic workflows)
- No removal of existing agent dispatch paths (PTY, ACP, Claude SDK — dynamic workflows are additive)
- No distributed execution (all orchestration is single-node)
- No persistent workflow file watching (manual reload or server restart to pick up new workflows)
- No workflow editing UI (workflows are authored as JS files)
## Capabilities
### New Capabilities
- **Tool trace viewer** — every tool call with timing, token costs, cache breakdown, expandable input/output
- **Agent session resume** — browser refresh preserves active agent state
- **Dynamic workflows** — user-authored JS scripts with `agent()/parallel()/pipeline()` API
- **Workflow resumability** — hash-based step caching skips completed agents on re-run
- **Built-in workflow catalog** — deep-research, multi-review, plan-verify, bounty-hunt
- **Background subagents** — non-blocking spawn with deferred result collection
- **Multi-modal support** — image attachments forwarded to DeepSeek vision API
- **Cache shape telemetry** — per-turn and cumulative cache hit rate visualization
### Modified Capabilities
- **Orchestrator panel** — extended from fixed Han flows to dynamic workflow selection and streaming run pane
- **tool-phase.ts** — instrumented with start/end timing and trace publishing
- **WsFrame contract** — new `tool_trace` frame variant
- **tasks table** — extended with `background` type for async subagent execution
## Metrics
- Tool call observability: 0% → 100% of calls traced with timing
- Session continuity: lost on refresh → preserved on reconnect
- Workflow authoring: hardcoded → user-authored JS scripts
- Workflow re-run efficiency: 0% cache → hash-based step reuse
- Background execution: blocking only → blocking + non-blocking
- Cache visibility: 0% → per-turn + cumulative hit rate
- Multi-modal: text-only → text + image attachments


@@ -0,0 +1,230 @@
# Tasks — Paseo-like Orchestrator
## Phase 1: Trace System + Observability (5 tasks)
### 1. Create tool_traces DB table + migration
Add `tool_traces` table to `apps/server/src/schema.sql`:
- Columns: id (UUID PK), session_id (UUID FK → sessions), chat_id (UUID FK → chats), turn_number (int), tool_name (text), input (jsonb), output (jsonb), started_at (timestamptz), finished_at (timestamptz), latency_ms (int), tokens_used (int), cache_tokens (int), reasoning_tokens (int), error (text), outcome (text)
- Index on (chat_id, turn_number, started_at) for trace queries
- Index on (session_id) for session-level aggregation
- Applied idempotently via `applySchema()` — wrap in `CREATE TABLE IF NOT EXISTS`
**Verification**: `psql` shows `tool_traces` table with all columns and indexes. Schema re-run is no-op.
### 2. Add tool_trace WS frame + contracts schema
Add `tool_trace` frame to `WsFrameSchema` in `packages/contracts/src/ws-frames.ts`:
- Frame types: `tool_trace:start` (tool_name, input, started_at) and `tool_trace:complete` (tool_name, output, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error)
- Add to `InferenceFrame` loose union in `apps/server/src/services/inference/turn.ts`
- Add to strict `WsFrame` discriminated union in `apps/web/src/api/types.ts`
- Rebuild contracts: `pnpm -C packages/contracts build`
**Verification**: `tsc --noEmit` passes. WS client receives `tool_trace:start` and `tool_trace:complete` frames.
### 3. Instrument tool-phase.ts with start/end timing
Update `apps/server/src/services/tools/tool-phase.ts`:
- Before `executeToolCall`: record `clock_timestamp()` as start, publish `tool_trace:start` frame with tool_name and input
- After `executeToolCall`: record `clock_timestamp()` as finish, compute latency_ms, extract token counts from response metadata, INSERT into `tool_traces` table, publish `tool_trace:complete` frame
- Handle errors: on thrown error, publish `tool_trace:complete` with error field set, set outcome='error'; on success, outcome='success'
- Use `sql.json(input as never)` for JSONB columns — no double-serialization
**Verification**: Every tool call produces a `tool_traces` row with correct latency_ms and outcome. WS client receives both start and complete frames.
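
The wrapping step in task 3 could look roughly like the sketch below. It is not the project's `tool-phase.ts`: the frame shapes, the `publish` callback, and the injected `now` clock are assumptions made for illustration, and the `tool_traces` INSERT is left out.

```typescript
// Hypothetical frame shapes, loosely following the field lists in task 2.
type TraceFrame =
  | { type: "tool_trace:start"; tool_name: string; input: unknown; started_at: string }
  | {
      type: "tool_trace:complete";
      tool_name: string;
      output?: unknown;
      latency_ms: number;
      outcome: "success" | "error";
      error?: string;
    };

// Wraps one tool execution so it always emits a start frame and a complete frame.
// `publish` and `now` are injected (assumptions) so the wrapper runs without a socket or a real clock.
async function traceToolCall<T>(
  toolName: string,
  input: unknown,
  execute: () => Promise<T>,
  publish: (frame: TraceFrame) => void,
  now: () => number = Date.now,
): Promise<T> {
  const startedMs = now();
  publish({
    type: "tool_trace:start",
    tool_name: toolName,
    input,
    started_at: new Date(startedMs).toISOString(),
  });
  try {
    const output = await execute();
    publish({
      type: "tool_trace:complete",
      tool_name: toolName,
      output,
      latency_ms: now() - startedMs,
      outcome: "success",
    });
    return output;
  } catch (err) {
    publish({
      type: "tool_trace:complete",
      tool_name: toolName,
      latency_ms: now() - startedMs,
      outcome: "error",
      error: err instanceof Error ? err.message : String(err),
    });
    throw err; // the caller still sees the original failure
  }
}
```

Rethrowing after publishing keeps the trace an observer: the tool loop's existing error handling is unchanged.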
### 4. Add GET /api/chats/:id/traces endpoint
Create `apps/server/src/routes/traces.ts`:
- `GET /api/chats/:id/traces` — paginated, ordered by (turn_number, started_at)
- Query params: `cursor` (opaque cursor for keyset pagination), `limit` (default 50, max 200), `turn_number` (optional filter to single turn)
- Returns `{traces: Trace[], next_cursor: string | null}`
- Register in Fastify router with `chatOwnershipPreHandler` guard
**Verification**: `curl /api/chats/:id/traces` returns paginated trace rows. Turn filter returns only matching traces.
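
One way to realize the opaque keyset cursor and the `limit` bounds from task 4 is sketched below. The cursor fields (`turn_number`, `started_at`, plus `id` as a tie-breaker) and the helper names are assumptions; the task only requires that the cursor be opaque to clients.

```typescript
import { Buffer } from "node:buffer";

// Position of the last row on the previous page (hypothetical shape).
// `id` breaks ties when two traces share the same turn and timestamp.
interface TraceCursor {
  turn_number: number;
  started_at: string;
  id: string;
}

// Encode the cursor as base64url JSON so clients treat it as an opaque token.
function encodeCursor(cursor: TraceCursor): string {
  return Buffer.from(JSON.stringify(cursor), "utf8").toString("base64url");
}

// Decode and validate a client-supplied cursor; returns null for anything malformed.
function decodeCursor(raw: string): TraceCursor | null {
  try {
    const parsed: unknown = JSON.parse(Buffer.from(raw, "base64url").toString("utf8"));
    if (typeof parsed !== "object" || parsed === null) return null;
    const p = parsed as Record<string, unknown>;
    if (
      typeof p.turn_number !== "number" ||
      typeof p.started_at !== "string" ||
      typeof p.id !== "string"
    ) {
      return null;
    }
    return { turn_number: p.turn_number, started_at: p.started_at, id: p.id };
  } catch {
    return null;
  }
}

// Apply the documented bounds: default 50, maximum 200.
function clampLimit(raw: string | undefined): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (!Number.isFinite(n) || n < 1) return 50;
  return Math.min(n, 200);
}
```

The decoded tuple would then drive a row-value comparison on `(turn_number, started_at, id)` in the query, which keeps page fetches index-friendly on the index defined in task 1.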
### 5. Build trace viewer frontend component
Create `apps/web/src/components/TraceViewer.tsx` (and supporting files):
- Collapsible tree grouped by turn_number
- Per tool call row: tool_name badge, latency bar (relative bar width, color-coded: green <1s, yellow <5s, red ≥5s), token count, expand/collapse chevron
- Expanded view: tool input (JSON formatted), tool output (JSON formatted), error message if any
- Fetch traces from `/api/chats/:id/traces` on pane mount, paginate on scroll
- Integrate as a new pane option in the multi-pane workspace (existing pane registry)
**Verification**: Trace viewer loads, groups by turn, shows timing bars, expands/collapses tool calls. Pagination works for sessions with 50+ traces.
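
The latency bar rules in task 5 reduce to two small pure functions. This is a sketch with hypothetical names; the 2% minimum bar width is an assumption added so very fast calls stay visible.

```typescript
type LatencyColor = "green" | "yellow" | "red";

// Thresholds from the task description: green <1s, yellow <5s, red >=5s.
function latencyColor(latencyMs: number): LatencyColor {
  if (latencyMs < 1000) return "green";
  if (latencyMs < 5000) return "yellow";
  return "red";
}

// Bar width as a percentage of the whole turn's duration, clamped to [2, 100].
function latencyBarWidth(latencyMs: number, turnMs: number): number {
  if (turnMs <= 0) return 100;
  const pct = Math.round((latencyMs / turnMs) * 100);
  return Math.min(100, Math.max(2, pct));
}
```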
## Phase 2: Session Persistence + Resume (3 tasks)
### 6. Serialize agent state to DB on turn boundaries
Modify `apps/coder` agent dispatch:
- On each turn boundary (after LLM response, before next tool call loop), serialize `AgentSession` state to `agent_sessions` table
- Persist: provider config, turn history, pending tool calls, current phase, token budget remaining
- Use JSONB column for the snapshot state, `clock_timestamp()` for last_update
- Guard against rapid consecutive saves (debounce 200ms)
**Verification**: Agent session state is written to `agent_sessions` after each LLM turn. JSONB snapshot contains all fields needed for resume.
### 7. Restore state on WS reconnect
Update `apps/server/src/services/ws.ts`:
- On `snapshot` frame from a reconnecting client, check for `AgentSession` in `in_progress` or `awaiting_input` state
- If found, rehydrate the coder pane: restore provider config, replay pending tool calls, set turn history
- Publish a `session_restored` frame with the restored state metadata
- Client-side: `useSessionStream` handles `session_restored` by resetting pane state to match
**Verification**: Refresh browser mid-agent-session → after reconnect, the coder pane shows the same turn state, pending tool calls, and conversation history.
### 8. Agent session timeline view
Add timeline component to the coder pane:
- Horizontal timeline showing all turns in the current agent session
- Each turn entry: turn number, start time, tool call count, token usage, cache hit rate
- Active turn highlighted, past turns dimmed
- Clicking a past turn scrolls the conversation to that turn and collapses later turns
- Fetch turn metadata from existing session data (no new endpoint needed)
**Verification**: Timeline shows all turns. Clicking a turn scrolls to it. Active turn is highlighted.
## Phase 3: Dynamic Workflow Engine (6 tasks)
### 9. Create isolated-vm workflow sandbox
Create `apps/server/src/services/workflow/sandbox.ts`:
- Use `isolated-vm` npm package to create a V8 isolate for each workflow run
- No `require`, `fs`, `net`, `child_process` accessible in the sandbox
- Expose only the workflow API surface (`agent`, `parallel`, `pipeline`, `phase`, `budget`, `log`, `args`)
- Token budget enforcement: inject a step counter, throw when budget exceeded
- Timeout: 30s default, configurable per workflow
- Error boundary: caught exceptions produce structured error results instead of crashing the worker
- Add `isolated-vm` to `apps/server/package.json` dependencies
**Verification**: Workflow script that calls `agent()` runs without error. Script trying `require('fs')` throws a sandbox violation. Run exceeding budget is killed with a clear message.
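
The "inject a step counter, throw when budget exceeded" rule from task 9 can be illustrated independently of the sandbox itself. The `Budget` class below is a hypothetical sketch; it counts abstract units, which could be steps or tokens.

```typescript
// Tracks consumption against a fixed limit and throws once the limit would be crossed.
class Budget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Record `amount` units; rejects the call (and records nothing) if it would exceed the limit.
  consume(amount = 1): void {
    if (this.used + amount > this.limit) {
      throw new Error(
        `workflow budget exceeded: ${this.used + amount} requested, limit is ${this.limit}`,
      );
    }
    this.used += amount;
  }

  get remaining(): number {
    return this.limit - this.used;
  }
}
```

Checking before incrementing means a rejected call leaves the counter untouched, so the error message reports the state that triggered it.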
### 10. Implement agent/parallel/pipeline primitives
Create `apps/server/src/services/workflow/api.ts`:
- `agent(id, { prompt, model?, tools?, budget? })` — registers a sub-agent. Returns an object with `.run(input)` that dispatches the agent through the existing agent dispatch system and returns result.
- `parallel([agents], { budget? })` — runs all agents concurrently. Returns when all complete (or any fails). Shared token budget across parallel agents. Uses `Promise.allSettled` for resilience.
- `pipeline([steps], { budget? })` — runs steps sequentially. Each step receives the previous step's output. Steps can be `agent()` results or inline functions.
- `phase(name, { agents, budget })` — groups agents under a named phase. Phases can have their own budget. Results are namespaced by phase name.
- `budget(limit)` — sets token or step limits. Returns a budget object consumed by agent/parallel/pipeline.
- `log(msg)` — emits a structured log entry tagged with current phase/agent context. Published as WS frame to the Orchestrator pane.
- `args` — the input arguments passed to `workflow.run(args)`.
**Verification**: A test workflow using `agent()`, `parallel()`, and `pipeline()` executes correctly. Logs appear in the output stream. Token budgets are enforced.
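
A minimal sketch of `agent`, `parallel`, and `pipeline` follows, assuming a hypothetical `Dispatch` function stands in for the existing agent dispatch system. Budgets, `phase`, and `log` are omitted. Because it uses `Promise.allSettled`, this version of `parallel` waits for every agent and reports failures per agent instead of returning early.

```typescript
// Hypothetical dispatcher: sends a prompt (and optional input) to an agent, resolves with its text output.
type Dispatch = (agentId: string, prompt: string, input?: string) => Promise<string>;

interface AgentHandle {
  id: string;
  run(input?: string): Promise<string>;
}

type ParallelResult =
  | { id: string; ok: true; output: string }
  | { id: string; ok: false; error: string };

function makeWorkflowApi(dispatch: Dispatch) {
  // agent(): registers a sub-agent and returns a handle with .run(input).
  const agent = (id: string, opts: { prompt: string }): AgentHandle => ({
    id,
    run: (input) => dispatch(id, opts.prompt, input),
  });

  // parallel(): runs all agents concurrently; one failure does not discard the other results.
  const parallel = async (agents: AgentHandle[], input?: string): Promise<ParallelResult[]> => {
    const settled = await Promise.allSettled(agents.map((a) => a.run(input)));
    return settled.map((s, i): ParallelResult =>
      s.status === "fulfilled"
        ? { id: agents[i].id, ok: true, output: s.value }
        : { id: agents[i].id, ok: false, error: String(s.reason) },
    );
  };

  // pipeline(): runs steps in order, feeding each step the previous step's output.
  const pipeline = async (steps: AgentHandle[], input?: string): Promise<string | undefined> => {
    let current = input;
    for (const step of steps) {
      current = await step.run(current);
    }
    return current;
  };

  return { agent, parallel, pipeline };
}
```

A workflow script would then call something like `pipeline([agent("search", { prompt: "..." }), agent("summarize", { prompt: "..." })], args.topic)`, where the agent ids and `args.topic` are illustrative.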
### 11. Workflow file discovery system
Create `apps/server/src/services/workflow/discovery.ts`:
- Scan `.boocode/workflows/*.js` (project root, relative to `PROJECT_ROOT_WHITELIST`)
- Scan `~/.boocode/workflows/*.js` (global, `os.homedir()`)
- Scan `data/workflows/` (built-in catalog)
- Each file must export a `workflow` object: `{name, description, run(args) => {...}}`
- Validate the workflow object at discovery time: required fields, run must be a function
- On server start, run full discovery. Cache results in a `Map<name, Workflow>`.
- Log discovered workflows with name + description at `info` level
**Verification**: Placing a valid `.boocode/workflows/test.js` file makes the workflow appear in `WorkflowManager.list()`. Invalid workflow files are logged as warnings and skipped.
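
The discovery-time validation in task 11 might be sketched as below. The `Workflow` and `ValidationResult` types are assumptions based on the `{name, description, run}` shape stated above; a caller would log `reason` as a warning and skip the file.

```typescript
interface Workflow {
  name: string;
  description: string;
  run: (args: unknown) => unknown;
}

type ValidationResult =
  | { ok: true; workflow: Workflow }
  | { ok: false; reason: string };

// Checks that a module's `workflow` export has the required fields before it is registered.
function validateWorkflow(candidate: unknown): ValidationResult {
  if (typeof candidate !== "object" || candidate === null) {
    return { ok: false, reason: "`workflow` export is missing or not an object" };
  }
  const c = candidate as Record<string, unknown>;
  if (typeof c.name !== "string" || c.name.trim() === "") {
    return { ok: false, reason: "`name` must be a non-empty string" };
  }
  if (typeof c.description !== "string") {
    return { ok: false, reason: "`description` must be a string" };
  }
  if (typeof c.run !== "function") {
    return { ok: false, reason: "`run` must be a function" };
  }
  return {
    ok: true,
    workflow: { name: c.name, description: c.description, run: c.run as Workflow["run"] },
  };
}
```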
### 12. Workflow manager + built-in catalog
Create `apps/server/src/services/workflow/manager.ts`:
- `WorkflowManager` singleton class:
- `list()` — returns all discovered workflows with name, description, and arg schema
- `get(name)` — returns a workflow by name
- `run(workflow, args)` — creates a sandbox, injects args, executes `workflow.run()`. Returns a runId (UUID).
- `cancel(runId)` — terminates the sandbox, marks run as cancelled
- `status(runId)` — returns run status: `pending|running|completed|failed|cancelled`, with progress info
- Concurrency limit: configurable via `WORKFLOW_MAX_CONCURRENT` env var (default 3)
- Token budget: configurable via `WORKFLOW_DEFAULT_BUDGET` env var (default 100_000 tokens)
- Run state tracked in-memory with optional DB persistence
Built-in workflows in `data/workflows/`:
- `deep-research` — parallel source search → per-source analysis → synthesis report
- `multi-review` — run code health + security + standards reviews in parallel, merge findings
- `plan-verify` — generate implementation plan → verify plan → generate work items
- `bounty-hunt` — parallel vulnerability scans with different focus areas (injection, auth, crypto, business logic)
**Verification**: `list()` returns built-in workflows. `run()` executes a workflow and returns runId. `status()` reflects progress. `cancel()` stops execution cleanly.
### 13. Workflow resumability (hash-based cache)
Create `apps/server/src/services/workflow/cache.ts`:
- Compute SHA-256 hash of each agent spec: `crypto.createHash('sha256').update(JSON.stringify({prompt, options})).digest('hex')`
- Before executing an agent, check in-memory LRU cache for existing result matching the hash
- Hit: return cached result, emit `log('cached', agentId, hash)` — no actual dispatch
- Miss: execute agent, store result in cache keyed by hash
- LRU eviction: `WORKFLOW_CACHE_SIZE` env var (default 100 entries)
- Optional DB persistence: `workflow_cache` table with `hash`, `result`, `created_at` — cross-session reuse
- Re-run detection: identical workflow with same args → all agents skipped
- Partial re-run: changed args → only changed agents re-execute, unchanged ones read from cache
**Verification**: First run of a workflow executes all agents. Second run with identical args skips all agents (logs show 'cached'). Run with modified args for one agent only re-executes that agent.
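
The hash-and-cache logic in task 13 could be sketched as follows. One deliberate difference from the task text: it serializes with sorted object keys instead of plain `JSON.stringify`, so two specs that differ only in key order hash identically. That refinement, and the names `stableStringify`, `hashAgentSpec`, and `LruCache`, are assumptions.

```typescript
import { createHash } from "node:crypto";

// JSON-like serialization with sorted object keys. Undefined values serialize as "null".
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const fields = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`);
    return `{${fields.join(",")}}`;
  }
  return JSON.stringify(value) ?? "null";
}

// SHA-256 hex digest of the agent spec (prompt + options).
function hashAgentSpec(prompt: string, options: Record<string, unknown>): string {
  return createHash("sha256").update(stableStringify({ prompt, options })).digest("hex");
}

// Small LRU built on Map insertion order: the first key is always the least recently used.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    this.entries.delete(key); // re-insert to mark as most recently used
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

Before dispatching an agent, the runner would look up `hashAgentSpec(prompt, options)` in the cache and only execute on a miss.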
### 14. Workflow UI integration with Orchestrator panel
Extend `apps/web/src/components/Orchestrator/`:
- Add workflow selector dropdown listing workflows from `WorkflowManager.list()`
- Add "Run Workflow" button that opens workflow args editor (JSON or form)
- Extend existing run pane to show workflow steps with per-agent progress
- Live log stream from workflow `log()` calls, displayed in a scrollable log view
- Cancel button for running workflows
- Resumability indicator: "3/5 steps cached — skipping" when hash cache hits
- Fetch workflow list via new API endpoint or WS message (add `GET /api/orchestrator/workflows`)
**Verification**: Workflow selector lists built-in workflows. Running a workflow shows step-by-step progress in the run pane. Cancelling a running workflow works. Cached steps show "skipped" indicator.
## Phase 4: Background Subagents (3 tasks)
### 15. Background task queue + spawn_subagent tool
Modify `apps/coder/` and `apps/server/`:
- Extend `tasks` table usage with a new task type marker for background subagent tasks
- Create `spawn_subagent` tool in `apps/server/src/services/tools/`:
- Schema: `{prompt, model?, tools?, budget?, metadata?}`
- Creates a `tasks` row with state=`pending`, type=`background_subagent`
- Returns `{task_id, status: 'pending'}` immediately — does NOT block
- Background worker loop: polls `tasks` table for `background_subagent` tasks in `pending` state, picks one up, executes it via existing agent dispatch, writes result back to tasks row on completion
- Max concurrency: `BACKGROUND_MAX_CONCURRENT` env var (default 2)
- Worker poll interval: 1s (configurable)
**Verification**: Calling `spawn_subagent` returns immediately with a task_id. The task eventually completes with a result in the tasks table. Multiple background tasks run concurrently up to the concurrency limit.
### 16. subagent_status + subagent_result tools
Create two tools in `apps/server/src/services/tools/`:
- `subagent_status(task_id)`:
- Schema: `{task_id}`
- Returns: `{task_id, status: 'pending'|'running'|'completed'|'failed', progress?: string, started_at?, finished_at?}`
- Queries `tasks` table for the status
- `subagent_result(task_id)`:
- Schema: `{task_id}`
- Returns: `{task_id, status, result?: json, error?: string}`
- Only returns result when status='completed'; returns empty result otherwise with a message
- Updates task state to `read` on successful result retrieval (optional)
**Verification**: Calling `subagent_status` on a running task returns 'running'. Calling `subagent_result` on a completed task returns the full result. Calling `subagent_result` on a pending task returns a clear "not ready yet" message.
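
The return rules for `subagent_result` can be sketched as a pure function over a task row. The `TaskRow` shape and the wording of the "not ready" message are assumptions; the real tool would first load the row from the `tasks` table.

```typescript
type TaskStatus = "pending" | "running" | "completed" | "failed";

// Hypothetical subset of a `tasks` row relevant to result retrieval.
interface TaskRow {
  task_id: string;
  status: TaskStatus;
  result?: unknown;
  error?: string;
}

interface SubagentResult {
  task_id: string;
  status: TaskStatus;
  result?: unknown;
  error?: string;
  message?: string;
}

// Returns the result only for completed tasks; otherwise explains why none is available.
function subagentResult(row: TaskRow): SubagentResult {
  switch (row.status) {
    case "completed":
      return { task_id: row.task_id, status: row.status, result: row.result };
    case "failed":
      return { task_id: row.task_id, status: row.status, error: row.error ?? "unknown error" };
    default:
      return {
        task_id: row.task_id,
        status: row.status,
        message: `Task ${row.task_id} is ${row.status}; result not ready yet.`,
      };
  }
}
```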
### 17. Background agent pane
Create `apps/web/src/components/BackgroundAgentPane.tsx`:
- New pane type showing running, completed, and failed background subagents
- Each entry: agent name/description, status badge, duration (elapsed or total), progress indicator
- Running entries: progress bar (if available), cancel button
- Completed entries: "View Result" action that opens a modal or inline view with the full output
- Failed entries: error message, "Retry" action
- Badge counter on pane tab showing number of running tasks
- Poll status every 2s for running entries, stop polling on completion
- Register in pane registry alongside existing pane types
**Verification**: Background pane shows spawning tasks as "pending", transitioning to "running", then "completed"/"failed". "View Result" shows the full output. Badge counter reflects active running tasks.
## Phase 5: Multi-modal + Cache Shape (4 tasks)
### 18. Multi-modal attachment pipeline
Add file upload support:
- Accept file uploads via drag-drop or file picker in the message input area
- Store uploaded files on tmpfs (`/tmp/boocode-uploads/` by default, configurable via `UPLOAD_DIR`)
- Reference attachments in message row via `message_parts` with `type='image'` and a `url` pointing to the tmpfs path
- Forward to DeepSeek API: encode image as base64 data URI, send as multimodal content part in the user message
- Supported formats: png, jpg, jpeg, gif, webp
- Size limit: 20MB default, configurable via `MAX_ATTACHMENT_SIZE_MB` env var
- Server-side cleanup: delete tmpfs files after message is fully processed or on a periodic sweep
**Verification**: Uploading an image creates a file on tmpfs and a referenced `message_parts` row. DeepSeek API call includes the image as a base64 content part. Error on files over size limit.
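
The format check, size limit, and base64 encoding from task 18 might be combined as in the sketch below. The function name and error messages are hypothetical, and the shape of the provider's multimodal content part is deliberately not shown because the task does not specify it.

```typescript
import { Buffer } from "node:buffer";

// Supported formats from the task description, mapped to MIME types.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
]);

// Validates an uploaded image and returns it as a base64 data URI.
// `maxSizeMb` mirrors the MAX_ATTACHMENT_SIZE_MB default of 20.
function toImageDataUri(filename: string, bytes: Uint8Array, maxSizeMb = 20): string {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  const mime = MIME_BY_EXTENSION.get(ext);
  if (mime === undefined) {
    throw new Error(`unsupported attachment format: .${ext}`);
  }
  if (bytes.byteLength > maxSizeMb * 1024 * 1024) {
    throw new Error(`attachment exceeds ${maxSizeMb}MB limit`);
  }
  return `data:${mime};base64,${Buffer.from(bytes).toString("base64")}`;
}
```

Checking size before encoding avoids allocating a base64 string (roughly a third larger than the input) for files that will be rejected anyway.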
### 19. Image render in message bubble
Update message rendering in `apps/web/src/components/MessageBubble.tsx`:
- Detect `message_parts` with `type='image'` in the message content
- Render attached images inline in the chat bubble, below the text content
- Thumbnail: max 300px wide, aspect-ratio preserved, rounded corners
- Lightbox: clicking the thumbnail opens a full-size overlay with close button
- Loading state: skeleton placeholder while image loads from tmpfs URL
- Error state: broken image placeholder with retry option
- Clean layout: images displayed in a grid (1-2 columns depending on count)
**Verification**: Chat messages with image attachments render inline thumbnails. Clicking opens lightbox. Large images are thumbnailed. Broken images show error state.
### 20. Cache shape telemetry data pipeline
Extract and store cache metrics:
- In the DeepSeek provider response handler, extract `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` from the API response metadata
- Break down cache segments: system prompt tokens, tool schema tokens, conversation history tokens (approximate by measuring each segment length)
- Store cache metrics in `tool_traces.cache_tokens` column (already created in Phase 1)
- Optionally create a `cache_stats` table for per-segment breakdown: `{turn_id, segment_name, hit_tokens, miss_tokens}`
- Expose via existing traces API (cache fields already part of the Trace schema)
**Verification**: After a DeepSeek call, `tool_traces` row has `cache_tokens` populated. Cache segment breakdown is available when querying traces.
### 21. Cache shape visualization in trace viewer
Update the TraceViewer component with cache metrics:
- Per-turn cache hit bar: horizontal stacked bar showing cached (green) vs non-cached (gray) tokens
- Hit rate percentage displayed as a badge next to token count
- Cumulative cache hit rate in the session footer: "Cache hit rate: 67% (45K/67K tokens)"
- Color coding: green ≥60%, yellow 30-59%, red <30%
- Tooltip on hover showing segment breakdown if available
- Animate transitions when new trace data arrives
**Verification**: Trace viewer shows cache hit/miss bars per turn. Cumulative rate in footer updates as new traces load. Color coding matches thresholds.
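
The hit rate, color thresholds, and footer text from task 21 can be sketched as pure helpers, assuming hit and miss token counts are already available per turn. The function names are hypothetical.

```typescript
type CacheColor = "green" | "yellow" | "red";

// Fraction of prompt tokens served from cache; 0 when there were no tokens at all.
function cacheHitRate(hitTokens: number, missTokens: number): number {
  const total = hitTokens + missTokens;
  return total === 0 ? 0 : hitTokens / total;
}

// Thresholds from the task description: green >=60%, yellow 30-59%, red <30%.
function cacheColor(rate: number): CacheColor {
  const pct = Math.round(rate * 100);
  if (pct >= 60) return "green";
  if (pct >= 30) return "yellow";
  return "red";
}

// Rounds a token count to whole thousands, e.g. 45000 -> "45K".
function formatThousands(tokens: number): string {
  return `${Math.round(tokens / 1000)}K`;
}

// Footer text in the format shown in the task description.
function cacheFooter(hitTokens: number, missTokens: number): string {
  const pct = Math.round(cacheHitRate(hitTokens, missTokens) * 100);
  const total = hitTokens + missTokens;
  return `Cache hit rate: ${pct}% (${formatThousands(hitTokens)}/${formatThousands(total)} tokens)`;
}
```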


@@ -76,6 +76,8 @@ export const MessageStartedFrame = z.object({
  message_id: Uuid,
  chat_id: Uuid.optional(),
  role: MessageRoleValue,
+ // v2.8-compare: groups messages belonging to the same compare operation.
+ compare_group_id: z.string().uuid().optional(),
  });
  export const DeltaFrame = z.object({
@@ -83,6 +85,7 @@ export const DeltaFrame = z.object({
  message_id: Uuid,
  chat_id: Uuid.optional(),
  content: z.string(),
+ compare_group_id: z.string().uuid().optional(),
  });
  export const ReasoningDeltaFrame = z.object({
@@ -107,6 +110,10 @@ export const ToolResultFrame = z.object({
  output: z.unknown(),
  truncated: z.boolean(),
  error: z.string().optional(),
+ // v2.8: unified diff for write tools (edit_file, create_file, etc.).
+ // Published alongside successful tool results so the frontend can render
+ // a compact diff snippet inline. Absent for read-only tools or failures.
+ diff: z.string().optional(),
  });
  export const MessageCompleteFrame = z.object({
@@ -132,6 +139,7 @@ export const MessageCompleteFrame = z.object({
  // web reducer can render a muted "Stopped" / failed state without a new frame
  // type. Optional → fail-closed publishFrame must keep, not strip, it.
  status: z.enum(['complete', 'cancelled', 'failed']).optional(),
+ compare_group_id: z.string().uuid().optional(),
  });
  export const UsageFrame = z.object({
@@ -168,6 +176,7 @@ export const ErrorFrame = z.object({
  chat_id: Uuid.optional(),
  error: z.string(),
  reason: ErrorReasonValue.optional(),
+ compare_group_id: z.string().uuid().optional(),
  });
  // ---- per-user channel frames (sidebar refresh) -----------------------------
@@ -355,7 +364,7 @@ export const FlowRunStepUpdatedFrame = z.object({
  type: z.literal('flow_run_step_updated'),
  run_id: Uuid,
  step_id: z.string().min(1),
- status: z.enum(['pending', 'running', 'completed', 'failed', 'skipped', 'cancelled']),
+ status: z.enum(['pending', 'running', 'completed', 'failed', 'skipped', 'cancelled', 'timed_out']),
  run_status: z.enum(['running', 'completed', 'failed', 'cancelled']).optional(),
  report: z.string().optional(),
  });
@@ -407,11 +416,144 @@ export const BattleUpdatedFrame = z.object({
  cross_exam_id: Uuid.optional(),
  });
// ---- agent snapshot restore frame ------------------------------------------
export const AgentSnapshotFrame = z.object({
type: z.literal('agent_snapshot'),
chat_id: z.string().uuid(),
agent: z.string().nullable().optional(),
model: z.string(),
mode: z.string().nullable().optional(),
turn_number: z.number().int().nonnegative(),
});
// ---- tool trace frames -----------------------------------------------------
export const ToolTraceStartFrame = z.object({
type: z.literal('tool_trace_start'),
trace_id: z.string().uuid(),
message_id: z.string().uuid(),
chat_id: z.string().uuid(),
tool_name: z.string().min(1),
tool_input: z.record(z.unknown()),
started_at: z.string().datetime(),
});
export const ToolTraceFinishFrame = z.object({
type: z.literal('tool_trace_finish'),
trace_id: z.string().uuid(),
message_id: z.string().uuid(),
chat_id: z.string().uuid(),
tool_name: z.string().min(1),
tool_output: z.union([z.string(), z.null()]).optional(),
latency_ms: z.number().int().nonnegative().optional(),
tokens_used: z.number().int().nonnegative().nullable().optional(),
cache_tokens: z.number().int().nonnegative().nullable().optional(),
reasoning_tokens: z.number().int().nonnegative().nullable().optional(),
error: z.string().optional(),
outcome: z.string().optional(),
finished_at: z.string().datetime(),
});
// ---- collision warning frame (v2.8) ----------------------------------------
//
// Published when the BooCoder detects that multiple worktrees/agents are editing
// the same file concurrently. Advisory only — writes are not blocked.
const ConflictSeverityValue = z.enum(['same_line', 'adjacent_line', 'different_area']);
export const CollisionWarningFrame = z.object({
type: z.literal('collision_warning'),
file_path: z.string().min(1),
worktrees: z.array(z.string().min(1)),
agents: z.array(z.string().min(1)),
severity: ConflictSeverityValue,
});
// ---- channel-delta frames (streaming v2) ----------------------------------
//
// Each channel frame carries a monotonic `seq` counter so the client can
// reorder out-of-order deltas per-channel, detect gaps, and request replay on
// reconnect. The `channel` discriminator tells the reducer which substate to
// update.
const TextChannelPayload = z.object({
message_id: Uuid,
chat_id: Uuid.optional(),
content: z.string(),
compare_group_id: z.string().uuid().optional(),
});
const ToolCallChannelPayload = z.object({
message_id: Uuid,
chat_id: Uuid.optional(),
tool_call: ToolCallShape,
});
const ToolResultChannelPayload = z.object({
tool_message_id: Uuid,
chat_id: Uuid.optional(),
tool_call_id: ToolCallId,
output: z.unknown(),
truncated: z.boolean(),
error: z.string().optional(),
diff: z.string().optional(),
});
const StatusChannelPayload = z.object({
message_id: Uuid,
chat_id: Uuid.optional(),
status: z.enum(['running', 'complete', 'cancelled', 'failed']).optional(),
tokens_used: z.number().int().nonnegative().nullable().optional(),
ctx_used: z.number().int().nonnegative().nullable().optional(),
ctx_max: z.number().int().positive().nullable().optional(),
cache_tokens: z.number().int().nonnegative().nullable().optional(),
reasoning_tokens: z.number().int().nonnegative().nullable().optional(),
started_at: IsoTimestamp.nullable().optional(),
finished_at: IsoTimestamp.nullable().optional(),
model: z.string().nullable().optional(),
metadata: OpaqueObject.nullable().optional(),
});
const ErrorChannelPayload = z.object({
message_id: Uuid.optional(),
chat_id: Uuid.optional(),
error: z.string(),
reason: ErrorReasonValue.optional(),
});
const ChannelDeltaPayload = z.discriminatedUnion('channel', [
z.object({ channel: z.literal('text'), ...TextChannelPayload.shape }),
z.object({ channel: z.literal('tool_call'), ...ToolCallChannelPayload.shape }),
z.object({ channel: z.literal('tool_result'), ...ToolResultChannelPayload.shape }),
z.object({ channel: z.literal('status'), ...StatusChannelPayload.shape }),
z.object({ channel: z.literal('error'), ...ErrorChannelPayload.shape }),
]);
export const ChannelDeltaFrame = z.object({
type: z.literal('channel_delta'),
seq: z.number().int().nonnegative(),
channel: z.union([
z.literal('text'), z.literal('tool_call'),
z.literal('tool_result'), z.literal('status'), z.literal('error'),
]),
message_id: Uuid.optional(),
chat_id: Uuid.optional(),
content: z.string().optional(),
tool_call: ToolCallShape.optional(),
tool_message_id: Uuid.optional(),
tool_call_id: ToolCallId.optional(),
output: z.unknown().optional(),
truncated: z.boolean().optional(),
diff: z.string().optional(),
});
// ---- discriminated union --------------------------------------------------- // ---- discriminated union ---------------------------------------------------
export const WsFrameSchema = z.discriminatedUnion('type', [ export const WsFrameSchema = z.discriminatedUnion('type', [
// per-session // per-session
SnapshotFrame, SnapshotFrame,
AgentSnapshotFrame,
MessageStartedFrame, MessageStartedFrame,
DeltaFrame, DeltaFrame,
ReasoningDeltaFrame, ReasoningDeltaFrame,
@@ -434,6 +576,13 @@ export const WsFrameSchema = z.discriminatedUnion('type', [
BattleStartedFrame, BattleStartedFrame,
ContestantUpdatedFrame, ContestantUpdatedFrame,
BattleUpdatedFrame, BattleUpdatedFrame,
// tool trace
ToolTraceStartFrame,
ToolTraceFinishFrame,
// collision warning
CollisionWarningFrame,
// channel-delta (streaming v2)
ChannelDeltaFrame,
// per-user // per-user
ChatStatusFrame, ChatStatusFrame,
SessionUpdatedFrame, SessionUpdatedFrame,
@@ -461,6 +610,7 @@ export type WsFrame = z.infer<typeof WsFrameSchema>;
// by the drift test in src/__tests__/ws-frames.test.ts. // by the drift test in src/__tests__/ws-frames.test.ts.
export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [ export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
'snapshot', 'snapshot',
'agent_snapshot',
'message_started', 'message_started',
'delta', 'delta',
'reasoning_delta', 'reasoning_delta',
@@ -481,6 +631,10 @@ export const KNOWN_FRAME_TYPES: readonly WsFrame['type'][] = [
'battle_started', 'battle_started',
'contestant_updated', 'contestant_updated',
'battle_updated', 'battle_updated',
'tool_trace_start',
'tool_trace_finish',
'collision_warning',
'channel_delta',
'chat_status', 'chat_status',
'session_updated', 'session_updated',
'session_renamed', 'session_renamed',
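
The channel-delta comment in this diff says the `seq` counter lets a client reorder out-of-order deltas per channel and detect gaps. A minimal client-side sketch of that idea follows. It is not code from this commit: `ChannelReorderBuffer`, `push`, and `gap` are hypothetical names, and it assumes `seq` is counted per channel, starts at 0, and increases by exactly 1 per frame. If the server uses a single counter shared across channels, the gap check would need to change.

```typescript
// Hypothetical helper, not part of the diff. Assumes a dense per-channel seq starting at 0.
type Channel = 'text' | 'tool_call' | 'tool_result' | 'status' | 'error';

// Reduced view of a channel_delta frame: only the fields this sketch needs.
interface ChannelDelta {
  type: 'channel_delta';
  seq: number;
  channel: Channel;
  content?: string;
}

class ChannelReorderBuffer {
  // Next seq expected on each channel.
  private nextSeq = new Map<Channel, number>();
  // Frames that arrived early, keyed by seq, per channel.
  private pending = new Map<Channel, Map<number, ChannelDelta>>();

  /** Accepts one frame and returns the frames that are now deliverable, in seq order. */
  push(frame: ChannelDelta): ChannelDelta[] {
    const expected = this.nextSeq.get(frame.channel) ?? 0;
    // Already delivered (duplicate or replayed frame): ignore it.
    if (frame.seq < expected) return [];

    let held = this.pending.get(frame.channel);
    if (!held) {
      held = new Map<number, ChannelDelta>();
      this.pending.set(frame.channel, held);
    }
    held.set(frame.seq, frame);

    // Drain every consecutive frame starting at the expected seq.
    const ready: ChannelDelta[] = [];
    let cursor = expected;
    let next = held.get(cursor);
    while (next !== undefined) {
      ready.push(next);
      held.delete(cursor);
      cursor += 1;
      next = held.get(cursor);
    }
    this.nextSeq.set(frame.channel, cursor);
    return ready;
  }

  /** Returns the first missing seq while later frames are being held, else null. */
  gap(channel: Channel): number | null {
    const held = this.pending.get(channel);
    if (!held || held.size === 0) return null;
    return this.nextSeq.get(channel) ?? 0;
  }
}
```

A reducer could call `push` for every incoming `channel_delta` frame and apply only the returned frames. A non-null `gap` after a reconnect would be the point to request replay from that `seq`, although the replay request itself is outside what this diff shows.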

15
pnpm-lock.yaml generated

@@ -97,6 +97,9 @@ importers:
apps/server: apps/server:
dependencies: dependencies:
'@ai-sdk/deepseek':
specifier: ^2.0.35
version: 2.0.35(zod@3.25.76)
'@ai-sdk/openai-compatible': '@ai-sdk/openai-compatible':
specifier: ^2.0.47 specifier: ^2.0.47
version: 2.0.47(zod@3.25.76) version: 2.0.47(zod@3.25.76)
@@ -302,6 +305,12 @@ packages:
peerDependencies: peerDependencies:
zod: ^3.25.0 || ^4.0.0 zod: ^3.25.0 || ^4.0.0
'@ai-sdk/deepseek@2.0.35':
resolution: {integrity: sha512-9DhYurbAvcurOEGN6u2myYDybrrzGfcrkG8hwmFjwTrePW6KCMggm0YxP7e8RkLYcQKqCEMgFlyEB4BM6EmiKg==}
engines: {node: '>=18'}
peerDependencies:
zod: ^3.25.76 || ^4.1.8
'@ai-sdk/gateway@3.0.119': '@ai-sdk/gateway@3.0.119':
resolution: {integrity: sha512-VAhfRWC+JexZakkVfmjaJKaTj00x7/UHdE8kMWL3NhuQAlf8oXtg9r4dfvFZrByXxchGRBvYE3biEUyibkg0xg==} resolution: {integrity: sha512-VAhfRWC+JexZakkVfmjaJKaTj00x7/UHdE8kMWL3NhuQAlf8oXtg9r4dfvFZrByXxchGRBvYE3biEUyibkg0xg==}
engines: {node: '>=18'} engines: {node: '>=18'}
@@ -4363,6 +4372,12 @@ snapshots:
dependencies: dependencies:
zod: 3.25.76 zod: 3.25.76
'@ai-sdk/deepseek@2.0.35(zod@3.25.76)':
dependencies:
'@ai-sdk/provider': 3.0.10
'@ai-sdk/provider-utils': 4.0.27(zod@3.25.76)
zod: 3.25.76
'@ai-sdk/gateway@3.0.119(zod@3.25.76)': '@ai-sdk/gateway@3.0.119(zod@3.25.76)':
dependencies: dependencies:
'@ai-sdk/provider': 3.0.10 '@ai-sdk/provider': 3.0.10