Compare commits

..

5 Commits

Author SHA1 Message Date
f32fd928b3 feat: post-review backlog hardening (cancel/parser/stall/history/9502)
Five independent items from the post-review backlog. F1: Stop on an external
agent task now aborts the running child via a per-task AbortController registry
reachable from the cancel route, and finalizes the assistant message as
cancelled (fixing two latent bugs — catch blocks left the message streaming,
and warm success-paths wrote complete on an aborted turn); warm pools/worktrees
are preserved and the native path is unchanged. F2/F3: prune the tool-call
parser to its two load-bearing exports (unexport eight zero-caller symbols, add
a gate test for the <invoke>-as-text fallback) and route placeholder-rejection
logging through pino. F6: a 90s per-chunk stall-timeout wraps native inference's
fullStream via AbortSignal.any so a hung stream finalizes the message instead of
hanging — no retry (a pure classifyStreamError helper is added). F7: a read-only
view_session_history MCP tool (newest-N, chronological). F9: retire the unused
apps/coder/web :9502 fallback SPA, keeping every API/WS/health/MCP route.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:23:11 +00:00
9a139633b8 fix(coder): drop human_inbox view before dropping tasks columns
The audit-cleanup migration dropped tasks.feature_values/worktree_path, but
human_inbox is `SELECT * FROM tasks` and pins every column, so the DROP COLUMN
failed (2BP01) on any existing DB and crash-looped boocoder on boot. Drop the
view, drop the columns, then recreate it — idempotent on fresh and existing DBs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 00:29:28 +00:00
2c4ff2063d fix: reconcile audit-cleanup refactor with @boocode/contracts SSOT (worktree-risk type, frame-emitter import)
worktree-risk.ts now returns the package's WorktreeRiskReport (local RiskReport interface removed); frame-emitter.ts imports WsFrame from @boocode/contracts/ws-frames (the deleted @boocode/server/ws-frames subpath).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:25:50 +00:00
ae3f10b19d Merge remote-tracking branch 'origin/main' 2026-06-02 21:30:28 +00:00
8c200216eb refactor: codebase audit cleanup — dead code, dedup, module splits
Multi-agent audit + aggressive cleanup across server/web/coder/booterm,
delivered behind a DEFER discipline so none of the in-flight files were
touched. Removes dead code/deps/columns, dedups server + coder helpers,
and splits the oversized modules (tools.ts, opencode-server.ts,
sentinel-summaries, turn.ts, TerminalPane.tsx) behind stable contracts.
Adds 78 parity/unit tests (server 587, coder 323); fixes two latent bugs
(ChatPane queue keys, FileViewerOverlay blank-line parity).

Intended tag: v2.7.12-audit-cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 21:12:29 +00:00
185 changed files with 7727 additions and 8315 deletions

View File

@@ -6,6 +6,10 @@ All notable changes per release tag. Most recent on top, ordered by tag creation
Creates `@boocode/contracts` (`packages/contracts`), a new workspace package that becomes the single source of truth for every cross-app wire contract — reversing the decision recorded in `v2.5.12-provider-lifecycle-phase4` that declined a shared types package as not worth the Docker/build-order risk at solo scale; a live `AgentSessionConfig` drift that had since appeared between `apps/coder` and `apps/web` justified the investment. Six contracts are now defined exactly once: the `WsFrameSchema` Zod runtime schema, the provider snapshot types (`ProviderSnapshotEntry` and family), the Zod provider-config schemas, `MessageMetadata` + `ErrorReason`, `AgentSessionConfig`, and `WorktreeRiskReport`; both Zod-backed contracts use `z.infer` so validator and type derive from the same definition and cannot drift independently. All four consumers — `apps/server`, `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web` — import via `workspace:*` through a per-subpath exports map consuming built dist only (no tsconfig project references); the hand-synced copies and their parity tests (`provider-types-parity.test.ts`; the ws-frames byte-parity assertion) are deleted while the KNOWN_FRAME_TYPES drift test and broker fail-closed tests are preserved. Build order is inverted in the root build script, Dockerfile, and coder deploy docs; `apps/coder/web`'s migration also removed dead `pending_change_*` reducer arms (no frame publisher exists for these — pending changes are HTTP-delivered), closing a latent missing-default-arm crash, and reconciled field-type conflicts with the canonical `WsFrame`; zod is pinned to a single version across the workspace. Server 543 / coder 293 / contracts 11 tests passing; human smoke verified on the live stack 2026-06-02. Creates `@boocode/contracts` (`packages/contracts`), a new workspace package that becomes the single source of truth for every cross-app wire contract — reversing the decision recorded in `v2.5.12-provider-lifecycle-phase4` that declined a shared types package as not worth the Docker/build-order risk at solo scale; a live `AgentSessionConfig` drift that had since appeared between `apps/coder` and `apps/web` justified the investment. Six contracts are now defined exactly once: the `WsFrameSchema` Zod runtime schema, the provider snapshot types (`ProviderSnapshotEntry` and family), the Zod provider-config schemas, `MessageMetadata` + `ErrorReason`, `AgentSessionConfig`, and `WorktreeRiskReport`; both Zod-backed contracts use `z.infer` so validator and type derive from the same definition and cannot drift independently. All four consumers — `apps/server`, `apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web` — import via `workspace:*` through a per-subpath exports map consuming built dist only (no tsconfig project references); the hand-synced copies and their parity tests (`provider-types-parity.test.ts`; the ws-frames byte-parity assertion) are deleted while the KNOWN_FRAME_TYPES drift test and broker fail-closed tests are preserved. Build order is inverted in the root build script, Dockerfile, and coder deploy docs; `apps/coder/web`'s migration also removed dead `pending_change_*` reducer arms (no frame publisher exists for these — pending changes are HTTP-delivered), closing a latent missing-default-arm crash, and reconciled field-type conflicts with the canonical `WsFrame`; zod is pinned to a single version across the workspace. Server 543 / coder 293 / contracts 11 tests passing; human smoke verified on the live stack 2026-06-02.
## v2.7.12-audit-cleanup — 2026-06-02
A repo-wide audit and aggressive cleanup pass, run as a multi-agent orchestration (five read-only Opus auditors over server/web/coder/booterm + cross-cutting deps/build/parity + a structural-architecture lens) followed by phased, behavior-preserving implementation — every change gated on the per-app test suites and delivered behind a strict DEFER discipline that never touched the files in flight for `v2.7.9``v2.7.11` (`mcp-config`, the `ws-frames` pair, `dispatcher`, `claude-sdk-map`, `AgentComposerBar`/`CoderMessageList`/`CoderPane`), so the branch rebased onto current main with zero conflicts. **Dead code/deps/schema**: removed ~9 dead files and a swathe of dead exports/write-only state across all four apps, dropped dead deps (`next-themes`, `@xterm/addon-webgl`, booterm `tslib`; `shadcn`→devDep), and idempotently dropped dead schema columns/tables (`sessions.tags`, `tasks.worktree_path`/`feature_values`, `available_agents.supports_mcp_client`, the superseded `session_worktrees` table, the always-empty `list_worktrees` MCP tool) — chat/session/message DATA untouched, only never-read columns. **Server dedup + reshapes**: collapsed the dead `budget.ts` tier system (surfacing a latent `READ_ONLY_TOOL_NAMES` drift, then deleted), extracted shared `MESSAGE_COLUMNS`/`selectProject`/`stripQuotes`/`SENTINEL_KINDS`/`samplerOptsFromAgent`/`createContentFlusher`/`insertSentinel`/a `makeCodecontextTool` factory/a pending-tool-call resolver, split `tools.ts` (799→46 barrel + `tools/{types,fs-tools,misc-tools,registry,tiers}`, register-through registry preserved so coder's import contract stays byte-stable), and decomposed the inference pipeline (`sentinel-summaries``runWrapUpSummary`, `turn.ts``turn-config`+`step-decision`, a pure `stream-phase-adapter`, shared finalize atoms — stopping short of fusing synthesis to preserve frame timing). **Coder reshapes**: split the 1062-line `opencode-server.ts` god-class into supervisor / sse-loop / pure event-map / port-utils + extracted `buildAcpClient`/`makeFrameEmitter`/`worktree-risk`, plus happy-path-safe concurrency hardening (reconnect backoff, double-spawn guard; a defensive busy-assert + ensureSession coalescing flagged for review). **Web**: `React.memo` on `MessageBubble`/`MarkdownRenderer` + module-hoisted markdown components (the streaming re-parse was the biggest perf cost), shared `linkifyPaths`/artifact/tab dedup, two latent bug fixes (`ChatPane` index-keys → stable ids; `FileViewerOverlay` blank-line line-number desync), and decomposed the 1298-line `TerminalPane.tsx` into fit/socket/selection hooks + presentational pieces (verbatim move, all ~30 listeners/timers inventoried; the label-dep fix stops a live terminal tearing down on pane renumber). +78 parity/unit tests (server 597, coder 328 green; `apps/web` has no harness, so its changes are typecheck + manual/device QA). Net ≈ 4,600 LOC. Deferred (designed; blueprints in the audit reports): the `tasks` dual-CREATE / `project_id` FK (a cross-service deploy-ordering decision, not a data migration), web structural decomposition of `useWorkspacePanes`/`MessageBubble` (needs a web test harness first), a `@boocode/contracts` shared package, and the `dispatcher.ts` split — the last two now unblocked since their in-flight files shipped in `v2.7.9``v2.7.11`. Rebased clean onto `v2.7.11-coder-model-snapshot`.
## v2.7.11-coder-model-snapshot — 2026-06-02 ## v2.7.11-coder-model-snapshot — 2026-06-02
Hotfix for the coder model-attribution chip vanishing on refresh. The chip showed during a live turn (the `message_complete` frame carries `model`) but disappeared when a BooCoder session was reloaded — only in the coder, not BooChat. Root cause: `CoderPane`'s `useCoderMessages` hydrates from two sources on load — the HTTP `listMessages` fetch (whose SELECT includes `model`, added `v2.7.8`) AND the WS `snapshot` frame — and the WS snapshot's query in `apps/coder/src/routes/ws.ts` had its own column list that omitted `model`. The client's `snapshot` handler `setMessages`-overwrites the HTTP load, so the model-less rows won, and with no later `message_complete` for historical messages the chip stayed gone. Fix is one column: add `model` to the WS snapshot SELECT so both hydration paths agree. The `apps/coder/CLAUDE.md` "update every mapper" note now lists the WS snapshot SELECT explicitly (it was the one place not enumerated). apps/server + apps/coder builds green; deployed via `systemctl restart boocoder` (host service — the earlier `v2.7.10` docker deploy rebuilt only the container, never this route). Fixes the chip shipped in `v2.7.8-ember-coder-tabs-model-chips` / completed in `v2.7.9-mcp-keys-docs-coder-fixes`. Hotfix for the coder model-attribution chip vanishing on refresh. The chip showed during a live turn (the `message_complete` frame carries `model`) but disappeared when a BooCoder session was reloaded — only in the coder, not BooChat. Root cause: `CoderPane`'s `useCoderMessages` hydrates from two sources on load — the HTTP `listMessages` fetch (whose SELECT includes `model`, added `v2.7.8`) AND the WS `snapshot` frame — and the WS snapshot's query in `apps/coder/src/routes/ws.ts` had its own column list that omitted `model`. The client's `snapshot` handler `setMessages`-overwrites the HTTP load, so the model-less rows won, and with no later `message_complete` for historical messages the chip stayed gone. Fix is one column: add `model` to the WS snapshot SELECT so both hydration paths agree. The `apps/coder/CLAUDE.md` "update every mapper" note now lists the WS snapshot SELECT explicitly (it was the one place not enumerated). apps/server + apps/coder builds green; deployed via `systemctl restart boocoder` (host service — the earlier `v2.7.10` docker deploy rebuilt only the container, never this route). Fixes the chip shipped in `v2.7.8-ember-coder-tabs-model-chips` / completed in `v2.7.9-mcp-keys-docs-coder-fixes`.

View File

@@ -1,10 +1,9 @@
# Current focus # Current focus
Last updated: 2026-05-26 Last updated: 2026-06-02
- **Batch:** v2.3-provider-lifecycle (openspec drafted; not started) - **Last shipped:** `v2.7.8-ember-coder-tabs-model-chips` (2026-06-01)
- **Branch:** `main` - **Branch:** `codebase-audit-cleanup` (audit + cleanup epic, off main HEAD)
- **Blockers:** none - **In progress:** Phase 3 — stale comments + docs refresh
- **Last shipped:** `v2.2.2-xml-placeholder-reject`
Update this file when starting or finishing a batch. Agents: read this first for session intent; if stale vs `CHANGELOG.md`, trust CHANGELOG for shipped state. See `CHANGELOG.md` for the full shipped history. That file is always authoritative; this file is a quick orientation pointer only.

View File

@@ -15,7 +15,6 @@
"fastify": "^4.28.1", "fastify": "^4.28.1",
"node-pty": "^1.0.0", "node-pty": "^1.0.0",
"pg": "^8.13.0", "pg": "^8.13.0",
"tslib": "^2.6.3",
"zod": "^3.23.8" "zod": "^3.23.8"
}, },
"devDependencies": { "devDependencies": {

View File

@@ -9,7 +9,7 @@ const ConfigSchema = z.object({
TMUX_CONF_PATH: z.string().default('/etc/booterm/tmux.conf'), TMUX_CONF_PATH: z.string().default('/etc/booterm/tmux.conf'),
}); });
export type Config = z.infer<typeof ConfigSchema>; type Config = z.infer<typeof ConfigSchema>;
let cached: Config | null = null; let cached: Config | null = null;

View File

@@ -10,7 +10,7 @@ export function getPool(databaseUrl: string): pg.Pool {
return pool; return pool;
} }
export interface SessionInfo { interface SessionInfo {
id: string; id: string;
project_id: string; project_id: string;
project_path: string; project_path: string;

View File

@@ -1,7 +1,7 @@
import * as pty from 'node-pty'; import * as pty from 'node-pty';
import type { IPty } from 'node-pty'; import type { IPty } from 'node-pty';
export interface AttachPtyOptions { interface AttachPtyOptions {
sessionName: string; sessionName: string;
projectRoot: string; projectRoot: string;
cols: number; cols: number;

View File

@@ -7,7 +7,6 @@ WORKDIR /build
COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./ COPY package.json pnpm-workspace.yaml pnpm-lock.yaml tsconfig.base.json ./
COPY apps/server/package.json ./apps/server/ COPY apps/server/package.json ./apps/server/
COPY apps/coder/package.json ./apps/coder/ COPY apps/coder/package.json ./apps/coder/
COPY apps/coder/web/package.json ./apps/coder/web/
RUN pnpm install --frozen-lockfile RUN pnpm install --frozen-lockfile
@@ -16,7 +15,6 @@ COPY apps/server ./apps/server
RUN pnpm -C apps/server build RUN pnpm -C apps/server build
COPY apps/coder ./apps/coder COPY apps/coder ./apps/coder
RUN pnpm -C apps/coder/web build
RUN pnpm -C apps/coder build RUN pnpm -C apps/coder build
RUN pnpm deploy --filter=@boocode/coder --prod --legacy /out/coder RUN pnpm deploy --filter=@boocode/coder --prod --legacy /out/coder
@@ -27,7 +25,6 @@ RUN apt-get update && apt-get install -y --no-install-recommends ripgrep git ope
WORKDIR /app WORKDIR /app
COPY --from=builder /out/coder ./ COPY --from=builder /out/coder ./
COPY --from=builder /build/apps/coder/web/dist ./web
ENV NODE_ENV=production ENV NODE_ENV=production
EXPOSE 3000 EXPOSE 3000

View File

@@ -17,7 +17,6 @@
"@agentclientprotocol/sdk": "^0.22.1", "@agentclientprotocol/sdk": "^0.22.1",
"@anthropic-ai/claude-agent-sdk": "^0.3.159", "@anthropic-ai/claude-agent-sdk": "^0.3.159",
"@boocode/server": "workspace:*", "@boocode/server": "workspace:*",
"@fastify/static": "^7.0.4",
"@fastify/websocket": "^10.0.1", "@fastify/websocket": "^10.0.1",
"@modelcontextprotocol/sdk": "^1.29.0", "@modelcontextprotocol/sdk": "^1.29.0",
"@opencode-ai/sdk": "~1.15.0", "@opencode-ai/sdk": "~1.15.0",

View File

@@ -1,12 +1,5 @@
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';
import { existsSync } from 'node:fs';
import Fastify from 'fastify'; import Fastify from 'fastify';
import fastifyWebsocket from '@fastify/websocket'; import fastifyWebsocket from '@fastify/websocket';
import fastifyStatic from '@fastify/static';
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
import { loadConfig } from './config.js'; import { loadConfig } from './config.js';
import { getSql, applySchema, pingDb, closeDb } from './db.js'; import { getSql, applySchema, pingDb, closeDb } from './db.js';
import { startMcpServer } from './services/mcp-server.js'; import { startMcpServer } from './services/mcp-server.js';
@@ -257,7 +250,7 @@ async function main() {
registerPendingRoutes(app, sql); registerPendingRoutes(app, sql);
registerCheckpointRoutes(app, sql); registerCheckpointRoutes(app, sql);
registerAgentSessionRoutes(app, sql); registerAgentSessionRoutes(app, sql);
registerTaskRoutes(app, sql, inferenceApi); registerTaskRoutes(app, sql, inferenceApi, dispatcher.cancelExternalTask);
registerInboxRoutes(app, sql); registerInboxRoutes(app, sql);
registerStatsRoutes(app, sql); registerStatsRoutes(app, sql);
registerArenaRoutes(app, sql); registerArenaRoutes(app, sql);
@@ -266,28 +259,6 @@ async function main() {
registerLifecycleRoutes(app, sql); registerLifecycleRoutes(app, sql);
registerWebSocket(app, sql, broker); registerWebSocket(app, sql, broker);
// Serve static frontend (built web app). In production, the dist/ is
// copied to ../web relative to the dist/ directory at /app/web. In dev,
// check adjacent to the source.
const webRoot = resolve(__dirname, '../web');
if (existsSync(webRoot)) {
await app.register(fastifyStatic, {
root: webRoot,
prefix: '/',
// Don't intercept /api routes — static only serves files that exist.
wildcard: false,
});
// SPA fallback: serve index.html for non-API routes that don't match a file.
app.setNotFoundHandler(async (req, reply) => {
if (req.url.startsWith('/api')) {
reply.code(404);
return { error: 'not found' };
}
return reply.sendFile('index.html');
});
app.log.info(`serving frontend from ${webRoot}`);
}
// Graceful shutdown // Graceful shutdown
const shutdown = async () => { const shutdown = async () => {
app.log.info('shutting down'); app.log.info('shutting down');

View File

@@ -0,0 +1,138 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import Fastify, { type FastifyInstance } from 'fastify';
import postgres from 'postgres';
import { registerTaskRoutes } from '../tasks.js';
/**
* F1 — POST /api/tasks/:id/cancel route wiring.
*
* The route's job: reach the in-flight external run via `cancelExternal(taskId)`
* (the new abort hook), keep cancelling native inference for open chats unchanged,
* and land the task row in 'cancelled'. The streaming assistant message is
* finalized by the dispatcher's run-function, not here — that path is covered by
* finalize-message.test.ts. This suite pins the route's behavior against a real DB.
*/
describe.runIf(!!process.env.DATABASE_URL)('POST /api/tasks/:id/cancel (route, F1)', () => {
let sql: ReturnType<typeof postgres>;
let app: FastifyInstance;
let projectId: string;
let sessionId: string;
let chatId: string;
const externalCancelCalls: string[] = [];
const inferenceCancelCalls: Array<[string, string]> = [];
let externalReturns = true;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
const [p] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('f1-cancel-route', '/tmp/f1-cancel-route', 'open') RETURNING id
`;
projectId = p!.id;
const [s] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status) VALUES (${projectId}, 'f1', 'm', 'open') RETURNING id
`;
sessionId = s!.id;
const [c] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = c!.id;
app = Fastify();
registerTaskRoutes(
app,
sql,
{
cancel: async (sid: string, cid: string) => {
inferenceCancelCalls.push([sid, cid]);
return false;
},
},
(taskId: string) => {
externalCancelCalls.push(taskId);
return externalReturns;
},
);
await app.ready();
});
afterAll(async () => {
if (app) await app.close();
if (!sql) return;
await sql`DELETE FROM messages WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM tasks WHERE project_id = ${projectId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
});
async function insertTask(agent: string | null, state: string): Promise<string> {
const [t] = await sql<{ id: string }[]>`
INSERT INTO tasks (project_id, input, agent, session_id, state, started_at)
VALUES (${projectId}, 'do a thing', ${agent}, ${sessionId}, ${state}, clock_timestamp())
RETURNING id
`;
return t!.id;
}
it('reaches cancelExternal and lands the task cancelled for a running external task', async () => {
externalReturns = true;
externalCancelCalls.length = 0;
const taskId = await insertTask('opencode', 'running');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(200);
expect(res.json()).toEqual({ cancelled: true });
expect(externalCancelCalls).toContain(taskId);
const [row] = await sql<{ state: string; ended_at: Date | null }[]>`
SELECT state, ended_at FROM tasks WHERE id = ${taskId}
`;
expect(row!.state).toBe('cancelled');
expect(row!.ended_at).not.toBeNull();
});
it('still cancels a native boocode task (cancelExternal returns false → inference.cancel path unchanged)', async () => {
externalReturns = false; // native task: no controller registered
externalCancelCalls.length = 0;
inferenceCancelCalls.length = 0;
const taskId = await insertTask(null, 'running');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(200);
// The route calls cancelExternal unconditionally (cheap, returns false here)...
expect(externalCancelCalls).toContain(taskId);
// ...and the native inference.cancel path still fires for the open chat.
expect(inferenceCancelCalls).toContainEqual([sessionId, chatId]);
const [row] = await sql<{ state: string }[]>`SELECT state FROM tasks WHERE id = ${taskId}`;
expect(row!.state).toBe('cancelled');
});
it('rejects cancelling an already-terminal task with 409 and never touches the abort hook', async () => {
externalCancelCalls.length = 0;
const taskId = await insertTask('opencode', 'completed');
const res = await app.inject({ method: 'POST', url: `/api/tasks/${taskId}/cancel` });
expect(res.statusCode).toBe(409);
expect(externalCancelCalls).not.toContain(taskId);
});
it('returns 404 for an unknown task', async () => {
const res = await app.inject({
method: 'POST',
url: `/api/tasks/00000000-0000-0000-0000-000000000000/cancel`,
});
expect(res.statusCode).toBe(404);
});
});

View File

@@ -8,6 +8,12 @@ interface InferenceApi {
cancel: (sessionId: string, chatId: string) => Promise<boolean>; cancel: (sessionId: string, chatId: string) => Promise<boolean>;
} }
// F1: the dispatcher's reach into an in-flight external-agent run. Narrow by
// design (not the whole dispatcher) — the route only needs to fire the abort.
// Returns true when a controller was registered for the task (an external run was
// in flight), false otherwise (native boocode task, or already finished).
export type ExternalCancelFn = (taskId: string) => boolean;
const CreateBody = z.object({ const CreateBody = z.object({
project_id: z.string().uuid(), project_id: z.string().uuid(),
input: z.string().min(1).max(64_000), input: z.string().min(1).max(64_000),
@@ -27,7 +33,12 @@ const ListQuery = z.object({
project_id: z.string().uuid().optional(), project_id: z.string().uuid().optional(),
}); });
export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: InferenceApi): void { export function registerTaskRoutes(
app: FastifyInstance,
sql: Sql,
inference: InferenceApi,
cancelExternal: ExternalCancelFn,
): void {
// POST /api/tasks — create a new task // POST /api/tasks — create a new task
app.post('/api/tasks', async (req, reply) => { app.post('/api/tasks', async (req, reply) => {
const parsed = CreateBody.safeParse(req.body); const parsed = CreateBody.safeParse(req.body);
@@ -95,7 +106,7 @@ export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: In
// GET /api/tasks/:id — single task detail // GET /api/tasks/:id — single task detail
app.get<{ Params: { id: string } }>('/api/tasks/:id', async (req, reply) => { app.get<{ Params: { id: string } }>('/api/tasks/:id', async (req, reply) => {
const rows = await sql` const rows = await sql`
SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, execution_path, worktree_path, session_id, cost_tokens, started_at, ended_at, created_at SELECT id, project_id, parent_task_id, state, input, output_summary, agent, model, execution_path, session_id, cost_tokens, started_at, ended_at, created_at
FROM tasks FROM tasks
WHERE id = ${req.params.id} WHERE id = ${req.params.id}
`; `;
@@ -127,7 +138,14 @@ export function registerTaskRoutes(app: FastifyInstance, sql: Sql, inference: In
cancelPendingPermission(taskId); cancelPendingPermission(taskId);
// If running, try to cancel inference // F1: abort the in-flight external-agent run (opencode / goose / qwen / claude).
// Idempotent — a double-Stop re-aborts harmlessly; a native boocode task is not
// registered, so this returns false and the inference.cancel path below handles
// it unchanged. The dispatcher's run-function finalizes the streaming assistant
// message as 'cancelled' once the backend honors the signal.
cancelExternal(taskId);
// If running, try to cancel inference (native boocode path — unchanged).
if ((task.state === 'running' || task.state === 'blocked') && task.session_id) { if ((task.state === 'running' || task.state === 'blocked') && task.session_id) {
// Find active chat in the task's session // Find active chat in the task's session
const chats = await sql<{ id: string }[]>` const chats = await sql<{ id: string }[]>`

View File

@@ -9,7 +9,7 @@
*/ */
import type { FastifyInstance } from 'fastify'; import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import { checkWorktreeWorkAtRisk, stashWorktree } from '../services/worktrees.js'; import { checkWorktreeWorkAtRisk, stashWorktree } from '../services/worktree-risk.js';
export function registerWorktreeSafetyRoutes(app: FastifyInstance, sql: Sql): void { export function registerWorktreeSafetyRoutes(app: FastifyInstance, sql: Sql): void {
// GET risk for a session's worktree(s). One row per session today (PK on // GET risk for a session's worktree(s). One row per session today (PK on

View File

@@ -25,7 +25,6 @@ CREATE TABLE IF NOT EXISTS tasks (
agent TEXT, agent TEXT,
model TEXT, model TEXT,
execution_path TEXT, execution_path TEXT,
worktree_path TEXT,
cost_tokens INTEGER, cost_tokens INTEGER,
started_at TIMESTAMPTZ, started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ, ended_at TIMESTAMPTZ,
@@ -39,9 +38,9 @@ CREATE TABLE IF NOT EXISTS available_agents (
install_path TEXT, install_path TEXT,
version TEXT, version TEXT,
supports_acp BOOLEAN NOT NULL DEFAULT false, supports_acp BOOLEAN NOT NULL DEFAULT false,
supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
last_probed_at TIMESTAMPTZ last_probed_at TIMESTAMPTZ
); );
ALTER TABLE available_agents DROP COLUMN IF EXISTS supports_mcp_client;
-- v2.0.0 Phase 4: link tasks to their inference sessions. -- v2.0.0 Phase 4: link tasks to their inference sessions.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id UUID REFERENCES sessions(id); ALTER TABLE tasks ADD COLUMN IF NOT EXISTS session_id UUID REFERENCES sessions(id);
@@ -74,31 +73,16 @@ ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS commands JSONB DEFAULT '[]
-- v2.2.0: Paseo-style session config on tasks. -- v2.2.0: Paseo-style session config on tasks.
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS mode_id TEXT; ALTER TABLE tasks ADD COLUMN IF NOT EXISTS mode_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS thinking_option_id TEXT; ALTER TABLE tasks ADD COLUMN IF NOT EXISTS thinking_option_id TEXT;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS feature_values JSONB; -- tasks.feature_values and tasks.worktree_path were never read or written by any
-- code path; drop them from existing DBs (fresh DBs never had them in the CREATE).
-- v2.6: one shared worktree per session (all agents/panes in the session operate in it). -- human_inbox is `SELECT *` over tasks, so it pins every task column — dropping a
CREATE TABLE IF NOT EXISTS session_worktrees ( -- column while the view exists fails (2BP01). Drop the view, drop the columns, then
session_id UUID PRIMARY KEY REFERENCES sessions(id) ON DELETE CASCADE, -- recreate it with the current column set (idempotent on fresh + existing DBs).
worktree_path TEXT NOT NULL, DROP VIEW IF EXISTS human_inbox;
base_commit TEXT, ALTER TABLE tasks DROP COLUMN IF EXISTS feature_values;
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp() ALTER TABLE tasks DROP COLUMN IF EXISTS worktree_path;
); CREATE OR REPLACE VIEW human_inbox AS
-- P1.5-b: DEFANG the CASCADE — a session delete must no longer wipe its worktree SELECT * FROM tasks WHERE state IN ('blocked', 'failed');
-- row. This table is SUPERSEDED by `worktrees` below; all readers are repointed
-- this phase, so the row just persists (dead) on session delete until a later
-- cleanup drops the table. session_id is this table's PRIMARY KEY, so it cannot be
-- nullable → SET NULL is invalid and NO ACTION/RESTRICT would block deletes; the
-- only valid defang is to drop the FK with no replacement. Idempotent: only fires
-- while the FK is still ON DELETE CASCADE ('c').
DO $$ BEGIN
IF EXISTS (
SELECT 1 FROM pg_constraint
WHERE conname = 'session_worktrees_session_id_fkey'
AND confdeltype = 'c'
) THEN
ALTER TABLE session_worktrees DROP CONSTRAINT session_worktrees_session_id_fkey;
END IF;
END $$;
-- v2.6: one backend session per (session, agent); resumed on switch-back. -- v2.6: one backend session per (session, agent); resumed on switch-back.
CREATE TABLE IF NOT EXISTS agent_sessions ( CREATE TABLE IF NOT EXISTS agent_sessions (
@@ -168,12 +152,9 @@ CREATE TABLE IF NOT EXISTS worktrees (
); );
CREATE UNIQUE INDEX IF NOT EXISTS worktrees_active_path_uidx ON worktrees(path) WHERE status='active'; CREATE UNIQUE INDEX IF NOT EXISTS worktrees_active_path_uidx ON worktrees(path) WHERE status='active';
-- Migrate any surviving session_worktrees rows → worktrees (idempotent; 0 rows -- session_worktrees was superseded by worktrees (v2.6/P1.5-b); all rows migrated
-- after the test-session delete, kept for generality / fresh-DB safety). -- before P2 cleanup. Drop the dead table; no-op on fresh DBs that never had it.
INSERT INTO worktrees (session_id, path, branch, base_commit, status) DROP TABLE IF EXISTS session_worktrees;
SELECT sw.session_id, sw.worktree_path, 'session-' || sw.session_id, sw.base_commit, 'active'
FROM session_worktrees sw
WHERE NOT EXISTS (SELECT 1 FROM worktrees w WHERE w.session_id = sw.session_id AND w.status='active');
-- Dispatch hint: which chat (tab) a task belongs to. The coder message route and -- Dispatch hint: which chat (tab) a task belongs to. The coder message route and
-- skills route set it from the frontend tab; session-less creators (arena, MCP, -- skills route set it from the frontend tab; session-less creators (arena, MCP,

View File

@@ -0,0 +1,74 @@
import { describe, it, expect, vi } from 'vitest';
import type { RequestPermissionRequest, CreateElicitationRequest, SessionNotification } from '@agentclientprotocol/sdk';
import { buildAcpClient, type AcpTurnContext } from '../acp-client.js';
/**
* buildAcpClient (v2.7 audit reshape): the shared ACP `Client` closures. These
* tests cover the pure routing decisions that don't require the permission-waiter
* broker machinery — the auto-select/decline fallbacks and the between-turns drop.
*/
describe('buildAcpClient — sessionUpdate', () => {
it('drops the update when no turn is active (resolveTurn → null)', async () => {
const client = buildAcpClient('/wt', () => null);
// Must resolve without throwing and without an onSessionUpdate to call.
await expect(client.sessionUpdate({ sessionId: 's', update: {} } as unknown as SessionNotification)).resolves.toBeUndefined();
});
it('forwards the update to the active turn', async () => {
const onSessionUpdate = vi.fn();
const turn: AcpTurnContext = { taskId: 't', sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate };
const client = buildAcpClient('/wt', () => turn);
const note = { sessionId: 's', update: {} } as unknown as SessionNotification;
await client.sessionUpdate(note);
expect(onSessionUpdate).toHaveBeenCalledWith(note);
});
});
describe('buildAcpClient — requestPermission fallback (no UI routing)', () => {
function req(options: Array<{ optionId: string }>): RequestPermissionRequest {
return { options } as unknown as RequestPermissionRequest;
}
it('auto-selects the first option when there is no turn', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.requestPermission(req([{ optionId: 'allow' }, { optionId: 'deny' }]));
expect(res).toEqual({ outcome: { outcome: 'selected', optionId: 'allow' } });
});
it('cancels when there is no turn and no options', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.requestPermission(req([]));
expect(res).toEqual({ outcome: { outcome: 'cancelled' } });
});
it('auto-selects when the turn has no taskId (UI routing gated off)', async () => {
const turn: AcpTurnContext = { taskId: undefined, sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate: () => {} };
const client = buildAcpClient('/wt', () => turn);
const res = await client.requestPermission(req([{ optionId: 'ok' }]));
expect(res).toEqual({ outcome: { outcome: 'selected', optionId: 'ok' } });
});
});
describe('buildAcpClient — elicitation fallback', () => {
it('declines when there is no turn', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.unstable_createElicitation!({} as CreateElicitationRequest);
expect(res).toEqual({ action: 'decline' });
});
it('declines when the turn has no taskId', async () => {
const turn: AcpTurnContext = { taskId: undefined, sessionId: 's', modeId: undefined, agent: 'goose', onSessionUpdate: () => {} };
const client = buildAcpClient('/wt', () => turn);
const res = await client.unstable_createElicitation!({} as CreateElicitationRequest);
expect(res).toEqual({ action: 'decline' });
});
});
describe('buildAcpClient — createTerminal', () => {
it('returns the noop terminal id', async () => {
const client = buildAcpClient('/wt', () => null);
const res = await client.createTerminal!({} as never);
expect(res).toEqual({ terminalId: 'noop' });
});
});

View File

@@ -0,0 +1,51 @@
import { describe, it, expect } from 'vitest';
import { createCancelRegistry } from '../cancel-registry.js';
/**
* F1 — per-task abort wiring. The registry is the missing link between the Stop
* route and the in-flight external run: register an AbortController per task id,
* cancel(taskId) aborts its signal, the run's .finally deletes it. Pure (no DB /
* child / IO) so the abort + idempotency contract is unit-testable in isolation.
*/
describe('CancelRegistry (F1 abort wiring)', () => {
it('register hands back a fresh controller; cancel aborts its signal', () => {
const reg = createCancelRegistry();
const ac = reg.register('t1');
expect(ac.signal.aborted).toBe(false);
expect(reg.has('t1')).toBe(true);
expect(reg.cancel('t1')).toBe(true);
expect(ac.signal.aborted).toBe(true);
});
it('cancel on an unknown task returns false (native task / cancel-before-register)', () => {
const reg = createCancelRegistry();
expect(reg.has('nope')).toBe(false);
expect(reg.cancel('nope')).toBe(false);
});
it('double-Stop is idempotent: a second cancel never throws and the signal stays aborted', () => {
const reg = createCancelRegistry();
const ac = reg.register('t1');
expect(reg.cancel('t1')).toBe(true);
// The run-function has not hit its .finally yet, so the entry is still
// present — a rapid second Stop re-aborts (abort() no-ops) without throwing.
expect(() => reg.cancel('t1')).not.toThrow();
expect(reg.cancel('t1')).toBe(true);
expect(ac.signal.aborted).toBe(true);
});
it('cancel after delete returns false (cancel-after-natural-exit is safe)', () => {
const reg = createCancelRegistry();
reg.register('t1');
reg.delete('t1');
expect(reg.has('t1')).toBe(false);
expect(reg.cancel('t1')).toBe(false);
});
it('delete of an unknown id is a no-op (never throws)', () => {
const reg = createCancelRegistry();
expect(() => reg.delete('ghost')).not.toThrow();
});
});

View File

@@ -0,0 +1,163 @@
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { readFileSync } from 'node:fs';
import { resolve } from 'node:path';
import postgres from 'postgres';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import { classifyTerminalStatus, finalizeStreamingMessage } from '../finalize-message.js';
/**
* F1 (D-7 / OCE-001 / OCE-002) — finalizing a Stop'd or errored external turn.
*
* `classifyTerminalStatus` is the pure D-7 decision (user Stop / AbortError →
* cancelled, genuine error → failed). `finalizeStreamingMessage` writes that
* terminal state onto the streaming assistant row and publishes the matching
* message_complete frame — idempotently, guarded by `WHERE status='streaming'`,
* so a double-Stop or an abort-then-catch settles the message exactly once and
* never clobbers a row that already finished cleanly.
*/
describe('classifyTerminalStatus (pure, D-7)', () => {
it('maps a fired abort signal to cancelled (user Stop)', () => {
expect(classifyTerminalStatus({ aborted: true })).toBe('cancelled');
});
it('maps a thrown AbortError to cancelled', () => {
const e = new Error('the operation was aborted');
e.name = 'AbortError';
expect(classifyTerminalStatus({ aborted: false, error: e })).toBe('cancelled');
});
it('maps a genuine thrown error to failed', () => {
expect(classifyTerminalStatus({ aborted: false, error: new Error('boom') })).toBe('failed');
});
it('defaults a no-abort / no-error catch to failed', () => {
expect(classifyTerminalStatus({ aborted: false })).toBe('failed');
});
});
describe.runIf(!!process.env.DATABASE_URL)('finalizeStreamingMessage (DB)', () => {
let sql: ReturnType<typeof postgres>;
let projectId: string;
let sessionId: string;
let chatId: string;
beforeAll(async () => {
sql = postgres(process.env.DATABASE_URL!, { max: 3 });
// Server schema owns messages/sessions/chats (FK targets); coder schema after.
const serverSchema = resolve(__dirname, '../../../../server/src/schema.sql');
const coderSchema = resolve(__dirname, '../../schema.sql');
await sql.unsafe(readFileSync(serverSchema, 'utf8'));
await sql.unsafe(readFileSync(coderSchema, 'utf8'));
const [p] = await sql<{ id: string }[]>`
INSERT INTO projects (name, path, status) VALUES ('f1-finalize', '/tmp/f1-finalize', 'open') RETURNING id
`;
projectId = p!.id;
const [s] = await sql<{ id: string }[]>`
INSERT INTO sessions (project_id, name, model, status) VALUES (${projectId}, 'f1', 'm', 'open') RETURNING id
`;
sessionId = s!.id;
const [c] = await sql<{ id: string }[]>`
INSERT INTO chats (session_id, name, status) VALUES (${sessionId}, 'tab', 'open') RETURNING id
`;
chatId = c!.id;
});
afterAll(async () => {
if (!sql) return;
await sql`DELETE FROM messages WHERE session_id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM chats WHERE id = ${chatId}`.catch(() => {});
await sql`DELETE FROM sessions WHERE id = ${sessionId}`.catch(() => {});
await sql`DELETE FROM projects WHERE id = ${projectId}`.catch(() => {});
await sql.end({ timeout: 5 });
});
async function insertStreaming(): Promise<string> {
const [m] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming') RETURNING id
`;
return m!.id;
}
it('finalizes a streaming row to cancelled, persists partial content, publishes one frame', async () => {
const id = await insertStreaming();
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: id,
status: 'cancelled',
model: 'qwen',
content: 'partial answer',
});
expect(did).toBe(true);
const [row] = await sql<{ status: string; content: string; finished_at: Date | null }[]>`
SELECT status, content, finished_at FROM messages WHERE id = ${id}
`;
expect(row!.status).toBe('cancelled');
expect(row!.content).toBe('partial answer');
expect(row!.finished_at).not.toBeNull();
expect(frames).toHaveLength(1);
expect(frames[0]!.type).toBe('message_complete');
expect((frames[0] as { status?: string }).status).toBe('cancelled');
});
it('is idempotent for a double-Stop: second call updates nothing and re-publishes nothing', async () => {
const id = await insertStreaming();
const frames: WsFrame[] = [];
const push = (_s: string, f: WsFrame): void => {
frames.push(f);
};
expect(
await finalizeStreamingMessage(sql, push, { sessionId, chatId, assistantId: id, status: 'cancelled', model: null }),
).toBe(true);
expect(
await finalizeStreamingMessage(sql, push, { sessionId, chatId, assistantId: id, status: 'cancelled', model: null }),
).toBe(false);
expect(frames).toHaveLength(1);
const [row] = await sql<{ status: string }[]>`SELECT status FROM messages WHERE id = ${id}`;
expect(row!.status).toBe('cancelled');
});
it('never clobbers a row that already finished cleanly (abort raced a clean finish)', async () => {
const [m] = await sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status)
VALUES (${sessionId}, ${chatId}, 'assistant', 'done', 'complete') RETURNING id
`;
const id = m!.id;
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: id,
status: 'cancelled',
model: null,
});
expect(did).toBe(false);
expect(frames).toHaveLength(0);
const [row] = await sql<{ status: string; content: string }[]>`
SELECT status, content FROM messages WHERE id = ${id}
`;
expect(row!.status).toBe('complete');
expect(row!.content).toBe('done');
});
it('no-ops on an empty assistantId (throw happened before the row was created)', async () => {
const frames: WsFrame[] = [];
const did = await finalizeStreamingMessage(sql, (_s, f) => frames.push(f), {
sessionId,
chatId,
assistantId: '',
status: 'failed',
model: null,
});
expect(did).toBe(false);
expect(frames).toHaveLength(0);
});
});

View File

@@ -0,0 +1,102 @@
import { describe, it, expect } from 'vitest';
import type { Broker } from '@boocode/server/broker';
import { makeFrameEmitter } from '../frame-emitter.js';
import { makeDcpStreamStripper } from '../dcp-strip.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
/**
* makeFrameEmitter (v2.7 audit reshape): the AgentEvent → WS-frame mapping + turn
* accumulators extracted from AcpStreamContext. Pure-ish over an injected broker.
*/
function fakeBroker(): { broker: Broker; frames: Array<{ sid: string; frame: Record<string, unknown> }> } {
const frames: Array<{ sid: string; frame: Record<string, unknown> }> = [];
const broker = {
publishFrame: (sid: string, frame: unknown) => {
frames.push({ sid, frame: frame as Record<string, unknown> });
},
} as unknown as Broker;
return { broker, frames };
}
const toolSnap: AcpToolSnapshot = { toolCallId: 'c1', title: 'grep', status: 'completed', rawOutput: 'x' };
describe('makeFrameEmitter — streaming frames', () => {
it('maps text/reasoning/tool events to delta/reasoning_delta/tool_call frames', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'hello ' });
em.onEvent({ type: 'reasoning', text: 'mulling' });
em.onEvent({ type: 'tool_call', toolCall: toolSnap });
expect(frames.map((f) => f.frame.type)).toEqual(['delta', 'reasoning_delta', 'tool_call']);
expect(frames[0]!.frame).toMatchObject({ message_id: 'm1', chat_id: 'ch1', content: 'hello ' });
expect(frames[2]!.frame).toMatchObject({ message_id: 'm1', chat_id: 'ch1' });
expect(em.output).toBe('hello ');
expect(em.reasoningText).toBe('mulling');
expect(em.snapshots).toHaveLength(1);
});
it('publishes a tool_call frame for BOTH tool_call and tool_update events', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'tool_update', toolCall: toolSnap });
expect(frames).toHaveLength(1);
expect(frames[0]!.frame.type).toBe('tool_call');
});
it('publishes an agent_commands frame and merges the command cache', () => {
const { broker, frames } = fakeBroker();
const taskId = `task-fe-${Math.floor(performance.now())}-${frames.length}`;
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1', taskId });
em.onEvent({ type: 'commands', commands: [{ name: 'plan' }] });
expect(frames).toHaveLength(1);
expect(frames[0]!.frame).toMatchObject({ type: 'agent_commands', task_id: taskId, session_id: 's1' });
});
it('does not publish a commands frame without a taskId', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'commands', commands: [{ name: 'plan' }] });
expect(frames).toHaveLength(0);
});
});
describe('makeFrameEmitter — no broker (one-shot accumulation)', () => {
it('accumulates output/reasoning/snapshots but publishes nothing', () => {
const em = makeFrameEmitter({ sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'abc' });
em.onEvent({ type: 'reasoning', text: 'r' });
em.onEvent({ type: 'tool_call', toolCall: toolSnap });
expect(em.output).toBe('abc');
expect(em.reasoningText).toBe('r');
expect(em.snapshots).toHaveLength(1);
});
});
describe('makeFrameEmitter — dcp stripping (opencode path contract)', () => {
it('strips a split dcp tag across deltas and flushes the tail on finalize', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1', dcp: makeDcpStreamStripper() });
for (const chunk of ['Answer.', '<dcp', '-message', '-id>m1</dcp', '-message-id>', ' tail']) {
em.onEvent({ type: 'text', text: chunk });
}
em.finalize();
expect(em.output).toBe('Answer. tail');
const published = frames.filter((f) => f.frame.type === 'delta').map((f) => f.frame.content).join('');
expect(published).toBe('Answer. tail');
});
it('finalize is a no-op without a dcp stripper', () => {
const { broker, frames } = fakeBroker();
const em = makeFrameEmitter({ broker, sessionId: 's1', chatId: 'ch1', assistantId: 'm1' });
em.onEvent({ type: 'text', text: 'raw <dcp-message-id>m</dcp-message-id>' });
em.finalize();
// No stripping without a stripper — verbatim text (prior ACP-path behavior).
expect(em.output).toBe('raw <dcp-message-id>m</dcp-message-id>');
expect(frames).toHaveLength(1);
});
});

View File

@@ -1,83 +0,0 @@
import { describe, it, expect } from 'vitest';
import { normalizeAgentEvent } from '../normalize-agent-status.js';
describe('normalizeAgentEvent', () => {
describe('working bucket', () => {
const cases = [
'SessionStart',
'UserPromptSubmit',
'UserPromptSubmitted',
'PostToolUse',
'PostToolUseFailure',
'BeforeAgent',
'AfterTool',
'task_started',
];
for (const name of cases) {
it(`maps ${name} → working`, () => {
expect(normalizeAgentEvent(name)).toBe('working');
});
}
});
describe('blocked bucket', () => {
const cases = [
'PreToolUse',
'Notification',
'PermissionRequest',
'exec_approval_request',
'apply_patch_approval_request',
'request_user_input',
];
for (const name of cases) {
it(`maps ${name} → blocked`, () => {
expect(normalizeAgentEvent(name)).toBe('blocked');
});
}
});
describe('done bucket', () => {
const cases = [
'Stop',
'AfterAgent',
'SessionEnd',
'task_complete',
'agent-turn-complete',
];
for (const name of cases) {
it(`maps ${name} → done`, () => {
expect(normalizeAgentEvent(name)).toBe('done');
});
}
});
describe('unknown / nullish → null', () => {
it('returns null for an unrecognized event', () => {
expect(normalizeAgentEvent('SomeRandomEvent')).toBeNull();
});
it('returns null for empty string', () => {
expect(normalizeAgentEvent('')).toBeNull();
});
it('returns null for undefined', () => {
expect(normalizeAgentEvent(undefined)).toBeNull();
});
});
describe('case- and separator-insensitive matching', () => {
it('matches snake_case spelling of a PascalCase event', () => {
expect(normalizeAgentEvent('session_start')).toBe('working');
expect(normalizeAgentEvent('post_tool_use')).toBe('working');
expect(normalizeAgentEvent('pre_tool_use')).toBe('blocked');
});
it('matches camelCase spelling', () => {
expect(normalizeAgentEvent('userPromptSubmitted')).toBe('working');
expect(normalizeAgentEvent('postToolUse')).toBe('working');
expect(normalizeAgentEvent('preToolUse')).toBe('blocked');
expect(normalizeAgentEvent('sessionEnd')).toBe('done');
});
it('matches arbitrary case', () => {
expect(normalizeAgentEvent('STOP')).toBe('done');
expect(normalizeAgentEvent('notification')).toBe('blocked');
});
});
});

View File

@@ -0,0 +1,88 @@
/**
* Shared ACP `Client` builder — the callback closures every ACP connection needs
* (worktree-scoped FS bridge + permission/elicitation routing + session updates).
*
* Extracted (v2.7 audit reshape) from the byte-identical `buildClient` closures in
* `acp-dispatch.ts` (one-shot) and `backends/warm-acp.ts` (warm). The two differed
* only in WHERE the per-turn context comes from (a fixed dispatch vs. the warm
* backend's `activeTurn`) and a trivially-equivalent permission gate — both are now
* supplied via the `resolveTurn` callback, so the FS/permission/elicitation wiring
* lives once. Behavior is preserved exactly:
* - `sessionUpdate` drops when `resolveTurn()` returns null (between turns).
* - permission/elicitation route to the UI only when BOTH a taskId AND sessionId
* are present (warm always has a sessionId, so this matches its prior
* `turn?.taskId` gate); otherwise the same auto-select-first / decline fallback.
*/
import type {
Client,
SessionNotification,
RequestPermissionRequest,
RequestPermissionResponse,
ReadTextFileRequest,
ReadTextFileResponse,
WriteTextFileRequest,
WriteTextFileResponse,
CreateTerminalRequest,
CreateTerminalResponse,
CreateElicitationRequest,
CreateElicitationResponse,
} from '@agentclientprotocol/sdk';
import { readWorktreeTextFile, writeWorktreeTextFile } from './acp-client-fs.js';
import { waitForPermissionResponse, waitForElicitationResponse } from './permission-waiter.js';
/** The per-turn context an ACP `Client` closure needs, resolved lazily per call. */
export interface AcpTurnContext {
/** Per-turn task id, for routing permission/elicitation prompts back to the UI. */
taskId: string | undefined;
/** BooCode session id (for permission-waiter's broker frames). */
sessionId: string | undefined;
/** Per-turn mode id (autonomous-mode gate in permission-waiter). */
modeId: string | undefined;
/** The agent name (for permission-waiter routing). */
agent: string;
/** Forward a session/update notification to the turn's event sink. */
onSessionUpdate: (params: SessionNotification) => void | Promise<void>;
}
/**
* Build the ACP `Client` callbacks once per connection. `resolveTurn` is called at
* the moment each callback fires and returns the live turn context (or null when no
* turn is active — `sessionUpdate` then drops, matching the warm backend's
* between-turns behavior). The FS bridge is scoped to `worktreePath`.
*/
export function buildAcpClient(worktreePath: string, resolveTurn: () => AcpTurnContext | null): Client {
return {
sessionUpdate: async (params: SessionNotification): Promise<void> => {
const turn = resolveTurn();
if (!turn) return; // between turns — drop (no orphan settles a future turn)
await turn.onSessionUpdate(params);
},
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => {
const turn = resolveTurn();
if (turn && turn.taskId && turn.sessionId) {
return waitForPermissionResponse(turn.taskId, turn.sessionId, turn.agent, turn.modeId, params);
}
const firstOption = params.options[0];
if (firstOption) return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(worktreePath, params.path, params.line, params.limit);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
const turn = resolveTurn();
if (turn && turn.taskId && turn.sessionId) {
return waitForElicitationResponse(turn.taskId, turn.sessionId, turn.agent, turn.modeId, params);
}
return { action: 'decline' };
},
};
}

View File

@@ -9,35 +9,21 @@ import {
ClientSideConnection, ClientSideConnection,
type Client, type Client,
type SessionNotification, type SessionNotification,
type RequestPermissionRequest,
type RequestPermissionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type CreateElicitationRequest,
type CreateElicitationResponse,
type SessionConfigOption, type SessionConfigOption,
type ClientSideConnection as ConnectionType, type ClientSideConnection as ConnectionType,
} from '@agentclientprotocol/sdk'; } from '@agentclientprotocol/sdk';
import type { Broker } from '@boocode/server/broker'; import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import { spawn } from 'node:child_process'; import { spawn } from 'node:child_process';
import { findThoughtLevelConfigId } from './acp-derive.js'; import { findThoughtLevelConfigId } from './acp-derive.js';
import { resolveLaunchSpec } from './acp-spawn.js'; import { resolveLaunchSpec } from './acp-spawn.js';
import { getResolvedRegistry, type ResolvedProviderDef } from './provider-config-registry.js'; import { getResolvedRegistry, type ResolvedProviderDef } from './provider-config-registry.js';
import { createAcpNdJsonStream } from './acp-stream.js'; import { createAcpNdJsonStream } from './acp-stream.js';
import { waitForPermissionResponse, waitForElicitationResponse, cancelPendingPermission } from './permission-waiter.js'; import { cancelPendingPermission } from './permission-waiter.js';
import { mergeTaskCommands, getTaskCommands } from './agent-commands-cache.js';
import { readWorktreeTextFile, writeWorktreeTextFile } from './acp-client-fs.js';
import { mapSessionUpdate } from './acp-event-map.js'; import { mapSessionUpdate } from './acp-event-map.js';
import { import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from './acp-tool-snapshot.js';
type AcpToolSnapshot, import { makeFrameEmitter, type FrameEmitter } from './frame-emitter.js';
snapshotToWireToolCall, import { buildAcpClient } from './acp-client.js';
synthesizeCanceledSnapshots,
} from './acp-tool-snapshot.js';
export interface AcpDispatchResult { export interface AcpDispatchResult {
exitCode: number; exitCode: number;
@@ -111,144 +97,61 @@ async function applySessionOverrides(
} }
class AcpStreamContext { class AcpStreamContext {
readonly textChunks: string[] = []; /** AgentEvent → WS-frame mapping + text/reasoning/tool accumulation (shared
readonly reasoningChunks: string[] = []; * `makeFrameEmitter`). The one-shot path passes no `dcp` stripper, so text is
readonly toolSnapshots = new Map<string, AcpToolSnapshot>(); * emitted verbatim — byte-identical to the prior inline switch. */
private aborted = false; private readonly emitter: FrameEmitter;
constructor( constructor(
private readonly opts: Pick< opts: Pick<AcpDispatchOpts, 'broker' | 'sessionId' | 'chatId' | 'messageId' | 'taskId'>,
AcpDispatchOpts,
'broker' | 'sessionId' | 'chatId' | 'messageId' | 'taskId'
>,
private readonly worktreePath: string, private readonly worktreePath: string,
) {} ) {
this.emitter = makeFrameEmitter({
broker: opts.broker,
sessionId: opts.sessionId,
chatId: opts.chatId,
assistantId: opts.messageId,
taskId: opts.taskId,
});
}
get reasoningText(): string { get reasoningText(): string {
return this.reasoningChunks.join(''); return this.emitter.reasoningText;
} }
get output(): string { get output(): string {
return this.textChunks.join(''); return this.emitter.output;
} }
get snapshots(): AcpToolSnapshot[] { get snapshots(): AcpToolSnapshot[] {
return [...this.toolSnapshots.values()]; return this.emitter.snapshots;
} }
markAborted(): void { markAborted(): void {
this.aborted = true; // Synthesize 'canceled' updates for still-running tool calls so the UI doesn't
for (const snap of synthesizeCanceledSnapshots(this.toolSnapshots.values())) { // leave them spinning, then emit them through the same frame path (tool_update
this.toolSnapshots.set(snap.toolCallId, snap); // → the same `tool_call` wire frame the original published).
this.publishToolSnapshot(snap); for (const snap of synthesizeCanceledSnapshots(this.emitter.toolSnapshots.values())) {
this.emitter.onEvent({ type: 'tool_update', toolCall: snap });
} }
} }
private canStream(): boolean { handleSessionUpdate(params: SessionNotification): void {
return !!(this.opts.broker && this.opts.sessionId && this.opts.chatId && this.opts.messageId); // The merge accumulator (`this.emitter.toolSnapshots`) is the same Map the
} // emitter publishes from, so a later tool_call_update merges over its tool_call.
for (const event of mapSessionUpdate(params, this.emitter.toolSnapshots)) {
private publishToolSnapshot(snapshot: AcpToolSnapshot): void { this.emitter.onEvent(event);
if (!this.canStream()) return;
const wire = snapshotToWireToolCall(snapshot);
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'tool_call',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
tool_call: wire,
} as WsFrame);
}
async handleSessionUpdate(params: SessionNotification): Promise<void> {
// v2.6 Phase 2: the case-by-case mapping now lives in the shared, pure
// `mapSessionUpdate` (reused by the warm ACP backend). This method keeps the
// identical broker-publishing side effects — it just translates the normalized
// AgentEvents back into the same frames it always emitted. `this.toolSnapshots`
// is the merge accumulator, so a later tool_call_update merges over its
// tool_call (the prior `handleToolUpdate` behavior, byte-for-byte).
for (const event of mapSessionUpdate(params, this.toolSnapshots)) {
switch (event.type) {
case 'text':
this.textChunks.push(event.text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: event.text,
} as WsFrame);
}
break;
case 'reasoning':
this.reasoningChunks.push(event.text);
if (this.canStream()) {
this.opts.broker!.publishFrame(this.opts.sessionId!, {
type: 'reasoning_delta',
message_id: this.opts.messageId!,
chat_id: this.opts.chatId!,
content: event.text,
} as WsFrame);
}
break;
case 'tool_call':
case 'tool_update':
// mapSessionUpdate already stored the merged snapshot in this.toolSnapshots.
this.publishToolSnapshot(event.toolCall);
break;
case 'commands':
if (this.opts.taskId && event.commands.length > 0) {
mergeTaskCommands(this.opts.taskId, event.commands);
if (this.canStream() && this.opts.sessionId) {
const all = getTaskCommands(this.opts.taskId) ?? event.commands;
this.opts.broker!.publishFrame(this.opts.sessionId, {
type: 'agent_commands',
task_id: this.opts.taskId,
session_id: this.opts.sessionId,
commands: all,
} as WsFrame);
}
}
break;
}
} }
} }
buildClient(agent: string, modeId: string | undefined, taskId: string | undefined, sessionId: string | undefined): Client { buildClient(agent: string, modeId: string | undefined, taskId: string | undefined, sessionId: string | undefined): Client {
return { return buildAcpClient(this.worktreePath, () => ({
sessionUpdate: (params) => this.handleSessionUpdate(params), taskId,
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => { sessionId,
if (taskId && sessionId) { modeId,
return waitForPermissionResponse(taskId, sessionId, agent, modeId, params); agent,
} onSessionUpdate: (params) => this.handleSessionUpdate(params),
const firstOption = params.options[0]; }));
if (firstOption) {
return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
}
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(
this.worktreePath,
params.path,
params.line,
params.limit,
);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(this.worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
if (taskId && sessionId) {
return waitForElicitationResponse(taskId, sessionId, agent, modeId, params);
}
return { action: 'decline' };
},
};
} }
} }

View File

@@ -2,7 +2,7 @@ import { Readable, Writable } from 'node:stream';
import type { ChildProcess } from 'node:child_process'; import type { ChildProcess } from 'node:child_process';
import { ndJsonStream } from '@agentclientprotocol/sdk'; import { ndJsonStream } from '@agentclientprotocol/sdk';
export function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableStream<Uint8Array> { function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableStream<Uint8Array> {
return new ReadableStream<Uint8Array>({ return new ReadableStream<Uint8Array>({
start(controller) { start(controller) {
nodeStream.on('data', (chunk: Buffer) => controller.enqueue(new Uint8Array(chunk))); nodeStream.on('data', (chunk: Buffer) => controller.enqueue(new Uint8Array(chunk)));
@@ -17,7 +17,7 @@ export function nodeReadableToWeb(nodeStream: NodeJS.ReadableStream): ReadableSt
}); });
} }
export function nodeWritableToWeb(nodeStream: NodeJS.WritableStream): WritableStream<Uint8Array> { function nodeWritableToWeb(nodeStream: NodeJS.WritableStream): WritableStream<Uint8Array> {
return new WritableStream<Uint8Array>({ return new WritableStream<Uint8Array>({
write(chunk) { write(chunk) {
return new Promise<void>((resolve, reject) => { return new Promise<void>((resolve, reject) => {

View File

@@ -0,0 +1,173 @@
import { describe, it, expect, vi } from 'vitest';
import type { Event, OpencodeClient } from '@opencode-ai/sdk/v2/client';
import {
reconnectDecision,
runSessionEventLoop,
DEFAULT_RECONNECT_POLICY,
type SessionState,
type SseLoopDeps,
} from '../opencode-sse.js';
import { shouldStartServer } from '../opencode-server-process.js';
/**
* v2.7 concurrency hardening (Phase 7): the pure decision cores for SSE reconnect
* backoff + the ensureServer double-spawn guard, plus a deterministic exercise of
* the loop's breaker (injected sleep, fake client). Happy path is asserted to be
* unchanged (clean stream end → reset → base-delay reconnect).
*/
function freshState(): SessionState {
return {
boocodeSessionId: 'boo1',
agentSessionId: 'oc1',
worktreePath: '/wt',
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: { onEvent: () => {}, settle: () => {} },
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
};
}
const silentLog = {
warn: () => {},
info: () => {},
error: () => {},
debug: () => {},
} as unknown as SseLoopDeps['log'];
describe('reconnectDecision (pure backoff + breaker)', () => {
it('first failure uses the base delay (matches pre-hardening flat delay)', () => {
expect(reconnectDecision(1)).toEqual({ action: 'reconnect', delayMs: DEFAULT_RECONNECT_POLICY.baseMs });
});
it('grows exponentially and caps at maxMs', () => {
const policy = { baseMs: 1000, maxMs: 30_000, maxAttempts: 10 };
expect(reconnectDecision(2, policy)).toEqual({ action: 'reconnect', delayMs: 2000 });
expect(reconnectDecision(3, policy)).toEqual({ action: 'reconnect', delayMs: 4000 });
expect(reconnectDecision(6, policy)).toEqual({ action: 'reconnect', delayMs: 30_000 }); // 32000 capped
expect(reconnectDecision(9, policy)).toEqual({ action: 'reconnect', delayMs: 30_000 });
});
it('gives up once failures exceed maxAttempts', () => {
const policy = { baseMs: 1, maxMs: 8, maxAttempts: 3 };
expect(reconnectDecision(3, policy).action).toBe('reconnect');
expect(reconnectDecision(4, policy)).toEqual({ action: 'give-up' });
});
});
describe('shouldStartServer (double-spawn guard)', () => {
it('does not start when the server is live', () => {
expect(shouldStartServer({ up: true, hasClient: true, serverStarting: true, childDead: false, startInFlight: false })).toBe(false);
});
it('starts on a fresh process (no start in flight)', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: false, childDead: false, startInFlight: false })).toBe(true);
});
it('re-spawns after a crash once the prior start finished', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: true, startInFlight: false })).toBe(true);
});
it('does NOT double-spawn while a start is already in flight (the race fix)', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: true, startInFlight: true })).toBe(false);
});
it('does NOT double-spawn when a crash nulled serverStarting mid-start', () => {
// The narrow window: a crash during the in-flight start (await freePort) nulls
// serverStarting while startInFlight is still true. The startInFlight guard must
// win over the !serverStarting branch, else a second server spawns on a new port.
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: false, childDead: true, startInFlight: true })).toBe(false);
});
it('waits (no spawn) when a cached start exists and the child is still alive', () => {
expect(shouldStartServer({ up: false, hasClient: false, serverStarting: true, childDead: false, startInFlight: false })).toBe(false);
});
});
describe('runSessionEventLoop — happy path (unchanged)', () => {
it('dispatches streamed events, reconciles on clean end, reconnects at base delay', async () => {
const state = freshState();
const abort = new AbortController();
const events = [
{ type: 'session.next.text.delta', properties: { sessionID: 'oc1', delta: 'hi' } },
{ type: 'session.idle', properties: { sessionID: 'oc1' } },
] as unknown as Event[];
const client = {
event: {
subscribe: vi.fn(async () => ({
stream: (async function* () {
for (const ev of events) yield ev;
})(),
})),
},
} as unknown as OpencodeClient;
const dispatched: Event[] = [];
const sleeps: number[] = [];
let reconciles = 0;
const deps: SseLoopDeps = {
isUp: () => true,
getClient: () => client,
dispatchEvent: (ev) => dispatched.push(ev),
reconcile: async () => {
reconciles += 1;
abort.abort(); // stop the loop after the first clean cycle
return false;
},
onReconnectGiveUp: () => {
throw new Error('should not give up on the happy path');
},
log: silentLog,
sleep: async (ms) => {
sleeps.push(ms);
},
};
await runSessionEventLoop(state, abort, deps);
expect(dispatched).toHaveLength(2);
expect(reconciles).toBe(1);
expect(sleeps).toEqual([DEFAULT_RECONNECT_POLICY.baseMs]); // base delay, not backed off
});
});
describe('runSessionEventLoop — circuit breaker', () => {
it('backs off on repeated throws then gives up + fails the turn', async () => {
const state = freshState();
const abort = new AbortController();
const policy = { baseMs: 1, maxMs: 8, maxAttempts: 3 };
const subscribe = vi.fn(async () => {
throw new Error('connection refused');
});
const client = { event: { subscribe } } as unknown as OpencodeClient;
const sleeps: number[] = [];
const gaveUp = vi.fn();
const deps: SseLoopDeps = {
isUp: () => true,
getClient: () => client,
dispatchEvent: () => {},
reconcile: async () => false,
onReconnectGiveUp: gaveUp,
log: silentLog,
sleep: async (ms) => {
sleeps.push(ms);
},
policy,
};
await runSessionEventLoop(state, abort, deps);
// 3 backoff sleeps (1, 2, 4), then the 4th failure trips the breaker.
expect(sleeps).toEqual([1, 2, 4]);
expect(subscribe).toHaveBeenCalledTimes(4);
expect(gaveUp).toHaveBeenCalledTimes(1);
expect(gaveUp).toHaveBeenCalledWith(state);
});
});

View File

@@ -0,0 +1,226 @@
import { describe, it, expect } from 'vitest';
import type { Event, Part } from '@opencode-ai/sdk/v2/client';
import {
stripDcpTags,
eventSessionId,
resolvePartDedupeKey,
mapToolStatus,
toolPartToSnapshot,
toolCalledSnapshot,
toolSuccessSnapshot,
toolFailedSnapshot,
classifyPartDelta,
classifyUpdatedPart,
errToString,
errMsg,
type DedupState,
} from '../opencode-event-map.js';
/**
* Pure opencode Event → AgentEvent translation + dedup gate (v2.7 audit reshape).
* Mirrors the original `dispatchEvent` / `handleUpdatedPart` arms verbatim — no
* I/O, so it's unit-testable. The slimmed backend keeps the routing + side effects.
*/
function freshDedup(): DedupState {
return { streamedPartKeys: new Set(), partTypeById: new Map() };
}
describe('stripDcpTags', () => {
it('removes a complete dcp tag', () => {
expect(stripDcpTags('hi <dcp-message-id>m1</dcp-message-id> there')).toBe('hi there');
});
it('leaves untagged text untouched', () => {
expect(stripDcpTags('plain text <div>')).toBe('plain text <div>');
});
});
describe('eventSessionId', () => {
it('reads properties.sessionID for a normal event', () => {
const ev = { type: 'session.idle', properties: { sessionID: 's1' } } as unknown as Event;
expect(eventSessionId(ev)).toBe('s1');
});
it('reads properties.part.sessionID for message.part.updated', () => {
const ev = {
type: 'message.part.updated',
properties: { part: { sessionID: 's2' } },
} as unknown as Event;
expect(eventSessionId(ev)).toBe('s2');
});
it('returns null when there is no session', () => {
const ev = { type: 'server.connected', properties: {} } as unknown as Event;
expect(eventSessionId(ev)).toBeNull();
});
});
describe('resolvePartDedupeKey', () => {
it('prefers the part id', () => {
expect(resolvePartDedupeKey({ id: 'p1', messageID: 'm1' }, 'text')).toBe('text:p1');
});
it('falls back to the message id', () => {
expect(resolvePartDedupeKey({ id: ' ', messageID: 'm1' }, 'reasoning')).toBe('reasoning:message:m1');
});
it('returns null when neither is present', () => {
expect(resolvePartDedupeKey({ id: '', messageID: '' }, 'text')).toBeNull();
});
});
describe('mapToolStatus', () => {
it('maps the opencode tool states to ACP statuses', () => {
expect(mapToolStatus('pending')).toBe('pending');
expect(mapToolStatus('running')).toBe('in_progress');
expect(mapToolStatus('completed')).toBe('completed');
expect(mapToolStatus('error')).toBe('failed');
expect(mapToolStatus(undefined)).toBeNull();
});
});
describe('session.next.tool.* snapshot builders', () => {
it('toolCalledSnapshot → in_progress with tool title + raw input', () => {
expect(toolCalledSnapshot({ callID: 'c1', tool: 'read_file', input: { path: 'a.ts' } })).toEqual({
toolCallId: 'c1',
title: 'read_file',
kind: null,
status: 'in_progress',
rawInput: { path: 'a.ts' },
rawOutput: undefined,
});
});
it('toolSuccessSnapshot → completed with joined text content', () => {
const snap = toolSuccessSnapshot({ callID: 'c1', content: [{ text: 'foo' }, { text: 'bar' }, { other: 1 }] });
expect(snap.status).toBe('completed');
expect(snap.title).toBe('c1');
expect(snap.rawOutput).toBe('foobar');
});
it('toolSuccessSnapshot → empty output when content is missing', () => {
expect(toolSuccessSnapshot({ callID: 'c1' }).rawOutput).toBe('');
});
it('toolFailedSnapshot → failed with stringified error', () => {
const snap = toolFailedSnapshot({ callID: 'c1', error: 'boom' });
expect(snap.status).toBe('failed');
expect(snap.title).toBe('c1');
expect(snap.rawOutput).toBe('boom');
});
});
describe('toolPartToSnapshot', () => {
it('extracts input/output/title/status from the tool state', () => {
const part = {
type: 'tool',
callID: 'c1',
tool: 'grep',
state: { status: 'completed', input: { q: 'x' }, output: 'result', title: 'Grep run' },
} as unknown as Parameters<typeof toolPartToSnapshot>[0];
expect(toolPartToSnapshot(part)).toEqual({
toolCallId: 'c1',
title: 'Grep run',
kind: null,
status: 'completed',
rawInput: { q: 'x' },
rawOutput: 'result',
});
});
it('falls back to the tool name and uses error as output', () => {
const part = {
type: 'tool',
callID: 'c2',
tool: 'edit',
state: { status: 'error', error: 'nope' },
} as unknown as Parameters<typeof toolPartToSnapshot>[0];
const snap = toolPartToSnapshot(part);
expect(snap.title).toBe('edit');
expect(snap.status).toBe('failed');
expect(snap.rawOutput).toBe('nope');
});
});
describe('classifyPartDelta (message.part.delta dedup recording)', () => {
it('records a reasoning key and emits a reasoning event', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p1', field: 'reasoning', delta: 'thinking' }, st);
expect(e).toEqual({ type: 'reasoning', text: 'thinking' });
expect(st.streamedPartKeys.has('reasoning:p1')).toBe(true);
});
it('records a text key, strips dcp, and emits text', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p2', field: 'text', delta: 'hi <dcp-message-id>m</dcp-message-id>' }, st);
expect(e).toEqual({ type: 'text', text: 'hi ' });
expect(st.streamedPartKeys.has('text:p2')).toBe(true);
});
it('still records the text key even when the cleaned delta is empty', () => {
const st = freshDedup();
const e = classifyPartDelta({ partID: 'p3', field: 'text', delta: '<dcp-message-id>m</dcp-message-id>' }, st);
expect(e).toBeNull();
expect(st.streamedPartKeys.has('text:p3')).toBe(true);
});
it('uses the recorded part type when the field is absent', () => {
const st = freshDedup();
st.partTypeById.set('p4', 'reasoning');
const e = classifyPartDelta({ partID: 'p4', delta: 'more' }, st);
expect(e).toEqual({ type: 'reasoning', text: 'more' });
});
it('returns null for an unknown field', () => {
expect(classifyPartDelta({ partID: 'p5', field: 'other', delta: 'x' }, freshDedup())).toBeNull();
});
});
describe('classifyUpdatedPart (message.part.updated dedup gate)', () => {
function textPart(over: Partial<Part> = {}): Part {
return {
type: 'text',
id: 'p1',
messageID: 'm1',
sessionID: 's1',
text: 'final text',
time: { start: 1, end: 2 },
...over,
} as unknown as Part;
}
it('drops a terminal part already streamed via deltas', () => {
const st = freshDedup();
st.streamedPartKeys.add('text:p1');
expect(classifyUpdatedPart(textPart(), st)).toBeNull();
// the key is consumed
expect(st.streamedPartKeys.has('text:p1')).toBe(false);
});
it('emits a finished (ended) text part not seen via deltas', () => {
const st = freshDedup();
expect(classifyUpdatedPart(textPart(), st)).toEqual({ type: 'text', text: 'final text' });
expect(st.partTypeById.get('p1')).toBe('text');
});
it('does not emit a part that has not ended yet', () => {
const st = freshDedup();
expect(classifyUpdatedPart(textPart({ time: { start: 1 } as never }), st)).toBeNull();
});
it('strips dcp tags from the finished text', () => {
const st = freshDedup();
const part = textPart({ text: 'a <dcp-message-id>m</dcp-message-id>b' });
expect(classifyUpdatedPart(part, st)).toEqual({ type: 'text', text: 'a b' });
});
it('maps a running tool part to tool_call', () => {
const st = freshDedup();
const part = { type: 'tool', callID: 'c1', tool: 'grep', state: { status: 'running' } } as unknown as Part;
const e = classifyUpdatedPart(part, st);
expect(e?.type).toBe('tool_call');
});
it('maps a completed tool part to tool_update', () => {
const st = freshDedup();
const part = { type: 'tool', callID: 'c1', tool: 'grep', state: { status: 'completed', output: 'x' } } as unknown as Part;
const e = classifyUpdatedPart(part, st);
expect(e?.type).toBe('tool_update');
});
});
describe('error formatters', () => {
it('errMsg unwraps Error.message', () => {
expect(errMsg(new Error('x'))).toBe('x');
expect(errMsg('plain')).toBe('plain');
});
it('errToString handles null/string/Error/object', () => {
expect(errToString(null)).toBe('unknown error');
expect(errToString('s')).toBe('s');
expect(errToString(new Error('e'))).toBe('e');
expect(errToString({ a: 1 })).toBe('{"a":1}');
});
});

View File

@@ -0,0 +1,203 @@
/**
* Pure opencode `Event` → normalized `AgentEvent` translation.
*
* Extracted (v2.7 audit reshape) from `OpenCodeServerBackend.dispatchEvent` /
* `handleUpdatedPart` and the file-local helpers. NO I/O, no timers, no DB, no
* `byOpencodeId` — every function here is a deterministic transform over its
* arguments (the dedup state is caller-owned and mutated in place, mirroring the
* `acp-event-map.ts` `priorSnapshots` pattern). This is the unit-testable core; the
* backend keeps the routing + side effects (watchdog, usage persistence, settle).
*
* Depends only on SDK TYPES + AcpToolSnapshot — safe to import anywhere.
*/
import type { Event, Part, ToolPart, ToolState } from '@opencode-ai/sdk/v2/client';
import type { ToolCallStatus } from '@agentclientprotocol/sdk';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
import type { AgentEvent } from '../agent-backend.js';
/** Per-(opencode session) dedup state the part-stream classifiers read + mutate. */
export interface DedupState {
/** dedup gate: `${type}:${id}` added on delta, deleted-and-tested on updated. */
streamedPartKeys: Set<string>;
/** partID → 'text' | 'reasoning', so a delta with a non-'reasoning' field is still classed right. */
partTypeById: Map<string, string>;
}
/** Strip opencode-dcp plugin tags that render as literal text in the UI. */
export function stripDcpTags(s: string): string {
return s.replace(/<dcp-message-id>[^<]*<\/dcp-message-id>/g, '');
}
/** Extract the opencode sessionID an event belongs to, across event shapes.
* Most carry `properties.sessionID`; `message.part.updated` nests it under
* `properties.part.sessionID`. Returns null when the event has no session
* (the per-session loop then leaves it to dispatchEvent, which drops it). */
export function eventSessionId(ev: Event): string | null {
const props = (ev as { properties?: unknown }).properties;
if (!props || typeof props !== 'object') return null;
if (ev.type === 'message.part.updated') {
const part = (props as { part?: { sessionID?: string } }).part;
return part?.sessionID ?? null;
}
return (props as { sessionID?: string }).sessionID ?? null;
}
/** Ported verbatim from Paseo opencode-agent.ts: id → message-id fallback → null. */
export function resolvePartDedupeKey(part: { id: string; messageID: string }, type: string): string | null {
if (part.id.trim().length > 0) return `${type}:${part.id}`;
if (part.messageID.trim().length > 0) return `${type}:message:${part.messageID}`;
return null;
}
export function mapToolStatus(s: ToolState['status'] | undefined): ToolCallStatus | null {
switch (s) {
case 'pending':
return 'pending';
case 'running':
return 'in_progress';
case 'completed':
return 'completed';
case 'error':
return 'failed';
default:
return null;
}
}
/** opencode ToolPart → ACP-shaped snapshot (reuses the existing persist/render path). */
export function toolPartToSnapshot(part: ToolPart): AcpToolSnapshot {
const state = part.state;
let rawInput: unknown;
let rawOutput: unknown;
let title: string | undefined;
if (state) {
if ('input' in state) rawInput = (state as { input?: unknown }).input;
if ('output' in state) rawOutput = (state as { output?: unknown }).output;
else if ('error' in state) rawOutput = (state as { error?: unknown }).error;
if ('title' in state) title = (state as { title?: string }).title;
}
return {
toolCallId: part.callID,
title: title ?? part.tool,
kind: null,
status: mapToolStatus(state?.status),
rawInput,
rawOutput,
};
}
// ─── session.next.tool.* snapshot builders ───────────────────────────────────
/** `session.next.tool.called` → an in-progress tool_call snapshot. */
export function toolCalledSnapshot(p: { callID: string; tool: string; input: unknown }): AcpToolSnapshot {
return {
toolCallId: p.callID,
title: p.tool,
kind: null,
status: 'in_progress',
rawInput: p.input,
rawOutput: undefined,
};
}
/** `session.next.tool.success` → a completed tool snapshot (text content joined). */
export function toolSuccessSnapshot(p: { callID: string; content?: ReadonlyArray<unknown> | null }): AcpToolSnapshot {
const output = p.content?.map((c) => (c && typeof c === 'object' && 'text' in c ? (c as { text: string }).text : '')).join('') ?? '';
return {
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'completed',
rawInput: undefined,
rawOutput: output,
};
}
/** `session.next.tool.failed` → a failed tool snapshot (error stringified). */
export function toolFailedSnapshot(p: { callID: string; error: unknown }): AcpToolSnapshot {
return {
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'failed',
rawInput: undefined,
rawOutput: errToString(p.error),
};
}
// ─── message.part.* dedup gate ────────────────────────────────────────────────
/**
* `message.part.delta`: mark the part as streamed (so a later `message.part.updated`
* for the same part is deduped) and return the AgentEvent to emit, or null when the
* field is neither reasoning nor text, or a text delta strips down to empty. Mutates
* `st.streamedPartKeys` exactly as the original inline arm did (the key is recorded
* for text even when the cleaned delta is empty).
*/
export function classifyPartDelta(
p: { partID: string; field?: string; delta: string },
st: DedupState,
): AgentEvent | null {
const isReasoning = p.field === 'reasoning' || st.partTypeById.get(p.partID) === 'reasoning';
if (isReasoning) {
st.streamedPartKeys.add(`reasoning:${p.partID}`);
return { type: 'reasoning', text: p.delta };
}
if (p.field === 'text') {
st.streamedPartKeys.add(`text:${p.partID}`);
const cleaned = stripDcpTags(p.delta);
return cleaned ? { type: 'text', text: cleaned } : null;
}
return null;
}
/**
* `message.part.updated` terminal part: the dedup gate for text/reasoning (drop a
* part already streamed via deltas; otherwise emit the finished text) plus the
* tool-part → tool_call/tool_update mapping. Returns null when nothing should be
* emitted. Mutates `st.partTypeById` / `st.streamedPartKeys` like the original.
*/
export function classifyUpdatedPart(part: Part, st: DedupState): AgentEvent | null {
if (part.type === 'text' || part.type === 'reasoning') {
st.partTypeById.set(part.id, part.type);
const key = resolvePartDedupeKey(part, part.type);
if (key && st.streamedPartKeys.delete(key)) return null; // already streamed via delta
const raw = part.text ?? '';
const text = part.type === 'text' ? stripDcpTags(raw) : raw;
if (text && part.time?.end != null) {
return { type: part.type, text };
}
return null;
}
if (part.type === 'tool') {
const snap = toolPartToSnapshot(part);
const status = part.state?.status;
// tool_call on start (pending/running), tool_update on terminal (completed/error).
// The current ACP path merges both into one frame; the contract keeps them
// distinct because opencode's SSE distinguishes start from result.
return status === 'completed' || status === 'error'
? { type: 'tool_update', toolCall: snap }
: { type: 'tool_call', toolCall: snap };
}
// NOTE: opencode's SSE payload union carries no available-commands event, so the
// AgentEvent 'commands' arm is intentionally never emitted here.
return null;
}
// ─── shared error formatters (pure) ───────────────────────────────────────────
export function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
export function errToString(e: unknown): string {
if (e == null) return 'unknown error';
if (typeof e === 'string') return e;
if (e instanceof Error) return e.message;
try {
return JSON.stringify(e);
} catch {
return String(e);
}
}

View File

@@ -0,0 +1,325 @@
/**
* OpenCodeServerSupervisor — the opencode `serve` child + HTTP client + port +
* health-counter lifecycle, extracted (v2.7 audit reshape) from the backend
* god-class. Owns spawn / ready / crash / proactive-health restart / dispose and
* exposes `client` / `port` / `health()` / `tickHealth()` to the backend.
*
* Session-level recovery (failing in-flight turns, marking agent_sessions crashed,
* tearing down SSE loops) is NOT a process concern — it's delegated back to the
* backend through the injected `hooks.onServerDown` callback, keeping this module
* free of the demux map / SQL / turn state.
*
* v2.7 concurrency hardening: `ensureServer` is guarded against the crash-window
* double-spawn (two concurrent callers each re-spawning on different ports) via a
* synchronous `startInFlight` flag — see `shouldStartServer`.
*/
import { spawn, type ChildProcess } from 'node:child_process';
import { createOpencodeClient, type OpencodeClient } from '@opencode-ai/sdk/v2/client';
import type { FastifyBaseLogger } from 'fastify';
import { decideRestart, DEFAULT_HEALTH_FAILURE_THRESHOLD } from './lifecycle-decisions.js';
import { reclaimPort, waitForPortRelease, freePort } from '../net/port-utils.js';
const READY_TIMEOUT_MS = 30_000;
/** Info handed to the backend when the server goes down (crash or forced restart). */
export interface ServerDownInfo {
code: number | null;
signal: NodeJS.Signals | null;
port: number;
}
export interface SupervisorHooks {
/** True iff ANY pooled session has an in-flight turn (defers a busy restart). */
isBusy: () => boolean;
/** Session-level recovery: fail in-flight turns, mark crashed, drop demux state. */
onServerDown: (info: ServerDownInfo) => void;
}
export interface OpenCodeServerSupervisorDeps {
/** Absolute path to the opencode binary (resolved from available_agents). */
opencodeBinary: string;
log: FastifyBaseLogger;
hooks: SupervisorHooks;
}
/**
* Pure decision for `ensureServer`: should we (re)spawn the server right now?
*
* - A live, ready server (`up && client`) → no.
* - A start already in flight (`startInFlight`) → no, NEVER double-spawn — join the
* running start instead. This is checked BEFORE `serverStarting` because the crash
* handler can null `serverStarting` mid-start (a crash during `await freePort()`),
* and without this guard the `!serverStarting` branch would spawn a second server
* on a different port while the first is still coming up.
* - No start cached/running → yes (fresh start or post-crash re-spawn, since the
* crash handler nulls `serverStarting`).
* - A cached start that already finished, but the child has since died and the crash
* handler hasn't reset us yet → yes.
*/
export function shouldStartServer(s: {
up: boolean;
hasClient: boolean;
serverStarting: boolean;
childDead: boolean;
startInFlight: boolean;
}): boolean {
if (s.up && s.hasClient) return false;
if (s.startInFlight) return false;
if (!s.serverStarting) return true;
if (!s.up && s.childDead) return true;
return false;
}
export class OpenCodeServerSupervisor {
private readonly opencodeBinary: string;
private readonly log: FastifyBaseLogger;
private readonly hooks: SupervisorHooks;
private childProc: ChildProcess | null = null;
private opencodeClient: OpencodeClient | null = null;
private serverPort: number | null = null;
private up = false;
private serverStarting: Promise<void> | null = null;
/** True from the synchronous head of startServer() until it settles — the
* double-spawn guard reads it so a concurrent ensureServer joins instead of
* kicking a second spawn. */
private startInFlight = false;
// Phase 3 busy-aware health monitor (openchamber lift): consecutive failed
// probes + the start of an unhealthy-while-busy window feed `decideRestart`.
private consecutiveHealthFailures = 0;
private unhealthyBusySince = 0;
private restarting: Promise<void> | null = null;
constructor(deps: OpenCodeServerSupervisorDeps) {
this.opencodeBinary = deps.opencodeBinary;
this.log = deps.log;
this.hooks = deps.hooks;
}
/** The live opencode HTTP client, or null between (re)starts. */
get client(): OpencodeClient | null {
return this.opencodeClient;
}
/** The current server port, or null before the first start. */
get port(): number | null {
return this.serverPort;
}
/** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' {
return this.up ? 'up' : 'down';
}
isUp(): boolean {
return this.up;
}
// ─── lifecycle (spawn once + client + ready; crash-restart) ──────────────────
/**
* Lazy: start the single server on first use; re-spawn after a crash. Idempotent
* within one live server — `serverStarting` caches the in-flight start, reset to
* null by the crash handler so the NEXT ensureServer re-spawns. A dead-but-not-
* yet-reaped child (exit handler raced) is also treated as needing a restart.
* Concurrent callers in a crash window are coalesced via `startInFlight`.
*/
ensureServer(): Promise<void> {
if (this.up && this.opencodeClient) return Promise.resolve();
const childDead =
this.childProc != null && (this.childProc.exitCode !== null || this.childProc.signalCode !== null);
if (
shouldStartServer({
up: this.up,
hasClient: this.opencodeClient != null,
serverStarting: this.serverStarting != null,
childDead,
startInFlight: this.startInFlight,
})
) {
this.serverStarting = this.startServer();
}
return this.serverStarting ?? Promise.resolve();
}
private async startServer(): Promise<void> {
// Set synchronously (before the first await) so a concurrent ensureServer sees
// the in-flight start and joins `serverStarting` instead of double-spawning.
this.startInFlight = true;
try {
const port = await freePort();
// Phase 1: run unsecured on loopback (opencode's documented default — serve.ts
// only WARNS when OPENCODE_SERVER_PASSWORD is unset). The real boundary is the
// 127.0.0.1 bind.
const child = spawn(this.opencodeBinary, ['serve', '--hostname', '127.0.0.1', '--port', String(port)], {
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env },
});
this.childProc = child;
this.serverPort = port;
// Child lifetime is the backend's (the pool's), NOT a request's. On unexpected
// exit we recover: settle in-flight turns, mark sessions crashed (the backend's
// onServerDown), reclaim the port, and reset state so the next ensureServer
// re-spawns.
child.on('exit', (code, signal) => {
// Only react to THIS child's exit (a restart may have swapped in a new one).
if (this.childProc !== child) return;
this.handleCrash(code, signal, port);
});
await waitForReady(child, READY_TIMEOUT_MS);
this.opencodeClient = createOpencodeClient({ baseUrl: `http://127.0.0.1:${port}` });
this.up = true;
this.log.info({ port }, 'opencode-server: ready');
} finally {
this.startInFlight = false;
}
}
/**
* Server down (crash-exit or forced restart): reset process/port state, delegate
* session-level recovery to the backend, and reclaim the port. Mirrors the
* original `handleServerCrash` ordering (up=false → session cleanup → client/
* serverStarting null → reclaimPort).
*/
private handleCrash(code: number | null, signal: NodeJS.Signals | null, port: number): void {
this.up = false;
this.hooks.onServerDown({ code, signal, port });
this.opencodeClient = null;
this.serverStarting = null; // force a re-spawn on the next ensureServer
// Reclaim the port so a re-spawn on a fixed/leaked port isn't blocked. Best
// effort; the next start uses a fresh ephemeral port anyway.
reclaimPort(port);
}
/**
* Phase 3 proactive health monitor (openchamber `runHealthCheckCycle` lift,
* busy-aware). Probes /global/health; on a sustained failure of a NON-busy server,
* force a restart so the next turn isn't blocked by a wedged process. Busy servers
* are deferred via the stale-grace in `decideRestart`. No-op when never started or
* a restart is already in flight.
*/
async tickHealth(now: number = Date.now()): Promise<void> {
if (!this.childProc || this.restarting) return;
const childExited = this.childProc.exitCode !== null || this.childProc.signalCode !== null;
// An exited child is recovered lazily by ensureServer; don't double-restart it.
if (childExited) return;
const healthy = await this.probeHealth();
if (healthy) {
this.consecutiveHealthFailures = 0;
this.unhealthyBusySince = 0;
return;
}
this.consecutiveHealthFailures += 1;
const busy = this.hooks.isBusy();
const decision = decideRestart({
processExited: false,
consecutiveFailures: this.consecutiveHealthFailures,
busy,
unhealthyBusySince: this.unhealthyBusySince,
now,
failureThreshold: DEFAULT_HEALTH_FAILURE_THRESHOLD,
});
// Stamp the start of an unhealthy-while-busy window so the stale-grace can fire.
if (busy && this.unhealthyBusySince === 0) this.unhealthyBusySince = now;
if (decision.action === 'restart') {
this.log.warn(
{ failures: this.consecutiveHealthFailures, busy, reason: decision.reason },
'opencode-server: health monitor forcing restart',
);
this.consecutiveHealthFailures = 0;
this.unhealthyBusySince = 0;
await this.restartServer();
}
}
private async probeHealth(): Promise<boolean> {
if (!this.opencodeClient) return false;
try {
const res = await this.opencodeClient.global.health();
return !res.error;
} catch {
return false;
}
}
/** Force-kill the current server + reclaim its port; the next ensureServer
* re-spawns (lazy). Mirrors handleCrash's state reset but is initiated by the
* health monitor rather than the OS. */
private async restartServer(): Promise<void> {
if (this.restarting) return this.restarting;
this.restarting = (async () => {
const child = this.childProc;
const port = this.serverPort;
this.up = false;
// Fail in-flight turns + mark sessions crashed via the same path as a crash.
if (child) {
this.handleCrash(null, null, port ?? 0);
if (!child.killed) child.kill('SIGTERM');
}
if (port) {
reclaimPort(port);
await waitForPortRelease(port, 3_000);
}
this.childProc = null;
})().finally(() => {
this.restarting = null;
});
return this.restarting;
}
/** Full teardown of the child + client + port state. */
async dispose(): Promise<void> {
this.up = false;
const child = this.childProc;
this.childProc = null;
this.opencodeClient = null;
if (child && !child.killed) {
child.kill('SIGTERM');
const t = setTimeout(() => {
if (!child.killed) child.kill('SIGKILL');
}, 5_000);
t.unref();
}
}
}
/** Resolve when the child prints the ready line; reject on timeout or early exit. */
function waitForReady(child: ChildProcess, timeoutMs: number): Promise<void> {
return new Promise((resolve, reject) => {
let done = false;
let stderrBuf = '';
const finish = (err?: Error) => {
if (done) return;
done = true;
clearTimeout(timer);
child.stdout?.off('data', onOut);
child.stderr?.off('data', onErr);
child.off('exit', onExit);
if (err) reject(err);
else resolve();
};
const onOut = (buf: Buffer) => {
if (buf.toString().includes('opencode server listening on')) finish();
};
const onErr = (buf: Buffer) => {
stderrBuf += buf.toString();
};
const onExit = (code: number | null) =>
finish(new Error(`opencode serve exited before ready (code ${code}); stderr: ${stderrBuf.slice(-2000)}`));
const timer = setTimeout(
() => finish(new Error(`opencode serve not ready in ${timeoutMs}ms; stderr: ${stderrBuf.slice(-2000)}`)),
timeoutMs,
);
child.stdout?.on('data', onOut);
child.stderr?.on('data', onErr);
child.on('exit', onExit);
});
}

View File

@@ -1,91 +1,64 @@
/** /**
* v2.6 Phase 1 — OpenCodeServerBackend. * v2.6 Phase 1 — OpenCodeServerBackend (slimmed, v2.7 audit reshape).
* *
* Warm, multi-turn backend for the `opencode` agent. One `opencode serve` HTTP * Warm, multi-turn backend for the `opencode` agent. One `opencode serve` HTTP
* server per BooCoder process; one opencode session per BooCode session (resumed * server per BooCoder process; one opencode session per BooCode session (resumed
* on switch-back); one SSE read loop PER session, each scoped to that session's * on switch-back); one SSE read loop PER session, each scoped to that session's
* worktree directory so sessions in different directories stream concurrently * worktree directory so sessions in different directories stream concurrently.
* (P1.5-a — replaced the Phase-1 single-stream-last-directory model). *
* This file is now just the `AgentBackend` SURFACE — ensureSession / prompt /
* accumulateUsage / closeSession + the per-session demux side effects (watchdog,
* reconcile, usage). It composes three extracted collaborators:
* - `OpenCodeServerSupervisor` (opencode-server-process.ts) — child/client/port/
* health lifecycle, spawn/crash/restart/dispose.
* - the per-session SSE loop (opencode-sse.ts) — subscribe + reconnect/backoff.
* - the pure event map (opencode-event-map.ts) — Event → AgentEvent translation,
* dedup gate, dcp-strip, tool-snapshot.
* *
* Implements the Phase 0 `AgentBackend` interface. Emits transport-agnostic * Implements the Phase 0 `AgentBackend` interface. Emits transport-agnostic
* `AgentEvent`s the dispatcher (Phase 1.7, NOT wired in this batch) maps them * `AgentEvent`s; the dispatcher maps them to WS frames.
* to WS frames. No dispatcher/route references this file yet.
* *
* Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2 / §2a. * Spec: openspec/changes/v2-6-persistent-agent-sessions/design.md §2 / §2a.
* SDK shapes verified by direct read of @opencode-ai/sdk@1.15.12 dist .d.ts:
* - client methods take FLATTENED params (sessionID/directory/body all inline),
* not {path,query,body}. create→{directory}, promptAsync→{sessionID,directory,
* parts,model}, abort→{sessionID,directory}. model is {providerID,modelID}.
* - client.event() resolves to { stream: AsyncGenerator<GlobalEvent> }; the
* real event is chunk.payload (discriminate on chunk.payload.type).
* - promptAsync is fire-and-forget (204); the turn completes via a
* 'session.idle' event for that opencode session id.
*/ */
import { spawn, spawnSync, type ChildProcess } from 'node:child_process';
import { createHash } from 'node:crypto'; import { createHash } from 'node:crypto';
import { createServer, connect as netConnect } from 'node:net';
import type { FastifyBaseLogger } from 'fastify'; import type { FastifyBaseLogger } from 'fastify';
import { import type { Event, AssistantMessage } from '@opencode-ai/sdk/v2/client';
createOpencodeClient,
type OpencodeClient,
type Event,
type Part,
type ToolPart,
type ToolState,
type AssistantMessage,
} from '@opencode-ai/sdk/v2/client';
import type { ToolCallStatus } from '@agentclientprotocol/sdk';
import type { Sql } from '../../db.js'; import type { Sql } from '../../db.js';
import type { AcpToolSnapshot } from '../acp-tool-snapshot.js';
import { armAbortGuard, noteTurnActivity, consumeTerminal } from './turn-guard.js'; import { armAbortGuard, noteTurnActivity, consumeTerminal } from './turn-guard.js';
import { stepEndedToUsage, type StepUsage } from './opencode-usage.js'; import { stepEndedToUsage, type StepUsage } from './opencode-usage.js';
import { decideRestart, DEFAULT_HEALTH_FAILURE_THRESHOLD } from './lifecycle-decisions.js'; import { OpenCodeServerSupervisor, type ServerDownInfo } from './opencode-server-process.js';
import {
startSessionEventLoop,
type SessionState,
type TurnState,
type SseLoopDeps,
} from './opencode-sse.js';
import {
classifyPartDelta,
classifyUpdatedPart,
toolCalledSnapshot,
toolSuccessSnapshot,
toolFailedSnapshot,
stripDcpTags,
errMsg,
errToString,
} from './opencode-event-map.js';
import type { import type {
AgentBackend, AgentBackend,
AgentEvent,
AgentSessionHandle, AgentSessionHandle,
EnsureSessionOpts, EnsureSessionOpts,
PromptCtx, PromptCtx,
TurnResult, TurnResult,
} from '../agent-backend.js'; } from '../agent-backend.js';
const READY_TIMEOUT_MS = 30_000;
const SSE_RECONNECT_DELAY_MS = 1_000;
/** /**
* No-activity backstop for an in-flight turn. opencode streams reasoning/text/tool * No-activity backstop for an in-flight turn. opencode streams reasoning/text/tool
* deltas continuously while working, so "zero events for this long" means the turn * deltas continuously while working, so "zero events for this long" means the turn
* is wedged or its terminal event (session.idle) was lost (see the reconnect race * is wedged or its terminal event (session.idle) was lost. Generous so a
* below). Generous so a legitimately slow turn never trips it. * legitimately slow turn never trips it.
*/ */
const TURN_INACTIVITY_MS = 180_000; const TURN_INACTIVITY_MS = 180_000;
/** One in-flight turn's emitter + completion settler. */
interface TurnState {
onEvent: (e: AgentEvent) => void;
settle: (r: TurnResult) => void;
}
/** Per-(opencode session) demux state. dedup sets scoped here, cleared per turn. */
interface SessionState {
boocodeSessionId: string;
agentSessionId: string;
/** Worktree directory for SDK `directory` routing; refreshed each turn from ctx. */
worktreePath: string;
/** dedup gate: `${type}:${id}` added on delta, deleted-and-tested on updated. Cleared at turn end. */
streamedPartKeys: Set<string>;
/** partID → 'text' | 'reasoning', so a delta with a non-'reasoning' field is still classed right. Cleared at turn end. */
partTypeById: Map<string, string>;
activeTurn: TurnState | null;
/** Inactivity backstop timer for the active turn; null when no turn in flight. */
watchdog: ReturnType<typeof setTimeout> | null;
/** Per-session SSE subscription handle. Non-null while the loop is running;
* aborting it tears down the underlying fetch and exits the loop. */
sseAbort: AbortController | null;
/** F.1 post-abort orphan-terminal guard: swallow the one session.idle/error
* opencode emits for an aborted turn so it can't settle the next turn. */
swallowNextTerminal: boolean;
}
export interface OpenCodeServerBackendDeps { export interface OpenCodeServerBackendDeps {
sql: Sql; sql: Sql;
log: FastifyBaseLogger; log: FastifyBaseLogger;
@@ -98,36 +71,32 @@ export class OpenCodeServerBackend implements AgentBackend {
private readonly sql: Sql; private readonly sql: Sql;
private readonly log: FastifyBaseLogger; private readonly log: FastifyBaseLogger;
private readonly opencodeBinary: string; private readonly supervisor: OpenCodeServerSupervisor;
private child: ChildProcess | null = null;
private client: OpencodeClient | null = null;
private port: number | null = null;
private up = false;
private serverStarting: Promise<void> | null = null;
// Phase 3 busy-aware health monitor (openchamber lift): consecutive failed
// probes + the start of an unhealthy-while-busy window feed `decideRestart`.
private consecutiveHealthFailures = 0;
private unhealthyBusySince = 0;
private restarting: Promise<void> | null = null;
/** opencode session id → demux state. Maintained by ensureSession; read by the SSE loop. */ /** opencode session id → demux state. Maintained by ensureSession; read by the SSE loop. */
private readonly byOpencodeId = new Map<string, SessionState>(); private readonly byOpencodeId = new Map<string, SessionState>();
/** Coalesces concurrent ensureSession calls for the same (chat, agent) key. */
private readonly ensuring = new Map<string, Promise<AgentSessionHandle>>();
constructor(deps: OpenCodeServerBackendDeps) { constructor(deps: OpenCodeServerBackendDeps) {
this.sql = deps.sql; this.sql = deps.sql;
this.log = deps.log; this.log = deps.log;
this.opencodeBinary = deps.opencodeBinary; this.supervisor = new OpenCodeServerSupervisor({
opencodeBinary: deps.opencodeBinary,
log: deps.log,
hooks: {
isBusy: () => this.isBusy(),
onServerDown: (info) => this.onServerDown(info),
},
});
} }
/** §2: liveness for the health endpoint + dispatcher fallback decision. */ /** §2: liveness for the health endpoint + dispatcher fallback decision. */
health(): 'up' | 'down' { health(): 'up' | 'down' {
return this.up ? 'up' : 'down'; return this.supervisor.health();
} }
/** Phase 3: busy iff ANY pooled opencode session has an in-flight turn. The /** Phase 3: busy iff ANY pooled opencode session has an in-flight turn. */
* pool reads this to skip idle/LRU eviction and the health-monitor to defer a
* restart (never tear down a session mid-stream). */
isBusy(): boolean { isBusy(): boolean {
for (const st of this.byOpencodeId.values()) { for (const st of this.byOpencodeId.values()) {
if (st.activeTurn) return true; if (st.activeTurn) return true;
@@ -135,72 +104,23 @@ export class OpenCodeServerBackend implements AgentBackend {
return false; return false;
} }
// ─── Server lifecycle (1.2: spawn once + client + ready; Phase 3 crash-restart) ── /** Phase 3 proactive health probe + busy-aware self-restart, run by the pool's
* periodic sweep. Delegates to the supervisor. */
/** async tickHealth(now: number = Date.now()): Promise<void> {
* Lazy: start the single server on first use; re-spawn after a crash. Idempotent await this.supervisor.tickHealth(now);
* within one live server — `serverStarting` caches the in-flight start, and is
* reset to null by the crash handler so the NEXT ensureServer re-spawns a fresh
* server (Phase 3 crash recovery). A dead-but-not-yet-reaped child (exit handler
* raced) is also treated as needing a restart.
*/
private ensureServer(): Promise<void> {
const childDead = this.child != null && (this.child.exitCode !== null || this.child.signalCode !== null);
if (!this.serverStarting || (!this.up && childDead)) {
this.serverStarting = this.startServer();
}
return this.serverStarting;
}
private async startServer(): Promise<void> {
const port = await freePort();
// Phase 1: run unsecured on loopback (opencode's documented default — serve.ts
// only WARNS when OPENCODE_SERVER_PASSWORD is unset). The real boundary is the
// 127.0.0.1 bind. Defense-in-depth basic-auth is deferred: the hey-api client's
// auth wiring + opencode's exact scheme must be confirmed against a live server
// first, else every request 401s. Recon explicitly said "do NOT block on it".
const child = spawn(this.opencodeBinary, ['serve', '--hostname', '127.0.0.1', '--port', String(port)], {
stdio: ['ignore', 'pipe', 'pipe'],
env: { ...process.env },
});
this.child = child;
this.port = port;
// Child lifetime is the backend's (the pool's), NOT a request's. We never tie
// it to a per-turn abort signal. Phase 3: on unexpected exit we recover —
// settle any in-flight turns as failed, mark their agent_sessions rows crashed,
// and reset `serverStarting` so the next ensureServer re-spawns. opencode keeps
// sessions on disk, but a fresh server's in-memory state is gone, so the next
// turn's ensureSession (rows now 'crashed') creates fresh opencode sessions.
child.on('exit', (code, signal) => {
// Only react to THIS child's exit (a restart may have swapped in a new one).
if (this.child !== child) return;
this.handleServerCrash(code, signal, port);
});
await waitForReady(child, READY_TIMEOUT_MS);
this.client = createOpencodeClient({ baseUrl: `http://127.0.0.1:${port}` });
this.up = true;
this.log.info({ port }, 'opencode-server: ready');
} }
/** /**
* Crash handler (Phase 3, lift of openchamber's restart-on-exit path). The * Server down (crash-exit or forced restart): fail every in-flight turn so its
* server died with N live opencode sessions; we can't restart it here (the next * dispatcher unblocks, mark each session crashed so ensureSession won't resume a
* turn does, lazily — avoids a restart storm if the binary is broken). We: * now-dead native id, and tear down the SSE loops + demux state. Invoked by the
* 1. fail every in-flight turn so its dispatcher unblocks + publishes an error, * supervisor (it owns the process/port reset). Mirrors the original
* 2. mark each session's agent_sessions row 'crashed' so ensureSession won't * handleServerCrash session-half byte-for-byte.
* resume a now-dead native session id (it creates fresh),
* 3. tear down the SSE loops + demux state (stale against the dead server),
* 4. reclaim the port + reset state so the next ensureServer re-spawns.
*/ */
private handleServerCrash(code: number | null, signal: NodeJS.Signals | null, port: number): void { private onServerDown(info: ServerDownInfo): void {
this.up = false;
const states = [...this.byOpencodeId.values()]; const states = [...this.byOpencodeId.values()];
this.log.warn( this.log.warn(
{ code, signal, port, liveSessions: states.length }, { code: info.code, signal: info.signal, port: info.port, liveSessions: states.length },
'opencode-server: child exited — recovering (fail in-flight, mark crashed, re-spawn next turn)', 'opencode-server: child exited — recovering (fail in-flight, mark crashed, re-spawn next turn)',
); );
@@ -219,8 +139,6 @@ export class OpenCodeServerBackend implements AgentBackend {
} }
// Drop the demux map: every session id is stale against a fresh server. // Drop the demux map: every session id is stale against a fresh server.
this.byOpencodeId.clear(); this.byOpencodeId.clear();
this.client = null;
this.serverStarting = null; // force a re-spawn on the next ensureServer
if (crashedIds.length > 0) { if (crashedIds.length > 0) {
this.sql` this.sql`
@@ -230,146 +148,20 @@ export class OpenCodeServerBackend implements AgentBackend {
this.log.warn({ err: errMsg(err) }, 'opencode-server: failed to mark crashed sessions (non-fatal)'); this.log.warn({ err: errMsg(err) }, 'opencode-server: failed to mark crashed sessions (non-fatal)');
}); });
} }
// Reclaim the port so a re-spawn on a fixed/leaked port isn't blocked. Best
// effort; the next start uses a fresh ephemeral port anyway.
reclaimPort(port);
} }
/** // ─── SSE loop wiring ─────────────────────────────────────────────────────────
* Phase 3 proactive health monitor (openchamber `runHealthCheckCycle` lift,
* busy-aware). Probes the server's /global/health; on a sustained failure of a
* NON-busy server, force a restart so the next turn isn't blocked by a wedged
* (hung-but-not-exited) process. Busy servers are deferred via the stale-grace in
* `decideRestart` — never tear down live work. Driven by the pool's periodic
* sweep (best-effort; a crash-exit is already handled by `handleServerCrash` +
* lazy `ensureServer` re-spawn, so this only catches the hung case). No-op when
* the server was never started or a restart is already in flight.
*/
async tickHealth(now: number = Date.now()): Promise<void> {
if (!this.child || this.restarting) return;
const childExited = this.child.exitCode !== null || this.child.signalCode !== null;
// An exited child is recovered lazily by ensureServer; don't double-restart it.
if (childExited) return;
const healthy = await this.probeHealth(); /** The dependency bundle the per-session SSE loop reads. */
if (healthy) { private sseDeps(): SseLoopDeps {
this.consecutiveHealthFailures = 0; return {
this.unhealthyBusySince = 0; isUp: () => this.supervisor.isUp(),
return; getClient: () => this.supervisor.client,
} dispatchEvent: (ev) => this.dispatchEvent(ev),
this.consecutiveHealthFailures += 1; reconcile: (st) => this.reconcile(st),
const busy = this.isBusy(); onReconnectGiveUp: (st) => this.onReconnectGiveUp(st),
const decision = decideRestart({ log: this.log,
processExited: false, };
consecutiveFailures: this.consecutiveHealthFailures,
busy,
unhealthyBusySince: this.unhealthyBusySince,
now,
failureThreshold: DEFAULT_HEALTH_FAILURE_THRESHOLD,
});
// Stamp the start of an unhealthy-while-busy window so the stale-grace can fire.
if (busy && this.unhealthyBusySince === 0) this.unhealthyBusySince = now;
if (decision.action === 'restart') {
this.log.warn(
{ failures: this.consecutiveHealthFailures, busy, reason: decision.reason },
'opencode-server: health monitor forcing restart',
);
this.consecutiveHealthFailures = 0;
this.unhealthyBusySince = 0;
await this.restartServer();
}
}
private async probeHealth(): Promise<boolean> {
if (!this.client) return false;
try {
const res = await this.client.global.health();
return !res.error;
} catch {
return false;
}
}
/** Force-kill the current server + reclaim its port; the next ensureServer
* re-spawns (lazy). Mirrors handleServerCrash's state reset but is initiated by
* the health monitor rather than the OS. */
private async restartServer(): Promise<void> {
if (this.restarting) return this.restarting;
this.restarting = (async () => {
const child = this.child;
const port = this.port;
this.up = false;
// Fail in-flight turns + mark sessions crashed via the same path as a crash.
if (child) {
this.handleServerCrash(null, null, port ?? 0);
if (!child.killed) child.kill('SIGTERM');
}
if (port) {
reclaimPort(port);
await waitForPortRelease(port, 3_000);
}
this.child = null;
})().finally(() => {
this.restarting = null;
});
return this.restarting;
}
// ─── SSE read loop + demux + translate (1.3) + dedup (1.4) ───────────────────
/** Per-session SSE subscription, scoped to the session's worktree directory.
* opencode scopes events by the `directory` query param (defaults to the
* server's cwd if omitted), so two sessions in different worktrees each get
* their own dir-scoped stream and never drop each other's events. Idempotent:
* a no-op if this session's loop is already running. Started from ensureSession
* (and defensively from prompt) once worktreePath is known. */
private startSessionEventLoop(state: SessionState): void {
if (state.sseAbort) return; // already running
const abort = new AbortController();
state.sseAbort = abort;
void this.runSessionEventLoop(state, abort).finally(() => {
// Only clear if this controller is still the live one (a later restart may
// have already installed a new one).
if (state.sseAbort === abort) state.sseAbort = null;
});
}
private async runSessionEventLoop(state: SessionState, abort: AbortController): Promise<void> {
const signal = abort.signal;
while (this.up && this.client && !signal.aborted) {
try {
// Re-read worktreePath each (re)subscribe so a directory refresh is picked
// up on reconnect. Passing `signal` lets close/dispose tear down a stream
// that's parked in `for await` between events.
const sub = await this.client.event.subscribe(
{ directory: state.worktreePath },
{ signal },
);
for await (const ev of sub.stream) {
if (signal.aborted) break;
// Dir-scoped streams should only carry this session's events, but two
// sessions sharing a worktree (possible post-P1.5-b) each receive BOTH
// sessions' events — so drop anything that isn't ours, else the other
// session's deltas get processed twice (once per loop).
const sid = eventSessionId(ev);
if (sid != null && sid !== state.agentSessionId) continue;
this.dispatchEvent(ev);
}
if (this.up && !signal.aborted) {
await this.reconcile(state); // recover an idle/error lost during the gap
await sleep(SSE_RECONNECT_DELAY_MS);
}
} catch (err) {
if (!this.up || signal.aborted) break;
this.log.warn(
{ err: errMsg(err), agentSessionId: state.agentSessionId },
'opencode-server: session event loop error; reconnecting',
);
await this.reconcile(state);
await sleep(SSE_RECONNECT_DELAY_MS);
}
}
} }
/** Demux one event to the owning session's active turn. Unknown/between-turns → drop. */ /** Demux one event to the owning session's active turn. Unknown/between-turns → drop. */
@@ -398,15 +190,7 @@ export class OpenCodeServerBackend implements AgentBackend {
const st = this.byOpencodeId.get(p.sessionID); const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
const snap: AcpToolSnapshot = { st.activeTurn.onEvent({ type: 'tool_call', toolCall: toolCalledSnapshot(p) });
toolCallId: p.callID,
title: p.tool,
kind: null,
status: 'in_progress',
rawInput: p.input,
rawOutput: undefined,
};
st.activeTurn.onEvent({ type: 'tool_call', toolCall: snap });
return; return;
} }
case 'session.next.tool.success': { case 'session.next.tool.success': {
@@ -414,16 +198,7 @@ export class OpenCodeServerBackend implements AgentBackend {
const st = this.byOpencodeId.get(p.sessionID); const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
const output = p.content?.map((c) => ('text' in c ? (c as { text: string }).text : '')).join('') ?? ''; st.activeTurn.onEvent({ type: 'tool_update', toolCall: toolSuccessSnapshot(p) });
const snap: AcpToolSnapshot = {
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'completed',
rawInput: undefined,
rawOutput: output,
};
st.activeTurn.onEvent({ type: 'tool_update', toolCall: snap });
return; return;
} }
case 'session.next.tool.failed': { case 'session.next.tool.failed': {
@@ -431,15 +206,7 @@ export class OpenCodeServerBackend implements AgentBackend {
const st = this.byOpencodeId.get(p.sessionID); const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
const snap: AcpToolSnapshot = { st.activeTurn.onEvent({ type: 'tool_update', toolCall: toolFailedSnapshot(p) });
toolCallId: p.callID,
title: p.callID,
kind: null,
status: 'failed',
rawInput: undefined,
rawOutput: errToString(p.error),
};
st.activeTurn.onEvent({ type: 'tool_update', toolCall: snap });
return; return;
} }
// ─── per-step usage (U.6) — token/cost accounting for opencode sessions ── // ─── per-step usage (U.6) — token/cost accounting for opencode sessions ──
@@ -449,8 +216,7 @@ export class OpenCodeServerBackend implements AgentBackend {
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
// Accumulate this step's normalized usage onto the (chat_id, agent) row. // Accumulate this step's normalized usage onto the (chat_id, agent) row.
// Fire-and-forget: a DB hiccup must not stall the turn. opencode emits this // Fire-and-forget: a DB hiccup must not stall the turn.
// once per LLM step, so a multi-tool turn sums several deltas.
const usage = stepEndedToUsage(p); const usage = stepEndedToUsage(p);
void this.accumulateUsage(st, usage); void this.accumulateUsage(st, usage);
return; return;
@@ -461,15 +227,8 @@ export class OpenCodeServerBackend implements AgentBackend {
const st = this.byOpencodeId.get(p.sessionID); const st = this.byOpencodeId.get(p.sessionID);
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
const isReasoning = p.field === 'reasoning' || st.partTypeById.get(p.partID) === 'reasoning'; const e = classifyPartDelta(p, st);
if (isReasoning) { if (e) st.activeTurn.onEvent(e);
st.streamedPartKeys.add(`reasoning:${p.partID}`);
st.activeTurn.onEvent({ type: 'reasoning', text: p.delta });
} else if (p.field === 'text') {
st.streamedPartKeys.add(`text:${p.partID}`);
const cleaned = stripDcpTags(p.delta);
if (cleaned) st.activeTurn.onEvent({ type: 'text', text: cleaned });
}
return; return;
} }
case 'message.part.updated': { case 'message.part.updated': {
@@ -477,7 +236,8 @@ export class OpenCodeServerBackend implements AgentBackend {
const st = this.byOpencodeId.get(part.sessionID); const st = this.byOpencodeId.get(part.sessionID);
if (!st?.activeTurn) return; if (!st?.activeTurn) return;
this.bumpActivity(st); this.bumpActivity(st);
this.handleUpdatedPart(part, st); const e = classifyUpdatedPart(part, st);
if (e) st.activeTurn.onEvent(e);
return; return;
} }
// ─── lifecycle ───────────────────────────────────────────────────────── // ─── lifecycle ─────────────────────────────────────────────────────────
@@ -502,40 +262,6 @@ export class OpenCodeServerBackend implements AgentBackend {
} }
} }
/** Terminal part: dedup gate for text/reasoning; tool parts → tool_call/tool_update. */
private handleUpdatedPart(part: Part, st: SessionState): void {
const turn = st.activeTurn;
if (!turn) return;
if (part.type === 'text' || part.type === 'reasoning') {
st.partTypeById.set(part.id, part.type);
const key = resolvePartDedupeKey(part, part.type);
if (key && st.streamedPartKeys.delete(key)) return; // already streamed via delta
const raw = part.text ?? '';
const text = part.type === 'text' ? stripDcpTags(raw) : raw;
if (text && part.time?.end != null) {
turn.onEvent({ type: part.type, text });
}
return;
}
if (part.type === 'tool') {
const snap = toolPartToSnapshot(part);
const status = part.state?.status;
// tool_call on start (pending/running), tool_update on terminal (completed/error).
// The current ACP path merges both into one frame; the contract keeps them
// distinct because opencode's SSE distinguishes start from result.
const event: AgentEvent =
status === 'completed' || status === 'error'
? { type: 'tool_update', toolCall: snap }
: { type: 'tool_call', toolCall: snap };
turn.onEvent(event);
return;
}
// NOTE: opencode's SSE payload union carries no available-commands event, so the
// AgentEvent 'commands' arm is intentionally never emitted here (1.3).
}
// ─── turn-completion resilience (watchdog + reconnect reconcile) ───────────── // ─── turn-completion resilience (watchdog + reconnect reconcile) ─────────────
/** Reset the inactivity backstop on any event routed to a session's active turn. */ /** Reset the inactivity backstop on any event routed to a session's active turn. */
@@ -550,8 +276,8 @@ export class OpenCodeServerBackend implements AgentBackend {
st.watchdog.unref?.(); st.watchdog.unref?.();
} }
/** Watchdog fired: reconcile once; if the server says still-running we can't tell, so fail closed. /** Watchdog fired: reconcile once; if still-running we can't tell, so fail closed.
* Also mark the agent_sessions row crashed so a stale session isn't resumed next turn. */ * Also mark the agent_sessions row crashed so a stale session isn't resumed. */
private async onTurnStall(st: SessionState): Promise<void> { private async onTurnStall(st: SessionState): Promise<void> {
const settled = await this.reconcile(st); const settled = await this.reconcile(st);
if (!settled) { if (!settled) {
@@ -564,16 +290,27 @@ export class OpenCodeServerBackend implements AgentBackend {
} }
} }
/** SSE circuit-breaker fired (reconnect gave up): fail the active turn + mark the
* session crashed so it isn't resumed. The next turn re-creates a fresh session. */
private async onReconnectGiveUp(st: SessionState): Promise<void> {
if (!st.activeTurn) return;
await this.sql`
UPDATE agent_sessions SET status = 'crashed'
WHERE agent_session_id = ${st.agentSessionId}
`.catch(() => {});
st.activeTurn?.settle({ ok: false, error: 'opencode SSE stream lost (reconnect gave up)' });
}
/** /**
* Ask the server whether this session's turn already finished — recovers a * Ask the server whether this session's turn already finished — recovers a
* session.idle/error lost during an SSE gap. Returns true if it settled the turn. * session.idle/error lost during an SSE gap. Returns true if it settled the turn.
* Inconclusive (still running / call failed) → false; the watchdog covers that.
*/ */
private async reconcile(st: SessionState): Promise<boolean> { private async reconcile(st: SessionState): Promise<boolean> {
const turn = st.activeTurn; const turn = st.activeTurn;
if (!turn || !this.client) return false; const client = this.supervisor.client;
if (!turn || !client) return false;
try { try {
const res = await this.client.session.messages({ const res = await client.session.messages({
sessionID: st.agentSessionId, sessionID: st.agentSessionId,
directory: st.worktreePath, directory: st.worktreePath,
}); });
@@ -605,10 +342,8 @@ export class OpenCodeServerBackend implements AgentBackend {
/** /**
* Accumulate one `session.next.step.ended`'s normalized usage onto the session's * Accumulate one `session.next.step.ended`'s normalized usage onto the session's
* agent_sessions row, keyed by the resumed `agent_session_id` (unique per active * agent_sessions row. Running totals for the whole conversation context. Zero-delta
* row — the dispatcher's `(chat_id, agent)` lookup wrote it). Running totals for * steps are skipped. Errors are swallowed: usage telemetry must never fail a turn.
* the whole conversation context (not last-step). Zero-delta steps are skipped to
* avoid a no-op write. Errors are swallowed: usage telemetry must never fail a turn.
*/ */
private async accumulateUsage(st: SessionState, u: StepUsage): Promise<void> { private async accumulateUsage(st: SessionState, u: StepUsage): Promise<void> {
if (u.input === 0 && u.output === 0 && u.cost === 0) return; if (u.input === 0 && u.output === 0 && u.cost === 0) return;
@@ -631,13 +366,29 @@ export class OpenCodeServerBackend implements AgentBackend {
// ─── ensureSession: create-or-resume against agent_sessions (1.5) ──────────── // ─── ensureSession: create-or-resume against agent_sessions (1.5) ────────────
async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> { async ensureSession(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
await this.ensureServer(); // Coalesce concurrent first-turns for the same (chat, agent) so the SELECT…
if (!this.client) throw new Error('opencode-server: client not ready after ensureServer'); // create…upsert can't race into two opencode sessions (the second orphaning
// the first). A single (non-concurrent) call is unaffected — the entry is set
// and removed within this call. Defensive: the dispatcher already serializes
// turns per (chat, agent) via its inflight map.
const key = `${opts.chatId}:${opts.agent}`;
const existing = this.ensuring.get(key);
if (existing) return existing;
const p = this.ensureSessionInner(sessionId, opts).finally(() => {
if (this.ensuring.get(key) === p) this.ensuring.delete(key);
});
this.ensuring.set(key, p);
return p;
}
private async ensureSessionInner(sessionId: string, opts: EnsureSessionOpts): Promise<AgentSessionHandle> {
await this.supervisor.ensureServer();
const client = this.supervisor.client;
if (!client) throw new Error('opencode-server: client not ready after ensureServer');
const configHash = sessionConfigHash(opts.model); const configHash = sessionConfigHash(opts.model);
// P1.5-b: agent_sessions is keyed (chat_id, agent) — the tab/chat is the // P1.5-b: agent_sessions is keyed (chat_id, agent) — the tab/chat is the
// context unit (two tabs in one session = two contexts sharing one worktree). // context unit (two tabs in one session = two contexts sharing one worktree).
// session_id + worktree_id are retained as informational (SET NULL) columns.
const [row] = await this.sql<{ agent_session_id: string | null; status: string; config_hash: string | null }[]>` const [row] = await this.sql<{ agent_session_id: string | null; status: string; config_hash: string | null }[]>`
SELECT agent_session_id, status, config_hash FROM agent_sessions SELECT agent_session_id, status, config_hash FROM agent_sessions
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent} WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
@@ -655,7 +406,7 @@ export class OpenCodeServerBackend implements AgentBackend {
'opencode-server: not resuming stale session, creating fresh'); 'opencode-server: not resuming stale session, creating fresh');
this.byOpencodeId.delete(agentSessionId); this.byOpencodeId.delete(agentSessionId);
} }
const created = await this.client.session.create({ directory: opts.worktreePath }); const created = await client.session.create({ directory: opts.worktreePath });
if (created.error || !created.data) { if (created.error || !created.data) {
throw new Error(`opencode-server: session.create failed: ${errToString(created.error)}`); throw new Error(`opencode-server: session.create failed: ${errToString(created.error)}`);
} }
@@ -664,7 +415,7 @@ export class OpenCodeServerBackend implements AgentBackend {
INSERT INTO agent_sessions INSERT INTO agent_sessions
(chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at, config_hash) (chat_id, session_id, worktree_id, agent, backend, agent_session_id, server_port, status, last_active_at, config_hash)
VALUES VALUES
(${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'opencode_server', ${agentSessionId}, ${this.port}, 'active', clock_timestamp(), ${configHash}) (${opts.chatId}, ${sessionId}, ${opts.worktreeId}, ${opts.agent}, 'opencode_server', ${agentSessionId}, ${this.supervisor.port}, 'active', clock_timestamp(), ${configHash})
ON CONFLICT (chat_id, agent) DO UPDATE SET ON CONFLICT (chat_id, agent) DO UPDATE SET
session_id = EXCLUDED.session_id, session_id = EXCLUDED.session_id,
worktree_id = EXCLUDED.worktree_id, worktree_id = EXCLUDED.worktree_id,
@@ -678,7 +429,7 @@ export class OpenCodeServerBackend implements AgentBackend {
} else { } else {
await this.sql` await this.sql`
UPDATE agent_sessions UPDATE agent_sessions
SET status = 'active', last_active_at = clock_timestamp(), server_port = ${this.port}, config_hash = ${configHash} SET status = 'active', last_active_at = clock_timestamp(), server_port = ${this.supervisor.port}, config_hash = ${configHash}
WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent} WHERE chat_id = ${opts.chatId} AND agent = ${opts.agent}
`; `;
} }
@@ -693,24 +444,13 @@ export class OpenCodeServerBackend implements AgentBackend {
state.boocodeSessionId = sessionId; state.boocodeSessionId = sessionId;
state.worktreePath = opts.worktreePath; state.worktreePath = opts.worktreePath;
} else { } else {
state = { state = this.makeSessionState(sessionId, ocSessionId, opts.worktreePath);
boocodeSessionId: sessionId,
agentSessionId: ocSessionId,
worktreePath: opts.worktreePath,
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: null,
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
};
this.byOpencodeId.set(ocSessionId, state); this.byOpencodeId.set(ocSessionId, state);
} }
// Start this session's own SSE loop, scoped to its worktree directory. Both // Start this session's own SSE loop, scoped to its worktree directory. Both
// fresh-create and resume reach here; idempotent, so a re-ensure (e.g. a // fresh-create and resume reach here; idempotent.
// second turn) won't spawn a duplicate loop. startSessionEventLoop(state, this.sseDeps());
this.startSessionEventLoop(state);
return { return {
sessionId, sessionId,
@@ -719,40 +459,53 @@ export class OpenCodeServerBackend implements AgentBackend {
chatId: opts.chatId, chatId: opts.chatId,
worktreeId: opts.worktreeId, worktreeId: opts.worktreeId,
agentSessionId: ocSessionId, agentSessionId: ocSessionId,
serverPort: this.port, serverPort: this.supervisor.port,
};
}
/** Fresh per-(opencode session) demux state. */
private makeSessionState(boocodeSessionId: string, agentSessionId: string, worktreePath: string): SessionState {
return {
boocodeSessionId,
agentSessionId,
worktreePath,
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: null,
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
}; };
} }
// ─── prompt: send one turn (1.6) ───────────────────────────────────────────── // ─── prompt: send one turn (1.6) ─────────────────────────────────────────────
async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> { async prompt(handle: AgentSessionHandle, input: string, ctx: PromptCtx): Promise<TurnResult> {
if (!this.client) throw new Error('opencode-server: client not ready'); const client = this.supervisor.client;
if (!client) throw new Error('opencode-server: client not ready');
const oc = handle.agentSessionId; const oc = handle.agentSessionId;
if (!oc) throw new Error('opencode-server: handle has no agentSessionId'); if (!oc) throw new Error('opencode-server: handle has no agentSessionId');
let state = this.byOpencodeId.get(oc); let state = this.byOpencodeId.get(oc);
if (!state) { if (!state) {
state = { state = this.makeSessionState(handle.sessionId, oc, ctx.worktreePath);
boocodeSessionId: handle.sessionId,
agentSessionId: oc,
worktreePath: ctx.worktreePath,
streamedPartKeys: new Set(),
partTypeById: new Map(),
activeTurn: null,
watchdog: null,
sseAbort: null,
swallowNextTerminal: false,
};
this.byOpencodeId.set(oc, state); this.byOpencodeId.set(oc, state);
} }
const session = state; const session = state;
// v2.7 busy-assert: one in-flight turn per session. The dispatcher serializes
// turns per (chat, agent), so this never fires in normal dispatch — but if a
// second prompt arrives while one is live it would silently overwrite the slot
// and orphan the first turn, so reject instead.
if (session.activeTurn) {
return { ok: false, error: 'opencode-server: session already has an in-flight turn' };
}
// Authoritative per-turn directory for SDK routing + reconcile. // Authoritative per-turn directory for SDK routing + reconcile.
session.worktreePath = ctx.worktreePath; session.worktreePath = ctx.worktreePath;
// Defensive: ensureSession normally starts the loop, but if prompt is reached // Defensive: ensureSession normally starts the loop, but if prompt is reached
// with a freshly-created state (no loop yet), start it so the turn streams. // with a freshly-created state (no loop yet), start it so the turn streams.
// Idempotent when ensureSession already started one. startSessionEventLoop(session, this.sseDeps());
this.startSessionEventLoop(session);
const client = this.client;
return await new Promise<TurnResult>((resolve) => { return await new Promise<TurnResult>((resolve) => {
let settled = false; let settled = false;
@@ -781,7 +534,8 @@ export class OpenCodeServerBackend implements AgentBackend {
settle({ ok: false, error: 'aborted' }); settle({ ok: false, error: 'aborted' });
}; };
session.activeTurn = { onEvent: ctx.onEvent, settle }; const turn: TurnState = { onEvent: ctx.onEvent, settle };
session.activeTurn = turn;
this.bumpActivity(session); // arm the inactivity backstop this.bumpActivity(session); // arm the inactivity backstop
if (ctx.signal.aborted) { if (ctx.signal.aborted) {
@@ -822,39 +576,15 @@ export class OpenCodeServerBackend implements AgentBackend {
} }
async dispose(): Promise<void> { async dispose(): Promise<void> {
this.up = false;
// Abort every per-session SSE loop so none survive the teardown. // Abort every per-session SSE loop so none survive the teardown.
for (const st of this.byOpencodeId.values()) st.sseAbort?.abort(); for (const st of this.byOpencodeId.values()) st.sseAbort?.abort();
const child = this.child;
this.child = null;
this.client = null;
this.byOpencodeId.clear(); this.byOpencodeId.clear();
if (child && !child.killed) { await this.supervisor.dispose();
child.kill('SIGTERM');
const t = setTimeout(() => {
if (!child.killed) child.kill('SIGKILL');
}, 5_000);
t.unref();
}
} }
} }
// ─── helpers ────────────────────────────────────────────────────────────────── // ─── helpers ──────────────────────────────────────────────────────────────────
/** Extract the opencode sessionID an event belongs to, across event shapes.
* Most carry `properties.sessionID`; `message.part.updated` nests it under
* `properties.part.sessionID`. Returns null when the event has no session
* (the per-session loop then leaves it to dispatchEvent, which drops it). */
function eventSessionId(ev: Event): string | null {
const props = (ev as { properties?: unknown }).properties;
if (!props || typeof props !== 'object') return null;
if (ev.type === 'message.part.updated') {
const part = (props as { part?: { sessionID?: string } }).part;
return part?.sessionID ?? null;
}
return (props as { sessionID?: string }).sessionID ?? null;
}
/** BooCoder model string "provider/model" → opencode's structured {providerID, modelID}. */ /** BooCoder model string "provider/model" → opencode's structured {providerID, modelID}. */
function parseModel(model: string | undefined): { providerID: string; modelID: string } | undefined { function parseModel(model: string | undefined): { providerID: string; modelID: string } | undefined {
if (!model || !model.trim()) return undefined; if (!model || !model.trim()) return undefined;
@@ -864,199 +594,14 @@ function parseModel(model: string | undefined): { providerID: string; modelID: s
return { providerID: trimmed.slice(0, idx), modelID: trimmed.slice(idx + 1) }; return { providerID: trimmed.slice(0, idx), modelID: trimmed.slice(idx + 1) };
} }
// No slash but non-empty → infer llama-swap (the only configured provider). // No slash but non-empty → infer llama-swap (the only configured provider).
// Guard against bare '/' or trailing/leading slash.
if (idx < 0 && trimmed.length > 0) { if (idx < 0 && trimmed.length > 0) {
return { providerID: 'llama-swap', modelID: trimmed }; return { providerID: 'llama-swap', modelID: trimmed };
} }
return undefined; return undefined;
} }
/** Ported verbatim from Paseo opencode-agent.ts: id → message-id fallback → null. */
function resolvePartDedupeKey(part: { id: string; messageID: string }, type: string): string | null {
if (part.id.trim().length > 0) return `${type}:${part.id}`;
if (part.messageID.trim().length > 0) return `${type}:message:${part.messageID}`;
return null;
}
/** opencode ToolPart → ACP-shaped snapshot (reuses the existing persist/render path). */
function toolPartToSnapshot(part: ToolPart): AcpToolSnapshot {
const state = part.state;
let rawInput: unknown;
let rawOutput: unknown;
let title: string | undefined;
if (state) {
if ('input' in state) rawInput = (state as { input?: unknown }).input;
if ('output' in state) rawOutput = (state as { output?: unknown }).output;
else if ('error' in state) rawOutput = (state as { error?: unknown }).error;
if ('title' in state) title = (state as { title?: string }).title;
}
return {
toolCallId: part.callID,
title: title ?? part.tool,
kind: null,
status: mapToolStatus(state?.status),
rawInput,
rawOutput,
};
}
function mapToolStatus(s: ToolState['status'] | undefined): ToolCallStatus | null {
switch (s) {
case 'pending':
return 'pending';
case 'running':
return 'in_progress';
case 'completed':
return 'completed';
case 'error':
return 'failed';
default:
return null;
}
}
/**
* Reclaim a loopback port a dead opencode child may still hold (lift of
* openchamber `killProcessOnPort`). Best-effort, POSIX-only (`lsof`/`kill`); a
* failure is harmless because the next spawn allocates a fresh ephemeral port.
* Never kills this process. Synchronous + short-timeout so the crash handler
* doesn't block.
*/
function reclaimPort(port: number | null): void {
if (!port || process.platform === 'win32') return;
try {
const res = spawnSync('lsof', ['-ti', `:${port}`], { encoding: 'utf8', timeout: 3_000, windowsHide: true });
const out = res.stdout || '';
const myPid = process.pid;
for (const pidStr of out.split(/\s+/)) {
const pid = parseInt(pidStr.trim(), 10);
if (pid && pid !== myPid) {
try {
spawnSync('kill', ['-9', String(pid)], { stdio: 'ignore', timeout: 2_000 });
} catch {
// ignore — best effort
}
}
}
} catch {
// lsof absent or failed — the fresh-ephemeral-port spawn doesn't need this.
}
}
/**
* Resolve true once nothing is listening on `port` (lift of openchamber
* `waitForPortRelease`). Used before re-spawning on a fixed port; with ephemeral
* ports it's a fast no-op. Probes 127.0.0.1; resolves false at the deadline.
*/
function waitForPortRelease(port: number, timeoutMs: number): Promise<boolean> {
const deadline = Date.now() + timeoutMs;
return new Promise((resolve) => {
const attempt = () => {
const socket = netConnect({ port, host: '127.0.0.1' });
let settled = false;
const finish = (released: boolean) => {
if (settled) return;
settled = true;
socket.removeAllListeners();
socket.destroy();
if (released || Date.now() >= deadline) {
resolve(released);
return;
}
setTimeout(attempt, 150);
};
socket.once('connect', () => finish(false));
socket.once('error', (err: NodeJS.ErrnoException) => {
if (err && (err.code === 'ECONNREFUSED' || err.code === 'EHOSTUNREACH')) finish(true);
else finish(false);
});
socket.setTimeout(500, () => finish(true));
};
attempt();
});
}
/** Bind-probe an ephemeral port on loopback. */
function freePort(): Promise<number> {
return new Promise((resolve, reject) => {
const srv = createServer();
srv.unref();
srv.on('error', reject);
srv.listen(0, '127.0.0.1', () => {
const addr = srv.address();
if (addr && typeof addr === 'object') {
const { port } = addr;
srv.close(() => resolve(port));
} else {
srv.close(() => reject(new Error('opencode-server: could not determine a free port')));
}
});
});
}
/** Resolve when the child prints the ready line; reject on timeout or early exit. */
function waitForReady(child: ChildProcess, timeoutMs: number): Promise<void> {
return new Promise((resolve, reject) => {
let done = false;
let stderrBuf = '';
const finish = (err?: Error) => {
if (done) return;
done = true;
clearTimeout(timer);
child.stdout?.off('data', onOut);
child.stderr?.off('data', onErr);
child.off('exit', onExit);
if (err) reject(err);
else resolve();
};
const onOut = (buf: Buffer) => {
if (buf.toString().includes('opencode server listening on')) finish();
};
const onErr = (buf: Buffer) => {
stderrBuf += buf.toString();
};
const onExit = (code: number | null) =>
finish(new Error(`opencode serve exited before ready (code ${code}); stderr: ${stderrBuf.slice(-2000)}`));
const timer = setTimeout(
() => finish(new Error(`opencode serve not ready in ${timeoutMs}ms; stderr: ${stderrBuf.slice(-2000)}`)),
timeoutMs,
);
child.stdout?.on('data', onOut);
child.stderr?.on('data', onErr);
child.on('exit', onExit);
});
}
function sleep(ms: number): Promise<void> {
return new Promise((r) => setTimeout(r, ms));
}
/** Strip opencode-dcp plugin tags that render as literal text in the UI. */
function stripDcpTags(s: string): string {
return s.replace(/<dcp-message-id>[^<]*<\/dcp-message-id>/g, '');
}
function errMsg(e: unknown): string {
return e instanceof Error ? e.message : String(e);
}
function errToString(e: unknown): string {
if (e == null) return 'unknown error';
if (typeof e === 'string') return e;
if (e instanceof Error) return e.message;
try {
return JSON.stringify(e);
} catch {
return String(e);
}
}
/** Hash of stable config — detects model changes across sessions without /** Hash of stable config — detects model changes across sessions without
* invalidating on ephemeral state like the random server port (which changes * invalidating on ephemeral state like the random server port. */
* every BooCoder restart). */
function sessionConfigHash(model: string): string { function sessionConfigHash(model: string): string {
return createHash('sha256').update(`opencode_server|${model}`).digest('hex').slice(0, 16); return createHash('sha256').update(`opencode_server|${model}`).digest('hex').slice(0, 16);
} }

View File

@@ -0,0 +1,181 @@
/**
* Per-session SSE subscribe loop + reconnect/backoff + eventSessionId demux.
*
* Extracted (v2.7 audit reshape) from `OpenCodeServerBackend.startSessionEventLoop`
* / `runSessionEventLoop`. opencode scopes events by the `directory` query param, so
* each session runs its own dir-scoped stream and never drops a sibling's events.
*
* The loop is intentionally thin: it owns subscribe + the demux filter + reconnect
* timing only. Translating an event into turn side effects (watchdog, usage,
* settle) stays on the backend via the injected `dispatchEvent` / `reconcile`
* callbacks — `opencode-sse` knows nothing about turns or the DB.
*
* v2.7 concurrency hardening: the throw-driven reconnect path now backs off
* exponentially and trips a circuit-breaker (`onReconnectGiveUp`) after a bounded
* number of consecutive failures, instead of looping forever at a flat 1s. The
* HAPPY PATH is unchanged — a clean stream end (server still up) reconnects after
* `baseMs` (1s, as before) and resets the failure counter, so a long-lived session
* that re-subscribes normally never backs off.
*/
import type { FastifyBaseLogger } from 'fastify';
import type { Event, OpencodeClient } from '@opencode-ai/sdk/v2/client';
import type { AgentEvent } from '../agent-backend.js';
import type { TurnResult } from '../agent-backend.js';
import { eventSessionId, errMsg } from './opencode-event-map.js';
export const SSE_RECONNECT_DELAY_MS = 1_000;
/** One in-flight turn's emitter + completion settler. */
export interface TurnState {
onEvent: (e: AgentEvent) => void;
settle: (r: TurnResult) => void;
}
/** Per-(opencode session) demux state. dedup sets scoped here, cleared per turn. */
export interface SessionState {
boocodeSessionId: string;
agentSessionId: string;
/** Worktree directory for SDK `directory` routing; refreshed each turn from ctx. */
worktreePath: string;
/** dedup gate: `${type}:${id}` added on delta, deleted-and-tested on updated. Cleared at turn end. */
streamedPartKeys: Set<string>;
/** partID → 'text' | 'reasoning', so a delta with a non-'reasoning' field is still classed right. Cleared at turn end. */
partTypeById: Map<string, string>;
activeTurn: TurnState | null;
/** Inactivity backstop timer for the active turn; null when no turn in flight. */
watchdog: ReturnType<typeof setTimeout> | null;
/** Per-session SSE subscription handle. Non-null while the loop is running;
* aborting it tears down the underlying fetch and exits the loop. */
sseAbort: AbortController | null;
/** F.1 post-abort orphan-terminal guard: swallow the one session.idle/error
* opencode emits for an aborted turn so it can't settle the next turn. */
swallowNextTerminal: boolean;
}
// ─── reconnect backoff (pure) ────────────────────────────────────────────────
export interface ReconnectPolicy {
/** First retry delay (and the steady-state clean-reconnect delay). */
baseMs: number;
/** Cap on the exponential delay. */
maxMs: number;
/** Consecutive failures tolerated before the breaker trips (give up). */
maxAttempts: number;
}
export const DEFAULT_RECONNECT_POLICY: ReconnectPolicy = {
baseMs: SSE_RECONNECT_DELAY_MS,
maxMs: 30_000,
maxAttempts: 6,
};
export type ReconnectDecision =
| { action: 'reconnect'; delayMs: number }
| { action: 'give-up' };
/**
* Pure backoff decision after `failures` consecutive throwing reconnect attempts
* (1-based: the first failure passes `failures=1`). Returns an exponentially
* growing delay capped at `maxMs`, or `give-up` once the count exceeds
* `maxAttempts`. `failures=1` yields `baseMs`, so the very first retry matches the
* pre-hardening flat delay (happy-path-preserving).
*/
export function reconnectDecision(
failures: number,
policy: ReconnectPolicy = DEFAULT_RECONNECT_POLICY,
): ReconnectDecision {
if (failures > policy.maxAttempts) return { action: 'give-up' };
const exp = policy.baseMs * 2 ** (failures - 1);
return { action: 'reconnect', delayMs: Math.min(policy.maxMs, exp) };
}
// ─── the loop ────────────────────────────────────────────────────────────────
export interface SseLoopDeps {
/** Live iff the server is up (read each iteration so a crash stops the loop). */
isUp: () => boolean;
/** The current opencode client (null between server restarts). */
getClient: () => OpencodeClient | null;
/** Route one demuxed event to its turn (backend side effects live here). */
dispatchEvent: (ev: Event) => void;
/** Recover an idle/error lost during an SSE gap. Returns true if it settled. */
reconcile: (state: SessionState) => Promise<boolean>;
/** Circuit-breaker: called once the backoff gives up; fail the active turn. */
onReconnectGiveUp: (state: SessionState) => Promise<void> | void;
log: FastifyBaseLogger;
/** Injectable for tests; defaults to a real timer sleep. */
sleep?: (ms: number) => Promise<void>;
policy?: ReconnectPolicy;
}
function defaultSleep(ms: number): Promise<void> {
return new Promise((r) => setTimeout(r, ms));
}
/** Per-session SSE subscription, scoped to the session's worktree directory.
* Idempotent: a no-op if this session's loop is already running. */
export function startSessionEventLoop(state: SessionState, deps: SseLoopDeps): void {
if (state.sseAbort) return; // already running
const abort = new AbortController();
state.sseAbort = abort;
void runSessionEventLoop(state, abort, deps).finally(() => {
// Only clear if this controller is still the live one (a later restart may
// have already installed a new one).
if (state.sseAbort === abort) state.sseAbort = null;
});
}
export async function runSessionEventLoop(
state: SessionState,
abort: AbortController,
deps: SseLoopDeps,
): Promise<void> {
const signal = abort.signal;
const sleep = deps.sleep ?? defaultSleep;
const policy = deps.policy ?? DEFAULT_RECONNECT_POLICY;
let failures = 0;
while (deps.isUp() && deps.getClient() && !signal.aborted) {
const client = deps.getClient()!;
try {
// Re-read worktreePath each (re)subscribe so a directory refresh is picked
// up on reconnect. Passing `signal` lets close/dispose tear down a stream
// that's parked in `for await` between events.
const sub = await client.event.subscribe({ directory: state.worktreePath }, { signal });
for await (const ev of sub.stream) {
if (signal.aborted) break;
// Dir-scoped streams should only carry this session's events, but two
// sessions sharing a worktree (possible post-P1.5-b) each receive BOTH
// sessions' events — so drop anything that isn't ours, else the other
// session's deltas get processed twice (once per loop).
const sid = eventSessionId(ev);
if (sid != null && sid !== state.agentSessionId) continue;
deps.dispatchEvent(ev);
}
// Clean stream end — a healthy reconnect, NOT a failure: recover any lost
// terminal then re-subscribe at the base delay (pre-hardening behavior).
failures = 0;
if (deps.isUp() && !signal.aborted) {
await deps.reconcile(state); // recover an idle/error lost during the gap
await sleep(policy.baseMs);
}
} catch (err) {
if (!deps.isUp() || signal.aborted) break;
failures += 1;
const decision = reconnectDecision(failures, policy);
deps.log.warn(
{ err: errMsg(err), agentSessionId: state.agentSessionId, failures, action: decision.action },
'opencode-server: session event loop error; reconnecting',
);
await deps.reconcile(state);
if (decision.action === 'give-up') {
deps.log.warn(
{ agentSessionId: state.agentSessionId, failures },
'opencode-server: SSE reconnect gave up (circuit breaker) — failing active turn',
);
await deps.onReconnectGiveUp(state);
break;
}
await sleep(decision.delayMs);
}
}
}

View File

@@ -36,29 +36,15 @@
*/ */
import { spawn, type ChildProcess } from 'node:child_process'; import { spawn, type ChildProcess } from 'node:child_process';
import type { FastifyBaseLogger } from 'fastify'; import type { FastifyBaseLogger } from 'fastify';
import { import { ClientSideConnection, type Client } from '@agentclientprotocol/sdk';
ClientSideConnection,
type Client,
type SessionNotification,
type RequestPermissionRequest,
type RequestPermissionResponse,
type ReadTextFileRequest,
type ReadTextFileResponse,
type WriteTextFileRequest,
type WriteTextFileResponse,
type CreateTerminalRequest,
type CreateTerminalResponse,
type CreateElicitationRequest,
type CreateElicitationResponse,
} from '@agentclientprotocol/sdk';
import type { Sql } from '../../db.js'; import type { Sql } from '../../db.js';
import { resolveLaunchSpec } from '../acp-spawn.js'; import { resolveLaunchSpec } from '../acp-spawn.js';
import { isTurnOkForStopReason } from './warm-acp-routing.js'; import { isTurnOkForStopReason } from './warm-acp-routing.js';
import { getResolvedRegistry, type ResolvedProviderDef } from '../provider-config-registry.js'; import { getResolvedRegistry, type ResolvedProviderDef } from '../provider-config-registry.js';
import { createAcpNdJsonStream } from '../acp-stream.js'; import { createAcpNdJsonStream } from '../acp-stream.js';
import { mapSessionUpdate } from '../acp-event-map.js'; import { mapSessionUpdate } from '../acp-event-map.js';
import { readWorktreeTextFile, writeWorktreeTextFile } from '../acp-client-fs.js'; import { buildAcpClient } from '../acp-client.js';
import { waitForPermissionResponse, waitForElicitationResponse, cancelPendingPermission } from '../permission-waiter.js'; import { cancelPendingPermission } from '../permission-waiter.js';
import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from '../acp-tool-snapshot.js'; import { type AcpToolSnapshot, synthesizeCanceledSnapshots } from '../acp-tool-snapshot.js';
import type { import type {
AgentBackend, AgentBackend,
@@ -211,47 +197,25 @@ export class WarmAcpBackend implements AgentBackend {
); );
} }
/** Build the ACP Client callbacks ONCE per connection. They read `this.activeTurn` /** Build the ACP Client callbacks ONCE per connection (shared `buildAcpClient`).
* so each turn's events/permissions route to the right place — exactly the * `resolveTurn` reads `this.activeTurn` at each callback so events/permissions
* opencode-server `activeTurn` pattern. Worktree-scoped FS like AcpStreamContext. */ * route to the live turn — exactly the prior behavior. The warm session always
* has a non-empty `sessionId`, so the shared `taskId && sessionId` permission
* gate is equivalent to the old `turn?.taskId` gate. */
private buildClient(worktreePath: string): Client { private buildClient(worktreePath: string): Client {
return { return buildAcpClient(worktreePath, () => {
sessionUpdate: async (params: SessionNotification): Promise<void> => { const turn = this.activeTurn;
const turn = this.activeTurn; if (!turn) return null;
if (!turn) return; // between turns — drop (no orphan settles a future turn) return {
for (const event of mapSessionUpdate(params, turn.snapshots)) { taskId: turn.taskId,
turn.onEvent(event); sessionId: turn.sessionId,
} modeId: turn.modeId,
}, agent: this.agent,
requestPermission: async (params: RequestPermissionRequest): Promise<RequestPermissionResponse> => { onSessionUpdate: (params) => {
const turn = this.activeTurn; for (const event of mapSessionUpdate(params, turn.snapshots)) turn.onEvent(event);
if (turn?.taskId) { },
// Route to the UI via the per-turn task id (same as the one-shot path). };
return waitForPermissionResponse(turn.taskId, turn.sessionId, this.agent, turn.modeId, params); });
}
const firstOption = params.options[0];
if (firstOption) return { outcome: { outcome: 'selected', optionId: firstOption.optionId } };
return { outcome: { outcome: 'cancelled' } };
},
readTextFile: async (params: ReadTextFileRequest): Promise<ReadTextFileResponse> => {
const content = await readWorktreeTextFile(worktreePath, params.path, params.line, params.limit);
return { content };
},
writeTextFile: async (params: WriteTextFileRequest): Promise<WriteTextFileResponse> => {
await writeWorktreeTextFile(worktreePath, params.path, params.content);
return {};
},
createTerminal: async (_params: CreateTerminalRequest): Promise<CreateTerminalResponse> => {
return { terminalId: 'noop' };
},
unstable_createElicitation: async (params: CreateElicitationRequest): Promise<CreateElicitationResponse> => {
const turn = this.activeTurn;
if (turn?.taskId) {
return waitForElicitationResponse(turn.taskId, turn.sessionId, this.agent, turn.modeId, params);
}
return { action: 'decline' };
},
};
} }
// ─── ensureSession: create-or-reuse the warm session (2.1) ─────────────────── // ─── ensureSession: create-or-reuse the warm session (2.1) ───────────────────
@@ -303,6 +267,14 @@ export class WarmAcpBackend implements AgentBackend {
return { ok: false, error: 'warm-acp: no live ACP connection' }; return { ok: false, error: 'warm-acp: no live ACP connection' };
} }
// v2.7 busy-assert: one in-flight turn per warm session. The dispatcher
// serializes turns per (chat, agent), so this never fires in normal dispatch —
// but a second concurrent prompt would silently overwrite `activeTurn` and
// orphan the first turn, so reject instead.
if (this.activeTurn) {
return { ok: false, error: 'warm-acp: session already has an in-flight turn' };
}
const snapshots = new Map<string, AcpToolSnapshot>(); const snapshots = new Map<string, AcpToolSnapshot>();
// taskId routes permission/elicitation prompts back to the UI. The dispatcher // taskId routes permission/elicitation prompts back to the UI. The dispatcher
// passes it (plus mode) on the per-turn PromptCtx; permission-waiter keys on it. // passes it (plus mode) on the per-turn PromptCtx; permission-waiter keys on it.

View File

@@ -0,0 +1,50 @@
/**
* F1 — per-task abort registry. A Stop on an external-agent task must reach the
* in-flight run and abort its child / prompt. Each external run-function registers
* its per-turn AbortController here keyed by task id; the cancel route calls
* `cancel(taskId)` to fire it; the run-function's `.finally` deletes the entry.
*
* Idempotent by construction:
* - `cancel()` on an already-aborted controller no-ops (AbortController.abort()
* is idempotent) → a rapid double-Stop is safe.
* - `cancel()` on an unknown / already-finished task returns false → a
* cancel-after-natural-exit (entry already deleted) and a Stop on a native
* boocode task (never registered) are both safe no-ops.
*
* Pure (no DB / child / IO) so the abort wiring + idempotency contract is
* unit-testable in isolation — mirrors the turn-guard / lifecycle-decisions
* pure-helper precedent.
*/
export interface CancelRegistry {
/** Create + store an AbortController for this task, returning it for the run. */
register(taskId: string): AbortController;
/** Abort the task's in-flight run. Returns false when no controller is registered. */
cancel(taskId: string): boolean;
/** Drop the task's entry (called from the run's `.finally`). No-op if absent. */
delete(taskId: string): void;
/** Whether a controller is currently registered for this task. */
has(taskId: string): boolean;
}
export function createCancelRegistry(): CancelRegistry {
const controllers = new Map<string, AbortController>();
return {
register(taskId) {
const ac = new AbortController();
controllers.set(taskId, ac);
return ac;
},
cancel(taskId) {
const ac = controllers.get(taskId);
if (!ac) return false;
ac.abort();
return true;
},
delete(taskId) {
controllers.delete(taskId);
},
has(taskId) {
return controllers.has(taskId);
},
};
}

View File

@@ -22,6 +22,12 @@ import { shouldUseClaudeSdk } from './backends/claude-sdk-routing.js';
import type { AgentBackend, AgentEvent } from './agent-backend.js'; import type { AgentBackend, AgentEvent } from './agent-backend.js';
import { publishAgentStatus } from './agent-status-publish.js'; import { publishAgentStatus } from './agent-status-publish.js';
import type { AgentStatus } from './normalize-agent-status.js'; import type { AgentStatus } from './normalize-agent-status.js';
import { createCancelRegistry } from './cancel-registry.js';
import {
finalizeStreamingMessage,
classifyTerminalStatus,
type TerminalMessageStatus,
} from './finalize-message.js';
interface InferenceRunner { interface InferenceRunner {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void; enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void;
@@ -43,7 +49,11 @@ interface Deps {
const POLL_INTERVAL_MS = 2_000; const POLL_INTERVAL_MS = 2_000;
const COMPLETION_POLL_MS = 2_000; const COMPLETION_POLL_MS = 2_000;
export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<void> } { export function createDispatcher(deps: Deps): {
cancelExternalTask(taskId: string): boolean;
start(): void;
stop(): Promise<void>;
} {
const { sql, inference, broker, log, config } = deps; const { sql, inference, broker, log, config } = deps;
let timer: ReturnType<typeof setInterval> | null = null; let timer: ReturnType<typeof setInterval> | null = null;
let listener: { unlisten: () => Promise<void> } | null = null; let listener: { unlisten: () => Promise<void> } | null = null;
@@ -55,6 +65,13 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
// turn at a time. // turn at a time.
const inflight = new Map<string, Promise<void>>(); const inflight = new Map<string, Promise<void>>();
// F1: per-task abort registry. Each external run-function registers its per-turn
// AbortController here (keyed by task id); the cancel route reaches it through the
// exported `cancelExternalTask`; the run's `.finally` deletes the entry. Native
// boocode tasks are never registered, so a Stop on one returns false and falls
// through to the unchanged inference.cancel path.
const taskControllers = createCancelRegistry();
// Shared entry point for both the poll timer and the NOTIFY listener. poll()'s // Shared entry point for both the poll timer and the NOTIFY listener. poll()'s
// `polling`/`stopping` guard makes this safe to call concurrently — a notify // `polling`/`stopping` guard makes this safe to call concurrently — a notify
// arriving mid-poll returns immediately and never double-dispatches. // arriving mid-poll returns immediately and never double-dispatches.
@@ -83,6 +100,40 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
publishAgentStatus(broker.publishFrame, sessionId, chatId, agent, status, reason); publishAgentStatus(broker.publishFrame, sessionId, chatId, agent, status, reason);
} }
// F1 (OCE-001/OCE-002): finalize a streaming assistant message into a terminal
// state and publish the matching message_complete frame. Best-effort + idempotent
// (the helper's `WHERE status='streaming'` guard) — a failure here must never mask
// the original abort/error, so it logs and swallows.
function finalizeMessage(
sessionId: string,
chatId: string,
assistantId: string,
status: TerminalMessageStatus,
model: string | null,
content?: string,
): Promise<boolean> {
return finalizeStreamingMessage(sql, broker.publishFrame, {
sessionId,
chatId,
assistantId,
status,
model,
content,
}).catch((err) => {
log.error({ err: err instanceof Error ? err.message : String(err), assistantId }, 'dispatcher: finalizeStreamingMessage failed');
return false;
});
}
// F1: the cancel route's reach into an in-flight external run. Idempotent — a
// double-Stop re-aborts an already-aborted controller (no-op) and a Stop on a
// finished/native task returns false. Aborting only fires the backend's per-turn
// cancel (session.abort / session/cancel / interrupt / child.kill); it never kills
// a warm pool process, so persistent worktrees + pooled backends are preserved.
function cancelExternalTask(taskId: string): boolean {
return taskControllers.cancel(taskId);
}
async function poll(): Promise<void> { async function poll(): Promise<void> {
// `polling` serializes poll() execution itself (timer + NOTIFY can fire // `polling` serializes poll() execution itself (timer + NOTIFY can fire
// concurrently) so we never double-select a task. It does NOT serialize task // concurrently) so we never double-select a task. It does NOT serialize task
@@ -116,6 +167,9 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
// with the same key is skipped and a concurrent poll can't re-pick it. // with the same key is skipped and a concurrent poll can't re-pick it.
const p = runTask(task).finally(() => { const p = runTask(task).finally(() => {
inflight.delete(key); inflight.delete(key);
// F1: drop the abort controller once the run settles. After this, a Stop
// on the (now-finished) task returns false — cancel-after-exit is safe.
taskControllers.delete(task.id);
}); });
inflight.set(key, p); inflight.set(key, p);
} }
@@ -312,13 +366,16 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
return; return;
} }
// Create an abort controller for this task // F1: register the per-task abort controller so a Stop reaches this run.
const ac = new AbortController(); const ac = taskControllers.register(taskId);
// #10: hoisted above the try so the catch block can report `error` status with // #10: hoisted above the try so the catch block can report `error` status with
// the (chat, agent) key. Empty until resolved below; guarded before use. // the (chat, agent) key. Empty until resolved below; guarded before use.
let sessionId = ''; let sessionId = '';
let chatId = ''; let chatId = '';
// F1: hoisted so the catch / abort short-circuit can finalize the streaming
// assistant row. Empty until the row is created; finalize no-ops on ''.
let assistantId = '';
try { try {
// Mark running // Mark running
@@ -384,7 +441,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp()) VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; assistantId = assistantMsg!.id;
// write-edit-robustness #4: pre-turn worktree checkpoint (best-effort; a // write-edit-robustness #4: pre-turn worktree checkpoint (best-effort; a
// failure logs and never breaks dispatch). This path uses a per-task worktree // failure logs and never breaks dispatch). This path uses a per-task worktree
@@ -526,6 +583,20 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
} }
} }
// F1: abort short-circuit BEFORE the unconditional 'complete' write. A Stop
// (cancelExternalTask → ac.abort) or shutdown finalizes the streaming row as
// 'cancelled' (keeping whatever streamed) instead of recording 'complete',
// and skips the diff. This one-shot path owns a per-task worktree, so we DO
// tear it down here (unlike the warm paths, which keep their persistent one).
if (ac.signal.aborted || stopping) {
await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
await cleanupWorktree(projectPath, taskId);
clearTaskCommands(taskId);
return;
}
await sql` await sql`
UPDATE messages UPDATE messages
SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp() SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp()
@@ -539,14 +610,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
model: task.model, model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) {
await sql`
UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}
`;
await cleanupWorktree(projectPath, taskId);
return;
}
// Step 3: Diff the worktree and queue pending changes // Step 3: Diff the worktree and queue pending changes
log.info({ taskId }, 'dispatcher: diffing worktree'); log.info({ taskId }, 'dispatcher: diffing worktree');
const diff = await diffWorktree(worktreePath, projectPath, { signal: ac.signal }); const diff = await diffWorktree(worktreePath, projectPath, { signal: ac.signal });
@@ -587,18 +650,26 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
const status = classifyTerminalStatus({ aborted: ac.signal.aborted, error: err });
log.error({ taskId, agent, err: errMsg }, 'dispatcher: external agent error'); log.error({ taskId, agent, err: errMsg }, 'dispatcher: external agent error');
// Guard `NOT IN ('cancelled','completed')` so a genuine error in the catch
// never overwrites a state the cancel route already wrote (user-Stop wins).
await sql` await sql`
UPDATE tasks UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = ${status}, ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId} AND state NOT IN ('cancelled', 'completed')
`.catch(() => {}); `.catch(() => {});
// F1 (OCE-001): finalize the streaming assistant message — the catch
// previously updated only `tasks` and left the message 'streaming' forever
// (the BooChat 5-min sweep runs in a different process and can't reach it).
await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
// #10: external-agent turn failed/crashed. chatId may be unbound if the throw // #10: external-agent turn failed/crashed. chatId may be unbound if the throw
// preceded its assignment — guard so the status publish never masks the real // preceded its assignment — guard so the status publish never masks the real
// error. // error.
if (chatId) emitAgentStatus(sessionId, chatId, agent, 'error', 'failed'); if (chatId) emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'failed');
// Best-effort cleanup // Best-effort cleanup
await cleanupWorktree(projectPath, taskId); await cleanupWorktree(projectPath, taskId);
@@ -652,11 +723,14 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
return; return;
} }
const ac = new AbortController(); // F1: register the per-task abort controller so a Stop reaches this run.
const ac = taskControllers.register(taskId);
// #10: hoisted so the catch can report `error` with the (chat, agent) key. // #10: hoisted so the catch can report `error` with the (chat, agent) key.
let sessionId = ''; let sessionId = '';
let chatId = ''; let chatId = '';
// F1: hoisted so the catch / abort short-circuit can finalize the streaming row.
let assistantId = '';
try { try {
// execution_path = 'acp' — the schema CHECK has no 'opencode_server' value // execution_path = 'acp' — the schema CHECK has no 'opencode_server' value
@@ -728,7 +802,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp()) VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; assistantId = assistantMsg!.id;
// write-edit-robustness #4: pre-turn checkpoint of the persistent session // write-edit-robustness #4: pre-turn checkpoint of the persistent session
// worktree (best-effort; never breaks dispatch). worktreeId comes from the // worktree (best-effort; never breaks dispatch). worktreeId comes from the
@@ -856,6 +930,18 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText); await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText);
// F1: abort short-circuit BEFORE the unconditional 'complete' write — fixes
// the warm success-path recording 'complete' on a Stop'd turn. The abort fired
// session.abort on the prompt only: the persistent session worktree is kept
// (no cleanup) and the pooled opencode server stays warm for the next turn.
if (ac.signal.aborted || stopping) {
await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
clearTaskCommands(taskId);
return; // worktree persists (no cleanup); backend stays warm
}
await sql` await sql`
UPDATE messages UPDATE messages
SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp() SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp()
@@ -868,11 +954,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
model: task.model, model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) {
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
return; // worktree persists (no cleanup); backend stays warm
}
// 1.10: diff the persistent worktree against its captured baseline and // 1.10: diff the persistent worktree against its captured baseline and
// SUPERSEDE the session's prior pending row (latest-wins, one accumulating // SUPERSEDE the session's prior pending row (latest-wins, one accumulating
// diff) instead of stacking. Stamp agent for DiffPanel attribution. // diff) instead of stacking. Stamp agent for DiffPanel attribution.
@@ -920,14 +1001,17 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
clearTaskCommands(taskId); clearTaskCommands(taskId);
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
const status = classifyTerminalStatus({ aborted: ac.signal.aborted, error: err });
log.error({ taskId, agent, err: errMsg }, 'dispatcher: opencode server error'); log.error({ taskId, agent, err: errMsg }, 'dispatcher: opencode server error');
await sql` await sql`
UPDATE tasks UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = ${status}, ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId} AND state NOT IN ('cancelled', 'completed')
`.catch(() => {}); `.catch(() => {});
// F1 (OCE-001): finalize the streaming message (was left 'streaming').
await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
// #10: turn crashed. // #10: turn crashed.
if (chatId) emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed'); if (chatId) emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
clearTaskCommands(taskId); clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn. // No worktree cleanup (persistent); backend stays warm for the next turn.
} }
@@ -988,7 +1072,10 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
return; return;
} }
const ac = new AbortController(); // F1: register the per-task abort controller so a Stop reaches this run.
const ac = taskControllers.register(taskId);
// F1: hoisted so the catch / abort short-circuit can finalize the streaming row.
let assistantId = '';
try { try {
await sql` await sql`
@@ -1010,7 +1097,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp()) VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; assistantId = assistantMsg!.id;
// write-edit-robustness #4: pre-turn checkpoint of the persistent session // write-edit-robustness #4: pre-turn checkpoint of the persistent session
// worktree (best-effort; never breaks dispatch). Same worktree the opencode // worktree (best-effort; never breaks dispatch). Same worktree the opencode
@@ -1121,6 +1208,18 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText); await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText);
// F1: abort short-circuit BEFORE the unconditional 'complete' write — fixes
// the warm success-path recording 'complete' on a Stop'd turn. The abort fired
// session/cancel on the warm connection only (never killed the child), so the
// persistent worktree is kept and the pooled (chat,agent) backend stays warm.
if (ac.signal.aborted || stopping) {
await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
clearTaskCommands(taskId);
return; // worktree persists (no cleanup); backend stays warm
}
await sql` await sql`
UPDATE messages UPDATE messages
SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp() SET content = ${assistantContent}, status = 'complete', finished_at = clock_timestamp()
@@ -1133,11 +1232,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
model: task.model, model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) {
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
return; // worktree persists (no cleanup); backend stays warm
}
// Diff the persistent worktree against its captured baseline and SUPERSEDE // Diff the persistent worktree against its captured baseline and SUPERSEDE
// the session's prior pending row (latest-wins) — identical to opencode. // the session's prior pending row (latest-wins) — identical to opencode.
const diff = await diffWorktree(worktreePath, projectPath, { const diff = await diffWorktree(worktreePath, projectPath, {
@@ -1184,14 +1278,17 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
clearTaskCommands(taskId); clearTaskCommands(taskId);
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
const status = classifyTerminalStatus({ aborted: ac.signal.aborted, error: err });
log.error({ taskId, agent, err: errMsg }, 'dispatcher: warm ACP error'); log.error({ taskId, agent, err: errMsg }, 'dispatcher: warm ACP error');
await sql` await sql`
UPDATE tasks UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = ${status}, ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId} AND state NOT IN ('cancelled', 'completed')
`.catch(() => {}); `.catch(() => {});
// F1 (OCE-001): finalize the streaming message (was left 'streaming').
await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
// #10: turn crashed. // #10: turn crashed.
emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed'); emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
clearTaskCommands(taskId); clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn. // No worktree cleanup (persistent); backend stays warm for the next turn.
} }
@@ -1245,7 +1342,10 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
return; return;
} }
const ac = new AbortController(); // F1: register the per-task abort controller so a Stop reaches this run.
const ac = taskControllers.register(taskId);
// F1: hoisted so the catch / abort short-circuit can finalize the streaming row.
let assistantId = '';
try { try {
await sql` await sql`
@@ -1267,7 +1367,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp()) VALUES (${sessionId}, ${chatId}, 'assistant', '', 'streaming', ${task.model}, clock_timestamp())
RETURNING id RETURNING id
`; `;
const assistantId = assistantMsg!.id; assistantId = assistantMsg!.id;
// write-edit-robustness #4: pre-turn checkpoint of the persistent session // write-edit-robustness #4: pre-turn checkpoint of the persistent session
// worktree (best-effort; never breaks dispatch). // worktree (best-effort; never breaks dispatch).
@@ -1376,6 +1476,18 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText); await persistExternalAgentTurn(sql, assistantId, [...toolSnaps.values()], reasoningText);
// F1: abort short-circuit BEFORE the unconditional 'complete' write — fixes
// the warm success-path recording 'complete' on a Stop'd turn. The abort fired
// the SDK interrupt on the same query generator only (never killed the warm
// process), so the persistent worktree is kept and the backend stays warm.
if (ac.signal.aborted || stopping) {
await finalizeMessage(sessionId, chatId, assistantId, 'cancelled', task.model, assistantContent);
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
emitAgentStatus(sessionId, chatId, agent, 'idle', stopping ? 'shutdown' : 'cancelled');
clearTaskCommands(taskId);
return; // worktree persists (no cleanup); backend stays warm
}
// ctx_used/ctx_max from the SDK result (1M-aware) → the assistant message, so // ctx_used/ctx_max from the SDK result (1M-aware) → the assistant message, so
// the ContextBar renders a real context-window fill for claude. // the ContextBar renders a real context-window fill for claude.
await sql` await sql`
@@ -1391,11 +1503,6 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
model: task.model, model: task.model,
} as WsFrame); } as WsFrame);
if (stopping) {
await sql`UPDATE tasks SET state = 'cancelled', ended_at = clock_timestamp() WHERE id = ${taskId}`;
return; // worktree persists (no cleanup); backend stays warm
}
// Diff the persistent worktree against its captured baseline and SUPERSEDE // Diff the persistent worktree against its captured baseline and SUPERSEDE
// the session's prior pending row (latest-wins) — identical to opencode/ACP. // the session's prior pending row (latest-wins) — identical to opencode/ACP.
const diff = await diffWorktree(worktreePath, projectPath, { const diff = await diffWorktree(worktreePath, projectPath, {
@@ -1442,14 +1549,17 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
clearTaskCommands(taskId); clearTaskCommands(taskId);
} catch (err) { } catch (err) {
const errMsg = err instanceof Error ? err.message : String(err); const errMsg = err instanceof Error ? err.message : String(err);
const status = classifyTerminalStatus({ aborted: ac.signal.aborted, error: err });
log.error({ taskId, agent, err: errMsg }, 'dispatcher: claude SDK error'); log.error({ taskId, agent, err: errMsg }, 'dispatcher: claude SDK error');
await sql` await sql`
UPDATE tasks UPDATE tasks
SET state = 'failed', ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)} SET state = ${status}, ended_at = clock_timestamp(), output_summary = ${errMsg.slice(0, 500)}
WHERE id = ${taskId} WHERE id = ${taskId} AND state NOT IN ('cancelled', 'completed')
`.catch(() => {}); `.catch(() => {});
// F1 (OCE-001): finalize the streaming message (was left 'streaming').
await finalizeMessage(sessionId, chatId, assistantId, status, task.model);
// #10: turn crashed. // #10: turn crashed.
emitAgentStatus(sessionId, chatId, agent, 'error', 'crashed'); emitAgentStatus(sessionId, chatId, agent, status === 'cancelled' ? 'idle' : 'error', status === 'cancelled' ? 'cancelled' : 'crashed');
clearTaskCommands(taskId); clearTaskCommands(taskId);
// No worktree cleanup (persistent); backend stays warm for the next turn. // No worktree cleanup (persistent); backend stays warm for the next turn.
} }
@@ -1476,6 +1586,7 @@ export function createDispatcher(deps: Deps): { start(): void; stop(): Promise<v
} }
return { return {
cancelExternalTask,
start() { start() {
log.info('dispatcher: starting poll loop + tasks_new listener'); log.info('dispatcher: starting poll loop + tasks_new listener');

View File

@@ -0,0 +1,76 @@
import type { Sql } from '../db.js';
import type { WsFrame } from '@boocode/contracts/ws-frames';
export type TerminalMessageStatus = 'cancelled' | 'failed';
/**
* F1 (D-7) — decide the terminal status a Stop'd / errored external turn lands in.
*
* A user Stop (the per-task AbortController fired) or a thrown `AbortError` is a
* deliberate, non-error outcome → `'cancelled'`. A genuine thrown error → `'failed'`.
* Keeping the two distinct keeps the human-inbox / failure surfaces honest.
*
* Pure (no DB / IO) so the mapping is unit-testable in isolation.
*/
export function classifyTerminalStatus(opts: { aborted: boolean; error?: unknown }): TerminalMessageStatus {
if (opts.aborted) return 'cancelled';
if (opts.error instanceof Error && opts.error.name === 'AbortError') return 'cancelled';
return 'failed';
}
/**
* F1 (OCE-001 / OCE-002) — finalize a streaming assistant message into a terminal
* state and publish the matching `message_complete` frame.
*
* Idempotent via `WHERE status = 'streaming'`: a second call (a double-Stop, or an
* abort short-circuit followed by the catch block) updates zero rows and does NOT
* re-publish, so the frontend reducer settles the message exactly once. It also
* never clobbers a row that already finished cleanly (`complete`) — the abort that
* raced a clean finish is a no-op.
*
* Returns `true` iff this call performed the finalization (the row was still
* streaming); `false` if it was already terminal or the id is absent (the throw
* preceded the row's creation).
*/
export async function finalizeStreamingMessage(
sql: Sql,
publishFrame: (sessionId: string, frame: WsFrame) => void,
opts: {
sessionId: string;
chatId: string;
assistantId: string;
status: TerminalMessageStatus;
model: string | null;
/** Partial accumulated text to persist; omit to leave the row's content untouched. */
content?: string;
},
): Promise<boolean> {
const { sessionId, chatId, assistantId, status, model, content } = opts;
if (!assistantId) return false;
const rows =
content !== undefined
? await sql<{ id: string }[]>`
UPDATE messages
SET content = ${content}, status = ${status}, finished_at = clock_timestamp()
WHERE id = ${assistantId} AND status = 'streaming'
RETURNING id
`
: await sql<{ id: string }[]>`
UPDATE messages
SET status = ${status}, finished_at = clock_timestamp()
WHERE id = ${assistantId} AND status = 'streaming'
RETURNING id
`;
if (rows.length === 0) return false;
publishFrame(sessionId, {
type: 'message_complete',
message_id: assistantId,
chat_id: chatId,
model,
status,
} as WsFrame);
return true;
}

View File

@@ -0,0 +1,142 @@
/**
* AgentEvent → WS-frame emitter + turn accumulators.
*
* Extracted (v2.7 audit reshape) from `AcpStreamContext.handleSessionUpdate` in
* `acp-dispatch.ts` — the `AgentEvent → broker.publishFrame` switch that maps a
* backend's normalized events onto the wire frames the UI consumes, while
* accumulating the turn's text / reasoning / tool snapshots for persistence.
*
* The same shape backs the dispatcher's 4 inline `onEvent` copies (DEFERRED while
* dispatcher.ts has uncommitted edits), hence the optional `dcp` stripper + the
* `finalize()` flush: the opencode dispatch path strips dcp tags from text deltas,
* the ACP path does not (passes no `dcp`, so text is emitted verbatim — identical
* to the prior AcpStreamContext behavior).
*
* Publishing is gated on `canStream()` (all of broker/sessionId/chatId/assistantId
* present) exactly as the original — a one-shot dispatch with no broker accumulates
* but never publishes.
*/
import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames';
import type { AgentEvent } from './agent-backend.js';
import { type AcpToolSnapshot, snapshotToWireToolCall } from './acp-tool-snapshot.js';
import { mergeTaskCommands, getTaskCommands } from './agent-commands-cache.js';
import type { DcpStreamStripper } from './dcp-strip.js';
export interface FrameEmitterOpts {
broker?: Broker;
sessionId?: string;
chatId?: string;
/** The assistant message id — the frames' `message_id`. */
assistantId?: string;
/** Per-turn task id, for the agent_commands frame + command cache. */
taskId?: string;
/** Optional cross-chunk dcp stripper for text deltas (opencode path). When
* provided, text is stripped before push/publish and `finalize()` flushes the
* held-back tail. The ACP path passes none → text emitted verbatim. */
dcp?: DcpStreamStripper;
}
export interface FrameEmitter {
/** Map one AgentEvent to its WS frame(s) + accumulate it. */
onEvent: (e: AgentEvent) => void;
/** Flush a dcp stripper's held-back tail at turn end (no-op without `dcp`). */
finalize: () => void;
/** The merge accumulator for tool snapshots (toolCallId → snapshot). */
readonly toolSnapshots: Map<string, AcpToolSnapshot>;
/** Accumulated assistant text (post-dcp-strip when a stripper is set). */
readonly output: string;
/** Accumulated reasoning text. */
readonly reasoningText: string;
/** Tool snapshots in insertion order. */
readonly snapshots: AcpToolSnapshot[];
}
export function makeFrameEmitter(opts: FrameEmitterOpts): FrameEmitter {
const { broker, sessionId, chatId, assistantId, taskId, dcp } = opts;
const textChunks: string[] = [];
const reasoningChunks: string[] = [];
const toolSnapshots = new Map<string, AcpToolSnapshot>();
const canStream = (): boolean => !!(broker && sessionId && chatId && assistantId);
const publishText = (content: string): void => {
textChunks.push(content);
if (canStream()) {
broker!.publishFrame(sessionId!, {
type: 'delta',
message_id: assistantId!,
chat_id: chatId!,
content,
} as WsFrame);
}
};
const onEvent = (e: AgentEvent): void => {
switch (e.type) {
case 'text': {
const safe = dcp ? dcp.push(e.text) : e.text;
if (safe) publishText(safe);
break;
}
case 'reasoning':
reasoningChunks.push(e.text);
if (canStream()) {
broker!.publishFrame(sessionId!, {
type: 'reasoning_delta',
message_id: assistantId!,
chat_id: chatId!,
content: e.text,
} as WsFrame);
}
break;
case 'tool_call':
case 'tool_update':
toolSnapshots.set(e.toolCall.toolCallId, e.toolCall);
if (canStream()) {
broker!.publishFrame(sessionId!, {
type: 'tool_call',
message_id: assistantId!,
chat_id: chatId!,
tool_call: snapshotToWireToolCall(e.toolCall),
} as WsFrame);
}
break;
case 'commands':
if (taskId && e.commands.length > 0) {
mergeTaskCommands(taskId, e.commands);
if (canStream() && sessionId) {
const all = getTaskCommands(taskId) ?? e.commands;
broker!.publishFrame(sessionId, {
type: 'agent_commands',
task_id: taskId,
session_id: sessionId,
commands: all,
} as WsFrame);
}
}
break;
}
};
const finalize = (): void => {
if (!dcp) return;
const tail = dcp.flush();
if (tail) publishText(tail);
};
return {
onEvent,
finalize,
toolSnapshots,
get output() {
return textChunks.join('');
},
get reasoningText() {
return reasoningChunks.join('');
},
get snapshots() {
return [...toolSnapshots.values()];
},
};
}

View File

@@ -25,17 +25,21 @@ interface PendingRow {
session_id: string; session_id: string;
} }
interface WorktreeRow {
id: string;
worktree_path: string;
agent: string;
started_at: string;
}
interface ProjectPathRow { interface ProjectPathRow {
path: string; path: string;
} }
interface MessageRow {
id: string;
session_id: string;
chat_id: string | null;
role: string;
content: string;
status: string;
model: string | null;
created_at: Date;
}
function textResult(data: unknown) { function textResult(data: unknown) {
return { content: [{ type: 'text' as const, text: JSON.stringify(data, null, 2) }] }; return { content: [{ type: 'text' as const, text: JSON.stringify(data, null, 2) }] };
} }
@@ -196,25 +200,53 @@ export async function startMcpServer(sql: Sql): Promise<void> {
}, },
); );
// 6. boocoder.list_worktrees // 6. boocoder.view_session_history
server.tool( server.tool(
'boocoder.list_worktrees', 'boocoder.view_session_history',
'List active worktrees from running tasks', 'Retrieve the most-recent N messages of a session chat transcript (role != system) from messages_with_parts, returned in chronological (oldest→newest) order',
{}, {
async () => { session_id: z.string().describe('Session UUID'),
const rows = await sql<WorktreeRow[]>` chat_id: z.string().optional().describe('Optional chat UUID — narrows to one chat tab'),
SELECT id, worktree_path, agent, started_at limit: z
FROM tasks .number()
WHERE worktree_path IS NOT NULL AND state = 'running' .int()
ORDER BY started_at DESC .min(1)
`; .max(200)
const items = rows.map((r) => ({ .optional()
task_id: r.id, .describe('Max messages to return (default 50, max 200)'),
worktree_path: r.worktree_path, },
agent: r.agent, async (args) => {
started_at: r.started_at, const effectiveLimit = Math.min(args.limit ?? 50, 200);
})); let rows: MessageRow[];
return textResult(items); if (args.chat_id) {
rows = await sql<MessageRow[]>`
SELECT id, session_id, chat_id, role, content, status, model, created_at
FROM (
SELECT id, session_id, chat_id, role, content, status, model, created_at
FROM messages_with_parts
WHERE session_id = ${args.session_id}
AND chat_id = ${args.chat_id}
AND role != 'system'
ORDER BY created_at DESC
LIMIT ${effectiveLimit}
) sub
ORDER BY created_at ASC
`;
} else {
rows = await sql<MessageRow[]>`
SELECT id, session_id, chat_id, role, content, status, model, created_at
FROM (
SELECT id, session_id, chat_id, role, content, status, model, created_at
FROM messages_with_parts
WHERE session_id = ${args.session_id}
AND role != 'system'
ORDER BY created_at DESC
LIMIT ${effectiveLimit}
) sub
ORDER BY created_at ASC
`;
}
return textResult({ session_id: args.session_id, count: rows.length, messages: rows });
}, },
); );

View File

@@ -0,0 +1,88 @@
/**
* Generic POSIX loopback-port utilities.
*
* Extracted verbatim (v2.7 audit reshape) from `backends/opencode-server.ts`,
* where they were embedded in the backend god-class. They have nothing to do with
* opencode semantics — they reclaim/await/allocate a 127.0.0.1 port — so they live
* here as reusable infra. No behavior change from the original.
*/
import { createServer, connect as netConnect } from 'node:net';
import { spawnSync } from 'node:child_process';
/**
* Reclaim a loopback port a dead child may still hold (lift of openchamber
* `killProcessOnPort`). Best-effort, POSIX-only (`lsof`/`kill`); a failure is
* harmless because the next spawn allocates a fresh ephemeral port. Never kills
* this process. Synchronous + short-timeout so a crash handler doesn't block.
*/
export function reclaimPort(port: number | null): void {
if (!port || process.platform === 'win32') return;
try {
const res = spawnSync('lsof', ['-ti', `:${port}`], { encoding: 'utf8', timeout: 3_000, windowsHide: true });
const out = res.stdout || '';
const myPid = process.pid;
for (const pidStr of out.split(/\s+/)) {
const pid = parseInt(pidStr.trim(), 10);
if (pid && pid !== myPid) {
try {
spawnSync('kill', ['-9', String(pid)], { stdio: 'ignore', timeout: 2_000 });
} catch {
// ignore — best effort
}
}
}
} catch {
// lsof absent or failed — the fresh-ephemeral-port spawn doesn't need this.
}
}
/**
* Resolve true once nothing is listening on `port` (lift of openchamber
* `waitForPortRelease`). Used before re-spawning on a fixed port; with ephemeral
* ports it's a fast no-op. Probes 127.0.0.1; resolves false at the deadline.
*/
export function waitForPortRelease(port: number, timeoutMs: number): Promise<boolean> {
const deadline = Date.now() + timeoutMs;
return new Promise((resolve) => {
const attempt = () => {
const socket = netConnect({ port, host: '127.0.0.1' });
let settled = false;
const finish = (released: boolean) => {
if (settled) return;
settled = true;
socket.removeAllListeners();
socket.destroy();
if (released || Date.now() >= deadline) {
resolve(released);
return;
}
setTimeout(attempt, 150);
};
socket.once('connect', () => finish(false));
socket.once('error', (err: NodeJS.ErrnoException) => {
if (err && (err.code === 'ECONNREFUSED' || err.code === 'EHOSTUNREACH')) finish(true);
else finish(false);
});
socket.setTimeout(500, () => finish(true));
};
attempt();
});
}
/** Bind-probe an ephemeral port on loopback. */
export function freePort(): Promise<number> {
return new Promise((resolve, reject) => {
const srv = createServer();
srv.unref();
srv.on('error', reject);
srv.listen(0, '127.0.0.1', () => {
const addr = srv.address();
if (addr && typeof addr === 'object') {
const { port } = addr;
srv.close(() => resolve(port));
} else {
srv.close(() => reject(new Error('port-utils: could not determine a free port')));
}
});
});
}

View File

@@ -21,72 +21,3 @@
*/ */
export type AgentStatus = 'working' | 'blocked' | 'idle' | 'error'; export type AgentStatus = 'working' | 'blocked' | 'idle' | 'error';
/** The coarse signal a raw vendor event collapses to. */
export type AgentEventBucket = 'working' | 'blocked' | 'done';
// Each bucket lists the canonical vendor event names. Lookup is
// case-insensitive AND separator-insensitive (snake_case / camelCase /
// PascalCase all fold to the same key), so we normalize the raw input the same
// way before matching rather than enumerating every spelling here.
const WORKING_EVENTS = [
'SessionStart',
'UserPromptSubmit',
'UserPromptSubmitted',
'PostToolUse',
'PostToolUseFailure',
'BeforeAgent',
'AfterTool',
'task_started',
] as const;
const BLOCKED_EVENTS = [
'PreToolUse',
'Notification',
'PermissionRequest',
'exec_approval_request',
'apply_patch_approval_request',
'request_user_input',
] as const;
const DONE_EVENTS = [
'Stop',
'AfterAgent',
'SessionEnd',
'task_complete',
'agent-turn-complete',
] as const;
/**
* Fold a raw event name to a separator/case-insensitive key:
* strip every non-alphanumeric character and lowercase. So `post_tool_use`,
* `postToolUse`, `PostToolUse`, and `POST-TOOL-USE` all map to `posttooluse`.
*/
function foldKey(raw: string): string {
return raw.replace(/[^a-z0-9]/gi, '').toLowerCase();
}
function buildLookup(
groups: ReadonlyArray<readonly [AgentEventBucket, readonly string[]]>,
): Map<string, AgentEventBucket> {
const map = new Map<string, AgentEventBucket>();
for (const [bucket, names] of groups) {
for (const name of names) map.set(foldKey(name), bucket);
}
return map;
}
const EVENT_LOOKUP = buildLookup([
['working', WORKING_EVENTS],
['blocked', BLOCKED_EVENTS],
['done', DONE_EVENTS],
]);
/**
* Map a raw vendor hook-event name to its normalized bucket, or `null` when the
* name is unknown / undefined. Case- and separator-insensitive.
*/
export function normalizeAgentEvent(raw: string | undefined): AgentEventBucket | null {
if (!raw) return null;
return EVENT_LOOKUP.get(foldKey(raw)) ?? null;
}

View File

@@ -21,7 +21,8 @@ import { readdir, stat } from 'node:fs/promises';
import { join } from 'node:path'; import { join } from 'node:path';
import type { FastifyBaseLogger } from 'fastify'; import type { FastifyBaseLogger } from 'fastify';
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import { WORKTREE_BASE, checkWorktreeWorkAtRisk } from './worktrees.js'; import { WORKTREE_BASE } from './worktrees.js';
import { checkWorktreeWorkAtRisk } from './worktree-risk.js';
import { hostExec } from './host-exec.js'; import { hostExec } from './host-exec.js';
import { import {
selectOrphanWorktreeTargets, selectOrphanWorktreeTargets,

View File

@@ -181,10 +181,6 @@ export async function rejectOne(sql: Sql, changeId: string): Promise<void> {
await sql`UPDATE pending_changes SET status = 'rejected' WHERE id = ${changeId} AND status = 'pending'`; await sql`UPDATE pending_changes SET status = 'rejected' WHERE id = ${changeId} AND status = 'pending'`;
} }
export async function rejectAll(sql: Sql, sessionId: string): Promise<void> {
await sql`UPDATE pending_changes SET status = 'rejected' WHERE session_id = ${sessionId} AND status = 'pending'`;
}
// --- Rewind functions -------------------------------------------------------- // --- Rewind functions --------------------------------------------------------
export async function rewindOne( export async function rewindOne(

View File

@@ -127,7 +127,3 @@ export function getResolvedRegistry(): Map<string, ResolvedProviderDef> {
return cachedRegistry ?? buildResolvedRegistry(PROVIDERS, { providers: {} }); return cachedRegistry ?? buildResolvedRegistry(PROVIDERS, { providers: {} });
} }
/** Resolved provider ids in registry order. */
export function getResolvedProviderIds(): string[] {
return [...getResolvedRegistry().keys()];
}

View File

@@ -26,9 +26,4 @@ export const WRITE_TOOLS: readonly ToolDef<any>[] = [
checkTaskStatusTool, checkTaskStatusTool,
]; ];
// eslint-disable-next-line @typescript-eslint/no-explicit-any
export const WRITE_TOOLS_BY_NAME: ReadonlyMap<string, ToolDef<any>> = new Map(
WRITE_TOOLS.map((t) => [t.name, t]),
);
export { editFileTool, createFileTool, deleteFileTool, applyPendingTool, rewindTool, newTaskTool, listTasksTool, checkTaskStatusTool }; export { editFileTool, createFileTool, deleteFileTool, applyPendingTool, rewindTool, newTaskTool, listTasksTool, checkTaskStatusTool };

View File

@@ -0,0 +1,160 @@
/**
* Worktree work-at-risk assessment (split out of `worktrees.ts`, v2.7 audit
* reshape). The git-worktree create/diff/remove lifecycle stays in `worktrees.ts`;
* this module owns the orthogonal "would deleting this worktree lose work?" gate
* the server consults before a session delete, plus the recoverable stash escape.
*
* Session delete itself lives in apps/server (Docker), which CANNOT see the host
* worktree dirs or run git on them — only BooCoder (host systemd) can — so the
* server calls the routes that wrap these helpers. Behavior is unchanged from the
* original worktrees.ts implementation.
*/
import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
import { hostExec } from './host-exec.js';
/**
* Resolve the project's default branch as a git-usable ref (e.g. "origin/main").
*
* `refs/remotes/origin/HEAD` lives in the repo's COMMON git dir and is shared
* across every linked worktree, so reading it from the session worktree returns
* the REMOTE's default branch — never this worktree's own `session-<id>` branch
* (that would be `symbolic-ref HEAD`, a different ref). Falls back to probing
* common defaults by verified existence when origin/HEAD isn't set (e.g. a repo
* that never ran `git remote set-head`). Returns null if none resolve, in which
* case the unmerged check is skipped (dirty + unpushed still protect the work).
*/
async function detectDefaultBranchRef(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<string | null> {
const head = await hostExec(
`git -C ${shellEscape(worktreePath)} symbolic-ref --short refs/remotes/origin/HEAD`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (head.exitCode === 0) {
const ref = head.stdout.trim(); // e.g. "origin/main"
if (ref) {
const verify = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --verify --quiet ${shellEscape(ref + '^{commit}')}`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (verify.exitCode === 0 && verify.stdout.trim()) return ref;
}
}
// origin/HEAD unset or unresolvable — probe common defaults. Prefer the
// remote-tracking ref (always resolvable in a fresh worktree) over the local
// head, which may not exist if the default branch lives only in the main tree.
for (const cand of ['origin/main', 'origin/master', 'main', 'master']) {
const verify = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --verify --quiet ${shellEscape(cand + '^{commit}')}`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (verify.exitCode === 0 && verify.stdout.trim()) return cand;
}
return null;
}
/**
* Inspect a worktree for work that would be lost if its session were deleted.
* Three checks, all via the audited hostExec + shellEscape path (every
* interpolated value — paths, refs — is single-quote-escaped; no bare
* interpolation). Any unexpected git failure is treated as at-risk, never a
* silent pass.
*/
export async function checkWorktreeWorkAtRisk(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<WorktreeRiskReport> {
// Branch name — also doubles as the "is this still a git worktree?" probe.
const br = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --abbrev-ref HEAD`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (br.exitCode !== 0) {
return {
worktreePath,
branch: '',
dirty: false,
unpushed: 0,
unmerged: 0,
atRisk: true,
error: `git rev-parse failed: ${br.stderr.trim() || 'not a git worktree'}`,
};
}
const branch = br.stdout.trim();
// (a) Uncommitted (dirty working tree, including untracked files).
const st = await hostExec(
`git -C ${shellEscape(worktreePath)} status --porcelain`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
if (st.exitCode !== 0) {
return {
worktreePath,
branch,
dirty: false,
unpushed: 0,
unmerged: 0,
atRisk: true,
error: `git status failed: ${st.stderr.trim()}`,
};
}
const dirty = st.stdout.trim().length > 0;
// (b) Unpushed commits. No upstream configured => work exists only locally;
// treat as unpushed-by-definition (-1) rather than an error.
const up = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-list --count ${shellEscape('@{u}..HEAD')}`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
const unpushed = up.exitCode === 0 ? (parseInt(up.stdout.trim() || '0', 10) || 0) : -1;
// (c) Unmerged commits — on this branch but not in the project default branch.
const defaultRef = await detectDefaultBranchRef(worktreePath, opts);
let unmerged = 0;
if (defaultRef) {
const rl = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-list --count ${shellEscape(defaultRef + '..HEAD')}`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
if (rl.exitCode === 0) unmerged = parseInt(rl.stdout.trim() || '0', 10) || 0;
}
// unpushed only contributes when an upstream actually exists. Session branches
// (session-<id>) never have one (unpushed === -1), and any real local-only work
// there already surfaces as unmerged > 0 — so the no-upstream case adds no
// protection, only friction (it flagged every pristine worktree-backed session).
// The unpushed > 0 arm stays forward-compatible with P1.5 pushable branches.
const hasUpstream = unpushed !== -1;
const atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0);
return { worktreePath, branch, dirty, unpushed, unmerged, atRisk };
}
/**
* Stash a worktree's uncommitted changes (including untracked, via -u) so the
* working tree is clean. Stash entries live in the repo's common git dir, so
* they survive worktree-dir removal — this is the recoverable, safe-by-default
* escape. Note it only clears the *dirty* risk; unpushed/unmerged commits
* remain on the branch, so a re-attempted delete may still block on those.
*/
export async function stashWorktree(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<{ stashed: boolean; error?: string }> {
const r = await hostExec(
`git -C ${shellEscape(worktreePath)} stash push -u -m ${shellEscape('boocode: pre-delete stash')}`,
{ signal: opts?.signal, timeoutMs: 30_000 },
);
if (r.exitCode !== 0) {
return { stashed: false, error: r.stderr.trim() || r.stdout.trim() };
}
// "No local changes to save" => exit 0, nothing stashed — not an error.
const stashed = !/no local changes to save/i.test(r.stdout);
return { stashed };
}
/** Minimal shell escape for paths (single-quote wrapping). */
function shellEscape(s: string): string {
// Replace single quotes with escaped version, wrap in single quotes
return "'" + s.replace(/'/g, "'\\''") + "'";
}

View File

@@ -9,6 +9,7 @@
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import { hostExec } from './host-exec.js'; import { hostExec } from './host-exec.js';
import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk'; import type { WorktreeRiskReport } from '@boocode/contracts/worktree-risk';
import { checkWorktreeWorkAtRisk } from './worktree-risk.js';
export const WORKTREE_BASE = '/tmp/booworktrees'; export const WORKTREE_BASE = '/tmp/booworktrees';
@@ -383,147 +384,6 @@ export async function rebaselineWorktreeAfterApply(
// WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here. // WorktreeRiskReport single-sourced in @boocode/contracts — edit the package, not here.
export type { WorktreeRiskReport }; export type { WorktreeRiskReport };
/**
* Resolve the project's default branch as a git-usable ref (e.g. "origin/main").
*
* `refs/remotes/origin/HEAD` lives in the repo's COMMON git dir and is shared
* across every linked worktree, so reading it from the session worktree returns
* the REMOTE's default branch — never this worktree's own `session-<id>` branch
* (that would be `symbolic-ref HEAD`, a different ref). Falls back to probing
* common defaults by verified existence when origin/HEAD isn't set (e.g. a repo
* that never ran `git remote set-head`). Returns null if none resolve, in which
* case the unmerged check is skipped (dirty + unpushed still protect the work).
*/
async function detectDefaultBranchRef(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<string | null> {
const head = await hostExec(
`git -C ${shellEscape(worktreePath)} symbolic-ref --short refs/remotes/origin/HEAD`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (head.exitCode === 0) {
const ref = head.stdout.trim(); // e.g. "origin/main"
if (ref) {
const verify = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --verify --quiet ${shellEscape(ref + '^{commit}')}`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (verify.exitCode === 0 && verify.stdout.trim()) return ref;
}
}
// origin/HEAD unset or unresolvable — probe common defaults. Prefer the
// remote-tracking ref (always resolvable in a fresh worktree) over the local
// head, which may not exist if the default branch lives only in the main tree.
for (const cand of ['origin/main', 'origin/master', 'main', 'master']) {
const verify = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --verify --quiet ${shellEscape(cand + '^{commit}')}`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (verify.exitCode === 0 && verify.stdout.trim()) return cand;
}
return null;
}
/**
* Inspect a worktree for work that would be lost if its session were deleted.
* Three checks, all via the audited hostExec + shellEscape path (every
* interpolated value — paths, refs — is single-quote-escaped; no bare
* interpolation). Any unexpected git failure is treated as at-risk, never a
* silent pass.
*/
export async function checkWorktreeWorkAtRisk(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<WorktreeRiskReport> {
// Branch name — also doubles as the "is this still a git worktree?" probe.
const br = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-parse --abbrev-ref HEAD`,
{ signal: opts?.signal, timeoutMs: 10_000 },
);
if (br.exitCode !== 0) {
return {
worktreePath,
branch: '',
dirty: false,
unpushed: 0,
unmerged: 0,
atRisk: true,
error: `git rev-parse failed: ${br.stderr.trim() || 'not a git worktree'}`,
};
}
const branch = br.stdout.trim();
// (a) Uncommitted (dirty working tree, including untracked files).
const st = await hostExec(
`git -C ${shellEscape(worktreePath)} status --porcelain`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
if (st.exitCode !== 0) {
return {
worktreePath,
branch,
dirty: false,
unpushed: 0,
unmerged: 0,
atRisk: true,
error: `git status failed: ${st.stderr.trim()}`,
};
}
const dirty = st.stdout.trim().length > 0;
// (b) Unpushed commits. No upstream configured => work exists only locally;
// treat as unpushed-by-definition (-1) rather than an error.
const up = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-list --count ${shellEscape('@{u}..HEAD')}`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
const unpushed = up.exitCode === 0 ? (parseInt(up.stdout.trim() || '0', 10) || 0) : -1;
// (c) Unmerged commits — on this branch but not in the project default branch.
const defaultRef = await detectDefaultBranchRef(worktreePath, opts);
let unmerged = 0;
if (defaultRef) {
const rl = await hostExec(
`git -C ${shellEscape(worktreePath)} rev-list --count ${shellEscape(defaultRef + '..HEAD')}`,
{ signal: opts?.signal, timeoutMs: 15_000 },
);
if (rl.exitCode === 0) unmerged = parseInt(rl.stdout.trim() || '0', 10) || 0;
}
// unpushed only contributes when an upstream actually exists. Session branches
// (session-<id>) never have one (unpushed === -1), and any real local-only work
// there already surfaces as unmerged > 0 — so the no-upstream case adds no
// protection, only friction (it flagged every pristine worktree-backed session).
// The unpushed > 0 arm stays forward-compatible with P1.5 pushable branches.
const hasUpstream = unpushed !== -1;
const atRisk = dirty || unmerged > 0 || (hasUpstream && unpushed > 0);
return { worktreePath, branch, dirty, unpushed, unmerged, atRisk };
}
/**
* Stash a worktree's uncommitted changes (including untracked, via -u) so the
* working tree is clean. Stash entries live in the repo's common git dir, so
* they survive worktree-dir removal — this is the recoverable, safe-by-default
* escape. Note it only clears the *dirty* risk; unpushed/unmerged commits
* remain on the branch, so a re-attempted delete may still block on those.
*/
export async function stashWorktree(
worktreePath: string,
opts?: { signal?: AbortSignal },
): Promise<{ stashed: boolean; error?: string }> {
const r = await hostExec(
`git -C ${shellEscape(worktreePath)} stash push -u -m ${shellEscape('boocode: pre-delete stash')}`,
{ signal: opts?.signal, timeoutMs: 30_000 },
);
if (r.exitCode !== 0) {
return { stashed: false, error: r.stderr.trim() || r.stdout.trim() };
}
// "No local changes to save" => exit 0, nothing stashed — not an error.
const stashed = !/no local changes to save/i.test(r.stdout);
return { stashed };
}
/** Minimal shell escape for paths (single-quote wrapping). */ /** Minimal shell escape for paths (single-quote wrapping). */
function shellEscape(s: string): string { function shellEscape(s: string): string {
// Replace single quotes with escaped version, wrap in single quotes // Replace single quotes with escaped version, wrap in single quotes

View File

@@ -1,12 +0,0 @@
<!doctype html>
<html lang="en" class="dark">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>BooCoder</title>
</head>
<body class="bg-zinc-900 text-zinc-100">
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>

View File

@@ -1,30 +0,0 @@
{
"name": "@boocode/coder-web",
"version": "2.0.0",
"private": true,
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"typecheck": "tsc -b --noEmit",
"preview": "vite preview"
},
"dependencies": {
"@boocode/contracts": "workspace:*",
"lucide-react": "^1.16.0",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"react-markdown": "^10.1.0",
"react-router-dom": "^6.26.0",
"remark-gfm": "^4.0.1"
},
"devDependencies": {
"@tailwindcss/postcss": "^4.3.0",
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"@vitejs/plugin-react": "^4.3.1",
"tailwindcss": "^4.3.0",
"typescript": "^5.5.0",
"vite": "^5.3.4"
}
}

View File

@@ -1,5 +0,0 @@
export default {
plugins: {
'@tailwindcss/postcss': {},
},
};

View File

@@ -1,13 +0,0 @@
import { Routes, Route, Navigate } from 'react-router-dom';
import { Home } from './pages/Home';
import { Session } from './pages/Session';
export function App() {
return (
<Routes>
<Route path="/" element={<Home />} />
<Route path="/sessions/:sessionId" element={<Session />} />
<Route path="*" element={<Navigate to="/" replace />} />
</Routes>
);
}

View File

@@ -1,101 +0,0 @@
import type { Project, Session, Chat, Message, PendingChange, AskUserAnswer } from './types';
export class ApiError extends Error {
constructor(
public status: number,
public body: unknown,
) {
super(
typeof body === 'object' && body && 'error' in body
? String((body as { error: unknown }).error)
: `HTTP ${status}`,
);
}
}
async function request<T>(path: string, init: RequestInit = {}): Promise<T> {
const res = await fetch(path, {
...init,
headers: {
'Content-Type': 'application/json',
...(init.headers ?? {}),
},
});
if (res.status === 204) return undefined as T;
const text = await res.text();
const data = text ? JSON.parse(text) : undefined;
if (!res.ok) throw new ApiError(res.status, data);
return data as T;
}
export const api = {
health: () => request<{ ok: boolean; db: boolean; tools: number }>('/api/health'),
projects: {
list: (params?: { status?: 'open' | 'archived' }) =>
request<Project[]>(`/api/projects${params?.status ? `?status=${params.status}` : ''}`),
},
sessions: {
listForProject: (projectId: string, status?: 'open' | 'archived') =>
request<Session[]>(
`/api/projects/${projectId}/sessions${status ? `?status=${status}` : ''}`,
),
get: (id: string) => request<Session>(`/api/sessions/${id}`),
},
chats: {
listForSession: (sessionId: string) =>
request<Chat[]>(`/api/sessions/${sessionId}/chats`),
create: (sessionId: string, body?: { name?: string }) =>
request<Chat>(`/api/sessions/${sessionId}/chats`, {
method: 'POST',
body: JSON.stringify(body ?? {}),
}),
answerUserInput: (chatId: string, toolCallId: string, answers: AskUserAnswer[]) =>
request<{ tool_message_id: string; assistant_message_id: string }>(
`/api/chats/${chatId}/answer_user_input`,
{
method: 'POST',
body: JSON.stringify({ tool_call_id: toolCallId, answers }),
},
),
},
messages: {
send: (sessionId: string, chatId: string, content: string) =>
request<{ user_message_id: string; assistant_message_id: string }>(
`/api/sessions/${sessionId}/messages`,
{
method: 'POST',
body: JSON.stringify({ content, chat_id: chatId }),
},
),
stop: (sessionId: string) =>
request<{ cancelled: boolean }>(`/api/sessions/${sessionId}/stop`, {
method: 'POST',
}),
},
pending: {
list: (sessionId: string) =>
request<PendingChange[]>(`/api/sessions/${sessionId}/pending`),
applyAll: (sessionId: string) =>
request<{ results: Array<{ id: string; success: boolean; error?: string }> }>(
`/api/sessions/${sessionId}/pending/apply`,
{ method: 'POST' },
),
applyOne: (changeId: string) =>
request<{ success: boolean; error?: string }>(`/api/pending/${changeId}/apply`, {
method: 'POST',
}),
rejectOne: (changeId: string) =>
request<{ ok: boolean }>(`/api/pending/${changeId}/reject`, {
method: 'POST',
}),
rewindOne: (changeId: string) =>
request<{ success: boolean; error?: string }>(`/api/pending/${changeId}/rewind`, {
method: 'POST',
}),
},
};

View File

@@ -1,117 +0,0 @@
// Minimal types for the BooCoder frontend.
// Shared DB entities (same schema as BooChat).
//
// WS wire contracts are single-sourced from @boocode/contracts (the canonical
// Zod-backed schema). The DB entity types below (Project/Session/Chat/Message/
// ToolCall/ToolResult/PendingChange) are an intentional minimal SPA-local subset
// and are NOT cross-app contracts — they stay defined here.
import type { WsFrame } from '@boocode/contracts/ws-frames';
// Re-export the canonical WebSocket frame union (single source of truth). The
// coder backend publishes the full frame set; this SPA's reducer handles the
// subset it renders and ignores the rest.
export type { WsFrame };
// The error frame's `reason`, single-sourced from the canonical schema's
// frame-level reason enum (derived from WsFrame so it cannot drift from the
// wire). Distinct from message-metadata's ErrorReason, which is a different set.
export type ErrorReason = NonNullable<Extract<WsFrame, { type: 'error' }>['reason']>;
export interface Project {
id: string;
name: string;
path: string;
status: 'open' | 'archived';
created_at: string;
updated_at: string;
}
export interface Session {
id: string;
project_id: string;
name: string | null;
model: string | null;
status: 'open' | 'archived';
created_at: string;
updated_at: string;
}
export interface Chat {
id: string;
session_id: string;
name: string | null;
status: 'open' | 'archived';
created_at: string;
updated_at: string;
}
export interface ToolCall {
id: string;
name: string;
args: unknown;
}
export interface ToolResult {
tool_call_id: string;
output: unknown;
truncated?: boolean;
// Canonical wire shape: the failure message string (present only on error),
// not a boolean. ToolResultBubble treats it as truthy → renders error styling.
error?: string;
}
// Batch 9.7: ask_user_input shapes. The tool_call.args is { questions: AskUserQuestion[] }
// (1-3 entries); the eventual tool_result.output is { answers: AskUserAnswer[] } in the
// same order. AskUserInputCard renders questions and POSTs answers.
export type AskUserQuestionType = 'single_select' | 'multi_select';
export interface AskUserQuestion {
question: string;
type: AskUserQuestionType;
options: string[];
}
export interface AskUserAnswer {
question: string;
selected_options: string[];
free_text: string | null;
}
export interface AskUserAnswerSet {
answers: AskUserAnswer[];
}
export interface Message {
id: string;
session_id: string;
chat_id: string;
role: 'user' | 'assistant' | 'tool' | 'system';
content: string;
kind: string;
tool_calls: ToolCall[] | null;
tool_results: ToolResult | null;
status: 'streaming' | 'complete' | 'failed' | 'cancelled';
tokens_used: number | null;
ctx_used: number | null;
ctx_max: number | null;
started_at: string | null;
finished_at: string | null;
created_at: string;
metadata: unknown;
}
export interface PendingChange {
id: string;
session_id: string;
task_id: string | null;
file_path: string;
operation: 'create' | 'edit' | 'delete';
old_string: string | null;
new_string: string | null;
content: string | null;
diff: string | null;
status: 'pending' | 'applied' | 'rejected' | 'reverted';
created_at: string;
applied_at: string | null;
}

View File

@@ -1,323 +0,0 @@
import { useMemo, useState } from 'react';
import { Check } from 'lucide-react';
import { api } from '@/api/client';
import { RadioGroup, RadioGroupItem } from '@/components/ui/radio-group';
import { Button } from '@/components/ui/button';
import type {
AskUserAnswer,
AskUserAnswerSet,
AskUserQuestion,
ToolCall,
ToolResult,
} from '@/api/types';
// Batch 9.7: Inline interactive picker. Renders inside MessageList in place of
// the standard ToolCallLine when the assistant emits an ask_user_input tool
// call. While the tool result is null (server pre-stamps a sentinel with
// output=null), shows the form; once the WS tool_result frame arrives with a
// real AnswerSet, flips to read-only review mode.
interface Props {
toolCall: ToolCall;
toolResult: ToolResult | null;
chatId: string;
}
function parseQuestions(raw: unknown): AskUserQuestion[] {
if (!raw || typeof raw !== 'object' || !('questions' in raw)) return [];
const arr = (raw as { questions: unknown }).questions;
if (!Array.isArray(arr)) return [];
const out: AskUserQuestion[] = [];
for (const item of arr) {
if (!item || typeof item !== 'object') continue;
const q = item as { question?: unknown; type?: unknown; options?: unknown };
if (typeof q.question !== 'string') continue;
if (q.type !== 'single_select' && q.type !== 'multi_select') continue;
if (!Array.isArray(q.options)) continue;
const opts = q.options.filter((o): o is string => typeof o === 'string');
if (opts.length < 2) continue;
out.push({ question: q.question, type: q.type, options: opts });
}
return out;
}
function parseAnswerSet(raw: unknown): AskUserAnswerSet | null {
if (!raw || typeof raw !== 'object' || !('answers' in raw)) return null;
const arr = (raw as { answers: unknown }).answers;
if (!Array.isArray(arr)) return null;
const answers: AskUserAnswer[] = [];
for (const item of arr) {
if (!item || typeof item !== 'object') continue;
const a = item as { question?: unknown; selected_options?: unknown; free_text?: unknown };
if (typeof a.question !== 'string') continue;
if (!Array.isArray(a.selected_options)) continue;
if (a.free_text !== null && typeof a.free_text !== 'string') continue;
const sel = a.selected_options.filter((s): s is string => typeof s === 'string');
answers.push({
question: a.question,
selected_options: sel,
free_text: (a.free_text as string | null) ?? null,
});
}
return { answers };
}
export function AskUserInputCard({ toolCall, toolResult, chatId }: Props) {
const questions = useMemo(() => parseQuestions(toolCall.args), [toolCall.args]);
if (questions.length === 0) {
return (
<div className="rounded border border-destructive/40 bg-destructive/10 text-xs px-3 py-2 text-destructive">
ask_user_input: malformed tool args
</div>
);
}
// Tool result with a non-null output means the answer is already submitted.
// The pending sentinel uses output=null, so this branch only triggers after
// the real WS tool_result frame lands.
const answered = toolResult && toolResult.output !== null;
if (answered) {
const answerSet = parseAnswerSet(toolResult!.output);
return <AnsweredView questions={questions} answers={answerSet} />;
}
return (
<PendingView questions={questions} toolCallId={toolCall.id} chatId={chatId} />
);
}
function PendingView({
questions,
toolCallId,
chatId,
}: {
questions: AskUserQuestion[];
toolCallId: string;
chatId: string;
}) {
// Per-question selections + free text. Selections are option arrays so the
// multi_select case is uniform; single_select just constrains to length 1.
const [selections, setSelections] = useState<string[][]>(() => questions.map(() => []));
const [freeTexts, setFreeTexts] = useState<string[]>(() => questions.map(() => ''));
const [submitting, setSubmitting] = useState(false);
const singleQuestion = questions.length === 1;
const anyFreeText = freeTexts.some((t) => t.trim().length > 0);
// Submit button shows when:
// - more than one question (always batched), OR
// - one question and the user has typed free text (committing it needs an
// explicit Submit so an accidental Tab/click doesn't lose it).
// For one question with no free text, clicking an option submits inline.
const showSubmitButton = !singleQuestion || anyFreeText;
// Every question must have at least one of (option, free text).
const allComplete = questions.every((_, i) => {
return selections[i]!.length > 0 || freeTexts[i]!.trim().length > 0;
});
function buildAnswers(): AskUserAnswer[] {
return questions.map((q, i) => {
const freeText = freeTexts[i]!.trim();
return {
question: q.question,
selected_options: selections[i]!,
free_text: freeText.length > 0 ? freeText : null,
};
});
}
async function submit(answers: AskUserAnswer[]) {
if (submitting) return;
setSubmitting(true);
try {
await api.chats.answerUserInput(chatId, toolCallId, answers);
// Card stays mounted; the incoming WS tool_result frame will flip it
// into AnsweredView via the parent prop change.
} catch (err) {
console.error('ask_user_input submit failed:', err instanceof Error ? err.message : err);
setSubmitting(false);
}
}
function pickSingle(qIdx: number, option: string) {
setSelections((prev) => prev.map((arr, i) => (i === qIdx ? [option] : arr)));
// Immediate submit for the single-question single-select shortcut. Only
// fires when no free text exists anywhere — once the user typed, the
// Submit button takes over so the typed text isn't silently dropped.
if (singleQuestion && !anyFreeText) {
const answers: AskUserAnswer[] = [
{
question: questions[0]!.question,
selected_options: [option],
free_text: null,
},
];
void submit(answers);
}
}
function toggleMulti(qIdx: number, option: string) {
setSelections((prev) =>
prev.map((arr, i) => {
if (i !== qIdx) return arr;
return arr.includes(option) ? arr.filter((o) => o !== option) : [...arr, option];
}),
);
}
function setFreeText(qIdx: number, value: string) {
setFreeTexts((prev) => prev.map((t, i) => (i === qIdx ? value : t)));
}
return (
<div className="rounded-lg border bg-muted/20 text-sm">
<div className="px-4 py-3 space-y-4">
{questions.map((q, i) => (
<div key={i} className="space-y-2">
{questions.length > 1 && (
<div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">
Question {i + 1}
</div>
)}
<div className="font-medium leading-snug">{q.question}</div>
{q.type === 'single_select' ? (
<RadioGroup
value={selections[i]![0] ?? ''}
onValueChange={(v) => pickSingle(i, v)}
disabled={submitting}
className="gap-1.5"
>
{q.options.map((opt, j) => {
const id = `q${i}-opt${j}`;
return (
<label
key={j}
htmlFor={id}
className="flex items-start gap-2 text-sm leading-snug cursor-pointer rounded px-1 py-0.5 hover:bg-muted/40"
>
<RadioGroupItem id={id} value={opt} className="mt-0.5" />
<span>{opt}</span>
</label>
);
})}
</RadioGroup>
) : (
<div className="grid gap-1.5">
{q.options.map((opt, j) => {
const id = `q${i}-opt${j}`;
const checked = selections[i]!.includes(opt);
return (
<label
key={j}
htmlFor={id}
className="flex items-start gap-2 text-sm leading-snug cursor-pointer rounded px-1 py-0.5 hover:bg-muted/40"
>
<input
id={id}
type="checkbox"
checked={checked}
disabled={submitting}
onChange={() => toggleMulti(i, opt)}
className="mt-1 size-3.5 rounded border-input accent-primary"
/>
<span>{opt}</span>
</label>
);
})}
</div>
)}
<div className="pt-1 space-y-1">
<div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">
Or type a custom answer
</div>
<input
type="text"
value={freeTexts[i]}
disabled={submitting}
placeholder="Free text…"
onChange={(e) => setFreeText(i, e.target.value)}
className="w-full rounded border border-input bg-background px-2 py-1 text-sm outline-none focus-visible:ring-2 focus-visible:ring-ring/40 disabled:opacity-60"
/>
</div>
</div>
))}
</div>
{showSubmitButton && (
<div className="flex justify-end gap-2 border-t px-4 py-2">
<Button
type="button"
size="sm"
disabled={!allComplete || submitting}
onClick={() => void submit(buildAnswers())}
>
{submitting ? 'Submitting…' : 'Submit'}
</Button>
</div>
)}
</div>
);
}
function AnsweredView({
questions,
answers,
}: {
questions: AskUserQuestion[];
answers: AskUserAnswerSet | null;
}) {
if (!answers) {
return (
<div className="rounded-lg border bg-muted/20 text-xs px-4 py-3 text-muted-foreground">
ask_user_input: answers unavailable
</div>
);
}
return (
<div className="rounded-lg border bg-muted/10 text-sm">
<div className="px-4 py-3 space-y-3">
{questions.map((q, i) => {
const a = answers.answers[i];
if (!a) return null;
return (
<div key={i} className="space-y-1.5">
{questions.length > 1 && (
<div className="text-[10px] uppercase tracking-wide text-muted-foreground/70">
Question {i + 1}
</div>
)}
<div className="font-medium leading-snug">{q.question}</div>
<div className="space-y-0.5">
{q.options.map((opt, j) => {
const selected = a.selected_options.includes(opt);
return (
<div
key={j}
className={
selected
? 'flex items-start gap-2 text-sm leading-snug text-foreground'
: 'flex items-start gap-2 text-sm leading-snug text-muted-foreground/60 line-through'
}
>
<span className="mt-0.5 size-3.5 shrink-0 inline-flex items-center justify-center">
{selected && <Check className="size-3 text-primary" />}
</span>
<span>{opt}</span>
</div>
);
})}
</div>
{a.free_text && (
<div className="rounded bg-background border px-2 py-1 text-xs font-mono whitespace-pre-wrap">
{a.free_text}
</div>
)}
</div>
);
})}
</div>
</div>
);
}

View File

@@ -1,139 +0,0 @@
import { useState, useRef, useEffect } from 'react';
import { Send, Square } from 'lucide-react';
import type { Message, ToolResult } from '@/api/types';
import { api } from '@/api/client';
import { MessageBubble } from './MessageBubble';
interface Props {
sessionId: string;
chatId: string;
messages: Message[];
isStreaming: boolean;
connected: boolean;
}
export function ChatPane({ sessionId, chatId, messages, isStreaming, connected }: Props) {
const [input, setInput] = useState('');
const [sending, setSending] = useState(false);
const messagesEndRef = useRef<HTMLDivElement>(null);
const textareaRef = useRef<HTMLTextAreaElement>(null);
// Auto-scroll to bottom when messages change
useEffect(() => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' });
}, [messages]);
// Auto-resize textarea
useEffect(() => {
const el = textareaRef.current;
if (!el) return;
el.style.height = 'auto';
el.style.height = Math.min(el.scrollHeight, 200) + 'px';
}, [input]);
const handleSend = async () => {
const content = input.trim();
if (!content || sending || isStreaming) return;
setInput('');
setSending(true);
try {
await api.messages.send(sessionId, chatId, content);
} catch (err) {
console.error('send failed:', err);
// Restore input on failure
setInput(content);
} finally {
setSending(false);
}
};
const handleStop = async () => {
try {
await api.messages.stop(sessionId);
} catch (err) {
console.error('stop failed:', err);
}
};
const handleKeyDown = (e: React.KeyboardEvent) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault();
handleSend();
}
};
// Filter out system messages for display (sentinels)
const visibleMessages = messages.filter((m) => m.role !== 'system');
// Build a lookup map from tool_call_id -> ToolResult for all messages
const toolResultsMap: Record<string, ToolResult> = {};
for (const msg of messages) {
if (msg.tool_results) {
toolResultsMap[msg.tool_results.tool_call_id] = msg.tool_results;
}
}
return (
<div className="flex flex-col h-full">
{/* Connection indicator */}
<div className="flex items-center gap-2 px-4 py-2 border-b border-zinc-800 text-xs text-zinc-500">
<div
className={`w-1.5 h-1.5 rounded-full ${connected ? 'bg-green-500' : 'bg-red-500'}`}
/>
<span>{connected ? 'Connected' : 'Disconnected'}</span>
{isStreaming && (
<span className="text-blue-400 ml-auto">Generating...</span>
)}
</div>
{/* Messages list */}
<div className="flex-1 overflow-y-auto px-4 py-4">
{visibleMessages.length === 0 && (
<div className="text-center text-zinc-500 mt-8">
<p className="text-lg font-medium">BooCoder</p>
<p className="text-sm mt-1">Send a message to start coding.</p>
</div>
)}
{visibleMessages.map((msg) => (
<MessageBubble key={msg.id} message={msg} chatId={msg.chat_id} toolResultsMap={toolResultsMap} />
))}
<div ref={messagesEndRef} />
</div>
{/* Input area */}
<div className="border-t border-zinc-800 px-4 py-3">
<div className="flex items-end gap-2">
<textarea
ref={textareaRef}
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Message BooCoder..."
rows={1}
className="flex-1 bg-zinc-800 border border-zinc-700 rounded-lg px-3 py-2 text-sm text-zinc-100 placeholder-zinc-500 resize-none focus:outline-none focus:ring-1 focus:ring-blue-500 focus:border-blue-500"
disabled={sending}
/>
{isStreaming ? (
<button
onClick={handleStop}
className="p-2 rounded-lg bg-red-600 hover:bg-red-500 text-white transition-colors"
title="Stop generation"
>
<Square size={18} />
</button>
) : (
<button
onClick={handleSend}
disabled={!input.trim() || sending}
className="p-2 rounded-lg bg-blue-600 hover:bg-blue-500 disabled:opacity-40 disabled:cursor-not-allowed text-white transition-colors"
title="Send message"
>
<Send size={18} />
</button>
)}
</div>
</div>
</div>
);
}

View File

@@ -1,337 +0,0 @@
import { useState, useEffect, useCallback } from 'react';
import { Check, X, RotateCcw, FileText, FilePlus, Trash2, RefreshCw } from 'lucide-react';
import type { PendingChange } from '@/api/types';
import { api } from '@/api/client';
interface Props {
sessionId: string;
}
export function DiffPane({ sessionId }: Props) {
const [changes, setChanges] = useState<PendingChange[]>([]);
const [loading, setLoading] = useState(true);
const [expandedId, setExpandedId] = useState<string | null>(null);
const fetchPending = useCallback(async () => {
try {
const result = await api.pending.list(sessionId);
setChanges(result);
} catch (err) {
console.error('fetch pending failed:', err);
} finally {
setLoading(false);
}
}, [sessionId]);
// Initial load. Pending changes are delivered over HTTP (list + apply/reject/
// rewind below); there is no WS pending-change frame, so the list refreshes on
// mount, on the Refresh button, and optimistically as the user acts on it.
useEffect(() => {
fetchPending();
}, [fetchPending]);
const pendingChanges = changes.filter((c) => c.status === 'pending');
const resolvedChanges = changes.filter((c) => c.status !== 'pending');
const handleApplyOne = async (id: string) => {
try {
await api.pending.applyOne(id);
setChanges((prev) =>
prev.map((c) => (c.id === id ? { ...c, status: 'applied' as const } : c)),
);
} catch (err) {
console.error('apply failed:', err);
}
};
const handleRejectOne = async (id: string) => {
try {
await api.pending.rejectOne(id);
setChanges((prev) =>
prev.map((c) => (c.id === id ? { ...c, status: 'rejected' as const } : c)),
);
} catch (err) {
console.error('reject failed:', err);
}
};
const handleRewindOne = async (id: string) => {
try {
await api.pending.rewindOne(id);
setChanges((prev) =>
prev.map((c) => (c.id === id ? { ...c, status: 'reverted' as const } : c)),
);
} catch (err) {
console.error('rewind failed:', err);
}
};
const handleApplyAll = async () => {
try {
const result = await api.pending.applyAll(sessionId);
const appliedIds = new Set(
result.results.filter((r) => r.success).map((r) => r.id),
);
setChanges((prev) =>
prev.map((c) =>
appliedIds.has(c.id) ? { ...c, status: 'applied' as const } : c,
),
);
} catch (err) {
console.error('apply all failed:', err);
}
};
const handleRejectAll = async () => {
// Reject each pending change individually (no batch reject endpoint)
for (const c of pendingChanges) {
await handleRejectOne(c.id);
}
};
const OpIcon = ({ op }: { op: PendingChange['operation'] }) => {
switch (op) {
case 'create':
return <FilePlus size={14} className="text-green-400" />;
case 'edit':
return <FileText size={14} className="text-blue-400" />;
case 'delete':
return <Trash2 size={14} className="text-red-400" />;
}
};
const StatusBadge = ({ status }: { status: PendingChange['status'] }) => {
const colors: Record<PendingChange['status'], string> = {
pending: 'bg-yellow-500/20 text-yellow-400',
applied: 'bg-green-500/20 text-green-400',
rejected: 'bg-zinc-500/20 text-zinc-400',
reverted: 'bg-orange-500/20 text-orange-400',
};
return (
<span className={`text-[10px] px-1.5 py-0.5 rounded ${colors[status]}`}>
{status}
</span>
);
};
return (
<div className="flex flex-col h-full">
{/* Header */}
<div className="flex items-center justify-between px-4 py-2 border-b border-zinc-800">
<h2 className="text-sm font-medium text-zinc-300">
Pending Changes
{pendingChanges.length > 0 && (
<span className="ml-1.5 text-xs text-zinc-500">
({pendingChanges.length})
</span>
)}
</h2>
<div className="flex items-center gap-1">
<button
onClick={fetchPending}
className="p-1 rounded hover:bg-zinc-800 text-zinc-400 hover:text-zinc-200"
title="Refresh"
>
<RefreshCw size={14} />
</button>
{pendingChanges.length > 0 && (
<>
<button
onClick={handleApplyAll}
className="text-xs px-2 py-1 rounded bg-green-600/80 hover:bg-green-600 text-white"
>
Apply All
</button>
<button
onClick={handleRejectAll}
className="text-xs px-2 py-1 rounded bg-zinc-700 hover:bg-zinc-600 text-zinc-300"
>
Reject All
</button>
</>
)}
</div>
</div>
{/* Changes list */}
<div className="flex-1 overflow-y-auto">
{loading && (
<div className="text-center text-zinc-500 text-sm py-8">Loading...</div>
)}
{!loading && changes.length === 0 && (
<div className="text-center text-zinc-500 text-sm py-8">
No pending changes yet.
</div>
)}
{/* Pending changes first */}
{pendingChanges.map((change) => (
<ChangeItem
key={change.id}
change={change}
expanded={expandedId === change.id}
onToggle={() =>
setExpandedId((prev) => (prev === change.id ? null : change.id))
}
onApply={() => handleApplyOne(change.id)}
onReject={() => handleRejectOne(change.id)}
OpIcon={OpIcon}
StatusBadge={StatusBadge}
/>
))}
{/* Resolved changes */}
{resolvedChanges.length > 0 && pendingChanges.length > 0 && (
<div className="border-t border-zinc-800 my-1" />
)}
{resolvedChanges.map((change) => (
<ChangeItem
key={change.id}
change={change}
expanded={expandedId === change.id}
onToggle={() =>
setExpandedId((prev) => (prev === change.id ? null : change.id))
}
onRewind={
change.status === 'applied'
? () => handleRewindOne(change.id)
: undefined
}
OpIcon={OpIcon}
StatusBadge={StatusBadge}
/>
))}
</div>
</div>
);
}
interface ChangeItemProps {
change: PendingChange;
expanded: boolean;
onToggle: () => void;
onApply?: () => void;
onReject?: () => void;
onRewind?: () => void;
OpIcon: React.ComponentType<{ op: PendingChange['operation'] }>;
StatusBadge: React.ComponentType<{ status: PendingChange['status'] }>;
}
function ChangeItem({
change,
expanded,
onToggle,
onApply,
onReject,
onRewind,
OpIcon,
StatusBadge,
}: ChangeItemProps) {
const fileName = change.file_path.split('/').pop() || change.file_path;
const dirPath = change.file_path.split('/').slice(0, -1).join('/');
return (
<div className="border-b border-zinc-800/50">
<div
className="flex items-center gap-2 px-4 py-2 hover:bg-zinc-800/50 cursor-pointer"
onClick={onToggle}
>
<OpIcon op={change.operation} />
<div className="flex-1 min-w-0">
<span className="text-sm font-mono text-zinc-200 truncate block">
{fileName}
</span>
{dirPath && (
<span className="text-[11px] text-zinc-500 truncate block">
{dirPath}
</span>
)}
</div>
<StatusBadge status={change.status} />
{change.status === 'pending' && (
<div className="flex items-center gap-1 ml-1">
<button
onClick={(e) => {
e.stopPropagation();
onApply?.();
}}
className="p-1 rounded hover:bg-green-600/30 text-green-400"
title="Apply"
>
<Check size={14} />
</button>
<button
onClick={(e) => {
e.stopPropagation();
onReject?.();
}}
className="p-1 rounded hover:bg-red-600/30 text-red-400"
title="Reject"
>
<X size={14} />
</button>
</div>
)}
{change.status === 'applied' && onRewind && (
<button
onClick={(e) => {
e.stopPropagation();
onRewind();
}}
className="p-1 rounded hover:bg-orange-600/30 text-orange-400"
title="Rewind"
>
<RotateCcw size={14} />
</button>
)}
</div>
{expanded && (
<div className="px-4 pb-3">
{change.operation === 'edit' && (
<div className="space-y-2">
{change.old_string && (
<div className="rounded bg-red-950/30 border border-red-900/30 p-2">
<div className="text-[10px] text-red-400 mb-1 font-medium">
Remove
</div>
<pre className="text-xs text-red-200 whitespace-pre-wrap break-all font-mono">
{change.old_string}
</pre>
</div>
)}
{change.new_string && (
<div className="rounded bg-green-950/30 border border-green-900/30 p-2">
<div className="text-[10px] text-green-400 mb-1 font-medium">
Add
</div>
<pre className="text-xs text-green-200 whitespace-pre-wrap break-all font-mono">
{change.new_string}
</pre>
</div>
)}
</div>
)}
{change.operation === 'create' && change.content && (
<div className="rounded bg-green-950/30 border border-green-900/30 p-2">
<div className="text-[10px] text-green-400 mb-1 font-medium">
New file
</div>
<pre className="text-xs text-green-200 whitespace-pre-wrap break-all font-mono max-h-60 overflow-y-auto">
{change.content.length > 2000
? change.content.slice(0, 2000) + '\n... (truncated)'
: change.content}
</pre>
</div>
)}
{change.operation === 'delete' && (
<div className="rounded bg-red-950/30 border border-red-900/30 p-2 text-xs text-red-300">
This file will be deleted.
</div>
)}
</div>
)}
</div>
);
}

View File

@@ -1,62 +0,0 @@
import { useState } from 'react';
import { Code2, MessageSquare, GitPullRequest } from 'lucide-react';
interface Props {
chatPane: React.ReactNode;
diffPane: React.ReactNode;
}
export function Layout({ chatPane, diffPane }: Props) {
const [activeTab, setActiveTab] = useState<'chat' | 'diff'>('chat');
return (
<div className="flex flex-col h-screen bg-zinc-900">
{/* Top bar */}
<header className="flex items-center gap-3 px-4 py-2 border-b border-zinc-800 bg-zinc-900/95">
<Code2 size={20} className="text-blue-400" />
<h1 className="text-sm font-semibold text-zinc-200">BooCoder</h1>
</header>
{/* Mobile tab bar (visible below lg breakpoint) */}
<div className="lg:hidden flex border-b border-zinc-800">
<button
onClick={() => setActiveTab('chat')}
className={`flex-1 flex items-center justify-center gap-1.5 py-2 text-sm ${
activeTab === 'chat'
? 'text-blue-400 border-b-2 border-blue-400'
: 'text-zinc-500'
}`}
>
<MessageSquare size={14} />
Chat
</button>
<button
onClick={() => setActiveTab('diff')}
className={`flex-1 flex items-center justify-center gap-1.5 py-2 text-sm ${
activeTab === 'diff'
? 'text-blue-400 border-b-2 border-blue-400'
: 'text-zinc-500'
}`}
>
<GitPullRequest size={14} />
Changes
</button>
</div>
{/* Desktop split layout */}
<div className="flex-1 hidden lg:flex overflow-hidden">
<div className="w-[60%] border-r border-zinc-800 overflow-hidden">
{chatPane}
</div>
<div className="w-[40%] overflow-hidden">
{diffPane}
</div>
</div>
{/* Mobile: show only the active tab */}
<div className="flex-1 lg:hidden overflow-hidden">
{activeTab === 'chat' ? chatPane : diffPane}
</div>
</div>
);
}

View File

@@ -1,135 +0,0 @@
import Markdown from 'react-markdown';
import remarkGfm from 'remark-gfm';
import type { Message, ToolResult } from '@/api/types';
import { Wrench, AlertCircle, Loader2 } from 'lucide-react';
import { AskUserInputCard } from './AskUserInputCard';
interface Props {
message: Message;
chatId: string;
toolResultsMap: Record<string, ToolResult>;
}
export function MessageBubble({ message, chatId }: Props) {
if (message.role === 'tool') {
return <ToolResultBubble message={message} />;
}
const isUser = message.role === 'user';
const isStreaming = message.status === 'streaming';
const isFailed = message.status === 'failed';
return (
<div className={`flex ${isUser ? 'justify-end' : 'justify-start'} mb-3`}>
<div
className={`max-w-[85%] rounded-lg px-4 py-2.5 ${
isUser
? 'bg-blue-600 text-white'
: 'bg-zinc-800 text-zinc-100 border border-zinc-700'
}`}
>
{isFailed && (
<div className="flex items-center gap-1.5 text-red-400 text-xs mb-1">
<AlertCircle size={12} />
<span>Failed</span>
</div>
)}
{message.tool_calls && message.tool_calls.length > 0 && (
<div className="mb-2 space-y-1">
{message.tool_calls.map((tc) => {
if (tc.name === 'ask_user_input') {
const result = message.tool_results ?? null;
return (
<AskUserInputCard
key={tc.id}
toolCall={tc}
toolResult={result}
chatId={chatId}
/>
);
}
return (
<div
key={tc.id}
className="flex items-center gap-1.5 text-xs text-zinc-400 bg-zinc-900/50 rounded px-2 py-1"
>
<Wrench size={11} />
<span className="font-mono">{tc.name}</span>
<span className="text-zinc-500 truncate max-w-[200px]">
{truncateArgs(tc.args)}
</span>
</div>
);
})}
</div>
)}
{message.content.trim() && (
<div className="prose prose-invert prose-sm max-w-none [&_pre]:bg-zinc-900 [&_pre]:p-3 [&_pre]:rounded [&_pre]:overflow-x-auto [&_code]:text-zinc-300 [&_p]:my-1.5">
<Markdown remarkPlugins={[remarkGfm]}>{message.content}</Markdown>
</div>
)}
{isStreaming && !message.content.trim() && (
<div className="flex items-center gap-1.5 text-zinc-400">
<Loader2 size={14} className="animate-spin" />
<span className="text-xs">Thinking...</span>
</div>
)}
{isStreaming && message.content.trim() && (
<span className="inline-block w-1.5 h-4 bg-zinc-400 animate-pulse ml-0.5 align-middle" />
)}
</div>
</div>
);
}
function ToolResultBubble({ message }: { message: Message }) {
const result = message.tool_results;
if (!result) return null;
const isError = result.error;
const output = result.output != null ? String(result.output) : '';
const displayOutput =
output.length > 300 ? output.slice(0, 300) + '...' : output;
return (
<div className="flex justify-start mb-2 ml-6">
<div
className={`max-w-[80%] rounded px-3 py-2 text-xs font-mono border ${
isError
? 'bg-red-950/30 border-red-800/50 text-red-300'
: 'bg-zinc-800/50 border-zinc-700/50 text-zinc-400'
}`}
>
{result.truncated && (
<span className="text-yellow-500 text-[10px] block mb-1">
[truncated]
</span>
)}
<pre className="whitespace-pre-wrap break-all">{displayOutput}</pre>
</div>
</div>
);
}
function truncateArgs(args: unknown): string {
if (!args) return '';
try {
if (typeof args === 'object' && args !== null) {
const obj = args as Record<string, unknown>;
const keys = Object.keys(obj);
if (keys.length === 0) return '';
const first = keys[0]!;
const val = String(obj[first] ?? '');
const display = val.length > 40 ? val.slice(0, 40) + '...' : val;
return `${first}: ${display}`;
}
const str = String(args);
return str.length > 50 ? str.slice(0, 50) + '...' : str;
} catch {
return String(args).length > 50 ? String(args).slice(0, 50) + '...' : String(args);
}
}

View File

@@ -1,35 +0,0 @@
import * as React from 'react';
export interface ButtonProps
extends React.ButtonHTMLAttributes<HTMLButtonElement> {
variant?: 'default' | 'destructive' | 'outline' | 'secondary' | 'ghost' | 'link';
size?: 'default' | 'sm' | 'lg' | 'icon';
}
const variantClasses: Record<string, string> = {
default: 'bg-primary text-primary-foreground hover:bg-primary/90',
destructive: 'bg-destructive text-destructive-foreground hover:bg-destructive/90',
outline: 'border border-input bg-background hover:bg-accent hover:text-accent-foreground',
secondary: 'bg-secondary text-secondary-foreground hover:bg-secondary/80',
ghost: 'hover:bg-accent hover:text-accent-foreground',
link: 'text-primary underline-offset-4 hover:underline',
};
const sizeClasses: Record<string, string> = {
default: 'h-9 px-4 py-2',
sm: 'h-8 rounded-md px-3 text-xs',
lg: 'h-10 rounded-md px-8',
icon: 'h-9 w-9',
};
const Button = React.forwardRef<HTMLButtonElement, ButtonProps>(
({ className, variant = 'default', size = 'default', ...props }, ref) => {
const base =
'inline-flex items-center justify-center whitespace-nowrap rounded-md text-sm font-medium transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring/40 disabled:pointer-events-none disabled:opacity-60';
const cls = [base, variantClasses[variant] ?? '', sizeClasses[size] ?? '', className ?? ''].join(' ');
return <button className={cls} ref={ref} {...props} />;
},
);
Button.displayName = 'Button';
export { Button };

View File

@@ -1,56 +0,0 @@
import * as React from 'react';
const RadioGroupContext = React.createContext<{
value: string | undefined;
onValueChange: (v: string) => void;
disabled?: boolean;
} | null>(null);
interface RadioGroupProps extends React.HTMLAttributes<HTMLDivElement> {
value?: string;
onValueChange?: (value: string) => void;
disabled?: boolean;
}
const RadioGroup = React.forwardRef<HTMLDivElement, RadioGroupProps>(
({ className, value, onValueChange, disabled, ...props }, ref) => {
const ctx = React.useMemo(() => ({ value, onValueChange: onValueChange ?? (() => {}), disabled }), [value, onValueChange, disabled]);
return (
<RadioGroupContext.Provider value={ctx}>
<div
ref={ref}
role="radiogroup"
className={className}
{...props}
/>
</RadioGroupContext.Provider>
);
},
);
RadioGroup.displayName = 'RadioGroup';
interface RadioGroupItemProps extends React.InputHTMLAttributes<HTMLInputElement> {
value: string;
}
const RadioGroupItem = React.forwardRef<HTMLInputElement, RadioGroupItemProps>(
({ className, value, ...props }, ref) => {
const ctx = React.useContext(RadioGroupContext);
if (!ctx) return <input ref={ref} type="radio" className={className} value={value} {...props} />;
const checked = ctx.value === value;
return (
<input
ref={ref}
type="radio"
checked={checked}
disabled={ctx.disabled}
onChange={() => ctx.onValueChange(value)}
className={className}
{...props}
/>
);
},
);
RadioGroupItem.displayName = 'RadioGroupItem';
export { RadioGroup, RadioGroupItem };

View File

@@ -1,22 +0,0 @@
@import "tailwindcss";
body {
margin: 0;
min-height: 100vh;
}
/* Scrollbar styling for dark theme */
::-webkit-scrollbar {
width: 8px;
height: 8px;
}
::-webkit-scrollbar-track {
background: transparent;
}
::-webkit-scrollbar-thumb {
background: #3f3f46;
border-radius: 4px;
}
::-webkit-scrollbar-thumb:hover {
background: #52525b;
}

View File

@@ -1,217 +0,0 @@
import { useEffect, useRef, useState } from 'react';
import type { Message, WsFrame } from '@/api/types';
interface State {
messages: Message[];
connected: boolean;
error: string | null;
}
function applyFrame(state: State, frame: WsFrame): State {
switch (frame.type) {
case 'snapshot': {
// Canonical SnapshotFrame.messages is opaque (z.array(z.unknown())); the
// coder backend sends Message-shaped rows, so cast to the SPA's local type.
return { ...state, messages: frame.messages as Message[] };
}
case 'message_started': {
const exists = state.messages.some((m) => m.id === frame.message_id);
if (exists) return state;
const newMsg: Message = {
id: frame.message_id,
session_id: '',
chat_id: frame.chat_id ?? '',
role: frame.role,
content: '',
kind: 'message',
tool_calls: null,
tool_results: null,
status: frame.role === 'system' ? 'complete' : 'streaming',
tokens_used: null,
ctx_used: null,
ctx_max: null,
started_at: null,
finished_at: null,
created_at: new Date().toISOString(),
metadata: null,
};
return { ...state, messages: [...state.messages, newMsg] };
}
case 'delta': {
const next = state.messages.map((m) =>
m.id === frame.message_id ? { ...m, content: m.content + frame.content } : m,
);
return { ...state, messages: next };
}
case 'tool_call': {
const next = state.messages.map((m) =>
m.id === frame.message_id
? { ...m, tool_calls: [...(m.tool_calls ?? []), frame.tool_call] }
: m,
);
return { ...state, messages: next };
}
case 'tool_result': {
const exists = state.messages.some((m) => m.id === frame.tool_message_id);
if (exists) {
const next = state.messages.map((m) =>
m.id === frame.tool_message_id
? {
...m,
role: 'tool' as const,
tool_results: {
tool_call_id: frame.tool_call_id,
output: frame.output,
truncated: frame.truncated,
...(frame.error ? { error: frame.error } : {}),
},
status: 'complete' as const,
}
: m,
);
return { ...state, messages: next };
}
const newMsg: Message = {
id: frame.tool_message_id,
session_id: '',
chat_id: frame.chat_id ?? '',
role: 'tool',
content: '',
kind: 'message',
tool_calls: null,
tool_results: {
tool_call_id: frame.tool_call_id,
output: frame.output,
truncated: frame.truncated,
...(frame.error ? { error: frame.error } : {}),
},
status: 'complete',
tokens_used: null,
ctx_used: null,
ctx_max: null,
started_at: null,
finished_at: null,
created_at: new Date().toISOString(),
metadata: null,
};
return { ...state, messages: [...state.messages, newMsg] };
}
case 'message_complete': {
const next = state.messages.map((m) =>
m.id === frame.message_id
? {
...m,
status: 'complete' as const,
...(frame.tokens_used !== undefined ? { tokens_used: frame.tokens_used } : {}),
...(frame.ctx_used !== undefined ? { ctx_used: frame.ctx_used } : {}),
...(frame.ctx_max !== undefined ? { ctx_max: frame.ctx_max } : {}),
...(frame.started_at !== undefined ? { started_at: frame.started_at } : {}),
...(frame.finished_at !== undefined ? { finished_at: frame.finished_at } : {}),
...(frame.metadata !== undefined ? { metadata: frame.metadata } : {}),
}
: m,
);
return { ...state, messages: next };
}
case 'error': {
const next = frame.message_id
? state.messages.map((m) =>
m.id === frame.message_id ? { ...m, status: 'failed' as const } : m,
)
: state.messages;
return { ...state, messages: next, error: frame.error };
}
default:
// The canonical WsFrame carries the full set of frames the coder backend
// can publish; this SPA only renders the subset handled above and safely
// ignores the rest (reasoning_delta, usage, permission_*, agent_*, and the
// per-user sidebar frames). pending_change_* frames have no publisher —
// pending changes are delivered over HTTP, so there is nothing to handle.
return state;
}
}
const RECONNECT_INITIAL_MS = 1000;
const RECONNECT_MAX_MS = 30_000;
interface SessionStreamResult {
messages: Message[];
connected: boolean;
error: string | null;
isStreaming: boolean;
}
export function useSessionStream(sessionId: string | undefined): SessionStreamResult {
const [state, setState] = useState<State>({ messages: [], connected: false, error: null });
const wsRef = useRef<WebSocket | null>(null);
useEffect(() => {
if (!sessionId) return;
setState({ messages: [], connected: false, error: null });
let unmounted = false;
let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
let reconnectDelay = RECONNECT_INITIAL_MS;
const connect = () => {
if (unmounted) return;
const proto = window.location.protocol === 'https:' ? 'wss' : 'ws';
const url = `${proto}://${window.location.host}/api/ws/sessions/${sessionId}`;
const ws = new WebSocket(url);
wsRef.current = ws;
ws.onopen = () => {
reconnectDelay = RECONNECT_INITIAL_MS;
setState((s) => ({ ...s, connected: true, error: null }));
};
ws.onmessage = (ev) => {
let frame: WsFrame;
try {
frame = JSON.parse(typeof ev.data === 'string' ? ev.data : '') as WsFrame;
} catch {
return;
}
setState((s) => applyFrame(s, frame));
};
ws.onerror = () => {
try {
ws.close();
} catch {}
};
ws.onclose = () => {
if (unmounted) return;
setState((s) => ({ ...s, connected: false }));
const delay = reconnectDelay;
reconnectDelay = Math.min(reconnectDelay * 2, RECONNECT_MAX_MS);
reconnectTimer = setTimeout(connect, delay);
};
};
connect();
return () => {
unmounted = true;
if (reconnectTimer) clearTimeout(reconnectTimer);
const ws = wsRef.current;
wsRef.current = null;
if (ws)
try {
ws.close();
} catch {}
};
}, [sessionId]);
const isStreaming = state.messages.some((m) => m.status === 'streaming');
return {
messages: state.messages,
connected: state.connected,
error: state.error,
isStreaming,
};
}

View File

@@ -1,13 +0,0 @@
import { StrictMode } from 'react';
import { createRoot } from 'react-dom/client';
import { BrowserRouter } from 'react-router-dom';
import { App } from './App';
import './globals.css';
createRoot(document.getElementById('root')!).render(
<StrictMode>
<BrowserRouter>
<App />
</BrowserRouter>
</StrictMode>,
);

View File

@@ -1,138 +0,0 @@
import { useEffect, useState } from 'react';
import { useNavigate } from 'react-router-dom';
import { Code2, Folder, ArrowRight } from 'lucide-react';
import type { Project, Session } from '@/api/types';
import { api } from '@/api/client';
export function Home() {
const navigate = useNavigate();
const [projects, setProjects] = useState<Project[]>([]);
const [sessions, setSessions] = useState<Session[]>([]);
const [selectedProject, setSelectedProject] = useState<string | null>(null);
const [loading, setLoading] = useState(true);
// Fetch projects on mount
useEffect(() => {
api.projects
.list({ status: 'open' })
.then(setProjects)
.catch(console.error)
.finally(() => setLoading(false));
}, []);
// Fetch sessions when a project is selected
useEffect(() => {
if (!selectedProject) {
setSessions([]);
return;
}
api.sessions
.listForProject(selectedProject, 'open')
.then(setSessions)
.catch(console.error);
}, [selectedProject]);
const handleSessionClick = (session: Session) => {
navigate(`/sessions/${session.id}`);
};
if (loading) {
return (
<div className="min-h-screen bg-zinc-900 flex items-center justify-center">
<div className="text-zinc-500">Loading...</div>
</div>
);
}
return (
<div className="min-h-screen bg-zinc-900 p-6">
<div className="max-w-2xl mx-auto">
{/* Header */}
<div className="flex items-center gap-3 mb-8">
<Code2 size={28} className="text-blue-400" />
<h1 className="text-2xl font-bold text-zinc-100">BooCoder</h1>
</div>
{/* Projects list */}
<div className="mb-8">
<h2 className="text-sm font-medium text-zinc-400 uppercase tracking-wide mb-3">
Projects
</h2>
{projects.length === 0 ? (
<p className="text-zinc-500 text-sm">
No projects found. Create one in BooChat first.
</p>
) : (
<div className="space-y-1">
{projects.map((project) => (
<button
key={project.id}
onClick={() => setSelectedProject(project.id)}
className={`w-full flex items-center gap-3 px-4 py-3 rounded-lg text-left transition-colors ${
selectedProject === project.id
? 'bg-blue-600/20 border border-blue-500/40'
: 'bg-zinc-800/50 border border-zinc-800 hover:bg-zinc-800'
}`}
>
<Folder
size={16}
className={
selectedProject === project.id
? 'text-blue-400'
: 'text-zinc-500'
}
/>
<div className="flex-1 min-w-0">
<div className="text-sm font-medium text-zinc-200 truncate">
{project.name}
</div>
<div className="text-xs text-zinc-500 truncate">
{project.path}
</div>
</div>
</button>
))}
</div>
)}
</div>
{/* Sessions list */}
{selectedProject && (
<div>
<h2 className="text-sm font-medium text-zinc-400 uppercase tracking-wide mb-3">
Sessions
</h2>
{sessions.length === 0 ? (
<p className="text-zinc-500 text-sm">
No open sessions. Create one in BooChat first.
</p>
) : (
<div className="space-y-1">
{sessions.map((session) => (
<button
key={session.id}
onClick={() => handleSessionClick(session)}
className="w-full flex items-center gap-3 px-4 py-3 rounded-lg bg-zinc-800/50 border border-zinc-800 hover:bg-zinc-800 text-left transition-colors group"
>
<div className="flex-1 min-w-0">
<div className="text-sm font-medium text-zinc-200 truncate">
{session.name || 'Untitled session'}
</div>
<div className="text-xs text-zinc-500">
{new Date(session.updated_at).toLocaleDateString()}
</div>
</div>
<ArrowRight
size={16}
className="text-zinc-600 group-hover:text-zinc-400 transition-colors"
/>
</button>
))}
</div>
)}
</div>
)}
</div>
</div>
);
}

View File

@@ -1,83 +0,0 @@
import { useEffect, useState } from 'react';
import { useParams, useNavigate } from 'react-router-dom';
import { ArrowLeft } from 'lucide-react';
import type { Chat } from '@/api/types';
import { api } from '@/api/client';
import { useSessionStream } from '@/hooks/useSessionStream';
import { ChatPane } from '@/components/ChatPane';
import { DiffPane } from '@/components/DiffPane';
import { Layout } from '@/components/Layout';
export function Session() {
const { sessionId } = useParams<{ sessionId: string }>();
const navigate = useNavigate();
const [chat, setChat] = useState<Chat | null>(null);
const [loading, setLoading] = useState(true);
const { messages, connected, isStreaming } = useSessionStream(sessionId);
// Get or create a chat for this session
useEffect(() => {
if (!sessionId) return;
api.chats
.listForSession(sessionId)
.then((chats) => {
// Use the first open chat, or create one
const openChat = chats.find((c) => c.status === 'open');
if (openChat) {
setChat(openChat);
} else {
// Create a new chat
return api.chats.create(sessionId).then((newChat) => {
setChat(newChat);
});
}
})
.catch(console.error)
.finally(() => setLoading(false));
}, [sessionId]);
if (!sessionId) {
navigate('/');
return null;
}
if (loading) {
return (
<div className="min-h-screen bg-zinc-900 flex items-center justify-center">
<div className="text-zinc-500">Loading session...</div>
</div>
);
}
if (!chat) {
return (
<div className="min-h-screen bg-zinc-900 flex flex-col items-center justify-center gap-4">
<div className="text-zinc-500">Could not load chat for this session.</div>
<button
onClick={() => navigate('/')}
className="flex items-center gap-2 text-sm text-blue-400 hover:text-blue-300"
>
<ArrowLeft size={14} />
Back to projects
</button>
</div>
);
}
return (
<Layout
chatPane={
<ChatPane
sessionId={sessionId}
chatId={chat.id}
messages={messages}
isStreaming={isStreaming}
connected={connected}
/>
}
diffPane={<DiffPane sessionId={sessionId} />}
/>
);
}

View File

@@ -1 +0,0 @@
/// <reference types="vite/client" />

View File

@@ -1,27 +0,0 @@
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "Bundler",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"allowSyntheticDefaultImports": true,
"isolatedModules": true,
"noUncheckedIndexedAccess": true,
"composite": true,
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"jsx": "react-jsx",
"noEmit": true,
"useDefineForClassFields": true,
"allowImportingTsExtensions": true,
"baseUrl": ".",
"paths": {
"@/*": ["./src/*"]
}
},
"include": ["src"]
}

View File

@@ -1,13 +0,0 @@
{
"files": [],
"references": [
{ "path": "./tsconfig.app.json" },
{ "path": "./tsconfig.node.json" }
],
"compilerOptions": {
"baseUrl": ".",
"paths": {
"@/*": ["./src/*"]
}
}
}

View File

@@ -1,14 +0,0 @@
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "Bundler",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"composite": true,
"tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
"allowSyntheticDefaultImports": true
},
"include": ["vite.config.ts"]
}

View File

@@ -1,2 +0,0 @@
declare const _default: import("vite").UserConfig;
export default _default;

View File

@@ -1,25 +0,0 @@
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import path from 'node:path';
export default defineConfig({
plugins: [react()],
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
port: 5174,
proxy: {
'/api': {
target: 'http://127.0.0.1:3000',
changeOrigin: true,
ws: true,
},
},
},
build: {
outDir: 'dist',
emptyOutDir: true,
},
});

View File

@@ -1,26 +0,0 @@
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';
import path from 'node:path';
export default defineConfig({
plugins: [react()],
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
port: 5174,
proxy: {
'/api': {
target: 'http://127.0.0.1:3000',
changeOrigin: true,
ws: true,
},
},
},
build: {
outDir: 'dist',
emptyOutDir: true,
},
});

View File

@@ -18,7 +18,6 @@ const ConfigSchema = z.object({
GITEA_BASE_URL: z.string().url().default('https://git.indifferentketchup.com'), GITEA_BASE_URL: z.string().url().default('https://git.indifferentketchup.com'),
GITEA_USER: z.string().default('indifferentketchup'), GITEA_USER: z.string().default('indifferentketchup'),
GITEA_TOKEN: z.string().optional(), GITEA_TOKEN: z.string().optional(),
GITEA_SSH_HOST: z.string().default('100.114.205.53:2222'),
// v1.15.0-mcp-multi: path to the MCP config JSON file. Default /data/mcp.json // v1.15.0-mcp-multi: path to the MCP config JSON file. Default /data/mcp.json
// (bind-mounted alongside AGENTS.md). File missing = no MCP (opt-in). // (bind-mounted alongside AGENTS.md). File missing = no MCP (opt-in).
MCP_CONFIG_PATH: z.string().optional(), MCP_CONFIG_PATH: z.string().optional(),

View File

@@ -5,6 +5,7 @@ import type { Broker } from '../services/broker.js';
import type { Chat, Message } from '../types/api.js'; import type { Chat, Message } from '../types/api.js';
import { getModelContext } from '../services/model-context.js'; import { getModelContext } from '../services/model-context.js';
import { notifyCoderClose } from '../services/coder-notify.js'; import { notifyCoderClose } from '../services/coder-notify.js';
import { MESSAGE_COLUMNS } from '../services/message-columns.js';
const CreateBody = z.object({ const CreateBody = z.object({
name: z.string().min(1).max(200).optional(), name: z.string().min(1).max(200).optional(),
@@ -439,9 +440,7 @@ export function registerChatRoutes(
} }
// v1.13.1-B: reads tool_calls/tool_results via the parts-merged view. // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
const rows = await sql<Message[]>` const rows = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, SELECT ${sql.unsafe(MESSAGE_COLUMNS)}
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
summary, tail_start_id, compacted_at, model
FROM messages_with_parts FROM messages_with_parts
WHERE chat_id = ${req.params.id} WHERE chat_id = ${req.params.id}
ORDER BY created_at ASC, id ASC ORDER BY created_at ASC, id ASC

View File

@@ -8,6 +8,81 @@ import type { Chat, Message, Session, ToolCall } from '../types/api.js';
// decision time (not at request time) so concurrent project changes don't // decision time (not at request time) so concurrent project changes don't
// stale-bind the resolution. // stale-bind the resolution.
import { resolveGrantRoot } from '../services/grant_resolver.js'; import { resolveGrantRoot } from '../services/grant_resolver.js';
import { MESSAGE_COLUMNS } from '../services/message-columns.js';
// Shared lookup for the answer_user_input + grant_read_access pause-resume
// endpoints. Finds the originating assistant tool_call by id in message_parts,
// validates the tool name, finds the pending tool_result part, and checks the
// already-answered guard. Returns ok:true+context on success, ok:false+HTTP
// status+body on any error (caller does reply.code(ctx.code); return ctx.body).
type PendingToolLookupResult =
| {
ok: true;
foundCall: ToolCall;
toolMessageId: string;
toolRow: { message_id: string; payload: { tool_call_id: string; output: unknown } };
}
| { ok: false; code: number; body: Record<string, unknown> };
async function lookupPendingToolCall(
sql: Sql,
chatId: string,
tool_call_id: string,
expectedToolName: string,
wrongToolError: string,
): Promise<PendingToolLookupResult> {
// Find the assistant's tool_call by id via message_parts.
const callerRows = await sql<{
message_id: string;
payload: { id: string; name: string; args: Record<string, unknown> };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chatId}
AND m.role = 'assistant'
AND p.kind = 'tool_call'
AND p.payload->>'id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const callerRow = callerRows[0];
if (!callerRow) return { ok: false, code: 404, body: { error: 'unknown_tool_call_id' } };
const foundCall: ToolCall = {
id: callerRow.payload.id,
name: callerRow.payload.name,
args: callerRow.payload.args,
};
if (foundCall.name !== expectedToolName) {
return { ok: false, code: 400, body: { error: wrongToolError } };
}
// Find the pending tool_result part by tool_call_id.
const toolRows = await sql<{
message_id: string;
payload: { tool_call_id: string; output: unknown };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chatId}
AND m.role = 'tool'
AND p.kind = 'tool_result'
AND p.payload->>'tool_call_id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const toolRow = toolRows[0];
if (!toolRow) {
return { ok: false, code: 404, body: { error: 'unknown_tool_call_id', detail: 'tool message not found' } };
}
if (toolRow.payload && toolRow.payload.output !== null) {
return { ok: false, code: 409, body: { error: 'tool_call_already_answered' } };
}
return { ok: true, foundCall, toolMessageId: toolRow.message_id, toolRow };
}
const SendBody = z.object({ const SendBody = z.object({
content: z.string().min(1).max(64_000), content: z.string().min(1).max(64_000),
@@ -116,9 +191,7 @@ export function registerMessageRoutes(
// see services/inference.ts loadContext + services/compaction.ts. // see services/inference.ts loadContext + services/compaction.ts.
// v1.13.1-B: reads tool_calls/tool_results via the parts-merged view. // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
const rows = await sql<Message[]>` const rows = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, SELECT ${sql.unsafe(MESSAGE_COLUMNS)}
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
summary, tail_start_id, compacted_at, model
FROM messages_with_parts FROM messages_with_parts
WHERE session_id = ${req.params.id} WHERE session_id = ${req.params.id}
ORDER BY created_at ASC, id ASC ORDER BY created_at ASC, id ASC
@@ -493,40 +566,16 @@ export function registerMessageRoutes(
const chat = chatRows[0]!; const chat = chatRows[0]!;
const sessionId = chat.session_id; const sessionId = chat.session_id;
// v1.13.1-C: find the assistant's tool_call by indexing message_parts // v1.13.1-C: resolve the originating tool_call + pending tool row.
// directly on payload->>'id'. Scoped by chat_id + role via the JOIN. // Pre-v1.13.0 history has no parts rows — those become unreachable (404).
// Pre-v1.13.0 history has no parts rows — those tool_calls become const ctx = await lookupPendingToolCall(
// unreachable here (404). Acceptable per the dispatch decision: any sql, chat.id, tool_call_id, 'ask_user_input', 'tool_call_not_ask_user_input',
// pending elicitation from before v1.13.0 is long timed out by now; );
// promote to a hotfix with a JSON-column fallback if it ever surfaces. if (!ctx.ok) {
const callerRows = await sql<{ reply.code(ctx.code);
message_id: string; return ctx.body;
payload: { id: string; name: string; args: Record<string, unknown> };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'assistant'
AND p.kind = 'tool_call'
AND p.payload->>'id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const callerRow = callerRows[0];
if (!callerRow) {
reply.code(404);
return { error: 'unknown_tool_call_id' };
}
const foundCall: ToolCall = {
id: callerRow.payload.id,
name: callerRow.payload.name,
args: callerRow.payload.args,
};
if (foundCall.name !== 'ask_user_input') {
reply.code(400);
return { error: 'tool_call_not_ask_user_input' };
} }
const { foundCall, toolMessageId } = ctx;
// Validate the args themselves — the LLM could have emitted bad JSON. // Validate the args themselves — the LLM could have emitted bad JSON.
const argsParsed = AskUserInputArgs.safeParse(foundCall.args); const argsParsed = AskUserInputArgs.safeParse(foundCall.args);
@@ -569,33 +618,6 @@ export function registerMessageRoutes(
} }
} }
// v1.13.1-C: find the pending tool row via message_parts on
// payload->>'tool_call_id'. Same fallback caveat as the caller lookup
// above — pre-v1.13.0 rows are unreachable here.
const toolRows = await sql<{
message_id: string;
payload: { tool_call_id: string; output: unknown };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'tool'
AND p.kind = 'tool_result'
AND p.payload->>'tool_call_id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const toolRow = toolRows[0];
if (!toolRow) {
reply.code(404);
return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
}
if (toolRow.payload && toolRow.payload.output !== null) {
reply.code(409);
return { error: 'tool_call_already_answered' };
}
const answerSet = { answers }; const answerSet = { answers };
const newToolResults = { const newToolResults = {
tool_call_id, tool_call_id,
@@ -603,7 +625,6 @@ export function registerMessageRoutes(
truncated: false, truncated: false,
}; };
const toolMessageId = toolRow.message_id;
const result = await sql.begin(async (tx) => { const result = await sql.begin(async (tx) => {
// v1.13.20: parts-only. Replace the pending tool_result part inserted // v1.13.20: parts-only. Replace the pending tool_result part inserted
// at message creation (tool-phase.ts) with the answered one. Delete- // at message creation (tool-phase.ts) with the answered one. Delete-
@@ -681,35 +702,15 @@ export function registerMessageRoutes(
const chat = chatRows[0]!; const chat = chatRows[0]!;
const sessionId = chat.session_id; const sessionId = chat.session_id;
// Mirror the /answer lookup: assistant tool_call by id via message_parts. const grantCtx = await lookupPendingToolCall(
const callerRows = await sql<{ sql, chat.id, tool_call_id, 'request_read_access', 'tool_call_not_request_read_access',
message_id: string; );
payload: { id: string; name: string; args: Record<string, unknown> }; if (!grantCtx.ok) {
}[]>` reply.code(grantCtx.code);
SELECT p.message_id, p.payload return grantCtx.body;
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'assistant'
AND p.kind = 'tool_call'
AND p.payload->>'id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const callerRow = callerRows[0];
if (!callerRow) {
reply.code(404);
return { error: 'unknown_tool_call_id' };
}
const foundCall: ToolCall = {
id: callerRow.payload.id,
name: callerRow.payload.name,
args: callerRow.payload.args,
};
if (foundCall.name !== 'request_read_access') {
reply.code(400);
return { error: 'tool_call_not_request_read_access' };
} }
const { foundCall, toolMessageId } = grantCtx;
const argsParsed = RequestReadAccessArgs.safeParse(foundCall.args); const argsParsed = RequestReadAccessArgs.safeParse(foundCall.args);
if (!argsParsed.success) { if (!argsParsed.success) {
reply.code(400); reply.code(400);
@@ -717,31 +718,6 @@ export function registerMessageRoutes(
} }
const requestedPath = argsParsed.data.path; const requestedPath = argsParsed.data.path;
// Find the pending tool row.
const toolRows = await sql<{
message_id: string;
payload: { tool_call_id: string; output: unknown };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'tool'
AND p.kind = 'tool_result'
AND p.payload->>'tool_call_id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const toolRow = toolRows[0];
if (!toolRow) {
reply.code(404);
return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
}
if (toolRow.payload && toolRow.payload.output !== null) {
reply.code(409);
return { error: 'tool_call_already_answered' };
}
// Look up session + project so we can re-resolve the grant root and // Look up session + project so we can re-resolve the grant root and
// append to allowed_read_paths atomically. We don't need agent or // append to allowed_read_paths atomically. We don't need agent or
// history here — just the project path for the resolver. // history here — just the project path for the resolver.
@@ -790,7 +766,6 @@ export function registerMessageRoutes(
output: resultOutput, output: resultOutput,
truncated: false, truncated: false,
}; };
const toolMessageId = toolRow.message_id;
const dbResult = await sql.begin(async (tx) => { const dbResult = await sql.begin(async (tx) => {
// v1.13.20: parts-only. Same delete+insert dance as /answer — // v1.13.20: parts-only. Same delete+insert dance as /answer —
// UNIQUE (message_id, sequence) blocks plain UPDATE on append-style // UNIQUE (message_id, sequence) blocks plain UPDATE on append-style

View File

@@ -67,6 +67,20 @@ export async function resolveProjectPath(
return { real, name: basename(real) }; return { real, name: basename(real) };
} }
async function selectProject(sql: Sql, id: string): Promise<Project | null> {
const rows = await sql<Project[]>`
SELECT id, name, path, added_at, last_session_id, status, gitea_remote,
default_system_prompt, default_web_search_enabled
FROM projects WHERE id = ${id}
`;
return rows[0] ?? null;
}
async function selectProjectPath(sql: Sql, id: string): Promise<string | null> {
const rows = await sql<{ path: string }[]>`SELECT path FROM projects WHERE id = ${id}`;
return rows[0]?.path ?? null;
}
export function registerProjectRoutes( export function registerProjectRoutes(
app: FastifyInstance, app: FastifyInstance,
sql: Sql, sql: Sql,
@@ -199,16 +213,12 @@ export function registerProjectRoutes(
// v1.9: single-project fetch so the settings pane can refetch on // v1.9: single-project fetch so the settings pane can refetch on
// project_updated without pulling the whole project list. // project_updated without pulling the whole project list.
app.get<{ Params: { id: string } }>('/api/projects/:id', async (req, reply) => { app.get<{ Params: { id: string } }>('/api/projects/:id', async (req, reply) => {
const rows = await sql<Project[]>` const project = await selectProject(sql, req.params.id);
SELECT id, name, path, added_at, last_session_id, status, gitea_remote, if (!project) {
default_system_prompt, default_web_search_enabled
FROM projects WHERE id = ${req.params.id}
`;
if (rows.length === 0) {
reply.code(404); reply.code(404);
return { error: 'not found' }; return { error: 'not found' };
} }
return rows[0]; return project;
}); });
app.patch<{ Params: { id: string } }>('/api/projects/:id', async (req, reply) => { app.patch<{ Params: { id: string } }>('/api/projects/:id', async (req, reply) => {
@@ -340,18 +350,14 @@ export function registerProjectRoutes(
const { id } = req.params; const { id } = req.params;
const relPath = req.query.path ?? '.'; const relPath = req.query.path ?? '.';
const rows = await sql<Project[]>` const projectPath = await selectProjectPath(sql, id);
SELECT id, name, path, added_at, last_session_id, status, gitea_remote if (projectPath === null) {
FROM projects WHERE id = ${id}
`;
if (rows.length === 0) {
reply.code(404); reply.code(404);
return { error: 'not found' }; return { error: 'not found' };
} }
const project = rows[0]!;
let projectRoot: string; let projectRoot: string;
try { try {
projectRoot = await resolveProjectRoot(project.path); projectRoot = await resolveProjectRoot(projectPath);
} catch (err) { } catch (err) {
if (err instanceof PathScopeError) { if (err instanceof PathScopeError) {
reply.code(404); reply.code(404);
@@ -385,18 +391,14 @@ export function registerProjectRoutes(
return { error: 'path is required' }; return { error: 'path is required' };
} }
const rows = await sql<Project[]>` const projectPath = await selectProjectPath(sql, id);
SELECT id, name, path, added_at, last_session_id, status, gitea_remote if (projectPath === null) {
FROM projects WHERE id = ${id}
`;
if (rows.length === 0) {
reply.code(404); reply.code(404);
return { error: 'not found' }; return { error: 'not found' };
} }
const project = rows[0]!;
let projectRoot: string; let projectRoot: string;
try { try {
projectRoot = await resolveProjectRoot(project.path); projectRoot = await resolveProjectRoot(projectPath);
} catch (err) { } catch (err) {
if (err instanceof PathScopeError) { if (err instanceof PathScopeError) {
reply.code(404); reply.code(404);
@@ -431,18 +433,14 @@ export function registerProjectRoutes(
'/api/projects/:id/git', '/api/projects/:id/git',
async (req, reply) => { async (req, reply) => {
const { id } = req.params; const { id } = req.params;
const rows = await sql<Project[]>` const projectPath = await selectProjectPath(sql, id);
SELECT id, name, path, added_at, last_session_id, status, gitea_remote if (projectPath === null) {
FROM projects WHERE id = ${id}
`;
if (rows.length === 0) {
reply.code(404); reply.code(404);
return { error: 'not found' }; return { error: 'not found' };
} }
const project = rows[0]!;
let projectRoot: string; let projectRoot: string;
try { try {
projectRoot = await resolveProjectRoot(project.path); projectRoot = await resolveProjectRoot(projectPath);
} catch (err) { } catch (err) {
if (err instanceof PathScopeError) { if (err instanceof PathScopeError) {
reply.code(404); reply.code(404);
@@ -461,18 +459,14 @@ export function registerProjectRoutes(
async (req, reply) => { async (req, reply) => {
const { id } = req.params; const { id } = req.params;
const rows = await sql<Project[]>` const projectPath = await selectProjectPath(sql, id);
SELECT id, name, path, added_at, last_session_id, status, gitea_remote if (projectPath === null) {
FROM projects WHERE id = ${id}
`;
if (rows.length === 0) {
reply.code(404); reply.code(404);
return { error: 'not found' }; return { error: 'not found' };
} }
const project = rows[0]!;
let projectRoot: string; let projectRoot: string;
try { try {
projectRoot = await resolveProjectRoot(project.path); projectRoot = await resolveProjectRoot(projectPath);
} catch (err) { } catch (err) {
if (err instanceof PathScopeError) { if (err instanceof PathScopeError) {
reply.code(404); reply.code(404);

View File

@@ -2,6 +2,7 @@ import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import type { Broker } from '../services/broker.js'; import type { Broker } from '../services/broker.js';
import type { Message } from '../types/api.js'; import type { Message } from '../types/api.js';
import { MESSAGE_COLUMNS } from '../services/message-columns.js';
export function registerWebSocket( export function registerWebSocket(
app: FastifyInstance, app: FastifyInstance,
@@ -25,9 +26,7 @@ export function registerWebSocket(
// render the SummaryCard for summary=true rows on first connect. // render the SummaryCard for summary=true rows on first connect.
// v1.13.1-B: reads tool_calls/tool_results via the parts-merged view. // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
const messages = await sql<Message[]>` const messages = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, SELECT ${sql.unsafe(MESSAGE_COLUMNS)}
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
summary, tail_start_id, compacted_at, model
FROM messages_with_parts FROM messages_with_parts
WHERE session_id = ${sessionId} WHERE session_id = ${sessionId}
ORDER BY created_at ASC, id ASC ORDER BY created_at ASC, id ASC

View File

@@ -1,5 +1,5 @@
-- v1.13.3: statement_timeout is set at database level via: -- v1.13.3: statement_timeout is set at database level via:
-- ALTER DATABASE boocode SET statement_timeout = '30s'; -- ALTER DATABASE boochat SET statement_timeout = '30s';
-- ALTER DATABASE can't run inside a DO block, so this is an operational -- ALTER DATABASE can't run inside a DO block, so this is an operational
-- step rather than schema. Re-apply after a volume reset (the setting -- step rather than schema. Re-apply after a volume reset (the setting
-- lives in pg_db which survives `docker compose up --build` but NOT a -- lives in pg_db which survives `docker compose up --build` but NOT a
@@ -30,8 +30,6 @@ CREATE TABLE IF NOT EXISTS messages (
session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE, session_id UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
role TEXT NOT NULL, role TEXT NOT NULL,
content TEXT NOT NULL DEFAULT '', content TEXT NOT NULL DEFAULT '',
tool_calls JSONB,
tool_results JSONB,
status TEXT NOT NULL DEFAULT 'complete', status TEXT NOT NULL DEFAULT 'complete',
last_seq INT NOT NULL DEFAULT 0, last_seq INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp() created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
@@ -39,11 +37,10 @@ CREATE TABLE IF NOT EXISTS messages (
CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, created_at); CREATE INDEX IF NOT EXISTS idx_messages_session ON messages(session_id, created_at);
-- v1.13.0: granular message parts table for AI SDK migration. Old -- v1.13.0: granular message parts table. v1.13.20: legacy tool_calls/
-- messages.content / tool_calls / tool_results columns stay authoritative -- tool_results columns dropped; message_parts is now the sole source of
-- for reads in v1.13.0; this table is dual-written so the swap can happen -- truth for tool calls, tool results, and reasoning. ON DELETE CASCADE
-- in a later dispatch without a backfill window. ON DELETE CASCADE means -- means removing a message removes its parts in one go.
-- removing a message removes its parts in one go.
CREATE TABLE IF NOT EXISTS message_parts ( CREATE TABLE IF NOT EXISTS message_parts (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(), id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
message_id uuid NOT NULL REFERENCES messages(id) ON DELETE CASCADE, message_id uuid NOT NULL REFERENCES messages(id) ON DELETE CASCADE,
@@ -142,10 +139,9 @@ ALTER TABLE messages DROP COLUMN IF EXISTS tool_calls;
ALTER TABLE messages DROP COLUMN IF EXISTS tool_results; ALTER TABLE messages DROP COLUMN IF EXISTS tool_results;
-- v1.13.10: per-tool token cost rolling window. Derives from -- v1.13.10: per-tool token cost rolling window. Derives from
-- messages_with_parts (the v1.13.1-B view that COALESCEs message_parts over -- messages_with_parts (the v1.13.1-B view; v1.13.20 removed the legacy
-- the legacy JSON column) so this works whether the chat predates v1.13.0 -- JSON-column COALESCE fallback — parts are sole source). No new write
-- or postdates v1.13.2 (column drop). No new write site — all source data -- site — all source data already lands via tool-phase.ts:94-95 UPDATE.
-- already lands via the existing tool-phase.ts:94-95 UPDATE.
-- --
-- Attribution model: equal split. A turn emitting N tool calls divides its -- Attribution model: equal split. A turn emitting N tool calls divides its
-- prompt/completion tokens by N before attribution. See v1.13.10 dispatch -- prompt/completion tokens by N before attribution. See v1.13.10 dispatch
@@ -352,7 +348,7 @@ INSERT INTO settings (key, value) VALUES ('theme_mode', '"dark"') ON CONFLICT (k
ALTER TABLE projects ADD COLUMN IF NOT EXISTS default_system_prompt TEXT NOT NULL DEFAULT ''; ALTER TABLE projects ADD COLUMN IF NOT EXISTS default_system_prompt TEXT NOT NULL DEFAULT '';
ALTER TABLE projects ADD COLUMN IF NOT EXISTS default_web_search_enabled BOOLEAN NOT NULL DEFAULT false; ALTER TABLE projects ADD COLUMN IF NOT EXISTS default_web_search_enabled BOOLEAN NOT NULL DEFAULT false;
ALTER TABLE sessions ADD COLUMN IF NOT EXISTS web_search_enabled BOOLEAN; ALTER TABLE sessions ADD COLUMN IF NOT EXISTS web_search_enabled BOOLEAN;
ALTER TABLE sessions ADD COLUMN IF NOT EXISTS tags TEXT[] DEFAULT '{}'; ALTER TABLE sessions DROP COLUMN IF EXISTS tags;
-- v1.11: anchored rolling compaction. -- v1.11: anchored rolling compaction.
-- compacted_at — marks rows that are "behind the curtain" of the latest -- compacted_at — marks rows that are "behind the curtain" of the latest
@@ -391,9 +387,7 @@ CREATE TABLE IF NOT EXISTS tasks (
model TEXT, model TEXT,
mode_id TEXT, mode_id TEXT,
thinking_option_id TEXT, thinking_option_id TEXT,
feature_values JSONB,
execution_path TEXT CHECK (execution_path IS NULL OR execution_path IN ('native','acp','pty','qwen')), execution_path TEXT CHECK (execution_path IS NULL OR execution_path IN ('native','acp','pty','qwen')),
worktree_path TEXT,
cost_tokens INTEGER, cost_tokens INTEGER,
started_at TIMESTAMPTZ, started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ, ended_at TIMESTAMPTZ,

View File

@@ -0,0 +1,46 @@
import { describe, expect, it } from 'vitest';
import { resolveToolBudget } from '../inference/budget.js';
import type { Agent } from '../../types/api.js';
const BASE_AGENT: Agent = {
id: 'test-agent',
name: 'Test',
description: 'test',
system_prompt: '',
temperature: 0.7,
top_p: null,
top_k: null,
min_p: null,
presence_penalty: null,
top_n_sigma: null,
dry_multiplier: null,
dry_base: null,
dry_allowed_length: null,
dry_penalty_last_n: null,
tools: ['view_file'],
model: null,
source: 'global',
max_tool_calls: null,
steps: null,
llama_extra_args: null,
};
describe('resolveToolBudget', () => {
it('returns 100 when agent is null (no-agent raw chat)', () => {
expect(resolveToolBudget(null)).toBe(100);
});
it('returns 100 when agent has no max_tool_calls override', () => {
expect(resolveToolBudget(BASE_AGENT)).toBe(100);
});
it('returns max_tool_calls when agent overrides the default', () => {
const agent: Agent = { ...BASE_AGENT, max_tool_calls: 25 };
expect(resolveToolBudget(agent)).toBe(25);
});
it('returns 0 when max_tool_calls is explicitly 0 (text-only mode)', () => {
const agent: Agent = { ...BASE_AGENT, max_tool_calls: 0 };
expect(resolveToolBudget(agent)).toBe(0);
});
});

View File

@@ -0,0 +1,149 @@
import { describe, expect, it, vi, afterEach } from 'vitest';
import { samplerOptsFromAgent } from '../inference/stream-phase.js';
import { createContentFlusher } from '../inference/content-flusher.js';
import type { Sql } from '../../db.js';
import type { Agent } from '../../types/api.js';
const BASE_AGENT: Agent = {
id: 'test-agent',
name: 'Test',
description: 'test',
system_prompt: '',
temperature: 0.7,
top_p: null,
top_k: null,
min_p: null,
presence_penalty: null,
top_n_sigma: null,
dry_multiplier: null,
dry_base: null,
dry_allowed_length: null,
dry_penalty_last_n: null,
tools: ['view_file'],
model: null,
source: 'global',
max_tool_calls: null,
steps: null,
llama_extra_args: null,
};
describe('samplerOptsFromAgent', () => {
it('maps every nullable sampler field to undefined when agent is null', () => {
expect(samplerOptsFromAgent(null)).toEqual({
temperature: undefined,
top_p: undefined,
top_k: undefined,
min_p: undefined,
presence_penalty: undefined,
top_n_sigma: undefined,
dry_multiplier: undefined,
dry_base: undefined,
dry_allowed_length: undefined,
dry_penalty_last_n: undefined,
});
});
it('strips null sampler fields to undefined but keeps numeric values', () => {
const agent: Agent = {
...BASE_AGENT,
temperature: 0.5,
top_p: 0.9,
top_k: null,
min_p: 0.05,
presence_penalty: null,
top_n_sigma: 1,
dry_multiplier: null,
dry_base: 1.75,
dry_allowed_length: null,
dry_penalty_last_n: 256,
};
expect(samplerOptsFromAgent(agent)).toEqual({
temperature: 0.5,
top_p: 0.9,
top_k: undefined,
min_p: 0.05,
presence_penalty: undefined,
top_n_sigma: 1,
dry_multiplier: undefined,
dry_base: 1.75,
dry_allowed_length: undefined,
dry_penalty_last_n: 256,
});
});
it('never includes a tools field (callers add it)', () => {
expect('tools' in samplerOptsFromAgent(BASE_AGENT)).toBe(false);
});
});
describe('createContentFlusher', () => {
afterEach(() => {
vi.useRealTimers();
});
// A tagged-template stub matching postgres' sql`...` shape. Records the
// interpolated content snapshot (values[0]) of each UPDATE.
function makeSqlSpy() {
const writes: string[] = [];
const sql = ((_strings: TemplateStringsArray, ...values: unknown[]) => {
writes.push(values[0] as string);
return Promise.resolve([]);
}) as unknown as Sql;
return { sql, writes };
}
it('debounces: many scheduleFlush calls in one window produce one write', async () => {
vi.useFakeTimers();
const { sql, writes } = makeSqlSpy();
let content = '';
const flusher = createContentFlusher(sql, 'msg-1', () => content, 500);
content = 'a';
flusher.scheduleFlush();
content = 'ab';
flusher.scheduleFlush();
content = 'abc';
flusher.scheduleFlush();
expect(writes).toHaveLength(0); // nothing before the interval elapses
vi.advanceTimersByTime(500);
await flusher.drain();
expect(writes).toHaveLength(1);
// snapshot is read at fire time → latest content, not the value at schedule time
expect(writes[0]).toBe('abc');
});
it('arms a fresh timer after a flush fires', async () => {
vi.useFakeTimers();
const { sql, writes } = makeSqlSpy();
let content = 'one';
const flusher = createContentFlusher(sql, 'msg-1', () => content, 500);
flusher.scheduleFlush();
vi.advanceTimersByTime(500);
await Promise.resolve();
content = 'two';
flusher.scheduleFlush();
vi.advanceTimersByTime(500);
await flusher.drain();
expect(writes).toEqual(['one', 'two']);
});
it('drain cancels a pending timer without performing a final flush', async () => {
vi.useFakeTimers();
const { sql, writes } = makeSqlSpy();
let content = 'pending';
const flusher = createContentFlusher(sql, 'msg-1', () => content, 500);
flusher.scheduleFlush();
// Drain before the timer fires — the pending flush is cancelled, not forced.
await flusher.drain();
vi.advanceTimersByTime(500);
await Promise.resolve();
expect(writes).toHaveLength(0);
});
});

View File

@@ -9,12 +9,9 @@ import {
const TEST_URL = 'http://llama-swap.test:8401'; const TEST_URL = 'http://llama-swap.test:8401';
function mockOkProps(n_ctx: number, total_slots = 1) { function mockOkProps(n_ctx: number) {
return new Response( return new Response(
JSON.stringify({ JSON.stringify({ default_generation_settings: { n_ctx } }),
default_generation_settings: { n_ctx },
total_slots,
}),
{ status: 200, headers: { 'Content-Type': 'application/json' } }, { status: 200, headers: { 'Content-Type': 'application/json' } },
); );
} }
@@ -33,12 +30,10 @@ afterEach(() => {
describe('getModelContext — positive cache', () => { describe('getModelContext — positive cache', () => {
it('returns the parsed body on a 200 with valid shape', async () => { it('returns the parsed body on a 200 with valid shape', async () => {
const fetchSpy = vi.spyOn(globalThis, 'fetch').mockResolvedValueOnce(mockOkProps(262_144, 1)); const fetchSpy = vi.spyOn(globalThis, 'fetch').mockResolvedValueOnce(mockOkProps(262_144));
const result = await getModelContext('qwen3.6'); const result = await getModelContext('qwen3.6');
expect(result).not.toBeNull(); expect(result).not.toBeNull();
expect(result!.n_ctx).toBe(262_144); expect(result!.n_ctx).toBe(262_144);
expect(result!.total_slots).toBe(1);
expect(typeof result!.fetched_at).toBe('number');
// Verify the URL was constructed correctly — encodes the model name in // Verify the URL was constructed correctly — encodes the model name in
// case it contains characters that would break the path. // case it contains characters that would break the path.
expect(fetchSpy).toHaveBeenCalledExactlyOnceWith( expect(fetchSpy).toHaveBeenCalledExactlyOnceWith(
@@ -57,19 +52,6 @@ describe('getModelContext — positive cache', () => {
expect(fetchSpy).toHaveBeenCalledTimes(1); expect(fetchSpy).toHaveBeenCalledTimes(1);
}); });
it('defaults total_slots to 1 when the server omits it', async () => {
// Mirror the docstring claim — total_slots is informational and we don't
// reject the response just because it's missing.
vi.spyOn(globalThis, 'fetch').mockResolvedValueOnce(
new Response(JSON.stringify({ default_generation_settings: { n_ctx: 8192 } }), {
status: 200,
}),
);
const result = await getModelContext('partial-model');
expect(result).not.toBeNull();
expect(result!.n_ctx).toBe(8192);
expect(result!.total_slots).toBe(1);
});
}); });
// ---- negative cache (single-shot) ------------------------------------------ // ---- negative cache (single-shot) ------------------------------------------

View File

@@ -0,0 +1,87 @@
import { describe, it, expect } from 'vitest';
import { SENTINEL_KINDS, isAnySentinel, isCapHitSentinel, isDoomLoopSentinel, isMistakeRecoverySentinel } from '../inference/sentinels.js';
import type { Message } from '../../types/api.js';
function makeSentinel(kind: string): Message {
return {
id: 'msg-1',
session_id: 's',
chat_id: 'c',
role: 'system',
content: '',
kind: 'message',
tool_calls: null,
tool_results: null,
status: 'complete',
last_seq: 0,
tokens_used: null,
ctx_used: null,
ctx_max: null,
started_at: null,
finished_at: null,
created_at: new Date().toISOString(),
metadata: { kind } as unknown as import('../../types/api.js').MessageMetadata,
summary: false,
tail_start_id: null,
compacted_at: null,
};
}
describe('SENTINEL_KINDS — single source of truth', () => {
it('contains the three known sentinel kinds', () => {
expect(SENTINEL_KINDS.has('cap_hit')).toBe(true);
expect(SENTINEL_KINDS.has('doom_loop')).toBe(true);
expect(SENTINEL_KINDS.has('mistake_recovery')).toBe(true);
});
it('does not contain arbitrary strings', () => {
expect(SENTINEL_KINDS.has('user')).toBe(false);
expect(SENTINEL_KINDS.has('assistant')).toBe(false);
expect(SENTINEL_KINDS.has('')).toBe(false);
});
});
describe('isAnySentinel', () => {
it('returns true for cap_hit', () => {
expect(isAnySentinel(makeSentinel('cap_hit'))).toBe(true);
});
it('returns true for doom_loop', () => {
expect(isAnySentinel(makeSentinel('doom_loop'))).toBe(true);
});
it('returns true for mistake_recovery', () => {
expect(isAnySentinel(makeSentinel('mistake_recovery'))).toBe(true);
});
it('returns false for non-system role', () => {
const m = { ...makeSentinel('cap_hit'), role: 'user' as const };
expect(isAnySentinel(m)).toBe(false);
});
it('returns false for null metadata', () => {
const m = { ...makeSentinel('cap_hit'), metadata: null };
expect(isAnySentinel(m)).toBe(false);
});
it('returns false for unknown kind', () => {
expect(isAnySentinel(makeSentinel('unknown_kind'))).toBe(false);
});
});
describe('individual sentinel predicates still work', () => {
it('isCapHitSentinel matches cap_hit only', () => {
expect(isCapHitSentinel(makeSentinel('cap_hit'))).toBe(true);
expect(isCapHitSentinel(makeSentinel('doom_loop'))).toBe(false);
});
it('isDoomLoopSentinel matches doom_loop only', () => {
expect(isDoomLoopSentinel(makeSentinel('doom_loop'))).toBe(true);
expect(isDoomLoopSentinel(makeSentinel('cap_hit'))).toBe(false);
});
it('isMistakeRecoverySentinel matches mistake_recovery only', () => {
expect(isMistakeRecoverySentinel(makeSentinel('mistake_recovery'))).toBe(true);
expect(isMistakeRecoverySentinel(makeSentinel('cap_hit'))).toBe(false);
});
});

View File

@@ -0,0 +1,111 @@
import { describe, expect, it } from 'vitest';
import { resolveTurnConfig, MAX_STEPS } from '../inference/turn-config.js';
import { decideStep, decidePostToolAction } from '../inference/step-decision.js';
import { DOOM_LOOP_THRESHOLD } from '../inference/sentinels.js';
import type { MistakeState } from '../inference/mistake-tracker.js';
import type { Agent, ToolCall } from '../../types/api.js';
const BASE_AGENT: Agent = {
id: 'test-agent',
name: 'Test',
description: 'test',
system_prompt: '',
temperature: 0.7,
top_p: null,
top_k: null,
min_p: null,
presence_penalty: null,
top_n_sigma: null,
dry_multiplier: null,
dry_base: null,
dry_allowed_length: null,
dry_penalty_last_n: null,
tools: ['view_file'],
model: null,
source: 'global',
max_tool_calls: null,
steps: null,
llama_extra_args: null,
};
function call(name: string, args: Record<string, unknown> = {}): ToolCall {
return { id: `tc-${name}-${JSON.stringify(args)}`, name, args };
}
describe('resolveTurnConfig', () => {
it('no agent → budget 100, cap MAX_STEPS, not text-only', () => {
expect(resolveTurnConfig(null)).toEqual({
effectiveCap: MAX_STEPS,
budget: 100,
isTextOnly: false,
});
});
it('steps: 0 → effectiveCap 0 and isTextOnly true', () => {
expect(resolveTurnConfig({ ...BASE_AGENT, steps: 0 })).toEqual({
effectiveCap: 0,
budget: 100,
isTextOnly: true,
});
});
it('steps below MAX_STEPS → effectiveCap is the agent value', () => {
expect(resolveTurnConfig({ ...BASE_AGENT, steps: 5 }).effectiveCap).toBe(5);
});
it('steps above MAX_STEPS → effectiveCap clamps to MAX_STEPS', () => {
expect(resolveTurnConfig({ ...BASE_AGENT, steps: 9999 }).effectiveCap).toBe(MAX_STEPS);
});
it('max_tool_calls overrides the budget', () => {
expect(resolveTurnConfig({ ...BASE_AGENT, max_tool_calls: 12 }).budget).toBe(12);
});
});
describe('decideStep (top-of-loop gate)', () => {
it('returns stream when no doom loop and under budget', () => {
expect(decideStep({ recentToolCalls: [], toolsUsed: 0, budget: 30 })).toEqual({ kind: 'stream' });
});
it('returns budget when toolsUsed has reached the budget', () => {
expect(decideStep({ recentToolCalls: [], toolsUsed: 30, budget: 30 })).toEqual({ kind: 'budget' });
});
it('returns doom (with the looping call) on identical-repeat tail', () => {
const recent = Array.from({ length: DOOM_LOOP_THRESHOLD }, () => call('view_file', { path: '/a' }));
const d = decideStep({ recentToolCalls: recent, toolsUsed: 1, budget: 30 });
expect(d.kind).toBe('doom');
if (d.kind === 'doom') {
expect(d.loop.name).toBe('view_file');
expect(d.loop.args).toEqual({ path: '/a' });
}
});
it('doom takes precedence over budget when both would trip', () => {
const recent = Array.from({ length: DOOM_LOOP_THRESHOLD }, () => call('grep', { q: 'x' }));
expect(decideStep({ recentToolCalls: recent, toolsUsed: 30, budget: 30 }).kind).toBe('doom');
});
});
describe('decidePostToolAction (post-tool decision)', () => {
const clean: MistakeState = { run: [], nudges: 0 };
it('non-continue actions stop the loop without consulting the tracker', () => {
expect(decidePostToolAction('paused', { run: ['exec_error', 'exec_error', 'exec_error'], nudges: 0 })).toBe('stop');
expect(decidePostToolAction('synthesis_done', clean)).toBe('stop');
});
it('continue with a clean tracker → continue', () => {
expect(decidePostToolAction('continue', clean)).toBe('continue');
});
it('continue with a threshold streak and no prior nudge → nudge', () => {
const tracker: MistakeState = { run: ['zod_reject', 'tool_not_found', 'exec_error'], nudges: 0 };
expect(decidePostToolAction('continue', tracker)).toBe('nudge');
});
it('continue with a threshold streak after a nudge already fired → escalate', () => {
const tracker: MistakeState = { run: ['zod_reject', 'tool_not_found', 'exec_error'], nudges: 1 };
expect(decidePostToolAction('continue', tracker)).toBe('escalate');
});
});

View File

@@ -0,0 +1,29 @@
import { describe, expect, it } from 'vitest';
import { classifyStreamError } from '../inference/stream-error-classifier.js';
describe('classifyStreamError', () => {
it("classifies AbortError as 'stall'", () => {
const err = new Error('aborted');
err.name = 'AbortError';
expect(classifyStreamError(err)).toBe('stall');
});
it("classifies a 503 HTTP error as 'transient'", () => {
const err = Object.assign(new Error('Service Unavailable'), { status: 503 });
expect(classifyStreamError(err)).toBe('transient');
});
it("classifies a 500 HTTP error as 'transient'", () => {
const err = Object.assign(new Error('Internal Server Error'), { status: 500 });
expect(classifyStreamError(err)).toBe('transient');
});
it("classifies a 4xx HTTP error as 'non-retryable'", () => {
const err = Object.assign(new Error('Bad Request'), { status: 400 });
expect(classifyStreamError(err)).toBe('non-retryable');
});
it("classifies a generic Error as 'non-retryable'", () => {
expect(classifyStreamError(new Error('something went wrong'))).toBe('non-retryable');
});
});

View File

@@ -0,0 +1,153 @@
// Gate test: pins the <invoke>-as-text fallback in the stream-phase text-delta
// path. This test will fail if extractToolCallBlocks is ever removed from the
// text-delta branch of streamCompletion, which is the only guard for models
// that emit tool calls as inline XML rather than structured tool_calls.
import { describe, expect, it, vi, afterEach } from 'vitest';
import type { FastifyBaseLogger } from 'fastify';
// vi.mock is hoisted before all module imports. Spread the original so all
// other ai exports (tool, jsonSchema, types, …) remain real; only streamText
// is replaced with a controllable spy.
vi.mock('ai', async (importOriginal) => {
const actual = await importOriginal<typeof import('ai')>();
return { ...actual, streamText: vi.fn() };
});
import { streamText } from 'ai';
import { streamCompletion, STALL_TIMEOUT_MS } from '../inference/stream-phase-adapter.js';
import type { StreamAdapterContext } from '../inference/stream-phase-adapter.js';
const INVOKE_BLOCK =
'<invoke name="view_file"><parameter name="path">/tmp/test.ts</parameter></invoke>';
// One-shot async generator that yields a single text-delta carrying a complete
// <invoke> block, simulating a model that emits its tool call as plain XML text.
async function* makeInvokeTextDeltaStream() {
yield { type: 'text-delta' as const, text: INVOKE_BLOCK };
}
const fakeLog = {
debug: vi.fn(),
info: vi.fn(),
warn: vi.fn(),
error: vi.fn(),
fatal: vi.fn(),
trace: vi.fn(),
child: vi.fn(),
} as unknown as FastifyBaseLogger;
const fakeCtx: StreamAdapterContext = {
config: { LLAMA_SWAP_URL: 'http://localhost:11434' } as StreamAdapterContext['config'],
log: fakeLog,
};
describe('<invoke>-as-text fallback gate (stream-phase text-delta path)', () => {
it('surfaces a plain-text <invoke> block as a toolCall and strips markup from content and deltas', async () => {
vi.mocked(streamText).mockReturnValue({
fullStream: makeInvokeTextDeltaStream(),
usage: Promise.resolve({ inputTokens: 1, outputTokens: 1 }),
} as unknown as ReturnType<typeof streamText>);
const deltas: string[] = [];
const result = await streamCompletion(
fakeCtx,
'test-model',
[{ role: 'user', content: 'call a tool' }],
{ tools: null },
(d) => deltas.push(d),
undefined,
);
// The <invoke> block must surface as a structured tool call
expect(result.toolCalls).toHaveLength(1);
expect(result.toolCalls[0]).toMatchObject({
id: 'xml_call_0',
name: 'view_file',
args: { path: '/tmp/test.ts' },
});
// The XML markup must not appear in the saved content or any flushed delta
expect(result.content).not.toContain('<invoke');
expect(result.content).not.toContain('</invoke>');
expect(deltas.join('')).not.toContain('<invoke');
});
});
// T9: stall timeout — fake hanging stream fires AbortError after STALL_TIMEOUT_MS.
describe('stall timeout (F6)', () => {
afterEach(() => {
vi.useRealTimers();
});
it(`aborts the stream after ${STALL_TIMEOUT_MS}ms with no chunks (stall path)`, async () => {
vi.useFakeTimers();
// Capture the effectiveSignal the adapter passes to streamText so the fake
// generator can unblock when the stall fires (matching real ReadableStream
// abort behavior: the stream ends rather than throwing into the generator).
let capturedSignal: AbortSignal | undefined;
vi.mocked(streamText).mockImplementation((opts: Parameters<typeof streamText>[0]) => {
capturedSignal = opts.abortSignal as AbortSignal | undefined;
return {
// Hang until the effective signal fires, then return without emitting
// any parts — mirrors how a real fetch stream ends when aborted.
fullStream: (async function* () {
await new Promise<void>((resolve) => {
if (capturedSignal?.aborted) {
resolve();
return;
}
capturedSignal?.addEventListener('abort', () => resolve(), { once: true });
});
})(),
// Never resolves; the stall throw happens before usage is awaited.
usage: new Promise<never>(() => {}),
} as unknown as ReturnType<typeof streamText>;
});
const streamPromise = streamCompletion(
fakeCtx,
'test-model',
[{ role: 'user', content: 'hang' }],
{ tools: null },
() => {},
undefined,
);
// Attach the rejection handler BEFORE advancing timers so the rejection is
// never unhandled (Node emits PromiseRejectionHandledWarning otherwise).
const assertion = expect(streamPromise).rejects.toMatchObject({ name: 'AbortError' });
// Advance past the stall deadline — the stallAc fires, the hanging generator
// resolves, the post-loop check sees stallAc.signal.aborted and throws.
await vi.advanceTimersByTimeAsync(STALL_TIMEOUT_MS);
await assertion;
});
// T10: regression pin — the original post-loop signal check for user-initiated
// abort must still fire correctly after the stall logic was added.
it('throws AbortError when the inbound signal is aborted (user-abort regression pin)', async () => {
const ac = new AbortController();
ac.abort();
vi.mocked(streamText).mockReturnValue({
fullStream: (async function* () {
// Yield nothing — stream ends immediately after user abort is already set
})(),
usage: Promise.resolve({ inputTokens: 0, outputTokens: 0 }),
} as unknown as ReturnType<typeof streamText>);
await expect(
streamCompletion(
fakeCtx,
'test-model',
[{ role: 'user', content: 'aborted' }],
{ tools: null },
() => {},
undefined,
ac.signal,
),
).rejects.toMatchObject({ name: 'AbortError' });
});
});

View File

@@ -1,179 +1,9 @@
import { describe, expect, it } from 'vitest'; import { describe, expect, it } from 'vitest';
import { import {
parseXmlToolCall,
parseInvokeToolCall,
partialXmlOpenerStart,
extractToolCallBlocks, extractToolCallBlocks,
stripToolMarkup, stripToolMarkup,
XML_TOOL_OPEN,
XML_TOOL_CLOSE,
INVOKE_TOOL_OPEN,
INVOKE_TOOL_CLOSE,
} from '../inference/tool-call-parser.js'; } from '../inference/tool-call-parser.js';
// ── Ported from xml-parser.test.ts ───────────────────────────────────────
describe('parseXmlToolCall (Qwen/Hermes <tool_call>)', () => {
it('parses a well-formed single-parameter call', () => {
const block = '<tool_call><function=view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
expect(parseXmlToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('parses multi-parameter call', () => {
const block = '<tool_call><function=grep><parameter=pattern>foo</parameter><parameter=path>src/</parameter></function></tool_call>';
expect(parseXmlToolCall(block)).toEqual({
name: 'grep',
args: { pattern: 'foo', path: 'src/' },
});
});
it('JSON-parses numeric parameter values', () => {
const block = '<tool_call><function=foo><parameter=count>42</parameter></function></tool_call>';
expect(parseXmlToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
});
it('tolerates whitespace around = in function (v1.13.16 tightening)', () => {
const block = '<tool_call><function = view_file><parameter=path>/tmp/foo</parameter></function></tool_call>';
expect(parseXmlToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('tolerates whitespace around = in parameter (v1.13.16 tightening)', () => {
const block = '<tool_call><function=view_file><parameter = path>/tmp/foo</parameter></function></tool_call>';
expect(parseXmlToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('returns null when function name is missing', () => {
const block = '<tool_call><parameter=path>/tmp/foo</parameter></tool_call>';
expect(parseXmlToolCall(block)).toBeNull();
});
});
describe('parseInvokeToolCall (Anthropic <invoke>) — v1.13.16', () => {
it('parses a well-formed single-parameter call (spec case 1)', () => {
const block = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
expect(parseInvokeToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('parses a multi-parameter call (spec case 2)', () => {
const block = '<invoke name="grep"><parameter name="pattern">foo</parameter><parameter name="path">src/</parameter></invoke>';
expect(parseInvokeToolCall(block)).toEqual({
name: 'grep',
args: { pattern: 'foo', path: 'src/' },
});
});
it('tolerates newlines and spaces in attributes (spec case 3)', () => {
const block = `<invoke
name="view_file"
>
<parameter
name="path"
>/tmp/foo</parameter>
</invoke>`;
expect(parseInvokeToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('parses a call whose name is not a registered BooCode tool (spec case 4)', () => {
const block = '<invoke name="read_file"><parameter name="path">/tmp/foo</parameter></invoke>';
expect(parseInvokeToolCall(block)).toEqual({
name: 'read_file',
args: { path: '/tmp/foo' },
});
});
it('supports single-quoted attribute values', () => {
const block = "<invoke name='view_file'><parameter name='path'>/tmp/foo</parameter></invoke>";
expect(parseInvokeToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('JSON-parses numeric parameter values', () => {
const block = '<invoke name="foo"><parameter name="count">42</parameter></invoke>';
expect(parseInvokeToolCall(block)).toEqual({ name: 'foo', args: { count: 42 } });
});
it('tolerates spaces around = inside name attribute', () => {
const block = '<invoke name = "view_file"><parameter name = "path">/tmp/foo</parameter></invoke>';
expect(parseInvokeToolCall(block)).toEqual({
name: 'view_file',
args: { path: '/tmp/foo' },
});
});
it('returns null when name attribute is missing', () => {
const block = '<invoke><parameter name="path">/tmp/foo</parameter></invoke>';
expect(parseInvokeToolCall(block)).toBeNull();
});
it('returns null when name attribute is empty', () => {
const block = '<invoke name=""><parameter name="path">/tmp/foo</parameter></invoke>';
expect(parseInvokeToolCall(block)).toBeNull();
});
it('exports the expected delimiters', () => {
expect(INVOKE_TOOL_OPEN).toBe('<invoke');
expect(INVOKE_TOOL_CLOSE).toBe('</invoke>');
expect(XML_TOOL_OPEN).toBe('<tool_call>');
expect(XML_TOOL_CLOSE).toBe('</tool_call>');
});
});
describe('partialXmlOpenerStart (v1.13.16 — both flavors)', () => {
it('returns -1 when the buffer is empty', () => {
expect(partialXmlOpenerStart('')).toBe(-1);
});
it('returns -1 when the buffer has no openers', () => {
expect(partialXmlOpenerStart('plain prose, no markup')).toBe(-1);
});
it('returns the index of a complete <tool_call> opener (existing)', () => {
expect(partialXmlOpenerStart('prose <tool_call>more')).toBe(6);
});
it('returns the index of a complete <invoke opener (v1.13.16)', () => {
expect(partialXmlOpenerStart('prose <invoke name=')).toBe(6);
});
it('holds a partial <tool_ prefix at end of buffer', () => {
expect(partialXmlOpenerStart('text <tool_')).toBe(5);
});
it('holds a partial <invo prefix at end of buffer (v1.13.16)', () => {
expect(partialXmlOpenerStart('text <invo')).toBe(5);
});
it('holds a bare < at end of buffer', () => {
expect(partialXmlOpenerStart('text <')).toBe(5);
});
it('returns -1 when < is followed by non-opener text', () => {
expect(partialXmlOpenerStart('text <unknown>')).toBe(-1);
});
it('returns the earliest opener when both flavors are present', () => {
expect(partialXmlOpenerStart('xxx <tool_call>YYY <invoke>')).toBe(4);
expect(partialXmlOpenerStart('xxx <invoke>YYY <tool_call>')).toBe(4);
});
});
describe('extractToolCallBlocks (v1.13.16 — unified extraction)', () => { describe('extractToolCallBlocks (v1.13.16 — unified extraction)', () => {
it('extracts a single <invoke> block (spec case 1)', () => { it('extracts a single <invoke> block (spec case 1)', () => {
const input = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>'; const input = '<invoke name="view_file"><parameter name="path">/tmp/foo</parameter></invoke>';
@@ -341,11 +171,3 @@ describe('stripToolMarkup', () => {
}); });
}); });
describe('delimiter constants', () => {
it('exports the expected delimiters', () => {
expect(INVOKE_TOOL_OPEN).toBe('<invoke');
expect(INVOKE_TOOL_CLOSE).toBe('</invoke>');
expect(XML_TOOL_OPEN).toBe('<tool_call>');
expect(XML_TOOL_CLOSE).toBe('</tool_call>');
});
});

View File

@@ -0,0 +1,68 @@
import { describe, it, expect } from 'vitest';
import { z } from 'zod';
import {
ALL_TOOLS,
TOOLS_BY_NAME,
appendMcpTools,
toolJsonSchemas,
type ToolDef,
} from '../tools.js';
// Parity test for the register-through MCP-discovery contract (Phase 6 split).
// `ALL_TOOLS` / `TOOLS_BY_NAME` are `let`-bound in tools/registry.ts and
// reassigned by appendMcpTools() at startup; this barrel re-exports them.
// apps/coder relies on this exact behavior: it imports `appendMcpTools` + the
// live `ALL_TOOLS` binding from @boocode/server/tools, calls appendMcpTools()
// once, then reads ALL_TOOLS. ESM live bindings must carry the mutation
// through the barrel re-export — if the split ever snapshots the array instead
// of re-exporting the live binding, these assertions fail. Each test file gets
// an isolated module instance (vitest default), so mutating the registry here
// does not leak into tools.test.ts.
function makeFakeMcpTool(name: string): ToolDef<unknown> {
return {
name,
description: `fake mcp tool ${name}`,
inputSchema: z.object({}) as z.ZodType<unknown>,
jsonSchema: {
type: 'function',
function: {
name,
description: `fake mcp tool ${name}`,
parameters: { type: 'object', properties: {}, additionalProperties: false },
},
},
async execute() {
return { ok: true };
},
};
}
describe('appendMcpTools register-through contract', () => {
it('is a no-op for an empty array', () => {
const before = ALL_TOOLS.length;
appendMcpTools([]);
expect(ALL_TOOLS.length).toBe(before);
});
it('mutates the live ALL_TOOLS / TOOLS_BY_NAME bindings observable through the barrel', () => {
const before = ALL_TOOLS.length;
// Names chosen so insertion lands away from the array ends, proving the
// re-sort runs (a naive concat would leave them at the tail).
const a = makeFakeMcpTool('mcp__alpha__probe');
const z2 = makeFakeMcpTool('mcp__zeta__probe');
appendMcpTools([z2, a]);
expect(ALL_TOOLS.length).toBe(before + 2);
expect(TOOLS_BY_NAME['mcp__alpha__probe']).toBe(a);
expect(TOOLS_BY_NAME['mcp__zeta__probe']).toBe(z2);
// Still alpha-sorted after the append (prompt-cache stability invariant).
const names = ALL_TOOLS.map((t) => t.name);
expect(names).toEqual([...names].sort((x, y) => x.localeCompare(y)));
// toolJsonSchemas() reads through the same live binding.
const schemaNames = toolJsonSchemas().map((s) => s.function.name);
expect(schemaNames).toContain('mcp__alpha__probe');
expect(schemaNames).toContain('mcp__zeta__probe');
});
});

View File

@@ -3,6 +3,7 @@ import { join } from 'node:path';
import type { Agent, AgentsResponse, AgentParseError } from '../types/api.js'; import type { Agent, AgentsResponse, AgentParseError } from '../types/api.js';
import { ALL_TOOLS, resolveToolTier } from './tools.js'; import { ALL_TOOLS, resolveToolTier } from './tools.js';
import { validateExtraArgs } from './inference/llama-args-validator.js'; import { validateExtraArgs } from './inference/llama-args-validator.js';
import { stripQuotes } from '../utils/string-utils.js';
// v1.8.1: global agents live at /data/AGENTS.md inside the container // v1.8.1: global agents live at /data/AGENTS.md inside the container
// (./data:/data:ro mount on the host). Per-project AGENTS.md at the project // (./data:/data:ro mount on the host). Per-project AGENTS.md at the project
@@ -107,17 +108,50 @@ interface ParsedFrontmatter {
llama_extra_args?: string[]; llama_extra_args?: string[];
} }
function stripQuotes(s: string): string { // P5: table-driven validation for the "soft-range" numeric frontmatter fields.
if ( // Each was a near-identical Number() + finite/integer + range-warn + push-error
s.length >= 2 && // block. "Soft-range" = the value is STORED whenever the type checks out; an
(s[0] === '"' || s[0] === "'") && // out-of-range value only emits a console.warn (it is NOT skipped). A type
s[0] === s[s.length - 1] // mismatch hard-fails the block. The range descriptor in the warn message is
) { // `min-max` when both bounds exist, else `(≥min)` — matching the original
return s.slice(1, -1); // hand-written strings byte-for-byte.
} //
return s; // max_tool_calls and steps are deliberately NOT in this table: they are
// "hard-range" (store ONLY if in range; an in-type-but-out-of-range value is
// warned AND skipped) with bespoke messages, so they stay explicit below.
type NumericFieldKey =
| 'temperature'
| 'top_p'
| 'top_k'
| 'min_p'
| 'presence_penalty'
| 'top_n_sigma'
| 'dry_multiplier'
| 'dry_base'
| 'dry_allowed_length'
| 'dry_penalty_last_n';
interface NumericFieldSpec {
key: NumericFieldKey;
isInt: boolean;
min?: number;
max?: number;
} }
const NUMERIC_FIELDS: readonly NumericFieldSpec[] = [
{ key: 'temperature', isInt: false },
{ key: 'top_p', isInt: false, min: 0, max: 1 },
{ key: 'top_k', isInt: true, min: 0, max: 200 },
{ key: 'min_p', isInt: false, min: 0, max: 1 },
{ key: 'presence_penalty', isInt: false, min: -2, max: 2 },
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions.
{ key: 'top_n_sigma', isInt: false, min: 0 },
{ key: 'dry_multiplier', isInt: false, min: 0 },
{ key: 'dry_base', isInt: false, min: 0 },
{ key: 'dry_allowed_length', isInt: true, min: 0 },
{ key: 'dry_penalty_last_n', isInt: true, min: -1 },
];
function parseFrontmatter(yaml: string): { data: ParsedFrontmatter; errors: string[] } { function parseFrontmatter(yaml: string): { data: ParsedFrontmatter; errors: string[] } {
const data: ParsedFrontmatter = {}; const data: ParsedFrontmatter = {};
const errors: string[] = []; const errors: string[] = [];
@@ -140,108 +174,33 @@ function parseFrontmatter(yaml: string): { data: ParsedFrontmatter; errors: stri
const key = line.slice(0, colonIdx).trim(); const key = line.slice(0, colonIdx).trim();
const valueRaw = line.slice(colonIdx + 1).trim(); const valueRaw = line.slice(colonIdx + 1).trim();
if (key === 'temperature') { const numericSpec = NUMERIC_FIELDS.find((f) => f.key === key);
if (numericSpec) {
const n = Number(valueRaw); const n = Number(valueRaw);
if (Number.isFinite(n)) data.temperature = n; const typeOk = numericSpec.isInt ? Number.isInteger(n) : Number.isFinite(n);
else errors.push(`temperature must be a number (got "${valueRaw}")`); if (typeOk) {
} else if (key === 'top_p') { // Soft-range: store regardless of range; out-of-range only warns.
const n = Number(valueRaw); data[numericSpec.key] = n;
if (Number.isFinite(n)) { const below = numericSpec.min !== undefined && n < numericSpec.min;
data.top_p = n; const above = numericSpec.max !== undefined && n > numericSpec.max;
if (n < 0 || n > 1) { if (below || above) {
console.warn(`agents: top_p ${n} out of range 0-1, ignoring (falling back to default)`); const range =
numericSpec.max !== undefined
? `${numericSpec.min}-${numericSpec.max}`
: `(≥${numericSpec.min})`;
console.warn(
`agents: ${numericSpec.key} ${n} out of range ${range}, ignoring (falling back to default)`,
);
} }
} else { } else {
errors.push(`top_p must be a number (got "${valueRaw}")`); errors.push(
`${numericSpec.key} must be ${numericSpec.isInt ? 'an integer' : 'a number'} (got "${valueRaw}")`,
);
} }
} else if (key === 'top_k') { continue;
const n = Number(valueRaw); }
if (Number.isInteger(n)) {
data.top_k = n; if (key === 'tools') {
if (n < 0 || n > 200) {
console.warn(`agents: top_k ${n} out of range 0-200, ignoring (falling back to default)`);
}
} else {
errors.push(`top_k must be an integer (got "${valueRaw}")`);
}
} else if (key === 'min_p') {
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.min_p = n;
if (n < 0 || n > 1) {
console.warn(`agents: min_p ${n} out of range 0-1, ignoring (falling back to default)`);
}
} else {
errors.push(`min_p must be a number (got "${valueRaw}")`);
}
} else if (key === 'presence_penalty') {
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.presence_penalty = n;
if (n < -2 || n > 2) {
console.warn(`agents: presence_penalty ${n} out of range -2-2, ignoring (falling back to default)`);
}
} else {
errors.push(`presence_penalty must be a number (got "${valueRaw}")`);
}
} else if (key === 'top_n_sigma') {
// v2.6 #11: llama.cpp top-n-sigma sampler. Float ≥ 0 (typical 0-3).
// Mirrors top_p/min_p: store then warn on out-of-range (non-numeric
// hard-fails the block).
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.top_n_sigma = n;
if (n < 0) {
console.warn(`agents: top_n_sigma ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`top_n_sigma must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_multiplier') {
// v2.6 #11: DRY repetition-penalty multiplier. Float ≥ 0 (0 disables DRY).
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.dry_multiplier = n;
if (n < 0) {
console.warn(`agents: dry_multiplier ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_multiplier must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_base') {
// v2.6 #11: DRY penalty growth base. Float ≥ 0.
const n = Number(valueRaw);
if (Number.isFinite(n)) {
data.dry_base = n;
if (n < 0) {
console.warn(`agents: dry_base ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_base must be a number (got "${valueRaw}")`);
}
} else if (key === 'dry_allowed_length') {
// v2.6 #11: DRY max sequence length not penalized. Integer ≥ 0.
const n = Number(valueRaw);
if (Number.isInteger(n)) {
data.dry_allowed_length = n;
if (n < 0) {
console.warn(`agents: dry_allowed_length ${n} out of range (≥0), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_allowed_length must be an integer (got "${valueRaw}")`);
}
} else if (key === 'dry_penalty_last_n') {
// v2.6 #11: DRY lookback window. Integer ≥ -1 (-1 = whole context, 0 = off).
const n = Number(valueRaw);
if (Number.isInteger(n)) {
data.dry_penalty_last_n = n;
if (n < -1) {
console.warn(`agents: dry_penalty_last_n ${n} out of range (≥-1), ignoring (falling back to default)`);
}
} else {
errors.push(`dry_penalty_last_n must be an integer (got "${valueRaw}")`);
}
} else if (key === 'tools') {
if (valueRaw === '') { if (valueRaw === '') {
data.tools = []; data.tools = [];
arrayKey = 'tools'; arrayKey = 'tools';
@@ -478,14 +437,6 @@ interface CacheEntry {
// corresponding mtime so the next read sees a miss without a watcher. // corresponding mtime so the next read sees a miss without a watcher.
const cache = new Map<string, CacheEntry>(); const cache = new Map<string, CacheEntry>();
export function invalidateAgentsCache(projectPath?: string): void {
if (projectPath === undefined) {
cache.clear();
} else {
cache.delete(projectPath);
}
}
// v1.13.8: cache-read accessor for the system-prompt prefix-fingerprint log. // v1.13.8: cache-read accessor for the system-prompt prefix-fingerprint log.
// Returns the AGENTS.md mtimes that getAgentsForProject() observed on its // Returns the AGENTS.md mtimes that getAgentsForProject() observed on its
// last cache fill for this projectPath. Both fields are null when the cache // last cache fill for this projectPath. Both fields are null when the cache

View File

@@ -19,8 +19,6 @@ function cleanTitle(raw: string): string {
return name; return name;
} }
// TODO: wire suggestTags after task model validation
export async function maybeAutoNameChat( export async function maybeAutoNameChat(
ctx: InferenceContext, ctx: InferenceContext,
chatId: string, chatId: string,

View File

@@ -113,7 +113,7 @@ export async function callCodecontext(
fetcher: typeof fetch = fetch, fetcher: typeof fetch = fetch,
): Promise<CodecontextResponse> { ): Promise<CodecontextResponse> {
// Step 1: realpath the project root, then realpath the requested target_dir // Step 1: realpath the project root, then realpath the requested target_dir
// (defaulting to projectPath when the caller didn't pass one — the 8 wrappers // (defaulting to projectPath when the caller didn't pass one — the 12 wrappers
// never pass target_dir; tests can override). A non-existent target_dir // never pass target_dir; tests can override). A non-existent target_dir
// throws before we hit the network so the model gets a sharp error. // throws before we hit the network so the model gets a sharp error.
const resolvedProject = await realpath(req.projectPath); const resolvedProject = await realpath(req.projectPath);

View File

@@ -22,6 +22,8 @@ import type { Config } from '../config.js';
import type { Broker } from './broker.js'; import type { Broker } from './broker.js';
import { SUMMARY_TEMPLATE } from './compaction-prompt.js'; import { SUMMARY_TEMPLATE } from './compaction-prompt.js';
import * as modelContextLookup from './model-context.js'; import * as modelContextLookup from './model-context.js';
import { SENTINEL_KINDS } from './inference/sentinels.js';
import type { OpenAiMessage } from './inference/payload.js';
// v1.13.9: ratio-only overflow trigger. Fires compaction at 85% of ctx_max // v1.13.9: ratio-only overflow trigger. Fires compaction at 85% of ctx_max
// (opencode session/overflow.ts pattern). Replaces the v1.11.0-era // (opencode session/overflow.ts pattern). Replaces the v1.11.0-era
@@ -256,24 +258,9 @@ export function buildPrompt(
// would silently drop pre-legacy-compact history before the LLM sees it. // would silently drop pre-legacy-compact history before the LLM sees it.
// Compaction wants to send the entire head, full stop.) === // Compaction wants to send the entire head, full stop.) ===
// v1.13.6: exported for unit-test access (reasoning render coverage). // #12: SENTINEL_KINDS imported from inference/sentinels.ts (single source).
export interface OpenAiMessage { // OpenAiMessage imported from inference/payload.ts (structurally compatible —
role: 'system' | 'user' | 'assistant' | 'tool'; // compaction's head payload doesn't need the optional reasoning? field).
content: string | null;
tool_calls?: Array<{
id: string;
type: 'function';
function: { name: string; arguments: string };
}>;
tool_call_id?: string;
}
// #12: mirror inference/sentinels.ts:isAnySentinel over the CompactionMessage
// shape (which carries metadata as { kind?: string } | null, not the full
// Message type isAnySentinel expects). All UI-only sentinels are stripped from
// the head payload — they never go to the summarizer LLM. Keep the kind list in
// sync with isAnySentinel in sentinels.ts.
const SENTINEL_KINDS = new Set(['cap_hit', 'doom_loop', 'mistake_recovery']);
function isAnySentinel(m: CompactionMessage): boolean { function isAnySentinel(m: CompactionMessage): boolean {
return ( return (
m.role === 'system' && m.role === 'system' &&

View File

@@ -200,7 +200,7 @@ export async function grep(
export async function findFiles( export async function findFiles(
projectRoot: string, projectRoot: string,
pattern?: string, pattern?: string,
opts?: { type?: 'file' | 'dir'; max_results?: number; path?: string; extra_roots?: readonly string[] } opts?: { max_results?: number; path?: string; extra_roots?: readonly string[] }
): Promise<FindFilesResult> { ): Promise<FindFilesResult> {
const limit = Math.min( const limit = Math.min(
Math.max(opts?.max_results ?? DEFAULT_FIND_RESULTS, 1), Math.max(opts?.max_results ?? DEFAULT_FIND_RESULTS, 1),

View File

@@ -83,10 +83,3 @@ export async function getGitMeta(rootPath: string): Promise<GitMeta | null> {
return value; return value;
} }
export function invalidateGitMetaCache(rootPath?: string): void {
if (rootPath) {
cache.delete(rootPath);
} else {
cache.clear();
}
}

View File

@@ -1,32 +1,10 @@
import type { Agent } from '../../types/api.js'; import type { Agent } from '../../types/api.js';
import { READ_ONLY_TOOL_NAMES } from '../tools.js';
// v1.8.2: tool-call budget defaults. Resolved per-turn by resolveToolBudget.
// - Agent with explicit max_tool_calls: that value.
// - Agent with read-only-only tools: BUDGET_READ_ONLY (50).
// - Agent with any non-read-only tool: BUDGET_NON_READ_ONLY (10).
// - No agent (raw chat): BUDGET_NO_AGENT (50).
// v1.13.7: bumped BUDGET_NO_AGENT 15→30 to match BUDGET_READ_ONLY. Every tool
// in ALL_TOOLS today is read-only (see services/tools.ts comment at
// READ_ONLY_TOOL_NAMES); the cautious 15-cap was a forward-looking guard for
// write tools that haven't landed yet. No-agent mode gets the same toolset as
// an all-read-only agent at runtime, so they should share the same budget.
// v1.13.12: bumped read-only caps 30→50. Real recon sessions were hitting 30
// with ~3 turns wasted on codecontext parse failures (empty node_modules
// files); legitimate need was ~27, and Architect-class system overviews want
// deeper recon than a 30-cap permits. Headroom of 20 absorbs failure-retry
// turns + deeper exploration without changing the safety floor materially —
// the doom-loop guard (3 identical calls → abort) catches the actual failure
// mode this cap was guarding against.
export const BUDGET_READ_ONLY = 100;
export const BUDGET_NON_READ_ONLY = 100;
export const BUDGET_NO_AGENT = 100;
const READ_ONLY_SET: ReadonlySet<string> = new Set(READ_ONLY_TOOL_NAMES);
// Tool-call budget. All three historical tiers (read-only, non-read-only,
// no-agent) converged to 100 as of v1.13.12, collapsing the tier logic.
// The only remaining override is per-agent max_tool_calls from AGENTS.md
// frontmatter. Flat default of 100; doom-loop guard in sentinels.ts catches
// pathological cases well before the cap is reached.
export function resolveToolBudget(agent: Agent | null): number { export function resolveToolBudget(agent: Agent | null): number {
if (agent?.max_tool_calls != null) return agent.max_tool_calls; return agent?.max_tool_calls ?? 100;
if (!agent) return BUDGET_NO_AGENT;
const allReadOnly = agent.tools.every((t) => READ_ONLY_SET.has(t));
return allReadOnly ? BUDGET_READ_ONLY : BUDGET_NON_READ_ONLY;
} }

View File

@@ -0,0 +1,64 @@
// P5: the debounced DB content-flush timer, extracted from the verbatim copy
// that lived in executeStreamPhase + the three sentinel summaries (4 sites).
// Each site streamed deltas into a local `accumulated`/`state.accumulated`
// string and threw an UPDATE at the row at most once per DB_FLUSH_INTERVAL_MS
// to bound write rate under heavy streaming.
//
// The accumulated string stays owned by the caller (stream-phase keeps it on
// the shared StreamPhaseState; the summaries keep a local) — the flusher reads
// it through a `getContent` thunk at fire time, snapshotting the latest value
// exactly as the inline `const snapshot = accumulated` did. No final flush is
// performed on drain (matches the originals): every caller writes the full
// content itself in its terminal UPDATE, so drain only cancels the pending
// timer and awaits whatever write is already chained.
import type { Sql } from '../../db.js';
import { DB_FLUSH_INTERVAL_MS } from './types.js';
export interface ContentFlusher {
// Arm a debounced flush. No-op if one is already pending (the in-flight timer
// will pick up the latest content via getContent when it fires).
scheduleFlush: () => void;
// Cancel any pending timer and await the in-flight write chain. Does NOT
// perform a final flush — the caller's terminal UPDATE owns the final write.
drain: () => Promise<void>;
}
export function createContentFlusher(
sql: Sql,
messageId: string,
getContent: () => string,
intervalMs: number = DB_FLUSH_INTERVAL_MS,
): ContentFlusher {
let pendingFlushTimer: NodeJS.Timeout | null = null;
let flushPromise: Promise<unknown> = Promise.resolve();
const flushNow = () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
const snapshot = getContent();
flushPromise = flushPromise.then(() =>
sql`UPDATE messages SET content = ${snapshot} WHERE id = ${messageId}`
);
};
const scheduleFlush = () => {
if (pendingFlushTimer) return;
pendingFlushTimer = setTimeout(() => {
pendingFlushTimer = null;
flushNow();
}, intervalMs);
};
const drain = async () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
await flushPromise;
};
return { scheduleFlush, drain };
}

View File

@@ -10,7 +10,7 @@ import { maybeFlagForCompaction } from './payload.js';
import { insertParts, partsFromAssistantMessage } from './parts.js'; import { insertParts, partsFromAssistantMessage } from './parts.js';
import type { PartInsert } from './parts.js'; import type { PartInsert } from './parts.js';
import { stripToolMarkup } from './tool-call-parser.js'; import { stripToolMarkup } from './tool-call-parser.js';
import type { InferenceContext, StreamResult, TurnArgs } from './turn.js'; import type { InferenceContext, StreamResult, TurnArgs } from './types.js';
export async function handleAbortOrError( export async function handleAbortOrError(
ctx: InferenceContext, ctx: InferenceContext,
@@ -95,6 +95,90 @@ export async function handleAbortOrError(
} }
} }
// P5: the success-finalize atom shared by the wrap-up summaries
// (sentinel-summaries.ts) and the synthesis pass (synthesisPipeline.ts). Both
// previously hand-rolled this exact ceremony — n_ctx lookup, the complete
// UPDATE (content/status/tokens/ctx/ctx_max/finished_at; NO model column), and
// the message_complete frame with the full token fields. Single-sourcing it
// means a message_complete frame-contract change lands in one place instead of
// silently skipping the summary/synthesis paths.
//
// `beforeComplete` runs AFTER the UPDATE and BEFORE the message_complete frame
// — synthesis uses it to write its kind='synthesis' part in the original order
// (UPDATE → insertParts → message_complete), preserving timing exactly.
//
// NOTE: finalizeCompletion does NOT use this — it additionally writes the
// `model` column, the text/reasoning/html_artifact parts, the compaction flag,
// and the session_updated bump, which this atom deliberately omits (the summary
// and synthesis paths handle those — or not — themselves).
export async function finalizeStreamedRow(
ctx: InferenceContext,
opts: {
sessionId: string;
chatId: string;
messageId: string;
model: string;
content: string;
completionTokens: number | null;
promptTokens: number | null;
startedAt: string | null;
beforeComplete?: () => Promise<void>;
},
): Promise<void> {
// v1.11.3: see executeToolPhase for the rationale.
const mctx = await modelContext.getModelContext(opts.model);
const nCtx = mctx?.n_ctx ?? null;
const [updated] = await ctx.sql<
{ tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
>`
UPDATE messages
SET content = ${opts.content},
status = 'complete',
tokens_used = ${opts.completionTokens},
ctx_used = ${opts.promptTokens},
ctx_max = ${nCtx},
finished_at = clock_timestamp()
WHERE id = ${opts.messageId}
RETURNING tokens_used, ctx_used, ctx_max, finished_at
`;
if (opts.beforeComplete) await opts.beforeComplete();
ctx.publish(opts.sessionId, {
type: 'message_complete',
message_id: opts.messageId,
chat_id: opts.chatId,
tokens_used: updated?.tokens_used ?? null,
ctx_used: updated?.ctx_used ?? null,
ctx_max: updated?.ctx_max ?? null,
started_at: opts.startedAt,
finished_at: updated?.finished_at ?? null,
model: opts.model,
});
}
// P5: minimal empty-finalize for the mistake-escalate path. The escalate
// branch in runAssistantTurn stops the turn cap-hit-style; the next assistant
// row is still 'streaming', so it's finalized as an empty complete row (no
// tokens, no parts, no session bump — the escalate branch handles the sentinel
// + chat_status itself). Centralizing the status-column write + message_complete
// frame here keeps it next to the other finalize paths so a status-column
// change is found in one place.
export async function finalizeEmpty(
ctx: InferenceContext,
args: TurnArgs,
): Promise<void> {
const { sessionId, chatId, assistantMessageId } = args;
await ctx.sql`
UPDATE messages
SET content = '', status = 'complete', finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
});
}
export async function finalizeCompletion( export async function finalizeCompletion(
ctx: InferenceContext, ctx: InferenceContext,
args: TurnArgs, args: TurnArgs,

View File

@@ -7,26 +7,17 @@
export { export {
createInferenceRunner, createInferenceRunner,
MAX_STEPS, MAX_STEPS,
runAssistantTurn,
runInference, runInference,
} from './turn.js'; } from './turn.js';
// P5: the shared pipeline types moved from turn.ts to types.ts (breaking the
// hub-and-leaf near-cycle). Re-exported here so the public surface is unchanged.
export type { export type {
FramePublisher, FramePublisher,
InferenceContext, InferenceContext,
InferenceFrame, InferenceFrame,
StreamResult, StreamResult,
TurnArgs, TurnArgs,
} from './turn.js'; } from './types.js';
export type { ToolPhaseResult } from './tool-phase.js'; export type { ToolPhaseResult } from './tool-phase.js';
export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js'; export { detectDoomLoop, DOOM_LOOP_THRESHOLD } from './sentinels.js';
export {
detectMistakePattern,
freshMistakeState,
recordStep,
MISTAKE_THRESHOLD,
MISTAKE_RECOVERY_NOTE,
} from './mistake-tracker.js';
export type { FailureKind, MistakeState } from './mistake-tracker.js';
export { buildMessagesPayload } from './payload.js'; export { buildMessagesPayload } from './payload.js';
export { generateToolUseSummary } from './tool-summaries.js';
export type { ToolInfo } from './tool-summaries.js';

View File

@@ -1,11 +1,10 @@
import type { Sql } from '../../db.js'; import type { Sql } from '../../db.js';
import type { ToolCall, ToolResult } from '../../types/api.js'; import type { ToolCall, ToolResult } from '../../types/api.js';
// v1.13.0: dual-write helper. Every site that writes the legacy // v1.13.0: message_parts write helpers. v1.13.20: legacy tool_calls/
// messages.tool_calls / messages.tool_results JSON columns calls into here // tool_results JSON columns dropped; message_parts is the sole source of
// to mirror the same data into message_parts rows. Reads still go to the // truth. All writes go through insertParts / partsFromAssistantMessage /
// JSON columns; the swap to parts-as-source-of-truth happens in a later // partsFromToolMessage. Reads use the messages_with_parts view.
// v1.13 dispatch alongside the AI SDK streamText migration.
// v1.13.13: 'synthesis' added. Schema CHECK constraint is updated in lockstep // v1.13.13: 'synthesis' added. Schema CHECK constraint is updated in lockstep
// (schema.sql adds 'synthesis' to message_parts_kind_chk on startup). The // (schema.sql adds 'synthesis' to message_parts_kind_chk on startup). The

View File

@@ -10,7 +10,8 @@ import * as compaction from '../compaction.js';
import { buildSystemPromptWithFingerprint } from '../system-prompt.js'; import { buildSystemPromptWithFingerprint } from '../system-prompt.js';
import { isAnySentinel } from './sentinels.js'; import { isAnySentinel } from './sentinels.js';
import { PRUNE_TRIGGER_TOKENS, prune } from './prune.js'; import { PRUNE_TRIGGER_TOKENS, prune } from './prune.js';
import type { InferenceContext } from './turn.js'; import type { InferenceContext } from './types.js';
import { INFERENCE_MESSAGE_COLUMNS } from '../message-columns.js';
export interface OpenAiMessage { export interface OpenAiMessage {
role: 'system' | 'user' | 'assistant' | 'tool'; role: 'system' | 'user' | 'assistant' | 'tool';
@@ -205,9 +206,7 @@ export async function loadContext(
// v1.13.1-C: also pull reasoning_parts so assistant messages from // v1.13.1-C: also pull reasoning_parts so assistant messages from
// reasoning models can be replayed with their reasoning context preserved. // reasoning models can be replayed with their reasoning context preserved.
const history = await sql<Message[]>` const history = await sql<Message[]>`
SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq, SELECT ${sql.unsafe(INFERENCE_MESSAGE_COLUMNS)}
tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
reasoning_parts
FROM messages_with_parts FROM messages_with_parts
WHERE chat_id = ${chatId} AND compacted_at IS NULL WHERE chat_id = ${chatId} AND compacted_at IS NULL
ORDER BY created_at ASC, id ASC ORDER BY created_at ASC, id ASC

View File

@@ -5,16 +5,16 @@ import type {
Project, Project,
Session, Session,
} from '../../types/api.js'; } from '../../types/api.js';
import * as modelContext from '../model-context.js';
import { buildMessagesPayload } from './payload.js'; import { buildMessagesPayload } from './payload.js';
import { DOOM_LOOP_THRESHOLD } from './sentinels.js'; import { DOOM_LOOP_THRESHOLD } from './sentinels.js';
import { streamCompletion } from './stream-phase.js'; import { streamCompletion, samplerOptsFromAgent } from './stream-phase.js';
import { DB_FLUSH_INTERVAL_MS } from './types.js'; import { createContentFlusher } from './content-flusher.js';
import { finalizeStreamedRow } from './error-handler.js';
import type { import type {
InferenceContext, InferenceContext,
StreamResult, StreamResult,
TurnArgs, TurnArgs,
} from './turn.js'; } from './types.js';
// Synthetic system note appended to the cap-hit summary call. Verbatim from // Synthetic system note appended to the cap-hit summary call. Verbatim from
// the v1.8.2 spec — do not paraphrase: the model is more reliable when the // the v1.8.2 spec — do not paraphrase: the model is more reliable when the
@@ -25,21 +25,50 @@ const CAP_HIT_SUMMARY_NOTE = (limit: number) =>
const DOOM_LOOP_NOTE = (name: string) => const DOOM_LOOP_NOTE = (name: string) =>
`You called ${name} with the same arguments ${DOOM_LOOP_THRESHOLD} times in a row. Stop calling it. Produce the best answer you can with what you have.`; `You called ${name} with the same arguments ${DOOM_LOOP_THRESHOLD} times in a row. Stop calling it. Produce the best answer you can with what you have.`;
export async function runCapHitSummary( // v1.14.0: step-cap wrap-up note. Names the step limit rather than the tool
// budget. The sentinel reuses metadata.kind = 'cap_hit' so the frontend
// CapHitSentinel component renders it without changes.
const STEP_CAP_NOTE = (steps: number, cap: number) =>
`You've reached the step limit (${steps}/${cap} steps). Produce the best answer you can with what you have. Do not call more tools.`;
// P5: the ONE generic wrap-up flow shared by the three sentinel summaries
// (cap-hit, doom-loop, step-cap). Each reuses the in-flight assistant slot to
// stream a short tools-disabled summary, finalizes via the same 3-outcome
// branch (complete / cancelled / failed), bumps the session, then drops a
// sentinel and the chat_status. The three differ only in:
// - `note`: the synthetic system instruction appended to the summary call.
// - `errorText`: the fallback used in the failed-status metadata + error frame.
// - sentinel timing: cap-hit inserts BEFORE the stream (`beforeStream`);
// doom-loop + step-cap insert AFTER the session bump (`afterSession`).
// - `logMsg` / `logFields`: per-kind log line + extra fields.
// All three use error_reason / chat_status reason = 'summary_after_cap_failed'
// (doom-loop reuses it deliberately — the user-visible failure mode is the
// same "model gave up mid-summary"; the ErrorReason union is shared and the UI
// surfaces a generic "summary failed" line for every sentinel path).
interface WrapUpOpts {
note: string;
errorText: string;
logMsg: string;
logFields: Record<string, unknown>;
beforeStream?: () => Promise<void>;
afterSession?: () => Promise<void>;
}
async function runWrapUpSummary(
ctx: InferenceContext, ctx: InferenceContext,
args: TurnArgs, args: TurnArgs,
session: Session, session: Session,
project: Project, project: Project,
history: Message[], history: Message[],
agent: Agent | null, agent: Agent | null,
budget: number, opts: WrapUpOpts,
): Promise<void> { ): Promise<void> {
const { sessionId, chatId, assistantMessageId, signal } = args; const { sessionId, chatId, assistantMessageId, signal } = args;
await insertCapHitSentinel(ctx, sessionId, chatId, agent, budget); if (opts.beforeStream) await opts.beforeStream();
const messages = await buildMessagesPayload(session, project, history, agent, ctx.log); const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
messages.push({ role: 'system', content: CAP_HIT_SUMMARY_NOTE(budget) }); messages.push({ role: 'system', content: opts.note });
const startedRow = await ctx.sql<{ started_at: string }[]>` const startedRow = await ctx.sql<{ started_at: string }[]>`
UPDATE messages UPDATE messages
@@ -57,25 +86,7 @@ export async function runCapHitSummary(
}); });
let accumulated = ''; let accumulated = '';
let pendingFlushTimer: NodeJS.Timeout | null = null; const flusher = createContentFlusher(ctx.sql, assistantMessageId, () => accumulated);
let flushPromise: Promise<unknown> = Promise.resolve();
const flushNow = () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
const snapshot = accumulated;
flushPromise = flushPromise.then(() =>
ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
);
};
const scheduleFlush = () => {
if (pendingFlushTimer) return;
pendingFlushTimer = setTimeout(() => {
pendingFlushTimer = null;
flushNow();
}, DB_FLUSH_INTERVAL_MS);
};
let summaryOk = false; let summaryOk = false;
let summarySoftCancelled = false; let summarySoftCancelled = false;
@@ -86,7 +97,7 @@ export async function runCapHitSummary(
ctx, ctx,
session.model, session.model,
messages, messages,
{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined }, { tools: null, ...samplerOptsFromAgent(agent) },
(delta) => { (delta) => {
accumulated += delta; accumulated += delta;
ctx.publish(sessionId, { ctx.publish(sessionId, {
@@ -95,7 +106,7 @@ export async function runCapHitSummary(
chat_id: chatId, chat_id: chatId,
content: delta, content: delta,
}); });
scheduleFlush(); flusher.scheduleFlush();
}, },
undefined, undefined,
signal, signal,
@@ -108,44 +119,23 @@ export async function runCapHitSummary(
summaryError = err instanceof Error ? err.message : String(err); summaryError = err instanceof Error ? err.message : String(err);
} }
} finally { } finally {
if (pendingFlushTimer) { await flusher.drain();
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
await flushPromise;
} }
// Finalize the summary message based on the three outcomes. The sentinel // Finalize the summary message based on the three outcomes. The sentinel is
// is inserted regardless so the user always has the Continue affordance — // inserted regardless (before or after, per opts) so the user always has the
// even on a partial / failed summary the chat history shows where the // appropriate affordance — even on a partial / failed summary the chat
// budget was hit. // history shows where the loop stopped.
if (summaryOk && result) { if (summaryOk && result) {
// v1.11.3: see executeToolPhase for the rationale. await finalizeStreamedRow(ctx, {
const mctx = await modelContext.getModelContext(session.model); sessionId,
const nCtx = mctx?.n_ctx ?? null; chatId,
const [updated] = await ctx.sql< messageId: assistantMessageId,
{ tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
>`
UPDATE messages
SET content = ${result.content},
status = 'complete',
tokens_used = ${result.completionTokens},
ctx_used = ${result.promptTokens},
ctx_max = ${nCtx},
finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
RETURNING tokens_used, ctx_used, ctx_max, finished_at
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
tokens_used: updated?.tokens_used ?? null,
ctx_used: updated?.ctx_used ?? null,
ctx_max: updated?.ctx_max ?? null,
started_at: startedAt,
finished_at: updated?.finished_at ?? null,
model: session.model, model: session.model,
content: result.content,
completionTokens: result.completionTokens,
promptTokens: result.promptTokens,
startedAt,
}); });
} else if (summarySoftCancelled) { } else if (summarySoftCancelled) {
await ctx.sql` await ctx.sql`
@@ -164,7 +154,7 @@ export async function runCapHitSummary(
const errMeta: MessageMetadata = { const errMeta: MessageMetadata = {
kind: 'error', kind: 'error',
error_reason: 'summary_after_cap_failed', error_reason: 'summary_after_cap_failed',
error_text: summaryError ?? 'summary failed', error_text: summaryError ?? opts.errorText,
}; };
await ctx.sql` await ctx.sql`
UPDATE messages UPDATE messages
@@ -178,7 +168,7 @@ export async function runCapHitSummary(
type: 'error', type: 'error',
message_id: assistantMessageId, message_id: assistantMessageId,
chat_id: chatId, chat_id: chatId,
error: summaryError ?? 'summary failed', error: summaryError ?? opts.errorText,
reason: 'summary_after_cap_failed', reason: 'summary_after_cap_failed',
}); });
} }
@@ -197,11 +187,11 @@ export async function runCapHitSummary(
updated_at: sessRow!.updated_at, updated_at: sessRow!.updated_at,
}); });
if (opts.afterSession) await opts.afterSession();
// Status frame fires last so the dot color reflects the terminal state. // Status frame fires last so the dot color reflects the terminal state.
// Success → idle, abort → idle (user-driven stop), error → error+reason. // Success → idle, abort → idle (user-driven stop), error → error+reason.
if (summaryOk) { if (summaryOk || summarySoftCancelled) {
ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
} else if (summarySoftCancelled) {
ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() }); ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
} else { } else {
ctx.publishUser({ ctx.publishUser({
@@ -214,11 +204,113 @@ export async function runCapHitSummary(
} }
ctx.log.info( ctx.log.info(
{ sessionId, chatId, assistantMessageId, budget, summaryOk, summaryCancelled: summarySoftCancelled }, { sessionId, chatId, assistantMessageId, ...opts.logFields, summaryOk, summaryCancelled: summarySoftCancelled },
'inference cap-hit summary finished', opts.logMsg,
); );
} }
// v1.8.2: cap-hit summary flow. Called instead of erroring when the loop hits
// its budget. The cap-hit sentinel is inserted FIRST (before the summary
// stream) so the UI shows the Continue affordance regardless of summary
// outcome.
export async function runCapHitSummary(
ctx: InferenceContext,
args: TurnArgs,
session: Session,
project: Project,
history: Message[],
agent: Agent | null,
budget: number,
): Promise<void> {
await runWrapUpSummary(ctx, args, session, project, history, agent, {
note: CAP_HIT_SUMMARY_NOTE(budget),
errorText: 'summary failed',
logMsg: 'inference cap-hit summary finished',
logFields: { budget },
beforeStream: () => insertCapHitSentinel(ctx, args.sessionId, args.chatId, agent, budget),
});
}
// v1.11.6: doom-loop wrap-up. The doom-loop sentinel is inserted AFTER the
// session bump (no Continue affordance — continuing would re-trigger the loop
// with the same tools available; the user needs to restate or switch agents).
export async function runDoomLoopSummary(
ctx: InferenceContext,
args: TurnArgs,
session: Session,
project: Project,
history: Message[],
agent: Agent | null,
loop: { name: string; args: Record<string, unknown> },
): Promise<void> {
await runWrapUpSummary(ctx, args, session, project, history, agent, {
note: DOOM_LOOP_NOTE(loop.name),
errorText: 'doom-loop summary failed',
logMsg: 'inference doom-loop summary finished',
logFields: { loopedTool: loop.name },
afterSession: () => insertDoomLoopSentinel(ctx, args.sessionId, args.chatId, loop),
});
}
// v1.14.0: step-cap wrap-up. Reuses the cap_hit sentinel (inserted AFTER the
// session bump) so the frontend CapHitSentinel component renders it without
// changes; the content text distinguishes step cap from budget.
export async function runStepCapSummary(
ctx: InferenceContext,
args: TurnArgs,
session: Session,
project: Project,
history: Message[],
agent: Agent | null,
steps: number,
cap: number,
): Promise<void> {
await runWrapUpSummary(ctx, args, session, project, history, agent, {
note: STEP_CAP_NOTE(steps, cap),
errorText: 'step-cap summary failed',
logMsg: 'inference step-cap summary finished',
logFields: { steps, cap },
afterSession: () => insertCapHitSentinel(ctx, args.sessionId, args.chatId, agent, cap),
});
}
// P5: the ONE INSERT + message_started → delta → message_complete frame
// sequence shared by every sentinel inserter. The sentinel row is a
// role='system', status='complete' message; the static content rides the same
// streaming-frame path useSessionStream's reducer uses for assistant messages
// (the delta carries the full text in one chunk).
async function insertSentinel(
ctx: InferenceContext,
sessionId: string,
chatId: string,
metadata: MessageMetadata,
content: string,
): Promise<void> {
const [row] = await ctx.sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
RETURNING id
`;
ctx.publish(sessionId, {
type: 'message_started',
message_id: row!.id,
chat_id: chatId,
role: 'system',
});
ctx.publish(sessionId, {
type: 'delta',
message_id: row!.id,
chat_id: chatId,
content,
});
ctx.publish(sessionId, {
type: 'message_complete',
message_id: row!.id,
chat_id: chatId,
metadata,
});
}
async function insertCapHitSentinel( async function insertCapHitSentinel(
ctx: InferenceContext, ctx: InferenceContext,
sessionId: string, sessionId: string,
@@ -246,430 +338,7 @@ async function insertCapHitSentinel(
can_continue: canContinue, can_continue: canContinue,
}; };
const content = `Reached tool budget (${budget}/${budget}). Continue to extend.`; const content = `Reached tool budget (${budget}/${budget}). Continue to extend.`;
await insertSentinel(ctx, sessionId, chatId, metadata, content);
const [row] = await ctx.sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
RETURNING id
`;
// The sentinel content is static, but we still walk the standard frame
// sequence (started → delta → complete) so useSessionStream's reducer
// appends it via the same path it uses for streaming assistant messages.
// The delta carries the full text in one chunk.
ctx.publish(sessionId, {
type: 'message_started',
message_id: row!.id,
chat_id: chatId,
role: 'system',
});
ctx.publish(sessionId, {
type: 'delta',
message_id: row!.id,
chat_id: chatId,
content,
});
ctx.publish(sessionId, {
type: 'message_complete',
message_id: row!.id,
chat_id: chatId,
metadata,
});
}
// v1.11.6: doom-loop wrap-up. Mirrors runCapHitSummary structurally — same
// in-flight-slot reuse, same tools-disabled streaming-summary call, same
// post-finalize sentinel insert + chat_status drop. Differences:
// - synthetic note text comes from DOOM_LOOP_NOTE (names the looping tool)
// - sentinel metadata is { kind: 'doom_loop', tool_name, args, threshold }
// and has no Continue affordance (manual retry would just re-loop)
// - chat_status error path uses reason: 'doom_loop_summary_failed'
// Kept as a clone rather than refactored into a shared helper because the
// two summary paths still differ in error reason + sentinel shape; a third
// sentinel would justify factoring out runWrapUpSummary(opts).
export async function runDoomLoopSummary(
ctx: InferenceContext,
args: TurnArgs,
session: Session,
project: Project,
history: Message[],
agent: Agent | null,
loop: { name: string; args: Record<string, unknown> },
): Promise<void> {
const { sessionId, chatId, assistantMessageId, signal } = args;
const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
messages.push({ role: 'system', content: DOOM_LOOP_NOTE(loop.name) });
const startedRow = await ctx.sql<{ started_at: string }[]>`
UPDATE messages
SET started_at = clock_timestamp()
WHERE id = ${assistantMessageId}
RETURNING started_at
`;
const startedAt = startedRow[0]?.started_at ?? null;
ctx.publish(sessionId, {
type: 'message_started',
message_id: assistantMessageId,
chat_id: chatId,
role: 'assistant',
});
let accumulated = '';
let pendingFlushTimer: NodeJS.Timeout | null = null;
let flushPromise: Promise<unknown> = Promise.resolve();
const flushNow = () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
const snapshot = accumulated;
flushPromise = flushPromise.then(() =>
ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
);
};
const scheduleFlush = () => {
if (pendingFlushTimer) return;
pendingFlushTimer = setTimeout(() => {
pendingFlushTimer = null;
flushNow();
}, DB_FLUSH_INTERVAL_MS);
};
let summaryOk = false;
let summarySoftCancelled = false;
let summaryError: string | null = null;
let result: StreamResult | null = null;
try {
result = await streamCompletion(
ctx,
session.model,
messages,
{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined },
(delta) => {
accumulated += delta;
ctx.publish(sessionId, {
type: 'delta',
message_id: assistantMessageId,
chat_id: chatId,
content: delta,
});
scheduleFlush();
},
undefined,
signal,
);
summaryOk = true;
} catch (err) {
if (err instanceof Error && err.name === 'AbortError') {
summarySoftCancelled = true;
} else {
summaryError = err instanceof Error ? err.message : String(err);
}
} finally {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
await flushPromise;
}
if (summaryOk && result) {
const mctx = await modelContext.getModelContext(session.model);
const nCtx = mctx?.n_ctx ?? null;
const [updated] = await ctx.sql<
{ tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
>`
UPDATE messages
SET content = ${result.content},
status = 'complete',
tokens_used = ${result.completionTokens},
ctx_used = ${result.promptTokens},
ctx_max = ${nCtx},
finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
RETURNING tokens_used, ctx_used, ctx_max, finished_at
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
tokens_used: updated?.tokens_used ?? null,
ctx_used: updated?.ctx_used ?? null,
ctx_max: updated?.ctx_max ?? null,
started_at: startedAt,
finished_at: updated?.finished_at ?? null,
model: session.model,
});
} else if (summarySoftCancelled) {
await ctx.sql`
UPDATE messages
SET content = ${accumulated},
status = 'cancelled',
finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
});
} else {
// Doom-loop summary failure reuses the existing summary_after_cap_failed
// error reason — the ErrorReason union is shared between sentinel paths
// and the UI surfaces a generic "summary failed" line for both. We don't
// add a new reason code because the user-visible failure mode is the
// same (model gave up mid-summary). Sentinel below still fires.
const errMeta: MessageMetadata = {
kind: 'error',
error_reason: 'summary_after_cap_failed',
error_text: summaryError ?? 'doom-loop summary failed',
};
await ctx.sql`
UPDATE messages
SET content = ${accumulated},
status = 'failed',
finished_at = clock_timestamp(),
metadata = ${ctx.sql.json(errMeta as never)}
WHERE id = ${assistantMessageId}
`;
ctx.publish(sessionId, {
type: 'error',
message_id: assistantMessageId,
chat_id: chatId,
error: summaryError ?? 'doom-loop summary failed',
reason: 'summary_after_cap_failed',
});
}
const [sessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
UPDATE sessions SET updated_at = clock_timestamp()
WHERE id = ${sessionId}
RETURNING project_id, name, updated_at
`;
ctx.publishUser({
type: 'session_updated',
session_id: sessionId,
project_id: sessRow!.project_id,
name: sessRow!.name,
updated_at: sessRow!.updated_at,
});
await insertDoomLoopSentinel(ctx, sessionId, chatId, loop);
if (summaryOk || summarySoftCancelled) {
ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
} else {
ctx.publishUser({
type: 'chat_status',
chat_id: chatId,
status: 'error',
at: new Date().toISOString(),
reason: 'summary_after_cap_failed',
});
}
ctx.log.info(
{ sessionId, chatId, assistantMessageId, loopedTool: loop.name, summaryOk, summaryCancelled: summarySoftCancelled },
'inference doom-loop summary finished',
);
}
// v1.14.0: step-cap wrap-up. Mirrors runCapHitSummary structurally — same
// in-flight-slot reuse, same tools-disabled streaming-summary call, same
// post-finalize sentinel insert + chat_status drop. Difference: the note
// text names the step limit rather than the tool budget. Sentinel reuses
// metadata.kind = 'cap_hit' so the frontend CapHitSentinel component
// renders it without changes.
const STEP_CAP_NOTE = (steps: number, cap: number) =>
`You've reached the step limit (${steps}/${cap} steps). Produce the best answer you can with what you have. Do not call more tools.`;
export async function runStepCapSummary(
ctx: InferenceContext,
args: TurnArgs,
session: Session,
project: Project,
history: Message[],
agent: Agent | null,
steps: number,
cap: number,
): Promise<void> {
const { sessionId, chatId, assistantMessageId, signal } = args;
const messages = await buildMessagesPayload(session, project, history, agent, ctx.log);
messages.push({ role: 'system', content: STEP_CAP_NOTE(steps, cap) });
const startedRow = await ctx.sql<{ started_at: string }[]>`
UPDATE messages
SET started_at = clock_timestamp()
WHERE id = ${assistantMessageId}
RETURNING started_at
`;
const startedAt = startedRow[0]?.started_at ?? null;
ctx.publish(sessionId, {
type: 'message_started',
message_id: assistantMessageId,
chat_id: chatId,
role: 'assistant',
});
let accumulated = '';
let pendingFlushTimer: NodeJS.Timeout | null = null;
let flushPromise: Promise<unknown> = Promise.resolve();
const flushNow = () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
const snapshot = accumulated;
flushPromise = flushPromise.then(() =>
ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
);
};
const scheduleFlush = () => {
if (pendingFlushTimer) return;
pendingFlushTimer = setTimeout(() => {
pendingFlushTimer = null;
flushNow();
}, DB_FLUSH_INTERVAL_MS);
};
let summaryOk = false;
let summarySoftCancelled = false;
let summaryError: string | null = null;
let result: StreamResult | null = null;
try {
result = await streamCompletion(
ctx,
session.model,
messages,
{ tools: null, temperature: agent?.temperature, top_p: agent?.top_p ?? undefined, top_k: agent?.top_k ?? undefined, min_p: agent?.min_p ?? undefined, presence_penalty: agent?.presence_penalty ?? undefined, top_n_sigma: agent?.top_n_sigma ?? undefined, dry_multiplier: agent?.dry_multiplier ?? undefined, dry_base: agent?.dry_base ?? undefined, dry_allowed_length: agent?.dry_allowed_length ?? undefined, dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined },
(delta) => {
accumulated += delta;
ctx.publish(sessionId, {
type: 'delta',
message_id: assistantMessageId,
chat_id: chatId,
content: delta,
});
scheduleFlush();
},
undefined,
signal,
);
summaryOk = true;
} catch (err) {
if (err instanceof Error && err.name === 'AbortError') {
summarySoftCancelled = true;
} else {
summaryError = err instanceof Error ? err.message : String(err);
}
} finally {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
await flushPromise;
}
if (summaryOk && result) {
const mctx = await modelContext.getModelContext(session.model);
const nCtx = mctx?.n_ctx ?? null;
const [updated] = await ctx.sql<
{ tokens_used: number | null; ctx_used: number | null; ctx_max: number | null; finished_at: string | null }[]
>`
UPDATE messages
SET content = ${result.content},
status = 'complete',
tokens_used = ${result.completionTokens},
ctx_used = ${result.promptTokens},
ctx_max = ${nCtx},
finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
RETURNING tokens_used, ctx_used, ctx_max, finished_at
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
tokens_used: updated?.tokens_used ?? null,
ctx_used: updated?.ctx_used ?? null,
ctx_max: updated?.ctx_max ?? null,
started_at: startedAt,
finished_at: updated?.finished_at ?? null,
model: session.model,
});
} else if (summarySoftCancelled) {
await ctx.sql`
UPDATE messages
SET content = ${accumulated},
status = 'cancelled',
finished_at = clock_timestamp()
WHERE id = ${assistantMessageId}
`;
ctx.publish(sessionId, {
type: 'message_complete',
message_id: assistantMessageId,
chat_id: chatId,
});
} else {
const errMeta: MessageMetadata = {
kind: 'error',
error_reason: 'summary_after_cap_failed',
error_text: summaryError ?? 'step-cap summary failed',
};
await ctx.sql`
UPDATE messages
SET content = ${accumulated},
status = 'failed',
finished_at = clock_timestamp(),
metadata = ${ctx.sql.json(errMeta as never)}
WHERE id = ${assistantMessageId}
`;
ctx.publish(sessionId, {
type: 'error',
message_id: assistantMessageId,
chat_id: chatId,
error: summaryError ?? 'step-cap summary failed',
reason: 'summary_after_cap_failed',
});
}
const [sessRow] = await ctx.sql<{ project_id: string; name: string; updated_at: string }[]>`
UPDATE sessions SET updated_at = clock_timestamp()
WHERE id = ${sessionId}
RETURNING project_id, name, updated_at
`;
ctx.publishUser({
type: 'session_updated',
session_id: sessionId,
project_id: sessRow!.project_id,
name: sessRow!.name,
updated_at: sessRow!.updated_at,
});
// Reuse cap_hit sentinel so the frontend CapHitSentinel component renders
// it without changes. The content text distinguishes step cap from budget.
await insertCapHitSentinel(ctx, sessionId, chatId, agent, cap);
if (summaryOk || summarySoftCancelled) {
ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
} else {
ctx.publishUser({
type: 'chat_status',
chat_id: chatId,
status: 'error',
at: new Date().toISOString(),
reason: 'summary_after_cap_failed',
});
}
ctx.log.info(
{ sessionId, chatId, assistantMessageId, steps, cap, summaryOk, summaryCancelled: summarySoftCancelled },
'inference step-cap summary finished',
);
} }
async function insertDoomLoopSentinel( async function insertDoomLoopSentinel(
@@ -689,39 +358,12 @@ async function insertDoomLoopSentinel(
threshold: DOOM_LOOP_THRESHOLD, threshold: DOOM_LOOP_THRESHOLD,
}; };
const content = `Detected ${DOOM_LOOP_THRESHOLD} identical calls to ${loop.name}. Stopping the tool-call loop. Produce the best answer you can with what you have.`; const content = `Detected ${DOOM_LOOP_THRESHOLD} identical calls to ${loop.name}. Stopping the tool-call loop. Produce the best answer you can with what you have.`;
await insertSentinel(ctx, sessionId, chatId, metadata, content);
const [row] = await ctx.sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
RETURNING id
`;
// Standard frame sequence — same as cap-hit sentinel — so
// useSessionStream's reducer appends the row via the existing path.
ctx.publish(sessionId, {
type: 'message_started',
message_id: row!.id,
chat_id: chatId,
role: 'system',
});
ctx.publish(sessionId, {
type: 'delta',
message_id: row!.id,
chat_id: chatId,
content,
});
ctx.publish(sessionId, {
type: 'message_complete',
message_id: row!.id,
chat_id: chatId,
metadata,
});
} }
// #12 MistakeTracker: heterogeneous-failure recovery sentinel. Mirrors // #12 MistakeTracker: heterogeneous-failure recovery sentinel. A role='system',
// insertDoomLoopSentinel structurally — a role='system', status='complete' row // status='complete' row firing the standard sentinel frame sequence. Two
// firing the standard message_started → delta → message_complete frame // variants distinguished by `escalated`:
// sequence. Two variants distinguished by `escalated`:
// - escalated:false → a nudge fired; recovery guidance was injected into the // - escalated:false → a nudge fired; recovery guidance was injected into the
// model's next step and the loop continued. can_continue is true (the turn // model's next step and the loop continued. can_continue is true (the turn
// is still live). // is still live).
@@ -744,30 +386,5 @@ export async function insertMistakeRecoverySentinel(
const content = opts.escalated const content = opts.escalated
? `Repeated different errors persisted after a recovery nudge (${opts.count} in a row). Stopping the tool-call loop.` ? `Repeated different errors persisted after a recovery nudge (${opts.count} in a row). Stopping the tool-call loop.`
: `Hit ${opts.count} different errors in a row. Injected recovery guidance and continuing.`; : `Hit ${opts.count} different errors in a row. Injected recovery guidance and continuing.`;
await insertSentinel(ctx, sessionId, chatId, metadata, content);
const [row] = await ctx.sql<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
RETURNING id
`;
// Standard frame sequence — same as cap-hit / doom-loop sentinels.
ctx.publish(sessionId, {
type: 'message_started',
message_id: row!.id,
chat_id: chatId,
role: 'system',
});
ctx.publish(sessionId, {
type: 'delta',
message_id: row!.id,
chat_id: chatId,
content,
});
ctx.publish(sessionId, {
type: 'message_complete',
message_id: row!.id,
chat_id: chatId,
metadata,
});
} }

View File

@@ -27,6 +27,10 @@ export function detectDoomLoop(
return { name: ref.name, args: ref.args }; return { name: ref.name, args: ref.args };
} }
// All sentinel kinds. isAnySentinel and compaction.ts's local predicate both
// consume this set — single source so a new kind can't be missed in one.
export const SENTINEL_KINDS = new Set(['cap_hit', 'doom_loop', 'mistake_recovery']);
export function isCapHitSentinel(m: Message): boolean { export function isCapHitSentinel(m: Message): boolean {
return ( return (
m.role === 'system' && m.role === 'system' &&
@@ -61,5 +65,10 @@ export function isMistakeRecoverySentinel(m: Message): boolean {
} }
export function isAnySentinel(m: Message): boolean { export function isAnySentinel(m: Message): boolean {
return isCapHitSentinel(m) || isDoomLoopSentinel(m) || isMistakeRecoverySentinel(m); return (
m.role === 'system' &&
m.metadata !== null &&
typeof m.metadata === 'object' &&
SENTINEL_KINDS.has((m.metadata as { kind?: unknown }).kind as string)
);
} }

View File

@@ -0,0 +1,47 @@
// P5 (SPLIT SKETCH 5): pure step-decision helpers for the runAssistantTurn
// loop. These COMPOSE the existing decision predicates (detectDoomLoop,
// detectMistakePattern) — they do not reimplement them — so the loop body in
// turn.ts becomes a thin driver and the branch logic is unit-testable without
// a DB, broker, or stream.
import type { ToolCall } from '../../types/api.js';
import { detectDoomLoop } from './sentinels.js';
import { detectMistakePattern, type MistakeState } from './mistake-tracker.js';
import type { ToolPhaseResult } from './tool-phase.js';
// Top-of-loop gate, evaluated before the stream phase. Order matters and
// matches the original inline checks exactly: doom-loop first (identical-repeat
// guard), then the cumulative tool-call budget, otherwise proceed to stream.
export type PreStepDecision =
| { kind: 'doom'; loop: { name: string; args: Record<string, unknown> } }
| { kind: 'budget' }
| { kind: 'stream' };
export function decideStep(input: {
recentToolCalls: ToolCall[];
toolsUsed: number;
budget: number;
}): PreStepDecision {
const loop = detectDoomLoop(input.recentToolCalls);
if (loop) return { kind: 'doom', loop };
if (input.toolsUsed >= input.budget) return { kind: 'budget' };
return { kind: 'stream' };
}
// Post-tool-phase decision, evaluated after the tool phase returns. 'stop'
// covers the tool-phase's own non-'continue' actions ('paused' for user input,
// 'synthesis_done'); on 'continue' the mistake-tracker pattern gates the
// nudge/escalate/continue choice (detectMistakePattern is only consulted on the
// 'continue' path, exactly as the original loop did).
export type PostToolDecision = 'continue' | 'nudge' | 'escalate' | 'stop';
export function decidePostToolAction(
action: ToolPhaseResult['action'],
mistakeTracker: MistakeState,
): PostToolDecision {
if (action !== 'continue') return 'stop';
const mistake = detectMistakePattern(mistakeTracker);
if (mistake === 'nudge') return 'nudge';
if (mistake === 'escalate') return 'escalate';
return 'continue';
}

View File

@@ -0,0 +1,18 @@
// Pure classifier for errors thrown from the fullStream loop. Establishes the
// retry seam for when llama-swap gains restart-in-place-with-clear-partial
// semantics. No retry is performed today (partial-stream re-emit is
// non-idempotent at single-local-instance scale).
export type StreamErrorKind = 'stall' | 'transient' | 'non-retryable';
export function classifyStreamError(err: unknown): StreamErrorKind {
if (err instanceof Error && err.name === 'AbortError') {
return 'stall';
}
if (err != null && typeof err === 'object') {
const status = (err as Record<string, unknown>).status;
if (typeof status === 'number' && status >= 500 && status < 600) {
return 'transient';
}
}
return 'non-retryable';
}

View File

@@ -0,0 +1,441 @@
// P5 (SPLIT SKETCH): the generic AI-SDK adapter, split out of stream-phase.ts.
// This module is the v1.13.1-A streamText adapter and nothing else — it has NO
// SQL, broker, or BooCode persistence dependencies (its only `ctx` access is
// config + log), so it can be unit-tested without standing up a DB or broker.
// stream-phase.ts (the I/O layer) re-exports the public names below so existing
// importers (`./stream-phase.js`) are unchanged.
import type { FastifyBaseLogger } from 'fastify';
import type { Config } from '../../config.js';
import type { Agent, ToolCall } from '../../types/api.js';
import type { ToolJsonSchema } from '../tools.js';
import type { OpenAiMessage } from './payload.js';
import { extractToolCallBlocks } from './tool-call-parser.js';
import { classifyStreamError } from './stream-error-classifier.js';
import type { StreamResult } from './types.js';
import { upstreamModel } from './provider.js';
import {
jsonSchema,
streamText,
tool,
type JSONValue,
type ModelMessage,
type ToolCallRepairFunction,
} from 'ai';
// The slice of InferenceContext the adapter actually needs. Narrowing it here
// (instead of taking the full InferenceContext) keeps the adapter free of the
// SQL/broker/publish surface. InferenceContext structurally satisfies this, so
// callers pass their ctx unchanged.
export interface StreamAdapterContext {
config: Config;
log: FastifyBaseLogger;
}
export interface StreamOptions {
// null = omit tools entirely (compact phase); [] = caller stripped all tools
// (rare; we still omit from the request body to avoid OpenAI 400).
tools: ToolJsonSchema[] | null;
temperature?: number;
top_p?: number | null;
top_k?: number | null;
min_p?: number | null;
presence_penalty?: number | null;
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions. These
// are NOT standard AI-SDK streamText options and are NOT serialized by the
// openai-compatible provider's standardized-settings path (topK is even
// explicitly dropped with an "unsupported feature: topK" warning). They reach
// llama-server only via providerOptions.openaiCompatible (see buildSamplerProviderOptions).
top_n_sigma?: number | null;
dry_multiplier?: number | null;
dry_base?: number | null;
dry_allowed_length?: number | null;
dry_penalty_last_n?: number | null;
}
// P5: the 10-field sampler-options literal that was copy-pasted at 4 sites
// (the three sentinel summaries + executeStreamPhase). Builds the StreamOptions
// sampler subset from an agent's frontmatter knobs. `temperature` is
// `agent?.temperature` (already number|undefined); the nullable fields strip
// null → undefined so they're omitted from the request body when unset. Keep
// this in lockstep with the StreamOptions sampler fields — a new sampler knob
// (the v2.7.3 dry_* family did this) is added here once instead of at 4 sites.
export type SamplerOpts = Omit<StreamOptions, 'tools'>;
export function samplerOptsFromAgent(agent: Agent | null): SamplerOpts {
return {
temperature: agent?.temperature,
top_p: agent?.top_p ?? undefined,
top_k: agent?.top_k ?? undefined,
min_p: agent?.min_p ?? undefined,
presence_penalty: agent?.presence_penalty ?? undefined,
top_n_sigma: agent?.top_n_sigma ?? undefined,
dry_multiplier: agent?.dry_multiplier ?? undefined,
dry_base: agent?.dry_base ?? undefined,
dry_allowed_length: agent?.dry_allowed_length ?? undefined,
dry_penalty_last_n: agent?.dry_penalty_last_n ?? undefined,
};
}
// v2.6 #11: build the providerOptions.openaiCompatible extraBody object for the
// llama.cpp sampler extensions. @ai-sdk/openai-compatible (2.0.47) merges every
// non-reserved key under providerOptions.openaiCompatible straight into the
// chat-completion request body (see its getArgs: the Object.fromEntries spread
// filtered against openaiCompatibleLanguageModelChatOptions.shape). This is the
// ONLY working passthrough for these params:
// - top_k / min_p were latently dropped before this: top_k was passed as the
// AI-SDK `topK` setting which the openai-compatible provider rejects as
// unsupported; min_p was never passed to streamText at all.
// - top_n_sigma + the dry_* family have no AI-SDK equivalent.
// Keys use llama-server's snake_case body names so they land verbatim.
function buildSamplerProviderOptions(opts: StreamOptions): Record<string, number> | undefined {
const body: Record<string, number> = {};
if (typeof opts.top_k === 'number') body.top_k = opts.top_k;
if (typeof opts.min_p === 'number') body.min_p = opts.min_p;
if (typeof opts.top_n_sigma === 'number') body.top_n_sigma = opts.top_n_sigma;
if (typeof opts.dry_multiplier === 'number') body.dry_multiplier = opts.dry_multiplier;
if (typeof opts.dry_base === 'number') body.dry_base = opts.dry_base;
if (typeof opts.dry_allowed_length === 'number') body.dry_allowed_length = opts.dry_allowed_length;
if (typeof opts.dry_penalty_last_n === 'number') body.dry_penalty_last_n = opts.dry_penalty_last_n;
return Object.keys(body).length > 0 ? body : undefined;
}
// v1.13.1-A: convert BooCode's OpenAI-shaped history into AI SDK
// ModelMessage[]. Tool result messages need a `toolName` field that the
// OpenAI shape doesn't carry; we look it up by scanning earlier assistant
// `tool_calls` entries for a matching id.
function toModelMessages(messages: OpenAiMessage[]): ModelMessage[] {
const toolNameById = new Map<string, string>();
for (const m of messages) {
if (m.role === 'assistant' && m.tool_calls) {
for (const tc of m.tool_calls) {
toolNameById.set(tc.id, tc.function.name);
}
}
}
const out: ModelMessage[] = [];
for (const m of messages) {
if (m.role === 'system' || m.role === 'user') {
out.push({ role: m.role, content: m.content ?? '' });
continue;
}
if (m.role === 'assistant') {
const hasTools = m.tool_calls && m.tool_calls.length > 0;
const hasReasoning = typeof m.reasoning === 'string' && m.reasoning.length > 0;
if (!hasTools && !hasReasoning) {
// Bare text assistant (string content). null content + no tool_calls
// is degenerate but harmless to forward.
out.push({ role: 'assistant', content: m.content ?? '' });
continue;
}
// v1.13.1-C: AI SDK ReasoningPart precedes text + tool-calls in the
// assistant content array. Reasoning models (qwen3.6) consume their
// prior reasoning context to resume mid-thought across tool boundaries.
const parts: Array<
| { type: 'reasoning'; text: string }
| { type: 'text'; text: string }
| { type: 'tool-call'; toolCallId: string; toolName: string; input: unknown }
> = [];
if (hasReasoning) {
parts.push({ type: 'reasoning', text: m.reasoning! });
}
if (m.content && m.content.length > 0) {
parts.push({ type: 'text', text: m.content });
}
for (const tc of m.tool_calls ?? []) {
let input: unknown = {};
try {
input = tc.function.arguments.length > 0 ? JSON.parse(tc.function.arguments) : {};
} catch {
// Malformed args from a prior turn: pass through as a raw blob so
// the model sees the same shape it emitted. Wraps the string under
// _raw to match the buildMessagesPayload upstream convention.
input = { _raw: tc.function.arguments };
}
parts.push({ type: 'tool-call', toolCallId: tc.id, toolName: tc.function.name, input });
}
out.push({ role: 'assistant', content: parts });
continue;
}
if (m.role === 'tool') {
const toolCallId = m.tool_call_id ?? '';
const toolName = toolNameById.get(toolCallId) ?? 'unknown';
const raw = m.content ?? '';
let output: { type: 'text'; value: string } | { type: 'json'; value: JSONValue };
try {
// JSON.parse returns `any`; cast to JSONValue since the upstream
// tool_results column is already JSON-serializable by construction.
output = { type: 'json', value: JSON.parse(raw) as JSONValue };
} catch {
output = { type: 'text', value: raw };
}
out.push({
role: 'tool',
content: [{ type: 'tool-result', toolCallId, toolName, output }],
});
continue;
}
}
return out;
}
// Build the AI SDK tools record from BooCode's JSON-schema tool definitions.
// No `execute` field: BooCode runs tools itself in tool-phase.ts; streamText
// surfaces the tool-call parts via fullStream and we capture them for the
// outer loop to dispatch.
function buildAiTools(schemas: ToolJsonSchema[]): Record<string, ReturnType<typeof tool>> {
const out: Record<string, ReturnType<typeof tool>> = {};
for (const s of schemas) {
out[s.function.name] = tool({
description: s.function.description,
inputSchema: jsonSchema(s.function.parameters),
});
}
return out;
}
// F6: per-chunk stall deadline. Exported so tests can advance fake timers by
// exactly this value without hardcoding a magic number.
export const STALL_TIMEOUT_MS = 90_000;
// v1.10.5 Qwen-coder XML fallback. Some local models (notably qwen3-coder via
// llama-swap) emit tool calls as inline XML inside delta.content rather than
// the structured tool_calls field. We extract them out of the streamed text
// before flushing it to the client.
//
// Qwen shape:
// <tool_call>
// <function=NAME>
// <parameter=KEY>VALUE</parameter>
// ...
// </function>
// </tool_call>
//
// v1.13.16: also recognize Anthropic <invoke> markup that qwen3.6-35b-a3b-mxfp4
// drifts to (training-data residue from Claude Code documentation):
// <invoke name="NAME">
// <parameter name="KEY">VALUE</parameter>
// </invoke>
// Both formats share the synthetic xml_call_${idx} ID space; the counter
// increments across whichever opener appears first. Multiple blocks may
// appear back-to-back in either format and they never nest.
export async function streamCompletion(
ctx: StreamAdapterContext,
model: string,
messages: OpenAiMessage[],
opts: StreamOptions,
onDelta: (content: string) => void,
onUsage: ((prompt: number | null, completion: number | null) => void) | undefined,
signal?: AbortSignal,
agent?: Agent | null,
): Promise<StreamResult> {
const aiMessages = toModelMessages(messages);
const hasTools = opts.tools !== null && opts.tools.length > 0;
const aiTools = hasTools ? buildAiTools(opts.tools!) : undefined;
const startedAt = Date.now();
// v1.13.1-C: accumulate reasoning text across reasoning-delta parts.
// qwen3.6 emits these on a separate channel from text content; we capture
// them per stream so finalizeCompletion can dual-write a 'reasoning' part.
// Replaces the v1.13.1-A counter-only diagnostic.
let reasoningAccumulated = '';
// v1.13.3: experimental_repairToolCall keeps the stream alive when the
// model emits a malformed tool call (bad JSON args, unknown name, etc.).
// Without a repair function streamText throws and the WHOLE stream dies;
// with one, the SDK invokes us and we route the bad call through normally.
// Strategy: pass through unmodified. executeToolPhase's existing error
// path (unknown tool name → "unknown tool: X" result; zod-reject → tool
// 'X' rejected — fieldname: required) already gives the model a clean
// recovery surface on the next turn. Logging gives us visibility into
// how often qwen3.6 actually emits broken calls.
const repairToolCall: ToolCallRepairFunction<NonNullable<typeof aiTools>> = async ({
toolCall,
error,
}) => {
ctx.log.warn(
{
toolCallId: toolCall.toolCallId,
toolName: toolCall.toolName,
error: error.message,
},
'malformed tool call surfaced via repairToolCall',
);
return toolCall;
};
// v2.6 #11: llama.cpp sampler extensions (top_k, min_p, top_n_sigma, dry_*)
// ride providerOptions.openaiCompatible — they are NOT standardized streamText
// settings. NB: top_k used to be passed below as the AI-SDK `topK` setting;
// the openai-compatible provider dropped it with an "unsupported feature: topK"
// warning and min_p was never wired at all, so both were dead on the wire
// before this. They now go through the same extraBody path as the new params.
const samplerBody = buildSamplerProviderOptions(opts);
// F6: per-chunk stall deadline. If the model stops emitting chunks for
// STALL_TIMEOUT_MS the stallAc fires through AbortSignal.any; the post-loop
// abort check below then throws AbortError → handleAbortOrError writes
// 'cancelled'. Timer is bumped on every chunk and cleared in the finally.
// NO retry: partial-stream re-emit is non-idempotent at single-local-instance
// scale; see stream-error-classifier.ts for the future retry seam.
const stallAc = new AbortController();
let stallTimer: ReturnType<typeof setTimeout> | null = null;
const bumpStallTimer = () => {
if (stallTimer !== null) clearTimeout(stallTimer);
stallTimer = setTimeout(() => stallAc.abort(), STALL_TIMEOUT_MS);
};
const effectiveSignal = signal
? AbortSignal.any([signal, stallAc.signal])
: stallAc.signal;
const result = streamText({
model: upstreamModel(ctx.config, model, agent ?? null),
messages: aiMessages,
...(aiTools
? { tools: aiTools, toolChoice: 'auto' as const, experimental_repairToolCall: repairToolCall }
: {}),
...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}),
...(typeof opts.top_p === 'number' ? { topP: opts.top_p } : {}),
...(typeof opts.presence_penalty === 'number' ? { presencePenalty: opts.presence_penalty } : {}),
...(samplerBody ? { providerOptions: { openaiCompatible: samplerBody } } : {}),
abortSignal: effectiveSignal,
});
let content = '';
let pendingBuffer = '';
let finishReason: string | null = null;
// v1.13.1-A: AI SDK emits one `tool-call` part per fully-aggregated call,
// so we no longer need the OpenAI-index reassembly map the manual SSE
// parser used. XML tool calls extracted from text content go into the
// same flat list and keep the v1.10.5 synthetic id convention.
const toolCalls: ToolCall[] = [];
bumpStallTimer();
try {
for await (const part of result.fullStream) {
bumpStallTimer();
switch (part.type) {
case 'text-delta': {
pendingBuffer += part.text;
// v1.13.16: unified extraction. The helper finds the earliest-opening
// complete <tool_call> or <invoke> block, flushes prose between/around
// them, holds any partial opener for the next chunk, and silently
// drops blocks that fail to parse (matches pre-v1.13.16 behavior).
const extracted = extractToolCallBlocks(pendingBuffer, ctx.log);
if (extracted.flushed.length > 0) {
content += extracted.flushed;
onDelta(extracted.flushed);
}
for (const call of extracted.calls) {
const synthIdx = toolCalls.length;
toolCalls.push({
id: `xml_call_${synthIdx}`,
name: call.name,
args: call.args,
});
}
pendingBuffer = extracted.remaining;
break;
}
case 'tool-call': {
// AI SDK has already parsed the input into an object. Match the
// ToolCall shape BooCode passes around in toolCallsBuffer downstream.
toolCalls.push({
id: part.toolCallId,
name: part.toolName,
args: (part.input ?? {}) as Record<string, unknown>,
});
break;
}
case 'reasoning-delta': {
// v1.13.1-C: accumulate; finalizeCompletion / executeToolPhase
// dual-write the resulting text as a kind='reasoning' part.
if (typeof part.text === 'string') {
reasoningAccumulated += part.text;
}
break;
}
case 'finish': {
if (typeof part.finishReason === 'string') {
finishReason = part.finishReason;
}
break;
}
case 'error': {
const err = part.error;
const actualErr = err instanceof Error ? err : new Error(String(err));
ctx.log.warn({ kind: classifyStreamError(actualErr) }, 'stream error part');
throw actualErr;
}
// Intentional no-op: start, start-step, text-start, text-end,
// reasoning-start, reasoning-end, source, file, tool-input-start,
// tool-input-delta, tool-input-end, tool-result, tool-error,
// finish-step, raw. We only care about the aggregated tool-call and
// text-delta paths above; the rest are AI SDK lifecycle/streaming
// breadcrumbs that don't change BooCode's persistence or WS contract.
default:
break;
}
}
// v1.13.1-A: drain any buffered partial XML opener as plain text. The
// pre-AI-SDK path did this on stream end too — better to leak `<tool_c`
// than vanish the text.
if (pendingBuffer.length > 0) {
content += pendingBuffer;
onDelta(pendingBuffer);
pendingBuffer = '';
}
// AI SDK v6 fullStream returns normally on abort; check signal explicitly.
// Without this throw the row would land as status='complete' with partial
// content instead of going through handleAbortOrError → status='cancelled'.
// Smoke D caught this in v1.13.1-A — don't refactor it away.
// F6: also catch the stall timeout arm (stallAc.signal.aborted).
if (signal?.aborted || stallAc.signal.aborted) {
const abortErr = new Error('aborted');
abortErr.name = 'AbortError';
throw abortErr;
}
// Usage lands as a promise on the result; awaiting after fullStream is
// drained is safe. AI SDK v6 names: `inputTokens` / `outputTokens`.
let promptTokens: number | null = null;
let completionTokens: number | null = null;
try {
const usage = await result.usage;
if (typeof usage.inputTokens === 'number') promptTokens = usage.inputTokens;
if (typeof usage.outputTokens === 'number') completionTokens = usage.outputTokens;
} catch {
// Some providers omit usage on partial streams; leave both null.
}
if (onUsage && (promptTokens !== null || completionTokens !== null)) {
onUsage(promptTokens, completionTokens);
}
if (reasoningAccumulated.length > 0) {
ctx.log.debug(
{ reasoningChars: reasoningAccumulated.length, model, elapsed_ms: Date.now() - startedAt },
'streamCompletion: captured reasoning',
);
}
return {
finishReason,
content,
toolCalls,
promptTokens,
completionTokens,
reasoning: reasoningAccumulated,
};
} finally {
// Clear the stall timer whether the stream completes normally, throws, or
// is aborted — prevents a dangling timer from firing after the turn ends.
if (stallTimer !== null) {
clearTimeout(stallTimer);
stallTimer = null;
}
}
}

View File

@@ -1,377 +1,34 @@
import type { // P5 (SPLIT SKETCH): stream-phase.ts is now the BooCode I/O layer for the
Agent, // stream phase — `executeStreamPhase` owns the row UPDATE, message_started
Session, // frame, debounced content flush, throttled usage publish, model-context
ToolCall, // lookup, and tool-whitelist filter. The generic AI-SDK adapter
} from '../../types/api.js'; // (streamCompletion / toModelMessages / buildAiTools / sampler helpers) moved
// to ./stream-phase-adapter.ts, which has no SQL/broker/publish deps and is
// unit-testable on its own. The adapter's public names are re-exported below so
// existing importers of './stream-phase.js' (sentinel-summaries, synthesis
// pipeline, the helper tests) keep working unchanged.
import type { Agent, Session } from '../../types/api.js';
import * as modelContext from '../model-context.js'; import * as modelContext from '../model-context.js';
import { toolJsonSchemas, type ToolJsonSchema } from '../tools.js'; import { toolJsonSchemas, type ToolJsonSchema } from '../tools.js';
import { matchToolGlob } from '../agents.js'; import { matchToolGlob } from '../agents.js';
import type { OpenAiMessage } from './payload.js'; import type { OpenAiMessage } from './payload.js';
import { extractToolCallBlocks } from './tool-call-parser.js'; import { createContentFlusher } from './content-flusher.js';
import { DB_FLUSH_INTERVAL_MS, type StreamPhaseState } from './types.js';
import type { import type {
StreamPhaseState,
InferenceContext, InferenceContext,
StreamResult, StreamResult,
TurnArgs, TurnArgs,
} from './turn.js'; } from './types.js';
import { upstreamModel } from './provider.js'; import { streamCompletion, samplerOptsFromAgent } from './stream-phase-adapter.js';
import {
jsonSchema,
streamText,
tool,
type JSONValue,
type ModelMessage,
type ToolCallRepairFunction,
} from 'ai';
interface StreamOptions { export {
// null = omit tools entirely (compact phase); [] = caller stripped all tools streamCompletion,
// (rare; we still omit from the request body to avoid OpenAI 400). samplerOptsFromAgent,
tools: ToolJsonSchema[] | null; type StreamOptions,
temperature?: number; type SamplerOpts,
top_p?: number | null; type StreamAdapterContext,
top_k?: number | null; } from './stream-phase-adapter.js';
min_p?: number | null;
presence_penalty?: number | null;
// v2.6 sampling-streamjson-tokens (#11): llama.cpp sampler extensions. These
// are NOT standard AI-SDK streamText options and are NOT serialized by the
// openai-compatible provider's standardized-settings path (topK is even
// explicitly dropped with an "unsupported feature: topK" warning). They reach
// llama-server only via providerOptions.openaiCompatible (see buildSamplerProviderOptions).
top_n_sigma?: number | null;
dry_multiplier?: number | null;
dry_base?: number | null;
dry_allowed_length?: number | null;
dry_penalty_last_n?: number | null;
}
// v2.6 #11: build the providerOptions.openaiCompatible extraBody object for the
// llama.cpp sampler extensions. @ai-sdk/openai-compatible (2.0.47) merges every
// non-reserved key under providerOptions.openaiCompatible straight into the
// chat-completion request body (see its getArgs: the Object.fromEntries spread
// filtered against openaiCompatibleLanguageModelChatOptions.shape). This is the
// ONLY working passthrough for these params:
// - top_k / min_p were latently dropped before this: top_k was passed as the
// AI-SDK `topK` setting which the openai-compatible provider rejects as
// unsupported; min_p was never passed to streamText at all.
// - top_n_sigma + the dry_* family have no AI-SDK equivalent.
// Keys use llama-server's snake_case body names so they land verbatim.
function buildSamplerProviderOptions(opts: StreamOptions): Record<string, number> | undefined {
const body: Record<string, number> = {};
if (typeof opts.top_k === 'number') body.top_k = opts.top_k;
if (typeof opts.min_p === 'number') body.min_p = opts.min_p;
if (typeof opts.top_n_sigma === 'number') body.top_n_sigma = opts.top_n_sigma;
if (typeof opts.dry_multiplier === 'number') body.dry_multiplier = opts.dry_multiplier;
if (typeof opts.dry_base === 'number') body.dry_base = opts.dry_base;
if (typeof opts.dry_allowed_length === 'number') body.dry_allowed_length = opts.dry_allowed_length;
if (typeof opts.dry_penalty_last_n === 'number') body.dry_penalty_last_n = opts.dry_penalty_last_n;
return Object.keys(body).length > 0 ? body : undefined;
}
// v1.13.1-A: convert BooCode's OpenAI-shaped history into AI SDK
// ModelMessage[]. Tool result messages need a `toolName` field that the
// OpenAI shape doesn't carry; we look it up by scanning earlier assistant
// `tool_calls` entries for a matching id.
function toModelMessages(messages: OpenAiMessage[]): ModelMessage[] {
const toolNameById = new Map<string, string>();
for (const m of messages) {
if (m.role === 'assistant' && m.tool_calls) {
for (const tc of m.tool_calls) {
toolNameById.set(tc.id, tc.function.name);
}
}
}
const out: ModelMessage[] = [];
for (const m of messages) {
if (m.role === 'system' || m.role === 'user') {
out.push({ role: m.role, content: m.content ?? '' });
continue;
}
if (m.role === 'assistant') {
const hasTools = m.tool_calls && m.tool_calls.length > 0;
const hasReasoning = typeof m.reasoning === 'string' && m.reasoning.length > 0;
if (!hasTools && !hasReasoning) {
// Bare text assistant (string content). null content + no tool_calls
// is degenerate but harmless to forward.
out.push({ role: 'assistant', content: m.content ?? '' });
continue;
}
// v1.13.1-C: AI SDK ReasoningPart precedes text + tool-calls in the
// assistant content array. Reasoning models (qwen3.6) consume their
// prior reasoning context to resume mid-thought across tool boundaries.
const parts: Array<
| { type: 'reasoning'; text: string }
| { type: 'text'; text: string }
| { type: 'tool-call'; toolCallId: string; toolName: string; input: unknown }
> = [];
if (hasReasoning) {
parts.push({ type: 'reasoning', text: m.reasoning! });
}
if (m.content && m.content.length > 0) {
parts.push({ type: 'text', text: m.content });
}
for (const tc of m.tool_calls ?? []) {
let input: unknown = {};
try {
input = tc.function.arguments.length > 0 ? JSON.parse(tc.function.arguments) : {};
} catch {
// Malformed args from a prior turn: pass through as a raw blob so
// the model sees the same shape it emitted. Wraps the string under
// _raw to match the buildMessagesPayload upstream convention.
input = { _raw: tc.function.arguments };
}
parts.push({ type: 'tool-call', toolCallId: tc.id, toolName: tc.function.name, input });
}
out.push({ role: 'assistant', content: parts });
continue;
}
if (m.role === 'tool') {
const toolCallId = m.tool_call_id ?? '';
const toolName = toolNameById.get(toolCallId) ?? 'unknown';
const raw = m.content ?? '';
let output: { type: 'text'; value: string } | { type: 'json'; value: JSONValue };
try {
// JSON.parse returns `any`; cast to JSONValue since the upstream
// tool_results column is already JSON-serializable by construction.
output = { type: 'json', value: JSON.parse(raw) as JSONValue };
} catch {
output = { type: 'text', value: raw };
}
out.push({
role: 'tool',
content: [{ type: 'tool-result', toolCallId, toolName, output }],
});
continue;
}
}
return out;
}
// Build the AI SDK tools record from BooCode's JSON-schema tool definitions.
// No `execute` field: BooCode runs tools itself in tool-phase.ts; streamText
// surfaces the tool-call parts via fullStream and we capture them for the
// outer loop to dispatch.
function buildAiTools(schemas: ToolJsonSchema[]): Record<string, ReturnType<typeof tool>> {
const out: Record<string, ReturnType<typeof tool>> = {};
for (const s of schemas) {
out[s.function.name] = tool({
description: s.function.description,
inputSchema: jsonSchema(s.function.parameters),
});
}
return out;
}
// v1.10.5 Qwen-coder XML fallback. Some local models (notably qwen3-coder via
// llama-swap) emit tool calls as inline XML inside delta.content rather than
// the structured tool_calls field. We extract them out of the streamed text
// before flushing it to the client.
//
// Qwen shape:
// <tool_call>
// <function=NAME>
// <parameter=KEY>VALUE</parameter>
// ...
// </function>
// </tool_call>
//
// v1.13.16: also recognize Anthropic <invoke> markup that qwen3.6-35b-a3b-mxfp4
// drifts to (training-data residue from Claude Code documentation):
// <invoke name="NAME">
// <parameter name="KEY">VALUE</parameter>
// </invoke>
// Both formats share the synthetic xml_call_${idx} ID space; the counter
// increments across whichever opener appears first. Multiple blocks may
// appear back-to-back in either format and they never nest.
export async function streamCompletion(
ctx: InferenceContext,
model: string,
messages: OpenAiMessage[],
opts: StreamOptions,
onDelta: (content: string) => void,
onUsage: ((prompt: number | null, completion: number | null) => void) | undefined,
signal?: AbortSignal,
agent?: Agent | null,
): Promise<StreamResult> {
const aiMessages = toModelMessages(messages);
const hasTools = opts.tools !== null && opts.tools.length > 0;
const aiTools = hasTools ? buildAiTools(opts.tools!) : undefined;
const startedAt = Date.now();
// v1.13.1-C: accumulate reasoning text across reasoning-delta parts.
// qwen3.6 emits these on a separate channel from text content; we capture
// them per stream so finalizeCompletion can dual-write a 'reasoning' part.
// Replaces the v1.13.1-A counter-only diagnostic.
let reasoningAccumulated = '';
// v1.13.3: experimental_repairToolCall keeps the stream alive when the
// model emits a malformed tool call (bad JSON args, unknown name, etc.).
// Without a repair function streamText throws and the WHOLE stream dies;
// with one, the SDK invokes us and we route the bad call through normally.
// Strategy: pass through unmodified. executeToolPhase's existing error
// path (unknown tool name → "unknown tool: X" result; zod-reject → tool
// 'X' rejected — fieldname: required) already gives the model a clean
// recovery surface on the next turn. Logging gives us visibility into
// how often qwen3.6 actually emits broken calls.
const repairToolCall: ToolCallRepairFunction<NonNullable<typeof aiTools>> = async ({
toolCall,
error,
}) => {
ctx.log.warn(
{
toolCallId: toolCall.toolCallId,
toolName: toolCall.toolName,
error: error.message,
},
'malformed tool call surfaced via repairToolCall',
);
return toolCall;
};
// v2.6 #11: llama.cpp sampler extensions (top_k, min_p, top_n_sigma, dry_*)
// ride providerOptions.openaiCompatible — they are NOT standardized streamText
// settings. NB: top_k used to be passed below as the AI-SDK `topK` setting;
// the openai-compatible provider dropped it with an "unsupported feature: topK"
// warning and min_p was never wired at all, so both were dead on the wire
// before this. They now go through the same extraBody path as the new params.
const samplerBody = buildSamplerProviderOptions(opts);
const result = streamText({
model: upstreamModel(ctx.config, model, agent ?? null),
messages: aiMessages,
...(aiTools
? { tools: aiTools, toolChoice: 'auto' as const, experimental_repairToolCall: repairToolCall }
: {}),
...(typeof opts.temperature === 'number' ? { temperature: opts.temperature } : {}),
...(typeof opts.top_p === 'number' ? { topP: opts.top_p } : {}),
...(typeof opts.presence_penalty === 'number' ? { presencePenalty: opts.presence_penalty } : {}),
...(samplerBody ? { providerOptions: { openaiCompatible: samplerBody } } : {}),
abortSignal: signal,
});
let content = '';
let pendingBuffer = '';
let finishReason: string | null = null;
// v1.13.1-A: AI SDK emits one `tool-call` part per fully-aggregated call,
// so we no longer need the OpenAI-index reassembly map the manual SSE
// parser used. XML tool calls extracted from text content go into the
// same flat list and keep the v1.10.5 synthetic id convention.
const toolCalls: ToolCall[] = [];
for await (const part of result.fullStream) {
switch (part.type) {
case 'text-delta': {
pendingBuffer += part.text;
// v1.13.16: unified extraction. The helper finds the earliest-opening
// complete <tool_call> or <invoke> block, flushes prose between/around
// them, holds any partial opener for the next chunk, and silently
// drops blocks that fail to parse (matches pre-v1.13.16 behavior).
const extracted = extractToolCallBlocks(pendingBuffer);
if (extracted.flushed.length > 0) {
content += extracted.flushed;
onDelta(extracted.flushed);
}
for (const call of extracted.calls) {
const synthIdx = toolCalls.length;
toolCalls.push({
id: `xml_call_${synthIdx}`,
name: call.name,
args: call.args,
});
}
pendingBuffer = extracted.remaining;
break;
}
case 'tool-call': {
// AI SDK has already parsed the input into an object. Match the
// ToolCall shape BooCode passes around in toolCallsBuffer downstream.
toolCalls.push({
id: part.toolCallId,
name: part.toolName,
args: (part.input ?? {}) as Record<string, unknown>,
});
break;
}
case 'reasoning-delta': {
// v1.13.1-C: accumulate; finalizeCompletion / executeToolPhase
// dual-write the resulting text as a kind='reasoning' part.
if (typeof part.text === 'string') {
reasoningAccumulated += part.text;
}
break;
}
case 'finish': {
if (typeof part.finishReason === 'string') {
finishReason = part.finishReason;
}
break;
}
case 'error': {
const err = part.error;
throw err instanceof Error ? err : new Error(String(err));
}
// Intentional no-op: start, start-step, text-start, text-end,
// reasoning-start, reasoning-end, source, file, tool-input-start,
// tool-input-delta, tool-input-end, tool-result, tool-error,
// finish-step, raw. We only care about the aggregated tool-call and
// text-delta paths above; the rest are AI SDK lifecycle/streaming
// breadcrumbs that don't change BooCode's persistence or WS contract.
default:
break;
}
}
// v1.13.1-A: drain any buffered partial XML opener as plain text. The
// pre-AI-SDK path did this on stream end too — better to leak `<tool_c`
// than vanish the text.
if (pendingBuffer.length > 0) {
content += pendingBuffer;
onDelta(pendingBuffer);
pendingBuffer = '';
}
// AI SDK v6 fullStream returns normally on abort; check signal explicitly.
// Without this throw the row would land as status='complete' with partial
// content instead of going through handleAbortOrError → status='cancelled'.
// Smoke D caught this in v1.13.1-A — don't refactor it away.
if (signal?.aborted) {
const abortErr = new Error('aborted');
abortErr.name = 'AbortError';
throw abortErr;
}
// Usage lands as a promise on the result; awaiting after fullStream is
// drained is safe. AI SDK v6 names: `inputTokens` / `outputTokens`.
let promptTokens: number | null = null;
let completionTokens: number | null = null;
try {
const usage = await result.usage;
if (typeof usage.inputTokens === 'number') promptTokens = usage.inputTokens;
if (typeof usage.outputTokens === 'number') completionTokens = usage.outputTokens;
} catch {
// Some providers omit usage on partial streams; leave both null.
}
if (onUsage && (promptTokens !== null || completionTokens !== null)) {
onUsage(promptTokens, completionTokens);
}
if (reasoningAccumulated.length > 0) {
ctx.log.debug(
{ reasoningChars: reasoningAccumulated.length, model, elapsed_ms: Date.now() - startedAt },
'streamCompletion: captured reasoning',
);
}
return {
finishReason,
content,
toolCalls,
promptTokens,
completionTokens,
reasoning: reasoningAccumulated,
};
}
export async function executeStreamPhase( export async function executeStreamPhase(
ctx: InferenceContext, ctx: InferenceContext,
@@ -401,27 +58,7 @@ export async function executeStreamPhase(
role: 'assistant', role: 'assistant',
}); });
let pendingFlushTimer: NodeJS.Timeout | null = null; const flusher = createContentFlusher(ctx.sql, assistantMessageId, () => state.accumulated);
let flushPromise: Promise<unknown> = Promise.resolve();
const flushNow = () => {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
const snapshot = state.accumulated;
flushPromise = flushPromise.then(() =>
ctx.sql`UPDATE messages SET content = ${snapshot} WHERE id = ${assistantMessageId}`
);
};
const scheduleFlush = () => {
if (pendingFlushTimer) return;
pendingFlushTimer = setTimeout(() => {
pendingFlushTimer = null;
flushNow();
}, DB_FLUSH_INTERVAL_MS);
};
// Tool whitelist: if an agent is set, filter the global tool list to only the // Tool whitelist: if an agent is set, filter the global tool list to only the
// tool names it allows. v1.15.0-mcp-multi: uses matchToolGlob for glob // tool names it allows. v1.15.0-mcp-multi: uses matchToolGlob for glob
@@ -434,17 +71,6 @@ export async function executeStreamPhase(
? toolJsonSchemas().filter((t) => matchToolGlob(t.function.name, agent.tools)) ? toolJsonSchemas().filter((t) => matchToolGlob(t.function.name, agent.tools))
: toolJsonSchemas() : toolJsonSchemas()
).filter((t) => webToolsEnabled || !WEB_TOOL_NAMES.has(t.function.name)); ).filter((t) => webToolsEnabled || !WEB_TOOL_NAMES.has(t.function.name));
const effectiveTemperature = agent?.temperature;
const effectiveTopP = agent?.top_p ?? undefined;
const effectiveTopK = agent?.top_k ?? undefined;
const effectiveMinP = agent?.min_p ?? undefined;
const effectivePresencePenalty = agent?.presence_penalty ?? undefined;
// v2.6 #11: llama.cpp sampler extensions, threaded the same way as top_k/min_p.
const effectiveTopNSigma = agent?.top_n_sigma ?? undefined;
const effectiveDryMultiplier = agent?.dry_multiplier ?? undefined;
const effectiveDryBase = agent?.dry_base ?? undefined;
const effectiveDryAllowedLength = agent?.dry_allowed_length ?? undefined;
const effectiveDryPenaltyLastN = agent?.dry_penalty_last_n ?? undefined;
// v1.12.2: ctx_max lookup is cached after the first hit per model, so this // v1.12.2: ctx_max lookup is cached after the first hit per model, so this
// is a Map probe in steady state. We capture nCtx once at the top of the // is a Map probe in steady state. We capture nCtx once at the top of the
@@ -484,16 +110,7 @@ export async function executeStreamPhase(
messages, messages,
{ {
tools: effectiveTools, tools: effectiveTools,
temperature: effectiveTemperature, ...samplerOptsFromAgent(agent),
top_p: effectiveTopP,
top_k: effectiveTopK,
min_p: effectiveMinP,
presence_penalty: effectivePresencePenalty,
top_n_sigma: effectiveTopNSigma,
dry_multiplier: effectiveDryMultiplier,
dry_base: effectiveDryBase,
dry_allowed_length: effectiveDryAllowedLength,
dry_penalty_last_n: effectiveDryPenaltyLastN,
}, },
(delta) => { (delta) => {
state.accumulated += delta; state.accumulated += delta;
@@ -504,7 +121,7 @@ export async function executeStreamPhase(
content: delta, content: delta,
}); });
ctx.log.debug({ sessionId, delta }, 'inference delta'); ctx.log.debug({ sessionId, delta }, 'inference delta');
scheduleFlush(); flusher.scheduleFlush();
}, },
(prompt, completion) => { (prompt, completion) => {
pendingUsage = { p: prompt, c: completion }; pendingUsage = { p: prompt, c: completion };
@@ -522,14 +139,10 @@ export async function executeStreamPhase(
agent, agent,
); );
} finally { } finally {
if (pendingFlushTimer) {
clearTimeout(pendingFlushTimer);
pendingFlushTimer = null;
}
if (usageTimer) { if (usageTimer) {
clearTimeout(usageTimer); clearTimeout(usageTimer);
usageTimer = null; usageTimer = null;
} }
await flushPromise; await flusher.drain();
} }
} }

View File

@@ -5,10 +5,10 @@
// ── Constants ──────────────────────────────────────────────────────────── // ── Constants ────────────────────────────────────────────────────────────
export const XML_TOOL_OPEN = '<tool_call>'; const XML_TOOL_OPEN = '<tool_call>';
export const XML_TOOL_CLOSE = '</tool_call>'; const XML_TOOL_CLOSE = '</tool_call>';
export const INVOKE_TOOL_OPEN = '<invoke'; const INVOKE_TOOL_OPEN = '<invoke';
export const INVOKE_TOOL_CLOSE = '</invoke>'; const INVOKE_TOOL_CLOSE = '</invoke>';
// ── Strip patterns ─────────────────────────────────────────────────────── // ── Strip patterns ───────────────────────────────────────────────────────
@@ -45,7 +45,7 @@ export interface ParsedCall {
const PLACEHOLDER_LITERALS = new Set(['...', 'placeholder', '<path>', '<file>']); const PLACEHOLDER_LITERALS = new Set(['...', 'placeholder', '<path>', '<file>']);
const ANGLE_BRACKET_SENTINEL_RE = /^<[^>]+>$/; const ANGLE_BRACKET_SENTINEL_RE = /^<[^>]+>$/;
export function isPlaceholderArgValue(value: unknown): boolean { function isPlaceholderArgValue(value: unknown): boolean {
if (typeof value !== 'string') return false; if (typeof value !== 'string') return false;
const trimmed = value.trim(); const trimmed = value.trim();
if (trimmed === '') return true; if (trimmed === '') return true;
@@ -61,17 +61,21 @@ function hasPlaceholderArgs(args: Record<string, unknown>): boolean {
return false; return false;
} }
function logRejectedPlaceholder(parsed: ParsedCall): void { type MinLogger = { debug(obj: object, msg: string): void };
console.debug(
{ toolName: parsed.name, args: parsed.args }, function logRejectedPlaceholder(parsed: ParsedCall, log?: MinLogger): void {
'rejected placeholder tool call at parse time', if (log) {
); log.debug(
{ toolName: parsed.name, args: parsed.args },
'rejected placeholder tool call at parse time',
);
}
} }
const QWEN_FUNCTION_RE = /<function\s*=\s*([^>\s]+)\s*>/; const QWEN_FUNCTION_RE = /<function\s*=\s*([^>\s]+)\s*>/;
const QWEN_PARAM_RE = /<parameter\s*=\s*([^>\s]+)\s*>([\s\S]*?)<\/parameter>/g; const QWEN_PARAM_RE = /<parameter\s*=\s*([^>\s]+)\s*>([\s\S]*?)<\/parameter>/g;
export function parseXmlToolCall(block: string): ParsedCall | null { function parseXmlToolCall(block: string): ParsedCall | null {
const nameMatch = block.match(QWEN_FUNCTION_RE); const nameMatch = block.match(QWEN_FUNCTION_RE);
if (!nameMatch || !nameMatch[1]) return null; if (!nameMatch || !nameMatch[1]) return null;
const name = nameMatch[1].trim(); const name = nameMatch[1].trim();
@@ -95,7 +99,7 @@ const INVOKE_NAME_RE =
const INVOKE_PARAM_RE = const INVOKE_PARAM_RE =
/<parameter\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>([\s\S]*?)<\/parameter>/g; /<parameter\s+name\s*=\s*("([^"]*)"|'([^']*)')\s*>([\s\S]*?)<\/parameter>/g;
export function parseInvokeToolCall(block: string): ParsedCall | null { function parseInvokeToolCall(block: string): ParsedCall | null {
const nameMatch = block.match(INVOKE_NAME_RE); const nameMatch = block.match(INVOKE_NAME_RE);
if (!nameMatch) return null; if (!nameMatch) return null;
const name = (nameMatch[2] ?? nameMatch[3] ?? '').trim(); const name = (nameMatch[2] ?? nameMatch[3] ?? '').trim();
@@ -116,7 +120,7 @@ export function parseInvokeToolCall(block: string): ParsedCall | null {
const ALL_OPENERS = [XML_TOOL_OPEN, INVOKE_TOOL_OPEN] as const; const ALL_OPENERS = [XML_TOOL_OPEN, INVOKE_TOOL_OPEN] as const;
export function partialXmlOpenerStart(s: string): number { function partialXmlOpenerStart(s: string): number {
let earliest = -1; let earliest = -1;
for (const op of ALL_OPENERS) { for (const op of ALL_OPENERS) {
const idx = s.indexOf(op); const idx = s.indexOf(op);
@@ -150,7 +154,7 @@ const OPENER_SPECS: ReadonlyArray<OpenerSpec> = [
{ open: INVOKE_TOOL_OPEN, close: INVOKE_TOOL_CLOSE, parse: parseInvokeToolCall }, { open: INVOKE_TOOL_OPEN, close: INVOKE_TOOL_CLOSE, parse: parseInvokeToolCall },
]; ];
export function extractToolCallBlocks(buffer: string): ToolCallExtraction { export function extractToolCallBlocks(buffer: string, log?: MinLogger): ToolCallExtraction {
let flushed = ''; let flushed = '';
const calls: ParsedCall[] = []; const calls: ParsedCall[] = [];
let pos = 0; let pos = 0;
@@ -176,7 +180,7 @@ export function extractToolCallBlocks(buffer: string): ToolCallExtraction {
const parsed = next.spec.parse(block); const parsed = next.spec.parse(block);
if (parsed) { if (parsed) {
if (hasPlaceholderArgs(parsed.args)) { if (hasPlaceholderArgs(parsed.args)) {
logRejectedPlaceholder(parsed); logRejectedPlaceholder(parsed, log);
flushed += block; flushed += block;
} else { } else {
calls.push(parsed); calls.push(parsed);

Some files were not shown because too many files have changed in this diff Show More