Compare commits

...

2 Commits

Author SHA1 Message Date
cce685b1a7 fix(coder): harden edit-apply pipeline against block duplication
Root cause: two proven corruption mechanisms — (M1) non-idempotent apply
stamped the same block N times when a quantized model re-emitted the same
edit_file call or a turn was retried; (M2) Levenshtein tier 4 was fail-open
with no uniqueness guard, silently splicing into the wrong location.

Fixes applied at every layer of the pipeline:

Matcher (fuzzy-match.ts): raise SIMILARITY_THRESHOLD 0.66 → 0.85; add
AMBIGUITY_EPSILON uniqueness guard — two windows within 0.05 of the top
score → ambiguous, not a guess; add block-anchor gate (≥3-line needles
require first+last line exact match before a window is scored).

Edit planner (pending_changes.ts): extract planEdit() as a pure function;
idempotency guards detect already-applied states (anchored insert re-stamp,
old-gone-but-new-present); findPendingDuplicate() collapses identical
pending rows at queue time so M1 never reaches applyOne.

Atomic writes (pending_changes.ts): temp-file + rename on the same
filesystem so a crash can't leave a half-written source file; realpath()
first so symlinks survive the rename.

Per-file mutex (pending_changes.ts): withFileLock() serializes concurrent
read-modify-write on the same path via a chained-Promise Map.

EOL preservation (pending_changes.ts): normalize CRLF → LF for matching,
restore native line ending on write so Windows-style files stay clean.

Context isolation (inference_context.ts): replace module-level singleton
with AsyncLocalStorage so concurrent inference runs (arena parallel
dispatch, dispatcher poll racing a user message) each get their own
scoped context with no clobbering.

Tests: plan-edit.test.ts (pure planEdit unit tests), extended fuzzy-match
and pending_changes_integration suites, ALS isolation test that proves
overlapping runs get correct session IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 01:44:37 +00:00
dbf1662982 docs: add v2.7.20-arena-pane changelog entry 2026-06-06 23:34:58 +00:00
17 changed files with 648 additions and 157 deletions

View File

@@ -2,6 +2,10 @@
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch. All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
## v2.7.20-arena-pane — 2026-06-06
Adds the **Arena** pane for running the same prompt against 26 AI competitors simultaneously and picking the best result. A Battle is one Arena run: pick a battle type (Coding — backend+model with git worktrees producing diffs; or Q&A — BooChat persona+model producing text), write or generate a prompt, add contestants, and hit Start. Contestants are scheduled in two concurrent lanes — the local lane (llama-swap models, serial) and the cloud lane (Claude Code, OpenCode-on-cloud, parallel). The lane scheduler captures wall-clock duration for every contestant and tokens/sec for local models. When all contestants finish, a two-stage analysis (digest then judge) auto-runs on the DEFAULT_MODEL, writing `analysis.md` naming a winner; the user can override the winner per-row or trigger cross-examination. Results land in `/<project-root>/Arena/<dated-battle>/` with per-contestant `result.md`, diff patches for coding, and `manifest.json`. Replaces the old API-only `POST /api/arena` with dedicated `battles`/`contestants`/`cross_examinations` tables and full UI. Also adds a `DiffView` component with line-by-line colored unified diff and a per-row dropdown for winner override. Built on `v2.7.18-permission-modes`; pairs conceptually with the earlier `v2.7.17-orchestrator` multi-agent work (both share the pane kind pattern and `onTaskTerminal` hook).
## v2.7.18-permission-modes — 2026-06-05 ## v2.7.18-permission-modes — 2026-06-05
Adds a unified **permission picker** to the BooCoder composer — Plan / Ask Permission / Bypass — replacing the old raw per-agent mode dropdown that exposed each agent's full native vocabulary with inconsistent labels. The three options map generically onto every provider's existing mode metadata: the `plan`-id mode → Plan, the default mode → Ask, the `isUnattended` mode → Bypass (claude `bypassPermissions`, qwen `yolo`, opencode `full-access`); goose has no modes so it shows no picker, exactly as before. `modeId` stays the single wire field — the active unified mode is derived from it, so no contracts change was needed. Native BooCode gains its own mode set (registered in the manifest and exposed by the snapshot): **Ask** stages edits to the pending-changes queue as today, **Bypass** auto-applies the queue to disk after the turn (both the interactive messages path and the task-based dispatcher path), and **Plan** falls back to Ask — the shared `apps/server` inference engine is deliberately left untouched. A supporting fix preserves the `isUnattended` flag on live-probed ACP modes (`acp-derive.ts`) so opencode's bypass mode is still detectable from the wire. Coder 373 tests green, coder + web typecheck clean. Built on `v2.7.17-orchestrator`. Adds a unified **permission picker** to the BooCoder composer — Plan / Ask Permission / Bypass — replacing the old raw per-agent mode dropdown that exposed each agent's full native vocabulary with inconsistent labels. The three options map generically onto every provider's existing mode metadata: the `plan`-id mode → Plan, the default mode → Ask, the `isUnattended` mode → Bypass (claude `bypassPermissions`, qwen `yolo`, opencode `full-access`); goose has no modes so it shows no picker, exactly as before. `modeId` stays the single wire field — the active unified mode is derived from it, so no contracts change was needed. Native BooCode gains its own mode set (registered in the manifest and exposed by the snapshot): **Ask** stages edits to the pending-changes queue as today, **Bypass** auto-applies the queue to disk after the turn (both the interactive messages path and the task-based dispatcher path), and **Plan** falls back to Ask — the shared `apps/server` inference engine is deliberately left untouched. A supporting fix preserves the `isUnattended` flag on live-probed ACP modes (`acp-derive.ts`) so opencode's bypass mode is still detectable from the wire. Coder 373 tests green, coder + web typecheck clean. Built on `v2.7.17-orchestrator`.

View File

@@ -13,7 +13,7 @@ import type { WsFrame } from '@boocode/contracts/ws-frames';
// v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility. // v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
import { WRITE_TOOLS } from './services/tools/index.js'; import { WRITE_TOOLS } from './services/tools/index.js';
import { adaptWriteTool } from './services/tools/adapter.js'; import { adaptWriteTool } from './services/tools/adapter.js';
import { setInferenceContext, clearInferenceContext } from './services/tools/inference_context.js'; import { runWithInferenceContext } from './services/tools/inference_context.js';
// Routes // Routes
import { registerMessageRoutes } from './routes/messages.js'; import { registerMessageRoutes } from './routes/messages.js';
import { registerSkillRoutes } from './routes/skills.js'; import { registerSkillRoutes } from './routes/skills.js';
@@ -174,22 +174,27 @@ async function main() {
} }
); );
// Wrap the inference runner to set/clear the write-tool context around each run. // Wrap the inference runner to bind the write-tool context around each run.
// The inference runner calls enqueue() which fires asynchronously — we hook // enqueue() starts its async loop synchronously, so wrapping the call in
// into the enqueue to set context before the run starts. // runWithInferenceContext propagates the per-run context (sql, sessionId, the
// Plan/Ask/Bypass gate) through every awaited tool execution — and concurrent
// runs (a user message racing a dispatcher-polled native task) each get their
// own, instead of clobbering a shared global.
const inferenceApi = { const inferenceApi = {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => { enqueue: (
// Set the inference context so write tools can access sql + sessionId. sessionId: string,
// The context persists for the duration of the inference run. Since chatId: string,
// BooCoder is single-user and runs one inference at a time per session, assistantId: string,
// this module-level state is safe. user: string,
setInferenceContext({ sql, sessionId, taskId: null }); permissionMode?: 'plan' | 'ask' | 'bypass',
inference.enqueue(sessionId, chatId, assistantId, user); ) => {
runWithInferenceContext({ sql, sessionId, taskId: null, permissionMode }, () => {
inference.enqueue(sessionId, chatId, assistantId, user);
});
}, },
cancel: async (sessionId: string, chatId: string) => { cancel: async (sessionId: string, chatId: string) => {
const result = await inference.cancel(sessionId, chatId); // No context to clear — AsyncLocalStorage scopes it to each run's own chain.
clearInferenceContext(); return inference.cancel(sessionId, chatId);
return result;
}, },
hasActive: (chatId: string) => inference.hasActive(chatId), hasActive: (chatId: string) => inference.hasActive(chatId),
}; };

View File

@@ -4,7 +4,7 @@ import type { Sql } from '../db.js';
import type { Broker } from '@boocode/server/broker'; import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames'; import type { WsFrame } from '@boocode/contracts/ws-frames';
import { resolveChatId } from './chat-resolve.js'; import { resolveChatId } from './chat-resolve.js';
import { applyAll } from '../services/pending_changes.js'; import { asPermissionMode } from '../services/tools/types.js';
const AnswerUserInputBody = z.object({ const AnswerUserInputBody = z.object({
tool_call_id: z.string().min(1), tool_call_id: z.string().min(1),
@@ -44,7 +44,13 @@ const SendBody = z.object({
}); });
interface InferenceApi { interface InferenceApi {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void; enqueue: (
sessionId: string,
chatId: string,
assistantId: string,
user: string,
permissionMode?: 'plan' | 'ask' | 'bypass',
) => void;
cancel: (sessionId: string, chatId: string) => Promise<boolean>; cancel: (sessionId: string, chatId: string) => Promise<boolean>;
hasActive: (chatId: string) => boolean; hasActive: (chatId: string) => boolean;
} }
@@ -246,36 +252,16 @@ export function registerMessageRoutes(
RETURNING id RETURNING id
`; `;
inference.enqueue(sessionId, chatId, assistantMsg!.id, 'default'); // Native BooCode permission gate (plan/ask/bypass) — threaded into the
// write-tool context so create/edit/delete and apply_pending honor it.
// Bypass permission mode (native BooCode): auto-apply staged edits to disk // Plan = read-only, Ask = stage to the queue (agent can't self-apply),
// once the turn settles. `enqueue` registers synchronously, so hasActive is // Bypass = apply each write immediately. Other mode ids (e.g. an external
// true immediately; poll until it clears, apply, then re-publish // fallback's native mode) leave the gate undefined = legacy behavior.
// message_complete so the DiffPanel reflects the now-applied (non-pending) req.log.info(
// state. Best-effort — failures stay in the pending queue for manual apply. { provider, mode_id, permissionMode: asPermissionMode(mode_id), chatId },
if (mode_id === 'bypass') { 'native enqueue — permission gate',
const projectId = sessionRows[0]!.project_id; );
const assistantId = assistantMsg!.id; inference.enqueue(sessionId, chatId, assistantMsg!.id, 'default', asPermissionMode(mode_id));
void (async () => {
try {
const [proj] = await sql<{ path: string }[]>`SELECT path FROM projects WHERE id = ${projectId}`;
if (!proj?.path) return;
for (let i = 0; i < 1200 && inference.hasActive(chatId); i++) {
await new Promise((r) => setTimeout(r, 1000));
}
const applied = await applyAll(sql, sessionId, proj.path);
if (applied.length > 0) {
broker.publishFrame(sessionId, {
type: 'message_complete',
message_id: assistantId,
chat_id: chatId,
} as unknown as WsFrame);
}
} catch {
/* best-effort auto-apply — leave staged changes for manual apply */
}
})();
}
reply.code(202); reply.code(202);
return { user_message_id: userMsg!.id, assistant_message_id: assistantMsg!.id }; return { user_message_id: userMsg!.id, assistant_message_id: assistantMsg!.id };

View File

@@ -161,6 +161,52 @@ describe('locateMatch — strategy 4: Levenshtein', () => {
}); });
}); });
describe('locateMatch — strategy 4: fail-closed on ambiguity (corruption guard)', () => {
it('refuses (ambiguous) when two equally-similar anchored blocks both clear the bar', () => {
// The repetitive-file case that duplicated blocks: two blocks share the same
// first+last anchor lines and their middle lines are EQUALLY similar to the
// (drifted) needle. Tier 4 must refuse rather than splice over one of them.
const content = [
'const x = {',
' total = aa;',
'};',
'const x = {',
' total = bb;',
'};',
].join('\n');
const needle = ['const x = {', ' total = ab;', '};'].join('\n');
const result = locateMatch(content, needle);
expect(result.kind).toBe('ambiguous');
});
it('refuses a below-threshold near-miss that the old 0.66 floor would have spliced', () => {
// ~0.7 similar: under the raised 0.85 floor this is now not_found, so the
// caller surfaces a correctable error instead of corrupting the file.
const content = 'const grandTotalAmount = a + b;\n';
const needle = 'const totalValue = a + b;';
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
it('still matches a single genuine high-similarity drift uniquely', () => {
const content = 'const total = sum + tax;\n';
const needle = 'const totals = sum + tax;'; // one-char typo, ~0.96
const result = locateMatch(content, needle);
expect(result.kind).toBe('fuzzy');
const { start, end } = span(result);
expect(content.slice(start, end)).toBe('const total = sum + tax;');
});
it('requires an exact first+last line anchor for multi-line needles', () => {
// First line drifted too far to anchor → no window is scored → not_found,
// even though the middle lines are identical.
const content = ['function compute() {', ' return a + b;', ' return done;', '}'].join('\n');
const needle = ['totally different opener', ' return a + b;', '}'].join('\n');
const result = locateMatch(content, needle);
expect(result).toEqual({ kind: 'not_found' });
});
});
describe('locateMatch — edge cases', () => { describe('locateMatch — edge cases', () => {
it('returns not_found for an empty needle', () => { it('returns not_found for an empty needle', () => {
expect(locateMatch('anything', '')).toEqual({ kind: 'not_found' }); expect(locateMatch('anything', '')).toEqual({ kind: 'not_found' });

View File

@@ -83,6 +83,53 @@ describe.runIf(!!process.env.DATABASE_URL)('pending_changes integration', () =>
expect(existsSync(resolve(testDir, 'deleteme.txt'))).toBe(false); expect(existsSync(resolve(testDir, 'deleteme.txt'))).toBe(false);
}); });
it('re-emitted identical edits dedupe at queue and never duplicate on apply', async () => {
// Regression: the 2-3x block-stamping corruption. An anchored insert queued
// three times (a local model re-emitting the same tool call) must collapse to
// ONE pending row and apply exactly once.
await queueCreate(sql, testSessionId, null, 'dup.js', '<script>\nrender();\n', projectRoot)
.then((c) => applyOne(sql, c.id, projectRoot));
const oldStr = '<script>';
const newStr = '<script>\nconst recordFormats = ["gif"];';
const a = await queueEdit(sql, testSessionId, null, 'dup.js', oldStr, newStr, projectRoot);
const b = await queueEdit(sql, testSessionId, null, 'dup.js', oldStr, newStr, projectRoot);
const c = await queueEdit(sql, testSessionId, null, 'dup.js', oldStr, newStr, projectRoot);
// All three calls return the SAME pending row (deduped).
expect(b.id).toBe(a.id);
expect(c.id).toBe(a.id);
await applyOne(sql, a.id, projectRoot);
let content = await readFile(resolve(testDir, 'dup.js'), 'utf8');
expect((content.match(/const recordFormats/g) || []).length).toBe(1);
// Even a fresh, separately-queued identical edit re-applied is a no-op, not a stamp.
const again = await queueEdit(sql, testSessionId, null, 'dup.js', oldStr, newStr, projectRoot);
const res = await applyOne(sql, again.id, projectRoot);
expect(res.success).toBe(true);
content = await readFile(resolve(testDir, 'dup.js'), 'utf8');
expect((content.match(/const recordFormats/g) || []).length).toBe(1);
});
it('preserves CRLF line endings on edit', async () => {
await queueCreate(sql, testSessionId, null, 'crlf.txt', 'line one\r\nline two\r\nline three\r\n', projectRoot)
.then((c) => applyOne(sql, c.id, projectRoot));
const edit = await queueEdit(sql, testSessionId, null, 'crlf.txt', 'line two', 'line TWO', projectRoot);
const res = await applyOne(sql, edit.id, projectRoot);
expect(res.success).toBe(true);
const content = await readFile(resolve(testDir, 'crlf.txt'), 'utf8');
expect(content).toBe('line one\r\nline TWO\r\nline three\r\n');
});
it('refuses an edit that matches multiple locations instead of corrupting', async () => {
await queueCreate(sql, testSessionId, null, 'ambig.js', 'x=1;\ny=2;\nx=1;\n', projectRoot)
.then((ch) => applyOne(sql, ch.id, projectRoot));
const edit = await queueEdit(sql, testSessionId, null, 'ambig.js', 'x=1;', 'x=9;', projectRoot);
const res = await applyOne(sql, edit.id, projectRoot);
expect(res.success).toBe(false);
expect(res.error).toMatch(/matches 2 locations/);
});
it('rewindOne → verify reverted', async () => { it('rewindOne → verify reverted', async () => {
// Setup: create and apply a file // Setup: create and apply a file
const createChange = await queueCreate(sql, testSessionId, null, 'rewindable.txt', 'initial', projectRoot); const createChange = await queueCreate(sql, testSessionId, null, 'rewindable.txt', 'initial', projectRoot);

View File

@@ -0,0 +1,69 @@
import { describe, it, expect } from 'vitest';
import { planEdit } from '../pending_changes.js';
// planEdit is the pure core of applyOne's edit splice. These tests pin the
// idempotency guards that stop the "block stamped 2-3x" corruption: applying the
// same queued edit more than once must be a no-op, never a duplicate.
describe('planEdit — normal application', () => {
it('applies a unique exact edit', () => {
const content = 'a\nfoo\nb\n';
const plan = planEdit(content, 'foo', 'bar');
expect(plan).toEqual({ kind: 'apply', updated: 'a\nbar\nb\n' });
});
it('reports ambiguous when old_string occurs more than once', () => {
const content = 'foo\nx\nfoo\n';
const plan = planEdit(content, 'foo', 'bar');
expect(plan).toEqual({ kind: 'ambiguous', count: 2 });
});
it('reports not_found when old_string is absent and new is not present', () => {
const content = 'alpha\nbeta\n';
const plan = planEdit(content, 'gamma that is clearly nowhere', 'delta');
expect(plan).toEqual({ kind: 'not_found' });
});
});
describe('planEdit — idempotency (the corruption guard)', () => {
it('treats a re-applied anchored insert as already-applied (no duplicate)', () => {
// The exact mechanism that tripled `const recordFormats` in settings.html:
// an anchored insert (old=anchor, new=anchor+block) where the anchor still
// matches uniquely after the first apply.
const oldStr = '<script>';
const newStr = '<script>\nconst recordFormats = ["gif","mp4"];';
const before = '<script>\nfunction render() {}\n</script>\n';
const first = planEdit(before, oldStr, newStr);
expect(first.kind).toBe('apply');
const after = first.kind === 'apply' ? first.updated : '';
expect((after.match(/const recordFormats/g) || []).length).toBe(1);
// Re-applying the identical edit to the already-edited content is a no-op.
const second = planEdit(after, oldStr, newStr);
expect(second).toEqual({ kind: 'noop', reason: 'already-applied' });
});
it('treats an edit whose old_string is gone but new_string is present as already-applied', () => {
const content = 'const total = sum + tax;\n';
const plan = planEdit(content, 'const subtotal = sum;', 'const total = sum + tax;');
expect(plan).toEqual({ kind: 'noop', reason: 'already-applied' });
});
it('treats a no-change splice as a noop', () => {
const content = 'a\nfoo\nb\n';
const plan = planEdit(content, 'foo', 'foo');
expect(plan).toEqual({ kind: 'noop', reason: 'identical' });
});
it('does not duplicate across three repeated applications', () => {
const oldStr = 'function f() {';
const newStr = 'function f() {\n const x = 1;';
let content = 'function f() {\n return x;\n}\n';
for (let i = 0; i < 3; i++) {
const plan = planEdit(content, oldStr, newStr);
if (plan.kind === 'apply') content = plan.updated;
}
expect((content.match(/const x = 1;/g) || []).length).toBe(1);
});
});

View File

@@ -4,7 +4,7 @@ import type { Broker } from '@boocode/server/broker';
import type { WsFrame } from '@boocode/contracts/ws-frames'; import type { WsFrame } from '@boocode/contracts/ws-frames';
import type { Config } from '../config.js'; import type { Config } from '../config.js';
import { createWorktree, diffWorktree, cleanupWorktree, ensureSessionWorktree } from './worktrees.js'; import { createWorktree, diffWorktree, cleanupWorktree, ensureSessionWorktree } from './worktrees.js';
import { applyAll } from './pending_changes.js'; import { asPermissionMode } from './tools/types.js';
import { createCheckpoint } from './checkpoints.js'; import { createCheckpoint } from './checkpoints.js';
import { makeDcpStreamStripper } from './dcp-strip.js'; import { makeDcpStreamStripper } from './dcp-strip.js';
import { dispatchViaAcp } from './acp-dispatch.js'; import { dispatchViaAcp } from './acp-dispatch.js';
@@ -32,7 +32,13 @@ import {
import { shouldFailOnMissingAgent } from './flow-runner-decisions.js'; import { shouldFailOnMissingAgent } from './flow-runner-decisions.js';
interface InferenceRunner { interface InferenceRunner {
enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => void; enqueue: (
sessionId: string,
chatId: string,
assistantId: string,
user: string,
permissionMode?: 'plan' | 'ask' | 'bypass',
) => void;
cancel: (sessionId: string, chatId: string) => Promise<boolean>; cancel: (sessionId: string, chatId: string) => Promise<boolean>;
hasActive: (chatId: string) => boolean; hasActive: (chatId: string) => boolean;
} }
@@ -358,8 +364,9 @@ export function createDispatcher(deps: Deps): {
`; `;
const assistantId = assistantMsg!.id; const assistantId = assistantMsg!.id;
// Enqueue inference // Enqueue inference — pass the native permission gate (plan/ask/bypass)
inference.enqueue(sessionId, chatId, assistantId, 'default'); // through to the write-tool context. Non-unified mode ids → undefined.
inference.enqueue(sessionId, chatId, assistantId, 'default', asPermissionMode(task.mode_id));
// Wait for inference to complete (poll message status) // Wait for inference to complete (poll message status)
const finalStatus = await waitForCompletion(assistantId); const finalStatus = await waitForCompletion(assistantId);
@@ -392,22 +399,6 @@ export function createDispatcher(deps: Deps): {
WHERE id = ${taskId} WHERE id = ${taskId}
`; `;
log.info({ taskId, costTokens }, 'dispatcher: task completed (native)'); log.info({ taskId, costTokens }, 'dispatcher: task completed (native)');
// Bypass permission mode: auto-apply the staged edits to disk after the
// turn. Ask/Plan leave them in the pending-changes queue for review.
if (task.mode_id === 'bypass') {
try {
const [proj] = await sql<{ path: string }[]>`SELECT path FROM projects WHERE id = ${task.project_id}`;
if (proj?.path) {
const applied = await applyAll(sql, sessionId, proj.path);
log.info({ taskId, applied: applied.length }, 'dispatcher: native bypass auto-applied pending changes');
}
} catch (applyErr) {
log.warn(
{ taskId, err: applyErr instanceof Error ? applyErr.message : String(applyErr) },
'dispatcher: native bypass auto-apply failed',
);
}
}
} else { } else {
const [msg] = await sql<{ content: string | null }[]>` const [msg] = await sql<{ content: string | null }[]>`
SELECT content FROM messages WHERE id = ${assistantId} SELECT content FROM messages WHERE id = ${assistantId}

View File

@@ -21,7 +21,16 @@
// punctuation to ASCII on both sides; the match is // punctuation to ASCII on both sides; the match is
// mapped back to original offsets. // mapped back to original offsets.
// 4. levenshtein — best line-window by normalized edit-distance // 4. levenshtein — best line-window by normalized edit-distance
// similarity; accepted only at >= SIMILARITY_THRESHOLD. // similarity; accepted only at >= SIMILARITY_THRESHOLD,
// anchored on an exact first+last line for multi-line
// needles, and REFUSED (ambiguous) when a second window
// scores within AMBIGUITY_EPSILON of the best. Like the
// exact/whitespace tiers, this tier fails CLOSED — it
// never splices over a merely-plausible guess, because a
// wrong-window splice corrupts the file (it leaves the
// real target intact and duplicates it). This mirrors
// opencode/cline/qwen, whose fuzzy tiers all keep the
// unique-match requirement rather than picking a winner.
// //
// Pure and dependency-free (Levenshtein is the standard iterative two-row DP), // Pure and dependency-free (Levenshtein is the standard iterative two-row DP),
// reimplemented from the general technique — no vendored source. // reimplemented from the general technique — no vendored source.
@@ -31,8 +40,31 @@ export type MatchResult =
| { kind: 'ambiguous'; count: number } | { kind: 'ambiguous'; count: number }
| { kind: 'not_found' }; | { kind: 'not_found' };
/** Levenshtein similarity floor for the final fuzzy fallback (strategy 4). */ /**
export const SIMILARITY_THRESHOLD = 0.66; * Levenshtein similarity floor for the final fuzzy fallback (strategy 4).
* 0.66 was far too low — at two-thirds similarity a structurally-wrong window
* (e.g. one of three near-identical form blocks) clears the bar and gets spliced
* over, leaving the real target intact and duplicated. Competent agents anchor
* far tighter (opencode's BlockAnchor needs an exact anchor; cline needs exact
* first+last lines). 0.85 keeps genuine quantized-model drift (a typo, an indent
* shift) while refusing a different block.
*/
export const SIMILARITY_THRESHOLD = 0.85;
/**
* If a second candidate window scores within this of the best, the match is
* ambiguous and tier 4 refuses rather than guessing — the same fail-closed
* stance the exact and whitespace tiers take on multiple hits. Repetitive files
* (the duplicate-block corruption case) produce near-tied windows; this is what
* turns that into a clean "add more context" error instead of a wrong splice.
*/
export const AMBIGUITY_EPSILON = 0.05;
/** Multi-line needles at or above this length must anchor on an exact (after
* trim + unicode-fold) first AND last line before similarity is even scored —
* the cline/opencode block-anchor rule. Below it, threshold + uniqueness alone
* guard the match. */
const ANCHOR_MIN_LINES = 3;
export function locateMatch(content: string, needle: string): MatchResult { export function locateMatch(content: string, needle: string): MatchResult {
// Empty needle has no meaningful match. // Empty needle has no meaningful match.
@@ -252,20 +284,39 @@ function locateByLevenshtein(content: string, needle: string): MatchResult | nul
const needleJoined = needleLines.map((l) => l.trim()).join('\n'); const needleJoined = needleLines.map((l) => l.trim()).join('\n');
let best = -1; // Block-anchor gate for multi-line needles: the first and last lines must match
let bestSpan: { start: number; end: number } | null = null; // exactly (after trim + unicode-fold) or the window is not even scored. This
// stops a high interior-similarity from dragging a structurally-wrong window
// over the threshold — the failure that duplicates blocks in repetitive files.
const anchored = n >= ANCHOR_MIN_LINES;
const needleFirst = canonicalize(needleLines[0]!.trim());
const needleLast = canonicalize(needleLines[n - 1]!.trim());
const scored: Array<{ score: number; start: number; end: number }> = [];
for (let i = 0; i + n <= contentLines.length; i++) { for (let i = 0; i + n <= contentLines.length; i++) {
const window = contentLines.slice(i, i + n); const window = contentLines.slice(i, i + n);
const windowJoined = window.map((l) => l.text.trim()).join('\n'); if (anchored) {
const score = similarity(windowJoined, needleJoined); const winFirst = canonicalize(window[0]!.text.trim());
if (score > best) { const winLast = canonicalize(window[n - 1]!.text.trim());
best = score; if (winFirst !== needleFirst || winLast !== needleLast) continue;
bestSpan = { start: window[0]!.start, end: window[n - 1]!.end };
} }
const windowJoined = window.map((l) => l.text.trim()).join('\n');
scored.push({
score: similarity(windowJoined, needleJoined),
start: window[0]!.start,
end: window[n - 1]!.end,
});
} }
if (bestSpan && best >= SIMILARITY_THRESHOLD) { if (scored.length === 0) return null;
return { kind: 'fuzzy', start: bestSpan.start, end: bestSpan.end }; scored.sort((a, b) => b.score - a.score);
} const best = scored[0]!;
return null; if (best.score < SIMILARITY_THRESHOLD) return null;
// Uniqueness guard: refuse when a second window is within epsilon of the best.
// Fail closed (ambiguous) rather than silently splicing one of several lookalikes.
const tied = scored.filter((s) => s.score >= best.score - AMBIGUITY_EPSILON);
if (tied.length > 1) return { kind: 'ambiguous', count: tied.length };
return { kind: 'fuzzy', start: best.start, end: best.end };
} }

View File

@@ -1,9 +1,120 @@
import { readFile, writeFile, unlink, mkdir } from 'node:fs/promises'; import { readFile, writeFile, unlink, mkdir, rename, realpath } from 'node:fs/promises';
import { dirname } from 'node:path'; import { dirname, join, basename } from 'node:path';
import { randomBytes } from 'node:crypto';
import type { Sql } from '../db.js'; import type { Sql } from '../db.js';
import { resolveWritePath } from './write_guard.js'; import { resolveWritePath } from './write_guard.js';
import { locateMatch } from './fuzzy-match.js'; import { locateMatch } from './fuzzy-match.js';
/**
* Write a file atomically: stage to a sibling temp file, then rename over the
* target. rename(2) on the same filesystem is atomic, so a crash mid-write can
* never leave a half-written (truncated/corrupt) source file — readers see
* either the old content or the complete new content. The temp lives in the same
* directory to guarantee a same-filesystem rename.
*
* Symlinks: a plain writeFile FOLLOWS a symlink and writes through to its target;
* a bare rename would REPLACE the link with a regular file. We realpath an
* existing target first so the rename lands on the real file and the link
* survives — preserving the prior follow-through behavior. A missing target
* (create, or a broken link) just writes the literal path.
*/
async function writeFileAtomic(filePath: string, content: string): Promise<void> {
let target = filePath;
try {
target = await realpath(filePath);
} catch {
// ENOENT (new file) or broken link — write the literal path.
}
const tmp = join(dirname(target), `.${basename(target)}.tmp.${process.pid}.${randomBytes(6).toString('hex')}`);
await writeFile(tmp, content, 'utf8');
try {
await rename(tmp, target);
} catch (err) {
await unlink(tmp).catch(() => {});
throw err;
}
}
/** Detect a file's dominant line ending so an edit can preserve it. */
function detectEol(text: string): '\r\n' | '\n' {
return text.includes('\r\n') ? '\r\n' : '\n';
}
/**
* Serialize the read-modify-write of a single file so two concurrent applies
* (e.g. two chat tabs sharing one worktree, or a Bypass write racing an
* apply_pending) can't lose an update. In-process keying is sufficient —
* BooCoder is a single Fastify process. One Map entry per distinct path.
*/
const fileLocks = new Map<string, Promise<void>>();
async function withFileLock<T>(filePath: string, fn: () => Promise<T>): Promise<T> {
const prev = fileLocks.get(filePath) ?? Promise.resolve();
let release!: () => void;
const current = new Promise<void>((r) => { release = r; });
fileLocks.set(filePath, prev.then(() => current));
await prev.catch(() => {});
try {
return await fn();
} finally {
release();
}
}
// --- Edit-apply planning (pure, unit-tested) ---------------------------------
/**
* Decision for applying one queued edit to a file's current content. Pulled out
* of `applyOne` so the splice — the part that actually corrupted files — is pure
* and testable without a DB or filesystem. Mirrors how opencode/cline/qwen keep
* their matchers fail-closed and idempotent.
*/
export type EditPlan =
| { kind: 'apply'; updated: string }
| { kind: 'noop'; reason: 'identical' | 'already-applied' }
| { kind: 'ambiguous'; count: number }
| { kind: 'not_found' };
/**
* Decide how (or whether) to apply an `old → new` edit to `content`.
*
* Idempotency is the whole point here: a queued edit can legitimately be
* re-applied (a local model re-emits the same tool call; a turn is retried; the
* same change sits in the queue twice). A naive splice stamps the new text again
* each time — the 23× block duplication. Two guards make re-application a no-op:
*
* - already-applied (anchored insert): when `new` is `old` + an appended block
* (`old="anchor"`, `new="anchor\n<block>"`), `old` still matches uniquely after
* the first apply, so a second apply would duplicate `<block>`. If the full
* `new` text is already present at the match site, the edit is already applied.
* - already-applied (old gone): if `old` can't be located but `new` is already
* in the file, the change landed on a prior pass — treat as a no-op, not an error.
* - identical: the splice would not change the file.
*
* Anything ambiguous or genuinely absent fails CLOSED so the caller surfaces a
* correctable error instead of writing a guess.
*/
export function planEdit(content: string, oldStr: string, newStr: string): EditPlan {
const match = locateMatch(content, oldStr);
if (match.kind === 'ambiguous') return { kind: 'ambiguous', count: match.count };
if (match.kind === 'not_found') {
if (newStr.length > 0 && content.includes(newStr)) {
return { kind: 'noop', reason: 'already-applied' };
}
return { kind: 'not_found' };
}
const updated = content.slice(0, match.start) + newStr + content.slice(match.end);
// No-change splice first (covers old === new), then the anchored re-stamp guard:
// the full replacement already sits at the match site (re-emitted anchored insert).
if (updated === content) return { kind: 'noop', reason: 'identical' };
if (content.slice(match.start, match.start + newStr.length) === newStr) {
return { kind: 'noop', reason: 'already-applied' };
}
return { kind: 'apply', updated };
}
// --- Types ------------------------------------------------------------------- // --- Types -------------------------------------------------------------------
export interface PendingChange { export interface PendingChange {
@@ -47,6 +158,13 @@ export async function queueEdit(
const resolved = resolveWritePath(projectRoot, filePath); const resolved = resolveWritePath(projectRoot, filePath);
const diff = JSON.stringify({ old: oldString, new: newString }); const diff = JSON.stringify({ old: oldString, new: newString });
// Idempotent queue: collapse an identical edit that is still pending. Local
// quantized models re-emit the same edit_file call within a turn, and a retried
// turn re-queues — each duplicate row would apply and stamp another copy. One
// pending row per (session, file, operation, diff) is enough.
const existing = await findPendingDuplicate(sql, sessionId, resolved, 'edit', diff);
if (existing) return existing;
const [row] = await sql<PendingChange[]>` const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent) INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'edit', ${diff}, ${agent}) VALUES (${sessionId}, ${taskId}, ${resolved}, 'edit', ${diff}, ${agent})
@@ -55,6 +173,28 @@ export async function queueEdit(
return row!; return row!;
} }
/** Return an identical still-pending change for this (session, file, op, diff),
* or undefined. Used to keep the queue idempotent against re-emitted edits. */
async function findPendingDuplicate(
sql: Sql,
sessionId: string,
resolvedPath: string,
operation: 'create' | 'edit' | 'delete',
diff: string,
): Promise<PendingChange | undefined> {
const [row] = await sql<PendingChange[]>`
SELECT * FROM pending_changes
WHERE session_id = ${sessionId}
AND file_path = ${resolvedPath}
AND operation = ${operation}
AND diff = ${diff}
AND status = 'pending'
ORDER BY created_at ASC
LIMIT 1
`;
return row;
}
export async function queueCreate( export async function queueCreate(
sql: Sql, sql: Sql,
sessionId: string, sessionId: string,
@@ -68,6 +208,9 @@ export async function queueCreate(
): Promise<PendingChange> { ): Promise<PendingChange> {
const resolved = resolveWritePath(projectRoot, filePath); const resolved = resolveWritePath(projectRoot, filePath);
const existing = await findPendingDuplicate(sql, sessionId, resolved, 'create', content);
if (existing) return existing;
const [row] = await sql<PendingChange[]>` const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent) INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'create', ${content}, ${agent}) VALUES (${sessionId}, ${taskId}, ${resolved}, 'create', ${content}, ${agent})
@@ -87,6 +230,9 @@ export async function queueDelete(
): Promise<PendingChange> { ): Promise<PendingChange> {
const resolved = resolveWritePath(projectRoot, filePath); const resolved = resolveWritePath(projectRoot, filePath);
const existing = await findPendingDuplicate(sql, sessionId, resolved, 'delete', '');
if (existing) return existing;
const [row] = await sql<PendingChange[]>` const [row] = await sql<PendingChange[]>`
INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent) INSERT INTO pending_changes (session_id, task_id, file_path, operation, diff, agent)
VALUES (${sessionId}, ${taskId}, ${resolved}, 'delete', '', ${agent}) VALUES (${sessionId}, ${taskId}, ${resolved}, 'delete', '', ${agent})
@@ -110,48 +256,60 @@ export async function applyOne(
} }
try { try {
// Re-validate path in case projectRoot has shifted return await withFileLock(change.file_path, async () => {
resolveWritePath(projectRoot, change.file_path); // Re-validate path in case projectRoot has shifted
resolveWritePath(projectRoot, change.file_path);
switch (change.operation) { switch (change.operation) {
case 'create': { case 'create': {
await mkdir(dirname(change.file_path), { recursive: true }); await mkdir(dirname(change.file_path), { recursive: true });
await writeFile(change.file_path, change.diff, 'utf8'); await writeFileAtomic(change.file_path, change.diff);
break; break;
}
case 'edit': {
const { old: oldStr, new: newStr } = JSON.parse(change.diff) as { old: string; new: string };
const content = await readFile(change.file_path, 'utf8');
const match = locateMatch(content, oldStr);
if (match.kind === 'ambiguous') {
throw new Error(
`old_string matches ${match.count} locations — add surrounding context to disambiguate`,
);
} }
if (match.kind === 'not_found') { case 'edit': {
throw new Error( const { old: oldStr, new: newStr } = JSON.parse(change.diff) as { old: string; new: string };
'old_string not found in file (even fuzzily) — file may have changed since the edit was queued', const raw = await readFile(change.file_path, 'utf8');
); // Normalize to LF for matching, then write back in the file's native EOL
// so an LF-emitting model doesn't leave a CRLF file with mixed endings.
const eol = detectEol(raw);
const toLf = (t: string) => t.replaceAll('\r\n', '\n');
const plan = planEdit(toLf(raw), toLf(oldStr), toLf(newStr));
if (plan.kind === 'ambiguous') {
throw new Error(
`old_string matches ${plan.count} locations — add surrounding context to disambiguate`,
);
}
if (plan.kind === 'not_found') {
throw new Error(
'old_string not found in file (even fuzzily) — file may have changed since the edit was queued',
);
}
if (plan.kind === 'apply') {
const out = eol === '\r\n' ? plan.updated.replaceAll('\n', '\r\n') : plan.updated;
await writeFileAtomic(change.file_path, out);
} else {
// noop: the edit is already applied (re-emitted / retried) or a no-change.
// Mark it applied without rewriting so it can't stamp a duplicate.
console.log(`[pending] edit ${change.file_path} is a no-op (${plan.reason}) — not rewriting`);
}
break;
} }
const updated = content.slice(0, match.start) + newStr + content.slice(match.end); case 'delete': {
await writeFile(change.file_path, updated, 'utf8'); // Stash current content in diff for potential rewind
break; try {
} const existing = await readFile(change.file_path, 'utf8');
case 'delete': { await sql`UPDATE pending_changes SET diff = ${existing} WHERE id = ${changeId}`;
// Stash current content in diff for potential rewind } catch {
try { // File may already be gone — proceed with status update
const existing = await readFile(change.file_path, 'utf8'); }
await sql`UPDATE pending_changes SET diff = ${existing} WHERE id = ${changeId}`; await unlink(change.file_path);
} catch { break;
// File may already be gone — proceed with status update
} }
await unlink(change.file_path);
break;
} }
}
await sql`UPDATE pending_changes SET status = 'applied' WHERE id = ${changeId}`; await sql`UPDATE pending_changes SET status = 'applied' WHERE id = ${changeId}`;
return { id: change.id, file_path: change.file_path, operation: change.operation, success: true }; return { id: change.id, file_path: change.file_path, operation: change.operation, success: true };
});
} catch (err) { } catch (err) {
const message = err instanceof Error ? err.message : String(err); const message = err instanceof Error ? err.message : String(err);
return { id: change.id, file_path: change.file_path, operation: change.operation, success: false, error: message }; return { id: change.id, file_path: change.file_path, operation: change.operation, success: false, error: message };
@@ -220,13 +378,13 @@ export async function rewindOne(
); );
} }
const reverted = content.slice(0, match.start) + oldStr + content.slice(match.end); const reverted = content.slice(0, match.start) + oldStr + content.slice(match.end);
await writeFile(change.file_path, reverted, 'utf8'); await writeFileAtomic(change.file_path, reverted);
break; break;
} }
case 'delete': { case 'delete': {
// Reverse a delete: recreate the file (diff holds the original content stashed at apply time) // Reverse a delete: recreate the file (diff holds the original content stashed at apply time)
await mkdir(dirname(change.file_path), { recursive: true }); await mkdir(dirname(change.file_path), { recursive: true });
await writeFile(change.file_path, change.diff, 'utf8'); await writeFileAtomic(change.file_path, change.diff);
break; break;
} }
} }

View File

@@ -0,0 +1,38 @@
import { describe, it, expect } from 'vitest';
import { runWithInferenceContext, getInferenceContext } from '../inference_context.js';
import type { Sql } from '../../../db.js';
const fakeSql = {} as unknown as Sql;
describe('inference context (AsyncLocalStorage isolation)', () => {
it('throws when read outside a run', () => {
expect(() => getInferenceContext()).toThrow(/outside inference context/);
});
it('keeps each run its own context across overlapping awaits', async () => {
// The race the global `let current` had: run B starts (and would overwrite a
// shared global) while run A is awaiting. After A resumes it must still read
// its OWN sessionId, not B's.
const run = (id: string, delay: number) =>
runWithInferenceContext({ sql: fakeSql, sessionId: id, taskId: null }, async () => {
await new Promise((r) => setTimeout(r, delay));
return getInferenceContext().sessionId;
});
const [a, b] = await Promise.all([run('A', 20), run('B', 5)]);
expect(a).toBe('A');
expect(b).toBe('B');
});
it('carries permissionMode and taskId per run', async () => {
const result = await runWithInferenceContext(
{ sql: fakeSql, sessionId: 's1', taskId: 't1', permissionMode: 'bypass' },
async () => {
await Promise.resolve();
const ctx = getInferenceContext();
return { taskId: ctx.taskId, mode: ctx.permissionMode };
},
);
expect(result).toEqual({ taskId: 't1', mode: 'bypass' });
});
});

View File

@@ -26,6 +26,15 @@ export const applyPendingTool: ToolDef<ApplyPendingInputT> = {
}, },
}, },
async execute(_input: ApplyPendingInputT, projectRoot: string, context: ToolContext): Promise<unknown> { async execute(_input: ApplyPendingInputT, projectRoot: string, context: ToolContext): Promise<unknown> {
// Under Ask (and Plan) the human approves via the Pending Changes panel — the
// agent must not auto-apply. Bypass and legacy (undefined) may apply.
if (context.permissionMode === 'ask' || context.permissionMode === 'plan') {
return {
status: 'denied',
message:
'Permission mode is Ask — staged changes must be approved by the user in the Pending Changes panel, not applied by the agent.',
};
}
const results = await applyAll(context.sql, context.sessionId, projectRoot); const results = await applyAll(context.sql, context.sessionId, projectRoot);
const succeeded = results.filter((r) => r.success).length; const succeeded = results.filter((r) => r.success).length;
const failed = results.filter((r) => !r.success).length; const failed = results.filter((r) => !r.success).length;

View File

@@ -1,6 +1,7 @@
import { z } from 'zod'; import { z } from 'zod';
import type { ToolDef, ToolContext } from './types.js'; import type { ToolDef, ToolContext } from './types.js';
import { queueCreate } from '../pending_changes.js'; import { queueCreate } from '../pending_changes.js';
import { denyReadOnly, finalizeWrite } from './write-gate.js';
const CreateFileInput = z.object({ const CreateFileInput = z.object({
file_path: z.string().min(1), file_path: z.string().min(1),
@@ -32,6 +33,7 @@ export const createFileTool: ToolDef<CreateFileInputT> = {
}, },
}, },
async execute(input: CreateFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> { async execute(input: CreateFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> {
if (context.permissionMode === 'plan') return denyReadOnly('create_file');
const change = await queueCreate( const change = await queueCreate(
context.sql, context.sql,
context.sessionId, context.sessionId,
@@ -40,12 +42,11 @@ export const createFileTool: ToolDef<CreateFileInputT> = {
input.content, input.content,
projectRoot, projectRoot,
); );
return { return finalizeWrite(
status: 'queued', context,
change_id: change.id, projectRoot,
file_path: change.file_path, change,
operation: 'create', `File creation queued: ${change.file_path}. Use apply_pending to write changes to disk.`,
message: `File creation queued: ${change.file_path}. Use apply_pending to write changes to disk.`, );
};
}, },
}; };

View File

@@ -1,6 +1,7 @@
import { z } from 'zod'; import { z } from 'zod';
import type { ToolDef, ToolContext } from './types.js'; import type { ToolDef, ToolContext } from './types.js';
import { queueDelete } from '../pending_changes.js'; import { queueDelete } from '../pending_changes.js';
import { denyReadOnly, finalizeWrite } from './write-gate.js';
const DeleteFileInput = z.object({ const DeleteFileInput = z.object({
file_path: z.string().min(1), file_path: z.string().min(1),
@@ -30,6 +31,7 @@ export const deleteFileTool: ToolDef<DeleteFileInputT> = {
}, },
}, },
async execute(input: DeleteFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> { async execute(input: DeleteFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> {
if (context.permissionMode === 'plan') return denyReadOnly('delete_file');
const change = await queueDelete( const change = await queueDelete(
context.sql, context.sql,
context.sessionId, context.sessionId,
@@ -37,12 +39,11 @@ export const deleteFileTool: ToolDef<DeleteFileInputT> = {
input.file_path, input.file_path,
projectRoot, projectRoot,
); );
return { return finalizeWrite(
status: 'queued', context,
change_id: change.id, projectRoot,
file_path: change.file_path, change,
operation: 'delete', `File deletion queued: ${change.file_path}. Use apply_pending to write changes to disk.`,
message: `File deletion queued: ${change.file_path}. Use apply_pending to write changes to disk.`, );
};
}, },
}; };

View File

@@ -1,6 +1,7 @@
import { z } from 'zod'; import { z } from 'zod';
import type { ToolDef, ToolContext } from './types.js'; import type { ToolDef, ToolContext } from './types.js';
import { queueEdit } from '../pending_changes.js'; import { queueEdit } from '../pending_changes.js';
import { denyReadOnly, finalizeWrite } from './write-gate.js';
const EditFileInput = z.object({ const EditFileInput = z.object({
file_path: z.string().min(1), file_path: z.string().min(1),
@@ -34,6 +35,7 @@ export const editFileTool: ToolDef<EditFileInputT> = {
}, },
}, },
async execute(input: EditFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> { async execute(input: EditFileInputT, projectRoot: string, context: ToolContext): Promise<unknown> {
if (context.permissionMode === 'plan') return denyReadOnly('edit_file');
const change = await queueEdit( const change = await queueEdit(
context.sql, context.sql,
context.sessionId, context.sessionId,
@@ -43,12 +45,11 @@ export const editFileTool: ToolDef<EditFileInputT> = {
input.new_string, input.new_string,
projectRoot, projectRoot,
); );
return { return finalizeWrite(
status: 'queued', context,
change_id: change.id, projectRoot,
file_path: change.file_path, change,
operation: 'edit', `Edit queued for ${change.file_path}. Use apply_pending to write changes to disk.`,
message: `Edit queued for ${change.file_path}. Use apply_pending to write changes to disk.`, );
};
}, },
}; };

View File

@@ -1,36 +1,49 @@
import { AsyncLocalStorage } from 'node:async_hooks';
import type { Sql } from '../../db.js'; import type { Sql } from '../../db.js';
import type { PermissionMode } from './types.js';
/** /**
* Module-level inference context for write tools. * Per-run inference context for write tools.
* *
* Set via `setInferenceContext()` before each inference run starts. * Write tools need ambient state (sql, sessionId, the permission gate) that the
* Write tools read it via `getInferenceContext()` during execute. * BooChat tool-phase `execute(input, projectRoot, extraRoots?)` signature can't
* Same pattern as BooChat's `loadConfig()` singleton — tools need * carry. This used to be a single module-level `let current` — but the inference
* ambient state that can't be threaded through the tool-phase execute * runner's `enqueue()` is fire-and-forget, so two overlapping runs (a user
* signature (which is `execute(input, projectRoot, extraRoots?)`). * message racing a dispatcher-polled native task; two chat tabs streaming) would
* clobber each other's context, and `cancel()` cleared it for ALL in-flight runs.
*
* AsyncLocalStorage gives each run its own context: `enqueue()` starts its async
* loop synchronously inside `runWithInferenceContext`, so the store propagates
* through every awaited tool execution in that run — and only that run.
*/ */
export interface InferenceContext { export interface InferenceContext {
sql: Sql; sql: Sql;
sessionId: string; sessionId: string;
taskId: string | null; taskId: string | null;
/** Native-BooCode permission gate, set per run from the request/task mode. */
permissionMode?: PermissionMode;
} }
let current: InferenceContext | null = null; const storage = new AsyncLocalStorage<InferenceContext>();
export function setInferenceContext(ctx: InferenceContext): void { /**
current = ctx; * Bind `ctx` for the duration of the (possibly detached) async chain `fn` starts.
} * The inference runner kicks off its loop synchronously within this call, so all
* downstream `await`s — including write-tool `execute` via the adapter — read the
export function clearInferenceContext(): void { * same store. Concurrent runs each get their own; nothing is shared or cleared
current = null; * out from under an in-flight run.
*/
export function runWithInferenceContext<T>(ctx: InferenceContext, fn: () => T): T {
return storage.run(ctx, fn);
} }
export function getInferenceContext(): InferenceContext { export function getInferenceContext(): InferenceContext {
if (!current) { const ctx = storage.getStore();
if (!ctx) {
throw new Error( throw new Error(
'Write tool called outside inference context — setInferenceContext() was not called before this run', 'Write tool called outside inference context — runWithInferenceContext() did not wrap this run',
); );
} }
return current; return ctx;
} }

View File

@@ -1,6 +1,22 @@
import type { z } from 'zod'; import type { z } from 'zod';
import type { Sql } from '../../db.js'; import type { Sql } from '../../db.js';
/**
* Unified permission ladder for native BooCode inference. Gates the write tools:
* plan — read-only: create/edit/delete are denied (no staging).
* ask — stage to the pending-changes queue; `apply_pending` is denied so the
* agent cannot self-apply (the human approves via the Diff panel).
* bypass — apply each write immediately (no queue, no approval).
* Undefined preserves the historical behavior (stage + `apply_pending` allowed).
*/
export type PermissionMode = 'plan' | 'ask' | 'bypass';
/** Narrow a raw task/request mode id to a unified PermissionMode, else undefined
* (e.g. an external agent's native mode id, or null). */
export function asPermissionMode(id: string | null | undefined): PermissionMode | undefined {
return id === 'plan' || id === 'ask' || id === 'bypass' ? id : undefined;
}
export interface ToolJsonSchema { export interface ToolJsonSchema {
type: 'function'; type: 'function';
function: { function: {
@@ -21,6 +37,8 @@ export interface ToolContext {
sql: Sql; sql: Sql;
sessionId: string; sessionId: string;
taskId: string | null; taskId: string | null;
/** Native-BooCode permission gate for write tools (undefined = legacy behavior). */
permissionMode?: PermissionMode;
} }
export interface ToolDef<TInput> { export interface ToolDef<TInput> {

View File

@@ -0,0 +1,53 @@
/**
* Permission-gate helpers for native BooCode write tools. The gate comes from
* the per-run inference context (`ToolContext.permissionMode`):
* plan — deny the write (read-only); nothing is staged.
* bypass — apply the staged change immediately (no queue, no approval).
* ask / undefined — leave it in the pending-changes queue for review.
*/
import type { ToolContext } from './types.js';
import { applyOne } from '../pending_changes.js';
/** Result returned when a write is denied under Plan (read-only) mode. */
export function denyReadOnly(operation: string): unknown {
return {
status: 'denied',
operation,
message: `Read-only (Plan) permission mode — ${operation} is not permitted. Switch to Ask or Bypass to make changes.`,
};
}
/** Finalize a just-staged change per the permission gate: apply now under Bypass,
* otherwise return it as queued for the human to approve. */
export async function finalizeWrite(
context: ToolContext,
projectRoot: string,
change: { id: string; file_path: string; operation: string },
queuedHint: string,
): Promise<unknown> {
if (context.permissionMode === 'bypass') {
const res = await applyOne(context.sql, change.id, projectRoot);
console.log(
`[write-gate] bypass apply ${change.operation} ${change.file_path} -> ${res.success ? 'applied' : 'FAILED: ' + (res.error ?? '?')}`,
);
return {
status: res.success ? 'applied' : 'failed',
change_id: change.id,
file_path: change.file_path,
operation: change.operation,
message: res.success
? `${change.operation} applied to ${change.file_path}.`
: `Apply failed for ${change.file_path}: ${res.error ?? 'unknown error'}. Left in the pending queue.`,
};
}
console.log(
`[write-gate] ${context.permissionMode ?? 'legacy'} queued ${change.operation} ${change.file_path}`,
);
return {
status: 'queued',
change_id: change.id,
file_path: change.file_path,
operation: change.operation,
message: queuedHint,
};
}