fix(coder): harden edit-apply pipeline against block duplication

Root cause: two proven corruption mechanisms. (M1) Non-idempotent apply stamped the same block N times when a quantized model re-emitted the same edit_file call or a turn was retried. (M2) Levenshtein tier 4 was fail-open with no uniqueness guard, silently splicing into the wrong location.

Fixes applied at every layer of the pipeline:

Matcher (fuzzy-match.ts): raise SIMILARITY_THRESHOLD 0.66 → 0.85; add AMBIGUITY_EPSILON uniqueness guard (two windows within 0.05 of the top score → ambiguous, not a guess); add block-anchor gate (≥3-line needles require first+last line exact match before a window is scored).

Edit planner (pending_changes.ts): extract planEdit() as a pure function; idempotency guards detect already-applied states (anchored insert re-stamp, old-gone-but-new-present); findPendingDuplicate() collapses identical pending rows at queue time so M1 never reaches applyOne.

Atomic writes (pending_changes.ts): temp-file + rename on the same filesystem so a crash can't leave a half-written source file; realpath() first so symlinks survive the rename.

Per-file mutex (pending_changes.ts): withFileLock() serializes concurrent read-modify-write on the same path via a chained-Promise Map.

EOL preservation (pending_changes.ts): normalize CRLF → LF for matching, restore native line ending on write so Windows-style files stay clean.

Context isolation (inference_context.ts): replace module-level singleton with AsyncLocalStorage so concurrent inference runs (arena parallel dispatch, dispatcher poll racing a user message) each get their own scoped context with no clobbering.

Tests: plan-edit.test.ts (pure planEdit unit tests), extended fuzzy-match and pending_changes_integration suites, ALS isolation test that proves overlapping runs get correct session IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
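
Reviewer note: the matcher's uniqueness guard could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in fuzzy-match.ts. The constants come from the commit message, but `ScoredWindow`, `MatchResult`, and `pickWindow` are hypothetical names, and comparing rivals across all windows (not only those above the threshold) is an assumed fail-closed choice.

```typescript
// Values taken from the commit message; everything else here is hypothetical.
const SIMILARITY_THRESHOLD = 0.85;
const AMBIGUITY_EPSILON = 0.05;

interface ScoredWindow {
  startLine: number; // first line of the candidate window in the target file
  score: number;     // similarity to the needle, in [0, 1]
}

type MatchResult =
  | { kind: 'match'; startLine: number }
  | { kind: 'ambiguous'; candidates: number[] }
  | { kind: 'none' };

function pickWindow(windows: ScoredWindow[]): MatchResult {
  if (windows.length === 0) return { kind: 'none' };

  // Find the best-scoring window.
  const top = windows.reduce((best, w) => (w.score > best.score ? w : best));
  if (top.score < SIMILARITY_THRESHOLD) return { kind: 'none' };

  // Any other window scoring within epsilon of the top makes the match
  // ambiguous: report it instead of guessing where to splice.
  const rivals = windows.filter((w) => top.score - w.score <= AMBIGUITY_EPSILON);
  if (rivals.length > 1) {
    return { kind: 'ambiguous', candidates: rivals.map((w) => w.startLine) };
  }
  return { kind: 'match', startLine: top.startLine };
}
```

Returning a distinct `ambiguous` result lets the caller surface an error to the model rather than apply the edit at a guessed location, which is the fail-open behavior described under (M2).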
2026-06-07 01:44:37 +00:00
parent dbf1662982
commit cce685b1a7
16 changed files with 644 additions and 157 deletions
--- a/apps/coder/src/index.ts
+++ b/apps/coder/src/index.ts
@@ -13,7 +13,7 @@ import type { WsFrame } from '@boocode/contracts/ws-frames';
 // v2.0.0 Phase 2C: write tools + adapter for BooChat ToolDef compatibility.
 import { WRITE_TOOLS } from './services/tools/index.js';
 import { adaptWriteTool } from './services/tools/adapter.js';
-import { setInferenceContext, clearInferenceContext } from './services/tools/inference_context.js';
+import { runWithInferenceContext } from './services/tools/inference_context.js';
 // Routes
 import { registerMessageRoutes } from './routes/messages.js';
 import { registerSkillRoutes } from './routes/skills.js';
@@ -174,22 +174,27 @@ async function main() {
    }
  );

-  // Wrap the inference runner to set/clear the write-tool context around each run.
-  // The inference runner calls enqueue() which fires asynchronously — we hook
-  // into the enqueue to set context before the run starts.
+  // Wrap the inference runner to bind the write-tool context around each run.
+  // enqueue() starts its async loop synchronously, so wrapping the call in
+  // runWithInferenceContext propagates the per-run context (sql, sessionId, the
+  // Plan/Ask/Bypass gate) through every awaited tool execution — and concurrent
+  // runs (a user message racing a dispatcher-polled native task) each get their
+  // own, instead of clobbering a shared global.
  const inferenceApi = {
-    enqueue: (sessionId: string, chatId: string, assistantId: string, user: string) => {
-      // Set the inference context so write tools can access sql + sessionId.
-      // The context persists for the duration of the inference run. Since
-      // BooCoder is single-user and runs one inference at a time per session,
-      // this module-level state is safe.
-      setInferenceContext({ sql, sessionId, taskId: null });
-      inference.enqueue(sessionId, chatId, assistantId, user);
+    enqueue: (
+      sessionId: string,
+      chatId: string,
+      assistantId: string,
+      user: string,
+      permissionMode?: 'plan' | 'ask' | 'bypass',
+    ) => {
+      runWithInferenceContext({ sql, sessionId, taskId: null, permissionMode }, () => {
+        inference.enqueue(sessionId, chatId, assistantId, user);
+      });
    },
    cancel: async (sessionId: string, chatId: string) => {
-      const result = await inference.cancel(sessionId, chatId);
-      clearInferenceContext();
-      return result;
+      // No context to clear — AsyncLocalStorage scopes it to each run's own chain.
+      return inference.cancel(sessionId, chatId);
    },
    hasActive: (chatId: string) => inference.hasActive(chatId),
  };
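
Reviewer note: the hunk above relies on AsyncLocalStorage carrying a per-run context across awaits. A minimal sketch of that behavior follows, assuming a simplified context shape. `runWithInferenceContext` is the name imported in the diff, but its signature here is a guess, `getInferenceContext` and `fakeRun` are hypothetical, and the real context also carries `sql`, which is omitted.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Simplified, assumed shape; the real context in inference_context.ts also holds `sql`.
interface InferenceContext {
  sessionId: string;
  taskId: string | null;
  permissionMode?: 'plan' | 'ask' | 'bypass';
}

const storage = new AsyncLocalStorage<InferenceContext>();

// Binds ctx to fn and to every async continuation started inside fn.
function runWithInferenceContext<T>(ctx: InferenceContext, fn: () => T): T {
  return storage.run(ctx, fn);
}

// Hypothetical accessor a write tool might call; undefined outside any run.
function getInferenceContext(): InferenceContext | undefined {
  return storage.getStore();
}

// Stand-in for an inference run: awaits, then reads the context the way a
// tool execution would after several async hops.
async function fakeRun(sessionId: string, delayMs: number): Promise<string | undefined> {
  return runWithInferenceContext({ sessionId, taskId: null }, async () => {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    return getInferenceContext()?.sessionId;
  });
}

// Two overlapping runs; each should read back its own session ID.
async function demo(): Promise<Array<string | undefined>> {
  return Promise.all([fakeRun('session-a', 20), fakeRun('session-b', 5)]);
}
```

With the removed module-level singleton, the second `setInferenceContext` call would have overwritten the first while both runs were in flight. Here each run reads from its own async chain, which is also why `cancel` no longer needs to clear anything.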