fix(coder): harden edit-apply pipeline against block duplication

Root cause: two proven corruption mechanisms.

(M1) Non-idempotent apply stamped the same block N times when a quantized
model re-emitted the same edit_file call or a turn was retried.
(M2) Levenshtein tier 4 was fail-open with no uniqueness guard, silently
splicing into the wrong location.

Fixes applied at every layer of the pipeline:

Matcher (fuzzy-match.ts): raise SIMILARITY_THRESHOLD 0.66 → 0.85; add
AMBIGUITY_EPSILON uniqueness guard (a second window within 0.05 of the top
score → ambiguous, not a guess); add block-anchor gate (≥3-line needles
require an exact first+last line match before a window is scored).

Edit planner (pending_changes.ts): extract planEdit() as a pure function;
idempotency guards detect already-applied states (anchored insert re-stamp,
old-gone-but-new-present); findPendingDuplicate() collapses identical
pending rows at queue time so M1 never reaches applyOne.

Atomic writes (pending_changes.ts): temp-file + rename on the same
filesystem so a crash can't leave a half-written source file; realpath()
first so symlinks survive the rename.

Per-file mutex (pending_changes.ts): withFileLock() serializes concurrent
read-modify-write on the same path via a chained-Promise Map.

EOL preservation (pending_changes.ts): normalize CRLF → LF for matching,
restore the native line ending on write so Windows-style files stay clean.

Context isolation (inference_context.ts): replace the module-level
singleton with AsyncLocalStorage so concurrent inference runs (arena
parallel dispatch, dispatcher poll racing a user message) each get their
own scoped context with no clobbering.

Tests: plan-edit.test.ts (pure planEdit unit tests), extended fuzzy-match
and pending_changes_integration suites, and an ALS isolation test that
proves overlapping runs get the correct session IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 01:44:37 +00:00
parent dbf1662982
commit cce685b1a7
16 changed files with 644 additions and 157 deletions
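
The idempotency guards named in the commit message (anchored insert re-stamp, old-gone-but-new-present) can be illustrated with a small pure planner. This is a minimal sketch under stated assumptions, not the project's planEdit(): the names `Plan`, `countOccurrences` and `planReplace` are hypothetical, and exact substring matching stands in for the tiered matcher.

```typescript
// Hypothetical sketch of an idempotent edit planner; not the real planEdit().
type Plan =
  | { kind: 'apply'; result: string }
  | { kind: 'already_applied' }
  | { kind: 'ambiguous'; count: number }
  | { kind: 'not_found' };

// Count non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let at = haystack.indexOf(needle);
  while (at !== -1) {
    count++;
    at = haystack.indexOf(needle, at + needle.length);
  }
  return count;
}

function planReplace(content: string, oldText: string, newText: string): Plan {
  // Anchored insert re-stamp: the new text embeds the old text (an insert
  // next to an anchor) and is already present, so applying again would
  // stamp the inserted block a second time.
  if (newText.includes(oldText) && content.includes(newText)) {
    return { kind: 'already_applied' };
  }
  const oldHits = countOccurrences(content, oldText);
  if (oldHits === 0) {
    // Old gone but new present: a retried or re-emitted edit.
    return content.includes(newText)
      ? { kind: 'already_applied' }
      : { kind: 'not_found' };
  }
  // Fail closed on multiple hits rather than picking one.
  if (oldHits > 1) return { kind: 'ambiguous', count: oldHits };
  const at = content.indexOf(oldText);
  return {
    kind: 'apply',
    result: content.slice(0, at) + newText + content.slice(at + oldText.length),
  };
}
```

Because the planner is pure (string in, plan out), a repeated call can be checked without touching the filesystem: planning the same replace against its own result yields `already_applied` instead of a second splice. The first guard is deliberately coarse in this sketch; it treats any existing copy of the new text as proof that the insert was applied.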
--- a/apps/coder/src/services/fuzzy-match.ts
+++ b/apps/coder/src/services/fuzzy-match.ts
@@ -21,7 +21,16 @@
 //                         punctuation to ASCII on both sides; the match is
 //                         mapped back to original offsets.
 //   4. levenshtein      — best line-window by normalized edit-distance
-//                         similarity; accepted only at >= SIMILARITY_THRESHOLD.
+//                         similarity; accepted only at >= SIMILARITY_THRESHOLD,
+//                         anchored on an exact first+last line for multi-line
+//                         needles, and REFUSED (ambiguous) when a second window
+//                         scores within AMBIGUITY_EPSILON of the best. Like the
+//                         exact/whitespace tiers, this tier fails CLOSED — it
+//                         never splices over a merely-plausible guess, because a
+//                         wrong-window splice corrupts the file (it leaves the
+//                         real target intact and duplicates it). This mirrors
+//                         opencode/cline/qwen, whose fuzzy tiers all keep the
+//                         unique-match requirement rather than picking a winner.
 //
 // Pure and dependency-free (Levenshtein is the standard iterative two-row DP),
 // reimplemented from the general technique — no vendored source.
@@ -31,8 +40,31 @@ export type MatchResult =
  | { kind: 'ambiguous'; count: number }
  | { kind: 'not_found' };

-/** Levenshtein similarity floor for the final fuzzy fallback (strategy 4). */
-export const SIMILARITY_THRESHOLD = 0.66;
+/**
+ * Levenshtein similarity floor for the final fuzzy fallback (strategy 4).
+ * 0.66 was far too low — at two-thirds similarity a structurally-wrong window
+ * (e.g. one of three near-identical form blocks) clears the bar and gets spliced
+ * over, leaving the real target intact and duplicated. Competent agents anchor
+ * far tighter (opencode's BlockAnchor needs an exact anchor; cline needs exact
+ * first+last lines). 0.85 keeps genuine quantized-model drift (a typo, an indent
+ * shift) while refusing a different block.
+ */
+export const SIMILARITY_THRESHOLD = 0.85;
+
+/**
+ * If a second candidate window scores within this of the best, the match is
+ * ambiguous and tier 4 refuses rather than guessing — the same fail-closed
+ * stance the exact and whitespace tiers take on multiple hits. Repetitive files
+ * (the duplicate-block corruption case) produce near-tied windows; this is what
+ * turns that into a clean "add more context" error instead of a wrong splice.
+ */
+export const AMBIGUITY_EPSILON = 0.05;
+
+/** Multi-line needles at or above this length must anchor on an exact (after
+ *  trim + unicode-fold) first AND last line before similarity is even scored —
+ *  the cline/opencode block-anchor rule. Below it, threshold + uniqueness alone
+ *  guard the match. */
+const ANCHOR_MIN_LINES = 3;

 export function locateMatch(content: string, needle: string): MatchResult {
  // Empty needle has no meaningful match.
@@ -252,20 +284,39 @@ function locateByLevenshtein(content: string, needle: string): MatchResult | nul

  const needleJoined = needleLines.map((l) => l.trim()).join('\n');

-  let best = -1;
-  let bestSpan: { start: number; end: number } | null = null;
+  // Block-anchor gate for multi-line needles: the first and last lines must match
+  // exactly (after trim + unicode-fold) or the window is not even scored. This
+  // stops a high interior-similarity from dragging a structurally-wrong window
+  // over the threshold — the failure that duplicates blocks in repetitive files.
+  const anchored = n >= ANCHOR_MIN_LINES;
+  const needleFirst = canonicalize(needleLines[0]!.trim());
+  const needleLast = canonicalize(needleLines[n - 1]!.trim());
+
+  const scored: Array<{ score: number; start: number; end: number }> = [];
  for (let i = 0; i + n <= contentLines.length; i++) {
    const window = contentLines.slice(i, i + n);
-    const windowJoined = window.map((l) => l.text.trim()).join('\n');
-    const score = similarity(windowJoined, needleJoined);
-    if (score > best) {
-      best = score;
-      bestSpan = { start: window[0]!.start, end: window[n - 1]!.end };
+    if (anchored) {
+      const winFirst = canonicalize(window[0]!.text.trim());
+      const winLast = canonicalize(window[n - 1]!.text.trim());
+      if (winFirst !== needleFirst || winLast !== needleLast) continue;
    }
+    const windowJoined = window.map((l) => l.text.trim()).join('\n');
+    scored.push({
+      score: similarity(windowJoined, needleJoined),
+      start: window[0]!.start,
+      end: window[n - 1]!.end,
+    });
  }

-  if (bestSpan && best >= SIMILARITY_THRESHOLD) {
-    return { kind: 'fuzzy', start: bestSpan.start, end: bestSpan.end };
-  }
-  return null;
+  if (scored.length === 0) return null;
+  scored.sort((a, b) => b.score - a.score);
+  const best = scored[0]!;
+  if (best.score < SIMILARITY_THRESHOLD) return null;
+
+  // Uniqueness guard: refuse when a second window is within epsilon of the best.
+  // Fail closed (ambiguous) rather than silently splicing one of several lookalikes.
+  const tied = scored.filter((s) => s.score >= best.score - AMBIGUITY_EPSILON);
+  if (tied.length > 1) return { kind: 'ambiguous', count: tied.length };
+
+  return { kind: 'fuzzy', start: best.start, end: best.end };
 }
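
The tier-4 decision rule in the hunk above (anchor gate, then threshold, then uniqueness guard) can be exercised in isolation with a simplified line-based sketch. It is an illustration under assumptions, not the project's code: `pickWindow` and the `Pick` type are hypothetical, it returns a line index instead of character offsets, and it omits the unicode fold that `canonicalize` performs in the real matcher.

```typescript
// Illustrative re-creation of the tier-4 decision rule; names are hypothetical.
type Pick =
  | { kind: 'fuzzy'; line: number }
  | { kind: 'ambiguous'; count: number }
  | { kind: 'not_found' };

const SIMILARITY_THRESHOLD = 0.85;
const AMBIGUITY_EPSILON = 0.05;
const ANCHOR_MIN_LINES = 3;

// Standard iterative two-row Levenshtein distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Normalized similarity in [0, 1]; 1 means identical.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

function pickWindow(contentLines: string[], needleLines: string[]): Pick {
  const n = needleLines.length;
  if (n === 0) return { kind: 'not_found' };
  const needle = needleLines.map((l) => l.trim());
  const anchored = n >= ANCHOR_MIN_LINES;
  const scored: Array<{ score: number; line: number }> = [];
  for (let i = 0; i + n <= contentLines.length; i++) {
    const win = contentLines.slice(i, i + n).map((l) => l.trim());
    // Anchor gate: skip windows whose first or last line differs.
    if (anchored && (win[0] !== needle[0] || win[n - 1] !== needle[n - 1])) continue;
    scored.push({ score: similarity(win.join('\n'), needle.join('\n')), line: i });
  }
  scored.sort((a, b) => b.score - a.score);
  const best = scored[0];
  if (!best || best.score < SIMILARITY_THRESHOLD) return { kind: 'not_found' };
  // Uniqueness guard: a second window within epsilon of the best is a refusal.
  const tied = scored.filter((s) => s.score >= best.score - AMBIGUITY_EPSILON);
  return tied.length > 1
    ? { kind: 'ambiguous', count: tied.length }
    : { kind: 'fuzzy', line: best.line };
}

// Two near-identical form blocks, the repetitive-file case from the commit.
const demoFile = [
  '<form id="login">', '  <input name="user" />', '  <button>Send</button>', '</form>',
  '<form id="signup">', '  <input name="user" />', '  <button>Send</button>', '</form>',
];
// Four lines with exact first and last lines and a typo inside: unique match.
const anchoredPick = pickWindow(demoFile, [
  '<form id="login">', '<input name="usr" />', '<button>Send</button>', '</form>',
]);
// Two lines (below the anchor minimum) that fit both blocks equally: refused.
const tiedPick = pickWindow(demoFile, ['<input name="usr" />', '<button>Send</button>']);
```

In this sketch `anchoredPick` resolves to the first block because only one window survives the anchor gate, while `tiedPick` comes back `ambiguous`: both blocks score identically, which is the case the old best-score-wins loop would have spliced silently.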