# Write/edit robustness: fuzzy patch applier + worktree checkpoints

**Status:** in progress (started 2026-06-01)

**Source:** `boocode_code_review_v2.md` §1 #3 + #4, §5b/§5d–5e (cline, Apache-2.0; algorithm clean-reimplemented, not vendored).

Two independent BooCoder hardening features for local quantized models.

## #3: Fuzzy patch applier

**Problem:** `applyOne`'s edit case (`apps/coder/src/services/pending_changes.ts:124`) does exact `content.includes(oldStr)` → throw, then `content.replace(oldStr, newStr)` (first occurrence). `rewindOne` (line 206) is the same. Local models (qwen3.6) drift `old_string` by whitespace/indentation/unicode (curly quotes, en/em-dash, nbsp), so a valid edit fails at apply with "old_string not found" and is lost.

**Design:** new pure module `apps/coder/src/services/fuzzy-match.ts`:

`locateMatch(content: string, needle: string): { kind: 'exact'|'fuzzy'; start: number; end: number } | { kind: 'ambiguous'; count: number } | { kind: 'not_found' }`.

Match ladder:

1. **Exact** `indexOf`. If exactly one → exact span. If >1 → **ambiguous** (refuse; decision 2026-06-01: safer than silently editing the first).
2. **Per-line whitespace-insensitive**: compare `needle` lines to file line-windows ignoring per-line `trimEnd`/leading-trailing blank lines.
3. **Unicode canonicalization**: normalize curly→straight quotes, en/em-dash→`-`, nbsp→space on both sides, then retry the whitespace pass.
4. **Levenshtein** similarity ≥ 0.66 over line-windows sized to `needle`'s line count; best window wins.

Non-exact (fuzzy) matches return the actual file span so the caller replaces the real file text with `new_string`.

`pending_changes.ts` `applyOne`/`rewindOne` use `locateMatch`; `ambiguous`/`not_found` return `success:false` with a clear message (no throw escaping the existing catch). Unit-tested (`apps/coder/src/services/__tests__/fuzzy-match.test.ts`), per the `turn-guard.ts` pure-helper pattern.
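The match ladder under #3 can be sketched roughly as follows. This is a minimal illustration and not the shipped `fuzzy-match.ts`. It makes several assumptions that the design does not pin down: similarity is defined as `1 - distance / maxLength`, several equally good whitespace/unicode hits are treated as ambiguous, line endings are `\n`, and the helper names (`canonicalize`, `trimBlankEdges`, `levenshtein`) are hypothetical.

```typescript
type MatchResult =
  | { kind: 'exact' | 'fuzzy'; start: number; end: number }
  | { kind: 'ambiguous'; count: number }
  | { kind: 'not_found' };

// Rung 3 helper: map typographic characters to their ASCII equivalents.
const canonicalize = (s: string): string =>
  s.replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"')
    .replace(/[\u2013\u2014]/g, '-').replace(/\u00A0/g, ' ');

// Drop leading and trailing blank lines from the needle.
function trimBlankEdges(ls: string[]): string[] {
  let a = 0, b = ls.length;
  while (a < b && ls[a].trim() === '') a++;
  while (b > a && ls[b - 1].trim() === '') b--;
  return ls.slice(a, b);
}

// Classic two-row edit distance.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const cur = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      cur[j] = Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = cur;
  }
  return prev[b.length];
}

function locateMatch(content: string, needle: string): MatchResult {
  if (needle.length === 0) return { kind: 'not_found' };

  // Rung 1: exact match; more than one occurrence is refused as ambiguous.
  const first = content.indexOf(needle);
  if (first !== -1) {
    let count = 0;
    for (let i = first; i !== -1; i = content.indexOf(needle, i + 1)) count++;
    return count === 1
      ? { kind: 'exact', start: first, end: first + needle.length }
      : { kind: 'ambiguous', count };
  }

  // Index file lines by character offset so a window maps back to a real span.
  const lines = content.split('\n');
  const offsets: number[] = [];
  let pos = 0;
  for (const l of lines) { offsets.push(pos); pos += l.length + 1; }
  const needleLines = trimBlankEdges(needle.split('\n'));
  const n = needleLines.length;
  if (n === 0) return { kind: 'not_found' };
  const fuzzy = (i: number): MatchResult => ({
    kind: 'fuzzy',
    start: offsets[i],
    end: offsets[i + n - 1] + lines[i + n - 1].length,
  });

  // Rungs 2 and 3: compare line windows under progressively looser normalizers.
  const norms: Array<(s: string) => string> = [
    (s) => s.trimEnd(),
    (s) => canonicalize(s).trimEnd(),
  ];
  for (const norm of norms) {
    const want = needleLines.map(norm);
    const hits: number[] = [];
    for (let i = 0; i + n <= lines.length; i++) {
      if (want.every((w, k) => norm(lines[i + k]) === w)) hits.push(i);
    }
    if (hits.length === 1) return fuzzy(hits[0]);
    // Assumption: several normalized hits are refused, mirroring rung 1.
    if (hits.length > 1) return { kind: 'ambiguous', count: hits.length };
  }

  // Rung 4: best Levenshtein window, accepted at similarity >= 0.66.
  const loose = norms[1];
  const target = needleLines.map(loose).join('\n');
  let best = -1, bestScore = 0;
  for (let i = 0; i + n <= lines.length; i++) {
    const win = lines.slice(i, i + n).map(loose).join('\n');
    const maxLen = Math.max(win.length, target.length);
    const score = maxLen === 0 ? 1 : 1 - levenshtein(win, target) / maxLen;
    if (score > bestScore) { best = i; bestScore = score; }
  }
  return best !== -1 && bestScore >= 0.66 ? fuzzy(best) : { kind: 'not_found' };
}
```

A caller in the style of `applyOne` would then splice on the returned span, for example `content.slice(0, m.start) + newStr + content.slice(m.end)`, so that the real file text is replaced instead of the model's drifted `old_string`.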
## #4: Worktree checkpoint + conversation-trim

**Problem:** `rewind` only reverses BooCode's own `pending_changes` (applied to the project root). External agents (opencode/goose/qwen/claude) write **directly into the session worktree** (`/tmp/booworktrees/sess-`); rewind has zero coverage there.

**Schema** (`apps/coder/src/schema.sql`):

```sql
CREATE TABLE IF NOT EXISTS checkpoints (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  chat_id     UUID NOT NULL REFERENCES chats(id) ON DELETE CASCADE,
  session_id  UUID,
  worktree_id UUID REFERENCES worktrees(id) ON DELETE SET NULL,
  message_id  UUID,           -- anchor: the assistant turn row this checkpoint precedes
  commit_sha  TEXT NOT NULL,  -- shadow-commit capturing the pre-turn worktree tree
  label       TEXT,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS checkpoints_chat_created_idx ON checkpoints(chat_id, created_at);
```

**Create** (`apps/coder/src/services/checkpoints.ts` → `createCheckpoint`): hooked into the three external-agent dispatch paths in `dispatcher.ts` (`runWarmAcpTask` ~821, `runOpenCodeServerTask` ~513, `runExternalAgent` ~255), after `ensureSessionWorktree()` and the assistant-message insert (so the anchor `message_id` exists), before the backend runs. Snapshot captures tracked **+ untracked** via a temp-index shadow commit, stored in a private GC-safe ref:

```
cd && TMP=$(mktemp) && GIT_INDEX_FILE="$TMP" git read-tree HEAD \
  && GIT_INDEX_FILE="$TMP" git add -A \
  && TREE=$(GIT_INDEX_FILE="$TMP" git write-tree) \
  && SHA=$(git commit-tree "$TREE" -p HEAD -m "boocode checkpoint") \
  && git update-ref refs/boocode/checkpoints/ "$SHA" && rm -f "$TMP" && echo "$SHA"
```

Best-effort: a checkpoint failure logs and never breaks the turn. Native-boocode turns (project-root, rewind-covered) get no checkpoint.

**Restore** (`POST /api/sessions/:sessionId/checkpoints/:checkpointId/restore`, proxied `/api/coder/*`):

1. Resolve + validate the checkpoint belongs to the session.
2. Reset worktree: `git -C reset --hard && git -C clean -fd` (hostExec+shellEscape).
3. Trim transcript: `DELETE FROM messages WHERE chat_id = AND created_at >= (SELECT created_at FROM messages WHERE id = )` (+ explicit `message_parts` delete if the FK isn't ON DELETE CASCADE; verify).
4. Reset backend (decision 2026-06-01): `UPDATE agent_sessions SET status='crashed' WHERE chat_id=` and evict the live pool session for `(chat,agent)` if present, so the next turn re-establishes a fresh backend, leaving transcript, files, and agent context all consistent at the restore point. (Warm backends hold context server-side; no partial rewind exists.)
5. Delete now-orphaned later checkpoints: `DELETE FROM checkpoints WHERE chat_id=? AND created_at > `.
6. Return `{ checkpoint_id, messages_deleted, worktree_reset, backend_reset }`.

**Frontend:** per-message "Restore to here" in `CoderMessageList.tsx` (via a new optional `onRestoreCheckpoint?(chatId, messageId)` on `MessageActions` in `MessageBubble.tsx`), wired in `CoderPane.tsx`; guarded to `status==='complete'` and to messages that have a checkpoint. After the call returns, refetch the chat's messages (existing GET); no new WS frame required.

## Decisions (2026-06-01)

- Multi-exact-match → **refuse as ambiguous** (#3).
- #4 **full** scope incl. conversation-trim.
- Restore **resets** the external-agent backend session (context re-established fresh).

## Parallelization

- **Unit 1 (#3)**: fully independent (`fuzzy-match.ts` + `pending_changes.ts` + test).
- **Unit 2 (#4 backend)**: schema + `checkpoints.ts` (create+restore) + 3 dispatcher hooks + restore route + backend reset. One agent owns all #4 coder backend (shared `checkpoints.ts`).
- **Unit 3 (#4 frontend)**: `CoderMessageList`/`MessageBubble`/`CoderPane`, against the pinned restore contract. Parallel with Unit 2. MUST NOT touch Sam's uncommitted WIP (`ChatTabBar`, `SessionLandingPage`, `Workspace`, `useWorkspacePanes`, `PaneHeaderActions`).
## Verify

- `pnpm -C apps/coder test` (incl. new `fuzzy-match` + any checkpoint pure-helper tests)
- `pnpm -C apps/server build` then `pnpm -C apps/coder build`
- `npx tsc -p apps/web/tsconfig.app.json --noEmit`
- Live smoke (manual, host): external-agent edit → checkpoint row; "Restore to here" → worktree reset + transcript trimmed + next turn fresh.
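
One candidate for the checkpoint pure-helper tests listed above is the command construction for the worktree reset in the restore flow, since it can be checked without running git. The sketch below is an assumption about how such a helper might look. `buildRestoreCommand` is a hypothetical name, the local `shellEscape` is a stand-in for the project's own helper (assumed to do POSIX single-quoting), and the SHA validation is an extra guard not stated in the plan.

```typescript
// Stand-in for the project's shellEscape (assumption: POSIX sh single-quoting).
function shellEscape(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical pure helper: builds the reset + clean command for a restore.
// The commit id is validated as a full hex object name (SHA-1 or SHA-256)
// so nothing other than a hash can reach the shell unquoted.
function buildRestoreCommand(worktreePath: string, commitSha: string): string {
  if (!/^(?:[0-9a-f]{40}|[0-9a-f]{64})$/.test(commitSha)) {
    throw new Error(`refusing to restore: invalid commit sha "${commitSha}"`);
  }
  const wt = shellEscape(worktreePath);
  return `git -C ${wt} reset --hard ${commitSha} && git -C ${wt} clean -fd`;
}
```

Keeping the string construction pure leaves only the `hostExec` call itself for the manual live smoke.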