Two native-inference hardening features from boocode_code_review_v2 §1 #12.

MistakeTracker: new pure mistake-tracker.ts tracks consecutive heterogeneous tool failures (kinds surfaced per tool from tool-phase.ts). On 3 in a row the turn loop soft-nudges (model-facing recovery guidance + mistake_recovery sentinel + reset), then escalates to stopping the turn (cap-hit-style, Continue affordance) on a re-trip. Complements doom-loop (identical repeats) + cap-hit.

File-provenance ledger: compaction.ts derives a deterministic ## Files Read list from the head messages' read-tool calls and injects it into the rolling-summary prompt so provenance survives compaction (no new table; read-only).

mistake_recovery sentinel: MessageMetadata arm (server + web) + MessageBubble render branch.

Built by 2 parallel agents. Server 545 tests passing (23 new); build + web tsc clean. Native-inference only. Builds on v2.7.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MistakeTracker + file-provenance ledger (#12)
Status: in progress (started 2026-06-01)
Source: boocode_code_review_v2.md §1 #12, §5e (cline — algorithm-reimplemented, not vendored).
Two native-inference (apps/server) hardening features. One cohesive backend change (they share
TurnArgs + the tool-phase observation point) + a small frontend sentinel render.
Part A — MistakeTracker (heterogeneous-failure recovery)
Complements the doom-loop guard (sentinels.ts:detectDoomLoop, which only catches identical
repeats) by catching a run of consecutive tool failures the model isn't recovering from.
- New pure apps/server/src/services/inference/mistake-tracker.ts (mirrors detectDoomLoop):
  - FailureKind = 'zod_reject' | 'tool_not_found' | 'exec_error' | 'api_error' | 'permission_denied' (all already distinguished in tool-phase.ts:executeToolCall).
  - MISTAKE_THRESHOLD = 3.
  - State { run: FailureKind[]; nudges: number }: run is the current consecutive-failure streak, reset on ANY successful tool step; nudges counts recovery injections not yet cleared by a success.
  - recordStep(state, outcome), where outcome is a failure kind or 'success'.
  - detectMistakePattern(state): 'nudge' | 'escalate' | null. run.length >= 3 → 'nudge' the first time (nudges === 0), 'escalate' if it trips again while nudges >= 1 (no intervening success).
- Lives in TurnArgs (loop-local, reset per runInference, like recentToolCalls).
- Integration in turn.ts loop: after each tool phase, recordStep per tool outcome; then detectMistakePattern:
  - 'nudge' (decision: soft + escalate): append a transient model-facing recovery-guidance system message to the NEXT turn's payload (re-read schemas, verify paths exist before acting, try a different approach, not retry variations), insert a mistake_recovery UI sentinel (escalated:false), bump nudges, reset run. Loop continues.
  - 'escalate': stop the turn (break), insert a mistake_recovery sentinel (escalated:true, can_continue:true, cap-hit-style), finalize. Prevents heterogeneous failures from burning the whole step budget.
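
The tracker and its loop hook described above can be sketched as a few pure functions. This is a minimal sketch under stated assumptions, not the shipped mistake-tracker.ts: the names StepOutcome, MistakeState, MistakeAction, initialMistakeState, acknowledgeNudge, and afterToolPhase are hypothetical, and clearing nudges on a success is inferred from "not yet cleared by a success".

```typescript
type FailureKind =
  | 'zod_reject'
  | 'tool_not_found'
  | 'exec_error'
  | 'api_error'
  | 'permission_denied';

// Hypothetical names: the plan only fixes FailureKind, the state fields,
// MISTAKE_THRESHOLD, recordStep, and detectMistakePattern.
type StepOutcome = FailureKind | 'success';
type MistakeAction = 'nudge' | 'escalate' | null;

interface MistakeState {
  readonly run: readonly FailureKind[]; // current consecutive-failure streak
  readonly nudges: number; // recovery injections not yet cleared by a success
}

const MISTAKE_THRESHOLD = 3;

function initialMistakeState(): MistakeState {
  return { run: [], nudges: 0 };
}

// Pure: returns a new state instead of mutating the old one.
function recordStep(state: MistakeState, outcome: StepOutcome): MistakeState {
  if (outcome === 'success') {
    // Assumption: any success clears both the streak and pending nudges.
    return { run: [], nudges: 0 };
  }
  return { run: [...state.run, outcome], nudges: state.nudges };
}

function detectMistakePattern(state: MistakeState): MistakeAction {
  if (state.run.length < MISTAKE_THRESHOLD) return null;
  return state.nudges === 0 ? 'nudge' : 'escalate';
}

// Hypothetical helper for the "bump nudges, reset run" step after a nudge.
function acknowledgeNudge(state: MistakeState): MistakeState {
  return { run: [], nudges: state.nudges + 1 };
}

// Hypothetical loop hook: fold one tool phase's outcomes into the state,
// then report which intervention (if any) the turn loop should apply.
function afterToolPhase(
  state: MistakeState,
  outcomes: readonly StepOutcome[],
): { state: MistakeState; action: MistakeAction } {
  let next = state;
  for (const outcome of outcomes) {
    next = recordStep(next, outcome);
  }
  const action = detectMistakePattern(next);
  if (action === 'nudge') {
    next = acknowledgeNudge(next);
  }
  return { state: next, action };
}
```

In this sketch the caller would inject the recovery guidance and the escalated:false sentinel when action is 'nudge', and break out of the loop when it is 'escalate'. Keeping the functions pure means the state can live in TurnArgs and be unit-tested without the loop.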
Part B — File-provenance ledger (Read-only)
- Accumulate file paths read by view_file / grep / find_files / list_dir into TurnArgs.filesRead: Set<string> (recorded at the tool phase, like the failure outcomes).
- On compaction (compaction.ts:buildPrompt), inject a deterministic, sorted ## Files Read list into the summary prompt context so the summarizer merges it into the rolling summary. No new table/column; it propagates as summary text across compactions. compaction-prompt.ts's SUMMARY_TEMPLATE already has a ## Relevant Files section to extend/merge with.
- BooChat is read-only (no write tools on apps/server) → "Files Modified" is N/A here; only "Files Read". (The apps/coder write side can add "Modified" later.)
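
The ledger steps above can be sketched as two small functions: one collects paths from read-tool calls, the other renders the deterministic list. This is an illustrative sketch only. The ToolCallRecord shape and the function names collectFilesRead and renderFilesRead are hypothetical, and it assumes each read tool exposes a single path argument, which the plan does not specify.

```typescript
// Tool names come from the plan; everything else here is an assumption.
const READ_TOOLS = new Set<string>(['view_file', 'grep', 'find_files', 'list_dir']);

// Hypothetical minimal record of a tool call; the real shape is not given.
interface ToolCallRecord {
  name: string;
  path?: string;
}

// Collect the distinct paths touched by read tools; other tools are ignored.
function collectFilesRead(calls: readonly ToolCallRecord[]): Set<string> {
  const filesRead = new Set<string>();
  for (const call of calls) {
    if (READ_TOOLS.has(call.name) && call.path) {
      filesRead.add(call.path);
    }
  }
  return filesRead;
}

// Render a stable section for the summary prompt. The default sort compares
// UTF-16 code units, so the output does not depend on locale or call order.
function renderFilesRead(filesRead: ReadonlySet<string>): string {
  if (filesRead.size === 0) return '';
  const sorted = Array.from(filesRead).sort();
  const lines = ['## Files Read'];
  for (const path of sorted) {
    lines.push(`- ${path}`);
  }
  return lines.join('\n');
}
```

Because the list is sorted and deduplicated, two compactions over the same reads produce byte-identical text, which keeps the rolling summary stable.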
Sentinel contract (pinned — backend + frontend must match)
New sentinel kind on MessageMetadata in BOTH apps/server/src/types/api.ts AND
apps/web/src/api/types.ts:
{ kind: 'mistake_recovery'; failure_kinds: string[]; count: number; escalated: boolean; can_continue?: boolean }
- role='system', status='complete'; stripped from the LLM payload via isAnySentinel in payload.ts (UI-only) and compaction.ts:buildHeadPayload.
- Frontend render branch in apps/web/src/components/MessageBubble.tsx (mirror the doom-loop/cap-hit branches):
  - escalated:false → "Hit repeated different errors — recovery guidance injected, continuing."
  - escalated:true → "Repeated errors persisted — stopped the turn."
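
The contract above can be illustrated with the metadata arm, a payload filter, and the two render strings. This is a hedged sketch, not the real code: SketchMessage, isMistakeRecovery, stripMistakeRecovery, and mistakeRecoveryLabel are hypothetical names, the message shape is an assumption, and the real filter (isAnySentinel) also covers other sentinel kinds that are not modeled here.

```typescript
// The metadata arm exactly as pinned in the contract.
interface MistakeRecoveryMetadata {
  kind: 'mistake_recovery';
  failure_kinds: string[];
  count: number;
  escalated: boolean;
  can_continue?: boolean;
}

// Hypothetical minimal message shape, for illustration only.
interface SketchMessage {
  role: 'system' | 'user' | 'assistant';
  status: string;
  content: string;
  metadata?: MistakeRecoveryMetadata;
}

function isMistakeRecovery(msg: SketchMessage): boolean {
  return msg.metadata !== undefined && msg.metadata.kind === 'mistake_recovery';
}

// UI-only sentinels must never reach the model, so drop them from the payload.
function stripMistakeRecovery(messages: readonly SketchMessage[]): SketchMessage[] {
  return messages.filter((msg) => !isMistakeRecovery(msg));
}

// Text for the two render branches, copied from the contract.
function mistakeRecoveryLabel(meta: MistakeRecoveryMetadata): string {
  return meta.escalated
    ? 'Repeated errors persisted — stopped the turn.'
    : 'Hit repeated different errors — recovery guidance injected, continuing.';
}
```

Keeping the label in one function makes it easier to hold the server-side summary and the MessageBubble branch to the same wording.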
Decisions (2026-06-01)
- MistakeTracker intervention: soft nudge + escalate.
- UI sentinel for recovery (mistake_recovery).
Files (backend, one agent) / (frontend, one agent)
- Backend: mistake-tracker.ts (new), turn.ts, tool-phase.ts, sentinels.ts, sentinel-summaries.ts, payload.ts, compaction.ts, compaction-prompt.ts, types/api.ts + tests (mistake-tracker.test.ts, ledger/compaction assertions).
- Frontend: apps/web/src/api/types.ts (MessageMetadata arm) + MessageBubble.tsx (render branch). MUST NOT touch Sam's WIP web files.
Verify
pnpm -C apps/server test; pnpm -C apps/server build; npx tsc -p apps/web/tsconfig.app.json --noEmit