feat: MistakeTracker + file-provenance ledger (v2.7.4)
Two native-inference hardening features from boocode_code_review_v2 §1 #12.

MistakeTracker: new pure mistake-tracker.ts tracks consecutive heterogeneous tool failures (kinds surfaced per tool from tool-phase.ts). On 3 in a row the turn loop soft-nudges (model-facing recovery guidance + mistake_recovery sentinel + reset), then escalates to stopping the turn (cap-hit-style, Continue affordance) on a re-trip. Complements doom-loop (identical repeats) + cap-hit.

File-provenance ledger: compaction.ts derives a deterministic ## Files Read list from the head messages' read-tool calls and injects it into the rolling-summary prompt so provenance survives compaction (no new table; read-only).

mistake_recovery sentinel: MessageMetadata arm (server + web) + MessageBubble render branch.

Built by 2 parallel agents. Server 545 tests passing (23 new); build + web tsc clean. Native-inference only. Builds on v2.7.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
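
The nudge-then-escalate behaviour described for MistakeTracker can be sketched as a small pure state machine. This is a minimal illustration, not the contents of mistake-tracker.ts: the names `TrackerState`, `recordFailure`, `recordSuccess` and `THRESHOLD` are hypothetical. It also assumes that every consecutive failure counts toward the run (identical repeats being left to the doom-loop detector) and that a success clears the run but not the "already nudged" flag.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the real module.
type TrackerAction = 'none' | 'nudge' | 'escalate';

interface TrackerState {
  failureKinds: string[]; // failure kinds seen in the current consecutive run
  nudged: boolean;        // true once a recovery nudge has fired this turn
}

const THRESHOLD = 3; // "3 in a row" from the commit message

function initialState(): TrackerState {
  return { failureKinds: [], nudged: false };
}

// A successful tool call breaks the consecutive run (assumption: the
// nudged flag survives until a fresh user turn creates a new state).
function recordSuccess(state: TrackerState): TrackerState {
  return { failureKinds: [], nudged: state.nudged };
}

// Pure step: returns the next state plus the action the turn loop should take.
function recordFailure(
  state: TrackerState,
  kind: string,
): { state: TrackerState; action: TrackerAction } {
  const failureKinds = state.failureKinds.concat(kind);
  if (failureKinds.length < THRESHOLD) {
    return { state: { failureKinds, nudged: state.nudged }, action: 'none' };
  }
  if (state.nudged) {
    // Second trip after a nudge: the caller stops the turn.
    return { state: { failureKinds, nudged: true }, action: 'escalate' };
  }
  // First trip: nudge and reset the run so the model gets a fresh attempt.
  return { state: { failureKinds: [], nudged: true }, action: 'nudge' };
}
```

Keeping the tracker free of I/O means the turn loop alone decides what a 'nudge' or 'escalate' does (inject guidance, insert the sentinel row, stop the turn), and the thresholds can be unit-tested without a database.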
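
The file-provenance ledger from the commit message could look roughly like the sketch below: collect the paths of read-tool calls from the messages about to be compacted, de-duplicate and sort them, and render a `## Files Read` section for the summary prompt. The message shape (`HeadMessage`, `ToolCall`), the tool name `read_file`, and the function name `buildFilesReadSection` are assumptions made for illustration; the real types in compaction.ts are not shown on this page.

```typescript
// Hypothetical shapes; the real message and tool-call types are not shown here.
interface ToolCall {
  name: string;
  args: { path?: string };
}

interface HeadMessage {
  toolCalls?: ToolCall[];
}

// Assumed name of the read tool.
const READ_TOOL = 'read_file';

function buildFilesReadSection(head: HeadMessage[]): string {
  const paths = new Set<string>();
  for (const msg of head) {
    for (const call of msg.toolCalls ?? []) {
      // Only read calls count as provenance; writes and other tools are ignored.
      if (call.name === READ_TOOL && call.args.path) {
        paths.add(call.args.path);
      }
    }
  }
  if (paths.size === 0) return '';
  // Sorting makes the output deterministic: same head messages, same text.
  const sorted = Array.from(paths).sort();
  return ['## Files Read'].concat(sorted.map((p) => `- ${p}`)).join('\n');
}
```

Because the list is derived from the messages themselves each time, nothing has to be persisted, which matches the "no new table; read-only" note in the commit message.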
@@ -717,3 +717,57 @@ async function insertDoomLoopSentinel(
     metadata,
   });
 }
+
+// #12 MistakeTracker: heterogeneous-failure recovery sentinel. Mirrors
+// insertDoomLoopSentinel structurally — a role='system', status='complete' row
+// firing the standard message_started → delta → message_complete frame
+// sequence. Two variants distinguished by `escalated`:
+//   - escalated:false → a nudge fired; recovery guidance was injected into the
+//     model's next step and the loop continued. can_continue is true (the turn
+//     is still live).
+//   - escalated:true → the nudge didn't break the failure run; the turn was
+//     stopped (cap-hit-style). can_continue is true so the UI can still offer a
+//     Continue affordance — a fresh user turn resets the tracker.
+export async function insertMistakeRecoverySentinel(
+  ctx: InferenceContext,
+  sessionId: string,
+  chatId: string,
+  opts: { failureKinds: string[]; count: number; escalated: boolean; canContinue: boolean },
+): Promise<void> {
+  const metadata: MessageMetadata = {
+    kind: 'mistake_recovery',
+    failure_kinds: opts.failureKinds,
+    count: opts.count,
+    escalated: opts.escalated,
+    can_continue: opts.canContinue,
+  };
+  const content = opts.escalated
+    ? `Repeated different errors persisted after a recovery nudge (${opts.count} in a row). Stopping the tool-call loop.`
+    : `Hit ${opts.count} different errors in a row. Injected recovery guidance and continuing.`;
+
+  const [row] = await ctx.sql<{ id: string }[]>`
+    INSERT INTO messages (session_id, chat_id, role, content, status, created_at, metadata)
+    VALUES (${sessionId}, ${chatId}, 'system', ${content}, 'complete', clock_timestamp(), ${ctx.sql.json(metadata as never)})
+    RETURNING id
+  `;
+
+  // Standard frame sequence — same as cap-hit / doom-loop sentinels.
+  ctx.publish(sessionId, {
+    type: 'message_started',
+    message_id: row!.id,
+    chat_id: chatId,
+    role: 'system',
+  });
+  ctx.publish(sessionId, {
+    type: 'delta',
+    message_id: row!.id,
+    chat_id: chatId,
+    content,
+  });
+  ctx.publish(sessionId, {
+    type: 'message_complete',
+    message_id: row!.id,
+    chat_id: chatId,
+    metadata,
+  });
+}
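
On the consuming side, the two sentinel variants can be told apart from the metadata alone. The sketch below shows one way a client might map the `mistake_recovery` metadata to display state. The field names match the metadata object in the diff above; the `SentinelView` shape and the function `viewForMistakeRecovery` are hypothetical and are not taken from the MessageBubble render branch mentioned in the commit message.

```typescript
// Field names follow the metadata written by insertMistakeRecoverySentinel.
interface MistakeRecoveryMetadata {
  kind: 'mistake_recovery';
  failure_kinds: string[];
  count: number;
  escalated: boolean;
  can_continue: boolean;
}

// Hypothetical display model; the real UI component is not shown on this page.
interface SentinelView {
  tone: 'info' | 'warning';
  showContinue: boolean;
  detail: string;
}

function viewForMistakeRecovery(meta: MistakeRecoveryMetadata): SentinelView {
  // Collapse repeated kinds so the detail line stays short.
  const uniqueKinds = Array.from(new Set(meta.failure_kinds));
  return {
    tone: meta.escalated ? 'warning' : 'info',
    // Assumption: a Continue button only makes sense once the turn has been
    // stopped; after a plain nudge the turn is still running.
    showContinue: meta.escalated && meta.can_continue,
    detail: `${meta.count} consecutive failures (${uniqueKinds.join(', ')})`,
  };
}
```

Since `can_continue` is true in both variants per the code comment, `escalated` is the field that actually distinguishes a running turn from a stopped one in this sketch.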