Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.
MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).
executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).
Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.
step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.
332/332 server tests passing. No frontend changes. No schema changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Recon during planning disproved the original v1.13.7 (DB-cache) premise:
buildSystemPrompt already runs over inputs mtime-cached at the file layer
(BOOCHAT.md in system-prompt.ts:25, AGENTS.md global+per-project in
agents.ts:245), and DB scalars are byte-stable until edited. The output
is microsecond pure-string concat with no I/O. Skills aren't in the
prefix; tools live in a separate request body field alpha-sorted by
v1.13.3.
This batch closes the verification gap with instrumentation, not
implementation:
- system-prompt.ts: buildSystemPromptWithFingerprint canonical impl
computes SHA-256 over the assembled prefix, runs a per-session
Map<sessionId, lastHash> observer, emits PrefixFingerprint per call
and PrefixDrift (with field-level changed_inputs) on hash change.
buildSystemPrompt is now a thin shim returning .prompt.
- agents.ts: getAgentsMtimes accessor — cache-read only, no I/O.
- payload.ts: buildMessagesPayload takes optional log argument; when
passed, emits prefix-fingerprint (info) + prefix-drift (warn).
- turn.ts + sentinel-summaries.ts: pass ctx.log at 3 production call
sites; sentinel summaries log too so any drift across cap-hit /
doom-loop paths surfaces.
- system-prompt.test.ts: 4 new tests (byte-identical, no-drift-on-
stable, drift-fires-with-changed-inputs, cross-session-no-drift).
194/194 tests pass (was 190).
Smoke: 5 messages in a fresh session produced 7 prefix-fingerprint
logs (extras from buildMessagesPayload being called from sentinel
summary paths), all with identical prefix_hash and prefix_length=2907,
zero prefix-drift. Prefix is byte-stable in steady-state.
Decision: original system_prompt_cache DB table from the roadmap is
permanently dropped. The v1.12.0 mtime caches at the input layer plus
alpha tool ordering at the request body (v1.13.3) already address the
load-bearing cache-stability surfaces. Instrumentation stays so the
claim can be re-verified at any time.
- sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel,
runDoomLoopSummary, insertDoomLoopSentinel
- inference.ts → inference/turn.ts: residue is runAssistantTurn,
runInference, createInferenceRunner orchestration only
- inference/index.ts: re-export shim preserves the public surface
(createInferenceRunner, runInference, runAssistantTurn,
detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus
type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/
FramePublisher)
- src/index.ts + auto_name.ts + the two vitest test files updated to
import from ./services/inference/index.js explicitly (NodeNext ESM
doesn't honor directory-index resolution)
Final tally: 11 files under services/inference/, the largest being
sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept
side-by-side until a third sentinel justifies factoring out a shared
runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is
stream-phase.ts at 380. Public import surface unchanged.
tool-phase.ts → turn.ts back-edge for runAssistantTurn remains
(cycle is safe; resolved at call time).
Prepares the file structure for v1.13 AI SDK migration — streamText
swap targets stream-phase.ts only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>