
indifferentketchup f4a97808ad v1.14.0-outer-loop: explicit while loop replaces inference recursion

Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).

Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.

step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 20:29:21 +00:00


v1.14.0-outer-loop — explicit outer agent loop

Replace the ad-hoc executeToolPhase → runAssistantTurn recursion with an explicit while loop. A step is one stream-and-tool-execute iteration; a single step can contain multiple parallel tool calls. The loop terminates on any of the following: non-tool finish, step-cap hit, doom-loop, budget exhaustion, abort, or synthesis success.

Why

The current recursion works but has two problems: (a) stack depth grows linearly with tool iterations, and 50 nested async frames is fragile; (b) there is no explicit step counter, so there is no per-agent step cap and no step-boundary instrumentation. BooChat also gets stuck at 50 tool calls (the budget ceiling) more often than it should. The new MAX_STEPS = 200 hard ceiling lets the loop run much longer before the step cap fires, while the existing budget (50 tool calls) remains a separate concern.

Recon findings (verified 2026-05-23)

runAssistantTurn at turn.ts:144-147 is the recursive entry. Returns Promise<void>.
executeToolPhase at tool-phase.ts:89-96 calls back into runAssistantTurn at tool-phase.ts:342.
Recursion terminates on: non-tool finish, budget exhaustion (args.toolsUsed >= budget), doom-loop (3 identical calls via detectDoomLoop), user-input pause (ask_user_input / request_read_access), synthesis success, stream error, abort.
No existing hard recursion depth limit — MAX_TOOL_LOOP_DEPTH does not exist. Safety comes from budget (50) + doom-loop (3 identical).
TurnArgs defined in turn.ts:127-141, not types.ts. Fields: sessionId, chatId, assistantMessageId, toolsUsed, recentToolCalls, signal. All mutable fields are threaded through the recursive call.
Synthesis pipeline (synthesisPipeline.ts) is a branch in executeToolPhase — if synthesis succeeds, recursion is skipped.
step_start already in the message_parts.kind CHECK constraint. No schema change needed.
agents.ts does NOT currently parse a steps field. Needs adding to ParsedFrontmatter.

Scope

S1. Outer loop in `turn.ts`

Convert the recursive chain to a while (stepNumber < effectiveCap) loop:

let stepNumber = 0
while (stepNumber < effectiveCap) {
  // doom-loop check
  // budget check
  // emit step_start part
  // stream phase (executeStreamPhase)
  // if no tool calls → finalize, break
  // tool phase (executeToolPhase — now returns, doesn't recurse)
  // if paused (user input / grant) → break
  // if synthesis succeeded → break
  // create next assistant message row
  // increment stepNumber, update toolsUsed, append recentToolCalls
}
// if stepNumber >= effectiveCap → sentinel summary

effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS) where MAX_STEPS = 200.
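
The cap resolution and loop skeleton above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code in turn.ts: AgentConfig, StepOutcome, ExitReason, resolveEffectiveCap, and runOuterLoop are hypothetical names, the real phases are async, and the doom-loop, budget, and abort checks are left out. The steps: 0 text-only path is also not modeled here.

```typescript
// Hard ceiling from the spec; applies to every agent.
const MAX_STEPS = 200;

// Hypothetical shapes for illustration only; the real TurnArgs and phase
// results in turn.ts / tool-phase.ts are richer than this.
type StepOutcome = "finished" | "continue" | "paused" | "synthesis_done";
type ExitReason = StepOutcome | "step_cap";

interface AgentConfig {
  // Per-agent cap from AGENTS.md frontmatter; unset means no tighter cap.
  steps?: number;
}

// Resolve the cap exactly as the spec states it.
function resolveEffectiveCap(agent: AgentConfig): number {
  return Math.min(agent.steps ?? Infinity, MAX_STEPS);
}

// Drive the loop with a caller-supplied step function that stands in for
// "stream phase + tool phase". Returns why the loop stopped and how many
// steps were started.
function runOuterLoop(
  agent: AgentConfig,
  runStep: (stepNumber: number) => StepOutcome,
): { reason: ExitReason; steps: number } {
  const effectiveCap = resolveEffectiveCap(agent);
  let stepNumber = 0;
  while (stepNumber < effectiveCap) {
    const outcome = runStep(stepNumber);
    if (outcome !== "continue") {
      // Non-tool finish, pause, or synthesis success: break out of the loop.
      return { reason: outcome, steps: stepNumber + 1 };
    }
    stepNumber += 1;
  }
  // Falling out of the loop means the cap was reached; the spec writes a
  // sentinel summary at this point.
  return { reason: "step_cap", steps: stepNumber };
}
```

With these assumptions, an agent with steps: 5 whose step function always asks to continue exits with reason "step_cap" after five steps, and an agent with no steps field is bounded by MAX_STEPS alone.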

S2. `executeToolPhase` becomes non-recursive

Remove the runAssistantTurn call at tool-phase.ts:342. Instead, return a result indicating what happened: {action: 'continue' | 'paused' | 'synthesis_done', toolsUsed, recentToolCalls, nextAssistantId}. The caller (the while loop) uses the action to decide whether to continue or break.
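
One way to express this return value is a result type whose action field the caller switches on. The field names below come from the spec; the field types (number, string | null, and the ToolCallRecord shape) are assumptions, and shouldContinue is a hypothetical helper rather than part of the codebase.

```typescript
// Assumed shape of one recorded tool call; the real type is not shown in the spec.
interface ToolCallRecord {
  name: string;
  argsJson: string;
}

// Field names follow the spec; the field types are assumptions.
interface ToolPhaseResult {
  action: "continue" | "paused" | "synthesis_done";
  toolsUsed: number;
  recentToolCalls: ToolCallRecord[];
  nextAssistantId: string | null;
}

// Hypothetical helper: the while loop keeps going only on 'continue'.
function shouldContinue(result: ToolPhaseResult): boolean {
  switch (result.action) {
    case "continue":
      return true;
    case "paused":
    case "synthesis_done":
      return false;
    default: {
      // Exhaustiveness guard: a new action value becomes a compile error here.
      const unreachable: never = result.action;
      throw new Error("Unhandled action: " + String(unreachable));
    }
  }
}
```

Switching on a closed union means that adding a fourth action later forces every caller to handle it, which suits a loop where each action maps to either continue or break.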

S3. `agent.steps` field

agents.ts:ParsedFrontmatter gains steps?: number. Parser extracts it from YAML frontmatter (integer ≥ 0). steps: 0 means "no tool calls allowed" — loop body never executes; assistant responds text-only.
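
The validation rule (integer ≥ 0) could look something like the sketch below. It assumes the YAML parser has already produced a plain JavaScript value, and it treats anything invalid as unset; the spec does not say how invalid values should be handled, so that choice is an assumption. parseSteps is a hypothetical helper name.

```typescript
// Only the new field is shown; the real ParsedFrontmatter has more fields.
interface ParsedFrontmatter {
  steps?: number;
}

// Hypothetical helper: accept only integers >= 0, otherwise report "unset".
// Treating invalid input as unset (rather than throwing) is an assumption.
function parseSteps(raw: unknown): number | undefined {
  if (typeof raw !== "number") return undefined;
  if (!Number.isInteger(raw) || raw < 0) return undefined;
  return raw;
}

// Example: build the frontmatter fragment from an already-parsed YAML object.
function readStepsField(yaml: { [key: string]: unknown }): ParsedFrontmatter {
  const steps = parseSteps(yaml["steps"]);
  return steps === undefined ? {} : { steps };
}
```

Note that steps: 0 stays distinct from unset in this sketch, which matters because the spec gives 0 its own meaning ("no tool calls allowed").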

S4. Step-boundary events

At the top of each loop iteration, emit a step_start part with payload {step_number, started_at}. Uses insertParts into the current assistant message. No step_finish — the next step_start (or message completion) implicitly ends the previous step.
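
As an illustration of the payload only, a step_start part might be built like this. The kind and the payload keys step_number and started_at come from the spec; representing started_at as an ISO 8601 string, and the StepStartPart and buildStepStart names, are assumptions. The call to insertParts is omitted because its signature is not shown here.

```typescript
// Payload keys follow the spec; the ISO-string timestamp format is an assumption.
interface StepStartPart {
  kind: "step_start";
  payload: {
    step_number: number;
    started_at: string;
  };
}

// Hypothetical builder; `now` is injectable so tests can pin the timestamp.
function buildStepStart(stepNumber: number, now: Date = new Date()): StepStartPart {
  return {
    kind: "step_start",
    payload: {
      step_number: stepNumber,
      started_at: now.toISOString(),
    },
  };
}
```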

S5. Doom-loop migration

detectDoomLoop check moves from runAssistantTurn (top of function, pre-stream) to the top of the while-loop body (same logical position). Same predicate, same threshold (3). Same runDoomLoopSummary call. Control flow changes from return (unwinding recursion) to break (exiting loop).
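
For readers unfamiliar with the check, a predicate of this kind can be sketched as below. This is not the real detectDoomLoop, whose implementation is not shown in this document. The sketch assumes "3 identical calls" means the three most recent calls share the same tool name and the same serialized arguments; looksLikeDoomLoop and ToolCallRecord are hypothetical names.

```typescript
// Threshold from the spec.
const DOOM_LOOP_THRESHOLD = 3;

// Assumed shape of one recorded tool call.
interface ToolCallRecord {
  name: string;
  argsJson: string;
}

// Hypothetical stand-in for detectDoomLoop: true when the last `threshold`
// calls are identical by name and serialized arguments.
function looksLikeDoomLoop(
  recent: ToolCallRecord[],
  threshold: number = DOOM_LOOP_THRESHOLD,
): boolean {
  if (recent.length < threshold) return false;
  const tail = recent.slice(-threshold);
  const first = tail[0];
  return tail.every(
    (call) => call.name === first.name && call.argsJson === first.argsJson,
  );
}
```

Inside the while loop, a true result would be followed by the runDoomLoopSummary call and then break, where the recursive version used return.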

S6. Step-cap sentinel

When stepNumber >= effectiveCap, write a sentinel summary modeled on the existing runCapHitSummary. Either reuse runCapHitSummary with a reason parameter that distinguishes "budget exhaustion" from "step cap hit", or create a parallel runStepCapSummary. The sentinel makes the cap visible in chat.
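
The "reason parameter" option might look like the sketch below. The cap_hit kind and the phrases "Step limit reached" and "Tool budget exhausted" come from the commit message; everything else (CapReason, SentinelPart, buildCapHitSentinel, and the wording after each phrase) is an assumption for illustration.

```typescript
// Hypothetical reason discriminator for the shared sentinel.
type CapReason = "budget" | "step_cap";

// Both reasons reuse the cap_hit kind, so the existing renderer needs no changes.
interface SentinelPart {
  kind: "cap_hit";
  text: string;
}

// Hypothetical builder: same kind for both reasons, different text.
function buildCapHitSentinel(reason: CapReason, limit: number): SentinelPart {
  const text =
    reason === "step_cap"
      ? `Step limit reached (${limit} steps).`
      : `Tool budget exhausted (${limit} tool calls).`;
  return { kind: "cap_hit", text };
}
```

Keeping one kind and varying only the text is what lets the frontend stay untouched; a separate runStepCapSummary would work equally well if the two summaries diverge later.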

S7. AGENTS.md updates

Add steps: to each agent in data/AGENTS.md:

Refactorer: steps: 5
Architect: steps: 20
All others: unset (infinity — bounded only by MAX_STEPS = 200)

S8. Tests

New test file apps/server/src/services/__tests__/outer-loop.test.ts covering:

Clean finish (stream returns non-tool, loop exits after 1 iteration)
Step-cap hit (loop exits at cap, sentinel written)
Doom-loop break (3 identical calls, sentinel written)
Budget exhaustion (toolsUsed >= budget, cap-hit sentinel written)
Abort mid-step (signal fires, loop exits)
steps: 0 edge case (no loop iterations, text-only response)
Synthesis success (loop exits after synthesis)

Non-goals

No frontend changes. step_start parts surface via messages_with_parts automatically; UI doesn't render them in v1.14.
No output_schema / exit_expression / execution_strategy AGENTS.md fields.
No per-step snapshot for revert (v2.0 BooCoder concern).
No changes to budget constants (50 / 10 / 50). That's a separate concern.
No repairToolCall changes.
No compaction changes.

Hard rules

No git commit or push. Sam commits.
Backup before editing.
TS strict, no any.
Doom-loop threshold stays at 3.
All 332+ existing tests still pass, plus the new outer-loop tests.

Files expected to touch

apps/server/src/services/inference/turn.ts — recursion → loop
apps/server/src/services/inference/tool-phase.ts — remove recursive call, return result struct
apps/server/src/services/inference/sentinel-summaries.ts — step-cap sentinel (or extend cap-hit)
apps/server/src/services/agents.ts — parse steps field
data/AGENTS.md — add steps: to Refactorer + Architect
apps/server/src/services/__tests__/outer-loop.test.ts — NEW
apps/server/src/services/inference/index.ts — re-export if new types needed

Estimate

~300 LoC net (turn.ts refactor + tool-phase return struct + agents parser + tests). The conversion is structural, not behavioral — every exit path is preserved, just expressed as loop control flow instead of recursion unwinding.
