Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-tool-execute iteration; the loop terminates on non-tool finish, step-cap hit, doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 is the hard ceiling (4x the old effective limit from budget). A per-agent steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5, Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution: effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses: it returns a ToolPhaseResult struct (action: 'continue' | 'paused' | 'synthesis_done') so the caller decides whether to continue or break.

steps: 0 is handled as "no tool calls allowed" via runTextOnlyTurn (one text-only stream phase, tool calls ignored with a warn log).

Step-cap hits produce a sentinel summary (reuses the cap_hit kind so CapHitSentinel.tsx renders without frontend changes; the text distinguishes "Step limit reached" from "Tool budget exhausted").

Doom-loop check migrated to the top of the loop body: same predicate, same threshold (3), break instead of return.

step_start parts are in the schema CHECK but are not emitted as message_parts: writing one before the stream phase creates a sequence-0 collision with partsFromAssistantMessage. A structured log line is emitted instead. Adversarial review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.14.0-outer-loop — design decisions
Answers to the dispatch's blocking questions, resolved 2026-05-23.
D1. Step cap — what replaces MAX_TOOL_LOOP_DEPTH?
MAX_TOOL_LOOP_DEPTH never existed — no hard recursion depth guard was ever in the codebase. Safety came from budget (50 tool calls) + doom-loop (3 identical calls).
Decision: introduce MAX_STEPS = 200 as a hard ceiling. Per-agent cap via agent.steps is the primary knob. Resolution: effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).
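The resolution rule above can be sketched as a small helper. This is a minimal illustration, not the shipped code: the AgentConfig shape, the resolveEffectiveCap name, and the "Other", "Oversized" and "TextOnly" agents are assumptions made for the example. Only MAX_STEPS, agent.steps and the Math.min expression come from the decision itself.

```typescript
// Hard ceiling from the decision above.
const MAX_STEPS = 200;

// Hypothetical minimal agent shape; only `steps` matters for this sketch.
interface AgentConfig {
  name: string;
  steps?: number; // per-agent cap; undefined means "unset"
}

// effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS)
// `??` only falls back on null/undefined, so an explicit steps: 0 is preserved.
function resolveEffectiveCap(agent: AgentConfig): number {
  return Math.min(agent.steps ?? Infinity, MAX_STEPS);
}

const refactorer: AgentConfig = { name: "Refactorer", steps: 5 };
const unsetAgent: AgentConfig = { name: "Other" };
const oversized: AgentConfig = { name: "Oversized", steps: 1000 };
const textOnly: AgentConfig = { name: "TextOnly", steps: 0 };

const caps = [refactorer, unsetAgent, oversized, textOnly].map(resolveEffectiveCap);
// caps is [5, 200, 200, 0]
```

Using `??` rather than `||` matters here: with `||`, steps: 0 would be treated as unset and silently become 200.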
Rationale: Sam reports BooChat gets stuck at 50 tool calls (the budget) too often. The step cap should be generous — 200 is 4x the current de-facto ceiling. Budget (50 tool calls total across all steps) remains a separate concern and is not changed in this batch.
Note: "step" ≠ "tool call." One step = one stream iteration that may produce multiple parallel tool calls. Budget counts individual tool calls; step cap counts iterations. With 1-2 tool calls per step on average, the 50-call budget is exhausted after roughly 25-50 steps, so it will fire well before the 200-step ceiling in most scenarios. MAX_STEPS is a safety ceiling against long runs of 1-tool-call iterations; the per-agent cap via agent.steps remains the primary knob.
D2. step_finish — emit or not?
Decision: No step_finish part. The next step_start (or assistant message completion) implicitly ends the previous step.
Rationale: opencode only emits step_start. Less noise in parts, simpler code. If UI ever needs step durations, compute from the timestamps of consecutive step_start parts.
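If step durations are ever needed, the computation described above might look like the following sketch. The StepStartPart shape, the createdAt field (assumed to be epoch milliseconds) and the stepDurations helper are hypothetical; the source only states that consecutive step_start timestamps, or the assistant message completion, delimit a step.

```typescript
// Hypothetical part shape; field names are assumptions for illustration.
interface StepStartPart {
  kind: "step_start";
  stepNumber: number;
  createdAt: number; // epoch milliseconds (assumed unit)
}

// Duration of step i = start of step i+1 minus start of step i.
// The final step is closed by the assistant message's completion time.
function stepDurations(parts: StepStartPart[], messageCompletedAt: number): number[] {
  const sorted = [...parts].sort((a, b) => a.createdAt - b.createdAt);
  return sorted.map((part, i) => {
    const next = sorted[i + 1];
    const end = next ? next.createdAt : messageCompletedAt;
    return end - part.createdAt;
  });
}

const exampleDurations = stepDurations(
  [
    { kind: "step_start", stepNumber: 0, createdAt: 1000 },
    { kind: "step_start", stepNumber: 1, createdAt: 1500 },
    { kind: "step_start", stepNumber: 2, createdAt: 2300 },
  ],
  3000,
);
// exampleDurations is [500, 800, 700]
```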
D3. Step-cap hit — sentinel or quiet?
Decision: Write a sentinel summary on step-cap hit. Visible to the user in chat, same as budget-exhaustion's runCapHitSummary.
Implementation: Extend runCapHitSummary to accept a reason: 'budget' | 'step_cap' parameter (or add a parallel runStepCapSummary). The sentinel metadata kind stays cap_hit — frontend CapHitSentinel component already renders it. The sentinel's text distinguishes the two cases ("Tool budget exhausted" vs "Step limit reached").
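A rough sketch of the reason-parameter option follows. The CapHitSentinel object shape and the buildCapHitSentinel helper are hypothetical stand-ins (the real runCapHitSummary is not shown in this document, and storing reason on the sentinel is an assumption); only the cap_hit kind, the two reason values and the two text labels are taken from the decision above.

```typescript
type CapHitReason = "budget" | "step_cap";

// Hypothetical sentinel shape. The kind stays "cap_hit" for both reasons,
// so an existing renderer keyed on kind needs no changes.
interface CapHitSentinel {
  kind: "cap_hit";
  reason: CapHitReason; // assumed field, not confirmed by the source
  text: string;
}

function buildCapHitSentinel(reason: CapHitReason): CapHitSentinel {
  const text = reason === "budget" ? "Tool budget exhausted" : "Step limit reached";
  return { kind: "cap_hit", reason, text };
}

const budgetSentinel = buildCapHitSentinel("budget");
const stepCapSentinel = buildCapHitSentinel("step_cap");
```

Keeping a single kind and varying only the text is what lets the frontend stay untouched.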
D4. agent.steps = 0
Decision: steps: 0 means "no tool calls allowed." The loop body never executes. The assistant can only respond with text.
Implementation: When effectiveCap === 0, skip the loop entirely. Stream the first assistant turn (text-only), finalize, return. The model receives no tools in the request payload when steps: 0 (or equivalently, tools are passed but the loop never enters the tool-execution branch).
Cleaner formulation: steps: 0 sets the loop cap to 0, so the while condition stepNumber < effectiveCap is false on the first check. The stream phase still runs once (the model produces a text response), but any tool calls it emits are ignored and the turn finalizes as text-only. This may produce a confusing response if the model's text references tool results it never received, but steps: 0 is an explicit constraint the agent author chose. Document this in the AGENTS.md parser validation.
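The text-only handling for steps: 0 could be sketched as below. Every name here (StreamResult, TextOnlyTurn, finalizeTextOnlyTurn, the warn callback) is hypothetical and only illustrates the "ignore tool calls, log a warning, finalize as text" behavior described above; it is not the actual implementation.

```typescript
interface ToolCall {
  name: string;
  args: string;
}

// Hypothetical result of one stream phase.
interface StreamResult {
  text: string;
  toolCalls: ToolCall[];
}

interface TextOnlyTurn {
  text: string;
  ignoredToolCalls: number;
}

// With steps: 0 the loop never runs, so emitted tool calls are dropped.
// The warn callback stands in for whatever logger the server uses.
function finalizeTextOnlyTurn(
  stream: StreamResult,
  warn: (message: string) => void = () => {},
): TextOnlyTurn {
  if (stream.toolCalls.length > 0) {
    warn(`steps: 0, ignoring ${stream.toolCalls.length} tool call(s)`);
  }
  return { text: stream.text, ignoredToolCalls: stream.toolCalls.length };
}

const warnings: string[] = [];
const turn = finalizeTextOnlyTurn(
  { text: "Here is my answer.", toolCalls: [{ name: "read_file", args: "{}" }] },
  (message) => warnings.push(message),
);
// turn.text is kept as-is; turn.ignoredToolCalls is 1; one warning was logged
```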
D5. Synthesis success terminates the loop?
Decision: Yes. break out of the loop after synthesis success. Preserves current behavior (synthesis replaces the recursive call; no further iterations).
Rationale: The synthesis pass produces a self-contained summary turn. Continuing the loop after synthesis would let the model issue more tool calls on top of a synthesis summary, which is semantically wrong — the synthesis IS the final answer for that tool call batch.
D6. executeToolPhase return struct
The recursive call at tool-phase.ts:342 is currently the last thing executeToolPhase does (after creating the next assistant row). After the conversion, executeToolPhase returns a struct the loop body reads:
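    interface ToolPhaseResult {
      // What the loop should do next.
      action: 'continue' | 'paused' | 'synthesis_done';
      // Number of tool calls executed in this step; added to toolsUsed.
      toolCallCount: number;
      // The calls themselves; appended to recentToolCalls.
      toolCalls: ToolCall[];
      // New assistant message UUID on 'continue', otherwise null.
      nextAssistantId: string | null;
    }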
- 'continue' → loop continues; nextAssistantId is the new assistant message's UUID.
- 'paused' → user-input or grant pause; loop breaks. nextAssistantId is null.
- 'synthesis_done' → synthesis succeeded; loop breaks. nextAssistantId is null (synthesis wrote its own parts).
The loop body then:
- Updates toolsUsed += result.toolCallCount
- Appends result.toolCalls to recentToolCalls
- Sets assistantMessageId = result.nextAssistantId for the next iteration
- Increments stepNumber
- Checks result.action; if not 'continue', breaks.
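The loop-body steps listed above can be sketched as follows. This is a deliberately reduced illustration: the stream phase, budget check, doom-loop check and abort handling are omitted, executeToolPhase is passed in as a synchronous stand-in, and runOuterLoop, LoopOutcome and ExitReason are hypothetical names. Only ToolPhaseResult and the five bookkeeping steps come from the text.

```typescript
interface ToolCall {
  name: string;
  args: string;
}

interface ToolPhaseResult {
  action: "continue" | "paused" | "synthesis_done";
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

type ExitReason = "paused" | "synthesis_done" | "step_cap";

interface LoopOutcome {
  exit: ExitReason;
  stepNumber: number;
  toolsUsed: number;
}

// Stand-in signature for the real executeToolPhase (assumed, simplified).
type ToolPhase = (assistantMessageId: string, stepNumber: number) => ToolPhaseResult;

function runOuterLoop(
  firstAssistantId: string,
  effectiveCap: number,
  executeToolPhase: ToolPhase,
): LoopOutcome {
  let stepNumber = 0;
  let toolsUsed = 0;
  let assistantMessageId = firstAssistantId;
  const recentToolCalls: ToolCall[] = [];

  while (stepNumber < effectiveCap) {
    const result = executeToolPhase(assistantMessageId, stepNumber);
    toolsUsed += result.toolCallCount;
    recentToolCalls.push(...result.toolCalls);
    // nextAssistantId is null only for 'paused' / 'synthesis_done',
    // where the loop exits below and the id is no longer needed.
    assistantMessageId = result.nextAssistantId ?? assistantMessageId;
    stepNumber += 1;
    if (result.action !== "continue") {
      return { exit: result.action, stepNumber, toolsUsed };
    }
  }
  // The while condition failed: the step cap was reached.
  return { exit: "step_cap", stepNumber, toolsUsed };
}

// Usage: a phase that always asks to continue runs into a cap of 5.
const alwaysContinue: ToolPhase = (_id, step) => ({
  action: "continue",
  toolCallCount: 1,
  toolCalls: [{ name: "grep", args: String(step) }],
  nextAssistantId: `assistant-${step + 1}`,
});
const capped = runOuterLoop("assistant-0", 5, alwaysContinue);
// capped is { exit: "step_cap", stepNumber: 5, toolsUsed: 5 }
```

Because the caller owns the loop, each exit reason is visible in one place instead of being spread across recursive returns.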
D7. Budget vs steps interaction
Budget counts individual tool calls across the entire turn. Steps counts loop iterations. They are orthogonal:
- Budget fires when toolsUsed >= resolveToolBudget(agent) (currently 50 for read-only). Checked at the top of each iteration.
- Step cap fires when stepNumber >= effectiveCap. Checked by the loop condition.
Both produce a sentinel summary. A turn can be terminated by whichever fires first. In practice, budget (50 tool calls) fires before step cap (200 steps) unless the model produces many 0-tool-call iterations (which shouldn't happen — 0 tool calls means non-tool finish, which exits the loop via the break path).
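To make the interaction concrete, the small simulation below counts which limit trips first for a fixed number of tool calls per step. It is an idealized model (the firstLimitHit helper and the constant callsPerStep are assumptions), using the check placement described above: budget at the top of each iteration, step cap in the loop condition.

```typescript
type Limit = "budget" | "step_cap";

interface LimitHit {
  limit: Limit;
  stepNumber: number;
  toolsUsed: number;
}

// Idealized model: every step issues exactly `callsPerStep` tool calls.
function firstLimitHit(callsPerStep: number, budget: number, effectiveCap: number): LimitHit {
  let stepNumber = 0;
  let toolsUsed = 0;
  while (stepNumber < effectiveCap) {
    // Budget is checked at the top of each iteration.
    if (toolsUsed >= budget) {
      return { limit: "budget", stepNumber, toolsUsed };
    }
    toolsUsed += callsPerStep;
    stepNumber += 1;
  }
  // Loop condition failed: the step cap fired.
  return { limit: "step_cap", stepNumber, toolsUsed };
}

const oneCallPerStep = firstLimitHit(1, 50, 200);  // budget, after 50 steps
const twoCallsPerStep = firstLimitHit(2, 50, 200); // budget, after 25 steps
const refactorerCap = firstLimitHit(1, 50, 5);     // step_cap, after 5 steps
```

Under this model the 200-step ceiling is never the binding limit with a 50-call budget; a per-agent cap such as 5 is.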