boocode/openspec/changes/v1.14-outer-loop/design.md
indifferentketchup f4a97808ad v1.14.0-outer-loop: explicit while loop replaces inference recursion
Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).

Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.

step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 20:29:21 +00:00


# v1.14.0-outer-loop — design decisions
Answers to the dispatch's blocking questions, resolved 2026-05-23.
## D1. Step cap — what replaces MAX_TOOL_LOOP_DEPTH?
`MAX_TOOL_LOOP_DEPTH` never existed: no hard recursion depth guard was ever in the codebase. Safety came from the tool budget (50 tool calls) and the doom-loop check (3 identical calls).
**Decision:** introduce `MAX_STEPS = 200` as a hard ceiling. Per-agent cap via `agent.steps` is the primary knob. Resolution: `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS)`.
**Rationale:** Sam reports BooChat gets stuck at 50 tool calls (the budget) too often. The step cap should be generous — 200 is 4x the current de-facto ceiling. Budget (50 tool calls total across all steps) remains a separate concern and is not changed in this batch.
Note: "step" ≠ "tool call." One step is one stream iteration, which may produce multiple parallel tool calls. Budget counts individual tool calls; step cap counts iterations. At an average of 1-2 tool calls per step, the budget (50) fires after roughly 25-50 steps, well before the 200-step cap. The step cap is a safety ceiling for cases where the model makes many 1-tool-call iterations.
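The resolution rule can be sketched as a small helper. This is a minimal illustration, not the project's code: `AgentConfig` and `resolveEffectiveCap` are hypothetical names, and only `MAX_STEPS`, `agent.steps`, and the `Math.min(... ?? Infinity, ...)` expression come from the decision above.

```typescript
// Hard ceiling from the decision above.
const MAX_STEPS = 200;

// Hypothetical minimal agent shape; only `steps` matters for this sketch.
interface AgentConfig {
  steps?: number;
}

// The agent's own cap if set, never above MAX_STEPS.
// `??` only falls back on null/undefined, so an explicit `steps: 0` is preserved.
function resolveEffectiveCap(agent: AgentConfig): number {
  return Math.min(agent.steps ?? Infinity, MAX_STEPS);
}

const refactorerCap = resolveEffectiveCap({ steps: 5 });    // 5
const unsetCap = resolveEffectiveCap({});                   // 200
const oversizedCap = resolveEffectiveCap({ steps: 1000 });  // 200
const zeroCap = resolveEffectiveCap({ steps: 0 });          // 0
```

Using `??` rather than `||` matters here: `||` would treat `steps: 0` as unset and silently lift the cap to `MAX_STEPS`.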
## D2. step_finish — emit or not?
**Decision:** No `step_finish` part. The next `step_start` (or assistant message completion) implicitly ends the previous step.
**Rationale:** opencode only emits `step_start`. Less noise in parts, simpler code. If UI ever needs step durations, compute from the timestamps of consecutive `step_start` parts.
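If step durations are ever needed, the computation could look roughly like the sketch below. It assumes each `step_start` part carries a millisecond timestamp; the `StepStartPart` shape, the `createdAtMs` field, and `stepDurationsMs` are hypothetical and not taken from the schema.

```typescript
// Hypothetical part shape: only the timestamp is used here.
interface StepStartPart {
  type: 'step_start';
  createdAtMs: number;
}

// Each step ends where the next step_start begins; the last step ends
// when the assistant message completes.
function stepDurationsMs(starts: StepStartPart[], completedAtMs: number): number[] {
  const sorted = [...starts].sort((a, b) => a.createdAtMs - b.createdAtMs);
  return sorted.map((part, i) => {
    const next = sorted[i + 1];
    const end = next !== undefined ? next.createdAtMs : completedAtMs;
    return end - part.createdAtMs;
  });
}
```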
## D3. Step-cap hit — sentinel or quiet?
**Decision:** Write a sentinel summary on step-cap hit. Visible to the user in chat, same as budget-exhaustion's `runCapHitSummary`.
**Implementation:** Extend `runCapHitSummary` to accept a `reason: 'budget' | 'step_cap'` parameter (or add a parallel `runStepCapSummary`). The sentinel metadata kind stays `cap_hit` — frontend `CapHitSentinel` component already renders it. The sentinel's text distinguishes the two cases ("Tool budget exhausted" vs "Step limit reached").
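A minimal sketch of the `reason` parameter follows, assuming the sentinel is a plain object with a `kind` and a `text`. `CapHitSentinel` as a type, `CapHitReason`, and `buildCapHitSentinel` are hypothetical names for illustration; the real `runCapHitSummary` signature is not shown in this document.

```typescript
type CapHitReason = 'budget' | 'step_cap';

// Hypothetical sentinel shape: the kind stays 'cap_hit' for both reasons
// so the existing frontend renderer needs no changes.
interface CapHitSentinel {
  kind: 'cap_hit';
  text: string;
}

// Only the text differs between the two termination reasons.
function buildCapHitSentinel(reason: CapHitReason): CapHitSentinel {
  const text = reason === 'budget' ? 'Tool budget exhausted' : 'Step limit reached';
  return { kind: 'cap_hit', text };
}
```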
## D4. agent.steps = 0
**Decision:** `steps: 0` means "no tool calls allowed." The loop body never executes. The assistant can only respond with text.
**Implementation:** `steps: 0` means the loop cap is 0, so the while condition `stepNumber < effectiveCap` is false on the first check and the loop is skipped entirely. The stream phase still runs once (the model produces a text response), then the turn is finalized and returned. Whether tools are omitted from the request payload or passed anyway, the loop never enters the tool-execution branch: if the model emits tool calls, they are ignored and the turn finalizes as text-only.
This may produce a confusing response if the model's text references tool results it never got, but `steps: 0` is an explicit constraint the agent author chose. Document in AGENTS.md parser validation.
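The text-only finalization might be sketched as follows. The `StreamResult` and `TextOnlyTurn` shapes, the `finalizeTextOnly` name, and the injected `warn` callback are all assumptions made for this illustration.

```typescript
// Hypothetical result of one stream phase.
interface StreamResult {
  text: string;
  toolCalls: { name: string }[];
}

// Hypothetical finalized turn: text only, plus a count of what was dropped.
interface TextOnlyTurn {
  text: string;
  ignoredToolCalls: number;
}

// With steps: 0 the tool-execution branch is never entered, so any tool
// calls the model emitted are dropped and reported through `warn`.
function finalizeTextOnly(result: StreamResult, warn: (msg: string) => void): TextOnlyTurn {
  const ignored = result.toolCalls.length;
  if (ignored > 0) {
    warn(`steps: 0, ignoring ${ignored} tool call(s)`);
  }
  return { text: result.text, ignoredToolCalls: ignored };
}
```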
## D5. Synthesis success terminates the loop?
**Decision:** Yes. `break` out of the loop after synthesis success. Preserves current behavior (synthesis replaces the recursive call; no further iterations).
**Rationale:** The synthesis pass produces a self-contained summary turn. Continuing the loop after synthesis would let the model issue more tool calls on top of a synthesis summary, which is semantically wrong — the synthesis IS the final answer for that tool call batch.
## D6. executeToolPhase return struct
The recursive call at `tool-phase.ts:342` is currently the last thing `executeToolPhase` does (after creating the next assistant row). After the conversion, `executeToolPhase` returns a struct the loop body reads:
```typescript
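// Returned by executeToolPhase; the loop body reads it to decide what to do next.
interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;          // added to toolsUsed
  toolCalls: ToolCall[];          // appended to recentToolCalls
  nextAssistantId: string | null; // new assistant message UUID on 'continue', else null
}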
```
- `continue` → loop continues; `nextAssistantId` is the new assistant message's UUID.
- `paused` → user-input or grant pause; loop breaks. `nextAssistantId` is null.
- `synthesis_done` → synthesis succeeded; loop breaks. `nextAssistantId` is null (synthesis wrote its own parts).
The loop body then:
1. Updates `toolsUsed += result.toolCallCount`
2. Appends `result.toolCalls` to `recentToolCalls`
3. Sets `assistantMessageId = result.nextAssistantId` for the next iteration
4. Increments `stepNumber`
5. Checks `result.action` — if not `continue`, breaks.
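The five steps above can be sketched as a loop. This is a simplified, synchronous illustration: the real tool phase is presumably asynchronous, the budget and doom-loop checks are left out, and `ToolCall`, `LoopOutcome`, and `runLoop` are hypothetical. The tool phase is injected as a function so the sketch stays self-contained.

```typescript
interface ToolCall { name: string; args: string; } // hypothetical shape

interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

interface LoopOutcome {
  stepNumber: number;
  toolsUsed: number;
  stoppedBy: 'paused' | 'synthesis_done' | 'step_cap';
}

function runLoop(
  effectiveCap: number,
  firstAssistantId: string,
  executeToolPhase: (assistantId: string, step: number) => ToolPhaseResult,
): LoopOutcome {
  let stepNumber = 0;
  let toolsUsed = 0;
  let assistantMessageId: string | null = firstAssistantId;
  const recentToolCalls: ToolCall[] = [];
  let stoppedBy: LoopOutcome['stoppedBy'] = 'step_cap';

  while (stepNumber < effectiveCap && assistantMessageId !== null) {
    const result = executeToolPhase(assistantMessageId, stepNumber);
    toolsUsed += result.toolCallCount;           // 1. update the tool count
    recentToolCalls.push(...result.toolCalls);   // 2. feed the doom-loop window
    assistantMessageId = result.nextAssistantId; // 3. id for the next iteration
    stepNumber += 1;                             // 4. count the step
    if (result.action !== 'continue') {          // 5. break on pause or synthesis
      stoppedBy = result.action;
      break;
    }
    if (result.nextAssistantId === null) {
      throw new Error("'continue' must carry a nextAssistantId");
    }
  }
  return { stepNumber, toolsUsed, stoppedBy };
}
```

Because `stepNumber` is incremented before the action check, a step that ends in a pause or a synthesis still counts toward the total.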
## D7. Budget vs steps interaction
Budget counts **individual tool calls** across the entire turn. Steps counts **loop iterations**. They are orthogonal:
- Budget fires when `toolsUsed >= resolveToolBudget(agent)` (currently 50 for read-only). Checked at the top of each iteration.
- Step cap fires when `stepNumber >= effectiveCap`. Checked by the loop condition.
Both produce a sentinel summary, and whichever fires first terminates the turn. At the default ceiling, budget (50 tool calls) fires before step cap (200 steps) as long as every iteration makes at least one tool call. An iteration with 0 tool calls is a non-tool finish, which already exits the loop via the `break` path, so long runs of 0-tool-call iterations shouldn't happen.
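To make the interaction concrete, the sketch below simulates a turn with a fixed number of tool calls per step and reports which limit stops it. `firstLimitHit` and `LimitHit` are hypothetical, and real turns do not make a constant number of calls per step. The check order follows the description above: the loop condition tests the step cap, and the budget is tested at the top of each iteration.

```typescript
interface LimitHit {
  limit: 'budget' | 'step_cap';
  stepNumber: number;
  toolsUsed: number;
}

function firstLimitHit(callsPerStep: number, budget: number, effectiveCap: number): LimitHit {
  let stepNumber = 0;
  let toolsUsed = 0;
  while (stepNumber < effectiveCap) {   // step cap: loop condition
    if (toolsUsed >= budget) {          // budget: top of each iteration
      return { limit: 'budget', stepNumber, toolsUsed };
    }
    toolsUsed += callsPerStep;
    stepNumber += 1;
  }
  return { limit: 'step_cap', stepNumber, toolsUsed };
}

const oneCallPerStep = firstLimitHit(1, 50, 200);  // budget stops it at step 50
const twoCallsPerStep = firstLimitHit(2, 50, 200); // budget stops it at step 25
const tightAgentCap = firstLimitHit(1, 50, 5);     // step cap stops it at step 5
```

With a tight per-agent cap the ordering flips: the step cap fires long before the budget is reached.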