Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-tool-execute iteration; the loop terminates on non-tool finish, step-cap hit, doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5, Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution: effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses; it returns a ToolPhaseResult struct (action: 'continue' | 'paused' | 'synthesis_done') so the caller decides whether to continue or break.

steps: 0 handled as "no tool calls allowed" via runTextOnlyTurn (one text-only stream phase, tool calls ignored with warn log).

Step-cap hits produce a sentinel summary (reuses cap_hit kind so CapHitSentinel.tsx renders without frontend changes; text distinguishes "Step limit reached" from "Tool budget exhausted").

Doom-loop check migrated to top of loop body: same predicate, same threshold (3), break instead of return.

step_start parts are in the schema CHECK but not emitted as message_parts: writing before the stream phase creates a sequence-0 collision with partsFromAssistantMessage. Structured log line emitted instead. Adversarial review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.14.0-outer-loop tasks
B1 — Backups
- `turn.ts`, `tool-phase.ts`, `sentinel-summaries.ts`, `agents.ts`, `data/AGENTS.md`
B2 — agents.ts: parse steps field
- Add `steps?: number` to `ParsedFrontmatter` interface
- Parse from YAML frontmatter: integer ≥ 0, warn on out-of-range (negative or non-integer), clamp to 0
- Expose on the `Agent` type returned by `getAgentsForProject`
- `npx tsc --noEmit -p apps/server` clean
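
The validation rule above could be sketched as follows. This is a minimal sketch, not the actual `agents.ts` code: `normalizeSteps` and its `warn` callback are hypothetical names, and it assumes the YAML parser has already produced a raw value of unknown type.

```typescript
// Hypothetical helper: normalizes a raw `steps` value from parsed frontmatter.
// Assumption: an absent field (undefined/null) means "unset"; anything that is
// not a non-negative integer is reported via `warn` and clamped to 0.
type Warn = (message: string) => void;

function normalizeSteps(raw: unknown, warn: Warn = () => {}): number | undefined {
  if (raw === undefined || raw === null) return undefined; // field not present
  if (typeof raw === 'number' && Number.isInteger(raw) && raw >= 0) return raw;
  warn(`steps: expected an integer >= 0, got ${JSON.stringify(raw)}; clamping to 0`);
  return 0;
}
```

Returning `undefined` for an unset field keeps the later `agent.steps ?? Infinity` resolution meaningful, since `??` only falls through on `null` or `undefined` and therefore preserves an explicit `steps: 0`.
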
B3 — AGENTS.md: add steps: to Refactorer + Architect
- `data/AGENTS.md`, Refactorer: `steps: 5`
- `data/AGENTS.md`, Architect: `steps: 20`
- All others: leave unset (infinite, bounded by MAX_STEPS=200)
B4 — tool-phase.ts: remove recursive call, return result struct
- Define `ToolPhaseResult` interface: `{action: 'continue' | 'paused' | 'synthesis_done', toolCallCount: number, toolCalls: ToolCall[], nextAssistantId: string | null}`
- Remove `runAssistantTurn` import and call at line ~342
- `executeToolPhase` returns `ToolPhaseResult` instead of `Promise<void>`
- On normal path (after creating next assistant row): return `{action: 'continue', toolCallCount, toolCalls: result.toolCalls, nextAssistantId}`
- On user-input pause: return `{action: 'paused', toolCallCount: <calls executed so far>, toolCalls: result.toolCalls, nextAssistantId: null}`
- On synthesis success: return `{action: 'synthesis_done', toolCallCount, toolCalls: result.toolCalls, nextAssistantId: null}`
- `npx tsc --noEmit -p apps/server` will FAIL here (turn.ts still expects void); this is expected and fixed in B5
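
The result struct and its three return paths might look like the sketch below. The `ToolCall` shape and the three constructor helpers (`continueResult`, `pausedResult`, `synthesisDoneResult`) are hypothetical and exist only to make the example self-contained; the real `tool-phase.ts` may build these objects inline.

```typescript
// Assumption: the real ToolCall type lives elsewhere in the codebase; this
// minimal shape is only here so the sketch compiles on its own.
interface ToolCall {
  name: string;
  args: unknown;
}

interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

// Normal path: a next assistant row exists, so the caller may loop again.
function continueResult(toolCallCount: number, toolCalls: ToolCall[], nextAssistantId: string): ToolPhaseResult {
  return { action: 'continue', toolCallCount, toolCalls, nextAssistantId };
}

// User-input pause: report only the calls executed so far; no next row.
function pausedResult(executedSoFar: number, toolCalls: ToolCall[]): ToolPhaseResult {
  return { action: 'paused', toolCallCount: executedSoFar, toolCalls, nextAssistantId: null };
}

// Synthesis success: the turn is complete; no next row.
function synthesisDoneResult(toolCallCount: number, toolCalls: ToolCall[]): ToolPhaseResult {
  return { action: 'synthesis_done', toolCallCount, toolCalls, nextAssistantId: null };
}
```

Only the `'continue'` variant carries a non-null `nextAssistantId`, which is why the caller can treat any other `action` as a signal to break.
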
B5 — turn.ts: recursion → while loop
- Add `MAX_STEPS = 200` constant
- Resolve `effectiveCap = Math.min(agent?.steps ?? Infinity, MAX_STEPS)` at the top of `runAssistantTurn`
- Convert `runAssistantTurn` body into a `while (stepNumber < effectiveCap)` loop:
  - Top of loop: doom-loop check (move from current position; `break` instead of `return`)
  - Top of loop: budget check (move from current position; `break` instead of `return`, but still call `runCapHitSummary` before break)
  - Emit `step_start` part via `insertParts` with payload `{step_number: stepNumber, started_at: new Date().toISOString()}`
  - Call `executeStreamPhase`
  - If no tool calls → `finalizeCompletion`, `break`
  - Call `executeToolPhase` (now returns `ToolPhaseResult`)
  - If `result.action !== 'continue'` → `break`
  - Update `toolsUsed += result.toolCallCount`
  - Update `recentToolCalls = [...recentToolCalls, ...result.toolCalls]`
  - Update `assistantMessageId = result.nextAssistantId!`
  - Increment `stepNumber`
- After loop: if `stepNumber >= effectiveCap` → call step-cap sentinel (B6)
- `effectiveCap === 0` edge case: the while condition is immediately false; stream the first turn text-only (the stream phase at the top of the function runs once before the loop, OR handle this by structuring the loop as do-while, OR handle by pre-checking and skipping tools from the request). Pick the cleanest approach.
- Remove `TurnArgs` from the module export if it's no longer threaded through recursion, OR keep it and populate from loop locals. (Design note: `TurnArgs` is still used by `executeStreamPhase`, `executeToolPhase`, `sentinel-summaries.ts`, `error-handler.ts`. Keep the interface; populate from loop locals each iteration.)
- `npx tsc --noEmit -p apps/server` clean
- `pnpm -C apps/server test`: all existing tests pass
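
The control flow described in this task can be sketched as below. Everything in `Phases`, plus `runLoop` and `StopReason`, is a hypothetical stand-in for the real phase functions in `turn.ts`. The phases are synchronous here only to keep the loop shape visible (the real ones are presumably async), the `steps: 0` case uses the pre-check option, and `step_start` emission, sentinels, and abort handling are left out.

```typescript
// Sketch only: hypothetical names, synchronous phases, no persistence.
const MAX_STEPS = 200;

interface ToolCall { name: string; args: unknown }

interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

interface Phases {
  isDoomLoop(recent: ToolCall[]): boolean;
  textOnlyTurn(assistantId: string): void;
  streamPhase(assistantId: string): { toolCalls: ToolCall[] };
  toolPhase(assistantId: string, calls: ToolCall[]): ToolPhaseResult;
}

type StopReason =
  | 'finished' | 'doom_loop' | 'budget' | 'paused'
  | 'synthesis_done' | 'step_cap' | 'text_only';

function runLoop(
  phases: Phases,
  opts: { agentSteps?: number; budget: number; assistantId: string },
): { reason: StopReason; stepNumber: number; toolsUsed: number } {
  const effectiveCap = Math.min(opts.agentSteps ?? Infinity, MAX_STEPS);

  // steps: 0 means "no tool calls allowed": one text-only turn, no loop.
  if (effectiveCap === 0) {
    phases.textOnlyTurn(opts.assistantId);
    return { reason: 'text_only', stepNumber: 0, toolsUsed: 0 };
  }

  let stepNumber = 0;
  let toolsUsed = 0;
  let recentToolCalls: ToolCall[] = [];
  let assistantMessageId = opts.assistantId;
  let reason: StopReason | null = null;

  while (stepNumber < effectiveCap) {
    if (phases.isDoomLoop(recentToolCalls)) { reason = 'doom_loop'; break; }
    if (toolsUsed >= opts.budget) { reason = 'budget'; break; }

    const stream = phases.streamPhase(assistantMessageId);
    if (stream.toolCalls.length === 0) { reason = 'finished'; break; }

    const result = phases.toolPhase(assistantMessageId, stream.toolCalls);
    if (result.action !== 'continue') { reason = result.action; break; }

    toolsUsed += result.toolCallCount;
    recentToolCalls = [...recentToolCalls, ...result.toolCalls];
    assistantMessageId = result.nextAssistantId ?? assistantMessageId;
    stepNumber += 1;
  }

  // No break fired, so the while condition failed: stepNumber >= effectiveCap.
  return { reason: reason ?? 'step_cap', stepNumber, toolsUsed };
}
```

Because every `break` happens before the increment, falling out of the `while` condition is the only way to reach `stepNumber >= effectiveCap`, so the post-loop step-cap check cannot fire after a doom-loop, budget, or pause exit.
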
B6 — sentinel-summaries.ts: step-cap sentinel
- Add `runStepCapSummary` (or extend `runCapHitSummary` with a `reason` param)
- Write a sentinel with `metadata.kind = 'cap_hit'` (same as budget) so `CapHitSentinel` UI renders it
- Sentinel text distinguishes "Step limit reached (N steps)" from "Tool budget exhausted (N calls)"
- Called from the post-loop check in turn.ts (B5)
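
One way to take the "extend with a `reason` param" option is sketched here. `buildCapHitSentinel`, `CapReason`, and the `SentinelSummary` shape are hypothetical; the real sentinel row in `sentinel-summaries.ts` almost certainly carries more fields than shown.

```typescript
// Hypothetical shapes: only the two properties named in the task are modeled.
type CapReason = 'step_limit' | 'tool_budget';

interface SentinelSummary {
  metadata: { kind: 'cap_hit' };
  text: string;
}

// Both reasons share kind 'cap_hit' so the same UI component can render them;
// only the human-readable text differs.
function buildCapHitSentinel(reason: CapReason, n: number): SentinelSummary {
  const text =
    reason === 'step_limit'
      ? `Step limit reached (${n} steps)`
      : `Tool budget exhausted (${n} calls)`;
  return { metadata: { kind: 'cap_hit' }, text };
}
```

Keeping `kind` fixed and varying only `text` is what lets the frontend stay untouched.
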
B7 — Tests
- NEW `apps/server/src/services/__tests__/outer-loop.test.ts`
- Test: clean finish (stream returns no tool calls, loop exits after 1 step)
- Test: step-cap hit (mock agent with `steps: 2`, model always returns tool calls, loop exits at 2, sentinel written)
- Test: doom-loop (3 identical tool calls, sentinel written, loop breaks)
- Test: budget exhaustion (toolsUsed >= budget, cap-hit sentinel written)
- Test: `steps: 0` (no loop iterations, text-only response)
- Test: synthesis success (loop breaks after synthesis)
- `pnpm -C apps/server test`: all 332+ existing + new tests pass
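
The doom-loop case above needs a predicate to exercise. The existing predicate is not shown in this document, so the sketch below is an assumption: it treats "identical" as same tool name plus same JSON-serialized arguments, and checks the last 3 calls. `isDoomLoop`, `DOOM_LOOP_THRESHOLD`, and the `ToolCall` shape are hypothetical names.

```typescript
// Assumption: a minimal ToolCall shape, not the codebase's real type.
interface ToolCall { name: string; args: unknown }

const DOOM_LOOP_THRESHOLD = 3;

// Returns true when the last `threshold` calls all have the same name and
// the same serialized args. Note: JSON.stringify is key-order sensitive, so
// {a:1,b:2} and {b:2,a:1} count as different calls in this sketch.
function isDoomLoop(recent: ToolCall[], threshold: number = DOOM_LOOP_THRESHOLD): boolean {
  if (threshold < 1 || recent.length < threshold) return false;
  const keys = recent
    .slice(-threshold)
    .map((c) => `${c.name}:${JSON.stringify(c.args)}`);
  return keys.every((k) => k === keys[0]);
}
```

A test can then feed three identical calls and assert that the loop breaks and a sentinel is written, and feed two identical calls followed by a different one to confirm the loop keeps going.
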
B8 — Verification
- `npx tsc --noEmit -p apps/server`: 0 errors
- `npx tsc -p apps/web/tsconfig.app.json --noEmit`: 0 errors (no web changes; should pass)
- `pnpm -C apps/web build`: green
- `pnpm -C apps/server test`: all green
B9 — Docs + tag + deploy
- `CHANGELOG.md` entry for v1.14.0-outer-loop
- `boocode_roadmap.md` retrospective bullet on the v1.14 section
- `CLAUDE.md` updates: mention the outer loop, MAX_STEPS, agent.steps in the inference/ section
- Commit, tag `v1.14.0-outer-loop`, push, rebuild