boocode/openspec/changes/v1.14-outer-loop/tasks.md
indifferentketchup f4a97808ad v1.14.0-outer-loop: explicit while loop replaces inference recursion
Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.

MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).

executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).

Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.

step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.

332/332 server tests passing. No frontend changes. No schema changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 20:29:21 +00:00


# v1.14.0-outer-loop tasks
## B1 — Backups
- [ ] `turn.ts`, `tool-phase.ts`, `sentinel-summaries.ts`, `agents.ts`, `data/AGENTS.md`
## B2 — agents.ts: parse `steps` field
- [ ] Add `steps?: number` to `ParsedFrontmatter` interface
- [ ] Parse from YAML frontmatter: integer ≥ 0, warn on out-of-range (negative or non-integer), clamp to 0
- [ ] Expose on the `Agent` type returned by `getAgentsForProject`
- [ ] `npx tsc --noEmit -p apps/server` clean
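
The parsing rule in B2 can be sketched as follows. This is a minimal illustration, not the actual `agents.ts` code: the helper name `parseSteps` and the injectable `warn` callback are hypothetical, and treating any non-numeric value as out of range is an assumption.

```typescript
// Hypothetical helper: normalizes the raw `steps` frontmatter value.
// Returns undefined when the field is absent (no per-agent cap).
function parseSteps(
  raw: unknown,
  warn: (msg: string) => void = console.warn,
): number | undefined {
  if (raw === undefined || raw === null) return undefined;
  if (typeof raw === "number" && Number.isInteger(raw) && raw >= 0) {
    return raw;
  }
  // Out of range (negative or non-integer): warn and clamp to 0.
  warn(`agents: invalid steps value ${JSON.stringify(raw)}; clamping to 0`);
  return 0;
}
```

Keeping "absent" (`undefined`) distinct from "clamped" (`0`) matters here, because `steps: 0` means "no tool calls allowed" while an unset field means "bounded only by MAX_STEPS".
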
## B3 — AGENTS.md: add `steps:` to Refactorer + Architect
- [ ] `data/AGENTS.md` — Refactorer: `steps: 5`
- [ ] `data/AGENTS.md` — Architect: `steps: 20`
- [ ] All others: leave unset (infinite, bounded by MAX_STEPS=200)
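
To show how these frontmatter values interact with the hard ceiling, here is a small sketch of the cap resolution. The function name `resolveEffectiveCap` is hypothetical; the expression inside it is the one given in the task list.

```typescript
const MAX_STEPS = 200; // hard ceiling from the task list

// `steps` is optional on the agent; unset means "bounded only by MAX_STEPS".
function resolveEffectiveCap(agent?: { steps?: number }): number {
  return Math.min(agent?.steps ?? Infinity, MAX_STEPS);
}

console.log(resolveEffectiveCap({ steps: 5 }));   // Refactorer → 5
console.log(resolveEffectiveCap({ steps: 20 }));  // Architect → 20
console.log(resolveEffectiveCap({}));             // unset → 200
console.log(resolveEffectiveCap({ steps: 500 })); // above the ceiling → 200
```
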
## B4 — tool-phase.ts: remove recursive call, return result struct
- [ ] Define `ToolPhaseResult` interface: `{action: 'continue' | 'paused' | 'synthesis_done', toolCallCount: number, toolCalls: ToolCall[], nextAssistantId: string | null}`
- [ ] Remove `runAssistantTurn` import and call at line ~342
- [ ] `executeToolPhase` returns `Promise<ToolPhaseResult>` instead of `Promise<void>`
- [ ] On normal path (after creating next assistant row): return `{action: 'continue', toolCallCount, toolCalls: result.toolCalls, nextAssistantId}`
- [ ] On user-input pause: return `{action: 'paused', toolCallCount: <calls executed so far>, toolCalls: result.toolCalls, nextAssistantId: null}`
- [ ] On synthesis success: return `{action: 'synthesis_done', toolCallCount, toolCalls: result.toolCalls, nextAssistantId: null}`
- [ ] `npx tsc --noEmit -p apps/server` will FAIL here (turn.ts still expects void) — expected, fixed in B5
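
The result struct from B4 might look like the sketch below. The `ToolPhaseResult` fields are taken from the task list; the `ToolCall` shape is a placeholder, and the three helper functions plus `shouldContinue` are hypothetical names used only to show the three return paths.

```typescript
// Placeholder shape; the real ToolCall type lives in the server codebase.
interface ToolCall {
  name: string;
  args: unknown;
}

interface ToolPhaseResult {
  action: "continue" | "paused" | "synthesis_done";
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

// Hypothetical helpers, one per return path in the task list.
function continueResult(
  toolCalls: ToolCall[],
  toolCallCount: number,
  nextAssistantId: string,
): ToolPhaseResult {
  return { action: "continue", toolCallCount, toolCalls, nextAssistantId };
}

function pausedResult(toolCalls: ToolCall[], executedSoFar: number): ToolPhaseResult {
  return { action: "paused", toolCallCount: executedSoFar, toolCalls, nextAssistantId: null };
}

function synthesisDoneResult(toolCalls: ToolCall[], toolCallCount: number): ToolPhaseResult {
  return { action: "synthesis_done", toolCallCount, toolCalls, nextAssistantId: null };
}

// Caller side: only "continue" keeps the loop going.
function shouldContinue(result: ToolPhaseResult): boolean {
  return result.action === "continue";
}
```

One possible alternative, not what the task list specifies, is a discriminated union in which the `continue` variant types `nextAssistantId` as `string`. That would let the compiler prove the id is non-null on the continue path and remove the need for a `!` assertion in the caller.
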
## B5 — turn.ts: recursion → while loop
- [ ] Add `MAX_STEPS = 200` constant
- [ ] Resolve `effectiveCap = Math.min(agent?.steps ?? Infinity, MAX_STEPS)` at the top of `runAssistantTurn`
- [ ] Convert `runAssistantTurn` body into a `while (stepNumber < effectiveCap)` loop:
  - Top of loop: doom-loop check (move from current position; `break` instead of `return`)
  - Top of loop: budget check (move from current position; `break` instead of `return`, but still call `runCapHitSummary` before break)
  - Emit `step_start` part via `insertParts` with payload `{step_number: stepNumber, started_at: new Date().toISOString()}`
  - Call `executeStreamPhase`
  - If no tool calls → `finalizeCompletion`, `break`
  - Call `executeToolPhase` (now returns `ToolPhaseResult`)
  - If `result.action !== 'continue'` → `break`
  - Update `toolsUsed += result.toolCallCount`
  - Update `recentToolCalls = [...recentToolCalls, ...result.toolCalls]`
  - Update `assistantMessageId = result.nextAssistantId!`
  - Increment `stepNumber`
- [ ] After loop: if `stepNumber >= effectiveCap` → call step-cap sentinel (B6)
- [ ] `effectiveCap === 0` edge case: the while condition is immediately false, so the loop body never runs, but the turn must still stream one text-only response. Options: (a) run the stream phase once at the top of the function, before the loop; (b) structure the loop as do-while; (c) pre-check and omit tools from the request. Pick the cleanest approach.
- [ ] Keep the `TurnArgs` interface and populate it from loop locals each iteration. It is no longer threaded through recursion, but `executeStreamPhase`, `executeToolPhase`, `sentinel-summaries.ts`, and `error-handler.ts` still use it, so it should stay exported rather than be removed.
- [ ] `npx tsc --noEmit -p apps/server` clean
- [ ] `pnpm -C apps/server test` — all existing tests pass
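
The control flow described in B5 can be sketched as below. This is a simplified model under several assumptions, not the real `turn.ts`: the phases are injected as synchronous stubs (the real ones are presumably async), `runOuterLoop`, `LoopDeps`, and `StopReason` are hypothetical names, and the doom-loop predicate is assumed to mean "the last three tool calls are identical". The `onStepStart` hook stands in for the `step_start` emission, which the commit message says shipped as a structured log line rather than a message part.

```typescript
interface ToolCall { name: string; args: string }

interface ToolPhaseResult {
  action: "continue" | "paused" | "synthesis_done";
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

type StopReason =
  | "finished" | "step_cap" | "doom_loop" | "budget" | "paused" | "synthesis_done";

// Stand-ins for executeStreamPhase / executeToolPhase (synchronous for brevity).
interface LoopDeps {
  streamPhase(assistantMessageId: string): ToolCall[];
  toolPhase(calls: ToolCall[]): ToolPhaseResult;
  onStepStart(stepNumber: number): void;
}

const DOOM_LOOP_THRESHOLD = 3;

// Assumed predicate: the last three tool calls are identical.
function isDoomLoop(recent: ToolCall[]): boolean {
  if (recent.length < DOOM_LOOP_THRESHOLD) return false;
  const keys = recent.slice(-DOOM_LOOP_THRESHOLD).map((c) => `${c.name}:${c.args}`);
  return keys.every((k) => k === keys[0]);
}

function runOuterLoop(
  deps: LoopDeps,
  effectiveCap: number,
  budget: number,
  firstAssistantId: string,
): { reason: StopReason; stepNumber: number; toolsUsed: number } {
  let stepNumber = 0; // completed stream-and-tool iterations
  let toolsUsed = 0;
  let recentToolCalls: ToolCall[] = [];
  let assistantMessageId = firstAssistantId;
  // Default applies when the while condition itself ends the loop.
  let reason: StopReason = "step_cap";

  while (stepNumber < effectiveCap) {
    if (isDoomLoop(recentToolCalls)) { reason = "doom_loop"; break; }
    if (toolsUsed >= budget) { reason = "budget"; break; } // real code calls runCapHitSummary first
    deps.onStepStart(stepNumber);
    const requested = deps.streamPhase(assistantMessageId);
    if (requested.length === 0) { reason = "finished"; break; } // finalizeCompletion in real code
    const result = deps.toolPhase(requested);
    if (result.action !== "continue") { reason = result.action; break; }
    toolsUsed += result.toolCallCount;
    recentToolCalls = [...recentToolCalls, ...result.toolCalls];
    assistantMessageId = result.nextAssistantId ?? assistantMessageId;
    stepNumber++;
  }
  return { reason, stepNumber, toolsUsed };
}

// Usage: a model that always requests one distinct tool call hits a cap of 2.
let counter = 0;
const demo = runOuterLoop(
  {
    streamPhase: () => [{ name: "read_file", args: `file-${counter++}` }],
    toolPhase: (calls) => ({
      action: "continue",
      toolCallCount: calls.length,
      toolCalls: calls,
      nextAssistantId: `assistant-${counter}`,
    }),
    onStepStart: (n) => console.log(`step_start ${n}`),
  },
  2,
  50,
  "assistant-0",
);
console.log(demo.reason, demo.stepNumber, demo.toolsUsed); // step_cap 2 2
```

In this sketch a cap of 0 falls straight through to `step_cap` with zero steps, which is why the task list treats `effectiveCap === 0` as a separate edge case handled outside the loop.
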
## B6 — sentinel-summaries.ts: step-cap sentinel
- [ ] Add `runStepCapSummary` (or extend `runCapHitSummary` with a `reason` param)
- [ ] Write a sentinel with `metadata.kind = 'cap_hit'` (same as budget) so `CapHitSentinel` UI renders it
- [ ] Sentinel text distinguishes "Step limit reached (N steps)" from "Tool budget exhausted (N calls)"
- [ ] Called from the post-loop check in turn.ts (B5)
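
One way to satisfy B6 is to keep a single `cap_hit` kind and vary only the text, roughly as sketched here. The function name `buildCapSentinel`, the `CapReason` type, and the returned object shape are hypothetical; only `metadata.kind = 'cap_hit'` and the two message templates come from the task list.

```typescript
type CapReason = "steps" | "budget";

interface CapSentinel {
  metadata: { kind: "cap_hit" }; // same kind for both reasons, so the existing UI renders it
  text: string;
}

// Hypothetical builder: `n` is the number of steps taken or tool calls made.
function buildCapSentinel(reason: CapReason, n: number): CapSentinel {
  const text =
    reason === "steps"
      ? `Step limit reached (${n} steps)`
      : `Tool budget exhausted (${n} calls)`;
  return { metadata: { kind: "cap_hit" }, text };
}

console.log(buildCapSentinel("steps", 5).text);   // Step limit reached (5 steps)
console.log(buildCapSentinel("budget", 50).text); // Tool budget exhausted (50 calls)
```
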
## B7 — Tests
- [ ] NEW `apps/server/src/services/__tests__/outer-loop.test.ts`
- [ ] Test: clean finish — stream returns no tool calls, loop exits after 1 step
- [ ] Test: step-cap hit — mock agent with `steps: 2`, model always returns tool calls, loop exits at 2, sentinel written
- [ ] Test: doom-loop — 3 identical tool calls, sentinel written, loop breaks
- [ ] Test: budget exhaustion — toolsUsed >= budget, cap-hit sentinel written
- [ ] Test: `steps: 0` — no loop iterations, text-only response
- [ ] Test: synthesis success — loop breaks after synthesis
- [ ] `pnpm -C apps/server test` — all 332+ existing + new tests pass
## B8 — Verification
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors
- [ ] `npx tsc -p apps/web/tsconfig.app.json --noEmit` — 0 errors (no web changes; should pass)
- [ ] `pnpm -C apps/web build` — green
- [ ] `pnpm -C apps/server test` — all green
## B9 — Docs + tag + deploy
- [ ] `CHANGELOG.md` entry for v1.14.0-outer-loop
- [ ] `boocode_roadmap.md` retrospective bullet on the v1.14 section
- [ ] `CLAUDE.md` updates: mention the outer loop, MAX_STEPS, agent.steps in the inference/ section
- [ ] Commit, tag `v1.14.0-outer-loop`, push, rebuild