Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:
1. FAST_MODEL config: optional env var routes cheap LLM calls (titles,
summaries, labeling) to a smaller model on llama-swap. auto_name.ts
uses ctx.config.FAST_MODEL ?? session.model. Set FAST_MODEL=nemotron-
nano-4b to avoid loading the 35B model for 20-token title generation.
2. Tool-use summaries (services/inference/tool-summaries.ts): utility
that generates "git-commit-subject-style" labels for tool batches via
a fast-model LLM call. System prompt + truncation logic ported from
Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
for BooCoder's dispatcher to call after task completion.
3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
PTY dispatch builds: qwen -p "<task>" --output-format stream-json
(NDJSON structured events over stdout). Env: OPENAI_BASE_URL +
OPENAI_API_KEY points Qwen Code at llama-swap. execution_path CHECK
constraint extended with 'qwen'.
4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
task to N contestants (2-5, each with different agent/model), each
getting its own task row linked by arena_id UUID. GET /api/arena/:id
shows all contestants. POST /api/arena/:id/select/:task_id marks
winner. Schema: arena_id column added to tasks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Converts the ad-hoc executeToolPhase → runAssistantTurn recursion into an
explicit while (stepNumber < effectiveCap) loop. A step is one stream-and-
tool-execute iteration; the loop terminates on non-tool finish, step-cap hit,
doom-loop, budget exhaustion, abort, or synthesis success.
MAX_STEPS = 200 hard ceiling (4x old effective limit from budget). Per-agent
steps: field in AGENTS.md frontmatter sets tighter caps (Refactorer: 5,
Architect: 20, others: unset = bounded only by MAX_STEPS). Resolution:
effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS).
executeToolPhase no longer recurses — returns ToolPhaseResult struct
(action: 'continue' | 'paused' | 'synthesis_done') so the caller decides
whether to continue or break. steps: 0 handled as "no tool calls allowed"
via runTextOnlyTurn (one text-only stream phase, tool calls ignored with
warn log).
Step-cap hits produce a sentinel summary (reuses cap_hit kind so
CapHitSentinel.tsx renders without frontend changes; text distinguishes
"Step limit reached" from "Tool budget exhausted"). Doom-loop check migrated
to top of loop body — same predicate, same threshold (3), break instead of
return.
step_start parts are in the schema CHECK but not emitted as message_parts —
writing before the stream phase creates a sequence-0 collision with
partsFromAssistantMessage. Structured log line emitted instead. Adversarial
review caught the collision pre-deploy.
332/332 server tests passing. No frontend changes. No schema changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel,
runDoomLoopSummary, insertDoomLoopSentinel
- inference.ts → inference/turn.ts: residue is runAssistantTurn,
runInference, createInferenceRunner orchestration only
- inference/index.ts: re-export shim preserves the public surface
(createInferenceRunner, runInference, runAssistantTurn,
detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus
type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/
FramePublisher)
- src/index.ts + auto_name.ts + the two vitest test files updated to
import from ./services/inference/index.js explicitly (NodeNext ESM
doesn't honor directory-index resolution)
Final tally: 11 files under services/inference/, the largest being
sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept
side-by-side until a third sentinel justifies factoring out a shared
runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is
stream-phase.ts at 380. Public import surface unchanged.
tool-phase.ts → turn.ts back-edge for runAssistantTurn remains
(cycle is safe; resolved at call time).
Prepares the file structure for v1.13 AI SDK migration — streamText
swap targets stream-phase.ts only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>