• v2.0.5 e423579e99

    v2.0.5: FAST_MODEL routing + tool-use summaries + Qwen dispatch + Arena

    indifferentketchup released this 2026-05-25 14:05:59 +00:00 | 55 commits to main since this release

    Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:

    1. FAST_MODEL config: optional env var that routes cheap LLM calls
      (titles, summaries, labeling) to a smaller model on llama-swap.
      auto_name.ts uses ctx.config.FAST_MODEL ?? session.model. Set
      FAST_MODEL=nemotron-nano-4b to avoid loading the 35B model for
      20-token title generation.
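
      A minimal sketch of the fallback described in item 1. Only
      FAST_MODEL and session.model come from the notes; the Config and
      Session shapes and both function names are hypothetical.

```typescript
// Hypothetical shapes for illustration; the real config/session types are not shown in the notes.
interface Config {
  FAST_MODEL?: string | undefined;
}
interface Session {
  model: string;
}

// Treat an unset or blank env value as "not configured".
// (Assumption: the notes do not say how blank values are handled.)
function fromEnv(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Cheap calls (titles, summaries, labels) use FAST_MODEL when set,
// otherwise they fall back to the session's own model.
function pickCheapModel(config: Config, session: Session): string {
  return config.FAST_MODEL ?? session.model;
}
```

      Note that ?? falls back only on null or undefined, so an env var
      set to an empty string would still be used as the model name unless
      it is normalized first, as fromEnv does here.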

    2. Tool-use summaries (services/inference/tool-summaries.ts): utility
      that generates "git-commit-subject-style" labels for tool batches via
      a fast-model LLM call. System prompt + truncation logic ported from
      Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
      for BooCoder's dispatcher to call after task completion.
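
      The truncation step in item 2 might look roughly like the sketch
      below. This is not the code ported from toolUseSummary.ts: the
      ToolCall shape, the 300-character limit, and both function names
      are assumptions made for illustration.

```typescript
// Hypothetical record of one tool invocation in a batch.
interface ToolCall {
  name: string;
  input: string;
  output: string;
}

// Cut long fields so the fast model only sees a bounded prompt.
function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max) + "...";
}

// Render a batch as the user message for a one-line summary request.
// The 300-character default is an assumed value, not the real limit.
function buildSummaryInput(batch: ToolCall[], maxFieldChars: number = 300): string {
  return batch
    .map(
      (call) =>
        `Tool: ${call.name}\n` +
        `Input: ${truncate(call.input, maxFieldChars)}\n` +
        `Output: ${truncate(call.output, maxFieldChars)}`
    )
    .join("\n\n");
}
```

      The resulting string would be sent to the fast model together with
      a system prompt asking for a short commit-subject-style label.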

    3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
      PTY dispatch builds: qwen -p "" --output-format stream-json
      (NDJSON structured events over stdout). Env: OPENAI_BASE_URL and
      OPENAI_API_KEY point Qwen Code at llama-swap. execution_path CHECK
      constraint extended with 'qwen'.
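
      Because item 3 streams NDJSON over a PTY, the reader has to cope
      with chunks that end mid-line. A sketch of one way to do that
      follows; NdjsonState and feedNdjson are hypothetical names, and no
      assumption is made about the fields inside each event.

```typescript
// Holds the trailing partial line between chunks.
interface NdjsonState {
  buffer: string;
}

// Feed one stdout chunk; returns every complete JSON event it finished.
function feedNdjson(state: NdjsonState, chunk: string): unknown[] {
  state.buffer += chunk;
  const lines = state.buffer.split("\n");
  // The last element is either "" or an incomplete line; keep it for later.
  state.buffer = lines.pop() ?? "";

  const events: unknown[] = [];
  for (const line of lines) {
    const trimmed = line.trim(); // also drops the "\r" a PTY may add
    if (trimmed === "") continue;
    try {
      events.push(JSON.parse(trimmed));
    } catch {
      // Skip lines that are not JSON (for example stray terminal output).
    }
  }
  return events;
}
```

      Silently skipping unparseable lines is a design choice for this
      sketch; a dispatcher might prefer to log them instead.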

    4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
      task to N contestants (2-5, each with different agent/model), each
      getting its own task row linked by arena_id UUID. GET /api/arena/:id
      shows all contestants. POST /api/arena/:id/select/:task_id marks
      winner. Schema: arena_id column added to tasks.
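
      The fan-out in item 4 can be sketched as a pure planning function.
      The 2-5 contestant limit and the shared arena_id come from the
      notes; the Contestant and ArenaTask shapes, planArena, and the
      injected newId generator are hypothetical.

```typescript
interface Contestant {
  agent: string;
  model: string;
}

// One task row per contestant, all linked by the same arena_id.
interface ArenaTask {
  task_id: string;
  arena_id: string;
  agent: string;
  model: string;
  prompt: string;
}

// newId is injected (for example a UUID generator) so the function stays testable.
function planArena(prompt: string, contestants: Contestant[], newId: () => string): ArenaTask[] {
  if (contestants.length < 2 || contestants.length > 5) {
    throw new RangeError("an arena needs 2 to 5 contestants");
  }
  const arena_id = newId();
  return contestants.map((c) => ({
    task_id: newId(),
    arena_id,
    agent: c.agent,
    model: c.model,
    prompt,
  }));
}
```

      Each returned row would then be inserted into tasks and dispatched
      independently; selecting a winner only needs the arena_id and one
      task_id.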

    Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
