boocode/openspec/changes/archived/orchestrator/artifacts/design-context.md
indifferentketchup a734615480 docs: archive shipped openspec changes, refresh roadmap + DEFERRED-WORK
Move openspec/changes/{contracts-ssot,orchestrator} → archived/ (both shipped,
v2.7.13 and v2.7.17). Mark the roadmap's "Write/edit robustness" and "Claude
provider SDK" milestones as shipped (fuzzy-match.ts + checkpoints.ts; the
claude-sdk backend is live via CLAUDE_SDK_BACKEND in .env.host) and add a
v2.7.12–v2.7.17 shipped summary. Flag DEFERRED-WORK.md as superseded.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:30:01 +00:00


Orchestrator (Phase 2) — settled design (source spec)

This is the behavioral specification for the BooCode Orchestrator, settled via a grill-me interview. It is the ground truth for what gets built; the implementation plan covers how. Phase 1 (the standalone code conductor at /opt/boocode/conductor/) is done; this document covers the in-app integration.

Outcome

Bring the deterministic multi-agent conductor into the BooCode app: a user can launch any Han flow from BooChat or BooCoder, watch each agent's progress live (Paseo-style), and get an evidence-disciplined report — all on local Qwen, free.

Settled decisions (immutable for this plan)

  1. One engine. The conductor is the only engine. Every read-only Han skill IS a conductor flow. No single-agent degraded path.
  2. Two doors, full parity. A slash command and an "Orchestrator" button, both on the shared ChatInput composer — so both appear in BooChat (ChatPane) and BooCoder (CoderPane), desktop and mobile (mobile = icon only). Launching from either opens the same run view.
  3. Run view = new pane kind. A fourth pane kind, orchestrator, alongside chat | terminal | coder, opened as a new pane in the current session. It renders: flow + band at top, a list of agents with live status, each expandable to watch its stream; the final report at the top when done. Shape is parent-with-nested-children (Paseo's parent/subagents), not one-agent.
  4. Execution through BooCoder backends. Each flow agent is a real BooCoder agent session dispatched through the existing AgentBackends → live streaming via the existing AgentEvent→WS-frame pipeline, persisted to Postgres, resumable. New flow_runs + flow_steps tables in the coder schema; the conductor's scheduler fires each step's task as its deps complete.
  5. Read-only, no worktree. Flows never write to the repo. Agents read the project's working directory directly; no git worktree is created. Read-only is enforced at dispatch (agents get no edit/write tools). The report is the only output.
  6. Qwen-only. Default to the loaded 35B model (llama-swap/qwen3.6-35b-a3b-mxfp4), held as a single config value so additional local models slot in later with near-zero rework. No Claude path: Claude Code remains the Claude lane.
  7. Naming. "Orchestrator." The existing Arena (same-task-on-N-models, apps/coder/src/routes/arena.ts) stays a separate feature.
  8. Skill placement. The slash menu surfaces the read-only analysis/review set (research, investigate, code-review, architectural-analysis, security-review, gap-analysis, data-review, devops-review, issue-triage, project-discovery, test-planning). The Orchestrator button exposes the FULL catalog (22 flows), including the planning/authoring draft flows.
  9. Launch UX. Slash launches instantly with defaults (band = small, target = the current pane's project, text after the command = the question/focus), opening an Orchestrator pane. The button opens a launcher first (pick flow, size, target/focus) then launches. Same run view either way.
  10. Report output. Stored with the run in Postgres, shown at the top of the Orchestrator pane. Runs persist and are reopenable from a runs history. Export on demand: copy / save-to-file / send-to-chat. Nothing auto-written to the repo.
  11. Concurrency. Multiple runs are allowed, each in its own pane. They share the one local model, so workers queue at llama-swap; panes report the queued state honestly.
  12. Sizing modes. Han's bands (small/medium/large) select roster breadth per flow; a fast mode caps each worker's depth. Both carry over from Phase 1.
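
Decision 4 says the scheduler fires each step's task as its deps complete. A minimal sketch of that rule follows, under loud assumptions: the FlowStep shape, the status names, and the Dispatch hook are all hypothetical (the real flow_steps schema is an open item below), and dispatch here is a stand-in for a call through a BooCoder backend, not the actual AgentBackends API.

```typescript
// Hypothetical shapes: the real flow_steps schema is still an open item.
type StepStatus = "pending" | "running" | "done" | "failed";

interface FlowStep {
  id: string;
  deps: string[]; // ids of steps that must be "done" before this one fires
  status: StepStatus;
}

// Hypothetical hook standing in for a dispatch through a BooCoder backend.
type Dispatch = (step: FlowStep) => Promise<void>;

// Pending steps whose dependencies have all completed.
function readySteps(steps: FlowStep[]): FlowStep[] {
  const done = new Set(
    steps.filter((s) => s.status === "done").map((s) => s.id),
  );
  return steps.filter(
    (s) => s.status === "pending" && s.deps.every((d) => done.has(d)),
  );
}

// Fires every ready step, then re-checks readiness each time a step settles.
// Resolves with the ids of completed steps, in completion order.
async function runFlow(steps: FlowStep[], dispatch: Dispatch): Promise<string[]> {
  const completed: string[] = [];
  const inFlight = new Set<Promise<void>>();

  const fireReady = (): void => {
    for (const step of readySteps(steps)) {
      step.status = "running";
      const settled: Promise<void> = dispatch(step)
        .then(
          () => {
            step.status = "done";
            completed.push(step.id);
          },
          () => {
            step.status = "failed"; // dependants stay pending and never fire
          },
        )
        .then(() => {
          inFlight.delete(settled);
          fireReady(); // a finished step may have unblocked others
        });
      inFlight.add(settled);
    }
  };

  fireReady();
  while (inFlight.size > 0) {
    await Promise.race(inFlight);
  }
  return completed;
}
```

In this sketch a failed step leaves its dependants pending, so the run stops short instead of producing a report from partial evidence. Whether a real run should retry, skip, or abort in that case is not settled by this spec.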

Evidence & rule alignment (carried from Phase 1)

Flows apply Han's evidence-rule (trust classes, web corroboration gate, no-evidence labeling) and yagni-rule (producing flows only), injected as contracts. The adversarial-validator gate runs the review checklists and emits a plain-language Summary + Confidence. Both are already built in the conductor (conductor/src/contracts.ts); Phase 2 must preserve them when porting dispatch from the opencode run subprocess to the BooCoder backends.
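
The selection rule above (evidence-rule for every flow, yagni-rule for producing flows only) can be illustrated with a short sketch. Every name here is hypothetical and is not the API of conductor/src/contracts.ts; in particular, the producing flag is an assumed way of marking which flows count as producing, and the prompt layout is invented for illustration.

```typescript
// Hypothetical names throughout; not the API of conductor/src/contracts.ts.
type ContractId = "evidence-rule" | "yagni-rule";

interface FlowSpec {
  name: string;
  producing: boolean; // assumption: true for flows that draft or author output
}

// evidence-rule applies to every flow; yagni-rule only to producing flows.
function contractsFor(flow: FlowSpec): ContractId[] {
  const contracts: ContractId[] = ["evidence-rule"];
  if (flow.producing) contracts.push("yagni-rule");
  return contracts;
}

// Prepends the selected contract texts to the agent's task prompt.
function buildPrompt(
  flow: FlowSpec,
  task: string,
  contractText: Record<ContractId, string>,
): string {
  const sections = contractsFor(flow).map(
    (id) => `## Contract: ${id}\n${contractText[id]}`,
  );
  return [...sections, `## Task\n${task}`].join("\n\n");
}
```

Keeping the selection in one function means the same contracts reach an agent regardless of which dispatch path built the prompt, which is the property Phase 2 has to preserve.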

Out of scope (Phase 2)

  • The exact pixel-faithful per-skill Han report templates (spine-level only, by prior decision).
  • A Claude execution path (Claude Code covers it).
  • Folding the existing Arena into the Orchestrator (stays separate).
  • Per-agent model tiering (single model per run for now; revisit when more local models exist).

Open items the plan must resolve (the HOW)

  • How the conductor's flow/spine definitions (Phase 1, conductor/src/) are reused vs. re-homed inside apps/coder when dispatch moves to the backends.
  • The flow_runs/flow_steps schema shape, status lifecycle, and how a step maps to a tasks/agent_sessions row.
  • How the scheduler resumes a run after a coder-service restart (mid-flight steps).
  • The new orchestrator WS frame(s) and how the pane subscribes to per-agent streams (reusing the existing broker/AgentEvent pipeline).
  • The Orchestrator pane component structure and how it nests N live agent streams without the crowding that the grill-me interview rejected.
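
To make the status-lifecycle and restart-recovery questions concrete, here is one possible starting point, not a settled answer. The status names, the transition table, and the requeue-on-restart policy are all assumptions made for illustration. The plan may instead resume a mid-flight agent session in place, since decision 4 describes sessions as resumable.

```typescript
// Hypothetical step lifecycle; the real one is for the plan to decide.
type StepStatus = "queued" | "running" | "done" | "failed";

// Allowed transitions. "running" -> "queued" exists only for restart recovery.
const TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  queued: ["running"],
  running: ["done", "failed", "queued"],
  done: [],
  failed: [],
};

function canTransition(from: StepStatus, to: StepStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Hypothetical, minimal view of a flow_steps row.
interface StepRow {
  id: string;
  status: StepStatus;
}

// Assumed policy: after a coder-service restart, a step still marked
// "running" has lost its worker, so it goes back to "queued".
function recoverAfterRestart(rows: StepRow[]): StepRow[] {
  return rows.map((row): StepRow =>
    row.status === "running" ? { ...row, status: "queued" } : row,
  );
}
```

Terminal states have no outgoing transitions in this sketch, so a finished step is never re-run by recovery. Only work that was in flight is affected.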