Brings the deterministic Han-flow conductor into BooCode: launch any read-only flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style run pane, get an evidence-disciplined report — on local Qwen, persisted and resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator tasks fail closed if qwen is unavailable; never fall to write-capable native). Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema, flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes (launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames. Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows (BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and split menus, runs history + export. Plan: openspec/changes/orchestrator. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
88 lines
5.1 KiB
Markdown
# Orchestrator (Phase 2): settled design (source spec)

This is the behavioral specification for the BooCode Orchestrator, settled via a
`grill-me` interview. It is the ground truth for *what*; the implementation plan
covers *how*. Phase 1 (the standalone code conductor at `/opt/boocode/conductor/`)
is done; this is the in-app integration.

## Outcome

Bring the deterministic multi-agent conductor into the BooCode app: a user can
launch any Han flow from BooChat or BooCoder, watch each agent's progress live
(Paseo-style), and get an evidence-disciplined report, all on local Qwen at no cost.

## Settled decisions (immutable for this plan)

1. **One engine.** The conductor is the only engine. Every read-only Han skill IS
   a conductor flow. There is no single-agent degraded path.
2. **Two doors, full parity.** A slash command and an "Orchestrator" button, both
   on the shared `ChatInput` composer, so both appear in BooChat (ChatPane) and
   BooCoder (CoderPane), on desktop and mobile (mobile = icon only). Launching from
   either opens the same run view.
3. **Run view = new pane kind.** A fourth pane kind, `orchestrator`, alongside
   `chat | terminal | coder`, opened as a new pane in the current session. It
   renders the flow and band at the top, then a list of agents with live status,
   each expandable to watch its stream; the final report appears at the top when
   done. The shape is parent-with-nested-children (Paseo's parent/subagents), not
   a single-agent view.
4. **Execution through BooCoder backends.** Each flow agent is a real BooCoder
   agent session dispatched through the existing `AgentBackend`s. That gives live
   streaming via the existing AgentEvent→WS-frame pipeline, persisted to Postgres
   and resumable. New `flow_runs` + `flow_steps` tables live in the coder schema;
   the conductor's scheduler fires each step's task as its deps complete.
5. **Read-only, no worktree.** Flows never write to the repo. Agents read the
   project's working directory directly; no git worktree is created. Read-only is
   enforced at dispatch (agents get no edit/write tools). The report is the only
   output.
6. **Qwen-only.** Default to the loaded 35B model
   (`llama-swap/qwen3.6-35b-a3b-mxfp4`), held as a single config value so
   additional local models slot in later with near-zero rework. No Claude path:
   Claude Code remains the Claude lane.
7. **Naming.** "Orchestrator." The existing Arena (same-task-on-N-models,
   `apps/coder/src/routes/arena.ts`) stays a separate feature.
8. **Skill placement.** The slash menu surfaces the read-only analysis/review set
   (research, investigate, code-review, architectural-analysis, security-review,
   gap-analysis, data-review, devops-review, issue-triage, project-discovery,
   test-planning). The Orchestrator button exposes the FULL catalog (22 flows),
   including the planning/authoring draft flows.
9. **Launch UX.** A slash command launches instantly with defaults (band = small,
   target = the current pane's project, text after the command = the
   question/focus), opening an Orchestrator pane. The button opens a launcher
   first (pick flow, size, target/focus), then launches. Same run view either way.
10. **Report output.** Stored with the run in Postgres and shown at the top of the
    Orchestrator pane. Runs persist and are reopenable from a runs history.
    Export on demand: copy / save-to-file / send-to-chat. Nothing is auto-written
    to the repo.
11. **Concurrency.** Multiple runs are allowed, each in its own pane. They share
    the one local model, so workers queue at llama-swap, and panes show `queued`
    honestly.
12. **Sizing modes.** Han's bands (small/medium/large) select roster breadth per
    flow; a fast mode caps each worker's depth. Both carry over from Phase 1.

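
Decision 4 says the scheduler fires each step's task as its deps complete. A minimal sketch of that rule follows. It is an illustration under assumptions, not the conductor's code: the `FlowStep` fields, the status names, and the `tick`/`readySteps`/`dispatch` names are all hypothetical, and `dispatch` merely stands in for handing a step to an `AgentBackend`.

```typescript
// Hypothetical shapes: the spec names flow_steps and "deps" but not these fields.
type StepStatus = "pending" | "queued" | "running" | "done" | "failed";

interface FlowStep {
  id: string;
  deps: string[]; // ids of steps that must be done before this one may start
  status: StepStatus;
}

// Return the steps that have not started and whose dependencies are all done.
function readySteps(steps: FlowStep[]): FlowStep[] {
  const done = new Set(steps.filter((s) => s.status === "done").map((s) => s.id));
  return steps.filter(
    (s) => s.status === "pending" && s.deps.every((d) => done.has(d)),
  );
}

// Called at launch and again whenever a step's task reaches a terminal state.
// `dispatch` is a stand-in for sending the step to a backend.
function tick(steps: FlowStep[], dispatch: (step: FlowStep) => void): string[] {
  const fired: string[] = [];
  for (const step of readySteps(steps)) {
    step.status = "queued"; // mark before dispatch so a later tick cannot re-fire it
    dispatch(step);
    fired.push(step.id);
  }
  return fired;
}

// Example roster: two researchers fan in to one validator.
const steps: FlowStep[] = [
  { id: "research-a", deps: [], status: "pending" },
  { id: "research-b", deps: [], status: "pending" },
  { id: "validate", deps: ["research-a", "research-b"], status: "pending" },
];

const firstWave = tick(steps, () => {}); // both researchers fire
steps[0].status = "done";
const afterOne = tick(steps, () => {}); // validator still waits on research-b
steps[1].status = "done";
const afterBoth = tick(steps, () => {}); // validator fires
```

Marking a step `queued` before dispatch keeps the sketch idempotent: calling `tick` twice in a row never fires the same step twice, which matters if terminal events for several steps arrive close together.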
## Evidence & rule alignment (carried from Phase 1)

Flows apply Han's `evidence-rule` (trust classes, web corroboration gate,
no-evidence labeling) and `yagni-rule` (producing flows only), injected as
contracts. The adversarial-validator gate runs the review checklists and emits a
plain-language Summary + Confidence. This is already built in the conductor
(`conductor/src/contracts.ts`); Phase 2 must preserve it when porting dispatch
from the `opencode run` subprocess to the BooCoder backends.

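
To make the "injected as contracts" rule concrete, here is a small sketch of how the two rules might be selected per flow and prepended to a task prompt. Everything in it is an assumption for illustration: the `FlowDef` shape, the `producing` flag, the placeholder rule text, and the `plan-draft` flow name are hypothetical, and the real contract wording lives in `conductor/src/contracts.ts`.

```typescript
// Hypothetical flow descriptor: `producing` marks flows that author or plan output.
interface FlowDef {
  name: string;
  producing: boolean;
}

// Placeholder contract text only; not the conductor's actual wording.
const EVIDENCE_RULE =
  "evidence-rule: label trust class; corroborate web claims; mark no-evidence statements.";
const YAGNI_RULE = "yagni-rule: placeholder text for the producing-flow contract.";

// Every flow gets the evidence rule; only producing flows also get the YAGNI rule.
function contractsFor(flow: FlowDef): string[] {
  return flow.producing ? [EVIDENCE_RULE, YAGNI_RULE] : [EVIDENCE_RULE];
}

// Prepend the contracts to the task text, separated by a blank line.
function buildPrompt(flow: FlowDef, task: string): string {
  return contractsFor(flow).concat(["", task]).join("\n");
}

const reviewPrompt = buildPrompt(
  { name: "code-review", producing: false },
  "Review the auth module.",
);
const planPrompt = buildPrompt(
  { name: "plan-draft", producing: true }, // hypothetical producing flow
  "Draft a migration plan.",
);
```

Because the contracts travel inside the prompt text, a sketch like this does not depend on whether the task is dispatched through a subprocess or a backend.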
## Out of scope (Phase 2)

- The exact pixel-faithful per-skill Han report templates (spine-level only, by
  prior decision).
- A Claude execution path (Claude Code covers it).
- Folding the existing Arena into the Orchestrator (stays separate).
- Per-agent model tiering (single model per run for now; revisit when more local
  models exist).

## Open items the plan must resolve (the HOW)

- How the conductor's flow/spine definitions (Phase 1, `conductor/src/`) are
  reused vs. re-homed inside `apps/coder` when dispatch moves to the backends.
- The `flow_runs`/`flow_steps` schema shape, status lifecycle, and how a step maps
  to a `tasks`/`agent_sessions` row.
- How the scheduler resumes a run after a coder-service restart (mid-flight steps).
- The new `orchestrator` WS frame(s) and how the pane subscribes to per-agent
  streams (reusing the existing broker/AgentEvent pipeline).
- The Orchestrator pane component structure and how it nests N live agent streams
  without the crowding the `grill-me` interview rejected.

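
As a starting point for the status-lifecycle and restart-resume items above, one possible shape is sketched below. It does not settle either item: the status names, the transition table, and the policy of returning mid-flight steps to `pending` are all assumptions made for illustration.

```typescript
// Hypothetical status set; the spec leaves the real lifecycle to the plan.
type StepStatus = "pending" | "queued" | "running" | "done" | "failed" | "cancelled";

// Allowed moves. Terminal states (done, failed, cancelled) have no outgoing moves.
const TRANSITIONS: Record<StepStatus, StepStatus[]> = {
  pending: ["queued", "cancelled"],
  queued: ["running", "pending", "cancelled"],
  running: ["done", "failed", "pending", "cancelled"],
  done: [],
  failed: [],
  cancelled: [],
};

// True when the lifecycle table permits moving from `from` to `to`.
function canMove(from: StepStatus, to: StepStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Assumed restart policy: steps caught mid-flight go back to pending so the
// scheduler can fire them again; steps in a terminal state are left alone.
function recoverAfterRestart(statuses: StepStatus[]): StepStatus[] {
  return statuses.map((s): StepStatus =>
    s === "queued" || s === "running" ? "pending" : s,
  );
}

const recovered = recoverAfterRestart(["done", "running", "queued", "pending", "failed"]);
```

Re-running a mid-flight step from scratch is acceptable in this sketch only because flows are read-only, so a repeated step cannot leave partial writes behind. Whether to resume the existing agent session instead is a question the plan still has to answer.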