Brings the deterministic Han-flow conductor into BooCode: launch any read-only flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style run pane, get an evidence-disciplined report — on local Qwen, persisted and resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator tasks fail closed if qwen is unavailable; never fall to write-capable native). Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema, flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes (launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames. Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows (BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and split menus, runs history + export. Plan: openspec/changes/orchestrator. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Orchestrator (Phase 2) — design (the HOW)
Planning altitude: names files, columns, frames, and decision-bearing values (the plan-mode flag, status sets, frame field names). Every non-obvious choice cites a committed decision in artifacts/implementation-decision-log.md. The behavioral spec is artifacts/design-context.md (decision 5 REVISED here); integration surfaces are in artifacts/.discovery-notes.md.
Architecture at a glance
ChatInput (shared composer) apps/web
├─ Workflow button → FlowLauncherDialog ─┐
└─ /flow slash (instant defaults) ───────┤
▼
POST /api/runs ── apps/coder/routes
▼
flow-runner.ts (DB-driven scheduler)
· loads flow def from src/conductor/
· step.run(ctx) IN-PROCESS → prompt (contracts injected)
· INSERT flow_runs / flow_steps
· INSERT each ready agent step as a tasks row
(mode_id='plan', synthetic chat_id)
▼
dispatcher.ts (REUSED, unchanged internals)
· LISTEN 'tasks_new' → external-agent path
· qwen --approval-mode plan (read-only gate)
· worktree = read snapshot; AgentEvents → WS frames
· onTaskTerminal(taskId,state) ← ONE new hook
▼
flow-runner advances: read full output → run code steps
inline → INSERT next ready wave (or finish + report)
▼
flow_run_started / flow_run_step_updated + reused delta/tool_call/
message_complete (keyed by step chat_id) → broker → WS
▼
OrchestratorPane.tsx (run header, report-at-top,
collapsed roster, expand-one-at-a-time stream)
Re-home & DispatchFn seam (D-1)
Copy the pure (dispatch-free) conductor files into apps/coder/src/conductor/:
spine.ts, flows/*, contracts.ts, types.ts, render.ts. Copy the 23
personas (conductor/agents/*.md). Do NOT copy flow.ts (in-memory scheduler,
replaced by the flow-runner) or dispatch.ts (opencode run subprocess,
replaced by dispatcher reuse). The Phase-1 CLI under conductor/ stays alive
unchanged as a regression oracle.
Two seam edits on the copies:
- Sever flow→dispatch coupling. flows/code-review.ts:10 imports dispatchAgent from ../dispatch.js and calls it at :62. Replace that import with a DispatchFn field on StepContext, injected by the flow-runner. Every flow then reaches dispatch through the context, not a module import.
- Parameterize the model. spine.ts:122 reads process.env.CONDUCTOR_MODEL into the report header. Make it read the run's configured model (passed through the spine factory / step context) so the header matches the run, not a process env.
The evidence/yagni contracts (contracts.ts) and the adversarial-validator gate
are preserved because the flow-runner calls step.run(ctx) in-process to build
each prompt before it INSERTs the task — the closures execute in the coder process;
prompts are never serialized to DB ([D-1] rationale, C11).
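The seam above can be sketched as follows. This is a minimal illustration, not the conductor's real types: the field names on StepContext, the Step shape, and the recording dispatcher are all assumptions made for the example; only the idea of a DispatchFn carried on the context comes from the design.

```typescript
// Hypothetical shapes; the real StepContext and DispatchFn live in types.ts.
type DispatchFn = (agent: string, prompt: string) => Promise<string>;

interface StepContext {
  question: string;
  results: Record<string, string>; // outputs of earlier steps, keyed by step id
  dispatch: DispatchFn; // injected by the flow-runner
}

interface Step {
  id: string;
  run(ctx: StepContext): Promise<string>;
}

// A flow step that reaches dispatch through the context, not a module import.
const reviewStep: Step = {
  id: "review",
  run: (ctx) => ctx.dispatch("reviewer", `Review this: ${ctx.question}`),
};

// Whoever builds the context decides what dispatch means. Here it only records calls.
const recorded: Array<{ agent: string; prompt: string }> = [];
const recordingDispatch: DispatchFn = async (agent, prompt) => {
  recorded.push({ agent, prompt });
  return "queued";
};

void reviewStep.run({
  question: "is the cache safe?",
  results: {},
  dispatch: recordingDispatch,
});
```

Because the flow file no longer imports a dispatcher, the same step definition can run under the flow-runner or under a test double without edits.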
Schema (D-5, D-10)
Two tables in apps/coder/src/schema.sql (coder-owned; applied by the host
boocoder service). Explicit CHECK names + the repo's DROP-IF-EXISTS →
guarded-ADD discipline (root CLAUDE.md).
flow_runs:
- id, project_id (NO FK; matches tasks.project_id, schema.sql:19), flow_name, band CHECK(small|medium|large), model, status CHECK-named(running|completed|failed), input JSONB CHECK(input ? 'question'), report TEXT nullable, error, created_at/updated_at (clock_timestamp()).
- Index flow_runs(project_id, created_at DESC) (runs history).
flow_steps:
- id, run_id UUID → flow_runs(id) ON DELETE CASCADE, step_id, kind CHECK(agent|code), agent, status CHECK-named(pending|running|completed|failed|skipped), with no queued status ([D-10]; llama-swap can't populate it, C16), task_id UUID → tasks(id) ON DELETE SET NULL (nullable; code steps NULL), chat_id UUID → chats(id) ON DELETE SET NULL, input TEXT, output TEXT (FULL output; tasks.output_summary is ≤500 char, schema.sql:26, and can't reconstruct ctx.results, C3), error, timestamps, UNIQUE(run_id, step_id).
- Index flow_steps(run_id, status) (ready-wave + resume scans).
No depends_on column and no skipped-step rows — deps and skips are derivable from
the loaded flow def (flow.ts:28-41, types.ts:27, C6). The FK lives on
flow_steps.task_id, NOT a new column on tasks ([D-5]; keeps tasks generic, C4).
JSONB writes via sql.json(value as never).
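For orientation, the two tables could be mirrored in TypeScript roughly as below. The status and band sets are copied from the CHECK constraints above; the row interfaces, the nullability of error, and the isStepStatus helper are assumptions of this sketch, not the coder's actual types.

```typescript
// Value sets copied from the CHECK constraints described above.
const RUN_STATUSES = ["running", "completed", "failed"] as const;
const STEP_STATUSES = ["pending", "running", "completed", "failed", "skipped"] as const;
const BANDS = ["small", "medium", "large"] as const;

type RunStatus = (typeof RUN_STATUSES)[number];
type StepStatus = (typeof STEP_STATUSES)[number];
type Band = (typeof BANDS)[number];

// Hypothetical row shapes (timestamps omitted for brevity).
interface FlowRunRow {
  id: string;
  project_id: string; // no FK, same as tasks.project_id
  flow_name: string;
  band: Band;
  model: string;
  status: RunStatus;
  input: { question: string } & Record<string, unknown>; // CHECK(input ? 'question')
  report: string | null;
  error: string | null; // nullability assumed
}

interface FlowStepRow {
  id: string;
  run_id: string;
  step_id: string;
  kind: "agent" | "code";
  agent: string | null; // assumed NULL for code steps
  status: StepStatus;
  task_id: string | null; // NULL for code steps
  chat_id: string | null;
  input: string | null;
  output: string | null; // full output, not the ≤500 char summary
  error: string | null;
}

// Runtime guard mirroring the named CHECK on flow_steps.status.
function isStepStatus(value: string): value is StepStatus {
  return STEP_STATUSES.some((s) => s === value);
}
```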
Flow-runner & onTaskTerminal (D-2)
New apps/coder/src/services/flow-runner.ts — a DB-backed scheduler that owns
flow_runs/flow_steps. It does NOT run a poll loop; it reacts to ONE new hook.
createDispatcher gains an onTaskTerminal(taskId, state) callback, invoked at
the existing external-agent terminal transitions (dispatcher.ts:642-646
completed, :659-661 failed). No change to the dispatcher's internal run
functions ([D-2]).
Run lifecycle:
- POST /api/runs → flow-runner loads the flow def, derives the first ready wave, INSERTs flow_runs (status='running') and its flow_steps (each status='pending'), and a synthetic chats row per agent step (stream attribution, [D-6]).
- For each ready agent step: build the prompt via step.run(ctx) in-process, then INSERT a tasks row (project_id, input=prompt, agent, model, mode_id='plan', chat_id=<synthetic>) with state='pending'. The dispatcher picks it up via LISTEN 'tasks_new' ([D-3]).
- code steps run inline in the flow-runner (no task; flow_steps.task_id NULL).
- onTaskTerminal fires → flow-runner reads the full task output, writes it to flow_steps.output, marks the step completed/failed, derives the next ready wave, and INSERTs it (or, on the last wave, renders the report into flow_runs.report and sets status='completed').
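The "derive the next ready wave" step in the lifecycle above can be illustrated in memory. This is a sketch under stated assumptions: StepDef and its dependsOn field are hypothetical stand-ins for the real flow def, treating a skipped dependency as settled is a guess, and the idle flag is this example's own name for "nothing ready and nothing running".

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";

// Hypothetical minimal flow-def shape; the real one lives in the conductor types.
interface StepDef {
  id: string;
  kind: "agent" | "code";
  dependsOn: string[];
}

type StatusMap = Record<string, StepStatus>;

// A step is ready when it is still pending and every dependency has settled.
// Counting "skipped" as settled is an assumption of this sketch.
function readyWave(defs: StepDef[], status: StatusMap): StepDef[] {
  const settled = (id: string): boolean =>
    status[id] === "completed" || status[id] === "skipped";
  return defs.filter((d) => status[d.id] === "pending" && d.dependsOn.every(settled));
}

// What the onTaskTerminal reaction boils down to: record the outcome, then
// derive the next wave from the flow def (no depends_on column needed).
function advance(
  defs: StepDef[],
  status: StatusMap,
  stepId: string,
  outcome: "completed" | "failed",
) {
  const next: StatusMap = { ...status, [stepId]: outcome };
  const wave = readyWave(defs, next);
  const stillRunning = defs.some((d) => next[d.id] === "running");
  return { status: next, wave, idle: wave.length === 0 && !stillRunning };
}

// Example flow: two parallel agent steps feeding one code step.
const exampleDefs: StepDef[] = [
  { id: "scan", kind: "agent", dependsOn: [] },
  { id: "audit", kind: "agent", dependsOn: [] },
  { id: "report", kind: "code", dependsOn: ["scan", "audit"] },
];
```

In the real runner the status map would come from a flow_steps scan on (run_id, status) rather than from an in-memory object.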
Execution via dispatcher reuse (D-3)
Steps execute through the existing dispatcher external-agent path — not a
direct-PTY bypass. The dispatcher creates a git worktree (a stable HEAD
read-checkout), runs the agent, and streams AgentEvents → WS frames unchanged.
This REVISES design-context decision 5 ("no worktree") to "worktree as a harmless
read snapshot" — inert because the agent cannot write under plan mode ([D-4]).
Task-as-dispatch precedents the flow-runner mirrors: routes/skills.ts:94,
routes/arena.ts:49, tools/new_task.ts:54.
Read-only via plan mode (D-4)
The flow-runner hardcodes mode_id='plan' on every step task; never
user-overridable. The PTY dispatcher already passes it to qwen as
--approval-mode plan (pty-dispatch.ts:75), a built-in tool-level gate: reads
allowed, writes blocked. This is the SOLE read-only enforcement. Persona prompts
and BOOCODE_TOOLS are NOT relied upon — they do not govern an external qwen CLI
child (R2 security finding, C13). Adding a non-qwen agent to flows requires
re-verifying that agent's plan-mode equivalent before allowing it.
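One way to keep the mode non-overridable is to give the task builder no mode parameter at all. The sketch below assumes a hypothetical buildStepTask helper and assumes the agent identifier is the literal string "qwen"; neither name is confirmed by the source.

```typescript
// Columns follow the tasks row described in the run lifecycle.
interface StepTaskInsert {
  project_id: string;
  input: string;
  agent: string;
  model: string;
  mode_id: "plan"; // the type admits nothing else
  chat_id: string;
  state: "pending";
}

interface StepTaskArgs {
  project_id: string;
  prompt: string;
  agent: string;
  model: string;
  chat_id: string;
}

// Agents whose plan-mode gate has been verified. The design names only qwen;
// the literal "qwen" as the agent id is an assumption of this sketch.
const PLAN_VERIFIED_AGENTS: readonly string[] = ["qwen"];

function buildStepTask(args: StepTaskArgs): StepTaskInsert {
  // Fail closed: never fall back to an agent without a verified read-only gate.
  if (PLAN_VERIFIED_AGENTS.indexOf(args.agent) === -1) {
    throw new Error(`agent "${args.agent}" has no verified plan-mode gate`);
  }
  return {
    project_id: args.project_id,
    input: args.prompt,
    agent: args.agent,
    model: args.model,
    mode_id: "plan", // hardcoded; callers cannot pass a mode
    chat_id: args.chat_id,
    state: "pending",
  };
}
```

Adding a second agent would then mean extending the allow-list deliberately, after its plan-mode equivalent has been checked.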
WS frames (D-6)
Two new frames in packages/contracts/src/ws-frames.ts WsFrameSchema:
- flow_run_started: { run_id, flow_name, band, steps: [{ step_id, agent, kind, chat_id, label }] }.
- flow_run_step_updated: { run_id, step_id, status, run_status?, report? } (the report rides here; no separate report frame, [D-6]).
The per-agent token stream REUSES the existing delta / tool_call /
message_complete frames keyed by the step's synthetic chat_id — no new
streaming frames. Register both new frames in ALL THREE registries: contracts
WsFrameSchema (rebuild pnpm -C packages/contracts build), the server loose
InferenceFrame union (services/inference/turn.ts), and the web strict
WsFrame union (apps/web/src/api/types.ts — the wire-format gate; missing it
silently drops the frame at JSON-parse).
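The two frames and the strict-gate behaviour can be sketched as a discriminated union plus a parser that returns null for anything it does not recognise. The field names come from the list above; the type discriminant, the value types, and parseFlowFrame itself are assumptions for illustration (the real gate is the WsFrame union in apps/web/src/api/types.ts).

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";
type RunStatus = "running" | "completed" | "failed";

interface FlowRunStarted {
  type: "flow_run_started"; // discriminant name is an assumption
  run_id: string;
  flow_name: string;
  band: "small" | "medium" | "large";
  steps: Array<{ step_id: string; agent: string; kind: "agent" | "code"; chat_id: string; label: string }>;
}

interface FlowRunStepUpdated {
  type: "flow_run_step_updated";
  run_id: string;
  step_id: string;
  status: StepStatus;
  run_status?: RunStatus;
  report?: string; // the report rides on this frame
}

type FlowFrame = FlowRunStarted | FlowRunStepUpdated;

// A strict gate: anything that is not a known frame is dropped (null),
// which is what an unregistered frame would look like to the web client.
function parseFlowFrame(raw: string): FlowFrame | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const frame = value as Record<string, unknown>;
  if (typeof frame["run_id"] !== "string") return null;
  if (frame["type"] === "flow_run_started" && Array.isArray(frame["steps"])) {
    return value as FlowRunStarted;
  }
  if (
    frame["type"] === "flow_run_step_updated" &&
    typeof frame["step_id"] === "string" &&
    typeof frame["status"] === "string"
  ) {
    return value as FlowRunStepUpdated;
  }
  return null;
}
```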
Resume (D-9)
initResume runs on coder startup over flow_runs WHERE status='running':
- step whose task_id task is completed → mark step done, advance the run;
- step whose task is lost/failed (PTY died on restart) → re-dispatch (re-INSERT a fresh task, again mode_id='plan');
- completed steps are kept (no re-run).
Reconcile-and-advance, not mark-run-failed — decision 4 commits to resumable and task state is durable under [D-3] (C15).
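The three resume rules reduce to a small decision function. This sketch covers agent steps only; the ResumeStep shape, the "lost" label for a task that did not survive the restart, and the action names are all hypothetical.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";

// "lost" is this sketch's label for a task that did not survive the restart.
type TaskState = "completed" | "failed" | "lost";

interface ResumeStep {
  step_id: string;
  status: StepStatus; // flow_steps.status as persisted before the restart
  task_state: TaskState; // state of the linked tasks row, as found on startup
}

type ResumeAction = "keep" | "mark_completed" | "redispatch";

function reconcileStep(step: ResumeStep): ResumeAction {
  // Completed steps (and anything not in flight) are kept; no re-run.
  if (step.status !== "running") return "keep";
  // The task finished while the runner was down: record it and advance.
  if (step.task_state === "completed") return "mark_completed";
  // Lost or failed: re-INSERT a fresh task, again with mode_id='plan'.
  return "redispatch";
}

function planResume(steps: ResumeStep[]): Record<string, ResumeAction> {
  const plan: Record<string, ResumeAction> = {};
  for (const step of steps) plan[step.step_id] = reconcileStep(step);
  return plan;
}
```

Pending steps are left alone here on the assumption that the normal ready-wave derivation picks them up once their dependencies settle.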
Orchestrator pane (D-7)
New orchestrator pane kind following the markdown_artifact/html_artifact
precedent (api/types.ts:386 WorkspacePaneKind). Touches WorkspacePaneKind,
useWorkspacePanes, Workspace, NewPaneMenu, ChatTabBar,
PaneHeaderActions.
OrchestratorPane.tsx:
- run header (flow + band);
- report-at-top on completion;
- collapsed agent roster reusing AgentStatusDot (AgentComposerBar.tsx:204);
- expand-one-at-a-time detail well reusing the CoderPane stream rendering (keyed by the step's chat_id);
- mobile single-column inline expand; auto-expand-follows-active.
The pane subscribes to flow_run_started (to build the roster) and
flow_run_step_updated (status + report), and to the reused
delta/tool_call/message_complete frames by chat_id for the expanded agent.
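The pane's handling of the two flow frames can be pictured as a reducer. Everything here is illustrative: PaneState, reducePane, the type discriminant, and the choice to start roster entries as pending are assumptions, and the reused delta/tool_call/message_complete stream is left out.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";
type RunStatus = "running" | "completed" | "failed";

interface RosterEntry {
  step_id: string;
  agent: string;
  chat_id: string; // key for the reused stream frames
  label: string;
  status: StepStatus;
}

interface PaneState {
  run_id: string | null;
  roster: RosterEntry[];
  run_status: RunStatus;
  report: string | null;
  expanded_step_id: string | null; // expand-one-at-a-time
}

type PaneFrame =
  | {
      type: "flow_run_started";
      run_id: string;
      flow_name: string;
      band: string;
      steps: Array<{ step_id: string; agent: string; kind: string; chat_id: string; label: string }>;
    }
  | {
      type: "flow_run_step_updated";
      run_id: string;
      step_id: string;
      status: StepStatus;
      run_status?: RunStatus;
      report?: string;
    };

const emptyPane: PaneState = {
  run_id: null,
  roster: [],
  run_status: "running",
  report: null,
  expanded_step_id: null,
};

function reducePane(state: PaneState, frame: PaneFrame): PaneState {
  if (frame.type === "flow_run_started") {
    // Build the roster; initial "pending" status is an assumption.
    const roster = frame.steps.map(
      (s): RosterEntry => ({ step_id: s.step_id, agent: s.agent, chat_id: s.chat_id, label: s.label, status: "pending" }),
    );
    return { ...state, run_id: frame.run_id, run_status: "running", roster };
  }
  if (frame.run_id !== state.run_id) return state; // frame belongs to another run
  return {
    ...state,
    roster: state.roster.map((e) => (e.step_id === frame.step_id ? { ...e, status: frame.status } : e)),
    run_status: frame.run_status ?? state.run_status,
    report: frame.report ?? state.report,
    // auto-expand follows the step that just became active
    expanded_step_id: frame.status === "running" ? frame.step_id : state.expanded_step_id,
  };
}
```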
Toolbar button & launcher (D-8)
A Workflow (lucide) button on ChatInput's controls row, between the
SquareSlash chip and the Globe pill (ChatInput.tsx:648-732, :673 — row is
≤5 elements, stays one line, C9). Because ChatInput is rendered by both ChatPane
and CoderPane, this is BooChat + BooCoder parity from one button. "Flows" label
desktop, icon-only mobile.
- Slash (/flow <focus>): launches instantly with defaults (band small, current pane's project, text-after-command = focus), opening an Orchestrator pane.
- Button → FlowLauncherDialog.tsx: 5 category tabs (Analysis / Discovery / Planning / Authoring / Review) filtering the flow list (flows/index.ts), + size + focus + fast toggle; defaults Analysis / Small / off. Same run pane either way.
Runs history surfaces in NewPaneMenu. Export (copy / save-file / send-to-chat via
the existing sendToChat, lib/events.ts) lives in the pane header …,
conditional on a completed report.
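The slash path's "instant defaults" could look like the parser below. It is a sketch only: parseFlowSlash is a hypothetical helper, mapping the focus text to input.question is an assumption, the default flow name is left as a parameter because the source does not name it, and returning null for an empty focus is this example's choice.

```typescript
interface LaunchRequest {
  project_id: string;
  flow_name: string;
  band: "small" | "medium" | "large";
  input: { question: string };
}

// Turn "/flow <focus>" into a launch request using the slash defaults:
// band small, the current pane's project, text after the command as focus.
function parseFlowSlash(text: string, projectId: string, defaultFlow: string): LaunchRequest | null {
  const match = /^\/flow(?:\s+([\s\S]*))?$/.exec(text.trim());
  if (!match) return null; // not the /flow command
  const focus = (match[1] ?? "").trim();
  if (focus === "") return null; // assumption: a launch needs some focus text
  return {
    project_id: projectId,
    flow_name: defaultFlow,
    band: "small",
    input: { question: focus },
  };
}
```

The dialog path would build the same request shape with the user's chosen flow, size, and focus, which is why both entry points can open the same run pane.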
Concurrency (D-10)
Multiple runs allowed; each its own pane + flow_runs row, no shared state. Step
statuses: pending / running / completed / failed / skipped (no queued — the
dispatcher's pending covers a step waiting on deps or on the busy model; llama-
swap can't report queue position, C16). Single model per run, default
qwen3.6-35b-a3b-mxfp4.
Routes
- POST /api/runs: { project_id, flow_name, band, input:{question,...}, model? } → creates the run, starts the flow-runner, returns run_id. Publishes flow_run_started.
- GET /api/runs?project_id=: runs history (backs NewPaneMenu).
- GET /api/runs/:id: reopen a run (run + steps + report).
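A hand-rolled check of the POST /api/runs body might look like this. The field names and the default model come from this document; validateLaunchBody, its error strings, and the result shape are hypothetical, and the real route may well validate with a schema library instead.

```typescript
const BANDS = ["small", "medium", "large"] as const;
type Band = (typeof BANDS)[number];

// Default named in the Concurrency section.
const DEFAULT_MODEL = "qwen3.6-35b-a3b-mxfp4";

interface LaunchBody {
  project_id: string;
  flow_name: string;
  band: Band;
  input: { question: string } & Record<string, unknown>;
  model: string;
}

type Validation = { ok: true; value: LaunchBody } | { ok: false; error: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function validateLaunchBody(body: unknown): Validation {
  if (!isRecord(body)) return { ok: false, error: "body must be an object" };
  const { project_id, flow_name, band, input, model } = body;
  if (typeof project_id !== "string" || project_id === "") {
    return { ok: false, error: "project_id is required" };
  }
  if (typeof flow_name !== "string" || flow_name === "") {
    return { ok: false, error: "flow_name is required" };
  }
  const knownBand = BANDS.find((b) => b === band);
  if (knownBand === undefined) return { ok: false, error: "band must be small, medium, or large" };
  if (!isRecord(input)) return { ok: false, error: "input must be an object" };
  const question = input["question"];
  // Mirrors the table's CHECK(input ? 'question').
  if (typeof question !== "string") return { ok: false, error: "input.question is required" };
  let chosenModel = DEFAULT_MODEL;
  if (model !== undefined) {
    if (typeof model !== "string") return { ok: false, error: "model must be a string" };
    chosenModel = model;
  }
  return {
    ok: true,
    value: { project_id, flow_name, band: knownBand, input: { ...input, question }, model: chosenModel },
  };
}
```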
Deploy surface
- apps/coder changes (conductor defs, flow-runner, dispatcher hook, schema, resume, routes) → sudo systemctl restart boocoder.
- packages/contracts + apps/web (frames, pane, button, launcher, history) → docker compose up --build -d boocode. Build contracts first (pnpm -C packages/contracts build).
Deferred (YAGNI)
Full list with reopen triggers in
artifacts/implementation-decision-log.md:
@boocode/conductor workspace package (copy-in instead);
flow_steps.depends_on column (derive from flow def);
persisted skipped-step rows (when() is pure);
a read_only flag on tasks (superseded by mode_id='plan');
an explicit queued status (llama-swap can't populate it);
a launcher search box (5 category tabs suffice);
a separate report WS frame (report rides on flow_run_step_updated).