boocode/openspec/changes/orchestrator/design.md

# Orchestrator (Phase 2) — design (the HOW)

This document is written at planning altitude: it names files, columns, frames,
and decision-bearing values (the plan-mode flag, status sets, frame field names).
Every non-obvious choice cites a committed decision in
[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md).
The behavioral spec is [artifacts/design-context.md](artifacts/design-context.md)
(its decision 5 is REVISED here); integration surfaces are in
[artifacts/.discovery-notes.md](artifacts/.discovery-notes.md).

## Architecture at a glance

```
ChatInput (shared composer)                          apps/web
  ├─ Workflow button → FlowLauncherDialog ─┐
  └─ /flow slash (instant defaults) ───────┤
                                           ▼
                          POST /api/runs  ── apps/coder/routes
                                           ▼
                       flow-runner.ts (DB-driven scheduler)
                         · loads flow def from src/conductor/
                         · step.run(ctx) IN-PROCESS → prompt (contracts injected)
                         · INSERT flow_runs / flow_steps
                         · INSERT each ready agent step as a tasks row
                              (mode_id='plan', synthetic chat_id)
                                           ▼
                          dispatcher.ts (REUSED, unchanged internals)
                         · LISTEN 'tasks_new' → external-agent path
                         · qwen --approval-mode plan  (read-only gate)
                         · worktree = read snapshot; AgentEvents → WS frames
                         · onTaskTerminal(taskId,state)  ← ONE new hook
                                           ▼
                       flow-runner advances: read full output → run code steps
                         inline → INSERT next ready wave  (or finish + report)
                                           ▼
   flow_run_started / flow_run_step_updated  +  reused delta/tool_call/
   message_complete (keyed by step chat_id)  → broker → WS
                                           ▼
                 OrchestratorPane.tsx (run header, report-at-top,
                 collapsed roster, expand-one-at-a-time stream)
```

## Re-home & DispatchFn seam ([D-1](artifacts/implementation-decision-log.md#d-1--re-home-the-pure-conductor-definitions-into-appscodersrcconductor))

Copy the pure (dispatch-free) conductor files into `apps/coder/src/conductor/`:
`spine.ts`, `flows/*`, `contracts.ts`, `types.ts`, `render.ts`. Copy the 23
personas (`conductor/agents/*.md`). Do NOT copy `flow.ts` (in-memory scheduler,
replaced by the flow-runner) or `dispatch.ts` (`opencode run` subprocess,
replaced by dispatcher reuse). The Phase-1 CLI under `conductor/` stays alive
unchanged as a regression oracle.

Two seam edits on the copies:

- **Sever flow→dispatch coupling.** `flows/code-review.ts:10` imports
  `dispatchAgent` from `../dispatch.js` and calls it at `:62`. Replace that import
  with a `DispatchFn` field on `StepContext`, injected by the flow-runner. Every
  flow then reaches dispatch through the context, not a module import.
- **Parameterize the model.** `spine.ts:122` reads `process.env.CONDUCTOR_MODEL`
  into the report header. Make it read the run's configured `model` (passed through
  the spine factory / step context) so the header matches the run, not a process
  env.
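
As a rough illustration of the first seam edit, the sketch below shows a step reaching dispatch only through its context. The `DispatchFn` signature, the `StepContext` fields other than the dispatch seam itself, the `reviewer` agent name, and the recording fake are all assumptions made for the example; the real shapes live in `types.ts`.

```typescript
// Hypothetical shapes for illustration; the real ones live in types.ts.
type DispatchFn = (agent: string, prompt: string) => Promise<string>;

interface StepContext {
  question: string;
  results: Record<string, string>; // outputs of earlier steps, keyed by step id
  dispatch: DispatchFn; // injected by the flow-runner
}

interface Step {
  id: string;
  run: (ctx: StepContext) => Promise<string>;
}

// The step reaches dispatch through its context, never through a module import.
const reviewStep: Step = {
  id: "review",
  run: (ctx) => ctx.dispatch("reviewer", `Review: ${ctx.question}`),
};

// A recording fake stands in for whatever DispatchFn the flow-runner injects.
const calls: Array<{ agent: string; prompt: string }> = [];
const fakeDispatch: DispatchFn = async (agent, prompt) => {
  calls.push({ agent, prompt });
  return `dispatched:${agent}`;
};
```

One consequence of this shape is that a flow can be exercised with a fake `dispatch` and no dispatcher running, which is also how the Phase-1 CLI could keep serving as an oracle.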

The evidence/yagni contracts (`contracts.ts`) and the adversarial-validator gate
are preserved because the flow-runner calls `step.run(ctx)` **in-process** to build
each prompt before it INSERTs the task. The closures execute in the coder process
and are never serialized to the DB; only the rendered prompt is stored, as the
task `input` ([D-1] rationale, C11).

## Schema ([D-5](artifacts/implementation-decision-log.md#d-5--flow_runs--flow_steps-schema-in-the-coder-schema), [D-10](artifacts/implementation-decision-log.md#d-10--concurrency-multiple-runs-no-queued-status-single-model-per-run))

Two tables in `apps/coder/src/schema.sql` (coder-owned; applied by the host
boocoder service). Constraints carry explicit CHECK names and follow the repo's
DROP-IF-EXISTS → guarded-ADD discipline (root CLAUDE.md).

`flow_runs`:
- `id`, `project_id` (NO FK — matches `tasks.project_id`, `schema.sql:19`),
  `flow_name`, `band` CHECK `(small|medium|large)`, `model`,
  `status` CHECK-named `(running|completed|failed)`,
  `input` JSONB CHECK `(input ? 'question')`,
  `report` TEXT nullable, `error`, `created_at`/`updated_at` (`clock_timestamp()`).
- Index `flow_runs(project_id, created_at DESC)` (runs history).

`flow_steps`:
- `id`, `run_id` UUID → `flow_runs(id)` ON DELETE CASCADE, `step_id`,
  `kind` CHECK `(agent|code)`, `agent`,
  `status` CHECK-named `(pending|running|completed|failed|skipped)`
  — no `queued` status ([D-10]; llama-swap can't populate it, C16),
  `task_id` UUID → `tasks(id)` ON DELETE SET NULL (nullable; code steps NULL),
  `chat_id` UUID → `chats(id)` ON DELETE SET NULL,
  `input` TEXT, `output` TEXT (FULL output — `tasks.output_summary` is ≤500 char,
  `schema.sql:26`, and can't reconstruct `ctx.results`, C3), `error`,
  timestamps, UNIQUE `(run_id, step_id)`.
- Index `flow_steps(run_id, status)` (ready-wave + resume scans).

No `depends_on` column and no skipped-step rows — deps and skips are derivable from
the loaded flow def (`flow.ts:28-41`, `types.ts:27`, C6). The FK lives on
`flow_steps.task_id`, NOT a new column on `tasks` ([D-5]; keeps `tasks` generic, C4).
JSONB writes via `sql.json(value as never)`.
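
To make "deps are derivable from the loaded flow def" concrete, here is a minimal sketch of ready-wave derivation from the flow def plus the persisted `flow_steps` statuses. The `dependsOn` field name is hypothetical, and the sketch assumes that `pending` means "not yet dispatched" and that a `skipped` dependency counts as satisfied; neither assumption is stated by the schema itself.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";

// Hypothetical view of a step definition; the real type lives in types.ts.
interface StepDef {
  id: string;
  dependsOn: string[];
}

// statuses maps step_id -> status as persisted in flow_steps.
function readyWave(defs: StepDef[], statuses: Map<string, StepStatus>): string[] {
  // Assumption: a dependency is satisfied once it is completed or skipped.
  const satisfied = (id: string): boolean => {
    const status = statuses.get(id);
    return status === "completed" || status === "skipped";
  };
  return defs
    .filter((def) => (statuses.get(def.id) ?? "pending") === "pending")
    .filter((def) => def.dependsOn.every(satisfied))
    .map((def) => def.id);
}
```

Because the function takes only the flow def and the status map, the same derivation can serve both the live advance path and a resume scan over `flow_steps(run_id, status)`.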

## Flow-runner & onTaskTerminal ([D-2](artifacts/implementation-decision-log.md#d-2--db-driven-flow-runner-with-an-ontaskterminal-dispatcher-hook))

New `apps/coder/src/services/flow-runner.ts` — a DB-backed scheduler that owns
`flow_runs`/`flow_steps`. It does NOT run a poll loop; it reacts to ONE new hook.

`createDispatcher` gains an `onTaskTerminal(taskId, state)` callback, invoked at
the existing external-agent terminal transitions (`dispatcher.ts:642-646`
completed, `:659-661` failed). No change to the dispatcher's internal run
functions ([D-2]).

Run lifecycle:
1. `POST /api/runs` → flow-runner loads the flow def, derives the first ready wave,
   INSERTs `flow_runs` (`status='running'`) and its `flow_steps` (each
   `status='pending'`), and a synthetic `chats` row per agent step (stream
   attribution, [D-6]).
2. For each ready `agent` step: build the prompt via `step.run(ctx)` in-process,
   then INSERT a `tasks` row `(project_id, input=prompt, agent, model,
   mode_id='plan', chat_id=<synthetic>)` with `state='pending'`. The dispatcher
   picks it up via `LISTEN 'tasks_new'` ([D-3]).
3. `code` steps run inline in the flow-runner (no task; `flow_steps.task_id` NULL).
4. `onTaskTerminal` fires → flow-runner reads the **full** task output, writes it to
   `flow_steps.output`, marks the step completed/failed, derives the next ready
   wave, and INSERTs it (or, on the last wave, renders the report into
   `flow_runs.report` and sets `status='completed'`).
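
Step 4 can be sketched as a pure function over in-memory rows. This is an illustration only: `StepRow`, `TerminalResult`, and `applyTaskTerminal` are hypothetical names, the real handler works against `flow_steps` in the DB, and the sketch assumes the hook also fires for tasks that belong to no flow step (those are ignored). It deliberately takes no position on what a failed step does to the run.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";
type TerminalState = "completed" | "failed";

interface StepRow {
  stepId: string;
  taskId: string | null; // null for code steps
  status: StepStatus;
  output: string | null;
}

interface TerminalResult {
  step: StepRow | null; // null when the task belongs to no flow step
  allSettled: boolean; // true when no step is pending or running
}

function applyTaskTerminal(
  steps: StepRow[],
  taskId: string,
  state: TerminalState,
  fullOutput: string,
): TerminalResult {
  const step = steps.find((row) => row.taskId === taskId) ?? null;
  if (step !== null) {
    step.status = state;
    step.output = fullOutput; // the FULL output, not the <=500 char summary
  }
  const allSettled = steps.every(
    (row) => row.status !== "pending" && row.status !== "running",
  );
  return { step, allSettled };
}
```

When `allSettled` is false the caller would derive and INSERT the next ready wave; when it is true the caller would render the report.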

## Execution via dispatcher reuse ([D-3](artifacts/implementation-decision-log.md#d-3--reuse-the-existing-dispatcher-insert-pending-task-not-a-direct-pty-bypass))

Steps execute through the **existing** dispatcher external-agent path — not a
direct-PTY bypass. The dispatcher creates a git worktree (a stable HEAD
read-checkout), runs the agent, and streams AgentEvents → WS frames unchanged.
This REVISES design-context decision 5 ("no worktree") to "worktree as a harmless
read snapshot" — inert because the agent cannot write under plan mode ([D-4]).
Task-as-dispatch precedents the flow-runner mirrors: `routes/skills.ts:94`,
`routes/arena.ts:49`, `tools/new_task.ts:54`.

## Read-only via plan mode ([D-4](artifacts/implementation-decision-log.md#d-4--read-only-enforced-hard-by-mode_idplan-qwen---approval-mode-plan))

The flow-runner hardcodes `mode_id='plan'` on every step task; never
user-overridable. The PTY dispatcher already passes it to qwen as
`--approval-mode plan` (`pty-dispatch.ts:75`), a built-in tool-level gate: reads
allowed, writes blocked. This is the SOLE read-only enforcement. Persona prompts
and `BOOCODE_TOOLS` are NOT relied upon — they do not govern an external qwen CLI
child (R2 security finding, C13). Adding a non-qwen agent to flows requires
re-verifying that agent's plan-mode equivalent before allowing it.
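
One way to keep `mode_id` out of user reach is to build the task row field by field and give `mode_id` a literal type, so nothing from a request body can pass through. The sketch below assumes hypothetical `StepTaskInput` and `StepTaskRow` shapes that mirror the columns named in the run lifecycle; it is not the actual INSERT code.

```typescript
// Hypothetical input assembled by the flow-runner for one agent step.
interface StepTaskInput {
  projectId: string;
  prompt: string;
  agent: string;
  model: string;
  chatId: string;
}

// Column names follow the tasks row described in the run lifecycle.
interface StepTaskRow {
  project_id: string;
  input: string;
  agent: string;
  model: string;
  mode_id: "plan"; // literal type: any other mode is a compile error
  chat_id: string;
  state: "pending";
}

function buildStepTask(step: StepTaskInput): StepTaskRow {
  // Fields are copied one by one, so extra keys on `step` never reach the row.
  return {
    project_id: step.projectId,
    input: step.prompt,
    agent: step.agent,
    model: step.model,
    mode_id: "plan",
    chat_id: step.chatId,
    state: "pending",
  };
}
```

The type only guards the flow-runner's own code; the actual enforcement remains the `--approval-mode plan` gate in the qwen child.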

## WS frames ([D-6](artifacts/implementation-decision-log.md#d-6--two-new-ws-frames-per-agent-stream-reuses-existing-frames-by-chat_id))

Two new frames in `packages/contracts/src/ws-frames.ts` `WsFrameSchema`:
- `flow_run_started`: `{ run_id, flow_name, band, steps: [{ step_id, agent, kind,
  chat_id, label }] }`.
- `flow_run_step_updated`: `{ run_id, step_id, status, run_status?, report? }`
  (the report rides here — no separate report frame, [D-6]).

The per-agent token stream REUSES the existing `delta` / `tool_call` /
`message_complete` frames keyed by the step's synthetic `chat_id` — no new
streaming frames. Register both new frames in ALL THREE registries: contracts
`WsFrameSchema` (rebuild `pnpm -C packages/contracts build`), the server loose
`InferenceFrame` union (`services/inference/turn.ts`), and the web strict
`WsFrame` union (`apps/web/src/api/types.ts` — the wire-format gate; missing it
silently drops the frame at JSON-parse).
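
The two frames and the "unknown frames are dropped at parse" behavior can be sketched in plain TypeScript. The `type` discriminant name, the nullable `agent` and `chat_id` on code steps, and `parseFlowFrame` itself are assumptions for illustration; the real definitions belong in `WsFrameSchema` and the two unions named above.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";
type RunStatus = "running" | "completed" | "failed";

interface FlowRunStarted {
  type: "flow_run_started"; // discriminant name is an assumption
  run_id: string;
  flow_name: string;
  band: "small" | "medium" | "large";
  steps: Array<{
    step_id: string;
    agent: string | null;
    kind: "agent" | "code";
    chat_id: string | null;
    label: string;
  }>;
}

interface FlowRunStepUpdated {
  type: "flow_run_step_updated";
  run_id: string;
  step_id: string;
  status: StepStatus;
  run_status?: RunStatus;
  report?: string; // the report rides here; there is no separate report frame
}

type FlowFrame = FlowRunStarted | FlowRunStepUpdated;

// Returns null for anything that is not a recognized flow frame.
function parseFlowFrame(raw: string): FlowFrame | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const frame = value as Record<string, unknown>;
  if (typeof frame["run_id"] !== "string") return null;
  if (frame["type"] === "flow_run_started" && Array.isArray(frame["steps"])) {
    return value as FlowRunStarted;
  }
  if (
    frame["type"] === "flow_run_step_updated" &&
    typeof frame["step_id"] === "string" &&
    typeof frame["status"] === "string"
  ) {
    return value as FlowRunStepUpdated;
  }
  return null;
}
```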

## Resume ([D-9](artifacts/implementation-decision-log.md#d-9--resumable-runs-via-initresume-on-coder-startup))

`initResume` runs on coder startup over `flow_runs WHERE status='running'`:
- step whose `task_id` task is `completed` → mark step done, advance the run;
- step whose task is lost/failed (PTY died on restart) → re-dispatch (re-INSERT a
  fresh task, again `mode_id='plan'`);
- completed steps are kept (no re-run).

The policy is reconcile-and-advance, not mark-run-failed: decision 4 commits to
resumable runs, and task state is durable under [D-3] (C15).
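
The per-step reconcile rules can be sketched as a small decision function. `TaskOutcome`, `ResumeAction`, and `reconcileStep` are hypothetical names; how a task is classified as `lost` (for example a row still marked running after its PTY died) is left to the caller, and treating a step with no task as "leave it for ready-wave derivation" is an assumption.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";

// What the caller observed for the step's task after the restart.
type TaskOutcome = "completed" | "failed" | "lost";

type ResumeAction = "keep" | "mark_done_and_advance" | "redispatch";

function reconcileStep(status: StepStatus, task: TaskOutcome | null): ResumeAction {
  // Completed steps are kept, never re-run.
  if (status === "completed" || status === "skipped") return "keep";
  // Assumption: no task yet means the step was never dispatched.
  if (task === null) return "keep";
  if (task === "completed") return "mark_done_and_advance";
  // Lost or failed: re-INSERT a fresh task, again with mode_id='plan'.
  return "redispatch";
}
```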

## Orchestrator pane ([D-7](artifacts/implementation-decision-log.md#d-7--orchestrator-pane-kind--orchestratorpane))

New `orchestrator` pane kind following the `markdown_artifact`/`html_artifact`
precedent (`api/types.ts:386` `WorkspacePaneKind`). Touches `WorkspacePaneKind`,
`useWorkspacePanes`, `Workspace`, `NewPaneMenu`, `ChatTabBar`,
`PaneHeaderActions`.

`OrchestratorPane.tsx`:
- run header (flow + band);
- report-at-top on completion;
- collapsed agent roster reusing `AgentStatusDot` (`AgentComposerBar.tsx:204`);
- expand-one-at-a-time detail well reusing the CoderPane stream rendering (keyed by
  the step's `chat_id`);
- mobile single-column inline expand; auto-expand-follows-active.

The pane subscribes to `flow_run_started` (to build the roster) and
`flow_run_step_updated` (status + report), and to the reused
delta/tool_call/message_complete frames by `chat_id` for the expanded agent.
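
The subscription logic above might reduce to something like the following state reducer. All names here (`PaneState`, `PaneFrame`, `reducePane`, the `type` discriminant) are hypothetical, only the frame fields the pane needs are modeled, and "auto-expand follows active" is interpreted as expanding whichever step most recently became `running`.

```typescript
type StepStatus = "pending" | "running" | "completed" | "failed" | "skipped";
type RunStatus = "running" | "completed" | "failed";

interface RosterEntry {
  step_id: string;
  label: string;
  chat_id: string | null; // key for the reused delta/tool_call frames
  status: StepStatus;
}

interface PaneState {
  roster: RosterEntry[];
  runStatus: RunStatus;
  report: string | null;
  expandedStepId: string | null;
}

type PaneFrame =
  | {
      type: "flow_run_started";
      run_id: string;
      steps: Array<{ step_id: string; label: string; chat_id: string | null }>;
    }
  | {
      type: "flow_run_step_updated";
      run_id: string;
      step_id: string;
      status: StepStatus;
      run_status?: RunStatus;
      report?: string;
    };

function reducePane(state: PaneState, frame: PaneFrame): PaneState {
  if (frame.type === "flow_run_started") {
    // Build the collapsed roster; every step starts as pending.
    return {
      roster: frame.steps.map((step) => ({ ...step, status: "pending" as const })),
      runStatus: "running",
      report: null,
      expandedStepId: null,
    };
  }
  const { step_id, status } = frame;
  return {
    roster: state.roster.map((entry) =>
      entry.step_id === step_id ? { ...entry, status } : entry,
    ),
    runStatus: frame.run_status ?? state.runStatus,
    report: frame.report ?? state.report,
    // Auto-expand follows the step that just became active.
    expandedStepId: status === "running" ? step_id : state.expandedStepId,
  };
}
```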

## Toolbar button & launcher ([D-8](artifacts/implementation-decision-log.md#d-8--workflow-toolbar-button--slash-launch-boochatboocoder-parity))

A `Workflow` (lucide) button sits on `ChatInput`'s controls row, between the
`SquareSlash` chip and the `Globe` pill (`ChatInput.tsx:648-732`, `:673`; the row
is ≤5 elements and stays on one line, C9). Because `ChatInput` is rendered by both
ChatPane and CoderPane, one button gives BooChat + BooCoder parity. It shows a
"Flows" label on desktop and is icon-only on mobile.

- **Slash** (`/flow <focus>`): launches instantly with defaults (band `small`,
  current pane's project, text-after-command = focus), opening an Orchestrator
  pane.
- **Button** → `FlowLauncherDialog.tsx`: 5 category tabs (Analysis / Discovery /
  Planning / Authoring / Review) filter the flow list (`flows/index.ts`); the dialog
  also takes a size, a focus, and a fast toggle. Defaults: Analysis / Small / off.
  Either path opens the same run pane.
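
The slash path could be sketched as a parser that turns composer text into a run request with defaults filled in. `parseFlowSlash` and `RunRequest` are hypothetical names, the default flow is passed in because the slash default flow is not pinned down here, and mapping the focus text onto `input.question` is an assumption based on the run input shape.

```typescript
interface RunRequest {
  project_id: string;
  flow_name: string;
  band: "small" | "medium" | "large";
  input: { question: string };
}

// Returns null when the text is not a /flow command.
function parseFlowSlash(
  text: string,
  projectId: string,
  defaultFlow: string, // hypothetical: whichever flow the slash path defaults to
): RunRequest | null {
  const match = /^\/flow(?:\s+([\s\S]*))?$/.exec(text.trim());
  if (match === null) return null;
  // Text after the command is the focus; an empty focus is passed through as "".
  const focus = (match[1] ?? "").trim();
  return {
    project_id: projectId,
    flow_name: defaultFlow,
    band: "small",
    input: { question: focus },
  };
}
```

The dialog path would build the same request shape from its form fields, which is what lets both launchers open the same run pane.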

Runs history surfaces in `NewPaneMenu`. Export (copy / save-file / send-to-chat via
the existing `sendToChat`, `lib/events.ts`) lives in the pane header `…`,
conditional on a completed report.

## Concurrency ([D-10](artifacts/implementation-decision-log.md#d-10--concurrency-multiple-runs-no-queued-status-single-model-per-run))

Multiple runs are allowed; each gets its own pane and `flow_runs` row, with no
shared state. Step statuses: pending / running / completed / failed / skipped.
There is no `queued` status: the dispatcher's `pending` covers a step waiting on
deps or on the busy model, and llama-swap can't report queue position (C16).
Single model per run, default `qwen3.6-35b-a3b-mxfp4`.

## Routes

- `POST /api/runs` — `{ project_id, flow_name, band, input:{question,...}, model? }`
  → creates the run, starts the flow-runner, returns `run_id`. Publishes
  `flow_run_started`.
- `GET /api/runs?project_id=` — runs history (backs `NewPaneMenu`).
- `GET /api/runs/:id` — reopen a run (run + steps + report).

## Deploy surface

- `apps/coder` changes (conductor defs, flow-runner, dispatcher hook, schema,
  resume, routes) → `sudo systemctl restart boocoder`.
- `packages/contracts` + `apps/web` (frames, pane, button, launcher, history) →
  `docker compose up --build -d boocode`. Build contracts first
  (`pnpm -C packages/contracts build`).

## Deferred (YAGNI)

Full list with reopen triggers in
[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md#deferred-yagni):
`@boocode/conductor` workspace package (copy-in instead);
`flow_steps.depends_on` column (derive from flow def);
persisted skipped-step rows (`when()` is pure);
a `read_only` flag on `tasks` (superseded by `mode_id='plan'`);
an explicit `queued` status (llama-swap can't populate it);
a launcher search box (5 category tabs suffice);
a separate report WS frame (report rides on `flow_run_step_updated`).