docs: archive shipped openspec changes, refresh roadmap + DEFERRED-WORK

Move openspec/changes/{contracts-ssot,orchestrator} → archived/ (both shipped, v2.7.13 and v2.7.17). Mark the roadmap's "Write/edit robustness" and "Claude provider SDK" milestones as shipped (fuzzy-match.ts + checkpoints.ts; the claude-sdk backend is live via CLAUDE_SDK_BACKEND in .env.host) and add a v2.7.12–v2.7.17 shipped summary. Flag DEFERRED-WORK.md as superseded.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:30:01 +00:00
parent 868d9db3f2
commit a734615480
11 changed files with 9 additions and 5 deletions
--- a/openspec/changes/archived/contracts-ssot/proposal.md
+++ b/openspec/changes/archived/contracts-ssot/proposal.md
@@ -0,0 +1,102 @@
+# @boocode/contracts — cross-app wire contract SSOT
+
+**Status:** shipped (2026-06-02)
+
+Eliminate BooCode's hand-synced, duplicated cross-app TypeScript wire contracts by
+creating one workspace package, `@boocode/contracts`, that every consumer imports.
+Each contract is defined exactly once; the two Zod-backed contracts (`ws-frames`,
+`provider-config`) use `z.infer` so validator and type derive from the same definition
+and cannot drift independently.
+
+## Why
+
+BooCode maintained hand-synced copies of cross-app contracts across up to four
+locations (`apps/server`, `apps/web`, `apps/coder`, `apps/coder/web`), verified only
+by byte-parity tests — `provider-types-parity.test.ts` and the ws-frames byte-parity
+assertion in `ws-frames.test.ts`. A live drift had appeared: `AgentSessionConfig`
+existed in two incompatible shapes — `apps/coder` held an all-optional dead copy
+(zero live importers) while `apps/web` held the real required/nullable shape — making
+the parity-test regime insufficient for type-only contracts.
+
+`v2.5.12-provider-lifecycle-phase4` had explicitly deferred a shared types package as
+"not worth the Docker/build-order risk at solo scale"; the observed drift made the
+investment worth taking.
+
+## What shipped
+
+**Package:** `packages/contracts` (`@boocode/contracts`), `declaration:true`, zod
+pinned `^3.23.8`, per-subpath exports map with `types`-then-`default` conditions.
+`packages/*` added to `pnpm-workspace.yaml`; `pnpm-lock.yaml` regenerated.
+
+**Six contracts single-sourced:**
+
+- `./ws-frames` — `WsFrameSchema` (Zod runtime), `KNOWN_FRAME_TYPES`, `WsFrame`
+  (`z.infer` from the schema).
+- `./provider-snapshot` — `ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`,
+  `ThinkingOption`, `AgentCommand`, `ProviderSnapshotStatus` (plain TS; the coder's
+  `provider-types.ts` re-exports them so internal importers are unchanged).
+- `./provider-config` — `ProviderOverrideSchema`, `CoderProvidersFileSchema`,
+  `ProviderConfigPatchSchema` and their `z.infer` types.
+- `./message-metadata` — `MessageMetadata`, `ErrorReason`, `AgentSessionConfig`.
+- `./worktree-risk` — `WorktreeRiskReport` (unified from three copies that differed
+  only in name; the coder called it `RiskReport`).
+
+**Four consumers** import via `workspace:*` through the exports map: `apps/server`,
+`apps/web`, `apps/coder`, and the fallback SPA `apps/coder/web`. No tsconfig project
+references; built dist only.
+
+**Deleted:**
+- `apps/server/src/types/ws-frames.ts` (server `./ws-frames` exports subpath dropped)
+- `apps/web/src/api/ws-frames.ts`
+- Provider-snapshot mirror block in web (`apps/web/src/api/types.ts`)
+- Provider-config 17-line hand-mirror in web
+- `apps/coder/src/services/__tests__/provider-types-parity.test.ts` (6 parity tests)
+- ws-frames byte-parity assertion (server test suite)
+- All duplicate `MessageMetadata`, `ErrorReason`, `AgentSessionConfig` copies
+- `WorktreeRiskReport` / `RiskReport` duplicates (3 copies)
+- `apps/coder/web` dead `pending_change_added`/`pending_change_updated` reducer arms
+  and associated WS plumbing (`DiffPane` prop, `Session.tsx` listener)
+
+**Preserved:**
+- KNOWN_FRAME_TYPES drift test — moved into the package (11/11)
+- Broker fail-closed tests — kept in `apps/server`, importing from the package (4/4)
+- Web strict `SessionFrame`/`UserFrame` discriminated union — web-local, untouched
+
+## Key decisions
+
+**F1 (ws-frames):** repoint all 8 server/coder importers + 2 web validators to the
+package; drop the server `./ws-frames` re-export subpath — one path per contract, no
+shim.
+
+**F2 (web strict union):** the package exports the runtime schema only (`WsFrameSchema`,
+`KNOWN_FRAME_TYPES`, loose `WsFrame`). The web's rich discriminated union stays
+web-local — it references entity types (`Message`, `ToolCall`, etc.) that are
+intentionally web-local and not cross-app duplicated. Zero entity-type scope expansion.
+
+**AgentSessionConfig drift:** unified to the web required/nullable shape; the coder's
+all-optional copy confirmed dead (zero live importers) and deleted.
+
+**`apps/coder/web` (fallback SPA):** its hand-copied 9-arm `WsFrame` union replaced by
+the canonical import; dead `pending_change_*` arms removed (no publisher exists for
+these frames anywhere in the codebase — they were HTTP-delivered, not WS); field
+conflicts reconciled per-field (`tool_result.error` boolean→string, `tokens_used`
+number→number|null, `snapshot.messages` cast).
+
+**Two `ErrorReason` concepts (intentional, not duplication):**
+`message-metadata`'s `ErrorReason` is the DB-persisted 3-value set; the ws-frames
+frame-level `reason` is the wire 5-value set. Different value sets, different
+semantics, confirmed during audit.
+
+## Build-order inversion
+
+Contracts builds before all consumers in:
+- Root `package.json` build script
+- `Dockerfile` — new `COPY packages/*` block + contracts build step before web/server
+- Coder deploy command — updated in all 7 doc sites (CLAUDE.md, apps/coder/CLAUDE.md,
+  BOOCODER.md, docs/ARCHITECTURE.md, docs/project-discovery.md, README.md,
+  docs/coder-backends.md)
+
+## Test counts at ship
+
+Server 543 / coder 293 / contracts 11. Clean `docker compose build --no-cache boocode`
+green. Human smoke verified 2026-06-02.
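
The single-definition idea in the proposal above can be illustrated with a small, dependency-free sketch. The package itself is described as using Zod's `z.infer`; this stand-in swaps in a plain `as const` array so it runs without zod, and it only shows the same principle: the type and the runtime check derive from one definition and so cannot drift apart. The names `SKETCH_FRAME_TYPES`, `SketchFrameType`, `isSketchFrameType`, and `parseFrameType` are hypothetical, and the four frame names are an illustrative subset, not the real schema.

```typescript
// Hypothetical stand-in for a single-sourced contract: one const array is the
// only definition; the type and the runtime guard are both derived from it.
const SKETCH_FRAME_TYPES = ['snapshot', 'delta', 'tool_call', 'error'] as const;

// Derived type: adding a name to the array widens this union automatically.
type SketchFrameType = (typeof SKETCH_FRAME_TYPES)[number];

// Derived runtime check: it reads the same array, so it cannot drift from the type.
function isSketchFrameType(value: unknown): value is SketchFrameType {
  return (
    typeof value === 'string' &&
    (SKETCH_FRAME_TYPES as readonly string[]).includes(value)
  );
}

// Fail-closed parse: malformed JSON and unknown frame types both yield null.
function parseFrameType(raw: string): SketchFrameType | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const type = (parsed as { type?: unknown }).type;
  return isSketchFrameType(type) ? type : null;
}
```

With hand-synced copies, the union type and the validator list are two edits that can be made independently; here a new frame name is one edit to the array, which is the property the parity tests were approximating.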
--- a/openspec/changes/archived/contracts-ssot/tasks.md
+++ b/openspec/changes/archived/contracts-ssot/tasks.md
@@ -0,0 +1,109 @@
+# Tasks — @boocode/contracts SSOT
+
+Nine phases, each independently verifiable. Phases 1–7 are the migration units (one
+contract group each after a proven tracer); Phase 8 is the audit gate; Phase 9 is the
+human smoke test. All shipped 2026-06-02.
+
+## Phase 1 — Tracer: scaffold package + build-order inversion + web proof ✅ SHIPPED
+
+- [x] Create `packages/contracts` (`@boocode/contracts`): `declaration:true`, per-subpath
+      exports map, zod `^3.23.8`. Placeholder `src/index.ts`.
+- [x] Add `packages/*` to `pnpm-workspace.yaml`.
+- [x] Invert build order everywhere: root `package.json` build script, `Dockerfile`
+      (`COPY packages/*` + contracts build before web/server), coder deploy command.
+- [x] Regenerate `pnpm-lock.yaml` (Dockerfile uses `--frozen-lockfile`).
+- [x] Prove the web consumption path end-to-end: `tsc -b` (composite+Bundler), `vite dev`
+      HMR, and `vite build` all resolve the built `.d.ts`+`.js` via the exports map.
+- [x] Verify all consumer builds + `docker compose build boocode` green.
+- [x] Remove Phase 1 probe artifacts before Phase 2.
+
+## Phase 2a — Single-source the ws-frames runtime schema ✅ SHIPPED
+
+- [x] Move ws-frames schema to `packages/contracts/src/ws-frames.ts` (`./ws-frames` subpath).
+- [x] Repoint 8 server/coder importers + 2 web validators to `@boocode/contracts/ws-frames`.
+- [x] Delete `apps/server/src/types/ws-frames.ts`; drop server `./ws-frames` exports subpath.
+- [x] Delete `apps/web/src/api/ws-frames.ts`.
+- [x] Move KNOWN_FRAME_TYPES drift + accept/reject tests into the package (11/11); delete
+      byte-parity test; keep broker fail-closed tests in server importing from the package.
+- [x] Container smoke: `docker compose up`, `/api/health` 200, broker imports from package.
+
+## Phase 3 — Single-source provider snapshot types ✅ SHIPPED
+
+- [x] Move `ProviderSnapshotEntry`, `ProviderModel`, `ProviderMode`, `ThinkingOption`,
+      `AgentCommand`, `ProviderSnapshotStatus` to `packages/contracts/src/provider-snapshot.ts`
+      (`./provider-snapshot` subpath).
+- [x] `apps/coder/src/services/provider-types.ts` re-exports them (importers unchanged).
+- [x] Delete web mirror block; delete `provider-types-parity.test.ts` (coder −6 tests).
+- [x] All builds and typechecks green; server 543 / coder 293 unchanged.
+
+## Phase 4 — Single-source the Zod provider-config schemas ✅ SHIPPED
+
+- [x] Move `ProviderOverrideSchema`, `CoderProvidersFileSchema`, `ProviderConfigPatchSchema`
+      + `z.infer` types to `packages/contracts/src/provider-config.ts` (`./provider-config`
+      subpath).
+- [x] `apps/coder/src/services/provider-config.ts` imports + re-exports (importers unchanged).
+- [x] Delete 17-line web hand-mirror.
+- [x] Coder provider-config tests 13/13; all builds green.
+
+## Phase 5 — Single-source type-only contracts (MessageMetadata + AgentSessionConfig) ✅ SHIPPED
+
+- [x] Move `MessageMetadata`, `ErrorReason`, `AgentSessionConfig` to
+      `packages/contracts/src/message-metadata.ts` (`./message-metadata` subpath).
+- [x] Unify `AgentSessionConfig` to the web required/nullable shape; delete the coder's
+      dead all-optional copy (zero live importers confirmed).
+- [x] Delete all duplicate `MessageMetadata` + `ErrorReason` copies (confirmed byte-identical).
+- [x] Repoint server `api.ts`, web types, and `MessageBubble.tsx` to the package.
+- [x] Server 543 / coder 293 unchanged; all builds green.
+
+## Phase 6 — Single-source WorktreeRiskReport ✅ SHIPPED
+
+- [x] Move `WorktreeRiskReport` to `packages/contracts/src/worktree-risk.ts`
+      (`./worktree-risk` subpath); unify name (coder `RiskReport` → `WorktreeRiskReport`).
+- [x] Delete all three copies (shapes identical, names differed).
+- [x] Repoint `sessions.ts`, `ProjectSidebar.tsx`, `orphan-worktree-reaper.ts`,
+      `worktree-safety.ts` via re-exports; `checkWorktreeWorkAtRisk` returns shared type.
+- [x] Server 543 / coder 293 unchanged; all builds green.
+
+## Phase 7 — Migrate apps/coder/web (fallback SPA) ✅ SHIPPED
+
+- [x] Add `@boocode/contracts` `workspace:*` dep to `apps/coder/web/package.json`.
+- [x] Delete hand-copied 9-arm `WsFrame` union; import canonical `WsFrame` from the package.
+- [x] Reconcile field conflicts: `tool_result.error` boolean→string; `tokens_used`
+      number→number|null; `snapshot.messages` cast `as Message[]`; `chat_id ?? ''`.
+- [x] Delete dead `pending_change_added`/`pending_change_updated` reducer arms + the entire
+      dead `onPendingChange` WS plumbing (`DiffPane` prop, `Session.tsx` listener).
+- [x] Confirm HTTP pending-change apply/reject path untouched.
+- [x] `apps/coder/web` Vite build green; root build + server 543 / coder 293 green.
+
+## Phase 8 — Audit every requirement bullet ✅ PASSED (12/12)
+
+- [x] Verify one definition per contract (file + line evidence for each).
+- [x] Verify `z.infer` for both Zod contracts (ws-frames, provider-config).
+- [x] Verify all four consumers wired via exports map (no project refs, no src imports).
+- [x] Verify all hand-copies + parity tests deleted; drift/broker tests preserved.
+- [x] Verify single zod version; build-order inverted in root + Dockerfile + deploy docs.
+- [x] Verify `packages/*` in workspace; dead `pending_change_*` arms gone.
+- [x] Verify web strict union preserved + coherent post-migration.
+- [x] Close gap G: update coder deploy command in all 7 doc sites (CLAUDE.md,
+      apps/coder/CLAUDE.md, BOOCODER.md, docs/ARCHITECTURE.md, docs/project-discovery.md,
+      README.md, docs/coder-backends.md).
+- [x] Correct now-false byte-parity/duplication claims in CLAUDE.md conventions,
+      apps/server/CLAUDE.md broker note, docs/coder-backends.md, and
+      docs/coding-standards/cross-app-contract-parity.md (rewritten to describe the SSOT).
+- [x] Mark DEFERRED-WORK §3 + STALE-DEPRECATED item as shipped.
+- [x] Clean `docker compose build --no-cache boocode` green; server 543 / coder 293 /
+      contracts 11 at exact baselines; nothing staged (HEAD e5ce01a).
+
+## Phase 9 — Human smoke test ✅ PASSED (Sam, 2026-06-02)
+
+- [x] Web dev HMR, web prod build at :9500, live WS stream rendering.
+- [x] Coder restart + a turn; fallback SPA; pending-change apply.
+
+## Verify (all runs green)
+
+- `pnpm -C packages/contracts build`
+- `pnpm -C apps/server test` (543)
+- `pnpm -C apps/coder test` (293)
+- `pnpm -C apps/server build && pnpm -C apps/coder build`
+- `npx tsc -p apps/web/tsconfig.app.json --noEmit`
+- `docker compose build --no-cache boocode`
--- a/openspec/changes/archived/orchestrator/artifacts/.discovery-notes.md
+++ b/openspec/changes/archived/orchestrator/artifacts/.discovery-notes.md
@@ -0,0 +1,116 @@
+# Discovery notes — Orchestrator (Phase 2)
+
+Single source of truth for project context. Specialists: read this first; do not
+re-grep for what's here. Search further only for what your domain needs that's
+missing.
+
+## Tech stack
+
+- pnpm monorepo. `apps/server` (BooChat: Fastify + Postgres), `apps/coder`
+  (BooCoder: host systemd service, agent dispatch), `apps/web` (React + Vite),
+  `apps/booterm`, `packages/contracts` (`@boocode/contracts`, cross-app wire SSOT).
+- DB `boochat` (Postgres 16). TypeScript strict, NodeNext (server/coder), `.js`
+  import extensions. Tests: vitest (server + coder); **no web test harness**.
+- Deploy: `apps/coder` change → `sudo systemctl restart boocoder`; `apps/web`/
+  `apps/server` → `docker compose up --build -d boocode`.
+
+## Phase 1 assets to reuse (`/opt/boocode/conductor/`)
+
+- `src/spine.ts` — the Spine→Flow factory (band gating, fold, synthesizer,
+  validator, render), + contracts injection.
+- `src/flows/*.ts` + `flows/index.ts` — 22 flows (21 Spine configs + bespoke
+  `code-review.ts`), registry (`getFlow`, `FLOW_NAMES`, `describeFlows`).
+- `src/contracts.ts` — Han evidence/yagni contracts (produce/review).
+- `src/types.ts` — `Flow`, `Step`, `Spine`, `Angle`, `Band`, `Contract`.
+- `src/flow.ts` — the wave scheduler (dep-aware parallel). **This is what Phase 2
+  must re-home/replace** so steps dispatch through BooCoder backends + persist.
+- `src/dispatch.ts` — current `opencode run` subprocess dispatch. **Replaced in
+  Phase 2** by BooCoder backend dispatch.
+- `agents/*.md` — 23 Han personas (also live in `~/.config/opencode/agents/`).
+
+## apps/coder — execution surfaces
+
+- `src/services/dispatcher.ts:46` — `createDispatcher`. `LISTEN 'tasks_new'` fast
+  path (pg trigger `notify_tasks_new`, schema line 279) + 2s poll. `runTask`
+  routes a `state='pending'` task to a backend. `inflight` map keyed
+  `session_id ?? 'task:<id>'` serializes per session.
+- `src/services/agent-backend.ts:97` — `AgentBackend` (ensureSession / prompt /
+  closeSession / dispose / health). Backends: `backends/opencode-server.ts`,
+  `warm-acp.ts`, `claude-sdk.ts`; one-shot `acp-dispatch.ts` / `pty-dispatch.ts`.
+- `AgentEvent` (agent-backend.ts:28, union text|reasoning|tool_call|tool_update|
+  commands) → mapped to WS frames by the dispatcher → `broker.publishUserFrame`.
+- **Tasks are how work is dispatched.** `INSERT INTO tasks (project_id, input,
+  agent, model, mode_id, thinking_option_id, session_id, chat_id)` then the
+  LISTEN/NOTIFY trigger picks it up. Precedents: `routes/messages.ts:233`,
+  **`routes/skills.ts:94` (a skill IS already dispatched as a task)**,
+  `routes/arena.ts:49`, `tools/new_task.ts:54` (writes `parent_task_id`).
+
+## apps/coder — schema (`src/schema.sql`, coder-owned)
+
+- `tasks` (line 18): `id, project_id, parent_task_id (FK self, written by new_task,
+  NOT read by dispatcher), state CHECK(pending|running|completed|failed|blocked|
+  cancelled), input, output_summary (≤500 char), agent, model, execution_path,
+  cost_tokens, started_at, ended_at, session_id, arena_id, mode_id,
+  thinking_option_id, chat_id`.
+- `agent_sessions` (line 88): PK `(chat_id, agent)`; `backend, agent_session_id,
+  server_port, status(idle|active|crashed|closed), config_hash, token/cost cols`.
+- `worktrees` (line 142), `available_agents` (line 36), `checkpoints` (233),
+  `claude_session_entries` (252). `notify_tasks_new` trigger (279).
+- **Schema discipline (root CLAUDE.md):** two schema files one DB; coder schema is
+  applied by the host boocoder service. CHECK migrations: DROP IF EXISTS the
+  system-named constraint → UPDATE → guarded ADD. `CREATE OR REPLACE VIEW` can't
+  reorder cols. JSONB via `sql.json(value as never)`. `clock_timestamp()` in txns.
+
+## packages/contracts — WS frames
+
+- `src/ws-frames.ts` — Zod frames in `WsFrameSchema` (SSOT). Existing: snapshot,
+  message_started, delta, reasoning_delta, tool_call, tool_result,
+  message_complete, usage, messages_deleted, chat_renamed, compacted, error.
+- **Adding a frame (cross-app, root CLAUDE.md):** add to `WsFrameSchema` here
+  (rebuild `pnpm -C packages/contracts build`), AND the server's loose
+  `InferenceFrame` union (`services/inference/turn.ts`), AND the web's strict
+  `WsFrame` union (`apps/web/src/api/types.ts`) — the web type is the wire gate;
+  missing it silently drops the frame at JSON-parse.
+
+## apps/web — panes + composer
+
+- Pane kinds (`api/types.ts:386` `WorkspacePaneKind`): `empty | chat | coder |
+  terminal | settings | markdown_artifact | html_artifact`. **Extra non-chat pane
+  kinds are already precedented** — adding `orchestrator` follows
+  `markdown_artifact`/`html_artifact`.
+- `hooks/useWorkspacePanes.ts` — pane state, `addSplitPane(kind)`, server-persisted
+  (+ legacy localStorage seed). `Workspace.tsx`, `NewPaneMenu.tsx`,
+  `ChatTabBar.tsx`, `PaneHeaderActions.tsx` all take `kind: 'chat'|'terminal'|
+  'coder'` — adding a kind touches these.
+- **`ChatInput.tsx` is the shared composer** rendered by BOTH `ChatPane.tsx` and
+  `CoderPane.tsx` (CoderPane also stacks `AgentComposerBar` above it). Its toolbar
+  row (icons: Globe, ListPlus, Paperclip, Send/Stop, `SquareSlash` for slash)
+  is where the Orchestrator button goes → parity for free. It takes `slashGroups`
+  (ChatPane passes BooChat skills; CoderPane passes agent-commands+skills),
+  `onSlashCommand`. `SlashCommandPicker.tsx`, `hooks/useSkills.ts`.
+- Mobile: per prior preference, crowded toolbars must fit one line (no scroll/wrap)
+  and the new button shows icon-only on mobile.
+
+## Precedents / related
+
+- **Arena** (`routes/arena.ts`): same task → N contestants (tasks sharing
+  `arena_id`), parallel, `[SELECTED]` winner. Closest existing fan-out; stays
+  separate but is a structural precedent for "one launch → many tasks grouped".
+- **BooChat skills** (`apps/server` `routes/skills.ts` + `services/skills`,
+  `getSkillBody`): slash injects a skill body, the single chat model runs it
+  inline. The coder also has `routes/skills.ts` that dispatches a skill as a task.
+- Event-dedup discipline (root CLAUDE.md): a mutation published via
+  `broker.publishUser` must NOT also `sessionEvents.emit` locally; handlers
+  idempotent.
+
+## Enumerated gaps (searched, not found)
+
+- No `flow_runs` / `flow_steps` / `flows` tables, no `depends_on`/`step_index` on
+  `tasks`, no DAG/pipeline concept anywhere (confirmed Phase-1 research).
+- No `orchestrator` pane kind, component, or WS frame yet.
+- No coding-standards dir hits for orchestration; ADR dir not present under
+  `docs/adr/` (none found) — architectural decisions live in `openspec/changes/`.
+- No resume mechanism for a multi-step run after coder restart (single tasks
+  resume via `agent_sessions`; a *run* spanning tasks does not).
+- The conductor's scheduler (`conductor/src/flow.ts`) is in-process/in-memory;
+  it does not persist step state or survive restart.
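
The per-session serialization mentioned in the notes above (an `inflight` map keyed `session_id ?? 'task:<id>'`) can be sketched as follows. Only the keying rule comes from the notes; how the real dispatcher chains work is not shown there, so the promise-chain approach, the `SketchTask` shape, and the `PerKeySerializer` class are all assumptions made for illustration.

```typescript
// Hypothetical task shape; only the two fields used for keying are modeled.
interface SketchTask {
  id: string;
  session_id: string | null;
}

// Keying rule from the notes: the session id, else a per-task fallback key.
function inflightKey(task: SketchTask): string {
  return task.session_id ?? `task:${task.id}`;
}

// Assumed mechanism: chain each task after the previous one with the same key.
class PerKeySerializer {
  private readonly inflight = new Map<string, Promise<void>>();

  run(task: SketchTask, work: () => Promise<void>): Promise<void> {
    const key = inflightKey(task);
    const previous = this.inflight.get(key) ?? Promise.resolve();
    // A failed predecessor must not block the next task for the same key.
    const next = previous.catch(() => undefined).then(work);
    const tracked: Promise<void> = next.finally(() => {
      // Drop the entry only if no later task has replaced it in the meantime.
      if (this.inflight.get(key) === tracked) this.inflight.delete(key);
    });
    this.inflight.set(key, tracked);
    return tracked;
  }
}
```

Tasks without a session get a unique key and therefore never wait on each other, while tasks sharing a session run strictly in submission order.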
--- a/openspec/changes/archived/orchestrator/artifacts/design-context.md
+++ b/openspec/changes/archived/orchestrator/artifacts/design-context.md
@@ -0,0 +1,87 @@
+# Orchestrator (Phase 2) — settled design (source spec)
+
+This is the behavioral specification for the BooCode Orchestrator, settled via a
+`grill-me` interview. It is the ground truth for *what*; the implementation plan
+covers *how*. Phase 1 (the standalone code conductor at `/opt/boocode/conductor/`)
+is done; this is the in-app integration.
+
+## Outcome
+
+Bring the deterministic multi-agent conductor into the BooCode app: a user can
+launch any Han flow from BooChat or BooCoder, watch each agent's progress live
+(Paseo-style), and get an evidence-disciplined report — all on local Qwen, free.
+
+## Settled decisions (immutable for this plan)
+
+1. **One engine.** The conductor is the only engine. Every read-only Han skill IS
+   a conductor flow. No single-agent degraded path.
+2. **Two doors, full parity.** A slash command and an "Orchestrator" button, both
+   on the shared `ChatInput` composer — so both appear in BooChat (ChatPane) and
+   BooCoder (CoderPane), desktop and mobile (mobile = icon only). Launching from
+   either opens the same run view.
+3. **Run view = new pane kind.** A fourth pane kind, `orchestrator`, alongside
+   `chat | terminal | coder`, opened as a new pane in the current session. It
+   renders: flow + band at top, a list of agents with live status, each
+   expandable to watch its stream; the final report at the top when done. Shape
+   is parent-with-nested-children (Paseo's parent/subagents), not one-agent.
+4. **Execution through BooCoder backends.** Each flow agent is a real BooCoder
+   agent session dispatched through the existing `AgentBackend`s → live streaming
+   via the existing AgentEvent→WS-frame pipeline, persisted to Postgres,
+   resumable. New `flow_runs` + `flow_steps` tables in the coder schema; the
+   conductor's scheduler fires each step's task as its deps complete.
+5. **Read-only, no worktree.** Flows never write to the repo. Agents read the
+   project's working directory directly; no git worktree is created. Read-only is
+   enforced at dispatch (agents get no edit/write tools). The report is the only
+   output.
+6. **Qwen-only.** Default the loaded 35B (`llama-swap/qwen3.6-35b-a3b-mxfp4`),
+   held as a single config value so additional local models slot in later with
+   near-zero rework. No Claude path — Claude Code remains the Claude lane.
+7. **Naming.** "Orchestrator." The existing Arena (same-task-on-N-models,
+   `apps/coder/src/routes/arena.ts`) stays a separate feature.
+8. **Skill placement.** The slash menu surfaces the read-only analysis/review set
+   (research, investigate, code-review, architectural-analysis, security-review,
+   gap-analysis, data-review, devops-review, issue-triage, project-discovery,
+   test-planning). The Orchestrator button exposes the FULL catalog (22 flows),
+   including the planning/authoring draft flows.
+9. **Launch UX.** Slash launches instantly with defaults (band = small, target =
+   the current pane's project, text after the command = the question/focus),
+   opening an Orchestrator pane. The button opens a launcher first (pick flow,
+   size, target/focus) then launches. Same run view either way.
+10. **Report output.** Stored with the run in Postgres, shown at the top of the
+    Orchestrator pane. Runs persist and are reopenable from a runs history.
+    Export on demand: copy / save-to-file / send-to-chat. Nothing auto-written to
+    the repo.
+11. **Concurrency.** Multiple runs allowed; each its own pane. They share the one
+    local model, so workers queue at llama-swap — panes show `queued` honestly.
+12. **Sizing modes.** Han's bands (small/medium/large) select roster breadth per
+    flow; a fast mode caps each worker's depth. Both carry over from Phase 1.
+
+## Evidence & rule alignment (carried from Phase 1)
+
+Flows apply Han's `evidence-rule` (trust classes, web corroboration gate,
+no-evidence labeling) and `yagni-rule` (producing flows only), injected as
+contracts. The adversarial-validator gate runs the review checklists and emits a
+plain-language Summary + Confidence. This is already built in the conductor
+(`conductor/src/contracts.ts`); Phase 2 must preserve it when porting dispatch
+from `opencode run` subprocess to the BooCoder backends.
+
+## Out of scope (Phase 2)
+
+- The exact pixel-faithful per-skill Han report templates (spine-level only, by
+  prior decision).
+- A Claude execution path (Claude Code covers it).
+- Folding the existing Arena into the Orchestrator (stays separate).
+- Per-agent model tiering (single model per run for now; revisit when more local
+  models exist).
+
+## Open items the plan must resolve (the HOW)
+
+- How the conductor's flow/spine definitions (Phase 1, `conductor/src/`) are
+  reused vs. re-homed inside `apps/coder` when dispatch moves to the backends.
+- The `flow_runs`/`flow_steps` schema shape, status lifecycle, and how a step maps
+  to a `tasks`/`agent_sessions` row.
+- How the scheduler resumes a run after a coder-service restart (mid-flight steps).
+- The new `orchestrator` WS frame(s) and how the pane subscribes to per-agent
+  streams (reusing the existing broker/AgentEvent pipeline).
+- The Orchestrator pane component structure and how it nests N live agent streams
+  without the crowding the grill rejected.
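
Settled decision 4 above says the scheduler fires each step's task as its dependencies complete. A minimal sketch of that readiness rule is shown here. The `flow_steps` shape and status lifecycle are listed as open items, so the `SketchStep` type, its `dependsOn` field, and the four status values are placeholders, not the shipped schema.

```typescript
// Placeholder status set; the real lifecycle is an open item in the spec.
type SketchStatus = 'pending' | 'running' | 'completed' | 'failed';

// Placeholder step shape, not the flow_steps schema.
interface SketchStep {
  id: string;
  dependsOn: string[];
  status: SketchStatus;
}

// A step is ready when it is still pending and every dependency has completed.
function readyWave(steps: SketchStep[]): SketchStep[] {
  const completed = new Set(
    steps.filter((s) => s.status === 'completed').map((s) => s.id),
  );
  return steps.filter(
    (s) => s.status === 'pending' && s.dependsOn.every((d) => completed.has(d)),
  );
}

// Usage: "b" and "d" are ready; "c" still waits on "b".
const sketchSteps: SketchStep[] = [
  { id: 'a', dependsOn: [], status: 'completed' },
  { id: 'b', dependsOn: ['a'], status: 'pending' },
  { id: 'c', dependsOn: ['a', 'b'], status: 'pending' },
  { id: 'd', dependsOn: [], status: 'pending' },
];
const readyIds = readyWave(sketchSteps).map((s) => s.id);
```

Because the function depends only on persisted statuses, the same computation could be rerun after a restart, which is one possible answer to the resume open item rather than a settled design.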
--- a/openspec/changes/archived/orchestrator/artifacts/implementation-decision-log.md
+++ b/openspec/changes/archived/orchestrator/artifacts/implementation-decision-log.md
@@ -0,0 +1,478 @@
+# Implementation Decision Log — Orchestrator (Phase 2)
+
+Han synthesis output. Each decision is committed: it cites evidence, records
+rejected alternatives, names an owner, and sets a revisit criterion. Cross-reference
+invariant: every `D-N` here is referenced by [design.md](../design.md) and/or
+[tasks.md](../tasks.md), and produced by a round recorded in
+[implementation-iteration-history.md](implementation-iteration-history.md).
+
+Source: a conversational `grill-me` design session. The settled behavioral spec
+is captured in [design-context.md](design-context.md) (12 decisions; decision 5
+is REVISED by D-3 / D-4 below). Specialist findings are in the claim ledger
+C1–C16 of the iteration history.
+
+Trust class of evidence below: **codebase** (file:line in this repo) unless
+noted. No single-source web claims underpin any committed decision.
+
+---
+
+## D-1 — Re-home the pure conductor definitions into `apps/coder/src/conductor/`
+
+**Decision.** Copy the pure (dispatch-free) conductor definition files —
+`spine.ts`, `flows/*`, `contracts.ts`, `types.ts`, `render.ts` — into
+`apps/coder/src/conductor/`, plus the 23 Han personas (`conductor/agents/*.md`).
+The Phase-1 standalone CLI (`conductor/`) stays alive and unchanged. Sever the
+`flows/code-review.ts` → `dispatch.ts` coupling by adding a `DispatchFn` to
+`StepContext`, injected by the flow-runner. Switch `spine.ts`'s model source from
+`process.env.CONDUCTOR_MODEL` to the run's configured model.
+
+**Rationale.** The flow definitions are pure data + closures; only `dispatch.ts`
+(the `opencode run` subprocess path) and `flow.ts` (the in-memory scheduler) are
+Phase-1-specific. Copying the pure files avoids a workspace-package extraction
+(YAGNI — only two consumers) while keeping the Phase-1 CLI as a regression
+oracle. The evidence/yagni contracts are preserved because the flow-runner calls
+`step.run(ctx)` in-process to build each prompt BEFORE inserting the task — the
+closures execute in the coder process; prompts are never serialized to DB.
+
+**Evidence.** `code-review.ts:10` (`import { dispatchAgent } from '../dispatch.js'`)
+and `:62` (the per-dimension dispatch call) — the only flow→dispatch coupling
+(C1). `spine.ts:122` renders `process.env.CONDUCTOR_MODEL` into the report header
+(C14). `spine.ts:73` — contracts injected via the step closure, in-process (C11).
+23 personas confirmed at `conductor/agents/*.md`.
+
+**Rejected alternatives.**
+- A `@boocode/conductor` workspace package — rejected: only two consumers (Phase-1
+  CLI + coder); a shared package is premature abstraction (YAGNI). Deferred with a
+  reopen trigger (a 3rd app needing conductor types). See Deferred (YAGNI).
+- Importing `conductor/src/*` directly from `apps/coder` across the workspace
+  boundary — rejected: couples the coder build to the standalone CLI tree and its
+  `opencode`-flavored dispatch import graph.
+
+**Specialist owner.** software-architect.
+**Revisit criterion.** A third app needs the conductor types (then extract the
+workspace package).
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Re-home & DispatchFn seam; tasks.md group 1.
+
+---
+
+## D-2 — DB-driven flow-runner with an `onTaskTerminal` dispatcher hook
+
+**Decision.** Add `apps/coder/src/services/flow-runner.ts`: a DB-backed scheduler
+that owns `flow_runs`/`flow_steps`, computes the ready wave from the loaded flow
+def, INSERTs each ready `agent` step as a `tasks` row, runs `code` steps inline,
+and advances. Fan-out is driven by ONE new hook — an `onTaskTerminal(taskId,
+state)` callback on `createDispatcher` — invoked when any task reaches a terminal
+state. No third poll loop; no modification to the dispatcher's internal run
+functions.
+
+**Rationale.** The dispatcher already has the LISTEN/NOTIFY + poll machinery and
+the terminal-state transitions; a single callback at those transition points lets
+the flow-runner react without duplicating the dispatch loop. The flow-runner stays
+a pure scheduler; execution stays in the dispatcher.
+
+**Evidence.** `dispatcher.ts:46-179` (the loop + `runTask`), `schema.sql:279-286`
+(the `notify_tasks_new` trigger) (C2). Terminal transitions the hook attaches to:
+external completed `dispatcher.ts:642-646`, external failed `:659-661`. Full step
+output must persist in `flow_steps.output TEXT` because `tasks.output_summary` is
+≤500 char and cannot reconstruct `ctx.results` for render/resume (`schema.sql:26`,
+`flow.ts:49,59`) (C3).
+
+**Rejected alternatives.**
+- A standalone third poll loop in the flow-runner — rejected: duplicates the
+  dispatcher's LISTEN/poll, two writers racing on `tasks`.
+- Modifying the dispatcher's `runTask` internals to know about flows — rejected:
+  couples the generic dispatcher to the orchestrator; the callback seam keeps the
+  dispatcher flow-agnostic.
+
+**Specialist owner.** software-architect.
+**Revisit criterion.** Step throughput requires batching beyond what one callback
+per terminal task supports.
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Flow-runner & onTaskTerminal; tasks.md group 4.
+
+---
+
+## D-3 — Reuse the existing dispatcher (insert pending task), not a direct-PTY bypass
+
+**Decision.** The flow-runner INSERTs each ready step as a normal `state='pending'`
+`tasks` row; the existing dispatcher picks it up via `LISTEN 'tasks_new'`, runs it
+through the existing external-agent path (creating a git worktree as a stable HEAD
+read-checkout), and streams AgentEvents → WS frames unchanged. The new
+`onTaskTerminal` hook (D-2) notifies the flow-runner on terminal state. No
+direct-PTY bypass; the dispatcher is reused with exactly one new hook.
+
+This **REVISES design-context decision 5** ("no worktree") to: a worktree IS
+created, but it is a harmless read snapshot — read-only is enforced by plan mode
+(D-4), not by the absence of a worktree.
+
+**Rationale.** Reuse (architect's A2) gets streaming, persistence, resume,
+cancellation, and AgentEvent→WS mapping for free. The only objection to A2 was
+that it creates a worktree the "no worktree" decision-5 wanted to avoid; once
+read-only is enforced at the tool level by plan mode (D-4), the worktree is inert
+(a checkout the agent cannot write to), so the objection dissolves. This was the
+user's explicit choice over the architect's leaning-toward-bypass (A4).
+
+**Evidence.** The external-agent path with worktree creation and AgentEvent→WS
+streaming: `dispatcher.ts` external branch (worktree create → run → terminal at
+`:642-646`/`:659-661`). Task-as-dispatch precedents the flow-runner copies:
+`routes/skills.ts:94` (a skill is already dispatched as a task),
+`routes/arena.ts:49`, `tools/new_task.ts:54`. Dispatch tension recorded as C12
+(A2 vs A4, architect self-flagged Disputed); resolved here by user choice.
+
+**Rejected alternatives.**
+- A4 direct-`dispatchViaPty` bypass (insert `running` task + call PTY directly to
+  skip worktree creation) — rejected: duplicates streaming/persistence/resume
+  wiring, and a restart kills the PTY child outside the dispatcher's lifecycle
+  (worsening resume, C15). The worktree it was avoiding is harmless under D-4.
+- design-context decision 5's "no worktree, read project dir directly" — rejected:
+  reusing the dispatcher means reusing its worktree creation; under D-4 the
+  worktree is a read snapshot, so avoiding it bought nothing and cost the reuse.
+
+**Specialist owner.** software-architect (execution path); devops-engineer
+(operational behavior of the reused dispatcher under flow load).
+**Revisit criterion.** Worktree creation per step becomes a measured throughput or
+disk-cost problem under real flow concurrency.
+**Driven by rounds:** R1 (C12), R2 (read-only finding that made the worktree inert).
+**Referenced in plan:** design.md §Execution via dispatcher reuse; tasks.md group 4.
+
+---
+
+## D-4 — Read-only enforced HARD by `mode_id='plan'` (qwen `--approval-mode plan`)
+
+**Decision.** Every orchestrator step task is dispatched with `mode_id = 'plan'`,
+which the PTY dispatcher passes to qwen as `--approval-mode plan` — a built-in
+tool-level gate: reads allowed, writes blocked. The flow-runner hardcodes
+`mode_id='plan'` for every step task; it is never user-overridable. This is the
+sole read-only enforcement mechanism. `BOOCODE_TOOLS` and persona prompts are NOT
+relied upon (they do not govern external CLI agents).
+
+**Rationale.** Read-only is a safety-critical invariant of the whole feature
+(flows never write the repo). Prompt-level intent and `BOOCODE_TOOLS` ceilings
+govern BooChat's in-process tools, not an external `qwen` CLI child — so they are
+not watertight. qwen's `--approval-mode plan` is a tool-level gate inside the
+agent binary itself, which the adversarial-security-analyst (R2) identified as the
+only enforcement that actually binds the external agent. Qwen-only (decision 6)
+makes a single hardcoded flag sufficient.
+
+**Evidence.** The wiring already exists: `pty-dispatch.ts:75` —
+`if (modeId) args.push('--approval-mode', modeId)` in the `qwen` spawn spec. R2
+security finding recorded as C13 (the R1 claim that prompt-level + `BOOCODE_TOOLS`
+enforcement was sufficient was Anecdotal/unproven; R2 refuted it and named plan
+mode as the binding control).
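+
+A sketch of how the flag reaches the binary, under stated assumptions:
+`buildQwenArgs` and `stepTaskModeId` are hypothetical helpers, and only the
+`if (modeId)` line mirrors the snippet cited from `pty-dispatch.ts:75`.
+
+```typescript
+// Hypothetical helper around the cited line; the real spawn spec may differ.
+function buildQwenArgs(baseArgs: string[], modeId: string | null): string[] {
+  const args = [...baseArgs];
+  if (modeId) args.push('--approval-mode', modeId);
+  return args;
+}
+
+// The flow-runner ignores any caller-supplied mode: every step task is 'plan'.
+function stepTaskModeId(_requested?: string): 'plan' {
+  return 'plan';
+}
+
+// Even if a caller asks for something else, the step is dispatched in plan mode.
+const stepArgs = buildQwenArgs([], stepTaskModeId('some-other-mode'));
+const noModeArgs = buildQwenArgs(['placeholder-arg'], null);
+```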
+
+**Rejected alternatives.**
+- Prompt-level read-only intent (personas tell the agent not to write) — rejected
+  (C13, R2): an instruction, not a gate; a model can ignore or be steered past it.
+- `BOOCODE_TOOLS=core` as the gate — rejected (C13, R2): governs BooChat's
+  in-process tool registry, does not constrain the external `qwen` CLI's own tools.
+- A `read_only` boolean flag on `tasks` — rejected: superseded by `mode_id='plan'`,
+  which is an existing column already plumbed to the binary. See Deferred (YAGNI).
+
+**Specialist owner.** adversarial-security-analyst.
+**Revisit criterion.** A non-qwen agent is added to flows (re-verify that agent's
+equivalent of `--approval-mode plan` before allowing it), or qwen changes
+`--approval-mode plan` semantics.
+**Driven by rounds:** R1 (C13 flagged), R2 (resolved).
+**Referenced in plan:** design.md §Read-only via plan mode; tasks.md group 4.
+
+---
+
+## D-5 — `flow_runs` + `flow_steps` schema in the coder schema
+
+**Decision.** Add two tables to `apps/coder/src/schema.sql`:
+
+- `flow_runs(id, project_id [no FK, matches tasks.project_id], flow_name, band
+  [CHECK small|medium|large], model, status [CHECK-named], input JSONB
+  [CHECK (input ? 'question')], report TEXT [nullable], error, timestamps)`.
+- `flow_steps(id, run_id [FK → flow_runs ON DELETE CASCADE], step_id, kind
+  [CHECK agent|code], agent, status [CHECK-named], task_id [UUID → tasks(id) ON
+  DELETE SET NULL; nullable, code steps NULL], chat_id [UUID → chats(id) ON DELETE
+  SET NULL], input TEXT, output TEXT [FULL output], error, timestamps, UNIQUE(run_id,
+  step_id))`.
+
+No `depends_on` column (derive from the loaded flow def). Do NOT insert
+skipped-step rows (`when()` is pure on stored input). Indexes:
+`flow_steps(run_id, status)`, `flow_runs(project_id, created_at DESC)`. Explicit
+CHECK constraint names + the repo's DROP-IF-EXISTS → guarded-ADD migration
+discipline.
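+
+To illustrate why neither a `depends_on` column nor skipped-step rows are needed,
+here is a sketch that derives the next ready wave from the loaded flow def plus
+the stored rows. The `FlowStepDef` shape (`dependsOn`, `when`) and `readySteps`
+are assumptions made for this example, not the real `types.ts` definitions, and
+treating a skipped dependency as satisfied is likewise an assumption.
+
+```typescript
+type StepStatus = 'pending' | 'running' | 'completed' | 'failed';
+
+interface FlowInput { question: string }
+
+// Hypothetical flow-def shape: deps and when() live here, not in the DB.
+interface FlowStepDef {
+  id: string;
+  dependsOn: string[];
+  when?: (input: FlowInput) => boolean;
+}
+
+// The only per-step facts read back from flow_steps in this sketch.
+interface FlowStepRow { step_id: string; status: StepStatus }
+
+// when() is pure on the stored input, so a skip can be recomputed at any time.
+function isSkipped(step: FlowStepDef, input: FlowInput): boolean {
+  return step.when !== undefined && !step.when(input);
+}
+
+function readySteps(def: FlowStepDef[], rows: FlowStepRow[], input: FlowInput): string[] {
+  const status = new Map<string, StepStatus>();
+  for (const r of rows) status.set(r.step_id, r.status);
+  const byId = new Map<string, FlowStepDef>();
+  for (const s of def) byId.set(s.id, s);
+
+  // A dependency is settled when it completed or when it is skipped.
+  const settled = (id: string): boolean => {
+    const dep = byId.get(id);
+    if (dep !== undefined && isSkipped(dep, input)) return true;
+    return status.get(id) === 'completed';
+  };
+
+  return def
+    .filter((s) => !status.has(s.id) && !isSkipped(s, input) && s.dependsOn.every(settled))
+    .map((s) => s.id);
+}
+
+const flowDef: FlowStepDef[] = [
+  { id: 'scan', dependsOn: [] },
+  { id: 'deep', dependsOn: ['scan'], when: (i) => i.question.includes('deep') },
+  { id: 'report', dependsOn: ['scan', 'deep'] },
+];
+const runInput: FlowInput = { question: 'summarize the repo' };
+
+const wave1 = readySteps(flowDef, [], runInput);
+const wave2 = readySteps(flowDef, [{ step_id: 'scan', status: 'completed' }], runInput);
+```
+
+Here `deep` never gets a row because its `when()` is false for the stored input,
+yet `report` still becomes ready once `scan` completes.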
+
+**Rationale.** A run spans multiple tasks; existing tables (`tasks`,
+`agent_sessions`) model single dispatches, not a DAG. `flow_steps.task_id →
+tasks(id)` (not a column on `tasks`) keeps `tasks` generic. `output TEXT` is FULL
+because `tasks.output_summary` is ≤500 char and cannot reconstruct `ctx.results`.
+`project_id` has no FK to match `tasks.project_id`'s existing convention.
+
+**Evidence.** `tasks` shape and `output_summary` ≤500 char: `schema.sql:18-34`,
+`:26` (C3, C4). `flow.ts:49,59` (results reconstruction needs full output, C3).
+`flow.ts:28-41`, `types.ts:27` (deps + `when()` derivable from flow def — omit
+`depends_on` and skipped rows, C6). `schema.sql:19,32` (project_id no-FK pattern;
+CHECK-named discipline, C5). Migration discipline: root CLAUDE.md schema section.
+
+**Rejected alternatives.**
+- A `depends_on` column on `flow_steps` — rejected (C6, YAGNI): deps are in the
+  loaded flow def; storing them duplicates the source of truth. Deferred.
+- Persisting skipped-step rows — rejected (C6, YAGNI): `when()` is pure on stored
+  `input`, so a skip is reconstructable. Deferred.
+- A column on `tasks` (e.g. `flow_step_id`) — rejected (C4): pollutes the generic
+  tasks table; the FK belongs on `flow_steps`.
+
+**Specialist owner.** data-engineer.
+**Revisit criterion.** A stored-run DAG visualization needs deps without loading
+the flow def (then add `depends_on`); the UI must explain a skip without the flow
+def (then persist skipped rows).
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Schema; tasks.md group 2.
+
+---
+
+## D-6 — Two new WS frames; per-agent stream reuses existing frames by `chat_id`
+
+**Decision.** Add two frames to `packages/contracts/src/ws-frames.ts`:
+
+- `flow_run_started`: `run_id, flow_name, band, steps[]` (each `step_id, agent,
+  kind, chat_id, label`).
+- `flow_run_step_updated`: `run_id, step_id, status, run_status?, report?`.
+
+The per-agent content stream REUSES the existing `delta` / `tool_call` /
+`message_complete` frames keyed by the step's `chat_id`. Each agent step gets a
+synthetic `chats` row for stream attribution. Register in all THREE frame
+registries: contracts `WsFrameSchema`, the server `InferenceFrame` union
+(`services/inference/turn.ts`), and the web strict `WsFrame` union
+(`apps/web/src/api/types.ts`) — the web type is the wire-format gate.
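+
+A sketch of the two frames as plain TypeScript types, plus a reducer a pane
+could use to fold them into view state. Field names follow the decision above;
+the `type` discriminator, the nullable `chat_id` for code steps, `RunView`, and
+`applyFrame` are assumptions for illustration. The real contracts are Zod-backed
+in `packages/contracts/src/ws-frames.ts`; Zod is left out so the example stays
+dependency-free.
+
+```typescript
+type Band = 'small' | 'medium' | 'large';
+type StepStatus = 'pending' | 'running' | 'completed' | 'failed' | 'skipped';
+type RunStatus = 'running' | 'completed' | 'failed';
+
+interface FlowRunStarted {
+  type: 'flow_run_started'; // discriminator name is an assumption
+  run_id: string;
+  flow_name: string;
+  band: Band;
+  steps: Array<{
+    step_id: string;
+    agent: string;
+    kind: 'agent' | 'code';
+    chat_id: string | null; // assumed null for code steps (no stream)
+    label: string;
+  }>;
+}
+
+interface FlowRunStepUpdated {
+  type: 'flow_run_step_updated';
+  run_id: string;
+  step_id: string;
+  status: StepStatus;
+  run_status?: RunStatus;
+  report?: string; // the final report rides on this frame
+}
+
+type FlowFrame = FlowRunStarted | FlowRunStepUpdated;
+
+// Hypothetical pane-side state folded from the frames.
+interface RunView {
+  run_status: RunStatus;
+  report: string | null;
+  steps: Record<string, StepStatus>;
+}
+
+function applyFrame(view: RunView | null, frame: FlowFrame): RunView | null {
+  if (frame.type === 'flow_run_started') {
+    const steps: Record<string, StepStatus> = {};
+    for (const s of frame.steps) steps[s.step_id] = 'pending';
+    return { run_status: 'running', report: null, steps };
+  }
+  if (view === null) return null; // update for a run whose start was never seen
+  return {
+    run_status: frame.run_status ?? view.run_status,
+    report: frame.report ?? view.report,
+    steps: { ...view.steps, [frame.step_id]: frame.status },
+  };
+}
+
+const frames: FlowFrame[] = [
+  {
+    type: 'flow_run_started', run_id: 'r1', flow_name: 'example-flow', band: 'small',
+    steps: [{ step_id: 's1', agent: 'example-agent', kind: 'agent', chat_id: 'c1', label: 'Scan' }],
+  },
+  { type: 'flow_run_step_updated', run_id: 'r1', step_id: 's1', status: 'running' },
+  {
+    type: 'flow_run_step_updated', run_id: 'r1', step_id: 's1', status: 'completed',
+    run_status: 'completed', report: '# Report',
+  },
+];
+
+let runView: RunView | null = null;
+for (const f of frames) runView = applyFrame(runView, f);
+```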
+
+**Rationale.** The run-level lifecycle (which agents exist, their status, the final
+report) needs new frames; the per-agent token stream is exactly what the existing
+delta/tool_call/message_complete pipeline already carries, so keying it by a
+synthetic `chat_id` reuses the whole broker→WS path with no new streaming code.
+The report rides on `flow_run_step_updated` rather than its own frame (one fewer
+frame type; revisit only if reports exceed the frame size limit).
+
+**Evidence.** Existing broker→WS frame pipeline and frame list: `ws-frames.ts`
+(snapshot…error). Three-registry rule + web-type-is-wire-gate: root CLAUDE.md
+"Adding a new WS frame type" + discovery notes §packages/contracts. Stream-by-chat
+reuse precedent: the dispatcher publishes delta/tool_call/message_complete keyed
+by chat already (C7).
+
+**Rejected alternatives.**
+- New per-agent stream frames (`flow_agent_delta`, etc.) — rejected: the existing
+  delta/tool_call/message_complete already stream by chat; new frames duplicate
+  them.
+- A separate `flow_run_report` frame — rejected (YAGNI): the report fits on
+  `flow_run_step_updated`. Deferred with a reopen trigger (reports exceed ~50KB).
+
+**Specialist owner.** software-architect.
+**Revisit criterion.** Reports exceed the frame size limit (~50KB) → split the
+report onto its own frame.
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §WS frames; tasks.md group 3.
+
+---
+
+## D-7 — `orchestrator` pane kind + OrchestratorPane
+
+**Decision.** Add an `orchestrator` pane kind (following the
+`markdown_artifact`/`html_artifact` precedent) — touching `WorkspacePaneKind`,
+`useWorkspacePanes`, `Workspace`, `NewPaneMenu`, `ChatTabBar`,
+`PaneHeaderActions`. `OrchestratorPane.tsx`: run header; report-at-top on
+completion; collapsed agent roster reusing `AgentStatusDot`; expand-one-at-a-time
+detail well reusing CoderPane stream rendering; mobile single-column inline
+expand; auto-expand-follows-active. Runs history in `NewPaneMenu`. Export (copy /
+save-file / send-to-chat via the existing `sendToChat`) in the pane header `…`,
+conditional on a completed report.
+
+**Rationale.** A fourth pane kind is already a precedented extension point; the
+pane reuses `AgentStatusDot` and the CoderPane stream renderer, so the new surface
+is composition, not new streaming UI. Expand-one-at-a-time avoids the crowding the
+grill rejected.
+
+**Evidence.** Pane-kind precedent: `api/types.ts:386` `WorkspacePaneKind` (with
+`markdown_artifact`/`html_artifact`). Roster/status reuse: `AgentComposerBar.tsx:204`
+(`AgentStatusDot`), CoderPane stream rendering (C8). Launcher categories from the
+flow registry: `flows/index.ts`; runs history host `NewPaneMenu.tsx`; export via
+`lib/events.ts` `sendToChat` (C10).
+
+**Rejected alternatives.**
+- Rendering runs inside the existing `coder` pane — rejected: a run is a
+  parent-with-nested-children view, not a single agent session; conflating them
+  crowds both.
+- All-agents-expanded simultaneously — rejected (C8): the crowding the design
+  session explicitly rejected.
+
+**Specialist owner.** user-experience-designer.
+**Revisit criterion.** Users cannot follow multiple concurrent runs from the
+roster (then revisit the expand model).
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Orchestrator pane; tasks.md groups 7, 10.
+
+---
+
+## D-8 — Workflow toolbar button + slash launch, BooChat/BooCoder parity
+
+**Decision.** Add a `Workflow` (lucide) button on `ChatInput`'s controls row,
+between the `SquareSlash` chip and the `Globe` pill — yielding parity in BooChat
+(ChatPane) and BooCoder (CoderPane) for free. Label "Flows" on desktop, icon-only
+on mobile (toolbar confirmed to fit one line). Slash launches instantly with
+defaults (band small, current pane's project, text-after-command = focus),
+opening the pane. The button opens `FlowLauncherDialog.tsx` first: 5 category tabs
+(Analysis/Discovery/Planning/Authoring/Review) → filtered flow list + size +
+focus + fast toggle; defaults Analysis/Small/off.
+
+**Rationale.** `ChatInput` is the shared composer rendered by both panes, so a
+single button gives both doors with parity at no extra cost. The toolbar fits one
+line at ≤5 elements, so adding the button does not force scroll/wrap (a standing
+mobile constraint).
+
+**Evidence.** `ChatInput.tsx:648-732`, `:673` — the controls row is ≤5 elements;
+adding the `Workflow` icon between SquareSlash and Globe keeps it one line; refutes
+junior Q13's crowding worry (C9). Launcher categories from `flows/index.ts` (C10).
+Shared-composer fact: discovery notes §apps/web (ChatInput rendered by ChatPane +
+CoderPane).
+
+**Rejected alternatives.**
+- Separate buttons in ChatPane and CoderPane — rejected: duplicates wiring; the
+  shared composer already gives parity from one button.
+- A launcher search box instead of category tabs — rejected (YAGNI): 22 flows in 5
+  categories are browsable; a search box is unproven need. Deferred.
+
+**Specialist owner.** user-experience-designer.
+**Revisit criterion.** Category grouping fails users at the 22-flow catalog size
+(then add the search box).
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Toolbar button & launcher; tasks.md groups 8, 9.
+
+---
+
+## D-9 — Resumable runs via `initResume` on coder startup
+
+**Decision.** On coder startup, an `initResume` re-advances every `flow_runs WHERE
+status='running'`: a step whose task completed → mark the step done + advance the
+run; a step whose task is lost/failed (PTY died on restart) → re-dispatch;
+completed steps are kept. (design-context decision 4 commits to "resumable".)
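+
+The reconcile rules above can be sketched as a pure planning function. The
+`StepSnapshot` shape, the `TaskOutcome` labels, and `planResume` are hypothetical;
+how a task is classified as lost after a restart is left to the caller and is not
+specified here.
+
+```typescript
+type StepStatus = 'pending' | 'running' | 'completed' | 'failed' | 'skipped';
+
+// Caller-supplied classification of the step's task after restart (assumed).
+type TaskOutcome = 'completed' | 'failed' | 'lost' | 'in_flight';
+
+interface StepSnapshot {
+  step_id: string;
+  status: StepStatus;
+  task: TaskOutcome;
+}
+
+interface ResumeAction {
+  step_id: string;
+  action: 'keep' | 'mark_done' | 'redispatch';
+}
+
+function planResume(steps: StepSnapshot[]): ResumeAction[] {
+  return steps.map((s): ResumeAction => {
+    // Completed steps are kept as-is; their output is already persisted.
+    if (s.status === 'completed') return { step_id: s.step_id, action: 'keep' };
+    // The task finished while the runner was down: record it and advance.
+    if (s.task === 'completed') return { step_id: s.step_id, action: 'mark_done' };
+    // The PTY child died or the task failed: dispatch the step again.
+    if (s.task === 'failed' || s.task === 'lost') {
+      return { step_id: s.step_id, action: 'redispatch' };
+    }
+    // Still in flight: leave it for the dispatcher (an assumption of this sketch).
+    return { step_id: s.step_id, action: 'keep' };
+  });
+}
+
+const resumePlan = planResume([
+  { step_id: 'a', status: 'completed', task: 'completed' },
+  { step_id: 'b', status: 'running', task: 'completed' },
+  { step_id: 'c', status: 'running', task: 'lost' },
+]);
+```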
+
+**Rationale.** A restart can land mid-flight. Because execution goes through the
+dispatcher with persisted task state (D-3), a step's outcome is recoverable from
+the DB; the run-level scheduler just has to re-derive the wave and re-dispatch only
+the steps that did not finish. Reconcile-and-advance (architect A3) beats
+mark-run-failed (data's conservative option) because decision 4 already committed
+to resumable and the task state is durable.
+
+**Evidence.** No run-level resume exists today (single tasks resume via
+`agent_sessions`; a run spanning tasks does not) — discovery notes §Enumerated
+gaps. Resume tension recorded as C15 (architect reconcile-and-advance vs data
+mark-failed); resolved toward reconcile-and-advance by decision 4 + durable task
+state under D-3.
+
+**Rejected alternatives.**
+- Mark a running run failed on restart — rejected (C15): contradicts decision 4
+  (resumable) and discards recoverable completed-step work.
+- Re-running the whole flow from step 0 — rejected: re-does completed steps,
+  burning the local model on work already persisted.
+
+**Specialist owner.** software-architect (scheduler); data-engineer (recovery query).
+**Revisit criterion.** A step-level idempotency hazard surfaces where re-dispatch
+of a "lost" step double-counts side effects (none expected under read-only plan
+mode).
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Resume; tasks.md group 5.
+
+---
+
+## D-10 — Concurrency: multiple runs, no `queued` status, single model per run
+
+**Decision.** Multiple runs are allowed; each gets its own pane + `flow_runs` row,
+no shared state. Step statuses are pending / running / completed / failed / skipped
+— there is NO separate `queued` status (the dispatcher's `pending` covers a step
+waiting on the busy model or on deps). Model is a single config value per run,
+default `qwen3.6-35b-a3b-mxfp4`.
+
+**Rationale.** Each run is independent state, so concurrency needs no coordination
+beyond the dispatcher's existing per-session serialization. A `queued` status is
+not observable: with the model busy, a task is simply `pending`/`running` and
+llama-swap does not expose queue position, so a distinct `queued` state would be a
+label the system cannot honestly populate (revising decision-11's "panes show
+queued honestly").
+
+**Evidence.** `queued` unobservability recorded as C16 (junior Q11, data
+DATA-005): llama-swap does not report queue position; the status reduces to
+pending(dep/model-wait)/running. Single-model-per-run carried from decision 6/11.
+
+**Rejected alternatives.**
+- A distinct `queued` step status — rejected (C16): nothing can populate it
+  honestly; `pending` already means "waiting". Deferred (reopen if llama-swap
+  exposes queue position).
+- Serializing runs (one at a time) — rejected: runs are independent; serialization
+  adds coordination for no benefit and hurts the multi-pane UX (decision 11).
+
+**Specialist owner.** data-engineer (status set), devops-engineer (model-busy
+behavior under concurrent runs).
+**Revisit criterion.** llama-swap exposes queue position → add an observable
+`queued` status.
+**Driven by rounds:** R1.
+**Referenced in plan:** design.md §Schema (status sets) + §Concurrency; tasks.md
+group 2.
+
+---
+
+## Cross-reference index
+
+| Decision | Driven by | Design.md section | Tasks.md group |
+|---|---|---|---|
+| D-1 Re-home + DispatchFn | R1 (C1, C11, C14) | Re-home & DispatchFn seam | 1 |
+| D-2 Flow-runner + onTaskTerminal | R1 (C2, C3) | Flow-runner & onTaskTerminal | 4 |
+| D-3 Dispatcher reuse (not bypass) | R1 (C12), R2 | Execution via dispatcher reuse | 4 |
+| D-4 Read-only via plan mode | R1 (C13), R2 | Read-only via plan mode | 4 |
+| D-5 Schema flow_runs/flow_steps | R1 (C3–C6) | Schema | 2 |
+| D-6 WS frames | R1 (C7) | WS frames | 3 |
+| D-7 Orchestrator pane | R1 (C8) | Orchestrator pane | 7, 10 |
+| D-8 Toolbar button + slash | R1 (C9, C10) | Toolbar button & launcher | 8, 9 |
+| D-9 Resume | R1 (C15) | Resume | 5 |
+| D-10 Concurrency / no-queued | R1 (C16) | Schema + Concurrency | 2 |
+
+## Deferred (YAGNI)
+
+These were considered and deferred under the evidence rule. Each names the trigger
+that would justify reopening.
+
+### `@boocode/conductor` workspace package
+- **Why deferred:** only two consumers (Phase-1 CLI + coder); copy-in (D-1) avoids
+  premature shared-package abstraction.
+- **Reopen when:** a third app needs the conductor types.
+- **Source:** architect (D-1 rejected alternative).
+
+### `flow_steps.depends_on` column
+- **Why deferred:** deps are derivable from the loaded flow def (`flow.ts:28-41`,
+  `types.ts:27`); a column duplicates the source of truth.
+- **Reopen when:** a stored-run DAG visualization must show deps without loading
+  the flow def.
+- **Source:** data-engineer C6 (D-5 rejected alternative).
+
+### Persisted skipped-step rows
+- **Why deferred:** `when()` is pure on stored `input`, so a skip is
+  reconstructable from the flow def + run input.
+- **Reopen when:** the UI must explain a skip without the flow def.
+- **Source:** data-engineer C6 (D-5 rejected alternative).
+
+### `read_only` flag on `tasks`
+- **Why deferred:** superseded by `mode_id='plan'` (D-4), an existing column
+  already plumbed to qwen's `--approval-mode`.
+- **Reopen when:** a non-qwen agent without a `--approval-mode plan` equivalent is
+  added to flows.
+- **Source:** D-4 rejected alternative.
+
+### Explicit `queued` step status
+- **Why deferred:** llama-swap does not expose queue position; nothing can populate
+  the status honestly (C16). `pending` covers waiting.
+- **Reopen when:** llama-swap exposes queue position.
+- **Source:** junior Q11 / data DATA-005 (D-10 rejected alternative).
+
+### Launcher search box
+- **Why deferred:** 22 flows in 5 category tabs are browsable; a search box is
+  unproven need.
+- **Reopen when:** category grouping fails users at the catalog size.
+- **Source:** UX C10 (D-8 rejected alternative).
+
+### Separate `flow_run_report` WS frame
+- **Why deferred:** the report rides on `flow_run_step_updated` (D-6).
+- **Reopen when:** reports exceed the ~50KB frame size limit.
+- **Source:** architect C7 (D-6 rejected alternative).
--- a/openspec/changes/archived/orchestrator/artifacts/implementation-iteration-history.md
+++ b/openspec/changes/archived/orchestrator/artifacts/implementation-iteration-history.md
@@ -0,0 +1,89 @@
+# Implementation Iteration History — Orchestrator (Phase 2)
+
+Source spec: [../proposal.md](../proposal.md) · design context:
+[design-context.md](design-context.md) · discovery: [.discovery-notes.md](.discovery-notes.md)
+· decision log: [implementation-decision-log.md](implementation-decision-log.md)
+
+## R1 — Parallel specialist review
+
+**Specialists engaged:** software-architect, data-engineer, user-experience-designer, junior-developer (all sonnet).
+
+**New input:** the settled design (12 decisions) + discovery notes. Each produced concrete HOW recommendations.
+
+### Claim ledger
+
+| # | Claim | State | Spec-maturity | Supporting |
+|---|---|---|---|---|
+| C1 | Re-home pure flow defs (`spine/flows/contracts/types/render`) into `apps/coder/src/conductor/`; sever `code-review.ts`→`dispatch.ts` coupling by injecting a `DispatchFn` via `StepContext`. Keep Phase-1 CLI alive. | Evidenced (`code-review.ts:10,62`; `conductor/tsconfig.json:6`) | plan-level | architect |
+| C2 | DB-driven scheduler `apps/coder/src/services/flow-runner.ts` + `flow_runs`/`flow_steps`; fan-out via an `onTaskTerminal` callback on the existing dispatcher (no 3rd poll loop). | Evidenced (`dispatcher.ts:46-179,279-286`) | plan-level | architect |
+| C3 | Full step output must persist in `flow_steps.output TEXT` — `tasks.output_summary` is ≤500 char and can't reconstruct `ctx.results` for render/resume. | Evidenced (`schema.sql:26`, `flow.ts:49,59`) | plan-level | data-engineer, architect |
+| C4 | FK direction = `flow_steps.task_id → tasks(id) ON DELETE SET NULL` (nullable; code steps NULL). Do NOT add a column to `tasks`. | Evidenced (`schema.sql:18-34`) | plan-level | data-engineer, architect |
+| C5 | `flow_runs.project_id` no FK (matches `tasks.project_id`); CHECK-named status constraints; `CHECK (input ? 'question')`. | Evidenced (`schema.sql:19,32`) | plan-level | data-engineer |
+| C6 | Omit `depends_on` column (deps derivable from loaded flow def) and skipped-step rows (`when()` is pure on stored `input`). | Evidenced (`flow.ts:28-41`, `types.ts:27`) — **YAGNI** | plan-level | data-engineer |
+| C7 | Two new WS frames: `flow_run_started` (step manifest + per-step `chat_id`) + `flow_run_step_updated` (status + final report). Content stream REUSES existing `delta/tool_call/message_complete` by `chat_id`. Per-step synthetic chat row. | Evidenced (broker pipeline; `ws-frames.ts`) | plan-level | architect |
+| C8 | Orchestrator pane: collapsed roster, expand-one-at-a-time detail well, reuse `AgentStatusDot`; report at top on completion. Mobile single-column inline expand. | Evidenced (`AgentComposerBar.tsx:204`, `CoderPane`) | plan-level | UX |
+| C9 | Toolbar fits: actual `ChatInput` row ≤5 elements; add `Workflow` icon between SquareSlash and Globe; "Flows" label desktop, icon-only mobile. **Resolves junior Q13.** | Evidenced (`ChatInput.tsx:648-732,673`) | plan-level | UX (refutes junior Q13 worry) |
+| C10 | Launcher: 5 category tabs (Analysis/Discovery/Planning/Authoring/Review) + filtered flow list + size + focus + fast; defaults Analysis/Small/off. Runs history in NewPaneMenu; export in pane header `…`. | Evidenced (`flows/index.ts`, `NewPaneMenu.tsx`, `lib/events.ts`) | plan-level | UX |
+| C11 | Contracts (evidence/yagni) still injected by calling `step.run(ctx)` in-process in flow-runner before INSERT — closures execute in the coder process; prompts are NOT serialized to DB. **Resolves junior Q12.** | Evidenced (`spine.ts:73`) | plan-level | architect (confirms junior Q12) |
+| **C12** | **Dispatch-mechanism tension:** A2 says insert *pending* task → dispatcher picks it up via LISTEN (reuses streaming+worktree). A4 says insert *running* task + call `dispatchViaPty` DIRECTLY to avoid worktree creation (decision 5). The two contradict; architect resolved toward A4 (bypass). | **Disputed (internal to architect: A2 vs A4)** | plan-level | architect (self-flagged) |
+| **C13** | **Read-only enforcement is prompt-level + `BOOCODE_TOOLS=core` (if the binary honors it)** + project-dir-as-cwd (no worktree). Architect + junior both say adversarial-security-analyst must verify it's watertight before decision 5 is safe. | **Anecdotal (enforcement not proven)** | plan-level | architect (A4), junior (Q8/Q9) |
+| C14 | `spine.ts` renders `process.env.CONDUCTOR_MODEL` into the report header (`spine.ts:122`) — must be parameterized to the run's model on re-home. Personas (`conductor/agents/*.md`) copied into `apps/coder`. | Evidenced (`spine.ts:122`, `dispatch.ts:15`) | plan-level | junior Q5/Q6 |
+| C15 | Resume semantics underspecified: re-dispatch in-flight steps vs mark-run-failed. With A4 direct-PTY, a restart kills the PTY child → in-flight steps MUST re-dispatch. Decision 4 commits to "resumable." | Disputed (architect reconcile-and-advance vs data mark-failed) | plan-level | architect (A3), data (OQ1), junior (Q3) |
+| C16 | `queued` (decision 11) is hard to observe: with direct PTY the task is `running` and blocked on the busy model; llama-swap doesn't report queue position. May reduce to pending(dep-wait)/running. | Anecdotal | plan-level (spec-vs-reality) | junior Q11, data DATA-005 |
+
+### Spec-maturity gate
+
+No `T#` notes exist, so the gate reduces to the spec-level threshold. No spec-level finding was raised by ≥3 specialists: the junior-developer's Q15 nominated three "open items" as spec-level, but the architect and data-engineer **resolved** them in-plan (re-home → copy into `apps/coder/src/conductor`; step→task → 1:1 `flow_steps.task_id`; resume → decision 4 already commits to "resumable", direction settled below). **Gate does NOT trip.** Proceed in-plan.
+
+### Open Questions
+
+- **OQ1 (C12):** dispatch mechanism — reuse the dispatcher with a no-worktree branch, vs bypass via direct `dispatchViaPty`. → user escalation (recommendation below).
+- **OQ2 (C13):** is read-only watertight? → **R2: adversarial-security-analyst.**
+- **OQ3 (C15):** resume = re-dispatch in-flight steps on restart (recommended; decision 4 = resumable). → user confirm.
+- **OQ4 (C16):** keep an explicit `queued` status or reduce to pending/running. → user confirm (minor).
+
+### Next-step recommendation
+
+`continue iterating` → **R2**: one targeted specialist (adversarial-security-analyst) on read-only enforcement + the no-worktree safety question (OQ2), since it gates the safety of decision 5. Then a single batched user escalation (OQ1, OQ3, OQ4) and synthesis.
+
+_Decisions produced: D-1 (from C1, C11, C14), D-2 (C2, C3), D-5 (C3–C6), D-6 (C7), D-7 (C8), D-8 (C9, C10), D-9 (C15), D-10 (C16). Partially produced (resolved in R2 + user escalation): D-3 (C12), D-4 (C13)._
+_Changed in plan: C12's A4-leaning bypass was REVERSED — the user chose dispatcher reuse (D-3), and R2's read-only finding made the worktree A4 wanted to avoid harmless. C13's "prompt + BOOCODE_TOOLS" enforcement was REPLACED by `mode_id='plan'` (D-4). C16's `queued` status was DROPPED (D-10)._
+
+## R2 — Targeted security review (read-only enforcement)
+
+**Specialist engaged:** adversarial-security-analyst (opus). **Charter:** OQ2 only — is read-only watertight for flow steps, and is the no-worktree posture (decision 5) safe?
+
+**New input:** R1's C12/C13 (the dispatch-mechanism tension and the unproven prompt-level enforcement claim), plus the qwen PTY dispatch path.
+
+### Claim ledger
+
+| # | Claim | State | Supporting |
+|---|---|---|---|
+| C13-R | The R1 enforcement story (persona prompts + `BOOCODE_TOOLS=core` + project-dir-as-cwd) is NOT watertight for an external `qwen` CLI child: persona text is instruction not a gate; `BOOCODE_TOOLS` governs BooChat's in-process tool registry, not the external binary's own tools. Read-only must be enforced at the agent's own tool layer. | **Evidenced** (`pty-dispatch.ts:72-77` — the qwen spawn spec; `BOOCODE_TOOLS` scope is BooChat-only per CLAUDE.md env section) | adversarial-security-analyst |
+| C13-FIX | qwen's `--approval-mode plan` IS the binding control: a built-in tool-level gate (reads allowed, writes blocked) inside the agent binary. Already wired — `mode_id` → `--approval-mode` at `pty-dispatch.ts:75`. Dispatch every step with `mode_id='plan'`, never user-overridable. Qwen-only (decision 6) makes one hardcoded flag sufficient. | **Evidenced** (`pty-dispatch.ts:75`) | adversarial-security-analyst |
+| C12-R | Because plan mode is a tool-level write-block, the worktree the dispatcher creates is INERT — the agent cannot write to it. The "no worktree" motivation behind A4 (decision 5) dissolves: keep the worktree as a harmless read snapshot and REUSE the dispatcher (A2) rather than bypass it (A4). Resolves the C12 tension toward A2. | **Evidenced** (worktree = HEAD checkout; plan mode blocks writes to it) | adversarial-security-analyst (settles C12) |
+
+### Resolution
+
+- **OQ2 → resolved.** Read-only is watertight via `mode_id='plan'` (qwen
+  `--approval-mode plan`), NOT prompt/`BOOCODE_TOOLS`. C13 moves Anecdotal →
+  Evidenced (refuted-and-replaced).
+- **OQ1 (C12) → unblocked.** The security finding removes A4's only advantage; the
+  user chose A2 (dispatcher reuse). Design-context decision 5 ("no worktree") is
+  REVISED to "worktree as a harmless read snapshot."
+- A new non-qwen agent in flows would require re-verifying its plan-mode equivalent
+  before allowing it (recorded as the D-4 revisit criterion).
+
+### User escalation (batched, post-R2)
+
+- OQ1 → **reuse the dispatcher** (A2), one new `onTaskTerminal` hook, no PTY bypass.
+- OQ3 → **reconcile-and-advance** resume (re-dispatch lost/failed steps; keep
+  completed).
+- OQ4 → **drop `queued`**; `pending` covers waiting.
+
+### Next-step recommendation
+
+`synthesize` — all blocking open questions resolved; no spec-maturity gate trip.
+
+_Decisions produced: D-3 (from C12-R / OQ1 user choice), D-4 (from C13-R / C13-FIX). Co-produced with R1: confirms D-9 (OQ3) and D-10 (OQ4)._
+_Changed in plan: design-context decision 5 REVISED (no-worktree → read-snapshot worktree, read-only via plan mode); C13's enforcement mechanism REPLACED (prompt/BOOCODE_TOOLS → `mode_id='plan'`); C12 RESOLVED toward A2 (reuse) over A4 (bypass)._
--- a/openspec/changes/archived/orchestrator/design.md
+++ b/openspec/changes/archived/orchestrator/design.md
@@ -0,0 +1,243 @@
+# Orchestrator (Phase 2) — design (the HOW)
+
+Planning altitude: names files, columns, frames, and decision-bearing values
+(the plan-mode flag, status sets, frame field names). Every non-obvious choice
+cites a committed decision in
+[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md).
+The behavioral spec is [artifacts/design-context.md](artifacts/design-context.md)
+(decision 5 REVISED here); integration surfaces are in
+[artifacts/.discovery-notes.md](artifacts/.discovery-notes.md).
+
+## Architecture at a glance
+
+```
+ChatInput (shared composer)                          apps/web
+  ├─ Workflow button → FlowLauncherDialog ─┐
+  └─ /flow slash (instant defaults) ───────┤
+                                           ▼
+                          POST /api/runs  ── apps/coder/routes
+                                           ▼
+                       flow-runner.ts (DB-driven scheduler)
+                         · loads flow def from src/conductor/
+                         · step.run(ctx) IN-PROCESS → prompt (contracts injected)
+                         · INSERT flow_runs / flow_steps
+                         · INSERT each ready agent step as a tasks row
+                              (mode_id='plan', synthetic chat_id)
+                                           ▼
+                          dispatcher.ts (REUSED, unchanged internals)
+                         · LISTEN 'tasks_new' → external-agent path
+                         · qwen --approval-mode plan  (read-only gate)
+                         · worktree = read snapshot; AgentEvents → WS frames
+                         · onTaskTerminal(taskId,state)  ← ONE new hook
+                                           ▼
+                       flow-runner advances: read full output → run code steps
+                         inline → INSERT next ready wave  (or finish + report)
+                                           ▼
+   flow_run_started / flow_run_step_updated  +  reused delta/tool_call/
+   message_complete (keyed by step chat_id)  → broker → WS
+                                           ▼
+                 OrchestratorPane.tsx (run header, report-at-top,
+                 collapsed roster, expand-one-at-a-time stream)
+```
+
+## Re-home & DispatchFn seam ([D-1](artifacts/implementation-decision-log.md#d-1--re-home-the-pure-conductor-definitions-into-appscodersrcconductor))
+
+Copy the pure (dispatch-free) conductor files into `apps/coder/src/conductor/`:
+`spine.ts`, `flows/*`, `contracts.ts`, `types.ts`, `render.ts`. Copy the 23
+personas (`conductor/agents/*.md`). Do NOT copy `flow.ts` (in-memory scheduler,
+replaced by the flow-runner) or `dispatch.ts` (`opencode run` subprocess,
+replaced by dispatcher reuse). The Phase-1 CLI under `conductor/` stays alive
+unchanged as a regression oracle.
+
+Two seam edits on the copies:
+
+- **Sever flow→dispatch coupling.** `flows/code-review.ts:10` imports
+  `dispatchAgent` from `../dispatch.js` and calls it at `:62`. Replace that import
+  with a `DispatchFn` field on `StepContext`, injected by the flow-runner. Every
+  flow then reaches dispatch through the context, not a module import.
+- **Parameterize the model.** `spine.ts:122` reads `process.env.CONDUCTOR_MODEL`
+  into the report header. Make it read the run's configured `model` (passed through
+  the spine factory / step context) so the header matches the run, not a process
+  env.
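+
+A minimal sketch of the two seam edits, assuming hypothetical shapes for
+`DispatchFn` and `StepContext` (the real definitions in `types.ts` may differ, and
+the `code-reviewer` agent name is illustrative only):
+
+```typescript
+// Hypothetical shapes for illustration; not the actual types.ts definitions.
+type DispatchFn = (agent: string, prompt: string) => Promise<string>;
+
+interface StepContext {
+  model: string;        // the run's configured model, not process.env
+  dispatch: DispatchFn; // injected by the flow-runner
+}
+
+// A flow step reaches dispatch through the context instead of a module import.
+function reviewStep(ctx: StepContext, diff: string): Promise<string> {
+  return ctx.dispatch("code-reviewer", `Review this diff:\n${diff}`);
+}
+
+// A stub dispatcher that records which agents were requested.
+const calls: string[] = [];
+const fakeDispatch: DispatchFn = async (agent, prompt) => {
+  calls.push(agent);
+  return `stub output for ${agent} (${prompt.length} chars)`;
+};
+```
+
+Because dispatch arrives through `ctx`, the same step definition can run against
+the flow-runner's dispatcher in the app or a stub in a test.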
+
+The evidence/yagni contracts (`contracts.ts`) and the adversarial-validator gate
+are preserved because the flow-runner calls `step.run(ctx)` **in-process** to build
+each prompt before it INSERTs the task — the closures execute in the coder process;
+prompts are never serialized to DB ([D-1] rationale, C11).
+
+## Schema ([D-5](artifacts/implementation-decision-log.md#d-5--flow_runs--flow_steps-schema-in-the-coder-schema), [D-10](artifacts/implementation-decision-log.md#d-10--concurrency-multiple-runs-no-queued-status-single-model-per-run))
+
+Two tables in `apps/coder/src/schema.sql` (coder-owned; applied by the host
+boocoder service). Explicit CHECK names + the repo's DROP-IF-EXISTS →
+guarded-ADD discipline (root CLAUDE.md).
+
+`flow_runs`:
+- `id`, `project_id` (NO FK — matches `tasks.project_id`, `schema.sql:19`),
+  `flow_name`, `band` CHECK `(small|medium|large)`, `model`,
+  `status` CHECK-named `(running|completed|failed)`,
+  `input` JSONB CHECK `(input ? 'question')`,
+  `report` TEXT nullable, `error`, `created_at`/`updated_at` (`clock_timestamp()`).
+- Index `flow_runs(project_id, created_at DESC)` (runs history).
+
+`flow_steps`:
+- `id`, `run_id` UUID → `flow_runs(id)` ON DELETE CASCADE, `step_id`,
+  `kind` CHECK `(agent|code)`, `agent`,
+  `status` CHECK-named `(pending|running|completed|failed|skipped)`
+  — no `queued` status ([D-10]; llama-swap can't populate it, C16),
+  `task_id` UUID → `tasks(id)` ON DELETE SET NULL (nullable; code steps NULL),
+  `chat_id` UUID → `chats(id)` ON DELETE SET NULL,
+  `input` TEXT, `output` TEXT (FULL output — `tasks.output_summary` is ≤500 char,
+  `schema.sql:26`, and can't reconstruct `ctx.results`, C3), `error`,
+  timestamps, UNIQUE `(run_id, step_id)`.
+- Index `flow_steps(run_id, status)` (ready-wave + resume scans).
+
+No `depends_on` column and no skipped-step rows — deps and skips are derivable from
+the loaded flow def (`flow.ts:28-41`, `types.ts:27`, C6). The FK lives on
+`flow_steps.task_id`, NOT a new column on `tasks` ([D-5]; keeps `tasks` generic, C4).
+JSONB writes via `sql.json(value as never)`.
+
+## Flow-runner & onTaskTerminal ([D-2](artifacts/implementation-decision-log.md#d-2--db-driven-flow-runner-with-an-ontaskterminal-dispatcher-hook))
+
+New `apps/coder/src/services/flow-runner.ts` — a DB-backed scheduler that owns
+`flow_runs`/`flow_steps`. It does NOT run a poll loop; it reacts to ONE new hook.
+
+`createDispatcher` gains an `onTaskTerminal(taskId, state)` callback, invoked at
+the existing external-agent terminal transitions (`dispatcher.ts:642-646`
+completed, `:659-661` failed). No change to the dispatcher's internal run
+functions ([D-2]).
+
+Run lifecycle:
+1. `POST /api/runs` → flow-runner loads the flow def, derives the first ready wave,
+   INSERTs `flow_runs` (`status='running'`) and its `flow_steps` (each
+   `status='pending'`), and a synthetic `chats` row per agent step (stream
+   attribution, [D-6]).
+2. For each ready `agent` step: build the prompt via `step.run(ctx)` in-process,
+   then INSERT a `tasks` row `(project_id, input=prompt, agent, model,
+   mode_id='plan', chat_id=<synthetic>)` with `state='pending'`. The dispatcher
+   picks it up via `LISTEN 'tasks_new'` ([D-3]).
+3. `code` steps run inline in the flow-runner (no task; `flow_steps.task_id` NULL).
+4. `onTaskTerminal` fires → flow-runner reads the **full** task output, writes it to
+   `flow_steps.output`, marks the step completed/failed, derives the next ready
+   wave, and INSERTs it (or, on the last wave, renders the report into
+   `flow_runs.report` and sets `status='completed'`).
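+
+The "derive the next ready wave" step can be sketched as a pure helper. This is an
+illustration under assumptions, not the shipped scheduler: `StepDef`, `dependsOn`,
+and `readyWave` are hypothetical names, a step is assumed to be marked `running`
+once its task is INSERTed, and `skipped` is assumed to satisfy a dependency.
+
+```typescript
+// Hypothetical, simplified shapes; the real flow def carries more fields.
+type WaveStatus = "pending" | "running" | "completed" | "failed" | "skipped";
+
+interface StepDef {
+  id: string;
+  dependsOn: string[]; // read from the loaded flow def, not from a DB column
+}
+
+// A step is ready when it has not started and every dependency is settled.
+// Steps with no recorded status are treated as pending.
+function readyWave(defs: StepDef[], status: Map<string, WaveStatus>): string[] {
+  const settled = (id: string): boolean => {
+    const s = status.get(id);
+    return s === "completed" || s === "skipped";
+  };
+  return defs
+    .filter((d) => (status.get(d.id) ?? "pending") === "pending")
+    .filter((d) => d.dependsOn.every(settled))
+    .map((d) => d.id);
+}
+
+// Illustrative three-step flow: scan, then review, then report.
+const exampleDefs: StepDef[] = [
+  { id: "scan", dependsOn: [] },
+  { id: "review", dependsOn: ["scan"] },
+  { id: "report", dependsOn: ["scan", "review"] },
+];
+```
+
+An empty result while some step is still `running` means the run is waiting on
+`onTaskTerminal`; an empty result with nothing running is the last-wave case.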
+
+## Execution via dispatcher reuse ([D-3](artifacts/implementation-decision-log.md#d-3--reuse-the-existing-dispatcher-insert-pending-task-not-a-direct-pty-bypass))
+
+Steps execute through the **existing** dispatcher external-agent path — not a
+direct-PTY bypass. The dispatcher creates a git worktree (a stable HEAD
+read-checkout), runs the agent, and streams AgentEvents → WS frames unchanged.
+This REVISES design-context decision 5 ("no worktree") to "worktree as a harmless
+read snapshot" — inert because the agent cannot write under plan mode ([D-4]).
+Task-as-dispatch precedents the flow-runner mirrors: `routes/skills.ts:94`,
+`routes/arena.ts:49`, `tools/new_task.ts:54`.
+
+## Read-only via plan mode ([D-4](artifacts/implementation-decision-log.md#d-4--read-only-enforced-hard-by-mode_idplan-qwen---approval-mode-plan))
+
+The flow-runner hardcodes `mode_id='plan'` on every step task; never
+user-overridable. The PTY dispatcher already passes it to qwen as
+`--approval-mode plan` (`pty-dispatch.ts:75`), a built-in tool-level gate: reads
+allowed, writes blocked. This is the SOLE read-only enforcement. Persona prompts
+and `BOOCODE_TOOLS` are NOT relied upon — they do not govern an external qwen CLI
+child (R2 security finding, C13). Adding a non-qwen agent to flows requires
+re-verifying that agent's plan-mode equivalent before allowing it.
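+
+One way to keep the mode non-overridable is to build the task row in a single
+function that never reads a caller-supplied mode. The sketch below assumes
+hypothetical `StepTaskRequest` / `StepTaskRow` shapes; only the column names come
+from the run lifecycle above.
+
+```typescript
+// Hypothetical request shape; mode_id is accepted here only to show it is ignored.
+interface StepTaskRequest {
+  project_id: string;
+  prompt: string;
+  agent: string;
+  model: string;
+  chat_id: string;
+  mode_id?: string;
+}
+
+// The literal type "plan" makes any other mode a compile-time error.
+interface StepTaskRow {
+  project_id: string;
+  input: string;
+  agent: string;
+  model: string;
+  mode_id: "plan";
+  chat_id: string;
+  state: "pending";
+}
+
+function buildStepTask(req: StepTaskRequest): StepTaskRow {
+  return {
+    project_id: req.project_id,
+    input: req.prompt,
+    agent: req.agent,
+    model: req.model,
+    mode_id: "plan", // hardcoded; req.mode_id is deliberately never read
+    chat_id: req.chat_id,
+    state: "pending",
+  };
+}
+```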
+
+## WS frames ([D-6](artifacts/implementation-decision-log.md#d-6--two-new-ws-frames-per-agent-stream-reuses-existing-frames-by-chat_id))
+
+Two new frames in `packages/contracts/src/ws-frames.ts` `WsFrameSchema`:
+- `flow_run_started`: `{ run_id, flow_name, band, steps: [{ step_id, agent, kind,
+  chat_id, label }] }`.
+- `flow_run_step_updated`: `{ run_id, step_id, status, run_status?, report? }`
+  (the report rides here — no separate report frame, [D-6]).
+
+The per-agent token stream REUSES the existing `delta` / `tool_call` /
+`message_complete` frames keyed by the step's synthetic `chat_id` — no new
+streaming frames. Register both new frames in ALL THREE registries: contracts
+`WsFrameSchema` (rebuild with `pnpm -C packages/contracts build`), the server loose
+`InferenceFrame` union (`services/inference/turn.ts`), and the web strict
+`WsFrame` union (`apps/web/src/api/types.ts` — the wire-format gate; missing it
+silently drops the frame at JSON-parse).
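+
+For orientation, the two frames could look like the plain-TypeScript stand-in
+below. It is a sketch under assumptions: the real contract is Zod-backed, the
+`type` discriminator name and every field type are guesses (the source lists field
+names only), and `parseFlowFrame` is a hypothetical helper showing why a strict
+union that lacks a frame ends up dropping it.
+
+```typescript
+// Field types are assumptions; only the field names come from the spec.
+interface FlowRunStarted {
+  type: "flow_run_started";
+  run_id: string;
+  flow_name: string;
+  band: "small" | "medium" | "large";
+  steps: {
+    step_id: string;
+    agent: string | null;
+    kind: "agent" | "code";
+    chat_id: string | null;
+    label: string;
+  }[];
+}
+
+interface FlowRunStepUpdated {
+  type: "flow_run_step_updated";
+  run_id: string;
+  step_id: string;
+  status: "pending" | "running" | "completed" | "failed" | "skipped";
+  run_status?: "running" | "completed" | "failed";
+  report?: string; // the report rides on this frame
+}
+
+type FlowFrame = FlowRunStarted | FlowRunStepUpdated;
+
+// Strict gate: a frame whose type is not registered yields null (is dropped).
+function parseFlowFrame(raw: string): FlowFrame | null {
+  const v: unknown = JSON.parse(raw);
+  if (typeof v !== "object" || v === null) return null;
+  const t = (v as { type?: unknown }).type;
+  if (t === "flow_run_started" || t === "flow_run_step_updated") {
+    return v as FlowFrame;
+  }
+  return null;
+}
+```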
+
+## Resume ([D-9](artifacts/implementation-decision-log.md#d-9--resumable-runs-via-initresume-on-coder-startup))
+
+`initResume` runs on coder startup over `flow_runs WHERE status='running'`:
+- step whose `task_id` task is `completed` → mark step done, advance the run;
+- step whose task is lost/failed (PTY died on restart) → re-dispatch (re-INSERT a
+  fresh task, again `mode_id='plan'`);
+- completed steps are kept (no re-run).
+
+Reconcile-and-advance, not mark-run-failed — decision 4 commits to resumable and
+task state is durable under [D-3] (C15).
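+
+The reconcile rules above can be sketched as a pure decision helper. The names
+(`ResumeAction`, `reconcileStep`) are hypothetical, and two points are
+assumptions rather than spec: a step is taken to be `running` once its task was
+INSERTed, and a task still `pending` is left for the dispatcher to pick up.
+
+```typescript
+type ResumeAction = "keep" | "complete" | "redispatch" | "wait";
+
+// taskState is null when the step's task row no longer exists (ON DELETE SET NULL).
+function reconcileStep(stepStatus: string, taskState: string | null): ResumeAction {
+  // Completed steps are kept; steps not yet dispatched are left untouched
+  // for normal ready-wave derivation.
+  if (stepStatus !== "running") return "keep";
+  // The task finished while the coder was down: mark the step done and advance.
+  if (taskState === "completed") return "complete";
+  // Assumption: a task that never started is still owned by the dispatcher.
+  if (taskState === "pending") return "wait";
+  // Lost, failed, or stuck mid-run after the PTY died: INSERT a fresh plan-mode task.
+  return "redispatch";
+}
+```
+
+Keeping this logic free of DB access is what lets a unit test cover each restart
+case without a running coder.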
+
+## Orchestrator pane ([D-7](artifacts/implementation-decision-log.md#d-7--orchestrator-pane-kind--orchestratorpane))
+
+New `orchestrator` pane kind following the `markdown_artifact`/`html_artifact`
+precedent (`api/types.ts:386` `WorkspacePaneKind`). Touches `WorkspacePaneKind`,
+`useWorkspacePanes`, `Workspace`, `NewPaneMenu`, `ChatTabBar`,
+`PaneHeaderActions`.
+
+`OrchestratorPane.tsx`:
+- run header (flow + band);
+- report-at-top on completion;
+- collapsed agent roster reusing `AgentStatusDot` (`AgentComposerBar.tsx:204`);
+- expand-one-at-a-time detail well reusing the CoderPane stream rendering (keyed by
+  the step's `chat_id`);
+- mobile single-column inline expand; auto-expand-follows-active.
+
+The pane subscribes to `flow_run_started` (to build the roster) and
+`flow_run_step_updated` (status + report), and to the reused
+delta/tool_call/message_complete frames by `chat_id` for the expanded agent.
+
+## Toolbar button & launcher ([D-8](artifacts/implementation-decision-log.md#d-8--workflow-toolbar-button--slash-launch-boochatboocoder-parity))
+
+A `Workflow` (lucide) button on `ChatInput`'s controls row, between the
+`SquareSlash` chip and the `Globe` pill (`ChatInput.tsx:648-732`, `:673` — row is
+≤5 elements, stays one line, C9). Because `ChatInput` is rendered by both ChatPane
+and CoderPane, this is BooChat + BooCoder parity from one button. "Flows" label
+desktop, icon-only mobile.
+
+- **Slash** (`/flow <focus>`): launches instantly with defaults (band `small`,
+  current pane's project, text-after-command = focus), opening an Orchestrator
+  pane.
+- **Button** → `FlowLauncherDialog.tsx`: 5 category tabs (Analysis / Discovery /
+  Planning / Authoring / Review) filtering the flow list (`flows/index.ts`), + size
+  + focus + fast toggle; defaults Analysis / Small / off. Same run pane either way.
+
+Runs history surfaces in `NewPaneMenu`. Export (copy / save-file / send-to-chat via
+the existing `sendToChat`, `lib/events.ts`) lives in the pane header `…`,
+conditional on a completed report.
+
+## Concurrency ([D-10](artifacts/implementation-decision-log.md#d-10--concurrency-multiple-runs-no-queued-status-single-model-per-run))
+
+Multiple runs allowed; each its own pane + `flow_runs` row, no shared state. Step
+statuses: pending / running / completed / failed / skipped (no `queued` — the
+dispatcher's `pending` covers a step waiting on deps or on the busy model;
+llama-swap can't report queue position, C16). Single model per run, default
+`qwen3.6-35b-a3b-mxfp4`.
+
+## Routes
+
+- `POST /api/runs` — `{ project_id, flow_name, band, input:{question,...}, model? }`
+  → creates the run, starts the flow-runner, returns `run_id`. Publishes
+  `flow_run_started`.
+- `GET /api/runs?project_id=` — runs history (backs `NewPaneMenu`).
+- `GET /api/runs/:id` — reopen a run (run + steps + report).
+
+## Deploy surface
+
+- `apps/coder` changes (conductor defs, flow-runner, dispatcher hook, schema,
+  resume, routes) → `sudo systemctl restart boocoder`.
+- `packages/contracts` + `apps/web` (frames, pane, button, launcher, history) →
+  `docker compose up --build -d boocode`. Build contracts first
+  (`pnpm -C packages/contracts build`).
+
+## Deferred (YAGNI)
+
+Full list with reopen triggers in
+[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md#deferred-yagni):
+`@boocode/conductor` workspace package (copy-in instead);
+`flow_steps.depends_on` column (derive from flow def);
+persisted skipped-step rows (`when()` is pure);
+a `read_only` flag on `tasks` (superseded by `mode_id='plan'`);
+an explicit `queued` status (llama-swap can't populate it);
+a launcher search box (5 category tabs suffice);
+a separate report WS frame (report rides on `flow_run_step_updated`).
--- a/openspec/changes/archived/orchestrator/proposal.md
+++ b/openspec/changes/archived/orchestrator/proposal.md
@@ -0,0 +1,82 @@
+# Orchestrator (Phase 2) — in-app multi-agent conductor
+
+## Source
+
+Settled via a conversational `grill-me` design session. The captured behavioral
+spec (12 decisions, the *what*) is
+[artifacts/design-context.md](artifacts/design-context.md); the HOW is
+[design.md](design.md), decomposed in [tasks.md](tasks.md) and committed in
+[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md)
+(D-1…D-10). Note: design-context **decision 5** ("no worktree") is REVISED by
+D-3/D-4 (worktree kept as a harmless read snapshot; read-only enforced by qwen
+plan mode).
+
+## Why
+
+Phase 1 shipped a deterministic multi-agent **conductor** as a standalone host CLI
+(`/opt/boocode/conductor/`): Han flows that fan out read-only specialist agents,
+apply the evidence/yagni contracts, and emit an evidence-disciplined report. It is
+not reachable from the app — a user in BooChat or BooCoder cannot launch a flow,
+cannot watch agents progress live, and the run leaves no persisted, reopenable
+artifact.
+
+This change brings the conductor in-app: launch any Han flow from the shared
+composer, watch each agent stream live (Paseo-style parent-with-subagents), and
+get the report — all on the **already-loaded local Qwen 35B, free**, with the run
+persisted and resumable across a coder restart. It reuses the existing task
+dispatcher, streaming pipeline, and broker rather than standing up a parallel
+execution path (see [design-context](artifacts/design-context.md) decision 4 and
+D-2/D-3).
+
+## What Changes
+
+- **Two doors, full parity.** A `Workflow` button and a slash command on the shared
+  `ChatInput` composer → both appear in BooChat (ChatPane) and BooCoder (CoderPane),
+  desktop and mobile (icon-only). Slash launches instantly with defaults; the
+  button opens a flow launcher first. (decisions 2, 8, 9 · D-8)
+- **A new `orchestrator` pane kind.** A run view alongside `chat | coder |
+  terminal`: flow + band header, the report at the top on completion, a collapsed
+  agent roster, expand-one-at-a-time to watch a single agent's live stream.
+  (decision 3 · D-7)
+- **Read-only flows, enforced HARD.** Every step runs as a qwen agent under
+  `--approval-mode plan` (`mode_id='plan'`): reads allowed, writes blocked at the
+  tool level. Flows never write the repo; the report is the only output.
+  (decision 5 revised · D-4)
+- **Execution reuses the dispatcher.** Each flow step is inserted as a normal
+  `tasks` row; the existing dispatcher runs it through the external-agent path and
+  streams AgentEvents → WS frames unchanged. One new `onTaskTerminal` hook advances
+  the flow. (decisions 1, 4 · D-2, D-3)
+- **Persisted + resumable runs.** New `flow_runs` / `flow_steps` tables in the
+  coder schema; a run survives a coder restart (`initResume` reconciles mid-flight
+  steps). Runs are reopenable from a history; the report is exportable on demand
+  (copy / save-file / send-to-chat). (decisions 4, 10 · D-5, D-9)
+- **Qwen-only, one model per run.** Default `qwen3.6-35b-a3b-mxfp4`, held as a
+  single config value so more local models slot in later. Multiple runs allowed,
+  each its own pane. (decisions 6, 11 · D-10)
+- **Conductor definitions re-homed.** The pure flow/spine/contracts/types/render
+  files + 23 personas are copied into `apps/coder/src/conductor/`; the Phase-1 CLI
+  stays alive. The evidence/yagni contracts and adversarial-validator gate are
+  preserved (the flow-runner builds each prompt in-process before dispatch).
+  (decision 1 · D-1)
+
+## Impact
+
+- **`apps/coder` (deploy: `sudo systemctl restart boocoder`):** new
+  `conductor/` defs, `flow-runner.ts`, the `onTaskTerminal` dispatcher hook,
+  `flow_runs`/`flow_steps` in `schema.sql`, `initResume`, `POST /api/runs` +
+  list/reopen routes.
+- **`packages/contracts` + `apps/web` (deploy: `docker compose up --build -d
+  boocode`):** two new WS frames (in all three registries), the `orchestrator`
+  pane kind + `OrchestratorPane.tsx`, the `Workflow` toolbar button + slash wiring,
+  `FlowLauncherDialog.tsx`, runs history + export.
+- **No `apps/server` chat-pipeline change** beyond adding the two frames to the
+  server's loose `InferenceFrame` union (the web type is the wire gate).
+- **Safety:** read-only is the whole feature's invariant; D-4 makes it a tool-level
+  gate, not a prompt. Reviewed by adversarial-security-analyst in R2.
+
+## Out of scope (carried from design-context)
+
+- A Claude execution path (Claude Code covers it).
+- Folding Arena into the Orchestrator (stays separate).
+- Per-agent model tiering (single model per run for now).
+- Pixel-faithful per-skill Han report templates (spine-level only).
--- a/openspec/changes/archived/orchestrator/tasks.md
+++ b/openspec/changes/archived/orchestrator/tasks.md
@@ -0,0 +1,144 @@
+# Orchestrator (Phase 2) — tasks
+
+Decomposition in dependency order, grouped by subsystem. Each group notes its
+deploy surface. Decisions referenced are in
+[artifacts/implementation-decision-log.md](artifacts/implementation-decision-log.md);
+the HOW is in [design.md](design.md).
+
+Deploy surfaces:
+- **coder** = `sudo systemctl restart boocoder`
+- **web/contracts** = `pnpm -C packages/contracts build && docker compose up --build -d boocode`
+
+---
+
+## 1. Re-home conductor defs + DispatchFn seam + model param (coder) — [D-1]
+
+- [ ] Copy `spine.ts`, `flows/*`, `contracts.ts`, `types.ts`, `render.ts` into
+  `apps/coder/src/conductor/`. Do NOT copy `flow.ts` or `dispatch.ts`.
+- [ ] Copy the 23 personas (`conductor/agents/*.md`) into the coder tree.
+- [ ] Add a `DispatchFn` field to `StepContext` (`types.ts`); replace
+  `flows/code-review.ts:10` (`import { dispatchAgent } from '../dispatch.js'`) and
+  its call site (`:62`) with a call through `ctx.dispatch`.
+- [ ] Parameterize `spine.ts:122` `process.env.CONDUCTOR_MODEL` → the run's `model`
+  (threaded through the spine factory / step context).
+- [ ] Confirm the Phase-1 CLI (`conductor/`) still builds + runs unchanged
+  (regression oracle).
+- [ ] Coder build passes (`pnpm -C apps/coder build`).
+
+## 2. Schema: `flow_runs` / `flow_steps` (coder) — [D-5], [D-10]
+
+- [ ] Add `flow_runs` to `apps/coder/src/schema.sql` (project_id no FK; band CHECK
+  small|medium|large; status CHECK-named running|completed|failed; input JSONB
+  CHECK `input ? 'question'`; report TEXT nullable; timestamps via
+  `clock_timestamp()`).
+- [ ] Add `flow_steps` (run_id FK→flow_runs CASCADE; step_id; kind CHECK agent|code;
+  status CHECK-named pending|running|completed|failed|skipped — NO `queued`;
+  task_id→tasks SET NULL nullable; chat_id→chats SET NULL; output TEXT FULL;
+  UNIQUE(run_id, step_id)).
+- [ ] Indexes `flow_steps(run_id, status)`, `flow_runs(project_id, created_at DESC)`.
+- [ ] Explicit CHECK names + DROP-IF-EXISTS → guarded-ADD migration discipline.
+- [ ] Verify idempotent re-apply (schema applies clean twice).
+
+## 3. WS frames in all three registries (contracts + server + web) — [D-6]
+
+- [ ] Add `flow_run_started` (`run_id, flow_name, band, steps[{step_id, agent,
+  kind, chat_id, label}]`) + `flow_run_step_updated` (`run_id, step_id, status,
+  run_status?, report?`) to `packages/contracts/src/ws-frames.ts` `WsFrameSchema`;
+  rebuild (`pnpm -C packages/contracts build`).
+- [ ] Add both to the server loose `InferenceFrame` union
+  (`apps/server/src/services/inference/turn.ts`).
+- [ ] Add both to the web strict `WsFrame` union (`apps/web/src/api/types.ts`) — the
+  wire-format gate.
+- [ ] Confirm the per-agent stream reuses existing `delta`/`tool_call`/
+  `message_complete` by `chat_id` (no new stream frames).
+
+## 4. Flow-runner + onTaskTerminal hook + plan mode + per-step chats (coder) — [D-2], [D-3], [D-4]
+
+- [ ] Add `onTaskTerminal(taskId, state)` callback to `createDispatcher`, invoked at
+  the external terminal transitions (`dispatcher.ts:642-646`, `:659-661`). No
+  change to internal run functions.
+- [ ] Create `apps/coder/src/services/flow-runner.ts`: load flow def, derive ready
+  wave, INSERT `flow_runs`/`flow_steps`, build prompts via `step.run(ctx)`
+  in-process (contracts injected), INSERT a synthetic `chats` row per agent step,
+  INSERT each ready agent step as a `tasks` row with `state='pending'`,
+  `mode_id='plan'` (hardcoded, never user-overridable), `chat_id=<synthetic>`.
+- [ ] Run `code` steps inline (`flow_steps.task_id` NULL).
+- [ ] On `onTaskTerminal`: read FULL task output → `flow_steps.output`, mark step,
+  derive + INSERT next wave; on last wave render report → `flow_runs.report`,
+  `status='completed'`.
+- [ ] Publish `flow_run_started` on launch and `flow_run_step_updated` on each step
+  transition (report on completion).
+
+## 5. Resume on startup (coder) — [D-9]
+
+- [ ] Add `initResume` (coder startup) over `flow_runs WHERE status='running'`:
+  task completed → mark step done + advance; task lost/failed → re-dispatch (fresh
+  task, `mode_id='plan'`); keep completed steps.
+- [ ] Verify a coder restart mid-run reconciles and advances (manual smoke).
+
+## 6. Routes: create / list / reopen (coder) — [D-2], [D-7]
+
+- [ ] `POST /api/runs` `{project_id, flow_name, band, input:{question}, model?}` →
+  create run, start flow-runner, return `run_id`.
+- [ ] `GET /api/runs?project_id=` → runs history.
+- [ ] `GET /api/runs/:id` → run + steps + report (reopen).
+
+## 7. `orchestrator` pane kind + OrchestratorPane (web) — [D-7]
+
+- [ ] Add `orchestrator` to `WorkspacePaneKind` (`api/types.ts`); thread through
+  `useWorkspacePanes`, `Workspace`, `NewPaneMenu`, `ChatTabBar`,
+  `PaneHeaderActions` (follow the `markdown_artifact`/`html_artifact` precedent).
+- [ ] `OrchestratorPane.tsx`: run header, report-at-top on completion, collapsed
+  roster reusing `AgentStatusDot` (`AgentComposerBar.tsx:204`),
+  expand-one-at-a-time well reusing CoderPane stream rendering keyed by step
+  `chat_id`, mobile single-column inline expand, auto-expand-follows-active.
+- [ ] Subscribe to `flow_run_started` (build roster) + `flow_run_step_updated`
+  (status/report) + reused stream frames by `chat_id`; handlers idempotent
+  (event-dedup discipline).
+
+## 8. Workflow toolbar button + slash launch wiring, parity (web) — [D-8]
+
+- [ ] Add the `Workflow` (lucide) button to `ChatInput`'s controls row between
+  `SquareSlash` and `Globe` (`ChatInput.tsx:648-732`); "Flows" desktop, icon-only
+  mobile. Confirm the row stays one line (no scroll/wrap) on mobile.
+- [ ] Slash (`/flow <focus>`): launch instantly with defaults (band small, current
+  pane's project, text-after = focus), open an Orchestrator pane.
+- [ ] Verify parity: the button + slash both appear in BooChat (ChatPane) AND
+  BooCoder (CoderPane) since `ChatInput` is shared.
+
+## 9. FlowLauncherDialog (web) — [D-8]
+
+- [ ] `FlowLauncherDialog.tsx`: 5 category tabs (Analysis/Discovery/Planning/
+  Authoring/Review) filtering the flow list (`flows/index.ts`), + size + focus +
+  fast toggle; defaults Analysis/Small/off. On launch → `POST /api/runs`, open the
+  Orchestrator pane.
+
+## 10. Runs history + export (web) — [D-7]
+
+- [ ] Runs history entry point in `NewPaneMenu` (backed by `GET /api/runs`);
+  reopening opens an Orchestrator pane via `GET /api/runs/:id`.
+- [ ] Export in the pane header `…`: copy / save-to-file / send-to-chat (existing
+  `sendToChat`, `lib/events.ts`), conditional on a completed report.
+
+## 11. Tests — [D-2], [D-9]
+
+- [ ] Coder vitest for the scheduler: ready-wave derivation from a flow def,
+  `onTaskTerminal` advancement, last-wave report render. Extract a pure helper
+  (e.g. `flow-runner-decisions.ts`) for the wave/advance logic (repo pattern:
+  `turn-guard.ts`/`lifecycle-decisions.ts`).
+- [ ] Coder vitest for resume: completed step kept, lost/failed step re-dispatched
+  (pure reconcile helper).
+- [ ] Parity check: assert the `Workflow` button + slash are wired through the
+  shared `ChatInput` (so both panes get them) — code-level assertion, no web test
+  harness exists.
+
+---
+
+## Sequencing notes
+
+- Groups 1, 2, and 4–6 are coder-only (one `systemctl restart boocoder` after they
+  land together, or incrementally).
+- Group 3 (contracts) must build before groups 7–10 consume the frame types
+  (`pnpm -C packages/contracts build` first).
+- Groups 7–10 are web; ship with `docker compose up --build -d boocode`.
+- Group 11 lands alongside the code it tests (coder tests after group 4/5).