# BooCode Code Conductor

A deterministic, code-driven orchestrator for multi-agent flows, modelled on [Han](https://github.com/testdouble/han). **Code owns the sequence; the model only works.**

## Why this exists

Han's skills are conducted by a *model*: the driving LLM reads a skill and decides when to dispatch each specialist agent. That works with a strong model (Claude), but a local 35B model drops orchestration steps when it has to track methodology *and* sequence agents at once (measured: the loose Han `research` skill on Qwen 3.6 35B dropped the adversarial-validator step).

This conductor inverts that. A small TypeScript scheduler decides every step, runs independent steps in parallel, folds their results, and dispatches each Han agent as a **bounded single-task worker**: one `opencode run` per worker, with the agent's persona baked in. The model never sees the workflow; it only does one task and returns. That is the "external deterministic orchestrator drives bounded workers" pattern, which holds on weak local models where self-orchestration does not.

## The research flow (Han's spine, as code)

```
research-analyst ─┐
                  ├─▶ synthesis (code fold) ─▶ adversarial-validator ─▶ render
codebase-explorer ┘
(parallel angles)                              (attacks the fold)
```

- `research-analyst` and `codebase-explorer` run **in parallel** (explorer only when `--repo` is given).
- A **code** step folds their full outputs verbatim: no 500-char summary, no model deciding order.
- `adversarial-validator` attacks the fold.
- `render` stitches the Han-shaped report, validation above recommendation.

## Run it

```bash
cd conductor
# (local-only: node_modules is symlinked to ../apps/coder/node_modules, gitignored)
node_modules/.bin/tsx src/run.ts "" [--size=small|medium|large] [--repo=/abs/path] [--fast]

# examples:
node_modules/.bin/tsx src/run.ts research "polling vs webhooks for a third-party API?" --fast
node_modules/.bin/tsx src/run.ts architectural-analysis "the dispatcher" --repo=/opt/boocode/apps/coder --size=large
node_modules/.bin/tsx src/run.ts security-review "the auth middleware" --repo=/opt/boocode
node_modules/.bin/tsx src/run.ts investigate "tasks occasionally dispatch twice" --repo=/opt/boocode
```

Run with no args to list flows. Writes `conductor-report--.md`, prints its path on stdout.

### Flows

22 flows are wired; run with no args to list them. By category:

- **Analysis / review:** `research`, `investigate`, `architectural-analysis`, `security-review`, `gap-analysis`, `data-review`, `devops-review`, `issue-triage`
- **Discovery / docs / tests:** `project-discovery`, `project-documentation`, `test-planning`
- **Planning** (one-pass drafts): `plan-a-feature`, `plan-implementation`, `plan-a-phased-build`, `plan-work-items`, `iterative-plan-review`
- **Authoring / reporting** (one-pass drafts): `adr`, `coding-standard`, `runbook`, `tdd`, `stakeholder-summary`
- **Bespoke pipeline:** `code-review`: per-dimension reviewers → adversarially verify each dimension (drops false positives)

Every spine flow ends with the **adversarial-validator** gate. `--size` selects how many angles fan out; `--fast` caps each worker's depth.

### Modes

- **Band** (`--size`, Han's small/medium/large): selects the roster breadth per flow. Small = the core angle(s); large = every angle.
- **Fast** (`--fast`): appends a speed directive to every worker (cap tool calls, decisive evidence only). Turns a ~12-min web-research worker into ~1 min.
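
The two modes above can be pictured as two small pure functions: one that gates the roster by band, and one that appends the speed directive to a brief. This is only a sketch under stated assumptions. The `minBand` field, the placeholder agent names, the helper names `selectAngles` and `buildBrief`, and the directive wording are all hypothetical; the real `Angle` / `Band` types live in `src/types.ts` and may look different.

```typescript
// Hypothetical shapes: the real `Angle` / `Band` types may differ.
type Band = "small" | "medium" | "large";

interface Angle {
  agent: string; // persona to dispatch (placeholder names below)
  minBand: Band; // assumed field: smallest band at which this angle fans out
}

const BAND_RANK: Record<Band, number> = { small: 0, medium: 1, large: 2 };

// Assumed wording; the actual speed directive is not shown in this README.
const FAST_DIRECTIVE = "Cap tool calls. Cite decisive evidence only.";

// Keep every angle whose minimum band is at or below the requested band.
function selectAngles(roster: Angle[], band: Band): Angle[] {
  return roster.filter((a) => BAND_RANK[a.minBand] <= BAND_RANK[band]);
}

// Append the speed directive to a worker brief only when --fast is set.
function buildBrief(task: string, fast: boolean): string {
  return fast ? `${task}\n\n${FAST_DIRECTIVE}` : task;
}

// Placeholder roster: one core angle, one angle that only joins at "large".
const roster: Angle[] = [
  { agent: "core-angle", minBand: "small" },
  { agent: "extra-angle", minBand: "large" },
];

console.log(selectAngles(roster, "small").map((a) => a.agent)); // [ 'core-angle' ]
console.log(buildBrief("Compare polling and webhooks.", true));
```

Keeping both decisions in plain code means the worker never has to reason about breadth or depth; it only receives a brief that already reflects them.
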
### Config (env)

| var | default | meaning |
|---|---|---|
| `CONDUCTOR_MODEL` | `llama-swap/qwen3.6-35b-a3b-mxfp4` | model each worker runs on |
| `CONDUCTOR_OPENCODE_BIN` | `/home/samkintop/.opencode/bin/opencode` | opencode binary |
| `CONDUCTOR_TIMEOUT_MS` | `1500000` | per-worker timeout (25 min; strict web-research personas on a local 35B routinely run 10+ min) |

## Layout

- `src/types.ts`: `Flow` / `Step` / `StepContext`, plus `Spine` / `Angle` / `Band` (a Han skill as data).
- `src/dispatch.ts`: one bounded worker = one `opencode run` with a baked persona.
- `src/flow.ts`: the conductor, a dependency-aware wave scheduler (parallel fan-out, barrier on deps).
- `src/spine.ts`: the **factory**, which compiles a `Spine` into a `Flow` (band gating, fold, optional synthesizer, validator gate, generic render).
- `src/flows/*.ts`: the Han skills as `Spine` configs + the registry (`index.ts`).
- `src/run.ts`: CLI.
- `agents/`: all 23 Han personas (the worker roster).

## Han coverage

All 23 Han **agents** are in `agents/` and dispatchable, and the full **skill** surface is wired (22 flows):

- **Spine-shaped (21):** fan-out → fold → optional synthesizer → adversarial-validator → render. Added as a `Spine` in `flows/`.
- **Bespoke (1):** `code-review`, a per-dimension find → verify-each-dimension pipeline (`flows/code-review.ts`).

**Honesty about the one-pass drafts.** Han's planning/authoring skills (`plan-*`, `iterative-plan-review`, `tdd`, `adr`, `coding-standard`, `runbook`, `stakeholder-summary`) are designed as human-in-the-loop loops. Run unattended here they produce a first-draft artifact and still take the adversarial-validator gate, but they are *not* a substitute for the interactive refinement Han intends. Phase 2 (in-app) is where they get a real human-in-the-loop surface.
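
The "dependency-aware wave scheduler" listed under Layout can be illustrated by the planning half of the problem: grouping steps into waves so that each wave depends only on earlier ones. The sketch below is one possible way to do that, not the code in `src/flow.ts`. The `Step` shape and the `planWaves` name are hypothetical, and the dependency edges for the research flow are read off the diagram in this README.

```typescript
// Hypothetical step shape; the real `Step` type in src/types.ts may differ.
interface Step {
  id: string;
  deps: string[]; // ids that must finish before this step may start
}

// Group steps into waves. Every step in a wave has all of its deps satisfied
// by earlier waves, so the steps of one wave could run in parallel (for
// example with Promise.all), and the boundary between waves is the barrier.
function planWaves(steps: Step[]): string[][] {
  const done = new Set<string>();
  let pending = [...steps];
  const waves: string[][] = [];

  while (pending.length > 0) {
    const ready = pending.filter((s) => s.deps.every((d) => done.has(d)));
    if (ready.length === 0) {
      // Nothing can start: the remaining steps form a cycle or name an unknown dep.
      const stuck = pending.map((s) => s.id).join(", ");
      throw new Error("cycle or missing dependency among: " + stuck);
    }
    waves.push(ready.map((s) => s.id));
    for (const s of ready) done.add(s.id);
    pending = pending.filter((s) => !done.has(s.id));
  }
  return waves;
}

// The research flow from the diagram, with assumed dependency edges.
const researchFlow: Step[] = [
  { id: "research-analyst", deps: [] },
  { id: "codebase-explorer", deps: [] },
  { id: "synthesis", deps: ["research-analyst", "codebase-explorer"] },
  { id: "adversarial-validator", deps: ["synthesis"] },
  { id: "render", deps: ["synthesis", "adversarial-validator"] },
];

console.log(planWaves(researchFlow));
```

For this input the first wave holds the two parallel angles and each later wave holds a single step, which matches the shape of the diagram. Failing loudly on an unsatisfiable dependency keeps a misconfigured flow from hanging silently.
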
## Evidence & reference alignment (Han)

The conductor applies Han's two foundational rules, vendored verbatim in `references/` and injected as contracts (`src/contracts.ts`):

- **`evidence-rule`**: every evidence-bearing flow brief and the validator carry it: trust classes (codebase / web / provided), the web **corroboration gate** (a single-source web claim is marked `[single-source]` and can't be the sole basis for a conclusion), codebase-as-current-state-anchor, and explicit **no-evidence labeling** with a reopen trigger. The validator additionally runs the rule's *reviewing* checklist (verified live: it flags single-source laundering and unnamed source contradictions).
- **`yagni-rule`**: flows that PRODUCE a committable artifact (`plan-*`, `adr`, `coding-standard`, `runbook`, `tdd`, `project-documentation`) carry `contracts: ['evidence', 'yagni']`: the inclusion gate (evidence-of-need), the simpler-version test, and a `## Deferred (YAGNI)` section for items that fail the gate. The validator runs the YAGNI review checklist on them.

Per flow, set `contracts` on the `Spine` (default `['evidence']`). The report header states which rules were applied; the plain-language **Summary**, the **Confidence** (High/Med/Low), and any deferrals live in the **Validation** section.

**Template fidelity: partial, by design.** Han skills render a fixed per-skill template (e.g. `research-report-template.md`'s Sources registry). This conductor follows that template *spine* (sourced/numbered evidence (`A#`/`E#`) from the agents, then Validation with `V#` + Confidence) but does **not** run a model to re-assemble a pixel-faithful template (that would reintroduce model-driven rendering the conductor deliberately avoids). The agents emit the sourced sections; the conductor stitches them in template order. Exact per-skill template rendering is available as a follow-up if wanted (vendor the template, pass it to the terminal agent).
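
The per-flow `contracts` setting described above might be resolved and injected roughly as follows. This is a hedged sketch, not the contents of `src/contracts.ts`. The `SpineLike` shape, the helper names, the placeholder rule text, the `## Contract:` heading, and the choice to place rules before the brief are all assumptions made for illustration; only the `contracts` field and its `['evidence']` default come from this README.

```typescript
// Hypothetical shapes; the real contracts live in src/contracts.ts and the
// vendored rule text in references/.
type Contract = "evidence" | "yagni";

interface SpineLike {
  name: string;
  contracts?: Contract[]; // omitted means the default below
}

const DEFAULT_CONTRACTS: Contract[] = ["evidence"];

// Placeholders standing in for the vendored rule files.
const RULE_TEXT: Record<Contract, string> = {
  evidence: "<contents of the vendored evidence-rule>",
  yagni: "<contents of the vendored yagni-rule>",
};

// A Spine without an explicit `contracts` list falls back to ['evidence'].
function resolveContracts(spine: SpineLike): Contract[] {
  return spine.contracts ?? DEFAULT_CONTRACTS;
}

// Assumed layout: rule text first, then the worker's brief.
function injectContracts(brief: string, contracts: Contract[]): string {
  const rules = contracts.map((c) => `## Contract: ${c}\n${RULE_TEXT[c]}`);
  return [...rules, brief].join("\n\n");
}

// The report header states which rules were applied.
function reportHeader(contracts: Contract[]): string {
  return `Rules applied: ${contracts.join(", ")}`;
}

const adr: SpineLike = { name: "adr", contracts: ["evidence", "yagni"] };
const research: SpineLike = { name: "research" };

console.log(reportHeader(resolveContracts(research))); // Rules applied: evidence
console.log(reportHeader(resolveContracts(adr))); // Rules applied: evidence, yagni
console.log(injectContracts("Draft the ADR.", resolveContracts(adr)));
```

Treating contracts as data on the `Spine` keeps the rule selection in code, so a worker cannot skip a rule by forgetting it.
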
## Phase 2 (in-app integration): not started

This standalone conductor is Phase 1. Phase 2 moves it into `apps/coder`: persist flow/step rows to Postgres, dispatch through the existing `AgentBackend`s (instead of shelling `opencode run`), expose an API route, and surface launch/watch in the CoderPane. See `docs/research/2026-06-03-boocode-orchestration-integration.md`.

## Not done yet (deliberate v1 scope)

- **Tool-permission safety.** Workers run on opencode's default agent, so the persona (read-only by charter) is trusted to stay read-only. v2: ship `mode: all`, `edit: deny` agent files and dispatch via `--agent` for enforced read-only.
- **Conditional / dynamic flows.** Steps are static (band-gated). No data-dependent branching or dynamic agent selection yet; the scheduler could support it, but no flow needs it so far.
- **Output parsing.** Worker output is the cleaned default `opencode run` text (banner + tool-progress lines stripped). A `--format json` parser would be more robust.