boocode/openspec/changes/v2.0-boocoder/proposal.md

# v2.0 — BooCoder

Major version bump. New app `apps/coder/` inside the existing monorepo. Lands together with the `boocode_db` → `boochat_db` DB rename and the per-app subdomain split (`code.indifferentketchup.com` → BooChat, `coder.indifferentketchup.com` → BooCoder).

## What BooCoder is

A write-capable coding agent surface. Two execution paths, same UI:

- **Path A (native):** BooCode's own inference loop with write tools (`edit_file`, `create_file`, `delete_file`). Edits queue in `pending_changes`; nothing touches disk until the user approves via `/apply`.
- **Path B (dispatch):** Shells out to external CLI agents (`opencode`, `goose`, `claude`, `pi`) via ACP (preferred) or raw PTY (fallback). One git worktree per dispatch. Captures events into the same parts taxonomy.

Both paths feed the same task DAG, same project registry, same pending-changes queue, same UI.

## Why now

v1.x proved the read-only loop works end-to-end: inference, tool dispatch, streaming, compaction, MCP client, outer loop, step caps, artifact rendering. The infrastructure is stable. The jump from "read-only chat" to "write-capable agent orchestrator" is the remaining gap between BooCode and having a real development environment.

## Architecture

### Three protocol roles (locked 2026-05-22)

1. **MCP client (write-capable allowed).** Inherits v1.15 client. Write-capable MCP servers (e.g. `@modelcontextprotocol/server-filesystem`) route writes through `pending_changes`. Per-task allow/deny means dispatched tasks can have a different MCP roster.
2. **MCP server (BooCoder's own primitives).** Exposes `boocoder.create_task`, `boocoder.list_pending_changes`, `boocoder.apply`, `boocoder.reject`, `boocoder.dispatch_external_agent`, `boocoder.list_worktrees` as MCP tools. Stdio transport for local consumers (Sam's `opencode` in Termius); HTTP deferred until OAuth + secret storage.
3. **ACP client (host).** Spawns `opencode acp` and `goose acp` as JSON-RPC stdio subprocesses. Maps ACP events (file operations, tool calls, terminal output) to BooCode's parts taxonomy. MCP servers configured in BooCoder are auto-forwarded to the dispatched agent (per goose docs — `context_servers` is the field).

### Container layout (post-v2.0)

| Container | Port | Mount | Purpose |
|---|---|---|---|
| `boochat` (was `boocode`) | `100.114.205.53:9500` | `/opt:/opt:ro` | Read-only chat + MCP client |
| `booterm` | `100.114.205.53:9501` | `/opt:/opt:rw` | PTY/tmux terminal |
| `boocoder` | `100.114.205.53:9502` | `/opt:/opt:rw` (policy-gated) | Write tools + ACP host + MCP client + MCP server |
| `boochat_db` (was `boocode_db`) | `127.0.0.1:5500` | `boocode_pgdata` | Shared Postgres 16 |
| `codecontext` | internal `:8080` | `/opt:/opt:ro` | Analysis sidecar (shared) |

### Caddy routing

```
code.indifferentketchup.com    → boochat:9500
coder.indifferentketchup.com   → boocoder:9502
term.indifferentketchup.com    → booterm:9501 (or routed under code.*/term/)
```

## Schema (new tables)

```sql
-- Tasks: the dispatch DAG (created first because pending_changes references it)
CREATE TABLE IF NOT EXISTS tasks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id UUID NOT NULL REFERENCES projects(id),
  parent_task_id UUID REFERENCES tasks(id),
  state TEXT NOT NULL DEFAULT 'pending'
    CHECK (state IN ('pending', 'running', 'completed', 'failed', 'blocked', 'cancelled')),
  input TEXT NOT NULL,
  output_summary TEXT,
  agent TEXT,
  model TEXT,
  execution_path TEXT CHECK (execution_path IN ('native', 'acp', 'pty')),
  worktree_path TEXT,
  cost_tokens INTEGER,
  started_at TIMESTAMPTZ,
  ended_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

-- Pending changes: queued writes before /apply
CREATE TABLE IF NOT EXISTS pending_changes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES sessions(id),
  task_id UUID REFERENCES tasks(id),
  file_path TEXT NOT NULL,
  operation TEXT NOT NULL CHECK (operation IN ('create', 'edit', 'delete')),
  diff TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'applied', 'rejected', 'reverted')),
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

-- Available agents: probed at startup
CREATE TABLE IF NOT EXISTS available_agents (
  name TEXT PRIMARY KEY,
  install_path TEXT,
  version TEXT,
  supports_acp BOOLEAN NOT NULL DEFAULT false,
  supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
  last_probed_at TIMESTAMPTZ
);

-- Human inbox: tasks needing attention (OR REPLACE keeps the migration re-runnable)
CREATE OR REPLACE VIEW human_inbox AS
  SELECT * FROM tasks WHERE state IN ('blocked', 'failed');
```

`task_templates` and `pipelines` are deferred to v2.1: they add overhead that a single-user deployment does not need. The core is `tasks` + `pending_changes` + `available_agents`.

## Path A — Native write tools

### Tools

| Tool | Description |
|---|---|
| `edit_file` | Apply a diff to an existing file. Input: `{file_path, old_string, new_string}`. Queues in `pending_changes` with `operation='edit'`. |
| `create_file` | Create a new file. Input: `{file_path, content}`. Queues as `operation='create'`. |
| `delete_file` | Delete a file. Input: `{file_path}`. Queues as `operation='delete'`. |
| `apply_pending` | Flush all pending changes for the current session to disk. Path-guarded. |
| `rewind` | Revert a specific applied change or all changes since a checkpoint. |
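
A minimal sketch of how the three queueing tools could map onto `pending_changes` rows. The field names and operation values come from the table and schema above; the type names and the `toPendingChange` helper are hypothetical, and the `diff` column is left out because it is computed separately.

```typescript
// Hypothetical input shapes; the field names follow the tool table.
type EditFileInput = { file_path: string; old_string: string; new_string: string };
type CreateFileInput = { file_path: string; content: string };
type DeleteFileInput = { file_path: string };

type WriteToolCall =
  | { tool: "edit_file"; input: EditFileInput }
  | { tool: "create_file"; input: CreateFileInput }
  | { tool: "delete_file"; input: DeleteFileInput };

// Mirrors the pending_changes.operation CHECK constraint.
type Operation = "create" | "edit" | "delete";

interface PendingChangeDraft {
  session_id: string;
  file_path: string;
  operation: Operation;
  status: "pending";
}

// Assumed helper: turns a tool call into a row draft. Nothing is written to disk here.
function toPendingChange(sessionId: string, call: WriteToolCall): PendingChangeDraft {
  const operation: Operation =
    call.tool === "edit_file" ? "edit" : call.tool === "create_file" ? "create" : "delete";
  return { session_id: sessionId, file_path: call.input.file_path, operation, status: "pending" };
}
```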

### Path guard for writes

Same `pathGuard()` function from BooChat, but with a write-path variant:
- `resolveWritePath(projectRoot, requested)` — uses `resolve()` (not `realpath()`, since the file may not exist yet for creates), then verifies the result starts with `projectRoot + sep`.
- Deny list: everything in `secret_guard.ts` (`.env`, `*.pem`, etc.) — can't write to those either.
- Defense-in-depth: the `pending_changes` queue means even a path-guard bypass only queues; it doesn't hit disk until `/apply` (which re-validates).
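
The write-path check described above could look roughly like the sketch below. It assumes POSIX paths, and the `DENY_PATTERNS` list is a hypothetical stand-in for whatever `secret_guard.ts` actually exports.

```typescript
import { resolve, sep } from "node:path";

// Hypothetical deny patterns; the real list lives in secret_guard.ts.
const DENY_PATTERNS: RegExp[] = [/(^|\/)\.env(\.|$)/, /\.pem$/];

// Resolves `requested` against `projectRoot` without touching the filesystem,
// then rejects anything that escapes the root or matches a deny pattern.
function resolveWritePath(projectRoot: string, requested: string): string {
  const root = resolve(projectRoot);
  const target = resolve(root, requested);
  // Comparing against root + sep stops sibling prefixes such as /opt/proj-evil.
  if (!target.startsWith(root + sep)) {
    throw new Error(`path escapes project root: ${requested}`);
  }
  const relative = target.slice(root.length + 1).split(sep).join("/");
  if (DENY_PATTERNS.some((pattern) => pattern.test(relative))) {
    throw new Error(`path is on the deny list: ${requested}`);
  }
  return target;
}
```

Because `resolve()` is purely lexical, it does not follow symlinks: a symlinked directory inside the project could still point outside it. The re-validation at `/apply` time is a natural place to run a `realpath()` check on the nearest existing parent directory.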

### Diff format

Standard unified diff (what `git diff` produces). The `edit_file` tool takes `old_string` / `new_string` (same as Claude Code's edit tool — the model is trained on this shape). Server computes the unified diff for storage in `pending_changes.diff`.
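
A minimal sketch of the server-side step that turns `old_string` / `new_string` into new file content before the diff is computed. The exactly-one-match rule is an assumption made here to keep edits unambiguous; the proposal does not specify it. Producing the unified diff from the before and after content is left to a diff library and is not shown.

```typescript
// Applies an old_string/new_string edit to file content.
// Assumption: the edit is rejected unless old_string occurs exactly once.
function applyStringEdit(content: string, oldString: string, newString: string): string {
  if (oldString.length === 0) throw new Error("old_string must not be empty");
  const first = content.indexOf(oldString);
  if (first === -1) throw new Error("old_string not found");
  if (content.indexOf(oldString, first + 1) !== -1) {
    throw new Error("old_string is ambiguous: more than one match");
  }
  // Slicing avoids String.prototype.replace, which would interpret "$&" and
  // similar sequences inside newString as replacement patterns.
  return content.slice(0, first) + newString + content.slice(first + oldString.length);
}
```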

### UI: per-pane diff viewer

Frontend pane type `pending_changes` in BooCoder's workspace. Shows:
- List of queued changes with file path + operation
- Per-change diff view (syntax-highlighted, side-by-side or unified)
- Approve / Reject per change, or Approve All / Reject All

## Path B — External agent dispatch

### dispatch_external_agent tool

```typescript
{
  agent: 'opencode' | 'claude' | 'goose' | 'pi',
  model: string,        // e.g. 'claude-opus-4-7'
  task: string,         // natural-language task description
  worktree?: string,    // optional — auto-creates if not specified
}
```

### Transport selection

Dispatcher checks `available_agents.supports_acp` at runtime:
- **ACP** (preferred): `opencode acp`, `goose acp` — JSON-RPC stdio. Native session lifecycle, file-operation events, terminal events, permission prompts.
- **PTY** (fallback): `claude`, `pi`, `smallcode` — raw terminal capture via `node-pty`. Captures stdout/stderr/exit-code into PostgreSQL. Less structured than ACP.
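
The runtime check could be as small as the sketch below. `AvailableAgent` mirrors two columns of `available_agents`; the function name and the decision to throw for unprobed agents are assumptions.

```typescript
type Transport = "acp" | "pty";

// Subset of the available_agents row used for dispatch.
interface AvailableAgent {
  name: string;
  supports_acp: boolean;
}

// Picks ACP when the probed agent supports it, otherwise falls back to PTY.
// Throwing for agents that were never probed is a hypothetical policy.
function selectTransport(agents: AvailableAgent[], name: string): Transport {
  const agent = agents.find((candidate) => candidate.name === name);
  if (!agent) throw new Error(`agent not available: ${name}`);
  return agent.supports_acp ? "acp" : "pty";
}
```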

### Worktree management

Each dispatched task gets its own git worktree:
```bash
git worktree add /tmp/booworktrees/<task-id> -b task-<task-id> HEAD
```

On completion: diff the worktree against the commit it was branched from (the `HEAD` at dispatch time), queue the diff into `pending_changes` for the same task, then clean up the worktree. The user approves or rejects the diff the same way as in Path A.
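
A sketch of the git invocations for one task's worktree lifecycle, expressed as argument arrays that a caller would pass to `execFile("git", args)`. It assumes the base commit SHA is resolved up front (for example with `git rev-parse HEAD`), because `HEAD` inside the worktree moves if the agent commits. `planWorktree` and `WorktreePlan` are hypothetical names.

```typescript
interface WorktreePlan {
  path: string;
  branch: string;
  add: string[];    // run in the main repository
  stage: string[];  // run with cwd = path
  diff: string[];   // run with cwd = path
  remove: string[]; // run in the main repository
}

function planWorktree(taskId: string, baseCommit: string, root = "/tmp/booworktrees"): WorktreePlan {
  // Task ids are UUIDs in the schema; reject anything else before it reaches a path.
  if (!/^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$/i.test(taskId)) {
    throw new Error(`invalid task id: ${taskId}`);
  }
  if (!/^[0-9a-f]{7,40}$/i.test(baseCommit)) {
    throw new Error(`invalid base commit: ${baseCommit}`);
  }
  const path = `${root}/${taskId}`;
  const branch = `task-${taskId}`;
  return {
    path,
    branch,
    add: ["worktree", "add", path, "-b", branch, baseCommit],
    // Staging first lets the diff include files the agent created but never committed.
    stage: ["add", "-A"],
    diff: ["diff", "--cached", baseCommit],
    remove: ["worktree", "remove", "--force", path],
  };
}
```

Diffing the index against the base commit captures both commits made by the agent and any uncommitted or untracked files it left behind.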

### ACP event mapping

ACP events → BooCode parts taxonomy:
- `file_operation` → `tool_call` part (name: `acp_edit_file`) + `tool_result` part
- `tool_call` → `tool_call` part (preserves name)
- `terminal_output` → routes into BooTerm pane
- `permission_request` → pause inference (same mechanism as `ask_user_input`)
- `session_end` → task state → `completed` or `failed`
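
The mapping above can be sketched as a single pure function. The event names follow the list; the payload fields and the `Effect` type are hypothetical, and the real ACP SDK may name or shape its events differently.

```typescript
// Event names follow the list above; payload fields are assumptions.
type AcpEvent =
  | { type: "file_operation"; path: string; result: string }
  | { type: "tool_call"; name: string; args: unknown }
  | { type: "terminal_output"; data: string }
  | { type: "permission_request"; prompt: string }
  | { type: "session_end"; ok: boolean };

type Effect =
  | { kind: "part"; part: "tool_call" | "tool_result"; name: string; payload: unknown }
  | { kind: "terminal"; data: string }
  | { kind: "pause"; prompt: string }
  | { kind: "task_state"; state: "completed" | "failed" };

function mapAcpEvent(event: AcpEvent): Effect[] {
  switch (event.type) {
    case "file_operation":
      // One ACP file operation becomes a tool_call part plus its tool_result part.
      return [
        { kind: "part", part: "tool_call", name: "acp_edit_file", payload: { path: event.path } },
        { kind: "part", part: "tool_result", name: "acp_edit_file", payload: event.result },
      ];
    case "tool_call":
      return [{ kind: "part", part: "tool_call", name: event.name, payload: event.args }];
    case "terminal_output":
      return [{ kind: "terminal", data: event.data }];
    case "permission_request":
      return [{ kind: "pause", prompt: event.prompt }];
    case "session_end":
      return [{ kind: "task_state", state: event.ok ? "completed" : "failed" }];
  }
}
```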

### MCP server auto-forward

Per goose docs, `context_servers` field in the ACP session config auto-forwards BooCoder's configured MCP servers to the dispatched agent. One MCP config drives every agent.

## Dispatcher worker

Background process (or in-process `setInterval` for v2.0 simplicity) that:
1. Queries `tasks` WHERE `state = 'pending'` ORDER BY `created_at`
2. For each ready task (no unmet dependencies):
   - Mark `state = 'running'`
   - Resolve execution path (Path A if no agent specified, Path B if agent specified)
   - Path A: run the inference loop with write tools enabled
   - Path B: spawn ACP/PTY subprocess, stream events into parts
   - On completion: mark `state = 'completed'` or `'failed'`
   - Queue output diff into `pending_changes`
3. On failure: mark `state = 'failed'`, surface in `human_inbox` view
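
The selection half of one dispatcher tick can be sketched as a pure function over in-memory rows. Since the schema does not define what an unmet dependency is, readiness is left to a caller-supplied `isReady` predicate; `selectReady` and the `"A"` / `"B"` labels are hypothetical.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed" | "blocked" | "cancelled";

interface Task {
  id: string;
  state: TaskState;
  agent: string | null;
  created_at: number; // epoch milliseconds, for the sketch only
}

interface Dispatch {
  task: Task;
  path: "A" | "B";
}

// Returns pending, ready tasks in creation order, each tagged with its execution path.
function selectReady(tasks: Task[], isReady: (task: Task) => boolean): Dispatch[] {
  return tasks
    .filter((task) => task.state === "pending" && isReady(task))
    .sort((a, b) => a.created_at - b.created_at)
    .map((task): Dispatch => ({
      task,
      // Path A when no agent is specified, Path B otherwise.
      path: task.agent === null ? "A" : "B",
    }));
}
```

A `setInterval` callback would then mark each selected task `running`, run it, and settle it as `completed` or `failed`.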

## BooCoder MCP server

Exposes BooCoder's primitives as MCP tools so external agents (Sam's opencode in Termius) can drive the task queue:

| MCP Tool | Description |
|---|---|
| `boocoder.create_task` | Create a new task in the queue |
| `boocoder.list_pending_changes` | List queued changes awaiting approval |
| `boocoder.apply` | Apply a specific pending change |
| `boocoder.reject` | Reject a pending change |
| `boocoder.dispatch_external_agent` | Dispatch a task to an external agent |
| `boocoder.list_worktrees` | List active git worktrees |

Stdio transport for local consumers. HTTP transport deferred until OAuth + secret storage.

**Eval requirement:** run the server through the `anthropics/skills/mcp-builder` 10-question evaluation framework before shipping.

## Code lifts

### Primary architectural template

**`Dominic789654/agent-hub`** (Apache-2.0) — task DAG schema, dispatcher worker, project registry, human inbox. Three-process model (board server + dispatcher + assistant terminal). BooCode adapts this into a single-process Fastify app (v2.0.0) with the dispatcher as an in-process worker.

### Pending-changes UX

**`plandex-ai/plandex`** (MIT) — diff/apply/rewind vocabulary. The `pending_changes` queue concept, per-file diff view, approve/reject UI pattern. No code lifted — schema and UX design only.

### ACP client

**`agentclientprotocol.com` spec + `@zed-industries/agent-client-protocol` SDK** (Apache-2.0) — local-subprocess ACP via stdio JSON-RPC. The SDK handles framing; BooCode maps events to its parts taxonomy.

**`goose` docs** (`goose-docs.ai/docs/guides/acp-clients/`) — `context_servers` auto-forward pattern. Critical: one MCP config drives every dispatched agent.

### MCP server

**`anthropics/skills/mcp-builder`** (MIT) — 4-phase build workflow + 10-question evaluation framework for validating the MCP server before shipping.

### Dispatcher pattern

**Paseo (`getpaseo/paseo`)** — AGPL-3.0, **design only, no code lift**. Daemon+clients architecture, `--worktree` flag, CLI verb shape (`run/ls/attach/send`). BooCode reproduces the architecture using only license-clean patterns.

**Roo Code Boomerang Tasks** — orchestrator with intentional capability restriction. Down-pass/up-pass context discipline (`new_task` message, `attempt_completion` result, no implicit inheritance). Explicit precedence override clause.

### Write-tool security

**opencode `permission/evaluate.ts`** — wildcard permission ruleset (already lifted in v1.15). Extended in v2.0 to gate write tools.

**`covibes/zeroshot`** — blind-validation invariant. Verify gate runs in a separate agent context that only sees the diff and acceptance criteria, not the producing conversation. v2.0+ optional batch.

## Sub-versions

| Version | Scope |
|---|---|
| **v2.0.0** | Schema + Path A (native write tools + pending-changes queue + diff UI) + basic dispatcher |
| **v2.0.1** | Path B (ACP client for opencode/goose + PTY fallback for claude/pi + worktree management) |
| **v2.0.2** | BooCoder MCP server (stdio transport, `boocoder.*` tools, eval framework) |
| **v2.0.3** | Polish: `boocode` CLI (`run/ls/attach/send`), human_inbox UI, cost tracking |

## Dependencies

- v1.13 ✅ (parts table — the event taxonomy for everything)
- v1.14 ✅ (outer loop + step boundaries for future revert snapshots)
- v1.14.x-mcp ✅ (MCP client PoC — proves the protocol)
- v1.15 ✅ (full MCP client + tool globs — write-capable MCP servers route through pending_changes)
- v1.16 ✅ (codesight merge — codecontext now has blast-radius for impact analysis)

All dependencies shipped. v2.0 is unblocked.

## Estimate

- v2.0.0: ~800 LoC (schema + write tools + pending-changes service + diff pane + dispatcher skeleton)
- v2.0.1: ~600 LoC (ACP client + PTY dispatch + worktree management + event mapping)
- v2.0.2: ~400 LoC (MCP server + 6 tool handlers + stdio transport + eval)
- v2.0.3: ~400 LoC (CLI client + inbox UI + cost aggregation)
- **Total: ~2200 LoC** across 4 sub-versions

## Hard rules

- BooChat stays read-only. BooCoder is the only surface with write tools.
- Path-guard correctness is the #1 test target. Fuzz against every traversal pattern.
- Pending-changes queue gates ALL writes (native + MCP). Nothing touches disk without user approval (or explicit auto-apply flag per task).
- One shared database. Cross-surface joins are valuable (task → chat → terminal debugging session).
- External CLI agents on the host, not in containers. BooCoder shells out via local-exec.
- No OAuth in v2.0. MCP server is stdio-only until secret storage lands.
- DB rename `boocode_db` → `boochat_db` lands with v2.0.0 (one-time migration).

## AGENTS.md extensions (v2.0.0)

Port from `qodo-ai/agents` (MIT) `agent.toml` schema and `ai-christianson/RA.Aid` (Apache-2.0) three-stage pattern:

| Field | Type | Purpose | Source |
|---|---|---|---|
| `steps` | number | Per-agent step cap (already shipped v1.14.0) | opencode |
| `output_schema` | JSON Schema | Structured output constraint for the agent's final response | qodo-ai/agents |
| `exit_expression` | string | Regex/predicate — when the agent considers itself done | qodo-ai/agents |
| `execution_strategy` | `plan` \| `act` \| `research` | Which phase of the RA.Aid three-stage pattern this agent operates in | qodo-ai/agents + RA.Aid |
| `model` | string | Per-agent model override (already shipped v1.8) | — |
| `expert_model` | string | Escalation model for hard reasoning (RA.Aid "expert tool" escape hatch) | RA.Aid |

The three-stage pattern maps to BooCoder's use case:
- **Research agent** (cheap model) → understand the task, find relevant files
- **Planning agent** (standard model) → decide which files to edit, what the changes look like
- **Implementation agent** (full model) → produce the actual diffs

`expert_model` is the escape hatch: a routine model handles most subtasks, but can call the expert model (e.g. qwopus27b) when stuck. Matches Sam's existing cost-routing discipline.

## Subagent isolation (Boomerang pattern, v2.0.1)

From Roo Code Boomerang Tasks (Apache-2.0 pattern):

When an orchestrator agent calls a `new_task` tool, BooCoder:
1. Creates a fresh `tasks` row with `parent_task_id` pointing to the orchestrator's task
2. Spawns a fresh inference session (Path A) or dispatch (Path B) with ONLY the task spec as context — no inherited conversation
3. Child runs to `attempt_completion`, writes a summary to `tasks.output_summary`
4. Parent resumes reading ONLY the summary (not the child's full conversation)

**Three principles:**
- Orchestrator capability restriction: the orchestrator agent's tool list includes ONLY `new_task`, `list_tasks`, `check_task_status` — it cannot read files or call MCP tools directly
- Down-pass: parent sends task spec via `new_task(input)`, nothing else inherited
- Up-pass: child sends result via `attempt_completion(summary)`, nothing else surfaces to parent
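
The down-pass and up-pass rules can be illustrated with a small in-memory store. The tool names come from the list above; `TaskStore`, its method names, and the generated ids are hypothetical, and a real implementation would sit on the `tasks` table.

```typescript
interface TaskRow {
  id: string;
  parent_task_id: string | null;
  input: string;
  output_summary: string | null;
  state: "running" | "completed";
}

// The orchestrator's entire tool list, per the capability restriction.
const ORCHESTRATOR_TOOLS: readonly string[] = ["new_task", "list_tasks", "check_task_status"];

function orchestratorMayCall(tool: string): boolean {
  return ORCHESTRATOR_TOOLS.indexOf(tool) !== -1;
}

class TaskStore {
  private rows = new Map<string, TaskRow>();
  private counter = 0;

  // Down-pass: only the task spec crosses into the child.
  newTask(parentId: string | null, input: string): TaskRow {
    this.counter += 1;
    const row: TaskRow = {
      id: `task-${this.counter}`,
      parent_task_id: parentId,
      input,
      output_summary: null,
      state: "running",
    };
    this.rows.set(row.id, row);
    return row;
  }

  // The child's starting context is the spec and nothing else.
  childContext(taskId: string): string[] {
    return [this.get(taskId).input];
  }

  // Up-pass: only the summary is recorded for the parent.
  attemptCompletion(taskId: string, summary: string): void {
    const row = this.get(taskId);
    row.output_summary = summary;
    row.state = "completed";
  }

  // What the parent sees when it resumes: summaries of finished children.
  parentView(parentId: string): string[] {
    const summaries: string[] = [];
    this.rows.forEach((row) => {
      if (row.parent_task_id === parentId && row.output_summary !== null) {
        summaries.push(row.output_summary);
      }
    });
    return summaries;
  }

  private get(taskId: string): TaskRow {
    const row = this.rows.get(taskId);
    if (!row) throw new Error(`unknown task: ${taskId}`);
    return row;
  }
}
```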

This is the **single most important context-management primitive** — it prevents long-running orchestrators from poisoning their context with implementation detail.

## Observation hooks (v2.0.3)

From `siropkin/budi` (MIT) Claude Code 5-hook taxonomy:

Register BooCoder as a hook receiver for dispatched agents. Five events:
- `SessionStart` — agent spawned
- `UserPromptSubmit` — task spec delivered
- `PostToolUse` — each tool call completed
- `SubagentStart` — nested dispatch
- `Stop` — agent finished

These map directly to BooCode's existing WS frame protocol. The hook receiver is the BooCoder Fastify server; events flow into the `message_parts` taxonomy as `step_start`-style instrumentation parts.

## Follow-up batches (v2.0+ optional, ordered by value)

| Batch | Source | What | When |
|---|---|---|---|
| **PR-resolver tool** | `qodo-ai/qodo-skills` (MIT) | Fetch GitHub issues → batch/interactive fix → inline PR reply. BooCoder tool that replaces Sam's manual PR workflow. | v2.0.3+ |
| **HMAC audit log** | `sipyourdrink-ltd/bernstein` (verify license) | One new `audit_log` table with `prev_hmac` field. Tamper-evident history of every edit BooCoder makes. Small lift (~50 LoC). | v2.0.1+ |
| **Blind-validation gate** | `covibes/zeroshot` (MIT) | Verify gate runs in a separate agent context that sees ONLY the diff + acceptance criteria, not the producing conversation. Complements Boomerang (isolation) + bernstein (lineage). | v2.0.2+ |
| **Majority-vote ensembler** | `augmentcode/augment-swebench-agent` (MIT) | K candidate diffs from K agents → ranker model picks the best one. Optional layer above `pending_changes`. | v2.1+ |
| **Drift detection** | `memovai/memov` (MIT) | `validate_commit` concept — detects when actual changes diverge from what was requested. Shadow timeline comparison. | v2.0.3+ |
| **Anti-slop for frontend** | `Leonxlnx/taste-skill` (MIT) | 100+ specific font/color/layout ban list + 3-dial parameterization. Vendor into skills/ when BooCoder generates frontend code. | v2.0+ |
| **Verify-before-commit gate** | `DeepSourceCorp/globstar` (MIT) | Rule-based AST linter as a pre-apply quality gate. YAML checkers in `.globstar/`. | v2.1+ (parked) |
| **Docker sandbox** | `OpenHands/OpenHands` (MIT) | Per-session Docker container for write tools. Closes the `/opt:rw` mount risk if path-guard ever proves insufficient. | v2.1 (optional) |
| **Multi-provider LLM** | `earendil-works/pi` (MIT) | Provider abstraction if a need for Anthropic/OpenAI/Mistral direct surfaces beyond llama-swap. | v2.x (optional) |

## Repos to clone before starting

```bash
cd /opt/forks
git clone https://github.com/Dominic789654/agent-hub.git    # Apache-2.0, task DAG + dispatcher
git clone https://github.com/plandex-ai/plandex.git          # MIT, pending-changes UX
git clone https://github.com/anomalyco/opencode.git          # MIT, permission evaluate.ts reference
git clone https://github.com/qodo-ai/agents.git              # MIT, agent.toml schema (output_schema, exit_expression, execution_strategy)
```

Also read (no clone needed):
- `ai-christianson/RA.Aid` README — three-stage pattern + expert-tool escape hatch
- `getpaseo/paseo` README + `skills/` directory — daemon architecture + CLI verbs (AGPL, design-only)
- `agentclientprotocol.com` spec — ACP stdio protocol
- `goose-docs.ai/docs/guides/acp-clients/` — `context_servers` auto-forward pattern
- `siropkin/budi` README — 5-hook Claude Code taxonomy for observation

ACP SDK and MCP SDK are npm packages installed at implementation time.