v2.0 implementation plan: 8 phases from foundation to production

Detailed execution plan for all v2.0 sub-versions:

Phase 1 (v2.0.0-alpha): container skeleton, DB rename, schema migration
Phase 2 (v2.0.0-beta): write tools + pending-changes service + fuzz tests
Phase 3 (v2.0.0): frontend diff pane + chat pane + Caddy routing
Phase 4 (v2.0.0-final): dispatcher worker + task queue + agent probing
Phase 5 (v2.0.1): ACP client + PTY fallback + worktree management
Phase 6 (v2.0.2): MCP server (6 tools, stdio, 10-question eval)
Phase 7 (v2.0.3): CLI + human inbox + cost tracking + observation hooks + Boomerang
Phase 8 (v2.0.x): path-guard fuzz, integration tests, docs, production deploy

~2050 LoC total. Phases 1-4 sequential, 5-7 parallelizable after 4.

Risk register covers path-guard bypass, ACP instability, worktree cleanup, DB rename, MCP eval, Boomerang context leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:09:05 +00:00
parent 531d39ace9
commit 62d818af23
1 changed files with 413 additions and 0 deletions
--- a/openspec/changes/v2.0-boocoder/implementation-plan.md
+++ b/openspec/changes/v2.0-boocoder/implementation-plan.md
@@ -0,0 +1,413 @@
+# v2.0 BooCoder — Implementation Plan
+
+Ordered execution plan across all 4 sub-versions. Each phase is dispatchable as a single batch. Phases 1-4 are sequential (each builds on the prior); Phases 5-7 can run in parallel once Phase 4 ships, and Phase 8 gates the production tag.
+
+---
+
+## Phase 1 — Foundation (v2.0.0-alpha)
+
+**Goal:** Standalone BooCoder container boots, connects to DB, serves a health endpoint. No inference yet.
+
+**Estimated:** ~200 LoC
+
+### Steps
+
+1. **Clone lift sources** (prep, no code)
+   - Clone `agent-hub`, `plandex`, `opencode`, and `qodo-ai/agents` into `/opt/forks` (one `git clone` per repo)
+   - Read agent-hub schema, plandex pending-changes, opencode permission/evaluate.ts
+   - Read RA.Aid README for three-stage pattern
+
+2. **Create `apps/coder/` skeleton**
+   - `apps/coder/package.json` (Fastify, postgres, zod — same deps as `apps/server`)
+   - `apps/coder/tsconfig.json` (extends base, NodeNext)
+   - `apps/coder/src/index.ts` (Fastify boot, health endpoint, DB connect)
+   - `apps/coder/src/config.ts` (Zod config schema — DATABASE_URL, PORT, HOST, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE)
+   - `apps/coder/src/db.ts` (postgres connection, schema apply — shared with `apps/server` or fresh)
+
+3. **Create Dockerfile**
+   - `apps/coder/Dockerfile` — Node 20 bookworm-slim (matches booterm for glibc compat with node-pty later)
+   - Mount: `/opt:/opt:rw`
+   - COPY built server + BOOCODER.md
+
+4. **docker-compose.yml** — add `boocoder` service
+   - Port `100.114.205.53:9502:3000`
+   - Environment: `DATABASE_URL`, `LLAMA_SWAP_URL`, `CONTAINER_GUIDANCE_FILE=/app/BOOCODER.md`
+   - Network: `boocode_net`
+   - Depends on: `boocode_db`
+
+5. **DB rename** — `boocode_db` → `boochat_db`
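+   - `ALTER DATABASE boocode RENAME TO boochat;` (one-time, run manually while connected to a different database such as `postgres`; PostgreSQL refuses the rename while other sessions are connected to `boocode`, so stop all services first)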
+   - Update `DATABASE_URL` in all docker-compose services
+   - Update volume name mapping
+   - Verify all 3 services boot against renamed DB
+
+6. **Schema migration** — new tables in `apps/coder/src/schema.sql`
+   - `pending_changes` table
+   - `tasks` table
+   - `available_agents` table
+   - `human_inbox` view
+   - Applied idempotently on boot (same pattern as BooChat's `applySchema()`)
+
+7. **BOOCODER.md** — container guidance file
+   - Write tools enabled (unlike BOOCHAT.md which declares read-only)
+   - Pending-changes queue discipline
+   - Path-guard rules
+
+### Verification
+- `docker compose up --build -d` — boocoder container starts
+- `curl http://100.114.205.53:9502/api/health` — 200 OK
+- `psql` confirms new tables exist
+- BooChat + BooTerm unaffected (still boot, still serve)
+
+---
+
+## Phase 2 — Write Tools + Pending Changes (v2.0.0-beta)
+
+**Goal:** BooCoder can chat with the LLM, the LLM can call write tools, changes queue in `pending_changes`, user can apply/reject.
+
+**Estimated:** ~400 LoC
+
+### Steps
+
+1. **Write-path guard** (`apps/coder/src/services/write_guard.ts`)
+   - `resolveWritePath(projectRoot, filePath): string` — `resolve()` + prefix check (no realpath — file may not exist for creates)
+   - Deny list: inherit from BooChat's `secret_guard.ts` (`.env`, `*.pem`, `id_rsa*`, etc.)
+   - Fuzz tests: `../` escape, symlink outside root, null bytes, non-existent parent dirs
+
+2. **Pending-changes service** (`apps/coder/src/services/pending_changes.ts`)
+   - `queueEdit(session_id, task_id, file_path, old_string, new_string): PendingChange` — computes unified diff, validates write path, INSERTs
+   - `queueCreate(session_id, task_id, file_path, content): PendingChange`
+   - `queueDelete(session_id, task_id, file_path): PendingChange`
+   - `applyAll(session_id): ApplyResult[]` — re-validates each path, writes to disk, marks `status='applied'`
+   - `applyOne(change_id): ApplyResult`
+   - `rejectOne(change_id): void` — marks `status='rejected'`
+   - `rejectAll(session_id): void`
+   - `rewindOne(change_id): void` — inverse-diff, writes to disk, marks `status='reverted'`
+   - `rewindAll(session_id): void` (needed by the `rewind` tool's `{all: true}` mode)
+   - `listPending(session_id): PendingChange[]`
+
+3. **Write tools** (`apps/coder/src/services/tools/`)
+   - `edit_file.ts` — input: `{file_path, old_string, new_string}`, calls `queueEdit`
+   - `create_file.ts` — input: `{file_path, content}`, calls `queueCreate`
+   - `delete_file.ts` — input: `{file_path}`, calls `queueDelete`
+   - `apply_pending.ts` — calls `applyAll` for current session
+   - `rewind.ts` — input: `{change_id}` or `{all: true}`, calls `rewindOne`/`rewindAll`
+
+4. **Tool registry** — register write tools alongside ALL read tools from BooChat
+   - Import BooChat's read tools (view_file, grep, etc.) + codecontext tools
+   - Add the 5 write tools
+   - Alpha-sort the combined list
+
+5. **Inference loop** — port from BooChat or share via workspace package
+   - Copy `apps/server/src/services/inference/` into `apps/coder/src/services/inference/` (or symlink via pnpm workspace)
+   - The outer loop (v1.14) runs unchanged — write tools are just ToolDefs with `execute()` functions
+   - Compaction, doom-loop, step cap all carry forward
+
+6. **API routes**
+   - `POST /api/sessions/:id/messages` — same as BooChat (creates user + assistant rows, enqueues inference)
+   - `GET /api/sessions/:id/pending` — returns pending changes for the session
+   - `POST /api/sessions/:id/pending/apply` — applies all pending
+   - `POST /api/pending/:id/apply` — applies one
+   - `POST /api/pending/:id/reject` — rejects one
+   - `POST /api/pending/:id/rewind` — reverts one
+   - WebSocket streaming (same protocol as BooChat)
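Two pieces of step 2 can be sketched in isolation: how `queueEdit` might turn `old_string`/`new_string` into new file content, and which `status` transitions the service permits. This is a minimal sketch under stated assumptions. The helper names `applyStringEdit` and `canTransition`, the exactly-one-match rule, the initial `pending` status, and the transition table are all illustrative and not taken from the plan.

```typescript
// Hypothetical helper: the plan does not say how queueEdit locates old_string.
// This sketch assumes an edit is only valid when old_string occurs exactly once.
type EditResult =
  | { ok: true; content: string }
  | { ok: false; reason: "not_found" | "ambiguous" };

function applyStringEdit(
  content: string,
  oldString: string,
  newString: string,
): EditResult {
  if (oldString === "") return { ok: false, reason: "not_found" };
  const first = content.indexOf(oldString);
  if (first === -1) return { ok: false, reason: "not_found" };
  // A second hit (including an overlapping one) makes the edit ambiguous.
  if (content.indexOf(oldString, first + 1) !== -1) {
    return { ok: false, reason: "ambiguous" };
  }
  const updated =
    content.slice(0, first) + newString + content.slice(first + oldString.length);
  return { ok: true, content: updated };
}

// Status values named in the plan, plus an assumed initial "pending".
type ChangeStatus = "pending" | "applied" | "rejected" | "reverted";

// Assumed transition table: only pending changes can be applied or rejected,
// and only applied changes can be reverted.
const ALLOWED_TRANSITIONS: Record<ChangeStatus, ChangeStatus[]> = {
  pending: ["applied", "rejected"],
  applied: ["reverted"],
  rejected: [],
  reverted: [],
};

function canTransition(from: ChangeStatus, to: ChangeStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

Rejecting ambiguous matches at queue time keeps the stored diff deterministic; whether BooCoder should instead replace the first match is a design choice the plan leaves open.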
+
+### Verification
+- Send a chat asking BooCoder to edit a file
+- LLM calls `edit_file` → change queued in `pending_changes`
+- `GET /api/sessions/:id/pending` shows the queued change with diff
+- `POST /api/pending/:id/apply` writes to disk
+- `POST /api/pending/:id/rewind` reverts it
+- Fuzz test: attempt traversal via `edit_file("../../etc/passwd", ...)` → rejected by write_guard
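The traversal check in the last verification item depends on the write guard from step 1. A minimal sketch of a `resolve()` plus prefix check follows, assuming POSIX-style paths. The deny patterns and the `WritePathError` class are hypothetical stand-ins; the plan inherits the real deny list from BooChat's `secret_guard.ts`.

```typescript
import * as path from "node:path";

// Hypothetical deny patterns; the real list lives in BooChat's secret_guard.ts.
const DENY_BASENAMES: RegExp[] = [/^\.env(\..+)?$/, /\.pem$/, /^id_rsa/];

class WritePathError extends Error {}

// Lexical check only (resolve + prefix), with no realpath, because the target
// may not exist yet for creates. Symlinks inside the root are NOT detected here.
function resolveWritePath(projectRoot: string, filePath: string): string {
  if (filePath.includes("\0")) {
    throw new WritePathError("null byte in path");
  }
  const root = path.resolve(projectRoot);
  const target = path.resolve(root, filePath);
  // Compare against "root + separator" so a sibling such as /opt/proj-evil
  // does not pass a naive startsWith("/opt/proj") check.
  const prefix = root.endsWith(path.sep) ? root : root + path.sep;
  if (!target.startsWith(prefix)) {
    throw new WritePathError(`path escapes project root: ${filePath}`);
  }
  const base = path.basename(target);
  if (DENY_BASENAMES.some((re) => re.test(base))) {
    throw new WritePathError(`path is on the deny list: ${filePath}`);
  }
  return target;
}
```

Because the check is purely lexical, a symlink inside the project that points outside it would still pass. One possible complement, not stated in the plan, is to `realpath` the nearest existing ancestor directory and re-run the prefix check on the result.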
+
+---
+
+## Phase 3 — Frontend: Diff Pane + Chat (v2.0.0)
+
+**Goal:** Browser UI at `coder.indifferentketchup.com` with chat pane + diff pane side by side.
+
+**Estimated:** ~200 LoC
+
+### Steps
+
+1. **Create `apps/coder/web/`** — React + Vite SPA (same stack as BooChat's `apps/web/`)
+   - Copy BooChat's Vite config, Tailwind v4 setup, font pipeline
+   - Shared components: `MarkdownRenderer`, `CodeBlock`, `Button`, `Input`
+   - New app shell: sidebar (sessions) + workspace (panes)
+
+2. **Chat pane** — reuse BooChat's ChatPane/MessageBubble pattern
+   - Same WS streaming, same `useSessionStream` hook, same message rendering
+   - ActionRow includes tool-call rendering for write tools
+
+3. **Diff pane** — NEW (`apps/coder/web/src/components/DiffPane.tsx`)
+   - Fetches `GET /api/sessions/:id/pending`
+   - Lists pending changes: file path + operation badge (create/edit/delete)
+   - Per-change: syntax-highlighted unified diff view (use Shiki or a diff-specific highlighter)
+   - Buttons: Approve / Reject per change, Approve All / Reject All
+   - Real-time updates via WS frame (`pending_change_added`, `pending_change_applied`, etc.)
+
+4. **Workspace splitter** — chat left, diff right (or configurable)
+
+5. **Caddy route** — `coder.indifferentketchup.com` → boocoder:9502
+   - Authelia gating (same as BooChat)
+
+### Verification
+- Open `coder.indifferentketchup.com` in browser
+- Send a message asking for a code change
+- See the change appear in the diff pane in real time
+- Click Approve → file written, change marked applied
+- Click Reject → change discarded
+
+---
+
+## Phase 4 — Dispatcher + Tasks (v2.0.0-final)
+
+**Goal:** Task queue works. User can create tasks, dispatcher picks them up and runs them through Path A.
+
+**Estimated:** ~150 LoC
+
+### Steps
+
+1. **Dispatcher** (`apps/coder/src/services/dispatcher.ts`)
+   - In-process `setInterval` (every 5000 ms) polling `tasks` WHERE `state='pending'` ORDER BY `created_at`
+   - For each ready task: mark `state='running'`, run inference with the task's `input` as the user message
+   - On completion: mark `state='completed'`
+   - On error: mark `state='failed'`
+   - On abort: mark `state='cancelled'`
+   - Registers an `onClose` hook via `app.addHook('onClose', ...)` that stops polling and waits for the in-flight task
+
+2. **Task API routes**
+   - `POST /api/tasks` — create a task `{project_id, input, agent?, model?}`
+   - `GET /api/tasks` — list tasks (filterable by state, project)
+   - `GET /api/tasks/:id` — get task details + output_summary
+   - `POST /api/tasks/:id/cancel` — cancel a running task
+
+3. **Task → session linkage**
+   - Each task creates its own session + chat for isolation
+   - Task's pending_changes reference the task_id
+   - When task completes, its pending_changes are visible in the UI for approval
+
+4. **Agent probing** (`apps/coder/src/services/agent-probe.ts`)
+   - On startup: `which opencode`, `which goose`, `which claude`, `which pi`
+   - Parse version from `<agent> --version`
+   - Check ACP support: `opencode acp --help` exits 0 → supports_acp = true
+   - Populate `available_agents` table
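The probing in step 4 could look roughly like the sketch below. The `Runner` indirection is an assumption added so the logic can be exercised without the real binaries; `AgentInfo` and its field names are hypothetical, and the sketch generalises the plan's `opencode acp --help` check to `<agent> acp --help` for every agent, which may not be appropriate for agents that interpret those arguments differently.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical row shape for the available_agents table.
interface AgentInfo {
  name: string;
  path: string;
  version: string | null;
  supportsAcp: boolean;
}

// Runs a command and returns stdout; throws when the exit code is non-zero.
type Runner = (cmd: string, args: string[]) => string;

const shellRunner: Runner = (cmd, args) =>
  String(
    execFileSync(cmd, args, {
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
      timeout: 5000, // avoid hanging on an agent that waits for input
    }),
  );

function probeAgent(name: string, run: Runner = shellRunner): AgentInfo | null {
  let binPath: string;
  try {
    binPath = run("which", [name]).trim();
  } catch {
    return null; // not installed
  }
  let version: string | null = null;
  try {
    const match = run(name, ["--version"]).match(/\d+\.\d+\.\d+/);
    version = match ? match[0] : null;
  } catch {
    // leave version as null when --version fails
  }
  let supportsAcp = false;
  try {
    run(name, ["acp", "--help"]); // exit 0 is treated as "ACP available"
    supportsAcp = true;
  } catch {
    // non-zero exit: this agent would use the PTY fallback
  }
  return { name, path: binPath, version, supportsAcp };
}
```

At startup the dispatcher would call `probeAgent` for each of `opencode`, `goose`, `claude`, and `pi` and upsert the non-null results.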
+
+### Verification
+- `POST /api/tasks {input: "add a /api/version endpoint"}` → task created
+- Dispatcher picks it up → inference runs → `edit_file` queued → task completes
+- `GET /api/tasks/:id` shows `state='completed'` + output_summary
+- Pending changes visible in diff pane for approval
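The dispatcher flow verified above (pick the oldest pending task, run it, record the outcome) can be sketched against an in-memory task list. This is an illustration only: the real dispatcher would query the `tasks` table, and the `Task` shape, `Outcome` type, and one-task-at-a-time policy are assumptions.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed" | "cancelled";

// Hypothetical in-memory stand-in for a row in the tasks table.
interface Task {
  id: string;
  state: TaskState;
  createdAt: number;
  input: string;
}

type Outcome = "ok" | "error" | "abort";

// Oldest pending task first, mirroring ORDER BY created_at.
function pickNextPending(tasks: Task[]): Task | undefined {
  return tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => a.createdAt - b.createdAt)[0];
}

// Terminal state for each outcome, as listed in step 1.
function finalState(outcome: Outcome): TaskState {
  switch (outcome) {
    case "ok":
      return "completed";
    case "error":
      return "failed";
    case "abort":
      return "cancelled";
  }
}

function startDispatcher(
  tasks: Task[],
  run: (task: Task) => Promise<Outcome>,
  intervalMs = 5000,
): { stop: () => Promise<void> } {
  let inFlight: Promise<void> | null = null;
  const timer = setInterval(() => {
    if (inFlight) return; // this sketch runs one task at a time
    const task = pickNextPending(tasks);
    if (!task) return;
    task.state = "running";
    inFlight = run(task)
      .then((outcome) => {
        task.state = finalState(outcome);
      })
      .catch(() => {
        task.state = "failed";
      })
      .finally(() => {
        inFlight = null;
      });
  }, intervalMs);
  return {
    // Intended for an onClose hook: stop polling, then wait for the running task.
    stop: async () => {
      clearInterval(timer);
      if (inFlight) await inFlight;
    },
  };
}
```

With several BooCoder processes polling the same table, the in-memory pick would need to become an atomic claim in SQL; the plan describes a single in-process poller, so the sketch does not cover that.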
+
+---
+
+## Phase 5 — ACP Dispatch (v2.0.1)
+
+**Goal:** Tasks can be dispatched to external agents via ACP. opencode and goose run as subprocesses, their events flow back into BooCode.
+
+**Estimated:** ~350 LoC
+
+### Steps
+
+1. **ACP client** (`apps/coder/src/services/acp-client.ts`)
+   - Install: `pnpm -C apps/coder add @zed-industries/agent-client-protocol`
+   - `spawnAcpAgent(agent: string, task: string, worktree: string, mcpServers: McpConfig[]): AcpSession`
+   - Uses SDK's `StdioTransport` — spawn `opencode acp` or `goose acp` as child
+   - Pass `context_servers` for MCP auto-forward
+   - Event listener: maps ACP events to BooCode's parts taxonomy
+
+2. **ACP event mapping**
+   - `file_operation` → queue into `pending_changes` (same as Path A native writes)
+   - `tool_call` / `tool_result` → insert as `message_parts` in the task's session
+   - `terminal_output` → publish as WS frame for BooTerm routing
+   - `permission_request` → pause (same mechanism as `ask_user_input`)
+   - `session_end` → task state → `completed` or `failed`
+
+3. **Worktree management** (`apps/coder/src/services/worktrees.ts`)
+   - `createWorktree(projectPath, taskId): string` — `git worktree add /tmp/booworktrees/<taskId> -b task-<taskId> HEAD`
+   - `diffWorktree(worktreePath, projectPath): UnifiedDiff[]` — `git diff HEAD...<worktree-branch>`
+   - `cleanupWorktree(worktreePath): void` — `git worktree remove`
+   - On ACP session end: diff the worktree, queue diffs into `pending_changes`, cleanup
+
+4. **PTY fallback** (`apps/coder/src/services/pty-dispatch.ts`)
+   - For agents without ACP (claude, pi, smallcode)
+   - `spawnPtyAgent(agent: string, task: string, worktree: string): PtySession`
+   - Uses `node-pty` — spawn `claude` or `pi` with cwd = worktree
+   - Capture stdout/stderr into `message_parts` (kind='text', less structured than ACP)
+   - On exit: diff worktree → queue pending_changes → cleanup
+
+5. **Dispatcher update** — transport selection
+   - Check `available_agents[agent].supports_acp` at dispatch time
+   - ACP-capable → `spawnAcpAgent`
+   - PTY fallback → `spawnPtyAgent`
+   - Native (no agent specified) → Path A inference loop (Phase 4)
+
+6. **AGENTS.md extensions**
+   - Add `execution_strategy: plan | act | research` field
+   - Add `expert_model` field for cost-routing
+   - Add `output_schema` field (optional JSON Schema for structured final output)
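The event mapping in step 2 can be expressed as one exhaustive switch. The sketch below uses the event names exactly as the plan lists them; they have not been checked against the ACP SDK here, and every payload field plus the `DispatchAction` type is hypothetical.

```typescript
// Event names follow the plan's mapping list; payload fields are hypothetical.
type AcpEvent =
  | { type: "file_operation"; path: string; diff: string }
  | { type: "tool_call"; tool: string }
  | { type: "tool_result"; tool: string; output: string }
  | { type: "terminal_output"; data: string }
  | { type: "permission_request"; prompt: string }
  | { type: "session_end"; ok: boolean };

// What the dispatcher should do in response (illustrative names).
type DispatchAction =
  | { kind: "queue_pending_change"; path: string; diff: string }
  | { kind: "insert_message_part"; partKind: "tool_call" | "tool_result"; tool: string }
  | { kind: "publish_ws_frame"; data: string }
  | { kind: "pause_for_user"; prompt: string }
  | { kind: "set_task_state"; state: "completed" | "failed" };

function mapAcpEvent(ev: AcpEvent): DispatchAction {
  switch (ev.type) {
    case "file_operation":
      return { kind: "queue_pending_change", path: ev.path, diff: ev.diff };
    case "tool_call":
    case "tool_result":
      return { kind: "insert_message_part", partKind: ev.type, tool: ev.tool };
    case "terminal_output":
      return { kind: "publish_ws_frame", data: ev.data };
    case "permission_request":
      return { kind: "pause_for_user", prompt: ev.prompt };
    case "session_end":
      return { kind: "set_task_state", state: ev.ok ? "completed" : "failed" };
    default: {
      // Compile-time exhaustiveness check: adding an event type without a
      // case makes this assignment fail to type-check.
      const unreachable: never = ev;
      throw new Error(`unhandled ACP event: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Keeping the mapping pure (event in, action out) would let the Phase 8 integration test feed mocked events through it without spawning an agent.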
+
+### Verification
+- Create task with `agent: 'opencode'` → ACP subprocess spawns
+- opencode edits files in worktree → events stream into UI
+- On completion: worktree diff queued in `pending_changes`
+- Approve → changes applied to main project
+- Fallback: create task with `agent: 'claude'` → PTY captures output → worktree diff queued
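The worktree lifecycle from step 3 reduces to three git invocations per task. The sketch below only builds the argument vectors (intended for `execFile`, so no shell quoting is involved); `planWorktree`, `WorktreePlan`, and the task-id character restriction are assumptions, while the commands themselves are the ones the plan lists.

```typescript
// Task ids are interpolated into paths and branch names, so this sketch
// restricts them to a conservative character set (an assumption).
const SAFE_TASK_ID = /^[A-Za-z0-9_-]+$/;

interface WorktreePlan {
  worktreePath: string;
  branch: string;
  add: string[];
  diff: string[];
  remove: string[];
}

function planWorktree(taskId: string, baseDir = "/tmp/booworktrees"): WorktreePlan {
  if (!SAFE_TASK_ID.test(taskId)) {
    throw new Error(`unsafe task id: ${taskId}`);
  }
  const worktreePath = `${baseDir}/${taskId}`;
  const branch = `task-${taskId}`;
  return {
    worktreePath,
    branch,
    // Run from the project repository.
    add: ["git", "worktree", "add", worktreePath, "-b", branch, "HEAD"],
    // Three-dot diff: merge base of HEAD and the task branch vs. the branch tip.
    diff: ["git", "diff", `HEAD...${branch}`],
    remove: ["git", "worktree", "remove", worktreePath],
  };
}
```

Two details are worth checking during implementation. `git diff HEAD...task-<taskId>` only reflects commits on the task branch, so edits an agent leaves uncommitted in the worktree would not appear unless they are committed first or diffed from inside the worktree. `git worktree remove` also leaves the `task-<taskId>` branch behind, so cleanup may want to delete the branch as well.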
+
+---
+
+## Phase 6 — MCP Server (v2.0.2)
+
+**Goal:** BooCoder exposes its own primitives as MCP tools. External opencode sessions in Termius can drive the task queue.
+
+**Estimated:** ~250 LoC
+
+### Steps
+
+1. **MCP server** (`apps/coder/src/services/mcp-server.ts`)
+   - Use `@modelcontextprotocol/sdk` server-side (`Server` class)
+   - Stdio transport (read from stdin, write to stdout)
+   - Entry point: `boocoder --mcp` CLI flag starts the MCP server instead of the HTTP server
+
+2. **Tool handlers** (6 tools)
+   - `boocoder.create_task` → INSERT into tasks table, return task_id
+   - `boocoder.list_pending_changes` → SELECT from pending_changes WHERE session matches
+   - `boocoder.apply` → call `applyOne(change_id)`
+   - `boocoder.reject` → call `rejectOne(change_id)`
+   - `boocoder.dispatch_external_agent` → create task with agent specified, return task_id
+   - `boocoder.list_worktrees` → list active worktrees from tasks WHERE worktree_path IS NOT NULL AND state='running'
+
+3. **10-question eval** (per `anthropics/skills/mcp-builder` framework)
+   - Write 10 independent, read-only, verifiable questions about the BooCoder state
+   - Run eval: `echo '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"boocoder.list_pending_changes","arguments":{}},"id":1}' | boocoder --mcp`
+   - All 10 must return correct answers
+
+4. **opencode integration test**
+   - Add BooCoder as an MCP server in `~/.opencode/config.json`:
+     ```json
+     {"mcpServers": {"boocoder": {"type": "stdio", "command": "boocoder", "args": ["--mcp"]}}}
+     ```
+   - From opencode: call `boocoder.create_task` → verify task appears in BooCoder UI
+
+### Verification
+- `echo '...' | boocoder --mcp` returns valid MCP responses
+- 10-question eval passes
+- opencode can drive BooCoder's task queue via MCP
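When scripting the `echo '...' | boocoder --mcp` check, note that the MCP lifecycle expects an `initialize` request and a `notifications/initialized` notification before other requests, so a bare `tools/call` piped on its own may be rejected depending on how strict the server is. A minimal sketch of building the newline-delimited input follows; `buildMcpEvalInput`, the `boocoder-eval` client name, and the choice of protocol version are assumptions.

```typescript
// Builds newline-delimited JSON-RPC input for an MCP stdio server:
// initialize -> notifications/initialized -> tools/call.
function buildMcpEvalInput(
  toolName: string,
  toolArgs: Record<string, unknown>,
): string {
  const messages: object[] = [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed; use the version the server targets
        capabilities: {},
        clientInfo: { name: "boocoder-eval", version: "0.0.0" }, // hypothetical client
      },
    },
    // Notifications carry no id and expect no response.
    { jsonrpc: "2.0", method: "notifications/initialized" },
    {
      jsonrpc: "2.0",
      id: 2,
      method: "tools/call",
      params: { name: toolName, arguments: toolArgs },
    },
  ];
  // The stdio transport frames messages as one JSON object per line.
  return messages.map((m) => JSON.stringify(m)).join("\n") + "\n";
}
```

Writing all three messages at once without waiting for the `initialize` response is a simplification for a smoke test; an eval harness that spawns `boocoder --mcp` and reads each response before sending the next message would be closer to how a real client behaves.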
+
+---
+
+## Phase 7 — CLI + Polish (v2.0.3)
+
+**Goal:** `boocode` CLI client, human inbox UI, cost tracking, observation hooks.
+
+**Estimated:** ~400 LoC
+
+### Steps
+
+1. **CLI client** (`apps/coder/src/cli.ts`)
+   - Thin HTTP/WS client against BooCoder API
+   - `boocode run "task description"` → POST /api/tasks → stream output via WS
+   - `boocode ls` → GET /api/tasks → formatted table
+   - `boocode attach <id>` → WS subscribe to task's session → stream live
+   - `boocode send <id> "message"` → POST message to task's session chat
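+   - Ship as a standalone binary via `pkg`, or as a single bundled JS file via `esbuild --bundle` (esbuild bundles but does not produce a native executable, so that route still needs Node on the host)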
+
+2. **Human inbox UI** (frontend)
+   - New route: `/inbox` → shows tasks WHERE `state IN ('blocked', 'failed')`
+   - Per-task: view output, retry (reset state to pending), cancel, reassign agent
+   - Badge on sidebar showing count of inbox items
+
+3. **Cost tracking**
+   - `tasks.cost_tokens` populated from inference `usage` callback (same as BooChat's `tokens_used`)
+   - Summary API: `GET /api/stats/costs?group_by=project|agent|day` → aggregated token spend
+   - Simple UI: cost badge on each task, totals in settings
+
+4. **Observation hooks** (budi taxonomy)
+   - Emit 5 event types on the BooCoder WS protocol for dispatched agents:
+     - `session_start` — agent spawned
+     - `user_prompt_submit` — task spec delivered
+     - `post_tool_use` — each tool call completed
+     - `subagent_start` — nested dispatch (Boomerang)
+     - `stop` — agent finished
+   - Consumed by frontend for real-time status indicators
+
+5. **Boomerang `new_task` tool** (subagent isolation)
+   - When an agent's toolset includes `new_task`:
+     - Creates a child task (fresh session, fresh context)
+     - Child runs to completion
+     - Parent gets only `attempt_completion` summary
+   - Orchestrator agent profile: tools = `[new_task, list_tasks, check_task_status]` ONLY
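The isolation rule in step 5 (fresh session for the child, summary as the only bridge) can be illustrated with a small in-memory model. Everything here is a hypothetical stand-in: `BoomerangSession`, `ChildRunner`, and `createSession` are not names from the plan, and a real child would run through the dispatcher rather than a synchronous callback.

```typescript
// Hypothetical in-memory session: an id plus the messages it has seen.
interface BoomerangSession {
  id: string;
  messages: string[];
}

// Hypothetical agent runner: works inside the given session, returns a summary.
type ChildRunner = (session: BoomerangSession, input: string) => string;

let nextSessionNumber = 1;

function createSession(): BoomerangSession {
  return { id: `session-${nextSessionNumber++}`, messages: [] };
}

function newTask(
  parentSession: BoomerangSession,
  input: string,
  runChild: ChildRunner,
): string {
  const child = createSession(); // fresh session, fresh context
  child.messages.push(input); // the child sees only the task spec
  const summary = runChild(child, input);
  // The summary string is the only thing copied back into the parent.
  parentSession.messages.push(`[child ${child.id}] ${summary}`);
  return summary;
}
```

Because `newTask` never passes `parentSession.messages` to the child and never reads `child.messages` afterwards, neither side can observe the other's intermediate work, which is the property the risk register calls out for Boomerang.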
+
+### Verification
+- `boocode run "add health endpoint"` from terminal → task runs → output streams → diff queued
+- `boocode ls` shows task list with states + cost
+- Inbox shows failed tasks, retry works
+- Boomerang: orchestrator creates subtask → subtask runs isolated → parent gets summary only
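The cost summary from step 3 (`group_by=project|agent|day`) is a plain group-and-sum. The sketch below does it in memory for illustration; in BooCoder it would more likely be a SQL `GROUP BY`. The `TaskCostRow` shape and the `native` label for tasks with no agent are assumptions, since the plan only names `tasks.cost_tokens`.

```typescript
// Hypothetical row shape; the plan only names tasks.cost_tokens.
interface TaskCostRow {
  projectId: string;
  agent: string | null; // null = native Path A inference
  createdAt: string; // ISO 8601 timestamp
  costTokens: number;
}

type GroupBy = "project" | "agent" | "day";

function groupKey(row: TaskCostRow, groupBy: GroupBy): string {
  switch (groupBy) {
    case "project":
      return row.projectId;
    case "agent":
      return row.agent ?? "native";
    case "day":
      return row.createdAt.slice(0, 10); // YYYY-MM-DD prefix of the timestamp
  }
}

function aggregateCosts(rows: TaskCostRow[], groupBy: GroupBy): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const key = groupKey(row, groupBy);
    totals.set(key, (totals.get(key) ?? 0) + row.costTokens);
  }
  return totals;
}
```

Grouping by the timestamp prefix buckets days in whatever timezone the stored timestamps use, so the API would need to decide whether "day" means UTC or local time.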
+
+---
+
+## Phase 8 — Hardening + Ship (v2.0.x)
+
+**Goal:** Security hardening, integration tests, documentation, production deploy.
+
+**Estimated:** ~100 LoC (mostly tests + docs)
+
+### Steps
+
+1. **Path-guard fuzz suite** — property tests for every traversal pattern:
+   - `../` sequences (all depths)
+   - Symlink outside project root
+   - Null bytes in path
+   - Unicode normalization attacks
+   - Race conditions (TOCTOU between validate + write)
+   - MCP-served filesystem writes routed through pending_changes
+
+2. **Integration tests**
+   - End-to-end: create task → inference → edit_file → apply → file written → verify content
+   - ACP dispatch: mock opencode → events flow → pending_changes queued
+   - MCP server: 10-question eval automated in CI
+
+3. **Documentation**
+   - `BOOCODER.md` finalized (container guidance)
+   - `CLAUDE.md` updated with BooCoder architecture section
+   - `boocode_roadmap.md` v2.0 retrospective
+   - `CHANGELOG.md` entries for each sub-version
+
+4. **Production deploy**
+   - Caddy config: `coder.indifferentketchup.com`
+   - Authelia: same SSO group as BooChat
+   - Smoke: full workflow (chat → edit → approve → verify)
+
+5. **Tag** — `v2.0.0` (or `v2.0.0-rc1` if Sam wants a bake period)
+
+---
+
+## Execution order summary
+
+```
+Phase 1 (foundation)     → v2.0.0-alpha   ~200 LoC   container boots
+Phase 2 (write tools)    → v2.0.0-beta    ~400 LoC   inference + pending_changes
+Phase 3 (frontend)       → v2.0.0         ~200 LoC   chat + diff panes
+Phase 4 (dispatcher)     → v2.0.0-final   ~150 LoC   task queue + native dispatch
+Phase 5 (ACP dispatch)   → v2.0.1         ~350 LoC   external agents + worktrees
+Phase 6 (MCP server)     → v2.0.2         ~250 LoC   boocoder.* tools + eval
+Phase 7 (CLI + polish)   → v2.0.3         ~400 LoC   CLI + inbox + hooks + Boomerang
+Phase 8 (hardening)      → v2.0.x         ~100 LoC   fuzz + integration tests + docs
+                                          --------
+                                          ~2050 LoC total
+```
+
+Each phase is independently dispatchable. Phases 1-4 are sequential (each needs the prior). Phases 5-7 are parallelizable after Phase 4 ships (they're independent protocol surfaces). Phase 8 gates the production tag.
+
+---
+
+## Risk register
+
+| Risk | Mitigation |
+|---|---|
+| Path-guard bypass → arbitrary writes | Pending-changes double-validates (at queue time + apply time). Fuzz suite in Phase 8. OpenHands sandbox (v2.1) as fallback. |
+| ACP spec instability (remote transport WIP) | Use stdio only. No remote ACP in v2.0. |
+| node-pty native compilation breaks in Docker | bookworm-slim + glibc matches booterm's working config. Pin node-pty version. |
+| Worktree cleanup failure → disk bloat | 30-min idle timeout sweeper. `git worktree prune` on startup. |
+| DB rename breaks existing sessions | One-time migration with explicit backup. BooChat/BooTerm URLs unchanged. |
+| MCP server eval failure | Ship stdio MCP server only after 10/10 eval passes. |
+| Boomerang context leak (child leaks state to parent) | Architectural enforcement: child's session_id ≠ parent's. Summary field is the ONLY bridge. |