boocode/openspec/changes/v2.0-boocoder/implementation-plan.md
indifferentketchup 62d818af23 v2.0 implementation plan: 8 phases from foundation to production
Detailed execution plan for all v2.0 sub-versions:

Phase 1 (v2.0.0-alpha): container skeleton, DB rename, schema migration
Phase 2 (v2.0.0-beta): write tools + pending-changes service + fuzz tests
Phase 3 (v2.0.0): frontend diff pane + chat pane + Caddy routing
Phase 4 (v2.0.0-final): dispatcher worker + task queue + agent probing
Phase 5 (v2.0.1): ACP client + PTY fallback + worktree management
Phase 6 (v2.0.2): MCP server (6 tools, stdio, 10-question eval)
Phase 7 (v2.0.3): CLI + human inbox + cost tracking + observation hooks + Boomerang
Phase 8 (v2.0.x): path-guard fuzz, integration tests, docs, production deploy

~2050 LoC total. Phases 1-4 sequential, 5-7 parallelizable after 4.
Risk register covers path-guard bypass, ACP instability, worktree cleanup,
DB rename, MCP eval, Boomerang context leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 01:09:05 +00:00


v2.0 BooCoder — Implementation Plan

Ordered execution plan across all 4 sub-versions (v2.0.0 through v2.0.3), plus a v2.0.x hardening pass. Each phase is dispatchable as a single batch. Phases 1-4 are sequential (each builds on the prior); Phases 5-7 can run in parallel once Phase 4 ships.


Phase 1 — Foundation (v2.0.0-alpha)

Goal: Standalone BooCoder container boots, connects to DB, serves a health endpoint. No inference yet.

Estimated: ~200 LoC

Steps

  1. Clone lift sources (prep, no code)

    • In /opt/forks, git clone each lift source: agent-hub, plandex, opencode, qodo-ai/agents
    • Read agent-hub schema, plandex pending-changes, opencode permission/evaluate.ts
    • Read RA.Aid README for three-stage pattern
  2. Create apps/coder/ skeleton

    • apps/coder/package.json (Fastify, postgres, zod — same deps as apps/server)
    • apps/coder/tsconfig.json (extends base, NodeNext)
    • apps/coder/src/index.ts (Fastify boot, health endpoint, DB connect)
    • apps/coder/src/config.ts (Zod config schema — DATABASE_URL, PORT, HOST, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE)
    • apps/coder/src/db.ts (postgres connection, schema apply — shared with apps/server or fresh)
  3. Create Dockerfile

    • apps/coder/Dockerfile — Node 20 bookworm-slim (matches booterm for glibc compat with node-pty later)
    • Mount: /opt:/opt:rw
    • COPY built server + BOOCODER.md
  4. docker-compose.yml — add boocoder service

    • Port 100.114.205.53:9502:3000
    • Environment: DATABASE_URL, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE=/app/BOOCODER.md
    • Network: boocode_net
    • Depends on: boocode_db
  5. DB rename: boocode_db → boochat_db

    • ALTER DATABASE boocode RENAME TO boochat; (one-time, run manually)
    • Update DATABASE_URL in all docker-compose services
    • Update volume name mapping
    • Verify all 3 services boot against renamed DB
  6. Schema migration — new tables in apps/coder/src/schema.sql

    • pending_changes table
    • tasks table
    • available_agents table
    • human_inbox view
    • Applied idempotently on boot (same pattern as BooChat's applySchema())
  7. BOOCODER.md — container guidance file

    • Write tools enabled (unlike BOOCHAT.md which declares read-only)
    • Pending-changes queue discipline
    • Path-guard rules
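
The schema migration in step 6 could look roughly like the sketch below. The table and view names come from the plan; every column name and type is an assumption pieced together from fields mentioned elsewhere in this document (state, output_summary, worktree_path, cost_tokens, supports_acp), and schemaStatements is a hypothetical helper. Idempotence here relies on CREATE TABLE IF NOT EXISTS and CREATE OR REPLACE VIEW.

```typescript
// Hypothetical schema sketch. Table/view names are from the plan;
// all columns and types are assumptions for illustration only.
const CODER_SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS tasks (
  id             BIGSERIAL PRIMARY KEY,
  project_id     TEXT NOT NULL,
  input          TEXT NOT NULL,
  agent          TEXT,
  model          TEXT,
  state          TEXT NOT NULL DEFAULT 'pending',
  output_summary TEXT,
  worktree_path  TEXT,
  cost_tokens    BIGINT NOT NULL DEFAULT 0,
  created_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS pending_changes (
  id         BIGSERIAL PRIMARY KEY,
  session_id TEXT NOT NULL,
  task_id    BIGINT REFERENCES tasks(id),
  file_path  TEXT NOT NULL,
  operation  TEXT NOT NULL,
  diff       TEXT NOT NULL,
  status     TEXT NOT NULL DEFAULT 'pending',
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS available_agents (
  name         TEXT PRIMARY KEY,
  version      TEXT,
  supports_acp BOOLEAN NOT NULL DEFAULT false
);

CREATE OR REPLACE VIEW human_inbox AS
  SELECT id, project_id, input, agent, state, output_summary, created_at
  FROM tasks
  WHERE state IN ('blocked', 'failed');
`;

// Splits the script into single statements so a caller can run them one at a
// time on boot. Safe only because the script contains no ';' inside literals.
function schemaStatements(sql: string = CODER_SCHEMA_SQL): string[] {
  return sql
    .split(";")
    .map((statement) => statement.trim())
    .filter((statement) => statement.length > 0);
}
```

Because every statement is safe to re-run, applying the script on each boot should leave an already-migrated database unchanged.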

Verification

  • docker compose up --build -d — boocoder container starts
  • curl http://100.114.205.53:9502/api/health — 200 OK
  • psql confirms new tables exist
  • BooChat + BooTerm unaffected (still boot, still serve)
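
For the config module in step 2, here is a dependency-free stand-in for the Zod schema, shown only to make the validation rules concrete. The variable names come from the plan; the defaults (port 3000 from the compose port mapping, host 0.0.0.0, and the guidance file path) and the choice of which variables are required are assumptions.

```typescript
// Plain-TypeScript stand-in for the Zod schema in config.ts.
// Variable names are from the plan; defaults are assumptions.
interface CoderConfig {
  DATABASE_URL: string;
  PORT: number;
  HOST: string;
  LLAMA_SWAP_URL: string;
  CONTAINER_GUIDANCE_FILE: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): CoderConfig {
  // Fail fast at boot when a required variable is absent or empty.
  const required = (key: string): string => {
    const value = env[key];
    if (value === undefined || value === "") {
      throw new Error(`missing required env var: ${key}`);
    }
    return value;
  };

  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid PORT: ${env["PORT"]}`);
  }

  return {
    DATABASE_URL: required("DATABASE_URL"),
    PORT: port,
    HOST: env["HOST"] ?? "0.0.0.0",
    LLAMA_SWAP_URL: required("LLAMA_SWAP_URL"),
    CONTAINER_GUIDANCE_FILE: env["CONTAINER_GUIDANCE_FILE"] ?? "/app/BOOCODER.md",
  };
}
```

In the service itself this would be called once as loadConfig(process.env) before Fastify starts, so a misconfigured container exits immediately instead of failing on the first request.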

Phase 2 — Write Tools + Pending Changes (v2.0.0-beta)

Goal: BooCoder can chat with the LLM, the LLM can call write tools, changes queue in pending_changes, user can apply/reject.

Estimated: ~400 LoC

Steps

  1. Write-path guard (apps/coder/src/services/write_guard.ts)

    • resolveWritePath(projectRoot, filePath): string, using resolve() + prefix check (no realpath — file may not exist for creates)
    • Deny list: inherit from BooChat's secret_guard.ts (.env, *.pem, id_rsa*, etc.)
    • Fuzz tests: ../ escape, symlink outside root, null bytes, non-existent parent dirs
  2. Pending-changes service (apps/coder/src/services/pending_changes.ts)

    • queueEdit(session_id, task_id, file_path, old_string, new_string): PendingChange — computes unified diff, validates write path, INSERTs
    • queueCreate(session_id, task_id, file_path, content): PendingChange
    • queueDelete(session_id, task_id, file_path): PendingChange
    • applyAll(session_id): ApplyResult[] — re-validates each path, writes to disk, marks status='applied'
    • applyOne(change_id): ApplyResult
    • rejectOne(change_id): void — marks status='rejected'
    • rejectAll(session_id): void
    • rewindOne(change_id): void — inverse-diff, writes to disk, marks status='reverted'
    • listPending(session_id): PendingChange[]
  3. Write tools (apps/coder/src/services/tools/)

    • edit_file.ts — input: {file_path, old_string, new_string}, calls queueEdit
    • create_file.ts — input: {file_path, content}, calls queueCreate
    • delete_file.ts — input: {file_path}, calls queueDelete
    • apply_pending.ts — calls applyAll for current session
    • rewind.ts — input: {change_id} or {all: true}, calls rewindOne/rewindAll (rewindAll is not in the service list in step 2 and needs adding alongside rejectAll)
  4. Tool registry — register write tools alongside ALL read tools from BooChat

    • Import BooChat's read tools (view_file, grep, etc.) + codecontext tools
    • Add the 5 write tools
    • Alpha-sort the combined list
  5. Inference loop — port from BooChat or share via workspace package

    • Copy apps/server/src/services/inference/ into apps/coder/src/services/inference/ (or symlink via pnpm workspace)
    • The outer loop (v1.14) runs unchanged — write tools are just ToolDefs with execute() functions
    • Compaction, doom-loop, step cap all carry forward
  6. API routes

    • POST /api/sessions/:id/messages — same as BooChat (creates user + assistant rows, enqueues inference)
    • GET /api/sessions/:id/pending — returns pending changes for the session
    • POST /api/sessions/:id/pending/apply — applies all pending
    • POST /api/pending/:id/apply — applies one
    • POST /api/pending/:id/reject — rejects one
    • POST /api/pending/:id/rewind — reverts one
    • WebSocket streaming (same protocol as BooChat)
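
The write-path guard from step 1 can be sketched as a purely lexical check. The function name and signature follow the plan; the three deny patterns are only the examples the plan names (.env, *.pem, id_rsa*), not the full list in secret_guard.ts, and the error messages are made up.

```typescript
import * as path from "node:path";

// Basename patterns from the plan's examples (.env, *.pem, id_rsa*).
// The real deny list in BooChat's secret_guard.ts is assumed to be longer.
const DENIED_BASENAMES: RegExp[] = [/^\.env(\..+)?$/, /\.pem$/, /^id_rsa/];

function resolveWritePath(projectRoot: string, filePath: string): string {
  if (filePath.includes("\0")) {
    throw new Error("path contains a null byte");
  }
  const root = path.resolve(projectRoot);
  // resolve() collapses "../" segments and lets an absolute filePath win,
  // so both escape styles end up outside the root and fail the prefix check.
  const resolved = path.resolve(root, filePath);

  // Compare against root + separator so "/opt/proj-evil" cannot pass as "/opt/proj".
  if (!resolved.startsWith(root + path.sep)) {
    throw new Error(`path escapes project root: ${filePath}`);
  }
  if (DENIED_BASENAMES.some((pattern) => pattern.test(path.basename(resolved)))) {
    throw new Error(`path is on the deny list: ${filePath}`);
  }
  return resolved;
}
```

A lexical check like this does not follow symlinks, so the "symlink outside root" fuzz case would still need separate handling, for example by resolving the nearest existing parent directory before the prefix comparison.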

Verification

  • Send a chat asking BooCoder to edit a file
  • LLM calls edit_file → change queued in pending_changes
  • GET /api/sessions/:id/pending shows the queued change with diff
  • POST /api/pending/:id/apply writes to disk
  • POST /api/pending/:id/rewind reverts it
  • Fuzz test: attempt traversal via edit_file("../../etc/passwd", ...) → rejected by write_guard
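
The apply/reject/rewind flow verified above implies a small status lifecycle. The sketch below models it with an in-memory store; the status values 'applied', 'rejected', and 'reverted' come from the plan, while the initial 'pending' value, the allowed transitions, and the PendingChangeStore class are assumptions standing in for the database-backed service.

```typescript
type ChangeStatus = "pending" | "applied" | "rejected" | "reverted";

// Assumed lifecycle: only pending changes can be applied or rejected,
// and only applied changes can be rewound.
const NEXT_STATUS: Record<ChangeStatus, ChangeStatus[]> = {
  pending: ["applied", "rejected"],
  applied: ["reverted"],
  rejected: [],
  reverted: [],
};

interface PendingChange {
  id: number;
  sessionId: string;
  filePath: string;
  status: ChangeStatus;
}

// In-memory stand-in for the pending_changes table.
class PendingChangeStore {
  private changes: PendingChange[] = [];
  private nextId = 1;

  queue(sessionId: string, filePath: string): PendingChange {
    const change: PendingChange = {
      id: this.nextId++,
      sessionId,
      filePath,
      status: "pending",
    };
    this.changes.push(change);
    return change;
  }

  listPending(sessionId: string): PendingChange[] {
    return this.changes.filter(
      (c) => c.sessionId === sessionId && c.status === "pending",
    );
  }

  // Moves a change to a new status, rejecting transitions the table forbids.
  transition(id: number, to: ChangeStatus): PendingChange {
    const change = this.changes.find((c) => c.id === id);
    if (!change) throw new Error(`unknown change: ${id}`);
    if (!NEXT_STATUS[change.status].includes(to)) {
      throw new Error(`illegal transition: ${change.status} -> ${to}`);
    }
    change.status = to;
    return change;
  }
}
```

Keeping the transition table in one place means applyOne, rejectOne, and rewindOne can share a single guard, so a change cannot be applied twice or rewound before it was applied.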

Phase 3 — Frontend: Diff Pane + Chat (v2.0.0)

Goal: Browser UI at coder.indifferentketchup.com with chat pane + diff pane side by side.

Estimated: ~200 LoC

Steps

  1. Create apps/coder/web/ — React + Vite SPA (same stack as BooChat's apps/web/)

    • Copy BooChat's Vite config, Tailwind v4 setup, font pipeline
    • Shared components: MarkdownRenderer, CodeBlock, Button, Input
    • New app shell: sidebar (sessions) + workspace (panes)
  2. Chat pane — reuse BooChat's ChatPane/MessageBubble pattern

    • Same WS streaming, same useSessionStream hook, same message rendering
    • ActionRow includes tool-call rendering for write tools
  3. Diff pane — NEW (apps/coder/web/src/components/DiffPane.tsx)

    • Fetches GET /api/sessions/:id/pending
    • Lists pending changes: file path + operation badge (create/edit/delete)
    • Per-change: syntax-highlighted unified diff view (use Shiki or a diff-specific highlighter)
    • Buttons: Approve / Reject per change, Approve All / Reject All
    • Real-time updates via WS frame (pending_change_added, pending_change_applied, etc.)
  4. Workspace splitter — chat left, diff right (or configurable)

  5. Caddy route: coder.indifferentketchup.com → boocoder:9502

    • Authelia gating (same as BooChat)
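
The diff pane's real-time updates in step 3 amount to folding WS frames into a list of pending changes. A minimal sketch follows: the frame names pending_change_added and pending_change_applied are from the plan, while pending_change_rejected, the payload fields, and the reducePending function are assumptions.

```typescript
interface PendingChangeView {
  id: string;
  filePath: string;
  operation: "create" | "edit" | "delete";
  diff: string;
}

// Frame shapes are assumptions; only the first two type names appear in the plan.
type PendingFrame =
  | { type: "pending_change_added"; change: PendingChangeView }
  | { type: "pending_change_applied"; changeId: string }
  | { type: "pending_change_rejected"; changeId: string };

// Pure reducer: returns the next list of changes the diff pane should show.
function reducePending(
  state: PendingChangeView[],
  frame: PendingFrame,
): PendingChangeView[] {
  switch (frame.type) {
    case "pending_change_added":
      // Ignore duplicates so a replayed frame after reconnect is harmless.
      return state.some((c) => c.id === frame.change.id)
        ? state
        : [...state, frame.change];
    case "pending_change_applied":
    case "pending_change_rejected":
      return state.filter((c) => c.id !== frame.changeId);
    default:
      return state;
  }
}
```

A reducer of this shape would slot into React's useReducer, with the initial state seeded from GET /api/sessions/:id/pending and later frames applied as they arrive.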

Verification

  • Open coder.indifferentketchup.com in browser
  • Send a message asking for a code change
  • See the change appear in the diff pane in real time
  • Click Approve → file written, change marked applied
  • Click Reject → change discarded

Phase 4 — Dispatcher + Tasks (v2.0.0-final)

Goal: Task queue works. User can create tasks, dispatcher picks them up and runs them through Path A.

Estimated: ~150 LoC

Steps

  1. Dispatcher (apps/coder/src/services/dispatcher.ts)

    • In-process setInterval poll every 5000 ms: SELECT from tasks WHERE state='pending' ORDER BY created_at
    • For each ready task: mark state='running', run inference with the task's input as the user message
    • On completion: mark state='completed'
    • On error: mark state='failed'
    • On abort: mark state='cancelled'
    • Respects app.addHook('onClose') — stops polling, waits for in-flight task
  2. Task API routes

    • POST /api/tasks — create a task {project_id, input, agent?, model?}
    • GET /api/tasks — list tasks (filterable by state, project)
    • GET /api/tasks/:id — get task details + output_summary
    • POST /api/tasks/:id/cancel — cancel a running task
  3. Task → session linkage

    • Each task creates its own session + chat for isolation
    • Task's pending_changes reference the task_id
    • When task completes, its pending_changes are visible in the UI for approval
  4. Agent probing (apps/coder/src/services/agent-probe.ts)

    • On startup: which opencode, which goose, which claude, which pi
    • Parse version from <agent> --version
    • Check ACP support: opencode acp --help exits 0 → supports_acp = true
    • Populate available_agents table
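
The dispatcher's state handling in step 1 can be sketched as two small functions over an in-memory task list. The state names are from the plan; the Task shape, claimNextTask, settleTask, and the Outcome type are hypothetical, and the array stands in for the tasks table.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed" | "cancelled";

interface Task {
  id: string;
  input: string;
  state: TaskState;
  createdAt: number;
}

// Picks the oldest pending task and marks it running, mirroring
// "WHERE state='pending' ORDER BY created_at".
function claimNextTask(tasks: Task[]): Task | undefined {
  const next = tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => a.createdAt - b.createdAt)[0];
  if (next) next.state = "running";
  return next;
}

type Outcome = "ok" | "error" | "aborted";

// Maps how the inference run ended onto the terminal task state.
function settleTask(task: Task, outcome: Outcome): void {
  if (task.state !== "running") {
    throw new Error(`task ${task.id} is not running`);
  }
  task.state =
    outcome === "ok" ? "completed" : outcome === "error" ? "failed" : "cancelled";
}
```

In the real dispatcher the claim would presumably be a single UPDATE against Postgres so that two overlapping ticks cannot pick the same row; the in-memory version has no such race because it runs on one thread.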

Verification

  • POST /api/tasks {input: "add a /api/version endpoint"} → task created
  • Dispatcher picks it up → inference runs → edit_file queued → task completes
  • GET /api/tasks/:id shows state='completed' + output_summary
  • Pending changes visible in diff pane for approval

Phase 5 — ACP Dispatch (v2.0.1)

Goal: Tasks can be dispatched to external agents via ACP. opencode and goose run as subprocesses, their events flow back into BooCode.

Estimated: ~350 LoC

Steps

  1. ACP client (apps/coder/src/services/acp-client.ts)

    • Install: pnpm -C apps/coder add @zed-industries/agent-client-protocol
    • spawnAcpAgent(agent: string, task: string, worktree: string, mcpServers: McpConfig[]): AcpSession
    • Uses SDK's StdioTransport — spawn opencode acp or goose acp as child
    • Pass context_servers for MCP auto-forward
    • Event listener: maps ACP events to BooCode's parts taxonomy
  2. ACP event mapping

    • file_operation → queue into pending_changes (same as Path A native writes)
    • tool_call / tool_result → insert as message_parts in the task's session
    • terminal_output → publish as WS frame for BooTerm routing
    • permission_request → pause (same mechanism as ask_user_input)
    • session_end → task state → completed or failed
  3. Worktree management (apps/coder/src/services/worktrees.ts)

    • createWorktree(projectPath, taskId): string, runs git worktree add /tmp/booworktrees/<taskId> -b task-<taskId> HEAD
    • diffWorktree(worktreePath, projectPath): UnifiedDiff[], runs git diff HEAD...<worktree-branch>
    • cleanupWorktree(worktreePath): void, runs git worktree remove
    • On ACP session end: diff the worktree, queue diffs into pending_changes, cleanup
  4. PTY fallback (apps/coder/src/services/pty-dispatch.ts)

    • For agents without ACP (claude, pi, smallcode)
    • spawnPtyAgent(agent: string, task: string, worktree: string): PtySession
    • Uses node-pty — spawn claude or pi with cwd = worktree
    • Capture stdout/stderr into message_parts (kind='text', less structured than ACP)
    • On exit: diff worktree → queue pending_changes → cleanup
  5. Dispatcher update — transport selection

    • Check available_agents[agent].supports_acp at dispatch time
    • ACP-capable → spawnAcpAgent
    • PTY fallback → spawnPtyAgent
    • Native (no agent specified) → Path A inference loop (Phase 4)
  6. AGENTS.md extensions

    • Add execution_strategy: plan | act | research field
    • Add expert_model field for cost-routing
    • Add output_schema field (optional JSON Schema for structured final output)
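
The transport selection in step 5 reduces to a three-way decision. A minimal sketch, assuming the available_agents table has been loaded into a Map; the AgentInfo shape and the choice to throw on an unknown agent are assumptions.

```typescript
type Transport = "native" | "acp" | "pty";

interface AgentInfo {
  name: string;
  supportsAcp: boolean;
}

// Decides how a task is run: no agent means the native Path A loop,
// otherwise ACP when the probed agent supports it and PTY as the fallback.
function selectTransport(
  agent: string | undefined,
  available: Map<string, AgentInfo>,
): Transport {
  if (!agent) return "native";
  const info = available.get(agent);
  if (!info) {
    // Assumption: an agent missing from available_agents is a hard error.
    throw new Error(`agent not available: ${agent}`);
  }
  return info.supportsAcp ? "acp" : "pty";
}
```

Looking the flag up at dispatch time, as the plan specifies, means a newly installed or upgraded agent is picked up on the next probe without changing dispatcher code.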

Verification

  • Create task with agent: 'opencode' → ACP subprocess spawns
  • opencode edits files in worktree → events stream into UI
  • On completion: worktree diff queued in pending_changes
  • Approve → changes applied to main project
  • Fallback: create task with agent: 'claude' → PTY captures output → worktree diff queued
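
The worktree commands from step 3 can be built as argument arrays and run without a shell. In this sketch the base directory and branch naming follow the plan; planWorktree, the task-id allow-list, the --force flag on removal, and the extra projectPath parameter on cleanupWorktree (needed as the working directory for git) are assumptions.

```typescript
import { execFileSync } from "node:child_process";
import * as path from "node:path";

const WORKTREE_BASE = "/tmp/booworktrees"; // base directory named in the plan

interface WorktreePlan {
  worktreePath: string;
  branch: string;
  args: string[];
}

// Builds the git arguments for a task's worktree without running anything.
function planWorktree(taskId: string): WorktreePlan {
  // Assumed allow-list: keeps ids from injecting path segments or git options.
  if (!/^[A-Za-z0-9][A-Za-z0-9_-]*$/.test(taskId)) {
    throw new Error(`unsafe task id: ${taskId}`);
  }
  const worktreePath = path.posix.join(WORKTREE_BASE, taskId);
  const branch = `task-${taskId}`;
  return {
    worktreePath,
    branch,
    args: ["worktree", "add", "-b", branch, worktreePath, "HEAD"],
  };
}

function createWorktree(projectPath: string, taskId: string): string {
  const plan = planWorktree(taskId);
  // execFileSync passes args directly to git, so no shell quoting is involved.
  execFileSync("git", plan.args, { cwd: projectPath, stdio: "pipe" });
  return plan.worktreePath;
}

function cleanupWorktree(projectPath: string, worktreePath: string): void {
  execFileSync("git", ["worktree", "remove", "--force", worktreePath], {
    cwd: projectPath,
    stdio: "pipe",
  });
}
```

Separating the argument builder from the process call keeps the path and branch logic unit-testable without a git repository on disk.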

Phase 6 — MCP Server (v2.0.2)

Goal: BooCoder exposes its own primitives as MCP tools. External opencode sessions in Termius can drive the task queue.

Estimated: ~250 LoC

Steps

  1. MCP server (apps/coder/src/services/mcp-server.ts)

    • Use @modelcontextprotocol/sdk server-side (Server class)
    • Stdio transport (read from stdin, write to stdout)
    • Entry point: boocoder --mcp CLI flag starts the MCP server instead of the HTTP server
  2. Tool handlers (6 tools)

    • boocoder.create_task → INSERT into tasks table, return task_id
    • boocoder.list_pending_changes → SELECT from pending_changes WHERE session matches
    • boocoder.apply → call applyOne(change_id)
    • boocoder.reject → call rejectOne(change_id)
    • boocoder.dispatch_external_agent → create task with agent specified, return task_id
    • boocoder.list_worktrees → list active worktrees from tasks WHERE worktree_path IS NOT NULL AND state='running'
  3. 10-question eval (per anthropics/skills/mcp-builder framework)

    • Write 10 independent, read-only, verifiable questions about the BooCoder state
    • Run eval: echo '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"boocoder.list_pending_changes","arguments":{}},"id":1}' | boocoder --mcp
    • All 10 must return correct answers
  4. opencode integration test

    • Add BooCoder as an MCP server in ~/.opencode/config.json:
      {"mcpServers": {"boocoder": {"type": "stdio", "command": "boocoder", "args": ["--mcp"]}}}
      
    • From opencode: call boocoder.create_task → verify task appears in BooCoder UI
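
The eval in step 3 pipes a JSON-RPC tools/call message into the server. The helper below builds that message for any of the six tool names listed in step 2; isBoocoderTool and toolsCallRequest are hypothetical names, and the helper covers only the tools/call message itself, whereas an MCP client normally sends an initialize request first.

```typescript
// The six tool names listed in the plan.
const BOOCODER_TOOLS = [
  "boocoder.create_task",
  "boocoder.list_pending_changes",
  "boocoder.apply",
  "boocoder.reject",
  "boocoder.dispatch_external_agent",
  "boocoder.list_worktrees",
] as const;

type BoocoderTool = (typeof BOOCODER_TOOLS)[number];

function isBoocoderTool(name: string): name is BoocoderTool {
  return (BOOCODER_TOOLS as readonly string[]).includes(name);
}

// Serializes one JSON-RPC 2.0 "tools/call" request as a single line,
// in the same key order as the eval command in the plan.
function toolsCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): string {
  if (!isBoocoderTool(name)) {
    throw new Error(`unknown tool: ${name}`);
  }
  return JSON.stringify({
    jsonrpc: "2.0",
    method: "tools/call",
    params: { name, arguments: args },
    id,
  });
}
```

Generating the ten eval questions through one helper keeps the request shape identical across cases, so a failing answer points at the tool handler and not at a malformed message.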

Verification

  • echo '...' | boocoder --mcp returns valid MCP responses
  • 10-question eval passes
  • opencode can drive BooCoder's task queue via MCP

Phase 7 — CLI + Polish (v2.0.3)

Goal: boocode CLI client, human inbox UI, cost tracking, observation hooks.

Estimated: ~400 LoC

Steps

  1. CLI client (apps/coder/src/cli.ts)

    • Thin HTTP/WS client against BooCoder API
    • boocode run "task description" → POST /api/tasks → stream output via WS
    • boocode ls → GET /api/tasks → formatted table
    • boocode attach <id> → WS subscribe to task's session → stream live
    • boocode send <id> "message" → POST message to task's session chat
    • Build as a standalone binary via pkg or esbuild --bundle
  2. Human inbox UI (frontend)

    • New route: /inbox → shows tasks WHERE state IN ('blocked', 'failed')
    • Per-task: view output, retry (reset state to pending), cancel, reassign agent
    • Badge on sidebar showing count of inbox items
  3. Cost tracking

    • tasks.cost_tokens populated from inference usage callback (same as BooChat's tokens_used)
    • Summary API: GET /api/stats/costs?group_by=project|agent|day → aggregated token spend
    • Simple UI: cost badge on each task, totals in settings
  4. Observation hooks (budi taxonomy)

    • Emit 5 event types on the BooCoder WS protocol for dispatched agents:
      • session_start — agent spawned
      • user_prompt_submit — task spec delivered
      • post_tool_use — each tool call completed
      • subagent_start — nested dispatch (Boomerang)
      • stop — agent finished
    • Consumed by frontend for real-time status indicators
  5. Boomerang new_task tool (subagent isolation)

    • When an agent's toolset includes new_task:
      • Creates a child task (fresh session, fresh context)
      • Child runs to completion
      • Parent gets only attempt_completion summary
    • Orchestrator agent profile: tools = [new_task, list_tasks, check_task_status] ONLY
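
The isolation rule for new_task in step 5 can be shown with plain objects. In this sketch Session, newSession, ChildRunner, and runNewTask are hypothetical; the runner callback stands in for a full agent run and returns the attempt_completion summary.

```typescript
interface Session {
  id: string;
  messages: string[];
}

let sessionCounter = 0;

function newSession(): Session {
  sessionCounter += 1;
  return { id: `session-${sessionCounter}`, messages: [] };
}

// Stand-in for a full agent run: works inside the child's session and
// returns the attempt_completion summary.
type ChildRunner = (child: Session) => string;

function runNewTask(
  parent: Session,
  input: string,
  run: ChildRunner,
): { childSessionId: string; summary: string } {
  const child = newSession(); // fresh session, never a copy of the parent
  child.messages.push(input); // the child sees only its own task spec
  const summary = run(child);
  parent.messages.push(summary); // the summary is the only thing that crosses back
  return { childSessionId: child.id, summary };
}
```

Since the parent never receives a reference to the child session, intermediate tool output stays in the child's context and the parent's context grows by one summary per subtask.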

Verification

  • boocode run "add health endpoint" from terminal → task runs → output streams → diff queued
  • boocode ls shows task list with states + cost
  • Inbox shows failed tasks, retry works
  • Boomerang: orchestrator creates subtask → subtask runs isolated → parent gets summary only
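
The cost summary endpoint in step 3 groups token spend by project, agent, or day. A minimal in-memory sketch of that aggregation follows; the row shape, the "native" label for tasks without an agent, and the use of the first ten characters of an ISO-8601 UTC timestamp as the day key are assumptions.

```typescript
type GroupBy = "project" | "agent" | "day";

interface TaskCostRow {
  projectId: string;
  agent: string | null; // null when the task ran on the native Path A loop
  createdAt: string;    // ISO-8601 UTC timestamp, e.g. "2026-05-25T01:09:05Z"
  costTokens: number;
}

function groupKey(row: TaskCostRow, groupBy: GroupBy): string {
  if (groupBy === "project") return row.projectId;
  if (groupBy === "agent") return row.agent ?? "native";
  return row.createdAt.slice(0, 10); // "YYYY-MM-DD"
}

// Sums cost_tokens per group, matching ?group_by=project|agent|day.
function aggregateCosts(
  rows: TaskCostRow[],
  groupBy: GroupBy,
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    const key = groupKey(row, groupBy);
    totals[key] = (totals[key] ?? 0) + row.costTokens;
  }
  return totals;
}
```

In production the same grouping would more likely be a SQL GROUP BY over the tasks table; the in-memory form is mainly useful for unit-testing the response shape.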

Phase 8 — Hardening + Ship (v2.0.x)

Goal: Security hardening, integration tests, documentation, production deploy.

Estimated: ~100 LoC (mostly tests + docs)

Steps

  1. Path-guard fuzz suite — property tests for every traversal pattern:

    • ../ sequences (all depths)
    • Symlink outside project root
    • Null bytes in path
    • Unicode normalization attacks
    • Race conditions (TOCTOU between validate + write)
    • MCP-served filesystem writes routed through pending_changes
  2. Integration tests

    • End-to-end: create task → inference → edit_file → apply → file written → verify content
    • ACP dispatch: mock opencode → events flow → pending_changes queued
    • MCP server: 10-question eval automated in CI
  3. Documentation

    • BOOCODER.md finalized (container guidance)
    • CLAUDE.md updated with BooCoder architecture section
    • boocode_roadmap.md v2.0 retrospective
    • CHANGELOG.md entries for each sub-version
  4. Production deploy

    • Caddy config: coder.indifferentketchup.com
    • Authelia: same SSO group as BooChat
    • Smoke: full workflow (chat → edit → approve → verify)
  5. Tag: v2.0.0 (or v2.0.0-rc1 if Sam wants a bake period)


Execution order summary

Phase 1 (foundation)     → v2.0.0-alpha   ~200 LoC   container boots
Phase 2 (write tools)    → v2.0.0-beta    ~400 LoC   inference + pending_changes
Phase 3 (frontend)       → v2.0.0         ~200 LoC   chat + diff panes
Phase 4 (dispatcher)     → v2.0.0-final   ~150 LoC   task queue + native dispatch
Phase 5 (ACP dispatch)   → v2.0.1         ~350 LoC   external agents + worktrees
Phase 6 (MCP server)     → v2.0.2         ~250 LoC   boocoder.* tools + eval
Phase 7 (CLI + polish)   → v2.0.3         ~400 LoC   CLI + inbox + hooks + Boomerang
Phase 8 (hardening)      → v2.0.x         ~100 LoC   fuzz + integration tests + docs
                                          --------
                                          ~2050 LoC total

Each phase is independently dispatchable. Phases 1-4 are sequential (each needs the prior). Phases 5-7 are parallelizable after Phase 4 ships (they're independent protocol surfaces). Phase 8 gates the production tag.


Risk register

Risk | Mitigation
Path-guard bypass → arbitrary writes | Pending-changes double-validates (at queue time + apply time). Fuzz suite in Phase 8. OpenHands sandbox (v2.1) as fallback.
ACP spec instability (remote transport WIP) | Use stdio only. No remote ACP in v2.0.
node-pty native compilation breaks in Docker | bookworm-slim + glibc matches booterm's working config. Pin node-pty version.
Worktree cleanup failure → disk bloat | 30-min idle timeout sweeper. git worktree prune on startup.
DB rename breaks existing sessions | One-time migration with explicit backup. BooChat/BooTerm URLs unchanged.
MCP server eval failure | Ship stdio MCP server only after 10/10 eval passes.
Boomerang context leak (child leaks state to parent) | Architectural enforcement: child's session_id ≠ parent's. Summary field is the ONLY bridge.