indifferentketchup 62d818af23 v2.0 implementation plan: 8 phases from foundation to production

Detailed execution plan for all v2.0 sub-versions:

Phase 1 (v2.0.0-alpha): container skeleton, DB rename, schema migration
Phase 2 (v2.0.0-beta): write tools + pending-changes service + fuzz tests
Phase 3 (v2.0.0): frontend diff pane + chat pane + Caddy routing
Phase 4 (v2.0.0-final): dispatcher worker + task queue + agent probing
Phase 5 (v2.0.1): ACP client + PTY fallback + worktree management
Phase 6 (v2.0.2): MCP server (6 tools, stdio, 10-question eval)
Phase 7 (v2.0.3): CLI + human inbox + cost tracking + observation hooks + Boomerang
Phase 8 (v2.0.x): path-guard fuzz, integration tests, docs, production deploy

~2050 LoC total. Phases 1-4 sequential, 5-7 parallelizable after 4.
Risk register covers path-guard bypass, ACP instability, worktree cleanup,
DB rename, MCP eval, Boomerang context leak.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-25 01:09:05 +00:00

v2.0 BooCoder — Implementation Plan

Ordered execution plan across the v2.0 sub-versions (v2.0.0 through v2.0.3, plus v2.0.x hardening). Each phase is dispatchable as a single batch. Phases 1-4 are sequential (each builds on the prior); Phases 5-7 can run in parallel once Phase 4 ships.

Phase 1 — Foundation (v2.0.0-alpha)

Goal: Standalone BooCoder container boots, connects to DB, serves a health endpoint. No inference yet.

Estimated: ~200 LoC

Steps

Clone lift sources (prep, no code)
- cd /opt/forks, then git clone each lift source: agent-hub, plandex, opencode, qodo-ai/agents
- Read agent-hub schema, plandex pending-changes, opencode permission/evaluate.ts
- Read RA.Aid README for three-stage pattern
Create apps/coder/ skeleton
- apps/coder/package.json (Fastify, postgres, zod — same deps as apps/server)
- apps/coder/tsconfig.json (extends base, NodeNext)
- apps/coder/src/index.ts (Fastify boot, health endpoint, DB connect)
- apps/coder/src/config.ts (Zod config schema — DATABASE_URL, PORT, HOST, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE)
- apps/coder/src/db.ts (postgres connection, schema apply — shared with apps/server or fresh)
Create Dockerfile
- apps/coder/Dockerfile — Node 20 bookworm-slim (matches booterm for glibc compat with node-pty later)
- Mount: /opt:/opt:rw
- COPY built server + BOOCODER.md
docker-compose.yml — add boocoder service
- Port 100.114.205.53:9502:3000
- Environment: DATABASE_URL, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE=/app/BOOCODER.md
- Network: boocode_net
- Depends on: boocode_db
DB rename — boocode_db → boochat_db
- ALTER DATABASE boocode RENAME TO boochat; (one-time, run manually)
- Update DATABASE_URL in all docker-compose services
- Update volume name mapping
- Verify all 3 services boot against renamed DB
Schema migration — new tables in apps/coder/src/schema.sql
- pending_changes table
- tasks table
- available_agents table
- human_inbox view
- Applied idempotently on boot (same pattern as BooChat's applySchema())
BOOCODER.md — container guidance file
- Write tools enabled (unlike BOOCHAT.md which declares read-only)
- Pending-changes queue discipline
- Path-guard rules
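
The config step above can be sketched without dependencies. The plan calls for a Zod schema; this stand-in uses plain checks so it runs on its own. The variable names come from the plan, and the port default of 3000 matches the container port in the compose mapping. The HOST default and the fallback guidance path are assumptions.

```typescript
// Dependency-free stand-in for the Zod config schema described in the plan.
// Assumptions: HOST defaults to 0.0.0.0; CONTAINER_GUIDANCE_FILE falls back
// to the path used in the compose environment.
interface CoderConfig {
  DATABASE_URL: string;
  PORT: number;
  HOST: string;
  LLAMA_SWAP_URL: string;
  CONTAINER_GUIDANCE_FILE: string;
}

type Env = Record<string, string | undefined>;

// Return a required variable or fail fast with a readable message.
function requireVar(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error(`Missing required config: ${key}`);
  }
  return value;
}

function loadConfig(env: Env): CoderConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  return {
    DATABASE_URL: requireVar(env, "DATABASE_URL"),
    PORT: port,
    HOST: env["HOST"] ?? "0.0.0.0",
    LLAMA_SWAP_URL: requireVar(env, "LLAMA_SWAP_URL"),
    CONTAINER_GUIDANCE_FILE: env["CONTAINER_GUIDANCE_FILE"] ?? "/app/BOOCODER.md",
  };
}
```

Calling loadConfig(process.env) at boot would surface a missing DATABASE_URL before Fastify starts listening.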

Verification

docker compose up --build -d — boocoder container starts
curl http://100.114.205.53:9502/api/health — 200 OK
psql confirms new tables exist
BooChat + BooTerm unaffected (still boot, still serve)

Phase 2 — Write Tools + Pending Changes (v2.0.0-beta)

Goal: BooCoder can chat with the LLM, the LLM can call write tools, changes queue in pending_changes, user can apply/reject.

Estimated: ~400 LoC

Steps

Write-path guard (apps/coder/src/services/write_guard.ts)
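- resolveWritePath(projectRoot, filePath): string — resolve() + prefix check (no realpath on the target itself, since the file may not exist for creates; realpath the nearest existing ancestor so symlinked directories are still caught)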
- Deny list: inherit from BooChat's secret_guard.ts (.env, *.pem, id_rsa*, etc.)
- Fuzz tests: ../ escape, symlink outside root, null bytes, non-existent parent dirs
Pending-changes service (apps/coder/src/services/pending_changes.ts)
- queueEdit(session_id, task_id, file_path, old_string, new_string): PendingChange — computes unified diff, validates write path, INSERTs
- queueCreate(session_id, task_id, file_path, content): PendingChange
- queueDelete(session_id, task_id, file_path): PendingChange
- applyAll(session_id): ApplyResult[] — re-validates each path, writes to disk, marks status='applied'
- applyOne(change_id): ApplyResult
- rejectOne(change_id): void — marks status='rejected'
- rejectAll(session_id): void
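- rewindOne(change_id): void — inverse-diff, writes to disk, marks status='reverted'
- rewindAll(session_id): void — rewinds every applied change in the session (the rewind tool's {all: true} path)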
- listPending(session_id): PendingChange[]
Write tools (apps/coder/src/services/tools/)
- edit_file.ts — input: {file_path, old_string, new_string}, calls queueEdit
- create_file.ts — input: {file_path, content}, calls queueCreate
- delete_file.ts — input: {file_path}, calls queueDelete
- apply_pending.ts — calls applyAll for current session
- rewind.ts — input: {change_id} or {all: true}, calls rewindOne/rewindAll
Tool registry — register write tools alongside ALL read tools from BooChat
- Import BooChat's read tools (view_file, grep, etc.) + codecontext tools
- Add the 5 write tools
- Alpha-sort the combined list
Inference loop — port from BooChat or share via workspace package
- Copy apps/server/src/services/inference/ into apps/coder/src/services/inference/ (or symlink via pnpm workspace)
- The outer loop (v1.14) runs unchanged — write tools are just ToolDefs with execute() functions
- Compaction, doom-loop, step cap all carry forward
API routes
- POST /api/sessions/:id/messages — same as BooChat (creates user + assistant rows, enqueues inference)
- GET /api/sessions/:id/pending — returns pending changes for the session
- POST /api/sessions/:id/pending/apply — applies all pending
- POST /api/pending/:id/apply — applies one
- POST /api/pending/:id/reject — rejects one
- POST /api/pending/:id/rewind — reverts one
- WebSocket streaming (same protocol as BooChat)
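
The status handling in the pending-changes service can be pictured as a small state machine. This is an in-memory sketch rather than the planned Postgres-backed service. The 'applied', 'rejected' and 'reverted' statuses come from the plan. The initial 'pending' status, the transition table, and the PendingChangeStore name are assumptions.

```typescript
type ChangeStatus = "pending" | "applied" | "rejected" | "reverted";

interface PendingChange {
  id: number;
  session_id: string;
  file_path: string;
  status: ChangeStatus;
}

// Assumed transition rules: only pending changes can be applied or rejected,
// and only applied changes can be rewound.
const TRANSITIONS: Record<ChangeStatus, ChangeStatus[]> = {
  pending: ["applied", "rejected"],
  applied: ["reverted"],
  rejected: [],
  reverted: [],
};

class PendingChangeStore {
  private changes = new Map<number, PendingChange>();
  private nextId = 1;

  // Stand-in for queueEdit/queueCreate/queueDelete: records a pending row.
  queue(session_id: string, file_path: string): PendingChange {
    const change: PendingChange = { id: this.nextId++, session_id, file_path, status: "pending" };
    this.changes.set(change.id, change);
    return change;
  }

  // Stand-in for applyOne/rejectOne/rewindOne: validates the move, then updates status.
  transition(id: number, to: ChangeStatus): PendingChange {
    const change = this.changes.get(id);
    if (!change) throw new Error(`Unknown change ${id}`);
    if (!TRANSITIONS[change.status].includes(to)) {
      throw new Error(`Cannot move change ${id} from ${change.status} to ${to}`);
    }
    change.status = to;
    return change;
  }

  listPending(session_id: string): PendingChange[] {
    return Array.from(this.changes.values()).filter(
      (c) => c.session_id === session_id && c.status === "pending",
    );
  }
}
```

Keeping the allowed moves in one table means a double apply or a rewind of a rejected change fails loudly instead of touching disk twice.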

Verification

Send a chat asking BooCoder to edit a file
LLM calls edit_file → change queued in pending_changes
GET /api/sessions/:id/pending shows the queued change with diff
POST /api/pending/:id/apply writes to disk
POST /api/pending/:id/rewind reverts it
Fuzz test: attempt traversal via edit_file("../../etc/passwd", ...) → rejected by write_guard
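
The traversal rejection in the last check comes down to a lexical resolve-and-prefix test. A minimal sketch, assuming POSIX paths; the error messages are illustrative, and the deny list and symlink handling from the plan are not covered here.

```typescript
import { resolve, sep } from "node:path";

// Lexical write guard: resolve the candidate against the project root and
// require the result to sit strictly inside that root.
function resolveWritePath(projectRoot: string, filePath: string): string {
  if (filePath.includes("\0")) {
    throw new Error("Null byte in path");
  }
  const root = resolve(projectRoot);
  const target = resolve(root, filePath);
  // Compare against root + separator so /opt/proj-evil does not pass for /opt/proj.
  const prefix = root.endsWith(sep) ? root : root + sep;
  if (!target.startsWith(prefix)) {
    throw new Error(`Path escapes project root: ${filePath}`);
  }
  return target;
}
```

Because resolve() never touches the filesystem, this works for files that do not exist yet, but it cannot see a symlink inside the root that points outside it. That case from the fuzz list needs a separate filesystem-level check.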

Phase 3 — Frontend: Diff Pane + Chat (v2.0.0)

Goal: Browser UI at coder.indifferentketchup.com with chat pane + diff pane side by side.

Estimated: ~200 LoC

Steps

Create apps/coder/web/ — React + Vite SPA (same stack as BooChat's apps/web/)
- Copy BooChat's Vite config, Tailwind v4 setup, font pipeline
- Shared components: MarkdownRenderer, CodeBlock, Button, Input
- New app shell: sidebar (sessions) + workspace (panes)
Chat pane — reuse BooChat's ChatPane/MessageBubble pattern
- Same WS streaming, same useSessionStream hook, same message rendering
- ActionRow includes tool-call rendering for write tools
Diff pane — NEW (apps/coder/web/src/components/DiffPane.tsx)
- Fetches GET /api/sessions/:id/pending
- Lists pending changes: file path + operation badge (create/edit/delete)
- Per-change: syntax-highlighted unified diff view (use Shiki or a diff-specific highlighter)
- Buttons: Approve / Reject per change, Approve All / Reject All
- Real-time updates via WS frame (pending_change_added, pending_change_applied, etc.)
Workspace splitter — chat left, diff right (or configurable)
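Caddy route — coder.indifferentketchup.com → boocoder:3000 if Caddy joins boocode_net (3000 is the container port), or 100.114.205.53:9502 via the host-side mapping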
- Authelia gating (same as BooChat)
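
The real-time updates in the diff pane can be modelled as a reducer over WS frames. The pending_change_added and pending_change_applied frame names come from the plan. The pending_change_rejected name and every payload shape below are assumptions.

```typescript
type Operation = "create" | "edit" | "delete";

interface PendingItem {
  id: number;
  file_path: string;
  operation: Operation;
  diff: string;
}

// Frame payloads are hypothetical; only the first two type names appear in the plan.
type PendingFrame =
  | { type: "pending_change_added"; change: PendingItem }
  | { type: "pending_change_applied"; change_id: number }
  | { type: "pending_change_rejected"; change_id: number };

// Pure reducer: returns the next list shown in the diff pane.
function reducePending(items: PendingItem[], frame: PendingFrame): PendingItem[] {
  switch (frame.type) {
    case "pending_change_added":
      // Ignore duplicates so a replay after a WS reconnect does not double-list a change.
      return items.some((i) => i.id === frame.change.id) ? items : [...items, frame.change];
    case "pending_change_applied":
    case "pending_change_rejected":
      // Once resolved, the change leaves the pending list.
      return items.filter((i) => i.id !== frame.change_id);
  }
}
```

In DiffPane.tsx this would sit behind a useReducer call or inside the stream hook, seeded from GET /api/sessions/:id/pending.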

Verification

Open coder.indifferentketchup.com in browser
Send a message asking for a code change
See the change appear in the diff pane in real time
Click Approve → file written, change marked applied
Click Reject → change discarded

Phase 4 — Dispatcher + Tasks (v2.0.0 final)

Goal: Task queue works. User can create tasks, dispatcher picks them up and runs them through Path A, the native inference loop.

Estimated: ~150 LoC

Steps

Dispatcher (apps/coder/src/services/dispatcher.ts)
- In-process setInterval poll every 5000 ms over tasks WHERE state='pending' ORDER BY created_at
- For each ready task: mark state='running', run inference with the task's input as the user message
- On completion: mark state='completed'
- On error: mark state='failed'
- On abort: mark state='cancelled'
- Registers an app.addHook('onClose') handler that stops polling and waits for the in-flight task
Task API routes
- POST /api/tasks — create a task {project_id, input, agent?, model?}
- GET /api/tasks — list tasks (filterable by state, project)
- GET /api/tasks/:id — get task details + output_summary
- POST /api/tasks/:id/cancel — cancel a running task
Task → session linkage
- Each task creates its own session + chat for isolation
- Task's pending_changes reference the task_id
- When task completes, its pending_changes are visible in the UI for approval
Agent probing (apps/coder/src/services/agent-probe.ts)
- On startup: which opencode, which goose, which claude, which pi
- Parse version from <agent> --version
- Check ACP support: opencode acp --help exits 0 → supports_acp = true
- Populate available_agents table
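
One dispatcher tick can be sketched over an in-memory task list. The state names come from the plan. The synchronous run callback, the Outcome type, and the numeric created_at are simplifications; the planned dispatcher is async and queries Postgres.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed" | "cancelled";

interface Task {
  id: number;
  input: string;
  state: TaskState;
  created_at: number; // epoch ms here; a timestamp column in the real table
}

// In-memory equivalent of: WHERE state='pending' ORDER BY created_at LIMIT 1
function pickNext(tasks: Task[]): Task | undefined {
  return tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => a.created_at - b.created_at)[0];
}

// Hypothetical result of one inference run.
type Outcome = "ok" | "error" | "abort";

// One tick: claim the oldest pending task, run it, record the terminal state.
function runOne(tasks: Task[], run: (input: string) => Outcome): Task | undefined {
  const task = pickNext(tasks);
  if (!task) return undefined;
  task.state = "running";
  let outcome: Outcome;
  try {
    outcome = run(task.input);
  } catch {
    outcome = "error"; // a thrown error counts as a failed task
  }
  task.state = outcome === "ok" ? "completed" : outcome === "abort" ? "cancelled" : "failed";
  return task;
}
```

The interval callback would call a tick like this every 5000 ms and skip the tick while a task is still in flight.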

Verification

POST /api/tasks {input: "add a /api/version endpoint"} → task created
Dispatcher picks it up → inference runs → edit_file queued → task completes
GET /api/tasks/:id shows state='completed' + output_summary
Pending changes visible in diff pane for approval

Phase 5 — ACP Dispatch (v2.0.1)

Goal: Tasks can be dispatched to external agents via ACP. opencode and goose run as subprocesses, their events flow back into BooCode.

Estimated: ~350 LoC

Steps

ACP client (apps/coder/src/services/acp-client.ts)
- Install: pnpm -C apps/coder add @zed-industries/agent-client-protocol
- spawnAcpAgent(agent: string, task: string, worktree: string, mcpServers: McpConfig[]): AcpSession
- Uses SDK's StdioTransport — spawn opencode acp or goose acp as child
- Pass context_servers for MCP auto-forward
- Event listener: maps ACP events to BooCode's parts taxonomy
ACP event mapping
- file_operation → queue into pending_changes (same as Path A native writes)
- tool_call / tool_result → insert as message_parts in the task's session
- terminal_output → publish as WS frame for BooTerm routing
- permission_request → pause (same mechanism as ask_user_input)
- session_end → task state → completed or failed
Worktree management (apps/coder/src/services/worktrees.ts)
- createWorktree(projectPath, taskId): string — git worktree add /tmp/booworktrees/<taskId> -b task-<taskId> HEAD
- diffWorktree(worktreePath, projectPath): UnifiedDiff[] — git diff HEAD...<worktree-branch>
- cleanupWorktree(worktreePath): void — git worktree remove
- On ACP session end: diff the worktree, queue diffs into pending_changes, cleanup
PTY fallback (apps/coder/src/services/pty-dispatch.ts)
- For agents without ACP (claude, pi, smallcode)
- spawnPtyAgent(agent: string, task: string, worktree: string): PtySession
- Uses node-pty — spawn claude or pi with cwd = worktree
- Capture stdout/stderr into message_parts (kind='text', less structured than ACP)
- On exit: diff worktree → queue pending_changes → cleanup
Dispatcher update — transport selection
- Check available_agents[agent].supports_acp at dispatch time
- ACP-capable → spawnAcpAgent
- PTY fallback → spawnPtyAgent
- Native (no agent specified) → Path A inference loop (Phase 4)
AGENTS.md extensions
- Add execution_strategy: plan | act | research field
- Add expert_model field for cost-routing
- Add output_schema field (optional JSON Schema for structured final output)
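
The transport selection in the dispatcher update reduces to a three-way decision. A sketch assuming available_agents has been loaded into a Map keyed by agent name; the AgentRow shape and the error for an unprobed agent are assumptions.

```typescript
type Transport = "native" | "acp" | "pty";

// Assumed row shape for the available_agents table.
interface AgentRow {
  name: string;
  version: string;
  supports_acp: boolean;
}

function selectTransport(agent: string | undefined, available: Map<string, AgentRow>): Transport {
  // No agent specified: run the native Path A inference loop.
  if (!agent) return "native";
  const row = available.get(agent);
  if (!row) {
    // The probe did not find this agent on startup.
    throw new Error(`Agent not available: ${agent}`);
  }
  return row.supports_acp ? "acp" : "pty";
}
```

Checking at dispatch time rather than at task creation means a task created before an agent upgrade picks up the new capability.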

Verification

Create task with agent: 'opencode' → ACP subprocess spawns
opencode edits files in worktree → events stream into UI
On completion: worktree diff queued in pending_changes
Approve → changes applied to main project
Fallback: create task with agent: 'claude' → PTY captures output → worktree diff queued
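
The worktree steps behind this flow map onto three git invocations. The sketch below only builds the argument arrays. The base path and branch naming come from the plan; the task-id character check and the helper names are assumptions.

```typescript
import { join } from "node:path";

const WORKTREE_BASE = "/tmp/booworktrees";

// Assumed rule: task ids are restricted to a safe charset so they can appear
// in a path and a branch name.
function assertSafeTaskId(taskId: string): void {
  if (!/^[A-Za-z0-9_-]+$/.test(taskId)) {
    throw new Error(`Unsafe task id: ${taskId}`);
  }
}

// git worktree add -b task-<taskId> /tmp/booworktrees/<taskId> HEAD
function worktreeAddArgs(taskId: string): string[] {
  assertSafeTaskId(taskId);
  return ["worktree", "add", "-b", `task-${taskId}`, join(WORKTREE_BASE, taskId), "HEAD"];
}

// git diff HEAD...task-<taskId>
function worktreeDiffArgs(taskId: string): string[] {
  assertSafeTaskId(taskId);
  return ["diff", `HEAD...task-${taskId}`];
}

// git worktree remove <path>
function worktreeRemoveArgs(worktreePath: string): string[] {
  return ["worktree", "remove", worktreePath];
}
```

Passing these arrays to execFile from node:child_process with cwd set to the project path avoids shell interpolation of the task id.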

Phase 6 — MCP Server (v2.0.2)

Goal: BooCoder exposes its own primitives as MCP tools. External opencode sessions in Termius can drive the task queue.

Estimated: ~250 LoC

Steps

MCP server (apps/coder/src/services/mcp-server.ts)
- Use @modelcontextprotocol/sdk server-side (Server class)
- Stdio transport (read from stdin, write to stdout)
- Entry point: boocoder --mcp CLI flag starts the MCP server instead of the HTTP server
Tool handlers (6 tools)
- boocoder.create_task → INSERT into tasks table, return task_id
- boocoder.list_pending_changes → SELECT from pending_changes WHERE session matches
- boocoder.apply → call applyOne(change_id)
- boocoder.reject → call rejectOne(change_id)
- boocoder.dispatch_external_agent → create task with agent specified, return task_id
- boocoder.list_worktrees → list active worktrees from tasks WHERE worktree_path IS NOT NULL AND state='running'
10-question eval (per anthropics/skills/mcp-builder framework)
- Write 10 independent, read-only, verifiable questions about the BooCoder state
- Run eval: echo '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"boocoder.list_pending_changes","arguments":{}},"id":1}' | boocoder --mcp
- All 10 must return correct answers
opencode integration test
- Add BooCoder as an MCP server in ~/.opencode/config.json:
```
{"mcpServers": {"boocoder": {"type": "stdio", "command": "boocoder", "args": ["--mcp"]}}}
```
- From opencode: call boocoder.create_task → verify task appears in BooCoder UI
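
For the automated eval, the JSON-RPC framing can be generated rather than hand-written. This sketch builds a tools/call request line and unwraps a response line; the helper names are hypothetical. The MCP stdio transport is newline-delimited, and a spec-compliant client sends an initialize request before any tools/call, which this fragment leaves out.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

// Build one newline-terminated tools/call message for the stdio transport.
function toolsCallRequest(id: number, name: string, args: Record<string, unknown> = {}): string {
  const request: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  return JSON.stringify(request) + "\n";
}

// Parse one response line; surface JSON-RPC errors as exceptions.
function parseResponse(line: string): unknown {
  const message = JSON.parse(line) as JsonRpcResponse;
  if (message.error) {
    throw new Error(`MCP error ${message.error.code}: ${message.error.message}`);
  }
  return message.result;
}
```

Each of the 10 eval questions could then be a tool name, its arguments, and a predicate over the parsed result.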

Verification

echo '...' | boocoder --mcp returns valid MCP responses
10-question eval passes
opencode can drive BooCoder's task queue via MCP

Phase 7 — CLI + Polish (v2.0.3)

Goal: boocode CLI client, human inbox UI, cost tracking, observation hooks.

Estimated: ~400 LoC

Steps

CLI client (apps/coder/src/cli.ts)
- Thin HTTP/WS client against BooCoder API
- boocode run "task description" → POST /api/tasks → stream output via WS
- boocode ls → GET /api/tasks → formatted table
- boocode attach <id> → WS subscribe to task's session → stream live
- boocode send <id> "message" → POST message to task's session chat
- Build as a standalone binary via pkg, or as a single-file bundle via esbuild --bundle
Human inbox UI (frontend)
- New route: /inbox → shows tasks WHERE state IN ('blocked', 'failed')
- Per-task: view output, retry (reset state to pending), cancel, reassign agent
- Badge on sidebar showing count of inbox items
Cost tracking
- tasks.cost_tokens populated from inference usage callback (same as BooChat's tokens_used)
- Summary API: GET /api/stats/costs?group_by=project|agent|day → aggregated token spend
- Simple UI: cost badge on each task, totals in settings
Observation hooks (budi taxonomy)
- Emit 5 event types on the BooCoder WS protocol for dispatched agents:
  - session_start — agent spawned
  - user_prompt_submit — task spec delivered
  - post_tool_use — each tool call completed
  - subagent_start — nested dispatch (Boomerang)
  - stop — agent finished
- Consumed by frontend for real-time status indicators
Boomerang new_task tool (subagent isolation)
- When an agent's toolset includes new_task:
  - Creates a child task (fresh session, fresh context)
  - Child runs to completion
  - Parent gets only attempt_completion summary
- Orchestrator agent profile: tools = [new_task, list_tasks, check_task_status] ONLY
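
The isolation rule for new_task can be shown with two contexts and a single bridge. A sketch under assumed names: TaskContext, runChildTask, and the synchronous run callback are hypothetical. The rule that only the summary returns to the parent is from the plan.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical per-task context: one session id plus its message history.
interface TaskContext {
  session_id: string;
  messages: string[];
}

interface ChildResult {
  summary: string;
}

function runChildTask(
  parent: TaskContext,
  spec: string,
  run: (child: TaskContext) => string,
): ChildResult {
  // Fresh session and history: nothing from the parent is copied in except the task spec.
  const child: TaskContext = { session_id: randomUUID(), messages: [spec] };
  const summary = run(child);
  // Only the summary crosses back; the child's messages stay in its own session.
  parent.messages.push(`new_task result: ${summary}`);
  return { summary };
}
```

Because the parent is never handed the child context object, it has no code path for reading the child's history.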

Verification

boocode run "add health endpoint" from terminal → task runs → output streams → diff queued
boocode ls shows task list with states + cost
Inbox shows failed tasks, retry works
Boomerang: orchestrator creates subtask → subtask runs isolated → parent gets summary only
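
The group_by parameter of the cost summary API amounts to a keyed sum over tasks.cost_tokens. An in-memory sketch; the TaskCost shape, the ISO-8601 created_at string, and the 'native' label for tasks with no agent are assumptions.

```typescript
type GroupBy = "project" | "agent" | "day";

// Assumed subset of the tasks row needed for cost aggregation.
interface TaskCost {
  project_id: string;
  agent: string | null;
  created_at: string; // ISO-8601 timestamp
  cost_tokens: number;
}

function groupKey(task: TaskCost, groupBy: GroupBy): string {
  switch (groupBy) {
    case "project":
      return task.project_id;
    case "agent":
      return task.agent ?? "native"; // no agent means the native inference loop
    case "day":
      return task.created_at.slice(0, 10); // YYYY-MM-DD prefix of the timestamp
  }
}

// Sum token spend per group, as GET /api/stats/costs?group_by=... would.
function aggregateCosts(tasks: TaskCost[], groupBy: GroupBy): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const task of tasks) {
    const key = groupKey(task, groupBy);
    totals[key] = (totals[key] ?? 0) + task.cost_tokens;
  }
  return totals;
}
```

In the service itself the same grouping would more likely be a SQL GROUP BY; the in-memory version is handy for unit-testing the route's output shape.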

Phase 8 — Hardening + Ship (v2.0.x)

Goal: Security hardening, integration tests, documentation, production deploy.

Estimated: ~100 LoC (mostly tests + docs)

Steps

Path-guard fuzz suite — property tests for every traversal pattern:
- ../ sequences (all depths)
- Symlink outside project root
- Null bytes in path
- Unicode normalization attacks
- Race conditions (TOCTOU between validate + write)
- MCP-served filesystem writes routed through pending_changes
Integration tests
- End-to-end: create task → inference → edit_file → apply → file written → verify content
- ACP dispatch: mock opencode → events flow → pending_changes queued
- MCP server: 10-question eval automated in CI
Documentation
- BOOCODER.md finalized (container guidance)
- CLAUDE.md updated with BooCoder architecture section
- boocode_roadmap.md v2.0 retrospective
- CHANGELOG.md entries for each sub-version
Production deploy
- Caddy config: coder.indifferentketchup.com
- Authelia: same SSO group as BooChat
- Smoke: full workflow (chat → edit → approve → verify)
Tag — v2.0.0 (or v2.0.0-rc1 if Sam wants a bake period)
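
The "../ sequences (all depths)" item of the fuzz suite can be generated rather than enumerated by hand. The sketch below produces traversal candidates and reports any that a guard lets through. The small lexicalGuard is a stand-in so the block runs on its own (the real target would be write_guard.ts), and the candidate shapes are illustrative.

```typescript
import { resolve, sep } from "node:path";

// A guard returns the resolved path or throws when the path is rejected.
type Guard = (root: string, filePath: string) => string;

// Stand-in guard: lexical resolve + prefix check only.
const lexicalGuard: Guard = (root, filePath) => {
  const base = resolve(root);
  const target = resolve(base, filePath);
  if (!target.startsWith(base + sep)) throw new Error("escape");
  return target;
};

// Build escaping candidates for every depth from 1 to maxDepth.
function traversalCases(maxDepth: number): string[] {
  const cases: string[] = [];
  for (let depth = 1; depth <= maxDepth; depth++) {
    const ups = "../".repeat(depth);
    cases.push(`${ups}etc/passwd`); // plain climb
    cases.push(`./${ups}etc/passwd`); // leading ./
    cases.push(`src/${ups}${ups}etc/passwd`); // descend once, then climb further
  }
  return cases;
}

// Return every candidate the guard failed to reject.
function findBypasses(guard: Guard, root: string, maxDepth: number): string[] {
  return traversalCases(maxDepth).filter((candidate) => {
    try {
      guard(root, candidate);
      return true; // accepted an escaping path: a bypass
    } catch {
      return false; // rejected, as expected
    }
  });
}
```

A CI assertion would then be that findBypasses against the production guard returns an empty array.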

Execution order summary

Phase 1 (foundation)     → v2.0.0-alpha   ~200 LoC   container boots
Phase 2 (write tools)    → v2.0.0-beta    ~400 LoC   inference + pending_changes
Phase 3 (frontend)       → v2.0.0         ~200 LoC   chat + diff panes
Phase 4 (dispatcher)     → v2.0.0-final   ~150 LoC   task queue + native dispatch
Phase 5 (ACP dispatch)   → v2.0.1         ~350 LoC   external agents + worktrees
Phase 6 (MCP server)     → v2.0.2         ~250 LoC   boocoder.* tools + eval
Phase 7 (CLI + polish)   → v2.0.3         ~400 LoC   CLI + inbox + hooks + Boomerang
Phase 8 (hardening)      → v2.0.x         ~100 LoC   fuzz + integration tests + docs
                                          --------
                                          ~2050 LoC total

Each phase is independently dispatchable. Phases 1-4 are sequential (each needs the prior). Phases 5-7 are parallelizable after Phase 4 ships (they're independent protocol surfaces). Phase 8 gates the production tag.

Risk register

| Risk | Mitigation |
|---|---|
| Path-guard bypass → arbitrary writes | Pending-changes double-validates (at queue time + apply time). Fuzz suite in Phase 8. OpenHands sandbox (v2.1) as fallback. |
| ACP spec instability (remote transport WIP) | Use stdio only. No remote ACP in v2.0. |
| node-pty native compilation breaks in Docker | bookworm-slim + glibc matches booterm's working config. Pin node-pty version. |
| Worktree cleanup failure → disk bloat | 30-min idle timeout sweeper. `git worktree prune` on startup. |
| DB rename breaks existing sessions | One-time migration with explicit backup. BooChat/BooTerm URLs unchanged. |
| MCP server eval failure | Ship stdio MCP server only after 10/10 eval passes. |
| Boomerang context leak (child leaks state to parent) | Architectural enforcement: child's session_id ≠ parent's. Summary field is the ONLY bridge. |
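
The 30-minute idle sweeper named for the worktree-cleanup risk needs little more than a timestamp comparison. A sketch assuming each worktree records a last-activity time; the WorktreeEntry shape and field names are hypothetical.

```typescript
// Hypothetical bookkeeping record for one active worktree.
interface WorktreeEntry {
  path: string;
  last_activity_ms: number; // epoch ms of the last agent event
}

const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30-minute idle timeout from the risk register

// Return the paths of worktrees that have been idle for at least the timeout.
function staleWorktrees(
  entries: WorktreeEntry[],
  nowMs: number,
  timeoutMs: number = IDLE_TIMEOUT_MS,
): string[] {
  return entries
    .filter((entry) => nowMs - entry.last_activity_ms >= timeoutMs)
    .map((entry) => entry.path);
}
```

The sweeper would pass each returned path to the worktree cleanup routine; taking nowMs as a parameter keeps the function deterministic under test.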
