indifferentketchup 531d39ace9 v2.0 proposal update: add AGENTS.md extensions, Boomerang pattern, observation hooks, follow-up batches

Additions from second pass of boocode_code_review.md:

- AGENTS.md extensions: output_schema, exit_expression, execution_strategy
  (qodo-ai/agents MIT), expert_model escape hatch (RA.Aid Apache-2.0)
- Subagent isolation via Boomerang Tasks pattern: orchestrator-only-dispatches,
  down-pass/up-pass context discipline, fresh session per subtask
- Observation hooks: 5-event taxonomy from budi (SessionStart, UserPromptSubmit,
  PostToolUse, SubagentStart, Stop) mapped to WS frames
- Follow-up batches table: PR-resolver, HMAC audit log, blind-validation gate,
  majority-vote ensembler, drift detection, anti-slop, globstar gate, Docker
  sandbox, multi-provider LLM
- Additional repo to clone: qodo-ai/agents for agent.toml schema reference

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-24 23:22:57 +00:00

v2.0 — BooCoder

Major version bump. New app apps/coder/ inside the existing monorepo. Lands together with the boocode_db → boochat_db DB rename and the per-app subdomain split (code.indifferentketchup.com → BooChat, coder.indifferentketchup.com → BooCoder).

What BooCoder is

A write-capable coding agent surface. Two execution paths, same UI:

Path A (native): BooCode's own inference loop with write tools (edit_file, create_file, delete_file). Edits queue in pending_changes — nothing touches disk until user approves via /apply.
Path B (dispatch): Shells out to external CLI agents (opencode, goose, claude, pi) via ACP (preferred) or raw PTY (fallback). One git worktree per dispatch. Captures events into the same parts taxonomy.

Both paths feed the same task DAG, same project registry, same pending-changes queue, same UI.

Why now

v1.x proved the read-only loop works end-to-end: inference, tool dispatch, streaming, compaction, MCP client, outer loop, step caps, artifact rendering. The infrastructure is stable. The jump from "read-only chat" to "write-capable agent orchestrator" is the remaining gap between BooCode and a real development environment.

Architecture

Three protocol roles (locked 2026-05-22)

MCP client (write-capable allowed). Inherits v1.15 client. Write-capable MCP servers (e.g. @modelcontextprotocol/server-filesystem) route writes through pending_changes. Per-task allow/deny means dispatched tasks can have a different MCP roster.
MCP server (BooCoder's own primitives). Exposes boocoder.create_task, boocoder.list_pending_changes, boocoder.apply, boocoder.reject, boocoder.dispatch_external_agent, boocoder.list_worktrees as MCP tools. Stdio transport for local consumers (Sam's opencode in Termius); HTTP deferred until OAuth + secret storage.
ACP client (host). Spawns opencode acp and goose acp as JSON-RPC stdio subprocesses. Maps ACP events (file operations, tool calls, terminal output) to BooCode's parts taxonomy. MCP servers configured in BooCoder are auto-forwarded to the dispatched agent (per goose docs — context_servers is the field).

Container layout (post-v2.0)

Container	Port	Mount	Purpose
`boochat` (was `boocode`)	`100.114.205.53:9500`	`/opt:/opt:ro`	Read-only chat + MCP client
`booterm`	`100.114.205.53:9501`	`/opt:/opt:rw`	PTY/tmux terminal
`boocoder`	`100.114.205.53:9502`	`/opt:/opt:rw` (policy-gated)	Write tools + ACP host + MCP client + MCP server
`boochat_db` (was `boocode_db`)	`127.0.0.1:5500`	`boocode_pgdata`	Shared Postgres 16
`codecontext`	internal `:8080`	`/opt:/opt:ro`	Analysis sidecar (shared)

Caddy routing

code.indifferentketchup.com    → boochat:9500
coder.indifferentketchup.com   → boocoder:9502
term.indifferentketchup.com    → booterm:9501 (or routed under code.*/term/)

Schema (new tables)

-- Tasks: the dispatch DAG (created first because pending_changes references it)
CREATE TABLE IF NOT EXISTS tasks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id UUID NOT NULL REFERENCES projects(id),
  parent_task_id UUID REFERENCES tasks(id),
  state TEXT NOT NULL DEFAULT 'pending'
    CHECK (state IN ('pending', 'running', 'completed', 'failed', 'blocked', 'cancelled')),
  input TEXT NOT NULL,
  output_summary TEXT,
  agent TEXT,
  model TEXT,
  execution_path TEXT CHECK (execution_path IN ('native', 'acp', 'pty')),
  worktree_path TEXT,
  cost_tokens INTEGER,
  started_at TIMESTAMPTZ,
  ended_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

-- Pending changes: queued writes before /apply
CREATE TABLE IF NOT EXISTS pending_changes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES sessions(id),
  task_id UUID REFERENCES tasks(id),
  file_path TEXT NOT NULL,
  operation TEXT NOT NULL CHECK (operation IN ('create', 'edit', 'delete')),
  diff TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'applied', 'rejected', 'reverted')),
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

-- Available agents: probed at startup
CREATE TABLE IF NOT EXISTS available_agents (
  name TEXT PRIMARY KEY,
  install_path TEXT,
  version TEXT,
  supports_acp BOOLEAN NOT NULL DEFAULT false,
  supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
  last_probed_at TIMESTAMPTZ
);

-- Human inbox: tasks needing attention
-- OR REPLACE keeps the migration re-runnable, matching the IF NOT EXISTS tables above
CREATE OR REPLACE VIEW human_inbox AS
  SELECT * FROM tasks WHERE state IN ('blocked', 'failed');

task_templates and pipelines are deferred to v2.1: they add overhead that a single-user deployment does not need yet. The v2.0 core is tasks + pending_changes + available_agents.
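As a rough illustration of how application code might mirror this schema, the sketch below models a subset of the tasks columns and reproduces the human_inbox view as an in-memory filter. The type and function names (TaskRow, humanInbox) are assumptions made for this example, not names from the codebase.

```typescript
// Row shape mirroring a subset of the `tasks` table (names are illustrative).
type TaskState = "pending" | "running" | "completed" | "failed" | "blocked" | "cancelled";

interface TaskRow {
  id: string;
  parentTaskId: string | null;
  state: TaskState;
  input: string;
  outputSummary: string | null;
}

// In-memory equivalent of the human_inbox view: tasks that need attention.
function humanInbox(tasks: TaskRow[]): TaskRow[] {
  return tasks.filter((t) => t.state === "blocked" || t.state === "failed");
}

const sample: TaskRow[] = [
  { id: "t1", parentTaskId: null, state: "running", input: "refactor auth", outputSummary: null },
  { id: "t2", parentTaskId: "t1", state: "failed", input: "update tests", outputSummary: null },
  { id: "t3", parentTaskId: "t1", state: "blocked", input: "rename module", outputSummary: null },
];

// Lists the ids of the failed and blocked tasks.
console.log(humanInbox(sample).map((t) => t.id));
```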

Path A — Native write tools

Tools

Tool	Description
`edit_file`	Replace `old_string` with `new_string` in an existing file. Input: `{file_path, old_string, new_string}`. The server computes the unified diff and queues it in `pending_changes` with `operation='edit'`.
`create_file`	Create a new file. Input: `{file_path, content}`. Queues as `operation='create'`.
`delete_file`	Delete a file. Input: `{file_path}`. Queues as `operation='delete'`.
`apply_pending`	Flush all pending changes for the current session to disk. Path-guarded.
`rewind`	Revert a specific applied change or all changes since a checkpoint.

Path guard for writes

Same pathGuard() function from BooChat, but with a write-path variant:

resolveWritePath(projectRoot, requested) — uses resolve() (not realpath(), since the file may not exist yet for creates), then verifies the result starts with projectRoot + sep.
Deny list: everything in secret_guard.ts (.env, *.pem, etc.) — can't write to those either.
Defense-in-depth: the pending_changes queue means even a path-guard bypass only queues; it doesn't hit disk until /apply (which re-validates).
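The write-path check described above could look roughly like the following. This is a minimal sketch, not the actual BooChat pathGuard(): the deny patterns stand in for whatever secret_guard.ts really contains, and only the resolveWritePath name and the resolve() plus projectRoot + sep check come from the text.

```typescript
import { basename, resolve, sep } from "node:path";

// Illustrative deny list only; the real rules live in secret_guard.ts.
const DENIED_BASENAMES: RegExp[] = [/^\.env(\..*)?$/, /\.pem$/];

function resolveWritePath(projectRoot: string, requested: string): string {
  const root = resolve(projectRoot);
  // resolve() normalizes ".." segments without touching the filesystem,
  // so it also works for files that do not exist yet.
  const target = resolve(root, requested);
  // Appending sep stops a sibling such as "<root>-evil" from passing the prefix check.
  if (!target.startsWith(root + sep)) {
    throw new Error(`path escapes project root: ${requested}`);
  }
  if (DENIED_BASENAMES.some((re) => re.test(basename(target)))) {
    throw new Error(`path is on the secret deny list: ${requested}`);
  }
  return target;
}
```

Because resolve() is purely lexical, it does not follow symlinks. A symlinked directory inside the project that points elsewhere would pass this check, which is one reason the re-validation at /apply matters.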

Diff format

Standard unified diff (what git diff produces). The edit_file tool takes old_string / new_string (same as Claude Code's edit tool — the model is trained on this shape). Server computes the unified diff for storage in pending_changes.diff.
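A sketch of the replacement step that would precede diff computation is shown below. Requiring exactly one occurrence of old_string is an assumption made here to keep edits unambiguous; the source does not say how multiple matches are handled. The applyEdit name is likewise hypothetical.

```typescript
interface EditInput {
  file_path: string;
  old_string: string;
  new_string: string;
}

// Apply an edit_file request to in-memory content and return the new content.
// ASSUMPTION: the edit is rejected unless old_string occurs exactly once.
function applyEdit(content: string, edit: EditInput): string {
  if (edit.old_string === "") throw new Error("old_string must not be empty");
  const first = content.indexOf(edit.old_string);
  if (first === -1) throw new Error(`old_string not found in ${edit.file_path}`);
  if (content.indexOf(edit.old_string, first + 1) !== -1) {
    throw new Error(`old_string is ambiguous in ${edit.file_path}`);
  }
  // Slicing avoids String.prototype.replace, which would interpret "$1"-style
  // sequences in new_string as substitution patterns.
  return content.slice(0, first) + edit.new_string + content.slice(first + edit.old_string.length);
}
```

The before and after strings are then what the server would hand to a diff routine to produce the unified diff stored in pending_changes.diff.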

UI: per-pane diff viewer

Frontend pane type pending_changes in BooCoder's workspace. Shows:

List of queued changes with file path + operation
Per-change diff view (syntax-highlighted, side-by-side or unified)
Approve / Reject per change, or Approve All / Reject All

Path B — External agent dispatch

dispatch_external_agent tool

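// Input for the dispatch_external_agent tool (the type name is illustrative).
type DispatchExternalAgentInput = {
  agent: 'opencode' | 'claude' | 'goose' | 'pi';
  model: string;        // e.g. 'claude-opus-4-7'
  task: string;         // natural-language task description
  worktree?: string;    // optional; a worktree is auto-created if omitted
};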

Transport selection

Dispatcher checks available_agents.supports_acp at runtime:

ACP (preferred): opencode acp, goose acp — JSON-RPC stdio. Native session lifecycle, file-operation events, terminal events, permission prompts.
PTY (fallback): claude, pi, smallcode — raw terminal capture via node-pty. Captures stdout/stderr/exit-code into PostgreSQL. Less structured than ACP.
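The runtime check can be sketched as a lookup against the probed available_agents rows. The selectTransport name and the choice to throw for unknown agents are assumptions for illustration.

```typescript
type Transport = "acp" | "pty";

// Subset of an available_agents row.
interface AvailableAgent {
  name: string;
  supports_acp: boolean;
}

// Prefer ACP when the probed agent supports it; otherwise fall back to raw PTY.
function selectTransport(agents: AvailableAgent[], name: string): Transport {
  const agent = agents.find((a) => a.name === name);
  if (!agent) throw new Error(`agent not installed or not probed: ${name}`);
  return agent.supports_acp ? "acp" : "pty";
}

const probed: AvailableAgent[] = [
  { name: "opencode", supports_acp: true },
  { name: "claude", supports_acp: false },
];
console.log(selectTransport(probed, "opencode"), selectTransport(probed, "claude"));
```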

Worktree management

Each dispatched task gets its own git worktree:

git worktree add /tmp/booworktrees/<task-id> -b task-<task-id> HEAD

On completion: diff the worktree against the commit it was created from (HEAD at dispatch time), queue the diff into pending_changes for the same task, then remove the worktree. The user approves or rejects the diff the same way as in Path A.
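One way to express the worktree lifecycle is as a list of git invocations built up front, which keeps the command shapes testable without a repository. This is a sketch under assumptions: the worktreePlan helper, the baseCommit parameter (the commit HEAD pointed at when the task was dispatched), and staging with `git add -A` so that newly created files appear in the diff are all choices made for this example, not details from the proposal.

```typescript
interface GitStep {
  cwd: string;     // directory to run git in
  args: string[];  // arguments passed to the git binary
}

function worktreePlan(repoRoot: string, taskId: string, baseCommit: string): GitStep[] {
  // Validate inputs so they cannot be read as options or path traversal.
  if (!/^[0-9a-f-]+$/i.test(taskId)) throw new Error("task id must be UUID-like");
  if (!/^[0-9a-f]{7,40}$/i.test(baseCommit)) throw new Error("base commit must be a hex SHA");
  const path = `/tmp/booworktrees/${taskId}`;
  return [
    // 1. Create the worktree on its own branch.
    { cwd: repoRoot, args: ["worktree", "add", path, "-b", `task-${taskId}`, "HEAD"] },
    // 2. After the agent finishes, stage everything so new files are included.
    { cwd: path, args: ["add", "-A"] },
    // 3. Diff the staged state against the base commit.
    { cwd: path, args: ["diff", "--cached", baseCommit] },
    // 4. Remove the worktree once the diff is queued.
    { cwd: repoRoot, args: ["worktree", "remove", "--force", path] },
  ];
}
```

Each step could then be run with `execFileSync("git", step.args, { cwd: step.cwd })` from node:child_process, which avoids shell interpolation of the task id.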

ACP event mapping

ACP events → BooCode parts taxonomy:

file_operation → tool_call part (name: acp_edit_file) + tool_result part
tool_call → tool_call part (preserves name)
terminal_output → routes into BooTerm pane
permission_request → pause inference (same mechanism as ask_user_input)
session_end → task state → completed or failed
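The mapping above can be sketched as a single function from an ACP event to the effects BooCoder applies. The event kinds and the acp_edit_file name come from the list; the payload fields (path, result, args, data, prompt, ok) and the Effect type are hypothetical and are not the ACP SDK's actual types.

```typescript
// Hypothetical event payloads; only the `kind` values come from the mapping above.
type AcpEvent =
  | { kind: "file_operation"; path: string; result: string }
  | { kind: "tool_call"; name: string; args: unknown }
  | { kind: "terminal_output"; data: string }
  | { kind: "permission_request"; prompt: string }
  | { kind: "session_end"; ok: boolean };

type Effect =
  | { type: "part"; part: "tool_call" | "tool_result"; name: string; payload: unknown }
  | { type: "terminal"; data: string }
  | { type: "pause"; prompt: string }
  | { type: "task_state"; state: "completed" | "failed" };

function mapAcpEvent(ev: AcpEvent): Effect[] {
  switch (ev.kind) {
    case "file_operation":
      // One ACP file operation becomes a tool_call part plus its tool_result part.
      return [
        { type: "part", part: "tool_call", name: "acp_edit_file", payload: { path: ev.path } },
        { type: "part", part: "tool_result", name: "acp_edit_file", payload: ev.result },
      ];
    case "tool_call":
      return [{ type: "part", part: "tool_call", name: ev.name, payload: ev.args }];
    case "terminal_output":
      return [{ type: "terminal", data: ev.data }];
    case "permission_request":
      return [{ type: "pause", prompt: ev.prompt }];
    case "session_end":
      return [{ type: "task_state", state: ev.ok ? "completed" : "failed" }];
  }
}
```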

MCP server auto-forward

Per goose docs, context_servers field in the ACP session config auto-forwards BooCoder's configured MCP servers to the dispatched agent. One MCP config drives every agent.

Dispatcher worker

Background process (or in-process setInterval for v2.0 simplicity) that:

Queries tasks WHERE state = 'pending' ORDER BY created_at
For each ready task (no unmet dependencies):
- Mark state = 'running'
- Resolve execution path (Path A if no agent specified, Path B if agent specified)
- Path A: run the inference loop with write tools enabled
- Path B: spawn ACP/PTY subprocess, stream events into parts
- On completion: mark state = 'completed' or 'failed'
- Queue output diff into pending_changes
On failure: mark state = 'failed', surface in human_inbox view
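A single dispatcher tick could be sketched as below. Two things here are assumptions rather than part of the proposal: the dependsOn field (the tasks table above has no dependency column, so this stands in for however "unmet dependencies" ends up being modeled), and the synchronous run callback, which replaces the real async inference loop or subprocess for the sake of a compact example.

```typescript
type State = "pending" | "running" | "completed" | "failed" | "blocked" | "cancelled";
type ExecPath = "native" | "dispatch";

interface QueuedTask {
  id: string;
  state: State;
  agent: string | null;  // null means Path A (native)
  createdAt: number;
  dependsOn: string[];   // HYPOTHETICAL: not a column in the tasks table
}

// Pending tasks whose dependencies have all completed, oldest first.
function readyTasks(tasks: QueuedTask[]): QueuedTask[] {
  const done = new Set(tasks.filter((t) => t.state === "completed").map((t) => t.id));
  return tasks
    .filter((t) => t.state === "pending" && t.dependsOn.every((d) => done.has(d)))
    .sort((a, b) => a.createdAt - b.createdAt);
}

// One dispatcher tick: claim each ready task, run it, record the outcome.
function tick(tasks: QueuedTask[], run: (task: QueuedTask, path: ExecPath) => boolean): void {
  for (const task of readyTasks(tasks)) {
    task.state = "running";
    const path: ExecPath = task.agent === null ? "native" : "dispatch";
    let ok = false;
    try {
      ok = run(task, path);
    } catch {
      ok = false; // a thrown error counts as failure and lands in human_inbox
    }
    task.state = ok ? "completed" : "failed";
  }
}
```

Readiness is computed once at the start of each tick, so a task unblocked by a sibling's completion is picked up on the next tick rather than the current one.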

BooCoder MCP server

Exposes BooCoder's primitives as MCP tools so external agents (Sam's opencode in Termius) can drive the task queue:

MCP Tool	Description
`boocoder.create_task`	Create a new task in the queue
`boocoder.list_pending_changes`	List queued changes awaiting approval
`boocoder.apply`	Apply a specific pending change
`boocoder.reject`	Reject a pending change
`boocoder.dispatch_external_agent`	Dispatch a task to an external agent
`boocoder.list_worktrees`	List active git worktrees

Stdio transport for local consumers. HTTP transport deferred until OAuth + secret storage.
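To make the shape of the server concrete, here is a hand-rolled sketch of handling an MCP `tools/call` request for two of the tools above. It deliberately does not use any MCP SDK, and the in-memory store, handler bodies, and argument names (such as `id`) are assumptions for illustration; a real server would likely sit on an SDK and the pending_changes table.

```typescript
type ChangeStatus = "pending" | "applied" | "rejected";

// Stand-in for the pending_changes table.
const changes = new Map<string, { file_path: string; status: ChangeStatus }>([
  ["c1", { file_path: "src/app.ts", status: "pending" }],
]);

type ToolHandler = (args: Record<string, unknown>) => unknown;

// A Map avoids accidental hits on Object.prototype keys such as "toString".
const tools = new Map<string, ToolHandler>([
  ["boocoder.list_pending_changes", () =>
    Array.from(changes.entries())
      .filter(([, c]) => c.status === "pending")
      .map(([id, c]) => ({ id, file_path: c.file_path }))],
  ["boocoder.reject", (args) => {
    const change = changes.get(String(args.id));
    if (!change || change.status !== "pending") throw new Error("no such pending change");
    change.status = "rejected";
    return { id: String(args.id), status: change.status };
  }],
]);

interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments?: Record<string, unknown> };
}

interface ToolsCallResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { content: { type: "text"; text: string }[]; isError: boolean };
  error?: { code: number; message: string };
}

function handleToolsCall(req: ToolsCallRequest): ToolsCallResponse {
  const handler = tools.get(req.params.name);
  if (!handler) {
    // Unknown tools are a protocol-level error (JSON-RPC invalid params).
    return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: `Unknown tool: ${req.params.name}` } };
  }
  try {
    const value = handler(req.params.arguments ?? {});
    return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text: JSON.stringify(value) }], isError: false } };
  } catch (err) {
    // Tool execution failures are reported inside the result with isError set.
    return { jsonrpc: "2.0", id: req.id, result: { content: [{ type: "text", text: String(err) }], isError: true } };
  }
}
```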

Eval requirement: run the server through the anthropics/skills mcp-builder 10-question evaluation framework before shipping.

Code lifts

Primary architectural template

Dominic789654/agent-hub (Apache-2.0) — task DAG schema, dispatcher worker, project registry, human inbox. Three-process model (board server + dispatcher + assistant terminal). BooCode adapts this into a single-process Fastify app (v2.0.0) with the dispatcher as an in-process worker.

Pending-changes UX

plandex-ai/plandex (MIT) — diff/apply/rewind vocabulary. The pending_changes queue concept, per-file diff view, approve/reject UI pattern. No code lifted — schema and UX design only.

ACP client

agentclientprotocol.com spec + @zed-industries/agent-client-protocol SDK (Apache-2.0) — local-subprocess ACP via stdio JSON-RPC. The SDK handles framing; BooCode maps events to its parts taxonomy.

goose docs (goose-docs.ai/docs/guides/acp-clients/) — context_servers auto-forward pattern. Critical: one MCP config drives every dispatched agent.

MCP server

anthropics/skills/mcp-builder (MIT) — 4-phase build workflow + 10-question evaluation framework for validating the MCP server before shipping.

Dispatcher pattern

Paseo (getpaseo/paseo) — AGPL-3.0, design only, no code lift. Daemon+clients architecture, --worktree flag, CLI verb shape (run/ls/attach/send). BooCode reproduces the architecture using only license-clean patterns.

Roo Code Boomerang Tasks — orchestrator with intentional capability restriction. Down-pass/up-pass context discipline (new_task message, attempt_completion result, no implicit inheritance). Explicit precedence override clause.

Write-tool security

opencode permission/evaluate.ts — wildcard permission ruleset (already lifted in v1.15). Extended in v2.0 to gate write tools.

covibes/zeroshot — blind-validation invariant. Verify gate runs in a separate agent context that only sees the diff and acceptance criteria, not the producing conversation. v2.0+ optional batch.

Sub-versions

Version	Scope
v2.0.0	Schema + Path A (native write tools + pending-changes queue + diff UI) + basic dispatcher
v2.0.1	Path B (ACP client for opencode/goose + PTY fallback for claude/pi + worktree management)
v2.0.2	BooCoder MCP server (stdio transport, `boocoder.*` tools, eval framework)
v2.0.3	Polish: `boocode` CLI (`run/ls/attach/send`), human_inbox UI, cost tracking

Dependencies

v1.13 ✅ (parts table — the event taxonomy for everything)
v1.14 ✅ (outer loop + step boundaries for future revert snapshots)
v1.14.x-mcp ✅ (MCP client PoC — proves the protocol)
v1.15 ✅ (full MCP client + tool globs — write-capable MCP servers route through pending_changes)
v1.16 ✅ (codesight merge — codecontext now has blast-radius for impact analysis)

All dependencies shipped. v2.0 is unblocked.

Estimate

v2.0.0: ~800 LoC (schema + write tools + pending-changes service + diff pane + dispatcher skeleton)
v2.0.1: ~600 LoC (ACP client + PTY dispatch + worktree management + event mapping)
v2.0.2: ~400 LoC (MCP server + 6 tool handlers + stdio transport + eval)
v2.0.3: ~400 LoC (CLI client + inbox UI + cost aggregation)
Total: ~2200 LoC across 4 sub-versions

Hard rules

BooChat stays read-only. BooCoder is the only surface with write tools.
Path-guard correctness is the #1 test target. Fuzz against every traversal pattern.
Pending-changes queue gates ALL writes (native + MCP). Nothing touches disk without user approval (or explicit auto-apply flag per task).
One shared database. Cross-surface joins are valuable (task → chat → terminal debugging session).
External CLI agents on the host, not in containers. BooCoder shells out via local-exec.
No OAuth in v2.0. MCP server is stdio-only until secret storage lands.
DB rename boocode_db → boochat_db lands with v2.0.0 (one-time migration).

AGENTS.md extensions (v2.0.0)

Ported from the qodo-ai/agents (MIT) agent.toml schema and the ai-christianson/RA.Aid (Apache-2.0) three-stage pattern:

Field	Type	Purpose	Source
`steps`	number	Per-agent step cap (already shipped v1.14.0)	opencode
`output_schema`	JSON Schema	Structured output constraint for the agent's final response	qodo-ai/agents
`exit_expression`	string	Regex/predicate — when the agent considers itself done	qodo-ai/agents
`execution_strategy`	`plan` | `act` | `research`	Which phase of the RA.Aid three-stage pattern this agent operates in	qodo-ai/agents + RA.Aid
`model`	string	Per-agent model override (already shipped v1.8)	—
`expert_model`	string	Escalation model for hard reasoning (RA.Aid "expert tool" escape hatch)	RA.Aid

The three-stage pattern maps to BooCoder's use case:

Research agent (cheap model) → understand the task, find relevant files
Planning agent (standard model) → decide which files to edit, what the changes look like
Implementation agent (full model) → produce the actual diffs

expert_model is the escape hatch: a routine model handles most subtasks, but can call the expert model (e.g. qwopus27b) when stuck. Matches Sam's existing cost-routing discipline.
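The escape hatch can be sketched as a small routing function over the AGENTS.md fields. The field names come from the table above; the AgentConfig type, the pickModel function, the `routine-model` placeholder, and the idea that the caller passes in a `stuck` flag are assumptions, since the proposal does not define how "stuck" is detected.

```typescript
type ExecutionStrategy = "plan" | "act" | "research";

// Subset of the per-agent AGENTS.md fields.
interface AgentConfig {
  steps: number;
  execution_strategy: ExecutionStrategy;
  model: string;
  expert_model?: string;
}

// Use the routine model by default; escalate only when the caller reports
// the agent is stuck AND an expert model is configured.
function pickModel(agent: AgentConfig, stuck: boolean): string {
  return stuck && agent.expert_model ? agent.expert_model : agent.model;
}

const implementer: AgentConfig = {
  steps: 20,
  execution_strategy: "act",
  model: "routine-model",     // placeholder name, not from the proposal
  expert_model: "qwopus27b",
};

console.log(pickModel(implementer, false), pickModel(implementer, true));
```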

Subagent isolation (Boomerang pattern, v2.0.1)

From Roo Code Boomerang Tasks (Apache-2.0 pattern):

When an orchestrator agent calls a new_task tool, BooCoder:

Creates a fresh tasks row with parent_task_id pointing to the orchestrator's task
Spawns a fresh inference session (Path A) or dispatch (Path B) with ONLY the task spec as context — no inherited conversation
Child runs to attempt_completion, writes a summary to tasks.output_summary
Parent resumes reading ONLY the summary (not the child's full conversation)

Three principles:

Orchestrator capability restriction: the orchestrator agent's tool list includes ONLY new_task, list_tasks, check_task_status — it cannot read files or call MCP tools directly
Down-pass: parent sends task spec via new_task(input), nothing else inherited
Up-pass: child sends result via attempt_completion(summary), nothing else surfaces to parent

This is the single most important context-management primitive — it prevents long-running orchestrators from poisoning their context with implementation detail.
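The down-pass and up-pass discipline can be illustrated with an in-memory model. This is a sketch only: the TaskTree class and the transcript field are hypothetical stand-ins for the tasks table and the per-session message history.

```typescript
interface TaskNode {
  id: string;
  parent_task_id: string | null;
  input: string;
  output_summary: string | null;
  transcript: string[]; // stand-in for the session's conversation
}

class TaskTree {
  private tasks = new Map<string, TaskNode>();
  private nextId = 1;

  // Down-pass: the child starts with ONLY the task spec, never the parent's transcript.
  newTask(parentId: string | null, input: string): TaskNode {
    const node: TaskNode = {
      id: `task-${this.nextId++}`,
      parent_task_id: parentId,
      input,
      output_summary: null,
      transcript: [input],
    };
    this.tasks.set(node.id, node);
    return node;
  }

  // Up-pass: the child records a summary as its only outward-facing result.
  attemptCompletion(childId: string, summary: string): void {
    const child = this.tasks.get(childId);
    if (!child) throw new Error(`unknown task: ${childId}`);
    child.output_summary = summary;
  }

  // What the parent is allowed to read: the summary, not the child's transcript.
  resultFor(childId: string): string | null {
    const child = this.tasks.get(childId);
    if (!child) throw new Error(`unknown task: ${childId}`);
    return child.output_summary;
  }
}
```

A parent that has accumulated a long transcript still spawns children whose transcript holds one entry, which is the property that keeps orchestrator context small.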

Observation hooks (v2.0.3)

From siropkin/budi (MIT) Claude Code 5-hook taxonomy:

SessionStart — agent spawned
UserPromptSubmit — task spec delivered
PostToolUse — each tool call completed
SubagentStart — nested dispatch
Stop — agent finished

These map directly to BooCode's existing WS frame protocol. The hook receiver is the BooCoder Fastify server; events flow into the message_parts taxonomy as step_start-style instrumentation parts.
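A hook receiver could normalize incoming events into instrumentation parts along these lines. The five event names are from the list above; the InstrumentationPart shape and the toPart function are assumptions, since the proposal does not specify the WS frame or part layout.

```typescript
const HOOK_EVENTS = ["SessionStart", "UserPromptSubmit", "PostToolUse", "SubagentStart", "Stop"] as const;
type HookEvent = (typeof HOOK_EVENTS)[number];

// Hypothetical part shape for hook instrumentation.
interface InstrumentationPart {
  kind: "hook";
  event: HookEvent;
  task_id: string;
  at: string; // ISO 8601 timestamp
  detail?: unknown;
}

function isHookEvent(name: string): name is HookEvent {
  return (HOOK_EVENTS as readonly string[]).includes(name);
}

// Reject anything outside the 5-event taxonomy, then stamp and wrap it.
function toPart(name: string, taskId: string, detail?: unknown, now: Date = new Date()): InstrumentationPart {
  if (!isHookEvent(name)) throw new Error(`unknown hook event: ${name}`);
  return { kind: "hook", event: name, task_id: taskId, at: now.toISOString(), detail };
}
```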

Follow-up batches (v2.0+ optional, ordered by value)

Batch	Source	What	When
PR-resolver tool	`qodo-ai/qodo-skills` (MIT)	Fetch GitHub issues → batch/interactive fix → inline PR reply. BooCoder tool that replaces Sam's manual PR workflow.	v2.0.3+
HMAC audit log	`sipyourdrink-ltd/bernstein` (verify license)	One new `audit_log` table with `prev_hmac` field. Tamper-evident history of every edit BooCoder makes. Small lift (~50 LoC).	v2.0.1+
Blind-validation gate	`covibes/zeroshot` (MIT)	Verify gate runs in a separate agent context that sees ONLY the diff + acceptance criteria, not the producing conversation. Complements Boomerang (isolation) + bernstein (lineage).	v2.0.2+
Majority-vote ensembler	`augmentcode/augment-swebench-agent` (MIT)	K candidate diffs from K agents → ranker model picks the best one. Optional layer above `pending_changes`.	v2.1+
Drift detection	`memovai/memov` (MIT)	`validate_commit` concept — detects when actual changes diverge from what was requested. Shadow timeline comparison.	v2.0.3+
Anti-slop for frontend	`Leonxlnx/taste-skill` (MIT)	100+ specific font/color/layout ban list + 3-dial parameterization. Vendor into skills/ when BooCoder generates frontend code.	v2.0+
Verify-before-commit gate	`DeepSourceCorp/globstar` (MIT)	Rule-based AST linter as a pre-apply quality gate. YAML checkers in `.globstar/`.	v2.1+ (parked)
Docker sandbox	`OpenHands/OpenHands` (MIT)	Per-session Docker container for write tools. Closes the `/opt:rw` mount risk if path-guard ever proves insufficient.	v2.1 (optional)
Multi-provider LLM	`earendil-works/pi` (MIT)	Provider abstraction if a need for Anthropic/OpenAI/Mistral direct surfaces beyond llama-swap.	v2.x (optional)
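For the HMAC audit log row above, the tamper-evidence idea can be sketched as a hash chain in which each entry's HMAC covers the previous entry's HMAC. Only the prev_hmac field name comes from the table; the other fields, the function names, and the use of HMAC-SHA256 are assumptions, and this is not taken from the bernstein codebase.

```typescript
import { createHmac } from "node:crypto";

interface AuditEntry {
  change_id: string;
  action: string;
  prev_hmac: string; // HMAC of the previous entry, or "" for the first one
  hmac: string;
}

const GENESIS = "";

function sign(key: string, prevHmac: string, changeId: string, action: string): string {
  // JSON-encoding the tuple keeps field boundaries unambiguous.
  return createHmac("sha256", key).update(JSON.stringify([prevHmac, changeId, action])).digest("hex");
}

// Append an entry that is chained to the current tail of the log.
function append(log: AuditEntry[], key: string, changeId: string, action: string): AuditEntry[] {
  const prev = log.length > 0 ? log[log.length - 1].hmac : GENESIS;
  return [...log, { change_id: changeId, action, prev_hmac: prev, hmac: sign(key, prev, changeId, action) }];
}

// Walk the chain; any edited, reordered, or removed interior entry breaks it.
function verify(log: AuditEntry[], key: string): boolean {
  let prev = GENESIS;
  for (const e of log) {
    if (e.prev_hmac !== prev || e.hmac !== sign(key, prev, e.change_id, e.action)) return false;
    prev = e.hmac;
  }
  return true;
}
```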

Repos to clone before starting

cd /opt/forks
git clone https://github.com/Dominic789654/agent-hub.git    # Apache-2.0, task DAG + dispatcher
git clone https://github.com/plandex-ai/plandex.git          # MIT, pending-changes UX
git clone https://github.com/anomalyco/opencode.git          # MIT, permission evaluate.ts reference
git clone https://github.com/qodo-ai/agents.git              # MIT, agent.toml schema (output_schema, exit_expression, execution_strategy)
