04673eaf59 v2.1.1: roadmap cleanup + README update + openspec archive
- Archive all 10 shipped openspec changes to openspec/changes/archived/
- Update boocode_roadmap.md: date, shipped status for v1.14/v1.15/v2.0, add v2.1.0 section
- Update README.md: 3-app monorepo, add services table, add What's shipped section
- Remove stale active openspec folders (all work shipped)
2026-05-25 20:23:22 +00:00
37 changed files with 1254 additions and 2821 deletions


@@ -6,10 +6,46 @@ All notable changes per release tag. Most recent on top, ordered by tag creation
Provider picker: BooCoder moves from Docker container to host systemd service (`boocoder.service`). All agent dispatch (ACP + PTY) switches from SSH tunnel to direct `spawn`/`exec` — no more `sshSpawn`/`sshExec`/`sshSpawnWithStdin` (marked `@deprecated`). New provider registry (`provider-registry.ts`) with 5 providers (boocode, opencode, goose, claude, qwen), per-provider model discovery (llama-swap for ACP agents, `~/.qwen/settings.json` for qwen, static for claude), and `agent-probe.ts` runs direct `which`/`exec` instead of SSH. `GET /api/providers` route assembles the provider list with installed status, models, and transport (ACP→PTY fallback if `supports_acp` is false). Frontend `ProviderPicker` component in CoderPane header lets users pick provider/model per message; messages route through `tasks` row for external providers instead of inference enqueue. Smart scroll: `MessageList` only auto-scrolls when user is near bottom (150px threshold). DB schema adds `models`, `label`, `transport` columns to `available_agents`. Bug fixes: `loadContext` SELECT now includes `allowed_read_paths` (cross-repo read grants were silently failing), cap hit sentinel insertion moved before `buildMessagesPayload` call.
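The smart-scroll rule in this entry reduces to a distance check against the bottom of the scroll container. The following is a minimal sketch, assuming standard DOM scroll metrics; the helper name `isNearBottom` and the `ScrollMetrics` shape are hypothetical and not taken from `MessageList`. Only the 150px threshold comes from the entry above, and treating exactly 150px as "near" is an assumption.

```typescript
// Hypothetical helper: only the 150px threshold comes from the changelog entry.
interface ScrollMetrics {
  scrollTop: number;    // pixels scrolled from the top of the container
  clientHeight: number; // visible height of the container
  scrollHeight: number; // total height of the scrollable content
}

const NEAR_BOTTOM_PX = 150;

// True when the viewport's lower edge is within `threshold` pixels of the end
// of the content, i.e. when auto-scrolling would not yank the user away from
// older messages they are reading.
function isNearBottom(m: ScrollMetrics, threshold: number = NEAR_BOTTOM_PX): boolean {
  const distanceFromBottom = m.scrollHeight - m.scrollTop - m.clientHeight;
  return distanceFromBottom <= threshold;
}
```

A message list would call this before appending a streamed chunk and scroll to the end only when it returns `true`.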
## v2.0.5 — 2026-05-25
FAST_MODEL routing: optional `FAST_MODEL` env var routes cheap calls (titles, summaries, labeling) to a small model on llama-swap (e.g. `nemotron-nano-4b`) instead of loading the 35B for 20-token calls. Falls back to session model or DEFAULT_MODEL. Tool-use summaries: `runCapHitSummary` now writes the cap_hit sentinel before building the summary payload (bug fix: the sentinel was written after, causing it to appear after the summary text in the message list). Qwen Code dispatch: `qwen -p "<task>" --output-format stream-json` via PTY (non-interactive mode, no `--yolo` flag needed). Arena: `POST /api/arena` dispatches the same task to N models/agents in parallel, each with its own task + worktree; `GET /api/arena/:id` for results; `POST /api/arena/:id/select/:task_id` picks winner.
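The fallback order described for FAST_MODEL can be sketched as a small pure function. This is an illustration under assumptions, not the shipped code: the `CallKind` union, the `ModelEnv` shape, and the name `pickModel` are hypothetical, and the reading that the session model takes precedence over `DEFAULT_MODEL` is inferred from the wording of the entry.

```typescript
// Hypothetical types: the changelog names the env vars but not the call sites.
type CallKind = "title" | "summary" | "label" | "chat";

interface ModelEnv {
  FAST_MODEL?: string;   // optional small model for short calls
  DEFAULT_MODEL: string; // last-resort model
}

// Calls short enough that loading a large model is not worth it.
const CHEAP_CALLS: ReadonlySet<CallKind> = new Set<CallKind>(["title", "summary", "label"]);

function pickModel(kind: CallKind, env: ModelEnv, sessionModel?: string): string {
  // An unset or empty FAST_MODEL is treated as "not configured".
  if (CHEAP_CALLS.has(kind) && env.FAST_MODEL) return env.FAST_MODEL;
  // Assumed order: session model first, then DEFAULT_MODEL.
  return sessionModel || env.DEFAULT_MODEL;
}
```

Keeping the decision in one function means an unset `FAST_MODEL` leaves behaviour identical to the pre-v2.0.5 routing.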
## v2.0.4-hardening — 2026-05-25
Path-guard fuzz suite: 25+ traversal-attack tests covering ../ sequences (all depths), encoded traversal (%2e%2e), null byte injection, absolute path escape, prefix-without-separator, backslash traversal, and the full secret-file deny list (.env, *.pem, id_rsa*, *.key, credentials.json, *.kdbx, .netrc). Plus 5 valid-path positive tests confirming normal writes aren't blocked and 5 edge-case tests (empty, whitespace-only, very long path, triple-dot, multiple slashes). Null-byte and whitespace-only guards added to `resolveWritePath` (previously only checked empty string). DB-integration test skeleton for pending_changes full-cycle (queue create/edit/delete, apply, rewind) gated on DATABASE_URL via `describe.runIf`. Production readiness verified: all services healthy, all builds clean, 57 tests passing (23 existing + 34 new).
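Several of the attack classes listed here (traversal, absolute-path escape, prefix-without-separator, null byte, whitespace-only) are caught by a resolve-then-prefix-check guard. The sketch below assumes POSIX paths and a known project root; `resolveWritePathSketch` is a hypothetical stand-in, since the real `resolveWritePath` signature is not shown in this diff. It omits the secret-file deny list and any percent-decoding.

```typescript
import * as path from "node:path";

// Hypothetical guard, not the project's resolveWritePath.
function resolveWritePathSketch(projectRoot: string, requested: string): string {
  // Reject empty and whitespace-only input before touching the filesystem API.
  if (requested.trim() === "") throw new Error("empty or whitespace-only path");
  // A null byte can truncate the path in lower layers.
  if (requested.includes("\0")) throw new Error("null byte in path");

  const root = path.resolve(projectRoot);
  // resolve() collapses ../ sequences and lets an absolute `requested` win,
  // so both traversal and absolute-path escapes surface in `resolved`.
  const resolved = path.resolve(root, requested);

  // Compare against root + separator so that /srv/app-evil is not accepted
  // as a child of /srv/app (the prefix-without-separator case).
  if (!resolved.startsWith(root + path.sep)) {
    throw new Error("path escapes project root");
  }
  return resolved;
}
```

The check is purely lexical, so it also works for files that do not exist yet; symlink handling would need a separate step.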
## v2.0.3 — 2026-05-25
CLI client (`apps/coder/src/cli.ts`, 249 lines) for headless agent interaction. Human inbox view (`human_inbox` view) surfaces tasks in `blocked`/`failed` state. Cost tracking: `tool_cost_stats` view with per-tool 100-call rolling window. `new_task` tool (Boomerang pattern): creates tasks with project context and optional arena contestants. `check_task_status` and `list_tasks` tools for task lifecycle management. Stats routes (`GET /api/stats`) for cost aggregation. Dispatcher extended to support new task states.
## v2.0.2 — 2026-05-25
BooCoder MCP server (`mcp-server.ts`, 201 lines) exposing 6 write-capable tools over stdio: `edit_file`, `create_file`, `delete_file`, `view_pending_changes`, `apply_pending`, `rewind`. Registered in `apps/coder/src/index.ts` as an MCP stdio server. Enables external agents (opencode, claude, qwen) to call BooCoder's write tools through the MCP protocol.
## v2.0.1 — 2026-05-25
ACP dispatch (`acp-dispatch.ts`, 271 lines): runs ACP-capable agents (opencode, goose) via SSH tunnel wrapping stdio into NDJSON streams for `@agentclientprotocol/sdk` JSON-RPC sessions. PTY dispatch (`pty-dispatch.ts`, 139 lines): runs non-ACP agents (claude, qwen) via SSH with stdin pipe for non-interactive mode. Worktree management (`worktrees.ts`, 118 lines): per-task git worktree creation and cleanup. SSH helper (`ssh.ts`, 126 lines): `sshSpawn`, `sshExec`, `sshSpawnWithStdin` for host command execution. Dispatcher extended to route tasks to ACP vs PTY based on agent capability. Agent probe updated to verify ACP support.
## v2.0.0-final — 2026-05-25
Dispatcher (`dispatcher.ts`, 191 lines): task queue with polling loop, Path A (native inference) and Path B (external agent dispatch). Task routes (`tasks.ts`, 138 lines): CRUD for tasks with state transitions. Agent probe (`agent-probe.ts`, 51 lines): startup scan of host for installed agents (opencode, goose, claude, pi, qwen), version detection, ACP capability verification. Schema adds `tasks` table. CLAUDE.md updated with v2.0.0 architecture docs covering BooCoder, DB rename, MCP config, workspace deps.
## v2.0.0 — 2026-05-25
BooCoder frontend: `CoderPane.tsx` (432 lines) as a `'coder'` pane type within BooChat's SPA — chat pane + diff pane (pending changes) + session picker. Standalone fallback SPA in `apps/coder/web/` (Vite + React) served at `:9502` directly. Session streaming via `useSessionStream` WS hook. API client with typed endpoints. Workspace pane persistence via `useWorkspacePanes`. Server routes for pending changes (`PATCH/POST /api/coder/sessions/:id/pending`). Verification discipline rules + chat naming from assistant response.
## v2.0.0-beta — 2026-05-25
Write tools: `edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind` — queue in `pending_changes` table, nothing hits disk until applied. `write_guard.ts` validates paths (resolve + prefix-check, no realpath for creates). Inference loop integration via `inference_context.ts` (bridges inference turn state to tool execution). API routes: `messages.ts` (POST /api/coder/sessions/:id/messages), `pending.ts` (GET/POST /api/coder/sessions/:id/pending). WebSocket support (`ws.ts`) for real-time pending changes updates. Tool adapter (`adapter.ts`) converts inference tool calls to tool execution. Write guard tests (115 lines). Server-side inference loop wired to BooCoder tools.
## v2.0.0-alpha — 2026-05-25
BooCoder foundation: Docker container (`apps/coder/Dockerfile`), docker-compose service, host env file. Schema: `sessions`, `chats`, `messages`, `pending_changes`, `tasks`, `message_parts` tables. DB renamed from `boocode` to `boochat`. Config module, PostgreSQL connection (porsager/postgres). Initial Fastify server with health endpoint. BOOCODER.md guidance file. Implementation plan (8 phases). Proposal updated with AGENTS.md extensions, Boomerang pattern, observation hooks.
## v2.0-proposal — 2026-05-24
v2.0 proposal: BooCoder write tools, pending-changes queue, ACP dispatch, MCP server. Openspec proposal (`proposal.md`, 274 lines) and task breakdown (`tasks.md`, 130 lines) defining the v2.0 feature scope — write-capable coding agent with file operations, external agent dispatch via ACP/PTY, and MCP server for tool exposure.
## v1.16.0-codesight-merge — 2026-05-24
Ports codesight's highest-value analysis capabilities into the codecontext sidecar as 4 new MCP tools. Tier 1 (graph queries on existing edges, no re-parsing): `get_blast_radius` (BFS reverse-edge traversal — "what breaks if I change this file?", with depth tracking) and `get_hot_files` (most-imported files ranked by incoming edge count — change-risk indicators). Tier 2 (tree-sitter AST re-parsing on demand): `get_routes` (Fastify/Express HTTP route extraction with method, path, file, line, inferred tags for db/auth/cache) and `get_middleware` (middleware registration detection via import-name heuristics and app.register/addHook/setErrorHandler patterns, classifying as auth/cors/rate-limit/security/error-handler/logging/validation). All 4 tools use `defer s.graphMu.RUnlock()` for consistent mutex discipline (reviewer caught that the initial implementation released the lock early on the Tier 2 tools). Route object-property extraction delegates to `extractStringValue` for template-literal handling (reviewer catch). codecontext sidecar rebuilt from `/opt/forks/codecontext` commit `b19e646`, tagged `v1.16.0-codesight-merge`. BooCode wrapper tools follow the existing codecontext pattern — 4 new files in `apps/server/src/services/tools/codecontext/`, registered in ALL_TOOLS. 29 new Go tests + 363/363 BooCode server tests passing. No schema changes, no frontend changes.
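The `get_blast_radius` idea (breadth-first traversal over reversed import edges, recording depth) can be illustrated in a few lines. This is a TypeScript sketch of the technique only; the sidecar itself is Go, and the graph shape, function name, and `maxDepth` parameter here are assumptions made for the example.

```typescript
// Illustrative graph shape: file -> files it imports.
type ImportGraph = Map<string, string[]>;

// Returns every file that transitively imports `changed`, mapped to its
// distance from `changed` (1 = direct importer).
function blastRadius(imports: ImportGraph, changed: string, maxDepth: number = Infinity): Map<string, number> {
  // Build the reverse index: imported file -> files that import it.
  const importers = new Map<string, string[]>();
  imports.forEach((targets, from) => {
    for (const to of targets) {
      const list = importers.get(to) ?? [];
      list.push(from);
      importers.set(to, list);
    }
  });

  const depthOf = new Map<string, number>();
  const seen = new Set<string>([changed]);
  const queue: Array<[string, number]> = [[changed, 0]];
  while (queue.length > 0) {
    const [file, depth] = queue.shift()!;
    if (depth >= maxDepth) continue; // do not expand past the depth limit
    for (const dependant of importers.get(file) ?? []) {
      if (seen.has(dependant)) continue; // first visit is the shortest path in BFS
      seen.add(dependant);
      depthOf.set(dependant, depth + 1);
      queue.push([dependant, depth + 1]);
    }
  }
  return depthOf;
}
```

Because the traversal reuses edges that already exist, it needs no re-parsing, which matches the Tier 1 description above.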


@@ -1,6 +1,6 @@
# boocode
Self-hosted single-user developer chat app. v1: chat only.
Self-hosted single-user developer chat app. 3-app monorepo: BooChat (read-only chat), BooCoder (write tools + agent dispatch), BooTerm (PTY terminals).
## Stack
@@ -13,6 +13,8 @@ Self-hosted single-user developer chat app. v1: chat only.
- `apps/server` — Fastify API + WebSocket + inference loop + file-read tools
- `apps/web` — React frontend; served by Fastify in production, Vite in dev
- `apps/booterm` — Fastify + node-pty + tmux for in-browser terminal panes
- `apps/coder` — Fastify write tools + ACP/PTY dispatcher + MCP server (BooCoder)
## Local dev
@@ -49,11 +51,18 @@ docker compose up --build -d
Binds to `100.114.205.53:9500` (Tailscale). Authelia is expected to gate the
upstream and inject `Remote-User`. Postgres binds loopback only.
## What v1 has
## Services
Project sidebar, sessions per project, chat with streaming responses over
WebSocket, four file-read tools scoped to the project root (`view_file`,
`list_dir`, `grep`, `find_files`), and a model picker driven by llama-swap's
`/v1/models`.
|Service|Port|Description|
|---|---|---|
|BooChat|`100.114.205.53:9500`|Read-only chat + SPA |
|BooTerm|`100.114.205.53:9501`|PTY/tmux terminal panes |
|BooCoder|host:9502|Write tools + agent dispatch + MCP server (systemd service, not Docker) |
|Postgres|`127.0.0.1:5500`|Shared database (`boochat_db`) |
|codecontext|`:8765` (internal)|MCP server for architect tools |
What v1 does not have lives in v2 (terminal pane) and v3 (Coder pane).
## What's shipped
- **BooChat**: streaming chat, file-read tools, compaction, reasoning support, HTML/Markdown artifact panes, cross-repo read grants, MCP client (Context7 + multi-server), tool-cost tracking, skills system, agent registry, provider picker with model discovery
- **BooTerm**: in-browser terminal panes via tmux + xterm.js, per-session tmux sessions, SSH-out support
- **BooCoder**: write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`), pending-changes queue with diff UI, ACP/PTY dual-path agent dispatch, MCP server (6 tools, stdio), CLI client, human inbox, Boomerang orchestration, path-guard fuzz suite


@@ -1,6 +1,6 @@
# BooCode v1.x — Roadmap
Last updated: 2026-05-23
Last updated: 2026-05-25
> **Companion doc:** `boocode_code_review.md` holds the full external-repo inventory, lift rationale, and license analysis. This document is the canonical source for shipping state, version ordering, and what's planned vs. shipped.
@@ -8,9 +8,9 @@ Last updated: 2026-05-23
BooCode is a **3-app monorepo** at `/opt/boocode/` (locked 2026-05-22):
- **BooChat** (`apps/chat`, port `9500`, `code.indifferentketchup.com`) — read-only chat with file-inspection tools. The live thing. Pick a project, chat with a local LLM, get streaming responses over WebSocket. Will rename `boocode_db` → `boochat_db` when BooCoder lands.
- **BooCoder** (`apps/coder`, port `9502`, `coder.indifferentketchup.com`) — write tools + external-CLI dispatch. **Planned, v2.0.** Both an in-process inference loop (with `pending_changes` table) AND ACP-dispatched external agents (opencode/goose) with PTY fallback (claude/pi/smallcode) — same surface, two execution paths.
- **BooTerm** (`apps/booterm`, port `9501`) — PTY/tmux/xterm.js. **Live since May 2026.** Node 20 Alpine + node-pty + tmux + xterm.js. Tmux session per pane (`bc-<uuid>`), SSH-out works (openssh-client + gosu in the image). `/api/term/health` shares the existing `boocode_db`.
- **BooChat** (`apps/chat`, port `9500`, `code.indifferentketchup.com`) — read-only chat with file-inspection tools. The live thing. Pick a project, chat with a local LLM, get streaming responses over WebSocket. DB renamed `boochat_db` at v2.0.
- **BooCoder** (`apps/coder`, port `9502`, `coder.indifferentketchup.com`) — write tools + external-CLI dispatch. **Shipped v2.0.0–v2.0.4.** In-process inference loop (with `pending_changes` table) AND ACP-dispatched external agents (opencode/goose) with PTY fallback (claude/pi/smallcode) — same surface, two execution paths.
- **BooTerm** (`apps/booterm`, port `9501`) — PTY/tmux/xterm.js. **Live since May 2026.** Node 20 Alpine + node-pty + tmux + xterm.js. Tmux session per pane (`bc-<uuid>`), SSH-out works (openssh-client + gosu in the image). `/api/term/health` shares the existing `boochat_db`.
Caddy → Authelia → Tailscale → `100.114.205.53` → 9500/9501/9502. Three apps, **one shared Postgres** (`boocode_db` → `boochat_db`).
@@ -126,6 +126,8 @@ The v1.13.x line is closed. Three batches still sit in the **In flight** column
**Estimated:** ~800 LoC.
**Shipped as `v1.14.0-outer-loop`.** Explicit `while (stepNumber < effectiveCap)` loop in `turn.ts`, per-agent `steps:` field from AGENTS.md frontmatter, `MAX_STEPS=200` ceiling, doom-loop guard migrated to loop-iteration style.
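How a per-agent `steps:` value and the `MAX_STEPS` ceiling might combine into `effectiveCap` can be sketched as follows. This is an assumption-laden illustration, not the contents of `turn.ts`: the helper names, the synchronous `runStep` callback, and the choice to fall back to `MAX_STEPS` when `steps:` is absent or invalid are all hypothetical.

```typescript
const MAX_STEPS = 200; // ceiling named in the roadmap entry

// Hypothetical: clamp the per-agent `steps:` value to the global ceiling.
// Falling back to MAX_STEPS for a missing or invalid value is an assumption.
function effectiveCapFor(agentSteps: number | undefined): number {
  if (agentSteps === undefined || !Number.isFinite(agentSteps) || agentSteps < 1) {
    return MAX_STEPS;
  }
  return Math.min(Math.floor(agentSteps), MAX_STEPS);
}

// `runStep` stands in for one inference + tool-execution iteration and
// returns true when the turn has nothing further to do.
function runTurn(agentSteps: number | undefined, runStep: (step: number) => boolean): number {
  const effectiveCap = effectiveCapFor(agentSteps);
  let stepNumber = 0;
  while (stepNumber < effectiveCap) {
    const done = runStep(stepNumber);
    stepNumber += 1;
    if (done) break;
  }
  return stepNumber; // number of steps actually executed
}
```

With an explicit loop the cap is a plain loop condition, so a cap hit is visible at one place instead of being spread across recursive calls.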
-----
## v1.14.x-mcp — single-server MCP-client proof-of-concept (NEW, 2026-05-22)
@@ -133,7 +135,6 @@ The v1.13.x line is closed. Three batches still sit in the **In flight** column
**Goal:** validate the MCP-client loop end-to-end against one real MCP server before committing to the full opencode `mcp/index.ts` port at v1.15. Small, throwaway-if-needed, slots between v1.14 and v1.15 without disrupting either.
**Scope:**
1. Add a hardcoded MCP client (single server) to BooChat. Initial target: **Context7** (Sam already uses it via opencode, so the config is known to work). Remote HTTP transport at `https://mcp.context7.com/mcp` with optional `CONTEXT7_API_KEY` header.
1. Use the official `@modelcontextprotocol/sdk` TypeScript client. No SSE transport yet (deferred to v1.15). Stdio transport not needed for Context7.
1. Tool discovery on startup: `tools/list`. Tools surface in BooChat alongside `view_file`/`grep`/etc., prefixed `context7_*` to avoid collisions.
@@ -161,6 +162,8 @@ The v1.13.x line is closed. Three batches still sit in the **In flight** column
**Skip-condition:** if v1.14 finishes and Sam wants to leap straight to v1.15, fold this into the early steps of v1.15.
**Shipped as `v1.14.1-mcp-poc`.** Context7 MCP client validated end-to-end.
-----
## v1.14.x-html — pane-based artifact viewer with Markdown + HTML (REVISED, 2026-05-23)
@@ -217,7 +220,6 @@ Inspired by Thariq Shihipar's "HTML > Markdown at length" pattern (`claude.com/b
**Goal:** wildcard permission ruleset (opencode `evaluate.ts` pattern) and a proper MCP client implementation. Foundation for BooCoder to gate writes; immediate value for codecontext to be re-wired as a real MCP server.
**Scope:**
1. Wildcard rule matcher: `{ permission, pattern, action: 'allow' | 'deny' | 'ask' }`. Last-match-wins. Per-agent rulesets layer under per-session rulesets.
1. **Full MCP client implementation:** stdio (local subprocess) + SSE (remote HTTP) transports, `tools/list` discovery, `tools/call` invocation, OAuth via Dynamic Client Registration (RFC 7591), per-server enabled flag, **glob patterns for per-agent tool whitelisting** (matching opencode's `tools` config shape).
1. codecontext sidecar gets re-pointed from static wrappers (v1.12) to real MCP. New connectors become a config-only addition.
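The wildcard rule matcher in item 1 can be sketched as a last-match-wins scan. This is a minimal illustration under assumptions: only the rule shape `{ permission, pattern, action }` and the last-match-wins behaviour come from the scope list; exact matching on `permission`, `*` as the only wildcard, and the `ask` fallback are choices made for the example, not confirmed opencode behaviour.

```typescript
type Action = "allow" | "deny" | "ask";

interface Rule {
  permission: string; // e.g. "edit" (matched exactly in this sketch)
  pattern: string;    // wildcard pattern, `*` matches any run of characters
  action: Action;
}

// Convert a `*` wildcard pattern to an anchored RegExp, escaping everything else.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Last-match-wins: every matching rule overwrites the result, so later rules
// take precedence. The "ask" fallback is an assumption.
function evaluate(rules: readonly Rule[], permission: string, target: string, fallback: Action = "ask"): Action {
  let result: Action = fallback;
  for (const rule of rules) {
    if (rule.permission === permission && wildcardToRegExp(rule.pattern).test(target)) {
      result = rule.action;
    }
  }
  return result;
}
```

Layering per-agent rules under per-session rules then amounts to concatenation, for example `evaluate([...agentRules, ...sessionRules], "edit", filePath)`, so a session rule overrides an agent rule that matches the same target.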
@@ -239,6 +241,8 @@ Inspired by Thariq Shihipar's "HTML > Markdown at length" pattern (`claude.com/b
**Estimated:** ~600 LoC.
**Shipped as `v1.15.0-mcp-multi`.** Multi-server MCP client with stdio transport + config file, per-agent tool glob patterns in AGENTS.md frontmatter.
-----
## v1.16 — codesight repo_health
@@ -259,6 +263,8 @@ Independent batch — ships clean any time after v1.13. Low leverage unless Sam
**Major version bump.** New app `apps/coder/` inside the existing monorepo (not a separate repo). Lands together with the `boocode_db` → `boochat_db` DB rename and the per-app subdomain split (`code.indifferentketchup.com` → BooChat, `coder.indifferentketchup.com` → BooCoder).
**Shipped v2.0.0–v2.0.4.** All 8 phases complete. See retrospective below.
**Three protocol roles in one surface:**
1. **MCP client (write-capable allowed).** Inherits the v1.15 client unchanged. BooCoder can enable write-capable MCP servers (`@modelcontextprotocol/server-filesystem` write tools, git commit MCP servers, etc.). All MCP writes route through the same `pending_changes` queue as native writes. Per-task allow/deny means dispatched tasks can have a different MCP roster than the interactive shell.
@@ -328,6 +334,8 @@ Per-session Docker sandbox spawned by BooCoder on first write. Only project path
**Estimated:** ~600 LoC.
**Status:** Still optional. v2.0 path-guard fuzz suite (34 traversal-attack tests) passed. No production pressure to containerize yet.
-----
## v2.2 — BooCoder as ACP agent (driveable from external editors)
@@ -350,17 +358,23 @@ Per-session Docker sandbox spawned by BooCoder on first write. Only project path
-----
## v2.1.0 — Provider picker + model discovery
**Shipped `v2.1.0-provider-picker`.** Provider registry with 5 providers (boocode, opencode, goose, claude, qwen). Model discovery via `LLAMA_SWAP_URL/upstream/<model>/props`. `/api/providers` route returns installed providers with models. `ProviderPicker` frontend component in workspace toolbar. Agent-probe startup probe discovers installed agents on host, their versions, ACP support, and models. Booterm SSH host configurable via `BOOTERM_SSH_HOST`/`BOOTERM_SSH_USER` env vars.
-----
## v2.x — Optional / far future
- **Verify gate above pending-changes** — `augmentcode/augment-swebench-agent` majority-vote ensembler pattern (K candidate diffs → ranker model picks winner). JSONL schema only, no code lift. Combine with zeroshot blind-validation invariant. v2.0+ optional batch.
- **PR-resolver tool** — `qodo-ai/qodo-skills` PR-resolver state machine (fetch issues → batch/interactive fix → inline reply). BooCoder v2.0+.
- **Record/replay LLM harness for tests** — `qodo-ai/qodo-cover` pattern (hashed prompt → fixture YAML). Re-implement in Vitest, don't vendor (AGPL). v1.13+ test infrastructure.
- **HMAC-chained audit log** — `sipyourdrink-ltd/bernstein` pattern. Small lift, adds tamper-evident session history. v1.13+ optional.
- **Tiered tool loading** — `eyaltoledano/claude-task-master` pattern (env var: `core` / `standard` / `all`). ~30 LoC in `agents.ts`. Pattern-only lift (claude-task-master is MIT + Commons Clause; reimplement). v1.13.x or v1.14.
- **Spec directory structure** — `Fission-AI/OpenSpec` `openspec/changes/<name>/{proposal,specs,design,tasks}.md` shape for BooCode's own batch docs. Zero-dep documentation reformat, replaces ad-hoc `boocode_batchN.md` convention. v1.13.x or v1.14.
- **Tiered tool loading** — `eyaltoledano/claude-task-master` pattern (env var: `core` / `standard` / `all`). ~30 LoC in `agents.ts`. Pattern-only lift (claude-task-master is MIT + Commons Clause; reimplement). **Shipped as `v1.13.11-tools`.**
- **Spec directory structure** — `Fission-AI/OpenSpec` `openspec/changes/<name>/{proposal,specs,design,tasks}.md` shape for BooCode's own batch docs. Zero-dep documentation reformat, replaces ad-hoc `boocode_batchN.md` convention. **Shipped as `v1.13.10-openspec`.**
- **`view_session_history` MCP tool** — `memovai/memov` `snap`/`mem_history`/`validate_commit` shape. Reference design for v1.13+ session-history feature.
- **`taste-skill` anti-slop ban list** — vendor `Leonxlnx/taste-skill` SKILL.md after diff against existing `frontend-design` skill. Real value at v2.0+ when BooCoder generates frontend code (DubDrive, BooLab, Fathom).
- **AgentLint audit pass** — manual review of BooCode's own CLAUDE.md/AGENTS.md/BOOCHAT.md/BOOCODER.md using `0xmariowu/AgentLint`'s 31 evidence-backed checks. Trim emphasis-keyword density, hit 60–120 line sweet spot, SHA-pin Actions, ensure `.env`/`CLAUDE.local.md` are gitignored. One-evening pass, immediate ROI. Optional plugin install at v1.12.x post-merge for ongoing audits.
- **AgentLint audit pass** — manual review of BooCode's own CLAUDE.md/AGENTS.md/BOOCHAT.md/BOOCODER.md using `0xmariowu/AgentLint`'s 31 evidence-backed checks. Trim emphasis-keyword density, hit 60–120 line sweet spot, SHA-pin Actions, ensure `.env`/`CLAUDE.local.md` are gitignored. One-evening pass, immediate ROI. **Shipped as `v1.13.9-agentlint`.**
- **`budi` install (Sam's host)** — `siropkin/budi` Claude Code 5-hook observer (`SessionStart`/`UserPromptSubmit`/`PostToolUse`/`SubagentStart`/`Stop`). Local SQLite, sub-ms hook latency, dashboard at `localhost:7878`. Not a BooCode lift — install globally for Claude Code session observability.
- **Multi-provider LLM** (pi-ai pattern): Only if a concrete need for Anthropic / OpenAI / Mistral direct surfaces. llama-swap covers everything today.
- **Workflow graphs** (microsoft/agent-framework concepts): Multi-agent coordination. Conceptual reference only. Realistically a v3.x topic.
@@ -376,8 +390,8 @@ Per-session Docker sandbox spawned by BooCoder on first write. Only project path
|-------------------------------|---------------------|-----------------------------|------------------------------------------------------------------------|----------------------|
|`boochat` (was `boocode`) |`100.114.205.53:9500`|`/opt:/opt:ro` |Read-only chat + SPA host + MCP client |Live (renames at v2.0)|
|`booterm` |`100.114.205.53:9501`|`/opt:/opt` |PTY/tmux terminal sessions |**Live (May 2026)** |
|`boocoder` |`100.114.205.53:9502`|`/opt:/opt:rw` (policy-gated)|Write tools + ACP host + MCP client + MCP server + external-CLI dispatch|v2.0 |
|`boochat_db` (was `boocode_db`)|`127.0.0.1:5500` |`boocode_pgdata` volume |Postgres 16-alpine (shared by all three) |Live (renames at v2.0)|
|`boocoder` |`100.114.205.53:9502`|`/opt:/opt:rw` (policy-gated)|Write tools + ACP host + MCP client + MCP server + external-CLI dispatch|**Shipped v2.0.0–v2.0.4** |
|`boochat_db` (was `boocode_db`)|`127.0.0.1:5500` |`boocode_pgdata` volume |Postgres 16-alpine (shared by all three) |**Live** (renamed at v2.0)|
|`codecontext` |`:8765` (internal) |`/opt/projects:/workspace:ro`|MCP server for architect tools |**Live (v1.12.0)** |
### Caddy routing target (post-v2.0)
@@ -417,8 +431,8 @@ term.indifferentketchup.com → booterm :9501 (or routed under code.
- **v1.13.19-html-artifact-panes:** `message_parts.kind` CHECK constraint extended with `'html_artifact'` value (same v1.13.15 pattern)
- **v1.13.20-drop-legacy-cols:** `ALTER TABLE messages DROP COLUMN tool_calls, DROP COLUMN tool_results` (the strangler-fig's final phase). `messages_with_parts` view rewritten to parts-only subselects via `CREATE OR REPLACE VIEW` BEFORE the drops (Postgres ordering constraint). v1.12.1 `messages_status_check`/`messages_role_check` cleanup block removed (one-shot effective long ago)
- **v1.14:** `agents.steps` column (or AGENTS.md parser extension; no DB if file-only)
- **v1.14.x-mcp (NEW):** none — single-server MCP-client PoC is config-only at first, no schema change
- **v1.14.x-html (NEW):** `message_parts.kind` CHECK constraint extended with `'html_artifact'` value
- **v1.14.x-mcp:** none — single-server MCP-client PoC is config-only at first, no schema change
- **v1.14.x-html:** `message_parts.kind` CHECK constraint extended with `'html_artifact'` value
- **v1.15:** `permissions` table, `agent_permissions` join, `session_permissions` join, `mcp_servers (name, type, transport, url_or_command, enabled, config_hash, last_probed_at)` registry
- **v1.16:** `repo_health_cache (project_id, file_hashes_sig, payload JSONB, created_at)`
- **v2.0:** `pending_changes (id, session_id, file_path, diff TEXT, status, created_at)`; `tasks`, `task_templates`, `pipelines`, `pipeline_runs`; `available_agents (name, install_path, version, supports_acp, supports_mcp_client, last_probed_at)`; `human_inbox` view; DB rename `boocode_db` → `boochat_db`
@@ -441,17 +455,17 @@ Full inventory and rationale in `boocode_code_review.md`. Headline items below;
|`anomalyco/opencode` |MIT, TS |`experimental_repairToolCall` via AI SDK v6 |v1.13.3 ✅ |
|`anomalyco/opencode` |MIT, TS |Two-tier compaction prune (`message_parts.hidden_at` + tier logic) |v1.13.4 ✅ |
|`anomalyco/opencode` |MIT, TS |`tool/truncate.ts` truncation + outputPath pattern (adapted: opaque id) |v1.13.5 ✅ |
|`anomalyco/opencode` |MIT, TS |0.85×ctx_max overflow trigger formula |v1.13.9 (planned) |
|`anomalyco/opencode` |MIT, TS |`session/prompt.ts` `runLoop()` outer agent loop + `agent.steps` cap |v1.14 |
|**Anthropic MCP SDK (TypeScript)** |**MIT** |**MCP client, single-server PoC** |**v1.14.x-mcp** |
|`anomalyco/opencode` |MIT, TS |0.85×ctx_max overflow trigger formula |v1.13.7-compaction-trigger ✅ |
|`anomalyco/opencode` |MIT, TS |`session/prompt.ts` `runLoop()` outer agent loop + `agent.steps` cap |v1.14.0-outer-loop ✅ |
|**Anthropic MCP SDK (TypeScript)** |**MIT** |**MCP client, single-server PoC** |**v1.14.1-mcp-poc ✅** |
|**`claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html`** |**(blog, pattern only)** |**HTML-output bias rule + use-case taxonomy** |**v1.14.x-html** |
|**`anthropics/skills/web-artifacts-builder`** |**MIT (design-principle reference)** |**"Avoid AI slop" conventions inline in AGENTS.md** |**v1.14.x-html** |
|**`mgechev/skills-best-practices`** |**MIT (pattern)** |**4-step skill validation protocol with paste-ready prompts** |**v1.13.12 (skills audit)** |
|**`mgechev/skillgrade`** |**MIT** |**Agent-agnostic skill eval framework (eval.yaml + smoke/reliable/regression presets)** |**v1.13.12 (skills audit) + ongoing** |
|**`blog.codeminer42.com/stop-putting-best-practices-in-skills/`** |**(blog, pattern only)** |**Rules→recipes split: skills 6% invoke vs AGENTS.md 100% present** |**v1.13.12 (skills audit)** |
|**`platform.claude.com/docs/.../agent-skills/best-practices`** |**(docs, canonical)** |**500-line ceiling, gerund naming, progressive-disclosure patterns, MCP `ServerName:tool_name` format** |**v1.13.12 + all future skills** |
|`anomalyco/opencode` |MIT, TS |`permission/evaluate.ts` wildcard ruleset |v1.15 |
|`anomalyco/opencode` |MIT, TS |`mcp/index.ts` MCP client (stdio + SSE, tools/list, tools/call, OAuth RFC 7591) |v1.15 |
|`anomalyco/opencode` |MIT, TS |`permission/evaluate.ts` wildcard ruleset |v1.15.0-mcp-multi (planned, not shipped) |
|`anomalyco/opencode` |MIT, TS |`mcp/index.ts` MCP client (stdio + SSE, tools/list, tools/call, OAuth RFC 7591) |v1.15.0-mcp-multi ✅ |
|`Aider-AI/aider` |Apache-2.0 |Fallback `aider/queries/tree-sitter-*.scm` grammars |v1.12 (fallback) |
|`cline/cline` |Apache-2.0 |Plan/Act invariant (absorbed into v1.15 permissions) |v1.15 |
|`spirituslab/codesight` |MIT-ish |Repo health analyzer (`analyze.mjs`) |v1.16 |
@@ -527,6 +541,14 @@ Earlier May 18 chat recommended Option A (thin orchestration shell over OpenCode
The v1.13.x cleanup line shipped 21 batches over a single intense window in `vMAJOR.MINOR.PATCH-slug` form: **v1.13.0-ai-sdk-v6 ✅ → v1.13.1-cleanup-bundle ✅ → v1.13.2-compaction-prune ✅ → v1.13.3-truncate ✅ → v1.13.4-reasoning-fix ✅ → v1.13.5-stability-bundle ✅ → v1.13.6-prefix-stability ✅ → v1.13.7-compaction-trigger ✅ → v1.13.8-tool-cost ✅ → v1.13.9-agentlint ✅ → v1.13.10-openspec ✅ → v1.13.11-tools ✅ → v1.13.12-ws-schemas ✅ → v1.13.13-ws-publish ✅ → v1.13.14-skills-audit ✅ → v1.13.15-codecontext-synth ✅ → v1.13.16-xml-parser ✅ → v1.13.17-cross-repo-reads ✅ → v1.13.18-codecontext-file-path ✅ → v1.13.19-html-artifact-panes ✅ → v1.13.20-drop-legacy-cols ✅** → umbrella `v1.13` ✅. **Do not fold** was the discipline — each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches. Held throughout; CHANGELOG.md is the per-tag canonical record.
### v1.14–v2.1 shipped (2026-05-25)
- **v1.14.0-outer-loop** ✅ — explicit `while` loop, per-agent `steps:` cap, doom-loop migration
- **v1.14.1-mcp-poc** ✅ — Context7 MCP client validated
- **v1.15.0-mcp-multi** ✅ — multi-server MCP client, stdio transport, per-agent tool globs
- **v2.0.0-alpha through v2.0.4-hardening** ✅ — full BooCoder line: write tools, dispatcher (ACP/PTY), MCP server (6 tools, stdio, 10-question eval passed), CLI client, human inbox, Boomerang `new_task` orchestration, path-guard fuzz suite (34 traversal-attack tests)
- **v2.1.0-provider-picker** ✅ — 5-provider registry, model discovery, `/api/providers` route, `ProviderPicker` UI, agent-probe startup probe
### Numbering and scope-revision discipline during v1.13.x (2026-05-23)
The v1.13.x line ran 21 batches; planned-vs-shipped numbering diverged for half of them, and three batches had material scope revisions mid-design. Pattern that emerged and is worth carrying forward:
@@ -548,7 +570,7 @@ The v1.13.x line ran 21 batches; planned-vs-shipped numbering diverged for half
- **v1.13.5** — opencode truncate.ts port + view_truncated_output tool. Tagged on `f8fc5db`.
- **v1.13.6** — compaction head-assembly audit + reasoning fix. Closed the Q3 reasoning gap from v1.13.1-C. Tagged on `81d837c`.
- **v1.13.7** — stability bundle: includeUsage fix + trim guards + payload filter + budget bump. Surfaces tokens (closes a v1.13.1-A latent regression where `result.usage` resolved empty), kills the empty-bubble + ActionRow noise between tool calls on single-tool-call turns, and unblocks Continue after cap-hit on chats that have trailing empty/failed assistants.
- **v1.13.6 (numbering re-aligned)** — system-prompt prefix verify-and-measure batch (originally numbered v1.13.8 in the planning doc). Reframed mid-design from "add a `system_prompt_cache` table" to "instrument-and-prove" after recon showed input-layer mtime caches already achieve byte-stable prefixes. Smoke confirmed zero drift across 5 turns; dropped the planned DB table.
- **v1.13.6-prefix-stability** — system-prompt prefix verify-and-measure batch (originally numbered v1.13.8 in the planning doc). Reframed mid-design from "add a `system_prompt_cache` table" to "instrument-and-prove" after recon showed input-layer mtime caches already achieve byte-stable prefixes. Smoke confirmed zero drift across 5 turns; dropped the planned DB table. Tagged on `81d837c`.
- **v1.13.7-compaction-trigger** — 0.85×ctx_max early trigger (planned as v1.13.8 / v1.13.9).
- **v1.13.8-tool-cost** — `tool_cost_stats` SQL view + AgentPicker tooltip surfacing (planned as v1.13.9 / v1.13.10).
- **v1.13.9-agentlint** — instruction-file AgentLint pass (planned as part of v1.13.11 skills audit; split into its own batch when it grew larger than fitting).

docs/codecontext-ts-plan.md (new file, 742 lines added)

@@ -0,0 +1,742 @@
# Codecontext + TypeScript: recon and plan
**Date:** 2026-05-22
**Author:** read-only recon, evidence-first
## Part A — Current codecontext usage in BooCode
### A1. Server-side synthesis pipeline
BooCode runs a **forced second-inference synthesis pass** after a model
emits any of three codecontext tool calls. The list is hard-coded:
`/opt/boocode/apps/server/src/services/synthesisPipeline.ts:34-38`
```ts
export const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
'get_codebase_overview',
'get_framework_analysis',
'get_semantic_neighborhoods',
]);
```
The pipeline is triggered from the tool-phase, not by the model:
`/opt/boocode/apps/server/src/services/inference/tool-phase.ts:200-279`.
After tool-phase records the tool_call/tool_result rows it picks the first
synth-eligible entry, expands the inline-truncated head via tmpfs
(`readTruncation`), pulls top-N referenced files + project docs
(BOOCHAT.md, AGENTS.md, CONTEXT.md, *roadmap*.md), token-budgets to
32k chars/4 (`synthesisPipeline.ts:45-46`), streams a second model
inference with a 90s timeout (`synthesisPipeline.ts:50`), and either
emits a `kind='synthesis'` message-part or falls through to the
recursive turn on failure (`synthesisPipeline.ts:250-272`).
The pipeline is **invoked once per turn that contains a SYNTHESIS_TOOLS
call** — at most one synthesis pass per turn (the loop picks the first
synth-eligible entry, `tool-phase.ts:256`).
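The "first synth-eligible entry" selection can be sketched as below. This is a minimal illustration, not the code in `tool-phase.ts`: `SYNTHESIS_TOOLS` is quoted from the snippet above, while `ToolCall` and `pickSynthesisTarget` are hypothetical names.

```ts
// SYNTHESIS_TOOLS mirrors the set quoted from synthesisPipeline.ts.
// ToolCall and pickSynthesisTarget are hypothetical, for illustration only.
const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
  'get_codebase_overview',
  'get_framework_analysis',
  'get_semantic_neighborhoods',
]);

interface ToolCall {
  id: string;
  name: string;
}

// Returns the first call whose tool is synth-eligible, or undefined when the
// turn contains none. Later eligible calls are ignored, which is what keeps
// the pipeline to at most one synthesis pass per turn.
function pickSynthesisTarget(calls: readonly ToolCall[]): ToolCall | undefined {
  return calls.find((call) => SYNTHESIS_TOOLS.has(call.name));
}
```

A turn with `grep` followed by two eligible calls would select only the first eligible one.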
The codecontext tools themselves are HTTP wrappers over the sidecar:
`/opt/boocode/codecontext/shim.go:412-419` registers eight POST routes
(including `/v1/get_codebase_overview` and `/v1/get_framework_analysis`). The shim
serialises calls under `callMu` and forwards JSON-RPC to a single
`codecontext mcp` child (`shim.go:194`, `shim.go:328-333`). The child
binary is built from `github.com/nmakod/codecontext` tag `v3.2.1`
(`/opt/boocode/codecontext/Dockerfile:18-22`), NOT from the local fork at
`/opt/forks/codecontext` (which is `github.com/nuthan-ms/codecontext`,
fork go.mod: `/opt/forks/codecontext/go.mod:1`). Container reports
`codecontext version dev` (recon: `docker exec boocode_codecontext
codecontext --version` returned `codecontext version dev / Build Date:
unknown / Git Commit: unknown`).
Wrapper boundaries:
- `/opt/boocode/apps/server/src/services/codecontext_client.ts:68-70`
hard timeout `REQUEST_TIMEOUT_MS = 30_000`, inline truncation
`TRUNCATION_LIMIT = 32_000`.
- Same file lines 80-95: realpath project + target_dir, reject any
target_dir that escapes the project root. The eight wrappers never
pass `target_dir` (`callCodecontext` injects it server-side, line 99).
- Lines 130-141 surface the upstream "content is empty" parser bug
(issue #37) with an actionable hint pointing at `.codecontextignore`.
### A2. Agent-exposed tool surface
Source of truth: `/opt/boocode/data/AGENTS.md` (six agents) plus the
`DEFAULT_TOOLS` fallback in
`/opt/boocode/apps/server/src/services/agents.ts:19-20` (every tool in
`ALL_TOOLS`).
Per-agent codecontext exposure (cited from
`/opt/boocode/data/AGENTS.md:6,41,62,100,138,179`):
| Agent | Codecontext tools exposed |
|---|---|
| Code Reviewer (line 3) | get_codebase_overview, get_dependencies, get_file_analysis, get_framework_analysis, get_semantic_neighborhoods, get_symbol_info, search_symbols, watch_changes |
| Debugger (line 38) | same eight |
| Refactorer (line 59) | same eight |
| Architect (line 97) | same eight |
| Security Auditor (line 135) | same eight |
| Prompt Builder (line 176) | **none** (`tools: [view_file, list_dir, grep, find_files]`) |
Every project-less or no-agent chat falls back to `DEFAULT_TOOLS` =
`ALL_TOOLS` (all 21 tools including the eight codecontext ones)
(`agents.ts:19-20,196`). The `BOOCODE_TOOLS` env var can narrow further
via `resolveToolTier()` (`tools.ts:712-732`): `core` (4 tools, no
codecontext) / `standard` (16, all eight codecontext) / `all` (21).
`STANDARD_TOOL_NAMES` includes all eight codecontext tools
(`tools.ts:719-732`).
The eight codecontext tool registrations live in `tools.ts:653-660` and
are all marked read-only in `READ_ONLY_TOOL_NAMES` (`tools.ts:689-696`).
### A3. Actual usage (DB)
Tool-call frequency from `message_parts` (all-time; as of today the DB
only has data back to 2026-05-22; see "Claims I did not verify" for the
retention question):
Query: `SELECT payload->>'name', COUNT(*) FROM message_parts WHERE
kind='tool_call' GROUP BY 1 ORDER BY 2 DESC`
| Tool | Calls | Chats |
|---|---:|---:|
| view_file | 129 | — |
| grep | 81 | — |
| list_dir | 78 | — |
| find_files | 25 | — |
| **get_codebase_overview** | **24** | 23 |
| **search_symbols** | **8** | 5 |
| ask_user_input | 5 | 3 |
| `foo` (typo/invalid) | 4 | 2 |
| view_truncated_output | 4 | 2 |
| git_status | 3 | 2 |
| **get_file_analysis** | **3** | 1 |
| **get_framework_analysis** | **1** | 1 |
| `([^` (typo/invalid) | 1 | 1 |
Codecontext-tool calls observed: **only 4 of 8** ever invoked
(`get_codebase_overview`, `search_symbols`, `get_file_analysis`,
`get_framework_analysis`).
**Never called** (in the recorded window): `get_dependencies`,
`get_symbol_info`, `get_semantic_neighborhoods`, `watch_changes`.
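The invoked / never-called split is a set difference between the eight exposed tools (A2) and the counts in the table above. A small sketch, with the counts copied by hand from that table rather than queried live; `callsFor` is a hypothetical helper:

```ts
// The eight codecontext tools exposed to agents (from the A2 table).
const CODECONTEXT_TOOLS = [
  'get_codebase_overview',
  'get_dependencies',
  'get_file_analysis',
  'get_framework_analysis',
  'get_semantic_neighborhoods',
  'get_symbol_info',
  'search_symbols',
  'watch_changes',
] as const;

// Call counts copied from the A3 table; tools absent from the table are
// treated as zero calls.
const observedCalls: Record<string, number> = {
  get_codebase_overview: 24,
  search_symbols: 8,
  get_file_analysis: 3,
  get_framework_analysis: 1,
};

function callsFor(tool: string): number {
  return observedCalls[tool] ?? 0;
}

const invoked = CODECONTEXT_TOOLS.filter((tool) => callsFor(tool) > 0);
const neverCalled = CODECONTEXT_TOOLS.filter((tool) => callsFor(tool) === 0);

console.log(`invoked: ${invoked.length} of ${CODECONTEXT_TOOLS.length}`);
console.log(`never called: ${neverCalled.join(', ')}`);
```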
Per-call args sample (`mp.created_at` desc, last 12 calls;
recon-verified by query against message_parts):
- `get_codebase_overview` invoked ~9 times in a row with
`{"include_stats":true}` — repeated overview fetches within minutes.
- `search_symbols` examples: `{"limit":20,"query":"Kind"}`,
`{"limit":20,"query":"SymbolKind"}`,
`{"limit":20,"query":"Kind","framework_type":"typescript"}`.
- `get_file_analysis` invoked 3 times in one chat with
`file_path` = `apps/server/src/services/inference.ts`,
`apps/server/src/services/inference/parts.ts`,
`apps/server/src/services/system-prompt.ts`; **all three failed**
with "File not found in graph" (see C3).
### A4. Hang and drift correlation
**Cohort analysis** (query against `messages` joined to chats that
ever used any codecontext tool):
| Cohort | status | rows |
|---|---|---:|
| no_codecontext | complete | 24 |
| no_codecontext | cancelled | 1 |
| used_codecontext | complete | 191 |
| used_codecontext | streaming | 2 |
| used_codecontext | **failed** | **2** |
Two failed assistant messages, both in chats that used codecontext.
Both have empty `content` — characteristic of a synth pass that aborted
before any deltas streamed (see `synthesisPipeline.ts:278-303`,
`markSynthFailed`). DB query:
```
SELECT id, status, created_at, LEFT(content,200)
FROM messages WHERE role='assistant' AND status IN ('failed','streaming')
```
returned two `failed` rows with empty content at 2026-05-22 18:43:39 and
2026-05-22 19:59:56. The 18:43 failure correlates with the codecontext
sidecar log line `2026/05/22 18:44:10.842554 get_framework_analysis
target_dir=/opt/boocode duration_ms=30002 status=rpc_error` — a 30 s
timeout (`codecontext_client.ts:70`) under a `get_framework_analysis`
call (`synthesisPipeline.ts:34-38` would have triggered synthesis on
success — failure path skipped synthesis and surfaced the error).
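The timestamp correlation above can be reproduced mechanically. The sketch below assumes the sidecar writes its log line when the call ends (so start = logged time minus `duration_ms`) and that the DB and the sidecar log share a timezone (UTC is assumed here); neither assumption was checked against the shim source. `parseSidecarLine` and `callCoversMessage` are hypothetical names.

```ts
interface SidecarCall {
  tool: string;
  startedAt: Date;
  endedAt: Date;
  durationMs: number;
  status: string;
}

// Matches lines shaped like the one quoted above:
// 2026/05/22 18:44:10.842554 get_framework_analysis target_dir=... duration_ms=30002 status=rpc_error
const LINE_RE =
  /^(\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d+) (\S+) .*?duration_ms=(\d+) status=(\S+)/;

function parseSidecarLine(line: string): SidecarCall | undefined {
  const m = LINE_RE.exec(line);
  if (m === null) return undefined;
  const [, y, mo, d, h, mi, s, frac, tool, dur, status] = m;
  // Keep millisecond precision only; the log carries microseconds.
  const ms = Number(frac.slice(0, 3).padEnd(3, '0'));
  const endedAt = new Date(
    Date.UTC(Number(y), Number(mo) - 1, Number(d), Number(h), Number(mi), Number(s), ms),
  );
  const durationMs = Number(dur);
  const startedAt = new Date(endedAt.getTime() - durationMs);
  return { tool, startedAt, endedAt, durationMs, status };
}

// True when the message timestamp falls inside the call's lifetime, widened
// by slackMs on both sides to absorb clock and write-ordering skew.
function callCoversMessage(call: SidecarCall, messageAt: Date, slackMs: number): boolean {
  const t = messageAt.getTime();
  return t >= call.startedAt.getTime() - slackMs && t <= call.endedAt.getTime() + slackMs;
}
```

Under those assumptions the 18:44:10 log line implies a call that started about two seconds after the failed message row was created at 18:43:39, which is why a few seconds of slack is needed to pair them.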
**Drift / format leakage:** the query
`SELECT * FROM messages WHERE role='assistant' AND (content LIKE
'%<invoke%' OR content LIKE '%<tool_call%')` returned 8 rows; manual
review showed 7 are recon/discussion content where the model is
quoting `<invoke>` as a *topic*, not actually emitting a tool call as
text. **One real drift case** at 2026-05-22 19:05:03 — content begins
"I need to investigate the codecontext fork to write this design
document. Let me start by reading the key files.\n\n<invoke
name=\"read_file\">…" — an Anthropic-format leak. This message is in a
chat that did use codecontext, but the drift evidence is too thin
(n=1) to claim a correlation.
## Part B — TypeScript parsing gap
### B1. TS-targeted workload
Per-language breakdown of codecontext calls that target a specific
file or framework (DB query):
| Language hint | Calls |
|---|---:|
| no file_path (overview/framework/symbol search) | 33 |
| ts/tsx | 3 |
| (no other extension observed) | — |
The three TS-targeted calls were all `get_file_analysis` in a single
chat: `inference.ts`, `inference/parts.ts`, `system-prompt.ts`. **All
three failed** with `File not found in graph` (see C3 — relative path
mishandling). One `search_symbols` call carried
`framework_type=typescript` (Q="Kind").
So **TS is the actual workload** for narrow codecontext use; the rest
is whole-repo overview/framework analysis with no specific language
filter.
### B2. Symbol recovery quality
I called the live container against three load-bearing BooCode TS files
and compared the symbol list against a manual grep of top-level
declarations.
**File 1: `/opt/boocode/apps/server/src/types/api.ts` (371 lines)**
Manual count (grep `^(export )?(interface|type|const) `):
- interfaces: 36
- top-level types: 15
- top-level consts: 5
- total significant: 56
Codecontext output (live HTTP call to
`http://codecontext:8080/v1/get_file_analysis`):
```json
{
"result": "# File Analysis: ...\n**Lines:** 372\n**Symbols:** 10\n\n## Symbols\n\n- **PROJECT_STATUSES** () - Line 2\n- **PROJECT_STATUSES** () - Line 2\n- **CHAT_STATUSES** () - Line 91\n..."
}
```
Total reported: 10 symbols, all five `*_STATUSES` consts duplicated
(line 2 appears twice, etc.). After regex-extracting names:
- Unique symbols reported by codecontext: 8 (5 *_STATUSES consts + 3
header strings `Language:`/`Lines:`/`Symbols:`)
- Interfaces / types found: **0 of 51**.
- Symbol-recovery rate: **5/56 = ~9%** (only the const arrays the JS
grammar understands).
Specific misses checked against the actual file
(grep -nE on `/opt/boocode/apps/server/src/types/api.ts`):
- Line 5 `export interface Project` — MISSED
- Line 26 `export type SessionStatus` — MISSED
- Line 28 `export interface Session` — MISSED
- Line 47 `export type WorkspacePaneKind` — MISSED
- All 36 interface declarations and 15 type aliases — MISSED.
**File 2: `/opt/boocode/apps/server/src/services/tools.ts` (763 lines)**
Manual count: 47 top-level decls
(grep `^(export )?(interface|type|enum|namespace|const|function|class|async function) `).
Codecontext output: **112 symbols** reported (but many are noise:
local function-scope variables, the literal token `"unknown"` from
type cast positions, even raw labels like `out:`).
Python-extracted from result: 71 unique names. Cross-checked against
20 significant TS exports the file declares:
- Found: `ListDirInput`, `READ_ONLY_TOOL_NAMES`, `CORE_TOOL_NAMES`,
`STANDARD_TOOL_NAMES` (4 / 20)
- **MISSED: `ToolDef`, `ViewFileInput`, `viewFile`, `listDir`, `grep`,
`findFiles`, `viewTruncatedOutput`, `gitStatus`, `skillFind`,
`skillUse`, `skillResource`, `askUserInput`, `ALL_TOOLS`,
`TOOLS_BY_NAME`, `resolveToolTier`, `toolJsonSchemas`** — every
exported `ToolDef<…>` named constant is missed because the JS
grammar can't parse the TS type annotation `: ToolDef<…>` that
precedes the `=` and bails out of recognising the const at
top-level.
- Symbol-recovery rate (significant): **4/20 = 20%**.
**File 3: `/opt/boocode/apps/server/src/services/inference/stream-phase.ts` (482 lines)**
Manual count: 5 top-level decls (2 are `export async function`,
1 interface, 1 type, 1 const).
Codecontext output: 53 symbols extracted, but the first 20 are header
strings (`Language:`, `Lines:`, `Symbols:`), imports (`api.js`,
`model-context.js`, …), local function names from inside bodies
(`toolNameById`, `out:`, `hasTools`), and string literals
(`parts:`). Neither `streamCompletion` nor `executeStreamPhase` (the
two `export async function` declarations at lines 145, 346) appear in
the symbol list explicitly.
**Aggregate:** across the three files, codecontext recovers
type/interface/enum symbols at effectively **0%**, and function/const
symbols at roughly **20%**. The 9596-symbol whole-repo overview is
heavily noise-padded. Generic type parameters and decorators were not
checked individually because they're a strict subset of the
already-broken case.
### B3. Fork status
**`docs/ts-bindings-design.md` does NOT exist.** Verified by
`ls /opt/forks/codecontext/docs/ts-bindings-design.md`, which returned `No such file
or directory`. The `/opt/forks/codecontext/docs/` tree has 23 markdown
files; none mention TypeScript bindings work (greps under
`/opt/forks/codecontext/docs/` for `TypescriptLanguage|tree-sitter-tsx`
returned nothing beyond a CodeContext example in `HLD.md:831` and
config mentions in `ARCHITECTURE.md:297`).
**go.mod dependencies (`/opt/forks/codecontext/go.mod:5-18`):**
- `github.com/tree-sitter/tree-sitter-javascript v0.23.1` (present)
- `github.com/tree-sitter/tree-sitter-typescript`: **NOT present**.
**TS-as-JS fallback in `internal/parser/manager.go:72-79`:**
```go
// TypeScript - use JavaScript grammar as fallback until TypeScript bindings are fixed
// Both JS and TS have similar syntax and this provides basic parsing capability
tsLang := sitter.NewLanguage(javascript.Language())
m.languages["typescript"] = tsLang
tsParser := sitter.NewParser()
tsParser.SetLanguage(tsLang)
m.parsers["typescript"] = tsParser
```
The comment claims this provides "basic parsing capability". B2 shows
that interface/type recovery is effectively zero — the JS grammar does
not recognise `interface`, `type`, generic params, decorators, or even
TS-typed const declarations.
**Downstream code IS prepared for TS-specific nodes.** In
`internal/parser/manager.go:746-765` `nodeToSymbolJS` already has
cases for `interface_declaration` and `type_alias_declaration`:
```go
case "interface_declaration", "interface":
return &types.Symbol{Type: types.SymbolTypeInterface, ...}
case "type_alias_declaration", "type_declaration":
return &types.Symbol{Type: types.SymbolTypeType, ...}
```
These cases are dead code with the JS grammar — they only fire when
the parser is the TypeScript grammar. The fork already has the symbol
extraction wiring; it's just missing the grammar.
**`SymbolType` is open (string), not an iota** —
`/opt/forks/codecontext/pkg/types/graph.go:14`:
```go
type SymbolType string
```
with constants like `SymbolTypeInterface`, `SymbolTypeType`,
`SymbolTypeNamespace` already declared (`graph.go:16-48`). No code
changes needed there to add TS-aware symbol types.
**Upstream `tree-sitter-typescript` Go bindings exist.** Context7 docs
for `/tree-sitter/tree-sitter-typescript` show the Go package
`github.com/tree-sitter/tree-sitter-typescript` exporting
`LanguageTypescript()` and `LanguageTSX()`:
```go
typescript := sitter.NewLanguage(tree_sitter_typescript.LanguageTypescript())
tsx := sitter.NewLanguage(tree_sitter_typescript.LanguageTSX())
```
(Context7 query `/tree-sitter/tree-sitter-typescript`,
"Go bindings package name and how to import…", returned a working
sample.)
**The fork (`/opt/forks/codecontext`) is not what runs in production.**
The deployed image is built from `github.com/nmakod/codecontext` tag
v3.2.1 (`/opt/boocode/codecontext/Dockerfile:18-22`). The fork is a
separate working tree at `/opt/forks/codecontext` on
`github.com/nuthan-ms/codecontext` (`/opt/forks/codecontext/go.mod:1`).
Any TS-grammar work landing in either repo requires a Dockerfile
update to point at the right source.
**Fork HEAD:** `ba6b94c 2025-09-01 12:43:09 +0530 Merge pull request
#29 from nmakod/release-please--branches--main` — newer than the
deployed v3.2.1 tag but on the same upstream lineage.
### B4. Existing TS-aware alternatives
Searches in `/opt/boocode`:
- `grep -rln 'ts-morph|@typescript/vfs|createCompilerHost'
/opt/boocode/apps` → **no matches** in source (only types).
- Only the `typescript` package is depended on
(`/opt/boocode/package.json`, `/opt/boocode/apps/booterm/package.json`,
`/opt/boocode/apps/server/package.json`,
`/opt/boocode/apps/web/package.json` — each declares
`"typescript": "^5.5.0"`). That's the tsc compiler, used for
building, not for runtime symbol extraction.
- No tool in `/opt/boocode/apps/server/src` parses TS at runtime for
any reason other than what codecontext provides.
So BooCode has **no existing fallback** for TS symbol data: if
codecontext can't extract it, nobody else does.
## Part C — Optimization opportunities
### C1. Tool surface review
Cross-referencing the agent whitelist (A2) with actual usage (A3):
| Tool | Exposed to 5 agents? | Calls observed | Recommendation |
|---|---|---:|---|
| get_codebase_overview | yes | 24 | **Keep** — load-bearing, synth-triggering |
| search_symbols | yes | 8 | **Keep** — only viable TS query path |
| get_file_analysis | yes | 3 | **Keep** but fix relative-path bug (C3) |
| get_framework_analysis | yes | 1 | Low-use; **keep** for synth signalling |
| get_dependencies | yes | **0** | **Demote** — unused, considered for removal |
| get_symbol_info | yes | **0** | **Demote** — unused, considered for removal |
| get_semantic_neighborhoods | yes | **0** | **Demote** — unused, considered for removal |
| watch_changes | yes | **0** | **Remove** from agent whitelists; it is not a synthesis trigger, so the pipeline is unaffected |
`watch_changes` in particular is a state-changing async tool with no
sensible LLM consumer (the model can't await fsnotify events). It
should not be in the 5 agents' whitelists; the synthesis pipeline only
calls 3 specific tools (`synthesisPipeline.ts:34-38`) so removing
`watch_changes` from agent whitelists does not affect the pipeline.
`get_dependencies`, `get_symbol_info`, `get_semantic_neighborhoods`
are credible tools but the model never reaches for them — likely a
descriptions/discoverability issue. Either improve their tool
descriptions (the `.description` strings registered in
`tools/codecontext/*.ts`) or remove them from agent whitelists.
### C2. Latency and token cost
Latencies parsed from the codecontext sidecar access log
(`docker logs boocode_codecontext --since 24h | grep duration_ms=`):
- Total calls observed: 40 in 24h
- Total time: 610,404 ms
- Avg: **15,260 ms per call**
- Min: 1,379 ms
- p50: 9,417 ms
- p90: 27,611 ms
- Max: 30,002 ms (= the 30 s rpc_error timeout)
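For reproducibility, a sketch of how such figures can be derived from the `duration_ms=` log lines. The percentile method here is nearest-rank, which is an assumption; the numbers above may have been computed with a different method. `parseDurations`, `percentile` and `summarize` are hypothetical names.

```ts
// Extracts every duration_ms=<n> value from raw log text.
function parseDurations(log: string): number[] {
  const re = /duration_ms=(\d+)/g;
  const out: number[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(log)) !== null) {
    out.push(Number(m[1]));
  }
  return out;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: readonly number[], p: number): number {
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

interface LatencySummary {
  count: number;
  totalMs: number;
  avgMs: number;
  minMs: number;
  p50Ms: number;
  p90Ms: number;
  maxMs: number;
}

function summarize(durations: readonly number[]): LatencySummary {
  if (durations.length === 0) throw new Error('no duration samples');
  const sorted = [...durations].sort((a, b) => a - b);
  const totalMs = sorted.reduce((sum, v) => sum + v, 0);
  return {
    count: sorted.length,
    totalMs,
    avgMs: Math.round(totalMs / sorted.length),
    minMs: sorted[0],
    p50Ms: percentile(sorted, 50),
    p90Ms: percentile(sorted, 90),
    maxMs: sorted[sorted.length - 1],
  };
}
```

Piping the output of the `docker logs` command above into `summarize(parseDurations(text))` would give a comparable summary; the average reported above is consistent with 610,404 ms over 40 calls.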
Sampled MCP-server log lines confirm overview rebuilds cost 28 s on
/opt/boocode (`6575 files, 115601 symbols, 1186758 chars markdown`
in 8.22 s). The shim's per-tool log shows the analysis dominates;
markdown serialization is sub-second.
**Synthesis pipeline expansion** (from `docker logs boocode`):
Five completed synthesis passes today, sample sizes:
- `originalChars` (truncated head shipped to synth): **32,078** in
every case (= the wrapper's 32 kB cap).
- `fullChars` (full overview after re-expansion from tmpfs): 83,406 /
83,408 / 83,410 / 97,283 / 97,464.
In other words, every overview is over the wrapper cap and synthesis
always pays a tmpfs round-trip to recover the full content for
reference-file extraction. The full content is *not* shipped to the
synth model (the truncated head is — `synthesisPipeline.ts:141`), so
the token-budget contract holds, but the synth still has to wait on
the file I/O.
One synthesis timeout in the day (`synthesis pass timed out; falling
through to recursive turn`, chatId a74bfecb…, toolName
get_codebase_overview, 90 s after expansion completed — the synth
inference itself was too slow). The retry inside the same chat then
completed in 31 s with `files: 0` (no referenced files extracted),
suggesting the timeout repeated until reference extraction was
empty.
I have no cache-hit statistics to report — the shim does not log
cache hits. The codecontext binary itself logs `Refreshing analysis
for codebase overview…` on every call (`[MCP] Refreshing analysis…`
appears for each `get_codebase_overview` in the sidecar log), so the
analysis is rebuilt per call.
### C3. Failure modes
Sidecar errors in the last 7 days
(`docker logs boocode_codecontext --since 168h | grep -E
"status=tool_error|content is empty|panic"`):
1. **`content is empty` parser bug** — 2026-05-22 17:37:41 and
17:43:41, both against `/opt/homelabhealth`, on
`frontend/node_modules/hono/dist/adapter/aws-lambda/types.js`.
The wrapper's `.codecontextignore` template installation
(`codecontext_client.ts:30-52`) didn't help because the file is
under `node_modules` which is supposedly in the template. Suggests
either the template hadn't been copied yet or the template's
ignore list doesn't cover the path. Each failed call cost ~25 s.
2. **Relative-path failures** — 2026-05-22 17:56:51 through 17:57:07
(three back-to-back), all `get_file_analysis`:
```
[MCP] ERROR: File not found in graph: apps/server/src/services/inference.ts (available files: 6575)
```
The wrapper resolves `target_dir` to an absolute realpath
(`codecontext_client.ts:80-99`) but `file_path` is forwarded
unchanged. The codecontext binary's file index is keyed on
absolute paths (the 115,876-symbol overview reports absolute
paths). The model passed `apps/server/src/services/inference.ts`
and the binary couldn't find it. Each failure cost 8 to 24 s.
3. **30 s rpc_error timeout** — 2026-05-22 18:44:10
(get_framework_analysis) and 19:38:06 (search_symbols vs
/opt/forks/codecontext). The shim's per-call context timeout is
60 s (`shim.go:325`) but the wrapper aborts at 30 s
(`codecontext_client.ts:70`), so the client gives up before the
shim does — the call still runs to completion on the codecontext
side, wasting CPU.
4. **Panic in `searchSymbols`** — concurrent map iteration crash in
`internal/mcp/server.go:1305` (`getFilePathForSymbol`) under
`matchesFramework`, captured in
`docker logs boocode_codecontext --since 24h`:
```
internal/runtime/maps.fatal(...)
github.com/nuthan-ms/codecontext/internal/mcp.(*CodeContextMCPServer).getFilePathForSymbol(...)
/build/codecontext/internal/mcp/server.go:1305
```
This is an upstream bug in v3.2.1 — concurrent map access without
a lock. The shim's `callMu` serialises *its* calls but the
codecontext binary itself appears to have internal concurrency
that hits this.
**Pattern:** the 2 failed assistant messages in A4 align with the 30 s
rpc_error timeout (18:44:10) and one other failure window. Failed
turns leave empty `content` because synthesis aborts before any
deltas — the model never sees the codecontext error.
## Part D — Plan
### D1. Tool surface decisions
**Title:** Trim agent codecontext exposure to the four tools that earn
their keep; demote the rest until evidence justifies them.
**Why:** A3 shows 4 of 8 codecontext tools have zero observed calls,
and `watch_changes` (a fsnotify-coupled tool) has no LLM consumer.
The synthesis pipeline only auto-triggers on three tools
(`synthesisPipeline.ts:34-38`), so removing tools from agent
whitelists does not affect the server-side synth path.
**Scope:** edit `/opt/boocode/data/AGENTS.md` lines 6, 41, 62, 100,
138 (Code Reviewer, Debugger, Refactorer, Architect, Security
Auditor) to drop `get_dependencies`, `get_symbol_info`,
`get_semantic_neighborhoods`, `watch_changes` from each `tools:`
array. Roughly 5 line edits.
**Risk:** if there's a legitimate workflow not yet captured in 24 h
of DB data, dropping these tools removes that affordance. Mitigation:
keep them registered in `tools.ts` (the server-side wrappers stay) so
the synth pipeline can still call them if `SYNTHESIS_TOOLS` expands
later, and so the `BOOCODE_TOOLS=standard` tier continues to expose
them via the tier filter. Tests: `agents.test.ts`, `tools.test.ts`,
any agent-roundtrip tests.
**Effort:** 30 min.
**Sequence:** standalone. Unblocks D3 (smaller tool list = smaller
system prompt = better prompt-cache stability per `tools.ts:629-632`).
### D2. TypeScript support path
**Title:** Narrow the TS fork scope to "interfaces, types, enums,
top-level typed consts"; defer generics and decorators.
**Why:** Evidence from B1 (3 TS-targeted calls — all
`get_file_analysis` — and 1 `search_symbols framework_type=typescript`)
shows TS is in the workload but at low volume. Evidence from B2
shows symbol recovery is **~0% for interfaces/types and ~20% for
typed consts**. That gap is what actually breaks model behaviour:
when the model asks `get_file_analysis` for `api.ts` (which IS what
happened today) it gets 10 noise symbols and no `interface Project`,
`interface Session`, `type SessionStatus`. The narrow scope
(declarations only; skip generics, JSX, decorators) covers ~90% of
the recovered-symbol gap and is achievable with one new dependency
and one parser-init change.
**Scope:**
1. `/opt/forks/codecontext/go.mod`: add
`github.com/tree-sitter/tree-sitter-typescript v0.23.x` to the
`require` block.
2. `/opt/forks/codecontext/internal/parser/manager.go:72-79`:
replace the JS-fallback init with
```go
typescript "github.com/tree-sitter/tree-sitter-typescript/bindings/go"
...
tsLang := sitter.NewLanguage(typescript.LanguageTypescript())
m.languages["typescript"] = tsLang
tsxLang := sitter.NewLanguage(typescript.LanguageTSX())
m.languages["tsx"] = tsxLang
```
Plus parser registrations. `nodeToSymbolJS` already handles
`interface_declaration` and `type_alias_declaration` (lines
746-765) — no extraction code changes needed for the narrow scope.
3. `/opt/forks/codecontext/internal/parser/manager.go:357-395`
`detectLanguage` (skim verified to live around line 357): ensure
`.tsx` maps to `"tsx"` not `"typescript"`. Likely already correct
— verify.
4. Tests in `internal/parser/` — add TS-grammar fixtures (a small
`.ts` file with interface, type, enum) to assert recovery.
5. Update `/opt/boocode/codecontext/Dockerfile:18-22` to clone from
the fork instead of `github.com/nmakod/codecontext` v3.2.1 once
the TS-grammar branch lands. **Or** PR the change upstream first
if `nmakod/codecontext` is open to it.
6. Drop the fork's own `tree-sitter-javascript` dependency? No —
`tree-sitter-typescript` Go binding is separate and the JS
grammar is still needed for `.js`/`.jsx` files.
Rough LoC: ~20 lines in manager.go, +1 line go.mod, +1 import, +1
language-detect entry; ~50 lines of tests; ~5 lines in Dockerfile.
**Risk:** TS grammar parses superset syntax; some TS files may now
hit `ERROR` nodes the JS grammar happily accepted. Mitigate by
keeping the JS grammar registered for `.js`/`.jsx` and not changing
JS handling. Regression risk lives in the codecontext-binary CI
(JS+TS combined corpus) — verify their existing tests still pass.
Tests to add: a fixture file containing each B2 missed symbol and a
manager_test that asserts the symbols are recovered.
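A sketch of what the fixture and its expected-symbol list could look like. The declaration names echo symbols B2 reports as missed, but the bodies, `ExampleKind`, and the file layout are invented for illustration. The real assertion would live in a Go `manager_test`; the regex extractor below is only a stand-in that mirrors the manual grep from B2.

```ts
// Hypothetical fixture source, kept as text so a test can hand it to a parser.
const FIXTURE_SOURCE = [
  'export interface Project {',
  '  id: string;',
  '}',
  "export type SessionStatus = 'open' | 'closed';",
  'export enum ExampleKind {',
  "  Chat = 'chat',",
  '}',
  'export interface ToolDef<TInput> {',
  '  name: string;',
  '}',
  "export const viewFile: ToolDef<{ path: string }> = { name: 'view_file' };",
  'export async function streamCompletion(): Promise<void> {}',
].join('\n');

// Names a TS-aware parser is expected to recover, in source order.
const EXPECTED_SYMBOLS = [
  'Project',
  'SessionStatus',
  'ExampleKind',
  'ToolDef',
  'viewFile',
  'streamCompletion',
];

// Same shape as the manual grep in B2: top-level declarations only.
const DECL_RE =
  /^(?:export )?(?:async function|function|interface|type|enum|namespace|const|class) ([A-Za-z_$][A-Za-z0-9_$]*)/;

function topLevelDeclNames(source: string): string[] {
  const names: string[] = [];
  for (const line of source.split('\n')) {
    const m = DECL_RE.exec(line);
    if (m !== null) names.push(m[1]);
  }
  return names;
}
```

The typed const (`viewFile: ToolDef<…>`) is included deliberately, since B2 identifies that shape as the one the JS grammar bails on.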
**Effort:** Phase A (grammar swap + tests + Dockerfile pin): 90 min
once a build-and-test loop is set up in the fork.
**Sequence:** Blocked on a decision about whether to PR upstream
(`nmakod/codecontext`) or fork-and-deploy (`nuthan-ms/codecontext`).
Unblocks D3 (cleaner TS results = smaller noise in synthesis output
= smaller token cost).
**Decision:** **Narrow**, not "drop" and not "full TS support". Drop
is wrong because TS *is* the workload (A2 + B1 show every agent and
the codebase under analysis are TS-heavy). Full Phase 3-4 TS support
(generics, decorators, full type queries) is overkill for current
usage — interface/type/enum recovery captures the model's actual
need.
### D3. Synthesis pipeline optimizations
**Title:** Reduce per-turn codecontext latency and cache the overview.
**Why:** C2 shows avg 15.3 s per codecontext call and an overview
that rebuilds on every call. Synthesis always pays the 30 s wrapper
timeout when the codecontext binary panics (C3 case 4) or hangs.
**Three sub-items:**
D3a. **Cache the overview at the shim layer.** The shim already
serialises calls under `callMu` (`shim.go:74-77`). Add a
per-`target_dir` overview cache keyed on a directory-mtime hash, TTL ~60s.
Sub-second cache hits for repeated `get_codebase_overview` calls
(today shows ~9 in a single chat over a few minutes).
- File: `/opt/boocode/codecontext/shim.go`
- LoC: ~80
- Effort: 90 min
- Risk: invalidation. Use a cheap, fast invalidator (mtime of
target_dir + a hash of the file count via `os.ReadDir`). On any
doubt, bypass cache.
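The cache logic for D3a, sketched in TypeScript for readability even though the shim is Go, so this would need porting. `OverviewCache` and `directoryFingerprint` are hypothetical names; the fingerprint combines directory mtime and entry count as suggested above, and any mismatch or expiry is treated as a miss.

```ts
interface CacheEntry {
  fingerprint: string;
  storedAtMs: number;
  body: string;
}

// Cheap invalidator: directory mtime plus entry count.
function directoryFingerprint(dirMtimeMs: number, entryCount: number): string {
  return `${dirMtimeMs}:${entryCount}`;
}

class OverviewCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without sleeping.
  constructor(ttlMs: number = 60_000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached overview, or undefined on miss, expiry, or
  // fingerprint mismatch. Stale entries are dropped ("on any doubt, bypass").
  get(targetDir: string, fingerprint: string): string | undefined {
    const entry = this.entries.get(targetDir);
    if (entry === undefined) return undefined;
    const fresh = this.now() - entry.storedAtMs < this.ttlMs;
    if (!fresh || entry.fingerprint !== fingerprint) {
      this.entries.delete(targetDir);
      return undefined;
    }
    return entry.body;
  }

  set(targetDir: string, fingerprint: string, body: string): void {
    this.entries.set(targetDir, { fingerprint, storedAtMs: this.now(), body });
  }
}
```

A top-level mtime does not change when a nested file is edited, so this invalidator can serve stale overviews for up to the TTL; that trade-off is the invalidation risk named above.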
D3b. **Align wrapper and shim timeouts.** Wrapper 30 s
(`codecontext_client.ts:70`), shim ctx 60 s (`shim.go:325`). The
mismatch wastes CPU when the wrapper gives up but the shim keeps
running. Either drop the shim ctx to 30 s, or raise the wrapper
to 60 s (depending on which budget is right). Recommended: align
both to 45 s, abort upstream on wrapper cancel.
- LoC: 2 lines
- Effort: 30 min
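A sketch of the wrapper side of D3b: one shared budget, and an `AbortSignal` passed to `fetch` so the HTTP request is torn down when the wrapper gives up. `SHARED_TIMEOUT_MS`, `startCallBudget` and `postWithBudget` are hypothetical names, and whether the shim stops the child call when the client disconnects depends on how it derives its context, which I did not verify.

```ts
// 45 s is the aligned value recommended above; the shim would use the same.
const SHARED_TIMEOUT_MS = 45_000;

interface CallBudget {
  signal: AbortSignal;
  finish: () => void; // call on success: stops the timer without aborting
  cancel: (reason: string) => void; // call on user cancel: aborts immediately
}

function startCallBudget(timeoutMs: number = SHARED_TIMEOUT_MS): CallBudget {
  const controller = new AbortController();
  const timer = setTimeout(() => {
    controller.abort(new Error('codecontext call exceeded ' + timeoutMs + ' ms'));
  }, timeoutMs);
  return {
    signal: controller.signal,
    finish: () => clearTimeout(timer),
    cancel: (reason: string) => {
      clearTimeout(timer);
      controller.abort(new Error(reason));
    },
  };
}

// Posts a JSON body and aborts the request when the budget runs out.
async function postWithBudget(
  url: string,
  body: unknown,
  timeoutMs: number = SHARED_TIMEOUT_MS,
): Promise<unknown> {
  const budget = startCallBudget(timeoutMs);
  try {
    const res = await fetch(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
      signal: budget.signal,
    });
    return await res.json();
  } finally {
    budget.finish();
  }
}
```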
D3c. **Fix the relative-path bug in `get_file_analysis`.** The
wrapper resolves `target_dir` but not `file_path`. Three failures
in one chat today wasted 48 s of CPU. Fix:
- File: `/opt/boocode/apps/server/src/services/tools/codecontext/get_file_analysis.ts`
(and possibly the shared client at `codecontext_client.ts`).
- Have the wrapper resolve `file_path` against the realpath'd
project root before forwarding, mirroring `target_dir`. Error out
if the resolved path doesn't start with the project root.
- LoC: ~20
- Effort: 60 min
- Risk: low — the model loses no affordance; absolute and relative
both work.
- Tests: `codecontext_client.test.ts`.
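A minimal sketch of the D3c fix, assuming the project root has already been realpath'd by the existing `target_dir` logic. `resolveFilePath` is a hypothetical name, and the check here is purely lexical; the real wrapper would probably want a realpath on the result too, to catch symlink escapes.

```ts
import { posix as path } from 'node:path';

// Resolves a model-supplied file_path (relative or absolute) against the
// project root and rejects anything that lands outside it.
function resolveFilePath(projectRoot: string, filePath: string): string {
  const root = path.resolve(projectRoot);
  const resolved = path.resolve(root, filePath);
  const rel = path.relative(root, resolved);
  const escapes = rel === '..' || rel.startsWith('../') || path.isAbsolute(rel);
  if (rel === '' || escapes) {
    throw new Error('file_path must name a file inside the project root: ' + filePath);
  }
  return resolved;
}
```

With root `/opt/boocode`, the relative path from the failing calls resolves to `/opt/boocode/apps/server/src/services/inference.ts`, an absolute path inside the root passes through unchanged, and `../etc/passwd` is rejected.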
**Sequence:** D3c is independent and high-ROI. D3a depends on
nothing. D3b is independent. Recommended order: D3c → D3b → D3a.
### D4. Removal candidates
1. **`watch_changes` agent exposure** (A3 + A2). Server-side handler
stays for completeness; it should not appear in agent
`tools:` arrays. Edit `/opt/boocode/data/AGENTS.md` lines 6, 41,
62, 100, 138.
2. **The dead "csharp" comment-out block** in
`/opt/forks/codecontext/internal/parser/manager.go:146-152` —
delete-on-touch when D2 lands; not part of D2's core scope.
3. **The 3 zero-use codecontext tool exposures** —
`get_dependencies`, `get_symbol_info`, `get_semantic_neighborhoods`.
Same surgical edits as item 1. Consider keeping
`get_dependencies` on the Refactorer because the agent
description explicitly invokes "Use get_dependencies to map call
sites" (`AGENTS.md:92-93`); if the model isn't using it despite
the system-prompt nudge, the description in
`tools/codecontext/get_dependencies.ts` likely needs the same
verb-forward rewrite.
## Claims I did not verify
- **DB retention horizon.** All `message_parts` rows are dated
2026-05-22. That could mean (a) the DB was wiped today, (b) the
schema/path moved today, or (c) the project is brand-new and 24 h
is genuinely the full history. The CLAUDE.md project context
references "v1.13.15-codecontext-synth" which is recent. To verify:
`docker exec boocode_db psql -U boocode -d boocode -c "SELECT
MIN(created_at), MAX(created_at), COUNT(*) FROM messages;"` then
cross-check against the BooCode roadmap's release dates. A3's
all-time query may simply not have older data to find.
- **Whether `nmakod/codecontext` v3.2.1 hosts the same
`nodeToSymbolJS` switch I read in the fork.** The fork at
`/opt/forks/codecontext` is `nuthan-ms/codecontext` per
go.mod. The deployed v3.2.1 is `nmakod/codecontext`. The Dockerfile
comment (`/opt/boocode/codecontext/Dockerfile:13-16`) says the
module path differs but "the tagged v3.2.1 source tree is the same
either way." To verify, clone
`https://github.com/nmakod/codecontext` at tag v3.2.1 and diff
`internal/parser/manager.go` against the fork — outside this
recon's read-only scope.
- **Whether `tree-sitter-typescript v0.23.x` Go bindings actually
build under the fork's `go 1.24.5` + Tree-sitter `v0.25.0`
combination.** Context7 docs confirm the *API exists*. Confirm by
`go get github.com/tree-sitter/tree-sitter-typescript@latest`
followed by `go build ./...` in a scratch worktree.
- **Whether the codecontext panic in `searchSymbols` is reproducible
on `/opt/boocode` or only on `/opt/forks/codecontext`** (the panic
was captured against target_dir `/opt/forks/codecontext`). Reproduce
via `docker exec boocode_codecontext wget -qO -
--post-data='{"target_dir":"/opt/boocode","query":"foo","limit":10}'
--header='Content-Type: application/json'
http://localhost:8080/v1/search_symbols`.
- **Cache hit rate of codecontext analysis (per call vs reused).**
The MCP-server log line `Refreshing analysis for codebase
overview…` suggests rebuild-every-call, but I did not confirm by
reading the codecontext source — only the deployed binary's log
output. To verify, read
`/opt/forks/codecontext/internal/mcp/server.go` around the
`Refreshing analysis…` log lines.
- **Drift correlation strength.** N=1 confirmed drift case is too
small to call a correlation with codecontext use. To raise the
signal: extend retention, re-query after a week of synthetic
load with and without codecontext tools.
- **Whether the synth pipeline's `truncated head only` ships fewer
tokens than a full inlined codecontext result would.** Today's
budget contract assumes yes (`synthesisPipeline.ts:138-145`
comment "Truncated head only — full content was used for
reference extraction above"). To verify: instrument the
per-pass `promptTokens` and compare against a one-off pass with
the full content.
- **The Architect/Code-Reviewer agents' system-prompt copy versus
actual tool usage.** AGENTS.md text claims agents will "Use
get_dependencies to map call sites" (line 92) and "Use
get_semantic_neighborhoods to find related components"
(line 132), but A3 shows neither is called. To verify whether the
model is ignoring the prompt or whether these agents simply
aren't being invoked, query
`SELECT s.name, COUNT(*) FROM sessions s JOIN chats c ON
c.session_id=s.id JOIN messages m ON m.chat_id=c.id WHERE
m.role='assistant' GROUP BY 1 ORDER BY 2 DESC;` and compare
named agents to assistant-message counts (the query counts messages, not chats).

@@ -0,0 +1,379 @@
# BooCoder Provider Picker: Backend (Steps 1-3)
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Expose a `GET /api/providers` endpoint on BooCoder (port 9502) that returns all available providers with their model lists, so the frontend can build a two-level provider → model picker.
**Architecture:** A static provider registry maps agent names to their metadata (transport, model source). The existing `agent-probe.ts` is extended to discover models for each agent and persist them in a new `models` JSONB column on `available_agents`. A new `/api/providers` route merges the registry with DB state and llama-swap models to produce the response.
**Tech Stack:** Fastify, postgres (porsager), Zod, SSH exec to host for agent discovery.
---
## File Map
| Action | File | Responsibility |
|--------|------|----------------|
| Create | `apps/coder/src/services/provider-registry.ts` | Static provider metadata (label, transport, model source) |
| Modify | `apps/coder/src/schema.sql` | Add `models`, `label`, `transport` columns to `available_agents` |
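| Modify | `apps/coder/src/services/agent-probe.ts` | Discover models per agent, persist to DB |
| Modify | `apps/coder/src/services/pty-dispatch.ts` | Replace the goose stub with a non-interactive `goose run --text` command |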
| Create | `apps/coder/src/routes/providers.ts` | `GET /api/providers` route |
| Modify | `apps/coder/src/index.ts` | Register providers route |
---
### Task 1: Provider Registry
**Files:**
- Create: `apps/coder/src/services/provider-registry.ts`
- [ ] **Step 1: Create the provider registry**
```typescript
// apps/coder/src/services/provider-registry.ts

export interface ProviderDef {
  name: string;
  label: string;
  transport: 'native' | 'acp' | 'pty';
  modelSource: 'llama-swap' | 'static';
  staticModels?: Array<{ id: string; label: string }>;
}

export const PROVIDERS: ProviderDef[] = [
  {
    name: 'boocode',
    label: 'BooCoder',
    transport: 'native',
    modelSource: 'llama-swap',
  },
  {
    name: 'opencode',
    label: 'OpenCode',
    transport: 'acp',
    modelSource: 'llama-swap',
  },
  {
    name: 'goose',
    label: 'Goose',
    transport: 'acp',
    modelSource: 'llama-swap',
  },
  {
    name: 'claude',
    label: 'Claude Code',
    transport: 'pty',
    modelSource: 'static',
    staticModels: [
      { id: 'claude-opus-4-20250514', label: 'Opus 4' },
      { id: 'claude-sonnet-4-20250514', label: 'Sonnet 4' },
    ],
  },
  {
    name: 'qwen',
    label: 'Qwen Code',
    transport: 'pty',
    modelSource: 'static',
    // Models discovered at probe time from ~/.qwen/settings.json on host
  },
];

export const PROVIDERS_BY_NAME = new Map(PROVIDERS.map((p) => [p.name, p]));
```
- [ ] **Step 2: Verify TypeScript compiles**
Run: `npx tsc -p apps/coder/tsconfig.json --noEmit 2>&1 | head -20`
Expected: No errors from provider-registry.ts
---
### Task 2: Schema Migration
**Files:**
- Modify: `apps/coder/src/schema.sql`
- [ ] **Step 1: Back up the schema file**
Run: `cp apps/coder/src/schema.sql apps/coder/src/schema.sql.bak-$(date +%Y%m%d)`
- [ ] **Step 2: Add columns to available_agents**
Append to the end of `apps/coder/src/schema.sql`:
```sql
-- v2.1.0: provider picker — extend available_agents with model discovery.
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS models JSONB DEFAULT '[]'::jsonb;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS label TEXT;
ALTER TABLE available_agents ADD COLUMN IF NOT EXISTS transport TEXT DEFAULT 'pty';
```
- [ ] **Step 3: Verify schema applies cleanly**
This can't be tested locally without a DB connection. The schema is idempotent (`ADD COLUMN IF NOT EXISTS`), so it's safe to apply on startup. Verify syntax by reading the file.
---
### Task 3: Extend agent-probe for model discovery
**Files:**
- Modify: `apps/coder/src/services/agent-probe.ts`
- [ ] **Step 1: Import the provider registry**
Add at the top of `agent-probe.ts`:
```typescript
import { PROVIDERS_BY_NAME } from './provider-registry.js';
```
- [ ] **Step 2: Replace KNOWN_AGENTS with registry-driven list**
Replace the `KNOWN_AGENTS` array and its type with a derivation from the provider registry:
```typescript
const KNOWN_AGENTS = ['opencode', 'goose', 'claude', 'qwen'].map((name) => ({
  name,
  supportsAcp: PROVIDERS_BY_NAME.get(name)?.transport === 'acp',
}));
```
This preserves the same shape the rest of `probeAgents` expects while deriving `supportsAcp` from the registry's `transport` field. `pi` is dropped (no provider def, not actively used). `boocode` is excluded (native — no binary to probe on host).
- [ ] **Step 3: Add model discovery after the existing version check**
Inside the `for (const agent of KNOWN_AGENTS)` loop, after the ACP check block and before the UPSERT, add model discovery:
```typescript
// Discover models for this agent
let models: Array<{ id: string; label: string }> = [];
const providerDef = PROVIDERS_BY_NAME.get(agent.name);
if (providerDef?.modelSource === 'static' && providerDef.staticModels) {
  models = providerDef.staticModels;
}
if (agent.name === 'qwen') {
  try {
    const catResult = await sshExec('cat ~/.qwen/settings.json', { timeoutMs: 10_000 });
    if (catResult.exitCode === 0 && catResult.stdout.trim()) {
      const settings = JSON.parse(catResult.stdout) as {
        modelProviders?: { openai?: Array<{ id: string }> };
      };
      const openaiModels = settings?.modelProviders?.openai;
      if (Array.isArray(openaiModels)) {
        models = openaiModels.map((m) => ({ id: m.id, label: m.id }));
      }
    }
  } catch {
    // ~/.qwen/settings.json missing or unparseable; fall back to empty
  }
}
```
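The parsing half of this step can be pulled into a pure function so it is testable without SSH. The following is a minimal sketch under that assumption: `parseQwenModels` is a hypothetical helper name, not something the plan defines, and it only assumes the `modelProviders.openai[].id` shape shown above.

```typescript
// Hypothetical helper (not part of the plan): extracts model ids from the raw
// contents of ~/.qwen/settings.json. Any malformed input yields an empty list,
// mirroring the "fall back to empty" behaviour of the catch block above.
interface ProviderModel {
  id: string;
  label: string;
}

function parseQwenModels(raw: string): ProviderModel[] {
  let settings: unknown;
  try {
    settings = JSON.parse(raw);
  } catch {
    return []; // unparseable file
  }
  if (typeof settings !== 'object' || settings === null) return [];

  const providers = (settings as { modelProviders?: unknown }).modelProviders;
  if (typeof providers !== 'object' || providers === null) return [];

  const openai = (providers as { openai?: unknown }).openai;
  if (!Array.isArray(openai)) return [];

  const models: ProviderModel[] = [];
  for (const entry of openai) {
    // Skip entries without a usable string id instead of emitting undefined ids.
    const id = (entry as { id?: unknown } | null)?.id;
    if (typeof id === 'string' && id.length > 0) {
      models.push({ id, label: id });
    }
  }
  return models;
}
```

Inside the probe loop the call would then reduce to `models = parseQwenModels(catResult.stdout)` after the exit-code check, if this refactor is adopted.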
- [ ] **Step 4: Update the UPSERT to include new columns**
Replace the existing UPSERT statement with:
```typescript
const label = providerDef?.label ?? agent.name;
const transport = providerDef?.transport ?? 'pty';
await sql`
  INSERT INTO available_agents (name, install_path, version, supports_acp, last_probed_at, models, label, transport)
  VALUES (${agent.name}, ${installPath}, ${version}, ${supportsAcp}, clock_timestamp(), ${JSON.stringify(models)}::jsonb, ${label}, ${transport})
  ON CONFLICT (name) DO UPDATE SET
    install_path = EXCLUDED.install_path,
    version = EXCLUDED.version,
    supports_acp = EXCLUDED.supports_acp,
    last_probed_at = EXCLUDED.last_probed_at,
    models = EXCLUDED.models,
    label = EXCLUDED.label,
    transport = EXCLUDED.transport
`;
```
- [ ] **Step 5: Update the log line to include model count**
Replace the existing log.info with:
```typescript
log.info({ agent: agent.name, version, installPath, supportsAcp, modelCount: models.length }, 'agent-probe: found on host');
```
- [ ] **Step 6: Verify TypeScript compiles**
Run: `npx tsc -p apps/coder/tsconfig.json --noEmit 2>&1 | head -20`
Expected: No errors
---
### Task 4: Goose PTY dispatch
**Files:**
- Modify: `apps/coder/src/services/pty-dispatch.ts`
- [ ] **Step 1: Replace the goose stub in buildAgentCommand**
In `pty-dispatch.ts`, replace the goose case (line ~61):
```typescript
case 'goose':
  return model
    ? `goose run --text '${escapedTask}' --model '${model}'`
    : `goose run --text '${escapedTask}'`;
```
Note: `goose run --text` is the non-interactive execution flag. If goose's actual CLI differs, the dispatch will fail with a nonzero exit code and the task will be marked `failed` — no silent corruption.
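The snippet above interpolates `model` into the shell string without escaping it. A minimal sketch of one way to quote both arguments follows; `shellQuote` and `buildGooseCommand` are hypothetical names for illustration (the plan's own escaping helper behind `escapedTask` is not shown here), and the `--text` / `--model` flags are taken from the plan as-is, not verified against goose's CLI.

```typescript
// Hypothetical helper: POSIX single-quote escaping for one shell argument.
// Inside single quotes nothing is special except the quote itself, so each
// embedded ' is rewritten as '\'' (close quote, escaped quote, reopen quote).
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical builder mirroring the goose case above, with both the task
// and the optional model passed through the same quoting function.
function buildGooseCommand(task: string, model?: string): string {
  const base = `goose run --text ${shellQuote(task)}`;
  return model ? `${base} --model ${shellQuote(model)}` : base;
}
```

Quoting the model id the same way as the task keeps a malformed or hostile id from breaking out of the command string.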
- [ ] **Step 2: Update the module docstring**
Replace `goose: stub (not yet supported)` with `goose: \`goose run --text <task>\` (non-interactive)` in the header comment.
- [ ] **Step 3: Verify TypeScript compiles**
Run: `npx tsc -p apps/coder/tsconfig.json --noEmit 2>&1 | head -20`
Expected: No errors
---
### Task 5: GET /api/providers Route
**Files:**
- Create: `apps/coder/src/routes/providers.ts`
- Modify: `apps/coder/src/index.ts`
- [ ] **Step 1: Create the providers route**
```typescript
// apps/coder/src/routes/providers.ts
import type { FastifyInstance } from 'fastify';
import type { Sql } from '../db.js';
import type { Config } from '../config.js';
import { PROVIDERS } from '../services/provider-registry.js';

interface ProviderModel {
  id: string;
  label: string;
}

interface ProviderResponse {
  name: string;
  label: string;
  transport: string;
  installed: boolean;
  models: ProviderModel[];
}

interface LlamaSwapModel {
  id: string;
  [key: string]: unknown;
}

async function fetchLlamaSwapModels(config: Config): Promise<ProviderModel[]> {
  try {
    const res = await fetch(`${config.LLAMA_SWAP_URL}/v1/models`);
    if (!res.ok) return [];
    const parsed = (await res.json()) as { data?: LlamaSwapModel[] };
    return (parsed.data ?? []).map((m) => ({ id: m.id, label: m.id }));
  } catch {
    return [];
  }
}

export function registerProviderRoutes(app: FastifyInstance, sql: Sql, config: Config): void {
  app.get('/api/providers', async (_req, _reply) => {
    // Fetch llama-swap models (shared by boocode, opencode, goose)
    const llamaModels = await fetchLlamaSwapModels(config);

    // Fetch installed agents from DB
    const agents = await sql<{ name: string; models: ProviderModel[]; label: string | null; transport: string | null }[]>`
      SELECT name, models, label, transport FROM available_agents
    `;
    const agentMap = new Map(agents.map((a) => [a.name, a]));

    const result: ProviderResponse[] = [];
    for (const provider of PROVIDERS) {
      const isNative = provider.name === 'boocode';
      const agentRow = agentMap.get(provider.name);
      const installed = isNative || !!agentRow;
      if (!installed) continue;

      let models: ProviderModel[];
      if (provider.modelSource === 'llama-swap') {
        models = llamaModels;
      } else if (agentRow?.models && agentRow.models.length > 0) {
        models = agentRow.models;
      } else if (provider.staticModels) {
        models = provider.staticModels;
      } else {
        models = [];
      }

      result.push({
        name: provider.name,
        label: agentRow?.label ?? provider.label,
        transport: agentRow?.transport ?? provider.transport,
        installed,
        models,
      });
    }
    return result;
  });
}
```
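The model-selection branch in the route encodes a precedence: llama-swap models for `llama-swap` providers, otherwise probed DB models, otherwise the registry's `staticModels`, otherwise an empty list. As a sketch, that precedence could be isolated in a pure function for unit testing; `resolveModels` and `ModelInputs` are hypothetical names, not part of the plan.

```typescript
interface ProviderModel {
  id: string;
  label: string;
}

// Hypothetical input bundle: everything the route already has in scope
// when it picks a model list for one provider.
interface ModelInputs {
  modelSource: 'llama-swap' | 'static';
  llamaModels: ProviderModel[];
  probedModels?: ProviderModel[]; // `models` column from available_agents
  staticModels?: ProviderModel[]; // fallback from the provider registry
}

function resolveModels(input: ModelInputs): ProviderModel[] {
  // llama-swap providers always share the live llama-swap list.
  if (input.modelSource === 'llama-swap') return input.llamaModels;
  // Static providers prefer what the probe persisted, if anything.
  if (input.probedModels && input.probedModels.length > 0) return input.probedModels;
  // Otherwise fall back to the registry's static list, or nothing.
  return input.staticModels ?? [];
}
```

With this shape, the qwen case (static source, no `staticModels`) returns the probed list when the probe found models and an empty list when it did not.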
- [ ] **Step 2: Register the route in index.ts**
In `apps/coder/src/index.ts`, add the import near the other route imports (around line 28):
```typescript
import { registerProviderRoutes } from './routes/providers.js';
```
Add the registration call after the other `register*Routes` calls (around line 148):
```typescript
registerProviderRoutes(app, sql, config);
```
- [ ] **Step 3: Verify TypeScript compiles**
Run: `npx tsc -p apps/coder/tsconfig.json --noEmit 2>&1 | head -20`
Expected: No errors
- [ ] **Step 4: Build and test**
Run:
```bash
docker compose build --no-cache boocode && docker compose up -d
```
Then verify:
```bash
curl http://100.114.205.53:9502/api/providers | jq .
```
Expected shape:
```json
[
{ "name": "boocode", "label": "BooCoder", "transport": "native", "installed": true, "models": [...] },
{ "name": "opencode", ... },
...
]
```
---
## Checkpoint Verification
After Task 5, report:
1. `curl http://100.114.205.53:9502/api/providers` output
2. `available_agents` schema after migration: `psql -h localhost -p 5500 -U boocode -d boochat -c '\d available_agents'`
3. Any issues with qwen model discovery from `~/.qwen/settings.json`
**Do NOT proceed to frontend (Step 4 in the spec) without confirming the API works.**

@@ -0,0 +1,4 @@
# v1.13.12-skills-audit
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.13.15-codecontext-synth
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.13.17-cross-repo-reads
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.13.18-codecontext-file-path
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.13.20-drop-legacy-cols
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.14-outer-loop
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.14.1-mcp-poc
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.14.x-html-artifact-panes
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v1.15-mcp-multi
**Status:** Shipped. Archived.

@@ -0,0 +1,4 @@
# v2.0-boocoder
**Status:** Shipped. Archived.

@@ -1,132 +0,0 @@
# v1.13.12 — skills audit pass
Audit of 26 skills vendored from `/home/samkintop/opt/skills/` into `/opt/boocode/data/skills/`. Each sorted into one of four buckets per the Codeminer42 rules→recipes split.
## Deviations from the batch spec
| Spec said | Reality | Resolution |
|---|---|---|
| `/opt/boocode/skills/` is the audit target | Skills directory is `/opt/boocode/data/skills/` (per `services/skills.ts:19` `SKILLS_ROOT = '/data/skills'`) | Vendored to the correct path |
| `/opt/boocode/AGENTS.md` for bucket-(a) rule additions | `data/AGENTS.md` is an agent registry (`## H2` per agent with frontmatter), not a rules file | Bucket-(a) rules go to `BOOCHAT.md` (the container guidance file the chat agent reads) instead |
| "7 vendored v1.12 skills" exist to audit | Zero SKILL.md ever committed; `data/skills/` was empty | Vendored all 26 from `/home/samkintop/opt/skills/` in this batch (vendor + audit combined) |
| `data/` content tracked in git | `.gitignore` excluded all of `data/` | Added negation patterns (`data/*` + `!data/AGENTS.md` + `!data/skills/`) so audit work shows up in git |
| Container reads `data/skills/` from the boocode repo | `docker-compose.yml:18` had `- /opt/skills:/data/skills` override mount — container actually read from host-level `/opt/skills/`, ignoring repo `data/skills/` | Removed the override mount. Skill library now lives in `data/skills/` (repo-tracked, per-batch auditable). Host `/opt/skills/` preserved untouched for other tools (Claude Code, etc.). 1-line deviation from spec's "zero code change" claim — necessary to make the spec's intent actually take effect |
## Bucket tally
| Bucket | Action | Count |
|---|---|---|
| (a) | Move to BOOCHAT.md as always-true rule | 1 |
| (b) | Keep as recipe, apply Anthropic conventions | 14 |
| (c) | Keep + move bulk to `references/` (SKILL.md > 500 lines) | 0 |
| (d) | Delete (duplicates Claude native capability or doesn't fit BooCode) | 11 |
| **Total** | | **26** |
No skill exceeded the 500-line ceiling — bucket (c) is empty. Longest survivor: `systematic-debugging` at 296 lines.
## Per-skill decisions
| Skill (path) | Lines | Bucket | Disposition | Rationale |
|---|---:|:---:|---|---|
| `anthropics/agent-development` | 196 | (b) | Keep; rename → `developing-agents` | BooCode-specific value (manages `data/AGENTS.md` tier-2 registry) |
| `anthropics/claude-md-improver` | 180 | (d) | Delete | Overlaps `boocode-guidance-improver` (more specific) |
| `anthropics/frontend-design` | 42 | (b) | Keep; rename → `designing-frontends` | Concise UI design guidance, no overlap |
| `anthropics-knowledge-work/code-review` | 118 | (b) | Keep; rename → `reviewing-code` | Generic code review process distinct from `receiving-` / `requesting-code-review` |
| `anthropics-knowledge-work/task-management` | 91 | (d) | Delete | `user-invocable: false`; duplicates BooCode's TodoWrite/TaskCreate native capability |
| `asyrafhussin/react-vite-best-practices` | 182 | (b) | Keep; rename → `optimizing-react-vite` | Matches BooCode's stack (Vite, not Next.js) |
| `boocode/boocode-guidance-improver` | 167 | (b) | Keep; rename → `improving-boocode-guidance` | BooCode-specific 10-dimension rubric for `CLAUDE.md`/`BOOCHAT.md`/`BOOCODER.md`/`AGENTS.md` |
| `mattpocock/diagnose` | 117 | (b) | Keep; rename → `diagnosing-bugs` | Complement to `systematic-debugging`: focus on building a feedback loop |
| `mattpocock/grill-me` | 20 | (b) | Keep; rename → `grilling-plans` | Plan stress-testing |
| `mattpocock/grill-with-docs` | 98 | (d) | Delete | Requires `CONTEXT.md` and `docs/adr/` that BooCode doesn't have |
| `mattpocock/handoff` | 17 | (d) | Delete | BooCode is single-user; no agent handoff scenario |
| `mattpocock/improve-codebase-architecture` | 71 | (d) | Delete | Requires `CONTEXT.md` and `docs/adr/` |
| `mattpocock/to-issues` | 83 | (d) | Delete | BooCode uses `openspec/changes/`, not an issue tracker |
| `mattpocock/to-prd` | 76 | (d) | Delete | Same — no issue tracker |
| `mattpocock/write-a-skill` | 121 | (b) | Keep; rename → `writing-skills` | Authoring new skills for this very system |
| `mattpocock/zoom-out` | 7 | (d) | Delete | Claude does this natively when asked; 7-line skill is overhead |
| `superpowers/brainstorming` | 164 | (b) | Keep (already gerund) | Before-features creative-work process |
| `superpowers/receiving-code-review` | 213 | (b) | Keep (already gerund) | Sam reviews everything; process for handling feedback |
| `superpowers/requesting-code-review` | 103 | (b) | Keep (already gerund) | Before-merge verification |
| `superpowers/systematic-debugging` | 296 | (b) | Keep (already gerund) | Comprehensive bug-fix discipline (root-cause-first) |
| `superpowers/using-superpowers` | 117 | (d) | Delete | Meta-skill about skill discovery; Claude does discovery natively |
| `superpowers/verification-before-completion` | 139 | (a) | Migrate rule to `BOOCHAT.md`, delete skill dir | Always-true rule: evidence before assertions. Belongs 100% present, not 6% invoked |
| `superpowers/writing-plans` | 152 | (b) | Keep (already gerund) | Maps to BooCode's `openspec/changes/` workflow |
| `vercel-labs/find-skills` | 142 | (d) | Delete | Skill discovery — Claude does this natively |
| `vercel-labs/react-best-practices` | 149 | (d) | Delete | Next.js focus; BooCode uses Vite (asyrafhussin's version is the fit) |
| `vercel-labs/web-design-guidelines` | 39 | (b) | Keep; rename → `reviewing-web-design` | UI compliance review |
## Bucket-(a) migration text
Single rule extracted from `superpowers/verification-before-completion` (139 lines → ~3 lines in BOOCHAT.md):
> **Don't claim work is complete without verifying.** Run the relevant command (test, build, smoke) and confirm the expected output before reporting success. Evidence before assertions catches regressions you'd otherwise miss.
The 139-line process content does not move to BOOCHAT.md — the rule itself is what needs to be 100% present. Process detail is recoverable from the upstream repo if anyone wants to read it later.
## Verification protocol coverage
| Step | Owner | Status |
|---|---|---|
| 1. Discovery (paste SKILL.md, check first-200-char triggering) | Sam — fresh Claude.ai chat per skill | Pending |
| 2. Logic (paste realistic task, check skill recognizes) | Sam — fresh Claude.ai chat | Pending |
| 3. Edge Case (paste boundary task, check correct invoke/decline) | Sam — fresh Claude.ai chat | Pending |
| 4. Architecture Refinement (paste skill + chats, ask for critique) | Sam — fresh Claude.ai chat | Pending |
| 5. `skillgrade --smoke` (5 trials per skill) | Sam — host install `npm i -g skillgrade` first | Pending |
Eval.yaml files written per surviving skill (14 files) so the `skillgrade --smoke` runs are mechanical once `skillgrade` is installed.
## skillgrade scope correction
The `eval.yaml` stubs authored in the prior session use a flat `tasks: [{prompt, grader: [list]}]` shape that does not validate against skillgrade's canonical schema (canonical needs `name`, `instruction`, `workspace`, structured `graders` with `type: deterministic | llm_rubric`, `run` shell script, `weight`, plus a Docker `provider` block). Rewriting all 14 in the canonical format is out of scope for this batch — each needs Docker workspace setup and grader scripts that capture skill-output correctness. Filed as a follow-up: **v1.13.13 — skillgrade eval.yaml canonical rewrite + first quantitative pass**.
The smoke-results column in the table below is `n/a*` for that reason. The 4-step qualitative protocol still runs (via the agent team in this batch) and surfaces the structural issues that quantitative trials would have caught anyway.
## 4-step protocol findings (agent-team batch)
Each surviving skill was assessed by one of 5 parallel teammates (alpha / bravo / charlie / delta / echo) running the mgechev/skills-best-practices 4-step protocol: Discovery → Logic → Edge Case → self-Architecture-Refinement. Teammates wrote per-agent findings to `/tmp/audit-<name>.md`; the table here aggregates.
Ratings shorthand: D=Discovery, L=Logic, E=Edge Case (each 1-5).
| Skill | Auditor | D / L / E | 4-step verdict | Fix applied |
|---|---|:---:|---|---|
| `anthropics/designing-frontends` | alpha | 5 / 5 / 3 | Strong primary triggers; over-broad with "artifacts, posters" (not code targets) | Removed "artifacts, posters" from description trigger list |
| `anthropics/developing-agents` | alpha | 5 / 5 / 4 | Sharp triggering; stale "(as of v1.11.x)" tag and broken `inference.ts:721-731` reference (actual code is at `stream-phase.ts:403-406`) | Updated stale version tag + cross-reference |
| `anthropics-knowledge-work/reviewing-code` | alpha | 4 / 4 / 3 | Good for explicit PR/diff triggers; dead `CONNECTORS.md` cross-reference (file doesn't exist) | Removed broken cross-reference |
| `mattpocock/diagnosing-bugs` | bravo | 5 / 5 / 4 | Strong; missing colloquial phrasings like "not working" / "something wrong" | Added informal trigger phrases |
| `mattpocock/grilling-plans` | bravo | 3 / 4 / 3 | Trigger coverage too narrow — only "grill me" reliably fires; structural risk: mandatory `ask_user_input` tool may not exist in BooCode's tool registry (flagged but not patched — needs env verification) | Added "poke holes", "challenge my design", "play devil's advocate", "what am I missing" triggers |
| `mattpocock/writing-skills` | bravo | 4 / 5 / 3 | Missing "create" phrasing; description-length rule (≤1024 chars) not in Review Checklist | Added "create" trigger + checklist item |
| `superpowers/brainstorming` | charlie | 4 / 4 / 3 | Vague "modifying behavior" causes both over- and under-firing; HARD-GATE wording could be clearer about writing-plans being permitted | Tightened to "non-trivial modifications" + added "refactoring" |
| `superpowers/receiving-code-review` | charlie | 3 / 4 / 3 | Conditional qualifier "especially if feedback seems unclear or technically questionable" mis-frames as edge-case skill rather than default protocol | Removed conditional + broadened to informal channels |
| `superpowers/requesting-code-review` | charlie | 3 / 4 / 2 | Scope collision with built-in `code-review` skill — near-identical surface language but different execution (subagent dispatch vs inline). LOWEST EDGE-CASE SCORE OF THE BATCH | Added "dispatches a separate subagent reviewer" differentiator |
| `superpowers/systematic-debugging` | delta | 5 / 5 / 4 | Strong; build/compile failures appear in body but missing from frontmatter trigger | Extended description with "build failure, compile error" + "debug/investigate/diagnose" |
| `superpowers/writing-plans` | delta | 3 / 4 / 3 | Spec-centric framing gatekeeps on pre-existing spec doc; colloquial "write me a plan" misses | Added colloquial planning trigger phrases |
| `asyrafhussin/optimizing-react-vite` | delta | 4 / 5 / 3 | Over-broad "Vite configuration" triggers on non-perf tasks; body references `rules/*.md` and `AGENTS.md` files that don't exist in the skill dir | Narrowed scope + added broken-reference warnings |
| `boocode/improving-boocode-guidance` | echo | 5 / 5 / 4 | Strong; "critique" in description prose but missing from examples list | Added `"critique my BOOCODER.md"` to examples |
| `vercel-labs/reviewing-web-design` | echo | 3 / 4 / 3 | Generic triggers collide with general code-review; delegates substance to external GitHub URL with no fallback on fetch failure | Named "Vercel's live web-interface-guidelines" as differentiator + added 404 fallback |
## Aggregate notes
**Trigger-quality stats (qualitative, n=14):**
- Discovery 5/5: 5 skills | 4/5: 4 skills | 3/5: 5 skills (avg ~4.0)
- Edge case is the weakest dimension across the batch — most skills hit 3/5 (borderline invoke/decline). Suggests skills are over- or under-triggering on adjacent-but-different tasks.
- Every skill had at least one fix applied. None were judged "clean" with zero issues.
- Zero skills were flagged for retroactive bucket-(a) reclassification — all 14 remain (b) recipes.
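The distribution and averages quoted above can be re-derived from the findings table. The sketch below copies the D / L / E ratings in table row order; the helper names `histogram` and `mean` are illustrative only.

```typescript
// Discovery / Logic / Edge Case ratings, one tuple per row of the findings table.
const ratings: Array<[number, number, number]> = [
  [5, 5, 3], [5, 5, 4], [4, 4, 3], [5, 5, 4], [3, 4, 3], [4, 5, 3], [4, 4, 3],
  [3, 4, 3], [3, 4, 2], [5, 5, 4], [3, 4, 3], [4, 5, 3], [5, 5, 4], [3, 4, 3],
];

// Count how many skills received each score in one column (0 = D, 1 = L, 2 = E).
function histogram(column: 0 | 1 | 2): Map<number, number> {
  const counts = new Map<number, number>();
  for (const row of ratings) {
    const score = row[column];
    counts.set(score, (counts.get(score) ?? 0) + 1);
  }
  return counts;
}

// Arithmetic mean of one column.
function mean(column: 0 | 1 | 2): number {
  const total = ratings.reduce((sum, row) => sum + row[column], 0);
  return total / ratings.length;
}

const discovery = histogram(0); // 5 -> 5 skills, 4 -> 4 skills, 3 -> 5 skills
const edgeCase = histogram(2);  // 3 is the most common Edge Case score
```

Here `mean(0)` comes out to 56 / 14 = 4.0, matching the quoted Discovery average, and nine of the fourteen skills score 3 on Edge Case.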
**Real bugs surfaced (not just polish):**
- `anthropics/developing-agents`: stale code reference (inference.ts:721-731 → stream-phase.ts:403-406). Real dead link.
- `anthropics-knowledge-work/reviewing-code`: dead CONNECTORS.md cross-reference.
- `asyrafhussin/optimizing-react-vite`: references `rules/*.md` and `AGENTS.md` subfiles that don't exist in the skill directory.
- `superpowers/requesting-code-review`: scope collision with built-in `code-review` (review skills/auto-routing — Sam may want to drop one of these).
**Structural flags requiring environment verification (not patched):**
- `mattpocock/grilling-plans`: mandatory `ask_user_input` tool call assumed available. Confirm BooCode's tool registry exposes this to the chat-surface model. If not, the skill body's MANDATORY instruction deadlocks.
**Skillgrade gap remains:**
- Quantitative trigger rates (the original v1.13.12 N/5 column) require skillgrade with canonical-format eval.yaml. Filed as **v1.13.13** follow-up. The qualitative 4-step protocol catches the same class of issue (and arguably more — the broken-reference bugs above would not have shown up in skillgrade's invoke/decline trials).
**Per-agent artifacts (working files, not part of repo):**
- `/tmp/audit-alpha.md` — designing-frontends, developing-agents, reviewing-code
- `/tmp/audit-bravo.md` — diagnosing-bugs, grilling-plans, writing-skills
- `/tmp/audit-charlie.md` — brainstorming, receiving-code-review, requesting-code-review
- `/tmp/audit-delta.md` — systematic-debugging, writing-plans, optimizing-react-vite
- `/tmp/audit-echo.md` — improving-boocode-guidance, reviewing-web-design

@@ -1,145 +0,0 @@
# v1.13.13 — codecontext synthesis pipeline
Slots between v1.13.12 (skills audit) and v1.14 (Phase C outer agent loop). Adds a forced second-inference synthesis pass for codecontext overview/analysis tools so the model stops returning shallow first-touch summaries.
Does NOT change the recursion structure, depth cap, or budget — those are v1.14 concerns. The cap-50 patch from v1.13.12 stays; v1.14 supersedes it via per-agent `agent.steps`.
## What ships
- `apps/server/src/services/synthesisPrompt.ts` (NEW, 20 lines): verbatim system prompt as a const.
- `apps/server/src/services/synthesisPipeline.ts` (NEW, ~450 lines): `SYNTHESIS_TOOLS` set + `runSynthesisPass(params) → Promise<boolean>`. Auto-fetches top-N referenced files + project docs (`BOOCHAT.md`, `AGENTS.md`, `*roadmap*.md`, `CONTEXT.md`), applies a 32k-token budget with priority drop order, streams a synthesis turn via `streamCompletion`, dual-writes a `kind='synthesis'` part.
- `apps/server/src/services/inference/parts.ts`: `PartKind` union extended with `'synthesis'`.
- `apps/server/src/services/inference/tool-phase.ts`: synth-tool result capture during `Promise.all`; post-pause synth check before the recursive `runAssistantTurn`.
- `apps/server/src/schema.sql`: inline CHECK constraint updated + `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` migration block. Idempotent (drops + re-adds on every startup; per-boot cost is trivial).
SYNTHESIS_TOOLS = `{get_codebase_overview, get_framework_analysis, get_semantic_neighborhoods}`. The other 5 codecontext tools (search_symbols, get_dependencies, get_file_analysis, get_symbol_info, watch_changes) return targeted data the model uses directly — no synthesis pass.
## Decisions
### Schema migration was required (dispatch was wrong)
The original dispatch said "kind is text column, no schema migration needed." Reality: `schema.sql:54` has an explicit `message_parts_kind_chk` CHECK constraint enumerating allowed kinds (`'text', 'tool_call', 'tool_result', 'reasoning', 'step_start'`). Adding `'synthesis'` requires updating the constraint.
Resolution: added a `DROP CONSTRAINT IF EXISTS` + `DO $$ ... pg_constraint` idempotency-guarded migration block in `schema.sql` matching the CLAUDE.md migration pattern, plus updated the inline CREATE TABLE constraint so fresh installs include the new value.
### `view_file` input shape uses `start_line`/`end_line`, not `line_count`
The dispatch's auto-fetch sketch implied a `line_count` parameter. The real `viewFile` tool's input schema (`tools.ts:51-55`) takes `start_line`/`end_line` (1-indexed inclusive) with a 200-line default if both are omitted. The pipeline uses `end_line: FILE_LINE_CAP` for files (200) and `end_line: DOC_LINE_CAP` for docs (500), which gives the first N lines — same effective truncation.
### User-abort during synthesis marks the synth message failed (deviates from review req)
**Decision: option A — mark synth message `status='failed'` on every catch path including user-abort, then re-throw on user-abort.**
Sam's stated review requirement: "User-abort path does NOT mark the message failed (re-throw to outer handler is correct)."
Why this deviation: the outer abort handler (`error-handler.ts:handleAbortOrError`) operates on `args.assistantMessageId` — the *parent* assistant message that triggered the tool call. It does not know about the *new* synth assistant message that `runSynthesisPass` created. If the synth row isn't explicitly marked failed on user-abort, it sits in `status='streaming'` until the 5-min stale-streaming sweeper (`apps/server/src/index.ts`) picks it up — meanwhile the frontend's 60s no-token-activity timer trips the stale-stream banner on the orphan. Same UX bug class the v1.13.3 stuck-row sweeper was added to handle.
Cost: one extra DB write + one `message_complete` republish on the rare user-abort-during-synth path. Worth it to avoid the zombie message + ghost banner.
**Note for v1.14 outer-loop port**: when Phase C migrates the depth cap into `agent.steps` and reworks the recursion, the synth message is a sibling to the parent assistant message — both belong to the same chat. The new outer loop should either (a) preserve this pattern (mark all chat-scoped streaming messages failed on abort) or (b) extend `handleAbortOrError` to sweep chat-scoped streaming rows. Option (b) is a wider blast radius and was rejected here; option (a) is one targeted call site.
### Token budget priority list
Drop order when the 32k cap is exceeded (lowest priority first):
1. top-2..N files (keep top-1)
2. top-1 file
3. `*roadmap*.md` + `CONTEXT.md` (mid-priority — both describe state/intent)
4. `AGENTS.md`
5. `BOOCHAT.md`: **never dropped**; truncated to 32k tokens if it alone exceeds the cap
CONTEXT.md wasn't in the original dispatch's priority list; grouped with roadmap as mid-priority (same semantic — both are state/intent docs).
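The drop order above can be sketched as a staged filter. This is an illustration under stated assumptions, not the contents of `synthesisPipeline.ts`: `ContextItem`, `dropStage`, and `applyBudget` are hypothetical names, token counts are assumed to be precomputed, and the sketch assumes items within a stage are dropped one at a time starting from the lowest-ranked (last) entry.

```typescript
interface ContextItem {
  name: string;
  tokens: number;
  // Stage in which the item may be dropped (1 = first to go).
  // 'never' marks BOOCHAT.md, which is truncated rather than dropped.
  dropStage: 1 | 2 | 3 | 4 | 'never';
}

function applyBudget(items: ContextItem[], budget: number): ContextItem[] {
  const kept = [...items];
  const total = (): number => kept.reduce((sum, item) => sum + item.tokens, 0);

  for (const stage of [1, 2, 3, 4] as const) {
    // Walk backwards so later (lower-ranked) entries in a stage go first,
    // and stop as soon as the remaining context fits the budget.
    for (let i = kept.length - 1; i >= 0 && total() > budget; i--) {
      if (kept[i]?.dropStage === stage) kept.splice(i, 1);
    }
  }

  // If the never-dropped doc alone still exceeds the cap, truncate it.
  return kept.map((item) =>
    item.dropStage === 'never' && item.tokens > budget
      ? { ...item, tokens: budget }
      : item,
  );
}
```

For example, with a 32,000-token budget and 37,000 tokens of candidates, the two lowest-ranked files (stage 1) are removed and everything from the top-1 file upward survives.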
### 90s timeout via `AbortSignal.any`
The synthesis call has its own `AbortController` with a 90s `setTimeout`. It is combined with `p.args.signal` (the user-abort signal) via `AbortSignal.any([user, synth])`, so either signal aborts the stream. Requires Node 20.3+. A `timedOut` flag in scope disambiguates which signal tripped after `streamCompletion` throws (`AbortError`): timeout → return false (fall through to recursion); user-abort → re-throw (after `markSynthFailed`).
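A minimal sketch of the signal wiring described here, assuming a Node 20.3+ runtime for `AbortSignal.any`. `createSynthAbort` and `SynthAbort` are hypothetical names for illustration; the real pipeline may structure this differently.

```typescript
interface SynthAbort {
  signal: AbortSignal;     // aborts on user-abort OR timeout
  timedOut: () => boolean; // true only if the timeout fired
  dispose: () => void;     // clears the timer once the call settles
}

function createSynthAbort(userSignal: AbortSignal, timeoutMs = 90_000): SynthAbort {
  const timeoutController = new AbortController();
  let timedOut = false;

  const timer = setTimeout(() => {
    // Set the flag before aborting so the catch block can tell the causes apart.
    timedOut = true;
    timeoutController.abort();
  }, timeoutMs);

  return {
    signal: AbortSignal.any([userSignal, timeoutController.signal]),
    timedOut: () => timedOut,
    dispose: () => clearTimeout(timer),
  };
}
```

In a catch block around the streaming call, `timedOut()` returning true would map to the "return false" path, while a false value with `userSignal.aborted` set would map to the re-throw path. Calling `dispose()` in a `finally` keeps the timer from outliving the call.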
### Race-safe synth-tool capture under `Promise.all`
`synthEntries: Array<{tc, output, error?}>` populated by each parallel callback pushing its own result. After `Promise.all` resolves, `synthEntries.find((e) => !e.error && e.output != null)` picks the first non-error synth entry by call-order (i.e. by `toolCalls` array index in the original LLM emit order). Not result-quality scoring — explicitly call-order, documented inline.
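One way to make the call-order guarantee explicit is to record each entry's position in the `toolCalls` array, since the order in which parallel callbacks push their results reflects completion order rather than emit order. The sketch below assumes a `callIndex` field and a `pickSynthEntry` helper, both hypothetical; whether the shipped code tracks order this way is not confirmed here.

```typescript
interface SynthEntry {
  callIndex: number;     // index of the tool call in the LLM's emit order
  toolName: string;
  output: string | null;
  error?: string;
}

// Picks the first successful synth entry by call order, regardless of the
// order in which the parallel callbacks happened to push their results.
function pickSynthEntry(entries: SynthEntry[]): SynthEntry | undefined {
  return [...entries]
    .sort((a, b) => a.callIndex - b.callIndex)
    .find((e) => !e.error && e.output != null);
}
```

Each callback pushes its own entry, so there is no shared variable to overwrite, and the sort restores emit order before the `find`.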
### Known interaction: qwen3.6 `include_stats: "True"` retry loop compounds synth-pass cost
Smoke #1 surfaced a pre-existing qwen3.6 quirk: the model emits `"True"` (string) instead of `true` (bool) for boolean tool args. The `experimental_repairToolCall` + zod-reject retry path (v1.13.3) handles this — the model retries on the next turn with corrected args, then succeeds.
**Synth pass cost interaction:** when the first tool-call fails zod validation, the recursive runAssistantTurn fires *before* the successful synth-tool call lands. The user effectively pays: (1) failed tool-call turn → (2) error tool-result → (3) retry tool-call turn → (4) successful tool-result → (5) synth pass.
Per-fire cost for an overview question is now 5 steps, 3 of which are inference calls (steps 1, 3, and 5; step 5 is the synth pass, which adds ~5k tokens of auto-fetched context). Not a blocker: the synth content is dramatically better than the without-synth case (4920 tokens of cited analysis vs. a 70-token tool-call-only turn). Worth tracking if usage stats start showing it.
### v1.14 outer-loop port — preserve this pattern
Two patterns from this batch the Phase C outer-loop port must preserve:
1. **Chat-scoped abort cleanup**: the synth message is a sibling to the parent assistant message, both belong to the same chat. The new outer loop should either (a) keep `markSynthFailed` (or its equivalent) firing on every catch path including user-abort, or (b) extend `handleAbortOrError` to sweep all chat-scoped streaming rows. This batch chose (a); (b) was rejected as wider blast radius.
2. **Race-safe `Promise.all` capture**: `synthEntries: Array<...>` instead of a single shared variable. Per-callback push avoids the last-write-wins race when a batch has multiple synth tools.
## Test plan
5-prompt smoke + 1 failure-injection. Sequence:
1. **Default agent** — "What's in this codebase?" → expect `get_codebase_overview` + synthesis pass, response cites BOOCHAT.md + actual files + roadmap state.
2. **Architect agent** — "Give me a system overview of how BooCode handles tool calls" → expect synthesis with refs to inference/turn.ts, tool-phase.ts, stream-phase.ts.
3. **Architect agent** — "What's the current state of v1.13?" → synthesis must read `boocode_roadmap.md` and report shipped vs planned correctly. Must NOT infer "v1.13.2 shipped" from code presence — roadmap explicitly defers it.
4. **Code Reviewer** — "Find all callers of buildSystemPrompt" → `search_symbols` fires, NO synthesis pass (not in SYNTHESIS_TOOLS).
5. **Debugger** — "Where is detectDoomLoop defined and called from?" → `search_symbols` + `get_dependencies`, NO synthesis pass.
6. **Failure injection** — temporarily make `streamCompletion` throw inside `runSynthesisPass`; verify fall-through to recursion + log entry visible + non-empty answer.
## Backups in place
```
apps/server/src/schema.sql.bak-v1.13.13-20260522
apps/server/src/services/inference/parts.ts.bak-v1.13.13-20260522
apps/server/src/services/inference/tool-phase.ts.bak-v1.13.13-20260522
```
To be deleted after merge.
## Smoke results
### Smoke #1 — default agent, "What is in this codebase?"
Synthesis fired on `get_codebase_overview`. Log line:
```
{"chatId":"7bb05e54-…","synthMessageId":"44480541-…","toolName":"get_codebase_overview","chars":6727,"files":5,"msg":"synthesis pass complete"}
```
Token accounting: synth turn = 4920 tokens (vs. 63 + 70 on the preceding tool-call-only turns). Model is using the auto-fetched context, not parroting codecontext output. Synth message has the expected `kind='synthesis'` part dual-write.
Side note: qwen3.6 needed one retry due to the `include_stats: "True"` quirk (see Decisions). `repairToolCall` handled it; synth fired on the successful call.
### Smoke #6 — fault injection
Env-gated throw inserted between the synth-message INSERT and the `streamCompletion` call. Container rebuilt with `V1_13_13_FAULT_INJECT=1`. Sent the same prompt to a new smoke chat.
All 6 expected outcomes confirmed:
| # | Outcome | Evidence |
|---|---|---|
| 1 | `runSynthesisPass` throws | log: `err: "Error: v1.13.13 smoke #6 fault injection"` |
| 2 | Synth message marked `status='failed'` with empty content | msg `7ac9c685-…` role=assistant status=failed content_len=0 |
| 3 | `message_complete` frame published for the synth message | implicit via `markSynthFailed`; frontend never tripped the 60s timer |
| 4 | Fall-through to recursive `runAssistantTurn` | log: `synthesis pass failed; falling through to recursive turn` |
| 5 | User sees normal (non-synthesized) assistant response | final msg `924076a3-…` 453 tokens: `"This is **boocode** — a self-hosted, single-user developer chat app."` |
| 6 | Stale-stream banner does NOT fire on failed synth | confirmed — terminal `status='failed'` is what `applyFrame` writes |
Fault injection reverted post-test:
- `grep FAULT_INJECT apps/server/src/services/synthesisPipeline.ts docker-compose.yml` → empty
- `grep FAULT_INJECT apps/server/dist/services/synthesisPipeline.js` → empty
- `docker compose exec boocode printenv V1_13_13_FAULT_INJECT` → exit 1 (unset)
- Boot log clean, `skills loaded: 14`
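For reference, an env-gated throw of the kind used in Smoke #6 might look like the sketch below. The env var name and error text come from the log above; `maybeInjectFault` and `runSynthOrFallThrough` are hypothetical names, and the real pass also marks the synth message failed before falling through.

```ts
type Env = Record<string, string | undefined>;

// Throws only when the smoke flag is set; a no-op in normal runs.
function maybeInjectFault(env: Env): void {
  if (env["V1_13_13_FAULT_INJECT"] === "1") {
    throw new Error("v1.13.13 smoke #6 fault injection");
  }
}

// Simplified control flow: any synth failure falls through to the recursive turn.
function runSynthOrFallThrough(
  env: Env,
  synth: () => string,
  recurse: () => string,
): string {
  try {
    maybeInjectFault(env); // sits between the message INSERT and the model call
    return synth();
  } catch {
    return recurse();
  }
}
```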
### Smokes #2–#5
Sam is doing the qualitative reads from the UI in parallel — those verifications are about synthesis content quality (cites correct files, reads roadmap accurately, no-synthesis on `search_symbols`).
## Done when
- ✅ `synthesisPrompt.ts` + `synthesisPipeline.ts` created
- ✅ `parts.ts` PartKind union extended
- ✅ `tool-phase.ts` insertion point edited
- ✅ Schema migration block added (deviation from dispatch acknowledged)
- ✅ Type-clean (`pnpm -C apps/server build`)
- ✅ Container rebuilt + migration confirmed via pg_constraint and logs
- ✅ Smoke #1 (positive synth path) verified
- ✅ Smoke #6 (fault injection + fall-through) verified, injection reverted
- ⏳ Smokes #2–#5 (Sam's UI reads)
- ⏳ Sam commit

@@ -1,185 +0,0 @@
# v1.13.17-cross-repo-reads — on-demand read access to another repo (draft, 2026-05-22)
BooChat sessions are scoped to one project root. When the agent needs context from another repo (e.g. `/opt/forks/codecontext` to investigate a dependency), `pathGuard` rejects every read tool and the agent has no recovery path.
This batch adds a reactive `ask_user_input`-style flow that the agent triggers on `PathScopeError`. User approves once per session per project root; subsequent reads under that root succeed without further prompting.
## Trigger flow
1. Model emits `view_file("/opt/forks/codecontext/go.mod")` while session is scoped to `/opt/boocode`.
2. `pathGuard` throws `PathScopeError`. Existing tool wrapper catches it and returns the error to the model. **The error message now ends with a hint:** `"Use request_read_access(path, reason) to ask the user for permission."`
3. Model self-issues `request_read_access("/opt/forks/codecontext/go.mod", "investigating codecontext fork to write design doc")` on the next turn.
4. The new tool emits a pending tool-call frame (same pause mechanism as `ask_user_input`); inference loop pauses.
5. Frontend renders approve/deny chips with the path + reason.
6. User picks Allow → append the grant root to `session.allowed_read_paths`, resume inference, tool returns `"granted: /opt/forks/codecontext"`. Model retries the original `view_file` on the next turn.
7. User picks Deny → tool returns `"denied"` without mutating session state; model decides what to do next.
## Decisions (draft — override in dispatch if different)
### D1. Grant unit = nearest registered project root, then nearest path-whitelist ancestor, then refuse
When user approves access to `/opt/forks/codecontext/go.mod`:
- If a row in `projects.path` is an ancestor of the requested path → grant the project's root path.
- Else if `PROJECT_ROOT_WHITELIST` env (default `/opt`) is an ancestor and the nearest ancestor directory of the requested path below the whitelist looks like a repo root (`.git/`, `package.json`, `go.mod`, or `Cargo.toml` present) → grant that directory (e.g. `/opt/forks/codecontext`).
- Else → refuse without prompting. Tool returns `"denied: path outside permitted scope"`. No user prompt fires.
Why: granting the literal path is too narrow (next file in the same repo re-prompts). Granting an arbitrary parent dir over-scopes. The nearest repo-shaped directory is the natural unit.
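A sketch of one reading of D1: registered project root first, then the nearest repo-shaped ancestor under the whitelist, else refuse. `deriveGrantRoot` is a hypothetical name, and the repo-shape check is injected as `looksLikeRepo` so the sketch stays free of filesystem calls. Paths are compared lexically; the real code would presumably work on resolved paths.

```ts
import { dirname, resolve, sep } from "node:path";

const isInside = (root: string, p: string): boolean =>
  p === root || p.startsWith(root + sep);

function deriveGrantRoot(
  requested: string,
  projectRoots: string[], // values of projects.path
  whitelist: string, // PROJECT_ROOT_WHITELIST, default /opt
  looksLikeRepo: (dir: string) => boolean, // .git/, package.json, go.mod, Cargo.toml
): string | null {
  const target = resolve(requested);

  // 1. A registered project root that contains the path; deepest match wins.
  const project = projectRoots
    .map((r) => resolve(r))
    .filter((r) => isInside(r, target))
    .sort((a, b) => b.length - a.length)[0];
  if (project) return project;

  // 2. Walk up from the path, stopping before the whitelist dir itself.
  const wl = resolve(whitelist);
  if (!isInside(wl, target)) return null;
  for (let dir = dirname(target); dir !== wl && isInside(wl, dir); dir = dirname(dir)) {
    if (looksLikeRepo(dir)) return dir;
  }

  // 3. Nothing repo-shaped: refuse without prompting.
  return null;
}
```

Returning `null` maps to the `"denied: path outside permitted scope"` result, so the user is never prompted for paths like `/etc/passwd`.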
### D2. Persistence = per-session, no expiry
`sessions.allowed_read_paths` is the source of truth. Grants stick until the session is archived. A new session in the same project re-prompts on the first cross-repo read.
Why: per-chat is too granular for the typical workflow (Sam investigates the same fork across multiple chats in one investigation session). Per-project is too broad (different sessions in the same project might have different scope needs). Per-session is the natural unit and matches `session.web_search_enabled`'s scope.
### D3. Secret-file deny list applies across all grant roots
`is_secret_path` in `secret_guard.ts` filters filenames (`.env`, `*.pem`, `credentials.json`, etc.) regardless of which root they're under. The check is post-`pathGuard`, so it already runs on the resolved path. No change needed.
### D4. Revocation UI = chat-settings panel + automatic clear on archive
- Settings panel under the session-info popover: lists current `allowed_read_paths` with a per-row delete button.
- Session archive deletes the row (no need to clear allowed_read_paths separately — the row goes).
- No expiry timer.
Optional v1.13.18 follow-up if Sam wants it: a `/clear_grants` slash command for power users. Out of scope for v1.13.17.
## Schema
```sql
-- v1.13.17: session-scoped cross-repo read grants. Populated via the
-- request_read_access tool's approve path; never written by other code.
ALTER TABLE sessions
ADD COLUMN IF NOT EXISTS allowed_read_paths text[] NOT NULL DEFAULT ARRAY[]::text[];
```
No CHECK constraint — values are absolute paths validated at write time against the projects table + whitelist heuristic.
## New tool: `request_read_access`
```ts
// apps/server/src/services/request_read_access.ts (new)
export const requestReadAccessInput = z.object({
path: z.string().min(1),
reason: z.string().min(1).max(500),
});
export const requestReadAccess: ToolDef<...> = {
name: 'request_read_access',
description:
'Ask the user for read-only access to a path outside the current ' +
'session\'s project scope. Use when pathGuard rejected a read ' +
'attempt and the path is plausibly under another known repo. ' +
'Returns "granted: <root>" or "denied".',
inputSchema: requestReadAccessInput,
jsonSchema: { ... },
category: 'read_only',
async execute(input, projectRoot) {
// Validate path: must be absolute, must be under PROJECT_ROOT_WHITELIST
// (default /opt), must NOT already be under the session's primary
// projectRoot (silly to ask for what's already in scope).
// Validation failures return sentinel without prompting the user.
// Emit pending-grant tool result (parallel of ask_user_input's pause
// sentinel). Inference loop pauses on this kind=pending_grant marker.
// User picks Allow/Deny via a new POST /api/messages/:id/grant endpoint.
// On Allow: derive grant root per D1 + UPDATE sessions SET
// allowed_read_paths = array_append(allowed_read_paths, <root>);
// resume inference; tool returns "granted: <root>".
// On Deny: resume immediately; tool returns "denied".
},
};
```
Registered in `ALL_TOOLS` + `READ_ONLY_TOOL_NAMES`. Available to all agents by default (no agent's `tools` whitelist needs to be updated to grant access — the tool registry's filter is per-agent).
## `pathGuard` extension
```ts
// apps/server/src/services/path_guard.ts — current signature:
// pathGuard(projectRoot, requestedPath): Promise<string>
//
// Extended:
// pathGuard(projectRoot, requestedPath, extraRoots?: string[]): Promise<string>
//
// Tries primary projectRoot first; on PathScopeError, walks extraRoots and
// returns the first one that resolves the requestedPath inside its tree.
// Throws PathScopeError if no root accepts.
```
Every tool that calls `pathGuard` (currently `view_file`, `list_dir`, `grep`, `find_files`, `view_truncated_output`) threads `session.allowed_read_paths` through `executeToolCall`. The `Session` interface already flows through `TurnArgs`; tool-phase just needs to forward `session.allowed_read_paths` as the third arg.
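A lexical-only sketch of the extended guard, assuming no symlink handling. The real `pathGuard` is async and returns a `Promise<string>`; `pathGuardSketch` is a hypothetical synchronous stand-in, and the error wording other than the `request_read_access` hint is invented for illustration.

```ts
import { resolve, sep } from "node:path";

class PathScopeError extends Error {
  constructor(requested: string) {
    super(
      `path ${requested} is outside the session scope. ` +
        "Use request_read_access(path, reason) to ask the user for permission.",
    );
    this.name = "PathScopeError";
  }
}

// Try the primary project root first, then each granted extra root in order.
function pathGuardSketch(
  projectRoot: string,
  requestedPath: string,
  extraRoots: string[] = [],
): string {
  for (const root of [projectRoot, ...extraRoots]) {
    const base = resolve(root);
    const candidate = resolve(base, requestedPath); // absolute paths ignore base
    if (candidate === base || candidate.startsWith(base + sep)) {
      return candidate;
    }
  }
  throw new PathScopeError(requestedPath);
}
```

A tool would call it as `pathGuardSketch(projectRoot, input.path, session.allowed_read_paths)`, which leaves existing two-argument call sites unchanged.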
## Pause/resume infrastructure reuse
The pending-grant pause uses the **same mechanism as `ask_user_input`**:
- Tool insert with `payload.output = null` + `payload.kind = 'pending_grant'`.
- `pausingForUserInput` branch in `tool-phase.ts` is widened to also catch pending grants.
- `chat_status` flips to `waiting_for_input` per the v1.12.1 5-state model.
New endpoint `POST /api/messages/:tool_msg_id/grant` (parallel of the existing `/answer`):
- Body: `{ decision: 'allow' | 'deny' }`.
- Resolves grant root per D1 if Allow. UPDATEs `sessions.allowed_read_paths`. UPDATEs tool message with output. Resumes inference via existing enqueue path.
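The decision handling can be sketched as two pure steps: validate the body, then compute the new session state and tool output. `parseGrantBody` and `applyGrant` are hypothetical names; skipping a duplicate root is an assumption (a plain `array_append` would not dedupe), and the DB writes and inference resume are left out.

```ts
type GrantDecision = "allow" | "deny";

interface SessionGrants {
  allowed_read_paths: string[];
}

// Accepts only { decision: 'allow' | 'deny' }; anything else is rejected.
function parseGrantBody(body: unknown): GrantDecision | null {
  if (typeof body !== "object" || body === null) return null;
  const decision = (body as { decision?: unknown }).decision;
  return decision === "allow" || decision === "deny" ? decision : null;
}

// Returns a new session object; the input is never mutated.
function applyGrant(
  session: SessionGrants,
  decision: GrantDecision,
  grantRoot: string, // already derived per D1
): { session: SessionGrants; toolOutput: string } {
  if (decision === "deny") {
    return { session, toolOutput: "denied" };
  }
  const paths = session.allowed_read_paths.includes(grantRoot)
    ? session.allowed_read_paths
    : [...session.allowed_read_paths, grantRoot];
  return {
    session: { ...session, allowed_read_paths: paths },
    toolOutput: `granted: ${grantRoot}`,
  };
}
```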
## Frontend changes (in scope; small)
- `MessageBubble.tsx`: render `pending_grant` tool messages with Allow/Deny chips + the path + reason text. Wires to `api.messages.grant(toolMsgId, decision)`.
- New API client method `api.messages.grant`.
- Settings popover: `allowed_read_paths` list with per-row delete (calls `PATCH /api/sessions/:id` with the modified array).
## Hard rules
- No git commit, no git push, no git pull during dispatch. Sam commits manually.
- Backup every file before edit per the standard convention.
- TS strict, no `any`.
- No new deps.
- Schema migration is **additive only** (ADD COLUMN IF NOT EXISTS), idempotent on re-run.
- Tool is **read-only** — no path under `allowed_read_paths` can ever be written by BooChat (no write tools registered today; this is a structural guarantee).
- Secret-file deny list still runs unconditionally on resolved paths.
## Stop checkpoints
1. After recon (read existing path_guard + ask_user_input + answer endpoint patterns): stop, hand back the recon report.
2. After code edits, before schema migration applies: stop, hand back the diff.
3. After schema migration applies in dev: stop, run smoke plan, report.
## Smoke plan
1. **Approve flow.** Send a chat in a `/opt/boocode` session asking the agent to investigate `/opt/forks/codecontext/go.mod`. Confirm:
- `pathGuard` throws on the first attempt; tool result includes the `request_read_access` hint.
- Agent calls `request_read_access`; tool-call frame lands; chat status flips to `waiting_for_input`.
- Frontend renders Allow/Deny chips with the path + reason.
- Pick Allow → grant root resolves to `/opt/forks/codecontext` (per D1); `sessions.allowed_read_paths` shows the entry; agent retries `view_file` successfully on the next turn.
2. **Deny flow.** Same setup; pick Deny. Confirm session state unchanged, tool returns `"denied"`, agent gives up or asks differently.
3. **Persistence.** In the same session, a second `view_file` against a different file under `/opt/forks/codecontext/` succeeds without re-prompting.
4. **Cross-session isolation.** Open a fresh session in the boocode project, try the same path — re-prompts (allowed_read_paths is empty on the new session).
5. **Secret-file deny still fires.** Approve access to a repo that contains a `.env` file. Try `view_file('/opt/forks/some-repo/.env')`. Confirm refused via `is_secret_path`, not via pathGuard scope.
6. **Out-of-scope refusal.** Try `request_read_access('/etc/passwd', 'system file')`. Tool validates against the whitelist + repo-shape heuristic, returns `"denied: path outside permitted scope"` without prompting the user.
## Done when
- New `request_read_access` tool + `POST /api/messages/:id/grant` endpoint shipped.
- `path_guard.ts` extended; all read tools forward `allowed_read_paths`.
- `MessageBubble.tsx` renders pending-grant bubbles; settings popover lists + clears grants.
- Schema migration applied (sessions.allowed_read_paths).
- Smoke plan green.
- v1.13.17-cross-repo-reads tag + CHANGELOG entry + roadmap retrospective bullet.
## Files expected to touch
- `apps/server/src/schema.sql` — new column
- `apps/server/src/services/request_read_access.ts` — NEW
- `apps/server/src/services/path_guard.ts` — extra-roots param + helpful PathScopeError message
- `apps/server/src/services/tools.ts` — register the new tool, update view_file / list_dir / grep / find_files / view_truncated_output to thread allowed_read_paths
- `apps/server/src/services/inference/tool-phase.ts` — pause-on-pending-grant branch (alongside ask_user_input)
- `apps/server/src/routes/messages.ts` — new `/grant` endpoint
- `apps/server/src/types/api.ts`: `Session.allowed_read_paths`
- `apps/web/src/api/client.ts`: `api.messages.grant`
- `apps/web/src/api/types.ts`: `Session.allowed_read_paths`
- `apps/web/src/components/MessageBubble.tsx` — render pending_grant chips
- `apps/web/src/components/` — settings-popover grants list (file TBD during impl)
Estimate: ~120 LoC across backend + frontend + schema. Single batch.
## Open questions for dispatch
The four design decisions above are my recommendations. Override any of them in the dispatch and I'll update the proposal before recon. Most likely-overridable: **D1** (grant unit — you may want exact-path-only for tighter scoping, accepting the re-prompt cost) and **D4** (revocation UI — you may want it deferred entirely).

@@ -1,46 +0,0 @@
# v1.13.18 — design notes
## Resolver contract
`resolveProjectPath(projectRoot: string, rawPath: string): Promise<string>`
1. **Trim check**: `rawPath.trim() === ''` throws `INVALID_FILE_PATH`. This is defensive code; the Zod `.trim().min(1)` in required-`file_path` wrappers catches empty paths before the shim. For optional-`file_path` wrappers, the caller guard `file_path.trim() !== ''` prevents `resolveProjectPath` from being reached at all when the string is empty or whitespace-only.
2. **Absolute branch**: `isAbsolute(rawPath)` uses the candidate as-is; otherwise `resolve(projectRoot, rawPath)` anchors it.
3. **realpath with ENOENT fallthrough**: `realpath(candidate)` resolves symlinks and normalises the path. On `ENOENT` (file doesn't exist), the un-realpathed absolute is used as the forwarded value. Any other error (EACCES, EBADF, etc.) re-throws immediately.
4. **Escape check**: `resolved !== projectRoot && !resolved.startsWith(projectRoot + sep)`. Uses `path.sep` not a string literal `'/'` so the check is platform-safe (Windows posture, forward compatibility).
5. **Return**: the resolved absolute path, which replaces `req.args['file_path']` in `argsToSend`.
The guard in `callCodecontext` only invokes `resolveProjectPath` when `typeof req.args['file_path'] === 'string' && req.args['file_path'].trim() !== ''`. Wrappers that don't include `file_path` in their args object are unaffected.
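The contract above can be sketched as follows. This is a synchronous stand-in (`realpathSync` swapped in for the async `realpath`), named `resolveProjectPathSketch` to make clear it is not the shipped helper. It assumes `projectRoot` is already an absolute, realpath-resolved directory, and it passes the absolute branch through `resolve()` to normalise dot-segments, as the B8 review item in the tasks file describes.

```ts
import { realpathSync } from "node:fs";
import { isAbsolute, resolve, sep } from "node:path";

function resolveProjectPathSketch(projectRoot: string, rawPath: string): string {
  // 1. Trim check (defensive; schemas and the caller guard normally catch this).
  if (rawPath.trim() === "") {
    throw new Error("INVALID_FILE_PATH");
  }

  // 2. Absolute paths are normalised; relative paths anchor to the project root.
  const candidate = isAbsolute(rawPath)
    ? resolve(rawPath)
    : resolve(projectRoot, rawPath);

  // 3. Resolve symlinks; a missing file falls through with the plain absolute path.
  let resolved: string;
  try {
    resolved = realpathSync(candidate);
  } catch (err) {
    if ((err as { code?: string }).code !== "ENOENT") throw err;
    resolved = candidate;
  }

  // 4. Escape check, using path.sep rather than a literal '/'.
  if (resolved !== projectRoot && !resolved.startsWith(projectRoot + sep)) {
    throw new Error(`file_path ${rawPath} escapes project root ${projectRoot}`);
  }

  // 5. The resolved absolute path is what gets forwarded.
  return resolved;
}
```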
## Error-shape parity rationale
The `target_dir` escape error message is: `target_dir <targetDir> escapes project root <resolvedProject>`.
The `file_path` escape error message is: `file_path <rawPath> escapes project root <projectRoot>`.
The template is byte-identical except for the field name prefix. This is intentional:
- The existing escape error regex `/escapes project root/` used in tests and potentially in log alerting applies to both error types without special-casing.
- A model receiving either error message can apply the same self-correction: the escape check is the same invariant (`path starts with project root + sep`), so the same remediation applies (use a path inside the project).
- Keeping the shapes uniform reduces cognitive overhead when reading logs that mix both error types.
## ENOENT fallthrough rationale
When a `file_path` doesn't exist on disk, `resolveProjectPath` forwards the un-realpathed absolute path to the sidecar. The sidecar responds with its own error: `"file not found: <path>"` (or `"File not found in graph: <path>"`).
The alternative — re-implementing the "file not found" check in the resolver — would:
1. Diverge from the sidecar's canonical error language, producing two different "not found" messages depending on whether the file existed at realpath time.
2. Conflict with future scenarios where the sidecar's graph is stale (file existed at index time but was deleted, or vice versa). The sidecar's error is always authoritative.
3. Add no user-visible value: the model can self-correct on either "file not found" message by checking the path.
The resolver's job is path safety (scope enforcement) and path normalisation (relative → absolute). Existence checking is the sidecar's job.
## `codecontext_tools.test.ts` impact
The existing `get_file_analysis forwards file_path` test in `codecontext_tools.test.ts` passes `'apps/server/src/index.ts'` as a relative `file_path` and asserts it reaches the wire unchanged. After this fix the path is resolved to `join(projectDir, 'apps/server/src/index.ts')`. The test now fails.
This test file is outside this batch's allowed file list. Sam should update the test assertion to expect the resolved absolute path, or create the file in the test tmpdir and assert the full resolved path. The fix is a one-liner: change `file_path: 'apps/server/src/index.ts'` to `file_path: join(projectDir, 'apps/server/src/index.ts')` in the `expect(body).toMatchObject(...)` call, and create the file before the call (so realpath succeeds).

@@ -1,36 +0,0 @@
# v1.13.18 — codecontext file_path resolver
Fixes a silent failure that caused all four `file_path`-taking codecontext wrappers to return "file not found" whenever the model passed a relative path.
## Why
BooCode's codecontext sidecar (`codecontext_client.ts`) already realpath-resolves `target_dir` before forwarding it to the HTTP shim. It did not do the same for `file_path`. The sidecar's internal file index is keyed on absolute paths, so any relative path from the model produced a JSON error response:
```
{"error":"file not found: apps/server/src/services/inference/turn.ts","result":null}
```
This was observed repeatedly in the 2026-05-22 docker logs (17:56 UTC window) — the model passed relative paths on every `get_file_analysis` tool call and received no useful output, burning tool budget on dead calls.
## Scope
Four wrappers take a `file_path` argument:
- `tools/codecontext/get_file_analysis.ts`: `file_path` required
- `tools/codecontext/get_symbol_info.ts`: `file_path` optional
- `tools/codecontext/get_dependencies.ts`: `file_path` optional
- `tools/codecontext/get_semantic_neighborhoods.ts`: `file_path` optional
Fix lands in one place: `callCodecontext` in `codecontext_client.ts`. A new `resolveProjectPath` helper is inserted at the args-spread site and invoked whenever `file_path` is present and non-empty. All four wrappers benefit automatically; no per-wrapper edits required.
Zod `.trim()` is added to all four `file_path` schema entries so that whitespace-padded paths from the model are cleaned before they reach the resolver.
## Decision: single resolver over per-wrapper edits
Four wrappers, one shared code path. Per-wrapper edits would require four edits and make it easy to miss one. The `callCodecontext` shim already owns `target_dir` validation; `file_path` validation belongs there too for symmetry.
## Non-goals
- No changes to the `target_dir` resolver — it already works correctly.
- No extension to wrappers that do not take `file_path` (`get_codebase_overview`, `get_framework_analysis`, `search_symbols`, `watch_changes`).
- No fix for the unrelated RPC errors and Go map-race warnings visible in the codecontext sidecar logs — those are upstream bugs.

@@ -1,57 +0,0 @@
# v1.13.18 tasks
## B1 — Backups
- [x] `apps/server/src/services/codecontext_client.ts.bak-v1.13.18-20260522`
- [x] `apps/server/src/services/tools/codecontext/get_file_analysis.ts.bak-v1.13.18-20260522`
- [x] `apps/server/src/services/tools/codecontext/get_symbol_info.ts.bak-v1.13.18-20260522`
- [x] `apps/server/src/services/tools/codecontext/get_dependencies.ts.bak-v1.13.18-20260522`
- [x] `apps/server/src/services/tools/codecontext/get_semantic_neighborhoods.ts.bak-v1.13.18-20260522`
## B2 — Resolver implementation in `codecontext_client.ts`
- [x] Import `isAbsolute`, `resolve`, `sep` from `node:path` (alongside existing `join`)
- [x] Add `resolveProjectPath(projectRoot, rawPath)` helper — trim check, isAbsolute branch, realpath with ENOENT fallthrough, escape check
- [x] Wire into `callCodecontext` at args-spread site — guard on `file_path.trim() !== ''`
- [x] Error-shape parity verified: `file_path <raw> escapes project root <root>` mirrors `target_dir <dir> escapes project root <root>`
## B3 — Zod `.trim()` on wrapper schemas
- [x] `get_file_analysis.ts`: `z.string().trim().min(1)`
- [x] `get_symbol_info.ts`: `z.string().trim().optional()`
- [x] `get_dependencies.ts`: `z.string().trim().optional()`
- [x] `get_semantic_neighborhoods.ts`: `z.string().trim().optional()`
## B4 — Tests
- [x] Added `describe('callCodecontext — file_path resolution', ...)` to `codecontext_client.test.ts`
- [x] Case 1: relative path resolves to absolute inside project root
- [x] Case 2: absolute path inside project root passes through
- [x] Case 3: relative escape (`../../etc/passwd`) rejected with `escapes project root`
- [x] Case 4: absolute path outside project root rejected
- [x] Case 5: nonexistent file (ENOENT) forwarded as un-realpath'd absolute
- [x] Case 6: empty string skipped by guard (treated as not provided)
- [x] Case 7: wrapper without `file_path` — resolver not invoked, no `file_path` in wire body
- [x] All 17 tests in `codecontext_client.test.ts` pass
## B5 — Typecheck + smoke
- [x] `npx tsc --noEmit -p apps/server` — 0 errors
- [x] Before-fix smoke (relative path): `{"error":"file not found: apps/server/src/services/inference/turn.ts","result":null}`
- [x] Before-fix smoke (absolute path): returns `Lines: 330 / Symbols: 48` as expected
## B6 — Test asserting old buggy behavior updated
- [x] `apps/server/src/services/__tests__/codecontext_tools.test.ts` — assertion at line 73 updated from `file_path: 'apps/server/src/index.ts'` to `file_path: join(projectDir, 'apps/server/src/index.ts')` to match the new resolved-absolute contract.
## B7 — OpenSpec docs
- [x] `openspec/changes/v1.13.18-codecontext-file-path/proposal.md`
- [x] `openspec/changes/v1.13.18-codecontext-file-path/tasks.md`
- [x] `openspec/changes/v1.13.18-codecontext-file-path/design.md`
## B8 — Review-pass defence-in-depth (P2 fixes from adversarial review)
- [x] `codecontext_client.ts:71` — absolute branch now goes through `resolve()` to normalise dot-segments. Closes the ENOENT-fallthrough escape gap where `<projectRoot>/../etc/x` would prefix-match `<projectRoot>/` literally.
- [x] `codecontext_client.test.ts` — added Case 8 (absolute path with `..` resolving outside root, ENOENT branch) and Case 9 (in-project symlink whose target sits outside root). 19 tests pass.
- [x] Updated `resolveProjectPath` docstring to reflect the new normalisation step.

@@ -1,126 +0,0 @@
# v1.13.20-drop-legacy-cols — drop messages.tool_calls + messages.tool_results
Final phase of the v1.13.0 strangler-fig migration. Removes the dual-write into `messages.tool_calls` / `messages.tool_results` JSON columns and drops the columns themselves. After this batch, `message_parts` is the only source of truth for tool-call and tool-result data.
Tag `v1.13` (umbrella) ships on the same commit per the original roadmap entry.
## Why
v1.13.0 (AI SDK v6 migration) introduced `message_parts` as the new canonical store for tool calls, tool results, reasoning, text, synthesis, and now html_artifact. To stay safe during the migration, every write site also dual-wrote to the legacy `messages.tool_calls` / `messages.tool_results` JSON columns, and `messages_with_parts` view COALESCEs over both. Reads have been migrated; dual-writes are pure overhead at this point.
Verification query (per the original v1.13.2 plan) returns `0 / 0` orphan rows. Today's DB is also empty (0 messages on the live instance), so the COUNT query alone is weakly informative — the safety check shifts to a code-level audit: every dual-write site listed in the v1.13.2 roadmap entry must be located and its parts-write half kept, JSON-column half removed.
## Scope
### S1. Remove dual-write from every site
Per the v1.13.2 roadmap entry, dual-writes live at:
- `services/inference/tool-phase.ts` — 3 sites
- `services/inference/error-handler.ts`: `finalizeCompletion`
- `routes/skills.ts` — 2 sites
- `routes/messages.ts` — answer flow
- `routes/chats.ts` — fork flow
Implementer must grep for every UPDATE / INSERT that touches `tool_calls` or `tool_results` columns and verify it has a paired `insertParts(...)` call. Keep the parts write, remove the column write. If a site only writes to the JSON column with no parts pair — STOP and escalate (would indicate a bug in the v1.13.0 dual-write rollout we haven't caught).
### S2. Simplify `messages_with_parts` view
Current view COALESCEs parts-table rows over legacy JSON columns to support pre-v1.13.0 history. After this batch, the JSON columns no longer exist — drop the COALESCE fallbacks. The view should read only from `message_parts` joined to `messages`.
### S3. Drop the columns
```sql
ALTER TABLE messages DROP COLUMN IF EXISTS tool_calls;
ALTER TABLE messages DROP COLUMN IF EXISTS tool_results;
```
Idempotent via `IF EXISTS`. Apply unconditionally on startup (matches the rest of `schema.sql`'s shape).
### S4. Remove from API types
`Message` interface in `apps/server/src/types/api.ts` AND `apps/web/src/api/types.ts` — drop `tool_calls?` and `tool_results?` fields. The API boundary is unchanged because every consumer already reads parts-derived values through `messages_with_parts`. Mirror byte-for-byte.
### S5. Drop the stale `messages_status_check` cleanup DO block from v1.12.1 if still present
Per the v1.13.2 roadmap entry, there's a v1.12.1 `DO $$ DROP CONSTRAINT messages_status_check` block that was meant to clean up the old anonymous constraint. If still present in `schema.sql`, remove — it's been one-shot effective.
### S6. Update test fixtures
`inference.test.ts` and `compaction.test.ts` (and any other test file the grep finds) construct Message-shaped fixtures with `tool_calls: null, tool_results: null` literals. Rewrite ~30 fixtures to construct via `message_parts` rows where the test actually exercises tool calls. For tests that don't exercise tool calls at all, just drop the now-absent fields.
`partsFromAssistantMessage` and `partsFromToolMessage` helpers in `parts.ts` currently take `tool_calls` and `tool_results` as args (because that's what the legacy Message shape carried). Keep their input shapes — they're useful constructors. The change is at the call sites, not the helpers.
## Non-goals
- **No changes to `message_parts` schema.** It's correct as-is.
- **No changes to the `messages_with_parts` view name or interface.** Just the implementation simplifies.
- **No removal of `partsFromAssistantMessage` / `partsFromToolMessage`.** They're useful as constructors; their job becomes producing parts from raw ToolCall/ToolResult objects, not from a legacy Message row.
- **No frontend changes beyond the type mirror.** Web reads parts via `messages_with_parts` already.
- **No reads from the legacy columns in any code path.** Verify with grep.
## Hard rules
- No git commits during dispatch. Sam commits manually (handled by controller after all dispatches done).
- Backups: every modified file → `.bak-v1.13.20-20260523`.
- TS strict, no `any`.
- No new deps.
- Schema migration: additive-or-destructive but idempotent (`IF EXISTS` on the column drops).
- Run the full server test suite after — must be green.
- Frontend: `tsc -p apps/web/tsconfig.app.json --noEmit` + `pnpm -C apps/web build` clean.
## Stop checkpoints
1. **After recon** (grep-driven inventory of dual-write call sites + read sites still touching the legacy columns): stop, hand back inventory. The roadmap listed 7+ sites; verify nothing's been missed.
2. **After code edits, before schema migration**: stop, hand back diff + test results. Confirm the parts write at every former dual-write site still happens.
3. **After schema migration applies in dev**: stop, run tests, run a fresh `applySchema()` cycle (boot twice), confirm idempotent.
## Smoke plan
1. **Fresh boot.** Restart the boocode container, confirm `applySchema()` completes without error.
2. **Idempotent boot.** Restart again, confirm no error on the second pass (column DROP IF EXISTS is a no-op).
3. **Send a chat that triggers a tool call.** Confirm:
- Assistant message lands with content + reasoning + tool_call parts (all in `message_parts`).
- Tool result lands as a `tool_result` part.
- `messages_with_parts` returns the same shape the frontend expects (verify by reading the live chat in the UI).
4. **DB inspection.** `\d messages` — confirm `tool_calls` and `tool_results` columns are gone.
5. **Compaction roundtrip.** Trigger a compaction-eligible turn (long context); confirm the rolling summary still anchors correctly and uses parts as input.
## Done when
- All dual-write sites converted to parts-only writes.
- View simplified, columns dropped, types updated.
- Test suite green.
- Frontend typecheck + build clean.
- Smoke green.
- Tagged `v1.13.20-drop-legacy-cols` AND the umbrella `v1.13` on the same commit.
- CHANGELOG.md entry + roadmap retrospective bullet.
## Files expected to touch
**Backend:**
- `apps/server/src/schema.sql` — DROP columns + simplify view + remove v1.12.1 cleanup block
- `apps/server/src/services/inference/tool-phase.ts` — remove 3 dual-write sites
- `apps/server/src/services/inference/error-handler.ts` — remove dual-write in `finalizeCompletion`
- `apps/server/src/routes/skills.ts` — remove 2 dual-write sites
- `apps/server/src/routes/messages.ts` — remove dual-write in answer flow
- `apps/server/src/routes/chats.ts` — remove dual-write in fork
- `apps/server/src/types/api.ts` — drop `tool_calls?` / `tool_results?` from Message
- `apps/server/src/services/__tests__/inference.test.ts` — fixture rewrites
- `apps/server/src/services/__tests__/compaction.test.ts` — fixture rewrites
- `apps/server/src/services/__tests__/parts.test.ts` — likely some fixture updates
- `apps/server/src/services/__tests__/tool_cost_stats.test.ts` — likely some fixture updates
- `apps/server/src/services/__tests__/system-prompt.test.ts` — likely some fixture updates
**Frontend:**
- `apps/web/src/api/types.ts` — mirror Message change
**Docs:**
- `BOOCHAT.md` — no change expected (rules don't mention the legacy columns)
- `boocode_roadmap.md` — retrospective bullet
- `CHANGELOG.md` — new section
- `CLAUDE.md` — drop the v1.13.0 dual-write notes that no longer apply (audit the surrounding paragraphs)
## Estimate
~150 LoC net (mostly deletions). Mechanical work — same per-batch shape as v1.13.18.

@@ -1,104 +0,0 @@
# v1.13.20-drop-legacy-cols tasks
## B1 — Recon (STOP after this step)
- [ ] Grep `apps/server/src/**/*.ts` for every `tool_calls` and `tool_results` mention. Categorize each hit as:
- **dual-write** (an UPDATE / INSERT that writes the JSON column)
- **read** (a SELECT that reads the JSON column, or code that destructures it from a row)
- **type-only** (interface / type field reference)
- **test fixture** (literal in a test file)
- **comment / docs**
- [ ] Confirm the v1.13.2 roadmap inventory is complete:
- tool-phase.ts: 3 sites
- error-handler.ts (`finalizeCompletion`): 1 site
- routes/skills.ts: 2 sites
- routes/messages.ts (answer flow): 1 site
- routes/chats.ts (fork): 1 site
- Any extras the grep finds: list them
- [ ] Confirm no READ sites still touching the legacy columns (everything should go through `messages_with_parts`). If reads remain, flag them — they need to migrate to the view BEFORE dropping the columns.
- [ ] Hand back inventory as a per-file table: file, line, kind (dual-write / read / type / fixture), action (delete / migrate-to-view / type-prune).
## B2 — Backups
- [ ] `cp <file> <file>.bak-v1.13.20-20260523` for every file in B1's action list before editing.
## B3 — Remove dual-writes
- [ ] Remove the JSON-column UPDATE / INSERT at every site identified in B1 as a dual-write. Keep the paired `insertParts(...)` call.
- [ ] If a site only writes the JSON column with no parts pair (would indicate a bug from v1.13.0) — STOP, report as BLOCKED.
- [ ] Verify by grep: zero remaining writes to `tool_calls` or `tool_results` outside of `schema.sql` and test fixtures.
## B4 — Simplify `messages_with_parts` view
- [ ] Open `schema.sql`. Find the view definition.
- [ ] Drop the COALESCE fallbacks that read `m.tool_calls` / `m.tool_results` from `messages`.
- [ ] View now reads only from `message_parts` joined to `messages`.
- [ ] Confirm view's output column shapes are unchanged: `tool_calls jsonb[]`, `tool_results jsonb` single object, `reasoning_parts jsonb[]`.
## B5 — Drop columns
- [ ] `ALTER TABLE messages DROP COLUMN IF EXISTS tool_calls;`
- [ ] `ALTER TABLE messages DROP COLUMN IF EXISTS tool_results;`
- [ ] Idempotent on re-run.
- [ ] Apply order in `schema.sql`: the column drops go AFTER the view is redefined. The old view references both columns, and Postgres will not drop a column that a view depends on.
- [ ] Verify the order in the file: if the view still references the columns at the point of the ALTER, either drop the view first or redefine it before the ALTER.
## B6 — Remove v1.12.1 cleanup block
- [ ] Find the `DO $$ DROP CONSTRAINT messages_status_check` block in `schema.sql` (likely near the messages CHECK constraints).
- [ ] Confirm it's safe to remove (the constraint should have been dropped long ago).
- [ ] Delete the block.
## B7 — Type pruning
- [ ] `apps/server/src/types/api.ts` — remove `tool_calls?` and `tool_results?` from the `Message` interface.
- [ ] `apps/web/src/api/types.ts` — mirror byte-for-byte.
- [ ] Search for any other type references — `ToolCallsField`, `ToolResultsField`, etc.
## B8 — Test fixture updates
- [ ] Run `pnpm -C apps/server test` to see what breaks.
- [ ] For each failing test that constructs a `Message` literal with `tool_calls: null` / `tool_results: null` — remove those fields.
- [ ] For tests that exercised tool-call behavior via the legacy columns, rewrite to construct via `message_parts` rows.
- [ ] Confirm: `pnpm -C apps/server test` — all green.
## B9 — Type / build verification
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors.
- [ ] `npx tsc -p apps/web/tsconfig.app.json --noEmit` — 0 errors.
- [ ] `pnpm -C apps/web build` — green.
## B10 — STOP checkpoint, hand back diff
- [ ] Hand controller the diff for backend changes + test results.
## B11 — Schema deploy
- [ ] `docker compose up --build -d` rebuilds with new schema.
- [ ] Boot twice in sequence — confirm idempotent (column DROP IF EXISTS is a no-op on the second boot).
- [ ] `docker exec boocode_db psql -U boocode -d boocode -c "\d messages"` — confirm columns absent.
- [ ] `docker logs boocode 2>&1 | tail -50` — confirm no schema errors.
## B12 — Smoke
- [ ] Live-smoke: send a chat that triggers at least one tool call. Confirm:
- [ ] Assistant message renders with content + tool_call ActionRow.
- [ ] Tool result renders.
- [ ] No console errors in browser or `docker logs boocode`.
- [ ] Trigger a compaction-eligible turn (long context). Confirm rolling summary anchors correctly.
## B13 — Docs
- [ ] `CHANGELOG.md` entry for v1.13.20-drop-legacy-cols.
- [ ] `boocode_roadmap.md` retrospective bullet on the v1.13.2 section (note the slug rename and ship date).
- [ ] `CLAUDE.md` — drop the v1.13.0 dual-write notes that no longer apply. Audit the surrounding paragraphs.
## B14 — Tag + push + rebuild
- [ ] `git add` only the v1.13.20 batch files (per CLAUDE.md convention).
- [ ] `git commit` with HEREDOC commit message.
- [ ] `git tag v1.13.20-drop-legacy-cols` AND `git tag v1.13` (umbrella, per original v1.13.2 plan).
- [ ] Push: `GIT_SSH_COMMAND="ssh -i /opt/boocode/secrets/boocode_gitea -o IdentitiesOnly=yes" git push origin main`
- [ ] Push both tags.
- [ ] `docker compose up --build -d`.
- [ ] Curl health check.

@@ -1,72 +0,0 @@
# v1.14.0-outer-loop — design decisions
Answers to the dispatch's blocking questions, resolved 2026-05-23.
## D1. Step cap — what replaces MAX_TOOL_LOOP_DEPTH?
`MAX_TOOL_LOOP_DEPTH` never existed — no hard recursion depth guard was ever in the codebase. Safety came from budget (50 tool calls) + doom-loop (3 identical calls).
**Decision:** introduce `MAX_STEPS = 200` as a hard ceiling. Per-agent cap via `agent.steps` is the primary knob. Resolution: `effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS)`.
**Rationale:** Sam reports BooChat gets stuck at 50 tool calls (the budget) too often. The step cap should be generous — 200 is 4x the current de-facto ceiling. Budget (50 tool calls total across all steps) remains a separate concern and is not changed in this batch.
Note: "step" ≠ "tool call." One step = one stream iteration that may produce multiple parallel tool calls. Budget counts individual tool calls; step cap counts iterations. At 200 steps with average 1-2 tool calls per step, the budget (50) will fire well before the step cap in most scenarios. The step cap is a safety ceiling for cases where the model makes many 1-tool-call iterations.
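The cap resolution in D1 can be sketched as below. This is a minimal illustration, assuming a hypothetical `AgentLike` shape that carries only the `steps` field; the real `Agent` type has more fields.

```typescript
// Hard ceiling from D1; applies to every agent.
const MAX_STEPS = 200;

// Hypothetical minimal agent shape, for illustration only.
interface AgentLike {
  steps?: number;
}

// The per-agent cap wins when it is lower than the ceiling.
// An unset cap (or no agent at all) falls back to MAX_STEPS.
function resolveEffectiveCap(agent: AgentLike | null): number {
  return Math.min(agent?.steps ?? Infinity, MAX_STEPS);
}
```

With this shape, an agent declaring `steps: 500` is still limited to 200, and an agent declaring `steps: 0` resolves to a cap of 0.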
## D2. step_finish — emit or not?
**Decision:** No `step_finish` part. The next `step_start` (or assistant message completion) implicitly ends the previous step.
**Rationale:** opencode only emits `step_start`. Less noise in parts, simpler code. If UI ever needs step durations, compute from the timestamps of consecutive `step_start` parts.
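If step durations are ever needed, the computation from consecutive `step_start` timestamps might look like the sketch below. The payload fields `step_number` and `started_at` follow the proposal; the `completedAt` argument (the message completion time that closes the final step) and the function name are assumptions made for illustration.

```typescript
// Payload shape of a step_start part, per the proposal's S4.
interface StepStartPayload {
  step_number: number;
  started_at: string; // ISO-8601 timestamp
}

// Each step ends where the next one starts; the last step ends at message completion.
// Assumes `steps` is already ordered by step_number.
function stepDurationsMs(steps: StepStartPayload[], completedAt: string): number[] {
  const starts = steps.map((s) => Date.parse(s.started_at));
  const ends = [...starts.slice(1), Date.parse(completedAt)];
  return starts.map((start, i) => (ends[i] ?? start) - start);
}
```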
## D3. Step-cap hit — sentinel or quiet?
**Decision:** Write a sentinel summary on step-cap hit. Visible to the user in chat, same as budget-exhaustion's `runCapHitSummary`.
**Implementation:** Extend `runCapHitSummary` to accept a `reason: 'budget' | 'step_cap'` parameter (or add a parallel `runStepCapSummary`). The sentinel metadata kind stays `cap_hit` — frontend `CapHitSentinel` component already renders it. The sentinel's text distinguishes the two cases ("Tool budget exhausted" vs "Step limit reached").
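One way to picture the `reason` parameter is a small builder that keeps the metadata kind fixed and varies only the text. This is a sketch under assumptions: the `CapHitSentinel` object shape and the `buildCapHitSentinel` name are hypothetical, and the real `runCapHitSummary` does more than build a string.

```typescript
type CapReason = 'budget' | 'step_cap';

// Hypothetical sentinel shape; only `kind: 'cap_hit'` is taken from the design.
interface CapHitSentinel {
  kind: 'cap_hit';
  reason: CapReason;
  text: string;
}

// The kind stays 'cap_hit' so the existing frontend component keeps rendering it;
// only the text tells the two cases apart.
function buildCapHitSentinel(reason: CapReason, count: number): CapHitSentinel {
  const text =
    reason === 'budget'
      ? `Tool budget exhausted (${count} calls)`
      : `Step limit reached (${count} steps)`;
  return { kind: 'cap_hit', reason, text };
}
```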
## D4. agent.steps = 0
**Decision:** `steps: 0` means "no tool calls allowed." The loop body never executes. The assistant can only respond with text.
**Implementation:** `steps: 0` makes the loop cap 0, so the while condition `stepNumber < effectiveCap` is false on the first check and the loop is skipped entirely. The stream phase still runs once (the model produces a text response), then the turn finalizes and returns. Two ways to keep that turn text-only were considered: omit tools from the request payload, or pass tools but never enter the tool-execution branch. The second is cleaner: if the model emits tool calls, they are ignored and the turn finalizes as text-only.
This may produce a confusing response if the model's text references tool results it never got, but `steps: 0` is an explicit constraint the agent author chose. Document this in the AGENTS.md parser validation.
## D5. Synthesis success terminates the loop?
**Decision:** Yes. `break` out of the loop after synthesis success. Preserves current behavior (synthesis replaces the recursive call; no further iterations).
**Rationale:** The synthesis pass produces a self-contained summary turn. Continuing the loop after synthesis would let the model issue more tool calls on top of a synthesis summary, which is semantically wrong — the synthesis IS the final answer for that tool call batch.
## D6. executeToolPhase return struct
The recursive call at `tool-phase.ts:342` is currently the last thing `executeToolPhase` does (after creating the next assistant row). After the conversion, `executeToolPhase` returns a struct the loop body reads:
```typescript
interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}
```
- `continue` → loop continues; `nextAssistantId` is the new assistant message's UUID.
- `paused` → user-input or grant pause; loop breaks. `nextAssistantId` is null.
- `synthesis_done` → synthesis succeeded; loop breaks. `nextAssistantId` is null (synthesis wrote its own parts).
The loop body then:
1. Updates `toolsUsed += result.toolCallCount`
2. Appends `result.toolCalls` to `recentToolCalls`
3. Sets `assistantMessageId = result.nextAssistantId` for the next iteration
4. Increments `stepNumber`
5. Checks `result.action` — if not `continue`, breaks.
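The five loop-body steps above can be sketched as a single state update. The `LoopState` type, the `applyToolPhaseResult` name, and the minimal `ToolCall` shape are assumptions for illustration; the real loop keeps these values in local variables inside `runAssistantTurn`.

```typescript
// Hypothetical minimal ToolCall shape.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolPhaseResult {
  action: 'continue' | 'paused' | 'synthesis_done';
  toolCallCount: number;
  toolCalls: ToolCall[];
  nextAssistantId: string | null;
}

// Hypothetical container for the loop locals.
interface LoopState {
  toolsUsed: number;
  recentToolCalls: ToolCall[];
  assistantMessageId: string | null;
  stepNumber: number;
}

// Applies steps 1-4, then reports step 5 as a flag the caller uses to break.
function applyToolPhaseResult(
  state: LoopState,
  result: ToolPhaseResult,
): { state: LoopState; shouldBreak: boolean } {
  return {
    state: {
      toolsUsed: state.toolsUsed + result.toolCallCount,
      recentToolCalls: [...state.recentToolCalls, ...result.toolCalls],
      assistantMessageId: result.nextAssistantId,
      stepNumber: state.stepNumber + 1,
    },
    shouldBreak: result.action !== 'continue',
  };
}
```

Returning a new state object instead of mutating keeps the sketch easy to test; the real loop can just as well update its locals in place.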
## D7. Budget vs steps interaction
Budget counts **individual tool calls** across the entire turn. Steps counts **loop iterations**. They are orthogonal:
- Budget fires when `toolsUsed >= resolveToolBudget(agent)` (currently 50 for read-only). Checked at the top of each iteration.
- Step cap fires when `stepNumber >= effectiveCap`. Checked by the loop condition.
Both produce a sentinel summary. A turn can be terminated by whichever fires first. In practice, budget (50 tool calls) fires before step cap (200 steps) unless the model produces many 0-tool-call iterations (which shouldn't happen — 0 tool calls means non-tool finish, which exits the loop via the `break` path).
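To make the interaction concrete, the sketch below replays a sequence of per-step tool-call counts and reports which limit ends the turn. It is a simplified model, not the real loop: the function name and the `callsPerStep` input are hypothetical, and doom-loop, pause, synthesis, and abort exits are left out.

```typescript
type Termination = 'budget' | 'step_cap' | 'finished';

// callsPerStep[i] is the number of tool calls the model emits in step i.
// A missing entry or 0 models a non-tool finish.
function firstLimitHit(callsPerStep: number[], budget: number, effectiveCap: number): Termination {
  let toolsUsed = 0;
  let stepNumber = 0;
  while (stepNumber < effectiveCap) {        // step cap: the loop condition
    if (toolsUsed >= budget) return 'budget'; // budget: checked at the top of each iteration
    const calls = callsPerStep[stepNumber];
    if (calls === undefined || calls === 0) return 'finished';
    toolsUsed += calls;
    stepNumber += 1;
  }
  return 'step_cap';
}
```

With a budget of 50 and a cap of 200, a model that emits two calls per step is stopped by the budget after 25 steps; the step cap only wins when the per-agent cap is small, for example `steps: 5` with one call per step.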

@@ -1,112 +0,0 @@
# v1.14.0-outer-loop — explicit outer agent loop
Replace the ad-hoc `executeToolPhase → runAssistantTurn` recursion with an explicit `while` loop. A **step** is one stream-and-tool-execute iteration; a step can contain multiple parallel tool calls. The loop terminates on non-tool finish OR step-cap hit OR doom-loop OR budget exhaustion OR abort OR synthesis success.
## Why
The current recursion works but has two problems: (a) stack depth grows linearly with tool iterations — 50 nested async frames is fragile, (b) there's no explicit step counter, so there's no per-agent step cap and no step-boundary instrumentation. BooChat also gets stuck at 50 tool calls (the budget ceiling) more often than it should — the new `MAX_STEPS = 200` hard ceiling lets the loop run much longer before the step cap fires, while the existing budget (50 tool calls) remains a separate concern.
## Recon findings (verified 2026-05-23)
- `runAssistantTurn` at `turn.ts:144-147` is the recursive entry. Returns `Promise<void>`.
- `executeToolPhase` at `tool-phase.ts:89-96` calls back into `runAssistantTurn` at `tool-phase.ts:342`.
- Recursion terminates on: non-tool finish, budget exhaustion (`args.toolsUsed >= budget`), doom-loop (3 identical calls via `detectDoomLoop`), user-input pause (ask_user_input / request_read_access), synthesis success, stream error, abort.
- **No existing hard recursion depth limit** — `MAX_TOOL_LOOP_DEPTH` does not exist. Safety comes from budget (50) + doom-loop (3 identical).
- `TurnArgs` defined in `turn.ts:127-141`, not `types.ts`. Fields: `sessionId`, `chatId`, `assistantMessageId`, `toolsUsed`, `recentToolCalls`, `signal`. All mutable fields are threaded through the recursive call.
- Synthesis pipeline (`synthesisPipeline.ts`) is a branch in `executeToolPhase` — if synthesis succeeds, recursion is skipped.
- `step_start` already in the `message_parts.kind` CHECK constraint. No schema change needed.
- `agents.ts` does NOT currently parse a `steps` field. Needs adding to `ParsedFrontmatter`.
## Scope
### S1. Outer loop in `turn.ts`
Convert the recursive chain to a `while (stepNumber < effectiveCap)` loop:
```
let stepNumber = 0
while (stepNumber < effectiveCap) {
  // doom-loop check
  // budget check
  // emit step_start part
  // stream phase (executeStreamPhase)
  // if no tool calls → finalize, break
  // tool phase (executeToolPhase — now returns, doesn't recurse)
  // if paused (user input / grant) → break
  // if synthesis succeeded → break
  // create next assistant message row
  // increment stepNumber, update toolsUsed, append recentToolCalls
}
// if stepNumber >= effectiveCap → sentinel summary
```
`effectiveCap = Math.min(agent.steps ?? Infinity, MAX_STEPS)` where `MAX_STEPS = 200`.
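A runnable approximation of the loop skeleton is sketched below. It is deliberately simplified and rests on several assumptions: the stream and tool phases are collapsed into one injected synchronous `runStep` callback (the real phases are async), tool calls are plain strings, exits are expressed as `return` values instead of `break` plus sentinel writes, and abort handling, `step_start` emission, and the `steps: 0` special case are not modeled. `LoopDeps`, `StepOutcome`, and `runOuterLoop` are hypothetical names.

```typescript
const MAX_STEPS = 200;

type ToolAction = 'continue' | 'paused' | 'synthesis_done';
type ExitReason = 'finished' | 'paused' | 'synthesis_done' | 'budget' | 'doom_loop' | 'step_cap';

// Stand-in for one stream phase plus one tool phase.
interface StepOutcome {
  toolCalls: string[]; // empty array models a non-tool finish
  action: ToolAction;  // what the tool phase reports when tool calls exist
}

interface LoopDeps {
  runStep: (stepNumber: number) => StepOutcome;
  isDoomLoop: (recent: string[]) => boolean;
  budget: number;
  agentSteps?: number;
}

function runOuterLoop(deps: LoopDeps): { exit: ExitReason; steps: number; toolsUsed: number } {
  const effectiveCap = Math.min(deps.agentSteps ?? Infinity, MAX_STEPS);
  let stepNumber = 0;
  let toolsUsed = 0;
  const recent: string[] = [];

  while (stepNumber < effectiveCap) {
    // Top-of-loop guards, in the order listed in the proposal.
    if (deps.isDoomLoop(recent)) return { exit: 'doom_loop', steps: stepNumber, toolsUsed };
    if (toolsUsed >= deps.budget) return { exit: 'budget', steps: stepNumber, toolsUsed };

    const outcome = deps.runStep(stepNumber);
    if (outcome.toolCalls.length === 0) {
      return { exit: 'finished', steps: stepNumber + 1, toolsUsed };
    }

    toolsUsed += outcome.toolCalls.length;
    recent.push(...outcome.toolCalls);
    stepNumber += 1;

    if (outcome.action !== 'continue') {
      return { exit: outcome.action, steps: stepNumber, toolsUsed };
    }
  }
  // Falling out of the loop means the step cap fired.
  return { exit: 'step_cap', steps: stepNumber, toolsUsed };
}
```

Injecting the phases as callbacks is only a testing convenience for this sketch; it shows that every exit path of the recursion maps to one explicit loop exit.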
### S2. `executeToolPhase` becomes non-recursive
Remove the `runAssistantTurn` call at `tool-phase.ts:342`. Instead, return a result indicating what happened: `{action: 'continue' | 'paused' | 'synthesis_done', toolCallCount, toolCalls, nextAssistantId}` (field names as settled in the design decisions, D6). The caller (the while loop) uses the action to decide whether to continue or break.
### S3. `agent.steps` field
`agents.ts:ParsedFrontmatter` gains `steps?: number`. Parser extracts it from YAML frontmatter (integer ≥ 0). `steps: 0` means "no tool calls allowed" — loop body never executes; assistant responds text-only.
### S4. Step-boundary events
At the top of each loop iteration, emit a `step_start` part with payload `{step_number, started_at}`. Uses `insertParts` into the current assistant message. No `step_finish` — the next `step_start` (or message completion) implicitly ends the previous step.
### S5. Doom-loop migration
`detectDoomLoop` check moves from `runAssistantTurn` (top of function, pre-stream) to the top of the while-loop body (same logical position). Same predicate, same threshold (3). Same `runDoomLoopSummary` call. Control flow changes from `return` (unwinding recursion) to `break` (exiting loop).
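For readers unfamiliar with the predicate, a doom-loop check of this kind might look like the sketch below. This is not the existing `detectDoomLoop` implementation: it assumes "identical" means the last three calls share the same tool name and the same JSON-serialized arguments, and both the `ToolCall` shape and the `looksLikeDoomLoop` name are hypothetical.

```typescript
// Hypothetical minimal ToolCall shape.
interface ToolCall {
  name: string;
  args: unknown;
}

const DOOM_LOOP_THRESHOLD = 3;

// True when the most recent DOOM_LOOP_THRESHOLD calls are identical.
// Note: JSON.stringify is sensitive to key order, so this treats
// {a: 1, b: 2} and {b: 2, a: 1} as different arguments.
function looksLikeDoomLoop(recentToolCalls: ToolCall[]): boolean {
  if (recentToolCalls.length < DOOM_LOOP_THRESHOLD) return false;
  const tail = recentToolCalls
    .slice(-DOOM_LOOP_THRESHOLD)
    .map((call) => `${call.name}:${JSON.stringify(call.args)}`);
  return tail.every((signature) => signature === tail[0]);
}
```

Inside the while loop, a positive result leads to the summary call followed by `break`, where the recursive version used `return`.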
### S6. Step-cap sentinel
When `stepNumber >= effectiveCap`, write a sentinel summary like the existing `runCapHitSummary`. Reuse `runCapHitSummary` with a reason parameter distinguishing "budget exhaustion" from "step cap hit", or create a parallel `runStepCapSummary`. The sentinel makes the cap visible in chat.
### S7. AGENTS.md updates
Add `steps:` to each agent in `data/AGENTS.md`:
- Refactorer: `steps: 5`
- Architect: `steps: 20`
- All others: unset (infinity — bounded only by `MAX_STEPS = 200`)
### S8. Tests
New test file `apps/server/src/services/__tests__/outer-loop.test.ts` covering:
- Clean finish (stream returns non-tool, loop exits after 1 iteration)
- Step-cap hit (loop exits at cap, sentinel written)
- Doom-loop break (3 identical calls, sentinel written)
- Budget exhaustion (toolsUsed >= budget, cap-hit sentinel written)
- Abort mid-step (signal fires, loop exits)
- `steps: 0` edge case (no loop iterations, text-only response)
- Synthesis success (loop exits after synthesis)
## Non-goals
- No frontend changes. `step_start` parts surface via `messages_with_parts` automatically; UI doesn't render them in v1.14.
- No `output_schema` / `exit_expression` / `execution_strategy` AGENTS.md fields.
- No per-step snapshot for revert (v2.0 BooCoder concern).
- No changes to budget constants (50 / 10 / 50). That's a separate concern.
- No `repairToolCall` changes.
- No compaction changes.
## Hard rules
- No git commit, push. Sam commits.
- Backup before editing.
- TS strict, no `any`.
- Doom-loop threshold stays at 3.
- 332+ existing tests still pass + new outer-loop tests.
## Files expected to touch
- `apps/server/src/services/inference/turn.ts` — recursion → loop
- `apps/server/src/services/inference/tool-phase.ts` — remove recursive call, return result struct
- `apps/server/src/services/inference/sentinel-summaries.ts` — step-cap sentinel (or extend cap-hit)
- `apps/server/src/services/agents.ts` — parse `steps` field
- `data/AGENTS.md` — add `steps:` to Refactorer + Architect
- `apps/server/src/services/__tests__/outer-loop.test.ts` — NEW
- `apps/server/src/services/inference/index.ts` — re-export if new types needed
## Estimate
~300 LoC net (turn.ts refactor + tool-phase return struct + agents parser + tests). The conversion is structural, not behavioral — every exit path is preserved, just expressed as loop control flow instead of recursion unwinding.

@@ -1,82 +0,0 @@
# v1.14.0-outer-loop tasks
## B1 — Backups
- [ ] `turn.ts`, `tool-phase.ts`, `sentinel-summaries.ts`, `agents.ts`, `data/AGENTS.md`
## B2 — agents.ts: parse `steps` field
- [ ] Add `steps?: number` to `ParsedFrontmatter` interface
- [ ] Parse from YAML frontmatter: integer ≥ 0, warn on out-of-range (negative or non-integer), clamp to 0
- [ ] Expose on the `Agent` type returned by `getAgentsForProject`
- [ ] `npx tsc --noEmit -p apps/server` clean
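The validation rule in B2 could be sketched as follows. The `parseSteps` name and the injectable `warn` callback are assumptions for illustration, and the sketch reads "clamp to 0" as applying to every invalid value (negative, non-integer, or non-numeric), which is one plausible interpretation of the task.

```typescript
// Returns undefined when the field is absent, the value when it is a valid
// integer >= 0, and 0 (with a warning) for anything else.
function parseSteps(
  raw: unknown,
  warn: (message: string) => void = (message) => console.warn(message),
): number | undefined {
  if (raw === undefined || raw === null) return undefined;
  if (typeof raw === 'number' && Number.isInteger(raw) && raw >= 0) return raw;
  warn(`agents: invalid steps value ${JSON.stringify(raw)}; clamping to 0`);
  return 0;
}
```

Keeping "absent" distinct from 0 matters here, because an unset `steps` means "bounded only by `MAX_STEPS`" while 0 means "no tool calls allowed".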
## B3 — AGENTS.md: add `steps:` to Refactorer + Architect
- [ ] `data/AGENTS.md` — Refactorer: `steps: 5`
- [ ] `data/AGENTS.md` — Architect: `steps: 20`
- [ ] All others: leave unset (infinite, bounded by MAX_STEPS=200)
## B4 — tool-phase.ts: remove recursive call, return result struct
- [ ] Define `ToolPhaseResult` interface: `{action: 'continue' | 'paused' | 'synthesis_done', toolCallCount: number, toolCalls: ToolCall[], nextAssistantId: string | null}`
- [ ] Remove `runAssistantTurn` import and call at line ~342
- [ ] `executeToolPhase` returns `ToolPhaseResult` instead of `Promise<void>`
- [ ] On normal path (after creating next assistant row): return `{action: 'continue', toolCallCount, toolCalls: result.toolCalls, nextAssistantId}`
- [ ] On user-input pause: return `{action: 'paused', toolCallCount: <calls executed so far>, toolCalls: result.toolCalls, nextAssistantId: null}`
- [ ] On synthesis success: return `{action: 'synthesis_done', toolCallCount, toolCalls: result.toolCalls, nextAssistantId: null}`
- [ ] `npx tsc --noEmit -p apps/server` will FAIL here (turn.ts still expects void) — expected, fixed in B5
## B5 — turn.ts: recursion → while loop
- [ ] Add `MAX_STEPS = 200` constant
- [ ] Resolve `effectiveCap = Math.min(agent?.steps ?? Infinity, MAX_STEPS)` at the top of `runAssistantTurn`
- [ ] Convert `runAssistantTurn` body into a `while (stepNumber < effectiveCap)` loop:
- Top of loop: doom-loop check (move from current position; `break` instead of `return`)
- Top of loop: budget check (move from current position; `break` instead of `return`, but still call `runCapHitSummary` before break)
- Emit `step_start` part via `insertParts` with payload `{step_number: stepNumber, started_at: new Date().toISOString()}`
- Call `executeStreamPhase`
- If no tool calls → `finalizeCompletion`, `break`
- Call `executeToolPhase` (now returns `ToolPhaseResult`)
- If `result.action !== 'continue'` → `break`
- Update `toolsUsed += result.toolCallCount`
- Update `recentToolCalls = [...recentToolCalls, ...result.toolCalls]`
- Update `assistantMessageId = result.nextAssistantId!`
- Increment `stepNumber`
- [ ] After loop: if `stepNumber >= effectiveCap` → call step-cap sentinel (B6)
- [ ] `effectiveCap === 0` edge case: the while condition is immediately false; stream the first turn text-only (the stream phase at the top of the function runs once before the loop — OR handle this by structuring the loop as do-while, OR handle by pre-checking and skipping tools from the request). Pick the cleanest approach.
- [ ] Remove `TurnArgs` from the module export if it's no longer threaded through recursion — OR keep it and populate from loop locals. (Design note: `TurnArgs` is still used by `executeStreamPhase`, `executeToolPhase`, `sentinel-summaries.ts`, `error-handler.ts`. Keep the interface; populate from loop locals each iteration.)
- [ ] `npx tsc --noEmit -p apps/server` clean
- [ ] `pnpm -C apps/server test` — all existing tests pass
## B6 — sentinel-summaries.ts: step-cap sentinel
- [ ] Add `runStepCapSummary` (or extend `runCapHitSummary` with a `reason` param)
- [ ] Write a sentinel with `metadata.kind = 'cap_hit'` (same as budget) so `CapHitSentinel` UI renders it
- [ ] Sentinel text distinguishes "Step limit reached (N steps)" from "Tool budget exhausted (N calls)"
- [ ] Called from the post-loop check in turn.ts (B5)
## B7 — Tests
- [ ] NEW `apps/server/src/services/__tests__/outer-loop.test.ts`
- [ ] Test: clean finish — stream returns no tool calls, loop exits after 1 step
- [ ] Test: step-cap hit — mock agent with `steps: 2`, model always returns tool calls, loop exits at 2, sentinel written
- [ ] Test: doom-loop — 3 identical tool calls, sentinel written, loop breaks
- [ ] Test: budget exhaustion — toolsUsed >= budget, cap-hit sentinel written
- [ ] Test: `steps: 0` — no loop iterations, text-only response
- [ ] Test: synthesis success — loop breaks after synthesis
- [ ] `pnpm -C apps/server test` — all 332+ existing + new tests pass
## B8 — Verification
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors
- [ ] `npx tsc -p apps/web/tsconfig.app.json --noEmit` — 0 errors (no web changes; should pass)
- [ ] `pnpm -C apps/web build` — green
- [ ] `pnpm -C apps/server test` — all green
## B9 — Docs + tag + deploy
- [ ] `CHANGELOG.md` entry for v1.14.0-outer-loop
- [ ] `boocode_roadmap.md` retrospective bullet on the v1.14 section
- [ ] `CLAUDE.md` updates: mention the outer loop, MAX_STEPS, agent.steps in the inference/ section
- [ ] Commit, tag `v1.14.0-outer-loop`, push, rebuild

@@ -1,39 +0,0 @@
# v1.14.1-mcp-poc — design decisions
## D1. Transport: Streamable HTTP (not stdio)
Context7 is a remote service at `https://mcp.context7.com/mcp`. Uses the MCP Streamable HTTP transport. The `@modelcontextprotocol/sdk` TypeScript client supports this via `StreamableHTTPClientTransport`. No stdio needed.
## D2. Tool name prefixing
MCP tools get a `context7_` prefix to avoid collisions with BooCode's native tools. Context7's tools are `resolve-library-id` and `query-docs` — these become `context7_resolve-library-id` and `context7_query-docs`. The prefix is stripped before calling the MCP server's `tools/call`.
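The prefixing round trip is small enough to show in full. The helper names `toBooCodeName` and `toMcpName` are hypothetical; only the `context7_` prefix and the two tool names come from this design.

```typescript
const MCP_PREFIX = 'context7_';

// Name the model sees in BooCode's tool list.
function toBooCodeName(mcpName: string): string {
  return `${MCP_PREFIX}${mcpName}`;
}

// Name sent to the MCP server's tools/call; rejects names that were never prefixed.
function toMcpName(booCodeName: string): string {
  if (!booCodeName.startsWith(MCP_PREFIX)) {
    throw new Error(`not an MCP tool name: ${booCodeName}`);
  }
  return booCodeName.slice(MCP_PREFIX.length);
}
```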
## D3. Read-only invariant guard
BooChat is read-only through v1.x. The MCP client rejects any tool whose `annotations?.readOnly === false`. Tools with `readOnly: true` or no annotations are accepted. Context7's tools are all read-only (they query documentation — no write side effects). Fail-open on missing annotations is a deliberate choice: most MCP servers don't set annotations yet, and rejecting all un-annotated tools would make the feature useless. The guard catches explicitly-declared write tools.
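The fail-open guard can be sketched as a filter over the discovered tools. `McpToolLike` is a hypothetical local type, and the `readOnly` field name simply follows this document; the MCP specification may name the hint differently (for example `readOnlyHint`), so the field should be checked against the SDK's types before relying on it.

```typescript
// Hypothetical local view of an MCP tool; field name follows this design doc.
interface McpToolLike {
  name: string;
  annotations?: { readOnly?: boolean };
}

// Fail-open: only an explicit `readOnly: false` is rejected.
// Missing annotations and `readOnly: true` are both accepted.
function partitionByReadOnly(tools: McpToolLike[]): { accepted: McpToolLike[]; rejected: string[] } {
  const accepted: McpToolLike[] = [];
  const rejected: string[] = [];
  for (const tool of tools) {
    if (tool.annotations?.readOnly === false) {
      rejected.push(tool.name);
    } else {
      accepted.push(tool);
    }
  }
  return { accepted, rejected };
}
```

Returning the rejected names separately makes it easy to log which tools were dropped at startup.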
## D4. Zod inputSchema for MCP tools
MCP tools come with a JSON Schema `inputSchema`. BooCode's `ToolDef` has both a Zod `inputSchema` (for server-side validation) and a `jsonSchema` (for the LLM's tool schema). For MCP tools:
- `jsonSchema` is built directly from the MCP tool's `inputSchema` (it's already JSON Schema).
- `inputSchema` uses `z.record(z.unknown())` as a pass-through — the MCP server does its own validation. Double-validating with a generated Zod schema from JSON Schema adds complexity with no value for a PoC.
## D5. Tool registration: append + re-sort (not lazy-init)
The simplest approach: keep `ALL_TOOLS` as the native tool array. Add an `appendMcpTools(tools: ToolDef[])` function that pushes MCP tools, re-sorts alphabetically, and rebuilds `TOOLS_BY_NAME` and `READ_ONLY_TOOL_NAMES`. Called once at startup after MCP init. More invasive approaches (lazy-init, factory function) change the import shape for every consumer. Mutation-at-startup is ugly but contained to one call site and matches the existing alpha-sort-at-module-level pattern.
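A minimal sketch of the append-and-re-sort approach follows. The `ToolDef` shape is cut down to two fields, the native tool names (`read_file`, `grep`) are placeholders, and the concrete types of `TOOLS_BY_NAME` (a plain record) and `READ_ONLY_TOOL_NAMES` (a `Set`) are assumptions about the real module.

```typescript
// Hypothetical minimal ToolDef; the real one also carries schemas and execute().
interface ToolDef {
  name: string;
  category: 'read_only' | 'write';
}

// Placeholder native tools, for illustration only.
const ALL_TOOLS: ToolDef[] = [
  { name: 'read_file', category: 'read_only' },
  { name: 'grep', category: 'read_only' },
];
let TOOLS_BY_NAME: Record<string, ToolDef> = {};
let READ_ONLY_TOOL_NAMES: Set<string> = new Set();

// Sort in place, then rebuild both derived indexes from the sorted array.
function rebuildIndexes(): void {
  ALL_TOOLS.sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  TOOLS_BY_NAME = Object.fromEntries(ALL_TOOLS.map((tool) => [tool.name, tool]));
  READ_ONLY_TOOL_NAMES = new Set(
    ALL_TOOLS.filter((tool) => tool.category === 'read_only').map((tool) => tool.name),
  );
}

// Called once at startup, after MCP initialization.
function appendMcpTools(tools: ToolDef[]): void {
  ALL_TOOLS.push(...tools);
  rebuildIndexes();
}

rebuildIndexes(); // module-level alpha-sort for the native tools
```

The comparator uses plain string comparison instead of a locale-aware one so the order is the same on every host, which matters if the sorted order feeds a prompt cache.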
## D6. No per-session toggle
Web tools have `session.web_search_enabled`. MCP tools do NOT get a session toggle in v1.14.1. If configured via env var, MCP tools are always available. Per-session MCP control is a v1.15 concern (when multiple MCP servers and the permission ruleset land together).
## D7. Graceful degradation
MCP server down at startup → log warning, expose zero MCP tools, BooCode functions normally. MCP server down mid-session (tool call fails) → the `execute` wrapper catches the error and returns `{error: true, output: "MCP server unreachable"}` — the model sees the error and can self-correct (use native tools instead).
## D8. Result content extraction
MCP `tools/call` returns `{content: ContentBlock[]}` where each block is `{type: 'text', text: string}` or `{type: 'resource', ...}`. For the PoC:
- Text blocks: join with `\n`.
- Resource blocks: serialize as JSON (the model can read structured data).
- Empty content: return `"(no output)"`.
- `isError: true` in the response: return `{error: true, output: joinedContent}`.
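The extraction rules above might be implemented along these lines. The types are local stand-ins, not the SDK's: `ContentBlock` models only the two block kinds mentioned here, the `resource` payload is left as `unknown`, and the success shape `{output}` is an assumption, since the design only fixes the error shape.

```typescript
// Local stand-ins for the MCP response types.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'resource'; resource: unknown };

interface McpCallResponse {
  content: ContentBlock[];
  isError?: boolean;
}

interface ToolOutput {
  error?: true;
  output: string;
}

function extractMcpOutput(response: McpCallResponse): ToolOutput {
  // Text blocks contribute their text; resource blocks are serialized as JSON.
  const joined = response.content
    .map((block) => (block.type === 'text' ? block.text : JSON.stringify(block)))
    .join('\n');
  const output = response.content.length === 0 ? '(no output)' : joined;
  return response.isError ? { error: true, output } : { output };
}
```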

@@ -1,96 +0,0 @@
# v1.14.1-mcp-poc — single-server MCP client proof-of-concept
Validate the MCP-client loop end-to-end against one real MCP server (Context7) before committing to the full opencode `mcp/index.ts` port at v1.15. Small, throwaway-if-needed.
## Why
BooCode's tool registry (`ALL_TOOLS` in `tools.ts`) is static — tools are hardcoded TypeScript modules. MCP is the protocol for dynamic tool discovery. Wiring one real MCP server end-to-end proves: tool-discovery → tool-list → tool-call → result-render → context-budget accounting all hold. If Context7 works, any MCP server will work via the same plumbing.
## Scope
### S1. Install `@modelcontextprotocol/sdk`
New dependency in `apps/server/package.json`. The official TypeScript MCP client SDK (MIT). Provides `Client`, `StreamableHTTPClientTransport`, tool-call/result types.
### S2. New service: `apps/server/src/services/mcp-client.ts`
Singleton MCP client that:
1. Connects to Context7 at `MCP_CONTEXT7_URL` (default `https://mcp.context7.com/mcp`) via Streamable HTTP transport.
2. Optional `MCP_CONTEXT7_API_KEY` env var passed as a header.
3. On `initialize()`: calls `tools/list`, wraps each MCP tool as a `ToolDef`, prefixes names with `context7_` to avoid collisions with BooCode's native tools.
4. **Read-only invariant guard:** rejects any tool whose `annotations?.readOnly` is explicitly `false`. Tools with `readOnly: true` or no `annotations` field are accepted (fail-open on read-only, since most MCP tools don't set annotations yet — Context7's tools don't).
5. `callTool(name, args)` → calls the MCP server's `tools/call` endpoint and returns the result content.
6. `getTools(): ToolDef[]` → returns the discovered tools wrapped as BooCode `ToolDef` objects.
7. Graceful degradation: if the MCP server is unreachable at startup, log a warning and expose zero MCP tools. BooCode functions normally with its native tools.
### S3. Config extension
`apps/server/src/config.ts` gains two optional env vars:
- `MCP_CONTEXT7_URL` (string, default `https://mcp.context7.com/mcp`)
- `MCP_CONTEXT7_API_KEY` (string, optional)
### S4. Tool registration
`apps/server/src/services/tools.ts` — after building `ALL_TOOLS` from native tools, append MCP-discovered tools from `mcpClient.getTools()`. The alpha-sort at the end of `ALL_TOOLS` construction covers both native and MCP tools. `TOOLS_BY_NAME` map includes MCP tools.
MCP tools are registered with `category: 'read_only'` (per the read-only invariant guard in S2).
### S5. Tool dispatch
`apps/server/src/services/inference/tool-phase.ts` `executeToolCall` already dispatches via `TOOLS_BY_NAME[toolName].execute(...)`. MCP tools' `execute` function calls `mcpClient.callTool(name, args)` — the dispatch is transparent to the rest of the inference loop. No changes to `executeToolCall` needed.
### S6. MCP tool result → BooCode format
MCP `tools/call` returns `{ content: [{type: 'text', text: string}, ...] }`. BooCode's `executeToolCall` expects a string or JSON-serializable output. The `execute` wrapper in the ToolDef extracts `content[0].text` (or joins multiple content blocks with `\n`). If the MCP server returns an error, the wrapper returns `{error: true, output: errorMessage}` matching BooCode's existing error-result shape.
### S7. Startup initialization
`apps/server/src/index.ts` — after `applySchema()` and before route registration, call `mcpClient.initialize()`. If `MCP_CONTEXT7_URL` is not set (or empty), skip initialization entirely (MCP is opt-in). Log the number of discovered tools on success.
Tool registration (S4) must happen AFTER MCP initialization, since `getTools()` returns the discovered tools. Current flow: `ALL_TOOLS` is a module-level constant. This needs to change to a lazy-init pattern — either a function that returns the tool list (called once at startup after MCP init), or a mutable array that MCP tools get appended to during startup.
### S8. Agent tool whitelist interaction
MCP tools are prefixed `context7_*`. Existing agents' `tools:` whitelists don't include MCP tool names — so MCP tools are only available to the default agent (no agent selected, which gets ALL_TOOLS). To make MCP tools available to specific agents, their AGENTS.md `tools:` list would need to include `context7_*` names. For the PoC, this is fine — the default agent (most common) gets MCP tools.
## Non-goals
- No stdio transport. Context7 is HTTP-only.
- No OAuth. Context7 uses an API key header.
- No multiple servers. One hardcoded server (Context7).
- No per-agent MCP server allow/deny. All agents that don't have a `tools:` whitelist get MCP tools.
- No per-session MCP toggle. If configured, MCP tools are always available.
- No UI changes. MCP tools surface in the tool list the model sees; results render as normal tool-result parts.
- No schema changes. MCP state is in-memory only.
## Hard rules
- No git commit/push. Sam commits.
- Read-only invariant: reject any MCP tool annotated with `readOnlyHint: false`.
- Graceful degradation: MCP server down → zero MCP tools, BooCode works normally.
- One new dep only: `@modelcontextprotocol/sdk`.
- Alpha-sort of ALL_TOOLS preserved (v1.13.3 prompt-cache invariant).
## Files expected to touch
- `apps/server/package.json` — add `@modelcontextprotocol/sdk`
- `pnpm-lock.yaml` — auto-updated
- `apps/server/src/config.ts` — `MCP_CONTEXT7_URL`, `MCP_CONTEXT7_API_KEY`
- `apps/server/src/services/mcp-client.ts` — NEW, ~100 lines
- `apps/server/src/services/tools.ts` — lazy-init or append MCP tools to ALL_TOOLS
- `apps/server/src/index.ts` — call `mcpClient.initialize()` at startup
- `apps/server/src/services/__tests__/mcp-client.test.ts` — NEW, unit tests for tool wrapping + read-only guard
## Estimate
~150 LoC. The MCP SDK handles the protocol; BooCode's job is wrapping discovered tools as ToolDefs and routing calls through the SDK client.
## Smoke plan
1. Set `MCP_CONTEXT7_URL=https://mcp.context7.com/mcp` in `.env` (or docker-compose env).
2. Restart boocode container.
3. Check logs: should see "mcp: initialized Context7, discovered N tools" (or similar).
4. Open a chat with no agent selected. Send "What does the `streamText` function do in the AI SDK? Use context7 to look it up."
5. Confirm: model calls `context7_resolve-library-id` then `context7_query-docs` (or whatever Context7's tool names are after prefixing).
6. Confirm: tool results render normally in the chat.
7. Without `MCP_CONTEXT7_URL` set: restart, confirm BooCode starts normally with zero MCP tools.

@@ -1,80 +0,0 @@
# v1.14.1-mcp-poc tasks
## B1 — Backups
- [ ] `apps/server/src/services/tools.ts`
- [ ] `apps/server/src/config.ts`
- [ ] `apps/server/src/index.ts`
## B2 — Install `@modelcontextprotocol/sdk`
- [ ] `pnpm -C apps/server add @modelcontextprotocol/sdk`
- [ ] Verify `pnpm -C apps/server build` still works after install
- [ ] Note the installed version
## B3 — Config extension
- [ ] `apps/server/src/config.ts` — add `MCP_CONTEXT7_URL` (string, optional, default `https://mcp.context7.com/mcp`)
- [ ] `apps/server/src/config.ts` — add `MCP_CONTEXT7_API_KEY` (string, optional)
- [ ] Both via Zod `.optional()` with `.default()` for the URL
## B4 — MCP client service
- [ ] NEW `apps/server/src/services/mcp-client.ts`
- [ ] Import `Client` from `@modelcontextprotocol/sdk/client/index.js` and `StreamableHTTPClientTransport` from `@modelcontextprotocol/sdk/client/streamableHttp.js`
- [ ] `initialize(config, log)` — connect to Context7, call `tools/list`, wrap each as ToolDef, apply read-only guard
- [ ] `callTool(name, args)` — call MCP server `tools/call`, extract text content, return as string
- [ ] `getTools()` — return wrapped ToolDef[]
- [ ] `isInitialized()` — boolean
- [ ] Read-only guard: skip tools with `annotations?.readOnlyHint === false`; accept all others
- [ ] Graceful degradation: catch connection errors, log warning, expose zero tools
- [ ] Tool name prefixing: `context7_<original_name>`
- [ ] ToolDef wrapping: map MCP inputSchema (JSONSchema) to ToolJsonSchema `function.parameters`; use `z.any()` for Zod inputSchema (MCP already validated on the server side)
- [ ] Execute wrapper: strip `context7_` prefix before calling MCP, join result content blocks with `\n`
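The prefixing, read-only guard, and execute-wrapper items in this checklist could look roughly like the sketch below. `McpToolInfo`, `WrappedTool`, and `wrapMcpTools` are hypothetical names, and the guard reads `annotations.readOnlyHint`, which is the field name the MCP specification defines for this hint.

```typescript
// Hypothetical view of one entry returned by `tools/list`.
interface McpToolInfo {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
  annotations?: { readOnlyHint?: boolean };
}

// Hypothetical subset of BooCode's ToolDef.
interface WrappedTool {
  name: string;
  description: string;
  category: "read_only";
  parameters: Record<string, unknown>;
  execute(args: Record<string, unknown>): Promise<string>;
}

type CallFn = (originalName: string, args: Record<string, unknown>) => Promise<string>;

function wrapMcpTools(prefix: string, tools: McpToolInfo[], call: CallFn): WrappedTool[] {
  return tools
    // Guard: drop only tools that explicitly declare themselves non-read-only.
    .filter((tool) => tool.annotations?.readOnlyHint !== false)
    .map((tool): WrappedTool => ({
      name: `${prefix}_${tool.name}`,
      description: tool.description ?? "",
      category: "read_only",
      parameters: tool.inputSchema,
      // The closure keeps the original name, so no prefix stripping is needed at call time.
      execute: (args) => call(tool.name, args),
    }));
}
```

Capturing the original name in the closure is one way to satisfy the "strip prefix before calling MCP" item without string manipulation on every call.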
## B5 — Tool registration (lazy-init)
- [ ] `apps/server/src/services/tools.ts` — convert `ALL_TOOLS` from a module-level constant to a lazy-initialized array
- [ ] Add `initializeTools(mcpTools: ToolDef[])` function that builds the final sorted list
- [ ] `TOOLS_BY_NAME`, `READ_ONLY_TOOL_NAMES` derived from the initialized list
- [ ] Ensure all existing callers of `ALL_TOOLS` / `TOOLS_BY_NAME` still work (they import from tools.ts — verify the export shape)
- [ ] OR simpler: keep ALL_TOOLS as-is (native tools), add `appendMcpTools(tools)` that mutates + re-sorts + rebuilds TOOLS_BY_NAME. Less clean but less invasive.
## B6 — Startup wiring
- [ ] `apps/server/src/index.ts` — after `applySchema()`, before route registration:
- If `config.MCP_CONTEXT7_URL` is set: `await mcpClient.initialize(config, app.log)`
- `appendMcpTools(mcpClient.getTools())` (or equivalent)
- Log tool count
- [ ] If URL not set: skip, log "mcp: Context7 not configured, skipping"
## B7 — Verification
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors
- [ ] `pnpm -C apps/server test` — all existing tests pass (MCP client is startup-only; tests don't initialize it)
- [ ] `pnpm -C apps/web build` — green (no web changes)
## B8 — Unit tests
- [ ] NEW `apps/server/src/services/__tests__/mcp-client.test.ts`
- [ ] Test: tool wrapping produces correct ToolDef shape (name, description, jsonSchema, execute fn)
- [ ] Test: read-only guard rejects tools with `readOnlyHint: false`
- [ ] Test: read-only guard accepts tools with `readOnlyHint: true` or no annotations
- [ ] Test: name prefixing — `resolve-library-id` → `context7_resolve-library-id`
- [ ] Test: result extraction — single text content block → string; multiple → joined with `\n`
- [ ] Test: error result — MCP error → `{error: true, output: ...}` shape
## B9 — Deploy + smoke
- [ ] Add `MCP_CONTEXT7_URL=https://mcp.context7.com/mcp` to docker-compose env (or .env)
- [ ] `docker compose up --build -d`
- [ ] Check logs for MCP initialization message
- [ ] Live-smoke: send a chat asking about AI SDK docs via Context7
- [ ] Verify tool calls + results render normally
## B10 — Docs + tag
- [ ] `CHANGELOG.md` entry
- [ ] `boocode_roadmap.md` retrospective bullet
- [ ] `CLAUDE.md` — mention MCP client in the tools/services section
- [ ] Commit, tag `v1.14.1-mcp-poc`, push, rebuild

@@ -1,194 +0,0 @@
# v1.14.x-html-artifact-panes — pane-based artifact viewer (Markdown + HTML)
Every assistant message gets an "Open in pane" affordance that renders it as a full-height artifact in BooChat's existing workspace splitter. Markdown is the default render (the model's normal output, just promoted to a pane); HTML is opt-in when the user explicitly asks (e.g. "render this as HTML", "make me a dashboard", "build an interactive diagram"). Pane headers expose Copy + Download for Markdown, Download-only for HTML. **No inline iframe preview** — artifacts are pane-only.
Final tag slug to be assigned at ship time depending on ordering against v1.14 (outer loop) and v1.14.x-mcp (MCP PoC). This batch is independent of both.
## Why
Three pressures land in the same place:
1. **Long assistant replies are uncomfortable to read in the chat stream.** Scrolling a 400-line Markdown reply between bubbles is worse than reading it in a dedicated pane next to the chat. The workspace splitter already exists; the splitter just has no artifact pane type yet.
2. **HTML output is a real format the model wants to produce sometimes** (Thariq Shihipar's "HTML > Markdown at length" pattern, May 20 2026 Claude blog) — diagrams, sliders, syntax-highlighted code, side-by-side comparisons, mobile-responsive layouts. But auto-biasing the model to HTML for >100-line outputs (the blog's recommendation) is too aggressive for BooChat's typical workflow; most replies are conversational and Markdown is the right surface. **HTML stays opt-in.**
3. **Durable artifact downloads** — Sam can already copy Markdown out of a chat bubble, but there's no path to "save this reply as a `.md` next to the project, keep it around." Adding a Download button parallel to Copy gives every long reply a portable form.
## Scope
### S1. AGENTS.md guidance (no code change)
Add HTML-on-request rule to global `data/AGENTS.md`:
> Stay in Markdown by default for all outputs, short or long. Switch to a self-contained `<!DOCTYPE html>...</html>` artifact only when the user explicitly asks (e.g. "render this as HTML", "make a dashboard", "build a diagram"). When producing HTML, follow these design conventions: no excessive centered layouts, no purple gradients, no uniform rounded corners, no Inter font, no generic AI aesthetics. See `claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html` (Thariq Shihipar, May 2026) for the design taxonomy.
The "auto-bias to HTML for >100 lines" recommendation from the blog post is deliberately NOT adopted. Markdown stays the default at every length.
### S2. Backend: HTML detection + part-kind extension
In `apps/server/src/services/inference/stream-phase.ts` post-processing, detect when an assistant text part:
- Starts with `<!DOCTYPE html>` (case-insensitive, whitespace-trimmed), OR
- Is wrapped entirely in a fenced ` ```html ... ``` ` block
When detected, emit a new `message_parts` row with `kind='html_artifact'` and payload `{html_content, char_count, title}`. Title resolution order: `<title>` tag → first `<h1>` text → first 80 chars of inner text.
Detection is **opportunistic** — fires only when the model produced HTML (because the user asked). Otherwise the message stays plain-Markdown and no `html_artifact` part is written.
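The detection and title-resolution rules above might be sketched like this. It is a regex-based approximation under stated assumptions: the function names are hypothetical, and a real implementation may prefer an HTML parser over regular expressions.

```typescript
interface HtmlArtifactPayload {
  html_content: string;
  char_count: number;
  title: string;
}

// Build the fence marker at runtime to keep this example free of literal fences.
const FENCE = "`".repeat(3);
const DOCTYPE_RE = /^<!doctype html>/i;
const FENCED_RE = new RegExp("^" + FENCE + "html[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n" + FENCE + "$", "i");

// Returns the HTML source if the whole text part is an HTML document, else null.
function extractHtml(text: string): string | null {
  const trimmed = text.trim();
  if (DOCTYPE_RE.test(trimmed)) return trimmed;
  const fenced = FENCED_RE.exec(trimmed);
  return fenced ? fenced[1].trim() : null;
}

// Title order from the spec: <title>, then first <h1>, then first 80 chars of inner text.
function resolveTitle(html: string): string {
  const title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html)?.[1]?.trim();
  if (title) return title;
  const h1 = /<h1[^>]*>([\s\S]*?)<\/h1>/i.exec(html)?.[1]?.replace(/<[^>]+>/g, "").trim();
  if (h1) return h1;
  return html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim().slice(0, 80);
}

function detectHtmlArtifact(text: string): HtmlArtifactPayload | null {
  const html = extractHtml(text);
  if (html === null) return null;
  return { html_content: html, char_count: html.length, title: resolveTitle(html) };
}
```

Returning `null` for anything that is not wholly HTML keeps ordinary Markdown replies, including ones that merely contain an HTML snippet mid-message, on the plain text path.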
**Schema:**
```sql
-- v1.14.x: extend message_parts.kind CHECK constraint with html_artifact
ALTER TABLE message_parts DROP CONSTRAINT IF EXISTS message_parts_kind_chk;
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'message_parts_kind_chk') THEN
ALTER TABLE message_parts ADD CONSTRAINT message_parts_kind_chk
CHECK (kind IN ('text', 'reasoning', 'tool_call', 'tool_result', 'synthesis', 'html_artifact'));
END IF;
END $$;
```
Idempotent on re-run (drops + re-adds on every startup; trivial cost).
### S3. Frontend: pane affordance + two pane types
**MessageBubble.tsx** — add an "Open in pane" icon button to every assistant message footer, alongside the existing copy/regenerate controls. Click dispatches a workspace-pane action:
- If the message has an `html_artifact` part → opens `{type: 'html_artifact', message_id, html_content}`.
- Otherwise → opens `{type: 'markdown_artifact', message_id}`.
**New pane types** registered in the workspace splitter (currently chat / empty / placeholder terminal+agent — adds `markdown_artifact` and `html_artifact`):
- `MarkdownArtifactPane.tsx` — pane shell. Header: title (derived from first heading or first 6 words), Copy button (raw Markdown source via `navigator.clipboard.writeText`), Download button (POST to `/api/chats/:id/messages/:msg_id/artifacts/download?fmt=md`). Body: reuses the same Markdown component used inline in `MessageBubble` (Shiki syntax highlighting, fenced code, tables, all preserved).
- `HtmlArtifactPane.tsx` — pane shell. Header: title (from `html_artifact.payload.title`), Download button only (`?fmt=html`). Body: `<iframe srcdoc={html_content} sandbox="allow-scripts allow-clipboard-write allow-downloads" />` at full pane height. **No Copy button** for HTML.
Pane state persisted via `sessions.workspace_panes jsonb` (the v1.12.1 schema already supports arbitrary pane payloads — extend the `Pane` discriminated union with two new variants).
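As an illustration of the union extension and the click dispatch, here is a minimal sketch. The `type` discriminator follows the payloads shown in this section, while the existing variants, `MessagePart`, and `paneForMessage` are assumptions rather than the actual definitions.

```typescript
// Assumed existing variants plus the two new artifact variants.
type Pane =
  | { type: "chat" }
  | { type: "empty" }
  | { type: "markdown_artifact"; message_id: string }
  | { type: "html_artifact"; message_id: string; html_content: string };

// Hypothetical, trimmed-down message shapes.
interface MessagePart {
  kind: string;
  payload?: { html_content?: string };
}
interface AssistantMessage {
  id: string;
  parts: MessagePart[];
}

// "Open in pane": an html_artifact part wins, otherwise fall back to Markdown.
function paneForMessage(msg: AssistantMessage): Pane {
  const html = msg.parts.find((part) => part.kind === "html_artifact")?.payload?.html_content;
  return html !== undefined
    ? { type: "html_artifact", message_id: msg.id, html_content: html }
    : { type: "markdown_artifact", message_id: msg.id };
}

// Exhaustive switch: adding a variant without handling it becomes a compile error.
function paneLabel(pane: Pane): string {
  switch (pane.type) {
    case "chat":
      return "Chat";
    case "empty":
      return "Empty";
    case "markdown_artifact":
      return "Markdown artifact";
    case "html_artifact":
      return "HTML artifact";
  }
}
```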
### S4. Download endpoint
New endpoint `POST /api/chats/:id/messages/:msg_id/artifacts/download?fmt=md|html`:
- Resolves the message and (for HTML) its `html_artifact` part.
- Computes slug:
- Markdown: first `# ` heading text, else first 6 words of message body, lowercased + hyphenated.
- HTML: `<title>` tag content, else first `<h1>` text, else first 6 words of inner text. Same lowercase-hyphen treatment.
- Writes to `/opt/<project>/.boocode/artifacts/<slug>-<unix-timestamp>.<ext>`. Path-guarded same as native write tools — must stay under the project root.
- Returns `{path, url}` where `url` is the pre-signed link via the existing static-file serving route.
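The Markdown slug rule and the file-name pattern above can be sketched as follows. The helper names and the `"artifact"` fallback for empty input are assumptions, and this does not replace the path guard, which still has to validate the final resolved path.

```typescript
// Lowercase, collapse every run of non-alphanumerics into one hyphen, trim hyphens.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// First "# " heading, else the first 6 words of the body.
function deriveMarkdownSlug(body: string): string {
  const heading = /^# +(.+)$/m.exec(body)?.[1];
  const source = heading ?? body.trim().split(/\s+/).slice(0, 6).join(" ");
  // Assumption: fall back to a fixed name when nothing alphanumeric survives.
  return slugify(source) || "artifact";
}

// <slug>-<unix-timestamp>.<ext>, with the timestamp in whole seconds.
function artifactFileName(slug: string, ext: "md" | "html", nowMs: number): string {
  return `${slug}-${Math.floor(nowMs / 1000)}.${ext}`;
}
```

Because `slugify` drops `/` and `.`, a derived slug cannot carry a `../` segment, which gives a second layer of defence in front of the path guard.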
### S5. HTML iframe security stance
Locked from the original 2026-05-22 design:
```
sandbox="allow-scripts allow-clipboard-write allow-downloads"
```
**No `allow-same-origin`** — artifact has its own opaque origin, cannot read BooChat's cookies, Authelia session, or DOM. Backend serves the iframe content via `srcdoc=` inline (not `src=`) so no separate URL exists to disclose.
CSP applied to the iframe content (via `<meta http-equiv="Content-Security-Policy">` injected into the artifact's `<head>` if not already present):
```
default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'; img-src data: blob:; font-src data:; connect-src 'none'
```
`connect-src 'none'` is the key clause: artifacts can't `fetch()`, can't open WebSockets, and can't send beacons, while `img-src data: blob:` blocks remote tracking pixels. JS runs (interactive controls work) but nothing network-touching does.
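A possible shape for the CSP injection step is sketched below. `injectCsp` and the regex-based placement are assumptions for illustration; the policy string is the one given in this section.

```typescript
const ARTIFACT_CSP =
  "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'; " +
  "img-src data: blob:; font-src data:; connect-src 'none'";

const CSP_META = `<meta http-equiv="Content-Security-Policy" content="${ARTIFACT_CSP}">`;

const HAS_CSP = /<meta[^>]+http-equiv=["']?content-security-policy/i;
// `(\s[^>]*)?` keeps <head> from matching <header>.
const HEAD_OPEN = /<head(\s[^>]*)?>/i;
const HTML_OPEN = /<html(\s[^>]*)?>/i;
const DOCTYPE = /<!doctype[^>]*>/i;

// Insert the policy as early as possible, unless the artifact already declares one.
function injectCsp(html: string): string {
  if (HAS_CSP.test(html)) return html;
  for (const anchor of [HEAD_OPEN, HTML_OPEN, DOCTYPE]) {
    // A replacer function avoids `$` in the inserted text being read as a pattern.
    if (anchor.test(html)) return html.replace(anchor, (match) => match + CSP_META);
  }
  return CSP_META + html;
}
```

Following the text, an artifact that already ships its own CSP meta tag is left untouched here; whether a model-authored policy should be trusted is a separate decision this sketch does not make.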
### S6. Token-budget guard
Single HTML artifact: max 1MB of HTML in `message_parts.payload`. Larger triggers a streaming abort with a friendly error:
> Artifact exceeded 1MB; consider splitting into multiple files or reducing inline assets.
Markdown artifacts have no separate cap — they're bounded by the existing message-size envelope.
## Hard rules
- No git commit, no git push, no git pull during dispatch. Sam commits manually.
- Backup every file before edit per the standard convention (`.bak-v1.14.x-html-<YYYYMMDD>`).
- TS strict, no `any`.
- No new deps. The Markdown renderer, Shiki, the workspace splitter, and `navigator.clipboard.writeText` are all already in the bundle.
- Schema migration is additive only (extend CHECK constraint), idempotent on re-run.
- Path-guard layer (`apps/server/src/services/path_guard.ts`) enforces that downloads stay under the project root.
- Secret-file deny list still runs on the resolved download path.
- HTML iframe sandbox attributes are non-negotiable — exact attribute string as written in S5.
## Non-goals
- **No auto-bias to HTML for long outputs.** The AGENTS.md rule explicitly says Markdown is default at every length.
- **No inline iframe preview in the chat stream.** Pane-only.
- **No Copy button on HTML panes.** Download-only for HTML.
- **No separate artifacts table.** Artifacts live in `message_parts` (HTML) or derive from the assistant message (Markdown). Downloads are user-managed on disk under `/opt/<project>/.boocode/artifacts/`.
- **No vendor of `anthropics/skills/web-artifacts-builder`.** That skill is built for Claude.ai's Vite/Parcel runtime; BooChat has no shell execution surface. Just lift the design principles into AGENTS.md.
- **No changes to `apps/booterm` or `apps/coder`.** This is a BooChat-only batch.
## Stop checkpoints
1. After recon (read existing `Pane` discriminated union + workspace splitter + MessageBubble + `message_parts` shape + path_guard): stop, hand back the recon report.
2. After backend edits (detection + schema + download endpoint), before frontend work: stop, hand back diff + curl test of the download endpoint.
3. After frontend edits, before schema migration applies in dev: stop, hand back diff.
4. After schema migration applies in dev: stop, run smoke plan, report.
## Smoke plan
1. **Markdown pane — happy path.** Send a chat that produces a long Markdown reply (e.g. "explain the inference loop in detail"). Click "Open in pane" on the assistant message. Confirm:
- Pane opens in the workspace splitter at full height.
- Markdown renders with syntax highlighting on fenced code blocks (Shiki working).
- Header shows a sensible title (first heading or first 6 words).
- Copy button writes raw Markdown source to clipboard — paste into a text editor and verify it's the same source the assistant emitted.
- Download button writes `/opt/boocode/.boocode/artifacts/<slug>-<ts>.md` and the file contains the raw source.
2. **HTML pane — happy path.** Send "render a simple HTML dashboard with three interactive sliders that update a div in real time." Confirm:
- Model produces `<!DOCTYPE html>...` content.
- `message_parts` row with `kind='html_artifact'` is written.
- Click "Open in pane" — HTML pane renders the artifact in a sandboxed iframe.
- Sliders work (JS runs inside the iframe).
- Download button writes `.html` to the artifacts dir.
- No Copy button on the HTML pane.
3. **HTML security — exfil attempt.** Send "render an HTML page that tries to fetch('https://example.com/exfil') and display the result." Confirm:
- Iframe loads but the `fetch()` is blocked by `connect-src 'none'`.
- Browser devtools shows the CSP violation.
- No network request leaves the iframe.
4. **HTML security — DOM access attempt.** Send "render an HTML page with `<script>document.cookie</script>`." Confirm the script cannot read BooChat's parent cookies: sandbox without `allow-same-origin` enforces an opaque origin, so reading `document.cookie` inside the iframe throws a `SecurityError` instead of returning anything from the parent.
5. **Markdown opt-in HTML.** Send a normal "summarize the codebase" reply (Markdown), then a follow-up "now render that as HTML." Confirm the second reply produces an HTML artifact while the first stays plain-Markdown — detection is opportunistic, doesn't auto-promote.
6. **1MB cap.** Construct a synthetic test that asks for a >1MB HTML artifact. Confirm the streaming aborts with the friendly error message; no `message_parts` row with oversized payload is written.
7. **Path-guard enforcement on download.** Try to download with a hand-crafted slug containing `../`. Confirm the path-guard rejects it.
8. **Persistence across reload.** Open both a Markdown and an HTML pane. Hard-reload the browser. Confirm both panes restore via `sessions.workspace_panes`.
## Done when
- Backend: `stream-phase.ts` detects HTML, writes `html_artifact` part. Schema migration shipped. Download endpoint live + path-guarded.
- Frontend: `MarkdownArtifactPane` + `HtmlArtifactPane` components shipped. MessageBubble has the "Open in pane" affordance. Workspace `Pane` discriminated union extended.
- AGENTS.md updated with the HTML-on-request rule.
- Smoke plan green (all 8 steps).
- Tag + CHANGELOG entry + roadmap retrospective bullet at the bottom of the v1.14.x-html roadmap section.
## Files expected to touch
**Backend:**
- `apps/server/src/schema.sql` — extend `message_parts.kind` CHECK constraint
- `apps/server/src/services/inference/stream-phase.ts` — HTML detection in post-processing
- `apps/server/src/services/inference/parts.ts` — `PartKind` union adds `'html_artifact'`
- `apps/server/src/routes/messages.ts` — new `POST /api/chats/:id/messages/:msg_id/artifacts/download` endpoint (or new `artifacts.ts` route file)
- `apps/server/src/services/artifacts.ts` — NEW. `writeMarkdownArtifact(msg, projectRoot)` + `writeHtmlArtifact(part, projectRoot)` + slug derivation helpers
- `apps/server/src/services/path_guard.ts` — no change expected; existing guard handles the artifacts dir as a project-scoped write target
**Frontend:**
- `apps/web/src/components/MessageBubble.tsx` — add "Open in pane" affordance to assistant message footer
- `apps/web/src/components/MarkdownArtifactPane.tsx` — NEW
- `apps/web/src/components/HtmlArtifactPane.tsx` — NEW
- `apps/web/src/types/panes.ts` (or wherever `Pane` lives) — extend discriminated union with `markdown_artifact` + `html_artifact` variants
- `apps/web/src/api/client.ts` — `api.messages.downloadArtifact(msgId, fmt)`
- `apps/web/src/api/types.ts` — mirror the new pane variants and `html_artifact` part kind
**Docs:**
- `data/AGENTS.md` — HTML-on-request rule
- `boocode_roadmap.md` — retrospective bullet at the bottom of the v1.14.x-html section
- `CHANGELOG.md` — new `##` entry with the tag
## Estimate
~400 LoC total. Backend ~200 LoC (detection + part-kind extension + download endpoint + slug derivation). Frontend ~200 LoC (two pane components + MessageBubble affordance + pane integration + API client wiring).

@@ -1,124 +0,0 @@
# v1.14.x-html-artifact-panes tasks
## B1 — Backups
- [ ] `apps/server/src/schema.sql.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/server/src/services/inference/stream-phase.ts.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/server/src/services/inference/parts.ts.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/server/src/routes/messages.ts.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/web/src/components/MessageBubble.tsx.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/web/src/api/client.ts.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `apps/web/src/api/types.ts.bak-v1.14.x-html-<YYYYMMDD>`
- [ ] `data/AGENTS.md.bak-v1.14.x-html-<YYYYMMDD>`
## B2 — Recon (STOP after this step)
- [ ] Read existing `Pane` discriminated union and locate the workspace splitter component
- [ ] Read `MessageBubble.tsx` to find the assistant-message footer (copy/regenerate controls location)
- [ ] Read `message_parts` shape + `PartKind` union in `parts.ts`
- [ ] Read `stream-phase.ts` post-processing path (where text parts are finalized into rows)
- [ ] Read `path_guard.ts` to confirm write semantics for `/opt/<project>/.boocode/artifacts/`
- [ ] Read the existing static-file serving route to understand the URL shape for downloads
- [ ] Hand back a recon report: exact line numbers + signatures of insertion points
## B3 — Schema migration
- [ ] Extend `message_parts.kind` CHECK constraint with `'html_artifact'`
- [ ] Use the `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint $$` pattern (matches the rest of `schema.sql`)
- [ ] Confirm idempotent on re-run: apply twice in dev, no error
## B4 — Backend: HTML detection
- [ ] Extend `PartKind` union in `apps/server/src/services/inference/parts.ts` with `'html_artifact'`
- [ ] In `stream-phase.ts` post-processing: detect text parts starting with `<!DOCTYPE html>` (case-insensitive, trimmed) OR wrapped in fenced ` ```html ` block
- [ ] Title resolution helper: `<title>` tag → first `<h1>` text → first 80 chars of inner text
- [ ] Write the `html_artifact` part with payload `{html_content, char_count, title}` via the existing `insertParts` helper
- [ ] 1MB cap check before write: abort stream with friendly error if exceeded
- [ ] Detection is opportunistic — does NOT replace the text part, just adds a sibling `html_artifact` part
## B5 — Backend: artifacts service
- [ ] NEW `apps/server/src/services/artifacts.ts`
- [ ] `deriveMarkdownSlug(messageContent: string): string` — first `# ` heading → first 6 words → lowercase + hyphenate
- [ ] `deriveHtmlSlug(payload: HtmlArtifactPayload): string` — `<title>` → first `<h1>` → first 6 words of inner text → lowercase + hyphenate
- [ ] `writeMarkdownArtifact(message, projectRoot): Promise<{path, url}>` — slug + timestamp + write to `<projectRoot>/.boocode/artifacts/`
- [ ] `writeHtmlArtifact(part, projectRoot): Promise<{path, url}>` — same shape
- [ ] Path-guard both writes via existing helpers
- [ ] Ensure `<projectRoot>/.boocode/artifacts/` exists (mkdir recursive)
## B6 — Backend: download endpoint
- [ ] NEW endpoint registration: `POST /api/chats/:id/messages/:msg_id/artifacts/download?fmt=md|html`
- [ ] Fastify route in `apps/server/src/routes/messages.ts` (or new `artifacts.ts` route file — decide during impl)
- [ ] Zod schema on `?fmt=` query param
- [ ] Resolve message + (for HTML) the `html_artifact` part
- [ ] Call `writeMarkdownArtifact` or `writeHtmlArtifact` per `fmt`
- [ ] Return `{path, url}`
- [ ] Error path: 404 if `fmt=html` requested but no html_artifact part exists
## B7 — Backend: STOP checkpoint after B3–B6
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors
- [ ] Smoke download endpoint via `curl -X POST 'http://100.114.205.53:9500/api/chats/<id>/messages/<msg_id>/artifacts/download?fmt=md'` against a real message (the route is POST; quote the URL so the shell leaves `?` alone)
- [ ] Hand back diff + curl output
## B8 — Frontend: Pane discriminated union extension
- [ ] Extend `Pane` discriminated union with two variants:
- `{ kind: 'markdown_artifact', message_id: string }`
- `{ kind: 'html_artifact', message_id: string, html_content: string, title: string }`
- [ ] Update `validatePanes` to handle the new variants (no-op if message_id still exists)
- [ ] Mirror types in `apps/web/src/api/types.ts` (`MessagePart` discriminator + new pane variants)
## B9 — Frontend: pane components
- [ ] NEW `apps/web/src/components/MarkdownArtifactPane.tsx`
- Header: title + Copy button (raw source via `navigator.clipboard.writeText`) + Download button + close-pane affordance
- Body: reuse the same Markdown render component used in `MessageBubble`
- [ ] NEW `apps/web/src/components/HtmlArtifactPane.tsx`
- Header: title + Download button + close-pane affordance (NO Copy)
- Body: `<iframe srcdoc={html_content} sandbox="allow-scripts allow-clipboard-write allow-downloads" className="w-full h-full" />`
- [ ] Wire both into the workspace splitter's pane-type registry
## B10 — Frontend: MessageBubble affordance
- [ ] Add "Open in pane" icon button to assistant message footer (next to existing copy/regenerate controls)
- [ ] On click: dispatch workspace-pane action
- If message has `html_artifact` part → open as html_artifact pane (with title + html_content from the part)
- Else → open as markdown_artifact pane
- [ ] Mobile tap target: `max-md:min-h-[44px] max-md:min-w-[44px]`
## B11 — Frontend: API client
- [ ] `api.messages.downloadArtifact(chatId, msgId, fmt: 'md' | 'html')` → POST to the new endpoint
- [ ] Returns `{path, url}` — Copy button uses raw text from the message; Download button uses the returned URL
## B12 — Frontend: STOP checkpoint after B8–B11
- [ ] `npx tsc -p apps/web/tsconfig.app.json --noEmit` — 0 errors (root tsc may miss web errors per CLAUDE.md)
- [ ] `pnpm -C apps/web build` succeeds (including the `U+2500-259F` guard)
- [ ] Hand back diff
## B13 — AGENTS.md guidance
- [ ] Add HTML-on-request rule to `data/AGENTS.md`
- [ ] Inline "avoid AI slop" design conventions (no centered layouts, no purple gradients, no uniform rounded corners, no Inter font)
- [ ] Cite Thariq Shihipar's blog post (May 2026) as the source
## B14 — Smoke (STOP at end, full report)
- [ ] Markdown pane happy path (open, render, copy, download)
- [ ] HTML pane happy path (open, render, JS executes, download — no Copy button)
- [ ] HTML security exfil attempt — `fetch()` blocked by `connect-src 'none'`
- [ ] HTML security DOM access — sandbox without `allow-same-origin` enforces opaque origin
- [ ] Opt-in opportunistic detection — first reply Markdown, follow-up "render as HTML" produces artifact
- [ ] 1MB cap — synthetic test, streaming aborts with friendly error
- [ ] Path-guard on download — hand-crafted `../` slug rejected
- [ ] Persistence — pane state survives hard reload via `sessions.workspace_panes`
## B15 — OpenSpec docs + release
- [ ] Mark this `tasks.md` checkboxes complete after each step
- [ ] Append retrospective bullet to bottom of v1.14.x-html section in `boocode_roadmap.md`
- [ ] Add `CHANGELOG.md` entry with the assigned tag (e.g. `v1.14.1-html-artifact-panes` — final patch number assigned at ship time depending on order vs v1.14 outer loop)
- [ ] Hand back to Sam for tag + commit

@@ -1,59 +0,0 @@
# v1.15.0-mcp-multi — design decisions
## D1. Config file path
`/data/mcp.json` (alongside `AGENTS.md` at `/data/AGENTS.md`). Both are bind-mounted from the host's `data/` directory. Override via `MCP_CONFIG_PATH` env var.
File missing = no MCP (opt-in by file presence, not by env var). Simpler than the v1.14.1 approach of always-defaulting a URL.
## D2. Config schema matches opencode's `mcpServers` shape
opencode uses `~/.opencode/config.json` with a `mcpServers` key. BooCode uses `mcp.json` with the same `mcpServers` key so server entries are copy-pasteable. Property names match: `type`, `url`, `command`, `args`, `env`, `headers`. BooCode adds `enabled` (boolean toggle per server, default true) which opencode doesn't have — harmless extra key.
## D3. Transport types: streamableHttp + stdio only
- **streamableHttp**: For remote servers (Context7, future cloud MCP services). Uses `@modelcontextprotocol/sdk`'s `StreamableHTTPClientTransport`.
- **stdio**: For local subprocess servers (codecontext, future local tools). Uses `@modelcontextprotocol/sdk`'s `StdioClientTransport` (spawns child process, NDJSON framing over stdin/stdout).
- **SSE**: Skipped. Streamable HTTP supersedes the HTTP+SSE transport as of the MCP spec's 2025-03-26 revision. If a legacy server requires SSE, it can be added later.
## D4. Tool name prefixing: `<serverName>_<toolName>`
Generalizes v1.14.1's `context7_<name>` pattern. Server name comes from the config key (e.g. `"context7"`, `"codecontext"`). Collisions between servers with the same name are impossible (config keys are unique). Collisions between an MCP tool and a native tool are possible if someone names a server entry the same as a native tool prefix — but that's a user-configuration error, not a code bug.
## D5. Per-agent glob patterns: last-match-wins
AGENTS.md `tools:` field already supports exact-match arrays. Globs extend the same field:
```yaml
tools: [view_file, grep, context7_*]
```
Evaluation: for each tool in `ALL_TOOLS`, scan the pattern list left-to-right. A `!` prefix denies. Last matching pattern wins. This matches the roadmap's "wildcard rule matcher" language.
Examples:
- `[*]` — all tools (same as omitting `tools:` entirely)
- `[*, !web_*]` — all tools except web
- `[view_file, grep, context7_*]` — only view_file, grep, and all Context7 tools
- `[*]` on Architect + `[view_file]` on Prompt Builder — each agent gets its intended scope
Globs use a simple `minimatch`-style check: `*` matches any characters. No `?` or `**` — tool names are flat (no path separators).
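The last-match-wins evaluation can be sketched as below. `globToRegExp` and `resolveTools` are hypothetical names, and the sketch assumes a tool that matches no pattern is excluded, consistent with `tools:` being a whitelist.

```typescript
// Translate a flat glob into an anchored RegExp: `*` matches any run of characters.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// For each tool, scan the patterns left to right; the last matching pattern decides.
function resolveTools(allTools: string[], patterns: string[]): string[] {
  const rules = patterns.map((pattern) => {
    const deny = pattern.startsWith("!");
    return { deny, matcher: globToRegExp(deny ? pattern.slice(1) : pattern) };
  });
  return allTools.filter((name) => {
    let allowed = false; // assumption: no matching pattern means not whitelisted
    for (const rule of rules) {
      if (rule.matcher.test(name)) allowed = !rule.deny;
    }
    return allowed;
  });
}
```

Order matters under this rule: `[*, !web_*]` removes the web tools, while `[!web_*, *]` re-allows them because `*` matches last. The result keeps the order of `allTools`, so an alpha-sorted input stays alpha-sorted.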
## D6. No DB tables in v1.15
The roadmap listed `permissions`, `agent_permissions`, `session_permissions`, `mcp_servers` tables. All deferred to v2.0:
- **Permission tables**: Enterprise multi-user pattern. BooChat is single-user behind Authelia. The read-only invariant guard is the BooChat-era defense. Formal permission rulesets land when BooCoder adds write tools.
- **`mcp_servers` table**: In-memory registry is sufficient. No need to persist server state to DB when the config file is the source of truth and tools are re-discovered on every boot.
## D7. Stdio child lifecycle
- Spawn on `initialize()`. Persistent connection for server lifetime (not per-call).
- On child exit (unexpected): mark server unavailable, log error. Do NOT auto-restart. BooCode continues with remaining servers.
- On BooCode shutdown (`app.addHook('onClose')`): send SIGTERM to all stdio children. Wait up to 5s, then SIGKILL.
- On ENOENT (command not found): skip server with a warning. Matches the graceful-degradation pattern from v1.14.1.
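The shutdown rule above (SIGTERM, wait up to 5s, then SIGKILL) could be written roughly as follows. `KillableChild` is a hypothetical structural interface that Node's `ChildProcess` is expected to satisfy, and `terminateChild` is an illustrative name.

```typescript
// Minimal surface needed from a child process, kept structural so it is easy to fake in tests.
interface KillableChild {
  kill(signal: "SIGTERM" | "SIGKILL"): boolean;
  once(event: "exit", listener: () => void): unknown;
}

// Ask politely first, escalate after the grace period, resolve once the child has exited.
// Assumption: the caller skips children already marked unavailable, since an
// already-exited child never emits another "exit" event.
function terminateChild(child: KillableChild, graceMs = 5000): Promise<void> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => child.kill("SIGKILL"), graceMs);
    child.once("exit", () => {
      clearTimeout(timer);
      resolve();
    });
    child.kill("SIGTERM");
  });
}
```

In an `onClose` hook, the per-child promises could be awaited together with `Promise.all`, so shutdown takes at most one grace period regardless of how many stdio servers are running.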
## D8. v1.14.1 env vars removed
`MCP_CONTEXT7_URL` and `MCP_CONTEXT7_API_KEY` are deleted from `config.ts`. They're superseded by the JSON config file's `context7` entry. The PoC was explicitly designed as throwaway.
Migration path for anyone who had the env vars set: add a `data/mcp.json` with the Context7 entry. The CHANGELOG entry will note this.

@@ -1,130 +0,0 @@
# v1.15.0-mcp-multi — multi-server MCP client + stdio transport + config file
Generalize the v1.14.1 single-server Context7 PoC into a multi-server MCP client. Add stdio transport (for local subprocess MCP servers like codecontext). JSON config file matching opencode's schema shape. Per-agent tool glob patterns in AGENTS.md frontmatter.
## Why
v1.14.1 proved the MCP loop works end-to-end but is hardcoded to one server (Context7) via env vars. Real value comes from multiple servers: Context7 for docs, codecontext re-wired as a proper MCP server (stdio), future local tools. The config shape should match opencode's so Sam can copy `mcp` blocks between the two without translation.
## Scope
### S1. JSON config file for MCP servers
New file at `/data/mcp.json` (bind-mounted like `AGENTS.md`). Env var `MCP_CONFIG_PATH` points to it (default `/data/mcp.json`).
Schema (matching opencode's shape):
```json
{
"mcpServers": {
"context7": {
"type": "streamableHttp",
"url": "https://mcp.context7.com/mcp",
"headers": { "X-API-Key": "optional-key" },
"enabled": true
},
"codecontext": {
"type": "stdio",
"command": "/usr/local/bin/codecontext",
"args": ["--mcp"],
"env": { "WORKSPACE": "/opt" },
"enabled": false
}
}
}
```
Zod-validated at startup. Unknown keys silently ignored (forward-compat). Each server entry has:
- `type`: `"streamableHttp"` | `"stdio"` (SSE deferred — Streamable HTTP supersedes it per the MCP spec)
- `url` (HTTP) or `command` + `args` + `env` (stdio)
- `headers` (HTTP, optional) — for API keys
- `enabled` (boolean, default true)
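As a rough illustration of the entry rules above, here is a dependency-free sketch of the validation. The proposal calls for Zod; plain TypeScript is used here only to keep the example self-contained, and `parseMcpConfig` plus the type names are hypothetical.

```typescript
// Hypothetical types mirroring the documented entry shape.
type HttpServer = { type: "streamableHttp"; url: string; headers?: Record<string, string>; enabled: boolean };
type StdioServer = { type: "stdio"; command: string; args: string[]; env: Record<string, string>; enabled: boolean };
type McpServerConfig = (HttpServer | StdioServer) & { name: string };

// Validate a parsed mcp.json object. Unknown keys are dropped because only
// known fields are copied; `enabled` defaults to true when omitted.
function parseMcpConfig(raw: unknown): McpServerConfig[] {
  const servers = (raw as { mcpServers?: Record<string, any> } | null)?.mcpServers ?? {};
  const out: McpServerConfig[] = [];
  for (const [name, s] of Object.entries(servers)) {
    const enabled: boolean = s?.enabled ?? true;
    if (s?.type === "streamableHttp" && typeof s.url === "string") {
      out.push({ name, type: "streamableHttp", url: s.url, headers: s.headers, enabled });
    } else if (s?.type === "stdio" && typeof s.command === "string") {
      out.push({ name, type: "stdio", command: s.command, args: s.args ?? [], env: s.env ?? {}, enabled });
    } else {
      // Assumption: an unrecognized entry is a hard validation error.
      throw new Error(`mcp.json: invalid server entry "${name}"`);
    }
  }
  return out;
}
```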
### S2. Multi-server MCP client
Refactor `mcp-client.ts` from a singleton to a registry of named MCP clients. On startup:
1. Read `/data/mcp.json` (or path from `MCP_CONFIG_PATH`)
2. For each enabled server: create a Client + transport, connect, discover tools via `tools/list`
3. Wrap tools with `<server-name>_<tool-name>` prefix (generalizes the `context7_` pattern)
4. Apply read-only invariant guard per-tool (reject `readOnlyHint: false`)
5. Append all MCP tools to `ALL_TOOLS` in a single `appendMcpTools()` call
6. Per-server graceful degradation: one server failing doesn't block others
Expose: `getMcpServers(): McpServerStatus[]` for debug/status endpoint, `callTool(prefixedName, args)` routed to the correct server by prefix.
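Routing `callTool` by prefix might look like the sketch below. `McpRegistry` and `ToolCaller` are hypothetical names, and the longest-prefix rule is an assumption added here so that server names containing `_` still resolve unambiguously; the proposal itself only specifies the `<server-name>_<tool-name>` shape.

```typescript
// Hypothetical per-server call function (stands in for a connected MCP client).
type ToolCaller = (toolName: string, args: unknown) => Promise<unknown>;

class McpRegistry {
  private servers = new Map<string, ToolCaller>();

  register(serverName: string, call: ToolCaller): void {
    this.servers.set(serverName, call);
  }

  // Find the registered server whose "<name>_" prefix matches; longest name wins.
  resolve(prefixedName: string): { server: string; tool: string } | undefined {
    let best: string | undefined;
    for (const name of this.servers.keys()) {
      if (prefixedName.startsWith(name + "_") && (best === undefined || name.length > best.length)) {
        best = name;
      }
    }
    return best === undefined ? undefined : { server: best, tool: prefixedName.slice(best.length + 1) };
  }

  // Strip the prefix and forward the call to the owning server.
  async callTool(prefixedName: string, args: unknown): Promise<unknown> {
    const hit = this.resolve(prefixedName);
    if (!hit) throw new Error(`no MCP server for tool "${prefixedName}"`);
    return this.servers.get(hit.server)!(hit.tool, args);
  }
}
```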
### S3. Stdio transport
For `type: "stdio"` servers: spawn a subprocess via `child_process.spawn(command, args, {env, stdio: 'pipe'})`. Use `@modelcontextprotocol/sdk`'s `StdioClientTransport` (or implement the NDJSON framing ourselves — the SDK should have it). The subprocess runs for the lifetime of the BooCode server (persistent connection, not per-call spawn).
Child lifecycle:
- Spawn on initialize. If spawn fails, log warn, skip server (graceful degradation).
- On child exit: log error, mark server as unavailable. Do NOT restart automatically (v1.15 keeps it simple; auto-restart is a v2.0 concern).
- On BooCode shutdown (`app.addHook('onClose')`): kill child processes.
### S4. Per-agent tool glob patterns in AGENTS.md
Currently `tools:` in AGENTS.md frontmatter is an exact-match whitelist (array of tool names). Extend to support glob patterns via a lightweight matcher:
- `context7_*` — all tools from the context7 server
- `view_*` — all tools starting with `view_`
- `!web_*` — exclude web tools (deny pattern)
- Plain names (`grep`, `view_file`) work as before (exact match)
Evaluation order: for each tool in `ALL_TOOLS`, check if it matches any pattern in the agent's `tools:` list. A `!` prefix means exclude. Last-match-wins.
Parser change in `agents.ts`: when validating `tools:`, don't reject unknown names if they contain `*` (glob patterns can't be validated against the current tool list since MCP tools are discovered at runtime). Exact names are still validated.
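A minimal sketch of the last-match-wins matcher described in this section, assuming `*` is the only wildcard and a leading `!` marks a deny pattern. `globToRegExp` is a hypothetical helper; the real implementation may use a different matching strategy.

```typescript
// Convert a glob containing "*" wildcards into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Walk the patterns in order; the last pattern that matches decides the outcome.
// A tool that matches no pattern is excluded.
function matchToolGlob(toolName: string, patterns: string[]): boolean {
  let allowed = false;
  for (const pattern of patterns) {
    const deny = pattern.startsWith("!");
    const glob = deny ? pattern.slice(1) : pattern;
    if (globToRegExp(glob).test(toolName)) allowed = !deny;
  }
  return allowed;
}
```

With this ordering rule, `["*", "!web_*"]` excludes web tools, while `["!web_*", "*"]` re-admits them because the later `*` wins.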
### S5. Remove v1.14.1 env-var config
Delete `MCP_CONTEXT7_URL` and `MCP_CONTEXT7_API_KEY` from `config.ts`. They're superseded by the JSON config file. The v1.14.1 PoC is throwaway-by-design (proposal said "throwaway-if-needed").
### S6. Read-only invariant preserved
BooChat's read-only guarantee stays: every MCP tool with `readOnlyHint: false` is rejected at discovery. This applies globally, not per-server. Config has no `allowWriteTools` flag — that's a v2.0 BooCoder concern.
## Deferred to v2.0
- **Permission ruleset tables** (`permissions`, `agent_permissions`, `session_permissions`). Enterprise pattern that doesn't serve until BooCoder adds write tools. The read-only invariant guard is the BooChat-era defense-in-depth.
- **OAuth / Dynamic Client Registration.** Needs secret storage primitive first.
- **SSE transport.** Streamable HTTP supersedes it per the MCP spec. SSE is a legacy fallback.
- **Per-session MCP toggle.** No `session.mcp_enabled` column in v1.15. MCP servers are globally configured; agent tool globs are the scoping mechanism.
- **`mcp_servers` DB table.** In-memory registry is sufficient for single-user. DB tracking deferred to v2.0.
- **codecontext re-wiring to MCP.** Separate batch after v1.15 proves stdio transport works.
## Non-goals
- No frontend changes. MCP tools surface via the existing tool registry; results render as normal tool-result parts.
- No schema changes. No new DB tables or columns.
- No changes to the inference loop (v1.14.0 outer loop unchanged).
- No changes to `executeToolCall` dispatch (transparent via ToolDef.execute).
## Hard rules
- No git commit/push. Sam commits.
- Read-only invariant: reject any MCP tool with `readOnlyHint: false`.
- Graceful degradation: any server down → that server's tools unavailable, rest unaffected.
- Alpha-sort of ALL_TOOLS preserved.
- No new deps (the MCP SDK is already installed from v1.14.1).
- 348+ existing tests still pass.
## Files expected to touch
- `apps/server/src/services/mcp-client.ts` — refactor from singleton to multi-server registry (~200→300 lines)
- `apps/server/src/services/tools.ts` — no changes expected (appendMcpTools already works for multiple tools)
- `apps/server/src/config.ts` — replace MCP env vars with `MCP_CONFIG_PATH`
- `apps/server/src/index.ts` — startup reads config file, iterates servers
- `apps/server/src/services/agents.ts` — glob pattern support in `tools:` whitelist
- `data/mcp.json` — NEW, example config with Context7 (disabled by default, enabled via edit)
- `apps/server/src/services/__tests__/mcp-client.test.ts` — update for multi-server, add stdio transport tests
- `apps/server/src/services/__tests__/agents-glob.test.ts` — NEW, glob pattern matching tests
## Estimate
~350 LoC. The MCP SDK handles both transports; BooCode's job is config parsing, multi-server lifecycle, and glob matching.
## Smoke plan
1. Create `/data/mcp.json` with Context7 enabled. Restart. Confirm tools discovered + logged.
2. Send a chat asking about library docs. Confirm `context7_*` tools called + results rendered.
3. Disable Context7 in config (`"enabled": false`). Restart. Confirm zero MCP tools.
4. Add a dummy stdio server entry pointing to `/bin/cat` (will fail). Confirm graceful degradation: Context7 works, dummy fails with a logged warning.
5. Add `tools: [context7_*]` to the Architect agent in AGENTS.md. Confirm Architect sees only Context7 tools (via AgentPicker or by chatting with Architect selected).
6. Stop boocode, confirm child processes are killed (no orphans).


@@ -1,87 +0,0 @@
# v1.15.0-mcp-multi tasks
## B1 — Backups
- [ ] `mcp-client.ts`, `config.ts`, `index.ts`, `agents.ts`, `mcp-client.test.ts`
## B2 — MCP config file schema + loader
- [ ] NEW `apps/server/src/services/mcp-config.ts` (~50 lines)
- [ ] Zod schema for `mcp.json`: `McpServerConfig` with `type`, `url/command/args/env`, `headers`, `enabled`
- [ ] `loadMcpConfig(configPath: string, log): McpServerConfig[]` — reads JSON, validates, returns enabled servers
- [ ] Graceful: file missing → log info, return empty array (no MCP)
- [ ] Graceful: parse error → log warn with details, return empty array
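The two graceful paths in B2 could be sketched like this. Schema validation is left out to keep the example short, and the `Logger` and `ServerEntry` types are hypothetical stand-ins for the real logger and the Zod-inferred config type.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical minimal logger shape; the real code passes the app logger.
type Logger = { info(msg: string): void; warn(msg: string): void };
type ServerEntry = { name: string; enabled: boolean; [key: string]: unknown };

// Read mcp.json and return enabled servers. Never throws: a missing file is
// logged at info level, any other read or parse failure at warn level.
function loadMcpConfig(configPath: string, log: Logger): ServerEntry[] {
  let text: string;
  try {
    text = readFileSync(configPath, "utf8");
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") {
      log.info(`mcp: no config at ${configPath}, MCP disabled`);
    } else {
      log.warn(`mcp: cannot read ${configPath}: ${String(err)}`);
    }
    return [];
  }
  try {
    const parsed = JSON.parse(text) as { mcpServers?: Record<string, Record<string, unknown>> };
    return Object.entries(parsed.mcpServers ?? {})
      .map(([name, s]) => ({ ...s, name, enabled: s.enabled !== false }))
      .filter((s) => s.enabled);
  } catch (err) {
    log.warn(`mcp: invalid config in ${configPath}: ${String(err)}`);
    return [];
  }
}
```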
## B3 — Config.ts: replace MCP env vars
- [ ] Remove `MCP_CONTEXT7_URL` and `MCP_CONTEXT7_API_KEY` from Zod schema
- [ ] Add `MCP_CONFIG_PATH: z.string().optional()` (no default — opt-in)
## B4 — Refactor mcp-client.ts to multi-server registry
- [ ] Replace module-level singleton with `Map<serverName, {client, transport, tools}>`
- [ ] `initialize(servers: McpServerConfig[], log)` — iterate servers, connect each, discover tools, wrap with `<serverName>_<toolName>` prefix, apply read-only guard
- [ ] Streamable HTTP transport: reuse existing pattern from v1.14.1
- [ ] Stdio transport: use `@modelcontextprotocol/sdk`'s `StdioClientTransport` (check SDK exports; fallback to `child_process.spawn` + NDJSON if SDK doesn't expose it)
- [ ] `callTool(prefixedName, args)` — extract server name from prefix, route to correct client
- [ ] `getTools()` — return all tools from all servers, flattened
- [ ] `getMcpServers()` — return status of each server (name, type, toolCount, connected)
- [ ] Per-server graceful degradation: catch per-server errors, log, skip; continue with others
- [ ] `shutdown()` — kill stdio child processes, close HTTP clients
- [ ] `app.addHook('onClose')` calls shutdown
## B5 — Startup wiring (index.ts)
- [ ] Read config: `const mcpConfigPath = config.MCP_CONFIG_PATH ?? '/data/mcp.json'`
- [ ] `const mcpServers = loadMcpConfig(mcpConfigPath, app.log)`
- [ ] `await mcpClient.initialize(mcpServers, app.log)`
- [ ] `appendMcpTools(mcpClient.getTools())`
- [ ] Log summary: "mcp: N servers connected, M tools registered"
- [ ] `app.addHook('onClose', () => mcpClient.shutdown())`
## B6 — AGENTS.md glob patterns
- [ ] `apps/server/src/services/agents.ts` — in tool whitelist validation, skip validation for entries containing `*` (can't validate against runtime-discovered tools)
- [ ] NEW helper `matchToolGlob(toolName: string, patterns: string[]): boolean` — supports `*` wildcard and `!` deny prefix, last-match-wins
- [ ] Wire into `executeStreamPhase` (stream-phase.ts) where agent tools are filtered: replace exact-match `.includes()` with `matchToolGlob()`
- [ ] Export `matchToolGlob` for test access
## B7 — Example config file
- [ ] NEW `data/mcp.json` with Context7 entry (enabled: true, with URL, no API key)
- [ ] Comment in the file noting it's bind-mounted at `/data/mcp.json` inside the container
## B8 — Tests
- [ ] Update `mcp-client.test.ts` for multi-server wrapping (tools from two servers, prefix routing)
- [ ] Test: server A fails, server B succeeds — only B's tools registered
- [ ] Test: callTool routes to correct server by prefix
- [ ] Test: shutdown kills stdio transports
- [ ] NEW `apps/server/src/services/__tests__/mcp-glob.test.ts`
- [ ] Test: exact match ("grep" matches "grep")
- [ ] Test: wildcard ("context7_*" matches "context7_query-docs")
- [ ] Test: deny ("!web_*" excludes "web_search")
- [ ] Test: last-match-wins ("*" then "!web_*" → web tools excluded)
- [ ] Test: empty pattern list → nothing matches (agent gets no tools — same as current behavior for explicit whitelists)
## B9 — Verification
- [ ] `npx tsc --noEmit -p apps/server` — 0 errors
- [ ] `pnpm -C apps/server test` — all passing
- [ ] `pnpm -C apps/web build` — green (no web changes)
## B10 — Deploy + smoke
- [ ] Create `/data/mcp.json` on the host with Context7 enabled
- [ ] Update docker-compose bind mount if needed (data/ already mounted)
- [ ] `docker compose up --build -d`
- [ ] Check logs for multi-server init
- [ ] Live-smoke: Context7 tool call from chat
- [ ] Disable Context7 in config, restart, confirm zero MCP tools
## B11 — Docs + tag
- [ ] `CHANGELOG.md` entry
- [ ] `boocode_roadmap.md` retrospective bullet on v1.15 section
- [ ] `CLAUDE.md` — update MCP references
- [ ] Commit, tag `v1.15.0-mcp-multi`, push, rebuild


@@ -1,413 +0,0 @@
# v2.0 BooCoder — Implementation Plan
Ordered execution plan across all 4 sub-versions. Each phase is dispatchable as a single batch. Phases 1-4 are sequential (each builds on the prior); phases within a sub-version can sometimes be parallelized.
---
## Phase 1 — Foundation (v2.0.0-alpha)
**Goal:** Standalone BooCoder container boots, connects to DB, serves a health endpoint. No inference yet.
**Estimated:** ~200 LoC
### Steps
1. **Clone lift sources** (prep, no code)
- Clone `agent-hub`, `plandex`, `opencode`, and `qodo-ai/agents` into `/opt/forks`
- Read agent-hub schema, plandex pending-changes, opencode permission/evaluate.ts
- Read RA.Aid README for three-stage pattern
2. **Create `apps/coder/` skeleton**
- `apps/coder/package.json` (Fastify, postgres, zod — same deps as `apps/server`)
- `apps/coder/tsconfig.json` (extends base, NodeNext)
- `apps/coder/src/index.ts` (Fastify boot, health endpoint, DB connect)
- `apps/coder/src/config.ts` (Zod config schema — DATABASE_URL, PORT, HOST, LLAMA_SWAP_URL, CONTAINER_GUIDANCE_FILE)
- `apps/coder/src/db.ts` (postgres connection, schema apply — shared with `apps/server` or fresh)
3. **Create Dockerfile**
- `apps/coder/Dockerfile` — Node 20 bookworm-slim (matches booterm for glibc compat with node-pty later)
- Mount: `/opt:/opt:rw`
- COPY built server + BOOCODER.md
4. **docker-compose.yml** — add `boocoder` service
- Port `100.114.205.53:9502:3000`
- Environment: `DATABASE_URL`, `LLAMA_SWAP_URL`, `CONTAINER_GUIDANCE_FILE=/app/BOOCODER.md`
- Network: `boocode_net`
- Depends on: `boocode_db`
5. **DB rename:** `boocode_db` → `boochat_db`
- `ALTER DATABASE boocode RENAME TO boochat;` (one-time, run manually)
- Update `DATABASE_URL` in all docker-compose services
- Update volume name mapping
- Verify all 3 services boot against renamed DB
6. **Schema migration** — new tables in `apps/coder/src/schema.sql`
- `pending_changes` table
- `tasks` table
- `available_agents` table
- `human_inbox` view
- Applied idempotently on boot (same pattern as BooChat's `applySchema()`)
7. **BOOCODER.md** — container guidance file
- Write tools enabled (unlike BOOCHAT.md which declares read-only)
- Pending-changes queue discipline
- Path-guard rules
### Verification
- `docker compose up --build -d` — boocoder container starts
- `curl http://100.114.205.53:9502/api/health` — 200 OK
- `psql` confirms new tables exist
- BooChat + BooTerm unaffected (still boot, still serve)
---
## Phase 2 — Write Tools + Pending Changes (v2.0.0-beta)
**Goal:** BooCoder can chat with the LLM, the LLM can call write tools, changes queue in `pending_changes`, user can apply/reject.
**Estimated:** ~400 LoC
### Steps
1. **Write-path guard** (`apps/coder/src/services/write_guard.ts`)
- `resolveWritePath(projectRoot, filePath): string`: `resolve()` + prefix check (no realpath, since the file may not exist yet for creates)
- Deny list: inherit from BooChat's `secret_guard.ts` (`.env`, `*.pem`, `id_rsa*`, etc.)
- Fuzz tests: `../` escape, symlink outside root, null bytes, non-existent parent dirs
2. **Pending-changes service** (`apps/coder/src/services/pending_changes.ts`)
- `queueEdit(session_id, task_id, file_path, old_string, new_string): PendingChange` — computes unified diff, validates write path, INSERTs
- `queueCreate(session_id, task_id, file_path, content): PendingChange`
- `queueDelete(session_id, task_id, file_path): PendingChange`
- `applyAll(session_id): ApplyResult[]` — re-validates each path, writes to disk, marks `status='applied'`
- `applyOne(change_id): ApplyResult`
- `rejectOne(change_id): void` — marks `status='rejected'`
- `rejectAll(session_id): void`
- `rewindOne(change_id): void` — inverse-diff, writes to disk, marks `status='reverted'`
- `listPending(session_id): PendingChange[]`
3. **Write tools** (`apps/coder/src/services/tools/`)
- `edit_file.ts` — input: `{file_path, old_string, new_string}`, calls `queueEdit`
- `create_file.ts` — input: `{file_path, content}`, calls `queueCreate`
- `delete_file.ts` — input: `{file_path}`, calls `queueDelete`
- `apply_pending.ts` — calls `applyAll` for current session
- `rewind.ts` — input: `{change_id}` or `{all: true}`, calls `rewindOne`/`rewindAll`
4. **Tool registry** — register write tools alongside ALL read tools from BooChat
- Import BooChat's read tools (view_file, grep, etc.) + codecontext tools
- Add the 5 write tools
- Alpha-sort the combined list
5. **Inference loop** — port from BooChat or share via workspace package
- Copy `apps/server/src/services/inference/` into `apps/coder/src/services/inference/` (or symlink via pnpm workspace)
- The outer loop (v1.14) runs unchanged — write tools are just ToolDefs with `execute()` functions
- Compaction, doom-loop, step cap all carry forward
6. **API routes**
- `POST /api/sessions/:id/messages` — same as BooChat (creates user + assistant rows, enqueues inference)
- `GET /api/sessions/:id/pending` — returns pending changes for the session
- `POST /api/sessions/:id/pending/apply` — applies all pending
- `POST /api/pending/:id/apply` — applies one
- `POST /api/pending/:id/reject` — rejects one
- `POST /api/pending/:id/rewind` — reverts one
- WebSocket streaming (same protocol as BooChat)
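The status changes made by the pending-changes service in step 2 can be read as a small state machine. The sketch below is one possible reading: the plan names the target statuses (`applied`, `rejected`, `reverted`) but not which source statuses are legal, so the transition table is an assumption, and `canTransition` is a hypothetical helper.

```typescript
type ChangeStatus = "pending" | "applied" | "rejected" | "reverted";

// Assumed transition table: only pending changes can be applied or rejected,
// and only applied changes can be rewound.
const NEXT: Record<ChangeStatus, readonly ChangeStatus[]> = {
  pending: ["applied", "rejected"],
  applied: ["reverted"],
  rejected: [],
  reverted: [],
};

// True when a change in status `from` may move to status `to`.
function canTransition(from: ChangeStatus, to: ChangeStatus): boolean {
  return NEXT[from].includes(to);
}
```

A guard like this would let `applyOne`, `rejectOne`, and `rewindOne` refuse stale requests, such as applying a change that was already rejected.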
### Verification
- Send a chat asking BooCoder to edit a file
- LLM calls `edit_file` → change queued in `pending_changes`
- `GET /api/sessions/:id/pending` shows the queued change with diff
- `POST /api/pending/:id/apply` writes to disk
- `POST /api/pending/:id/rewind` reverts it
- Fuzz test: attempt traversal via `edit_file("../../etc/passwd", ...)` → rejected by write_guard
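The traversal check exercised by the fuzz test could be sketched as follows, using `resolve()` plus a prefix check as the plan describes. The deny list is omitted, the error messages are placeholders, and symlinks are not covered because no `realpath` is taken.

```typescript
import { resolve, sep } from "node:path";

// Resolve filePath against projectRoot and reject anything outside the root.
function resolveWritePath(projectRoot: string, filePath: string): string {
  if (filePath.includes("\0")) {
    throw new Error("write_guard: null byte in path");
  }
  const root = resolve(projectRoot);
  const target = resolve(root, filePath);
  // Appending the separator stops "/opt/proj-evil" from passing as "/opt/proj".
  if (!target.startsWith(root + sep)) {
    throw new Error(`write_guard: ${filePath} escapes project root`);
  }
  return target;
}
```

Because `resolve()` treats an absolute `filePath` as a fresh root, an input such as `/etc/passwd` fails the prefix check the same way a `../` escape does.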
---
## Phase 3 — Frontend: Diff Pane + Chat (v2.0.0)
**Goal:** Browser UI at `coder.indifferentketchup.com` with chat pane + diff pane side by side.
**Estimated:** ~200 LoC
### Steps
1. **Create `apps/coder/web/`** — React + Vite SPA (same stack as BooChat's `apps/web/`)
- Copy BooChat's Vite config, Tailwind v4 setup, font pipeline
- Shared components: `MarkdownRenderer`, `CodeBlock`, `Button`, `Input`
- New app shell: sidebar (sessions) + workspace (panes)
2. **Chat pane** — reuse BooChat's ChatPane/MessageBubble pattern
- Same WS streaming, same `useSessionStream` hook, same message rendering
- ActionRow includes tool-call rendering for write tools
3. **Diff pane** — NEW (`apps/coder/web/src/components/DiffPane.tsx`)
- Fetches `GET /api/sessions/:id/pending`
- Lists pending changes: file path + operation badge (create/edit/delete)
- Per-change: syntax-highlighted unified diff view (use Shiki or a diff-specific highlighter)
- Buttons: Approve / Reject per change, Approve All / Reject All
- Real-time updates via WS frame (`pending_change_added`, `pending_change_applied`, etc.)
4. **Workspace splitter** — chat left, diff right (or configurable)
5. **Caddy route:** `coder.indifferentketchup.com` → boocoder:9502
- Authelia gating (same as BooChat)
### Verification
- Open `coder.indifferentketchup.com` in browser
- Send a message asking for a code change
- See the change appear in the diff pane in real time
- Click Approve → file written, change marked applied
- Click Reject → change discarded
---
## Phase 4 — Dispatcher + Tasks (v2.0.0 final)
**Goal:** Task queue works. User can create tasks, dispatcher picks them up and runs them through Path A.
**Estimated:** ~150 LoC
### Steps
1. **Dispatcher** (`apps/coder/src/services/dispatcher.ts`)
- In-process `setInterval(5000)` polling `tasks` WHERE `state='pending'` ORDER BY `created_at`
- For each ready task: mark `state='running'`, run inference with the task's `input` as the user message
- On completion: mark `state='completed'`
- On error: mark `state='failed'`
- On abort: mark `state='cancelled'`
- Respects `app.addHook('onClose')` — stops polling, waits for in-flight task
2. **Task API routes**
- `POST /api/tasks` — create a task `{project_id, input, agent?, model?}`
- `GET /api/tasks` — list tasks (filterable by state, project)
- `GET /api/tasks/:id` — get task details + output_summary
- `POST /api/tasks/:id/cancel` — cancel a running task
3. **Task → session linkage**
- Each task creates its own session + chat for isolation
- Task's pending_changes reference the task_id
- When task completes, its pending_changes are visible in the UI for approval
4. **Agent probing** (`apps/coder/src/services/agent-probe.ts`)
- On startup: `which opencode`, `which goose`, `which claude`, `which pi`
- Parse version from `<agent> --version`
- Check ACP support: `opencode acp --help` exits 0 → supports_acp = true
- Populate `available_agents` table
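One poll tick of the dispatcher in step 1 might look like the sketch below. It uses an in-memory array instead of the `tasks` table, and `Task`, `AbortedError`, and `dispatchTick` are hypothetical names; how an abort is actually signalled is not specified in the plan.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed" | "cancelled";
interface Task { id: string; input: string; state: TaskState; createdAt: number }

// Hypothetical marker error: thrown by the runner when a task is aborted.
class AbortedError extends Error {}

// Run every pending task, oldest first, recording the resulting state.
async function dispatchTick(tasks: Task[], run: (input: string) => Promise<void>): Promise<void> {
  const ready = tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => a.createdAt - b.createdAt);
  for (const task of ready) {
    task.state = "running";
    try {
      await run(task.input);
      task.state = "completed";
    } catch (err) {
      task.state = err instanceof AbortedError ? "cancelled" : "failed";
    }
  }
}
```

In the real service this would be driven by `setInterval(5000)` with a guard against overlapping ticks.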
### Verification
- `POST /api/tasks {input: "add a /api/version endpoint"}` → task created
- Dispatcher picks it up → inference runs → `edit_file` queued → task completes
- `GET /api/tasks/:id` shows `state='completed'` + output_summary
- Pending changes visible in diff pane for approval
---
## Phase 5 — ACP Dispatch (v2.0.1)
**Goal:** Tasks can be dispatched to external agents via ACP. opencode and goose run as subprocesses, their events flow back into BooCode.
**Estimated:** ~350 LoC
### Steps
1. **ACP client** (`apps/coder/src/services/acp-client.ts`)
- Install: `pnpm -C apps/coder add @zed-industries/agent-client-protocol`
- `spawnAcpAgent(agent: string, task: string, worktree: string, mcpServers: McpConfig[]): AcpSession`
- Uses SDK's `StdioTransport` — spawn `opencode acp` or `goose acp` as child
- Pass `context_servers` for MCP auto-forward
- Event listener: maps ACP events to BooCode's parts taxonomy
2. **ACP event mapping**
- `file_operation` → queue into `pending_changes` (same as Path A native writes)
- `tool_call` / `tool_result` → insert as `message_parts` in the task's session
- `terminal_output` → publish as WS frame for BooTerm routing
- `permission_request` → pause (same mechanism as `ask_user_input`)
- `session_end` → task state → `completed` or `failed`
3. **Worktree management** (`apps/coder/src/services/worktrees.ts`)
- `createWorktree(projectPath, taskId): string`: runs `git worktree add /tmp/booworktrees/<taskId> -b task-<taskId> HEAD`
- `diffWorktree(worktreePath, projectPath): UnifiedDiff[]`: runs `git diff HEAD...<worktree-branch>`
- `cleanupWorktree(worktreePath): void`: runs `git worktree remove`
- On ACP session end: diff the worktree, queue diffs into `pending_changes`, cleanup
4. **PTY fallback** (`apps/coder/src/services/pty-dispatch.ts`)
- For agents without ACP (claude, pi, smallcode)
- `spawnPtyAgent(agent: string, task: string, worktree: string): PtySession`
- Uses `node-pty` — spawn `claude` or `pi` with cwd = worktree
- Capture stdout/stderr into `message_parts` (kind='text', less structured than ACP)
- On exit: diff worktree → queue pending_changes → cleanup
5. **Dispatcher update** — transport selection
- Check `available_agents[agent].supports_acp` at dispatch time
- ACP-capable → `spawnAcpAgent`
- PTY fallback → `spawnPtyAgent`
- Native (no agent specified) → Path A inference loop (Phase 4)
6. **AGENTS.md extensions**
- Add `execution_strategy: plan | act | research` field
- Add `expert_model` field for cost-routing
- Add `output_schema` field (optional JSON Schema for structured final output)
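The transport selection in step 5 reduces to a small pure function. A sketch, assuming the probed agents are available as a map keyed by name; `selectTransport` is a hypothetical name, and rejecting an agent that was never probed is an assumption not stated in the plan.

```typescript
type Transport = "native" | "acp" | "pty";
interface AgentInfo { supports_acp: boolean }

// Pick the execution path for a task based on the requested agent.
function selectTransport(agent: string | undefined, available: Map<string, AgentInfo>): Transport {
  // No agent specified: run BooCoder's own inference loop (Path A).
  if (!agent) return "native";
  const info = available.get(agent);
  // Assumption: an unprobed agent is an error rather than a silent fallback.
  if (!info) throw new Error(`agent "${agent}" is not installed`);
  return info.supports_acp ? "acp" : "pty";
}
```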
### Verification
- Create task with `agent: 'opencode'` → ACP subprocess spawns
- opencode edits files in worktree → events stream into UI
- On completion: worktree diff queued in `pending_changes`
- Approve → changes applied to main project
- Fallback: create task with `agent: 'claude'` → PTY captures output → worktree diff queued
---
## Phase 6 — MCP Server (v2.0.2)
**Goal:** BooCoder exposes its own primitives as MCP tools. External opencode sessions in Termius can drive the task queue.
**Estimated:** ~250 LoC
### Steps
1. **MCP server** (`apps/coder/src/services/mcp-server.ts`)
- Use `@modelcontextprotocol/sdk` server-side (`Server` class)
- Stdio transport (read from stdin, write to stdout)
- Entry point: `boocoder --mcp` CLI flag starts the MCP server instead of the HTTP server
2. **Tool handlers** (6 tools)
- `boocoder.create_task` → INSERT into tasks table, return task_id
- `boocoder.list_pending_changes` → SELECT from pending_changes WHERE session matches
- `boocoder.apply` → call `applyOne(change_id)`
- `boocoder.reject` → call `rejectOne(change_id)`
- `boocoder.dispatch_external_agent` → create task with agent specified, return task_id
- `boocoder.list_worktrees` → list active worktrees from tasks WHERE worktree_path IS NOT NULL AND state='running'
3. **10-question eval** (per `anthropics/skills/mcp-builder` framework)
- Write 10 independent, read-only, verifiable questions about the BooCoder state
- Run eval: `echo '{"jsonrpc":"2.0","method":"tools/call","params":{"name":"boocoder.list_pending_changes","arguments":{}},"id":1}' | boocoder --mcp`
- All 10 must return correct answers
4. **opencode integration test**
- Add BooCoder as an MCP server in `~/.opencode/config.json`:
```json
{"mcpServers": {"boocoder": {"type": "stdio", "command": "boocoder", "args": ["--mcp"]}}}
```
- From opencode: call `boocoder.create_task` → verify task appears in BooCoder UI
### Verification
- `echo '...' | boocoder --mcp` returns valid MCP responses
- 10-question eval passes
- opencode can drive BooCoder's task queue via MCP
---
## Phase 7 — CLI + Polish (v2.0.3)
**Goal:** `boocode` CLI client, human inbox UI, cost tracking, observation hooks.
**Estimated:** ~400 LoC
### Steps
1. **CLI client** (`apps/coder/src/cli.ts`)
- Thin HTTP/WS client against BooCoder API
- `boocode run "task description"` → POST /api/tasks → stream output via WS
- `boocode ls` → GET /api/tasks → formatted table
- `boocode attach <id>` → WS subscribe to task's session → stream live
- `boocode send <id> "message"` → POST message to task's session chat
- Build as a standalone binary via `pkg` or `esbuild --bundle`
2. **Human inbox UI** (frontend)
- New route: `/inbox` → shows tasks WHERE `state IN ('blocked', 'failed')`
- Per-task: view output, retry (reset state to pending), cancel, reassign agent
- Badge on sidebar showing count of inbox items
3. **Cost tracking**
- `tasks.cost_tokens` populated from inference `usage` callback (same as BooChat's `tokens_used`)
- Summary API: `GET /api/stats/costs?group_by=project|agent|day` → aggregated token spend
- Simple UI: cost badge on each task, totals in settings
4. **Observation hooks** (budi taxonomy)
- Emit 5 event types on the BooCoder WS protocol for dispatched agents:
- `session_start` — agent spawned
- `user_prompt_submit` — task spec delivered
- `post_tool_use` — each tool call completed
- `subagent_start` — nested dispatch (Boomerang)
- `stop` — agent finished
- Consumed by frontend for real-time status indicators
5. **Boomerang `new_task` tool** (subagent isolation)
- When an agent's toolset includes `new_task`:
- Creates a child task (fresh session, fresh context)
- Child runs to completion
- Parent gets only `attempt_completion` summary
- Orchestrator agent profile: tools = `[new_task, list_tasks, check_task_status]` ONLY
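For the cost summary in step 3, the `group_by` aggregation could be sketched in memory as below. The real endpoint would presumably aggregate in SQL; the `TaskCost` shape, the `"native"` label for tasks without an agent, and bucketing days by the date prefix of an ISO `created_at` string are all assumptions.

```typescript
type GroupBy = "project" | "agent" | "day";
interface TaskCost {
  project_id: string;
  agent: string | null;
  cost_tokens: number | null;
  created_at: string; // ISO 8601 timestamp, e.g. "2026-05-25T20:23:22Z"
}

// Sum cost_tokens per group key; tasks without a recorded cost count as 0.
function aggregateCosts(tasks: TaskCost[], groupBy: GroupBy): Map<string, number> {
  const totals = new Map<string, number>();
  for (const t of tasks) {
    const key =
      groupBy === "project" ? t.project_id
      : groupBy === "agent" ? (t.agent ?? "native")
      : t.created_at.slice(0, 10);
    totals.set(key, (totals.get(key) ?? 0) + (t.cost_tokens ?? 0));
  }
  return totals;
}
```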
### Verification
- `boocode run "add health endpoint"` from terminal → task runs → output streams → diff queued
- `boocode ls` shows task list with states + cost
- Inbox shows failed tasks, retry works
- Boomerang: orchestrator creates subtask → subtask runs isolated → parent gets summary only
---
## Phase 8 — Hardening + Ship (v2.0.x)
**Goal:** Security hardening, integration tests, documentation, production deploy.
**Estimated:** ~100 LoC (mostly tests + docs)
### Steps
1. **Path-guard fuzz suite** — property tests for every traversal pattern:
- `../` sequences (all depths)
- Symlink outside project root
- Null bytes in path
- Unicode normalization attacks
- Race conditions (TOCTOU between validate + write)
- MCP-served filesystem writes routed through pending_changes
2. **Integration tests**
- End-to-end: create task → inference → edit_file → apply → file written → verify content
- ACP dispatch: mock opencode → events flow → pending_changes queued
- MCP server: 10-question eval automated in CI
3. **Documentation**
- `BOOCODER.md` finalized (container guidance)
- `CLAUDE.md` updated with BooCoder architecture section
- `boocode_roadmap.md` v2.0 retrospective
- `CHANGELOG.md` entries for each sub-version
4. **Production deploy**
- Caddy config: `coder.indifferentketchup.com`
- Authelia: same SSO group as BooChat
- Smoke: full workflow (chat → edit → approve → verify)
5. **Tag** — `v2.0.0` (or `v2.0.0-rc1` if Sam wants a bake period)
---
## Execution order summary
```
Phase 1 (foundation) → v2.0.0-alpha ~200 LoC container boots
Phase 2 (write tools) → v2.0.0-beta ~400 LoC inference + pending_changes
Phase 3 (frontend) → v2.0.0 ~200 LoC chat + diff panes
Phase 4 (dispatcher) → v2.0.0-final ~150 LoC task queue + native dispatch
Phase 5 (ACP dispatch) → v2.0.1 ~350 LoC external agents + worktrees
Phase 6 (MCP server) → v2.0.2 ~250 LoC boocoder.* tools + eval
Phase 7 (CLI + polish) → v2.0.3 ~400 LoC CLI + inbox + hooks + Boomerang
Phase 8 (hardening) → v2.0.x ~100 LoC fuzz + integration tests + docs
--------
~2050 LoC total
```
Each phase is independently dispatchable. Phases 1-4 are sequential (each needs the prior). Phases 5-7 are parallelizable after Phase 4 ships (they're independent protocol surfaces). Phase 8 gates the production tag.
---
## Risk register
| Risk | Mitigation |
|---|---|
| Path-guard bypass → arbitrary writes | Pending-changes double-validates (at queue time + apply time). Fuzz suite in Phase 8. OpenHands sandbox (v2.1) as fallback. |
| ACP spec instability (remote transport WIP) | Use stdio only. No remote ACP in v2.0. |
| node-pty native compilation breaks in Docker | bookworm-slim + glibc matches booterm's working config. Pin node-pty version. |
| Worktree cleanup failure → disk bloat | 30-min idle timeout sweeper. `git worktree prune` on startup. |
| DB rename breaks existing sessions | One-time migration with explicit backup. BooChat/BooTerm URLs unchanged. |
| MCP server eval failure | Ship stdio MCP server only after 10/10 eval passes. |
| Boomerang context leak (child leaks state to parent) | Architectural enforcement: child's session_id ≠ parent's. Summary field is the ONLY bridge. |


@@ -1,346 +0,0 @@
# v2.0 — BooCoder
Major version bump. New app `apps/coder/` inside the existing monorepo. Lands together with the `boocode_db` → `boochat_db` DB rename and the per-app subdomain split (`code.indifferentketchup.com` → BooChat, `coder.indifferentketchup.com` → BooCoder).
## What BooCoder is
A write-capable coding agent surface. Two execution paths, same UI:
- **Path A (native):** BooCode's own inference loop with write tools (`edit_file`, `create_file`, `delete_file`). Edits queue in `pending_changes` — nothing touches disk until user approves via `/apply`.
- **Path B (dispatch):** Shells out to external CLI agents (`opencode`, `goose`, `claude`, `pi`) via ACP (preferred) or raw PTY (fallback). One git worktree per dispatch. Captures events into the same parts taxonomy.
Both paths feed the same task DAG, same project registry, same pending-changes queue, same UI.
## Why now
v1.x proved the read-only loop works end-to-end: inference, tool dispatch, streaming, compaction, MCP client, outer loop, step caps, artifact rendering. The infrastructure is stable. The jump from "read-only chat" to "write-capable agent orchestrator" is the remaining gap between BooCode and having a real development environment.
## Architecture
### Three protocol roles (locked 2026-05-22)
1. **MCP client (write-capable allowed).** Inherits v1.15 client. Write-capable MCP servers (e.g. `@modelcontextprotocol/server-filesystem`) route writes through `pending_changes`. Per-task allow/deny means dispatched tasks can have a different MCP roster.
2. **MCP server (BooCoder's own primitives).** Exposes `boocoder.create_task`, `boocoder.list_pending_changes`, `boocoder.apply`, `boocoder.reject`, `boocoder.dispatch_external_agent`, `boocoder.list_worktrees` as MCP tools. Stdio transport for local consumers (Sam's `opencode` in Termius); HTTP deferred until OAuth + secret storage.
3. **ACP client (host).** Spawns `opencode acp` and `goose acp` as JSON-RPC stdio subprocesses. Maps ACP events (file operations, tool calls, terminal output) to BooCode's parts taxonomy. MCP servers configured in BooCoder are auto-forwarded to the dispatched agent (per goose docs — `context_servers` is the field).
### Container layout (post-v2.0)
| Container | Port | Mount | Purpose |
|---|---|---|---|
| `boochat` (was `boocode`) | `100.114.205.53:9500` | `/opt:/opt:ro` | Read-only chat + MCP client |
| `booterm` | `100.114.205.53:9501` | `/opt:/opt:rw` | PTY/tmux terminal |
| `boocoder` | `100.114.205.53:9502` | `/opt:/opt:rw` (policy-gated) | Write tools + ACP host + MCP client + MCP server |
| `boochat_db` (was `boocode_db`) | `127.0.0.1:5500` | `boocode_pgdata` | Shared Postgres 16 |
| `codecontext` | internal `:8080` | `/opt:/opt:ro` | Analysis sidecar (shared) |
### Caddy routing
```
code.indifferentketchup.com → boochat:9500
coder.indifferentketchup.com → boocoder:9502
term.indifferentketchup.com → booterm:9501 (or routed under code.*/term/)
```
## Schema (new tables)
```sql
-- Tasks: the dispatch DAG (created first because pending_changes references it)
CREATE TABLE IF NOT EXISTS tasks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  project_id UUID NOT NULL REFERENCES projects(id),
  parent_task_id UUID REFERENCES tasks(id),
  state TEXT NOT NULL DEFAULT 'pending'
    CHECK (state IN ('pending', 'running', 'completed', 'failed', 'blocked', 'cancelled')),
  input TEXT NOT NULL,
  output_summary TEXT,
  agent TEXT,
  model TEXT,
  execution_path TEXT CHECK (execution_path IN ('native', 'acp', 'pty')),
  worktree_path TEXT,
  cost_tokens INTEGER,
  started_at TIMESTAMPTZ,
  ended_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
-- Pending changes: queued writes before /apply
CREATE TABLE IF NOT EXISTS pending_changes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES sessions(id),
  task_id UUID REFERENCES tasks(id),
  file_path TEXT NOT NULL,
  operation TEXT NOT NULL CHECK (operation IN ('create', 'edit', 'delete')),
  diff TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'applied', 'rejected', 'reverted')),
  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
-- Available agents: probed at startup
CREATE TABLE IF NOT EXISTS available_agents (
  name TEXT PRIMARY KEY,
  install_path TEXT,
  version TEXT,
  supports_acp BOOLEAN NOT NULL DEFAULT false,
  supports_mcp_client BOOLEAN NOT NULL DEFAULT false,
  last_probed_at TIMESTAMPTZ
);
-- Human inbox: tasks needing attention (OR REPLACE keeps the migration re-runnable)
CREATE OR REPLACE VIEW human_inbox AS
  SELECT * FROM tasks WHERE state IN ('blocked', 'failed');
```
`task_templates` and `pipelines` are deferred to v2.1 because they add overhead that a single-user deployment does not need. The v2.0 core is `tasks` + `pending_changes` + `available_agents`.
## Path A — Native write tools
### Tools
| Tool | Description |
|---|---|
| `edit_file` | Replace `old_string` with `new_string` in an existing file. Input: `{file_path, old_string, new_string}`. Queues the resulting diff in `pending_changes` with `operation='edit'`. |
| `create_file` | Create a new file. Input: `{file_path, content}`. Queues as `operation='create'`. |
| `delete_file` | Delete a file. Input: `{file_path}`. Queues as `operation='delete'`. |
| `apply_pending` | Flush all pending changes for the current session to disk. Path-guarded. |
| `rewind` | Revert a specific applied change or all changes since a checkpoint. |
### Path guard for writes
Same `pathGuard()` function from BooChat, but with a write-path variant:
- `resolveWritePath(projectRoot, requested)` — uses `resolve()` (not `realpath()`, since the file may not exist yet for creates), then verifies the result starts with `projectRoot + sep`.
- Deny list: everything in `secret_guard.ts` (`.env`, `*.pem`, etc.) — can't write to those either.
- Defense-in-depth: the `pending_changes` queue means even a path-guard bypass only queues; it doesn't hit disk until `/apply` (which re-validates).
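The write-path check described above could look roughly like the sketch below. It assumes Node's `path` module; the `DENIED` patterns are hypothetical stand-ins for whatever `secret_guard.ts` actually contains, and the error messages are illustrative.

```typescript
import * as path from "node:path";

// Hypothetical deny patterns standing in for the rules in secret_guard.ts.
const DENIED: RegExp[] = [/(^|\/)\.env(\..*)?$/, /\.pem$/];

function resolveWritePath(projectRoot: string, requested: string): string {
  const root = path.resolve(projectRoot);
  // resolve() normalizes ".." segments without touching the filesystem,
  // so it also works for files that do not exist yet.
  const target = path.resolve(root, requested);
  // Appending the separator stops "/opt/proj-evil" from matching "/opt/proj".
  if (!target.startsWith(root + path.sep)) {
    throw new Error(`path escapes project root: ${requested}`);
  }
  // Match deny patterns against a forward-slash relative path.
  const relative = path.relative(root, target).split(path.sep).join("/");
  if (DENIED.some((re) => re.test(relative))) {
    throw new Error(`path is on the deny list: ${requested}`);
  }
  return target;
}
```

Because `resolve()` does not follow symlinks, a symlinked directory inside the project root can still point outside it. That case is one reason the re-validation at `/apply` time matters.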
### Diff format
Standard unified diff (what `git diff` produces). The `edit_file` tool takes `old_string` / `new_string` (same as Claude Code's edit tool — the model is trained on this shape). Server computes the unified diff for storage in `pending_changes.diff`.
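As a minimal sketch of the step that precedes diff generation, the helper below applies one `old_string` / `new_string` edit to file content in memory. The function name and the rule that an ambiguous match is rejected are assumptions, not behavior confirmed by this spec; computing the unified diff from the before and after text is left out.

```typescript
// Hypothetical helper: applies one old_string -> new_string edit in memory.
function applyEdit(content: string, oldString: string, newString: string): string {
  if (oldString === "") throw new Error("old_string must not be empty");
  const first = content.indexOf(oldString);
  if (first === -1) throw new Error("old_string not found");
  // Assumption: an ambiguous match is rejected rather than replacing the first hit.
  if (content.indexOf(oldString, first + 1) !== -1) {
    throw new Error("old_string matches more than once");
  }
  // Slicing avoids String.prototype.replace, which would interpret "$&" and
  // similar sequences inside newString as replacement patterns.
  return content.slice(0, first) + newString + content.slice(first + oldString.length);
}
```

The server would then diff the original content against the returned string and store that diff in `pending_changes.diff`.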
### UI: per-pane diff viewer
Frontend pane type `pending_changes` in BooCoder's workspace. Shows:
- List of queued changes with file path + operation
- Per-change diff view (syntax-highlighted, side-by-side or unified)
- Approve / Reject per change, or Approve All / Reject All
## Path B — External agent dispatch
### dispatch_external_agent tool
```typescript
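// Input shape for the dispatch_external_agent tool (interface name is illustrative).
interface DispatchExternalAgentInput {
  agent: 'opencode' | 'claude' | 'goose' | 'pi';
  model: string;      // e.g. 'claude-opus-4-7'
  task: string;       // natural-language task description
  worktree?: string;  // optional; a worktree is auto-created if not specified
}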
```
### Transport selection
Dispatcher checks `available_agents.supports_acp` at runtime:
- **ACP** (preferred): `opencode acp`, `goose acp` — JSON-RPC stdio. Native session lifecycle, file-operation events, terminal events, permission prompts.
- **PTY** (fallback): `claude`, `pi`, `smallcode` — raw terminal capture via `node-pty`. Captures stdout/stderr/exit-code into PostgreSQL. Less structured than ACP.
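The runtime check can be sketched as a small pure function over an `available_agents` row. The `AvailableAgent` type mirrors only the columns used here, and treating a missing row or a null `install_path` as "not installed" is an assumption.

```typescript
type Transport = "acp" | "pty";

// Subset of the available_agents columns that the decision needs.
interface AvailableAgent {
  name: string;
  install_path: string | null;
  supports_acp: boolean;
}

function selectTransport(agent: AvailableAgent | undefined): Transport {
  // Assumption: no row, or a null install_path, means the agent is not installed.
  if (!agent || agent.install_path === null) {
    throw new Error("agent is not installed");
  }
  // ACP is preferred; PTY is the fallback for agents without ACP support.
  return agent.supports_acp ? "acp" : "pty";
}
```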
### Worktree management
Each dispatched task gets its own git worktree:
```bash
git worktree add /tmp/booworktrees/<task-id> -b task-<task-id> HEAD
```
On completion: diff the worktree against HEAD, queue the diff into `pending_changes` for the same task, clean up the worktree. User approves/rejects the diff the same way as Path A.
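One way to build the worktree commands safely is to validate the task id and produce argument arrays instead of shell strings. This is a sketch: `planWorktree` and `WorktreePlan` are hypothetical names, and the cleanup command (`git worktree remove --force`) is an assumed choice that the spec does not state.

```typescript
const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

interface WorktreePlan {
  dir: string;
  branch: string;
  add: string[];    // arguments for `git`, creating the worktree
  remove: string[]; // arguments for `git`, cleaning it up afterwards
}

function planWorktree(taskId: string, base = "/tmp/booworktrees"): WorktreePlan {
  // Task ids are UUIDs in the schema; rejecting anything else keeps
  // untrusted text out of paths and branch names.
  if (!UUID.test(taskId)) throw new Error("task id must be a UUID");
  const dir = `${base}/${taskId}`;
  const branch = `task-${taskId}`;
  return {
    dir,
    branch,
    add: ["worktree", "add", dir, "-b", branch, "HEAD"],
    remove: ["worktree", "remove", "--force", dir],
  };
}
```

The arrays are meant to be passed to something like `execFile("git", plan.add)`, which avoids shell interpolation entirely.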
### ACP event mapping
ACP events → BooCode parts taxonomy:
- `file_operation` → `tool_call` part (name: `acp_edit_file`) + `tool_result` part
- `tool_call` → `tool_call` part (preserves name)
- `terminal_output` → routes into BooTerm pane
- `permission_request` → pause inference (same mechanism as `ask_user_input`)
- `session_end` → task state → `completed` or `failed`
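The mapping above can be expressed as a pure function from an event to a list of effects. The event names come from the list; every payload field (`path`, `result`, `args`, `prompt`, `ok`) and the `Effect` type are hypothetical, since the real shapes are defined by the ACP SDK.

```typescript
// Hypothetical event shapes; the actual ACP payloads will differ.
type AcpEvent =
  | { kind: "file_operation"; path: string; result: string }
  | { kind: "tool_call"; name: string; args: unknown }
  | { kind: "terminal_output"; data: string }
  | { kind: "permission_request"; prompt: string }
  | { kind: "session_end"; ok: boolean };

type Effect =
  | { type: "part"; part: "tool_call" | "tool_result"; name: string; payload: unknown }
  | { type: "terminal"; data: string }
  | { type: "pause"; prompt: string }
  | { type: "task_state"; state: "completed" | "failed" };

function mapAcpEvent(ev: AcpEvent): Effect[] {
  switch (ev.kind) {
    case "file_operation":
      // One file operation becomes a tool_call part plus its tool_result part.
      return [
        { type: "part", part: "tool_call", name: "acp_edit_file", payload: { path: ev.path } },
        { type: "part", part: "tool_result", name: "acp_edit_file", payload: ev.result },
      ];
    case "tool_call":
      return [{ type: "part", part: "tool_call", name: ev.name, payload: ev.args }];
    case "terminal_output":
      return [{ type: "terminal", data: ev.data }];
    case "permission_request":
      return [{ type: "pause", prompt: ev.prompt }];
    case "session_end":
      return [{ type: "task_state", state: ev.ok ? "completed" : "failed" }];
  }
}
```

Keeping the mapping free of I/O makes it easy to unit-test separately from the subprocess handling.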
### MCP server auto-forward
Per goose docs, `context_servers` field in the ACP session config auto-forwards BooCoder's configured MCP servers to the dispatched agent. One MCP config drives every agent.
## Dispatcher worker
Background process (or in-process `setInterval` for v2.0 simplicity) that:
1. Queries `tasks` WHERE `state = 'pending'` ORDER BY `created_at`
2. For each ready task (no unmet dependencies):
- Mark `state = 'running'`
- Resolve execution path (Path A if no agent specified, Path B if agent specified)
- Path A: run the inference loop with write tools enabled
- Path B: spawn ACP/PTY subprocess, stream events into parts
- On completion: mark `state = 'completed'` or `'failed'`
- Queue output diff into `pending_changes`
3. On failure: mark `state = 'failed'`, surface in `human_inbox` view
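A single dispatcher tick might be sketched as follows, with an in-memory array standing in for the `tasks` table. The `Runner` callbacks are hypothetical stand-ins for the Path A inference loop and the Path B subprocess, and dependency checks and diff queuing are omitted.

```typescript
type TaskState = "pending" | "running" | "completed" | "failed";

interface Task {
  id: string;
  state: TaskState;
  agent: string | null; // null means Path A (native)
  created_at: number;
}

// Hypothetical runner signature: resolves on success, rejects on failure.
type Runner = (task: Task) => Promise<void>;

async function dispatchTick(tasks: Task[], runNative: Runner, runExternal: Runner): Promise<void> {
  // Oldest pending tasks first, matching ORDER BY created_at.
  const pending = tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => a.created_at - b.created_at);
  for (const task of pending) {
    task.state = "running";
    try {
      // Path A when no agent is specified, Path B otherwise.
      await (task.agent === null ? runNative : runExternal)(task);
      task.state = "completed";
    } catch {
      // Failed tasks surface through the human_inbox view.
      task.state = "failed";
    }
  }
}
```

In the in-process variant, this function would be what the `setInterval` callback invokes. A guard against overlapping ticks would be needed once runs take longer than the interval.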
## BooCoder MCP server
Exposes BooCoder's primitives as MCP tools so external agents (Sam's opencode in Termius) can drive the task queue:
| MCP Tool | Description |
|---|---|
| `boocoder.create_task` | Create a new task in the queue |
| `boocoder.list_pending_changes` | List queued changes awaiting approval |
| `boocoder.apply` | Apply a specific pending change |
| `boocoder.reject` | Reject a pending change |
| `boocoder.dispatch_external_agent` | Dispatch a task to an external agent |
| `boocoder.list_worktrees` | List active git worktrees |
Stdio transport for local consumers. HTTP transport deferred until OAuth + secret storage.
**Eval requirement:** run the server through the `anthropics/skills/mcp-builder` 10-question evaluation framework before shipping.
## Code lifts
### Primary architectural template
**`Dominic789654/agent-hub`** (Apache-2.0) — task DAG schema, dispatcher worker, project registry, human inbox. Three-process model (board server + dispatcher + assistant terminal). BooCode adapts this into a single-process Fastify app (v2.0.0) with the dispatcher as an in-process worker.
### Pending-changes UX
**`plandex-ai/plandex`** (MIT) — diff/apply/rewind vocabulary. The `pending_changes` queue concept, per-file diff view, approve/reject UI pattern. No code lifted — schema and UX design only.
### ACP client
**`agentclientprotocol.com` spec + `@zed-industries/agent-client-protocol` SDK** (Apache-2.0) — local-subprocess ACP via stdio JSON-RPC. The SDK handles framing; BooCode maps events to its parts taxonomy.
**`goose` docs** (`goose-docs.ai/docs/guides/acp-clients/`) — `context_servers` auto-forward pattern. Critical: one MCP config drives every dispatched agent.
### MCP server
**`anthropics/skills/mcp-builder`** (MIT) — 4-phase build workflow + 10-question evaluation framework for validating the MCP server before shipping.
### Dispatcher pattern
**Paseo (`getpaseo/paseo`)** — AGPL-3.0, **design only, no code lift**. Daemon+clients architecture, `--worktree` flag, CLI verb shape (`run/ls/attach/send`). BooCode reproduces the architecture using only license-clean patterns.
**Roo Code Boomerang Tasks** — orchestrator with intentional capability restriction. Down-pass/up-pass context discipline (`new_task` message, `attempt_completion` result, no implicit inheritance). Explicit precedence override clause.
### Write-tool security
**opencode `permission/evaluate.ts`** — wildcard permission ruleset (already lifted in v1.15). Extended in v2.0 to gate write tools.
**`covibes/zeroshot`** — blind-validation invariant. Verify gate runs in a separate agent context that only sees the diff and acceptance criteria, not the producing conversation. v2.0+ optional batch.
## Sub-versions
| Version | Scope |
|---|---|
| **v2.0.0** | Schema + Path A (native write tools + pending-changes queue + diff UI) + basic dispatcher |
| **v2.0.1** | Path B (ACP client for opencode/goose + PTY fallback for claude/pi + worktree management) |
| **v2.0.2** | BooCoder MCP server (stdio transport, `boocoder.*` tools, eval framework) |
| **v2.0.3** | Polish: `boocode` CLI (`run/ls/attach/send`), human_inbox UI, cost tracking |
## Dependencies
- v1.13 ✅ (parts table — the event taxonomy for everything)
- v1.14 ✅ (outer loop + step boundaries for future revert snapshots)
- v1.14.x-mcp ✅ (MCP client PoC — proves the protocol)
- v1.15 ✅ (full MCP client + tool globs — write-capable MCP servers route through pending_changes)
- v1.16 ✅ (codesight merge — codecontext now has blast-radius for impact analysis)
All dependencies shipped. v2.0 is unblocked.
## Estimate
- v2.0.0: ~800 LoC (schema + write tools + pending-changes service + diff pane + dispatcher skeleton)
- v2.0.1: ~600 LoC (ACP client + PTY dispatch + worktree management + event mapping)
- v2.0.2: ~400 LoC (MCP server + 6 tool handlers + stdio transport + eval)
- v2.0.3: ~400 LoC (CLI client + inbox UI + cost aggregation)
- **Total: ~2200 LoC** across 4 sub-versions
## Hard rules
- BooChat stays read-only. BooCoder is the only surface with write tools.
- Path-guard correctness is the #1 test target. Fuzz against every traversal pattern.
- Pending-changes queue gates ALL writes (native + MCP). Nothing touches disk without user approval (or explicit auto-apply flag per task).
- One shared database. Cross-surface joins are valuable (task → chat → terminal debugging session).
- External CLI agents on the host, not in containers. BooCoder shells out via local-exec.
- No OAuth in v2.0. MCP server is stdio-only until secret storage lands.
- DB rename `boocode_db` → `boochat_db` lands with v2.0.0 (one-time migration).
## AGENTS.md extensions (v2.0.0)
Port fields from the `qodo-ai/agents` (MIT) `agent.toml` schema and the `ai-christianson/RA.Aid` (Apache-2.0) three-stage pattern:
| Field | Type | Purpose | Source |
|---|---|---|---|
| `steps` | number | Per-agent step cap (already shipped v1.14.0) | opencode |
| `output_schema` | JSON Schema | Structured output constraint for the agent's final response | qodo-ai/agents |
| `exit_expression` | string | Regex/predicate — when the agent considers itself done | qodo-ai/agents |
| `execution_strategy` | `plan` \| `act` \| `research` | Which phase of the RA.Aid three-stage pattern this agent operates in | qodo-ai/agents + RA.Aid |
| `model` | string | Per-agent model override (already shipped v1.8) | — |
| `expert_model` | string | Escalation model for hard reasoning (RA.Aid "expert tool" escape hatch) | RA.Aid |
The three-stage pattern maps to BooCoder's use case:
- **Research agent** (cheap model) → understand the task, find relevant files
- **Planning agent** (standard model) → decide which files to edit, what the changes look like
- **Implementation agent** (full model) → produce the actual diffs
`expert_model` is the escape hatch: a routine model handles most subtasks, but can call the expert model (e.g. qwopus27b) when stuck. Matches Sam's existing cost-routing discipline.
## Subagent isolation (Boomerang pattern, v2.0.1)
From Roo Code Boomerang Tasks (Apache-2.0 pattern):
When an orchestrator agent calls a `new_task` tool, BooCoder:
1. Creates a fresh `tasks` row with `parent_task_id` pointing to the orchestrator's task
2. Spawns a fresh inference session (Path A) or dispatch (Path B) with ONLY the task spec as context — no inherited conversation
3. Child runs to `attempt_completion`, writes a summary to `tasks.output_summary`
4. Parent resumes reading ONLY the summary (not the child's full conversation)
**Three principles:**
- Orchestrator capability restriction: the orchestrator agent's tool list includes ONLY `new_task`, `list_tasks`, `check_task_status` — it cannot read files or call MCP tools directly
- Down-pass: parent sends task spec via `new_task(input)`, nothing else inherited
- Up-pass: child sends result via `attempt_completion(summary)`, nothing else surfaces to parent
This is the **single most important context-management primitive** — it prevents long-running orchestrators from poisoning their context with implementation detail.
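The down-pass and up-pass rules can be illustrated with a small in-memory model. Everything here is a sketch: `TaskTree`, its method names, and the string ids are hypothetical, and the `transcript` field only exists to show what never reaches the parent.

```typescript
interface TaskRow {
  id: string;
  parent_task_id: string | null;
  input: string;
  output_summary: string | null;
  transcript: string[]; // the child's private conversation; never surfaced upward
}

class TaskTree {
  private rows = new Map<string, TaskRow>();
  private nextId = 1;

  // Down-pass: the only thing a child receives is its task spec.
  newTask(input: string, parentId: string | null = null): TaskRow {
    if (parentId !== null && !this.rows.has(parentId)) throw new Error("unknown parent");
    const row: TaskRow = {
      id: `task-${this.nextId++}`,
      parent_task_id: parentId,
      input,
      output_summary: null,
      transcript: [],
    };
    this.rows.set(row.id, row);
    return row;
  }

  childContext(id: string): string[] {
    return [this.get(id).input];
  }

  record(id: string, message: string): void {
    this.get(id).transcript.push(message);
  }

  // Up-pass: the child hands back a summary and nothing else.
  attemptCompletion(id: string, summary: string): void {
    this.get(id).output_summary = summary;
  }

  // The parent reads summaries of finished children, never transcripts.
  parentView(parentId: string): string[] {
    return Array.from(this.rows.values())
      .filter((r) => r.parent_task_id === parentId && r.output_summary !== null)
      .map((r) => r.output_summary as string);
  }

  private get(id: string): TaskRow {
    const row = this.rows.get(id);
    if (!row) throw new Error("unknown task");
    return row;
  }
}
```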
## Observation hooks (v2.0.3)
From `siropkin/budi` (MIT) Claude Code 5-hook taxonomy:
Register BooCoder as a hook receiver for dispatched agents. Five events:
- `SessionStart` — agent spawned
- `UserPromptSubmit` — task spec delivered
- `PostToolUse` — each tool call completed
- `SubagentStart` — nested dispatch
- `Stop` — agent finished
These map directly to BooCode's existing WS frame protocol. The hook receiver is the BooCoder Fastify server; events flow into the `message_parts` taxonomy as `step_start`-style instrumentation parts.
## Follow-up batches (v2.0+ optional, ordered by value)
| Batch | Source | What | When |
|---|---|---|---|
| **PR-resolver tool** | `qodo-ai/qodo-skills` (MIT) | Fetch GitHub issues → batch/interactive fix → inline PR reply. BooCoder tool that replaces Sam's manual PR workflow. | v2.0.3+ |
| **HMAC audit log** | `sipyourdrink-ltd/bernstein` (verify license) | One new `audit_log` table with `prev_hmac` field. Tamper-evident history of every edit BooCoder makes. Small lift (~50 LoC). | v2.0.1+ |
| **Blind-validation gate** | `covibes/zeroshot` (MIT) | Verify gate runs in a separate agent context that sees ONLY the diff + acceptance criteria, not the producing conversation. Complements Boomerang (isolation) + bernstein (lineage). | v2.0.2+ |
| **Majority-vote ensembler** | `augmentcode/augment-swebench-agent` (MIT) | K candidate diffs from K agents → ranker model picks the best one. Optional layer above `pending_changes`. | v2.1+ |
| **Drift detection** | `memovai/memov` (MIT) | `validate_commit` concept — detects when actual changes diverge from what was requested. Shadow timeline comparison. | v2.0.3+ |
| **Anti-slop for frontend** | `Leonxlnx/taste-skill` (MIT) | 100+ specific font/color/layout ban list + 3-dial parameterization. Vendor into skills/ when BooCoder generates frontend code. | v2.0+ |
| **Verify-before-commit gate** | `DeepSourceCorp/globstar` (MIT) | Rule-based AST linter as a pre-apply quality gate. YAML checkers in `.globstar/`. | v2.1+ (parked) |
| **Docker sandbox** | `OpenHands/OpenHands` (MIT) | Per-session Docker container for write tools. Closes the `/opt:rw` mount risk if path-guard ever proves insufficient. | v2.1 (optional) |
| **Multi-provider LLM** | `earendil-works/pi` (MIT) | Provider abstraction if a need for Anthropic/OpenAI/Mistral direct surfaces beyond llama-swap. | v2.x (optional) |
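For the HMAC audit log row in the table above, a tamper-evident chain can be sketched with Node's `crypto` module. Only the `prev_hmac` field comes from the table; the `payload` and `hmac` fields, the function names, and the key handling are assumptions, and this is not a description of how `bernstein` implements it.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface AuditEntry {
  payload: string;   // hypothetical: a serialized description of the edit
  prev_hmac: string; // HMAC of the previous entry ("" for the first one)
  hmac: string;
}

function sign(key: string, prevHmac: string, payload: string): string {
  // Each HMAC covers the previous HMAC, so entries are chained together.
  return createHmac("sha256", key).update(prevHmac).update("\n").update(payload).digest("hex");
}

function append(log: AuditEntry[], key: string, payload: string): AuditEntry[] {
  const prev = log.length > 0 ? log[log.length - 1].hmac : "";
  return [...log, { payload, prev_hmac: prev, hmac: sign(key, prev, payload) }];
}

function verify(log: AuditEntry[], key: string): boolean {
  let prev = "";
  for (const entry of log) {
    const expected = Buffer.from(sign(key, prev, entry.payload), "hex");
    const actual = Buffer.from(entry.hmac, "hex");
    // timingSafeEqual throws on length mismatch, so check lengths first.
    if (entry.prev_hmac !== prev || actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
      return false;
    }
    prev = entry.hmac;
  }
  return true;
}
```

Editing or deleting any earlier entry changes every later HMAC, so `verify` fails from the tampered row onward.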
## Repos to clone before starting
```bash
cd /opt/forks
git clone https://github.com/Dominic789654/agent-hub.git # Apache-2.0, task DAG + dispatcher
git clone https://github.com/plandex-ai/plandex.git # MIT, pending-changes UX
git clone https://github.com/anomalyco/opencode.git # MIT, permission evaluate.ts reference
git clone https://github.com/qodo-ai/agents.git # MIT, agent.toml schema (output_schema, exit_expression, execution_strategy)
```
Also read (no clone needed):
- `ai-christianson/RA.Aid` README — three-stage pattern + expert-tool escape hatch
- `getpaseo/paseo` README + `skills/` directory — daemon architecture + CLI verbs (AGPL, design-only)
- `agentclientprotocol.com` spec — ACP stdio protocol
- `goose-docs.ai/docs/guides/acp-clients/`: `context_servers` auto-forward pattern
- `siropkin/budi` README — 5-hook Claude Code taxonomy for observation
ACP SDK and MCP SDK are npm packages installed at implementation time.

@@ -1,130 +0,0 @@
# v2.0 — BooCoder task breakdown
## Phase 0 — Prep (before any code)
- [ ] Clone lift sources: `agent-hub`, `plandex`, `opencode` to `/opt/forks/`
- [ ] Read agent-hub's schema + dispatcher pattern (Apache-2.0)
- [ ] Read plandex's pending-changes + diff/apply/rewind flow (MIT)
- [ ] Read opencode's `permission/evaluate.ts` for write-gate patterns (MIT)
- [ ] Install ACP SDK: `pnpm add @zed-industries/agent-client-protocol`
- [ ] Verify `opencode acp` and `goose acp` are available on the host
- [ ] Write `openspec/changes/v2.0-boocoder/design.md` with finalized decisions
## v2.0.0 — Schema + Path A (native write tools + pending-changes + diff UI)
### Infra
- [ ] Create `apps/coder/` directory skeleton (Fastify server, mirroring `apps/server/` structure)
- [ ] Create `apps/coder/Dockerfile` (Node 20 bookworm-slim, `/opt:/opt:rw` mount)
- [ ] Add `boocoder` service to `docker-compose.yml` (port 9502, boocode_net)
- [ ] Add Caddy route: `coder.indifferentketchup.com` → boocoder:9502
- [ ] DB rename: `boocode_db` → `boochat_db` (one-time ALTER DATABASE + docker-compose volume rename)
- [ ] Schema migration: CREATE TABLE `pending_changes`, `tasks`, `available_agents`; CREATE VIEW `human_inbox`
- [ ] Container guidance: `BOOCODER.md` (bind-mounted at `/app/BOOCODER.md`)
### Write tools
- [ ] `apps/coder/src/services/write_guard.ts`: `resolveWritePath(projectRoot, filePath)` (resolve + prefix-check, no realpath since file may not exist)
- [ ] `apps/coder/src/services/pending_changes.ts` — queue, apply, reject, revert operations
- [ ] Tool: `edit_file` — takes `{file_path, old_string, new_string}`, computes unified diff, queues in `pending_changes`
- [ ] Tool: `create_file` — takes `{file_path, content}`, queues as `operation='create'`
- [ ] Tool: `delete_file` — takes `{file_path}`, queues as `operation='delete'`
- [ ] Tool: `apply_pending` — flushes pending changes to disk (re-validates write_guard before each write)
- [ ] Tool: `rewind` — reverts applied changes by inverse-diff
### Inference loop
- [ ] Port the v1.14 outer loop from `apps/server/` into `apps/coder/` (or share via workspace package)
- [ ] Register write tools in the coder's tool registry (alongside all read tools from BooChat)
- [ ] Permission gate: write tools require `pending_changes` queue (can't bypass to direct disk write)
### Frontend (diff pane)
- [ ] Create `apps/coder/web/` SPA (React + Vite, same stack as BooChat's `apps/web/`)
- [ ] Diff pane component: shows pending changes with syntax-highlighted diffs
- [ ] Approve / Reject per change, Approve All / Reject All buttons
- [ ] Workspace splitter integration (chat pane + diff pane side by side)
### Verification
- [ ] `pnpm -C apps/coder build` clean
- [ ] Write path-guard fuzz tests (traversal patterns, symlinks, non-existent paths, `.env` deny)
- [ ] `docker compose up --build -d` — boocoder container starts, healthcheck passes
- [ ] Smoke: send a chat requesting a file edit → see it queued in diff pane → approve → file written
## v2.0.1 — Path B (ACP dispatch + PTY fallback + worktrees)
### ACP client
- [ ] `apps/coder/src/services/acp-client.ts` — spawn `opencode acp` / `goose acp` via `@zed-industries/agent-client-protocol` StdioTransport
- [ ] Event mapping: ACP `file_operation` → `tool_call` part, `terminal_output` → BooTerm route, `permission_request` → pause
- [ ] Session lifecycle: start, mid-session model switch, end
- [ ] MCP auto-forward: pass BooCoder's `context_servers` config to the ACP session
### PTY fallback
- [ ] `apps/coder/src/services/pty-dispatch.ts` — spawn `claude` / `pi` / `smallcode` via `node-pty`
- [ ] Capture stdout/stderr/exit-code into parts (less structured than ACP)
- [ ] Worktree setup: `git worktree add /tmp/booworktrees/<task-id> -b task-<task-id> HEAD`
- [ ] On completion: diff worktree vs HEAD → queue into `pending_changes`
### Dispatcher
- [ ] `apps/coder/src/services/dispatcher.ts` — polls `tasks` WHERE `state='pending'`, picks by priority + creation order
- [ ] Transport selection: check `available_agents.supports_acp` at dispatch time
- [ ] On failure: mark `state='failed'`, surface in `human_inbox`
- [ ] On completion: mark `state='completed'`, queue diff if Path B
### Agent probing
- [ ] Startup probe: `which opencode && opencode --version`, `which goose`, `which claude`, `which pi`
- [ ] Populate `available_agents` table with version + ACP support
### Verification
- [ ] Smoke: dispatch a task to `opencode` via ACP → task completes → diff queued
- [ ] Smoke: dispatch to `claude` via PTY fallback → captures output → diff from worktree
- [ ] Worktree cleanup after task completion
## v2.0.2 — BooCoder MCP server
### Implementation
- [ ] `apps/coder/src/services/mcp-server.ts` — register 6 tools as MCP tool handlers
- [ ] Stdio transport (use `@modelcontextprotocol/sdk` server-side, same SDK as client)
- [ ] Tools: `boocoder.create_task`, `boocoder.list_pending_changes`, `boocoder.apply`, `boocoder.reject`, `boocoder.dispatch_external_agent`, `boocoder.list_worktrees`
- [ ] Each tool maps to a DB operation or service call
### Eval
- [ ] Write 10-question eval per `anthropics/skills/mcp-builder` framework
- [ ] Run eval against the MCP server — all 10 must pass before shipping
- [ ] Document eval results in openspec
### Verification
- [ ] From a terminal: `echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | boocoder --mcp` → returns 6 tools
- [ ] From opencode: configure BooCoder as an MCP server in `~/.opencode/config.json`, verify tool calls work
## v2.0.3 — Polish
### CLI client
- [ ] `apps/coder/src/cli.ts` — thin WebSocket/HTTP client against BooCoder API
- [ ] Verbs: `boocode run <task>`, `boocode ls`, `boocode attach <id>`, `boocode send <id> <message>`
- [ ] Mirrors Paseo's UX, license-clean implementation
### Human inbox UI
- [ ] Frontend route showing tasks in `blocked`/`failed` state
- [ ] Per-task: view output, retry, cancel, reassign to different agent
### Cost tracking
- [ ] `tasks.cost_tokens` populated from inference usage
- [ ] Summary view: per-project, per-agent, per-day token spend
### Verification
- [ ] `boocode run "add a health endpoint"` from terminal → task appears in UI → completes → diff in pane
- [ ] `boocode ls` shows running/completed/failed tasks