boocode/boocode_roadmap.md
indifferentketchup 2e1a81de72 v1.13.16-xml-parser: Anthropic <invoke> support + unknown-tool recovery hints
Two-part fix for the model-emitted XML drift the v1.13.15-codecontext-synth
investigation surfaced (1 raw <invoke> leak observed out of 190 qwen3.6
turns — qwen3.6-35b-a3b-mxfp4 drifts to the Anthropic format when prompted
as an Architect-style agent because Claude Code documentation in its
pre-training corpus uses that shape).

## Parser extension

xml-parser.ts now recognizes BOTH XML tool-call flavors:

  - Qwen/Hermes:   <tool_call><function=NAME>...<parameter=K>V</parameter>...</function></tool_call>
  - Anthropic:     <invoke name="NAME"><parameter name="K">V</parameter></invoke>

Both route through the same synthetic-id xml_call_${idx} ToolCall path.
extractToolCallBlocks() and partialXmlOpenerStart() handle both openers
(<tool_call> and <invoke...) so partial buffers don't get prematurely
flushed during streaming.

The existing Qwen parser was tightened to tolerate whitespace around `=`
(<function = name>, <parameter = key>...) so a stray space doesn't get
absorbed into the function name. Name capture is non-whitespace,
non-`>`.
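
A minimal sketch of how the two flavors could be normalized into one shape. It is not the contents of xml-parser.ts: the names `ParsedToolCall` and `parseXmlToolCalls` are hypothetical, the matching is regex-based, and streaming/partial-buffer handling is left out.

```typescript
// Hypothetical sketch; not the actual xml-parser.ts.
interface ParsedToolCall {
  id: string; // synthetic id in the xml_call_${idx} form
  name: string;
  args: Record<string, string>;
}

// Qwen/Hermes flavor; whitespace around `=` is tolerated, name is non-whitespace, non-`>`.
const QWEN_BLOCK =
  /<tool_call>\s*<function\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/function>\s*<\/tool_call>/g;
const QWEN_PARAM = /<parameter\s*=\s*([^\s>]+)\s*>([\s\S]*?)<\/parameter>/g;

// Anthropic flavor.
const INVOKE_BLOCK = /<invoke\s+name="([^"]+)"\s*>([\s\S]*?)<\/invoke>/g;
const INVOKE_PARAM = /<parameter\s+name="([^"]+)"\s*>([\s\S]*?)<\/parameter>/g;

function collectParams(body: string, re: RegExp): Record<string, string> {
  const args: Record<string, string> = {};
  for (const m of body.matchAll(re)) {
    args[m[1]] = m[2].trim();
  }
  return args;
}

function parseXmlToolCalls(text: string): ParsedToolCall[] {
  const found: { index: number; name: string; args: Record<string, string> }[] = [];
  for (const m of text.matchAll(QWEN_BLOCK)) {
    found.push({ index: m.index ?? 0, name: m[1], args: collectParams(m[2], QWEN_PARAM) });
  }
  for (const m of text.matchAll(INVOKE_BLOCK)) {
    found.push({ index: m.index ?? 0, name: m[1], args: collectParams(m[2], INVOKE_PARAM) });
  }
  // Keep document order so synthetic ids are stable regardless of flavor.
  found.sort((a, b) => a.index - b.index);
  return found.map((f, idx) => ({ id: `xml_call_${idx}`, name: f.name, args: f.args }));
}
```

Both flavors end up as the same `{ id, name, args }` record, so everything downstream of the parser can stay format-agnostic.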

## Unknown-tool recovery hint

New tool-suggestions.ts exports levenshtein() + suggestToolName() +
formatUnknownToolError(). When tool-phase.ts:executeToolCall receives a
toolCall.name that isn't in TOOLS_BY_NAME, the error returned to the
model now includes a "Did you mean: X?" hint based on Levenshtein
distance ≤3 or substring match against Object.keys(TOOLS_BY_NAME).
Targets the qwen3.6 drift to read_file → suggest view_file. Applies to
all unknown tool names, not just <invoke>-derived ones — at the
dispatch layer we no longer know which format produced the call, and
the extra signal is harmless for Qwen-derived calls.
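
A minimal sketch of the three helpers, assuming the simplest reading of the rule above (edit distance ≤3 first, then plain substring containment in either direction). The real tool-suggestions.ts may differ; the signatures and message wording here are assumptions.

```typescript
// Hypothetical sketch; the shipped tool-suggestions.ts may differ.
function levenshtein(a: string, b: string): number {
  // Two-row dynamic programming: prev[j] is the distance between a[0..i-1) and b[0..j).
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

function suggestToolName(unknown: string, known: string[], maxDistance = 3): string | null {
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const candidate of known) {
    const d = levenshtein(unknown, candidate);
    if (d <= maxDistance && d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  if (best !== null) return best;
  // Fallback: substring containment in either direction.
  return known.find((c) => unknown.includes(c) || c.includes(unknown)) ?? null;
}

function formatUnknownToolError(unknown: string, known: string[]): string {
  const hint = suggestToolName(unknown, known);
  return hint
    ? `Unknown tool: ${unknown}. Did you mean: ${hint}?`
    : `Unknown tool: ${unknown}.`;
}
```

Under this plain reading, `read_file` would not resolve to `view_file` (their edit distance is 4 and neither contains the other), so the shipped helper presumably matches more loosely; that detail is not stated in this message.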

## Test coverage

xml-parser.test.ts: 46 tests, all green. Covers both parsers
(well-formed, malformed, multi-parameter, nested-content), the
partial-opener detector for both flavors, the unified extraction
helper, and the unknown-tool error formatter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:59:25 +00:00


BooCode v1.x — Roadmap

Last updated: 2026-05-22

Companion doc: boocode_code_review.md holds the full external-repo inventory, lift rationale, and license analysis. This document is the canonical source for shipping state, version ordering, and what's planned vs. shipped.

Overview

BooCode is a 3-app monorepo at /opt/boocode/ (locked 2026-05-22):

  • BooChat (apps/chat, port 9500, code.indifferentketchup.com) — read-only chat with file-inspection tools. The live thing. Pick a project, chat with a local LLM, get streaming responses over WebSocket. Will rename boocode_db → boochat_db when BooCoder lands.
  • BooCoder (apps/coder, port 9502, coder.indifferentketchup.com) — write tools + external-CLI dispatch. Planned, v2.0. Both an in-process inference loop (with pending_changes table) AND ACP-dispatched external agents (opencode/goose) with PTY fallback (claude/pi/smallcode) — same surface, two execution paths.
  • BooTerm (apps/booterm, port 9501) — PTY/tmux/xterm.js. Live since May 2026. Node 20 Alpine + node-pty + tmux + xterm.js. Tmux session per pane (bc-<uuid>), SSH-out works (openssh-client + gosu in the image). /api/term/health shares the existing boocode_db.

Caddy → Authelia → Tailscale → 100.114.205.53 → 9500/9501/9502. Three apps, one shared Postgres (boocode_db → boochat_db).

Architectural commitments:

  • No embeddings. Model uses file-view tools (view_file, list_dir, grep, find_files) + sidecar analyzers (codecontext, future codesight) + codecontext MCP tools. Walked away from the RAG pipeline May 2026.
  • BooChat is read-only through v1.x. Write tools land in BooCoder at v2.0.
  • Mount strategy: blanket /opt:rw, permission gating at the write-tool layer. Per-project scoping is policy, not mount. Path-guard correctness is the #1 test target for v2.0.
  • External CLI agents (opencode/claude/goose/pi) live on the host, not in containers. BooCoder shells out via local-exec PTY or ACP subprocess. Host install inherits Sam's existing ~/.opencode/, ~/.claude/, ~/.config/goose/ configs.
  • Protocol roles locked (2026-05-22): BooChat = MCP client only (read-only tool consumer, never enables write-capable MCP servers). BooCoder = MCP client + MCP server + ACP client (host) + ACP agent (driveable) — full matrix. BooCoder's ACP-client role replaces raw-PTY dispatch for ACP-capable agents (opencode → opencode acp, goose → goose acp); PTY fallback retained for claude/pi/smallcode.
  • Strategic target: Paseo-equivalent dispatcher inside BooCode (2026-05-22 pivot). Paseo (getpaseo/paseo) is AGPL-3.0 — incompatible with BooCode's MIT license and network-served deployment. Reproduce the architecture using only license-clean patterns. Primary architectural template: Dominic789654/agent-hub (Apache-2.0). Critical context-management primitive: Roo Code Boomerang Tasks pattern. Observation pattern: Claude Code hooks (siropkin/budi reference).

External code lifted from / referenced in: see boocode_code_review.md for full inventory.


Shipped (status as of 2026-05-22)

Version Theme Tag
v1.0 Initial scaffold
Batches 1–4.4 Markdown, sidebar, panes, chats-inside-sessions, archive, fork/delete, header polish, settings drawer
v1.5 resolveProjectPath, BOOTSTRAP_ROOT, vitest pin
v1.6, v1.6.1, v1.6.2 Mobile pass + RightRail mobile drawer
v1.7 Drag-drop file + paste-as-attachment
v1.8, v1.8.1, v1.8.2 Settings drawer, git_status tool, WS reconnect, per-turn budget reset + Continue affordance + CapHitSentinel
v1.9.1 Skills system (/opt/skills/ + skill_find / skill_use / skill_resource + /skill slash command) v1.9.1
v1.9.7 ask_user_input elicitation tool v1.9.7
Batch 9 (Agents Tier 2) AGENTS.md + 6 builtin agents + AgentPicker in ChatInput toolbar + sessions.agent_id folded into v1.9.1/v1.9.7
v1.10.0 BooTerm: separate container, xterm.js + node-pty + tmux v1.10.0
v1.10.1 BooTerm-user (spawn as samkintop, login bash, Claude Code/opencode PATH) v1.10.1
v1.10.4, v1.10.5 Mobile terminal + XML tool-call fallback parser
v1.11.0 opencode-style compaction port (auto-overflow, anchored summary, tail preservation)
v1.11.1 Compaction follow-up (working indicator during compaction, unit tests, .bak cleanup)
v1.11.2 ContextBar (persistent context-usage indicator above MessageList)
v1.11.3 ctx_max capture via /upstream/<model>/props (replaces dead timings.n_ctx read) v1.11.3
v1.11.5 ContextBar inline next to agent picker; remove ChatContextPopover; default new sessions to no agent
v1.11.6 Doom-loop guard from opencode (3 identical tool calls → sentinel, abort recursion)
v1.11.7 pathGuard secrets filter (continue.dev DEFAULT_SECURITY_IGNORE_FILETYPES)
v1.11.8 web_search + web_fetch tools via SearXNG
v1.11.9 Manual redirect handling — re-run URL guard on each hop (SSRF hardening)
v1.11.10 Stream-cap response body at 5MB, abort on overflow v1.11.x
v1.12.0 codecontext sidecar (Go HTTP shim, NDJSON MCP framing, child.Wait supervisor) + container guidance (BOOCHAT.md/BOOCODER.md) + 7 vendored skills + system-prompt.ts extraction + mtime-watch cache + 8 codecontext tool wrappers + per-agent tool whitelists + .codecontextignore template + agents.ts ALL_TOOL_NAMES single-source-of-truth fix v1.12.0
v1.12.1 Server-side workspace pane sync (sessions.workspace_panes jsonb) + 5-state status indicator overhaul (streaming/tool_running/waiting_for_input/idle/error) + startup hung-row sweep + stale messages_status_check constraint dropped + detectSameNameLoop reverted (dead code) + stop-handler writes cancelled status v1.12.1
v1.12.2 Live tok/s + ctx_used display next to status indicator while streaming (frontend-only) v1.12.2
v1.12.3 Stale-stream banner — "Previous response didn't complete. [Retry] [Discard]" when streaming row > ~60s with no new tokens. POST /api/chats/:id/discard_stale backend endpoint v1.12.3
v1.12.4 Refactor only — inference.ts (1700 LoC) split into inference/ directory: turn.ts, stream-phase.ts, tool-phase.ts, error-handler.ts, sentinel-summaries.ts, payload.ts, xml-parser.ts, sentinels.ts, budget.ts, types.ts, index.ts. Shipped as rc1/rc2/rc3 → final. No behavior change. Lined up stream-phase.ts as the swap target for v1.13 AI SDK migration v1.12.4
v1.13.0 message_parts table (id, message_id, sequence, kind, payload jsonb, created_at) with kinds text/tool_call/tool_result/reasoning/step_start. CHECK constraint, (message_id, sequence) unique + index. Dual-write at every site that wrote tool_calls/tool_results JSON (stream-phase finalize, skills × 2, messages.ts answer flow, chats.ts × 2). ToolDef<T> gained `category: 'read_only' | 'write'`. v1.x registry rejects write. Old JSON columns remain authoritative for reads. Strangler-fig phase 1
v1.13.1-A AI SDK v6 install + streamCompletion adapter. ai@^6, @ai-sdk/openai-compatible@^2. provider.ts wraps createOpenAICompatible against config.LLAMA_SWAP_URL. streamCompletion rewritten as adapter over streamText. XML fallback parser preserved for qwen3.6's inline <tool_call> emissions. Patched mid-flight: AI SDK v6 swallows abort signals silently — explicit if (signal?.aborted) throw after stream drain. Without it, stop button writes complete instead of cancelled. reasoning-delta counted + dropped (re-captured in -C). Known regression flagged: live mid-stream tps gone (single trailing publish; TODO for delta-cadence interpolation against result.usage) (umbrella tag)
v1.13.1-B messages_with_parts view with COALESCE fallbacks against legacy JSON columns. Read sites switched: chats.ts:427, messages.ts:95, ws.ts:27, payload.ts, compaction.ts. Perf verified at 1ms for 42-message chat. reasoning_parts column added to the view (consumed in -C). API contract preserved. Parts become source of truth at read; JSON columns kept by dual-write only (umbrella tag)
v1.13.1-C ask_user_input correlation ported to parts. messages.ts:478/549 now JOINs message_parts on payload->>'id' and payload->>'tool_call_id'. Downstream call sites updated to {message_id, payload} shape. 404 fallback for pre-v1.13.0 history (acceptable scope). Reasoning end-to-end: reasoning-delta accumulated in stream-phase.ts adapter via StreamResult.reasoning (simpler than the brief's StreamPhaseState approach); partsFromAssistantMessage accepts optional reasoning, emits at seq 0; finalizeCompletion + executeToolPhase dual-write reasoning parts; payload.ts reads reasoning_parts from view, collapses into OpenAiMessage.reasoning; toModelMessages emits AI SDK ReasoningPart in assistant content array. Smoke: 361 chars reasoning at seq 0, 429 chars text at seq 1 v1.13.1 (ac1a71f)
v1.13.3 Cleanup bundle, 4 independent items. (1) ALTER DATABASE boocode SET statement_timeout = '30s' — caps damage from query-plan regression on the view's nested subselects; documented in schema.sql since ALTER DATABASE can't run inside a DO block. (2) Alpha-sorted tool registry — .sort((a, b) => a.name.localeCompare(b.name)) at ALL_TOOLS export; llama.cpp prompt cache hits on byte-identical prefixes, tool-order drift killed hit rate every turn. (3) Periodic 60s in-process sweeper marks streaming rows older than 5 min as failed and publishes chat_status='idle' so the UI dot drops — closes mid-session crash UX gap that the startup sweep (v1.12.1) only handled at boot. (4) experimental_repairToolCall wired through AI SDK v6 streamText — routes malformed tool calls to a logged passthrough instead of crashing the stream. Owed since v1.13.1-A. 173/173 tests pass (+1 alpha-ordering test) v1.13.3 (a08d809)
v1.13.4 Two-tier compaction prune. services/inference/prune.ts with pure selectPruneTargets decision helper. Tier 1 hides stale tool_result parts via message_parts.hidden_at at the 20k-freed threshold (cheap, no inference call); tier 2 falls back to anchored summarize when prune alone isn't enough. Schema additions: message_parts.hidden_at column + partial index ON (message_id) WHERE hidden_at IS NULL. messages_with_parts view filters hidden parts so payload assembly never sees them. Avoids burning an inference round on every overflow. opencode-pattern half-shipped in v1.11.0 — this closes it. v1.13.4 (ec8593c)
v1.13.5 opencode truncate.ts port — full tool output retrievable via opaque id. New services/truncate.ts with tr_<12 base32> ids on tmpfs (/tmp/boocode-truncations, 0o700, 5MB cap matching view_file's MAX_FILE_BYTES, 7-day TTL). Three exports: storeTruncation, readTruncation, truncateIfNeeded (wrap-or-passthrough helper). New view_truncated_output(id) tool retrieves the full content; model never sees the truncation dir (resolved server-side). Wired through 5 of 7 tool sites: view_file, list_dir, web_fetch, codecontext_client, plus alpha-sorted into ALL_TOOLS (count 19→20). cleanupTruncations piggybacks on the v1.13.3 60s sweeper (TTL pass + orphan reap via parts query on payload->'output'->>'outputPath'). grep and find_files deferred (need file_ops refactor to expose uncapped output). 186 tests (was 179, +7 in truncate.test.ts). v1.13.5 (f8fc5db)
v1.13.6 Compaction head-assembly audit + reasoning fix. Audit traced compaction's summary path post-v1.13.1-B read flip across three quadrants — Q1 view read (clean), Q2 parts shape (clean), Q3 reasoning render (FIX NEEDED). v1.13.1-C wired reasoning end-to-end into inference/payload.ts but missed the compaction read site, silently degrading summary quality for reasoning-channel models (qwen3.6) since -C shipped. Fix: CompactionMessage extended with reasoning_parts field; SELECT pulls reasoning_parts from messages_with_parts; buildHeadPayload (now exported for tests) prefixes assistant content with <reasoning>...</reasoning>\n\n<content> when reasoning is present; standalone <reasoning> tag for tool-call-only turns; omits tag when reasoning is null or empty. 4 new render-branch tests (190 total). v1.13.6 (81d837c)
v1.13.7 (uncommitted) Stability bundle, 5 fixes from production observability gap. (1) provider.ts — includeUsage: true on createOpenAICompatible. @ai-sdk/openai-compatible defaults this to false, omitting stream_options.include_usage from the request body; llama-swap never emitted the usage block, so result.usage.inputTokens/outputTokens resolved undefined and tokens_used/ctx_used landed NULL in every assistant row since v1.13.1-A. Surfaces tokens in StatsLine + persisted DB rows going forward (no backfill). (2) MessageList.tsx:48 — hasText = m.content.trim().length > 0. AI SDK v6 streaming occasionally emits a leading \n text-delta on tool-call-only turns; the literal newline passed length > 0 and rendered an empty bubble + ActionRow between each tool call. (3) MessageBubble.tsx:654 — same trim on hasContent (defensive, no-tool-calls path). (4) payload.ts:64 — buildMessagesPayload skips assistant rows with status='failed', and also rows with status='complete' that have empty content and no tool_calls. Without this, a trailing empty/failed assistant + the next attempt's placeholder produced "Cannot have 2 or more assistant messages at the end of the list" rejections from the upstream API. (5) budget.ts:11 — BUDGET_NO_AGENT = 30 (was 15). No-agent mode shares the read-only-agent toolset at runtime; the cautious 15-cap was forward-looking for write tools that haven't landed. 190/190 tests still pass.
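
Item (4) of the v1.13.7 row can be read as a simple predicate over message rows. The sketch below assumes a simplified row shape; `PayloadRow` and `keepForPayload` are hypothetical names, not taken from payload.ts.

```typescript
// Hypothetical sketch of the trailing-empty-assistant filter; not the real payload.ts.
interface PayloadRow {
  role: "user" | "assistant" | "tool"; // assumed role set
  status: "streaming" | "complete" | "cancelled" | "failed";
  content: string;
  tool_calls: unknown[] | null;
}

function keepForPayload(row: PayloadRow): boolean {
  if (row.role !== "assistant") return true;
  // Failed assistant rows never reach the upstream API.
  if (row.status === "failed") return false;
  const emptyContent = row.content.trim().length === 0;
  const noToolCalls = row.tool_calls === null || row.tool_calls.length === 0;
  // A completed row with neither text nor tool calls carries no information.
  return !(row.status === "complete" && emptyContent && noToolCalls);
}
```

Applied with `rows.filter(keepForPayload)`, this keeps tool-call-only assistant turns (empty content but non-empty `tool_calls`) while dropping the rows that would otherwise stack consecutive assistant messages at the end of the list.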

v1.13.2 deliberately deferred — keep the dual-write through v1.13.4–v1.13.11 as rollback insurance. Drop legacy columns last.


Shipped (v1.13.x — written 2026-05-22, retagged same day)

All v1.13.x batches were retagged to the vMAJOR.MINOR.PATCH-slug scheme on 2026-05-22. CHANGELOG.md is the canonical per-tag record (slug describes what shipped; tag name alone recalls the batch). Tip is v1.13.14-skills-audit (0fa46cd); the next batch is v1.13.15-codecontext-synth (this batch, tag pending). Tags in chronological order:

  • v1.13.0-ai-sdk-v6 — AI SDK v6 migration; streamCompletion adapter; messages_with_parts view; reasoning_parts end-to-end
  • v1.13.1-cleanup-bundle — statement_timeout='30s', alpha-sorted tool registry, 60s stuck-row sweeper, experimental_repairToolCall pass-through
  • v1.13.2-compaction-prune — two-tier prune; message_parts.hidden_at column + partial index; messages_with_parts view CASE refinement
  • v1.13.3-truncate — opencode truncate.ts port; opaque tr_<…> id, view_truncated_output(id) tool, tmpfs storage
  • v1.13.4-reasoning-fix — <reasoning> prose-prefix in compaction head-assembly for tool-bearing turns
  • v1.13.5-stability-bundle — includeUsage: true on provider, hasText trim guard, BUDGET_NO_AGENT 15→30, trailing-empty-assistant filter
  • v1.13.6-prefix-stability — buildSystemPromptWithFingerprint SHA-256 + per-session drift observer
  • v1.13.7-compaction-trigger — overflow trigger lowered to floor(0.85 × ctx_max)
  • v1.13.8-tool-cost — tool_cost_stats SQL view + per-tool rolling 100-call mean in AgentPicker
  • v1.13.9-agentlint — instruction-file AgentLint pass; identity-openers removed; CLAUDE.local.md to .gitignore
  • v1.13.10-openspec — openspec/changes/<slug>/{proposal,tasks,design}.md shape; archived batch docs preserved via git mv
  • v1.13.11-tools — tiered tool loading via BOOCODE_TOOLS env (core | standard | all)
  • v1.13.12-ws-schemas — Zod schemas for all 27 wire-format frames; publishFrame / publishUserFrame wrappers; parity test
  • v1.13.13-ws-publish — all ~80 publish sites converted to the typed wrappers; every WS frame now Zod-validated at boundary
  • v1.13.14-skills-audit — 26 skills vendored + audited via 5 parallel agent teams; 14 kept, 11 dropped, 1 migrated to BOOCHAT.md/BOOCODER.md
  • v1.13.15-codecontext-synth — forced second-inference synthesis pass for codecontext overview tools (truncation-aware extraction; auto-fetched top-N files + project docs; 32k payload-budget contract preserved)
  • v1.13.16-xml-parser — Anthropic <invoke> parser support + Levenshtein-based unknown-tool recovery hints (qwen3.6 drift to Claude Code-style tool names like read_file); xml-parser test coverage

The remaining strangler-fig final step (drop messages.tool_calls + tool_results columns) is still pending under its old v1.13.2 working name; will get a new tag slug when scoped.

In flight / next (v1.13.x cleanup line)

Five more single-dispatch batches before the strangler-fig closes. Each ships independently with its own smoke and rollback surface. Do not fold. Order is locked:

v1.13.8 — system-prompt prefix stability verify-and-measure (REFRAMED, 2026-05-22)

Original plan: add a system_prompt_cache DB table keyed by (agent_id, project_id, skills_version), mtime-invalidated.

Why reframed: recon disproved the premise. apps/server/src/services/system-prompt.ts:buildSystemPrompt already runs over mtime-cached inputs at the file layer:

  • BOOCHAT.md / BOOCODER.md cached in system-prompt.ts:25 (cachedGuidance, keyed by mtime)
  • global + per-project AGENTS.md cached in agents.ts:245 (safeStat pattern, 60s TTL)
  • session.system_prompt / project.default_system_prompt are DB scalars (byte-stable until edited)
  • BASE_SYSTEM_PROMPT is a hardcoded template with ${projectPath} interpolation

Output assembly is a microsecond pure-string concat with no I/O. Skills aren't in the prefix (runtime discovery via skill_find). Tools live in a separate request body field, alpha-sorted by v1.13.3. In theory the prefix is already byte-stable across turns; nothing has measured it.

New scope — instrumentation only, no cache:

  1. SHA-256 fingerprint of buildSystemPrompt's output logged per turn at level=info, msg prefix-fingerprint, with project_id / agent_id / session_id / prefix_hash / prefix_length / mtime fields.
  2. Module-level Map<sessionId, lastHash> observer. On hash change for a known session → emit prefix-drift at level=warn with prev_hash, new_hash, and a field-level changed_inputs diff.
  3. Unit-level byte-stability assertion in system-prompt.test.ts: two consecutive buildSystemPrompt calls with the same inputs return byte-identical strings.
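
The first two scope items could be sketched roughly as follows. The function names and the log-callback shape are assumptions for illustration; only the SHA-256 hash, the per-session map, and the `prefix-fingerprint` / `prefix-drift` message names come from the plan above. The `changed_inputs` diff is omitted.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch; not the shipped system-prompt.ts instrumentation.
type LogFn = (entry: Record<string, unknown>) => void;

// Module-level observer: last seen hash per session.
const lastHashBySession = new Map<string, string>();

function fingerprintPrefix(prompt: string): string {
  return createHash("sha256").update(prompt, "utf8").digest("hex");
}

function observePrefix(sessionId: string, prompt: string, log: LogFn): string {
  const hash = fingerprintPrefix(prompt);
  log({
    level: "info",
    msg: "prefix-fingerprint",
    session_id: sessionId,
    prefix_hash: hash,
    prefix_length: prompt.length,
  });
  const prev = lastHashBySession.get(sessionId);
  // Drift is only meaningful for a session that has been seen before.
  if (prev !== undefined && prev !== hash) {
    log({ level: "warn", msg: "prefix-drift", session_id: sessionId, prev_hash: prev, new_hash: hash });
  }
  lastHashBySession.set(sessionId, hash);
  return hash;
}
```

In a five-turn smoke with a byte-stable prefix, this would emit five `prefix-fingerprint` entries with the same hash and no `prefix-drift` entry, which is the close-as-no-op outcome.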

Decision criterion: smoke 5 turns in a fresh session. 5 identical hashes + zero drift logs → close v1.13.8 as no-op, drop the DB cache plan permanently, move to v1.13.9. If drift surfaces → characterize the failure mode in a follow-up batch (the answer may not be a cache at all).

Doctrine: matches the v1.13.6 audit pattern. Don't add infrastructure without a proven cache miss. The v1.12.0 mtime caches at the input layer plus alpha tool ordering at the request body layer already address the load-bearing cache-stability surfaces.

Dispatch brief: handoff_v1.13.8_prefix_verify.md.

Estimated: ~95 LoC (system-prompt.ts + small getAgentsMtimes accessor in agents.ts + 3 new tests).

v1.13.9 — compaction overflow trigger formula

opencode pattern: 0.85 * ctx_max early trigger (not at 100% saturation). Reduces tail-loss risk and gives compaction a safer window. Tiny change but tied to v1.13.4's tier logic — sequence matters.
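
As a worked illustration of the trigger, a sketch assuming the threshold is `floor(0.85 × ctx_max)` and that compaction fires once usage reaches it (the comparison operator is an assumption; function names are hypothetical):

```typescript
// Hypothetical sketch of the early-overflow trigger.
const OVERFLOW_RATIO = 0.85;

function compactionThreshold(ctxMax: number): number {
  return Math.floor(OVERFLOW_RATIO * ctxMax);
}

function shouldCompact(ctxUsed: number, ctxMax: number): boolean {
  return ctxUsed >= compactionThreshold(ctxMax);
}
```

With a 32768-token context the threshold is 27852 tokens, leaving about 4900 tokens of headroom for the compaction call itself.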

Lift source: anomalyco/opencode session/overflow.ts.

Estimated: ~30 LoC.

v1.13.10 — per-tool token cost accounting

Rolling average per tool, surfaced in AgentPicker tooltip + agent-pick decisions. Backend tracks (tool_name, prompt_tokens_in, completion_tokens_out) per call; surfaces a 100-call rolling mean. Frontend reads it for tool-cost hints. Depends on v1.13.7's includeUsage fix — without real token numbers in DB rows, the rolling average is empty.
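
A minimal in-memory sketch of the rolling mean, assuming each call's cost is the sum of its prompt and completion tokens (the plan does not say how the two are combined). `RollingToolCost` is a hypothetical name; the shipped version may compute this in SQL instead.

```typescript
// Hypothetical sketch; combining tokens by summation is an assumption.
class RollingToolCost {
  private readonly windowSize: number;
  private readonly samples = new Map<string, number[]>();

  constructor(windowSize = 100) {
    this.windowSize = windowSize;
  }

  record(toolName: string, promptTokensIn: number, completionTokensOut: number): void {
    const window = this.samples.get(toolName) ?? [];
    window.push(promptTokensIn + completionTokensOut);
    // Drop the oldest sample once the window is full.
    if (window.length > this.windowSize) window.shift();
    this.samples.set(toolName, window);
  }

  mean(toolName: string): number | null {
    const window = this.samples.get(toolName);
    if (!window || window.length === 0) return null;
    return window.reduce((sum, v) => sum + v, 0) / window.length;
  }
}
```

Returning `null` for a tool with no samples lets the tooltip distinguish "no data yet" from "costs zero tokens", which matters while rows written before the includeUsage fix still have NULL token counts.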

Estimated: ~250 LoC.

v1.13.11 — WebSocket frame typing

Zod schemas validated both ends. Catches the recurring class of bug that drove the 2026-05-21 debugging spike (silent protocol drift). Upfront work that pays back every time the protocol changes. chat_status, usage, parts_appended, session_workspace_updated, tool_running — every frame gets a Zod schema, every send/receive site validates.

Estimated: ~300 LoC.

v1.13.12 — skills audit pass (NEW, 2026-05-22)

Goal: apply the rules→recipes split (per Codeminer42 activation-gap data: plain skills invoke 6% in clean multi-turn, CLAUDE.md/AGENTS.md is 100% present) to BooCode's 7 vendored v1.12 skills. Sort each into: (a) move to AGENTS.md as always-true rule, (b) keep as recipe invoked via /skill <name>, (c) move bulky context into references/ flat subdirectory inside the skill, (d) delete (Claude already does it reliably).

Scope:

  1. Audit each of the 7 vendored skills against the 4-way split. Most workflow-rule content ("always do X before Y", "never do Z") moves to AGENTS.md since it should be 100% present. Recipe content ("here's how to scaffold a component", "here's the release checklist") stays as skill, gets context: fork if heavy.
  2. Adopt Anthropic best-practices conventions for any skills that remain after audit: gerund names (scaffolding-components, not component-helper), SKILL.md ≤500 lines, references one level deep, third-person imperative voice, MCP tool references in ServerName:tool_name format, no Windows-style paths, no time-sensitive info, consistent terminology, no "voodoo constants."
  3. Run each remaining skill through the 4-step validation protocol from mgechev/skills-best-practices (Discovery → Logic → Edge Case → Architecture Refinement) using a fresh Claude chat per step. Prompts are paste-ready; ~10 minutes per skill.
  4. Install skillgrade on Sam's host (npm i -g skillgrade). For each remaining skill, write a minimal eval.yaml with 2–3 tasks and run skillgrade --smoke (5 trials, ~5 min) to confirm the skill triggers when expected and produces correct output. Likely outcome: some skills show 0–20% trigger rate — confirms they belong in AGENTS.md, not as skills.
  5. Document the rules→recipes split as a BooCode convention in BOOCODER.md / BOOCHAT.md. Future-proofs against re-adding workflow rules as skills.

Lift sources:

  • blog.codeminer42.com/stop-putting-best-practices-in-skills/ — empirical 6%/33%/66%/100% invocation-rate data with Vercel-style multi-turn methodology. The activation-gap framing.
  • mgechev/skills-best-practices (25 stars, MIT) — 4-step validation protocol with paste-ready prompts. Directory structure conventions.
  • mgechev/skillgrade (132 stars, MIT) — agent-agnostic skill eval framework. eval.yaml task+grader schema. Smoke/reliable/regression presets.
  • platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices — canonical Anthropic standard. 500-line ceiling, gerund naming, progressive disclosure patterns, MCP tool reference format, verification checklist.

Dependencies: none (the 7 v1.12 skills already exist; this is an audit pass on shipped material). Can ship at any point in the v1.13.x line.

Estimated: zero code changes, ~one evening of audit work, plus skillgrade install. Per-skill eval.yaml authoring is ~30 min per skill including the 4-step validation. Total roughly 5–6 hours of focused work for all 7 skills.

v1.13.2 — drop legacy columns (final phase of strangler-fig)

Wait at least one week of production traffic on v1.13.1 before shipping. The dual-write is rollback insurance. Drop the columns and that rollback is gone.

Verification query before shipping:

SELECT
  COUNT(*) FILTER (WHERE m.tool_calls IS NOT NULL AND NOT EXISTS (
    SELECT 1 FROM message_parts p WHERE p.message_id = m.id AND p.kind = 'tool_call'
  )) AS missing_tool_call_parts,
  COUNT(*) FILTER (WHERE m.tool_results IS NOT NULL AND NOT EXISTS (
    SELECT 1 FROM message_parts p WHERE p.message_id = m.id AND p.kind = 'tool_result'
  )) AS missing_tool_result_parts
FROM messages m
WHERE m.created_at > '2026-05-22'::timestamptz;

Both columns must read 0.

Scope (~150 LoC, mostly deletions):

  1. Remove dual-write from every v1.13.0 site: tool-phase.ts (3 sites), finalizeCompletion, skills.ts (2 sites), messages.ts answer flow, chats.ts (fork). Keep only the parts write.
  2. Simplify messages_with_parts view — drop COALESCE fallbacks since legacy columns are about to disappear.
  3. ALTER TABLE messages DROP COLUMN tool_calls, DROP COLUMN tool_results.
  4. Remove tool_calls/tool_results fields from Message API type. API boundary unchanged (frontend already reads parts-derived values).
  5. Drop the stale messages_status_check cleanup DO block from v1.12.1 schema if still present.
  6. Update test fixtures in inference.test.ts and compaction.test.ts to construct parts instead of inline tool_calls: null, tool_results: null literals. ~30 fixture rewrites.

After v1.13.2 ships, tag the umbrella v1.13 on the same commit (or on -C — Sam's call).


v1.14 — Phase C: outer agent loop

Goal: explicit multi-step loop per opencode prompt.ts runLoop(). Replace the current ad-hoc tool-call recursion.

Scope:

  1. Outer loop continues until model returns non-tool finish OR step cap hit. Step ≠ tool call: one step can contain multiple tool calls in parallel.
  2. agent.steps ?? Infinity per-agent step cap. AGENTS.md gains steps: field. Refactorer steps: 5, Architect steps: 20, etc.
  3. Step-boundary events (step_start, step_finish) explicit in the parts stream. Per-step snapshot for revert (planned for BooCoder; backend-only in v1.14).
  4. Doom-loop guards (v1.11.6) migrate from "abort recursion" to "raise within loop iteration." Same predicate, different control flow.
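
The loop shape described in items 1 to 3 might look like the sketch below. It is not opencode's runLoop() and not BooCode code: all type and function names are hypothetical, the model and tool executor are injected as callbacks, and doom-loop guards and per-step snapshots are left out.

```typescript
// Hypothetical sketch of an outer agent loop with a per-agent step cap.
interface ToolCall { name: string; args: unknown }
interface ModelTurn { text: string; toolCalls: ToolCall[] }
interface StepEvent { kind: "step_start" | "step_finish"; step: number }
interface LoopResult { finish: "stop" | "step_cap"; steps: number; text: string }

async function runLoop(
  nextTurn: (previousResults: string[]) => Promise<ModelTurn>,
  runTool: (call: ToolCall) => Promise<string>,
  maxSteps: number = Infinity, // mirrors `agent.steps ?? Infinity`
  emit: (event: StepEvent) => void = () => {},
): Promise<LoopResult> {
  let steps = 0;
  let text = "";
  let results: string[] = [];
  while (steps < maxSteps) {
    steps += 1;
    emit({ kind: "step_start", step: steps });
    const turn = await nextTurn(results);
    text = turn.text;
    // One step may carry several tool calls; they run in parallel and count once.
    results = await Promise.all(turn.toolCalls.map((call) => runTool(call)));
    emit({ kind: "step_finish", step: steps });
    // A turn without tool calls is the model's non-tool finish.
    if (turn.toolCalls.length === 0) return { finish: "stop", steps, text };
  }
  return { finish: "step_cap", steps, text };
}
```

Because the cap counts loop iterations rather than tool calls, a Refactorer with `steps: 5` could still issue more than five tool calls if it batches them within a step.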

Lift sources:

  • anomalyco/opencode session/prompt.ts runLoop() outer agent loop
  • anomalyco/opencode agent.steps per-agent step cap
  • AGENTS.md extensions for steps, output_schema (Qodo agent.toml pattern), exit_expression (Qodo pattern), execution_strategy (Qodo plan/act)
  • Reference: RA.Aid three-stage Research/Planning/Implementation as AGENTS.md design principle; expert-tool escape hatch pattern (most subtasks on routine model, escalate to qwopus27b only when needed)
  • Reference: Roo Code Boomerang Tasks — orchestrator-with-capability-restriction pattern. Adopt as AGENTS.md design principle (orchestrator role can call only dispatch tools, no file reads / MCP / shell).

Dependencies: v1.13 merged.

Estimated: ~800 LoC.


v1.14.x-mcp — single-server MCP-client proof-of-concept (NEW, 2026-05-22)

Goal: validate the MCP-client loop end-to-end against one real MCP server before committing to the full opencode mcp/index.ts port at v1.15. Small, throwaway-if-needed, slots between v1.14 and v1.15 without disrupting either.

Scope:

  1. Add a hardcoded MCP client (single server) to BooChat. Initial target: Context7 (Sam already uses it via opencode, so the config is known to work). Remote HTTP transport at https://mcp.context7.com/mcp with optional CONTEXT7_API_KEY header.
  2. Use the official @modelcontextprotocol/sdk TypeScript client. No SSE transport yet (deferred to v1.15). Stdio transport not needed for Context7.
  3. Tool discovery on startup: tools/list. Tools surface in BooChat alongside view_file/grep/etc., prefixed context7_* to avoid collisions.
  4. Read-only invariant guard: the client must reject any MCP tool whose read-only annotation (annotations.readOnlyHint in the MCP spec) is false or absent. Fail-closed. This is BooChat-specific defense-in-depth — v1.15 lifts this restriction for BooCoder.
  5. Per-server enabled flag in agents.ts. No glob patterns yet.
  6. No OAuth. Context7 supports an API key header; that's it for v1.14.x. OAuth lands in v1.15.
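
The fail-closed guard and the name prefixing from the scope list could be sketched as below. The MCP specification names the read-only annotation `readOnlyHint`; this sketch assumes that is the field meant above. `McpToolInfo` and `admitReadOnlyTools` are hypothetical names, and the input is a plain array standing in for a `tools/list` result rather than a call into the SDK.

```typescript
// Hypothetical sketch; operates on plain data, not on the MCP SDK client.
interface McpToolInfo {
  name: string;
  annotations?: { readOnlyHint?: boolean };
}

interface AdmissionResult {
  admitted: string[]; // prefixed names exposed to the model
  rejected: string[]; // original names dropped by the guard
}

function admitReadOnlyTools(tools: McpToolInfo[], prefix = "context7_"): AdmissionResult {
  const admitted: string[] = [];
  const rejected: string[] = [];
  for (const tool of tools) {
    // Fail closed: only an explicit `true` passes; `false` or a missing annotation is rejected.
    if (tool.annotations?.readOnlyHint === true) {
      admitted.push(prefix + tool.name);
    } else {
      rejected.push(tool.name);
    }
  }
  return { admitted, rejected };
}
```

Comparing against `=== true` rather than checking `!== false` is what makes the guard fail closed: a server that omits annotations exposes no tools.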

What this proves:

  • MCP protocol loop works end-to-end against a real server in BooCode's Fastify backend.
  • Tool-discovery → tool-list → tool-call → result-render → context-budget accounting all hold.
  • Read-only enforcement at the client layer is sound.
  • Config schema shape is right before v1.15 commits to the opencode-compatible JSON config.

What this does NOT do:

  • No SSE transport. (v1.15.)
  • No OAuth flow. (v1.15.)
  • No multiple servers. (v1.15.)
  • No per-agent server allow/deny. (v1.15.)

Dependencies: v1.13 merged (parts table for tool-call/tool-result emission).

Estimated: ~150 LoC.

Skip-condition: if v1.14 finishes and Sam wants to leap straight to v1.15, fold this into the early steps of v1.15.


v1.14.x-html — HTML artifacts in BooChat (NEW, 2026-05-22)

Goal: integrate Thariq Shihipar's "HTML > Markdown for agent output at length" pattern (claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html, May 20 2026) into BooChat. Bias the model toward HTML for outputs >100 lines: information density, visual clarity, interactive controls (sliders/knobs/SVG diagrams/side-by-side comparisons), shareability. BooChat already renders into a webview, so the surface fit is unusually good.

Scope:

  1. Model-side prompting (no code change yet, just AGENTS.md guidance):
  • Add HTML-bias rule to global AGENTS.md: "For outputs >100 lines, default to a self-contained <!DOCTYPE html>...</html> artifact unless the user explicitly asks for Markdown. For outputs <100 lines or for short conversational replies, stay in Markdown."
  • Reasoning shown in the rule: HTML carries diagrams, tabs, illustrations, code-with-syntax-highlighting, interactive controls, mobile-responsive layouts. Markdown is restrictive at any length.
  • Cite Thariq's blog post in the rule comment so future audit passes know where it came from.
  2. Detection at the BooChat backend. In apps/chat/services/inference/stream-phase.ts post-processing: detect any assistant text part starting with <!DOCTYPE html> (case-insensitive, whitespace-trimmed) — or wrapped in a fenced ```html block — and tag it as an HTML artifact. Emit a new part kind html_artifact into message_parts (CHECK constraint update). Payload: {html_content, char_count, title}. Title pulled from <title> tag or first <h1> if available.
  3. Three render targets (Sam's pick: "3 with a download"):
  • Inline preview in the chat stream: small sandboxed iframe (~400px tall), renders the artifact next to where it was streamed. Default size, click-to-expand.
  • Open in pane: button on the inline preview opens the artifact in a full-height pane in BooChat's existing workspace splitter, alongside the file viewer and BooTerm. Pane is dismissible. Pane state persisted via sessions.workspace_panes jsonb (the v1.12.1 schema already supports this).
  • Download: button writes the artifact to /opt/<project>/.boocode/artifacts/<slug>-<unix-timestamp>.html (path-guarded same as native write tools), surfaces an OS download link via the existing file-serving path. Filename slug derived from artifact title.
  1. Security stance — locked 2026-05-22: the iframe is sandboxed with sandbox="allow-scripts allow-clipboard-write allow-downloads". Crucially, omit allow-same-origin so the artifact has its own opaque origin and cannot read BooChat's cookies, Authelia session, or DOM. Backend serves the iframe content via srcdoc=... inline (not src=) so no separate URL exists to disclose. CSP header on the iframe response: default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'; img-src data: blob:; font-src data:; connect-src 'none'. The connect-src 'none' is the key clause — artifacts can't fetch(), can't open WebSockets, can't ping a tracking pixel, can't exfiltrate. JS runs (so Thariq's interactive knobs/sliders/copy-as-prompt buttons work) but nothing else network-touching does. None of Thariq's blog examples need the relaxed permissions — they're all client-side.
  2. Frontend rendering (apps/web/src/components/HtmlArtifactPart.tsx):
  • Inline preview: <iframe srcdoc={html_content} sandbox="allow-scripts allow-clipboard-write allow-downloads" className="..." /> with the strict-sandbox attributes above.
  • "Open in pane" button: dispatches workspace-pane action with {type: 'html_artifact', message_part_id, html_content}.
  • "Download" button: POST to new endpoint /api/chats/:id/artifacts/:part_id/download which writes to disk (path-guarded) and returns the absolute path or pre-signed URL for the existing static-file serving route.
  1. No artifact persistence beyond the chat. Artifacts live in message_parts.payload->>'html_content' with the chat. Downloads go to /opt/<project>/.boocode/artifacts/ and are user-managed from there. No separate artifacts table.
  2. Token-budget guard. Single artifact can be at most 1MB of HTML in message_parts.payload. Larger triggers a streaming abort with a friendly error: "Artifact exceeded 1MB; consider splitting into multiple files or reducing inline assets."
  3. No web-artifacts-builder skill vendor. That skill (anthropics/skills/web-artifacts-builder) is built for Claude.ai's runtime with Vite + Parcel + tspaths + html-inline toolchain. BooChat has no shell execution surface. The pattern transplants; the toolchain doesn't. Treat the skill's "avoid AI slop" design principles (no excessive centered layouts, no purple gradients, no uniform rounded corners, no Inter font) as conventions inlined in the HTML-bias AGENTS.md rule. The init/bundle scripts are out of scope.
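
A minimal sketch of the detection step and the 1MB guard described in the scope above, assuming the check runs on the full text of one assistant part after streaming ends. The names detectHtmlArtifact, HtmlArtifactPayload, and MAX_ARTIFACT_CHARS are hypothetical, the cap is counted in string length rather than bytes for simplicity, and the real stream-phase.ts hook may look different.

```typescript
// Hypothetical payload shape mirroring the {html_content, char_count, title} fields in the scope.
interface HtmlArtifactPayload {
  html_content: string;
  char_count: number;
  title: string | null;
}

// Assumption: the 1MB cap is approximated as string length, not encoded byte size.
const MAX_ARTIFACT_CHARS = 1000000;

// Built from pieces so this example does not contain a literal fence marker.
const FENCE = "`".repeat(3);
const FENCED_HTML = new RegExp(
  "^" + FENCE + "html\\s*\\n([\\s\\S]*?)\\n" + FENCE + "\\s*$",
  "i",
);

function stripTags(s: string): string {
  return s.replace(/<[^>]*>/g, "").trim();
}

// Returns the tag-stripped inner text of the first match, or null if absent or empty.
function firstInnerText(re: RegExp, html: string): string | null {
  const m = re.exec(html);
  const inner = m ? stripTags(m[1] ?? "") : "";
  return inner === "" ? null : inner;
}

function detectHtmlArtifact(text: string): HtmlArtifactPayload | null {
  let candidate = text.trim();
  const fenced = FENCED_HTML.exec(candidate);
  if (fenced) candidate = (fenced[1] ?? "").trim();

  // Case-insensitive doctype check on the trimmed text.
  if (!/^<!doctype html\b/i.test(candidate)) return null;

  if (candidate.length > MAX_ARTIFACT_CHARS) {
    throw new Error(
      "Artifact exceeded 1MB; consider splitting into multiple files or reducing inline assets.",
    );
  }

  // Title comes from <title>, falling back to the first <h1>.
  const title =
    firstInnerText(/<title[^>]*>([\s\S]*?)<\/title>/i, candidate) ??
    firstInnerText(/<h1[^>]*>([\s\S]*?)<\/h1>/i, candidate);

  return { html_content: candidate, char_count: candidate.length, title };
}
```

A null return means the part stays a normal text part. Regex-based title extraction is a deliberate shortcut here; a real HTML parser would be more robust against unusual markup.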

Lift sources:

  • claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html (Thariq Shihipar, May 20 2026) — the pattern, the use-case taxonomy (specs/code-review/design/reports/custom editors), the design philosophy.
  • HTML iframe sandbox spec (web platform standard, no license issues).
  • anthropics/skills/web-artifacts-builder — design-principle reference only ("avoid AI slop" rules). Do not vendor the toolchain.

Dependencies: v1.13 merged (message_parts table is where artifacts live). Independent of v1.14 (outer loop) and v1.14.x-mcp (MCP PoC). Can ship in any order relative to those.

Estimated: ~400 LoC. Roughly half backend (detection + part-kind extension + download endpoint + path-guard integration), half frontend (HtmlArtifactPart component + pane integration + download button wiring).

Schema addition:

  • message_parts.kind CHECK constraint adds 'html_artifact' to the allowed set.

Skip-condition: none — independent batch, ships clean any time after v1.13. Highest user-visible payoff of any v1.13.x/v1.14.x batch (transforms what the model can produce, not just how the backend handles it).


v1.15 — Phase D: permission ruleset + full MCP client

Goal: wildcard permission ruleset (opencode evaluate.ts pattern) and a proper MCP client implementation. Foundation for BooCoder to gate writes; immediate value for codecontext to be re-wired as a real MCP server.

Scope:

  1. Wildcard rule matcher: { permission, pattern, action: 'allow' | 'deny' | 'ask' }. Last-match-wins. Per-agent rulesets layer under per-session rulesets.
  2. Full MCP client implementation: stdio (local subprocess) + SSE (remote HTTP) transports, tools/list discovery, tools/call invocation, OAuth via Dynamic Client Registration (RFC 7591), per-server enabled flag, glob patterns for per-agent tool whitelisting (matching opencode's tools config shape).
  3. codecontext sidecar gets re-pointed from static wrappers (v1.12) to real MCP. New connectors become a config-only addition.
  4. UI: permission-ask flow when a tool requires ask action. Modal or inline card with Allow once / Allow always / Deny. Reuses v1.9.7 elicitation surface.
  5. BooChat stays read-only by default — the read-only invariant guard from v1.14.x carries forward (defense-in-depth even with the ruleset).
  6. Config shape: match opencode's JSON schema near-verbatim so any opencode user can copy mcp blocks from ~/.opencode/config.json into BooCode unchanged. Schema is not copyrightable; matching it is pure interoperability.
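
The last-match-wins rule evaluation in item 1 can be sketched as follows. This is an illustration of the idea only, not the opencode evaluate.ts code: the evaluate signature, the "ask" default when nothing matches, and the wildcard syntax ("*" for any run of characters, "?" for one character) are all assumptions.

```typescript
type Action = "allow" | "deny" | "ask";

interface Rule {
  permission: string; // e.g. "edit"; wildcards allowed
  pattern: string;    // e.g. "src/*"; wildcards allowed
  action: Action;
}

// Turn a wildcard string into an anchored RegExp.
function wildcardToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Rulesets are passed lowest-priority first (agent rules, then session rules),
// so a later matching rule overrides any earlier one.
function evaluate(permission: string, target: string, ...rulesets: Rule[][]): Action {
  let result: Action = "ask"; // assumed default when no rule matches
  for (const ruleset of rulesets) {
    for (const rule of ruleset) {
      const hit =
        wildcardToRegExp(rule.permission).test(permission) &&
        wildcardToRegExp(rule.pattern).test(target);
      if (hit) result = rule.action;
    }
  }
  return result;
}

// Hypothetical rulesets for illustration.
const agentRules: Rule[] = [
  { permission: "edit", pattern: "*", action: "deny" },
  { permission: "edit", pattern: "src/*", action: "allow" },
];
const sessionRules: Rule[] = [
  { permission: "*", pattern: "src/secrets/*", action: "ask" },
];
```

Passing the agent ruleset before the session ruleset is one way to read "per-agent rulesets layer under per-session rulesets": the session rules are evaluated last and therefore win on conflict.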

v1 MCP scope limit (security): local-stdio MCP servers and Context7-style API-key remote servers only. Remote MCP servers requiring OAuth tokens are deferred until BooCode has a real secret-storage primitive (sops-encrypted entries, Vault sidecar, or OS keyring). Reason: MCP OAuth tokens are bearer credentials for third-party services; storing them in plaintext PostgreSQL inside the BooCode DB widens the attack surface significantly if Authelia is bypassed. v1.15 ships the OAuth code path but the config schema rejects OAuth servers until secret storage lands.

Absorbs: Original Batch 12 (tool approval + plan/act mode) — same outcome via permission rules instead of mode enum.

Lift sources:

  • anomalyco/opencode permission/evaluate.ts wildcard ruleset
  • anomalyco/opencode mcp/index.ts MCP client (SSE transport, tools/list, tools/call, OAuth RFC 7591)
  • cline/cline plan/act invariant — read-only mode pattern (absorbed)

Dependencies: v1.13 merged (parts table for permission events). Independent of v1.14.

Estimated: ~600 LoC.


v1.16 — codesight repo_health

Call graph, circular dependency detection, dead code flagging. Port analyze.mjs from spirituslab/codesight. New tool repo_health(project_id). In-process Node (not sidecar). Cache results keyed by (project_id, file_hashes_sig) in new repo_health_cache table.
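
For orientation, circular dependency detection over an import graph can be done with a depth-first search that tracks the current path. The sketch below is a generic version under assumed types (DepGraph maps a module to the modules it imports); it is not the codesight analyze.mjs algorithm, and it reports one cycle per back edge rather than every elementary cycle.

```typescript
// Hypothetical input shape: module name -> modules it imports.
type DepGraph = Record<string, string[]>;

function findCycles(graph: DepGraph): string[][] {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];
  const cycles: string[][] = [];

  function visit(node: string): void {
    state.set(node, "visiting");
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      const seen = state.get(dep);
      if (seen === "visiting") {
        // Back edge: dep is on the current path, so the path from dep closes a cycle.
        cycles.push([...stack.slice(stack.indexOf(dep)), dep]);
      } else if (seen === undefined) {
        visit(dep);
      }
    }
    stack.pop();
    state.set(node, "done");
  }

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) visit(node);
  }
  return cycles;
}
```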

Independent batch — ships clean any time after v1.13. Low leverage unless Sam actually uses the dead-code / circular-dep output.

Lift source: spirituslab/codesight analyze.mjs. Drop VS Code wrapper.

Dependencies: v1.12 merged (can reuse codecontext parse output where overlapping).

Estimated: ~400 LoC.


v2.0 — BooCoder: pending changes + dual execution paths + ACP host + MCP server

Major version bump. New app apps/coder/ inside the existing monorepo (not a separate repo). Lands together with the boocode_db → boochat_db DB rename and the per-app subdomain split (code.indifferentketchup.com → BooChat, coder.indifferentketchup.com → BooCoder).

Three protocol roles in one surface:

  1. MCP client (write-capable allowed). Inherits the v1.15 client unchanged. BooCoder can enable write-capable MCP servers (@modelcontextprotocol/server-filesystem write tools, git commit MCP servers, etc.). All MCP writes route through the same pending_changes queue as native writes. Per-task allow/deny means dispatched tasks can have a different MCP roster than the interactive shell.
  2. MCP server (BooCoder's own primitives). New apps/coder/services/mcp_server.ts exposes boocoder.create_task, boocoder.list_pending_changes, boocoder.apply, boocoder.reject, boocoder.dispatch_external_agent, boocoder.list_worktrees as MCP tools. Stdio transport for local consumers (Sam's opencode in Termius), HTTP for remote (deferred until OAuth + secret storage). This is what makes external opencode-on-the-host BooCoder-aware.
  3. ACP client (host). Replaces the raw-PTY dispatch path for ACP-capable agents. Spawns opencode acp and goose acp as JSON-RPC stdio subprocesses. Native session lifecycle, mid-session model/mode switching, file-operation events surfaced as diffs in the BooCoder UI, terminal events that route into BooTerm, permission prompts answered via real dialogs. MCP servers configured in BooCoder are auto-forwarded to the dispatched ACP agent (per goose docs — context_servers is the field name). One MCP config drives every dispatched agent.

Two execution paths, same surface (the answer to the May 18 "1 and 2 full featured" question):

Path A — in-process write-tool inference loop (Option B / native)

  • New write tools: edit_file, create_file, delete_file, apply_pending, rewind.
  • Edits queue in pending_changes (id, session_id, file_path, diff TEXT, status, created_at). Nothing touches disk until /apply.
  • Per-pane diff UI with Approve/Reject.
  • Path-guard layer (apps/coder/services/path_guard.ts) enforces per-project scoping using the v1.15 permission wildcard ruleset. Blanket /opt:rw mount, policy at the tool layer. Highest-priority test target: fuzz the path-guard against every traversal-attack pattern, including MCP-served filesystem writes.
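
A minimal sketch of the per-project scoping check the path-guard layer needs, assuming POSIX paths and a known project root. guardPath and normalizePosix are hypothetical names. The sketch is purely lexical: it does not resolve symlinks, so a production guard would also need a realpath check, and it does not consult the v1.15 ruleset.

```typescript
// Collapse ".", "..", and repeated slashes without touching the filesystem.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// Returns the normalized absolute path, or throws if it falls outside projectRoot.
function guardPath(projectRoot: string, requested: string): string {
  if (requested.includes("\0")) throw new Error("path contains NUL byte");
  const root = normalizePosix(projectRoot);
  const absolute = requested.startsWith("/") ? requested : root + "/" + requested;
  const resolved = normalizePosix(absolute);
  // Compare against root + "/" so a sibling like /opt/proj-evil does not pass as /opt/proj.
  if (resolved !== root && !resolved.startsWith(root + "/")) {
    throw new Error(`path escapes project root: ${requested}`);
  }
  return resolved;
}
```

The cases a fuzzer should cover start with the ones this handles (relative "..", absolute paths outside the root, prefix-sibling directories) and extend to the ones it does not (symlinks, encoded separators).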

Lift source: plandex-ai/plandex pending-changes data model and diff/apply/rewind UX vocabulary.

Path B — ACP/PTY dispatch to external CLI agents (Option A / dispatch)

  • New tool dispatch_external_agent(agent: 'opencode'|'claude'|'goose'|'pi', model: string, task: string, worktree: string).
  • Primary path: ACP subprocess for agents that support it (opencode via opencode acp, goose via goose acp). JSON-RPC over stdio. Native session/tool/file/terminal events.
  • Fallback path: raw PTY for claude/pi/smallcode via node-pty with cwd = /opt/<project> or a git worktree add /tmp/booworktrees/<session-id> worktree per dispatch.
  • Dispatch worker checks available_agents.supports_acp at runtime and picks the right transport. Same task table, same project registry, same pending-changes flow.
  • Captures stdout/stderr/exit-code into PostgreSQL stream tables (PTY path) or maps ACP events to the parts taxonomy (ACP path). WebSocket events surface to all three React surfaces.
  • One worktree per active dispatched session.
  • User picks per task via UI dropdown at task creation, or the in-process loop calls dispatch_external_agent itself.
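
The runtime transport choice described above (ACP when available_agents.supports_acp is set, raw PTY otherwise) might look like the following. The Transport type and pickTransport function are hypothetical; only the supports_acp column and the "acp" subcommand come from this roadmap.

```typescript
// Subset of the available_agents columns relevant to dispatch.
interface AvailableAgent {
  name: string;
  install_path: string;
  supports_acp: boolean;
}

// Hypothetical description of how the dispatcher should launch the agent.
type Transport =
  | { kind: "acp"; command: string; args: string[]; cwd: string }
  | { kind: "pty"; command: string; args: string[]; cwd: string };

function pickTransport(agent: AvailableAgent, worktree: string): Transport {
  if (agent.supports_acp) {
    // ACP-capable agents are started as "<agent> acp" and spoken to over stdio JSON-RPC.
    return { kind: "acp", command: agent.install_path, args: ["acp"], cwd: worktree };
  }
  // Everything else falls back to a raw PTY session in the same worktree.
  return { kind: "pty", command: agent.install_path, args: [], cwd: worktree };
}
```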

Lift sources:

  • Dominic789654/agent-hub (Apache-2.0) — task DAG schema, dispatcher worker, project registry, human inbox. Primary architectural template.
  • getpaseo/paseo (AGPL-3.0, design only — no code lift) — daemon+clients architecture, --worktree feature-x flag, paseo run/ls/attach/send CLI verb shape, /handoff /loop /orchestrator skills concept.
  • Roo Code Boomerang Tasks pattern — orchestrator capability restriction + down-pass/up-pass context discipline (new_task message, attempt_completion result, no implicit inheritance) + explicit precedence override clause.
  • covibes/zeroshot blind-validation invariant — verify gate runs in separate agent context that only sees the diff and acceptance criteria, not the producing conversation.
  • ACP spec (agentclientprotocol.com) — local-subprocess ACP via stdio JSON-RPC. Remote ACP (HTTP/WS) is still work-in-progress per the spec maintainers; v2.0 uses stdio only.
  • Goose ACP docs (goose-docs.ai/docs/guides/acp-clients/) — context_servers auto-forward pattern. Critical: one MCP config drives every dispatched agent.

Shared infrastructure between A and B

  • tasks table (id, project_id, template_id, parent_task_id, state, input, output_summary, dependencies, agent, model, worktree_path, cost, started_at, ended_at)
  • task_templates table (reusable spec → task instantiations)
  • pipelines table + pipeline_runs (ordered template invocations)
  • available_agents table (name, install_path, version, supports_acp, supports_mcp_client, last_probed_at) — populated by startup probe (which opencode && opencode --version, etc.)
  • human_inbox view (state IN ('blocked', 'failed', 'needs_human'))
  • Worker process boocoder-dispatcher (systemd unit alongside Fastify): picks ready tasks, dispatches via A or B (and within B, ACP or PTY), captures output, marks state.
  • New boocode CLI as a thin WebSocket/HTTP client against the BooCoder API. Verbs: boocode run, boocode ls, boocode attach <id>, boocode send <id>. Mirrors Paseo's UX, license-clean implementation.
  • BooCoder-internal MCP server (see role 2 above) registered on the Fastify server alongside the existing HTTP/WS endpoints. Stdio transport for opencode-in-Termius; HTTP transport gated on OAuth + secret storage.
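
How the dispatcher could pick ready tasks from the DAG, and what the human_inbox view selects, sketched over in-memory rows. The "pending", "running", and "done" state names are assumptions; "blocked", "failed", and "needs_human" are the states named for human_inbox above. The real worker would presumably do this in SQL.

```typescript
type TaskState = "pending" | "running" | "done" | "blocked" | "failed" | "needs_human";

// Subset of the tasks columns needed for scheduling.
interface Task {
  id: string;
  state: TaskState;
  dependencies: string[]; // ids of tasks that must finish first
}

// A task is ready when it has not started and every dependency is done.
function readyTasks(tasks: Task[]): Task[] {
  const done = new Set(tasks.filter((t) => t.state === "done").map((t) => t.id));
  return tasks.filter(
    (t) => t.state === "pending" && t.dependencies.every((d) => done.has(d)),
  );
}

// Mirrors: state IN ('blocked', 'failed', 'needs_human').
function humanInbox(tasks: Task[]): Task[] {
  return tasks.filter(
    (t) => t.state === "blocked" || t.state === "failed" || t.state === "needs_human",
  );
}
```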

MCP server eval requirement: run BooCoder's internal MCP server through the anthropics mcp-builder skill's 10-question evaluation framework before shipping. Ten independent, read-only, complex questions with verifiable answers in XML format. If the eval doesn't pass, the MCP server isn't shippable.

Dependencies: v1.13 (parts table) + v1.14 (outer loop + step boundaries for revert snapshots) + v1.14.x (MCP-client PoC) + v1.15 (full MCP client + permissions for path-guard policy).

Estimated: ~1500 LoC for Path A + Path B + shared schema, plus ~400 LoC for the MCP-server role, plus ~300 LoC for the ACP-client role. Multiple sub-versions: v2.0.0 native + ACP, v2.0.1 MCP server, v2.0.2 polish.


v2.1 — BooCoder runtime isolation (optional)

Per-session Docker sandbox spawned by BooCoder on first write. Only project path mounted, not /opt. Idle-timeout 30 min. Standard OpenHands runtime contract: HTTP API inside container, BooCoder calls in.

Skip-condition: if the v2.0 path-guard layer holds up under fuzzing + a few months of production use, runtime isolation becomes optional hardening rather than necessary defense. Track but don't commit.

Lift source: OpenHands/OpenHands V1 runtime pattern.

Dependencies: v2.0.

Estimated: ~600 LoC.


v2.2 — BooCoder as ACP agent (driveable from external editors)

Goal: expose boocoder acp so Zed, JetBrains, Avante.nvim, CodeCompanion.nvim can drive BooCoder as their agent. Outbound exposure of the BooCoder write-tool surface to ACP-compatible editors.

Scope:

  1. New ACP server entry point: boocoder acp reads JSON-RPC over stdio, exposes BooCoder's task primitives as ACP sessions.
  2. BooCoder UI features remain optional: editor drives session via ACP; pending-changes queue still gates writes; user can approve/reject from either BooCoder's web UI or the editor's permission dialog (whichever responds first).
  3. Same auth model as the rest of BooCoder — editor must be reachable on the Tailscale mesh, or BooCoder is invoked with a short-lived token.

Why this is v2.2, not v2.0: outbound ACP-agent role is cheap once the inbound ACP-client side is implemented (same protocol library, server side), but it's a different product surface — driving BooCoder from external editors. Ship it after BooCoder's own surface stabilizes.

Lift source: zed-industries/codex-acp (Apache-2.0) as a server-side ACP reference implementation.

Dependencies: v2.0 + v2.1 (recommended; ACP-driven sessions inside a sandbox are stronger).

Estimated: ~400 LoC.


v2.x — Optional / far future

  • Verify gate above pending-changes: augmentcode/augment-swebench-agent majority-vote ensembler pattern (K candidate diffs → ranker model picks winner). JSONL schema only, no code lift. Combine with zeroshot blind-validation invariant. v2.0+ optional batch.
  • PR-resolver tool: qodo-ai/qodo-skills PR-resolver state machine (fetch issues → batch/interactive fix → inline reply). BooCoder v2.0+.
  • Record/replay LLM harness for tests: qodo-ai/qodo-cover pattern (hashed prompt → fixture YAML). Re-implement in Vitest, don't vendor (AGPL). v1.13+ test infrastructure.
  • HMAC-chained audit log: sipyourdrink-ltd/bernstein pattern. Small lift, adds tamper-evident session history. v1.13+ optional.
  • Tiered tool loading: eyaltoledano/claude-task-master pattern (env var: core / standard / all). ~30 LoC in agents.ts. Pattern-only lift (claude-task-master is MIT + Commons Clause; reimplement). v1.13.x or v1.14.
  • Spec directory structure: Fission-AI/OpenSpec openspec/changes/<name>/{proposal,specs,design,tasks}.md shape for BooCode's own batch docs. Zero-dep documentation reformat, replaces ad-hoc boocode_batchN.md convention. v1.13.x or v1.14.
  • view_session_history MCP tool: memovai/memov snap/mem_history/validate_commit shape. Reference design for v1.13+ session-history feature.
  • taste-skill anti-slop ban list — vendor Leonxlnx/taste-skill SKILL.md after diff against existing frontend-design skill. Real value at v2.0+ when BooCoder generates frontend code (DubDrive, BooLab, Fathom).
  • AgentLint audit pass: manual review of BooCode's own CLAUDE.md/AGENTS.md/BOOCHAT.md/BOOCODER.md using 0xmariowu/AgentLint's 31 evidence-backed checks. Trim emphasis-keyword density, hit the 60-120 line sweet spot, SHA-pin Actions, ensure .env/CLAUDE.local.md are gitignored. One-evening pass, immediate ROI. Optional plugin install at v1.12.x post-merge for ongoing audits.
  • budi install (Sam's host): siropkin/budi Claude Code 5-hook observer (SessionStart/UserPromptSubmit/PostToolUse/SubagentStart/Stop). Local SQLite, sub-ms hook latency, dashboard at localhost:7878. Not a BooCode lift; install globally for Claude Code session observability.
  • Multi-provider LLM (pi-ai pattern): Only if a concrete need for Anthropic / OpenAI / Mistral direct surfaces. llama-swap covers everything today.
  • Workflow graphs (microsoft/agent-framework concepts): Multi-agent coordination. Conceptual reference only. Realistically a v3.x topic.
  • Secret storage primitive (prerequisite for remote OAuth MCP servers). Pick between: sops-encrypted entries in PostgreSQL, HashiCorp Vault sidecar, or OS-level keyring on ubuntu-homelab accessed via a thin service. Unblocks remote OAuth MCP servers in BooCode generally. v2.x or earlier if a remote OAuth server (Sentry, Atlassian, etc.) becomes urgent.
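
As an illustration of the tiered tool loading item in the list above, a tier filter could be as small as the sketch below. The tier field on ToolDef, the fallback to "all" for unset or unknown values, and the example tool names other than view_file are assumptions; the env var lookup is left to the caller so the example stays runtime-agnostic.

```typescript
type Tier = "core" | "standard" | "all";

// Hypothetical: each tool declares the lowest tier that includes it.
interface TieredTool {
  name: string;
  tier: Tier;
}

const TIER_RANK: Record<Tier, number> = { core: 0, standard: 1, all: 2 };

// Parse the raw env var value; unknown or missing values fall back to "all" (assumption).
function parseTier(raw: string | undefined): Tier {
  return raw === "core" || raw === "standard" || raw === "all" ? raw : "all";
}

// Keep every tool whose declared tier is at or below the requested tier.
function toolsForTier(tools: TieredTool[], tier: Tier): TieredTool[] {
  return tools.filter((t) => TIER_RANK[t.tier] <= TIER_RANK[tier]);
}

// Example registry; names other than view_file are made up for illustration.
const exampleTools: TieredTool[] = [
  { name: "view_file", tier: "core" },
  { name: "search_code", tier: "standard" },
  { name: "repo_health", tier: "all" },
];
```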

Architecture target state

Containers (post-v2.0)

| Container | Port | Mount | Purpose | Status |
| --- | --- | --- | --- | --- |
| boochat (was boocode) | 100.114.205.53:9500 | /opt:/opt:ro | Read-only chat + SPA host + MCP client | Live (renames at v2.0) |
| booterm | 100.114.205.53:9501 | /opt:/opt | PTY/tmux terminal sessions | Live (May 2026) |
| boocoder | 100.114.205.53:9502 | /opt:/opt:rw (policy-gated) | Write tools + ACP host + MCP client + MCP server + external-CLI dispatch | v2.0 |
| boochat_db (was boocode_db) | 127.0.0.1:5500 | boocode_pgdata volume | Postgres 16-alpine (shared by all three) | Live (renames at v2.0) |
| codecontext | :8765 (internal) | /opt/projects:/workspace:ro | MCP server for architect tools | Live (v1.12.0) |

Caddy routing target (post-v2.0)

code.indifferentketchup.com         → boochat   :9500   (SPA + chat API + MCP client)
coder.indifferentketchup.com        → boocoder  :9502   (SPA + write API + MCP client + MCP server HTTP)
coder.indifferentketchup.com/mcp    → boocoder  :9502   (BooCoder MCP server endpoint, when remote-MCP unlocked)
term.indifferentketchup.com         → booterm   :9501   (or routed under code.*/term/)

Schema additions by version

  • v1.11.0: messages.compacted_at, messages.summary, messages.tail_start_id, chats.needs_compaction
  • v1.11.7: none (pathGuard logic, no DB)
  • v1.12.0: none (codecontext stateless; truncation in-memory id-map with TTL cleanup)
  • v1.12.1: sessions.workspace_panes jsonb (workspace sync); drop deprecated session_panes table; drop stale messages_status_check constraint
  • v1.13.0-ai-sdk-v6: message_parts (id, message_id, sequence, kind, payload jsonb, created_at) + unique (message_id, sequence) + kind CHECK; messages_with_parts view with COALESCE fallbacks; ToolDef.category field (TS type, not DB)
  • v1.13.1-cleanup-bundle: ALTER DATABASE boocode SET statement_timeout = '30s' (op step, documented in schema.sql; doesn't survive volume reset)
  • v1.13.2-compaction-prune: message_parts.hidden_at TIMESTAMPTZ column + partial index (message_id) WHERE hidden_at IS NULL; messages_with_parts view filters hidden parts
  • v1.13.3-truncate: none (tmpfs id-map stored on disk under BOOCODE_TRUNCATION_DIR; no schema)
  • v1.13.4-reasoning-fix: none (compaction read-side change; CompactionMessage extended in TS, not DB)
  • v1.13.5-stability-bundle: none (provider config + 4 frontend/payload guards + budget constant, no schema change)
  • v1.13.6-prefix-stability: none — verify-and-measure batch, instrumentation only; drops the originally-planned system_prompt_cache table since recon proved input-layer mtime caches already achieve prefix stability
  • v1.13.7-compaction-trigger: none (compaction overflow trigger is a constant change in services/compaction.ts, no DB)
  • v1.13.8-tool-cost: tool_cost_stats SQL view over messages_with_parts (no new table — view + LATERAL jsonb_array_elements on tool_calls); rolling 100-call window
  • v1.13.9-agentlint: none (instruction-file audit + .gitignore add of CLAUDE.local.md, no DB)
  • v1.13.10-openspec: none (docs reorganization, git mv only)
  • v1.13.11-tools: none (env-var tier filter at request time, no DB)
  • v1.13.12-ws-schemas: none (Zod schemas + wrappers in TS, no DB)
  • v1.13.13-ws-publish: none (publish-site conversion + protocol-drift fix in compaction.ts, no DB)
  • v1.13.14-skills-audit: none (skills + AGENTS.md migration into git via .gitignore negation patterns; no DB)
  • v1.13.15-codecontext-synth (this batch, tag pending): message_parts.kind CHECK constraint extended with 'synthesis' value (DROP + DO $$ pg_constraint idempotency-guarded re-add)
  • (column drop, pending — old working name v1.13.2): drop messages.tool_calls, messages.tool_results; simplify messages_with_parts view
  • v1.14: agents.steps column (or AGENTS.md parser extension; no DB if file-only)
  • v1.14.x-mcp (NEW): none — single-server MCP-client PoC is config-only at first, no schema change
  • v1.14.x-html (NEW): message_parts.kind CHECK constraint extended with 'html_artifact' value
  • v1.15: permissions table, agent_permissions join, session_permissions join, mcp_servers (name, type, transport, url_or_command, enabled, config_hash, last_probed_at) registry
  • v1.16: repo_health_cache (project_id, file_hashes_sig, payload JSONB, created_at)
  • v2.0: pending_changes (id, session_id, file_path, diff TEXT, status, created_at); tasks, task_templates, pipelines, pipeline_runs; available_agents (name, install_path, version, supports_acp, supports_mcp_client, last_probed_at); human_inbox view; DB rename boocode_db → boochat_db
  • v2.2: none (boocoder acp is a new entry point, not a schema change)

Lift sources (headline table)

Full inventory and rationale in boocode_code_review.md. Headline items below; anomalyco/opencode is canonical (not sst/opencode — correction 2026-05-22).

| Source | License | Used for | Where |
| --- | --- | --- | --- |
| anomalyco/opencode | MIT, TS | Compaction algorithms (session/compaction.ts + session/overflow.ts) | v1.11.0 |
| anomalyco/opencode | MIT, TS | Doom-loop guard (session/processor.ts DOOM_LOOP_THRESHOLD=3) | v1.11.6 |
| continuedev/continue | Apache-2.0 | DEFAULT_SECURITY_IGNORE_FILETYPES | v1.11.7 |
| nmakod/codecontext | MIT, Go | Architect: codebase map sidecar (8 MCP-shaped tools, static-wrapped) | v1.12.0 |
| anomalyco/opencode | MIT, TS | AI SDK v6 adoption + streamText swap + ReasoningPart shape | v1.13.1 |
| anomalyco/opencode | MIT, TS | Parts-message taxonomy (text/tool_call/tool_result/reasoning/step_start) | v1.13.0 |
| anomalyco/opencode | MIT, TS | experimental_repairToolCall via AI SDK v6 | v1.13.3 |
| anomalyco/opencode | MIT, TS | Two-tier compaction prune (message_parts.hidden_at + tier logic) | v1.13.4 |
| anomalyco/opencode | MIT, TS | tool/truncate.ts truncation + outputPath pattern (adapted: opaque id) | v1.13.5 |
| anomalyco/opencode | MIT, TS | 0.85×ctx_max overflow trigger formula | v1.13.9 (planned) |
| anomalyco/opencode | MIT, TS | session/prompt.ts runLoop() outer agent loop + agent.steps cap | v1.14 |
| Anthropic MCP SDK (TypeScript) | MIT | MCP client, single-server PoC | v1.14.x-mcp |
| claude.com/blog/using-claude-code-the-unreasonable-effectiveness-of-html | (blog, pattern only) | HTML-output bias rule + use-case taxonomy | v1.14.x-html |
| anthropics/skills/web-artifacts-builder | MIT (design-principle reference) | "Avoid AI slop" conventions inline in AGENTS.md | v1.14.x-html |
| mgechev/skills-best-practices | MIT (pattern) | 4-step skill validation protocol with paste-ready prompts | v1.13.12 (skills audit) |
| mgechev/skillgrade | MIT | Agent-agnostic skill eval framework (eval.yaml + smoke/reliable/regression presets) | v1.13.12 (skills audit) + ongoing |
| blog.codeminer42.com/stop-putting-best-practices-in-skills/ | (blog, pattern only) | Rules→recipes split: skills 6% invoke vs AGENTS.md 100% present | v1.13.12 (skills audit) |
| platform.claude.com/docs/.../agent-skills/best-practices | (docs, canonical) | 500-line ceiling, gerund naming, progressive-disclosure patterns, MCP ServerName:tool_name format | v1.13.12 + all future skills |
| anomalyco/opencode | MIT, TS | permission/evaluate.ts wildcard ruleset | v1.15 |
| anomalyco/opencode | MIT, TS | mcp/index.ts MCP client (stdio + SSE, tools/list, tools/call, OAuth RFC 7591) | v1.15 |
| Aider-AI/aider | Apache-2.0 | Fallback aider/queries/tree-sitter-*.scm grammars | v1.12 (fallback) |
| cline/cline | Apache-2.0 | Plan/Act invariant (absorbed into v1.15 permissions) | v1.15 |
| spirituslab/codesight | MIT-ish | Repo health analyzer (analyze.mjs) | v1.16 |
| plandex-ai/plandex | MIT | Pending-changes data model + diff/apply/rewind UX | v2.0 |
| Dominic789654/agent-hub | Apache-2.0 | Task DAG schema, dispatcher worker, project registry, human inbox; primary architectural template for v2.0 dispatcher | v2.0 |
| getpaseo/paseo | AGPL-3.0 (design only, no code lift) | Daemon+clients arch, CLI verb shape, worktree flag, three skills concept | v2.0 / v2.x |
| agentclientprotocol.com spec + @zed-industries/agent-client-protocol SDK | Apache-2.0 | ACP client (host): replaces raw-PTY dispatch for opencode/goose | v2.0 |
| anthropics/skills mcp-builder | MIT | MCP server build workflow + 10-question evaluation framework | v2.0 (BooCoder MCP server) |
| zed-industries/codex-acp | Apache-2.0 | ACP server-side reference for boocoder acp | v2.2 |
| Roo Code: Boomerang Tasks | Apache-2.0 (pattern only) | Orchestrator capability restriction + down-pass/up-pass context discipline | v1.14 (AGENTS.md) → v2.0 (real delegation) |
| covibes/zeroshot | MIT (pattern only) | Blind-validation invariant + complexity-classification conductor | v1.14 (AGENTS.md) → v2.0 (verify gate) |
| OpenHands/OpenHands | MIT | Sandbox runtime contract | v2.1 |
| qodo-ai/agents | MIT | agent.toml schema (output_schema, exit_expression, execution_strategy) | v1.14 |
| qodo-ai/qodo-cover | AGPL-3.0 (re-implement, don't vendor) | Record/replay LLM response harness | v1.13+ tests |
| qodo-ai/qodo-skills | MIT | PR-resolver state machine + provider-CLI adapter pattern | v2.0+ |
| augmentcode/augment-swebench-agent | MIT | Majority-vote ensembler (K diffs → ranker → winner) + JSONL schema | v2.0+ optional |
| eyaltoledano/claude-task-master | MIT+Commons Clause (pattern only) | Tiered tool loading via env var + three model roles | v1.13.x / v1.14 |
| Fission-AI/OpenSpec | permissive (verify) | openspec/changes/<name>/{proposal,specs,design,tasks}.md structure for batch docs | v1.13.x / v1.14 |
| 0xmariowu/AgentLint | MIT | 31 evidence-backed checks for CLAUDE.md/AGENTS.md quality | Immediate manual pass; v1.12.x optional plugin |
| Leonxlnx/taste-skill | MIT | Anti-slop ban list + 3-dial parameterization pattern | v2.0+ (BooCoder frontend output) |
| RA.Aid (ai-christianson) | Apache-2.0 (pattern only) | Three-stage Research/Planning/Implementation + expert-tool escape hatch | v1.14 (AGENTS.md) |
| memovai/memov | MIT (pattern only) | .mem shadow timeline + snap/validate_commit MCP tool shape | v1.13+ history tool design; v2.0+ drift gate |
| sipyourdrink-ltd/bernstein | (verify) | HMAC-chained audit log primitive | v1.13+ optional |
| aimasteracc/tree-sitter-analyzer | MIT | Outline-first patterns (trace_impact tool) | v1.12 (alt) / unscheduled |
| earendil-works/pi | MIT | Multi-provider LLM (pi-ai) | v2.x (optional) |
| siropkin/budi (tooling, not lift) | MIT | Claude Code 5-hook observer for Sam's host workflow | Immediate (install globally) |
| aaif-goose/goose | Apache-2.0 | ACP agent (goose acp), dispatched alongside opencode in v2.0 Path B | v2.0 (host install) |

Decisions log

  • v1.13.7 stability bundle (2026-05-22, uncommitted). Five-fix sweep during the cosmetic-revert investigation surfaced two production-affecting regressions latent since v1.13.1-A. (1) @ai-sdk/openai-compatible includeUsage defaults to false: provider.ts never asked llama-swap to emit usage, so tokens_used/ctx_used had been NULL in every assistant row since v1.13.1-A. The fix is one line at provider.ts:18. No backfill for historical rows. (2) AI SDK v6 streaming emits a stray \n text-delta on tool-call-only turns, which passed content.length > 0 and rendered an empty bubble + ActionRow between each tool call. Trim in MessageList.flatten (hasText) and defensively in MessageBubble (hasContent). (3) buildMessagesPayload did not filter trailing empty or failed assistant rows; combined with (2), a Continue retry produced …summary-assistant, empty-assistant, failed-assistant payloads and the upstream rejected with "Cannot have 2 or more assistant messages at the end of the list." Skip rules added at payload.ts:64. (4) BUDGET_NO_AGENT bumped 15→30. Every tool in ALL_TOOLS is read-only today; the cautious 15-cap was forward-looking for write tools that haven't landed. No-agent mode now matches BUDGET_READ_ONLY. None of the five changes touch schema or compaction; they're cleanup against a "v1.13.1-A regression that hadn't been caught yet" surface.
  • Skills taxonomy locked: AGENTS.md = rules, skills = recipes (2026-05-22). Codeminer42's multi-turn eval showed plain skills invoke 6% in clean runs vs CLAUDE.md/AGENTS.md 100% present. General workflow rules (TDD, paraphrase-before-quote, security gotchas, "never git pull/commit/push", alpha-tool-ordering, codecontext-not-RAG) belong in AGENTS.md; specific on-demand procedures (/skill scaffold-component, /skill run-release-checklist) belong in skills. Hooks are for automation, not instruction delivery. The 7 vendored v1.12 skills get an audit pass in v1.13.12 to sort each into the 4-way split (move to AGENTS.md / keep as recipe / move bulky context to references/ / delete). Validation via mgechev/skills-best-practices 4-step protocol + mgechev/skillgrade --smoke per skill. Anthropic's agent-skills/best-practices page becomes the canonical convention reference (500-line ceiling, gerund naming, MCP ServerName:tool_name format, progressive disclosure one level deep, etc.). Documented in BOOCHAT.md / BOOCODER.md to future-proof against re-adding workflow rules as skills.
  • HTML artifacts in BooChat locked (2026-05-22). Adopt Thariq Shihipar's "HTML > Markdown for outputs >100 lines" pattern. AGENTS.md gets the HTML-bias rule. Backend detection emits new html_artifact part kind. Frontend renders in three places: inline iframe preview in chat stream, "open in pane" workspace splitter integration, and download to /opt/<project>/.boocode/artifacts/<slug>-<timestamp>.html. Security: sandbox="allow-scripts allow-clipboard-write allow-downloads" with no allow-same-origin, CSP connect-src 'none', srcdoc= inline (not src=). All of Thariq's interactive examples (sliders/knobs/SVG diagrams/copy-as-JSON) work under this sandbox because they're entirely client-side. Don't vendor anthropics/skills/web-artifacts-builder — its Vite + Parcel toolchain can't run in BooChat (no shell). Treat the skill's "avoid AI slop" rules as design conventions inlined in AGENTS.md.
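
The locked sandbox and CSP values can be kept as data and serialized, with a guard that refuses allow-same-origin. This is a sketch with hypothetical names (ARTIFACT_SANDBOX, ARTIFACT_CSP, sandboxAttr, serializeCsp), not existing BooChat code. Two points are worth verifying before shipping: allow-clipboard-write does not appear to be a sandbox keyword in the HTML Standard (clipboard access is normally delegated through the iframe allow attribute), so browsers may ignore it; and with srcdoc there is no separate response to carry a header, so how the policy reaches the artifact is left open here.

```typescript
// Sandbox tokens as locked in the decision above.
const ARTIFACT_SANDBOX: readonly string[] = [
  "allow-scripts",
  "allow-clipboard-write",
  "allow-downloads",
];

// CSP directives as locked in the decision above, in the same order.
const ARTIFACT_CSP: ReadonlyArray<[string, string[]]> = [
  ["default-src", ["'none'"]],
  ["script-src", ["'unsafe-inline'"]],
  ["style-src", ["'unsafe-inline'"]],
  ["img-src", ["data:", "blob:"]],
  ["font-src", ["data:"]],
  ["connect-src", ["'none'"]],
];

// Build the sandbox attribute value, refusing the one token that would break isolation.
function sandboxAttr(tokens: readonly string[]): string {
  if (tokens.indexOf("allow-same-origin") !== -1) {
    throw new Error("allow-same-origin must never be granted to artifacts");
  }
  return tokens.join(" ");
}

// Serialize directives into a single policy string.
function serializeCsp(policy: ReadonlyArray<[string, string[]]>): string {
  return policy.map(([name, values]) => `${name} ${values.join(" ")}`).join("; ");
}
```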

MCP and ACP protocol roles per surface (2026-05-22, locked)

  • BooChat = MCP client only. Read-only tool consumer. Per-server enabled flag. Hard rule: never enable a write-capable MCP server — the read-only invariant overrides protocol convenience. Defense-in-depth: client must reject any tool whose annotations.readOnly is false or absent.
  • BooCoder = MCP client + MCP server + ACP client (host) + ACP agent (driveable). Full matrix.
    • MCP client role: inherits v1.15 client; write-capable servers allowed but writes route through pending_changes queue.
    • MCP server role: BooCoder exposes its own task primitives (boocoder.create_task etc.) so external opencode sessions in Termius become BooCoder-aware. Stdio for local, HTTP gated on OAuth+secret storage.
    • ACP client (host) role: replaces raw-PTY dispatch for ACP-capable agents (opencode, goose). PTY retained as fallback for claude/pi/smallcode. Critical pattern: ACP clients auto-forward MCP context_servers to the dispatched agent (per goose docs) — one MCP config drives every dispatched agent.
    • ACP agent role: boocoder acp exposes BooCoder to Zed/JetBrains/Avante.nvim. Deferred to v2.2.
  • Why BooChat doesn't get ACP: ACP standardizes the editor→agent direction. BooChat doesn't drive agents; it is the chat. Adding ACP-agent to BooChat would convert it into an opencode-equivalent — different product. Skip.
  • MCP/ACP integration phasing: v1.14.x (single-server MCP-client PoC against Context7) → v1.15 (full MCP client + permissions) → v2.0 (BooCoder full matrix: write-capable MCP client + MCP server + ACP client) → v2.2 (BooCoder ACP agent for external editor drive).
  • Reference materials: anthropics mcp-builder skill (4-phase build workflow + 10-question eval framework — required for BooCoder's MCP server before shipping), opencode MCP/ACP docs as JSON-schema interop reference, goose ACP docs for the context_servers auto-forward pattern, agentclientprotocol.com spec (note: remote ACP via HTTP/WS still WIP, v2.0 uses stdio only).
  • v1 MCP scope limit (security): local-stdio MCP servers + Context7-style API-key remote only. Remote OAuth MCP servers (Sentry, Atlassian, etc.) deferred until BooCode has a real secret-storage primitive — token leakage from a PostgreSQL dump or Authelia bypass is a real attack surface that doesn't exist with local-stdio MCP.
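
The BooChat defense-in-depth rule (reject any tool not explicitly marked read-only) could look roughly like the filter below. A sketch under assumptions: `McpToolInfo` and `partitionReadOnlyTools` are hypothetical names, and the annotation field is assumed to be the MCP spec's `readOnlyHint`. Since the annotation is a server-supplied hint, this is a second line of defense behind the rule of never enabling a write-capable server at all.

```typescript
// Minimal structural view of a tool as listed by an MCP server (assumed shape).
interface McpToolInfo {
  name: string;
  annotations?: { readOnlyHint?: boolean };
}

interface PartitionedTools {
  allowed: McpToolInfo[];
  rejected: McpToolInfo[];
}

/** Split tools into those explicitly marked read-only and everything else. */
function partitionReadOnlyTools(tools: McpToolInfo[]): PartitionedTools {
  const allowed: McpToolInfo[] = [];
  const rejected: McpToolInfo[] = [];
  for (const tool of tools) {
    // Only an explicit `true` passes; `false` and "absent" are both rejected.
    if (tool.annotations?.readOnlyHint === true) {
      allowed.push(tool);
    } else {
      rejected.push(tool);
    }
  }
  return { allowed, rejected };
}
```

Returning the rejected list alongside the allowed one makes it easy to log which tools a server offered that BooChat refused to expose.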

Monorepo / multi-app structure (2026-05-22, locked)

  • BooCode is a 3-app monorepo at /opt/boocode/: apps/chat (read-only, currently the live thing at 9500), apps/coder (write tools + external CLI dispatch, 9502, v2.0 planned), apps/booterm (PTY terminal, live since May 2026 at 9501). Shared apps/server (Fastify backend) and apps/web (React shell hosting the three surfaces as tabs).
  • Single shared database, rename boocode_db → boochat_db when BooCoder lands. All three surfaces in one Postgres. Cross-surface joins are valuable (coder task → originating chat → term debugging session). Separate databases would break this.
  • Mount strategy: blanket /opt:rw, policy enforcement at the write-tool layer. Per-project scoping is logic, not mount. Path-guard correctness becomes the highest-priority test target for v2.0 — fuzz it, property-test it, and cover every traversal-attack pattern (including MCP-served filesystem writes).
  • External CLI agents on the host, not in containers. BooCoder shells out via local-exec PTY or ACP subprocess (node-pty, host shell, or child_process.spawn('opencode', ['acp'])). Host install inherits Sam's existing ~/.opencode/, ~/.claude/, ~/.config/goose/ configs without re-mounting. Containerize later only if a concrete reason emerges.
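
Because per-project scoping is logic rather than mount, the path guard carries the whole policy. A minimal lexical sketch, assuming Node's `path` module; `isInsideProject` is a hypothetical name and `/opt/demo` a made-up project root. It does not resolve symlinks, so a real guard would likely also need an `fs.realpath` step before the comparison.

```typescript
import * as path from "node:path";

/**
 * Return true when `candidate` (absolute, or relative to the project root)
 * resolves to the project root itself or to a location beneath it.
 * Purely lexical: symlinks are NOT followed in this sketch.
 */
function isInsideProject(projectRoot: string, candidate: string): boolean {
  const root = path.resolve(projectRoot);
  const target = path.resolve(root, candidate);
  const rel = path.relative(root, target);
  if (rel === "") return true; // the root itself
  if (path.isAbsolute(rel)) return false; // e.g. a different drive on Windows
  // Anything that climbs out of the root starts with a ".." segment.
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

Comparing the relative path's first segment, rather than doing a string-prefix check on the absolute path, avoids the classic sibling-directory bypass where `/opt/demo-evil` passes a `startsWith("/opt/demo")` test.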

Strategic pivot: Paseo-equivalent dispatcher (2026-05-22)

Sam wants BooCode to function like Paseo without using Paseo itself. Paseo is AGPL-3.0 — incompatible with BooCode's MIT license and its network-served deployment at code.indifferentketchup.com. Solution: reproduce the architecture in BooCode's existing Fastify + TS + PostgreSQL + React stack, using only license-clean patterns.

  • Primary architectural template: Dominic789654/agent-hub (Apache-2.0) — three-process model (board server + dispatcher + assistant terminal) and schema (tasks/projects/templates/pipelines/human_inbox).
  • Critical context-management primitive: Roo Code Boomerang Tasks pattern — orchestrator with intentional capability restriction, down-pass/up-pass context discipline, no implicit inheritance.
  • Observation pattern: Claude Code hooks (siropkin/budi reference) — register BooCode as the hook receiver for SessionStart/UserPromptSubmit/PostToolUse/SubagentStart/Stop.
  • Protocol-level Paseo equivalence: the ACP client + MCP server combination in BooCoder is the protocol-spelled version of Paseo's daemon. ACP gives multi-agent dispatch with structured events instead of free-form PTY output. MCP server gives BooCoder-as-task-board, callable from any MCP client (Termius-based opencode, future editors). One MCP config feeds every dispatched agent (via context_servers auto-forward).

This is now the dominant roadmap direction, ahead of v1.13.x cleanup batches in importance but behind them in sequence (v1.13 finishing now; Paseo-equivalent work is v2.0+).

The earlier May 18 chat recommended Option A (thin orchestration shell over OpenCode) but explicitly called the choice not-locked. Sam's call this session: ship both paths in the same BooCoder surface. Option B / in-process loop handles interactive write work with native tools + pending-changes UI (v2.0 plandex pattern). Option A / PTY-or-ACP dispatch handles parallel/batch work where Sam wants to A/B opencode vs claude vs goose vs pi against the same task in separate worktrees. User picks per task. ACP replaces raw PTY wherever the agent supports it (opencode, goose); PTY fallback retained for claude/pi/smallcode.
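
The ACP-where-supported, PTY-otherwise rule for Option A can be sketched as a small planner. Illustrative only: `pickTransport`, `planComparison`, and the worktree naming scheme are assumptions, not BooCoder's design; only the agent-to-transport mapping comes from the paragraph above.

```typescript
type Transport = "acp" | "pty";

// Agents the roadmap names as ACP-capable; everything else falls back to PTY.
const ACP_CAPABLE = new Set<string>(["opencode", "goose"]);

function pickTransport(agent: string): Transport {
  return ACP_CAPABLE.has(agent) ? "acp" : "pty";
}

interface DispatchPlan {
  agent: string;
  transport: Transport;
  worktree: string; // hypothetical naming scheme: one worktree per agent per task
}

/** Plan an A/B run of the same task across several agents, one worktree each. */
function planComparison(taskId: string, agents: string[]): DispatchPlan[] {
  return agents.map((agent) => ({
    agent,
    transport: pickTransport(agent),
    worktree: `worktrees/${taskId}-${agent}`,
  }));
}
```

Keeping the capability list in one set means promoting an agent from PTY to ACP later is a one-line change.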

v1.13.x cleanup line locked (2026-05-22)

After the 2026-05-22 retag, the v1.13.x cleanup line in vMAJOR.MINOR.PATCH-slug form is v1.13.0-ai-sdk-v6 → v1.13.1-cleanup-bundle → v1.13.2-compaction-prune → v1.13.3-truncate → v1.13.4-reasoning-fix → v1.13.5-stability-bundle → v1.13.6-prefix-stability → v1.13.7-compaction-trigger → v1.13.8-tool-cost → v1.13.9-agentlint → v1.13.10-openspec → v1.13.11-tools → v1.13.12-ws-schemas → v1.13.13-ws-publish → v1.13.14-skills-audit → v1.13.15-codecontext-synth → v1.13.16-xml-parser → column drop (final, pending — old working name v1.13.2). Do not fold. Smoke isolation matters: each batch has a distinct rollback surface, and bisecting a 750-LoC merge across four unrelated changes is worse than four separate dispatches.
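
Tags in the vMAJOR.MINOR.PATCH-slug form need numeric ordering, since a plain string sort puts v1.13.10 before v1.13.2. A small parsing sketch; `parseBatchTag` and `compareBatchTags` are hypothetical helpers, and the slug pattern (lowercase alphanumerics separated by hyphens) is inferred from the tags listed above rather than from any stated rule.

```typescript
interface BatchTag {
  major: number;
  minor: number;
  patch: number;
  slug: string;
}

// vMAJOR.MINOR.PATCH-slug; the slug grammar is an assumption based on the listed tags.
const TAG_PATTERN = /^v(\d+)\.(\d+)\.(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$/;

/** Parse a batch tag, or return null when it does not match the convention. */
function parseBatchTag(tag: string): BatchTag | null {
  const m = TAG_PATTERN.exec(tag);
  if (!m) return null;
  return {
    major: Number(m[1]),
    minor: Number(m[2]),
    patch: Number(m[3]),
    slug: m[4],
  };
}

/** Order tags by version number; the slug does not take part in ordering. */
function compareBatchTags(a: BatchTag, b: BatchTag): number {
  return a.major - b.major || a.minor - b.minor || a.patch - b.patch;
}
```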

v1.13 retrospective (what shipped)

  • v1.13.0 — message_parts table + dual-write at every JSON-write site. Old columns authoritative for reads. Reversible.
  • v1.13.1-A — AI SDK v6 (ai@^6, @ai-sdk/openai-compatible@^2). streamCompletion rewritten as streamText adapter. Silent-abort bug caught and patched (explicit if (signal?.aborted) throw). Known regression: mid-stream tps gone — TODO for delta-cadence interpolation against result.usage. Latent regression discovered v1.13.7: includeUsage defaults false on @ai-sdk/openai-compatible, so result.usage resolved empty all along; tokens_used/ctx_used NULL in every row since this version. Fixed in v1.13.7.
  • v1.13.1-B — messages_with_parts view with COALESCE fallbacks. Read sites switched. 1ms for 42-message chat verified.
  • v1.13.1-C — ask_user_input correlation ported to parts; reasoning end-to-end (361 chars reasoning at seq 0, 429 chars text at seq 1 in smoke). v1.13.1 tagged on ac1a71f. Latent regression discovered v1.13.6: reasoning was wired into the inference payload but NOT into compaction's head-assembly payload — summarizer model couldn't see reasoning for tool-bearing turns, degrading qwen3.6 summary quality. Fixed in v1.13.6.
  • v1.13.3 — bundle: statement_timeout=30s, alpha tool ordering, periodic stuck-row sweeper, repairToolCall wiring. Tagged on a08d809.
  • v1.13.4 — two-tier compaction prune. Tagged on ec8593c.
  • v1.13.5 — opencode truncate.ts port + view_truncated_output tool. Tagged on f8fc5db.
  • v1.13.6 — compaction head-assembly audit + reasoning fix. Closed the Q3 reasoning gap from v1.13.1-C. Tagged on 81d837c.
  • v1.13.7 — stability bundle: includeUsage fix + trim guards + payload filter + budget bump. Surfaces tokens (closes a v1.13.1-A latent regression where result.usage resolved empty), kills the empty-bubble + ActionRow noise between tool calls on single-tool-call turns, and unblocks Continue after cap-hit on chats that have trailing empty/failed assistants.
  • v1.13.2 deferred — at least one week of production traffic on v1.13.1 before dropping legacy columns. Dual-write is rollback insurance.

Pre-v1.13 architectural decisions (still load-bearing)

  • Embeddings dropped from BooCode (May 2026). Replaced RAG with file-view tools + sidecar analyzers.
  • opencode promoted to Tier A (2026-05-20). Five algorithms identified for lift (compaction, doom-loop, repairToolCall, runLoop, permission evaluate) plus truncate.ts and MCP client.
  • OpenCode canonical repo: anomalyco/opencode, NOT sst/opencode (correction 2026-05-22). Development moved to anomalyco; sst/opencode is the predecessor lineage. All 15 catalog references rewritten.
  • Original Batch 11 (aider PageRank port) replaced by codecontext sidecar approach.
  • Original Batch 12 (codebase indexer w/ Harrier) removed. No embedding infrastructure.
  • Original Batch 13 (OpenHands event log) replaced by v1.13 parts table (opencode pattern).
  • Original Batch 12 (cline plan/act mode) absorbed into v1.15 (opencode permission ruleset).
  • Aider's repomap.py port dropped. Codecontext supersedes it. Aider contribution narrows to the .scm query files only.
  • Globstar parked — not an architect tool. Future verify-before-commit candidate only.
  • codeprysm rejected — embedding-based. Node/edge taxonomy noted as reference if we ever build our own graph.
  • Batch 9 decoupled from Batch 7 (2026-05-16); shipped in 92bd3b1. Builtin defaults: six agents (Code Reviewer, Debugger, Refactorer, Architect, Security Auditor, Prompt Builder) with no model field. Session model wins by default.
  • AI SDK adoption deferred to v1.13 — and shipped as v1.13.1-A. v6 chosen (not v5) for native typed parts model and top-level experimental_repairToolCall.
  • tool_choice='required' confirmed supported by llama-swap (qwen3.6-35b-a3b-mxfp4, 2026-05-20).
  • v1.12.0 shipped 2026-05-21. codecontext sidecar Track B + container guidance Track A. v1.12 truncation and repairToolCall deferred into v1.13.
  • v1.12.1 workspace pane sync (2026-05-21). Moved pane state from per-device localStorage to sessions.workspace_panes jsonb with WS broadcast for cross-device sync. Deprecated session_panes table dropped. Legacy localStorage migrates on first load.
  • v1.12.1 status indicator overhaul (2026-05-21). ChatStatusFrame expanded from working|idle|error to streaming|tool_running|waiting_for_input|idle|error. StatusDot rewritten with distinct animations per state.
  • detectSameNameLoop reverted in v1.12.1. Added during the 2026-05-21 debugging spike, never fired in any real run. Dead code.
  • The 2026-05-21 "freeze" debugging spike taught one lesson: BooCode had no UI signal for the difference between a slow stream and a dead stream. v1.12.2 (live tok/s) and v1.12.3 (stale-stream banner) directly closed that gap. v1.13's typed parts table made the inference state machine visible by construction — the structural fix the spike pointed to.
  • v1.12.4 refactor shipped 2026-05-21/22. inference.ts (1700 LoC) split into inference/ directory before v1.13 so the AI SDK migration had clean seams. stream-phase.ts became the swap target for streamText, tool-phase.ts got the per-tool category tag (added in v1.13.0). Pure structural move, no behavior change.
  • AI SDK v6 silent-abort patched (v1.13.1-A). fullStream returns normally on abort instead of throwing. Without explicit if (signal?.aborted) throw after the stream drain, stop button writes complete instead of cancelled. One-liner comment at the site so it survives future refactors.
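
The silent-abort pattern in the last bullet generalizes to any stream that ends normally when cancelled. A self-contained sketch using a plain `AsyncIterable` as a stand-in for the SDK stream; `drainStream`, `throwIfAborted`, and `StreamAbortedError` are hypothetical names, not the identifiers used in stream-phase.ts.

```typescript
// Hypothetical error type; BooCode's real cancellation error may differ.
class StreamAbortedError extends Error {
  constructor() {
    super("stream aborted by caller");
    this.name = "StreamAbortedError";
  }
}

/** Throw when the signal has been aborted; do nothing otherwise. */
function throwIfAborted(signal?: AbortSignal): void {
  if (signal?.aborted) throw new StreamAbortedError();
}

/**
 * Drain a stream, then decide the final status.
 * The stream may return normally on abort, so the loop ending is not
 * proof of completion: the signal must be checked after the drain.
 */
async function drainStream<T>(
  stream: AsyncIterable<T>,
  onPart: (part: T) => void,
  signal?: AbortSignal,
): Promise<"complete"> {
  for await (const part of stream) {
    onPart(part);
  }
  throwIfAborted(signal); // without this, a stop-button abort would report "complete"
  return "complete";
}
```

A caller can catch `StreamAbortedError` and persist the turn as cancelled, and persist it as complete only when `drainStream` resolves.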

Catalog growth (2026-05-22 deep review pass)

The session-of-the-day catalog review added 50+ new entries to boocode_code_review.md. Decisions worth carrying into roadmap planning:

  • Tier A active lifts unchanged: opencode, codecontext, tree-sitter-analyzer, codesight, aider.
  • Tier B / Tier C reviewed and triaged. Most consequential additions: agent-hub (#48, primary v2.0 architectural template), Roo Boomerang Tasks (#46, v1.14 AGENTS.md pattern), zeroshot (#37, blind-validation invariant), AgentLint (#39, immediate manual audit pass), RA.Aid (#44, three-stage routing), OpenSpec (#36, batch-doc structure), bernstein (#49, HMAC audit log), memov (#42, session-history tool design), siropkin/budi (#51, install for Claude Code observability).
  • Rejected as code sources: kilocode, costrict, prompt-tower, mycoder, reviewcerberus (closed Docker), Junie (closed), Cody (parked), VS Code extensions broadly, all Web Builders, LynxPrompt (GPL-3.0), claude-task-master code (Commons Clause), Paseo source (AGPL).
  • No additional code lifts promoted to a current version. All catalog adds are either patterns (license-clean), references (for v2.0+), or one-off audit-pass items (AgentLint, budi install).

Workflow

Each batch:

  1. Verify previous batch merged. git log --oneline main -5.
  2. Cut branch from main. Single-branch-per-dispatch convention.
  3. Dispatch via Paseo to Claude Code at /opt/boocode.
  4. Claude Code recon → blocking questions → implement → hand back.
  5. Compliance review in separate Claude chat (paste handback).
  6. Build: docker compose build --no-cache <surface> where surface is boocode (chat) / booterm / boocoder (v2.0+). No-cache avoids the v1.11.2 stale-bundle trap.
  7. Restart: docker compose up -d <surface>.
  8. Smoke test in browser (hard refresh).
  9. Sam commits and pushes. Never git pull / git push / git commit on his behalf.

Sam reviews all diffs. Backups before any destructive step: cp file file.bak-$(date +%Y%m%d-%H%M%S).