Files
boocode/boocode_roadmap.md
indifferentketchup 48ee63a286 v1.12.1: rich status indicator + server-side workspace pane sync
Status indicator (StatusDot): drops the flat amber pulse for a richer set
of states — orbiting amber for streaming, spinning sky ring for tool_running,
static violet for waiting_for_input, plus the existing idle/error. Backend
chat_status frame widens from 'working|idle|error' to discriminate streaming
vs tool execution vs paused for user input.

Workspace pane sync: pane layout moves from per-device localStorage to
server-side sessions.workspace_panes jsonb. PATCH /api/sessions/:id/workspace
broadcasts session_workspace_updated on the user channel for cross-device live
sync. Echo dedup via JSON comparison so the round-trip frame doesn't loop.
Legacy localStorage seeds the server on first hydrate, then is deleted.
Deprecated session_panes table dropped.

Resilience: startup sweep marks any stale 'streaming' message older than
5 minutes as 'failed' so v1.12.0-style hung rows clear on container restart.
useWorkspacePanes gains validatePanes() to prune dead chatId references from
saved pane state when the chat list lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 20:32:02 +00:00

287 lines
18 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# BooCode v1.x — Roadmap
Last updated: 2026-05-21
## Overview
BooCode is a standalone code-chat tool at `/opt/boocode/`. Read-only by design — pick a project, chat with a local LLM that has file-inspection tools, get streaming responses over WebSocket.
Live at `https://code.indifferentketchup.com` (Caddy → Authelia → Tailscale → `100.114.205.53:9500`).
**Architectural commitments:**
- No embeddings. Model uses file-view tools (`view_file`, `list_dir`, `grep`, `find_files`) + sidecar analyzers (codecontext, codesight) + codecontext MCP tools. Walked away from the RAG pipeline May 2026.
- Read-only in v1.x. Write tools land in BooCoder (separate container, post-v1.x).
- One Postgres (`boocode_db`), one frontend SPA, container-per-service for new capabilities.
External code lifted from / referenced in: see `boocode_code_review.md` for full inventory.
-----
## Shipped (status as of 2026-05-21)
| Version | Theme | Tag |
|---|---|---|
| v1.0 | Initial scaffold | — |
| Batches 14.4 | Markdown, sidebar, panes, chats-inside-sessions, archive, fork/delete, header polish, settings drawer | — |
| v1.5 | resolveProjectPath, BOOTSTRAP_ROOT, vitest pin | — |
| v1.6, v1.6.1, v1.6.2 | Mobile pass + RightRail mobile drawer | — |
| v1.7 | Drag-drop file + paste-as-attachment | — |
| v1.8, v1.8.1, v1.8.2 | Settings drawer, git_status tool, WS reconnect, per-turn budget reset + Continue affordance + CapHitSentinel | — |
| v1.9.1 | Skills system (`/opt/skills/` + `skill_find` / `skill_use` / `skill_resource` + `/skill` slash command) | `v1.9.1` |
| v1.9.7 | `ask_user_input` elicitation tool | `v1.9.7` |
| Batch 9 (Agents Tier 2) | `AGENTS.md` + 6 builtin agents + AgentPicker in ChatInput toolbar + `sessions.agent_id` | folded into `v1.9.1`/`v1.9.7` |
| v1.10.0 | BooTerm: separate container, xterm.js + node-pty + tmux | `v1.10.0` |
| v1.10.1 | BooTerm-user (spawn as samkintop, login bash, Claude Code/opencode PATH) | `v1.10.1` |
| v1.10.4, v1.10.5 | Mobile terminal + XML tool-call fallback parser | — |
| v1.11.0 | opencode-style compaction port (auto-overflow, anchored summary, tail preservation) | — |
| v1.11.1 | Compaction follow-up (working indicator during compaction, unit tests, .bak cleanup) | — |
| v1.11.2 | ContextBar (persistent context-usage indicator above MessageList) | — |
| v1.11.3 | `ctx_max` capture via `/upstream/<model>/props` (replaces dead `timings.n_ctx` read) | `v1.11.3` |
| v1.11.5 | ContextBar inline next to agent picker; remove ChatContextPopover; default new sessions to no agent | — |
| v1.11.6 | Doom-loop guard from opencode (3 identical tool calls → sentinel, abort recursion) | — |
| v1.11.7 | pathGuard secrets filter (continue.dev `DEFAULT_SECURITY_IGNORE_FILETYPES`) | — |
| v1.11.8 | web_search + web_fetch tools via SearXNG | — |
| v1.11.9 | Manual redirect handling — re-run URL guard on each hop (SSRF hardening) | — |
| v1.11.10 | Stream-cap response body at 5MB, abort on overflow | `v1.11.x` |
| **v1.12.0** | **codecontext sidecar (Go HTTP shim, NDJSON MCP framing, child.Wait supervisor) + container guidance (BOOCHAT.md/BOOCODER.md) + 7 vendored skills + system-prompt.ts extraction + mtime-watch cache + 8 codecontext tool wrappers + per-agent tool whitelists + .codecontextignore template + agents.ts ALL_TOOL_NAMES single-source-of-truth fix** | `v1.12.0` |
-----
## In flight (uncommitted on disk, 2026-05-21)
v1.12.1 work — landed today, not yet committed:
| Item | Status | Notes |
|---|---|---|
| Server-side workspace pane sync | Done | `sessions.workspace_panes jsonb` column; PATCH endpoint; `session_workspace_updated` WS frame; localStorage migration on first load; deprecated `session_panes` table dropped |
| Richer status indicators | Done | Five states (`streaming` / `tool_running` / `waiting_for_input` / `idle` / `error`) with distinct visuals: amber orbiting dots for streaming, amber spinning ring for tool execution, blue static for waiting on user, emerald/gray/red for idle/error |
| Startup hung-row sweep | Done | `UPDATE messages SET status='failed' WHERE status='streaming' AND created_at < NOW() - INTERVAL '5 minutes'` on server boot |
| One stuck row from v1.12.0 smoke | Cleared | Manual UPDATE (`d63c25b1`) |
| `detectSameNameLoop` code path | Added, never fired | Candidate for revert in next batch — dead code |
| Diagnostic logging in inference.ts | Added for debugging | Must come out before commit |
-----
## v1.12.x cleanup (NEXT — small, immediate)
Five items. Group them or split them — your call.
### v1.12.1 — commit consolidation
**Action items, in order:**
1. **Remove diagnostic logging** from `apps/server/src/services/inference.ts`. The 12 `ctx.log.info` calls added today proved the inference loop was functioning correctly; the prompts were just slow. Verbose for production. Strip them, keep the file clean.
2. **Revert `detectSameNameLoop`.** Three additions in inference.ts:
- `DOOM_LOOP_SAME_NAME_THRESHOLD = 5` constant
- `detectSameNameLoop()` function
- Call site in `runAssistantTurn` immediately after the existing `detectDoomLoop` check
Never fired in any real run today. Dead code. The existing `detectDoomLoop` (identical args, threshold 3) is sufficient.
3. **Drop the stale `messages_status_check` CHECK constraint** in `apps/server/src/schema.sql`. Two constraints exist on the table:
- `messages_status_check` allows `streaming|complete|failed` (old, stale)
- `messages_status_chk` allows `streaming|complete|failed|cancelled` (new)
The old one prevents `cancelled` from being written. Drop it with `ALTER TABLE messages DROP CONSTRAINT IF EXISTS messages_status_check;`.
4. **Stop-handler writes terminal status.** When user clicks stop mid-stream, the abort path must `UPDATE messages SET status='cancelled' WHERE id = $assistantMessageId AND status='streaming'`. Currently rows just sit `streaming` forever. The startup sweep catches them on restart, but they should be written immediately. Edit `apps/server/src/services/inference.ts` `handleAbortOrError` to add the UPDATE.
5. **Commit + tag v1.12.1.** Include the workspace pane sync, status indicator overhaul, startup sweep, and items 14 above. Single commit per item is fine; tag at end.
**Estimated:** ~150 LoC net (deletions dominate).
### v1.12.2 — live throughput display (small UX win)
Surface `tokens_per_second` and `ctx_used` next to the status indicator while streaming. Backend already emits these in the `usage` frame; just consume them in the StatusDot wrapper or a sibling component. ~80 LoC, frontend-only.
### v1.12.3 — stale-stream frontend banner
When a chat has a `streaming` row older than ~60s with no new tokens, the UI should surface a "Previous response didn't complete. [Retry] [Discard]" banner instead of silently queueing new sends. Today's debugging spent four hours misreading slow streams as dead; this is the UX fix that prevents that. ~150 LoC, frontend + small backend endpoint for the discard action.
-----
## v1.13 — Phase B: parts table + AI SDK + per-tool tagging
**Goal:** typed message parts replace JSON blobs on `messages.tool_calls` / `tool_results`. Adopt Vercel AI SDK `streamText`. Tag tools as `read_only` or `write` at definition time.
**Scope:**
1. Schema: new `message_parts` table (`id, message_id, kind, payload JSONB, sequence`). Kinds: `text`, `tool_call`, `tool_result`, `reasoning`, `step_start`. The `messages` table becomes header-only.
2. Inference loop rewritten on AI SDK `streamText`. `streamCompletion` becomes a thin wrapper. Native AI SDK `experimental_repairToolCall` replaces v1.12's hand-rolled version.
3. Tool registry: `ToolDef<T>` gains `category: 'read_only' | 'write'` field. BooCode v1.x rejects any `write` tool at registry time (defense in depth for the BooCoder split). Alpha-sort tool list before sending to model (prompt-cache stability).
4. Reasoning content (`reasoning_content` from Qwen3.6) captured as its own part type instead of dropped or inlined.
**Migration risk:** non-trivial. `inference.ts` is ~1700 lines with custom XML fallback, SSE parsing, compaction integration. Plan dedicated cutover window. `compaction.ts` must update to assemble head from parts.
**Replaces:** Original Batch 13 (append-only event log) — same outcome, different vocabulary.
**Today's debugging spike validates this work.** Four hours of confusion came from JSON-blob `tool_calls` / `tool_results` columns hiding state from logs and from the inference state machine being invisible. Typed parts + per-part status would have shown the slow-stream-vs-dead distinction in seconds.
**Dependencies:** v1.12.x cleanup merged.
**Estimated:** ~1500 LoC.
-----
## v1.14 — Phase C: outer agent loop
**Goal:** explicit multi-step loop per opencode `prompt.ts` `runLoop()`. Replace the current ad-hoc tool-call recursion.
**Scope:**
1. Outer loop continues until model returns non-tool finish OR step cap hit. Step ≠ tool call: one step can contain multiple tool calls in parallel.
2. `agent.steps ?? Infinity` per-agent step cap. AGENTS.md gains `steps:` field. Refactorer `steps: 5`, Architect `steps: 20`, etc.
3. Step-boundary events (`step_start`, `step_finish`) explicit in the parts stream. Per-step snapshot for revert (planned for BooCoder; backend-only in v1.14).
4. Doom-loop guards (v1.11.6) migrate from "abort recursion" to "raise within loop iteration." Same predicate, different control flow.
**Dependencies:** v1.13 merged.
**Estimated:** ~800 LoC.
-----
## v1.15 — Phase D: permission ruleset + MCP client
**Goal:** wildcard permission ruleset (opencode `evaluate.ts` pattern) and a proper MCP client implementation. Foundation for BooCoder to gate writes; immediate value for codecontext to be re-wired as a real MCP server.
**Scope:**
1. Wildcard rule matcher: `{ permission, pattern, action: 'allow' | 'deny' | 'ask' }`. Last-match-wins. Per-agent rulesets layer under per-session rulesets.
2. MCP client implementation: SSE transport, `tools/list` discovery, `tools/call` invocation. codecontext sidecar gets re-pointed from static wrappers (v1.12) to real MCP. New connectors become a config-only addition.
3. UI: permission-ask flow when a tool requires `ask` action. Modal or inline card with Allow once / Allow always / Deny.
4. v1.x stays read-only by default (no `write` tools in the registry yet).
**Absorbs:** Original Batch 12 (tool approval + plan/act mode) — same outcome via permission rules instead of mode enum.
**Dependencies:** v1.13 merged (parts table for permission events). Independent of v1.14.
**Estimated:** ~600 LoC.
-----
## v1.16 — Batch 11b: codesight repo_health
Call graph, circular dependency detection, dead code flagging. Port `analyze.mjs` from spirituslab/codesight. New tool `repo_health(project_id)`. In-process Node (not sidecar). Cache results keyed by `(project_id, file_hashes_sig)`.
**Dependencies:** v1.12 merged (can reuse codecontext parse output where overlapping).
**Estimated:** ~400 LoC.
-----
## v2.0 — BooCoder pending changes
New container `boocoder` at `100.114.205.53:9502`. Owns write tools (`edit_file`, `create_file`, `delete_file`, `apply_pending`, `rewind`). Edits queue in `pending_changes` table; nothing touches disk until `/apply`. Per-pane diff UI with Approve/Reject. BooCode chat stays read-only (`/opt:/opt:ro`).
**Lift source:** plandex pending-changes data model.
**Dependencies:** v1.13 (parts) + v1.15 (permissions).
**Estimated:** ~1200 LoC.
-----
## v2.1 — BooCoder runtime isolation
Per-session Docker sandbox spawned by BooCoder on first write. Only project path mounted, not `/opt`. Idle-timeout 30 min. Standard OpenHands runtime contract: HTTP API inside container, BooCoder calls in.
**Lift source:** OpenHands V1 runtime pattern.
**Dependencies:** v2.0.
**Estimated:** ~600 LoC.
-----
## v2.x — Optional / far future
- **Multi-provider LLM** (pi-ai pattern): Only if a concrete need for Anthropic / OpenAI / Mistral direct surfaces. llama-swap covers everything today.
- **Workflow graphs** (microsoft/agent-framework concepts): Multi-agent coordination. Conceptual reference only. Realistically a v3.x topic.
-----
## Architecture target state
### Containers
| Container | Port | Mount | Purpose | Status |
|---|---|---|---|---|
| `boocode` | `100.114.205.53:9500` | `/opt:/opt` | Chat + read-only tools + SPA | Live |
| `boocode_db` | `127.0.0.1:5500` | `boocode_pgdata` volume | Postgres 16-alpine | Live |
| `booterm` | `100.114.205.53:9501` | `/opt/repos:/opt/repos:rw` | Terminals (tmux + node-pty) | Live (v1.10.0) |
| **`codecontext`** | **`:8765` (internal)** | **`/opt/projects:/workspace:ro`** | **MCP server for architect tools** | **Live (v1.12.0)** |
| `boocoder` | `100.114.205.53:9502` | per-session sandbox | Write tools | v2.0 |
### Schema additions by version
- **v1.11.0:** `messages.compacted_at`, `messages.summary`, `messages.tail_start_id`, `chats.needs_compaction`
- **v1.11.7:** none (pathGuard logic, no DB)
- **v1.12.0:** none (codecontext stateless; truncation in-memory id-map with TTL cleanup)
- **v1.12.1:** `sessions.workspace_panes jsonb` (workspace sync); drop deprecated `session_panes` table; drop stale `messages_status_check` constraint
- **v1.13:** `message_parts` table; `messages` becomes header-only
- **v1.14:** `agents.steps` column (or AGENTS.md parser extension; no DB if file-only)
- **v1.15:** `permissions` table, `agent_permissions` join, `session_permissions` join
- **v1.16:** `repo_health_cache (project_id, file_hashes_sig, payload JSONB, created_at)`
- **v2.0:** `pending_changes (id, session_id, file_path, diff TEXT, status, created_at)`
-----
## Lift sources (summary)
Full inventory in `boocode_code_review.md`. Headline items:
| Source | Used for | Where |
|---|---|---|
| `sst/opencode` (MIT, TS) | Compaction algorithms | v1.11.0 (shipped) |
| `sst/opencode` (MIT, TS) | Doom-loop guard | v1.11.6 (shipped) |
| `sst/opencode` (MIT, TS) | `repairToolCall`, truncate.ts, MCP client, permission evaluate, runLoop | v1.12 (shipped) / v1.13 / v1.14 / v1.15 |
| `continuedev/continue` (Apache-2.0) | `DEFAULT_SECURITY_IGNORE_FILETYPES` | v1.11.7 (shipped) |
| `nmakod/codecontext` (MIT, Go) | Architect: codebase map sidecar | v1.12.0 (shipped) |
| `spirituslab/codesight` (MIT-ish, TS) | Architect: repo health analyzer | v1.16 |
| `Aider-AI/aider` (Apache-2.0) | Fallback `.scm` grammars | v1.12 (fallback) |
| `cline/cline` (Apache-2.0) | Plan/Act pattern (absorbed into v1.15 permissions) | v1.15 |
| `plandex-ai/plandex` (MIT) | Pending-changes data model | v2.0 |
| `OpenHands/OpenHands` (MIT) | Sandbox runtime contract | v2.1 |
| `aimasteracc/tree-sitter-analyzer` (MIT) | Outline-first patterns | v1.12 (alt) |
| `earendil-works/pi` (MIT) | Multi-provider LLM | v2.x (optional) |
-----
## Decisions log
- **Embeddings dropped from BooCode** (May 2026). Replaced RAG with file-view tools + sidecar analyzers.
- **Original Batch 11 (aider PageRank port) replaced** by codecontext sidecar approach.
- **Original Batch 12 (codebase indexer w/ Harrier) removed.** No embedding infrastructure in BooCode v1.x.
- **Globstar parked** — not an architect tool. Future verify-before-commit candidate only.
- **codeprysm rejected** — embedding-based. Node/edge taxonomy noted as reference if we ever build our own graph.
- **Batch 9 decoupled from Batch 7 (2026-05-16); shipped in `92bd3b1`.** Builtin defaults: six agents (Code Reviewer, Debugger, Refactorer, Architect, Security Auditor, Prompt Builder) with no `model` field. Session model wins by default.
- **opencode lift opened** (2026-05-20). Started with compaction (v1.11.0). Continuing through v1.15. Five distinct algorithms: compaction, doom-loop guard, repairToolCall, runLoop, permission evaluate. Plus `truncate.ts` and MCP client. Each lifts the algorithm, not the Effect-TS plumbing.
- **AI SDK adoption deferred to v1.13.** Hand-roll repairToolCall in v1.12 — not actually done in v1.12.0; truncation also deferred. v1.12.0 shipped codecontext + container guidance + skills only.
- **`tool_choice='required'` confirmed supported** by llama-swap (qwen3.6-35b-a3b-mxfp4, 2026-05-20).
- **v1.11.4 cancelled** (2026-05-20). Per-turn budget reset + Continue affordance + CapHitSentinel were already shipped in v1.8.2.
- **v1.12.0 shipped** (2026-05-21). codecontext sidecar Track B + container guidance Track A. v1.12 truncation and repairToolCall were deferred into v1.13's AI SDK migration where they get for-free.
- **v1.12.1 workspace pane sync** (2026-05-21). Moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` with WS broadcast for cross-device sync. Deprecated `session_panes` table dropped. Legacy localStorage migrates on first load.
- **v1.12.1 status indicator overhaul** (2026-05-21). ChatStatusFrame expanded from `working|idle|error` to `streaming|tool_running|waiting_for_input|idle|error`. StatusDot rewritten with distinct animations per state. Added `executeToolPhase`-entry `tool_running` publish.
- **detectSameNameLoop reverted** (planned v1.12.1). Added during the 2026-05-21 debugging spike to catch same-tool-name-with-different-args loops. Never fired in any real run because the existing `detectDoomLoop` covers the actual failure modes. Dead code, reverting.
- **The 2026-05-21 "freeze" debugging spike taught one lesson**: BooCode has no UI signal for the difference between a slow stream and a dead stream. Diagnostic logging (added today, reverted in v1.12.1) revealed the inference loop was working correctly throughout — what looked like four hours of deterministic hang was multiple instances of qwen3.6 generating 8k tokens of self-doubt at temperature 0.2 on a "find the bug" prompt with no real bug. v1.12.2 (live tok/s display) and v1.12.3 (stale-stream banner) directly address this gap.
-----
## Workflow
Each batch:
1. Verify previous batch merged. `git log --oneline main -5`.
2. Cut branch from main. Single-branch-per-dispatch convention.
3. Dispatch via Paseo to Claude Code at `/opt/boocode`.
4. Claude Code recon → blocking questions → implement → hand back.
5. Compliance review in separate Claude chat (paste handback).
6. Build: `docker compose build --no-cache boocode` (no-cache avoids the v1.11.2 stale-bundle trap).
7. Restart: `docker compose up -d boocode`.
8. Smoke test in browser (hard refresh).
9. Sam commits and pushes. **Never** `git pull` / `git push` / `git commit` on his behalf.
Sam reviews all diffs.