Commit Graph

144 Commits

Author SHA1 Message Date
7096ae4ddc feat: remove Go codecontext sidecar, wire all boocontext MCP tools
Deletes all 17 native codecontext tool wrappers (~2,400 lines). Code analysis now provided entirely by boocontext MCP server (discovered at startup via appendMcpTools()). Adds 9 previously missing MCP tools (get_summary, scan, get_coverage, get_schema, get_env, get_events, get_knowledge, get_wiki_index, lint_wiki) to all relevant agent tool lists. Updates AGENTS.md, guidance files.
2026-06-08 04:18:04 +00:00
381b97f78a feat(server): inference state-graph + supervisor, memory tools, MCP client, schema, routes
- Add state-graph.ts: typed state machine for inference lifecycle
- Add supervisor.ts: agent supervisor pattern for multi-agent coordination
- Add export-formatter.ts: structured export formatting
- Add manage_memory.ts: memory CRUD tool for agent persistence
- Add get_wiki_article.ts: codecontext wiki article retrieval
- Extend memory/index.ts: 3-tier memory (context/daily/core)
- Extend MCP client: mcp-config.ts env-var substitution
- Update schema.sql: agent_sessions, tasks, pending_changes extensions
- Update API types: MessageMetadata, ErrorReason, AgentSessionConfig
- Update routes: chats, messages, sessions — column renames and agent_session_id
- Update inference: error handler, payload builder, stream phase, turn orchestrator
2026-06-08 03:48:47 +00:00
45a1140fd3 feat: phase 3-5 — workflow engine, background subagents, multi-modal, cache shape, inline diff
Phase 3: Dynamic Workflow Engine
- VM sandbox (node:vm) with agent/parallel/pipeline API, Claude Code compatible
- Workflow file discovery (.boocode/workflows/*.js + ~/.boocode/workflows/*.js)
- Workflow manager with session/chat creation and inference dispatch
- Built-in catalog: deep-research, review-code, find-issues
- Resumability cache: SHA-256 hash of agent spec, in-memory Map

Phase 4: Background Subagents
- background-task.ts service: spawn/poll/cancel lifecycle
- spawn_subagent, subagent_status, subagent_result tools in ALL_TOOLS

Phase 5: Multi-modal + Cache Shape
- Multi-modal stub with type defs and hook point in payload.ts
- CacheShapeBadge component in trace viewer (colored bar + %)
2026-06-08 03:11:39 +00:00
c4ee377dbc feat(conductor): task state machine — TIMED_OUT state and retriable steps
- Add 'timed_out' to flow_runs/flow_steps CHECK constraints
- Add retry_count and max_retries columns to flow_steps
- Add timeout detection in advanceInner loop (configurable FLOW_STEP_TIMEOUT_MS)
- Add retriable logic: re-dispatch on timeout if maxRetries > 0 and retryCount < maxRetries
- Add isRetriable() + shouldRetry() pure decision functions
- Add timed_out handling to reconcileResumeStep and reconcileRun
- Add 'timed_out' to ws-frames enum, publishStep status type
2026-06-08 02:43:45 +00:00
abe9c5a3a8 feat: Paseo-like orchestrator Phase 1-2 — trace system, session persistence, timeline, run_command, auto-fix loop
Phase 1: Trace System + Observability
- tool_traces DB table + insert/update service
- tool_trace_start/tool_trace_finish WS frames (contracts + FE types)
- Instrumented tool-phase.ts with timing around every tool call
- GET /api/chats/:id/traces paginated endpoint
- Trace viewer frontend (collapsible panel with timing bars + token breakdown)

Phase 2: Session Persistence + Resume
- agent_snapshots table (UPSERT per chat, persisted on turn boundaries)
- save/load/delete service functions
- Agent snapshot sent on WS reconnect
- Session timeline view (vertical timeline with scroll-to + restore)

Tooling:
- run_command tool (execFile, 30s timeout, 32KB cap, path-guarded)
- Auto-fix loop: after write tools, runs pnpm build, injects errors into next turn
2026-06-08 02:26:47 +00:00
7cb692d8be feat: Phase 4 teardown — remove Go codecontext sidecar from deployment
- Remove codecontext service block from docker-compose.yml
- Remove CODECONTEXT_URL env var
- Delete codecontext/Dockerfile
- Update callCodecontext() to try boocontext MCP first with HTTP fallback
- Graceful degradation: if boocontext MCP unavailable, tools still work via HTTP
2026-06-08 02:16:02 +00:00
917a229363 feat: Domain 2 Phase 3-4 — wiki article tool, DCP compress toggle, Go sidecar deprecation
Phase 3: get_wiki_article tool wraps codesight_get_wiki_article MCP
(cached, persistent codebase wiki). DCP compress toggle on
get_codebase_overview (compress=true for large projects >50 files).

Phase 4: Deprecation markers on Go codecontext sidecar. Warning log
in callCodecontext(), deprecation comments in factory.ts and
docker-compose.yml. Sidecar remains functional — removal deferred.
2026-06-08 01:35:40 +00:00
39be5ce413 fix: move cache_tokens/reasoning_tokens ALTER TABLE before view creation 2026-06-08 01:32:25 +00:00
203cfd2fa8 feat: DeepSeek API integration + Whale lift (hooks, tool repair, MCP permissions, token tracking)
DeepSeek API:
- @ai-sdk/deepseek provider replaces openai-compatible for deepseek-* models
- Token tracking: cache_hit/reasoning tokens flow API → DB → WS frames → UI
- thinking effort levels (off/low/medium/high/xhigh/max) via AGENTS.md frontmatter
- V4 models: deepseek-v4-flash, deepseek-v4-pro
- Wired for both chat and coder panes

Whale lifts:
- Tool input repair (schema-based type coercion, markdown link unwrapping)
- Hooks system (6 lifecycle events, shell exec, JSON stdin/stdout contract)
- Per-MCP-server permissions (allow/ask/deny)
- token tracking UI (cache N, think N in message stats line)

Infra:
- New DB columns: messages.cache_tokens, messages.reasoning_tokens
- New WS frame fields: cache_tokens, reasoning_tokens on message_complete
- coder provider snapshot merges DeepSeek models alongside llama-swap
2026-06-08 01:24:23 +00:00
3c5b2c2bcf feat(server): Domain 2 Phase 1 — boocontext MCP client + 4 new code intelligence tools
Shared boocontext MCP client (boocontext_client.ts) wrapping the existing
mcp-client.ts callTool() infrastructure with 32KB truncation and error
handling. Used by get_code_health.

4 new first-class agent tools backed by the boocontext MCP server:
- get_code_health — A-F grades per file across 7 dimensions, project health
  summary, refactoring candidates (wraps boocontext_health)
- get_code_impact — merged symbol trace + blast radius in one call (wraps
  boocontext_impact, replaces two-step get_symbol_info+get_blast_radius)
- get_type_info — TypeScript type recovery via type-inject MCP (wraps
  boocontext_types, returns signatures, interfaces, generics, JSDoc)
- get_code_map — DCP-compressed context map with compress toggle (wraps
  boocontext_map, 10x token reduction vs full scan)

All 4 registered in ALL_TOOLS as read-only tools.
2026-06-08 00:45:46 +00:00
a8e475fdf4 perf(llama): unshadow cache-type + spec-decoding flags for agent opt-in
KV cache quantization (--cache-type-k q4_0) and ngram speculative decoding
(--spec-type ngram-mod) are high-value llama.cpp features that improve VRAM
usage and tokens/sec. Removing them from the shadowing lists allows agents
to enable them via llama_extra_args.
2026-06-07 22:40:23 +00:00
876c9bcd02 feat(coder,server): audit engine — session audit, guideline compliance, user correction tracking
Implements audit-harness-inspired session lifecycle: audit session
creation/end/recover/report-daily with JSONL buffer and graded context
recovery (L0-L4). Guideline service for behavioral compliance rules
(condition/action model with criticality). Correction service for
persistent user correction tracking across agent sessions.

8 supporting skills: audit-start/end/report-daily/recover + command
variants for slash-command integration.
2026-06-07 22:16:35 +00:00
c132215064 feat(web,server): inference settings UI with per-session inference overrides
Adds Inference tab to SettingsPane with controls for temperature, top-p,
top-k, min-p, and other inference parameters. Server-side route and
provider config wiring to pass overrides through the inference pipeline.
2026-06-07 22:16:29 +00:00
a72f7954b4 feat(web,coder): add analytics + results pages for token usage and run history
New /analytics route: token usage dashboard with aggregate summary,
per-session breakdown, context window stats, and per-category token
distribution. Data served from existing agent_sessions + tool_cost_stats.

New /results route: browsable archive of orchestrator flow runs and
arena battles. Two-tab layout (Analysis Runs / Arena Battles) using
existing API endpoints (no new backend).

Sidebar gains Results (ScrollText icon) and Token Analytics (BarChart3
icon) nav buttons above Settings.
2026-06-07 22:16:25 +00:00
31d8efe66a feat(web): enhanced file panel — side-by-side diff, hide whitespace, inline review
Adds DiffSplitView component for side-by-side diff mode, whitespace-only
change filtering, inline review comments with thread/gutter cell UI, diff
preferences persistence, and write-file API support for in-browser editing.

Backend: hideWhitespace param on git diff endpoint, write_file route.
2026-06-07 22:16:20 +00:00
6344105877 feat(server): memory v2 tests and search_memory tool 2026-06-07 21:55:47 +00:00
648a59a563 feat(server): memory v2 — BM25 + local embedding hybrid search
- Bm25Ranker: Okapi BM25 scoring (pure TS, no deps)
- Embedding module: ONNX-based local embeddings via onnxruntime-node
- Hybrid recall: BM25 (30%) + cosine similarity (70%) weighted merge
- Falls back to keyword-only via MEMORY_SEARCH=keyword env var
- extract_memory agent tool for persisting memory entries
2026-06-07 21:34:25 +00:00
1b70d41996 feat(server): add inference reliability - tool-shim and loop detectors
- ToolShim recovers XML/JSON tool calls from plain-text model output
- detectContentRepeat catches same-content loops
- detectToolLoop catches repeated tool invocations
- detectDoomLoop combines both detectors
2026-06-07 17:57:58 +00:00
02bb355a09 feat(server): add institutional memory recall
- File-based memory under .boocode/memory/ (project/user/reference topics)
- Hierarchical 4-scope scan: global → home → project → session
- Keyword/tag relevance matching for query-based recall
- Injected as <boocode-memory> block in system prompt at assembly
- v1 recall-only (extract/dream deferred to v2)
2026-06-07 17:57:44 +00:00
b8b2666fdc feat(server): add DCP clean-room context pruning
- Deduplication: removes consecutive identical tool_call+tool_result pairs
- Purge-errors: removes failed/empty tool results
- Transform orchestrator runs strategies in sequence pre-payload
- Wired into turn.ts before buildMessagesPayload
- Clean-room reimplementation (AGPL reference: behavior only)
2026-06-07 17:57:39 +00:00
bc83475a3d feat(server): add boocontext deep analysis tools and synthesis pipeline
- get_symbol_details: type signature, definition location, usage count
- get_call_graph: callers, callees, transitive references
- get_blast_radius added to SYNTHESIS_TOOLS
2026-06-07 17:57:29 +00:00
d6d246c15b feat(web,coder): arena pane — compare 2-6 AI competitors on same prompt
Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
local lane (llama-swap models, serial) and cloud lane (parallel).

Added to all three registries: @boocode/contracts WsFrameSchema,
server InferenceFrame, and web WsFrame.

Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
  writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column

Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar

Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 23:25:29 +00:00
1937af8df9 feat: in-app Orchestrator (Phase 2) — multi-agent conductor
Brings the deterministic Han-flow conductor into BooCode: launch any read-only
flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style
run pane, get an evidence-disciplined report — on local Qwen, persisted and
resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator
tasks fail closed if qwen is unavailable; never fall to write-capable native).

Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema,
flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes
(launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames.
Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows
(BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and
split menus, runs history + export. Plan: openspec/changes/orchestrator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 15:22:48 +00:00
519b1d2ca1 wip: pane/session + tab-bar checkpoint
Second checkpoint of in-flight work (sessions route, api types, ChatTabBar,
PaneHeaderActions, Workspace, useWorkspacePanes) so the Orchestrator branch
can rebase onto current main before merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 15:15:47 +00:00
163b5b86f7 wip: context-meter + model-label UI and provider/inference tweaks
Checkpoint of in-flight work so the orchestrator branch can rebase onto a
clean main: ContextBar → ContextMeter, model-label helper, model/agent picker
+ provider-snapshot/registry changes, inference payload + message-columns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:55:38 +00:00
fc4fbb0b7e feat: futuristic theme ladder + stacked landing banner
Add three opt-in dark themes (BooCode+, BooCode Classic, BooCode
Override) plus an in-place Ember polish, on a class-scoped effects
engine: matrix rain, a neon grid field, and frosted glass, all gated
by a localStorage "Animated background" toggle and prefers-reduced-
motion. Extend the server theme_id whitelist so the new ids persist,
and replace the Home landing wordmark with the stacked mascot +
wordmark banner.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:16:59 +00:00
d8bb2dabfe feat: git diff panel (Files/Git tab in the file browser)
Adds a Git tab to the right-side file panel that shows the project
repository's diff and lets the user stage, unstage, commit, and discard
whole files in-session. Two comparison modes (Uncommitted vs HEAD, and the
branch vs its base — upstream tracking branch else default branch), auto-
selected by repo state on first open and pinned after explicit choice;
per-file expand/collapse with lazy syntax-highlighted diffs, +/- stats, and
binary/large-file placeholders. All git read and write logic lives in
apps/server via a new git_diff service: argv-safe execFile only (never a
shell), per-file paths validated repo-relative through pathGuard with a
realpath symlink-escape check, server-derived commit identity (the request
carries no author fields), and the write endpoints are deliberately absent
from the assistant tool registry. Reads are bounded (30s deadline, 10MB);
an index lock or an in-progress merge/rebase/cherry-pick/bisect surfaces as
"repository busy" and disables writes. The panel stays current via a client
git_diff_refresh session event (no new wire contract) coalesced across tab
open, mutations, turn completion, and pending-change apply. Discard is an
irrecoverable hard-delete behind a plain confirm that distinguishes
reverting a tracked file from deleting an untracked one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 03:18:41 +00:00
f32fd928b3 feat: post-review backlog hardening (cancel/parser/stall/history/9502)
Five independent items from the post-review backlog. F1: Stop on an external
agent task now aborts the running child via a per-task AbortController registry
reachable from the cancel route, and finalizes the assistant message as
cancelled (fixing two latent bugs — catch blocks left the message streaming,
and warm success-paths wrote complete on an aborted turn); warm pools/worktrees
are preserved and the native path is unchanged. F2/F3: prune the tool-call
parser to its two load-bearing exports (unexport eight zero-caller symbols, add
a gate test for the <invoke>-as-text fallback) and route placeholder-rejection
logging through pino. F6: a 90s per-chunk stall-timeout wraps native inference's
fullStream via AbortSignal.any so a hung stream finalizes the message instead of
hanging — no retry (a pure classifyStreamError helper is added). F7: a read-only
view_session_history MCP tool (newest-N, chronological). F9: retire the unused
apps/coder/web :9502 fallback SPA, keeping every API/WS/health/MCP route.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:23:11 +00:00
ae3f10b19d Merge remote-tracking branch 'origin/main' 2026-06-02 21:30:28 +00:00
649ce71eff feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace
package, @boocode/contracts, consumed by server/web/coder/coder-web via
workspace:* + a per-subpath exports map. The ws-frames and provider-config
Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason,
AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are
each single-sourced. Deletes the byte-identical copies and their parity
tests, fixes a live AgentSessionConfig drift (coder dead copy removed,
unified to the web required/nullable shape), removes the dead pending_change
WS arms in the fallback SPA, and inverts the build order (contracts builds
first) across root build, Dockerfile, and the coder deploy docs. Reverses
the shared-package decision declined in v2.5.12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:08 +00:00
8c200216eb refactor: codebase audit cleanup — dead code, dedup, module splits
Multi-agent audit + aggressive cleanup across server/web/coder/booterm,
delivered behind a DEFER discipline so none of the in-flight files were
touched. Removes dead code/deps/columns, dedups server + coder helpers,
and splits the oversized modules (tools.ts, opencode-server.ts,
sentinel-summaries, turn.ts, TerminalPane.tsx) behind stable contracts.
Adds 78 parity/unit tests (server 587, coder 323); fixes two latent bugs
(ChatPane queue keys, FileViewerOverlay blank-line parity).

Intended tag: v2.7.12-audit-cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 21:12:29 +00:00
afaca9e426 feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)
- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points — a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:01:03 +00:00
3a646fd6df feat: BooCode 2.0 UI — Ember theme, brand banner, coder tabs, model-attribution chips
- Ember theme (Obsidian charcoal + #ff7a18 orange), now DEFAULT_THEME_ID; server theme_id whitelist gains 'ember'
- Brand banner: transparent Westie mascot + >_BooCode wordmark, big/edge-to-edge (flood-filled to transparency + cropped)
- Coder panes are multi-tab: + opens a BooCode tab, split opens a pane (shared ChatTabBar via tabKind + createCoderTab; closeOtherTabs/tab-numbering extended to coder)
- Model-attribution: new messages.model column stamped at finalizeCompletion (BooChat/native coder) + dispatcher assistant-row creation (external coder); surfaced via view + wire types + live frame; rendered as a subtle shortened-name chip (shortenModelName)
- Composer Web toggle moved into a boxed focus-ringed input; glowing accent dot on tool rows
- Claude SDK follow-ups (1M context, follow-up-message fix, collapsed thinking/tool chips) + CLAUDE_SDK_BACKEND=1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:30:47 +00:00
59cf082e06 feat: normalized external-agent status (#10 scoped) (v2.7.6)
Scoped half of boocode_code_review_v2 §1 #10 — publish the agent status
BooCoder already observes (the config-injection notify-hook is the documented
follow-on, clean-room from superset ELv2).

- agent_status_updated WS frame (working|blocked|idle|error), server+web parity.
- Published from the dispatcher's turn boundaries (warm-acp/opencode/sdk/pty:
  working at start, idle/error at end) + the permission flow (blocked/working).
  Best-effort, never breaks a turn.
- Clean-room normalizeAgentEvent helper (superset's vendor-event -> Start/blocked
  /Stop collapse, event names as facts) + 25 tests — reused by the follow-on.
- AgentComposerBar status dot (distinct from the WS-liveness dot), tracked per
  (chat,agent) by a useAgentStatus map in CoderPane.

Built by 2 parallel agents vs a pinned frame contract. Server 545 + coder 294
tests passing (25 new); web tsc + builds clean; ws-frames parity green. Clears
the actionable review backlog (#1/#3/#4/#6-#12). Builds on v2.7.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:04:04 +00:00
bcc89d8adc feat: MistakeTracker + file-provenance ledger (v2.7.4)
Two native-inference hardening features from boocode_code_review_v2 §1 #12.

MistakeTracker: new pure mistake-tracker.ts tracks consecutive heterogeneous
tool failures (kinds surfaced per tool from tool-phase.ts). On 3 in a row the
turn loop soft-nudges (model-facing recovery guidance + mistake_recovery
sentinel + reset), then escalates to stopping the turn (cap-hit-style, Continue
affordance) on a re-trip. Complements doom-loop (identical repeats) + cap-hit.

File-provenance ledger: compaction.ts derives a deterministic ## Files Read list
from the head messages' read-tool calls and injects it into the rolling-summary
prompt so provenance survives compaction (no new table; read-only).

mistake_recovery sentinel: MessageMetadata arm (server + web) + MessageBubble
render branch. Built by 2 parallel agents. Server 545 tests passing (23 new);
build + web tsc clean. Native-inference only. Builds on v2.7.3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:05:03 +00:00
a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00
a8bfde8f8d feat: relicense AGPL-3.0 → MIT (v2.7.0)
Clear the 3 Unsloth-Studio-derived AGPL files and flip LICENSE + 5
package.json from AGPL-3.0-only to MIT.

- html-to-md.ts → MIT node-html-markdown (parse5 dropped)
- llama-args-validator.ts → clean-room (flag denylist = facts)
- tool-call-parser.ts → delete dead Unsloth-ported code; keep
  extractToolCallBlocks/stripToolMarkup byte-identical (no behavior change)
- LICENSE → MIT (Copyright (c) 2026 indifferentketchup); 5 package.json → MIT;
  AGPL SPDX headers removed; README License section; license-mit guard test
- roadmap License-debt batch marked shipped; openspec/changes/license-debt-mit

Decouples the relicense from the native-parsing retirement (the ported parser
was dead code). Server suite 519 passing; build + coder typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 08:16:03 +00:00
2dfbef4c41 feat: v2.6 follow-ups — apps/server close-hook caller + DiffPanel staging hint (3.7)
apps/server fire-and-forgets BooCoder's Phase-3 close hooks (new coder-notify.ts, reuses BOOCODER_URL, never-rejects) on session-delete + chat archive/archive-all/delete, so warm backends + worktrees tear down immediately (idle-evict/reaper was the backstop). 3.7: BooCoder DiffPanel shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits (pure derivation from per-change agent + current provider, no new state). 6 new server tests (coder-notify); 537 server tests pass; web+server tsc/build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 02:35:11 +00:00
d05f73be26 feat(server): workspace_panes envelope + read_tab_by_number tool
Widen the sessions.workspace_panes JSONB from a bare WorkspacePane[] to a
WorkspaceState envelope { panes, tabNumbers, nextTabNumber, closedPaneStack }.
The PATCH validator accepts either the legacy array or the envelope (zod union)
and normalizes to a full envelope before storing, so existing array-shaped rows
migrate transparently on next write. The session_workspace_updated WS frame
schema is widened to match (kept byte-identical to the web copy; parity test
passes).

Adds read_tab_by_number, a read-only tool that resolves a session-scoped tab
number to its chat via the persisted tabNumbers map and returns that chat's
transcript (oldest-first, sentinels skipped, capped at 20k chars). Tools gain an
optional ToolExecCtx ({ sql, sessionId }) 4th param on ToolDef.execute, threaded
through executeToolCall from executeToolPhase; the param is optional so existing
filesystem tools and the apps/coder consumer stay compatible. Registered in
ALL_TOOLS + READ_ONLY_TOOL_NAMES.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 02:14:42 +00:00
cb1846c0d5 feat(coder): re-key agent_sessions to (chat_id, agent) + worktrees table (P1.5-b)
The tab (a chat) is the context unit: two opencode tabs in one session are two independent agent contexts sharing one worktree. agent_sessions re-keys from (session_id, agent) to (chat_id, agent) — chat_id FK ON DELETE CASCADE (closing a tab ends its context); worktree_id and session_id become informational SET NULL columns. New worktrees table (one-per-session, survives session delete via session_id SET NULL) supersedes session_worktrees, which is defanged (CASCADE dropped) not yet removed. chat_id is threaded end-to-end: tasks.chat_id added, written by the coder message + skills routes from the frontend tab, read by runOpenCodeServerTask which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic) so ensureSession never gets a null key. Idempotent migration with a backfill-verify gate (0-row assertion after the test session was deleted). config_hash fingerprint logic preserved; one-worktree-per-session unchanged; runExternalAgent untouched. Column rename worktree_path -> path repointed at all five readers (server delete-guard, risk/stash endpoints, ensureSessionWorktree). Supersedes the earlier (worktree_id) draft.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:35 +00:00
3a26563be2 feat(coder): guard session delete against worktree work loss
Deleting a BooChat session CASCADE-wipes its session_worktrees row, which would silently orphan uncommitted/unpushed/unmerged work in the worktree. Add a pre-DELETE gate: the server reads session_worktrees from the shared DB first (no row = chat-only session = delete immediately, zero round-trip), and for worktree-backed sessions calls a new BooCoder endpoint that runs git on the host (only the host systemd service can see /tmp/booworktrees). checkWorktreeWorkAtRisk reports dirty/unpushed/unmerged via the audited hostExec+shellEscape path; default branch is detected from refs/remotes/origin/HEAD (not the worktree's own branch), never hardcoded. Any at-risk worktree returns 409 with per-worktree RiskReport[]; force=true bypasses the check entirely. Fail-closed: coder unreachable/errored also blocks (force still escapes). The sidebar renders a block dialog distinguishing work-at-risk (Commit/Stash/Force) from couldn't-verify (Cancel/Force only); stash uses -u and re-blocks on remaining commits with an explanatory message. Commit never auto-commits — it routes the user to the session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:01:25 +00:00
1bbeaf95c7 fix: auto-name uses session model + pane auto-remove on last tab close
Two independent UI/UX fixes:

- auto_name.ts: pass the session's own model as fallbackModel to
  taskModelCompletion, so chat rename uses whatever model is already
  loaded on llama-swap instead of forcing a swap to DEFAULT_MODEL
  (which times out at 10s when a different model is active).
- useWorkspacePanes.ts: when the last tab in a pane is closed and
  other panes exist, remove the pane entirely instead of leaving an
  orphaned empty panel.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:37:38 +00:00
547fd70650 server/coder: working-tree backend changes (pre-existing)
Checkpoint of in-progress backend work present in the tree, not authored this session: auto_name, inference tool-phase/turn, secret_guard, provider-registry, plus a new agent-allowlist test (7 tests, passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 03:12:16 +00:00
cbef7618b3 v2.5.1-budget-100: raise all tool call budgets to 100 + codecontextignore fix
Budget defaults raised from 50/10/50 to 100/100/100 (read-only,
non-read-only, no-agent). Per-agent max_tool_calls from AGENTS.md
still overrides.

Added .claude/worktrees/ to .codecontextignore to prevent
get_codebase_overview from parsing empty stub files in stale
worktree node_modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-28 02:40:26 +00:00
fcc7c5a86e v2.5.0-task-model: lightweight task model services + tasks table
Task model infrastructure for cheap LLM calls (auto-naming, search
rewrite, tags, summaries) via a dedicated llama-server instance at
TASK_MODEL_URL, falling back to LLAMA_SWAP_URL with FAST_MODEL when
unset. Replaces the inline fetch in auto_name.ts with taskModelCompletion.

Adds search query rewriting: on step 0 when web tools are enabled, the
user's message is summarized into a search intent hint appended to the
system prompt, improving web_search relevance.

Schema: tasks table for provider dispatch and arena, sessions.tags column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 21:44:39 +00:00
bcfc94fa47 v2.4.1-sidecar-routing: route per-agent flags to llama-sidecar + tool gap fix
Batch 3c: when an agent has llama_extra_args in AGENTS.md, provider.ts
routes inference through LLAMA_SIDECAR_URL instead of LLAMA_SWAP_URL.
X-Agent-Flags header built from the agent's flags. Boot-time guard
refuses to start if any agent has llama_extra_args but LLAMA_SIDECAR_URL
is unset. PrefixFingerprint gains a route field (swap/sidecar) for
per-turn visibility. 9 provider tests.

AGENTS.md tool gap: all agents (except Prompt Builder) were missing 8
tools that were added after the original tool lists were written:
request_read_access, view_truncated_output, ask_user_input, git_status,
get_blast_radius, get_hot_files, get_middleware, get_routes. The missing
request_read_access caused silent "permission denied" when reading files
outside the project root.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 19:28:08 +00:00
90a6761b07 v2.4.0-unsloth-studio-lift: port 3 Unsloth Studio AGPL-3.0 modules
Batch 1 — tool-call-parser.ts: replaces xml-parser.ts with a port of
Unsloth's tool_call_parser.py. Adds balanced-brace JSON scanner,
single-param fast path, hasToolSignal/stripToolMarkup/parseToolCallsFromText
exports, and stream-finalization stripping at all three final-write sites
(error-handler, finalizeCompletion, executeToolPhase). Anthropic <invoke>
shape preserved. 75+12 tests.

Batch 2 — web/html-to-md.ts: parse5 tree-walking HTML-to-Markdown converter
ported from Unsloth's _html_to_md.py. Replaces web_fetch's regex stripHtml
with structured markdown output (headings, links, lists, tables, code blocks,
blockquotes, entity decoding). 29 tests.

Batch 3 — llama-args-validator.ts: port of llama_server_args.py deny-list
validator. Wired into AGENTS.md frontmatter parser — llama_extra_args field
validated at load time, rejects managed flags (model identity, networking,
auth/TLS, server UI). No runtime consumer yet (llama-swap boundary). 76 tests.

All three files carry SPDX-License-Identifier: AGPL-3.0-only headers.
LICENSE flipped to AGPL-3.0-only in prior commit (a938cf1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 23:30:50 +00:00
154ef78f7c v2.3.1-permission-questions: enrich ACP permission wire for interactive questions and elicitations
The permission_requested WS frame now carries kind ('tool'|'question'|'plan'|
'elicitation'), input (the tool's rawInput payload), and description fields.
PermissionCard detects question-type permissions (Claude Code's AskUserQuestion)
and renders an interactive radio/checkbox form instead of approve/deny buttons.
Submitting answers auto-selects the first allow option.

Also wires up ACP createElicitation (unstable/experimental) — JSON Schema-driven
forms for structured user input. The same PermissionCard renders elicitation
fields with type-appropriate inputs. Both flows use the existing permission-waiter
blocking pattern with 120s timeout.

The response path (POST /api/coder/tasks/:id/permission) now accepts optional
updated_input alongside option_id, forwarded to the ACP agent as the user's
answer payload. Elicitation responses map to accept/decline/cancel actions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 21:28:14 +00:00
792bbb9da3 v2.3.0-sampling-params-ask-user: agent sampling params, ask_user_input in CoderPane, UX polish
Add top_p/top_k/min_p/presence_penalty to AGENTS.md frontmatter and thread
through inference (agents.ts parser → Agent type → stream-phase → sentinel
summaries). Null means omit from request body, preserving provider defaults.

Wire ask_user_input interactive card into both BooCoder frontends: the
CoderPane in BooChat's SPA (CoderMessageList now renders AskUserInputCard
instead of ToolCallLine for ask_user_input tool calls) and the standalone
coder SPA (MessageBubble + new AskUserInputCard + shadcn ui primitives).

Additional fixes: SessionLandingPage uses ChatInput with slash-command
support and lazy chat creation; Session.tsx hydrate-race fix for empty pane
promotion; AgentPicker wider dropdown with line-clamp; ModelPicker min-width;
Textarea converted to forwardRef; Recon agent added to AGENTS.md; codecontext
host port exposed in docker-compose.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 21:02:21 +00:00
31e1b32be1 v2.2.2-xml-placeholder-reject: drop placeholder XML tool calls at parse time
Reject qwen3.6 spurious <invoke> tails with path "..." or empty args before
they enter toolCalls, preventing duplicate assistant answers. Dropped blocks
append to flushed text; four new xml-parser tests. DEFERRED-WORK §6 for
console.debug → pino cleanup.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 16:22:43 +00:00