Commit Graph

85 Commits

Author SHA1 Message Date
c11e26090f feat(coder): boulder state — cross-session plan persistence + auto-resumption
New plans table (id, project_id, title, description, status, flow_run_id,
progress_pct, items_total, items_completed, metadata, timestamps) with
CHECK constraints and indexes.

Plan store (plan-store.ts): createPlan, getPlan, listPlans, listActivePlans,
updatePlan, updatePlanFromRun, findPlanWithRunningRun, planStatusFromRun.

Flow-runner integration: onRunTerminal callback fires on every terminal
transition (complete/fail/cancel) and updates linked plans automatically.

5 API endpoints: GET /api/plans, GET /api/plans/active, GET /api/plans/:id,
POST /api/plans, PATCH /api/plans/:id.

484 tests pass, build clean.
2026-06-08 01:11:07 +00:00
524a0deaa1 feat(coder): add model resolution core + multi-batch matcher
Model resolution (from oh-my-openagent/model-core): 6-step priority
resolution pipeline (UI select -> user config -> category default ->
user fallback -> policy chain -> system default), provider fallback
chains, fuzzy model matching, error classification, provider-specific
model ID transforms. 14 files, zero runtime deps.

Multi-batch matcher (from boocontext-audit): 6 batch types
(Observational, Actionable, PreviouslyApplied, Disambiguation,
ResponseAnalysis, LowCriticality) for behavioral guideline evaluation.
RelationalResolver with iterative convergence (DEPENDS_ON,
PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES). SchematicGenerator
abstract class with retry and execution plans. 4 files.
2026-06-08 00:17:55 +00:00
a7a40c5b46 feat(coder): add hashline editing core + wire audit hooks into dispatch pipeline
Hashline editing: content-hash anchors for edit_file stale-patch detection.
Pure-JS xxHash32, line hash computation, validation with HashlineMismatchError,
256-entry hash dictionary. 6 files in apps/coder/src/services/hashline/.

Audit hooks: emitHook('tool.execute.after') wired in frame-emitter.ts for
completed/failed tool results. emitHook('turn.end') wired at terminal points
in dispatcher.ts (all 5 run functions: native, external, opencode, warm ACP,
claude SDK). Fire-and-forget, non-blocking.
2026-06-07 23:17:47 +00:00
876c9bcd02 feat(coder,server): audit engine — session audit, guideline compliance, user correction tracking
Implements audit-harness-inspired session lifecycle: audit session
creation/end/recover/report-daily with JSONL buffer and graded context
recovery (L0-L4). Guideline service for behavioral compliance rules
(condition/action model with criticality). Correction service for
persistent user correction tracking across agent sessions.

8 supporting skills: audit-start/end/report-daily/recover + command
variants for slash-command integration.
2026-06-07 22:16:35 +00:00
c132215064 feat(web,server): inference settings UI with per-session inference overrides
Adds Inference tab to SettingsPane with controls for temperature, top-p,
top-k, min-p, and other inference parameters. Server-side route and
provider config wiring to pass overrides through the inference pipeline.
2026-06-07 22:16:29 +00:00
a72f7954b4 feat(web,coder): add analytics + results pages for token usage and run history
New /analytics route: token usage dashboard with aggregate summary,
per-session breakdown, context window stats, and per-category token
distribution. Data served from existing agent_sessions + tool_cost_stats.

New /results route: browsable archive of orchestrator flow runs and
arena battles. Two-tab layout (Analysis Runs / Arena Battles) using
existing API endpoints (no new backend).

Sidebar gains Results (ScrollText icon) and Token Analytics (BarChart3
icon) nav buttons above Settings.
2026-06-07 22:16:25 +00:00
31d8efe66a feat(web): enhanced file panel — side-by-side diff, hide whitespace, inline review
Adds DiffSplitView component for side-by-side diff mode, whitespace-only
change filtering, inline review comments with thread/gutter cell UI, diff
preferences persistence, and write-file API support for in-browser editing.

Backend: hideWhitespace param on git diff endpoint, write_file route.
2026-06-07 22:16:20 +00:00
0d6e9a2413 feat(coder): complete orchestrator advanced patterns
- Approval gate steps pause and await human resolution
- appendStepEvent wired into markStep, failRun, dispatchAgentStep
- Trigger rule unit tests (6 variants)
- New parallel-research flow with one_success trigger
2026-06-07 21:55:47 +00:00
fb52eb3efa feat(coder): orchestrator advanced flow patterns
- TriggerRule type (all_success/one_success/all_done) for parallel deps
- Variable substitution ($stepId.output.field) in agent step prompts
- Approval gate step kind (pauses flow via permission frames)
- flow_step_events table for append-only event-sourced step log
- evaluateTriggerRule pure function in flow-runner-decisions
2026-06-07 21:34:30 +00:00
f436021bf9 feat: deferred items — arena token API + UI, ToolShim docs
- Arena API: token_breakdown selected in contestant query
- ArenaPane: token category breakdown bar (s/u/a/t/r) in expanded contestant view
- apps/server/CLAUDE.md: document tool-shim and loop-detectors
2026-06-07 18:41:26 +00:00
bef6bef504 docs: update changelog, roadmap, current focus, and coder CLAUDE.md
- CHANGELOG: v2.8.0-fork-lifts entry covering all 8 integrations
- Roadmap: update shipped header through v2.8.0, bump last-updated date
- CURRENT.md: reflect fork-lifts as last-shipped batch
- apps/coder/CLAUDE.md: document edit-guards behavior and API
2026-06-07 18:05:55 +00:00
87923cb07b feat(coder): add flow-artifacts write helper and boocontext MCP template 2026-06-07 18:05:49 +00:00
c6ecd984c5 feat(coder): add TokenScope analyzer and DB persistence module
- analyzeMessages classifies message parts into system/user/assistant/tools/reasoning
- persistTaskBreakdown writes JSONB to tasks table
- Backfills the token-analysis/ module (contract committed earlier)
- 6 unit tests covering classification, tool calls, reasoning tokens
2026-06-07 18:05:35 +00:00
2a83f61070 feat(coder): add import-drop detection to edit safety guards
- checkDroppedImports detects removed import/require lines in edits
- Runs alongside truncation guard in pending_changes.ts
- Supports ESM imports, CJS require, type imports, side-effect imports
2026-06-07 18:05:30 +00:00
b64941ad4b feat(coder): add plugin hook host
- Typed hook registry with registerHook/emitHook/clearHooks
- Hooks: tool.execute.before/after, turn.start/end, task.terminal
- SUL patterns only (oh-my-openagent: architecture study, no code copy)
2026-06-07 17:57:53 +00:00
cdc782e044 feat(core): add subagent protocol enhancements
- AgentCapabilitiesSchema with supportsStreaming/Reasoning/Background flags
- supportsStreaming and supportsReasoningStream fields in ProviderSnapshotEntry
- new_task tool: background mode flag for non-blocking subtask dispatch
2026-06-07 17:57:49 +00:00
ee749d8698 feat(coder): add LSP code intelligence tools
- lsp/ module: types, config, JSON-RPC client, server-manager, operations
- lsp_diagnostics: TypeScript/JavaScript diagnostics for a file
- lsp_goto_definition: find symbol definition at position
- lsp_find_references: find all references to a symbol
- Registered as READ_TOOLS in tool index
2026-06-07 17:57:35 +00:00
6b7c2bab1e feat(coder): persist token breakdown in arena decisions and schema 2026-06-07 17:57:19 +00:00
373ba86e5d feat(coder): add edit safety guards against truncation 2026-06-07 17:57:15 +00:00
cce685b1a7 fix(coder): harden edit-apply pipeline against block duplication
Root cause: two proven corruption mechanisms — (M1) non-idempotent apply
stamped the same block N times when a quantized model re-emitted the same
edit_file call or a turn was retried; (M2) Levenshtein tier 4 was fail-open
with no uniqueness guard, silently splicing into the wrong location.

Fixes applied at every layer of the pipeline:

Matcher (fuzzy-match.ts): raise SIMILARITY_THRESHOLD 0.66 → 0.85; add
AMBIGUITY_EPSILON uniqueness guard — two windows within 0.05 of the top
score → ambiguous, not a guess; add block-anchor gate (≥3-line needles
require first+last line exact match before a window is scored).

Edit planner (pending_changes.ts): extract planEdit() as a pure function;
idempotency guards detect already-applied states (anchored insert re-stamp,
old-gone-but-new-present); findPendingDuplicate() collapses identical
pending rows at queue time so M1 never reaches applyOne.

Atomic writes (pending_changes.ts): temp-file + rename on the same
filesystem so a crash can't leave a half-written source file; realpath()
first so symlinks survive the rename.

Per-file mutex (pending_changes.ts): withFileLock() serializes concurrent
read-modify-write on the same path via a chained-Promise Map.

EOL preservation (pending_changes.ts): normalize CRLF → LF for matching,
restore native line ending on write so Windows-style files stay clean.

Context isolation (inference_context.ts): replace module-level singleton
with AsyncLocalStorage so concurrent inference runs (arena parallel
dispatch, dispatcher poll racing a user message) each get their own
scoped context with no clobbering.

Tests: plan-edit.test.ts (pure planEdit unit tests), extended fuzzy-match
and pending_changes_integration suites, ALS isolation test that proves
overlapping runs get correct session IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 01:44:37 +00:00
d6d246c15b feat(web,coder): arena pane — compare 2-6 AI competitors on same prompt
Arena is a new pane kind for competitive AI evaluation. A Battle runs
the same prompt against 2-6 Contestants across two concurrent lanes:
local lane (llama-swap models, serial) and cloud lane (parallel).

Added to all three registries: @boocode/contracts WsFrameSchema,
server InferenceFrame, and web WsFrame.

Backend (apps/coder):
- arena-runner: battle scheduler, lane classifier, benchmark, results
  writer, resume, user winner override
- arena-analyzer: two-stage digest→judge analysis on DEFAULT_MODEL
- arena-decisions: status transitions and resume logic (unit-tested)
- arena-analyzer-helpers: pure helper functions (unit-tested)
- arena-model-call: model call utility for analysis
- arena routes: create/get/list/stop/analyze/cross-examine/winner/diff
- schema: battles, contestants, cross_examinations tables (idempotent)
- remove old /api/arena* routes and tasks.arena_id column

Frontend (apps/web):
- ArenaLauncherDialog: battle type, prompt, contestant selection
- ArenaPane: live roster, streaming output, analysis, cross-exam
- DiffView: unified diff with line-by-line color for coding contests
- Winner override per-row dropdown (Trophy icon)
- battle_updated WS handler for live winner/analysis updates
- arena pane kind in Workspace, ChatTabBar, useSidebar

Cross-app:
- ArenaState and ArenaContestantShape/WsFrame types (contracts)
- battle_* frames in WsFrameSchema, InferenceFrame, and web WsFrame
- manifest.json written per battle results folder
- /Arena added to .gitignore

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-06 23:25:29 +00:00
e04d0fdaa8 feat(coder): unified Plan/Ask/Bypass permission picker
Replace the raw per-agent mode dropdown in the BooCoder composer with a
curated three-option permission ladder mapped generically onto each
provider's native modes: `plan` id -> Plan, default -> Ask, isUnattended
-> Bypass (claude bypassPermissions, qwen yolo, opencode full-access).
modeId stays the single wire field; the active unified mode is derived
from it (no contracts change).

Native BooCode gains its own mode set: Ask stages to the pending-changes
queue (today's behavior), Bypass auto-applies the queue to disk after the
turn (interactive messages path + task dispatcher path), Plan falls back
to Ask. The shared apps/server inference engine is left untouched.

Also preserve isUnattended on live-probed ACP modes so opencode's bypass
mode stays detectable from the wire.

Coder 373 tests green; coder + web typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 15:14:21 +00:00
875cae0843 fix(coder): parse YAML block-scalar descriptions in slash command discovery
Most plugin/han SKILL.md and command files write `description:` as a folded
block scalar (`>` / `|`) with the text on the following indented lines. The
old single-line frontmatter reader captured the literal `>`, so the slash
menu showed garbage/blank descriptions for nearly all of them. frontmatterField
now collapses folded blocks (join with spaces) and preserves literal blocks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 17:00:49 +00:00
4caa5f91ff docs: CLAUDE.md notes for Orchestrator + gitignored .env.host
Document the in-app Orchestrator engine and its load-bearing read-only
invariant in apps/coder/CLAUDE.md, and note that apps/coder/.env.host is
now gitignored (recreated from .env.example with CLAUDE_SDK_BACKEND=1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:48:50 +00:00
bfda61e27e chore: stop tracking apps/coder/.env.host
Untrack the host env file (git rm --cached, kept on disk for the boocoder
service) and widen .gitignore to .env.* (re-including .env.example) so env
files no longer get committed. The file's prior contents (dev DB password +
internal Tailscale URLs; no API keys) remain in history — left as-is given the
single-user Tailscale-only threat model.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:32:03 +00:00
1937af8df9 feat: in-app Orchestrator (Phase 2) — multi-agent conductor
Brings the deterministic Han-flow conductor into BooCode: launch any read-only
flow from BooChat or BooCoder, watch each agent stream live in a Paseo-style
run pane, get an evidence-disciplined report — on local Qwen, persisted and
resumable. Read-only enforced hard via qwen --approval-mode plan (orchestrator
tasks fail closed if qwen is unavailable; never fall to write-capable native).

Backend (apps/coder): re-homed conductor defs, flow_runs/flow_steps schema,
flow-runner + dispatcher onTaskTerminal hook, restart-resume, runs routes
(launch/list/get/cancel), user-channel WS. Contracts: two flow_run_* frames.
Web: orchestrator pane kind + OrchestratorPane, Workflow button + slash flows
(BooChat/BooCoder parity), FlowLauncherDialog, "New Orchestrator" in the + and
split menus, runs history + export. Plan: openspec/changes/orchestrator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 15:22:48 +00:00
163b5b86f7 wip: context-meter + model-label UI and provider/inference tweaks
Checkpoint of in-flight work so the orchestrator branch can rebase onto a
clean main: ContextBar → ContextMeter, model-label helper, model/agent picker
+ provider-snapshot/registry changes, inference payload + message-columns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:55:38 +00:00
f32fd928b3 feat: post-review backlog hardening (cancel/parser/stall/history/9502)
Five independent items from the post-review backlog. F1: Stop on an external
agent task now aborts the running child via a per-task AbortController registry
reachable from the cancel route, and finalizes the assistant message as
cancelled (fixing two latent bugs — catch blocks left the message streaming,
and warm success-paths wrote complete on an aborted turn); warm pools/worktrees
are preserved and the native path is unchanged. F2/F3: prune the tool-call
parser to its two load-bearing exports (unexport eight zero-caller symbols, add
a gate test for the <invoke>-as-text fallback) and route placeholder-rejection
logging through pino. F6: a 90s per-chunk stall-timeout wraps native inference's
fullStream via AbortSignal.any so a hung stream finalizes the message instead of
hanging — no retry (a pure classifyStreamError helper is added). F7: a read-only
view_session_history MCP tool (newest-N, chronological). F9: retire the unused
apps/coder/web :9502 fallback SPA, keeping every API/WS/health/MCP route.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 02:23:11 +00:00
9a139633b8 fix(coder): drop human_inbox view before dropping tasks columns
The audit-cleanup migration dropped tasks.feature_values/worktree_path, but
human_inbox is `SELECT * FROM tasks` and pins every column, so the DROP COLUMN
failed (2BP01) on any existing DB and crash-looped boocoder on boot. Drop the
view, drop the columns, then recreate it — idempotent on fresh and existing DBs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 00:29:28 +00:00
2c4ff2063d fix: reconcile audit-cleanup refactor with @boocode/contracts SSOT (worktree-risk type, frame-emitter import)
worktree-risk.ts now returns the package's WorktreeRiskReport (local RiskReport interface removed); frame-emitter.ts imports WsFrame from @boocode/contracts/ws-frames (the deleted @boocode/server/ws-frames subpath).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 23:25:50 +00:00
ae3f10b19d Merge remote-tracking branch 'origin/main' 2026-06-02 21:30:28 +00:00
649ce71eff feat: single-source cross-app wire contracts in @boocode/contracts (v2.7.13)
Move all hand-synced cross-app wire contracts into one built workspace
package, @boocode/contracts, consumed by server/web/coder/coder-web via
workspace:* + a per-subpath exports map. The ws-frames and provider-config
Zod schemas are schema-first (z.infer); MessageMetadata, ErrorReason,
AgentSessionConfig, the provider snapshot types, and WorktreeRiskReport are
each single-sourced. Deletes the byte-identical copies and their parity
tests, fixes a live AgentSessionConfig drift (coder dead copy removed,
unified to the web required/nullable shape), removes the dead pending_change
WS arms in the fallback SPA, and inverts the build order (contracts builds
first) across root build, Dockerfile, and the coder deploy docs. Reverses
the shared-package decision declined in v2.5.12.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 21:24:08 +00:00
8c200216eb refactor: codebase audit cleanup — dead code, dedup, module splits
Multi-agent audit + aggressive cleanup across server/web/coder/booterm,
delivered behind a DEFER discipline so none of the in-flight files were
touched. Removes dead code/deps/columns, dedups server + coder helpers,
and splits the oversized modules (tools.ts, opencode-server.ts,
sentinel-summaries, turn.ts, TerminalPane.tsx) behind stable contracts.
Adds 78 parity/unit tests (server 587, coder 323); fixes two latent bugs
(ChatPane queue keys, FileViewerOverlay blank-line parity).

Intended tag: v2.7.12-audit-cleanup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 21:12:29 +00:00
e5ce01ae72 fix(coder): include model in WS snapshot SELECT so the attribution chip survives refresh
CoderPane hydrates from the HTTP listMessages fetch (SELECT has model) AND the WS snapshot frame, and the snapshot handler setMessages-overwrites the HTTP load. The snapshot query in apps/coder/src/routes/ws.ts had its own column list that omitted model, so on coder refresh the chip's model was lost (it showed live via the message_complete frame). One-column fix: add model to that SELECT. CLAUDE.md mapper-chain note updated to list the WS snapshot SELECT.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 18:03:10 +00:00
afaca9e426 feat: MCP {env:VAR} key substitution + coder model/tool-result fixes + docs refactor (v2.7.9)
- MCP secrets: substituteEnvVars recursively resolves {env:NAME} in mcp.json string values from process.env before Zod (opencode-compatible); unset -> '' + boot warning, and invalid-config log names the unset vars (an empty {env:VAR} in a strict url/command field invalidates the whole config)
- data/mcp.json now untracked (.gitignore flips !data/mcp.json -> !data/mcp.example.json); tracked template data/mcp.example.json carries "{env:CONTEXT7_API_KEY}"; .env.example documents the key (9 mcp-config tests)
- Coder fix: message_complete frame model widened string -> string|null (server+web ws-frames parity); dispatcher publishes model: task.model at all 4 external completion points — a null model otherwise fail-closed in publishFrame and dropped the whole frame incl. status:'complete' (regression test)
- Coder fix: claude-sdk mapUserToolResults maps user-message tool_result blocks -> terminal tool_update events (completed/failed w/ output) so tool snapshots resolve instead of spinning forever
- Composer: AgentComposerBar drops §9b resumed/history/new chip + token readout, loses flex-wrap so the row stays one line; CoderPane gains a per-chat localStorage agent-config cache (restores last model on reopen) + threads model into the timeline/chip
- Docs: root CLAUDE.md slimmed (~190 lines), per-app refs split to apps/{coder,server,web}/CLAUDE.md; new docs/coder-backends.md, docs/project-discovery.md, docs/coding-standards/ (cross-app-contract-parity); ARCHITECTURE.md links the backends doc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 17:01:03 +00:00
3a646fd6df feat: BooCode 2.0 UI — Ember theme, brand banner, coder tabs, model-attribution chips
- Ember theme (Obsidian charcoal + #ff7a18 orange), now DEFAULT_THEME_ID; server theme_id whitelist gains 'ember'
- Brand banner: transparent Westie mascot + >_BooCode wordmark, big/edge-to-edge (flood-filled to transparency + cropped)
- Coder panes are multi-tab: + opens a BooCode tab, split opens a pane (shared ChatTabBar via tabKind + createCoderTab; closeOtherTabs/tab-numbering extended to coder)
- Model-attribution: new messages.model column stamped at finalizeCompletion (BooChat/native coder) + dispatcher assistant-row creation (external coder); surfaced via view + wire types + live frame; rendered as a subtle shortened-name chip (shortenModelName)
- Composer Web toggle moved into a boxed focus-ringed input; glowing accent dot on tool rows
- Claude SDK follow-ups (1M context, follow-up-message fix, collapsed thinking/tool chips) + CLAUDE_SDK_BACKEND=1

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 22:30:47 +00:00
c56d169ef9 feat: shared PaneHeaderActions + chat-resolve WorkspaceState fix (v2.7.7)
In-flight workspace UX work.

- Extract a shared PaneHeaderActions cluster (+/Split/Reopen/History/Close)
  used by ChatTabBar + the Workspace coder/terminal pane headers, replacing the
  divergent per-header copies; SessionLandingPage history + useWorkspacePanes
  tweaks.
- Fix coder-side correctness bug: resolveChatId read sessions.workspace_panes as
  a bare WorkspacePane[] but v2.6.5 widened it to a WorkspaceState envelope, so
  it mis-read panes and clobbered tabNumbers/nextTabNumber/closedPaneStack on
  every pane-chat write. New normalizeWorkspaceState handles either shape and
  preserves the envelope (+ regression test).
- CLAUDE.md doc-sync (coder vitest suite, deploy-by-surface, dual-remote push,
  in-flight-web-WIP staging, release-branch naming).

Web tsc + coder build + coder tests green. Builds on v2.7.6.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:28:49 +00:00
59cf082e06 feat: normalized external-agent status (#10 scoped) (v2.7.6)
Scoped half of boocode_code_review_v2 §1 #10 — publish the agent status
BooCoder already observes (the config-injection notify-hook is the documented
follow-on, clean-room from superset ELv2).

- agent_status_updated WS frame (working|blocked|idle|error), server+web parity.
- Published from the dispatcher's turn boundaries (warm-acp/opencode/sdk/pty:
  working at start, idle/error at end) + the permission flow (blocked/working).
  Best-effort, never breaks a turn.
- Clean-room normalizeAgentEvent helper (superset's vendor-event -> Start/blocked
  /Stop collapse, event names as facts) + 25 tests — reused by the follow-on.
- AgentComposerBar status dot (distinct from the WS-liveness dot), tracked per
  (chat,agent) by a useAgentStatus map in CoderPane.

Built by 2 parallel agents vs a pinned frame contract. Server 545 + coder 294
tests passing (25 new); web tsc + builds clean; ws-frames parity green. Clears
the actionable review backlog (#1/#3/#4/#6-#12). Builds on v2.7.5.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 14:04:04 +00:00
f3a0197d6a feat: Claude Agent SDK backend + clean-room PostgresSessionStore (v2.7.5)
Lands the lean-SDK direction (boocode_code_review_v2 §1 #9) behind a flag.
Adds @anthropic-ai/claude-agent-sdk@0.3.159 (Commercial Terms, runtime dep).

- PostgresSessionStore: clean-room impl of the SDK's real SessionStore type
  over a new claude_session_entries table. Typechecks against the SDK type;
  8 DB-integration tests.
- ClaudeSdkBackend (implements AgentBackend): one warm query() per (chat,claude)
  in streaming-input mode via a pushable async-iterable pump, sessionStore +
  resume continuity, pure mapSdkMessage->AgentEvent, session_id from init,
  usage/cost onto agent_sessions (backend CHECK gains 'claude_sdk').
- Routing env-gated by CLAUDE_SDK_BACKEND (default off) -> PTY path UNCHANGED.
- Built against real SDK 0.3.159 types (install paid off: partial=stream_event
  needing includePartialMessages, MessageParam, result error arm).
- Fix latent test-infra deadlock: serialize DB suites (fileParallelism:false).

Coder 269 passing default / 290 with DB; tsc clean vs SDK types; builds clean.
LIVE pump + resume + actual claude turn need a host smoke (CLAUDE_SDK_BACKEND=1
+ claude binary + auth). zod peer-dep wants ^4 (workspace 3.25). Builds on v2.7.4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 13:37:57 +00:00
a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00
9c7d80e2d8 fix(security): scope checkpoint routes to session — close 2 IDORs (v2.7.2)
Flagged by the automated push security review on v2.7.1.

- GET /checkpoints?chat_id= : the chat_id branch filtered by chat_id alone
  (any session's chat_id read its checkpoints). Now joins chats and gates on
  chats.session_id.
- restoreCheckpoint scope guard was fail-open: `cp.session_id && cp.session_id
  !== sessionId` fell through on a null denormalized session_id, allowing a
  cross-session restore (worktree reset + transcript trim). Now resolves the
  owning session via the checkpoint's chat and denies on missing/mismatch.
- Adds a DB-integration regression for the null-session_id cross-session case.

Both scope authoritatively through chats.session_id (checkpoints.session_id is
a nullable hint). Coder suite 234 passing; 7/7 checkpoint tests (incl. the
regression) against live postgres+git; typecheck clean. Hotfix on v2.7.1.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:15:54 +00:00
59f07e8cb8 feat: write/edit robustness — fuzzy patch applier + worktree checkpoints (v2.7.1)
#3 Fuzzy patch applier: new pure fuzzy-match.ts (locateMatch, exact→trim→
unicode-canon→Levenshtein≥0.66, refuse-on-ambiguous) wired into pending_changes
applyOne/rewindOne so local-model whitespace/unicode drift in old_string no
longer loses the edit.

#4 Worktree checkpoint + conversation-trim: checkpoints table + checkpoints.ts
(shadow-commit of tracked+untracked into refs/boocode/checkpoints, hooked into
the 3 external-agent dispatcher paths) + POST restore route (reset --hard +
clean -fd -> transcript trim -> backend-session reset) + "Restore to here" UI.

Built by 3 parallel agents; DB-integration testing caught a created_at
self-deletion bug. Coder suite 234 passing; server+coder build + web tsc clean.
Builds on v2.7.0-mit. openspec write-edit-robustness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:01:57 +00:00
a8bfde8f8d feat: relicense AGPL-3.0 → MIT (v2.7.0)
Clear the 3 Unsloth-Studio-derived AGPL files and flip LICENSE + 5
package.json from AGPL-3.0-only to MIT.

- html-to-md.ts → MIT node-html-markdown (parse5 dropped)
- llama-args-validator.ts → clean-room (flag denylist = facts)
- tool-call-parser.ts → delete dead Unsloth-ported code; keep
  extractToolCallBlocks/stripToolMarkup byte-identical (no behavior change)
- LICENSE → MIT (Copyright (c) 2026 indifferentketchup); 5 package.json → MIT;
  AGPL SPDX headers removed; README License section; license-mit guard test
- roadmap License-debt batch marked shipped; openspec/changes/license-debt-mit

Decouples the relicense from the native-parsing retirement (the ported parser
was dead code). Server suite 519 passing; build + coder typecheck clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 08:16:03 +00:00
aa3797e356 feat(coder): v2.6 Phase 3 — lifecycle hardening (idle evict, crash recovery, worktree reaper)
Idle TTL eviction per (chat,agent) + LRU cap (never a busy backend); pure lifecycle-decisions.ts (TDD). Crash recovery lifts openchamber's health-monitor + busy-aware-restart + stale-grace state machine into opencode-server.ts (+ port reclaim) and warm-acp.ts; opencode crash -> fresh sessions, ACP -> re-session/new. F.1 turn-guard + U.6 usage preserved (their tests pass). Orphan worktree reaper (1h grace, superset-style dirty/unpushed preflight, Paseo soft-delete) + close hooks + diff re-baseline after apply_pending. 35 new tests + DB-opt-in reconnect test; 215 coder tests pass; tsc + build clean. Completes v2.6. Follow-ups out of scope: apps/server close-hook caller, 3.7 DiffPanel staging hint, live smokes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 01:10:09 +00:00
0d3d08f5f2 feat(coder): v2.6 Phase 2 — warm ACP backend for goose/qwen
WarmAcpBackend (AgentBackend) holds one persistent goose acp / qwen --acp child + ClientSideConnection + ACP session per (chat,agent); initialize+session/new once, reused across turns. Abort = session/cancel the prompt only (never kills the child); child exit -> agent_sessions.status='crashed' -> re-spawn next turn. Dispatcher routes goose/qwen chat-tab tasks to the pooled warm backend via pure shouldUseWarmBackend (needs session_id+chat_id); one-shot runExternalAgent kept as fallback for arena/MCP/new_task. handleSessionUpdate extracted to a shared pure acp-event-map.ts (one-shot path byte-identical). SDK: installed @agentclientprotocol/sdk@^0.22.1 has stable resumeSession/loadSession; resume moot in the warm hot path, deferred to Phase 3. 15 new tests (warm-acp-routing, acp-event-map); 180 coder tests pass; tsc + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:57:03 +00:00
c060778258 feat(coder): Phase 1-UX backend — agent attribution + agent-sessions route + opencode usage
pending_changes.agent stamped at every queue site (native -> 'boocode', dispatched external -> task.agent, manual RightRail -> NULL) + flows through listPending. New GET /api/sessions/:id/agent-sessions -> [{agent,status,has_session,last_active_at}] per (chat,agent). opencode warm server consumes session.next.step.ended, accumulating input_tokens/output_tokens/cost onto agent_sessions (new idempotent columns) via a pure opencode-usage.ts mapper. Tests: agent-sessions.routes (3) + opencode-usage (6); tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 22:07:14 +00:00
372651bcb1 fix(coder): F.1 post-interrupt stale-terminal guard (opencode warm server)
opencode emits one trailing session.idle/error for a turn cancelled via client.session.abort(), carrying only a sessionID (no turn id). The warm-server backend settled activeTurn on that event, so after Stop + an immediate new message the orphan idle settled the NEXT turn early as success (one-click reachable since v2.6.5's Send->Stop composer).

Adds a pure per-session guard (backends/turn-guard.ts: armAbortGuard / noteTurnActivity / consumeTerminal over swallowNextTerminal) wired into opencode-server.ts: abort arms it, the next terminal is swallowed once, and a new turn's first delta self-heals so a never-arriving orphan can't strand a real turn. Test-first; 3 regression tests in turn-guard.test.ts. Paseo parallel: 1d38aac.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:31:35 +00:00
7f6c4780e2 fix(coder): converge agent_sessions.session_id FK to SET NULL (P1.5-b follow-up)
The P1.5-b re-key block (cb1846c) re-adds session_id_fkey as ON DELETE
SET NULL, but the whole block is guarded on chat_id_fkey's absence. A DB
already re-keyed to (chat_id, agent) while session_id_fkey was still
ON DELETE CASCADE never re-enters that block, so applySchema leaves it at
'c' forever — diverging from the schema's stated intent, from worktree_id
(already SET NULL), and from the v2.6.3 changelog's own claim that
session_id is informational SET NULL.

Add a standalone confdeltype-guarded block (mirroring the session_worktrees
defang) that flips session_id_fkey CASCADE -> SET NULL independently of the
re-key gate. Idempotent: fires only while the FK is still 'c' — a no-op on a
fresh deploy (already 'n' from the re-key block) and on every re-run. The
live DB was converged by hand with the identical statements; \d
agent_sessions now shows session_id ... ON DELETE SET NULL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:46:41 +00:00
cb1846c0d5 feat(coder): re-key agent_sessions to (chat_id, agent) + worktrees table (P1.5-b)
The tab (a chat) is the context unit: two opencode tabs in one session are two independent agent contexts sharing one worktree. agent_sessions re-keys from (session_id, agent) to (chat_id, agent) — chat_id FK ON DELETE CASCADE (closing a tab ends its context); worktree_id and session_id become informational SET NULL columns. New worktrees table (one-per-session, survives session delete via session_id SET NULL) supersedes session_worktrees, which is defanged (CASCADE dropped) not yet removed. chat_id is threaded end-to-end: tasks.chat_id added, written by the coder message + skills routes from the frontend tab, read by runOpenCodeServerTask which falls back to resolve-or-create a chat for session-less creators (arena/MCP/new_task/generic) so ensureSession never gets a null key. Idempotent migration with a backfill-verify gate (0-row assertion after the test session was deleted). config_hash fingerprint logic preserved; one-worktree-per-session unchanged; runExternalAgent untouched. Column rename worktree_path -> path repointed at all five readers (server delete-guard, risk/stash endpoints, ensureSessionWorktree). Supersedes the earlier (worktree_id) draft.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 00:04:35 +00:00
f1a85627e4 fix(coder): strip dcp-message-id tags split across stream chunks
The dcp tag (<dcp-message-id>mNNNN</dcp-message-id>) is streamed token-by-token, so it arrives split across SSE deltas. The existing per-chunk stripDcpTags never sees a complete tag in any single fragment, so fragments pass through and the dispatcher reassembles the tag in textChunks (persisted + shown) — and the terminal message.part.updated path that would strip the full text is suppressed by the dedup gate. Add a stateful cross-chunk stripper (dcp-strip.ts: makeDcpStreamStripper) at the dispatcher's opencode frame boundary: it emits text that cannot be part of a forming tag, holds back only a trailing partial-tag prefix (without swallowing legitimate <…> content), and flushes at turn end. Fixes both live delta frames and persisted content. 11 unit tests incl. split-at-every-boundary and the documented per-chunk-fails case. opencode path only; ACP (goose/qwen/claude) untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 23:16:47 +00:00