Compare commits


24 Commits

Author SHA1 Message Date
c935687725 chore(openspec): drop 9 superseded proposals + 11 stub archive files
Drop 9 batch proposals that are superseded by the boocode-lift-analysis
(boocontext-audit, conductor upgrades, self-healing/verify-gate skills):
add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform,
conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul,
agent-reliability.

Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only)
that provide zero documentation value over the existing CHANGELOG.md + git tags.
2026-06-07 22:15:38 +00:00
0d6e9a2413 feat(coder): complete orchestrator advanced patterns
- Approval gate steps pause and await human resolution
- appendStepEvent wired into markStep, failRun, dispatchAgentStep
- Trigger rule unit tests (6 variants)
- New parallel-research flow with one_success trigger
2026-06-07 21:55:47 +00:00
6344105877 feat(server): memory v2 tests and search_memory tool 2026-06-07 21:55:47 +00:00
028c08b4cd docs: add openspec proposals for memory v2 and orchestrator flow patterns 2026-06-07 21:34:35 +00:00
fb52eb3efa feat(coder): orchestrator advanced flow patterns
- TriggerRule type (all_success/one_success/all_done) for parallel deps
- Variable substitution ($stepId.output.field) in agent step prompts
- Approval gate step kind (pauses flow via permission frames)
- flow_step_events table for append-only event-sourced step log
- evaluateTriggerRule pure function in flow-runner-decisions
2026-06-07 21:34:30 +00:00
648a59a563 feat(server): memory v2 — BM25 + local embedding hybrid search
- Bm25Ranker: Okapi BM25 scoring (pure TS, no deps)
- Embedding module: ONNX-based local embeddings via onnxruntime-node
- Hybrid recall: BM25 (30%) + cosine similarity (70%) weighted merge
- Falls back to keyword-only via MEMORY_SEARCH=keyword env var
- extract_memory agent tool for persisting memory entries
2026-06-07 21:34:25 +00:00
7f59f30f2d docs: update code review doc with v2.8 fork-lifts lift sources
- Added 10 new lift source entries (boocontext, TSA, type-inject,
  morph-fast-apply, tokenscope, DCP, qwen-code memory/LSP,
  oh-my-openagent, paseo protocol) under v2.8 fork-lifts section
- Added 9 new rows to the lift catalog table
- Added decisions log entry for v2.8.0-fork-lifts batch
- Bumped last-updated to 2026-06-07
2026-06-07 18:44:12 +00:00
f436021bf9 feat: deferred items — arena token API + UI, ToolShim docs
- Arena API: token_breakdown selected in contestant query
- ArenaPane: token category breakdown bar (s/u/a/t/r) in expanded contestant view
- apps/server/CLAUDE.md: document tool-shim and loop-detectors
2026-06-07 18:41:26 +00:00
bef6bef504 docs: update changelog, roadmap, current focus, and coder CLAUDE.md
- CHANGELOG: v2.8.0-fork-lifts entry covering all 8 integrations
- Roadmap: update shipped header through v2.8.0, bump last-updated date
- CURRENT.md: reflect fork-lifts as last-shipped batch
- apps/coder/CLAUDE.md: document edit-guards behavior and API
2026-06-07 18:05:55 +00:00
87923cb07b feat(coder): add flow-artifacts write helper and boocontext MCP template 2026-06-07 18:05:49 +00:00
c6ecd984c5 feat(coder): add TokenScope analyzer and DB persistence module
- analyzeMessages classifies message parts into system/user/assistant/tools/reasoning
- persistTaskBreakdown writes JSONB to tasks table
- Backfills the token-analysis/ module (contract committed earlier)
- 6 unit tests covering classification, tool calls, reasoning tokens
2026-06-07 18:05:35 +00:00
2a83f61070 feat(coder): add import-drop detection to edit safety guards
- checkDroppedImports detects removed import/require lines in edits
- Runs alongside truncation guard in pending_changes.ts
- Supports ESM imports, CJS require, type imports, side-effect imports
2026-06-07 18:05:30 +00:00
44874f0097 feat: fork lifts phases 3-9 — LSP, DCP, memory, boocontext, protocol, plugins, reliability 2026-06-07 17:58:30 +00:00
1b70d41996 feat(server): add inference reliability - tool-shim and loop detectors
- ToolShim recovers XML/JSON tool calls from plain-text model output
- detectContentRepeat catches same-content loops
- detectToolLoop catches repeated tool invocations
- detectDoomLoop combines both detectors
2026-06-07 17:57:58 +00:00
b64941ad4b feat(coder): add plugin hook host
- Typed hook registry with registerHook/emitHook/clearHooks
- Hooks: tool.execute.before/after, turn.start/end, task.terminal
- SUL patterns only (oh-my-openagent: architecture study, no code copy)
2026-06-07 17:57:53 +00:00
cdc782e044 feat(core): add subagent protocol enhancements
- AgentCapabilitiesSchema with supportsStreaming/Reasoning/Background flags
- supportsStreaming and supportsReasoningStream fields in ProviderSnapshotEntry
- new_task tool: background mode flag for non-blocking subtask dispatch
2026-06-07 17:57:49 +00:00
02bb355a09 feat(server): add institutional memory recall
- File-based memory under .boocode/memory/ (project/user/reference topics)
- Hierarchical 4-scope scan: global → home → project → session
- Keyword/tag relevance matching for query-based recall
- Injected as <boocode-memory> block in system prompt at assembly
- v1 recall-only (extract/dream deferred to v2)
2026-06-07 17:57:44 +00:00
b8b2666fdc feat(server): add DCP clean-room context pruning
- Deduplication: removes consecutive identical tool_call+tool_result pairs
- Purge-errors: removes failed/empty tool results
- Transform orchestrator runs strategies in sequence pre-payload
- Wired into turn.ts before buildMessagesPayload
- Clean-room reimplementation (AGPL reference: behavior only)
2026-06-07 17:57:39 +00:00
ee749d8698 feat(coder): add LSP code intelligence tools
- lsp/ module: types, config, JSON-RPC client, server-manager, operations
- lsp_diagnostics: TypeScript/JavaScript diagnostics for a file
- lsp_goto_definition: find symbol definition at position
- lsp_find_references: find all references to a symbol
- Registered as READ_TOOLS in tool index
2026-06-07 17:57:35 +00:00
bc83475a3d feat(server): add boocontext deep analysis tools and synthesis pipeline
- get_symbol_details: type signature, definition location, usage count
- get_call_graph: callers, callees, transitive references
- get_blast_radius added to SYNTHESIS_TOOLS
2026-06-07 17:57:29 +00:00
214cc32ac2 feat(codecontext): upgrade sidecar to boocontext MCP aggregator
- Multi-stage Dockerfile builds boocontext (Node) + HTTP shim (Go)
- shim.go supports CODECONTEXT_CHILD env var for configurable MCP child
- Adds routes for get_symbol_details, get_call_graph, get_blast_radius
- docker-compose.yml adds env vars for child MCP paths
2026-06-07 17:57:24 +00:00
6b7c2bab1e feat(coder): persist token breakdown in arena decisions and schema 2026-06-07 17:57:19 +00:00
373ba86e5d feat(coder): add edit safety guards against truncation 2026-06-07 17:57:15 +00:00
9106334e70 feat(contracts): add TokenBreakdownSchema and ContestantShape.token_breakdown 2026-06-07 17:57:11 +00:00
201 changed files with 7524 additions and 101 deletions
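The memory v2 commit in the list above describes a weighted merge of BM25 (30%) and cosine similarity (70%). The sketch below illustrates that scoring idea only; it is not BooCode's code. The `Doc` shape, the function names, and the choice to scale BM25 by the top score before mixing are all assumptions made for illustration.

```typescript
// Illustrative only: every name here is hypothetical, not taken from the BooCode source.
interface Doc {
  id: string;
  tokens: string[];
  embedding: number[];
}

// Okapi BM25 with common k1/b defaults and a non-negative IDF variant.
function bm25Scores(query: string[], docs: Doc[], k1 = 1.2, b = 0.75): number[] {
  const n = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.tokens.length, 0) / Math.max(n, 1);
  const terms = Array.from(new Set(query));
  return docs.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docs.filter((d) => d.tokens.some((t) => t === term)).length;
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      const lengthNorm = 1 - b + (b * doc.tokens.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Assumption: BM25 is unbounded, so it is divided by the best score to land in
// [0, 1] before the 30/70 mix. The real module may normalize differently.
function hybridRank(
  query: string[],
  queryEmbedding: number[],
  docs: Doc[],
): { id: string; score: number }[] {
  const bm25 = bm25Scores(query, docs);
  const maxBm25 = Math.max(...bm25, 0);
  return docs
    .map((doc, i) => ({
      id: doc.id,
      score:
        0.3 * (maxBm25 > 0 ? bm25[i] / maxBm25 : 0) +
        0.7 * cosine(queryEmbedding, doc.embedding),
    }))
    .sort((x, y) => y.score - x.score);
}
```

Without some normalization step a long keyword match could swamp the embedding term, which is why the sketch rescales BM25 first; whether the shipped code does the same is not visible in this diff.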


@@ -2,7 +2,29 @@
All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch.
-## v2.7.20-arena-pane — 2026-06-06
+## v2.8.0-fork-lifts — 2026-06-07
Completes the eight fork-lift integrations from `/opt/forks` into BooCode: boocontext sidecar upgrade, LSP code intelligence, DCP clean-room pruning, institutional memory, subagent protocol enhancements, plugin hook host, inference reliability (tool-shim + loop detectors), and TokenScope token breakdown. Backfills edit safety guards (truncation + dropped imports) and the TokenScope analyzer/persist module. Closes the fork-lifts-mit epic.
**boocontext sidecar (Phase 3):** Upgrades the `codecontext` container from the old Go MCP server to the boocontext Node.js MCP aggregator. Multi-stage Dockerfile builds boocontext from `/opt/forks/boocontext` alongside the HTTP shim. `shim.go` gains `CODECONTEXT_CHILD` env-var support and three new HTTP routes for symbols, callgraph, and blast radius. Three TypeScript tool wrappers (`get_symbol_details`, `get_call_graph`, `get_blast_radius`) registered on the server, with blast radius added to the synthesis pipeline. Docker-compose env vars configure child MCP paths (tree-sitter-analyzer, type-inject).
**LSP integration (Phase 4):** Six-file `lsp/` module in the coder with config, JSON-RPC stdio client, lazy server-manager (per-project pool, 5-min idle shutdown), and operations (diagnostics, goto-definition, find-references). Three read-only agent tools registered — `lsp_diagnostics`, `lsp_goto_definition`, `lsp_find_references`. TypeScript/JavaScript only in v1.
**DCP clean-room (Phase 5):** Seven-file `dcp/` module in the server inference pipeline. Consecutive identical tool_call+tool_result pairs are deduplicated; failed/empty tool results are purged via configurable window. Orchestrated by `transformMessages()` running before `buildMessagesPayload` in `turn.ts`. Clean-room reimplementation — AGPL source was referenced for behavior only. 10 unit tests.
**Institutional memory (Phase 6):** Eight-file `memory/` module with file-based recall. Hierarchical 4-scope scan (global → home → project → session) under `.boocode/memory/`. Keyword/tag relevance matching at prompt assembly. Injected as a `<boocode-memory>` block in the system prompt. v1 recall-only — extract/dream deferred.
**Subagent protocol (Phase 7):** `AgentCapabilitiesSchema` in contracts with `supportsStreaming`, `supportsReasoningStream`, `supportsBackgroundExecution` flags. `ProviderSnapshotEntry` gains the two streaming capability fields. `new_task` tool gets a `background` mode flag for non-blocking dispatch. Flow-runner already supported per-step model override.
**Plugin host (Phase 8):** Typed hook registry in `plugins/host.ts` with `registerHook`/`emitHook` for five lifecycle events: `tool.execute.before`, `tool.execute.after`, `turn.start`, `turn.end`, `task.terminal`. Patterns-only from oh-my-openagent (SUL — no code copy).
**Inference reliability (Phase 9):** `tool-shim.ts` recovers XML/JSON tool calls from plain-text model output (e.g. Qwen inline format). `loop-detectors.ts` catches content-repeat and tool-loop patterns. Existing doom-loop detection remains — detectors are additive.
**Edit safety guards (Wave 1):** `edit-guards.ts` rejects catastrophic truncation (>60% chars AND >50% lines). `edit-guards-imports.ts` detects dropped import statements. Both run in `pending_changes.ts` immediately before `writeFileAtomic`.
**TokenScope (Wave 2):** `TokenBreakdownSchema` in contracts with system/user/assistant/tools/reasoning categories. `token-analysis/` module with analyzer and DB persistence. `ContestantShape.token_breakdown` field and `token_breakdown` JSONB column on `contestants`/`tasks` tables. Arena `computeBenchmark` accepts and returns token breakdown.
**Build:** Server 649 ✅ Coder 471 ✅ Contracts ✅ — all green.
Adds the **Arena** pane for running the same prompt against 26 AI competitors simultaneously and picking the best result. A Battle is one Arena run: pick a battle type (Coding — backend+model with git worktrees producing diffs; or Q&A — BooChat persona+model producing text), write or generate a prompt, add contestants, and hit Start. Contestants are scheduled in two concurrent lanes — the local lane (llama-swap models, serial) and the cloud lane (Claude Code, OpenCode-on-cloud, parallel). The lane scheduler captures wall-clock duration for every contestant and tokens/sec for local models. When all contestants finish, a two-stage analysis (digest then judge) auto-runs on the DEFAULT_MODEL, writing `analysis.md` naming a winner; the user can override the winner per-row or trigger cross-examination. Results land in `/<project-root>/Arena/<dated-battle>/` with per-contestant `result.md`, diff patches for coding, and `manifest.json`. Replaces the old API-only `POST /api/arena` with dedicated `battles`/`contestants`/`cross_examinations` tables and full UI. Also adds a `DiffView` component with line-by-line colored unified diff and a per-row dropdown for winner override. Built on `v2.7.18-permission-modes`; pairs conceptually with the earlier `v2.7.17-orchestrator` multi-agent work (both share the pane kind pattern and `onTaskTerminal` hook).
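The DCP entry in this changelog hunk names two pruning strategies: dropping consecutive identical tool_call+tool_result pairs, and purging failed or empty tool results through a configurable window. A rough sketch of how two such passes could compose is below. The `Item` shape, the function names, and the reading of "window" as "keep the most recent N tool exchanges untouched" are assumptions; the actual `dcp/` module is not shown in this diff.

```typescript
// Hypothetical message model: one entry per text message or per tool call/result pair.
interface ToolExchange {
  kind: 'tool';
  name: string;
  args: string;
  result: string;
  failed: boolean;
}
interface TextMessage {
  kind: 'text';
  role: 'system' | 'user' | 'assistant';
  content: string;
}
type Item = ToolExchange | TextMessage;

// Pass 1: drop a tool exchange that exactly repeats the one kept just before it.
function dedupeConsecutive(items: Item[]): Item[] {
  const out: Item[] = [];
  for (const item of items) {
    const prev = out[out.length - 1];
    const isRepeat =
      item.kind === 'tool' &&
      prev !== undefined &&
      prev.kind === 'tool' &&
      prev.name === item.name &&
      prev.args === item.args &&
      prev.result === item.result;
    if (!isRepeat) out.push(item);
  }
  return out;
}

// Pass 2: drop failed or empty tool results, except within the newest `keepRecent`
// tool exchanges (assumed meaning of the "configurable window").
function purgeErrors(items: Item[], keepRecent: number): Item[] {
  const toolIndexes: number[] = [];
  items.forEach((item, i) => {
    if (item.kind === 'tool') toolIndexes.push(i);
  });
  const protectedIndexes = new Set(keepRecent > 0 ? toolIndexes.slice(-keepRecent) : []);
  return items.filter(
    (item, i) =>
      item.kind !== 'tool' ||
      protectedIndexes.has(i) ||
      !(item.failed || item.result.trim() === ''),
  );
}

// Strategies run in sequence, mirroring the "transform orchestrator" idea.
function pruneContext(items: Item[], keepRecent = 1): Item[] {
  return purgeErrors(dedupeConsecutive(items), keepRecent);
}
```

Running deduplication first means a repeated failing call collapses to one entry before the purge decides whether that entry is still inside the protected window.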


@@ -1,9 +1,9 @@
# Current focus
-Last updated: 2026-06-05
+Last updated: 2026-06-07
-- **Last shipped:** `v2.7.18-permission-modes` (2026-06-05) — unified Plan/Ask/Bypass permission picker in the BooCoder composer (incl. native-BooCode auto-apply on Bypass).
+- **Last shipped:** `v2.8.0-fork-lifts` (2026-06-07) — eight fork-lift integrations from `/opt/forks`: boocontext sidecar, LSP code intelligence, DCP clean-room pruning, institutional memory, subagent protocol, plugin hook host, inference reliability (tool-shim + loop detectors), and TokenScope token breakdown. Backfills edit safety guards and TokenScope analyzer/persist module.
- **Branch:** `main`
-- **In progress:** nothing committed — dogfooding the Orchestrator to surface the next real backlog. Claude Agent-SDK backend enabled (`CLAUDE_SDK_BACKEND`). Optional/exploratory: verify-gate ensembler over pending changes.
+- **In progress:** nothing committed — all phases 3-9 of fork-lifts-mit epic are shipped. Optional/exploratory: verify-gate ensembler over pending changes; web Arena token UI display.
See `CHANGELOG.md` for the full shipped history. That file is always authoritative; this file is a quick orientation pointer only.


@@ -37,3 +37,10 @@
- **In-app multi-agent conductor**: `services/flow-runner.ts` runs a flow by inserting each step as a `tasks` row (the existing dispatcher runs it) and advancing on a new `onTaskTerminal` dispatcher-deps hook; persisted in `flow_runs`/`flow_steps` (resumed at startup via `initResume`). The 22 conductor flow defs + Spine factory are re-homed under `src/conductor/`. Pure scheduler/resume helpers in `flow-runner-decisions.ts`. Full design: `openspec/changes/archived/orchestrator/`.
- **Read-only is load-bearing — don't add a dispatch path that bypasses it.** Every step dispatches `agent='qwen', mode_id='plan'`; `dispatcher.ts` force-routes qwen+plan to the PTY `--approval-mode plan` gate and HARD-FAILS the task (never falls to write-capable native inference) when qwen is unavailable (`shouldFailOnMissingAgent`). `BOOCODE_TOOLS` gates BooChat's NATIVE inference tools only — it does NOT govern an external CLI agent (qwen/opencode bring their own write tools); read-only for a dispatched agent is the agent-layer mode (PTY `--approval-mode plan`; ACP `setSessionMode` is fail-OPEN by default, fail-CLOSED for `plan` via `READ_ONLY_MODE_IDS` in `acp-dispatch.ts`).
## Edit safety guards (v2.8)
- **`services/edit-guards.ts`** — `validateEditResult(original, updated, filePath)` runs in `pending_changes.ts` immediately before `writeFileAtomic`. Rejects catastrophic truncation (>60% char loss AND >50% line loss). Throws a `formatGuardError` message that percolates to the agent as a visible error.
- **`services/edit-guards-imports.ts`** — `checkDroppedImports(original, updated, filePath)` detects removed import/require lines. Called alongside the truncation guard.
- Both guards run on the `/apply` path only (not on queue). Re-queued identical edits re-validate at apply time.
- Guard functions are pure — no DB or filesystem access. Easy to unit-test.


@@ -24,6 +24,7 @@ import {
} from './planning.js';
import { adr, codingStandard, runbook, tdd, stakeholderSummary } from './authoring.js';
import { codeReview } from './code-review.js';
import { parallelResearch } from './parallel-research.js';
const spines: Spine[] = [
// analysis / research
@@ -53,7 +54,7 @@ const spines: Spine[] = [
stakeholderSummary,
];
-const bespoke: Flow[] = [codeReview];
+const bespoke: Flow[] = [codeReview, parallelResearch];
const ALL: Flow[] = [...spines.map(buildSpineFlow), ...bespoke];


@@ -0,0 +1,59 @@
import type { Flow, Step, StepContext } from '../types.js';
const q = (ctx: StepContext) => String(ctx.input.question);
/**
* Parallel research flow — dispatches 3 research agents simultaneously,
* then synthesizes the result on the first one to complete.
*/
export const parallelResearch: Flow = {
name: 'parallel-research',
description: 'Research from 3 angles in parallel, synthesize results on first completion',
steps: [
{
id: 'angle-web',
kind: 'agent',
agent: 'research-analyst',
run: (ctx) =>
`Research the following question from a web / prior-art perspective:\n\n${q(ctx)}`,
},
{
id: 'angle-code',
kind: 'agent',
agent: 'codebase-explorer',
deps: [],
run: (ctx) =>
`Research the following question from a codebase analysis perspective:\n\n${q(ctx)}`,
},
{
id: 'angle-security',
kind: 'agent',
agent: 'adversarial-security-analyst',
deps: [],
run: (ctx) =>
`Research the following question from a security perspective:\n\n${q(ctx)}`,
},
{
id: 'synthesize',
kind: 'code',
deps: ['angle-web', 'angle-code', 'angle-security'],
trigger_rule: 'one_success',
run: (ctx) => {
const web = ctx.results['angle-web'];
const code = ctx.results['angle-code'];
const security = ctx.results['angle-security'];
const parts = [
'# Parallel Research Synthesis',
'',
web ? `## Web Angle\n${web}` : '## Web Angle\n*(not yet completed)*',
code ? `## Code Angle\n${code}` : '## Code Angle\n*(not yet completed)*',
security ? `## Security Angle\n${security}` : '## Security Angle\n*(not yet completed)*',
];
return parts.join('\n\n');
},
},
],
render: (ctx) => {
return ctx.results['synthesize'] ?? 'No synthesis produced.';
},
};


@@ -38,7 +38,9 @@ export interface StepContext {
readonly model?: string;
}
-export type StepKind = 'agent' | 'code';
+export type StepKind = 'agent' | 'code' | 'approval';
export type TriggerRule = 'all_success' | 'one_success' | 'all_done';
export interface Step {
/** unique id within the flow; other steps depend on it by this id */
@@ -46,6 +48,8 @@ export interface Step {
kind: StepKind;
/** ids that must complete (or skip) before this step runs */
deps?: string[];
/** how dependency satisfaction is evaluated (default: all_success) */
trigger_rule?: TriggerRule;
/** for kind:'agent' — the persona file name under conductor/agents (no .md) */
agent?: string;
/**


@@ -0,0 +1,42 @@
export type HookName =
| 'tool.execute.before'
| 'tool.execute.after'
| 'turn.start'
| 'turn.end'
| 'task.terminal';
export interface ToolHookContext {
tool: string;
args: Record<string, unknown>;
projectRoot: string;
sessionId: string;
}
export interface ToolResultContext extends ToolHookContext {
result: unknown;
}
export type PluginHook = (ctx: any) => Promise<any>;
const hooks = new Map<HookName, PluginHook[]>();
export function registerHook(name: HookName, fn: PluginHook): void {
const list = hooks.get(name) || [];
list.push(fn);
hooks.set(name, list);
}
export async function emitHook(name: HookName, ctx: any): Promise<any> {
const list = hooks.get(name);
if (!list) return ctx;
let current = ctx;
for (const fn of list) {
const result = await fn(current);
if (result !== undefined) current = result;
}
return current;
}
export function clearHooks(): void {
hooks.clear();
}
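A usage sketch for the hook registry above, assuming two plugins that listen on `tool.execute.before`. The registry is re-declared in compact form so the snippet runs on its own; the redaction hook, the observer hook, and the `demo` helper are hypothetical and only illustrate how a returned value replaces the context while `undefined` passes it through unchanged.

```typescript
type HookName =
  | 'tool.execute.before'
  | 'tool.execute.after'
  | 'turn.start'
  | 'turn.end'
  | 'task.terminal';
type PluginHook = (ctx: any) => Promise<any>;

const hooks = new Map<HookName, PluginHook[]>();

function registerHook(name: HookName, fn: PluginHook): void {
  const list = hooks.get(name) || [];
  list.push(fn);
  hooks.set(name, list);
}

// Hooks run in registration order; each one may return a replacement context.
async function emitHook(name: HookName, ctx: any): Promise<any> {
  let current = ctx;
  for (const fn of hooks.get(name) || []) {
    const result = await fn(current);
    if (result !== undefined) current = result;
  }
  return current;
}

// Hypothetical plugin 1: rewrites the context by masking a sensitive argument.
registerHook('tool.execute.before', async (ctx) => ({
  ...ctx,
  args: { ...ctx.args, token: '[redacted]' },
}));

// Hypothetical plugin 2: observes only. It returns undefined, so the context
// produced by plugin 1 is what the caller receives.
const seen: string[] = [];
registerHook('tool.execute.before', async (ctx) => {
  seen.push(ctx.tool);
});

async function demo(): Promise<any> {
  return emitHook('tool.execute.before', {
    tool: 'fetch',
    args: { url: 'https://example.com', token: 's3cret' },
    projectRoot: '/tmp/project',
    sessionId: 'session-1',
  });
}
```

Calling `demo()` resolves to a context whose `args.token` is `'[redacted]'` while `args.url` is untouched. Because hooks are chained, registration order matters: an observer registered before the redaction hook would see the original token.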


@@ -205,7 +205,7 @@ export function registerArenaRoutes(
const contestants = await sql`
SELECT id, battle_id, identity, model, lane, task_id, worktree_id,
-status, duration_ms, tokens_per_sec, cost_tokens, result_path, error,
+status, duration_ms, tokens_per_sec, cost_tokens, token_breakdown, result_path, error,
created_at, updated_at
FROM contestants
WHERE battle_id = ${id}


@@ -423,3 +423,18 @@ CREATE INDEX IF NOT EXISTS contestants_task_id_idx ON contestants(task_id);
-- Cross-examination listing per battle.
CREATE INDEX IF NOT EXISTS cross_examinations_battle_idx ON cross_examinations(battle_id);
-- TokenScope: per-category token breakdown on arena contestants and tasks.
ALTER TABLE contestants ADD COLUMN IF NOT EXISTS token_breakdown JSONB;
ALTER TABLE tasks ADD COLUMN IF NOT EXISTS token_breakdown JSONB;
-- Orchestrator flow step events (append-only event log for resume/replay).
CREATE TABLE IF NOT EXISTS flow_step_events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
run_id UUID NOT NULL REFERENCES flow_runs(id),
step_id VARCHAR(64) NOT NULL,
event VARCHAR(32) NOT NULL,
payload JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);
CREATE INDEX IF NOT EXISTS flow_step_events_run_idx ON flow_step_events(run_id);


@@ -162,6 +162,24 @@ describe('computeBenchmark', () => {
expect(bench.durationMs).toBe(0);
expect(bench.tokensPerSec).toBeNull();
});
it('includes token breakdown when provided', () => {
const breakdown = {
system: 10,
user: 20,
assistant: 30,
tools: 40,
reasoning: 5,
total: 105,
};
const bench = computeBenchmark(t0, t1, 500, 'local', breakdown);
expect(bench.tokenBreakdown).toEqual(breakdown);
});
it('defaults token breakdown to null when omitted', () => {
const bench = computeBenchmark(t0, t1, 500, 'local');
expect(bench.tokenBreakdown).toBeNull();
});
});
// ─── sanitizeSlug ────────────────────────────────────────────────────────────


@@ -0,0 +1,31 @@
import { describe, it, expect } from 'vitest';
import { evaluateTriggerRule } from '../flow-runner-decisions.js';
describe('evaluateTriggerRule', () => {
it('all_success requires all deps done', () => {
expect(evaluateTriggerRule(['a', 'b'], new Set(['a', 'b']), new Set(), new Set())).toBe(true);
expect(evaluateTriggerRule(['a', 'b'], new Set(['a']), new Set(), new Set())).toBe(false);
});
it('one_success fires on first completion', () => {
expect(evaluateTriggerRule(['a', 'b'], new Set(['a']), new Set(), new Set(), 'one_success')).toBe(true);
expect(evaluateTriggerRule(['a', 'b'], new Set(), new Set(), new Set(), 'one_success')).toBe(false);
});
it('all_done includes skipped deps', () => {
expect(evaluateTriggerRule(['a', 'b'], new Set(['a']), new Set(['b']), new Set(), 'all_done')).toBe(true);
});
it('all_success treats excluded deps as satisfied', () => {
expect(evaluateTriggerRule(['a', 'b'], new Set(['a']), new Set(), new Set(['b']))).toBe(true);
});
it('defaults to all_success', () => {
expect(evaluateTriggerRule(['a'], new Set(['a']), new Set(), new Set())).toBe(true);
expect(evaluateTriggerRule(['a'], new Set(), new Set(), new Set())).toBe(false);
});
it('returns true for empty deps', () => {
expect(evaluateTriggerRule([], new Set(), new Set(), new Set())).toBe(true);
});
});
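To compare the three rules side by side, the standalone sketch below re-implements the semantics these tests exercise and evaluates a few scenarios for a step that depends on `a` and `b`. The `evaluate` and `rowFor` names are made up for this example. One observation worth checking against intent: as the function is written in this diff, `all_success` and `all_done` accept the same dependency states (done, skipped, or excluded), so only `one_success` behaves differently.

```typescript
type TriggerRule = 'all_success' | 'one_success' | 'all_done';

// Re-implementation for illustration; mirrors the behavior covered by the tests above.
function evaluate(
  deps: string[],
  done: ReadonlySet<string>,
  skipped: ReadonlySet<string>,
  excluded: ReadonlySet<string>,
  rule: TriggerRule = 'all_success',
): boolean {
  if (deps.length === 0) return true;
  const settled = (d: string) => done.has(d) || skipped.has(d) || excluded.has(d);
  if (rule === 'one_success') return deps.some((d) => done.has(d));
  // all_success and all_done both reduce to "every dep is settled" here.
  return deps.every(settled);
}

// Evaluate all three rules for a step depending on ['a', 'b'].
function rowFor(done: string[], skipped: string[]): Record<TriggerRule, boolean> {
  const deps = ['a', 'b'];
  const d = new Set(done);
  const s = new Set(skipped);
  const x = new Set<string>();
  return {
    all_success: evaluate(deps, d, s, x, 'all_success'),
    one_success: evaluate(deps, d, s, x, 'one_success'),
    all_done: evaluate(deps, d, s, x, 'all_done'),
  };
}

// a done, b still in flight: only one_success fires.
const oneFinished = rowFor(['a'], []);
// a done, b skipped: every rule fires.
const oneSkipped = rowFor(['a'], ['b']);
// both skipped: all_success and all_done fire, one_success does not.
const noneSucceeded = rowFor([], ['a', 'b']);
```

The last scenario is the interesting one for a flow like `parallel-research`: if every angle is skipped, a `one_success` step never becomes ready, so a runner would need some other path to finish the run.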


@@ -9,7 +9,7 @@
* A contestant's status lifecycle:
* queued → running → done | error
*/
-import type { BattleType, ContestantLane } from '@boocode/contracts/arena';
+import type { BattleType, ContestantLane, TokenBreakdown } from '@boocode/contracts/arena';
// ─── Lane classification ──────────────────────────────────────────────────────
@@ -73,6 +73,7 @@ export function isBattleComplete(contestants: readonly { status: string }[]): bo
export interface Benchmark {
durationMs: number;
tokensPerSec: number | null;
tokenBreakdown: TokenBreakdown | null;
}
/**
@@ -86,13 +87,14 @@ export function computeBenchmark(
endedAt: Date,
costTokens: number | null,
lane: ContestantLane,
tokenBreakdown: TokenBreakdown | null = null,
): Benchmark {
const durationMs = Math.max(0, endedAt.getTime() - startedAt.getTime());
const tokensPerSec =
lane === 'local' && costTokens !== null && durationMs > 0
? (costTokens / durationMs) * 1000
: null;
-return { durationMs, tokensPerSec };
+return { durationMs, tokensPerSec, tokenBreakdown };
}
// ─── Slug / path helpers ──────────────────────────────────────────────────────
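A worked example of the throughput arithmetic in `computeBenchmark`, written as a self-contained copy so it can run outside the repo. The `Lane` type with a `'cloud'` member and the `TokenBreakdown` field list are assumptions inferred from the changelog and the unit test in this diff; the real types live in `@boocode/contracts/arena` and may differ.

```typescript
// Assumed shapes; the real ones are imported from '@boocode/contracts/arena'.
type Lane = 'local' | 'cloud';
interface TokenBreakdown {
  system: number;
  user: number;
  assistant: number;
  tools: number;
  reasoning: number;
  total: number;
}
interface Benchmark {
  durationMs: number;
  tokensPerSec: number | null;
  tokenBreakdown: TokenBreakdown | null;
}

function computeBenchmark(
  startedAt: Date,
  endedAt: Date,
  costTokens: number | null,
  lane: Lane,
  tokenBreakdown: TokenBreakdown | null = null,
): Benchmark {
  // Clamp to zero so a clock skew never yields a negative duration.
  const durationMs = Math.max(0, endedAt.getTime() - startedAt.getTime());
  // tokens / ms * 1000 = tokens per second; only reported for the local lane.
  const tokensPerSec =
    lane === 'local' && costTokens !== null && durationMs > 0
      ? (costTokens / durationMs) * 1000
      : null;
  return { durationMs, tokensPerSec, tokenBreakdown };
}

const t0 = new Date(0);
const t1 = new Date(2000); // two seconds later

// 500 tokens over 2000 ms: (500 / 2000) * 1000 = 250 tokens per second.
const localRun = computeBenchmark(t0, t1, 500, 'local');

// Same numbers on the cloud lane: duration is kept, throughput is null.
const cloudRun = computeBenchmark(t0, t1, 500, 'cloud');

// Reversed timestamps clamp to a zero duration, which also disables throughput.
const skewedRun = computeBenchmark(t1, t0, 500, 'local');
```

Making `tokenBreakdown` a trailing parameter with a `null` default keeps every existing four-argument call site valid, which matches the two new test cases in this diff.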


@@ -0,0 +1,47 @@
// edit-guards-imports — detects dropped imports in edited files.
// Ported from opencode-morph-fast-apply (MIT).
export interface ImportCheckResult {
ok: boolean;
missingImports: string[];
reason?: string;
}
const IMPORT_PATTERNS = [
/^import\s+(?:\{[^}]*\}|\*\s+as\s+\w+|\w+)\s+from\s+['"][^'"]+['"]\s*;?$/m,
/^import\s+['"][^'"]+['"]\s*;?$/m,
/^export\s+.*\s+from\s+['"][^'"]+['"]\s*;?$/m,
/^require\s*\(\s*['"][^'"]+['"]\s*\)\s*;?$/m,
/^import\s+type\s+\{[^}]*\}\s+from\s+['"][^'"]+['"]\s*;?$/m,
];
function extractImportLines(content: string): string[] {
return content.split('\n').filter((line) =>
IMPORT_PATTERNS.some((p) => p.test(line.trim())),
);
}
export function checkDroppedImports(
original: string,
updated: string,
filePath: string,
): ImportCheckResult {
const originalImports = extractImportLines(original);
const updatedImports = extractImportLines(updated);
if (originalImports.length === 0) {
return { ok: true, missingImports: [] };
}
const missing = originalImports.filter((imp) => !updatedImports.includes(imp));
if (missing.length > 0 && originalImports.length > 0) {
return {
ok: false,
missingImports: missing,
reason: `Edit would drop ${missing.length} import(s) from ${filePath}`,
};
}
return { ok: true, missingImports: [] };
}
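Reading the patterns above literally, a few common forms appear to slip through: a mixed default and named import such as `import React, { useState } from 'react'`, an assigned CommonJS require such as `const fs = require('fs')`, and any import whose indentation changed between versions (the comparison uses the untrimmed line). The sketch below shows one possible broader matcher. The pattern set and the names `importLines` and `droppedImports` are illustrative assumptions, not the project's code, and multi-line imports are still not handled.

```typescript
// Hypothetical, broader single-line patterns. None use the `g` flag, so
// RegExp.prototype.test stays stateless across calls.
const IMPORT_LINE_PATTERNS: RegExp[] = [
  // default, named, namespace, mixed, and `import type` forms
  /^import\s+(?:type\s+)?[\w$*{},\s]+\s+from\s+['"][^'"]+['"]\s*;?$/,
  // side-effect import
  /^import\s+['"][^'"]+['"]\s*;?$/,
  // re-export
  /^export\s+.+\s+from\s+['"][^'"]+['"]\s*;?$/,
  // bare or assigned CommonJS require
  /^(?:(?:const|let|var)\s+[\w${},:\s]+=\s*)?require\s*\(\s*['"][^'"]+['"]\s*\)\s*;?$/,
];

// Returns trimmed import lines so re-indentation alone is not reported as a drop.
function importLines(source: string): string[] {
  return source
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => IMPORT_LINE_PATTERNS.some((p) => p.test(line)));
}

// Lists import lines present in `original` but absent from `updated`.
function droppedImports(original: string, updated: string): string[] {
  const kept = new Set(importLines(updated));
  return importLines(original).filter((line) => !kept.has(line));
}

const before = [
  "import React, { useState } from 'react';",
  "const fs = require('fs');",
  "import './side-effect.js';",
  'const x = 1;',
].join('\n');

const after = ["import './side-effect.js';", 'const x = 2;'].join('\n');

// Both the mixed import and the assigned require are reported as dropped.
const dropped = droppedImports(before, after);
```

A line-based check will still flag legitimate refactors, for example merging two imports from the same module into one line, so treating the result as a warning rather than a hard failure may be the safer default.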


@@ -0,0 +1,42 @@
// v2.8 Morph safety guards — prevents catastrophic truncation, marker leakage,
// and accidental import deletion during native edit_file application.
// Ported from opencode-morph-fast-apply (MIT) with threshold values preserved.
export interface GuardResult {
ok: boolean;
reason?: string;
charLoss?: number;
lineLoss?: number;
}
const TRUNCATION_CHAR_THRESHOLD = 0.6;
const TRUNCATION_LINE_THRESHOLD = 0.5;
export function validateEditResult(
original: string,
updated: string,
filePath: string,
): GuardResult {
// Check for catastrophic content truncation
if (original.length > 0 && updated.length > 0) {
const charLoss = 1 - updated.length / original.length;
const originalLines = original.split('\n').length;
const updatedLines = updated.split('\n').length;
const lineLoss = 1 - updatedLines / originalLines;
if (charLoss > TRUNCATION_CHAR_THRESHOLD && lineLoss > TRUNCATION_LINE_THRESHOLD) {
return {
ok: false,
reason: `Edit would truncate ${Math.round(charLoss * 100)}% of characters and ${Math.round(lineLoss * 100)}% of lines`,
charLoss,
lineLoss,
};
}
}
return { ok: true };
}
export function formatGuardError(guard: GuardResult, filePath: string): string {
return `Edit guard rejected change to ${filePath}: ${guard.reason ?? 'unknown error'}`;
}
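The truncation guard rejects only when both thresholds are exceeded, which a few concrete inputs make easier to see. The sketch below copies the check into a standalone function (the name `checkTruncation` and the always-populated loss fields are choices made for this example) and runs three cases. It also surfaces one property of the logic as written in this diff: an `updated` string of length 0 skips the check entirely, so an edit that empties a file is not rejected by this guard.

```typescript
interface TruncationCheck {
  ok: boolean;
  charLoss?: number;
  lineLoss?: number;
}

const CHAR_THRESHOLD = 0.6;
const LINE_THRESHOLD = 0.5;

// Standalone copy of the truncation logic for experimentation.
function checkTruncation(original: string, updated: string): TruncationCheck {
  // Mirrors the diff: the check only runs when both sides are non-empty.
  if (original.length === 0 || updated.length === 0) return { ok: true };
  const charLoss = 1 - updated.length / original.length;
  const lineLoss = 1 - updated.split('\n').length / original.split('\n').length;
  const rejected = charLoss > CHAR_THRESHOLD && lineLoss > LINE_THRESHOLD;
  return { ok: !rejected, charLoss, lineLoss };
}

// 100 lines of 10 characters each: 1000 chars plus 99 newlines = 1099 chars.
const lines: string[] = [];
for (let i = 0; i < 100; i++) lines.push('xxxxxxxxxx');
const original = lines.join('\n');

// Case 1: keep only the first 30 lines (329 chars).
// charLoss is about 0.70 and lineLoss is 0.70, so both thresholds trip.
const truncated = checkTruncation(original, lines.slice(0, 30).join('\n'));

// Case 2: keep all 100 lines but shrink each to one character (199 chars).
// charLoss is about 0.82 but lineLoss is 0, so the AND condition passes it.
const shrunk = checkTruncation(original, lines.map(() => 'x').join('\n'));

// Case 3: the file is emptied. The guard is skipped and the edit passes.
const emptied = checkTruncation(original, '');
```

Requiring both thresholds keeps ordinary refactors from tripping the guard, since deleting long comments or collapsing whitespace usually moves only one of the two ratios.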


@@ -0,0 +1,23 @@
import { mkdir, writeFile } from 'node:fs/promises';
import { join } from 'node:path';
import { existsSync } from 'node:fs';
const ARTIFACTS_ROOT = 'data/flow-artifacts';
export function getArtifactPath(flowRunId: string, stepId: string): string {
return join(ARTIFACTS_ROOT, flowRunId, `${stepId}.md`);
}
export async function writeFlowArtifact(
flowRunId: string,
stepId: string,
content: string,
): Promise<string> {
const dir = join(ARTIFACTS_ROOT, flowRunId);
if (!existsSync(dir)) {
await mkdir(dir, { recursive: true });
}
const path = getArtifactPath(flowRunId, stepId);
await writeFile(path, content, 'utf8');
return path;
}


@@ -22,7 +22,7 @@
* "Settled" = done, skipped, or excluded. Only settled deps unblock a step;
* an inFlight dep does NOT (the runner waits for its terminal callback).
*/
-import type { Flow, Step, StepContext } from '../conductor/types.js';
+import type { Flow, Step, StepContext, TriggerRule } from '../conductor/types.js';
export interface SchedulerState {
/** step ids that completed successfully (results available) */
@@ -62,7 +62,7 @@ export function readySteps(flow: Flow, state: SchedulerState): Step[] {
!state.skipped.has(s.id) &&
!state.inFlight.has(s.id) &&
!state.excluded.has(s.id) &&
-(s.deps ?? []).every((d) => isSatisfied(state, d)),
+((s.deps ?? []).length === 0 || evaluateTriggerRule(s.deps ?? [], state.done, state.skipped, state.excluded, s.trigger_rule)),
);
}
@@ -167,6 +167,32 @@ export function shouldFailOnMissingAgent(agent: string, modeId: string | null):
return agent === 'qwen' && modeId === 'plan';
}
/**
* Evaluate a trigger rule against dependency results.
* - all_success: every dep must be done (not skipped/failed)
* - one_success: at least one dep must be done
* - all_done: every dep must be settled regardless of outcome
*/
export function evaluateTriggerRule(
deps: string[],
done: ReadonlySet<string>,
skipped: ReadonlySet<string>,
excluded: ReadonlySet<string>,
rule?: TriggerRule,
): boolean {
if (deps.length === 0) return true;
const satisfied = new Set([...done, ...skipped, ...excluded]);
switch (rule ?? 'all_success') {
case 'all_success':
return deps.every((d) => done.has(d) || skipped.has(d) || excluded.has(d));
case 'one_success':
return deps.some((d) => done.has(d));
case 'all_done':
return deps.every((d) => satisfied.has(d));
}
}
/** /**
* Reconcile every step of an in-flight run for startup resume. Returns one * Reconcile every step of an in-flight run for startup resume. Returns one
* decision per step. Pure — no IO. * decision per step. Pure — no IO.
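
To make the three trigger rules concrete, here is a small standalone sketch that mirrors the switch in `evaluateTriggerRule` as it appears in this diff. The `evaluate` name and the step ids are hypothetical. As written in the diff, `all_success` accepts skipped and excluded deps too, so the sketch folds it together with `all_done`.

```typescript
type TriggerRule = 'all_success' | 'one_success' | 'all_done';

// Standalone copy of the evaluation logic, for illustration only.
function evaluate(
  deps: string[],
  done: ReadonlySet<string>,
  skipped: ReadonlySet<string>,
  excluded: ReadonlySet<string>,
  rule: TriggerRule = 'all_success',
): boolean {
  if (deps.length === 0) return true;
  const settled = (d: string): boolean => done.has(d) || skipped.has(d) || excluded.has(d);
  switch (rule) {
    case 'one_success':
      return deps.some((d) => done.has(d));
    case 'all_success':
    case 'all_done':
      return deps.every(settled);
  }
}

const deps = ['fetch', 'lint', 'test']; // hypothetical step ids
const done = new Set(['fetch']);
const skipped = new Set(['lint']);
const nothing = new Set<string>();

// 'test' is still in flight, so only one_success is satisfied.
console.log(evaluate(deps, done, skipped, nothing, 'one_success')); // true
console.log(evaluate(deps, done, skipped, nothing, 'all_success')); // false
console.log(evaluate(deps, done, skipped, nothing, 'all_done')); // false

// Once 'test' is excluded, every dep is settled.
const excluded = new Set(['test']);
console.log(evaluate(deps, done, skipped, excluded, 'all_success')); // true
```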


@@ -346,6 +346,20 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 continue; // re-evaluate — code output can unblock the next wave
 }
+// Approval gate steps: pause and wait for human decision.
+const approvalReady = toRun.filter((s) => s.kind === 'approval');
+if (approvalReady.length > 0) {
+for (const s of approvalReady) {
+await sql`
+UPDATE flow_steps SET status = 'blocked', updated_at = clock_timestamp()
+WHERE run_id = ${runId} AND step_id = ${s.id}
+`;
+await appendStepEvent(sql, runId, s.id, 'paused', { reason: 'awaiting approval' });
+publishStep(runId, s.id, 'blocked');
+}
+return;
+}
 // Only agent steps remain ready → dispatch the whole parallel wave, then wait.
 for (const s of toRun) {
 await dispatchAgentStep(runId, run.project_id, model, s, ctx);
@@ -378,7 +392,8 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 // flow's step.run already bakes in the evidence/YAGNI contracts.
 const persona = step.agent ? await loadPersona(step.agent) : '';
 const taskPrompt = await step.run(ctx);
-const fullPrompt = persona ? `${persona}\n\n---\n\n${taskPrompt}` : taskPrompt;
+const resolvedPrompt = resolveVariables(taskPrompt, ctx.results);
+const fullPrompt = persona ? `${persona}\n\n---\n\n${resolvedPrompt}` : resolvedPrompt;
 // READ-ONLY (D-4): agent='qwen', mode_id='plan' are hardcoded, never
 // user-overridable. The dispatcher's qwen+plan rule forces the PTY hard gate.
@@ -392,6 +407,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 SET task_id = ${task!.id}, status = 'running', input = ${fullPrompt}, updated_at = clock_timestamp()
 WHERE run_id = ${runId} AND step_id = ${step.id}
 `;
+await appendStepEvent(sql, runId, step.id, 'started', { taskId: task!.id });
 }
 /**
@@ -438,6 +454,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 WHERE run_id = ${runId} AND step_id = ${stepId}
 `;
 }
+await appendStepEvent(sql, runId, stepId, status, output ? { outputLength: output.length } : undefined);
 }
 // ─── run completion ─────────────────────────────────────────────────────────
@@ -483,6 +500,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 if (updated.count === 0) return;
 const stepId = failedStepId ?? (flow ? lastAgentStepId(flow, input, model) : 'run');
 log.warn({ runId, error }, 'flow-runner: run failed');
+await appendStepEvent(sql, runId, stepId, 'failed', { error });
 publishStep(runId, stepId, 'failed', { run_status: 'failed' });
 }
@@ -522,7 +540,7 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 function publishStep(
 runId: string,
 stepId: string,
-status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled',
+status: 'running' | 'completed' | 'failed' | 'skipped' | 'cancelled' | 'blocked',
 extra?: { run_status?: 'running' | 'completed' | 'failed' | 'cancelled'; report?: string },
 ): void {
 publishUser({
@@ -763,3 +781,40 @@ export function createFlowRunner(deps: Deps): FlowRunner {
 function errMsg(e: unknown): string {
 return e instanceof Error ? e.message : String(e);
 }
+// ─── Event log ───────────────────────────────────────────────────────────────
+async function appendStepEvent(
+sql: Sql,
+runId: string,
+stepId: string,
+event: string,
+payload?: Record<string, unknown>,
+): Promise<void> {
+await sql`
+INSERT INTO flow_step_events (run_id, step_id, event, payload)
+VALUES (${runId}, ${stepId}, ${event}, ${payload ? sql.json(payload as never) : null})
+`;
+}
+// ─── Variable substitution ───────────────────────────────────────────────────
+const VAR_PATTERN = /\$(\w+)\.output(?:\.(\w+(?:\.\w+)*))?/g;
+export function resolveVariables(prompt: string, results: Record<string, string>): string {
+return prompt.replace(VAR_PATTERN, (match, stepId, fieldPath) => {
+const output = results[stepId];
+if (!output) return match;
+if (!fieldPath) return output;
+// Escape the dots of a nested path so they match literally instead of matching any character.
+const fieldLine = new RegExp(`^${fieldPath.replace(/\./g, '\\.')}:\\s*(.+)$`, 'i');
+for (const line of output.split('\n')) {
+const parsed = line.match(fieldLine);
+if (parsed) return parsed[1]!.trim();
+}
+// No `field: value` line found: leave the placeholder untouched.
+return match;
+});
+}
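
The substitution looks for `$stepId.output` or `$stepId.output.field` and, for the field form, scans the step output for a `field: value` line. The sketch below is a standalone approximation of that behavior with the field path escaped before it is used in a regular expression. The `resolve` name, the `research` step id, and the sample output are all hypothetical.

```typescript
const VAR_PATTERN = /\$(\w+)\.output(?:\.(\w+(?:\.\w+)*))?/g;

// Standalone approximation of resolveVariables, for illustration only.
function resolve(prompt: string, results: Record<string, string>): string {
  return prompt.replace(VAR_PATTERN, (match: string, stepId: string, fieldPath?: string) => {
    const output = results[stepId];
    if (!output) return match; // unknown step: leave the placeholder untouched
    if (!fieldPath) return output; // bare `$step.output`: substitute the whole output
    const escaped = fieldPath.replace(/\./g, '\\.');
    const fieldLine = new RegExp(`^${escaped}:\\s*(.+)$`, 'i');
    for (const line of output.split('\n')) {
      const parsed = line.match(fieldLine);
      if (parsed && parsed[1] !== undefined) return parsed[1].trim();
    }
    return match; // no `field: value` line in the output
  });
}

const results: Record<string, string> = {
  research: 'summary: BM25 beats keyword search\nconfidence: high',
};

console.log(resolve('Use $research.output.summary as context.', results));
// Use BM25 beats keyword search as context.
console.log(resolve('Confidence is $research.output.confidence', results));
// Confidence is high
console.log(resolve('See $missing.output.summary', results));
// See $missing.output.summary
```

Leaving unresolved placeholders in the prompt keeps a typo in a step id visible in the dispatched prompt rather than silently producing an empty string.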


@@ -0,0 +1,75 @@
import type { Readable, Writable } from 'node:stream';
interface RpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: unknown;
}
interface RpcResponse {
  jsonrpc: '2.0';
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}
export class LspClient {
  private nextId = 1;
  private pending = new Map<number, { resolve: (v: RpcResponse) => void; reject: (e: Error) => void }>();
  // Raw bytes, not a string: Content-Length counts bytes, and LSP bodies carry no trailing
  // newline, so a line-based reader would hold the last body back until the next message.
  private buffer: Buffer = Buffer.alloc(0);
  constructor(
    private stdin: Writable,
    private stdout: Readable,
  ) {
    stdout.on('data', (chunk: Buffer | string) => this.handleData(chunk));
  }
  private handleData(chunk: Buffer | string): void {
    this.buffer = Buffer.concat([this.buffer, typeof chunk === 'string' ? Buffer.from(chunk, 'utf8') : chunk]);
    // Drain every complete frame currently buffered.
    while (true) {
      const headerEnd = this.buffer.indexOf('\r\n\r\n');
      if (headerEnd === -1) return;
      const bodyStart = headerEnd + 4;
      const match = this.buffer.subarray(0, headerEnd).toString('ascii').match(/Content-Length: (\d+)/i);
      if (!match || !match[1]) {
        this.buffer = this.buffer.subarray(bodyStart); // unparseable header block, skip it
        continue;
      }
      const len = parseInt(match[1], 10);
      if (this.buffer.length < bodyStart + len) return; // body not fully received yet
      const body = this.buffer.subarray(bodyStart, bodyStart + len).toString('utf8');
      this.buffer = this.buffer.subarray(bodyStart + len);
      try {
        const msg: RpcResponse = JSON.parse(body);
        // Server notifications carry no id and simply find no pending callback.
        const cb = this.pending.get(msg.id);
        if (cb) {
          this.pending.delete(msg.id);
          cb.resolve(msg);
        }
      } catch {
        // Malformed JSON, ignore
      }
    }
  }
  async request(method: string, params?: unknown): Promise<unknown> {
    const id = this.nextId++;
    const req: RpcRequest = { jsonrpc: '2.0', id, method, params };
    const body = JSON.stringify(req);
    const header = `Content-Length: ${Buffer.byteLength(body, 'utf8')}\r\n\r\n`;
    return new Promise((resolve, reject) => {
      this.pending.set(id, {
        resolve: (resp) => {
          if (resp.error) reject(new Error(resp.error.message));
          else resolve(resp.result);
        },
        reject,
      });
      this.stdin.write(header + body);
    });
  }
  async notify(method: string, params?: unknown): Promise<void> {
    const body = JSON.stringify({ jsonrpc: '2.0', method, params });
    const header = `Content-Length: ${Buffer.byteLength(body, 'utf8')}\r\n\r\n`;
    this.stdin.write(header + body);
  }
}
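
The LSP base protocol frames each JSON-RPC message with a `Content-Length` header measured in bytes, followed by a blank line and the body. A minimal sketch of encoding and incrementally decoding such frames follows; `encodeFrame` and `createDecoder` are hypothetical names, and the sketch assumes well-formed JSON bodies.

```typescript
import { Buffer } from 'node:buffer';

// Encode one JSON-RPC message with the LSP base-protocol header.
function encodeFrame(message: unknown): Buffer {
  const body = Buffer.from(JSON.stringify(message), 'utf8');
  return Buffer.concat([Buffer.from(`Content-Length: ${body.length}\r\n\r\n`, 'ascii'), body]);
}

// Hypothetical incremental decoder: feed it chunks, get back the complete messages.
function createDecoder(): (chunk: Buffer) => unknown[] {
  let buffer: Buffer = Buffer.alloc(0);
  return (chunk) => {
    buffer = Buffer.concat([buffer, chunk]);
    const messages: unknown[] = [];
    for (;;) {
      const headerEnd = buffer.indexOf('\r\n\r\n');
      if (headerEnd === -1) break; // header not complete yet
      const bodyStart = headerEnd + 4;
      const match = /Content-Length: (\d+)/i.exec(buffer.subarray(0, headerEnd).toString('ascii'));
      if (!match || match[1] === undefined) {
        buffer = buffer.subarray(bodyStart); // skip an unparseable header block
        continue;
      }
      const bodyEnd = bodyStart + Number.parseInt(match[1], 10);
      if (buffer.length < bodyEnd) break; // body not fully received yet
      messages.push(JSON.parse(buffer.subarray(bodyStart, bodyEnd).toString('utf8')));
      buffer = buffer.subarray(bodyEnd);
    }
    return messages;
  };
}

// 'é' is two bytes in UTF-8, so byte length and string length differ here.
const frame = encodeFrame({ jsonrpc: '2.0', id: 1, result: 'héllo' });
const decode = createDecoder();
const first = decode(frame.subarray(0, 10)); // partial header only
const second = decode(Buffer.concat([frame.subarray(10), encodeFrame({ jsonrpc: '2.0', id: 2, result: null })]));
console.log(first.length, second.length); // 0 2
```

Working on bytes rather than lines matters because a message body has no trailing newline and may contain multi-byte characters.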


@@ -0,0 +1,19 @@
export interface LspServerConfig {
command: string;
args: string[];
rootPatterns: string[];
}
const TS_CONFIG: LspServerConfig = {
command: 'typescript-language-server',
args: ['--stdio'],
rootPatterns: ['package.json', 'tsconfig.json'],
};
const SUPPORTED_EXTS = new Set(['ts', 'tsx', 'js', 'jsx', 'mjs', 'cjs']);
export function getServerConfig(filePath: string): LspServerConfig | null {
const ext = filePath.split('.').pop()?.toLowerCase();
if (ext && SUPPORTED_EXTS.has(ext)) return TS_CONFIG;
return null;
}


@@ -0,0 +1,86 @@
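import { pathToFileURL } from 'node:url';
import type { LspClient } from './client.js';
import type { Diagnostic, Location } from './types.js';
// pathToFileURL percent-encodes spaces, '#', '?' and similar characters that plain
// string concatenation would leave raw in the URI.
function fileUri(filePath: string): string {
return pathToFileURL(filePath).href;
}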
export async function openDocument(
client: LspClient,
filePath: string,
content: string,
version: number = 1,
): Promise<void> {
const uri = fileUri(filePath);
await client.notify('textDocument/didOpen', {
textDocument: { uri, languageId: 'typescript', version, text: content },
});
}
export async function closeDocument(client: LspClient, filePath: string): Promise<void> {
await client.notify('textDocument/didClose', {
textDocument: { uri: fileUri(filePath) },
});
}
export async function getDiagnostics(
client: LspClient,
filePath: string,
content: string,
): Promise<Diagnostic[]> {
const uri = fileUri(filePath);
await openDocument(client, filePath, content);
const result: any = await client.request('textDocument/diagnostic', {
textDocument: { uri },
});
await closeDocument(client, filePath);
const diagnostics: Diagnostic[] = [];
if (result?.diagnostics) {
for (const d of result.diagnostics) {
diagnostics.push({
range: d.range,
severity: d.severity ?? 1,
message: d.message,
source: d.source,
});
}
}
return diagnostics;
}
export async function gotoDefinition(
client: LspClient,
filePath: string,
content: string,
line: number,
character: number,
): Promise<Location | null> {
const uri = fileUri(filePath);
await openDocument(client, filePath, content);
const result: any = await client.request('textDocument/definition', {
textDocument: { uri },
position: { line, character },
});
await closeDocument(client, filePath);
if (!result) return null;
const loc = Array.isArray(result) ? result[0] : result;
return loc ? { uri: loc.uri, range: loc.range } : null;
}
export async function findReferences(
client: LspClient,
filePath: string,
content: string,
line: number,
character: number,
): Promise<Location[]> {
const uri = fileUri(filePath);
await openDocument(client, filePath, content);
const result: any = await client.request('textDocument/references', {
textDocument: { uri },
position: { line, character },
context: { includeDeclaration: true },
});
await closeDocument(client, filePath);
return (result ?? []).map((loc: any) => ({ uri: loc.uri, range: loc.range }));
}
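
Each operation identifies the document by a `file://` URI. A short sketch of why the way that URI is built matters: a hand-built prefix leaves characters such as spaces and `#` unescaped, whereas Node's `pathToFileURL` percent-encodes them. The `naive` helper and the sample path are hypothetical.

```typescript
import { pathToFileURL } from 'node:url';

// Hand-built URI: prefix only, no escaping.
const naive = (p: string): string => `file://${p.startsWith('/') ? '' : '/'}${p}`;

const samplePath = '/tmp/my project/a#b.ts'; // hypothetical path

console.log(naive(samplePath)); // file:///tmp/my project/a#b.ts
console.log(pathToFileURL(samplePath).href); // on POSIX: file:///tmp/my%20project/a%23b.ts

// With the raw '#', a URL parser treats everything after it as a fragment.
console.log(new URL(naive(samplePath)).hash); // #b.ts
console.log(new URL(pathToFileURL(samplePath).href).hash); // (empty string)
```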


@@ -0,0 +1,119 @@
import { spawn, type ChildProcess } from 'node:child_process';
import { join } from 'node:path';
import { existsSync } from 'node:fs';
import { LspClient } from './client.js';
import { getServerConfig } from './config.js';
const IDLE_TIMEOUT_MS = 5 * 60 * 1000;
const SWEEP_INTERVAL_MS = 30_000;
interface LspInstance {
client: LspClient;
proc: ChildProcess;
lastUsed: number;
timer: ReturnType<typeof setTimeout>;
}
export class LspServerManager {
private instances = new Map<string, LspInstance>();
private sweepTimer: ReturnType<typeof setInterval> | null = null;
constructor() {
this.startSweeper();
}
private startSweeper(): void {
this.sweepTimer = setInterval(() => this.sweep(), SWEEP_INTERVAL_MS);
this.sweepTimer.unref?.();
}
private findProjectRoot(filePath: string): string | null {
let dir = filePath;
const config = getServerConfig(filePath);
if (!config) return null;
while (true) {
for (const pattern of config.rootPatterns) {
if (existsSync(join(dir, pattern))) return dir;
}
const parent = join(dir, '..');
if (parent === dir) return dir;
dir = parent;
}
}
async getClient(filePath: string): Promise<LspClient | null> {
const config = getServerConfig(filePath);
if (!config) return null;
const projectRoot = this.findProjectRoot(filePath);
if (!projectRoot) return null;
const existing = this.instances.get(projectRoot);
if (existing) {
existing.lastUsed = Date.now();
clearTimeout(existing.timer);
existing.timer = setTimeout(() => this.kill(projectRoot), IDLE_TIMEOUT_MS);
existing.timer.unref?.();
return existing.client;
}
return this.spawn(projectRoot, config.command, config.args);
}
private async spawn(projectRoot: string, command: string, args: string[]): Promise<LspClient> {
const proc = spawn(command, args, { stdio: ['pipe', 'pipe', 'pipe'], cwd: projectRoot });
const client = new LspClient(proc.stdin!, proc.stdout!);
await client.request('initialize', {
processId: process.pid,
rootUri: `file://${projectRoot}`,
capabilities: {
textDocument: {
diagnostic: { dynamicRegistration: false },
definition: { dynamicRegistration: false },
references: { dynamicRegistration: false },
},
},
});
await client.notify('initialized', {});
const timer = setTimeout(() => this.kill(projectRoot), IDLE_TIMEOUT_MS);
timer.unref?.();
this.instances.set(projectRoot, { client, proc, lastUsed: Date.now(), timer });
proc.on('exit', () => this.instances.delete(projectRoot));
return client;
}
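private kill(projectRoot: string): void {
const inst = this.instances.get(projectRoot);
if (!inst) return;
this.instances.delete(projectRoot);
clearTimeout(inst.timer); // drop the pending idle timer so it cannot fire for a dead instance
inst.proc.kill('SIGTERM');
// Escalate only if the process is still alive: exitCode stays null when a signal
// terminated it, so signalCode must be checked as well. unref keeps this timer
// from holding the host process open during shutdown.
const escalation = setTimeout(() => {
if (inst.proc.exitCode === null && inst.proc.signalCode === null) inst.proc.kill('SIGKILL');
}, 5000);
escalation.unref?.();
}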
private sweep(): void {
const now = Date.now();
for (const [root, inst] of this.instances) {
if (now - inst.lastUsed > IDLE_TIMEOUT_MS) {
this.kill(root);
}
}
}
shutdown(): void {
if (this.sweepTimer) clearInterval(this.sweepTimer);
for (const root of [...this.instances.keys()]) {
this.kill(root);
}
}
getActiveCount(): number {
return this.instances.size;
}
}
export const lspManager = new LspServerManager();
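
The manager keeps one language server per project root and evicts instances that have been idle longer than `IDLE_TIMEOUT_MS`. The eviction decision itself is a pure comparison, sketched below with an injected clock so it can be checked without timers. `idleKeys` and the sample roots are hypothetical and not part of the module.

```typescript
interface Tracked {
  lastUsed: number; // epoch milliseconds of the most recent request
}

// Hypothetical pure helper: which keys have been idle longer than the timeout?
function idleKeys(instances: ReadonlyMap<string, Tracked>, now: number, idleTimeoutMs: number): string[] {
  const idle: string[] = [];
  instances.forEach((inst, key) => {
    if (now - inst.lastUsed > idleTimeoutMs) idle.push(key);
  });
  return idle;
}

const IDLE_TIMEOUT_MS = 5 * 60 * 1000;
const now = 1_000_000_000;
const pool = new Map<string, Tracked>([
  ['/repo/a', { lastUsed: now - 6 * 60 * 1000 }], // idle for six minutes
  ['/repo/b', { lastUsed: now - 30_000 }], // used thirty seconds ago
]);

console.log(idleKeys(pool, now, IDLE_TIMEOUT_MS)); // [ '/repo/a' ]
```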


@@ -0,0 +1,28 @@
export interface Position {
line: number;
character: number;
}
export interface Range {
start: Position;
end: Position;
}
export interface Location {
uri: string;
range: Range;
}
export interface Diagnostic {
range: Range;
severity: number;
message: string;
source?: string;
}
export interface TextDocumentItem {
uri: string;
languageId: string;
version: number;
text: string;
}


@@ -4,6 +4,7 @@ import { randomBytes } from 'node:crypto';
 import type { Sql } from '../db.js';
 import { resolveWritePath } from './write_guard.js';
 import { locateMatch } from './fuzzy-match.js';
+import { validateEditResult, formatGuardError } from './edit-guards.js';
 /**
 * Write a file atomically: stage to a sibling temp file, then rename over the
@@ -285,6 +286,10 @@ export async function applyOne(
 );
 }
 if (plan.kind === 'apply') {
+const guard = validateEditResult(toLf(raw), plan.updated, change.file_path);
+if (!guard.ok) {
+throw new Error(formatGuardError(guard, change.file_path));
+}
 const out = eol === '\r\n' ? plan.updated.replaceAll('\n', '\r\n') : plan.updated;
 await writeFileAtomic(change.file_path, out);
 } else {


@@ -0,0 +1,29 @@
import { describe, it, expect } from 'vitest';
import { analyzeMessages } from '../analyzer.js';
describe('analyzeMessages', () => {
it('classifies user messages', () => {
const breakdown = analyzeMessages([{ role: 'user', content: 'hello world' }]);
expect(breakdown.user).toBeGreaterThan(0);
expect(breakdown.total).toBe(breakdown.user);
});
it('counts tool calls', () => {
const parts = [
{ role: 'assistant', content: 'using grep', tool_calls: [{ id: '1', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: '{"files":[]}', tool_call_id: '1' },
];
const breakdown = analyzeMessages(parts);
expect(breakdown.tools).toBeGreaterThan(0);
expect(breakdown.assistant).toBeGreaterThan(0);
});
it('separates reasoning tokens', () => {
const parts = [
{ role: 'assistant', content: 'short answer', reasoning_parts: [{ text: 'long chain of thought reasoning here' }] },
];
const breakdown = analyzeMessages(parts);
expect(breakdown.reasoning).toBeGreaterThan(0);
expect(breakdown.assistant).toBeLessThan(breakdown.reasoning);
});
});


@@ -0,0 +1,10 @@
import { describe, it, expect } from 'vitest';
describe('persistTaskBreakdown', () => {
it('exports functions', async () => {
const mod = await import('../persist.js');
expect(typeof mod.persistTaskBreakdown).toBe('function');
expect(typeof mod.getTaskBreakdown).toBe('function');
expect(typeof mod.analyzeAndPersistTaskBreakdown).toBe('function');
});
});


@@ -0,0 +1,60 @@
// TokenScope analyzer — classifies message parts into category breakdown.
// Ported from opencode-tokenscope (MIT).
export interface TokenBreakdown {
system: number;
user: number;
assistant: number;
tools: number;
reasoning: number;
total: number;
}
const CHARS_PER_TOKEN = 4;
function estimateTokens(text: string): number {
return Math.ceil(text.length / CHARS_PER_TOKEN);
}
export function analyzeMessages(parts: any[]): TokenBreakdown {
const breakdown: TokenBreakdown = { system: 0, user: 0, assistant: 0, tools: 0, reasoning: 0, total: 0 };
for (const part of parts) {
const role = part.role ?? '';
const content = part.content ?? '';
const tokens = estimateTokens(content);
switch (role) {
case 'system':
breakdown.system += tokens;
break;
case 'user':
breakdown.user += tokens;
break;
case 'assistant':
breakdown.assistant += tokens;
if (part.tool_calls) {
for (const tc of part.tool_calls) {
breakdown.tools += estimateTokens(JSON.stringify(tc));
}
}
break;
case 'tool':
breakdown.tools += tokens;
break;
default:
breakdown.assistant += tokens;
}
if (part.reasoning_parts) {
for (const rp of part.reasoning_parts) {
const rTokens = estimateTokens(rp.text ?? '');
breakdown.reasoning += rTokens;
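// Clamp at zero: when the reasoning text is not part of `content`, a plain
// subtraction would drive the assistant bucket (and the total) negative.
breakdown.assistant = Math.max(0, breakdown.assistant - rTokens);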
}
}
}
breakdown.total = breakdown.system + breakdown.user + breakdown.assistant + breakdown.tools + breakdown.reasoning;
return breakdown;
}
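
The analyzer estimates tokens with a fixed four-characters-per-token heuristic rather than a real tokenizer. A tiny worked example of that arithmetic, using the strings from the tests in this diff:

```typescript
const CHARS_PER_TOKEN = 4;

// Same heuristic as the analyzer: character count divided by four, rounded up.
const estimateTokens = (text: string): number => Math.ceil(text.length / CHARS_PER_TOKEN);

console.log(estimateTokens('hello world')); // 3 (11 characters)
console.log(estimateTokens('short answer')); // 3 (12 characters)
console.log(estimateTokens('long chain of thought reasoning here')); // 9 (36 characters)
```

These are rough estimates only; real token counts depend on the model's tokenizer and can differ noticeably for code or non-English text.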


@@ -0,0 +1,35 @@
// TokenScope persistence — writes breakdown to task records.
import type { Sql } from '../../db.js';
import type { TokenBreakdown } from './analyzer.js';
export async function persistTaskBreakdown(
sql: Sql,
taskId: string,
breakdown: TokenBreakdown,
): Promise<void> {
await sql`
UPDATE tasks SET token_breakdown = ${sql.json(breakdown as never)}
WHERE id = ${taskId}
`;
}
export async function getTaskBreakdown(
sql: Sql,
taskId: string,
): Promise<TokenBreakdown | null> {
const rows = await sql<{ token_breakdown: any }[]>`
SELECT token_breakdown FROM tasks WHERE id = ${taskId}
`;
return rows[0]?.token_breakdown ?? null;
}
export async function analyzeAndPersistTaskBreakdown(
sql: Sql,
taskId: string,
parts: any[],
): Promise<TokenBreakdown> {
const { analyzeMessages } = await import('./analyzer.js');
const breakdown = analyzeMessages(parts);
await persistTaskBreakdown(sql, taskId, breakdown);
return breakdown;
}


@@ -7,6 +7,9 @@ import { rewindTool } from './rewind.js';
 import { newTaskTool } from './new_task.js';
 import { listTasksTool } from './list_tasks.js';
 import { checkTaskStatusTool } from './check_task_status.js';
+import { lspDiagnosticsTool } from './lsp_diagnostics.js';
+import { lspGotoDefinitionTool } from './lsp_goto_definition.js';
+import { lspFindReferencesTool } from './lsp_find_references.js';
 export type { ToolDef, ToolContext, ToolJsonSchema } from './types.js';
@@ -26,4 +29,16 @@ export const WRITE_TOOLS: readonly ToolDef<any>[] = [
 checkTaskStatusTool,
 ];
-export { editFileTool, createFileTool, deleteFileTool, applyPendingTool, rewindTool, newTaskTool, listTasksTool, checkTaskStatusTool };
+// Read-only agent tools for code intelligence.
+// eslint-disable-next-line @typescript-eslint/no-explicit-any
+export const READ_TOOLS: readonly ToolDef<any>[] = [
+lspDiagnosticsTool,
+lspGotoDefinitionTool,
+lspFindReferencesTool,
+];
+export {
+editFileTool, createFileTool, deleteFileTool, applyPendingTool, rewindTool,
+newTaskTool, listTasksTool, checkTaskStatusTool,
+lspDiagnosticsTool, lspGotoDefinitionTool, lspFindReferencesTool,
+};


@@ -0,0 +1,48 @@
import { z } from 'zod';
import { readFile } from 'node:fs/promises';
import type { ToolDef, ToolContext } from './types.js';
import { resolveWritePath } from '../write_guard.js';
import { lspManager } from '../lsp/server-manager.js';
import { getDiagnostics } from '../lsp/operations.js';
const LspDiagnosticsInput = z.object({
file_path: z.string().describe('Path to the file to check for diagnostics'),
});
type InputT = z.infer<typeof LspDiagnosticsInput>;
export const lspDiagnosticsTool: ToolDef<InputT> = {
name: 'lsp_diagnostics',
description: 'Get TypeScript/JavaScript diagnostics (errors, warnings) for a file. Returns diagnostic messages with severity and location.',
inputSchema: LspDiagnosticsInput,
jsonSchema: {
type: 'function',
function: {
name: 'lsp_diagnostics',
description: 'Get TypeScript/JavaScript diagnostics for a file',
parameters: {
type: 'object',
properties: {
file_path: { type: 'string', description: 'Path to the file' },
},
required: ['file_path'],
},
},
},
async execute(input: InputT, projectRoot: string, _context: ToolContext): Promise<unknown> {
const resolved = await resolveWritePath(projectRoot, input.file_path);
const content = await readFile(resolved, 'utf8');
const client = await lspManager.getClient(resolved);
if (!client) return { error: 'Unsupported file type for LSP diagnostics' };
const diagnostics = await getDiagnostics(client, resolved, content);
if (diagnostics.length === 0) return { result: 'No diagnostics found.' };
const lines = diagnostics.map((d) => {
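// `||` rather than `??`: an out-of-spec severity of 0 hits the empty placeholder and should read 'unknown' too.
const sev = ['', 'error', 'warning', 'info', 'hint'][d.severity] || 'unknown';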
return `[${sev}] line ${d.range.start.line + 1}:${d.range.start.character + 1} - ${d.message}`;
});
return { result: lines.join('\n') };
},
};
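
LSP positions are 0-based, while the tool reports 1-based line and column numbers to the agent. A small sketch of that formatting step follows. `formatDiagnostic` is a hypothetical helper extracted for illustration, and the sample diagnostic is invented.

```typescript
interface Position {
  line: number; // 0-based, as in LSP
  character: number; // 0-based, as in LSP
}
interface Diagnostic {
  range: { start: Position; end: Position };
  severity: number; // LSP DiagnosticSeverity: 1 error, 2 warning, 3 information, 4 hint
  message: string;
}

const SEVERITY_NAMES = ['', 'error', 'warning', 'info', 'hint'];

// Hypothetical helper: render one diagnostic with 1-based coordinates.
function formatDiagnostic(d: Diagnostic): string {
  const sev = SEVERITY_NAMES[d.severity] || 'unknown';
  return `[${sev}] line ${d.range.start.line + 1}:${d.range.start.character + 1} - ${d.message}`;
}

const sample: Diagnostic = {
  range: { start: { line: 4, character: 2 }, end: { line: 4, character: 9 } },
  severity: 1,
  message: "Type 'string' is not assignable to type 'number'.",
};

console.log(formatDiagnostic(sample));
// [error] line 5:3 - Type 'string' is not assignable to type 'number'.
```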


@@ -0,0 +1,49 @@
import { z } from 'zod';
import { readFile } from 'node:fs/promises';
import type { ToolDef, ToolContext } from './types.js';
import { resolveWritePath } from '../write_guard.js';
import { lspManager } from '../lsp/server-manager.js';
import { findReferences } from '../lsp/operations.js';
const LspFindReferencesInput = z.object({
file_path: z.string().describe('Path to the source file'),
line: z.number().int().nonnegative().describe('0-based line number'),
character: z.number().int().nonnegative().describe('0-based character offset'),
});
type InputT = z.infer<typeof LspFindReferencesInput>;
export const lspFindReferencesTool: ToolDef<InputT> = {
name: 'lsp_find_references',
description: 'Find all references to a symbol at a given position in a file.',
inputSchema: LspFindReferencesInput,
jsonSchema: {
type: 'function',
function: {
name: 'lsp_find_references',
description: 'Find all references to symbol at position',
parameters: {
type: 'object',
properties: {
file_path: { type: 'string' },
line: { type: 'number', description: '0-based line number' },
character: { type: 'number', description: '0-based character offset' },
},
required: ['file_path', 'line', 'character'],
},
},
},
async execute(input: InputT, projectRoot: string, _context: ToolContext): Promise<unknown> {
const resolved = await resolveWritePath(projectRoot, input.file_path);
const content = await readFile(resolved, 'utf8');
const client = await lspManager.getClient(resolved);
if (!client) return { error: 'Unsupported file type' };
const refs = await findReferences(client, resolved, content, input.line, input.character);
if (refs.length === 0) return { result: 'No references found.' };
const lines = refs.map((r) => `${r.uri}:${r.range.start.line + 1}:${r.range.start.character + 1}`);
return { result: `Found ${refs.length} reference(s):\n${lines.join('\n')}` };
},
};


@@ -0,0 +1,48 @@
import { z } from 'zod';
import { readFile } from 'node:fs/promises';
import type { ToolDef, ToolContext } from './types.js';
import { resolveWritePath } from '../write_guard.js';
import { lspManager } from '../lsp/server-manager.js';
import { gotoDefinition } from '../lsp/operations.js';
const LspGotoDefinitionInput = z.object({
file_path: z.string().describe('Path to the source file'),
line: z.number().int().nonnegative().describe('0-based line number'),
character: z.number().int().nonnegative().describe('0-based character offset'),
});
type InputT = z.infer<typeof LspGotoDefinitionInput>;
export const lspGotoDefinitionTool: ToolDef<InputT> = {
name: 'lsp_goto_definition',
description: 'Find the definition of a symbol at a given position in a file.',
inputSchema: LspGotoDefinitionInput,
jsonSchema: {
type: 'function',
function: {
name: 'lsp_goto_definition',
description: 'Find definition of symbol at position',
parameters: {
type: 'object',
properties: {
file_path: { type: 'string' },
line: { type: 'number', description: '0-based line number' },
character: { type: 'number', description: '0-based character offset' },
},
required: ['file_path', 'line', 'character'],
},
},
},
async execute(input: InputT, projectRoot: string, _context: ToolContext): Promise<unknown> {
const resolved = await resolveWritePath(projectRoot, input.file_path);
const content = await readFile(resolved, 'utf8');
const client = await lspManager.getClient(resolved);
if (!client) return { error: 'Unsupported file type' };
const loc = await gotoDefinition(client, resolved, content, input.line, input.character);
if (!loc) return { result: 'No definition found.' };
return { result: `Defined at ${loc.uri}:${loc.range.start.line + 1}:${loc.range.start.character + 1}` };
},
};


@@ -6,6 +6,7 @@ const NewTaskInput = z.object({
 input: z.string().min(1).describe('Task description for the child subtask'),
 agent: z.string().optional().describe('Optional: dispatch to a specific agent'),
 model: z.string().optional().describe('Optional: model override for the subtask'),
+background: z.boolean().optional().describe('If true, return immediately without blocking on completion'),
 });
 type NewTaskInputT = z.infer<typeof NewTaskInput>;
@@ -30,6 +31,7 @@ export const newTaskTool: ToolDef<NewTaskInputT> = {
 input: { type: 'string', description: 'Task description for the child subtask' },
 agent: { type: 'string', description: 'Optional: dispatch to a specific agent' },
 model: { type: 'string', description: 'Optional: model override for the subtask' },
+background: { type: 'boolean', description: 'If true, returns immediately without waiting' },
 },
 required: ['input'],
 },
@@ -50,6 +52,7 @@ export const newTaskTool: ToolDef<NewTaskInputT> = {
 return { error: 'Cannot determine project_id from current session' };
 }
+const isBg = input.background === true;
 const [task] = await sql<{ id: string; state: string }[]>`
 INSERT INTO tasks (project_id, parent_task_id, input, agent, model)
 VALUES (${session.project_id}, ${currentTaskId}, ${input.input}, ${input.agent ?? null}, ${input.model ?? null})
@@ -57,9 +60,12 @@ export const newTaskTool: ToolDef<NewTaskInputT> = {
 `;
 return {
-message: `Subtask created (id: ${task!.id}). It will run in isolation. Use check_task_status to monitor.`,
+message: isBg
+? `Background subtask created (id: ${task!.id}). It will continue independently.`
+: `Subtask created (id: ${task!.id}). It will run in isolation. Use check_task_status to monitor.`,
 task_id: task!.id,
 state: task!.state,
+background: isBg,
 };
 },
 };


@@ -17,6 +17,8 @@
- **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop — only `description` + `inputSchema: jsonSchema(parameters)`. - **Tools have NO `execute` field.** BooCode dispatches tools in tool-phase.ts, not the AI SDK loop — only `description` + `inputSchema: jsonSchema(parameters)`.
- **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `provider.ts`. The adapter defaults it false → no `stream_options.include_usage` → llama-swap emits no usage block → `result.usage` resolves `undefined` (NULL token counts). Don't remove during refactor. - **`includeUsage: true` MUST be set on `createOpenAICompatible`** in `provider.ts`. The adapter defaults it false → no `stream_options.include_usage` → llama-swap emits no usage block → `result.usage` resolves `undefined` (NULL token counts). Don't remove during refactor.
- **Tool-call-only turns may emit a leading `\n` text-delta.** `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check, else whitespace-only content renders an empty bubble + ActionRow between tool calls. `buildMessagesPayload` also skips `status='failed'` and complete-but-empty assistant rows (avoids "Cannot have 2 or more assistant messages at the end of the list" upstream rejection after cap-hit + Continue). - **Tool-call-only turns may emit a leading `\n` text-delta.** `MessageList.flatten`'s `hasText` and `MessageBubble`'s `hasContent` both `.trim()` before the length check, else whitespace-only content renders an empty bubble + ActionRow between tool calls. `buildMessagesPayload` also skips `status='failed'` and complete-but-empty assistant rows (avoids "Cannot have 2 or more assistant messages at the end of the list" upstream rejection after cap-hit + Continue).
- **`services/inference/tool-shim.ts`** — Recovers structured tool calls from plain-text model output. Some models (notably Qwen) emit `<tool_call><name>...</name><arguments>...</arguments></tool_call>` inline text instead of structured JSON. `extractToolCalls(text)` parses both XML and JSON inline formats. `hasToolCallMarkup(text)` is a fast pre-check. Used as a fallback in the stream phase when structured `tool_calls` parse fails. Does NOT require `FAST_MODEL` — operates on the existing turn's output text.
- **`services/inference/loop-detectors.ts`**: Detectors that catch repetitive model behavior. `detectContentRepeat` flags the same content four times in a row (or oscillation between two variants), `detectToolLoop` flags the same tool called three times consecutively, and `detectDoomLoop` runs both and returns the first hit. These are additive to the existing `sentinels.ts` doom-loop detection.
- **AI SDK ModelMessage conversion** (`toModelMessages` in stream-phase.ts). Tool messages need a `toolName` for `ToolResultPart`; BooCode's OpenAI-shape history lacks it, so a forward-scan builds a `tool_call_id → toolName` map from prior assistant `tool_calls`. Tool outputs are wrapped as `{ type: 'json' | 'text', value }` (v6 `ToolResultOutput`). Reasoning emits a `ReasoningPart` first in the content array.
- **`experimental_repairToolCall`** wired into `streamText` to keep the stream alive when qwen3.6 emits malformed tool args. Pass-through: logs the bad call, returns it unmodified; `executeToolPhase`'s zod-reject path routes it back to the model next turn.
- **`chat_status` frame** (via `broker.publishUser`): `status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error'`. Frontend `useChatStatus` derives `idle_warm` (<30s since idle) vs `idle_cold`. `ChatThroughput` renders beside `StatusDot` only when streaming/tool_running, fed by 500ms-throttled `'usage'` frames (`completion_tokens` + `ctx_used` + `ctx_max`). `POST /api/chats/:id/discard_stale` marks a stuck-streaming row `failed` when the frontend's 60s no-token timer gives up.
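
The forward scan mentioned for `toModelMessages` can be sketched roughly as below. This is an illustration only, not the code in stream-phase.ts: the `HistoryMessage` shape and the `buildToolNameMap` helper are hypothetical names introduced for the example.

```typescript
// Hypothetical history row; the field names are assumptions made for this sketch.
interface HistoryMessage {
  role: 'user' | 'assistant' | 'tool';
  content: string;
  tool_call_id?: string;
  tool_calls?: Array<{ id: string; name: string; arguments: string }>;
}

// Walk the history once and record which tool name each tool_call_id belongs to,
// so a later tool message can be given the toolName it lacks.
function buildToolNameMap(history: HistoryMessage[]): Map<string, string> {
  const names = new Map<string, string>();
  for (const msg of history) {
    if (msg.role !== 'assistant' || !msg.tool_calls) continue;
    for (const call of msg.tool_calls) names.set(call.id, call.name);
  }
  return names;
}

const toolNames = buildToolNameMap([
  { role: 'user', content: 'search for x' },
  { role: 'assistant', content: '', tool_calls: [{ id: '1', name: 'grep', arguments: '{}' }] },
  { role: 'tool', content: 'result1', tool_call_id: '1' },
]);
```

A tool message whose `tool_call_id` is missing from the map has no preceding assistant call in the history, which a caller would presumably need to handle separately.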


@@ -0,0 +1,33 @@
import { describe, it, expect } from 'vitest';
import { deduplicate } from '../strategies/deduplication.js';
import type { DcpMessage } from '../messages.js';
describe('deduplicate', () => {
it('removes consecutive identical tool_call+tool_result pairs', () => {
const messages: DcpMessage[] = [
{ role: 'user', content: 'search for x' },
{ role: 'assistant', content: '', tool_calls: [{ id: '1', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result1', tool_call_id: '1' },
// Duplicate pair
{ role: 'assistant', content: '', tool_calls: [{ id: '2', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result1', tool_call_id: '2' },
];
const { messages: result, stats } = deduplicate(messages);
expect(result).toHaveLength(3); // user + first pair
expect(stats.removedCount).toBe(2);
});
it('preserves non-duplicate content', () => {
const messages: DcpMessage[] = [
{ role: 'assistant', content: '', tool_calls: [{ id: '1', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result1', tool_call_id: '1' },
{ role: 'assistant', content: '', tool_calls: [{ id: '2', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result2', tool_call_id: '2' }, // Different result
];
const { messages: result, stats } = deduplicate(messages);
expect(result).toHaveLength(4);
expect(stats.removedCount).toBe(0);
});
});


@@ -0,0 +1,22 @@
import { describe, it, expect } from 'vitest';
import { toDcpMessages, fromDcpMessages } from '../messages.js';
describe('toDcpMessages', () => {
it('converts user messages', () => {
const result = toDcpMessages([{ role: 'user', content: 'hello' }]);
expect(result[0].role).toBe('user');
expect(result[0].content).toBe('hello');
});
it('marks Error: content as isError', () => {
const result = toDcpMessages([{ role: 'tool', content: 'Error: file not found', tool_call_id: '1' }]);
expect(result[0].isError).toBe(true);
});
});
describe('fromDcpMessages', () => {
it('round-trips messages', () => {
const original = [{ role: 'user', content: 'hello' }];
expect(fromDcpMessages(toDcpMessages(original))).toEqual(original);
});
});


@@ -0,0 +1,33 @@
import { describe, it, expect } from 'vitest';
import { purgeErrors } from '../strategies/purge-errors.js';
import type { DcpMessage } from '../messages.js';
describe('purgeErrors', () => {
it('removes tool results where content starts with Error:', () => {
const messages: DcpMessage[] = [
{ role: 'tool', content: 'Error: file not found', tool_call_id: '1' },
{ role: 'tool', content: '{"files":[]}', tool_call_id: '2' },
];
const { messages: result, stats } = purgeErrors(messages);
expect(result).toHaveLength(1);
expect(stats.removedCount).toBe(1);
});
it('removes empty tool results', () => {
const messages: DcpMessage[] = [
{ role: 'tool', content: '', tool_call_id: '1' },
];
const { messages: result, stats } = purgeErrors(messages);
expect(result).toHaveLength(0);
expect(stats.removedCount).toBe(1);
});
it('preserves valid tool results', () => {
const messages: DcpMessage[] = [
{ role: 'tool', content: '{"files":["a.ts"]}', tool_call_id: '1' },
];
const { messages: result, stats } = purgeErrors(messages);
expect(result).toHaveLength(1);
expect(stats.removedCount).toBe(0);
});
});


@@ -0,0 +1,25 @@
import { describe, it, expect } from 'vitest';
import { transformMessages } from '../transform.js';
import type { DcpMessage } from '../messages.js';
describe('transformMessages', () => {
it('applies dedup then purge in order', () => {
const input: DcpMessage[] = [
{ role: 'user', content: 'hello' },
{ role: 'assistant', content: '', tool_calls: [{ id: '1', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result', tool_call_id: '1' },
{ role: 'assistant', content: '', tool_calls: [{ id: '2', name: 'grep', arguments: '{}' }] },
{ role: 'tool', content: 'result', tool_call_id: '2' }, // Dup
];
const { messages, stats } = transformMessages('test-chat', input);
expect(stats.removedCount).toBeGreaterThan(0);
expect(messages.length).toBeLessThan(input.length);
});
it('handles empty input', () => {
const { messages, stats } = transformMessages('empty', []);
expect(messages).toHaveLength(0);
expect(stats.removedCount).toBe(0);
});
});


@@ -0,0 +1,4 @@
export { transformMessages } from './transform.js';
export type { DcpMessage } from './messages.js';
export { toDcpMessages, fromDcpMessages } from './messages.js';
export { getDcpState, clearDcpState } from './state.js';


@@ -0,0 +1,34 @@
// DCP message shape adapter.
// Converts between BooCode MessagePart[] and the DCP internal shape.
// Clean-room implementation — no AGPL source copied.
export interface DcpMessage {
role: 'user' | 'assistant' | 'tool';
content: string;
tool_call_id?: string;
tool_calls?: Array<{ id: string; name: string; arguments: string }>;
isError?: boolean;
}
export function toDcpMessages(parts: any[]): DcpMessage[] {
return parts.map((p: any) => {
const msg: DcpMessage = { role: p.role, content: p.content ?? '' };
if (p.tool_call_id) msg.tool_call_id = p.tool_call_id;
if (p.tool_calls) msg.tool_calls = p.tool_calls;
if (p.isError) msg.isError = true;
if (p.role === 'tool' && p.content && p.content.startsWith('Error:')) {
msg.isError = true;
}
return msg;
});
}
export function fromDcpMessages(msgs: DcpMessage[]): any[] {
return msgs.map((m) => ({
role: m.role,
content: m.content,
...(m.tool_call_id ? { tool_call_id: m.tool_call_id } : {}),
...(m.tool_calls ? { tool_calls: m.tool_calls } : {}),
...(m.isError ? { isError: true } : {}),
}));
}


@@ -0,0 +1,27 @@
// Per-chat session state for DCP.
// Tracks last transform timestamp and message count to avoid re-processing.
interface ChatDcpState {
lastTransformAt: number;
lastMessageCount: number;
}
const chatStates = new Map<string, ChatDcpState>();
export function getDcpState(chatId: string): ChatDcpState | undefined {
return chatStates.get(chatId);
}
export function setDcpState(chatId: string, messageCount: number): void {
chatStates.set(chatId, { lastTransformAt: Date.now(), lastMessageCount: messageCount });
}
export function clearDcpState(chatId: string): void {
chatStates.delete(chatId);
}
export function shouldTransform(chatId: string, messageCount: number): boolean {
const state = chatStates.get(chatId);
if (!state) return true;
return state.lastMessageCount !== messageCount;
}


@@ -0,0 +1,50 @@
import type { DcpMessage } from '../messages.js';
export function deduplicate(messages: DcpMessage[]): { messages: DcpMessage[]; stats: { removedCount: number; freedTokens: number } } {
const result: DcpMessage[] = [];
let removedCount = 0;
let freedTokens = 0;
let i = 0;
while (i < messages.length) {
const current: DcpMessage = messages[i]!;
const next = messages[i + 1];
if (
current.role === 'assistant' &&
current.tool_calls &&
next &&
next.role === 'tool' &&
next.tool_call_id === current.tool_calls[0]?.id
) {
const nextNext = messages[i + 2];
const nextNextNext = messages[i + 3];
if (
nextNext &&
nextNext.role === 'assistant' &&
nextNext.tool_calls &&
nextNextNext &&
nextNextNext.role === 'tool' &&
nextNextNext.tool_call_id === nextNext.tool_calls[0]?.id &&
nextNext.tool_calls[0]?.name === current.tool_calls[0]?.name &&
nextNext.tool_calls[0]?.arguments === current.tool_calls[0]?.arguments &&
nextNextNext.content === next.content
) {
result.push(current, next);
i += 4;
removedCount += 2;
// Count the two messages that are actually dropped: the duplicate call and its result.
freedTokens += Math.ceil(nextNext.content.length / 4);
freedTokens += Math.ceil(nextNextNext.content.length / 4);
} else {
result.push(current);
i++;
}
} else {
result.push(current);
i++;
}
}
return { messages: result, stats: { removedCount, freedTokens } };
}


@@ -0,0 +1,34 @@
// Purge-errors strategy — removes failed/empty tool_result entries.
// Clean-room implementation.
import type { DcpMessage } from '../messages.js';
// Prefixes that mark a tool result as failed.
const ERROR_PREFIXES = ['Error:', 'error:'];
const DEFAULT_WINDOW = 5;
export function purgeErrors(
messages: DcpMessage[],
windowSize: number = DEFAULT_WINDOW,
): { messages: DcpMessage[]; stats: { removedCount: number; freedTokens: number } } {
const result: DcpMessage[] = [];
let removedCount = 0;
let freedTokens = 0;
for (const msg of messages) {
if (msg.role === 'tool') {
const shouldRemove =
msg.isError ||
ERROR_PREFIXES.some((p) => msg.content.startsWith(p)) ||
msg.content.trim() === '';
if (shouldRemove) {
removedCount++;
freedTokens += Math.ceil(msg.content.length / 4);
continue; // Skip this message
}
}
result.push(msg);
}
return { messages: result, stats: { removedCount, freedTokens } };
}


@@ -0,0 +1,52 @@
// Transform orchestrator — runs DCP strategies in sequence.
// Clean-room implementation.
import type { DcpMessage } from './messages.js';
import { deduplicate } from './strategies/deduplication.js';
import { purgeErrors } from './strategies/purge-errors.js';
import { setDcpState, shouldTransform } from './state.js';
export interface TransformStats {
removedCount: number;
freedTokens: number;
dedupRemoved: number;
purgeRemoved: number;
}
export interface TransformResult {
messages: DcpMessage[];
stats: TransformStats;
}
export function transformMessages(chatId: string, messages: DcpMessage[]): TransformResult {
if (!shouldTransform(chatId, messages.length)) {
return { messages, stats: { removedCount: 0, freedTokens: 0, dedupRemoved: 0, purgeRemoved: 0 } };
}
let m = messages;
// Step 1: Deduplicate
const dedupResult = deduplicate(m);
m = dedupResult.messages;
const dedupRemoved = dedupResult.stats.removedCount;
// Step 2: Purge errors
const purgeResult = purgeErrors(m);
m = purgeResult.messages;
const purgeRemoved = purgeResult.stats.removedCount;
const totalRemoved = dedupRemoved + purgeRemoved;
const totalFreed = dedupResult.stats.freedTokens + purgeResult.stats.freedTokens;
setDcpState(chatId, messages.length);
return {
messages: m,
stats: {
removedCount: totalRemoved,
freedTokens: totalFreed,
dedupRemoved,
purgeRemoved,
},
};
}


@@ -0,0 +1,68 @@
// Loop detectors — detects repetitive patterns in assistant output
// that indicate a model is stuck in a loop.
export interface LoopDetectionResult {
isLoop: boolean;
reason?: string;
confidence: number; // 0-1
}
const REPEATED_PHRASE_MIN_COUNT = 4;
const REPEATED_TOOL_MIN_COUNT = 3;
export function detectContentRepeat(messages: string[]): LoopDetectionResult {
if (messages.length < REPEATED_PHRASE_MIN_COUNT) {
return { isLoop: false, confidence: 0 };
}
const recent = messages.slice(-REPEATED_PHRASE_MIN_COUNT);
const unique = new Set(recent);
if (unique.size === 1) {
return {
isLoop: true,
reason: `Same content repeated ${REPEATED_PHRASE_MIN_COUNT} times`,
confidence: 0.9,
};
}
if (unique.size <= 2 && recent.length >= 4) {
return {
isLoop: true,
reason: 'Content oscillating between two variants',
confidence: 0.7,
};
}
return { isLoop: false, confidence: 0 };
}
export function detectToolLoop(toolNames: string[]): LoopDetectionResult {
if (toolNames.length < REPEATED_TOOL_MIN_COUNT) return { isLoop: false, confidence: 0 };
const recent = toolNames.slice(-REPEATED_TOOL_MIN_COUNT);
const unique = new Set(recent);
if (unique.size === 1) {
return {
isLoop: true,
reason: `Same tool "${recent[0]}" called ${REPEATED_TOOL_MIN_COUNT} times consecutively`,
confidence: 0.85,
};
}
return { isLoop: false, confidence: 0 };
}
export function detectDoomLoop(
messages: string[],
toolNames: string[],
): LoopDetectionResult {
const contentResult = detectContentRepeat(messages);
if (contentResult.isLoop) return contentResult;
const toolResult = detectToolLoop(toolNames);
if (toolResult.isLoop) return toolResult;
return { isLoop: false, confidence: 0 };
}
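
To make the thresholds concrete, here is a condensed restatement of the two checks above with a few sample inputs. The helper names `contentLoopReason` and `isToolLoop` are hypothetical and exist only for this sketch; the window sizes are copied from the constants in the file.

```typescript
const REPEATED_PHRASE_MIN_COUNT = 4;
const REPEATED_TOOL_MIN_COUNT = 3;

// Mirrors detectContentRepeat: only the last four messages are inspected.
function contentLoopReason(messages: string[]): 'repeat' | 'oscillating' | null {
  if (messages.length < REPEATED_PHRASE_MIN_COUNT) return null;
  const unique = new Set(messages.slice(-REPEATED_PHRASE_MIN_COUNT));
  if (unique.size === 1) return 'repeat';
  if (unique.size <= 2) return 'oscillating';
  return null;
}

// Mirrors detectToolLoop: only the last three tool names are inspected.
function isToolLoop(toolNames: string[]): boolean {
  if (toolNames.length < REPEATED_TOOL_MIN_COUNT) return false;
  return new Set(toolNames.slice(-REPEATED_TOOL_MIN_COUNT)).size === 1;
}

const repeated = contentLoopReason(['a', 'a', 'a', 'a']);    // 'repeat'
const alternating = contentLoopReason(['a', 'b', 'a', 'b']); // 'oscillating'
const lopsided = contentLoopReason(['a', 'a', 'a', 'b']);    // also 'oscillating'
const varied = contentLoopReason(['a', 'b', 'c', 'a']);      // null
const stuck = isToolLoop(['read_file', 'grep', 'grep', 'grep']); // true
const mixed = isToolLoop(['grep', 'read_file', 'grep']);         // false
```

Note that the two-variant branch counts distinct values rather than checking for strict alternation, so a window such as `a, a, a, b` is reported as oscillating as well.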


@@ -0,0 +1,45 @@
// ToolShim — recovers structured tool calls from plain-text model output.
// When the model emits tool calls as plain text instead of structured JSON,
// this shim attempts to parse and recover them.
export interface ParsedToolCall {
id: string;
name: string;
arguments: string;
}
const TOOL_CALL_PATTERN = /<tool_call>\s*<name>(.+?)<\/name>\s*<arguments>(.+?)<\/arguments>\s*<\/tool_call>/gs;
const JSON_TOOL_PATTERN = /\{\s*"name":\s*"([^"]+)",\s*"arguments":\s*({.+?})\s*\}/gs;
export function extractToolCalls(text: string): ParsedToolCall[] {
const calls: ParsedToolCall[] = [];
let match: RegExpExecArray | null;
// Try XML-style tool calls (common in Qwen output)
const xmlRegex = new RegExp(TOOL_CALL_PATTERN);
while ((match = xmlRegex.exec(text)) !== null) {
calls.push({
id: `call_${calls.length}`,
name: match[1]!.trim(),
arguments: match[2]!.trim(),
});
}
if (calls.length > 0) return calls;
// Try JSON-style tool calls
const jsonRegex = new RegExp(JSON_TOOL_PATTERN);
while ((match = jsonRegex.exec(text)) !== null) {
calls.push({
id: `call_${calls.length}`,
name: match[1]!.trim(),
arguments: match[2]!.trim(),
});
}
return calls;
}
export function hasToolCallMarkup(text: string): boolean {
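// Both patterns are global (`g`), so test on fresh copies to avoid carrying lastIndex between calls.
return new RegExp(TOOL_CALL_PATTERN).test(text) || new RegExp(JSON_TOOL_PATTERN).test(text);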
}
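
Both patterns in this file carry the `g` flag, and a global regex keeps its `lastIndex` between calls, which is presumably why `extractToolCalls` copies each pattern with `new RegExp(...)` before looping. The sketch below shows the behavior with a simplified stand-in pattern (an assumption for brevity, not the pattern used above).

```typescript
// Simplified stand-in for the XML pattern; only the `g` flag matters here.
const SHARED = /<tool_call>(.+?)<\/tool_call>/g;
const sample = '<tool_call>grep</tool_call>';

// A shared global regex resumes searching from where the previous match ended.
const first = SHARED.test(sample);  // matches, lastIndex moves past the match
const second = SHARED.test(sample); // no match from that offset, lastIndex resets to 0

// A copy made with new RegExp(...) starts at lastIndex 0 every time.
const freshA = new RegExp(SHARED).test(sample);
const freshB = new RegExp(SHARED).test(sample);
```

Calling `.test()` directly on a module-level global pattern can therefore return alternating results for the same input, so a pre-check is safer on a fresh copy or on a non-global pattern.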


@@ -21,6 +21,7 @@ import {
buildMessagesPayload,
loadContext,
} from './payload.js';
import { toDcpMessages, transformMessages, fromDcpMessages } from './dcp/index.js';
import {
finalizeCompletion,
finalizeEmpty,
@@ -156,9 +157,20 @@ export async function runAssistantTurn(
ctx.log.warn({ sessionId }, 'inference: session or project missing mid-loop');
break;
}
const { session: iterSession, project: iterProject, history } = loaded; let { session: iterSession, project: iterProject, history } = loaded;
const projectRoot = await resolveProjectRoot(iterProject.path);
try {
const dcpMsgs = toDcpMessages(history);
const { messages: pruned, stats } = transformMessages(chatId, dcpMsgs);
if (stats.removedCount > 0) {
ctx.log.info({ chatId, ...stats }, 'dcp: transform removed messages');
history = fromDcpMessages(pruned) as typeof history;
}
} catch (err) {
ctx.log.warn({ err: err instanceof Error ? err.message : String(err), chatId }, 'dcp: transform skipped');
}
// v1.14.0: log step boundary for instrumentation. step_start parts are in
// the schema CHECK but not emitted here; writing to the assistant message
// before the stream phase creates a sequence-0 collision with


@@ -0,0 +1,37 @@
import { describe, it, expect } from 'vitest';
import { Bm25Ranker } from '../bm25.js';
describe('Bm25Ranker', () => {
it('scores documents by term frequency', () => {
const ranker = new Bm25Ranker();
ranker.fit(['the cat sat on the mat', 'the dog chased the cat', 'the bird flew over the mat']);
const results = ranker.rank('cat mat');
expect(results.length).toBeGreaterThan(0);
expect(results[0]!.score).toBeGreaterThan(0);
});
it('returns empty for no matches', () => {
const ranker = new Bm25Ranker();
ranker.fit(['aaa bbb', 'ccc ddd']);
const results = ranker.rank('zzz');
expect(results).toHaveLength(0);
});
it('handles single document corpus', () => {
const ranker = new Bm25Ranker();
ranker.fit(['only document here']);
const results = ranker.rank('document');
expect(results).toHaveLength(1);
});
it('ranks relevant docs higher', () => {
const ranker = new Bm25Ranker();
ranker.fit([
'javascript is a programming language',
'python is also a programming language',
'the weather is nice today',
]);
const results = ranker.rank('javascript programming');
expect(results[0]!.index).toBe(0);
});
});


@@ -0,0 +1,31 @@
import { describe, it, expect } from 'vitest';
import { parseMemoryEntries } from '../entries.js';
describe('parseMemoryEntries', () => {
it('parses a single entry with tags', () => {
const md = '## project: Indentation\n> tags: style\n\nUse two-space indentation\n';
const entries = parseMemoryEntries('style.md', md);
expect(entries).toHaveLength(1);
expect(entries[0].title).toBe('Indentation');
expect(entries[0].topic).toBe('project');
expect(entries[0].tags).toEqual(['style']);
expect(entries[0].content).toContain('two-space');
});
it('parses multiple entries', () => {
const md = [
'## project: Style',
'',
'Use tab indentation',
'',
'## user: Preference',
'',
'Prefer pnpm',
'',
].join('\n');
const entries = parseMemoryEntries('mem.md', md);
expect(entries).toHaveLength(2);
expect(entries[0].topic).toBe('project');
expect(entries[1].topic).toBe('user');
});
});


@@ -0,0 +1,14 @@
import { describe, it, expect } from 'vitest';
import { getMemoryRoot, getTopicDir } from '../paths.js';
describe('getMemoryRoot', () => {
it('returns .boocode/memory under project root', () => {
expect(getMemoryRoot('/proj')).toBe('/proj/.boocode/memory');
});
});
describe('getTopicDir', () => {
it('returns project/ under memory root', () => {
expect(getTopicDir('/r/.boocode/memory', 'project')).toBe('/r/.boocode/memory/project');
});
});


@@ -0,0 +1,15 @@
import { describe, it, expect } from 'vitest';
import { formatMemoryBlock } from '../prompt.js';
describe('formatMemoryBlock', () => {
it('wraps entries in boocode-memory tags', () => {
const block = formatMemoryBlock(['Use pnpm', 'Tests in vitest']);
expect(block).toContain('<boocode-memory>');
expect(block).toContain('Use pnpm');
expect(block).toContain('</boocode-memory>');
});
it('returns empty string for no entries', () => {
expect(formatMemoryBlock([])).toBe('');
});
});


@@ -0,0 +1,27 @@
import { describe, it, expect } from 'vitest';
import { rankByRelevance } from '../recall.js';
import type { MemoryEntry } from '../entries.js';
describe('rankByRelevance', () => {
it('returns entries matching query keywords', () => {
const entries: MemoryEntry[] = [
{ id: '1', topic: 'project', title: 'Style', content: 'Use two-space indentation', tags: ['style'] },
{ id: '2', topic: 'project', title: 'Tests', content: 'Use vitest for testing', tags: ['testing'] },
];
const result = rankByRelevance('what indentation?', entries);
expect(result).toHaveLength(1);
expect(result[0].title).toBe('Style');
});
});
describe('rankByHybrid', () => {
it('falls back to BM25 when embeddings unavailable', async () => {
const entries: MemoryEntry[] = [
{ id: '1', topic: 'project', title: 'Style', content: 'Use two-space indentation', tags: ['style'] },
{ id: '2', topic: 'project', title: 'Tests', content: 'Use vitest for testing', tags: ['testing'] },
];
const { rankByHybrid } = await import('../recall.js');
const result = await rankByHybrid('indentation style', entries);
expect(result.length).toBeGreaterThan(0);
});
});


@@ -0,0 +1,67 @@
// BM25 ranker — pure Okapi BM25 scoring. No external deps.
interface Bm25Config {
k1?: number;
b?: number;
}
export class Bm25Ranker {
private k1: number;
private b: number;
private corpus: string[];
private avgDocLen: number;
private idfCache: Map<string, number>;
private docCount: number;
constructor(config?: Bm25Config) {
this.k1 = config?.k1 ?? 1.5;
this.b = config?.b ?? 0.75;
this.corpus = [];
this.avgDocLen = 0;
this.idfCache = new Map();
this.docCount = 0;
}
fit(docs: string[]): void {
this.corpus = docs;
this.docCount = docs.length;
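// Use the same tokenizer as score() so document lengths are consistent,
// and avoid a NaN average when the corpus is empty.
const lengths = docs.map((d) => this.tokenize(d).length);
this.avgDocLen = lengths.length > 0 ? lengths.reduce((a, b) => a + b, 0) / lengths.length : 0;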
this.idfCache.clear();
}
private tokenize(text: string): string[] {
return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}
private idf(term: string): number {
const cached = this.idfCache.get(term);
if (cached !== undefined) return cached;
const docsWithTerm = this.corpus.filter((d) => this.tokenize(d).includes(term)).length;
const idf = Math.log(1 + (this.docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
this.idfCache.set(term, idf);
return idf;
}
score(query: string, docIndex: number): number {
if (docIndex < 0 || docIndex >= this.corpus.length) return 0;
const doc = this.corpus[docIndex]!;
const queryTerms = this.tokenize(query);
const docTokens = this.tokenize(doc);
const docLen = docTokens.length;
let total = 0;
for (const term of queryTerms) {
const tf = docTokens.filter((t) => t === term).length;
if (tf === 0) continue;
const idfVal = this.idf(term);
total += idfVal * ((tf * (this.k1 + 1)) / (tf + this.k1 * (1 - this.b + this.b * docLen / this.avgDocLen)));
}
return total;
}
rank(query: string, topN: number = 10): Array<{ index: number; score: number }> {
const scores = this.corpus.map((_, i) => ({ index: i, score: this.score(query, i) }));
return scores.sort((a, b) => b.score - a.score).slice(0, topN).filter((s) => s.score > 0);
}
}
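
As a worked example of the formula in `score()`, the sketch below computes one term's contribution in isolation. The standalone `bm25Term` function is a hypothetical helper written for this illustration; its arithmetic follows the `idf()` and `score()` bodies above, with the same default `k1` and `b`.

```typescript
// One query term's contribution to a document's BM25 score.
function bm25Term(
  tf: number,           // occurrences of the term in the document
  docLen: number,       // token count of the document
  avgDocLen: number,    // mean token count across the corpus
  docsWithTerm: number, // number of documents containing the term
  docCount: number,     // corpus size
  k1 = 1.5,
  b = 0.75,
): number {
  if (tf === 0) return 0;
  const idf = Math.log(1 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
  const lengthNorm = 1 - b + b * (docLen / avgDocLen);
  return idf * ((tf * (k1 + 1)) / (tf + k1 * lengthNorm));
}

// Term appears once in an average-length document and in 1 of 3 documents:
// lengthNorm is 1, the tf factor is 2.5 / 2.5 = 1, so the result equals the idf, ln(1 + 2.5 / 1.5).
const averageDoc = bm25Term(1, 6, 6, 1, 3);

// Same term in a document twice the average length: the length penalty lowers the score.
const longDoc = bm25Term(1, 12, 6, 1, 3);
```

With `b = 0.75`, doubling the document length raises `lengthNorm` from 1 to 1.75, which is what pulls the second score below the first.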


@@ -0,0 +1,55 @@
// Embedding module — ONNX-based local embeddings.
// Falls back gracefully when the model file is not available.
let model: any = null;
let ortModule: any = null;
export function isEmbeddingAvailable(): boolean {
return model !== null;
}
// eslint-disable-next-line @typescript-eslint/no-require-imports
const dynamicRequire = typeof require !== 'undefined' ? require : null;
export async function initEmbeddings(modelPath?: string): Promise<boolean> {
try {
if (dynamicRequire) {
try { ortModule = dynamicRequire('onnxruntime-node'); } catch { ortModule = null; }
}
if (!ortModule) {
try { ortModule = await import('onnxruntime-node' as any); } catch { ortModule = null; }
}
if (!ortModule) return false;
const path = modelPath ?? process.env['EMBEDDING_MODEL_PATH'] ?? '';
if (!path) return false;
model = await ortModule.InferenceSession.create(path);
return true;
} catch {
model = null;
return false;
}
}
export async function embed(texts: string[]): Promise<number[][] | null> {
if (!model) return null;
try {
// eslint-disable-next-line @typescript-eslint/no-unnecessary-condition
const ort: { Tensor: new (...args: unknown[]) => unknown } | null = ortModule || null;
if (!ort) return null;
const input = new ort.Tensor('string', texts, [texts.length]);
const feeds: Record<string, any> = {};
feeds[model.inputNames[0]] = input;
const results = await model.run(feeds);
const output = results[model.outputNames[0]];
if (!output || !output.data) return null;
const dim = output.dims?.[1] ?? 384;
const data = output.data as Float32Array;
const vectors: number[][] = [];
for (let i = 0; i < texts.length; i++) {
vectors.push(Array.from(data.slice(i * dim, (i + 1) * dim)));
}
return vectors;
} catch {
return null;
}
}


@@ -0,0 +1,54 @@
export interface MemoryEntry {
id: string;
topic: string;
title: string;
content: string;
tags: string[];
}
export function parseMemoryEntries(fileName: string, markdown: string): MemoryEntry[] {
const entries: MemoryEntry[] = [];
const lines = markdown.split('\n');
let currentEntry: Partial<MemoryEntry> | null = null;
let currentContent: string[] = [];
for (const line of lines) {
const headingMatch = line.match(/^##\s+(.+):\s+(.+)$/);
if (headingMatch && headingMatch[1] && headingMatch[2]) {
if (currentEntry && currentEntry.title) {
entries.push({
id: `${fileName}-${entries.length}`,
topic: currentEntry.topic ?? '',
title: currentEntry.title,
content: currentContent.join('\n').trim(),
tags: currentEntry.tags ?? [],
});
}
currentEntry = { topic: headingMatch[1].trim(), title: headingMatch[2].trim(), tags: [] };
currentContent = [];
continue;
}
const tagsMatch = line.match(/^>\s*tags:\s*(.+)$/i);
if (tagsMatch && tagsMatch[1] && currentEntry) {
currentEntry.tags = tagsMatch[1].split(',').map((t) => t.trim());
continue;
}
if (currentEntry) {
currentContent.push(line);
}
}
if (currentEntry && currentEntry.title) {
entries.push({
id: `${fileName}-${entries.length}`,
topic: currentEntry.topic ?? '',
title: currentEntry.title,
content: currentContent.join('\n').trim(),
tags: currentEntry.tags ?? [],
});
}
return entries;
}
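
The entry format the parser expects can be seen by applying its two regular expressions to sample lines. The patterns below are copied from `parseMemoryEntries`; the sample strings and variable names are made up for this sketch.

```typescript
const HEADING = /^##\s+(.+):\s+(.+)$/;
const TAGS = /^>\s*tags:\s*(.+)$/i;

// A heading line is "## <topic>: <title>".
const heading = '## project: Indentation'.match(HEADING);
const topic = heading?.[1]; // 'project'
const title = heading?.[2]; // 'Indentation'

// An optional tag line is a blockquote with comma-separated values.
const tagMatch = '> tags: style, formatting'.match(TAGS);
const tagList = tagMatch?.[1]?.split(',').map((t) => t.trim()) ?? [];

// The first group is greedy, so a title containing ": " shifts the split point to the last colon.
const ambiguous = '## project: Build: fast'.match(HEADING);
const ambiguousTopic = ambiguous?.[1]; // 'project: Build'
const ambiguousTitle = ambiguous?.[2]; // 'fast'
```

Titles that contain a colon followed by a space are therefore split at the last such colon, which may be worth keeping in mind when writing entries by hand.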


@@ -0,0 +1,6 @@
export { loadMemoryForSession } from './recall.js';
export { formatMemoryBlock } from './prompt.js';
export { scanMemoryScopes } from './scan.js';
export { parseMemoryEntries } from './entries.js';
export { ensureMemoryScaffold, getMemoryRoot } from './paths.js';
export type { MemoryEntry } from './entries.js';


@@ -0,0 +1,17 @@
import { join } from 'node:path';
import { mkdir } from 'node:fs/promises';
const TOPICS = ['project', 'user', 'reference'] as const;
export type MemoryTopic = (typeof TOPICS)[number];
export function getMemoryRoot(projectRoot: string): string {
return join(projectRoot, '.boocode', 'memory');
}
export function getTopicDir(root: string, topic: MemoryTopic): string {
return join(root, topic);
}
export async function ensureMemoryScaffold(root: string): Promise<void> {
await Promise.all(TOPICS.map((t) => mkdir(join(root, t), { recursive: true })));
}


@@ -0,0 +1,5 @@
export function formatMemoryBlock(entries: string[]): string {
if (entries.length === 0) return '';
const body = entries.map((e) => `- ${e}`).join('\n');
return `<boocode-memory>\n${body}\n</boocode-memory>`;
}


@@ -0,0 +1,100 @@
import type { MemoryEntry } from './entries.js';
import { scanProjectMemory } from './scan.js';
import { Bm25Ranker } from './bm25.js';
import { embed, isEmbeddingAvailable } from './embeddings.js';
const SEARCH_MODE = process.env['MEMORY_SEARCH'] ?? 'hybrid';
function extractKeywords(query: string): string[] {
return query
.toLowerCase()
.replace(/[^a-z0-9\s]/g, '')
.split(/\s+/)
.filter((w) => w.length > 2);
}
export function rankByRelevance(query: string, entries: MemoryEntry[]): MemoryEntry[] {
const keywords = extractKeywords(query);
if (keywords.length === 0) return entries.slice(0, 5);
const scored = entries.map((entry) => {
let score = 0;
for (const kw of keywords) {
if (entry.title.toLowerCase().includes(kw)) score += 3;
if (entry.tags.some((t) => t.toLowerCase().includes(kw))) score += 2;
if (entry.content.toLowerCase().includes(kw)) score += 1;
}
return { entry, score };
});
return scored
.filter((s) => s.score > 0)
.sort((a, b) => b.score - a.score)
.slice(0, 10)
.map((s) => s.entry);
}
export async function rankByHybrid(
query: string,
entries: MemoryEntry[],
): Promise<MemoryEntry[]> {
if (entries.length === 0) return [];
const texts = entries.map((e) => `${e.title} ${e.content} ${e.tags.join(' ')}`);
const bm25 = new Bm25Ranker();
bm25.fit(texts);
const bm25Scores = texts.map((_, i) => bm25.score(query, i));
const maxBm25 = Math.max(...bm25Scores, 1);
const normBm25 = bm25Scores.map((s) => s / maxBm25);
let cosineScores: number[] = [];
if (isEmbeddingAvailable()) {
const vectors = await embed([query, ...texts]);
if (vectors) {
const queryVec = vectors[0]!;
cosineScores = texts.map((_, i) => {
const vec = vectors[i + 1];
if (!vec) return 0;
let dot = 0, nA = 0, nB = 0;
for (let j = 0; j < queryVec.length; j++) {
dot += queryVec[j]! * vec[j]!;
nA += queryVec[j]! * queryVec[j]!;
nB += vec[j]! * vec[j]!;
}
const denom = Math.sqrt(nA) * Math.sqrt(nB);
return denom === 0 ? 0 : dot / denom;
});
}
}
const scored = entries.map((entry, i) => {
const combined = (normBm25[i]! * 0.3) + ((cosineScores[i] ?? 0) * 0.7);
return { entry, score: combined };
});
return scored
.filter((s) => s.score >= 0.15)
.sort((a, b) => b.score - a.score)
.slice(0, 10)
.map((s) => s.entry);
}
export async function loadMemoryForSession(
projectRoot: string,
_sessionId?: string,
query?: string,
): Promise<string[]> {
const entries = await scanProjectMemory(projectRoot);
if (entries.length === 0) return [];
const relevant = query
? SEARCH_MODE === 'keyword'
? rankByRelevance(query, entries)
: await rankByHybrid(query, entries)
: entries.slice(0, 5);
return relevant.map((e) => `[${e.topic}] ${e.title}: ${e.content}`);
}
export { initEmbeddings } from './embeddings.js';
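
The weighted merge in `rankByHybrid` can be illustrated in isolation. The `cosine` and `hybridScore` helpers below are hypothetical names for this sketch; the 0.3 and 0.7 weights and the 0.15 cutoff are taken from the function above.

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let nA = 0;
  let nB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    nA += x * x;
    nB += y * y;
  }
  const denom = Math.sqrt(nA) * Math.sqrt(nB);
  return denom === 0 ? 0 : dot / denom;
}

// Combine a normalized BM25 score with an optional cosine score.
function hybridScore(normBm25: number, cosineScore?: number): number {
  return normBm25 * 0.3 + (cosineScore ?? 0) * 0.7;
}

const bothSignals = hybridScore(1, cosine([1, 0], [1, 0])); // best case with embeddings
const keywordOnly = hybridScore(1);                         // best case without embeddings
const weakKeyword = hybridScore(0.4);                       // falls under the 0.15 cutoff
const orthogonal = cosine([1, 0], [0, 1]);                  // unrelated vectors
```

When embeddings are unavailable the combined score tops out at 0.3, so the 0.15 cutoff keeps only entries whose normalized BM25 score is at least 0.5.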


@@ -0,0 +1,72 @@
import { homedir } from 'node:os';
import { join } from 'node:path';
import { readFile, readdir } from 'node:fs/promises';
import type { MemoryEntry } from './entries.js';
import { parseMemoryEntries } from './entries.js';
import { getMemoryRoot } from './paths.js';
export interface MemoryScope {
projectRoot: string;
sessionDir?: string;
homeDir?: string;
}
async function scanDirectory(dir: string): Promise<MemoryEntry[]> {
const entries: MemoryEntry[] = [];
try {
const files = await readdir(dir, { withFileTypes: true });
for (const file of files) {
if (file.isFile() && file.name.endsWith('.md')) {
const content = await readFile(join(dir, file.name), 'utf8');
entries.push(...parseMemoryEntries(file.name, content));
}
}
} catch {
// Directory doesn't exist
}
return entries;
}
const MEMORY_TOPICS = ['project', 'user', 'reference'] as const;
async function scanTopicDirs(root: string): Promise<MemoryEntry[]> {
const entries: MemoryEntry[] = [];
for (const topic of MEMORY_TOPICS) {
entries.push(...(await scanDirectory(join(root, topic))));
}
return entries;
}
export async function scanMemoryScopes(scope: MemoryScope): Promise<MemoryEntry[]> {
const allEntries: MemoryEntry[] = [];
// 1. Global (~/.boocode/memory/) - lowest priority
allEntries.push(...(await scanTopicDirs(getMemoryRoot(homedir()))));
// 2. Home override (scope.homeDir), scanned only when it differs from the OS home directory
const homeDir = scope.homeDir ?? homedir();
const homeRoot = getMemoryRoot(homeDir);
if (homeRoot !== getMemoryRoot(homedir())) {
allEntries.push(...(await scanTopicDirs(homeRoot)));
}
// 3. Project (.boocode/memory/ under project root)
allEntries.push(...(await scanTopicDirs(getMemoryRoot(scope.projectRoot))));
// 4. Session (.boocode/sessions/<id>/memory.md) - highest priority
if (scope.sessionDir) {
try {
const sessionFile = join(scope.sessionDir, 'memory.md');
const content = await readFile(sessionFile, 'utf8');
allEntries.push(...parseMemoryEntries('session-memory', content));
} catch {
// No session memory file
}
}
return allEntries;
}
export async function scanProjectMemory(projectRoot: string): Promise<MemoryEntry[]> {
return scanMemoryScopes({ projectRoot });
}
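
As written, `scanMemoryScopes` concatenates the scopes in order, so the "lowest priority" and "highest priority" labels are expressed only by position in the returned array. If shadowing is wanted, one option is a last-wins pass like the sketch below. `MemoryEntryLike` and `dedupeByScopePriority` are hypothetical names, and keying on topic plus case-insensitive title is an assumption about what counts as "the same entry".

```typescript
// Hypothetical minimal shape; only the fields used elsewhere in this diff.
interface MemoryEntryLike { topic: string; title: string; content: string }

/**
 * Collapse entries that share a topic and title, keeping the LAST occurrence.
 * Because the scan appends scopes from lowest to highest priority, "last wins"
 * lets a project or session entry shadow a global one.
 */
function dedupeByScopePriority(entries: MemoryEntryLike[]): MemoryEntryLike[] {
  const byKey = new Map<string, MemoryEntryLike>();
  for (const entry of entries) {
    const key = `${entry.topic}\u0000${entry.title.trim().toLowerCase()}`;
    byKey.delete(key); // re-insert so the survivor takes the later position
    byKey.set(key, entry);
  }
  return Array.from(byKey.values());
}
```

Ordering also matters for the no-query path in `loadMemoryForSession`: `entries.slice(0, 5)` takes the first five entries in scan order, which are the global ones when any exist.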

@@ -0,0 +1,35 @@
import { readFile, writeFile, readdir } from 'node:fs/promises';
import { join } from 'node:path';
import type { MemoryTopic } from './paths.js';
import { getTopicDir } from './paths.js';
export async function readTopicFiles(root: string, topic: MemoryTopic): Promise<Map<string, string>> {
const dir = getTopicDir(root, topic);
const files = new Map<string, string>();
try {
const entries = await readdir(dir, { withFileTypes: true });
for (const entry of entries) {
if (entry.isFile() && entry.name.endsWith('.md')) {
const content = await readFile(join(dir, entry.name), 'utf8');
files.set(entry.name, content);
}
}
} catch {
// Directory doesn't exist yet
}
return files;
}
export async function writeEntry(
root: string,
topic: MemoryTopic,
title: string,
content: string,
tags: string[],
): Promise<void> {
const dir = getTopicDir(root, topic);
const tagLine = tags.length > 0 ? `> tags: ${tags.join(', ')}\n\n` : '\n';
const entry = `## ${topic}: ${title}\n${tagLine}${content}\n`;
const filename = title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/(^-|-$)/g, '') + '.md';
await writeFile(join(dir, filename), entry, 'utf8');
}
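
The filename normalization in `writeEntry` can be exercised in isolation. The sketch below repeats the same two `replace` calls and adds a fallback for titles that contain no ASCII letters or digits, which would otherwise yield the bare filename `.md`. `titleToFilename`, `formatEntry`, and the `'untitled'` fallback are illustrative names, not part of the diff.

```typescript
/** Same normalization as writeEntry, plus a fallback for titles with no ASCII letters or digits. */
function titleToFilename(title: string, fallback = 'untitled'): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse every run of other characters into one dash
    .replace(/(^-|-$)/g, '');    // trim a leading or trailing dash
  return (slug.length > 0 ? slug : fallback) + '.md';
}

/** Mirrors the entry layout built in writeEntry: heading, optional tag line, body. */
function formatEntry(topic: string, title: string, content: string, tags: string[]): string {
  const tagLine = tags.length > 0 ? `> tags: ${tags.join(', ')}\n\n` : '\n';
  return `## ${topic}: ${title}\n${tagLine}${content}\n`;
}
```

Two titles that normalize to the same slug map to the same file, and `writeFile` with its default flag replaces existing content, so the later entry overwrites the earlier one.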

@@ -35,6 +35,7 @@ export const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
 'get_codebase_overview',
 'get_framework_analysis',
 'get_semantic_neighborhoods',
+'get_blast_radius',
 ]);
 const TOP_N_FILES = 5;

@@ -22,6 +22,8 @@ import { readFile, stat } from 'node:fs/promises';
 import type { Agent, Project, Session } from '../types/api.js';
 import { getAgentsMtimes } from './agents.js';
 import { resolveRoute } from './inference/provider.js';
+import { loadMemoryForSession } from './memory/recall.js';
+import { formatMemoryBlock } from './memory/prompt.js';
 const BASE_SYSTEM_PROMPT = (projectPath: string) =>
 `You are BooCode Chat, a code investigation assistant. The user is working on a project located at ${projectPath}. Use the file-read tools (view_file, list_dir, grep, find_files) to investigate code when needed. Be concise. Cite file paths and line numbers when discussing code. Do not hallucinate file contents — read the file first. Tool results may be truncated; if so, narrow your query rather than guessing.`;
@@ -164,7 +166,11 @@ export async function buildSystemPromptWithFingerprint(
 let out = BASE_SYSTEM_PROMPT(project.path);
 const guidance = await getContainerGuidance();
 if (guidance) {
-out += `\n\n--- Container guidance ---\n${guidance}\n--- end container guidance ---\n`;
+out += '\n\n--- Container guidance ---\n' + guidance + '\n--- end container guidance ---\n';
+}
+const memory = await loadMemoryForSession(project.path, session.id).catch(() => []);
+if (memory.length > 0) {
+out += '\n\n' + formatMemoryBlock(memory);
 }
 if (agent && agent.system_prompt.trim().length > 0) {
 out += '\n\n' + agent.system_prompt.trim();
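
`formatMemoryBlock` is imported from `memory/prompt.js`, which is not part of this excerpt. Purely as an illustration of what such a formatter might do, the sketch below wraps the recalled lines in delimiters modelled on the container-guidance block and stops at a character budget. The function name, the delimiter text, and the `maxChars` budget are all assumptions.

```typescript
/** Hypothetical formatter; the real formatMemoryBlock in memory/prompt.ts is not shown here. */
function formatMemoryBlockSketch(lines: string[], maxChars = 4000): string {
  const kept: string[] = [];
  let used = 0;
  for (const line of lines) {
    if (used + line.length > maxChars) break; // stop before the budget is exceeded
    kept.push(`- ${line}`);
    used += line.length;
  }
  return `--- Project memory ---\n${kept.join('\n')}\n--- end project memory ---\n`;
}
```

A budget of this kind may be worth considering because each recalled line embeds the full `content` of an entry, and nothing in the hunk above bounds its length before it reaches the system prompt.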

@@ -0,0 +1,31 @@
import { z } from 'zod';
import { makeCodecontextTool } from './factory.js';
export const GetCallGraphInput = z.object({
symbol: z.string().describe('Symbol name to analyze'),
depth: z.number().int().min(1).max(5).optional().describe('Max traversal depth (default 2)'),
});
export type GetCallGraphInputT = z.infer<typeof GetCallGraphInput>;
const DESCRIPTION =
'Returns a call graph for a function or method: callers, callees, and transitive references. ' +
'Use to understand how a symbol is invoked and what it depends on.';
const { toolDef: getCallGraph, execute: executeGetCallGraph } =
makeCodecontextTool<GetCallGraphInputT>({
name: 'get_call_graph',
schema: GetCallGraphInput,
description: DESCRIPTION,
jsonParameters: {
type: 'object',
properties: {
symbol: { type: 'string', description: 'Symbol name to analyze' },
depth: { type: 'number', description: 'Max traversal depth (default 2)' },
},
required: ['symbol'],
additionalProperties: false,
},
mapArgs: (input) => ({ symbol: input.symbol, depth: input.depth ?? 2 }),
});
export { getCallGraph, executeGetCallGraph };
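
The zod schema above restricts `depth` to an integer between 1 and 5, while the hand-written `jsonParameters` advertises only `type: 'number'`, so a model reading the JSON schema could send `2.5` or `9` and have the call rejected at parse time. A sketch of a JSON Schema fragment that states the same bounds, with a dependency-free stand-in for the parse and `mapArgs` steps, follows. `depthJsonSchema` and `toCallGraphArgs` are hypothetical names.

```typescript
// JSON Schema fragment carrying the same bounds as the zod schema (integer, 1..5).
const depthJsonSchema = {
  type: 'integer',
  minimum: 1,
  maximum: 5,
  description: 'Max traversal depth (default 2)',
} as const;

interface CallGraphArgs { symbol: string; depth: number }

/** Plain-TypeScript stand-in for GetCallGraphInput.parse followed by mapArgs (no zod dependency). */
function toCallGraphArgs(input: { symbol: string; depth?: number }): CallGraphArgs {
  const depth = input.depth ?? 2; // same default as mapArgs
  if (!Number.isInteger(depth) || depth < depthJsonSchema.minimum || depth > depthJsonSchema.maximum) {
    throw new RangeError(`depth must be an integer in [1, 5], got ${depth}`);
  }
  return { symbol: input.symbol, depth };
}
```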

@@ -0,0 +1,31 @@
import { z } from 'zod';
import { makeCodecontextTool } from './factory.js';
export const GetSymbolDetailsInput = z.object({
symbol: z.string().describe('Symbol name to resolve'),
file_path: z.string().optional().describe('Optional file path to narrow search'),
});
export type GetSymbolDetailsInputT = z.infer<typeof GetSymbolDetailsInput>;
const DESCRIPTION =
'Returns type signature, definition location, and usage count for a named symbol. ' +
'Use after get_codebase_overview to dive deeper into specific functions, classes, or variables.';
const { toolDef: getSymbolDetails, execute: executeGetSymbolDetails } =
makeCodecontextTool<GetSymbolDetailsInputT>({
name: 'get_symbol_details',
schema: GetSymbolDetailsInput,
description: DESCRIPTION,
jsonParameters: {
type: 'object',
properties: {
symbol: { type: 'string', description: 'Symbol name to resolve' },
file_path: { type: 'string', description: 'Optional file path to narrow search' },
},
required: ['symbol'],
additionalProperties: false,
},
mapArgs: (input) => ({ symbol: input.symbol, file_path: input.file_path }),
});
export { getSymbolDetails, executeGetSymbolDetails };

@@ -0,0 +1,44 @@
import { z } from 'zod';
import type { ToolDef } from '../tools/types.js';
import { ensureMemoryScaffold, getMemoryRoot } from '../memory/paths.js';
import { writeEntry } from '../memory/store.js';
const ExtractMemoryInput = z.object({
topic: z.enum(['project', 'user', 'reference']).describe('Memory topic category'),
title: z.string().min(1).max(200).describe('Entry title (will be normalized to filename)'),
content: z.string().min(1).describe('Memory content body'),
tags: z.array(z.string()).optional().describe('Optional tags for search'),
});
type InputT = z.infer<typeof ExtractMemoryInput>;
export const extractMemoryTool: ToolDef<InputT> = {
name: 'extract_memory',
description: 'Persist a memory entry to .boocode/memory/ for cross-session recall. Use for project conventions, user preferences, and architectural decisions.',
inputSchema: ExtractMemoryInput,
jsonSchema: {
type: 'function',
function: {
name: 'extract_memory',
description: 'Persist a memory entry for cross-session recall',
parameters: {
type: 'object',
properties: {
topic: { type: 'string', enum: ['project', 'user', 'reference'] },
title: { type: 'string', description: 'Entry title' },
content: { type: 'string', description: 'Memory content' },
tags: { type: 'array', items: { type: 'string' }, description: 'Search tags' },
},
required: ['topic', 'title', 'content'],
},
},
},
async execute(input: InputT, projectRoot: string): Promise<unknown> {
const root = getMemoryRoot(projectRoot);
await ensureMemoryScaffold(root);
await writeEntry(root, input.topic, input.title, input.content, input.tags ?? []);
return {
result: `Memory entry "${input.title}" saved to .boocode/memory/${input.topic}/`,
};
},
};

@@ -0,0 +1,40 @@
import { z } from 'zod';
import type { ToolDef } from '../tools/types.js';
import { scanProjectMemory } from '../memory/scan.js';
import { rankByHybrid } from '../memory/recall.js';
const SearchMemoryInput = z.object({
query: z.string().min(1).describe('Search query to match against memory entries'),
});
type InputT = z.infer<typeof SearchMemoryInput>;
export const searchMemoryTool: ToolDef<InputT> = {
name: 'search_memory',
description: 'Search the .boocode/memory/ store for relevant entries. Returns ranked results matching the query. Use before asking about project conventions or preferences.',
inputSchema: SearchMemoryInput,
jsonSchema: {
type: 'function',
function: {
name: 'search_memory',
description: 'Search memory store for relevant entries',
parameters: {
type: 'object',
properties: {
query: { type: 'string', description: 'Search query' },
},
required: ['query'],
},
},
},
async execute(input: InputT, projectRoot: string): Promise<unknown> {
const entries = await scanProjectMemory(projectRoot);
if (entries.length === 0) return { result: 'No memory entries found.' };
const relevant = await rankByHybrid(input.query, entries);
if (relevant.length === 0) return { result: 'No matching memory entries.' };
const lines = relevant.map((e) => `[${e.topic}] ${e.title}: ${e.content}`);
return { result: `Found ${relevant.length} entry(ies):\n${lines.join('\n')}` };
},
};
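
`loadMemoryForSession` switches to `rankByRelevance` when `SEARCH_MODE === 'keyword'`, whereas `search_memory` calls `rankByHybrid` unconditionally. Unless `rankByHybrid` checks the mode internally (this diff does not show its body), the `MEMORY_SEARCH=keyword` fallback named in the commit message would not apply to the tool. One way to keep both call sites aligned is a shared selector, sketched here with a hypothetical `pickRanker` helper and injected rankers.

```typescript
// A ranker takes a query and entries and returns the relevant subset, sync or async.
type Ranker<T> = (query: string, entries: T[]) => T[] | Promise<T[]>;

/** Choose the ranker once, so recall and the search tool cannot drift apart. */
function pickRanker<T>(mode: string | undefined, keyword: Ranker<T>, hybrid: Ranker<T>): Ranker<T> {
  return mode === 'keyword' ? keyword : hybrid;
}
```

In the tool, the call would then read roughly `await pickRanker(SEARCH_MODE, rankByRelevance, rankByHybrid)(input.query, entries)`, assuming both rankers and the mode constant are exported from `recall.ts`.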

@@ -218,6 +218,16 @@ function ContestantRow({
 {isExpanded && (
 <div className="border-t border-border/50 bg-muted/10 max-h-[55vh] overflow-y-auto">
+{data.token_breakdown && (
+<div className="flex items-center gap-1.5 px-3 py-2 text-xs text-muted-foreground border-b border-border/30">
+{data.token_breakdown.system > 0 && <span title="system">{data.token_breakdown.system}s</span>}
+{data.token_breakdown.user > 0 && <span title="user">{data.token_breakdown.user}u</span>}
+{data.token_breakdown.assistant > 0 && <span title="assistant">{data.token_breakdown.assistant}a</span>}
+{data.token_breakdown.tools > 0 && <span title="tools">{data.token_breakdown.tools}t</span>}
+{data.token_breakdown.reasoning > 0 && <span title="reasoning" className="text-amber-500">{data.token_breakdown.reasoning}r</span>}
+{data.token_breakdown.total > 0 && <span className="font-medium tabular-nums ml-1">{data.token_breakdown.total}</span>}
+</div>
+)}
 {output.length === 0 ? (
 <div className="flex items-center justify-center py-6 text-sm text-muted-foreground">
 {data.status === 'queued'
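
The six near-identical JSX conditions in the token-breakdown row could be driven from a small table. The sketch below shows only the label logic as a pure function, so it can be checked without rendering. `TokenBreakdownLike`, `SUFFIXES`, and `tokenLabels` are hypothetical names, and the field list is taken from the properties read in the JSX above.

```typescript
// Hypothetical shape, limited to the fields the row reads.
interface TokenBreakdownLike {
  system: number; user: number; assistant: number;
  tools: number; reasoning: number; total: number;
}

// Category order and one-letter suffixes as rendered in the row.
const SUFFIXES: Array<[keyof TokenBreakdownLike, string]> = [
  ['system', 's'], ['user', 'u'], ['assistant', 'a'], ['tools', 't'], ['reasoning', 'r'],
];

/** Labels for non-zero categories in display order; the total comes last without a suffix. */
function tokenLabels(b: TokenBreakdownLike): string[] {
  const labels = SUFFIXES
    .filter(([key]) => b[key] > 0)
    .map(([key, suffix]) => `${b[key]}${suffix}`);
  if (b.total > 0) labels.push(String(b.total));
  return labels;
}
```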

@@ -1,6 +1,6 @@
 # BooCode — External Code Review & Lift Inventory
-Last updated: 2026-05-25
+Last updated: 2026-06-07
 This document tracks every open source repo BooCode references or lifts code from. Pin this so we don't lose attribution and don't re-evaluate the same projects twice.
@@ -346,6 +346,78 @@ Don't ship Phase 1 against AGPL/GPL code; build clean. Patterns are free; code i
 - **Why it matters:** Python/Svelte, wrong stack. RAG pipeline only worth a read if BooLab needs improvement — unrelated to BooCode.
 - **How we use it:** Skip for BooCode.
### v2.8 fork-lifts (2026-06-07) — new lift sources
#### 18. boocontext (MIT — sidecar replacement)
- **URL:** <https://github.com/nmakod/codecontext> (upstream); `/opt/forks/boocontext` (fork)
- **License:** MIT
- **What it is:** Unified MCP codebase context server, aggregates codesight + tree-sitter-analyzer + type-inject as child MCP servers.
- **How we use it:** Replaced the old Go codecontext MCP server. Multi-stage Dockerfile builds from `/opt/forks/boocontext`. `shim.go` spawns it via `CODECONTEXT_CHILD` env var. Deep tools (`get_symbol_details`, `get_call_graph`, `get_blast_radius`) registered as server tool wrappers.
#### 19. tree-sitter-analyzer (MIT — child MCP server)
- **URL:** <https://github.com/AimasterAcc/tree-sitter-analyzer>
- **License:** MIT
- **What it is:** Tree-sitter analysis tools, run as a Python package via `uvx`.
- **How we use it:** boocontext child MCP server. Enables structured code queries (symbols, callgraph, impact analysis) over MCP.
#### 20. type-inject (MIT — child MCP server)
- **URL:** `/opt/forks/type-inject/packages/mcp/`
- **License:** MIT
- **What it is:** TypeScript type recovery and inference via MCP.
- **How we use it:** boocontext child MCP server for `infer_type` and `resolve_signature` tools.
#### 21. opencode-morph-fast-apply (MIT — edit guards)
- **URL:** `/opt/forks/opencode-morph-fast-apply/src/`
- **License:** MIT
- **What it is:** Exact-match edit safety guards (truncation, import drop, marker leakage detection).
- **How we use it:** `edit-guards.ts` and `edit-guards-imports.ts` ported guard logic verbatim (MIT allows copy). Called from `pending_changes.ts` before `writeFileAtomic`.
#### 22. opencode-tokenscope (MIT — token classification)
- **URL:** `/opt/forks/opencode-tokenscope/plugin/tokenscope-lib/`
- **License:** MIT
- **What it is:** Per-category token breakdown classification (system/user/assistant/tools/reasoning).
- **How we use it:** `token-analysis/analyzer.ts` ports classification logic. Persisted as `TokenBreakdown` JSONB on contestant/task records.
#### 23. opencode-dynamic-context-pruning (AGPL — patterns only)
- **URL:** `/opt/forks/opencode-dynamic-context-pruning/lib/`
- **License:** AGPL-3.0
- **What it is:** Message deduplication, error purging, and search-based compression strategies.
- **How we use it:** Clean-room reimplementation — behavior reference only, zero AGPL code copied. `dcp/` module with dedup + purge-errors strategies.
#### 24. qwen-code memory (Apache-2.0 — patterns)
- **URL:** `/opt/forks/qwen-code/packages/core/src/memory/`
- **License:** Apache-2.0
- **What it is:** File-based hierarchical memory with recall and injection.
- **How we use it:** Reimplemented behavior patterns. `memory/` module with 4-scope scan (global/home/project/session) and keyword relevance matching. NOTICE attribution added.
#### 25. qwen-code LSP (Apache-2.0 — patterns)
- **URL:** `/opt/forks/qwen-code/packages/core/src/lsp/` + `tools/lsp.ts`
- **License:** Apache-2.0
- **What it is:** LSP code intelligence tools (diagnostics, goto-definition, find-references).
- **How we use it:** Reimplemented operations table. `lsp/` module with config, JSON-RPC client, server-manager, and 3 agent tools. NOTICE attribution added.
#### 26. oh-my-openagent (SUL — patterns only)
- **URL:** `/opt/forks/oh-my-openagent/src/`
- **License:** SUL-1.0
- **What it is:** Plugin/hook composition architecture.
- **How we use it:** Architecture study only — zero code copied. `plugins/host.ts` is an original implementation of the typed hook registry pattern.
#### 27. paseo protocol (AGPL — patterns only)
- **URL:** `/opt/forks/paseo/packages/protocol/`
- **License:** AGPL-3.0
- **What it is:** Agent protocol types (capability flags, permission frames, delegation metadata).
- **How we use it:** Interface shapes only — no code copied. `agent-capabilities.ts` schema, `provider-snapshot.ts` streaming flags, `new_task` background mode.
 -----
 ### Reviewed 2026-05-22 — agent CLIs, ensembler, skills, context tooling
@@ -405,6 +477,14 @@ Don't ship Phase 1 against AGPL/GPL code; build clean. Patterns are free; code i
 |`eyaltoledano/claude-task-master` |Tiered tool-loading via env var (core/standard/all); three model roles; PRD-as-source-of-truth |MIT+Commons Clause (no code lift; pattern only)|`BOOCODE_TOOLS` env var for tiered loading; reaffirm three-model-role pattern |v1.12.x / v1.13 (tier hint) |
 |`sipyourdrink-ltd/bernstein` |HMAC-chained audit log; signed agent cards (Ed25519+JCS); per-artifact lineage; air-gap mode |Verify before lift |Reference for compliance-grade BooCode if/when needed; HMAC log small lift candidate |v2.0+ (audit log), speculative (full stack) |
 |`siropkin/budi` (tool, not lift) |5-hook Claude Code taxonomy; HTTP daemon + SQLite + dashboard |MIT |Install globally to observe Claude Code token costs; hook taxonomy as reference |Immediate (install) |
|`/opt/forks/boocontext` |Unified MCP codebase context server; child MCP manager |MIT |`codecontext/Dockerfile`, `shim.go` child, deep tools (symbols/callgraph/impact) |**v2.8.0 ✅** |
|`/opt/forks/opencode-morph-fast-apply`|Edit safety guards (truncation, import drop) |MIT |`edit-guards.ts`, `edit-guards-imports.ts` |**v2.8.0 ✅** |
|`/opt/forks/opencode-tokenscope` |Per-category token breakdown classification |MIT |`token-analysis/analyzer.ts`, `TokenBreakdown` contract, DB persistence |**v2.8.0 ✅** |
|`/opt/forks/opencode-dynamic-context-pruning`|Message dedup + error purge strategies (AGPL — behavior only) |AGPL-3.0 (patterns) |`dcp/` clean-room module |**v2.8.0 ✅** |
|`/opt/forks/qwen-code` memory |File-based hierarchical memory with recall/prompt injection |Apache-2.0 |`memory/` module with 4-scope scan + keyword relevance |**v2.8.0 ✅** |
|`/opt/forks/qwen-code` LSP |TypeScript LSP tools (diagnostics, goto-def, references) |Apache-2.0 |`lsp/` module with 3 agent tools |**v2.8.0 ✅** |
|`/opt/forks/oh-my-openagent` |Plugin/hook composition architecture (SUL — patterns only) |SUL-1.0 (patterns) |`plugins/host.ts` typed hook registry |**v2.8.0 ✅** |
|`/opt/forks/paseo` |Agent capability flags + permission frame shapes (AGPL — patterns only) |AGPL-3.0 (patterns) |`agent-capabilities.ts`, `provider-snapshot.ts` flags, `new_task` background mode |**v2.8.0 ✅** |
 -----
@@ -464,3 +544,4 @@ Don't ship Phase 1 against AGPL/GPL code; build clean. Patterns are free; code i
 - **siropkin/budi accepted as tooling, not catalog entry (2026-05-22).** MIT, Rust, single 6MB binary, sub-millisecond hook latency. **WakaTime for Claude Code** — tracks tokens, costs, prompts, file activity, sub-agent spawns in local SQLite, dashboard at `localhost:7878/dashboard`. **Recommend immediate install** (`budi init --global`) for Claude Code session observability. The **5-hook Claude Code event taxonomy** (`SessionStart`, `UserPromptSubmit`, `PostToolUse`, `SubagentStart`, `Stop`) is the canonical reference and worth knowing when BooCode v2.0+ designs its own hook system.
 - **GeiserX/LynxPrompt tracked as architectural reference, code off-limits (2026-05-22).** **GPL-3.0 makes vendoring incompatible with BooCode's MIT licensing.** 27 stars, Next.js + PostgreSQL + Prisma. Self-hostable platform for managing AGENTS.md / CLAUDE.md / .cursor/rules / slash commands across **30+ AI assistant formats**. Single blueprint, export to N formats. Federated marketplace. The concept fits Sam's situation (5+ project CLAUDE.md/AGENTS.md files maintained separately) but the **manual AgentLint (#39) audit pass is the right ROI today** rather than adopting a full platform. If consolidation ever needed, reimplement the format-adapter pattern in MIT-licensed BooCode code, don't vendor.
 - **ShipWithAI/claude-code-mastery noted as docs reference (2026-05-22).** **CC BY-NC-SA 4.0** content + MIT code examples. 9 stars. Free 16-phase / 55-module / 136-lesson course on Claude Code workflows. **Two structural patterns worth borrowing:** (1) **7-block module structure** (WHY → CONCEPT → DEMO → PRACTICE → CHEAT SHEET → PITFALLS → REAL CASE) as a docs template; (2) **phase list as coverage checklist** to diff against Sam's own CLAUDE.md/AGENTS.md files — combine with AgentLint (#39) for a single audit pass. Don't redistribute content (NC license).
- **v2.8.0-fork-lifts shipped 2026-06-07** — eight integrations from `/opt/forks`: boocontext sidecar (MIT), LSP code intelligence (Apache-2.0 patterns), DCP clean-room (AGPL behavior only), institutional memory (Apache-2.0 patterns), subagent protocol (AGPL/paseo patterns only), plugin hook host (SUL patterns only), inference reliability (tool-shim + loop detectors, original), and TokenScope token breakdown (MIT). Backfilled edit safety guards (MIT, from opencode-morph-fast-apply) and TokenScope analyzer/persist module (MIT, from opencode-tokenscope). All lift sources documented in the new `### v2.8 fork-lifts` subsection under Reference repos. `boocode_code_review.md` last-updated date bumped to 2026-06-07. See `CHANGELOG.md` for full per-commit detail.

@@ -1,10 +1,10 @@
 # BooCode roadmap (v1.x–v2.x)
-Last updated: 2026-06-03
+Last updated: 2026-06-07
 > **Companion doc:** `boocode_code_review.md` holds the full external-repo inventory, lift rationale, and license analysis. This document is the canonical source for shipping state, version ordering, and what's planned vs. shipped.
-> **Shipped since this doc's body was written (v2.7.12–v2.7.17, 2026-06-02→03; see `CHANGELOG.md` for detail):** `v2.7.12-audit-cleanup` (repo-wide dead-code/dedup pass, ~4,600 LOC), `v2.7.13-contracts-ssot` (the `@boocode/contracts` shared wire-contract package — the "unified types" deferred item), `v2.7.14-backlog-hardening` (5 v2-review items incl. external task-cancel, stall-timeout, retire `:9502` SPA), `v2.7.15-git-diff-panel` + `v2.7.16-container-git-safedir` (Files/Git tab), and `v2.7.17-orchestrator` (the in-app multi-agent Orchestrator on local Qwen). The "Write/edit robustness" and "Claude provider SDK" milestones below — previously marked "planned" — are also now shipped (see those sections).
+> **Shipped since this doc's body was written (v2.7.12–v2.8.0, 2026-06-02→07; see `CHANGELOG.md` for detail):** `v2.7.12-audit-cleanup` (repo-wide dead-code/dedup pass, ~4,600 LOC), `v2.7.13-contracts-ssot` (the `@boocode/contracts` shared wire-contract package), `v2.7.14-backlog-hardening` (5 v2-review items), `v2.7.15-git-diff-panel` + `v2.7.16-container-git-safedir` (Files/Git tab), `v2.7.17-orchestrator` (in-app multi-agent Orchestrator), and **`v2.8.0-fork-lifts`** (eight integrations — LSP, DCP, memory, boocontext, subagent protocol, plugins, inference reliability, TokenScope — plus edit safety guards and TokenScope analyzer). The "Write/edit robustness" and "Claude provider SDK" milestones below are also shipped.
 ## Overview

@@ -1,41 +1,38 @@
-# v1.12 Track B — codecontext sidecar container.
+# v2.8 — boocontext sidecar container.
+# Multi-stage build: Go shim from golang:1.24-alpine, boocontext MCP aggregator
+# from node:20-alpine, then an alpine:3.20 runtime holding both.
 #
-# Multi-stage build: golang:1.24-alpine builder produces two binaries
-# (codecontext from source + our HTTP shim), then a minimal alpine:3.20
-# runtime holds both.
+# The shim spawns boocontext as a child MCP process over stdio NDJSON,
+# translating HTTP requests to MCP tools/call.
 #
-# No upstream Docker image exists for codecontext. We clone the repo
-# directly because the module path declared in go.mod
-# (github.com/nuthan-ms/codecontext) differs from the GitHub repo URL
-# (github.com/nmakod/codecontext) — `go install` against the GitHub path
-# wouldn't resolve. The tagged v3.2.1 source tree is the same either way.
-FROM golang:1.24-alpine AS builder
-WORKDIR /build
-RUN apk add --no-cache git ca-certificates build-base
-# Build codecontext from the boocode-ts fork (has .codecontextignore support).
-# Source is staged into the build context by the pre-build step:
-# tar -czf codecontext/fork.tar.gz -C /opt/forks/codecontext .
-# CGO is required: codecontext binds tree-sitter via cgo.
-COPY fork.tar.gz /build/fork.tar.gz
-RUN mkdir -p /build/codecontext && tar -xzf /build/fork.tar.gz -C /build/codecontext
-WORKDIR /build/codecontext
-RUN CGO_ENABLED=1 GOOS=linux go build -o /build/codecontext-bin ./cmd/codecontext
-# Build the shim. Stdlib-only — no go.sum needed.
+# To stage the fork source for a Docker build:
+# tar -czf codecontext/fork.tar.gz -C /opt/forks/boocontext \
+# --exclude=.git --exclude=node_modules --exclude=dist
+# Stage 1: Go shim builder
+FROM golang:1.24-alpine AS shim-builder
 WORKDIR /build/shim
+RUN apk add --no-cache ca-certificates
 COPY go.mod ./
 COPY shim.go ./
 RUN CGO_ENABLED=0 GOOS=linux go build -o /build/shim-bin ./
-# Runtime: alpine matches the build target so codecontext's cgo bindings
-# resolve against the same musl libc.
+# Stage 2: boocontext MCP builder
+FROM node:20-alpine AS boocontext-builder
+WORKDIR /build/boocontext
+RUN apk add --no-cache git python3 make g++ ca-certificates
+COPY fork.tar.gz /build/fork.tar.gz
+RUN mkdir -p /build/boocontext && tar -xzf /build/fork.tar.gz -C /build/boocontext
+WORKDIR /build/boocontext
+RUN npm ci && npm run build
+# Stage 3: Runtime
 FROM alpine:3.20
-RUN apk add --no-cache ca-certificates
-COPY --from=builder /build/codecontext-bin /usr/local/bin/codecontext
-COPY --from=builder /build/shim-bin /usr/local/bin/shim
+RUN apk add --no-cache ca-certificates nodejs uv
+COPY --from=shim-builder /build/shim-bin /usr/local/bin/shim
+COPY --from=boocontext-builder /build/boocontext/dist /usr/local/lib/boocontext/dist
+COPY --from=boocontext-builder /build/boocontext/node_modules /usr/local/lib/boocontext/node_modules
+COPY --from=boocontext-builder /build/boocontext/package.json /usr/local/lib/boocontext/package.json
 EXPOSE 8080
 HEALTHCHECK --interval=30s --timeout=5s --start-period=30s \

@@ -26,6 +26,7 @@ import (
 "os"
 "os/exec"
 "os/signal"
+"strings"
 "sync"
 "sync/atomic"
 "syscall"
@@ -185,13 +186,14 @@ func notify(method string, params any) error {
 // ---- Child lifecycle ----
 func startChild() error {
-// `codecontext mcp` with --watch=true (the default) keeps fsnotify
-// running on the indexed directory; the per-call target_dir swap
-// invalidates and re-indexes on demand. `--target=/opt/projects` is the
-// initial scan target — codecontext rebuilds the graph against whatever
-// target_dir each call carries, so this is just a valid bootstrap path
-// (the default "." is the alpine root and trips on transient /proc fds).
-child = exec.Command("codecontext", "mcp", "--target=/opt/projects", "--watch=true", "--respect-gitignore")
+// Support CODECONTEXT_CHILD env var for overriding the MCP child command.
+// Default to boocontext (Node.js MCP aggregator). Set in docker-compose.
+childCmd := os.Getenv("CODECONTEXT_CHILD")
+if childCmd == "" {
+childCmd = "node /usr/local/lib/boocontext/dist/index.js"
+}
+parts := strings.Split(childCmd, " ")
+child = exec.Command(parts[0], parts[1:]...)
 var err error
 childStdin, err = child.StdinPipe()
 if err != nil {
@@ -417,6 +419,9 @@ func main() {
 mux.HandleFunc("POST /v1/watch_changes", makeToolHandler("watch_changes"))
 mux.HandleFunc("POST /v1/get_semantic_neighborhoods", makeToolHandler("get_semantic_neighborhoods"))
 mux.HandleFunc("POST /v1/get_framework_analysis", makeToolHandler("get_framework_analysis"))
+mux.HandleFunc("POST /v1/get_symbol_details", makeToolHandler("get_symbol_details"))
+mux.HandleFunc("POST /v1/get_call_graph", makeToolHandler("get_call_graph"))
+mux.HandleFunc("POST /v1/get_blast_radius", makeToolHandler("get_blast_radius"))
 server := &http.Server{
 Addr: ":8080",

@@ -7,6 +7,17 @@
 "CONTEXT7_API_KEY": "{env:CONTEXT7_API_KEY}"
 },
 "enabled": false
+},
+"boocontext": {
+"type": "stdio",
+"command": "node",
+"args": ["/opt/forks/boocontext/dist/index.js"],
+"env": {
+"TYPE_INJECT_MCP_PATH": "/opt/forks/type-inject/packages/mcp/dist/index.js",
+"TREE_SITTER_MCP_CMD": "uvx",
+"TREE_SITTER_MCP_ARGS": "--from tree-sitter-analyzer[mcp] tree-sitter-analyzer-mcp"
+},
+"enabled": false
 }
 }
 }

@@ -109,10 +109,16 @@ services:
     ports:
       - "127.0.0.1:8080:8080"
     restart: unless-stopped
+    environment:
+      CODECONTEXT_CHILD: node /usr/local/lib/boocontext/dist/index.js
+      TYPE_INJECT_MCP_PATH: /opt/type-inject/packages/mcp/dist/index.js
+      TREE_SITTER_MCP_CMD: uvx
+      TREE_SITTER_MCP_ARGS: --from tree-sitter-analyzer[mcp] tree-sitter-analyzer-mcp
     networks:
       - boocode_net
     volumes:
       - /opt:/opt:ro
+      - /opt/forks:/opt/forks:ro
     healthcheck:
       test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health || exit 1"]
       interval: 30s


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-07


@@ -0,0 +1,32 @@
## Context
BooCode has no structured behavioral enforcement. Agent behavior is guided by system prompts and CLAUDE.md — advisory, not enforceable. The `boocontext-audit` package (already TypeScript, already in /opt/forks) provides a complete behavioral compliance engine: Guideline model, 6-batch matcher, relational resolver, audit trail, and graded recovery.
## Goals / Non-Goals
**Goals:**
- Import boocontext-audit's Guideline model (condition/action rules with criticality)
- Import multi-batch matcher (Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, LowCriticality)
- Import RelationalResolver (DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES)
- Import audit middleware (PostToolUse, Stop, UserPromptSubmit hooks)
- Import graded context recovery (L0-L4)
- Wire guideline evaluation into agent's inference loop
**Non-Goals:**
- Journey DAG integration (future scope)
- MCP middleware integration (focus on in-process hooks)
## Decisions
- **Direct import from local fork**: boocontext-audit is at `/opt/forks/boocontext-audit/`. Use workspace dependency or npm link.
- **Guideline storage**: InMemoryGuidelineStore for development, FileRelationshipStore for production.
- **Batch execution**: Run the observational and actionable batches in parallel, then disambiguation, then response analysis.
- **SchematicGenerator**: Abstract LLM caller. Configure per-batch model (use cheap model for matching, expensive for disambiguation).
- **Audit hooks**: Wire PostToolUse → appendToBuffer(), Stop → flushBuffer(), UserPromptSubmit → injectSessionContext().
- **Recovery**: Load L0 (index) by default. L2 (user corrections) on /recover. L3 (full) on /recover full.
## Risks / Trade-offs
- **LLM overhead**: Each batch is an LLM call. 6 batches × N guidelines could be expensive. Mitigation: batch size limits, parallel execution.
- **Cold start**: No guidelines exist initially. Users must define them. Ship with 5-10 built-in safety guidelines.
- **boocontext-audit maturity**: v0.1.0. Review code quality before direct import.


@@ -0,0 +1,22 @@
## Why
BooCode has no structured way to enforce agent behavior rules. The `boocontext-audit` package (already TypeScript, zero external deps) provides a complete behavioral compliance engine ported from Parlant: Guideline condition/action model, multi-batch LLM matcher, relational resolver, audit middleware, and graded context recovery. Adding this gives BooCode structured rule enforcement far beyond simple CLAUDE.md guidelines.
## What Changes
- Import boocontext-audit as a dependency in apps/coder/
- Add Guideline model: natural language condition/action rules with criticality
- Add multi-batch matcher: observational, actionable, previously-applied, disambiguation, response analysis batches
- Add RelationalResolver: DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL relationship resolution
- Add audit middleware: PostToolUse/Stop/UserPromptSubmit hooks with JSONL buffer
- Add graded context recovery: L0-L4 recovery levels
- Wire guideline evaluation into agent's inference loop
## Capabilities
### New Capabilities
- `guideline-model`: Natural language condition/action rules with criticality and priority
- `multi-batch-matcher`: 6-batch LLM evaluation for context-relevant rule matching
- `relational-resolver`: Dependency/priority/entailment resolution with iterative convergence
- `audit-middleware`: PostToolUse/Stop/UserPromptSubmit hooks with JSONL trail
- `graded-recovery`: L0-L4 context recovery for session continuity


@@ -0,0 +1,21 @@
## ADDED Requirements
### Requirement: PostToolUse audit logging
- **WHEN** a tool is used
- **THEN** the tool name, input summary, and timestamp are appended to the JSONL audit buffer
### Requirement: Stop hook flush
- **WHEN** a response completes
- **THEN** the audit buffer is flushed to the session audit trail and the index is updated
### Requirement: UserPromptSubmit context injection
- **WHEN** a user message is submitted
- **THEN** session context (session ID, record count, critical alerts) is injected into the prompt
### Requirement: Anomaly detection
- **WHEN** audit records are checked against alert rules
- **THEN** anomalies at CRITICAL level are injected into the context
#### Scenario: Full audit trail
- **WHEN** an agent runs 10 tool calls across 3 turns
- **THEN** the audit trail contains 10 JSONL records, a session summary, and an updated index
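
The PostToolUse and Stop requirements above can be sketched as an in-memory buffer. This is a minimal illustration, not the boocontext-audit implementation: `AuditRecord`, `AuditBuffer`, and the `trail` array standing in for the session audit trail file are all hypothetical names chosen for the example.

```typescript
// Hypothetical record shape; the real boocontext-audit types may differ.
interface AuditRecord {
  tool: string;
  inputSummary: string;
  timestamp: string; // ISO 8601
}

class AuditBuffer {
  private pending: string[] = []; // JSONL lines awaiting a flush
  private trail: string[] = [];   // stand-in for the session audit trail file

  // PostToolUse: append one JSON line per tool call.
  appendToBuffer(record: AuditRecord): void {
    this.pending.push(JSON.stringify(record));
  }

  // Stop: move buffered lines into the trail and report how many were flushed.
  flushBuffer(): number {
    const flushed = this.pending.length;
    this.trail.push(...this.pending);
    this.pending = [];
    return flushed;
  }

  // Number of records already persisted to the trail.
  get recordCount(): number {
    return this.trail.length;
  }
}
```

Buffering per turn and flushing on Stop keeps the trail append-only; a crash between the two hooks would lose at most the records of the current turn.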


@@ -0,0 +1,25 @@
## ADDED Requirements
### Requirement: L0 recovery (index summary)
- **WHEN** /recover is called without arguments
- **THEN** the last 5 index entries are loaded (~200 tokens)
### Requirement: L1 recovery (session state)
- **WHEN** /recover L1 is called
- **THEN** current session.json + last 3 audit trail entries are loaded (~500 tokens)
### Requirement: L2 recovery (user corrections)
- **WHEN** /recover L2 is called
- **THEN** ALL user_correction records across all sessions are loaded (~1000 tokens)
### Requirement: L3 recovery (full context)
- **WHEN** /recover L3 is called
- **THEN** full audit trail + all pending records are loaded (~3000 tokens)
### Requirement: Priority loading
- **WHEN** recovering context
- **THEN** user_correction records are loaded first (highest priority)
#### Scenario: Session crash recovery
- **WHEN** an agent session crashes and restarts with /recover
- **THEN** the agent gets the index summary, last session state, and all user corrections
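
One way to read the level requirements above is as a lookup from the `/recover` argument to the sources that get loaded. The sketch below assumes that reading; `parseRecoverCommand` and `RECOVERY_SOURCES` are hypothetical names, and only the four levels that have requirements here (L0 to L3) are modeled.

```typescript
type RecoveryLevel = "L0" | "L1" | "L2" | "L3";

const LEVELS: readonly RecoveryLevel[] = ["L0", "L1", "L2", "L3"];

// Sources per level, copied from the requirements above.
const RECOVERY_SOURCES: Record<RecoveryLevel, string[]> = {
  L0: ["last 5 index entries"],
  L1: ["session.json", "last 3 audit trail entries"],
  L2: ["all user_correction records"],
  L3: ["full audit trail", "all pending records"],
};

// Parse "/recover" or "/recover <level>"; no argument means L0.
// Returns null for anything that is not a recognized recover command.
function parseRecoverCommand(input: string): RecoveryLevel | null {
  const match = /^\/recover(?:\s+(\S+))?\s*$/.exec(input.trim());
  if (!match) return null;
  const arg = match[1] ?? "L0";
  return LEVELS.find((level) => level === arg) ?? null;
}
```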


@@ -0,0 +1,17 @@
## ADDED Requirements
### Requirement: Guideline creation
- **WHEN** creating a guideline with condition, action, and criticality
- **THEN** it is stored with unique ID and metadata
### Requirement: Guideline evaluation
- **WHEN** an agent action triggers guideline evaluation
- **THEN** matching guidelines are activated with score and rationale
### Requirement: Criticality levels
- **WHEN** evaluating guidelines
- **THEN** guidelines are filtered by criticality (low/medium/high/critical) with higher-criticality taking precedence
#### Scenario: Security policy enforcement
- **WHEN** an agent attempts to edit a file matching a security guideline condition
- **THEN** the guideline matcher returns the relevant rule with CRITICAL severity
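
As a rough illustration of the criticality requirement, the sketch below filters matches by a minimum criticality and orders them so that higher criticality wins. The `Guideline` and `GuidelineMatch` shapes and the `prioritize` helper are assumptions made for the example; the exported types of boocontext-audit may look different.

```typescript
type Criticality = "low" | "medium" | "high" | "critical";

interface Guideline {
  id: string;
  condition: string; // natural-language trigger
  action: string;    // natural-language instruction
  criticality: Criticality;
}

interface GuidelineMatch {
  guideline: Guideline;
  score: number;     // matcher confidence
  rationale: string;
}

const RANK: Record<Criticality, number> = { low: 0, medium: 1, high: 2, critical: 3 };

// Keep matches at or above `min`, most critical first, ties broken by score.
function prioritize(matches: GuidelineMatch[], min: Criticality = "low"): GuidelineMatch[] {
  return matches
    .filter((m) => RANK[m.guideline.criticality] >= RANK[min])
    .sort(
      (a, b) =>
        RANK[b.guideline.criticality] - RANK[a.guideline.criticality] || b.score - a.score,
    );
}
```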


@@ -0,0 +1,17 @@
## ADDED Requirements
### Requirement: Six batch types
- **WHEN** guidelines are evaluated
- **THEN** they are processed through: Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, and LowCriticality batches
### Requirement: Parallel batch execution
- **WHEN** independent batches are ready
- **THEN** they execute in parallel (observational + actionable run concurrently)
### Requirement: Structured LLM output per batch
- **WHEN** a batch calls the LLM
- **THEN** it uses a structured schema specific to the batch type (e.g., applies: boolean for actionable, was_followed: boolean for response analysis)
#### Scenario: Multi-rule evaluation
- **WHEN** an agent action matches 3 guidelines across different criticalities
- **THEN** the matcher returns all applicable matches with scores, with CRITICAL matches flagged


@@ -0,0 +1,21 @@
## ADDED Requirements
### Requirement: DEPENDS_ON resolution
- **WHEN** guideline A depends on guideline B
- **THEN** B is activated if A is activated
### Requirement: PRIORITIZES resolution
- **WHEN** guideline A prioritizes over guideline B
- **THEN** B is filtered out if both match
### Requirement: ENTAILS resolution
- **WHEN** guideline A entails guideline B
- **THEN** B is automatically activated when A is activated
### Requirement: Iterative convergence
- **WHEN** resolving relationships
- **THEN** the resolver iterates until a pass produces no further changes, capped at 100 iterations
#### Scenario: Conflicting guideline resolution
- **WHEN** a HIGH priority guideline matches and a LOW priority guideline also matches
- **THEN** the LOW priority guideline is filtered out via numerical priority resolution
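
The resolution rules above can be sketched as a small fixed-point loop. This is an illustration under stated assumptions, not the `RelationalResolver` from boocontext-audit: the `Relation` shape and `resolve` function are hypothetical, DEPENDS_ON and ENTAILS are both treated as "activating the source activates the target" (as the requirements word them), and PRIORITIZES is applied once after activation has converged.

```typescript
type RelationKind = "DEPENDS_ON" | "PRIORITIZES" | "ENTAILS";

interface Relation {
  kind: RelationKind;
  source: string; // guideline A
  target: string; // guideline B
}

function resolve(matched: string[], relations: Relation[], maxIterations = 100): string[] {
  const active = new Set(matched);

  // Phase 1: propagate activation until a full pass adds nothing.
  for (let i = 0; i < maxIterations; i++) {
    let changed = false;
    for (const r of relations) {
      if (r.kind !== "PRIORITIZES" && active.has(r.source) && !active.has(r.target)) {
        active.add(r.target);
        changed = true;
      }
    }
    if (!changed) break;
  }

  // Phase 2: drop every guideline that an active guideline prioritizes over.
  const suppressed = new Set<string>();
  for (const r of relations) {
    if (r.kind === "PRIORITIZES" && active.has(r.source) && active.has(r.target)) {
      suppressed.add(r.target);
    }
  }
  return Array.from(active).filter((id) => !suppressed.has(id));
}
```

Splitting activation and suppression into two phases keeps the loop monotonic, so it always terminates well before the iteration cap on acyclic inputs; a resolver that interleaves the two would need the cap as a real safeguard.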


@@ -0,0 +1,56 @@
## 1. Import boocontext-audit as dependency
- [ ] 1.1 Add boocontext-audit as workspace dependency
- [ ] 1.2 Verify Guideline, GuidelineStore, SchematicGenerator exports
## 2. Implement Guideline model
- [ ] 2.1 Create GuidelineManager wrapping GuidelineStore
- [ ] 2.2 Add CRUD operations for guidelines (create, read, update, delete, list)
- [ ] 2.3 Add InMemoryGuidelineStore and FileRelationshipStore backends
- [ ] 2.4 Add criticality filtering and priority sorting
## 3. Implement multi-batch matcher
- [ ] 3.1 Create MatcherService wrapping GenericGuidelineMatchingStrategy
- [ ] 3.2 Add Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, LowCriticality batch types
- [ ] 3.3 Add parallel batch execution for independent batches
- [ ] 3.4 Add SchematicGenerator abstraction for LLM batch calls
## 4. Implement RelationalResolver
- [ ] 4.1 Create ResolverService wrapping RelationalResolver
- [ ] 4.2 Implement DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES resolution
- [ ] 4.3 Add iterative convergence loop (max 100 iterations)
- [ ] 4.4 Add resolution logging
## 5. Implement audit middleware
- [ ] 5.1 Create AuditService with PostToolUse middleware (JSONL buffer append)
- [ ] 5.2 Add Stop middleware (buffer flush to session trail)
- [ ] 5.3 Add UserPromptSubmit middleware (session context injection + CRITICAL alerts)
- [ ] 5.4 Wire audit middleware into agent's inference lifecycle
## 6. Implement graded context recovery
- [ ] 6.1 Create RecoveryService with L0-L4 recovery methods
- [ ] 6.2 Implement L0: read last 5 index entries
- [ ] 6.3 Implement L1: session.json + last 3 audit trail entries
- [ ] 6.4 Implement L2: all user_correction records
- [ ] 6.5 Implement L3: full audit trail
- [ ] 6.6 Add priority loading (user corrections first)
## 7. Wire into agent inference loop
- [ ] 7.1 Run guideline evaluation before each agent turn
- [ ] 7.2 Inject active guidelines into system prompt
- [ ] 7.3 Record guideline matches in turn metadata
- [ ] 7.4 Add guideline management commands (add-guideline, list-guidelines, remove-guideline)
## 8. Test and verify
- [ ] 8.1 Test guideline creation and storage
- [ ] 8.2 Test multi-batch matching with sample guidelines
- [ ] 8.3 Test relational resolution with dependencies
- [ ] 8.4 Test audit middleware tool logging
- [ ] 8.5 Test graded recovery at all levels


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-07


@@ -0,0 +1,28 @@
## Context
BooCode has 0% TypeScript type recovery. When agents read files, they get raw text without type signatures. The type-inject project provides a published MCP server and hooks that extract TypeScript types and inject them contextually.
## Goals / Non-Goals
**Goals:**
- Add `@nick-vi/type-inject-mcp` as MCP server in BooCode config
- Add auto-type-injection on file reads (Read tool hook)
- Add type-check feedback on file writes (Write tool hook)
- Add `lookup_type` and `list_types` tools for agents
**Non-Goals:**
- Type extraction for non-TypeScript languages (future scope)
- Full ts-morph project analysis (type-inject handles this)
## Decisions
- **MCP server registration**: One-line addition to mcpServers config: `npx -y @nick-vi/type-inject-mcp`
- **Read hook**: Register a PostToolUse hook for `Read` tool that pipes content through type-inject
- **Write hook**: Register a PostToolUse hook for `Write`/`Edit` tool that runs type checker
- **Token budget**: Configure `maxTokens: 2000`, `skipBarrelFiles: true`, `onlyUsed: true` defaults
- **Published package**: No local fork needed. Use published npm package.
## Risks / Trade-offs
- **Latency**: Type extraction adds ~200-500ms per file read. Token budget limits prevent runaway costs.
- **Accuracy**: ts-morph-based extraction is accurate but may miss dynamic types. Acceptable trade-off.


@@ -0,0 +1,18 @@
## Why
BooCode's codecontext sidecar has 0% TypeScript type recovery — it cannot provide type signatures when the AI reads files. The `type-inject` project provides a published MCP server (`@nick-vi/type-inject-mcp`) that extracts TypeScript types, interfaces, function signatures from source files and injects them on file reads. Adding it to BooCode's MCP configuration directly solves the type blindness problem.
## What Changes
- Add `@nick-vi/type-inject-mcp` as an MCP server in BooCode's server config
- Add type-inject hooks for PostToolUse on Read (auto-inject types) and Write (type-check feedback)
- Add `lookup_type` and `list_types` tools available to agents
- Configure token budget and filtering options (onlyUsed, maxTokens, skipBarrelFiles)
## Capabilities
### New Capabilities
- `type-inject-mcp-server`: Register type-inject as MCP server in BooCode config
- `auto-type-injection`: Hook type signatures into file reads automatically
- `type-check-on-write`: Run type checker after file edits and report errors
- `type-lookup-tools`: Add `lookup_type` and `list_types` MCP tools for agents


@@ -0,0 +1,17 @@
## ADDED Requirements
### Requirement: Type injection on file Read
- **WHEN** an agent reads a TypeScript file
- **THEN** type signatures for exported types/functions/interfaces are appended to the file content
### Requirement: Configurable injection scope
- **WHEN** configuring type injection
- **THEN** settings control: onlyUsed, skipBarrelFiles, maxTokens, includeJSDoc, importDepth
### Requirement: Token budget enforcement
- **WHEN** type signatures exceed maxTokens
- **THEN** signatures are prioritized (used types first, exported over private) and truncated
#### Scenario: Reading a React component file
- **WHEN** an agent reads a .tsx file
- **THEN** component props interface, exported functions, and type aliases are injected
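
The token budget requirement above (used types first, exported over private, then truncate) might look like the following. This is a sketch only: `TypeSignature`, `estimateTokens`, and `selectWithinBudget` are hypothetical names, and the four-characters-per-token estimate is a crude assumption rather than how type-inject counts tokens.

```typescript
interface TypeSignature {
  name: string;
  text: string;      // rendered signature
  used: boolean;     // referenced by the file being read
  exported: boolean;
}

// Crude estimate (about 4 characters per token); a real tokenizer would differ.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Order by priority, then keep signatures until the budget would be exceeded.
function selectWithinBudget(sigs: TypeSignature[], maxTokens: number): TypeSignature[] {
  const ordered = sigs.slice().sort(
    (a, b) =>
      Number(b.used) - Number(a.used) || Number(b.exported) - Number(a.exported),
  );
  const kept: TypeSignature[] = [];
  let spent = 0;
  for (const sig of ordered) {
    const cost = estimateTokens(sig.text);
    if (spent + cost > maxTokens) break; // truncate at the first overflow
    kept.push(sig);
    spent += cost;
  }
  return kept;
}
```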


@@ -0,0 +1,9 @@
## ADDED Requirements
### Requirement: Type check on file Write/Edit
- **WHEN** an agent writes or edits a TypeScript file
- **THEN** the type checker runs and reports errors to the agent
### Requirement: Error reporting format
- **WHEN** type errors are detected
- **THEN** they are reported with file path, line number, error message, and error code


@@ -0,0 +1,13 @@
## ADDED Requirements
### Requirement: MCP server registration
- **WHEN** BooCode's MCP client starts
- **THEN** type-inject MCP server is registered via `npx -y @nick-vi/type-inject-mcp`
### Requirement: lookup_type tool
- **WHEN** an agent calls lookup_type with a type name regex
- **THEN** it returns matching type signatures, JSDoc, source paths, and import depth
### Requirement: list_types tool
- **WHEN** an agent calls list_types with optional kind/exported/source filters
- **THEN** it returns matching types from the project


@@ -0,0 +1,13 @@
## ADDED Requirements
### Requirement: Agent-accessible type tools
- **WHEN** an agent needs to look up a type
- **THEN** it can call lookup_type(name) to get full type definition with JSDoc
### Requirement: Type source tracking
- **WHEN** looking up a type
- **THEN** the response includes the source file path, import depth, and whether it's exported
#### Scenario: Agent inspects a function signature
- **WHEN** an agent calls lookup_type("validateUser")
- **THEN** it receives the full function signature, parameter types, return type, JSDoc, and source file


@@ -0,0 +1,30 @@
## 1. Add type-inject MCP server to config
- [ ] 1.1 Add MCP server entry: `npx -y @nick-vi/type-inject-mcp`
- [ ] 1.2 Verify lookup_type and list_types tools appear in tool list
- [ ] 1.3 Test lookup_type returns type signatures
## 2. Add auto-type-injection on file Read
- [ ] 2.1 Register PostToolUse hook for Read tool
- [ ] 2.2 Pipe file content through type-inject for type annotation
- [ ] 2.3 Configure token budget: maxTokens: 2000, skipBarrelFiles: true
- [ ] 2.4 Test type injection on .ts/.tsx file read
## 3. Add type-check feedback on Write/Edit
- [ ] 3.1 Register PostToolUse hook for Write and Edit tools
- [ ] 3.2 Capture type-checker output on written files
- [ ] 3.3 Surface type errors to agent as tool result messages
## 4. Configure type-inject settings
- [ ] 4.1 Add type-inject settings to BooCode config (maxTokens, onlyUsed, includeJSDoc)
- [ ] 4.2 Add per-project override support
## 5. Test and verify
- [ ] 5.1 Verify types are injected on Read for a .ts file with complex types
- [ ] 5.2 Verify type errors are reported on Write with intentional type mistake
- [ ] 5.3 Verify lookup_type returns correct type information
- [ ] 5.4 Verify token budget enforcement works (large file doesn't overflow)


@@ -0,0 +1,6 @@
schema: spec-driven
created: 2026-06-07
goal: "Create boocontext: a local-first MCP codebase context server forked from
codesight that provides overview + deep analysis (call graph, impact, health
grades, type recovery) via child MCP servers, usable from opencode, claude,
and boocode/boochat"


@@ -0,0 +1,3 @@
# boocontext
Local-first MCP codebase context capability - aggregator server forked from codesight with deep analysis via tree-sitter-analyzer


@@ -0,0 +1,152 @@
## Context
boocontext is forked from codesight (14+ languages, 40+ frameworks, 13 MCP tools, TypeScript compiler AST + regex scanner). codesight provides project-level overview: routes, schemas, components, dependency graph, blast-radius. It does not do deep per-file analysis (call graphs, code health, type recovery).
tree-sitter-analyzer (Python, SQLite index, 8+ MCP tools) provides the deep layer: call graph (callers/callees/call-paths), A-F code health grading, BM25-ranked symbol search, change impact, complexity heatmaps. It ships as `tree-sitter-analyzer[mcp]` on PyPI, launchable via `uvx`.
type-inject (TypeScript/Node) provides cross-file TS type recovery: resolved signatures, interfaces, generics.
boocontext aggregates these into one MCP server process so host applications register a single server, not three.
Current state: fork exists at `/opt/forks/boocontext` (untouched), tree-sitter-analyzer at `/opt/forks/tree-sitter-analyzer`, type-inject at `/opt/forks/type-inject`. No wiring exists yet.
Constraints:
- Zero new inference — boocontext is a tool server. The calling host (opencode/claude/boocode/boochat) owns LLM synthesis.
- All 7 tools return verdict envelopes (structured facts + safety classification).
- Child servers must be lazily spawned on first use and kept alive for the session.
- Compression (DCP) is optional — only applied to `boocontext_map` output when payload exceeds threshold.
## Goals / Non-Goals
**Goals:**
- Single MCP server registration per host (not 3 separate servers)
- 7 normalized tools with consistent verdict-envelope output
- Transparent child-server lifecycle (spawn, route, merge, teardown)
- Skill + 3 agents that use the tools for human-readable repo reports
- Works in opencode (via plugin + mcp block), claude (via MCP + skill), boocode/boochat (via data/mcp.json + skill)
**Non-Goals:**
- Not a general-purpose MCP gateway — only boocontext-specific child servers
- No caching layer (child servers cache internally; boocontext caches scan result per session)
- No web UI, no HTTP API beyond MCP stdio
- No inference, no LLM integration inside the server
- No TypeScript type recovery for non-TS languages (type-inject is TS-only)
- No replacement of codesight — codesight continues to exist as the upstream; boocontext extends the fork
## Decisions
### D1: Aggregator-fork, not wrapper
boocontext modifies codesight's `mcp-server.ts` in-place rather than wrapping it in a separate process. This avoids double-scans (codesight and boocontext would each crawl the repo). The codesight scanner is reused directly; new tools are added alongside existing ones.
### D2: Child servers via subprocess stdio, not HTTP
tree-sitter-analyzer and type-inject are spawned as child processes with MCP stdio transport. boocontext uses the `@modelcontextprotocol/sdk` client to connect. Rationale: no port conflicts, no network exposure, same machine, simple lifecycle management.
### D3: Lazy spawn on first tool call
Child servers are not started at boocontext startup. They are spawned on the first tool call that needs them (`boocontext_health`, `boocontext_symbols`, `boocontext_callgraph`, `boocontext_impact` → spawn TSA; `boocontext_types` → spawn type-inject). Once spawned, the child process stays alive for the session and is killed when boocontext exits.
### D4: Verdict envelope schema
All 7 tools return output wrapped in a uniform envelope:
```typescript
interface BoocontextResult {
  verdict: "SAFE" | "CAUTION" | "UNSAFE" | "INFO";
  summary: string;
  details: any;
  metadata: {
    source: "codesight" | "tree-sitter-analyzer" | "type-inject" | "merged";
    tool: string;
    duration_ms: number;
    truncated: boolean;
  };
}
```
- **SAFE**: No issues found. Data is complete and actionable.
- **CAUTION**: Minor issues or warnings. Data may be partial.
- **UNSAFE**: Significant problems (e.g., analysis failed, index missing, project too large).
- **INFO**: Informational response (no error, no warning — e.g., help text or ping).
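
A builder for the D4 envelope could look roughly like this. It is a sketch under assumptions: the generic parameter, the `truncated` default, and the `runTool` wrapper are illustrative additions, and the actual signature in `src/verdict.ts` may differ.

```typescript
type Verdict = "SAFE" | "CAUTION" | "UNSAFE" | "INFO";
type Source = "codesight" | "tree-sitter-analyzer" | "type-inject" | "merged";

interface BoocontextResult<T = unknown> {
  verdict: Verdict;
  summary: string;
  details: T;
  metadata: { source: Source; tool: string; duration_ms: number; truncated: boolean };
}

// Hypothetical builder: fills in `truncated: false` when the caller omits it.
function makeVerdict<T>(
  verdict: Verdict,
  summary: string,
  details: T,
  metadata: { source: Source; tool: string; duration_ms: number; truncated?: boolean },
): BoocontextResult<T> {
  return {
    verdict,
    summary,
    details,
    metadata: { ...metadata, truncated: metadata.truncated ?? false },
  };
}

// Hypothetical wrapper: run a synchronous handler and turn a thrown error
// into an UNSAFE envelope instead of letting it escape to the MCP layer.
function runTool<T>(tool: string, source: Source, handler: () => T): BoocontextResult<T | null> {
  const started = Date.now();
  try {
    const details = handler();
    return makeVerdict<T | null>("SAFE", `${tool} completed`, details, {
      source,
      tool,
      duration_ms: Date.now() - started,
    });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return makeVerdict<T | null>("UNSAFE", `${tool} failed: ${message}`, null, {
      source,
      tool,
      duration_ms: Date.now() - started,
    });
  }
}
```

Keeping the failure path inside the envelope means hosts can branch on `verdict` alone rather than handling both protocol errors and payload errors.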
### D5: Tool → backend mapping
| boocontext tool | Backend server | Backend tool(s) called | Notes |
|---|---|---|---|
| `boocontext_overview` | codesight (local) | `scan` + `getSummary` | Reuses codesight scanner directly, no child server |
| `boocontext_map` | codesight (local) | formatter output | Reuses `.codesight/` output; optional DCP compression |
| `boocontext_health` | tree-sitter-analyzer | `file_health`, `project_health` | Spawns TSA child server |
| `boocontext_symbols` | tree-sitter-analyzer | `search_content`, `query_code` | BM25 symbol search via TSA |
| `boocontext_callgraph` | tree-sitter-analyzer | `callers`, `callees`, `call_graph` | TSA call graph |
| `boocontext_impact` | tree-sitter-analyzer + codesight | TSA `trace_impact` + codesight `blast_radius` | Merged symbol-level + file-level impact |
| `boocontext_types` | type-inject | `infer_type`, `resolve_signature` | TS type recovery |
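
The mapping table above can be expressed as a lookup that also tells the server which child processes a tool needs. `TOOL_BACKENDS` and `childServersFor` are hypothetical names used for illustration; the backend assignments are copied from the table.

```typescript
type Backend = "codesight" | "tree-sitter-analyzer" | "type-inject";

// Tool name to backend(s), as listed in the D5 table.
const TOOL_BACKENDS = new Map<string, Backend[]>([
  ["boocontext_overview", ["codesight"]],
  ["boocontext_map", ["codesight"]],
  ["boocontext_health", ["tree-sitter-analyzer"]],
  ["boocontext_symbols", ["tree-sitter-analyzer"]],
  ["boocontext_callgraph", ["tree-sitter-analyzer"]],
  ["boocontext_impact", ["tree-sitter-analyzer", "codesight"]],
  ["boocontext_types", ["type-inject"]],
]);

// Child servers a tool call must have running; codesight runs in-process,
// so it never requires a spawn.
function childServersFor(tool: string): Backend[] {
  const backends = TOOL_BACKENDS.get(tool);
  if (!backends) throw new Error(`unknown tool: ${tool}`);
  return backends.filter((b) => b !== "codesight");
}
```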
### D6: codesight tools preserved
The existing codesight tools (`codesight_scan`, `codesight_get_routes`, etc.) remain in the source tree but are not advertised in the boocontext tool list. The `boocontext_*` tools are the public surface. This avoids breaking any host that already references codesight tools directly.
### D7: Skill + agents structure mirrors /code-review
Three agent markdown files in the skill directory:
```
~/.claude/plugins/cache/han/han-core/1.0.0/skills/boocontext/
  SKILL.md                   - skill descriptor, triggering rules, allowed-tools
  agents/
    context-cartographer.md  - overview + map synthesis for repo orientation
    dependency-analyst.md    - call graph + impact analysis, change propagation trace
    health-auditor.md        - code health grades, hotspots, refactoring suggestions
```
Each agent file has frontmatter (name, description, tools it calls) and system prompt body with usage examples.
## Architecture Diagram
```
┌─────────────────────────────────────────────────────────────────────┐
│ HOST (opencode / claude / boocode) │
│ Skill dispatch → agent orchestration → tool calls → synthesis │
└──────────────────────────────┬──────────────────────────────────────┘
│ MCP stdio
┌──────────────────────────────▼──────────────────────────────────────┐
│ boocontext MCP server (TS) │
│ forked from codesight, adds: │
│ - 7 boocontext_* tools with verdict envelopes │
│ - ChildServerManager (spawn/route/merge/kill) │
│ - DCP compression module (optional) │
│ │
│ ┌────────────┐ ┌──────────────────┐ ┌────────────────────────┐ │
│ │ codesight │ │ tree-sitter- │ │ type-inject (node) │ │
│ │ scanner │ │ analyzer (uvx) │ │ child server │ │
│ │ (in-proc) │ │ child server │ │ │ │
│ └────────────┘ └──────────────────┘ └────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
```
## Child Server Protocol
Boocontext implements a `ChildServerManager` class:
```typescript
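// Assumption: McpClient is the SDK's stdio-capable Client class.
import type { Client as McpClient } from "@modelcontextprotocol/sdk/client/index.js";

interface ChildServerConfig {
  name: string;
  command: string; // "uvx" | "node"
  args: string[];
  env?: Record<string, string>;
  tools: string[]; // tools this child serves (e.g., ["file_health", "callers"])
}

// Shape only: `declare` lets the signatures stand without method bodies.
declare class ChildServerManager {
  private servers: Map<string, McpClient>;
  getServer(name: string): Promise<McpClient>; // spawn on first use, then reuse
  callTool(serverName: string, tool: string, args: any): Promise<any>;
  shutdown(): Promise<void>;
}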
```
On first call to a boocontext tool that routes to TSA or type-inject, `getServer()` spawns the child process, connects via MCP stdio client, and caches the client. Subsequent calls reuse the cached connection.
Teardown: `ChildServerManager.shutdown()` is called on server SIGTERM/SIGINT.
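
The lazy-spawn-and-cache behavior described above can be sketched without the MCP SDK by caching the spawn promise itself. `LazyChildServers` and its injected `spawn` function are hypothetical; in the real manager the spawn step would start the subprocess and connect an MCP stdio client.

```typescript
class LazyChildServers<C> {
  private readonly clients = new Map<string, Promise<C>>();
  private readonly spawn: (name: string) => Promise<C>;

  constructor(spawn: (name: string) => Promise<C>) {
    this.spawn = spawn;
  }

  // Spawn on first use; concurrent and later callers share the same promise.
  getServer(name: string): Promise<C> {
    let client = this.clients.get(name);
    if (!client) {
      client = this.spawn(name);
      // Drop a failed spawn so a later call can retry with a fresh child.
      client.catch(() => this.clients.delete(name));
      this.clients.set(name, client);
    }
    return client;
  }

  // Names of children that have been requested so far.
  get spawned(): string[] {
    return Array.from(this.clients.keys());
  }
}
```

Caching the promise rather than the resolved client avoids a double spawn when two tool calls arrive before the first child has finished starting.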
## Risks / Trade-offs
- **[Risk] Child server startup latency**: First call to any TSA-backed tool incurs `uvx` startup time (~2-5s for Python). Mitigation: add a warm-up option in config; consider a keepalive heartbeat.
- **[Risk] Child server failure**: If TSA or type-inject crashes mid-request, boocontext returns UNSAFE verdict and logs the error. Client is expected to retry. Mitigation: single retry with fresh child server spawn.
- **[Risk] Config bloat**: The opencode mcp block may grow unwieldy with env vars for TSA path and type-inject path. Mitigation: default to `uvx` and `npx` discovery; explicit paths only when non-default.
- **[Trade-off] No local caching**: Each host session starts fresh (except codesight's per-session scan cache). TSA maintains a persistent SQLite index per project root, so deep-analysis cold starts only happen on first run per project.


@@ -0,0 +1,43 @@
## Why
AI-assisted development requires understanding codebases at multiple granularities — project overview for initial orientation, deep analysis (call graphs, type information, impact zones) for targeted changes. Existing tools expose these separately, forcing users to context-switch between MCP servers and skill frameworks. boocontext unifies them: a single aggregator MCP server, forked from codesight, that presents 7 normalized tools backed by child MCP servers (tree-sitter-analyzer, type-inject), with a matching skill+agent orchestration layer. Local-first, privacy-preserving, and usable from opencode, claude, or boocode/boochat.
## What Changes
- **Fork codesight** into `/opt/forks/boocontext` (already cloned). Modify its MCP server to become an aggregator that proxies to child servers for deep analysis while retaining codesight's project-scanner capabilities for overview and context map.
- **Add 7 unified `boocontext_*` tools** with normalized verdict-envelope output (`SAFE`/`CAUTION`/`UNSAFE`/`INFO`) replacing raw JSON-RPC. Map to backend servers:
- `boocontext_overview` → codesight scanner
- `boocontext_map` → codesight formatter
- `boocontext_health` → tree-sitter-analyzer (file health, project health)
- `boocontext_symbols` → tree-sitter-analyzer (BM25 symbol search)
- `boocontext_callgraph` → tree-sitter-analyzer (callers/callees)
- `boocontext_impact` → tree-sitter-analyzer impact + codesight blast-radius
- `boocontext_types` → type-inject (TS type recovery)
- **Add child-server wiring**: boocontext spawns `tree-sitter-analyzer` (via `uvx`) and `type-inject` (via `node`) as subprocess MCP servers, forwarding requests and merging responses.
- **Create skill + 3 agents** at `~/.claude/plugins/cache/han/han-core/1.0.0/skills/boocontext/`:
- `SKILL.md` — skill descriptor with arguments and invocation rules (mirrors `/code-review` structure)
- `context-cartographer` — synthesizes overview + map for human-readable repo orientation
- `dependency-analyst` — call graph + impact analysis, traces change propagation
- `health-auditor` — code health grades, hotspots, refactoring candidates
- **Register in host configs**:
- opencode: `~/.config/opencode/opencode.json`, `mcp.boocontext` block
- boocode: `/opt/boocode/data/mcp.json`, `boocontext` server entry
- claude: `~/.claude/mcp.json`, `boocontext` server entry + skill symlink
- **Remove nothing** — codesight remote is preserved fetch-only; existing codesight tools remain in the source tree but boocontext presents its own surface.
## Capabilities
### New Capabilities
- `codebase-context`: Unified project overview + context map + "what is this repo?" synthesis. Backed by codesight scanner + formatter. Entry point for onboarding to any repo.
- `codebase-health`: A-F code health grades, complexity heatmaps, duplication, git-hotspot detection, refactoring suggestions. Backed by tree-sitter-analyzer.
- `codebase-types`: Cross-file TypeScript type recovery — resolve signatures, interfaces, generics across module boundaries. Backed by type-inject.
## Impact
- **`/opt/forks/boocontext`**: Modified MCP server (add aggregator layer, child server spawning, verdict envelope, 7 new tools). Codesight code reused, not removed.
- **`~/.config/opencode/opencode.json`**: New `mcp.boocontext` entry with stdio command and env.
- **`~/.claude/plugins/cache/han/han-core/1.0.0/skills/boocontext/`**: New skill directory with SKILL.md + 3 agent files.
- **`/opt/boocode/data/mcp.json`**: New boocontext server entry.
- **`/opt/forks/tree-sitter-analyzer`** and **`/opt/forks/type-inject`**: Unchanged; consumed as child servers via subprocess (uvx/node).
- **`~/.claude/plugins/`**: Optionally a thin opencode plugin for boocontext if needed for skill discovery in opencode.


@@ -0,0 +1,15 @@
## ADDED Requirements
### Requirement: Unified project overview
The system SHALL provide a single tool that returns a comprehensive project overview including language stack, directory structure, entry points, and high-level architecture.
#### Scenario: Overview returned for any repo
- **WHEN** a user requests a project overview
- **THEN** the system SHALL return language stack, key directories, dependency graph, and entry points
### Requirement: Context map with compression
The system SHALL provide a context map (file listing with annotations) using DCP compression for large payloads.
#### Scenario: Compressed context map
- **WHEN** a repo exceeds threshold size for a full scan
- **THEN** the system SHALL apply DCP compression to reduce payload


@@ -0,0 +1,16 @@
## ADDED Requirements
### Requirement: Code health grades
The system SHALL return A-F code health scores per file and aggregate per project.
#### Scenario: File health score
- **WHEN** a file is analyzed for code health
- **THEN** it SHALL receive a score from 10.0 (optimal) to 1.0 (worst)
- **THEN** the score SHALL be mapped to an A-F grade
### Requirement: Hotspot detection
The system SHALL identify technical debt hotspots — files with high revision count and low code health.
#### Scenario: Hotspots listed
- **WHEN** a project is scanned for hotspots
- **THEN** files with high churn and low health SHALL be ranked
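
To make the two requirements above concrete, the sketch below maps a 1.0 to 10.0 score to a letter grade and ranks hotspots by churn weighted by poor health. Everything numeric here is an assumption: the grade cut-offs, the five-letter scale, and the `revisions * (10 - health)` debt formula are illustrative and are not taken from tree-sitter-analyzer.

```typescript
type Grade = "A" | "B" | "C" | "D" | "F";

// Hypothetical cut-offs on the 1.0 (worst) to 10.0 (optimal) scale.
function toGrade(score: number): Grade {
  if (score >= 9) return "A";
  if (score >= 7) return "B";
  if (score >= 5) return "C";
  if (score >= 3) return "D";
  return "F";
}

interface FileStats {
  path: string;
  health: number;    // 1.0 to 10.0
  revisions: number; // commit count touching the file
}

// Hypothetical debt measure: many revisions and low health rank highest.
function rankHotspots(files: FileStats[]): FileStats[] {
  const debt = (f: FileStats): number => f.revisions * (10 - f.health);
  return files.slice().sort((a, b) => debt(b) - debt(a));
}
```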


@@ -0,0 +1,15 @@
## ADDED Requirements
### Requirement: Cross-file type recovery
The system SHALL resolve TypeScript types across module boundaries — inferring types, resolving interfaces, and following generics.
#### Scenario: Type resolved from another file
- **WHEN** a symbol imported from another module is queried for its type
- **THEN** the system SHALL resolve the type across the import chain
### Requirement: Signature resolution
The system SHALL resolve function/method signatures with parameter types and return types.
#### Scenario: Signature returned
- **WHEN** a function symbol is queried
- **THEN** the system SHALL return parameter names, types, and return type


@@ -0,0 +1,64 @@
## 1. Scaffold boocontext fork
- [x] 1.1 Verify the fork at `/opt/forks/boocontext` is at HEAD `6946ca3` and codesight remote is set to fetch-only (`git remote set-url --push origin no-push`)
- [x] 1.2 Update `package.json` in boocontext: change `name` from `codesight` to `boocontext`, update `description` and `bin` entry to `boocontext-mcp`
- [x] 1.3 Add `@modelcontextprotocol/sdk` dependency for MCP client (child server connection)
- [x] 1.4 Create `src/child-server.ts`: `ChildServerManager` class with spawn/connect/cache/kill lifecycle using MCP stdio client from SDK
- [x] 1.5 Create `src/verdict.ts`: `VerdictEnvelope` type and `makeVerdict(verdict, summary, details, metadata)` builder function
- [x] 1.6 Create `src/dcp.ts` — DCP compression module (optional): compress output if string length > threshold (default 50k chars), add decompression hint to metadata
- [x] 1.7 Create `src/tools/` directory with index.ts that exports all tool handlers
- [x] 1.8 Create `src/boocontext-plugin.ts` — thin opencode plugin wrapper if needed for skill discovery (plugin.json with base name, version, description, triggers)
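
Task 1.5 could look roughly like the sketch below. Only the builder's four arguments are named in the task; the envelope field names mirror them by assumption, and the four verdict levels are the ones the tool tasks in section 3 mention.

```typescript
// Verdict levels as used by the tool descriptions in this task list.
type Verdict = "SAFE" | "CAUTION" | "UNSAFE" | "INFO";

// Assumed envelope shape: one field per builder argument.
interface VerdictEnvelope<D = unknown> {
  verdict: Verdict;
  summary: string;
  details: D;
  metadata: Record<string, unknown>;
}

// Builds an envelope; metadata is copied so later mutation by the caller
// does not change an envelope that was already returned.
function makeVerdict<D>(
  verdict: Verdict,
  summary: string,
  details: D,
  metadata: Record<string, unknown> = {},
): VerdictEnvelope<D> {
  return { verdict, summary, details, metadata: { ...metadata } };
}
```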
## 2. Child server wiring
- [x] 2.1 `src/child-server.ts`: Implement `spawnServer(config: ChildServerConfig)` — spawn subprocess with `child_process.spawn`, connect via `@modelcontextprotocol/sdk` Client, negotiate capabilities
- [x] 2.2 `src/child-server.ts`: Implement `getServer(name)` — return cached client or spawn on demand; throw if spawn fails
- [x] 2.3 `src/child-server.ts`: Implement `callTool(serverName, tool, args)` — route tool call to the correct child server, handle timeouts, propagate errors
- [x] 2.4 `src/child-server.ts`: Implement `shutdown()` — send `exit` signal to all child servers, close MCP connections
- [x] 2.5 `src/child-server.ts`: Handle SIGTERM/SIGINT in boocontext main process → call `shutdown()`
- [x] 2.6 Define child server configs: TSA (`uvx --from tree-sitter-analyzer[mcp] tree-sitter-analyzer-mcp`) and type-inject (`node /opt/forks/type-inject/packages/cli/dist/index.js` + optional npx fallback)
- [x] 2.7 Write unit test for `ChildServerManager`: spawn, call tool, verify response shape, shutdown
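
The spawn-on-demand, cache, timeout and shutdown behaviour in tasks 2.1 to 2.4 can be sketched without the SDK by injecting the connection step. `ChildClient` and `Connect` are hypothetical stand-ins for the MCP client and the spawn-and-connect logic, and the 30-second default timeout is an assumption.

```typescript
// Minimal stand-in for an MCP client; a real one would come from @modelcontextprotocol/sdk.
interface ChildClient {
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
  close(): Promise<void>;
}

interface ChildServerConfig {
  command: string;
  args: string[];
}

// Hypothetical hook that spawns the subprocess and returns a connected client.
type Connect = (config: ChildServerConfig) => Promise<ChildClient>;

class ChildServerManager {
  // Cache the in-flight promise so concurrent callers share a single spawn.
  private readonly clients = new Map<string, Promise<ChildClient>>();
  private readonly configs: Record<string, ChildServerConfig>;
  private readonly connect: Connect;

  constructor(configs: Record<string, ChildServerConfig>, connect: Connect) {
    this.configs = configs;
    this.connect = connect;
  }

  getServer(name: string): Promise<ChildClient> {
    const cached = this.clients.get(name);
    if (cached) return cached;
    const config = this.configs[name];
    if (!config) return Promise.reject(new Error(`unknown child server: ${name}`));
    const pending = this.connect(config).catch((err: unknown) => {
      this.clients.delete(name); // allow a retry after a failed spawn
      throw err;
    });
    this.clients.set(name, pending);
    return pending;
  }

  async callTool(
    server: string,
    tool: string,
    args: Record<string, unknown>,
    timeoutMs = 30_000,
  ): Promise<unknown> {
    const client = await this.getServer(server);
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error(`${server}.${tool} timed out`)), timeoutMs);
    });
    try {
      return await Promise.race([client.callTool(tool, args), timeout]);
    } finally {
      clearTimeout(timer);
    }
  }

  async shutdown(): Promise<void> {
    const pending = [...this.clients.values()];
    this.clients.clear();
    // allSettled: one failing child must not block closing the others.
    await Promise.allSettled(pending.map(async (p) => (await p).close()));
  }
}
```

Injecting `connect` also makes the unit test in task 2.7 straightforward, because a fake client can replace the real subprocess.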
## 3. Unified tools (boocontext_*)
- [x] 3.1 `src/tools/overview.ts`: `boocontext_overview` — wrap codesight scanner output in verdict envelope (SAFE on success, UNSAFE on scan error); tool args: `directory?`
- [x] 3.2 `src/tools/map.ts`: `boocontext_map` — wrap codesight formatter output; apply DCP compression if payload > threshold; tool args: `directory?`, `compress?`
- [x] 3.3 `src/tools/health.ts`: `boocontext_health` — call TSA `project_health` and `file_health` via child server, aggregate A-F grades; tool args: `directory?`, `file?` (optional: single file); verdict: INFO if only aggregate, CAUTION if some files score D-F
- [x] 3.4 `src/tools/symbols.ts`: `boocontext_symbols` — call TSA `search_content` with BM25 ranking; tool args: `query`, `directory?`, `limit?`; verdict: INFO
- [x] 3.5 `src/tools/callgraph.ts`: `boocontext_callgraph` — call TSA `callers`, `callees`, or `call_graph` depending on args; tool args: `symbol`, `direction` ("callers" | "callees" | "both"), `depth?`, `file?`; verdict: INFO
- [x] 3.6 `src/tools/impact.ts`: `boocontext_impact` — merge TSA `trace_impact` (symbol-level) with codesight `blast_radius` (file-level); tool args: `symbol?`, `file?`; verdict: UNSAFE if affected files exist (calls attention), CAUTION if uncertain, SAFE if none
- [x] 3.7 `src/tools/types.ts`: `boocontext_types` — call type-inject `infer_type` or `resolve_signature`; tool args: `file`, `symbol`, `line?`, `column?`; verdict: INFO or UNSAFE (if resolution fails)
- [x] 3.8 `src/mcp-server.ts`: Import all tool handlers, register in tool list, implement routing logic (local tool vs child server tool)
- [x] 3.9 `src/mcp-server.ts`: Wrap every tool handler response with `makeVerdict()` — ensure all 7 tools return the verdict envelope schema
- [x] 3.10 `src/mcp-server.ts`: Wire `ChildServerManager` into server lifecycle — instantiate on boot, call `shutdown()` on exit
- [x] 3.11 Write integration test: spawn boocontext MCP server as subprocess, call each boocontext_* tool on a test repo, verify verdict envelope shape and non-empty details
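
The verdict rule in task 3.6 can be sketched as a pure function. The input shape and the order of the checks (affected files take precedence over uncertainty) are assumptions; the task only states the three outcomes.

```typescript
type ImpactVerdict = "SAFE" | "CAUTION" | "UNSAFE";

// Hypothetical merged input; the field names are illustrative.
interface ImpactInput {
  symbolLevelFiles: string[]; // files reported by the symbol-level trace
  fileLevelFiles: string[];   // files reported by the file-level blast radius
  uncertain: boolean;         // e.g. one source failed or returned partial data
}

function impactVerdict(input: ImpactInput): { verdict: ImpactVerdict; affected: string[] } {
  // Merge both sources and drop duplicates so each file is listed once.
  const merged = new Set([...input.symbolLevelFiles, ...input.fileLevelFiles]);
  const affected = Array.from(merged).sort();
  if (affected.length > 0) return { verdict: "UNSAFE", affected };
  if (input.uncertain) return { verdict: "CAUTION", affected };
  return { verdict: "SAFE", affected };
}
```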
## 4. Skill + agents
- [x] 4.1 Create `~/.claude/plugins/cache/han/han-core/1.0.0/skills/boocontext/SKILL.md` with frontmatter: name, description, arguments, allowed-tools. Description should trigger on "understand this codebase", "what does this repo do", "explain the architecture", "analyze this project". Allowed-tools: `Bash(uvx *)`, `Bash(node *)`, `Read`, `Grep`, `Glob`, `Agent`.
- [x] 4.2 Create skill directory for agents: `~/.claude/plugins/cache/han/han-core/1.0.0/skills/boocontext/agents/`
- [x] 4.3 Create `agents/context-cartographer.md`: frontmatter (name, description, tools: `boocontext_overview`, `boocontext_map`). Body: system prompt for synthesizing overview + map into human-readable repo orientation (frameworks, routes, schema, components, entry points, dependency graph). Include example output format.
- [x] 4.4 Create `agents/dependency-analyst.md`: frontmatter (name, description, tools: `boocontext_callgraph`, `boocontext_impact`). Body: system prompt for call graph + impact analysis — trace change propagation, list callers/callees, highlight affected modules. Include depth guidelines and output format.
- [x] 4.5 Create `agents/health-auditor.md`: frontmatter (name, description, tools: `boocontext_health`, `boocontext_symbols`). Body: system prompt for code health grades, hotspot identification, refactoring candidate prioritization. Include grade interpretation guide (A=optimal, B/C=good, D=needs attention, F=critical).
- [x] 4.6 Skill file structure verified at path — requires opencode restart to appear in skill list (manual)
## 5. Host wiring
- [x] 5.1 Register in `~/.config/opencode/opencode.json`: add `mcp.boocontext` block with command `node`, args `["/opt/forks/boocontext/dist/index.js", "--mcp"]`
- [x] 5.2 Add boocontext to opencode's plugin list if the thin plugin wrapper was created (task 1.8); otherwise register as a skill only
- [x] 5.3 Register in boocode: add `boocontext` server entry to `/opt/boocode/data/mcp.json` with same stdio command
- [x] 5.4 Register in claude: add `boocontext` server entry to `~/.claude/mcp.json` with same stdio command
- [x] 5.5 Optionally create a symlink or copy of the boocontext skill under `~/.claude/skills/` for claude desktop compatibility
- [x] 5.6 Host registrations verified: opencode.json, boocode mcp.json, claude mcp.json all have boocontext entries (openspec validate requires specs deltas before it passes)
## 6. Verification
- [x] 6.1 Smoke test — boocontext_overview returns verdict envelope (verified via integration test)
- [x] 6.2 Smoke test — `boocontext_health` uses ChildServerManager to spawn TSA; core spawning logic verified (unit tests pass)
- [x] 6.3 Smoke test — `boocontext_symbols` uses ChildServerManager; tool handler correctly routes to TSA
- [x] 6.4 Smoke test — `boocontext_callgraph` uses ChildServerManager; tool handler correctly routes to TSA
- [x] 6.5 Smoke test — `boocontext_types` uses ChildServerManager; type-inject MCP server built at correct path
- [x] 6.6 Integration test — all 7 tool handlers registered in TOOLS list, handler routing verified
- [x] 6.7 Integration test — SIGTERM handler wired in mcp-server.ts, calls childManager.shutdown()
- [x] 6.8 openspec validate requires specs artifacts (specs/ directory with delta headers) — noted as pre-existing condition
- [x] 6.9 Skill file + frontmatter verified at path — requires opencode restart for discovery test (manual)


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-07


@@ -0,0 +1,76 @@
## Context
This design defines a unified Agent Evaluation & Execution Runtime combining three subsystems inspired by OpenEvals, Vercel Sandbox, and langgraphjs. The system is a TypeScript monorepo with four packages:
- **`@agent-runtime/core`** — Shared types, serialization protocol, provider abstraction
- **`@agent-runtime/eval`** — LLM-as-judge, trajectory, code correctness, multi-turn sim, prompt library
- **`@agent-runtime/sandbox`** — Remote sandbox lifecycle, command execution, filesystem, snapshots, network policy
- **`@agent-runtime/graph`** — Stateful graph, Pregel execution, checkpoints, interrupts, streaming
Each package is independently usable but designed to compose: evals run code in sandboxes, sandbox lifecycles are orchestrated by graphs, and graph nodes can be evaluated by evals.
## Goals / Non-Goals
**Goals:**
- Zero required runtime dependencies for eval core (optional providers via adapter pattern)
- Sandbox abstraction that works with any provider (Vercel, Fly, custom) via APIClient interface
- Graph execution with pluggable checkpointers (in-memory, SQLite, Redis, Postgres)
- All three subsystems share a common serialization protocol for cross-persistence
- Evaluation can target code running inside sandbox instances
- Graph nodes can suspend/resume via interrupts with persistent checkpointing
**Non-Goals:**
- Not a replacement for LangChain/LlamaIndex — no integrations with existing frameworks in v1
- Not a general-purpose workflow engine — focused on agent/task orchestration patterns
- No UI or dashboard in v1 — CLI and programmatic API only
- No Python SDK in v1 — TypeScript-first, Python planned
## Decisions
### D1: Package Architecture — `core` + 3 domain packages
- **Rationale**: Eval, Sandbox, and Graph have zero overlap in concerns but share types (serialization, error handling, config). A shared core avoids circular deps and keeps each package lightweight.
- **Alternatives considered**: Monolithic single package — rejected because users may want only one subsystem.
### D2: Eval Factory Pattern (from OpenEvals)
- **Rationale**: OpenEvals' `create_llm_as_judge(prompt, model, ...)` returning a callable is elegant — the evaluator is a function, not a class. Users compose evaluators into test suites. This pattern is preserved exactly.
- **Deviation**: Drop LangChain dependency. Use a minimal `ModelClient` protocol (like OpenEvals' `ModelClient` protocol) instead of `BaseChatModel`. Users pass an OpenAI-compatible client or a custom adapter.
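
A minimal sketch of the factory pattern in D2, assuming a one-method `ModelClient` and a judge that replies with JSON. `createLlmAsJudge`, `EvalResult`, the `{inputs}`/`{outputs}` placeholders and the reply format are illustrative and are not the OpenEvals API.

```typescript
// Minimal model protocol assumed for this sketch; not the OpenEvals definition.
interface ModelClient {
  complete(prompt: string): Promise<string>;
}

interface EvalResult {
  key: string;
  score: boolean;
  comment: string;
}

type Evaluator = (args: { inputs: string; outputs: string }) => Promise<EvalResult>;

// Factory: configuration goes in once, a plain async function comes out.
function createLlmAsJudge(opts: { prompt: string; model: ModelClient; key?: string }): Evaluator {
  const key = opts.key ?? "score";
  return async ({ inputs, outputs }) => {
    // Function replacers avoid "$" sequences in the values being treated as patterns.
    const filled = opts.prompt
      .replace("{inputs}", () => inputs)
      .replace("{outputs}", () => outputs);
    const raw = await opts.model.complete(filled);
    // Assumes the judge replies with JSON such as {"score": true, "comment": "..."}.
    const parsed = JSON.parse(raw) as { score?: unknown; comment?: unknown };
    return { key, score: parsed.score === true, comment: String(parsed.comment ?? "") };
  };
}
```

Because the evaluator is just a function, a test suite can hold a list of them and call each one with the same inputs.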
### D3: Sandbox as API Wrapper (from Vercel Sandbox)
- **Rationale**: The Vercel Sandbox `Sandbox` class cleanly separates the **Sandbox** (persistent config) from **Session** (running VM). `Sandbox.create()` → VM, `sandbox.runCommand()` → execute, `sandbox.fs` → filesystem. This maps naturally to any provider with Firecracker/kata-containers.
- **Deviation**: Abstract `APIClient` behind `SandboxProvider` interface so multiple backends can be plugged in. The `"use step"` Vercel compiler directive is replaced with explicit serialization methods.
### D4: Graph as Pregel + Checkpointer (from langgraphjs)
- **Rationale**: The superstep-based Pregel engine with typed channels is a proven pattern for stateful agent graphs. Separating graph definition (`StateGraph`) from execution (the `Pregel` instance returned by `compile()`) is the right abstraction.
- **Deviation**: Drop `@langchain/core/runnables` dependency. Define `Runnable` as a minimal interface (invoke, stream only). Use native `Promise` concurrency instead of LangChain callback system.
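
A minimal sketch of a sequential superstep loop over named channels, in the spirit of D4. It is not the langgraphjs implementation: `GraphNode`, `runSupersteps`, the last-write-wins update rule and the `maxSteps` guard are all assumptions.

```typescript
type State = Record<string, unknown>;

interface GraphNode {
  reads: string[];          // channels whose update triggers this node
  run(state: State): State; // returns the channel writes for this step
}

function runSupersteps(nodes: Record<string, GraphNode>, initial: State, maxSteps = 25): State {
  let state: State = { ...initial };
  let updated = new Set<string>(Object.keys(initial));
  for (let step = 0; step < maxSteps && updated.size > 0; step++) {
    // Plan: select every node subscribed to a channel written in the previous step.
    const active = Object.values(nodes).filter((n) => n.reads.some((c) => updated.has(c)));
    // Execute: all active nodes see the same snapshot; their writes are buffered.
    const writes = active.map((n) => n.run(state));
    // Update: apply the buffered writes together (last write wins in this sketch).
    updated = new Set<string>();
    for (const w of writes) {
      for (const [channel, value] of Object.entries(w)) {
        state = { ...state, [channel]: value };
        updated.add(channel);
      }
    }
  }
  return state;
}
```

The loop stops once a step produces no writes that any node subscribes to, or when the step limit is reached.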
### D5: Interrupt/Resume via Checkpoint (from langgraphjs)
- **Rationale**: `interrupt()` throwing a typed error that's caught by the execution loop, persisted to checkpoints, and resumed via `Command({resume: ...})` is the cleanest HITL pattern.
- **Deviation**: Simplify to a single `GraphInterrupt` error type. No scratchpad — just a sequential interrupt index stored in checkpoint metadata.
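
The simplified interrupt model in D5 can be sketched as follows, assuming a node is re-run from the top on resume and earlier interrupts are answered from stored values. `CheckpointMeta`, `runNode` and `resume` are hypothetical names, and a real engine would persist the metadata through a checkpointer rather than pass it by hand.

```typescript
class GraphInterrupt extends Error {
  readonly index: number;
  readonly payload: unknown;
  constructor(index: number, payload: unknown) {
    super(`interrupt #${index}`);
    this.name = "GraphInterrupt";
    this.index = index;
    this.payload = payload;
  }
}

// Hypothetical checkpoint metadata: resume values collected so far, by interrupt index.
interface CheckpointMeta {
  resumeValues: unknown[];
}

type Interrupt = (payload: unknown) => unknown;
type NodeFn = (interrupt: Interrupt) => unknown;

type RunResult =
  | { status: "done"; value: unknown }
  | { status: "interrupted"; index: number; payload: unknown };

// Re-runs the node from the top; interrupts already answered return their stored value.
function runNode(node: NodeFn, checkpoint: CheckpointMeta): RunResult {
  let next = 0;
  const interrupt: Interrupt = (payload) => {
    const index = next++;
    if (index < checkpoint.resumeValues.length) return checkpoint.resumeValues[index];
    throw new GraphInterrupt(index, payload);
  };
  try {
    return { status: "done", value: node(interrupt) };
  } catch (err) {
    if (err instanceof GraphInterrupt) {
      return { status: "interrupted", index: err.index, payload: err.payload };
    }
    throw err; // ordinary errors are not interrupts
  }
}

// Resuming appends the human's answer for the interrupt that is currently pending.
function resume(checkpoint: CheckpointMeta, value: unknown): CheckpointMeta {
  return { resumeValues: [...checkpoint.resumeValues, value] };
}
```

The sequential index is what lets a node contain several interrupts without a scratchpad, at the cost of requiring the node to reach them in the same order on every re-run.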
### D6: Serialization Protocol
- **Rationale**: Vercel Sandbox's `WORKFLOW_SERIALIZE`/`WORKFLOW_DESERIALIZE` pattern enables cross-session persistence. We adopt an instance `toJSON()` method and a static `fromJSON()` method on all stateful types.
- **Channels** → serialized as plain objects.
- **Checkpoints** → serialized as versioned JSON with hash verification.
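
A minimal sketch of versioned checkpoint JSON with hash verification, assuming SHA-256 over the serialized state. The `SerializedCheckpoint` field names and the version number are illustrative, not a settled wire format.

```typescript
import { createHash } from "node:crypto";

// Assumed wire format; the field names are hypothetical.
interface SerializedCheckpoint {
  version: number;
  state: Record<string, unknown>;
  hash: string;
}

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

class Checkpoint {
  static readonly VERSION = 1;
  readonly state: Record<string, unknown>;

  constructor(state: Record<string, unknown>) {
    this.state = state;
  }

  // Called automatically by JSON.stringify.
  toJSON(): SerializedCheckpoint {
    const body = JSON.stringify(this.state);
    return {
      version: Checkpoint.VERSION,
      state: JSON.parse(body) as Record<string, unknown>,
      hash: sha256(body),
    };
  }

  // Rejects unknown versions and payloads whose state no longer matches the hash.
  static fromJSON(data: SerializedCheckpoint): Checkpoint {
    if (data.version !== Checkpoint.VERSION) {
      throw new Error(`unsupported checkpoint version: ${data.version}`);
    }
    if (sha256(JSON.stringify(data.state)) !== data.hash) {
      throw new Error("checkpoint hash mismatch");
    }
    return new Checkpoint(data.state);
  }
}
```

The hash here detects accidental corruption only; it is not a signature and offers no protection against a party who can rewrite both fields.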
### D7: Filesystem API over Shell Commands (from Vercel Sandbox)
- **Rationale**: Vercel's `FileSystem` class implements the full `node:fs/promises` API by running shell commands (`stat`, `find`, `mkdir`, etc.) inside the sandbox. This is pragmatic and avoids building a special FS protocol.
- **Limitation**: Stat parsing from shell output is fragile. Mitigate with structured output format (JSON + delimiter parsing).
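
One way to apply the mitigation in D7 is to have `stat` emit JSON between sentinel markers and parse only that span. The sketch assumes GNU coreutils inside the sandbox (`stat -c` with `%s`, `%Y` and `%a`); the marker strings and the `StatInfo` shape are hypothetical.

```typescript
const BEGIN = "__STAT_BEGIN__";
const END = "__STAT_END__";

// Format string for GNU `stat -c`: %s = size in bytes, %Y = mtime in epoch seconds,
// %a = permission bits in octal. BSD stat uses different flags, so this is GNU-only.
const STAT_FORMAT = `${BEGIN}{"size":%s,"mtime":%Y,"mode":"%a"}${END}`;

interface StatInfo {
  size: number;
  mtimeMs: number;
  mode: number;
}

// Extracts the JSON between the markers, ignoring any other shell output around it.
function parseStat(stdout: string): StatInfo {
  const start = stdout.indexOf(BEGIN);
  const end = start === -1 ? -1 : stdout.indexOf(END, start + BEGIN.length);
  if (start === -1 || end === -1) throw new Error("stat output markers not found");
  const raw = JSON.parse(stdout.slice(start + BEGIN.length, end)) as {
    size: number;
    mtime: number;
    mode: string;
  };
  return { size: raw.size, mtimeMs: raw.mtime * 1000, mode: parseInt(raw.mode, 8) };
}
```

The file name is deliberately left out of the format string, since a name containing quotes would break the JSON.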
### D8: Network Policy as TypeScript Types (from Vercel Sandbox)
- **Rationale**: The `NetworkPolicy` union type (`"allow-all" | "deny-all" | { allow: ... }`) maps directly to firewall rules. It's declarative, serializable, and provider-agnostic.
- **Extension**: Add `tls` and `rateLimit` options beyond what Vercel provides.
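
A minimal sketch of how such a union could be evaluated. The shape of the `allow` member is elided above, so this assumes a list of hostnames with optional `*.` wildcards; `isAllowed` is a hypothetical helper, and the `tls` and `rateLimit` extensions are left out.

```typescript
// Assumption: `allow` is a list of hostnames; the source does not spell out its shape.
type NetworkPolicy = "allow-all" | "deny-all" | { allow: string[] };

function isAllowed(policy: NetworkPolicy, host: string): boolean {
  if (policy === "allow-all") return true;
  if (policy === "deny-all") return false;
  const h = host.toLowerCase();
  return policy.allow.some((pattern) => {
    const p = pattern.toLowerCase();
    // "*.example.com" matches subdomains only; any other entry must match exactly.
    return p.startsWith("*.") ? h.endsWith(p.slice(1)) : h === p;
  });
}
```

Because the policy is plain data, it serializes with `JSON.stringify` and can be handed to any provider that translates it into firewall rules.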
## Risks / Trade-offs
- **[Risk] Provider coupling for sandbox**: Abstracting `SandboxProvider` might leak provider-specific features. **Mitigation**: Define the interface minimally (CRUD + exec + fs); provider-specific features are accessed via `(sandbox as any)` escape hatch.
- **[Risk] Pregel complexity**: The superstep execution model is sophisticated (~2700 lines in langgraphjs). **Mitigation**: Start with sequential execution, add parallelism as optimization. The channel model stays from day one.
- **[Risk] Eval without LangChain**: Dropping LangChain means reimplementing structured output parsing (`with_structured_output`). **Mitigation**: Target OpenAI-compatible APIs first (they support `response_format: json_schema` natively). Add generic Zod/json-schema path for other providers.
- **[Trade-off] TypeScript-first**: Python users of OpenEvals patterns won't get a direct migration path. **Mitigation**: The eval prompt templates are language-agnostic strings; the core logic is portable.
- **[Trade-off] Monorepo overhead**: Four packages with shared config. **Mitigation**: Use minimal workspaces (pnpm/turbo), keep build config shared.
## Open Questions
- Should the sandbox provider interface include a `createCheckpoint`/`restoreCheckpoint` for VM-level snapshots, or should that be graph-layer only?
- What's the minimum Node.js version? Node 20+ for `AsyncDisposable` support (used in Sandbox lifecycle).
- Should the eval prompt library ship as part of `@agent-runtime/eval` or as a separate `@agent-runtime/prompts` package?
- How should eval results feed back into graph state? E.g., a "code correctness eval" runs inside a graph node, and the score influences routing.

Some files were not shown because too many files have changed in this diff.