From 028c08b4cd6f0b0bac8715c655e6efa2b8b7f7d6 Mon Sep 17 00:00:00 2001
From: indifferentketchup
Date: Sun, 7 Jun 2026 21:34:35 +0000
Subject: [PATCH] docs: add openspec proposals for memory v2 and orchestrator flow patterns

---
 .../changes/memory-v2-hybrid-search/design.md | 60 ++++++++++++++++
 .../memory-v2-hybrid-search/proposal.md       | 39 +++++++++++
 .../changes/memory-v2-hybrid-search/tasks.md  | 30 ++++++++
 .../orchestrator-flow-advanced/design.md      | 70 +++++++++++++++++++
 .../orchestrator-flow-advanced/proposal.md    | 42 +++++++++++
 .../orchestrator-flow-advanced/tasks.md       | 40 +++++++++++
 6 files changed, 281 insertions(+)
 create mode 100644 openspec/changes/memory-v2-hybrid-search/design.md
 create mode 100644 openspec/changes/memory-v2-hybrid-search/proposal.md
 create mode 100644 openspec/changes/memory-v2-hybrid-search/tasks.md
 create mode 100644 openspec/changes/orchestrator-flow-advanced/design.md
 create mode 100644 openspec/changes/orchestrator-flow-advanced/proposal.md
 create mode 100644 openspec/changes/orchestrator-flow-advanced/tasks.md

diff --git a/openspec/changes/memory-v2-hybrid-search/design.md b/openspec/changes/memory-v2-hybrid-search/design.md
new file mode 100644
index 0000000..1576dbb
--- /dev/null
+++ b/openspec/changes/memory-v2-hybrid-search/design.md
@@ -0,0 +1,60 @@
+# Memory v2 — Design
+
+## Architecture
+
+```
+                 ┌─────────────────────────┐
+                 │  system-prompt.ts       │
+                 │  (inject memory block)  │
+                 └────────┬────────────────┘
+                          │
+                ┌─────────▼──────────┐
+                │ memory/recall.ts   │
+                │ (renamed to query) │
+                └─────────┬──────────┘
+                          │
+          ┌───────────────┼───────────────┐
+          │               │               │
+ ┌────────▼──────┐  ┌─────▼──────┐ ┌──────▼───────┐
+ │  BM25Ranker   │  │ EmbedCache │ │ CosineRanker │
+ │  (stateless)  │  │ (LRU map)  │ │    (ONNX)    │
+ └───────────────┘  └────────────┘ └──────────────┘
+```
+
+## Module Changes
+
+### `apps/server/src/services/memory/` — new/changed files
+
+| File | Change |
+|------|--------|
+| `recall.ts` | Replace `rankByRelevance` with hybrid `rankByHybrid(query, entries)` |
+| `embeddings.ts` | **New** — ONNX model loader + `embed(texts: string[]): number[][]` |
+| `bm25.ts` | **New** — BM25 scorer with `score(query, doc): number` |
+| `ranker.ts` | **New** — weighted merge of BM25 + cosine scores |
+| `entries.ts` | Add `serializeForEmbedding(entry): string` helper |
+
+### Embedding Model
+
+- Model: `all-MiniLM-L6-v2` (384-dim, ~23MB ONNX)
+- Runtime: `onnxruntime-node` npm package or subprocess via `node:child_process`
+- Cache: `Map` in-memory, cleared on process restart
+- Fallback: BM25-only when model file is missing
+
+### Agent Tools (new)
+
+| Tool | Description |
+|------|-------------|
+| `extract_memory(topic, title, content, tags?)` | Persists a memory entry. Topic must be one of project/user/reference |
+| `search_memory(query)` | Returns up to 10 ranked memory entries matching the query. Replaces blind injection |
+
+### Scoring Formula
+
+```
+score = (BM25_score * 0.3) + (cosine_similarity * 0.7)
+```
+
+Both normalized to [0,1] before merging. Entries below threshold (0.15) are excluded.
+
+## Rollback
+
+Set `MEMORY_SEARCH=keyword` env var to fall back to the v1 keyword-only path. Default is `hybrid`.
diff --git a/openspec/changes/memory-v2-hybrid-search/proposal.md b/openspec/changes/memory-v2-hybrid-search/proposal.md
new file mode 100644
index 0000000..a116d39
--- /dev/null
+++ b/openspec/changes/memory-v2-hybrid-search/proposal.md
@@ -0,0 +1,39 @@
+# Memory v2 — Hybrid Search & Auto-Extract
+
+**Status:** Proposed
+**Epic:** memory-v2-hybrid-search
+**Depends on:** v2.8.0-fork-lifts (v1 memory already shipped)
+
+## Why
+
+v1 memory (shipped in v2.8.0-fork-lifts) provides file-based recall with keyword/tag matching injected into `system-prompt.ts`. It works but has three gaps:
+
+1. **Keyword-only recall misses semantic matches** — "indentation" won't match a memory entry titled "Code style: tabs vs spaces" unless the word "indentation" appears verbatim.
+2. **No auto-extraction** — memory files must be created manually. The LLM can't persist useful facts it discovers during conversation.
+3. **Flat search, no ranking** — all keyword matches are equally weighted. No relevance scoring or deduplication.
+
+v2 upgrades the retrieval layer while keeping the file-based storage format. No breaking changes to `.boocode/memory/` structure.
+
+## What Changes
+
+### Hybrid Search (high confidence)
+Replace keyword-only `rankByRelevance` with BM25 + embedding hybrid search. Use a tiny local embedding model (all-MiniLM-L6-v2 through ONNX runtime or a local subprocess) so there's no external API dependency.
+
+- **BM25** (already implementable without deps — term frequency + inverse document frequency scoring on the memory entries)
+- **Embedding** (local ONNX model, ~20MB, runs inference in ~5ms on CPU, produces 384-dim vectors)
+- **Weighted merge** (`score = 0.3 * bm25 + 0.7 * cosine`) — configurable ratio
+
+### Auto-Extract Agent Tool (medium confidence)
+A new `extract_memory` tool exposed to agents (not automatic — agent decides when to persist):
+
+- `extract_memory(topic, title, content, tags)` → writes a markdown entry
+- `search_memory(query)` → returns ranked memory entries (new tool, replaces raw injection)
+
+### In-Memory Embedding Cache (optional)
+Keep embeddings in an LRU map keyed by file mtime. Recompute only when files change. No DB migration needed.
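The weighted merge listed under Hybrid Search could look roughly like the sketch below. It is a minimal illustration, not the proposed `ranker.ts`: the names `ScoredEntry`, `RankedEntry`, and `mergeScores` are hypothetical, and the normalization (BM25 divided by the best BM25 score in the candidate set, cosine clamped to [0, 1]) is an assumption, since the docs say only that both scores are normalized to [0, 1]. The 0.15 threshold and the limit of 10 results are taken from `design.md`.

```typescript
// Hypothetical input shape: raw scores for one memory entry.
interface ScoredEntry {
  id: string;
  bm25: number; // raw BM25 score, assumed non-negative and unbounded
  cosine: number; // raw cosine similarity in [-1, 1]
}

// Hypothetical output shape.
interface RankedEntry {
  id: string;
  score: number; // merged score in [0, 1]
}

// Assumed normalization: BM25 is divided by the best BM25 score among the
// candidates; cosine similarity is clamped to [0, 1].
function mergeScores(
  entries: ScoredEntry[],
  bm25Weight: number = 0.3,
  threshold: number = 0.15,
  limit: number = 10,
): RankedEntry[] {
  const cosineWeight = 1 - bm25Weight;
  const maxBm25 = entries.reduce((max, e) => Math.max(max, e.bm25), 0);

  return entries
    .map((e): RankedEntry => {
      const bm25Norm = maxBm25 > 0 ? Math.max(0, e.bm25) / maxBm25 : 0;
      const cosineNorm = Math.min(1, Math.max(0, e.cosine));
      return { id: e.id, score: bm25Weight * bm25Norm + cosineWeight * cosineNorm };
    })
    .filter((r) => r.score >= threshold) // drop entries below the threshold
    .sort((a, b) => b.score - a.score) // best match first
    .slice(0, limit);
}
```

Under these assumptions, calling `mergeScores(entries, 1)` sets the cosine weight to zero, which is one way the BM25-only fallback could reuse the same code path when the model file is missing.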
+
+## Non-Goals
+- No vector database (SQLite FTS5 or in-memory BM25 suffice)
+- No automatic background extraction agent (agent must explicitly call `extract_memory`)
+- No changes to the `.boocode/memory/` file format
+- No Python dependencies — ONNX runtime is a Node.js native addon or subprocess
diff --git a/openspec/changes/memory-v2-hybrid-search/tasks.md b/openspec/changes/memory-v2-hybrid-search/tasks.md
new file mode 100644
index 0000000..2bc535a
--- /dev/null
+++ b/openspec/changes/memory-v2-hybrid-search/tasks.md
@@ -0,0 +1,30 @@
+# Tasks — Memory v2
+
+## Prerequisites
+- v2.8.0 on main (v1 memory module shipped)
+
+## Tasks
+
+### 1. BM25 ranker
+- [ ] 1.1 Write `bm25.ts` — pure function, no deps. BM25Okapi formula: `sum over terms of IDF * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * docLen / avgDocLen))`
+- [ ] 1.2 Unit tests with known corpus
+
+### 2. Embedding module
+- [ ] 2.1 Write `embeddings.ts` — load ONNX model, `embed(texts: string[]): number[][]`
+- [ ] 2.2 Write `ranker.ts` — cosine similarity + BM25 weighted merge
+- [ ] 2.3 Fallback to BM25-only when model unavailable
+
+### 3. Hybrid recall
+- [ ] 3.1 Refactor `recall.ts` — `rankByRelevance` → `rankByHybrid` using BM25 + embedding when available
+- [ ] 3.2 Keep keyword-only path as `MEMORY_SEARCH=keyword` env fallback
+- [ ] 3.3 Server tests pass
+
+### 4. Agent tools
+- [ ] 4.1 Create `extract_memory` tool — persists entry, returns path
+- [ ] 4.2 Create `search_memory` tool — replaces raw injection when used
+- [ ] 4.3 Tool tests pass
+
+### 5. Smoke
+- [ ] 5.1 Create `.boocode/memory/project/style.md` with "Use two-space indentation"
+- [ ] 5.2 `search_memory("what spacing convention")` returns the entry
+- [ ] 5.3 `extract_memory("project", "Naming", "PascalCase for components")` creates the file
diff --git a/openspec/changes/orchestrator-flow-advanced/design.md b/openspec/changes/orchestrator-flow-advanced/design.md
new file mode 100644
index 0000000..b99b4b5
--- /dev/null
+++ b/openspec/changes/orchestrator-flow-advanced/design.md
@@ -0,0 +1,70 @@
+# Orchestrator Advanced Flows — Design
+
+## Architecture
+
+```
+┌───────────── Step dispatch ─────────────────┐
+│                                              │
+│  Flow-runner resolves step:                  │
+│  1. Check trigger_rule on deps               │
+│  2. Substitute $vars in prompt               │
+│  3. If approval gate → pause for user        │
+│  4. INSERT task row → dispatcher picks up    │
+│  5. On terminal: append to event log         │
+│  6. Advance next ready step                  │
+│                                              │
+└──────────────────────────────────────────────┘
+```
+
+## Type Changes
+
+### `apps/coder/src/conductor/types.ts`
+
+```typescript
+export type TriggerRule = 'all_success' | 'one_success' | 'all_done';
+
+export interface Step {
+  id: string;
+  kind: StepKind | 'approval'; // + new kind
+  deps?: string[];
+  trigger_rule?: TriggerRule; // NEW: default 'all_success'
+  agent?: string;
+  run: (ctx: StepContext) => string | Promise<string>;
+  when?: (ctx: StepContext) => boolean;
+}
+```
+
+### `apps/coder/src/services/flow-runner.ts`
+
+| Change | Detail |
+|--------|--------|
+| Trigger evaluation | Before dispatching a step, check deps statuses against `trigger_rule`. Skip if conditions not met |
+| Variable substitution | Scan prompt for `$word.word` patterns, resolve from previous step outputs |
+| Approval gate | When `step.kind === 'approval'`, insert a `tasks` row with `state='blocked'` and publish a `permission_requested` WS frame. Wait for `permission_resolved` to unblock |
+| Event log | Append-only per-step events: `{ step_id, event: 'started'\|'completed'\|'failed'\|'paused'\|'resumed'\|'skipped', at: timestamp }` in `flow_step_events` table |
+
+## Schema
+
+```sql
+CREATE TABLE IF NOT EXISTS flow_step_events (
+  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+  run_id UUID NOT NULL REFERENCES flow_runs(id),
+  step_id VARCHAR(64) NOT NULL,
+  event VARCHAR(32) NOT NULL, -- started, completed, failed, paused, resumed, skipped
+  payload JSONB,
+  created_at TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
+);
+```
+
+## Resolution Order
+
+1. Collect all completed steps in the run
+2. For each unstarted step whose deps are met:
+   - Evaluate `trigger_rule` against dep statuses
+   - If met → advance the step (with variable substitution)
+   - If not met → skip (for `one_success`, treat the rule as met as soon as any dep succeeds)
+3. For approval gates: pause, publish frame, wait for user response
+
+## Rollback
+
+All changes are additive to the Step type. Existing flows without `trigger_rule` default to `all_success`, preserving current behavior. Approval gates are opt-in per step definition.
diff --git a/openspec/changes/orchestrator-flow-advanced/proposal.md b/openspec/changes/orchestrator-flow-advanced/proposal.md
new file mode 100644
index 0000000..24a0f85
--- /dev/null
+++ b/openspec/changes/orchestrator-flow-advanced/proposal.md
@@ -0,0 +1,42 @@
+# Orchestrator — Advanced Flow Patterns
+
+**Status:** Proposed
+**Epic:** orchestrator-flow-advanced
+**Depends on:** v2.7.17-orchestrator (flow-runner already shipped)
+
+## Why
+
+The orchestrator (shipped v2.7.17) runs sequential research/analysis flows on local Qwen. Each step is a linear dependency chain: A → B → C. This works for analysis flows (code-review, investigate) but limits three high-value scenarios:
+
+1. **Parallel research** — "Analyze this from 3 angles" currently requires 3 separate runs. A single flow with parallel branches would cut the wall-clock time to roughly that of the slowest branch.
+2. **Adaptive depth** — "Investigate bug, and if it's a security issue, escalate to security review" requires conditional branching. Currently all steps run unconditionally.
+3. **Human-in-the-loop** — "Review the diff and approve before applying" requires the orchestrator to pause and wait for user input before proceeding.
+
+The patterns from the Ion hybrid workflow engine (trigger rules, event sourcing, approval gates, variable substitution) provide a proven vocabulary for these scenarios — we adapt the patterns without adopting the project.
+
+## What Changes
+
+### Trigger rules on step deps
+Add `trigger_rule?: 'all_success' | 'one_success' | 'all_done'` to the `Step` type. Default `all_success` preserves existing behavior.
+
+- `all_success` — step runs when ALL dependencies complete successfully (current behavior)
+- `one_success` — step runs as soon as ANY dependency completes successfully (parallel research: whichever branch succeeds first seeds the synthesis)
+- `all_done` — step runs when all deps finish regardless of status (cleanup/reporting steps)
+
+### Variable substitution in step prompts
+Add `$stepId.output` and `$stepId.output.field` syntax in step prompts. The flow-runner resolves these before dispatching.
+
+- `$research.output` — the full text output of step with id "research"
+- `$classify.output.severity` — the "severity" field from step output parsed as YAML/JSON frontmatter
+
+### Human approval gate
+New `kind: 'approval'` step type that pauses the flow and publishes a permission frame to the user channel. Flow resumes when the user approves or rejects.
+
+### Event-sourced step log
+Append-only event log for each step execution (start, complete, fail, skip, pause, resume). Enables deterministic resume after coder restart without polling.
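The `$stepId.output` and `$stepId.output.field` substitution described above might be sketched as follows. This is an illustration under assumptions rather than the flow-runner's actual code: the `StepOutput` shape and the `hasOwn` helper are hypothetical, frontmatter is assumed to be parsed already into `fields`, and leaving unresolved placeholders untouched is a choice made here, not something the proposal specifies.

```typescript
// Hypothetical shape of a completed step's result; the proposal does not define it.
interface StepOutput {
  text: string; // full text output of the step
  fields?: Record<string, unknown>; // frontmatter fields, assumed already parsed
}

// Own-property check so that ids like "constructor" never hit the prototype.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function resolveVariables(
  prompt: string,
  completedSteps: Record<string, StepOutput>,
): string {
  // $stepId.output, optionally followed by a dot-path such as .severity
  const pattern = /\$([A-Za-z_][\w-]*)\.output((?:\.[A-Za-z_]\w*)*)/g;

  return prompt.replace(pattern, (match: string, stepId: string, path: string): string => {
    const step = hasOwn(completedSteps, stepId) ? completedSteps[stepId] : undefined;
    if (step === undefined) return match; // unknown step: keep the placeholder
    if (path === "") return step.text;

    // Walk the dot-path through the parsed fields.
    let value: unknown = step.fields;
    for (const key of path.slice(1).split(".")) {
      if (typeof value !== "object" || value === null || !hasOwn(value, key)) {
        return match; // missing field: keep the placeholder
      }
      value = (value as Record<string, unknown>)[key];
    }
    if (value === undefined) return match;
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}
```

Keeping unresolved placeholders visible in the prompt makes a misspelled step id easy to spot in logs; a stricter variant could throw instead.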
+
+## Non-Goals
+- No YAML DAG format (stay with TypeScript flow definitions)
+- No CLI tool (orchestrator stays in-app)
+- No replacement of the existing flow definitions — additive changes only
+- No VM sandbox or WASM
diff --git a/openspec/changes/orchestrator-flow-advanced/tasks.md b/openspec/changes/orchestrator-flow-advanced/tasks.md
new file mode 100644
index 0000000..7c782ca
--- /dev/null
+++ b/openspec/changes/orchestrator-flow-advanced/tasks.md
@@ -0,0 +1,40 @@
+# Tasks — Orchestrator Advanced Flows
+
+## Prerequisites
+- v2.7.17 on main (orchestrator + flow-runner shipped)
+- v2.8.0 on main (fork-lifts complete)
+
+## Tasks
+
+### 1. Trigger rules in Step type
+- [ ] 1.1 Add `TriggerRule` type to `conductor/types.ts`
+- [ ] 1.2 Add `trigger_rule?: TriggerRule` field to `Step` interface (defaults `all_success`)
+- [ ] 1.3 Write `evaluateTriggerRule(deps, rule): boolean` in `flow-runner-decisions.ts`
+- [ ] 1.4 Unit tests for each rule variant
+
+### 2. Variable substitution
+- [ ] 2.1 Write `resolveVariables(prompt, completedSteps): string` in flow-runner
+- [ ] 2.2 Supports `$stepId.output` and `$stepId.output.field` (dot-path)
+- [ ] 2.3 Unit tests with multi-step outputs
+
+### 3. Approval gate step kind
+- [ ] 3.1 Add `'approval'` to `StepKind` union
+- [ ] 3.2 Flow-runner: when step.kind === 'approval', pause and publish `permission_requested` frame
+- [ ] 3.3 Wire `permission_resolved` frame handler to unblock blocked step
+- [ ] 3.4 Test: approval gate pauses flow, approval resumes it
+
+### 4. Event-sourced step log
+- [ ] 4.1 Create `flow_step_events` table in `apps/coder/src/schema.sql`
+- [ ] 4.2 Write `appendStepEvent(runId, stepId, event, payload?)` helper
+- [ ] 4.3 Wire events into flow-runner lifecycle hooks (start, complete, fail, skip, pause, resume)
+- [ ] 4.4 Unit test: events are recorded in order
+
+### 5. Example flow with parallel branches
+- [ ] 5.1 Create `conductor/flows/parallel-research.ts` — splits into 3 parallel research steps, then joins with synthesis
+- [ ] 5.2 Uses `trigger_rule: 'one_success'` on the synthesis step
+- [ ] 5.3 Integration test: parallel flow completes correctly
+
+### 6. Smoke
+- [ ] 6.1 Run parallel-research flow with 3 agents
+- [ ] 6.2 Verify synthesis step triggers on first completion
+- [ ] 6.3 Verify variable substitution in synthesis prompt
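Task 1.3 names `evaluateTriggerRule(deps, rule): boolean`; one possible shape for it is sketched below. The `DepStatus` union is hypothetical (the design does not enumerate dependency statuses), and treating a step with no dependencies as always ready is an assumption made for this sketch.

```typescript
type TriggerRule = "all_success" | "one_success" | "all_done";

// Hypothetical dependency statuses; the design does not list them.
type DepStatus = "pending" | "running" | "success" | "failed" | "skipped";

// Returns true when a step's dependencies satisfy its trigger rule.
function evaluateTriggerRule(
  deps: DepStatus[],
  rule: TriggerRule = "all_success",
): boolean {
  // Assumption: a step without dependencies is always ready.
  if (deps.length === 0) return true;

  const isTerminal = (s: DepStatus): boolean =>
    s === "success" || s === "failed" || s === "skipped";

  switch (rule) {
    case "all_success":
      return deps.every((s) => s === "success");
    case "one_success":
      return deps.some((s) => s === "success");
    case "all_done":
      return deps.every(isTerminal);
  }
}
```

For the parallel-research flow in task 5, the synthesis step would become ready under `one_success` as soon as the first of the three research statuses turns to `success`, which is the behavior smoke test 6.2 checks.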