docs: add openspec proposals for memory v2 and orchestrator flow patterns

2026-06-07 21:34:35 +00:00
parent fb52eb3efa
commit 028c08b4cd
6 changed files with 281 additions and 0 deletions


@@ -0,0 +1,60 @@
# Memory v2 — Design
## Architecture
```
┌─────────────────────────┐
│ system-prompt.ts │
│ (inject memory block) │
└────────┬────────────────┘
┌─────────▼──────────┐
│ memory/recall.ts │
│ (renamed to query) │
└─────────┬──────────┘
┌───────────────┼───────────────┐
│ │ │
┌────────▼──────┐ ┌─────▼──────┐ ┌──────▼───────┐
│ BM25Ranker │ │ EmbedCache │ │ CosineRanker │
│ (stateless) │ │ (LRU map) │ │ (ONNX) │
└───────────────┘ └────────────┘ └──────────────┘
```
## Module Changes
### `apps/server/src/services/memory/` — new/changed files
| File | Change |
|------|--------|
| `recall.ts` | Replace `rankByRelevance` with hybrid `rankByHybrid(query, entries)` |
| `embeddings.ts` | **New** — ONNX model loader + `embed(texts: string[]): number[][]` |
| `bm25.ts` | **New** — BM25 scorer with `score(query, doc): number` |
| `ranker.ts` | **New** — weighted merge of BM25 + cosine scores |
| `entries.ts` | Add `serializeForEmbedding(entry): string` helper |
### Embedding Model
- Model: `all-MiniLM-L6-v2` (384-dim, ~23MB ONNX)
- Runtime: `onnxruntime-node` npm package or subprocess via `node:child_process`
- Cache: `Map<string, { embedding: number[], mtime: number }>` in-memory, cleared on process restart
- Fallback: BM25-only when model file is missing
### Agent Tools (new)
| Tool | Description |
|------|-------------|
| `extract_memory(topic, title, content, tags?)` | Persists a memory entry. Topic must be one of project/user/reference |
| `search_memory(query)` | Returns up to 10 ranked memory entries matching the query. Replaces blind injection |
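
As a rough illustration of the two tool contracts in the table, the sketch below validates `extract_memory` arguments and caps `search_memory` results at 10. The names `ExtractMemoryArgs`, `validateExtractArgs`, and `capResults` are hypothetical, and the non-empty check on `title` and `content` is an assumption; the proposal does not specify the tool registration API.

```typescript
// Hypothetical argument shapes for the two tools; names are illustrative only.
const TOPICS = ["project", "user", "reference"] as const;
type Topic = (typeof TOPICS)[number];

interface ExtractMemoryArgs {
  topic: Topic;
  title: string;
  content: string;
  tags?: string[];
}

// "Returns up to 10 ranked memory entries"
const MAX_SEARCH_RESULTS = 10;

// Returns validated args, or throws when the topic is not one of the allowed values.
function validateExtractArgs(raw: {
  topic: string;
  title: string;
  content: string;
  tags?: string[];
}): ExtractMemoryArgs {
  const topic = TOPICS.find((t) => t === raw.topic);
  if (topic === undefined) {
    throw new Error(`topic must be one of ${TOPICS.join("/")}, got "${raw.topic}"`);
  }
  // Assumption: empty titles or bodies are rejected; the design does not say.
  if (raw.title.trim() === "" || raw.content.trim() === "") {
    throw new Error("title and content must be non-empty");
  }
  return { topic, title: raw.title, content: raw.content, tags: raw.tags };
}

// Truncates an already-ranked list to the search_memory result cap.
function capResults<T>(ranked: T[]): T[] {
  return ranked.slice(0, MAX_SEARCH_RESULTS);
}
```

Rejecting an unknown topic at the tool boundary keeps entries inside the three existing topic directories, which fits the goal of leaving the `.boocode/memory/` structure unchanged.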
### Scoring Formula
```
score = (BM25_score * 0.3) + (cosine_similarity * 0.7)
```
Both scores are normalized to [0, 1] before merging. Entries whose merged score falls below the threshold (0.15) are excluded.
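
A minimal sketch of the merge step, assuming max-normalization for BM25 and clamping for cosine similarity. The design fixes the weights and the threshold but not the normalization method, so both choices here are assumptions, as are the names `RawScore`, `MergedScore`, and `mergeScores`.

```typescript
// Weights and threshold come from the design; the normalization choices are assumptions.
const BM25_WEIGHT = 0.3;
const COSINE_WEIGHT = 0.7;
const SCORE_THRESHOLD = 0.15;

interface RawScore {
  id: string;
  bm25: number; // unbounded, >= 0
  cosine: number; // in [-1, 1]
}

interface MergedScore {
  id: string;
  score: number;
}

// Normalizes both signals to [0, 1], merges them, drops weak entries, sorts best first.
function mergeScores(raw: RawScore[]): MergedScore[] {
  // Assumption: scale BM25 by the batch maximum so the best lexical match maps to 1.
  const maxBm25 = raw.reduce((m, r) => Math.max(m, r.bm25), 0);
  return raw
    .map((r) => {
      const bm25Norm = maxBm25 > 0 ? r.bm25 / maxBm25 : 0;
      // Assumption: clamp negative cosine values to 0 rather than rescaling.
      const cosineNorm = Math.min(1, Math.max(0, r.cosine));
      return { id: r.id, score: BM25_WEIGHT * bm25Norm + COSINE_WEIGHT * cosineNorm };
    })
    .filter((m) => m.score >= SCORE_THRESHOLD)
    .sort((a, b) => b.score - a.score);
}
```

For example, an entry with the highest BM25 score in the batch and a cosine similarity of 0.5 merges to `0.3 * 1 + 0.7 * 0.5 = 0.65`, while an entry with no lexical match and a cosine similarity of 0.1 merges to 0.07 and is excluded.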
## Rollback
Set `MEMORY_SEARCH=keyword` env var to fall back to the v1 keyword-only path. Default is `hybrid`.
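
One way the switch could be read, sketched under assumptions: `resolveSearchMode` is a hypothetical helper, and treating unrecognized values as `hybrid` (plus trimming and lowercasing the value) is a guess, since the design only names the two modes and the default.

```typescript
type SearchMode = "hybrid" | "keyword";

// Resolves the search mode from an env-like object.
// Assumption: anything other than "keyword" (including unset or invalid values)
// resolves to the documented default, "hybrid".
function resolveSearchMode(env: { [key: string]: string | undefined }): SearchMode {
  const value = (env["MEMORY_SEARCH"] ?? "hybrid").trim().toLowerCase();
  return value === "keyword" ? "keyword" : "hybrid";
}
```

In the server this would presumably be called once with `process.env`; taking the environment as a parameter keeps the function easy to unit test.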


@@ -0,0 +1,39 @@
# Memory v2 — Hybrid Search & Auto-Extract
**Status:** Proposed
**Epic:** memory-v2-hybrid-search
**Depends on:** v2.8.0-fork-lifts (v1 memory already shipped)
## Why
v1 memory (shipped in v2.8.0-fork-lifts) provides file-based recall with keyword/tag matching injected into `system-prompt.ts`. It works but has three gaps:
1. **Keyword-only recall misses semantic matches** — "indentation" won't match a memory entry titled "Code style: tabs vs spaces" unless the word "indentation" appears verbatim.
2. **No auto-extraction** — memory files must be created manually. The LLM can't persist useful facts it discovers during conversation.
3. **Flat search, no ranking** — all keyword matches are equally weighted. No relevance scoring or deduplication.
v2 upgrades the retrieval layer while keeping the file-based storage format. No breaking changes to `.boocode/memory/` structure.
## What Changes
### Hybrid Search (high confidence)
Replace keyword-only `rankByRelevance` with BM25 + embedding hybrid search. Use a tiny local embedding model (all-MiniLM-L6-v2 through ONNX runtime or a local subprocess) so there's no external API dependency.
- **BM25** (already implementable without deps — term frequency + inverse document frequency scoring on the memory entries)
- **Embedding** (local ONNX model, ~20MB, runs inference in ~5ms on CPU, produces 384-dim vectors)
- **Weighted merge** (`score = 0.3 * bm25 + 0.7 * cosine`) — configurable ratio
### Auto-Extract Agent Tool (medium confidence)
Two new tools exposed to agents. Extraction is not automatic; the agent decides when to persist:
- `extract_memory(topic, title, content, tags?)` → writes a markdown entry (`tags` is optional)
- `search_memory(query)` → returns ranked memory entries (new tool, replaces raw injection)
### In-Memory Embedding Cache (optional)
Keep embeddings in an LRU map keyed by file path, storing each file's mtime alongside its vector. Recompute an embedding only when the file's mtime changes. No DB migration needed.
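
The cache described above might look roughly like the following. This is a sketch, not the proposed implementation: the class name `EmbeddingCache`, the default capacity of 512, and the use of the file path as the key are all assumptions.

```typescript
interface CachedEmbedding {
  embedding: number[];
  mtime: number;
}

// Minimal LRU keyed by file path. A Map iterates in insertion order, so re-inserting
// an entry on every hit keeps the least recently used key first.
class EmbeddingCache {
  private entries = new Map<string, CachedEmbedding>();
  private readonly maxEntries: number;

  // Assumption: 512 entries is an arbitrary default capacity.
  constructor(maxEntries: number = 512) {
    this.maxEntries = maxEntries;
  }

  // Returns the cached vector only if the file has not changed since it was embedded.
  get(path: string, mtime: number): number[] | undefined {
    const hit = this.entries.get(path);
    if (hit === undefined) return undefined;
    this.entries.delete(path);
    if (hit.mtime !== mtime) return undefined; // stale: file changed, entry dropped
    this.entries.set(path, hit); // refresh recency
    return hit.embedding;
  }

  // Stores a vector and evicts the least recently used entry when over capacity.
  set(path: string, mtime: number, embedding: number[]): void {
    this.entries.delete(path);
    this.entries.set(path, { embedding, mtime });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A caller would stat the memory file, call `get(path, mtime)`, and only run the embedding model on a miss before calling `set`.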
## Non-Goals
- No vector database (SQLite FTS5 or in-memory BM25 suffice)
- No automatic background extraction agent (agent must explicitly call `extract_memory`)
- No changes to the `.boocode/memory/` file format
- No Python dependencies — ONNX runtime is a Node.js native addon or subprocess


@@ -0,0 +1,30 @@
# Tasks — Memory v2
## Prerequisites
- v2.8.0 on main (v1 memory module shipped)
## Tasks
### 1. BM25 ranker
- [ ] 1.1 Write `bm25.ts` — pure function, no deps. BM25Okapi formula: `sum over terms of IDF * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * docLen / avgDocLen))`
- [ ] 1.2 Unit tests with known corpus
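
A minimal sketch of the formula in 1.1, offered as a starting point rather than the final `bm25.ts`. The constants `k1 = 1.2` and `b = 0.75`, the tokenizer, and the smoothed IDF variant are assumptions; the function name `bm25Scores` is hypothetical. It scores a whole corpus at once because IDF and `avgDocLen` depend on corpus statistics.

```typescript
// Assumed defaults; k1 = 1.2 and b = 0.75 are common choices, not values from the task list.
const K1 = 1.2;
const B = 0.75;

// Naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Scores every document in the corpus against the query; result[i] belongs to docs[i].
function bm25Scores(query: string, docs: string[]): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  const avgDocLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  // Document frequency: number of documents containing each query term.
  const df = new Map<string, number>();
  queryTerms.forEach((term) => {
    df.set(term, tokenized.filter((d) => d.indexOf(term) !== -1).length);
  });

  return tokenized.map((doc) => {
    let score = 0;
    queryTerms.forEach((term) => {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) return;
      const docFreq = df.get(term) ?? 0;
      // Smoothed IDF that stays non-negative (an assumption; the task does not pin one down).
      const idf = Math.log(1 + (n - docFreq + 0.5) / (docFreq + 0.5));
      score += (idf * (tf * (K1 + 1))) / (tf + K1 * (1 - B + (B * doc.length) / avgDocLen));
    });
    return score;
  });
}
```

Note that a purely lexical scorer like this gives a query such as "what spacing convention" a score of zero against "Use two-space indentation", which is the gap the embedding path is meant to close.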
### 2. Embedding module
- [ ] 2.1 Write `embeddings.ts` — load ONNX model, `embed(texts: string[]): number[][]`
- [ ] 2.2 Write `ranker.ts` — cosine similarity + BM25 weighted merge
- [ ] 2.3 Fallback to BM25-only when model unavailable
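
For the cosine half of 2.2, a small helper along these lines would be enough. The name `cosineSimilarity` and the choice to return 0 for a zero-length vector are assumptions, not part of the task list.

```typescript
// Cosine similarity between two equal-length vectors, in [-1, 1].
// Assumption: a zero vector yields 0 instead of NaN.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}
```

If the embedding model already returns unit-length vectors, the division can be skipped and the dot product used directly; whether that holds for the chosen ONNX export would need to be checked.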
### 3. Hybrid recall
- [ ] 3.1 Refactor `recall.ts`: `rankByRelevance` → `rankByHybrid`, using BM25 + embedding when available
- [ ] 3.2 Keep keyword-only path as `MEMORY_SEARCH=keyword` env fallback
- [ ] 3.3 Server tests pass
### 4. Agent tools
- [ ] 4.1 Create `extract_memory` tool — persists entry, returns path
- [ ] 4.2 Create `search_memory` tool — replaces raw injection when used
- [ ] 4.3 Tool tests pass
### 5. Smoke
- [ ] 5.1 Create `.boocode/memory/project/style.md` with "Use two-space indentation"
- [ ] 5.2 `search_memory("what spacing convention")` returns the entry
- [ ] 5.3 `extract_memory("project", "Naming", "PascalCase for components")` creates the file