Memory v2 — Hybrid Search & Auto-Extract
Status: Proposed
Epic: memory-v2-hybrid-search
Depends on: v2.8.0-fork-lifts (v1 memory already shipped)
Why
v1 memory (shipped in v2.8.0-fork-lifts) provides file-based recall with keyword/tag matching injected into system-prompt.ts. It works but has three gaps:
- Keyword-only recall misses semantic matches — "indentation" won't match a memory entry titled "Code style: tabs vs spaces" unless the word "indentation" appears verbatim.
- No auto-extraction — memory files must be created manually. The LLM can't persist useful facts it discovers during conversation.
- Flat search, no ranking — all keyword matches are equally weighted. No relevance scoring or deduplication.
v2 upgrades the retrieval layer while keeping the file-based storage format. No breaking changes to .boocode/memory/ structure.
What Changes
Hybrid Search (high confidence)
Replace keyword-only rankByRelevance with BM25 + embedding hybrid search. Use a tiny local embedding model (all-MiniLM-L6-v2 through ONNX runtime or a local subprocess) so there's no external API dependency.
- BM25 (already implementable without deps — term frequency + inverse document frequency scoring on the memory entries)
- Embedding (local ONNX model, ~20MB, runs inference in ~5ms on CPU, produces 384-dim vectors)
- Weighted merge (score = 0.3 * bm25 + 0.7 * cosine), with a configurable ratio
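The three pieces above could be combined roughly as follows. This is a minimal sketch, not the planned implementation: `MemoryEntry`, `tokenize`, `bm25Scores`, `cosine`, and `hybridRank` are hypothetical names, embeddings are assumed to be precomputed, and BM25 uses the common defaults k1 = 1.2 and b = 0.75. One assumption goes beyond the proposal: raw BM25 scores are unbounded while cosine similarity lies in [-1, 1], so the sketch divides BM25 by the best score in the result set before applying the 0.3 / 0.7 weights.

```typescript
// Hypothetical entry shape; the proposal does not define one.
interface MemoryEntry {
  id: string;
  text: string;
  embedding: number[]; // assumed precomputed by the local model
}

const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// Okapi BM25 over pre-tokenized documents.
function bm25Scores(query: string[], docs: string[][], k1 = 1.2, b = 0.75): number[] {
  const n = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);
  const terms = Array.from(new Set(query));
  // Document frequency: how many documents contain each query term.
  const docFreqs = terms.map((term) => docs.filter((d) => d.some((t) => t === term)).length);

  return docs.map((doc) => {
    let score = 0;
    terms.forEach((term, i) => {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) return;
      const idf = Math.log(1 + (n - docFreqs[i] + 0.5) / (docFreqs[i] + 0.5));
      const lengthNorm = 1 - b + (b * doc.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    });
    return score;
  });
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Weighted merge. bm25Weight = 0.3 reproduces the 0.3 / 0.7 split.
function hybridRank(
  queryText: string,
  queryEmbedding: number[],
  entries: MemoryEntry[],
  bm25Weight = 0.3,
): { id: string; score: number }[] {
  const raw = bm25Scores(tokenize(queryText), entries.map((e) => tokenize(e.text)));
  const maxRaw = Math.max(...raw, 0);
  return entries
    .map((entry, i) => {
      // Assumption: scale BM25 into [0, 1] so both terms are comparable.
      const bm25 = maxRaw > 0 ? raw[i] / maxRaw : 0;
      const cos = cosine(queryEmbedding, entry.embedding);
      return { id: entry.id, score: bm25Weight * bm25 + (1 - bm25Weight) * cos };
    })
    .sort((x, y) => y.score - x.score);
}
```

With this shape, a query such as "indentation" that shares no tokens with "Code style: tabs vs spaces" gets a BM25 score of zero, and the ranking falls back entirely on the embedding term.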
Auto-Extract Agent Tool (medium confidence)
Two new tools exposed to agents. Extraction is not automatic; the agent decides when to persist:
- extract_memory(topic, title, content, tags) → writes a markdown entry
- search_memory(query) → returns ranked memory entries (new tool, replaces raw injection)
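To make the tool surface concrete, here is a hedged sketch of how the two tools might be typed. Everything beyond the tool names and the `topic`, `title`, `content`, `tags`, and `query` parameters is an assumption: `createMemoryTools`, `MemoryHit`, the `limit` parameter, and the validation rules are hypothetical, and an in-memory array stands in for the markdown files under .boocode/memory/ so the example stays self-contained.

```typescript
interface ExtractMemoryInput {
  topic: string;
  title: string;
  content: string;
  tags: string[];
}

// Hypothetical result shape for search_memory.
interface MemoryHit {
  topic: string;
  title: string;
  score: number;
}

type Ranker = (query: string, entries: ExtractMemoryInput[]) => MemoryHit[];

// The ranker is injected so the tools do not depend on a specific search backend.
function createMemoryTools(rank: Ranker) {
  // Stand-in for the file-backed store; a real tool would write a markdown entry.
  const entries: ExtractMemoryInput[] = [];

  return {
    extract_memory(input: ExtractMemoryInput): { ok: boolean; error?: string } {
      if (!input.title.trim() || !input.content.trim()) {
        return { ok: false, error: "title and content are required" };
      }
      // Normalize tags so later matching is case-insensitive.
      const tags = input.tags.map((t) => t.trim().toLowerCase()).filter(Boolean);
      entries.push({ ...input, tags });
      return { ok: true };
    },

    search_memory(query: string, limit = 5): MemoryHit[] {
      return rank(query, entries).slice(0, limit);
    },
  };
}
```

Returning a result object rather than throwing lets the agent see why a write was rejected and retry with corrected arguments.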
In-Memory Embedding Cache (optional)
Keep embeddings in an LRU map keyed by file path, with each entry tagged by the file's mtime. Recompute an embedding only when the file's mtime changes. No DB migration needed.
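One possible shape for such a cache is sketched below. It assumes entries are looked up by file path and treated as stale when the stored mtime differs from the current one; `EmbeddingCache`, `CacheSlot`, and the default capacity of 256 are hypothetical. It relies on the fact that a JavaScript `Map` iterates in insertion order, so deleting and re-inserting a key marks it as most recently used.

```typescript
interface CacheSlot {
  mtimeMs: number;
  vector: number[];
}

class EmbeddingCache {
  private readonly slots = new Map<string, CacheSlot>();
  private readonly capacity: number;

  constructor(capacity = 256) {
    this.capacity = capacity;
  }

  // Returns the cached vector, or undefined if missing or stale.
  get(path: string, mtimeMs: number): number[] | undefined {
    const slot = this.slots.get(path);
    if (!slot) return undefined;
    if (slot.mtimeMs !== mtimeMs) {
      // File changed on disk: drop the stale embedding.
      this.slots.delete(path);
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.slots.delete(path);
    this.slots.set(path, slot);
    return slot.vector;
  }

  set(path: string, mtimeMs: number, vector: number[]): void {
    this.slots.delete(path);
    this.slots.set(path, { mtimeMs, vector });
    if (this.slots.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.slots.keys().next().value;
      if (oldest !== undefined) this.slots.delete(oldest);
    }
  }

  get size(): number {
    return this.slots.size;
  }
}
```

A caller would check `get(path, mtimeMs)` first and, on a miss, run the embedding model and store the result with `set`. Keeping the model call outside the cache means the cache works the same whether inference is synchronous or asynchronous.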
Non-Goals
- No vector database (SQLite FTS5 or in-memory BM25 suffice)
- No automatic background extraction agent (agent must explicitly call extract_memory)
- No changes to the .boocode/memory/ file format
- No Python dependencies (ONNX runtime is a Node.js native addon or subprocess)