## 1. Module Scaffold & Data Schemas

- [x] 1.1 Create `memory-engine/` directory tree with all subdirectories and `__init__.py` files
- [x] 1.2 Create `config.py` with `MemoryConfig` pydantic model (embedding, chunking, search, tier settings)
- [x] 1.3 Create `core/schemas.py` with `MemoryChunk`, `SearchResult`, `Fact`, `RunningSummary`, `ExtractedMemory` data classes
- [x] 1.4 Create `utils/token_counter.py` with tiktoken + char-fallback token counting
- [x] 1.5 Create `utils/namespace.py` with `NamespaceTemplate` for runtime namespace resolution
- [x] 1.6 Create `utils/chunker.py` with `TextChunker` (line-based, overlapping, configurable max_tokens)

## 2. Core Store: SQLite + FTS5 + Vector

- [x] 2.1 Create `core/store.py` with `MemoryStore`: SQLite init with WAL mode, FTS5 tables, integrity checks
- [x] 2.2 Implement `create_chunks_table()` with embedding BLOB storage, indexes, meta table
- [x] 2.3 Implement `create_fts5_tables()` with standard unicode61 tokenizer + trigram tokenizer for CJK
- [x] 2.4 Implement FTS5 triggers (AFTER INSERT/UPDATE/DELETE) for auto-sync
- [x] 2.5 Implement `save_chunk()` / `save_chunks_batch()` with SQLite UPSERT (INSERT ... ON CONFLICT DO UPDATE)
- [x] 2.6 Implement `delete_by_path()`, `get_file_hash()`, `update_file_metadata()`
- [x] 2.7 Implement FTS5 self-healing: `_fts5_state_inconsistent()`, `_fts5_shadow_corrupt()`, `reset_fts5()`
- [x] 2.8 Implement embedding encode/decode (float32 BLOB via numpy, struct fallback, legacy JSON fallback)
- [x] 2.9 Implement `get_stats()` and `close()` methods

## 3. Hybrid Search

- [x] 3.1 Implement `search_vector()`: numpy matrix cosine similarity with argpartition top-K (pure-Python fallback)
- [x] 3.2 Implement FTS5 keyword search with BM25 scoring: `_search_fts5()`, `_search_fts5_trigram()`
- [x] 3.3 Implement `_search_like()`: CJK (1+ chars) + ASCII word (3+ chars) with dynamic scoring
- [x] 3.4 Implement `search_keyword()`: three-tier strategy (FTS5 → trigram FTS5 → LIKE)
- [x] 3.5 Implement BM25 rank to score conversion (`0.3 + 0.69 * abs(r)/(1+abs(r))`)
- [x] 3.6 Create `core/hybrid_search.py` with weighted merge (vector_weight, keyword_weight) + temporal decay
- [x] 3.7 Implement `_compute_temporal_decay(path, half_life=30)`: exponential decay for dated files

## 4. LLM Memory Extraction

- [x] 4.1 Create `extraction/prompts.py` with memory update system prompt (structured JSON output)
- [x] 4.2 Create `extraction/manager.py` with `MemoryUpdater`: LLM fact extraction from conversation
- [x] 4.3 Implement `_prepare_update_prompt()`: loads current memory, formats conversation, builds prompt
- [x] 4.4 Implement `_parse_memory_update_response()`: JSON extraction from LLM response (handles fences/thinking)
- [x] 4.5 Implement `_apply_updates()`: update user/history sections, add/remove facts, enforce max_facts
- [x] 4.6 Implement `create_fact()`, `update_fact()`, `delete_memory_fact()` CRUD operations
- [x] 4.7 Implement content deduplication (casefold comparison) and confidence threshold filtering
- [x] 4.8 Implement upload-mention scrubbing from memory data

## 5. Tiered Consolidation

- [x] 5.1 Create `tiers/daily.py` with `DailyTier`: lazy file creation, append-only writes with timestamped headers
- [x] 5.2 Create `tiers/context.py` with `ContextTier`: short-term context window management with RunningSummary
- [x] 5.3 Create `tiers/core.py` with `CoreTier`: wraps MemoryStore, manages MEMORY.md file
- [x] 5.4 Create `tiers/__init__.py` with `flush_messages()`: context summarization + daily file append
- [x] 5.5 Implement incremental summarization (initial summary, extend existing, RunningSummary tracking)
- [x] 5.6 Create `background/deep_dream.py` with `DeepDream`: LLM-based MEMORY.md consolidation
- [x] 5.7 Implement Deep Dream dedup (content-hash check), dream diary writing, empty-output guard

## 6. Background Processing Queue

- [x] 6.1 Create `background/queue.py` with `MemoryUpdateQueue`: thread-safe, debounced, keyed by (thread, user, agent)
- [x] 6.2 Implement `add()` with debounce timer reset, `add_nowait()` for immediate processing
- [x] 6.3 Implement timer-triggered processing with rate limiting between updates
- [x] 6.4 Implement signal detection: `detect_correction()`, `detect_reinforcement()` with pattern matching
- [x] 6.5 Create `background/__init__.py` with `flush_messages()`: dedup + background thread LLM summarization
- [x] 6.6 Support `context_summary_callback` for in-context injection of summaries

## 7. Agent Tools & Public API

- [x] 7.1 Create `tools/manage.py` with `manage_memory()`: create/update/delete facts with namespace isolation
- [x] 7.2 Create `tools/search.py` with `search_memory()`: hybrid search with query/filter/limit/offset
- [x] 7.3 Implement `__init__.py` with `MemoryEngine` unified class: `manage()`, `search()`, `flush()`, `dream()`, `format_for_injection()`
- [x] 7.4 Implement `format_for_injection()`: token-budgeted memory string for system prompts
- [x] 7.5 Thread-safe singleton pattern for `MemoryUpdateQueue` and `MemoryStore`

## 8. Embedding Provider Interface

- [x] 8.1 Create `embedding/base.py` with `EmbeddingProvider` ABC: `embed_query()`, `embed_batch()`
- [x] 8.2 Create `embedding/openai.py` with `OpenAIEmbeddingProvider` implementation
- [x] 8.3 Implement `EmbeddingCache`: per-session cache keyed by (provider, model, text_hash)
- [x] 8.4 Create `embedding/__init__.py` with `create_embedding_provider()` factory

## 9. Integration Tests

- [x] 9.1 Test short-term context summarization with token budget enforcement
- [x] 9.2 Test long-term fact extraction with LLM mock
- [x] 9.3 Test hybrid search: vector-only, keyword-only, and combined
- [x] 9.4 Test tiered consolidation: flush → daily file → Deep Dream → MEMORY.md rewrite
- [x] 9.5 Test background queue: debounce, dedup, async execution
- [x] 9.6 Test namespace isolation: scoped searches across tenants
- [x] 9.7 Test graceful degradation: no embeddings → keyword-only, no numpy → Python fallback
- [x] 9.8 Test memory tools: create/update/delete/search round-trip
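
A note on the scoring in tasks 3.5 to 3.7: the sketch below shows one way the BM25 conversion, the temporal decay, and the weighted merge could fit together. Only the formula `0.3 + 0.69 * abs(r)/(1+abs(r))` and the `half_life=30` default come from the task list. The function names, the `YYYY-MM-DD` filename convention, the 0.7/0.3 default weights, and the choice to multiply the merged score by the decay factor are all assumptions made for illustration.

```python
import re
from datetime import date
from typing import Optional

# Assumption: dated files carry an ISO date somewhere in their path.
_DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def bm25_rank_to_score(rank: float) -> float:
    """Map an FTS5 BM25 rank (more negative = better) into [0.3, 0.99)."""
    r = abs(rank)
    return 0.3 + 0.69 * r / (1.0 + r)


def temporal_decay(path: str, half_life: float = 30.0,
                   today: Optional[date] = None) -> float:
    """Exponential decay by file age in days; undated paths are not decayed."""
    match = _DATE_RE.search(path)
    if match is None:
        return 1.0
    try:
        file_date = date(*map(int, match.groups()))
    except ValueError:
        # Digits that look like a date but are not a valid one.
        return 1.0
    today = today or date.today()
    age_days = max((today - file_date).days, 0)
    return 0.5 ** (age_days / half_life)


def merge_scores(vector_score: float, keyword_score: float, path: str,
                 vector_weight: float = 0.7, keyword_weight: float = 0.3,
                 today: Optional[date] = None) -> float:
    """Weighted sum of the two scores, scaled by the temporal decay factor."""
    combined = vector_weight * vector_score + keyword_weight * keyword_score
    return combined * temporal_decay(path, today=today)
```

With these assumptions, a file dated exactly one half-life ago has its merged score halved, while an undated file such as `MEMORY.md` keeps its full score.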
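
A note on tasks 2.4 and 2.5: the sketch below illustrates how AFTER INSERT/UPDATE/DELETE triggers can keep an external-content FTS5 table in sync with an UPSERT-based `save_chunk()`. The table layout, column names (`chunk_key`, `text`), and the `search()` helper are hypothetical and much smaller than a real store would need. It assumes the Python `sqlite3` module is built with FTS5 and SQLite 3.24 or newer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE chunks (
    id        INTEGER PRIMARY KEY,
    chunk_key TEXT UNIQUE NOT NULL,
    text      TEXT NOT NULL
);
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    text, content='chunks', content_rowid='id'
);
-- Mirror every row change into the FTS index.
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.text);
END;
CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
    INSERT INTO chunks_fts(chunks_fts, rowid, text)
    VALUES ('delete', old.id, old.text);
END;
CREATE TRIGGER chunks_au AFTER UPDATE ON chunks BEGIN
    INSERT INTO chunks_fts(chunks_fts, rowid, text)
    VALUES ('delete', old.id, old.text);
    INSERT INTO chunks_fts(rowid, text) VALUES (new.id, new.text);
END;
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def save_chunk(conn: sqlite3.Connection, chunk_key: str, text: str) -> None:
    """Insert a chunk, or overwrite its text if the key already exists."""
    conn.execute(
        "INSERT INTO chunks (chunk_key, text) VALUES (?, ?) "
        "ON CONFLICT(chunk_key) DO UPDATE SET text = excluded.text",
        (chunk_key, text),
    )
    conn.commit()


def search(conn: sqlite3.Connection, query: str) -> list:
    """Return matching chunk keys, best BM25 rank first."""
    rows = conn.execute(
        "SELECT c.chunk_key FROM chunks_fts "
        "JOIN chunks AS c ON c.id = chunks_fts.rowid "
        "WHERE chunks_fts MATCH ? ORDER BY bm25(chunks_fts)",
        (query,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the `DO UPDATE` branch of an UPSERT fires the UPDATE trigger, re-saving a chunk under the same key removes the old text from the index and adds the new text without any extra bookkeeping in application code.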
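
A note on task 2.8: a minimal sketch of float32 BLOB encoding with a `struct` fallback when numpy is unavailable. The function names and the little-endian byte order are assumptions, and the legacy JSON fallback mentioned in the task is left out here.

```python
import struct
from typing import List, Sequence

try:
    import numpy as np
except ImportError:  # numpy is optional; fall back to struct
    np = None


def encode_embedding(vector: Sequence[float]) -> bytes:
    """Pack a vector as little-endian float32 bytes (4 bytes per value)."""
    if np is not None:
        return np.asarray(vector, dtype="<f4").tobytes()
    return struct.pack(f"<{len(vector)}f", *vector)


def decode_embedding(blob: bytes) -> List[float]:
    """Unpack little-endian float32 bytes back into a list of floats."""
    if np is not None:
        return np.frombuffer(blob, dtype="<f4").tolist()
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))
```

Both code paths produce the same byte layout, so a BLOB written with numpy installed can be read back by the `struct` path and vice versa. Values are rounded to float32 precision on encode.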