## Why Current AI agents lack structured, durable memory beyond the immediate context window. Conversations are stateless, preferences are forgotten, and long-term learning is nonexistent. Three OSS repos (LangMem, DeerFlow, CowAgent) demonstrate production patterns for agent memory — but no unified, portable engine exists that combines short-term context management, long-term semantic memory, tiered consolidation, and hybrid retrieval. This change builds that engine by extracting and adapting the best patterns from all three. ## What Changes - **New `memory-engine/` module** in the codebase providing a unified memory & context API - **Short-term context summarization** — token-budget-aware conversation windowing (LangMem pattern) - **Long-term semantic memory** — LLM-extracted facts stored with optional vector embeddings (LangMem/DeerFlow hybrid) - **Tiered memory architecture** — Context tier (ephemeral session) → Daily tier (summarized records) → Core tier (distilled long-term) (CowAgent pattern) - **Hybrid search** — Keyword (FTS5) + Vector (cosine similarity on embeddings) with weighted merge (CowAgent pattern) - **Background consolidation** — Debounced, async memory extraction pipeline (DeerFlow queue + LangMem ReflectionExecutor) - **Deep Dream distillation** — Periodic overnight LLM consolidation of daily records into core memory (CowAgent pattern) - **Memory tools for agents** — `manage_memory` and `search_memory` tool interfaces (LangMem pattern) ## Capabilities ### New Capabilities - `short-term-context`: Token-budget window management, conversation summarization, and context trimming for LLM interactions - `long-term-memory`: Persistent fact extraction, storage, and retrieval with Pydantic-typed schemas - `tiered-consolidation`: Three-tier memory pipeline (context→daily→core) with promotion rules and Deep Dream distillation - `hybrid-search`: Combined keyword (FTS5) + vector (embedding cosine similarity) search with weighted scoring and temporal decay - `memory-tools`: `manage_memory` (CRUD) and `search_memory` (semantic query) tools for agent integration - `background-processing`: Debounced async memory update queue with thread-pool execution ### Modified Capabilities ## Impact - New `memory-engine/` directory tree (no existing code modified) - Dependencies: `sqlite3` (stdlib), `numpy` (optional, for vector search), `pydantic` (schemas), `tiktoken` (token counting) - LLM provider integration via abstract `ChatModel` interface (not coupled to any provider) - Embedding provider integration via abstract `EmbeddingProvider` interface (supports OpenAI, local models) - Agent integration via simple tool interface (not coupled to any agent framework)