## ADDED Requirements

### Requirement: Fact extraction from conversation

The system SHALL extract structured facts from conversations using an LLM, with confidence scoring and category classification.

#### Scenario: Extract facts from conversation turn
- **WHEN** a conversation turn (user message + assistant reply) is processed
- **THEN** the system SHALL call the configured LLM with the conversation text
- **THEN** the LLM response SHALL be parsed as structured JSON with facts
- **THEN** each fact SHALL contain: `content`, `category`, `confidence` (0.0-1.0)

#### Scenario: Fact categories
- **WHEN** a fact is extracted
- **THEN** its `category` SHALL be one of: `preference`, `knowledge`, `context`, `behavior`, `goal`, `correction`
- **THEN** the system SHALL validate the category against the allowed set

#### Scenario: Confidence thresholds
- **WHEN** a fact's confidence is below the configurable threshold (default 0.5)
- **THEN** the fact SHALL NOT be persisted
- **THEN** the system SHALL log that a low-confidence fact was skipped

### Requirement: Fact CRUD operations

The system SHALL support creating, reading, updating, and deleting memory facts.

#### Scenario: Create fact
- **WHEN** a new fact is created
- **THEN** it SHALL be assigned a unique ID (`fact_{uuid_hex[:8]}`)
- **THEN** it SHALL be timestamped with ISO-8601 UTC
- **THEN** it SHALL be persisted to the core store

#### Scenario: Delete fact by ID
- **WHEN** a fact deletion is requested with a valid ID
- **THEN** the fact SHALL be removed from the store
- **THEN** the updated store SHALL be persisted

#### Scenario: Delete non-existent fact
- **WHEN** a fact deletion is requested with an unknown ID
- **THEN** the system SHALL raise `KeyError`

#### Scenario: Update fact
- **WHEN** a fact update is requested with a valid ID
- **THEN** the system SHALL update only the provided fields (`content`, `category`, `confidence`)
- **THEN** the fact's `createdAt` SHALL NOT be modified
- **THEN** the updated store SHALL be persisted

### Requirement: Content deduplication

The system SHALL prevent duplicate facts by casefolded content comparison.

#### Scenario: Exact duplicate detected
- **WHEN** a new fact's content (casefolded) matches an existing fact
- **THEN** the new fact SHALL be skipped
- **THEN** the existing fact SHALL remain unchanged
- **THEN** the system SHALL log that a duplicate was skipped

#### Scenario: Near-duplicate with different casing
- **WHEN** a new fact's content differs only in letter casing
- **THEN** it SHALL be treated as a duplicate
- **THEN** the new fact SHALL be skipped

### Requirement: Max facts limit

The system SHALL enforce a configurable maximum number of stored facts (default 500).

#### Scenario: Fact count exceeds limit
- **WHEN** adding a new fact would exceed `max_facts`
- **THEN** the system SHALL sort existing facts by confidence (descending)
- **THEN** the lowest-confidence fact SHALL be removed
- **THEN** the new fact SHALL be added

### Requirement: Memory formatting for context injection

The system SHALL format memory data into a compact string for injection into LLM system prompts, respecting a token budget.

#### Scenario: Format with all sections
- **WHEN** memory data contains user context, history, and facts
- **THEN** the output SHALL include: "User Context:" with work/personal/topOfMind
- **THEN** the output SHALL include: "History:" with recent/earlier/background
- **THEN** the output SHALL include: "Facts:" sorted by confidence descending
- **THEN** each fact SHALL be formatted as: `- [{category} | {confidence:.2f}] {content}`

#### Scenario: Token budget enforcement
- **WHEN** the formatted output exceeds `max_tokens` (default 2000)
- **THEN** the system SHALL trim facts from lowest confidence up
- **THEN** if still over budget, the output SHALL be truncated at the character level
- **THEN** `"\n..."` SHALL be appended to indicate truncation
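
The two formatting scenarios above can be sketched roughly as follows. This is an illustrative sketch, not the system's actual implementation. The names `format_memory`, `estimate_tokens`, and `_render` are hypothetical, as are the `user`, `history`, and `facts` keys of the memory dict. The spec does not say how tokens are counted, so the sketch assumes a crude four-characters-per-token heuristic. It also reserves room for the `"\n..."` marker inside the budget, which is a design choice the spec leaves open.

```python
from typing import Any

CHARS_PER_TOKEN = 4        # assumption: rough heuristic, not a real tokenizer
TRUNCATION_MARKER = "\n..."


def estimate_tokens(text: str) -> int:
    """Approximate the token count by rounding len(text) / 4 upward."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def _render(memory: dict[str, Any], facts: list[dict[str, Any]]) -> str:
    """Render the three sections; empty sections are omitted."""
    lines: list[str] = []
    for title, key, fields in (
        ("User Context:", "user", ("work", "personal", "topOfMind")),
        ("History:", "history", ("recent", "earlier", "background")),
    ):
        section = memory.get(key) or {}
        present = [(name, section[name]) for name in fields if section.get(name)]
        if present:
            lines.append(title)
            lines.extend(f"- {name}: {value}" for name, value in present)
    if facts:
        lines.append("Facts:")
        lines.extend(
            f"- [{f['category']} | {f['confidence']:.2f}] {f['content']}"
            for f in facts
        )
    return "\n".join(lines)


def format_memory(memory: dict[str, Any], max_tokens: int = 2000) -> str:
    """Format memory for prompt injection within an approximate token budget."""
    # Highest confidence first, so the last element is always the weakest fact.
    facts = sorted(
        memory.get("facts", []), key=lambda f: f["confidence"], reverse=True
    )
    text = _render(memory, facts)

    # Step 1: trim facts from the lowest confidence up.
    while facts and estimate_tokens(text) > max_tokens:
        facts.pop()
        text = _render(memory, facts)

    # Step 2: still over budget, so cut at the character level and mark it.
    if estimate_tokens(text) > max_tokens:
        keep = max(0, max_tokens * CHARS_PER_TOKEN - len(TRUNCATION_MARKER))
        text = text[:keep] + TRUNCATION_MARKER
    return text
```

With a generous budget the output lists every fact in descending confidence order. As the budget shrinks, the weakest facts disappear first, and the `"\n..."` marker only appears once no facts are left to drop. A real deployment would likely replace `estimate_tokens` with the tokenizer of the target model.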