boocode/openspec/changes/archived/2026-06-07-memory-context-engineering/specs/hybrid-search/spec.md

ADDED Requirements

Requirement: Hybrid search with vector + keyword fusion

The system SHALL combine vector similarity search and keyword search into a single ranked result list.

Scenario: Vector search runs when embedding provider available

  • WHEN an embedding provider is configured
  • THEN the system SHALL compute a query embedding and perform cosine similarity search
  • WHEN no embedding provider is configured
  • THEN the system SHALL gracefully degrade to keyword-only search

Scenario: Keyword search always runs

  • WHEN a search query is submitted
  • THEN the system SHALL always perform keyword search regardless of embedding provider availability

Scenario: Weighted score merging

  • WHEN both vector and keyword results are available
  • THEN the final score SHALL be: vector_weight * vector_score + keyword_weight * keyword_score
  • THEN default weights SHALL be vector_weight=0.7, keyword_weight=0.3
  • THEN weights SHALL be configurable
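The weighted merge above can be sketched as follows. This is a minimal illustration, not the shipped implementation: the function name `merge_scores`, the dict-of-scores shape, and the choice to treat a chunk missing from one side as scoring 0 there are all assumptions. Passing `None` for the vector scores models the keyword-only degradation described earlier.

```python
from typing import Dict, Optional


def merge_scores(
    vector_scores: Optional[Dict[str, float]],
    keyword_scores: Dict[str, float],
    vector_weight: float = 0.7,
    keyword_weight: float = 0.3,
) -> Dict[str, float]:
    """Fuse per-chunk scores keyed by a (hypothetical) chunk id."""
    if vector_scores is None:
        # No embedding provider configured: keyword-only search.
        # Assumption: keyword scores pass through unweighted in this mode.
        return dict(keyword_scores)

    chunk_ids = set(vector_scores) | set(keyword_scores)
    return {
        cid: vector_weight * vector_scores.get(cid, 0.0)
        + keyword_weight * keyword_scores.get(cid, 0.0)
        for cid in chunk_ids
    }
```

With the default weights, a chunk scoring 0.8 on vector search and 0.5 on keyword search gets 0.7 * 0.8 + 0.3 * 0.5 = 0.71.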

Requirement: Vector search via numpy cosine similarity

The system SHALL perform vector search using numpy-vectorized cosine similarity for performance.

Scenario: Vectorized cosine similarity

  • WHEN numpy is available
  • THEN all chunk embeddings SHALL be loaded into a numpy matrix (N, D)
  • THEN cosine similarity SHALL be computed as matrix @ query_vector (BLAS matrix-vector multiply), with the matrix rows and the query vector L2-normalized so that the dot product equals the cosine
  • THEN top-K results SHALL be selected via argpartition (O(N) average)
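A sketch of the vectorized path, under stated assumptions: the function name `top_k_cosine` is hypothetical, and the code normalizes at query time so that the dot product is a true cosine (a real implementation might store pre-normalized embeddings instead).

```python
import numpy as np


def top_k_cosine(matrix: np.ndarray, query: np.ndarray, k: int):
    """Return (row_index, similarity) pairs for the k most similar rows, best first."""
    # Normalize rows and query; zero vectors are left as zeros (similarity 0).
    row_norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    matrix_unit = matrix / np.where(row_norms == 0, 1.0, row_norms)
    query_norm = np.linalg.norm(query)
    query_unit = query / (query_norm if query_norm else 1.0)

    sims = matrix_unit @ query_unit  # (N, D) @ (D,) -> (N,)

    k = min(k, sims.shape[0])
    if k <= 0:
        return []
    # argpartition moves the k largest values to the end, in no particular order.
    top = np.argpartition(sims, -k)[-k:]
    # Only those k candidates are then fully sorted, descending.
    top = top[np.argsort(sims[top])[::-1]]
    return [(int(i), float(sims[i])) for i in top]
```

The partition step avoids sorting all N similarities; only the K survivors are ordered.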

Scenario: Pure-Python fallback

  • WHEN numpy is unavailable
  • THEN cosine similarity SHALL be computed per-row with pure Python
  • THEN results SHALL be sorted and the top K returned
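The fallback path might look like the sketch below. The names `cosine` and `top_k_cosine_py` are illustrative, and returning 0.0 for a zero-length vector is an assumption rather than something the requirement states.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cosine_py(
    rows: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> List[Tuple[int, float]]:
    """Score every row, sort descending, and keep the first k."""
    scored = [(i, cosine(row, query)) for i, row in enumerate(rows)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[: max(k, 0)]
```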

Requirement: Three-tier keyword search (FTS5 → trigram → LIKE)

The system SHALL provide a cascading keyword search strategy for multi-language support.

Scenario: Standard FTS5 for ASCII queries

  • WHEN the query contains only ASCII characters
  • THEN the system SHALL use SQLite FTS5 with the unicode61 tokenizer
  • THEN BM25 ranking SHALL be converted to a [0, 1) score
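The requirement does not say how the BM25 rank is mapped into [0, 1). One possible mapping, shown purely as an assumption, uses the fact that SQLite's `bm25()` returns lower (more negative) values for better matches: negate the rank to get a non-negative relevance r, then take r / (1 + r). The function name `bm25_to_score` is hypothetical.

```python
def bm25_to_score(rank: float) -> float:
    """Map an FTS5 bm25() rank to [0, 1); better (more negative) ranks score higher.

    Assumed mapping, not taken from the spec: r = max(0, -rank), score = r / (1 + r).
    """
    relevance = max(0.0, -rank)
    return relevance / (1.0 + relevance)
```

Under this mapping a rank of -1.0 becomes 0.5 and a rank of -3.0 becomes 0.75, and the score approaches but never reaches 1.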

Scenario: Trigram FTS5 for CJK queries

  • WHEN the query contains CJK (Chinese, Japanese, Korean) characters
  • THEN the system SHALL use SQLite FTS5 with the trigram tokenizer
  • THEN CJK character sequences and ASCII words SHALL be extracted and joined with AND
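A sketch of the term extraction and AND-joining step. The Unicode ranges chosen to detect CJK (Hiragana/Katakana, CJK Unified Ideographs plus Extension A, Hangul syllables) and the names `has_cjk` and `build_trigram_query` are assumptions made for illustration; the actual detection rules may differ.

```python
import re

# Assumed CJK coverage: kana, CJK ideographs (incl. Extension A), Hangul syllables.
_CJK = "\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af"
_CJK_RE = re.compile(f"[{_CJK}]")
_TOKEN_RE = re.compile(f"[{_CJK}]+|[A-Za-z0-9]+")


def has_cjk(query: str) -> bool:
    """True if the query contains at least one character in the assumed CJK ranges."""
    return _CJK_RE.search(query) is not None


def build_trigram_query(query: str) -> str:
    """Extract CJK runs and ASCII words and join them as quoted terms with AND."""
    tokens = _TOKEN_RE.findall(query)
    # Tokens contain only letters, digits, and CJK characters, so quoting is safe.
    return " AND ".join(f'"{token}"' for token in tokens)
```

The resulting string would be passed as the right-hand side of an FTS5 `MATCH`. Note that FTS5's trigram tokenizer does not match terms shorter than three characters, so short CJK terms would rely on the LIKE fallback.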

Scenario: LIKE fallback for edge cases

  • WHEN FTS5 is unavailable or returns empty results
  • THEN the system SHALL fall back to LIKE-based search
  • THEN CJK runs (1+ chars) and ASCII words (3+ chars) SHALL be matched independently
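The LIKE tier could be sketched like this with the standard `sqlite3` module. Several details are assumptions: the table `chunks(id, text)`, the function names, and in particular the reading of "matched independently" as "each term gets its own LIKE test, and the score is the fraction of terms that hit".

```python
import re
import sqlite3
from typing import List, Tuple

# CJK runs of any length, ASCII words of three or more characters.
_TERM_RE = re.compile(
    "[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]+|[A-Za-z0-9]{3,}"
)


def like_terms(query: str) -> List[str]:
    """Split a query into the terms the LIKE fallback will test."""
    return _TERM_RE.findall(query)


def like_search(
    conn: sqlite3.Connection, query: str, limit: int = 10
) -> List[Tuple[int, float]]:
    """Return (chunk_id, score) where score = matched terms / total terms."""
    terms = like_terms(query)
    if not terms:
        return []
    # Each (text LIKE ?) evaluates to 0 or 1; their sum counts matching terms.
    hit_count = " + ".join("(text LIKE ?)" for _ in terms)
    sql = (
        f"SELECT id, n FROM (SELECT id, ({hit_count}) AS n FROM chunks) "
        "WHERE n > 0 ORDER BY n DESC, id LIMIT ?"
    )
    params = [f"%{term}%" for term in terms] + [limit]
    return [(row[0], row[1] / len(terms)) for row in conn.execute(sql, params)]
```

Because the extracted terms contain only letters, digits, and CJK characters, they cannot carry the LIKE wildcards `%` or `_`, so no escaping is needed in this sketch.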

Requirement: Temporal decay for dated memory files

The system SHALL apply exponential decay to search scores for dated memory files.

Scenario: Decay applied to dated files

  • WHEN a memory chunk path matches YYYY-MM-DD.md
  • THEN the combined score SHALL be multiplied by exp(-ln(2)/half_life * age_days)
  • THEN the default half_life SHALL be 30 days
  • WHEN the path does not match the dated pattern (e.g., MEMORY.md)
  • THEN no decay SHALL be applied (multiplier = 1.0)
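The decay multiplier can be sketched as below. The function name `decay_multiplier`, the regular expression used to recognize a dated path, and the clamping of future-dated files to an age of 0 are assumptions; only the formula and the 30-day default come from the requirement.

```python
import math
import re
from datetime import date

# Assumed pattern: the file name ends in YYYY-MM-DD.md.
_DATED_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})\.md$")


def decay_multiplier(path: str, today: date, half_life_days: float = 30.0) -> float:
    """Return exp(-ln(2)/half_life * age_days) for dated files, else 1.0."""
    match = _DATED_RE.search(path)
    if match is None:
        return 1.0  # undated file such as MEMORY.md: no decay
    try:
        file_date = date(*(int(part) for part in match.groups()))
    except ValueError:
        return 1.0  # digits that do not form a real calendar date
    # Assumption: a file dated in the future is treated as age 0.
    age_days = max((today - file_date).days, 0)
    return math.exp(-math.log(2) / half_life_days * age_days)
```

With the default half-life, a file exactly 30 days old has its combined score halved, and one 60 days old keeps a quarter of it.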

Requirement: Result filtering and limits

The system SHALL filter search results by minimum score and maximum count.

Scenario: Min score threshold

  • WHEN search results are merged
  • THEN results with score below min_score (default 0.1) SHALL be discarded

Scenario: Max results limit

  • WHEN search results exceed max_results
  • THEN only the top max_results by combined score SHALL be returned
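Both filters can be applied in one pass, as in this sketch. The name `filter_results` and the `(chunk_id, score)` tuple shape are illustrative, and the default of 10 for `max_results` is a placeholder since the requirement gives no default.

```python
from typing import List, Tuple


def filter_results(
    results: List[Tuple[str, float]],
    min_score: float = 0.1,
    max_results: int = 10,  # placeholder default; not specified by the requirement
) -> List[Tuple[str, float]]:
    """Drop results scoring below min_score, then keep the top max_results."""
    kept = [(cid, score) for cid, score in results if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[: max(max_results, 0)]
```

A result scoring exactly `min_score` is kept, since only scores below the threshold are discarded.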