Drop 9 batch proposals that are superseded by the boocode-lift-analysis (boocontext-audit, conductor upgrades, self-healing/verify-gate skills): add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform, conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul, agent-reliability. Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only) that provide zero documentation value over the existing CHANGELOG.md + git tags.
ADDED Requirements
Requirement: Token budget management
The system SHALL manage LLM context window limits by tracking token usage and triggering summarization when thresholds are exceeded.
Scenario: Token threshold exceeded
- WHEN cumulative message tokens exceed the `max_tokens` configuration
- THEN the system SHALL identify messages to summarize starting from oldest
- THEN the system SHALL replace summarized messages with a `RunningSummary` object
- THEN the system SHALL ensure remaining messages + summary fit within the `max_tokens` budget
Scenario: Partial token budget allocation
- WHEN `max_summary_tokens` is configured (default 256)
- THEN the system SHALL reserve `max_summary_tokens` tokens for the summary itself
- THEN remaining messages SHALL be trimmed to fit within `max_tokens - max_summary_tokens`
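The two scenarios above can be sketched as a single split step. This is a minimal illustration, not the shipped implementation: the `Message` type, the `split_for_summary` helper, and the four-characters-per-token counter are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    # Hypothetical message shape, for illustration only.
    id: str
    content: str


def count_tokens(text: str) -> int:
    # Stand-in counter (assumption): roughly four characters per token.
    return len(text) // 4


def split_for_summary(
    messages: List[Message],
    max_tokens: int,
    max_summary_tokens: int = 256,
) -> Tuple[List[Message], List[Message]]:
    """Return (to_summarize, to_keep) for one trimming pass."""
    total = sum(count_tokens(m.content) for m in messages)
    if total <= max_tokens:
        # Threshold not exceeded: nothing to summarize.
        return [], list(messages)

    # Reserve room for the summary itself.
    budget = max_tokens - max_summary_tokens

    # Walk from newest to oldest, keeping messages while they fit,
    # so that the oldest messages are the ones summarized.
    kept: List[Message] = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message.content)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()

    to_summarize = list(messages[: len(messages) - len(kept)])
    return to_summarize, kept
```

With five 10-token messages, `max_tokens=40` and `max_summary_tokens=10`, the budget for kept messages is 30 tokens, so the two oldest messages are summarized and the three newest are kept.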
Requirement: Incremental summarization
The system SHALL support incremental summarization across multiple turns, tracking which messages have already been summarized to avoid redundant work.
Scenario: First summarization
- WHEN no existing `RunningSummary` exists and token threshold is exceeded
- THEN the system SHALL call the LLM with an initial summary prompt
- THEN the system SHALL return a `RunningSummary` with `summary`, `summarized_message_ids` set, and `last_summarized_message_id`
Scenario: Subsequent summarization (append)
- WHEN a `RunningSummary` exists and new messages exceed threshold
- THEN the system SHALL call the LLM with the existing summary plus new messages
- THEN the system SHALL extend `summarized_message_ids` with newly summarized message IDs
- THEN the system SHALL update `last_summarized_message_id`
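A minimal sketch of the first-time and append paths, assuming the `RunningSummary` fields named above. The `Message` type, the `summarize` helper, the prompt wording, and the `llm` callable (any function from prompt string to summary string) are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Message:
    # Hypothetical message shape, for illustration only.
    id: str
    content: str


@dataclass
class RunningSummary:
    summary: str
    summarized_message_ids: Set[str] = field(default_factory=set)
    last_summarized_message_id: Optional[str] = None


def summarize(
    messages: List[Message],
    llm: Callable[[str], str],
    existing: Optional[RunningSummary] = None,
) -> Optional[RunningSummary]:
    """Summarize only messages not already covered by `existing`."""
    seen = existing.summarized_message_ids if existing else set()
    new = [m for m in messages if m.id not in seen]
    if not new:
        # Nothing new to fold in: avoid a redundant LLM call.
        return existing

    transcript = "\n".join(m.content for m in new)
    if existing is None:
        # First summarization: initial summary prompt.
        prompt = f"Summarize this conversation:\n{transcript}"
    else:
        # Subsequent summarization: existing summary plus new messages.
        prompt = (
            f"Existing summary:\n{existing.summary}\n\n"
            f"Extend it with these new messages:\n{transcript}"
        )

    return RunningSummary(
        summary=llm(prompt),
        summarized_message_ids=seen | {m.id for m in new},
        last_summarized_message_id=new[-1].id,
    )
```

Filtering on `summarized_message_ids` before building the prompt is what keeps a message from being summarized twice when the same message list is passed in on a later turn.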
Requirement: Context trimming with summarization hook
The system SHALL provide a hook that fires before messages are discarded, allowing the daily tier to capture summarized content.
Scenario: Pre-trim flush
- WHEN messages are about to be discarded (summarized)
- THEN the system SHALL fire a `memory_flush_hook` with the messages being summarized
- THEN the hook SHALL queue the messages for async memory extraction
- THEN the main thread SHALL NOT block on memory extraction
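One way to satisfy the non-blocking requirement is a queue drained by a background worker. The sketch below is an assumption about shape, not the actual hook: `MemoryFlushHook`, `trim_with_flush`, and the `extract` callback are hypothetical names.

```python
import logging
import queue
import threading
from typing import Callable, List

logger = logging.getLogger(__name__)


class MemoryFlushHook:
    """Queues messages for extraction on a background thread."""

    def __init__(self, extract: Callable[[list], None]) -> None:
        self._extract = extract
        self._queue: "queue.Queue[list]" = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def __call__(self, messages: list) -> None:
        # Enqueue a copy and return immediately; the caller never
        # waits for extraction to finish.
        self._queue.put(list(messages))

    def _run(self) -> None:
        while True:
            batch = self._queue.get()
            try:
                self._extract(batch)
            except Exception:
                # Keep the worker alive if one extraction fails.
                logger.exception("memory extraction failed")
            finally:
                self._queue.task_done()

    def wait(self) -> None:
        # For tests and shutdown only: block until the queue drains.
        self._queue.join()


def trim_with_flush(to_summarize: List, to_keep: List, hook: MemoryFlushHook) -> List:
    """Fire the hook before the summarized messages are discarded."""
    if to_summarize:
        hook(to_summarize)
    return to_keep
```

The hook is called before the messages leave the state, so the daily tier sees the full content even though the main thread only pays for a queue insert.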
Requirement: Token counting with fallback
The system SHALL provide accurate token counting using tiktoken when available, with a char-based fallback.
Scenario: tiktoken available
- WHEN tiktoken package is installed
- THEN the system SHALL use `tiktoken.get_encoding("cl100k_base")` for token counting
- THEN token counts SHALL be exact for OpenAI models that use the `cl100k_base` encoding and approximate for models with other tokenizers (e.g. Anthropic)
Scenario: tiktoken unavailable
- WHEN tiktoken is not installed
- THEN the system SHALL fall back to character-based estimation: `len(text) // 4`
- THEN the system SHALL log a warning about missing tiktoken
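The fallback behaviour can be sketched with an optional import. The function names `estimate_tokens` and `count_tokens` are assumptions for this example; `tiktoken.get_encoding("cl100k_base")` and `len(text) // 4` come from the scenarios above.

```python
import logging

logger = logging.getLogger(__name__)

try:
    import tiktoken
except ImportError:
    tiktoken = None
    logger.warning("tiktoken is not installed; falling back to len(text) // 4")

# Loaded lazily on first use so that importing this module stays cheap.
_encoding = None


def estimate_tokens(text: str) -> int:
    """Character-based fallback: roughly four characters per token."""
    return len(text) // 4


def count_tokens(text: str) -> int:
    """Count tokens with tiktoken when available, else estimate."""
    global _encoding
    if tiktoken is None:
        return estimate_tokens(text)
    if _encoding is None:
        _encoding = tiktoken.get_encoding("cl100k_base")
    return len(_encoding.encode(text))
```

The warning is logged once at import time rather than on every call, which keeps logs quiet on hot paths.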
Requirement: Summarization node for LangGraph
The system SHALL provide a SummarizationNode Runnable that integrates into LangGraph state graphs.
Scenario: Graph integration
- WHEN `SummarizationNode` is added to a LangGraph workflow
- THEN it SHALL read messages from `input_messages_key` (default "messages")
- THEN it SHALL write updated messages to `output_messages_key` (default "summarized_messages")
- THEN it SHALL store `RunningSummary` in `context.running_summary`
Scenario: Equality of input/output keys
- WHEN `input_messages_key` equals `output_messages_key`
- THEN the node SHALL emit a `RemoveMessage(REMOVE_ALL_MESSAGES)` to clear previous state
- THEN the node SHALL write the new message list including the summary
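The key-equality rule can be illustrated without importing LangGraph. In this sketch, plain dicts stand in for message objects, and `REMOVE_ALL_MARKER` is a hypothetical stand-in for the `RemoveMessage(REMOVE_ALL_MESSAGES)` value named above; `build_state_update` is likewise an assumed helper, not part of any library.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for RemoveMessage(REMOVE_ALL_MESSAGES).
REMOVE_ALL_MARKER = {"type": "remove", "id": "__remove_all__"}


def build_state_update(
    summary_message: Dict[str, Any],
    kept_messages: List[Dict[str, Any]],
    running_summary: Any,
    input_messages_key: str = "messages",
    output_messages_key: str = "summarized_messages",
) -> Dict[str, Any]:
    """Build the state update a summarization node would return."""
    new_messages = [summary_message, *kept_messages]
    if input_messages_key == output_messages_key:
        # Writing back to the same key: an append-style reducer would
        # otherwise add the new list on top of the old messages, so
        # lead with a marker that clears the previous state.
        new_messages = [REMOVE_ALL_MARKER, *new_messages]
    return {
        output_messages_key: new_messages,
        "context": {"running_summary": running_summary},
    }
```

With the default keys the original `messages` list is left untouched and the trimmed list lands under `summarized_messages`; the clear-then-write path only applies when both keys are the same.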