chore(openspec): drop 9 superseded proposals + 11 stub archive files

Drop 9 batch proposals that are superseded by the boocode-lift-analysis
(boocontext-audit, conductor upgrades, self-healing/verify-gate skills):
add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform,
conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul,
agent-reliability.

Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only)
that provide zero documentation value over the existing CHANGELOG.md + git tags.
2026-06-07 22:15:38 +00:00
parent 0d6e9a2413
commit c935687725
119 changed files with 4897 additions and 45 deletions

@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-06-07

@@ -0,0 +1,32 @@
## Context
BooCode has no structured behavioral enforcement. Agent behavior is guided by system prompts and CLAUDE.md — advisory, not enforceable. The `boocontext-audit` package (already TypeScript, already in /opt/forks) provides a complete behavioral compliance engine: Guideline model, 6-batch matcher, relational resolver, audit trail, and graded recovery.
## Goals / Non-Goals
**Goals:**
- Import boocontext-audit's Guideline model (condition/action rules with criticality)
- Import multi-batch matcher (Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, LowCriticality)
- Import RelationalResolver (DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES)
- Import audit middleware (PostToolUse, Stop, UserPromptSubmit hooks)
- Import graded context recovery (L0-L4)
- Wire guideline evaluation into agent's inference loop
**Non-Goals:**
- Journey DAG integration (future scope)
- MCP middleware integration (focus on in-process hooks)
## Decisions
- **Direct import from local fork**: boocontext-audit is at `/opt/forks/boocontext-audit/`. Use workspace dependency or npm link.
- **Guideline storage**: InMemoryGuidelineStore for development, FileRelationshipStore for production.
- **Batch execution**: Run observational + actionable batches in parallel, then disambiguation, then response analysis.
- **SchematicGenerator**: Abstract LLM caller. Configure per-batch model (use cheap model for matching, expensive for disambiguation).
- **Audit hooks**: Wire PostToolUse → appendToBuffer(), Stop → flushBuffer(), UserPromptSubmit → injectSessionContext().
- **Recovery**: Load L0 (index) by default. L2 (user corrections) on `/recover L2`. L3 (full) on `/recover L3`.
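
The batch ordering decided above could be sketched roughly as follows. This is a minimal illustration, not boocontext-audit's API: `BatchKind`, `LlmCaller`, `runBatch`, and `evaluate` are hypothetical names, and forwarding only the stage-1 matches to later stages is an assumption.

```typescript
// Hypothetical names throughout; none are taken from boocontext-audit.
type BatchKind = "observational" | "actionable" | "disambiguation" | "responseAnalysis";

interface BatchResult {
  kind: BatchKind;
  matchedIds: string[];
}

// Stand-in for the abstract LLM caller: returns the IDs the batch considers matched.
type LlmCaller = (kind: BatchKind, guidelineIds: string[]) => Promise<string[]>;

async function runBatch(kind: BatchKind, ids: string[], call: LlmCaller): Promise<BatchResult> {
  return { kind, matchedIds: await call(kind, ids) };
}

async function evaluate(ids: string[], call: LlmCaller): Promise<BatchResult[]> {
  // Stage 1: the two independent batches run concurrently.
  const [observational, actionable] = await Promise.all([
    runBatch("observational", ids, call),
    runBatch("actionable", ids, call),
  ]);

  // Stage 2 (assumption): disambiguation sees the de-duplicated stage-1 matches.
  const candidates = Array.from(
    new Set(observational.matchedIds.concat(actionable.matchedIds)),
  );
  const disambiguation = await runBatch("disambiguation", candidates, call);

  // Stage 3: response analysis runs last, on the disambiguated set.
  const responseAnalysis = await runBatch("responseAnalysis", disambiguation.matchedIds, call);

  return [observational, actionable, disambiguation, responseAnalysis];
}
```

A per-batch model choice (cheap for matching, expensive for disambiguation) would fit inside the `LlmCaller` implementation, since it receives the batch kind.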
## Risks / Trade-offs
- **LLM overhead**: Each batch is an LLM call, so cost grows with both the 6 batch types and the N guidelines evaluated. Mitigation: batch size limits to cap cost, parallel execution to cap latency.
- **Cold start**: No guidelines exist initially, and users must define them. Mitigation: ship with 5-10 built-in safety guidelines.
- **boocontext-audit maturity**: v0.1.0. Review code quality before direct import.

@@ -0,0 +1,22 @@
## Why
BooCode has no structured way to enforce agent behavior rules. The `boocontext-audit` package (already TypeScript, zero external deps) provides a complete behavioral compliance engine ported from Parlant: Guideline condition/action model, multi-batch LLM matcher, relational resolver, audit middleware, and graded context recovery. Adding this gives BooCode structured rule enforcement far beyond simple CLAUDE.md guidelines.
## What Changes
- Import boocontext-audit as a dependency in apps/coder/
- Add Guideline model: natural language condition/action rules with criticality
- Add multi-batch matcher: observational, actionable, previously-applied, disambiguation, response analysis, and low-criticality batches
- Add RelationalResolver: DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES relationship resolution
- Add audit middleware: PostToolUse/Stop/UserPromptSubmit hooks with JSONL buffer
- Add graded context recovery: L0-L4 recovery levels
- Wire guideline evaluation into agent's inference loop
## Capabilities
### New Capabilities
- `guideline-model`: Natural language condition/action rules with criticality and priority
- `multi-batch-matcher`: 6-batch LLM evaluation for context-relevant rule matching
- `relational-resolver`: Dependency/priority/entailment resolution with iterative convergence
- `audit-middleware`: PostToolUse/Stop/UserPromptSubmit hooks with JSONL trail
- `graded-recovery`: L0-L4 context recovery for session continuity

@@ -0,0 +1,21 @@
## ADDED Requirements
### Requirement: PostToolUse audit logging
- **WHEN** a tool is used
- **THEN** the tool name, input summary, and timestamp are appended to the JSONL audit buffer
### Requirement: Stop hook flush
- **WHEN** a response completes
- **THEN** the audit buffer is flushed to the session audit trail and the index is updated
### Requirement: UserPromptSubmit context injection
- **WHEN** a user message is submitted
- **THEN** session context (session ID, record count, critical alerts) is injected into the prompt
### Requirement: Anomaly detection
- **WHEN** audit records are checked against alert rules
- **THEN** anomalies at CRITICAL level are injected into the context
#### Scenario: Full audit trail
- **WHEN** an agent runs 10 tool calls across 3 turns
- **THEN** the audit trail contains 10 JSONL records, a session summary, and an updated index
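
The three hooks above could be sketched with an in-memory buffer like the one below. This is an illustration under assumptions, not the package's implementation: the `AuditRecord` fields, the 80-character input summary, and the shape of the index entries are all hypothetical, and a real version would write to disk rather than to arrays.

```typescript
// Hypothetical shapes; only the three hook names come from the proposal text.
interface AuditRecord {
  tool: string;
  inputSummary: string;
  timestamp: string; // ISO 8601
}

class AuditBuffer {
  private lines: string[] = []; // pending JSONL lines
  private trail: string[] = []; // flushed session audit trail
  private index: { flushedAt: string; recordCount: number }[] = [];

  // PostToolUse: append one JSON object per line to the buffer.
  appendToBuffer(tool: string, input: unknown, now: Date = new Date()): void {
    const record: AuditRecord = {
      tool,
      inputSummary: (JSON.stringify(input) ?? "").slice(0, 80), // assumed summary rule
      timestamp: now.toISOString(),
    };
    this.lines.push(JSON.stringify(record));
  }

  // Stop: move buffered lines to the trail and update the index.
  flushBuffer(now: Date = new Date()): number {
    const flushed = this.lines.length;
    this.trail.push(...this.lines);
    this.lines = [];
    this.index.push({ flushedAt: now.toISOString(), recordCount: this.trail.length });
    return flushed;
  }

  // UserPromptSubmit: build the context string that gets injected into the prompt.
  injectSessionContext(sessionId: string, criticalAlerts: string[]): string {
    const header = `session=${sessionId} records=${this.trail.length}`;
    return criticalAlerts.length > 0
      ? `${header} CRITICAL: ${criticalAlerts.join("; ")}`
      : header;
  }

  // The flushed trail as JSONL text (one record per line).
  toJsonl(): string {
    return this.trail.join("\n");
  }
}
```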

@@ -0,0 +1,25 @@
## ADDED Requirements
### Requirement: L0 recovery (index summary)
- **WHEN** /recover is called without arguments
- **THEN** the last 5 index entries are loaded (~200 tokens)
### Requirement: L1 recovery (session state)
- **WHEN** /recover L1 is called
- **THEN** current session.json + last 3 audit trail entries are loaded (~500 tokens)
### Requirement: L2 recovery (user corrections)
- **WHEN** /recover L2 is called
- **THEN** ALL user_correction records across all sessions are loaded (~1000 tokens)
### Requirement: L3 recovery (full context)
- **WHEN** /recover L3 is called
- **THEN** full audit trail + all pending records are loaded (~3000 tokens)
### Requirement: Priority loading
- **WHEN** recovering context
- **THEN** user_correction records are loaded first (highest priority)
#### Scenario: Session crash recovery
- **WHEN** an agent session crashes and restarts with /recover
- **THEN** the agent gets the index summary, last session state, and all user corrections
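
One way to read the recovery levels above is as a single function that selects progressively more context. The sketch below assumes hypothetical data shapes (`AuditEntry`, `RecoverySource`) and omits the "pending records" part of L3, because the spec does not say where those are stored.

```typescript
// Hypothetical data shapes; none are taken from boocontext-audit.
type RecoveryLevel = "L0" | "L1" | "L2" | "L3";

interface AuditEntry {
  type: "tool_use" | "user_correction";
  text: string;
}

interface RecoverySource {
  index: string[];      // one summary line per index entry, oldest first
  sessionState: string; // contents of the current session.json
  trail: AuditEntry[];  // audit trail entries, oldest first
}

function recover(level: RecoveryLevel, src: RecoverySource): string[] {
  const corrections = src.trail
    .filter((e) => e.type === "user_correction")
    .map((e) => e.text);

  switch (level) {
    case "L0": // index summary: last 5 index entries
      return src.index.slice(-5);
    case "L1": // session state plus the last 3 trail entries
      return [src.sessionState, ...src.trail.slice(-3).map((e) => e.text)];
    case "L2": // all user corrections
      return corrections;
    case "L3": {
      // Full trail, with priority loading: user corrections first.
      const rest = src.trail
        .filter((e) => e.type !== "user_correction")
        .map((e) => e.text);
      return [...corrections, ...rest];
    }
  }
}
```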

@@ -0,0 +1,17 @@
## ADDED Requirements
### Requirement: Guideline creation
- **WHEN** creating a guideline with condition, action, and criticality
- **THEN** it is stored with unique ID and metadata
### Requirement: Guideline evaluation
- **WHEN** an agent action triggers guideline evaluation
- **THEN** matching guidelines are activated with score and rationale
### Requirement: Criticality levels
- **WHEN** evaluating guidelines
- **THEN** guidelines are filtered by criticality (low/medium/high/critical), with higher-criticality guidelines taking precedence
#### Scenario: Security policy enforcement
- **WHEN** an agent attempts to edit a file matching a security guideline condition
- **THEN** the guideline matcher returns the relevant rule with CRITICAL severity
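
The creation and criticality requirements above might look like this in outline. The field names, the sequential ID scheme, and the class name `GuidelineStoreSketch` are assumptions for illustration; the actual `Guideline` and `GuidelineStore` exports of boocontext-audit may differ.

```typescript
// Hypothetical model; the real Guideline type may carry different fields.
type Criticality = "low" | "medium" | "high" | "critical";

interface Guideline {
  id: string;
  condition: string; // natural-language condition
  action: string;    // natural-language action
  criticality: Criticality;
  createdAt: string; // metadata, ISO 8601
}

// Numeric rank so criticalities can be compared and sorted.
const RANK: Record<Criticality, number> = { low: 0, medium: 1, high: 2, critical: 3 };

class GuidelineStoreSketch {
  private guidelines = new Map<string, Guideline>();
  private nextId = 1;

  // Store a guideline with a unique ID and creation metadata.
  create(condition: string, action: string, criticality: Criticality): Guideline {
    const g: Guideline = {
      id: `g${this.nextId++}`, // assumed ID scheme
      condition,
      action,
      criticality,
      createdAt: new Date().toISOString(),
    };
    this.guidelines.set(g.id, g);
    return g;
  }

  // Keep guidelines at or above `min`, highest criticality first.
  list(min: Criticality = "low"): Guideline[] {
    return Array.from(this.guidelines.values())
      .filter((g) => RANK[g.criticality] >= RANK[min])
      .sort((a, b) => RANK[b.criticality] - RANK[a.criticality]);
  }
}
```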

@@ -0,0 +1,17 @@
## ADDED Requirements
### Requirement: Six batch types
- **WHEN** guidelines are evaluated
- **THEN** they are processed through: Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, and LowCriticality batches
### Requirement: Parallel batch execution
- **WHEN** independent batches are ready
- **THEN** they execute in parallel (observational + actionable run concurrently)
### Requirement: Structured LLM output per batch
- **WHEN** a batch calls the LLM
- **THEN** it uses a structured schema specific to the batch type (e.g., applies: boolean for actionable, was_followed: boolean for response analysis)
#### Scenario: Multi-rule evaluation
- **WHEN** an agent action matches 3 guidelines across different criticalities
- **THEN** the matcher returns all applicable matches with scores, with CRITICAL matches flagged
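
The "structured schema per batch type" requirement could be modelled as a discriminated union that is validated when the LLM's JSON comes back. Only `applies` and `was_followed` appear in the requirement text; `guidelineId`, `score`, `rationale`, and `parseBatchOutput` are hypothetical, and just two of the six batch types are shown.

```typescript
// Only `applies` and `was_followed` come from the spec; other names are assumed.
interface ActionableOutput {
  batch: "actionable";
  guidelineId: string;
  applies: boolean;
  score: number;
  rationale: string;
}

interface ResponseAnalysisOutput {
  batch: "responseAnalysis";
  guidelineId: string;
  was_followed: boolean;
  rationale: string;
}

type BatchOutput = ActionableOutput | ResponseAnalysisOutput;

// Parse raw LLM output and check it against the schema for the given batch type.
function parseBatchOutput(batch: BatchOutput["batch"], raw: string): BatchOutput {
  const data = JSON.parse(raw) as Record<string, unknown>;
  const { guidelineId, rationale, applies, score, was_followed } = data;

  if (typeof guidelineId !== "string" || typeof rationale !== "string") {
    throw new Error("missing guidelineId or rationale");
  }
  if (batch === "actionable") {
    if (typeof applies !== "boolean" || typeof score !== "number") {
      throw new Error("actionable output needs applies:boolean and score:number");
    }
    return { batch, guidelineId, applies, score, rationale };
  }
  if (typeof was_followed !== "boolean") {
    throw new Error("responseAnalysis output needs was_followed:boolean");
  }
  return { batch, guidelineId, was_followed, rationale };
}
```

Rejecting malformed output at this boundary keeps later stages from acting on a match the model never actually asserted.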

@@ -0,0 +1,21 @@
## ADDED Requirements
### Requirement: DEPENDS_ON resolution
- **WHEN** guideline A depends on guideline B
- **THEN** B is activated if A is activated
### Requirement: PRIORITIZES resolution
- **WHEN** guideline A prioritizes over guideline B
- **THEN** B is filtered out if both match
### Requirement: ENTAILS resolution
- **WHEN** guideline A entails guideline B
- **THEN** B is automatically activated when A is activated
### Requirement: Iterative convergence
- **WHEN** resolving relationships
- **THEN** the resolver iterates until a pass produces no further changes, up to a maximum of 100 iterations
#### Scenario: Conflicting guideline resolution
- **WHEN** a HIGH priority guideline matches and a LOW priority guideline also matches
- **THEN** the LOW priority guideline is filtered out via numerical priority resolution
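
The iterative convergence requirement can be illustrated with a small fixed-point loop. The sketch below covers only ENTAILS and PRIORITIZES as worded above; `Relation` and `resolve` are hypothetical names, and remembering suppressed guidelines so they are never re-activated is an assumption made to guarantee termination.

```typescript
// Hypothetical types; only the relation names come from the spec.
type RelationKind = "ENTAILS" | "PRIORITIZES";

interface Relation {
  kind: RelationKind;
  source: string; // guideline A
  target: string; // guideline B
}

function resolve(matched: string[], relations: Relation[], maxIterations = 100): string[] {
  const active = new Set(matched);
  const suppressed = new Set<string>(); // assumption: a filtered guideline stays filtered

  for (let i = 0; i < maxIterations; i++) {
    let changed = false;
    for (const r of relations) {
      if (!active.has(r.source)) continue;
      if (r.kind === "ENTAILS" && !active.has(r.target) && !suppressed.has(r.target)) {
        active.add(r.target); // A active => B activated
        changed = true;
      }
      if (r.kind === "PRIORITIZES" && active.has(r.target)) {
        active.delete(r.target); // both match => B filtered out
        suppressed.add(r.target);
        changed = true;
      }
    }
    if (!changed) break; // stable state reached
  }
  return Array.from(active).sort();
}
```

Several passes are needed because a relation may only become applicable after an earlier pass activates its source, as with chained entailments listed out of order.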

@@ -0,0 +1,56 @@
## 1. Import boocontext-audit as dependency
- [ ] 1.1 Add boocontext-audit as workspace dependency
- [ ] 1.2 Verify Guideline, GuidelineStore, SchematicGenerator exports
## 2. Implement Guideline model
- [ ] 2.1 Create GuidelineManager wrapping GuidelineStore
- [ ] 2.2 Add CRUD operations for guidelines (create, read, update, delete, list)
- [ ] 2.3 Add InMemoryGuidelineStore and FileRelationshipStore backends
- [ ] 2.4 Add criticality filtering and priority sorting
## 3. Implement multi-batch matcher
- [ ] 3.1 Create MatcherService wrapping GenericGuidelineMatchingStrategy
- [ ] 3.2 Add Observational, Actionable, PreviouslyApplied, Disambiguation, ResponseAnalysis, LowCriticality batch types
- [ ] 3.3 Add parallel batch execution for independent batches
- [ ] 3.4 Add SchematicGenerator abstraction for LLM batch calls
## 4. Implement RelationalResolver
- [ ] 4.1 Create ResolverService wrapping RelationalResolver
- [ ] 4.2 Implement DEPENDS_ON, PRIORITIZES, ENTAILS, TAG_ALL, TAG_PRIORITIZES resolution
- [ ] 4.3 Add iterative convergence loop (max 100 iterations)
- [ ] 4.4 Add resolution logging
## 5. Implement audit middleware
- [ ] 5.1 Create AuditService with PostToolUse middleware (JSONL buffer append)
- [ ] 5.2 Add Stop middleware (buffer flush to session trail)
- [ ] 5.3 Add UserPromptSubmit middleware (session context injection + CRITICAL alerts)
- [ ] 5.4 Wire audit middleware into agent's inference lifecycle
## 6. Implement graded context recovery
- [ ] 6.1 Create RecoveryService with L0-L4 recovery methods
- [ ] 6.2 Implement L0: read last 5 index entries
- [ ] 6.3 Implement L1: session.json + last 3 audit trail entries
- [ ] 6.4 Implement L2: all user_correction records
- [ ] 6.5 Implement L3: full audit trail
- [ ] 6.6 Add priority loading (user corrections first)
## 7. Wire into agent inference loop
- [ ] 7.1 Run guideline evaluation before each agent turn
- [ ] 7.2 Inject active guidelines into system prompt
- [ ] 7.3 Record guideline matches in turn metadata
- [ ] 7.4 Add guideline management commands (add-guideline, list-guidelines, remove-guideline)
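
Tasks 7.2 and 7.3 could be as small as the sketch below. The prompt layout, the `ActiveGuideline` shape, and the `guidelineMatches` metadata key are assumptions for illustration, not an agreed format.

```typescript
// Hypothetical shape for a guideline that matched in the current turn.
interface ActiveGuideline {
  id: string;
  condition: string;
  action: string;
  criticality: "low" | "medium" | "high" | "critical";
  score: number;
}

// 7.2: append the active guidelines to the base system prompt.
function buildSystemPrompt(base: string, active: ActiveGuideline[]): string {
  if (active.length === 0) return base;
  const lines = active.map(
    (g) => `- [${g.criticality.toUpperCase()}] When ${g.condition}: ${g.action}`,
  );
  return `${base}\n\n## Active guidelines\n${lines.join("\n")}`;
}

// 7.3: record which guidelines matched, and how strongly, in the turn metadata.
function turnMetadata(active: ActiveGuideline[]): { guidelineMatches: { id: string; score: number }[] } {
  return { guidelineMatches: active.map((g) => ({ id: g.id, score: g.score })) };
}
```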
## 8. Test and verify
- [ ] 8.1 Test guideline creation and storage
- [ ] 8.2 Test multi-batch matching with sample guidelines
- [ ] 8.3 Test relational resolution with dependencies
- [ ] 8.4 Test audit middleware tool logging
- [ ] 8.5 Test graded recovery at all levels