chore(openspec): drop 9 superseded proposals + 11 stub archive files
Drop 9 batch proposals that are superseded by the boocode-lift-analysis (boocontext-audit, conductor upgrades, self-healing/verify-gate skills): add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform, conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul, agent-reliability. Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only) that provide zero documentation value over the existing CHANGELOG.md + git tags.
76
openspec/changes/audit-harness-integration/design.md
Normal file
@@ -0,0 +1,76 @@

## Context

boocontext (TypeScript code scanner, MCP server) and codecontext (Go code graph engine, MCP server) currently lack persistent audit trails for their operations. When a scan or graph analysis is interrupted, context is lost: tool calls leave no recoverable log, session state disappears, and there is no mechanism to detect repeated mistakes or anomalies across runs.

The audit-harness repo provides a production-tested 3-layer audit enforcement system. Its hooks (PostToolUse, Stop, UserPromptSubmit) intercept every tool call, buffer records to JSONL, flush them to session trails, and inject session context on every user turn. Its Python core library (AuditContext, 600 lines) provides environment snapshots, SHA256 verification, configurable anomaly detection, and unified index management.

This design ports the **hook audit pipeline** into both tools as MCP server middleware, and the **session lifecycle** as boocode commands.

## Goals / Non-Goals

**Goals:**

- Port PostToolUse pattern as MCP middleware → auto-log every tool call to a JSONL buffer
- Port Stop pattern as MCP middleware → flush buffer to session trail + update index on completion
- Port UserPromptSubmit pattern as MCP middleware → inject session context + CRITICAL alerts on each request
- Port /start, /end, /recover, /report-daily as boocode commands
- Port AuditContext's unified index schema for cross-session tracking
- All patterns opt-in via configuration, zero breaking changes

**Non-Goals:**

- Full AuditContext Python class port (environment snapshots, anomaly lambdas): deferred to Phase 2
- Parlant's relationship resolver (ARQ): handled in a separate change
- Codex/Claude-specific hook formats: only the MCP middleware abstraction is in scope

## Decisions

### Decision 1: MCP Middleware over Hook Scripts
**Choice**: Implement audit as MCP server middleware (TypeScript for boocontext, Go for codecontext), not shell hooks.
**Rationale**: Shell hooks, which audit-harness uses, are platform-specific (Claude Code vs Codex). MCP middleware is client-agnostic: any MCP client that connects to the server gets audit trails automatically. Both boocontext and codecontext already have MCP servers.
**Alternatives considered**: Shell hooks (rejected: platform-specific), Python subprocess (rejected: dependency overhead).
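A minimal TypeScript sketch of what the PostToolUse pattern could look like as middleware. The names `AuditRecord`, `ToolHandler`, `AuditSink`, and `withAudit` are hypothetical and do not come from boocontext, codecontext, or any MCP SDK; the handler signature is an assumption chosen to keep the example self-contained.

```typescript
// Hypothetical shapes: none of these names exist in boocontext or codecontext.
export interface AuditRecord {
  ts: string; // ISO 8601 start time of the call
  tool: string; // tool name as registered with the MCP server
  ok: boolean; // false when the handler threw
  duration_ms: number;
  error?: string;
}

// Assumed handler signature; a real MCP SDK may differ.
export type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
export type AuditSink = (record: AuditRecord) => void;

// Wraps a tool handler so each call attempts to emit one audit record.
// The handler's result or error is passed through unchanged.
export function withAudit(
  tool: string,
  handler: ToolHandler,
  sink: AuditSink,
): ToolHandler {
  return async (args) => {
    const started = Date.now();
    const emit = (ok: boolean, error?: string): void => {
      const record: AuditRecord = {
        ts: new Date(started).toISOString(),
        tool,
        ok,
        duration_ms: Date.now() - started,
      };
      if (error !== undefined) record.error = error;
      try {
        sink(record);
      } catch {
        // A failing audit sink must never break the tool call itself.
      }
    };
    try {
      const result = await handler(args);
      emit(true);
      return result;
    } catch (err) {
      emit(false, err instanceof Error ? err.message : String(err));
      throw err;
    }
  };
}
```

Wrapping at registration time keeps individual tools unaware of auditing, which is what would let any MCP client benefit without client-side hooks.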

### Decision 2: JSONL Buffer + File Rotation
**Choice**: Port audit-harness's JSONL buffer pattern exactly: append-only JSONL with size-limited rotation (1MB per buffer file).
**Rationale**: JSONL is grep-able, pipeable, compressible, and append-only. The 1MB limit keeps any single buffer file from growing without bound. This is the most battle-tested pattern in audit-harness.
**Alternatives considered**: SQLite (rejected: adds a DB dependency for a log), structured logging (rejected: not designed for session replay).
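The append-and-rotate behavior described above can be sketched roughly as follows. The function name `appendRecord` and the file names `buffer.jsonl` / `buffer.<n>.jsonl` are assumptions for illustration; audit-harness's actual naming is not specified here.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const MAX_BUFFER_BYTES = 1024 * 1024; // 1MB per buffer file, per Decision 2

// Hypothetical helper: appends one record as a single JSON line. If the
// write would push the active file past maxBytes, the active file is first
// renamed to buffer.<n>.jsonl and a fresh buffer.jsonl is started.
export function appendRecord(
  dir: string,
  record: Record<string, unknown>,
  maxBytes: number = MAX_BUFFER_BYTES,
): string {
  fs.mkdirSync(dir, { recursive: true });
  const active = path.join(dir, "buffer.jsonl");
  const line = JSON.stringify(record) + "\n";
  const size = fs.existsSync(active) ? fs.statSync(active).size : 0;

  if (size > 0 && size + Buffer.byteLength(line) > maxBytes) {
    // Pick the next unused rotation index.
    let next = 1;
    for (const f of fs.readdirSync(dir)) {
      const m = /^buffer\.(\d+)\.jsonl$/.exec(f);
      if (m) next = Math.max(next, Number(m[1]) + 1);
    }
    fs.renameSync(active, path.join(dir, `buffer.${next}.jsonl`));
  }

  fs.appendFileSync(active, line);
  return active;
}
```

Because every record is one line, rotated files remain independently grep-able and can be compressed or deleted without touching the active buffer.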

### Decision 3: Session Handoff via Pointer File
**Choice**: Port the `.current_session` handshake file pattern: a single file containing the current session ID, read by all hooks.
**Rationale**: This is the simplest reliable inter-process coordination. No locks and no DB are needed, and because the file is written atomically, readers never observe a partial session ID. Works across MCP middleware invocations.
**Alternatives considered**: Environment variable (rejected: not persistent across MCP calls), in-memory state (rejected: lost on restart).
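One way the pointer file could be written and read, as a sketch. The helper names `setCurrentSession` and `getCurrentSession` are hypothetical; the write-temp-then-rename approach is an assumption about how "atomic writes" would be achieved, borrowed from the atomic-replace technique this document attributes to audit-harness's index handling.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const POINTER_FILE = ".current_session"; // file name taken from the design

// Writes the session ID to a temp file in the same directory, then renames
// it over the pointer. A rename within one directory replaces the target in
// a single step on POSIX filesystems, so readers see the old or the new ID.
export function setCurrentSession(auditDir: string, sessionId: string): void {
  fs.mkdirSync(auditDir, { recursive: true });
  const target = path.join(auditDir, POINTER_FILE);
  const tmp = `${target}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, sessionId + "\n");
  fs.renameSync(tmp, target);
}

// Returns the current session ID, or null when no session has been started.
export function getCurrentSession(auditDir: string): string | null {
  try {
    const id = fs.readFileSync(path.join(auditDir, POINTER_FILE), "utf8").trim();
    return id.length > 0 ? id : null;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return null;
    throw err;
  }
}
```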

### Decision 4: Unified Index Schema (JSON)
**Choice**: Port the index.json schema: a top-level `schema_version` plus an `entries[]` array whose items contain `{id, type, task, created, status, record_count}`.
**Rationale**: audit-harness's index schema is proven across 4 skills + 3 hooks writing to the same file. JSON with a version field allows forward-compatible schema evolution.
**Alternatives considered**: SQLite (rejected: overkill for a metadata index), binary format (rejected: not human-readable).
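As an illustration, the schema could be typed and persisted like this. The field names come from the design; the value types (for example `created` as an ISO 8601 string and `schema_version` as an integer) and the helper names `loadIndex`, `upsertEntry`, and `saveIndex` are assumptions.

```typescript
import * as fs from "node:fs";

// Field names are from the design; the value types are assumptions.
export interface IndexEntry {
  id: string;
  type: string;
  task: string;
  created: string; // assumed ISO 8601 timestamp
  status: string;
  record_count: number;
}

export interface AuditIndex {
  schema_version: number; // assumed to be an integer
  entries: IndexEntry[];
}

// Returns an empty version-1 index when the file does not exist yet.
export function loadIndex(file: string): AuditIndex {
  if (!fs.existsSync(file)) return { schema_version: 1, entries: [] };
  return JSON.parse(fs.readFileSync(file, "utf8")) as AuditIndex;
}

// Replaces the entry with the same id, or appends it if the id is new.
export function upsertEntry(index: AuditIndex, entry: IndexEntry): AuditIndex {
  const rest = index.entries.filter((e) => e.id !== entry.id);
  return { ...index, entries: [...rest, entry] };
}

// Writes to a temp file and renames it over the index, so readers never
// see a half-written file.
export function saveIndex(file: string, index: AuditIndex): void {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(index, null, 2));
  fs.renameSync(tmp, file);
}
```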

### Decision 5: Graded Context Recovery (L0-L4)
**Choice**: Port the tiered loading system: Level 0 (index, ~200 tokens) → Level 1 (task state, ~500 tokens) → Level 2 (corrections, ~1000 tokens) → Level 3 (full, ~3000 tokens) → Level 4 (cross-day, ~5000+ tokens).
**Rationale**: Loading all context every time wastes tokens. Graded loading lets the agent fetch exactly what it needs. The token budgets are tuned to avoid context window exhaustion.
**Alternatives considered**: Load-all (rejected: token waste), agent-decides (rejected: inconsistent).
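The level table lends itself to a simple lookup. This sketch assumes each figure is the approximate total cost of loading at that level (not an increment on top of the previous one) and that a caller wants the deepest level that fits a given budget; `RecoveryLevel` and `deepestLevelWithin` are hypothetical names.

```typescript
interface RecoveryLevel {
  level: number;
  loads: string;
  approxTokens: number; // approximate budget from Decision 5
}

// Ordered from cheapest to most expensive.
const LEVELS: readonly RecoveryLevel[] = [
  { level: 0, loads: "index", approxTokens: 200 },
  { level: 1, loads: "task state", approxTokens: 500 },
  { level: 2, loads: "corrections", approxTokens: 1000 },
  { level: 3, loads: "full", approxTokens: 3000 },
  { level: 4, loads: "cross-day", approxTokens: 5000 },
];

// Returns the deepest level whose approximate cost fits the budget,
// or null when even Level 0 does not fit.
export function deepestLevelWithin(budgetTokens: number): RecoveryLevel | null {
  let best: RecoveryLevel | null = null;
  for (const candidate of LEVELS) {
    if (candidate.approxTokens <= budgetTokens) best = candidate;
  }
  return best;
}
```

For example, a budget of 1200 tokens resolves to Level 2, since Level 3's ~3000 tokens would not fit.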

### Decision 6: Opt-in Configuration
**Choice**: All audit features disabled by default. Enabled via `audit.enabled: true` in the MCP server config.
**Rationale**: Zero behavioral change for existing users. Audit is valuable but adds file I/O overhead.
**Alternatives considered**: Always-on (rejected: breaking change), env-var-only (rejected: less discoverable).
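A sketch of how the opt-in defaults might be resolved. The keys `enabled`, `maxRetentionDays`, and `maxContextTokens` appear in this document; the 50-token default is only the value recommended under Open Questions, not a settled decision, and `AuditConfig` / `resolveAuditConfig` are hypothetical names.

```typescript
export interface AuditConfig {
  enabled: boolean;
  maxRetentionDays: number;
  maxContextTokens: number;
}

// Defaults: audit off, 30-day retention, and the ~50-token context prefix
// that the Open Questions section recommends (an assumption, not final).
const DEFAULTS: AuditConfig = {
  enabled: false,
  maxRetentionDays: 30,
  maxContextTokens: 50,
};

// Merges a partial `audit` section over the defaults. A missing section,
// or a missing key, leaves the default in place.
export function resolveAuditConfig(
  raw: { audit?: Partial<AuditConfig> } = {},
): AuditConfig {
  const audit = raw.audit ?? {};
  return {
    enabled: audit.enabled ?? DEFAULTS.enabled,
    maxRetentionDays: audit.maxRetentionDays ?? DEFAULTS.maxRetentionDays,
    maxContextTokens: audit.maxContextTokens ?? DEFAULTS.maxContextTokens,
  };
}
```

With this shape, an existing config that has no `audit` section resolves to `enabled: false`, which matches the zero-behavioral-change goal.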

## Risks / Trade-offs

- **[Risk]** JSONL buffer write contention under high concurrency → **Mitigation**: Open the buffer in append mode (`O_APPEND`) and emit each record as a single write; every write then lands at the current end of file. Note that the `PIPE_BUF` atomicity guarantee applies to pipes, not regular files, so it should not be relied on here. On NFS, where `O_APPEND` is unreliable with multiple writers, use flock() around each append.
- **[Risk]** Disk space from unbounded audit trails → **Mitigation**: Configurable `audit.maxRetentionDays` (default 30), with auto-cleanup on session end.
- **[Risk]** Performance overhead from logging every tool call → **Mitigation**: Buffer writes are async (fire-and-forget). Benchmarked at <0.5ms per write in audit-harness.
- **[Trade-off]** File-based audit is simple but doesn't scale to distributed deployments → Acceptable for single-node code analysis tools. Cluster deployments would need a DB-backed store in Phase 2.
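The retention mitigation could be implemented along these lines. This is a sketch under two assumptions not stated in the design: each session lives in its own directory under a common root, and the directory's modification time is an acceptable stand-in for session age. `pruneOldSessions` is a hypothetical name.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Deletes session directories whose modification time is older than
// maxRetentionDays and returns the names of the directories removed.
export function pruneOldSessions(
  sessionsDir: string,
  maxRetentionDays: number,
  now: number = Date.now(),
): string[] {
  const removed: string[] = [];
  if (!fs.existsSync(sessionsDir)) return removed;

  const cutoff = now - maxRetentionDays * MS_PER_DAY;
  for (const entry of fs.readdirSync(sessionsDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue; // leave index and pointer files alone
    const full = path.join(sessionsDir, entry.name);
    if (fs.statSync(full).mtimeMs < cutoff) {
      fs.rmSync(full, { recursive: true, force: true });
      removed.push(entry.name);
    }
  }
  return removed;
}
```

Calling this from the session-end path would match the "auto-cleanup on session end" behavior without needing a background job.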

## Migration Plan

1. **Phase 1a**: Add audit middleware to boocontext's MCP server (PostToolUse + Stop patterns, JSONL buffer to session dirs)
2. **Phase 1b**: Add audit middleware to codecontext's MCP server (same patterns, Go implementation)
3. **Phase 1c**: Add `/start`, `/end`, `/recover`, `/report-daily` commands to boocode
4. **Phase 2a**: Port AuditContext Python class to TypeScript (environment snapshots, hash verification, anomaly detection)
5. **Phase 2b**: Add CRITICAL anomaly alert injection (UserPromptSubmit pattern)
6. **Rollback**: Remove `audit.enabled: true` from config → audit is fully disabled with no residual runtime effects. Delete the `.audit/` directory to purge all recorded data.

## Open Questions

- Should the buffer flush be synchronous (blocking the response until written) or async (fire-and-forget, which could lose the last N records on a crash)? audit-harness uses a sync flush on the Stop hook; recommend the same for consistency.
- What is the index.json merge strategy when two processes write simultaneously? audit-harness uses atomic file replace (write .tmp → os.replace), which prevents torn files but lets the last writer win. That is adequate for a single-process MCP server.
- What token budget should context injection on UserPromptSubmit use? audit-harness uses ~50 tokens for the context prefix. Recommend the same default, exposed as an `audit.maxContextTokens` config option.