Context
boocode currently has no persistent session management for its agents (the persona agents in data/AGENTS.md). When a session is interrupted, there's no recoverable audit trail, no way to detect repeated mistakes, and no mechanism to enforce learned behavioral guidelines across sessions.
audit-harness provides: hooks (PostToolUse buffer→Stop flush→UserPromptSubmit injection), skills (/start→/end→/recover→/report-daily), and a Python core (AuditContext) with unified index schema.
Parlant provides: GuidelineDocumentStore (versioned, tag/label filtered), JourneyStore (graph-based SOPs), and JourneyGuidelineProjection (node→guideline auto-conversion).
This design ports the high-value subset of both into boocode as agent-facing skills and a TypeScript core library.
Goals / Non-Goals
Goals:
- Define .boo/runs/ directory convention with auto-creation and .gitignore
- Port /start, /end, /recover, /report-daily as boocode skills (markdown)
- Port user_correction record format and detection
- Port GuidelineDocumentStore from Parlant as TypeScript service
- Port Journey → guideline auto-projection (node→guideline conversion)
- Implement find_guideline() lookup by content match
- All features opt-in, zero breaking changes
Non-Goals:
- AuditContext full Python class port (environment snapshots, anomaly lambdas)
- Hooks implementation (PostToolUse/Stop/UserPromptSubmit) — separate batch
- Parlant's vector DB / embedder infrastructure
- Parlant's relationship resolver (ARQ)
- Web UI for guideline management — CLI/skill-only
Decisions
Decision 1: Skill-based commands over CLI tools
Choice: Implement /start, /end, /recover, /report-daily as skill markdown files in data/skills/boocode/, following the existing committing-changes pattern.
Rationale: boocode agents already load skills from this path. Adding a new skill requires no code change to the agent runtime, only a new markdown file with YAML frontmatter. CLI tools would require new API routes, dispatch logic, and frontend work.
Alternatives considered: Fastify API routes (rejected — too heavy for agent-facing commands), shell scripts (rejected — platform-specific).
Decision 2: JSONL buffer + index.json
Choice: Port audit-harness's file layout exactly: audit_buffer.jsonl for live writes, audit_pending.jsonl for agent-authored AUDIT blocks, per-session audit_trail.jsonl for flushed records, index.json for cross-session metadata.
Rationale: audit-harness has already run this layout in production. JSONL is grep-able, append-only, and needs no DB connection.
Alternatives considered: Postgres (rejected — agents don't all have DB access), SQLite (rejected — adds a native dep).
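The buffer-then-flush flow in Decision 2 can be sketched roughly as follows. This is a minimal illustration, not the audit-harness implementation: the AuditRecord fields, the per-session subdirectory, and the shape of index.json are all assumptions, since the document names the files but not their schemas.

```typescript
import { appendFileSync, existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical record shape; the real audit-harness schema is not shown here.
export interface AuditRecord {
  ts: string;
  tool: string;
  summary: string;
}

// Append one record as a single JSON line to the live buffer.
export function appendToBuffer(runsDir: string, record: AuditRecord): void {
  mkdirSync(runsDir, { recursive: true });
  appendFileSync(join(runsDir, "audit_buffer.jsonl"), JSON.stringify(record) + "\n", "utf8");
}

// Move buffered lines into the session's audit_trail.jsonl and update index.json.
// Returns the number of records flushed.
export function flushBuffer(runsDir: string, sessionId: string): number {
  const bufferPath = join(runsDir, "audit_buffer.jsonl");
  if (!existsSync(bufferPath)) return 0;

  const lines = readFileSync(bufferPath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "");
  if (lines.length === 0) return 0;

  // Assumption: each session gets its own subdirectory under the runs directory.
  const sessionDir = join(runsDir, sessionId);
  mkdirSync(sessionDir, { recursive: true });
  appendFileSync(join(sessionDir, "audit_trail.jsonl"), lines.join("\n") + "\n", "utf8");
  writeFileSync(bufferPath, "", "utf8"); // empty the buffer once the trail is written

  // Assumption: index.json maps session IDs to a small metadata object.
  const indexPath = join(runsDir, "index.json");
  const index: Record<string, { records: number }> = existsSync(indexPath)
    ? JSON.parse(readFileSync(indexPath, "utf8"))
    : {};
  const previous = index[sessionId]?.records ?? 0;
  index[sessionId] = { records: previous + lines.length };
  writeFileSync(indexPath, JSON.stringify(index, null, 2), "utf8");

  return lines.length;
}
```

Because every write is an append or a whole-file rewrite of a small file, the sketch needs nothing beyond the Node standard library, which is the property the rationale relies on.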
Decision 3: Timestamp-based session IDs
Choice: adhoc_YYYYMMDD_HHMM format for session IDs, matching audit-harness pattern.
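Rationale: Human-readable and sortable. The format has minute resolution, so IDs collide only when two sessions start within the same minute, which is unlikely in single-user operation.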
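A minimal sketch of generating and validating IDs in the adhoc_YYYYMMDD_HHMM format. The function names are hypothetical, and the use of UTC is an assumption; the document does not say which time zone audit-harness uses.

```typescript
// Build an ID of the form adhoc_YYYYMMDD_HHMM.
// Assumption: UTC fields are used so that IDs sort consistently across machines.
export function makeSessionId(now: Date = new Date()): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  const date = `${now.getUTCFullYear()}${pad(now.getUTCMonth() + 1)}${pad(now.getUTCDate())}`;
  const time = `${pad(now.getUTCHours())}${pad(now.getUTCMinutes())}`;
  return `adhoc_${date}_${time}`;
}

// Check that a string has the expected shape (digits only, no calendar validation).
export function isSessionId(value: string): boolean {
  return /^adhoc_\d{8}_\d{4}$/.test(value);
}
```

Since the fields are zero-padded and ordered from year down to minute, plain string comparison sorts IDs chronologically. Two calls within the same minute return the same ID, so a caller that needs uniqueness would have to check for an existing session first.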
Decision 4: File-based GuidelineStore
Choice: Port GuidelineDocumentStore's abstract interface (create/list/read/update/delete/find) but use filesystem JSON storage instead of Parlant's DocumentDatabase.
Rationale: boocode doesn't have Parlant's document DB abstraction. A JSON-file store is simpler and sufficient for single-user operation. The interface stays the same, so a future Postgres backend can be swapped in.
Alternatives considered: Postgres backend (rejected: adds coupling), in-memory only (rejected: no persistence).
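The split between a storage-agnostic interface and a JSON-file backend might look like the sketch below. The Guideline fields (condition, action, tags), the one-file-per-guideline layout, and all type and class names are assumptions for illustration; versioning, which the Parlant store provides, is left out to keep the example short.

```typescript
import { existsSync, mkdirSync, readdirSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { randomUUID } from "node:crypto";

// Hypothetical guideline shape, not Parlant's actual schema.
export interface Guideline {
  id: string;
  condition: string;
  action: string;
  tags: string[];
}
export type GuidelineInput = Omit<Guideline, "id">;

// Storage-agnostic interface; a database backend could implement it later.
export interface GuidelineStore {
  create(input: GuidelineInput): Guideline;
  list(tag?: string): Guideline[];
  read(id: string): Guideline | undefined;
  update(id: string, patch: Partial<GuidelineInput>): Guideline;
  delete(id: string): boolean;
  find(condition: string, action: string): Guideline | undefined;
}

// File backend: one <id>.json file per guideline inside a single directory.
export class FileGuidelineStore implements GuidelineStore {
  private readonly dir: string;

  constructor(dir: string) {
    this.dir = dir;
    mkdirSync(dir, { recursive: true });
  }

  private pathFor(id: string): string {
    return join(this.dir, `${id}.json`);
  }

  private write(guideline: Guideline): Guideline {
    writeFileSync(this.pathFor(guideline.id), JSON.stringify(guideline, null, 2), "utf8");
    return guideline;
  }

  create(input: GuidelineInput): Guideline {
    return this.write({ id: randomUUID(), ...input });
  }

  list(tag?: string): Guideline[] {
    return readdirSync(this.dir)
      .filter((name) => name.endsWith(".json"))
      .map((name) => JSON.parse(readFileSync(join(this.dir, name), "utf8")) as Guideline)
      .filter((g) => tag === undefined || g.tags.includes(tag));
  }

  read(id: string): Guideline | undefined {
    const path = this.pathFor(id);
    return existsSync(path) ? (JSON.parse(readFileSync(path, "utf8")) as Guideline) : undefined;
  }

  update(id: string, patch: Partial<GuidelineInput>): Guideline {
    const current = this.read(id);
    if (!current) throw new Error(`Guideline not found: ${id}`);
    return this.write({ ...current, ...patch, id });
  }

  delete(id: string): boolean {
    const path = this.pathFor(id);
    if (!existsSync(path)) return false;
    rmSync(path);
    return true;
  }

  // Exact content match on condition and action.
  find(condition: string, action: string): Guideline | undefined {
    return this.list().find((g) => g.condition === condition && g.action === action);
  }
}
```

Callers that depend only on GuidelineStore would not need to change if the file backend were replaced, which is the point of keeping the interface separate.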
Decision 5: Journey → guideline projection as pure function
Choice: Port JourneyGuidelineProjection as a pure function (not a class). Takes a Journey + its nodes/edges, returns Guideline[].
Rationale: The projection logic (DFS traversal, node→guideline conversion, edge metadata grafting) is deterministic and has no side effects. A pure function is simpler to test and compose.
Alternatives considered: Class with JourneyStore dependency (rejected — unnecessary indirection for our use case).
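As a rough illustration of Decision 5, the projection can be written as a single function that walks the journey graph depth-first and emits one guideline per reachable node. The types, the ID scheme, and the default root condition below are assumptions, and the logic is a simplification rather than a port of Parlant's JourneyGuidelineProjection: a node reachable through several edges takes its condition from the first edge that reaches it.

```typescript
export interface JourneyNode { id: string; action: string; }
export interface JourneyEdge { source: string; target: string; condition?: string; }
export interface Journey { id: string; title: string; rootId: string; }

// Hypothetical output shape for a projected guideline.
export interface ProjectedGuideline {
  id: string;
  condition: string;
  action: string;
  metadata: { journeyId: string; nodeId: string; followUps: string[] };
}

// Pure function: same inputs always give the same output, and nothing is mutated.
export function projectJourney(
  journey: Journey,
  nodes: readonly JourneyNode[],
  edges: readonly JourneyEdge[],
): ProjectedGuideline[] {
  const nodeById = new Map<string, JourneyNode>(
    nodes.map((n): [string, JourneyNode] => [n.id, n]),
  );

  // Group outgoing edges by source node, preserving input order.
  const outgoing = new Map<string, JourneyEdge[]>();
  for (const edge of edges) {
    const list = outgoing.get(edge.source) ?? [];
    list.push(edge);
    outgoing.set(edge.source, list);
  }

  const guidelineId = (nodeId: string): string => `journey:${journey.id}:${nodeId}`;
  const visited = new Set<string>();
  const result: ProjectedGuideline[] = [];

  const visit = (nodeId: string, incoming?: JourneyEdge): void => {
    if (visited.has(nodeId)) return; // already projected (also guards against cycles)
    const node = nodeById.get(nodeId);
    if (!node) return; // edge points at an unknown node; skip it
    visited.add(nodeId);

    const out = outgoing.get(nodeId) ?? [];
    result.push({
      id: guidelineId(nodeId),
      // The incoming edge's condition becomes the guideline's condition.
      condition: incoming?.condition ?? `Journey "${journey.title}" is active`,
      action: node.action,
      // Edge metadata is grafted on as the list of possible next guidelines.
      metadata: { journeyId: journey.id, nodeId, followUps: out.map((e) => guidelineId(e.target)) },
    });

    for (const edge of out) visit(edge.target, edge);
  };

  visit(journey.rootId);
  return result;
}
```

With no store dependency, the function can be tested by passing literal nodes and edges, which is the simplification the rationale argues for.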
Risks / Trade-offs
- [Risk] Skills grow stale if agent runtime doesn't load them → Mitigation: Test with existing agent by loading skill explicitly.
- [Risk] JSONL file contention from multiple agents → Mitigation: Single-user homelab. Acceptable.
- [Risk] GuidelineStore JSON files grow unbounded → Mitigation: TBD — add compaction/archival in future batch.
- [Trade-off] File storage is simple but doesn't scale to multi-user → Acceptable for single-user.
Migration / Rollout
- Create openspec spec files (proposal/design/tasks/specs)
- Create .boo/runs/ directory structure (service)
- Create 4 skill files in data/skills/boocode/
- Create core AuditContext TypeScript service
- Create GuidelineStore + Journey service
- Create user_correction utilities
- Update data/AGENTS.md with new agents
- Test with skill invocation