boocode/.omo/plans/openspec-cleanup.md
indifferentketchup 02063072ab chore: add ion package, codesight wiki, work plans, ascli config
New @boocode/ion package (v0.0.1) for inference optimization network.
.codesight/ wiki artifacts for codebase documentation.
.omo/ work plans for openspec cleanup and enhanced file panel.
2026-06-07 22:16:45 +00:00

# Openspec Cleanup & High-Value Batch Implementation
## TL;DR
> **Quick Summary**: Clean up the `openspec/` folder structure (delete 11 stub files, move 5 misplaced proposals, add missing `.openspec.yaml` files), then implement 5 high-value batches: llama-cache-and-spec, pty-enhancements, results-page, token-analyzer-ui, and enhanced-file-panel.
>
> **Deliverables**:
> - Clean openspec folder: stubs removed, archived/ accurate, all batches schema-compliant
> - llama-server KV cache quantization + ngram speculative decoding enabled
> - PTY exit notifications and session metadata
> - `/results` page for orchestrator runs and arena battles (new route)
> - `/analytics` page for token usage dashboard (new route)
> - Enhanced file panel: side-by-side diff, hide whitespace, wrap lines, expand/collapse all
>
> **Estimated Effort**: Medium-Large
> **Parallel Execution**: YES — 3 waves + final verification
> **Critical Path**: Cleanup → Backend impls → Frontend impls → Integration
---
## Context
### Original Request
Analyze `openspec/` folder for structural issues, cross-reference against git tags, and create a work plan for implementing the high-value openspec batch proposals.
### Interview Summary
**Key Discussions**:
- `openspec/changes/` has 22 active batches (all uncommitted, all unshipped) plus `archived/` with 29 entries
- 11 stub files in archived/ are pure noise (ten are 49-66 bytes and contain only "Status: Shipped. Archived."; `v2.2-paseo-providers.md` is 222 bytes)
- 5 misplaced 2026-06-07 proposals were dumped in archived/ — they're active design docs, not shipped batches
- 6 active batches missing `.openspec.yaml`; `openspec/config.yaml` is empty
- Active proposals overlap: multiple batches cover evaluation, memory, and workflow engine territory
**Research Findings**:
- Git tag cross-reference confirms all folder-based archived entries match shipped tags
- 3 stub files reference wrong tags (v1.13.12→v1.13.14, v1.14.x→v1.13.19, etc.)
- All 22 active batches have zero git references — pure filesystem artifacts
- No active batch has shipped yet — zero can be archived
### Metis Review
Identified gaps:
1. **Deduplication needed**: 2026-06-07 proposals overlap with active changes/ — merging must happen before cleanup is complete
2. **Prioritization needed**: 22 batches can't all ship at once — need clear tiers
3. **User sign-off needed**: Which Tier 1-2 batches to include in this plan vs defer
---
## Work Objectives
### Core Objective
Restore openspec structural integrity and ship the 5 highest-value, lowest-effort batch proposals.
### Concrete Deliverables
- Clean openspec: stubs deleted (11 files, ~795 bytes per the sizes listed in Task 1), misplaced proposals moved (5 folders), `.openspec.yaml` files added (6 batches), config.yaml populated
- llama-cache-and-spec: KV cache quantization (Q4_0) + ngram speculative decoding enabled
- pty-enhancements: PTY exit notifications, session metadata, X-Agent-Flags
- results-page: `/results` route with Analysis Runs + Arena Battles tabs
- token-analyzer-ui: `/analytics` route with token usage dashboard
- enhanced-file-panel: side-by-side diff toggle, hide whitespace, wrap long lines, expand/collapse all
### Must Have
- All 11 stub files removed from archived/
- 5 misplaced 2026-06-07 proposals moved from archived/ into `changes/` (or merged into existing batches)
- `.openspec.yaml` added to all 6 missing batches
- `openspec/config.yaml` gets a `context:` block and `rules:` block
- llama-server restarts with new flags (verify via `ps aux | grep llama`)
- `/results` page loads without 404 and shows real data from existing API endpoints
- `/analytics` page loads and shows token aggregates
- Side-by-side diff renders correctly for files with wide lines
### Must NOT Have (Guardrails)
- **NO** breaking changes to existing routes or API contracts
- **NO** new database tables or migrations (all data sources already exist)
- **NO** external API dependencies (no cloud embedding models)
- **NO** behavioral engine or Pregel state machine work (deferred to future batch)
- **NO** touching the conductor flow runner or orchestrator pipeline
- **NO** CSS framework changes (stay on Tailwind v4 / shadcn/ui)
- **NO** backend changes unless explicitly required by the batch scope
### Spec Framework Integration
- **Detected Framework**: OpenSpec (folder structure only — no CLI)
- **Config File**: `openspec/config.yaml`
- **Active Specs**: 22 batch folders in `openspec/changes/`
- **Available Commands**: Manual folder/file operations (no OpenSpec CLI)
---
## Verification Strategy
> **ZERO HUMAN INTERVENTION** — ALL verification is agent-executed.
### Test Decision
- **Infrastructure exists**: YES (vitest in apps/server, apps/coder)
- **Automated tests**: Tests-after (no TDD — these are config/frontend changes)
- **Framework**: vitest for backend, Playwright for frontend verification
### QA Policy
Every task includes agent-executed QA scenarios. Evidence saved to `.omo/evidence/`.
- **Frontend**: Playwright — navigate, assert DOM elements, screenshot
- **Backend**: Bash (curl) — send requests, assert status + response
- **Config/Restart**: Bash — check processes, verify new flags
- **File operations**: Bash — verify files exist/deleted with `test -f` / `test ! -f`
---
## Execution Strategy
```
Wave 1 (Structural Cleanup — quick, MAX PARALLEL):
├── Task 1: Delete 11 stub files from archived/ [quick]
├── Task 2: Move 5 misplaced 2026-06-07 proposals → changes/ [quick]
├── Task 3: Add .openspec.yaml to 6 missing batches [quick]
├── Task 4: Populate openspec/config.yaml with project context [quick]
├── Task 5: Add shipped status metadata to archived/ entries [writing]
Wave 2 (Backend — moderate, MAX PARALLEL):
├── Task 6: llama-cache-and-spec — KV cache + ngram flags [quick]
├── Task 7: pty-enhancements — exit notifications + session metadata [unspecified-high]
├── Task 8: token-analyzer-ui — backend API endpoints [unspecified-high]
Wave 3 (Frontend — moderate, MAX PARALLEL):
├── Task 9: results-page — /results route [visual-engineering]
├── Task 10: token-analyzer-ui — /analytics route [visual-engineering]
├── Task 11: enhanced-file-panel — diff modes + UI [visual-engineering]
Wave FINAL (Verification — 4 parallel reviews):
├── Task F1: Plan compliance audit [oracle]
├── Task F2: Code quality + type check [unspecified-high]
├── Task F3: Real QA — execute every scenario [unspecified-high + playwright]
└── Task F4: Scope fidelity check [deep]
Critical Path: Cleanup → Backend → Frontend → Integration
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 5 (Wave 1)
```
---
## TODOs (Wave 1)
- [ ] 1. Delete 11 stub files from archived/
**What to do**:
- Remove these 11 files from `openspec/changes/archived/`:
- `v1.13.12-skills-audit.md` (57B, wrong tag ref)
- `v1.13.15-codecontext-synth.md` (62B)
- `v1.13.17-cross-repo-reads.md` (61B)
- `v1.13.18-codecontext-file-path.md` (66B)
- `v1.13.20-drop-legacy-cols.md` (61B)
- `v1.14-outer-loop.md` (52B)
- `v1.14.1-mcp-poc.md` (51B)
- `v1.14.x-html-artifact-panes.md` (63B, wrong tag ref)
- `v1.15-mcp-multi.md` (51B)
- `v2.0-boocoder.md` (49B)
- `v2.2-paseo-providers.md` (222B)
- Each file contains ONLY "# Title\n\n**Status:** Shipped. Archived.\n" — zero documentation value
- Git history preserves the knowledge; CHANGELOG.md + tags are the authoritative record
**Must NOT do**:
- Do NOT delete any folder-based archived entries (they have real content)
- Do NOT delete `boocode_batch10.md` or handoff files (they're valuable)
**Recommended Agent Profile**:
- **Category**: `quick`
- **Skills**: `[]`
- **Justification**: Trivial file deletion — no domain skills needed
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 2-5)
- **Blocks**: F1-F4
- **Blocked By**: None
**References**:
- `openspec/changes/archived/` — target directory
- `openspec/README.md` — schema definition
- `~/.gitconfig` — no special config needed
**Acceptance Criteria**:
- [ ] `test ! -f openspec/changes/archived/v1.13.12-skills-audit.md` → success for all 11 files
- [ ] `ls openspec/changes/archived/*.md` shows only allowed files (boocode_batch10.md, handoff_*)
**QA Scenarios**:
```
Scenario: Verify stubs deleted
Tool: Bash
Preconditions: Clean working tree
Steps:
1. For each stub file, run: test ! -f openspec/changes/archived/{filename}
2. Assert: all 11 commands return exit code 0 (file does not exist)
3. List remaining .md files: ls openspec/changes/archived/*.md
4. Assert: only boocode_batch10.md and handoff_*.md files remain
Expected Result: 11 stubs absent, 3 valuable files present
Evidence: .omo/evidence/task-1-stubs-deleted.txt
Scenario: Valuable files preserved
Tool: Bash
Preconditions: Stubs deleted
Steps:
1. test -f openspec/changes/archived/boocode_batch10.md
2. test -f openspec/changes/archived/handoff_v1.13.10_per_tool_cost.md
3. test -f openspec/changes/archived/handoff_v1.13.8_prefix_verify.md
Expected Result: All 3 return exit code 0
Evidence: .omo/evidence/task-1-valuables-preserved.txt
```
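The deletion step and its safety check can be scripted as one guarded loop. This is a minimal sketch, not part of the repo: `delete_stubs` is a hypothetical helper name, and the 256-byte ceiling is an illustrative threshold chosen here because the largest stub listed above is 222B. It takes the archive directory as an argument so it can be tried against a copy first.

```shell
# Hypothetical helper (assumption, not an existing script): remove the 11 stub
# files named in Task 1 from the given archive directory.
# It refuses to delete any listed file larger than 256 bytes, since a file that
# big is unlikely to be a one-line stub.
delete_stubs() {
  local dir="$1" f size
  for f in \
    v1.13.12-skills-audit.md \
    v1.13.15-codecontext-synth.md \
    v1.13.17-cross-repo-reads.md \
    v1.13.18-codecontext-file-path.md \
    v1.13.20-drop-legacy-cols.md \
    v1.14-outer-loop.md \
    v1.14.1-mcp-poc.md \
    v1.14.x-html-artifact-panes.md \
    v1.15-mcp-multi.md \
    v2.0-boocoder.md \
    v2.2-paseo-providers.md
  do
    # Already absent: nothing to do, which keeps the helper safe to re-run.
    [ -f "$dir/$f" ] || continue
    size=$(( $(wc -c < "$dir/$f") ))
    if [ "$size" -gt 256 ]; then
      echo "refusing to delete $f: $size bytes is larger than a stub" >&2
      return 1
    fi
    rm -- "$dir/$f"
  done
  return 0
}
```

A run against the real tree would look like `delete_stubs openspec/changes/archived`, followed by the `test ! -f` checks from the scenario above. Files not in the list, such as `boocode_batch10.md`, are never touched.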
**Evidence to Capture**:
- `task-1-stubs-deleted.txt` — confirmation each stub is gone
- `task-1-valuables-preserved.txt` — confirmation valuable files remain
**Commit**: YES
- Message: `chore(openspec): delete 11 stub archive files with zero documentation value`
- Files: openspec/changes/archived/v1.13.12-skills-audit.md, ...
- [ ] 2. Move 5 misplaced 2026-06-07 proposals from archived/ to changes/
**What to do**:
- Relocate these 5 folders out of `openspec/changes/archived/2026-06-07-*`: move the first into `openspec/changes/` as-is and merge the other four into existing batches:
1. `archived/2026-06-07-boocontext/` → `changes/boocontext/` (partially shipped in v2.8.0)
2. `archived/2026-06-07-eval-sandbox-agent-runtime/` → merge into `changes/import-llm-evaluator/` and `changes/import-pregel-engine/` (overlapping scope)
3. `archived/2026-06-07-hybrid-workflow-engine/` → merge into `changes/orchestrator-flow-advanced/`
4. `archived/2026-06-07-memory-context-engineering/` → merge into `changes/memory-context/`
5. `archived/2026-06-07-port-audit-parlant-patterns/` → merge into `changes/add-behavioral-engine/` and `changes/audit-harness-integration/`
- For merges (2-5): append relevant content from the 2026-06-07 proposal into the existing batch's proposal.md, tasks.md, design.md. The 2026-06-07 versions are "grand vision" — extract the concrete specs relevant to the narrower active batch.
- For `boocontext/` (1): move as-is since it's a new slug with no direct collision.
**Must NOT do**:
- Do NOT delete the content of the 2026-06-07 folders — merge, don't discard
- Do NOT create duplicate batch slugs
- Do NOT overwrite existing proposal content — append/extend
**Recommended Agent Profile**:
- **Category**: `writing`
- **Skills**: `[]`
- **Justification**: File organization + content merging — technical writing task
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1, 3-5)
- **Blocks**: F1-F4
- **Blocked By**: None
**References**:
- `openspec/changes/archived/2026-06-07-*/` — source folders
- `openspec/changes/import-llm-evaluator/` — target for eval overlap
- `openspec/changes/import-pregel-engine/` — target for graph overlap
- `openspec/changes/orchestrator-flow-advanced/` — target for workflow overlap
- `openspec/changes/memory-context/` — target for memory overlap
- `openspec/changes/add-behavioral-engine/` — target for port patterns
- `openspec/changes/audit-harness-integration/` — target for audit patterns
**Acceptance Criteria**:
- [ ] `openspec/changes/boocontext/` exists with proposal.md + tasks.md + design.md + specs/
- [ ] `openspec/changes/import-llm-evaluator/` proposal.md now references eval-sandbox content
- [ ] `openspec/changes/import-pregel-engine/` proposal.md now references graph engine content
- [ ] `openspec/changes/orchestrator-flow-advanced/` proposal.md now references hybrid workflow
- [ ] `openspec/changes/memory-context/` proposal.md now references context engineering
- [ ] `openspec/changes/add-behavioral-engine/` and `audit-harness-integration/` now reference port patterns
- [ ] `test ! -d openspec/changes/archived/2026-06-07-<slug>/` succeeds for each of the 5 source folders (e.g. `2026-06-07-eval-sandbox-agent-runtime/`)
**QA Scenarios**:
```
Scenario: boocontext moved
Tool: Bash
Preconditions: Files moved
Steps:
1. test -f openspec/changes/boocontext/proposal.md
2. test -f openspec/changes/boocontext/tasks.md
3. test ! -f openspec/changes/archived/2026-06-07-boocontext/proposal.md
Expected Result: Files exist in new location, not in old
Evidence: .omo/evidence/task-2-boocontext-moved.txt
```
```
Scenario: Merged proposals updated
Tool: Bash
Preconditions: Files merged
Steps:
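1. grep -l "eval-sandbox\|graph engine\|hybrid workflow\|context engineering\|port patterns" openspec/changes/*/proposal.md
2. Assert: the output lists all six merge targets (import-llm-evaluator, import-pregel-engine, orchestrator-flow-advanced, memory-context, add-behavioral-engine, audit-harness-integration)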
Expected Result: grep finds references in the right target files
Evidence: .omo/evidence/task-2-merges-verified.txt
```
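A single grep over all proposals cannot tell which keyword landed in which target, so a per-target check is a stricter way to verify the merges. The sketch below is an assumption-laden illustration: `check_merges` is a hypothetical helper, the keyword chosen for each batch mirrors the acceptance criteria above, and the match is case-insensitive, which is looser than the scenario's grep.

```shell
# Hypothetical helper (assumption): verify that each merge target's proposal.md
# mentions the keyword for the 2026-06-07 proposal merged into it.
# $1 is the changes root, e.g. openspec/changes
check_merges() {
  local root="$1" pair batch keyword rc=0
  for pair in \
    "import-llm-evaluator:eval-sandbox" \
    "import-pregel-engine:graph engine" \
    "orchestrator-flow-advanced:hybrid workflow" \
    "memory-context:context engineering" \
    "add-behavioral-engine:port patterns" \
    "audit-harness-integration:port patterns"
  do
    batch="${pair%%:*}"      # text before the first colon
    keyword="${pair#*:}"     # text after the first colon
    # -F treats the keyword as a fixed string, -i ignores case.
    if ! grep -qiF -- "$keyword" "$root/$batch/proposal.md" 2>/dev/null; then
      echo "missing '$keyword' in $batch/proposal.md" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running `check_merges openspec/changes` reports every target that lacks its reference instead of stopping at the first match, which makes the evidence file more useful when a merge was skipped.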
**Evidence to Capture**:
- `task-2-boocontext-moved.txt`
- `task-2-merges-verified.txt`
**Commit**: YES (groups with Task 1)
- Message: `chore(openspec): move 5 misplaced proposals from archived/ → changes/, merge overlapping content`
- Files: openspec/changes/boocontext/*, openspec/changes/*/proposal.md, openspec/changes/*/tasks.md
- [ ] 3. Add .openspec.yaml to 6 missing batches
**What to do**:
- Create `.openspec.yaml` in each of these 6 active batches:
- `enhanced-file-panel/`
- `llama-cache-and-spec/`
- `memory-v2-hybrid-search/`
- `omo-paseo-bridge/`
- `orchestrator-flow-advanced/`
- `results-page/`
- Each file must contain:
```yaml
schema: spec-driven
created: 2026-06-07
```
**Must NOT do**:
- Do NOT modify existing proposal.md or tasks.md content
- Do NOT add .openspec.yaml to batches that already have one
**Recommended Agent Profile**:
- **Category**: `quick`
- **Skills**: `[]`
- **Justification**: Trivial boilerplate file creation
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1, 2, 4, 5)
- **Blocks**: F1-F4
- **Blocked By**: None
**References**:
- `openspec/changes/add-3tier-memory/.openspec.yaml` — template
**Acceptance Criteria**:
- [ ] All 6 created files contain `schema: spec-driven`
- [ ] `find openspec/changes/ -name ".openspec.yaml" | wc -l` counts all expected files
**QA Scenarios**:
```
Scenario: All .openspec.yaml files present
Tool: Bash
Preconditions: Files created
Steps:
1. For each batch: test -f openspec/changes/{batch}/.openspec.yaml
2. For each: grep -q "schema: spec-driven" openspec/changes/{batch}/.openspec.yaml
Expected Result: All 6 files exist with correct content
Evidence: .omo/evidence/task-3-openspec-yaml-added.txt
```
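Creating the six files is a small loop, and the "do not add to batches that already have one" guardrail can be enforced in the same place. A minimal sketch, assuming the batch list from this task; `add_openspec_yaml` is a hypothetical helper name, not an existing script.

```shell
# Hypothetical helper (assumption): write .openspec.yaml into the six batches
# named in Task 3 without overwriting a file that already exists.
# $1 is the changes root, e.g. openspec/changes
add_openspec_yaml() {
  local root="$1" batch target
  for batch in \
    enhanced-file-panel \
    llama-cache-and-spec \
    memory-v2-hybrid-search \
    omo-paseo-bridge \
    orchestrator-flow-advanced \
    results-page
  do
    if [ ! -d "$root/$batch" ]; then
      echo "missing batch folder: $batch" >&2
      return 1
    fi
    target="$root/$batch/.openspec.yaml"
    # Guardrail: leave any existing file exactly as it is.
    if [ -e "$target" ]; then
      continue
    fi
    printf 'schema: spec-driven\ncreated: 2026-06-07\n' > "$target"
  done
  return 0
}
```

Because existing files are skipped, the helper can be re-run after a partial failure without changing anything that was already written.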
**Evidence to Capture**:
- `task-3-openspec-yaml-added.txt`
**Commit**: YES (groups with Task 1)
- Message: `chore(openspec): add .openspec.yaml to 6 missing batch folders`
- Files: openspec/changes/enhanced-file-panel/.openspec.yaml, ...
- [ ] 4. Populate openspec/config.yaml with project context
**What to do**:
- Replace the empty `openspec/config.yaml` with a populated version:
```yaml
schema: spec-driven
context: |
  Tech stack: TypeScript, React 18, Vite, Tailwind v4, shadcn/ui, Fastify, PostgreSQL 16, pnpm workspaces
  Apps: BooChat (read-only chat), BooCoder (write tools + agent dispatch), BooTerm (PTY terminals), Orchestrator (multi-agent conductor)
  Infrastructure: Docker Compose, Tailscale (100.114.205.53), Authelia auth, llama-swap inference
  Monorepo: apps/server, apps/web, apps/booterm, apps/coder, packages/contracts
  Commits: conventional commits, strict TypeScript, NodeNext module resolution
  Testing: vitest (server + coder), Playwright (web E2E), no root tsconfig
rules:
  proposal:
    - Every proposal must have a "Why" section explaining the motivation
    - Every proposal must have a "What Changes" section enumerating deliverables
    - Include "Must Have" / "Must NOT Have" guardrails
    - Reference shipped git tags when applicable
  tasks:
    - Tasks must be ordered by dependency, not priority
    - Each task is one atomic change (file, config, or command)
    - Parallel tasks go in the same wave
```
**Must NOT do**:
- Do NOT delete the `schema: spec-driven` line
**Recommended Agent Profile**:
- **Category**: `writing`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1-3, 5)
- **Blocks**: F1-F4
- **Blocked By**: None
**References**:
- `openspec/config.yaml` — current (empty) file
- `/home/samkintop/opt/boocode/CLAUDE.md` — source for context info
**Acceptance Criteria**:
- [ ] `grep -q "context:" openspec/config.yaml` → success
- [ ] `grep -q "rules:" openspec/config.yaml` → success
- [ ] config.yaml is larger than 500 bytes (was 20 bytes)
**QA Scenarios**:
```
Scenario: config.yaml populated
Tool: Bash
Preconditions: File written
Steps:
1. wc -c openspec/config.yaml → assert > 500 bytes
2. grep -q "context:" openspec/config.yaml
3. grep -q "rules:" openspec/config.yaml
4. grep -q "schema: spec-driven" openspec/config.yaml
Expected Result: All assertions pass
Evidence: .omo/evidence/task-4-config-populated.txt
```
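The four steps in this scenario can be bundled into one check that fails with a specific message. This is a sketch under stated assumptions: `check_config` is a hypothetical helper, and it anchors each key at the start of a line so that a stray "rules:" inside the context text does not count as the top-level block.

```shell
# Hypothetical helper (assumption): confirm config.yaml was populated.
# Checks the size threshold from the scenario and the three top-level keys.
check_config() {
  local file="$1" key size
  if [ ! -f "$file" ]; then
    echo "not found: $file" >&2
    return 1
  fi
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -le 500 ]; then
    echo "config too small: $size bytes" >&2
    return 1
  fi
  for key in schema context rules; do
    # Anchored match: the key must start a line, i.e. be a top-level key.
    if ! grep -q "^${key}:" "$file"; then
      echo "missing top-level key: $key" >&2
      return 1
    fi
  done
  return 0
}
```

Typical use would be `check_config openspec/config.yaml > .omo/evidence/task-4-config-populated.txt 2>&1`, with the exit code deciding pass or fail.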
**Evidence to Capture**:
- `task-4-config-populated.txt`
**Commit**: YES (groups with Task 1)
- Message: `chore(openspec): populate config.yaml with project context and rules`
- Files: openspec/config.yaml
- [ ] 5. Add shipped-status metadata to 10 archived folder entries
**What to do**:
- Add frontmatter or status line to each archived folder's proposal.md documenting the shipped version:
- `agent-status-normalize/` → `v2.7.6`
- `claude-sdk-sessionstore/` → `v2.7.5`
- `contracts-ssot/` → `v2.7.13`
- `license-debt-mit/` → `v2.7.0`
- `mistake-tracker-file-ledger/` → `v2.7.4`
- `orchestrator/` → `v2.7.17`
- `sampling-streamjson-tokens/` → `v2.7.3`
- `v2-3-provider-lifecycle/` → `v2.5.4` through `v2.5.13`
- `v2-6-persistent-agent-sessions/` → `v2.6.4` through `v2.6.8`
- `write-edit-robustness/` → `v2.7.1`
- Add a line directly after the `## Why` section heading: `` **Shipped in:** `v2.7.6-agent-status-normalize` `` (or equivalent)
**Must NOT do**:
- Do NOT change the body of the proposal beyond the shipped annotation
- Do NOT add shipped annotations to the 2026-06-07 batches (they're not shipped)
**Recommended Agent Profile**:
- **Category**: `quick`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 1 (with Tasks 1-4)
- **Blocks**: F1-F4
- **Blocked By**: None
**References**:
- Git tags: `v2.7.0-mit`, `v2.7.1-write-edit-robustness`, etc.
**Acceptance Criteria**:
- [ ] All 10 archived batch proposals contain "Shipped in:" referencing a git tag
- [ ] `grep -r "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l` = 10
**QA Scenarios**:
```
Scenario: All archived batches annotated
Tool: Bash
Preconditions: Files edited
Steps:
1. grep -rl "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l
2. Assert: exactly 10 files contain "Shipped in:"
Expected Result: 10 files annotated
Evidence: .omo/evidence/task-5-shipped-annotations.txt
```
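The annotation itself is a one-line insertion after a heading, which is easy to get wrong by hand across ten files. A minimal sketch of the edit, assuming each proposal.md has a line starting with `## Why`; `annotate_shipped` is a hypothetical helper, and the tag is passed in rather than looked up.

```shell
# Hypothetical helper (assumption): insert a "Shipped in" line directly after
# the first "## Why" heading of a proposal file.
# $1 = path to proposal.md, $2 = git tag to record
annotate_shipped() {
  local file="$1" tag="$2" tmp
  # Idempotent: a file that is already annotated is left untouched.
  if grep -q 'Shipped in:' "$file"; then
    return 0
  fi
  if ! grep -q '^## Why' "$file"; then
    echo "no '## Why' heading in $file" >&2
    return 1
  fi
  tmp=$(mktemp)
  # Print every line; after the first heading match, emit the annotation once.
  if awk -v tag="$tag" '
    { print }
    /^## Why/ && !seen { print "**Shipped in:** `" tag "`"; seen = 1 }
  ' "$file" > "$tmp"; then
    cat "$tmp" > "$file"    # overwrite in place so file permissions are kept
  fi
  rm -f "$tmp"
  return 0
}
```

For example, `annotate_shipped openspec/changes/archived/agent-status-normalize/proposal.md v2.7.6-agent-status-normalize` adds the line once, and a second run changes nothing, so the grep count in the scenario above stays at 10.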
**Evidence to Capture**:
- `task-5-shipped-annotations.txt`
**Commit**: YES (groups with Task 1)
- Message: `chore(openspec): add shipped-in version annotations to 10 archived batch proposals`
- Files: openspec/changes/archived/*/proposal.md
---
## TODOs (Wave 2)
- [ ] 6. llama-cache-and-spec — Enable KV cache quantization + ngram speculative decoding
**What to do**:
- Edit `apps/server/src/services/inference/providers/llama.ts` (or the llama args validator `llama-args-validator.ts`) to allow `--cache-type-k q4_0` and `--spec-type ngram-mod` through the shadowing lists
- Change the base llama-server args to include:
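- `--cache-type-k q4_0` (4-bit K cache: q4_0 stores about 4.5 bits per value, so the K half of the KV cache shrinks roughly 3.5× versus f16; the V half is unchanged unless `--cache-type-v` is also set, which is out of scope here)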
- `--spec-type ngram-mod` (ngram speculative decoding, 2-3× tok/s on code)
- Verify the sidecar validator (`sidecar/validator.go`) also allows these flags through
- Read `apps/server/src/services/inference/llama-args-validator.ts` and `sidecar/validator.go` to understand the current blocklist
- Add the two flags to the allowlist instead of the shadow list
- Update the sidecar Dockerfile or config if needed
**Must NOT do**:
- Do NOT change any other llama-server args
- Do NOT use any KV cache type other than Q4_0 (no Q8_0, no Q3_K)
- Do NOT add a separate draft model (ngram is self-contained)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- **Skills**: `[]`
- **Justification**: Requires understanding llama.cpp arg validation across two codebases
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 7-8)
- **Blocks**: F1-F4
- **Blocked By**: Tasks 1-5 (Wave 1)
**References**:
- `apps/server/src/services/inference/llama-args-validator.ts` — current arg blocklist/allowlist
- `sidecar/validator.go` — sidecar validation (if exists)
- `docker-compose.yml` or sidecar Dockerfile — restart config
- `openspec/changes/llama-cache-and-spec/proposal.md` — full spec
**Acceptance Criteria**:
- [ ] `--cache-type-k q4_0` present in llama-server args after restart
- [ ] `--spec-type ngram-mod` present in llama-server args after restart
- [ ] llama-server starts without errors
- [ ] Inference still works (send test message)
**QA Scenarios**:
```
Scenario: KV cache quantization enabled
Tool: Bash
Preconditions: Server restarted after changes
Steps:
1. ps aux | grep llama-server | grep -o "cache-type-k q4_0"
2. Assert: output matches "q4_0"
Expected Result: KV cache quantization is active
Evidence: .omo/evidence/task-6-kv-cache-enabled.txt
```
```
Scenario: Speculative decoding enabled
Tool: Bash
Preconditions: Server restarted
Steps:
1. ps aux | grep llama-server | grep -o "spec-type ngram-mod"
2. Assert: output matches "ngram-mod"
Expected Result: Ngram speculative decoding is active
Evidence: .omo/evidence/task-6-ngram-enabled.txt
```
```
Scenario: Inference still works
Tool: Bash (curl)
Preconditions: Server running with new flags
Steps:
1. curl -s -o /dev/null -w "%{http_code}" http://100.114.205.53:9500/api/health
2. Assert: HTTP 200
Expected Result: Server is healthy and serving
Evidence: .omo/evidence/task-6-health-check.txt
```
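The two `ps` checks differ only in the flag they look for, so they can share one helper that also insists the value is the very next argument. This is a sketch, not the project's tooling: `has_flag_pair` is a hypothetical name, and it works on a command-line string so it can be exercised without a running server.

```shell
# Hypothetical helper (assumption): succeed when the command line in $1 passes
# flag $2 immediately followed by value $3 as separate arguments.
has_flag_pair() {
  local cmdline="$1" flag="$2" value="$3" prev="" word
  set -f                       # disable globbing while splitting on whitespace
  for word in $cmdline; do     # unquoted on purpose: split into arguments
    if [ "$prev" = "$flag" ] && [ "$word" = "$value" ]; then
      set +f
      return 0
    fi
    prev="$word"
  done
  set +f
  return 1
}
```

On a Linux host with procps installed, usage could be `has_flag_pair "$(pgrep -af llama-server)" --cache-type-k q4_0` and the same call with `--spec-type ngram-mod`. Unlike a substring grep, this does not accept a longer value that merely starts with the expected one.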
**Evidence to Capture**:
- `task-6-kv-cache-enabled.txt` — grep output showing the flag
- `task-6-ngram-enabled.txt` — grep output showing the flag
- `task-6-health-check.txt` — health check confirmation
**Commit**: YES
- Message: `perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding`
- Files: apps/server/src/services/inference/llama-args-validator.ts, sidecar/validator.go (if needed)
- [ ] 7. pty-enhancements — PTY exit notifications + session metadata
**What to do**:
- Add `notifyOnExit` support to the PTY session manager (likely in `apps/booterm/`)
- When a PTY process exits AND `notifyOnExit` was set:
- Emit an event/message to the agent channel with: session ID, title, exit code, total output lines, last line of output
- Add session metadata fields: agent ID that spawned it, task ID, optional title
- Add `pty_list` endpoint that returns metadata for all sessions
- Wire `X-Agent-Flags` header support for agent identification
- Read `apps/booterm/` to understand the current PTY architecture
**Must NOT do**:
- Do NOT change the existing pty_spawn interface (add notifyOnExit as optional param)
- Do NOT implement sandbox or circuit breaker (out of scope for this wave)
- Do NOT add new database tables (metadata lives in-memory or in existing session store)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 6, 8)
- **Blocks**: F1-F4
- **Blocked By**: Tasks 1-5 (Wave 1)
**References**:
- `apps/booterm/src/` — PTY session management code
- `apps/coder/src/services/` — agent dispatch that spawns PTYs
- `openspec/changes/pty-enhancements/proposal.md` — full spec
- `apps/server/src/services/inference/` — inference pipeline that may need to handle notifications
**Acceptance Criteria**:
- [ ] `notifyOnExit` optional parameter on pty_spawn works
- [ ] On process exit with notifyOnExit=true, agent receives notification
- [ ] `pty_list` returns session metadata
- [ ] `X-Agent-Flags` header is recognized
**QA Scenarios**:
```
Scenario: notifyOnExit triggers notification
Tool: Bash + tmux
Preconditions: booterm running
Steps:
1. Spawn a short-lived PTY with notifyOnExit=true that runs: sleep 1
2. Wait 2 seconds for completion
3. Check notification was delivered
Expected Result: Exit notification received with session ID, title, exit code, total output lines, last line of output
Evidence: .omo/evidence/task-7-notify-on-exit.txt
```
```
Scenario: pty_list shows metadata
Tool: Bash (curl)
Preconditions: PTY sessions exist
Steps:
1. curl http://localhost:9501/api/pty/list 2>/dev/null
2. Assert: response contains session metadata fields
Expected Result: Metadata returned for each session
Evidence: .omo/evidence/task-7-pty-list.txt
```
**Evidence to Capture**:
- `task-7-notify-on-exit.txt` — notification evidence
- `task-7-pty-list.txt` — pty_list response
**Commit**: YES
- Message: `feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags`
- Files: apps/booterm/src/*.ts, apps/coder/src/services/*.ts
- [ ] 8. token-analyzer-ui — Backend API endpoints for token analytics
**What to do**:
- Add read-only API endpoints to serve aggregate token data:
- `GET /api/coder/token-analytics/sessions` — per-session token usage (input, output, cost)
- `GET /api/coder/token-analytics/tools` — per-tool cost breakdown (from tool_cost_stats view)
- `GET /api/coder/token-analytics/trends` — token usage over time
- Reuse existing data sources:
- `agent_sessions.input_tokens`, `agent_sessions.output_tokens`, `agent_sessions.cost`
- `tool_cost_stats` view (per-tool 100-call rolling window)
- `tasks.token_breakdown` JSONB column
- Implement in `apps/coder/src/routes/` (follow existing route patterns)
- Add proper error handling, pagination for large result sets, and date filtering
**Must NOT do**:
- Do NOT create new database tables or migrations
- Do NOT add token tracking logic (data is already accumulated)
- Do NOT add real-time streaming (data is historical aggregate)
**Recommended Agent Profile**:
- **Category**: `unspecified-high`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 2 (with Tasks 6-7)
- **Blocks**: Task 10 (frontend depends on backend)
- **Blocked By**: Tasks 1-5 (Wave 1)
**References**:
- `apps/coder/src/routes/` — existing route patterns
- `apps/server/src/schema.sql` — `tool_cost_stats` view definition
- `apps/coder/CLAUDE.md` — coder conventions, route registration
- `packages/contracts/` — shared types for response schemas
- `openspec/changes/token-analyzer-ui/proposal.md` — full spec
**Acceptance Criteria**:
- [ ] `GET /api/coder/token-analytics/sessions?project_id=X` returns 200 with token data
- [ ] `GET /api/coder/token-analytics/tools?project_id=X` returns 200 with tool breakdown
- [ ] `GET /api/coder/token-analytics/trends?project_id=X` returns 200 with trend data
- [ ] All endpoints respect `project_id` filtering
- [ ] Empty data returns valid empty arrays (not errors)
**QA Scenarios**:
```
Scenario: Sessions endpoint works
Tool: Bash (curl)
Preconditions: Server running, project exists
Steps:
1. curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1"
2. Assert: HTTP 200
3. Assert: response is valid JSON with expected fields
Expected Result: Session token data returned
Evidence: .omo/evidence/task-8-sessions-endpoint.txt
```
```
Scenario: Empty data returns valid response
Tool: Bash (curl)
Preconditions: Server running
Steps:
1. curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=999"
2. Assert: HTTP 200
3. Assert: response contains empty array (not error)
Expected Result: Graceful empty state
Evidence: .omo/evidence/task-8-empty-data.txt
```
**Evidence to Capture**:
- `task-8-sessions-endpoint.txt` — successful API response
- `task-8-empty-data.txt` — graceful empty handling
**Commit**: YES
- Message: `feat(coder): add token-analytics API endpoints for session/tool/trend data`
- Files: apps/coder/src/routes/token-analytics.ts, apps/coder/src/services/token-analytics.ts
---
## TODOs (Wave 3)
- [ ] 9. results-page — /results route for orchestrator runs + arena battles
**What to do**:
- Add a sidebar nav button labeled "Results" with the `ScrollText` icon (lucide-react), **above** the Token Analytics button
- Create new `/results` route page with two tabs:
- "Analysis Runs" — list orchestrator flow runs (research, code-review, investigate, etc.)
- "Arena Battles" — list battle history
- Each tab shows: status dot, name/type, band/battle-type, model, timing, error indicator
- Completed runs show "View Report" link; completed battles show "View Analysis"
- Uses existing API endpoints (no backend changes needed):
- `GET /api/coder/runs?project_id=X`
- `GET /api/coder/battles?project_id=X`
- Requires `project_id` context — load from sidebar on mount, or show project selector
- Follow existing route patterns in web (React Router routes, lazy loading)
**Must NOT do**:
- Do NOT create new API endpoints
- Do NOT modify existing API contracts
- Do NOT add pagination beyond what the API already provides
- Do NOT add real-time updates (static list, refreshed on mount)
**Recommended Agent Profile**:
- **Category**: `visual-engineering`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 3 (with Tasks 10-11)
- **Blocks**: F1-F4
- **Blocked By**: Tasks 1-5 (Wave 1)
**References**:
- `apps/web/src/routes/` — existing route patterns (analytics, settings)
- `apps/web/src/components/sidebar/` — nav button patterns
- `apps/web/src/api/` — existing API client
- `openspec/changes/results-page/proposal.md` — full spec
- `apps/coder/src/routes/runs.ts` — runs endpoint
- `apps/coder/src/routes/battles.ts` — battles endpoint
**Acceptance Criteria**:
- [ ] Sidebar shows "Results" button with ScrollText icon above Token Analytics
- [ ] Clicking navigates to `/results`
- [ ] "Analysis Runs" tab loads and displays orchestrator flow history
- [ ] "Arena Battles" tab loads and displays battle history
- [ ] Completed runs show "View Report" link
- [ ] Empty state shown when no data
- [ ] Error state shown on API failure
**QA Scenarios**:
```
Scenario: Nav button renders
Tool: Playwright
Preconditions: Web app loaded
Steps:
1. Navigate to /
2. Look for sidebar nav button with text "Results"
3. Assert: button exists and links to /results
Expected Result: Results nav button present
Evidence: .omo/evidence/task-9-nav-button.png
```
```
Scenario: Results page loads
Tool: Playwright
Preconditions: Web app loaded, project exists
Steps:
1. Navigate to /results
2. Wait for "Analysis Runs" tab to appear
3. Assert: tab shows list of runs or empty state
Expected Result: Page loads with a run list or the empty state
Evidence: .omo/evidence/task-9-results-page.png
```
**Evidence to Capture**:
- `task-9-nav-button.png` — screenshot of sidebar with Results button
- `task-9-results-page.png` — screenshot of /results page with data
**Commit**: YES
- Message: `feat(web): add /results page for orchestrator runs and arena battle history`
- Files: apps/web/src/routes/results.tsx, apps/web/src/components/sidebar/*.tsx
- [ ] 10. token-analyzer-ui — /analytics dashboard route
**What to do**:
- Add a sidebar nav button labeled "Token Analytics" with an appropriate icon, **above** the Settings button
- Create new `/analytics` route page showing token usage dashboard:
- Aggregate token usage across sessions (total input/output tokens)
- Per-tool cost breakdown (bar chart or table)
- Per-session token history (list or mini chart)
- Per-provider cost comparison
- Reuse existing data from the backend endpoints created in Task 8
- Follow the same route/nav patterns as results-page
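The per-session, per-tool, and per-provider breakdowns listed above are all the same grouping operation over usage records. A minimal sketch follows, assuming a record shape that this plan does not specify: `TokenUsageRecord` and `totalsBy` are hypothetical, and the real field names depend on what the Task 8 endpoints return.

```typescript
// Hypothetical record shape; the actual Task 8 response fields may differ.
interface TokenUsageRecord {
  sessionId: string;
  tool: string;
  provider: string;
  inputTokens: number;
  outputTokens: number;
}

interface TokenTotals {
  inputTokens: number;
  outputTokens: number;
}

// Sum input/output tokens grouped by one dimension of the record.
function totalsBy(
  records: TokenUsageRecord[],
  key: "sessionId" | "tool" | "provider",
): Map<string, TokenTotals> {
  const totals = new Map<string, TokenTotals>();
  for (const record of records) {
    const current = totals.get(record[key]) ?? { inputTokens: 0, outputTokens: 0 };
    current.inputTokens += record.inputTokens;
    current.outputTokens += record.outputTokens;
    totals.set(record[key], current);
  }
  return totals;
}
```

Calling `totalsBy(records, "tool")` feeds the per-tool table, and the same function with `"provider"` or `"sessionId"` covers the other two views without any charting dependency.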
**Must NOT do**:
- Do NOT add new charting libraries (use what's already available)
- Do NOT implement real-time updates
**Recommended Agent Profile**:
- **Category**: `visual-engineering`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 3 (with Tasks 9, 11)
- **Blocks**: F1-F4
- **Blocked By**: Tasks 1-5 (Wave 1), Task 8 (backend endpoints)
**References**:
- Same as Task 9 + Task 8 endpoints
- `openspec/changes/token-analyzer-ui/proposal.md` — full spec
- `apps/web/src/components/` — existing chart/list components
**Acceptance Criteria**:
- [ ] Sidebar shows "Token Analytics" button above Settings
- [ ] `/analytics` loads and shows token dashboard
- [ ] Per-session, per-tool, per-provider breakdowns visible
- [ ] Empty state shown when no data
**QA Scenarios**:
```
Scenario: Token Analytics nav button renders
Tool: Playwright
Preconditions: Web app loaded
Steps:
1. Navigate to /
2. Look for "Token Analytics" button in sidebar
3. Assert: button exists above Settings
Expected Result: Nav button present
Evidence: .omo/evidence/task-10-nav-button.png
```
```
Scenario: Analytics dashboard loads
Tool: Playwright
Preconditions: Web app loaded, at least one session with recorded token usage
Steps:
1. Navigate to /analytics
2. Wait for dashboard content to render
3. Assert: token usage data is visible
Expected Result: Dashboard shows data
Evidence: .omo/evidence/task-10-analytics-dashboard.png
```
**Evidence to Capture**:
- `task-10-nav-button.png`
- `task-10-analytics-dashboard.png`
**Commit**: YES
- Message: `feat(web): add /analytics route for token usage dashboard`
- Files: apps/web/src/routes/analytics.tsx, apps/web/src/components/sidebar/*.tsx
- [ ] 11. enhanced-file-panel — Side-by-side diff, hide whitespace, wrap lines, expand/collapse all
**What to do**:
- Add side-by-side diff toggle to the Git diff tab in the file panel
- Add "Hide whitespace" checkbox that filters whitespace-only changes
- Add "Wrap long lines" toggle for diff display
- Add "Expand All" / "Collapse All" buttons for file-level diffs
- Read `apps/web/src/components/` first to find the existing diff rendering components
- Implement the changes in `apps/web/src/components/`, following existing file panel patterns
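The side-by-side toggle can work from the same unified diff the panel already receives, which keeps the backend untouched. A minimal sketch, assuming hunk lines arrive as strings prefixed with `" "`, `"-"`, or `"+"`: `SideBySideRow` and `toSideBySideRows` are hypothetical names, and the existing diff components may model lines differently.

```typescript
// Hypothetical row model: null means "no line on this side".
interface SideBySideRow {
  left: string | null;
  right: string | null;
}

// Pair each run of removed lines with the run of added lines that follows it.
function toSideBySideRows(lines: string[]): SideBySideRow[] {
  const rows: SideBySideRow[] = [];
  let removed: string[] = [];
  let added: string[] = [];

  // Emit the pending removed/added runs as aligned rows, padding the shorter side.
  const flush = (): void => {
    const count = Math.max(removed.length, added.length);
    for (let i = 0; i < count; i++) {
      rows.push({ left: removed[i] ?? null, right: added[i] ?? null });
    }
    removed = [];
    added = [];
  };

  for (const line of lines) {
    if (line.startsWith("\\")) continue; // skip "\ No newline at end of file"
    const text = line.slice(1);
    if (line.startsWith("-")) removed.push(text);
    else if (line.startsWith("+")) added.push(text);
    else {
      // Context line: close the pending change block, then show it on both sides.
      flush();
      rows.push({ left: text, right: text });
    }
  }
  flush();
  return rows;
}
```

Because the rows are derived on render, switching back to the unified view needs no extra state beyond the toggle itself.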
**Must NOT do**:
- Do NOT implement inline diff comments (deferred)
- Do NOT implement in-browser file editing (deferred)
- Do NOT change the backend diff generation logic
**Recommended Agent Profile**:
- **Category**: `visual-engineering`
- **Skills**: `[]`
**Parallelization**:
- **Can Run In Parallel**: YES
- **Parallel Group**: Wave 3 (with Tasks 9-10)
- **Blocks**: F1-F4
- **Blocked By**: Tasks 1-5 (Wave 1)
**References**:
- `apps/web/src/components/` — existing file panel and diff components
- `apps/web/src/hooks/` — hooks for diff state management
- `openspec/changes/enhanced-file-panel/proposal.md` — full spec
- `apps/server/src/routes/projects.ts` — git diff backend route
**Acceptance Criteria**:
- [ ] Side-by-side diff toggles correctly
- [ ] Hide whitespace checkbox filters whitespace changes
- [ ] Wrap long lines toggle works
- [ ] Expand/Collapse All buttons toggle all files
- [ ] All changes are frontend-only (no new API calls)
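To satisfy the frontend-only criterion, "Hide whitespace" can be a client-side filter over hunks that are already loaded. The sketch below is one possible approach under assumptions: `DiffHunk` and `isWhitespaceOnlyHunk` are hypothetical, and the comparison only approximates `git diff -w` by stripping all whitespace before comparing removed and added lines in order.

```typescript
// Hypothetical hunk shape; lines keep their unified-diff prefix (" ", "-", "+").
interface DiffHunk {
  header: string;
  lines: string[];
}

// True when the hunk's removed and added lines are identical once all
// whitespace is stripped. Blank lines are ignored on both sides, so a hunk
// that only adds or removes blank lines also counts as whitespace-only.
function isWhitespaceOnlyHunk(hunk: DiffHunk): boolean {
  const normalize = (line: string): string => line.slice(1).replace(/\s+/g, "");
  const removed = hunk.lines
    .filter((line) => line.startsWith("-"))
    .map(normalize)
    .filter((text) => text !== "");
  const added = hunk.lines
    .filter((line) => line.startsWith("+"))
    .map(normalize)
    .filter((text) => text !== "");
  return (
    removed.length === added.length &&
    removed.every((text, index) => text === added[index])
  );
}
```

With the checkbox on, the panel would render `hunks.filter((h) => !isWhitespaceOnlyHunk(h))`. Mixed hunks stay visible in full, which is coarser than per-line filtering but never hides a real change.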
**QA Scenarios**:
```
Scenario: Side-by-side diff renders
Tool: Playwright
Preconditions: Repo with uncommitted changes
Steps:
1. Open file panel
2. Click Git tab
3. Toggle side-by-side view
4. Assert: diff renders in two columns
Expected Result: Side-by-side diff visible
Evidence: .omo/evidence/task-11-side-by-side.png
```
```
Scenario: Hide whitespace works
Tool: Playwright
Preconditions: Diff has whitespace changes
Steps:
1. Open diff with whitespace changes
2. Check "Hide whitespace"
3. Assert: whitespace-only hunks are hidden
Expected Result: Whitespace-only changes filtered
Evidence: .omo/evidence/task-11-hide-whitespace.png
```
```
Scenario: Expand/Collapse All toggles
Tool: Playwright
Preconditions: Multiple files changed
Steps:
1. Click "Collapse All"
2. Assert: all files collapsed to summary
3. Click "Expand All"
4. Assert: all files expanded
Expected Result: Bulk toggle works
Evidence: .omo/evidence/task-11-expand-collapse.png
```
**Evidence to Capture**:
- `task-11-side-by-side.png`
- `task-11-hide-whitespace.png`
- `task-11-expand-collapse.png`
**Commit**: YES
- Message: `feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse all`
- Files: apps/web/src/components/*.tsx, apps/web/src/hooks/*.ts
---
## Final Verification Wave
- [ ] F1. **Plan Compliance Audit** — `oracle`
Read the plan end-to-end. For each "Must Have": verify implementation exists (read file, curl endpoint, run command). For each "Must NOT Have": search codebase for forbidden patterns — reject with file:line if found. Check evidence files exist in .omo/evidence/. Compare deliverables against plan.
Output: `Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT`
- [ ] F2. **Code Quality Review** — `unspecified-high`
Run `tsc --noEmit` for any changed apps + `bun test`. Review all changed files for: `as any`/`@ts-ignore`, empty catches, console.log in prod, commented-out code, unused imports.
Output: `Build [PASS/FAIL] | Lint [PASS/FAIL] | Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT`
- [ ] F3. **Real Manual QA** — `unspecified-high`
Start from clean state. Execute EVERY QA scenario from EVERY task — follow exact steps, capture evidence. Test cross-task integration (features working together, not isolation). Test edge cases: empty state, invalid input, missing project_id. Save to `.omo/evidence/final-qa/`.
Output: `Scenarios [N/N pass] | Integration [N/N] | Edge Cases [N tested] | VERDICT`
- [ ] F4. **Scope Fidelity Check** — `deep`
For each task: read "What to do", read actual diff. Verify 1:1 — everything in scope was built (no missing), nothing beyond scope was built (no creep). Check "Must NOT do" compliance.
Output: `Tasks [N/N compliant] | Contamination [CLEAN/N issues] | Unaccounted [CLEAN/N files] | VERDICT`
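Part of the F2 review can be mechanized before a human reads the diff. This is a rough sketch under assumptions, not an existing tool in this repo: `findForbiddenPatterns` is a hypothetical helper, and its line-based regexes only catch single-line occurrences, so it complements the manual review rather than replacing it.

```typescript
// Hypothetical scanner for a subset of the F2 checks. Line-based, so multi-line
// empty catches, unused imports, and commented-out code are not detected.
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
  text: string;
}

const FORBIDDEN: { rule: string; pattern: RegExp }[] = [
  { rule: "as any", pattern: /\bas any\b/ },
  { rule: "@ts-ignore", pattern: /@ts-ignore/ },
  { rule: "empty catch", pattern: /catch\s*(\([^)]*\))?\s*\{\s*\}/ },
  { rule: "console.log", pattern: /\bconsole\.log\(/ },
];

function findForbiddenPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of FORBIDDEN) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Run over each changed file's contents, the findings map directly onto the `Files [N clean/N issues]` part of the F2 output line.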
---
## Commit Strategy
- **1-5** (grouped): `chore(openspec): cleanup openspec folder structure — delete stubs, move proposals, add metadata, populate config`
- **6**: `perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding`
- **7**: `feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags`
- **8**: `feat(coder): add token-analytics API endpoints`
- **9**: `feat(web): add /results page for orchestrator runs and arena battle history`
- **10**: `feat(web): add /analytics route for token usage dashboard`
- **11**: `feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse`
---
## Success Criteria
### Verification Commands
```bash
# OpenSpec cleanup
test ! -f openspec/changes/archived/v1.13.12-skills-audit.md
test -d openspec/changes/boocontext/
test -f openspec/changes/enhanced-file-panel/.openspec.yaml
grep -q "context:" openspec/config.yaml
# llama-cache-and-spec
ps aux | grep llama-server | grep -o "cache-type-k q4_0"
ps aux | grep llama-server | grep -o "spec-type ngram-mod"
# PTY enhancements
curl -s http://localhost:9501/api/pty/list | jq '.'
# Results page
curl -s "http://localhost:3000/api/coder/runs?project_id=1" | jq '.'
# Token analytics
curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1" | jq '.'
# Enhanced file panel
# (visual verification via Playwright)
```
### Final Checklist
- [ ] 11 stub files deleted from archived/
- [ ] 5 misplaced proposals moved/merged into changes/
- [ ] 6 .openspec.yaml files added
- [ ] config.yaml populated with context + rules
- [ ] 10 archived proposals annotated with shipped versions
- [ ] llama-server running with KV cache quantization (q4_0) + ngram speculative decoding
- [ ] PTY exit notifications working
- [ ] `/results` page renders and loads data
- [ ] `/analytics` page renders and loads data
- [ ] Side-by-side diff, hide whitespace, wrap lines, expand/collapse all working
- [ ] All type checks pass
- [ ] All QA scenarios pass