
indifferentketchup 02063072ab chore: add ion package, codesight wiki, work plans, ascli config

New @boocode/ion package (v0.0.1) for inference optimization network.
.codesight/ wiki artifacts for codebase documentation.
.omo/ work plans for openspec cleanup and enhanced file panel.

2026-06-07 22:16:45 +00:00

41 KiB


Openspec Cleanup & High-Value Batch Implementation

TL;DR

Quick Summary: Clean up the openspec/ folder structure (delete 11 stub files, move 5 misplaced proposals, add missing .openspec.yaml files), then implement 5 high-value batches: llama-cache-and-spec, pty-enhancements, results-page, token-analyzer-ui, and enhanced-file-panel.

Deliverables:

Clean openspec folder: stubs removed, archived/ accurate, all batches schema-compliant

llama-server KV cache quantization + ngram speculative decoding enabled

PTY exit notifications and session metadata

/results page for orchestrator runs and arena battles (new route)

/analytics page for token usage dashboard (new route)

Enhanced file panel: side-by-side diff, hide whitespace, wrap lines, expand/collapse all

Estimated Effort: Medium-Large

Parallel Execution: YES (3 waves + final verification)

Critical Path: Cleanup → Backend impls → Frontend impls → Integration

Context

Original Request

Analyze openspec/ folder for structural issues, cross-reference against git tags, and create a work plan for implementing the high-value openspec batch proposals.

Interview Summary

Key Discussions:

openspec/changes/ has 22 active batches (all uncommitted, all unshipped) plus archived/ with 29 entries
11 stub files in archived/ are pure noise (49-66 bytes each, with one 222-byte exception per the Task 1 list; "Status: Shipped. Archived." only)
5 misplaced 2026-06-07 proposals were dumped in archived/ — they're active design docs, not shipped batches
6 active batches missing .openspec.yaml; openspec/config.yaml is empty
Active proposals overlap: multiple batches cover evaluation, memory, and workflow engine territory

Research Findings:

Git tag cross-reference confirms all folder-based archived entries match shipped tags
3 stub files reference wrong tags (v1.13.12→v1.13.14, v1.14.x→v1.13.19, etc.)
All 22 active batches have zero git references — pure filesystem artifacts
No active batch has shipped yet — zero can be archived

Metis Review

Identified gaps:

Deduplication needed: 2026-06-07 proposals overlap with active changes/ — merging must happen before cleanup is complete
Prioritization needed: 22 batches can't all ship at once — need clear tiers
User sign-off needed: Which Tier 1-2 batches to include in this plan vs defer

Work Objectives

Core Objective

Restore openspec structural integrity and ship the 5 highest-value, lowest-effort batch proposals.

Concrete Deliverables

Clean openspec: stubs deleted (11 files, ~795 bytes total per the sizes listed in Task 1), misplaced proposals moved (5 folders), .openspec.yaml files added (6 batches), config.yaml populated
llama-cache-and-spec: KV cache quantization (Q4_0) + ngram speculative decoding enabled
pty-enhancements: PTY exit notifications, session metadata, X-Agent-Flags
results-page: /results route with Analysis Runs + Arena Battles tabs
token-analyzer-ui: /analytics route with token usage dashboard
enhanced-file-panel: side-by-side diff toggle, hide whitespace, wrap long lines, expand/collapse all

Must Have

All 11 stub files removed from archived/
5 misplaced 2026-06-07 proposals moved from archived/ into changes/ (or merged into existing batches)
.openspec.yaml added to all 6 missing batches
openspec/config.yaml gets a context: block and rules: block
llama-server restarts with new flags (verify via ps aux | grep llama)
/results page loads without 404 and shows real data from existing API endpoints
/analytics page loads and shows token aggregates
Side-by-side diff renders correctly for files with wide lines

Must NOT Have (Guardrails)

NO breaking changes to existing routes or API contracts
NO new database tables or migrations (all data sources already exist)
NO external API dependencies (no cloud embedding models)
NO behavioral engine or Pregel state machine work (deferred to future batch)
NO touching the conductor flow runner or orchestrator pipeline
NO CSS framework changes (stay on Tailwind v4 / shadcn/ui)
NO backend changes unless explicitly required by the batch scope

Spec Framework Integration

Detected Framework: OpenSpec (folder structure only — no CLI)
Config File: openspec/config.yaml
Active Specs: 22 batch folders in openspec/changes/
Available Commands: Manual folder/file operations (no OpenSpec CLI)

Verification Strategy

ZERO HUMAN INTERVENTION — ALL verification is agent-executed.

Test Decision

Infrastructure exists: YES (vitest in apps/server, apps/coder)
Automated tests: Tests-after (no TDD — these are config/frontend changes)
Framework: vitest for backend, Playwright for frontend verification

QA Policy

Every task includes agent-executed QA scenarios. Evidence saved to .omo/evidence/.

Frontend: Playwright — navigate, assert DOM elements, screenshot
Backend: Bash (curl) — send requests, assert status + response
Config/Restart: Bash — check processes, verify new flags
File operations: Bash — verify files exist/deleted with test -f / test ! -f

Execution Strategy

Wave 1 (Structural Cleanup — quick, MAX PARALLEL):
├── Task 1: Delete 11 stub files from archived/ [quick]
├── Task 2: Move 5 misplaced 2026-06-07 proposals → changes/ [writing]
├── Task 3: Add .openspec.yaml to 6 missing batches [quick]
├── Task 4: Populate openspec/config.yaml with project context [writing]
└── Task 5: Add shipped status metadata to archived/ entries [quick]

Wave 2 (Backend — moderate, MAX PARALLEL):
├── Task 6: llama-cache-and-spec — KV cache + ngram flags [unspecified-high]
├── Task 7: pty-enhancements — exit notifications + session metadata [unspecified-high]
└── Task 8: token-analyzer-ui — backend API endpoints [unspecified-high]

Wave 3 (Frontend — moderate, MAX PARALLEL):
├── Task 9: results-page — /results route [visual-engineering]
├── Task 10: token-analyzer-ui — /analytics route [visual-engineering]
└── Task 11: enhanced-file-panel — diff modes + UI [visual-engineering]

Wave FINAL (Verification — 4 parallel reviews):
├── Task F1: Plan compliance audit [oracle]
├── Task F2: Code quality + type check [unspecified-high]
├── Task F3: Real QA — execute every scenario [unspecified-high + playwright]
└── Task F4: Scope fidelity check [deep]

Critical Path: Cleanup → Backend → Frontend → Integration
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 5 (Wave 1; Waves 2 and 3 run 3 tasks each, Wave FINAL runs 4)
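The wave model above (start everything in a wave together, wait for all of it, then move on) can be sketched in POSIX shell. This is only an illustration of the scheduling idea: `run_wave` is a hypothetical helper, and the real tasks are dispatched to agents rather than run as shell commands.

```shell
# Run one wave: start every command in the background, then wait for all.
# Returns non-zero if any command in the wave failed.
# Note: POSIX sh has no local variables, so these names are global.
run_wave() {
  wave_pids=""
  wave_failed=0
  for wave_cmd in "$@"; do
    sh -c "$wave_cmd" &
    wave_pids="$wave_pids $!"
  done
  for wave_pid in $wave_pids; do
    wait "$wave_pid" || wave_failed=1
  done
  return "$wave_failed"
}
```

Chaining waves with `&&` (for example `run_wave "cmd1" "cmd2" && run_wave "cmd6" "cmd7"`) gives the "next wave starts only after the previous one fully succeeds" behaviour described by the critical path.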

TODOs

1. Delete 11 stub files from archived/

What to do:
- Remove these 11 files from openspec/changes/archived/:
  - v1.13.12-skills-audit.md (57B, wrong tag ref)
  - v1.13.15-codecontext-synth.md (62B)
  - v1.13.17-cross-repo-reads.md (61B)
  - v1.13.18-codecontext-file-path.md (66B)
  - v1.13.20-drop-legacy-cols.md (61B)
  - v1.14-outer-loop.md (52B)
  - v1.14.1-mcp-poc.md (51B)
  - v1.14.x-html-artifact-panes.md (63B, wrong tag ref)
  - v1.15-mcp-multi.md (51B)
  - v2.0-boocoder.md (49B)
  - v2.2-paseo-providers.md (222B)
- Each file contains ONLY "# Title\n\nStatus: Shipped. Archived.\n" — zero documentation value
- Git history preserves the knowledge; CHANGELOG.md + tags are the authoritative record
Must NOT do:
- Do NOT delete any folder-based archived entries (they have real content)
- Do NOT delete boocode_batch10.md or handoff files (they're valuable)
Recommended Agent Profile:
- Category: quick
- Skills: []
- Justification: Trivial file deletion — no domain skills needed
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 2-5)
- Blocks: F1-F4
- Blocked By: None
References:
- openspec/changes/archived/ — target directory
- openspec/README.md — schema definition
- ~/.gitconfig — no special config needed
Acceptance Criteria:
- test ! -f openspec/changes/archived/v1.13.12-skills-audit.md → success for all 11 files
- ls openspec/changes/archived/*.md shows only allowed files (boocode_batch10.md, handoff_*)
QA Scenarios:
```
Scenario: Verify stubs deleted
  Tool: Bash
  Preconditions: Clean working tree
  Steps:
    1. For each stub file, run: test ! -f openspec/changes/archived/{filename}
    2. Assert: all 11 commands return exit code 0 (file does not exist)
    3. List remaining .md files: ls openspec/changes/archived/*.md
    4. Assert: only boocode_batch10.md and handoff_*.md files remain
  Expected Result: 11 stubs absent, 3 valuable files present
  Evidence: .omo/evidence/task-1-stubs-deleted.txt

Scenario: Valuable files preserved
  Tool: Bash
  Preconditions: Stubs deleted
  Steps:
    1. test -f openspec/changes/archived/boocode_batch10.md
    2. test -f openspec/changes/archived/handoff_v1.13.10_per_tool_cost.md
    3. test -f openspec/changes/archived/handoff_v1.13.8_prefix_verify.md
  Expected Result: All 3 return exit code 0
  Evidence: .omo/evidence/task-1-valuables-preserved.txt
```
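The "Verify stubs deleted" scenario can be scripted as a single loop over the 11 file names listed under "What to do". A minimal sketch, assuming the archive directory is passed as an argument; `check_stubs_deleted` is a hypothetical helper name.

```shell
# Print every stub that still exists under the given archive directory.
# Exit status is 0 only when all 11 stubs are absent.
check_stubs_deleted() {
  dir=$1
  remaining=0
  for f in \
    v1.13.12-skills-audit.md v1.13.15-codecontext-synth.md \
    v1.13.17-cross-repo-reads.md v1.13.18-codecontext-file-path.md \
    v1.13.20-drop-legacy-cols.md v1.14-outer-loop.md v1.14.1-mcp-poc.md \
    v1.14.x-html-artifact-panes.md v1.15-mcp-multi.md v2.0-boocoder.md \
    v2.2-paseo-providers.md; do
    if [ -f "$dir/$f" ]; then
      echo "still present: $f"
      remaining=$((remaining + 1))
    fi
  done
  [ "$remaining" -eq 0 ]
}
```

Running `check_stubs_deleted openspec/changes/archived > .omo/evidence/task-1-stubs-deleted.txt` would capture the evidence file and the pass/fail status in one step.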
Evidence to Capture:
- task-1-stubs-deleted.txt — confirmation each stub is gone
- task-1-valuables-preserved.txt — confirmation valuable files remain
Commit: YES
- Message: chore(openspec): delete 11 stub archive files with zero documentation value
- Files: openspec/changes/archived/v1.13.12-skills-audit.md, ...

2. Move 5 misplaced 2026-06-07 proposals from archived/ to changes/

What to do:
- Move these 5 folders from openspec/changes/archived/2026-06-07-* to openspec/changes/*:
  1. archived/2026-06-07-boocontext/ → changes/boocontext/ (partially shipped in v2.8.0)
  2. archived/2026-06-07-eval-sandbox-agent-runtime/ → merge into changes/import-llm-evaluator/ and changes/import-pregel-engine/ (overlapping scope)
  3. archived/2026-06-07-hybrid-workflow-engine/ → merge into changes/orchestrator-flow-advanced/
  4. archived/2026-06-07-memory-context-engineering/ → merge into changes/memory-context/
  5. archived/2026-06-07-port-audit-parlant-patterns/ → merge into changes/add-behavioral-engine/ and changes/audit-harness-integration/
- For merges (2-5): append relevant content from the 2026-06-07 proposal into the existing batch's proposal.md, tasks.md, design.md. The 2026-06-07 versions are "grand vision" — extract the concrete specs relevant to the narrower active batch.
- For boocontext/ (1): move as-is since it's a new slug with no direct collision.
Must NOT do:
- Do NOT delete the content of the 2026-06-07 folders — merge, don't discard
- Do NOT create duplicate batch slugs
- Do NOT overwrite existing proposal content — append/extend
Recommended Agent Profile:
- Category: writing
- Skills: []
- Justification: File organization + content merging — technical writing task
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1, 3-5)
- Blocks: F1-F4
- Blocked By: None
References:
- openspec/changes/archived/2026-06-07-*/ — source folders
- openspec/changes/import-llm-evaluator/ — target for eval overlap
- openspec/changes/import-pregel-engine/ — target for graph overlap
- openspec/changes/orchestrator-flow-advanced/ — target for workflow overlap
- openspec/changes/memory-context/ — target for memory overlap
- openspec/changes/add-behavioral-engine/ — target for port patterns
- openspec/changes/audit-harness-integration/ — target for audit patterns
Acceptance Criteria:
- openspec/changes/boocontext/ exists with proposal.md + tasks.md + design.md + specs/
- openspec/changes/import-llm-evaluator/ proposal.md now references eval-sandbox content
- openspec/changes/import-pregel-engine/ proposal.md now references graph engine content
- openspec/changes/orchestrator-flow-advanced/ proposal.md now references hybrid workflow
- openspec/changes/memory-context/ proposal.md now references context engineering
- openspec/changes/add-behavioral-engine/ and audit-harness-integration/ now reference port patterns
- test ! -d openspec/changes/archived/2026-06-07-{slug}/ succeeds for each of the 5 moved or merged folders (e.g. 2026-06-07-eval-sandbox-agent-runtime/)
QA Scenarios:
```
Scenario: boocontext moved
  Tool: Bash
  Preconditions: Files moved
  Steps:
    1. test -f openspec/changes/boocontext/proposal.md
    2. test -f openspec/changes/boocontext/tasks.md
    3. test ! -f openspec/changes/archived/2026-06-07-boocontext/proposal.md
  Expected Result: Files exist in new location, not in old
  Evidence: .omo/evidence/task-2-boocontext-moved.txt
```
```
Scenario: Merged proposals updated
  Tool: Bash
  Preconditions: Files merged
  Steps:
    1. grep -lE "eval-sandbox|graph engine|hybrid workflow|context engineering|port patterns" openspec/changes/*/proposal.md
    2. Assert: the listed files include the proposal.md of every merged batch
  Expected Result: grep finds references in the right target files
  Evidence: .omo/evidence/task-2-merges-verified.txt
```
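A single grep across all proposals only shows that some file matches. To check each merge target against its own phrase, the acceptance criteria can be turned into a small table. This is a sketch: the batch-to-phrase pairs come from the criteria above, but the exact wording the merged proposals will use is an assumption, so the match is case-insensitive. `check_merge_refs` is a hypothetical helper.

```shell
# For each merge target, confirm its proposal.md mentions the source proposal.
# $1 is the changes/ root. Prints one line per missing reference.
check_merge_refs() {
  root=$1
  missing=0
  while IFS='|' read -r batch phrase; do
    if ! grep -qi -- "$phrase" "$root/$batch/proposal.md" 2>/dev/null; then
      echo "missing reference: $batch ($phrase)"
      missing=$((missing + 1))
    fi
  done <<EOF
import-llm-evaluator|eval-sandbox
import-pregel-engine|graph engine
orchestrator-flow-advanced|hybrid workflow
memory-context|context engineering
add-behavioral-engine|port patterns
audit-harness-integration|port patterns
EOF
  [ "$missing" -eq 0 ]
}
```

`check_merge_refs openspec/changes` exits 0 only when all six targets carry a reference, which is closer to what step 2 of the scenario asserts.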
Evidence to Capture:
- task-2-boocontext-moved.txt
- task-2-merges-verified.txt
Commit: YES (groups with Task 1)
- Message: chore(openspec): move 5 misplaced proposals from archived/ → changes/, merge overlapping content
- Files: openspec/changes/boocontext/, openspec/changes/*/proposal.md, openspec/changes/*/tasks.md

3. Add .openspec.yaml to 6 missing batches

What to do:
- Create .openspec.yaml in each of these 6 active batches:
  - enhanced-file-panel/
  - llama-cache-and-spec/
  - memory-v2-hybrid-search/
  - omo-paseo-bridge/
  - orchestrator-flow-advanced/
  - results-page/
- Each file must contain:
```
schema: spec-driven
created: 2026-06-07
```
Must NOT do:
- Do NOT modify existing proposal.md or tasks.md content
- Do NOT add .openspec.yaml to batches that already have one
Recommended Agent Profile:
- Category: quick
- Skills: []
- Justification: Trivial boilerplate file creation
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1, 2, 4, 5)
- Blocks: F1-F4
- Blocked By: None
References:
- openspec/changes/add-3tier-memory/.openspec.yaml — template
Acceptance Criteria:
- All 6 created files contain schema: spec-driven
- find openspec/changes/ -name ".openspec.yaml" | wc -l counts all expected files
QA Scenarios:
```
Scenario: All .openspec.yaml files present
  Tool: Bash
  Preconditions: Files created
  Steps:
    1. For each batch: test -f openspec/changes/{batch}/.openspec.yaml
    2. For each: grep -q "schema: spec-driven" openspec/changes/{batch}/.openspec.yaml
  Expected Result: All 6 files exist with correct content
  Evidence: .omo/evidence/task-3-openspec-yaml-added.txt
```
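The creation step and its "do not touch existing files" guardrail fit in one loop. A minimal sketch, assuming the changes/ root is passed as an argument; `add_openspec_yaml` is a hypothetical helper, while the file content is the two lines given above.

```shell
# Create .openspec.yaml in each of the 6 listed batches, never overwriting.
# $1 is the changes/ root (e.g. openspec/changes).
add_openspec_yaml() {
  root=$1
  for batch in enhanced-file-panel llama-cache-and-spec memory-v2-hybrid-search \
    omo-paseo-bridge orchestrator-flow-advanced results-page; do
    target="$root/$batch/.openspec.yaml"
    if [ ! -d "$root/$batch" ]; then
      echo "missing batch folder: $batch" >&2
    elif [ -e "$target" ]; then
      echo "skip (already present): $batch"
    else
      printf 'schema: spec-driven\ncreated: 2026-06-07\n' > "$target"
      echo "created: $batch"
    fi
  done
}
```

Because existing files are skipped, the function is safe to re-run if the wave is retried.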
Evidence to Capture:
- task-3-openspec-yaml-added.txt
Commit: YES (groups with Task 1)
- Message: chore(openspec): add .openspec.yaml to 6 missing batch folders
- Files: openspec/changes/enhanced-file-panel/.openspec.yaml, ...

4. Populate openspec/config.yaml with project context

What to do:

Replace the empty openspec/config.yaml with a populated version:

schema: spec-driven

context: |
  Tech stack: TypeScript, React 18, Vite, Tailwind v4, shadcn/ui, Fastify, PostgreSQL 16, pnpm workspaces
  Apps: BooChat (read-only chat), BooCoder (write tools + agent dispatch), BooTerm (PTY terminals), Orchestrator (multi-agent conductor)
  Infrastructure: Docker Compose, Tailscale (100.114.205.53), Authelia auth, llama-swap inference
  Monorepo: apps/server, apps/web, apps/booterm, apps/coder, packages/contracts
  Commits: conventional commits, strict TypeScript, NodeNext module resolution
  Testing: vitest (server + coder), Playwright (web E2E), no root tsconfig

rules:
  proposal:
    - Every proposal must have a "Why" section explaining the motivation
    - Every proposal must have a "What Changes" section enumerating deliverables
    - Include "Must Have" / "Must NOT Have" guardrails
    - Reference shipped git tags when applicable
  tasks:
    - Tasks must be ordered by dependency, not priority
    - Each task is one atomic change (file, config, or command)
    - Parallel tasks go in the same wave

Must NOT do:

Do NOT delete the schema: spec-driven line

Recommended Agent Profile:

Category: writing
Skills: []

Parallelization:

Can Run In Parallel: YES
Parallel Group: Wave 1 (with Tasks 1-3, 5)
Blocks: F1-F4
Blocked By: None

References:

openspec/config.yaml — current (empty) file
/home/samkintop/opt/boocode/CLAUDE.md — source for context info

Acceptance Criteria:

grep -q "context:" openspec/config.yaml → success
grep -q "rules:" openspec/config.yaml → success
config.yaml is larger than 500 bytes (was 20 bytes)

QA Scenarios:

Scenario: config.yaml populated
  Tool: Bash
  Preconditions: File written
  Steps:
    1. wc -c openspec/config.yaml → assert > 500 bytes
    2. grep -q "context:" openspec/config.yaml
    3. grep -q "rules:" openspec/config.yaml
    4. grep -q "schema: spec-driven" openspec/config.yaml
  Expected Result: All assertions pass
  Evidence: .omo/evidence/task-4-config-populated.txt
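The four assertions in this scenario can be bundled into one check. A sketch only: `check_config` is a hypothetical helper, and the 500-byte floor is the threshold used in the scenario steps.

```shell
# Verify the populated config: size floor plus the three required keys.
# $1 is the path to config.yaml. Prints the first failed check.
check_config() {
  file=$1
  [ -f "$file" ] || { echo "not found: $file"; return 1; }
  size=$(wc -c < "$file")
  size=$((size))  # normalise any padding that wc adds on some platforms
  [ "$size" -gt 500 ] || { echo "too small: $size bytes"; return 1; }
  for key in 'schema: spec-driven' 'context:' 'rules:'; do
    grep -q -- "$key" "$file" || { echo "missing: $key"; return 1; }
  done
}
```

`check_config openspec/config.yaml` returns 0 only when all four conditions hold.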

Evidence to Capture:

task-4-config-populated.txt

Commit: YES (groups with Task 1)

Message: chore(openspec): populate config.yaml with project context and rules
Files: openspec/config.yaml

5. Add shipped-status metadata to 10 archived folder entries

What to do:
- Add frontmatter or status line to each archived folder's proposal.md documenting the shipped version:
  - agent-status-normalize/ → v2.7.6
  - claude-sdk-sessionstore/ → v2.7.5
  - contracts-ssot/ → v2.7.13
  - license-debt-mit/ → v2.7.0
  - mistake-tracker-file-ledger/ → v2.7.4
  - orchestrator/ → v2.7.17
  - sampling-streamjson-tokens/ → v2.7.3
  - v2-3-provider-lifecycle/ → v2.5.4–v2.5.13
  - v2-6-persistent-agent-sessions/ → v2.6.4–v2.6.8
  - write-edit-robustness/ → v2.7.1
- Add line after the ## Why section heading: **Shipped in:** `v2.7.6-agent-status-normalize` (or equivalent)
Must NOT do:
- Do NOT change the body of the proposal beyond the shipped annotation
- Do NOT add shipped annotations to the 2026-06-07 batches (they're not shipped)
Recommended Agent Profile:
- Category: quick
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1-4)
- Blocks: F1-F4
- Blocked By: None
References:
- Git tags: v2.7.0-mit, v2.7.1-write-edit-robustness, etc.
Acceptance Criteria:
- All 10 archived batch proposals contain "Shipped in:" referencing a git tag
- grep -l "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l prints 10
QA Scenarios:
```
Scenario: All archived batches annotated
  Tool: Bash
  Preconditions: Files edited
  Steps:
    1. grep -rl "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l
    2. Assert: exactly 10 files contain "Shipped in:"
  Expected Result: 10 files annotated
  Evidence: .omo/evidence/task-5-shipped-annotations.txt
```
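The annotation itself is a one-line insertion after the first `## Why` heading. A minimal sketch using awk, assuming each proposal.md has that heading at the start of a line; `annotate_shipped` is a hypothetical helper, and a file with no such heading is left unchanged.

```shell
# Insert a "Shipped in" line after the first "## Why" heading.
# $1 is the proposal file, $2 the git tag. Does nothing if already annotated.
annotate_shipped() {
  file=$1
  tag=$2
  if grep -q 'Shipped in:' "$file"; then
    return 0  # already annotated, keep the edit idempotent
  fi
  scratch=$(mktemp) || return 1
  awk -v tag="$tag" '
    { print }
    !seen && /^## Why/ { print ""; print "**Shipped in:** `" tag "`"; seen = 1 }
  ' "$file" > "$scratch" && cat "$scratch" > "$file"
  status=$?
  rm -f "$scratch"
  return "$status"
}
```

For example, `annotate_shipped openspec/changes/archived/agent-status-normalize/proposal.md v2.7.6-agent-status-normalize` touches nothing in the proposal body beyond the added line.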
Evidence to Capture:
- task-5-shipped-annotations.txt
Commit: YES (groups with Task 1)
- Message: chore(openspec): add shipped-in version annotations to 10 archived batch proposals
- Files: openspec/changes/archived/*/proposal.md

TODOs (Wave 2)

6. llama-cache-and-spec — Enable KV cache quantization + ngram speculative decoding

What to do:
- Edit apps/server/src/services/inference/providers/llama.ts (or the llama args validator llama-args-validator.ts) to allow --cache-type-k q4_0 and --spec-type ngram-mod through the shadowing lists
- Change the base llama-server args to include:
  - --cache-type-k q4_0 (4-bit K cache; this flag quantizes only the K half of the KV cache, so the ~4× reduction applies to K-cache memory, not to total VRAM)
  - --spec-type ngram-mod (ngram speculative decoding, 2-3× tok/s on code)
- Verify the sidecar validator (sidecar/validator.go) also allows these flags through
- Read apps/server/src/services/inference/llama-args-validator.ts and sidecar/validator.go to understand the current blocklist
- Add the two flags to the allowlist instead of the shadow list
- Update the sidecar Dockerfile or config if needed
Must NOT do:
- Do NOT change any other llama-server args
- Do NOT use any KV cache quantization type other than Q4_0 (e.g. Q8_0 or Q3_K)
- Do NOT add a separate draft model (ngram is self-contained)
Recommended Agent Profile:
- Category: unspecified-high
- Skills: []
- Justification: Requires understanding llama.cpp arg validation across two codebases
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 7-8)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/server/src/services/inference/llama-args-validator.ts — current arg blocklist/allowlist
- sidecar/validator.go — sidecar validation (if exists)
- docker-compose.yml or sidecar Dockerfile — restart config
- openspec/changes/llama-cache-and-spec/proposal.md — full spec
Acceptance Criteria:
- --cache-type-k q4_0 present in llama-server args after restart
- --spec-type ngram-mod present in llama-server args after restart
- llama-server starts without errors
- Inference still works (send test message)
QA Scenarios:
```
Scenario: KV cache quantization enabled
  Tool: Bash
  Preconditions: Server restarted after changes
  Steps:
    1. ps aux | grep llama-server | grep -o "cache-type-k q4_0"
    2. Assert: output matches "q4_0"
  Expected Result: KV cache quantization is active
  Evidence: .omo/evidence/task-6-kv-cache-enabled.txt
```
```
Scenario: Speculative decoding enabled
  Tool: Bash
  Preconditions: Server restarted
  Steps:
    1. ps aux | grep llama-server | grep -o "spec-type ngram-mod"
    2. Assert: output matches "ngram-mod"
  Expected Result: Ngram speculative decoding is active
  Evidence: .omo/evidence/task-6-ngram-enabled.txt
```
```
Scenario: Inference still works
  Tool: Bash (curl)
  Preconditions: Server running with new flags
  Steps:
    1. curl -s -o /dev/null -w "%{http_code}" http://100.114.205.53:9500/api/health
    2. Assert: HTTP 200
  Expected Result: Server is healthy and serving
  Evidence: .omo/evidence/task-6-health-check.txt
```
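Both flag scenarios reduce to "does the process command line contain this flag followed by this value". A small sketch of that check: `has_flag_value` is a hypothetical helper, and it only recognises the space-separated form (`--flag value`), which is the form this plan uses.

```shell
# Succeed when the command line contains "<flag> <value>" as whole words.
# $1 is the full command line, $2 the flag, $3 the expected value.
has_flag_value() {
  cmdline=$1
  flag=$2
  value=$3
  case " $cmdline " in
    *" $flag $value "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

The command line can come from `ps -eo args | grep '[l]lama-server'`, for example `has_flag_value "$args" --cache-type-k q4_0`. Matching whole words avoids a false pass on a longer value such as `q4_0x`.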
Evidence to Capture:
- task-6-kv-cache-enabled.txt — grep output showing the flag
- task-6-ngram-enabled.txt — grep output showing the flag
- task-6-health-check.txt — health check confirmation
Commit: YES
- Message: perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding
- Files: apps/server/src/services/inference/llama-args-validator.ts, sidecar/validator.go (if needed)

7. pty-enhancements — PTY exit notifications + session metadata

What to do:
- Add notifyOnExit support to the PTY session manager (likely in apps/booterm/)
- When a PTY process exits AND notifyOnExit was set:
  - Emit an event/message to the agent channel with: session ID, title, exit code, total output lines, last line of output
- Add session metadata fields: agent ID that spawned it, task ID, optional title
- Add pty_list endpoint that returns metadata for all sessions
- Wire X-Agent-Flags header support for agent identification
- Read apps/booterm/ to understand the current PTY architecture
Must NOT do:
- Do NOT change the existing pty_spawn interface (add notifyOnExit as optional param)
- Do NOT implement sandbox or circuit breaker (out of scope for this wave)
- Do NOT add new database tables (metadata lives in-memory or in existing session store)
Recommended Agent Profile:
- Category: unspecified-high
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 6, 8)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/booterm/src/ — PTY session management code
- apps/coder/src/services/ — agent dispatch that spawns PTYs
- openspec/changes/pty-enhancements/proposal.md — full spec
- apps/server/src/services/inference/ — inference pipeline that may need to handle notifications
Acceptance Criteria:
- notifyOnExit optional parameter on pty_spawn works
- On process exit with notifyOnExit=true, agent receives notification
- pty_list returns session metadata
- X-Agent-Flags header is recognized
QA Scenarios:
```
Scenario: notifyOnExit triggers notification
  Tool: Bash + tmux
  Preconditions: booterm running
  Steps:
    1. Start a short PTY with notifyOnExit=true: sleep 1
    2. Wait 2 seconds for completion
    3. Check notification was delivered
  Expected Result: Exit notification received with title, exit code, last line
  Evidence: .omo/evidence/task-7-notify-on-exit.txt
```
```
Scenario: pty_list shows metadata
  Tool: Bash (curl)
  Preconditions: PTY sessions exist
  Steps:
    1. curl http://localhost:9501/api/pty/list 2>/dev/null
    2. Assert: response contains session metadata fields
  Expected Result: Metadata returned for each session
  Evidence: .omo/evidence/task-7-pty-list.txt
```
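To make the notification contents concrete, the fields named in the task (title, exit code, total output lines, last line of output) can be produced for an ordinary subprocess. This is not booterm's implementation and uses no PTY; `run_with_exit_summary` and the `key=value` output format are assumptions for illustration, and the session ID is omitted because it would come from the session manager.

```shell
# Run a command, then print the fields an exit notification would carry.
# $1 is the session title; the remaining arguments are the command to run.
run_with_exit_summary() {
  title=$1
  shift
  log=$(mktemp) || return 1
  "$@" > "$log" 2>&1 && code=0 || code=$?
  lines=$(($(wc -l < "$log")))
  last=$(tail -n 1 "$log")
  rm -f "$log"
  printf 'title=%s\nexit_code=%s\noutput_lines=%s\nlast_line=%s\n' \
    "$title" "$code" "$lines" "$last"
  return "$code"
}
```

A QA step could compare these fields against what the agent channel actually receives for the same short command.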
Evidence to Capture:
- task-7-notify-on-exit.txt — notification evidence
- task-7-pty-list.txt — pty_list response
Commit: YES
- Message: feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags
- Files: apps/booterm/src/*.ts, apps/coder/src/services/*.ts

8. token-analyzer-ui — Backend API endpoints for token analytics

What to do:
- Add read-only API endpoints to serve aggregate token data:
  - GET /api/coder/token-analytics/sessions — per-session token usage (input, output, cost)
  - GET /api/coder/token-analytics/tools — per-tool cost breakdown (from tool_cost_stats view)
  - GET /api/coder/token-analytics/trends — token usage over time
- Reuse existing data sources:
  - agent_sessions.input_tokens, agent_sessions.output_tokens, agent_sessions.cost
  - tool_cost_stats view (per-tool 100-call rolling window)
  - tasks.token_breakdown JSONB column
- Implement in apps/coder/src/routes/ (follow existing route patterns)
- Add proper error handling, pagination for large result sets, and date filtering
Must NOT do:
- Do NOT create new database tables or migrations
- Do NOT add token tracking logic (data is already accumulated)
- Do NOT add real-time streaming (data is historical aggregate)
Recommended Agent Profile:
- Category: unspecified-high
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 6-7)
- Blocks: Task 10 (frontend depends on backend)
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/coder/src/routes/ — existing route patterns
- apps/server/src/schema.sql — tool_cost_stats view definition
- apps/coder/CLAUDE.md — coder conventions, route registration
- packages/contracts/ — shared types for response schemas
- openspec/changes/token-analyzer-ui/proposal.md — full spec
Acceptance Criteria:
- GET /api/coder/token-analytics/sessions?project_id=X returns 200 with token data
- GET /api/coder/token-analytics/tools?project_id=X returns 200 with tool breakdown
- GET /api/coder/token-analytics/trends?project_id=X returns 200 with trend data
- All endpoints respect project_id filtering
- Empty data returns valid empty arrays (not errors)
QA Scenarios:
```
Scenario: Sessions endpoint works
  Tool: Bash (curl)
  Preconditions: Server running, project exists
  Steps:
    1. curl -s -w "\n%{http_code}\n" "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1"
    2. Assert: final output line (HTTP status) is 200
    3. Assert: preceding output is valid JSON with expected fields
  Expected Result: Session token data returned
  Evidence: .omo/evidence/task-8-sessions-endpoint.txt
```
```
Scenario: Empty data returns valid response
  Tool: Bash (curl)
  Preconditions: Server running
  Steps:
    1. curl -s -w "\n%{http_code}\n" "http://localhost:3000/api/coder/token-analytics/sessions?project_id=999"
    2. Assert: final output line (HTTP status) is 200
    3. Assert: preceding output contains an empty array (not an error)
  Expected Result: Graceful empty state
  Evidence: .omo/evidence/task-8-empty-data.txt
```
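Asserting on the status code needs curl to print it, which `-w '\n%{http_code}'` does by appending the code as a final line. The sketch below splits that output and applies both assertions. `assert_ok_with_array` is a hypothetical helper, and the array check is a crude textual test (it looks for `[` ... `]` in the body), so a real JSON parser would be more robust.

```shell
# Check curl output of the form "<body>\n<status>":
# status must be 200 and the body must contain an array.
assert_ok_with_array() {
  raw=$1
  status=$(printf '%s\n' "$raw" | tail -n 1)
  body=$(printf '%s\n' "$raw" | sed '$d')
  [ "$status" = "200" ] || { echo "unexpected status: $status"; return 1; }
  case "$body" in
    *'['*']'*) return 0 ;;
    *) echo "no array in body"; return 1 ;;
  esac
}
```

Usage against the endpoint from the scenario: `assert_ok_with_array "$(curl -s -w '\n%{http_code}' "http://localhost:3000/api/coder/token-analytics/sessions?project_id=999")"`.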
Evidence to Capture:
- task-8-sessions-endpoint.txt — successful API response
- task-8-empty-data.txt — graceful empty handling
Commit: YES
- Message: feat(coder): add token-analytics API endpoints for session/tool/trend data
- Files: apps/coder/src/routes/token-analytics.ts, apps/coder/src/services/token-analytics.ts

TODOs (Wave 3)

9. results-page — /results route for orchestrator runs + arena battles

What to do:
- Add sidebar nav button with ScrollText icon (lucide-react), above the Token Analytics button
- Create new /results route page with two tabs:
  - "Analysis Runs" — list orchestrator flow runs (research, code-review, investigate, etc.)
  - "Arena Battles" — list battle history
- Each tab shows: status dot, name/type, band/battle-type, model, timing, error indicator
- Completed runs show "View Report" link; completed battles show "View Analysis"
- Uses existing API endpoints (no backend changes needed):
  - GET /api/coder/runs?project_id=X
  - GET /api/coder/battles?project_id=X
- Requires project_id context — load from sidebar on mount, or show project selector
- Follow existing route patterns in web (React Router routes, lazy loading)
Must NOT do:
- Do NOT create new API endpoints
- Do NOT modify existing API contracts
- Do NOT add pagination beyond what the API already provides
- Do NOT add real-time updates (static list, refreshed on mount)
Recommended Agent Profile:
- Category: visual-engineering
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 10-11)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/web/src/routes/ — existing route patterns (analytics, settings)
- apps/web/src/components/sidebar/ — nav button patterns
- apps/web/src/api/ — existing API client
- openspec/changes/results-page/proposal.md — full spec
- apps/coder/src/routes/runs.ts — runs endpoint
- apps/coder/src/routes/battles.ts — battles endpoint
Acceptance Criteria:
- Sidebar shows "Results" button with ScrollText icon above Token Analytics
- Clicking navigates to /results
- "Analysis Runs" tab loads and displays orchestrator flow history
- "Arena Battles" tab loads and displays battle history
- Completed runs show "View Report" link
- Empty state shown when no data
- Error state shown on API failure
QA Scenarios:
```
Scenario: Nav button renders
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /
    2. Look for sidebar nav button with text "Results"
    3. Assert: button exists and links to /results
  Expected Result: Results nav button present
  Evidence: .omo/evidence/task-9-nav-button.png
```
```
Scenario: Results page loads
  Tool: Playwright
  Preconditions: Web app loaded, project exists
  Steps:
    1. Navigate to /results
    2. Wait for "Analysis Runs" tab to appear
    3. Assert: tab shows list of runs or empty state
  Expected Result: Page loads with data
  Evidence: .omo/evidence/task-9-results-page.png
```
Evidence to Capture:
- task-9-nav-button.png — screenshot of sidebar with Results button
- task-9-results-page.png — screenshot of /results page with data
Commit: YES
- Message: feat(web): add /results page for orchestrator runs and arena battle history
- Files: apps/web/src/routes/results.tsx, apps/web/src/components/sidebar/*.tsx

10. token-analyzer-ui — /analytics dashboard route

What to do:
- Add sidebar nav button with appropriate icon, above Settings button
- Create new /analytics route page showing token usage dashboard:
  - Aggregate token usage across sessions (total input/output tokens)
  - Per-tool cost breakdown (bar chart or table)
  - Per-session token history (list or mini chart)
  - Per-provider cost comparison
- Reuse existing data from the backend endpoints created in Task 8
- Follow the same route/nav patterns as results-page
Must NOT do:
- Do NOT add new charting libraries (use what's already available)
- Do NOT implement real-time updates
Recommended Agent Profile:
- Category: visual-engineering
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 9, 11)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1), Task 8 (backend endpoints)
References:
- Same as Task 9 + Task 8 endpoints
- openspec/changes/token-analyzer-ui/proposal.md — full spec
- apps/web/src/components/ — existing chart/list components
Acceptance Criteria:
- Sidebar shows "Token Analytics" button above Settings
- /analytics loads and shows token dashboard
- Per-session, per-tool, per-provider breakdowns visible
- Empty state shown when no data
QA Scenarios:
```
Scenario: Token Analytics nav button renders
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /
    2. Look for "Token Analytics" button in sidebar
    3. Assert: button exists above Settings
  Expected Result: Nav button present
  Evidence: .omo/evidence/task-10-nav-button.png
```
```
Scenario: Analytics dashboard loads
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /analytics
    2. Wait for dashboard content to render
    3. Assert: token usage data is visible
  Expected Result: Dashboard shows data
  Evidence: .omo/evidence/task-10-analytics-dashboard.png
```
Evidence to Capture:
- task-10-nav-button.png
- task-10-analytics-dashboard.png
Commit: YES
- Message: feat(web): add /analytics route for token usage dashboard
- Files: apps/web/src/routes/analytics.tsx, apps/web/src/components/sidebar/*.tsx
11. enhanced-file-panel — Side-by-side diff, hide whitespace, wrap lines, expand/collapse all

What to do:
- Add side-by-side diff toggle to the Git diff tab in the file panel
- Add "Hide whitespace" checkbox that filters whitespace-only changes
- Add "Wrap long lines" toggle for diff display
- Add "Expand All" / "Collapse All" buttons for file-level diffs
- Implement in apps/web/src/components/ following existing file panel patterns
- Read apps/web/src/components/ to find the existing diff rendering components
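Because the backend diff logic must stay untouched, "Hide whitespace" can be a client-side filter over already-parsed hunks. The sketch below assumes a hypothetical `DiffHunk` shape; the existing diff components may model hunks differently. It treats a hunk as whitespace-only when its removed and added text are identical once all whitespace is stripped, which is an approximation and not a reimplementation of `git diff -w`.

```typescript
// Hypothetical parsed-diff shapes; adapt to the existing file panel types.
interface DiffLine {
  kind: "add" | "del" | "context";
  text: string;
}

interface DiffHunk {
  header: string;
  lines: DiffLine[];
}

// Remove every whitespace character so only meaningful content is compared.
const stripWhitespace = (s: string): string => s.replace(/\s+/g, "");

// A hunk is whitespace-only when removed and added content match after stripping.
function isWhitespaceOnlyHunk(hunk: DiffHunk): boolean {
  const removed = hunk.lines
    .filter((line) => line.kind === "del")
    .map((line) => stripWhitespace(line.text))
    .join("");
  const added = hunk.lines
    .filter((line) => line.kind === "add")
    .map((line) => stripWhitespace(line.text))
    .join("");
  return removed === added;
}

// Return the hunks to render, given the state of the "Hide whitespace" checkbox.
function visibleHunks(hunks: DiffHunk[], hideWhitespace: boolean): DiffHunk[] {
  return hideWhitespace
    ? hunks.filter((hunk) => !isWhitespaceOnlyHunk(hunk))
    : hunks;
}
```

Filtering at render time keeps the toggle instant and requires no new API call.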
Must NOT do:
- Do NOT implement inline diff comments (deferred)
- Do NOT implement in-browser file editing (deferred)
- Do NOT change the backend diff generation logic
Recommended Agent Profile:
- Category: visual-engineering
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 9-10)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/web/src/components/ — existing file panel and diff components
- apps/web/src/hooks/ — hooks for diff state management
- openspec/changes/enhanced-file-panel/proposal.md — full spec
- apps/server/src/routes/projects.ts — git diff backend route
Acceptance Criteria:
- Side-by-side diff toggles correctly
- Hide whitespace checkbox filters whitespace changes
- Wrap long lines toggle works
- Expand/Collapse All buttons toggle all files
- All changes are frontend-only (no new API calls)
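One way to satisfy the side-by-side criterion without a new API call is to derive two-column rows from the unified diff lines the panel already has. This is a sketch under assumptions: `DiffLine`, `SideBySideRow`, and `toSideBySideRows` are hypothetical names, and the pairing rule (match the i-th deletion in a change block with the i-th addition) is a simple heuristic, not the layout the proposal mandates.

```typescript
// Hypothetical line shape for an already-parsed unified diff.
interface DiffLine {
  kind: "add" | "del" | "context";
  text: string;
}

// One rendered row: null means that column is blank for this row.
interface SideBySideRow {
  left: string | null;
  right: string | null;
}

function toSideBySideRows(lines: DiffLine[]): SideBySideRow[] {
  const rows: SideBySideRow[] = [];
  let dels: string[] = [];
  let adds: string[] = [];

  // Emit the pending change block, pairing deletions with additions by index.
  const flush = (): void => {
    const count = Math.max(dels.length, adds.length);
    for (let i = 0; i < count; i++) {
      rows.push({ left: dels[i] ?? null, right: adds[i] ?? null });
    }
    dels = [];
    adds = [];
  };

  for (const line of lines) {
    if (line.kind === "del") {
      // A deletion after additions starts a new change block.
      if (adds.length > 0) flush();
      dels.push(line.text);
    } else if (line.kind === "add") {
      adds.push(line.text);
    } else {
      // Context lines close the current block and appear in both columns.
      flush();
      rows.push({ left: line.text, right: line.text });
    }
  }
  flush();
  return rows;
}
```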
QA Scenarios:
```
Scenario: Side-by-side diff renders
  Tool: Playwright
  Preconditions: Repo with uncommitted changes
  Steps:
    1. Open file panel
    2. Click Git tab
    3. Toggle side-by-side view
    4. Assert: diff renders in two columns
  Expected Result: Side-by-side diff visible
  Evidence: .omo/evidence/task-11-side-by-side.png
```
```
Scenario: Hide whitespace works
  Tool: Playwright
  Preconditions: Diff has whitespace changes
  Steps:
    1. Open diff with whitespace changes
    2. Check "Hide whitespace"
    3. Assert: whitespace-only hunks are hidden
  Expected Result: Whitespace-only changes filtered
  Evidence: .omo/evidence/task-11-hide-whitespace.png
```
```
Scenario: Expand/Collapse All toggles
  Tool: Playwright
  Preconditions: Multiple files changed
  Steps:
    1. Click "Collapse All"
    2. Assert: all files collapsed to summary
    3. Click "Expand All"
    4. Assert: all files expanded
  Expected Result: Bulk toggle works
  Evidence: .omo/evidence/task-11-expand-collapse.png
```
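The bulk toggle exercised by this scenario could be backed by a single set of collapsed file paths. The helper names below are hypothetical and the sketch is framework-agnostic; in practice the set would likely live in whatever hook already holds diff state.

```typescript
// State: the set of file paths currently collapsed to their summary row.

// "Collapse All": every changed file becomes collapsed.
function collapseAll(paths: string[]): Set<string> {
  return new Set(paths);
}

// "Expand All": no file is collapsed.
function expandAll(): Set<string> {
  return new Set<string>();
}

// Per-file toggle; returns a new set so state updates stay immutable.
function toggleFile(collapsed: ReadonlySet<string>, path: string): Set<string> {
  const next = new Set(collapsed);
  if (next.has(path)) {
    next.delete(path);
  } else {
    next.add(path);
  }
  return next;
}

// Convenience check for rendering a file's body or only its summary.
function isCollapsed(collapsed: ReadonlySet<string>, path: string): boolean {
  return collapsed.has(path);
}
```

Storing only collapsed paths means newly changed files default to expanded without extra bookkeeping.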
Evidence to Capture:
- task-11-side-by-side.png
- task-11-hide-whitespace.png
- task-11-expand-collapse.png
Commit: YES
- Message: feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse all
- Files: apps/web/src/components/*.tsx, apps/web/src/hooks/*.ts

Final Verification Wave

- F1. Plan Compliance Audit (oracle): Read the plan end-to-end. For each "Must Have", verify that the implementation exists (read file, curl endpoint, run command). For each "Must NOT Have", search the codebase for forbidden patterns and reject with file:line if any are found. Check that evidence files exist in .omo/evidence/. Compare deliverables against the plan. Output: Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT
- F2. Code Quality Review (unspecified-high): Run tsc --noEmit for any changed apps, plus bun test. Review all changed files for as any/@ts-ignore, empty catches, console.log in prod, commented-out code, and unused imports. Output: Build [PASS/FAIL] | Lint [PASS/FAIL] | Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT
- F3. Real Manual QA (unspecified-high): Start from a clean state. Execute EVERY QA scenario from EVERY task, following the exact steps and capturing evidence. Test cross-task integration (features working together, not in isolation). Test edge cases: empty state, invalid input, missing project_id. Save to .omo/evidence/final-qa/. Output: Scenarios [N/N pass] | Integration [N/N] | Edge Cases [N tested] | VERDICT
- F4. Scope Fidelity Check (deep): For each task, read "What to do", then read the actual diff. Verify a 1:1 match: everything in scope was built (nothing missing) and nothing beyond scope was built (no creep). Check "Must NOT do" compliance. Output: Tasks [N/N compliant] | Contamination [CLEAN/N issues] | Unaccounted [CLEAN/N files] | VERDICT

Commit Strategy

1-5 (grouped): chore(openspec): cleanup openspec folder structure — delete stubs, move proposals, add metadata, populate config
6: perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding
7: feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags
8: feat(coder): add token-analytics API endpoints
9: feat(web): add /results page for orchestrator runs and arena battle history
10: feat(web): add /analytics route for token usage dashboard
11: feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse

Success Criteria

Verification Commands

# OpenSpec cleanup
test ! -f openspec/changes/archived/v1.13.12-skills-audit.md
test -d openspec/changes/boocontext/
test -f openspec/changes/enhanced-file-panel/.openspec.yaml
grep -q "context:" openspec/config.yaml

# llama-cache-and-spec
ps aux | grep llama-server | grep -o "cache-type-k q4_0"
ps aux | grep llama-server | grep -o "spec-type ngram-mod"

# PTY enhancements
curl -s http://localhost:9501/api/pty/list | jq '.'

# Results page
curl -s "http://localhost:3000/api/coder/runs?project_id=1" | jq '.'

# Token analytics
curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1" | jq '.'

# Enhanced file panel
# (visual verification via Playwright)
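The two llama-server checks above match a substring of the process listing. Where a stricter check is wanted, the same idea can be expressed over a parsed argument list. This is a sketch only: `hasFlagValue` is a hypothetical helper, and the `--cache-type-k` and `--spec-type` spellings are assumed from the grep patterns in this plan, not confirmed against the llama-server build in use.

```typescript
// Return true when `flag` appears in argv immediately followed by `value`.
function hasFlagValue(argv: string[], flag: string, value: string): boolean {
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === flag && argv[i + 1] === value) {
      return true;
    }
  }
  return false;
}

// Example command line, split on whitespace. The flags are assumptions
// taken from the grep patterns in the verification commands.
const exampleArgv = "llama-server --cache-type-k q4_0 --spec-type ngram-mod".split(/\s+/);

const kvCacheQuantized = hasFlagValue(exampleArgv, "--cache-type-k", "q4_0");
const ngramSpecEnabled = hasFlagValue(exampleArgv, "--spec-type", "ngram-mod");
console.log({ kvCacheQuantized, ngramSpecEnabled });
```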

Final Checklist

- 11 stub files deleted from archived/
- 5 misplaced proposals moved/merged into changes/
- 6 .openspec.yaml files added
- config.yaml populated with context + rules
- 10 archived proposals annotated with shipped versions
- llama-server running with KV cache Q4_0 + ngram
- PTY exit notifications working
- /results page renders and loads data
- /analytics page renders and loads data
- Side-by-side diff, hide whitespace, wrap lines, expand/collapse all working
- All type checks pass
- All QA scenarios pass
