Openspec Cleanup & High-Value Batch Implementation
TL;DR
Quick Summary: Clean up the `openspec/` folder structure (delete 11 stub files, move 5 misplaced proposals, add missing `.openspec.yaml` files), then implement 5 high-value batches: llama-cache-and-spec, pty-enhancements, results-page, token-analyzer-ui, and enhanced-file-panel.
Deliverables:
- Clean openspec folder: stubs removed, archived/ accurate, all batches schema-compliant
- llama-server KV cache quantization + ngram speculative decoding enabled
- PTY exit notifications and session metadata
- `/results` page for orchestrator runs and arena battles (new route)
- `/analytics` page for token usage dashboard (new route)
- Enhanced file panel: side-by-side diff, hide whitespace, wrap lines, expand/collapse all
Estimated Effort: Medium-Large
Parallel Execution: YES (3 waves + final verification)
Critical Path: Cleanup → Backend impls → Frontend impls → Integration
Context
Original Request
Analyze openspec/ folder for structural issues, cross-reference against git tags, and create a work plan for implementing the high-value openspec batch proposals.
Interview Summary
Key Discussions:
- `openspec/changes/` has 22 active batches (all uncommitted, all unshipped) plus `archived/` with 29 entries
- 11 stub files in archived/ are pure noise (49-66 bytes each, "Status: Shipped. Archived." only)
- 5 misplaced 2026-06-07 proposals were dumped in archived/; they're active design docs, not shipped batches
- 6 active batches missing `.openspec.yaml`; `openspec/config.yaml` is empty
- Active proposals overlap: multiple batches cover evaluation, memory, and workflow engine territory
Research Findings:
- Git tag cross-reference confirms all folder-based archived entries match shipped tags
- 3 stub files reference wrong tags (v1.13.12→v1.13.14, v1.14.x→v1.13.19, etc.)
- All 22 active batches have zero git references — pure filesystem artifacts
- No active batch has shipped yet — zero can be archived
Metis Review
Identified gaps:
- Deduplication needed: 2026-06-07 proposals overlap with active changes/ — merging must happen before cleanup is complete
- Prioritization needed: 22 batches can't all ship at once — need clear tiers
- User sign-off needed: Which Tier 1-2 batches to include in this plan vs defer
Work Objectives
Core Objective
Restore openspec structural integrity and ship the 5 highest-value, lowest-effort batch proposals.
Concrete Deliverables
- Clean openspec: stubs deleted (11 files ~573 bytes), misplaced proposals moved (5 folders), `.openspec.yaml` files added (6 batches), config.yaml populated
- llama-cache-and-spec: KV cache quantization (Q4_0) + ngram speculative decoding enabled
- pty-enhancements: PTY exit notifications, session metadata, X-Agent-Flags
- results-page: `/results` route with Analysis Runs + Arena Battles tabs
- token-analyzer-ui: `/analytics` route with token usage dashboard
- enhanced-file-panel: side-by-side diff toggle, hide whitespace, wrap long lines, expand/collapse all
Must Have
- All 11 stub files removed from archived/
- 5 misplaced 2026-06-07 proposals moved from archived/ into `changes/` (or merged into existing batches)
- `.openspec.yaml` added to all 6 missing batches
- `openspec/config.yaml` gets a `context:` block and `rules:` block
- llama-server restarts with new flags (verify via `ps aux | grep llama`)
- `/results` page loads without 404 and shows real data from existing API endpoints
- `/analytics` page loads and shows token aggregates
- Side-by-side diff renders correctly for files with wide lines
Must NOT Have (Guardrails)
- NO breaking changes to existing routes or API contracts
- NO new database tables or migrations (all data sources already exist)
- NO external API dependencies (no cloud embedding models)
- NO behavioral engine or Pregel state machine work (deferred to future batch)
- NO touching the conductor flow runner or orchestrator pipeline
- NO CSS framework changes (stay on Tailwind v4 / shadcn/ui)
- NO backend changes unless explicitly required by the batch scope
Spec Framework Integration
- Detected Framework: OpenSpec (folder structure only, no CLI)
- Config File: `openspec/config.yaml`
- Active Specs: 22 batch folders in `openspec/changes/`
- Available Commands: Manual folder/file operations (no OpenSpec CLI)
Verification Strategy
ZERO HUMAN INTERVENTION — ALL verification is agent-executed.
Test Decision
- Infrastructure exists: YES (vitest in apps/server, apps/coder)
- Automated tests: Tests-after (no TDD — these are config/frontend changes)
- Framework: vitest for backend, Playwright for frontend verification
QA Policy
Every task includes agent-executed QA scenarios. Evidence saved to .omo/evidence/.
- Frontend: Playwright — navigate, assert DOM elements, screenshot
- Backend: Bash (curl) — send requests, assert status + response
- Config/Restart: Bash — check processes, verify new flags
- File operations: Bash, verifying that files exist or are deleted with `test -f` / `test ! -f`
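The QA policy above (run a check, keep its output as evidence) can be wrapped in one small helper. This is a minimal sketch, not part of the plan: the function name `run_scenario` and the `EVIDENCE_DIR` variable are assumptions chosen for illustration, and only the `.omo/evidence/` default comes from the text.

```shell
# Directory for evidence files; the plan saves evidence under .omo/evidence/.
# EVIDENCE_DIR is a hypothetical override, handy for dry runs.
EVIDENCE_DIR="${EVIDENCE_DIR:-.omo/evidence}"

# run_scenario NAME CMD [ARGS...]
# Runs CMD, writes its combined stdout/stderr plus the exit code to
# $EVIDENCE_DIR/NAME.txt, and returns CMD's exit code to the caller.
run_scenario() {
  local name="$1" out rc=0
  shift
  mkdir -p "$EVIDENCE_DIR" || return 1
  out="$EVIDENCE_DIR/$name.txt"
  "$@" >"$out" 2>&1 || rc=$?
  printf 'exit_code=%s\n' "$rc" >>"$out"
  return "$rc"
}
```

A call such as `run_scenario task-1-stubs-deleted test ! -f openspec/changes/archived/v2.0-boocoder.md` would then leave both the command output and its exit code in the evidence file.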
Execution Strategy
Wave 1 (Structural Cleanup — quick, MAX PARALLEL):
├── Task 1: Delete 11 stub files from archived/ [quick]
├── Task 2: Move 5 misplaced 2026-06-07 proposals → changes/ [quick]
├── Task 3: Add .openspec.yaml to 6 missing batches [quick]
├── Task 4: Populate openspec/config.yaml with project context [quick]
├── Task 5: Add shipped status metadata to archived/ entries [writing]
Wave 2 (Backend — moderate, MAX PARALLEL):
├── Task 6: llama-cache-and-spec — KV cache + ngram flags [quick]
├── Task 7: pty-enhancements — exit notifications + session metadata [unspecified-high]
├── Task 8: token-analyzer-ui — backend API endpoints [unspecified-high]
Wave 3 (Frontend — moderate, MAX PARALLEL):
├── Task 9: results-page — /results route [visual-engineering]
├── Task 10: token-analyzer-ui — /analytics route [visual-engineering]
├── Task 11: enhanced-file-panel — diff modes + UI [visual-engineering]
Wave FINAL (Verification — 4 parallel reviews):
├── Task F1: Plan compliance audit [oracle]
├── Task F2: Code quality + type check [unspecified-high]
├── Task F3: Real QA — execute every scenario [unspecified-high + playwright]
└── Task F4: Scope fidelity check [deep]
Critical Path: Cleanup → Backend → Frontend → Integration
Parallel Speedup: ~60% faster than sequential
Max Concurrent: 5 (Wave 1)
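The wave structure above (start every task in a wave together, then wait for all of them before the next wave) can be sketched with plain background jobs. This is an illustration under assumptions: `run_wave` is a hypothetical helper, and the plan does not say how its agents are actually scheduled.

```shell
# run_wave CMD...
# Each argument is one task expressed as a shell command string.
# All tasks start in the background; the function waits for every one
# and returns non-zero if any task failed, so a later wave is not
# started on top of a broken one.
run_wave() {
  local pids="" failed=0 cmd pid
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

With hypothetical per-task scripts, the critical path could read `run_wave ./task-6.sh ./task-7.sh ./task-8.sh && run_wave ./task-9.sh ./task-10.sh ./task-11.sh`, so Wave 3 only starts once every Wave 2 task has succeeded.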
TODOs
- 1. Delete 11 stub files from archived/
What to do:
- Remove these 11 files from `openspec/changes/archived/`:
  - `v1.13.12-skills-audit.md` (57B, wrong tag ref)
  - `v1.13.15-codecontext-synth.md` (62B)
  - `v1.13.17-cross-repo-reads.md` (61B)
  - `v1.13.18-codecontext-file-path.md` (66B)
  - `v1.13.20-drop-legacy-cols.md` (61B)
  - `v1.14-outer-loop.md` (52B)
  - `v1.14.1-mcp-poc.md` (51B)
  - `v1.14.x-html-artifact-panes.md` (63B, wrong tag ref)
  - `v1.15-mcp-multi.md` (51B)
  - `v2.0-boocoder.md` (49B)
  - `v2.2-paseo-providers.md` (222B)
- Each file contains ONLY "# Title\n\nStatus: Shipped. Archived.\n", which has zero documentation value
- Git history preserves the knowledge; CHANGELOG.md + tags are the authoritative record
Must NOT do:
- Do NOT delete any folder-based archived entries (they have real content)
- Do NOT delete `boocode_batch10.md` or handoff files (they're valuable)
Recommended Agent Profile:
- Category: `quick`
- Skills: `[]`
- Justification: Trivial file deletion, no domain skills needed
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 2-5)
- Blocks: F1-F4
- Blocked By: None
References:
- `openspec/changes/archived/`: target directory
- `openspec/README.md`: schema definition
- `~/.gitconfig`: no special config needed
Acceptance Criteria:
- `test ! -f openspec/changes/archived/v1.13.12-skills-audit.md` → success for all 11 files
- `ls openspec/changes/archived/*.md` shows only allowed files (boocode_batch10.md, handoff_*)
QA Scenarios:
Scenario: Verify stubs deleted
  Tool: Bash
  Preconditions: Clean working tree
  Steps:
    1. For each stub file, run: test ! -f openspec/changes/archived/{filename}
    2. Assert: all 11 commands return exit code 0 (file does not exist)
    3. List remaining .md files: ls openspec/changes/archived/*.md
    4. Assert: only boocode_batch10.md and handoff_*.md files remain
  Expected Result: 11 stubs absent, 3 valuable files present
  Evidence: .omo/evidence/task-1-stubs-deleted.txt
Scenario: Valuable files preserved
  Tool: Bash
  Preconditions: Stubs deleted
  Steps:
    1. test -f openspec/changes/archived/boocode_batch10.md
    2. test -f openspec/changes/archived/handoff_v1.13.10_per_tool_cost.md
    3. test -f openspec/changes/archived/handoff_v1.13.8_prefix_verify.md
  Expected Result: All 3 return exit code 0
  Evidence: .omo/evidence/task-1-valuables-preserved.txt
Evidence to Capture:
- `task-1-stubs-deleted.txt`: confirmation each stub is gone
- `task-1-valuables-preserved.txt`: confirmation valuable files remain
Commit: YES
- Message: `chore(openspec): delete 11 stub archive files with zero documentation value`
- Files: openspec/changes/archived/v1.13.12-skills-audit.md, ...
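The "for each stub file" check in Task 1 can be run as a single loop. The sketch below is illustrative only: the filenames are the 11 listed in the task, while the `STUB_FILES` variable and the `count_remaining_stubs` function are hypothetical names.

```shell
# The 11 stub filenames listed in Task 1 (no spaces or glob characters,
# so plain word splitting is safe here).
STUB_FILES="
v1.13.12-skills-audit.md
v1.13.15-codecontext-synth.md
v1.13.17-cross-repo-reads.md
v1.13.18-codecontext-file-path.md
v1.13.20-drop-legacy-cols.md
v1.14-outer-loop.md
v1.14.1-mcp-poc.md
v1.14.x-html-artifact-panes.md
v1.15-mcp-multi.md
v2.0-boocoder.md
v2.2-paseo-providers.md
"

# count_remaining_stubs DIR
# Prints how many of the listed stubs still exist in DIR and
# returns 0 only when none remain.
count_remaining_stubs() {
  local dir="$1" remaining=0 f
  for f in $STUB_FILES; do
    if [ -f "$dir/$f" ]; then
      remaining=$((remaining + 1))
    fi
  done
  echo "$remaining"
  [ "$remaining" -eq 0 ]
}
```

Running `count_remaining_stubs openspec/changes/archived` after the deletion should print `0` and succeed; any other count means at least one stub survived.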
- 2. Move 5 misplaced 2026-06-07 proposals from archived/ to changes/
What to do:
- Move these 5 folders from `openspec/changes/archived/2026-06-07-*` to `openspec/changes/*`:
  1. `archived/2026-06-07-boocontext/` → `changes/boocontext/` (partially shipped in v2.8.0)
  2. `archived/2026-06-07-eval-sandbox-agent-runtime/` → merge into `changes/import-llm-evaluator/` and `changes/import-pregel-engine/` (overlapping scope)
  3. `archived/2026-06-07-hybrid-workflow-engine/` → merge into `changes/orchestrator-flow-advanced/`
  4. `archived/2026-06-07-memory-context-engineering/` → merge into `changes/memory-context/`
  5. `archived/2026-06-07-port-audit-parlant-patterns/` → merge into `changes/add-behavioral-engine/` and `changes/audit-harness-integration/`
- For merges (2-5): append relevant content from the 2026-06-07 proposal into the existing batch's proposal.md, tasks.md, design.md. The 2026-06-07 versions are "grand vision" documents, so extract only the concrete specs relevant to the narrower active batch.
- For `boocontext/` (1): move as-is since it's a new slug with no direct collision.
Must NOT do:
- Do NOT delete the content of the 2026-06-07 folders — merge, don't discard
- Do NOT create duplicate batch slugs
- Do NOT overwrite existing proposal content — append/extend
Recommended Agent Profile:
- Category: `writing`
- Skills: `[]`
- Justification: File organization + content merging, a technical writing task
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1, 3-5)
- Blocks: F1-F4
- Blocked By: None
References:
- `openspec/changes/archived/2026-06-07-*/`: source folders
- `openspec/changes/import-llm-evaluator/`: target for eval overlap
- `openspec/changes/import-pregel-engine/`: target for graph overlap
- `openspec/changes/orchestrator-flow-advanced/`: target for workflow overlap
- `openspec/changes/memory-context/`: target for memory overlap
- `openspec/changes/add-behavioral-engine/`: target for port patterns
- `openspec/changes/audit-harness-integration/`: target for audit patterns
Acceptance Criteria:
- `openspec/changes/boocontext/` exists with proposal.md + tasks.md + design.md + specs/
- `openspec/changes/import-llm-evaluator/` proposal.md now references eval-sandbox content
- `openspec/changes/import-pregel-engine/` proposal.md now references graph engine content
- `openspec/changes/orchestrator-flow-advanced/` proposal.md now references hybrid workflow
- `openspec/changes/memory-context/` proposal.md now references context engineering
- `openspec/changes/add-behavioral-engine/` and `audit-harness-integration/` now reference port patterns
- `test ! -d openspec/changes/archived/2026-06-07-eval-sandbox-agent-runtime/` for each moved folder
QA Scenarios:
Scenario: boocontext moved
  Tool: Bash
  Preconditions: Files moved
  Steps:
    1. test -f openspec/changes/boocontext/proposal.md
    2. test -f openspec/changes/boocontext/tasks.md
    3. test ! -f openspec/changes/archived/2026-06-07-boocontext/proposal.md
  Expected Result: Files exist in new location, not in old
  Evidence: .omo/evidence/task-2-boocontext-moved.txt
Scenario: Merged proposals updated
  Tool: Bash
  Preconditions: Files merged
  Steps:
    1. For each merged target, grep its own proposal.md for that target's keyword, e.g. grep -q "eval-sandbox" openspec/changes/import-llm-evaluator/proposal.md (keywords: eval-sandbox, graph engine, hybrid workflow, context engineering, port patterns). A single grep -q over openspec/changes/*/proposal.md is not enough, because it succeeds as soon as any one file matches.
    2. Assert: every per-target grep returns exit code 0, i.e. each merged batch's proposal.md references the 2026-06-07 source
  Expected Result: grep finds references in the right target files
  Evidence: .omo/evidence/task-2-merges-verified.txt
Evidence to Capture:
- `task-2-boocontext-moved.txt`
- `task-2-merges-verified.txt`
Commit: YES (groups with Task 1)
- Message: `chore(openspec): move 5 misplaced proposals from archived/ → changes/, merge overlapping content`
- Files: openspec/changes/boocontext/, openspec/changes/*/proposal.md, openspec/changes/*/tasks.md
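Verifying the merges in Task 2 is easier to get right when every target batch is checked against its own keyword. The sketch below pairs each target from the acceptance criteria with the keyword from the QA grep; `check_merges` is a hypothetical helper name, and the exact keyword that ends up in each proposal.md is an assumption.

```shell
# check_merges ROOT
# ROOT is the changes/ directory. For each "batch|keyword" pair, checks
# that ROOT/<batch>/proposal.md mentions the keyword. Prints every batch
# that fails and returns non-zero if any do.
check_merges() {
  local root="$1" missing=0 batch pattern
  while IFS='|' read -r batch pattern; do
    [ -n "$batch" ] || continue
    if ! grep -q -- "$pattern" "$root/$batch/proposal.md" 2>/dev/null; then
      echo "missing: $batch ($pattern)"
      missing=1
    fi
  done <<EOF
import-llm-evaluator|eval-sandbox
import-pregel-engine|graph engine
orchestrator-flow-advanced|hybrid workflow
memory-context|context engineering
add-behavioral-engine|port patterns
audit-harness-integration|port patterns
EOF
  return "$missing"
}
```

Because the loop reads from a here-document rather than a pipe, it runs in the current shell and the `missing` flag survives until the final `return`.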
- 3. Add .openspec.yaml to 6 missing batches
What to do:
- Create `.openspec.yaml` in each of these 6 active batches:
  - `enhanced-file-panel/`
  - `llama-cache-and-spec/`
  - `memory-v2-hybrid-search/`
  - `omo-paseo-bridge/`
  - `orchestrator-flow-advanced/`
  - `results-page/`
- Each file must contain:
    schema: spec-driven
    created: 2026-06-07
Must NOT do:
- Do NOT modify existing proposal.md or tasks.md content
- Do NOT add .openspec.yaml to batches that already have one
Recommended Agent Profile:
- Category: `quick`
- Skills: `[]`
- Justification: Trivial boilerplate file creation
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1, 2, 4, 5)
- Blocks: F1-F4
- Blocked By: None
References:
- `openspec/changes/add-3tier-memory/.openspec.yaml`: template
Acceptance Criteria:
- All 6 created files contain `schema: spec-driven`
- `find openspec/changes/ -name ".openspec.yaml" | wc -l` counts all expected files
QA Scenarios:
Scenario: All .openspec.yaml files present
  Tool: Bash
  Preconditions: Files created
  Steps:
    1. For each batch: test -f openspec/changes/{batch}/.openspec.yaml
    2. For each: grep -q "schema: spec-driven" openspec/changes/{batch}/.openspec.yaml
  Expected Result: All 6 files exist with correct content
  Evidence: .omo/evidence/task-3-openspec-yaml-added.txt
Evidence to Capture:
- `task-3-openspec-yaml-added.txt`
Commit: YES (groups with Task 1)
- Message: `chore(openspec): add .openspec.yaml to 6 missing batch folders`
- Files: openspec/changes/enhanced-file-panel/.openspec.yaml, ...
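Task 3 combines "create the file" with "never touch a batch that already has one", which is easy to encode as a skip-if-exists loop. A minimal sketch, assuming the two fields shown in the task are the whole file; `add_openspec_yaml` is a hypothetical helper name.

```shell
# add_openspec_yaml ROOT BATCH...
# Writes ROOT/BATCH/.openspec.yaml with the two fields from Task 3.
# Batches that already have the file are skipped, never overwritten.
add_openspec_yaml() {
  local root="$1" batch target
  shift
  for batch in "$@"; do
    target="$root/$batch/.openspec.yaml"
    if [ -e "$target" ]; then
      echo "skip: $target already exists"
      continue
    fi
    if [ ! -d "$root/$batch" ]; then
      echo "error: $root/$batch is not a directory" >&2
      return 1
    fi
    printf 'schema: spec-driven\ncreated: 2026-06-07\n' >"$target"
    echo "created: $target"
  done
}
```

For the six batches in this task the call would be `add_openspec_yaml openspec/changes enhanced-file-panel llama-cache-and-spec memory-v2-hybrid-search omo-paseo-bridge orchestrator-flow-advanced results-page`; running it a second time only prints `skip:` lines.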
- 4. Populate openspec/config.yaml with project context
What to do:
- Replace the near-empty `openspec/config.yaml` (currently 20 bytes, just the `schema: spec-driven` line) with a populated version:
    schema: spec-driven
    context: |
      Tech stack: TypeScript, React 18, Vite, Tailwind v4, shadcn/ui, Fastify, PostgreSQL 16, pnpm workspaces
      Apps: BooChat (read-only chat), BooCoder (write tools + agent dispatch), BooTerm (PTY terminals), Orchestrator (multi-agent conductor)
      Infrastructure: Docker Compose, Tailscale (100.114.205.53), Authelia auth, llama-swap inference
      Monorepo: apps/server, apps/web, apps/booterm, apps/coder, packages/contracts
      Commits: conventional commits, strict TypeScript, NodeNext module resolution
      Testing: vitest (server + coder), Playwright (web E2E), no root tsconfig
    rules:
      proposal:
        - Every proposal must have a "Why" section explaining the motivation
        - Every proposal must have a "What Changes" section enumerating deliverables
        - Include "Must Have" / "Must NOT Have" guardrails
        - Reference shipped git tags when applicable
      tasks:
        - Tasks must be ordered by dependency, not priority
        - Each task is one atomic change (file, config, or command)
        - Parallel tasks go in the same wave
Must NOT do:
- Do NOT delete the `schema: spec-driven` line
Recommended Agent Profile:
- Category: `writing`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1-3, 5)
- Blocks: F1-F4
- Blocked By: None
References:
- `openspec/config.yaml`: current (empty) file
- `/home/samkintop/opt/boocode/CLAUDE.md`: source for context info
Acceptance Criteria:
- `grep -q "context:" openspec/config.yaml` → success
- `grep -q "rules:" openspec/config.yaml` → success
- config.yaml has more than 500 bytes (was 20 bytes), matching the QA threshold below
QA Scenarios:
Scenario: config.yaml populated
  Tool: Bash
  Preconditions: File written
  Steps:
    1. wc -c openspec/config.yaml → assert > 500 bytes
    2. grep -q "context:" openspec/config.yaml
    3. grep -q "rules:" openspec/config.yaml
    4. grep -q "schema: spec-driven" openspec/config.yaml
  Expected Result: All assertions pass
  Evidence: .omo/evidence/task-4-config-populated.txt
Evidence to Capture:
- `task-4-config-populated.txt`
Commit: YES (groups with Task 1)
- Message: `chore(openspec): populate config.yaml with project context and rules`
- Files: openspec/config.yaml
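The four assertions in the Task 4 scenario fit in one function that reports the first failing check. This is a sketch under assumptions: `check_config` is a hypothetical name, and the 500-byte default mirrors the QA step rather than any OpenSpec rule.

```shell
# check_config FILE [MIN_BYTES]
# Succeeds when FILE is larger than MIN_BYTES (default 500) and contains
# the schema line plus the context: and rules: keys.
check_config() {
  local file="$1" min="${2:-500}" size key
  if [ ! -f "$file" ]; then
    echo "missing: $file"
    return 1
  fi
  size=$(wc -c <"$file")
  size=$((size + 0))   # strip the padding some wc implementations add
  if [ "$size" -le "$min" ]; then
    echo "too small: $size bytes (need > $min)"
    return 1
  fi
  for key in 'schema: spec-driven' 'context:' 'rules:'; do
    if ! grep -q -- "$key" "$file"; then
      echo "missing key: $key"
      return 1
    fi
  done
  echo "ok: $size bytes"
}
```

Reading the size with `wc -c <file` avoids having to strip the filename from the output before comparing numbers.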
- 5. Add shipped-status metadata to 10 archived folder entries
What to do:
- Add frontmatter or status line to each archived folder's proposal.md documenting the shipped version:
  - `agent-status-normalize/` → `v2.7.6`
  - `claude-sdk-sessionstore/` → `v2.7.5`
  - `contracts-ssot/` → `v2.7.13`
  - `license-debt-mit/` → `v2.7.0`
  - `mistake-tracker-file-ledger/` → `v2.7.4`
  - `orchestrator/` → `v2.7.17`
  - `sampling-streamjson-tokens/` → `v2.7.3`
  - `v2-3-provider-lifecycle/` → `v2.5.4`–`v2.5.13`
  - `v2-6-persistent-agent-sessions/` → `v2.6.4`–`v2.6.8`
  - `write-edit-robustness/` → `v2.7.1`
- Add a line after the `## Why` section heading: ``**Shipped in:** `v2.7.6-agent-status-normalize` `` (or equivalent)
Must NOT do:
- Do NOT change the body of the proposal beyond the shipped annotation
- Do NOT add shipped annotations to the 2026-06-07 batches (they're not shipped)
Recommended Agent Profile:
- Category: `quick`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 1 (with Tasks 1-4)
- Blocks: F1-F4
- Blocked By: None
References:
- Git tags: `v2.7.0-mit`, `v2.7.1-write-edit-robustness`, etc.
Acceptance Criteria:
- All 10 archived batch proposals contain "Shipped in:" referencing a git tag
- `grep -r "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l` = 10
QA Scenarios:
Scenario: All archived batches annotated
  Tool: Bash
  Preconditions: Files edited
  Steps:
    1. grep -rl "Shipped in:" openspec/changes/archived/*/proposal.md | wc -l
    2. Assert: exactly 10 files contain "Shipped in:"
  Expected Result: 10 files annotated
  Evidence: .omo/evidence/task-5-shipped-annotations.txt
Evidence to Capture:
- `task-5-shipped-annotations.txt`
Commit: YES (groups with Task 1)
- Message: `chore(openspec): add shipped-in version annotations to 10 archived batch proposals`
- Files: openspec/changes/archived/*/proposal.md
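The annotation step in Task 5 can be made idempotent so that re-running it never adds a second "Shipped in:" line. The sketch below assumes each proposal.md has a heading line that reads exactly `## Why`; `annotate_shipped` is a hypothetical helper, not an OpenSpec command.

```shell
# annotate_shipped FILE TAG
# Inserts a blank line and "**Shipped in:** `TAG`" right after the first
# "## Why" heading in FILE. Does nothing if FILE is already annotated;
# fails if the heading is absent. The rest of the file is left as-is.
annotate_shipped() {
  local file="$1" tag="$2" tmp
  if grep -q 'Shipped in:' "$file"; then
    return 0
  fi
  if ! grep -q '^## Why *$' "$file"; then
    echo "no '## Why' heading in $file" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  awk -v tag="$tag" '
    { print }
    !inserted && /^## Why *$/ {
      print ""
      print "**Shipped in:** `" tag "`"
      inserted = 1
    }
  ' "$file" >"$tmp" || { rm -f "$tmp"; return 1; }
  cat "$tmp" >"$file"   # write back in place so file permissions are kept
  rm -f "$tmp"
}
```

A call would look like `annotate_shipped openspec/changes/archived/agent-status-normalize/proposal.md v2.7.6-agent-status-normalize`.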
TODOs (Wave 2)
- 6. llama-cache-and-spec: Enable KV cache quantization + ngram speculative decoding
What to do:
- Edit `apps/server/src/services/inference/providers/llama.ts` (or the llama args validator `llama-args-validator.ts`) to allow `--cache-type-k q4_0` and `--spec-type ngram-mod` through the shadowing lists
- Change the base llama-server args to include:
  - `--cache-type-k q4_0` (4-bit K cache: roughly 3.5× smaller than f16 for the K half of the KV cache; the V cache keeps its current type unless `--cache-type-v` is also set, which this batch does not do)
  - `--spec-type ngram-mod` (ngram speculative decoding, 2-3× tok/s on code)
- Verify the sidecar validator (`sidecar/validator.go`) also allows these flags through
- Read `apps/server/src/services/inference/llama-args-validator.ts` and `sidecar/validator.go` to understand the current blocklist
- Add the two flags to the allowlist instead of the shadow list
- Update the sidecar Dockerfile or config if needed
Must NOT do:
- Do NOT change any other llama-server args
- Do NOT enable KV cache quantization for Q8_0 or Q3_K (only Q4_0)
- Do NOT add a separate draft model (ngram is self-contained)
Recommended Agent Profile:
- Category: `unspecified-high`
- Skills: `[]`
- Justification: Requires understanding llama.cpp arg validation across two codebases
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 7-8)
- Blocks: F1-F4
- Blocked By: Task 1-5 (Wave 1)
References:
- `apps/server/src/services/inference/llama-args-validator.ts`: current arg blocklist/allowlist
- `sidecar/validator.go`: sidecar validation (if exists)
- `docker-compose.yml` or sidecar Dockerfile: restart config
- `openspec/changes/llama-cache-and-spec/proposal.md`: full spec
Acceptance Criteria:
- `--cache-type-k q4_0` present in llama-server args after restart
- `--spec-type ngram-mod` present in llama-server args after restart
- llama-server starts without errors
- Inference still works (send test message)
QA Scenarios:
Scenario: KV cache quantization enabled
  Tool: Bash
  Preconditions: Server restarted after changes
  Steps:
    1. ps aux | grep llama-server | grep -o "cache-type-k q4_0"
    2. Assert: output matches "q4_0"
  Expected Result: KV cache quantization is active
  Evidence: .omo/evidence/task-6-kv-cache-enabled.txt
Scenario: Speculative decoding enabled
  Tool: Bash
  Preconditions: Server restarted
  Steps:
    1. ps aux | grep llama-server | grep -o "spec-type ngram-mod"
    2. Assert: output matches "ngram-mod"
  Expected Result: Ngram speculative decoding is active
  Evidence: .omo/evidence/task-6-ngram-enabled.txt
Scenario: Inference still works
  Tool: Bash (curl)
  Preconditions: Server running with new flags
  Steps:
    1. curl -s -o /dev/null -w "%{http_code}" http://100.114.205.53:9500/api/health
    2. Assert: HTTP 200
  Expected Result: Server is healthy and serving
  Evidence: .omo/evidence/task-6-health-check.txt
Evidence to Capture:
- `task-6-kv-cache-enabled.txt`: grep output showing the flag
- `task-6-ngram-enabled.txt`: grep output showing the flag
- `task-6-health-check.txt`: health check confirmation
Commit: YES
- Message: `perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding`
- Files: apps/server/src/services/inference/llama-args-validator.ts, sidecar/validator.go (if needed)
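The two flag checks in Task 6 grep for a substring, which would also accept a longer value such as `q4_0_something`. A slightly stricter variant matches the flag and its value as whole words. `has_flag` is a hypothetical helper, and it assumes the flag and value appear as two space-separated arguments, as in the scenarios above.

```shell
# has_flag CMDLINE FLAG VALUE
# Succeeds when CMDLINE contains "FLAG VALUE" as two consecutive whole
# words. Newlines are folded to spaces so multi-line ps output works.
has_flag() {
  local cmdline flag="$2" value="$3"
  cmdline=$(printf '%s' "$1" | tr '\n' ' ')
  case " $cmdline " in
    *" $flag $value "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example against the live process list. The [l] bracket keeps grep from
# matching its own command line:
#   has_flag "$(ps -eo args | grep '[l]lama-server')" --cache-type-k q4_0
```

Keeping the parsing separate from the `ps` call means the check can be exercised against a recorded command line without a running server.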
- 7. pty-enhancements: PTY exit notifications + session metadata
What to do:
- Add `notifyOnExit` support to the PTY session manager (likely in `apps/booterm/`)
- When a PTY process exits AND `notifyOnExit` was set:
  - Emit an event/message to the agent channel with: session ID, title, exit code, total output lines, last line of output
- Add session metadata fields: agent ID that spawned it, task ID, optional title
- Add `pty_list` endpoint that returns metadata for all sessions
- Wire `X-Agent-Flags` header support for agent identification
- Read `apps/booterm/` to understand the current PTY architecture
Must NOT do:
- Do NOT change the existing pty_spawn interface (add notifyOnExit as optional param)
- Do NOT implement sandbox or circuit breaker (out of scope for this wave)
- Do NOT add new database tables (metadata lives in-memory or in existing session store)
Recommended Agent Profile:
- Category: `unspecified-high`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 6, 8)
- Blocks: F1-F4
- Blocked By: Task 1-5 (Wave 1)
References:
- `apps/booterm/src/`: PTY session management code
- `apps/coder/src/services/`: agent dispatch that spawns PTYs
- `openspec/changes/pty-enhancements/proposal.md`: full spec
- `apps/server/src/services/inference/`: inference pipeline that may need to handle notifications
Acceptance Criteria:
- `notifyOnExit` optional parameter on pty_spawn works
- On process exit with notifyOnExit=true, agent receives notification
- `pty_list` returns session metadata
- `X-Agent-Flags` header is recognized
QA Scenarios:
Scenario: notifyOnExit triggers notification
  Tool: Bash + tmux
  Preconditions: booterm running
  Steps:
    1. Start a short PTY with notifyOnExit=true: sleep 1
    2. Wait 2 seconds for completion
    3. Check notification was delivered
  Expected Result: Exit notification received with title, exit code, last line
  Evidence: .omo/evidence/task-7-notify-on-exit.txt
Scenario: pty_list shows metadata
  Tool: Bash (curl)
  Preconditions: PTY sessions exist
  Steps:
    1. curl http://localhost:9501/api/pty/list 2>/dev/null
    2. Assert: response contains session metadata fields
  Expected Result: Metadata returned for each session
  Evidence: .omo/evidence/task-7-pty-list.txt
Evidence to Capture:
- `task-7-notify-on-exit.txt`: notification evidence
- `task-7-pty-list.txt`: pty_list response
Commit: YES
- Message: `feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags`
- Files: apps/booterm/src/*.ts, apps/coder/src/services/*.ts
- 8. token-analyzer-ui: Backend API endpoints for token analytics
What to do:
- Add read-only API endpoints to serve aggregate token data:
  - `GET /api/coder/token-analytics/sessions`: per-session token usage (input, output, cost)
  - `GET /api/coder/token-analytics/tools`: per-tool cost breakdown (from tool_cost_stats view)
  - `GET /api/coder/token-analytics/trends`: token usage over time
- Reuse existing data sources:
  - `agent_sessions.input_tokens`, `agent_sessions.output_tokens`, `agent_sessions.cost`
  - `tool_cost_stats` view (per-tool 100-call rolling window)
  - `tasks.token_breakdown` JSONB column
- Implement in `apps/coder/src/routes/` (follow existing route patterns)
- Add proper error handling, pagination for large result sets, and date filtering
Must NOT do:
- Do NOT create new database tables or migrations
- Do NOT add token tracking logic (data is already accumulated)
- Do NOT add real-time streaming (data is historical aggregate)
Recommended Agent Profile:
- Category: `unspecified-high`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 2 (with Tasks 6-7)
- Blocks: Task 10 (frontend depends on backend)
- Blocked By: Task 1-5 (Wave 1)
References:
- `apps/coder/src/routes/`: existing route patterns
- `apps/server/src/schema.sql`: `tool_cost_stats` view definition
- `apps/coder/CLAUDE.md`: coder conventions, route registration
- `packages/contracts/`: shared types for response schemas
- `openspec/changes/token-analyzer-ui/proposal.md`: full spec
Acceptance Criteria:
- `GET /api/coder/token-analytics/sessions?project_id=X` returns 200 with token data
- `GET /api/coder/token-analytics/tools?project_id=X` returns 200 with tool breakdown
- `GET /api/coder/token-analytics/trends?project_id=X` returns 200 with trend data
- All endpoints respect `project_id` filtering
- Empty data returns valid empty arrays (not errors)
QA Scenarios:
Scenario: Sessions endpoint works
  Tool: Bash (curl)
  Preconditions: Server running, project exists
  Steps:
    1. curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1"
    2. Assert: HTTP 200
    3. Assert: response is valid JSON with expected fields
  Expected Result: Session token data returned
  Evidence: .omo/evidence/task-8-sessions-endpoint.txt
Scenario: Empty data returns valid response
  Tool: Bash (curl)
  Preconditions: Server running
  Steps:
    1. curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=999"
    2. Assert: HTTP 200
    3. Assert: response contains empty array (not error)
  Expected Result: Graceful empty state
  Evidence: .omo/evidence/task-8-empty-data.txt
Evidence to Capture:
- `task-8-sessions-endpoint.txt`: successful API response
- `task-8-empty-data.txt`: graceful empty handling
Commit: YES
- Message: `feat(coder): add token-analytics API endpoints for session/tool/trend data`
- Files: apps/coder/src/routes/token-analytics.ts, apps/coder/src/services/token-analytics.ts
TODOs (Wave 3)
- 9. results-page: /results route for orchestrator runs + arena battles
What to do:
- Add sidebar nav button with `ScrollText` icon (lucide-react), above the Token Analytics button
- Create new `/results` route page with two tabs:
  - "Analysis Runs": list orchestrator flow runs (research, code-review, investigate, etc.)
  - "Arena Battles": list battle history
- Each tab shows: status dot, name/type, band/battle-type, model, timing, error indicator
- Completed runs show "View Report" link; completed battles show "View Analysis"
- Uses existing API endpoints (no backend changes needed):
  - `GET /api/coder/runs?project_id=X`
  - `GET /api/coder/battles?project_id=X`
- Requires `project_id` context: load from sidebar on mount, or show project selector
- Follow existing route patterns in web (React Router routes, lazy loading)
Must NOT do:
- Do NOT create new API endpoints
- Do NOT modify existing API contracts
- Do NOT add pagination beyond what the API already provides
- Do NOT add real-time updates (static list, refreshed on mount)
Recommended Agent Profile:
- Category: `visual-engineering`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 10-11)
- Blocks: F1-F4
- Blocked By: Task 1-5 (Wave 1)
References:
- `apps/web/src/routes/`: existing route patterns (analytics, settings)
- `apps/web/src/components/sidebar/`: nav button patterns
- `apps/web/src/api/`: existing API client
- `openspec/changes/results-page/proposal.md`: full spec
- `apps/coder/src/routes/runs.ts`: runs endpoint
- `apps/coder/src/routes/battles.ts`: battles endpoint
Acceptance Criteria:
- Sidebar shows "Results" button with ScrollText icon above Token Analytics
- Clicking navigates to `/results`
- "Analysis Runs" tab loads and displays orchestrator flow history
- "Arena Battles" tab loads and displays battle history
- Completed runs show "View Report" link
- Empty state shown when no data
- Error state shown on API failure
QA Scenarios:
Scenario: Nav button renders
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /
    2. Look for sidebar nav button with text "Results"
    3. Assert: button exists and links to /results
  Expected Result: Results nav button present
  Evidence: .omo/evidence/task-9-nav-button.png
Scenario: Results page loads
  Tool: Playwright
  Preconditions: Web app loaded, project exists
  Steps:
    1. Navigate to /results
    2. Wait for "Analysis Runs" tab to appear
    3. Assert: tab shows list of runs or empty state
  Expected Result: Page loads with data
  Evidence: .omo/evidence/task-9-results-page.png
Evidence to Capture:
- `task-9-nav-button.png`: screenshot of sidebar with Results button
- `task-9-results-page.png`: screenshot of /results page with data
Commit: YES
- Message: `feat(web): add /results page for orchestrator runs and arena battle history`
- Files: apps/web/src/routes/results.tsx, apps/web/src/components/sidebar/*.tsx
- 10. token-analyzer-ui: /analytics dashboard route
What to do:
- Add sidebar nav button with appropriate icon, above Settings button
- Create new `/analytics` route page showing token usage dashboard:
  - Aggregate token usage across sessions (total input/output tokens)
  - Per-tool cost breakdown (bar chart or table)
  - Per-session token history (list or mini chart)
  - Per-provider cost comparison
- Reuse existing data from the backend endpoints created in Task 8
- Follow the same route/nav patterns as results-page
Must NOT do:
- Do NOT add new charting libraries (use what's already available)
- Do NOT implement real-time updates
Recommended Agent Profile:
- Category: `visual-engineering`
- Skills: `[]`
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 9, 11)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1), Task 8 (backend endpoints)
References:
- Same as Task 9 + Task 8 endpoints
- `openspec/changes/token-analyzer-ui/proposal.md`: full spec
- `apps/web/src/components/`: existing chart/list components
Acceptance Criteria:
- Sidebar shows "Token Analytics" button above Settings
- `/analytics` loads and shows token dashboard
- Per-session, per-tool, per-provider breakdowns visible
- Empty state shown when no data
QA Scenarios:
Scenario: Token Analytics nav button renders
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /
    2. Look for "Token Analytics" button in sidebar
    3. Assert: button exists above Settings
  Expected Result: Nav button present
  Evidence: .omo/evidence/task-10-nav-button.png
Scenario: Analytics dashboard loads
  Tool: Playwright
  Preconditions: Web app loaded
  Steps:
    1. Navigate to /analytics
    2. Wait for dashboard content to render
    3. Assert: token usage data is visible
  Expected Result: Dashboard shows data
  Evidence: .omo/evidence/task-10-analytics-dashboard.png
Evidence to Capture:
- `task-10-nav-button.png`
- `task-10-analytics-dashboard.png`
Commit: YES
- Message: `feat(web): add /analytics route for token usage dashboard`
- Files: apps/web/src/routes/analytics.tsx, apps/web/src/components/sidebar/*.tsx
- 11. enhanced-file-panel: Side-by-side diff, hide whitespace, wrap lines, expand/collapse all
What to do:
- Add side-by-side diff toggle to the Git diff tab in the file panel
- Add "Hide whitespace" checkbox that filters whitespace-only changes
- Add "Wrap long lines" toggle for diff display
- Add "Expand All" / "Collapse All" buttons for file-level diffs
- Implement in `apps/web/src/components/` following existing file panel patterns
- Read `apps/web/src/components/` to find the existing diff rendering components
Must NOT do:
- Do NOT implement inline diff comments (deferred)
- Do NOT implement in-browser file editing (deferred)
- Do NOT change the backend diff generation logic
Recommended Agent Profile:
- Category: visual-engineering
- Skills: []
Parallelization:
- Can Run In Parallel: YES
- Parallel Group: Wave 3 (with Tasks 9-10)
- Blocks: F1-F4
- Blocked By: Tasks 1-5 (Wave 1)
References:
- apps/web/src/components/ — existing file panel and diff components
- apps/web/src/hooks/ — hooks for diff state management
- openspec/changes/enhanced-file-panel/proposal.md — full spec
- apps/server/src/routes/projects.ts — git diff backend route
Acceptance Criteria:
- Side-by-side diff toggles correctly
- Hide whitespace checkbox filters whitespace changes
- Wrap long lines toggle works
- Expand/Collapse All buttons toggle all files
- All changes are frontend-only (no new API calls)
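The side-by-side view can likewise be derived on the client from the same unified diff lines, which keeps the change frontend-only. A minimal sketch, assuming hypothetical `UnifiedLine` and `SideBySideRow` types that are not taken from the existing components:

```typescript
// Hypothetical types for illustration; not the existing component API.
type UnifiedLine = { kind: "context" | "add" | "remove"; text: string };
interface SideBySideRow {
  left: string | null; // old side; null renders as an empty cell
  right: string | null; // new side; null renders as an empty cell
}

// Convert unified diff lines into two-column rows.
function toSideBySide(lines: UnifiedLine[]): SideBySideRow[] {
  const rows: SideBySideRow[] = [];
  let removed: string[] = [];
  let added: string[] = [];

  // Emit the buffered change block, pairing removals with additions by position.
  const flush = (): void => {
    const height = Math.max(removed.length, added.length);
    for (let i = 0; i < height; i++) {
      rows.push({ left: removed[i] ?? null, right: added[i] ?? null });
    }
    removed = [];
    added = [];
  };

  for (const line of lines) {
    if (line.kind === "remove") {
      removed.push(line.text);
    } else if (line.kind === "add") {
      added.push(line.text);
    } else {
      // A context line closes the current change block and appears on both sides.
      flush();
      rows.push({ left: line.text, right: line.text });
    }
  }
  flush();
  return rows;
}
```

Positional pairing is the simplest choice; a smarter matcher could align similar lines, but that is beyond what the acceptance criteria require.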
QA Scenarios:
Scenario: Side-by-side diff renders
  Tool: Playwright
  Preconditions: Repo with uncommitted changes
  Steps:
    1. Open file panel
    2. Click Git tab
    3. Toggle side-by-side view
    4. Assert: diff renders in two columns
  Expected Result: Side-by-side diff visible
  Evidence: .omo/evidence/task-11-side-by-side.png

Scenario: Hide whitespace works
  Tool: Playwright
  Preconditions: Diff has whitespace changes
  Steps:
    1. Open diff with whitespace changes
    2. Check "Hide whitespace"
    3. Assert: only-whitespace hunks hidden
  Expected Result: Whitespace-only changes filtered
  Evidence: .omo/evidence/task-11-hide-whitespace.png

Scenario: Expand/Collapse All toggles
  Tool: Playwright
  Preconditions: Multiple files changed
  Steps:
    1. Click "Collapse All"
    2. Assert: all files collapsed to summary
    3. Click "Expand All"
    4. Assert: all files expanded
  Expected Result: Bulk toggle works
  Evidence: .omo/evidence/task-11-expand-collapse.png

Evidence to Capture:
- task-11-side-by-side.png
- task-11-hide-whitespace.png
- task-11-expand-collapse.png
Commit: YES
- Message: feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse all
- Files: apps/web/src/components/*.tsx, apps/web/src/hooks/*.ts
Final Verification Wave
- F1. Plan Compliance Audit — oracle
  Read the plan end-to-end. For each "Must Have": verify implementation exists (read file, curl endpoint, run command). For each "Must NOT Have": search codebase for forbidden patterns and reject with file:line if found. Check evidence files exist in .omo/evidence/. Compare deliverables against plan.
  Output: Must Have [N/N] | Must NOT Have [N/N] | Tasks [N/N] | VERDICT: APPROVE/REJECT
- F2. Code Quality Review — unspecified-high
  Run tsc --noEmit for any changed apps + bun test. Review all changed files for: as any / @ts-ignore, empty catches, console.log in prod, commented-out code, unused imports.
  Output: Build [PASS/FAIL] | Lint [PASS/FAIL] | Tests [N pass/N fail] | Files [N clean/N issues] | VERDICT
- F3. Real Manual QA — unspecified-high
  Start from clean state. Execute EVERY QA scenario from EVERY task: follow exact steps, capture evidence. Test cross-task integration (features working together, not isolation). Test edge cases: empty state, invalid input, missing project_id. Save to .omo/evidence/final-qa/.
  Output: Scenarios [N/N pass] | Integration [N/N] | Edge Cases [N tested] | VERDICT
- F4. Scope Fidelity Check — deep
  For each task: read "What to do", read actual diff. Verify 1:1: everything in scope was built (no missing), nothing beyond scope was built (no creep). Check "Must NOT do" compliance.
  Output: Tasks [N/N compliant] | Contamination [CLEAN/N issues] | Unaccounted [CLEAN/N files] | VERDICT
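Part of the F2 review (as any, @ts-ignore, empty catches, console.log) lends itself to a quick pre-scan before a human or agent reads the files. The following is only an illustrative sketch: the `scanSource` helper and its regexes are assumptions, work line by line, and are no substitute for `tsc --noEmit` or a real linter.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
}

// Heuristic patterns for the F2 checklist; deliberately simple and single-line.
const RULES: { rule: string; pattern: RegExp }[] = [
  { rule: "as any", pattern: /\bas any\b/ },
  { rule: "@ts-ignore", pattern: /@ts-ignore/ },
  { rule: "empty catch", pattern: /catch\s*(\([^)]*\))?\s*\{\s*\}/ },
  { rule: "console.log", pattern: /\bconsole\.log\(/ },
];

// Return one finding per rule hit per line.
function scanSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule });
      }
    }
  });
  return findings;
}
```

An empty catch block spread over several lines would slip past this scan, so its output should be treated as a starting list for the review, not as the verdict.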
Commit Strategy
- 1-5 (grouped): chore(openspec): cleanup openspec folder structure — delete stubs, move proposals, add metadata, populate config
- 6: perf(llama): enable KV cache quantization (q4_0) + ngram speculative decoding
- 7: feat(booterm): PTY exit notifications + session metadata + X-Agent-Flags
- 8: feat(coder): add token-analytics API endpoints
- 9: feat(web): add /results page for orchestrator runs + arena battles
- 10: feat(web): add /analytics token usage dashboard
- 11: feat(web): enhanced file panel — side-by-side diff, hide whitespace, wrap lines, expand/collapse
Success Criteria
Verification Commands
# OpenSpec cleanup
test ! -f openspec/changes/archived/v1.13.12-skills-audit.md
test -d openspec/changes/boocontext/
test -f openspec/changes/enhanced-file-panel/.openspec.yaml
grep -q "context:" openspec/config.yaml
# llama-cache-and-spec
ps aux | grep llama-server | grep -o "cache-type-k q4_0"
ps aux | grep llama-server | grep -o "spec-type ngram-mod"
# PTY enhancements
curl -s http://localhost:9501/api/pty/list | jq '.'
# Results page
curl -s "http://localhost:3000/api/coder/runs?project_id=1" | jq '.'
# Token analytics
curl -s "http://localhost:3000/api/coder/token-analytics/sessions?project_id=1" | jq '.'
# Enhanced file panel
# (visual verification via Playwright)
Final Checklist
- 11 stub files deleted from archived/
- 5 misplaced proposals moved/merged into changes/
- 6 .openspec.yaml files added
- config.yaml populated with context + rules
- 10 archived proposals annotated with shipped versions
- llama-server running with KV cache Q4_0 + ngram
- PTY exit notifications working
- /results page renders and loads data
- /analytics page renders and loads data
- Side-by-side diff, hide whitespace, wrap lines, expand/collapse all working
- All type checks pass
- All QA scenarios pass