boocode/.omo/plans/paseo-orchestrator.md

# Paseo-like Orchestrator — Implementation Plan

> **Goal:** Transform BooCode into a Paseo-style thin-client orchestration layer with observability, dynamic workflows, resumability, background subagents, multi-modal, and cache shape telemetry.
>
> **Architecture:** Durable agent execution engine beneath thin chat/coder frontends. Trace system as foundation, workflow engine as the structural addition, everything else layered on top.
>
> **Inspired by:** Paseo (agent lifecycle, worktree isolation), Whale (workflow engine, cache telemetry), OpenCode (session resume), Claude Code (workflow script format).

---

## TL;DR

> **Quick Summary**: Build a durable orchestration layer with trace observability, dynamic JS workflows, session persistence, background subagents, and multi-modal support over 5 phases.
>
> **Deliverables**:
> - Trace system with DB persistence + viewer UI
> - Dynamic workflow engine (JS sandbox, agent/parallel/pipeline)
> - Workflow resumability (hash-based step caching)
> - Background subagent runtime
> - Session persistence across refreshes
> - Cache shape telemetry (DeepSeek KV cache viz)
> - Multi-modal attachment support
>
> **Estimated Effort**: XL — 5 phases, ~2-3 weeks total
> **Parallel Execution**: YES — phases 1-2 can partially overlap
> **Critical Path**: Trace system → Workflow engine → All downstream features

---

## Context

### Original Request
User wants BooCode to become "like Paseo — a thin client" with observability, dynamic workflows, session persistence, background agents, multi-modal, cache shape telemetry, and workflow resumability. They invoked skills across model evaluation, long context, SGLang, LangChain, LangSmith, agentic eval, agent harness construction, agent governance, and chat SDKs — indicating broad ambition for a production-quality AI coding platform.

### Key Decisions
- **Trace system first**: Foundation for all debugging and optimization
- **isolated-vm for workflow sandbox**: Node-native, no external deps
- **DB-backed sessions**: Postgres for trace store + session state
- **Existing WS frames + new `tool_trace` frame**: Live streaming to frontend
- **Phase ordering**: Foundation (trace) → UX (persistence) → Power (workflows) → Polish (background/multi-modal/cache)

---

## Phases

### Phase 1: Trace System + Observability
**Est. effort**: 3-4 days

Core observability infrastructure. Every tool call gets timed, logged, and persisted.

**Deliverables**:
- `tool_traces` DB table (id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome)
- Instrumentation in `tool-phase.ts` wrapping `executeToolCall` with start/end timing
- `tool_trace` WS frame type for live streaming to frontend
- GET `/api/chats/:id/traces` endpoint (paginated)
- Trace viewer pane (collapsible tree, timing bars, expand/collapse per call)

**Files to create**: 5-7 files across server + web + contracts
**Dependencies**: None — standalone feature

---

### Phase 2: Session Persistence + Resume
**Est. effort**: 2-3 days

Agent state survives browser refresh. Active sessions can be resumed.

**Deliverables**:
- Serialize active agent state to DB on each turn boundary
- Restore state on WS reconnect (existing `snapshot` frame enhanced)
- Agent session timeline view (history of all turns in a session)
- Coder pane rehydrates from persisted state

**Files to modify**: ws.ts, useSessionStream.ts, session store, dispatcher
**Dependencies**: None — standalone, but benefits from Phase 1 trace data

---

### Phase 3: Dynamic Workflow Engine
**Est. effort**: 5-7 days

JS sandbox for multi-agent orchestration. Claude Code compatible.

**Deliverables**:
- `isolated-vm` sandbox (or Node `vm` module with restricted context)
- Workflow API: `agent()`, `parallel()`, `pipeline()`, `phase()`, `budget()`, `log()`, `args`
- Workflow file discovery (`.boocode/workflows/*.js` → project, `~/.boocode/workflows/*.js` → global)
- Built-in workflow catalog (deep-research, multi-review, etc.)
- Workflow manager with concurrency limits, token budgets
- Integration with existing Orchestrator panel for UI

**Files to create**: 10-15 files (workflow runtime, scheduler, tool bridge, manager, catalog)
**Dependencies**: Phase 1 traces feed into workflow observability

**Workflow Resumability** (within Phase 3):
- SHA-256 hash of agent spec (prompt + options)
- Cache completed results by hash
- On re-run, skip cached agents, only execute new/changed ones
- In-memory cache for current session, optional DB persistence

**Est. effort**: 1-2 days within Phase 3

---

### Phase 4: Background Subagents
**Est. effort**: 2-3 days

Non-blocking subagent execution. `spawn_subagent` returns immediately, results collected later.

**Deliverables**:
- Background task queue (reuses existing `tasks` table)
- `spawn_subagent` tool that creates a task and returns immediately
- `subagent_status` tool to poll completion
- `subagent_result` tool to retrieve output
- Background agent pane showing running/completed subagents
- Notifications via hooks when background tasks complete

**Files to create**: 3-5 files across server + web
**Dependencies**: Phase 1 traces, Phase 2 session persistence

---

### Phase 5: Multi-modal + Cache Shape (Polish)
**Est. effort**: 2-3 days

Image/file attachment support + DeepSeek cache hit visualization.

**Deliverables (Multi-modal)**:
- Image/file attachment storage (tmpfs, referenced in message)
- Forward image content through DeepSeek API's multimodal support
- Render attached images in message bubble
- Model can "see" screenshots, diagrams, UI mocks

**Deliverables (Cache Shape)**:
- Extract `prompt_cache_hit_tokens` from DeepSeek provider metadata
- Build cache segment visualization (system prompt, tool schema, conversation)
- Per-turn cache hit rate in trace viewer
- Cumulative cache stats in session view

**Files to create**: 3-5 files
**Dependencies**: Phase 1 traces (for cache shape), existing DeepSeek integration

---

## Execution Strategy

### Parallel Execution Waves

```
Wave 1 (Start Immediately):
├── Phase 1: Trace system backend (tool_traces table + instrumentation) [deep]
├── Phase 1: Trace viewer frontend [visual-engineering]
└── Phase 2: Session persistence backbone [deep]

Wave 2 (After Wave 1):
├── Phase 3: Workflow engine sandbox + API surface [deep]
├── Phase 3: Workflow file discovery + manager [unspecified-high]
├── Phase 3: Workflow resumability cache [quick]
└── Phase 4: Background subagent queue + tools [unspecified-high]

Wave 3 (After Wave 2):
├── Phase 4: Background agent pane + notifications [visual-engineering]
├── Phase 5: Multi-modal attachment pipeline [deep]
└── Phase 5: Cache shape telemetry UI [visual-engineering]

Wave FINAL:
├── F1: Plan compliance audit (oracle)
├── F2: Code quality review (unspecified-high)
├── F3: Integration QA (unspecified-high)
└── F4: Scope fidelity check (deep)
```

---

## TODOs

> Phase 1: Trace System + Observability

- [ ] 1. Create tool_traces DB table + migration

- [ ] 2. Add tool_trace WS frame + contracts schema

- [ ] 3. Instrument tool-phase.ts with start/end timing

- [ ] 4. Add GET /api/chats/:id/traces endpoint

- [ ] 5. Build trace viewer frontend component

> Phase 2: Session Persistence + Resume

- [ ] 6. Serialize agent state to DB on turn boundaries

- [ ] 7. Restore state on WS reconnect

- [ ] 8. Agent session timeline view

> Phase 3: Dynamic Workflow Engine

- [ ] 9. Create isolated-vm workflow sandbox

- [ ] 10. Implement agent/parallel/pipeline primitives

- [ ] 11. Workflow file discovery system

- [ ] 12. Workflow manager + built-in catalog

- [ ] 13. Workflow resumability (hash-based cache)

- [ ] 14. Workflow UI integration with Orchestrator panel

> Phase 4: Background Subagents

- [ ] 15. Background task queue + spawn_subagent tool

- [ ] 16. subagent_status + subagent_result tools

- [ ] 17. Background agent pane

> Phase 5: Multi-modal + Cache Shape

- [ ] 18. Multi-modal attachment pipeline

- [ ] 19. Image render in message bubble

- [ ] 20. Cache shape telemetry data pipeline

- [ ] 21. Cache shape visualization in trace viewer

---

## Success Criteria

- Tool trace viewer shows every call with timing bars and token costs
- Browser refresh preserves agent session state
- Workflow scripts run in isolated sandbox with agent/parallel/pipeline
- Re-running a workflow skips cached agents (hash-based)
- Background subagents run independently, results collected later
- Model can see attached images in chat
- Cache hit rate visible per-turn and cumulative