# Paseo-like Orchestrator: Trace Observability, Dynamic Workflows & Agent Runtime

**Status:** Proposed
**Epic:** paseo-orchestrator
**Depends on:** v2.7.17-orchestrator

## Why

BooCode's Orchestrator (v2.7.17) runs deterministic Han analysis flows, but it's a fixed pipeline, not a general-purpose agent runtime. Every tool call is opaque: no timing, no cost breakdown, no replay. Sessions evaporate on browser refresh. Workflows are hardcoded. Subagents block until completion. And there's zero visibility into cache efficiency on DeepSeek, despite prompt caching being a major cost lever. The current architecture treats the LLM as a black box and the agent as a one-shot transaction.

To move from "read-only chat" to a **Paseo-style thin-client orchestration layer**, BooCode needs five capabilities that compound on each other:

1. **Observability**: Every tool call timed, logged, and live-streamed. Without it, debugging agent behavior is guesswork.
2. **Persistence**: Agent state survives browser refresh. Active sessions resume where they left off.
3. **Dynamic Workflows**: User-authored JS scripts using `agent()`, `parallel()`, `pipeline()` instead of hardcoded flows. Hash-based caching skips completed steps on re-run.
4. **Background Subagents**: `spawn_subagent` returns immediately, results collected later. Unlocks parallel research, long-running analyses, and notification-based workflows.
5. **Multi-modal + Cache Shape**: Image attachments forwarded to DeepSeek's vision API, plus per-turn cache hit rate visualization to close the cost feedback loop.

Each phase is independently valuable; together they transform BooCode from a chat UI into a durable agent execution platform.

## What Changes

### Phase 1: Trace System + Observability (3-4 days)

1. **Create `tool_traces` DB table**: id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome.
Applied idempotently via `applySchema()`.
2. **Add `tool_trace` WS frame**: new WsFrame variant in `@boocode/contracts` published by the server when a tool call starts and completes. Frontend receives live timing deltas via `useSessionStream`.
3. **Instrument `tool-phase.ts`**: wrap `executeToolCall` with `clock_timestamp()` start/end, extract token counts from LLM response metadata, publish `tool_trace` frames on start (with input) and finish (with output + metrics).
4. **Add GET `/api/chats/:id/traces`**: paginated endpoint returning trace rows ordered by turn_number + started_at. Supports cursor-based pagination for large sessions.
5. **Build trace viewer pane**: collapsible tree per turn, timing bars showing latency relative to turn duration, expand/collapse per tool call showing input/output. Integrates into the existing multi-pane workspace alongside chat, coder, and orchestrator panes.

### Phase 2: Session Persistence + Resume (2-3 days)

6. **Serialize agent state to DB**: on each turn boundary (before and after tool call loop), snapshot the active `AgentSession` state (provider config, turn history, pending tool calls) to a JSONB column in `agent_sessions`. Uses `clock_timestamp()` for ordering.
7. **Restore on WS reconnect**: when `snapshot` frame arrives on reconnection, check for a persisted `AgentSession` in `in_progress` or `awaiting_input` state. Rehydrate the coder pane to match the persisted turn, tool call, and pending state.
8. **Agent session timeline view**: a timeline component in the coder pane showing the history of all turns in the current agent session. Each turn shows start time, tool count, token usage, cache hit rate. Clicking a turn scrolls to that point in the conversation.

### Phase 3: Dynamic Workflow Engine (5-7 days)

9. **Create `isolated-vm` sandbox**: restricted JS execution environment for workflow scripts. No `require`, `fs`, `net`, `child_process`. Only the workflow API surface exposed.
Token budget enforcement kills runaway scripts.
10. **Implement workflow API primitives**: `agent(id, { prompt, model, tools, budget })` defines a sub-agent; `parallel([agent1, agent2])` runs N agents concurrently with a shared token budget; `pipeline([step1, step2])` chains agents sequentially; `phase(name, { agents, budget })` groups agents under a named phase; `budget(limit)` sets token or step limits; `log(msg)` emits structured workflow log. Compatible with Claude Code workflow script format.
11. **Workflow file discovery**: scan `.boocode/workflows/*.js` (project-local), `~/.boocode/workflows/*.js` (global), and a built-in catalog directory. Each file exports a `workflow` object with `{name, description, run}`. Discovery runs on server start and on file change (optional watch mode).
12. **Workflow manager + built-in catalog**: `WorkflowManager` class with `list()`, `get(name)`, `run(workflow, args)`, `cancel(runId)`, `status(runId)`. Concurrency limits (configurable max concurrent runs), token budgets per run. Built-in catalog includes: `deep-research` (parallel source search → per-source analysis → synthesis), `multi-review` (code health + security + standards reviews in parallel), `plan-verify` (generate plan → verify plan → generate tasks), `bounty-hunt` (parallel vulnerability scanning with different focuses).
13. **Workflow resumability**: SHA-256 hash of each agent spec (prompt + options). Before executing an agent, check if a completed result exists with the same hash. Skip cached agents, only execute new/changed ones. In-memory LRU cache for current session, optional DB persistence for cross-session reuse.
14. **Workflow UI integration**: extend the existing Orchestrator panel (used for Han flows) to support dynamic workflows. Workflow selector dropdown, live run pane with step-by-step progress, cancel button, log output stream, per-agent timing. Reuses the same run-pane component pattern.

### Phase 4: Background Subagents (2-3 days)

15. **Background task queue**: uses the existing `tasks` table with a new `background` type. `spawn_subagent` tool creates a task row and returns immediately. A background worker picks up the task and executes it without blocking the calling agent.
16. **`subagent_status` + `subagent_result` tools**: `subagent_status(task_id)` returns `running|completed|failed` with optional progress info. `subagent_result(task_id)` returns the full output when completed. Polling-based (no WS push for background tasks initially).
17. **Background agent pane**: new pane type showing running/completed background agents. Each entry shows name, status, duration, progress. Completed entries show a "View Result" action. Notifications hook into the existing notification system (toast on completion, badge count for active tasks).

### Phase 5: Multi-modal + Cache Shape (2-3 days)

18. **Image/file attachment pipeline**: accept file uploads (drag-drop or file picker), store on tmpfs with a reference in the message row. Forward to DeepSeek's multimodal API as base64-encoded image parts. Size limit enforcement (configurable, default 20MB per attachment).
19. **Image render in message bubble**: render attached images inline in the chat message bubble. Lightbox on click for expanded view. Thumbnail generation for large images to keep chat scrolling performant.
20. **Cache shape telemetry**: extract `prompt_cache_hit_tokens` from DeepSeek provider metadata on each turn. Break down by segment: system prompt, tool schemas, conversation history. Store in `tool_traces` columns and/or a dedicated `cache_stats` table.
21. **Cache hit rate visualization**: per-turn cache hit bar in the trace viewer (showing cached vs non-cached tokens). Cumulative cache hit rate in the session footer. Highlight when a turn achieves high cache reuse (green indicator) or unusually low (yellow/red).
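
The per-turn and cumulative cache hit rates from items 20 and 21 could be computed roughly as sketched below. This is an illustration under assumptions, not the proposed implementation: the `TurnUsage` shape, the helper names `cacheHitRate`, `cumulativeHitRate`, and `classify`, and the 0.7 / 0.3 color thresholds are all hypothetical. It also assumes the provider metadata exposes a `prompt_cache_miss_tokens` count next to `prompt_cache_hit_tokens`.

```typescript
// Hypothetical per-turn usage shape. The hit field is named in item 20;
// the miss field is an assumption about what the provider metadata carries.
interface TurnUsage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

type CacheBand = "green" | "yellow" | "red";

// Fraction of prompt tokens served from cache for one turn.
// Returns 0 when the turn reports no prompt tokens at all.
function cacheHitRate(u: TurnUsage): number {
  const total = u.prompt_cache_hit_tokens + u.prompt_cache_miss_tokens;
  return total === 0 ? 0 : u.prompt_cache_hit_tokens / total;
}

// Token-weighted hit rate across all turns (the session footer figure).
function cumulativeHitRate(turns: TurnUsage[]): number {
  let hit = 0;
  let miss = 0;
  for (const t of turns) {
    hit += t.prompt_cache_hit_tokens;
    miss += t.prompt_cache_miss_tokens;
  }
  return cacheHitRate({ prompt_cache_hit_tokens: hit, prompt_cache_miss_tokens: miss });
}

// Illustrative thresholds only; the proposal does not fix concrete values.
function classify(rate: number): CacheBand {
  if (rate >= 0.7) return "green";
  if (rate >= 0.3) return "yellow";
  return "red";
}

const sampleTurns: TurnUsage[] = [
  { prompt_cache_hit_tokens: 0, prompt_cache_miss_tokens: 1000 },  // cold first turn
  { prompt_cache_hit_tokens: 900, prompt_cache_miss_tokens: 300 }, // prefix reused
];

for (const t of sampleTurns) {
  const rate = cacheHitRate(t);
  console.log(rate.toFixed(2), classify(rate)); // "0.00 red", then "0.75 green"
}
console.log(cumulativeHitRate(sampleTurns).toFixed(2)); // "0.41"
```

Weighting the cumulative figure by tokens rather than averaging per-turn rates keeps a single small turn from skewing the session total.
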
## Non-Goals

- No changes to the existing Han flow orchestrator (runs alongside dynamic workflows)
- No removal of existing agent dispatch paths (PTY, ACP, Claude SDK; dynamic workflows are additive)
- No distributed execution (all orchestration is single-node)
- No persistent workflow file watching (manual reload or server restart to pick up new workflows)
- No workflow editing UI (workflows are authored as JS files)

## Capabilities

### New Capabilities

- **Tool trace viewer**: every tool call with timing, token costs, cache breakdown, expandable input/output
- **Agent session resume**: browser refresh preserves active agent state
- **Dynamic workflows**: user-authored JS scripts with `agent()/parallel()/pipeline()` API
- **Workflow resumability**: hash-based step caching skips completed agents on re-run
- **Built-in workflow catalog**: deep-research, multi-review, plan-verify, bounty-hunt
- **Background subagents**: non-blocking spawn with deferred result collection
- **Multi-modal support**: image attachments forwarded to DeepSeek vision API
- **Cache shape telemetry**: per-turn and cumulative cache hit rate visualization

### Modified Capabilities

- **Orchestrator panel**: extended from fixed Han flows to dynamic workflow selection and streaming run pane
- **tool-phase.ts**: instrumented with start/end timing and trace publishing
- **WsFrame contract**: new `tool_trace` frame variant
- **tasks table**: extended with `background` type for async subagent execution

## Metrics

- Tool call observability: 0% → 100% of calls traced with timing
- Session continuity: lost on refresh → preserved on reconnect
- Workflow authoring: hardcoded → user-authored JS scripts
- Workflow re-run efficiency: 0% cache → hash-based step reuse
- Background execution: blocking only → blocking + non-blocking
- Cache visibility: 0% → per-turn + cumulative hit rate
- Multi-modal: text-only → text + image attachments
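
The hash-based step reuse behind the workflow re-run metric (SHA-256 over prompt + options, skip on match) might look like the sketch below. Everything named here is hypothetical: `AgentSpec`, `specHash`, and `runCached` are illustrative names, the cache is a plain in-memory `Map` rather than the LRU the proposal calls for, the executor is synchronous for brevity where a real agent call would be async, and key-sorted JSON is just one possible canonical form for the spec. The only external call is Node's built-in `createHash` from `node:crypto`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical agent spec: the prompt plus flat options (model, budget, ...).
interface AgentSpec {
  prompt: string;
  options: Record<string, string | number | boolean>;
}

// Hash a canonical form of the spec. Option keys are sorted so that
// { a, b } and { b, a } produce the same digest.
function specHash(spec: AgentSpec): string {
  const sortedOptions = Object.keys(spec.options)
    .sort()
    .map((k) => [k, spec.options[k]]);
  const canonical = JSON.stringify([spec.prompt, sortedOptions]);
  return createHash("sha256").update(canonical).digest("hex");
}

// In-memory store of completed results, keyed by spec hash.
const completed = new Map<string, string>();

// Run the agent only if no completed result exists for the same hash.
function runCached(
  spec: AgentSpec,
  execute: (s: AgentSpec) => string,
): { result: string; cached: boolean } {
  const key = specHash(spec);
  const hit = completed.get(key);
  if (hit !== undefined) return { result: hit, cached: true };
  const result = execute(spec);
  completed.set(key, result);
  return { result, cached: false };
}

// Stand-in executor that counts how often an agent actually runs.
let executions = 0;
const fakeAgent = (s: AgentSpec): string => {
  executions += 1;
  return `result for: ${s.prompt}`;
};

const first = runCached(
  { prompt: "summarize README", options: { model: "m1", budget: 1000 } },
  fakeAgent,
);
const second = runCached(
  { prompt: "summarize README", options: { budget: 1000, model: "m1" } },
  fakeAgent,
);
console.log(first.cached, second.cached, executions); // false true 1
```

Changing any part of the prompt or options yields a different digest, so only new or edited agents execute on a re-run.
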