Phase 1: Trace System + Observability
- tool_traces DB table + insert/update service
- tool_trace_start/tool_trace_finish WS frames (contracts + FE types)
- Instrumented tool-phase.ts with timing around every tool call
- GET /api/chats/:id/traces paginated endpoint
- Trace viewer frontend (collapsible panel with timing bars + token breakdown)

Phase 2: Session Persistence + Resume
- agent_snapshots table (UPSERT per chat, persisted on turn boundaries)
- save/load/delete service functions
- Agent snapshot sent on WS reconnect
- Session timeline view (vertical timeline with scroll-to + restore)

Tooling:
- run_command tool (execFile, 30s timeout, 32KB cap, path-guarded)
- Auto-fix loop: after write tools, runs pnpm build, injects errors into next turn
Paseo-like Orchestrator — Trace Observability, Dynamic Workflows & Agent Runtime
Status: Proposed
Epic: paseo-orchestrator
Depends on: v2.7.17-orchestrator
Why
BooCode's Orchestrator (v2.7.17) runs deterministic Han analysis flows — but it's a fixed pipeline, not a general-purpose agent runtime. Every tool call is opaque: no timing, no cost breakdown, no replay. Sessions evaporate on browser refresh. Workflows are hardcoded. Subagents block until completion. And there's zero visibility into cache efficiency on DeepSeek — despite prompt caching being a major cost lever.
The current architecture treats the LLM as a black box and the agent as a one-shot transaction. To move from "read-only chat" to a Paseo-style thin-client orchestration layer, BooCode needs five capabilities that compound on each other:
- Observability: Every tool call timed, logged, and live-streamed. Without it, debugging agent behavior is guesswork.
- Persistence: Agent state survives browser refresh. Active sessions resume where they left off.
- Dynamic Workflows: User-authored JS scripts using `agent()`, `parallel()`, `pipeline()` instead of hardcoded flows. Hash-based caching skips completed steps on re-run.
- Background Subagents: `spawn_subagent` returns immediately, results collected later. Unlocks parallel research, long-running analyses, and notification-based workflows.
- Multi-modal + Cache Shape: Image attachments forwarded to DeepSeek's vision API, plus per-turn cache hit rate visualization to close the cost feedback loop.
Each phase is independently valuable; together they transform BooCode from a chat UI into a durable agent execution platform.
What Changes
Phase 1: Trace System + Observability (3-4 days)
- Create `tool_traces` DB table: id, session_id, chat_id, turn_number, tool_name, input, output, started_at, finished_at, latency_ms, tokens_used, cache_tokens, reasoning_tokens, error, outcome. Applied idempotently via `applySchema()`.
- Add `tool_trace` WS frame: new WsFrame variant in `@boocode/contracts`, published by the server when a tool call starts and completes. Frontend receives live timing deltas via `useSessionStream`.
- Instrument `tool-phase.ts`: wrap `executeToolCall` with `clock_timestamp()` start/end, extract token counts from LLM response metadata, publish `tool_trace` frames on start (with input) and finish (with output + metrics).
- Add GET `/api/chats/:id/traces`: paginated endpoint returning trace rows ordered by turn_number + started_at. Supports cursor-based pagination for large sessions.
- Build trace viewer pane: collapsible tree per turn, timing bars showing latency relative to turn duration, expand/collapse per tool call showing input/output. Integrates into the existing multi-pane workspace alongside chat, coder, and orchestrator panes.
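
The instrumentation step above can be sketched as a wrapper that publishes a start frame, runs the tool, and publishes a finish frame with latency and outcome. This is a minimal sketch under stated assumptions: `traceToolCall`, the `ToolTraceFrame` layout, and the `publish` callback are hypothetical names, not the actual `@boocode/contracts` types, and it uses an in-process clock (`Date.now`) rather than the database-side `clock_timestamp()` the proposal mentions.

```typescript
// Hypothetical frame shape: field names loosely follow the tool_traces columns,
// but the real WsFrame variant may differ.
type ToolTraceFrame =
  | { type: "tool_trace"; phase: "start"; traceId: string; toolName: string; input: unknown; startedAt: number }
  | {
      type: "tool_trace"; phase: "finish"; traceId: string; toolName: string;
      output: unknown; latencyMs: number; outcome: "ok" | "error"; error?: string;
    };

type Publish = (frame: ToolTraceFrame) => void;

let nextTraceId = 0;

// Wraps any async tool executor so that a start frame is published before the
// call and a finish frame (with latency and outcome) afterwards, whether the
// tool resolves or throws. The clock is injectable to keep the sketch testable.
async function traceToolCall<T>(
  toolName: string,
  input: unknown,
  execute: () => Promise<T>,
  publish: Publish,
  now: () => number = Date.now,
): Promise<T> {
  const traceId = `trace-${++nextTraceId}`;
  const startedAt = now();
  publish({ type: "tool_trace", phase: "start", traceId, toolName, input, startedAt });
  try {
    const output = await execute();
    publish({
      type: "tool_trace", phase: "finish", traceId, toolName, output,
      latencyMs: now() - startedAt, outcome: "ok",
    });
    return output;
  } catch (err) {
    publish({
      type: "tool_trace", phase: "finish", traceId, toolName, output: null,
      latencyMs: now() - startedAt, outcome: "error",
      error: err instanceof Error ? err.message : String(err),
    });
    throw err; // the caller still sees the original failure
  }
}
```

Publishing the finish frame in both branches means a failed tool call still produces a complete trace, which is what the `error` and `outcome` columns appear to be for.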
Phase 2: Session Persistence + Resume (2-3 days)
- Serialize agent state to DB: on each turn boundary (before and after the tool call loop), snapshot the active `AgentSession` state (provider config, turn history, pending tool calls) to a JSONB column in `agent_sessions`. Uses `clock_timestamp()` for ordering.
- Restore on WS reconnect: when the `snapshot` frame arrives on reconnection, check for a persisted `AgentSession` in `in_progress` or `awaiting_input` state. Rehydrate the coder pane to match the persisted turn, tool call, and pending state.
- Agent session timeline view: a timeline component in the coder pane showing the history of all turns in the current agent session. Each turn shows start time, tool count, token usage, cache hit rate. Clicking a turn scrolls to that point in the conversation.
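
The save-then-restore flow described above might look roughly like the following. It is a sketch, not the proposed implementation: `SnapshotStore` is an in-memory stand-in for the JSONB column, the `AgentSnapshot` fields are illustrative, and the `completed` and `failed` statuses are assumptions added only to show which sessions are skipped on reconnect.

```typescript
// "in_progress" and "awaiting_input" come from the proposal; the other two
// statuses are assumed for illustration.
type SessionStatus = "in_progress" | "awaiting_input" | "completed" | "failed";

interface AgentSnapshot {
  chatId: string;
  status: SessionStatus;
  turnNumber: number;
  pendingToolCalls: string[];
}

// In-memory stand-in for the database: one serialized snapshot per chat,
// overwritten on every turn boundary (the equivalent of an upsert).
class SnapshotStore {
  private readonly rows = new Map<string, string>();

  save(snapshot: AgentSnapshot): void {
    this.rows.set(snapshot.chatId, JSON.stringify(snapshot));
  }

  load(chatId: string): AgentSnapshot | undefined {
    const raw = this.rows.get(chatId);
    return raw === undefined ? undefined : (JSON.parse(raw) as AgentSnapshot);
  }
}

// On reconnect, only sessions that were still active are worth rehydrating.
function snapshotToResume(store: SnapshotStore, chatId: string): AgentSnapshot | undefined {
  const snapshot = store.load(chatId);
  if (snapshot === undefined) return undefined;
  const resumable = snapshot.status === "in_progress" || snapshot.status === "awaiting_input";
  return resumable ? snapshot : undefined;
}
```

Serializing through JSON in the stand-in mirrors what a JSONB column would impose: anything that does not survive a JSON round trip (functions, open handles) cannot be part of the snapshot.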
Phase 3: Dynamic Workflow Engine (5-7 days)
- Create `isolated-vm` sandbox: restricted JS execution environment for workflow scripts. No `require`, `fs`, `net`, `child_process`. Only the workflow API surface exposed. Token budget enforcement kills runaway scripts.
- Implement workflow API primitives: `agent(id, { prompt, model, tools, budget })` defines a sub-agent; `parallel([agent1, agent2])` runs N agents concurrently with a shared token budget; `pipeline([step1, step2])` chains agents sequentially; `phase(name, { agents, budget })` groups agents under a named phase; `budget(limit)` sets token or step limits; `log(msg)` emits structured workflow log. Compatible with Claude Code workflow script format.
- Workflow file discovery: scan `.boocode/workflows/*.js` (project-local), `~/.boocode/workflows/*.js` (global), and a built-in catalog directory. Each file exports a `workflow` object with `{name, description, run}`. Discovery runs on server start and on file change (optional watch mode).
- Workflow manager + built-in catalog: `WorkflowManager` class with `list()`, `get(name)`, `run(workflow, args)`, `cancel(runId)`, `status(runId)`. Concurrency limits (configurable max concurrent runs), token budgets per run. Built-in catalog includes: `deep-research` (parallel source search → per-source analysis → synthesis), `multi-review` (code health + security + standards reviews in parallel), `plan-verify` (generate plan → verify plan → generate tasks), `bounty-hunt` (parallel vulnerability scanning with different focuses).
- Workflow resumability: SHA-256 hash of each agent spec (prompt + options). Before executing an agent, check if a completed result exists with the same hash. Skip cached agents, only execute new/changed ones. In-memory LRU cache for current session, optional DB persistence for cross-session reuse.
- Workflow UI integration: extend the existing Orchestrator panel (used for Han flows) to support dynamic workflows. Workflow selector dropdown, live run pane with step-by-step progress, cancel button, log output stream, per-agent timing. Reuses the same run-pane component pattern.
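
To make the primitives and the hash-based resumability concrete, here is a small sketch of `agent()`, `parallel()`, and `pipeline()` backed by a SHA-256 keyed result cache. Everything beyond those three names is an assumption for illustration: `WorkflowRun`, `AgentRunner`, and `specHash` are hypothetical, budgets and the sandbox are left out, and a plain `Map` stands in for the LRU cache.

```typescript
import { createHash } from "node:crypto";

interface AgentOptions { prompt: string; model?: string; tools?: string[] }
interface AgentSpec extends AgentOptions { id: string }
// Hypothetical hook that actually talks to a model; injected so the sketch runs offline.
type AgentRunner = (spec: AgentSpec) => Promise<string>;

// agent() only describes a sub-agent; nothing runs until the spec is executed.
function agent(id: string, options: AgentOptions): AgentSpec {
  return { id, ...options };
}

// SHA-256 over prompt + options. The id is deliberately left out and tools are
// sorted, so renaming an agent or reordering tools does not invalidate the cache.
function specHash(spec: AgentSpec): string {
  const canonical = JSON.stringify({
    prompt: spec.prompt,
    model: spec.model ?? null,
    tools: [...(spec.tools ?? [])].sort(),
  });
  return createHash("sha256").update(canonical).digest("hex");
}

class WorkflowRun {
  private readonly cache = new Map<string, string>();
  private readonly runner: AgentRunner;

  constructor(runner: AgentRunner) {
    this.runner = runner;
  }

  // Runs one agent, or returns the stored result when the same spec already completed.
  async execute(spec: AgentSpec): Promise<string> {
    const key = specHash(spec);
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;
    const result = await this.runner(spec);
    this.cache.set(key, result);
    return result;
  }

  // Runs all agents concurrently and resolves once every one has finished.
  parallel(specs: AgentSpec[]): Promise<string[]> {
    return Promise.all(specs.map((spec) => this.execute(spec)));
  }

  // Runs agents one after another, appending each output to the next prompt.
  async pipeline(specs: AgentSpec[]): Promise<string> {
    let previous = "";
    for (const spec of specs) {
      const prompt = previous === "" ? spec.prompt : `${spec.prompt}\n\nPrevious step output:\n${previous}`;
      previous = await this.execute({ ...spec, prompt });
    }
    return previous;
  }
}
```

Because a pipeline step's prompt includes the previous output, changing an early step changes the hash of every later step, so only the unchanged prefix of a pipeline is skipped on re-run.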
Phase 4: Background Subagents (2-3 days)
- Background task queue: uses the existing `tasks` table with a new `background` type. `spawn_subagent` tool creates a task row and returns immediately. A background worker picks up the task and executes it without blocking the calling agent.
- `subagent_status` + `subagent_result` tools: `subagent_status(task_id)` returns `running|completed|failed` with optional progress info. `subagent_result(task_id)` returns the full output when completed. Polling-based (no WS push for background tasks initially).
- Background agent pane: new pane type showing running/completed background agents. Each entry shows name, status, duration, progress. Completed entries show a "View Result" action. Notifications hook into the existing notification system (toast on completion, badge count for active tasks).
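
The spawn, poll, and collect cycle above could be sketched as follows. This is an in-process approximation under stated assumptions: `BackgroundSubagents` is a hypothetical class, a `Map` replaces the `tasks` table, and the "worker" is simply a promise that is started but not awaited.

```typescript
type TaskStatus = "running" | "completed" | "failed";

interface BackgroundTask {
  id: string;
  status: TaskStatus;
  result?: string;
  error?: string;
}

class BackgroundSubagents {
  private readonly tasks = new Map<string, BackgroundTask>();
  private nextId = 0;

  // Registers the task and starts the work without awaiting it, so the
  // calling agent gets a task id back immediately.
  spawnSubagent(work: () => Promise<string>): string {
    const id = `task-${++this.nextId}`;
    const task: BackgroundTask = { id, status: "running" };
    this.tasks.set(id, task);
    // Promise.resolve().then(work) also captures synchronous throws from work().
    Promise.resolve()
      .then(work)
      .then(
        (result) => { task.status = "completed"; task.result = result; },
        (err) => { task.status = "failed"; task.error = err instanceof Error ? err.message : String(err); },
      );
    return id;
  }

  // Polling entry point: reports the current state of a task.
  subagentStatus(taskId: string): TaskStatus {
    return this.find(taskId).status;
  }

  // Returns the full output, but only once the task has completed.
  subagentResult(taskId: string): string {
    const task = this.find(taskId);
    if (task.status !== "completed") throw new Error(`task ${taskId} is ${task.status}`);
    return task.result ?? "";
  }

  private find(taskId: string): BackgroundTask {
    const task = this.tasks.get(taskId);
    if (task === undefined) throw new Error(`unknown task: ${taskId}`);
    return task;
  }
}
```

Having `subagentResult` throw while the task is still running keeps the two tools distinct: the agent is expected to poll status first and fetch the result only after it reports `completed`.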
Phase 5: Multi-modal + Cache Shape (2-3 days)
- Image/file attachment pipeline: accept file uploads (drag-drop or file picker), store on tmpfs with a reference in the message row. Forward to DeepSeek's multimodal API as base64-encoded image parts. Size limit enforcement (configurable, default 20MB per attachment).
- Image render in message bubble: render attached images inline in the chat message bubble. Lightbox on click for expanded view. Thumbnail generation for large images to keep chat scrolling performant.
- Cache shape telemetry: extract `prompt_cache_hit_tokens` from DeepSeek provider metadata on each turn. Break down by segment: system prompt, tool schemas, conversation history. Store in `tool_traces` columns and/or a dedicated `cache_stats` table.
- Cache hit rate visualization: per-turn cache hit bar in the trace viewer (showing cached vs non-cached tokens). Cumulative cache hit rate in the session footer. Highlight when a turn achieves high cache reuse (green indicator) or unusually low (yellow/red).
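
The per-turn and cumulative hit rates above reduce to simple ratios, sketched below. The `TurnUsage` shape is hypothetical: it assumes the provider reports a miss-token count alongside `prompt_cache_hit_tokens`, and the 0.8 and 0.3 thresholds for the colour indicator are placeholder values, not figures from this proposal.

```typescript
// Assumed per-turn usage: cached and non-cached prompt tokens.
interface TurnUsage {
  promptCacheHitTokens: number;
  promptCacheMissTokens: number;
}

// Fraction of prompt tokens served from cache for a single turn (0 when empty).
function cacheHitRate(usage: TurnUsage): number {
  const total = usage.promptCacheHitTokens + usage.promptCacheMissTokens;
  return total === 0 ? 0 : usage.promptCacheHitTokens / total;
}

// Session-wide rate: sum tokens first, then divide, so long turns weigh more
// than short ones (averaging the per-turn rates would not do that).
function cumulativeHitRate(turns: TurnUsage[]): number {
  const totals = turns.reduce(
    (acc, turn) => ({
      promptCacheHitTokens: acc.promptCacheHitTokens + turn.promptCacheHitTokens,
      promptCacheMissTokens: acc.promptCacheMissTokens + turn.promptCacheMissTokens,
    }),
    { promptCacheHitTokens: 0, promptCacheMissTokens: 0 },
  );
  return cacheHitRate(totals);
}

type Indicator = "green" | "yellow" | "red";

// Maps a hit rate to the indicator colour; thresholds are illustrative defaults.
function cacheIndicator(rate: number, high = 0.8, low = 0.3): Indicator {
  if (rate >= high) return "green";
  if (rate < low) return "red";
  return "yellow";
}
```

For instance, a turn with 750 cached and 250 non-cached prompt tokens has a hit rate of 0.75, which the placeholder thresholds would show as yellow.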
Non-Goals
- No changes to the existing Han flow orchestrator (runs alongside dynamic workflows)
- No removal of existing agent dispatch paths (PTY, ACP, Claude SDK — dynamic workflows are additive)
- No distributed execution (all orchestration is single-node)
- No persistent workflow file watching (manual reload or server restart to pick up new workflows)
- No workflow editing UI (workflows are authored as JS files)
Capabilities
New Capabilities
- Tool trace viewer: every tool call with timing, token costs, cache breakdown, expandable input/output
- Agent session resume: browser refresh preserves active agent state
- Dynamic workflows: user-authored JS scripts with `agent()`/`parallel()`/`pipeline()` API
- Workflow resumability: hash-based step caching skips completed agents on re-run
- Built-in workflow catalog: deep-research, multi-review, plan-verify, bounty-hunt
- Background subagents: non-blocking spawn with deferred result collection
- Multi-modal support: image attachments forwarded to DeepSeek vision API
- Cache shape telemetry: per-turn and cumulative cache hit rate visualization
Modified Capabilities
- Orchestrator panel: extended from fixed Han flows to dynamic workflow selection and streaming run pane
- `tool-phase.ts`: instrumented with start/end timing and trace publishing
- WsFrame contract: new `tool_trace` frame variant
- `tasks` table: extended with `background` type for async subagent execution
Metrics
- Tool call observability: 0% → 100% of calls traced with timing
- Session continuity: lost on refresh → preserved on reconnect
- Workflow authoring: hardcoded → user-authored JS scripts
- Workflow re-run efficiency: 0% cache → hash-based step reuse
- Background execution: blocking only → blocking + non-blocking
- Cache visibility: 0% → per-turn + cumulative hit rate
- Multi-modal: text-only → text + image attachments