boocode/openspec/changes/v2-6-persistent-agent-sessions/design.md

# v2.6 Design — Persistent agent sessions

Reference implementations: `/opt/forks/opencode` (server + SDK),
`/opt/forks/paseo` (warm ACP + opencode server-manager + reasoning dedup).

## 1. Architecture overview

```
                         BooCoder (systemd host service)
  ┌──────────────────────────────────────────────────────────────────┐
  │  dispatcher (per-turn unit = tasks row)                          │
  │      │ resolve backend + worktree + agent-session for the chat   │
  │      ▼                                                           │
  │  agent-pool                                                      │
  │   ├─ OpenCodeServerBackend (1 process, N sessions)               │
  │   │     `opencode serve` ◄── @opencode-ai/sdk ──► /event SSE     │
  │   └─ WarmAcpBackend[session]  (1 stdio process per session)      │
  │         `goose acp` / `qwen --acp` ◄── ClientSideConnection      │
  └──────────────────────────────────────────────────────────────────┘
        │ broker.publishFrame (delta / reasoning_delta / tool_call)
        ▼
     web (CoderPane), unchanged
```

The **task row stays the per-turn unit**. What changes: instead of building a
fresh world per task, the dispatcher resolves the chat's *persistent* backend,
worktree, and agent-session, sends one prompt, streams events, diffs, and leaves
everything warm.

## 2. Backends

Common interface (`AgentBackend`):

```typescript
interface AgentBackend {
  ensureSession(sessionId, opts): Promise<AgentSessionHandle>  // create-or-reuse
  prompt(handle, input, { worktreePath, model, signal, onEvent }): Promise<TurnResult>
  closeSession(handle): Promise<void>
  dispose(): Promise<void>                                     // backend teardown
  health(): 'up' | 'down'
}
```

`onEvent` emits the same normalized events the current `acp-dispatch.ts` produces
(`text`, `reasoning`, `tool_call`, `tool_update`) so the broker-frame publishing and
`persistExternalAgentTurn` paths are reused unchanged.
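
One way to pin down the normalized event stream is a discriminated union. The sketch below is illustrative only: the four event kinds come from this design, but the type names `AgentEvent` and `TurnAccumulator`, every field name, and the shape of the accumulated result are assumptions, not the definitions used by `acp-dispatch.ts`.

```typescript
// Hypothetical shapes: only the four event kinds come from this design;
// every field name below is an assumption made for illustration.
type AgentEvent =
  | { kind: "text"; delta: string }
  | { kind: "reasoning"; delta: string }
  | { kind: "tool_call"; toolCallId: string; name: string }
  | { kind: "tool_update"; toolCallId: string; status: string };

// Collects streamed events into a per-turn summary (a stand-in for TurnResult).
class TurnAccumulator {
  text = "";
  reasoning = "";
  readonly tools = new Map<string, { name: string; status: string }>();

  // Suitable to pass as the `onEvent` callback of `prompt`.
  onEvent = (e: AgentEvent): void => {
    switch (e.kind) {
      case "text":
        this.text += e.delta;
        break;
      case "reasoning":
        this.reasoning += e.delta;
        break;
      case "tool_call":
        this.tools.set(e.toolCallId, { name: e.name, status: "pending" });
        break;
      case "tool_update": {
        const tool = this.tools.get(e.toolCallId);
        if (tool) tool.status = e.status; // ignore updates for unknown calls
        break;
      }
    }
  };
}
```

Because both backends would emit the same union, the consumer does not need to know whether a turn came from the HTTP server or a stdio process.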

### 2a. OpenCodeServerBackend (shared HTTP server)

- **Spawn once per BooCoder process:** `opencode serve --hostname 127.0.0.1 --port <p>`
  with `OPENCODE_SERVER_PASSWORD=<random-at-boot>` (verified: `serve.ts`, `network.ts`;
  default port 4096, prints `opencode server listening on http://…`). Use the official
  `@opencode-ai/sdk` (`createOpencodeServer` / `createOpencodeClient`) rather than
  hand-rolling HTTP — it already parses the ready line and wraps routes.
- **One SSE subscription** to `GET /event`, consumed in a single read loop; events are
  demuxed by `properties.sessionID` → BooCode session. Reasoning arrives as
  `message.part.delta` (`field: "reasoning"`) and `message.part.updated`
  (`part.type: "reasoning"`); text arrives the same way under the `text` field; tool
  calls arrive as tool parts.
- **One opencode session per BooCode chat.** Call `client.session.create()` once and store the
  returned `id` in `agent_sessions.agent_session_id`. Per turn: `client.session.prompt({
  path:{id}, body:{ parts:[{type:'text',text}], model:{providerID, modelID} }})`. Worktree
  routing uses the `x-opencode-directory` header (set to the session's persistent
  worktree) so the agent operates inside it.
- **Reasoning dedup (port from Paseo `opencode-agent.ts`):** track
  `streamedPartKeys` of `reasoning:${partID}`; when a `message.part.updated` reasoning
  part arrives whose key was already streamed via delta, drop it. Prevents the
  double-thought bug (covered by Paseo's `opencode-reasoning-dedup` e2e test).
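
The demux and dedup rules above can be sketched together. This is a simplified model, not the SDK's event types or Paseo's code: only `properties.sessionID`, `field: "reasoning"`, `part.type: "reasoning"`, and the `reasoning:${partID}` key come from this design; the remaining field names (`partID`, `delta`, `part.id`, `part.text`) and the `ReasoningDedup` / `demux` names are assumptions.

```typescript
// Simplified event shapes; see the assumptions listed in the lead-in.
type ServerEvent =
  | { type: "message.part.delta"; properties: { sessionID: string; partID: string; field: string; delta: string } }
  | { type: "message.part.updated"; properties: { sessionID: string; part: { id: string; type: string; text: string } } };

type Emitted = { sessionID: string; kind: "text" | "reasoning"; text: string };

class ReasoningDedup {
  // Keys of the form `reasoning:${partID}` for parts already streamed via delta.
  private streamedPartKeys = new Set<string>();

  // Returns the event to forward, or null when it must be dropped.
  handle(ev: ServerEvent): Emitted | null {
    if (ev.type === "message.part.delta") {
      const { sessionID, partID, field, delta } = ev.properties;
      if (field === "reasoning") {
        this.streamedPartKeys.add(`reasoning:${partID}`);
        return { sessionID, kind: "reasoning", text: delta };
      }
      return field === "text" ? { sessionID, kind: "text", text: delta } : null;
    }
    const { sessionID, part } = ev.properties;
    if (part.type !== "reasoning") return null; // other part types are out of scope here
    // Already streamed incrementally: emitting again would duplicate the thought.
    if (this.streamedPartKeys.has(`reasoning:${part.id}`)) return null;
    return { sessionID, kind: "reasoning", text: part.text };
  }
}

// Demultiplexes the single event stream into per-session buckets.
function demux(events: ServerEvent[]): Map<string, Emitted[]> {
  const dedup = new ReasoningDedup();
  const bySession = new Map<string, Emitted[]>();
  for (const ev of events) {
    const out = dedup.handle(ev);
    if (!out) continue;
    const bucket = bySession.get(out.sessionID) ?? [];
    bucket.push(out);
    bySession.set(out.sessionID, bucket);
  }
  return bySession;
}
```

A reasoning part that never streamed a delta still passes through on `message.part.updated`, so agents that only send the final part are not silenced.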

### 2b. WarmAcpBackend (goose, qwen — stdio)

- **One persistent process + ACP connection per (chat, agent)** (Paseo's
  `SpawnedACPProcess`): spawn `goose acp` / `qwen --acp` once, NDJSON over stdio,
  `initialize` → `session/new` once; store the ACP session id in the
  `agent_sessions` row. Each turn calls `session/prompt` on the same connection;
  switching away and back resumes this same connection/session. Reuses the existing `acp-dispatch.ts`
  `handleSessionUpdate` switch verbatim for `agent_message_chunk` /
  `agent_thought_chunk` / `tool_call*`.
- **Child lifetime is the pool's, not a request's.** Spawn detached/managed; do not
  tie the process to a single dispatch's abort signal (only the in-flight `prompt`
  gets the per-turn signal). Mirrors the codecontext shim rule (CLAUDE.md): supervise
  the child and react to its exit, don't let a request scope kill it.
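
Because the connection is long-lived, stdout chunks will not align with message boundaries, so the NDJSON framing needs a small buffer. The following is a minimal sketch of that framing only; `NdjsonDecoder` and `encodeNdjson` are hypothetical names, and process spawning and supervision are deliberately left out.

```typescript
// Buffers arbitrary stdout chunks and yields one parsed JSON value per complete line.
class NdjsonDecoder {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" when the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";
    const messages: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue; // tolerate blank lines
      messages.push(JSON.parse(trimmed)); // throws on malformed JSON; callers decide how to react
    }
    return messages;
  }
}

// Encodes one message as a single NDJSON line.
function encodeNdjson(message: unknown): string {
  return JSON.stringify(message) + "\n";
}
```

One decoder instance would live as long as the child process, which matches the rule that the child's lifetime belongs to the pool rather than to a request.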

## 3. Data model

Agent switching is **free** within a chat (the picker is per-turn, not locked), so
the worktree is shared across agents but each agent keeps its own backend session.
That splits into two tables: one **shared worktree per chat**, and one **backend
session per (chat, agent)** pair.

```sql
-- One shared worktree per BooCode chat. All agents used in the chat operate in it.
CREATE TABLE IF NOT EXISTS session_worktrees (
  session_id     UUID PRIMARY KEY REFERENCES sessions(id),
  worktree_path  TEXT NOT NULL,
  base_commit    TEXT,                          -- project HEAD captured at create (diff baseline)
  created_at     TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp()
);

-- One backend session per (chat, agent). Resumed when the user switches back to
-- that agent, so each agent retains its own conversation memory across switches.
CREATE TABLE IF NOT EXISTS agent_sessions (
  session_id        UUID NOT NULL REFERENCES sessions(id),
  agent             TEXT NOT NULL,             -- opencode | goose | qwen (native boocode needs no row)
  backend           TEXT NOT NULL,             -- opencode_server | acp_warm
  agent_session_id  TEXT,                      -- opencode/ACP native session id (the memory handle)
  server_port       INTEGER,                   -- opencode server port (nullable)
  status            TEXT NOT NULL DEFAULT 'idle', -- idle | active | crashed | closed
  last_active_at    TIMESTAMPTZ,
  created_at        TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
  PRIMARY KEY (session_id, agent),
  CONSTRAINT agent_sessions_backend_chk CHECK (backend IN ('opencode_server','acp_warm')),
  CONSTRAINT agent_sessions_status_chk  CHECK (status IN ('idle','active','crashed','closed'))
);
```

Plus one column for attribution (drives the DiffPanel badges in §9):

```sql
-- Which agent staged each pending change. Stamped at queue time:
--   worktree-diff path → the task's agent; native boocode write tools → 'boocode';
--   manual RightRail create (v2.5.x) → NULL (renders as "manual").
ALTER TABLE pending_changes ADD COLUMN IF NOT EXISTS agent TEXT;
```
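
The stamping rule in the comment above is small enough to state as a function. This is a sketch under assumptions: the `QueueSource` discriminator and the function name are hypothetical, and only the three outcomes (task's agent, `'boocode'`, `NULL`) come from this design.

```typescript
// Hypothetical discriminator for the three queue paths named in the SQL comment.
type QueueSource =
  | { via: "worktree_diff"; taskAgent: string } // external agent turn
  | { via: "native_tool" } // native boocode write tools
  | { via: "manual" }; // manual RightRail create

// Value to store in pending_changes.agent; null renders as "manual".
function agentForPendingChange(source: QueueSource): string | null {
  switch (source.via) {
    case "worktree_diff":
      return source.taskAgent;
    case "native_tool":
      return "boocode";
    case "manual":
      return null;
  }
}
```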

`tasks.worktree_path` already exists but was per-task; the persistent worktree now
lives on `session_worktrees`. `tasks` stays the per-turn record (state machine
unchanged) and needs no new columns. **Native boocode** keeps no `agent_sessions`
row: it has no warm backend and reconstructs conversation context from the chat's
`messages` rows each turn (so it transparently sees every other agent's prior turns).
The DB is the source of truth for reconnect after a BooCoder restart (the in-memory pool
rebuilds lazily from these tables on the next turn).

### 3a. Agent switching & continuity (the decided model)

Per the design review: **free switch, per-agent memory.** Concretely:

- **Picker is per-turn.** The message route already sends `provider`/`model` per
  message; nothing locks a chat to one agent. v2.6 keeps that.
- **Worktree is shared.** All agents in a chat resolve the same `session_worktrees`
  row, so file state carries across switches — *once applied*. (See the staging
  boundary caveat below.)
- **Each agent resumes its own session.** Switching opencode → boocode → opencode
  reuses opencode's stored `agent_session_id` (its memory intact), not a fresh one.
  Lazy-create on first use of an agent in the chat; resume thereafter.
- **Native boocode is the universal reader.** It rebuilds from the `messages` table,
  so it always sees the full transcript including other agents' turns.
- **Gap turns are NOT auto-replayed** into a resumed agent. When you return to
  opencode, it sees the shared worktree + your new prompt, but did not "hear" the
  boocode/goose turns in between. (A future refinement could inject a short
  "changes since you last ran" preamble; out of scope for v2.6.)
- **Staging-boundary caveat (must be documented in the UI):** external agents edit
  *inside the worktree*; native boocode reads/writes the *project root* via
  `pending_changes`. So unapplied edits do **not** cross between a worktree agent and
  native boocode — file continuity between the two only exists after apply. This is
  an inherent consequence of v2.5's review-before-apply model, not a v2.6 bug.
- **No mid-turn switch.** Per-chat turns are serialized (§5); the agent is fixed for
  the duration of an in-flight turn. The user can switch the picker for the *next*
  turn while one is running, but it won't retarget the running turn.
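
The lazy-create and resume behavior can be modelled with an in-memory stand-in for `agent_sessions`. This is a sketch only: `AgentSessionRegistry`, its key format, and the generated id format are assumptions, and a real implementation would read and write the table instead.

```typescript
type Agent = "opencode" | "goose" | "qwen" | "boocode";

// In-memory stand-in for the agent_sessions table, keyed by (chat, agent).
class AgentSessionRegistry {
  private rows = new Map<string, string>(); // "chatId:agent" -> agent_session_id
  private nextId = 1;

  // Returns the native session id for this turn, or null for native boocode,
  // which keeps no row and rebuilds context from the messages table.
  resolve(chatId: string, agent: Agent): { id: string | null; resumed: boolean } {
    if (agent === "boocode") return { id: null, resumed: false };
    const key = `${chatId}:${agent}`;
    const existing = this.rows.get(key);
    if (existing !== undefined) return { id: existing, resumed: true };
    const created = `${agent}-session-${this.nextId++}`; // placeholder id format
    this.rows.set(key, created);
    return { id: created, resumed: false };
  }
}
```

Resolving opencode, then boocode, then opencode again in the same chat returns the same opencode id on the third call, which is the "per-agent memory" half of the model; nothing here replays the gap turns.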

## 4. Persistent worktree + incremental diff

- **Create** on the first turn of a chat (`createWorktree(projectPath, sessionId)`
  — keyed by chat, not task), capturing project HEAD as `base_commit`. Persist the
  `session_worktrees` row; all agents in the chat share it.
- **Reuse** every subsequent turn — no new worktree, no cleanup between turns.
- **Diff strategy (per turn):** diff the worktree against the stored baseline
  (`base_commit`: project HEAD captured when the worktree was created, then re-captured
  on each apply). Each turn supersedes the prior `pending_changes` row for that session
  (one accumulating unified diff, latest wins), mirroring how the anchored rolling
  summary supersedes itself. This avoids stacking N partial diffs the user must reason
  about; the pending change always reflects the full current delta of the worktree.
- **Apply** merges the worktree delta back to the project (existing `apply_pending`
  path); after apply, re-baseline so the next turn's diff is relative to applied state.
- **Cleanup** on chat close/archive (new hook) and on `dispose()`; removes the
  `session_worktrees` row + all `agent_sessions` rows for the chat. Orphan reaper
  sweeps worktrees with no live `session_worktrees` row (extends the periodic sweeper).
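
The "latest wins" and re-baseline rules can be illustrated with plain data structures. The sketch assumes hypothetical names (`PendingDiffStore`, `WorktreeState`) and treats diffs and commits as opaque strings; it does not run git or touch the database.

```typescript
// In-memory stand-ins; the real state lives in session_worktrees and pending_changes.
interface WorktreeState { baseCommit: string }
interface PendingDiff { sessionId: string; agent: string; diff: string }

class PendingDiffStore {
  private bySession = new Map<string, PendingDiff>();

  // Latest wins: the new accumulated diff replaces the prior row for the session.
  supersede(next: PendingDiff): void {
    this.bySession.set(next.sessionId, next);
  }

  get(sessionId: string): PendingDiff | undefined {
    return this.bySession.get(sessionId);
  }

  // After apply, clear the pending diff and move the baseline to the applied commit.
  apply(sessionId: string, worktree: WorktreeState, appliedCommit: string): void {
    this.bySession.delete(sessionId);
    worktree.baseCommit = appliedCommit;
  }
}
```

Since each stored diff is taken against the baseline rather than against the previous turn, replacing the row loses nothing: the newer diff already contains the older one's changes.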

## 5. Concurrency

Current dispatcher: global `running` boolean → strictly one task at a time.
Target: **per-session serialization, cross-session concurrency.**

- Replace the single `running` flag with a `Map<sessionId, Promise>` in-flight registry.
- `poll()` selects the oldest pending task whose **session has no in-flight turn**, so
  two different chats run concurrently but a chat never has two turns at once (the agent
  holds conversational state — overlapping prompts would corrupt it).
- The LISTEN/NOTIFY `tasks_new` fast path (v2.5.x) already triggers immediate polls;
  the registry replaces the boolean guard there too.
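
A minimal sketch of the selection rule and the in-flight registry follows. The names `PendingTask`, `pickNext`, and `track` are hypothetical, and the real `poll()` would select from the `tasks` table rather than an array.

```typescript
interface PendingTask { id: string; sessionId: string; createdAt: number }

// Oldest pending task whose session has no in-flight turn, or undefined.
function pickNext(pending: PendingTask[], inFlight: Map<string, Promise<void>>): PendingTask | undefined {
  return [...pending]
    .sort((a, b) => a.createdAt - b.createdAt)
    .find((t) => !inFlight.has(t.sessionId));
}

// Registers the turn and removes it once it settles, whether it succeeded or failed.
function track(inFlight: Map<string, Promise<void>>, sessionId: string, run: () => Promise<void>): Promise<void> {
  const turn: Promise<void> = run().finally(() => {
    // Only clear our own entry, in case a later turn was already registered.
    if (inFlight.get(sessionId) === turn) inFlight.delete(sessionId);
  });
  inFlight.set(sessionId, turn);
  return turn;
}
```

With chat `s1` in flight, a second `s1` task is skipped and the next `s2` task is picked, which gives cross-session concurrency without overlapping prompts inside one chat.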

## 6. Lifecycle & failure

- **Lazy spawn:** backend/worktree/agent-session created on first turn for a session.
- **Idle eviction:** the pool evicts a backend/session after an idle TTL (e.g. 30 min);
  the worktree persists (DB-backed). The next turn re-spawns and reattaches via the stored
  `agent_session_id`. opencode persists sessions on disk; an ACP agent falls back to a
  fresh `session/new` if the native id is gone.
- **Crash recovery:** supervise children; on exit mark `agent_sessions.status='crashed'`,
  publish `chat_status='error'`, and rebuild on the next turn. opencode server crash
  takes all opencode sessions down → restart server, recreate sessions.
- **Shutdown drain:** `app.addHook('onClose')` disposes the pool (close opencode server,
  kill warm ACP children) after in-flight turns settle — extends the existing
  dispatcher `stop()`.
- **systemd:** BooCoder already spawns agent children under `NoNewPrivileges`; long-lived
  pool children are fine. Detach pool children from the dispatch scope (the equivalent of
  Go's `context.Background`) so they outlive the dispatch that created them.
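
The eviction and crash rules reduce to two small pure functions. This sketch uses hypothetical names (`PoolEntry`, `selectEvictions`, `markCrashed`) and the 30 minute TTL given above as an example; whether crashed entries are also swept is left open here.

```typescript
type Status = "idle" | "active" | "crashed" | "closed";
interface PoolEntry { chatId: string; agent: string; status: Status; lastActiveAt: number }

const IDLE_TTL_MS = 30 * 60 * 1000; // the 30 min example from the text

// Entries to evict: idle past the TTL. Active turns are never evicted.
function selectEvictions(entries: PoolEntry[], now: number, ttlMs: number = IDLE_TTL_MS): PoolEntry[] {
  return entries.filter((e) => e.status === "idle" && now - e.lastActiveAt >= ttlMs);
}

// Called by the child supervisor when a process exits unexpectedly.
function markCrashed(entry: PoolEntry): PoolEntry {
  return { ...entry, status: "crashed" };
}
```

Each entry carries both `chatId` and `agent`, so eviction is decided per (chat, agent) pair and one idle agent does not evict its neighbours in the same chat.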

## 7. Risks / open questions

- **opencode single-server blast radius:** one crash drops all opencode sessions. Mitigated
  by on-disk session persistence + lazy re-create. Could later shard one server per project
  if it bites.
- **Worktree disk growth:** persistent worktrees per session accumulate; the close-hook +
  orphan reaper must be reliable or disk leaks. Add a max-live-worktrees cap with LRU evict.
- **SDK version coupling:** `@opencode-ai/sdk` is a new workspace dep pinned to the installed
  opencode (1.15.x). Probe-time version check should warn on major drift.
- **Incremental-diff baseline correctness:** re-baselining after apply must handle the user
  editing the project out-of-band; diff vs a stored base commit, not vs a moving target.
- **Reconnect fidelity:** after BooCoder restart, reattaching to a stored opencode session id
  assumes the server (also restarted) still has it on disk — verify the SDK reattach path.
- **Cross-agent staging gap:** worktree agents and native boocode don't see each other's
  *unapplied* edits (worktree vs project root). The UI must make this legible (e.g. show
  which agent staged a pending change) so a switch doesn't look like lost work. A resumed
  agent also won't have heard other agents' in-between turns — acceptable per the decided
  model, but worth a small "N turns by other agents since you last ran" hint later.
- **Per-(chat,agent) session sprawl:** a chat that cycles through many agents accumulates
  warm backends that all share the chat's worktree; idle eviction (§6) must key on
  (chat, agent), and the opencode server's session count is bounded by eviction, not by
  a per-chat limit.
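
For the SDK coupling risk, the probe-time check could be as small as comparing the leading version components. This is a sketch under assumptions: `parseVersion` and `versionDrift` are hypothetical helpers, and how the two version strings are obtained is not specified here.

```typescript
// Parses the leading "major.minor.patch" of a version string; extra suffixes are ignored.
function parseVersion(v: string): [number, number, number] | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(v.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

type Drift = "ok" | "minor" | "major" | "unknown";

// Compares the SDK version against the installed opencode version.
function versionDrift(sdk: string, installed: string): Drift {
  const a = parseVersion(sdk);
  const b = parseVersion(installed);
  if (!a || !b) return "unknown";
  if (a[0] !== b[0]) return "major";
  return a[1] !== b[1] ? "minor" : "ok";
}
```

A caller would warn on `"major"` and could choose to log `"minor"` and `"unknown"` at a lower level.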

## 8. File map (anticipated)

| File | Change |
|------|--------|
| `apps/coder/src/services/agent-pool.ts` | NEW — pool + backend interface |
| `apps/coder/src/services/backends/opencode-server.ts` | NEW — SDK + SSE demux + dedup |
| `apps/coder/src/services/backends/warm-acp.ts` | NEW — persistent ACP connection |
| `apps/coder/src/services/dispatcher.ts` | per-chat concurrency; resolve-or-create shared worktree + per-(chat,agent) backend session; no per-turn teardown |
| `apps/coder/src/services/worktrees.ts` | chat-keyed create; baseline capture; re-baseline-on-apply |
| `apps/coder/src/services/agent-turn-persist.ts` | reused as-is |
| `apps/coder/src/schema.sql` | `session_worktrees` + `agent_sessions` (per (chat,agent)) + `pending_changes.agent` column |
| `apps/coder/src/routes/sessions\|tasks` | chat-close cleanup hook |
| `apps/coder/src/routes/pending.ts` | `agent` on `listPending` response; stamp `agent` in queue paths |
| `apps/coder/src/routes/agent-sessions.ts` | NEW — `GET /api/sessions/:id/agent-sessions` (§9b) |
| `apps/coder/package.json` | add `@opencode-ai/sdk` dep |
| `apps/web/src/components/panes/CoderPane.tsx` | `PendingChange.agent`; DiffPanel badges + staging hint; pass `sessionId` to composer |
| `apps/web/src/components/AgentComposerBar.tsx` | optional `sessionId` prop; resumed/new chip; export `providerIcon` |
| `apps/web/src/hooks/useAgentSessions.ts` | NEW — chat-scoped agent-session fetch |
| `apps/web/src/api/client.ts` | `api.coder.agentSessions(sessionId)` |

## 9. Frontend UX — agent attribution & switch affordances

The switching model (§3a) only works if it's **legible**: the user must see which
agent did what, and whether switching back resumes or starts fresh. This is pure
read-and-display over the new `agent` column and `agent_sessions`, with no
dispatch-logic change.

### 9a. Per-change agent attribution (DiffPanel) — Phase 1
- **Wire:** `listPending` returns the row; add `agent` to the response and to the
  frontend `PendingChange` type (`CoderPane.tsx`, today `{id, file_path, operation, diff?, status}`).
- **UI:** each DiffPanel row gains a small agent badge before the file path — reuse the
  `providerIcon()` switch from `AgentComposerBar` (extract to a shared helper / the new
  `icons/ProviderIcons` module) + the provider label; `agent === null` → a neutral
  "manual" chip. When the pending set spans >1 distinct agent, a one-line header note
  ("Changes from opencode, boocode") makes mixed provenance obvious.
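
The badge label and header note can be derived from the pending set alone. In this sketch the `PendingChange` fields follow the type quoted above plus the new `agent` field; the helper names are hypothetical, and counting "manual" as a distinct source for the header note is an assumption.

```typescript
interface PendingChange { id: string; file_path: string; operation: string; status: string; agent: string | null }

// Label for one row's badge; null means a manual RightRail create.
function badgeLabel(agent: string | null): string {
  return agent ?? "manual";
}

// Header note, shown only when the pending set spans more than one distinct source.
function provenanceNote(changes: PendingChange[]): string | null {
  const labels = Array.from(new Set(changes.map((c) => badgeLabel(c.agent))));
  return labels.length > 1 ? `Changes from ${labels.join(", ")}` : null;
}
```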

### 9b. "Resumed" vs "new session" indicator (AgentComposerBar) — Phase 1
- **API:** `GET /api/sessions/:id/agent-sessions` → `[{ agent, status, has_session, last_active_at }]`
  (reads `agent_sessions` for the chat). Chat-scoped, so it is NOT foldable into the
  project-level provider snapshot.
- **Hook:** `useAgentSessions(sessionId)` — fetch on mount, refetch on `message_complete`
  (same trigger `usePendingChanges` already uses).
- **UI:** a subtle chip right of the Provider picker:
  - current provider has a live row → muted **"resumed"** (title: "Resuming <agent> · last active <relative>").
  - native boocode (never has a row) → **"history"** (it reconstructs from the transcript).
  - otherwise → **"new session"**.
  - Render only when connected and the chat has ≥1 prior turn; hidden on a fresh chat.
  - `AgentComposerBar` gains an optional `sessionId?: string` prop (CoderPane has it);
    absent → render nothing, so BooChat and other callers are unaffected.
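
The chip rules above map onto one pure function. This is a sketch: the row shape mirrors the API response listed in this section, while `chipFor`, the `opts` parameter, and the reading of "live row" as "has a stored session and is neither crashed nor closed" are assumptions.

```typescript
interface AgentSessionRow { agent: string; status: string; has_session: boolean; last_active_at: string | null }
type Chip = "resumed" | "history" | "new session" | null;

function chipFor(
  provider: string,
  rows: AgentSessionRow[],
  opts: { connected: boolean; priorTurns: number; sessionId?: string },
): Chip {
  // Hidden without a sessionId (BooChat and other callers), when disconnected, or on a fresh chat.
  if (!opts.sessionId || !opts.connected || opts.priorTurns < 1) return null;
  if (provider === "boocode") return "history"; // native boocode never has a row
  const row = rows.find((r) => r.agent === provider);
  // "Live" is read here as: a stored native session that is not crashed or closed (an assumption).
  const live = row !== undefined && row.has_session && row.status !== "crashed" && row.status !== "closed";
  return live ? "resumed" : "new session";
}
```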

### 9c. Staging-boundary hint (DiffPanel) — Phase 3 polish
- When the selected provider is **native boocode** and pending changes were staged by a
  **worktree agent** (or vice-versa), show a one-line muted caveat:
  "opencode's edits live in its worktree — boocode won't see them until applied."
  Derived purely from per-change `agent` + current `value.provider`; no new state.
  Keeps the §3a staging caveat from biting silently.
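
A sketch of that derivation follows. The function name and signature are hypothetical, manual changes (`agent === null`) are ignored for simplicity, and the wording of the reverse-direction message is an assumption since only the boocode-selected wording is given above.

```typescript
// Returns the caveat to show, or null when no change crosses the staging boundary.
function stagingHint(selectedProvider: string, stagedBy: Array<string | null>): string | null {
  const isNative = (a: string): boolean => a === "boocode";
  // Manual changes (null) are skipped in this sketch.
  const agents = stagedBy.filter((a): a is string => a !== null);
  // A change crosses the boundary when exactly one side is native boocode.
  const other = agents.find((a) => isNative(a) !== isNative(selectedProvider));
  if (other === undefined) return null;
  return isNative(selectedProvider)
    ? `${other}'s edits live in its worktree; boocode won't see them until applied.`
    : `boocode's edits are staged against the project root; ${selectedProvider} won't see them until applied.`;
}
```

Two worktree agents (for example goose selected, opencode staged) produce no hint, because they share the same worktree and see each other's unapplied edits.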