# v2.6 Persistent agent sessions (warm processes + OpenCode server)

**Status:** Planned
**Depends on:** v2.2 Paseo providers (ACP dispatch), v2.3 provider lifecycle (registry/snapshot)
**Reference forks:** `/opt/forks/paseo`, `/opt/forks/opencode`
**Pairs with:** the v2.5.x MessageBubble "Thinking" render fix — reasoning already flows; this batch is about persistence, not capability.

## Why

BooCode dispatches external agents (opencode, goose, qwen) **one-shot per task**:
for each task the dispatcher creates a fresh worktree (`createWorktree(projectPath, taskId)`),
spawns `opencode acp` / `goose acp` / `qwen --acp`, runs **one** turn, then tears
down both the process *and* the worktree (`dispatcher.ts:runExternalAgent`). Consequences:

- **No session continuity.** A follow-up message in the same chat creates a new
  task with a new worktree and a new agent process. The agent has no memory of
  the prior turn beyond what BooCode replays as chat history, and it cannot see
  the files it edited last turn (fresh worktree every time).
- **Cold start every turn.** Each turn pays the process spawn + ACP `initialize`
  handshake (and, for some agents, model load) before any work happens.
- **Diverges from Paseo.** Paseo runs **OpenCode as a long-lived HTTP server**
  (`opencode serve` + `@opencode-ai/sdk`, SSE `/event` stream) and keeps **goose /
  qwen as warm stdio-ACP processes** (`SpawnedACPProcess`: one ACP connection,
  `newSession()` once, many `prompt()`s). BooCode rebuilds the world per turn.

This batch makes a BooCode chat map to a **persistent agent backend + a persistent
worktree** that live for the whole conversation, so turns are warm and the agent
sees its own accumulating edits. Reasoning passthrough is **already solved** (ACP
`agent_thought_chunk` → `reasoning_delta` → the new MessageBubble Thinking block);
this batch does not touch it beyond porting Paseo's reasoning dedup for OpenCode.

## Decisions locked (from design review)

- **Worktree model:** *Persistent worktree per session.* A chat owns one worktree
  for the whole conversation; each turn the agent sees prior edits; `pending_changes`
  accumulate; the worktree is cleaned on session close, not per turn.
- **Agent switching:** *Free switch, per-agent memory.* The picker stays per-turn
  (not locked to a chat). The worktree is shared across agents; each agent keeps its
  own backend session, which is resumed when you switch back to it. Native boocode
  reconstructs its context from chat history (so it sees every agent's turns); a
  resumed agent does not auto-ingest the turns it missed while another agent was
  active. Data model: one shared worktree per chat + one backend session per
  `(chat, agent)` pair. Caveat: unapplied edits don't cross the worktree↔project
  boundary between external agents and native boocode (a v2.5 review-model consequence).
- **Transport per agent (matches Paseo exactly):**
  - **OpenCode** → one shared `opencode serve` HTTP server, driven via
    `@opencode-ai/sdk`; one opencode *session* per BooCode chat (multi-session,
    directory-routed via `x-opencode-directory`).
  - **Goose / Qwen** → warm **stdio** ACP process per live session. Their HTTP
    "server" modes are just ACP-over-HTTP wrappers (goose: undocumented/internal;
    qwen `serve`: an HTTP bridge around a single `qwen --acp` child) — no gain over
    stdio, so we keep stdio ACP like Paseo does.
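
The data model in the agent-switching decision (one shared worktree per chat, one backend session per `(chat, agent)` pair) can be sketched in memory as follows. This is a minimal illustration only: `SessionRegistry`, `ResolvedTurn`, and the two injected callbacks are hypothetical names, not BooCode's API, and the actual design persists this mapping in the `agent_sessions` table.

```typescript
// Hypothetical in-memory model; the proposal stores this in `agent_sessions`.
// Native boocode is omitted because it rebuilds context from chat history
// and therefore has no backend session to resume.
type AgentId = "opencode" | "goose" | "qwen";

interface ChatState {
  worktreePath: string;                  // one shared worktree per chat
  backendSessions: Map<AgentId, string>; // one backend session id per (chat, agent)
}

interface ResolvedTurn {
  worktreePath: string;
  backendSessionId: string;
  resumed: boolean; // true when the agent picks up an earlier session
}

class SessionRegistry {
  private chats = new Map<string, ChatState>();
  private createWorktree: (chatId: string) => string;
  private newBackendSession: (agent: AgentId, worktreePath: string) => string;

  constructor(
    createWorktree: (chatId: string) => string,
    newBackendSession: (agent: AgentId, worktreePath: string) => string,
  ) {
    this.createWorktree = createWorktree;
    this.newBackendSession = newBackendSession;
  }

  // Called once per turn with whichever agent the picker selected.
  resolve(chatId: string, agent: AgentId): ResolvedTurn {
    let chat = this.chats.get(chatId);
    if (!chat) {
      // First turn of the chat: create the worktree exactly once.
      chat = {
        worktreePath: this.createWorktree(chatId),
        backendSessions: new Map<AgentId, string>(),
      };
      this.chats.set(chatId, chat);
    }
    const existing = chat.backendSessions.get(agent);
    if (existing !== undefined) {
      return { worktreePath: chat.worktreePath, backendSessionId: existing, resumed: true };
    }
    const created = this.newBackendSession(agent, chat.worktreePath);
    chat.backendSessions.set(agent, created);
    return { worktreePath: chat.worktreePath, backendSessionId: created, resumed: false };
  }

  // Session close: forget the chat and return the worktree path to clean up.
  close(chatId: string): string | undefined {
    const chat = this.chats.get(chatId);
    this.chats.delete(chatId);
    return chat?.worktreePath;
  }
}
```

Switching opencode → goose → opencode in one chat would call `resolve` three times; the third call returns the first call's `backendSessionId` with `resumed: true`, and all three share one `worktreePath`.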

## Scope

### In scope

1. **Agent process pool** (`apps/coder/src/services/agent-pool.ts`) — owns long-lived
   backends, lazy spawn, idle eviction, crash restart, shutdown drain.
2. **OpenCode server backend** — spawn `opencode serve`, hold SDK client + single
   SSE subscription demuxed by opencode `sessionID` → BooCode session; port +
   `OPENCODE_SERVER_PASSWORD` managed at boot.
3. **Warm ACP backend** — persistent `SpawnedACPProcess`-style connection for
   goose/qwen reused across turns (one `newSession()`, many prompts).
4. **Persistent worktree lifecycle** — worktree created on first turn of a session,
   reused, diffed incrementally into `pending_changes`, cleaned on session close.
5. **Session ↔ backend ↔ worktree mapping** — new `agent_sessions` table.
6. **Per-session concurrency** — replace the dispatcher's global single-flight
   `running` guard with per-session serialization (different sessions run
   concurrently; one turn at a time within a session).
7. **OpenCode reasoning dedup** — port Paseo's `streamedPartKeys` partID dedup so
   reasoning isn't double-emitted (delta + final part).
8. **Switch-aware UI** (design §9) — per-change agent attribution in the DiffPanel
   (`pending_changes.agent` column + badges), a resumed/new-session chip on the
   AgentComposerBar (chat-scoped `agent-sessions` endpoint), and a staging-boundary
   hint so the worktree↔project gap is legible.
9. **Tests + smoke** — pool lifecycle unit tests; multi-turn opencode smoke; switch
   round-trip smoke; attribution/indicator smoke.
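
As an illustration of item 1, the lazy-spawn, idle-eviction, and shutdown-drain behaviours could look roughly like the sketch below. Everything here is an assumption for illustration: the `Backend` interface, the `AgentPool` class, its method names, and the injected clock are hypothetical, and crash restart is deliberately left out.

```typescript
// Hypothetical pool sketch; not the contents of agent-pool.ts.
interface Backend {
  stop(): void;
}

interface PoolEntry<B> {
  backend: B;
  lastUsed: number; // timestamp in ms from the injected clock
}

class AgentPool<B extends Backend> {
  private entries = new Map<string, PoolEntry<B>>();
  private spawn: (key: string) => B;
  private idleMs: number;
  private now: () => number;

  constructor(spawn: (key: string) => B, idleMs: number, now: () => number = Date.now) {
    this.spawn = spawn;
    this.idleMs = idleMs;
    this.now = now;
  }

  // Lazy spawn: the backend is created on first use and reused afterwards.
  acquire(key: string): B {
    let entry = this.entries.get(key);
    if (!entry) {
      entry = { backend: this.spawn(key), lastUsed: this.now() };
      this.entries.set(key, entry);
    }
    entry.lastUsed = this.now();
    return entry.backend;
  }

  // Idle eviction: stop and drop backends unused for longer than idleMs.
  evictIdle(): string[] {
    const cutoff = this.now() - this.idleMs;
    const evicted: string[] = [];
    this.entries.forEach((entry, key) => {
      if (entry.lastUsed < cutoff) {
        entry.backend.stop();
        this.entries.delete(key);
        evicted.push(key);
      }
    });
    return evicted;
  }

  // Shutdown drain: stop every backend and empty the pool.
  drain(): void {
    this.entries.forEach((entry) => entry.backend.stop());
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Injecting the clock keeps the eviction logic testable without timers; a real pool would presumably call `evictIdle()` on an interval and skip backends that have a turn in flight.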

### Out of scope (this batch)

- Claude PTY→structured transport (separate deferred work — claude stays PTY here).
- Goose/qwen HTTP server modes (intentionally not used).
- Frontend redesign — existing CoderPane multi-turn chat UI already supports
  follow-ups; only backend continuity changes.
- Replacing `acp-dispatch.ts` wholesale — warm backend reuses its event handlers.
- Cross-host agent servers (opencode server stays local to the BooCoder host).

## Non-goals

- Multi-user session sharing (single-user homelab).
- Multiple concurrent turns within one agent session (the agent holds conversational
  state; turns within a session are serialized).
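
Per-session serialization (different sessions concurrent, one turn at a time within a session) is commonly done with a promise chain per session key. The sketch below shows that technique under assumed names: `SessionTurnQueue` and `Turn` are hypothetical and do not describe the dispatcher's actual code.

```typescript
// Hypothetical per-session turn serializer using one promise chain per session.
type Turn<T> = () => Promise<T>;

class SessionTurnQueue {
  // Tail of the promise chain for each session that still has queued work.
  private tails = new Map<string, Promise<void>>();

  run<T>(sessionId: string, turn: Turn<T>): Promise<T> {
    const prev = this.tails.get(sessionId) ?? Promise.resolve();
    // Start this turn only after the previous turn in the same session settles.
    const result = prev.then(turn);
    // The stored tail never rejects, so a failed turn does not block later ones.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionId, tail);
    // Drop the entry once this session's queue is empty.
    void tail.then(() => {
      if (this.tails.get(sessionId) === tail) this.tails.delete(sessionId);
    });
    return result;
  }

  get activeSessions(): number {
    return this.tails.size;
  }
}
```

Two `run` calls with the same `sessionId` execute strictly one after the other, while calls with different ids interleave freely, which replaces a global single-flight guard with a per-session one.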

## Success criteria

- Send two messages in one external-agent chat → second turn reuses the same agent
  session **and** the same worktree (verified: no second `createWorktree`, agent
  references files it edited in turn 1).
- Warm-start latency for turn 2 materially below turn 1 (no spawn/handshake).
- opencode reasoning shows once per thought (no dupes) in the Thinking block.
- Killing the opencode server mid-session → pool restarts it and the next turn
  recovers (opencode persists sessions on disk).
- Switch opencode → boocode → opencode in one chat → opencode resumes its *same*
  session (its memory intact), boocode saw opencode's turns as history, and all three
  turns shared the one worktree. No agent is locked to the chat.
- Closing/archiving a session removes its worktree; BooCoder restart drains cleanly.
- Existing one-shot paths (arena, `new_task` tool, MCP create-task) still work.
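
The "reasoning shows once per thought" criterion depends on suppressing the final reasoning part when its deltas have already been streamed. A minimal sketch of part-ID dedup is shown below. The event shape and the `ReasoningDeduper` class are assumptions for illustration; they are not the OpenCode SDK's event types, nor a copy of Paseo's `streamedPartKeys` code.

```typescript
// Hypothetical event shape; real OpenCode SSE events carry more fields.
interface ReasoningEvent {
  kind: "delta" | "final";
  sessionId: string;
  partId: string;
  text: string;
}

class ReasoningDeduper {
  // Keys of reasoning parts that have already produced output.
  private streamedPartKeys = new Set<string>();

  // Returns the text to forward as a reasoning delta, or null to suppress it.
  accept(ev: ReasoningEvent): string | null {
    const key = `${ev.sessionId}:${ev.partId}`;
    const alreadyStreamed = this.streamedPartKeys.has(key);
    this.streamedPartKeys.add(key);
    if (ev.kind === "delta") return ev.text;
    // Final part: emit the full text only if nothing was streamed for it.
    return alreadyStreamed ? null : ev.text;
  }

  // Forget a session's keys when the session closes.
  clearSession(sessionId: string): void {
    const prefix = `${sessionId}:`;
    this.streamedPartKeys.forEach((key) => {
      if (key.startsWith(prefix)) this.streamedPartKeys.delete(key);
    });
  }
}
```

A part that arrives only as a final event (no deltas) is still emitted once, so non-streaming responses keep their reasoning.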

## Deliverables

| Doc | Purpose |
|-----|---------|
| [`design.md`](./design.md) | Architecture, backends, data model, worktree/diff strategy, lifecycle, risks |
| [`tasks.md`](./tasks.md) | Phased implementation checklist |