# Deferred work — post stale cleanup (2026-05-26) This document describes work intentionally **not** shipped in the 2026-05-26 stale/simplify batch. Each item needs a product or architecture decision before implementation. See also [`STALE-DEPRECATED.md`](./STALE-DEPRECATED.md) for what was resolved in that batch. Last updated: 2026-05-29 --- ## Summary | Item | Category | User impact | Effort | Risk if left alone | |------|----------|-------------|--------|-------------------| | Task cancel → abort ACP/PTY child | Correctness / UX | High — Stop does not kill external agents | Medium | Zombie processes, stuck `running` tasks, orphaned worktrees | | Skip ACP cold probe when DB fresh | Performance | Medium — composer open can stall 5–30s on cache miss | ✅ Shipped (v2.3, Phase 2) | Resolved — `PROVIDER_PROBE_TTL_MS` TTL gate live | | Unified `packages/types` | Maintainability | Low (dev-only) | Medium–High | Type drift between server, coder, web | | Large file splits | Maintainability | None directly | Medium per file | Harder reviews, merge conflicts | | Retire `apps/coder/web/` fallback SPA | Scope / ops | Low — Sam uses CoderPane | Medium | Dual UI maintenance, divergent API client | | `console.debug` in xml-parser placeholder rejection | Maintainability | None (logs only) | Trivial | Placeholder rejections miss pino pipeline / level filters | --- ## 1. Task cancel → abort external ACP/PTY child ### Current behavior External agent tasks (opencode, cursor, claude, qwen, goose, etc.) flow through `apps/coder/src/services/dispatcher.ts` path B: 1. Create git worktree (`worktrees.ts` → local `hostExec`) 2. Spawn ACP or PTY child with an **`AbortController`** (`ac`) passed as `signal` 3. Stream assistant output over session WS 4. Diff worktree → `pending_changes` 5. Cleanup worktree The abort signal is **created per task** but **never registered anywhere cancel can reach**: ```typescript // dispatcher.ts — ac is local to dispatchExternal(); no Map const ac = new AbortController(); // ... await dispatchViaAcp({ ..., signal: ac.signal }); // or await dispatchViaPty({ ..., signal: ac.signal }); ``` Cancel paths today: | Route | What it does | Gap | |-------|--------------|-----| | `POST /api/tasks/:id/cancel` | Sets DB `state = 'cancelled'`, calls `cancelPendingPermission`, tries `inference.cancel()` on session chats | **`inference.cancel` only affects native boocode streaming** — not ACP/PTY children | | `POST /api/sessions/:sessionId/stop` | Same — loops open chats, calls `inference.cancel` | Same gap for external tasks tied to that session | | Dispatcher `stop()` on shutdown | Sets `stopping = true`; in-flight external task may finish or mark cancelled at end | No immediate child kill | `dispatchViaAcp` and `dispatchViaPty` **do** honor `signal` when aborted (worktree ops, stream read, spawn teardown). The wiring from HTTP cancel → `ac.abort()` is missing. There is also **no frontend** calling task cancel today (`grep` across `apps/web` finds no `cancelTask` usage). Stop in CoderPane targets native inference only. ### Symptoms users would see - Click Stop / cancel task → DB says `cancelled` but agent process keeps running on host - Assistant row may stay `streaming` until the child exits naturally - Worktree under `/tmp/booworktrees/` may linger until dispatcher cleanup or manual removal - Permission prompts cancel correctly (`cancelPendingPermission`) — that part works ### Proposed implementation **Phase A — backend registry (minimum viable)** 1. Add `Map` (or `{ ac, childPid? }`) in dispatcher module scope 2. On `dispatchExternal` start: `activeAbort.set(taskId, ac)` 3. On task completion/failure/cleanup: `activeAbort.delete(taskId)` 4. Export `cancelExternalTask(taskId: string): boolean` from dispatcher 5. In `POST /api/tasks/:id/cancel`: - Call `cancelExternalTask(taskId)` **before** or **instead of** `inference.cancel` when task is external (`execution_path` or agent !== boocode) - Mark assistant message `cancelled` / `failed` and publish `message_complete` + `chat_status: idle` on session WS 6. Wire `POST /api/sessions/:sessionId/stop` to find **running external tasks** for that session and abort them too **Phase B — child process kill** - PTY: store `child` ref from `spawn`; on abort, `child.kill('SIGTERM')` then SIGKILL after timeout - ACP: `acp-dispatch` already uses signal; ensure `createAcpNdJsonStream` cancel kills the underlying spawn (verify in `acp-stream.ts`) - Worktree: on abort mid-flight, call `cleanupWorktree` in `finally` block **Phase C — frontend** - CoderPane Stop button: if `provider !== 'boocode'` and a `task_id` is active, call `POST /api/coder/tasks/:id/cancel` (needs API client method) - Show cancelled state in UI (task row or composer status) ### Files likely touched - `apps/coder/src/services/dispatcher.ts` — registry, export cancel - `apps/coder/src/routes/tasks.ts` — call dispatcher cancel - `apps/coder/src/routes/messages.ts` — session stop for external tasks - `apps/coder/src/services/acp-dispatch.ts`, `pty-dispatch.ts` — verify signal → kill - `apps/web/src/api/client.ts`, `CoderPane.tsx` — Stop wiring ### Acceptance criteria - Cancel a running opencode/cursor task → child process gone within 5s (`ps` / `pgrep` on host) - Task row `state = 'cancelled'`, assistant message not left `streaming` - Worktree directory removed (best-effort) - Native boocode cancel unchanged - No regression on permission-denied / blocked flows ### Open questions - Should cancel be **best-effort** (mark cancelled even if kill fails) or **fail closed** until process dies? - Arena parallel tasks: cancel one contestant vs whole arena? - Blocked task waiting on permission: cancel already resolves waiter — confirm ACP session teardown order --- ## 2. Skip ACP cold probe when DB models are fresh **Status:** ✅ **ADDRESSED** in v2.3 (phases 1–5: `v2.5.4-provider-lifecycle-phase1` … `v2.5.12-provider-lifecycle-phase4`, plus the phase-5 settings UI + picker filter). The `PROVIDER_PROBE_TTL_MS` (default 24h) gate on `available_agents.last_probed_at` is live — the tier-2 cold ACP probe runs only on `force` (`POST /api/providers/refresh`), TTL staleness, or empty DB models; otherwise the snapshot serves cached models. See [`openspec/changes/v2-3-provider-lifecycle/`](../openspec/changes/v2-3-provider-lifecycle/proposal.md). The original (v2.2) behavior below is kept for history. ### Current behavior (v2.2) `apps/coder/src/services/provider-snapshot.ts` on cache miss: 1. Reads `available_agents.models` as a **fallback** when building each entry 2. Still **cold-probes every installed ACP provider** on every rebuild (`probeAcpProvider` in `buildProviderEntry`) — DB models do not skip the probe 3. In-memory snapshot cache TTL is **5 minutes** (`CACHE_TTL_MS`); opening `AgentComposerBar` calls `GET /api/providers/snapshot` via `useProviderSnapshot` 4. `POST /api/providers/refresh` clears cache and forces a full rebuild (all probes again) 5. Uninstalled agents are **omitted** from the snapshot (`return null`) — not listed as `unavailable` 6. No config-file enable/disable; providers are hardcoded in `provider-registry.ts` `persistProbedModels()` writes probe results back to `available_agents` (including `last_probed_at`), but nothing reads `last_probed_at` to skip tier-2 probes yet. There is no `PROVIDER_PROBE_TTL_MS` env var. ### Planned behavior (v2.3) See [`design.md`](../openspec/changes/v2-3-provider-lifecycle/design.md): - Tier-1 fast binary check + tier-2 ACP session only when stale or explicitly refreshed - `PROVIDER_PROBE_TTL_MS` (default 24h) gate on `available_agents.last_probed_at` - Return `loading` synchronously on cache miss; complete via inflight promise - Always list registered providers (`ready` | `unavailable` | `error` | `loading`); respect `enabled` from `/data/coder-providers.json` - Cold probe only on `POST /api/providers/refresh` (or TTL expiry), not on every composer open ### Why deferred past v2.2 v2.2 shipped the snapshot wire shape and ACP dispatch stack. Lifecycle semantics (config registry, enable/disable, probe TTL, settings UI) were scoped as the follow-on **v2.3** batch to avoid mixing two large behavior changes in one tag. ### Acceptance criteria — met - Second `GET /api/providers/snapshot` within TTL does not invoke `probeAcpProvider` (mock assert in `provider-snapshot.test.ts`) - Disabled provider visible in settings (Providers tab), absent from composer - Explicit refresh repopulates models; warm open is sub-second ### Still deferred (Tier-2 follow-ups, not shipped in v2.3) These were explicitly scoped out of v2.3 (see `design.md` §11) and remain open: - **`provider_snapshot_updated` WS frame** — the loading state uses a capped client poll / one-shot refetch instead of a server-pushed frame (design §4.4, §11; tasks O.1). - **`available_agents.enabled` DB column** — `enabled` is read from the in-memory resolved registry only; no DB mirror, so settings state after a coder restart re-derives from the JSON config rather than the DB (design §3.3; tasks O.2). - **Single-source-of-truth shared types package** — the provider snapshot types are duplicated across `apps/coder/.../provider-types.ts` and `apps/web/src/api/types.ts`, guarded by the text-identity `provider-types-parity.test.ts` rather than a shared package (see §3 below). - **MCP `list_providers` / `inspect_provider` tools** — provider introspection over MCP is not wired (design §11). --- ## 3. Unified `packages/types` for provider snapshot JSON ### Current behavior Provider snapshot shapes are **duplicated** (not byte-identical exports): | Location | Types | |----------|-------| | `apps/coder/src/services/provider-types.ts` | `ProviderSnapshotEntry`, `AgentCommand`, `ProviderModel`, … | | `apps/web/src/api/types.ts` | Same names, hand-maintained | | WS frames | `agent_commands` frame in `ws-frames.ts` (server + web copy) | Today drift is caught by: - TypeScript at compile time when fields are added to one side only - `provider-snapshot.test.ts` on coder - Manual review during v2.2 batch There is **no** `packages/` workspace yet (monorepo is `apps/server`, `apps/web`, `apps/coder`, `apps/booterm` only). ### Options **A. Zod schema + inferred types (lighter)** - Add `ProviderSnapshotEntrySchema` in server or coder - Web imports schema for runtime validation on fetch (optional) - Parity test: `expect(webTypeKeys).toEqual(zodShapeKeys)` — similar to existing `ws-frames.test.ts` **B. Shared `packages/types` package** ``` packages/types/ package.json # @boocode/types src/provider.ts # interfaces + zod src/ws-frames.ts # move duplicated unions here ``` - `apps/coder`, `apps/web`, `apps/server` depend on `workspace:*` - Requires NodeNext export map, build order (`tsc` for types package first) - Biggest payoff if WS frames + provider + task types all move **C. Status quo + discipline** - Keep duplicated TS interfaces - Add one test file importing both sides and asserting structural equality ### Tradeoffs | Approach | Setup cost | Runtime safety | Best when | |----------|------------|----------------|-----------| | A — Zod in coder | Low | Validate API responses | Snapshot is coder-only API | | B — packages/types | High | Full shared source | Many cross-app types growing | | C — parity test | Lowest | Compile-time only | Rare schema changes | ### Recommendation Start with **A or C** unless planning a broader “shared types” initiative (tasks, permissions, arena). Full `packages/types` is justified when a third consumer appears or WS frame duplication becomes painful again. ### Files likely touched (option B) - New `packages/types/` - Root `pnpm-workspace.yaml` - `apps/web/tsconfig`, `apps/coder/tsconfig` — path references - Strip duplicate blocks from `apps/web/src/api/types.ts` --- ## 4. Large file splits (refactor when touched) Not user-facing defects — deferred to avoid scope creep in the stale batch. Split when the next feature touches the file heavily. ### 4.1 `CoderPane.tsx` (~663 lines) **Responsibilities today (single component):** - Session WS + polling fallback for messages - Pending changes fetch/apply/discard - External vs native send paths (`AgentComposerBar`, slash commands, skill invoke) - Permission card + agent commands hint - Provider snapshot–driven composer state **Suggested extractions:** | Hook / module | Owns | |---------------|------| | `useCoderMessages(sessionId)` | WS subscribe, poll-when-disconnected, temp-id dedup | | `usePendingChanges(sessionId)` | List, apply, discard, diff preview trigger | | `CoderComposerFooter` | AgentComposerBar + hints + permission card layout | **Trigger to do it:** next CoderPane feature (e.g. task cancel UI, multi-task inbox) or when file exceeds ~800 lines. ### 4.2 `ChatInput.tsx` (~669 lines) **Responsibilities:** attachments, drag/drop, paste-as-file, `@` mentions, slash picker, agent picker row, context bar, send/submit, registry for send-to-chat. **Suggested extractions:** | Hook / module | Owns | |---------------|------| | `useSlashCommandInput(...)` | `slashState`, `isSlashCommandToken`, picker open/close | | `useFileMention(...)` | `@` detection, file index fetch, popover state | | `useAttachments(...)` | chips, drop, paste, MAX_ATTACHMENTS | Slash helpers already live in `lib/slash-command.ts`; hooks would consume them. ### 4.3 `dispatcher.ts` (~483 lines) **Two paths in one file:** - Path A: native boocode — enqueue inference, wait for message completion - Path B: external — worktree, ACP/PTY, diff, pending_changes **Suggested split:** - `dispatcher-native.ts` — poll loop pick-up for `provider = boocode` or no agent - `dispatcher-external.ts` — path B + abort registry (pairs with item §1) - `dispatcher.ts` — thin poll loop + routing ### 4.4 `AgentComposerBar.tsx` (~323 lines) Optional split: **provider/model/mode/thinking pickers** vs **persistence** (`AgentSessionConfig` localStorage/session). Lower priority — file is manageable. ### 4.5 `acp-dispatch.ts` (~294 lines) Stream setup already extracted to `acp-stream.ts`. Remaining split candidate: **session lifecycle** (create → prompt loop → teardown) vs **permission/tool side effects**. ### General rule Split only when: 1. A new feature needs a clear seam, or 2. Review feedback repeatedly hits the same file, or 3. Tests need isolated units (e.g. dispatcher external path) Avoid drive-by splits — they churn blame without shipping user value. --- ## 5. Retire `apps/coder/web/` fallback SPA ### What it is Standalone Vite React app (`@boocode/coder-web`) built into `apps/coder/web/dist/` and served by BooCoder Fastify at `:9502` when no BooChat proxy is in front. **Primary UI:** `CoderPane` inside BooChat SPA (`apps/web`) — API via `/api/coder/*` proxy, WS to `:9502`. **Fallback UI:** direct visit to `http://100.114.205.53:9502` — own `api/client.ts`, `ChatPane`, `DiffPane`, duplicated types. ### Why it was kept - Host-only debugging without full BooChat stack - Historical path before CoderPane integration - Zero dependency on Authelia-protected BooChat origin ### Cost of keeping - Separate build step in coder deploy - Duplicate message/stream types and API paths - Features land in CoderPane first; fallback rots unless manually updated ### Options | Option | When to choose | |--------|----------------| | **Keep** | Still use `:9502` directly for debugging or demos | | **Freeze** | Stop feature work; build only on release for emergency access | | **Remove** | Always use BooChat; `:9502` serves health + WS + API only (or minimal static “open in BooChat” page) | ### Removal checklist (if chosen) 1. Confirm Sam never uses standalone UI (bookmarks, systemd docs, BOOCODER.md) 2. Remove `apps/coder/web/` package and static serve from `apps/coder/src/index.ts` 3. Update Dockerfile/coder build scripts 4. Keep WS + REST routes — CoderPane depends on them 5. Optional: single-page static “Use code.indifferentketchup.com” at `/` --- ## 6. xml-parser placeholder rejection — structured logging (v2.2.2 cleanup) **Shipped (uncommitted deploy):** `extractToolCallBlocks` rejects placeholder XML tool args at parse time; dropped blocks append to `flushed`. **Nit:** rejection path uses `console.debug` instead of the Fastify/pino `log.debug({ toolName, args }, '…')` pattern used elsewhere. Cosmetic — behavior is correct; logs won't appear in the usual structured pipeline or respect `LOG_LEVEL`. **Fix:** pass an optional logger into `extractToolCallBlocks` from `stream-phase.ts` executeStreamPhase (one call site), or use a module-level debug hook. Target tag: **v2.2.2** cleanup batch, not a blocker. --- ## Suggested batch ordering If picking these up as openspec batches: 1. **Task cancel abort** — highest correctness gap; unblocks honest Stop button in CoderPane 2. **v2.3 provider lifecycle** — probe TTL, config registry, enable/disable (includes §2 above) 3. **CoderPane hook extraction** — natural follow-on when adding cancel UI 4. **Zod parity or packages/types** — when next WS/provider field is added 5. **Retire coder/web** — only after explicit “I don’t use :9502 UI” confirmation 6. **v2.2.2 xml-parser log uniformity** — `console.debug` → pino (§6) --- ## Related docs - [`STALE-DEPRECATED.md`](./STALE-DEPRECATED.md) — resolved stale items - [`ARCHITECTURE.md`](./ARCHITECTURE.md) — BooChat / BooCoder surfaces - [`openspec/changes/archived/v2.2-paseo-providers.md`](../openspec/changes/archived/v2.2-paseo-providers.md) — shipped v2.2 provider snapshot API - [`openspec/changes/v2-3-provider-lifecycle/`](../openspec/changes/v2-3-provider-lifecycle/) — planned probe lifecycle + config registry (§2) - [`BOOCODER.md`](../BOOCODER.md) — dispatch, worktrees, pending changes