diff --git a/CHANGELOG.md b/CHANGELOG.md index 673ca07..7fbf070 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,10 @@ All notable changes per release tag. Most recent on top, ordered by tag creation date (which matches the git history). Tag names follow `vMAJOR.MINOR.PATCH-slug` — the slug describes what shipped, so the tag name alone is enough to recall the batch. +## v2.6.11-close-hooks-staging — 2026-06-01 + +The two v2.6 follow-ups left after `v2.6.10-lifecycle-hardening`. **Server close-hook caller:** `apps/server` (BooChat) now fire-and-forgets BooCoder's Phase-3 close hooks so warm agent backends + worktrees tear down *immediately* on delete/archive instead of waiting for the idle-evict/reaper backstop — a new `coder-notify.ts` `notifyCoderClose(kind,id)` (reusing the v2.6.2 `BOOCODER_URL` reach, never-rejects) is `void`-called after the WS frame at session-delete (`POST /api/sessions/:id/close`) and chat archive / archive-all / delete (`POST /api/chats/:id/close`); an unreachable coder can never block or fail the user's delete/archive. **Staging-boundary hint (task 3.7):** the BooCoder DiffPanel now shows a muted one-liner when the selected provider can't see another agent's unapplied worktree edits — native boocode selected + external-agent-staged changes (or vice-versa) → "<agent>'s edits live in its worktree — BooCode won't see them until applied" — derived purely from the per-change `agent` + current provider, no new state. 6 new server tests (`coder-notify`), 537 server tests pass; web + server tsc/build clean. **With these the v2.6 openspec is fully closed** — only the live Smoke 2/2b/3 remain (manual exercise). + ## v2.6.10-lifecycle-hardening — 2026-06-01 v2.6 Phase 3 (the last phase) — lifecycle hardening of the warm-process backends. 
**Idle eviction + LRU cap:** the agent pool runs a 60s sweep that evicts backends/sessions idle past `AGENT_POOL_IDLE_TTL_MS` (30 min default) and any beyond `AGENT_POOL_MAX_LIVE` (10, LRU) — **never a busy one** (in-flight turn, double-checked via a new `isBusy()` backend hook); the worktree persists (DB-backed) and the next turn re-spawns + reattaches. The eviction/LRU/restart decisions are factored into a pure `lifecycle-decisions.ts` (modeled on the inference `selectPruneTargets` pattern). **Crash recovery:** lifts openchamber's health-monitor + busy-aware-restart + consecutive-failure + stale-busy-grace state machine into `opencode-server.ts` (with port reclaim) and `warm-acp.ts` — an opencode server crash settles in-flight turns as failed, marks the rows `crashed`, and recreates fresh sessions (a fresh server can't hold the old in-memory id), while a warm-ACP child crash re-`session/new`s next turn; the F.1 turn-guard and U.6 usage are preserved (their tests still pass). **Worktree reaper:** a periodic reaper removes orphan on-disk worktrees (no live `worktrees` row, 1h grace) behind a superset-style preflight that skips dirty/unpushed/unmerged work, with Paseo-style soft-delete (`status='archived'`). Plus close hooks (`/api/chats/:id/close`, `/api/sessions/:id/close`, awaiting the apps/server caller) and diff re-baseline after `apply_pending`. Built test-first — 35 new tests (`lifecycle-decisions` 22, `agent-pool` 13) + a DB-opt-in reconnect integration test; 215 coder tests pass, tsc + build clean. **This completes v2.6** (Phase 0–3 + F.1 + Phase 1-UX). Remaining follow-ups (out of v2.6 scope): the apps/server close-hook caller, the 3.7 DiffPanel staging-boundary hint (frontend), and live Smoke 2/2b/3. 
diff --git a/apps/server/src/routes/chats.ts b/apps/server/src/routes/chats.ts index ad6bc5c..c39d210 100644 --- a/apps/server/src/routes/chats.ts +++ b/apps/server/src/routes/chats.ts @@ -4,6 +4,7 @@ import type { Sql } from '../db.js'; import type { Broker } from '../services/broker.js'; import type { Chat, Message } from '../types/api.js'; import { getModelContext } from '../services/model-context.js'; +import { notifyCoderClose } from '../services/coder-notify.js'; const CreateBody = z.object({ name: z.string().min(1).max(200).optional(), @@ -167,6 +168,9 @@ export function registerChatRoutes( chat_id: id, session_id: req.params.id, }); + // Fire-and-forget per archived chat: tear down its warm agent backends + // on the coder. Best-effort — never blocks/fails the bulk archive. + void notifyCoderClose('chat', id, req.log); } return { archived: ids.length, ids }; } @@ -208,6 +212,9 @@ export function registerChatRoutes( chat_id: row.id, session_id: row.session_id, }); + // Fire-and-forget: tear down this chat's warm agent backends + (last-chat) + // worktree on the coder. Best-effort — never blocks/fails the archive. + void notifyCoderClose('chat', row.id, req.log); reply.code(204); return null; } @@ -248,6 +255,9 @@ export function registerChatRoutes( chat_id: row.id, session_id: row.session_id, }); + // Fire-and-forget: tear down this chat's warm agent backends + (last-chat) + // worktree on the coder. Best-effort — never blocks/fails the delete. 
+ void notifyCoderClose('chat', row.id, req.log); reply.code(204); return null; } diff --git a/apps/server/src/routes/sessions.ts b/apps/server/src/routes/sessions.ts index dbc24f3..d7a2e4b 100644 --- a/apps/server/src/routes/sessions.ts +++ b/apps/server/src/routes/sessions.ts @@ -5,6 +5,7 @@ import type { Config } from '../config.js'; import type { Broker } from '../services/broker.js'; import type { Session, WorktreeRiskReport } from '../types/api.js'; import { getSetting } from './settings.js'; +import { notifyCoderClose } from '../services/coder-notify.js'; const CreateBody = z.object({ name: z.string().min(1).max(200).optional(), @@ -513,6 +514,10 @@ export function registerSessionRoutes( } const project_id = deleted[0]!.project_id; broker.publishUserFrame('default', { type: 'session_deleted', session_id: id, project_id }); + // Fire-and-forget: ask BooCoder to tear down this session's warm agent + // backends + worktree immediately. Best-effort — never blocks/fails the + // delete; the coder's idle-evict + orphan reaper backstop a missed call. + void notifyCoderClose('session', id, req.log); reply.code(204); return null; } diff --git a/apps/server/src/services/__tests__/coder-notify.test.ts b/apps/server/src/services/__tests__/coder-notify.test.ts new file mode 100644 index 0000000..acd6aa2 --- /dev/null +++ b/apps/server/src/services/__tests__/coder-notify.test.ts @@ -0,0 +1,67 @@ +// v2.6.10 Phase 3 (server wiring) — notifyCoderClose fire-and-forget helper. +// +// The guarantee under test: the helper NEVER throws (so it can't break the +// user's delete/archive path), targets the correct coder URL shape, and folds +// every failure mode (non-2xx, network error) into a `false` result. 
+ +import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'; +import { notifyCoderClose } from '../coder-notify.js'; + +const ORIGINAL_BOOCODER_URL = process.env.BOOCODER_URL; + +describe('notifyCoderClose', () => { + beforeEach(() => { + delete process.env.BOOCODER_URL; + }); + afterEach(() => { + if (ORIGINAL_BOOCODER_URL === undefined) delete process.env.BOOCODER_URL; + else process.env.BOOCODER_URL = ORIGINAL_BOOCODER_URL; + }); + + it('POSTs the chat close hook at the default coder origin and resolves true on 2xx', async () => { + const fetcher = vi.fn().mockResolvedValue(new Response(null, { status: 200 })); + const ok = await notifyCoderClose('chat', 'chat-123', undefined, fetcher as unknown as typeof fetch); + expect(ok).toBe(true); + expect(fetcher).toHaveBeenCalledTimes(1); + const [url, init] = fetcher.mock.calls[0]!; + expect(url).toBe('http://boocoder:3000/api/chats/chat-123/close'); + expect(init).toEqual({ method: 'POST' }); + }); + + it('POSTs the session close hook with the sessions segment', async () => { + const fetcher = vi.fn().mockResolvedValue(new Response(null, { status: 200 })); + const ok = await notifyCoderClose('session', 'sess-abc', undefined, fetcher as unknown as typeof fetch); + expect(ok).toBe(true); + expect(fetcher.mock.calls[0]![0]).toBe('http://boocoder:3000/api/sessions/sess-abc/close'); + }); + + it('honors BOOCODER_URL for the origin', async () => { + process.env.BOOCODER_URL = 'http://100.114.205.53:9502'; + const fetcher = vi.fn().mockResolvedValue(new Response(null, { status: 200 })); + await notifyCoderClose('chat', 'c1', undefined, fetcher as unknown as typeof fetch); + expect(fetcher.mock.calls[0]![0]).toBe('http://100.114.205.53:9502/api/chats/c1/close'); + }); + + it('resolves false on a non-2xx response (does not throw)', async () => { + const fetcher = vi.fn().mockResolvedValue(new Response(null, { status: 500 })); + const log = { debug: vi.fn() }; + const ok = await notifyCoderClose('chat', 'c1', 
log, fetcher as unknown as typeof fetch); + expect(ok).toBe(false); + expect(log.debug).toHaveBeenCalledTimes(1); + }); + + it('resolves false on a network error (coder unreachable) — never rejects', async () => { + const fetcher = vi.fn().mockRejectedValue(new Error('ECONNREFUSED')); + const log = { debug: vi.fn() }; + const ok = await notifyCoderClose('session', 's1', log, fetcher as unknown as typeof fetch); + expect(ok).toBe(false); + expect(log.debug).toHaveBeenCalledTimes(1); + }); + + it('does not require a logger', async () => { + const fetcher = vi.fn().mockRejectedValue(new Error('boom')); + await expect( + notifyCoderClose('chat', 'c1', undefined, fetcher as unknown as typeof fetch), + ).resolves.toBe(false); + }); +}); diff --git a/apps/server/src/services/coder-notify.ts b/apps/server/src/services/coder-notify.ts new file mode 100644 index 0000000..ac7f12e --- /dev/null +++ b/apps/server/src/services/coder-notify.ts @@ -0,0 +1,64 @@ +// v2.6.10 Phase 3 (server wiring) — fire-and-forget BooCoder close hooks. +// +// BooCoder (apps/coder, host systemd) added close hooks in +// apps/coder/src/routes/lifecycle.ts: +// POST /api/chats/:chatId/close — evict the chat's warm (chat,agent) +// backends, close its opencode session, +// mark agent_sessions closed, and remove +// the shared worktree on the last chat. +// POST /api/sessions/:sessionId/close — loop the chat-close path for every +// chat in the session. +// +// apps/server (Docker) can't see the host worktree dirs or reach the warm agent +// processes, so — exactly like the existing `worktree-risk` guard in +// routes/sessions.ts — it signals the coder over HTTP and the coder does the +// real teardown. This call is BEST-EFFORT: the coder's idle-pool eviction and +// the orphan-worktree reaper backstop a missed/failed call. It MUST NEVER block +// or fail the user's delete/archive — hence fire-and-forget with a swallowed +// catch. We do not await the returned promise at the call sites. 
+ +import type { FastifyBaseLogger } from 'fastify'; + +export type CoderCloseKind = 'chat' | 'session'; + +function coderOrigin(): string { + // Same env + default as routes/sessions.ts' worktree-risk fetch. + return process.env.BOOCODER_URL ?? 'http://boocoder:3000'; +} + +/** + * Fire-and-forget POST to the BooCoder close hook for a chat or session. + * + * Resolves to `true` if the coder acknowledged (HTTP 2xx), `false` otherwise + * (non-2xx or network error). Callers SHOULD NOT await this — invoke it and + * move on. The returned promise never rejects: every failure path is caught, + * logged at debug, and folded into a `false` result so an unreachable or + * erroring coder can't surface to the user's delete/archive request. + */ +export async function notifyCoderClose( + kind: CoderCloseKind, + id: string, + log?: Pick<FastifyBaseLogger, 'debug'>, + fetcher: typeof fetch = fetch, +): Promise<boolean> { + const segment = kind === 'chat' ? 'chats' : 'sessions'; + const url = `${coderOrigin()}/api/${segment}/${id}/close`; + try { + const res = await fetcher(url, { method: 'POST' }); + if (!res.ok) { + log?.debug( + { kind, id, status: res.status }, + 'coder close hook returned non-2xx (best-effort; reaper backstops)', + ); + return false; + } + log?.debug({ kind, id }, 'coder close hook acknowledged'); + return true; + } catch (err) { + log?.debug( + { kind, id, err: err instanceof Error ? 
err.message : String(err) }, + 'coder close hook unreachable (best-effort; reaper backstops)', + ); + return false; + } +} diff --git a/apps/web/src/components/panes/CoderPane.tsx b/apps/web/src/components/panes/CoderPane.tsx index 266107c..dfdff48 100644 --- a/apps/web/src/components/panes/CoderPane.tsx +++ b/apps/web/src/components/panes/CoderPane.tsx @@ -388,12 +388,14 @@ function usePendingChanges(sessionId: string) { function DiffPanel({ changes, loading, + currentProvider, onRefresh, onApprove, onReject, }: { changes: PendingChange[]; loading: boolean; + currentProvider: string; onRefresh: () => void; onApprove: (id: string) => void; onReject: (id: string) => void; @@ -409,6 +411,29 @@ function DiffPanel({ ? `Changes from ${distinctAgents.map((a) => providerLabel(a)).join(', ')}` : null; + // v2.6 §9c: staging-boundary caveat. External agents (opencode/goose/qwen/ + // claude) edit *inside their worktree*; native boocode reads/writes the + // *project root* via pending_changes. Unapplied edits don't cross that + // boundary. When the currently-selected provider can't see another side's + // staged-but-unapplied edits, surface a muted one-liner. agent===null + // (manual) is boundary-neutral. Pure derivation — no new state/fetch. + const isNativeProvider = currentProvider === 'boocode'; + const boundaryHint = (() => { + if (isNativeProvider) { + // Native boocode is selected: it won't see external-worktree edits. + const external = distinctAgents.filter((a) => a !== null && a !== 'boocode'); + if (external.length === 0) return null; + const who = + external.length === 1 + ? providerLabel(external[0]!) + : external.map((a) => providerLabel(a)).join(', '); + return `${who}'s edits live in its worktree — BooCode won't see them until applied.`; + } + // An external agent is selected: it won't see boocode's project-root edits. 
+ if (!distinctAgents.includes('boocode')) return null; + return `BooCode's edits live in the project root — ${providerLabel(currentProvider)} won't see them until applied.`; + })(); + return (
@@ -430,6 +455,14 @@ function DiffPanel({ {mixedNote}
)} + {boundaryHint && ( +
+ {boundaryHint} +
+ )}
{pending.length === 0 ? (
@@ -914,6 +947,7 @@ export function CoderPane({ **Lift (design §10):** hardening from **openchamber** (MIT, same warm-opencode-server architecture) — health-monitor + crash auto-restart + busy-aware restart + port reclaim (`killProcessOnPort`/`waitForPortRelease`) + stall-SSE = a concrete state machine for 3.1/3.2/3.6. Reaper (3.3/3.4): Paseo worktree-archive cascade + superset destroy-saga (preflight dirty/unpushed inspect) + LRU cap on warm-server Maps. Do crash-recovery + reaper together (shared supervision loop). - [x] 3.1 Idle TTL eviction per `(chat, agent)` (`AGENT_POOL_IDLE_TTL_MS`=30min) + LRU cap (`AGENT_POOL_MAX_LIVE`=10), busy never evicted; reattach next turn. Pure `lifecycle-decisions.ts` (TDD). - [x] 3.2 Crash recovery: openchamber health-monitor + busy-aware-restart + stale-grace state machine in `opencode-server.ts` (+ port reclaim) + `warm-acp.ts`. opencode → fresh sessions; ACP → re-`session/new`. F.1 guard + U.6 usage preserved. -- [x] 3.3 Close hooks (`/api/chats/:id/close`, `/api/sessions/:id/close`) → `closeChat` evicts backends + archives the `worktrees` row + removes the worktree. *(apps/server caller is a follow-up; idle-evict + reaper backstop it.)* +- [x] 3.3 Close hooks (`/api/chats/:id/close`, `/api/sessions/:id/close`) → `closeChat` evicts backends + archives the `worktrees` row + removes the worktree. **apps/server caller wired in `v2.6.11`** (`coder-notify.ts`, fire-and-forget on session-delete + chat archive/delete). - [x] 3.4 Orphan worktree reaper (periodic, 1h grace, superset-style dirty/unpushed preflight, Paseo soft-delete) + LRU cap on the pool. - [x] 3.5 Re-baseline `worktrees.base_commit` after a successful `apply_pending` (both apply routes). - [x] 3.6 Reconnect integration test (DB-opt-in): restart mid-session → next turn reattaches/recreates from `agent_sessions`/`worktrees`. -- [ ] 3.7 Staging-boundary hint in DiffPanel (§9c) — **frontend follow-up** (apps/web; deferred — Sam has uncommitted web work). 
+- [x] 3.7 Staging-boundary hint in DiffPanel (§9c) — `v2.6.11`: muted one-liner when the selected provider can't see another agent's unapplied worktree edits (derived from per-change `agent` + current provider; no new state). ## Tests — ⬜ REMAINING (none of T.1–T.3 exist yet)