boocode/openspec/changes/pty-exit-notifications/design.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00

# Design: PTY Exit Notifications
## Overview
When a process exits in a booterm terminal pane, emit a structured `pty_exited` notification over the booterm WS protocol. The notification carries exit code, last output lines, session metadata, and timeout status. This is a client-facing change only; broker publish for inference-loop consumption is deferred (see Deferred section).
## Architecture
### Current exit flow
1. `apps/booterm/src/ws/attach.ts:170-183` -- `handle.onExit` fires
2. Sends bare `{type: 'exit', code: exitCode}` to browser WS
3. Closes the socket
4. The registry entry is unregistered on the socket `close` event (line 190)
### Proposed exit flow
1. `handle.onExit` fires
2. Read metadata from registry and ring buffer BEFORE any unregister
3. Build structured `pty_exited` frame
4. Send `pty_exited` to browser WS (replaces bare `exit` frame)
5. Close socket
6. Registry cleanup happens on socket `close` (existing behavior, unchanged)
### Cross-app wire changes
**packages/contracts/src/ws-frames.ts** -- Add `PtyExitedFrame` to `WsFrameSchema`:
```typescript
import { z } from 'zod';

export const PtyExitedFrame = z.object({
  type: z.literal('pty_exited'),
  session_id: z.string().min(1).max(64),
  pane_id: z.string().min(1).max(64),
  exit_code: z.number().int(),
  last_lines: z.array(z.string()),
  session_title: z.string().nullable().optional(),
  session_description: z.string().nullable().optional(),
  parent_agent: z.string().nullable().optional(),
  timed_out: z.boolean(),
});
```
Note: `session_id` and `pane_id` use `z.string().min(1).max(64)` because booterm IDs are `[a-zA-Z0-9_-]{1,64}` (validated by `sanitizeId` using `ID_RE` in `apps/booterm/src/pty/manager.ts:5`). They are NOT UUIDs. This matches the existing `ToolCallId` pattern (`z.string().min(1)`) for non-UUID identifiers in the contract.
Add to `KNOWN_FRAME_TYPES` array. Rebuild `@boocode/contracts`.
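The ID note above can be illustrated with a small sketch. The character class comes from the note; the `^...$` anchoring, the helper names `isBootermId` and `passesContractLength`, and the example ID are assumptions made for illustration, not the actual `sanitizeId` implementation.

```typescript
// Character class quoted in the note above, anchored so the whole string must match.
const ID_RE = /^[a-zA-Z0-9_-]{1,64}$/;
// Canonical 8-4-4-4-12 hex UUID layout, for comparison only.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: true when the value has the booterm ID shape.
function isBootermId(value: string): boolean {
  return ID_RE.test(value);
}

// Mirrors what z.string().min(1).max(64) checks: length only, not characters.
function passesContractLength(value: string): boolean {
  return value.length >= 1 && value.length <= 64;
}

const paneId = 'build_pane-01'; // made-up example ID
console.log(isBootermId(paneId));          // true
console.log(UUID_RE.test(paneId));         // false
console.log(passesContractLength(paneId)); // true
```

A UUID validator would reject a legitimate pane ID like this one. The length-only contract check is also looser than `ID_RE` (it would accept `'has space'`), so the character-level guarantee still rests on `sanitizeId` on the booterm side.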
**apps/booterm/src/ws/attach.ts** -- Replace the `onExit` handler:
Current (lines 170-183):
```typescript
handle.onExit(({ exitCode }) => {
  socket.send(JSON.stringify({ type: 'exit', code: exitCode }));
  socket.close(1000);
});
```
New:
```typescript
handle.onExit(({ exitCode }) => {
  // Read metadata BEFORE any cleanup: registry.get and getLastLines
  // must run while the entry still exists.
  const meta = registry.get(pid);
  const lastLines = getLastLines(pid, 5);
  const frame = {
    type: 'pty_exited',
    session_id: sid,
    pane_id: pid,
    exit_code: exitCode,
    last_lines: lastLines,
    session_title: meta?.title ?? null,
    session_description: meta?.description ?? null,
    parent_agent: meta?.parentAgent ?? null,
    timed_out: meta?.timedOut ?? false,
  };
  if (socket.readyState === socket.OPEN) {
    socket.send(JSON.stringify(frame));
  }
  socket.close(1000);
});
```
### Web frontend changes
**apps/web/src/lib/terminal-protocol.ts** -- Add `pty_exited` to `ServerControlFrame` union:
```typescript
export type ServerControlFrame =
  | { type: 'init' }
  | { type: 'exit'; code: number }
  | {
      type: 'pty_exited';
      session_id: string;
      pane_id: string;
      exit_code: number;
      last_lines: string[];
      session_title?: string | null;
      session_description?: string | null;
      parent_agent?: string | null;
      timed_out: boolean;
    };
Update `parseServerFrame` to recognize `type: 'pty_exited'` and return the structured frame.
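A minimal sketch of how `parseServerFrame` might recognize the new frame follows. The existing function is not shown in this document, so the `string -> ServerControlFrame | null` signature, the hand-rolled field checks, and the helper `isStringOrNullish` are all assumptions for illustration.

```typescript
type ServerControlFrame =
  | { type: 'init' }
  | { type: 'exit'; code: number }
  | {
      type: 'pty_exited';
      session_id: string;
      pane_id: string;
      exit_code: number;
      last_lines: string[];
      session_title?: string | null;
      session_description?: string | null;
      parent_agent?: string | null;
      timed_out: boolean;
    };

// Hypothetical helper: accepts a string, null, or a missing field.
function isStringOrNullish(v: unknown): v is string | null | undefined {
  return v === undefined || v === null || typeof v === 'string';
}

// Assumed signature: returns null for anything that is not a known control frame.
function parseServerFrame(raw: string): ServerControlFrame | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;
  const f = data as Record<string, unknown>;

  if (f.type === 'init') return { type: 'init' };

  // Legacy frame, still accepted for backward compatibility.
  if (f.type === 'exit') {
    const code = f.code;
    return typeof code === 'number' ? { type: 'exit', code } : null;
  }

  if (f.type === 'pty_exited') {
    const {
      session_id, pane_id, exit_code, last_lines,
      session_title, session_description, parent_agent, timed_out,
    } = f;
    if (
      typeof session_id === 'string' &&
      typeof pane_id === 'string' &&
      typeof exit_code === 'number' &&
      Number.isInteger(exit_code) &&
      Array.isArray(last_lines) &&
      last_lines.every((l: unknown) => typeof l === 'string') &&
      isStringOrNullish(session_title) &&
      isStringOrNullish(session_description) &&
      isStringOrNullish(parent_agent) &&
      typeof timed_out === 'boolean'
    ) {
      return {
        type: 'pty_exited',
        session_id, pane_id, exit_code,
        last_lines: last_lines as string[],
        session_title, session_description, parent_agent, timed_out,
      };
    }
  }
  return null;
}
```

Malformed `pty_exited` frames fall through to `null` rather than being partially trusted, which keeps the message handler's branches simple.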
**apps/web/src/hooks/terminal/useTerminalSocket.ts** -- Handle `pty_exited` in the message handler:
Rendering spec:
- Write a dim status line. For a normal exit: `\r\n\x1b[2m[process exited with code ${frame.exit_code}]\x1b[0m\r\n`. If `timed_out` is `true`, write `\r\n\x1b[2m[process timed out and was killed]\x1b[0m\r\n` instead of the exit code line.
- If `last_lines` is non-empty, write only its final entry (at most 1 line) to xterm as-is (xterm handles ANSI). A dim prefix may optionally be prepended.
- Do NOT display session_title/parent_agent in the terminal -- these are metadata for the inference loop, not user-facing terminal content.
- Preserve backward compatibility: if `parseServerFrame` returns `{type: 'exit', code: N}` (legacy frame), handle it exactly as before.
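The rendering rules above could be sketched as a pure function that returns the strings to write. The name `renderPtyExited`, the trimmed-down `PtyExitedView` type, and the choice to emit the status line before the last output line (following the order of the bullets) are assumptions, not part of the spec.

```typescript
// Only the frame fields that affect rendering.
interface PtyExitedView {
  exit_code: number;
  last_lines: string[];
  timed_out: boolean;
}

const DIM = '\x1b[2m';
const RESET = '\x1b[0m';

// Hypothetical helper: returns the chunks to write to the terminal, in order.
function renderPtyExited(frame: PtyExitedView): string[] {
  // The timeout message replaces the exit code line; it is not written in addition.
  const status = frame.timed_out
    ? '[process timed out and was killed]'
    : `[process exited with code ${frame.exit_code}]`;
  const chunks = [`\r\n${DIM}${status}${RESET}\r\n`];
  // At most one line of captured output, passed through with ANSI intact.
  for (const line of frame.last_lines.slice(-1)) {
    chunks.push(`${line}\r\n`);
  }
  return chunks;
}
```

In the hook, each chunk would be passed to the xterm instance's `write` method. Keeping the formatting in a pure function makes the timeout and empty-output branches easy to unit test without a terminal.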
### Timeout integration
The `sweepExpired` path in `apps/booterm/src/pty/manager.ts:172-198` is currently dead code -- it is never wired to a `setInterval` in `apps/booterm/src/index.ts`. The timeout config vars (`PTY_IDLE_TIMEOUT_SECONDS`, `PTY_ABSOLUTE_TIMEOUT_SECONDS`) default to 0 and are never passed to `registerWsAttachRoute`.
For this change:
- Add `timedOut?: boolean` field to `SessionMeta` in the registry (pre-wiring).
- In `sweepExpired`, set `meta.timedOut = true` BEFORE calling `killSession`, and do NOT call `registry.unregister()`. Cleanup is two-phase: `sweepExpired` flags and kills; the `onExit` handler (which fires once the tmux kill takes effect) then reads the metadata; finally the socket `close` handler unregisters. This avoids the race where `onExit` fires after unregister has already deleted the metadata.
- The `timed_out: true` path in `onExit` will work once `sweepExpired` is wired to an interval (future change). Until then, `meta?.timedOut` is always `undefined` and the frame defaults to `false`.
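The two-phase cleanup described above can be sketched with an in-memory stand-in for the registry. Everything here is illustrative: the `lastActivityMs` and `createdAtMs` fields, the `sweepExpired` parameters, the treatment of `0` as "disabled", and the stubbed `killSession` are assumptions, since the real signatures are not shown in this document.

```typescript
interface SessionMeta {
  lastActivityMs: number; // hypothetical field
  createdAtMs: number;    // hypothetical field
  timedOut?: boolean;     // the flag added by this change
}

const registry = new Map<string, SessionMeta>();
const killed: string[] = [];

// Stand-in for the real tmux kill; it only records the call.
function killSession(paneId: string): void {
  killed.push(paneId);
}

// Phase 1: flag, then kill. Deliberately no registry.delete() here.
function sweepExpired(nowMs: number, idleMs: number, absoluteMs: number): string[] {
  const expired: string[] = [];
  registry.forEach((meta, paneId) => {
    // Assumption: a timeout of 0 means the check is disabled.
    const idle = idleMs > 0 && nowMs - meta.lastActivityMs >= idleMs;
    const tooOld = absoluteMs > 0 && nowMs - meta.createdAtMs >= absoluteMs;
    if (!idle && !tooOld) return;
    meta.timedOut = true;
    killSession(paneId);
    expired.push(paneId);
  });
  return expired;
}

// Phase 2a: the value onExit would place in the frame's timed_out field.
function timedOutAtExit(paneId: string): boolean {
  return registry.get(paneId)?.timedOut ?? false;
}

// Phase 2b: the socket close handler is the only place that unregisters.
function onSocketClose(paneId: string): void {
  registry.delete(paneId);
}
```

Because the entry survives the sweep, `timedOutAtExit` still sees `timedOut: true` when the exit event arrives. If the sweep deleted the entry itself, the same lookup would fall back to `false` and the timeout would be reported as a normal exit.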
### Ring buffer last-lines helper
Add `getLastLines(paneId: string, n: number): string[]` to `apps/booterm/src/pty/registry.ts`:
```typescript
export function getLastLines(paneId: string, n: number): string[] {
  const buf = ringBuffers.get(paneId);
  // n <= 0 must return nothing: slice(-0) is slice(0), which returns every line.
  if (!buf || buf.length === 0 || n <= 0) return [];
  // Return last n non-empty, non-whitespace-only lines.
  // ANSI escape sequences are preserved (xterm handles them).
  // Partial lines from mid-stream exit are included as-is.
  const nonEmpty = buf.filter((l) => l.trim().length > 0);
  return nonEmpty.slice(-n);
}
```
Note: `appendOutput` may store partial (non-newline-terminated) lines when a process exits mid-line. These are included as-is -- the last line may be truncated. This is acceptable because the existing `exit` handler shows no output at all.
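A self-contained sketch of the helper's behaviour on a small buffer follows. The `Map<string, string[]>` storage shape and the sample lines are assumptions, and the `n <= 0` guard is included here because `slice(-0)` is the same as `slice(0)` and would otherwise return the whole buffer.

```typescript
// Assumed storage shape: one array of captured lines per pane.
const ringBuffers = new Map<string, string[]>();

function getLastLines(paneId: string, n: number): string[] {
  const buf = ringBuffers.get(paneId);
  // Guard for n <= 0, since slice(-0) would return every line.
  if (!buf || buf.length === 0 || n <= 0) return [];
  return buf.filter((l) => l.trim().length > 0).slice(-n);
}

// Sample data: a blank line, a whitespace-only line, an ANSI-coloured line,
// and a partial line cut off by a mid-line exit.
ringBuffers.set('pane-1', ['$ make', '', '   ', '\x1b[32mok\x1b[0m', 'Compiling fo']);

// Two entries: the coloured "ok" line and the partial "Compiling fo" line.
console.log(getLastLines('pane-1', 2));
// Empty arrays: unknown pane, and a non-positive count.
console.log(getLastLines('missing', 5));
console.log(getLastLines('pane-1', 0));
```

Blank and whitespace-only entries are dropped before the tail is taken, so `n` counts meaningful lines rather than raw buffer slots.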
## Data flow
```
PTY process exits (normal or sweepExpired kill)
  -> handle.onExit fires (attach.ts)
  -> registry.get(pid) reads SessionMeta [BEFORE any unregister]
  -> getLastLines(pid, 5) reads ring buffer
  -> Build PtyExitedFrame with meta?.timedOut ?? false
  -> socket.send(JSON.stringify(frame)) [to browser]
  -> socket.close(1000)
  -> socket 'close' handler calls registry.unregister(pid) [existing, unchanged]
```
## Files touched
| File | Change |
|------|--------|
| `packages/contracts/src/ws-frames.ts` | Add PtyExitedFrame, add to WsFrameSchema + KNOWN_FRAME_TYPES |
| `apps/booterm/src/ws/attach.ts` | Replace onExit handler with structured frame |
| `apps/booterm/src/pty/registry.ts` | Add getLastLines helper, add timedOut flag to SessionMeta |
| `apps/booterm/src/pty/manager.ts` | Set timedOut flag in sweepExpired before kill; remove unregister() call (cleanup moves to socket close) |
| `apps/web/src/lib/terminal-protocol.ts` | Add pty_exited to ServerControlFrame + parseServerFrame |
| `apps/web/src/hooks/terminal/useTerminalSocket.ts` | Handle pty_exited frame in message handler |
## Deferred (YAGNI)
- **Inference-loop broker publish**: Booterm cannot directly access the server's in-memory broker. Adding an HTTP callback or DB LISTEN/NOTIFY for server-side notification is a separate integration. Reopen when: (a) the server needs to react to PTY exits, or (b) a task completion workflow requires inference-loop awareness. Having the `pty_exited` frame type in the `WsFrame` contract makes this straightforward to add later.
- **sweepExpired wiring**: The timeout kill machinery is implemented but never wired to an interval. Adding `setInterval(sweepExpired, ...)` in `index.ts` is a one-liner, but it changes behavior (timeouts start killing sessions). Reopen when: timeouts are desired.
- **Log search extras**: Already implemented in `searchRingBuffer` and the `/api/term/search` route. No additional work needed.