Files
boocode/docs/DEFERRED-WORK.md
indifferentketchup 31e1b32be1 v2.2.2-xml-placeholder-reject: drop placeholder XML tool calls at parse time
Reject qwen3.6 spurious <invoke> tails with path "..." or empty args before
they enter toolCalls, preventing duplicate assistant answers. Dropped blocks
append to flushed text; four new xml-parser tests. DEFERRED-WORK §6 for
console.debug → pino cleanup.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-26 16:22:43 +00:00

359 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Deferred work — post stale cleanup (2026-05-26)
This document describes work intentionally **not** shipped in the 2026-05-26 stale/simplify batch. Each item needs a product or architecture decision before implementation. See also [`STALE-DEPRECATED.md`](./STALE-DEPRECATED.md) for what was resolved in that batch.
Last updated: 2026-05-26
---
## Summary
| Item | Category | User impact | Effort | Risk if left alone |
|------|----------|-------------|--------|-------------------|
| Task cancel → abort ACP/PTY child | Correctness / UX | High — Stop does not kill external agents | Medium | Zombie processes, stuck `running` tasks, orphaned worktrees |
| Skip ACP cold probe when DB fresh | Performance | Medium — composer open can stall 530s on cache miss | Medium (v2.3 batch) | Slow provider picker; repeated ACP spawns on every snapshot rebuild |
| Unified `packages/types` | Maintainability | Low (dev-only) | MediumHigh | Type drift between server, coder, web |
| Large file splits | Maintainability | None directly | Medium per file | Harder reviews, merge conflicts |
| Retire `apps/coder/web/` fallback SPA | Scope / ops | Low — Sam uses CoderPane | Medium | Dual UI maintenance, divergent API client |
| `console.debug` in xml-parser placeholder rejection | Maintainability | None (logs only) | Trivial | Placeholder rejections miss pino pipeline / level filters |
---
## 1. Task cancel → abort external ACP/PTY child
### Current behavior
External agent tasks (opencode, cursor, claude, qwen, goose, etc.) flow through `apps/coder/src/services/dispatcher.ts` path B:
1. Create git worktree (`worktrees.ts` → local `hostExec`)
2. Spawn ACP or PTY child with an **`AbortController`** (`ac`) passed as `signal`
3. Stream assistant output over session WS
4. Diff worktree → `pending_changes`
5. Cleanup worktree
The abort signal is **created per task** but **never registered anywhere cancel can reach**:
```typescript
// dispatcher.ts — ac is local to dispatchExternal(); no Map<taskId, AbortController>
const ac = new AbortController();
// ...
await dispatchViaAcp({ ..., signal: ac.signal });
// or
await dispatchViaPty({ ..., signal: ac.signal });
```
Cancel paths today:
| Route | What it does | Gap |
|-------|--------------|-----|
| `POST /api/tasks/:id/cancel` | Sets DB `state = 'cancelled'`, calls `cancelPendingPermission`, tries `inference.cancel()` on session chats | **`inference.cancel` only affects native boocode streaming** — not ACP/PTY children |
| `POST /api/sessions/:sessionId/stop` | Same — loops open chats, calls `inference.cancel` | Same gap for external tasks tied to that session |
| Dispatcher `stop()` on shutdown | Sets `stopping = true`; in-flight external task may finish or mark cancelled at end | No immediate child kill |
`dispatchViaAcp` and `dispatchViaPty` **do** honor `signal` when aborted (worktree ops, stream read, spawn teardown). The wiring from HTTP cancel → `ac.abort()` is missing.
There is also **no frontend** calling task cancel today (`grep` across `apps/web` finds no `cancelTask` usage). Stop in CoderPane targets native inference only.
### Symptoms users would see
- Click Stop / cancel task → DB says `cancelled` but agent process keeps running on host
- Assistant row may stay `streaming` until the child exits naturally
- Worktree under `/tmp/booworktrees/<taskId>` may linger until dispatcher cleanup or manual removal
- Permission prompts cancel correctly (`cancelPendingPermission`) — that part works
### Proposed implementation
**Phase A — backend registry (minimum viable)**
1. Add `Map<taskId, AbortController>` (or `{ ac, childPid? }`) in dispatcher module scope
2. On `dispatchExternal` start: `activeAbort.set(taskId, ac)`
3. On task completion/failure/cleanup: `activeAbort.delete(taskId)`
4. Export `cancelExternalTask(taskId: string): boolean` from dispatcher
5. In `POST /api/tasks/:id/cancel`:
- Call `cancelExternalTask(taskId)` **before** or **instead of** `inference.cancel` when task is external (`execution_path` or agent !== boocode)
- Mark assistant message `cancelled` / `failed` and publish `message_complete` + `chat_status: idle` on session WS
6. Wire `POST /api/sessions/:sessionId/stop` to find **running external tasks** for that session and abort them too
**Phase B — child process kill**
- PTY: store `child` ref from `spawn`; on abort, `child.kill('SIGTERM')` then SIGKILL after timeout
- ACP: `acp-dispatch` already uses signal; ensure `createAcpNdJsonStream` cancel kills the underlying spawn (verify in `acp-stream.ts`)
- Worktree: on abort mid-flight, call `cleanupWorktree` in `finally` block
**Phase C — frontend**
- CoderPane Stop button: if `provider !== 'boocode'` and a `task_id` is active, call `POST /api/coder/tasks/:id/cancel` (needs API client method)
- Show cancelled state in UI (task row or composer status)
### Files likely touched
- `apps/coder/src/services/dispatcher.ts` — registry, export cancel
- `apps/coder/src/routes/tasks.ts` — call dispatcher cancel
- `apps/coder/src/routes/messages.ts` — session stop for external tasks
- `apps/coder/src/services/acp-dispatch.ts`, `pty-dispatch.ts` — verify signal → kill
- `apps/web/src/api/client.ts`, `CoderPane.tsx` — Stop wiring
### Acceptance criteria
- Cancel a running opencode/cursor task → child process gone within 5s (`ps` / `pgrep` on host)
- Task row `state = 'cancelled'`, assistant message not left `streaming`
- Worktree directory removed (best-effort)
- Native boocode cancel unchanged
- No regression on permission-denied / blocked flows
### Open questions
- Should cancel be **best-effort** (mark cancelled even if kill fails) or **fail closed** until process dies?
- Arena parallel tasks: cancel one contestant vs whole arena?
- Blocked task waiting on permission: cancel already resolves waiter — confirm ACP session teardown order
---
## 2. Skip ACP cold probe when DB models are fresh
**Status:** Planned — [`openspec/changes/v2-3-provider-lifecycle/`](../openspec/changes/v2-3-provider-lifecycle/proposal.md). **Not shipped** (no `v2.3` tag; all tasks unchecked).
### Current behavior (v2.2)
`apps/coder/src/services/provider-snapshot.ts` on cache miss:
1. Reads `available_agents.models` as a **fallback** when building each entry
2. Still **cold-probes every installed ACP provider** on every rebuild (`probeAcpProvider` in `buildProviderEntry`) — DB models do not skip the probe
3. In-memory snapshot cache TTL is **5 minutes** (`CACHE_TTL_MS`); opening `AgentComposerBar` calls `GET /api/providers/snapshot` via `useProviderSnapshot`
4. `POST /api/providers/refresh` clears cache and forces a full rebuild (all probes again)
5. Uninstalled agents are **omitted** from the snapshot (`return null`) — not listed as `unavailable`
6. No config-file enable/disable; providers are hardcoded in `provider-registry.ts`
`persistProbedModels()` writes probe results back to `available_agents` (including `last_probed_at`), but nothing reads `last_probed_at` to skip tier-2 probes yet. There is no `PROVIDER_PROBE_TTL_MS` env var.
### Planned behavior (v2.3)
See [`design.md`](../openspec/changes/v2-3-provider-lifecycle/design.md):
- Tier-1 fast binary check + tier-2 ACP session only when stale or explicitly refreshed
- `PROVIDER_PROBE_TTL_MS` (default 24h) gate on `available_agents.last_probed_at`
- Return `loading` synchronously on cache miss; complete via inflight promise
- Always list registered providers (`ready` | `unavailable` | `error` | `loading`); respect `enabled` from `/data/coder-providers.json`
- Cold probe only on `POST /api/providers/refresh` (or TTL expiry), not on every composer open
### Why deferred past v2.2
v2.2 shipped the snapshot wire shape and ACP dispatch stack. Lifecycle semantics (config registry, enable/disable, probe TTL, settings UI) were scoped as the follow-on **v2.3** batch to avoid mixing two large behavior changes in one tag.
### Acceptance criteria (when v2.3 ships)
- Second `GET /api/providers/snapshot` within TTL does not invoke `probeAcpProvider` (mock assert in tests)
- Disabled provider visible in settings, absent from composer
- Explicit refresh repopulates models; warm open is sub-second
---
## 3. Unified `packages/types` for provider snapshot JSON
### Current behavior
Provider snapshot shapes are **duplicated** (not byte-identical exports):
| Location | Types |
|----------|-------|
| `apps/coder/src/services/provider-types.ts` | `ProviderSnapshotEntry`, `AgentCommand`, `ProviderModel`, … |
| `apps/web/src/api/types.ts` | Same names, hand-maintained |
| WS frames | `agent_commands` frame in `ws-frames.ts` (server + web copy) |
Today drift is caught by:
- TypeScript at compile time when fields are added to one side only
- `provider-snapshot.test.ts` on coder
- Manual review during v2.2 batch
There is **no** `packages/` workspace yet (monorepo is `apps/server`, `apps/web`, `apps/coder`, `apps/booterm` only).
### Options
**A. Zod schema + inferred types (lighter)**
- Add `ProviderSnapshotEntrySchema` in server or coder
- Web imports schema for runtime validation on fetch (optional)
- Parity test: `expect(webTypeKeys).toEqual(zodShapeKeys)` — similar to existing `ws-frames.test.ts`
**B. Shared `packages/types` package**
```
packages/types/
package.json # @boocode/types
src/provider.ts # interfaces + zod
src/ws-frames.ts # move duplicated unions here
```
- `apps/coder`, `apps/web`, `apps/server` depend on `workspace:*`
- Requires NodeNext export map, build order (`tsc` for types package first)
- Biggest payoff if WS frames + provider + task types all move
**C. Status quo + discipline**
- Keep duplicated TS interfaces
- Add one test file importing both sides and asserting structural equality
### Tradeoffs
| Approach | Setup cost | Runtime safety | Best when |
|----------|------------|----------------|-----------|
| A — Zod in coder | Low | Validate API responses | Snapshot is coder-only API |
| B — packages/types | High | Full shared source | Many cross-app types growing |
| C — parity test | Lowest | Compile-time only | Rare schema changes |
### Recommendation
Start with **A or C** unless planning a broader “shared types” initiative (tasks, permissions, arena). Full `packages/types` is justified when a third consumer appears or WS frame duplication becomes painful again.
### Files likely touched (option B)
- New `packages/types/`
- Root `pnpm-workspace.yaml`
- `apps/web/tsconfig`, `apps/coder/tsconfig` — path references
- Strip duplicate blocks from `apps/web/src/api/types.ts`
---
## 4. Large file splits (refactor when touched)
Not user-facing defects — deferred to avoid scope creep in the stale batch. Split when the next feature touches the file heavily.
### 4.1 `CoderPane.tsx` (~663 lines)
**Responsibilities today (single component):**
- Session WS + polling fallback for messages
- Pending changes fetch/apply/discard
- External vs native send paths (`AgentComposerBar`, slash commands, skill invoke)
- Permission card + agent commands hint
- Provider snapshotdriven composer state
**Suggested extractions:**
| Hook / module | Owns |
|---------------|------|
| `useCoderMessages(sessionId)` | WS subscribe, poll-when-disconnected, temp-id dedup |
| `usePendingChanges(sessionId)` | List, apply, discard, diff preview trigger |
| `CoderComposerFooter` | AgentComposerBar + hints + permission card layout |
**Trigger to do it:** next CoderPane feature (e.g. task cancel UI, multi-task inbox) or when file exceeds ~800 lines.
### 4.2 `ChatInput.tsx` (~669 lines)
**Responsibilities:** attachments, drag/drop, paste-as-file, `@` mentions, slash picker, agent picker row, context bar, send/submit, registry for send-to-chat.
**Suggested extractions:**
| Hook / module | Owns |
|---------------|------|
| `useSlashCommandInput(...)` | `slashState`, `isSlashCommandToken`, picker open/close |
| `useFileMention(...)` | `@` detection, file index fetch, popover state |
| `useAttachments(...)` | chips, drop, paste, MAX_ATTACHMENTS |
Slash helpers already live in `lib/slash-command.ts`; hooks would consume them.
### 4.3 `dispatcher.ts` (~483 lines)
**Two paths in one file:**
- Path A: native boocode — enqueue inference, wait for message completion
- Path B: external — worktree, ACP/PTY, diff, pending_changes
**Suggested split:**
- `dispatcher-native.ts` — poll loop pick-up for `provider = boocode` or no agent
- `dispatcher-external.ts` — path B + abort registry (pairs with item §1)
- `dispatcher.ts` — thin poll loop + routing
### 4.4 `AgentComposerBar.tsx` (~323 lines)
Optional split: **provider/model/mode/thinking pickers** vs **persistence** (`AgentSessionConfig` localStorage/session). Lower priority — file is manageable.
### 4.5 `acp-dispatch.ts` (~294 lines)
Stream setup already extracted to `acp-stream.ts`. Remaining split candidate: **session lifecycle** (create → prompt loop → teardown) vs **permission/tool side effects**.
### General rule
Split only when:
1. A new feature needs a clear seam, or
2. Review feedback repeatedly hits the same file, or
3. Tests need isolated units (e.g. dispatcher external path)
Avoid drive-by splits — they churn blame without shipping user value.
---
## 5. Retire `apps/coder/web/` fallback SPA
### What it is
Standalone Vite React app (`@boocode/coder-web`) built into `apps/coder/web/dist/` and served by BooCoder Fastify at `:9502` when no BooChat proxy is in front.
**Primary UI:** `CoderPane` inside BooChat SPA (`apps/web`) — API via `/api/coder/*` proxy, WS to `:9502`.
**Fallback UI:** direct visit to `http://100.114.205.53:9502` — own `api/client.ts`, `ChatPane`, `DiffPane`, duplicated types.
### Why it was kept
- Host-only debugging without full BooChat stack
- Historical path before CoderPane integration
- Zero dependency on Authelia-protected BooChat origin
### Cost of keeping
- Separate build step in coder deploy
- Duplicate message/stream types and API paths
- Features land in CoderPane first; fallback rots unless manually updated
### Options
| Option | When to choose |
|--------|----------------|
| **Keep** | Still use `:9502` directly for debugging or demos |
| **Freeze** | Stop feature work; build only on release for emergency access |
| **Remove** | Always use BooChat; `:9502` serves health + WS + API only (or minimal static “open in BooChat” page) |
### Removal checklist (if chosen)
1. Confirm Sam never uses standalone UI (bookmarks, systemd docs, BOOCODER.md)
2. Remove `apps/coder/web/` package and static serve from `apps/coder/src/index.ts`
3. Update Dockerfile/coder build scripts
4. Keep WS + REST routes — CoderPane depends on them
5. Optional: single-page static “Use code.indifferentketchup.com” at `/`
---
## 6. xml-parser placeholder rejection — structured logging (v2.2.2 cleanup)
**Shipped (uncommitted deploy):** `extractToolCallBlocks` rejects placeholder XML tool args at parse time; dropped blocks append to `flushed`.
**Nit:** rejection path uses `console.debug` instead of the Fastify/pino `log.debug({ toolName, args }, '…')` pattern used elsewhere. Cosmetic — behavior is correct; logs won't appear in the usual structured pipeline or respect `LOG_LEVEL`.
**Fix:** pass an optional logger into `extractToolCallBlocks` from `stream-phase.ts` executeStreamPhase (one call site), or use a module-level debug hook. Target tag: **v2.2.2** cleanup batch, not a blocker.
---
## Suggested batch ordering
If picking these up as openspec batches:
1. **Task cancel abort** — highest correctness gap; unblocks honest Stop button in CoderPane
2. **v2.3 provider lifecycle** — probe TTL, config registry, enable/disable (includes §2 above)
3. **CoderPane hook extraction** — natural follow-on when adding cancel UI
4. **Zod parity or packages/types** — when next WS/provider field is added
5. **Retire coder/web** — only after explicit “I dont use :9502 UI” confirmation
6. **v2.2.2 xml-parser log uniformity**`console.debug` → pino (§6)
---
## Related docs
- [`STALE-DEPRECATED.md`](./STALE-DEPRECATED.md) — resolved stale items
- [`ARCHITECTURE.md`](./ARCHITECTURE.md) — BooChat / BooCoder surfaces
- [`openspec/changes/archived/v2.2-paseo-providers.md`](../openspec/changes/archived/v2.2-paseo-providers.md) — shipped v2.2 provider snapshot API
- [`openspec/changes/v2-3-provider-lifecycle/`](../openspec/changes/v2-3-provider-lifecycle/) — planned probe lifecycle + config registry (§2)
- [`BOOCODER.md`](../BOOCODER.md) — dispatch, worktrees, pending changes