boocode/docs/plans/post-review-backlog/scope-brief.md

# Scope brief — post-review backlog (spec stand-in)

**Status:** ground-truth source for `plan-implementation`. No `feature-specification.md` exists; this brief
captures the items, decisions, and verified current state from the 2026-06-02 backlog review conversation
plus `boocode_code_review_v2.md`, `docs/DEFERRED-WORK.md`, and `boocode_roadmap.md`. Sam chose scope =
"everything we discussed" (one plan covering all actionable items; WANT items have no per-feature spec, so
their planning is intentionally shallower).

Two items were verified live during the review (file:line evidence below). The plan must honor those
findings as committed mechanics, not re-debate them.

---

## Items to implement

### F1 — External task cancel actually kills the child (DEFERRED #1)

**Problem (VERIFIED 2026-06-02):** Hitting Stop on an external agent task (opencode/goose/qwen/claude)
marks `tasks.state='cancelled'` but does NOT stop the agent. The frontend is wired
(`CoderPane.tsx:987` `handleStop` → `api.coder.cancelTask` → `POST /api/coder/tasks/:id/cancel`, fired only
for external tasks since `activeTaskId` is set only when `messages.ts:238` returns a `task_id`). But the
route (`apps/coder/src/routes/tasks.ts:110-148`) only calls `cancelPendingPermission` + `inference.cancel`
(native-only) + sets DB cancelled. It never reaches the dispatcher. The dispatcher
(`apps/coder/src/services/dispatcher.ts`) has no `Map<taskId, AbortController>` and no exported cancel; each
external run-function creates a private `const ac = new AbortController()` (lines ~316/655/991/1248) and
threads `ac.signal` into every backend, but `ac.abort()` is never called by any route. All four backends
honor the signal correctly (PTY `child.kill` SIGTERM→SIGKILL at `pty-dispatch.ts:159`; warm-ACP
`session/cancel` at `warm-acp.ts:318`; opencode `session.abort` at `opencode-server.ts:775`; claude-sdk
interrupt at `claude-sdk.ts:209`) — but the signal is never fired from Stop, and the assistant message is
left `status='streaming'`.

**Decision / approach:** Implement Phase A + B of the DEFERRED #1 proposal — a `Map<taskId,AbortController>`
registry in the dispatcher, an exported `cancelExternalTask(taskId)`, wire `POST /api/tasks/:id/cancel` (and
`POST /api/sessions/:id/stop`) into it for external tasks, and finalize the streaming assistant message
(`complete`/`cancelled`) + publish terminal frame on cancel. Warm backends keep their persistent worktree
(by design); one-shot PTY cleans its worktree in `finally`. Native boocode cancel path unchanged.

**Surface:** apps/coder (dispatcher, routes/tasks.ts, routes/messages.ts). Sam: implement.

### F2 — Tool-call parser clean/prune (review §4a, optional retirement)

**Problem (VERIFIED 2026-06-02):** The TS tool-call parser
(`apps/server/src/services/inference/tool-call-parser.ts`) is STILL load-bearing in code —
`extractToolCallBlocks` is called in the live text-delta path (`stream-phase.ts:263-284`) and
`stripToolMarkup` in `tool-phase.ts:122` + `error-handler.ts:25,106`. But a live probe of llama-swap
(`http://100.101.41.16:8401`) confirmed native `--jinja` parsing is ON: the server returns structured
`tool_calls` + `finish_reason:tool_calls`, content is just `"\n"`. The structured path
(`stream-phase.ts:285` `case 'tool-call'`) does all real work today; the TS parser is dormant
defense-in-depth. qwen3.6 is specifically the model that drifts to `<invoke>` markup, and the TS fallback is
the only guard for "tool call emitted as plain text" (`experimental_repairToolCall` does NOT cover that
case).

**Decision / approach:** Sam wants to clean/prune. Honor the safe path: this is NOT a zero-risk delete.
The plan must decide between (a) prune only what is provably dead now while keeping the
`<invoke>`-text-fallback guard, vs (b) the full retirement = gate the text-scrape fallback behind a flag,
validate native parsing on live qwen3.6 for one release, then delete. The relicense already removed the
AGPL-dead exports; this is pure simplification, no license pressure. Identify exactly which exports/branches
are safely removable today vs which are the load-bearing guard.

**Surface:** apps/server inference. Sam: clean/prune (safe path).

### F3 — xml-parser structured logging (DEFERRED #6)

**Problem:** `tool-call-parser.ts:65` uses `console.debug` for placeholder-rejection logging instead of the
pino `log.debug({...})` pattern, so it skips the structured pipeline and ignores `LOG_LEVEL`. Cosmetic.

**Decision / approach:** Pass an optional logger into `extractToolCallBlocks` from its one call site
(`stream-phase.ts` executeStreamPhase), or a module-level debug hook. ~5-minute edit. Sam: implement (will
be done directly, included here for completeness).

### F4 — Notify-hook config injection → out-of-band agent status (review #10 follow-on)

**Context:** v2.7.6 shipped the SCOPED status-publish (normalized `working|blocked|idle|error` frame from
BooCoder's OWN turn boundaries via `normalize-agent-status.ts`). The follow-on — the superset clean-room
pattern (ELv2, PATTERN-ONLY) — injects a notify hook into each PTY agent's native config
(`~/.claude/settings.json`, `~/.qwen/settings.json`, `~/.config/goose/`), the hook POSTs
`{terminalId/taskId, eventType, agent}` back to BooCoder, which normalizes ~30 vendor event names → the
existing status buckets via the already-built `normalizeAgentEvent` helper. This gives goose/qwen/claude
real out-of-band working/blocked-on-permission/done signals the turn-boundary publish can't see.

**Decision / approach:** Sam wants it. Clean-room only (no superset code). Reuse the existing
`normalize-agent-status.ts`. Plan the injection mechanism (idempotent settings.json merge per agent),
the inbound POST route, and how it composes with the existing scoped publish without double-publishing.

**Surface:** apps/coder (agent config injection, new route, normalize reuse). Sam: want.

### F5 — opencode compaction surfacing (review §3 #1)

**Problem (VERIFIED earlier):** `opencode-server.ts` consumes ~5 SSE arms but NOT `compaction.{started,delta,ended}`
— so when the warm opencode server auto-compacts mid-conversation it shows as a silent context gap in the
UI. Only `step.ended` (tokens, v2.7.3) was taken from the fuller `session.next.*` set.

**Decision / approach:** Sam wants it. Consume the compaction arms (and consider `step.failed`,
`tool.progress`) in `opencode-server.ts`, map to a UI signal so the user sees "context compacted" instead of
a gap. May need a new WS frame + web `WsFrame` arm (cross-app parity rule applies).

**Surface:** apps/coder (opencode-server.ts) + apps/web (frame + render). Sam: want.

### F6 — BooChat resilience: stall-timeout + retry/backoff (review §3 #2/#3)

**Problem (VERIFIED 2026-06-02):** `apps/server/src/services/inference/stream-phase.ts` has ZERO retry and
NO server-side stall timeout. A hung llama-swap stream relies entirely on the frontend 60s `discard_stale`
watchdog. The `for await (const part of result.fullStream)` loop (`stream-phase.ts:261`) is the wrap point.

**Decision / approach:** Sam wants it. Add a per-chunk stall timeout wrapping the fullStream loop that fires
the existing abort path (~40-60 LoC), and a minimal retry/backoff classifier (transient-5xx / stall;
llama-swap rarely emits retry-after, so value is mostly transient-5xx + stall retry — strip the cloud-billing
arms). YAGNI-gate the retry scope hard: single local instance, so keep it minimal.

**Surface:** apps/server inference (stream-phase.ts). Sam: want.

### F7 — view_session_history MCP tool

**Context:** Roadmap v2.x optional. Expose chat/session history as a read-only MCP tool on BooCoder's MCP
server (`apps/coder/src/services/mcp-server.ts`) so an MCP client can retrieve the transcript of a session.

**Decision / approach:** Sam wants it. Read-only tool over the existing `messages_with_parts` read path,
keyed by session/chat id. Reference shape: memov `snap`/`mem_history`. Keep minimal — one read tool, no
write/validate verbs unless evidence demands.

**Surface:** apps/coder MCP server. Sam: want.

### F8 — diff-line → agent re-prompt UX (review §5j, superset pattern)

**Context:** Select lines in a diff, send them as a comment to the existing agent session or to a new agent
("fix this here"). superset has the pattern (ELv2, PATTERN-ONLY clean-room). This is a review/diff UX
frontier, the heaviest WANT and the one most needing its own feature spec.

**Decision / approach:** Sam wants it. Plan at a coarse level (it likely warrants a follow-on plan-a-feature):
the diff-line selection affordance, the compose-to-session/new-agent action, and how it routes through the
existing dispatcher + AgentComposerBar. The diff UI component must be located (grep for the diff panel did not
surface a `*diff*` filename — UX specialist to locate the actual component).

**Surface:** apps/web (diff UI) + apps/coder (route to existing/new session). Sam: want.

### F9 — Retire apps/coder/web :9502 fallback SPA (DEFERRED #5)

**Context:** Standalone Vite SPA (`@boocode/coder-web`) served by BooCoder at `:9502` when no BooChat proxy
is in front. Sam confirmed: "I don't use 9502." Primary UI is CoderPane inside BooChat.

**Decision / approach:** Remove `apps/coder/web/` + its static serve from `apps/coder/src/index.ts` + the
coder build step; KEEP all WS + REST routes (CoderPane depends on them); optional minimal static
"open in BooChat" page at `/`. Per DEFERRED #5 removal checklist. Low risk. Sam: greenlit.

---

## Deferred (Sam's explicit dispositions — record, do not plan)

- **Subagent permission demux** — DEFER. Condition met (VERIFIED 2026-06-02: `opencode-server.ts` has zero
  permission handling; parent and child sub-sessions run without BooCode gating — effectively bypassed).
- **Tier-2 provider follow-ups** — `available_agents.enabled` column and MCP `list_providers` tools DROP
  (redundant / moot now that v2.4 external-driving is dead); `provider_snapshot_updated` WS frame + shared
  `packages/types` PARK (revisit only if status feels laggy / a third type consumer appears).
- **Large-file splits** — DEFER (refactor-when-touched).
- **PR-resolver, multi-provider LLM, HMAC audit log, verify-gate ensembler, taste-skill, workflow graphs,
  record/replay test harness** — DEFER (optional/far-future).
- **v2.4 BooCoder-as-ACP-agent** — DROPPED (won't-do, never drive BooCoder from external editors).
- **Native llama-server tool-call-parser full retirement** — covered by F2's safe-path decision.

## Cross-cutting constraints (from CLAUDE.md — must honor)

- New WS frame type → update BOTH server `InferenceFrame`/`ws-frames.ts` AND web `WsFrame`
  (`apps/web/src/api/types.ts`); the web type is the wire-format gate (applies to F5, maybe F4).
- Sentinels are `role='system'` rows with `metadata.kind`; `buildMessagesPayload` strips them (relevant if
  F4/F5 add a UI-only signal).
- Deploy by surface: apps/coder change → `systemctl restart boocoder`; apps/web|server → docker rebuild.
- Stage commits explicitly by path; never `git add -A` (Sam has uncommitted web WIP).
- Coder↔web provider-type parity + ws-frames parity tests must stay green.
- Default to next patch tags, one batch per coherent unit (Sam declined v2.8.0 twice).