boocode/openspec/changes/v1.13.15-codecontext-synth/proposal.md

# v1.13.13 — codecontext synthesis pipeline

Slots between v1.13.12 (skills audit) and v1.14 (Phase C outer agent loop). Adds a forced second-inference synthesis pass for codecontext overview/analysis tools so the model stops returning shallow first-touch summaries.

Does NOT change the recursion structure, depth cap, or budget — those are v1.14 concerns. The cap-50 patch from v1.13.12 stays; v1.14 supersedes it via per-agent `agent.steps`.

## What ships

- `apps/server/src/services/synthesisPrompt.ts` (NEW, 20 lines) — verbatim system prompt as a const.
- `apps/server/src/services/synthesisPipeline.ts` (NEW, ~450 lines) — `SYNTHESIS_TOOLS` set + `runSynthesisPass(params) → Promise<boolean>`. Auto-fetches top-N referenced files + project docs (BOOCHAT.md, AGENTS.md, *roadmap*.md, CONTEXT.md), applies a 32k-token budget with priority drop order, streams a synthesis turn via `streamCompletion`, dual-writes a `kind='synthesis'` part.
- `apps/server/src/services/inference/parts.ts` — `PartKind` union extended with `'synthesis'`.
- `apps/server/src/services/inference/tool-phase.ts` — synth-tool result capture during `Promise.all`; post-pause synth check before the recursive `runAssistantTurn`.
- `apps/server/src/schema.sql` — inline CHECK constraint updated + `DROP CONSTRAINT IF EXISTS` + `DO $$ pg_constraint` migration block. Idempotent (drops + re-adds on every startup; per-boot cost is trivial).

SYNTHESIS_TOOLS = `{get_codebase_overview, get_framework_analysis, get_semantic_neighborhoods}`. The other 5 codecontext tools (search_symbols, get_dependencies, get_file_analysis, get_symbol_info, watch_changes) return targeted data the model uses directly — no synthesis pass.
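A minimal sketch of the membership check, assuming the set is keyed by plain tool-name strings. The three names come from the list above; the helper `needsSynthesis` is a hypothetical name, not something taken from `synthesisPipeline.ts`.

```typescript
// Tool names are from this proposal; everything else is illustrative.
const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
  "get_codebase_overview",
  "get_framework_analysis",
  "get_semantic_neighborhoods",
]);

/** Returns true when a tool's result should trigger the synthesis pass. */
function needsSynthesis(toolName: string): boolean {
  return SYNTHESIS_TOOLS.has(toolName);
}
```

Targeted tools such as `search_symbols` fall outside the set, so their results go straight back to the model.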

## Decisions

### Schema migration was required (dispatch was wrong)

The original dispatch said "kind is text column, no schema migration needed." Reality: `schema.sql:54` has an explicit `message_parts_kind_chk` CHECK constraint enumerating allowed kinds (`'text', 'tool_call', 'tool_result', 'reasoning', 'step_start'`). Adding `'synthesis'` requires updating the constraint.

Resolution: added a `DROP CONSTRAINT IF EXISTS` + `DO $$ ... pg_constraint` idempotency-guarded migration block in `schema.sql` matching the CLAUDE.md migration pattern, plus updated the inline CREATE TABLE constraint so fresh installs include the new value.
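One way to reduce the chance of the TypeScript union and the database constraint drifting apart again is to derive both from a single list. This is only a sketch: `PART_KINDS`, `isPartKind`, and `kindCheckClause` are hypothetical names, and the actual `parts.ts` and `schema.sql` may be organised differently.

```typescript
// Values mirror the kinds named in the CHECK constraint above, plus 'synthesis'.
const PART_KINDS = [
  "text",
  "tool_call",
  "tool_result",
  "reasoning",
  "step_start",
  "synthesis",
] as const;

// Union type derived from the list, so adding a kind updates both at once.
type PartKind = (typeof PART_KINDS)[number];

/** Runtime guard for values read back from the database. */
function isPartKind(value: string): value is PartKind {
  return (PART_KINDS as readonly string[]).includes(value);
}

/** Renders the body of a CHECK clause from the same list (illustrative only). */
function kindCheckClause(column = "kind"): string {
  const quoted = PART_KINDS.map((k) => `'${k}'`).join(", ");
  return `${column} IN (${quoted})`;
}
```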

### `view_file` input shape uses `start_line`/`end_line`, not `line_count`

The dispatch's auto-fetch sketch implied a `line_count` parameter. The real `viewFile` tool's input schema (`tools.ts:51-55`) takes `start_line`/`end_line` (1-indexed inclusive) with a 200-line default if both are omitted. The pipeline uses `end_line: FILE_LINE_CAP` for files (200) and `end_line: DOC_LINE_CAP` for docs (500), which gives the first N lines — same effective truncation.
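The input shape described above could be built roughly like this. The cap values are from this proposal; the `ViewFileInput` type and the `firstLinesInput` helper are hypothetical, and the sketch assumes that omitting `start_line` means "start at line 1".

```typescript
// Caps are from the proposal; type and helper names are illustrative.
const FILE_LINE_CAP = 200;
const DOC_LINE_CAP = 500;

interface ViewFileInput {
  path: string;
  start_line?: number; // 1-indexed, inclusive
  end_line?: number; // 1-indexed, inclusive
}

/** Builds a view_file input that requests only the first N lines. */
function firstLinesInput(path: string, kind: "file" | "doc"): ViewFileInput {
  // Assumption: with start_line omitted, the tool reads from line 1.
  return { path, end_line: kind === "doc" ? DOC_LINE_CAP : FILE_LINE_CAP };
}
```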

### User-abort during synthesis marks the synth message failed (deviates from review req)

**Decision: option A — mark synth message `status='failed'` on every catch path including user-abort, then re-throw on user-abort.**

Sam's stated review requirement: "User-abort path does NOT mark the message failed (re-throw to outer handler is correct)."

Why this deviation: the outer abort handler (`error-handler.ts:handleAbortOrError`) operates on `args.assistantMessageId` — the *parent* assistant message that triggered the tool call. It does not know about the *new* synth assistant message that `runSynthesisPass` created. If the synth row isn't explicitly marked failed on user-abort, it sits in `status='streaming'` until the 5-min stale-streaming sweeper (`apps/server/src/index.ts`) picks it up — meanwhile the frontend's 60s no-token-activity timer trips the stale-stream banner on the orphan. Same UX bug class the v1.13.3 stuck-row sweeper was added to handle.

Cost: one extra DB write + one `message_complete` republish on the rare user-abort-during-synth path. Worth it to avoid the zombie message + ghost banner.

**Note for v1.14 outer-loop port**: when Phase C migrates the depth cap into `agent.steps` and reworks the recursion, the synth message remains a sibling of the parent assistant message; both belong to the same chat. The new outer loop should either (a) preserve this pattern (explicitly mark the synth message failed on every catch path, including user-abort) or (b) extend `handleAbortOrError` to sweep all chat-scoped streaming rows. Option (b) has a wider blast radius and was rejected here; option (a) is one targeted call site.

### Token budget priority list

Drop order when the 32k cap is exceeded (lowest priority first):
1. top-2..N files (keep top-1)
2. top-1 file
3. `*roadmap*.md` + `CONTEXT.md` (mid-priority — both describe state/intent)
4. `AGENTS.md`
5. `BOOCHAT.md`: **never dropped**; truncated to 32k tokens if it alone exceeds the cap

CONTEXT.md wasn't in the original dispatch's priority list; grouped with roadmap as mid-priority (same semantic — both are state/intent docs).
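The drop order above can be sketched as a tiered eviction loop. Several things here are assumptions rather than details from the pipeline: the `Tier` names, the four-characters-per-token estimate, and the choice to drop one item at a time (lowest-ranked first) within a tier instead of the whole tier at once.

```typescript
type Tier = "other_files" | "top_file" | "roadmap_context" | "agents" | "boochat";

interface ContextItem {
  name: string;
  tier: Tier;
  text: string;
}

const TOKEN_BUDGET = 32_000;
const CHARS_PER_TOKEN = 4; // rough estimate, an assumption of this sketch

const estimateTokens = (text: string): number =>
  Math.ceil(text.length / CHARS_PER_TOKEN);

const totalTokens = (items: ContextItem[]): number =>
  items.reduce((sum, item) => sum + estimateTokens(item.text), 0);

// Lowest priority first; "boochat" is deliberately absent (never dropped).
const DROP_ORDER: Tier[] = ["other_files", "top_file", "roadmap_context", "agents"];

/** Drops items tier by tier until the estimate fits the budget. */
function applyBudget(items: ContextItem[], budget = TOKEN_BUDGET): ContextItem[] {
  const kept = [...items];
  for (const tier of DROP_ORDER) {
    // Walk backwards so lower-ranked entries of a tier go first.
    for (let i = kept.length - 1; i >= 0 && totalTokens(kept) > budget; i--) {
      if (kept[i]?.tier === tier) kept.splice(i, 1);
    }
  }
  if (totalTokens(kept) > budget) {
    // Only BOOCHAT.md is left at this point: truncate it to the cap.
    return kept.map((item) => ({
      ...item,
      text: item.text.slice(0, budget * CHARS_PER_TOKEN),
    }));
  }
  return kept;
}
```

Dropping item by item keeps as many referenced files as the budget allows; dropping a whole tier at once would be simpler but coarser.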

### 90s timeout via `AbortSignal.any`

The synthesis call has its own `AbortController` with a 90s `setTimeout`. It is combined with `p.args.signal` (the user-abort signal) via `AbortSignal.any([user, synth])`, so either signal aborts the stream. `AbortSignal.any` requires Node 20.3+. A `timedOut` flag in scope disambiguates which signal tripped after `streamCompletion` throws (`AbortError`): timeout → return false (fall through to recursion); user-abort → re-throw (after `markSynthFailed`).
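The timeout and user-abort handling described above could look roughly like the following. This is a sketch under stated assumptions: `stream` and `markFailed` are stand-ins for `streamCompletion` and `markSynthFailed`, whose real signatures are not shown in this proposal, and `runWithSynthTimeout` is a hypothetical name.

```typescript
const SYNTH_TIMEOUT_MS = 90_000;

async function runWithSynthTimeout(
  userSignal: AbortSignal,
  stream: (signal: AbortSignal) => Promise<void>, // stand-in for streamCompletion
  markFailed: () => Promise<void>, // stand-in for markSynthFailed
  timeoutMs = SYNTH_TIMEOUT_MS,
): Promise<boolean> {
  const synth = new AbortController();
  let timedOut = false;
  const timer = setTimeout(() => {
    timedOut = true; // set before aborting so the catch block can tell the two apart
    synth.abort();
  }, timeoutMs);

  try {
    // Either signal firing aborts the combined signal.
    await stream(AbortSignal.any([userSignal, synth.signal]));
    return true;
  } catch (err) {
    // Every catch path marks the synth message failed first.
    await markFailed();
    if (!timedOut && userSignal.aborted) throw err; // user-abort: re-throw
    return false; // timeout or other failure: fall through to recursion
  } finally {
    clearTimeout(timer);
  }
}
```

Setting the flag inside the timer callback, before calling `abort()`, is what lets the catch block distinguish a timeout from a user abort even though both surface as the same error type.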

### Race-safe synth-tool capture under `Promise.all`

`synthEntries: Array<{tc, output, error?}>` populated by each parallel callback pushing its own result. After `Promise.all` resolves, `synthEntries.find((e) => !e.error && e.output != null)` picks the first non-error synth entry by call-order (i.e. by `toolCalls` array index in the original LLM emit order). Not result-quality scoring — explicitly call-order, documented inline.
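A sketch of the per-callback capture, assuming a simplified `ToolCall` shape and an `execute` function standing in for the real tool runner (both hypothetical). Because parallel callbacks push in completion order, this sketch records each call's index and sorts before picking, which is one way to guarantee call-order selection; the actual implementation may achieve that differently.

```typescript
interface ToolCall {
  id: string;
  name: string;
}

interface SynthEntry {
  index: number; // position in the original toolCalls array
  tc: ToolCall;
  output?: string;
  error?: string;
}

const SYNTH_TOOL_NAMES = new Set([
  "get_codebase_overview",
  "get_framework_analysis",
  "get_semantic_neighborhoods",
]);

/** Runs all tool calls in parallel and returns the first successful synth entry by call order. */
async function runToolsAndPickSynth(
  toolCalls: ToolCall[],
  execute: (tc: ToolCall) => Promise<string>, // stand-in for the real tool runner
): Promise<SynthEntry | undefined> {
  const synthEntries: SynthEntry[] = [];

  await Promise.all(
    toolCalls.map(async (tc, index) => {
      const isSynth = SYNTH_TOOL_NAMES.has(tc.name);
      try {
        const output = await execute(tc);
        // Each callback pushes its own entry; no shared variable is overwritten.
        if (isSynth) synthEntries.push({ index, tc, output });
      } catch (err) {
        if (isSynth) synthEntries.push({ index, tc, error: String(err) });
      }
    }),
  );

  // Completion order is nondeterministic, so restore call order before picking.
  synthEntries.sort((a, b) => a.index - b.index);
  return synthEntries.find((e) => !e.error && e.output != null);
}
```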

### Known interaction: qwen3.6 `include_stats: "True"` retry loop compounds synth-pass cost

Smoke #1 surfaced a pre-existing qwen3.6 quirk: the model emits `"True"` (string) instead of `true` (bool) for boolean tool args. The `experimental_repairToolCall` + zod-reject retry path (v1.13.3) handles this — the model retries on the next turn with corrected args, then succeeds.

**Synth pass cost interaction:** when the first tool-call fails zod validation, the recursive `runAssistantTurn` fires *before* the successful synth-tool call lands. The user effectively pays: (1) failed tool-call turn → (2) error tool-result → (3) retry tool-call turn → (4) successful tool-result → (5) synth pass.

Per-fire cost for an overview question is now a 5-step sequence containing 3 inference calls (steps 1, 3, and 5 are model calls; step 5 is the synth pass, which adds ~5k tokens of auto-fetched context). Not a blocker: the synth content is dramatically better than the without-synth case (4920 tokens of cited analysis vs. a 70-token tool-call-only turn). Worth tracking if usage stats start showing it.

### v1.14 outer-loop port — preserve this pattern

Two patterns from this batch the Phase C outer-loop port must preserve:

1. **Chat-scoped abort cleanup**: the synth message is a sibling to the parent assistant message, both belong to the same chat. The new outer loop should either (a) keep `markSynthFailed` (or its equivalent) firing on every catch path including user-abort, or (b) extend `handleAbortOrError` to sweep all chat-scoped streaming rows. This batch chose (a); (b) was rejected as wider blast radius.
2. **Race-safe `Promise.all` capture**: `synthEntries: Array<...>` instead of a single shared variable. Per-callback push avoids the last-write-wins race when a batch has multiple synth tools.

## Test plan

5-prompt smoke + 1 failure-injection run (6 steps total). Sequence:

1. **Default agent** — "What's in this codebase?" → expect `get_codebase_overview` + synthesis pass, response cites BOOCHAT.md + actual files + roadmap state.
2. **Architect agent** — "Give me a system overview of how BooCode handles tool calls" → expect synthesis with refs to inference/turn.ts, tool-phase.ts, stream-phase.ts.
3. **Architect agent** — "What's the current state of v1.13?" → synthesis must read `boocode_roadmap.md` and report shipped vs planned correctly. Must NOT infer "v1.13.2 shipped" from code presence — roadmap explicitly defers it.
4. **Code Reviewer** — "Find all callers of buildSystemPrompt" → `search_symbols` fires, NO synthesis pass (not in SYNTHESIS_TOOLS).
5. **Debugger** — "Where is detectDoomLoop defined and called from?" → `search_symbols` + `get_dependencies`, NO synthesis pass.
6. **Failure injection** — temporarily make `streamCompletion` throw inside `runSynthesisPass`; verify fall-through to recursion + log entry visible + non-empty answer.

## Backups in place

```
apps/server/src/schema.sql.bak-v1.13.13-20260522
apps/server/src/services/inference/parts.ts.bak-v1.13.13-20260522
apps/server/src/services/inference/tool-phase.ts.bak-v1.13.13-20260522
```

To be deleted after merge.

## Smoke results

### Smoke #1 — default agent, "What is in this codebase?"

Synthesis fired on `get_codebase_overview`. Log line:
```
{"chatId":"7bb05e54-…","synthMessageId":"44480541-…","toolName":"get_codebase_overview","chars":6727,"files":5,"msg":"synthesis pass complete"}
```

Token accounting: synth turn = 4920 tokens (vs. 63 + 70 on the preceding tool-call-only turns). Model is using the auto-fetched context, not parroting codecontext output. Synth message has the expected `kind='synthesis'` part dual-write.

Side note: qwen3.6 needed one retry due to the `include_stats: "True"` quirk (see Decisions). `repairToolCall` handled it; synth fired on the successful call.

### Smoke #6 — fault injection

Env-gated throw inserted between the synth-message INSERT and the `streamCompletion` call. Container rebuilt with `V1_13_13_FAULT_INJECT=1`. Sent the same prompt to a new smoke chat.

All 6 expected outcomes confirmed:

| # | Outcome | Evidence |
|---|---|---|
| 1 | `runSynthesisPass` throws | log: `err: "Error: v1.13.13 smoke #6 fault injection"` |
| 2 | Synth message marked `status='failed'` with empty content | msg `7ac9c685-…` role=assistant status=failed content_len=0 |
| 3 | `message_complete` frame published for the synth message | implicit via `markSynthFailed`; frontend never tripped the 60s timer |
| 4 | Fall-through to recursive `runAssistantTurn` | log: `synthesis pass failed; falling through to recursive turn` |
| 5 | User sees normal (non-synthesized) assistant response | final msg `924076a3-…` 453 tokens: `"This is **boocode** — a self-hosted, single-user developer chat app."` |
| 6 | Stale-stream banner does NOT fire on failed synth | confirmed — terminal `status='failed'` is what `applyFrame` writes |

Fault injection reverted post-test:
- `grep FAULT_INJECT apps/server/src/services/synthesisPipeline.ts docker-compose.yml` → empty
- `grep FAULT_INJECT apps/server/dist/services/synthesisPipeline.js` → empty
- `docker compose exec boocode printenv V1_13_13_FAULT_INJECT` → exit 1 (unset)
- Boot log clean, `skills loaded: 14`

### Smokes #2–#5

Sam is doing the qualitative reads from the UI in parallel — those verifications are about synthesis content quality (cites correct files, reads roadmap accurately, no-synthesis on `search_symbols`).

## Done when

- ✅ `synthesisPrompt.ts` + `synthesisPipeline.ts` created
- ✅ `parts.ts` PartKind union extended
- ✅ `tool-phase.ts` insertion point edited
- ✅ Schema migration block added (deviation from dispatch acknowledged)
- ✅ Type-clean (`pnpm -C apps/server build`)
- ✅ Container rebuilt + migration confirmed via pg_constraint and logs
- ✅ Smoke #1 (positive synth path) verified
- ✅ Smoke #6 (fault injection + fall-through) verified, injection reverted
- ⏳ Smokes #2–#5 (Sam's UI reads)
- ⏳ Sam commit