boocode

Author	SHA1	Message	Date
indifferentketchup	ec8593cf77	v1.13.4: two-tier compaction prune (opencode pattern half-shipped in v1.11.0) - message_parts.hidden_at timestamptz column (NULL by default) with a partial index on (message_id) WHERE hidden_at IS NULL for the common visible-parts filter. - messages_with_parts view changed from COALESCE(parts, legacy) to CASE WHEN EXISTS(any parts of kind) THEN visible-parts ELSE legacy. COALESCE would have leaked hidden parts back via the legacy fallback when every part was pruned (smoke caught it pre-commit). The CASE distinguishes "no parts at all → fall back to legacy column for pre-v1.13.0 history" from "all parts hidden → return null/empty so the row drops out of the model payload" exactly. - prune.ts: scans tool_result parts newest-first, protects the last 40k tokens (PROTECTED_TOKENS), marks older candidates hidden when their combined estimate clears 20k (PRUNE_TRIGGER_TOKENS, equal to COMPACTION_BUFFER from v1.11.0, so a successful prune is exactly the budget the summary path would have freed). Stops at chats.tail_start_id so it doesn't double-erase across the last summary boundary. Pure decision helper selectPruneTargets exported separately for unit tests. - Wired into maybeFlagForCompaction: prune runs synchronously when overflow is detected; if it freed >= PRUNE_TRIGGER_TOKENS, the needs_compaction flag is NOT set and the (expensive) summary inference call is skipped this turn. The next turn's overflow check re-evaluates from scratch. - 6 new unit tests in prune.test.ts cover: empty input, protection-only (no candidates), candidates below trigger, candidates above trigger, candidates straddling a summary boundary, exactly-protection-tokens. 179 tests total (was 173). Smoke verified post-rebuild: - \d message_parts shows hidden_at + partial index. - View definition shows AND p.hidden_at IS NULL filters on all three subselects. - Synthetic hide-then-restore confirmed the view drops the tool_result jsonb to null when its only part is hidden, and restores when un-hidden. - EXPLAIN ANALYZE on the 42-message stress chat: 0.325ms (faster than v1.13.1-B's 1.018ms; EXISTS short-circuits cleanly for the common no-parts case). - Normal turn (plain text prompt) completes unaffected. Closes a v1.11.0 design item that was scoped but never implemented. With v1.13's parts table the prune is dramatically cheaper to write: pre-parts it would have meant editing JSON blobs in-place; now it's a hidden_at flag and a view subselect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 07:02:17 +00:00
indifferentketchup	ac1a71f583	v1.13.1-C: port ask_user_input correlation to parts + wire reasoning_parts end-to-end Pass 1: ask_user_input correlation port (messages.ts:478, :549): - The two correlation queries that backed the elicitation flow used to scan messages.tool_calls and messages.tool_results JSON columns directly. They now JOIN message_parts on payload->>'id' (for the caller assistant) and payload->>'tool_call_id' (for the pending tool row). Semantics preserved: ORDER BY m.created_at DESC LIMIT 1 still picks the latest issuance, the already-answered 409 guard now reads payload.output, and the UPDATE + parts replace inside sql.begin is unchanged from v1.13.0. - Pre-v1.13.0 history has no parts rows and is unreachable to this lookup path (404). Acceptable per dispatch decision: no pending elicitation from before v1.13.0 will still be open. JSON-column fallback can land as a hotfix if it ever surfaces. Pass 2: reasoning_parts wired end-to-end: - types.ts/StreamResult gains `reasoning: string`. stream-phase.ts accumulates reasoning-delta text per stream (replacing the v1.13.1-A counter-only diagnostic) and returns it on the result. - parts.ts/partsFromAssistantMessage gains an optional `reasoning` param. When present it emits a kind='reasoning' part at sequence 0, ahead of the text and tool_call parts. - error-handler.ts/finalizeCompletion and tool-phase.ts/executeToolPhase both thread result.reasoning into the dual-write call so reasoning-channel models (qwen3.6) get persistent reasoning rows. - payload.ts: loadContext SELECT pulls reasoning_parts from the v1.13.1-B view; OpenAiMessage gains an optional `reasoning` field; buildMessagesPayload collapses reasoning_parts into a single string per assistant message. - stream-phase.ts/toModelMessages converts assistant messages with reasoning into an AI SDK ModelMessage content array starting with a ReasoningPart, matching the @ai-sdk/provider-utils AssistantContent union. Reasoning models can now replay prior reasoning context across tool-call boundaries. - types/api.ts and apps/web/src/api/types.ts Message interface gain reasoning_parts (optional, nullable). Frontend doesn't render this yet; field reserved for a v1.14 UI surface. Tests: 2 new in parts.test.ts cover reasoning-at-sequence-0 with and without text content. 172 tests pass (170 prior + 2 new). Smoke verified against the live container: - A reasoning-prompt ("walk through 17 × 23 step by step") produced one message with kind='reasoning' (361 chars) at sequence 0 and kind='text' (429 chars) at sequence 1. Adapter log confirmed reasoning capture. - The new correlation SQL was validated against existing tool_call / tool_result parts: returns the expected message_id + payload shape with pending state correctly identified via payload.output IS NULL. - ask_user_input end-to-end through the UI is Sam's smoke; the Prompt Builder agent does not always trigger ask_user_input for these prompts, so synthetic verification via SQL substituted for traffic-driven cover. Annotation: the v1.13.1-A abort-throw site in stream-phase.ts got a one-liner comment ("AI SDK v6 fullStream returns normally on abort; check signal explicitly.") to prevent a future refactor removing it. v1.13.2 drops the dual-write + the JSON columns + collapses the view. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:34:10 +00:00
indifferentketchup	13c3aa5b4e	v1.13.1-B: read-path flip from tool_calls/tool_results JSON columns to message_parts - schema.sql: new messages_with_parts view. tool_calls aggregates parts with kind='tool_call' as a jsonb array of {id, name, args}; tool_results picks the single sequence=0 part with kind='tool_result' as a jsonb {tool_call_id, output, truncated, error?}. COALESCE against the legacy jsonb columns means pre-v1.13.0 history (no parts rows) still reads correctly via the fallback, and fresh inserts (where parts dual-write follows the row INSERT) hit the legacy columns until the parts land. - reasoning_parts column added to the view but not selected by any caller yet; v1.13.1-C extends the Message type and pulls it into the model payload alongside the type extension. - Read sites switched to FROM messages_with_parts: - routes/chats.ts:427 (chat history GET) - routes/messages.ts:95 (session history GET) - routes/ws.ts:27 (WS snapshot on session connect, resume path) - services/inference/payload.ts (loadContext for model assembly) - services/compaction.ts (compaction's payload assembly) - chats.ts:394 (discard_stale UPDATE RETURNING) unchanged: UPDATEs target messages directly and the returned shape is for a freshly-modified row where the legacy column is dual-written and correct. - messages.ts:478/549 (ask_user_input correlation) intentionally not migrated; those query a different shape, ported in v1.13.1-C. - Writes still target `messages` directly; the view is read-only. Smoke verified against the live container: - Equivalence: 5/5 messages with both legacy column and parts row return identical tool_calls jsonb between FROM messages and FROM messages_with_parts. - Perf: EXPLAIN ANALYZE on the 42-message stress chat returns in ~1ms (50ms threshold). Bitmap Index Scan on message_parts_msg_seq_idx carries the parts lookups. - API contract: GET /api/chats/:id/messages returns identical {id, name, args} tool_calls and {tool_call_id, output, truncated, error} tool_results shapes to frontend consumers; no UI changes needed. - Inference path: sent a view_file prompt; assistant turn 1 emitted the tool_call, tool message captured the result, follow-up assistant turn read the result back via loadContext (now view-backed) and answered correctly. End-to-end loop intact. v1.13.2 drops the dual-write + the JSON columns + simplifies the view to just SELECT FROM message_parts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 06:22:47 +00:00
indifferentketchup	9ef00c0268	v1.12.4: complete inference.ts split into services/inference/ - sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel, runDoomLoopSummary, insertDoomLoopSentinel - inference.ts → inference/turn.ts: residue is runAssistantTurn, runInference, createInferenceRunner orchestration only - inference/index.ts: re-export shim preserves the public surface (createInferenceRunner, runInference, runAssistantTurn, detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/FramePublisher) - src/index.ts + auto_name.ts + the two vitest test files updated to import from ./services/inference/index.js explicitly (NodeNext ESM doesn't honor directory-index resolution) Final tally: 11 files under services/inference/, the largest being sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept side-by-side until a third sentinel justifies factoring out a shared runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is stream-phase.ts at 380. Public import surface unchanged. tool-phase.ts → turn.ts back-edge for runAssistantTurn remains (cycle is safe; resolved at call time). Prepares the file structure for v1.13 AI SDK migration; streamText swap targets stream-phase.ts only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:36:35 +00:00
indifferentketchup	8fa7b7fce9	v1.12.4-rc2: extract payload + error-handler from inference.ts - payload.ts: buildMessagesPayload (re-exported), loadContext, maybeFlagForCompaction - error-handler.ts: handleAbortOrError, finalizeCompletion Both new files type-import InferenceContext/StreamResult/TurnArgs from inference.ts; ESM elides type imports so there's no runtime cycle. handleAbortOrError turned out not to call the summary functions, so no back-edge needed. inference.ts shrinks from ~1676 to ~1401 LoC. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 22:09:50 +00:00
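
Sketch for the v1.13.4 prune decision (ec8593cf77). This is not the code in prune.ts; it is a minimal illustration of the rule as the message describes it: walk tool_result parts newest-first, protect the most recent 40k tokens, and hide the older candidates only if together they reach 20k. The constant names and values come from the message. Everything else is an assumption: the `ToolResultPart` shape, the `beforeTail` flag standing in for the chats.tail_start_id boundary, the function name `selectPruneTargetsSketch`, and the choice to treat a part as protected whenever the protection budget is not yet full when the scan reaches it.

```typescript
// Values as stated in the v1.13.4 commit message.
const PROTECTED_TOKENS = 40_000;
const PRUNE_TRIGGER_TOKENS = 20_000;

// Hypothetical row shape; the real message_parts columns are not shown in the log.
interface ToolResultPart {
  id: string;
  tokens: number; // estimated token count of this part
  beforeTail?: boolean; // assumption: true once the part predates the last summary boundary
}

interface PruneDecision {
  targets: string[]; // ids that would get hidden_at set
  freedTokens: number; // combined estimate of the hidden parts
}

// `parts` must be ordered newest-first.
function selectPruneTargetsSketch(parts: ToolResultPart[]): PruneDecision {
  let protectedTokens = 0;
  let candidateTokens = 0;
  const candidates: string[] = [];

  for (const part of parts) {
    // Stop at the summary boundary so nothing behind it is touched again.
    if (part.beforeTail) break;

    // Assumption: a part is protected if the budget was not yet full on arrival.
    if (protectedTokens < PROTECTED_TOKENS) {
      protectedTokens += part.tokens;
      continue;
    }

    candidates.push(part.id);
    candidateTokens += part.tokens;
  }

  // Below the trigger, hide nothing; the caller would fall through to the summary path.
  if (candidateTokens < PRUNE_TRIGGER_TOKENS) {
    return { targets: [], freedTokens: 0 };
  }
  return { targets: candidates, freedTokens: candidateTokens };
}

// Usage: two recent parts fill the protection window, two older ones clear the trigger.
const decision = selectPruneTargetsSketch([
  { id: "a", tokens: 30_000 },
  { id: "b", tokens: 15_000 },
  { id: "c", tokens: 12_000 },
  { id: "d", tokens: 9_000 },
]);
console.log(decision.targets, decision.freedTokens);
```

Keeping the decision pure (no database access) is what the message itself points to with the separately exported helper: the six listed test cases can then run against plain arrays.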

5 Commits
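
Sketch for the v1.13.1-C part ordering (ac1a71f583). The message says partsFromAssistantMessage emits a kind='reasoning' part at sequence 0 when reasoning is present, ahead of the text and tool_call parts. The version below only illustrates that ordering. The `Part` and `ToolCall` types, the `{ text }` payload shape for reasoning and text parts, and the name `partsFromAssistantMessageSketch` are assumptions; only the {id, name, args} tool_call payload is taken from the v1.13.1-B message.

```typescript
type PartKind = "reasoning" | "text" | "tool_call";

// Hypothetical in-memory shape of a message_parts row.
interface Part {
  kind: PartKind;
  sequence: number;
  payload: Record<string, unknown>;
}

interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

function partsFromAssistantMessageSketch(
  text: string,
  toolCalls: ToolCall[],
  reasoning?: string,
): Part[] {
  const parts: Part[] = [];
  let sequence = 0;

  // Reasoning, when present, always takes sequence 0.
  if (reasoning) {
    parts.push({ kind: "reasoning", sequence: sequence++, payload: { text: reasoning } });
  }
  // Text follows; it may be empty when the turn is tool calls only.
  if (text) {
    parts.push({ kind: "text", sequence: sequence++, payload: { text } });
  }
  // Tool calls keep their emission order after the text.
  for (const call of toolCalls) {
    parts.push({
      kind: "tool_call",
      sequence: sequence++,
      payload: { id: call.id, name: call.name, args: call.args },
    });
  }
  return parts;
}

// Usage: mirrors the smoke result (reasoning at sequence 0, text at sequence 1).
const smokeParts = partsFromAssistantMessageSketch("17 × 23 = 391", [], "17 × 20 + 17 × 3");
console.log(smokeParts.map((p) => `${p.sequence}:${p.kind}`).join(","));
```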