v1.13.4: two-tier compaction prune — opencode pattern half-shipped in v1.11.0

- message_parts.hidden_at timestamptz column (NULL by default) with a
  partial index on (message_id) WHERE hidden_at IS NULL for the common
  visible-parts filter.
- messages_with_parts view changed from COALESCE(parts, legacy) to
  CASE WHEN EXISTS(any parts of kind) THEN visible-parts ELSE legacy.
  COALESCE would have leaked hidden parts back via the legacy fallback
  when every part was pruned (smoke caught it pre-commit). The CASE
  distinguishes "no parts at all → fall back to legacy column for
  pre-v1.13.0 history" from "all parts hidden → return null/empty so
  the row drops out of the model payload" exactly.
- prune.ts: scans tool_result parts newest-first, protects the last 40k
  tokens (PROTECTED_TOKENS), marks older candidates hidden when their
  combined estimate clears 20k (PRUNE_TRIGGER_TOKENS, equal to
  COMPACTION_BUFFER from v1.11.0, so a successful prune frees at least
  the budget the summary path would have freed). Stops at
  chats.tail_start_id so it doesn't double-erase across the last summary
  boundary. Pure decision helper selectPruneTargets is exported
  separately for unit tests.
- Wired into maybeFlagForCompaction: prune runs synchronously when
  overflow is detected; if it freed >= PRUNE_TRIGGER_TOKENS, the
  needs_compaction flag is NOT set and the (expensive) summary inference
  call is skipped this turn. The next turn's overflow check re-evaluates
  from scratch.
- 6 new unit tests in prune.test.ts cover: empty input, protection-only
  (no candidates), candidates below trigger, candidates above trigger,
  candidates straddling a summary boundary, exactly-protection-tokens.
  179 tests total (was 173).
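
A minimal sketch of the selection and skip logic described in the list
above, not the shipped prune.ts. The part shape, the function names
selectPruneTargetsSketch and shouldFlagForCompaction, numeric message ids
that grow over time, and protecting a part whole when it straddles the
40k boundary are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real prune.ts types are not shown in this commit.
interface ToolResultPart {
  partId: number;
  messageId: number; // assumed numeric and increasing with time
  tokenEstimate: number;
}

interface PruneDecision {
  targets: number[]; // partIds to mark hidden (set hidden_at)
  freedTokens: number; // 0 when the trigger is not met
}

const PROTECTED_TOKENS = 40_000;
const PRUNE_TRIGGER_TOKENS = 20_000;

// Pure decision helper: expects tool_result parts ordered newest-first.
function selectPruneTargetsSketch(
  partsNewestFirst: ToolResultPart[],
  tailStartId: number | null,
): PruneDecision {
  let protectedSoFar = 0;
  let candidateTokens = 0;
  const candidates: number[] = [];

  for (const part of partsNewestFirst) {
    // Stop at the summary boundary: older parts are already covered
    // by the last summary and must not be pruned again.
    if (tailStartId !== null && part.messageId < tailStartId) break;

    // Keep filling the protected window with the newest parts first.
    if (protectedSoFar < PROTECTED_TOKENS) {
      protectedSoFar += part.tokenEstimate;
      continue;
    }

    candidates.push(part.partId);
    candidateTokens += part.tokenEstimate;
  }

  // Prune only when it frees at least the trigger budget.
  if (candidateTokens < PRUNE_TRIGGER_TOKENS) {
    return { targets: [], freedTokens: 0 };
  }
  return { targets: candidates, freedTokens: candidateTokens };
}

// Wiring sketch: the needs_compaction flag is set only when the prune
// did not free enough, so the summary inference call is skipped otherwise.
function shouldFlagForCompaction(decision: PruneDecision): boolean {
  return decision.freedTokens < PRUNE_TRIGGER_TOKENS;
}
```

Keeping the decision free of database access is what lets the six cases
listed above run as plain unit tests.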

Smoke verified post-rebuild:
- \d message_parts shows hidden_at + partial index.
- View definition shows AND p.hidden_at IS NULL filters on all three
  subselects.
- Synthetic hide-then-restore confirmed the view drops the tool_result
  jsonb to null when its only part is hidden, and restores when un-hidden.
- EXPLAIN ANALYZE on the 42-message stress chat: 0.325ms (faster than
  v1.13.1-B's 1.018ms — EXISTS short-circuits cleanly for the common
  no-parts case).
- Normal turn (plain text prompt) completes unaffected.
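
The hide-then-restore check above can be modelled outside SQL. The
sketch below is an illustration only: the Part shape and both function
names are hypothetical, and null stands in for SQL NULL. It contrasts
the CASE-on-EXISTS resolution with a COALESCE over the visible parts,
which would fall back to the legacy column once every part is hidden.

```typescript
// Hypothetical in-memory stand-in for a message_parts row.
interface Part {
  kind: "tool_call" | "tool_result" | "reasoning";
  sequence: number;
  payload: unknown;
  hiddenAt: Date | null;
}

// First visible tool_result payload by sequence, or null if none.
function firstVisibleToolResult(parts: Part[]): unknown {
  const visible = parts
    .filter((p) => p.kind === "tool_result" && p.hiddenAt === null)
    .sort((a, b) => a.sequence - b.sequence);
  const first = visible[0];
  return first ? first.payload : null;
}

// CASE WHEN EXISTS(any part of kind) THEN visible parts ELSE legacy.
function resolveToolResultCase(parts: Part[], legacy: unknown): unknown {
  const anyOfKind = parts.some((p) => p.kind === "tool_result");
  return anyOfKind ? firstVisibleToolResult(parts) : legacy;
}

// COALESCE(visible parts, legacy): leaks legacy when all parts are hidden.
function resolveToolResultCoalesce(parts: Part[], legacy: unknown): unknown {
  return firstVisibleToolResult(parts) ?? legacy;
}
```

With a single hidden tool_result part and a non-null legacy value, the
CASE form yields null while the COALESCE form returns the legacy value;
with no parts at all, both return the legacy value.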

Closes a v1.11.0 design item that was scoped but never implemented. With
v1.13's parts table the prune is dramatically cheaper to write — pre-parts
it would have meant editing JSON blobs in-place; now it's a hidden_at
flag and a view subselect.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 07:02:17 +00:00
parent a08d809b73
commit ec8593cf77
4 changed files with 286 additions and 15 deletions


@@ -56,6 +56,24 @@ CREATE TABLE IF NOT EXISTS message_parts (
);
CREATE INDEX IF NOT EXISTS message_parts_msg_seq_idx ON message_parts (message_id, sequence);
+-- v1.13.4: prune support. hidden_at marks parts that have been pruned out
+-- of the model payload by the two-tier compaction prune (services/inference/
+-- prune.ts). Rows stay in the DB so frontend can still display them with a
+-- "hidden" indicator (out of scope this dispatch). messages_with_parts
+-- view filters these out — see below. Partial index speeds the common
+-- "visible parts only" filter.
+DO $$
+BEGIN
+IF NOT EXISTS (
+SELECT 1 FROM information_schema.columns
+WHERE table_name = 'message_parts' AND column_name = 'hidden_at'
+) THEN
+ALTER TABLE message_parts ADD COLUMN hidden_at timestamptz NULL;
+END IF;
+END $$;
+CREATE INDEX IF NOT EXISTS message_parts_hidden_idx
+ON message_parts (message_id) WHERE hidden_at IS NULL;
-- v1.13.1-B: read-path view. Read sites SELECT FROM messages_with_parts
-- instead of messages so tool_calls / tool_results / reasoning_parts come
-- from the granular message_parts table. The COALESCE means pre-v1.13.0
@@ -73,23 +91,32 @@ SELECT
m.last_seq, m.tokens_used, m.ctx_used, m.ctx_max,
m.started_at, m.finished_at, m.created_at, m.metadata,
m.summary, m.tail_start_id, m.compacted_at,
-COALESCE(
-(SELECT jsonb_agg(p.payload ORDER BY p.sequence)
-FROM message_parts p
-WHERE p.message_id = m.id AND p.kind = 'tool_call'),
-m.tool_calls
-) AS tool_calls,
-COALESCE(
-(SELECT p.payload
-FROM message_parts p
-WHERE p.message_id = m.id AND p.kind = 'tool_result'
-ORDER BY p.sequence
-LIMIT 1),
-m.tool_results
-) AS tool_results,
+-- v1.13.4: prune semantics need to distinguish "no parts row exists"
+-- (pre-v1.13.0 fallback to legacy column) from "all parts hidden"
+-- (prune intended — return null/empty so the row drops from the model
+-- payload). A naive COALESCE would fall back to the legacy column when
+-- every part is hidden, undoing the prune. CASE on EXISTS(any kind)
+-- splits the two cases.
+CASE
+WHEN EXISTS (SELECT 1 FROM message_parts pp
+WHERE pp.message_id = m.id AND pp.kind = 'tool_call')
+THEN (SELECT jsonb_agg(p.payload ORDER BY p.sequence)
+FROM message_parts p
+WHERE p.message_id = m.id AND p.kind = 'tool_call' AND p.hidden_at IS NULL)
+ELSE m.tool_calls
+END AS tool_calls,
+CASE
+WHEN EXISTS (SELECT 1 FROM message_parts pp
+WHERE pp.message_id = m.id AND pp.kind = 'tool_result')
+THEN (SELECT p.payload
+FROM message_parts p
+WHERE p.message_id = m.id AND p.kind = 'tool_result' AND p.hidden_at IS NULL
+ORDER BY p.sequence LIMIT 1)
+ELSE m.tool_results
+END AS tool_results,
(SELECT jsonb_agg(p.payload ORDER BY p.sequence)
FROM message_parts p
-WHERE p.message_id = m.id AND p.kind = 'reasoning') AS reasoning_parts
+WHERE p.message_id = m.id AND p.kind = 'reasoning' AND p.hidden_at IS NULL) AS reasoning_parts
FROM messages m;
ALTER TABLE messages ADD COLUMN IF NOT EXISTS tokens_used INTEGER;