v1.13.1-B: read-path flip from tool_calls/tool_results JSON columns to message_parts

- schema.sql: new messages_with_parts view. tool_calls aggregates parts with kind='tool_call', ordered by sequence, as a jsonb array of {id, name, args}; tool_results picks the first part by sequence (sequence=0) with kind='tool_result' as a single jsonb object {tool_call_id, output, truncated, error?}. COALESCE against the legacy jsonb columns means pre-v1.13.0 history (no parts rows) still reads correctly via the fallback, and fresh inserts (where the parts dual-write follows the row INSERT) read from the legacy columns until the parts land.
- reasoning_parts column added to the view but not selected by any caller yet; v1.13.1-C extends the Message type and pulls it into the model payload.
- Read sites switched to FROM messages_with_parts:
  - routes/chats.ts:427 (chat history GET)
  - routes/messages.ts:95 (session history GET)
  - routes/ws.ts:27 (WS snapshot on session connect, resume path)
  - services/inference/payload.ts (loadContext for model assembly)
  - services/compaction.ts (compaction's payload assembly)
- chats.ts:394 (discard_stale UPDATE RETURNING) unchanged: UPDATEs target messages directly, and the returned shape is for a freshly modified row where the legacy column is dual-written and correct.
- messages.ts:478/549 (ask_user_input correlation) intentionally not migrated: those query a different shape, ported in v1.13.1-C.
- Writes still target `messages` directly; the view is read-only.

Smoke verified against the live container:
- Equivalence: 5/5 messages with both a legacy column and a parts row return identical tool_calls jsonb between FROM messages and FROM messages_with_parts.
- Perf: EXPLAIN ANALYZE on the 42-message stress chat runs in ~1ms (threshold: 50ms). Bitmap Index Scan on message_parts_msg_seq_idx carries the parts lookups.
- API contract: GET /api/chats/:id/messages returns identical {id, name, args} tool_calls and {tool_call_id, output, truncated, error} tool_results shapes to frontend consumers, so no UI changes are needed.
- Inference path: sent a view_file prompt; assistant turn 1 emitted the tool_call, the tool message captured the result, and the follow-up assistant turn read the result back via loadContext (now view-backed) and answered correctly. End-to-end loop intact.

v1.13.2 drops the dual-write and the JSON columns, and simplifies the view to just SELECT FROM message_parts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 06:22:47 +00:00
parent c2c4f78a26
commit 13c3aa5b4e
6 changed files with 49 additions and 5 deletions
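
Reading aid, not part of the commit: the fallback that messages_with_parts applies per row can be modeled in memory roughly as follows. This is a minimal sketch under stated assumptions. The names PartRow, LegacyMessage, ResolvedMessage, partsOfKind and resolveMessage are hypothetical and do not exist in the codebase, and the sketch ignores the edge case of a stored JSON null payload.

```typescript
// Hypothetical in-memory model of the messages_with_parts fallback.
// None of these names come from the repository; they are illustrative only.
type Json = unknown;

interface PartRow {
  message_id: string;
  kind: "tool_call" | "tool_result" | "reasoning";
  sequence: number;
  payload: Json;
}

interface LegacyMessage {
  id: string;
  tool_calls: Json[] | null; // legacy jsonb column
  tool_results: Json | null; // legacy jsonb column
}

interface ResolvedMessage {
  id: string;
  tool_calls: Json[] | null;
  tool_results: Json | null;
  reasoning_parts: Json[] | null;
}

// Mirrors `SELECT jsonb_agg(p.payload ORDER BY p.sequence) ... WHERE kind = ...`.
function partsOfKind(
  parts: PartRow[],
  messageId: string,
  kind: PartRow["kind"],
): Json[] | null {
  const matched = parts
    .filter((p) => p.message_id === messageId && p.kind === kind)
    .sort((a, b) => a.sequence - b.sequence)
    .map((p) => p.payload);
  // An aggregate over zero rows is NULL in SQL, which lets COALESCE fall through.
  return matched.length > 0 ? matched : null;
}

function resolveMessage(m: LegacyMessage, parts: PartRow[]): ResolvedMessage {
  const calls = partsOfKind(parts, m.id, "tool_call");
  const results = partsOfKind(parts, m.id, "tool_result");
  return {
    id: m.id,
    // COALESCE(parts aggregate, legacy column)
    tool_calls: calls ?? m.tool_calls,
    // ORDER BY sequence LIMIT 1, then fall back to the legacy column
    tool_results: results !== null ? results[0] : m.tool_results,
    // No legacy column exists for reasoning, so there is no fallback here.
    reasoning_parts: partsOfKind(parts, m.id, "reasoning"),
  };
}
```

In this model a message with no parts rows resolves entirely from the legacy columns, which corresponds to the pre-v1.13.0 history case described in the commit message.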
--- a/apps/server/src/routes/chats.ts
+++ b/apps/server/src/routes/chats.ts
@@ -423,11 +423,12 @@ export function registerChatRoutes(
        reply.code(404);
        return { error: 'chat not found' };
      }
+      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const rows = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE chat_id = ${req.params.id}
        ORDER BY created_at ASC, id ASC
      `;
--- a/apps/server/src/routes/messages.ts
+++ b/apps/server/src/routes/messages.ts
@@ -91,11 +91,12 @@ export function registerMessageRoutes(
      // SummaryCard) and shows compacted_at-stamped rows inline for context.
      // Internal inference assembly filters compacted_at IS NULL separately —
      // see services/inference.ts loadContext + services/compaction.ts.
+      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const rows = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE session_id = ${req.params.id}
        ORDER BY created_at ASC, id ASC
      `;
--- a/apps/server/src/routes/ws.ts
+++ b/apps/server/src/routes/ws.ts
@@ -23,11 +23,12 @@ export function registerWebSocket(

      // v1.11: snapshot includes compaction fields so MessageBubble can
      // render the SummaryCard for summary=true rows on first connect.
+      // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
      const messages = await sql<Message[]>`
        SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
               tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata,
               summary, tail_start_id, compacted_at
-        FROM messages
+        FROM messages_with_parts
        WHERE session_id = ${sessionId}
        ORDER BY created_at ASC, id ASC
      `;
--- a/apps/server/src/schema.sql
+++ b/apps/server/src/schema.sql
@@ -49,6 +49,42 @@ CREATE TABLE IF NOT EXISTS message_parts (
 );
 CREATE INDEX IF NOT EXISTS message_parts_msg_seq_idx ON message_parts (message_id, sequence);

+-- v1.13.1-B: read-path view. Read sites SELECT FROM messages_with_parts
+-- instead of messages so tool_calls / tool_results / reasoning_parts come
+-- from the granular message_parts table. The COALESCE means pre-v1.13.0
+-- history (no parts rows) still resolves via the legacy JSON columns; the
+-- dual-write from v1.13.0 keeps both in sync for all rows written since.
+-- Writes continue to target `messages` directly — the view is read-only.
+-- Shapes match the in-memory ToolCall / ToolResult types: tool_calls is a
+-- jsonb array of {id, name, args}, tool_results is a single jsonb object
+-- {tool_call_id, output, truncated, error?}. reasoning_parts is new — only
+-- consumed by the inference history fetch (payload.ts) so v1.13.1-C can
+-- wire reasoning into the model payload. Not surfaced in external APIs yet.
+CREATE OR REPLACE VIEW messages_with_parts AS
+SELECT
+  m.id, m.session_id, m.chat_id, m.role, m.content, m.kind, m.status,
+  m.last_seq, m.tokens_used, m.ctx_used, m.ctx_max,
+  m.started_at, m.finished_at, m.created_at, m.metadata,
+  m.summary, m.tail_start_id, m.compacted_at,
+  COALESCE(
+    (SELECT jsonb_agg(p.payload ORDER BY p.sequence)
+       FROM message_parts p
+      WHERE p.message_id = m.id AND p.kind = 'tool_call'),
+    m.tool_calls
+  ) AS tool_calls,
+  COALESCE(
+    (SELECT p.payload
+       FROM message_parts p
+      WHERE p.message_id = m.id AND p.kind = 'tool_result'
+      ORDER BY p.sequence
+      LIMIT 1),
+    m.tool_results
+  ) AS tool_results,
+  (SELECT jsonb_agg(p.payload ORDER BY p.sequence)
+     FROM message_parts p
+    WHERE p.message_id = m.id AND p.kind = 'reasoning') AS reasoning_parts
+FROM messages m;
+
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS tokens_used INTEGER;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_used INTEGER;
 ALTER TABLE messages ADD COLUMN IF NOT EXISTS ctx_max INTEGER;
--- a/apps/server/src/services/compaction.ts
+++ b/apps/server/src/services/compaction.ts
@@ -342,9 +342,11 @@ export async function process(input: ProcessInput): Promise<void> {
  // 2. All currently-active messages in this chat (compacted_at IS NULL).
  // ORDER BY (created_at, id) matches loadContext in inference.ts so the
  // turns() boundary logic sees the same sequence the LLM will.
+  // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view so
+  // the compaction payload matches what the LLM saw on the original turn.
  const messages = await sql<CompactionMessage[]>`
    SELECT id, role, content, kind, summary, status, tool_calls, tool_results, metadata, created_at
-    FROM messages
+    FROM messages_with_parts
    WHERE chat_id = ${chatId} AND compacted_at IS NULL
    ORDER BY created_at ASC, id ASC
  `;
--- a/apps/server/src/services/inference/payload.ts
+++ b/apps/server/src/services/inference/payload.ts
@@ -116,10 +116,13 @@ export async function loadContext(
  // /api/sessions/:id/messages endpoint still returns everything (so the UI
  // can show history with the summary card inline); only LLM payloads skip
  // compacted rows. compacted_at IS NULL keeps the active summary + tail.
+  // v1.13.1-B: reads tool_calls/tool_results via the parts-merged view.
+  // v1.13.1-C will extend the Message type with reasoning_parts and pull
+  // it from the same view; deferred here so the type contract stays clean.
  const history = await sql<Message[]>`
    SELECT id, session_id, chat_id, role, content, kind, tool_calls, tool_results, status, last_seq,
           tokens_used, ctx_used, ctx_max, started_at, finished_at, created_at, metadata
-    FROM messages
+    FROM messages_with_parts
    WHERE chat_id = ${chatId} AND compacted_at IS NULL
    ORDER BY created_at ASC, id ASC
  `;