v1.12.2: live tok/s + ctx display next to status indicator

ChatThroughput renders inline beside StatusDot while the chat status is streaming or tool_running. The server republishes upstream usage chunks as throttled 'usage' WS frames; useSessionStream feeds them to the useChatThroughput singleton via recordUsage. The readout hides when the status leaves those two states or when the latest sample is older than 10s. Addresses the 2026-05-21 spike's UX gap, where slow streams looked identical to dead streams: a live token-velocity readout now distinguishes the two at a glance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v1.12.1: stop-handler writes terminal status + constraint cleanup + dead code removal
2026-05-21 20:45:53 +00:00 · 2026-05-21 20:34:40 +00:00 · 2026-05-21 20:32:02 +00:00 · 2026-05-21 17:15:02 +00:00
21 changed files with 612 additions and 185 deletions
--- a/apps/server/src/index.ts
+++ b/apps/server/src/index.ts
@@ -49,6 +49,18 @@ async function main() {
  await applySchema(sql);
  app.log.info('database schema applied');
  const swept = await sql<{ count: string }[]>`
    WITH swept AS (
      UPDATE messages SET status = 'failed'
      WHERE status = 'streaming' AND created_at < NOW() - INTERVAL '5 minutes'
      RETURNING id
    ) SELECT count(*)::text AS count FROM swept
  `;
  const sweptCount = Number(swept[0]?.count ?? 0);
  if (sweptCount > 0) {
    app.log.info({ sweptCount }, 'swept stale streaming messages to failed');
  }
  // v1.11.3: tell the model-context cache where llama-swap lives. Cache
  // lookups go to ${LLAMA_SWAP_URL}/upstream/<model>/props to read
  // default_generation_settings.n_ctx — the value persisted as messages.ctx_max.
--- a/apps/server/src/routes/sessions.ts
+++ b/apps/server/src/routes/sessions.ts
@@ -13,6 +13,18 @@ const CreateBody = z.object({
  agent_id: z.string().min(1).max(200).nullable().optional(),
 });
 const WorkspacePaneZ = z.object({
  id: z.string().min(1).max(200),
  kind: z.enum(['chat', 'terminal', 'agent', 'empty', 'settings']),
  chatId: z.string().min(1).max(200).optional(),
  chatIds: z.array(z.string().min(1).max(200)).max(50),
  activeChatIdx: z.number().int(),
 });
 const WorkspacePanesBody = z.object({
  workspace_panes: z.array(WorkspacePaneZ).max(10),
 });
 const PatchBody = z.object({
  name: z.string().min(1).max(200).optional(),
  model: z.string().min(1).max(200).optional(),
@@ -44,7 +56,7 @@ export function registerSessionRoutes(
      }
      const status = req.query.status === 'archived' ? 'archived' : 'open';
      const rows = await sql<Session[]>`
-        SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+        SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
        FROM sessions
        WHERE project_id = ${req.params.id} AND status = ${status}
        ORDER BY updated_at DESC
@@ -92,7 +104,7 @@ export function registerSessionRoutes(
        const [session] = await tx<Session[]>`
          INSERT INTO sessions (project_id, name, model, system_prompt, agent_id)
          VALUES (${req.params.id}, ${name}, ${model}, ${systemPrompt}, ${agentId})
-          RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+          RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
        `;
        await tx`
          INSERT INTO chats (session_id, name, status)
@@ -112,7 +124,7 @@ export function registerSessionRoutes(
  app.get<{ Params: { id: string } }>('/api/sessions/:id', async (req, reply) => {
    const rows = await sql<Session[]>`
-      SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+      SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
      FROM sessions WHERE id = ${req.params.id}
    `;
    if (rows.length === 0) {
@@ -158,7 +170,7 @@ export function registerSessionRoutes(
          updated_at = clock_timestamp()
        WHERE id = ${req.params.id}
        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
-                  agent_id, web_search_enabled
+                  agent_id, web_search_enabled, workspace_panes
      `;
      if (rows.length === 0) {
        reply.code(404);
@@ -187,6 +199,36 @@ export function registerSessionRoutes(
    }
  );
  app.patch<{ Params: { id: string } }>(
    '/api/sessions/:id/workspace',
    async (req, reply) => {
      const parsed = WorkspacePanesBody.safeParse(req.body);
      if (!parsed.success) {
        reply.code(400);
        return { error: 'invalid body', details: parsed.error.flatten() };
      }
      const rows = await sql<Session[]>`
        UPDATE sessions
        SET workspace_panes = ${sql.json(parsed.data.workspace_panes as never)},
            updated_at = clock_timestamp()
        WHERE id = ${req.params.id}
        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
                  agent_id, web_search_enabled, workspace_panes
      `;
      if (rows.length === 0) {
        reply.code(404);
        return { error: 'session not found' };
      }
      const session = rows[0]!;
      broker.publishUser('default', {
        type: 'session_workspace_updated',
        session_id: session.id,
        workspace_panes: session.workspace_panes,
      });
      return session;
    }
  );
  // v1.9: bulk-archive every open session in a project. Mirrors the
  // single-archive shape (same broker frame type) so the existing useSidebar
  // reducer cases handle it without changes — just N frames instead of 1.
@@ -263,7 +305,7 @@ export function registerSessionRoutes(
      const rows = await sql<Session[]>`
        UPDATE sessions SET status = 'open', updated_at = clock_timestamp()
        WHERE id = ${req.params.id} AND status = 'archived'
-        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled
+        RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
      `;
      if (rows.length === 0) {
        reply.code(404);
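For readers without zod at hand, the bounds that WorkspacePaneZ and WorkspacePanesBody enforce (ids of 1 to 200 characters, at most 50 chatIds per pane, at most 10 panes, integer activeChatIdx) can be sketched dependency-free. `isValidPane` and `isValidPaneList` are hypothetical names, not part of this change, and unlike `safeParse` they return a plain boolean rather than structured error details.

```typescript
// Illustrative sketch only: mirrors the limits in WorkspacePaneZ by hand.
const PANE_KINDS: readonly string[] = ['chat', 'terminal', 'agent', 'empty', 'settings'];

// Same bound as z.string().min(1).max(200).
function isBoundedString(v: unknown): v is string {
  return typeof v === 'string' && v.length >= 1 && v.length <= 200;
}

function isValidPane(v: unknown): boolean {
  if (typeof v !== 'object' || v === null) return false;
  const p = v as Record<string, unknown>;
  if (!isBoundedString(p.id)) return false;
  if (typeof p.kind !== 'string' || PANE_KINDS.indexOf(p.kind) === -1) return false;
  // chatId is optional, but must be a bounded string when present.
  if (p.chatId !== undefined && !isBoundedString(p.chatId)) return false;
  if (!Array.isArray(p.chatIds) || p.chatIds.length > 50) return false;
  if (!p.chatIds.every((id) => isBoundedString(id))) return false;
  // Like z.number().int(): any integer passes, there is no range check.
  return typeof p.activeChatIdx === 'number' && Number.isInteger(p.activeChatIdx);
}

function isValidPaneList(v: unknown): boolean {
  return Array.isArray(v) && v.length <= 10 && v.every((p) => isValidPane(p));
}
```

Note that, as in the schema, activeChatIdx is not checked against chatIds.length, so an out-of-range index would still be accepted.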
--- a/apps/server/src/schema.sql
+++ b/apps/server/src/schema.sql
@@ -47,22 +47,14 @@ CREATE TABLE IF NOT EXISTS settings (
 INSERT INTO settings (key, value) VALUES ('default_model', '"qwen3.6-35b-a3b-mxfp4"') ON CONFLICT (key) DO NOTHING;
-- DEPRECATED: client-side pane state as of v1.2-batch4. Table retained per
-- additive schema rule; no writes. Drop in a future destructive migration.
-CREATE TABLE IF NOT EXISTS session_panes (
-  id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-  session_id   UUID NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
-  position     INTEGER NOT NULL,
-  kind         TEXT NOT NULL CHECK (kind IN ('chat', 'file_browser', 'terminal')),
-  state        JSONB NOT NULL DEFAULT '{}',
-  created_at   TIMESTAMPTZ NOT NULL DEFAULT clock_timestamp(),
-  UNIQUE (session_id, position)
-);
-CREATE INDEX IF NOT EXISTS idx_session_panes_session ON session_panes (session_id);
-- v1.4: backfill removed. Pane layout is client-side (localStorage) since v1.2-batch4.
-- The CREATE TABLE above is retained for additive-schema discipline; drop is a
-- future destructive migration.
+-- v1.12.1: deprecated session_panes table removed. Workspace pane state now
+-- lives in sessions.workspace_panes (jsonb), see below.
+DROP TABLE IF EXISTS session_panes;
+-- v1.12.1: server-side workspace pane layout, replaces localStorage so every
+-- device sees the same panes for a given session. Shape matches
+-- WorkspacePane[] from apps/server/src/types/api.ts.
 ALTER TABLE sessions ADD COLUMN IF NOT EXISTS workspace_panes JSONB NOT NULL DEFAULT '[]'::jsonb;
 -- v1.2: sessions.status (open | archived)
 ALTER TABLE sessions ADD COLUMN IF NOT EXISTS status TEXT NOT NULL DEFAULT 'open';
@@ -128,6 +120,19 @@ BEGIN
  END IF;
 END $$;
 -- v1.12.1: drop stale inline CHECK constraints that were superseded by the
 -- named *_chk variants above. messages_status_check missed 'cancelled' and
 -- messages_role_check missed 'system' — both narrower than what's in use.
 DO $$
 BEGIN
  IF EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'messages_status_check') THEN
    ALTER TABLE messages DROP CONSTRAINT messages_status_check;
  END IF;
  IF EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'messages_role_check') THEN
    ALTER TABLE messages DROP CONSTRAINT messages_role_check;
  END IF;
 END $$;
 -- v1.2-project-ux: projects.status + projects.gitea_remote
 -- KEEP IN SYNC: apps/server/src/types/api.ts PROJECT_STATUSES
 ALTER TABLE projects ADD COLUMN IF NOT EXISTS status TEXT NOT NULL DEFAULT 'open';
--- a/apps/server/src/services/inference.ts
+++ b/apps/server/src/services/inference.ts
@@ -117,6 +117,7 @@ export interface InferenceFrame {
    | 'tool_call'
    | 'tool_result'
    | 'message_complete'
    | 'usage'
    | 'messages_deleted'
    | 'session_renamed'
    | 'chat_renamed'
@@ -145,6 +146,7 @@ export interface InferenceFrame {
  tokens_used?: number | null;
  ctx_used?: number | null;
  ctx_max?: number | null;
  completion_tokens?: number | null;
  started_at?: string | null;
  finished_at?: string | null;
  model?: string;
@@ -444,6 +446,7 @@ async function streamCompletion(
  messages: OpenAiMessage[],
  opts: StreamOptions,
  onDelta: (content: string) => void,
  onUsage: ((prompt: number | null, completion: number | null) => void) | undefined,
  signal?: AbortSignal
 ): Promise<StreamResult> {
  const body: Record<string, unknown> = {
@@ -499,6 +502,7 @@ async function streamCompletion(
      if (typeof parsed.usage.completion_tokens === 'number') {
        completionTokens = parsed.usage.completion_tokens;
      }
      onUsage?.(promptTokens, completionTokens);
    }
    // v1.11.3: removed dead `parsed.timings.n_ctx` read. llama-server's
    // streaming completion does NOT emit n_ctx in timings (verified
@@ -728,6 +732,34 @@ async function executeStreamPhase(
  ).filter((t) => webToolsEnabled || !WEB_TOOL_NAMES.has(t.function.name));
  const effectiveTemperature = agent?.temperature;
  // v1.12.2: ctx_max lookup is cached after the first hit per model, so this
  // is a Map probe in steady state. We capture nCtx once at the top of the
  // stream so the throttled usage publish doesn't refetch each tick.
  const mctxForStream = await modelContext.getModelContext(session.model);
  const nCtxForStream = mctxForStream?.n_ctx ?? null;
  // v1.12.2: throttle live usage publishes to one per ~500ms. The model can
  // land dozens of usage frames per second; unthrottled, the WS turns into a
  // firehose and the readout gains nothing from the extra frames.
  const USAGE_THROTTLE_MS = 500;
  let lastUsageAt = 0;
  let pendingUsage: { p: number | null; c: number | null } | null = null;
  let usageTimer: NodeJS.Timeout | null = null;
  const flushUsage = () => {
    if (!pendingUsage) return;
    const { p, c } = pendingUsage;
    pendingUsage = null;
    lastUsageAt = Date.now();
    ctx.publish(sessionId, {
      type: 'usage',
      message_id: assistantMessageId,
      chat_id: chatId,
      completion_tokens: c,
      ctx_used: p,
      ctx_max: nCtxForStream,
    });
  };
  try {
    return await streamCompletion(
      ctx,
@@ -745,6 +777,18 @@ async function executeStreamPhase(
        ctx.log.debug({ sessionId, delta }, 'inference delta');
        scheduleFlush();
      },
      (prompt, completion) => {
        pendingUsage = { p: prompt, c: completion };
        const elapsed = Date.now() - lastUsageAt;
        if (elapsed >= USAGE_THROTTLE_MS) {
          flushUsage();
        } else if (!usageTimer) {
          usageTimer = setTimeout(() => {
            usageTimer = null;
            flushUsage();
          }, USAGE_THROTTLE_MS - elapsed);
        }
      },
      signal
    );
  } finally {
@@ -752,6 +796,10 @@ async function executeStreamPhase(
      clearTimeout(pendingFlushTimer);
      pendingFlushTimer = null;
    }
    if (usageTimer) {
      clearTimeout(usageTimer);
      usageTimer = null;
    }
    await flushPromise;
  }
 }
@@ -801,6 +849,17 @@ async function handleAbortOrError(
  // genuine errors flip the dot red. v1.8.2: error path also carries a
  // machine-readable `reason` so the UI can render specifics inline.
  if (isAbort) {
    // v1.12.1: defensive cancellation write. The status=${finalStatus} UPDATE
    // above already sets 'cancelled' for the AbortError case, but a row can
    // leak as 'streaming' when the abort fires between the post-tool-phase
    // INSERT (executeToolPhase) and the next runAssistantTurn's stream setup,
    // bypassing the try/catch around executeStreamPhase. The status guard
    // makes this a no-op when the earlier write already landed.
    await ctx.sql`
      UPDATE messages
      SET status = 'cancelled', content = ${accumulated}, finished_at = clock_timestamp()
      WHERE id = ${args.assistantMessageId} AND status = 'streaming'
    `;
    ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'idle', at: new Date().toISOString() });
    ctx.publish(sessionId, {
      type: 'message_complete',
@@ -894,6 +953,7 @@ async function executeToolPhase(
  // pre-stamped with output=null as a "pending" sentinel and no tool_result
  // frame goes out — the card renders from the tool_call frame alone. Mixed
  // batches still execute the other tools normally.
  ctx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'tool_running', at: new Date().toISOString() });
  let pausingForUserInput = false;
  await Promise.all(
    toolCalls.map(async (tc) => {
@@ -938,13 +998,10 @@ async function executeToolPhase(
  );
  if (pausingForUserInput) {
    // Drop the dot back to idle — the card is the actionable surface now.
    // The next inference turn fires from POST /api/chats/:id/answer_user_input
    // once the user submits their answers.
    ctx.publishUser({
      type: 'chat_status',
      chat_id: chatId,
-      status: 'idle',
+      status: 'waiting_for_input',
      at: new Date().toISOString(),
    });
    ctx.log.info(
@@ -1229,6 +1286,7 @@ async function runCapHitSummary(
        });
        scheduleFlush();
      },
      undefined,
      signal,
    );
    summaryOk = true;
@@ -1490,6 +1548,7 @@ async function runDoomLoopSummary(
        });
        scheduleFlush();
      },
      undefined,
      signal,
    );
    summaryOk = true;
@@ -1677,7 +1736,7 @@ export function createInferenceRunner(
      };
      // v1.8 mobile-tabs: announce working before the async loop starts so
      // every device subscribed to the user channel sees the amber dot.
-      callCtx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'working', at: new Date().toISOString() });
+      callCtx.publishUser({ type: 'chat_status', chat_id: chatId, status: 'streaming', at: new Date().toISOString() });
      const controller = new AbortController();
      let resolveCompleted!: () => void;
      const completed = new Promise<void>((res) => { resolveCompleted = res; });
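The usage throttle in executeStreamPhase publishes the first sample immediately, coalesces later samples into a single pending value, and delivers the latest one when the window closes. A standalone sketch of that pattern follows. `makeUsageThrottle`, `Usage`, and the injectable `now` / `schedule` parameters are hypothetical and exist here only so the behaviour can be exercised without real timers; the code in the diff keeps this state inline.

```typescript
interface Usage { prompt: number | null; completion: number | null }

type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by the real timer API.
const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

// Illustrative sketch: at most one publish per intervalMs, latest sample wins.
function makeUsageThrottle(
  publish: (u: Usage) => void,
  intervalMs: number,
  now: () => number = Date.now,
  schedule: Schedule = realSchedule,
) {
  let lastAt = 0;
  let pending: Usage | null = null;
  let cancelTimer: Cancel | null = null;

  const flush = (): void => {
    if (!pending) return;
    const u = pending;
    pending = null;
    lastAt = now();
    publish(u);
  };

  return {
    push(u: Usage): void {
      pending = u; // newer samples overwrite older unsent ones
      const elapsed = now() - lastAt;
      if (elapsed >= intervalMs) {
        flush();
      } else if (!cancelTimer) {
        // One trailing timer per window; it sends whatever is pending then.
        cancelTimer = schedule(() => {
          cancelTimer = null;
          flush();
        }, intervalMs - elapsed);
      }
    },
    // Mirrors the finally block: cancel the timer, dropping any unsent sample.
    dispose(): void {
      if (cancelTimer) cancelTimer();
      cancelTimer = null;
    },
  };
}
```

One consequence of this shape, shared with the diff: a sample that is still pending when the stream ends is discarded rather than flushed, so the last frame a client sees may lag the final token count by up to one window.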
--- a/apps/server/src/types/api.ts
+++ b/apps/server/src/types/api.ts
@@ -39,6 +39,19 @@ export interface Session {
  // project.default_web_search_enabled. Plumbed but inert in v1.9 — the
  // actual web_search tool ships in Batch 8.
  web_search_enabled: boolean | null;
  // v1.12.1: server-side workspace pane layout. Replaces per-device
  // localStorage so all devices viewing the session see the same panes.
  workspace_panes: WorkspacePane[];
 }
 export type WorkspacePaneKind = 'chat' | 'terminal' | 'agent' | 'empty' | 'settings';
 export interface WorkspacePane {
  id: string;
  kind: WorkspacePaneKind;
  chatId?: string;
  chatIds: string[];
  activeChatIdx: number;
 }
 // v1.8.1: agents come from two sources. 'global' = /data/AGENTS.md (always
@@ -273,6 +286,11 @@ export interface SessionRenamedFrame {
  session_id: string;
  name: string;
 }
 export interface SessionWorkspaceUpdatedFrame {
  type: 'session_workspace_updated';
  session_id: string;
  workspace_panes: WorkspacePane[];
 }
 export interface SessionArchivedFrame {
  type: 'session_archived';
  session_id: string;
@@ -324,7 +342,7 @@ export interface ProjectUpdatedFrame {
 export interface ChatStatusFrame {
  type: 'chat_status';
  chat_id: string;
-  status: 'working' | 'idle' | 'error';
+  status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
  at: string;
  reason?: ErrorReason;
 }
@@ -335,6 +353,7 @@ export type UserStreamFrame =
  | SessionDeletedFrame
  | SessionUpdatedFrame
  | SessionRenamedFrame
  | SessionWorkspaceUpdatedFrame
  | SessionArchivedFrame
  | ChatCreatedFrame
  | ChatUpdatedFrame
--- a/apps/web/src/api/client.ts
+++ b/apps/web/src/api/client.ts
@@ -143,6 +143,11 @@ export const api = {
      ),
    openChatsCount: (id: string) =>
      request<{ count: number }>(`/api/sessions/${id}/chats/open-count`),
    updateWorkspacePanes: (id: string, panes: Session['workspace_panes']) =>
      request<Session>(`/api/sessions/${id}/workspace`, {
        method: 'PATCH',
        body: JSON.stringify({ workspace_panes: panes }),
      }),
  },
  chats: {
--- a/apps/web/src/api/types.ts
+++ b/apps/web/src/api/types.ts
@@ -34,6 +34,8 @@ export interface Session {
  agent_id: string | null;
  // v1.9: null = inherit from project.default_web_search_enabled.
  web_search_enabled: boolean | null;
  // v1.12.1: server-authoritative pane layout, replaces localStorage.
  workspace_panes: WorkspacePane[];
 }
 // v1.8.1: 'global' = /data/AGENTS.md (always-on), 'project' = per-project
@@ -330,6 +332,17 @@ export type WsFrame =
      // to the client without a refetch.
      metadata?: MessageMetadata | null;
    }
  // v1.12.2: live throughput frame, published mid-stream every ~500ms with
  // the latest token + ctx counts so ChatThroughput can render tok/s and
  // ctx_used while the model is still generating.
  | {
      type: 'usage';
      message_id: string;
      chat_id?: string;
      completion_tokens: number | null;
      ctx_used: number | null;
      ctx_max: number | null;
    }
  | { type: 'messages_deleted'; message_ids: string[]; chat_id?: string }
  | { type: 'chat_renamed'; chat_id: string; name: string }
  // v1.11: published by services/compaction.ts after the new anchored
--- a/apps/web/src/components/ChatTabBar.tsx
+++ b/apps/web/src/components/ChatTabBar.tsx
@@ -2,6 +2,7 @@ import { useState } from 'react';
 import { Bot, History, MessageSquare, Plus, Terminal, X } from 'lucide-react';
 import type { Chat, WorkspacePane } from '@/api/types';
 import { StatusDot } from '@/components/StatusDot';
 import { ChatThroughput } from '@/components/ChatThroughput';
 import {
  ContextMenu,
  ContextMenuContent,
@@ -99,6 +100,7 @@ export function ChatTabBar({
              >
                <MessageSquare size={12} className="shrink-0" />
                <StatusDot chatId={chat.id} />
                <ChatThroughput chatId={chat.id} />
                {renamingId === chat.id ? (
                  <input
                    autoFocus
--- a/apps/web/src/components/ChatThroughput.tsx
+++ b/apps/web/src/components/ChatThroughput.tsx
@@ -0,0 +1,28 @@
 import { useChatStatus } from '@/hooks/useChatStatus';
 import { useChatThroughput } from '@/hooks/useChatThroughput';
 import { cn } from '@/lib/utils';
 interface Props {
  chatId: string | null | undefined;
  className?: string;
 }
 // v1.12.2: inline throughput readout. Renders next to StatusDot while the
 // chat is streaming or running a tool. Hidden in idle/error/waiting states
 // — the dot already communicates those.
 export function ChatThroughput({ chatId, className }: Props) {
  const status = useChatStatus(chatId);
  const t = useChatThroughput(chatId);
  if (!chatId || !t) return null;
  if (status !== 'streaming' && status !== 'tool_running') return null;
  const tps = t.tps != null && t.tps > 0 ? Math.round(t.tps) : null;
  const showCtx = t.ctx_used != null && t.ctx_max != null;
  if (tps === null && !showCtx) return null;
  return (
    <span className={cn('text-xs text-muted-foreground tabular-nums', className)}>
      {tps !== null && `${tps} tok/s`}
      {tps !== null && showCtx && ' · '}
      {showCtx && `${t.ctx_used!.toLocaleString()}/${t.ctx_max!.toLocaleString()}`}
    </span>
  );
 }
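The label logic in ChatThroughput can be read as a pure function of the sample, which makes the three visible states (rate only, ctx only, both) easy to check. The sketch below assumes a hypothetical `formatThroughput` helper and pins the locale to 'en-US' for deterministic output; the component itself calls `toLocaleString()` with the viewer's locale.

```typescript
interface ThroughputSample {
  tps: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
}

// Illustrative sketch: returns the text ChatThroughput would show, or null
// when there is nothing worth rendering.
function formatThroughput(t: ThroughputSample): string | null {
  const tps = t.tps != null && t.tps > 0 ? Math.round(t.tps) : null;
  const ctx =
    t.ctx_used != null && t.ctx_max != null
      ? `${t.ctx_used.toLocaleString('en-US')}/${t.ctx_max.toLocaleString('en-US')}`
      : null;
  const parts: string[] = [];
  if (tps !== null) parts.push(`${tps} tok/s`);
  if (ctx !== null) parts.push(ctx);
  return parts.length > 0 ? parts.join(' · ') : null;
}
```

Because the rate is rounded after the `> 0` check, a very slow stream (below 0.5 tok/s) renders as "0 tok/s" rather than disappearing, which fits the stated goal of telling slow streams apart from dead ones.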
--- a/apps/web/src/components/MobileTabSwitcher.tsx
+++ b/apps/web/src/components/MobileTabSwitcher.tsx
@@ -13,6 +13,7 @@ import { toast } from 'sonner';
 import type { Chat, WorkspacePane } from '@/api/types';
 import { BottomSheet } from '@/components/BottomSheet';
 import { StatusDot } from '@/components/StatusDot';
 import { ChatThroughput } from '@/components/ChatThroughput';
 import {
  DropdownMenu,
  DropdownMenuContent,
@@ -206,6 +207,7 @@ export function MobileTabSwitcher({
        >
          <span className="shrink-0 text-muted-foreground">{paneIcon(active?.kind ?? 'chat')}</span>
          <StatusDot chatId={activeChatId} />
          <ChatThroughput chatId={activeChatId} />
          <span className="truncate flex-1 text-left">{activeLabel}</span>
          <ChevronDown size={14} className="opacity-60 shrink-0" />
        </button>
@@ -237,6 +239,7 @@ export function MobileTabSwitcher({
              >
                <span className="shrink-0 text-muted-foreground">{paneIcon(pane.kind)}</span>
                <StatusDot chatId={cid ?? null} />
                <ChatThroughput chatId={cid ?? null} />
                {renamingChatId === cid && cid ? (
                  <input
                    autoFocus
--- a/apps/web/src/components/StatusDot.tsx
+++ b/apps/web/src/components/StatusDot.tsx
@@ -6,15 +6,10 @@ interface Props {
  className?: string;
 }
-const STATUS_CLASS: Record<DerivedStatus, string> = {
-  working: 'bg-amber-500 animate-pulse',
-  idle_warm: 'bg-emerald-500',
-  idle_cold: 'bg-muted-foreground/40',
-  error: 'bg-destructive',
-};
 const STATUS_LABEL: Record<DerivedStatus, string> = {
-  working: 'working',
+  streaming: 'streaming',
  tool_running: 'running tool',
  waiting_for_input: 'waiting for input',
  idle_warm: 'idle',
  idle_cold: 'idle',
  error: 'error',
@@ -22,15 +17,58 @@ const STATUS_LABEL: Record<DerivedStatus, string> = {
 export function StatusDot({ chatId, className }: Props) {
  const status = useChatStatus(chatId);
  if (status === 'streaming') {
    return (
      <span
        aria-label="Status: streaming"
        title="streaming"
        className={cn('inline-block relative w-3 h-3 shrink-0', className)}
      >
        <span className="absolute inset-0 animate-spin-slow">
          <span className="absolute top-0 left-1/2 -translate-x-1/2 w-1 h-1 rounded-full bg-amber-500" />
          <span className="absolute bottom-0 left-1/2 -translate-x-1/2 w-1 h-1 rounded-full bg-amber-500/60" />
        </span>
      </span>
    );
  }
  if (status === 'tool_running') {
    return (
      <span
        aria-label="Status: running tool"
        title="running tool"
        className={cn(
          'inline-block w-3 h-3 rounded-full border-2 border-sky-500 border-t-transparent animate-spin shrink-0',
          className,
        )}
      />
    );
  }
  if (status === 'waiting_for_input') {
    return (
      <span
        aria-label="Status: waiting for input"
        title="waiting for input"
        className={cn(
          'inline-block w-1.5 h-1.5 rounded-full shrink-0 bg-violet-500',
          className,
        )}
      />
    );
  }
  const bg =
    status === 'idle_warm' ? 'bg-emerald-500'
      : status === 'error' ? 'bg-destructive'
      : 'bg-muted-foreground/40';
  return (
    <span
      aria-label={`Status: ${STATUS_LABEL[status]}`}
      title={STATUS_LABEL[status]}
-      className={cn(
-        'inline-block w-1.5 h-1.5 rounded-full shrink-0',
-        STATUS_CLASS[status],
-        className,
-      )}
+      className={cn('inline-block w-1.5 h-1.5 rounded-full shrink-0', bg, className)}
    />
  );
 }
--- a/apps/web/src/hooks/sessionEvents.ts
+++ b/apps/web/src/hooks/sessionEvents.ts
@@ -41,6 +41,12 @@ export interface SessionUpdatedEvent {
  updated_at: string;
 }
 export interface SessionWorkspaceUpdatedEvent {
  type: 'session_workspace_updated';
  session_id: string;
  workspace_panes: import('@/api/types').WorkspacePane[];
 }
 export interface SessionLoadedEvent {
  type: 'session_loaded';
  session_id: string;
@@ -131,7 +137,7 @@ export interface ProjectUpdatedEvent {
 export interface ChatStatusEvent {
  type: 'chat_status';
  chat_id: string;
-  status: 'working' | 'idle' | 'error';
+  status: 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
  at: string;
  reason?: ErrorReason;
 }
@@ -143,6 +149,7 @@ export type SessionEvent =
  | SessionCreatedEvent
  | SessionDeletedEvent
  | SessionUpdatedEvent
  | SessionWorkspaceUpdatedEvent
  | SessionLoadedEvent
  | OpenFileInBrowserEvent
  | AttachChatFileEvent
--- a/apps/web/src/hooks/useChatStatus.ts
+++ b/apps/web/src/hooks/useChatStatus.ts
@@ -1,8 +1,14 @@
 import { useEffect, useState } from 'react';
 import { sessionEvents } from './sessionEvents';
-export type RawStatus = 'working' | 'idle' | 'error';
+export type RawStatus = 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
-export type DerivedStatus = 'working' | 'idle_warm' | 'idle_cold' | 'error';
+export type DerivedStatus =
  | 'streaming'
  | 'tool_running'
  | 'waiting_for_input'
  | 'idle_warm'
  | 'idle_cold'
  | 'error';
 // Window during which an idle dot stays green; after this, it fades to gray.
 const WARM_WINDOW_MS = 30_000;
@@ -53,7 +59,9 @@ if (!G.__boocode_chat_status_subscribed) {
 function derive(entry: Entry | undefined): DerivedStatus {
  if (!entry) return 'idle_cold';
-  if (entry.status === 'working') return 'working';
+  if (entry.status === 'streaming') return 'streaming';
  if (entry.status === 'tool_running') return 'tool_running';
  if (entry.status === 'waiting_for_input') return 'waiting_for_input';
  if (entry.status === 'error') return 'error';
  const age = Date.now() - new Date(entry.at).getTime();
  return age < WARM_WINDOW_MS ? 'idle_warm' : 'idle_cold';
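The derive step maps the five raw statuses onto six derived ones: the three active states and error pass through unchanged, and idle splits into warm or cold by age. A sketch with the clock passed in as a parameter is shown below. `deriveAt` is a hypothetical name, and the `Entry` shape is assumed from the two fields the hunk reads (`status` and `at`).

```typescript
type RawStatus = 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';
type DerivedStatus =
  | 'streaming'
  | 'tool_running'
  | 'waiting_for_input'
  | 'idle_warm'
  | 'idle_cold'
  | 'error';

// Assumed shape: only the fields that derive() reads in the diff.
interface Entry { status: RawStatus; at: string }

const WARM_WINDOW_MS = 30_000;

// Illustrative sketch: same branching as derive(), with "now" injected.
function deriveAt(entry: Entry | undefined, nowMs: number): DerivedStatus {
  if (!entry) return 'idle_cold';
  // Every non-idle raw status is also a valid derived status.
  if (entry.status !== 'idle') return entry.status;
  const age = nowMs - new Date(entry.at).getTime();
  return age < WARM_WINDOW_MS ? 'idle_warm' : 'idle_cold';
}
```

Collapsing the pass-through cases into one comparison relies on the type checker: if a future raw status is added without a derived counterpart, the return statement stops compiling.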
--- a/apps/web/src/hooks/useChatThroughput.ts
+++ b/apps/web/src/hooks/useChatThroughput.ts
@@ -0,0 +1,106 @@
 import { useEffect, useState } from 'react';
 // v1.12.2: live throughput stream consumer. Fed by useSessionStream when a
 // 'usage' WS frame lands. Renders next to StatusDot via ChatThroughput.
 //
 // Singleton + Set<setState> pattern mirrors useChatStatus so any component
 // can subscribe to any chatId without prop drilling.
 export interface ThroughputSample {
  tps: number | null;
  ctx_used: number | null;
  ctx_max: number | null;
 }
 interface Entry {
  ctx_used: number | null;
  ctx_max: number | null;
  completion_tokens: number | null;
  recorded_at: number;
  prev_completion_tokens: number | null;
  prev_recorded_at: number | null;
  tps: number | null;
 }
 // Stale window. After this, useChatThroughput returns null — clears the
 // indicator after the stream ends without the next inference turn.
 const STALE_MS = 10_000;
 const entries = new Map<string, Entry>();
 const subscribers = new Set<() => void>();
 function notify(): void {
  for (const s of subscribers) {
    try { s(); } catch { /* swallow */ }
  }
 }
 // v1.12.2: imported by useSessionStream's WS handler. Computes tps from the
 // gap between successive completion_tokens samples; first sample yields null
 // (we need two points). Skips zero-progress samples so a duplicate usage
 // frame doesn't push tps to 0.
 export function recordUsage(
  chatId: string,
  data: { completion_tokens: number | null; ctx_used: number | null; ctx_max: number | null },
 ): void {
  const now = Date.now();
  const prev = entries.get(chatId);
  let tps: number | null = prev?.tps ?? null;
  if (
    prev &&
    data.completion_tokens != null &&
    prev.completion_tokens != null &&
    data.completion_tokens > prev.completion_tokens &&
    now > prev.recorded_at
  ) {
    const dTokens = data.completion_tokens - prev.completion_tokens;
    const dSeconds = (now - prev.recorded_at) / 1000;
    tps = dTokens / dSeconds;
  }
  entries.set(chatId, {
    ctx_used: data.ctx_used,
    ctx_max: data.ctx_max,
    completion_tokens: data.completion_tokens,
    recorded_at: now,
    prev_completion_tokens: prev?.completion_tokens ?? null,
    prev_recorded_at: prev?.recorded_at ?? null,
    tps,
  });
  notify();
 }
 export function clearThroughput(chatId: string): void {
  if (entries.delete(chatId)) notify();
 }
 // Periodic sweep: re-notify so stale entries fall off the UI when the
 // stream ends without a follow-up frame. Light — one timer for the whole app.
 const G = globalThis as Record<string, unknown>;
 if (!G.__boocode_throughput_ticker) {
  G.__boocode_throughput_ticker = true;
  setInterval(() => {
    const now = Date.now();
    let touched = false;
    for (const [k, v] of entries) {
      if (now - v.recorded_at > STALE_MS) {
        entries.delete(k);
        touched = true;
      }
    }
    if (touched) notify();
  }, 2_000);
 }
 export function useChatThroughput(chatId: string | null | undefined): ThroughputSample | null {
  const [, force] = useState({});
  useEffect(() => {
    const sub = () => force({});
    subscribers.add(sub);
    return () => { subscribers.delete(sub); };
  }, []);
  if (!chatId) return null;
  const entry = entries.get(chatId);
  if (!entry) return null;
  if (Date.now() - entry.recorded_at > STALE_MS) return null;
  return { tps: entry.tps, ctx_used: entry.ctx_used, ctx_max: entry.ctx_max };
 }
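The rate shown in the UI comes from two successive samples: tokens gained divided by seconds elapsed, so 21 new tokens over 500 ms gives 42 tok/s. The core of recordUsage can be sketched as a pure function; `tokensPerSecond` and `UsagePoint` are hypothetical names, and `carry` stands for the previously computed rate that recordUsage keeps when a sample shows no progress.

```typescript
interface UsagePoint { completionTokens: number; atMs: number }

// Illustrative sketch: rate between two samples, or the carried-over rate
// when the pair cannot produce a meaningful number.
function tokensPerSecond(
  prev: UsagePoint | null,
  next: UsagePoint,
  carry: number | null,
): number | null {
  if (prev === null) return carry; // first sample: two points are needed
  const dTokens = next.completionTokens - prev.completionTokens;
  const dMs = next.atMs - prev.atMs;
  // Duplicate frames, counter resets on a new turn, or a non-advancing clock
  // keep the previous rate instead of reporting 0 or a negative value.
  if (dTokens <= 0 || dMs <= 0) return carry;
  return dTokens / (dMs / 1000);
}
```

Since the server throttles usage frames to roughly one per 500 ms, each rate is an average over about that window, which smooths out per-token jitter without any extra filtering on the client.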
--- a/apps/web/src/hooks/useSessionChats.ts
+++ b/apps/web/src/hooks/useSessionChats.ts
@@ -12,6 +12,7 @@ export interface UseSessionChatsOpts {
  // about pane indexing.
  openChatInActivePane: (chatId: string) => void;
  initializeFirstChatIfEmpty: (chatId: string) => void;
  validatePanes: (validChatIds: Set<string>) => void;
 }
 export interface UseSessionChatsResult {
@@ -44,12 +45,15 @@ export function useSessionChats(
  openChatInActivePaneRef.current = opts.openChatInActivePane;
  const initializeFirstChatIfEmptyRef = useRef(opts.initializeFirstChatIfEmpty);
  initializeFirstChatIfEmptyRef.current = opts.initializeFirstChatIfEmpty;
  const validatePanesRef = useRef(opts.validatePanes);
  validatePanesRef.current = opts.validatePanes;
  useEffect(() => {
    let cancelled = false;
    api.chats.listForSession(sessionId).then((list) => {
      if (cancelled) return;
      setChats(list);
      validatePanesRef.current(new Set(list.map((c) => c.id)));
      const openChat = list.find((c) => c.status === 'open');
      if (openChat) {
        initializeFirstChatIfEmptyRef.current(openChat.id);
--- a/apps/web/src/hooks/useSessionStream.ts
+++ b/apps/web/src/hooks/useSessionStream.ts
@@ -3,6 +3,7 @@ import { toast } from 'sonner';
 import type { Message, WsFrame } from '@/api/types';
 import { api } from '@/api/client';
 import { sessionEvents } from './sessionEvents';
 import { recordUsage } from './useChatThroughput';
 // session_renamed frame removed from WsFrame — it was declared but never
 // published on the per-session WS channel (server publishes via broker.publishUser
@@ -125,6 +126,19 @@ function applyFrame(state: State, frame: WsFrame): State {
      );
      return { ...state, messages: next };
    }
    case 'usage': {
      // v1.12.2: live throughput. Side-effects into the module-level
      // singleton consumed by ChatThroughput; no message-state mutation.
      // chat_id is the optional ws-frame field; usage frames always include it.
      if (frame.chat_id) {
        recordUsage(frame.chat_id, {
          completion_tokens: frame.completion_tokens,
          ctx_used: frame.ctx_used,
          ctx_max: frame.ctx_max,
        });
      }
      return state;
    }
    case 'messages_deleted': {
      const removeSet = new Set(frame.message_ids);
      return {
--- a/apps/web/src/hooks/useSidebar.ts
+++ b/apps/web/src/hooks/useSidebar.ts
@@ -143,6 +143,9 @@ function applyEvent(prev: SidebarResponse, event: import('./sessionEvents').Sess
    case 'session_loaded':
      // activeSessionProjectId is updated in the subscribe callback; no data change here.
      return prev;
    case 'session_workspace_updated':
      // Pane layout is consumed by useWorkspacePanes; sidebar has no stake.
      return prev;
    case 'open_file_in_browser':
      // Consumed by Workspace (T7); no sidebar state change needed.
      return prev;
--- a/apps/web/src/hooks/useWorkspacePanes.ts
+++ b/apps/web/src/hooks/useWorkspacePanes.ts
@@ -4,9 +4,14 @@ import { toast } from 'sonner';
 import { api } from '@/api/client';
 import type { WorkspacePane } from '@/api/types';
 import { setActivePaneInfo, clearActivePane } from '@/hooks/useActivePane';
 import { sessionEvents } from '@/hooks/sessionEvents';
 export const MAX_PANES = 5;
-const STORAGE_KEY = 'boocode.workspace.panes';
+// v1.12.1: legacy localStorage key. Read once on mount to seed the server
 // for sessions still on per-device state, then deleted. Server is now
 // authoritative via sessions.workspace_panes.
 const LEGACY_STORAGE_KEY = 'boocode.workspace.panes';
 const SAVE_DEBOUNCE_MS = 300;
 function generateId(): string {
  return crypto.randomUUID();
@@ -51,9 +56,11 @@ function nonSettingsCount(panes: WorkspacePane[]): number {
  return panes.reduce((n, p) => n + (p.kind === 'settings' ? 0 : 1), 0);
 }
-function loadPanes(sessionId: string): WorkspacePane[] | null {
+// v1.12.1: read legacy per-device localStorage. If present, the caller seeds
 // the server then deletes the key. One-time migration per session.
 function readLegacyPanes(sessionId: string): WorkspacePane[] | null {
  try {
-    const raw = localStorage.getItem(`${STORAGE_KEY}.${sessionId}`);
+    const raw = localStorage.getItem(`${LEGACY_STORAGE_KEY}.${sessionId}`);
    if (!raw) return null;
    const parsed = JSON.parse(raw) as WorkspacePane[];
    if (!Array.isArray(parsed) || parsed.length === 0) return null;
@@ -63,15 +70,6 @@ function loadPanes(sessionId: string): WorkspacePane[] | null {
  }
 }
-function savePanes(sessionId: string, panes: WorkspacePane[]): void {
-  try {
-    localStorage.setItem(
-      `${STORAGE_KEY}.${sessionId}`,
-      JSON.stringify(persistablePanes(panes)),
-    );
-  } catch { /* quota or disabled */ }
-}
 export interface UseWorkspacePanesResult {
  panes: WorkspacePane[];
  activePaneIdx: number;
@@ -96,6 +94,7 @@ export interface UseWorkspacePanesResult {
  removePane: (idx: number) => void;
  removeChatFromPanes: (chatId: string) => void;
  initializeFirstChatIfEmpty: (chatId: string) => void;
  validatePanes: (validChatIds: Set<string>) => void;
  handlePaneDragStart: (idx: number) => (e: DragEvent<HTMLDivElement>) => void;
  handlePaneDragOver: (idx: number) => (e: DragEvent<HTMLDivElement>) => void;
  handlePaneDragLeave: () => void;
@@ -106,15 +105,85 @@ export interface UseWorkspacePanesResult {
 }
 export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
-  const [panes, setPanes] = useState<WorkspacePane[]>(() => {
-    return loadPanes(sessionId) ?? [emptyPane()];
-  });
+  const [panes, setPanes] = useState<WorkspacePane[]>(() => [emptyPane()]);
  const [activePaneIdx, setActivePaneIdx] = useState(0);
  const draggingIdxRef = useRef<number | null>(null);
  const [dragOverIdx, setDragOverIdx] = useState<number | null>(null);
  // v1.12.1: skip PATCH while hydrating from the server. Without this, the
  // initial [emptyPane()] would be saved over the server's real state before
  // the GET resolves.
  const hydratedRef = useRef(false);
  // Tracks the last value broadcast by another device (or this one's own
  // round-trip). If a PATCH would echo this exact payload, we skip the call.
  const lastRemoteJsonRef = useRef<string>('[]');
  // v1.12.1: hydrate from server on mount, then subscribe to remote updates.
  useEffect(() => {
-    savePanes(sessionId, panes);
+    hydratedRef.current = false;
    let cancelled = false;
    void (async () => {
      try {
        const session = await api.sessions.get(sessionId);
        if (cancelled) return;
        let initial: WorkspacePane[] = Array.isArray(session.workspace_panes)
          ? session.workspace_panes
          : [];
        // One-time migration: if server is empty but legacy localStorage has
        // a layout, seed the server and delete the local key.
        if (initial.length === 0) {
          const legacy = readLegacyPanes(sessionId);
          if (legacy && legacy.length > 0) {
            try {
              const updated = await api.sessions.updateWorkspacePanes(sessionId, legacy);
              if (cancelled) return;
              initial = updated.workspace_panes;
              localStorage.removeItem(`${LEGACY_STORAGE_KEY}.${sessionId}`);
            } catch {
              initial = legacy;
            }
          }
        }
        const next = initial.length > 0 ? initial : [emptyPane()];
        lastRemoteJsonRef.current = JSON.stringify(persistablePanes(next));
        setPanes(next);
        setActivePaneIdx(0);
      } finally {
        if (!cancelled) hydratedRef.current = true;
      }
    })();
    return () => { cancelled = true; };
  }, [sessionId]);
  // v1.12.1: live cross-device sync. Replace local state when another device
  // (or our own write echo) lands a session_workspace_updated frame.
  useEffect(() => {
    return sessionEvents.subscribe((ev) => {
      if (ev.type !== 'session_workspace_updated') return;
      if (ev.session_id !== sessionId) return;
      const incoming = Array.isArray(ev.workspace_panes) ? ev.workspace_panes : [];
      const json = JSON.stringify(incoming);
      if (json === lastRemoteJsonRef.current) return;
      lastRemoteJsonRef.current = json;
      setPanes(incoming.length > 0 ? incoming : [emptyPane()]);
      setActivePaneIdx((prev) => Math.min(prev, Math.max(0, incoming.length - 1)));
    });
  }, [sessionId]);
  // v1.12.1: debounced PATCH on every change. Settings panes are stripped
  // before saving (ephemeral per v1.9).
  useEffect(() => {
    if (!hydratedRef.current) return;
    const payload = persistablePanes(panes);
    const json = JSON.stringify(payload);
    if (json === lastRemoteJsonRef.current) return;
    const timer = setTimeout(() => {
      lastRemoteJsonRef.current = json;
      api.sessions.updateWorkspacePanes(sessionId, payload).catch(() => {
        // Non-fatal: next change retries. Persistent failures surface via
        // the network layer's existing reconnect toast.
      });
    }, SAVE_DEBOUNCE_MS);
    return () => clearTimeout(timer);
  }, [sessionId, panes]);
  useEffect(() => {
@@ -328,6 +397,23 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
    });
  }, []);
  const validatePanes = useCallback((validChatIds: Set<string>) => {
    setPanes((prev) => {
      const cleaned = prev.map((pane) => {
        if (pane.kind !== 'chat' || pane.chatIds.length === 0) return pane;
        const nextIds = pane.chatIds.filter((id) => validChatIds.has(id));
        if (nextIds.length === pane.chatIds.length) return pane;
        if (nextIds.length === 0) {
          return { ...pane, kind: 'empty' as const, chatId: undefined, chatIds: [], activeChatIdx: -1 };
        }
        const nextActiveIdx = Math.min(pane.activeChatIdx, nextIds.length - 1);
        return { ...pane, chatIds: nextIds, activeChatIdx: nextActiveIdx, chatId: nextIds[nextActiveIdx] };
      });
      const unchanged = cleaned.every((p, i) => p === prev[i]);
      return unchanged ? prev : cleaned;
    });
  }, []);
  const removeChatFromPanes = useCallback((chatId: string) => {
    setPanes((prev) => prev.map((p) => {
      const idx = p.chatIds.indexOf(chatId);
@@ -411,6 +497,7 @@ export function useWorkspacePanes(sessionId: string): UseWorkspacePanesResult {
    removePane,
    removeChatFromPanes,
    initializeFirstChatIfEmpty,
    validatePanes,
    handlePaneDragStart,
    handlePaneDragOver,
    handlePaneDragLeave,
--- a/apps/web/src/pages/Session.tsx
+++ b/apps/web/src/pages/Session.tsx
@@ -59,6 +59,7 @@ function SessionInner({ sessionId }: { sessionId: string }) {
    removePane,
    removeChatFromPanes,
    initializeFirstChatIfEmpty,
    validatePanes,
  } = panesHook;
  const openChatInActivePane = useCallback(
@@ -70,6 +71,7 @@ function SessionInner({ sessionId }: { sessionId: string }) {
    openChatInPane,
    openChatInActivePane,
    initializeFirstChatIfEmpty,
    validatePanes,
  });
  const { chats, renameChat } = chatsHook;
--- a/apps/web/src/styles/globals.css
+++ b/apps/web/src/styles/globals.css
@@ -138,6 +138,7 @@
  --radius-xl: calc(var(--radius) + 4px);
  --font-sans: "Inter Variable", "Inter", system-ui, sans-serif;
  --font-mono: "JetBrains Mono Variable", ui-monospace, SFMono-Regular, monospace;
  --animate-spin-slow: spin 1.2s linear infinite;
 }
@layer base {
--- a/boocode_roadmap.md
+++ b/boocode_roadmap.md
@@ -1,6 +1,6 @@
 # BooCode v1.x — Roadmap
-Last updated: 2026-05-20
+Last updated: 2026-05-21
 ## Overview
@@ -10,7 +10,7 @@ Live at `https://code.indifferentketchup.com` (Caddy → Authelia → Tailscale
 **Architectural commitments:**
- No embeddings. The model uses file-view tools (`view_file`, `list_dir`, `grep`, `find_files`) + sidecar analyzers (codecontext, codesight). Walked away from the RAG pipeline May 2026.
+- No embeddings. Model uses file-view tools (`view_file`, `list_dir`, `grep`, `find_files`) + sidecar analyzers (codecontext, codesight) + codecontext MCP tools. Walked away from the RAG pipeline May 2026.
 - Read-only in v1.x. Write tools land in BooCoder (separate container, post-v1.x).
 - One Postgres (`boocode_db`), one frontend SPA, container-per-service for new capabilities.
@@ -18,136 +18,87 @@ External code lifted from / referenced in: see `boocode_code_review.md` for full
 -----
-## Shipped (status as of 2026-05-20)
+## Shipped (status as of 2026-05-21)
-| Version | Theme | Notes |
+| Version | Theme | Tag |
 |---|---|---|
-| v1.0 | Initial scaffold | live |
+| v1.0 | Initial scaffold | — |
-| Batches 1–4.4 | Markdown, sidebar, panes, chats-inside-sessions, archive, fork/delete, header polish, settings drawer | merged |
+| Batches 1–4.4 | Markdown, sidebar, panes, chats-inside-sessions, archive, fork/delete, header polish, settings drawer | — |
-| v1.5 | resolveProjectPath, BOOTSTRAP_ROOT, vitest pin | merged |
+| v1.5 | resolveProjectPath, BOOTSTRAP_ROOT, vitest pin | — |
-| v1.6, v1.6.1, v1.6.2 | Mobile pass + RightRail mobile drawer | merged |
+| v1.6, v1.6.1, v1.6.2 | Mobile pass + RightRail mobile drawer | — |
-| v1.7 | Drag-drop file + paste-as-attachment | merged |
+| v1.7 | Drag-drop file + paste-as-attachment | — |
-| v1.8, v1.8.1, v1.8.2 | Settings drawer, git_status tool, WS reconnect, **per-turn budget reset + Continue affordance + CapHitSentinel** | merged |
+| v1.8, v1.8.1, v1.8.2 | Settings drawer, git_status tool, WS reconnect, per-turn budget reset + Continue affordance + CapHitSentinel | — |
-| v1.9.1 | Skills system (`/opt/skills/` + `skill_find`/`skill_use`/`skill_resource` tools + `/skill` slash command) | merged |
+| v1.9.1 | Skills system (`/opt/skills/` + `skill_find` / `skill_use` / `skill_resource` + `/skill` slash command) | `v1.9.1` |
-| v1.9.7 | `ask_user_input` elicitation tool | merged |
+| v1.9.7 | `ask_user_input` elicitation tool | `v1.9.7` |
-| **Batch 9 (Agents Tier 2)** | `AGENTS.md` + 6 builtin agents + AgentPicker in ChatInput toolbar + `sessions.agent_id` | **merged in `92bd3b1`**, included in v1.9.1/v1.9.7/v1.10.x tags |
+| Batch 9 (Agents Tier 2) | `AGENTS.md` + 6 builtin agents + AgentPicker in ChatInput toolbar + `sessions.agent_id` | folded into `v1.9.1`/`v1.9.7` |
-| v1.10.0 | BooTerm: separate container, xterm.js + node-pty + tmux | merged |
+| v1.10.0 | BooTerm: separate container, xterm.js + node-pty + tmux | `v1.10.0` |
-| v1.10.1 | BooTerm-user (spawn as samkintop, login bash, Claude Code/opencode PATH) | merged |
+| v1.10.1 | BooTerm-user (spawn as samkintop, login bash, Claude Code/opencode PATH) | `v1.10.1` |
-| v1.10.4, v1.10.5 | Mobile terminal + XML tool-call fallback parser | merged |
+| v1.10.4, v1.10.5 | Mobile terminal + XML tool-call fallback parser | — |
-| **v1.11.0** | **opencode-style compaction port** (auto-overflow, anchored summary, tail preservation) | merged |
+| v1.11.0 | opencode-style compaction port (auto-overflow, anchored summary, tail preservation) | — |
-| v1.11.1 | Compaction follow-up (working indicator during compaction, unit tests, .bak cleanup) | merged |
+| v1.11.1 | Compaction follow-up (working indicator during compaction, unit tests, .bak cleanup) | — |
-| v1.11.2 | ContextBar (persistent context-usage indicator) | merged |
+| v1.11.2 | ContextBar (persistent context-usage indicator above MessageList) | — |
-| v1.11.3 | `ctx_max` capture via `/upstream/<model>/props` (replaces dead `timings.n_ctx` read) | merged |
+| v1.11.3 | `ctx_max` capture via `/upstream/<model>/props` (replaces dead `timings.n_ctx` read) | `v1.11.3` |
 | v1.11.5 | ContextBar inline next to agent picker; remove ChatContextPopover; default new sessions to no agent | — |
 | v1.11.6 | Doom-loop guard from opencode (3 identical tool calls → sentinel, abort recursion) | — |
 | v1.11.7 | pathGuard secrets filter (continue.dev `DEFAULT_SECURITY_IGNORE_FILETYPES`) | — |
 | v1.11.8 | web_search + web_fetch tools via SearXNG | — |
 | v1.11.9 | Manual redirect handling — re-run URL guard on each hop (SSRF hardening) | — |
 | v1.11.10 | Stream-cap response body at 5MB, abort on overflow | `v1.11.x` |
 | **v1.12.0** | **codecontext sidecar (Go HTTP shim, NDJSON MCP framing, child.Wait supervisor) + container guidance (BOOCHAT.md/BOOCODER.md) + 7 vendored skills + system-prompt.ts extraction + mtime-watch cache + 8 codecontext tool wrappers + per-agent tool whitelists + .codecontextignore template + agents.ts ALL_TOOL_NAMES single-source-of-truth fix** | `v1.12.0` |
 -----
-## In flight / queued
+## In flight (uncommitted on disk, 2026-05-21)
-| Version | Theme | Status |
+v1.12.1 work — landed today, not yet committed:
 | Item | Status | Notes |
 |---|---|---|
-| ~~v1.11.4~~ | ~~Per-turn budget + Continue affordance~~ | **CANCELLED** — already shipped in v1.8.2 |
+| Server-side workspace pane sync | Done | `sessions.workspace_panes jsonb` column; PATCH endpoint; `session_workspace_updated` WS frame; localStorage migration on first load; deprecated `session_panes` table dropped |
-| **v1.11.5** | ContextBar relocate (above agent-picker row), thicker, always-visible, remove ChatContextPopover | **dispatched** |
+| Richer status indicators | Done | Five states (`streaming` / `tool_running` / `waiting_for_input` / `idle` / `error`) with distinct visuals: amber orbiting dots for streaming, amber spinning ring for tool execution, blue static for waiting on user, emerald/gray/red for idle/error |
-| v1.11.6 | Doom-loop guard from opencode (3 identical tool calls → sentinel, abort recursion) | drafted |
+| Startup hung-row sweep | Done | `UPDATE messages SET status='failed' WHERE status='streaming' AND created_at < NOW() - INTERVAL '5 minutes'` on server boot |
-| v1.11.7 | pathGuard secrets filter (continue.dev's `DEFAULT_SECURITY_IGNORE_FILETYPES`) | drafted |
+| One stuck row from v1.12.0 smoke | Cleared | Manual UPDATE (`d63c25b1`) |
-| v1.11.x | Tag consolidation point (everything since v1.11.0) | queued |
+| `detectSameNameLoop` code path | Added, never fired | Candidate for revert in next batch — dead code |
 | Diagnostic logging in inference.ts | Added for debugging | Must come out before commit |
 -----
-## Major work after v1.11.x
+## v1.12.x cleanup (NEXT — small, immediate)
-| Version | Theme | LoC est. |
+Five items. Group them or split them — your call.
 |---|---|---|
 | **v1.12** | codecontext sidecar + tool output truncation + repair tool call (Integration 1 + 3 from May review, fused) | ~600 |
 | v1.13 | Phase B groundwork — parts table + AI SDK adoption + per-tool `read_only`/`write` tagging | ~1500 |
 | v1.14 | Phase C — outer agent loop (multi-step until non-tool finish, AGENTS.md `steps` field, reasoning as part type) | ~800 |
 | v1.15 | Phase D — permission ruleset + MCP client (lays foundation for BooCoder) | ~600 |
 | v1.16 | Batch 11b — codesight repo_health (call graph, circular deps, dead code) | ~400 |
 | **v2.0** | Batch 14 — BooCoder pending changes (new container, write tools, plandex pattern) | ~1200 |
 | v2.1 | Batch 15 — BooCoder runtime isolation (per-session Docker sandbox, OpenHands pattern) | ~600 |
 | v2.x | Batch 16/17 — Multi-provider LLM (optional, pi-ai) and Workflow graphs (far future, agent-framework concepts) | tbd |
-----
+### v1.12.1 — commit consolidation
-## Roadmap doc deviations and corrections
+**Action items, in order:**
-This roadmap was significantly out of sync with reality until 2026-05-20. Key corrections folded in:
+1. **Remove diagnostic logging** from `apps/server/src/services/inference.ts`. The 12 `ctx.log.info` calls added today proved the inference loop was functioning correctly; the prompts were just slow. They are too verbose for production, so strip them and keep the file clean.
-1. **Batch 9 (Agents Tier 2) is done**, not "next up." Shipped as commit `92bd3b1`, included in v1.9.1 forward. The original "Track A: Batch 9 next" recommendation was correct but the doc never got updated.
+2. **Revert `detectSameNameLoop`.** Three additions in inference.ts:
-2. **v1.6.2 merged.** No longer "in flight."
+   - `DOOM_LOOP_SAME_NAME_THRESHOLD = 5` constant
-3. **Batch 5 (fork/delete), Batch 6 (drag-drop), Batch 7 (settings drawer), Batch 8 (web search), Batch 10 (BooTerm) all shipped**, scattered across the v1.6–v1.10 version line. Original "Track A polish then agents" plan was abandoned; work happened opportunistically.
+   - `detectSameNameLoop()` function
-4. **v1.11.0 was a major unplanned addition** — opencode-style compaction (auto-overflow detection + anchored rolling summary + tail preservation). This is NOT a batch from the old roadmap. It opened a new patch line (v1.11.x) of small follow-ups in front of the original Batches 11–17.
+   - Call site in `runAssistantTurn` immediately after the existing `detectDoomLoop` check
-5. **Batch 11 (codecontext sidecar) moves to v1.12.** Bundles with truncation and repair-tool-call lift (both from opencode) since they share concerns and the `tool_choice='required'` confirmation makes repair-tool-call viable.
+   
-6. **Phase B (parts table + AI SDK + tool-call lifecycle) becomes v1.13.** This absorbs the old Batch 13 (append-only event log) — same outcome (typed message parts), different mental framing.
+   It never fired in any real run today, so it is dead code. The existing `detectDoomLoop` (identical args, threshold 3) is sufficient.
 7. **Phase C and Phase D are new** (numbered v1.14/v1.15). They originate from the opencode integration analysis, not from the original 17-batch plan. Phase C delivers the outer agent loop with explicit step boundaries. Phase D delivers the permission ruleset + MCP client needed for codecontext to be useful and for BooCoder to gate writes.
 8. **BooCoder (v2.0/v2.1)** is the second-major-version line. New container, new safety story (pending changes + per-session Docker sandbox). Maps to original Batches 14/15.
-----
+3. **Drop the stale `messages_status_check` CHECK constraint** in `apps/server/src/schema.sql`. Two constraints exist on the table:
   - `messages_status_check` allows `streaming|complete|failed` (old, stale)
   - `messages_status_chk` allows `streaming|complete|failed|cancelled` (new)
   The old one prevents `cancelled` from being written. Drop it with `ALTER TABLE messages DROP CONSTRAINT IF EXISTS messages_status_check;`.
-## v1.11.x patches in detail
+4. **Stop-handler writes terminal status.** When the user clicks stop mid-stream, the abort path must `UPDATE messages SET status='cancelled' WHERE id = $assistantMessageId AND status='streaming'`. Currently these rows stay at `streaming` indefinitely. The startup sweep catches them on restart, but the terminal status should be written immediately. Edit `handleAbortOrError` in `apps/server/src/services/inference.ts` to add the UPDATE.
-### v1.11.0 — opencode-style compaction port ✅
+5. **Commit + tag v1.12.1.** Include the workspace pane sync, status indicator overhaul, startup sweep, and items 1–4 above. Single commit per item is fine; tag at end.
-**What shipped:** Auto-detection of context overflow (`isOverflow(usage, model)`) triggers compaction on the *next* user turn. Compaction preserves the last 2 turns verbatim and produces an anchored Markdown summary (8-section template lifted verbatim from opencode `compaction.ts`) that replaces older head messages. Summary is rolling — each new compaction updates the prior summary, not stacks. Schema additions: `messages.compacted_at`, `messages.summary`, `messages.tail_start_id`, `chats.needs_compaction`. WS `compacted` frame fires sonner toast on completion.
+**Estimated:** ~150 LoC net (deletions dominate).
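The `AND status='streaming'` guard in item 4 is what keeps a late stop click from overwriting a row that already finished. A minimal in-memory sketch of that guard follows; `MessageRow` and `markCancelled` are hypothetical names used only for illustration, and the real change is a SQL UPDATE inside `handleAbortOrError`.

```typescript
type MessageStatus = 'streaming' | 'complete' | 'failed' | 'cancelled';

interface MessageRow {
  id: string;
  status: MessageStatus;
}

// Hypothetical in-memory stand-in for:
//   UPDATE messages SET status='cancelled' WHERE id = $1 AND status='streaming'
// Returns the number of rows changed, like a SQL driver's row count.
function markCancelled(rows: MessageRow[], assistantMessageId: string): number {
  let changed = 0;
  for (const row of rows) {
    if (row.id === assistantMessageId && row.status === 'streaming') {
      row.status = 'cancelled';
      changed += 1;
    }
  }
  return changed;
}

const sampleRows: MessageRow[] = [
  { id: 'a', status: 'streaming' },
  { id: 'b', status: 'complete' },
];
const stoppedMidStream = markCancelled(sampleRows, 'a'); // row 'a' was still streaming
const stoppedTooLate = markCancelled(sampleRows, 'b'); // row 'b' had already completed
```

Because the guard is part of the predicate, calling the handler twice for the same message is harmless: the second call simply changes zero rows.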
-**Key divergences from opencode:** Per-chat (not per-session) compaction state because BooCode history is per-chat. UUID `tail_start_id` not BIGINT. No `parent_id` on messages. Context limit comes from `messages.ctx_max` (last-known `n_ctx`), not a `model.context_limit` field.
+### v1.12.2 — live throughput display (small UX win)
-### v1.11.1 — Compaction follow-up ✅
+Surface `tokens_per_second` and `ctx_used` next to the status indicator while streaming. Backend already emits these in the `usage` frame; just consume them in the StatusDot wrapper or a sibling component. ~80 LoC, frontend-only.
-Working-state `chat_status: working/idle` frames around the LLM call inside `compaction.process()`. 24 new vitest cases for the six pure functions (`usable`, `isOverflow`, `estimate`, `turns`, `select`, `buildPrompt`). 7 `.bak-v1.11` files deleted.
+### v1.12.3 — stale-stream frontend banner
-### v1.11.2 — ContextBar ✅
+When a chat has a `streaming` row older than ~60s with no new tokens, the UI should surface a "Previous response didn't complete. [Retry] [Discard]" banner instead of silently queueing new sends. Today's debugging session lost four hours to misreading slow streams as dead; this is the UX fix that prevents it. ~150 LoC, frontend + small backend endpoint for the discard action.
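The banner condition could be expressed as a small pure predicate. This is a sketch under assumptions: the `lastTokenAt` field and the `shouldShowStaleBanner` name are hypothetical, and the 60-second threshold is taken from the "~60s" figure above.

```typescript
const STALE_AFTER_MS = 60_000; // "~60s" from the roadmap text

interface StreamingRow {
  status: 'streaming' | 'complete' | 'failed' | 'cancelled';
  lastTokenAt: number; // hypothetical field: epoch ms of the most recent token
}

// True only when the row is still marked streaming AND no token has arrived
// within the threshold. A slow but live stream keeps refreshing lastTokenAt.
function shouldShowStaleBanner(row: StreamingRow, nowMs: number): boolean {
  return row.status === 'streaming' && nowMs - row.lastTokenAt > STALE_AFTER_MS;
}
```

Keying the check on the time of the last token rather than on row age is what separates a slow stream from a dead one.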
 New `ContextBar.tsx` rendering above MessageList. Shows `{used} / {max} ({pct}%)` with color tiers computed against `max - 20k` reserve (matches `compaction.usable()`): muted <60%, amber 60-80%, orange 80-95%, red ≥95%. Tooltip shows "Auto-compaction at ~N%". Mobile breakpoints: `< 380px` shows "Ctx" + numbers; `380-639px` adds parenthetical %; `≥ 640px` shows full "Context" label.
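The tier boundaries described above can be sketched as one function. It assumes "20k" means 20,000 tokens and that percentages are taken against `max - reserve`; `contextTier` and `RESERVE_TOKENS` are illustrative names, not the component's actual API.

```typescript
type ContextTier = 'muted' | 'amber' | 'orange' | 'red';

const RESERVE_TOKENS = 20_000; // assumption: "20k" read as 20,000 tokens

function contextTier(used: number, max: number): ContextTier {
  // Usable budget is the window minus the reserve; clamp so it stays positive.
  const usable = Math.max(1, max - RESERVE_TOKENS);
  // Cross-multiplied comparisons avoid floating-point surprises at the boundaries.
  if (used * 100 >= 95 * usable) return 'red';
  if (used * 100 >= 80 * usable) return 'orange';
  if (used * 100 >= 60 * usable) return 'amber';
  return 'muted';
}
```

With a 120,000-token window the usable budget is 100,000, so the bar would turn amber at 60,000 tokens and red at 95,000.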
 ### v1.11.3 — ctx_max capture fix ✅
 Discovered the dead code at `inference.ts:479-481` and `compaction.ts:300` reading `parsed.timings.n_ctx` never fired — llama-server emits `prompt_n / predicted_n / *_ms / *_per_second` in timings but NOT `n_ctx`. New `model-context.ts` module fetches `GET /upstream/<model>/props` with 3s timeout, positive cache (no TTL), 60s negative cache. Wired into all 4 ctx_max write sites (3 in inference.ts, 1 in compaction.ts). 12 new vitest cases. 7 historical rows backfilled to `ctx_max = 262144` (single-day backfill, only qwen3.6-35b-a3b-mxfp4 in use).
 ### v1.11.4 — CANCELLED
 Original scope: per-turn budget reset + Continue affordance + CapHitSentinel card. Recon revealed all three are already shipped (v1.8.2 timestamps in inference.ts comments). Dead version slot.
 ### v1.11.5 — ContextBar relocate (DISPATCHED)
 Relocate ContextBar from above MessageList to above the agent-picker row. Bump height from ~4px bar to ~10-12px. Always-visible (zero-state when no assistant messages + use `model_context_limit` from v1.11.3 cache). Remove `ChatContextPopover` entirely (redundant signal; mobile-hostile).
 ### v1.11.6 — Doom-loop guard (QUEUED)
 Detect 3 identical tool calls in a row within one turn (same name + same args via JSON.stringify). On detection: abort tool-call recursion, insert `metadata.kind='doom_loop'` sentinel, trigger summary turn via existing `runCapHitSummary` path. New `DoomLoopSentinel.tsx` component (no Continue button — looping shouldn't be retried with same tools). Per-turn sliding window, scoped to current turn's tool-call accumulator.
 **Lift source:** opencode `processor.ts`, `DOOM_LOOP_THRESHOLD = 3` constant.
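A minimal sketch of the predicate, assuming the per-turn accumulator is an array of `{ name, args }` records (the `ToolCall` shape here is hypothetical). Note that `JSON.stringify` is sensitive to key order, so two calls with the same arguments in a different key order would not count as identical.

```typescript
interface ToolCall {
  name: string;
  args: unknown;
}

const DOOM_LOOP_THRESHOLD = 3;

// True when the last `threshold` calls in this turn share the same name and
// the same serialized arguments.
function detectDoomLoop(
  calls: ToolCall[],
  threshold: number = DOOM_LOOP_THRESHOLD,
): boolean {
  if (calls.length < threshold) return false;
  const keys = calls
    .slice(-threshold)
    .map((c) => `${c.name}:${JSON.stringify(c.args)}`);
  return keys.every((k) => k === keys[0]);
}
```

Only the trailing window is inspected, so an earlier different call does not mask a loop, and a single different call at the end resets detection.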
 ### v1.11.7 — pathGuard secrets filter (QUEUED)
 Extend pathGuard with `DEFAULT_SECURITY_IGNORE_FILETYPES` from continue.dev `core/indexing/ignore.ts`. Three-tier matcher: exact basenames (`credentials`, `secrets.yml`), extensions (`.env`, `.pem`, `.key`, `.crt`, etc.), prefix patterns (`id_rsa`, `id_dsa`, `id_ecdsa`, `id_ed25519`). Blocked files appear in `list_dir` and `find_files` results with `(blocked)` annotation. `view_file` returns `{ error: 'blocked_secret_file', ... }`. `grep` cannot read blocked file contents. No override mechanism in v1.x (use host shell).
 **Why it matters:** `/opt:/opt:ro` mount currently exposes `boolab/.env`, `dubdrive/users.json`, `authelia/state`, every other service's secrets to any tool past path validation. Cheap close on that surface area.
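The three-tier matcher might look like the sketch below. The lists contain only the entries named above (the source's "etc." is not expanded), and `isBlockedSecretFile` is a hypothetical name rather than the pathGuard API.

```typescript
// Subset of the patterns named in the roadmap text; the real list is longer.
const BLOCKED_BASENAMES = new Set(['credentials', 'secrets.yml']);
const BLOCKED_EXTENSIONS = ['.env', '.pem', '.key', '.crt'];
const BLOCKED_PREFIXES = ['id_rsa', 'id_dsa', 'id_ecdsa', 'id_ed25519'];

function isBlockedSecretFile(path: string): boolean {
  const basename = path.split('/').pop() ?? '';
  // Tier 1: exact basename match.
  if (BLOCKED_BASENAMES.has(basename)) return true;
  // Tier 2: extension match (a bare ".env" file also matches here).
  if (BLOCKED_EXTENSIONS.some((ext) => basename.endsWith(ext))) return true;
  // Tier 3: prefix match, which also covers files such as "id_rsa.pub".
  return BLOCKED_PREFIXES.some((prefix) => basename.startsWith(prefix));
}
```

Matching on the basename alone keeps the check independent of where under `/opt` the file lives.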
 -----
 ## v1.12 — codecontext sidecar + truncation + repair tool call
 Three lifts fused because they share concerns:
 1. **codecontext sidecar** — new container, single-instance, path-addressed multi-project. Mount `/opt/projects:/workspace:ro`. 8 tools wired as static `ToolDef` wrappers in `apps/server/src/services/tools/codecontext/` (one file per tool). HTTP client to `http://codecontext:8765`. New module `apps/server/src/services/codecontext_bridge.ts` translates `project_id` → `/workspace/<relative>/` paths.
 2. **Tool output truncation** — opencode `truncate.ts` pattern. Cap at 2000 lines / 50KB. Larger outputs: write full content server-side, return preview + opaque `id`. New tool `view_truncated_output(id)` retrieves full content by server-mapped id. **No pathGuard exception** for `/tmp` directory — the opaque-id approach avoids exposing a writable filesystem location to the model. Only codecontext outputs need truncation; native tools (view_file 200 lines, grep 200 results, list_dir 500 entries, find_files 200 results) already cap reasonably.
 3. **`experimental_repairToolCall` equivalent** — when model emits malformed tool call (JSON parse fails or Zod validation fails), return a synthetic tool result instead of an error: `{ error, raw_args, tool_name, hint: 'Retry with valid JSON arguments.' }`. Model self-corrects on next step. Add one line to system prompt instructing self-correction on malformed-args results. Confirmed working precondition: `tool_choice: "required"` accepted by llama-swap (verified 2026-05-20 against qwen3.6-35b-a3b-mxfp4).
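The truncation lift in item 2 can be sketched as follows. Everything here is an assumption made for illustration: the names `truncateOutput` and `viewTruncatedOutput`, the in-memory map without TTL cleanup, and the counter-based id (a real implementation would want an unguessable id).

```typescript
const MAX_LINES = 2000;
const MAX_BYTES = 50 * 1024;

// Hypothetical in-memory id -> full content map; TTL cleanup is omitted.
const fullOutputs = new Map<string, string>();
let nextOutputId = 0;

// UTF-8 length computed by hand so the sketch needs no runtime-specific APIs.
function utf8ByteLength(s: string): number {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const code = s.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (code >= 0xd800 && code <= 0xdbff) { bytes += 4; i++; } // surrogate pair
    else bytes += 3;
  }
  return bytes;
}

interface ToolOutput {
  truncated: boolean;
  content: string;
  id?: string;
}

function truncateOutput(full: string): ToolOutput {
  const lines = full.split('\n');
  if (lines.length <= MAX_LINES && utf8ByteLength(full) <= MAX_BYTES) {
    return { truncated: false, content: full };
  }
  // Apply the line cap first, then shrink until the byte cap holds.
  let preview = lines.slice(0, MAX_LINES).join('\n').slice(0, MAX_BYTES);
  while (utf8ByteLength(preview) > MAX_BYTES) {
    preview = preview.slice(0, Math.floor(preview.length * 0.9));
  }
  nextOutputId += 1;
  const id = `out_${nextOutputId}`; // sequential for the sketch only
  fullOutputs.set(id, full);
  return { truncated: true, content: preview, id };
}

function viewTruncatedOutput(id: string): string | { error: string } {
  const full = fullOutputs.get(id);
  return full === undefined ? { error: 'unknown_output_id' } : full;
}
```

The model only ever sees the preview and the id, which is the property the roadmap relies on to avoid a pathGuard exception for a writable directory.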
 **Hand-roll, not AI SDK adoption.** AI SDK migration deferred to v1.13.
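A hand-rolled version of the repair path from item 3 might be sketched like this. The `validate` callback is a hypothetical stand-in for the Zod check, and `parseToolArgs` is an illustrative name; only the synthetic result's four fields come from the text above.

```typescript
interface RepairResult {
  error: string;
  raw_args: string;
  tool_name: string;
  hint: string;
}

type ParsedArgs =
  | { ok: true; args: unknown }
  | { ok: false; result: RepairResult };

// `validate` returns an error message, or null when the arguments are acceptable.
function parseToolArgs(
  toolName: string,
  rawArgs: string,
  validate: (args: unknown) => string | null,
): ParsedArgs {
  // Build the synthetic tool result that is handed back to the model.
  const repair = (error: string): ParsedArgs => ({
    ok: false,
    result: {
      error,
      raw_args: rawArgs,
      tool_name: toolName,
      hint: 'Retry with valid JSON arguments.',
    },
  });
  let args: unknown;
  try {
    args = JSON.parse(rawArgs);
  } catch {
    return repair('invalid_json');
  }
  const problem = validate(args);
  return problem === null ? { ok: true, args } : repair(problem);
}

// Example validator: requires an object with a string `path` field.
const requirePath = (args: unknown): string | null =>
  typeof args === 'object' && args !== null &&
  typeof (args as { path?: unknown }).path === 'string'
    ? null
    : 'missing string field "path"';
```

Returning a result instead of throwing keeps the turn alive, so the model can read `raw_args` and correct itself on the next step.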
 **AGENTS.md updates:** Each of the 6 builtin agents gets a curated codecontext tool whitelist:
 - Architect: all 8
 - Debugger: `search_symbols`, `get_dependencies`
 - Code Reviewer: `get_file_analysis`
 - Refactorer: `get_semantic_neighborhoods`, `get_dependencies`
 - Security Auditor: `get_file_analysis`, `search_symbols`, `get_dependencies`
 - Prompt Builder: none (no structural reasoning relevance)
 **Dependencies:** v1.11.x merged. No others.
 **Estimated:** 600 LoC across 3-4 dispatches under the v1.12 umbrella.
 -----
@@ -162,11 +113,15 @@ Three lifts fused because they share concerns:
 3. Tool registry: `ToolDef<T>` gains `category: 'read_only' | 'write'` field. BooCode v1.x rejects any `write` tool at registry time (defense in depth for the BooCoder split). Alpha-sort tool list before sending to model (prompt-cache stability).
 4. Reasoning content (`reasoning_content` from Qwen3.6) captured as its own part type instead of dropped or inlined.
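Item 3's registry-time rejection and alpha-sort could be sketched as below. The non-generic `ToolDef` shape and the `ReadOnlyToolRegistry` class are simplifications assumed for illustration; the real `ToolDef<T>` carries more fields.

```typescript
type ToolCategory = 'read_only' | 'write';

interface ToolDef {
  name: string;
  category: ToolCategory;
}

class ReadOnlyToolRegistry {
  private readonly tools = new Map<string, ToolDef>();

  // Reject write tools when they are registered, not when they are called.
  register(tool: ToolDef): void {
    if (tool.category === 'write') {
      throw new Error(`write tool "${tool.name}" rejected: v1.x is read-only`);
    }
    this.tools.set(tool.name, tool);
  }

  // A stable alphabetical order keeps the tool list identical across requests,
  // which is what prompt caching depends on.
  listForModel(): ToolDef[] {
    return Array.from(this.tools.values()).sort((a, b) =>
      a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
    );
  }
}
```

Failing at registration means a misconfigured write tool surfaces at server boot rather than in the middle of a chat turn.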
-**Migration risk:** non-trivial. inference.ts is ~1400 lines with custom XML fallback, SSE parsing, compaction integration. Plan dedicated cutover window. Compaction.ts must update to assemble head from parts.
+**Migration risk:** non-trivial. `inference.ts` is ~1700 lines with custom XML fallback, SSE parsing, compaction integration. Plan dedicated cutover window. `compaction.ts` must update to assemble head from parts.
 **Replaces:** Original Batch 13 (append-only event log) — same outcome, different vocabulary.
-**Dependencies:** v1.12 merged.
+**Today's debugging spike validates this work.** Four hours of confusion came from JSON-blob `tool_calls` / `tool_results` columns hiding state from logs and from the inference state machine being invisible. Typed parts + per-part status would have shown the slow-stream-vs-dead distinction in seconds.
 **Dependencies:** v1.12.x cleanup merged.
 **Estimated:** ~1500 LoC.
 -----
@@ -179,10 +134,12 @@ Three lifts fused because they share concerns:
 1. Outer loop continues until model returns non-tool finish OR step cap hit. Step ≠ tool call: one step can contain multiple tool calls in parallel.
 2. `agent.steps ?? Infinity` per-agent step cap. AGENTS.md gains `steps:` field. Refactorer `steps: 5`, Architect `steps: 20`, etc.
 3. Step-boundary events (`step_start`, `step_finish`) explicit in the parts stream. Per-step snapshot for revert (planned for BooCoder; backend-only in v1.14).
-4. Doom-loop guard (v1.11.6) migrates from "abort recursion" to "raise within loop iteration." Same predicate, different control flow.
+4. Doom-loop guards (v1.11.6) migrate from "abort recursion" to "raise within loop iteration." Same predicate, different control flow.
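The loop shape in items 1 and 2 can be sketched with a synchronous stand-in for the model call. `runStep`, `StepResult`, and `runOuterLoop` are hypothetical names; the real loop would be asynchronous and would emit `step_start` / `step_finish` parts.

```typescript
interface StepResult {
  // Tools requested in this step; an empty list means a non-tool finish.
  toolCalls: string[];
}

interface AgentConfig {
  steps?: number;
}

// `runStep` stands in for one model call plus the execution of its tool calls.
function runOuterLoop(
  agent: AgentConfig,
  runStep: (stepIndex: number) => StepResult,
): { steps: number; capHit: boolean } {
  const cap = agent.steps ?? Infinity;
  let steps = 0;
  while (steps < cap) {
    const result = runStep(steps);
    steps += 1; // one step, however many tool calls it contained
    if (result.toolCalls.length === 0) return { steps, capHit: false };
  }
  return { steps, capHit: true };
}
```

A step that issues three parallel tool calls still counts once against the cap, which is the "step is not a tool call" distinction made in item 1.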
 **Dependencies:** v1.13 merged.
 **Estimated:** ~800 LoC.
 -----
 ## v1.15 — Phase D: permission ruleset + MCP client
@@ -200,6 +157,8 @@ Three lifts fused because they share concerns:
 **Dependencies:** v1.13 merged (parts table for permission events). Independent of v1.14.
 **Estimated:** ~600 LoC.
 -----
 ## v1.16 — Batch 11b: codesight repo_health
@@ -208,6 +167,8 @@ Call graph, circular dependency detection, dead code flagging. Port `analyze.mjs
 **Dependencies:** v1.12 merged (can reuse codecontext parse output where overlapping).
 **Estimated:** ~400 LoC.
 -----
 ## v2.0 — BooCoder pending changes
@@ -218,6 +179,8 @@ New container `boocoder` at `100.114.205.53:9502`. Owns write tools (`edit_file`
 **Dependencies:** v1.13 (parts) + v1.15 (permissions).
 **Estimated:** ~1200 LoC.
 -----
 ## v2.1 — BooCoder runtime isolation
@@ -228,6 +191,8 @@ Per-session Docker sandbox spawned by BooCoder on first write. Only project path
 **Dependencies:** v2.0.
 **Estimated:** ~600 LoC.
 -----
 ## v2.x — Optional / far future
@@ -243,17 +208,18 @@ Per-session Docker sandbox spawned by BooCoder on first write. Only project path
 | Container | Port | Mount | Purpose | Status |
 |---|---|---|---|---|
-| `boocode` | `100.114.205.53:9500` | `/opt:/opt:ro` | Chat + read-only tools + SPA | Live |
+| `boocode` | `100.114.205.53:9500` | `/opt:/opt` | Chat + read-only tools + SPA | Live |
 | `boocode_db` | `127.0.0.1:5500` | `boocode_pgdata` volume | Postgres 16-alpine | Live |
 | `booterm` | `100.114.205.53:9501` | `/opt/repos:/opt/repos:rw` | Terminals (tmux + node-pty) | Live (v1.10.0) |
-| `codecontext` | `:8765` (internal) | `/opt/projects:/workspace:ro` | MCP server for architect tools | v1.12 |
+| **`codecontext`** | **`:8765` (internal)** | **`/opt/projects:/workspace:ro`** | **MCP server for architect tools** | **Live (v1.12.0)** |
 | `boocoder` | `100.114.205.53:9502` | per-session sandbox | Write tools | v2.0 |
 ### Schema additions by version
 - **v1.11.0:** `messages.compacted_at`, `messages.summary`, `messages.tail_start_id`, `chats.needs_compaction`
 - **v1.11.7:** none (pathGuard logic, no DB)
- **v1.12:** none (codecontext is stateless on disk; truncation uses in-memory id→path map with TTL cleanup)
+- **v1.12.0:** none (codecontext stateless; truncation in-memory id-map with TTL cleanup)
 - **v1.12.1:** `sessions.workspace_panes jsonb` (workspace sync); drop deprecated `session_panes` table; drop stale `messages_status_check` constraint
 - **v1.13:** `message_parts` table; `messages` becomes header-only
 - **v1.14:** `agents.steps` column (or AGENTS.md parser extension; no DB if file-only)
 - **v1.15:** `permissions` table, `agent_permissions` join, `session_permissions` join
@@ -268,11 +234,11 @@ Full inventory in `boocode_code_review.md`. Headline items:
 | Source | Used for | Where |
 |---|---|---|
-| **`sst/opencode`** (MIT, TS) | **Compaction algorithms** | **v1.11.0 (shipped)** |
+| `sst/opencode` (MIT, TS) | Compaction algorithms | v1.11.0 (shipped) |
-| `sst/opencode` (MIT, TS) | Doom-loop guard | v1.11.6 |
+| `sst/opencode` (MIT, TS) | Doom-loop guard | v1.11.6 (shipped) |
-| `sst/opencode` (MIT, TS) | `repairToolCall`, truncate.ts, MCP client, permission evaluate, runLoop | v1.12/v1.13/v1.14/v1.15 |
+| `sst/opencode` (MIT, TS) | `repairToolCall`, truncate.ts, MCP client, permission evaluate, runLoop | v1.12 (shipped) / v1.13 / v1.14 / v1.15 |
-| `continuedev/continue` (Apache-2.0) | `DEFAULT_SECURITY_IGNORE_FILETYPES` | v1.11.7 |
+| `continuedev/continue` (Apache-2.0) | `DEFAULT_SECURITY_IGNORE_FILETYPES` | v1.11.7 (shipped) |
-| `nmakod/codecontext` (MIT, Go) | Architect: codebase map sidecar | v1.12 |
+| `nmakod/codecontext` (MIT, Go) | Architect: codebase map sidecar | v1.12.0 (shipped) |
 | `spirituslab/codesight` (MIT-ish, TS) | Architect: repo health analyzer | v1.16 |
 | `Aider-AI/aider` (Apache-2.0) | Fallback `.scm` grammars | v1.12 (fallback) |
 | `cline/cline` (Apache-2.0) | Plan/Act pattern (absorbed into v1.15 permissions) | v1.15 |
@@ -281,8 +247,6 @@ Full inventory in `boocode_code_review.md`. Headline items:
 | `aimasteracc/tree-sitter-analyzer` (MIT) | Outline-first patterns | v1.12 (alt) |
 | `earendil-works/pi` (MIT) | Multi-provider LLM | v2.x (optional) |
 **Original Batch 13 (event log from OpenHands) replaced** by v1.13 (parts table). Same outcome, different framing.
 -----
 ## Decisions log
@@ -293,10 +257,15 @@ Full inventory in `boocode_code_review.md`. Headline items:
 - **Globstar parked** — not an architect tool. Future verify-before-commit candidate only.
 - **codeprysm rejected** — embedding-based. Node/edge taxonomy noted as reference if we ever build our own graph.
 - **Batch 9 decoupled from Batch 7 (2026-05-16); shipped in `92bd3b1`.** Builtin defaults: six agents (Code Reviewer, Debugger, Refactorer, Architect, Security Auditor, Prompt Builder) with no `model` field. Session model wins by default.
- **opencode lift opened** (2026-05-20). Started with compaction (v1.11.0). Continuing through v1.15. Five distinct algorithms: compaction, doom-loop guard, repairToolCall, runLoop, permission evaluate. Plus `truncate.ts` and `MCP client`. Each lifts the algorithm, not the Effect-TS plumbing.
+- **opencode lift opened** (2026-05-20). Started with compaction (v1.11.0). Continuing through v1.15. Five distinct algorithms: compaction, doom-loop guard, repairToolCall, runLoop, permission evaluate. Plus `truncate.ts` and MCP client. Each lifts the algorithm, not the Effect-TS plumbing.
- **AI SDK adoption deferred to v1.13.** Hand-roll repairToolCall in v1.12 first. Migrate everything together when parts table lands.
+- **AI SDK adoption deferred to v1.13.** Hand-roll repairToolCall in v1.12 — not actually done in v1.12.0; truncation also deferred. v1.12.0 shipped codecontext + container guidance + skills only.
- **`tool_choice='required'` confirmed supported** by llama-swap (qwen3.6-35b-a3b-mxfp4, 2026-05-20). Unblocks repair tool call viability.
+- **`tool_choice='required'` confirmed supported** by llama-swap (qwen3.6-35b-a3b-mxfp4, 2026-05-20).
- **v1.11.4 cancelled** (2026-05-20). Per-turn budget reset + Continue affordance + CapHitSentinel were already shipped in v1.8.2. Roadmap was 14 versions stale at time of recon.
+- **v1.11.4 cancelled** (2026-05-20). Per-turn budget reset + Continue affordance + CapHitSentinel were already shipped in v1.8.2.
 - **v1.12.0 shipped** (2026-05-21). codecontext sidecar Track B + container guidance Track A. v1.12 truncation and repairToolCall were deferred into v1.13's AI SDK migration, where they come for free.
 - **v1.12.1 workspace pane sync** (2026-05-21). Moved pane state from per-device localStorage to `sessions.workspace_panes jsonb` with WS broadcast for cross-device sync. Deprecated `session_panes` table dropped. Legacy localStorage migrates on first load.
 - **v1.12.1 status indicator overhaul** (2026-05-21). ChatStatusFrame expanded from `working|idle|error` to `streaming|tool_running|waiting_for_input|idle|error`. StatusDot rewritten with distinct animations per state. Added `executeToolPhase`-entry `tool_running` publish.
 - **detectSameNameLoop reverted** (planned v1.12.1). Added during the 2026-05-21 debugging spike to catch same-tool-name-with-different-args loops. Never fired in any real run because the existing `detectDoomLoop` covers the actual failure modes. Dead code, reverting.
 - **The 2026-05-21 "freeze" debugging spike taught one lesson**: BooCode has no UI signal for the difference between a slow stream and a dead stream. Diagnostic logging (added today, reverted in v1.12.1) revealed the inference loop was working correctly throughout — what looked like four hours of deterministic hang was multiple instances of qwen3.6 generating 8k tokens of self-doubt at temperature 0.2 on a "find the bug" prompt with no real bug. v1.12.2 (live tok/s display) and v1.12.3 (stale-stream banner) directly address this gap.
 -----
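The slow-versus-dead distinction in the last decisions-log entry comes down to a visibility rule for the throughput readout. The v1.12.2 commit message says the readout renders while the status is `streaming` or `tool_running` and hides when the status drops to idle/error or the usage data is older than 10s. A minimal sketch of that rule follows, assuming a hypothetical `UsageSample` shape and a hypothetical `shouldShowThroughput` helper; neither name comes from the codebase, and hiding during `waiting_for_input` is an assumption the commit message does not spell out.

```typescript
// Status values as listed for ChatStatusFrame in the decisions log.
type ChatStatus = 'streaming' | 'tool_running' | 'waiting_for_input' | 'idle' | 'error';

// Hypothetical shape for the latest usage frame seen by the client.
interface UsageSample {
  tokensPerSecond: number;
  receivedAtMs: number; // client clock time when the frame arrived
}

// 10 s cutoff taken from the v1.12.2 commit message.
const STALE_AFTER_MS = 10_000;

// Hypothetical helper: returns true when the tok/s readout should be visible.
function shouldShowThroughput(
  status: ChatStatus,
  sample: UsageSample | null,
  nowMs: number,
): boolean {
  // Only active states show the readout (assumption: waiting_for_input hides it too).
  if (status !== 'streaming' && status !== 'tool_running') return false;
  // Nothing to show until the first usage frame arrives.
  if (sample === null) return false;
  // Hide once the newest sample is older than the cutoff.
  return nowMs - sample.receivedAtMs <= STALE_AFTER_MS;
}
```

Under this sketch a slow stream keeps refreshing `receivedAtMs` and stays visible, while a dead stream stops producing usage frames and the readout disappears after ten seconds.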
Author	SHA1	Message	Date
indifferentketchup	a7104691aa	v1.12.2: live tok/s + ctx display next to status indicator ChatThroughput renders inline beside StatusDot while streaming or tool_running. Subscribes to existing usage frames via sessionEvents. Hides when status drops to idle/error or data is older than 10s. Addresses the 2026-05-21 spike's UX gap where slow streams looked identical to dead streams — now there's a live token velocity readout that immediately distinguishes the two. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:45:53 +00:00
indifferentketchup	1a0a3b1673	v1.12.1: stop-handler writes terminal status + constraint cleanup + dead code removal - handleAbortOrError now writes status='cancelled' on user stop; rows no longer stuck 'streaming' forever - Drop stale messages_status_check constraint (only messages_status_chk remains, allowing 'cancelled' via TS MESSAGE_STATUSES) - Remove detectSameNameLoop and DOOM_LOOP_SAME_NAME_THRESHOLD (added during 2026-05-21 debugging spike, never fired in any real run, existing detectDoomLoop covers actual failure modes) - Remove 12 ctx.log.info diagnostic markers added during the same spike (verbose for production) - Bundles workspace pane sync + status indicator overhaul + startup hung-row sweep landed earlier in v1.12.1 work Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:34:40 +00:00
indifferentketchup	48ee63a286	v1.12.1: rich status indicator + server-side workspace pane sync Status indicator (StatusDot): drops the flat amber pulse for a richer set of states — orbiting amber for streaming, spinning sky ring for tool_running, static violet for waiting_for_input, plus the existing idle/error. Backend chat_status frame widens from 'working\|idle\|error' to discriminate streaming vs tool execution vs paused for user input. Workspace pane sync: pane layout moves from per-device localStorage to server-side sessions.workspace_panes jsonb. PATCH /api/sessions/:id/workspace broadcasts session_workspace_updated on the user channel for cross-device live sync. Echo dedup via JSON comparison so the round-trip frame doesn't loop. Legacy localStorage seeds the server on first hydrate, then is deleted. Deprecated session_panes table dropped. Resilience: startup sweep marks any stale 'streaming' message older than 5 minutes as 'failed' so v1.12.0-style hung rows clear on container restart. useWorkspacePanes gains validatePanes() to prune dead chatId references from saved pane state when the chat list lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 20:32:02 +00:00
indifferentketchup	d58d553503	v1.12.1: same-name doom-loop guard + runAssistantTurn trace logging Add detectSameNameLoop (threshold 5) to catch over-verification hangs where tool args vary but the model is stuck on one tool. Add 12 structured log points across the inference state machine (runAssistantTurn, executeToolPhase, runDoomLoopSummary) to diagnose the deterministic hang surfaced in v1.12.0 smoke testing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-21 17:15:02 +00:00
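For readers who want a concrete picture of what the since-removed `detectSameNameLoop` guard checked, here is a minimal sketch. It assumes the guard looked only at the trailing run of tool calls and fired when the last five shared one tool name regardless of arguments. The `ToolCall` shape and the trailing-run interpretation are assumptions for illustration; only the function name, the constant name, and the threshold of 5 come from the commit messages above.

```typescript
// Hypothetical minimal shape of a recorded tool call.
interface ToolCall {
  name: string;
  args: unknown;
}

// Threshold of 5 per the d58d553503 commit message.
const DOOM_LOOP_SAME_NAME_THRESHOLD = 5;

// Sketch: true when the last `threshold` calls all use the same tool name.
// Arguments are deliberately ignored, since the guard targeted loops where
// the args vary but the model is stuck on one tool.
function detectSameNameLoop(
  calls: readonly ToolCall[],
  threshold: number = DOOM_LOOP_SAME_NAME_THRESHOLD,
): boolean {
  if (threshold <= 0 || calls.length < threshold) return false;
  const tail = calls.slice(-threshold);
  const name = tail[0]?.name;
  return tail.every((call) => call.name === name);
}
```

A check that ignores arguments is broader than one that compares full calls, which is consistent with the changelog's note that the existing `detectDoomLoop` already covered the failure modes seen in real runs.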