Go to file

indifferentketchup 2464d23bb6 v1.1 batch 1: markdown, message actions, tok/s+ctx, AI naming

Four features land together on this branch:

1. Markdown rendering — assistant messages go through react-markdown +
   remark-gfm. Fenced code blocks render via existing CodeBlock (with copy
   button); inline `code` is styled inline. User messages stay plain text.
   No raw HTML (no rehype-raw).

2. Per-message Copy + Regenerate. New endpoint
   POST /api/sessions/:id/messages/:message_id/regenerate validates the
   target (404/400/409), atomically deletes the target plus any later
   messages in the session, inserts a fresh streaming assistant row, and
   enqueues a normal inference run. The DELETE bound uses a SQL subquery
   (`created_at >= (SELECT created_at FROM messages WHERE id = $1)`)
   instead of a JS round-trip so postgres TIMESTAMPTZ µs precision is
   preserved — otherwise sub-ms clock_timestamp() differences between the
   user row and the assistant row collapsed to the same JS Date, pulling
   the triggering user message into the >= bound. New `messages_deleted`
   WS frame so already-connected clients prune the stale tail without
   needing a full snapshot resend.

3. tok/s + ctx counter. Five new nullable message columns: tokens_used,
   ctx_used, ctx_max, started_at, finished_at. started_at is set right
   before the OpenAI call in services/inference.ts (not in the route, not
   in the frame handler); finished_at + tokens_used + ctx_used + ctx_max
   are committed in the same UPDATE that flips status to 'complete'. The
   inference request now opts into stream_options.include_usage so the
   final chunk carries usage; defensive parsing also picks up timings.n_ctx
   when llama.cpp emits it (currently absent for our llama-swap models, so
   ctx_max stays NULL and the UI just shows `<used> ctx`). message_complete
   frame extended with tokens_used / ctx_used / ctx_max / started_at /
   finished_at / model. Frontend StatsLine in MessageBubble computes tok/s
   client-side from the timestamps and renders muted mono text below the
   body of completed assistant messages.

4. AI chat naming after the first turn. Backend services/auto_name.ts
   runs via setImmediate after the top-level inference resolves; it
   checks that there is exactly one completed assistant message and that
   the session has not been user-renamed (`name IS NULL OR name = '' OR
   name = 'New session'`), then fires a single non-streaming chat
   completion with the spec prompt. Qwen3 chat templates emit chain-of-
   thought into reasoning_content and burn the entire max_tokens budget
   without producing visible output, so the request includes
   `chat_template_kwargs: { enable_thinking: false }` and max_tokens=30.
   Title is trimmed, quote-stripped, "Title:" prefix dropped, and
   truncated to 60 chars before a guarded UPDATE on sessions.name. New
   `session_renamed` WS frame propagates to the open session view
   directly and to the project's session list via a tiny module-scope
   event bus (apps/web/src/hooks/sessionEvents.ts) — kept dumb: one event
   type, two methods, no library.

Cleanups: dropped the now-unused splitCodeBlocks export from CodeBlock.tsx
(react-markdown supersedes it), and added a long-form NOTE in auto_name.ts
documenting the enable_thinking + max_tokens pattern for any future Qwen-
family non-streaming utility calls (planned: fork-message, agent-routing,
web-search summarization).

Schema bootstrap remains idempotent (ADD COLUMN IF NOT EXISTS). Auth,
broker, clock_timestamp() conventions, and zod validation all unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-14 22:52:40 +00:00

apps

v1.1 batch 1: markdown, message actions, tok/s+ctx, AI naming

2026-05-14 22:52:40 +00:00

.dockerignore

initial

2026-05-14 19:24:50 +00:00

.env.example

initial

2026-05-14 19:24:50 +00:00

.gitignore

initial

2026-05-14 19:24:50 +00:00

docker-compose.yml

initial

2026-05-14 19:24:50 +00:00

Dockerfile

initial

2026-05-14 19:24:50 +00:00

package.json

initial

2026-05-14 19:24:50 +00:00

pnpm-lock.yaml

v1.1 batch 1: markdown, message actions, tok/s+ctx, AI naming

2026-05-14 22:52:40 +00:00

pnpm-workspace.yaml

initial

2026-05-14 19:24:50 +00:00

README.md

initial

2026-05-14 19:24:50 +00:00

tsconfig.base.json

initial

2026-05-14 19:24:50 +00:00

README.md

boocode

Self-hosted single-user developer chat app. v1: chat only.

Stack

Node 20, Fastify, postgres (porsager/postgres), ws, zod
React 18, Vite, TypeScript, Tailwind v4, shadcn/ui
Postgres 16
pnpm workspaces

Layout

apps/server — Fastify API + WebSocket + inference loop + file-read tools
apps/web — React frontend; served by Fastify in production, Vite in dev

Local dev

Requires Node 20, pnpm, Docker (for Postgres), and ripgrep.

# install
pnpm install

# bring up postgres only
cp .env.example .env
# edit POSTGRES_PASSWORD if you like; default DATABASE_URL points at the container
docker compose up -d boocode_db

# run server (port 3000) and web (port 5173) in two shells
DATABASE_URL=postgres://boocode:devpass@127.0.0.1:5500/boocode \
LLAMA_SWAP_URL=http://100.101.41.16:8401 \
pnpm dev:server

pnpm dev:web

The Vite dev server proxies /api and /api/ws/* to the Fastify backend with a synthetic Remote-User: sam header so the Authelia auth layer can be skipped during development.

Production

cd /opt/boocode
docker compose up --build -d

Binds to 100.114.205.53:9500 (Tailscale). Authelia is expected to gate the upstream and inject Remote-User. Postgres binds loopback only.

What v1 has

Project sidebar, sessions per project, chat with streaming responses over WebSocket, four file-read tools scoped to the project root (view_file, list_dir, grep, find_files), and a model picker driven by llama-swap's /v1/models.

What v1 does not have lives in v2 (terminal pane) and v3 (Coder pane).

Languages

TypeScript 94.1%

CSS 1.9%

JavaScript 1.2%

Shell 0.8%

Go 0.6%

Other 1.4%