Commit Graph

9 Commits

Author SHA1 Message Date
fcc7c5a86e v2.5.0-task-model: lightweight task model services + tasks table
Task model infrastructure for cheap LLM calls (auto-naming, search
rewrite, tags, summaries) via a dedicated llama-server instance at
TASK_MODEL_URL, falling back to LLAMA_SWAP_URL with FAST_MODEL when
unset. Replaces the inline fetch in auto_name.ts with taskModelCompletion.

Adds search query rewriting: on step 0 when web tools are enabled, the
user's message is summarized into a search intent hint appended to the
system prompt, improving web_search relevance.

Schema: tasks table for provider dispatch and arena, sessions.tags column.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-27 21:44:39 +00:00
e423579e99 v2.0.5: FAST_MODEL routing + tool-use summaries + Qwen dispatch + Arena
Source-level recon of QwenLM/qwen-code (Apache-2.0) informed 4 lifts:

1. FAST_MODEL config: optional env var routes cheap LLM calls (titles,
   summaries, labeling) to a smaller model on llama-swap. auto_name.ts
   uses ctx.config.FAST_MODEL ?? session.model. Set FAST_MODEL=nemotron-
   nano-4b to avoid loading the 35B model for 20-token title generation.

2. Tool-use summaries (services/inference/tool-summaries.ts): utility
   that generates "git-commit-subject-style" labels for tool batches via
   a fast-model LLM call. System prompt + truncation logic ported from
   Qwen Code's toolUseSummary.ts. Exported via @boocode/server/inference
   for BooCoder's dispatcher to call after task completion.

3. Qwen as dispatchable agent: added to agent-probe.ts KNOWN_AGENTS.
   PTY dispatch builds: qwen -p "<task>" --output-format stream-json
   (NDJSON structured events over stdout). Env: OPENAI_BASE_URL +
   OPENAI_API_KEY points Qwen Code at llama-swap. execution_path CHECK
   constraint extended with 'qwen'.

4. Arena routes (routes/arena.ts): POST /api/arena dispatches the same
   task to N contestants (2-5, each with different agent/model), each
   getting its own task row linked by arena_id UUID. GET /api/arena/:id
   shows all contestants. POST /api/arena/:id/select/:task_id marks
   winner. Schema: arena_id column added to tasks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 14:05:59 +00:00
d2108b2f8d verification discipline rules + chat naming from assistant response
BOOCHAT.md + BOOCODER.md: 4 verification rules added to both —
verify against running container not source files, never count dist/,
run commands before claiming success, derive counts from commands.

auto_name.ts: chat titles now derived from the assistant's first
response only (user message dropped from naming input). System prompt
updated to "summarize the topic or outcome — do NOT copy the first
few words verbatim." Produces titles like "Fastify Route Setup"
instead of echoing the assistant's opening sentence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 02:52:49 +00:00
9ef00c0268 v1.12.4: complete inference.ts split into services/inference/
- sentinel-summaries.ts: runCapHitSummary, insertCapHitSentinel,
  runDoomLoopSummary, insertDoomLoopSentinel
- inference.ts → inference/turn.ts: residue is runAssistantTurn,
  runInference, createInferenceRunner orchestration only
- inference/index.ts: re-export shim preserves the public surface
  (createInferenceRunner, runInference, runAssistantTurn,
  detectDoomLoop, DOOM_LOOP_THRESHOLD, buildMessagesPayload, plus
  type-side InferenceContext/InferenceFrame/StreamResult/TurnArgs/
  FramePublisher)
- src/index.ts + auto_name.ts + the two vitest test files updated to
  import from ./services/inference/index.js explicitly (NodeNext ESM
  doesn't honor directory-index resolution)

Final tally: 11 files under services/inference/, the largest being
sentinel-summaries.ts at 523 LoC (two near-clone summary paths kept
side-by-side until a third sentinel justifies factoring out a shared
runWrapUpSummary). turn.ts is now 326 LoC, the next-largest is
stream-phase.ts at 380. Public import surface unchanged.

tool-phase.ts → turn.ts back-edge for runAssistantTurn remains
(cycle is safe; resolved at call time).

Prepares the file structure for v1.13 AI SDK migration — streamText
swap targets stream-phase.ts only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:36:35 +00:00
5ee266a4d9 feat(auto_name): propagate first chat name to parent session
When a chat is auto-named, also rename the parent session if it is
still on its default 'New session' label. UPDATE is gated by an
atomic WHERE clause so user renames and prior propagations are not
clobbered. Publishes session_renamed via broker.publishUser; useSidebar
already listens.

Closes the gap where sessions auto-created from the sidebar would
stay 'New session' forever.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 15:23:11 +00:00
48a972e139 project-ux: archive/rename/Open-in-Gitea sidebar context menu, archived projects landing, create-project bootstrap with Gitea remote
Server:
- projects.status + projects.gitea_remote (additive) with CHECK ('open','archived')
- GET /api/projects?status=archived; PATCH /api/projects/:id (rename);
  POST /api/projects/:id/archive | unarchive; POST /api/projects/create
- POST /api/projects ON CONFLICT (path) DO UPDATE SET status='open': re-add
  of archived path restores existing row (preserves id + FKs); already-open
  path returns 409. Detected-repos picker now excludes only status='open'.
- New gitea.ts (createGiteaRepo + GiteaRepoExistsError) and
  project_bootstrap.ts (sanitize name, mkdir under PROJECT_ROOT_WHITELIST,
  git init -b main + first commit with -c user.name/email per-command, optional
  Gitea repo create + remote add + push; all via execFile, no shell).
- 3 new user-stream frames: project_archived, project_unarchived, project_updated.
- sidebar.ts now selects path + gitea_remote and filters status='open'.
- Gitea env added to config.ts (GITEA_BASE_URL, GITEA_USER, GITEA_TOKEN,
  GITEA_SSH_HOST).
- docker-compose.yml /opt mount flipped to rw so create-project can mkdir.
- auto_name.ts gate relaxed from `!== 1` to `< 1` (fires on every turn while
  chat name is empty, not only the first).

Web:
- ProjectSidebar: project rows use proper Radix ContextMenu; items Rename /
  Archive / Open in Gitea. Inline rename, archive confirm dialog.
  Removed obsolete handleRemove + DropdownMenu hack.
- Home: Add-existing + Create-new buttons; collapsible Archived Projects
  section with Restore.
- New CreateProjectModal: name + live folder preview, commit msg, Private/
  Public radio, create-Gitea-remote checkbox, toast on success/warnings.
- New projectUrls.ts giteaUrlFor() — uses gitea_remote when present,
  falls back to convention URL.
- 3 new event types in sessionEvents.ts with idempotent useSidebar handlers.
- SidebarProject extended with path + gitea_remote so Open-in-Gitea can
  resolve without a separate fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 02:51:59 +00:00
051f3b96ae batch4.1-5.1: dedup audit, archive 400 fix, sidebar Delete, landing-page enrichment, auto-name tool-call fix
- Fastify global empty-JSON-body parser fixes archive/unarchive/stop 400s
- Removed redundant local sessionEvents.emit at all 5+2 sites with server-side WS publishers; added dedupe guards in useSidebar/Workspace/Project handlers
- Sidebar session right-click adds Delete (destructive) with confirm Dialog
- Session.tsx navigates away on session_deleted/session_archived for the active session
- SessionLandingPage chat rows show message_count, effective_context_tokens, last_message_preview via LATERAL joins on GET /api/sessions/:id/chats
- Workspace.tsx pane drag-to-reorder using native HTML5 events (no new deps)
- CompactCard: Copy toast, Send-to-chat with target chat name, empty-state in share popover, Re-run button
- auto_name.ts: filter count gate and assistant-fetch by content <> '' so tool-call assistant rows don't trip the once-and-only-once guard
- Adds CLAUDE.md and apps/web/src/lib/format.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 23:36:01 +00:00
c35ec65fc4 batch4: chats-in-sessions, force-send, /compact, right-rail file browser
Session 1:N Chat data model with backfill. Workspace switches to client-side
multi-tab pane management. Right-rail file browser with float-over viewer and
click-drag line selection replaces FileBrowserPane. Adds /compact streaming
summarizer (respects compact markers in context builder), force-send (cancels
in-flight, persists partial as 'cancelled', awaits cancellation completion via
deferred Promise + 5s timeout), message queue, stop generation, chat
auto-rename, session archive/unarchive with Closed Sessions section on repo
landing page. CHECK constraints on sessions.status, messages.role,
messages.status with KEEP IN SYNC comments tying to MESSAGE_ROLES /
MESSAGE_STATUSES const arrays. Deletes dead pane routes/hook and the
api.panes.* client block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 20:39:48 +00:00
2464d23bb6 v1.1 batch 1: markdown, message actions, tok/s+ctx, AI naming
Four features land together on this branch:

1. Markdown rendering — assistant messages go through react-markdown +
   remark-gfm. Fenced code blocks render via existing CodeBlock (with copy
   button); inline `code` is styled inline. User messages stay plain text.
   No raw HTML (no rehype-raw).

2. Per-message Copy + Regenerate. New endpoint
   POST /api/sessions/:id/messages/:message_id/regenerate validates the
   target (404/400/409), atomically deletes the target plus any later
   messages in the session, inserts a fresh streaming assistant row, and
   enqueues a normal inference run. The DELETE bound uses a SQL subquery
   (`created_at >= (SELECT created_at FROM messages WHERE id = $1)`)
   instead of a JS round-trip so postgres TIMESTAMPTZ µs precision is
   preserved — otherwise sub-ms clock_timestamp() differences between the
   user row and the assistant row collapsed to the same JS Date, pulling
   the triggering user message into the >= bound. New `messages_deleted`
   WS frame so already-connected clients prune the stale tail without
   needing a full snapshot resend.

3. tok/s + ctx counter. Five new nullable message columns: tokens_used,
   ctx_used, ctx_max, started_at, finished_at. started_at is set right
   before the OpenAI call in services/inference.ts (not in the route, not
   in the frame handler); finished_at + tokens_used + ctx_used + ctx_max
   are committed in the same UPDATE that flips status to 'complete'. The
   inference request now opts into stream_options.include_usage so the
   final chunk carries usage; defensive parsing also picks up timings.n_ctx
   when llama.cpp emits it (currently absent for our llama-swap models, so
   ctx_max stays NULL and the UI just shows `<used> ctx`). message_complete
   frame extended with tokens_used / ctx_used / ctx_max / started_at /
   finished_at / model. Frontend StatsLine in MessageBubble computes tok/s
   client-side from the timestamps and renders muted mono text below the
   body of completed assistant messages.

4. AI chat naming after the first turn. Backend services/auto_name.ts
   runs via setImmediate after the top-level inference resolves; it
   checks that there is exactly one completed assistant message and that
   the session has not been user-renamed (`name IS NULL OR name = '' OR
   name = 'New session'`), then fires a single non-streaming chat
   completion with the spec prompt. Qwen3 chat templates emit chain-of-
   thought into reasoning_content and burn the entire max_tokens budget
   without producing visible output, so the request includes
   `chat_template_kwargs: { enable_thinking: false }` and max_tokens=30.
   Title is trimmed, quote-stripped, "Title:" prefix dropped, and
   truncated to 60 chars before a guarded UPDATE on sessions.name. New
   `session_renamed` WS frame propagates to the open session view
   directly and to the project's session list via a tiny module-scope
   event bus (apps/web/src/hooks/sessionEvents.ts) — kept dumb: one event
   type, two methods, no library.

Cleanups: dropped the now-unused splitCodeBlocks export from CodeBlock.tsx
(react-markdown supersedes it), and added a long-form NOTE in auto_name.ts
documenting the enable_thinking + max_tokens pattern for any future Qwen-
family non-streaming utility calls (planned: fork-message, agent-routing,
web-search summarization).

Schema bootstrap remains idempotent (ADD COLUMN IF NOT EXISTS). Auth,
broker, clock_timestamp() conventions, and zod validation all unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 22:52:40 +00:00