
indifferentketchup a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)

Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-01 12:47:17 +00:00


Small wins — sampling knobs + PTY stream-json + token UI

Status: in progress (started 2026-06-01)
Source: boocode_code_review_v2.md §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).

Three independent BooCode improvements, disjoint subsystems (apps/server / apps/coder / apps/web).

#11 — New sampling knobs (apps/server)

Per-agent top_n_sigma and the dry_* repetition-penalty family help the local model, which is prone to doom loops. Today the Agent type threads temperature/top_p/top_k/min_p/presence_penalty into the inference request (stream-phase.ts:396–438).

Add top_n_sigma, dry_multiplier, dry_base, dry_allowed_length, and dry_penalty_last_n as first-class Agent fields (types/api.ts). Parse them in agents.ts:parseFrontmatter using the same bounded per-field numeric pattern, with a warning on out-of-range values. Thread them into the request body via the same mechanism top_k/min_p already use. These are llama.cpp extensions, not standard OpenAI fields, so the agent must confirm whether that mechanism is an AI-SDK providerOptions/extraBody passthrough and ride it, and must surface it if top_k/min_p turn out to be silently dropped today.

--reasoning-budget is a llama-server CLI flag already permitted by the deny-list validator, so it works today via llama_extra_args: ["--reasoning-budget","N"]; document it in data/AGENTS.md.

Scope: apps/server only.
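The bounded per-field parse described above could look roughly like the sketch below. The five field names come from this spec and the providerOptions.openaiCompatible channel is the one named in the commit message; everything else (parseSamplingFields, toProviderOptions, SAMPLING_BOUNDS, and every numeric range) is a hypothetical placeholder, not the actual agents.ts code or llama.cpp's real limits.

```typescript
// Sketch only: field names are from the spec; bounds and helper names are hypothetical.
type SamplingField =
  | "top_n_sigma"
  | "dry_multiplier"
  | "dry_base"
  | "dry_allowed_length"
  | "dry_penalty_last_n";

interface Bound {
  min: number;
  max: number;
  integer?: boolean;
}

// Illustrative ranges (assumptions); the real validator may use different limits.
const SAMPLING_BOUNDS: Record<SamplingField, Bound> = {
  top_n_sigma: { min: 0, max: 10 },
  dry_multiplier: { min: 0, max: 10 },
  dry_base: { min: 1, max: 10 },
  dry_allowed_length: { min: 0, max: 1024, integer: true },
  dry_penalty_last_n: { min: -1, max: 1_000_000, integer: true },
};

type SamplingFields = Partial<Record<SamplingField, number>>;

// Keep a field only when it is a finite number inside its range; warn and skip otherwise.
function parseSamplingFields(
  frontmatter: Record<string, unknown>,
  warn: (message: string) => void = console.warn,
): SamplingFields {
  const parsed: SamplingFields = {};
  for (const key of Object.keys(SAMPLING_BOUNDS) as SamplingField[]) {
    const raw = frontmatter[key];
    if (raw === undefined || raw === null) continue; // unset: leave the server default
    const value =
      typeof raw === "number"
        ? raw
        : typeof raw === "string" && raw.trim() !== ""
          ? Number(raw)
          : NaN;
    const { min, max, integer } = SAMPLING_BOUNDS[key];
    const ok =
      Number.isFinite(value) &&
      value >= min &&
      value <= max &&
      (!integer || Number.isInteger(value));
    if (!ok) {
      const kind = integer ? "integer" : "number";
      warn(`agent frontmatter: ignoring ${key}=${String(raw)} (expected ${kind} in [${min}, ${max}])`);
      continue;
    }
    parsed[key] = value;
  }
  return parsed;
}

// Wrap the accepted fields in the passthrough channel so they can ride into the request body.
function toProviderOptions(fields: SamplingFields): { openaiCompatible: SamplingFields } {
  return { openaiCompatible: { ...fields } };
}
```

Dropping an invalid value, rather than clamping it, keeps a typo in frontmatter from silently becoming an extreme sampler setting; whether the existing fields clamp or drop is something to check against the current parseFrontmatter behaviour.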

#7 — Live PTY stream-json NDJSON parsing (apps/coder)

The qwen/claude PTY dispatch currently treats stdout as an opaque slice (dispatcher.ts PTY path), even though qwen already runs with --output-format stream-json. Add a parser for the Claude-Code-compatible NDJSON (system/assistant/result/stream_event → content_block_delta text/thinking/tool deltas, plus usage and session_id) that maps onto the existing AgentEvent union (agent-backend.ts).

Live incremental (decision 2026-06-01): line-buffer the PTY stdout data events, parse each complete NDJSON line as it arrives, and emit broker frames live (text/reasoning/tool) like the ACP/opencode paths, while also accumulating the output for persistExternalAgentTurn. claude gets --output-format stream-json too. One parser serves both agents because they share the same schema.

Scope: apps/coder only (pty-dispatch.ts, dispatcher.ts, new stream-json-parser.ts + test).
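A minimal sketch of the line-buffering step, to show why partial chunks must be held until a newline arrives. NdjsonLineBuffer, Frame, and frameFromLine are hypothetical names, and Frame is a stand-in for the real AgentEvent union. The sketch also assumes the delta sits at event.delta with type text_delta or thinking_delta, as in Anthropic-style streaming events; the actual qwen/claude output should be checked before relying on that shape.

```typescript
// Hypothetical stand-in for the real AgentEvent union in agent-backend.ts.
type Frame =
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "other" }
  | { kind: "unparsed"; line: string };

// Accumulates PTY chunks and returns only complete lines; the tail stays buffered.
class NdjsonLineBuffer {
  private pending = "";

  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    this.pending = parts.pop() ?? ""; // last piece has no newline yet
    return parts
      .map((line) => line.replace(/\r$/, "")) // PTYs commonly emit \r\n
      .filter((line) => line.trim() !== "");
  }

  // Call once the process exits to recover a final line with no trailing newline.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest === "" ? [] : [rest];
  }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Maps one NDJSON line to a frame. Lines that are not JSON come back as
// "unparsed" so the caller can fall back to the old opaque handling.
function frameFromLine(line: string): Frame {
  let message: unknown;
  try {
    message = JSON.parse(line);
  } catch {
    return { kind: "unparsed", line };
  }
  if (!isRecord(message) || message.type !== "stream_event") return { kind: "other" };
  const event = message.event;
  if (!isRecord(event) || event.type !== "content_block_delta") return { kind: "other" };
  const delta = event.delta;
  if (!isRecord(delta)) return { kind: "other" };
  if (delta.type === "text_delta" && typeof delta.text === "string") {
    return { kind: "text", text: delta.text };
  }
  if (delta.type === "thinking_delta" && typeof delta.thinking === "string") {
    return { kind: "reasoning", text: delta.thinking };
  }
  return { kind: "other" };
}
```

In use, each PTY data event would go through push(), and every returned line through frameFromLine(), with the resulting frames both emitted live and appended to the accumulator that is persisted at the end of the turn. Tool deltas, usage, and session_id are left out here for brevity.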

#8 — Surface opencode token usage (apps/coder route + apps/web)

agent_sessions.input_tokens/output_tokens/cost are accumulated (v2.6.8), but both the GET /api/sessions/:id/agent-sessions SELECT and the AgentSessionInfo type drop them. Add the three columns to both, then render them in condensed form beside the existing session chip in AgentComposerBar (ChatThroughput styling: tabular-nums, muted, e.g. "12.4K in / 3.2K out / $0.25"). MUST NOT touch Sam's uncommitted WIP (ChatTabBar, SessionLandingPage, Workspace, useWorkspacePanes, PaneHeaderActions).
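The condensed label could be produced by a small pure formatter along these lines. The three column names are from the spec; SessionUsage, compactCount, and formatUsage are hypothetical, and treating the columns as nullable is an assumption about the schema.

```typescript
// Hypothetical shape: the three columns the route would add to AgentSessionInfo.
interface SessionUsage {
  input_tokens: number | null;
  output_tokens: number | null;
  cost: number | null;
}

// 999 -> "999", 12400 -> "12.4K", 2500000 -> "2.5M".
// Values just under a unit boundary may round up to e.g. "1000.0K"; acceptable for a sketch.
function compactCount(n: number): string {
  if (n < 1_000) return String(n);
  if (n < 1_000_000) return `${(n / 1_000).toFixed(1)}K`;
  return `${(n / 1_000_000).toFixed(1)}M`;
}

// Returns null when no token counts were recorded, so the UI can render nothing.
function formatUsage(usage: SessionUsage): string | null {
  if (usage.input_tokens === null && usage.output_tokens === null) return null;
  const parts = [
    `${compactCount(usage.input_tokens ?? 0)} in`,
    `${compactCount(usage.output_tokens ?? 0)} out`,
  ];
  if (usage.cost !== null) parts.push(`$${usage.cost.toFixed(2)}`);
  return parts.join(" / ");
}

console.log(formatUsage({ input_tokens: 12_400, output_tokens: 3_200, cost: 0.25 }));
// prints "12.4K in / 3.2K out / $0.25"
```

Keeping the formatter separate from the component makes it unit-testable without rendering AgentComposerBar, and returning null for sessions with no usage avoids showing a misleading "0 in / 0 out".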

Decisions (2026-06-01)

#7 surfacing: live incremental streaming (not parse-at-end).

Verify

pnpm -C apps/server test (+ new agent-parse tests); pnpm -C apps/coder test (+ new parser tests)
pnpm -C apps/server build && pnpm -C apps/coder build; npx tsc -p apps/web/tsconfig.app.json --noEmit
