Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields, threaded into the request body via providerOptions.openaiCompatible. Fixes a latent bug: top_k (rejected by the AI-SDK provider) and min_p (never passed to streamText) were dead on the wire; both now route through the same channel. --reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route + type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Small wins — sampling knobs + PTY stream-json + token UI
Status: in progress (started 2026-06-01)
Source: boocode_code_review_v2.md §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).
Three independent BooCode improvements, disjoint subsystems (apps/server / apps/coder / apps/web).
#11 — New sampling knobs (apps/server)
Per-agent top_n_sigma + the dry_* repetition family help the doom-loop-prone local model.
Today the Agent type threads temperature/top_p/top_k/min_p/presence_penalty into the inference
request (stream-phase.ts:396–438). Add top_n_sigma, dry_multiplier, dry_base,
dry_allowed_length, dry_penalty_last_n as first-class Agent fields (types/api.ts), parse them in
agents.ts:parseFrontmatter (same bounded per-field numeric pattern + out-of-range warn), and thread
them into the request body via the same mechanism top_k/min_p already use. These are llama.cpp
extensions, not standard OpenAI fields, so the implementing agent must first confirm whether that
mechanism is an AI-SDK providerOptions/extraBody passthrough and reuse it, and must report it if
top_k/min_p turn out to be silently dropped today. --reasoning-budget is a llama-server CLI flag
already permitted by the deny-list validator, so it already works via
llama_extra_args: ["--reasoning-budget","N"]; document it in data/AGENTS.md. Scope: apps/server only.
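The bounded per-field parse and the passthrough described above can be sketched roughly as follows. This is a minimal illustration, not the code in agents.ts or stream-phase.ts: the numeric bounds, the names parseSamplingKnobs and toProviderOptions, and the flat frontmatter record are all assumptions made for the example, and the openaiCompatible key is taken from the commit message rather than verified against the provider.

```typescript
// Hypothetical sketch: bounds and helper names are illustrative assumptions,
// not the values or functions used in agents.ts.
type SamplingField =
  | "top_n_sigma"
  | "dry_multiplier"
  | "dry_base"
  | "dry_allowed_length"
  | "dry_penalty_last_n";

interface Bound {
  min: number;
  max: number;
  integer?: boolean;
}

// Assumed ranges, chosen only so the example has something to enforce.
const BOUNDS: Record<SamplingField, Bound> = {
  top_n_sigma: { min: 0, max: 10 },
  dry_multiplier: { min: 0, max: 10 },
  dry_base: { min: 1, max: 10 },
  dry_allowed_length: { min: 0, max: 1024, integer: true },
  dry_penalty_last_n: { min: -1, max: 1_000_000, integer: true },
};

type SamplingKnobs = Partial<Record<SamplingField, number>>;

// Parse each known field; out-of-range or non-numeric values are warned about and skipped.
function parseSamplingKnobs(
  frontmatter: Record<string, unknown>,
  warn: (msg: string) => void = console.warn,
): SamplingKnobs {
  const out: SamplingKnobs = {};
  for (const field of Object.keys(BOUNDS) as SamplingField[]) {
    const raw = frontmatter[field];
    if (raw === undefined || raw === null) continue;
    const value =
      typeof raw === "number"
        ? raw
        : typeof raw === "string" && raw.trim() !== ""
          ? Number(raw)
          : NaN;
    const { min, max, integer } = BOUNDS[field];
    const valid =
      Number.isFinite(value) &&
      value >= min &&
      value <= max &&
      (!integer || Number.isInteger(value));
    if (!valid) {
      const kind = integer ? "an integer" : "a number";
      warn(`agent frontmatter: ${field}=${String(raw)} ignored (expected ${kind} in [${min}, ${max}])`);
      continue;
    }
    out[field] = value;
  }
  return out;
}

// Build the passthrough object; only fields that were actually set are sent.
function toProviderOptions(knobs: SamplingKnobs): { openaiCompatible: Record<string, number> } {
  const body: Record<string, number> = {};
  for (const [key, value] of Object.entries(knobs)) {
    if (value !== undefined) body[key] = value;
  }
  return { openaiCompatible: body };
}
```

Keeping unset fields out of the body means an agent that configures none of these knobs sends the same request as before.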
#7 — Live PTY stream-json NDJSON parsing (apps/coder)
qwen/claude PTY dispatch currently treats stdout as an opaque slice (dispatcher.ts PTY path), even
though qwen already runs with --output-format stream-json. Add a parser for the
Claude-Code-compatible NDJSON (system/assistant/result/stream_event → content_block_delta
text/thinking/tool deltas + usage + session_id) that maps to the existing AgentEvent union
(agent-backend.ts). Parsing is live and incremental (decision 2026-06-01): line-buffer the PTY
stdout data events, parse each complete NDJSON line as it arrives, and emit broker frames live
(text/reasoning/tool) like the ACP/opencode paths, while also accumulating output for
persistExternalAgentTurn. claude gets --output-format stream-json too. One parser serves both
(same schema). Scope: apps/coder only (pty-dispatch.ts, dispatcher.ts, new stream-json-parser.ts +
test).
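The line-buffering step above can be sketched as below. This is an illustration under stated assumptions, not the contents of stream-json-parser.ts: SketchEvent is a stand-in for the real AgentEvent union, the class and function names are hypothetical, and the delta field names (text_delta, thinking_delta) are assumed from the Anthropic streaming event shape rather than confirmed against qwen's output. Tool deltas and usage are left out to keep the sketch short.

```typescript
// Hypothetical stand-in for the AgentEvent union in agent-backend.ts.
type SketchEvent =
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "session"; sessionId: string }
  | { kind: "opaque"; raw: string };

function asRecord(value: unknown): Record<string, unknown> | undefined {
  return value !== null && typeof value === "object"
    ? (value as Record<string, unknown>)
    : undefined;
}

// Map one complete NDJSON line to zero or more events.
// Lines that are not JSON objects fall back to an opaque event.
function mapLine(line: string): SketchEvent[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return [{ kind: "opaque", raw: line }];
  }
  const msg = asRecord(parsed);
  if (!msg) return [{ kind: "opaque", raw: line }];

  const events: SketchEvent[] = [];
  const sessionId = msg["session_id"];
  if (msg["type"] === "system" && typeof sessionId === "string") {
    events.push({ kind: "session", sessionId });
  }
  const event = asRecord(msg["event"]);
  if (msg["type"] === "stream_event" && event && event["type"] === "content_block_delta") {
    const delta = asRecord(event["delta"]);
    const text = delta ? delta["text"] : undefined;
    const thinking = delta ? delta["thinking"] : undefined;
    if (delta && delta["type"] === "text_delta" && typeof text === "string") {
      events.push({ kind: "text", text });
    } else if (delta && delta["type"] === "thinking_delta" && typeof thinking === "string") {
      events.push({ kind: "reasoning", text: thinking });
    }
  }
  return events;
}

class StreamJsonLineBuffer {
  private pending = "";

  // Feed one PTY data chunk; returns events for every line the chunk completes.
  push(chunk: string): SketchEvent[] {
    this.pending += chunk;
    const events: SketchEvent[] = [];
    let newline = this.pending.indexOf("\n");
    while (newline !== -1) {
      // PTYs commonly emit \r\n, so strip a trailing carriage return.
      const line = this.pending.slice(0, newline).replace(/\r$/, "");
      this.pending = this.pending.slice(newline + 1);
      if (line.trim() !== "") events.push(...mapLine(line));
      newline = this.pending.indexOf("\n");
    }
    return events;
  }

  // Call once at process exit to handle a final line without a trailing newline.
  flush(): SketchEvent[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest === "" ? [] : mapLine(rest);
  }
}
```

A chunk that ends mid-line produces no events until the rest of the line arrives, so partial JSON is never passed to the parser.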
#8 — Surface opencode token usage (apps/coder route + apps/web)
agent_sessions.input_tokens/output_tokens/cost are accumulated (v2.6.8) but the
GET /api/sessions/:id/agent-sessions SELECT + the AgentSessionInfo type drop them. Add the 3
columns to both, render condensed beside the existing session chip in AgentComposerBar
(ChatThroughput styling: tabular-nums, muted, e.g. "12.4K in / 3.2K out / $0.25"). MUST NOT touch
Sam's uncommitted WIP (ChatTabBar, SessionLandingPage, Workspace, useWorkspacePanes,
PaneHeaderActions).
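The condensed label could be produced by a small formatter along these lines. The function names are hypothetical, the sketch assumes the three columns arrive as plain numbers with cost in dollars, and it does not reflect how AgentComposerBar or ChatThroughput actually format values.

```typescript
// Hypothetical helper: abbreviates a token count with one decimal place.
function formatCount(n: number): string {
  if (n < 1000) return String(n);
  if (n < 1_000_000) return `${(n / 1000).toFixed(1)}K`;
  return `${(n / 1_000_000).toFixed(1)}M`;
}

// Hypothetical helper: builds the condensed usage label for a session chip.
function formatUsage(inputTokens: number, outputTokens: number, cost: number): string {
  return `${formatCount(inputTokens)} in / ${formatCount(outputTokens)} out / $${cost.toFixed(2)}`;
}

console.log(formatUsage(12400, 3200, 0.25)); // prints "12.4K in / 3.2K out / $0.25"
```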
Decisions (2026-06-01)
- #7 surfacing: live incremental streaming (not parse-at-end).
Verify
- pnpm -C apps/server test (+ new agent-parse tests); pnpm -C apps/coder test (+ new parser tests)
- pnpm -C apps/server build && pnpm -C apps/coder build; npx tsc -p apps/web/tsconfig.app.json --noEmit