# Small wins: sampling knobs + PTY stream-json + token UI

**Status:** in progress (started 2026-06-01)

**Source:** `boocode_code_review_v2.md` §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).

Three independent BooCode improvements in disjoint subsystems (apps/server / apps/coder / apps/web).

## #11: New sampling knobs (apps/server)

Per-agent `top_n_sigma` and the `dry_*` repetition family help the doom-loop-prone local model. Today the Agent type threads `temperature/top_p/top_k/min_p/presence_penalty` into the inference request (`stream-phase.ts:396–438`).

- Add `top_n_sigma`, `dry_multiplier`, `dry_base`, `dry_allowed_length`, `dry_penalty_last_n` as first-class Agent fields (`types/api.ts`).
- Parse them in `agents.ts:parseFrontmatter` (same bounded per-field numeric pattern + out-of-range warn).
- Thread them into the request body **via the same mechanism `top_k`/`min_p` already use**. These are llama.cpp extensions, not standard OpenAI fields, so the agent must confirm whether that mechanism is an AI-SDK `providerOptions`/`extraBody` passthrough and ride it; surface it if `top_k`/`min_p` turn out to be silently dropped today.

`--reasoning-budget` is a llama-server CLI flag already permitted by the deny-list validator, so it works via `llama_extra_args: ["--reasoning-budget","N"]` now; document it in `data/AGENTS.md`.

apps/server only.

## #7: Live PTY stream-json NDJSON parsing (apps/coder)

The qwen/claude PTY dispatch currently treats stdout as an opaque slice (`dispatcher.ts` PTY path; qwen already runs `--output-format stream-json`). Add a parser for the Claude-Code-compatible NDJSON (`system`/`assistant`/`result`/`stream_event` → `content_block_delta` text/thinking/tool deltas + `usage` + `session_id`) that maps to the existing `AgentEvent` union (`agent-backend.ts`).
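
A minimal sketch of the #7 parser, under stated assumptions: `SketchEvent` is a hypothetical stand-in for the real `AgentEvent` union, `StreamJsonLineParser` and `mapRecord` are illustrative names, and the record field names (`session_id`, `event.delta.type`, `text_delta`, `thinking_delta`, `input_json_delta`, `usage.input_tokens`) are assumed from the Claude-Code-compatible stream-json shape and should be checked against real qwen/claude output before use.

```typescript
// Hypothetical stand-in for the AgentEvent union in agent-backend.ts.
type SketchEvent =
  | { kind: "session"; sessionId: string }
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "tool-input"; partialJson: string }
  | { kind: "usage"; inputTokens: number; outputTokens: number };

// Map one parsed NDJSON record to zero or more events.
// All field names below are assumptions about the stream-json schema.
function mapRecord(rec: any): SketchEvent[] {
  const out: SketchEvent[] = [];
  if (rec === null || typeof rec !== "object") return out;

  if (rec.type === "system" && typeof rec.session_id === "string") {
    out.push({ kind: "session", sessionId: rec.session_id });
  } else if (rec.type === "stream_event" && rec.event?.type === "content_block_delta") {
    const delta = rec.event.delta ?? {};
    if (delta.type === "text_delta" && typeof delta.text === "string") {
      out.push({ kind: "text", text: delta.text });
    } else if (delta.type === "thinking_delta" && typeof delta.thinking === "string") {
      out.push({ kind: "reasoning", text: delta.thinking });
    } else if (delta.type === "input_json_delta" && typeof delta.partial_json === "string") {
      out.push({ kind: "tool-input", partialJson: delta.partial_json });
    }
  } else if (rec.type === "result" && rec.usage) {
    out.push({
      kind: "usage",
      inputTokens: Number(rec.usage.input_tokens ?? 0),
      outputTokens: Number(rec.usage.output_tokens ?? 0),
    });
  }
  return out;
}

// Buffers raw PTY chunks and emits events only for complete lines.
class StreamJsonLineParser {
  private buffer = "";

  push(chunk: string): SketchEvent[] {
    this.buffer += chunk;
    const events: SketchEvent[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      // trim() also drops the "\r" a PTY typically adds before "\n".
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line !== "") {
        try {
          events.push(...mapRecord(JSON.parse(line)));
        } catch {
          // Non-JSON PTY noise (banners, prompts) is skipped, not fatal.
        }
      }
      newline = this.buffer.indexOf("\n");
    }
    return events;
  }
}
```

In this sketch a caller would feed each stdout `data` chunk to `push()` and forward the returned events as they come back; a trailing partial line stays in the buffer until its newline arrives, so a JSON record split across two chunks is parsed once, whole.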
**Live incremental** (decision 2026-06-01): line-buffer the PTY stdout `data` events, parse each complete NDJSON line as it arrives, and emit broker frames live (text/reasoning/tool) like the ACP/opencode paths, while also accumulating for `persistExternalAgentTurn`. claude gets `--output-format stream-json` too. One parser serves both (same schema).

apps/coder only (`pty-dispatch.ts`, `dispatcher.ts`, new `stream-json-parser.ts` + test).

## #8: Surface opencode token usage (apps/coder route + apps/web)

`agent_sessions.input_tokens/output_tokens/cost` are accumulated (v2.6.8) but the `GET /api/sessions/:id/agent-sessions` SELECT + the `AgentSessionInfo` type drop them. Add the 3 columns to both, render condensed beside the existing session chip in `AgentComposerBar` (ChatThroughput styling: `tabular-nums`, muted, e.g. "12.4K in / 3.2K out / $0.25").

MUST NOT touch Sam's uncommitted WIP (`ChatTabBar`, `SessionLandingPage`, `Workspace`, `useWorkspacePanes`, `PaneHeaderActions`).

## Decisions (2026-06-01)

- #7 surfacing: **live incremental** streaming (not parse-at-end).

## Verify

- `pnpm -C apps/server test` (+ new agent-parse tests); `pnpm -C apps/coder test` (+ new parser tests)
- `pnpm -C apps/server build && pnpm -C apps/coder build`; `npx tsc -p apps/web/tsconfig.app.json --noEmit`
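
For the condensed usage label in #8, a possible formatter is sketched here. `formatTokens` and `formatUsage` are hypothetical helper names, not existing code; the sketch assumes all three columns arrive as plain numbers and leaves null handling and the `tabular-nums`/muted styling to the component.

```typescript
// Hypothetical helper: compact token count, e.g. 12400 -> "12.4K".
// Values just below a unit boundary may round up to "1000.0K"; fine for a sketch.
function formatTokens(count: number): string {
  if (count < 1000) return String(count);
  if (count < 1e6) return `${(count / 1000).toFixed(1)}K`;
  return `${(count / 1e6).toFixed(1)}M`;
}

// Hypothetical helper: builds the "in / out / cost" label from the three columns.
function formatUsage(inputTokens: number, outputTokens: number, cost: number): string {
  return `${formatTokens(inputTokens)} in / ${formatTokens(outputTokens)} out / $${cost.toFixed(2)}`;
}
```

With these helpers, `formatUsage(12400, 3200, 0.25)` produces the example string from the plan, "12.4K in / 3.2K out / $0.25".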