# Small wins — sampling knobs + PTY stream-json + token UI

**Status:** shipped `v2.7.3-sampling-streamjson-tokens`
**Source:** `boocode_code_review_v2.md` §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).

Three independent BooCode improvements, disjoint subsystems (apps/server / apps/coder / apps/web).

## #11 — New sampling knobs (apps/server)

Per-agent `top_n_sigma` and the `dry_*` repetition-penalty family help the doom-loop-prone local model.
Today the Agent type threads `temperature`/`top_p`/`top_k`/`min_p`/`presence_penalty` into the inference
request (`stream-phase.ts:396–438`). Add `top_n_sigma`, `dry_multiplier`, `dry_base`,
`dry_allowed_length`, `dry_penalty_last_n` as first-class Agent fields (`types/api.ts`), parse them in
`agents.ts:parseFrontmatter` (same bounded per-field numeric pattern + out-of-range warn), and thread
them into the request body **via the same mechanism `top_k`/`min_p` already use**. The agent must
confirm whether that's an AI-SDK `providerOptions`/`extraBody` passthrough (these are llama.cpp
extensions, not standard OpenAI fields) and ride it; surface it if `top_k`/`min_p` turn out to be
silently dropped today. `--reasoning-budget` is a llama-server CLI flag already permitted by the
deny-list validator, so it works via `llama_extra_args: ["--reasoning-budget","N"]` now; document it
in `data/AGENTS.md`. apps/server only.

## #7 — Live PTY stream-json NDJSON parsing (apps/coder)

qwen/claude PTY dispatch currently slices stdout as opaque text (`dispatcher.ts` PTY path; qwen already runs
`--output-format stream-json`). Add a parser for the Claude-Code-compatible NDJSON
(`system`/`assistant`/`result`/`stream_event` → `content_block_delta` text/thinking/tool deltas +
`usage` + `session_id`) that maps to the existing `AgentEvent` union (`agent-backend.ts`). **Live
incremental** (decision 2026-06-01): line-buffer the PTY stdout `data` events, parse each complete
NDJSON line as it arrives, and emit broker frames live (text/reasoning/tool) like the ACP/opencode
paths, while also accumulating for `persistExternalAgentTurn`. claude gets `--output-format stream-json` too.
One parser serves both (same schema). apps/coder only (`pty-dispatch.ts`, `dispatcher.ts`, new
`stream-json-parser.ts` + test).

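
The line-buffer-then-parse step could look roughly like the sketch below. `SketchEvent` is a hypothetical stand-in for the real `AgentEvent` union, `NdjsonLineBuffer` and `mapStreamJsonLine` are illustrative names, and the JSON field names (`event.delta.type`, `text_delta`, `thinking_delta`, `usage.input_tokens`, and so on) are assumptions about the stream-json schema that should be checked against real qwen/claude output. Tool deltas are left out for brevity.

```typescript
// Hypothetical stand-in for the real `AgentEvent` union in agent-backend.ts.
type SketchEvent =
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "session"; sessionId: string }
  | { kind: "usage"; inputTokens: number; outputTokens: number };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Accumulates PTY `data` chunks and hands back only complete lines.
class NdjsonLineBuffer {
  private pending = "";

  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is a partial line (or "") and waits for the next chunk.
    this.pending = parts.pop() ?? "";
    return parts
      .map((line) => line.replace(/\r$/, "")) // PTYs commonly emit \r\n
      .filter((line) => line.trim() !== "");
  }
}

// Returns undefined for PTY noise that is not valid JSON.
function tryParseJson(line: string): unknown {
  try {
    return JSON.parse(line);
  } catch {
    return undefined;
  }
}

// Maps one NDJSON line to zero or more events; unknown shapes yield [].
function mapStreamJsonLine(line: string): SketchEvent[] {
  const msg = tryParseJson(line);
  if (!isRecord(msg)) return [];
  const events: SketchEvent[] = [];

  const sessionId = msg["session_id"];
  if (msg["type"] === "system" && typeof sessionId === "string") {
    events.push({ kind: "session", sessionId });
  }

  const event = msg["event"];
  if (msg["type"] === "stream_event" && isRecord(event) && event["type"] === "content_block_delta") {
    const delta = event["delta"];
    if (isRecord(delta)) {
      const text = delta["text"];
      const thinking = delta["thinking"];
      if (delta["type"] === "text_delta" && typeof text === "string") {
        events.push({ kind: "text", text });
      }
      if (delta["type"] === "thinking_delta" && typeof thinking === "string") {
        events.push({ kind: "reasoning", text: thinking });
      }
    }
  }

  const usage = msg["usage"];
  if (msg["type"] === "result" && isRecord(usage)) {
    const inputTokens = usage["input_tokens"];
    const outputTokens = usage["output_tokens"];
    if (typeof inputTokens === "number" && typeof outputTokens === "number") {
      events.push({ kind: "usage", inputTokens, outputTokens });
    }
  }
  return events;
}
```

Keeping the buffer separate from the mapper means a chunk boundary in the middle of a JSON object never reaches `JSON.parse`, and non-JSON terminal output is dropped instead of aborting the turn.
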
## #8 — Surface opencode token usage (apps/coder route + apps/web)

`agent_sessions.input_tokens/output_tokens/cost` are accumulated (v2.6.8), but the
`GET /api/sessions/:id/agent-sessions` SELECT and the `AgentSessionInfo` type both drop them. Add the 3
columns to both and render them condensed beside the existing session chip in `AgentComposerBar`
(ChatThroughput styling: `tabular-nums`, muted, e.g. "12.4K in / 3.2K out / $0.25"). MUST NOT touch
Sam's uncommitted WIP (`ChatTabBar`, `SessionLandingPage`, `Workspace`, `useWorkspacePanes`,
`PaneHeaderActions`).

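
One way the condensed label could be produced, as a sketch only. `UsageSketch`, `formatTokenCount`, and `formatUsage` are hypothetical names; the field names mirror the three columns, and treating them as nullable is an assumption about sessions that have no usage recorded yet.

```typescript
// Hypothetical shape; the real fields belong on `AgentSessionInfo`.
interface UsageSketch {
  input_tokens: number | null;
  output_tokens: number | null;
  cost: number | null;
}

// 950 -> "950", 12400 -> "12.4K", 2500000 -> "2.5M"
function formatTokenCount(n: number): string {
  const count = Math.max(0, Math.round(n));
  if (count < 1000) return String(count);
  const thousands = count / 1000;
  // Switch to M before toFixed(1) would round up to "1000.0K".
  if (thousands < 999.95) return `${thousands.toFixed(1)}K`;
  return `${(count / 1e6).toFixed(1)}M`;
}

// Returns null when nothing was recorded, so the UI can render nothing.
function formatUsage(usage: UsageSketch): string | null {
  const { input_tokens, output_tokens, cost } = usage;
  if (input_tokens === null && output_tokens === null && cost === null) {
    return null;
  }
  const parts = [
    `${formatTokenCount(input_tokens ?? 0)} in`,
    `${formatTokenCount(output_tokens ?? 0)} out`,
  ];
  if (cost !== null) parts.push(`$${cost.toFixed(2)}`);
  return parts.join(" / ");
}

// formatUsage({ input_tokens: 12400, output_tokens: 3200, cost: 0.25 })
// returns "12.4K in / 3.2K out / $0.25"
```

Returning `null` for an all-empty row keeps the chip area unchanged for sessions that predate the v2.6.8 accumulation.
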
## Decisions (2026-06-01)
- #7 surfacing: **live incremental** streaming (not parse-at-end).
## Verify
- `pnpm -C apps/server test` (+ new agent-parse tests); `pnpm -C apps/coder test` (+ new parser tests)
- `pnpm -C apps/server build && pnpm -C apps/coder build`; `npx tsc -p apps/web/tsconfig.app.json --noEmit`