boocode/openspec/changes/sampling-streamjson-tokens/proposal.md
indifferentketchup a584dd16b0 feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)
Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields,
threaded into the request body via providerOptions.openaiCompatible. Fixes a
latent bug — top_k (rejected by the AI-SDK provider) and min_p (never passed to
streamText) were dead on the wire; both now route through the same channel.
--reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude
NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to
the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route
+ type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web
tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 12:47:17 +00:00

# Small wins — sampling knobs + PTY stream-json + token UI
**Status:** in progress (started 2026-06-01)
**Source:** `boocode_code_review_v2.md` §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).
Three independent BooCode improvements, disjoint subsystems (apps/server / apps/coder / apps/web).
## #11 — New sampling knobs (apps/server)
Per-agent `top_n_sigma` + the `dry_*` repetition family help the doom-loop-prone local model.
Today the Agent type threads `temperature/top_p/top_k/min_p/presence_penalty` into the inference
request (`stream-phase.ts:396-438`). Add `top_n_sigma`, `dry_multiplier`, `dry_base`,
`dry_allowed_length`, `dry_penalty_last_n` as first-class Agent fields (`types/api.ts`), parse them in
`agents.ts:parseFrontmatter` (same bounded per-field numeric pattern + out-of-range warn), and thread
them into the request body **via the same mechanism `top_k`/`min_p` already use** (the agent must
confirm whether that's an AI-SDK `providerOptions`/`extraBody` passthrough — these are llama.cpp
extensions, not standard OpenAI fields — and ride it; surface it if `top_k`/`min_p` turn out to be
silently dropped today). `--reasoning-budget` is a llama-server CLI flag already permitted by the
deny-list validator, so it works via `llama_extra_args: ["--reasoning-budget","N"]` now — document it
in `data/AGENTS.md`. apps/server only.
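
The bounded per-field parse and the passthrough might look roughly like the sketch below. Everything named here is hypothetical: `SAMPLING_BOUNDS`, `parseSamplingFields`, and `toProviderOptions` are illustration names, the numeric ranges are placeholders rather than the limits `agents.ts` actually enforces, and the `openaiCompatible` key assumes the passthrough turns out to be an AI-SDK `providerOptions` object, which this proposal still asks the agent to confirm.

```typescript
// Hypothetical sketch: field names come from the proposal, bounds are placeholders.
type SamplingField =
  | "top_n_sigma"
  | "dry_multiplier"
  | "dry_base"
  | "dry_allowed_length"
  | "dry_penalty_last_n";

interface Bound {
  min: number;
  max: number;
  integer?: boolean;
}

// Assumed ranges for illustration only; the real limits belong in agents.ts.
const SAMPLING_BOUNDS: Record<SamplingField, Bound> = {
  top_n_sigma: { min: 0, max: 10 },
  dry_multiplier: { min: 0, max: 10 },
  dry_base: { min: 1, max: 10 },
  dry_allowed_length: { min: 0, max: 1024, integer: true },
  dry_penalty_last_n: { min: -1, max: 1_000_000, integer: true },
};

// Accepts numbers or numeric strings; anything else becomes NaN and is rejected.
function toNumber(raw: unknown): number {
  if (typeof raw === "number") return raw;
  if (typeof raw === "string" && raw.trim() !== "") return Number(raw);
  return NaN;
}

// Keeps in-range values, warns about and drops everything else.
function parseSamplingFields(
  frontmatter: Record<string, unknown>,
  warn: (msg: string) => void,
): Partial<Record<SamplingField, number>> {
  const out: Partial<Record<SamplingField, number>> = {};
  for (const key of Object.keys(SAMPLING_BOUNDS) as SamplingField[]) {
    const raw = frontmatter[key];
    if (raw === undefined || raw === null) continue;
    const value = toNumber(raw);
    const { min, max, integer } = SAMPLING_BOUNDS[key];
    const ok =
      Number.isFinite(value) &&
      value >= min &&
      value <= max &&
      (!integer || Number.isInteger(value));
    if (!ok) {
      warn(`agent frontmatter: ${key}=${String(raw)} rejected (expected ${min}..${max})`);
      continue;
    }
    out[key] = value;
  }
  return out;
}

// Assumed passthrough shape: llama.cpp extensions nested under one provider key.
function toProviderOptions(fields: Partial<Record<SamplingField, number>>) {
  return { openaiCompatible: { ...fields } };
}
```

Dropping an out-of-range value (instead of clamping it) keeps a typo in the frontmatter from silently changing sampling behaviour; whether the existing fields clamp or drop is not stated here, so treat that choice as an assumption too.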
## #7 — Live PTY stream-json NDJSON parsing (apps/coder)
qwen/claude PTY dispatch currently treats stdout as an opaque slice (`dispatcher.ts` PTY path; qwen already runs
`--output-format stream-json`). Add a parser for the Claude-Code-compatible NDJSON
(`system`/`assistant`/`result`/`stream_event` → `content_block_delta` text/thinking/tool deltas +
`usage` + `session_id`) that maps to the existing `AgentEvent` union (`agent-backend.ts`). **Live
incremental** (decision 2026-06-01): line-buffer the PTY stdout `data` events, parse each complete
NDJSON line as it arrives, and emit broker frames live (text/reasoning/tool) like the ACP/opencode
paths — plus accumulate for `persistExternalAgentTurn`. claude gets `--output-format stream-json` too.
One parser serves both (same schema). apps/coder only (`pty-dispatch.ts`, `dispatcher.ts`, new
`stream-json-parser.ts` + test).
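
A minimal sketch of the line-buffering step, under stated assumptions: `Frame` is a hypothetical stand-in for the real `AgentEvent` union, `NdjsonLineBuffer` and `toFrame` are illustration names, and the delta field names (`text_delta`/`text`, `thinking_delta`/`thinking`, `input_json_delta`/`partial_json`) are assumed from the Claude-style streaming schema and should be checked against real qwen and claude output. Only `stream_event` deltas are mapped here; `usage` and `session_id` handling is left out.

```typescript
// Hypothetical frame type; the real code would map to AgentEvent instead.
type Frame =
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "tool"; partialJson: string }
  | { kind: "opaque"; raw: string };

// Maps one parsed NDJSON message to a frame, or null if it carries no delta.
function toFrame(msg: unknown): Frame | null {
  if (typeof msg !== "object" || msg === null) return null;
  const m = msg as { type?: unknown; event?: { type?: unknown; delta?: Record<string, unknown> } };
  const event = m.event;
  if (m.type !== "stream_event" || !event || event.type !== "content_block_delta") return null;
  const delta = event.delta ?? {};
  if (delta.type === "text_delta" && typeof delta.text === "string") {
    return { kind: "text", text: delta.text };
  }
  if (delta.type === "thinking_delta" && typeof delta.thinking === "string") {
    return { kind: "reasoning", text: delta.thinking };
  }
  if (delta.type === "input_json_delta" && typeof delta.partial_json === "string") {
    return { kind: "tool", partialJson: delta.partial_json };
  }
  return null;
}

// Buffers PTY chunks and emits one frame per complete NDJSON line.
class NdjsonLineBuffer {
  private pending = "";

  constructor(private readonly onFrame: (frame: Frame) => void) {}

  // Call from the PTY stdout `data` handler; chunks may split a line anywhere.
  push(chunk: string): void {
    this.pending += chunk;
    let newline = this.pending.indexOf("\n");
    while (newline !== -1) {
      // PTYs commonly emit \r\n, so strip a trailing carriage return.
      const line = this.pending.slice(0, newline).replace(/\r$/, "");
      this.pending = this.pending.slice(newline + 1);
      if (line.trim() !== "") this.handleLine(line);
      newline = this.pending.indexOf("\n");
    }
  }

  // Call on process exit so a final line without a newline is not lost.
  flush(): void {
    const rest = this.pending;
    this.pending = "";
    if (rest.trim() !== "") this.handleLine(rest);
  }

  private handleLine(line: string): void {
    let msg: unknown;
    try {
      msg = JSON.parse(line);
    } catch {
      // Not JSON: hand the raw text back so the caller can fall back to opaque output.
      this.onFrame({ kind: "opaque", raw: line });
      return;
    }
    const frame = toFrame(msg);
    if (frame) this.onFrame(frame);
  }
}
```

In use, the dispatcher would construct one buffer per turn, pass each `data` chunk to `push`, forward every frame to the broker as it arrives, and append the same frames to whatever accumulator feeds `persistExternalAgentTurn`.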
## #8 — Surface opencode token usage (apps/coder route + apps/web)
`agent_sessions.input_tokens/output_tokens/cost` are accumulated (v2.6.8) but the
`GET /api/sessions/:id/agent-sessions` SELECT + the `AgentSessionInfo` type drop them. Add the 3
columns to both, render condensed beside the existing session chip in `AgentComposerBar`
(ChatThroughput styling: `tabular-nums`, muted, e.g. "12.4K in / 3.2K out / $0.25"). MUST NOT touch
Sam's uncommitted WIP (`ChatTabBar`, `SessionLandingPage`, `Workspace`, `useWorkspacePanes`,
`PaneHeaderActions`).
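
The condensed label could be produced by a small formatter like the one sketched here. `AgentSessionUsage`, `formatCount`, and `formatUsage` are hypothetical names, and the nullable column types are an assumption about how the route would serialize sessions that have no usage recorded yet.

```typescript
// Hypothetical slice of AgentSessionInfo carrying the three new columns.
interface AgentSessionUsage {
  input_tokens: number | null;
  output_tokens: number | null;
  cost: number | null;
}

// Abbreviates a token count: 999 -> "999", 12400 -> "12.4K", 2500000 -> "2.5M".
function formatCount(n: number): string {
  if (n < 1_000) return String(Math.round(n));
  if (n < 1_000_000) return `${(n / 1_000).toFixed(1)}K`;
  return `${(n / 1_000_000).toFixed(1)}M`;
}

// Returns the chip label, or null when no usage has been recorded at all.
function formatUsage(u: AgentSessionUsage): string | null {
  if (u.input_tokens === null && u.output_tokens === null && u.cost === null) {
    return null;
  }
  const parts = [
    `${formatCount(u.input_tokens ?? 0)} in`,
    `${formatCount(u.output_tokens ?? 0)} out`,
  ];
  // Cost is omitted when the backend did not report one.
  if (u.cost !== null) parts.push(`$${u.cost.toFixed(2)}`);
  return parts.join(" / ");
}
```

Returning `null` lets `AgentComposerBar` skip rendering the label entirely for sessions without usage, instead of showing "0 in / 0 out".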
## Decisions (2026-06-01)
- #7 surfacing: **live incremental** streaming (not parse-at-end).
## Verify
- `pnpm -C apps/server test` (+ new agent-parse tests); `pnpm -C apps/coder test` (+ new parser tests)
- `pnpm -C apps/server build && pnpm -C apps/coder build`; `npx tsc -p apps/web/tsconfig.app.json --noEmit`