feat: sampling knobs + live PTY stream-json + token UI (v2.7.3)

Three small wins from boocode_code_review_v2 §1 #11/#7/#8.

#11 sampling knobs: top_n_sigma + dry_* family as first-class Agent fields, threaded into the request body via providerOptions.openaiCompatible. Fixes a latent bug: top_k (rejected by the AI-SDK provider) and min_p (never passed to streamText) were dead on the wire; both now route through the same channel. --reasoning-budget documented in data/AGENTS.md.

#7 live PTY stream-json: new stream-json-parser.ts line-buffers qwen/claude NDJSON and emits text/reasoning/tool frames live + persists, with a fallback to the old opaque slice. claude gets --output-format stream-json --verbose.

#8 token UI: agent_sessions input/output_tokens/cost now flow through the route + type and render beside the AgentComposerBar session chip.

Built by 3 parallel agents. Server 523 + coder 245 tests passing; builds + web tsc clean. Builds on v2.7.2. openspec sampling-streamjson-tokens.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
45
openspec/changes/sampling-streamjson-tokens/proposal.md
Normal file
@@ -0,0 +1,45 @@
# Small wins — sampling knobs + PTY stream-json + token UI

**Status:** in progress (started 2026-06-01)
**Source:** `boocode_code_review_v2.md` §1 #11 / #7 / #8 (config-adopt + qwen-code §5g + opencode §3 #4).

Three independent BooCode improvements in disjoint subsystems (apps/server, apps/coder, apps/web).

## #11 — New sampling knobs (apps/server)
Per-agent `top_n_sigma` and the `dry_*` repetition-penalty family help the doom-loop-prone local model.
Today the Agent type threads `temperature/top_p/top_k/min_p/presence_penalty` into the inference
request (`stream-phase.ts:396–438`). Add `top_n_sigma`, `dry_multiplier`, `dry_base`,
`dry_allowed_length`, and `dry_penalty_last_n` as first-class Agent fields (`types/api.ts`), parse them in
`agents.ts:parseFrontmatter` (same bounded per-field numeric pattern, with a warning on out-of-range
values), and thread them into the request body **via the same mechanism `top_k`/`min_p` already use**.
These are llama.cpp extensions, not standard OpenAI fields, so the agent must confirm whether that
mechanism is an AI-SDK `providerOptions`/`extraBody` passthrough and ride it, and must surface it if
`top_k`/`min_p` turn out to be silently dropped today. `--reasoning-budget` is a llama-server CLI flag
already permitted by the deny-list validator, so it already works via
`llama_extra_args: ["--reasoning-budget","N"]`; document it in `data/AGENTS.md`. Scope: apps/server only.
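
A minimal sketch of the "bounded per-field numeric pattern + out-of-range warn" and the passthrough object described above. Everything here is illustrative: `SamplingKnobs`, `BOUNDS`, `parseSamplingKnobs`, and `toProviderOptions` are hypothetical names, the numeric ranges are placeholders rather than the ones `agents.ts` enforces, and the `openaiCompatible` key is taken from the commit message, not verified against the provider.

```typescript
// Hypothetical shape; the real fields live on the Agent type in types/api.ts.
interface SamplingKnobs {
  top_n_sigma?: number;
  dry_multiplier?: number;
  dry_base?: number;
  dry_allowed_length?: number;
  dry_penalty_last_n?: number;
}

type KnobName = keyof SamplingKnobs;

// Placeholder bounds for illustration only (assumption, not the project's values).
const BOUNDS: Record<KnobName, { min: number; max: number; integer?: boolean }> = {
  top_n_sigma: { min: 0, max: 10 },
  dry_multiplier: { min: 0, max: 5 },
  dry_base: { min: 1, max: 4 },
  dry_allowed_length: { min: 0, max: 64, integer: true },
  dry_penalty_last_n: { min: -1, max: 32768, integer: true },
};

// Accept numbers and non-empty numeric strings; everything else becomes NaN.
function toNumber(raw: unknown): number {
  if (typeof raw === "number") return raw;
  if (typeof raw === "string" && raw.trim() !== "") return Number(raw);
  return NaN;
}

// Keep in-range values, warn about and drop the rest.
function parseSamplingKnobs(
  frontmatter: Record<string, unknown>,
  warn: (msg: string) => void = console.warn,
): SamplingKnobs {
  const out: SamplingKnobs = {};
  for (const name of Object.keys(BOUNDS) as KnobName[]) {
    const raw = frontmatter[name];
    if (raw === undefined || raw === null) continue;
    const value = toNumber(raw);
    const { min, max, integer } = BOUNDS[name];
    const valid =
      Number.isFinite(value) &&
      value >= min &&
      value <= max &&
      (!integer || Number.isInteger(value));
    if (!valid) {
      warn(`agent frontmatter: ignoring ${name}=${String(raw)} (expected ${integer ? "an integer" : "a number"} in [${min}, ${max}])`);
      continue;
    }
    out[name] = value;
  }
  return out;
}

// Build the passthrough object; only knobs that were actually set are included.
function toProviderOptions(knobs: SamplingKnobs): { openaiCompatible: Record<string, number> } {
  const body: Record<string, number> = {};
  for (const [key, value] of Object.entries(knobs)) {
    if (typeof value === "number") body[key] = value;
  }
  return { openaiCompatible: body };
}
```

Dropping an invalid value instead of clamping it keeps the server default in force, which seems the safer reading of "out-of-range warn"; the real parser may choose differently.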
## #7 — Live PTY stream-json NDJSON parsing (apps/coder)
The qwen/claude PTY dispatch currently treats stdout as an opaque slice (`dispatcher.ts` PTY path; qwen
already runs `--output-format stream-json`). Add a parser for the Claude-Code-compatible NDJSON
(`system`/`assistant`/`result`/`stream_event` → `content_block_delta` text/thinking/tool deltas +
`usage` + `session_id`) that maps to the existing `AgentEvent` union (`agent-backend.ts`). **Live
incremental** (decision 2026-06-01): line-buffer the PTY stdout `data` events, parse each complete
NDJSON line as it arrives, and emit broker frames live (text/reasoning/tool) like the ACP/opencode
paths, while also accumulating the output for `persistExternalAgentTurn`. claude gets
`--output-format stream-json` too. One parser serves both (same schema). Scope: apps/coder only
(`pty-dispatch.ts`, `dispatcher.ts`, new `stream-json-parser.ts` + test).
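
The line-buffering step could look roughly like the sketch below. It is not the contents of `stream-json-parser.ts`: `Frame` is a hypothetical stand-in for the `AgentEvent` union, the class and method names are invented, and the `stream_event` envelope with `text_delta`/`thinking_delta`/`input_json_delta` payloads is an assumption about the wire format. `usage` accumulation and `assistant`/`result` handling are omitted for brevity.

```typescript
// Hypothetical frame type standing in for the AgentEvent union in agent-backend.ts.
type Frame =
  | { kind: "text"; text: string }
  | { kind: "reasoning"; text: string }
  | { kind: "tool"; partialJson: string }
  | { kind: "session"; sessionId: string }
  | { kind: "raw"; line: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

class StreamJsonLineParser {
  private buffer = "";
  private readonly emit: (frame: Frame) => void;

  constructor(emit: (frame: Frame) => void) {
    this.emit = emit;
  }

  // Feed one PTY `data` chunk; chunks may end in the middle of a line.
  push(chunk: string): void {
    this.buffer += chunk;
    let newline = this.buffer.indexOf("\n");
    while (newline >= 0) {
      // trim() also removes the "\r" a PTY adds before each "\n".
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line) this.handleLine(line);
      newline = this.buffer.indexOf("\n");
    }
  }

  // Call once when the process exits to handle a final unterminated line.
  flush(): void {
    const rest = this.buffer.trim();
    this.buffer = "";
    if (rest) this.handleLine(rest);
  }

  private handleLine(line: string): void {
    let msg: unknown;
    try {
      msg = JSON.parse(line);
    } catch {
      // Not JSON (banner, prompt echo): hand it to the opaque fallback path.
      this.emit({ kind: "raw", line });
      return;
    }
    if (!isRecord(msg)) {
      this.emit({ kind: "raw", line });
      return;
    }
    if (msg.type === "system" && typeof msg.session_id === "string") {
      this.emit({ kind: "session", sessionId: msg.session_id });
      return;
    }
    if (msg.type !== "stream_event" || !isRecord(msg.event)) return;
    const event = msg.event;
    if (event.type !== "content_block_delta" || !isRecord(event.delta)) return;
    const delta = event.delta;
    if (delta.type === "text_delta" && typeof delta.text === "string") {
      this.emit({ kind: "text", text: delta.text });
    } else if (delta.type === "thinking_delta" && typeof delta.thinking === "string") {
      this.emit({ kind: "reasoning", text: delta.thinking });
    } else if (delta.type === "input_json_delta" && typeof delta.partial_json === "string") {
      this.emit({ kind: "tool", partialJson: delta.partial_json });
    }
  }
}
```

Emitting unparseable lines as `raw` frames rather than throwing is one way to keep a fallback to the old opaque slice when the CLI prints something that is not NDJSON.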
## #8 — Surface opencode token usage (apps/coder route + apps/web)
`agent_sessions.input_tokens/output_tokens/cost` are accumulated (v2.6.8), but the
`GET /api/sessions/:id/agent-sessions` SELECT and the `AgentSessionInfo` type both drop them. Add the 3
columns to both and render them condensed beside the existing session chip in `AgentComposerBar`
(ChatThroughput styling: `tabular-nums`, muted, e.g. "12.4K in / 3.2K out / $0.25"). MUST NOT touch
Sam's uncommitted WIP (`ChatTabBar`, `SessionLandingPage`, `Workspace`, `useWorkspacePanes`,
`PaneHeaderActions`).
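
One way to produce the condensed label in the example above is sketched here. `AgentSessionUsage`, `formatCount`, and `formatUsage` are hypothetical names, and treating the three columns as nullable is an assumption about the schema.

```typescript
// Hypothetical subset of AgentSessionInfo; nullability is assumed, not confirmed.
interface AgentSessionUsage {
  input_tokens: number | null;
  output_tokens: number | null;
  cost: number | null;
}

// 999 -> "999", 12400 -> "12.4K", 2500000 -> "2.5M"
function formatCount(n: number): string {
  if (n < 1000) return String(n);
  if (n < 1e6) return `${(n / 1000).toFixed(1)}K`;
  return `${(n / 1e6).toFixed(1)}M`;
}

// Returns null when nothing has been recorded, so the caller can skip rendering.
function formatUsage(usage: AgentSessionUsage): string | null {
  const parts: string[] = [];
  if (usage.input_tokens !== null) parts.push(`${formatCount(usage.input_tokens)} in`);
  if (usage.output_tokens !== null) parts.push(`${formatCount(usage.output_tokens)} out`);
  if (usage.cost !== null) parts.push(`$${usage.cost.toFixed(2)}`);
  return parts.length > 0 ? parts.join(" / ") : null;
}
```

Returning `null` for a session with no recorded usage lets the component omit the label entirely instead of showing zeros.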
## Decisions (2026-06-01)
- #7 surfacing: **live incremental** streaming (not parse-at-end).
## Verify
- `pnpm -C apps/server test` (+ new agent-parse tests); `pnpm -C apps/coder test` (+ new parser tests)
- `pnpm -C apps/server build && pnpm -C apps/coder build`; `npx tsc -p apps/web/tsconfig.app.json --noEmit`