chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
This commit is contained in:
2026-06-14 12:48:47 +00:00
parent 0ed506f1da
commit b18de2a331
204 changed files with 25344 additions and 867 deletions

View File

@@ -0,0 +1,57 @@
id: utility-calls
name: Utility Calls
kind: chat
version: 1
description: Titles, summaries, compaction -- directly tunes the FAST_MODEL choice.
judge_model: null
tasks:
- id: auto-title
prompt: "Generate a concise title (max 5 words) for this chat session. The conversation is about: A user asking how to fix a PostgreSQL connection pool exhaustion error in their Express.js application."
rubric:
criteria:
- criterion: relevance
description: "Title relates to PostgreSQL connection pool issue"
weight: 2
- criterion: conciseness
description: "5 words or fewer"
weight: 2
- criterion: clarity
description: "Title is specific, not generic"
weight: 1
max_score: 5
- id: chat-summary
prompt: "Summarize this conversation in 2-3 sentences: User asked about Docker networking. Assistant explained bridge vs host mode. User asked about port mapping. Assistant showed docker run -p syntax. User confirmed it works."
rubric:
criteria:
- criterion: accuracy
description: "Summary captures all key topics discussed"
weight: 2
- criterion: length
description: "2-3 sentences as requested"
weight: 1
- criterion: readability
description: "Flows naturally, not a list of facts"
weight: 1
max_score: 4
- id: context-compaction
prompt: "Compress this conversation history into a single paragraph that preserves the essential context for continuing the discussion."
rubric:
criteria:
- criterion: preservation
description: "Retains key technical concepts: retry, backoff, circuit breaker"
weight: 2
- criterion: brevity
description: "Single paragraph, significantly shorter than original"
weight: 2
- criterion: usability
description: "Useful context for continuing the conversation"
weight: 1
max_score: 5
- id: label-generation
prompt: "Classify this user message into one of these labels: [question, bug-report, feature-request, small-talk, code-review]. Message: 'The app crashes when I click the submit button on the settings page. I'm using Chrome 120 on macOS.'"
rubric:
criteria:
- criterion: accuracy
description: "Classifies as 'bug-report'"
weight: 3
max_score: 3