id: utility-calls
name: Utility Calls
kind: chat
version: 1
description: Titles, summaries, compaction -- directly tunes the FAST_MODEL choice.
judge_model: null
tasks:
  - id: auto-title
    prompt: "Generate a concise title (max 5 words) for this chat session. The conversation is about: A user asking how to fix a PostgreSQL connection pool exhaustion error in their Express.js application."
    rubric:
      criteria:
        - criterion: relevance
          description: "Title relates to PostgreSQL connection pool issue"
          weight: 2
        - criterion: conciseness
          description: "5 words or fewer"
          weight: 2
        - criterion: clarity
          description: "Title is specific, not generic"
          weight: 1
      max_score: 5
  - id: chat-summary
    prompt: "Summarize this conversation in 2-3 sentences: User asked about Docker networking. Assistant explained bridge vs host mode. User asked about port mapping. Assistant showed docker run -p syntax. User confirmed it works."
    rubric:
      criteria:
        - criterion: accuracy
          description: "Summary captures all key topics discussed"
          weight: 2
        - criterion: length
          description: "2-3 sentences as requested"
          weight: 1
        - criterion: readability
          description: "Flows naturally, not a list of facts"
          weight: 1
      max_score: 4
  - id: context-compaction
    prompt: "Compress this conversation history into a single paragraph that preserves the essential context for continuing the discussion."
    rubric:
      criteria:
        - criterion: preservation
          description: "Retains key technical concepts: retry, backoff, circuit breaker"
          weight: 2
        - criterion: brevity
          description: "Single paragraph, significantly shorter than original"
          weight: 2
        - criterion: usability
          description: "Useful context for continuing the conversation"
          weight: 1
      max_score: 5
  - id: label-generation
    prompt: "Classify this user message into one of these labels: [question, bug-report, feature-request, small-talk, code-review]. Message: 'The app crashes when I click the submit button on the settings page. I'm using Chrome 120 on macOS.'"
    rubric:
      criteria:
        - criterion: accuracy
          description: "Classifies as 'bug-report'"
          weight: 3
      max_score: 3