Files
boocode/data/skills/anthropics/developing-agents/scripts/validate-agent.sh
indifferentketchup 0fa46cd06c v1.13.12: skills audit + token-tracking fix + codecontext + cap50 + UI cleanups
Multi-topic batch. The big-ticket item is the skills audit; the rest are
smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/
(the boocode-repo-local skill library — see docker-compose change below).
Audited via 5 parallel Claude Code agent-teams running the
mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge
Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the
~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched),
11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-
natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule
(verification-before-completion). Each surviving skill had its description
refined to fix specific trigger gaps surfaced by the protocol — 4
real-bug findings landed (dead refs, stale tags, broken sub-file
references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/
audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs
recipes" sections — future workflow rules go to those files (100%
present), recipes stay in data/skills/ (~6% invoke rate in multi-turn
per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns
timestamp columns as JS Date objects. Every message_complete /
session_updated / chat_updated frame was failing the v1.13.11 Zod gate
and being silently dropped. Symptoms: token tracking blank in the UI
(no usage frames landed); the 60s no-token-activity timer tripped the
stale-stream banner because the frontend's local message state never
saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v,
z.string().min(1)) applied to the IsoTimestamp primitive. Centralized,
no publisher changes, works identically server + web (the parity test
still passes).

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the
codecontext/.codecontextignore.template into any project's root on the
first call to that project if no .codecontextignore exists. One file
written per project, idempotent (in-memory Set guard + access-check),
silent fallback on read-only project. Stops the upstream empty-source-
file parser crash on foreign projects' node_modules — previously
required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT
bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write
tools landed yet). Real recon sessions were hitting 30 with ~3 turns
wasted on codecontext parse failures; legitimate need was ~27, and
Architect-class system overviews want deeper recon. Headroom of 20
absorbs failure-retry turns without changing the safety floor — the
doom-loop guard (3 identical calls → abort) catches the actual
failure mode this cap was guarding against.

v1.14 (Phase C outer agent loop) will supersede this via per-agent
agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now
  has three buttons: edit (pop back into ChatInput via sendToChat
  event), force-send (was the dropdown's only useful action), and
  cancel. Default behavior (send when streaming completes) needs no
  UI — it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx).
  Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation
  patterns so the vendored skill library + agent registry become
  git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override
  mount. Skills now live in the boocode repo at data/skills/,
  auditable per-batch. The host-level /opt/skills/ is preserved
  untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext
  was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper +
  message_parts table + tool_cost_stats view + DB-integration test
  pattern + host-side smoke endpoint quirk. (Pre-existing in working
  tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:58:30 +00:00

218 lines
5.7 KiB
Bash
Executable File

#!/bin/bash
# Agent File Validator
# Validates agent markdown files for correct structure and content
set -euo pipefail
# Usage
if [ $# -eq 0 ]; then
echo "Usage: $0 <path/to/agent.md>"
echo ""
echo "Validates agent file for:"
echo " - YAML frontmatter structure"
echo " - Required fields (name, description, model, color)"
echo " - Field formats and constraints"
echo " - System prompt presence and length"
echo " - Example blocks in description"
exit 1
fi
AGENT_FILE="$1"
echo "🔍 Validating agent file: $AGENT_FILE"
echo ""
# Check 1: File exists
if [ ! -f "$AGENT_FILE" ]; then
echo "❌ File not found: $AGENT_FILE"
exit 1
fi
echo "✅ File exists"
# Check 2: Starts with ---
FIRST_LINE=$(head -1 "$AGENT_FILE")
if [ "$FIRST_LINE" != "---" ]; then
echo "❌ File must start with YAML frontmatter (---)"
exit 1
fi
echo "✅ Starts with frontmatter"
# Check 3: Has closing ---
if ! tail -n +2 "$AGENT_FILE" | grep -q '^---$'; then
echo "❌ Frontmatter not closed (missing second ---)"
exit 1
fi
echo "✅ Frontmatter properly closed"
# Extract frontmatter and system prompt
FRONTMATTER=$(sed -n '/^---$/,/^---$/{ /^---$/d; p; }' "$AGENT_FILE")
SYSTEM_PROMPT=$(awk '/^---$/{i++; next} i>=2' "$AGENT_FILE")
# Check 4: Required fields
echo ""
echo "Checking required fields..."
error_count=0
warning_count=0
# Check name field
NAME=$(echo "$FRONTMATTER" | grep '^name:' | sed 's/name: *//' | sed 's/^"\(.*\)"$/\1/')
if [ -z "$NAME" ]; then
echo "❌ Missing required field: name"
((error_count++))
else
echo "✅ name: $NAME"
# Validate name format
if ! [[ "$NAME" =~ ^[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9]$ ]]; then
echo "❌ name must start/end with alphanumeric and contain only letters, numbers, hyphens"
((error_count++))
fi
# Validate name length
name_length=${#NAME}
if [ $name_length -lt 3 ]; then
echo "❌ name too short (minimum 3 characters)"
((error_count++))
elif [ $name_length -gt 50 ]; then
echo "❌ name too long (maximum 50 characters)"
((error_count++))
fi
# Check for generic names
if [[ "$NAME" =~ ^(helper|assistant|agent|tool)$ ]]; then
echo "⚠️ name is too generic: $NAME"
((warning_count++))
fi
fi
# Check description field
DESCRIPTION=$(echo "$FRONTMATTER" | grep '^description:' | sed 's/description: *//')
if [ -z "$DESCRIPTION" ]; then
echo "❌ Missing required field: description"
((error_count++))
else
desc_length=${#DESCRIPTION}
echo "✅ description: ${desc_length} characters"
if [ $desc_length -lt 10 ]; then
echo "⚠️ description too short (minimum 10 characters recommended)"
((warning_count++))
elif [ $desc_length -gt 5000 ]; then
echo "⚠️ description very long (over 5000 characters)"
((warning_count++))
fi
# Check for example blocks
if ! echo "$DESCRIPTION" | grep -q '<example>'; then
echo "⚠️ description should include <example> blocks for triggering"
((warning_count++))
fi
# Check for "Use this agent when" pattern
if ! echo "$DESCRIPTION" | grep -qi 'use this agent when'; then
echo "⚠️ description should start with 'Use this agent when...'"
((warning_count++))
fi
fi
# Check model field
MODEL=$(echo "$FRONTMATTER" | grep '^model:' | sed 's/model: *//')
if [ -z "$MODEL" ]; then
echo "❌ Missing required field: model"
((error_count++))
else
echo "✅ model: $MODEL"
case "$MODEL" in
inherit|sonnet|opus|haiku)
# Valid model
;;
*)
echo "⚠️ Unknown model: $MODEL (valid: inherit, sonnet, opus, haiku)"
((warning_count++))
;;
esac
fi
# Check color field
COLOR=$(echo "$FRONTMATTER" | grep '^color:' | sed 's/color: *//')
if [ -z "$COLOR" ]; then
echo "❌ Missing required field: color"
((error_count++))
else
echo "✅ color: $COLOR"
case "$COLOR" in
blue|cyan|green|yellow|magenta|red)
# Valid color
;;
*)
echo "⚠️ Unknown color: $COLOR (valid: blue, cyan, green, yellow, magenta, red)"
((warning_count++))
;;
esac
fi
# Check tools field (optional)
TOOLS=$(echo "$FRONTMATTER" | grep '^tools:' | sed 's/tools: *//')
if [ -n "$TOOLS" ]; then
echo "✅ tools: $TOOLS"
else
echo "💡 tools: not specified (agent has access to all tools)"
fi
# Check 5: System prompt
echo ""
echo "Checking system prompt..."
if [ -z "$SYSTEM_PROMPT" ]; then
echo "❌ System prompt is empty"
((error_count++))
else
prompt_length=${#SYSTEM_PROMPT}
echo "✅ System prompt: $prompt_length characters"
if [ $prompt_length -lt 20 ]; then
echo "❌ System prompt too short (minimum 20 characters)"
((error_count++))
elif [ $prompt_length -gt 10000 ]; then
echo "⚠️ System prompt very long (over 10,000 characters)"
((warning_count++))
fi
# Check for second person
if ! echo "$SYSTEM_PROMPT" | grep -q "You are\|You will\|Your"; then
echo "⚠️ System prompt should use second person (You are..., You will...)"
((warning_count++))
fi
# Check for structure
if ! echo "$SYSTEM_PROMPT" | grep -qi "responsibilities\|process\|steps"; then
echo "💡 Consider adding clear responsibilities or process steps"
fi
if ! echo "$SYSTEM_PROMPT" | grep -qi "output"; then
echo "💡 Consider defining output format expectations"
fi
fi
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
if [ $error_count -eq 0 ] && [ $warning_count -eq 0 ]; then
echo "✅ All checks passed!"
exit 0
elif [ $error_count -eq 0 ]; then
echo "⚠️ Validation passed with $warning_count warning(s)"
exit 0
else
echo "❌ Validation failed with $error_count error(s) and $warning_count warning(s)"
exit 1
fi