indifferentketchup 0fa46cd06c v1.13.12: skills audit + token-tracking fix + codecontext + cap50 + UI cleanups

Multi-topic batch. The big-ticket item is the skills audit; the rest are
smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/
(the boocode-repo-local skill library — see docker-compose change below).
Audited via 5 parallel Claude Code agent-teams running the
mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge
Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the
~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched),
11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-
natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule
(verification-before-completion). Each surviving skill had its description
refined to fix specific trigger gaps surfaced by the protocol — 4
real-bug findings landed (dead refs, stale tags, broken sub-file
references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/
audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs
recipes" sections — future workflow rules go to those files (100%
present), recipes stay in data/skills/ (~6% invoke rate in multi-turn
per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns
timestamp columns as JS Date objects. Every message_complete /
session_updated / chat_updated frame was failing the v1.13.11 Zod gate
and being silently dropped. Symptoms: token tracking blank in the UI
(no usage frames landed); the 60s no-token-activity timer tripped the
stale-stream banner because the frontend's local message state never
saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v,
z.string().min(1)) applied to the IsoTimestamp primitive. Centralized,
no publisher changes, works identically server + web (the parity test
still passes).
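
The coercion step in that fix can be sketched without Zod. This is a minimal illustration under assumptions, not the code in ws-frames.ts: `coerceIsoTimestamp` and `isIsoTimestamp` are hypothetical names standing in for the preprocess callback and the `z.string().min(1)` check.

```typescript
// Hypothetical stand-in for the z.preprocess callback: Date objects become
// ISO-8601 strings, every other value passes through untouched.
function coerceIsoTimestamp(v: unknown): unknown {
  return v instanceof Date ? v.toISOString() : v;
}

// Hypothetical stand-in for z.string().min(1): accepts any non-empty string.
function isIsoTimestamp(v: unknown): v is string {
  return typeof v === "string" && v.length >= 1;
}

// A Date object, as a database driver might return for a timestamp column.
const fromDb: unknown = new Date(Date.UTC(2026, 4, 22, 18, 58, 30));

// Without coercion the check fails, which is how frames got dropped.
const rejectedBefore = !isIsoTimestamp(fromDb);

// With coercion in front of the same check, the value passes.
const acceptedAfter = isIsoTimestamp(coerceIsoTimestamp(fromDb));
```

Putting the coercion in front of the shared primitive is what keeps publishers unchanged: values that are already strings pass through as-is.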

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the
codecontext/.codecontextignore.template into any project's root on the
first call to that project if no .codecontextignore exists. One file
written per project, idempotent (in-memory Set guard + access-check),
silent fallback on read-only project. Stops the upstream empty-source-
file parser crash on foreign projects' node_modules — previously
required manually copying the template per project.
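
The guard logic described above could look roughly like the sketch below. It is an assumption-laden illustration, not the contents of services/codecontext_client.ts: `FileOps`, `ensureIgnoreFile`, and the in-memory file system are hypothetical, introduced only so the example stays self-contained.

```typescript
// Hypothetical minimal file-system surface; a real client would use Node's fs.
interface FileOps {
  exists(path: string): boolean;
  copy(from: string, to: string): void; // may throw on a read-only project
}

// Project roots already handled in this process (the in-memory Set guard).
const handledRoots = new Set<string>();

// Returns true only when this call actually wrote the ignore file.
function ensureIgnoreFile(fs: FileOps, projectRoot: string, templatePath: string): boolean {
  if (handledRoots.has(projectRoot)) return false; // already handled this run
  handledRoots.add(projectRoot);
  const target = `${projectRoot}/.codecontextignore`;
  if (fs.exists(target)) return false; // never overwrite an existing file
  try {
    fs.copy(templatePath, target);
    return true;
  } catch {
    return false; // silent fallback, e.g. read-only project
  }
}

// In-memory fake used only for this demonstration.
const files = new Set<string>(["codecontext/.codecontextignore.template"]);
const memFs: FileOps = {
  exists: (p) => files.has(p),
  copy: (_from, to) => { files.add(to); },
};

const firstCall = ensureIgnoreFile(memFs, "/proj", "codecontext/.codecontextignore.template");
const secondCall = ensureIgnoreFile(memFs, "/proj", "codecontext/.codecontextignore.template");
```

Here the first call writes the file and the second is a no-op, which is the idempotence the commit describes.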

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT
bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write
tools landed yet). Real recon sessions were hitting 30 with ~3 turns
wasted on codecontext parse failures; legitimate need was ~27, and
Architect-class system overviews want deeper recon. Headroom of 20
absorbs failure-retry turns without changing the safety floor — the
doom-loop guard (3 identical calls → abort) catches the actual
failure mode this cap was guarding against.
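
How the cap and the doom-loop guard interact can be sketched as follows. Only the constant name `BUDGET_READ_ONLY`, its value, and the "3 identical calls" threshold come from this message; `ToolCallGuard`, `Verdict`, and the choice to count consecutive identical calls are assumptions made for illustration, not the code in budget.ts.

```typescript
const BUDGET_READ_ONLY = 50; // value from this commit
const DOOM_LOOP_LIMIT = 3;   // "3 identical calls → abort"

type Verdict = "ok" | "budget_exhausted" | "doom_loop";

// Hypothetical guard: counts calls against a budget and aborts on repeats.
class ToolCallGuard {
  private used = 0;
  private lastKey = "";
  private repeats = 0;
  private readonly budget: number;

  constructor(budget: number) {
    this.budget = budget;
  }

  // Records one tool call and reports whether the turn may continue.
  check(tool: string, args: unknown): Verdict {
    const key = `${tool}:${JSON.stringify(args)}`;
    // Assumption: "identical" means same tool and same arguments, back to back.
    this.repeats = key === this.lastKey ? this.repeats + 1 : 1;
    this.lastKey = key;
    if (this.repeats >= DOOM_LOOP_LIMIT) return "doom_loop";
    this.used += 1;
    return this.used > this.budget ? "budget_exhausted" : "ok";
  }
}

const guard = new ToolCallGuard(BUDGET_READ_ONLY);
const verdicts = [1, 2, 3].map(() => guard.check("grep", { pattern: "TODO" }));
// verdicts: ["ok", "ok", "doom_loop"]
```

In this sketch a stuck model is stopped on the third repeat regardless of the remaining budget, which is why raising the cap does not weaken that particular protection.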

v1.14 (Phase C outer agent loop) will supersede this via per-agent
agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now
  has three buttons: edit (pop back into ChatInput via sendToChat
  event), force-send (was the dropdown's only useful action), and
  cancel. Default behavior (send when streaming completes) needs no
  UI — it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx).
  Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation
  patterns so the vendored skill library + agent registry become
  git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override
  mount. Skills now live in the boocode repo at data/skills/,
  auditable per-batch. The host-level /opt/skills/ is preserved
  untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext
  was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper +
  message_parts table + tool_cost_stats view + DB-integration test
  pattern + host-side smoke endpoint quirk. (Pre-existing in working
  tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-22 18:58:30 +00:00


name	description
developing-agents	Propose new agents for BooCode's data/AGENTS.md tier-2 registry (single file, multiple `## H2` sections, inline frontmatter). Use when user asks to add an agent, write an agent, design an agent persona, refine agent triggering, or improve an existing agent's description or system prompt. Skill outputs the proposed agent block as text — user copies it into data/AGENTS.md manually.

Agent Development (BooCode tier-2 format)

BooChat adaptation: this skill is a heavy rewrite of the upstream Anthropic agent-development skill. The upstream targets Claude Code's per-file agents/<name>.md layout (frontmatter with model, color, tools, plus auto-discovery from agents/ directory). BooCode uses a single combined file at data/AGENTS.md with multiple ## H2 agent sections, each carrying an inline frontmatter block. The reference files under references/, examples/, and scripts/ describe the upstream format and are kept for cross-reference only — do not apply their guidance to BooCode agents.

Quick overview

A BooCode agent is one ## H2 section inside data/AGENTS.md. Each section contains:

- An H2 title (the human-readable agent name, e.g. ## Debugger)
- An inline frontmatter block (--- … ---) with three fields
- A system-prompt body in markdown

The agent is resolved per-turn by sessions.agent_id. Multiple agents live in the same file; ordering is by appearance.
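
To make the layout concrete, here is a minimal sketch of how such a file could be split into agent sections. It is not BooCode's loader: `parseAgents` and `AgentSection` are hypothetical, frontmatter values are kept as raw strings, and the sketch assumes no system-prompt body contains a line starting with `## `. The second sample section uses placeholder values.

```typescript
interface AgentSection {
  name: string;
  frontmatter: Record<string, string>; // raw "key: value" pairs, unparsed
  body: string;
}

// Splits an AGENTS.md-style document into H2 sections, in order of appearance.
function parseAgents(markdown: string): AgentSection[] {
  const sections: AgentSection[] = [];
  // Split just before every line that starts with "## ".
  for (const chunk of markdown.split(/^(?=## )/m)) {
    if (!chunk.startsWith("## ")) continue; // skip any preamble
    const lines = chunk.split("\n");
    const name = lines[0].slice(3).trim();
    const open = lines.indexOf("---", 1);
    const close = open === -1 ? -1 : lines.indexOf("---", open + 1);
    if (open === -1 || close === -1) continue; // no frontmatter block found
    const frontmatter: Record<string, string> = {};
    for (const line of lines.slice(open + 1, close)) {
      const sep = line.indexOf(":");
      if (sep > 0) frontmatter[line.slice(0, sep).trim()] = line.slice(sep + 1).trim();
    }
    sections.push({ name, frontmatter, body: lines.slice(close + 1).join("\n").trim() });
  }
  return sections;
}

const sample = [
  "## Debugger",
  "---",
  "temperature: 0.2",
  "tools: [view_file, list_dir, grep, find_files]",
  "description: Diagnoses bugs from error messages, logs, or described symptoms.",
  "---",
  "You diagnose bugs.",
  "",
  "## Example Agent",
  "---",
  "temperature: 0.3",
  "tools: [view_file]",
  "description: Placeholder description.",
  "---",
  "You are a placeholder.",
].join("\n");

const agents = parseAgents(sample);
```

The result preserves file order, matching the "ordering is by appearance" rule.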

Canonical example (from data/AGENTS.md)

## Debugger
---
temperature: 0.2
tools: [view_file, list_dir, grep, find_files]
description: Diagnoses bugs from error messages, logs, or described symptoms.
---
You diagnose bugs. Form a hypothesis, prove it with evidence from the code.

Process:
1. Restate the symptom in one line. Confirm you understand it.
2. Read the error/stacktrace. Identify the exact frame where things go wrong.
3. view_file on that frame. Read 50 lines around it.
4. grep for callers, related state, recent changes that could explain it.
5. State the root cause with file:line evidence.
6. Propose the minimal fix. Note any side effects.

Rules:
- Never guess. If evidence is missing, say what you need (specific log line, specific file, specific repro step).
- Distinguish symptom from cause. A null check fixes the symptom; missing init causes it.
- Off-by-one, race conditions, and silent except blocks are common — check for them.
- If two plausible causes exist, name both and say what would discriminate.

Output:
- Symptom: <one line>
- Root cause: <file:line> — <explanation>
- Fix: <minimal diff or description>
- Risk: <what could break>

Second example — agent with a constrained tool list (illustrative)

The Debugger gets the full default read-only set. A more locked-down agent narrows further. Example (synthetic — not in data/AGENTS.md today; included to show how the tools whitelist is used in practice):

## View-only Auditor
---
temperature: 0.2
tools: [view_file, list_dir]
description: Reads named files and walks directories to answer questions scoped to specific paths. Read-only; does not search.
---
You read what you're pointed at. You do not search.

Process:
1. Confirm the user named specific files or a specific directory. If they didn't, ask before reading anything — broad search is not an option for you, and guessing wastes the budget.
2. view_file each named path. Cap at 3 files per question unless the user expands scope.
3. list_dir to confirm structure if the user is asking about layout.
4. Answer with file:line citations.

Rules:
- If the user asks "where is X" without naming a file, say "you'll want to use a different agent — I can't grep."
- Don't infer a path; ask for it.

Output:
- Answer: <prose>
- Evidence: file:line citations only

The difference from the Debugger is the tools array: dropping grep and find_files forces the agent to either work from the user's explicit pointers or hand off. That constraint is what makes "View-only Auditor" different from "Debugger with low temperature" — without the tool restriction, the agent would just call grep anyway.

There are 6 builtin agents in data/AGENTS.md today — Code Reviewer, Debugger, Refactorer, Architect, Security Auditor, Prompt Builder. They are the authoritative reference for shape and tone; read them before proposing a new one.

Frontmatter fields

Exactly three fields are honored. Anything else is silently ignored (forward-compat hook, not a feature).

`temperature` (number, 0.0–2.0)

LLM sampling temperature for this agent. Lower = more deterministic. Common settings observed in the builtin agents:

| Temp | Use case |
| --- | --- |
| 0.2 | Diagnostic / security work where evidence > creativity |
| 0.3 | Reviews, refactors (specific, narrow output) |
| 0.4 | Prompt builders (some variation; still grounded) |
| 0.5 | Architects / designers (broader exploration) |

Match the tone you want. Don't copy a number without understanding why.

`tools` (array of tool-name strings)

The allowlist of tools the agent may call. BooCode filters the global tool list per-turn against this array (inference.ts:721-731).

Current canonical tool names in BooCode (as of v1.13.x):

view_file, list_dir, grep, find_files, git_status, skill_find, skill_use, skill_resource, ask_user_input, web_search, web_fetch

Read-only set commonly given to investigation agents: [view_file, list_dir, grep, find_files]. Add git_status if branch state matters. Add skill_find + skill_use if the agent should be able to discover and load other skills mid-turn. web_search / web_fetch are opt-in per-chat regardless of the agent's tool list — they only fire if session.web_search_enabled (or the project default) is true.

Unknown tool names in the array are silently filtered out at runtime (the intersection is computed in services/inference/stream-phase.ts:403–406 and there's no warning log for the dropped names). Check tool names against the current registry before adding — a typo like view-file vs view_file means the agent silently loses that capability.
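
Since the runtime drops unknown names without a warning, a pre-flight check is cheap insurance. The sketch below is illustrative only: `effectiveTools` mirrors the intersection described above, while `unknownTools` is a hypothetical helper that does not exist in BooCode. The tool list is copied from this section.

```typescript
// Tool names copied from the list above (v1.13.x).
const KNOWN_TOOLS: readonly string[] = [
  "view_file", "list_dir", "grep", "find_files", "git_status",
  "skill_find", "skill_use", "skill_resource", "ask_user_input",
  "web_search", "web_fetch",
];

// Intersection as described: keep registry tools that the agent allowlists.
function effectiveTools(allowlist: readonly string[], registry: readonly string[] = KNOWN_TOOLS): string[] {
  return registry.filter((name) => allowlist.includes(name));
}

// Hypothetical pre-flight check: names that would be dropped without warning.
function unknownTools(allowlist: readonly string[], registry: readonly string[] = KNOWN_TOOLS): string[] {
  return allowlist.filter((name) => !registry.includes(name));
}

const requested = ["view-file", "list_dir", "grep"]; // "view-file" is a typo
const granted = effectiveTools(requested); // ["list_dir", "grep"]
const dropped = unknownTools(requested);   // ["view-file"]
```

Running a check like `unknownTools` over a proposed block before pasting it catches the hyphen-versus-underscore typo that would otherwise cost the agent a capability.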

No model field. Session model wins per the locked v1.8.2 decision; an agent inherits whatever model the chat is set to.

`description` (string, prose)

The trigger summary. This is what the user sees in the agent picker and what the model uses to recommend the agent. Keep it to one short paragraph. The format that works:

<What the agent does in one sentence>. <One or two short trigger phrases>.

Examples from the canonical 6:

"Reviews code for bugs, security issues, and maintainability. Read-only."
"Diagnoses bugs from error messages, logs, or described symptoms."
"Designs new features, modules, or architectural changes. Outputs a build plan."

Patterns that work in the description:

Verb-first ("Reviews", "Diagnoses", "Audits") — the agent is doing something
"Read-only" or similar capability hints when the agent is constrained
A noun phrase saying what's produced ("outputs a build plan", "outputs plans, not edits")

Patterns to avoid:

"Helps the user with X" (vague; says nothing)
Lists of features ("Reviews, audits, suggests, refactors, and improves...") — pick the dominant verb
"Use when..." prose (the trigger sentence is implicit in the verb-first description)

System prompt body

The body becomes the agent's system prompt, appended after the base prompt and the container guidance block. Write in second person ("You diagnose…", "You design…"). Aim for ~150–400 words. Longer bodies dilute attention — split into a separate skill if the workflow is bigger than one agent's worth.

Shape that has been working

Most builtin agents use this skeleton:

You are <role>. <One-line stance on quality / output discipline>.

Process:
1. <Verb> <noun> — <why>
2. <Verb> <noun> — <why>
...

Rules:
- <Imperative>
- <Imperative — often "never X" or "always Y">

Output:
- <Field>: <one-line shape>
- <Field>: <one-line shape>

Variants observed:

- Prioritize: / Reject: paired lists (Refactorer)
- Look for: long bulleted catalog (Security Auditor)
- Skip: to explicitly disclaim non-goals (Code Reviewer)

Discipline

- Be specific about what the agent doesn't do. Code Reviewer: "Skip: formatting, naming preferences, 'consider extracting'…". Saying what you reject sharpens the description's positive claim.
- Cite the BooCode tooling. Mention view_file, grep, etc. by name in the process steps. The model is more likely to actually use them when the prompt names them.
- No second system-prompt. The base prompt already covers "be concise, cite file:line." Don't restate it.
- No emojis. None of the builtin agents use them; the convention is plain text.

How to propose a new agent

1. Identify the gap. Is there a recurring kind of task that the current 6 don't cover well? If a builtin can be tweaked, prefer tweaking.
2. Pick a verb-derived role name. Title case, two words max (Debugger, Code Reviewer).
3. Write the description in one or two sentences.
4. Pick a temperature deliberately (see table above).
5. List the minimum tools needed.
6. Draft the system prompt: stance, process, rules, output.
7. Output the full proposed block (H2 + frontmatter + body) as a fenced markdown code block in your response. Don't mkdir, don't write; Sam pastes it into data/AGENTS.md and commits.

Common mistakes

- Adding a model field: silently ignored; the session model wins.
- Adding a color field: silently ignored.
- Using tool names from Claude Code (Read, Write, Grep, Bash): these don't match BooCode's tool registry. Use the BooCode names from the list above.
- Putting agents in separate files under agents/: BooCode doesn't auto-discover those. Everything lives in data/AGENTS.md.
- Body longer than 500 words: dilutes attention; if the workflow is that big, propose a skill (under data/skills/) instead and let the agent invoke skill_use.
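
Several of these mistakes are mechanical enough to check before pasting a block. The following is a rough sketch under assumptions: `lintAgent` is a hypothetical helper, not part of BooCode, and it presumes the frontmatter has already been parsed into a plain object. The honored fields, temperature range, and 500-word ceiling come from this document.

```typescript
const HONORED_FIELDS: readonly string[] = ["temperature", "tools", "description"];
const MAX_BODY_WORDS = 500;

// Hypothetical lint pass over one proposed agent block.
function lintAgent(frontmatter: Record<string, unknown>, body: string): string[] {
  const warnings: string[] = [];
  // Extra fields such as model or color are ignored at runtime; flag them.
  for (const key of Object.keys(frontmatter)) {
    if (!HONORED_FIELDS.includes(key)) warnings.push(`field "${key}" is ignored`);
  }
  // Temperature, when present, must be a number in the documented range.
  const t = frontmatter["temperature"];
  if (t !== undefined && (typeof t !== "number" || t < 0 || t > 2)) {
    warnings.push("temperature must be a number between 0.0 and 2.0");
  }
  // Count whitespace-separated words in the system-prompt body.
  const words = body.trim().split(/\s+/).filter(Boolean).length;
  if (words > MAX_BODY_WORDS) {
    warnings.push(`body is ${words} words (limit ${MAX_BODY_WORDS})`);
  }
  return warnings;
}

const warnings = lintAgent(
  { temperature: 0.2, tools: ["grep"], description: "Diagnoses bugs.", model: "m", color: "blue" },
  "You diagnose bugs.",
);
// warnings: ['field "model" is ignored', 'field "color" is ignored']
```

A check like this only covers the structural mistakes; whether the agent overlaps an existing one still needs a human read of data/AGENTS.md.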

What this skill outputs

For each agent proposal: one fenced markdown block ready to paste into data/AGENTS.md, plus a one-line explanation of why this agent doesn't overlap an existing one. Nothing else.
