Files

indifferentketchup 0fa46cd06c v1.13.12: skills audit + token-tracking fix + codecontext + cap50 + UI cleanups

Multi-topic batch. The big-ticket item is the skills audit; the rest are
smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/
(the boocode-repo-local skill library — see docker-compose change below).
Audited via 5 parallel Claude Code agent-teams running the
mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge
Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the
~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched),
11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-
natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule
(verification-before-completion). Each surviving skill had its description
refined to fix specific trigger gaps surfaced by the protocol — 4
real-bug findings landed (dead refs, stale tags, broken sub-file
references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/
audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs
recipes" sections — future workflow rules go to those files (100%
present), recipes stay in data/skills/ (~6% invoke rate in multi-turn
per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns
timestamp columns as JS Date objects. Every message_complete /
session_updated / chat_updated frame was failing the v1.13.11 Zod gate
and being silently dropped. Symptoms: token tracking blank in the UI
(no usage frames landed); the 60s no-token-activity timer tripped the
stale-stream banner because the frontend's local message state never
saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v,
z.string().min(1)) applied to the IsoTimestamp primitive. Centralized,
no publisher changes, works identically server + web (the parity test
still passes).

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the
codecontext/.codecontextignore.template into any project's root on the
first call to that project if no .codecontextignore exists. One file
written per project, idempotent (in-memory Set guard + access-check),
silent fallback on read-only project. Stops the upstream empty-source-
file parser crash on foreign projects' node_modules — previously
required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT
bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write
tools landed yet). Real recon sessions were hitting 30 with ~3 turns
wasted on codecontext parse failures; legitimate need was ~27, and
Architect-class system overviews want deeper recon. Headroom of 20
absorbs failure-retry turns without changing the safety floor — the
doom-loop guard (3 identical calls → abort) catches the actual
failure mode this cap was guarding against.

v1.14 (Phase C outer agent loop) will supersede this via per-agent
agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now
  has three buttons: edit (pop back into ChatInput via sendToChat
  event), force-send (was the dropdown's only useful action), and
  cancel. Default behavior (send when streaming completes) needs no
  UI — it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx).
  Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation
  patterns so the vendored skill library + agent registry become
  git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override
  mount. Skills now live in the boocode repo at data/skills/,
  auditable per-batch. The host-level /opt/skills/ is preserved
  untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext
  was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper +
  message_parts table + tool_cost_stats view + DB-integration test
  pattern + host-side smoke endpoint quirk. (Pre-existing in working
  tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-22 18:58:30 +00:00

3.4 KiB

Raw Blame History

Condition-Based Waiting

Overview

Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.

Core principle: Wait for the actual condition you care about, not a guess about how long it takes.

When to Use

digraph when_to_use {
    "Test uses setTimeout/sleep?" [shape=diamond];
    "Testing timing behavior?" [shape=diamond];
    "Document WHY timeout needed" [shape=box];
    "Use condition-based waiting" [shape=box];

    "Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"];
    "Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"];
    "Testing timing behavior?" -> "Use condition-based waiting" [label="no"];
}

Use when:

Tests have arbitrary delays (setTimeout, sleep, time.sleep())
Tests are flaky (pass sometimes, fail under load)
Tests timeout when run in parallel
Waiting for async operations to complete

Don't use when:

Testing actual timing behavior (debounce, throttle intervals)
Always document WHY if using arbitrary timeout

Core Pattern

// ❌ BEFORE: Guessing at timing
await new Promise(r => setTimeout(r, 50));
const result = getResult();
expect(result).toBeDefined();

// ✅ AFTER: Waiting for condition
await waitFor(() => getResult() !== undefined);
const result = getResult();
expect(result).toBeDefined();

Quick Patterns

Scenario	Pattern
Wait for event	`waitFor(() => events.find(e => e.type === 'DONE'))`
Wait for state	`waitFor(() => machine.state === 'ready')`
Wait for count	`waitFor(() => items.length >= 5)`
Wait for file	`waitFor(() => fs.existsSync(path))`
Complex condition	`waitFor(() => obj.ready && obj.value > 10)`

Implementation

Generic polling function:

async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description: string,
  timeoutMs = 5000
): Promise<T> {
  const startTime = Date.now();

  while (true) {
    const result = condition();
    if (result) return result;

    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }

    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
  }
}

See condition-based-waiting-example.ts in this directory for complete implementation with domain-specific helpers (waitForEvent, waitForEventCount, waitForEventMatch) from actual debugging session.

Common Mistakes

❌ Polling too fast: setTimeout(check, 1) - wastes CPU ✅ Fix: Poll every 10ms

❌ No timeout: Loop forever if condition never met ✅ Fix: Always include timeout with clear error

❌ Stale data: Cache state before loop ✅ Fix: Call getter inside loop for fresh data

When Arbitrary Timeout IS Correct

// Tool ticks every 100ms - need 2 ticks to verify partial output
await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
await new Promise(r => setTimeout(r, 200));   // Then: wait for timed behavior
// 200ms = 2 ticks at 100ms intervals - documented and justified

Requirements:

First wait for triggering condition
Based on known timing (not guessing)
Comment explaining WHY

Real-World Impact

From debugging session (2025-10-03):

Fixed 15 flaky tests across 3 files
Pass rate: 60% → 100%
Execution time: 40% faster
No more race conditions

3.4 KiB Raw Blame History