Multi-topic batch. The big-ticket item is the skills audit; the rest are smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/ (the boocode-repo-local skill library; see docker-compose change below). Audited via 5 parallel Claude Code agent-teams running the mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the ~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched), 11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule (verification-before-completion). Each surviving skill had its description refined to fix specific trigger gaps surfaced by the protocol; 4 real-bug findings landed (dead refs, stale tags, broken sub-file references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs recipes" sections: future workflow rules go to those files (100% present), recipes stay in data/skills/ (~6% invoke rate in multi-turn per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns timestamp columns as JS Date objects. Every message_complete / session_updated / chat_updated frame was failing the v1.13.11 Zod gate and being silently dropped.

Symptoms: token tracking blank in the UI (no usage frames landed); the 60s no-token-activity timer tripped the stale-stream banner because the frontend's local message state never saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v, z.string().min(1)) applied to the IsoTimestamp primitive. Centralized, no publisher changes, works identically server + web (the parity test still passes).

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the codecontext/.codecontextignore.template into any project's root on the first call to that project if no .codecontextignore exists. One file written per project, idempotent (in-memory Set guard + access-check), silent fallback on read-only project. Stops the upstream empty-source-file parser crash on foreign projects' node_modules; previously required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write tools landed yet).

Real recon sessions were hitting 30 with ~3 turns wasted on codecontext parse failures; legitimate need was ~27, and Architect-class system overviews want deeper recon. Headroom of 20 absorbs failure-retry turns without changing the safety floor: the doom-loop guard (3 identical calls → abort) catches the actual failure mode this cap was guarding against.

v1.14 (Phase C outer agent loop) will supersede this via per-agent agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now has three buttons: edit (pop back into ChatInput via sendToChat event), force-send (was the dropdown's only useful action), and cancel. Default behavior (send when streaming completes) needs no UI; it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx). Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation patterns so the vendored skill library + agent registry become git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override mount. Skills now live in the boocode repo at data/skills/, auditable per-batch. The host-level /opt/skills/ is preserved untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper + message_parts table + tool_cost_stats view + DB-integration test pattern + host-side smoke endpoint quirk. (Pre-existing in working tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
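
The IsoTimestamp failure described in the commit message can be reproduced without any dependencies. The sketch below is a stand-in for the real schema, not the code in ws-frames.ts: `isNonEmptyString`, `normalizeTimestamp`, and `parseIsoTimestamp` are hypothetical names, and the string check only approximates `z.string().min(1)`. It shows why a `Date` from the database driver fails a string-only gate and how normalizing first lets it through.

```typescript
// Stand-in for the string-only gate: accepts non-empty strings and nothing else.
// Hypothetical helper approximating z.string().min(1); not the repo's code.
function isNonEmptyString(v: unknown): v is string {
  return typeof v === 'string' && v.length >= 1;
}

// The normalization step named in the commit: Date -> ISO string,
// every other value is passed through untouched.
function normalizeTimestamp(v: unknown): unknown {
  return v instanceof Date ? v.toISOString() : v;
}

// Result shape loosely modelled on a "safe parse" outcome (assumption).
type ParseResult = { success: true; data: string } | { success: false };

// Normalize first, then validate: the same ordering a preprocess wrapper gives.
function parseIsoTimestamp(v: unknown): ParseResult {
  const normalized = normalizeTimestamp(v);
  return isNonEmptyString(normalized)
    ? { success: true, data: normalized }
    : { success: false };
}

// A timestamp as a database driver might hand it over: a Date, not a string.
const fromDriver = new Date(Date.UTC(2025, 0, 2, 3, 4, 5));

// Without normalization the gate rejects the value, so the frame would be dropped.
const before = isNonEmptyString(fromDriver); // false

// With normalization the same value validates as an ISO string.
const after = parseIsoTimestamp(fromDriver); // data: '2025-01-02T03:04:05.000Z'
```

Keeping the conversion inside the timestamp primitive, rather than at each publisher, matches the "centralized, no publisher changes" note above: every frame type that reuses the primitive gets the fix.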
159 lines
4.9 KiB
TypeScript
// Complete implementation of condition-based waiting utilities
// From: Lace test infrastructure improvements (2025-10-03)
// Context: Fixed 15 flaky tests by replacing arbitrary timeouts

import type { ThreadManager } from '~/threads/thread-manager';
import type { LaceEvent, LaceEventType } from '~/threads/types';

/**
 * Wait for a specific event type to appear in thread
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
 */
export function waitForEvent(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find((e) => e.type === eventType);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10); // Poll every 10ms for efficiency
      }
    };

    check();
  });
}

/**
 * Wait for a specific number of events of a given type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param count - Number of events to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to all matching events once count is reached
 *
 * Example:
 *   // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
 *   await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
 */
export function waitForEventCount(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  count: number,
  timeoutMs = 5000
): Promise<LaceEvent[]> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const matchingEvents = events.filter((e) => e.type === eventType);

      if (matchingEvents.length >= count) {
        resolve(matchingEvents);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(
          new Error(
            `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`
          )
        );
      } else {
        setTimeout(check, 10);
      }
    };

    check();
  });
}

/**
 * Wait for an event matching a custom predicate
 * Useful when you need to check event data, not just type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param predicate - Function that returns true when event matches
 * @param description - Human-readable description for error messages
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   // Wait for TOOL_RESULT with specific ID
 *   await waitForEventMatch(
 *     threadManager,
 *     agentThreadId,
 *     (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
 *     'TOOL_RESULT with id=call_123'
 *   );
 */
export function waitForEventMatch(
  threadManager: ThreadManager,
  threadId: string,
  predicate: (event: LaceEvent) => boolean,
  description: string,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find(predicate);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10);
      }
    };

    check();
  });
}

// Usage example from actual debugging session:
//
// BEFORE (flaky):
// ---------------
// const messagePromise = agent.sendMessage('Execute tools');
// await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms
// agent.abort();
// await messagePromise;
// await new Promise(r => setTimeout(r, 50)); // Hope results arrive in 50ms
// expect(toolResults.length).toBe(2); // Fails randomly
//
// AFTER (reliable):
// ----------------
// const messagePromise = agent.sendMessage('Execute tools');
// await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start
// agent.abort();
// await messagePromise;
// await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results
// expect(toolResults.length).toBe(2); // Always succeeds
//
// Result: 60% pass rate → 100%, 40% faster execution
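
The three helpers in this file repeat the same poll-until-true-or-timeout loop and differ only in what they look for. One way to factor that out is sketched below. It is not part of the original file: `waitForCondition`, `FakeEvent`, and `waitForFakeEventCount` are hypothetical names, and a plain in-memory array stands in for `ThreadManager` so the block runs on its own.

```typescript
// Hypothetical generic helper: polls `probe` until it returns something other
// than undefined, or rejects once `timeoutMs` has elapsed.
function waitForCondition<T>(
  probe: () => T | undefined,
  description: string,
  timeoutMs = 5000,
  pollMs = 10
): Promise<T> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const result = probe();

      if (result !== undefined) {
        resolve(result);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
      } else {
        setTimeout(check, pollMs);
      }
    };

    check();
  });
}

// Minimal in-memory stand-in for the event store (assumption, not the Lace API).
type FakeEvent = { type: string };
const events: FakeEvent[] = [];

// The count-based wait expressed through the generic helper.
function waitForFakeEventCount(
  type: string,
  count: number,
  timeoutMs = 5000
): Promise<FakeEvent[]> {
  return waitForCondition(
    () => {
      const matching = events.filter((e) => e.type === type);
      return matching.length >= count ? matching : undefined;
    },
    `${count} ${type} events`,
    timeoutMs
  );
}
```

Using `undefined` as the "not yet" signal lets the probe return the matched value directly, so the same helper covers the single-event, count, and predicate variants. The trade-off is that a probe cannot legitimately resolve to `undefined`.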