Multi-topic batch. The big-ticket item is the skills audit; the rest are smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/ (the boocode-repo-local skill library; see docker-compose change below). Audited via 5 parallel Claude Code agent-teams running the mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the ~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched), 11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule (verification-before-completion). Each surviving skill had its description refined to fix specific trigger gaps surfaced by the protocol; 4 real-bug findings landed (dead refs, stale tags, broken sub-file references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs recipes" sections: future workflow rules go to those files (100% present), recipes stay in data/skills/ (~6% invoke rate in multi-turn per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns timestamp columns as JS Date objects. Every message_complete / session_updated / chat_updated frame was failing the v1.13.11 Zod gate and being silently dropped. Symptoms: token tracking blank in the UI (no usage frames landed); the 60s no-token-activity timer tripped the stale-stream banner because the frontend's local message state never saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v, z.string().min(1)) applied to the IsoTimestamp primitive. Centralized, no publisher changes, works identically server + web (the parity test still passes).

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the codecontext/.codecontextignore.template into any project's root on the first call to that project if no .codecontextignore exists. One file written per project, idempotent (in-memory Set guard + access-check), silent fallback on read-only project. Stops the upstream empty-source-file parser crash on foreign projects' node_modules, which previously required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write tools landed yet). Real recon sessions were hitting 30 with ~3 turns wasted on codecontext parse failures; legitimate need was ~27, and Architect-class system overviews want deeper recon. Headroom of 20 absorbs failure-retry turns without changing the safety floor: the doom-loop guard (3 identical calls → abort) catches the actual failure mode this cap was guarding against. v1.14 (Phase C outer agent loop) will supersede this via per-agent agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now has three buttons: edit (pop back into ChatInput via sendToChat event), force-send (was the dropdown's only useful action), and cancel. Default behavior (send when streaming completes) needs no UI; it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx). Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation patterns so the vendored skill library + agent registry become git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override mount. Skills now live in the boocode repo at data/skills/, auditable per-batch. The host-level /opt/skills/ is preserved untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper + message_parts table + tool_cost_stats view + DB-integration test pattern + host-side smoke endpoint quirk. (Pre-existing in working tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Defense-in-Depth Validation

## Overview

When you fix a bug caused by invalid data, adding validation at one place feels sufficient. But that single check can be bypassed by different code paths, refactoring, or mocks.

**Core principle:** Validate at EVERY layer data passes through. Make the bug structurally impossible.

## Why Multiple Layers

Single validation: "We fixed the bug"
Multiple layers: "We made the bug impossible"

Different layers catch different cases:
- Entry validation catches most bugs
- Business logic catches edge cases
- Environment guards prevent context-specific dangers
- Debug logging helps when other layers fail

## The Four Layers

### Layer 1: Entry Point Validation
**Purpose:** Reject obviously invalid input at the API boundary

```typescript
import { existsSync, statSync } from 'node:fs';

function createProject(name: string, workingDirectory: string) {
  // Reject missing or whitespace-only paths before touching the filesystem
  if (!workingDirectory || workingDirectory.trim() === '') {
    throw new Error('workingDirectory cannot be empty');
  }
  // The path must exist...
  if (!existsSync(workingDirectory)) {
    throw new Error(`workingDirectory does not exist: ${workingDirectory}`);
  }
  // ...and must be a directory, not a file
  if (!statSync(workingDirectory).isDirectory()) {
    throw new Error(`workingDirectory is not a directory: ${workingDirectory}`);
  }
  // ... proceed
}
```

### Layer 2: Business Logic Validation
**Purpose:** Ensure data makes sense for this operation

```typescript
function initializeWorkspace(projectDir: string, sessionId: string) {
  // Re-check here: not every caller goes through entry-point validation
  if (!projectDir) {
    throw new Error('projectDir required for workspace initialization');
  }
  // ... proceed
}
```

### Layer 3: Environment Guards
**Purpose:** Prevent dangerous operations in specific contexts

```typescript
import { tmpdir } from 'node:os';
import { isAbsolute, relative, resolve, sep } from 'node:path';

async function gitInit(directory: string) {
  // In tests, refuse git init outside temp directories
  if (process.env.NODE_ENV === 'test') {
    // Compare via relative(): a plain startsWith() prefix check would also
    // accept sibling paths such as /tmpfoo when the temp dir is /tmp
    const rel = relative(resolve(tmpdir()), resolve(directory));
    const outsideTmp =
      rel === '..' || rel.startsWith(`..${sep}`) || isAbsolute(rel);

    if (outsideTmp) {
      throw new Error(
        `Refusing git init outside temp dir during tests: ${directory}`
      );
    }
  }
  // ... proceed
}
```
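
A temp-directory guard like this depends on getting the "is this path inside that directory" test right, and a plain string-prefix comparison is easy to get wrong. The sketch below shows one way to write the check with `path.relative`. `isInsideDir` is a hypothetical helper name, not part of the original code, and `path.posix` is used only so the example behaves the same on every platform.

```typescript
import * as path from 'node:path';

// Hypothetical helper: true when `child` is `parent` itself or lies beneath it.
// path.posix keeps the example deterministic; production code would normally
// use the platform-default functions from 'node:path'.
function isInsideDir(parent: string, child: string): boolean {
  const rel = path.posix.relative(
    path.posix.resolve(parent),
    path.posix.resolve(child),
  );
  // '' means the same directory; a leading '..' segment means child escapes parent.
  return rel === '' || (rel !== '..' && !rel.startsWith('../'));
}

// The naive prefix test accepts a sibling directory that merely shares a prefix.
const naive = '/tmpfoo/repo'.startsWith('/tmp'); // true (wrong)
const checked = isInsideDir('/tmp', '/tmpfoo/repo'); // false
```

Resolving both paths first also means inputs such as `/tmp/../etc` are judged by where they actually point, not by how they are spelled.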

### Layer 4: Debug Instrumentation
**Purpose:** Capture context for forensics

```typescript
async function gitInit(directory: string) {
  // Capture the call stack so the log shows who requested the init
  const stack = new Error().stack;
  // `logger` is assumed to be the application's existing logger instance
  logger.debug('About to git init', {
    directory,
    cwd: process.cwd(),
    stack,
  });
  // ... proceed
}
```

## Applying the Pattern

When you find a bug:

1. **Trace the data flow** - Where does the bad value originate? Where is it used?
2. **Map all checkpoints** - List every point the data passes through
3. **Add validation at each layer** - Entry, business, environment, debug
4. **Test each layer** - Try to bypass layer 1, verify layer 2 catches it
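
Step 4 can be sketched as a pair of checks: one through the normal entry point, one that calls the inner function directly so that layer 1 never runs. The functions below are simplified, hypothetical versions of the layer 1 and layer 2 examples (no filesystem access), and `errorMessageOf` is an illustrative helper, not an existing API.

```typescript
// Simplified stand-in for the layer 2 function.
function initializeWorkspace(projectDir: string): string {
  // Layer 2: checked here even though callers are expected to validate.
  if (!projectDir) {
    throw new Error('projectDir required for workspace initialization');
  }
  return `workspace at ${projectDir}`;
}

// Simplified stand-in for the layer 1 entry point.
function createProject(workingDirectory: string): string {
  // Layer 1: entry-point check.
  if (workingDirectory.trim() === '') {
    throw new Error('workingDirectory cannot be empty');
  }
  return initializeWorkspace(workingDirectory);
}

// Returns the message of the error thrown by fn, or null if it did not throw.
function errorMessageOf(fn: () => unknown): string | null {
  try {
    fn();
    return null;
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

// Normal path: layer 1 rejects the empty string first.
const viaEntry = errorMessageOf(() => createProject(''));
// Bypass layer 1 by calling the inner function directly: layer 2 must catch it.
const viaBypass = errorMessageOf(() => initializeWorkspace(''));
```

If `viaBypass` came back `null`, the inner function would be relying entirely on its callers, which is the situation the pattern is meant to rule out.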

## Example from Session

Bug: Empty `projectDir` caused `git init` to run in the source code directory

**Data flow:**
1. Test setup → empty string
2. `Project.create(name, '')`
3. `WorkspaceManager.createWorkspace('')`
4. `git init` runs in `process.cwd()`

**Four layers added:**
- Layer 1: `Project.create()` validates not empty/exists/writable
- Layer 2: `WorkspaceManager` validates projectDir not empty
- Layer 3: `WorktreeManager` refuses git init outside tmpdir in tests
- Layer 4: Stack trace logging before git init

**Result:** All 1847 tests passed, bug impossible to reproduce
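
Step 4 of the data flow above is the surprising one: an empty directory string often ends up meaning "the current working directory" instead of causing an error. Node's `path.resolve` is one place where this happens. Whether the session's code went through `path.resolve` is an assumption made for illustration, and `requireExplicitDir` is a hypothetical helper.

```typescript
import * as path from 'node:path';

// path.resolve ignores zero-length segments, so an empty string resolves to
// the process's current working directory instead of failing.
const resolvedEmpty = path.resolve('');

// Hypothetical helper (not from the session): fail loudly instead of
// letting '' fall through to process.cwd().
function requireExplicitDir(dir: string): string {
  if (dir.trim() === '') {
    throw new Error('directory must be given explicitly');
  }
  return path.resolve(dir);
}
```

When tests run from the repository root, `resolvedEmpty` is the repository itself, which would explain how a `git init` could land in the source tree.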

## Key Insight

All four layers were necessary. During testing, each layer caught bugs the others missed:
- Different code paths bypassed entry validation
- Mocks bypassed business logic checks
- Edge cases on different platforms needed environment guards
- Debug logging identified structural misuse

**Don't stop at one validation point.** Add checks at every layer.
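
The "mocks bypassed business logic checks" case can be sketched as follows. Everything here is a hypothetical stand-in: the `WorkspaceManager` interface shape, the `stubManager` test double, the injected `env` and `tmpRoot` parameters, and the `/home/dev/repo` path are assumptions for illustration, and the string-based path check is simplified to keep the example free of filesystem access.

```typescript
interface WorkspaceManager {
  createWorkspace(projectDir: string): string;
}

// Lowest-level operation. It carries its own guard (layer 3), so it does
// not depend on any caller having validated the directory.
function gitInit(directory: string, env: string, tmpRoot: string): string {
  const insideTmp =
    directory === tmpRoot || directory.startsWith(tmpRoot + '/');
  if (env === 'test' && !insideTmp) {
    throw new Error(
      `Refusing git init outside temp dir during tests: ${directory}`,
    );
  }
  return `initialized ${directory}`;
}

// Real implementation: validates projectDir (layer 2) before delegating.
const realManager: WorkspaceManager = {
  createWorkspace(projectDir: string): string {
    if (!projectDir) {
      throw new Error('projectDir required for workspace initialization');
    }
    return gitInit(projectDir, 'test', '/tmp');
  },
};

// Test double: replaces the manager and silently drops the layer 2 check.
// An empty projectDir falls back to the repository root here.
const stubManager: WorkspaceManager = {
  createWorkspace(projectDir: string): string {
    return gitInit(projectDir || '/home/dev/repo', 'test', '/tmp');
  },
};
```

Calling `stubManager.createWorkspace('')` skips the layer 2 error entirely, yet the call still fails at the environment guard, so the repository is never touched.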