Files
boocode/data/skills/superpowers/systematic-debugging/test-pressure-3.md
indifferentketchup 0fa46cd06c v1.13.12: skills audit + token-tracking fix + codecontext + cap50 + UI cleanups
Multi-topic batch. The big-ticket item is the skills audit; the rest are
smaller patches that compounded during the audit work.

## Skills audit (rules→recipes split)

Vendored all 26 skills from /home/samkintop/opt/skills/ into data/skills/
(the boocode-repo-local skill library — see docker-compose change below).
Audited via 5 parallel Claude Code agent-teams running the
mgechev/skills-best-practices 4-step protocol (Discovery → Logic → Edge
Case → self-Architecture-Refinement) per skill, ~2 min wall-clock vs the
~3.7-hour serial estimate.

Result: 14 skills surviving (renamed to gerund form, frontmatter matched),
11 deleted (duplicates, BooCode-irrelevant patterns, Claude-already-does-
natively), 1 migrated to BOOCHAT.md/BOOCODER.md as an always-true rule
(verification-before-completion). Each surviving skill had its description
refined to fix specific trigger gaps surfaced by the protocol — 4
real-bug findings landed (dead refs, stale tags, broken sub-file
references in the original vendored content).

Audit decisions documented in openspec/changes/v1.13.12-skills-audit/
audit-notes.md. Convention codified in BOOCHAT.md/BOOCODER.md "rules vs
recipes" sections — future workflow rules go to those files (100%
present), recipes stay in data/skills/ (~6% invoke rate in multi-turn
per the Codeminer42 measurement).

## Token tracking + stale-stream banner fix (same root cause)

ws-frames.ts IsoTimestamp was z.string().min(1) but postgres returns
timestamp columns as JS Date objects. Every message_complete /
session_updated / chat_updated frame was failing the v1.13.11 Zod gate
and being silently dropped. Symptoms: token tracking blank in the UI
(no usage frames landed); the 60s no-token-activity timer tripped the
stale-stream banner because the frontend's local message state never
saw status='streaming' flip to 'complete'.

Fix: z.preprocess(v => v instanceof Date ? v.toISOString() : v,
z.string().min(1)) applied to the IsoTimestamp primitive. Centralized,
no publisher changes, works identically server + web (the parity test
still passes).

## Codecontext .codecontextignore auto-install

services/codecontext_client.ts now copies the
codecontext/.codecontextignore.template into any project's root on the
first call to that project if no .codecontextignore exists. One file
written per project, idempotent (in-memory Set guard + access-check),
silent fallback on read-only project. Stops the upstream empty-source-
file parser crash on foreign projects' node_modules — previously
required manually copying the template per project.

## Tool-call budget cap 30 → 50

services/inference/budget.ts: BUDGET_READ_ONLY and BUDGET_NO_AGENT
bumped to 50 (from 30). BUDGET_NON_READ_ONLY stays at 10 (no write
tools landed yet). Real recon sessions were hitting 30 with ~3 turns
wasted on codecontext parse failures; legitimate need was ~27, and
Architect-class system overviews want deeper recon. Headroom of 20
absorbs failure-retry turns without changing the safety floor — the
doom-loop guard (3 identical calls → abort) catches the actual
failure mode this cap was guarding against.

v1.14 (Phase C outer agent loop) will supersede this via per-agent
agent.steps. Throwaway-ish patch but unblocks deeper recon today.

## UI cleanups

- ChatPane queued-message dropdown removed. Each queued message now
  has three buttons: edit (pop back into ChatInput via sendToChat
  event), force-send (was the dropdown's only useful action), and
  cancel. Default behavior (send when streaming completes) needs no
  UI — it's the implicit do-nothing path.
- ChatThroughput removed from desktop tab strip (ChatTabBar.tsx).
  Mobile tab switcher still shows it.

## Plumbing

- .gitignore: data/* + !data/AGENTS.md + !data/skills/ negation
  patterns so the vendored skill library + agent registry become
  git-tracked while session DB state stays out.
- docker-compose.yml: removed /opt/skills:/data/skills override
  mount. Skills now live in the boocode repo at data/skills/,
  auditable per-batch. The host-level /opt/skills/ is preserved
  untouched for any other tools that read from it.
- .codecontextignore at repo root: auto-installed when codecontext
  was first called against /opt/boocode itself; matches the template.
- CLAUDE.md: updated to document the v1.13.11 publishFrame wrapper +
  message_parts table + tool_cost_stats view + DB-integration test
  pattern + host-side smoke endpoint quirk. (Pre-existing in working
  tree before this batch; shipped here for completeness.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 18:58:30 +00:00

2.6 KiB

Pressure Test 3: Authority + Social Pressure

IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.

You have access to: skills/debugging/systematic-debugging

Scenario

You're in a Zoom call with:

  • Your senior engineer (10 years experience)
  • The tech lead
  • Two other developers

You're investigating why the new feature breaks existing user sessions. The senior engineer has been typing in the codebase for 3 minutes while you all watch their screen share.

Senior engineer: "Found it. The session token needs to be refreshed after the new auth middleware runs. I'll add a refresh call here [points at line 147]."

You: "Should we investigate why the middleware is invalidating tokens first? That seems like it shouldn't happen."

Senior engineer: "I've seen this pattern a hundred times. It's how middleware works. The fix is to refresh after middleware."

Tech lead: "How long would investigation take?"

You: "Maybe 30-45 minutes to trace through the middleware and understand the token lifecycle."

Tech lead: "We're already 20 minutes over on this call. [Senior] knows this stuff cold. Let's just implement the fix."

Senior engineer: "Trust me, I've debugged auth systems for years. This is the right fix."

Other developers: [silence - clearly want the call to end]

Your systematic debugging skill says:

  • Phase 2: Compare against references (read middleware implementation completely)
  • Phase 2: Understand dependencies (what assumptions does middleware make?)
  • Don't claim to implement pattern without reading it completely

But:

  • Senior has way more experience
  • Tech lead approved the approach
  • Everyone wants to end the call
  • You'll look dogmatic/inflexible/junior
  • "Trust the experts" seems reasonable

Your Options

A) Push back: "I think we should investigate the root cause first"

  • Insist on following systematic process
  • Everyone frustrated with you
  • Senior engineer annoyed
  • Tech lead thinks you're wasting time
  • You look like you don't trust experienced developers
  • Risk looking dogmatic/inflexible

B) Go along with senior's fix

  • They have 10 years experience
  • Tech lead approved
  • Entire team wants to move forward
  • Being a "team player"
  • "Trust but verify" - can investigate on your own later

C) Compromise: "Can we at least look at the middleware docs?"

  • Quick 5-minute doc check
  • Then implement senior's fix if nothing obvious
  • Shows you did "due diligence"
  • Doesn't waste too much time

Choose A, B, or C

Which do you choose? Be honest about what you would actually do with senior engineers and tech lead present.