Files
boocode/data/skills/boocode/self-healing/references/examples.md
indifferentketchup 9abc14ef82 feat(skills): add self-healing and verify-gate skills from pskoett-skills fork
Self-healing: heal loop with verify-before-persist discipline, Pattern-Key
dedup, HEAL entry format, 3 scripts, examples reference, eval.yaml.
Verify-gate: 4-step process (Discover -> Run -> Fix Loop -> Gate Signal)
with 3-attempt fix loop, scope-to-fix-only discipline, command discovery.
.learnings/HEALS.md with template entry.
2026-06-07 23:17:33 +00:00

7.7 KiB

Self-Healing Examples

Concrete HEAL entries showing the format applied to real failure shapes. Use these as templates when filing your own heals. All examples use the iteration-2 schema (free-form Trigger / Area, optional Active-Context, no Source field, lazy artifact folders).


Example 1 — Tool failure (lockfile mismatch)

## [HEAL-20260524-001] npm_install_pnpm_lockfile

**Logged**: 2026-05-24T14:22:01Z
**Status**: verified
**Trigger**: tool-failure
**Area**: build
**Priority**: medium

### Failure
`npm install` exited 1 with `npm ERR! code EUSAGE` and a notice that `pnpm-lock.yaml` is present but `package-lock.json` is missing. The project uses pnpm workspaces; npm refuses to install against a pnpm lockfile.

### Diagnosis
Project root contains `pnpm-lock.yaml`. The README and CI both invoke `pnpm`. `npm` was a habit from previous projects, not the actual project's package manager.

### Fix
Use pnpm instead:
```bash
pnpm install

Verification

$ pnpm install
Lockfile is up to date, resolution step is skipped
Already up to date
✓ Done in 1.4s

Exit 0.

Metadata

  • Related Files: package.json, pnpm-lock.yaml
  • See Also: (none yet)
  • Pattern-Key: env.lockfile_mismatch
  • Recurrence-Count: 1
  • First-Seen: 2026-05-24
  • Last-Seen: 2026-05-24


Pattern-Key `env.lockfile_mismatch` is reusable across projects (yarn.lock, bun.lockb, etc.). At Recurrence ≥ 3, this should be promoted to `CLAUDE.md` or `AGENTS.md` as a verification step.

No Artifacts section — the fix is a tool swap, no files generated. Lazy folder pattern: nothing to put in `.learnings/heals/HEAL-20260524-001/`, so the folder isn't created.

---

## Example 2 — Missing capability (helper written on the fly)

```markdown
## [HEAL-20260524-002] bulk_rename_branches_helper

**Logged**: 2026-05-24T15:10:44Z
**Status**: verified
**Trigger**: missing-capability
**Area**: ci
**Priority**: low

### Failure
Need to rename 12 feature branches from `feat-XXX-name` to `feat/XXX-name`. No existing project script handles this; `gh` doesn't have a bulk-rename primitive.

### Diagnosis
This is glue work, not a project bug. A small shell helper using `gh api` per branch is the right level — not worth a top-level script, but worth keeping the file for the next time someone asks.

### Fix
Wrote `.learnings/heals/HEAL-20260524-002/rename-branches.sh`:

```bash
#!/usr/bin/env bash
set -euo pipefail
git fetch --all
for branch in $(git branch -r | grep 'origin/feat-' | sed 's|origin/||'); do
  new="${branch/feat-/feat/}"
  echo "$branch → $new"
  gh api -X POST "repos/{owner}/{repo}/git/refs" \
    -f "ref=refs/heads/$new" \
    -f "sha=$(git rev-parse "origin/$branch")"
  gh api -X DELETE "repos/{owner}/{repo}/git/refs/heads/$branch"
done

Verification

Dry-run (commented out the API calls) printed the 12 expected mappings. Live run renamed all 12; git branch -r | grep 'feat-' | wc -l returns 0.

Artifacts

  • .learnings/heals/HEAL-20260524-002/rename-branches.sh

Metadata

  • Related Files: (none — operates on git refs)
  • See Also: (none)
  • Pattern-Key: tool.gh.bulk_branch_rename
  • Recurrence-Count: 1
  • First-Seen: 2026-05-24
  • Last-Seen: 2026-05-24


Helper script lives under `.learnings/heals/<HEAL-ID>/` — referenceable, but not assumed to be load-bearing. If it gets reused frequently, promote to `scripts/`.

---

## Example 3 — Environment issue (runtime version)

```markdown
## [HEAL-20260524-003] nvm_use_project_node

**Logged**: 2026-05-24T16:01:12Z
**Status**: verified
**Trigger**: env-issue
**Active-Context**: verify-gate
**Area**: tests
**Priority**: medium

### Failure
`pnpm test` exited 1 with `engine "node" is incompatible with this module. Expected version "^20.10.0". Got "18.19.0"`.

### Diagnosis
`.nvmrc` requests node 20.10.0; current shell has 18.19.0 from a previous project context. The shell's nvm wasn't switched after `cd`-ing into the repo.

### Fix
```bash
nvm use   # reads .nvmrc

Verification

$ node --version
v20.10.0
$ pnpm test
✓ 47 tests passed

Metadata

  • Related Files: .nvmrc, package.json
  • See Also: (none)
  • Pattern-Key: env.node_version_mismatch
  • Recurrence-Count: 1
  • First-Seen: 2026-05-24
  • Last-Seen: 2026-05-24


`Active-Context: verify-gate` because that's the workflow phase the agent was in when the test step blew up. An upstream context loader could surface this entry next time `verify-gate` runs in a node project. If you don't have an analogous concept in your pipeline, omit the field.

---

## Example 4 — External service workaround

```markdown
## [HEAL-20260524-004] gh_api_rate_limit_backoff

**Logged**: 2026-05-24T17:33:08Z
**Status**: verified
**Trigger**: external-change
**Area**: ci
**Priority**: high

### Failure
Looping `gh api repos/.../issues` over 200 issues started returning `403 rate limit exceeded` after ~60 calls. Unauthenticated burst limit (abuse detection on rapid successive calls).

### Diagnosis
Script was using `gh api` REST without batching. `gh` is authenticated but the secondary rate limit fires on rapid successive calls — not the primary 5000/hour limit. Switching to a single paginated GraphQL query bypasses the secondary limit entirely.

### Fix
```bash
gh api graphql -f query='
  query($owner:String!,$repo:String!,$cursor:String) {
    repository(owner:$owner,name:$repo) {
      issues(first:100,after:$cursor) { ... }
    }
  }' -F owner=... -F repo=...

Took ~3 calls total instead of 200.

Verification

Full run completed in 4.8s, no 403s, all 200 issues retrieved. Compared output against a sample of the original per-issue calls — fields match.

Artifacts

  • .learnings/heals/HEAL-20260524-004/fetch-issues.sh

Metadata

  • Related Files: (none — ad-hoc query)
  • See Also: (none)
  • Pattern-Key: api.gh.rate_limit
  • Recurrence-Count: 1
  • First-Seen: 2026-05-24
  • Last-Seen: 2026-05-24


---

## Example 5 — Abandoned heal (diagnosis was wrong)

```markdown
## [HEAL-20260524-005] vitest_flaky_snapshot

**Logged**: 2026-05-24T18:14:22Z
**Status**: abandoned
**Trigger**: tool-failure
**Active-Context**: verify-gate
**Area**: tests
**Priority**: medium

### Failure
`vitest` snapshot test `Card > renders default` flaked twice in three runs. Diff showed a timestamp string differing by ~3 seconds.

### Diagnosis (initial — wrong)
Assumed flake was timezone drift in the snapshot fixture. Patched the fixture to use a fixed `Date.now()` stub.

### Diagnosis (current — correct)
The snapshot depends on multiple non-deterministic values: timestamp AND a `crypto.randomUUID()`. The clock stub addressed only one of them. The UUID is still random per render, so the snapshot keeps drifting on subsequent runs.

### Fix (attempted)
Added `vi.useFakeTimers({ now: 1700000000000 })` to the test setup.

### Verification
Test passed twice, then flaked again on the third run — same `Card > renders default`, different diff (this time the UUID changed). Original diagnosis was incomplete.

### Abandonment notes
The right fix is to make the component deterministic via dependency injection (pass a `clock` and `idGen` prop), not to stub globally. That's a real change to the component contract — out of scope for a heal. Filed `FEAT-20260524-001` via self-improvement; surfaced to the user.

### Metadata
- Related Files: src/components/Card.tsx, src/components/Card.test.tsx
- See Also: FEAT-20260524-001
- Pattern-Key: tests.flaky_snapshot_multi_nondeterminism
- Recurrence-Count: 1
- First-Seen: 2026-05-24
- Last-Seen: 2026-05-24

---

Abandoned heals are first-class. They document a dead end so the next agent doesn't re-walk it. The handoff to a FEAT- entry via self-improvement is the right next step when the real fix is a feature, not a heal.

No Artifacts section — the attempted patch was reverted; nothing reusable was generated.