indifferentketchup c935687725 chore(openspec): drop 9 superseded proposals + 11 stub archive files

Drop 9 batch proposals that are superseded by the boocode-lift-analysis
(boocontext-audit, conductor upgrades, self-healing/verify-gate skills):
add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform,
conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul,
agent-reliability.

Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only)
that provide zero documentation value over the existing CHANGELOG.md + git tags.

2026-06-07 22:15:38 +00:00


llama-cache-and-spec — tasks

Files to change

Three files across two repos:

/opt/forks/llama-sidecar/internal/config/config.go
/opt/boocode/apps/server/src/services/inference/llama-args-validator.ts
/opt/forks/llama-sidecar/internal/validator/validator.go

Tasks

1. Update sidecar default base args

/opt/forks/llama-sidecar/internal/config/config.go has been edited. defaultBaseArgs() now includes:

- --cache-type-k q4_0: quantizes the K cache to q4_0 (about 4.5 bits per element versus 16 for f16, so roughly 3.5× smaller; the V cache is not affected by this flag)
- --cache-reuse 256: KV cache reuse across turns (prompt caching)
- --slot-save-path /tmp/llama-slots: disk-persistent KV cache
- --cache-idle-slots: auto-save idle slots to disk
- --spec-type ngram-mod --spec-ngram-mod-thsh 2: speculative decoding, expected ~2× tok/s
- --ctx-checkpoints 32: context overflow protection
- --sleep-idle-seconds 600: GPU memory reclaim when idle
- --metrics: Prometheus /metrics endpoint

Build verified: go build ./... exits 0.

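For orientation, the flag set above could be expressed as an argv-style slice along the following lines. This is a minimal sketch only: the return type and layout of the real defaultBaseArgs() in config.go are not shown in this file, so the signature here is an assumption; the flag names and values are copied from the list above.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultBaseArgs returns the default llama-server flags as an argv-style
// slice. Hypothetical shape: the real function in config.go may return a
// different type or include additional flags.
func defaultBaseArgs() []string {
	return []string{
		"--cache-type-k", "q4_0", // K cache quantization
		"--cache-reuse", "256", // KV cache reuse across turns
		"--slot-save-path", "/tmp/llama-slots", // disk-persistent KV cache
		"--cache-idle-slots", // auto-save idle slots to disk
		"--spec-type", "ngram-mod", // speculative decoding mode
		"--spec-ngram-mod-thsh", "2",
		"--ctx-checkpoints", "32", // context overflow protection
		"--sleep-idle-seconds", "600", // GPU memory reclaim when idle
		"--metrics", // Prometheus /metrics endpoint
	}
}

func main() {
	// Print the flags as they would appear on the command line.
	fmt.Println(strings.Join(defaultBaseArgs(), " "))
}
```

Keeping each flag and its value as separate slice elements avoids shell quoting issues when the slice is handed to exec.Command.
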
2. No change needed — shadow lists are correct

The shadow lists in llama-args-validator.ts already prevent agents from overriding cache/spec/template flags. Adding the flags to defaultBaseArgs + keeping the shadow lists is the correct architecture: flags are enabled by default, agents can't override them.
3. No change needed — same reasoning as task 2

The sidecar validator.go shadow lists serve the same purpose. Both code paths are consistent.
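The shadow-list idea described in tasks 2 and 3 can be sketched as a simple lookup that rejects agent-supplied flags the sidecar already manages. Everything named here (shadowed, flagName, rejectShadowed, and the particular flags in the map) is hypothetical and for illustration; the actual lists in validator.go and llama-args-validator.ts are not reproduced in this file.

```go
package main

import (
	"fmt"
	"strings"
)

// shadowed holds flags that agents may not set because the sidecar supplies
// them by default. Illustrative subset only, not the real shadow list.
var shadowed = map[string]bool{
	"--cache-type-k":  true,
	"--cache-reuse":   true,
	"--spec-type":     true,
	"--chat-template": true,
}

// flagName strips an inline value, so "--flag=value" becomes "--flag".
func flagName(arg string) string {
	if i := strings.IndexByte(arg, '='); i >= 0 {
		return arg[:i]
	}
	return arg
}

// rejectShadowed returns an error for the first agent-supplied flag that
// appears in the shadow list, or nil if none do.
func rejectShadowed(agentArgs []string) error {
	for _, a := range agentArgs {
		if !strings.HasPrefix(a, "--") {
			continue // a value or positional argument, not a long flag
		}
		if name := flagName(a); shadowed[name] {
			return fmt.Errorf("flag %s is managed by the sidecar and cannot be overridden", name)
		}
	}
	return nil
}

func main() {
	fmt.Println(rejectShadowed([]string{"--temp", "0.7"}))
	fmt.Println(rejectShadowed([]string{"--cache-type-k=f16"}))
}
```

A production list would also need entries for any short aliases of the shadowed flags, since this sketch only inspects long-form names.
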
4. Deploy + verify
- Rebuild sidecar binary: go build -o ... ./... → ✅ done
- Restart docker compose: needs manual deploy
- Verify /metrics endpoint returns data
- Verify nvidia-smi shows reduced VRAM (expected: roughly 3.5× smaller K cache relative to f16; the V cache is unchanged, so the overall KV cache shrinks by less)
