• v1.13.9: compaction overflow trigger — 0.85 × ctx_max early trigger

    indifferentketchup released this 2026-05-22 13:59:14 +00:00 | 92 commits to main since this release

    Follows the Opencode pattern (session/overflow.ts): fire compaction at
    85% of ctx_max, replacing the v1.11.0-era ctx_max - 20_000 formula.

    Old formula: usable = ctx_max - 20_000

    • ctx=262144 → trigger at 242144 (92.4%) — only 7.6% headroom
    • ctx=100000 → trigger at 80000 (80.0%)
    • ctx= 32000 → trigger at 12000 (37.5%) — over-eager
    • ctx<=20000 → trigger at 0, so compaction never fires
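
    The legacy numbers above can be reproduced with a small sketch. The
    name legacyUsable() and the clamp at zero are assumptions made for
    illustration; only COMPACTION_BUFFER and the ctx_max - 20_000 formula
    come from these notes.

```typescript
// Legacy fixed reserve, as named in the release notes.
const COMPACTION_BUFFER = 20_000;

// Hypothetical name for the old budget function. The clamp at zero is an
// assumption that matches the "trigger at 0" row for ctx <= 20000.
function legacyUsable(ctxMax: number): number {
  return Math.max(0, ctxMax - COMPACTION_BUFFER);
}

// Express a budget as a percentage of the context window, one decimal place.
function percentOfWindow(budget: number, ctxMax: number): string {
  return ((budget / ctxMax) * 100).toFixed(1) + "%";
}

for (const ctxMax of [262_144, 100_000, 32_000, 20_000]) {
  const budget = legacyUsable(ctxMax);
  console.log(`ctx=${ctxMax} -> trigger at ${budget} (${percentOfWindow(budget, ctxMax)})`);
}
```

    The fixed reserve is what makes the trigger point drift: it is a small
    slice of a 262k window but the majority of a 32k one.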

    New formula: usable = floor(0.85 * ctx_max)

    • ctx=262144 → trigger at 222822 (85.0%) — 15% headroom for summarizer
    • ctx=100000 → trigger at 85000 (85.0%)
    • ctx= 32000 → trigger at 27200 (85.0%)
    • ctx= 8192 → trigger at 6963 (85.0%)
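
    A minimal sketch of the ratio-based budget, assuming usable() takes
    ctx_max as its only argument and returns 0 for zero or negative input
    (the notes mention a zero+negative test case but not its expected
    value, so that guard is an assumption):

```typescript
// Ratio constant introduced in this release.
const EARLY_TRIGGER_RATIO = 0.85;

// Sketch of the new budget: floor(0.85 * ctx_max).
// Assumption: non-positive (or NaN) input yields a budget of 0.
function usable(ctxMax: number): number {
  if (!(ctxMax > 0)) return 0;
  return Math.floor(EARLY_TRIGGER_RATIO * ctxMax);
}

for (const ctxMax of [262_144, 100_000, 32_000, 8_192]) {
  console.log(`ctx=${ctxMax} -> trigger at ${usable(ctxMax)}`);
}
```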

    A fixed ratio gives the same proportional headroom at any context
    scale. The qwen3.6 daily driver (ctx=262144) now triggers 19,322
    tokens earlier (242144 → 222822), leaving that much more breathing
    room before overflow; small-ctx models no longer degenerate to
    never-triggering.
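
    The headroom gain at ctx=262144 works out as follows. This is a
    worked-arithmetic sketch; headroom() is a hypothetical helper, not a
    function from compaction.ts.

```typescript
const COMPACTION_BUFFER = 20_000;  // legacy fixed reserve
const EARLY_TRIGGER_RATIO = 0.85;  // ratio introduced in this release

// Hypothetical helper: tokens left between the trigger point and ctx_max.
function headroom(ctxMax: number, trigger: number): number {
  return ctxMax - trigger;
}

const ctxMax = 262_144;
const legacyTrigger = ctxMax - COMPACTION_BUFFER;               // 242144
const ratioTrigger = Math.floor(EARLY_TRIGGER_RATIO * ctxMax);  // 222822

// Difference in headroom between the two formulas at this context size.
const extraHeadroom = headroom(ctxMax, ratioTrigger) - headroom(ctxMax, legacyTrigger);

console.log(`legacy headroom: ${headroom(ctxMax, legacyTrigger)}`);  // 20000
console.log(`ratio headroom:  ${headroom(ctxMax, ratioTrigger)}`);   // 39322
console.log(`extra headroom:  ${extraHeadroom}`);                    // 19322
```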

    usable() was the only consumer of COMPACTION_BUFFER, so that constant
    is deleted. The new EARLY_TRIGGER_RATIO constant takes its place.

    isOverflow() and the maybeFlagForCompaction() call site at
    payload.ts:184 are unchanged — formula swap is internal to compaction.ts.
    payload.ts comment touched only to drop the stale COMPACTION_BUFFER
    reference (PRUNE_TRIGGER_TOKENS stays at 20k as the prune-freed
    threshold; independent of the overflow formula).
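
    Why the call site can stay untouched is easiest to see in a sketch.
    The isOverflow() signature, the >= comparison, and the "zero budget
    means never overflow" rule below are all assumptions for illustration;
    the notes confirm only that isOverflow() depends on usable() and that
    its code did not change.

```typescript
const EARLY_TRIGGER_RATIO = 0.85;

// Sketch of the budget function (zero/negative guard is an assumption).
function usable(ctxMax: number): number {
  if (!(ctxMax > 0)) return 0;
  return Math.floor(EARLY_TRIGGER_RATIO * ctxMax);
}

// Hypothetical signature: the real isOverflow() may take different arguments.
// It only reads the budget from usable(), so swapping the formula inside
// usable() moves the threshold without changing this function or its callers.
function isOverflow(tokensUsed: number, ctxMax: number): boolean {
  const budget = usable(ctxMax);
  if (budget <= 0) return false; // assumption: a zero budget disables the trigger
  return tokensUsed >= budget;   // assumption: inclusive comparison
}

console.log(isOverflow(84_999, 100_000)); // false: below the 85k budget
console.log(isOverflow(85_000, 100_000)); // true: at the 85k budget
```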

    Tests: 4 new usable() corner cases (262k/100k/8k/zero+negative), plus
    5 isOverflow() numbers shifted to match the 85k budget at ctx=100k.
    195/195 server tests pass (was 194).

    Smoke: ratio math verified by unit tests at all four corners. Live
    cap-hit verification deferred — requires accumulating >222k tokens
    in a session under qwen3.6-35b-a3b-mxfp4 (was >242k pre-fix); will
    surface organically in extended use.
