Add full sortof codebase: API, drain workers, frontend, schema, specs

2026-05-04 03:27:54 +00:00
parent acda2c90f8
commit 55d3794bfb
43 changed files with 13375 additions and 53 deletions


@@ -0,0 +1,155 @@
# Spec A - Multi-branch picker
**Date:** 2026-04-30
**Status:** Draft v2 (incorporates spec-review fixes #1-#13)
**Out of scope:** B (collection expansion + live progress), C (build context), D (dep "Add" button), E (precacher), G (cleanups). See §11.
## 1. Summary
Some Steam Workshop items ship multiple `mod.info` files under one wsid (canonical example: AuthenticZ → `AuthenticZBackpacks+`, `Authentic Z - Current`, `AuthenticZLite`). Today every parsed `mod_id` flows into `MODS_LINE`, including alternates the user must pick exactly one of. This spec adds a per-wsid picker UI with `localStorage` persistence and a new `POST /api/resort` endpoint that recomputes load order and warnings for the chosen subset, without re-hitting Steam.
## 2. Problem
- AuthenticZ (wsid `2335368829`) yields three `mod_parsed` rows: `AuthenticZBackpacks+`, `Authentic Z - Current`, `AuthenticZLite`. They are mutually exclusive branches.
- The author left `incompatible_mods` empty on all three, so we have no metadata signal that they are alternates.
- Today's `MODS_LINE` is `";".join(SORTED_ORDER)`, so all three branch IDs land in the output. PZ refuses to start with conflicting mods, so the file **looks valid but bricks the server** - silent corruption.
- Other multi-mod packages exist where every `mod_id` *should* load (cooperative content packs). The system must support both shapes.
## 3. Trigger rules
- The picker UI fires **iff a wsid has ≥2 rows in `mod_parsed`**.
- Row count is the *only* trigger. Author metadata does not gate visibility - see §5 for what it changes.
## 4. Default selection rules
For each picker-eligible wsid:
- **If** any `mod_parsed.incompatible_mods` for that wsid lists another `mod_id` from the same wsid → default selection = **first `mod_id` only**.
- **Else** → default selection = **all `mod_id`s ticked**.
"First" tiebreaker: `ORDER BY parsed_at ASC, mod_id ASC`. `worker.process_one` parses sequentially in a `for mip in mod_info_paths: await conn.execute(UPSERT_MOD_PARSED, ...)` loop (one statement per `mod.info`, no `gather`/`to_thread`), so `parsed_at = now()` produces strictly increasing microsecond values per row in practice. `mod_id ASC` is the defensive tiebreaker for the theoretical sub-microsecond case. **This is a spec-locked decision** - revisit if the resulting "primary" branch feels wrong on real inputs.
### Default-selection safety net (fix for review #1)
The default-all-ticked path covers the canonical AuthenticZ case (3 rows, all `incompatible_mods=[]`) and would otherwise emit the same bricking config that motivated this spec. To prevent silent corruption, the API emits an additional warning whenever the default leaves all branches selected without any author signal:
- If a wsid has ≥2 mod rows, AND every row's `incompatible_mods` is empty, AND the user's current selection includes all branches (i.e., they haven't unticked any), emit a `WARNINGS` entry: `tag: "ambiguous-multi-branch"`, `level: "amber"`, `msg: "X branches selected from <wsid title> - author didn't declare alternates; verify these aren't mutually exclusive (e.g., AuthenticZ Lite vs Current). Expand the row to pick one."`
- The warning clears as soon as the user makes any explicit selection (any branch unticked, or - in radio mode - any branch chosen).
- Picker UI remains opt-in; this rule guarantees the user sees a yellow flag without having to expand every multi-branch row.
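
The safety-net condition above can be sketched as a pure function. This is a minimal illustration, not the shipped `build_warnings` code: the function name, the `branches` mapping, and the `user_touched` flag are hypothetical, and the message text is abbreviated from the wording given in this section.

```python
from typing import Optional


def ambiguous_branch_warning(
    title: str,
    branches: dict,        # hypothetical shape: mod_id -> list of incompatible_mods
    selected: set,         # mod_ids currently selected for this wsid
    user_touched: bool,    # True once the user made any explicit selection
) -> Optional[dict]:
    """Return the amber warning entry, or None when the rule does not apply."""
    if len(branches) < 2 or user_touched:
        return None
    if any(branches.values()):
        # At least one row declares incompatibilities: an author signal exists.
        return None
    if not set(branches) <= selected:
        # Some branch is already unticked, so the selection is not "all".
        return None
    return {
        "tag": "ambiguous-multi-branch",
        "level": "amber",
        "msg": (
            f"{len(branches)} branches selected from {title} - author didn't "
            "declare alternates; verify these aren't mutually exclusive. "
            "Expand the row to pick one."
        ),
    }
```

Keeping the check free of I/O means the same predicate could be evaluated on both `/api/sort` and `/api/resort` responses, assuming the caller tracks whether the wsid has been touched.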
## 5. UI mode rules
- Default: **checkboxes** (multi-select).
- Upgrade to **radios** (single-select; exactly one always picked) **iff** any `mod_id` for the wsid lists another `mod_id` from the same wsid in its `incompatible_mods`.
- **Cross-wsid** incompatibilities (mod A in wsid X marks mod B in wsid Y) do **not** trigger radio mode for either wsid; they continue to flow through the existing Warnings system.
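
The §4 default-selection rule and the §5 mode rule share one predicate (does any branch list a sibling in `incompatible_mods`?), so they can be sketched together. The `ModRow` dataclass and the function names below are hypothetical stand-ins for whatever shape the backend or frontend actually uses; only the ordering key `(parsed_at, mod_id)` and the two rules come from this spec.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ModRow:
    # Hypothetical in-memory view of one mod_parsed row for a single wsid.
    mod_id: str
    parsed_at: datetime
    incompatible_mods: tuple = ()


def has_intra_wsid_conflict(rows):
    """True if any row lists another mod_id from the same wsid as incompatible."""
    siblings = {r.mod_id for r in rows}
    return any(
        other in siblings and other != r.mod_id
        for r in rows
        for other in r.incompatible_mods
    )


def ui_mode(rows):
    """Section 5: radios only when the conflict is inside the wsid."""
    return "radio" if has_intra_wsid_conflict(rows) else "checkbox"


def default_selection(rows):
    """Section 4: first-only on intra-wsid conflict, otherwise all ticked."""
    ordered = sorted(rows, key=lambda r: (r.parsed_at, r.mod_id))
    if has_intra_wsid_conflict(ordered):
        return [ordered[0].mod_id]
    return [r.mod_id for r in ordered]
```

Because cross-wsid entries in `incompatible_mods` never appear in `siblings`, they leave both the mode and the default untouched, which matches the last bullet above.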
## 6. UI placement & interactions
- Inline row expansion in the existing `ModTable` (`sortof-app.jsx:306`). No new top-level component.
- A multi-branch wsid renders as **one** parent `<tr>`, occupying the slot the first selected `mod_id` from that wsid holds in `SORTED_ORDER`. Other selected branch `mod_id`s from the same wsid do not render their own rows - they live inside the expanded panel.
- Mod ID cell affordance:
  - **Unresolved** (no user interaction yet, no hydrated selection): `▾ N branches`
  - **Resolved** (user touched it, or hydrated from `localStorage`): `✓ X of N`
- Click affordance to toggle expansion. Multiple wsids may be expanded simultaneously - single-pass triage on a 450-mod collection.
- Expanded panel: a single `<tr>` with `colSpan={COLUMN_COUNT}` containing per-branch rows of `[checkbox|radio] mod_id - name - cat - deps - pos`. `COLUMN_COUNT` is a single source-of-truth constant (today 6; Spec C will add a 7th column for build-context). **Do not hardcode the integer.** Match existing column rhythm so zebra striping still reads.
- Single-mod wsids render unchanged.
- **`/api/resort` failure mode** (review #11): on 5xx response, retain the prior `MOD_DB`/`SORTED_ORDER`/`MODS_LINE`/`WARNINGS` state and emit a transient `WARNINGS` entry `couldn't recompute sort - try again` (level=red) with a retry button. Never apply a partial response.
- Parent row's category / deps / load cells reflect the **first selected** branch's values; if zero are selected, the parent row remains visible with affordance `✓ 0 of N` and `-` in the data cells, and contributes nothing to `MODS_LINE` / `SORTED_ORDER`. Display position for zero-selected rows is implementation-defined (e.g., previous slot, or sorted by any `mod_id` from the wsid) since the wsid no longer appears in `SORTED_ORDER`.
## 7. Persistence
- `localStorage` key: `sortof.branch.selections`. **One** key total - hydrate in a single read.
- Value: JSON-serialized object keyed by wsid → array of selected `mod_id` strings.
```json
{ "2335368829": ["Authentic Z - Current"], "2169435993": ["modoptions"] }
```
- **Hydration** on app mount: read once, merge into in-memory `branchSelections` state.
- **Eviction**: if a stored `mod_id` is no longer present in the current `MOD_DB` rows for that wsid (cache invalidated upstream, mod.info changed, etc.), drop it silently. Do not warn.
- **Radio-mode invariant guard** (review #2): if eviction would leave a radio-mode wsid with zero selected `mod_id`s, fall back to the §4 default (first-only). Radio mode's "exactly one always picked" invariant must hold post-hydrate.
- Single-mod wsids never write to this object; absence implies "use default".
- **Cross-tab sync** (review #8): App attaches a `window.addEventListener('storage', ...)` listener; on a `sortof.branch.selections` storage event, replace in-memory `branchSelections` with the new value and trigger a single `/api/resort`. Last-writer-wins on the underlying storage value; in-tab state stays coherent.
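
The hydration, eviction, and radio-guard rules above amount to a small pure function. The real logic lives in the frontend; the sketch below uses Python only to pin down the semantics. The function name and parameters are hypothetical, and two behaviors are assumptions rather than spec text: malformed JSON is treated as "nothing stored", and a checkbox-mode entry that eviction empties is dropped so the §4 default applies (one reading of test case 5).

```python
import json


def hydrate_selections(stored_json, current, radio_wsids):
    """
    stored_json: raw string stored under sortof.branch.selections, or None
    current:     wsid -> list of mod_ids present in MOD_DB, in section-4 order
    radio_wsids: set of wsids currently in radio mode
    Returns the cleaned wsid -> selected mod_ids mapping.
    """
    try:
        stored = json.loads(stored_json) if stored_json else {}
    except ValueError:
        stored = {}  # assumption: unreadable storage behaves like empty storage
    if not isinstance(stored, dict):
        stored = {}

    result = {}
    for wsid, picked in stored.items():
        present = current.get(wsid)
        if not present or not isinstance(picked, list):
            continue  # wsid no longer in MOD_DB: evict silently
        kept = [m for m in picked if m in present]
        if wsid in radio_wsids:
            # Radio invariant: exactly one selected, falling back to first-only.
            result[wsid] = kept[:1] or [present[0]]
        elif kept or not picked:
            # Keep survivors, and keep an explicit "zero selected" choice.
            result[wsid] = kept
    return result
```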
## 8. API impact
- **No change** to `POST /api/sort` request or response shape.
- **New** endpoint: `POST /api/resort`, taking the current selection and returning a fresh order + warnings without re-hitting Steam.
```json
{ "selected_mod_ids": ["modoptions","tsarslib","Authentic Z - Current"] }
```
- Response: same shape as `/api/sort` with `status:"success"` and `pending:[]`. Backend filters `mod_parsed` rows to the supplied set via `WHERE mod_id = ANY($1::text[])` (parameterized - review #9), runs `mlos_sort`, returns updated `SORTED_ORDER`, `MODS_LINE`, `WARNINGS`, `MOD_DB`. No DB writes.
- `WORKSHOP_ITEMS_LINE` is **not** affected by selection - wsid stays subscribed regardless of which `mod_id`s are enabled. Matches PZ's `WorkshopItems` vs `Mods` semantics.
### Scope, auth, validation (fix for review #3, #9, #10, #12)
- **Stateless.** No session token, no per-user partition. `mod_parsed` is a shared cache; concurrent drain UPSERTs and `/api/resort` SELECTs do not block each other, because Postgres MVCC gives each SELECT a consistent snapshot (plain reads take no row locks). Multi-tenancy is out of scope for v1; if added later, expect a `submission_id` on `/api/sort` and `/api/resort`.
- **Unknown `mod_id` handling** (review #12): server silently drops `selected_mod_ids` not present in `mod_parsed` (matches §7 client semantics) and logs at INFO. If the entire selection is empty after the drop, return HTTP 400 - the client can recover by re-running `/api/sort`.
- **Input validation.** `selected_mod_ids` must be a JSON array of strings, length ≥1 and ≤500, each string ≤256 chars. Reject anything else with 400 before touching the DB. PZ `mod_id`s legitimately contain spaces, `+`, `-`, and apostrophes - the parameterized `ANY` pattern handles them safely; **no string interpolation anywhere** (review #9).
- **Rate limiting** (review #10): not implemented at the FastAPI layer. Recommend a Caddy-level `@rate_limit` matcher on `/api/sort` and `/api/resort` before any public exposure beyond the current Tailscale-only bind. Documented as a known gap.
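
The validation and unknown-ID rules above can be sketched without any framework. `BadRequest`, `resolve_selection`, and the `known_mod_ids` set are hypothetical names; in the real service the known set would come from the parameterized `mod_parsed` query and the exception would map to an HTTP 400.

```python
import logging

log = logging.getLogger("sortof.resort")  # hypothetical logger name

MAX_IDS = 500
MAX_ID_LEN = 256


class BadRequest(Exception):
    """Stand-in for whatever the API layer turns into an HTTP 400."""


def resolve_selection(payload, known_mod_ids):
    """Validate the request body, then drop IDs that are not in mod_parsed."""
    ids = payload.get("selected_mod_ids") if isinstance(payload, dict) else None
    if not isinstance(ids, list) or not (1 <= len(ids) <= MAX_IDS):
        raise BadRequest("selected_mod_ids must be a list of 1..500 strings")
    for mod_id in ids:
        if not isinstance(mod_id, str) or len(mod_id) > MAX_ID_LEN:
            raise BadRequest("each mod_id must be a string of at most 256 chars")

    kept = [m for m in ids if m in known_mod_ids]
    dropped = [m for m in ids if m not in known_mod_ids]
    if dropped:
        log.info("dropping unknown mod_ids: %r", dropped)
    if not kept:
        raise BadRequest("no known mod_ids left after filtering")
    return kept
```

Shape validation runs before the lookup, so a malformed body is rejected without touching the database, as this section requires.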
### Sequenced requests (fix for review #5)
- The frontend tags every `/api/resort` POST with a monotonically increasing client-side sequence number (in-memory counter on App, not part of the request body - sent as header `X-Sortof-Seq` or tracked via the issuing call site).
- When a response arrives, compare its sequence number against the latest issued; if older, **drop the response without applying it** (UI keeps current state, last-issued response wins). Prevents stale-response overwrites under rapid toggling.
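
The sequencing rule is small enough to show in full. The real counter lives in the frontend App; this Python sketch only illustrates the "latest issued wins" gate, and the class and method names are hypothetical.

```python
class ResortSequencer:
    """Tags outgoing resort requests and rejects stale responses."""

    def __init__(self):
        self._latest_issued = 0

    def issue(self):
        """Call when sending a request; returns the sequence number to attach."""
        self._latest_issued += 1
        return self._latest_issued

    def should_apply(self, seq):
        """Call when a response arrives; only the latest issued may be applied."""
        return seq == self._latest_issued
```

With two toggles in flight, `issue()` returns 1 and then 2; whichever order the responses arrive in, only the one tagged 2 passes `should_apply`.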
## 9. Data assumptions
- Schema column is `mod_parsed.incompatible_mods` (`TEXT[]`) - names already stripped of any leading `\` per the B42 parser fix shipped today.
- `mod_parsed.parsed_at` ordering verified (review #4): `worker.process_one` parses `mod.info` files sequentially with `for mip in mod_info_paths: await conn.execute(UPSERT_MOD_PARSED, …)`. Each upsert is its own asyncpg statement (auto-commit, no transaction wrap), and `parsed_at` is `now()` evaluated server-side per statement. Sequential awaits + asyncpg RTT > 1µs ⇒ strictly increasing microsecond values in practice. `mod_id ASC` is a defensive tiebreaker for the theoretical sub-µs collision; no ordinal column exists in the schema and adding one is out of scope for this spec.
- Dangling-deps detection (review #13) already exists in `mlos_sort.sort_mods` (`mlos_sort.py:432-437`): `enabled = set(by_id.keys())` then `miss = [r for r in mod.requirements if r not in enabled]` per mod. Calling `sort_mods` with a filtered subset on `/api/resort` automatically produces the new missing-dep warnings; no changes to `mlos_sort` are needed.
- Frontend already has `incompatible_mods` available as `m.conflicts` on each `MOD_DB` row (`adapters.py:94`).
- This spec consumes the `MOD_DB`/`SORTED_ORDER`/`WARNINGS` shape currently produced by `app.py` + `adapters.py`. Per-build variant filtering is Spec C; selection here operates on the full `mod_id` corpus the API returned.
## 10. Open questions resolved
1. **Client-side filter vs API round-trip.** *Client-side filter for the row affordance and parent-row rendering; server round-trip via `POST /api/resort` for sort + warnings recompute.* Justification: instant feedback on tick/untick UX, but warnings are dependency-driven and need real `mlos_sort` evaluation. Pure-client would require porting `mlos_sort` to JS - far worse than a 50ms POST to a hot Postgres connection.
2. **`SORTED_ORDER` recompute strategy.** *Re-run `mlos_sort` on the selected subset via `POST /api/resort`.* Justification: when the user unticks `AuthenticZLite` and another mod requires it, the warnings list and possibly the topological order both change. Filtering the previous `SORTED_ORDER` post-hoc misses the new missing-dep warning, defeating the picker's safety value.
3. **First-`mod_id` tiebreaker for default selection.** *`ORDER BY parsed_at ASC, mod_id ASC`.* Schema-deterministic and matches insertion order from `worker.process_one`. Flagged in §4 as a lockable spec decision; revisit on real corpus.
4. **`localStorage` key namespacing.** *Single key `sortof.branch.selections`, value `{ [wsid]: string[] }`.* The `sortof.branch.` prefix reserves namespace for any future per-feature storage; one key keeps hydration to a single read.
## 11. Out of scope
- B41/B42 build-context filtering (Spec C).
- Steam collection URL/ID expansion (Spec B).
- Dependency "Add" button (Spec C/D pair).
- Server-side persistence of branch choices.
- Live drain progress streaming (Spec B+F).
- Cleanups bundle (Spec G).
## 12. Acceptance criteria
- [ ] A wsid with N=1 mod row renders as a single normal row in `ModTable` (no behavior change).
- [ ] A wsid with N≥2 mod rows renders as one parent row with `▾ N branches` in the Mod ID cell.
- [ ] Clicking the affordance expands a `colSpan`'d panel listing all N rows with the correct input type (checkboxes by default, radios when intra-wsid `incompatible_mods` is non-empty).
- [ ] Default selection matches §4 (all-ticked or first-only).
- [ ] Toggling a branch updates the affordance to `✓ X of N` and triggers a `POST /api/resort` whose response replaces `MOD_DB`, `SORTED_ORDER`, `MODS_LINE`, `WARNINGS` in app state.
- [ ] `WORKSHOP_ITEMS_LINE` is unchanged when branches toggle.
- [ ] `localStorage["sortof.branch.selections"]` is read on mount and written after every toggle, matching the §7 schema.
- [ ] A stored `mod_id` not present in the current `MOD_DB` for its wsid is dropped silently on hydrate.
- [ ] Multiple expanded panels can coexist (no auto-collapse on expand).
- [ ] Zero selected `mod_id`s for a wsid: affordance reads `✓ 0 of N`; row contributes nothing to `MODS_LINE` / `SORTED_ORDER`.
- [ ] When a wsid has ≥2 mod rows AND every row's `incompatible_mods=[]` AND user has not unticked any branch, an `ambiguous-multi-branch` (amber) WARNINGS entry is present; entry clears on first explicit user selection in that wsid (review #1).
- [ ] Eviction of a stored `mod_id` that empties a radio-mode wsid falls back to §4 default-first; never leaves a radio-mode wsid with zero selections (review #2).
- [ ] `/api/resort` request carries a client-side sequence number; responses older than the latest issued are discarded without state mutation (review #5).
- [ ] `/api/resort` 5xx response leaves prior state intact and surfaces a transient retry-able warning (review #11).
- [ ] Server drops unknown `selected_mod_ids` silently and logs at INFO; empty post-drop selection returns 400 (review #12).
- [ ] `colSpan` in `ModTable` references a single `COLUMN_COUNT` constant - not a hardcoded integer (review #7).
- [ ] `storage` event listener installed; cross-tab toggle of `sortof.branch.selections` syncs in-memory state and triggers exactly one `/api/resort` (review #8).
## 13. Test cases
1. **AuthenticZ canonical** - wsid `2335368829`, three rows, all `incompatible_mods=[]`. Expect: parent row `▾ 3 branches`, default = all ticked, mode = checkboxes. Untick two → `MODS_LINE` reflects one. Reload → selection persists.
2. **Cooperative pack** - wsid that ships 3 mods, all `incompatible_mods=[]`, deps reference each other. Expect: same affordance, default = all ticked, no behavior change for the user who never expands.
3. **Mutually exclusive 2-branch** - wsid where mod A's `incompatible_mods` lists mod B and vice versa. Expect: mode = radios, default = mod A only (first by `parsed_at, mod_id`).
4. **Persistence across reload** - pick a non-default subset, reload page; confirm hydration from `localStorage["sortof.branch.selections"]` restores the selection on next sort.
5. **Stored `mod_id` no longer exists (checkbox mode)** - manually inject a stored `mod_id` not in `MOD_DB`, reload. Expect: silent drop, no console error, default applies.
6. **Cross-wsid incompatibility** - mod A (wsid X) lists mod B (wsid Y) in `incompatible_mods`; both wsids have N=1. Expect: no picker UI, existing warning still surfaces.
7. **Zero-tick wsid** - untick all branches in a multi-branch wsid. Expect: parent row stays in `ModTable` with `✓ 0 of N`; no contribution to `MODS_LINE` / `SORTED_ORDER` / numeric counts.
8. **Radio-mode eviction-to-empty** (review #6) - wsid in radio mode has stored selection `[X]`; `X` is removed from `MOD_DB` (e.g., upstream cache invalidation), reload. Expect: silent drop, then default-first applied, radio invariant preserved.
9. **Default-all-ticked emits the safety warning** (review #1) - load AuthenticZ-canonical without expanding the row. Expect: a `tag:"ambiguous-multi-branch"` amber entry visible in WARNINGS. Untick one branch → entry disappears on next `/api/resort` response.
10. **Stale resort response discarded** (review #5) - issue toggle 1 (slow), then toggle 2 (fast) before #1 returns. Expect: only #2's response applied; #1 dropped on arrival.
11. **`/api/resort` 5xx** (review #11) - stub the endpoint to return 500; toggle a branch. Expect: prior state retained, transient red warning `couldn't recompute sort - try again` surfaced with retry control.
12. **Cross-tab sync** (review #8) - open two tabs, toggle in tab A. Expect: tab B receives `storage` event and re-runs `/api/resort` with the new selection.
13. **Unknown selected_mod_id from server perspective** (review #12) - POST `/api/resort` with `selected_mod_ids=["modoptions","ghostMod"]` where `ghostMod` isn't in `mod_parsed`. Expect: 200 with `ghostMod` silently absent from response; INFO log entry server-side. POST with all-ghost IDs → 400.


@@ -0,0 +1,87 @@
# Spec G-patch - Patch tier (Final Loads)
**Date:** 2026-04-30
**Status:** Draft (awaiting review)
**Sibling specs:** A (multi-branch picker), B (collection expansion + live progress), C (build context), D (dep "Add" button), E (precacher), F (folded into B), G (cleanups bundle - this spec carves a piece out)
## 1. Summary
Add a **"patch" tier** to the load-order calculation: mods explicitly authored or detected as patches sort *after* every non-patch mod, including those flagged `loadLast=on`. Implementation is a single new axis at the top of `mlos_sort._initial_sort_key` plus a heuristic in `derive_category`. No schema migration. No new endpoint. No backwards-incompat changes for existing mods.
## 2. Problem
The PZ load-order convention (and the user-supplied 37-bucket taxonomy, bucket 37 "Final Loads") treats compatibility patches and retextures-of-other-mods as a strictly-last tier - they have to load *after* `loadLast=on` map mods, because they intercept or override the things those mods install. Today our sort key has no such tier:
```
PREORDER → loadFirst → loadLast → category → in-category loadFirst → in-category loadLast → alpha
```
A `loadLast=on` map mod ends up in the same bucket as a patch, ordered alphabetically. Patches that need to override the map mod can land *before* it. Silent corruption - output looks valid, the wrong mod wins at runtime.
## 3. Detection rules
A mod is a patch iff **any** of these is true:
1. **Explicit:** `mod.info` contains `category=patch` (new value added to `RAW_CATEGORY_ORDER`).
2. **Author-tagged via sorting_rules.txt:** user-supplied `[modId]\ncategory=patch` overrides anything else (existing mechanism, no change).
3. **Name heuristic (conservative):** `mod.name` matches the case-insensitive regex `\b(patch|compat|compatibility)\b`. Examples that match: `BetterFlashlight Patch`, `BB Compatibility`, `RavenCreek - MoreSimpleClothing Compat`. Examples that **do not** match: `BugFixes`, `LittleTweaks`, `BalanceFix` - "fix" / "tweak" / "fixes" are too broad and would over-flag.
The first matching rule wins. The heuristic is intentionally narrow; mod authors who want to opt in should use rule 1.
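
A minimal sketch of the three detection rules follows. Only the regex is taken from this section; the function name and its parameters are hypothetical, and the precedence shown (a `sorting_rules.txt` category first, then an explicit `mod.info` category, then the name heuristic only when the category is unset or `undefined`) is one reading of rules 1-3 combined with §5.

```python
import re

PATCH_NAME_RE = re.compile(r"\b(patch|compat|compatibility)\b", re.IGNORECASE)


def is_patch(name, info_category=None, rules_category=None):
    """Decide patch-tier membership for one mod."""
    if rules_category is not None:
        # User-supplied sorting_rules.txt overrides everything else.
        return rules_category == "patch"
    if info_category not in (None, "undefined"):
        # An explicit mod.info category is trusted as-is.
        return info_category == "patch"
    # Conservative name heuristic, whole words only.
    return PATCH_NAME_RE.search(name) is not None
```

The `\b` anchors are what keep `BugFixes` and `BalanceFix` out; they also mean a run-together name with no word boundary before `Patch` is not caught by the heuristic and needs rule 1 or 2.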
## 4. Sort behavior
Insert a new axis **at position 0** (above `PREORDER`) of the sort tuple:
```
(is_patch, PREORDER, loadFirst, loadLast, category, in-cat loadFirst, in-cat loadLast, alpha)
```
`is_patch = 1` for patches, `0` otherwise. Tuple comparison guarantees patches sort after every non-patch mod regardless of every downstream axis. Within the patch group, the existing sub-keys still apply (a patch with `PREORDER=2`, e.g. `ModManagerServer-Patch`, still sorts second-among-patches).
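
Python compares tuples element by element, which is why a leading `is_patch` flag dominates every later axis. The sketch below uses a reduced three-element key (patch flag, `loadLast`, name) purely for illustration; the real `_initial_sort_key` has eight elements, and the dict shape is hypothetical.

```python
def reduced_sort_key(mod):
    """Illustrative subset of the sort tuple: (is_patch, loadLast, alpha)."""
    return (
        1 if mod["category"] == "patch" else 0,
        1 if mod["load_last"] else 0,
        mod["name"].lower(),
    )


mods = [
    {"name": "Eerie_County - Brita Compat", "category": "patch", "load_last": False},
    {"name": "Eerie_County", "category": "map", "load_last": True},
    {"name": "Brita", "category": "gameplay", "load_last": False},
]

ordered_names = [m["name"] for m in sorted(mods, key=reduced_sort_key)]
```

Here the `loadLast=on` map mod sorts after the ordinary mod but still ahead of the patch, even though the patch itself has `loadLast` off.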
## 5. Backend changes
- **`mlos_sort.py`:**
  - Append `"patch"` to `RAW_CATEGORY_ORDER` (so it's a valid `mod.category` value and topo sort treats it like any other category).
  - Extend `derive_category(mod)` with the §3 name heuristic, returning `"patch"` when matched and category is otherwise `undefined`.
  - Modify `_initial_sort_key`: prepend `1 if mod.category == "patch" else 0` as the new tuple element.
- **`adapters.py`:** extend `CAT_MAP` with `"patch": "patch"` so the frontend pill key is preserved (see §6).
- **`worker.py`:** no change. `mod.info` parsing already accepts arbitrary `category=…` values; once `"patch"` is in `CATEGORY_ORDER`, existing parser code passes it through unchanged.
- **No schema migration.** `mod_parsed.category` is already `TEXT NOT NULL DEFAULT 'undefined'` - `"patch"` fits without alteration.
## 6. Frontend changes
- **New pill** `patch` in the mod-table category column. Recommended palette: muted mauve / pale grey to distinguish from `gameplay` (the current default for tweaks-shaped mods) without competing for attention.
- **Pill is descriptive only** - sort position already telegraphs "this is a patch" since patches cluster at the bottom of the table. The pill is a quick visual confirmation, not a signal the user has to learn.
- **CSS** addition only (one rule): `.cat.patch { background: …; color: …; }`. No layout or component changes.
If the user prefers to skip the pill (5 buckets stays cleaner), the spec is satisfied without it; sort behavior is the load-bearing change.
## 7. Out of scope
- Detecting patches from Steam workshop tags (Steam's vocabulary has no canonical "Patch" tag - `Misc` and `Framework` are the closest, both too noisy to map).
- Multiple patch sub-tiers (e.g., "patches-of-patches"). YAGNI; the existing `loadAfter` mechanism handles ordering between two patches when needed.
- A `mod_parsed.is_patch` boolean column. Derived from `category` is sufficient and avoids a migration.
- Auto-detecting patches via mod content inspection (Lua module overrides, file collisions). Heuristics only.
## 8. Acceptance criteria
- [ ] `mlos_sort._initial_sort_key` returns an 8-element tuple with `is_patch` (0 or 1) at index 0.
- [ ] `RAW_CATEGORY_ORDER` includes `"patch"`.
- [ ] `derive_category` returns `"patch"` when the name regex `\b(patch|compat|compatibility)\b` (case-insensitive) matches and `mod.info`'s `category=` is unset or `undefined`.
- [ ] Explicit `category=patch` in `mod.info` is honored by the existing parser (no parser change required).
- [ ] `sorting_rules.txt` `category=patch` override forces a mod into the patch tier.
- [ ] In a sort with one `loadLast=on` map mod and one patch, the patch sorts *after* the map mod in `SORTED_ORDER`.
- [ ] In a sort with two patches, alphabetical ordering applies between them (existing alpha tiebreaker preserved).
- [ ] In a sort with no patches, `SORTED_ORDER` is bit-identical to pre-spec output (`is_patch=0` for all rows preserves existing total ordering).
- [ ] `MOD_DB` rows for patches carry `cat: "patch"` once `adapters.CAT_MAP` is extended.
## 9. Test cases
1. **Explicit patch via mod.info** - wsid X has `category=patch`. Expect: sorts last regardless of `loadLast`. `MOD_DB.cat = "patch"`.
2. **Heuristic match** - mod named `BB Compatibility Patch`, no explicit category. Expect: detected as patch, sorted last.
3. **Heuristic miss (intentional)** - mods named `BugFixes`, `LittleTweaks`, `BalanceFix`. Expect: NOT in patch tier.
4. **Patch + loadLast map mod** - input: a `loadLast=on` map mod (`Eerie_County`) and a patch (`Eerie_County - Brita Compat`). Expect: `Eerie_County` precedes the patch in `SORTED_ORDER`.
5. **Two patches** - `AAA-Compatibility` and `ZZZ-Patch`. Expect: alphabetical order preserved within the tier.
6. **No patches in input** - sort identical to current behavior; regression test against a saved canonical fixture (e.g. `2169435993;2392709985;2487022075`).
7. **`sorting_rules.txt` override** - user supplies `[Some_Mod]\ncategory=patch`; expected to force into tier even if name doesn't match heuristic and `mod.info` doesn't declare it.
8. **Patch with PREORDER mod_id** - hypothetical `ModManagerServer-Patch` (mod_id matches PREORDER table). Expect: still sorts within the patch tier (last), but among patches uses PREORDER=2 sub-ordering.


@@ -0,0 +1,174 @@
# Spec C — Build context + dep Add + auto-disambiguation rules
> **Lineage:** sits on top of Spec A (multi-branch picker) and Spec B+F (collection expansion / live drain). Adds context-aware default selection. Does **not** modify the picker contract — Spec A §8 ownership still holds.
## §1 Overview
Three loosely-related improvements that share the same core: the system has more context than it has been using. The user already tells us their PZ build via the `pzBuild` localStorage toggle. The user already gives us a list of mod_ids (via the wsids in their input). The system should consult both before deciding which branches of a multi-branch wsid land in `MODS_LINE` by default.
**Goal:** smarter pre-ticked boxes in the multi-branch picker. **Non-goal:** any form of "magic" sort that emits a branch the user didn't see.
## §2 Build context (`pzBuild`)
The frontend stores `sortof.pzBuild` in localStorage with values `"B41" | "B42"`, default `"B41"`. It already drives `MODS_LINE` rendering (B42 prefixes mod_ids with `\`).
This spec extends `pzBuild` to:
- Travel with `/api/sort` POST body as `pz_build: "B41" | "B42"`. The backend defaults to `"B42"` when missing or invalid.
- Inform Rule A (§4.3) of auto-disambiguation.
`pz_build` is **not** sent on `/api/resort` — the resort flow uses an explicit mod_id list and never re-evaluates rules.
## §3 Dep Add (already shipped)
Documenting the existing behavior so it's part of the locked design.
When `mlos_sort` reports a missing requirement (mod A requires mod B, B is not in the user's enabled set), `build_warnings` enriches each `tag: "missing"` warning with one of:
- `actions: [{type: "add-wsid", wsid, modId, label}]` — when `mod_parsed` has a row with `mod_id == B` (using `DISTINCT ON (mod_id) ORDER BY parsed_at_time_updated DESC`)
- `actions: [{type: "search-workshop", modId, url, label}]` — when no cache hit; URL is `https://steamcommunity.com/workshop/browse/?appid=108600&searchtext=<modId>`
The frontend renders `[add modId]` as a filled blue chip; clicking appends the wsid to the input textarea AND auto-resorts (no separate sort click needed). The search variant is `[↗ find modId]` and opens Steam's search in a new tab.
## §4 Auto-disambiguation rules
### §4.1 Design principle (locked)
These rules **adjust which boxes are pre-ticked in the picker**. They never bypass the picker and never silently emit a branch the user didn't see. Spec A §8 ownership holds — the picker is the source of truth.
The rules are applied at `/api/sort` time, before `MODS_LINE` is composed. The response's `MOD_DB` always contains every cached branch (so the picker can offer them); `SORTED_ORDER` and `MODS_LINE` reflect only the auto-selected set. The frontend reads the picker default from `SORTED_ORDER` membership.
### §4.2 Order of evaluation per wsid
```
A → C → B
```
The first rule that single-ticks a branch wins; subsequent rules emit warnings only. Exception: A and C are orthogonal (build × addon) and both may tick the same wsid simultaneously — see §4.7.
If a wsid is **coordinated** (any branch references a sibling via `requirements` / `loadAfter` / `loadBefore`) or **radio** (any branch lists a sibling in `incompatibleMods`), it is **exempt from rules A/B/C**. Coordinated → all branches stay; radio → first only.
### §4.3 Rule A — build-aware default *(highest ROI)*
A branch is **B41-flavored** if `mod_id`:
- ends with `B41` (case-insensitive), or
- contains `_legacy_` followed by version digits (e.g., `_legacy_42_12`, `_legacy_41_*`)
A branch is **B42-flavored** if `mod_id`:
- ends with `B42` or `_b42` (case-insensitive), or
- contains `_b42_` (e.g., `vac_mod_b42_utils`)
A branch is **un-flavored** otherwise.
Apply: if exactly one branch is flavored to match the active `pz_build` AND no other branch shares that flavor, pre-tick that one. Un-flavored branches are treated as "the build the author considers default" — currently always B42.
If no branch matches the active build (e.g., user is on B41 but every branch is B42-flavored), fall through to Rule C / B and emit a `build-mismatch` warning: `"no <build> variant for <name> (<wsid>); using author default"`.
Rule A also fires when the wsid has **exactly two branches** and one is `B41`-flavored, one is unflavored — the unflavored is treated as B42 default. So a B42 user gets the unflavored, a B41 user gets the B41-flavored.
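
The flavor classification and the single-tick decision can be sketched as below. The function names are hypothetical, and two points are assumptions where the text leaves room: the B41 patterns are checked before the B42 ones, and un-flavored branches are counted as B42 for every wsid (the text states this explicitly only for the two-branch case).

```python
import re

# Patterns follow the bullet lists above; matching is case-insensitive.
_B41_RE = re.compile(r"(b41$|_legacy_\d)", re.IGNORECASE)
_B42_RE = re.compile(r"(b42$|_b42_)", re.IGNORECASE)


def flavor(mod_id):
    """Return 'B41', 'B42', or None for an un-flavored mod_id."""
    if _B41_RE.search(mod_id):
        return "B41"
    if _B42_RE.search(mod_id):
        return "B42"
    return None


def rule_a_pick(mod_ids, pz_build):
    """Return the single branch to pre-tick, or None to fall through to C / B."""
    effective = {m: flavor(m) or "B42" for m in mod_ids}  # un-flavored => B42
    matches = [m for m in mod_ids if effective[m] == pz_build]
    return matches[0] if len(matches) == 1 else None
```

A `None` result covers both "no branch matches the active build" and "several branches match"; the caller would still need to distinguish the two to emit `build-mismatch` only in the first case.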
### §4.4 Rule B — prefix-base tiebreaker
When the auto-pick path needs to pick *one* branch (Rule A didn't single-tick, Rule C didn't fire), use the **shortest mod_id that is a strict prefix of every other branch's mod_id** instead of alphabetical-first.
**Strict prefix definition:** `A` is a strict prefix of `B` iff:
1. `A` ≠ `B`, AND
2. `B.startswith(A)`, AND
3. The character at position `len(A)` in `B` is a non-lowercase-letter — separator (`_` `-` ` `), digit (`0`-`9`), or uppercase letter (`A`-`Z`).
Boundary regex: `^<A>([_\- ]|[A-Z]|[0-9]|$)`.
**Examples that qualify:**
- `ArmoredVests` → `ArmoredVestsPatch` (boundary: `P`)
- `ToadTraits` → `ToadTraitsDisablePrepared` (boundary: `D`)
- `LitSortOGSN` → `LitSortOGSN_chocolate` (boundary: `_`)
- `WaterDispenser` → `WaterDispenser2` (boundary: `2`)
**Examples that do NOT qualify:**
- `Foo` vs `Foobar` (boundary: `b`, lowercase continuation)
**Deliberate edge case that DOES qualify:**
- `Lit` vs `LitSort` (boundary: `S`, uppercase), so `Lit` becomes the prefix-base.
If multiple branches are mutual prefixes (impossible by definition) or if no branch is a prefix-of-all-others, fall back to alphabetical-first by `mod_id`.
Rule B still emits the `auto-picked-branch` warning (Spec A §4) and renders the click-to-expand picker buttons — the only behavior change is the choice of which branch wins the auto-pick.
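
The strict-prefix test and the base pick translate directly into code. The function names are hypothetical; the three conditions are the ones listed in §4.4, and the fallback uses Python's default string ordering (code-point order) as a stand-in for "alphabetical-first".

```python
def is_strict_prefix(a, b):
    """Section 4.4: a != b, b starts with a, and the next char is a boundary."""
    if a == b or not b.startswith(a):
        return False
    boundary = b[len(a)]
    return (
        boundary in "_- "
        or "0" <= boundary <= "9"
        or "A" <= boundary <= "Z"
    )


def prefix_base(mod_ids):
    """Pick the branch that is a strict prefix of every other branch."""
    for candidate in sorted(mod_ids, key=len):
        others = [m for m in mod_ids if m != candidate]
        if all(is_strict_prefix(candidate, other) for other in others):
            return candidate
    return min(mod_ids)  # fallback: first by plain string ordering
```

The explicit `a == b` check covers condition 1, so the `$` alternative in the boundary regex is not needed in this form.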
### §4.5 Rule C — input cross-reference *(solves Jeeve's Patches)*
For each ambiguous branch whose mod_id matches the pattern `<base>_<TOKEN>` (i.e., contains an underscore, with a base prefix shared with at least one sibling branch):
1. Extract `<TOKEN>` (the part after the last `_`).
2. Look up `<TOKEN>` against the resolved mod_id set from the user's input — case-insensitive substring match against any cached `mod_parsed.mod_id` whose wsid is in the user's input.
3. Match requires `<TOKEN>` length ≥ 3 (avoid `_a`, `_x` false positives).
**Hit:** pre-tick that branch alongside the base branch.
**No hits on any suffix-tokened branch:** tick the base branch only and emit `unmatched-addons` warning listing the unticked branches by name.
The "base branch" inside a Rule-C wsid is determined by Rule B (prefix-base tiebreaker), with alphabetical fallback.
**Worked example: Jeeve's Patches** (wsid 3684025083, branches `JeevesPatches`, `JeevesPatches_AZ`, `JeevesPatches_DAMN`, `JeevesPatches_GGS`, `JeevesPatches_ISA`, `JeevesPatches_PlayerStatus`, `JeevesPatches_Spongie`, `JeevesPatches_Tanker`, `JeevesPatches_Towing`, `JeevesPatches_Vanilla`, `JeevesPatches_ZRE`):
- Base = `JeevesPatches` (Rule B: prefix-of-others).
- Tokens: `AZ`, `DAMN`, `GGS`, `ISA`, `PlayerStatus`, `Spongie`, `Tanker`, `Towing`, `Vanilla`, `ZRE`.
- User submits `AuthenticZ`-related wsids → some cached mod_id contains `AZ` (case-insensitive substring) → tick `JeevesPatches_AZ`.
- User submits `JeevesIntegration` → no token hits → tick `JeevesPatches` only, emit `unmatched-addons` warning.
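
A minimal sketch of the token lookup, under stated assumptions: the function name and return shape are hypothetical, sibling branches of the same wsid are excluded from the lookup set so a branch cannot match itself, and the warning is emitted only when no suffix-tokened branch hits. The mod_id `FuelTankerTruck` in the usage lines is invented for illustration. Under the length ≥ 3 rule as written, a two-character token such as `AZ` is never eligible in this sketch.

```python
def rule_c_ticks(base, branches, input_mod_ids):
    """Return (ticked_branches, warning_or_None) for one wsid."""
    siblings = set(branches)
    # Lookup set: resolved mod_ids from the user's input, minus this wsid's own.
    haystack = [m.lower() for m in input_mod_ids if m not in siblings]

    ticked, unmatched = [base], []
    for branch in branches:
        if branch == base or not branch.startswith(base + "_"):
            continue
        token = branch.rsplit("_", 1)[1]  # part after the last underscore
        if len(token) >= 3 and any(token.lower() in m for m in haystack):
            ticked.append(branch)
        else:
            unmatched.append(branch)

    warning = None
    if len(ticked) == 1 and unmatched:
        warning = {"tag": "unmatched-addons", "branches": unmatched}
    return ticked, warning


branches = ["JeevesPatches", "JeevesPatches_Tanker", "JeevesPatches_Towing", "JeevesPatches_AZ"]
hit, hit_warning = rule_c_ticks("JeevesPatches", branches, branches + ["FuelTankerTruck"])
miss, miss_warning = rule_c_ticks("JeevesPatches", branches, branches + ["JeevesIntegration"])
```

In the first call only `JeevesPatches_Tanker` joins the base; in the second nothing matches, so the base is ticked alone and the warning lists the three unticked branches.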
### §4.6 Rules D + G — hint text only
When the picker renders an ambiguous wsid's branches, attach a per-branch `hint` field surfaced in the picker row. **No state mutation.** Hints are:
| Suffix pattern | Hint text |
|---|---|
| `_Lite`, `_Light` | "lighter alternate variant — pick one" |
| `_HD`, `_DetailsHD` | "high-resolution variant" |
| `_NoCE`, `_NoVanilla`, `_FarmDisable`, `_Disable*` | "opt-out variant" |
| `_USDM`, `_Imports`, `_Exotics`, `_RealNames` | "alternate variant — pick one" |
| `_v\d+(_\d+)*`, `_legacy_*` | "legacy build — usually not what you want" |
| `_AZ`, `_DAMN`, `_GGS`, etc. (Rule C suffixes that didn't match input) | "addon for `<TOKEN>` — only if you have it" |
Pattern matching is case-insensitive with anchored end-of-string for short tags. Multiple hints on one branch concatenate with " · ".
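One way to express the hint lookup is a list of compiled patterns checked in order. The sketch below covers only two rows of the table; `HINT_RULES` and `hint_for` are hypothetical names, and reading `_Disable*` as "`_Disable` followed by anything" is an assumption.

```python
import re
from typing import List, Tuple

# Subset of the hint table; the remaining rows would follow the same shape.
HINT_RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"_(?:HD|DetailsHD)$", re.IGNORECASE), "high-resolution variant"),
    (
        re.compile(r"_(?:NoCE|NoVanilla|FarmDisable|Disable.*)$", re.IGNORECASE),
        "opt-out variant",
    ),
]


def hint_for(mod_id: str) -> str:
    """Join every matching hint with ' · '; empty string when nothing matches."""
    return " · ".join(text for pattern, text in HINT_RULES if pattern.search(mod_id))
```

Because the function only returns a string, it keeps the "no state mutation" property: the caller decides where the hint is attached.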
### §4.7 Edge cases
**A and C both fire and disagree** (build hint says B42, input cross-ref says addon `_AZ`): both tick. Build × addon are orthogonal axes; they don't compete for the same slot.
**A says no match (build-mismatch), C also fires:** C still fires; the build-mismatch warning surfaces alongside C's selections. Rule A's "fall back to author default" doesn't override C.
**Coordinated wsid where one branch happens to be B42-flavored and another B41-flavored:** the wsid is exempt from A/B/C (coordinated detection runs first). All branches stay regardless of build mismatch. This is the right answer because coordinated branches by definition need each other.
**Stored selections** in localStorage from previous sessions: the existing `runResort(branchSelections)` flow fires *after* the initial sort response is rendered. User's stored selections override the rules. Rules only determine the initial render's pre-ticked state for never-touched wsids.
**`pz_build` missing or invalid in request:** backend treats as `"B42"`. Forwards-compatible if frontend on a stale build doesn't send it.
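The missing-or-invalid fallback might be handled by a small normalizer like this one. It is a sketch: `normalize_pz_build` and `KNOWN_BUILDS` are hypothetical names, the set of valid builds is assumed to be `B41` and `B42`, and accepting lowercase input is an assumption as well.

```python
from typing import Optional

KNOWN_BUILDS = {"B41", "B42"}  # assumption: the only builds the rules distinguish
DEFAULT_BUILD = "B42"


def normalize_pz_build(raw: Optional[str]) -> str:
    """Return a known build label, defaulting to B42 for missing or invalid input."""
    if isinstance(raw, str):
        candidate = raw.strip().upper()
        if candidate in KNOWN_BUILDS:
            return candidate
    return DEFAULT_BUILD
```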
## §5 Implementation notes
**Backend** (`adapters.py` + `app.py`):
- `SortRequest` gains `pz_build: Optional[str]`.
- `_autopick_ambiguous(mods)` is renamed `_apply_branch_rules(mods, *, pz_build, input_modids)` and replaces the alphabetical-first picker with the rule pipeline above.
- Returns `(drop_ids, warnings, hints)`; `hints` is `Dict[mod_id -> str]` consumed by `build_response` and attached to MOD_DB entries as `hint?: string`.
- `sort_endpoint` computes `input_modids = set(by_id.keys())` (the cached mod_ids) and passes alongside `pz_build`.
**Frontend** (`sortof-app.jsx`):
- `onSort` POST body adds `pz_build: pzBuild`.
- `defaultSelectionForBranches(branches, activeSet)` accepts an `activeSet: Set<modId>` (the union of `D.SORTED_ORDER` mod_ids). Returns `branches.filter(b => activeSet.has(b.modId)).map(b => b.modId)`, falling back to `[branches[0].modId]` if none match.
- All call sites of `defaultSelectionForBranches` pass `new Set(D.SORTED_ORDER || [])`.
- `BranchPicker` renders `branch.hint` when present, as a small italic line under the mod_id.
The frontend doesn't reimplement the rules — it simply reflects the backend's chosen `SORTED_ORDER`. Same data path keeps the picker as source of truth and makes the rules introspectable from a curl response.
## §6 Acceptance criteria
- [ ] `POST /api/sort` accepts `pz_build` and defaults to `"B42"` when omitted.
- [ ] B42 user submits a wsid with branches `Foo` (un-flavored) and `FooB41`: `MODS_LINE` includes `Foo` only. B41 user gets `FooB41`.
- [ ] User submits Jeeve's Patches wsid alone: `MODS_LINE` is `JeevesPatches`. WARNINGS includes `unmatched-addons` listing the 10 unticked branches.
- [ ] User submits Jeeve's Patches + a wsid whose mod_id is `AuthenticZ_Current`: `MODS_LINE` includes `JeevesPatches` and `JeevesPatches_AZ`.
- [ ] Wsid `1962761540` (`ArmoredVests`, `ArmoredVestsPatch`): auto-pick selects `ArmoredVests` (prefix-base). Alphabetical-first gives the same answer here, so also verify against a wsid where the two differ.
- [ ] Coordinated wsid `2791656602` (fhqMotoriusZone): all 5 branches stay in `MODS_LINE` regardless of `pz_build`.
- [ ] Picker UI shows hint text per branch for D/G-class suffixes.
- [ ] `WORKSHOP_ITEMS_LINE` matches `wsids[]` regardless of which branches got ticked (Spec A §8 unchanged).
## §7 Test recipes
1. **Build A — B42 default.** `pz_build=B42` + wsid with branches `[Foo, FooB41]`. Expect `MODS_LINE = "Foo"`, no `build-mismatch` warning.
2. **Build A — B41 + only-B42-variants → mismatch warning.** `pz_build=B41` + wsid where every branch is B42-flavored. Expect `build-mismatch` warning, fall through to Rule B.
3. **Rule C — Jeeve's alone.** Submit only Jeeve's Patches wsid. Expect MODS_LINE = `JeevesPatches`, `unmatched-addons` warning lists the 10 others.
4. **Rule C — Jeeve's + AuthenticZ.** Submit Jeeve's wsid + AuthenticZ wsid. Expect MODS_LINE includes `JeevesPatches` and `JeevesPatches_AZ`.
5. **Rule B — prefix-base picks non-alphabetical.** Find a wsid where alphabetical-first ≠ prefix-base; verify prefix-base wins.
6. **Hint text — `_Lite` / `_legacy_*`.** Picker row shows the appropriate hint string.
7. **Coordinated exemption.** fhqMotoriusZone: all 5 branches in MODS_LINE for both B41 and B42 users.
8. **Stored-selection override.** User has `branchSelections[wsid] = [explicit]` from previous session. Rules don't override on next sort — runResort runs with the explicit selection after initial render.

@@ -0,0 +1,270 @@
# Spec B+F - Collection URL/ID expansion + live drain progress
**Date:** 2026-05-01
**Status:** Draft (awaiting review)
**Sibling specs:** A multi-branch picker (shipped); C+D build-context + dep-add (next); E precacher (parallel); G cleanups + patch tier.
**Folds:** Original Spec F (live drain progress) merges in here - a 50+ mod cold load is exactly when live counters matter, and both features share the polling endpoint.
**Schema notes (corrections to design source text):**
- `download_jobs.status` enum is `queued | downloading | done | failed`. The design text used `running`; this spec uses the actual value `downloading`. UX label may render as "draining" for cohesion with the lifecycle vocabulary; the SQL keys off `downloading`.
- The existing `collections` table (`init/01_schema.sql`) has columns `collection_id PK, title, child_workshop_ids TEXT[], last_fetched_at TIMESTAMPTZ`. There is **no `expires_at` column**. TTL is computed at read time as `last_fetched_at + interval '6 hours'`; no schema change for that.
---
## §1 Overview
Today, sortof accepts one input shape: a blob of newline/`;`-delimited workshop IDs. Anything that isn't a 7-12 digit number is dropped by `parse.parse_workshop_input`. A pasted Steam Workshop *collection* URL embeds exactly one ID, so it currently surfaces as a single mod, fails parse (`process_one=no_mod_info`), and lands in the `non_mod` bucket added by the recent unknown/non-mod feature. The user is expected to drag every child mod's ID out by hand.
This spec adds:
1. **Collection URL/ID expansion.** The API recognizes Steam Workshop URLs and resolves collection IDs to their child wsids via `ISteamRemoteStorage/GetCollectionDetails`. Cached in the existing `collections` table.
2. **Async job pipeline.** Any input containing a collection or any uncached wsid creates a `sort_jobs` row, returns a `job_id`, and the frontend polls `GET /api/jobs/{job_id}` every 2.5s until `done|failed`.
3. **Live counters.** During `expanding | queued | draining`, the poll response carries fresh `cached / queued / draining` counts plus an incremental `result_json`. The status strip animates instead of going stale.
Synchronous response is preserved for the all-cached fast path (Open Q1, §10).
## §2 API contract
### 2.1 `POST /api/sort` - polymorphic on input
Request body unchanged: `{ "input": str, "rules": str? }`. Response shape branches on what's in `input`:
```jsonc
// Path A: bare wsid list, all in cache (current behavior, unchanged)
{ "status": "success", "MOD_DB": [...], "MODS_LINE": "...", ... }
// Path B: bare wsid list with ≥1 uncached, OR ≥1 collection URL
{ "status": "queued" | "expanding", "job_id": "<uuid>" }
```
The frontend branches on the presence of `job_id`. Old clients that don't poll silently get the original sync response when their input is fully warm.
### 2.2 `GET /api/jobs/{job_id}` - polling endpoint
Response (any phase):
```jsonc
{
  "job_id": "<uuid>",
  "phase": "expanding" | "queued" | "draining" | "done" | "failed",
  "counts": { "cached": int, "queued": int, "draining": int },
  "wsids": [str, ...] | null,             // null while phase=expanding; populated after
  "result": { ...SORTOF_DATA... } | null, // partial during draining; final on done
  "failure_reason": str | null            // populated only on phase=failed
}
```
`404` if the `job_id` is unknown or expired (TTL in §3).
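For scripted checks outside the browser, the polling contract can be exercised with a loop like the one below. This is a sketch, not the frontend implementation: `poll_job` is a hypothetical helper, and `fetch_job` stands in for whatever HTTP call returns the decoded `GET /api/jobs/{job_id}` body.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_PHASES = {"done", "failed"}


def poll_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.5,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 1000,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal phase and return the last payload."""
    for _ in range(max_polls):
        payload = fetch_job(job_id)
        if payload["phase"] in TERMINAL_PHASES:
            return payload
        sleep(interval_s)  # wait one polling interval before the next GET
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Injecting `fetch_job` and `sleep` keeps the loop testable without a running API; a caller would still need to map a `404` response to the "job expired" case itself.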
### 2.3 `DELETE /api/jobs/{job_id}` - cancel
Marks the job `failed` with `failure_reason="cancelled"`. Returns `204`. Idempotent: deleting an already-terminal job is a no-op `204`. Does **not** cancel underlying `download_jobs` rows (Open Q6, §10).
## §3 Schema
New table:
```sql
CREATE TABLE IF NOT EXISTS sort_jobs (
    job_id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    phase             TEXT NOT NULL CHECK (phase IN ('expanding','queued','draining','done','failed')),
    phase_started_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    created_at        TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at        TIMESTAMPTZ NOT NULL DEFAULT now(),
    input_raw         TEXT NOT NULL,
    collection_ids    TEXT[] NOT NULL DEFAULT '{}',
    wsids             TEXT[],      -- null until expansion resolves
    rules_raw         TEXT,
    result_json       JSONB,       -- incremental partials while draining; final result on done
    failure_reason    TEXT
);
CREATE INDEX IF NOT EXISTS sort_jobs_phase_idx ON sort_jobs (phase);
CREATE INDEX IF NOT EXISTS sort_jobs_updated_idx ON sort_jobs (updated_at);
```
- **TTL:** rows with `phase ∈ (done, failed)` whose `updated_at` is more than 24h old are eligible for deletion. The cleanup script lives in Spec G; this spec only requires that the schema support it.
- **`updated_at` trigger:** mirror the existing `download_jobs.touch_updated_at` pattern.
- **Migration plan:** `init/02_sort_jobs.sql` for fresh deploys + a one-shot `psql -f` for the live DB. No data migration; pure additive.
The existing `collections` table is reused as-is (4 columns, see corrections at top). No `expires_at` column; freshness derived from `last_fetched_at`.
## §4 Phase state machine
```
/api/sort with collections only
       │
       ▼
┌──────────────┐
│  expanding   │
└──────┬───────┘
       │ GetCollectionDetails OK; wsids = collections + bare ids
       ▼
┌──────────────┐ ←── /api/sort with bare uncached wsids
│    queued    │ ─── all wsids in mod_parsed (skip drain) ──┐
└──────┬───────┘                                            │
       │ first download_jobs row → downloading              │
       ▼                                                    │
┌──────────────┐                                            │
│   draining   │                                            │
└──────┬───────┘                                            │
       │ all wsids resolved (mod_parsed has rows)           │
       ▼                                                    ▼
┌──────────────┐                                     ┌──────────────┐
│     done     │                                     │     done     │
└──────────────┘                                     └──────────────┘

Failure terminal at any phase: failed (with phase_at_failure stored in failure_reason prefix).
```
Phase transitions are **monotonic**: `expanding → queued → draining → done`. No backward transitions. A job's phase only advances; the API computes phase fresh on each `GET` rather than mutating it on every event (simpler, no leader needed).
Phase computation rule (executed inside `GET /api/jobs/{job_id}`):
```
if phase in (done, failed):          return as-stored
if wsids is null:                    phase = expanding
elif counts.draining > 0:            phase = draining
elif counts.queued > 0:              phase = queued
elif counts.cached >= len(wsids):    phase = done; persist result_json
else:                                phase = queued   # transient gap between rows
```
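The derivation rule translates almost line for line into a pure function. The sketch below assumes a hypothetical `derive_phase` helper that receives the stored phase, the job's `wsids`, and the live `counts` dict; persisting `result_json` on the `done` transition is left to the caller.

```python
from typing import Dict, List, Optional


def derive_phase(
    stored_phase: str,
    wsids: Optional[List[str]],
    counts: Dict[str, int],
) -> str:
    """Compute the phase reported by GET /api/jobs/{job_id}."""
    if stored_phase in ("done", "failed"):
        return stored_phase  # terminal phases are returned as stored
    if wsids is None:
        return "expanding"
    if counts["draining"] > 0:
        return "draining"
    if counts["queued"] > 0:
        return "queued"
    if counts["cached"] >= len(wsids):
        return "done"  # caller persists result_json at this point
    return "queued"  # transient gap between rows
```

Keeping the function free of database access makes each branch of the rule easy to unit test with hand-written counts.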
## §5 Steam expansion
### 5.1 Detection
The current `parse.parse_workshop_input` strips ini-style prefixes and extracts `\b\d{7,12}\b`. We add a sibling `parse.parse_with_collections(text) -> (wsids: list, collection_ids: list)`:
- Match Steam URLs `https?://steamcommunity\.com/(?:sharedfiles|workshop)/filedetails/\?id=(\d{7,12})` and capture the ID.
- Bare numeric IDs (the existing pattern) remain `wsids`.
- A URL-form ID is classified as a *candidate collection*. We don't know syntactically whether a wsid is a collection vs a mod - so candidate collection IDs are sent to `GetCollectionDetails` first; if the API reports them as actual mods (not collections), they fall back to the wsids list.
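A minimal sketch of the detection step follows. It assumes the signature given above and omits the ini-prefix stripping that the existing parser performs; the dedup behavior and the choice to drop a bare ID that also appears in URL form are assumptions, not decisions made by this spec.

```python
import re
from typing import List, Tuple

URL_RE = re.compile(
    r"https?://steamcommunity\.com/(?:sharedfiles|workshop)/filedetails/\?id=(\d{7,12})"
)
BARE_ID_RE = re.compile(r"\b\d{7,12}\b")


def parse_with_collections(text: str) -> Tuple[List[str], List[str]]:
    """Split input into (bare wsids, candidate collection ids), order-preserving."""
    candidates = list(dict.fromkeys(URL_RE.findall(text)))
    # Blank out matched URLs so their ids are not counted again as bare wsids.
    remainder = URL_RE.sub(" ", text)
    bare = dict.fromkeys(BARE_ID_RE.findall(remainder))
    wsids = [w for w in bare if w not in candidates]
    return wsids, candidates
```

Candidates returned here are only *possible* collections; the fallback for IDs that turn out to be ordinary mods happens after the Steam call.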
### 5.2 Resolution
Single batched call per `/api/sort` with ≥1 candidate:
```
POST https://api.steampowered.com/ISteamRemoteStorage/GetCollectionDetails/v1/
collectioncount=N
publishedfileids[0..N-1]=...
```
Per-collection in the response: `result==1` and `children[]` populated → expand to `[c.publishedfileid for c in children]`. `result!=1` → mark in result warnings as `{tag:"collection-partial", level:"warning", msg:"collection X could not be fetched"}`; keep the job alive with whatever resolved. (Open Q3, §10.)
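The request fields and the per-collection handling could be sketched as follows. `build_form` and `expand_collections` are hypothetical helpers; the nesting of details under `response.collectiondetails` is an assumption about the decoded response body, and the fallback for candidates that are really mods (§5.1) is not modeled here.

```python
from typing import Any, Dict, List, Tuple


def build_form(collection_ids: List[str]) -> Dict[str, str]:
    """Form fields for one batched GetCollectionDetails call."""
    form = {"collectioncount": str(len(collection_ids))}
    for i, cid in enumerate(collection_ids):
        form[f"publishedfileids[{i}]"] = cid
    return form


def expand_collections(
    payload: Dict[str, Any],
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return (child wsids, warnings) from a decoded response body."""
    wsids: List[str] = []
    warnings: List[Dict[str, str]] = []
    # Assumption: per-collection details live under response.collectiondetails.
    for detail in payload.get("response", {}).get("collectiondetails", []):
        cid = detail.get("publishedfileid", "?")
        children = detail.get("children") or []
        if detail.get("result") == 1 and children:
            wsids.extend(str(c["publishedfileid"]) for c in children)
        else:
            warnings.append({
                "tag": "collection-partial",
                "level": "warning",
                "msg": f"collection {cid} could not be fetched",
            })
    # Deduplicate children that appear in more than one collection.
    return list(dict.fromkeys(wsids)), warnings
```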
### 5.3 Caching
Hit on `collections` row where `last_fetched_at > now() - interval '6 hours'`:
- Skip the API call entirely.
- Use cached `child_workshop_ids` directly.
Miss / stale → call API, UPSERT into `collections`, then proceed. The `last_fetched_at = now()` write is the cache write.
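The read-time freshness check is small enough to show in full. This sketch assumes a hypothetical `is_collection_fresh` helper operating on timezone-aware datetimes; in the real code the comparison may well live in the SQL `WHERE` clause instead.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

COLLECTION_TTL = timedelta(hours=6)


def is_collection_fresh(
    last_fetched_at: Optional[datetime],
    now: Optional[datetime] = None,
) -> bool:
    """True when the cached row is younger than the 6-hour TTL."""
    if last_fetched_at is None:
        return False  # no row cached yet counts as a miss
    now = now or datetime.now(timezone.utc)
    return last_fetched_at > now - COLLECTION_TTL
```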
### 5.4 Flakiness
One internal retry with 2s backoff on HTTP error or `result!=1` for a candidate. Once that retry is exhausted, the candidate is reported as collection-partial (warning) but the job continues with whatever else resolved. (Open Q4, §10.)
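The single-retry policy might be wrapped in a helper like this. It is a sketch: `call_with_one_retry` is a hypothetical name, `attempt` is assumed to return `None` when Steam reports `result!=1`, and transport failures are assumed to surface as `OSError` (which `urllib.error.URLError` subclasses).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_one_retry(
    attempt: Callable[[], Optional[T]],
    backoff_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Run attempt(); on failure wait backoff_s and try exactly once more."""
    for tries_left in (1, 0):
        try:
            result = attempt()
        except OSError:
            result = None  # treat transport errors like an unresolved candidate
        if result is not None:
            return result
        if tries_left:
            sleep(backoff_s)
    return None  # caller reports the candidate as collection-partial
```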
## §6 Counts contract
Computed live on every `GET /api/jobs/{job_id}` against the job's `wsids[]`:
```sql
-- counts.cached
SELECT COUNT(DISTINCT mp.workshop_id)
FROM mod_parsed mp
JOIN workshop_meta wm ON wm.workshop_id = mp.workshop_id
WHERE mp.workshop_id = ANY($1::text[])
AND mp.parsed_at_time_updated = wm.time_updated;
-- counts.queued
SELECT COUNT(DISTINCT workshop_id)
FROM download_jobs
WHERE workshop_id = ANY($1::text[]) AND status = 'queued';
-- counts.draining (status='downloading' in DB; surfaced as 'draining' in API/UI)
SELECT COUNT(DISTINCT workshop_id)
FROM download_jobs
WHERE workshop_id = ANY($1::text[]) AND status = 'downloading';
```
Ownership precedent (Spec A §8): once a job is created, `wsids[]` is **locked**. `WORKSHOP_ITEMS_LINE` in the final `result_json` is computed from `sort_jobs.wsids[]`, **not** recomputed against current `mod_parsed`. This means a wsid that was in the input but is currently `non_mod` or `unknown` still appears in `WORKSHOP_ITEMS_LINE` in the same position - matching the locked contract from Spec A.
## §7 Frontend behavior
Status strip during polling:
| Phase | Strip text |
|---|---|
| `expanding` | `expanding collection…` (animated dot, no counts visible) |
| `queued` | `X cached · Y queued · 0 draining` (animated dots on queued) |
| `draining` | `X cached · Y queued · Z draining` (animated dots on queued + draining) |
| `done` | strip collapses, full result rendered |
| `failed` | red banner with `failure_reason` + Retry button |
Polling: `setInterval` at 2.5s, started on receiving `job_id`. Stops on `phase ∈ (done, failed)`. On `404` (job expired/garbage-collected): show "this job expired - re-submit?" toast; offer one-click resubmit using cached input (the textarea is still populated).
Cancel button: shown during `expanding | queued | draining`. Issues `DELETE /api/jobs/{job_id}`, stops polling on success, clears the strip.
The synchronous code path (no `job_id` in response) renders unchanged - old picker behavior, immediate result.
Owned-fields contract (Spec A §8 precedent): `WORKSHOP_ITEMS_LINE`, `counts.queued` (the picker's internal counter), `unknown[]`, `non_mod[]` are still owned by the **first** `/api/sort` (or final `result_json`). `/api/resort` ignores them. The poll's `counts` object is purely the live drain progress and does not feed the picker's internal queued counter.
## §8 Cancellation
`DELETE /api/jobs/{job_id}` semantics:
- Marks `sort_jobs.phase = 'failed'`, `failure_reason = 'cancelled'`. Idempotent.
- **Does not** touch `download_jobs`. Workshop downloads in flight continue and populate `mod_parsed`, benefiting subsequent users via cache. Aborting them would waste partial progress and potentially trip the drain's `STALE_RECLAIM_MIN` reclaim path. (Open Q6, §10.)
- Frontend stops polling, hides the strip, shows a small "cancelled" toast. The textarea retains the input.
Re-submitting the same input after cancel creates a *new* job. Collection-cache hits make the second submission instant if the cache hasn't expired.
## §9 Restart resilience
uvicorn boot sweep (idempotent, runs in lifespan startup):
```sql
-- Time out long-stuck expansion jobs
UPDATE sort_jobs
SET phase = 'failed', failure_reason = 'expansion timed out',
updated_at = now()
WHERE phase = 'expanding'
AND phase_started_at < now() - interval '10 minutes';
```
Jobs in `queued` / `draining` need no special handling - they resume polling against `download_jobs` on the next client `GET`. The phase derives live from current counts (§4 phase computation rule), so a restart in the middle of a drain is invisible to the client beyond a brief window where counts may shift.
## §10 Open questions resolved
1. **Bare wsid + all-cached: synchronous or job-routed?** *Synchronous.* The cached path is sub-100ms today; routing it through a job adds polling latency and a UI flash. Frontend branches cheaply on `job_id` presence.
2. **Mixed input (bare wsids + collection URLs).** *Treat as collection input.* Job created in `expanding` phase immediately. Bare wsids merge into `wsids[]` after `GetCollectionDetails` resolves. No partial-sync hybrid - keeps the response shape rule clean.
3. **Partial expansion failure.** *Succeed with the resolvable subset.* Each unresolvable collection adds a warning `{tag:"collection-partial", level:"warning", msg:"collection X could not be fetched"}` to `result_json.WARNINGS`. Job completes normally; user sees the result with one or more amber warnings.
4. **`GetCollectionDetails` flakiness.** *One internal retry with 2s backoff* before reporting collection-partial. No frontend-driven retry on the GET poll - it would mask transient failures and give the user no recovery affordance. Job marked `failed` only if **every** candidate collection fails.
5. **Concurrent expansion of the same collection.** *Independent jobs; cache deduplicates.* User A and User B paste the same collection URL near-simultaneously; both create separate `sort_jobs` rows. The first one's `GetCollectionDetails` call populates `collections`; the second's hits cache. Worst case (race within the cache miss window) costs one duplicate API call. In-flight cache key (e.g., `collections.fetching_until`) deferred to Spec G.
6. **Cancel semantics.** *Abandon `sort_job`; leave `download_jobs` running.* Three reasons. (a) Workshop downloads benefit other users via the shared `mod_parsed` cache - wasting them is anti-social. (b) The drain's `STALE_RECLAIM_MIN=30` reclaim path treats half-killed `downloading` rows as candidates for retry; introducing client-driven cancellation creates a class of races where the row is killed mid-write. (c) Worker-side cancellation requires SIGTERM-of-DD-subprocess plumbing that doesn't exist; staying out of that codepath is much cheaper.
## §11 Acceptance criteria
- [ ] `POST /api/sort` with all-cached bare wsids returns the synchronous shape with no `job_id`.
- [ ] `POST /api/sort` with any uncached wsid OR any collection URL returns `{status, job_id}` and persists a `sort_jobs` row.
- [ ] `GET /api/jobs/{job_id}` returns live counts and the current phase per the §4 derivation rule.
- [ ] `GET /api/jobs/{nonexistent}` returns `404`.
- [ ] `DELETE /api/jobs/{job_id}` flips phase to `failed` with `failure_reason="cancelled"`. Idempotent.
- [ ] Collection URL `https://steamcommunity.com/sharedfiles/filedetails/?id=N` is detected by the parser and routed through `GetCollectionDetails`.
- [ ] A `collections` cache hit (row younger than 6h) skips the Steam API call.
- [ ] A collection that returns `result!=1` produces a `collection-partial` amber warning in `result_json.WARNINGS` but does not fail the job (unless **all** collections in the input are unresolvable).
- [ ] uvicorn restart with a job in `expanding > 10min` flips it to `failed` with `failure_reason="expansion timed out"`.
- [ ] uvicorn restart with a job in `queued`/`draining` is invisible to the client beyond next-poll-window jitter.
- [ ] Frontend polls every 2.5s when `phase ∈ (expanding, queued, draining)`; stops on terminal phase.
- [ ] Status strip text matches the §7 table for each phase.
- [ ] Cancel button issues `DELETE`, stops polling, hides strip, retains input in textarea.
- [ ] `WORKSHOP_ITEMS_LINE` in `result_json` matches `sort_jobs.wsids[]` regardless of which wsids ended up in `non_mod` / `unknown` (Spec A §8 ownership preserved).
## §12 Test recipes
1. **Synchronous fast path** - `POST /api/sort` with `{"input":"2169435993;2392709985;2487022075"}`. Expect: response has `MODS_LINE`, no `job_id`. ~50ms.
2. **Collection URL, cold cache** - clear `collections` row for the test ID; `POST /api/sort` with a known PZ collection URL. Expect: `{status:"expanding", job_id:"…"}` immediately. Poll: phase progresses `expanding → queued → draining → done`. Final `result.MODS_LINE` populated.
3. **Collection URL, warm cache** - re-submit the same URL within 6h. Expect: phase skips `expanding`, goes straight to `queued` (or `done` if all children cached). One Steam API call total across both runs (verify via `/var/log/...` or `journalctl -u sortof-api | grep GetCollectionDetails`).
4. **Mixed bare + collection** - `POST /api/sort` with `"<URL>\n2169435993"`. Expect: job created in `expanding`; on resolve, `wsids[]` contains both the collection's children and the bare wsid; deduped.
5. **Partial collection failure** - input contains two collection URLs, one valid, one to a deleted collection. Expect: job phase progresses normally; `result_json.WARNINGS` contains exactly one `collection-partial` entry; `wsids[]` contains only the valid collection's children.
6. **All collections fail** - input contains only unresolvable collection URLs. Expect: job `phase=failed`, `failure_reason="all input collections unresolvable"`.
7. **Cancel during draining** - submit a 50-mod cold collection, wait until `phase=draining`, `DELETE /api/jobs/{id}`. Expect: phase=failed reason=cancelled. Verify `download_jobs` rows for the wsids are still in `queued`/`downloading`/`done` (not nuked).
8. **Restart mid-drain** - submit a job, wait for `phase=draining`, `sudo systemctl restart sortof-api`. Wait 5s, GET the job. Expect: phase still derives correctly (computed from live counts), client polling resumes.
9. **Restart mid-expansion** - submit a collection job, kill `sortof-api` mid-expansion (the race window is hard to hit deliberately; simulate it by directly setting `phase='expanding', phase_started_at=now()-interval '15 minutes'` on the row, then restart). Expect: lifespan sweep flips it to `failed` with `failure_reason="expansion timed out"`.
10. **404 on expired job** - manually `DELETE FROM sort_jobs WHERE job_id=…`; client poll. Expect: `404`. Frontend shows the expired-toast with re-submit affordance.
11. **Counts contract** - at each poll during a 50-mod cold drain, sum `counts.cached + counts.queued + counts.draining` and compare to `len(wsids)`. Expect equality at every snapshot for wsids that resolve to mods. Wsids that turn out to be `non_mod` post-drain contribute to none of the three counts because `mod_parsed` has no row for them; being missing from all three buckets is the expected steady state for non-mods, so the sum falls short of `len(wsids)` by the number of non-mods.
12. **Concurrent collection submit** - open two browser tabs simultaneously and submit the same URL. Expect: two distinct `job_id`s, but only one `GetCollectionDetails` call lands at Steam (verify journal). Worst case (cache-miss race): two API calls; this is acceptable.