feat: stale-require filter + Steam-API-keyed required-items fetch

Drops missing-dep warnings whose source mod's mod.info `require=` is
out of sync with its Steam Workshop Required Items sidebar. Author
edits to mod.info often lag build ports; trusting the sidebar means
B42 sorts no longer raise warnings on B41-only deps the author has
already retired (e.g. tikitown's Diederiks Tile Palooza, EN_Newburbs).

Filter is conservative: only drops a dep when (a) we have a cached
wsid for it, (b) that wsid is wrong-build for the user's pz_build,
and (c) the source mod's required_wsids list (with required_scraped_at
populated as the "we have evidence" gate, since the column itself
defaults to '{}') excludes that wsid.

Also swaps worker.fetch_required_wsids from public-page HTML scrape
to authenticated IPublishedFileService/GetDetails. Same `children`
data, no 429 cooldowns. Removes the now-unused throttle/cooldown
infrastructure (SORTOF_STEAM_MIN_INTERVAL / SORTOF_STEAM_COOLDOWN
env vars are no longer read).

See docs/specs/2026-05-06-stale-requires-filter.md.
2026-05-06 21:30:28 +00:00
parent f8b48fbacb
commit 3a34b71e54
3 changed files with 286 additions and 116 deletions

@@ -0,0 +1,117 @@
# Spec — Stale `require=` filter via Steam Required Items
**Date:** 2026-05-06
**Status:** Implemented
**Lineage:** Builds on Spec C (`2026-05-01-build-context-dep-add.md`) — same warning-shaping path, additional filter layer between mlos_sort output and `_lookup_wsids_for_missing` / `build_warnings`. Also opportunistically swaps `worker.fetch_required_wsids` from HTML scrape to the authenticated Steam API.
## 1. Summary
Authors update Steam's "Required Items" sidebar per build (it visibly affects subscribe-all behavior), but routinely forget to clean up `mod.info`'s `require=` line when porting a mod from B41 → B42. The result: a B42 mod's mod.info still declares B41-era deps that the author has implicitly retired, and we surface them as **missing-dep warnings on a build the mod doesn't actually need them on**. Tikitown is the canonical case: B42 mod, B42 Required Items = `{Drazion's, Erika's}`, but `mod.info require=` still names `Diederiks Tile Palooza, EN_Newburbs` (both B41-only). Today we warn about both; we should not.
This spec adds a small filter, `_filter_stale_requires`, that drops a missing-dep entry when **(a)** the dep resolves to a wsid we have cached, **(b)** that wsid is wrong-build for the user's `pz_build`, and **(c)** the source mod's Steam Required Items list does NOT include that wsid. The author has both labelled the dep wrong-build AND removed it from Required Items — strong evidence the `require=` line is stale.
## 2. Problem
`mlos_sort` builds `missing_requirements` straight from `mod.info`'s `require=` field. It has no concept of build tags or Steam-side dependency lists. Spec C's `_lookup_wsids_for_missing` already filters wrong-build wsid *suggestions* (so we don't propose adding a B41-only mod to a B42 sort), but the **warning itself still appears** with no actionable button. From the user's perspective the only signal is "Tikitown wants something I can't add" — which is incorrect: tikitown doesn't actually want it on B42, the author just forgot to update mod.info.
The over-declaration trap (`TacHold requires modoptions` style — declared but not actually needed) is the same shape: wrong-build mod.info dep that's not in Required Items.
## 3. Heuristic
For each `(source_mod_id, dep_mod_id)` pair in `mlos_sort`'s `missing_requirements`, drop the dep iff ALL of:
1. `pz_build` is set (B41 or B42; unknown → no filtering).
2. `source_mod_id` resolves to a wsid we have cached.
3. The source wsid has been **scraped** for Required Items (`workshop_meta.required_scraped_at IS NOT NULL`). The `required_wsids` column itself has `NOT NULL DEFAULT '{}'` so it can't tell us "never scraped" vs. "scraped, no items"; only `required_scraped_at` distinguishes the two. Without this gate the 2741 unscraped wsids in the live cache (as of 2026-05-06, before the authenticated-API backfill) would all look like "author lists no required items" and silently suppress legit warnings.
4. `dep_mod_id` resolves to a wsid via `mod_parsed` (latest-cached row).
5. The dep's wsid has `workshop_meta.tags` indicating it is **wrong-build** for `pz_build` — i.e., `other_tag in tags AND target_tag NOT in tags`. A mod tagged both builds is build-correct and never dropped.
6. The dep's wsid is **NOT** in the source's `required_wsids`.
If any condition fails, the dep is kept. The filter is conservative: silence requires evidence on every axis.
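The six conditions can be sketched as a single pure predicate. This is an illustration under assumptions, not the shipped `_filter_stale_requires`: the names `should_drop`, `is_wrong_build`, `BUILD_TAGS`, and `scraped_required` are hypothetical, and the tag strings are taken from the worked examples in this spec. `scraped_required` is assumed to contain only sources whose `required_scraped_at` is set, so a missing key means "no evidence".

```python
from typing import Mapping, Optional, Set

# Assumed tag strings, taken from the worked examples in this spec.
BUILD_TAGS = {"B41": "Build 41", "B42": "Build 42"}


def is_wrong_build(tags: Set[str], pz_build: str) -> bool:
    """True when tags carry another build's tag but not the target build's."""
    target_tag = BUILD_TAGS[pz_build]
    other_tags = {tag for build, tag in BUILD_TAGS.items() if build != pz_build}
    return bool(other_tags & tags) and target_tag not in tags


def should_drop(
    pz_build: Optional[str],
    source_wsid: Optional[int],
    dep_wsid: Optional[int],
    dep_tags: Set[str],
    scraped_required: Mapping[int, Set[int]],
) -> bool:
    """Return True only when every condition in the heuristic holds."""
    if pz_build not in BUILD_TAGS:              # 1. build unknown: no filtering
        return False
    if source_wsid is None:                     # 2. source has no cached wsid
        return False
    required = scraped_required.get(source_wsid)
    if required is None:                        # 3. source never scraped
        return False
    if dep_wsid is None:                        # 4. dep has no cached wsid
        return False
    if not is_wrong_build(dep_tags, pz_build):  # 5. dep is build-correct
        return False
    return dep_wsid not in required             # 6. absent from Required Items
```

Every early return keeps the dep, so any missing piece of evidence falls back to the existing warning behavior.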
### 3.1 Worked examples
**Tikitown on B42 (the motivating case):**
- Source: tikitown wsid 3037854728, `required_wsids = {3046728955 Drazion's, 3346506593 Erika's}`.
- Dep `Diederiks Tile Palooza` → wsid 2337452747, tags `{Build 40, Build 41}` → wrong-build for B42 → not in `{3046728955, 3346506593}` → **drop**.
- Dep `EN_Newburbs` → wsid 2774834715, tags `{Build 41, ...}` → wrong-build for B42 → not in required → **drop**.
- Dep `tikitown_tiles` → wsid 3046728955, tags include `Build 41, Build 42` → build-correct → **keep** (and the user has it in input anyway, so it doesn't appear as missing).
- Result: warning disappears entirely.
**TacHold on B42 (over-declaration):**
- Source: TacHold wsid X, `required_wsids = {...}` (no modoptions).
- Dep `modoptions` → wsid Y (the legacy wsid), tagged `Build 41` only → wrong-build → not in required → **drop**.
- The mod runs fine without modoptions; the over-declared dep is silenced.
**Legitimate missing dep:**
- Source: SomeMod, `required_wsids = {Z}` where Z resolves to mod_id `RealDep`.
- User omitted `RealDep` from input. mod.info: `require=RealDep`.
- `RealDep` is build-correct → **keep**. Warning surfaces with `[add RealDep]`.
**Source has no Required Items data:**
- New wsid, drained yesterday but not yet scraped: `required_scraped_at` is NULL (`required_wsids` still holds its `'{}'` default).
- Filter does nothing for this source's deps → existing behavior.
**Wrong-build dep that IS in Required Items:**
- Author intentionally requires a B41-only utility mod on a B42 mod (rare but real).
- Wrong-build BUT in required → **keep**. Warning surfaces; current Spec C lookup may still suppress the wrong-build add-button, but that's the existing behavior and out of scope here.
## 4. Implementation
### 4.1 New helper: `api/app.py:_filter_stale_requires`
Single async function, ~80 lines. Mutates `mlos_warnings["missing_requirements"]` in place — same dict that downstream `_lookup_wsids_for_missing` and `adapters.build_warnings` already read from. Two queries:
```sql
-- Source side: Required Items for scraped sources only ($1 = source wsids).
SELECT workshop_id, required_wsids FROM workshop_meta
WHERE workshop_id = ANY($1) AND required_scraped_at IS NOT NULL;

-- Dep side: latest-cached wsid and its tags per dep mod_id ($1 = dep mod_ids).
SELECT DISTINCT ON (mp.mod_id) mp.mod_id, mp.workshop_id, wm.tags
FROM mod_parsed mp JOIN workshop_meta wm ON wm.workshop_id = mp.workshop_id
WHERE mp.mod_id = ANY($1)
ORDER BY mp.mod_id, mp.parsed_at_time_updated DESC;
```
Build correctness uses the same `target_tag` / `other_tag` logic as `_lookup_wsids_for_missing` so a future flip to a different rule (or a third build) only has to change one place.
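The in-place mutation mentioned at the top of this section might look like the sketch below. It assumes, purely for illustration, that `missing_requirements` maps each source mod_id to a list of dep mod_ids; the real shape produced by `mlos_sort` may differ. `prune_missing_in_place` and `is_stale` are hypothetical names.

```python
from typing import Callable, Dict, List


def prune_missing_in_place(
    missing: Dict[str, List[str]],
    is_stale: Callable[[str, str], bool],
) -> None:
    """Remove stale deps from `missing`, mutating the same dict object.

    Assumed shape: {source_mod_id: [dep_mod_id, ...]}. Sources left with no
    deps are deleted so that no empty warning is built downstream.
    """
    # Iterate over a snapshot of the keys because entries may be deleted.
    for source_id in list(missing):
        kept = [dep for dep in missing[source_id] if not is_stale(source_id, dep)]
        if kept:
            missing[source_id][:] = kept  # slice-assign keeps the list object
        else:
            del missing[source_id]
```

Mutating rather than returning a copy means later readers of `mlos_warnings["missing_requirements"]` see the filtered data without any change to their own code.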
### 4.2 Call sites
The filter must run **before** `_lookup_wsids_for_missing` (which would otherwise build wsid suggestions for soon-to-be-dropped deps) and **before** `adapters.build_response → build_warnings` (which is what materializes the warning payload from `missing_requirements`). Three call sites, all in `app.py`:
1. `/api/sort` sync path (~line 870, after `sort_mods(mods, rules)`).
2. `/api/sort` async-resume path (~line 1198, after `sort_mods(mods, rules)` on the post-drain refetch).
3. `/api/resort` (~line 1417, after `sort_mods(selected_mods, auto_rules)`).
Each call passes `{m.id: m.workshop_id for m in <local mods> if m.workshop_id}` for the source map. For resort, the local mods are `selected_mods` (what was sorted). Using `all_mods` would also work, but `selected_mods` already covers every source mod that could have generated a warning.
## 5. Worker swap: HTML scrape → authenticated `GetDetails`
While we're touching this code path, replace `worker.fetch_required_wsids`'s HTML scraping with the authenticated `IPublishedFileService/GetDetails/v1/?key=…&publishedfileids[0]=…&includechildren=true`. Returns the same `children` array Steam renders into the Required Items sidebar, but:
- No 429 rate-limiting at our drain rate.
- No throttle / 1h cooldown infrastructure needed.
- More reliable than HTML regex parsing (Steam page markup has changed in the past).
Required env: `STEAM_WEB_API_KEY` (already in `/opt/sortof/.env`). Without it, the function returns `None` (existing semantics: don't overwrite cached value). Steam returns `result=1` on success; treat anything else as soft failure (also `None`) so a transient lookup miss doesn't clobber a previously good cached value with `[]`.
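The soft-failure rule can be sketched as a pure parsing step, separate from the HTTP call. This is not the worker's actual code: the function names are hypothetical, the `api.steampowered.com` host is not stated in this spec, and the response shape (`response.publishedfiledetails[0]` with `result` and a `children` list of `publishedfileid` entries) is an assumption about the Steam Web API.

```python
from typing import Any, List, Mapping, Optional
from urllib.parse import urlencode

# Assumed host; the spec only names the interface and method.
GET_DETAILS_URL = "https://api.steampowered.com/IPublishedFileService/GetDetails/v1/"


def build_details_url(api_key: str, wsid: int) -> str:
    """Build the GetDetails URL with the query parameters named in this spec."""
    query = urlencode(
        {"key": api_key, "publishedfileids[0]": wsid, "includechildren": "true"}
    )
    return f"{GET_DETAILS_URL}?{query}"


def required_wsids_from_payload(payload: Mapping[str, Any]) -> Optional[List[int]]:
    """Return child wsids, or None on any soft failure.

    None means "keep the cached value"; an empty list means the lookup
    succeeded and the item has no Required Items.
    """
    try:
        details = payload["response"]["publishedfiledetails"][0]
        if details["result"] != 1:
            return None
        return [int(child["publishedfileid"]) for child in details.get("children", [])]
    except (KeyError, IndexError, TypeError, ValueError, AttributeError):
        return None
```

Keeping `None` and `[]` distinct is what lets the caller skip the write on failure while still recording a successful "no required items" scrape.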
Removed code: `_THROTTLE_FILE`, `_COOLDOWN_FILE`, `_MIN_SCRAPE_INTERVAL_S`, `_COOLDOWN_S`, `_read_cooldown_until`, `_write_cooldown_until`, `_throttle_scrape`, `_WORKSHOP_PAGE_URL`, `_RE_REQUIRED_BLOCK`, `_RE_REQUIRED_LINK`, the `import fcntl as _fcntl`, and the rate-limit comment block. `SORTOF_STEAM_MIN_INTERVAL` / `SORTOF_STEAM_COOLDOWN` env knobs are no longer read.
## 6. Non-goals
- Mutating `mod_parsed.requirements`. The filter operates on warning generation only; the parsed `require=` field stays as written in `mod.info` (useful for diagnostics and for future rules that may want the raw declaration).
- Surfacing a "this dep was suppressed because it looked stale" debug warning. The filter is silent by design — if it's right, the user never needed to know; if it's wrong, the user can compare against the Workshop page directly.
- Searching Workshop for B42 alternatives to a wrong-build dep. `IPublishedFileService/QueryFiles` text search is too fuzzy to be reliable (search "Diederiks Tile Palooza" + Build 42 → top hit "Rocco's Tiles", entirely unrelated). Out of scope.
- Loosening the build-correct filter in `_lookup_wsids_for_missing` to offer wrong-build add-buttons with a `(B41)` label. Considered and rejected: re-introduces the over-declaration trap. The stale-filter route handles the same cases more cleanly by suppressing the warning entirely instead of offering an action that points at a wrong-build mod.
## 7. Verification
Smoke test against the canonical case:
```bash
curl -sS -X POST http://100.114.205.53:8801/api/sort \
-H 'Content-Type: application/json' \
-d '{"input":"3037854728;3046728955;3346506593","pz_build":"B42"}' \
| jq '.WARNINGS[] | select(.msg | test("tikitown|Diederiks|Newburbs"))'
```
Expected: empty (warning suppressed). Before this change: one missing-dep warning naming Diederiks + EN_Newburbs.
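For scripted checks, the `jq` filter above has a rough Python equivalent. The sketch relies only on the `WARNINGS[].msg` shape visible in the smoke test; `matching_warnings` is a hypothetical helper name.

```python
import re
from typing import Any, Dict, List, Mapping


def matching_warnings(
    payload: Mapping[str, Any],
    pattern: str = r"tikitown|Diederiks|Newburbs",
) -> List[Dict[str, Any]]:
    """Return warnings whose `msg` matches the pattern (case-sensitive search)."""
    rx = re.compile(pattern)
    return [w for w in payload.get("WARNINGS", []) if rx.search(w.get("msg", ""))]
```

An empty list from the parsed `/api/sort` response corresponds to the expected post-change result.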