docs: extend CLAUDE.md with cross-session friction notes

Captures four pieces of context that cost time to (re)derive this session:

- Docker `--entrypoint php` one-liner for ad-hoc PHP that needs the codex autoloader (used for redactor smoke tests).
- Pitfall #6: PZ DebugLog-server has two coexisting line shapes (B41 with `t:` field, B42 without). `DebugServerPattern::LINE` matches both via an optional group; narrowing it back to B41-only silently disables ServerExceptionProblem / ModMissingProblem on every B42 log.
- Deployed iblogs lives at bosslogs.indifferentketchup.com and uses `main` as its default branch, not `master`. Pinned to ^0.3.0.
- New top-level section for `tools/pz-analyzer/` describing the intentional split between the pre-production Qwen-backed discovery tool and the production-bound deterministic classifier, plus the redact-all wrapper that feeds both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs: cut v0.3.0 in CHANGELOG
2026-05-06 19:45:26 +00:00 · 2026-05-06 19:04:37 +00:00 · 2026-05-06 13:33:43 +00:00 · 2026-05-06 13:33:35 +00:00 · 2026-05-04 16:31:56 +00:00 · 2026-05-04 16:31:23 +00:00
38 changed files with 3335 additions and 18 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -5,3 +5,10 @@ Logs.zip
 .scratch/
 .claude/
 .claude.local.md
+
+# Python bytecode caches from tools/pz-analyzer/.
+__pycache__/
+
+# Editor / manual backup files.
+*.bak
+*.bak-*
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,34 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and

 ## [Unreleased]

+## [0.3.0] — 2026-05-04
+
+Adds IP-address redaction to the PZ redactor, a new `ErrorContextAnalyser` for surrounding-context surfacing, the `tools/pz-analyzer/` Python toolset (pre-production Qwen-driven research analyser and production-bound deterministic classifier), and a parser fix for the PZ B42 log shape that was silently breaking level/prefix attribution since The Indie Stone dropped the per-line `t:` field. New public API surface across the redactor and the analyser-side classes makes this a minor bump rather than a patch.
+
+### Added
+
+- **IP redaction in `ProjectZomboidRedactor`** (`src/Util/ProjectZomboid/ProjectZomboidRedactor.php`) — fourth pass that scrubs IPv4 (strict 0-255 octets, optional `:port` suffix) and IPv6 (full, abbreviated, bracketed-with-port, IPv4-mapped) addresses, replacing them with the literal `[REDACTED_IP]`. New public API: `IP_REPLACEMENT`, `IPV4_REGEX`, `IPV6_REGEX` constants and a `redactIpAddresses(bool)` toggle (defaults on, mirroring the existing three category toggles). Pattern-disjoint from the Steam-ID → name → coordinates chain; runs first by convention. Strict regexes plus `filter_var()` validation prevent false positives on PZ timestamps and PHP / Java scope ops. 20 new unit tests across two files (`ProjectZomboidRedactorIpv4Test.php`, `ProjectZomboidRedactorIpv6Test.php`).
+- **`ErrorContextAnalyser`** (`src/Analyser/ProjectZomboid/ErrorContextAnalyser.php`) — generic-purpose analyser that walks `Entry[]` once and emits one `ErrorContextProblem` per ERROR / WARNING entry with up to `CONTEXT_BEFORE` (20) entries of leading context and `CONTEXT_AFTER` (20) entries of trailing context. Overlapping windows clip to `lastEmittedIndex + 1` so no Entry appears in two context arrays; emission caps at `HIT_CAP` (500) with a single `ErrorContextTruncatedInformation` appended when reached. Standalone — not auto-registered to any existing Log subclass's `getDefaultAnalyser()`; consumers wire it in explicitly. Companion classes `ErrorContextProblem` and `ErrorContextTruncatedInformation` under `src/Analysis/ProjectZomboid/`. 3 unit tests, 134 assertions.
+- **`tools/pz-analyzer/`** — Python toolset adjacent to the library (not part of the Composer package's autoload surface). `pz_redact_all.sh` is a one-shot Docker wrapper that runs the PHP redactor over `.scratch/pz/Logs/` and produces a gitignored `.scratch/pz/Logs.redacted/` directory. `pz_error_analysis.py` is a developer-facing Qwen-backed pre-production analyser that calls a local OpenAI-compatible endpoint to classify residual log shapes the deterministic side hasn't yet captured. `pz_parser.py` + `pz_classify.py` are the production-bound deterministic-only counterpart: pure parser module with mod attribution, file:line extraction, cause-chain unwinding, engine-noise tagging, and a two-level signature scheme (`pattern_id` + `signature`), plus a stdlib-only orchestrator that walks the redacted directory and emits a JSON report. 32 Python unit tests across three files, 16 synthetic fixtures.
+- `docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md` — design contract for `pz_parser.py` / `pz_classify.py`. The PHP-side `ErrorContextAnalyser` ships without a separate spec; its design fell out of a brainstorming session inline with the pzmm-pattern-port discussion.
+- New synthetic fixture `test/src/Games/ProjectZomboid/fixtures/debug-server-42x-minimal.txt` mirroring the existing B41 fixture in PZ B42 line shape.
+
+### Changed
+
+- **`DebugServerPattern::LINE` regex relaxed** to handle PZ build 42.x. The Indie Stone dropped the per-line `t:` (microsecond) field and tightened the spacing between `f:N`, `t:N`, and `st:N,N,N,N>` markers somewhere on the way to build 42.17. The previous regex required the full `f:\d+,\s+t:\d+,\s+st:` triplet and silently failed on every B42 line. Now `(?:,\s+t:\d+)?` makes the `t:N,` field optional and `,?` makes the inter-field comma optional. Backwards-compatible — every B41 line continues to parse identically. `ProjectZomboidServerLogTest` now runs each parser-shape assertion via `#[DataProvider]` against both fixtures.
+- **Pass order in `ProjectZomboidRedactor::redact()`**: the new IP pass runs first, so the chain is now `IP → Steam ID → player name → coordinates`. The mandatory Steam ID → name → coordinates ordering is preserved; placement of the IP pass is by convention since its regexes are pattern-disjoint from the rest.
+- **`CLAUDE.md`** documents `iblogs` as the primary downstream consumer with a per-component checklist for cross-repo public API impact; the release-flow cadence; the feature-branch workflow set by the `redactor` and `iblogs-bootstrap` precedents; and the `docs/superpowers/specs|plans/` path convention.
+- **`.gitignore`** excludes `__pycache__/` (Python bytecode caches generated under `tools/pz-analyzer/`) and `*.bak` / `*.bak-*` (editor / manual backup files).
+
+### Fixed
+
+- PZ build 42.x server logs now parse with proper level / prefix attribution. Previously, every B42 line failed `DebugServerPattern::LINE` and the resulting ServerLog entries fell through as level `INFO` with no prefix. This silently disabled `ServerExceptionProblem` and `ModMissingProblem` (their regexes anchor on `[timestamp]...` at entry start, which a level-less orphan entry doesn't emit). The anchorless `EngineVersionInformation` continued to fire against the joined entry text, producing the user-visible symptom "one Information badge, empty Problems panel" on B42 logs. The fix restores per-line parsing, re-enables both Problem classes, and makes the error-count badge populate correctly.
+
+### Test counts
+
+- PHP suite: **287 tests, 654 assertions** (up from 260 / 492 at v0.2.0).
+- Python suite under `tools/pz-analyzer/`: **32 tests** (stdlib `unittest`, sub-10 ms).
+
 ## [0.2.0] — 2026-05-01

 Render-time PII redaction utility added on the same calendar day as v0.1.0. Cut as a minor version bump rather than a patch because it adds a new public API surface (`RedactorInterface` plus the per-game implementation), which under semver is a minor change, not a patch. Consumers (notably iblogs) pin to `^0.2.0` to opt into the redactor-aware version.
@@ -51,5 +79,6 @@ First public release. Codex is a generic PHP log parsing and analysis framework
 - **Other game implementations** — `Minecraft`, `Hytale`, and `SevenDaysToDie` are detective-stub-only. Each has a TODO `<Game>Detective` extending base `Detective`; their per-component subdirectories under `Analyser`, `Log`, `Parser`, and `Pattern` contain only `.gitkeep` placeholders. Real implementations land if and when fixtures and demand exist.
 - **Packagist publication** — v0.1.0 is consumable via Composer's `vcs` repository entry pointing at the Gitea remote. Pushing to Packagist is a separate decision and is not in scope for this release.

+[0.3.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.3.0
 [0.2.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.2.0
 [0.1.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.1.0
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -16,6 +16,12 @@ docker run --rm -v "$(pwd):/app" -w /app -u "$(id -u):$(id -g)" composer:latest

 Use `$(pwd)` or an absolute path — bare `$PWD` has misfired here, mounting nothing and silently no-op'ing the run.

+For ad-hoc PHP that needs the codex autoloader (e.g. running `ProjectZomboidRedactor::redact()` over a directory of log files, or eyeballing analyser output), override the entrypoint:
+
+```
+docker run --rm --entrypoint php -v "$(pwd):/app" -w /app -u "$(id -u):$(id -g)" composer:latest -r '<php source>'
+```
+
 ## Common commands

 - All tests: `composer test` (= `phpunit test/tests` per `composer.json`)
@@ -84,6 +90,20 @@ Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty

 At minimum: (1) entry count after `parse()` matches the synthetic fixture's line count, (2) one or more named-group `FIELDS` regexes from the `<Type>Pattern` class extract correctly from a representative line, (3) `Detective` handed the fixture path returns an instance of this Log class. Use `#[DataProvider]` when the same shape repeats per file.

+### Downstream consumers
+
+`iblogs` (sibling repo at `/opt/iblogs`, package `indifferentketchup/iblogs`, fork of `aternosorg/mclogs`) is the primary consumer of codex via a Composer `vcs` repository entry pinned to the latest minor tag. Public-API changes in `src/{Detective,Log,Printer,Util}/*.php` and `src/Analysis/*.php` propagate there; when modifying those types, sanity-check the iblogs call sites at `/opt/iblogs/src/{Detective.php,Log.php,Printer/Printer.php,Printer/FormatModification.php,Api/Response/CodexLogResponse.php}` and the stub class at `/opt/iblogs/src/Data/Deobfuscator.php`.
+
+The deployed iblogs instance lives at `bosslogs.indifferentketchup.com` (production renders the same code path as the local dev server on port 4217). iblogs's default branch is `main`, not `master`. iblogs's `composer.json` constraint is currently `^0.3.0`; cutting a v0.4.x will require widening that.
+
+## Out-of-library tools (`tools/pz-analyzer/`)
+
+Python utilities alongside the Composer package, not on the PSR-4 autoload surface. Two artefacts coexist by design — the deterministic classifier is the production target; the Qwen tool is the developer's discovery aid for shapes the deterministic side hasn't captured yet.
+
+- **`pz_redact_all.sh`** — one-shot Docker wrapper. Runs `ProjectZomboidRedactor` over `.scratch/pz/Logs/` and writes `.scratch/pz/Logs.redacted/`. Both Python tools below consume the redacted directory.
+- **`pz_error_analysis.py`** — *pre-production*, Qwen-backed. Sends residual log shapes to the local Qwen endpoint at `http://100.101.41.16:8401/v1` (sam-desktop, model `qwen3.6-35b-a3b`) for natural-language classification with category / cause / fix output. Requires the `openai` package; in practice run via `/opt/analytics/.venv/bin/python` which has it installed.
+- **`pz_parser.py` + `pz_classify.py`** — *production-bound deterministic classifier*. Stdlib only. Mirrors the patterns from `paraxaQQ/pzmm`'s `core/inspector.py` (Lua mod-marker attribution, bidirectional stack collection, file:line extraction, cause-chain unwinding, engine-noise tagging) plus a two-level signature scheme (`pattern_id` + `signature`). Designed to inform a future PHP port to `LuaErrorAnalyser` / `ModAttributionAnalyser` under `src/Analyser/ProjectZomboid/`. 32 stdlib `unittest` tests under `tools/pz-analyzer/tests/`; invocation: `python3 -m unittest discover -s tools/pz-analyzer/tests`.
+
 ## Pitfalls

 1. **`PatternParser` is incompatible with named regex groups.** PHP's `preg_match` returns named groups *plus* their numeric duplicates in the same array; `PatternParser`'s foreach iterates both and throws on the string-key entries. Convention: `LINE` regexes (used by the parser) use **unnamed** groups with field order documented in the Pattern class's docblock. Named groups are fine inside extractor regexes invoked from analysers, since `PatternAnalyser` hands the whole match array to `Insight::setMatches`.
@@ -91,12 +111,16 @@ At minimum: (1) entry count after `parse()` matches the synthetic fixture's line
 3. **`Level::fromString()` defaults to `Level::INFO` for unknown tokens.** Project Zomboid log levels map: `LOG`/`INFO` → INFO; `WARN` → WARNING; `ERROR` → ERROR.
 4. **`PatternParser` matches array** must declare a match-type for **every** capture group in the regex (`TIME`, `LEVEL`, or `PREFIX`); otherwise the parser throws on the unmapped index. Use non-capturing groups `(?:...)` for fields you want to skip.
 5. **`ProjectZomboidRedactor` pass order is mandatory.** `PLAYER_AFTER_STEAMID_REGEX` anchors on the already-redacted Steam ID placeholder — it will not match raw Steam IDs. Do NOT swap the Steam ID and player-name passes, and do NOT stub out the Steam ID pass while leaving the player-name pass enabled.
+6. **Two PZ DebugLog-server line formats coexist.** B41 emits `[ts] LEVEL: Subsystem  f:N, t:N, st:N,N,N,N>`; B42 (build 42.17 onward) dropped the `t:` microsecond field and tightened spacing to `f:N st:N,N,N,N>`. `DebugServerPattern::LINE` matches both via `(?:,\s+t:\d+)?,?` — preserve that optional group when editing or B42 logs silently fail to parse, leaving entries level-less and analysers (`ServerExceptionProblem`, `ModMissingProblem`) silently dormant. Fixtures cover both: `debug-server-minimal.txt` (B41), `debug-server-42x-minimal.txt` (B42).
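
Review note: the effect of the optional group in pitfall 6 can be checked with a reduced pattern. This is a Python sketch, not the PHP constant; `META_FRAGMENT`, `OLD_FRAGMENT`, and both sample lines are illustrative assumptions assembled from the fragments quoted in this diff, and the real `DebugServerPattern::LINE` may differ in its surrounding groups.

```python
import re

# Assumed reduction of the metadata portion of the line pattern:
# optional ", t:N" group followed by an optional inter-field comma.
META_FRAGMENT = re.compile(r"f:\d+(?:,\s+t:\d+)?,?\s+st:\d+,\d+,\d+,\d+>")

# Assumed shape of the earlier, B41-only fragment (full f/t/st triplet).
OLD_FRAGMENT = re.compile(r"f:\d+,\s+t:\d+,\s+st:\d+,\d+,\d+,\d+>")

# Synthetic sample lines, one per build shape.
B41_SAMPLE = "General  f:0, t:1714151675128, st:0,0,0,0> server started"
B42_SAMPLE = "General  f:0 st:0,0,0,0> server started"


def matches(pattern, line):
    """Return True when the pattern is found anywhere in the line."""
    return pattern.search(line) is not None


if __name__ == "__main__":
    for name, sample in (("B41", B41_SAMPLE), ("B42", B42_SAMPLE)):
        print(name, "relaxed:", matches(META_FRAGMENT, sample),
              "old:", matches(OLD_FRAGMENT, sample))
```

The old fragment rejects the B42 sample because it demands a comma after `f:N`, which is the failure mode the pitfall describes.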

 ## Workflow conventions

 - **One commit per concrete log type** when adding game support: pattern class + log subclass + synthetic fixture + test in a single commit, run `composer test`, then move on. `<Game>Detective::__construct()` wiring goes in its own follow-up commit once all log types are present.
 - **Out-of-scope cleanup goes in its own commit.** Tempting workflow/lint fixes (e.g. deprecated CI syntax, comment hygiene) noticed mid-feature should not be folded in — separate commit or follow-up PR.
 - **Pre-destructive checkpoint pattern.** Before bulk renames/moves: `git commit --allow-empty -m "pre-X checkpoint"` as a revert anchor. Skip the empty slot if it produces no diff at the end of a plan.
+- **Release flow.** Semver: a new public API surface bumps the minor version, not the patch (`v0.1.x → v0.2.x`). Cut: rename `[Unreleased]` to `[X.Y.Z] — YYYY-MM-DD` in `CHANGELOG.md`, add a `[X.Y.Z]:` link reference at the bottom, fresh empty `[Unreleased]` above; lightweight `backup/pre-vX.Y.Z` tag (local only) before annotated `git tag -a vX.Y.Z`; push the annotated tag only.
+- **Feature branches.** Substantive feature work lands on a `<feature>-bootstrap`-style branch off master with a `backup/pre-<feature>` lightweight tag at the branch start, merged `--no-ff` after user review. The `redactor` and `iblogs-bootstrap` branches set the precedent.
+- **Specs and plans live at** `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and `docs/superpowers/plans/YYYY-MM-DD-<topic>.md` per the brainstorming and writing-plans skill conventions.

 ## Privacy / fixture rules

--- a/docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md
+++ b/docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md
@@ -0,0 +1,246 @@
+# PZ deterministic classifier — design spec
+
+> Drafted 2026-05-04. Status: design-approved, awaiting implementation plan.
+> Sibling tool to the existing pre-production Qwen analyzer (`pz_error_analysis.py`), which is unaffected by this work.
+
+## Summary
+
+A new deterministic-only Project Zomboid log classifier that lives alongside the existing Qwen-based analyzer in `tools/pz-analyzer/`. Walks redacted `DebugLog-server*.txt` files, extracts errors/warnings, attributes each to a mod where evidence allows, classifies by kind, and emits a structured JSON report. **Zero AI dependency**: this is the artefact that informs the future PHP / iblogs production path.
+
+The patterns it implements are inspired by `paraxaQQ/pzmm`'s `core/inspector.py`: Lua mod-marker attribution, multi-fallback file:line extraction, bidirectional stack collection, cause-chain unwinding, engine-noise tagging. They are reimplemented from scratch; no code is copied verbatim.
+
+## Why a separate tool, not an edit of `pz_error_analysis.py`
+
+Two artefacts, two purposes:
+
+- `pz_error_analysis.py` (existing, untouched) — pre-production discovery tool. Sends residual log content to Qwen so the developer can see what categories the deterministic side hasn't yet captured.
+- `pz_classify.py` (new) — production-bound deterministic classifier. Output is what an iblogs PHP port would eventually emit. Runs in seconds, no API dependency, no PII-going-to-LLM consideration.
+
+Keeping both side by side lets the developer compare outputs and treat the LLM's residual output as the "deterministic to-do list."
+
+## Scope
+
+**In scope:**
+- Two new files: `tools/pz-analyzer/pz_parser.py` (pure module) and `tools/pz-analyzer/pz_classify.py` (CLI orchestrator).
+- Tests under `tools/pz-analyzer/tests/` with synthetic fixtures.
+- Operates exclusively on the already-redacted directory produced by `pz_redact_all.sh` (`.scratch/pz/Logs.redacted/`).
+
+**Out of scope:**
+- Any modification to `pz_error_analysis.py`, `pz_redact_all.sh`, or PHP codex source.
+- Filesystem-based mod-scan reattribution (pzmm's symbol-index, vehicle-index, file-path-ownership reattribution requires an actual mod folder we don't have on the server side).
+- iblogs / bosslogs integration. The output schema is designed with that future port in mind, but no PHP code is written here.
+- Generic AI tab patterns from pzmm's `core/ai.py`. Explicitly excluded.
+
+## Architecture
+
+```
+                redacted .txt files
+                        |
+                        v
+          +---------------------------+
+          | pz_classify.py            |   argparse · directory walk · aggregate · JSON write
+          | (orchestrator)            |
+          +-------------+-------------+
+                        |
+                        v
+          +---------------------------+
+          | pz_parser.py              |   regexes · parse · classify · sign
+          | (pure module, no I/O      |
+          |  beyond reading the path  |
+          |  it is handed)            |
+          +---------------------------+
+```
+
+Two files inside `tools/pz-analyzer/`:
+
+- **`pz_parser.py`** — stateless. All regex constants, `parse_file(path) -> list[Entry]`, attribution helpers, file:line extractors, cause-chain extractor, signature computation. No `argparse`, no JSON writing, no directory walking. Unit-testable in isolation.
+- **`pz_classify.py`** — entry point. CLI args, walks the redacted directory, calls `pz_parser`, aggregates records by signature, writes JSON, prints a one-line stats summary.
+
+The split is deliberate: `pz_parser.py` is the module that eventually wants to be ported to PHP codex (separate spec). Keeping it pure makes that port mechanical and Python-side tests trivial.
+
+## Parser pipeline phases
+
+For each `*DebugLog-server*.txt`, the parser walks lines once and emits records via the following phases.
+
+### 1. Severity-prefix recognition
+
+Regex: `^\s*(ERROR|SEVERE|WARN)\s*[:\s]`. Broader than the existing `pz_error_analysis.py` regex — adds `SEVERE` (Java util-logging convention; appears in some PZ Java exception blocks). `LOG`/`INFO` is ignored at this layer.
+
+### 2. Stack collection — bidirectional
+
+Pzmm's contribution: PZ emits stack frames *before* the ERROR/WARN line as often as after.
+
+- **Pre-stack**: walk up to 25 lines back from the severity line. Stop at another severity line or 8 collected. Only keep the block if at least one line looks stack-shaped (`at `, `[string ...]`, `function:`, `file:`, `.lua` markers).
+- **Post-stack**: walk forward up to 25 lines, gated by engine-noise detection. Stop at another severity line or 8 collected.
+- Merge deduped, preserving order; cap at 8 frames per record.
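
Review note: the pre-stack half of phase 2 could look roughly like the sketch below. It assumes "8 collected" counts collected lines rather than stack-shaped frames, and that the stack-shape check is a plain substring test; `collect_pre_stack` and `looks_stack_shaped` are hypothetical names, not functions confirmed to exist in `pz_parser.py`.

```python
import re

# Severity regex as given in phase 1 of the spec.
SEVERITY = re.compile(r"^\s*(ERROR|SEVERE|WARN)\s*[:\s]")
# Stack-shape markers taken from the pre-stack bullet (substring test is an assumption).
STACK_MARKERS = ("at ", "[string ", "function:", "file:", ".lua")
MAX_LOOKBACK = 25
MAX_COLLECTED = 8


def looks_stack_shaped(line):
    """True when the line carries at least one stack-like marker."""
    return any(marker in line for marker in STACK_MARKERS)


def collect_pre_stack(lines, hit_index):
    """Walk backwards from the severity line at hit_index and collect context."""
    collected = []
    lowest = max(0, hit_index - MAX_LOOKBACK)
    for i in range(hit_index - 1, lowest - 1, -1):
        # Stop at the previous severity line or once enough lines are held.
        if SEVERITY.match(lines[i]) or len(collected) >= MAX_COLLECTED:
            break
        collected.append(lines[i].strip())
    collected.reverse()  # restore file order
    # Discard the block unless something in it looks like a stack frame.
    if not any(looks_stack_shaped(line) for line in collected):
        return []
    return collected
```

The post-stack walk would mirror this in the forward direction, with the engine-noise gate applied per line.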
+
+### 3. Mod attribution — three buckets
+
+| Bucket | Trigger | Confidence |
+|---|---|---|
+| `direct` | Line itself matches `Lua\(\(MOD:([^)]+)\)\)` (or the `require("X") failed` shape, or an explicit `needed by <mod>` hint elsewhere in the entry) | `high` |
+| `inferred` | No marker on this line, but body is Lua-shaped (see below) *and* a `Lua((MOD:Y))` was emitted within the previous 40 lines | `medium` |
+| `unattributed` | Neither of the above | `low`; `mod_id = "__unattributed__"` |
+
+"Lua-shaped" means the body matches at least one of (case-insensitive): `luamanager.getfunctionobject`, `no such function`, `exception thrown`, `runtimeexception`, `illegalstateexception`, or contains the bare token `lua`. This filter prevents inferred attribution from latching onto unrelated severity lines that happened to fall within the lookback window.
+
+`mod_id` derives from the marker's raw name with a `_norm_mod_key` transform: lowercase, strip spaces / apostrophes / hyphens. `mod_name` preserves the human-readable form.
+
+We do **not** attempt pzmm's filesystem-based reattribution.
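
Review note: a minimal sketch of the three-bucket decision, assuming only the `Lua((MOD:X))` marker shape is checked (the `require("X") failed` and `needed by <mod>` shapes are left out) and that the "bare token `lua`" test means a word-boundary match. `attribute`, `is_lua_shaped`, and the sample mod name are hypothetical; `norm_mod_key` follows the transform described above without the leading underscore.

```python
import re

MOD_MARKER = re.compile(r"Lua\(\(MOD:([^)]+)\)\)")
LUA_SHAPED_TOKENS = (
    "luamanager.getfunctionobject",
    "no such function",
    "exception thrown",
    "runtimeexception",
    "illegalstateexception",
)
LOOKBACK_LINES = 40
UNATTRIBUTED = "__unattributed__"


def norm_mod_key(raw_name):
    """Lowercase and strip spaces, apostrophes, and hyphens."""
    return re.sub(r"[ '\-]", "", raw_name.lower())


def is_lua_shaped(body):
    """Case-insensitive check for the Lua-shaped tokens or a bare 'lua' word."""
    lowered = body.lower()
    if any(token in lowered for token in LUA_SHAPED_TOKENS):
        return True
    return re.search(r"\blua\b", lowered) is not None


def attribute(lines, hit_index):
    """Return (mod_id, bucket, confidence) for the severity line at hit_index."""
    direct = MOD_MARKER.search(lines[hit_index])
    if direct:
        return norm_mod_key(direct.group(1)), "direct", "high"
    if is_lua_shaped(lines[hit_index]):
        start = max(0, hit_index - LOOKBACK_LINES)
        # Nearest marker within the previous 40 lines wins.
        for i in range(hit_index - 1, start - 1, -1):
            marker = MOD_MARKER.search(lines[i])
            if marker:
                return norm_mod_key(marker.group(1)), "inferred", "medium"
    return UNATTRIBUTED, "unattributed", "low"
```

Checking Lua shape before the lookback is what keeps an unrelated severity line from inheriting a nearby mod marker.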
+
+### 4. File:line extraction — five fallbacks
+
+Tried in order against the entry body and stack frames:
+
+1. `at <path>.lua:<n>`
+2. `function: ... file: <path>.lua line #<n>` (or `: <n>`)
+3. `[string "<path>.lua"]:<n>`
+4. quoted path ending in `.lua` / `.txt` / `.xml` / `.json` / `.ini` / `.cfg` / `.bin`
+5. unquoted path segment beginning with `media/`, `maps/`, `lua/`, `scripts/`
+
+Returns `(file, line)`; `line=0` if the matched form had no line number.
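
Review note: the fallback order can be expressed as two ordered tuples of patterns. Every regex below is an assumed approximation of the shape named in the list, not the expressions in `pz_parser.py`, and returning `None` when nothing matches is also an assumption since the spec does not say.

```python
import re

# Forms that carry a line number, in the order the spec lists them (1 to 3).
WITH_LINE = (
    re.compile(r"\bat\s+(\S+\.lua):(\d+)"),
    re.compile(r"function:.*?file:\s*(\S+\.lua)(?:\s+line\s*#\s*|\s*:\s*)(\d+)"),
    re.compile(r'\[string "([^"]+\.lua)"\]:(\d+)'),
)
# Forms without a line number (4 and 5); these yield line=0.
PATH_ONLY = (
    re.compile(r"""["']([^"']+\.(?:lua|txt|xml|json|ini|cfg|bin))["']"""),
    re.compile(r"\b((?:media|maps|lua|scripts)/\S+)"),
)


def extract_file_line(text):
    """Return (file, line) from the first matching fallback, or None."""
    for pattern in WITH_LINE:
        found = pattern.search(text)
        if found:
            return found.group(1), int(found.group(2))
    for pattern in PATH_ONLY:
        found = pattern.search(text)
        if found:
            return found.group(1), 0
    return None
```

A caller would try the entry body first and then each stack frame, stopping at the first non-`None` result.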
+
+### 5. Cause-chain extraction
+
+`Caused by: <X>` chains plus standalone exception lines (`(\w+\.)+\w+(Exception|Error): <msg>`) are normalised to `<ExceptionClass>: <msg>` tokens and joined with ` -> `. Up to 6 chain levels, deduped. Captures both Java exception nesting and Lua-wrapped exception chains.
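
Review note: one way to read phase 5, sketched under the assumptions that the exception class is kept fully qualified and that a `Caused by:` line without a recognisable exception class is kept as its raw text. `cause_chain` is a hypothetical name.

```python
import re

CAUSED_BY = re.compile(r"Caused by:\s*(.+)")
EXCEPTION_LINE = re.compile(r"((?:\w+\.)+\w+(?:Exception|Error)):\s*(.*)")
MAX_LEVELS = 6


def cause_chain(lines):
    """Join normalised '<ExceptionClass>: <msg>' tokens with ' -> '."""
    tokens = []
    for line in lines:
        caused = CAUSED_BY.search(line)
        candidate = caused.group(1) if caused else line
        exc = EXCEPTION_LINE.search(candidate)
        if exc:
            token = f"{exc.group(1)}: {exc.group(2).strip()}"
        elif caused:
            token = candidate.strip()
        else:
            continue  # neither a Caused-by line nor a standalone exception
        if token not in tokens:  # dedupe while preserving order
            tokens.append(token)
        if len(tokens) == MAX_LEVELS:
            break
    return " -> ".join(tokens)
```

Stack-frame lines such as `at zombie.Foo.bar(Foo.java:10)` fall through the `continue` branch, so only exception headers contribute to the chain.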
+
+### 6. Java exception kind detection
+
+DebugLog-server has both Lua and Java exceptions; pzmm targets `console.txt` which is Lua-dominant. Extension here:
+
+- `kind = "java_exception"` when the entry body or stack contains `(\w+\.)+\w+(Exception|Error)` AND no `Lua((MOD:X))` marker is present anywhere in the entry.
+- These typically resolve to `mod_id: __unattributed__` because Java code in PZ is engine, not mod. The exception class name becomes part of the message skeleton so similar Java exceptions dedup tightly.
+
+### 7. Engine-noise tagging
+
+`kind = "engine_noise"` when the body contains `kahluathread.flusherrormessage` or `dumping lua stack trace`. These severity-ERROR lines are PZ's own diagnostic chatter about its error reporting, not actual errors. They stay in the output (consumer can filter on `kind`).
+
+### 8. Signature computation
+
+Two-level deterministic identity, both stored on every record:
+
+```
+pattern_id  = sha256(level + normalized_first_line)[:16]
+signature   = sha256(pattern_id + mod_id)[:16]
+```
+
+Normalization for `pattern_id`:
+- Strip session metadata prefix (`General  f:N, t:N, st:N,N,N,N>` shape)
+- Strip body-prefix severity token (`ERROR:` / `SEVERE:` / `WARN:` / `FATAL:`, case-insensitive) so a body that opens with the severity word still hashes the same as one that doesn't.
+- Flatten double- and single-quoted strings to `"<S>"` / `'<S>'`
+- Flatten ≥2-digit numeric runs to `<N>`
+- Collapse whitespace
+- Truncate to 200 chars
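
Review note: the two hashes and the normalisation list translate fairly directly into stdlib Python. The sketch applies the steps in the order listed; `SESSION_PREFIX` is an assumed regex for the session metadata shape, string concatenation without a separator is taken from the pseudocode as written, and the function names are hypothetical.

```python
import hashlib
import re

# Assumed pattern for the "General  f:N, t:N, st:N,N,N,N>" prefix (B41 and B42 shapes).
SESSION_PREFIX = re.compile(r"^\s*\w+\s+f:\d+(?:,\s+t:\d+)?,?\s+st:[\d,]+>\s*")
SEVERITY_TOKEN = re.compile(r"^\s*(?:ERROR|SEVERE|WARN|FATAL)\s*:\s*", re.IGNORECASE)


def normalize_first_line(first_line):
    """Apply the pattern_id normalisation steps in the listed order."""
    text = SESSION_PREFIX.sub("", first_line)
    text = SEVERITY_TOKEN.sub("", text)
    text = re.sub(r'"[^"]*"', '"<S>"', text)   # flatten double-quoted strings
    text = re.sub(r"'[^']*'", "'<S>'", text)   # flatten single-quoted strings
    text = re.sub(r"\d{2,}", "<N>", text)      # flatten runs of 2+ digits
    text = " ".join(text.split())              # collapse whitespace
    return text[:200]


def short_hash(payload):
    """First 16 hex characters of the SHA-256 digest."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def pattern_id(level, first_line):
    return short_hash(level + normalize_first_line(first_line))


def signature(level, first_line, mod_id):
    return short_hash(pattern_id(level, first_line) + mod_id)
```

Two lines that differ only in spacing, quoted file names, or multi-digit numbers then share a `pattern_id`, while the same shape under two mods yields two distinct `signature` values.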
+
+Both fields ride on every record, giving two consumer views, neither of which requires an LLM:
+
+- **Per-mod view** (signature is the dedup key): one record per `(mod_id, error_shape)` pair.
+- **Pattern fan-out view** (group records by `pattern_id`): see all mods that hit the same shape.
+
+### 9. Aggregation
+
+Records dedup on `signature`. On second-and-subsequent occurrences: `occurrence_count++`, `files` set-extends, attribution-confidence promotes (direct beats inferred beats unattributed), stack and `cause_chain` merge.
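
Review note: the merge rules can be sketched with a dict keyed on `signature`. The record keys `source_file`, `attribution`, and `stack` are hypothetical stand-ins for whatever the parser emits, and `cause_chain` merging is omitted because the spec does not say how two chains combine.

```python
# Higher rank wins when two occurrences disagree on attribution.
RANK = {"unattributed": 0, "inferred": 1, "direct": 2}


def aggregate(records):
    """Collapse per-occurrence records into one record per signature."""
    merged = {}
    for rec in records:
        sig = rec["signature"]
        if sig not in merged:
            merged[sig] = {
                **rec,
                "occurrence_count": 1,
                "files": [rec["source_file"]],
                "stack": list(rec["stack"]),
            }
            continue
        agg = merged[sig]
        agg["occurrence_count"] += 1
        if rec["source_file"] not in agg["files"]:
            agg["files"].append(rec["source_file"])
        # Promote attribution: direct beats inferred beats unattributed.
        if RANK[rec["attribution"]] > RANK[agg["attribution"]]:
            agg["attribution"] = rec["attribution"]
        # Merge stack frames, keeping first-seen order.
        for frame in rec["stack"]:
            if frame not in agg["stack"]:
                agg["stack"].append(frame)
    return list(merged.values())
```

Because dicts preserve insertion order, the result lists signatures in first-seen order, which keeps the JSON output stable across runs on the same input.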
+
+## Output schema
+
+```json
+{
+  "meta": {
+    "input_dir": "/opt/ik-codex/.scratch/pz/Logs.redacted",
+    "files_scanned": 6,
+    "log_lines_total": 78654,
+    "error_lines_total": 30984,
+    "unique_signatures": N,
+    "unique_patterns": M,
+    "redacted": true,
+    "started": "ISO8601",
+    "finished": "ISO8601"
+  },
+  "signatures": [
+    {
+      "signature": "sha256:...",
+      "pattern_id": "sha256:...",
+      "level": "ERROR",
+      "kind": "lua_runtime|require_failed|java_exception|engine_noise|runtime",
+      "mod_id": "spongies_clothing",
+      "mod_name": "Spongie's Clothing",
+      "attribution": "direct|inferred|unattributed",
+      "confidence": "high|medium|low",
+      "attribution_reason": "...",
+      "file": "media/lua/client/X.lua",
+      "line": 42,
+      "cause_chain": "ExceptionA: msg -> ExceptionB: msg",
+      "stack": ["at A.lua:12", "at B.lua:34"],
+      "first_seen": {"file": "...", "line": 1234, "timestamp": "26-04-26 17:14:35.128"},
+      "occurrence_count": 47,
+      "files": ["..."],
+      "excerpt": "..."
+    }
+  ],
+  "summary": {
+    "errors": N,
+    "warnings": N,
+    "by_kind": {"lua_runtime": ..., "java_exception": ..., "require_failed": ..., "engine_noise": ..., "runtime": ...},
+    "by_attribution": {"direct": ..., "inferred": ..., "unattributed": ...},
+    "by_confidence": {"high": ..., "medium": ..., "low": ...},
+    "top_mods": [{"mod_id": "...", "mod_name": "...", "occurrence_count": N}, ...]
+  }
+}
+```
+
+Default output path: `/opt/ik-codex/.scratch/pz/classify.json` (gitignored under `.scratch/`).
+
+## CLI
+
+```
+pz_classify.py [--input <dir>] [--out <path>] [--quiet]
+```
+
+- `--input` defaults to `<repo>/.scratch/pz/Logs.redacted`
+- `--out` defaults to `<repo>/.scratch/pz/classify.json`
+- `--quiet` suppresses the trailing summary line
+
+No `--limit`, `--resume`, or `--checkpoint-every`. Runs in seconds; nothing to throttle or resume.
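
Review note: the three flags map onto a small `argparse` setup. This is a sketch assuming the repo root is passed in by the caller; `build_parser` is a hypothetical helper, and how `pz_classify.py` actually resolves `<repo>` is not stated in the spec.

```python
import argparse
from pathlib import Path


def build_parser(repo_root):
    """Build the CLI described above; defaults are relative to repo_root."""
    parser = argparse.ArgumentParser(prog="pz_classify.py")
    parser.add_argument(
        "--input",
        type=Path,
        default=repo_root / ".scratch/pz/Logs.redacted",
        help="directory of redacted DebugLog-server files",
    )
    parser.add_argument(
        "--out",
        type=Path,
        default=repo_root / ".scratch/pz/classify.json",
        help="path of the JSON report to write",
    )
    parser.add_argument(
        "--quiet",
        action="store_true",
        help="suppress the trailing summary line",
    )
    return parser


if __name__ == "__main__":
    args = build_parser(Path.cwd()).parse_args()
    print(args.input, args.out, args.quiet)
```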
+
+## Tests
+
+New directory `tools/pz-analyzer/tests/`. Stdlib `unittest`. Three files, ~18 tests total.
+
+- **`test_parser.py`** (~10 tests) — one fixture per scenario in `tests/fixtures/` (synthetic, tracked in git): pure-Lua-attributed, pure-Java-exception, inferred-from-context, unattributed-engine-noise, multi-cause-chain, pre-stack-collection, post-stack-collection, severity-variants, file-line-extraction-fallbacks. All synthetic identifiers (placeholder Steam IDs / mod names) per the existing PHP-side `test/src/Games/ProjectZomboid/fixtures/` convention.
+- **`test_attribution.py`** (~5 tests) — three confidence buckets, the 40-line lookback boundary, "needed by X" extraction, and the rejection of inferred attribution when the message isn't Lua-shaped.
+- **`test_signatures.py`** (~3 tests) — `pattern_id` stability across formatting variations (whitespace, numeric values, quoted strings) and `signature` uniqueness across mods.
+
+Invocation: `python -m unittest discover tools/pz-analyzer/tests/`. No external deps.
+
+## Verification
+
+End-to-end smoke against the redacted real-data directory:
+
+```
+bash /opt/ik-codex/tools/pz-analyzer/pz_redact_all.sh   # one-time, already done
+python /opt/ik-codex/tools/pz-analyzer/pz_classify.py
+```
+
+Expect:
+- 6 files scanned, ~30,984 error lines processed.
+- A meaningful number of unique signatures and patterns (likely in the low hundreds for signatures; fewer patterns).
+- `top_mods` lists the highest-occurrence mods.
+- PII audit: no real Steam IDs, IPs, or coordinates in the output JSON (input is already redacted; classifier doesn't introduce PII).
+
+Test invocation: `python -m unittest discover tools/pz-analyzer/tests/` should be all-green.
+
+## Risks and open questions
+
+- **Inferred attribution accuracy.** The 40-line lookback is pzmm's heuristic; it's correct for tightly-paced server bursts but can mis-attribute when an unrelated mod logs in the gap. Inferred records surface as `confidence: medium` so consumers can choose to treat them differently. Acceptable for v1; tunable via a constant in `pz_parser.py`.
+- **Pzmm targets `console.txt`, we target `DebugLog-server.txt`.** Format overlap is high (both share `Lua((MOD:X))` markers, Caused-by chains, Java exception shapes), but some patterns may be `console.txt`-specific. Tests use `DebugLog-server`-shaped fixtures only.
+- **Future PHP port.** `pz_parser.py` is structured for mechanical translation to a `LuaErrorAnalyser` / `ModAttributionAnalyser` pair under `src/Analyser/ProjectZomboid/` in a separate spec. Output schema chosen to be PHP-codex-compatible (Insight subclasses with typed fields).
+- **Licence.** The `paraxaQQ/pzmm` zip we reviewed has no top-level LICENSE; this spec mandates reimplementing the patterns from scratch rather than copying code. Regex shapes and heuristics are general programming patterns and not author-specific, and no code blocks are lifted verbatim.
+
+## Out of scope (explicit)
+
+- Editing `pz_error_analysis.py` or `pz_redact_all.sh`.
+- Modifying any file in `/opt/ik-codex/src/`, `/opt/ik-codex/test/`, or `/opt/iblogs/`.
+- AI / LLM integration of any kind in the new tool.
+- LLM inference at runtime in iblogs / bosslogs production. The Qwen analyzer (`pz_error_analysis.py`) is a developer-only discovery tool used to expand the deterministic ruleset in `pz_parser.py` (and its future PHP port). Production rendering is deterministic-only, forever.
+- iblogs front-end rendering of the classification output.
+- Filesystem mod-scan reattribution (pzmm's symbol/vehicle indexes).
--- a/src/Analyser/ProjectZomboid/ErrorContextAnalyser.php
+++ b/src/Analyser/ProjectZomboid/ErrorContextAnalyser.php
@@ -0,0 +1,131 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Analyser\ProjectZomboid;
+
+use IndifferentKetchup\Codex\Analyser\Analyser;
+use IndifferentKetchup\Codex\Analysis\Analysis;
+use IndifferentKetchup\Codex\Analysis\AnalysisInterface;
+use IndifferentKetchup\Codex\Analysis\ProjectZomboid\ErrorContextProblem;
+use IndifferentKetchup\Codex\Analysis\ProjectZomboid\ErrorContextTruncatedInformation;
+use IndifferentKetchup\Codex\Log\EntryInterface;
+use IndifferentKetchup\Codex\Log\Level;
+
+/**
+ * Surfaces ERROR or WARNING entries with a sliding context window of
+ * surrounding entries, so a viewer can see the lead-up and aftermath of
+ * each event without scanning the full log. PatternAnalyser cannot
+ * express this because windows span multiple entries; this walks once,
+ * classifies by Level (already resolved by the parser), and emits one
+ * ErrorContextProblem per hit.
+ *
+ * Stack-trace continuation lines are absorbed into the same Entry as the
+ * level header that preceded them by PatternParser, so noise filtering
+ * happens at parse time — windows here count Entries, not raw lines, and
+ * a stack-trace ERROR contributes exactly one window.
+ *
+ * Overlapping windows are clipped rather than duplicated: when two hits
+ * fall within CONTEXT_BEFORE + CONTEXT_AFTER of each other, the later
+ * window's before- and after-ranges are clipped to start past the
+ * previously emitted range so no Entry appears in two context arrays.
+ * The hit cap is checked before each emission: the first hit found after
+ * HIT_CAP problems stops the walk, and an ErrorContextTruncatedInformation
+ * is added to the analysis instead of further problems.
+ */
+class ErrorContextAnalyser extends Analyser
+{
+    /**
+     * Number of entries preceding a hit captured as leading context.
+     * Twenty entries is wide enough to surface the immediate precursor
+     * events (mod load, player join, prior warning) for a server-log
+     * error without dragging in unrelated activity from minutes earlier.
+     */
+    public const int CONTEXT_BEFORE = 20;
+
+    /**
+     * Number of entries following a hit captured as trailing context.
+     * Mirrors CONTEXT_BEFORE so windows are symmetric and the maximum
+     * window size is CONTEXT_BEFORE + 1 (hit) + CONTEXT_AFTER = 41
+     * entries.
+     */
+    public const int CONTEXT_AFTER = 20;
+
+    /**
+     * Maximum number of hits emitted before truncation. Caps memory and
+     * output size on logs with cascading errors (e.g. a save-system
+     * failure that produces an error every tick). The first hit beyond
+     * the cap adds an ErrorContextTruncatedInformation to the analysis so
+     * consumers can flag truncation rather than silently dropping later hits.
+     */
+    public const int HIT_CAP = 500;
+
+    public function analyse(): AnalysisInterface
+    {
+        $analysis = new Analysis();
+        $analysis->setLog($this->log);
+
+        $entries = [];
+        foreach ($this->log as $entry) {
+            $entries[] = $entry;
+        }
+        $count = count($entries);
+
+        $hits = 0;
+        $truncated = false;
+        $lastEmittedIndex = -1;
+
+        for ($i = 0; $i < $count; $i++) {
+            $type = $this->classify($entries[$i]);
+            if ($type === null) {
+                continue;
+            }
+
+            if ($hits >= self::HIT_CAP) {
+                $truncated = true;
+                break;
+            }
+
+            $beforeStart = max($lastEmittedIndex + 1, $i - self::CONTEXT_BEFORE);
+            if ($beforeStart > $i) {
+                $beforeStart = $i;
+            }
+            $afterStart = max($lastEmittedIndex + 1, $i + 1);
+            $afterEnd = min($count - 1, $i + self::CONTEXT_AFTER);
+            $afterLength = max(0, $afterEnd - $afterStart + 1);
+
+            $analysis->addInsight((new ErrorContextProblem())
+                ->setEntry($entries[$i])
+                ->setType($type)
+                ->setEntryIndex($i + 1)
+                ->setBefore(array_slice($entries, $beforeStart, $i - $beforeStart))
+                ->setAfter(array_slice($entries, $afterStart, $afterLength)));
+
+            $hits++;
+            $lastEmittedIndex = max($lastEmittedIndex, $afterEnd);
+        }
+
+        if ($truncated) {
+            $analysis->addInsight((new ErrorContextTruncatedInformation())
+                ->setHitCap(self::HIT_CAP));
+        }
+
+        return $analysis;
+    }
+
+    /**
+     * Classify an entry as 'error', 'warning', or null based on its Level.
+     * Levels at least as severe as ERROR, i.e. numerically at or below it
+     * (EMERGENCY/ALERT/CRITICAL/ERROR), collapse into 'error'; WARNING alone
+     * collapses into 'warning'. Returns null for anything less severe.
+     */
+    protected function classify(EntryInterface $entry): ?string
+    {
+        $level = $entry->getLevel()->asInt();
+        if ($level <= Level::ERROR->asInt()) {
+            return 'error';
+        }
+        if ($level === Level::WARNING->asInt()) {
+            return 'warning';
+        }
+        return null;
+    }
+}
--- a/src/Analysis/ProjectZomboid/ErrorContextProblem.php
+++ b/src/Analysis/ProjectZomboid/ErrorContextProblem.php
@@ -0,0 +1,130 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Analysis\ProjectZomboid;
+
+use IndifferentKetchup\Codex\Analysis\InsightInterface;
+use IndifferentKetchup\Codex\Analysis\Problem;
+use IndifferentKetchup\Codex\Log\EntryInterface;
+
+/**
+ * Problem emitted by ErrorContextAnalyser for each ERROR or WARNING entry,
+ * carrying a sliding window of surrounding entries as before/after
+ * context. Coalesced by 1-based entryIndex so re-adding the same hit
+ * never produces duplicate problems.
+ */
+class ErrorContextProblem extends Problem
+{
+    private string $type = 'error';
+    private int $entryIndex = 0;
+
+    /**
+     * @var EntryInterface[]
+     */
+    private array $before = [];
+
+    /**
+     * @var EntryInterface[]
+     */
+    private array $after = [];
+
+    /**
+     * @param string $type 'error' or 'warning'
+     * @return $this
+     */
+    public function setType(string $type): static
+    {
+        $this->type = $type;
+        return $this;
+    }
+
+    /**
+     * @return string
+     */
+    public function getType(): string
+    {
+        return $this->type;
+    }
+
+    /**
+     * @param int $entryIndex 1-based index of the hit entry within the log
+     * @return $this
+     */
+    public function setEntryIndex(int $entryIndex): static
+    {
+        $this->entryIndex = $entryIndex;
+        return $this;
+    }
+
+    /**
+     * @return int 1-based index of the hit entry within the log
+     */
+    public function getEntryIndex(): int
+    {
+        return $this->entryIndex;
+    }
+
+    /**
+     * @param EntryInterface[] $entries
+     * @return $this
+     */
+    public function setBefore(array $entries): static
+    {
+        $this->before = $entries;
+        return $this;
+    }
+
+    /**
+     * @return EntryInterface[]
+     */
+    public function getBefore(): array
+    {
+        return $this->before;
+    }
+
+    /**
+     * @param EntryInterface[] $entries
+     * @return $this
+     */
+    public function setAfter(array $entries): static
+    {
+        $this->after = $entries;
+        return $this;
+    }
+
+    /**
+     * @return EntryInterface[]
+     */
+    public function getAfter(): array
+    {
+        return $this->after;
+    }
+
+    /**
+     * Convenience accessor returning before-context, hit entry, and
+     * after-context as a single ordered array of at most
+     * ErrorContextAnalyser::CONTEXT_BEFORE + 1 + CONTEXT_AFTER = 41
+     * entries.
+     *
+     * @return EntryInterface[]
+     */
+    public function getContext(): array
+    {
+        return [...$this->before, $this->getEntry(), ...$this->after];
+    }
+
+    public function getMessage(): string
+    {
+        return sprintf(
+            '%s at entry %d (%d before, %d after)',
+            strtoupper($this->type),
+            $this->entryIndex,
+            count($this->before),
+            count($this->after)
+        );
+    }
+
+    public function isEqual(InsightInterface $insight): bool
+    {
+        return $insight instanceof self && $insight->getEntryIndex() === $this->entryIndex;
+    }
+}
--- a/src/Analysis/ProjectZomboid/ErrorContextTruncatedInformation.php
+++ b/src/Analysis/ProjectZomboid/ErrorContextTruncatedInformation.php
@@ -0,0 +1,42 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Analysis\ProjectZomboid;
+
+use IndifferentKetchup\Codex\Analysis\Information;
+use IndifferentKetchup\Codex\Analysis\InsightInterface;
+
+/**
+ * Emitted by ErrorContextAnalyser at most once, when a hit arrives after
+ * the hit cap is reached, so downstream consumers can surface a "results
+ * truncated" notice instead of silently dropping subsequent hits.
+ */
+class ErrorContextTruncatedInformation extends Information
+{
+    private int $hitCap = 0;
+
+    /**
+     * @param int $hitCap the cap that was hit (mirrors
+     *     ErrorContextAnalyser::HIT_CAP at emission time)
+     * @return $this
+     */
+    public function setHitCap(int $hitCap): static
+    {
+        $this->hitCap = $hitCap;
+        $this->setLabel('Error context');
+        $this->setValue(sprintf('truncated after %d hits', $hitCap));
+        return $this;
+    }
+
+    /**
+     * @return int
+     */
+    public function getHitCap(): int
+    {
+        return $this->hitCap;
+    }
+
+    public function isEqual(InsightInterface $insight): bool
+    {
+        return $insight instanceof self;
+    }
+}
--- a/src/Pattern/ProjectZomboid/DebugServerPattern.php
+++ b/src/Pattern/ProjectZomboid/DebugServerPattern.php
@@ -15,7 +15,7 @@ namespace IndifferentKetchup\Codex\Pattern\ProjectZomboid;
 */
 class DebugServerPattern
 {
-    public const string LINE = '/^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)\s+f:\d+,\s+t:\d+,\s+st:[\d,]+>\s+.*$/';
+    public const string LINE = '/^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)\s+f:\d+(?:,\s+t:\d+)?,?\s+st:[\d,]+>\s+.*$/';

    public const string VERSION = '/version=(?<version>\S+) (?<hash>[a-f0-9]{40}) (?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2})/';

--- a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php
+++ b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php
@@ -7,15 +7,24 @@ use IndifferentKetchup\Codex\Util\RedactorInterface;
 /**
 * Render-time PII filter for Project Zomboid log content.
 *
- * Applies up to three sequential regex passes over the raw log string,
+ * Applies up to four sequential regex passes over the raw log string,
 * each controlled by a boolean toggle (all enabled by default):
 *
- *   1. Steam ID pass   — replaces 17-digit Steam IDs with a placeholder token.
- *   2. Player name pass — replaces player display names with a placeholder
+ *   1. IP address pass — replaces IPv4 addresses (with optional :port
+ *      suffix) and IPv6 addresses (full, abbreviated, bracketed, and
+ *      IPv4-mapped forms; all with optional :port when bracketed) with
+ *      a placeholder token. Pattern-disjoint from the other passes.
+ *   2. Steam ID pass    — replaces 17-digit Steam IDs with a placeholder
+ *      token.
+ *   3. Player name pass — replaces player display names with a placeholder
 *      token. This pass anchors on the already-redacted Steam ID token, so
 *      the ordering Steam ID -> name -> coordinates is mandatory.
- *   3. Coordinates pass — replaces world coordinate triplets with a placeholder
- *      token.
+ *   4. Coordinates pass — replaces world coordinate triplets with a
+ *      placeholder token.
+ *
+ * Pass 1 runs first by convention, not dependency: it shares no anchors
+ * with passes 2-4 and could run anywhere in the chain without affecting
+ * their output.
 *
 * All regex passes use the /u flag for Unicode safety.
 *
@@ -24,6 +33,29 @@ use IndifferentKetchup\Codex\Util\RedactorInterface;
 */
 class ProjectZomboidRedactor implements RedactorInterface
 {
+    /** Generic placeholder substituted for every matched IPv4 or IPv6 address (with port suffix consumed when present). */
+    public const string IP_REPLACEMENT = '[REDACTED_IP]';
+
+    /** Strict IPv4 with valid 0-255 octets and optional :port suffix. Lookarounds reject matches embedded in longer alphanumeric or dotted-decimal tokens; the (?<!\d\.) / (?!\.\d) pair specifically prevents matching inside an N-octet (N>4) sequence like 1.2.3.4.5 while still allowing a trailing sentence period after the IP/port. */
+    public const string IPV4_REGEX = '/'
+        . '(?<![A-Za-z0-9_:])(?<!\d\.)'
+        . '(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
+        . '(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}'
+        . '(?::\d{1,5})?'
+        . '(?![A-Za-z0-9_:])(?!\.\d)'
+        . '/u';
+
+    /** Coarse IPv6 candidate matcher (bracketed-with-port, or bare 2-7-colon hex form covering full / abbreviated / IPv4-mapped). Each match is validated with filter_var() in the redact() callback so PHP/Java scope ops like Foo::Bar and PZ timestamps like 12:00:00.000 are rejected. Boundary lookarounds mirror the IPv4 regex so trailing sentence periods don't block the match. */
+    public const string IPV6_REGEX = '/'
+        . '(?<![A-Za-z0-9_:])(?<!\d\.)'
+        . '(?:'
+        . '\[(?<bracketed>[0-9a-fA-F:.]+)\](?::\d{1,5})?'
+        . '|'
+        . '(?<bare>(?:[0-9a-fA-F]{0,4}:){2,7}[0-9a-fA-F.]*)'
+        . ')'
+        . '(?![A-Za-z0-9_:])(?!\.\d)'
+        . '/u';
+
    /** Regex matching a 17-digit SteamID64 anchored on the 76561198 universe prefix, with lookaround boundaries that reject embedded occurrences. */
    public const string STEAM_ID_REGEX = '/(?<![A-Za-z0-9])76561198\d{9}(?![A-Za-z0-9])/u';

@@ -54,10 +86,23 @@ class ProjectZomboidRedactor implements RedactorInterface
    /** Matches integer coordinate triplets enclosed in round parentheses, anchored on a trailing PvP verb to disambiguate from server-metadata triples (pvp.txt Combat:/Safety: shape); only the attacker/first-coord set is redacted per line — the victim coords lack the trailing keyword and are deferred to v2. */
    public const string COORDS_PARENTHESISED_REGEX = '/(?<=\()(?<x>\d+),(?<y>\d+),(?<z>-?\d+)(?=\) (?:hit|restore|store|true|false))/u';

+    private bool $redactIpAddresses = true;
    private bool $redactSteamIds = true;
    private bool $redactPlayerNames = true;
    private bool $redactCoordinates = true;

+    /**
+     * Enable or disable the IP address redaction pass (covers IPv4 and IPv6).
+     *
+     * @param bool $on Pass true to enable, false to disable.
+     * @return static
+     */
+    public function redactIpAddresses(bool $on): static
+    {
+        $this->redactIpAddresses = $on;
+        return $this;
+    }
+
    /**
     * Enable or disable the Steam ID redaction pass.
     *
@@ -97,14 +142,31 @@ class ProjectZomboidRedactor implements RedactorInterface
    /**
     * Redact PII from the given Project Zomboid log content.
     *
-     * Passes are applied in the mandatory order: Steam ID -> player name ->
-     * coordinates. See class docblock for rationale.
+     * Passes are applied in the order: IP address -> Steam ID -> player
+     * name -> coordinates. The Steam ID -> name -> coordinates ordering
+     * is mandatory (see class docblock); the IP pass is pattern-disjoint
+     * and runs first by convention.
     *
     * @param string $content Raw log content that may contain PII.
     * @return string Content with enabled PII categories replaced by tokens.
     */
    public function redact(string $content): string
    {
+        if ($this->redactIpAddresses) {
+            $content = preg_replace_callback(
+                self::IPV6_REGEX,
+                static function (array $matches): string {
+                    $candidate = ($matches['bracketed'] ?? '') !== ''
+                        ? $matches['bracketed']
+                        : ($matches['bare'] ?? '');
+                    return filter_var($candidate, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false
+                        ? self::IP_REPLACEMENT
+                        : $matches[0];
+                },
+                $content
+            );
+            $content = preg_replace(self::IPV4_REGEX, self::IP_REPLACEMENT, $content);
+        }
        if ($this->redactSteamIds) {
            $content = preg_replace(self::STEAM_ID_REGEX, self::STEAM_ID_REPLACEMENT, $content);
        }
--- a/test/src/Games/ProjectZomboid/fixtures/debug-server-42x-minimal.txt
+++ b/test/src/Games/ProjectZomboid/fixtures/debug-server-42x-minimal.txt
@@ -0,0 +1,22 @@
+[16-04-26 00:00:42.314] LOG : General f:0 st:48,648,157,434> SLF4J(W): No SLF4J providers were found..
+[16-04-26 00:00:42.315] LOG : General f:0 st:48,648,157,492> SLF4J(W): Defaulting to no-operation (NOP) logger implementation.
+[16-04-26 00:00:42.407] LOG : General f:0 st:48,648,157,584> version=42.17.0 0000000000000000000000000000000000000000 2026-04-20 14:34:44 (ZB) demo=false.
+[16-04-26 00:00:42.407] LOG : General f:0 st:48,648,157,585> revision=0000000000000000000000000000000000000000 date=2026-04-20 time=14:34:44 (ZB).
+[16-04-26 00:01:19.080] ERROR: General f:0 st:48,648,194,258> DebugFileWatcher.registerDir> Exception thrown
+	java.nio.file.NoSuchFileException: /placeholder/config/mods at UnixException.translateToIOException(null:-1).
+	Stack trace:
+		java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
+		java.base/sun.nio.fs.UnixException.asIOException(Unknown Source)
+		java.base/sun.nio.fs.LinuxWatchService$Poller.implRegister(Unknown Source)
+		java.base/sun.nio.fs.AbstractPoller.processRequests(Unknown Source)
+		java.base/sun.nio.fs.LinuxWatchService$Poller.run(Unknown Source)
+[16-04-26 00:01:19.131] LOG : Mod f:0 st:48,648,194,309> loading example_mod_alpha.
+[16-04-26 00:01:19.142] LOG : Mod f:0 st:48,648,194,320> loading example_mod_beta.
+[16-04-26 00:01:19.155] LOG : Mod f:0 st:48,648,194,333> loading example_mod_gamma.
+[16-04-26 00:01:19.200] WARN : Mod f:0 st:48,648,194,378> ZomboidFileSystem.loadModAndRequired> required mod "absent_mod" not found.
+[16-04-26 00:01:45.937] ERROR: WorldGen f:0 st:48,648,221,115> IsoPropertyType.lookupOrDefaultStr> Exception thrown
+	zombie.core.properties.IsoPropertyType$IsoPropertyTypeNotFoundException: Property Name not found: ladderW at IsoPropertyType.lookup(IsoPropertyType.java:269). Message: Property Name not found: ladderW
+		at zombie.core.properties.IsoPropertyType.lookup(IsoPropertyType.java:269)
+		at zombie.iso.IsoChunkData.PostProcessChunk(IsoChunkData.java:512)
+[16-04-26 00:02:00.000] LOG : General f:0 st:48,648,235,178> server initialised.
+[16-04-26 00:05:00.000] LOG : General f:0 st:48,648,415,178> shutdown requested.
--- a/test/tests/Games/ProjectZomboid/Analyser/ErrorContextAnalyserTest.php
+++ b/test/tests/Games/ProjectZomboid/Analyser/ErrorContextAnalyserTest.php
@@ -0,0 +1,128 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Test\Tests\Games\ProjectZomboid\Analyser;
+
+use IndifferentKetchup\Codex\Analyser\AnalyserInterface;
+use IndifferentKetchup\Codex\Analyser\ProjectZomboid\ErrorContextAnalyser;
+use IndifferentKetchup\Codex\Analysis\ProjectZomboid\ErrorContextProblem;
+use IndifferentKetchup\Codex\Analysis\ProjectZomboid\ErrorContextTruncatedInformation;
+use IndifferentKetchup\Codex\Log\AnalysableLog;
+use IndifferentKetchup\Codex\Log\Entry;
+use IndifferentKetchup\Codex\Log\Level;
+use IndifferentKetchup\Codex\Log\Line;
+use PHPUnit\Framework\TestCase;
+
+class ErrorContextAnalyserTest extends TestCase
+{
+    /**
+     * Build an in-memory AnalysableLog with $count entries; entries whose
+     * 1-based index is in $errorIndices are tagged Level::ERROR, the rest
+     * Level::INFO. Anonymous AnalysableLog subclass keeps the fixture
+     * inline since we exercise the analyser directly via setLog().
+     *
+     * @param int[] $errorIndices 1-based entry indices to mark as ERROR
+     */
+    private function makeLog(array $errorIndices, int $count): AnalysableLog
+    {
+        $errorSet = array_flip($errorIndices);
+        $log = new class extends AnalysableLog {
+            public static function getDefaultAnalyser(): AnalyserInterface
+            {
+                return new ErrorContextAnalyser();
+            }
+        };
+        for ($n = 1; $n <= $count; $n++) {
+            $level = isset($errorSet[$n]) ? Level::ERROR : Level::INFO;
+            $entry = (new Entry())
+                ->setLevel($level)
+                ->addLine(new Line($n, sprintf('line %d', $n)));
+            $log->addEntry($entry);
+        }
+        return $log;
+    }
+
+    public function testEmitsThreeNonOverlappingWindows(): void
+    {
+        $log = $this->makeLog([10, 50, 95], 100);
+        $analysis = (new ErrorContextAnalyser())->setLog($log)->analyse();
+
+        $problems = $analysis->getFilteredInsights(ErrorContextProblem::class);
+        $this->assertCount(3, $problems);
+
+        $this->assertSame(10, $problems[0]->getEntryIndex());
+        $this->assertSame(50, $problems[1]->getEntryIndex());
+        $this->assertSame(95, $problems[2]->getEntryIndex());
+
+        // First hit (entry 10): 9 entries before (1..9), 20 after (11..30).
+        $this->assertCount(9, $problems[0]->getBefore());
+        $this->assertCount(20, $problems[0]->getAfter());
+
+        // Second hit (entry 50): clipped to 19 before (31..49), 20 after (51..70).
+        $this->assertCount(19, $problems[1]->getBefore());
+        $this->assertCount(20, $problems[1]->getAfter());
+
+        // Third hit (entry 95): full 20 before (75..94), 5 after (96..100, end of log).
+        $this->assertCount(20, $problems[2]->getBefore());
+        $this->assertCount(5, $problems[2]->getAfter());
+
+        // Total window per hit never exceeds 1 + CONTEXT_BEFORE + CONTEXT_AFTER = 41.
+        foreach ($problems as $problem) {
+            $this->assertLessThanOrEqual(ErrorContextAnalyser::CONTEXT_BEFORE, count($problem->getBefore()));
+            $this->assertLessThanOrEqual(ErrorContextAnalyser::CONTEXT_AFTER, count($problem->getAfter()));
+            $this->assertLessThanOrEqual(41, count($problem->getContext()));
+        }
+
+        // No entry appears in two problems' context arrays.
+        $seen = [];
+        foreach ($problems as $problem) {
+            foreach ([...$problem->getBefore(), ...$problem->getAfter()] as $entry) {
+                $id = spl_object_id($entry);
+                $this->assertArrayNotHasKey($id, $seen, 'Entry duplicated across problem context arrays');
+                $seen[$id] = true;
+            }
+        }
+    }
+
+    public function testMergesAdjacentWindowsWhenWithinContextRange(): void
+    {
+        // Errors 5 entries apart; without clipping their windows would
+        // overlap heavily.
+        $log = $this->makeLog([10, 15], 50);
+        $analysis = (new ErrorContextAnalyser())->setLog($log)->analyse();
+
+        $problems = $analysis->getFilteredInsights(ErrorContextProblem::class);
+        $this->assertCount(2, $problems);
+
+        // First hit: 9 before (1..9), 20 after (11..30). lastEmittedIndex=29 (0-based).
+        $this->assertCount(9, $problems[0]->getBefore());
+        $this->assertCount(20, $problems[0]->getAfter());
+
+        // Second hit at entry 15 (i=14). beforeStart=30 overshoots i and is clamped to i, so before is empty.
+        // afterStart=max(30, 15)=30, afterEnd=min(49, 34)=34, so after=entries 31..35
+        // (5 entries, all unseen).
+        $this->assertCount(0, $problems[1]->getBefore());
+        $this->assertCount(5, $problems[1]->getAfter());
+
+        // Confirm no entry appears in both problems' context arrays.
+        $first = [...$problems[0]->getBefore(), ...$problems[0]->getAfter()];
+        $second = [...$problems[1]->getBefore(), ...$problems[1]->getAfter()];
+        foreach ($second as $entry) {
+            $this->assertNotContains($entry, $first, 'Entry duplicated across merged windows');
+        }
+    }
+
+    public function testTruncatesAtHitCap(): void
+    {
+        // 600 consecutive ERROR entries — analyser should cap emission at
+        // HIT_CAP and add exactly one truncation Information.
+        $log = $this->makeLog(range(1, 600), 600);
+        $analysis = (new ErrorContextAnalyser())->setLog($log)->analyse();
+
+        $problems = $analysis->getFilteredInsights(ErrorContextProblem::class);
+        $this->assertCount(ErrorContextAnalyser::HIT_CAP, $problems);
+
+        $information = $analysis->getFilteredInsights(ErrorContextTruncatedInformation::class);
+        $this->assertCount(1, $information);
+        $this->assertSame(ErrorContextAnalyser::HIT_CAP, $information[0]->getHitCap());
+    }
+}
--- a/test/tests/Games/ProjectZomboid/Log/ProjectZomboidServerLogTest.php
+++ b/test/tests/Games/ProjectZomboid/Log/ProjectZomboidServerLogTest.php
@@ -6,18 +6,31 @@ use IndifferentKetchup\Codex\Detective\Detective;
 use IndifferentKetchup\Codex\Log\File\PathLogFile;
 use IndifferentKetchup\Codex\Log\Level;
 use IndifferentKetchup\Codex\Log\ProjectZomboid\ProjectZomboidServerLog;
+use PHPUnit\Framework\Attributes\DataProvider;
 use PHPUnit\Framework\TestCase;

 class ProjectZomboidServerLogTest extends TestCase
 {
-    private function fixturePath(): string
+    /**
+     * Both PZ B41 and B42 line shapes must parse identically. B41 (and the
+     * fixture used by every analyser test) emits `f:N, t:N, st:N,N,N,N>`;
+     * B42 (release branch from 2026-04 onward, e.g. build 42.17) drops the
+     * `t:` microsecond field and the comma separators, leaving
+     * `f:N st:N,N,N,N>`.
+     */
+    public static function fixtureProvider(): array
    {
-        return __DIR__ . '/../../../../src/Games/ProjectZomboid/fixtures/debug-server-minimal.txt';
+        $base = __DIR__ . '/../../../../src/Games/ProjectZomboid/fixtures';
+        return [
+            'pz41-format' => [$base . '/debug-server-minimal.txt'],
+            'pz42-format' => [$base . '/debug-server-42x-minimal.txt'],
+        ];
    }

-    public function testParsesEntriesWithLevelAndPrefix(): void
+    #[DataProvider('fixtureProvider')]
+    public function testParsesEntriesWithLevelAndPrefix(string $fixturePath): void
    {
-        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
        $log->parse();

        $entries = $log->getEntries();
@@ -29,9 +42,10 @@ class ProjectZomboidServerLogTest extends TestCase
        $this->assertNotNull($first->getTime());
    }

-    public function testStackTraceLinesAttachToTriggeringErrorEntry(): void
+    #[DataProvider('fixtureProvider')]
+    public function testStackTraceLinesAttachToTriggeringErrorEntry(string $fixturePath): void
    {
-        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
        $log->parse();

        $errorEntry = null;
@@ -46,19 +60,21 @@ class ProjectZomboidServerLogTest extends TestCase
        $this->assertGreaterThan(1, count($errorEntry->getLines()));
    }

-    public function testWarnLevelMapsCorrectly(): void
+    #[DataProvider('fixtureProvider')]
+    public function testWarnLevelMapsCorrectly(string $fixturePath): void
    {
-        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+        $log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
        $log->parse();

        $warnEntries = array_filter($log->getEntries(), fn($e) => $e->getLevel() === Level::WARNING);
        $this->assertNotEmpty($warnEntries);
    }

-    public function testDetectiveDispatchesByContent(): void
+    #[DataProvider('fixtureProvider')]
+    public function testDetectiveDispatchesByContent(string $fixturePath): void
    {
        $detective = (new Detective())
-            ->setLogFile(new PathLogFile($this->fixturePath()))
+            ->setLogFile(new PathLogFile($fixturePath))
            ->addPossibleLogClass(ProjectZomboidServerLog::class);

        $log = $detective->detect();
--- a/test/tests/Util/Redactor/ProjectZomboidRedactorIpv4Test.php
+++ b/test/tests/Util/Redactor/ProjectZomboidRedactorIpv4Test.php
@@ -0,0 +1,114 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Test\Tests\Util\Redactor;
+
+use IndifferentKetchup\Codex\Util\ProjectZomboid\ProjectZomboidRedactor;
+use PHPUnit\Framework\TestCase;
+
+class ProjectZomboidRedactorIpv4Test extends TestCase
+{
+    public function testRedactsBareIpv4(): void
+    {
+        $input = 'Connection from 192.168.1.1 closed.';
+        $expected = 'Connection from [REDACTED_IP] closed.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsIpv4WithPortSuffix(): void
+    {
+        $input = 'Connected to 10.0.0.42:27015.';
+        $expected = 'Connected to [REDACTED_IP].';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsMultipleIpv4OnOneLine(): void
+    {
+        $input = 'Peer 192.168.1.10 -> 192.168.1.20 via 10.0.0.1:8080.';
+        $expected = 'Peer [REDACTED_IP] -> [REDACTED_IP] via [REDACTED_IP].';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsLoopbackAndBoundaryAddresses(): void
+    {
+        $input = implode("\n", [
+            '127.0.0.1',
+            '0.0.0.0',
+            '255.255.255.255',
+        ]);
+        $expected = implode("\n", [
+            '[REDACTED_IP]',
+            '[REDACTED_IP]',
+            '[REDACTED_IP]',
+        ]);
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testDoesNotRedactOutOfRangeOctets(): void
+    {
+        // 999 is not a valid octet under the 0-255 alternation; the address
+        // must therefore be left untouched.
+        $input = 'Bogus: 999.999.999.999';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testDoesNotRedactInsideLongerDottedSequence(): void
+    {
+        // Five dotted segments are not an IPv4 address; the lookarounds must
+        // reject any partial match inside the longer sequence.
+        $input = 'Path frag 1.2.3.4.5 should not match.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testDoesNotRedactThreeSegmentBuildNumbers(): void
+    {
+        // PZ build numbers are 3-segment (e.g. 41.78.16) and must not match.
+        $input = 'Build 41.78.16 starting up.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testToggleOffLeavesIpv4Intact(): void
+    {
+        $input = 'Connection from 192.168.1.1:27015 closed.';
+
+        $output = (new ProjectZomboidRedactor())
+            ->redactIpAddresses(false)
+            ->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testIdempotence(): void
+    {
+        $input = implode("\n", [
+            'Connection from 192.168.1.1:27015 closed.',
+            'Peer 10.0.0.42 -> 10.0.0.43 via 172.16.0.1:8080.',
+        ]);
+
+        $redactor = new ProjectZomboidRedactor();
+        $once = $redactor->redact($input);
+        $twice = $redactor->redact($once);
+
+        $this->assertSame($once, $twice);
+    }
+}
--- a/test/tests/Util/Redactor/ProjectZomboidRedactorIpv6Test.php
+++ b/test/tests/Util/Redactor/ProjectZomboidRedactorIpv6Test.php
@@ -0,0 +1,135 @@
+<?php
+
+namespace IndifferentKetchup\Codex\Test\Tests\Util\Redactor;
+
+use IndifferentKetchup\Codex\Util\ProjectZomboid\ProjectZomboidRedactor;
+use PHPUnit\Framework\TestCase;
+
+class ProjectZomboidRedactorIpv6Test extends TestCase
+{
+    public function testRedactsFullIpv6(): void
+    {
+        $input = 'Bound 2001:0db8:85a3:0000:0000:8a2e:0370:7334 ok.';
+        $expected = 'Bound [REDACTED_IP] ok.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsAbbreviatedIpv6(): void
+    {
+        $input = 'Server peer 2001:db8::1 connected.';
+        $expected = 'Server peer [REDACTED_IP] connected.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsLoopbackIpv6(): void
+    {
+        $input = 'localhost ::1 reachable.';
+        $expected = 'localhost [REDACTED_IP] reachable.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsBracketedIpv6WithPort(): void
+    {
+        $input = 'Bound to [2001:db8::1]:8080 ok.';
+        $expected = 'Bound to [REDACTED_IP] ok.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsBracketedLoopbackWithPort(): void
+    {
+        $input = 'Listening on [::1]:27015.';
+        $expected = 'Listening on [REDACTED_IP].';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testRedactsIpv4MappedIpv6(): void
+    {
+        // IPv4-mapped form must be handled by the IPv6 pass before the IPv4
+        // pass so the leading "::ffff:" doesn't get orphaned. With the IPv6
+        // pass first, the whole token collapses into a single placeholder.
+        $input = 'Mapped ::ffff:192.168.1.1 ok.';
+        $expected = 'Mapped [REDACTED_IP] ok.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testDoesNotRedactJavaScopeOperator(): void
+    {
+        // Java method references and PHP scope operators look superficially
+        // like leading-:: IPv6 forms but fail filter_var validation; the
+        // word-boundary lookbehind also rejects matches that follow letters.
+        $input = 'Foo::bar called Object::toString.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testDoesNotRedactTimestampShape(): void
+    {
+        // PZ log timestamps include hh:mm:ss.v segments which match the coarse
+        // IPv6 candidate pattern but are rejected by filter_var.
+        $input = '[16-04-26 12:00:00.000][LOG] startup complete';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testDoesNotRedactSteamIdAsIpv6(): void
+    {
+        // 17-digit Steam IDs contain no colons, so they cannot match IPv6
+        // syntax, but assert explicitly so a future change to the IPv6 regex
+        // doesn't accidentally collide with the Steam ID pass.
+        $input = 'Player 76561198111111111 joined.';
+        $expected = 'Player 76561198000000000 joined.';
+
+        $output = (new ProjectZomboidRedactor())->redact($input);
+
+        $this->assertSame($expected, $output);
+    }
+
+    public function testToggleOffLeavesIpv6Intact(): void
+    {
+        $input = 'Bound to [2001:db8::1]:8080 ok.';
+
+        $output = (new ProjectZomboidRedactor())
+            ->redactIpAddresses(false)
+            ->redact($input);
+
+        $this->assertSame($input, $output);
+    }
+
+    public function testIdempotence(): void
+    {
+        $input = implode("\n", [
+            'Server peer 2001:db8::1 connected.',
+            'Listening on [::1]:27015.',
+            'Mapped ::ffff:192.168.1.1 ok.',
+            '[16-04-26 12:00:00.000][LOG] startup complete',
+        ]);
+
+        $redactor = new ProjectZomboidRedactor();
+        $once = $redactor->redact($input);
+        $twice = $redactor->redact($once);
+
+        $this->assertSame($once, $twice);
+    }
+}
--- a/tools/pz-analyzer/pz_classify.py
+++ b/tools/pz-analyzer/pz_classify.py
@@ -0,0 +1,310 @@
+#!/usr/bin/env python3
+"""
+pz_classify.py — Deterministic Project Zomboid log classifier orchestrator.
+
+Walks ``*DebugLog-server*.txt`` files under the redacted-logs directory,
+runs the pz_parser pipeline per file, merges records cross-file by their
+deterministic ``signature``, and emits the spec-shaped JSON report.
+
+Companion to the existing Qwen-backed discovery tool ``pz_error_analysis.py``
+(left untouched). Zero AI dependency, stdlib-only, runs in seconds.
+
+By convention the input is always the redacted directory produced by
+``pz_redact_all.sh``; ``meta.redacted`` is therefore hard-coded ``true``.
+If the user overrides ``--input`` to a non-redacted source we still emit
+``true`` because we have no upstream way to verify redaction status.
+
+Pipeline:
+  parser.parse_file        per-file Entry list
+  parser.classify_entries  per-file deduped Record list
+  _merge_cross_file        global Record list deduped across files
+  _build_summary           top-line stats + by_kind / by_attribution / top_mods
+
+Output schema, CLI flags, and aggregation rules are defined in
+``docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md``.
+"""
+from __future__ import annotations
+
+import argparse
+import dataclasses
+import json
+import sys
+from collections import Counter
+from datetime import datetime, timezone
+from pathlib import Path
+
+from pz_parser import (
+    MAX_CAUSE_CHAIN_LEVELS,
+    MAX_STACK_FRAMES,
+    SEVERITY_LEVELS,
+    Record,
+    classify_entries,
+    parse_file,
+)
+
+# ---------------------------------------------------------------------------
+# Defaults / constants
+# ---------------------------------------------------------------------------
+
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+DEFAULT_INPUT: Path = _REPO_ROOT / ".scratch" / "pz" / "Logs.redacted"
+DEFAULT_OUT: Path = _REPO_ROOT / ".scratch" / "pz" / "classify.json"
+
+#: Filename glob driving the directory walk.
+INPUT_GLOB: str = "*DebugLog-server*.txt"
+#: Cap on entries in ``summary.top_mods`` (mods ranked by total occurrence count).
+TOP_MODS_LIMIT: int = 10
+
+#: Confidence / attribution promotion ladders (higher rank wins on merge).
+_CONFIDENCE_RANK: dict[str, int] = {"low": 0, "medium": 1, "high": 2}
+_ATTRIBUTION_RANK: dict[str, int] = {
+    "unattributed": 0,
+    "inferred": 1,
+    "direct": 2,
+}
+#: Levels that count as errors (vs warnings) in the summary.
+_ERROR_LEVELS: frozenset[str] = frozenset({"ERROR", "SEVERE", "FATAL"})
+
+
+# ---------------------------------------------------------------------------
+# Cross-file aggregation (spec §9, inter-file equivalent of parser dedup)
+# ---------------------------------------------------------------------------
+
+
+def _merge_cross_file(per_file_records: list[Record]) -> list[Record]:
+    """Merge ``Record`` instances across files by ``signature``.
+
+    The parser already dedups within a single file. This is the inter-file
+    equivalent: when the same signature appears in records from multiple
+    files, sum occurrences, union file lists, promote attribution/confidence,
+    and merge stack and cause-chain (deduped, capped at parser constants).
+    First-seen is the earliest by file-then-line; since callers feed records
+    in sorted file order, the first record we encounter per signature is
+    already the earliest.
+    """
+    by_signature: dict[str, Record] = {}
+    for incoming in per_file_records:
+        existing = by_signature.get(incoming.signature)
+        if existing is None:
+            # First occurrence — copy so we don't mutate the caller's list.
+            by_signature[incoming.signature] = Record(
+                signature=incoming.signature,
+                pattern_id=incoming.pattern_id,
+                level=incoming.level,
+                kind=incoming.kind,
+                mod_id=incoming.mod_id,
+                mod_name=incoming.mod_name,
+                attribution=incoming.attribution,
+                confidence=incoming.confidence,
+                attribution_reason=incoming.attribution_reason,
+                file=incoming.file,
+                line=incoming.line,
+                cause_chain=incoming.cause_chain,
+                stack=list(incoming.stack),
+                first_seen=incoming.first_seen,
+                occurrence_count=incoming.occurrence_count,
+                files=list(incoming.files),
+                excerpt=incoming.excerpt,
+            )
+            continue
+        # Aggregate.
+        existing.occurrence_count += incoming.occurrence_count
+        for fname in incoming.files:
+            if fname not in existing.files:
+                existing.files.append(fname)
+        # Promote attribution / confidence / mod_name on stronger evidence.
+        if _ATTRIBUTION_RANK[incoming.attribution] > _ATTRIBUTION_RANK[existing.attribution]:
+            existing.attribution = incoming.attribution
+            existing.attribution_reason = incoming.attribution_reason
+            if incoming.mod_name:
+                existing.mod_name = incoming.mod_name
+        if _CONFIDENCE_RANK[incoming.confidence] > _CONFIDENCE_RANK[existing.confidence]:
+            existing.confidence = incoming.confidence
+        # Merge stack frames preserving order, capped.
+        for frame in incoming.stack:
+            if frame not in existing.stack and len(existing.stack) < MAX_STACK_FRAMES:
+                existing.stack.append(frame)
+        # Merge cause chain (deduped tokens, capped).
+        if incoming.cause_chain and incoming.cause_chain != existing.cause_chain:
+            old = existing.cause_chain.split(" -> ") if existing.cause_chain else []
+            new = incoming.cause_chain.split(" -> ")
+            merged = list(old)
+            for tok in new:
+                if tok and tok not in merged:
+                    merged.append(tok)
+            existing.cause_chain = " -> ".join(merged[:MAX_CAUSE_CHAIN_LEVELS])
+    return list(by_signature.values())
+
+
+# ---------------------------------------------------------------------------
+# Summary computation
+# ---------------------------------------------------------------------------
+
+
+def _build_summary(records: list[Record]) -> dict[str, object]:
+    """Build the ``summary`` block per spec.
+
+    Counts records (signatures), not raw occurrences, except for ``top_mods``
+    which sums ``occurrence_count`` per mod_id so that volume-driving mods
+    surface even when they hit the same shape repeatedly.
+    """
+    errors = sum(1 for r in records if r.level in _ERROR_LEVELS)
+    warnings = sum(1 for r in records if r.level == "WARN")
+    by_kind = Counter(r.kind for r in records)
+    by_attribution = Counter(r.attribution for r in records)
+    by_confidence = Counter(r.confidence for r in records)
+
+    # Group by mod_id summing total occurrence_count; preserve any mod_name.
+    mod_totals: dict[str, int] = {}
+    mod_names: dict[str, str] = {}
+    for r in records:
+        mod_totals[r.mod_id] = mod_totals.get(r.mod_id, 0) + r.occurrence_count
+        # First non-empty mod_name wins; subsequent records may have empty
+        # mod_name (e.g. for unattributed) so don't overwrite with "".
+        if r.mod_name and r.mod_id not in mod_names:
+            mod_names[r.mod_id] = r.mod_name
+    top_mods = sorted(
+        (
+            {
+                "mod_id": mod_id,
+                "mod_name": mod_names.get(mod_id, ""),
+                "occurrence_count": total,
+            }
+            for mod_id, total in mod_totals.items()
+        ),
+        key=lambda d: d["occurrence_count"],
+        reverse=True,
+    )[:TOP_MODS_LIMIT]
+
+    return {
+        "errors": errors,
+        "warnings": warnings,
+        "by_kind": dict(by_kind),
+        "by_attribution": dict(by_attribution),
+        "by_confidence": dict(by_confidence),
+        "top_mods": top_mods,
+    }
+
+
+# ---------------------------------------------------------------------------
+# Driver
+# ---------------------------------------------------------------------------
+
+
+def _run(input_dir: Path, out_path: Path, *, quiet: bool) -> int:
+    if not input_dir.is_dir():
+        print(
+            f"pz_classify: --input directory not found: {input_dir}",
+            file=sys.stderr,
+        )
+        return 2
+
+    started = datetime.now(timezone.utc).isoformat(timespec="seconds")
+    files = sorted(input_dir.glob(INPUT_GLOB))
+
+    all_records: list[Record] = []
+    log_lines_total = 0
+    error_lines_total = 0
+
+    for path in files:
+        try:
+            entries = parse_file(path)
+        except Exception as exc:  # noqa: BLE001 — orchestrator must keep going.
+            print(
+                f"pz_classify: warning: failed to parse {path.name}: {exc}",
+                file=sys.stderr,
+            )
+            continue
+        # Body-line totals: every line under every parsed entry contributes
+        # to log_lines_total; severity-level entries' body lines feed
+        # error_lines_total. Counted before dedup so it reflects raw volume.
+        for e in entries:
+            log_lines_total += len(e.body)
+            if e.level in SEVERITY_LEVELS:
+                error_lines_total += len(e.body)
+        all_records.extend(classify_entries(entries, source_file=path.name))
+
+    merged = _merge_cross_file(all_records)
+    merged.sort(key=lambda r: r.occurrence_count, reverse=True)
+
+    finished = datetime.now(timezone.utc).isoformat(timespec="seconds")
+
+    unique_patterns = len({r.pattern_id for r in merged})
+
+    document: dict[str, object] = {
+        "meta": {
+            "input_dir": str(input_dir),
+            "files_scanned": len(files),
+            "log_lines_total": log_lines_total,
+            "error_lines_total": error_lines_total,
+            "unique_signatures": len(merged),
+            "unique_patterns": unique_patterns,
+            "redacted": True,
+            "started": started,
+            "finished": finished,
+        },
+        "signatures": [dataclasses.asdict(r) for r in merged],
+        "summary": _build_summary(merged),
+    }
+
+    tmp = out_path.with_suffix(out_path.suffix + ".tmp")
+    try:
+        out_path.parent.mkdir(parents=True, exist_ok=True)
+        with tmp.open("w", encoding="utf-8") as f:
+            json.dump(document, f, ensure_ascii=False, indent=2)
+            f.write("\n")
+        tmp.replace(out_path)
+    except OSError as exc:
+        print(f"pz_classify: failed to write {out_path}: {exc}", file=sys.stderr)
+        # Best-effort cleanup of the temp file.
+        try:
+            tmp.unlink()
+        except OSError:
+            pass
+        return 1
+
+    if not quiet:
+        print(
+            f"pz_classify: {len(files)} file(s), {log_lines_total} log lines, "
+            f"{error_lines_total} error lines, {len(merged)} records "
+            f"({unique_patterns} unique patterns) -> {out_path}"
+        )
+    return 0
+
+
+def _parse_args(argv: list[str] | None = None) -> argparse.Namespace:
+    parser = argparse.ArgumentParser(
+        prog="pz_classify",
+        description=(
+            "Deterministic Project Zomboid log classifier. Walks redacted "
+            "DebugLog-server*.txt files, classifies errors/warnings, and "
+            "emits a JSON report."
+        ),
+    )
+    parser.add_argument(
+        "--input",
+        type=Path,
+        default=DEFAULT_INPUT,
+        help=f"Input directory of redacted log files (default: {DEFAULT_INPUT}).",
+    )
+    parser.add_argument(
+        "--out",
+        type=Path,
+        default=DEFAULT_OUT,
+        help=f"Output JSON path (default: {DEFAULT_OUT}).",
+    )
+    parser.add_argument(
+        "--quiet",
+        action="store_true",
+        help="Suppress the trailing one-line summary.",
+    )
+    return parser.parse_args(argv)
+
+
+def main(argv: list[str] | None = None) -> int:
+    args = _parse_args(argv)
+    return _run(args.input, args.out, quiet=args.quiet)
+
+
+if __name__ == "__main__":
+    sys.exit(main())
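Reviewer note: the cross-file merge in `_merge_cross_file` can be illustrated with a trimmed-down sketch. `MiniRecord` and `merge_by_signature` are hypothetical names used only for illustration; the real `Record` carries many more fields, and this sketch covers only occurrence summing, file-list union, and attribution promotion.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Rank ladder mirroring _ATTRIBUTION_RANK in the diff; higher rank wins.
ATTRIBUTION_RANK = {"unattributed": 0, "inferred": 1, "direct": 2}


@dataclass
class MiniRecord:
    """Hypothetical, reduced stand-in for pz_parser.Record."""

    signature: str
    attribution: str
    occurrence_count: int
    files: list[str] = field(default_factory=list)


def merge_by_signature(records: list[MiniRecord]) -> list[MiniRecord]:
    """Merge records that share a signature, keeping first-seen order."""
    merged: dict[str, MiniRecord] = {}
    for rec in records:
        seen = merged.get(rec.signature)
        if seen is None:
            # Copy so the caller's objects are never mutated.
            merged[rec.signature] = MiniRecord(
                rec.signature, rec.attribution, rec.occurrence_count, list(rec.files)
            )
            continue
        seen.occurrence_count += rec.occurrence_count
        for name in rec.files:
            if name not in seen.files:
                seen.files.append(name)
        # Promote attribution only when the incoming evidence is stronger.
        if ATTRIBUTION_RANK[rec.attribution] > ATTRIBUTION_RANK[seen.attribution]:
            seen.attribution = rec.attribution
    return list(merged.values())


sample = [
    MiniRecord("sha256:aaaa", "inferred", 3, ["a.txt"]),
    MiniRecord("sha256:bbbb", "unattributed", 1, ["a.txt"]),
    MiniRecord("sha256:aaaa", "direct", 2, ["b.txt"]),
]
result = merge_by_signature(sample)
```

Because Python dictionaries preserve insertion order, feeding records in sorted file order means the first record stored per signature is also the earliest one, which is the property the docstring relies on for first-seen.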
--- a/tools/pz-analyzer/pz_error_analysis.py
+++ b/tools/pz-analyzer/pz_error_analysis.py
@@ -0,0 +1,467 @@
+#!/usr/bin/env python3
+"""
+pz_error_analysis.py — Qwen-backed Project Zomboid error analyzer.
+
+Walks `*DebugLog-server*.txt` files (DEFAULT_INPUT — already PII-redacted by
+pz_redact_all.sh), groups WARN/ERROR/FATAL entries with surrounding context,
+deduplicates by signature hash, and asks Qwen to classify each unique
+signature into a fixed taxonomy (missing_mod, java_exception, lua_error,
+out_of_memory, ...) with a short title / summary / likely_cause /
+suggested_fix / confidence.
+
+Standalone: requires Python 3.10+ and the `openai` package
+(`pip install 'openai>=1.30'`). Talks to a local OpenAI-compatible endpoint
+(default sam-desktop llama-swap on port 8401); override with QWEN_BASE_URL
+and QWEN_MODEL env vars.
+"""
+from __future__ import annotations
+
+import argparse
+import datetime as dt
+import hashlib
+import json
+import os
+import re
+import sys
+import time
+from pathlib import Path
+from typing import Any, Iterator
+
+from openai import OpenAI
+
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+
+DEFAULT_INPUT = _REPO_ROOT / ".scratch" / "pz" / "Logs.redacted"
+DEFAULT_OUT = _REPO_ROOT / ".scratch" / "pz" / "analysis.json"
+
+# --- Qwen client (inlined from /opt/analytics/ib_analytics/llm/local_client.py
+#     so this script has no cross-repo dependency; mirror upstream changes if
+#     the analytics client API evolves) ---
+
+QWEN_DEFAULT_BASE_URL = "http://100.101.41.16:8401/v1"
+QWEN_DEFAULT_MODEL = "qwen3.6-35b-a3b"
+
+SAMPLING_STRUCTURED: dict[str, Any] = {
+    "temperature": 0.7,
+    "top_p": 0.80,
+    "extra_body": {
+        "top_k": 20,
+        "presence_penalty": 1.5,
+        "chat_template_kwargs": {"enable_thinking": False},
+    },
+}
+
+
+def get_client() -> OpenAI:
+    return OpenAI(
+        base_url=os.environ.get("QWEN_BASE_URL", QWEN_DEFAULT_BASE_URL),
+        api_key="EMPTY",
+    )
+
+
+def get_model() -> str:
+    return os.environ.get("QWEN_MODEL", QWEN_DEFAULT_MODEL)
+
+
+def structured_call(
+    tool_schema: dict[str, Any],
+    messages: list[dict[str, Any]],
+    *,
+    sampling: dict[str, Any] = SAMPLING_STRUCTURED,
+    client: OpenAI | None = None,
+    model: str | None = None,
+    max_tokens: int = 4096,
+) -> dict[str, Any]:
+    cli = client or get_client()
+    mdl = model or get_model()
+    fn_name = tool_schema["function"]["name"]
+    kwargs = dict(sampling)
+    extra_body = dict(kwargs.pop("extra_body", {}))
+    response = cli.chat.completions.create(
+        model=mdl,
+        messages=messages,
+        tools=[tool_schema],
+        tool_choice="required",
+        max_tokens=max_tokens,
+        extra_body=extra_body,
+        **kwargs,
+    )
+    choice = response.choices[0]
+    tool_calls = getattr(choice.message, "tool_calls", None) or []
+    if not tool_calls:
+        raise ValueError(
+            f"Qwen did not invoke {fn_name}; finish_reason={choice.finish_reason}, "
+            f"content={(choice.message.content or '')[:500]}"
+        )
+    call = tool_calls[0]
+    if call.function.name != fn_name:
+        raise ValueError(
+            f"Qwen invoked unexpected tool {call.function.name!r}; expected {fn_name!r}"
+        )
+    try:
+        return json.loads(call.function.arguments)
+    except json.JSONDecodeError as e:
+        raise ValueError(
+            f"Malformed tool-call arguments for {fn_name}: {e}; "
+            f"raw={call.function.arguments[:500]}"
+        ) from e
+
+
+# --- Parser ---
+
+ENTRY_RE = re.compile(
+    r"^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+"
+    r"(LOG|WARN|ERROR|FATAL)\s*:\s*(.*)"
+)
+SESSION_META_RE = re.compile(r"^[A-Za-z]+\s+f:\d+,?\s*(?:t:\d+,?\s*)?st:[\d,]+>\s*")
+DOUBLE_QUOTED_RE = re.compile(r'"[^"]*"')
+SINGLE_QUOTED_RE = re.compile(r"'[^']*'")
+NUMERIC_RUN_RE = re.compile(r"\d{2,}")
+WS_RUN_RE = re.compile(r"\s+")
+
+CATEGORIES = [
+    "missing_mod", "mod_conflict", "lua_error", "java_exception",
+    "out_of_memory", "corrupt_save", "network_error", "load_order",
+    "performance", "server_crash", "unknown",
+]
+
+TOOL_SCHEMA: dict[str, Any] = {
+    "type": "function",
+    "function": {
+        "name": "submit_error_analysis",
+        "description": (
+            "Analyse a single Project Zomboid server error block and emit "
+            "structured insight."
+        ),
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "category": {"type": "string", "enum": CATEGORIES},
+                "severity": {"type": "string", "enum": ["problem", "warning", "info"]},
+                "title": {"type": "string", "description": "One-line headline (<=80 chars)"},
+                "summary": {"type": "string", "description": "1-3 sentences explaining what happened"},
+                "likely_cause": {"type": "string", "description": "Most plausible cause given the context"},
+                "suggested_fix": {"type": "string", "description": "Concrete remediation, server-admin actionable"},
+                "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0},
+            },
+            "required": [
+                "category", "severity", "title", "summary",
+                "likely_cause", "suggested_fix", "confidence",
+            ],
+        },
+    },
+}
+
+SYSTEM_PROMPT = """You are a Project Zomboid dedicated server administrator
+diagnosing a server log. You receive one error/warning event with surrounding
+context (entries marked with `>>>` are the hit; the rest are leading or
+trailing context). Classify the event using the submit_error_analysis tool
+ONLY — never reply in plain text.
+
+Rules:
+- `category` must be one of the enum values; choose `unknown` only if no
+  other fits.
+- `severity`: problem = breaks something users notice; warning = degraded
+  but functional; info = noteworthy but not failing.
+- `title`: at most 80 chars, neutral and specific.
+- `suggested_fix`: a concrete admin action ("subscribe to mod X", "increase
+  -Xmx to 8G", "remove the conflicting mod from Mods= line"), not generic
+  advice.
+- `confidence`: 0.0-1.0; lower it when the evidence is ambiguous.
+"""
+
+MAX_PROMPT_CHARS = 4000
+
+
+def parse_file(path: Path) -> list[dict[str, Any]]:
+    """Parse a DebugLog-server file into a list of multi-line entries.
+
+    Continuation lines (lines that don't match ENTRY_RE) append to the
+    previous entry, mirroring codex's PatternParser behaviour.
+    """
+    entries: list[dict[str, Any]] = []
+    current: dict[str, Any] | None = None
+    with path.open("r", encoding="utf-8", errors="replace") as f:
+        for lineno, raw in enumerate(f, start=1):
+            line = raw.rstrip("\n")
+            m = ENTRY_RE.match(line)
+            if m:
+                if current is not None:
+                    entries.append(current)
+                current = {
+                    "timestamp": m.group(1),
+                    "level": m.group(2),
+                    "body": [m.group(3)],
+                    "line_start": lineno,
+                    "line_end": lineno,
+                }
+            elif current is not None:
+                current["body"].append(line)
+                current["line_end"] = lineno
+            # else: orphan line at start of file (no preceding entry); ignore.
+    if current is not None:
+        entries.append(current)
+    return entries
+
+
+def signature_for(level: str, body_lines: list[str]) -> str:
+    """Stable signature derived from the first body line only.
+
+    Stack-trace continuations are deliberately ignored: the same logical
+    exception can produce slightly different traces (e.g. timing-related
+    code paths) but should still collapse to one signature. Quoted strings
+    (vehicle names, mod IDs, paths) are flattened to <S>; numeric runs of
+    length >= 2 are flattened to <N>; session-metadata prefix
+    (`General  f:0,t:N,st:N,N,N>`) is stripped.
+    """
+    first = (body_lines[0] if body_lines else "").strip()
+    first = SESSION_META_RE.sub("", first)
+    first = DOUBLE_QUOTED_RE.sub('"<S>"', first)
+    first = SINGLE_QUOTED_RE.sub("'<S>'", first)
+    first = NUMERIC_RUN_RE.sub("<N>", first)
+    first = WS_RUN_RE.sub(" ", first)
+    first = first[:200]
+    h = hashlib.sha256(f"{level}\n{first}".encode("utf-8")).hexdigest()
+    return f"sha256:{h[:16]}"
+
+
+def build_excerpt(
+    entries: list[dict[str, Any]], hit_idx: int, context: int
+) -> str:
+    """Render an excerpt centered on entries[hit_idx] with ±context entries."""
+    start = max(0, hit_idx - context)
+    end = min(len(entries), hit_idx + context + 1)
+    lines: list[str] = []
+    for i in range(start, end):
+        e = entries[i]
+        is_hit = i == hit_idx
+        marker = ">>>" if is_hit else "   "
+        prefix = f'{marker} [{e["timestamp"]}] {e["level"]}: '
+        body = e["body"]
+        if is_hit:
+            for j, body_line in enumerate(body):
+                lines.append(prefix + body_line if j == 0 else "       " + body_line)
+        else:
+            first = (body[0] if body else "").strip()[:200]
+            lines.append(prefix + first)
+            if len(body) > 1:
+                lines.append(f'       ... (+{len(body) - 1} more lines)')
+    excerpt = "\n".join(lines)
+    if len(excerpt) > MAX_PROMPT_CHARS:
+        excerpt = excerpt[:MAX_PROMPT_CHARS] + "\n... [truncated]"
+    return excerpt
+
+
+def iter_warn_or_error(entries: list[dict[str, Any]]) -> Iterator[int]:
+    for i, e in enumerate(entries):
+        if e["level"] in ("WARN", "ERROR", "FATAL"):
+            yield i
+
+
+def collect_signatures(
+    input_dir: Path, context: int
+) -> tuple[dict[str, dict[str, Any]], dict[str, int]]:
+    """Walk DebugLog-server files and collect dedup'd signatures."""
+    signatures: dict[str, dict[str, Any]] = {}
+    files_scanned = 0
+    log_lines_total = 0
+    error_lines_total = 0
+
+    for path in sorted(input_dir.glob("*DebugLog-server*.txt")):
+        files_scanned += 1
+        entries = parse_file(path)
+        log_lines_total += sum(len(e["body"]) for e in entries)
+        for hit_idx in iter_warn_or_error(entries):
+            hit = entries[hit_idx]
+            error_lines_total += len(hit["body"])
+            sig = signature_for(hit["level"], hit["body"])
+            occurrence = {
+                "file": path.name,
+                "line": hit["line_start"],
+                "timestamp": hit["timestamp"],
+            }
+            if sig not in signatures:
+                signatures[sig] = {
+                    "signature": sig,
+                    "level": hit["level"],
+                    "first_seen": occurrence,
+                    "occurrence_count": 1,
+                    "files": [path.name],
+                    "excerpt": build_excerpt(entries, hit_idx, context),
+                }
+            else:
+                rec = signatures[sig]
+                rec["occurrence_count"] += 1
+                if path.name not in rec["files"]:
+                    rec["files"].append(path.name)
+    return signatures, {
+        "files_scanned": files_scanned,
+        "log_lines_total": log_lines_total,
+        "error_lines_total": error_lines_total,
+    }
+
+
+def call_qwen(client: OpenAI, model: str, sig_rec: dict[str, Any]) -> dict[str, Any]:
+    user_prompt = (
+        f'Level: {sig_rec["level"]}\n'
+        f'First seen: {sig_rec["first_seen"]["file"]} '
+        f'line {sig_rec["first_seen"]["line"]}\n'
+        f'Occurrences across this run: {sig_rec["occurrence_count"]} '
+        f'(across {len(sig_rec["files"])} file(s))\n\n'
+        f'Log excerpt:\n{sig_rec["excerpt"]}'
+    )
+    return structured_call(
+        TOOL_SCHEMA,
+        [
+            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "user", "content": user_prompt},
+        ],
+        sampling=SAMPLING_STRUCTURED,
+        client=client,
+        model=model,
+    )
+
+
+def atomic_write(path: Path, payload: Any) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    tmp = path.with_suffix(path.suffix + ".tmp")
+    with tmp.open("w", encoding="utf-8") as f:
+        json.dump(payload, f, indent=2, ensure_ascii=False)
+    tmp.replace(path)
+
+
+def load_existing(path: Path) -> dict[str, dict[str, Any]]:
+    """Reload signatures previously written to --out.
+
+    Any signature with an `llm` field counts as completed, including ones
+    whose `llm` holds only an error. Bare records (left by a --limit cutoff)
+    get re-attempted on resume so progressive analysis converges.
+    """
+    if not path.exists():
+        return {}
+    try:
+        with path.open("r", encoding="utf-8") as f:
+            data = json.load(f)
+        return {
+            s["signature"]: s
+            for s in data.get("signatures", [])
+            if "signature" in s and "llm" in s
+        }
+    except Exception:
+        return {}
+
+
+def summarise(analyzed: list[dict[str, Any]]) -> dict[str, Any]:
+    sev_counts = {"problem": 0, "warning": 0, "info": 0}
+    by_cat: dict[str, int] = {}
+    for s in analyzed:
+        llm = s.get("llm") or {}
+        sev = llm.get("severity")
+        cat = llm.get("category")
+        if sev in sev_counts:
+            sev_counts[sev] += 1
+        if cat:
+            by_cat[cat] = by_cat.get(cat, 0) + 1
+    return {
+        "problems": sev_counts["problem"],
+        "warnings": sev_counts["warning"],
+        "info": sev_counts["info"],
+        "by_category": by_cat,
+    }
+
+
+def main() -> None:
+    ap = argparse.ArgumentParser(description=__doc__)
+    ap.add_argument("--input", type=Path, default=DEFAULT_INPUT)
+    ap.add_argument("--out", type=Path, default=DEFAULT_OUT)
+    ap.add_argument("--context", type=int, default=20)
+    ap.add_argument("--limit", type=int, default=None,
+                    help="Stop after N new signatures analysed.")
+    ap.add_argument("--resume", action="store_true",
+                    help="Reuse existing analysis from --out if present.")
+    ap.add_argument("--checkpoint-every", type=int, default=25)
+    args = ap.parse_args()
+
+    if not args.input.is_dir():
+        print(f"error: {args.input} not a directory", file=sys.stderr)
+        sys.exit(2)
+
+    started = dt.datetime.now(dt.timezone.utc).isoformat(timespec="seconds")
+    print(f"[init] scanning {args.input}")
+    signatures, file_stats = collect_signatures(args.input, args.context)
+    print(
+        f"[init] {file_stats['files_scanned']} file(s), "
+        f"{file_stats['log_lines_total']} log lines, "
+        f"{file_stats['error_lines_total']} error lines, "
+        f"{len(signatures)} unique signature(s)"
+    )
+
+    existing = load_existing(args.out) if args.resume else {}
+    if existing:
+        print(f"[init] {len(existing)} signature(s) already analysed; resuming")
+
+    client = get_client()
+    model = get_model()
+    print(f"[init] qwen model={model}")
+
+    n_new = 0
+    t0 = time.time()
+    analyzed: list[dict[str, Any]] = []
+
+    # Process in occurrence_count desc so --limit N picks the most-impactful
+    # signatures rather than whichever happened to scan first.
+    for sig, rec in sorted(
+        signatures.items(), key=lambda kv: -kv[1]["occurrence_count"]
+    ):
+        if sig in existing:
+            analyzed.append(existing[sig])
+            continue
+        if args.limit is not None and n_new >= args.limit:
+            analyzed.append(rec)  # keep raw record so it's not lost on resume
+            continue
+        try:
+            llm = call_qwen(client, model, rec)
+            rec["llm"] = llm
+        except Exception as e:
+            rec["llm"] = {"error": str(e)[:500]}
+            print(f"  [{n_new + 1}] LLM error on {sig}: {e}", file=sys.stderr)
+        analyzed.append(rec)
+        n_new += 1
+        if n_new % args.checkpoint_every == 0:
+            payload = {
+                "meta": {
+                    "input_dir": str(args.input),
+                    **file_stats,
+                    "unique_signatures": len(signatures),
+                    "redacted": True,
+                    "qwen_model": model,
+                    "started": started,
+                    "checkpoint_at": dt.datetime.now(dt.timezone.utc).isoformat(timespec="seconds"),
+                },
+                "signatures": analyzed,
+                "summary": summarise(analyzed),
+            }
+            atomic_write(args.out, payload)
+            rate = n_new / max(time.time() - t0, 1e-3)
+            print(f"  [{n_new}] checkpoint @ {rate:.2f} sig/s")
+
+    finished = dt.datetime.now(dt.timezone.utc).isoformat(timespec="seconds")
+    payload = {
+        "meta": {
+            "input_dir": str(args.input),
+            **file_stats,
+            "unique_signatures": len(signatures),
+            "redacted": True,
+            "qwen_model": model,
+            "started": started,
+            "finished": finished,
+        },
+        "signatures": analyzed,
+        "summary": summarise(analyzed),
+    }
+    atomic_write(args.out, payload)
+    print(f"[done] {n_new} new, {len(analyzed)} total -> {args.out}")
+
+
+if __name__ == "__main__":
+    main()
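Reviewer note on the loop above: signatures are visited in descending `occurrence_count`, already-analysed ones are reused as-is, and `--limit` counts only new LLM calls, so anything past the limit is carried through unanalysed. A minimal sketch of that selection logic follows, assuming plain dicts and a set of known signatures. `plan_run` and its return shape are hypothetical names used for illustration and are not part of the script.

```python
from __future__ import annotations

from typing import Any


def plan_run(
    signatures: dict[str, dict[str, Any]],
    existing: set[str],
    limit: int | None,
) -> tuple[list[str], list[str], list[str]]:
    """Hypothetical helper: split signatures into (to_analyse, reused, deferred)."""
    to_analyse: list[str] = []
    reused: list[str] = []
    deferred: list[str] = []
    # Most frequent signatures first, so a small limit covers the biggest offenders.
    ordered = sorted(signatures.items(), key=lambda kv: -kv[1]["occurrence_count"])
    for sig, _rec in ordered:
        if sig in existing:
            reused.append(sig)  # already has an LLM result from a prior run
        elif limit is not None and len(to_analyse) >= limit:
            deferred.append(sig)  # kept in the output, analysed on a later run
        else:
            to_analyse.append(sig)
    return to_analyse, reused, deferred
```

Because reused signatures do not count against the limit, re-running with the same `--limit` value advances through the backlog instead of re-selecting the same top entries.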
--- a/tools/pz-analyzer/pz_parser.py
+++ b/tools/pz-analyzer/pz_parser.py
@@ -0,0 +1,777 @@
+"""
+pz_parser.py — Deterministic Project Zomboid log parser.
+
+Pure module (no I/O beyond reading the path it is handed). Walks a redacted
+DebugLog-server*.txt file, extracts errors/warnings, attributes each to a mod
+where evidence allows, classifies by kind, and computes deterministic
+signatures. Output records are designed to be `dataclasses.asdict()`-ready
+for direct JSON serialisation.
+
+Pipeline phases (per design spec at
+docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md):
+
+1. Severity-prefix recognition (ERROR|SEVERE|WARN)
+2. Bidirectional stack collection (pre-stack walk back, post-stack walk forward)
+3. Mod attribution (direct, inferred, unattributed)
+4. File:line extraction (five fallbacks)
+5. Cause-chain extraction (Caused by: chains + standalone exception lines)
+6. Java exception kind detection
+7. Engine-noise tagging
+8. Signature computation (pattern_id + signature)
+9. Aggregation (dedup on signature)
+
+Style notes mirror sibling tool pz_error_analysis.py: type hints with built-in
+generics, `from __future__ import annotations`, regex precompilation as
+module-level constants, stdlib-only.
+"""
+from __future__ import annotations
+
+import hashlib
+import pathlib
+import re
+from dataclasses import dataclass
+
+# ---------------------------------------------------------------------------
+# Tunable constants
+# ---------------------------------------------------------------------------
+
+#: Lookback window (in raw file lines) for inferred mod attribution.
+INFERRED_LOOKBACK_LINES: int = 40
+#: Maximum frames retained per record after pre+post stack merge.
+MAX_STACK_FRAMES: int = 8
+#: Maximum lines walked in each direction during bidirectional stack collection.
+STACK_WALK_LINES: int = 25
+#: Maximum cause-chain depth retained.
+MAX_CAUSE_CHAIN_LEVELS: int = 6
+#: Truncation length for the normalised first line that feeds pattern_id.
+PATTERN_ID_FIRST_LINE_MAX: int = 200
+
+# ---------------------------------------------------------------------------
+# Line-shape regexes (parsing)
+# ---------------------------------------------------------------------------
+
+#: PZ DebugLog entry header.
+#: Example: ``[16-04-26 00:01:19.080] ERROR: General      f:0, t:1, st:1,2,3,4> body``
+ENTRY_RE = re.compile(
+    r"^\[(?P<ts>\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+"
+    r"(?P<level>[A-Z]+)\s*:\s*(?P<rest>.*)$"
+)
+
+#: Strips the "General  f:N[, t:N], st:N,N,N,N>" prefix from a body line (B42 logs omit ``t:``).
+SESSION_META_RE = re.compile(
+    r"^[A-Za-z][A-Za-z0-9]*\s+f:\d+,?\s*(?:t:\d+,?\s*)?st:[\d,]+>\s*"
+)
+
+# ---------------------------------------------------------------------------
+# Severity-prefix recognition (phase 1)
+# ---------------------------------------------------------------------------
+
+#: Severity tokens that flag a body line as an error/warning event when they
+#: appear at the start of body text. Per spec: broader than the existing
+#: pz_error_analysis.py regex (adds SEVERE for Java util-logging).
+SEVERITY_BODY_RE = re.compile(r"^\s*(ERROR|SEVERE|WARN)\s*[:\s]")
+#: Bracketed-level tokens that map to severity events.
+SEVERITY_LEVELS: tuple[str, ...] = ("ERROR", "WARN", "SEVERE", "FATAL")
+
+# ---------------------------------------------------------------------------
+# Stack-frame recognition (phase 2)
+# ---------------------------------------------------------------------------
+
+#: Markers that identify a line as stack-shaped. Used to gate pre/post stack
+#: collection so we don't latch onto non-stack continuation text.
+STACK_HINT_RE = re.compile(
+    r"(?:\bat\s+\S+|\[string\s+\"|function:\s|file:\s|\.lua\b)",
+    re.IGNORECASE,
+)
+
+# ---------------------------------------------------------------------------
+# Mod attribution (phase 3)
+# ---------------------------------------------------------------------------
+
+#: Direct attribution marker: ``Lua((MOD:<name>))``.
+LUA_MOD_MARKER_RE = re.compile(r"Lua\(\(MOD:([^)]+)\)\)")
+#: Direct attribution: ``require("X") failed`` shape.
+REQUIRE_FAILED_RE = re.compile(
+    r"""require\s*\(\s*["']([^"']+)["']\s*\)\s+failed""",
+    re.IGNORECASE,
+)
+#: Direct attribution: explicit ``needed by <mod>`` hint.
+NEEDED_BY_RE = re.compile(r"needed\s+by\s+([A-Za-z0-9_'\- ]+?)(?:[,.]|$)", re.IGNORECASE)
+
+#: Patterns that flag a body as "Lua-shaped" — gating filter for inferred
+#: attribution. Mirrors the spec's enumeration.
+LUA_SHAPED_PATTERNS: tuple[re.Pattern[str], ...] = (
+    re.compile(r"luamanager\.getfunctionobject", re.IGNORECASE),
+    re.compile(r"no\s+such\s+function", re.IGNORECASE),
+    re.compile(r"exception\s+thrown", re.IGNORECASE),
+    re.compile(r"runtimeexception", re.IGNORECASE),
+    re.compile(r"illegalstateexception", re.IGNORECASE),
+    re.compile(r"\blua\b", re.IGNORECASE),
+)
+
+# ---------------------------------------------------------------------------
+# File:line extraction (phase 4) — five fallbacks tried in order
+# ---------------------------------------------------------------------------
+
+#: 1. ``at <path>.lua:<n>`` — typical Lua stack frame.
+FILE_LINE_AT_RE = re.compile(r"\bat\s+([^\s:]+\.lua):(\d+)")
+#: 2. ``function: ... file: <path>.lua line #<n>`` (or ``: <n>``).
+FILE_LINE_FUNCTION_RE = re.compile(
+    r"function:\s*[^,]*?file:\s*([^\s,]+\.lua)\s+line\s*(?:#|:)\s*(\d+)",
+    re.IGNORECASE,
+)
+#: 3. ``[string "<path>.lua"]:<n>`` — Lua VM source string.
+FILE_LINE_STRING_RE = re.compile(r"""\[string\s+["']([^"']+\.lua)["']\]:(\d+)""")
+#: 4. quoted path ending in a known extension; line # optional.
+FILE_LINE_QUOTED_RE = re.compile(
+    r"""["']([^"']+\.(?:lua|txt|xml|json|ini|cfg|bin))["'](?::(\d+))?"""
+)
+#: 5. unquoted path segment beginning with a recognised root.
+FILE_LINE_UNQUOTED_RE = re.compile(
+    r"\b((?:media|maps|lua|scripts)/[\w./\-]+\.(?:lua|txt|xml|json|ini|cfg|bin))(?::(\d+))?"
+)
+
+# ---------------------------------------------------------------------------
+# Cause-chain extraction (phase 5)
+# ---------------------------------------------------------------------------
+
+#: ``Caused by: <ExceptionClass>: <msg>`` (msg optional).
+CAUSED_BY_RE = re.compile(
+    r"Caused\s+by:\s+((?:\w+\.)+\w+(?:Exception|Error))(?::\s*(.+?))?\s*$",
+    re.IGNORECASE,
+)
+#: Standalone Java exception line: ``com.foo.BarException: msg``.
+EXCEPTION_LINE_RE = re.compile(
+    r"((?:\w+\.)+\w+(?:Exception|Error))(?::\s*(.+?))?(?=\s+at\s|\s*$)"
+)
+
+# ---------------------------------------------------------------------------
+# Engine-noise tagging (phase 7)
+# ---------------------------------------------------------------------------
+
+ENGINE_NOISE_PATTERNS: tuple[re.Pattern[str], ...] = (
+    re.compile(r"kahluathread\.flusherrormessage", re.IGNORECASE),
+    re.compile(r"dumping\s+lua\s+stack\s+trace", re.IGNORECASE),
+)
+
+# ---------------------------------------------------------------------------
+# Signature normalisation (phase 8)
+# ---------------------------------------------------------------------------
+
+DOUBLE_QUOTED_RE = re.compile(r'"[^"]*"')
+SINGLE_QUOTED_RE = re.compile(r"'[^']*'")
+NUMERIC_RUN_RE = re.compile(r"\d{2,}")
+WS_RUN_RE = re.compile(r"\s+")
+#: Strips a leading ``ERROR:`` / ``SEVERE:`` / ``WARN:`` / ``FATAL:`` token
+#: from a body line so a body that happens to begin with the severity word
+#: hashes to the same pattern_id as the bracketed-only variant. Matches the
+#: token plus any colon and trailing whitespace; case-insensitive.
+SEVERITY_PREFIX_STRIP_RE = re.compile(
+    r"^\s*(?:ERROR|SEVERE|WARN|FATAL)\s*[:\s]\s*", re.IGNORECASE
+)
+
+# ---------------------------------------------------------------------------
+# Dataclasses — match the JSON keys the spec mandates so consumers can
+# `dataclasses.asdict(record)` straight to JSON.
+# ---------------------------------------------------------------------------
+
+
+@dataclass
+class Entry:
+    """One parsed log entry. Continuation lines (TAB-indented or otherwise
+    non-header lines) are folded into ``body``. Phase-2 stack collection
+    walks neighbouring entries (not raw lines), so no extra context is
+    stored here.
+    """
+
+    timestamp: str
+    level: str
+    body: list[str]
+    line_start: int
+    line_end: int
+
+
+@dataclass
+class FirstSeen:
+    """Provenance for the first occurrence of a deduped record."""
+
+    file: str
+    line: int
+    timestamp: str
+
+
+@dataclass
+class Record:
+    """One classified, deduplicated error/warning record. Field names mirror
+    the JSON output schema in the spec verbatim — this object is intended to
+    be `dataclasses.asdict()`-ed straight into the output document.
+    """
+
+    signature: str
+    pattern_id: str
+    level: str
+    kind: str
+    mod_id: str
+    mod_name: str
+    attribution: str
+    confidence: str
+    attribution_reason: str
+    file: str
+    line: int
+    cause_chain: str
+    stack: list[str]
+    first_seen: FirstSeen
+    occurrence_count: int
+    files: list[str]
+    excerpt: str
+
+
+# ---------------------------------------------------------------------------
+# Phase 0: file parse
+# ---------------------------------------------------------------------------
+
+
+def parse_file(path: pathlib.Path) -> list[Entry]:
+    """Parse a DebugLog-server file into a list of multi-line entries.
+
+    Continuation lines (those not matching ENTRY_RE) append to the previous
+    entry's body, mirroring codex's PatternParser behaviour for multi-line
+    Java stack traces under an ERROR header.
+    """
+    entries: list[Entry] = []
+    current: Entry | None = None
+    with path.open("r", encoding="utf-8", errors="replace") as f:
+        for lineno, raw in enumerate(f, start=1):
+            line = raw.rstrip("\n")
+            m = ENTRY_RE.match(line)
+            if m:
+                if current is not None:
+                    entries.append(current)
+                current = Entry(
+                    timestamp=m.group("ts"),
+                    level=m.group("level"),
+                    body=[m.group("rest")],
+                    line_start=lineno,
+                    line_end=lineno,
+                )
+            elif current is not None:
+                current.body.append(line)
+                current.line_end = lineno
+            # else: orphan line at start of file (no preceding entry); ignore.
+    if current is not None:
+        entries.append(current)
+    return entries
+
+
+# ---------------------------------------------------------------------------
+# Phase 1: severity-prefix recognition
+# ---------------------------------------------------------------------------
+
+
+def is_severity_entry(entry: Entry) -> bool:
+    """True if this entry is an ERROR/WARN/SEVERE/FATAL — either by the
+    bracketed level or a leading SEVERE/ERROR/WARN token in the body (after
+    stripping the session-meta prefix)."""
+    if entry.level in SEVERITY_LEVELS:
+        return True
+    if entry.body and SEVERITY_BODY_RE.match(_strip_session_meta(entry.body[0])):
+        return True
+    return False
+
+
+def effective_level(entry: Entry) -> str:
+    """Return the effective severity for an entry. Body-prefix takes
+    precedence — covers the SEVERE-in-body case where bracketed level is LOG
+    *and* the case where bracketed level is ERROR but body says SEVERE.
+    """
+    if entry.body:
+        m = SEVERITY_BODY_RE.match(_strip_session_meta(entry.body[0]))
+        if m:
+            return m.group(1).upper()
+    return entry.level
+
+
+# ---------------------------------------------------------------------------
+# Phase 2: bidirectional stack collection
+# ---------------------------------------------------------------------------
+
+
+def _is_stack_shaped(line: str) -> bool:
+    return bool(STACK_HINT_RE.search(line))
+
+
+def _strip_session_meta(body_line: str) -> str:
+    """Strip the ``General  f:N, t:N, st:...> `` session-metadata prefix from
+    a body's first line so pattern matching can run against the meaningful tail.
+    """
+    return SESSION_META_RE.sub("", body_line)
+
+
+def _collect_pre_stack(entries: list[Entry], hit_idx: int) -> list[str]:
+    """Walk back through prior entries; collect stack-shaped lines from each
+    entry's body. Stop at the previous severity-flagged entry. Cap collection
+    at MAX_STACK_FRAMES and at STACK_WALK_LINES of body lines examined.
+    Per spec, only return the block if at least one line looks stack-shaped.
+    """
+    collected: list[str] = []
+    lines_examined = 0
+    for j in range(hit_idx - 1, -1, -1):
+        prior = entries[j]
+        # Stop at another severity line (the previous error's boundary).
+        if is_severity_entry(prior):
+            break
+        # Walk this entry's body in reverse; for body[0] the session-meta
+        # prefix is part of the line — strip it before stack-shape check.
+        for k in range(len(prior.body) - 1, -1, -1):
+            line = prior.body[k]
+            stripped = _strip_session_meta(line) if k == 0 else line
+            lines_examined += 1
+            if _is_stack_shaped(stripped):
+                collected.append(stripped.strip())
+                if len(collected) >= MAX_STACK_FRAMES:
+                    break
+            if lines_examined >= STACK_WALK_LINES:
+                break
+        if len(collected) >= MAX_STACK_FRAMES or lines_examined >= STACK_WALK_LINES:
+            break
+    if not collected:
+        return []
+    collected.reverse()  # restore source order
+    return collected
+
+
+def _collect_post_stack(entries: list[Entry], hit_idx: int) -> list[str]:
+    """Look at the entry's own body continuation lines first (stack frames
+    attached to the ERROR header become continuation lines after parsing),
+    then walk forward through subsequent entries. Stop at the next severity
+    entry. Cap at MAX_STACK_FRAMES and at STACK_WALK_LINES of body lines."""
+    entry = entries[hit_idx]
+    collected: list[str] = []
+    lines_examined = 0
+    # Body continuations (skip body[0] which is the headline itself).
+    for line in entry.body[1:]:
+        lines_examined += 1
+        if _is_stack_shaped(line):
+            collected.append(line.strip())
+            if len(collected) >= MAX_STACK_FRAMES:
+                return collected
+        if lines_examined >= STACK_WALK_LINES:
+            return collected
+    for j in range(hit_idx + 1, len(entries)):
+        next_entry = entries[j]
+        if is_severity_entry(next_entry):
+            break
+        for k, line in enumerate(next_entry.body):
+            stripped = _strip_session_meta(line) if k == 0 else line
+            lines_examined += 1
+            if _is_stack_shaped(stripped):
+                collected.append(stripped.strip())
+                if len(collected) >= MAX_STACK_FRAMES:
+                    return collected
+            if lines_examined >= STACK_WALK_LINES:
+                return collected
+    return collected
+
+
+def collect_stack(entries: list[Entry], hit_idx: int) -> list[str]:
+    """Merge pre + post stack, dedup preserving order, cap at MAX_STACK_FRAMES."""
+    pre = _collect_pre_stack(entries, hit_idx)
+    post = _collect_post_stack(entries, hit_idx)
+    seen: set[str] = set()
+    merged: list[str] = []
+    for frame in pre + post:
+        if frame in seen:
+            continue
+        seen.add(frame)
+        merged.append(frame)
+        if len(merged) >= MAX_STACK_FRAMES:
+            break
+    return merged
+
+
+# ---------------------------------------------------------------------------
+# Phase 3: mod attribution
+# ---------------------------------------------------------------------------
+
+
+def _norm_mod_key(raw_name: str) -> str:
+    """Lowercase, strip spaces / apostrophes / hyphens. Used as mod_id."""
+    s = raw_name.lower()
+    for ch in (" ", "'", "-"):
+        s = s.replace(ch, "")
+    return s
+
+
+def _entry_text(entry: Entry) -> str:
+    """Whole-entry text (headline plus continuation lines) for marker scanning."""
+    return "\n".join(entry.body)
+
+
+def attribute_entry(entry: Entry, prior_lookback_lines: list[str]) -> tuple[str, str, str, str, str]:
+    """Determine ``(mod_id, mod_name, attribution, confidence, reason)``.
+
+    ``prior_lookback_lines`` is the body lines from prior entries that fall
+    within INFERRED_LOOKBACK_LINES raw-file-line distance from this entry's
+    start, in source order. The list is scanned in reverse for the nearest
+    ``Lua((MOD:Y))`` marker when inferred attribution is being attempted.
+
+    Direct-attribution priority: Lua marker -> needed-by -> require-failed.
+
+    Rationale: ``needed by <mod>`` names the dependent mod (more semantically
+    targeted) and is preferred over ``require("...") failed`` which only names
+    the missing module path. ``Lua((MOD:...))`` is unambiguous and wins
+    outright.
+    """
+    text = _entry_text(entry)
+    # 1. Direct via Lua((MOD:X)) — unambiguous; outranks every other signal.
+    m = LUA_MOD_MARKER_RE.search(text)
+    if m:
+        raw = m.group(1).strip()
+        return (
+            _norm_mod_key(raw),
+            raw,
+            "direct",
+            "high",
+            "Lua((MOD:...)) marker on the entry itself",
+        )
+    # 2. Direct via "needed by <mod>"
+    m = NEEDED_BY_RE.search(text)
+    if m:
+        raw = m.group(1).strip().rstrip(".,;")
+        return (
+            _norm_mod_key(raw),
+            raw,
+            "direct",
+            "high",
+            "needed by <mod> hint",
+        )
+    # 3. Direct via require("X") failed — attribute to required module name.
+    m = REQUIRE_FAILED_RE.search(text)
+    if m:
+        raw = m.group(1).strip()
+        # Mod-name first segment (PZ paths often look like Mod/Foo/Bar).
+        mod_name = raw.split("/")[0] if "/" in raw else raw
+        return (
+            _norm_mod_key(mod_name),
+            mod_name,
+            "direct",
+            "high",
+            'require("...") failed shape',
+        )
+    # 4. Inferred — Lua-shaped body + recent Lua((MOD:Y)) within lookback.
+    if any(p.search(text) for p in LUA_SHAPED_PATTERNS):
+        for line in reversed(prior_lookback_lines):
+            mm = LUA_MOD_MARKER_RE.search(line)
+            if mm:
+                raw = mm.group(1).strip()
+                return (
+                    _norm_mod_key(raw),
+                    raw,
+                    "inferred",
+                    "medium",
+                    f"Lua-shaped body; nearest Lua((MOD:{raw})) within "
+                    f"{INFERRED_LOOKBACK_LINES}-line lookback",
+                )
+    return (
+        "__unattributed__",
+        "",
+        "unattributed",
+        "low",
+        "no marker; body not Lua-shaped or no recent Lua((MOD:...))",
+    )
+
+
+# ---------------------------------------------------------------------------
+# Phase 4: file:line extraction (five fallbacks, in order)
+# ---------------------------------------------------------------------------
+
+
+def extract_file_line(text: str) -> tuple[str, int]:
+    """Run the five fallbacks in order. Returns ``(file, line)`` with line=0
+    when only a path was matched."""
+    m = FILE_LINE_AT_RE.search(text)
+    if m:
+        return m.group(1), int(m.group(2))
+    m = FILE_LINE_FUNCTION_RE.search(text)
+    if m:
+        return m.group(1), int(m.group(2))
+    m = FILE_LINE_STRING_RE.search(text)
+    if m:
+        return m.group(1), int(m.group(2))
+    m = FILE_LINE_QUOTED_RE.search(text)
+    if m:
+        return m.group(1), int(m.group(2)) if m.group(2) else 0
+    m = FILE_LINE_UNQUOTED_RE.search(text)
+    if m:
+        return m.group(1), int(m.group(2)) if m.group(2) else 0
+    return "", 0
+
+
+# ---------------------------------------------------------------------------
+# Phase 5: cause-chain extraction
+# ---------------------------------------------------------------------------
+
+
+def extract_cause_chain(text: str) -> str:
+    """Return ``ExceptionA: msg -> ExceptionB: msg`` joined chain, deduped,
+    capped at MAX_CAUSE_CHAIN_LEVELS levels.
+    """
+    tokens: list[str] = []
+    seen: set[str] = set()
+    for line in text.splitlines():
+        cb = CAUSED_BY_RE.search(line)
+        if cb:
+            cls = cb.group(1)
+            msg = cb.group(2) or ""
+            tok = f"{cls}: {msg.strip()}".rstrip(": ").strip()
+            if tok not in seen:
+                seen.add(tok)
+                tokens.append(tok)
+            continue
+        ex = EXCEPTION_LINE_RE.search(line)
+        if ex:
+            cls = ex.group(1)
+            msg = ex.group(2) or ""
+            tok = f"{cls}: {msg.strip()}".rstrip(": ").strip()
+            if tok not in seen:
+                seen.add(tok)
+                tokens.append(tok)
+        if len(tokens) >= MAX_CAUSE_CHAIN_LEVELS:
+            break
+    return " -> ".join(tokens[:MAX_CAUSE_CHAIN_LEVELS])
+
+
+# ---------------------------------------------------------------------------
+# Phase 6: Java exception kind detection
+# ---------------------------------------------------------------------------
+
+
+JAVA_EXCEPTION_RE = re.compile(r"(?:\w+\.)+\w+(?:Exception|Error)\b")
+
+
+def detect_kind(entry: Entry, attribution: str, body_text: str) -> str:
+    """Determine the ``kind`` field. Order: engine_noise > require_failed >
+    java_exception > lua_runtime > runtime."""
+    # Phase 7 short-circuit (engine noise outranks others per spec — engine
+    # noise is PZ's own diagnostic chatter regardless of class).
+    if any(p.search(body_text) for p in ENGINE_NOISE_PATTERNS):
+        return "engine_noise"
+    if REQUIRE_FAILED_RE.search(body_text):
+        return "require_failed"
+    has_java = bool(JAVA_EXCEPTION_RE.search(body_text))
+    has_lua_marker = bool(LUA_MOD_MARKER_RE.search(body_text))
+    if has_java and not has_lua_marker:
+        return "java_exception"
+    # Lua-attributed runtime / inferred
+    if has_lua_marker or attribution in ("direct", "inferred"):
+        return "lua_runtime"
+    return "runtime"
+
+
+# ---------------------------------------------------------------------------
+# Phase 8: signature computation
+# ---------------------------------------------------------------------------
+
+
+def normalize_first_line(first: str) -> str:
+    """Per spec: strip session metadata prefix, strip any leading severity
+    word (so ``SEVERE: foo`` and ``foo`` produce the same pattern_id when both
+    are SEVERE-level), flatten quoted strings to ``"<S>"`` / ``'<S>'``, flatten
+    ≥2-digit numeric runs to ``<N>``, collapse whitespace, truncate to 200
+    chars.
+    """
+    s = first.strip()
+    s = SESSION_META_RE.sub("", s)
+    # Strip any leading ERROR:/SEVERE:/WARN:/FATAL: that survived in the body
+    # — the bracketed level already feeds pattern_id separately, so leaving
+    # the body-prefix in place would fragment signatures across "body has
+    # SEVERE: prefix" vs "body has no prefix but bracketed level is SEVERE."
+    s = SEVERITY_PREFIX_STRIP_RE.sub("", s)
+    s = DOUBLE_QUOTED_RE.sub('"<S>"', s)
+    s = SINGLE_QUOTED_RE.sub("'<S>'", s)
+    s = NUMERIC_RUN_RE.sub("<N>", s)
+    s = WS_RUN_RE.sub(" ", s)
+    return s[:PATTERN_ID_FIRST_LINE_MAX]
+
+
+def compute_pattern_id(level: str, first_line: str) -> str:
+    """First 16 hex chars of ``sha256`` over ``level``, a newline, and the
+    normalised first line, prefixed ``sha256:``.
+
+    16 hex chars (64 bits) chosen for JSON readability vs collision-resistance
+    trade-off; consumers treat as opaque.
+    """
+    norm = normalize_first_line(first_line)
+    h = hashlib.sha256(f"{level}\n{norm}".encode("utf-8")).hexdigest()
+    return f"sha256:{h[:16]}"
+
+
+def compute_signature(pattern_id: str, mod_id: str) -> str:
+    """First 16 hex chars of ``sha256`` over ``pattern_id``, a newline, and
+    ``mod_id``, prefixed ``sha256:``.
+
+    16 hex chars (64 bits) chosen for JSON readability vs collision-resistance
+    trade-off; consumers treat as opaque.
+    """
+    h = hashlib.sha256(f"{pattern_id}\n{mod_id}".encode("utf-8")).hexdigest()
+    return f"sha256:{h[:16]}"
+
+
+# ---------------------------------------------------------------------------
+# Aggregation (phase 9) and the public classify_entries entry point
+# ---------------------------------------------------------------------------
+
+
+_CONFIDENCE_RANK: dict[str, int] = {"low": 0, "medium": 1, "high": 2}
+_ATTRIBUTION_RANK: dict[str, int] = {
+    "unattributed": 0,
+    "inferred": 1,
+    "direct": 2,
+}
+
+
+def _build_excerpt(entry: Entry, max_chars: int = 1000) -> str:
+    """Best-effort one-block excerpt of the entry (header + continuations)."""
+    lines: list[str] = []
+    header = f'[{entry.timestamp}] {entry.level}: '
+    if entry.body:
+        lines.append(header + entry.body[0])
+        for cont in entry.body[1:]:
+            lines.append(cont)
+    text = "\n".join(lines)
+    if len(text) > max_chars:
+        text = text[:max_chars] + "\n... [truncated]"
+    return text
+
+
+def _build_lookback_window(entries: list[Entry], hit_idx: int) -> list[str]:
+    """Collect body lines from prior entries whose ``line_start`` falls within
+    INFERRED_LOOKBACK_LINES raw-file-line distance from the current entry.
+
+    Spec wording is "within the previous 40 lines", measured in raw file lines
+    (mirrors pzmm's ``(i - last_mod_line) <= 40``, inclusive of 40). Counting
+    raw lines means a multi-line entry (e.g., a 5-line Java stack trace) does
+    not shrink the practical window the way a body-line budget would.
+
+    Returned list is in source order (oldest first) so callers can call
+    ``reversed()`` on it.
+    """
+    if hit_idx <= 0:
+        return []
+    threshold = entries[hit_idx].line_start - INFERRED_LOOKBACK_LINES
+    in_window: list[Entry] = []
+    for j in range(hit_idx - 1, -1, -1):
+        prior = entries[j]
+        if prior.line_start < threshold:
+            break
+        in_window.append(prior)
+    # We accumulated newest-first; reverse so we emit in source order.
+    in_window.reverse()
+    collected: list[str] = []
+    for prior in in_window:
+        collected.extend(prior.body)
+    return collected
+
+
+def classify_entries(entries: list[Entry], source_file: str = "") -> list[Record]:
+    """Apply phases 1-9 to a parsed-file entry list. Returns one Record per
+    unique (mod_id, error_shape) pair after dedup on signature.
+    """
+    by_signature: dict[str, Record] = {}
+    for hit_idx, entry in enumerate(entries):
+        if not is_severity_entry(entry):
+            continue
+        level = effective_level(entry)
+        body_text = _entry_text(entry)
+        # Phase 2: stack collection
+        stack = collect_stack(entries, hit_idx)
+        # Phase 3: attribution (with INFERRED_LOOKBACK_LINES lookback)
+        prior_window = _build_lookback_window(entries, hit_idx)
+        mod_id, mod_name, attribution, confidence, attribution_reason = attribute_entry(
+            entry, prior_window
+        )
+        # Phase 4: file:line extraction (search body + stack frames)
+        search_text = body_text + "\n" + "\n".join(stack)
+        file_path, line_no = extract_file_line(search_text)
+        # Phase 5: cause-chain extraction
+        cause_chain = extract_cause_chain(search_text)
+        # Phase 6 & 7: kind detection (engine_noise short-circuits)
+        kind = detect_kind(entry, attribution, body_text)
+        # Phase 8: signature computation
+        pattern_id = compute_pattern_id(level, entry.body[0] if entry.body else "")
+        signature = compute_signature(pattern_id, mod_id)
+        # Phase 9: dedup & aggregate
+        if signature not in by_signature:
+            by_signature[signature] = Record(
+                signature=signature,
+                pattern_id=pattern_id,
+                level=level,
+                kind=kind,
+                mod_id=mod_id,
+                mod_name=mod_name,
+                attribution=attribution,
+                confidence=confidence,
+                attribution_reason=attribution_reason,
+                file=file_path,
+                line=line_no,
+                cause_chain=cause_chain,
+                stack=list(stack),
+                first_seen=FirstSeen(
+                    file=source_file,
+                    line=entry.line_start,
+                    timestamp=entry.timestamp,
+                ),
+                occurrence_count=1,
+                files=[source_file] if source_file else [],
+                excerpt=_build_excerpt(entry),
+            )
+        else:
+            rec = by_signature[signature]
+            rec.occurrence_count += 1
+            if source_file and source_file not in rec.files:
+                rec.files.append(source_file)
+            # Promote attribution / confidence if this hit is stronger.
+            if _ATTRIBUTION_RANK[attribution] > _ATTRIBUTION_RANK[rec.attribution]:
+                rec.attribution = attribution
+                rec.attribution_reason = attribution_reason
+                if mod_name:
+                    rec.mod_name = mod_name
+            if _CONFIDENCE_RANK[confidence] > _CONFIDENCE_RANK[rec.confidence]:
+                rec.confidence = confidence
+            # Merge stack frames (preserving order, capped).
+            for frame in stack:
+                if frame not in rec.stack and len(rec.stack) < MAX_STACK_FRAMES:
+                    rec.stack.append(frame)
+            # Extend cause chain if the new hit has additional segments.
+            if cause_chain and cause_chain != rec.cause_chain:
+                # Concatenate unseen tokens.
+                old = rec.cause_chain.split(" -> ") if rec.cause_chain else []
+                new = cause_chain.split(" -> ")
+                merged = list(old)
+                for tok in new:
+                    if tok and tok not in merged:
+                        merged.append(tok)
+                rec.cause_chain = " -> ".join(merged[:MAX_CAUSE_CHAIN_LEVELS])
+    return list(by_signature.values())
+
+
+__all__ = [
+    "Entry",
+    "FirstSeen",
+    "Record",
+    "parse_file",
+    "classify_entries",
+    "is_severity_entry",
+    "effective_level",
+    "collect_stack",
+    "attribute_entry",
+    "extract_file_line",
+    "extract_cause_chain",
+    "detect_kind",
+    "normalize_first_line",
+    "compute_pattern_id",
+    "compute_signature",
+    "INFERRED_LOOKBACK_LINES",
+    "MAX_STACK_FRAMES",
+    "STACK_WALK_LINES",
+    "MAX_CAUSE_CHAIN_LEVELS",
+    "SEVERITY_LEVELS",
+]
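Reviewer note on phase 8 of `pz_parser.py`: the normalisation step is what lets two log lines that differ only in session metadata, quoted paths, or multi-digit numbers share one `pattern_id`. The sketch below copies the relevant regexes and the two functions from the file above so it runs standalone. The two sample body lines are made up for illustration (one with a `t:` field, one without) and are not taken from real logs.

```python
import hashlib
import re

SESSION_META_RE = re.compile(
    r"^[A-Za-z][A-Za-z0-9]*\s+f:\d+,?\s*(?:t:\d+,?\s*)?st:[\d,]+>\s*"
)
SEVERITY_PREFIX_STRIP_RE = re.compile(
    r"^\s*(?:ERROR|SEVERE|WARN|FATAL)\s*[:\s]\s*", re.IGNORECASE
)
DOUBLE_QUOTED_RE = re.compile(r'"[^"]*"')
SINGLE_QUOTED_RE = re.compile(r"'[^']*'")
NUMERIC_RUN_RE = re.compile(r"\d{2,}")
WS_RUN_RE = re.compile(r"\s+")
PATTERN_ID_FIRST_LINE_MAX = 200


def normalize_first_line(first: str) -> str:
    s = SESSION_META_RE.sub("", first.strip())
    s = SEVERITY_PREFIX_STRIP_RE.sub("", s)
    s = DOUBLE_QUOTED_RE.sub('"<S>"', s)
    s = SINGLE_QUOTED_RE.sub("'<S>'", s)
    s = NUMERIC_RUN_RE.sub("<N>", s)
    s = WS_RUN_RE.sub(" ", s)
    return s[:PATTERN_ID_FIRST_LINE_MAX]


def compute_pattern_id(level: str, first_line: str) -> str:
    norm = normalize_first_line(first_line)
    digest = hashlib.sha256(f"{level}\n{norm}".encode("utf-8")).hexdigest()
    return f"sha256:{digest[:16]}"


# Illustrative (invented) body lines: same error shape, different details.
WITH_T = 'General      f:0, t:1776297660000, st:48,648,175,178> failed to load "mods/A/x.lua" after 120 ms'
WITHOUT_T = 'General      f:0, st:48,648,176,178> failed to load "mods/B/y.lua" after 4500 ms'

if __name__ == "__main__":
    print(normalize_first_line(WITH_T))
    print(compute_pattern_id("ERROR", WITH_T) == compute_pattern_id("ERROR", WITHOUT_T))
```

Both sample lines normalise to `failed to load "<S>" after <N> ms`, so they hash to the same `pattern_id`. Changing the level changes the hash because the level is part of the hashed input.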
--- a/tools/pz-analyzer/pz_redact_all.sh
+++ b/tools/pz-analyzer/pz_redact_all.sh
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+# One-shot PII redaction over the PZ DebugLog-server files extracted from
+# /opt/ik-codex/Logs.zip. Produces /opt/ik-codex/.scratch/pz/Logs.redacted/
+# (gitignored alongside the source). Single Docker invocation; the codex
+# library's vendor/autoload.php is mounted read-write only because composer's
+# image refuses world-readable mounts under -u UID:GID.
+#
+# Re-runnable: every run rewrites each output file. To start from a clean
+# slate, rm -rf the OUT directory first.
+set -euo pipefail
+
+IN=/opt/ik-codex/.scratch/pz/Logs
+OUT=/opt/ik-codex/.scratch/pz/Logs.redacted
+
+if [ ! -d "$IN" ]; then
+    echo "error: input directory $IN missing — extract Logs.zip first" >&2
+    exit 1
+fi
+
+mkdir -p "$OUT"
+
+docker run --rm \
+    --entrypoint php \
+    -v /opt/ik-codex:/app -w /app \
+    -v "$IN":/in:ro -v "$OUT":/out \
+    -u "$(id -u):$(id -g)" \
+    composer:latest \
+    -r '
+        require "vendor/autoload.php";
+        $r = new IndifferentKetchup\Codex\Util\ProjectZomboid\ProjectZomboidRedactor();
+        $files = glob("/in/*DebugLog-server*.txt");
+        foreach ($files as $f) {
+            file_put_contents("/out/" . basename($f), $r->redact(file_get_contents($f)));
+        }
+        fprintf(STDERR, "redacted %d file(s)\n", count($files));
+    '
--- a/tools/pz-analyzer/tests/init.py
+++ b/tools/pz-analyzer/tests/init.py
--- a/tools/pz-analyzer/tests/fixtures/fixture_cause_chain.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_cause_chain.txt
@@ -0,0 +1,7 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:04:00.000] ERROR: General      f:0, t:1776297840000, st:48,648,355,178> Lua((MOD:Test Mod Alpha)) wrapper failure
+	java.lang.RuntimeException: outer wrapper at zombie.Foo(Foo.java:10)
+	Caused by: java.lang.IllegalStateException: middle layer
+	Caused by: java.lang.NullPointerException: deepest cause
+		at zombie.Bar(Bar.java:99)
+[16-04-26 00:04:01.000] LOG  : General      f:0, t:1776297841000, st:48,648,356,178> after.
--- a/tools/pz-analyzer/tests/fixtures/fixture_dedup.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_dedup.txt
@@ -0,0 +1,8 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] ERROR: General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Test Mod Alpha)) crash 1
+	at media/lua/client/A.lua:11
+[16-04-26 00:01:01.000] ERROR: General      f:0, t:1776297661000, st:48,648,176,178> Lua((MOD:Test Mod Alpha)) crash 1
+	at media/lua/client/A.lua:11
+[16-04-26 00:01:02.000] ERROR: General      f:0, t:1776297662000, st:48,648,177,178> Lua((MOD:Test Mod Alpha)) crash 1
+	at media/lua/client/A.lua:11
+[16-04-26 00:01:03.000] LOG  : General      f:0, t:1776297663000, st:48,648,178,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_empty.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_empty.txt
--- a/tools/pz-analyzer/tests/fixtures/fixture_engine_noise.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_engine_noise.txt
@@ -0,0 +1,4 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:03:00.000] ERROR: General      f:0, t:1776297780000, st:48,648,295,178> KahluaThread.flusherrormessage> dumping lua stack trace
+	at media/lua/client/Foo.lua:1
+[16-04-26 00:03:01.000] LOG  : General      f:0, t:1776297781000, st:48,648,296,178> after.
--- a/tools/pz-analyzer/tests/fixtures/fixture_file_line_fallbacks.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_file_line_fallbacks.txt
@@ -0,0 +1,10 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] ERROR: General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Test Mod A)) format1
+	at media/lua/client/F1.lua:11
+[16-04-26 00:01:01.000] ERROR: General      f:0, t:1776297661000, st:48,648,176,178> Lua((MOD:Test Mod B)) format2
+	function: doStuff -- file: media/lua/client/F2.lua line # 22
+[16-04-26 00:01:02.000] ERROR: General      f:0, t:1776297662000, st:48,648,177,178> Lua((MOD:Test Mod C)) format3
+	[string "media/lua/client/F3.lua"]:33: bang
+[16-04-26 00:01:03.000] ERROR: General      f:0, t:1776297663000, st:48,648,178,178> Lua((MOD:Test Mod D)) format4 about "media/lua/client/F4.lua" failure
+[16-04-26 00:01:04.000] ERROR: General      f:0, t:1776297664000, st:48,648,179,178> Lua((MOD:Test Mod E)) format5 path media/lua/client/F5.lua mention
+[16-04-26 00:01:05.000] LOG  : General      f:0, t:1776297665000, st:48,648,180,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_inferred.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_inferred.txt
@@ -0,0 +1,7 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] LOG  : General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Spongies Clothing)) initialised.
+[16-04-26 00:01:01.000] LOG  : General      f:0, t:1776297661000, st:48,648,176,178> ordinary log line.
+[16-04-26 00:01:02.000] LOG  : General      f:0, t:1776297662000, st:48,648,177,178> another log line.
+[16-04-26 00:01:03.000] ERROR: General      f:0, t:1776297663000, st:48,648,178,178> LuaManager.GetFunctionObject> no such function: doStuff
+	at media/lua/client/Spongie.lua:7
+[16-04-26 00:01:04.000] LOG  : General      f:0, t:1776297664000, st:48,648,179,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_java_exception.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_java_exception.txt
@@ -0,0 +1,8 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:19.080] ERROR: General      f:0, t:1776297679080, st:48,648,194,258> DebugFileWatcher.registerDir> Exception thrown
+	java.nio.file.NoSuchFileException: /placeholder/config/mods at UnixException.translateToIOException(null:-1).
+	Stack trace:
+		at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
+		at java.base/sun.nio.fs.UnixException.asIOException(Unknown Source)
+		at java.base/sun.nio.fs.LinuxWatchService$Poller.implRegister(Unknown Source)
+[16-04-26 00:01:19.090] LOG  : General      f:0, t:1776297679090, st:48,648,194,268> after.
--- a/tools/pz-analyzer/tests/fixtures/fixture_lookback_boundary.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_lookback_boundary.txt
@@ -0,0 +1,45 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] LOG  : General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Test Mod Distant)) initialised.
+[16-04-26 00:01:01.000] LOG  : General      f:0, t:1776297661000, st:48,648,176,178> filler 1.
+[16-04-26 00:01:02.000] LOG  : General      f:0, t:1776297662000, st:48,648,177,178> filler 2.
+[16-04-26 00:01:03.000] LOG  : General      f:0, t:1776297663000, st:48,648,178,178> filler 3.
+[16-04-26 00:01:04.000] LOG  : General      f:0, t:1776297664000, st:48,648,179,178> filler 4.
+[16-04-26 00:01:05.000] LOG  : General      f:0, t:1776297665000, st:48,648,180,178> filler 5.
+[16-04-26 00:01:06.000] LOG  : General      f:0, t:1776297666000, st:48,648,181,178> filler 6.
+[16-04-26 00:01:07.000] LOG  : General      f:0, t:1776297667000, st:48,648,182,178> filler 7.
+[16-04-26 00:01:08.000] LOG  : General      f:0, t:1776297668000, st:48,648,183,178> filler 8.
+[16-04-26 00:01:09.000] LOG  : General      f:0, t:1776297669000, st:48,648,184,178> filler 9.
+[16-04-26 00:01:10.000] LOG  : General      f:0, t:1776297670000, st:48,648,185,178> filler 10.
+[16-04-26 00:01:11.000] LOG  : General      f:0, t:1776297671000, st:48,648,186,178> filler 11.
+[16-04-26 00:01:12.000] LOG  : General      f:0, t:1776297672000, st:48,648,187,178> filler 12.
+[16-04-26 00:01:13.000] LOG  : General      f:0, t:1776297673000, st:48,648,188,178> filler 13.
+[16-04-26 00:01:14.000] LOG  : General      f:0, t:1776297674000, st:48,648,189,178> filler 14.
+[16-04-26 00:01:15.000] LOG  : General      f:0, t:1776297675000, st:48,648,190,178> filler 15.
+[16-04-26 00:01:16.000] LOG  : General      f:0, t:1776297676000, st:48,648,191,178> filler 16.
+[16-04-26 00:01:17.000] LOG  : General      f:0, t:1776297677000, st:48,648,192,178> filler 17.
+[16-04-26 00:01:18.000] LOG  : General      f:0, t:1776297678000, st:48,648,193,178> filler 18.
+[16-04-26 00:01:19.000] LOG  : General      f:0, t:1776297679000, st:48,648,194,178> filler 19.
+[16-04-26 00:01:20.000] LOG  : General      f:0, t:1776297680000, st:48,648,195,178> filler 20.
+[16-04-26 00:01:21.000] LOG  : General      f:0, t:1776297681000, st:48,648,196,178> filler 21.
+[16-04-26 00:01:22.000] LOG  : General      f:0, t:1776297682000, st:48,648,197,178> filler 22.
+[16-04-26 00:01:23.000] LOG  : General      f:0, t:1776297683000, st:48,648,198,178> filler 23.
+[16-04-26 00:01:24.000] LOG  : General      f:0, t:1776297684000, st:48,648,199,178> filler 24.
+[16-04-26 00:01:25.000] LOG  : General      f:0, t:1776297685000, st:48,648,200,178> filler 25.
+[16-04-26 00:01:26.000] LOG  : General      f:0, t:1776297686000, st:48,648,201,178> filler 26.
+[16-04-26 00:01:27.000] LOG  : General      f:0, t:1776297687000, st:48,648,202,178> filler 27.
+[16-04-26 00:01:28.000] LOG  : General      f:0, t:1776297688000, st:48,648,203,178> filler 28.
+[16-04-26 00:01:29.000] LOG  : General      f:0, t:1776297689000, st:48,648,204,178> filler 29.
+[16-04-26 00:01:30.000] LOG  : General      f:0, t:1776297690000, st:48,648,205,178> filler 30.
+[16-04-26 00:01:31.000] LOG  : General      f:0, t:1776297691000, st:48,648,206,178> filler 31.
+[16-04-26 00:01:32.000] LOG  : General      f:0, t:1776297692000, st:48,648,207,178> filler 32.
+[16-04-26 00:01:33.000] LOG  : General      f:0, t:1776297693000, st:48,648,208,178> filler 33.
+[16-04-26 00:01:34.000] LOG  : General      f:0, t:1776297694000, st:48,648,209,178> filler 34.
+[16-04-26 00:01:35.000] LOG  : General      f:0, t:1776297695000, st:48,648,210,178> filler 35.
+[16-04-26 00:01:36.000] LOG  : General      f:0, t:1776297696000, st:48,648,211,178> filler 36.
+[16-04-26 00:01:37.000] LOG  : General      f:0, t:1776297697000, st:48,648,212,178> filler 37.
+[16-04-26 00:01:38.000] LOG  : General      f:0, t:1776297698000, st:48,648,213,178> filler 38.
+[16-04-26 00:01:39.000] LOG  : General      f:0, t:1776297699000, st:48,648,214,178> filler 39.
+[16-04-26 00:01:40.000] LOG  : General      f:0, t:1776297700000, st:48,648,215,178> filler 40.
+[16-04-26 00:01:41.000] LOG  : General      f:0, t:1776297701000, st:48,648,216,178> filler 41.
+[16-04-26 00:01:42.000] ERROR: General      f:0, t:1776297702000, st:48,648,217,178> LuaManager.GetFunctionObject> no such function (way past lookback)
+[16-04-26 00:01:43.000] LOG  : General      f:0, t:1776297703000, st:48,648,218,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_lua_attributed.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_lua_attributed.txt
@@ -0,0 +1,6 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:19.131] LOG  : Mod          f:0, t:1776297679131, st:48,648,194,309> loading example_mod_alpha.
+[16-04-26 00:05:00.000] ERROR: General      f:0, t:1776297900000, st:48,648,415,178> Lua((MOD:Test Mod Alpha)) something broke
+	at media/lua/client/Foo.lua:42
+	function: doStuff -- file: media/lua/client/Foo.lua line # 42
+[16-04-26 00:05:01.000] LOG  : General      f:0, t:1776297901000, st:48,648,416,178> after the error.
--- a/tools/pz-analyzer/tests/fixtures/fixture_no_errors.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_no_errors.txt
@@ -0,0 +1,3 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] LOG  : General      f:0, t:1776297660000, st:48,648,175,178> ordinary line.
+[16-04-26 00:02:00.000] LOG  : General      f:0, t:1776297720000, st:48,648,235,178> nothing wrong.
--- a/tools/pz-analyzer/tests/fixtures/fixture_non_lua_no_inferred.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_non_lua_no_inferred.txt
@@ -0,0 +1,5 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] LOG  : General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Spongies Clothing)) initialised.
+[16-04-26 00:01:01.000] LOG  : General      f:0, t:1776297661000, st:48,648,176,178> ordinary log line.
+[16-04-26 00:01:03.000] ERROR: General      f:0, t:1776297663000, st:48,648,178,178> Disk full while writing chunk data
+[16-04-26 00:01:04.000] LOG  : General      f:0, t:1776297664000, st:48,648,179,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_post_stack.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_post_stack.txt
@@ -0,0 +1,6 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] ERROR: General      f:0, t:1776297660000, st:48,648,175,178> Lua((MOD:Test Mod Alpha)) crash now
+	at media/lua/client/X.lua:11
+	at media/lua/client/Y.lua:22
+	[string "media/lua/client/Z.lua"]:33: oops
+[16-04-26 00:01:04.000] LOG  : General      f:0, t:1776297664000, st:48,648,179,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_pre_stack.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_pre_stack.txt
@@ -0,0 +1,6 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] LOG  : General      f:0, t:1776297660000, st:48,648,175,178> 	at media/lua/client/A.lua:11
+[16-04-26 00:01:01.000] LOG  : General      f:0, t:1776297661000, st:48,648,176,178> 	at media/lua/client/B.lua:22
+[16-04-26 00:01:02.000] LOG  : General      f:0, t:1776297662000, st:48,648,177,178> 	[string "media/lua/client/C.lua"]:33: oops
+[16-04-26 00:01:03.000] ERROR: General      f:0, t:1776297663000, st:48,648,178,178> Lua((MOD:Test Mod Alpha)) crash
+[16-04-26 00:01:04.000] LOG  : General      f:0, t:1776297664000, st:48,648,179,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_require_failed.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_require_failed.txt
@@ -0,0 +1,3 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] ERROR: General      f:0, t:1776297660000, st:48,648,175,178> require("DependencyMod/Foo") failed: needed by Test Mod Alpha
+[16-04-26 00:01:01.000] LOG  : General      f:0, t:1776297661000, st:48,648,176,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_severity_variants.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_severity_variants.txt
@@ -0,0 +1,5 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:01:00.000] ERROR: General      f:0, t:1776297660000, st:48,648,175,178> ERROR: top-level error message
+[16-04-26 00:01:01.000] WARN : General      f:0, t:1776297661000, st:48,648,176,178> WARN: top-level warn message
+[16-04-26 00:01:02.000] ERROR: General      f:0, t:1776297662000, st:48,648,177,178> SEVERE: java-style severe message at zombie.Foo(Foo.java:5)
+[16-04-26 00:01:03.000] LOG  : General      f:0, t:1776297663000, st:48,648,178,178> ok.
--- a/tools/pz-analyzer/tests/fixtures/fixture_unattributed.txt
+++ b/tools/pz-analyzer/tests/fixtures/fixture_unattributed.txt
@@ -0,0 +1,3 @@
+[16-04-26 00:00:42.314] LOG  : General      f:0, t:1776297642254, st:48,648,157,434> server starting.
+[16-04-26 00:02:00.000] WARN : General      f:0, t:1776297720000, st:48,648,235,178> ZomboidFileSystem.loadModAndRequired> required mod "absent_mod" not found.
+[16-04-26 00:02:01.000] LOG  : General      f:0, t:1776297721000, st:48,648,236,178> after.
--- a/tools/pz-analyzer/tests/test_attribution.py
+++ b/tools/pz-analyzer/tests/test_attribution.py
@@ -0,0 +1,225 @@
+"""Tests for pz_parser phase 3 — mod attribution."""
+from __future__ import annotations
+
+import pathlib
+import sys
+import unittest
+
+sys.path.insert(0, str(pathlib.Path(__file__).resolve().parents[1]))
+
+import pz_parser  # noqa: E402
+
+FIXTURE_DIR = pathlib.Path(__file__).resolve().parent / "fixtures"
+
+
+def fixture(name: str) -> pathlib.Path:
+    return FIXTURE_DIR / name
+
+
+class AttributionBucketTests(unittest.TestCase):
+    """Three confidence buckets: direct (high), inferred (medium),
+    unattributed (low)."""
+
+    def test_direct_attribution_when_lua_marker_on_entry(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_lua_attributed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="la.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        self.assertEqual(rec.attribution, "direct")
+        self.assertEqual(rec.confidence, "high")
+        # mod_id is normalised: lowercase, no spaces / apostrophes / hyphens.
+        self.assertEqual(rec.mod_id, "testmodalpha")
+        self.assertEqual(rec.mod_name, "Test Mod Alpha")
+
+    def test_inferred_attribution_within_lookback_window(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_inferred.txt"))
+        records = pz_parser.classify_entries(entries, source_file="in.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        self.assertEqual(rec.attribution, "inferred")
+        self.assertEqual(rec.confidence, "medium")
+        self.assertEqual(rec.mod_id, "spongiesclothing")
+
+    def test_unattributed_when_no_marker_and_not_lua_shaped(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_unattributed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="ua.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        self.assertEqual(rec.attribution, "unattributed")
+        self.assertEqual(rec.confidence, "low")
+        self.assertEqual(rec.mod_id, "__unattributed__")
+
+
+class LookbackBoundaryTests(unittest.TestCase):
+    """Phase 3 — 40-line inferred-attribution window boundary."""
+
+    def test_lua_marker_beyond_lookback_does_not_attribute(self) -> None:
+        # Fixture places the Lua((MOD:...)) >40 lines before the ERROR.
+        entries = pz_parser.parse_file(fixture("fixture_lookback_boundary.txt"))
+        records = pz_parser.classify_entries(entries, source_file="lb.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        # The marker is too far back, so the Lua-shaped ERROR stays unattributed.
+        self.assertEqual(rec.attribution, "unattributed")
+        self.assertEqual(rec.mod_id, "__unattributed__")
+
+    def test_non_lua_shaped_body_rejects_inferred_attribution(self) -> None:
+        # Recent Lua((MOD:Spongies Clothing)) emitted, but the ERROR body
+        # ("Disk full while writing chunk data") isn't Lua-shaped.
+        entries = pz_parser.parse_file(fixture("fixture_non_lua_no_inferred.txt"))
+        records = pz_parser.classify_entries(entries, source_file="nl.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        self.assertEqual(rec.attribution, "unattributed")
+
+
+class NeededByTests(unittest.TestCase):
+    """Phase 3 — direct attribution via "needed by <mod>" hint."""
+
+    def test_needed_by_extracts_dependent_mod(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_require_failed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="rf.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        # "needed by Test Mod Alpha" should set the mod to Test Mod Alpha
+        # (preferred over the require("...") side which would mention
+        # DependencyMod). Either way we want direct/high.
+        self.assertEqual(rec.attribution, "direct")
+        self.assertEqual(rec.confidence, "high")
+        # The "needed by" branch is checked before the require() branch in
+        # the priority order; mod_id should reflect Test Mod Alpha.
+        self.assertEqual(rec.mod_id, "testmodalpha")
+
+
+def _make_marker_line(idx: int) -> str:
+    """Synthesise a single LOG-level entry containing a Lua((MOD:...)) marker."""
+    # Vary timestamps so the bracketed prefix is unique-ish; not strictly
+    # required — they only feed Entry.timestamp, not parsing.
+    return (
+        f"[16-04-26 00:00:{idx:02d}.000] LOG  : General      f:0, "
+        f"t:1776297642{idx:03d}, st:48,648,157,434> "
+        "Lua((MOD:Test Mod Alpha)) initialised."
+    )
+
+
+def _make_filler_line(idx: int) -> str:
+    """A plain LOG-level entry with no marker; one raw line."""
+    return (
+        f"[16-04-26 00:01:{idx % 60:02d}.000] LOG  : General      f:0, "
+        f"t:177629760{idx:04d}, st:48,648,200,178> filler entry {idx}."
+    )
+
+
+def _make_error_line() -> str:
+    """A Lua-shaped ERROR with no Lua((MOD:...)) marker on the entry itself
+    — so attribution must come from the lookback window if it comes at all."""
+    return (
+        "[16-04-26 00:02:00.000] ERROR: General      f:0, "
+        "t:1776297900000, st:48,648,300,178> "
+        "LuaManager.GetFunctionObject> no such function: doStuff"
+    )
+
+
+class RawLineLookbackTests(unittest.TestCase):
+    """Phase 3 — lookback semantics measure raw file lines, not body-line
+    budgets. Multi-line entries inside the window must not shrink the
+    practical reach."""
+
+    def _write_fixture(self, name: str, lines: list[str]) -> pathlib.Path:
+        path = FIXTURE_DIR / name
+        path.write_text("\n".join(lines) + "\n")
+        return path
+
+    def test_marker_exactly_at_lookback_boundary_attributes(self) -> None:
+        # Marker on line 1, ERROR on line 41 -> raw-line distance = 40
+        # (inclusive of INFERRED_LOOKBACK_LINES=40 -> still attributed).
+        lines = [_make_marker_line(0)]
+        for i in range(1, 40):
+            lines.append(_make_filler_line(i))
+        lines.append(_make_error_line())  # line 41 in the fixture
+        path = self._write_fixture("_rawline_at_boundary.txt", lines)
+        try:
+            entries = pz_parser.parse_file(path)
+            self.assertEqual(entries[0].line_start, 1)
+            self.assertEqual(entries[-1].line_start, 41)
+            records = pz_parser.classify_entries(entries, source_file="b1.txt")
+            self.assertEqual(len(records), 1)
+            self.assertEqual(records[0].attribution, "inferred")
+            self.assertEqual(records[0].mod_id, "testmodalpha")
+        finally:
+            path.unlink()
+
+    def test_marker_one_line_past_boundary_does_not_attribute(self) -> None:
+        # Marker on line 1, ERROR on line 42 -> raw-line distance = 41
+        # (just outside INFERRED_LOOKBACK_LINES -> unattributed).
+        lines = [_make_marker_line(0)]
+        for i in range(1, 41):
+            lines.append(_make_filler_line(i))
+        lines.append(_make_error_line())  # line 42 in the fixture
+        path = self._write_fixture("_rawline_past_boundary.txt", lines)
+        try:
+            entries = pz_parser.parse_file(path)
+            self.assertEqual(entries[0].line_start, 1)
+            self.assertEqual(entries[-1].line_start, 42)
+            records = pz_parser.classify_entries(entries, source_file="b2.txt")
+            self.assertEqual(len(records), 1)
+            self.assertEqual(records[0].attribution, "unattributed")
+            self.assertEqual(records[0].mod_id, "__unattributed__")
+        finally:
+            path.unlink()
+
+    def test_multiline_entry_does_not_shrink_practical_lookback(self) -> None:
+        """Multi-line entries inside the lookback window do not break
+        attribution. (Old body-line-budget and new raw-line-distance semantics
+        happen to be equivalent on contiguous PZ entries; this test locks the
+        post-fix semantic against future regression to a budget that *would*
+        differ — e.g. a body-line cap with a smaller value.)
+        """
+        # Layout the file so a multi-line entry sits between marker and ERROR.
+        # The marker on line 1 is within 40 raw lines of the ERROR even though
+        # the file has a 6-line multi-line entry in between.
+        lines = [_make_marker_line(0)]            # raw line 1: marker entry
+        # Single-line fillers on raw lines 2..30 (29 entries).
+        for i in range(1, 30):
+            lines.append(_make_filler_line(i))
+        # Multi-line entry: header on raw line 31, 5 continuations on lines
+        # 32..36 (Java-stack-trace shape).
+        lines.append(
+            "[16-04-26 00:01:30.000] LOG  : General      f:0, "
+            "t:1776297930000, st:48,648,200,178> stack trace dump"
+        )
+        for k in range(5):
+            lines.append(f"\tat zombie.SomeClass.method{k}(SomeClass.java:{k + 1})")
+        # Single-line fillers on raw lines 37..40 (4 entries).
+        for i in range(30, 34):
+            lines.append(_make_filler_line(i))
+        # ERROR at raw line 41 -> N - 1 = 40 -> within window.
+        lines.append(_make_error_line())
+        path = self._write_fixture("_rawline_multiline.txt", lines)
+        try:
+            entries = pz_parser.parse_file(path)
+            # Sanity-check the layout: first entry at line 1, multi-line entry
+            # sits at line 31 with 6 body lines (header + 5 continuations),
+            # ERROR at line 41.
+            self.assertEqual(entries[0].line_start, 1)
+            multi = next(
+                e for e in entries
+                if e.line_start == 31 and len(e.body) == 6
+            )
+            self.assertEqual(multi.line_end, 36)
+            self.assertEqual(entries[-1].line_start, 41)
+            records = pz_parser.classify_entries(entries, source_file="ml.txt")
+            self.assertEqual(len(records), 1)
+            # Raw-line-distance semantics: the marker on line 1 is 40 raw
+            # lines from the ERROR on line 41, so attribution holds. (Old
+            # body-line-budget would also pass here on contiguous entries;
+            # this assertion locks the post-fix behavior against future
+            # regression to a tighter cap.)
+            self.assertEqual(records[0].attribution, "inferred")
+            self.assertEqual(records[0].mod_id, "testmodalpha")
+        finally:
+            path.unlink()
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/tools/pz-analyzer/tests/test_parser.py
+++ b/tools/pz-analyzer/tests/test_parser.py
@@ -0,0 +1,199 @@
+"""Tests for pz_parser parsing pipeline (phases 1, 2, 4-7, 9)."""
+from __future__ import annotations
+
+import pathlib
+import sys
+import unittest
+
+# Make the parser module importable when running via `python -m unittest
+# discover -s tools/pz-analyzer/tests`.
+sys.path.insert(0, str(pathlib.Path(__file__).resolve().parents[1]))
+
+import pz_parser  # noqa: E402
+
+FIXTURE_DIR = pathlib.Path(__file__).resolve().parent / "fixtures"
+
+
+def fixture(name: str) -> pathlib.Path:
+    return FIXTURE_DIR / name
+
+
+class ParseFileTests(unittest.TestCase):
+    """Phase 0 — basic line-shape recognition and continuation folding."""
+
+    def test_parse_file_groups_continuations_under_entry(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_java_exception.txt"))
+        # 3 bracketed entries; the ERROR has 5 continuation lines.
+        self.assertEqual(len(entries), 3)
+        error_entry = entries[1]
+        self.assertEqual(error_entry.level, "ERROR")
+        self.assertGreater(len(error_entry.body), 1)
+        # First continuation should be the java exception line.
+        self.assertIn("NoSuchFileException", error_entry.body[1])
+
+    def test_parse_file_handles_empty_file(self) -> None:
+        self.assertEqual(pz_parser.parse_file(fixture("fixture_empty.txt")), [])
+
+    def test_parse_file_handles_no_errors(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_no_errors.txt"))
+        self.assertEqual(len(entries), 3)
+        self.assertTrue(all(e.level == "LOG" for e in entries))
+
+
+class SeverityRecognitionTests(unittest.TestCase):
+    """Phase 1 — ERROR / WARN / SEVERE recognition."""
+
+    def test_classify_picks_up_error_warn_and_severe(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_severity_variants.txt"))
+        records = pz_parser.classify_entries(entries, source_file="severity.txt")
+        levels = sorted({r.level for r in records})
+        # Spec accepts ERROR / WARN / SEVERE. The third severity entry has bracketed
+        # ERROR but its body starts with "SEVERE:", so effective_level is SEVERE.
+        self.assertIn("ERROR", levels)
+        self.assertIn("WARN", levels)
+        self.assertIn("SEVERE", levels)
+
+    def test_log_lines_are_ignored(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_no_errors.txt"))
+        records = pz_parser.classify_entries(entries, source_file="x.txt")
+        self.assertEqual(records, [])
+
+
+class StackCollectionTests(unittest.TestCase):
+    """Phase 2 — bidirectional stack collection."""
+
+    def test_pre_stack_walk_picks_up_preceding_lua_frames(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_pre_stack.txt"))
+        # The ERROR entry is the 5th bracketed line; its predecessors are
+        # LOG-level entries whose bodies are stack-shaped lines.
+        records = pz_parser.classify_entries(entries, source_file="pre.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        # Pre-stack walk should pick up at least the "at media/lua/.../A.lua:11" frame.
+        self.assertTrue(any("A.lua:11" in f for f in rec.stack))
+
+    def test_post_stack_collected_from_entry_body_continuations(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_post_stack.txt"))
+        records = pz_parser.classify_entries(entries, source_file="post.txt")
+        self.assertEqual(len(records), 1)
+        rec = records[0]
+        self.assertTrue(any("X.lua:11" in f for f in rec.stack))
+        self.assertTrue(any("Y.lua:22" in f for f in rec.stack))
+        # Lua [string "..."]:N form preserves quoting in the captured frame.
+        self.assertTrue(any("Z.lua" in f and ":33" in f for f in rec.stack))
+
+    def test_stack_capped_at_eight_frames(self) -> None:
+        # Synthesise an ERROR with many continuation frames.
+        lines = ["[16-04-26 00:00:42.314] ERROR: General      f:0, t:1, st:1,2,3,4> Lua((MOD:Test Mod Alpha)) crash"]
+        for i in range(20):
+            lines.append(f"\tat media/lua/client/F{i}.lua:{i + 1}")
+        path = FIXTURE_DIR / "_runtime_stack_cap.txt"
+        path.write_text("\n".join(lines) + "\n")
+        try:
+            entries = pz_parser.parse_file(path)
+            records = pz_parser.classify_entries(entries, source_file="cap.txt")
+            self.assertEqual(len(records), 1)
+            self.assertLessEqual(len(records[0].stack), pz_parser.MAX_STACK_FRAMES)
+            # And it should be exactly MAX_STACK_FRAMES given >MAX inputs.
+            self.assertEqual(len(records[0].stack), pz_parser.MAX_STACK_FRAMES)
+        finally:
+            path.unlink()
+
+
+class FileLineExtractionTests(unittest.TestCase):
+    """Phase 4 — five-fallback file:line extraction."""
+
+    def test_each_fallback_form_extracts_path(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_file_line_fallbacks.txt"))
+        records = pz_parser.classify_entries(entries, source_file="ff.txt")
+        # 5 distinct ERRORs, distinct mods — should produce 5 records.
+        files = sorted(r.file for r in records)
+        self.assertEqual(
+            files,
+            sorted([
+                "media/lua/client/F1.lua",
+                "media/lua/client/F2.lua",
+                "media/lua/client/F3.lua",
+                "media/lua/client/F4.lua",
+                "media/lua/client/F5.lua",
+            ]),
+        )
+
+    def test_quoted_path_without_line_number_yields_zero(self) -> None:
+        # Format 4 fixture line lacks a :NN suffix on the quoted path.
+        file_path, line_no = pz_parser.extract_file_line(
+            'failure about "media/lua/client/F4.lua" tail'
+        )
+        self.assertEqual(file_path, "media/lua/client/F4.lua")
+        self.assertEqual(line_no, 0)
+
+
+class CauseChainTests(unittest.TestCase):
+    """Phase 5 — Caused-by chain unwinding."""
+
+    def test_caused_by_chain_renders_with_arrow_separator(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_cause_chain.txt"))
+        records = pz_parser.classify_entries(entries, source_file="cc.txt")
+        self.assertEqual(len(records), 1)
+        chain = records[0].cause_chain
+        self.assertIn("RuntimeException", chain)
+        self.assertIn("IllegalStateException", chain)
+        self.assertIn("NullPointerException", chain)
+        # Order preserved (outer -> inner).
+        idx_runtime = chain.index("RuntimeException")
+        idx_illegal = chain.index("IllegalStateException")
+        idx_null = chain.index("NullPointerException")
+        self.assertLess(idx_runtime, idx_illegal)
+        self.assertLess(idx_illegal, idx_null)
+
+    def test_no_cause_chain_when_no_exceptions(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_unattributed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="u.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].cause_chain, "")
+
+
+class KindDetectionTests(unittest.TestCase):
+    """Phases 6 & 7 — kind classification."""
+
+    def test_java_exception_kind_when_no_lua_marker(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_java_exception.txt"))
+        records = pz_parser.classify_entries(entries, source_file="je.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].kind, "java_exception")
+        # Java engine errors should resolve to __unattributed__.
+        self.assertEqual(records[0].mod_id, "__unattributed__")
+
+    def test_engine_noise_kind_for_kahluathread(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_engine_noise.txt"))
+        records = pz_parser.classify_entries(entries, source_file="en.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].kind, "engine_noise")
+
+    def test_lua_runtime_kind_for_attributed_lua_error(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_lua_attributed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="la.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].kind, "lua_runtime")
+
+    def test_require_failed_kind(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_require_failed.txt"))
+        records = pz_parser.classify_entries(entries, source_file="rf.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].kind, "require_failed")
+
+
+class AggregationTests(unittest.TestCase):
+    """Phase 9 — dedup, occurrence_count, files-set growth."""
+
+    def test_three_identical_errors_dedup_to_one_record(self) -> None:
+        entries = pz_parser.parse_file(fixture("fixture_dedup.txt"))
+        records = pz_parser.classify_entries(entries, source_file="dd.txt")
+        self.assertEqual(len(records), 1)
+        self.assertEqual(records[0].occurrence_count, 3)
+        # files list shouldn't duplicate "dd.txt".
+        self.assertEqual(records[0].files, ["dd.txt"])
+
+
+if __name__ == "__main__":
+    unittest.main()
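The Phase 9 behaviour asserted by `AggregationTests` (one record per signature, a running `occurrence_count`, and a `files` list with no duplicates) can be pictured with a minimal sketch. This is not the `pz_parser` or `pz_classify` code, which does not appear in this hunk; `RecordSketch`, `aggregate_sketch`, and the `(signature, source_file)` tuple input are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RecordSketch:
    """Hypothetical stand-in for a classified record (not the real class)."""

    signature: str
    occurrence_count: int = 0
    files: list[str] = field(default_factory=list)


def aggregate_sketch(hits: list[tuple[str, str]]) -> list[RecordSketch]:
    """Merge (signature, source_file) hits into one record per signature."""
    merged: dict[str, RecordSketch] = {}
    for signature, source_file in hits:
        record = merged.setdefault(signature, RecordSketch(signature))
        record.occurrence_count += 1
        # Keep first-seen order and never list the same file twice.
        if source_file not in record.files:
            record.files.append(source_file)
    return list(merged.values())


records = aggregate_sketch([("sig-1", "dd.txt")] * 3)
```

Keying the dictionary on the signature is what would let the same merge step work across files, as the `pz-classifier` merge commit describes ("merges cross-file by signature").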
--- a/tools/pz-analyzer/tests/test_signatures.py
+++ b/tools/pz-analyzer/tests/test_signatures.py
@@ -0,0 +1,91 @@
+"""Tests for pz_parser phase 8 — signature computation."""
+from __future__ import annotations
+
+import pathlib
+import sys
+import unittest
+
+sys.path.insert(0, str(pathlib.Path(__file__).resolve().parents[1]))
+
+import pz_parser  # noqa: E402
+
+
+class PatternIdStabilityTests(unittest.TestCase):
+    """pattern_id should be invariant under formatting variations."""
+
+    def test_pattern_id_collapses_numeric_runs(self) -> None:
+        a = pz_parser.compute_pattern_id(
+            "ERROR",
+            "General  f:0, t:1776297642, st:48,648,157,434> failed at offset 12345",
+        )
+        b = pz_parser.compute_pattern_id(
+            "ERROR",
+            "General  f:0, t:9999999999, st:99,99,99,99> failed at offset 99999",
+        )
+        self.assertEqual(a, b)
+
+    def test_pattern_id_collapses_quoted_strings_and_whitespace(self) -> None:
+        a = pz_parser.compute_pattern_id(
+            "ERROR",
+            'no such function "doStuff"   in module',
+        )
+        b = pz_parser.compute_pattern_id(
+            "ERROR",
+            'no such function "fooBarBaz" in module',
+        )
+        # Whitespace-collapse plus quoted-string-flatten => same pattern_id.
+        self.assertEqual(a, b)
+
+    def test_pattern_id_changes_with_level(self) -> None:
+        a = pz_parser.compute_pattern_id("ERROR", "exception thrown")
+        b = pz_parser.compute_pattern_id("WARN", "exception thrown")
+        self.assertNotEqual(a, b)
+
+
+class SignatureUniquenessTests(unittest.TestCase):
+    """signature should fan out across mods sharing a pattern_id."""
+
+    def test_signature_unique_per_mod_for_shared_pattern(self) -> None:
+        # Same first line, different mod_ids — different signatures, same pattern_id.
+        pat = pz_parser.compute_pattern_id("ERROR", "Lua((MOD:X)) crash")
+        sig_a = pz_parser.compute_signature(pat, "spongiesclothing")
+        sig_b = pz_parser.compute_signature(pat, "testmodalpha")
+        self.assertNotEqual(sig_a, sig_b)
+        # Both signatures derive from one pattern_id; check it carries the "sha256:" prefix.
+        self.assertEqual(pat[:7], "sha256:")
+
+
+class SeverityPrefixStripTests(unittest.TestCase):
+    """A body line that begins with a literal severity word (``SEVERE:``,
+    ``ERROR:``, ``WARN:``, ``FATAL:``) should not fragment pattern_id away
+    from the otherwise-identical body that lacks the prefix. The bracketed
+    level already feeds pattern_id; the prefix is redundant and varies in
+    practice."""
+
+    def test_pattern_id_invariant_under_body_prefix_severe(self) -> None:
+        # Same logical error: one line carries ``SEVERE: `` body prefix, the
+        # other doesn't. Both classified as SEVERE by their bracketed level.
+        with_prefix = pz_parser.compute_pattern_id(
+            "SEVERE",
+            "SEVERE: foo at zombie.X(File.java:42)",
+        )
+        without_prefix = pz_parser.compute_pattern_id(
+            "SEVERE",
+            "foo at zombie.X(File.java:42)",
+        )
+        self.assertEqual(with_prefix, without_prefix)
+
+    def test_pattern_id_invariant_under_body_prefix_error(self) -> None:
+        with_prefix = pz_parser.compute_pattern_id(
+            "ERROR",
+            "ERROR: doStuff failed in module",
+        )
+        without_prefix = pz_parser.compute_pattern_id(
+            "ERROR",
+            "doStuff failed in module",
+        )
+        self.assertEqual(with_prefix, without_prefix)
+
+
+if __name__ == "__main__":
+    unittest.main()
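The invariants exercised by `test_signatures.py` suggest a normalise-then-hash design. The sketch below is one way to satisfy them; it is not the actual `pz_parser.compute_pattern_id`, whose source is not part of this hunk. The function names, the `<n>` and `<str>` placeholder tokens, the `|` separator, and the 16-hex-digit truncation are assumptions (commit 2e7bebc911 mentions a 64-bit hash truncation, and the test above expects a `sha256:` prefix).

```python
import hashlib
import re

# Hypothetical normalisation rules; the real pz_parser rules may differ.
_SEVERITY_PREFIX = re.compile(r"^(?:SEVERE|ERROR|WARN|FATAL):\s*")
_QUOTED = re.compile(r'"[^"]*"')
_NUMERIC_RUN = re.compile(r"\d+")
_WHITESPACE = re.compile(r"\s+")


def normalize_first_line_sketch(line: str) -> str:
    """Flatten the parts of a first line that vary between occurrences."""
    line = _SEVERITY_PREFIX.sub("", line.strip())  # drop a redundant body prefix
    line = _QUOTED.sub('"<str>"', line)            # flatten quoted strings
    line = _NUMERIC_RUN.sub("<n>", line)           # collapse numeric runs
    return _WHITESPACE.sub(" ", line).strip()      # collapse whitespace


def pattern_id_sketch(level: str, first_line: str) -> str:
    """Hash the level together with the normalised line."""
    payload = f"{level}|{normalize_first_line_sketch(first_line)}"
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return "sha256:" + digest[:16]  # 16 hex digits = 64 bits (assumed)
```

Because the level is part of the hashed payload, the same text logged at `ERROR` and `WARN` yields different ids, while a `SEVERE: ` body prefix does not change the id.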
Author	SHA1	Message	Date
indifferentketchup	aa708a34a4	docs: extend CLAUDE.md with cross-session friction notes Captures four pieces of context that cost time to (re)derive this session: - Docker `--entrypoint php` one-liner for ad-hoc PHP that needs the codex autoloader (used for redactor smoke tests). - Pitfall #6: PZ DebugLog-server has two coexisting line shapes (B41 with `t:` field, B42 without) — `DebugServerPattern::LINE` matches both via an optional group; narrowing it back to B41-only silently disables ServerExceptionProblem / ModMissingProblem on every B42 log. - Deployed iblogs lives at bosslogs.indifferentketchup.com and uses `main` as its default branch, not `master`. Pinned to ^0.3.0. - New top-level section for `tools/pz-analyzer/` describing the intentional split between the pre-production Qwen-backed discovery tool and the production-bound deterministic classifier, plus the redact-all wrapper that feeds both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 19:45:26 +00:00
indifferentketchup	656142dbf8	docs: cut v0.3.0 in CHANGELOG Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 19:04:37 +00:00
indifferentketchup	c63adb06c4	Merge branch 'pz42x-line-regex' Fixes DebugServerPattern::LINE so PZ build 42.x logs (which dropped the per-line `t:` microsecond field) parse with proper level/prefix attribution. Without the fix every B42 entry fell through as level INFO and ServerExceptionProblem / ModMissingProblem silently failed to fire, leaving B42 log views with at most a single EngineVersionInformation badge and no Problems panel. Backwards compatible with B41 format; ProjectZomboidServerLogTest now runs parameterised against both shapes via #[DataProvider].	2026-05-06 13:33:43 +00:00
indifferentketchup	0d18cfbfc6	fix: relax DebugServerPattern::LINE for PZ B42 log format PZ build 42.x dropped the per-line `t:` (microsecond) field and tightened the spacing between `f:N`, `t:N`, and `st:N,N,N,N>` markers. The hardcoded `f:\d+,\s+t:\d+,\s+st:` requirement caused every B42 line to fail the parser's LINE regex, leaving ServerLog entries without their level/prefix and silently disabling ServerExceptionProblem and ModMissingProblem (the anchorless EngineVersionInformation still fired against the joined entry text, which is why the symptom was "one Information, no Problems"). Make `t:N,` optional via `(?:,\s+t:\d+)?` and the comma between `f:N` and `st:` optional via `,?`. The B41 format remains a strict match. Add `debug-server-42x-minimal.txt` mirroring the existing synthetic fixture in the new format, and parameterise ProjectZomboidServerLogTest with a #[DataProvider] so all four parser-shape assertions now run against both formats. Spot-check: analysers emit 3 Problems (2 exceptions, 1 missing mod) and 4 Information entries against the new fixture, identical to B41. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 13:33:35 +00:00
indifferentketchup	45a5e1a3da	Merge branch 'feature/error-context-analyser' Adds a generic ErrorContextAnalyser under src/Analyser/ProjectZomboid/ that walks Entry[] and emits one ErrorContextProblem per ERROR or WARNING entry with up to 20 entries of before/after context. Overlapping windows clip so no Entry appears in two context arrays; emission caps at 500 hits with an ErrorContextTruncatedInformation note when reached.	2026-05-04 16:31:56 +00:00
indifferentketchup	6978175dff	chore: track pre-production Qwen analyser and redactor wrapper pz_redact_all.sh is the one-shot Docker wrapper that runs the PHP ProjectZomboidRedactor over .scratch/pz/Logs/ and produces the gitignored .scratch/pz/Logs.redacted/ directory consumed by both the Qwen analyser and the deterministic classifier. pz_error_analysis.py is the developer-facing Qwen-backed log analyser: walks the redacted directory, dedupes signatures, and asks the local sam-desktop Qwen endpoint to classify each unique shape into a fixed taxonomy with title / cause / fix / confidence. Runtime depends on the Qwen endpoint; the deterministic classifier at pz_classify.py is the production-bound counterpart. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 16:31:23 +00:00
indifferentketchup	3df6836909	feat: redact IPv4 and IPv6 addresses from PZ log content Adds a fourth pass to ProjectZomboidRedactor that scrubs IPv4 (strict 0-255 octets, optional :port suffix) and IPv6 (full, abbreviated, bracketed-with-port, IPv4-mapped) addresses, replacing them with the literal [REDACTED_IP]. The new pass runs first because it is pattern-disjoint from the Steam-ID -> name -> coords chain. A single redactIpAddresses(bool) toggle controls both families; the existing toggles are unchanged. Strict regexes plus filter_var() validation prevent false positives on PZ timestamps (12:00:00.000) and PHP/Java scope ops (Foo::bar). 20 new tests cover bare/with-port/multiple/loopback/boundary IPv4, full / abbreviated / bracketed / IPv4-mapped IPv6, scope-op rejection, timestamp rejection, Steam-ID non-collision, toggle-off, and idempotence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 16:31:10 +00:00
indifferentketchup	b6949ff0c3	docs: add downstream-consumers, release flow, feature-branch conventions Captures iblogs as primary codex consumer with the call-site checklist for cross-repo public-API changes; spells out the semver / changelog cadence; documents the <feature>-bootstrap branch + --no-ff merge pattern set by the redactor and iblogs-bootstrap branches; pins the specs/plans path convention from the superpowers skills. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 16:30:50 +00:00
indifferentketchup	f1d2831d92	chore: gitignore Python bytecode caches and editor backup files Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 16:30:14 +00:00
indifferentketchup	bb4ee0d16a	Merge branch 'pz-classifier' Adds a deterministic-only Project Zomboid log classifier under tools/pz-analyzer/, parallel to the existing Qwen-based research tool. pz_parser.py is a pure module (parsing, attribution, file:line, cause-chain, kind detection, two-level signatures); pz_classify.py walks the redacted DebugLog-server directory, merges cross-file by signature, and writes the spec-shaped JSON. 32 unit tests.	2026-05-04 15:58:20 +00:00
indifferentketchup	58d0ef187b	chore: declare SEVERITY_LEVELS in pz_parser.__all__ Constant was already imported by pz_classify; this just formalises it as part of the public surface so __all__ matches actual usage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:55:02 +00:00
indifferentketchup	9cd898bc9f	fix: route parent-directory creation through the JSON write try/except Was leaking unhandled OSError tracebacks when the output's parent path could not be created. Exit code stays 1; user-facing message matches the existing write-failure path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:50:52 +00:00
indifferentketchup	87a0562bd6	feat: deterministic PZ log classifier orchestrator Walks DebugLog-server*.txt under the redacted directory, runs the parser per file, merges cross-file by signature, and emits the spec-shaped JSON report. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:43:15 +00:00
indifferentketchup	fdf70a0c06	docs: align lookback test purpose and spec normalization list Honest test docstring (old/new semantics equivalent on contiguous entries; test locks post-fix behavior against future regressions), and add severity-prefix strip to the spec's normalization list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:39:44 +00:00
indifferentketchup	2e7bebc911	fix: address code review findings on pz_parser - Strip body-prefix severity in normalize_first_line so pattern_id is stable across body-prefix vs bracketed-only variants. - Lookback for inferred attribution now counts raw file lines (per spec literal), not body-line budget across entries. - Document hash truncation (64-bit) and direct-attribution priority. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:33:56 +00:00
indifferentketchup	4fec3a58f6	feat: deterministic PZ log parser module + unit tests Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:18:41 +00:00
indifferentketchup	511583035b	docs: add design spec for deterministic PZ log classifier Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 15:02:28 +00:00
indifferentketchup	e1a7785cf4	feat: add ErrorContextAnalyser for sliding-window error/warning surfacing Walks Entry[] once and emits one ErrorContextProblem per ERROR or WARNING entry, attaching up to 20 entries before and 20 after as context. Overlapping windows clip the second hit's before- and after-ranges so no Entry appears in two context arrays. Caps emission at 500 hits and adds an ErrorContextTruncatedInformation when reached. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 11:29:52 +00:00
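The optional-group fix described in commit 0d18cfbfc6 can be illustrated with a small Python sketch. The real `DebugServerPattern::LINE` is a PHP regex covering the whole log line and is not shown here; `MARKER_SKETCH`, `STRICT_B41_SKETCH`, and `has_marker` are hypothetical names, the pattern only covers the `f:`/`t:`/`st:` marker run, and the B42 sample line is an assumed shape (the B41 sample is taken from `test_signatures.py`).

```python
import re

# B41-only requirement quoted in the commit message: `t:` is mandatory.
STRICT_B41_SKETCH = re.compile(r"f:\d+,\s+t:\d+,\s+st:")

# Relaxed sketch: `, t:N` is an optional group and the comma before `st:`
# is optional. Using \s* before `st:` is an assumption to tolerate tighter spacing.
MARKER_SKETCH = re.compile(r"f:\d+(?:,\s+t:\d+)?,?\s*st:[\d,]+>")


def has_marker(line: str) -> bool:
    """Return True when the line carries a B41- or B42-style marker run."""
    return MARKER_SKETCH.search(line) is not None


b41_line = "General  f:0, t:1776297642, st:48,648,157,434> failed at offset 12345"
b42_line = "General  f:0, st:48,648,157,434> failed at offset 12345"  # assumed shape
```

With these samples, the relaxed pattern matches both lines, while the strict pattern matches only the B41 line. A B42 line that fails the strict pattern never receives its level or prefix, which is the silent failure the commit message describes.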