docs: backfill Phase B.3 spec and plan

Retroactive design + plan documentation for Phase B.3 (deferred
analysers requiring custom Analyser subclasses for cross-entry and
threshold logic). Records the architectural shift away from vanilla
PatternAnalyser, the threshold constant rationale (event-pairing /
sliding-window / consecutive-snapshot deltas), and the synthetic
fixture extensions that exercise both trigger and non-trigger paths.
Plan is as-built with checkboxes pre-checked and SHAs referenced.
2026-05-01 12:53:32 +00:00
parent b99d8f3061
commit ed920485dc
2 changed files with 191 additions and 0 deletions


@@ -0,0 +1,117 @@
# ProjectZomboid analyser design (Phase B.3 — deferred analysers)
> Retroactive: written 2026-05-01.
## Summary
Add the three remaining Project Zomboid analysers from the original Step D candidate list — connection failure pairing, item duplication heuristic, and skill progression anomaly detection — by introducing custom `Analyser` subclasses under `src/Analyser/ProjectZomboid/`. These are the first analysers in the tree that cannot be expressed as configured `PatternAnalyser` instances; they require cross-entry state (event pairing, sliding windows, snapshot deltas) that `PatternAnalyser` does not provide.
This document covers Phase B.3. Phase B.1 / B.2 docs are at `2026-04-30-pz-analysers-design.md` / `2026-04-30-pz-analysers-pvp-admin-design.md`. With Phase B.3, the original eight-analyser candidate list from Step D is fully implemented.
## Scope
- **In scope:** `ConnectionFailureAnalyser` + `ConnectionFailureProblem` (UserLog, event pairing); `ItemDuplicationAnalyser` + `ItemDuplicationProblem` (ItemLog, sliding-window heuristic); `SkillProgressionAnomalyAnalyser` + `SkillProgressionAnomalyProblem` (PerkLog, consecutive-snapshot delta); wiring three Log subclasses' `getDefaultAnalyser()`; extending two synthetic fixtures to exercise trigger and non-trigger cases; end-to-end tests.
- **Out of scope (B.3):** the five other PZ logs whose `getDefaultAnalyser()` continues returning an empty `PatternAnalyser` stub (Chat, ClientAction, Cmd, Map, BurdJournals); the codex-side `Redactor` utility; Hytale / Minecraft / Seven Days To Die analysers; v0.1.0 release plumbing.
## Architectural shift: custom `Analyser` subclasses
Phases B.1 and B.2 established the convention that vanilla `PatternAnalyser` plus `Insight::isEqual()` coalescing is sufficient for per-entry pattern matching. A custom `Analyser` subclass is **not** needed even for multi-line records: `PatternParser`'s continuation-line behaviour, combined with `Entry::__toString()` joins, handles multi-line capture without subclassing.
Phase B.3's three analysers genuinely require cross-entry state:
- **ConnectionFailureAnalyser** must count `attempting to join` and `allowed to join` events per Steam ID and report unmatched attempts. PatternAnalyser dispatches each entry independently and has no mechanism to compare counts across entries.
- **ItemDuplicationAnalyser** must group positive-delta item events by `(steamid, item)` tuple and slide a fixed-second window across each group. Sliding-window logic spans multiple entries by definition.
- **SkillProgressionAnomalyAnalyser** must collect all perks-row snapshots per Steam ID, sort them by time, then compute pairwise deltas between consecutive snapshots. Pairwise comparison spans entries.
Each subclass extends the framework's abstract `Analyser`, overrides `analyse(): AnalysisInterface`, walks `$this->log` once to aggregate state, and emits `Problem` insights at the end. The CLAUDE.md "Framework architecture" section was updated alongside Phase B.3 to document this pattern.
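The aggregate-then-emit shape can be illustrated outside the framework. The following is a minimal sketch of the pairing logic only, assuming events have already been decoded into `steamId` / `event` pairs. The function name `unmatchedJoinAttempts` and the array shape are hypothetical and not part of codex, and the prefix matching on the event text is an assumption about how the two event kinds are told apart. It needs PHP 8 for `str_starts_with()`.

```php
<?php

/**
 * Hypothetical helper: count join attempts vs. allowed joins per Steam ID.
 *
 * @param list<array{steamId: string, event: string}> $events
 * @return array<int|string, int> unmatched attempt count keyed by Steam ID
 */
function unmatchedJoinAttempts(array $events): array
{
    $attempts = [];
    $allowed = [];

    // Single walk over the events: aggregate both counters per Steam ID.
    // Note: PHP stores numeric-string array keys as integers where they fit.
    foreach ($events as $e) {
        $id = $e['steamId'];
        if (str_starts_with($e['event'], 'attempting to join')) {
            $attempts[$id] = ($attempts[$id] ?? 0) + 1;
        } elseif (str_starts_with($e['event'], 'allowed to join')) {
            $allowed[$id] = ($allowed[$id] ?? 0) + 1;
        }
    }

    // Emit phase: one result per Steam ID where attempts exceed allowed joins.
    $unmatched = [];
    foreach ($attempts as $id => $count) {
        $diff = $count - ($allowed[$id] ?? 0);
        if ($diff > 0) {
            $unmatched[$id] = $diff;
        }
    }

    return $unmatched;
}
```

In the real analysers the emit phase would produce `Problem` insights rather than a plain array; the array return here only keeps the sketch self-contained.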
## Components
Three `Analyser` subclasses under `src/Analyser/ProjectZomboid/` (the directory's `.gitkeep` placeholder is removed in this phase):
| Analyser | Target Log | Logic shape | Threshold constants |
|---|---|---|---|
| `ConnectionFailureAnalyser` | `ProjectZomboidUserLog` | Two-pass count of attempt vs allowed events per Steam ID; emits one Problem per Steam ID where attempts > allowed | None — strict pairing |
| `ItemDuplicationAnalyser` | `ProjectZomboidItemLog` | Sliding-window heuristic over `(steamid, item)` groups | `THRESHOLD_COUNT = 5`, `THRESHOLD_WINDOW_SECONDS = 10` |
| `SkillProgressionAnomalyAnalyser` | `ProjectZomboidPerkLog` | Consecutive-snapshot delta per `(steamid, skill)`; only positive-delta perks-row entries (Login/Logout/LevelUp event tokens are filtered out) | `THRESHOLD_DELTA = 3` |
Three `Problem` subclasses under `src/Analysis/ProjectZomboid/`:
| Problem | Coalescing |
|---|---|
| `ConnectionFailureProblem` | By Steam ID — one problem per player regardless of how many unmatched attempts |
| `ItemDuplicationProblem` | By `(steamid, item)` tuple — one problem per suspicious group |
| `SkillProgressionAnomalyProblem` | By `(steamid, skill)` — one problem per skill exceeding the delta threshold |
## Threshold rationale (recorded as docblocks)
The constants are first-pass heuristics expected to be tuned once production logs flow through codex. Each is documented inline in its analyser class:
- **`ItemDuplicationAnalyser::THRESHOLD_COUNT = 5`**: Five identical item gains in a fixed window. Legitimate gameplay rarely produces five identical items quickly — crafting has animation delays, looting is one-at-a-time, zombie drops are similarly serial. A burst of five suggests admin-spawn or exploit. Tune downward if false negatives appear.
- **`ItemDuplicationAnalyser::THRESHOLD_WINDOW_SECONDS = 10`**: Ten seconds covers a realistic burst-loot scenario (e.g. a crate full of identical items) without sweeping in unrelated events. Combined with `THRESHOLD_COUNT`, the trigger corresponds to five same-item events per ten seconds, i.e. a sustained rate of 0.5 events per second.
- **`SkillProgressionAnomalyAnalyser::THRESHOLD_DELTA = 3`**: PZ skills require thousands of XP per level; even active grinding rarely produces four-or-more level jumps in a single session bridge. Set to 3 as baseline; modded XP servers may need to raise this via subclass override.
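How the two item-duplication constants interact can be sketched with a two-pointer sliding window over each `(steamid, item)` group. This is an illustration under stated assumptions, not the as-built analyser: the function name `suspiciousItemGroups`, the input array shape, the `steamId|item` composite key, and the inclusive `>=` comparison against the count threshold are all hypothetical.

```php
<?php

/**
 * Hypothetical sketch: find (steamId, item) groups whose positive-delta
 * events cluster inside a fixed window.
 *
 * @param list<array{steamId: string, item: string, time: int}> $events
 * @return array<string, int> largest in-window event count per flagged group
 */
function suspiciousItemGroups(array $events, int $thresholdCount = 5, int $windowSeconds = 10): array
{
    // Group timestamps by a composite "steamId|item" key.
    $groups = [];
    foreach ($events as $e) {
        $groups[$e['steamId'] . '|' . $e['item']][] = $e['time'];
    }

    $flagged = [];
    foreach ($groups as $key => $times) {
        sort($times); // re-indexes from 0, so $end below is a position
        $best = 0;
        $start = 0;
        foreach ($times as $end => $time) {
            // Advance the left edge until the window spans at most $windowSeconds.
            while ($time - $times[$start] > $windowSeconds) {
                $start++;
            }
            $best = max($best, $end - $start + 1);
        }
        // Assumption: the count threshold is inclusive; the real comparison may differ.
        if ($best >= $thresholdCount) {
            $flagged[$key] = $best;
        }
    }

    return $flagged;
}
```

Because timestamps are whole seconds, several events in the same second contribute a time difference of zero and all land in the same window.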
## Patterns
No new pattern constants. Existing constants from Phase A are reused inside the per-entry walks:
- `UserPattern::PLAYER_EVENT` — decode `[time] <steamid> "<player>" <event>` lines
- `ItemPattern::FIELDS` — decode `[time] <steamid> "<player>" <location> <delta> <coords> [<item>]` lines
- `PerkPattern::FIELDS` — decode the bracket-heavy perks log line
- `PerkPattern::PERK_PAIR` — extract individual `Skill=N` pairs from the perks-row event field
`Entry::getTime()` returns integer Unix seconds (sub-second precision is dropped by `DateTime::getTimestamp()`). For `ItemDuplicationAnalyser` this means events within the same second collapse to time-diff zero, which is acceptable for v1.
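The truncation can be reproduced with plain `DateTime` objects, independent of codex. In this small sketch the calendar date and the variable names are placeholders chosen for illustration; only the time-of-day values mirror the style of the fixture timestamps.

```php
<?php

// Placeholder date; UTC is set explicitly so the result does not depend on php.ini.
$utc = new DateTimeZone('UTC');
$first = new DateTime('2026-01-01 19:50:00.001', $utc);
$last = new DateTime('2026-01-01 19:50:00.006', $utc);

// The fractional part survives on the object itself...
$firstMicros = $first->format('u'); // microseconds as a six-digit string

// ...but getTimestamp() returns whole Unix seconds, so the difference is zero.
$diffSeconds = $last->getTimestamp() - $first->getTimestamp();
```

If sub-second ordering ever matters, the microsecond component is still available through `format('u')`, which would allow a finer-grained comparison without changing the stored timestamps.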
## Wiring
Three `getDefaultAnalyser()` overrides (each was previously `return new PatternAnalyser();`):
```php
// ProjectZomboidUserLog
return new ConnectionFailureAnalyser();
// ProjectZomboidItemLog
return new ItemDuplicationAnalyser();
// ProjectZomboidPerkLog
return new SkillProgressionAnomalyAnalyser();
```
The unused `PatternAnalyser` import is removed from each Log subclass.
## Test plan
End-to-end tests under `test/tests/Games/ProjectZomboid/Analyser/`, one per Log:
- **`UserLogAnalysisTest`** — drives `user-minimal.txt`. Asserts exactly one `ConnectionFailureProblem` for Player1 (Steam ID `76561198000000001`) with `unmatchedAttempts == 1` (Player1 has two `attempting to join` events, one of which is `attempting to join used queue`, and one `allowed to join`). Asserts that Player2 (matched 1+1) is not flagged.
- **`ItemLogAnalysisTest`** — drives the extended `item-minimal.txt`. Asserts one `ItemDuplicationProblem` for AdminUser + Base.Bullets9mm with `eventCount == 6`, and verifies the four-event Base.Plank group does not trigger. Also asserts the threshold constants are positive and documented.
- **`PerkLogAnalysisTest`** — drives the extended `perk-minimal.txt`. Asserts exactly two `SkillProgressionAnomalyProblem` insights for PlayerSuspect (Steam ID `76561198000000004`), one for Strength (delta +8) and one for Fitness (delta +6). Verifies that Maintenance (delta exactly +3) does not trigger because the comparison is strict `>`. Verifies that single-snapshot players (Player1, Player2) are not flagged. Asserts the threshold constant is positive and documented.
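The boundary behaviour asserted for the perk log (a delta of exactly +3 does not trigger) follows from a strict comparison on consecutive snapshots. A minimal sketch for a single player, assuming snapshots are already decoded into a `time` plus a `skills` map; the function name `anomalousSkillDeltas` and the array shape are hypothetical, and arrow functions require PHP 7.4 or later.

```php
<?php

/**
 * Hypothetical sketch: flag skills whose level jumps by more than
 * $thresholdDelta between consecutive snapshots of one player.
 *
 * @param list<array{time: int, skills: array<string, int>}> $snapshots
 * @return array<string, int> largest offending delta keyed by skill name
 */
function anomalousSkillDeltas(array $snapshots, int $thresholdDelta = 3): array
{
    // Order snapshots chronologically before pairing neighbours.
    usort($snapshots, fn(array $a, array $b): int => $a['time'] <=> $b['time']);

    $flagged = [];
    for ($i = 1; $i < count($snapshots); $i++) {
        $previous = $snapshots[$i - 1]['skills'];
        foreach ($snapshots[$i]['skills'] as $skill => $level) {
            if (!isset($previous[$skill])) {
                continue; // skill absent from the earlier snapshot: nothing to compare
            }
            $delta = $level - $previous[$skill];
            // Strict comparison: a delta equal to the threshold does not trigger.
            if ($delta > $thresholdDelta) {
                $flagged[$skill] = max($flagged[$skill] ?? 0, $delta);
            }
        }
    }

    return $flagged;
}
```

A player with a single snapshot never enters the loop, which matches the expectation that single-snapshot players are not flagged.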
## Fixture changes
Two synthetic fixtures extended (no new files, no real-log content):
- **`item-minimal.txt`**: appended 10 lines: a 6-event Bullets9mm burst by AdminUser at sub-second timestamps `19:50:00.001` to `.006` (triggers the dupe heuristic), and a 4-event Plank group by Player1 scattered across 4 minutes (`20:00:00` to `20:03:00`, sub-threshold). The Phase A entry-count assertion in `ProjectZomboidItemLogTest` was bumped from 10 → 20.
- **`perk-minimal.txt`**: appended 4 lines: PlayerSuspect (Steam ID `76561198000000004`) with two perks snapshots, a low-stat baseline at `18:30:00.000` and an inflated set at `22:00:00.000` showing Strength 2→10, Fitness 2→8, and Maintenance 0→3 (boundary case). The Phase A entry-count assertion in `ProjectZomboidPerkLogTest` was bumped from 6 → 10.
All identifiers are placeholders per the Privacy / Fixture Rules in CLAUDE.md (`76561198000000001` to `76561198000000004` for Steam IDs, `Player1`/`Player2`/`AdminUser`/`PlayerSuspect` for names, coords in the `1000-1100, 2000-2200, 0` range).
## Commits (as-built, in order)
1. `c444e85`: `pre-phase-B.3 checkpoint` (`--allow-empty`)
2. `73e9ca6`: `Add ConnectionFailureAnalyser`
3. `ba3fae8`: `Add ItemDuplicationAnalyser`
4. `0c90e40`: `Add SkillProgressionAnomalyAnalyser`
4 commits total. Each non-checkpoint commit ships an Analyser + Problem + (optional) fixture extension + updated count assertion + e2e test in one logical unit, per the per-analyser commit shape requested up front.
## Open issues
None blocking. All three threshold constants are heuristic guesses pending production data calibration; tuning is expected once iblogs starts feeding real logs through codex. The values are tunable via subclass override and the rationale is in the source docblocks.
## Pointers
- Phase B.1 (foundation, ServerLog analysers): `2026-04-30-pz-analysers-design.md` and `2026-04-30-pz-analysers.md`.
- Phase B.2 (vanilla PatternAnalyser PvP/Admin coverage): `2026-04-30-pz-analysers-pvp-admin-design.md` and `2026-04-30-pz-analysers-pvp-admin.md`.
- Workflow conventions and architecture overview: `CLAUDE.md`.
- The Phase B.3 commit set begins at `c444e85` (pre-checkpoint) and ends at `0c90e40` (the third analyser).