docs: document Redactor utility in CLAUDE.md, README, CHANGELOG

CLAUDE.md: added RedactorInterface bullet to the architecture list
(after Custom Analyser subclasses, before Detectors); added
ProjectZomboidRedactor entry under ProjectZomboid specifics; added
src/Util/ to the game-subtrees layout code block with a prose note
marking it as the sixth component directory introduced post-v0.1.0;
added Pitfall 5 on mandatory pass order.

README.md: new "Redaction" subsection between Quick start and
Architecture — PHP snippet, replacement descriptions, three toggle
methods, three documented v1 limitations.

CHANGELOG.md: added [Unreleased] section (Added + Changed) above
[0.1.0]. Removed the Redactor bullet from [0.1.0]'s Deferred list
entirely — the historical record stays accurate (v0.1.0 shipped
without it) and [Unreleased] now documents its arrival; a stub mention
in Deferred would be redundant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-01 15:08:49 +00:00
parent d6831c5851
commit 081d40c208
3 changed files with 31 additions and 1 deletions

View File

@@ -49,6 +49,7 @@ Analysis of Insight[]
- **`PatternParser`** is regex-driven. Lines that don't match the LINE regex append to the previous `Entry` — this is the mechanism that handles multi-line records like Java stack traces under an ERROR header.
- **`PatternAnalyser`** walks entries, runs each registered insight class's static `getPatterns()` against entry text via `preg_match_all`, and emits coalesced insights (equal insights bump a counter instead of duplicating).
- **Custom `Analyser` subclasses** are the right move when analysis needs cross-entry state — pairing events, sliding-window thresholds, comparing consecutive snapshots. `PatternAnalyser` operates per-entry only and can't express those. Phase B.3 (`ConnectionFailureAnalyser`, `ItemDuplicationAnalyser`, `SkillProgressionAnomalyAnalyser`) shows the shape: extend `Analyser`, override `analyse()`, walk `$this->log` once, aggregate, then emit coalesced `Problem`/`Information` insights at the end. Tunable thresholds belong as `public const` constants on the subclass with the rationale in a docblock.
- **`RedactorInterface`** is a render-time PII filter — string-in/string-out, configured per game, implemented at `src/Util/<Game>/<Game>Redactor.php`. Consumers call `redact(string $content): string` on a concrete instance before rendering or exporting log content.
- Detectors available out of the box: `SinglePatternDetector`, `WeightedSinglePatternDetector`, `LinePatternDetector` (returns match ratio), `MultiPatternDetector` (AND), and the path-based `FilenameDetector` (uses `LogFileInterface::getPath()`, returns `false` when no path is available).
## Game subtrees
@@ -58,10 +59,13 @@ Layout is **components-outer with game suffix**, not games-outer:
```
src/<Component>/<Game>/... e.g. src/Log/ProjectZomboid/ProjectZomboidServerLog.php
src/Pattern/<Game>/<Type>Pattern.php (regex string constants; not a framework abstraction)
src/Util/<Game>/... e.g. src/Util/ProjectZomboid/ProjectZomboidRedactor.php
test/tests/Games/<Game>/...
test/src/Games/<Game>/fixtures/<type>-minimal.txt (synthetic fixtures only)
```
`src/Util/` is the sixth top-level component directory, introduced post-v0.1.0-tag. Its first occupant is the Redactor; future game-agnostic utilities (tokenising redactor variants, etc.) land here too.
Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty `.gitkeep`s plus a TODO `<Game>Detective` extending base `Detective`). `ProjectZomboid` is fully implemented: 11 log subclasses, 11 pattern classes, detective wired with all 11, synthetic fixtures, dispatch tests, plus the analyser surface — 11 `PatternAnalyser`-driven Insight classes under `src/Analysis/ProjectZomboid/` and 3 custom `Analyser` subclasses under `src/Analyser/ProjectZomboid/` for cross-entry / threshold logic.
`src/Pattern/` is **not a framework abstraction** — patterns are plain `string` class constants. Each `<Type>Pattern` typically holds a `LINE` constant for the parser plus named-group extractor constants (`FIELDS`, `COMBAT`, `MOD_LOAD`, etc.) for analysers.
@@ -74,6 +78,7 @@ Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty
- A custom `Analyser` subclass (cross-entry logic): `UserLog → ConnectionFailureAnalyser`, `ItemLog → ItemDuplicationAnalyser`, `PerkLog → SkillProgressionAnomalyAnalyser`.
- A configured `PatternAnalyser` (per-entry pattern matching): `ServerLog`, `PvpLog`, `AdminLog` register their respective Insight classes.
- An empty `PatternAnalyser` for logs with no analysers yet: `ChatLog`, `ClientActionLog`, `CmdLog`, `MapLog`, `BurdJournalsLog`. These are wiring stubs awaiting future analysis work.
- **`ProjectZomboidRedactor`** at `src/Util/ProjectZomboid/ProjectZomboidRedactor.php` — concrete `RedactorInterface` implementation. Downstream consumers call `redact(string): string` to scrub Steam IDs (zeroed placeholder), player names (`<player>`), and world coordinates (`0,0,0`) from log content. Three independent toggle methods default to on: `redactSteamIds(bool)`, `redactPlayerNames(bool)`, `redactCoordinates(bool)`. Pass order (Steam ID → player name → coords) is mandatory and enforced internally — see Pitfall 5.
### Standard test template for a Log subclass
@@ -85,6 +90,7 @@ At minimum: (1) entry count after `parse()` matches the synthetic fixture's line
2. **PHPUnit 12 requires the `#[DataProvider('methodName')]` attribute.** The legacy `@dataProvider` annotation silently passes zero args and fails with `ArgumentCountError`.
3. **`Level::fromString()` defaults to `Level::INFO` for unknown tokens.** Project Zomboid log levels map: `LOG`/`INFO` → INFO; `WARN` → WARNING; `ERROR` → ERROR.
4. **`PatternParser` matches array** must declare a match-type for **every** capture group in the regex (`TIME`, `LEVEL`, or `PREFIX`); otherwise the parser throws on the unmapped index. Use non-capturing groups `(?:...)` for fields you want to skip.
5. **`ProjectZomboidRedactor` pass order is mandatory.** `PLAYER_AFTER_STEAMID_REGEX` anchors on the already-redacted Steam ID placeholder — it will not match raw Steam IDs. Do NOT swap the Steam ID and player-name passes, and do NOT stub out the Steam ID pass while leaving the player-name pass enabled.
## Workflow conventions