From 409de16003e237e7748146d062435d875f490028 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 14:28:44 +0000 Subject: [PATCH 01/10] docs: add Redactor implementation plan Forward-looking plan on the redactor branch covering all eight design questions called out in the careful-protocol kickoff: render-time filter (raw is canonical), standalone string utility (not a Printer decorator), regex-based detection with lexical anchors per PII category, per-category placeholder replacement matching synthetic fixture conventions, thin generic interface plus per-game implementation under src/Util/ProjectZomboid/, hybrid fixture strategy (unit-level synthetic plus integration against existing PZ fixtures). Branch off master aec835e. backup/pre-redactor tag pins start. No code is written by this commit. Implementation pass kicks off separately after plan review. --- docs/superpowers/plans/2026-05-01-redactor.md | 211 ++++++++++++++++++ 1 file changed, 211 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-01-redactor.md diff --git a/docs/superpowers/plans/2026-05-01-redactor.md b/docs/superpowers/plans/2026-05-01-redactor.md new file mode 100644 index 0000000..404a6e5 --- /dev/null +++ b/docs/superpowers/plans/2026-05-01-redactor.md @@ -0,0 +1,211 @@ +# Redactor Utility Implementation Plan + +> Forward-looking. No code is written by this document. +> Branch: `redactor` (off master `aec835e`). Backup tag: `backup/pre-redactor`. +> Spec: `docs/superpowers/specs/2026-04-30-redactor-design.md`. + +**Goal:** Land the `RedactorInterface` plus a concrete `ProjectZomboidRedactor` implementation so iblogs (and any other downstream consumer) can scrub Project Zomboid log content of Steam IDs, player names, and world coordinates with a single call. The Redactor is a render-time filter on raw string content; raw stays canonical at the storage layer. 
+ +**Architecture:** Standalone string-in/string-out utility under a new top-level `src/Util/` directory, with per-game implementations under `src/Util//`. Each implementation owns the lexical regex anchors for its game's PII shapes. Three independent toggles per implementation (`redactSteamIds`, `redactPlayerNames`, `redactCoordinates`); defaults all on; "all toggles off" yields verbatim passthrough. + +**Tech stack:** PHP 8.4+, PHPUnit 12, Composer (`indifferentketchup/codex` v0.1.0+). All command invocations wrap in the `composer:latest` Docker image per `CLAUDE.md`. + +--- + +## Design questions — resolved + +### a. Render-time vs ingest-time + +**Decision: render-time. Confirm spec's lean.** + +Raw log content is canonical. Redaction is a view filter that consumers apply when they want to display, export, or analyse a redacted projection. iblogs's storage layer holds the unredacted upload (subject to iblogs's own upload-time `Filter` chain for IPs/access-tokens, which is a different layer of defence); the codex Redactor runs on the way *out* of storage, not on the way in. + +**Why:** the alternative (ingest-time, where storage holds redacted content) is destructive — once stored, the original cannot be recovered for legitimate operator use. Render-time leaves the original in place and lets each render path opt in. iblogs gets a per-session toggle without needing to keep two copies of every paste. + +**Implication for iblogs schema:** iblogs stores raw content; the redaction toggle in the iblogs UI invokes `ProjectZomboidRedactor::redact()` at render time (server-side) or at fetch time (API consumers' choice). No schema migration required for the redaction feature. + +### b. Redactor as standalone class vs Printer decorator + +**Decision: standalone utility (option iii from the question).** + +The Redactor is a `string → string` function. It does not know about `Insight`, `Printer`, or any other codex type. 
Three options were considered: + +- **(i) Printer wrapper.** Cleanly composable but ties the Redactor to the Printer abstraction. Doesn't help iblogs's most common case: redacting raw log content for display in a non-Printer rendering path (HTML page rendered server-side, raw download served to API client). +- **(ii) Pre-Printer pass on Insights.** Heavy. Insights are typed objects with structured fields; redacting them means per-Insight code that knows which fields are PII-bearing. Against the YAGNI line for v1. +- **(iii) Standalone string utility.** Simple, generic, works on any string input — raw log content, JSON-serialised analysis output, rendered Printer output piped through. Doesn't know about Insights. + +The spec describes (iii). v1 ships (iii) only. If a Printer-wrapper convenience is later wanted, it can be added as a thin adapter that calls the standalone Redactor on the Printer's output; it doesn't require restructuring the core. + +### c. PII field taxonomy for PZ + +**Decision: regex-based with lexical context anchors. No structured-field detection in v1.** + +PZ-specific PII categories observed in the in-tree fixtures and the `.scratch/pz/Logs/` reference corpus: + +| Field | Detection | Rationale | +|---|---|---| +| Steam ID | regex with `76561198\d{9}` prefix anchor and word-boundary classes | Steam's `76561198` SteamID64 universe prefix lets us cleanly distinguish from other long numbers (timestamps, build numbers). | +| Player name | regex with multi-context lexical anchors (after-Steam-ID-quoted, ChatMessage author, `Combat:`/`Safety:` subsystem) | Names are arbitrary strings — not detectable without context. The contexts are well-defined by the parser-side pattern classes. | +| World coordinate triple | regex with bracket / paren / `at`-clause anchors | Generic `\d+,\d+,\d+` would over-redact server metadata (`f:0, t:NNNN, st:48,648,157,584`). Lexical context disambiguates. 
| + +**Not redacted in v1:** + +- **IP addresses.** PZ logs do not normally include IPs in any of the eleven file types observed. iblogs's upload-side `IPv4Filter` / `IPv6Filter` (ported from upstream mclogs) covers the rare case where a mod might log them. +- **Server-side usernames distinct from player names.** PZ uses Steam display name as the player identity; there's no separate auth username layer. Mclogs's `UsernameFilter` is Minecraft-specific and isn't mirrored here. +- **BurdJournals scientific-notation Steam IDs** (`7.65611…E16`). Spec open-question 2 explicitly defers this to v2; the `[BurdJournals]` tag already disambiguates them as mod-internal. + +**Hybrid (regex + structured-field) deferred.** A v2 enhancement could redact specific Insight fields at JSON-serialisation time (e.g. `ConnectionFailureProblem::$steamId` → placeholder when serialised). Useful only if iblogs starts shipping the structured analysis JSON to redacted views — a real but currently hypothetical need. + +### d. Replacement strategy + +**Decision: per-category placeholder strings matching the synthetic-fixture conventions. Configurable replacement style is YAGNI for v1.** + +Per the spec: + +| Category | Replacement | +|---|---| +| Steam ID | `76561198000000000` (zeroed placeholder, still a syntactically valid Steam ID) | +| Player name | `` | +| Coordinates | `0,0,0` (with shape preserved per anchor — bracketed, parenthesised, or `at` clause) | + +Why these specifically and not `[REDACTED]` / `[STEAM_ID]` / hashed: + +- The placeholders **match the existing synthetic test fixtures** (`76561198000000001`–`76561198000000004` collapse to `76561198000000000`; player names `Player1`/`Player2`/`AdminUser` collapse to ``). Tests can verify "redacted output looks like a synthetic fixture." +- Shape preservation means downstream consumers can still parse the redacted output with the same Pattern classes — a redacted log is still a syntactically valid PZ log, it just contains no identities. 
+- Type-tagged replacements (`[STEAM_ID]`) break shape preservation: a Pattern looking for `\d{17}` would fail. Worth offering as a config option if a consumer specifically wants type-visibility, but v1 ships placeholder-only. +- Hashing breaks shape preservation similarly and adds determinism / collision concerns. + +If a consumer later needs `[STEAM_ID]`-style output, a `setReplacementStyle('typed' | 'placeholder' | 'redacted')` setter can be added without breaking the v1 API. v1 ships placeholder-only. + +### e. Game-agnostic vs PZ-specific layout + +**Decision: thin generic interface in `src/Util/` plus PZ-specific implementation in `src/Util/ProjectZomboid/`.** + +``` +src/Util/ +├── RedactorInterface.php (1 method: redact(string): string) +└── ProjectZomboid/ + └── ProjectZomboidRedactor.php (toggles + regex passes) +``` + +**YAGNI tradeoff stated:** the interface has one method and currently one implementation. Strictly, YAGNI says collapse to just `ProjectZomboidRedactor` and skip the interface. The interface earns its keep because **iblogs's call sites will type-hint against `RedactorInterface`**, not the concrete class — that's the architectural payoff. Consumer code stays loosely coupled; when Minecraft or another game ships a redactor, iblogs swaps the implementation by changing one DI binding rather than touching call sites. + +The cost is two files instead of one. Acceptable given the dependency-inversion benefit. The directory layout (`src/Util//`) mirrors the components-outer-with-game-suffix convention used everywhere else in the tree (Analyser, Analysis, Detective, Log, Parser, Pattern). + +**Note on the new `src/Util/` directory.** Codex currently has no `src/Util/` (the Phase A scaffolding established Analyser / Analysis / Detective / Log / Parser / Pattern / Printer; Phase B.3 added Analyser/ProjectZomboid content but not Util). The Redactor introduces this new top-level. This is an additive change — no existing code is modified. + +### f. 
Test strategy + +**Decision: hybrid — small dedicated synthetic fixtures under `test/src/Util/Redactor/` for direct unit tests, plus an integration test that runs the Redactor over an existing PZ fixture and asserts idempotence.** + +**Dedicated unit fixtures** (small string constants in test classes, not separate files): per spec test plan #1–#5. Each test class owns its input/expected pairs. Keeps unit tests self-contained and fast. + +**Integration test** that re-uses an existing PZ fixture (e.g. `test/src/Games/ProjectZomboid/fixtures/admin-minimal.txt`). Two assertions: + +- The Redactor's output is a syntactically valid log (still parses cleanly through the corresponding `ProjectZomboidAdminLog`). +- Idempotence: `redact(redact($x)) === redact($x)`. Existing fixture content is already placeholder-shaped, so the redactor should leave it byte-for-byte identical OR apply the canonical normalisation once and then no-op. + +**False-positive avoidance.** The synthetic fixtures use `76561198000000001` etc. as placeholder Steam IDs. The Redactor's Steam ID regex matches the `76561198\d{9}` prefix and replaces with `76561198000000000` — so `76561198000000001` becomes `76561198000000000` (a normalisation, not a corruption). Tests verify this normalisation is correct and that legitimate-non-PII data (e.g. server metadata triples like `f:0, t:1776297642406, st:48,648,157,584`) is **not** touched. + +--- + +## Tasks + +Tasks are intended for the `redactor` branch. Each is a single logical commit. Test-running between commits uses the standard Docker invocation. Work proceeds only after Step 0 sign-off (this plan reviewed). + +### Task 0 — Plan doc commit + +- [ ] **Step 0.1.** Already done out-of-band: `git checkout -b redactor` off master `aec835e`; `git tag backup/pre-redactor` at branch tip; this plan written. +- [ ] **Step 0.2.** Commit this plan: `docs: add Redactor implementation plan` on branch `redactor`. Push branch to origin for review. 
+
+### Task 1 — Scaffold (interface + skeleton class with toggles)
+
+- [ ] **Step 1.1.** Create `src/Util/RedactorInterface.php`. Single method: `public function redact(string $content): string;` PHPDoc describing the contract: stateless from the caller's perspective; configuration happens via implementation-specific setters before `redact()`.
+- [ ] **Step 1.2.** Create `src/Util/ProjectZomboid/ProjectZomboidRedactor.php` that implements the interface. Class structure: three private bool properties (`$redactSteamIds`, `$redactPlayerNames`, `$redactCoordinates`) all defaulting to `true`; three fluent setters (`redactSteamIds(bool): static`, etc.); `redact(string): string` body that returns input unchanged when all toggles are off (for now; the regex passes are added in subsequent tasks).
+- [ ] **Step 1.3.** Run `composer test` and expect 195 tests still green (no Redactor tests yet).
+- [ ] **Step 1.4.** Commit: `feat: scaffold RedactorInterface and ProjectZomboidRedactor with toggles`.
+
+### Task 2 — Steam ID redaction pass
+
+- [ ] **Step 2.1.** Add `STEAM_ID_REGEX` and `STEAM_ID_REPLACEMENT` constants on `ProjectZomboidRedactor`. Regex uses the `76561198\d{9}` prefix anchor with word-boundary classes (per spec). The `/u` flag is added to all regexes for Unicode safety even though Steam IDs themselves are ASCII.
+- [ ] **Step 2.2.** Implement the Steam ID branch of `redact()`: when `$redactSteamIds` is true, run `preg_replace` against the input.
+- [ ] **Step 2.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php`. Tests: redaction of various distinct synthetic Steam IDs collapses all to `76561198000000000`; other long digit runs (e.g. 13-digit millisecond timestamps, or 17-digit numbers that lack the `76561198` prefix) are not touched; toggle-off leaves Steam IDs intact.
+- [ ] **Step 2.4.** Run `composer test`. Expect the new tests to pass and the existing 195 to be unaffected.
+- [ ] **Step 2.5.** Commit: `feat: add Steam ID redaction pass`.
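The Steam ID pass described in Task 2 could look roughly like the sketch below. It is an illustration under stated assumptions, not the committed code: the `SKETCH_*` constants and `sketchRedactSteamIds()` are hypothetical names, and the alphanumeric boundary class in the lookarounds is a guess at what "word-boundary classes" means in the spec. Because `preg_replace()` returns `null` on failure, the sketch raises an exception rather than letting `null` or unredacted text through.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for STEAM_ID_REGEX / STEAM_ID_REPLACEMENT.
// Assumption: "boundary" means no ASCII letter or digit on either side.
const SKETCH_STEAM_ID_REGEX = '/(?<![A-Za-z0-9])76561198\d{9}(?![A-Za-z0-9])/u';
const SKETCH_STEAM_ID_REPLACEMENT = '76561198000000000';

function sketchRedactSteamIds(string $content): string
{
    $result = preg_replace(SKETCH_STEAM_ID_REGEX, SKETCH_STEAM_ID_REPLACEMENT, $content);

    // preg_replace() yields null on error (for example malformed UTF-8 under /u).
    // A redactor should fail loudly here instead of returning unscrubbed text.
    if ($result === null) {
        throw new RuntimeException('Steam ID redaction failed: ' . preg_last_error_msg());
    }

    return $result;
}

// A standalone ID is normalised; the 13-digit timestamp is left alone.
echo sketchRedactSteamIds('Connected: 76561198111111111 t:1776297642406'), PHP_EOL;

// An ID embedded in a longer alphanumeric token is not matched.
echo sketchRedactSteamIds('token=abc76561198000000001def'), PHP_EOL;
```

Whether a failed pass should throw or fall back is a design choice the plan does not settle; throwing is shown here only because a silent fallback would return content the caller believes is scrubbed.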
+ +### Task 3 — Player name redaction pass + +- [ ] **Step 3.1.** Add three regex constants on `ProjectZomboidRedactor` for the three player-name lexical contexts: `PLAYER_AFTER_STEAMID_REGEX`, `PLAYER_IN_CHATMESSAGE_REGEX`, `PLAYER_IN_PVP_SUBSYSTEM_REGEX`. Replacement is `` for all. **Order constraint:** the after-Steam-ID context anchors on the post-redaction Steam ID `76561198000000000`, so the player-name pass must run *after* the Steam ID pass. Document this in a class-level docblock. +- [ ] **Step 3.2.** Implement the player-name branch of `redact()`: three sequential `preg_replace` calls when `$redactPlayerNames` is true. +- [ ] **Step 3.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php`. Tests: each of the three contexts redacts correctly when paired with its anchor; a bare quoted string (e.g. `"foo"` not preceded by a Steam ID) is **not** touched; toggle-off leaves names intact; the after-Steam-ID context works correctly when the Steam ID has already been redacted to the zeroed placeholder. +- [ ] **Step 3.4.** Run `composer test`. Expect new tests pass. +- [ ] **Step 3.5.** Commit: `feat: add player name redaction pass`. + +### Task 4 — Coordinates redaction pass + +- [ ] **Step 4.1.** Add three regex constants on `ProjectZomboidRedactor` for the three coordinate contexts: `COORDS_AT_CLAUSE_REGEX`, `COORDS_BRACKETED_REGEX`, `COORDS_PARENTHESISED_REGEX`. Replacements preserve shape (`0,0,0` inside whatever bracket/paren wrapper). +- [ ] **Step 4.2.** Implement the coords branch of `redact()`: three sequential `preg_replace_callback` (or `preg_replace`) calls when `$redactCoordinates` is true. +- [ ] **Step 4.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php`. Tests: each of the three contexts redacts correctly; **negative test** — server metadata `f:0, t:1776297642406, st:48,648,157,584` is not touched; basement Z-coordinates (`-1`) are handled; toggle-off leaves coords intact. 
+- [ ] **Step 4.4.** Run `composer test`. Expect new tests pass. +- [ ] **Step 4.5.** Commit: `feat: add coordinates redaction pass`. + +### Task 5 — Combined / toggle / idempotence tests + +- [ ] **Step 5.1.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php`. Tests cover: combined input with all three PII categories present produces fully-scrubbed output when all toggles on; each toggle off in isolation produces partial scrubbing matching the toggle's category; all toggles off returns input byte-for-byte identical (`===` equality). +- [ ] **Step 5.2.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php`. Tests: `redact(redact($x)) === redact($x)` for several input shapes including all three PII categories. +- [ ] **Step 5.3.** Run `composer test`. Expect new tests pass. +- [ ] **Step 5.4.** Commit: `test: add Redactor combined and idempotence coverage`. + +### Task 6 — Existing-fixture integration tests + +- [ ] **Step 6.1.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php`. Loads each existing PZ fixture (`admin-minimal.txt`, `chat-minimal.txt`, etc.) via `PathLogFile`, calls `redact()` on the content, and asserts: (a) the redacted content still parses cleanly through the corresponding `ProjectZomboidLog`'s parser without throwing; (b) the synthetic Steam IDs `76561198000000001`–`76561198000000004` all collapse to `76561198000000000`; (c) the synthetic player names (`Player1`, `Player2`, `AdminUser`, `PlayerSuspect`) all collapse to ``. +- [ ] **Step 6.2.** Run `composer test`. Expect all integration assertions pass without modifying any existing test or fixture. +- [ ] **Step 6.3.** Commit: `test: add Redactor integration coverage against existing PZ fixtures`. 
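The toggle, ordering, and idempotence properties that Tasks 5 and 6 test can be illustrated with a two-pass stand-in. Everything named `sketch*` or `SKETCH_*` is hypothetical, the patterns are assumptions, and `REDACTED_NAME` is a made-up placeholder rather than the token the spec prescribes.

```php
<?php
declare(strict_types=1);

// Illustrative patterns only; the real constants live on ProjectZomboidRedactor.
const SKETCH_ID_REGEX = '/(?<![A-Za-z0-9])76561198\d{9}(?![A-Za-z0-9])/u';
const SKETCH_ID_PLACEHOLDER = '76561198000000000';
// The name pass anchors on the *redacted* Steam ID, hence the ordering rule.
const SKETCH_NAME_REGEX = '/(?<=76561198000000000) "[^"]+"/u';
const SKETCH_NAME_PLACEHOLDER = 'REDACTED_NAME'; // hypothetical token

function sketchPass(string $regex, string $replacement, string $content): string
{
    $out = preg_replace($regex, $replacement, $content);
    if ($out === null) {
        throw new RuntimeException('redaction failed: ' . preg_last_error_msg());
    }
    return $out;
}

function sketchRedact(string $content, bool $ids = true, bool $names = true): string
{
    if ($ids) {
        $content = sketchPass(SKETCH_ID_REGEX, SKETCH_ID_PLACEHOLDER, $content);
    }
    if ($names) {
        $content = sketchPass(SKETCH_NAME_REGEX, ' "' . SKETCH_NAME_PLACEHOLDER . '"', $content);
    }
    return $content;
}

$line = '76561198123456789 "SomePlayer" admin.broadcastMessage';
$once = sketchRedact($line);

var_dump(sketchRedact($once) === $once);               // idempotence
var_dump(sketchRedact($line, false, false) === $line); // all toggles off: passthrough
var_dump(sketchRedact($line, false, true) === $line);  // name pass alone finds no anchor
```

The last check shows why the order matters: with the Steam ID pass disabled, an unredacted ID never matches the lookbehind, so the name after it is left in place.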
+ +### Task 7 — Documentation updates + +- [ ] **Step 7.1.** Update `CLAUDE.md`: add a one-line `src/Util/` mention to the framework architecture section; one-line note in the ProjectZomboid specifics section pointing at `ProjectZomboidRedactor` for downstream PII scrubbing; update the "Scaffolded games" line to mention that `ProjectZomboid` now also has a Redactor implementation under `src/Util/ProjectZomboid/`. +- [ ] **Step 7.2.** Update `README.md`: add a short usage block showing `(new ProjectZomboidRedactor())->redact($logContent)` as a render-time scrub option, alongside the existing worked example. +- [ ] **Step 7.3.** Update `CHANGELOG.md`: move Redactor out of the **Deferred** section under `[0.1.0]`, OR add a new `[Unreleased]` section if the v0.1.0 line should remain accurate as-shipped. Decision: **add `[Unreleased]`** — v0.1.0 was tagged without the Redactor and the changelog should reflect the historical truth. +- [ ] **Step 7.4.** Run `composer test` once more for safety; confirm 195+(redactor tests) green. +- [ ] **Step 7.5.** Commit: `docs: document Redactor utility in CLAUDE.md, README, CHANGELOG`. + +### Task 8 — Final verification + +- [ ] **Step 8.1.** Run `composer test`. All tests green. +- [ ] **Step 8.2.** Re-run `vendor/bin/phpunit --display-deprecations --display-warnings --display-notices --display-errors`. Expect zero output beyond the standard pass summary. +- [ ] **Step 8.3.** Sanity-check the branch with `git log --oneline master..redactor`. Should be the plan-doc commit plus 7 implementation commits = 8 commits total. +- [ ] **Step 8.4.** Push final state: `git push origin redactor`. **Do NOT merge to master.** User reviews diff and approves merge separately. + +--- + +## Open questions / spec gaps + +The spec is generally tight. Items worth flagging while implementing: + +1. **`/u` flag for Unicode safety.** Spec doesn't specify regex flags. 
PZ player names can contain non-ASCII characters (Steam display names are Unicode-permissive). The implementation will use `/u` on all regexes to avoid mangling multi-byte sequences. This will be documented in the class docblock.
+2. **Replacement order.** Spec says "Redaction order matters: SIDs first, names second" because the after-Steam-ID player-name regex anchors on the redacted Steam ID. The implementation will enforce this order in `redact()` (Steam ID pass first, then names, then coords). The class docblock will document the ordering invariant.
+3. **HTML / JSON-encoded input.** Spec assumes plain log text. If a consumer feeds HTML-escaped content (e.g. `&quot;` instead of `"`), the player-name regex won't match. Document as a v2 concern: callers feed plain text in, render afterwards. v1 does not implement HTML/JSON-aware mode.
+4. **Future PII categories.** v1 ships exactly the three toggles per spec. New categories (emails, IPs from mods, etc.) extend the toggle set in a future release; v1 does not pre-build extension points beyond what the interface already provides.
+5. **`src/Util/` is a new top-level directory** in this codebase. The Redactor is the first occupant. Future utilities (e.g. a tokenizing variant per spec open-question 1) would also live here. No existing-code modification is needed; the new directory is purely additive.
+6. **The empty `src/Printer//.gitkeep` situation.** Phase A scaffolding chose not to create `Printer//` directories at all (only Analyser/Detective/Log/Parser/Pattern got per-game subdirs). The Redactor's home in `src/Util//` mirrors that: `src/Util/` is created with PZ as its first occupant; no stub `Hytale/`/`Minecraft/`/`SevenDaysToDie/` placeholders are scaffolded. When other games' redactors land, they create their own subdirectories at that point.
+
+No spec contradictions found. No existing-code modifications required (additive-only design).
+
+---
+
+## Branch / commit invariants
+
+- All commits land on the `redactor` branch.
+- Master is not touched until the user explicitly approves merge after reviewing the diff. +- Conventional commit prefixes: `docs:`, `feat:`, `test:`, `refactor:`. (No `fix:` expected — this is greenfield work.) +- One logical concept per commit. Tasks 1, 2, 3, 4 each ship implementation + per-pass tests in one commit; Task 5 / 6 / 7 are pure-test or pure-docs commits. +- Backup tag `backup/pre-redactor` at `aec835e` lets us discard the branch and recover if the implementation goes sideways. +- Branch can be pushed to origin freely for visibility / review checkpoints. + +## Pointers + +- Spec: `docs/superpowers/specs/2026-04-30-redactor-design.md`. +- Synthetic fixtures the integration test will reuse: `test/src/Games/ProjectZomboid/fixtures/*.txt`. +- Existing per-game layout precedent: `src/Analyser/ProjectZomboid/`, `src/Pattern/ProjectZomboid/`, `src/Log/ProjectZomboid/`. +- Workflow conventions and pitfalls: `CLAUDE.md`. From 7755d8385c66ee7dd87e99382f7d265c82ff0403 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 14:34:34 +0000 Subject: [PATCH 02/10] feat: scaffold RedactorInterface and ProjectZomboidRedactor with toggles --- .../ProjectZomboid/ProjectZomboidRedactor.php | 89 +++++++++++++++++++ src/Util/RedactorInterface.php | 20 +++++ 2 files changed, 109 insertions(+) create mode 100644 src/Util/ProjectZomboid/ProjectZomboidRedactor.php create mode 100644 src/Util/RedactorInterface.php diff --git a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php new file mode 100644 index 0000000..a4e96f6 --- /dev/null +++ b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php @@ -0,0 +1,89 @@ + name -> coordinates is mandatory. + * 3. Coordinates pass — replaces world coordinate triplets with a placeholder + * token. + * + * All regex passes use the /u flag for Unicode safety. + * + * Replacements are not reversible; do not apply to content that must later be + * restored to its original form. 
+ */ +class ProjectZomboidRedactor implements RedactorInterface +{ + private bool $redactSteamIds = true; + private bool $redactPlayerNames = true; + private bool $redactCoordinates = true; + + /** + * Enable or disable the Steam ID redaction pass. + * + * @param bool $on Pass true to enable, false to disable. + * @return static + */ + public function redactSteamIds(bool $on): static + { + $this->redactSteamIds = $on; + return $this; + } + + /** + * Enable or disable the player-name redaction pass. + * + * @param bool $on Pass true to enable, false to disable. + * @return static + */ + public function redactPlayerNames(bool $on): static + { + $this->redactPlayerNames = $on; + return $this; + } + + /** + * Enable or disable the coordinates redaction pass. + * + * @param bool $on Pass true to enable, false to disable. + * @return static + */ + public function redactCoordinates(bool $on): static + { + $this->redactCoordinates = $on; + return $this; + } + + /** + * Redact PII from the given Project Zomboid log content. + * + * Passes are applied in the mandatory order: Steam ID -> player name -> + * coordinates. See class docblock for rationale. + * + * @param string $content Raw log content that may contain PII. + * @return string Content with enabled PII categories replaced by tokens. 
+ */ + public function redact(string $content): string + { + if ($this->redactSteamIds) { + // Steam ID pass added in Task 2 + } + if ($this->redactPlayerNames) { + // Player name pass added in Task 3 + } + if ($this->redactCoordinates) { + // Coordinates pass added in Task 4 + } + return $content; + } +} diff --git a/src/Util/RedactorInterface.php b/src/Util/RedactorInterface.php new file mode 100644 index 0000000..9e8e8fc --- /dev/null +++ b/src/Util/RedactorInterface.php @@ -0,0 +1,20 @@ + Date: Fri, 1 May 2026 14:38:26 +0000 Subject: [PATCH 03/10] feat: add Steam ID redaction pass Co-Authored-By: Claude Sonnet 4.6 --- .../ProjectZomboid/ProjectZomboidRedactor.php | 8 ++- .../ProjectZomboidRedactorSteamIdTest.php | 52 +++++++++++++++++++ 2 files changed, 59 insertions(+), 1 deletion(-) create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php diff --git a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php index a4e96f6..b0dd91c 100644 --- a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php +++ b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php @@ -24,6 +24,12 @@ use IndifferentKetchup\Codex\Util\RedactorInterface; */ class ProjectZomboidRedactor implements RedactorInterface { + /** Regex matching a 17-digit SteamID64 anchored on the 76561198 universe prefix, with lookaround boundaries that reject embedded occurrences. 
*/ + public const string STEAM_ID_REGEX = '/(?redactSteamIds) { - // Steam ID pass added in Task 2 + $content = preg_replace(self::STEAM_ID_REGEX, self::STEAM_ID_REPLACEMENT, $content); } if ($this->redactPlayerNames) { // Player name pass added in Task 3 diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php new file mode 100644 index 0000000..4c1c8c2 --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php @@ -0,0 +1,52 @@ +redact($input); + + $this->assertSame($expected, $output, 'All three distinct Steam IDs should be replaced with the zero placeholder.'); + } + + public function testNonSteamIdLongDigitsAreNotTouched(): void + { + // 13-digit Unix-millisecond timestamp (PZ log t: shape) and a 17-digit number + // that does not begin with 76561198 — neither should be altered. + $input = 't:1776297642406 score=12345678901234567'; + + $output = (new ProjectZomboidRedactor())->redact($input); + + $this->assertSame($input, $output, 'Non-SteamID digit sequences must not be modified.'); + } + + public function testEmbeddedSteamIdInsideLongerAlphanumericTokenIsNotTouched(): void + { + // The SteamID64 pattern is embedded inside a longer alphanumeric token; + // the negative lookaround boundaries should prevent a match. 
+ $input = 'token=abc76561198000000001def other=data'; + + $output = (new ProjectZomboidRedactor())->redact($input); + + $this->assertSame($input, $output, 'A Steam ID embedded inside an alphanumeric token must not be redacted.'); + } + + public function testToggleOffLeavesSteamIdsIntact(): void + { + $input = 'Connected: 76561198111111111 and 76561198222222222.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + $this->assertSame($input, $output, 'With the Steam ID toggle disabled the original input must be returned unchanged.'); + } +} From 44b6b990472406a08925a0f5faec0cae56a06037 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 14:43:14 +0000 Subject: [PATCH 04/10] feat: add player name redaction pass Adds three lexical-context regexes (after-SteamID, ChatMessage author, Combat/Safety pvp subsystem) and wires the player-name branch in redact(). Includes six PHPUnit tests covering all three contexts plus the toggle-off and no-anchor-no-touch cases. Co-Authored-By: Claude Sonnet 4.6 --- .../ProjectZomboid/ProjectZomboidRedactor.php | 16 +++- .../ProjectZomboidRedactorPlayerNameTest.php | 90 +++++++++++++++++++ 2 files changed, 105 insertions(+), 1 deletion(-) create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php diff --git a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php index b0dd91c..7e270b0 100644 --- a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php +++ b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php @@ -30,6 +30,18 @@ class ProjectZomboidRedactor implements RedactorInterface /** Zeroed-out SteamID64 placeholder; syntactically valid but refers to no real account. */ public const string STEAM_ID_REPLACEMENT = '76561198000000000'; + /** Generic placeholder substituted for every matched player display name. 
*/ + public const string PLAYER_NAME_REPLACEMENT = ''; + + /** Matches a double-quoted player name that immediately follows the redacted Steam ID placeholder (cmd.txt / admin.txt shape); relies on the Steam ID pass having run first. */ + public const string PLAYER_AFTER_STEAMID_REGEX = '/(?<=76561198000000000) "(?[^"]+)"/u'; + + /** Matches the author value inside a ChatMessage{...} envelope, using a fixed-length lookbehind on ", author='" and a lookahead on the closing "'" so only the bare name is replaced. */ + public const string PLAYER_IN_CHATMESSAGE_REGEX = '/(?<=, author=\')(?[^\']+)(?=\')/u'; + + /** Matches the first double-quoted player name following a Combat: or Safety: subsystem token (pvp.txt shape); does NOT redact the second name after "hit" — deferred to v2. */ + public const string PLAYER_IN_PVP_SUBSYSTEM_REGEX = '/(?<=(?:Combat|Safety): )"(?[^"]+)"/u'; + private bool $redactSteamIds = true; private bool $redactPlayerNames = true; private bool $redactCoordinates = true; @@ -85,7 +97,9 @@ class ProjectZomboidRedactor implements RedactorInterface $content = preg_replace(self::STEAM_ID_REGEX, self::STEAM_ID_REPLACEMENT, $content); } if ($this->redactPlayerNames) { - // Player name pass added in Task 3 + $content = preg_replace(self::PLAYER_AFTER_STEAMID_REGEX, ' "' . self::PLAYER_NAME_REPLACEMENT . '"', $content); + $content = preg_replace(self::PLAYER_IN_CHATMESSAGE_REGEX, self::PLAYER_NAME_REPLACEMENT, $content); + $content = preg_replace(self::PLAYER_IN_PVP_SUBSYSTEM_REGEX, '"' . self::PLAYER_NAME_REPLACEMENT . 
'"', $content); } if ($this->redactCoordinates) { // Coordinates pass added in Task 4 diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php new file mode 100644 index 0000000..9463c69 --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php @@ -0,0 +1,90 @@ +" admin.broadcastMessage @ 1020,2020,0.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Player name following the redacted Steam ID placeholder must be replaced.'); + } + + public function testRedactsChatMessageAuthor(): void + { + // The author field inside ChatMessage{...} must be replaced; the text + // payload ('hello') is not in scope for player-name redaction and must + // survive unchanged. + $input = "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='Player1', text='hello'}."; + $expected = "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='', text='hello'}."; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + $this->assertSame($expected, $output, 'ChatMessage author must be replaced while the text payload remains unchanged.'); + } + + public function testRedactsCombatNameInPvpLog(): void + { + // Only the FIRST quoted name (after "Combat: ") is redacted in v1. + // The second name (after "hit") is NOT yet redacted — deferred to v2. + // The weapon name ("Tire Iron (Worn)") must also survive unchanged. 
+ $input = '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; + $expected = '[16-04-26 17:14:35.128][INFO] Combat: "" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + // Player1 (after "Combat: ") is replaced; Player2 (after "hit") is NOT + // replaced in v1 — that anchor is deferred. + $this->assertSame($expected, $output, 'First Combat: player name must be replaced; second name and weapon must survive.'); + } + + public function testRedactsSafetyNameInPvpLog(): void + { + $input = '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (1000,2000,0) restore true.'; + $expected = '[16-04-26 16:17:49.731][LOG] Safety: "" (1000,2000,0) restore true.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Player name following the Safety: token must be replaced.'); + } + + public function testBareQuotedStringWithoutAnchorIsNotTouched(): void + { + // "foo" is not preceded by a redacted Steam ID, not inside ChatMessage{...}, + // and not after Combat:/Safety: — it must pass through unchanged. 
+ $input = 'option changed to "foo" successfully.'; + + $output = (new ProjectZomboidRedactor())->redact($input); + + $this->assertSame($input, $output, 'A quoted string with no matching anchor must not be redacted.'); + } + + public function testToggleOffLeavesNamesIntact(): void + { + $input = '76561198000000000 "Player1" ISLogSystem.writeLog @ 1000,2000,0.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($input, $output, 'With the player-name toggle disabled the original input must be returned unchanged.'); + } +} From 2d1cbccc5d5c09714834b70a7a188ebd271ae16d Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 14:49:52 +0000 Subject: [PATCH 05/10] feat: add coordinates redaction pass Adds three COORDS_*_REGEX constants (at-clause, bracketed, parenthesised) plus COORDS_REPLACEMENT, wires them into redact(), and covers all three contexts with 8 new tests including a critical negative test asserting DebugLog-server.txt server-metadata triples are not redacted. Also updates two Task 3 player-name tests whose expected strings now include the coords redaction that the wired pass applies. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../ProjectZomboid/ProjectZomboidRedactor.php | 16 ++- .../ProjectZomboidRedactorCoordinatesTest.php | 124 ++++++++++++++++++ .../ProjectZomboidRedactorPlayerNameTest.php | 15 ++- 3 files changed, 148 insertions(+), 7 deletions(-) create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php diff --git a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php index 7e270b0..03b0d54 100644 --- a/src/Util/ProjectZomboid/ProjectZomboidRedactor.php +++ b/src/Util/ProjectZomboid/ProjectZomboidRedactor.php @@ -42,6 +42,18 @@ class ProjectZomboidRedactor implements RedactorInterface /** Matches the first double-quoted player name following a Combat: or Safety: subsystem token (pvp.txt shape); does NOT redact the second name after "hit" — deferred to v2. */ public const string PLAYER_IN_PVP_SUBSYSTEM_REGEX = '/(?<=(?:Combat|Safety): )"(?[^"]+)"/u'; + /** Zeroed-out coordinate triple used as the inner replacement; bracket/paren/`at` wrapper is preserved by the regex lookaround anchors. */ + public const string COORDS_REPLACEMENT = '0,0,0'; + + /** Matches integer or float coordinate triplets that immediately follow the literal ` at ` token (map.txt / item.txt shape); the trailing dot is preserved via lookahead. */ + public const string COORDS_AT_CLAUSE_REGEX = '/(?<= at )(?[\d.]+),(?[\d.]+),(?-?[\d.]+)(?=\.)/u'; + + /** Matches integer coordinate triplets enclosed in square brackets (ClientActionLog.txt / PerkLog.txt / cmd.txt @-context shape); the surrounding brackets are preserved via lookaround. 
*/ + public const string COORDS_BRACKETED_REGEX = '/(?<=\[)(?\d+),(?\d+),(?-?\d+)(?=\])/u'; + + /** Matches integer coordinate triplets enclosed in round parentheses, anchored on a trailing PvP verb to disambiguate from server-metadata triples (pvp.txt Combat:/Safety: shape); only the attacker/first-coord set is redacted per line — the victim coords lack the trailing keyword and are deferred to v2. */ + public const string COORDS_PARENTHESISED_REGEX = '/(?<=\()(?\d+),(?\d+),(?-?\d+)(?=\) (?:hit|restore|store|true|false))/u'; + private bool $redactSteamIds = true; private bool $redactPlayerNames = true; private bool $redactCoordinates = true; @@ -102,7 +114,9 @@ class ProjectZomboidRedactor implements RedactorInterface $content = preg_replace(self::PLAYER_IN_PVP_SUBSYSTEM_REGEX, '"' . self::PLAYER_NAME_REPLACEMENT . '"', $content); } if ($this->redactCoordinates) { - // Coordinates pass added in Task 4 + $content = preg_replace(self::COORDS_AT_CLAUSE_REGEX, self::COORDS_REPLACEMENT, $content); + $content = preg_replace(self::COORDS_BRACKETED_REGEX, self::COORDS_REPLACEMENT, $content); + $content = preg_replace(self::COORDS_PARENTHESISED_REGEX, self::COORDS_REPLACEMENT, $content); } return $content; } diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php new file mode 100644 index 0000000..6b241a3 --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php @@ -0,0 +1,124 @@ +redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Integer coords following " at " must be replaced; leading "at " and trailing "." must be preserved.'); + } + + public function testRedactsAtClauseFloatCoords(): void + { + // map.txt shape: IsoObject form with float coords (x.x,y.y,z.z). 
+ $input = '[16-04-26 12:00:01.000] 76561198000000001 "Player1" added IsoObject (fencing_damaged_01_124) at 1010.0,2010.0,0.0.'; + $expected = '[16-04-26 12:00:01.000] 76561198000000001 "Player1" added IsoObject (fencing_damaged_01_124) at 0,0,0.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Float coords following " at " must be replaced; the IsoObject parenthesised form must be unaffected.'); + } + + public function testRedactsBracketedCoords(): void + { + // ClientActionLog.txt shape: strict 5-field bracketed structure. + // The Steam ID bracket and action/player/param brackets must survive. + $input = '[16-04-26 12:00:02.000] [76561198000000001][ISEnterVehicle][Player1][1000,2000,0][Van_LectroMax].'; + $expected = '[16-04-26 12:00:02.000] [76561198000000001][ISEnterVehicle][Player1][0,0,0][Van_LectroMax].'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Coord bracket must become [0,0,0]; Steam ID, action, player name, and param brackets must be unaffected.'); + } + + public function testRedactsBracketedNegativeZ(): void + { + // Basement Z coordinates are negative; the regex must handle the leading minus. + $input = '[16-04-26 12:00:03.000] [76561198000000001][ISEnterVehicle][Player1][1020,2020,-1][Van_LectroMax].'; + $expected = '[16-04-26 12:00:03.000] [76561198000000001][ISEnterVehicle][Player1][0,0,0][Van_LectroMax].'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Negative Z (basement level) inside square brackets must be replaced.'); + } + + public function testRedactsParenthesisedCoordsBeforeHit(): void + { + // pvp.txt Combat: shape. The attacker coords are followed by ") hit" and ARE + // redacted. 
The victim coords are followed by ") weapon=" and are NOT redacted + // in v1 — the trailing-keyword anchor is intentionally absent for that position. + $input = '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; + $expected = '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (0,0,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + // Attacker coords (before "hit") are redacted; victim coords (before "weapon=") are NOT — deferred to v2. + $this->assertSame($expected, $output, 'Attacker coords before "hit" must be replaced; victim coords without a trailing keyword must survive.'); + } + + public function testRedactsParenthesisedCoordsBeforeSafetyVerb(): void + { + // pvp.txt Safety: shape; coords followed by ") restore true". + $input = '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (1000,2000,0) restore true.'; + $expected = '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (0,0,0) restore true.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'Coords followed by ") restore" must be replaced.'); + } + + public function testServerMetadataTriplesAreNotRedacted(): void + { + // DebugLog-server.txt entries contain server-state metadata that superficially + // resembles coordinates but is not: "st:48,648,157,584" is a 4-component token, + // "t:1776297642406" is a millisecond timestamp. Neither pattern lives inside + // brackets, parentheses followed by a PvP verb, or after " at " — so none of + // the three coordinate regexes should fire. 
+ $input = '[16-04-26 00:01:19.080] ERROR: General f:0, t:1776297642406, st:48,648,157,584> Server starting up.'; + + $output = (new ProjectZomboidRedactor())->redact($input); + + $this->assertSame($input, $output, 'Server metadata triples (st:) and millisecond timestamps (t:) must pass through unchanged.'); + } + + public function testToggleOffLeavesCoordsIntact(): void + { + $input = '[16-04-26 12:00:04.000] 76561198000000001 "Player1" added Base.Aerosolbomb at 1000,2000,0.'; + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redactCoordinates(false) + ->redact($input); + + $this->assertSame($input, $output, 'With the coordinates toggle disabled the original input must be returned unchanged.'); + } +} diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php index 9463c69..d7d639b 100644 --- a/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php @@ -42,27 +42,30 @@ class ProjectZomboidRedactorPlayerNameTest extends TestCase // The second name (after "hit") is NOT yet redacted — deferred to v2. // The weapon name ("Tire Iron (Worn)") must also survive unchanged. $input = '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; - $expected = '[16-04-26 17:14:35.128][INFO] Combat: "" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; + // Attacker coords (before "hit") are also replaced by the coordinates pass. + // Victim coords (before "weapon=") lack the trailing keyword and are NOT replaced — deferred to v2. 
+ $expected = '[16-04-26 17:14:35.128][INFO] Combat: "" (0,0,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.'; $output = (new ProjectZomboidRedactor()) ->redactSteamIds(false) ->redact($input); - // Player1 (after "Combat: ") is replaced; Player2 (after "hit") is NOT - // replaced in v1 — that anchor is deferred. - $this->assertSame($expected, $output, 'First Combat: player name must be replaced; second name and weapon must survive.'); + // Player1 (after "Combat: ") is replaced; attacker coords (before "hit") are also replaced. + // Player2 (after "hit") and victim coords (before "weapon=") are NOT replaced in v1 — deferred. + $this->assertSame($expected, $output, 'First Combat: player name and attacker coords must be replaced; second name, victim coords, and weapon must survive.'); } public function testRedactsSafetyNameInPvpLog(): void { $input = '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (1000,2000,0) restore true.'; - $expected = '[16-04-26 16:17:49.731][LOG] Safety: "" (1000,2000,0) restore true.'; + // Coords (before ") restore") are also replaced by the coordinates pass. 
+ $expected = '[16-04-26 16:17:49.731][LOG] Safety: "" (0,0,0) restore true.'; $output = (new ProjectZomboidRedactor()) ->redactSteamIds(false) ->redact($input); - $this->assertSame($expected, $output, 'Player name following the Safety: token must be replaced.'); + $this->assertSame($expected, $output, 'Player name and coords following the Safety: token must both be replaced.'); } public function testBareQuotedStringWithoutAnchorIsNotTouched(): void From c2cb64e9a701cd0548c0fb9acc6821d3f52da030 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 14:57:08 +0000 Subject: [PATCH 06/10] test: add Redactor combined and idempotence coverage Co-Authored-By: Claude Sonnet 4.6 --- .../ProjectZomboidRedactorCombinedTest.php | 146 ++++++++++++++++++ .../ProjectZomboidRedactorIdempotenceTest.php | 99 ++++++++++++ 2 files changed, 245 insertions(+) create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php new file mode 100644 index 0000000..873809c --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php @@ -0,0 +1,146 @@ +" added Base.Aerosolbomb at 0,0,0.', + '[16-04-26 12:00:01.000] 76561198000000000 "" added IsoObject (fence_01) at 0,0,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='', text='hello'}.", + '[16-04-26 17:14:35.128][INFO] Combat: "" (0,0,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.', + '[16-04-26 16:17:49.731][LOG] Safety: "" (0,0,0) restore true.', + '[16-04-26 12:00:02.000] [76561198000000000][ISEnterVehicle][Player2][0,0,0][Van_LectroMax].', + ]); + + $output = (new ProjectZomboidRedactor())->redact($input); + + $this->assertSame($expected, $output, 'With all three toggles on, every Steam ID, player 
name context, and coord shape must be replaced.'); + } + + public function testSteamIdToggleOffLeavesSteamIdsIntact(): void + { + // All three PII categories present; Steam ID toggle is disabled. + // + // Important nuance: PLAYER_AFTER_STEAMID_REGEX anchors on the redacted placeholder + // 76561198000000000. With redactSteamIds(false) the raw Steam ID survives, so the + // regex does NOT fire for lines in the "after-Steam-ID" shape — those names survive + // too. Names anchored by other contexts (ChatMessage author, Combat:/Safety:) are + // still redacted because those regexes don't depend on the Steam ID pass. + $input = implode("\n", [ + // after-Steam-ID shape: name will NOT be redacted because the Steam ID is raw + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + // ChatMessage author: still redacted (anchor is independent of Steam ID pass) + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='AdminUser', text='hello'}.", + // Combat: name + attacker coords + '[16-04-26 17:14:35.128][INFO] Combat: "Player2" (1005,2005,0) hit "Player1" (1006,2005,0) weapon="Pipe Bomb" damage=1.0.', + ]); + + $expected = implode("\n", [ + // Steam ID intact; "Player1" NOT redacted (anchor regex didn't fire); at-clause coords redacted + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 0,0,0.', + // ChatMessage author redacted; this line carries no coords, so nothing else changes + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='', text='hello'}.", + // Combat: name + attacker coords both redacted + '[16-04-26 17:14:35.128][INFO] Combat: "" (0,0,0) hit "Player1" (1006,2005,0) weapon="Pipe Bomb" damage=1.0.', + ]); + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redact($input); + + $this->assertSame( + $expected, + $output, + 'With Steam ID toggle off: raw Steam IDs survive; PLAYER_AFTER_STEAMID_REGEX does not fire (no placeholder to anchor on) so those names also survive; 
ChatMessage and Combat:/Safety: names are still redacted; coords are still redacted.', + ); + } + + public function testPlayerNameToggleOffLeavesNamesIntact(): void + { + // Steam IDs and coords redact; player names survive verbatim. + $input = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='Player2', text='bye'}.", + '[16-04-26 16:17:49.731][LOG] Safety: "AdminUser" (1050,2050,0) restore true.', + ]); + + $expected = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198000000000 "Player1" added Base.Aerosolbomb at 0,0,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='Player2', text='bye'}.", + '[16-04-26 16:17:49.731][LOG] Safety: "AdminUser" (0,0,0) restore true.', + ]); + + $output = (new ProjectZomboidRedactor()) + ->redactPlayerNames(false) + ->redact($input); + + $this->assertSame($expected, $output, 'With player-name toggle off, all player names must survive; Steam IDs and coords must still be redacted.'); + } + + public function testCoordinatesToggleOffLeavesCoordsIntact(): void + { + // Steam IDs and player names redact; coordinates survive verbatim. 
+ $input = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + '[16-04-26 12:00:01.000] [76561198222222222][ISEnterVehicle][Player2][1020,2020,0][Van_LectroMax].', + '[16-04-26 17:14:35.128][INFO] Combat: "AdminUser" (1005,2005,0) hit "Player1" (1006,2005,0) weapon="Baseball Bat" damage=0.5.', + ]); + + $expected = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198000000000 "" added Base.Aerosolbomb at 1000,2000,0.', + '[16-04-26 12:00:01.000] [76561198000000000][ISEnterVehicle][Player2][1020,2020,0][Van_LectroMax].', + '[16-04-26 17:14:35.128][INFO] Combat: "" (1005,2005,0) hit "Player1" (1006,2005,0) weapon="Baseball Bat" damage=0.5.', + ]); + + $output = (new ProjectZomboidRedactor()) + ->redactCoordinates(false) + ->redact($input); + + $this->assertSame($expected, $output, 'With coordinates toggle off, all coord triplets must survive; Steam IDs and player names must still be redacted.'); + } + + public function testAllTogglesOffReturnsInputByteForByte(): void + { + // Disabling every toggle must produce an output identical to the input — + // the "passthrough" contract: opt-out means truly nothing happens. 
+ $input = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='Player2', text='hello'}.", + '[16-04-26 17:14:35.128][INFO] Combat: "AdminUser" (1005,2005,0) hit "Player1" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.', + '[16-04-26 12:00:01.000] [76561198333333333][ISEnterVehicle][Player2][1020,2020,0][Van_LectroMax].', + ]); + + $output = (new ProjectZomboidRedactor()) + ->redactSteamIds(false) + ->redactPlayerNames(false) + ->redactCoordinates(false) + ->redact($input); + + $this->assertSame($input, $output, 'With all three toggles disabled, the output must be byte-for-byte identical to the input.'); + } +} diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php new file mode 100644 index 0000000..b84e32a --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php @@ -0,0 +1,99 @@ + do not accidentally re-match and produce a doubly- + * nested result like "" → something else. 
+ */ +class ProjectZomboidRedactorIdempotenceTest extends TestCase +{ + public function testIdempotenceSteamIdOnly(): void + { + $input = implode("\n", [ + 'Players: 76561198111111111, 76561198222222222, 76561198333333333 connected.', + '[16-04-26 12:00:00.000] [76561198111111111][ISEnterVehicle][Player1][1000,2000,0][Van_LectroMax].', + ]); + + $redactor = new ProjectZomboidRedactor(); + $redacted = $redactor->redact($input); + $redactedAgain = $redactor->redact($redacted); + + $this->assertSame($redacted, $redactedAgain, 'Applying redact() twice to Steam-ID-only input must produce the same result as applying it once.'); + } + + public function testIdempotencePlayerNamesOnly(): void + { + // Input already has the Steam ID placeholder in place (as the Steam ID pass + // would have written it), so PLAYER_AFTER_STEAMID_REGEX can fire. After the + // first pass the name becomes "" and the line reads + // 76561198000000000 "". On the second pass the placeholder + // "" (with the angle brackets) still satisfies [^"]+, so the regex + // does re-match after the Steam ID placeholder anchor. That is harmless: + // the match is replaced with the very same placeholder text, so the + // second pass rewrites the line to itself and the output is identical + // to the first-pass result. 
+ $input = implode("\n", [ + '76561198000000000 "Player1" ISLogSystem.writeLog @ 1000,2000,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='AdminUser', text='hi'}.", + '[16-04-26 16:17:49.731][LOG] Safety: "Player2" (1000,2000,0) restore true.', + ]); + + $redactor = (new ProjectZomboidRedactor())->redactSteamIds(false)->redactCoordinates(false); + $redacted = $redactor->redact($input); + $redactedAgain = $redactor->redact($redacted); + + $this->assertSame($redacted, $redactedAgain, 'Applying redact() twice to player-name-only input must produce the same result as applying it once.'); + } + + public function testIdempotenceCoordsOnly(): void + { + $input = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198000000001 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + '[16-04-26 12:00:01.000] [76561198000000001][ISEnterVehicle][Player1][1020,2020,-1][Van_LectroMax].', + '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.', + '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (1000,2000,0) restore true.', + ]); + + $redactor = (new ProjectZomboidRedactor())->redactSteamIds(false)->redactPlayerNames(false); + $redacted = $redactor->redact($input); + $redactedAgain = $redactor->redact($redacted); + + $this->assertSame($redacted, $redactedAgain, 'Applying redact() twice to coords-only input must produce the same result as applying it once; the placeholder 0,0,0 must not be re-matched.'); + } + + public function testIdempotenceAllCategories(): void + { + // Full input: all three PII categories in multiple lexical contexts. + // After the first redact(), every placeholder is in place. The second + // redact() must make no further changes. 
+ $input = implode("\n", [ + '[16-04-26 12:00:00.000] 76561198111111111 "Player1" added Base.Aerosolbomb at 1000,2000,0.', + '[16-04-26 12:00:01.000] 76561198222222222 "Player2" teleported to 1050,2050,0.', + "[16-04-26 17:05:03.280][info] Got message:ChatMessage{chat=Local, author='AdminUser', text='hello'}.", + '[16-04-26 17:14:35.128][INFO] Combat: "Player1" (1005,2005,0) hit "Player2" (1006,2005,0) weapon="Tire Iron (Worn)" damage=0.112317.', + '[16-04-26 16:17:49.731][LOG] Safety: "Player1" (1000,2000,0) restore true.', + '[16-04-26 12:00:02.000] [76561198333333333][ISEnterVehicle][Player2][1020,2020,0][Van_LectroMax].', + ]); + + $redactor = new ProjectZomboidRedactor(); + $redacted = $redactor->redact($input); + $redactedAgain = $redactor->redact($redacted); + + $this->assertSame($redacted, $redactedAgain, 'Applying redact() twice to input with all PII categories must produce the same result as applying it once; no placeholder must re-match on the second pass.'); + } +} From d6831c5851121f3df2d679dccfc7b1c988354568 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 15:02:57 +0000 Subject: [PATCH 07/10] test: add Redactor integration coverage against existing PZ fixtures Co-Authored-By: Claude Opus 4.7 (1M context) --- .../ProjectZomboidRedactorIntegrationTest.php | 205 ++++++++++++++++++ 1 file changed, 205 insertions(+) create mode 100644 test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php new file mode 100644 index 0000000..be54778 --- /dev/null +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php @@ -0,0 +1,205 @@ +` coords survive because COORDS_AT_CLAUSE_REGEX + * anchors on ` at `, not ` to `. + */ +class ProjectZomboidRedactorIntegrationTest extends TestCase +{ + private static string $fixturesDir = __DIR__ . 
'/../../../src/Games/ProjectZomboid/fixtures'; + + // --------------------------------------------------------------------------- + // Data providers + // --------------------------------------------------------------------------- + + /** + * Yields [fixturePath] for every PZ fixture file. + */ + public static function fixturePathProvider(): array + { + $dir = self::$fixturesDir; + return [ + 'admin' => [$dir . '/admin-minimal.txt'], + 'burd-journals' => [$dir . '/burd-journals-minimal.txt'], + 'chat' => [$dir . '/chat-minimal.txt'], + 'client-action' => [$dir . '/client-action-minimal.txt'], + 'cmd' => [$dir . '/cmd-minimal.txt'], + 'debug-server' => [$dir . '/debug-server-minimal.txt'], + 'item' => [$dir . '/item-minimal.txt'], + 'map' => [$dir . '/map-minimal.txt'], + 'perk' => [$dir . '/perk-minimal.txt'], + 'pvp' => [$dir . '/pvp-minimal.txt'], + 'user' => [$dir . '/user-minimal.txt'], + ]; + } + + /** + * Yields [fixturePath, logClass] for the fixtures whose log class parses + * them. All 11 fixtures are represented. + */ + public static function fixtureWithLogClassProvider(): array + { + $dir = self::$fixturesDir; + return [ + 'admin' => [$dir . '/admin-minimal.txt', ProjectZomboidAdminLog::class], + 'burd-journals' => [$dir . '/burd-journals-minimal.txt', ProjectZomboidBurdJournalsLog::class], + 'chat' => [$dir . '/chat-minimal.txt', ProjectZomboidChatLog::class], + 'client-action' => [$dir . '/client-action-minimal.txt', ProjectZomboidClientActionLog::class], + 'cmd' => [$dir . '/cmd-minimal.txt', ProjectZomboidCmdLog::class], + 'debug-server' => [$dir . '/debug-server-minimal.txt', ProjectZomboidServerLog::class], + 'item' => [$dir . '/item-minimal.txt', ProjectZomboidItemLog::class], + 'map' => [$dir . '/map-minimal.txt', ProjectZomboidMapLog::class], + 'perk' => [$dir . '/perk-minimal.txt', ProjectZomboidPerkLog::class], + 'pvp' => [$dir . '/pvp-minimal.txt', ProjectZomboidPvpLog::class], + 'user' => [$dir . 
'/user-minimal.txt', ProjectZomboidUserLog::class], + ]; + } + + // --------------------------------------------------------------------------- + // Helper + // --------------------------------------------------------------------------- + + private function redact(string $content): string + { + return (new ProjectZomboidRedactor())->redact($content); + } + + // --------------------------------------------------------------------------- + // Test 1 — Steam ID normalisation + // --------------------------------------------------------------------------- + + /** + * After redaction every 17-digit Steam ID that is NOT the zero-placeholder + * must be gone. The zero-placeholder itself (76561198000000000) is the only + * Steam ID that may remain. + */ + #[DataProvider('fixturePathProvider')] + public function testFixtureContainsNoSteamIdsAfterRedaction(string $fixturePath): void + { + $content = (new PathLogFile($fixturePath))->getContent(); + $redacted = $this->redact($content); + + $matches = preg_match_all('/(?assertSame( + 0, + $matches, + sprintf( + 'After redaction, fixture "%s" must contain no non-zero-placeholder Steam IDs, but %d were found.', + basename($fixturePath), + $matches, + ), + ); + } + + // --------------------------------------------------------------------------- + // Test 2 — Structural preservation (re-parse after redaction) + // --------------------------------------------------------------------------- + + /** + * The redacted content, fed back through the corresponding parser, must + * produce exactly the same number of log entries as the original content. + * + * This asserts that the redactor does not corrupt timestamps, delimiters, + * or structural tokens that the parser relies on. + * + * @param string $fixturePath Path to the fixture file. + * @param class-string<\IndifferentKetchup\Codex\Log\Log> $logClass + * Fully-qualified name of the Log subclass that corresponds to this fixture. 
+ */ + #[DataProvider('fixtureWithLogClassProvider')] + public function testFixtureRedactedOutputParsesToSameEntryCount(string $fixturePath, string $logClass): void + { + $content = (new PathLogFile($fixturePath))->getContent(); + + /** @var \IndifferentKetchup\Codex\Log\Log $originalLog */ + $originalLog = (new $logClass())->setLogFile(new PathLogFile($fixturePath)); + $originalLog->parse(); + $originalCount = count($originalLog->getEntries()); + + $redacted = $this->redact($content); + + /** @var \IndifferentKetchup\Codex\Log\Log $redactedLog */ + $redactedLog = (new $logClass())->setLogFile(new StringLogFile($redacted)); + $redactedLog->parse(); + $redactedCount = count($redactedLog->getEntries()); + + $this->assertSame( + $originalCount, + $redactedCount, + sprintf( + 'Parsing the redacted "%s" fixture with %s must yield the same entry count (%d) as parsing the original, but got %d.', + basename($fixturePath), + $logClass, + $originalCount, + $redactedCount, + ), + ); + } + + // --------------------------------------------------------------------------- + // Test 3 — Idempotence + // --------------------------------------------------------------------------- + + /** + * Applying redact() a second time must produce no further changes: + * redact(redact(content)) === redact(content). + * + * This guards against poorly-anchored regexes that would re-match the + * redaction placeholders themselves on a second pass. 
+ */ + #[DataProvider('fixturePathProvider')] + public function testFixtureIsIdempotent(string $fixturePath): void + { + $content = (new PathLogFile($fixturePath))->getContent(); + + $redactor = new ProjectZomboidRedactor(); + $once = $redactor->redact($content); + $twice = $redactor->redact($once); + + $this->assertSame( + $once, + $twice, + sprintf( + 'redact(redact(content)) must equal redact(content) for fixture "%s"; a second pass must be a no-op.', + basename($fixturePath), + ), + ); + } +} From 081d40c2089dbf950c0ece13f78789fdc4c14573 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 15:08:49 +0000 Subject: [PATCH 08/10] docs: document Redactor utility in CLAUDE.md, README, CHANGELOG MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CLAUDE.md: added RedactorInterface bullet to the architecture list (after Custom Analyser subclasses, before Detectors); added ProjectZomboidRedactor entry under ProjectZomboid specifics; added src/Util/ to the game-subtrees layout code block with a prose note marking it as the sixth component directory introduced post-v0.1.0; added Pitfall 5 on mandatory pass order. README.md: new "Redaction" subsection between Quick start and Architecture — PHP snippet, replacement descriptions, three toggle methods, three documented v1 limitations. CHANGELOG.md: added [Unreleased] section (Added + Changed) above [0.1.0]. Removed the Redactor bullet from [0.1.0]'s Deferred list entirely — the historical record stays accurate (v0.1.0 shipped without it) and [Unreleased] now documents its arrival; a stub mention in Deferred would be redundant. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- CHANGELOG.md | 11 ++++++++++- CLAUDE.md | 6 ++++++ README.md | 15 +++++++++++++++ 3 files changed, 31 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 38ceea6..c0d3824 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,16 @@ All notable changes to `indifferentketchup/codex` are documented here. The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [Unreleased] + +### Added + +- `RedactorInterface` (`src/Util/RedactorInterface.php`) and `ProjectZomboidRedactor` (`src/Util/ProjectZomboid/ProjectZomboidRedactor.php`) — render-time PII filter that scrubs Steam IDs, player names, and world coordinates from Project Zomboid log content. Three independent toggles default to on. Designed as a string-in/string-out utility so consumers can apply it at any rendering or export step. Documented v1 limitations: in PvP combat lines, only the attacker's name and coords are redacted; victim's name and coords (after `hit`) are deferred to v2. In admin lines, `teleported X to ` coordinates are not redacted in v1. + +### Changed + +- New top-level `src/Util/` directory introduced. The Redactor is its first occupant; future utilities (e.g. tokenising redactor variants) land here. + ## [0.1.0] — 2026-05-01 First public release. Codex is a generic PHP log parsing and analysis framework with full Project Zomboid server-log support across eight analysers. The Composer package name is `indifferentketchup/codex` (the repository directory and Gitea slug are `ik-codex`; the package name is not). @@ -32,7 +42,6 @@ First public release. Codex is a generic PHP log parsing and analysis framework ### Deferred -- **Codex `Redactor` utility** — design captured in `docs/superpowers/specs/2026-04-30-redactor-design.md`. Not implemented in v0.1.0. 
iblogs (the downstream consumer) handles upload-time PII filtering for this release; codex itself ships no PII helper. The deferred spec exists so iblogs's privacy story has a referenced design to point at and so a future implementation pass has a clear contract to start from. - **Other game implementations** — `Minecraft`, `Hytale`, and `SevenDaysToDie` are detective-stub-only. Each has a TODO `Detective` extending base `Detective`; their per-component subdirectories under `Analyser`, `Log`, `Parser`, and `Pattern` contain only `.gitkeep` placeholders. Real implementations land if and when fixtures and demand exist. - **Packagist publication** — v0.1.0 is consumable via Composer's `vcs` repository entry pointing at the Gitea remote. Pushing to Packagist is a separate decision and is not in scope for this release. diff --git a/CLAUDE.md b/CLAUDE.md index bad7a06..5404b64 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -49,6 +49,7 @@ Analysis of Insight[] - **`PatternParser`** is regex-driven. Lines that don't match the LINE regex append to the previous `Entry` — this is the mechanism that handles multi-line records like Java stack traces under an ERROR header. - **`PatternAnalyser`** walks entries, runs each registered insight class's static `getPatterns()` against entry text via `preg_match_all`, and emits coalesced insights (equal insights bump a counter instead of duplicating). - **Custom `Analyser` subclasses** are the right move when analysis needs cross-entry state — pairing events, sliding-window thresholds, comparing consecutive snapshots. `PatternAnalyser` operates per-entry only and can't express those. Phase B.3 (`ConnectionFailureAnalyser`, `ItemDuplicationAnalyser`, `SkillProgressionAnomalyAnalyser`) shows the shape: extend `Analyser`, override `analyse()`, walk `$this->log` once, aggregate, then emit coalesced `Problem`/`Information` insights at the end. Tunable thresholds belong as `public const` constants on the subclass with the rationale in a docblock. 
+- **`RedactorInterface`** is a render-time PII filter — string-in/string-out, configured per game, implemented at `src/Util//Redactor.php`. Consumers call `redact(string $content): string` on a concrete instance before rendering or exporting log content. - Detectors available out of the box: `SinglePatternDetector`, `WeightedSinglePatternDetector`, `LinePatternDetector` (returns match ratio), `MultiPatternDetector` (AND), and the path-based `FilenameDetector` (uses `LogFileInterface::getPath()`, returns `false` when no path is available). ## Game subtrees @@ -58,10 +59,13 @@ Layout is **components-outer with game suffix**, not games-outer: ``` src///... e.g. src/Log/ProjectZomboid/ProjectZomboidServerLog.php src/Pattern//Pattern.php (regex string constants; not a framework abstraction) +src/Util//... e.g. src/Util/ProjectZomboid/ProjectZomboidRedactor.php test/tests/Games//... test/src/Games//fixtures/-minimal.txt (synthetic fixtures only) ``` +`src/Util/` is the sixth top-level component directory, introduced post-v0.1.0-tag. Its first occupant is the Redactor; future game-agnostic utilities (tokenising redactor variants, etc.) land here too. + Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty `.gitkeep`s plus a TODO `Detective` extending base `Detective`). `ProjectZomboid` is fully implemented: 11 log subclasses, 11 pattern classes, detective wired with all 11, synthetic fixtures, dispatch tests, plus the analyser surface — 11 `PatternAnalyser`-driven Insight classes under `src/Analysis/ProjectZomboid/` and 3 custom `Analyser` subclasses under `src/Analyser/ProjectZomboid/` for cross-entry / threshold logic. `src/Pattern/` is **not a framework abstraction** — patterns are plain `string` class constants. Each `Pattern` typically holds a `LINE` constant for the parser plus named-group extractor constants (`FIELDS`, `COMBAT`, `MOD_LOAD`, etc.) for analysers. 
@@ -74,6 +78,7 @@ Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty - A custom `Analyser` subclass (cross-entry logic): `UserLog → ConnectionFailureAnalyser`, `ItemLog → ItemDuplicationAnalyser`, `PerkLog → SkillProgressionAnomalyAnalyser`. - A configured `PatternAnalyser` (per-entry pattern matching): `ServerLog`, `PvpLog`, `AdminLog` register their respective Insight classes. - An empty `PatternAnalyser` for logs with no analysers yet: `ChatLog`, `ClientActionLog`, `CmdLog`, `MapLog`, `BurdJournalsLog`. These are wiring stubs awaiting future analysis work. +- **`ProjectZomboidRedactor`** at `src/Util/ProjectZomboid/ProjectZomboidRedactor.php` — concrete `RedactorInterface` implementation. Downstream consumers call `redact(string): string` to scrub Steam IDs (zeroed placeholder), player names (``), and world coordinates (`0,0,0`) from log content. Three independent toggle methods default to on: `redactSteamIds(bool)`, `redactPlayerNames(bool)`, `redactCoordinates(bool)`. Pass order (Steam ID → player name → coords) is mandatory and enforced internally — see Pitfall 5. ### Standard test template for a Log subclass @@ -85,6 +90,7 @@ At minimum: (1) entry count after `parse()` matches the synthetic fixture's line 2. **PHPUnit 12 requires the `#[DataProvider('methodName')]` attribute.** The legacy `@dataProvider` annotation silently passes zero args and fails with `ArgumentCountError`. 3. **`Level::fromString()` defaults to `Level::INFO` for unknown tokens.** Project Zomboid log levels map: `LOG`/`INFO` → INFO; `WARN` → WARNING; `ERROR` → ERROR. 4. **`PatternParser` matches array** must declare a match-type for **every** capture group in the regex (`TIME`, `LEVEL`, or `PREFIX`); otherwise the parser throws on the unmapped index. Use non-capturing groups `(?:...)` for fields you want to skip. +5. 
**`ProjectZomboidRedactor` pass order is mandatory.** `PLAYER_AFTER_STEAMID_REGEX` anchors on the already-redacted Steam ID placeholder — it will not match raw Steam IDs. Do NOT swap the Steam ID and player-name passes, and do NOT stub out the Steam ID pass while leaving the player-name pass enabled. ## Workflow conventions diff --git a/README.md b/README.md index dc3b537..f6a823f 100644 --- a/README.md +++ b/README.md @@ -59,6 +59,21 @@ Project Zomboid Debug Server Log If the log content arrives without a filesystem path (clipboard paste, web upload, stream), use `StringLogFile` or `StreamLogFile` instead of `PathLogFile`. The detective falls back to content signatures when the filename hint is absent. +## Redaction + +Before rendering or exporting log content, pass it through `ProjectZomboidRedactor` to strip PII: + +```php +use IndifferentKetchup\Codex\Util\ProjectZomboid\ProjectZomboidRedactor; + +$redactor = new ProjectZomboidRedactor(); +$safe = $redactor->redact($logContent); +``` + +This scrubs three categories in a fixed pass order: Steam IDs are replaced with a zeroed placeholder, player names with ``, and world coordinates with `0,0,0`. All three passes are on by default; opt out per category with `redactSteamIds(bool)`, `redactPlayerNames(bool)`, or `redactCoordinates(bool)`. + +Documented v1 limitations: in PvP combat lines, only the attacker's name and coords are redacted — the victim's name and coords (appearing after `hit`) are deferred to v2. In admin lines, `teleported X to ` coordinates are not redacted in v1. + ## Architecture ``` From 6bf63f1823e3c703f0a209fb9d6a795f6f83bdfb Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 18:21:22 +0000 Subject: [PATCH 09/10] docs: flip Redactor spec status to implemented MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The spec doc was written when the Redactor was deferred and shipped with a "Status: deferred — not implemented" header. 
The redactor branch lands the implementation; the header is now stale. Replace with a pointer to the plan and CHANGELOG [Unreleased] section. Resolves observation #1 from the final code review. --- docs/superpowers/specs/2026-04-30-redactor-design.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/superpowers/specs/2026-04-30-redactor-design.md b/docs/superpowers/specs/2026-04-30-redactor-design.md index 574ef85..e2e7a34 100644 --- a/docs/superpowers/specs/2026-04-30-redactor-design.md +++ b/docs/superpowers/specs/2026-04-30-redactor-design.md @@ -1,7 +1,7 @@ # Codex Redactor utility — design spec > Retroactive: written 2026-05-01. -> **Status: deferred — not implemented.** This is a forward-looking design captured here for backfill symmetry and to inform iblogs's upload-time PII handling. +> **Status: implemented on the `redactor` branch (2026-05-01).** Plan: `docs/superpowers/plans/2026-05-01-redactor.md`. Arrival commit set documented in `CHANGELOG.md` `[Unreleased]`. The "Status: deferred" framing below is preserved for historical context; treat this file as the as-built design contract. ## Summary From 50194c72b24a2cd96483ce3e30be9e3eedb61c00 Mon Sep 17 00:00:00 2001 From: indifferentketchup Date: Fri, 1 May 2026 18:22:25 +0000 Subject: [PATCH 10/10] test: add player-name collapse integration coverage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Resolves observation #3 from the final code review. The integration tests previously asserted Steam-ID elimination, structural preservation, and idempotence but did not directly verify that synthetic player names collapse to the name placeholder after redaction. Adds testFixturePlayerNamesCollapseInCoveredContexts, parameterised over the five fixtures (chat, cmd, item, map, user) where every synthetic name appears exclusively in a context the redactor recognises (ChatMessage author or Steam-ID-followed-by-quoted-name).
The data provider docblock explicitly enumerates which fixtures are excluded and why: admin and client-action/perk because names appear in unanchored or bracket-only contexts; pvp because the victim name after `hit` is a v1 limitation; burd-journals/debug-server because no synthetic player names are present. Test count: 255 -> 260 (5 new effective cases from data-provider). --- .../ProjectZomboidRedactorIntegrationTest.php | 67 +++++++++++++++++++ 1 file changed, 67 insertions(+) diff --git a/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php b/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php index be54778..6b94d02 100644 --- a/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php +++ b/test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php @@ -69,6 +69,38 @@ class ProjectZomboidRedactorIntegrationTest extends TestCase ]; } + /** + * Yields [fixturePath] for the subset of fixtures where every synthetic + * player name (Player1 / Player2 / AdminUser / PlayerSuspect) appears + * exclusively in a context the redactor recognises: + * + * - chat: ChatMessage{author='...'} envelope + * - cmd, item, map, user: 17-char-Steam-ID followed by "..." quoted name + * + * Fixtures intentionally excluded: + * + * - admin: names appear in free-text positions (no Steam-ID anchor, + * no quotes, no Combat:/Safety: prefix). Names survive in v1. + * - client-action, + * perk: names appear inside [...] brackets, not "..." quotes. + * PLAYER_AFTER_STEAMID_REGEX requires double-quotes. + * - pvp: attacker name redacts but victim name after `hit "..."` + * survives in v1 (Task 3 limitation). + * - burd-journals, + * debug-server: no synthetic player names present. + */ + public static function fixturesWhereAllNamesAreInCoveredContextsProvider(): array + { + $dir = self::$fixturesDir; + return [ + 'chat' => [$dir . '/chat-minimal.txt'], + 'cmd' => [$dir . '/cmd-minimal.txt'], + 'item' => [$dir .
'/item-minimal.txt'], + 'map' => [$dir . '/map-minimal.txt'], + 'user' => [$dir . '/user-minimal.txt'], + ]; + } + /** * Yields [fixturePath, logClass] for the fixtures whose log class parses * them. All 11 fixtures are represented. @@ -202,4 +234,39 @@ class ProjectZomboidRedactorIntegrationTest extends TestCase ), ); } + + // --------------------------------------------------------------------------- + // Test 4 — Player-name collapse in fully-covered fixtures + // --------------------------------------------------------------------------- + + /** + * For fixtures where every synthetic player name appears exclusively in a + * context the redactor recognises, no synthetic name should remain after + * redaction. + * + * This addresses observation #3 from the final code review (the integration + * tests previously asserted Steam-ID elimination + structural preservation + * + idempotence, but did not directly verify name collapse). The unit tests + * in ProjectZomboidRedactorPlayerNameTest cover this property exhaustively + * per-context; this integration test re-verifies it end-to-end against the + * fixtures that ride into iblogs. + */ + #[DataProvider('fixturesWhereAllNamesAreInCoveredContextsProvider')] + public function testFixturePlayerNamesCollapseInCoveredContexts(string $fixturePath): void + { + $content = (new PathLogFile($fixturePath))->getContent(); + $redacted = $this->redact($content); + + foreach (['Player1', 'Player2', 'AdminUser', 'PlayerSuspect'] as $name) { + $this->assertStringNotContainsString( + $name, + $redacted, + sprintf( + 'Fixture "%s": synthetic name %s survived redaction. Every name in this fixture should appear only in a covered lexical context.', + basename($fixturePath), + $name, + ), + ); + } + } }
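Reviewer sketch (not part of the patch series): the pass-order constraint from Pitfall 5 and the idempotence property checked by `testFixtureIsIdempotent` can be illustrated with a minimal two-pass example. Everything below is hypothetical: the function names, the 17-zero Steam ID placeholder, the `REDACTED_NAME` label, and both regexes are assumptions chosen for illustration, not the actual constants or patterns in `ProjectZomboidRedactor`.

```php
<?php
declare(strict_types=1);

// Hypothetical placeholders; the real values live in ProjectZomboidRedactor.
const STEAM_ID_PLACEHOLDER = '00000000000000000'; // 17 zeros (assumed shape)
const NAME_PLACEHOLDER = 'REDACTED_NAME';         // assumed label

/** Pass 1: replace any standalone 17-digit run with the zeroed placeholder. */
function sketchRedactSteamIds(string $content): string
{
    return preg_replace('/\b\d{17}\b/', STEAM_ID_PLACEHOLDER, $content) ?? $content;
}

/**
 * Pass 2: redact a double-quoted name, but only when it directly follows
 * the already-redacted Steam ID placeholder. A raw Steam ID never matches.
 */
function sketchRedactNames(string $content): string
{
    $pattern = '/(' . STEAM_ID_PLACEHOLDER . ' ")[^"]*(")/';
    return preg_replace($pattern, '${1}' . NAME_PLACEHOLDER . '${2}', $content) ?? $content;
}

$line = '76561198000000001 "Player1" connected';

// Correct order: Steam ID pass first, then the name pass.
$ordered = sketchRedactNames(sketchRedactSteamIds($line));

// Swapped order: the name pass sees a raw Steam ID and finds no anchor.
$swapped = sketchRedactSteamIds(sketchRedactNames($line));

// Second application of the correctly ordered passes.
$twice = sketchRedactNames(sketchRedactSteamIds($ordered));

var_dump($ordered);            // name replaced by the placeholder label
var_dump($swapped);            // "Player1" is still present
var_dump($ordered === $twice); // bool(true): a second pass is a no-op
```

In this sketch the swapped order leaves `"Player1"` in the output because the name regex only anchors on the placeholder, which is the failure mode Pitfall 5 warns about. The second pass is a no-op because each placeholder is replaced by itself.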