4 Commits

Author SHA1 Message Date
aa708a34a4 docs: extend CLAUDE.md with cross-session friction notes
Captures four pieces of context that cost time to (re)derive this
session:

- Docker `--entrypoint php` one-liner for ad-hoc PHP that needs the
  codex autoloader (used for redactor smoke tests).
- Pitfall #6: PZ DebugLog-server has two coexisting line shapes
  (B41 with `t:` field, B42 without) — `DebugServerPattern::LINE`
  matches both via an optional group; narrowing it back to B41-only
  silently disables ServerExceptionProblem / ModMissingProblem on
  every B42 log.
- Deployed iblogs lives at bosslogs.indifferentketchup.com and uses
  `main` as its default branch, not `master`. Pinned to ^0.3.0.
- New top-level section for `tools/pz-analyzer/` describing the
  intentional split between the pre-production Qwen-backed
  discovery tool and the production-bound deterministic
  classifier, plus the redact-all wrapper that feeds both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 19:45:26 +00:00
656142dbf8 docs: cut v0.3.0 in CHANGELOG
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 19:04:37 +00:00
c63adb06c4 Merge branch 'pz42x-line-regex'
Fixes DebugServerPattern::LINE so PZ build 42.x logs (which dropped
the per-line `t:` microsecond field) parse with proper level/prefix
attribution. Without the fix every B42 entry fell through as level
INFO and ServerExceptionProblem / ModMissingProblem silently failed
to fire, leaving B42 log views with at most a single
EngineVersionInformation badge and no Problems panel. Backwards
compatible with B41 format; ProjectZomboidServerLogTest now runs
parameterised against both shapes via #[DataProvider].
2026-05-06 13:33:43 +00:00
0d18cfbfc6 fix: relax DebugServerPattern::LINE for PZ B42 log format
PZ build 42.x dropped the per-line `t:` (microsecond) field and
tightened the spacing between `f:N`, `t:N`, and `st:N,N,N,N>` markers.
The hardcoded `f:\d+,\s+t:\d+,\s+st:` requirement caused every B42
line to fail the parser's LINE regex, leaving ServerLog entries
without their level/prefix and silently disabling
ServerExceptionProblem and ModMissingProblem (the anchorless
EngineVersionInformation still fired against the joined entry text,
which is why the symptom was "one Information, no Problems").

Make `t:N,` optional via `(?:,\s+t:\d+)?` and the comma between
`f:N` and `st:` optional via `,?`. The B41 format remains a strict
match. Add `debug-server-42x-minimal.txt` mirroring the existing
synthetic fixture in the new format, and parameterise
ProjectZomboidServerLogTest with a #[DataProvider] so all four
parser-shape assertions now run against both formats. Spot-check:
analysers emit 3 Problems (2 exceptions, 1 missing mod) and 4
Information entries against the new fixture, identical to B41.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 13:33:35 +00:00
5 changed files with 95 additions and 11 deletions

@@ -6,6 +6,34 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and
## [Unreleased]
## [0.3.0] — 2026-05-04
Adds IP-address redaction to the PZ redactor, a new `ErrorContextAnalyser` for surrounding-context surfacing, the `tools/pz-analyzer/` Python toolset (pre-production Qwen-driven research analyser and production-bound deterministic classifier), and a parser fix for the PZ B42 log shape that was silently breaking level/prefix attribution since The Indie Stone dropped the per-line `t:` field. New public API surface across the redactor and the analyser-side classes makes this a minor bump rather than a patch.
### Added
- **IP redaction in `ProjectZomboidRedactor`** (`src/Util/ProjectZomboid/ProjectZomboidRedactor.php`) — fourth pass that scrubs IPv4 (strict 0-255 octets, optional `:port` suffix) and IPv6 (full, abbreviated, bracketed-with-port, IPv4-mapped) addresses, replacing them with the literal `[REDACTED_IP]`. New public API: `IP_REPLACEMENT`, `IPV4_REGEX`, `IPV6_REGEX` constants and a `redactIpAddresses(bool)` toggle (defaults on, mirroring the existing three category toggles). Pattern-disjoint from the Steam-ID → name → coordinates chain; runs first by convention. Strict regexes plus `filter_var()` validation prevent false positives on PZ timestamps and PHP / Java scope ops. 20 new unit tests across two files (`ProjectZomboidRedactorIpv4Test.php`, `ProjectZomboidRedactorIpv6Test.php`).
- **`ErrorContextAnalyser`** (`src/Analyser/ProjectZomboid/ErrorContextAnalyser.php`) — generic-purpose analyser that walks `Entry[]` once and emits one `ErrorContextProblem` per ERROR / WARNING entry with up to `CONTEXT_BEFORE` (20) entries of leading context and `CONTEXT_AFTER` (20) entries of trailing context. Overlapping windows clip to `lastEmittedIndex + 1` so no Entry appears in two context arrays; emission caps at `HIT_CAP` (500) with a single `ErrorContextTruncatedInformation` appended when reached. Standalone — not auto-registered to any existing Log subclass's `getDefaultAnalyser()`; consumers wire it in explicitly. Companion classes `ErrorContextProblem` and `ErrorContextTruncatedInformation` under `src/Analysis/ProjectZomboid/`. 3 unit tests, 134 assertions.
- **`tools/pz-analyzer/`** — Python toolset adjacent to the library (not part of the Composer package's autoload surface). `pz_redact_all.sh` is a one-shot Docker wrapper that runs the PHP redactor over `.scratch/pz/Logs/` and produces a gitignored `.scratch/pz/Logs.redacted/` directory. `pz_error_analysis.py` is a developer-facing Qwen-backed pre-production analyser that calls a local OpenAI-compatible endpoint to classify residual log shapes the deterministic side hasn't yet captured. `pz_parser.py` + `pz_classify.py` are the production-bound deterministic-only counterpart: pure parser module with mod attribution, file:line extraction, cause-chain unwinding, engine-noise tagging, and a two-level signature scheme (`pattern_id` + `signature`), plus a stdlib-only orchestrator that walks the redacted directory and emits a JSON report. 32 Python unit tests across three files, 16 synthetic fixtures.
- `docs/superpowers/specs/2026-05-04-pz-deterministic-classifier-design.md` — design contract for `pz_parser.py` / `pz_classify.py`. The PHP-side `ErrorContextAnalyser` ships without a separate spec; its design fell out of a brainstorming session inline with the pzmm-pattern-port discussion.
- New synthetic fixture `test/src/Games/ProjectZomboid/fixtures/debug-server-42x-minimal.txt` mirroring the existing B41 fixture in PZ B42 line shape.
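The window-clipping rule described for `ErrorContextAnalyser` is easier to follow on a toy input. What follows is a minimal Python sketch, not the PHP implementation: `collect_context`, `Hit`, and the use of plain level strings in place of `Entry` objects are hypothetical. Two behaviours are assumptions rather than facts from the changelog: a hit entry may itself sit inside the previous hit's trailing window, and truncation is flagged only when a further hit would have been dropped.

```python
from dataclasses import dataclass

CONTEXT_BEFORE = 20
CONTEXT_AFTER = 20
HIT_CAP = 500


@dataclass
class Hit:
    index: int          # position of the ERROR / WARNING entry
    before: list[int]   # indices of leading context entries
    after: list[int]    # indices of trailing context entries


def collect_context(levels, before=CONTEXT_BEFORE, after=CONTEXT_AFTER, hit_cap=HIT_CAP):
    """Single pass over entry levels; returns (hits, truncated)."""
    hits = []
    truncated = False
    last_emitted = -1  # highest index already placed in a context array
    for i, level in enumerate(levels):
        if level not in ("ERROR", "WARNING"):
            continue
        if len(hits) >= hit_cap:
            # Assumption: flag truncation only when a hit is actually dropped.
            truncated = True
            break
        # Clip both windows so they start no earlier than last_emitted + 1.
        start = max(i - before, last_emitted + 1)
        end = min(i + after, len(levels) - 1)
        lead = list(range(start, i))
        trail = list(range(max(i + 1, last_emitted + 1), end + 1))
        hits.append(Hit(i, lead, trail))
        last_emitted = max(last_emitted, end)
    return hits, truncated
```

With `before=2` and `after=2`, hits at indices 3 and 5 yield context arrays `[1, 2]` / `[4, 5]` and `[]` / `[6, 7]`: the second hit's leading window is clipped away entirely because the first hit's trailing window already covers it.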
### Changed
- **`DebugServerPattern::LINE` regex relaxed** to handle PZ build 42.x. The Indie Stone dropped the per-line `t:` (microsecond) field and tightened the spacing between `f:N`, `t:N`, and `st:N,N,N,N>` markers somewhere on the way to build 42.17. The previous regex required the full `f:\d+,\s+t:\d+,\s+st:` triplet and silently failed on every B42 line. Now `(?:,\s+t:\d+)?` makes the `t:N,` field optional and `,?` makes the inter-field comma optional. Backwards-compatible — every B41 line continues to parse identically. `ProjectZomboidServerLogTest` now runs each parser-shape assertion via `#[DataProvider]` against both fixtures.
- **Pass order in `ProjectZomboidRedactor::redact()`**: the new IP pass runs first, so the chain is now `IP → Steam ID → player name → coordinates`. The mandatory Steam ID → name → coordinates ordering is preserved; placement of the IP pass is by convention since its regexes are pattern-disjoint from the rest.
- **`CLAUDE.md`** documents `iblogs` as the primary downstream consumer with a per-component checklist for cross-repo public API impact; the release-flow cadence; the feature-branch workflow set by the `redactor` and `iblogs-bootstrap` precedents; and the `docs/superpowers/specs|plans/` path convention.
- **`.gitignore`** excludes `__pycache__/` (Python bytecode caches generated under `tools/pz-analyzer/`) and `*.bak` / `*.bak-*` (editor / manual backup files).
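The IP pass combines a strict regex with a second validation step. The sketch below illustrates that two-layer idea for IPv4 only, in Python rather than PHP. It is not the shipped `IPV4_REGEX`: the pattern is written from the changelog's description (strict 0-255 octets, optional `:port`), `ipaddress.IPv4Address` stands in for PHP's `filter_var()`, the function name `redact_ip_addresses` is hypothetical, and dropping the port together with the address is an assumption.

```python
import ipaddress
import re

IP_REPLACEMENT = "[REDACTED_IP]"

# One octet in the range 0-255, no leading zeros.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Illustrative pattern only; the real constant lives in ProjectZomboidRedactor.
IPV4_REGEX = re.compile(
    rf"(?<![\d.])({_OCTET}(?:\.{_OCTET}){{3}})(?::\d{{1,5}})?(?!\d|\.\d)",
    re.ASCII,
)


def redact_ip_addresses(text: str) -> str:
    """Replace IPv4 addresses (and any :port suffix) with IP_REPLACEMENT."""

    def _replace(match: re.Match) -> str:
        try:
            # Second layer: reject anything the regex let through by mistake.
            ipaddress.IPv4Address(match.group(1))
        except ipaddress.AddressValueError:
            return match.group(0)
        return IP_REPLACEMENT

    return IPV4_REGEX.sub(_replace, text)
```

The lookarounds keep the pattern from firing inside longer dotted or numeric runs, so a PZ timestamp such as `00:00:42.314` or a version string such as `42.17.0` passes through unchanged.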
### Fixed
- PZ build 42.x server logs now parse with proper level / prefix attribution. Previously, every B42 line failed `DebugServerPattern::LINE` and the resulting ServerLog entries fell through as level `INFO` with no prefix. This silently disabled `ServerExceptionProblem` and `ModMissingProblem` (their regexes anchor on `[timestamp]...` at entry start, which a level-less orphan entry doesn't emit). The anchorless `EngineVersionInformation` continued to fire against the joined entry text, producing the user-visible symptom "one Information badge, empty Problems panel" on B42 logs. The fix restores per-line parsing, re-enables both Problem classes, and makes the error-count badge populate correctly.
### Test counts
- PHP suite: **287 tests, 654 assertions** (up from 260 / 492 at v0.2.0).
- Python suite under `tools/pz-analyzer/`: **32 tests** (stdlib `unittest`, sub-10 ms).
## [0.2.0] — 2026-05-01
Render-time PII redaction utility added on the same calendar day as v0.1.0. Cut as a minor version bump rather than a patch because it adds a new public API surface (`RedactorInterface` plus the per-game implementation), which under semver is a minor change, not a patch. Consumers (notably iblogs) pin to `^0.2.0` to opt into the redactor-aware version.
@@ -51,5 +79,6 @@ First public release. Codex is a generic PHP log parsing and analysis framework
- **Other game implementations** — `Minecraft`, `Hytale`, and `SevenDaysToDie` are detective-stub-only. Each has a TODO `<Game>Detective` extending base `Detective`; their per-component subdirectories under `Analyser`, `Log`, `Parser`, and `Pattern` contain only `.gitkeep` placeholders. Real implementations land if and when fixtures and demand exist.
- **Packagist publication** — v0.1.0 is consumable via Composer's `vcs` repository entry pointing at the Gitea remote. Pushing to Packagist is a separate decision and is not in scope for this release.
[0.3.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.3.0
[0.2.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.2.0
[0.1.0]: https://git.indifferentketchup.com/indifferentketchup/ik-codex/releases/tag/v0.1.0

@@ -16,6 +16,12 @@ docker run --rm -v "$(pwd):/app" -w /app -u "$(id -u):$(id -g)" composer:latest
Use `$(pwd)` or an absolute path — bare `$PWD` has misfired here, mounting nothing and silently no-op'ing the run.
For ad-hoc PHP that needs the codex autoloader (e.g. running `ProjectZomboidRedactor::redact()` over a directory of log files, or eyeballing analyser output), override the entrypoint:
```
docker run --rm --entrypoint php -v "$(pwd):/app" -w /app -u "$(id -u):$(id -g)" composer:latest -r '<php source>'
```
## Common commands
- All tests: `composer test` (= `phpunit test/tests` per `composer.json`)
@@ -88,6 +94,16 @@ At minimum: (1) entry count after `parse()` matches the synthetic fixture's line
`iblogs` (sibling repo at `/opt/iblogs`, package `indifferentketchup/iblogs`, fork of `aternosorg/mclogs`) is the primary consumer of codex via a Composer `vcs` repository entry pinned to the latest minor tag. Public-API changes in `src/{Detective,Log,Printer,Util}/*.php` and `src/Analysis/*.php` propagate there; when modifying those types, sanity-check the iblogs call sites at `/opt/iblogs/src/{Detective.php,Log.php,Printer/Printer.php,Printer/FormatModification.php,Api/Response/CodexLogResponse.php}` and the stub class at `/opt/iblogs/src/Data/Deobfuscator.php`.
The deployed iblogs instance lives at `bosslogs.indifferentketchup.com` (production renders the same code path as the local dev server on port 4217). iblogs's default branch is `main`, not `master`. iblogs's `composer.json` constraint is currently `^0.3.0`; cutting a v0.4.x will require widening that.
## Out-of-library tools (`tools/pz-analyzer/`)
Python utilities alongside the Composer package, not on the PSR-4 autoload surface. Two artefacts coexist by design — the deterministic classifier is the production target; the Qwen tool is the developer's discovery aid for shapes the deterministic side hasn't captured yet.
- **`pz_redact_all.sh`** — one-shot Docker wrapper. Runs `ProjectZomboidRedactor` over `.scratch/pz/Logs/` and writes `.scratch/pz/Logs.redacted/`. Both Python tools below consume the redacted directory.
- **`pz_error_analysis.py`** — *pre-production*, Qwen-backed. Sends residual log shapes to the local Qwen endpoint at `http://100.101.41.16:8401/v1` (sam-desktop, model `qwen3.6-35b-a3b`) for natural-language classification with category / cause / fix output. Requires the `openai` package; in practice run via `/opt/analytics/.venv/bin/python` which has it installed.
- **`pz_parser.py` + `pz_classify.py`** — *production-bound deterministic classifier*. Stdlib only. Mirrors the patterns from `paraxaQQ/pzmm`'s `core/inspector.py` (Lua mod-marker attribution, bidirectional stack collection, file:line extraction, cause-chain unwinding, engine-noise tagging) plus a two-level signature scheme (`pattern_id` + `signature`). Designed to inform a future PHP port to `LuaErrorAnalyser` / `ModAttributionAnalyser` under `src/Analyser/ProjectZomboid/`. 32 stdlib `unittest` tests under `tools/pz-analyzer/tests/`; invocation: `python3 -m unittest discover -s tools/pz-analyzer/tests`.
## Pitfalls
1. **`PatternParser` is incompatible with named regex groups.** PHP's `preg_match` returns named groups *plus* their numeric duplicates in the same array; `PatternParser`'s foreach iterates both and throws on the string-key entries. Convention: `LINE` regexes (used by the parser) use **unnamed** groups with field order documented in the Pattern class's docblock. Named groups are fine inside extractor regexes invoked from analysers, since `PatternAnalyser` hands the whole match array to `Insight::setMatches`.
@@ -95,6 +111,7 @@ At minimum: (1) entry count after `parse()` matches the synthetic fixture's line
3. **`Level::fromString()` defaults to `Level::INFO` for unknown tokens.** Project Zomboid log levels map: `LOG`/`INFO` → INFO; `WARN` → WARNING; `ERROR` → ERROR.
4. **`PatternParser` matches array** must declare a match-type for **every** capture group in the regex (`TIME`, `LEVEL`, or `PREFIX`); otherwise the parser throws on the unmapped index. Use non-capturing groups `(?:...)` for fields you want to skip.
5. **`ProjectZomboidRedactor` pass order is mandatory.** `PLAYER_AFTER_STEAMID_REGEX` anchors on the already-redacted Steam ID placeholder — it will not match raw Steam IDs. Do NOT swap the Steam ID and player-name passes, and do NOT stub out the Steam ID pass while leaving the player-name pass enabled.
6. **Two PZ DebugLog-server line formats coexist.** B41 emits `[ts] LEVEL: Subsystem f:N, t:N, st:N,N,N,N>`; B42 (build 42.17 onward) dropped the `t:` microsecond field and tightened spacing to `f:N st:N,N,N,N>`. `DebugServerPattern::LINE` matches both via `(?:,\s+t:\d+)?,?` — preserve that optional group when editing or B42 logs silently fail to parse, leaving entries level-less and analysers (`ServerExceptionProblem`, `ModMissingProblem`) silently dormant. Fixtures cover both: `debug-server-minimal.txt` (B41), `debug-server-42x-minimal.txt` (B42).
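The level mapping from pitfall 3 can be sketched in a few lines. This is a Python illustration under stated assumptions, not the PHP `Level` enum: the names `Level` and `level_from_string` are hypothetical stand-ins, and trimming and upper-casing the token before lookup is an assumption.

```python
from enum import Enum


class Level(Enum):
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"


# Project Zomboid tokens as listed in pitfall 3.
_PZ_LEVELS = {
    "LOG": Level.INFO,
    "INFO": Level.INFO,
    "WARN": Level.WARNING,
    "ERROR": Level.ERROR,
}


def level_from_string(token: str) -> Level:
    # Unknown tokens fall back to INFO instead of raising.
    return _PZ_LEVELS.get(token.strip().upper(), Level.INFO)
```

Because the fallback is silent, a misparsed level token looks like an ordinary INFO entry, so it is worth asserting on specific levels in tests rather than only on entry counts.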
## Workflow conventions

@@ -15,7 +15,7 @@ namespace IndifferentKetchup\Codex\Pattern\ProjectZomboid;
*/
class DebugServerPattern
{
-public const string LINE = '/^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)\s+f:\d+,\s+t:\d+,\s+st:[\d,]+>\s+.*$/';
+public const string LINE = '/^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)\s+f:\d+(?:,\s+t:\d+)?,?\s+st:[\d,]+>\s+.*$/';
public const string VERSION = '/version=(?<version>\S+) (?<hash>[a-f0-9]{40}) (?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2})/';
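To see the effect of the relaxed pattern, the two `LINE` variants can be compared side by side. The sketch below ports both regexes to Python's `re` module with the PHP `/.../` delimiters stripped, on the assumption that these constructs behave the same in both engines. The B42 sample is the first line of the new fixture. The B41 sample is hypothetical, built from the `f:N, t:N, st:N,N,N,N>` shape, with an invented `t:` value.

```python
import re

# Pre-fix pattern: requires the full "f:N, t:N, st:" triplet.
OLD_LINE = re.compile(
    r"^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)"
    r"\s+f:\d+,\s+t:\d+,\s+st:[\d,]+>\s+.*$"
)

# Post-fix pattern: "t:N" and the comma after "f:N" are both optional.
NEW_LINE = re.compile(
    r"^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)"
    r"\s+f:\d+(?:,\s+t:\d+)?,?\s+st:[\d,]+>\s+.*$"
)

# Hypothetical B41-shaped line (the t: value is made up for illustration).
B41 = "[16-04-26 00:00:42.314] LOG : General f:0, t:1776297642314, st:48,648,157,434> server initialised."
# B42 line taken from debug-server-42x-minimal.txt.
B42 = "[16-04-26 00:00:42.314] LOG : General f:0 st:48,648,157,434> SLF4J(W): No SLF4J providers were found.."

for name, line in (("B41", B41), ("B42", B42)):
    old, new = OLD_LINE.match(line), NEW_LINE.match(line)
    print(name, "old:", bool(old), "new:", bool(new), "groups:", new.groups() if new else None)
```

Both patterns capture the same three groups (time, level, prefix), so the parser's match-type array does not need to change: the added group is non-capturing.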

@@ -0,0 +1,22 @@
[16-04-26 00:00:42.314] LOG : General f:0 st:48,648,157,434> SLF4J(W): No SLF4J providers were found..
[16-04-26 00:00:42.315] LOG : General f:0 st:48,648,157,492> SLF4J(W): Defaulting to no-operation (NOP) logger implementation.
[16-04-26 00:00:42.407] LOG : General f:0 st:48,648,157,584> version=42.17.0 0000000000000000000000000000000000000000 2026-04-20 14:34:44 (ZB) demo=false.
[16-04-26 00:00:42.407] LOG : General f:0 st:48,648,157,585> revision=0000000000000000000000000000000000000000 date=2026-04-20 time=14:34:44 (ZB).
[16-04-26 00:01:19.080] ERROR: General f:0 st:48,648,194,258> DebugFileWatcher.registerDir> Exception thrown
java.nio.file.NoSuchFileException: /placeholder/config/mods at UnixException.translateToIOException(null:-1).
Stack trace:
java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
java.base/sun.nio.fs.UnixException.asIOException(Unknown Source)
java.base/sun.nio.fs.LinuxWatchService$Poller.implRegister(Unknown Source)
java.base/sun.nio.fs.AbstractPoller.processRequests(Unknown Source)
java.base/sun.nio.fs.LinuxWatchService$Poller.run(Unknown Source)
[16-04-26 00:01:19.131] LOG : Mod f:0 st:48,648,194,309> loading example_mod_alpha.
[16-04-26 00:01:19.142] LOG : Mod f:0 st:48,648,194,320> loading example_mod_beta.
[16-04-26 00:01:19.155] LOG : Mod f:0 st:48,648,194,333> loading example_mod_gamma.
[16-04-26 00:01:19.200] WARN : Mod f:0 st:48,648,194,378> ZomboidFileSystem.loadModAndRequired> required mod "absent_mod" not found.
[16-04-26 00:01:45.937] ERROR: WorldGen f:0 st:48,648,221,115> IsoPropertyType.lookupOrDefaultStr> Exception thrown
zombie.core.properties.IsoPropertyType$IsoPropertyTypeNotFoundException: Property Name not found: ladderW at IsoPropertyType.lookup(IsoPropertyType.java:269). Message: Property Name not found: ladderW
at zombie.core.properties.IsoPropertyType.lookup(IsoPropertyType.java:269)
at zombie.iso.IsoChunkData.PostProcessChunk(IsoChunkData.java:512)
[16-04-26 00:02:00.000] LOG : General f:0 st:48,648,235,178> server initialised.
[16-04-26 00:05:00.000] LOG : General f:0 st:48,648,415,178> shutdown requested.
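The fixture above mixes header lines with stack-trace continuation lines, and the tests expect the continuation lines to attach to the entry that triggered them. A minimal Python sketch of that grouping follows. It is an assumption about how the parser behaves, inferred from the test names, and not the PHP `PatternParser` itself. `group_entries` and the dict-based entry shape are hypothetical, as is the handling of a continuation line that arrives before any header.

```python
import re

# Python port of the relaxed DebugServerPattern::LINE (delimiters stripped).
LINE = re.compile(
    r"^\[(\d{2}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]\s+(\w+)\s*:\s+(\S+)"
    r"\s+f:\d+(?:,\s+t:\d+)?,?\s+st:[\d,]+>\s+.*$"
)


def group_entries(lines):
    """Start a new entry on each header line; append other lines to the last entry."""
    entries = []
    for line in lines:
        match = LINE.match(line)
        if match:
            time, level, prefix = match.groups()
            entries.append({"time": time, "level": level, "prefix": prefix, "lines": [line]})
        elif entries:
            # Continuation (e.g. a stack-trace frame) belongs to the previous entry.
            entries[-1]["lines"].append(line)
        else:
            # Assumption: an orphan line before any header becomes a level-less entry.
            entries.append({"time": None, "level": None, "prefix": None, "lines": [line]})
    return entries


# Subset of the B42 fixture: one ERROR header, three continuation lines, one LOG header.
SAMPLE = [
    "[16-04-26 00:01:19.080] ERROR: General f:0 st:48,648,194,258> DebugFileWatcher.registerDir> Exception thrown",
    "java.nio.file.NoSuchFileException: /placeholder/config/mods at UnixException.translateToIOException(null:-1).",
    "Stack trace:",
    "java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)",
    "[16-04-26 00:01:19.131] LOG : Mod f:0 st:48,648,194,309> loading example_mod_alpha.",
]

for entry in group_entries(SAMPLE):
    print(entry["level"], entry["prefix"], len(entry["lines"]))
```

If the header regex stops matching, as the pre-fix pattern did on B42 lines, every line takes one of the two fallback branches and no entry ever receives a level or prefix.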

@@ -6,18 +6,31 @@ use IndifferentKetchup\Codex\Detective\Detective;
use IndifferentKetchup\Codex\Log\File\PathLogFile;
use IndifferentKetchup\Codex\Log\Level;
use IndifferentKetchup\Codex\Log\ProjectZomboid\ProjectZomboidServerLog;
+use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\TestCase;
class ProjectZomboidServerLogTest extends TestCase
{
-private function fixturePath(): string
+/**
+* Both PZ B41 and B42 line shapes must parse identically. B41 (and the
+* fixture used by every analyser test) emits `f:N, t:N, st:N,N,N,N>`;
+* B42 (release branch from 2026-04 onward, e.g. build 42.17) drops the
+* `t:` microsecond field entirely and tightens whitespace to
+* `f:N st:N,N,N,N>`.
+*/
+public static function fixtureProvider(): array
{
-return __DIR__ . '/../../../../src/Games/ProjectZomboid/fixtures/debug-server-minimal.txt';
+$base = __DIR__ . '/../../../../src/Games/ProjectZomboid/fixtures';
+return [
+'pz41-format' => [$base . '/debug-server-minimal.txt'],
+'pz42-format' => [$base . '/debug-server-42x-minimal.txt'],
+];
}
-public function testParsesEntriesWithLevelAndPrefix(): void
+#[DataProvider('fixtureProvider')]
+public function testParsesEntriesWithLevelAndPrefix(string $fixturePath): void
{
-$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
$log->parse();
$entries = $log->getEntries();
@@ -29,9 +42,10 @@ class ProjectZomboidServerLogTest extends TestCase
$this->assertNotNull($first->getTime());
}
-public function testStackTraceLinesAttachToTriggeringErrorEntry(): void
+#[DataProvider('fixtureProvider')]
+public function testStackTraceLinesAttachToTriggeringErrorEntry(string $fixturePath): void
{
-$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
$log->parse();
$errorEntry = null;
@@ -46,19 +60,21 @@ class ProjectZomboidServerLogTest extends TestCase
$this->assertGreaterThan(1, count($errorEntry->getLines()));
}
-public function testWarnLevelMapsCorrectly(): void
+#[DataProvider('fixtureProvider')]
+public function testWarnLevelMapsCorrectly(string $fixturePath): void
{
-$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($this->fixturePath()));
+$log = (new ProjectZomboidServerLog())->setLogFile(new PathLogFile($fixturePath));
$log->parse();
$warnEntries = array_filter($log->getEntries(), fn($e) => $e->getLevel() === Level::WARNING);
$this->assertNotEmpty($warnEntries);
}
-public function testDetectiveDispatchesByContent(): void
+#[DataProvider('fixtureProvider')]
+public function testDetectiveDispatchesByContent(string $fixturePath): void
{
$detective = (new Detective())
-->setLogFile(new PathLogFile($this->fixturePath()))
+->setLogFile(new PathLogFile($fixturePath))
->addPossibleLogClass(ProjectZomboidServerLog::class);
$log = $detective->detect();