Compare commits: v0.1.0...409de16003 (3 commits)

| SHA1 |
|---|
| 409de16003 |
| aec835e0eb |
| 6fde2d49ff |
@@ -62,7 +62,7 @@ test/tests/Games/<Game>/...
test/src/Games/<Game>/fixtures/<type>-minimal.txt (synthetic fixtures only)
```
Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty `.gitkeep`s plus a TODO `<Game>Detective` extending base `Detective`). `ProjectZomboid` is fully implemented: 11 log subclasses, 11 pattern classes, detective wired with all 11, synthetic fixtures, dispatch tests, plus the analyser surface — 12 `PatternAnalyser`-driven Insight classes under `src/Analysis/ProjectZomboid/` and 3 custom `Analyser` subclasses under `src/Analyser/ProjectZomboid/` for cross-entry / threshold logic.
Scaffolded games: `Minecraft`, `Hytale`, `SevenDaysToDie` (stubs only — empty `.gitkeep`s plus a TODO `<Game>Detective` extending base `Detective`). `ProjectZomboid` is fully implemented: 11 log subclasses, 11 pattern classes, detective wired with all 11, synthetic fixtures, dispatch tests, plus the analyser surface — 11 `PatternAnalyser`-driven Insight classes under `src/Analysis/ProjectZomboid/` and 3 custom `Analyser` subclasses under `src/Analyser/ProjectZomboid/` for cross-entry / threshold logic.
`src/Pattern/` is **not a framework abstraction** — patterns are plain `string` class constants. Each `<Type>Pattern` typically holds a `LINE` constant for the parser plus named-group extractor constants (`FIELDS`, `COMBAT`, `MOD_LOAD`, etc.) for analysers.
211
docs/superpowers/plans/2026-05-01-redactor.md
Normal file
@@ -0,0 +1,211 @@
# Redactor Utility Implementation Plan
> Forward-looking. No code is written by this document.
> Branch: `redactor` (off master `aec835e`). Backup tag: `backup/pre-redactor`.
> Spec: `docs/superpowers/specs/2026-04-30-redactor-design.md`.
**Goal:** Land the `RedactorInterface` plus a concrete `ProjectZomboidRedactor` implementation so iblogs (and any other downstream consumer) can scrub Project Zomboid log content of Steam IDs, player names, and world coordinates with a single call. The Redactor is a render-time filter on raw string content; raw stays canonical at the storage layer.
**Architecture:** Standalone string-in/string-out utility under a new top-level `src/Util/` directory, with per-game implementations under `src/Util/<Game>/`. Each implementation owns the lexical regex anchors for its game's PII shapes. Three independent toggles per implementation (`redactSteamIds`, `redactPlayerNames`, `redactCoordinates`); defaults all on; "all toggles off" yields verbatim passthrough.
**Tech stack:** PHP 8.4+, PHPUnit 12, Composer (`indifferentketchup/codex` v0.1.0+). All command invocations wrap in the `composer:latest` Docker image per `CLAUDE.md`.
---
## Design questions — resolved
### a. Render-time vs ingest-time
**Decision: render-time. This confirms the spec's lean.**
Raw log content is canonical. Redaction is a view filter that consumers apply when they want to display, export, or analyse a redacted projection. iblogs's storage layer holds the unredacted upload (subject to iblogs's own upload-time `Filter` chain for IPs/access-tokens, which is a different layer of defence); the codex Redactor runs on the way *out* of storage, not on the way in.
**Why:** the alternative (ingest-time, where storage holds redacted content) is destructive — once stored, the original cannot be recovered for legitimate operator use. Render-time leaves the original in place and lets each render path opt in. iblogs gets a per-session toggle without needing to keep two copies of every paste.
**Implication for iblogs schema:** iblogs stores raw content; the redaction toggle in the iblogs UI invokes `ProjectZomboidRedactor::redact()` at render time (server-side) or at fetch time (API consumers' choice). No schema migration required for the redaction feature.
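
The render-time decision can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `InMemoryPasteStore`, `renderPaste()`, and the stand-in redactor closure are hypothetical names, not iblogs or codex code. The point is only that storage keeps the raw bytes and the toggle selects which projection is returned.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the iblogs storage layer: it only ever holds raw content.
final class InMemoryPasteStore
{
    /** @var array<string, string> */
    private array $raw = [];

    public function put(string $id, string $content): void
    {
        $this->raw[$id] = $content;
    }

    public function get(string $id): string
    {
        return $this->raw[$id] ?? throw new OutOfBoundsException("Unknown paste: {$id}");
    }
}

/**
 * Render-time projection: the redactor runs on the way out of storage.
 *
 * @param callable(string): string $redactor stand-in for RedactorInterface::redact()
 */
function renderPaste(InMemoryPasteStore $store, string $id, bool $redactPii, callable $redactor): string
{
    $raw = $store->get($id);

    return $redactPii ? $redactor($raw) : $raw;
}

// Illustrative redactor only: zeroes anything shaped like 76561198 + 9 digits.
$redactor = static fn (string $s): string
    => preg_replace('/(?<!\d)76561198\d{9}(?!\d)/u', '76561198000000000', $s)
        ?? throw new RuntimeException(preg_last_error_msg());

$store = new InMemoryPasteStore();
$store->put('abc', 'connect 76561198000000003');

$redactedView = renderPaste($store, 'abc', true, $redactor);
$rawView = renderPaste($store, 'abc', false, $redactor);
```

Because the store is never written to during rendering, turning the toggle off again returns the original content without a second stored copy.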
### b. Redactor as standalone class vs Printer decorator
**Decision: standalone utility (option iii from the question).**
The Redactor is a `string → string` function. It does not know about `Insight`, `Printer`, or any other codex type. Three options were considered:
- **(i) Printer wrapper.** Cleanly composable but ties the Redactor to the Printer abstraction. Doesn't help iblogs's most common case: redacting raw log content for display in a non-Printer rendering path (HTML page rendered server-side, raw download served to API client).
- **(ii) Pre-Printer pass on Insights.** Heavy. Insights are typed objects with structured fields; redacting them means per-Insight code that knows which fields are PII-bearing. Against the YAGNI line for v1.
- **(iii) Standalone string utility.** Simple, generic, works on any string input — raw log content, JSON-serialised analysis output, rendered Printer output piped through. Doesn't know about Insights.
The spec describes (iii). v1 ships (iii) only. If a Printer-wrapper convenience is later wanted, it can be added as a thin adapter that calls the standalone Redactor on the Printer's output; it doesn't require restructuring the core.
### c. PII field taxonomy for PZ
**Decision: regex-based with lexical context anchors. No structured-field detection in v1.**
PZ-specific PII categories observed in the in-tree fixtures and the `.scratch/pz/Logs/` reference corpus:

| Field | Detection | Rationale |
|---|---|---|
| Steam ID | regex with `76561198\d{9}` prefix anchor and word-boundary classes | Steam's `76561198` SteamID64 universe prefix lets us cleanly distinguish from other long numbers (timestamps, build numbers). |
| Player name | regex with multi-context lexical anchors (after-Steam-ID-quoted, ChatMessage author, `Combat:`/`Safety:` subsystem) | Names are arbitrary strings, so they are not detectable without context. The contexts are well-defined by the parser-side pattern classes. |
| World coordinate triple | regex with bracket / paren / `at`-clause anchors | Generic `\d+,\d+,\d+` would over-redact server metadata (`f:0, t:NNNN, st:48,648,157,584`). Lexical context disambiguates. |
**Not redacted in v1:**
- **IP addresses.** PZ logs do not normally include IPs in any of the eleven file types observed. iblogs's upload-side `IPv4Filter` / `IPv6Filter` (ported from upstream mclogs) covers the rare case where a mod might log them.
- **Server-side usernames distinct from player names.** PZ uses Steam display name as the player identity; there's no separate auth username layer. Mclogs's `UsernameFilter` is Minecraft-specific and isn't mirrored here.
- **BurdJournals scientific-notation Steam IDs** (`7.65611…E16`). Spec open-question 2 explicitly defers this to v2; the `[BurdJournals]` tag already disambiguates them as mod-internal.
**Hybrid (regex + structured-field) deferred.** A v2 enhancement could redact specific Insight fields at JSON-serialisation time (e.g. `ConnectionFailureProblem::$steamId` → placeholder when serialised). Useful only if iblogs starts shipping the structured analysis JSON to redacted views — a real but currently hypothetical need.
### d. Replacement strategy
**Decision: per-category placeholder strings matching the synthetic-fixture conventions. Configurable replacement style is YAGNI for v1.**
Per the spec:

| Category | Replacement |
|---|---|
| Steam ID | `76561198000000000` (zeroed placeholder, still a syntactically valid Steam ID) |
| Player name | `<player>` |
| Coordinates | `0,0,0` (with shape preserved per anchor: bracketed, parenthesised, or `at` clause) |
Why these specifically and not `[REDACTED]` / `[STEAM_ID]` / hashed:
- The placeholders **match the existing synthetic test fixtures** (`76561198000000001`–`76561198000000004` collapse to `76561198000000000`; player names `Player1`/`Player2`/`AdminUser` collapse to `<player>`). Tests can verify "redacted output looks like a synthetic fixture."
- Shape preservation means downstream consumers can still parse the redacted output with the same Pattern classes — a redacted log is still a syntactically valid PZ log, it just contains no identities.
- Type-tagged replacements (`[STEAM_ID]`) break shape preservation: a Pattern looking for `\d{17}` would fail. Worth offering as a config option if a consumer specifically wants type-visibility, but v1 ships placeholder-only.
- Hashing breaks shape preservation similarly and adds determinism / collision concerns.
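
The shape-preservation argument in the bullets above can be checked with a small sketch. `CONNECT_LINE` is a hypothetical parser-side pattern invented for this example (it is not one of the codex Pattern constants), and the two sample lines are assumed shapes.

```php
<?php
declare(strict_types=1);

// Hypothetical parser-side pattern: a 17-digit Steam ID followed by a quoted name.
const CONNECT_LINE = '/^(?<steamId>\d{17}) "(?<name>[^"]*)"/u';

// Placeholder-style output keeps the original shape.
$placeholderStyle = '76561198000000000 "<player>"';

// Type-tagged output changes the shape of the Steam ID field.
$typedStyle = '[STEAM_ID] "[PLAYER]"';

$placeholderParses = preg_match(CONNECT_LINE, $placeholderStyle) === 1;
$typedParses = preg_match(CONNECT_LINE, $typedStyle) === 1;
```

Under these assumptions the placeholder line still matches the `\d{17}` field and the type-tagged line does not, which is the trade-off described above.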
If a consumer later needs `[STEAM_ID]`-style output, a `setReplacementStyle('typed' | 'placeholder' | 'redacted')` setter can be added without breaking the v1 API. v1 ships placeholder-only.
### e. Game-agnostic vs PZ-specific layout
**Decision: thin generic interface in `src/Util/` plus PZ-specific implementation in `src/Util/ProjectZomboid/`.**

```
src/Util/
├── RedactorInterface.php              (1 method: redact(string): string)
└── ProjectZomboid/
    └── ProjectZomboidRedactor.php     (toggles + regex passes)
```
**YAGNI tradeoff stated:** the interface has one method and currently one implementation. Strictly, YAGNI says collapse to just `ProjectZomboidRedactor` and skip the interface. The interface earns its keep because **iblogs's call sites will type-hint against `RedactorInterface`**, not the concrete class — that's the architectural payoff. Consumer code stays loosely coupled; when Minecraft or another game ships a redactor, iblogs swaps the implementation by changing one DI binding rather than touching call sites.
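
A minimal sketch of that payoff, assuming a hypothetical `PasteRenderer` call site and a hypothetical `PassthroughRedactor`; the redactor bodies are placeholders, and namespaces are omitted because the real ones are not stated here.

```php
<?php
declare(strict_types=1);

interface RedactorInterface
{
    public function redact(string $content): string;
}

// Stand-in for the real PZ implementation; the body is illustrative only.
final class ProjectZomboidRedactor implements RedactorInterface
{
    public function redact(string $content): string
    {
        return str_replace('Player1', '<player>', $content);
    }
}

// Hypothetical second implementation, e.g. for a game with no redactor yet.
final class PassthroughRedactor implements RedactorInterface
{
    public function redact(string $content): string
    {
        return $content;
    }
}

// Hypothetical iblogs call site: it depends on the interface only.
final class PasteRenderer
{
    public function __construct(private readonly RedactorInterface $redactor) {}

    public function render(string $raw): string
    {
        return $this->redactor->redact($raw);
    }
}

// The single binding: swapping this argument changes behaviour at every call site.
$pzRenderer = new PasteRenderer(new ProjectZomboidRedactor());
$plainRenderer = new PasteRenderer(new PassthroughRedactor());
```

`PasteRenderer` never names a concrete redactor, so adding another game's implementation would not touch it.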
The cost is two files instead of one. Acceptable given the dependency-inversion benefit. The directory layout (`src/Util/<Game>/`) mirrors the components-outer-with-game-suffix convention used everywhere else in the tree (Analyser, Analysis, Detective, Log, Parser, Pattern).
**Note on the new `src/Util/` directory.** Codex currently has no `src/Util/` (the Phase A scaffolding established Analyser / Analysis / Detective / Log / Parser / Pattern / Printer; Phase B.3 added Analyser/ProjectZomboid content but not Util). The Redactor introduces this new top-level. This is an additive change — no existing code is modified.
### f. Test strategy
**Decision: hybrid. Small dedicated synthetic inputs (string constants inside the test classes under `test/tests/Util/Redactor/`) for direct unit tests, plus an integration test that runs the Redactor over an existing PZ fixture and asserts idempotence.**
**Dedicated unit fixtures** (small string constants in test classes, not separate files): per spec test plan #1–#5. Each test class owns its input/expected pairs. Keeps unit tests self-contained and fast.
**Integration test** that re-uses an existing PZ fixture (e.g. `test/src/Games/ProjectZomboid/fixtures/admin-minimal.txt`). Two assertions:
- The Redactor's output is a syntactically valid log (still parses cleanly through the corresponding `ProjectZomboidAdminLog`).
- Idempotence: `redact(redact($x)) === redact($x)`. Existing fixture content is already placeholder-shaped, so the redactor should leave it byte-for-byte identical OR apply the canonical normalisation once and then no-op.
**False-positive avoidance.** The synthetic fixtures use `76561198000000001` etc. as placeholder Steam IDs. The Redactor's Steam ID regex matches the `76561198\d{9}` prefix and replaces with `76561198000000000` — so `76561198000000001` becomes `76561198000000000` (a normalisation, not a corruption). Tests verify this normalisation is correct and that legitimate-non-PII data (e.g. server metadata triples like `f:0, t:1776297642406, st:48,648,157,584`) is **not** touched.
---
## Tasks
Tasks are intended for the `redactor` branch. Each is a single logical commit. Test-running between commits uses the standard Docker invocation. Work proceeds only after Step 0 sign-off (this plan reviewed).
### Task 0 — Plan doc commit
- [ ] **Step 0.1.** Already done out-of-band: `git checkout -b redactor` off master `aec835e`; `git tag backup/pre-redactor` at branch tip; this plan written.
- [ ] **Step 0.2.** Commit this plan: `docs: add Redactor implementation plan` on branch `redactor`. Push branch to origin for review.
### Task 1 — Scaffold (interface + skeleton class with toggles)
- [ ] **Step 1.1.** Create `src/Util/RedactorInterface.php`. Single method: `public function redact(string $content): string;` PHPDoc describing the contract: stateless from the caller's perspective; configuration happens via implementation-specific setters before `redact()`.
- [ ] **Step 1.2.** Create `src/Util/ProjectZomboid/ProjectZomboidRedactor.php` that implements the interface. Class structure: three private bool properties (`$redactSteamIds`, `$redactPlayerNames`, `$redactCoordinates`) all defaulting to `true`; three fluent setters (`redactSteamIds(bool): static`, etc.); `redact(string): string` body that returns input unchanged when all toggles are off (for now — regex passes added in subsequent tasks).
- [ ] **Step 1.3.** Run `composer test` — expect 195 tests still green (no Redactor tests yet).
- [ ] **Step 1.4.** Commit: `feat: scaffold RedactorInterface and ProjectZomboidRedactor with toggles`.
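
One possible shape for the Task 1 scaffold, as a sketch only. The property and setter names follow Steps 1.1 and 1.2; namespaces are omitted because the real ones are not stated in this plan, and the `redact()` body is deliberately a no-op until the regex passes land.

```php
<?php
declare(strict_types=1);

interface RedactorInterface
{
    /** Returns a redacted projection of $content; configuration happens before this call. */
    public function redact(string $content): string;
}

final class ProjectZomboidRedactor implements RedactorInterface
{
    // All three toggles default to on.
    private bool $redactSteamIds = true;
    private bool $redactPlayerNames = true;
    private bool $redactCoordinates = true;

    // Fluent setters: each returns the same instance so calls can be chained.
    public function redactSteamIds(bool $enabled): static
    {
        $this->redactSteamIds = $enabled;

        return $this;
    }

    public function redactPlayerNames(bool $enabled): static
    {
        $this->redactPlayerNames = $enabled;

        return $this;
    }

    public function redactCoordinates(bool $enabled): static
    {
        $this->redactCoordinates = $enabled;

        return $this;
    }

    public function redact(string $content): string
    {
        if (!$this->redactSteamIds && !$this->redactPlayerNames && !$this->redactCoordinates) {
            return $content; // all toggles off: verbatim passthrough
        }

        // Regex passes arrive in Tasks 2 to 4; until then the scaffold changes nothing.
        return $content;
    }
}

$redactor = (new ProjectZomboidRedactor())
    ->redactSteamIds(false)
    ->redactPlayerNames(false)
    ->redactCoordinates(false);
```

PHP allows a property and a method to share a name, which is what lets `redactSteamIds` be both the flag and its setter here.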
### Task 2 — Steam ID redaction pass
- [ ] **Step 2.1.** Add `STEAM_ID_REGEX` and `STEAM_ID_REPLACEMENT` constants on `ProjectZomboidRedactor`. Regex uses the `76561198\d{9}` prefix anchor with word-boundary classes (per spec). The `/u` flag is added to all regexes for Unicode safety even though Steam IDs themselves are ASCII.
- [ ] **Step 2.2.** Implement the Steam ID branch of `redact()`: when `$redactSteamIds` is true, run `preg_replace` against the input.
- [ ] **Step 2.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorSteamIdTest.php`. Tests: redaction of various distinct synthetic Steam IDs collapses all to `76561198000000000`; non-Steam-ID 17-digit numbers (e.g. timestamps) are not touched; toggle-off leaves Steam IDs intact.
- [ ] **Step 2.4.** Run `composer test`. Expect new tests pass; old 195 unaffected.
- [ ] **Step 2.5.** Commit: `feat: add Steam ID redaction pass`.
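
A sketch of the Steam ID pass from Task 2, written as a free function for brevity. The exact regex is an assumption: the spec's "word-boundary classes" are approximated here with digit lookarounds. Note that with the `/u` flag, `preg_replace` returns `null` on invalid UTF-8 input, so the sketch throws rather than silently returning unredacted content.

```php
<?php
declare(strict_types=1);

// Assumed regex: 76561198 + 9 digits, not embedded in a longer digit run.
const STEAM_ID_REGEX = '/(?<!\d)76561198\d{9}(?!\d)/u';
const STEAM_ID_REPLACEMENT = '76561198000000000';

function redactSteamIds(string $content): string
{
    $result = preg_replace(STEAM_ID_REGEX, STEAM_ID_REPLACEMENT, $content);

    if ($result === null) {
        // Fail closed: never hand back content that may still contain IDs.
        throw new RuntimeException('Steam ID pass failed: ' . preg_last_error_msg());
    }

    return $result;
}

$redacted = redactSteamIds('id=76561198000000003 t:1776297642406 other=12345678901234567');
$longRun = redactSteamIds('765611980000000031');
```

The second call shows an 18-digit run that merely begins with the prefix; under the assumed lookarounds it is left alone.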
### Task 3 — Player name redaction pass
- [ ] **Step 3.1.** Add three regex constants on `ProjectZomboidRedactor` for the three player-name lexical contexts: `PLAYER_AFTER_STEAMID_REGEX`, `PLAYER_IN_CHATMESSAGE_REGEX`, `PLAYER_IN_PVP_SUBSYSTEM_REGEX`. Replacement is `<player>` for all. **Order constraint:** the after-Steam-ID context anchors on the post-redaction Steam ID `76561198000000000`, so the player-name pass must run *after* the Steam ID pass. Document this in a class-level docblock.
- [ ] **Step 3.2.** Implement the player-name branch of `redact()`: three sequential `preg_replace` calls when `$redactPlayerNames` is true.
- [ ] **Step 3.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorPlayerNameTest.php`. Tests: each of the three contexts redacts correctly when paired with its anchor; a bare quoted string (e.g. `"foo"` not preceded by a Steam ID) is **not** touched; toggle-off leaves names intact; the after-Steam-ID context works correctly when the Steam ID has already been redacted to the zeroed placeholder.
- [ ] **Step 3.4.** Run `composer test`. Expect new tests pass.
- [ ] **Step 3.5.** Commit: `feat: add player name redaction pass`.
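
The order constraint in Step 3.1 can be illustrated with only the after-Steam-ID context. The line shape (`<steam id> "<name>"`), the regex, and keeping the surrounding quotes in the replacement are all assumptions for this sketch; the ChatMessage and PvP contexts are left out because their shapes are not given here.

```php
<?php
declare(strict_types=1);

const STEAM_ID_REGEX = '/(?<!\d)76561198\d{9}(?!\d)/u';
const STEAM_ID_REPLACEMENT = '76561198000000000';

// Assumed shape: the quoted name directly follows the already-zeroed Steam ID.
const PLAYER_AFTER_STEAMID_REGEX = '/(?<=76561198000000000 )"[^"\r\n]*"/u';
const PLAYER_REPLACEMENT = '"<player>"';

function replaceOrFail(string $pattern, string $replacement, string $subject): string
{
    return preg_replace($pattern, $replacement, $subject)
        ?? throw new RuntimeException(preg_last_error_msg());
}

// Correct order: Steam IDs first, so the name anchor can see the placeholder.
function redactInOrder(string $content): string
{
    $content = replaceOrFail(STEAM_ID_REGEX, STEAM_ID_REPLACEMENT, $content);

    return replaceOrFail(PLAYER_AFTER_STEAMID_REGEX, PLAYER_REPLACEMENT, $content);
}

// Wrong order, shown only to demonstrate why the invariant exists.
function redactWrongOrder(string $content): string
{
    $content = replaceOrFail(PLAYER_AFTER_STEAMID_REGEX, PLAYER_REPLACEMENT, $content);

    return replaceOrFail(STEAM_ID_REGEX, STEAM_ID_REPLACEMENT, $content);
}

$line = '76561198000000002 "Player2" connected, motd "foo"';
$good = redactInOrder($line);
$bad = redactWrongOrder($line);
```

In the wrong order the name survives, because the anchor looks for the zeroed ID before any ID has been zeroed. The bare quoted `"foo"` is untouched in both cases.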
### Task 4 — Coordinates redaction pass
- [ ] **Step 4.1.** Add three regex constants on `ProjectZomboidRedactor` for the three coordinate contexts: `COORDS_AT_CLAUSE_REGEX`, `COORDS_BRACKETED_REGEX`, `COORDS_PARENTHESISED_REGEX`. Replacements preserve shape (`0,0,0` inside whatever bracket/paren wrapper).
- [ ] **Step 4.2.** Implement the coords branch of `redact()`: three sequential `preg_replace_callback` (or `preg_replace`) calls when `$redactCoordinates` is true.
- [ ] **Step 4.3.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorCoordinatesTest.php`. Tests: each of the three contexts redacts correctly; **negative test** — server metadata `f:0, t:1776297642406, st:48,648,157,584` is not touched; basement Z-coordinates (`-1`) are handled; toggle-off leaves coords intact.
- [ ] **Step 4.4.** Run `composer test`. Expect new tests pass.
- [ ] **Step 4.5.** Commit: `feat: add coordinates redaction pass`.
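
A sketch of the three coordinate contexts from Task 4. The constant names come from Step 4.1, but the regexes and the sample line shapes (`at x,y,z`, `[x,y,z]`, `(x,y,z)` with no spaces after commas) are assumptions; real PZ lines may need looser patterns.

```php
<?php
declare(strict_types=1);

// Assumed anchors. Each requires exactly three comma-separated integers,
// with an optional minus sign so basement Z levels such as -1 are covered.
const COORDS_AT_CLAUSE_REGEX = '/\bat -?\d+,-?\d+,-?\d+(?![\d,])/u';
const COORDS_BRACKETED_REGEX = '/\[-?\d+,-?\d+,-?\d+\]/u';
const COORDS_PARENTHESISED_REGEX = '/\(-?\d+,-?\d+,-?\d+\)/u';

function redactCoordinates(string $content): string
{
    // Patterns and replacements are paired by index; the wrapper shape is preserved.
    $result = preg_replace(
        [COORDS_AT_CLAUSE_REGEX, COORDS_BRACKETED_REGEX, COORDS_PARENTHESISED_REGEX],
        ['at 0,0,0', '[0,0,0]', '(0,0,0)'],
        $content
    );

    return $result ?? throw new RuntimeException(preg_last_error_msg());
}

$redacted = redactCoordinates('dropped at 10883,9290,0 near [10880,9288,-1] (12,34,5)');

// Negative case from the plan: server metadata has no bracket, paren, or "at" anchor.
$metadata = 'f:0, t:1776297642406, st:48,648,157,584';
$untouched = redactCoordinates($metadata);
```

The negative case passes because none of the three anchors appears in the metadata line, even though it contains comma-separated digit groups.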
### Task 5 — Combined / toggle / idempotence tests
- [ ] **Step 5.1.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorCombinedTest.php`. Tests cover: combined input with all three PII categories present produces fully-scrubbed output when all toggles on; each toggle off in isolation produces partial scrubbing matching the toggle's category; all toggles off returns input byte-for-byte identical (`===` equality).
- [ ] **Step 5.2.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorIdempotenceTest.php`. Tests: `redact(redact($x)) === redact($x)` for several input shapes including all three PII categories.
- [ ] **Step 5.3.** Run `composer test`. Expect new tests pass.
- [ ] **Step 5.4.** Commit: `test: add Redactor combined and idempotence coverage`.
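
The idempotence property in Step 5.2 reduces to a small helper. `isIdempotent()` and both sample closures are hypothetical; the second closure is a deliberately broken replacement that shows what the test would catch.

```php
<?php
declare(strict_types=1);

/**
 * True when applying $redact a second time changes nothing.
 *
 * @param callable(string): string $redact
 */
function isIdempotent(callable $redact, string $input): bool
{
    $once = $redact($input);

    return $redact($once) === $once;
}

// Stable: the placeholder matches the pattern again but is replaced by itself.
$stable = static fn (string $s): string
    => preg_replace('/"[^"\r\n]*"/u', '"<player>"', $s)
        ?? throw new RuntimeException(preg_last_error_msg());

// Unstable: wraps the captured name, so every extra pass adds another layer.
$unstable = static fn (string $s): string
    => preg_replace('/"([^"\r\n]*)"/u', '"<$1>"', $s)
        ?? throw new RuntimeException(preg_last_error_msg());

$stableOk = isIdempotent($stable, 'say "Bob"');
$unstableOk = isIdempotent($unstable, 'say "Bob"');
```

A fixed placeholder is what makes the real passes safe to re-run; any replacement that embeds the matched text risks the unstable behaviour.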
### Task 6 — Existing-fixture integration tests
- [ ] **Step 6.1.** Create `test/tests/Util/Redactor/ProjectZomboidRedactorIntegrationTest.php`. Loads each existing PZ fixture (`admin-minimal.txt`, `chat-minimal.txt`, etc.) via `PathLogFile`, calls `redact()` on the content, and asserts: (a) the redacted content still parses cleanly through the corresponding `ProjectZomboid<X>Log`'s parser without throwing; (b) the synthetic Steam IDs `76561198000000001`–`76561198000000004` all collapse to `76561198000000000`; (c) the synthetic player names (`Player1`, `Player2`, `AdminUser`, `PlayerSuspect`) all collapse to `<player>`.
- [ ] **Step 6.2.** Run `composer test`. Expect all integration assertions pass without modifying any existing test or fixture.
- [ ] **Step 6.3.** Commit: `test: add Redactor integration coverage against existing PZ fixtures`.
### Task 7 — Documentation updates
- [ ] **Step 7.1.** Update `CLAUDE.md`: add a one-line `src/Util/` mention to the framework architecture section; one-line note in the ProjectZomboid specifics section pointing at `ProjectZomboidRedactor` for downstream PII scrubbing; update the "Scaffolded games" line to mention that `ProjectZomboid` now also has a Redactor implementation under `src/Util/ProjectZomboid/`.
- [ ] **Step 7.2.** Update `README.md`: add a short usage block showing `(new ProjectZomboidRedactor())->redact($logContent)` as a render-time scrub option, alongside the existing worked example.
- [ ] **Step 7.3.** Update `CHANGELOG.md`: move Redactor out of the **Deferred** section under `[0.1.0]`, OR add a new `[Unreleased]` section if the v0.1.0 line should remain accurate as-shipped. Decision: **add `[Unreleased]`** — v0.1.0 was tagged without the Redactor and the changelog should reflect the historical truth.
- [ ] **Step 7.4.** Run `composer test` once more for safety; confirm 195+(redactor tests) green.
- [ ] **Step 7.5.** Commit: `docs: document Redactor utility in CLAUDE.md, README, CHANGELOG`.
### Task 8 — Final verification
- [ ] **Step 8.1.** Run `composer test`. All tests green.
- [ ] **Step 8.2.** Re-run `vendor/bin/phpunit --display-deprecations --display-warnings --display-notices --display-errors`. Expect zero output beyond the standard pass summary.
- [ ] **Step 8.3.** Sanity-check the branch with `git log --oneline master..redactor`. Should be the plan-doc commit plus 7 implementation commits = 8 commits total.
- [ ] **Step 8.4.** Push final state: `git push origin redactor`. **Do NOT merge to master.** User reviews diff and approves merge separately.
---
## Open questions / spec gaps
The spec is generally tight. Items worth flagging while implementing:
1. **`/u` flag for Unicode safety.** Spec doesn't specify regex flags. PZ player names can contain non-ASCII characters (Steam display names are Unicode-permissive). The implementation will use `/u` on all regexes to avoid mangling multi-byte sequences. Documenting in the class docblock.
2. **Replacement order.** Spec says "Redaction order matters: SIDs first, names second" because the after-Steam-ID player-name regex anchors on the redacted Steam ID. The implementation will enforce this order in `redact()` (Steam ID pass first, then names, then coords). The class docblock will document the ordering invariant.
3. **HTML / JSON-encoded input.** Spec assumes plain log text. If a consumer feeds HTML-escaped content (e.g. `&quot;` instead of `"`), the player-name regex won't match. Document as a v2 concern: callers feed plain text in, render afterwards. v1 does not implement HTML/JSON-aware mode.
4. **Future PII categories.** v1 ships exactly the three toggles per spec. New categories (emails, IPs from mods, etc.) extend the toggle set in a future release; v1 does not pre-build extension points beyond what the interface already provides.
5. **`src/Util/` is a new top-level directory** in this codebase. The Redactor is the first occupant. Future utilities (e.g. a tokenizing variant per spec open-question 1) would also live here. No existing-code modification is needed; the new directory is purely additive.
6. **The empty `src/Printer/<Game>/.gitkeep` situation.** Phase A scaffolding chose not to create `Printer/<Game>/` directories at all (only Analyser/Detective/Log/Parser/Pattern got per-game subdirs). The Redactor's home in `src/Util/<Game>/` mirrors that — `src/Util/` is created with PZ as its first occupant; no stub `Hytale/`/`Minecraft/`/`SevenDaysToDie/` placeholders are scaffolded. When other games' redactors land, they create their own subdirectories at that point.
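
Item 3 in the list above (plain text in, render afterwards) can be sketched as follows. `renderRedactedHtml()` and the name regex are hypothetical; `htmlspecialchars()` is the standard PHP function.

```php
<?php
declare(strict_types=1);

/**
 * Redact the plain text first, then escape for HTML.
 *
 * @param callable(string): string $redact stand-in for RedactorInterface::redact()
 */
function renderRedactedHtml(string $raw, callable $redact): string
{
    return htmlspecialchars($redact($raw), ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Assumed name context: a quoted name directly after the zeroed Steam ID.
$redact = static fn (string $s): string
    => preg_replace('/(?<=76561198000000000 )"[^"\r\n]*"/u', '"<player>"', $s)
        ?? throw new RuntimeException(preg_last_error_msg());

$raw = '76561198000000000 "Player1" joined';

// Correct order: the regex sees real quote characters.
$html = renderRedactedHtml($raw, $redact);

// Wrong order: escaping first turns the quotes into &quot;, so nothing matches.
$leaky = $redact(htmlspecialchars($raw, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8'));
```

Escaping after redaction also matters for the `<player>` placeholder itself, which a browser would otherwise read as an unknown tag.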
No spec contradictions found. No existing-code modifications required (additive-only design).
---
## Branch / commit invariants
- All commits land on the `redactor` branch.
- Master is not touched until the user explicitly approves merge after reviewing the diff.
- Conventional commit prefixes: `docs:`, `feat:`, `test:`, `refactor:`. (No `fix:` expected — this is greenfield work.)
- One logical concept per commit. Task 1 ships the scaffold alone (no Redactor tests yet); Tasks 2, 3, 4 each ship implementation + per-pass tests in one commit; Tasks 5, 6, 7 are pure-test or pure-docs commits.
- Backup tag `backup/pre-redactor` at `aec835e` lets us discard the branch and recover if the implementation goes sideways.
- Branch can be pushed to origin freely for visibility / review checkpoints.
## Pointers
- Spec: `docs/superpowers/specs/2026-04-30-redactor-design.md`.
- Synthetic fixtures the integration test will reuse: `test/src/Games/ProjectZomboid/fixtures/*.txt`.
- Existing per-game layout precedent: `src/Analyser/ProjectZomboid/`, `src/Pattern/ProjectZomboid/`, `src/Log/ProjectZomboid/`.
- Workflow conventions and pitfalls: `CLAUDE.md`.
186
docs/superpowers/specs/2026-05-01-iblogs-bootstrap-design.md
Normal file
@@ -0,0 +1,186 @@
# iblogs bootstrap design
> Written 2026-05-01.
> **Scope:** design only. No iblogs code is written by this document; the actual fork, rename, and rewire happen in a follow-up session after this design is approved.
## Summary
iblogs is a Project-Zomboid-first log triage service forked from `aternosorg/mclogs`. It consumes `indifferentketchup/codex` (pinned at `v0.1.0`) for log detection, parsing, and analysis, replacing mclogs's `aternos/codex-minecraft` / `aternos/codex-hytale` / `aternos/sherlock` dependency stack. The data model gains a session entity that wraps the multiple files Project Zomboid produces per server session (eleven file types per session), while mclogs's existing single-paste paths remain alive as legacy routes that map to "session of size 1."
## (a) Fork target verification

| Check | Value |
|---|---|
| Upstream | `github.com/aternosorg/mclogs` |
| Default branch | `main` |
| License | **MIT** (SPDX `MIT`), compatible with `indifferentketchup/codex`'s MIT |
| Last push | `2026-03-30` (active; ~30 days ago) |
| Last update | `2026-04-26` |
| Archived | no |
| Stars / open issues | 290 / 2 |
| PHP requirement | `>=8.5`, plus `ext-frankenphp`, `ext-mongodb`, `ext-uri`, `ext-zlib`, `ext-mbstring`, `ext-json` |
| Storage | MongoDB |
| Existing codex dep | yes: `aternos/codex-minecraft ^5.0.1` and `aternos/codex-hytale ^2.0` |
**Verdict: GO.** License is compatible. Project is actively maintained. No archival or licensing blockers. The fact that mclogs already integrates Aternos's codex stack tells us the fork's swap surface is well-defined: replace those Composer deps and the codex-facing call sites in `src/Api/Action/AnalyseLogAction.php` / `src/Api/Action/LogInsightsAction.php` / `src/Api/Response/CodexLogResponse.php` / `src/Detective.php` / `src/Log.php`.
The PHP `>=8.5` floor is stricter than codex's `>=8.4` — iblogs inherits the stricter constraint, which is fine. The `ext-frankenphp` requirement means iblogs runs on the FrankenPHP runtime rather than vanilla PHP-FPM; preserving this is the path of least resistance.
`aternos/sherlock` (MIT, "PHP library to apply minecraft mappings to log files") is Minecraft-specific (Mojang obfuscation maps). It is **not needed for PZ** and gets dropped. If iblogs ever adds Minecraft support, it can come back.
## (b) Repo plan
**Primary remote:** Gitea at `git.indifferentketchup.com:2222`. Fork as `indifferentketchup/iblogs`. SSH clone URL: `ssh://git@git.indifferentketchup.com:2222/indifferentketchup/iblogs.git`. Match the codex repo's existing Gitea setup.
**GitHub mirror:** Push-only secondary, configured via Gitea's Mirror feature (Repo Settings → Mirror Settings → Push Mirror). Same pattern any team using Gitea-as-primary uses for visibility.
**Composer dep on codex.** iblogs's `composer.json` gains a `repositories` entry of type `vcs` pointing at the codex Gitea URL (`ssh://git@git.indifferentketchup.com:2222/indifferentketchup/ik-codex.git`), and a `require` entry for `indifferentketchup/codex` pinned to exactly `0.1.0`. The exact pin is preferred over `^0.1.0` for early-version (0.x) releases where minor bumps may carry breaking changes.
**Removed deps:** `aternos/codex-minecraft`, `aternos/codex-hytale`, `aternos/sherlock`. The first two are replaced by `indifferentketchup/codex` (which covers Project Zomboid and ships detective stubs for Minecraft / Hytale / SevenDaysToDie that iblogs will not use in v0.1). The third (Sherlock) is Minecraft-mapping-specific and not relevant to PZ.
**Package name.** `aternos/mclogs` becomes `indifferentketchup/iblogs`. Composer name and the PSR-4 namespace move together: `Aternos\Mclogs\` → `IndifferentKetchup\Iblogs\`.
## (c) Multi-file / session paste model
Project Zomboid produces eleven log files per server session. The data model needs to accommodate this without breaking mclogs's existing single-paste consumers.
### Option (i) — 1 file = 1 paste, sibling-link via shared `session_id`
- **Pros:** minimal schema change. Reuse mclogs's existing `Log` per file. Sibling discovery is a `session_id` index.
- **Cons:** no atomic ingest (zip becomes N independent uploads). Session views require runtime joins. `session_id` propagation through upload UX is fiddly (URL param? cookie? hidden form field?).
- **Effort:** low.
### Option (ii) — zip upload explodes server-side into N linked pastes
- **Pros:** atomic ingest. One endpoint for whole-session upload. Maps cleanly to PZ's natural zip-of-logs deliverable.
- **Cons:** zip-only ingest is restrictive (no single-file paste UX for users with just `DebugLog-server.txt`). Server-side zip extraction is attack surface (zip bombs, path traversal). Doubles upload paths if single-file is also supported.
- **Effort:** medium.
### Option (iii) — session entity wraps N file entities (1:N relation)
- **Pros:** rich session model. Single URL for the whole session; child URLs per file. PZ's eleven-file natural session maps cleanly. mclogs's single-paste maps to "session of size 1," so the model degenerates gracefully into legacy behaviour. Session-level metadata (server name, date range, total size) becomes first-class.
- **Cons:** the most schema migration of the three options. Two URL types in routing. More concepts in the API.
- **Effort:** medium-high.
### Recommendation: option (iii)
PZ's natural unit IS a session — the server emits all eleven files per restart, ZIP-bundled in production. Single-file uploads (the mclogs default UX) become "session of size 1" with no special-case code; the legacy `/api/1/log` routes return a paste that happens to belong to a singleton session. Cross-file analysis (e.g. correlating a `ServerExceptionProblem` from `DebugLog-server.txt` with a `ConnectionFailureProblem` from `user.txt`) is unlocked because both files share a `session_id`. The 1:N model is the only one that supports cross-file analysers in any future Phase B.4-equivalent on iblogs's side.
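
A minimal sketch of the 1:N model, to show how a legacy single paste degenerates into a session of size 1. `Session`, `SessionFile`, and their fields are hypothetical names for illustration, not the iblogs schema, and the sample contents are invented.

```php
<?php
declare(strict_types=1);

// Hypothetical shapes; field names are illustrative only.
final class SessionFile
{
    public function __construct(
        public readonly string $type,
        public readonly string $pasteId,
        public readonly string $content,
    ) {}
}

final class Session
{
    /** @var array<string, SessionFile> keyed by file type */
    private array $files = [];

    public function __construct(public readonly string $id) {}

    public function add(SessionFile $file): void
    {
        $this->files[$file->type] = $file;
    }

    public function file(string $type): ?SessionFile
    {
        return $this->files[$type] ?? null;
    }

    public function isSingleton(): bool
    {
        return count($this->files) === 1;
    }

    /** Legacy single-paste upload: a session that happens to hold one file. */
    public static function singleton(string $id, SessionFile $file): self
    {
        $session = new self($id);
        $session->add($file);

        return $session;
    }
}

$legacy = Session::singleton('s1', new SessionFile('server', 'p1', 'sample server line'));

$full = new Session('s2');
$full->add(new SessionFile('server', 'p2', 'sample server line'));
$full->add(new SessionFile('user', 'p3', 'sample user line'));
```

A cross-file analyser would receive the `Session` and look up sibling files by type, which is only possible because both files hang off the same parent.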
## (d) UI changes
**Primary nav: file-type tabs.** Within a session, eleven tabs (one per PZ file type) with a count badge (e.g. `DebugLog (6,998 lines)`, `chat (115)`). Clicking a tab loads that file's content + analysis. Tab order: DebugLog-server first (most useful for triage), then admin, user, chat, item, map, perk, pvp, ClientActionLog, cmd, BurdJournals.
**Secondary nav: session index sidebar.** Lists the user's recent sessions (cookie-driven, like mclogs's history). Less primary than tabs.
**Default view.** `/session/{id}` lands on the DebugLog-server tab by default — that file is what admins want to see when something is broken.
**Redaction toggle.** Per-session checkbox in the toolbar: "Redact PII". Behaviour depends on Step 4 (codex Redactor) status:
- If Redactor ships first: toggle invokes `ProjectZomboidRedactor::redact()` on the rendered file content client-side or server-side (decision for the implementation pass).
- If Redactor is still deferred: toggle is hidden in v0.1 of iblogs. Upload-time PII filtering still happens via the ported `Filter` chain (see `src/Filter/*` upstream — `IPv4Filter`, `IPv6Filter`, `AccessTokenFilter`, `UsernameFilter`).
**Branding.** Drop the "Built for Minecraft & Hytale" tagline and visual cues. Replace `mclo.gs` brand references with whatever short-domain iblogs uses (open question — see (h)). Color palette decision is open; mclogs's green accent (`#5cb85c` in `example.config.json`) is fine to keep or change.
|
||||
|
||||
## (e) API surface
|
||||
|
||||
Iblogs exposes a session-oriented API on top of the recommended (iii) model, plus the legacy mclogs paths kept alive.
|
||||
|
||||
| Path | Method | Purpose |
|---|---|---|
| `/api/session` | POST | Create a session by uploading one zip OR multiple file fields. Returns `session_id` plus a list of `{type, paste_id}` for each contained file. |
| `/api/session/{id}` | GET | Return session metadata + array of contained pastes (`{type, paste_id, line_count, size_bytes}`). |
| `/api/session/{id}/file/{type}` | GET | Return one file's content and its codex analysis result. `{type}` is one of the eleven PZ file-type tokens (`server`, `chat`, `clientaction`, `cmd`, `item`, `map`, `perk`, `pvp`, `admin`, `user`, `burdjournals`). |
| `/api/paste/{id}` | GET | Single-paste back-compat. Returns content + analysis for any paste (whether part of a multi-file session or a singleton). |
| `/api/1/log` | POST | Legacy mclogs path, kept alive. Internally creates a singleton session and returns the existing-shape mclogs response. |
| `/api/1/log/{id}` | GET | Legacy mclogs path, kept alive. Same as `/api/paste/{id}` with the legacy response shape. |
|
||||
|
||||
The legacy paths preserve mclogs's API contract for any third-party clients that already integrate with `mclo.gs` or self-hosted mclogs instances. Upgrading clients to the session-aware API is opt-in.
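How the legacy routes can ride on the session model is sketched below. This is an in-memory stand-in, not the MongoDB storage layer: the class `SessionStoreSketch` and all of its method names are hypothetical, and the 7-character hexadecimal IDs are an assumption based only on the ID length mclogs uses.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the storage layer; none of these
// names come from mclogs or codex.
final class SessionStoreSketch
{
    /** @var array<string, array<string, string>> session_id => [type => paste_id] */
    private array $sessions = [];

    /** @var array<string, string> paste_id => raw content */
    private array $pastes = [];

    /**
     * POST /api/session: one session, one paste per uploaded file.
     *
     * @param array<string, string> $files file-type token => raw content
     * @return array{session_id: string, files: list<array{type: string, paste_id: string}>}
     */
    public function createSession(array $files): array
    {
        $sessionId = $this->newId();
        $this->sessions[$sessionId] = [];
        $entries = [];
        foreach ($files as $type => $content) {
            $pasteId = $this->newId();
            $this->pastes[$pasteId] = $content;
            $this->sessions[$sessionId][$type] = $pasteId;
            $entries[] = ['type' => $type, 'paste_id' => $pasteId];
        }
        return ['session_id' => $sessionId, 'files' => $entries];
    }

    /** POST /api/1/log: a singleton session behind the legacy single-paste call. */
    public function createLegacyPaste(string $type, string $content): string
    {
        return $this->createSession([$type => $content])['files'][0]['paste_id'];
    }

    /** GET /api/paste/{id} and GET /api/1/log/{id} both resolve a paste by ID. */
    public function getPaste(string $pasteId): ?string
    {
        return $this->pastes[$pasteId] ?? null;
    }

    // 7 hex characters; collision handling is omitted in this sketch.
    private function newId(): string
    {
        return substr(bin2hex(random_bytes(4)), 0, 7);
    }
}
```

The point of the sketch is that `createLegacyPaste()` contains no special-case logic: it delegates to `createSession()` with a single file and returns the one paste ID the legacy response shape needs.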
|
||||
|
||||
## (f) String / branding inventory
|
||||
|
||||
Producing exact `path:line` references requires the cloned working copy of the fork. This section gives directional pointers from the fetched-but-not-cloned upstream tree at `aternosorg/mclogs:main`. The actual line-precise inventory belongs in a follow-up commit on the iblogs side, after the fork exists and can be `grep`ped.
|
||||
|
||||
**Composer / package metadata** — file `composer.json` upstream (no local clone, line refs not yet known):
|
||||
- `"name": "aternos/mclogs"` → `"indifferentketchup/iblogs"`
- `"description": "Paste, share and analyse Minecraft logs"` → describe iblogs scope (PZ-first, server-log triage)
- `"authors"` block (currently `Matthias Neid <matthias@aternos.org>`) → replace with `indifferentketchup` author
- `require` block:
  - drop `aternos/codex-minecraft`
  - drop `aternos/codex-hytale`
  - drop `aternos/sherlock`
  - add `indifferentketchup/codex` pinned to `0.1.0`
- `autoload.psr-4` mapping `"Aternos\\Mclogs\\": "src/"` → `"IndifferentKetchup\\Iblogs\\": "src/"`
- new top-level `repositories` array entry of type `vcs` pointing at the codex Gitea URL
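The mechanical part of the `composer.json` changes can be expressed as a transformation over the decoded manifest. This is a sketch only: `rewriteComposerManifest()` is a hypothetical helper, the codex repository URL is passed in as a parameter because this document does not state it, and the `description` and `authors` fields are left for manual editing.

```php
<?php
declare(strict_types=1);

/**
 * Apply the composer.json changes listed above to a decoded manifest.
 * $codexRepoUrl is a parameter because the codex Gitea URL is not
 * stated in this document.
 *
 * @param array<string, mixed> $composer
 * @return array<string, mixed>
 */
function rewriteComposerManifest(array $composer, string $codexRepoUrl): array
{
    $composer['name'] = 'indifferentketchup/iblogs';

    // Drop the three Aternos dependencies, then pin codex.
    foreach (['aternos/codex-minecraft', 'aternos/codex-hytale', 'aternos/sherlock'] as $dep) {
        unset($composer['require'][$dep]);
    }
    $composer['require']['indifferentketchup/codex'] = '0.1.0';

    // Swap the PSR-4 root namespace; the source directory stays src/.
    unset($composer['autoload']['psr-4']['Aternos\\Mclogs\\']);
    $composer['autoload']['psr-4']['IndifferentKetchup\\Iblogs\\'] = 'src/';

    // Register the VCS repository that hosts codex.
    $composer['repositories'][] = ['type' => 'vcs', 'url' => $codexRepoUrl];

    return $composer;
}
```

One possible use is to decode the manifest with `json_decode($json, true)`, pass it through the function, and write it back with `json_encode($manifest, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES)`. Editing the file by hand is equally reasonable for a one-off change.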
|
||||
|
||||
**Namespace bulk substitution**: every PHP file under `src/` (roughly 50 or more files, based on the upstream tree). The pattern mirrors the codex rename in commit `66a2fcc`: bulk `Aternos\Mclogs` → `IndifferentKetchup\Iblogs` across `namespace`, `use`, fully-qualified refs, and PHPDoc tags. This is done as one logical commit on the iblogs side, following the codex-side precedent.
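A bulk substitution of this kind can be sketched as a plain string replacement applied to every `.php` file in a tree. The function names `renameNamespace()` and `renameTree()` are hypothetical, and the sketch assumes the old prefix never appears in a context where it should be kept. JSON-escaped occurrences with doubled backslashes, such as the mapping in `composer.json`, are not matched and need separate handling.

```php
<?php
declare(strict_types=1);

/** Replace the old root namespace wherever it appears in one source string. */
function renameNamespace(string $source): string
{
    return str_replace('Aternos\\Mclogs', 'IndifferentKetchup\\Iblogs', $source);
}

/** Rewrite every .php file under $dir in place; returns the number of files changed. */
function renameTree(string $dir): int
{
    $changed = 0;
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->getExtension() !== 'php') {
            continue;
        }
        $old = file_get_contents($file->getPathname());
        if ($old === false) {
            continue; // unreadable file: skip rather than truncate it
        }
        $new = renameNamespace($old);
        if ($new !== $old) {
            file_put_contents($file->getPathname(), $new);
            $changed++;
        }
    }
    return $changed;
}
```

Because the replacement is textual, it covers `namespace` and `use` statements, fully-qualified references, and PHPDoc tags in one pass. Reviewing the resulting diff before committing is still advisable.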
|
||||
|
||||
**Codex API call sites** — the files mclogs uses to integrate Aternos's codex stack, all under `src/`:
|
||||
- `src/Detective.php`: likely a wrapper around `aternos/codex-minecraft`'s Detective. Swap to `IndifferentKetchup\Codex\Detective\ProjectZomboid\ProjectZomboidDetective` (or wrap multiple game detectives if iblogs ever supports more games).
- `src/Log.php`: likely a wrapper. Re-point to codex's `Log` hierarchy.
- `src/Api/Action/AnalyseLogAction.php`: the `analyse` endpoint. Update to call codex's `AnalysableLog::analyse()` with the new analyser surface.
- `src/Api/Action/LogInsightsAction.php`: insights endpoint.
- `src/Api/Response/CodexLogResponse.php`: response shape; verify field-by-field against `IndifferentKetchup\Codex\Analysis\AnalysisInterface::jsonSerialize()`.
- `src/Api/Action/CreateLogAction.php`: log creation; integration with codex's `Detective::detect()`.
- `src/Api/Action/RawLogAction.php`, `src/Api/Action/LogInfoAction.php`: verify these don't depend on Minecraft-specific codex behaviour.
|
||||
|
||||
**Frontend templates and assets** — file paths only, exact branding strings discovered post-clone:
|
||||
- `web/frontend/start.php`: landing page; "Paste, share and analyse Minecraft logs" hero copy lives here.
- `web/frontend/api-docs.php`: API documentation page.
- `web/frontend/parts/header.php`, `parts/footer.php`, `parts/head.php`: site title, meta tags, footer links to legal info.
- `web/frontend/log.php`: log view template (probably hardcodes the syntax-highlighting language token; needs to handle multiple PZ file types).
- `web/frontend/404.php`: error page copy.
- `web/public/css/mclogs.css`: file is **renamed** to `iblogs.css` and CSS class names referencing `mclogs` are renamed.
- `web/public/js/start.js`, `web/public/js/log.js`: likely contain text constants and reference the `mclogs.css` filename.
- `web/public/img/logo-icon.svg`, `logo.svg`, `favicon.ico`: replaced with iblogs assets.
|
||||
|
||||
**Configuration** — file `example.config.json`:
|
||||
- database name `mclogs` → `iblogs`
- abuse contact `abuse@aternos.org` → iblogs contact (open question; see (h))
- imprint and privacy policy links currently point at `aternos.gmbh` → iblogs equivalents
- `mclo.gs` brand reference in the frontend styling section → new iblogs short-domain (open question)
- worker request limit, ID length, TTL: review for iblogs-appropriate values; PZ sessions are larger than mclogs single pastes, so size and line limits may need raising.
|
||||
|
||||
**Docker / deployment** — files `Dockerfile`, `docker/Caddyfile`, `docker/compose.production.yaml`, `docker/mclogs.ini`:
|
||||
- Image label maintainer references
- Caddyfile likely hardcodes `mclo.gs` hostname for TLS certificates → replace with iblogs hostname
- Compose service name `mclogs` → `iblogs`
- File `docker/mclogs.ini` is renamed and its contents updated
|
||||
|
||||
**`LICENSE` file** — per MIT requirements, the original Aternos copyright line stays byte-for-byte unchanged. iblogs's LICENSE preserves the upstream copyright header. This mirrors codex's handling of its own upstream LICENSE.
|
||||
|
||||
**`README.md`** — full rewrite. Title, description, install line, links to upstream codex repo, scope statement (PZ-first, server-log triage). Drop Minecraft / Hytale framing entirely.
|
||||
|
||||
**Filter classes for PZ-specific PII** — upstream's filter chain (`src/Filter/IPv4Filter.php`, `IPv6Filter.php`, `AccessTokenFilter.php`, `UsernameFilter.php`) handles Minecraft-style PII (server access tokens, Minecraft-pattern usernames). For PZ, iblogs may need new filters: `SteamIdFilter`, `WorldCoordinateFilter`, and a PZ-aware username filter (Steam usernames look different from Minecraft ones). These are net-new code, not branding renames.
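A `SteamIdFilter` could start from a single regular expression. The sketch below is standalone and does not implement the upstream `Filter` interface, whose exact signature is not reproduced in this document. The class name `SteamIdFilterSketch` and the `<steam-id>` placeholder are hypothetical. It assumes PZ logs carry SteamID64 values, which for individual accounts are 17-digit decimals that, to the best of our knowledge, begin with `7656119` (or `7656120` at the top of the range).

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; not wired into the upstream Filter chain.
final class SteamIdFilterSketch
{
    // 17-digit SteamID64 for individual accounts (assumed prefix range).
    private const PATTERN = '/\b7656(?:119|120)\d{10}\b/';

    /** Replace every SteamID64-shaped token with a fixed placeholder. */
    public static function filter(string $data): string
    {
        // preg_replace() returns null on a regex engine error; fall back to the input.
        return preg_replace(self::PATTERN, '<steam-id>', $data) ?? $data;
    }
}

echo SteamIdFilterSketch::filter('user 76561198000000001 connected'), PHP_EOL;
// user <steam-id> connected
```

The word boundaries keep the pattern from matching inside longer digit runs, which matters for log lines that also contain timestamps or large counters.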
|
||||
|
||||
## (g) Migration
|
||||
|
||||
**Keep mclogs's existing single-paste API routes alive as legacy.** Two reasons:
|
||||
1. mclogs has live API consumers calling `POST /api/1/log` and `GET /api/1/log/{id}` against `mclo.gs` and self-hosted instances. iblogs exists to add PZ support, not to break compatibility with the broader mclogs ecosystem.
2. Under model option (iii), legacy single pastes are naturally "sessions of size 1." Supporting the legacy routes takes zero extra schema work: they just create singleton sessions internally.
|
||||
|
||||
**Strip:** `aternos/codex-minecraft`, `aternos/codex-hytale`, `aternos/sherlock` Composer deps; the `Aternos\Mclogs\` namespace; mclogs-specific branding strings; the `mclo.gs` hostname hardcodes; Minecraft-mapping deobfuscation code paths.
|
||||
|
||||
**Preserve:** the upstream `Filter` chain (it solves real problems — IP redaction, access tokens, usernames); the FrankenPHP runtime; MongoDB storage layer; the cookie-based session-history UX; the Caddy fronting.
|
||||
|
||||
## (h) Open questions
|
||||
|
||||
1. **`aternos/sherlock` license confirmation:** verified MIT (this design doc fetched the metadata) but iblogs is dropping it. No issue.
2. **`ext-frankenphp` keep / replace decision:** recommend keep for v0.1 (path of least resistance). Migrating to vanilla nginx+php-fpm is its own project and can come later.
3. **Branding decisions:**
   - Site name: `iblogs` (lowercase) appears settled, given the project name `indifferentketchup/iblogs`. Confirm.
   - Tagline: needs writing. "Project Zomboid server log triage" is honest; longer-form copy is open.
   - Short-domain: mclogs uses `mclo.gs`. Is there an iblogs equivalent (`iblo.gs`? `ib.gs`?)? Affects Caddyfile, frontend assets, and docs links.
   - Accent / palette: keep mclogs green (`#5cb85c`) or pick a different colour?
4. **Database choice:** keep MongoDB or migrate to PostgreSQL / SQLite? Migrating away from Mongo is a significant project; recommend keep for v0.1.
5. **API URL versioning:** mclogs uses `/api/1/`. Stay with `/api/1/` for legacy paths (compat) and add `/api/session/...` for new endpoints (no version prefix), or use `/api/v2/session/...`? Recommend the former, for minimum surface change.
6. **Session-ID generation:** mclogs uses 7-character IDs. For iblogs sessions of N files, pick (a) one session-ID + N independent paste-IDs (richer URLs) or (b) single ID per paste with a sibling `session_id` field (simpler). Affects URL shape.
7. **The codex Redactor utility.** Iblogs's redaction toggle (section d) depends on whether Step 4 (Redactor implementation) ships before or after iblogs scaffolding. **Decision deferred to user (Step 4 of the careful run).**
8. **PZ-specific filter classes** (`SteamIdFilter`, `WorldCoordinateFilter`, etc.): net-new work for iblogs. Could lift the regex shapes from `docs/superpowers/specs/2026-04-30-redactor-design.md` (they're the same PII categories). Implementation order: iblogs likely wants these for its upload-time filter chain regardless of whether the codex `Redactor` ships.
9. **Multi-game support trajectory.** v0.1 of iblogs is PZ-first. If Minecraft / Hytale / SevenDaysToDie support is on the roadmap, iblogs's Detective wiring needs to be a multi-game dispatcher (not just `ProjectZomboidDetective`). Codex provides the per-game detectives separately; iblogs would compose them. Out of scope for v0.1.
10. **The exact line-precise branding inventory** (every file:line ref of `Minecraft` / `Hytale` / `MC` / `mc` / `mclogs` / `mclo.gs` / `Aternos`). This document gives file-level pointers; the line-precise version is produced as a separate work item once the fork is cloned and grep-able.
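Once the fork is cloned, the line-precise inventory from question 10 could be produced by a scan along these lines. The function `findBrandingRefs()` is a hypothetical helper and the pattern is an assumption: it matches the listed terms case-insensitively and restricts the bare `mc` token to whole words to limit noise.

```php
<?php
declare(strict_types=1);

/**
 * Return "path:line" references for every line that mentions a branding term.
 *
 * @return list<string>
 */
function findBrandingRefs(string $path, string $contents): array
{
    // Assumed term list, taken from the open question above.
    $pattern = '/minecraft|hytale|mclogs|mclo\.gs|aternos|\bmc\b/i';
    $refs = [];
    foreach (preg_split('/\R/', $contents) as $index => $line) {
        if (preg_match($pattern, $line) === 1) {
            $refs[] = sprintf('%s:%d', $path, $index + 1); // 1-based line numbers
        }
    }
    return $refs;
}

print_r(findBrandingRefs('src/Log.php', "<?php\nnamespace Aternos\\Mclogs;\nclass Log {}"));
// Array ( [0] => src/Log.php:2 )
```

A plain `grep -rn` over the clone would give the same information; the value of a scripted version is that its output can be committed alongside the rename as a checklist.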
|
||||
|
||||
## Pointers
|
||||
|
||||
- Codex package consumed: `indifferentketchup/codex` v0.1.0, tag SHA `8a89550` (annotated tag) pointing at commit `52ff8cb`.
- Codex Redactor design (deferred): `docs/superpowers/specs/2026-04-30-redactor-design.md`.
- Codex CHANGELOG: `CHANGELOG.md` in this repo.
- Upstream mclogs: `https://github.com/aternosorg/mclogs` (MIT, `main` default branch, last push 2026-03-30).
|
||||