docs: align lookback test purpose and spec normalization list

Make the test docstring honest (old and new semantics are equivalent on
contiguous entries; the test locks post-fix behavior against future
regressions), and add the severity-prefix strip to the spec's normalization list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:39:44 +00:00
parent 2e7bebc911
commit fdf70a0c06
2 changed files with 15 additions and 8 deletions

@@ -126,6 +126,7 @@ signature = sha256(pattern_id + mod_id)[:16]
Normalization for `pattern_id`:
- Strip session metadata prefix (`General f:N, t:N, st:N,N,N,N>` shape)
- Strip body-prefix severity token (`ERROR:` / `SEVERE:` / `WARN:` / `FATAL:`, case-insensitive) so a body that opens with the severity word still hashes the same as one that doesn't.
- Flatten double- and single-quoted strings to `"<S>"` / `'<S>'`
- Flatten ≥2-digit numeric runs to `<N>`
- Collapse whitespace
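
The normalization steps above, together with the `signature` formula from the hunk header, can be sketched roughly as follows. This is a minimal illustration, not the code in `pz_parser.py`: the regexes, the function names `normalize_pattern_id` and `signature`, and the reading of `[:16]` as "first 16 hex characters of the digest" are all assumptions made for the example.

```python
import hashlib
import re

# Hypothetical regexes: the spec describes the shapes to strip, not the exact patterns.
SESSION_PREFIX = re.compile(r"^\s*\w+\s+f:\d+,\s*t:\d+,\s*st:[\d,]+>\s*")
SEVERITY_PREFIX = re.compile(r"^\s*(?:ERROR|SEVERE|WARN|FATAL):\s*", re.IGNORECASE)
DOUBLE_QUOTED = re.compile(r'"[^"]*"')
SINGLE_QUOTED = re.compile(r"'[^']*'")
NUMERIC_RUN = re.compile(r"\d{2,}")  # runs of two or more digits only
WHITESPACE = re.compile(r"\s+")


def normalize_pattern_id(line: str) -> str:
    """Apply the normalization list in the order the spec gives it."""
    text = SESSION_PREFIX.sub("", line)        # strip session metadata prefix
    text = SEVERITY_PREFIX.sub("", text)       # strip body-prefix severity token
    text = DOUBLE_QUOTED.sub('"<S>"', text)    # flatten double-quoted strings
    text = SINGLE_QUOTED.sub("'<S>'", text)    # flatten single-quoted strings
    text = NUMERIC_RUN.sub("<N>", text)        # flatten >=2-digit numeric runs
    return WHITESPACE.sub(" ", text).strip()   # collapse whitespace


def signature(pattern_id: str, mod_id: str) -> str:
    """sha256(pattern_id + mod_id)[:16], assuming a hex digest (an assumption)."""
    digest = hashlib.sha256((pattern_id + mod_id).encode("utf-8")).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    with_prefix = normalize_pattern_id(
        'General f:0, t:1714837184123, st:48,512,907,113> '
        'ERROR: Failed to load "foo.lua" at line 123'
    )
    without_prefix = normalize_pattern_id('Failed   to load "bar.lua" at line 4567')
    # Both lines reduce to the same pattern, so they share one signature.
    print(with_prefix)
    print(signature(with_prefix, "ExampleMod") == signature(without_prefix, "ExampleMod"))
```

In this sketch the severity strip runs after the session-prefix strip, so a body that opens with `ERROR:` hashes the same whether or not the session metadata is still attached.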
@@ -240,5 +241,6 @@ Test invocation: `python -m unittest discover tools/pz-analyzer/tests/` should b
- Editing `pz_error_analysis.py` or `pz_redact_all.sh`.
- Modifying any file in `/opt/ik-codex/src/`, `/opt/ik-codex/test/`, or `/opt/iblogs/`.
- AI / LLM integration of any kind in the new tool.
- LLM inference at runtime in iblogs / bosslogs production. The Qwen analyzer (`pz_error_analysis.py`) is a developer-only discovery tool used to expand the deterministic ruleset in `pz_parser.py` (and its future PHP port). Production rendering is deterministic-only, forever.
- iblogs front-end rendering of the classification output.
- Filesystem mod-scan reattribution (pzmm's symbol/vehicle indexes).