First upload

2026-02-17 21:49:58 -06:00
parent 29a13768f7
commit 6821424663
46 changed files with 3179 additions and 14 deletions
--- a/docs/analytics/Part
+++ b/docs/analytics/Part
@@ -0,0 +1,195 @@
+# Mode C: Batch analytics report
+
+**Source:** `Discord Ticket Transcripts/Drive2/`  
+**Computed:** From transcript HTML (metadata + decoded base64 message payloads).  
+**Guide:** Part 1 Analysis (Transcript analytics schemas, Broccolini support section).  
+**Tool:** `scripts/batch_transcript_analytics.py`
+
+Analytics below are **per‑ticket and aggregate** across 722 transcripts. Dimensions that require full Mode A extraction (issue categories, tags, wiki success/failure, intake gaps, frequency/impact, resolution status, email forgotten/misspelled) are noted; tables use parser-derived data where available.
+
+---
+
+## 1. Volume and scope
+
+| Metric | Value |
+|--------|--------|
+| Total tickets | 722 |
+| Transcripts with parse errors | 0 |
+| Tickets with “Ticket closed” / “Transcript saving” in payload | 722 (100%) |
+| Tickets with claimed channel name (staff claimed) | 671 (93%) |
+| Tickets with escalation mentioned in text | 1 |
+
+**Narrative:** The Drive2 batch is fully parseable. All 722 tickets show a close/saving event; 93% have a claimed channel name, indicating that most tickets were claimed by staff before closure. Escalations are rare in this set: only one ticket mentions one.
+
+---
+
+## 2. Game detection and game_or_server (heuristic)
+
+From decoded form + messages: the text buffer is scanned for canonical game names and aliases (Part 1 Analysis §5). Assigning `game_or_server` routing buckets (Valheim | Rust main | MC modded | MC vanilla | Other) would require Mode A.
+
+| game_detected (heuristic) | Count | % |
+|---------------------------|-------|---|
+| Project Zomboid | 257 | 35.6 |
+| Minecraft | 179 | 24.8 |
+| Satisfactory | 79 | 10.9 |
+| Palworld | 46 | 6.4 |
+| Not Mentioned | 45 | 6.2 |
+| Enshrouded | 18 | 2.5 |
+| ARK: Survival Evolved | 18 | 2.5 |
+| 7 Days to Die | 17 | 2.4 |
+| Valheim | 14 | 1.9 |
+| DayZ | 10 | 1.4 |
+| FiveM | 8 | 1.1 |
+| Core Keeper | 6 | 0.8 |
+| Vintage Story | 5 | 0.7 |
+| Rust | 5 | 0.7 |
+| Factorio | 5 | 0.7 |
+| Necesse | 4 | 0.6 |
+| V Rising | 3 | 0.4 |
+| ECO | 3 | 0.4 |
+
+**Narrative:** Project Zomboid and Minecraft dominate; Satisfactory and Palworld are next. About 6% have no game detected from text. Full `game_or_server` (MC modded vs vanilla, Rust main, etc.) needs per‑ticket Mode A extraction.
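
The alias scan described above can be sketched roughly as follows. This is a minimal illustration, not the logic in `scripts/batch_transcript_analytics.py`: the `GAME_ALIASES` table and the `detect_game` function are hypothetical names, and the aliases shown are placeholders for the canonical list in Part 1 Analysis §5.

```python
import re

# Hypothetical alias table for illustration only; the canonical names and
# aliases are defined in Part 1 Analysis §5 and are not reproduced here.
GAME_ALIASES = {
    "Project Zomboid": ["project zomboid", "zomboid"],
    "Minecraft": ["minecraft", "mc"],
    "7 Days to Die": ["7 days to die", "7dtd"],
}


def detect_game(text: str) -> str:
    """Return the first canonical game whose name or alias appears in text.

    Games are checked in table order, so a ticket that mentions several
    games is attributed to the first one listed in GAME_ALIASES.
    """
    lowered = text.lower()
    for game, aliases in GAME_ALIASES.items():
        for alias in aliases:
            # The lookarounds stop short aliases such as "mc" from matching
            # inside longer words.
            pattern = rf"(?<!\w){re.escape(alias)}(?!\w)"
            if re.search(pattern, lowered):
                return game
    return "Not Mentioned"
```

Matching on whole words is a design choice here: a plain substring test would be simpler but would count any word containing a short alias as a hit.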
+
+---
+
+## 3. Issue categories and tags
+
+**Issue categories** (Availability, Connectivity, Billing, Data/saves, Configuration/mods) and **TICKET_TAGS** (Server Down, Stuck Restarting, Can’t Connect, Server Lag, Billing, Refund Request, Mod Help, Backup Restore, World/Save, Server Config) require Mode A extraction from each transcript. No aggregate table is computed from the batch parser.
+
+**Recommendation:** Run Mode A on all transcripts (or a sample), then aggregate `issue_types` and suggested tags into counts and top tags per game.
+
+---
+
+## 4. Message count and conversation shape
+
+| Messages (header) | Number of tickets |
+|-------------------|--------------------|
+| 3–6 | 151 |
+| 7–10 | 128 |
+| 11–15 | 95 |
+| 16–22 | 88 |
+| 23–35 | 78 |
+| 36–60 | 45 |
+| 61+ | 137 |
+
+Summary stats (from header): min 3, max 356 messages per ticket; 462 of 722 tickets (64%) fall in the 3–22 range. Back‑and‑forth turns, duration, and “staff asked for more info repeatedly” require Mode A.
+
+**Narrative:** Conversation length is skewed toward short (3–10 messages, 279 tickets) and mid-length (11–35 messages, 261 tickets) conversations. A long tail remains: 137 tickets (19%) have 61+ messages, likely complex or multi-step resolutions.
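
For reference, the bucketing used in the table can be reproduced with a small helper along these lines. The bucket edges are taken from the table; the function names (`bucket_label`, `message_histogram`) are hypothetical and the batch parser may compute its histogram differently.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bounds of the closed buckets in the table; anything above
# the last bound falls into the open-ended "61+" bucket.
UPPER_BOUNDS = [6, 10, 15, 22, 35, 60]
LABELS = ["3–6", "7–10", "11–15", "16–22", "23–35", "36–60", "61+"]


def bucket_label(message_count: int) -> str:
    """Map a header message count to its bucket label.

    Counts below 3 would also land in the first bucket; the minimum in this
    batch is 3, so that case does not occur here.
    """
    return LABELS[bisect_left(UPPER_BOUNDS, message_count)]


def message_histogram(message_counts) -> Counter:
    """Count tickets per bucket for an iterable of message counts."""
    return Counter(bucket_label(n) for n in message_counts)
```

Calling `message_histogram` on the per-ticket header counts yields one entry per non-empty bucket, which can then be printed in `LABELS` order.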
+
+---
+
+## 5. Attachments (saved / skipped)
+
+| Attachments saved | Number of tickets |
+|-------------------|-------------------|
+| 0 | 325 |
+| 1 | 169 |
+| 2 | 81 |
+| 3 | 55 |
+| 4 | 30 |
+| 5+ | 62 |
+
+| Attachments skipped | Number of tickets |
+|---------------------|-------------------|
+| 0 | 695 |
+| 1 | 16 |
+| 2 | 6 |
+| 3 | 3 |
+| 4 | 2 |
+
+**Narrative:** About 55% of tickets (397 of 722) have at least one attachment saved; 695 (96%) have none skipped. Skipped reasons and mentions_screenshots/clips/logs require Mode A.
+
+---
+
+## 6. Staff involvement (from payload)
+
+Staff identified by Broccolini support user IDs (Part 1 Analysis §10.1) appearing in message payloads.
+
+| Staff involved (count per ticket) | Number of tickets |
+|----------------------------------|-------------------|
+| 0 | 152 |
+| 1 | 545 |
+| 2 | 25 |
+
+**Narrative:** Most tickets (545, 75%) have exactly one staff member in the payload; 152 have no staff ID in messages (e.g. Ticket Tool–only or unclaimed). “Tickets claimed per member” and first‑response time need claim/unclaim message parsing: the channel name gives claim attribution, but full workload per member needs Mode A or claim-event parsing.
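
One way to count staff per ticket from known user IDs is sketched below. The IDs in `STAFF_IDS` are made-up placeholders (the real list is in Part 1 Analysis §10.1), `staff_involved` is a hypothetical name, and the 17 to 20 digit pattern is an assumption about how Discord user IDs appear in the decoded payload.

```python
import re

# Placeholder IDs for illustration; substitute the support user IDs
# listed in Part 1 Analysis §10.1.
STAFF_IDS = {"111111111111111111", "222222222222222222"}

# Assumption: user IDs appear as standalone runs of 17 to 20 digits.
ID_PATTERN = re.compile(r"(?<!\d)\d{17,20}(?!\d)")


def staff_involved(payload_text: str) -> int:
    """Count distinct known staff IDs that appear in a decoded payload."""
    found = set(ID_PATTERN.findall(payload_text))
    return len(found & STAFF_IDS)
```

Using a set intersection means a staff member who posts many messages is still counted once, which matches the per-ticket counts in the table.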
+
+---
+
+## 7. User count (participants per ticket)
+
+| User count (header) | Number of tickets |
+|--------------------|-------------------|
+| 2 | 70 |
+| 3 | 582 |
+| 4 | 44 |
+| 5 | 8 |
+| 6 | 2 |
+
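+**Narrative:** Most tickets (582) have 3 participants (requester + 1 staff + Ticket Tool); 4+ participants suggest multiple staff or extra users in the thread. The rows above sum to 706, so 16 of the 722 tickets are not represented in this table.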
+
+---
+
+## 8. Wiki usage and wiki‑linked outcomes
+
+**wiki_articles_posted**, **wiki_solved_issue** (true / false / unclear), and staff‑linked outcomes (user_wanted_broccolini_to_do_it, user_wanted_broccolini_but_walkthrough) require Mode A extraction. No aggregate table from the batch parser.
+
+**Recommendation:** After Mode A, aggregate: (1) tickets where wiki_solved_issue = true / false / unclear; (2) per support member: wiki posts that solved vs did not, “do it for me” vs walkthrough counts.
+
+---
+
+## 9. Email analytics
+
+Parser did not detect “Account Email” + email in the same decoded block in this run. **Email analytics** (email_forgotten, email_misspelled, email_didnt_link, email_corrected) require Mode A extraction from form embeds and message text.
+
+---
+
+## 10. Frequency / impact distributions
+
+**frequency** (once | sometimes | every_time | unclear) and **impact** (minor | moderate | severe | blocked | unclear) require inference from transcript wording (Mode A). No aggregate table from the batch parser.
+
+---
+
+## 11. Resolution patterns
+
+From parser: all 722 tickets contain “Ticket closed” or “Transcript saving” in the payload. **status** (resolved | unresolved | escalated | unclear) and **relied_on** (logs | mod_updates | staff_action | other) require Mode A. One ticket mentions escalation in text.
+
+**Narrative:** All transcripts represent closed/saved tickets; resolution outcome and what resolution relied on need per‑ticket extraction.
+
+---
+
+## 12. Intake gaps
+
+Per‑ticket intake_gaps (account_contact, issue_type, reproduction, environment, attachments, priority, rules) each as complete | partial | missing require Mode A. No aggregate table from the batch parser.
+
+**Recommendation:** After Mode A, report % complete / partial / missing per dimension to target form and template improvements.
+
+---
+
+## 13. Recurring analytics (Broccolini support section)
+
+From the batch parser we have:
+
+- **Tickets per game_detected (heuristic):** see §2.
+- **Claimed channel share:** 671/722 (93%).
+- **Staff involved count per ticket:** see §6.
+
+**Require Mode A or claim parsing:**
+
+- Tickets claimed per member (from claim/unclaim messages or channel name).
+- First response time, re‑opens, escalations.
+- Tag distribution, repeat customers, sentiment toward staff.
+- Wiki‑linked outcomes per member (§8).
+
+---
+
+## 14. How to reproduce and extend
+
+1. **Run batch parser (this report’s source):**
+   ```bash
+   python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive2"
+   ```
+   To run against a different folder, such as `Drive`:
+   ```bash
+   python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive"
+   ```
+2. **Full Mode C tables:** Run Mode A extraction on each transcript (or a sample), collect JSON, then aggregate by issue categories, tags, game_or_server, wiki_solved_issue, intake_gaps, frequency/impact, resolution status, and email analytics. Use Part 1 Analysis and `docs/TICKET-ANALYTICS-SCHEMA-PROMPTING.md` as the schema source of truth.
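
The aggregation in step 2 could look roughly like the sketch below. It assumes one JSON file per ticket with the fields `game_detected`, `issue_types`, `tags`, and `wiki_solved_issue`; that record layout and the function names are assumptions for illustration, and the actual schema is whatever Part 1 Analysis and `docs/TICKET-ANALYTICS-SCHEMA-PROMPTING.md` define.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path


def load_records(folder):
    """Load every *.json file in folder as one Mode A record (assumed layout)."""
    paths = sorted(Path(folder).glob("*.json"))
    return [json.loads(p.read_text(encoding="utf-8")) for p in paths]


def normalize_wiki(value) -> str:
    """Map a JSON true/false to 'true'/'false'; anything else is 'unclear'."""
    if value is True:
        return "true"
    if value is False:
        return "false"
    return "unclear"


def aggregate(records):
    """Return (issue counts, tag counts per game, wiki outcome counts)."""
    issue_counts = Counter()
    tags_per_game = defaultdict(Counter)
    wiki_counts = Counter()
    for rec in records:
        issue_counts.update(rec.get("issue_types", []))
        game = rec.get("game_detected", "Not Mentioned")
        tags_per_game[game].update(rec.get("tags", []))
        wiki_counts[normalize_wiki(rec.get("wiki_solved_issue"))] += 1
    return issue_counts, tags_per_game, wiki_counts
```

With the records loaded, `tags_per_game["Minecraft"].most_common(5)` would give the top tags for one game, and the same pattern extends to intake gaps, frequency/impact, and resolution status.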