First upload

2026-02-17 21:49:58 -06:00
parent 29a13768f7
commit 6821424663
46 changed files with 3179 additions and 14 deletions
--- a/docs/analytics/Part
+++ b/docs/analytics/Part
@@ -0,0 +1,324 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="UTF-8" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+  <title>Mode C: Batch analytics report</title>
+  <style>
+    :root {
+      --bg: #fafbfc;
+      --paper: #ffffff;
+      --text: #1a1d21;
+      --text-muted: #57606a;
+      --accent: #0969da;
+      --accent-soft: #ddf4ff;
+      --border: #d0d7de;
+      --table-stripe: #f6f8fa;
+      --code-bg: #f0f2f5;
+      --font-sans: 'Segoe UI', system-ui, -apple-system, sans-serif;
+      --font-mono: ui-monospace, 'Cascadia Code', 'Source Code Pro', monospace;
+      --radius: 6px;
+      --shadow: 0 1px 3px rgba(0,0,0,.06);
+    }
+
+    * { box-sizing: border-box; }
+    body {
+      margin: 0;
+      padding: 2rem 1.5rem 3rem;
+      font-family: var(--font-sans);
+      font-size: 15px;
+      line-height: 1.6;
+      color: var(--text);
+      background: var(--bg);
+    }
+
+    .report {
+      max-width: 820px;
+      margin: 0 auto;
+      background: var(--paper);
+      padding: 2.5rem 3rem;
+      border-radius: var(--radius);
+      box-shadow: var(--shadow);
+    }
+
+    .report-header {
+      border-bottom: 2px solid var(--border);
+      padding-bottom: 1.5rem;
+      margin-bottom: 2rem;
+    }
+    .report-header h1 {
+      margin: 0 0 0.5rem;
+      font-size: 1.75rem;
+      font-weight: 600;
+      color: var(--text);
+    }
+    .report-meta {
+      font-size: 0.9rem;
+      color: var(--text-muted);
+    }
+    .report-meta p { margin: 0.25rem 0; }
+
+    h2 {
+      font-size: 1.2rem;
+      font-weight: 600;
+      margin: 2rem 0 1rem;
+      padding-bottom: 0.35rem;
+      color: var(--text);
+      border-bottom: 1px solid var(--border);
+    }
+    h2:first-of-type { margin-top: 0; }
+
+    p { margin: 0.75rem 0; }
+    .narrative, .recommendation { font-style: normal; }
+    strong { font-weight: 600; }
+
+    table {
+      width: 100%;
+      border-collapse: collapse;
+      font-size: 0.9rem;
+      margin: 1rem 0;
+      border-radius: var(--radius);
+      overflow: hidden;
+      box-shadow: 0 1px 2px rgba(0,0,0,.05);
+    }
+    th, td {
+      padding: 0.6rem 1rem;
+      text-align: left;
+      border: 1px solid var(--border);
+    }
+    th {
+      background: var(--table-stripe);
+      font-weight: 600;
+      color: var(--text);
+    }
+    tr:nth-child(even) { background: var(--table-stripe); }
+    tr:hover { background: #eef2f7; }
+
+    pre, code {
+      font-family: var(--font-mono);
+      font-size: 0.85em;
+    }
+    pre {
+      background: var(--code-bg);
+      padding: 1rem 1.25rem;
+      border-radius: var(--radius);
+      overflow-x: auto;
+      margin: 1rem 0;
+      border: 1px solid var(--border);
+    }
+    code { padding: 0.15em 0.4em; background: var(--code-bg); border-radius: 4px; }
+
+    ul { margin: 0.75rem 0; padding-left: 1.5rem; }
+    li { margin: 0.35rem 0; }
+
+    hr {
+      border: none;
+      border-top: 1px solid var(--border);
+      margin: 2rem 0;
+    }
+
+    @media print {
+      body { background: #fff; padding: 0; }
+      .report {
+        max-width: none;
+        box-shadow: none;
+        padding: 0;
+      }
+      h2 { page-break-after: avoid; }
+      table { page-break-inside: avoid; }
+    }
+  </style>
+</head>
+<body>
+  <div class="report">
+    <header class="report-header">
+      <h1>Mode C: Batch analytics report</h1>
+      <div class="report-meta">
+        <p><strong>Source:</strong> <code>Discord Ticket Transcripts/Drive2/</code></p>
+        <p><strong>Computed:</strong> From transcript HTML (metadata + decoded base64 message payloads).</p>
+        <p><strong>Guide:</strong> Part 1 Analysis (Transcript analytics schemas, Broccolini support section).</p>
+        <p><strong>Tool:</strong> <code>scripts/batch_transcript_analytics.py</code></p>
+      </div>
+      <p style="margin-top: 1rem;">Analytics below are <strong>per‑ticket and aggregate</strong> across 722 transcripts. Dimensions that require full Mode A extraction (issue categories, tags, wiki success/failure, intake gaps, frequency/impact, resolution status, email forgotten/misspelled) are noted; tables use parser-derived data where available.</p>
+    </header>
+
+    <h2>1. Volume and scope</h2>
+    <table>
+      <thead><tr><th>Metric</th><th>Value</th></tr></thead>
+      <tbody>
+        <tr><td>Total tickets</td><td>722</td></tr>
+        <tr><td>Transcripts with parse errors</td><td>0</td></tr>
+        <tr><td>Tickets with "Ticket closed" / "Transcript saving" in payload</td><td>722 (100%)</td></tr>
+        <tr><td>Tickets with claimed channel name (staff claimed)</td><td>671 (93%)</td></tr>
+        <tr><td>Tickets with escalation mentioned in text</td><td>1</td></tr>
+      </tbody>
+    </table>
+    <p class="narrative"><strong>Narrative:</strong> The Drive2 batch is fully parseable. All 722 tickets show a close/saving event; 93% have a claimed channel name, indicating that most tickets were claimed by staff before closure. Escalations are rare in this set (one ticket mentions escalation).</p>
+
+    <hr />
+
+    <h2>2. Game detection and game_or_server (heuristic)</h2>
+    <p>The decoded form and message text of each ticket is scanned as a single text buffer for canonical game names and aliases (Part 1 Analysis §5). The <code>game_or_server</code> routing buckets (Valheim | Rust main | MC modded | MC vanilla | Other) require Mode A.</p>
+    <table>
+      <thead><tr><th>game_detected (heuristic)</th><th>Count</th><th>%</th></tr></thead>
+      <tbody>
+        <tr><td>Project Zomboid</td><td>257</td><td>35.6</td></tr>
+        <tr><td>Minecraft</td><td>179</td><td>24.8</td></tr>
+        <tr><td>Satisfactory</td><td>79</td><td>10.9</td></tr>
+        <tr><td>Palworld</td><td>46</td><td>6.4</td></tr>
+        <tr><td>Not Mentioned</td><td>45</td><td>6.2</td></tr>
+        <tr><td>Enshrouded</td><td>18</td><td>2.5</td></tr>
+        <tr><td>ARK: Survival Evolved</td><td>18</td><td>2.5</td></tr>
+        <tr><td>7 Days to Die</td><td>17</td><td>2.4</td></tr>
+        <tr><td>Valheim</td><td>14</td><td>1.9</td></tr>
+        <tr><td>DayZ</td><td>10</td><td>1.4</td></tr>
+        <tr><td>FiveM</td><td>8</td><td>1.1</td></tr>
+        <tr><td>Core Keeper</td><td>6</td><td>0.8</td></tr>
+        <tr><td>Vintage Story</td><td>5</td><td>0.7</td></tr>
+        <tr><td>Rust</td><td>5</td><td>0.7</td></tr>
+        <tr><td>Factorio</td><td>5</td><td>0.7</td></tr>
+        <tr><td>Necesse</td><td>4</td><td>0.6</td></tr>
+        <tr><td>V Rising</td><td>3</td><td>0.4</td></tr>
+        <tr><td>ECO</td><td>3</td><td>0.4</td></tr>
+      </tbody>
+    </table>
+    <p class="narrative"><strong>Narrative:</strong> Project Zomboid and Minecraft dominate; Satisfactory and Palworld are next. About 6% have no game detected from text. Full <code>game_or_server</code> (MC modded vs vanilla, Rust main, etc.) needs per‑ticket Mode A extraction.</p>
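+    <p>The alias scan described in this section can be sketched roughly as follows. This is a minimal illustration, not the logic of <code>scripts/batch_transcript_analytics.py</code>: the <code>GAME_ALIASES</code> table, the <code>detect_game</code> helper, and the sample ticket texts are hypothetical, and the real canonical names and aliases are the ones defined in Part 1 Analysis §5.</p>
+    <pre>
```python
import re
from collections import Counter

# Hypothetical alias table; the real names and aliases live in Part 1 Analysis §5.
GAME_ALIASES = {
    "Project Zomboid": ["project zomboid", "zomboid", "pz"],
    "Minecraft": ["minecraft", "mc"],
    "7 Days to Die": ["7 days to die", "7dtd"],
}


def detect_game(text_buffer: str) -> str:
    """Return the first canonical game whose name or alias appears as a whole word."""
    lowered = text_buffer.lower()
    for canonical, aliases in GAME_ALIASES.items():
        for alias in aliases:
            # Word boundaries keep short aliases such as "mc" from matching inside other words.
            if re.search(r"\b" + re.escape(alias) + r"\b", lowered):
                return canonical
    return "Not Mentioned"


# Made-up ticket texts, for illustration only.
tickets = [
    "My Zomboid server is stuck restarting",
    "Cannot connect to my MC modded server",
    "I need a refund please",
]
counts = Counter(detect_game(text) for text in tickets)
total = len(tickets)
for game, n in counts.most_common():
    print(f"{game}: {n} ({100 * n / total:.1f}%)")
```
+    </pre>
+    <p>Returning the first match means a ticket that mentions two games is counted once, under whichever game appears first in the table; whether the real parser resolves such ties the same way is not stated in this report.</p>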
+
+    <hr />
+
+    <h2>3. Issue categories and tags</h2>
+    <p><strong>Issue categories</strong> (Availability, Connectivity, Billing, Data/saves, Configuration/mods) and <strong>TICKET_TAGS</strong> (Server Down, Stuck Restarting, Can't Connect, Server Lag, Billing, Refund Request, Mod Help, Backup Restore, World/Save, Server Config) require Mode A extraction from each transcript. No aggregate table is computed from the batch parser.</p>
+    <p class="recommendation"><strong>Recommendation:</strong> Run Mode A on all transcripts (or a sample), then aggregate <code>issue_types</code> and suggested tags into counts and top tags per game.</p>
+
+    <hr />
+
+    <h2>4. Message count and conversation shape</h2>
+    <table>
+      <thead><tr><th>Messages (header)</th><th>Number of tickets</th></tr></thead>
+      <tbody>
+        <tr><td>3–6</td><td>151</td></tr>
+        <tr><td>7–10</td><td>128</td></tr>
+        <tr><td>11–15</td><td>95</td></tr>
+        <tr><td>16–22</td><td>88</td></tr>
+        <tr><td>23–35</td><td>78</td></tr>
+        <tr><td>36–60</td><td>45</td></tr>
+        <tr><td>61+</td><td>137</td></tr>
+      </tbody>
+    </table>
+    <p>Summary stats (from header): min 3, max 356 messages per ticket; 462 of 722 tickets (64%) fall in the 3–22 range. Back‑and‑forth turns, duration, and "staff asked for more info repeatedly" require Mode A.</p>
+    <p class="narrative"><strong>Narrative:</strong> Conversation length is skewed toward short (3–10 messages, 279 tickets) and mid-length (11–35 messages, 261 tickets) conversations; 137 tickets (19%) are long (61+ messages), likely complex or multi-step resolutions.</p>
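+    <p>For reference, the bucketing used in the table can be reproduced with a few lines of Python. This is a sketch under the assumption that each ticket's header yields one integer message count; <code>bucket_label</code> and the sample counts are hypothetical, and only the bucket boundaries come from the table.</p>
+    <pre>
```python
from collections import Counter

# Inclusive upper bounds and labels copied from the message-count table.
BUCKETS = [(6, "3–6"), (10, "7–10"), (15, "11–15"), (22, "16–22"), (35, "23–35"), (60, "36–60")]


def bucket_label(message_count: int) -> str:
    """Map a header message count to its table bucket (counts under 3 land in the first bucket)."""
    for upper, label in BUCKETS:
        if upper >= message_count:
            return label
    return "61+"


# Hypothetical per-ticket counts, for illustration only.
header_counts = [3, 9, 14, 22, 61, 356]
histogram = Counter(bucket_label(n) for n in header_counts)
for label, tickets in histogram.items():
    print(f"{label}: {tickets}")
```
+    </pre>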
+
+    <hr />
+
+    <h2>5. Attachments (saved / skipped)</h2>
+    <table>
+      <thead><tr><th>Attachments saved</th><th>Number of tickets</th></tr></thead>
+      <tbody>
+        <tr><td>0</td><td>325</td></tr>
+        <tr><td>1</td><td>169</td></tr>
+        <tr><td>2</td><td>81</td></tr>
+        <tr><td>3</td><td>55</td></tr>
+        <tr><td>4</td><td>30</td></tr>
+        <tr><td>5+</td><td>62</td></tr>
+      </tbody>
+    </table>
+    <table>
+      <thead><tr><th>Attachments skipped</th><th>Number of tickets</th></tr></thead>
+      <tbody>
+        <tr><td>0</td><td>695</td></tr>
+        <tr><td>1</td><td>16</td></tr>
+        <tr><td>2</td><td>6</td></tr>
+        <tr><td>3</td><td>3</td></tr>
+        <tr><td>4</td><td>2</td></tr>
+      </tbody>
+    </table>
+    <p class="narrative"><strong>Narrative:</strong> About 55% of tickets (397 of 722) have at least one attachment saved; 96% (695) have none skipped. Skipped reasons and mentions_screenshots/clips/logs require Mode A.</p>
+
+    <hr />
+
+    <h2>6. Staff involvement (from payload)</h2>
+    <p>Staff identified by Broccolini support user IDs (Part 1 Analysis §10.1) appearing in message payloads.</p>
+    <table>
+      <thead><tr><th>Staff involved (count per ticket)</th><th>Number of tickets</th></tr></thead>
+      <tbody>
+        <tr><td>0</td><td>152</td></tr>
+        <tr><td>1</td><td>545</td></tr>
+        <tr><td>2</td><td>25</td></tr>
+      </tbody>
+    </table>
+    <p class="narrative"><strong>Narrative:</strong> Most tickets have exactly one staff member in the payload; 152 have no staff ID in messages (e.g. Ticket Tool–only or unclaimed). "Tickets claimed per member" and first‑response time need claim/unclaim message parsing (channel name gives claim attribution; full workload per member needs Mode A or claim-event parsing).</p>
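+    <p>A minimal sketch of the staff count per ticket, assuming the decoded payload of each ticket is available as one string and that a staff member counts as involved when their user ID appears anywhere in it. The IDs, ticket names, and the <code>staff_involved</code> helper are placeholders; the real Broccolini support user IDs are the ones listed in Part 1 Analysis §10.1.</p>
+    <pre>
```python
from collections import Counter

# Placeholder IDs; the real support user IDs are listed in Part 1 Analysis §10.1.
STAFF_IDS = {"100000000000000001", "100000000000000002"}


def staff_involved(payload_text: str) -> int:
    """Count distinct staff IDs that appear anywhere in one ticket's decoded payload."""
    return sum(1 for staff_id in STAFF_IDS if staff_id in payload_text)


# Made-up decoded payloads, for illustration only.
payloads = {
    "ticket-0001": "author 555: server down / author 100000000000000001: restarting it now",
    "ticket-0002": "author 777: never mind, fixed it myself",
    "ticket-0003": "author 100000000000000001: passing this to 100000000000000002",
}
distribution = Counter(staff_involved(text) for text in payloads.values())
for staff_count in sorted(distribution):
    print(f"{staff_count} staff: {distribution[staff_count]} ticket(s)")
```
+    </pre>
+    <p>A plain substring match also counts staff who are only mentioned, not just those who wrote a message, so it may overstate involvement compared with matching on message authors.</p>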
+
+    <hr />
+
+    <h2>7. User count (participants per ticket)</h2>
+    <table>
+      <thead><tr><th>User count (header)</th><th>Number of tickets</th></tr></thead>
+      <tbody>
+        <tr><td>2</td><td>70</td></tr>
+        <tr><td>3</td><td>582</td></tr>
+        <tr><td>4</td><td>44</td></tr>
+        <tr><td>5</td><td>8</td></tr>
+        <tr><td>6</td><td>2</td></tr>
+      </tbody>
+    </table>
+    <p class="narrative"><strong>Narrative:</strong> Most tickets (582) have 3 participants (requester + 1 staff + Ticket Tool); 4+ participants suggest multiple staff or extra users in the thread. The rows above total 706, so 16 of the 722 tickets are not represented in this table.</p>
+
+    <hr />
+
+    <h2>8. Wiki usage and wiki‑linked outcomes</h2>
+    <p><strong>wiki_articles_posted</strong>, <strong>wiki_solved_issue</strong> (true / false / unclear), and staff‑linked outcomes (user_wanted_broccolini_to_do_it, user_wanted_broccolini_but_walkthrough) require Mode A extraction. No aggregate table from the batch parser.</p>
+    <p class="recommendation"><strong>Recommendation:</strong> After Mode A, aggregate: (1) tickets where wiki_solved_issue = true / false / unclear; (2) per support member: wiki posts that solved vs did not, "do it for me" vs walkthrough counts.</p>
+
+    <hr />
+
+    <h2>9. Email analytics</h2>
+    <p>Parser did not detect "Account Email" + email in the same decoded block in this run. <strong>Email analytics</strong> (email_forgotten, email_misspelled, email_didnt_link, email_corrected) require Mode A extraction from form embeds and message text.</p>
+
+    <hr />
+
+    <h2>10. Frequency / impact distributions</h2>
+    <p><strong>frequency</strong> (once | sometimes | every_time | unclear) and <strong>impact</strong> (minor | moderate | severe | blocked | unclear) require inference from transcript wording (Mode A). No aggregate table from the batch parser.</p>
+
+    <hr />
+
+    <h2>11. Resolution patterns</h2>
+    <p>From parser: all 722 tickets contain "Ticket closed" or "Transcript saving" in the payload. <strong>status</strong> (resolved | unresolved | escalated | unclear) and <strong>relied_on</strong> (logs | mod_updates | staff_action | other) require Mode A. One ticket mentions escalation in text.</p>
+    <p class="narrative"><strong>Narrative:</strong> All transcripts represent closed/saved tickets; resolution outcome and what resolution relied on need per‑ticket extraction.</p>
+
+    <hr />
+
+    <h2>12. Intake gaps</h2>
+    <p>Per‑ticket intake_gaps (account_contact, issue_type, reproduction, environment, attachments, priority, rules) each as complete | partial | missing require Mode A. No aggregate table from the batch parser.</p>
+    <p class="recommendation"><strong>Recommendation:</strong> After Mode A, report % complete / partial / missing per dimension to target form and template improvements.</p>
+
+    <hr />
+
+    <h2>13. Recurring analytics (Broccolini support section)</h2>
+    <p>From the batch parser we have:</p>
+    <ul>
+      <li><strong>Tickets per game_detected (heuristic):</strong> see §2.</li>
+      <li><strong>Claimed channel share:</strong> 671/722 (93%).</li>
+      <li><strong>Staff involved count per ticket:</strong> see §6.</li>
+    </ul>
+    <p><strong>Require Mode A or claim parsing:</strong></p>
+    <ul>
+      <li>Tickets claimed per member (from claim/unclaim messages or channel name).</li>
+      <li>First response time, re‑opens, escalations.</li>
+      <li>Tag distribution, repeat customers, sentiment toward staff.</li>
+      <li>Wiki‑linked outcomes per member (see §8).</li>
+    </ul>
+
+    <hr />
+
+    <h2>14. How to reproduce and extend</h2>
+    <ol>
+      <li><strong>Run batch parser (this report's source):</strong>
+        <pre>python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive2"</pre>
+        For a different subfolder or drive, change the path:
+        <pre>python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive"</pre>
+      </li>
+      <li><strong>Full Mode C tables:</strong> Run Mode A extraction on each transcript (or a sample), collect JSON, then aggregate by issue categories, tags, game_or_server, wiki_solved_issue, intake_gaps, frequency/impact, resolution status, and email analytics. Use Part 1 Analysis and <code>docs/TICKET-ANALYTICS-SCHEMA-PROMPTING.md</code> as the schema source of truth.</li>
+    </ol>
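+    <p>The aggregation step in item 2 might look roughly like the sketch below. It assumes each Mode A result is a JSON object with an <code>issue_types</code> list and a <code>wiki_solved_issue</code> string (true / false / unclear); the exact JSON layout, the <code>aggregate</code> helper, and the sample records are assumptions, and the schema documents named above remain the source of truth.</p>
+    <pre>
```python
import json
from collections import Counter


def aggregate(records):
    """Tally issue types and wiki outcomes across Mode A records."""
    issue_counts = Counter()
    wiki_counts = Counter()
    for record in records:
        # A ticket can carry several issue types, so each one is counted.
        issue_counts.update(record.get("issue_types", []))
        # Records without the field are treated as "unclear" (an assumption).
        wiki_counts[record.get("wiki_solved_issue", "unclear")] += 1
    return issue_counts, wiki_counts


# Made-up Mode A output, for illustration only.
sample = json.loads("""
[
  {"issue_types": ["Connectivity"], "wiki_solved_issue": "true"},
  {"issue_types": ["Connectivity", "Configuration/mods"], "wiki_solved_issue": "false"},
  {"issue_types": ["Billing"]}
]
""")
issue_counts, wiki_counts = aggregate(sample)
for issue, n in issue_counts.most_common():
    print(f"{issue}: {n}")
for outcome, n in wiki_counts.items():
    print(f"wiki_solved_issue={outcome}: {n}")
```
+    </pre>
+    <p>The same pattern extends to the other dimensions (tags, <code>game_or_server</code>, intake_gaps, frequency/impact, resolution status) by adding one counter per field.</p>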
+  </div>
+</body>
+</html>