broccolini-bot/docs/analytics/Part 1 Batch Analytics Report.html
2026-02-17 21:49:58 -06:00

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Mode C: Batch Analytics Report</title>
<style>
:root {
--bg: #fafbfc;
--paper: #ffffff;
--text: #1a1d21;
--text-muted: #57606a;
--accent: #0969da;
--accent-soft: #ddf4ff;
--border: #d0d7de;
--table-stripe: #f6f8fa;
--code-bg: #f0f2f5;
--font-sans: 'Segoe UI', system-ui, -apple-system, sans-serif;
--font-mono: ui-monospace, 'Cascadia Code', 'Source Code Pro', monospace;
--radius: 6px;
--shadow: 0 1px 3px rgba(0,0,0,.06);
}
* { box-sizing: border-box; }
body {
margin: 0;
padding: 2rem 1.5rem 3rem;
font-family: var(--font-sans);
font-size: 15px;
line-height: 1.6;
color: var(--text);
background: var(--bg);
}
.report {
max-width: 820px;
margin: 0 auto;
background: var(--paper);
padding: 2.5rem 3rem;
border-radius: var(--radius);
box-shadow: var(--shadow);
}
.report-header {
border-bottom: 2px solid var(--border);
padding-bottom: 1.5rem;
margin-bottom: 2rem;
}
.report-header h1 {
margin: 0 0 0.5rem;
font-size: 1.75rem;
font-weight: 600;
color: var(--text);
}
.report-meta {
font-size: 0.9rem;
color: var(--text-muted);
}
.report-meta p { margin: 0.25rem 0; }
h2 {
font-size: 1.2rem;
font-weight: 600;
margin: 2rem 0 1rem;
padding-bottom: 0.35rem;
color: var(--text);
border-bottom: 1px solid var(--border);
}
h2:first-of-type { margin-top: 0; }
p { margin: 0.75rem 0; }
.narrative, .recommendation { font-style: normal; }
strong { font-weight: 600; }
table {
width: 100%;
border-collapse: collapse;
font-size: 0.9rem;
margin: 1rem 0;
border-radius: var(--radius);
overflow: hidden;
box-shadow: 0 1px 2px rgba(0,0,0,.05);
}
th, td {
padding: 0.6rem 1rem;
text-align: left;
border: 1px solid var(--border);
}
th {
background: var(--table-stripe);
font-weight: 600;
color: var(--text);
}
tr:nth-child(even) { background: var(--table-stripe); }
tr:hover { background: #eef2f7; }
pre, code {
font-family: var(--font-mono);
font-size: 0.85em;
}
pre {
background: var(--code-bg);
padding: 1rem 1.25rem;
border-radius: var(--radius);
overflow-x: auto;
margin: 1rem 0;
border: 1px solid var(--border);
}
code { padding: 0.15em 0.4em; background: var(--code-bg); border-radius: 4px; }
ul { margin: 0.75rem 0; padding-left: 1.5rem; }
li { margin: 0.35rem 0; }
hr {
border: none;
border-top: 1px solid var(--border);
margin: 2rem 0;
}
@media print {
body { background: #fff; padding: 0; }
.report {
max-width: none;
box-shadow: none;
padding: 0;
}
h2 { page-break-after: avoid; }
table { page-break-inside: avoid; }
}
</style>
</head>
<body>
<div class="report">
<header class="report-header">
<h1>Mode C: Batch analytics report</h1>
<div class="report-meta">
<p><strong>Source:</strong> <code>Discord Ticket Transcripts/Drive2/</code></p>
<p><strong>Computed:</strong> From transcript HTML (metadata + decoded base64 message payloads).</p>
<p><strong>Guide:</strong> Part 1 Analysis (Transcript analytics schemas, Broccolini support section).</p>
<p><strong>Tool:</strong> <code>scripts/batch_transcript_analytics.py</code></p>
</div>
<p style="margin-top: 1rem;">Analytics below are <strong>per-ticket and aggregate</strong> across 722 transcripts. Dimensions that require full Mode A extraction (issue categories, tags, wiki success/failure, intake gaps, frequency/impact, resolution status, email forgotten/misspelled) are noted; tables use parser-derived data where available.</p>
</header>
<h2>1. Volume and scope</h2>
<table>
<thead><tr><th>Metric</th><th>Value</th></tr></thead>
<tbody>
<tr><td>Total tickets</td><td>722</td></tr>
<tr><td>Transcripts with parse errors</td><td>0</td></tr>
<tr><td>Tickets with "Ticket closed" / "Transcript saving" in payload</td><td>722 (100%)</td></tr>
<tr><td>Tickets with claimed channel name (staff claimed)</td><td>671 (93%)</td></tr>
<tr><td>Tickets with escalation mentioned in text</td><td>1</td></tr>
</tbody>
</table>
<p class="narrative"><strong>Narrative:</strong> The Drive2 batch is fully parseable, with no parse errors. All 722 tickets show a close/saving event; 93% have a claimed channel name, indicating that staff claimed most tickets before closure. Escalations are rare in this set: only one ticket mentions escalation in its text.</p>
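<p>The close/saving check on decoded payloads could be sketched as follows. This is a minimal illustration, not the code in <code>scripts/batch_transcript_analytics.py</code>: the function names are hypothetical, and it assumes each payload is a base64 string that decodes to UTF-8 text.</p>

```python
import base64

# Marker strings taken from the table above.
CLOSE_MARKERS = ("Ticket closed", "Transcript saving")


def decode_payload(b64_text: str) -> str:
    """Decode one base64 payload to text; undecodable bytes are replaced."""
    return base64.b64decode(b64_text).decode("utf-8", errors="replace")


def has_close_marker(decoded_text: str) -> bool:
    """Return True if any close/saving marker appears in the decoded text."""
    return any(marker in decoded_text for marker in CLOSE_MARKERS)


# Build a sample payload in place of a real transcript (illustrative only).
sample = base64.b64encode("Ticket closed by staff".encode("utf-8")).decode("ascii")
print(has_close_marker(decode_payload(sample)))
```

<p>Decoding with <code>errors="replace"</code> keeps one malformed payload from aborting the whole batch, at the cost of silently dropping unreadable bytes.</p>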
<hr />
<h2>2. Game detection and game_or_server (heuristic)</h2>
<p>Computed from the decoded form and messages: the text buffer is scanned for canonical game names and aliases (Part 1 Analysis §5). The <code>game_or_server</code> routing buckets (Valheim | Rust main | MC modded | MC vanilla | Other) would require Mode A.</p>
<table>
<thead><tr><th>game_detected (heuristic)</th><th>Count</th><th>%</th></tr></thead>
<tbody>
<tr><td>Project Zomboid</td><td>257</td><td>35.6</td></tr>
<tr><td>Minecraft</td><td>179</td><td>24.8</td></tr>
<tr><td>Satisfactory</td><td>79</td><td>10.9</td></tr>
<tr><td>Palworld</td><td>46</td><td>6.4</td></tr>
<tr><td>Not Mentioned</td><td>45</td><td>6.2</td></tr>
<tr><td>Enshrouded</td><td>18</td><td>2.5</td></tr>
<tr><td>ARK: Survival Evolved</td><td>18</td><td>2.5</td></tr>
<tr><td>7 Days to Die</td><td>17</td><td>2.4</td></tr>
<tr><td>Valheim</td><td>14</td><td>1.9</td></tr>
<tr><td>DayZ</td><td>10</td><td>1.4</td></tr>
<tr><td>FiveM</td><td>8</td><td>1.1</td></tr>
<tr><td>Core Keeper</td><td>6</td><td>0.8</td></tr>
<tr><td>Vintage Story</td><td>5</td><td>0.7</td></tr>
<tr><td>Rust</td><td>5</td><td>0.7</td></tr>
<tr><td>Factorio</td><td>5</td><td>0.7</td></tr>
<tr><td>Necesse</td><td>4</td><td>0.6</td></tr>
<tr><td>V Rising</td><td>3</td><td>0.4</td></tr>
<tr><td>ECO</td><td>3</td><td>0.4</td></tr>
</tbody>
</table>
<p class="narrative"><strong>Narrative:</strong> Project Zomboid and Minecraft dominate; Satisfactory and Palworld are next. About 6% have no game detected from text. Full <code>game_or_server</code> (MC modded vs vanilla, Rust main, etc.) needs per-ticket Mode A extraction.</p>
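<p>A heuristic of this kind might look like the sketch below. The alias table is a small hypothetical sample (the real list is in Part 1 Analysis §5), and the whole-word matching rule is an assumption rather than the parser's confirmed behaviour.</p>

```python
import re

# Hypothetical alias sample; the canonical list lives in Part 1 Analysis §5.
GAME_ALIASES = {
    "Project Zomboid": ["project zomboid", "zomboid", "pz"],
    "Minecraft": ["minecraft", "mc"],
    "7 Days to Die": ["7 days to die", "7dtd"],
}


def detect_game(text: str) -> str:
    """Return the first game whose name or alias appears as a whole word."""
    lowered = text.lower()
    for game, aliases in GAME_ALIASES.items():
        for alias in aliases:
            # Lookarounds stop short aliases like "mc" matching inside other words.
            if re.search(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", lowered):
                return game
    return "Not Mentioned"


print(detect_game("My Zomboid server won't start"))
```

<p>Because the first match wins, a ticket that mentions two games is attributed to whichever appears earlier in the alias table; a fuller version might count matches per game instead.</p>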
<hr />
<h2>3. Issue categories and tags</h2>
<p><strong>Issue categories</strong> (Availability, Connectivity, Billing, Data/saves, Configuration/mods) and <strong>TICKET_TAGS</strong> (Server Down, Stuck Restarting, Can't Connect, Server Lag, Billing, Refund Request, Mod Help, Backup Restore, World/Save, Server Config) require Mode A extraction from each transcript. No aggregate table is computed from the batch parser.</p>
<p class="recommendation"><strong>Recommendation:</strong> Run Mode A on all transcripts (or a sample), then aggregate <code>issue_types</code> and suggested tags into counts and top tags per game.</p>
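<p>Once Mode A output exists, the aggregation recommended here could be sketched as below. The record layout is an assumption: <code>game_detected</code> mirrors the column in §2, while the <code>tags</code> key is a hypothetical name for the suggested tags.</p>

```python
from collections import Counter, defaultdict


def top_tags_per_game(records, n=3):
    """Count tags per game and keep the n most common for each game."""
    per_game = defaultdict(Counter)
    for rec in records:
        game = rec.get("game_detected", "Not Mentioned")
        per_game[game].update(rec.get("tags", []))  # "tags" is a hypothetical key
    return {game: counts.most_common(n) for game, counts in per_game.items()}


# Illustrative records, not real ticket data.
records = [
    {"game_detected": "Minecraft", "tags": ["Mod Help", "Server Lag"]},
    {"game_detected": "Minecraft", "tags": ["Mod Help"]},
    {"game_detected": "Valheim", "tags": ["Can't Connect"]},
]
top = top_tags_per_game(records)
print(top["Minecraft"])
```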
<hr />
<h2>4. Message count and conversation shape</h2>
<table>
<thead><tr><th>Messages (header)</th><th>Number of tickets</th></tr></thead>
<tbody>
<tr><td>3–6</td><td>151</td></tr>
<tr><td>7–10</td><td>128</td></tr>
<tr><td>11–15</td><td>95</td></tr>
<tr><td>16–22</td><td>88</td></tr>
<tr><td>23–35</td><td>78</td></tr>
<tr><td>36–60</td><td>45</td></tr>
<tr><td>61+</td><td>137</td></tr>
</tbody>
</table>
<p>Summary stats (from header): min 3, max 356 messages per ticket; the majority (462 of 722, 64%) fall in the 3–22 range. Back-and-forth turns, duration, and "staff asked for more info repeatedly" require Mode A.</p>
<p class="narrative"><strong>Narrative:</strong> Conversation length is skewed toward short (3–10 messages, 279 tickets) and mid-length (11–35 messages, 261 tickets) conversations; a smaller set of 137 tickets is long (61+ messages), likely complex or multi-step resolutions.</p>
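<p>The bucketing behind this table could be reproduced along these lines. The bucket edges (3 to 6, 7 to 10, 11 to 15, 16 to 22, 23 to 35, 36 to 60, 61 and up) are read from the table; the function names and the <code>&lt;3</code> fallback label are hypothetical.</p>

```python
from collections import Counter

# (label, low, high) with inclusive bounds, following the table's ranges.
BUCKETS = [
    ("3-6", 3, 6), ("7-10", 7, 10), ("11-15", 11, 15),
    ("16-22", 16, 22), ("23-35", 23, 35), ("36-60", 36, 60),
]


def bucket_label(message_count: int) -> str:
    """Map a per-ticket message count to its range label."""
    for label, low, high in BUCKETS:
        if low <= message_count <= high:
            return label
    # Anything above the last edge is the open-ended bucket.
    return "61+" if message_count > 60 else "<3"


def histogram(message_counts):
    """Count tickets per bucket."""
    return Counter(bucket_label(n) for n in message_counts)


print(histogram([3, 6, 7, 22, 61, 356]))
```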
<hr />
<h2>5. Attachments (saved / skipped)</h2>
<table>
<thead><tr><th>Attachments saved</th><th>Number of tickets</th></tr></thead>
<tbody>
<tr><td>0</td><td>325</td></tr>
<tr><td>1</td><td>169</td></tr>
<tr><td>2</td><td>81</td></tr>
<tr><td>3</td><td>55</td></tr>
<tr><td>4</td><td>30</td></tr>
<tr><td>5+</td><td>62</td></tr>
</tbody>
</table>
<table>
<thead><tr><th>Attachments skipped</th><th>Number of tickets</th></tr></thead>
<tbody>
<tr><td>0</td><td>695</td></tr>
<tr><td>1</td><td>16</td></tr>
<tr><td>2</td><td>6</td></tr>
<tr><td>3</td><td>3</td></tr>
<tr><td>4</td><td>2</td></tr>
</tbody>
</table>
<p class="narrative"><strong>Narrative:</strong> About 55% of tickets (397 of 722) have at least one attachment saved; 45% have none. Skipped attachments are rare: 695 tickets (96%) have none skipped. Skipped reasons and mentions_screenshots/clips/logs require Mode A.</p>
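<p>Shares like these follow directly from the distribution tables. As a worked example, the snippet below computes the share of tickets with at least one skipped attachment from the second table; the helper name is hypothetical.</p>

```python
def share_nonzero(distribution):
    """Percentage of tickets whose count is not zero, rounded to one decimal."""
    total = sum(distribution.values())
    nonzero = total - distribution.get("0", 0)
    return round(100 * nonzero / total, 1)


# Values copied from the "Attachments skipped" table.
skipped = {"0": 695, "1": 16, "2": 6, "3": 3, "4": 2}
print(share_nonzero(skipped))  # 27 of 722 tickets -> 3.7
```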
<hr />
<h2>6. Staff involvement (from payload)</h2>
<p>Staff identified by Broccolini support user IDs (Part 1 Analysis §10.1) appearing in message payloads.</p>
<table>
<thead><tr><th>Staff involved (count per ticket)</th><th>Number of tickets</th></tr></thead>
<tbody>
<tr><td>0</td><td>152</td></tr>
<tr><td>1</td><td>545</td></tr>
<tr><td>2</td><td>25</td></tr>
</tbody>
</table>
<p class="narrative"><strong>Narrative:</strong> Most tickets (545, 75%) have exactly one staff member in the payload; 152 have no staff ID in messages (e.g. Ticket Tool-only or unclaimed). "Tickets claimed per member" and first-response time need claim/unclaim message parsing (the channel name gives claim attribution; full workload per member needs Mode A or claim-event parsing).</p>
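<p>Counting staff per ticket from the payload might be sketched as follows. The IDs below are placeholders, not real accounts (the actual list is in Part 1 Analysis §10.1), and the sketch assumes staff IDs appear in the decoded text as 17 to 20 digit Discord snowflakes.</p>

```python
import re

# Placeholder IDs for illustration only; see Part 1 Analysis §10.1 for the real list.
STAFF_IDS = {"100000000000000001", "100000000000000002"}


def staff_in_payload(decoded_text: str) -> set:
    """Return the set of known staff IDs that appear in the decoded text."""
    candidates = set(re.findall(r"\b\d{17,20}\b", decoded_text))
    return candidates & STAFF_IDS


ticket_text = "<@100000000000000001> can you check 999999999999999999's server?"
print(len(staff_in_payload(ticket_text)))
```

<p>Using a set means a staff member who posts many times still counts once per ticket, which matches the "count per ticket" column above.</p>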
<hr />
<h2>7. User count (participants per ticket)</h2>
<table>
<thead><tr><th>User count (header)</th><th>Number of tickets</th></tr></thead>
<tbody>
<tr><td>2</td><td>70</td></tr>
<tr><td>3</td><td>582</td></tr>
<tr><td>4</td><td>44</td></tr>
<tr><td>5</td><td>8</td></tr>
<tr><td>6</td><td>2</td></tr>
</tbody>
</table>
<p class="narrative"><strong>Narrative:</strong> Most tickets have 3 participants (requester + 1 staff + Ticket Tool); 4+ participants suggest multi-staff or extra users in thread. The five rows above total 706 tickets, so 16 of the 722 are not represented in this table.</p>
<hr />
<h2>8. Wiki usage and wiki-linked outcomes</h2>
<p><strong>wiki_articles_posted</strong>, <strong>wiki_solved_issue</strong> (true / false / unclear), and staff-linked outcomes (user_wanted_broccolini_to_do_it, user_wanted_broccolini_but_walkthrough) require Mode A extraction. No aggregate table from the batch parser.</p>
<p class="recommendation"><strong>Recommendation:</strong> After Mode A, aggregate: (1) tickets where wiki_solved_issue = true / false / unclear; (2) per support member: wiki posts that solved vs did not, "do it for me" vs walkthrough counts.</p>
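<p>The per-member aggregation suggested here could be sketched as below. It assumes Mode A emits <code>wiki_solved_issue</code> as a JSON boolean (anything else is treated as unclear) and <code>wiki_articles_posted</code> as a list; <code>support_member</code> is a hypothetical key for the staff member's name.</p>

```python
from collections import Counter


def outcome_label(value) -> str:
    """Collapse wiki_solved_issue to true / false / unclear."""
    if value is True:
        return "true"
    if value is False:
        return "false"
    return "unclear"


def wiki_outcomes_per_member(records):
    """Count (member, outcome) pairs for tickets where a wiki article was posted."""
    counts = Counter()
    for rec in records:
        if not rec.get("wiki_articles_posted"):
            continue  # no wiki link posted, so there is no outcome to count
        member = rec.get("support_member", "unknown")  # hypothetical key
        counts[(member, outcome_label(rec.get("wiki_solved_issue")))] += 1
    return counts


# Illustrative records, not real ticket data.
records = [
    {"support_member": "A", "wiki_articles_posted": ["backups"], "wiki_solved_issue": True},
    {"support_member": "A", "wiki_articles_posted": ["mods"], "wiki_solved_issue": False},
    {"support_member": "B", "wiki_articles_posted": [], "wiki_solved_issue": True},
]
wiki_counts = wiki_outcomes_per_member(records)
print(wiki_counts)
```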
<hr />
<h2>9. Email analytics</h2>
<p>Parser did not detect "Account Email" + email in the same decoded block in this run. <strong>Email analytics</strong> (email_forgotten, email_misspelled, email_didnt_link, email_corrected) require Mode A extraction from form embeds and message text.</p>
<hr />
<h2>10. Frequency / impact distributions</h2>
<p><strong>frequency</strong> (once | sometimes | every_time | unclear) and <strong>impact</strong> (minor | moderate | severe | blocked | unclear) require inference from transcript wording (Mode A). No aggregate table from the batch parser.</p>
<hr />
<h2>11. Resolution patterns</h2>
<p>From parser: all 722 tickets contain "Ticket closed" or "Transcript saving" in the payload. <strong>status</strong> (resolved | unresolved | escalated | unclear) and <strong>relied_on</strong> (logs | mod_updates | staff_action | other) require Mode A. One ticket mentions escalation in text.</p>
<p class="narrative"><strong>Narrative:</strong> All transcripts represent closed/saved tickets; resolution outcome and what resolution relied on need per-ticket extraction.</p>
<hr />
<h2>12. Intake gaps</h2>
<p>Per-ticket intake_gaps (account_contact, issue_type, reproduction, environment, attachments, priority, rules), each rated complete | partial | missing, require Mode A. No aggregate table from the batch parser.</p>
<p class="recommendation"><strong>Recommendation:</strong> After Mode A, report % complete / partial / missing per dimension to target form and template improvements.</p>
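<p>A possible shape for that report is sketched below. It assumes each Mode A record carries an <code>intake_gaps</code> mapping from dimension to rating, and it treats an absent dimension as <code>missing</code>, which is an assumption rather than documented behaviour.</p>

```python
from collections import Counter

DIMENSIONS = ["account_contact", "issue_type", "reproduction",
              "environment", "attachments", "priority", "rules"]
LEVELS = ["complete", "partial", "missing"]


def intake_gap_percentages(records):
    """Return {dimension: {level: percent}} across all records."""
    result = {}
    for dim in DIMENSIONS:
        # Assumption: a dimension absent from a record counts as "missing".
        counts = Counter(rec.get("intake_gaps", {}).get(dim, "missing") for rec in records)
        total = sum(counts[level] for level in LEVELS)
        result[dim] = {
            level: round(100 * counts[level] / total, 1) if total else 0.0
            for level in LEVELS
        }
    return result


# Illustrative records, not real ticket data.
records = [
    {"intake_gaps": {"account_contact": "complete"}},
    {"intake_gaps": {"account_contact": "partial"}},
]
gaps = intake_gap_percentages(records)
print(gaps["account_contact"])
```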
<hr />
<h2>13. Recurring analytics (Broccolini support section)</h2>
<p>From the batch parser we have:</p>
<ul>
<li><strong>Tickets per game_detected (heuristic):</strong> see §2.</li>
<li><strong>Claimed channel share:</strong> 671/722 (93%).</li>
<li><strong>Staff involved count per ticket:</strong> see §6.</li>
</ul>
<p><strong>Require Mode A or claim parsing:</strong></p>
<ul>
<li>Tickets claimed per member (from claim/unclaim messages or channel name).</li>
<li>First response time, reopens, escalations.</li>
<li>Tag distribution, repeat customers, sentiment toward staff.</li>
<li>Wiki-linked outcomes per member (§9.2).</li>
</ul>
<hr />
<h2>14. How to reproduce and extend</h2>
<ol>
<li><strong>Run batch parser (this report's source):</strong>
<pre>python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive2"</pre>
For a subfolder or Drive:
<pre>python3 scripts/batch_transcript_analytics.py "Discord Ticket Transcripts/Drive"</pre>
</li>
<li><strong>Full Mode C tables:</strong> Run Mode A extraction on each transcript (or a sample), collect JSON, then aggregate by issue categories, tags, game_or_server, wiki_solved_issue, intake_gaps, frequency/impact, resolution status, and email analytics. Use Part 1 Analysis and <code>docs/TICKET-ANALYTICS-SCHEMA-PROMPTING.md</code> as the schema source of truth.</li>
</ol>
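<p>The collect-and-aggregate step in item 2 might be sketched like this. It assumes one JSON file per ticket in a single folder, which is a hypothetical layout; field names such as <code>issue_types</code> come from this report, and the schema document remains the source of truth.</p>

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_records(folder):
    """Load every *.json file in the folder as one Mode A record."""
    records = []
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records


def count_field(records, field):
    """Count values of a scalar or list-valued field; absent fields count as 'unclear'."""
    counts = Counter()
    for rec in records:
        value = rec.get(field, "unclear")
        counts.update(value if isinstance(value, list) else [value])
    return counts


# Demo with two throwaway files standing in for real Mode A output.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "t1.json").write_text(json.dumps({"issue_types": ["Billing"]}), encoding="utf-8")
    Path(tmp, "t2.json").write_text(
        json.dumps({"issue_types": ["Billing", "Connectivity"]}), encoding="utf-8")
    demo_counts = count_field(load_records(tmp), "issue_types")

print(demo_counts)
```

<p>Nested fields such as <code>intake_gaps</code> are mappings rather than scalars or lists, so they need their own aggregation instead of <code>count_field</code>.</p>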
</div>
</body>
</html>