Files

indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).

2026-06-14 12:48:47 +00:00

23 KiB

Raw Blame History

Security Analysis: BooControl P1 Implementation

Scope

Files analyzed:

apps/control/src/index.ts — Fastify host service, SSE event handlers, perf poller, retention job
apps/control/src/db.ts — Database connection, schema application
apps/control/src/config.ts — Zod-validated env config
apps/control/src/schema.sql — DDL for fleet cockpit tables
apps/control/src/routes/ws.ts — WebSocket endpoint serving fleet state
apps/control/src/services/fleet-connector.ts — SSE client with reconnect logic
apps/control/src/services/fleet-state.ts — In-memory fleet state types
apps/control/src/services/retention.ts — Rollup and prune jobs
apps/control/src/services/host-access.ts — No-op host access seam (P8 placeholder)
apps/server/src/routes/control-proxy.ts — HTTP/WS reverse proxy in BooChat
apps/server/src/routes/coder-proxy.ts — Reference proxy (keep-in-sync comparison)
apps/server/src/index.ts — Server bootstrap (proxy registration)
apps/web/src/hooks/useControlStream.tsx — Client WS hook + React context
apps/web/src/components/control/ — All UI components (FleetTab, ActivityTab, HostCard, PerfChart, TtlRing, VramGauge, buildEChartsTheme)

Dependency manifests reviewed:

apps/control/package.json, apps/server/package.json, root package.json, docker-compose.yml

No branch specified; analysis performed on working tree.

Summary

The BooControl P1 implementation has a critical functional bug in the SSE fleet connector that renders the entire fleet monitoring system non-functional — no events are ever parsed or persisted. Beyond that, the SQL injection concern raised about ::jsonb patterns is a false positive (the postgres tagged-template library parameterizes these correctly), but real vulnerabilities exist in SSRF via unvalidated ssh_host, missing WebSocket origin validation on the proxy, unbounded in-memory state growth, and response header forwarding without filtering.

Severity	Count
Critical	1
High	2
Medium	3

Full analysis written to: /home/samkintop/opt/boocode/security-analysis.md

Findings

OWASP A01 - Broken Access Control

SEC-001: WebSocket Proxy Has No Origin Validation

OWASP: A01 - Broken Access Control
Location: apps/server/src/routes/control-proxy.ts:20

Evidence:

app.get('/api/control/ws', { websocket: true }, (clientSocket, _req) => {
  const target = boocontrolWsUrl(boocontrolOrigin, '/api/ws/control');
  const upstream = new WebSocket(target);

EXPLOIT: A malicious page on any domain can open a WebSocket to /api/control/ws on the BooChat origin. WebSocket connections are not subject to same-origin policy. If the user has an active Authelia session, the browser will include session cookies automatically. The malicious page receives fleet state snapshots (host IPs, model names, liveness status, GPU metrics) and can relay messages between client and upstream. The @fastify/websocket plugin supports an origin option that is not used. The sibling coder-proxy.ts has the same pattern.
Severity: High

Fix sketch:

// In control-proxy.ts, add origin validation:
app.get('/api/control/ws', { websocket: true }, (clientSocket, req) => {
  const origin = req.headers.origin;
  const allowed = process.env.ALLOWED_ORIGIN ?? 'https://code.indifferentketchup.com';
  if (origin && origin !== allowed) {
    clientSocket.close(1008, 'origin not allowed');
    return;
  }
  // ... rest of proxy
});

SEC-002: Control Service Has No Application-Layer Authentication

OWASP: A01 - Broken Access Control
Location: apps/control/src/index.ts:197-306 (entire server); apps/control/src/config.ts:6 (HOST defaults to 100.114.205.53)

Evidence:

// No auth middleware registered. No CORS. No origin checks.
app.get('/api/health', async (_req: unknown, reply) => { ... });
registerControlWebSocket(app, () => fleet);
await app.listen({ port: config.PORT, host: config.HOST });

EXPLOIT: The control service binds to the Tailscale IP with zero authentication. Any device on the Tailscale network can connect to port 9503 and read fleet state, GPU metrics, and model activity. If the Tailscale network is shared or a device is compromised, the attacker gains full read access to the inference fleet dashboard. The service has no CORS headers, no auth middleware, and no request validation. This is architecturally acceptable only if the Tailscale network is single-user and the service is never exposed beyond it — a fragile assumption.
Severity: Medium

Fix sketch: Add a shared-secret bearer token or mTLS between BooChat proxy and BooControl, so the control service rejects requests not originating from the authenticated proxy.

OWASP A02 - Cryptographic Failures

A02 - Cryptographic Failures: No proven vulnerability found. Checked: No secrets, API keys, or credentials hardcoded in the control service code. DATABASE_URL comes from env vars validated by Zod. config.ts:7 reads DATABASE_URL from process.env. No sensitive data appears in log output (the onnotice: () => {} in db.ts:20 suppresses PostgreSQL notices).

OWASP A03 - Injection

SEC-003: JSONB Interpolation Uses String Interpolation Instead of sql.json()

OWASP: A03 - Injection
Location: apps/control/src/index.ts:68, apps/control/src/index.ts:83, apps/control/src/index.ts:137, apps/control/src/index.ts:184

Evidence:

// Line 68: JSON.stringify → tagged template parameter (SAFE but fragile)
${JSON.stringify({ ttl })}::jsonb

// Line 83: captureTrimmed quoted inside tagged template (SAFE but looks dangerous)
${captureTrimmed ? sql`'${captureTrimmed}'::jsonb` : sql`NULL::jsonb`}

// Line 137: JSON.stringify → tagged template parameter
${JSON.stringify({ oldestReconcile: oldestReconcileTs, newestPersisted: newestRow.ts })}::jsonb

// Line 184: JSON.stringify → tagged template parameter
${JSON.stringify(sample.gpu)}::jsonb, ${JSON.stringify(sample.sys)}::jsonb

EXPLOIT (false positive confirmed): After tracing the postgres library's tagged-template behavior, all four patterns are safe. The postgres library parameterizes each ${} interpolation — JSON.stringify(...) produces a JSON string that becomes a bound parameter, and ::jsonb is raw SQL text appended after the parameter placeholder. Line 83's nested sql`'${captureTrimmed}'::jsonb` is also safe: the inner tagged template parameterizes captureTrimmed and the outer template includes the resulting SQL fragment. However, the line 83 pattern is a maintenance hazard — it looks like SQL injection and would trip every static analysis tool. The project's own CLAUDE.md documents this exact anti-pattern: "use sql.json(value as never) — NOT ${JSON.stringify(value)}::jsonb which double-serializes."
Severity: Medium (maintenance/correctness, not exploitable injection)

Fix sketch (all four locations):

// Before (fragile, trips linters):
${JSON.stringify({ ttl })}::jsonb
${captureTrimmed ? sql`'${captureTrimmed}'::jsonb` : sql`NULL::jsonb`}

// After (idiomatic, defense-in-depth):
${sql.json({ ttl })}
${captureTrimmed ? sql.json(JSON.parse(captureTrimmed)) : sql`NULL::jsonb`}
// Or for raw JSON strings:
${sql.json(JSON.parse(captureTrimmed))}

OWASP A04 - Insecure Design

SEC-004: Unbounded In-Memory Fleet State — Maps Never Pruned

OWASP: A04 - Insecure Design
Location: apps/control/src/index.ts:14-31 (createFleetState, ensureHostState); apps/control/src/services/fleet-state.ts:7-17

Evidence:

function createFleetState(): FleetState {
  return { hosts: new Map() };
}

function ensureHostState(fleet: FleetState, providerId: string): HostState {
  let state = fleet.hosts.get(providerId);
  if (!state) {
    state = { providerId, liveness: 'down', lastSeenAt: null, seq: 0, models: new Map() };
    fleet.hosts.set(providerId, state);
  }
  return state;
}

The fleet.hosts Map and each host's models Map grow without bound. handleLlamaSweepEvent (line 58) sets models: state.models.set(model, ...). There is no eviction, TTL, or size cap. The database retention job (retention.ts) prunes old database rows but never touches in-memory state. Over time, if models are renamed or rotated, the Maps accumulate stale entries.

EXPLOIT: An attacker who can send crafted SSE events (or a misconfigured llama-swap instance) with rapidly changing model names can grow the models Map without bound, eventually exhausting the control service's heap. With the default 5-second poll interval and no cap, this is a slow but real memory leak under normal operation.
Severity: Medium

Fix sketch:

// Cap the models map per host (e.g., 100 entries, LRU eviction):
const MAX_MODELS_PER_HOST = 100;
if (state.models.size >= MAX_MODELS_PER_HOST && !state.models.has(model)) {
  const oldest = state.models.keys().next().value;
  if (oldest) state.models.delete(oldest);
}
state.models.set(model, { ... });

OWASP A05 - Security Misconfiguration

SEC-005: HTTP Proxy Forwards All Upstream Response Headers Without Filtering

OWASP: A05 - Security Misconfiguration
Location: apps/server/src/routes/control-proxy.ts:78-81
Evidence:
```
reply.code(res.status);
for (const [key, value] of res.headers) {
  if (key === 'transfer-encoding') continue;
  reply.header(key, value);
}
```
Every response header from the BooControl upstream (except transfer-encoding) is forwarded verbatim to the client. If the upstream sets Set-Cookie, X-Frame-Options, Content-Security-Policy, or Content-Type: text/html with attacker-influenced body, those reach the browser unfiltered. The same pattern exists in coder-proxy.ts:83-86 (keep-in-sync clone).
EXPLOIT: Currently low-impact because BooControl's only HTTP endpoint (/api/health) returns application/json. But the pattern is dangerous by default — if any future BooControl endpoint returns HTML or sets cookies, the proxy will forward them. A compromised BooControl instance could inject arbitrary response headers into the BooChat origin.
Severity: Medium

Fix sketch:

const HOP_BY_HOP = new Set(['transfer-encoding', 'connection', 'keep-alive', 'upgrade']);
const BLOCKED_HEADERS = new Set(['set-cookie', 'content-security-policy', 'x-frame-options']);
for (const [key, value] of res.headers) {
  if (HOP_BY_HOP.has(key) || BLOCKED_HEADERS.has(key)) continue;
  reply.header(key, value);
}

OWASP A06 - Vulnerable and Outdated Components

A06 - Vulnerable and Outdated Components: No proven vulnerability found. Checked: apps/control/package.json dependencies — fastify ^4.28.1, @fastify/websocket ^10.0.1, postgres ^3.4.4, ws ^8.18.0, zod ^3.23.8. All are current major versions with no known CVEs at these version ranges. The postgres library (porsager/postgres) at ^3.4.4 is actively maintained.

OWASP A07 - Identification and Authentication Failures

A07 - Identification and Authentication Failures: No proven vulnerability beyond what's covered in SEC-001 (WS origin) and SEC-002 (no auth middleware). The architecture delegates authentication to Authelia at the reverse proxy, which is a valid pattern for single-user deployments. No hardcoded credentials, no bypass mechanisms, no session fixation.

OWASP A08 - Software and Data Integrity Failures

SEC-006: Critical SSE Parsing Bug — Fleet Connector Never Processes Events

OWASP: A08 - Software and Data Integrity Failures (data flow integrity)
Location: apps/control/src/services/fleet-connector.ts:158

Evidence:

const trimmed = line.trim();
if (!trimmed || trimmed.startsWith('data:')) continue;  // ← BUG: skips ALL data lines

const dataMatch = trimmed.match(/^data:\s*(.+)$/);       // ← unreachable for data lines
if (!dataMatch) continue;

Line 158 skips every line starting with data: — which is every SSE data line. The regex on line 160 can therefore never match. The handleLlamaSweepEvent callback is never invoked. No model events, metrics, or log data are ever parsed or persisted.

EXPLOIT: This is not an attacker-exploitable vulnerability, but it renders the entire fleet monitoring system non-functional. The control dashboard shows no hosts, no activity, no perf data. The retention job runs but operates on an empty table. The perf poller (line 164-192) does work independently (it uses HTTP, not SSE), so perf samples are ingested — but the primary event stream is dead.
Severity: Critical (functional — the core feature is broken)

Fix sketch:

// Line 158: remove the incorrect `startsWith('data:')` guard
// The intent was probably to skip lines that are JUST "data:" (SSE end-of-event marker)
// but the current code skips all data lines including "data: {...}"
- if (!trimmed || trimmed.startsWith('data:')) continue;
+ if (!trimmed) continue;
+ if (trimmed === 'data:') continue; // SSE end-of-event marker (empty data)

// Also fix line 167: event type extraction is wrong for SSE format.
// In SSE, event type comes from a preceding "event:" line, not from the data line.
// The current code extracts "data" as the event type for every line.
// This needs a proper SSE parser that tracks the current event type across lines.

OWASP A09 - Security Logging and Monitoring Failures

A09 - Security Logging and Monitoring Failures: No proven vulnerability found. Checked: The control service logs reconnect attempts, SSE parse errors, and upstream proxy errors at appropriate levels. No sensitive data (passwords, tokens, PII) appears in log output. The onnotice: () => {} in db.ts:20 suppresses PostgreSQL notices (which can contain query text). However, note that the fleet connector's critical parsing bug (SEC-006) means no fleet events are logged, which is a monitoring blind spot.

OWASP A10 - Server-Side Request Forgery (SSRF)

SEC-007: SSRF via Unvalidated ssh_host in URL Construction

OWASP: A10 - Server-Side Request Forgery
Location: apps/control/src/index.ts:247-248 and apps/control/src/index.ts:269-270
Evidence:
```
// Line 247-248 (fleet connector startup):
const sshHost = host.ssh_host;
if (!sshHost) continue;
const baseUrl = `http://${sshHost}:8401`;

// Line 269-270 (perf poller):
const sshHost = host.ssh_host;
if (!sshHost) continue;
const baseUrl = `http://${sshHost}:8401`;
```
The ssh_host value from the control_hosts database table is interpolated directly into a URL with no validation. This URL is passed to fetch() in fleet-connector.ts:136 and index.ts:176. If an attacker can write to the control_hosts table (via SQL injection elsewhere, compromised DB credentials, or a future admin endpoint), they can set ssh_host to:
- 169.254.169.254 → AWS/GCP metadata service (credential theft)
- localhost:5432 → PostgreSQL (connection probing)
- internal-service.corp → internal network scanning
- evil.com → data exfiltration via DNS
EXPLOIT: Requires write access to control_hosts. Currently no exposed write path exists in the control service (no POST/PUT/PATCH endpoints). The seed data in schema.sql:19-23 sets known hosts. But the table schema (ssh_host TEXT) accepts any string, and the code trusts it completely. This is a latent SSRF — safe today, dangerous the moment an admin API or migration writes to this table.
Severity: High (latent — no current write path, but no validation either)

Fix sketch:

// Validate ssh_host is a bare hostname or IP before constructing URLs:
function validateSshHost(host: string): string {
  // Allow hostnames and IPv4/IPv6, reject URLs and special addresses
  if (/^https?:\/\//.test(host)) throw new Error(`ssh_host must not contain protocol: ${host}`);
  if (/[@#?]/.test(host)) throw new Error(`ssh_host contains invalid characters: ${host}`);
  // Block link-local and metadata addresses
  if (/^(169\.254\.|0\.|127\.|::1|fe80:)/.test(host)) throw new Error(`ssh_host is a blocked address: ${host}`);
  return host;
}

const sshHost = validateSshHost(host.ssh_host);
const baseUrl = `http://${sshHost}:8401`;

Attack-Angle Protocol Results

Protocol 1: Input-to-Sink Tracing

Input Source	Sink	Path	Verdict
SSE event data (`fleet-connector.ts:166`)	SQL INSERT (`index.ts:66-70`)	`JSON.parse(dataStr)` → `onEvent` → `handleLlamaSweepEvent` → `sql\`INSERT...``	Dead path — line 158 skips all data lines (SEC-006)
Perf poller HTTP response (`index.ts:178`)	SQL INSERT (`index.ts:182-186`)	`res.json()` → `sample.gpu`/`sample.sys` → `JSON.stringify()` → `sql\`...``	Safe — JSON.stringify produces valid JSON, parameterized via tagged template
`ssh_host` from DB (`index.ts:247`)	`fetch()` (`fleet-connector.ts:136`)	SQL SELECT → string interpolation → URL construction → `fetch(url)`	Vulnerable — no validation (SEC-007)
Client WS message (`control-proxy.ts:46`)	Upstream WS (`control-proxy.ts:48`)	`clientSocket.on('message')` → `upstream.send(data)`	Relay only — no processing, no injection vector
HTTP request path (`control-proxy.ts:65`)	`fetch()` (`control-proxy.ts:72`)	`req.url.replace(...)` → string concat → `fetch(targetUrl)`	Safe — Fastify normalizes paths; `req.url` is the matched route
WS frame data from upstream (`ws.ts:31`)	Client socket (`ws.ts:32`)	`JSON.stringify(delta)` → `socket.send()`	Safe — server-generated JSON

Protocol 2: Auth/Authz Decision Audit

Decision Point	Location	Bypass?
BooChat → BooControl proxy	`control-proxy.ts:19-88`	No auth forwarded (no headers to forward). Authelia is the sole gate.
BooControl WS endpoint	`routes/ws.ts:19`	No auth check. Any TCP connection to port 9503 gets fleet state.
BooControl health endpoint	`index.ts:227`	No auth. Intentional (health checks).
BooChat WS proxy origin	`control-proxy.ts:20`	No origin validation. SEC-001.
HTTP proxy catch-all	`control-proxy.ts:64`	Forwards `authorization` header if present (line 69). No additional validation.

Protocol 3: Secret and PII Pattern Search

Pattern	Files Searched	Findings
`password`, `secret`, `api_key`, `token`, `credential`	All control source files	None found in code. `DATABASE_URL` in config includes password but is env-var sourced.
`ssn`, `credit_card`, `private_key`	All files	None found.
`BEGIN RSA`, `Bearer` , `Authorization:`	All files	None found in control service. `control-proxy.ts:69` forwards `authorization` header (expected proxy behavior).
PII in fleet state	`fleet-state.ts`, `ws.ts`	`providerId`, model names, GPU metrics — operational data, not PII.

Protocol 4: Dependency Vulnerability Check

Dependency	Version	Known CVEs
`fastify`	^4.28.1	None at this version
`@fastify/websocket`	^10.0.1	None at this version
`postgres` (porsager)	^3.4.4	None at this version
`ws`	^8.18.0	None at this version
`zod`	^3.23.8	None at this version

No known-vulnerable dependency versions detected.

Security Improvement Summary

What Was Found

The BooControl P1 implementation contains one critical functional defect (SEC-006: the SSE parser skips all data lines, rendering fleet monitoring completely non-functional), two high-severity architectural issues (SEC-001: no WebSocket origin validation on the proxy; SEC-007: SSRF via unvalidated ssh_host database values flowing into fetch()), and three medium-severity findings (SEC-002: no application-layer auth; SEC-003: JSONB interpolation uses fragile ::jsonb string patterns instead of sql.json(); SEC-004: unbounded in-memory Maps; SEC-005: unfiltered response header forwarding). The SQL injection concern about ::jsonb patterns was investigated thoroughly and confirmed as a false positive — the postgres tagged-template library correctly parameterizes all four interpolation sites. However, the patterns are a maintenance hazard and violate the project's own documented convention.

How to Improve

Fix the SSE parser (SEC-006): Remove the trimmed.startsWith('data:') guard on line 158 of fleet-connector.ts. Add a proper SSE parser that tracks event: type lines across the stream and pairs them with their data: payloads. The current code always extracts "data" as the event type.
Add WebSocket origin validation (SEC-001): Use @fastify/websocket's origin option or check req.headers.origin in the WS handler. Apply to both control-proxy.ts and coder-proxy.ts (keep-in-sync).
Validate ssh_host before URL construction (SEC-007): Add a validateSshHost() function that rejects URLs, protocol prefixes, special characters, and link-local/metadata IP addresses. Call it before constructing baseUrl in both the fleet connector startup and perf poller.
Replace ::jsonb string patterns with sql.json() (SEC-003): All four interpolation sites in index.ts should use the sql.json() helper documented in the project's own CLAUDE.md. This eliminates linter false positives and follows the established codebase convention.
Cap in-memory fleet state (SEC-004): Add LRU eviction to the models Map per host (e.g., 100 entries max). Consider periodic pruning of hosts that have been down for more than the retention window.
Filter forwarded response headers (SEC-005): In control-proxy.ts, skip hop-by-hop headers AND security-sensitive headers (set-cookie, content-security-policy) when forwarding upstream responses. Apply the same filter to coder-proxy.ts.

How to Prevent This Going Forward

Tagged-template lint rule: Add an ESLint custom rule or pre-commit check that flags ${JSON.stringify(...)}::jsonb patterns and suggests sql.json(). The project's CLAUDE.md already documents this convention — enforcement should be automated.
SSE parser library: Replace the hand-rolled SSE parser with a tested library (e.g., eventsource-parser or @microsoft/fetch-event-source). Hand-rolled SSE parsing has a long history of off-by-one and state-tracking bugs — the line 158 bug is a textbook example.
Proxy header allowlist: Establish a pattern where proxy routes maintain an explicit allowlist of response headers to forward, rather than forwarding everything except a blocklist. This is defense-in-depth against compromised upstream services.
Network boundary documentation: Document the trust boundary between BooChat (port 9500) and BooControl (port 9503). If the control service will ever be exposed beyond Tailscale, add bearer-token auth before that happens. The current "no auth, rely on network isolation" pattern is acceptable for single-user Tailscale but should be explicitly documented as a known limitation.
Integration test for SSE parsing: Add a test that feeds a realistic SSE stream (with event: and data: lines) to runFleetConnector and asserts that onEvent is called with the correct event types. This would have caught SEC-006 immediately.

23 KiB Raw Blame History

Security Analysis: BooControl P1 Implementation

Scope

Summary

Findings

OWASP A01 - Broken Access Control

OWASP A02 - Cryptographic Failures

OWASP A03 - Injection

OWASP A04 - Insecure Design

OWASP A05 - Security Misconfiguration

OWASP A06 - Vulnerable and Outdated Components

OWASP A07 - Identification and Authentication Failures

OWASP A08 - Software and Data Integrity Failures

OWASP A09 - Security Logging and Monitoring Failures

OWASP A10 - Server-Side Request Forgery (SSRF)

Attack-Angle Protocol Results

Protocol 1: Input-to-Sink Tracing

Protocol 2: Auth/Authz Decision Audit

Protocol 3: Secret and PII Pattern Search

Protocol 4: Dependency Vulnerability Check

Security Improvement Summary

What Was Found

How to Improve

How to Prevent This Going Forward

23 KiB

Raw Blame History