feat: DeepSeek API integration + Whale lift (hooks, tool repair, MCP permissions, token tracking)

DeepSeek API:
- @ai-sdk/deepseek provider replaces openai-compatible for deepseek-* models
- Token tracking: cache_hit/reasoning tokens flow API → DB → WS frames → UI
- Thinking effort levels (off/low/medium/high/xhigh/max) via AGENTS.md frontmatter
- V4 models: deepseek-v4-flash, deepseek-v4-pro
- Wired for both chat and coder panes

Whale lifts:
- Tool input repair (schema-based type coercion, markdown link unwrapping)
- Hooks system (6 lifecycle events, shell exec, JSON stdin/stdout contract)
- Per-MCP-server permissions (allow/ask/deny)
- Token tracking UI (cache N, think N in message stats line)
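
A rough sketch of what the tool input repair step could look like, assuming a flat schema that maps each parameter name to a primitive type. The names `ToolSchema`, `unwrapMarkdownLink`, and `repairToolInput` are hypothetical and are not taken from this commit; the real implementation may work against full JSON Schema.

```typescript
// Hypothetical schema shape: each parameter name maps to its expected primitive type.
type ParamType = 'string' | 'number' | 'boolean';
type ToolSchema = Record<string, ParamType>;

// Unwrap a value that is entirely a markdown link, e.g. "[label](target)" -> "target".
// Any other string is returned unchanged.
function unwrapMarkdownLink(value: string): string {
  const match = /^\[[^\]]*\]\(([^)\s]+)\)$/.exec(value.trim());
  return match ? match[1] ?? value : value;
}

// Coerce string-encoded values to the type the schema expects.
// Values that cannot be coerced safely are left as they were.
function repairToolInput(
  input: Record<string, unknown>,
  schema: ToolSchema,
): Record<string, unknown> {
  const repaired: Record<string, unknown> = { ...input };
  for (const [key, expected] of Object.entries(schema)) {
    const value = repaired[key];
    if (typeof value !== 'string') continue;
    if (expected === 'number' && value.trim() !== '') {
      const n = Number(value);
      if (Number.isFinite(n)) repaired[key] = n;
    } else if (expected === 'boolean') {
      const lower = value.trim().toLowerCase();
      if (lower === 'true') repaired[key] = true;
      else if (lower === 'false') repaired[key] = false;
    } else if (expected === 'string') {
      repaired[key] = unwrapMarkdownLink(value);
    }
  }
  return repaired;
}
```

Parameters that the schema does not mention pass through untouched, so a repair pass of this kind can only narrow a value toward its declared type.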

Infra:
- New DB columns: messages.cache_tokens, messages.reasoning_tokens
- New WS frame fields: cache_tokens, reasoning_tokens on message_complete
- Coder provider snapshot merges DeepSeek models alongside llama-swap
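
For illustration, the new `message_complete` fields could be turned into stats-line labels roughly as follows. Only `cache_tokens` and `reasoning_tokens` come from this commit; the `type` discriminant, the `MessageCompleteFrame` interface, and the `tokenStatParts` helper are assumptions made for this sketch.

```typescript
// Assumed frame shape; only cache_tokens and reasoning_tokens are named in the commit.
interface MessageCompleteFrame {
  type: 'message_complete'; // assumed discriminant field
  cache_tokens?: number | null;
  reasoning_tokens?: number | null;
}

// Build the optional "cache N" / "think N" labels, skipping absent or zero counts.
function tokenStatParts(frame: MessageCompleteFrame): string[] {
  const parts: string[] = [];
  const { cache_tokens: cache, reasoning_tokens: reasoning } = frame;
  if (typeof cache === 'number' && cache > 0) parts.push(`cache ${cache}`);
  if (typeof reasoning === 'number' && reasoning > 0) parts.push(`think ${reasoning}`);
  return parts;
}
```

Treating `null`, `undefined`, and `0` the same way keeps the stats line unchanged for providers that do not report these counters.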
2026-06-08 01:24:23 +00:00
parent c11e26090f
commit 203cfd2fa8
29 changed files with 916 additions and 42 deletions


@@ -156,9 +156,16 @@ function StatsLine({ message }: { message: Message }) {
: `${ctxUsed} ctx`
: null;
const cacheHit = message.cache_tokens;
const reasoning = message.reasoning_tokens;
const cachePart = typeof cacheHit === 'number' && cacheHit > 0 ? `cache ${cacheHit}` : null;
const reasoningPart = typeof reasoning === 'number' && reasoning > 0 ? `think ${reasoning}` : null;
const parts: string[] = [`${tokens} tokens`];
if (tps !== null) parts.push(`${tps.toFixed(1)} tok/s`);
if (ctxPart) parts.push(ctxPart);
if (cachePart) parts.push(cachePart);
if (reasoningPart) parts.push(reasoningPart);
return (
<div className="text-[10px] font-mono text-muted-foreground">