feat: DeepSeek API integration + Whale lift (hooks, tool repair, MCP permissions, token tracking)

DeepSeek API:
- @ai-sdk/deepseek provider replaces openai-compatible for deepseek-* models
- Token tracking: cache_hit/reasoning tokens flow API → DB → WS frames → UI
- Thinking effort levels (off/low/medium/high/xhigh/max) via AGENTS.md frontmatter
- V4 models: deepseek-v4-flash, deepseek-v4-pro
- Wired for both chat and coder panes

Whale lifts:
- Tool input repair (schema-based type coercion, markdown link unwrapping)
- Hooks system (6 lifecycle events, shell exec, JSON stdin/stdout contract)
- Per-MCP-server permissions (allow/ask/deny)
- Token tracking UI (cache N, think N in message stats line)

Infra:
- New DB columns: messages.cache_tokens, messages.reasoning_tokens
- New WS frame fields: cache_tokens, reasoning_tokens on message_complete
- Coder provider snapshot merges DeepSeek models alongside llama-swap
2026-06-08 01:24:23 +00:00
parent 31e5d9d4ab
commit c4079dd85c
29 changed files with 916 additions and 42 deletions
--- a/apps/server/src/services/agents.ts
+++ b/apps/server/src/services/agents.ts
@@ -106,6 +106,8 @@ interface ParsedFrontmatter {
  // allowed" — the model responds text-only.
  steps?: number;
  llama_extra_args?: string[];
+  // vDeepSeek: thinking effort for DeepSeek V4 models.
+  reasoning_effort?: string;
 }

 // P5: table-driven validation for the "soft-range" numeric frontmatter fields.
@@ -386,6 +388,7 @@ function parseAgentSection(section: RawSection): Omit<Agent, 'source'> {
    max_tool_calls: typeof fm.max_tool_calls === 'number' ? fm.max_tool_calls : null,
    steps: typeof fm.steps === 'number' ? fm.steps : null,
    llama_extra_args: Array.isArray(fm.llama_extra_args) ? fm.llama_extra_args : null,
+    reasoning_effort: typeof fm.reasoning_effort === 'string' ? (fm.reasoning_effort as Agent['reasoning_effort']) : null,
  };
 }
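
The hunk above accepts any string for `reasoning_effort` and casts it to `Agent['reasoning_effort']`. If stricter handling is wanted, one possible approach is to check the value against the levels listed in the commit message (off/low/medium/high/xhigh/max). The sketch below is an assumption rather than code from this commit: `REASONING_EFFORTS`, `ReasoningEffort`, and `parseReasoningEffort` are hypothetical names, and returning `null` for unknown values is a guess that mirrors how the neighbouring fields fall back.

```typescript
// Hypothetical sketch: the level list is taken from the commit message;
// the helper name and the null fallback are assumptions, not part of this commit.
const REASONING_EFFORTS = ['off', 'low', 'medium', 'high', 'xhigh', 'max'] as const;
type ReasoningEffort = (typeof REASONING_EFFORTS)[number];

// Returns a known effort level, or null when the frontmatter value is
// missing, not a string, or not one of the listed levels.
function parseReasoningEffort(value: unknown): ReasoningEffort | null {
  if (typeof value !== 'string') return null;
  const normalized = value.trim().toLowerCase();
  return (REASONING_EFFORTS as readonly string[]).includes(normalized)
    ? (normalized as ReasoningEffort)
    : null;
}
```

With a helper like this, the assignment in `parseAgentSection` could become `reasoning_effort: parseReasoningEffort(fm.reasoning_effort)`, so a typo such as `hihg` in AGENTS.md would resolve to `null` instead of being passed on as an unrecognised level.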