boocode/docs/plans/multi-provider-local-models/artifacts/.discovery-notes.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00

Discovery Notes: Multi-Provider Local Models

Single source of truth for implementation context. Read this first before touching the plan or code.

Tech stack

  • Monorepo with pnpm workspaces.
  • apps/server: Fastify + Postgres, native inference, local-model routing, BooChat APIs.
  • apps/web: React + Vite SPA, shared chat and coder UI.
  • apps/coder: host-side BooCoder service, provider probing, native and external-agent dispatch, Arena, MCP.
  • packages/contracts: shared cross-app schemas and types, built before consumers.
  • TypeScript strict mode. Server and coder use NodeNext and .js import suffixes.
  • Tests: pnpm -C apps/server test, pnpm -C apps/coder test. No dedicated web test harness.

ADRs found

  • docs/adr/0001-arena-two-lane-scheduling.md Summary: local llama-backed contestants run serially in one lane, cloud contestants run in parallel in another lane; multi-provider work must preserve this lane model.
  • docs/adr/0002-arena-dedicated-tables-not-flow-runner.md Summary: Arena owns its own storage and runtime shape; reuse dispatcher machinery but do not fold Arena back into flow-runner abstractions.
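The lane model from ADR 0001 can be sketched roughly as follows. This is a minimal illustration, not the Arena scheduler itself: `Contestant`, `Lane`, and `runTwoLanes` are hypothetical names, and the real contestant shape is not shown in these notes. The point is that local contestants stay strictly serial even when several local providers exist, while cloud contestants fan out concurrently.

```typescript
// Hypothetical types; the real Arena contestant shape is not shown in these notes.
type Lane = "local" | "cloud";

interface Contestant {
  id: string;
  lane: Lane;
}

// Runs local contestants one at a time and cloud contestants concurrently.
// Both lanes start together, so the serial local lane does not hold up cloud calls.
async function runTwoLanes<T>(
  contestants: Contestant[],
  call: (c: Contestant) => Promise<T>,
): Promise<Map<string, T>> {
  const results = new Map<string, T>();
  const local = contestants.filter((c) => c.lane === "local");
  const cloud = contestants.filter((c) => c.lane === "cloud");

  // Local lane: await each call before starting the next one.
  const localLane = (async () => {
    for (const c of local) {
      results.set(c.id, await call(c));
    }
  })();

  // Cloud lane: start every call immediately and wait for all of them.
  const cloudLane = Promise.all(
    cloud.map(async (c) => {
      results.set(c.id, await call(c));
    }),
  );

  await Promise.all([localLane, cloudLane]);
  return results;
}
```

Under this sketch, adding a second local host only lengthens the serial lane; it never raises local concurrency above one.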

Coding standards found

  • docs/coding-standards/cross-app-contract-parity.md Summary: when a cross-app contract changes, update the canonical package source plus app-side secondary representations in the same batch; missing one side silently drops behavior at runtime.
  • CLAUDE.md Summary: packages/contracts is the single source for provider-snapshot and message-metadata contracts, deploy-by-surface rules matter, and contract changes must respect app-local secondary unions and renderers where they still exist.
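One common way to turn the "missing one side silently drops behavior" failure into a compile error is an exhaustive switch over the shared union. The sketch below is an assumption about how an app-side consumer could guard itself, not something the coding standard prescribes; `ProviderKind`, `assertNever`, and `laneFor` are hypothetical names.

```typescript
// Hypothetical union standing in for a type exported by packages/contracts.
type ProviderKind = "local" | "cloud";

// Only callable when every variant has been handled; otherwise `x` is not `never`
// and the call site fails to typecheck.
function assertNever(x: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(x)}`);
}

// App-side secondary representation of the contract.
function laneFor(kind: ProviderKind): string {
  switch (kind) {
    case "local":
      return "serial";
    case "cloud":
      return "parallel";
    default:
      // Adding a new member to ProviderKind makes this line a type error
      // until the new case is handled here.
      return assertNever(kind);
  }
}
```

With this pattern, a new variant added to the canonical package surfaces in each consumer's typecheck instead of falling through at runtime.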

Relevant architecture notes

  • apps/server/CLAUDE.md Summary: services/inference/provider.ts is the current llama-swap provider seam; model-context.ts and compaction.ts currently assume one upstream.
  • apps/coder/CLAUDE.md Summary: provider snapshot and opencode integration are the main local-model seams; llama-swap/* is currently the local namespace assumption.
  • apps/web/CLAUDE.md Summary: ModelPicker and AgentComposerBar are separate UI surfaces with different constraints; any provider snapshot loading-state change can make providers disappear from the coder UI.

Code touch points

Shared contracts and config patterns

  • packages/contracts/src/provider-config.ts Existing coder ACP provider config schema; useful precedent, but not the right place to overload with local host inventory semantics.
  • apps/coder/src/services/provider-config-registry.ts Existing pattern for schema-in-package plus app-local load/build cache.
  • packages/contracts/src/provider-snapshot.ts Shared snapshot contract used by coder and web.

Server: catalog, routing, and downstream local-model consumers

  • apps/server/src/config.ts Current env config includes LLAMA_SWAP_URL, LLAMA_SIDECAR_URL, and DEFAULT_MODEL; multi-provider config must enter here.
  • apps/server/src/routes/models.ts Current /api/models route fetches one llama-swap and optionally DeepSeek.
  • apps/server/src/services/inference/provider.ts Current route selection and AI SDK provider seam; central place to remove heuristic provider detection.
  • apps/server/src/services/model-context.ts Current context cache keys by bare model string and assumes one LLAMA_SWAP_URL.
  • apps/server/src/services/compaction.ts Uses resolveModelEndpoint() today, but still contains one-provider assumptions and a DeepSeek prefix special case.
  • apps/server/src/services/task-model.ts Returns one resolved {url, model} pair today.
  • apps/server/src/index.ts Calls configureModelContext({ llamaSwapUrl }); this wiring must change when context lookup becomes provider-aware.
  • apps/server/src/routes/settings.ts Existing shared settings persistence surface; right place for favorite_models.
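To illustrate why keying the context cache by bare model string breaks once more than one upstream exists, here is a minimal sketch of a provider-aware key. Everything in it is hypothetical (`ModelRef`, `providerId`, `cacheKey`, and the helper functions); the notes do not specify the final key shape.

```typescript
// Hypothetical reference type: one model as served by one provider.
interface ModelRef {
  providerId: string; // assumed identifier for a local host or upstream
  model: string;
}

// Context length in tokens, keyed by provider and model together.
const contextCache = new Map<string, number>();

// JSON-encode the pair so ids containing "/" or ":" cannot collide.
function cacheKey(ref: ModelRef): string {
  return JSON.stringify([ref.providerId, ref.model]);
}

function setContextLength(ref: ModelRef, tokens: number): void {
  contextCache.set(cacheKey(ref), tokens);
}

function getContextLength(ref: ModelRef): number | undefined {
  return contextCache.get(cacheKey(ref));
}
```

With a composite key, the same model name served by two hosts with different context windows resolves to two separate entries rather than one overwriting the other.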

Web: BooChat and coder selection UI

  • apps/web/src/components/ModelPicker.tsx Shared BooChat model picker component; currently assumes a flat /api/models list.
  • apps/web/src/components/AgentComposerBar.tsx Native BooCoder provider/mode/model picker surface.
  • apps/web/src/lib/model-label.ts Display-only model prettifier used by both pickers.
  • apps/web/src/api/client.ts models() currently expects ModelInfo[].
  • apps/web/src/api/types.ts Holds the web-side API contract for /api/models and other cross-app payloads.

Coder: native, snapshot, arena, and external-agent bridge

  • apps/coder/src/config.ts Current coder config still exposes LLAMA_SWAP_URL; multi-provider config must enter here too.
  • apps/coder/src/services/provider-snapshot.ts Current snapshot fetches one LLAMA_SWAP_URL, prefixes local models as llama-swap/*, and merges them into opencode.
  • apps/coder/src/services/dispatcher.ts Current native and external-agent dispatch logic still assumes local bare ids or llama-swap/* for local routing.
  • apps/coder/src/services/backends/opencode-server.ts parseModel() splits only once at /; this is good news because a stable outer provider namespace can carry an inner composite model id.
  • apps/coder/src/services/arena-model-call.ts Direct one-shot local model call against LLAMA_SWAP_URL.
  • apps/coder/src/services/arena-analyzer.ts Local-vs-cloud checks rely on one local model set and one upstream.
  • apps/coder/src/index.ts Builds the local-model set for Arena from one fetched llama-swap list.
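The split-once behavior noted for parseModel() can be illustrated with a small re-implementation. This is not the actual function from opencode-server.ts: `splitOnce` is a hypothetical name, the sample ids are made up, and the sketch assumes the split happens at the first `/`, which is what lets an outer namespace carry a composite inner id.

```typescript
// Illustrative only; not the real parseModel() from opencode-server.ts.
// Splits at the FIRST "/" so everything after it stays intact as the model id.
function splitOnce(id: string): { provider: string; model: string } | null {
  const i = id.indexOf("/");
  // Reject ids with no separator or with an empty provider or model half.
  if (i <= 0 || i === id.length - 1) return null;
  return { provider: id.slice(0, i), model: id.slice(i + 1) };
}

// Hypothetical composite id: outer namespace, then "<host>/<model>" as the inner id.
const parsed = splitOnce("llama-swap/host-a/some-model");
// parsed is { provider: "llama-swap", model: "host-a/some-model" }
```

Because the inner id survives untouched, a second, provider-aware parse of the model half remains possible downstream without changing the outer namespace.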

Recent activity and churn

High-churn files in the last 90 days:

  • apps/web/src/api/types.ts
  • apps/web/src/api/client.ts
  • apps/server/src/index.ts
  • apps/server/src/types/api.ts
  • apps/coder/src/services/dispatcher.ts
  • apps/coder/src/index.ts
  • apps/coder/src/services/provider-snapshot.ts
  • apps/web/src/components/AgentComposerBar.tsx
  • apps/server/src/services/compaction.ts

Implication: keep work units narrow and avoid combining unrelated refactors in these files.

Constraints and load-bearing facts

  • packages/contracts already owns provider-snapshot types; if the snapshot contract changes, rebuild the package before touching consumers.
  • apps/web has no dedicated test harness, so web verification will rely on typecheck plus smoke testing.
  • Arena's local lane semantics are intentional; multi-provider support must not collapse local models into parallel execution.
  • opencode local parity is not a small rename. The current host config and snapshot behavior collapse identity to one llama-swap namespace.

Gaps and unknowns

  • No shared local-provider config file or schema exists in-repo yet.
  • The /api/models shape change is not yet specified in app-local types; W2 must settle the contract before W4 starts.
  • The final opencode gateway path is not implemented anywhere yet; W7 is net-new code, not just adaptation.
  • No dedicated docs for “add a machine” exist yet; W8 must create them.