boocode/docs/plans/multi-provider-local-models/artifacts/.discovery-notes.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00

# Discovery Notes: Multi-Provider Local Models
Single source of truth for implementation context. Read this first before touching the plan or code.
## Tech stack
- Monorepo with pnpm workspaces.
- `apps/server`: Fastify + Postgres, native inference, local-model routing, BooChat APIs.
- `apps/web`: React + Vite SPA, shared chat and coder UI.
- `apps/coder`: host-side BooCoder service, provider probing, native and external-agent dispatch, Arena, MCP.
- `packages/contracts`: shared cross-app schemas and types, built before consumers.
- TypeScript strict mode. Server and coder use NodeNext and `.js` import suffixes.
- Tests: `pnpm -C apps/server test`, `pnpm -C apps/coder test`. No dedicated web test harness.
## ADRs found
- `docs/adr/0001-arena-two-lane-scheduling.md`
Summary: local llama-backed contestants run serially in one lane, cloud contestants run in parallel in another lane; multi-provider work must preserve this lane model.
- `docs/adr/0002-arena-dedicated-tables-not-flow-runner.md`
Summary: Arena owns its own storage and runtime shape; reuse dispatcher machinery but do not fold Arena back into flow-runner abstractions.
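The lane model from ADR 0001 can be sketched roughly as follows. This is a minimal illustration, not the Arena implementation: the `Contestant` shape, the `lane` field, and `runTwoLanes` are hypothetical names, and it assumes the two lanes proceed side by side while the local lane runs one contestant at a time.

```typescript
// Hypothetical contestant shape; the real Arena types are not shown in these notes.
type Lane = "local" | "cloud";

interface Contestant {
  id: string;
  lane: Lane;
}

type RunFn<T> = (c: Contestant) => Promise<T>;

// Run local contestants one at a time and cloud contestants concurrently.
// Assumption: the two lanes themselves run side by side.
async function runTwoLanes<T>(
  contestants: Contestant[],
  run: RunFn<T>,
): Promise<Map<string, T>> {
  const results = new Map<string, T>();
  const local = contestants.filter((c) => c.lane === "local");
  const cloud = contestants.filter((c) => c.lane === "cloud");

  // Local lane: strictly serial, one awaited call after another.
  const localLane = (async () => {
    for (const c of local) {
      results.set(c.id, await run(c));
    }
  })();

  // Cloud lane: every contestant starts immediately.
  const cloudLane = Promise.all(
    cloud.map(async (c) => {
      results.set(c.id, await run(c));
    }),
  );

  await Promise.all([localLane, cloudLane]);
  return results;
}
```

Under this sketch, adding a second local host would still place its models in the `local` lane, which is the property the ADR asks multi-provider work to preserve.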
## Coding standards found
- `docs/coding-standards/cross-app-contract-parity.md`
Summary: when a cross-app contract changes, update the canonical package source plus app-side secondary representations in the same batch; missing one side silently drops behavior at runtime.
- `CLAUDE.md`
Summary: `packages/contracts` is the single source for provider-snapshot and message-metadata contracts, deploy-by-surface rules matter, and contract changes must respect app-local secondary unions and renderers where they still exist.
## Relevant architecture notes
- `apps/server/CLAUDE.md`
Summary: `services/inference/provider.ts` is the current llama-swap provider seam; `model-context.ts` and `compaction.ts` currently assume one upstream.
- `apps/coder/CLAUDE.md`
Summary: provider snapshot and `opencode` integration are the main local-model seams; `llama-swap/*` is currently the local namespace assumption.
- `apps/web/CLAUDE.md`
Summary: `ModelPicker` and `AgentComposerBar` are separate UI surfaces with different constraints; any provider snapshot loading-state change can make providers disappear from the coder UI.
## Code touch points
### Shared contracts and config patterns
- `packages/contracts/src/provider-config.ts`
Existing coder ACP provider config schema; useful precedent, but not the right place to overload with local host inventory semantics.
- `apps/coder/src/services/provider-config-registry.ts`
Existing pattern for schema-in-package plus app-local load/build cache.
- `packages/contracts/src/provider-snapshot.ts`
Shared snapshot contract used by coder and web.
### Server: catalog, routing, and downstream local-model consumers
- `apps/server/src/config.ts`
Current env config includes `LLAMA_SWAP_URL`, `LLAMA_SIDECAR_URL`, and `DEFAULT_MODEL`; multi-provider config must enter here.
- `apps/server/src/routes/models.ts`
Current `/api/models` route fetches models from one llama-swap upstream and, optionally, DeepSeek.
- `apps/server/src/services/inference/provider.ts`
Current route selection and AI SDK provider seam; central place to remove heuristic provider detection.
- `apps/server/src/services/model-context.ts`
Current context cache keys by bare model string and assumes one `LLAMA_SWAP_URL`.
- `apps/server/src/services/compaction.ts`
Uses `resolveModelEndpoint()` today, but still contains one-provider assumptions and a DeepSeek prefix special case.
- `apps/server/src/services/task-model.ts`
Returns one resolved `{url, model}` pair today.
- `apps/server/src/index.ts`
Calls `configureModelContext({ llamaSwapUrl })`; this wiring must change when context lookup becomes provider-aware.
- `apps/server/src/routes/settings.ts`
Existing shared settings persistence surface; right place for `favorite_models`.
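One way to picture the provider-aware context lookup mentioned for `model-context.ts` is a cache keyed by provider and model together rather than by the bare model string. The sketch below is illustrative only: `ModelRef`, `contextCacheKey`, and `ContextSizeCache` are hypothetical names, not the module's actual API.

```typescript
// Hypothetical types; names are illustrative, not taken from model-context.ts.
interface ModelRef {
  providerId: string; // assumed: one id per configured local host or cloud provider
  model: string; // model id as that provider reports it
}

// JSON-encode the pair so ids that contain "/" or ":" cannot collide.
function contextCacheKey(ref: ModelRef): string {
  return JSON.stringify([ref.providerId, ref.model]);
}

class ContextSizeCache {
  private readonly sizes = new Map<string, number>();

  set(ref: ModelRef, contextTokens: number): void {
    this.sizes.set(contextCacheKey(ref), contextTokens);
  }

  get(ref: ModelRef): number | undefined {
    return this.sizes.get(contextCacheKey(ref));
  }
}
```

With a composite key, the same model name served by two hosts with different context sizes no longer overwrites one entry.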
### Web: BooChat and coder selection UI
- `apps/web/src/components/ModelPicker.tsx`
Shared BooChat model picker component; currently assumes a flat `/api/models` list.
- `apps/web/src/components/AgentComposerBar.tsx`
Native BooCoder provider/mode/model picker surface.
- `apps/web/src/lib/model-label.ts`
Display-only model prettifier used by both pickers.
- `apps/web/src/api/client.ts`
`models()` currently expects `ModelInfo[]`.
- `apps/web/src/api/types.ts`
Holds the web-side API contract for `/api/models` and other cross-app payloads.
### Coder: native, snapshot, arena, and external-agent bridge
- `apps/coder/src/config.ts`
Current coder config still exposes `LLAMA_SWAP_URL`; multi-provider config must enter here too.
- `apps/coder/src/services/provider-snapshot.ts`
Current snapshot fetches one `LLAMA_SWAP_URL`, prefixes local models as `llama-swap/*`, and merges them into `opencode`.
- `apps/coder/src/services/dispatcher.ts`
Current native and external-agent dispatch logic still assumes local bare ids or `llama-swap/*` for local routing.
- `apps/coder/src/services/backends/opencode-server.ts`
`parseModel()` splits only once, at the first `/`. This helps: a stable outer provider namespace can carry an inner composite model id that itself contains `/`.
- `apps/coder/src/services/arena-model-call.ts`
Direct one-shot local model call against `LLAMA_SWAP_URL`.
- `apps/coder/src/services/arena-analyzer.ts`
Local-vs-cloud checks rely on one local model set and one upstream.
- `apps/coder/src/index.ts`
Builds the local-model set for Arena from one fetched llama-swap list.
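The split-once behavior noted for `opencode-server.ts` is what lets a composite id survive parsing. The following is a stand-in written for illustration, not the actual `parseModel()` source: `splitOnce` and the example id `llama-swap/host-a/qwen-coder` are hypothetical, and returning `null` for malformed input is an assumption.

```typescript
// Illustrative stand-in for split-once parsing; not the actual parseModel() source.
interface ParsedModel {
  providerId: string;
  modelId: string;
}

function splitOnce(qualified: string): ParsedModel | null {
  const i = qualified.indexOf("/");
  // Assumption: reject ids with no namespace or with an empty model part.
  if (i <= 0 || i === qualified.length - 1) return null;
  return {
    providerId: qualified.slice(0, i),
    // Everything after the first "/" stays intact, including later slashes.
    modelId: qualified.slice(i + 1),
  };
}

const parsed = splitOnce("llama-swap/host-a/qwen-coder");
// parsed.providerId is "llama-swap"; parsed.modelId is "host-a/qwen-coder"
```

Because only the first separator is consumed, the inner part is free to encode a host or provider id alongside the model name.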
## Recent activity and churn
High-churn files in the last 90 days:
- `apps/web/src/api/types.ts`
- `apps/web/src/api/client.ts`
- `apps/server/src/index.ts`
- `apps/server/src/types/api.ts`
- `apps/coder/src/services/dispatcher.ts`
- `apps/coder/src/index.ts`
- `apps/coder/src/services/provider-snapshot.ts`
- `apps/web/src/components/AgentComposerBar.tsx`
- `apps/server/src/services/compaction.ts`
Implication: keep work units narrow and avoid combining unrelated refactors in these files.
## Constraints and load-bearing facts
- `packages/contracts` already owns provider-snapshot types; if the snapshot contract changes, rebuild the package before touching consumers.
- `apps/web` has no dedicated test harness, so web verification will rely on typecheck plus smoke testing.
- Arena's local-lane semantics are intentional; multi-provider support must not collapse local models into parallel execution.
- `opencode` local parity is not a small rename. The current host config and snapshot behavior collapse identity to one `llama-swap` namespace.
## Gaps and unknowns
- No existing shared local-provider config file or schema exists in-repo yet.
- `/api/models` shape change is not yet specified in app-local types; W2 must settle the contract before W4 starts.
- The final `opencode` gateway path is not implemented anywhere yet; W7 is net-new code, not just adaptation.
- No dedicated docs for “add a machine” exist yet; W8 must create them.