Discovery Notes: Multi-Provider Local Models
Single source of truth for implementation context. Read this first before touching the plan or code.
Tech stack
- Monorepo with pnpm workspaces.
- `apps/server`: Fastify + Postgres, native inference, local-model routing, BooChat APIs.
- `apps/web`: React + Vite SPA, shared chat and coder UI.
- `apps/coder`: host-side BooCoder service, provider probing, native and external-agent dispatch, Arena, MCP.
- `packages/contracts`: shared cross-app schemas and types, built before consumers.
- TypeScript strict mode. Server and coder use NodeNext and `.js` import suffixes.
- Tests: `pnpm -C apps/server test`, `pnpm -C apps/coder test`. No dedicated web test harness.
ADRs found
- `docs/adr/0001-arena-two-lane-scheduling.md`
  Summary: local llama-backed contestants run serially in one lane, cloud contestants run in parallel in another lane; multi-provider work must preserve this lane model.
- `docs/adr/0002-arena-dedicated-tables-not-flow-runner.md`
  Summary: Arena owns its own storage and runtime shape; reuse dispatcher machinery but do not fold Arena back into flow-runner abstractions.
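The lane model from ADR 0001 can be sketched roughly as follows. This is a minimal illustration, not the Arena scheduler: the `Contestant` shape, the `lane` field, and `runTwoLanes` are hypothetical names, and starting both lanes at the same time is an assumption the ADR summary does not confirm.

```typescript
// Hypothetical shapes for illustration; not the real Arena types.
type Lane = "local" | "cloud";

interface Contestant {
  id: string;
  lane: Lane;
}

type RunFn<T> = (contestant: Contestant) => Promise<T>;

// Local contestants run one at a time; cloud contestants run concurrently.
// Assumption: both lanes start together, so the serial lane does not
// hold up the parallel one.
async function runTwoLanes<T>(
  contestants: Contestant[],
  run: RunFn<T>,
): Promise<Map<string, T>> {
  const results = new Map<string, T>();

  const localLane = (async () => {
    for (const c of contestants.filter((x) => x.lane === "local")) {
      // Awaiting inside the loop is what keeps this lane serial.
      results.set(c.id, await run(c));
    }
  })();

  const cloudLane = Promise.all(
    contestants
      .filter((x) => x.lane === "cloud")
      .map(async (c) => {
        results.set(c.id, await run(c));
      }),
  );

  await Promise.all([localLane, cloudLane]);
  return results;
}
```

Under this sketch, adding more local providers only lengthens the serial lane. It never moves a local contestant into the parallel lane, which is the property the ADR asks multi-provider work to preserve.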
Coding standards found
- `docs/coding-standards/cross-app-contract-parity.md`
  Summary: when a cross-app contract changes, update the canonical package source plus app-side secondary representations in the same batch; missing one side silently drops behavior at runtime.
- `CLAUDE.md`
  Summary: `packages/contracts` is the single source for provider-snapshot and message-metadata contracts, deploy-by-surface rules matter, and contract changes must respect app-local secondary unions and renderers where they still exist.
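One way to make a parity gap fail at typecheck instead of dropping behavior at runtime is to key the app-side representation on the canonical union. The sketch below is only an illustration of that idea: `MessageKind`, `renderers`, and `render` are hypothetical names, and the union stands in for a type that would be imported from `packages/contracts`.

```typescript
// Hypothetical canonical union, standing in for a contracts export.
type MessageKind = "text" | "tool_call" | "error";

// App-side secondary representation. Because the type is
// Record<MessageKind, ...>, adding a kind to the union without adding
// a renderer here is a compile error rather than a silent runtime drop.
const renderers: Record<MessageKind, (payload: string) => string> = {
  text: (p) => p,
  tool_call: (p) => `[tool] ${p}`,
  error: (p) => `[error] ${p}`,
};

function render(kind: MessageKind, payload: string): string {
  return renderers[kind](payload);
}
```

This only helps where the app-side code imports the canonical type. An app-local copy of the union would still drift unnoticed, which is why the standard asks for both sides to change in the same batch.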
Relevant architecture notes
- `apps/server/CLAUDE.md`
  Summary: `services/inference/provider.ts` is the current llama-swap provider seam; `model-context.ts` and `compaction.ts` currently assume one upstream.
- `apps/coder/CLAUDE.md`
  Summary: provider snapshot and `opencode` integration are the main local-model seams; `llama-swap/*` is currently the local namespace assumption.
- `apps/web/CLAUDE.md`
  Summary: `ModelPicker` and `AgentComposerBar` are separate UI surfaces with different constraints; any provider snapshot loading-state change can make providers disappear from the coder UI.
Code touch points
Shared contracts and config patterns
- `packages/contracts/src/provider-config.ts`
  Existing coder ACP provider config schema; useful precedent, but not the right place to overload with local host inventory semantics.
- `apps/coder/src/services/provider-config-registry.ts`
  Existing pattern for schema-in-package plus app-local load/build cache.
- `packages/contracts/src/provider-snapshot.ts`
  Shared snapshot contract used by coder and web.
Server: catalog, routing, and downstream local-model consumers
- `apps/server/src/config.ts`
  Current env config includes `LLAMA_SWAP_URL`, `LLAMA_SIDECAR_URL`, and `DEFAULT_MODEL`; multi-provider config must enter here.
- `apps/server/src/routes/models.ts`
  Current `/api/models` route fetches one llama-swap and optionally DeepSeek.
- `apps/server/src/services/inference/provider.ts`
  Current route selection and AI SDK provider seam; central place to remove heuristic provider detection.
- `apps/server/src/services/model-context.ts`
  Current context cache keys by bare model string and assumes one `LLAMA_SWAP_URL`.
- `apps/server/src/services/compaction.ts`
  Uses `resolveModelEndpoint()` today, but still contains one-provider assumptions and a DeepSeek prefix special case.
- `apps/server/src/services/task-model.ts`
  Returns one resolved `{url, model}` pair today.
- `apps/server/src/index.ts`
  Calls `configureModelContext({ llamaSwapUrl })`; this wiring must change when context lookup becomes provider-aware.
- `apps/server/src/routes/settings.ts`
  Existing shared settings persistence surface; right place for `favorite_models`.
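The note about `model-context.ts` keying its cache by bare model string is worth making concrete: once two providers serve a model with the same name, a bare key lets one entry overwrite the other. The following is a minimal sketch under stated assumptions. `ModelRef`, `contextCacheKey`, the host names, and the context sizes are all hypothetical, and the real cache may store something other than a number.

```typescript
// Hypothetical shapes: provider ids, model names, and context sizes
// are illustrative, not taken from the repo.
interface ModelRef {
  providerId: string;
  model: string;
}

// JSON-encode the pair so slashes or colons inside either part cannot
// make two different refs produce the same key.
function contextCacheKey(ref: ModelRef): string {
  return JSON.stringify([ref.providerId, ref.model]);
}

const hostA: ModelRef = { providerId: "host-a", model: "qwen3" };
const hostB: ModelRef = { providerId: "host-b", model: "qwen3" };

// Keyed by bare model string: the second write overwrites the first.
const bareCache = new Map<string, number>();
bareCache.set(hostA.model, 32768);
bareCache.set(hostB.model, 8192);

// Keyed by provider plus model: both entries survive.
const providerAwareCache = new Map<string, number>();
providerAwareCache.set(contextCacheKey(hostA), 32768);
providerAwareCache.set(contextCacheKey(hostB), 8192);
```

With these inputs `bareCache` ends up holding one entry and `providerAwareCache` two. A provider-aware lookup has to avoid exactly that kind of collision.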
Web: BooChat and coder selection UI
- `apps/web/src/components/ModelPicker.tsx`
  Shared BooChat model picker component; currently assumes a flat `/api/models` list.
- `apps/web/src/components/AgentComposerBar.tsx`
  Native BooCoder provider/mode/model picker surface.
- `apps/web/src/lib/model-label.ts`
  Display-only model prettifier used by both pickers.
- `apps/web/src/api/client.ts`
  `models()` currently expects `ModelInfo[]`.
- `apps/web/src/api/types.ts`
  Holds the web-side API contract for `/api/models` and other cross-app payloads.
Coder: native, snapshot, arena, and external-agent bridge
- `apps/coder/src/config.ts`
  Current coder config still exposes `LLAMA_SWAP_URL`; multi-provider config must enter here too.
- `apps/coder/src/services/provider-snapshot.ts`
  Current snapshot fetches one `LLAMA_SWAP_URL`, prefixes local models as `llama-swap/*`, and merges them into `opencode`.
- `apps/coder/src/services/dispatcher.ts`
  Current native and external-agent dispatch logic still assumes local bare ids or `llama-swap/*` for local routing.
- `apps/coder/src/services/backends/opencode-server.ts`
  `parseModel()` splits only once at `/`; this is good news because a stable outer provider namespace can carry an inner composite model id.
- `apps/coder/src/services/arena-model-call.ts`
  Direct one-shot local model call against `LLAMA_SWAP_URL`.
- `apps/coder/src/services/arena-analyzer.ts`
  Local-vs-cloud checks rely on one local model set and one upstream.
- `apps/coder/src/index.ts`
  Builds the local-model set for Arena from one fetched llama-swap list.
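The split-once behavior noted for `parseModel()` can be illustrated with a small stand-in. This is not the function from `opencode-server.ts`: `splitOnce`, the `ParsedModel` shape, the empty-provider fallback, and the example id are assumptions made for the sketch.

```typescript
// Hypothetical re-creation of split-once parsing; the real parseModel()
// may differ in return shape and error handling.
interface ParsedModel {
  providerID: string;
  modelID: string;
}

function splitOnce(ref: string): ParsedModel {
  const i = ref.indexOf("/");
  // Assumption: a ref with no slash is treated as a bare model id.
  if (i < 0) return { providerID: "", modelID: ref };
  // Everything after the first slash stays intact, later slashes included.
  return { providerID: ref.slice(0, i), modelID: ref.slice(i + 1) };
}

// Illustrative id: outer namespace "local", inner composite "gpu-box-1/qwen3-coder".
const parsed = splitOnce("local/gpu-box-1/qwen3-coder");
```

Here `parsed.providerID` is `"local"` and `parsed.modelID` is `"gpu-box-1/qwen3-coder"`, so a host-qualified inner id passes through the outer namespace unchanged.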
Recent activity and churn
High-churn files in the last 90 days:
- `apps/web/src/api/types.ts`
- `apps/web/src/api/client.ts`
- `apps/server/src/index.ts`
- `apps/server/src/types/api.ts`
- `apps/coder/src/services/dispatcher.ts`
- `apps/coder/src/index.ts`
- `apps/coder/src/services/provider-snapshot.ts`
- `apps/web/src/components/AgentComposerBar.tsx`
- `apps/server/src/services/compaction.ts`
Implication: keep work units narrow and avoid combining unrelated refactors in these files.
Constraints and load-bearing facts
- `packages/contracts` already owns provider-snapshot types; if the snapshot contract changes, rebuild the package before touching consumers.
- `apps/web` has no dedicated test harness, so web verification will rely on typecheck plus smoke testing.
- Arena’s local lane semantics are intentional; multi-provider support must not collapse local models into parallel execution.
- `opencode` local parity is not a small rename. The current host config and snapshot behavior collapse identity to one `llama-swap` namespace.
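To see why the single `llama-swap` namespace is more than a naming problem, consider what would happen if a second host were added under the current prefixing scheme. The sketch below is illustrative only: `HostInventory`, both functions, and the host and model names are hypothetical, and the host-qualified scheme is one possible alternative rather than the planned design.

```typescript
// Hypothetical host inventory; host and model names are illustrative.
interface HostInventory {
  host: string;
  models: string[];
}

// Single outer namespace: the host is dropped from the id, so the same
// model on two hosts becomes one indistinguishable entry.
function collapsedIds(hosts: HostInventory[]): string[] {
  const ids: string[] = [];
  for (const h of hosts) {
    for (const m of h.models) {
      const id = `llama-swap/${m}`;
      if (ids.indexOf(id) === -1) ids.push(id);
    }
  }
  return ids;
}

// Host-qualified ids (an assumption, one possible scheme): identity survives.
function hostQualifiedIds(hosts: HostInventory[]): string[] {
  const ids: string[] = [];
  for (const h of hosts) {
    for (const m of h.models) ids.push(`${h.host}/${m}`);
  }
  return ids;
}

const inventory: HostInventory[] = [
  { host: "desk", models: ["qwen3"] },
  { host: "rack", models: ["qwen3", "llama3"] },
];
```

For this inventory `collapsedIds` yields two ids while `hostQualifiedIds` yields three. The routing information lost in that gap cannot be recovered by renaming a prefix.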
Gaps and unknowns
- No shared local-provider config file or schema exists in-repo yet.
- `/api/models` shape change is not yet specified in app-local types; W2 must settle the contract before W4 starts.
- The final `opencode` gateway path is not implemented anywhere yet; W7 is net-new code, not just adaptation.
- No dedicated docs for “add a machine” exist yet; W8 must create them.