# multi-llama-swap-providers-model-favorites: design

Detailed implementation plan for named local model providers, composite model IDs, grouped pickers, and shared favorites across BooChat and BooCoder.

## 1. Current state

Today the repo splits inference configuration across two incompatible shapes:

- `apps/server` reads env vars such as `LLAMA_SWAP_URL`, `LLAMA_SIDECAR_URL`, and `DEFAULT_MODEL`.
- `apps/coder` reads the same `LLAMA_SWAP_URL` for BooCode's own provider, plus `data/coder-providers.json` for ACP providers.

That leaves several hardcoded single-endpoint assumptions:

- `/api/models` fetches one llama-swap plus optional DeepSeek.
- `provider.ts` routes by `deepseek-` name prefix and a global sidecar default.
- `model-context.ts` caches by bare model string.
- `compaction.ts`, `task-model.ts`, and coder arena use a single upstream URL.
- BooCoder prepends `llama-swap/` and treats any other slash-containing value as an already-routable provider namespace.

## 2. Design principles

1. Provider identity is explicit.
2. Wire model IDs stay bare; persisted model IDs are composite.
3. Legacy bare model IDs remain readable indefinitely.
4. Favorites are shared across BooChat and BooCoder.
5. Sidecar routing is opt-in per provider, not a global fallback.
6. Any cache keyed by model identity uses the full composite ID.

## 3. Recommended config authority

Introduce a new shared file for local inference providers:

- Live path: `/data/llama-providers.json`
- Env var for both apps: `LLAMA_PROVIDERS_PATH`
- Tracked example: `data/llama-providers.example.json`

Recommended shape:

```json
{
  "defaultProvider": "sam-desktop",
  "providers": [
    {
      "id": "sam-desktop",
      "label": "Sam-desktop",
      "baseUrl": "http://100.101.41.16:8401",
      "sidecarUrl": "http://100.101.41.16:8402",
      "kind": "llama-swap"
    },
    {
      "id": "embedding",
      "label": "embedding",
      "baseUrl": "http://100.90.172.55:8411",
      "kind": "llama-swap"
    }
  ]
}
```

Rules:

- If the file is missing, synthesize a single legacy provider from `LLAMA_SWAP_URL` and optional `LLAMA_SIDECAR_URL`.
- `data/coder-providers.json` remains the ACP registry and is not extended with llama-swap base URLs.
- DeepSeek credentials remain env-backed, but the model catalog should expose a synthetic provider group such as `deepseek` so routing no longer depends on a bare `deepseek-` prefix.

## 4. Model identity and parsing

Persist model selections as `provider/model`.

Examples:

- `sam-desktop/qwen3.6-35b-a3b`
- `embedding/gemma-4-12b`
- `deepseek/deepseek-v4-pro`

Helper behavior:

- `parseModelRef(id)` returns `{ providerId, wireModelId, isLegacyBareId }`
- Bare IDs resolve to `{ providerId: defaultProvider, wireModelId: id }`
- Only strip the prefix at the final wire-call boundary

This preserves existing `TEXT` columns while fixing duplicate-name ambiguity.

## 5. Server changes

### 5.1 Shared registry + model catalog

Add shared registry utilities in `packages/contracts` plus server-side loaders used by:

- `apps/server/src/config.ts`
- `apps/server/src/routes/models.ts`
- `apps/server/src/services/inference/provider.ts`
- `apps/server/src/services/model-context.ts`
- `apps/server/src/services/task-model.ts`
- `apps/server/src/services/compaction.ts`

`GET /api/models` should return a provider-aware payload.
Recommended shape:

```ts
interface ModelCatalogProvider {
  id: string;
  label: string;
  models: ModelInfo[];
}

interface ModelCatalogResponse {
  providers: ModelCatalogProvider[];
}
```

Where each `ModelInfo.id` is already composite.

Favorites should **not** be embedded in this payload. They are a user-level view derived in the client from `favorite_models` in `/api/settings`.

### 5.2 Routing

Replace string-heuristic routing with provider-aware resolution:

- `sam-desktop/*` routes to `baseUrl` or `sidecarUrl` depending on agent flags and provider capabilities.
- `embedding/*` always routes directly to its llama-swap `baseUrl`.
- `deepseek/*` routes to the DeepSeek SDK provider.

`resolveModelEndpoint()` and `upstreamModel()` must both resolve from the same parsed model reference to keep streaming and non-streaming behavior aligned.

### 5.3 Context lookup and cache keys

`model-context.ts` must key caches by the full composite ID. The provider prefix is stripped only when building:

`/upstream//props`

This avoids cross-provider cache poisoning for duplicate names.

## 6. Persistence and settings

Keep:

- `sessions.model TEXT`
- `chats.model TEXT`

Add a new `settings` key:

- `favorite_models: string[]`

Rules:

- Stored favorites are composite IDs only.
- Missing/offline favorites are hidden from the picker, not deleted.
- Legacy bare favorites are not supported; on read they may be ignored or normalized only if the default-provider mapping is unambiguous.

## 7. BooCoder integration

Touch points:

- `apps/coder/src/services/provider-snapshot.ts`
- `apps/coder/src/services/dispatcher.ts`
- `apps/coder/src/services/arena-model-call.ts`
- `apps/coder/src/services/arena-analyzer.ts`
- `apps/coder/src/config.ts`

### 7.1 Native `boocode` provider

The native `boocode` provider can use the shared local-provider registry and resolver directly. Its model list should expose composite `provider/model` ids and the UI should group them by local provider.
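
A minimal TypeScript sketch of the parsing rule from section 4, the composite cache key from 5.3, and the per-provider grouping described in 7.1. The `ProviderIndex` type, the second `parseModelRef` parameter, and the `compositeKey` and `groupByProvider` helpers are assumptions made for illustration; the design only fixes the `parseModelRef` return fields. The sketch also assumes a prefix counts as a provider only when it matches a registered provider ID, so a bare ID that happens to contain a slash still resolves to the default provider.

```typescript
interface ParsedModelRef {
  providerId: string;
  wireModelId: string;
  isLegacyBareId: boolean;
}

// Hypothetical minimal view of the registry: only what parsing needs.
interface ProviderIndex {
  defaultProvider: string;
  providerIds: string[];
}

function parseModelRef(id: string, index: ProviderIndex): ParsedModelRef {
  const slash = id.indexOf("/");
  const prefix = slash > 0 ? id.slice(0, slash) : "";
  const rest = slash > 0 ? id.slice(slash + 1) : "";
  // Composite only when the prefix names a known provider and a model follows.
  if (prefix !== "" && rest !== "" && index.providerIds.indexOf(prefix) !== -1) {
    return { providerId: prefix, wireModelId: rest, isLegacyBareId: false };
  }
  // Anything else is treated as a legacy bare ID on the default provider.
  return { providerId: index.defaultProvider, wireModelId: id, isLegacyBareId: true };
}

// Cache key: always the full composite form, so duplicate model names on
// different providers never share an entry.
function compositeKey(ref: ParsedModelRef): string {
  return `${ref.providerId}/${ref.wireModelId}`;
}

// Group model IDs by provider for a sectioned picker; input order is kept.
function groupByProvider(ids: string[], index: ProviderIndex): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const id of ids) {
    const ref = parseModelRef(id, index);
    if (!groups[ref.providerId]) {
      groups[ref.providerId] = [];
    }
    groups[ref.providerId].push(compositeKey(ref));
  }
  return groups;
}

// Provider IDs taken from the examples in this document.
const index: ProviderIndex = {
  defaultProvider: "sam-desktop",
  providerIds: ["sam-desktop", "embedding", "deepseek"],
};

const groups = groupByProvider(
  ["sam-desktop/qwen3.6-35b-a3b", "embedding/gemma-4-12b", "gemma-4-12b"],
  index,
);
console.log(groups);
```

In this sketch the bare `gemma-4-12b` lands under `sam-desktop` as `sam-desktop/gemma-4-12b`, separate from `embedding/gemma-4-12b`, which is the duplicate-name case the composite key is meant to disambiguate. Only `wireModelId` would be sent upstream.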
### 7.2 External-agent parity is a separate seam

`opencode` is not safe to migrate by a naive string rewrite. The current bridge assumes one local llama-swap provider and collapses identity back to `llama-swap/`.

Recommended bridge rule:

- Composite local model IDs remain `provider/model` in native BooCode state and UI.
- Do **not** translate `provider/model` back to `llama-swap/` for external-agent paths; that loses provider identity for duplicate model names.
- If full `opencode` parity is required, prefer a BooCoder-hosted OpenAI-compatible local-model gateway that accepts provider-aware model ids and routes them to the correct local upstream.

If the gateway is not part of the first slice, restrict the initial scope to native `boocode` parity and keep `opencode` local-model parity as a follow-up.

## 8. Picker UX

Both BooChat and BooCoder should converge on the same behavior:

- Favorites section first
- Then one section per provider
- Favorite toggle on every model row
- A favorited model remains visible in its provider section
- Provider order defaults to:
  1. `sam-desktop`
  2. `embedding`
  3. `deepseek` when configured

This batch does not require search. Search can be added later if model counts make the grouped list insufficient.

## 9. Rollout and compatibility

1. Land registry/parsing utilities first.
2. Switch server routing and model catalog to composite IDs.
3. Add favorite persistence and picker grouping.
4. Update native BooCoder (`boocode`) model handling and arena.
5. Decide the `opencode` parity path: gateway now, or explicit follow-up.
6. Verify legacy bare IDs across existing chats and sessions before removing any old env-based assumptions.

Compatibility requirements:

- Missing `/data/llama-providers.json` cannot break startup.
- Existing DB rows with bare IDs must remain routable.
- Existing `DEFAULT_MODEL` can stay bare during transition, but new writes should become composite.

## 10. Deferred items

- Picker search/filtering
- Manual favorite ordering beyond insertion order
- Host health badges in the picker
- Automatic normalization of old session/chat model values
- Full `opencode` multi-provider parity if the first slice ships native-only
- Any boocontrol fleet UI built on top of this registry