multi-llama-swap-providers-model-favorites — design
Detailed implementation plan for named local model providers, composite model IDs, grouped pickers, and shared favorites across BooChat and BooCoder.
1. Current state
Today the repo splits inference configuration across two incompatible shapes:
- apps/server reads env vars such as LLAMA_SWAP_URL, LLAMA_SIDECAR_URL, and DEFAULT_MODEL.
- apps/coder reads the same LLAMA_SWAP_URL for BooCode's own provider, plus data/coder-providers.json for ACP providers.
That leaves several hardcoded single-endpoint assumptions:
- /api/models fetches one llama-swap plus optional DeepSeek.
- provider.ts routes by deepseek- name prefix and a global sidecar default.
- model-context.ts caches by bare model string.
- compaction.ts, task-model.ts, and coder arena use a single upstream URL.
- BooCoder prepends llama-swap/ and treats any other slash-containing value as an already-routable provider namespace.
2. Design principles
- Provider identity is explicit.
- Wire model IDs stay bare; persisted model IDs are composite.
- Legacy bare model IDs remain readable indefinitely.
- Favorites are shared across BooChat and BooCoder.
- Sidecar routing is opt-in per provider, not a global fallback.
- Any cache keyed by model identity uses the full composite ID.
3. Recommended config authority
Introduce a new shared file for local inference providers:
- Live path: /data/llama-providers.json
- Env var for both apps: LLAMA_PROVIDERS_PATH
- Tracked example: data/llama-providers.example.json
Recommended shape:
{
  "defaultProvider": "sam-desktop",
  "providers": [
    {
      "id": "sam-desktop",
      "label": "Sam-desktop",
      "baseUrl": "http://100.101.41.16:8401",
      "sidecarUrl": "http://100.101.41.16:8402",
      "kind": "llama-swap"
    },
    {
      "id": "embedding",
      "label": "embedding",
      "baseUrl": "http://100.90.172.55:8411",
      "kind": "llama-swap"
    }
  ]
}
Rules:
- If the file is missing, synthesize a single legacy provider from LLAMA_SWAP_URL and optional LLAMA_SIDECAR_URL.
- data/coder-providers.json remains the ACP registry and is not extended with llama-swap base URLs.
- DeepSeek credentials remain env-backed, but the model catalog should expose a synthetic provider group such as deepseek so routing no longer depends on a bare deepseek- prefix.
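The missing-file rule can be sketched as a pure function. This is a minimal illustration, not the planned loader: the type names, the loadProviderRegistry signature, the synthesized provider ID "llama-swap", and the decision to throw when neither the file nor LLAMA_SWAP_URL is present are all assumptions.

```typescript
// Hypothetical types mirroring the recommended JSON shape.
interface LlamaProvider {
  id: string;
  label: string;
  baseUrl: string;
  sidecarUrl?: string;
  kind: "llama-swap";
}

interface LlamaProviderRegistry {
  defaultProvider: string;
  providers: LlamaProvider[];
}

type Env = Record<string, string | undefined>;

// `fileContents` is the raw text of the providers file, or null when it is missing.
// Reading the file from LLAMA_PROVIDERS_PATH is left to the caller.
function loadProviderRegistry(
  fileContents: string | null,
  env: Env,
): LlamaProviderRegistry {
  if (fileContents !== null) {
    const parsed = JSON.parse(fileContents) as LlamaProviderRegistry;
    const ids = new Set(parsed.providers.map((p) => p.id));
    if (!ids.has(parsed.defaultProvider)) {
      throw new Error(`defaultProvider "${parsed.defaultProvider}" is not listed in providers`);
    }
    return parsed;
  }

  // Missing file: synthesize one legacy provider from the existing env vars.
  const baseUrl = env["LLAMA_SWAP_URL"];
  if (!baseUrl) {
    // Assumption: with no file and no env var there is nothing to route to.
    throw new Error("No providers file found and LLAMA_SWAP_URL is unset");
  }
  const legacy: LlamaProvider = {
    id: "llama-swap", // assumed ID for the synthesized provider
    label: "llama-swap",
    baseUrl,
    kind: "llama-swap",
  };
  const sidecarUrl = env["LLAMA_SIDECAR_URL"];
  if (sidecarUrl) legacy.sidecarUrl = sidecarUrl;
  return { defaultProvider: legacy.id, providers: [legacy] };
}
```

Keeping file I/O outside the function makes the fallback path easy to test from both apps without touching the filesystem.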
4. Model identity and parsing
Persist model selections as provider/model.
Examples:
- sam-desktop/qwen3.6-35b-a3b
- embedding/gemma-4-12b
- deepseek/deepseek-v4-pro
Helper behavior:
- parseModelRef(id) returns { providerId, wireModelId, isLegacyBareId }
- Bare IDs resolve to { providerId: defaultProvider, wireModelId: id }
- Only strip the prefix at the final wire-call boundary
This preserves existing TEXT columns while fixing duplicate-name ambiguity.
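A minimal sketch of the helper behavior, under stated assumptions: the document names only parseModelRef(id), so the extra knownProviderIds and defaultProvider parameters here are hypothetical ways of supplying registry data. The sketch also assumes that an ID whose first path segment is not a known provider is treated as a legacy bare ID, so bare names that happen to contain a slash stay routable.

```typescript
interface ModelRef {
  providerId: string;
  wireModelId: string;
  isLegacyBareId: boolean;
}

// Assumption: registry data is passed in explicitly; the real helper may close over it.
function parseModelRef(
  id: string,
  knownProviderIds: ReadonlySet<string>,
  defaultProvider: string,
): ModelRef {
  const slash = id.indexOf("/");
  // Composite only when there is a non-empty prefix, a non-empty remainder,
  // and the prefix names a known provider.
  if (slash > 0 && slash < id.length - 1) {
    const prefix = id.slice(0, slash);
    if (knownProviderIds.has(prefix)) {
      return {
        providerId: prefix,
        wireModelId: id.slice(slash + 1),
        isLegacyBareId: false,
      };
    }
  }
  // Everything else is a legacy bare ID resolved against the default provider.
  return { providerId: defaultProvider, wireModelId: id, isLegacyBareId: true };
}

// Rebuild the persisted form; the wire form is just ref.wireModelId.
function toCompositeId(ref: ModelRef): string {
  return `${ref.providerId}/${ref.wireModelId}`;
}
```

Callers would keep the ModelRef (or the composite string) everywhere and read wireModelId only when building the upstream request.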
5. Server changes
5.1 Shared registry + model catalog
Add shared registry utilities in packages/contracts plus server-side loaders
used by:
- apps/server/src/config.ts
- apps/server/src/routes/models.ts
- apps/server/src/services/inference/provider.ts
- apps/server/src/services/model-context.ts
- apps/server/src/services/task-model.ts
- apps/server/src/services/compaction.ts
GET /api/models should return a provider-aware payload. Recommended shape:
interface ModelCatalogProvider {
  id: string;
  label: string;
  models: ModelInfo[];
}

interface ModelCatalogResponse {
  providers: ModelCatalogProvider[];
}
Where each ModelInfo.id is already composite.
Favorites should not be embedded in this payload. They are a user-level
view derived in the client from favorite_models in /api/settings.
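As an illustration of the payload, the sketch below assembles a catalog from per-provider listings and prefixes each wire model name with its provider ID. ModelInfo is reduced to a single id field as a stand-in for the real type, and ProviderListing and buildModelCatalog are hypothetical names.

```typescript
// Minimal stand-in: the real ModelInfo presumably carries more fields.
interface ModelInfo {
  id: string;
}

interface ModelCatalogProvider {
  id: string;
  label: string;
  models: ModelInfo[];
}

interface ModelCatalogResponse {
  providers: ModelCatalogProvider[];
}

// Hypothetical input: what one upstream reported, using bare wire names.
interface ProviderListing {
  id: string;
  label: string;
  wireModelIds: string[];
}

// Build the provider-aware payload; every ModelInfo.id leaves here composite.
function buildModelCatalog(listings: ProviderListing[]): ModelCatalogResponse {
  return {
    providers: listings.map((listing) => ({
      id: listing.id,
      label: listing.label,
      models: listing.wireModelIds.map((wireId) => ({
        id: `${listing.id}/${wireId}`,
      })),
    })),
  };
}
```

Two providers serving the same wire name then appear as two distinct catalog entries, and no favorites data is mixed into the response.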
5.2 Routing
Replace string-heuristic routing with provider-aware resolution:
- sam-desktop/* routes to baseUrl or sidecarUrl depending on agent flags and provider capabilities.
- embedding/* always routes directly to its llama-swap baseUrl.
- deepseek/* routes to the DeepSeek SDK provider.
resolveModelEndpoint() and upstreamModel() must both resolve from the same
parsed model reference to keep streaming and non-streaming behavior aligned.
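The routing rules could look roughly like the sketch below. The function names come from this section, but their signatures, the ResolvedEndpoint union, and the useSidecar flag (a stand-in for "agent flags") are assumptions. Sidecar routing applies only when the provider declares a sidecarUrl.

```typescript
interface LocalProvider {
  id: string;
  baseUrl: string;
  sidecarUrl?: string;
}

type ResolvedEndpoint =
  | { kind: "llama-swap"; url: string; wireModelId: string }
  | { kind: "deepseek"; wireModelId: string };

// Hypothetical stand-in for the agent flags mentioned in the routing rules.
interface RoutingOptions {
  useSidecar: boolean;
}

function resolveModelEndpoint(
  compositeId: string,
  providers: LocalProvider[],
  opts: RoutingOptions,
): ResolvedEndpoint {
  const slash = compositeId.indexOf("/");
  if (slash <= 0) {
    // Bare IDs are expected to be normalized before routing.
    throw new Error(`Expected a composite model ID, got "${compositeId}"`);
  }
  const providerId = compositeId.slice(0, slash);
  const wireModelId = compositeId.slice(slash + 1);

  if (providerId === "deepseek") {
    return { kind: "deepseek", wireModelId };
  }
  const provider = providers.find((p) => p.id === providerId);
  if (!provider) {
    throw new Error(`Unknown provider "${providerId}"`);
  }
  // Sidecar is opt-in per provider: without a sidecarUrl, always use baseUrl.
  const url =
    opts.useSidecar && provider.sidecarUrl ? provider.sidecarUrl : provider.baseUrl;
  return { kind: "llama-swap", url, wireModelId };
}

// Derives the wire name from the same resolution, so the two cannot drift apart.
function upstreamModel(
  compositeId: string,
  providers: LocalProvider[],
  opts: RoutingOptions,
): string {
  return resolveModelEndpoint(compositeId, providers, opts).wireModelId;
}
```

Having upstreamModel() delegate to resolveModelEndpoint() is one way to guarantee that streaming and non-streaming calls agree on both the URL and the model name.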
5.3 Context lookup and cache keys
model-context.ts must key caches by the full composite ID. The provider
prefix is stripped only when building:
<provider.baseUrl>/upstream/<wireModelId>/props
This avoids cross-provider cache poisoning for duplicate names.
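A small sketch of the keying rule, assuming a hypothetical ModelContextCache class and an injected loader that fetches the context size from the props URL. Only the URL template comes from this section; everything else is illustrative.

```typescript
interface ParsedRef {
  providerId: string;
  wireModelId: string;
}

// The only place the bare wire name is used.
function buildPropsUrl(baseUrl: string, wireModelId: string): string {
  return `${baseUrl}/upstream/${wireModelId}/props`;
}

// Hypothetical loader: fetches the props URL and extracts a context size.
type ContextLoader = (propsUrl: string) => Promise<number>;

class ModelContextCache {
  private readonly entries = new Map<string, Promise<number>>();
  private readonly load: ContextLoader;

  constructor(load: ContextLoader) {
    this.load = load;
  }

  get(ref: ParsedRef, baseUrl: string): Promise<number> {
    // Key by the full composite ID so equal wire names on different
    // providers never share an entry.
    const key = `${ref.providerId}/${ref.wireModelId}`;
    let entry = this.entries.get(key);
    if (!entry) {
      entry = this.load(buildPropsUrl(baseUrl, ref.wireModelId));
      this.entries.set(key, entry);
      // Drop failed lookups so a later call can retry.
      entry.catch(() => {
        this.entries.delete(key);
      });
    }
    return entry;
  }
}
```

With this keying, sam-desktop/gemma-4-12b and embedding/gemma-4-12b each trigger their own lookup against their own baseUrl.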
6. Persistence and settings
Keep:
- sessions.model TEXT
- chats.model TEXT
Add a new settings key:
favorite_models: string[]
Rules:
- Stored favorites are composite IDs only.
- Missing/offline favorites are hidden from the picker, not deleted.
- Legacy bare favorites are not supported as stored values. On read, a bare entry is ignored, or normalized to a composite ID only if the default-provider mapping is unambiguous.
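One possible read-side normalization for favorite_models, as a sketch. The function name is hypothetical, and it makes two assumptions: any entry containing a slash is already composite, and the caller passes null for defaultProvider when the mapping is ambiguous.

```typescript
// Normalize the stored favorite_models array on read.
// - Composite entries are kept as-is, even if their provider is currently
//   offline (hiding is a picker concern, not a storage concern).
// - Bare entries are mapped to the default provider when one is given,
//   and ignored otherwise.
function normalizeFavoriteModels(
  stored: readonly string[],
  defaultProvider: string | null,
): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const entry of stored) {
    let compositeId: string | null;
    if (entry.includes("/")) {
      compositeId = entry; // assumed to be composite already
    } else if (defaultProvider !== null) {
      compositeId = `${defaultProvider}/${entry}`;
    } else {
      compositeId = null; // ambiguous bare entry: ignore
    }
    // Preserve insertion order and drop duplicates created by normalization.
    if (compositeId !== null && !seen.has(compositeId)) {
      seen.add(compositeId);
      result.push(compositeId);
    }
  }
  return result;
}
```

Because offline entries pass through untouched, a favorite reappears in the picker as soon as its provider lists the model again.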
7. BooCoder integration
Touch points:
- apps/coder/src/services/provider-snapshot.ts
- apps/coder/src/services/dispatcher.ts
- apps/coder/src/services/arena-model-call.ts
- apps/coder/src/services/arena-analyzer.ts
- apps/coder/src/config.ts
7.1 Native boocode provider
The native boocode provider can use the shared local-provider registry and
resolver directly. Its model list should expose composite provider/model ids
and the UI should group them by local provider.
7.2 External-agent parity is a separate seam
opencode is not safe to migrate by a naive string rewrite. The current bridge
assumes one local llama-swap provider and collapses identity back to
llama-swap/<model>.
Recommended bridge rule:
- Composite local model IDs remain provider/model in native BooCode state and UI.
- Do not translate provider/model back to llama-swap/<wireModelId> for external-agent paths; that loses provider identity for duplicate model names.
- If full opencode parity is required, prefer a BooCoder-hosted OpenAI-compatible local-model gateway that accepts provider-aware model ids and routes them to the correct local upstream.
If the gateway is not part of the first slice, restrict the initial scope to
native boocode parity and keep opencode local-model parity as a follow-up.
8. Picker UX
Both BooChat and BooCoder should converge on the same behavior:
- Favorites section first
- Then one section per provider
- Favorite toggle on every model row
- A favorited model remains visible in its provider section
- Provider order defaults to:
  - sam-desktop
  - embedding
  - deepseek when configured
This batch does not require search. Search can be added later if model counts make the grouped list insufficient.
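The grouping behavior can be sketched as a pure view-model function shared by both clients. All names here are hypothetical. The sketch assumes the catalog already arrives in the desired provider order and omits the Favorites section when no favorite is currently available.

```typescript
interface PickerModel {
  id: string; // composite provider/model ID
  isFavorite: boolean;
}

interface PickerSection {
  title: string;
  models: PickerModel[];
}

// Hypothetical input: one entry per provider, in display order.
interface CatalogProviderView {
  id: string;
  label: string;
  modelIds: string[];
}

function buildPickerSections(
  providers: CatalogProviderView[],
  favoriteModels: readonly string[],
): PickerSection[] {
  const favorites = new Set(favoriteModels);
  const available = new Set<string>();
  for (const provider of providers) {
    for (const id of provider.modelIds) available.add(id);
  }

  const sections: PickerSection[] = [];

  // Favorites first; entries missing from the catalog are hidden, not removed.
  const visibleFavorites = favoriteModels.filter((id) => available.has(id));
  if (visibleFavorites.length > 0) {
    sections.push({
      title: "Favorites",
      models: visibleFavorites.map((id) => ({ id, isFavorite: true })),
    });
  }

  // Then one section per provider; favorited models stay listed here too.
  for (const provider of providers) {
    sections.push({
      title: provider.label,
      models: provider.modelIds.map((id) => ({
        id,
        isFavorite: favorites.has(id),
      })),
    });
  }
  return sections;
}
```

Each row carries isFavorite so the toggle can render the same way in the Favorites section and in the provider section.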
9. Rollout and compatibility
- Land registry/parsing utilities first.
- Switch server routing and model catalog to composite IDs.
- Add favorite persistence and picker grouping.
- Update native BooCoder (boocode) model handling and arena.
- Decide the opencode parity path: gateway now, or explicit follow-up.
- Verify legacy bare IDs across existing chats and sessions before removing any old env-based assumptions.
Compatibility requirements:
- Missing /data/llama-providers.json cannot break startup.
- Existing DB rows with bare IDs must remain routable.
- Existing DEFAULT_MODEL can stay bare during transition, but new writes should become composite.
10. Deferred items
- Picker search/filtering
- Manual favorite ordering beyond insertion order
- Host health badges in the picker
- Automatic normalization of old session/chat model values
- Full opencode multi-provider parity if the first slice ships native-only
- Any boocontrol fleet UI built on top of this registry