multi-llama-swap-providers-model-favorites
Why
BooCode still treats local inference as a single LLAMA_SWAP_URL, but the
actual setup is already a fleet:
- sam-desktop at 100.101.41.16:8401
- embedding at 100.90.172.55:8411
- optional DeepSeek cloud models when DEEPSEEK_API_KEY is set
The current model identity is only a bare model string, which is no longer
safe. Five model IDs already exist on both llama-swap hosts, the seeded
DEFAULT_MODEL has already drifted out of the live list once, and multiple
server/coder call sites still hardcode a single upstream.
The research in
docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md
validated one direction:
- Introduce a named provider registry.
- Store selected models as composite IDs: provider/model.
- Group pickers by provider with a Favorites section first.
- Persist favorites server-side so BooChat and BooCoder share them.
- Remove single-endpoint assumptions from routing, context lookup, compaction, arena, and coder dispatch.
This batch is also the prerequisite named in openspec/changes/boocontrol/.
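
The composite-ID rule above can be sketched as a small parser. This is a minimal illustration, not the implementation from design.md: `parseModelId`, `formatModelId`, `KNOWN_PROVIDERS`, the `deepseek` provider ID, and the choice of `sam-desktop` as the default local provider are all assumptions made for the example.

```typescript
// Hypothetical sketch: parse a stored model ID into a provider/model pair.
// Names and the default-provider choice are assumptions, not the real API.
type ModelRef = { provider: string; model: string };

// Assumed provider IDs, based on the hosts named in this proposal.
const KNOWN_PROVIDERS: string[] = ["sam-desktop", "embedding", "deepseek"];

// Assumption: legacy bare IDs resolve to this provider.
const DEFAULT_LOCAL_PROVIDER = "sam-desktop";

function parseModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const provider = id.slice(0, slash);
    // Only a registered provider prefix makes the ID composite.
    if (KNOWN_PROVIDERS.indexOf(provider) !== -1) {
      return { provider, model: id.slice(slash + 1) };
    }
  }
  // Legacy bare ID, or a slash that is not a provider prefix:
  // keep the whole string as the model name on the default provider.
  return { provider: DEFAULT_LOCAL_PROVIDER, model: id };
}

function formatModelId(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}
```

Under these assumptions, the provider is taken from the prefix rather than inferred from the model name, so `embedding/deepseek-r1-qwen3-8b` resolves to the `embedding` host even though its model name contains "deepseek". Treating an unrecognized prefix as part of a legacy model name is one possible policy; rejecting it outright would be another.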
What Changes
- Add a shared provider-registry config for local model providers.
- Replace bare model identity with composite provider/model IDs at the API, picker, cache, and routing layers while keeping legacy bare IDs readable.
- Convert the server model catalog from a flat list into grouped provider sections with favorites surfaced first.
- Make sidecar routing an attribute of the sam-desktop provider instead of a global default for all non-DeepSeek traffic.
- Update BooCoder's llama-swap namespace bridge so composite IDs still dispatch through opencode correctly.
- Add server-side favorite persistence in settings with hide-not-delete behavior for unavailable models.
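
As a rough illustration of sidecar routing becoming a per-provider attribute, the registry could look something like the sketch below. The field names (`id`, `baseUrl`, `viaSidecar`), the `http://` scheme, the `SIDECAR_URL` placeholder, and `resolveUpstream` are all hypothetical; the real registry shape is specified in design.md.

```typescript
// Hypothetical registry sketch; field names and URLs are assumptions.
interface ProviderConfig {
  id: string;
  baseUrl: string;
  // Sidecar routing is decided per provider, not globally.
  viaSidecar: boolean;
}

const PROVIDERS: ProviderConfig[] = [
  { id: "sam-desktop", baseUrl: "http://100.101.41.16:8401", viaSidecar: true },
  { id: "embedding", baseUrl: "http://100.90.172.55:8411", viaSidecar: false },
];

// Placeholder address for llama-sidecar; not taken from the source.
const SIDECAR_URL = "http://127.0.0.1:9000";

function resolveUpstream(providerId: string): string {
  const match = PROVIDERS.filter((p) => p.id === providerId)[0];
  if (!match) {
    throw new Error(`unknown provider: ${providerId}`);
  }
  // Only providers that opt in are routed through the sidecar.
  return match.viaSidecar ? SIDECAR_URL : match.baseUrl;
}
```

With this shape, a request for any `embedding/*` model resolves directly to the embedding host, and only `sam-desktop` traffic passes through the sidecar.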
Non-goals
- Replacing the existing ACP provider registry in data/coder-providers.json
- Introducing llama-swap peer federation or LiteLLM as an aggregation layer
- Adding full-text search, tags, or admin curation to the pickers in this batch
- Cleaning up stale favorites automatically
- Reworking session/chat schema columns from TEXT to structured provider fields
Success Criteria
- GET /api/models returns a provider-aware catalog that can distinguish duplicate model names across hosts.
- Existing sessions/chats that store a bare model ID still work, resolving to the default local provider without data migration.
- embedding/deepseek-r1-qwen3-8b never routes to DeepSeek cloud and never receives the fake static 131k context window.
- Requests for embedding/* models never go through llama-sidecar.
- BooChat and BooCoder both render a Favorites section first, then provider groups, and a favorited model still remains visible in its provider group.
- A favorite for an offline provider disappears from the visible list but returns automatically when that provider comes back.
- Arena, compaction, task-model, and model-context all resolve the same provider/model pair consistently.
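
The Favorites-first and hide-not-delete criteria above can be sketched as a pure function over the live catalog and the stored favorites. `CatalogSection` and `buildCatalog` are hypothetical names, and the response shape is an assumption rather than the actual GET /api/models contract.

```typescript
// Hypothetical sketch of the grouped catalog; shapes are assumptions.
interface CatalogSection {
  title: string;
  models: string[]; // composite provider/model IDs
}

function buildCatalog(
  // Provider ID -> model names that provider currently reports.
  // An offline provider is simply absent from this object.
  available: { [provider: string]: string[] },
  // Persisted favorites as composite IDs; never modified here.
  favorites: string[],
): CatalogSection[] {
  const groups: CatalogSection[] = Object.keys(available)
    .sort()
    .map((provider) => ({
      title: provider,
      models: available[provider].map((model) => provider + "/" + model),
    }));

  // Collect every composite ID that is live right now.
  const live: string[] = [];
  groups.forEach((g) => g.models.forEach((id) => live.push(id)));

  // Hide, but do not delete, favorites whose provider or model is unavailable.
  const visibleFavorites = favorites.filter((id) => live.indexOf(id) !== -1);

  // Favorites come first; favorited models also stay in their provider group.
  return [{ title: "Favorites", models: visibleFavorites }].concat(groups);
}
```

Because the stored favorites list is only filtered for display, a favorite on an offline provider reappears as soon as that provider shows up in `available` again, with no cleanup or re-adding step.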
Deliverables
| Doc | Purpose |
|---|---|
| design.md | Registry shape, model identity rules, routing, UX, rollout |
| tasks.md | Ordered implementation and verification checklist |