# multi-llama-swap-providers-model-favorites

## Why

BooCode still treats local inference as a single `LLAMA_SWAP_URL`, but the
actual setup is already a fleet:

- `sam-desktop` at `100.101.41.16:8401`
- `embedding` at `100.90.172.55:8411`
- optional DeepSeek cloud models when `DEEPSEEK_API_KEY` is set

The current model identity is only a bare model string, which is no longer
safe. Five model IDs already exist on both llama-swap hosts, the seeded
`DEFAULT_MODEL` has already drifted out of the live list once, and multiple
server/coder call sites still hardcode a single upstream.

The research in
`docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md`
validated one direction:

1. Introduce a named provider registry.
2. Store selected models as composite IDs: `provider/model`.
3. Group pickers by provider with a Favorites section first.
4. Persist favorites server-side so BooChat and BooCoder share them.
5. Remove single-endpoint assumptions from routing, context lookup,
   compaction, arena, and coder dispatch.

This batch is also the prerequisite named in `openspec/changes/boocontrol/`.
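
As a rough illustration of steps 1 and 2 in the list above, the sketch below pairs a small provider registry with composite `provider/model` IDs. The type and function names (`LocalProvider`, `ModelRef`, `formatModelId`, `parseModelId`) and the `http://` scheme are assumptions made for this example; the real registry shape is defined in `design.md`.

```typescript
// Hypothetical registry entry; field names are illustrative only.
interface LocalProvider {
  name: string;
  baseUrl: string; // scheme assumed to be http for this sketch
}

// The two llama-swap hosts named in this proposal.
const PROVIDERS: LocalProvider[] = [
  { name: "sam-desktop", baseUrl: "http://100.101.41.16:8401" },
  { name: "embedding", baseUrl: "http://100.90.172.55:8411" },
];

interface ModelRef {
  provider: string;
  model: string;
}

// Build the composite ID that would be stored instead of a bare model string.
function formatModelId(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}

// Parse a composite ID. Returns null when the prefix before the first "/"
// is not a registered provider, so a bare model string is never
// mistaken for a composite ID.
function parseModelId(
  id: string,
  providers: LocalProvider[] = PROVIDERS,
): ModelRef | null {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) return null;
  const provider = id.slice(0, slash);
  if (!providers.some((p) => p.name === provider)) return null;
  return { provider, model: id.slice(slash + 1) };
}
```

Checking the prefix against the registry, rather than splitting blindly on `/`, is one way to keep model names that themselves contain a slash from being misread as a provider reference.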

## What Changes

1. Add a shared provider-registry config for local model providers.
2. Replace bare model identity with composite `provider/model` IDs at the API,
   picker, cache, and routing layers while keeping legacy bare IDs readable.
3. Convert the server model catalog from a flat list into grouped provider
   sections with favorites surfaced first.
4. Make sidecar routing an attribute of the `sam-desktop` provider instead of
   a global default for all non-DeepSeek traffic.
5. Update BooCoder's llama-swap namespace bridge so composite IDs still
   dispatch through opencode correctly.
6. Add server-side favorite persistence in `settings` with hide-not-delete
   behavior for unavailable models.
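
Items 3 and 6 above could fit together roughly as follows. This is a minimal sketch, not the server's actual catalog code: `CatalogSection`, `buildCatalog`, and the input shapes are hypothetical, and the stored favorites are assumed to be a plain list of composite IDs.

```typescript
// Hypothetical section shape for a grouped model catalog.
interface CatalogSection {
  title: string;
  modelIds: string[];
}

// liveModels maps a provider name to the model names it currently reports.
// storedFavorites is the persisted list of composite IDs; it is only read
// here, never pruned, which is what gives hide-not-delete behavior.
function buildCatalog(
  liveModels: Record<string, string[]>,
  storedFavorites: string[],
): CatalogSection[] {
  const available: Record<string, boolean> = {};
  const providerSections: CatalogSection[] = [];

  for (const provider of Object.keys(liveModels)) {
    const ids = liveModels[provider].map((model) => `${provider}/${model}`);
    for (const id of ids) available[id] = true;
    // Favorited models stay in their provider group as well.
    providerSections.push({ title: provider, modelIds: ids });
  }

  // Hide favorites whose provider or model is not currently live.
  const visibleFavorites = storedFavorites.filter((id) => available[id] === true);

  const favorites: CatalogSection = { title: "Favorites", modelIds: visibleFavorites };
  return [favorites].concat(providerSections);
}
```

Because the stored list is never modified, a favorite that points at an offline provider reappears as soon as that provider shows up in `liveModels` again, with no cleanup or restore step.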

## Non-goals

- Replacing the existing ACP provider registry in `data/coder-providers.json`
- Introducing llama-swap peer federation or LiteLLM as an aggregation layer
- Adding full-text search, tags, or admin curation to the pickers in this batch
- Cleaning up stale favorites automatically
- Reworking session/chat schema columns from `TEXT` to structured provider fields

## Success Criteria

- `GET /api/models` returns a provider-aware catalog that can distinguish
  duplicate model names across hosts.
- Existing sessions/chats that store a bare model ID still work, resolving to
  the default local provider without data migration.
- `embedding/deepseek-r1-qwen3-8b` never routes to DeepSeek cloud and never
  receives the fake static 131k context window.
- Requests for `embedding/*` models never go through llama-sidecar.
- BooChat and BooCoder both render a Favorites section first, then provider
  groups, and a favorited model remains visible in its provider group.
- A favorite for an offline provider disappears from the visible list but
  returns automatically when that provider comes back.
- Arena, compaction, task-model, and model-context all resolve the same
  provider/model pair consistently.
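
Several of these criteria come down to one shared resolution step that every call site uses. The sketch below shows one possible shape for it. `ProviderConfig`, `Route`, `resolveRoute`, the `useSidecar` flag, the `http://` scheme, and the choice of `sam-desktop` as the default local provider are all assumptions for illustration; the DeepSeek cloud provider is left out entirely.

```typescript
// Hypothetical per-provider config; sidecar use is a provider attribute.
interface ProviderConfig {
  baseUrl: string; // scheme assumed to be http for this sketch
  useSidecar: boolean;
}

const REGISTRY: Record<string, ProviderConfig> = {
  "sam-desktop": { baseUrl: "http://100.101.41.16:8401", useSidecar: true },
  embedding: { baseUrl: "http://100.90.172.55:8411", useSidecar: false },
};

// Assumption: which provider is "the default local provider" is not stated
// in this proposal; sam-desktop is used here only as a placeholder.
const DEFAULT_PROVIDER = "sam-desktop";

interface Route {
  provider: string;
  model: string;
  baseUrl: string;
  viaSidecar: boolean;
}

// Resolve a stored model ID to a route. The decision is driven only by the
// provider prefix, never by substrings of the model name, so a local model
// whose name contains "deepseek" stays on its local host.
function resolveRoute(id: string): Route {
  const slash = id.indexOf("/");
  const prefix = slash > 0 ? id.slice(0, slash) : "";
  const known = Object.prototype.hasOwnProperty.call(REGISTRY, prefix);

  // Legacy bare IDs (no registered prefix) fall back to the default provider.
  const provider = known ? prefix : DEFAULT_PROVIDER;
  const model = known ? id.slice(slash + 1) : id;
  const config = REGISTRY[provider];

  return { provider, model, baseUrl: config.baseUrl, viaSidecar: config.useSidecar };
}
```

If arena, compaction, task-model, and model-context all call a single resolver like this instead of reading one upstream URL directly, they cannot disagree about which host a given ID refers to.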

## Deliverables

| Doc | Purpose |
|-----|---------|
| [`design.md`](./design.md) | Registry shape, model identity rules, routing, UX, rollout |
| [`tasks.md`](./tasks.md) | Ordered implementation and verification checklist |