chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean). wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes. openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
@@ -0,0 +1,73 @@
# multi-llama-swap-providers-model-favorites

## Why

BooCode still treats local inference as a single `LLAMA_SWAP_URL`, but the
actual setup is already a fleet:

- `sam-desktop` at `100.101.41.16:8401`
- `embedding` at `100.90.172.55:8411`
- optional DeepSeek cloud models when `DEEPSEEK_API_KEY` is set

The current model identity is only a bare model string, which is no longer
safe. Five model IDs already exist on both llama-swap hosts, the seeded
`DEFAULT_MODEL` has already drifted out of the live list once, and multiple
server/coder call sites still hardcode a single upstream.

The research in
`docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md`
validated one direction:

1. Introduce a named provider registry.
2. Store selected models as composite IDs: `provider/model`.
3. Group pickers by provider with a Favorites section first.
4. Persist favorites server-side so BooChat and BooCoder share them.
5. Remove single-endpoint assumptions from routing, context lookup,
   compaction, arena, and coder dispatch.

This batch is also the prerequisite named in `openspec/changes/boocontrol/`.
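
As a rough illustration of the composite `provider/model` identity from item 2 of the list above, the sketch below parses a stored model string and falls back to a default provider for legacy bare IDs. Everything named here (`ModelRef`, `parseModelId`, `formatModelId`, `DEFAULT_PROVIDER`) is hypothetical, and treating `sam-desktop` as the default local provider is an assumption; the real rules belong to `design.md`.

```typescript
// Hypothetical names throughout; none of these come from the BooCode source.
interface ModelRef {
  provider: string;
  model: string;
  legacy: boolean; // true when the stored value was a bare model ID
}

// Assumption: legacy bare IDs resolve to this default local provider.
const DEFAULT_PROVIDER = "sam-desktop";

function parseModelId(
  id: string,
  knownProviders: ReadonlySet<string>,
  defaultProvider: string = DEFAULT_PROVIDER,
): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    const rest = id.slice(slash + 1);
    // Only treat the prefix as a provider when it is registered. A bare ID
    // that happens to contain "/" (for example "org/model") stays legacy.
    if (rest.length > 0 && knownProviders.has(prefix)) {
      return { provider: prefix, model: rest, legacy: false };
    }
  }
  return { provider: defaultProvider, model: id, legacy: true };
}

// Serialize back to the composite form used by pickers, caches, and routing.
function formatModelId(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}
```

Checking the prefix against the registered provider names, instead of splitting blindly on the first slash, is one way to keep existing sessions readable without a data migration.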

## What Changes

1. Add a shared provider-registry config for local model providers.
2. Replace bare model identity with composite `provider/model` IDs at the API,
   picker, cache, and routing layers while keeping legacy bare IDs readable.
3. Convert the server model catalog from a flat list into grouped provider
   sections with favorites surfaced first.
4. Make sidecar routing an attribute of the `sam-desktop` provider instead of
   a global default for all non-DeepSeek traffic.
5. Update BooCoder's llama-swap namespace bridge so composite IDs still
   dispatch through opencode correctly.
6. Add server-side favorite persistence in `settings` with hide-not-delete
   behavior for unavailable models.
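
To make items 1 and 4 above more concrete, here is a minimal sketch of a provider registry in which sidecar routing is a per-provider flag. The field names (`kind`, `baseUrl`, `useSidecar`), the `resolveRoute` helper, and the `http://` scheme on the host addresses are all assumptions for illustration; the actual registry shape is defined in `design.md`.

```typescript
// Hypothetical registry shape; field names are assumptions.
interface ProviderConfig {
  name: string;
  kind: "llama-swap" | "cloud";
  baseUrl: string;
  useSidecar: boolean; // sidecar routing is decided per provider
}

// Host and port come from the proposal; the http scheme is assumed.
const REGISTRY: readonly ProviderConfig[] = [
  { name: "sam-desktop", kind: "llama-swap", baseUrl: "http://100.101.41.16:8401", useSidecar: true },
  { name: "embedding", kind: "llama-swap", baseUrl: "http://100.90.172.55:8411", useSidecar: false },
];

interface Route {
  baseUrl: string;
  model: string;
  viaSidecar: boolean;
}

// Routing is keyed on the provider name only. The model name is passed
// through untouched, so a model called "deepseek-..." on a local host is
// never mistaken for a cloud model.
function resolveRoute(
  provider: string,
  model: string,
  registry: readonly ProviderConfig[] = REGISTRY,
): Route {
  for (const cfg of registry) {
    if (cfg.name === provider) {
      return { baseUrl: cfg.baseUrl, model, viaSidecar: cfg.useSidecar };
    }
  }
  throw new Error(`unknown provider: ${provider}`);
}
```

Under these assumptions, `resolveRoute("embedding", "deepseek-r1-qwen3-8b")` yields the `embedding` host with `viaSidecar: false`, which matches the routing behavior the proposal asks for.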

## Non-goals

- Replacing the existing ACP provider registry in `data/coder-providers.json`
- Introducing llama-swap peer federation or LiteLLM as an aggregation layer
- Adding full-text search, tags, or admin curation to the pickers in this batch
- Cleaning up stale favorites automatically
- Reworking session/chat schema columns from `TEXT` to structured provider fields
## Success Criteria

- `GET /api/models` returns a provider-aware catalog that can distinguish
  duplicate model names across hosts.
- Existing sessions/chats that store a bare model ID still work, resolving to
  the default local provider without data migration.
- `embedding/deepseek-r1-qwen3-8b` never routes to DeepSeek cloud and never
  receives the fake static 131k context window.
- Requests for `embedding/*` models never go through llama-sidecar.
- BooChat and BooCoder both render a Favorites section first, then provider
  groups, and a favorited model still remains visible in its provider group.
- A favorite for an offline provider disappears from the visible list but
  returns automatically when that provider comes back.
- Arena, compaction, task-model, and model-context all resolve the same
  provider/model pair consistently.
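
The picker criteria above (Favorites first, provider groups after, hide-not-delete for offline favorites) could be sketched roughly as follows. The types and the `buildCatalog` function are hypothetical, and the real `GET /api/models` payload may look different.

```typescript
// Hypothetical types; the real catalog payload may differ.
interface CatalogModel {
  id: string; // composite "provider/model"
  provider: string;
  model: string;
}

interface CatalogSection {
  title: string;
  models: CatalogModel[];
}

function buildCatalog(
  live: readonly CatalogModel[], // models reported by reachable providers
  favorites: readonly string[], // persisted composite IDs, never pruned here
): CatalogSection[] {
  const byId = new Map<string, CatalogModel>();
  for (const m of live) byId.set(m.id, m);

  // Hide, do not delete: a favorite with no live model is skipped in the
  // output, but the persisted favorites list is left untouched.
  const favModels: CatalogModel[] = [];
  for (const id of favorites) {
    const m = byId.get(id);
    if (m) favModels.push(m);
  }

  // Group every live model by provider, including favorited ones, so a
  // favorite still shows up inside its own provider group.
  const groups = new Map<string, CatalogModel[]>();
  for (const m of live) {
    const list = groups.get(m.provider) ?? [];
    list.push(m);
    groups.set(m.provider, list);
  }

  const sections: CatalogSection[] = [];
  if (favModels.length > 0) {
    sections.push({ title: "Favorites", models: favModels });
  }
  groups.forEach((models, provider) => {
    sections.push({ title: provider, models });
  });
  return sections;
}
```

Because the function only filters at render time and never rewrites `favorites`, a favorite for an offline provider reappears as soon as that provider's models are back in `live`.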

## Deliverables

| Doc | Purpose |
|-----|---------|
| [`design.md`](./design.md) | Registry shape, model identity rules, routing, UX, rollout |
| [`tasks.md`](./tasks.md) | Ordered implementation and verification checklist |