chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00
parent 0ed506f1da
commit b18de2a331
204 changed files with 25344 additions and 867 deletions


@@ -0,0 +1,73 @@
# multi-llama-swap-providers-model-favorites
## Why
BooCode still treats local inference as a single `LLAMA_SWAP_URL`, but the
actual setup is already a fleet:
- `sam-desktop` at `100.101.41.16:8401`
- `embedding` at `100.90.172.55:8411`
- optional DeepSeek cloud models when `DEEPSEEK_API_KEY` is set

The current model identity is only a bare model string, which is no longer
safe. Five model IDs already exist on both llama-swap hosts, the seeded
`DEFAULT_MODEL` has already drifted out of the live list once, and multiple
server/coder call sites still hardcode a single upstream.

The research in
`docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md`
validated one direction:
1. Introduce a named provider registry.
2. Store selected models as composite IDs: `provider/model`.
3. Group pickers by provider with a Favorites section first.
4. Persist favorites server-side so BooChat and BooCoder share them.
5. Remove single-endpoint assumptions from routing, context lookup,
compaction, arena, and coder dispatch.

This batch is also the prerequisite named in `openspec/changes/boocontrol/`.
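
As a rough illustration of the composite ID rule in the direction above, the sketch below parses `provider/model` strings and falls back to a default provider for legacy bare IDs. The `ProviderConfig` shape, the `parseModelRef` and `formatModelRef` helpers, the `http://` scheme, and the choice of `sam-desktop` as the default provider are all assumptions made for illustration; the actual registry shape is defined in `design.md`.

```typescript
// Hypothetical registry shape; field names are illustrative only.
interface ProviderConfig {
  baseUrl: string;
}

type ProviderRegistry = Record<string, ProviderConfig>;

interface ModelRef {
  provider: string;
  model: string;
}

// Hosts come from the "Why" section; the scheme is an assumption.
const registry: ProviderRegistry = {
  "sam-desktop": { baseUrl: "http://100.101.41.16:8401" },
  embedding: { baseUrl: "http://100.90.172.55:8411" },
};

// Assumption: legacy bare IDs resolve to this provider.
const DEFAULT_PROVIDER = "sam-desktop";

function parseModelRef(id: string, reg: ProviderRegistry, fallback: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    // Treat the prefix as a provider only when it is registered, so a bare
    // model name that happens to contain "/" is left intact.
    if (Object.prototype.hasOwnProperty.call(reg, prefix)) {
      return { provider: prefix, model: id.slice(slash + 1) };
    }
  }
  // Legacy bare ID: keep the string as the model and use the fallback provider.
  return { provider: fallback, model: id };
}

function formatModelRef(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}
```

Checking the prefix against the registry, rather than splitting blindly on the first `/`, is one way to keep a bare model name that itself contains a slash from being misread as a composite ID.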
## What Changes
1. Add a shared provider-registry config for local model providers.
2. Replace bare model identity with composite `provider/model` IDs at the API,
picker, cache, and routing layers while keeping legacy bare IDs readable.
3. Convert the server model catalog from a flat list into grouped provider
sections with favorites surfaced first.
4. Make sidecar routing an attribute of the `sam-desktop` provider instead of
a global default for all non-DeepSeek traffic.
5. Update BooCoder's llama-swap namespace bridge so composite IDs still
dispatch through opencode correctly.
6. Add server-side favorite persistence in `settings` with hide-not-delete
behavior for unavailable models.
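
The grouped catalog and the hide-not-delete rule for favorites could look roughly like the following. This is a minimal sketch under stated assumptions: `CatalogSection`, `buildCatalog`, and the `qwen-a` model name in the usage note are hypothetical, and the real response shape of `GET /api/models` is not specified here.

```typescript
interface CatalogSection {
  title: string;
  models: string[]; // composite "provider/model" IDs
}

// liveModels maps a provider name to the model names it currently reports.
// An offline provider is simply missing from the object.
function buildCatalog(
  liveModels: Record<string, string[]>,
  favorites: string[],
): CatalogSection[] {
  const available: Record<string, boolean> = {};
  const providerSections = Object.keys(liveModels).map((provider): CatalogSection => {
    const ids = liveModels[provider].map((m) => `${provider}/${m}`);
    ids.forEach((id) => {
      available[id] = true;
    });
    return { title: provider, models: ids };
  });
  // Hide, don't delete: the stored favorites array is never mutated;
  // unavailable entries are only filtered out of the rendered section.
  const visibleFavorites = favorites.filter((id) => available[id] === true);
  const favoritesSection: CatalogSection = { title: "Favorites", models: visibleFavorites };
  return [favoritesSection].concat(providerSections);
}
```

With favorites `["embedding/qwen-a", "sam-desktop/qwen-a"]` and only `sam-desktop` online, the Favorites section would list just `sam-desktop/qwen-a`, while the stored list keeps both entries so the `embedding` favorite reappears once that provider is back.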
## Non-goals
- Replacing the existing ACP provider registry in `data/coder-providers.json`
- Introducing llama-swap peer federation or LiteLLM as an aggregation layer
- Adding full-text search, tags, or admin curation to the pickers in this batch
- Cleaning up stale favorites automatically
- Reworking session/chat schema columns from `TEXT` to structured provider fields
## Success Criteria
- `GET /api/models` returns a provider-aware catalog that can distinguish
duplicate model names across hosts.
- Existing sessions/chats that store a bare model ID still work, resolving to
the default local provider without data migration.
- `embedding/deepseek-r1-qwen3-8b` never routes to DeepSeek cloud and never
receives the fake static 131k context window.
- Requests for `embedding/*` models never go through llama-sidecar.
- BooChat and BooCoder both render a Favorites section first, then provider
groups, and a favorited model remains visible in its provider group.
- A favorite for an offline provider disappears from the visible list but
returns automatically when that provider comes back.
- Arena, compaction, task-model, and model-context all resolve the same
provider/model pair consistently.
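
One way to read the routing criteria above is as a single resolver keyed on the provider rather than on the model name. The sketch below is illustrative only: `ProviderEntry`, `resolveRoute`, the `viaSidecar` flag, the `deepseek` provider name, the `http://` scheme for the local hosts, and reading "131k" as 131072 tokens are all assumptions, not the design in `design.md`.

```typescript
type ProviderKind = "llama-swap" | "deepseek-cloud";

interface ProviderEntry {
  kind: ProviderKind;
  baseUrl: string;
  viaSidecar: boolean; // sidecar is a per-provider attribute, not a global default
}

interface Route {
  upstream: string;
  model: string;
  viaSidecar: boolean;
  staticContextWindow: number | null; // null means "ask the provider"
}

// Assumption: the static "131k" window is taken to mean 131072 tokens.
const DEEPSEEK_STATIC_CONTEXT = 131072;

// Hypothetical registry; local hosts come from the "Why" section.
const providers: Record<string, ProviderEntry> = {
  "sam-desktop": { kind: "llama-swap", baseUrl: "http://100.101.41.16:8401", viaSidecar: true },
  embedding: { kind: "llama-swap", baseUrl: "http://100.90.172.55:8411", viaSidecar: false },
  deepseek: { kind: "deepseek-cloud", baseUrl: "https://api.deepseek.com", viaSidecar: false },
};

function resolveRoute(provider: string, model: string): Route {
  if (!Object.prototype.hasOwnProperty.call(providers, provider)) {
    throw new Error(`unknown provider: ${provider}`);
  }
  const entry = providers[provider];
  return {
    upstream: entry.baseUrl,
    model,
    viaSidecar: entry.viaSidecar,
    // Decided by provider kind, never by whether the model name contains "deepseek".
    staticContextWindow: entry.kind === "deepseek-cloud" ? DEEPSEEK_STATIC_CONTEXT : null,
  };
}
```

If arena, compaction, task-model, and model-context all went through one resolver like this, a request for `embedding/deepseek-r1-qwen3-8b` would reach the `embedding` host directly, skip the sidecar, and carry no static context window.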
## Deliverables
| Doc | Purpose |
|-----|---------|
| [`design.md`](./design.md) | Registry shape, model identity rules, routing, UX, rollout |
| [`tasks.md`](./tasks.md) | Ordered implementation and verification checklist |