
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).

2026-06-14 12:48:47 +00:00


multi-llama-swap-providers-model-favorites

Why

BooCode still treats local inference as a single LLAMA_SWAP_URL, but the actual setup is already a fleet:

sam-desktop at 100.101.41.16:8401
embedding at 100.90.172.55:8411
optional DeepSeek cloud models when DEEPSEEK_API_KEY is set

Model identity is currently a bare model string, which is no longer safe. Five model IDs already exist on both llama-swap hosts, so a bare ID cannot say which host should serve it. The seeded DEFAULT_MODEL has already drifted out of the live list once, and multiple server/coder call sites still hardcode a single upstream.

The research in docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md validated one direction:

Introduce a named provider registry.
Store selected models as composite IDs: provider/model.
Group pickers by provider with a Favorites section first.
Persist favorites server-side so BooChat and BooCoder share them.
Remove single-endpoint assumptions from routing, context lookup, compaction, arena, and coder dispatch.
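
The composite-ID rule above could be sketched roughly as follows. This is a minimal illustration, not the shape defined in design.md: `ProviderConfig`, `ModelRef`, `PROVIDERS`, `DEFAULT_PROVIDER`, `parseModelId`, and `formatModelId` are hypothetical names, the `http://` scheme is assumed (the proposal only lists host and port), and treating sam-desktop as the default local provider is also an assumption.

```typescript
// Hypothetical registry entry; the real shape is left to design.md.
interface ProviderConfig {
  baseUrl: string;
}

// A resolved model identity: which provider, and which model on it.
interface ModelRef {
  provider: string;
  model: string;
}

// Hosts and ports come from the proposal; the scheme is an assumption.
const PROVIDERS: Record<string, ProviderConfig> = {
  "sam-desktop": { baseUrl: "http://100.101.41.16:8401" },
  "embedding": { baseUrl: "http://100.90.172.55:8411" },
};

// Assumption: legacy bare IDs resolve to this provider.
const DEFAULT_PROVIDER = "sam-desktop";

// Parse "provider/model". Anything whose prefix is not a registered
// provider is treated as a legacy bare ID, so model names that
// themselves contain "/" are still readable.
function parseModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    if (Object.prototype.hasOwnProperty.call(PROVIDERS, prefix)) {
      return { provider: prefix, model: id.slice(slash + 1) };
    }
  }
  return { provider: DEFAULT_PROVIDER, model: id };
}

// Inverse of parseModelId for IDs that carry a provider prefix.
function formatModelId(ref: ModelRef): string {
  return ref.provider + "/" + ref.model;
}
```

Splitting on the first slash only when the prefix matches a registered provider is one way to keep legacy bare IDs readable without a data migration; other resolution rules are possible.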

This batch is also the prerequisite named in openspec/changes/boocontrol/.

What Changes

Add a shared provider-registry config for local model providers.
Replace bare model identity with composite provider/model IDs at the API, picker, cache, and routing layers while keeping legacy bare IDs readable.
Convert the server model catalog from a flat list into grouped provider sections with favorites surfaced first.
Make sidecar routing an attribute of the sam-desktop provider instead of a global default for all non-DeepSeek traffic.
Update BooCoder's llama-swap namespace bridge so composite IDs still dispatch through opencode correctly.
Add server-side favorite persistence in settings with hide-not-delete behavior for unavailable models.
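
As a rough sketch of the grouped catalog and the hide-not-delete favorites behavior described above: the function below builds a Favorites section followed by one section per provider. `CatalogEntry`, `CatalogSection`, and `buildCatalog` are hypothetical names, and the input shapes (a map of live models per provider, plus the stored favorites list) are assumptions made for illustration.

```typescript
interface CatalogEntry {
  id: string; // composite "provider/model" ID
  provider: string;
  model: string;
}

interface CatalogSection {
  title: string;
  entries: CatalogEntry[];
}

// liveModels: models currently reported by each reachable provider.
// favoriteIds: the persisted favorites, which this function never mutates.
function buildCatalog(
  liveModels: Record<string, string[]>,
  favoriteIds: string[],
): CatalogSection[] {
  const byId: Record<string, CatalogEntry> = {};

  // One section per provider, in registry order.
  const providerSections: CatalogSection[] = Object.keys(liveModels).map(
    (provider) => {
      const entries = liveModels[provider].map((model) => {
        const entry: CatalogEntry = { id: provider + "/" + model, provider, model };
        byId[entry.id] = entry;
        return entry;
      });
      return { title: provider, entries };
    },
  );

  // Hide-not-delete: a favorite that is not live is skipped here but
  // stays in favoriteIds, so it reappears once its provider is back.
  const favorites: CatalogEntry[] = [];
  for (const id of favoriteIds) {
    if (Object.prototype.hasOwnProperty.call(byId, id)) {
      favorites.push(byId[id]);
    }
  }

  return [{ title: "Favorites", entries: favorites }, ...providerSections];
}
```

Because the stored list is only filtered at read time, nothing needs to clean up stale favorites, which matches the non-goal listed below.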

Non-goals

Replacing the existing ACP provider registry in data/coder-providers.json
Introducing llama-swap peer federation or LiteLLM as an aggregation layer
Adding full-text search, tags, or admin curation to the pickers in this batch
Cleaning up stale favorites automatically
Reworking session/chat schema columns from TEXT to structured provider fields

Success Criteria

GET /api/models returns a provider-aware catalog that can distinguish duplicate model names across hosts.
Existing sessions/chats that store a bare model ID still work, resolving to the default local provider without data migration.
embedding/deepseek-r1-qwen3-8b never routes to DeepSeek cloud and never receives the fake static 131k context window.
Requests for embedding/* models never go through llama-sidecar.
BooChat and BooCoder both render a Favorites section first, then provider groups, and a favorited model remains visible in its provider group.
A favorite for an offline provider disappears from the visible list but returns automatically when that provider comes back.
Arena, compaction, task-model, and model-context all resolve the same provider/model pair consistently.
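
The routing criteria above can be illustrated with a single resolver that every call site shares. This is a sketch under stated assumptions, not the planned implementation: `Provider`, `Route`, `resolveRoute`, the `sidecarUrl` field, the sidecar address, and the DeepSeek base URL are all hypothetical or illustrative. The idea is that the upstream is chosen only from the provider half of the ID, so a model name containing "deepseek" cannot reach the cloud provider by accident.

```typescript
interface Provider {
  kind: "llama-swap" | "deepseek-cloud";
  baseUrl: string;
  // Only set on providers that should be fronted by llama-sidecar.
  sidecarUrl?: string;
}

interface Route {
  provider: string;
  model: string;
  upstream: string;
  viaSidecar: boolean;
}

// Illustrative registry. The sidecar address and the DeepSeek URL are
// assumptions; the llama-swap hosts come from the proposal.
const ROUTING_PROVIDERS: Record<string, Provider> = {
  "sam-desktop": {
    kind: "llama-swap",
    baseUrl: "http://100.101.41.16:8401",
    sidecarUrl: "http://127.0.0.1:9000",
  },
  "embedding": { kind: "llama-swap", baseUrl: "http://100.90.172.55:8411" },
  "deepseek": { kind: "deepseek-cloud", baseUrl: "https://api.deepseek.com" },
};

// Resolve a provider/model pair to an upstream. The model name is never
// inspected, so routing depends on the provider alone.
function resolveRoute(
  provider: string,
  model: string,
  providers: Record<string, Provider> = ROUTING_PROVIDERS,
): Route {
  if (!Object.prototype.hasOwnProperty.call(providers, provider)) {
    throw new Error("unknown provider: " + provider);
  }
  const p = providers[provider];
  return {
    provider,
    model,
    upstream: p.sidecarUrl ?? p.baseUrl,
    viaSidecar: p.sidecarUrl !== undefined,
  };
}
```

If arena, compaction, task-model, and model-context all call one resolver like this, they cannot disagree about where a given provider/model pair goes.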

Deliverables

| Doc | Purpose |
| --- | --- |
| `design.md` | Registry shape, model identity rules, routing, UX, rollout |
| `tasks.md` | Ordered implementation and verification checklist |
