boocode/openspec/changes/multi-llama-swap-providers-model-favorites/proposal.md


multi-llama-swap-providers-model-favorites

Why

BooCode still treats local inference as a single LLAMA_SWAP_URL, but the actual setup is already a fleet:

  • sam-desktop at 100.101.41.16:8401
  • embedding at 100.90.172.55:8411
  • optional DeepSeek cloud models when DEEPSEEK_API_KEY is set

Model identity is currently a bare model string, which is no longer unambiguous: five model IDs already exist on both llama-swap hosts. In addition, the seeded DEFAULT_MODEL has already drifted out of the live list once, and multiple server and coder call sites still hardcode a single upstream.

The research in docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md validated one direction:

  1. Introduce a named provider registry.
  2. Store selected models as composite IDs: provider/model.
  3. Group pickers by provider with a Favorites section first.
  4. Persist favorites server-side so BooChat and BooCoder share them.
  5. Remove single-endpoint assumptions from routing, context lookup, compaction, arena, and coder dispatch.
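
As a rough illustration of point 2, a composite ID could be parsed along the following lines. This is a minimal sketch, not the project's code: `ModelRef`, `parseModelId`, `formatModelId`, the `deepseek` provider name, and the choice of `sam-desktop` as the default provider for legacy bare IDs are all assumptions made for the example.

```typescript
// Hypothetical types and names; the proposal does not specify them.
interface ModelRef {
  provider: string;
  model: string;
}

// Assumption: legacy bare IDs resolve to this provider.
const DEFAULT_PROVIDER = "sam-desktop";

// Assumption: the set of registered provider names.
const KNOWN_PROVIDERS = new Set(["sam-desktop", "embedding", "deepseek"]);

// Split "provider/model" on the first slash, but only when the prefix is a
// registered provider. Anything else is treated as a legacy bare model ID.
function parseModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const provider = id.slice(0, slash);
    if (KNOWN_PROVIDERS.has(provider)) {
      return { provider, model: id.slice(slash + 1) };
    }
  }
  // Bare ID, or a model name that happens to contain "/": keep it whole.
  return { provider: DEFAULT_PROVIDER, model: id };
}

function formatModelId(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}
```

Checking the prefix against the registry, rather than splitting on any slash, is one way to keep model names that themselves contain a slash readable as bare IDs.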

This batch is also the prerequisite named in openspec/changes/boocontrol/.

What Changes

  1. Add a shared provider-registry config for local model providers.
  2. Replace bare model identity with composite provider/model IDs at the API, picker, cache, and routing layers while keeping legacy bare IDs readable.
  3. Convert the server model catalog from a flat list into grouped provider sections with favorites surfaced first.
  4. Make sidecar routing an attribute of the sam-desktop provider instead of a global default for all non-DeepSeek traffic.
  5. Update BooCoder's llama-swap namespace bridge so composite IDs still dispatch through opencode correctly.
  6. Add server-side favorite persistence in settings with hide-not-delete behavior for unavailable models.
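
Items 1, 3, and 6 can be sketched together as follows. The shape below is only an assumption for illustration: `ProviderConfig`, `CatalogSection`, `buildCatalog`, the `useSidecar` field, and the `http://` scheme on the host addresses are hypothetical, and the real registry shape belongs to design.md.

```typescript
// Hypothetical registry entry; field names are assumptions.
interface ProviderConfig {
  name: string;
  baseUrl: string;
  useSidecar: boolean; // sidecar routing as a per-provider attribute (item 4)
}

// Hosts are taken from the proposal; the http:// scheme is assumed.
const PROVIDERS: ProviderConfig[] = [
  { name: "sam-desktop", baseUrl: "http://100.101.41.16:8401", useSidecar: true },
  { name: "embedding", baseUrl: "http://100.90.172.55:8411", useSidecar: false },
];

interface CatalogSection {
  title: string;
  models: string[]; // composite provider/model IDs
}

// `available` maps a provider name to the bare model names it currently
// reports; a provider that is offline is simply absent from the map.
function buildCatalog(
  available: Record<string, string[]>,
  favorites: string[],
): CatalogSection[] {
  const providerSections: CatalogSection[] = PROVIDERS
    .filter((p) => available[p.name] !== undefined)
    .map((p) => ({
      title: p.name,
      models: available[p.name].map((m) => `${p.name}/${m}`),
    }));

  // Hide-not-delete: unavailable favorites are filtered from the view only.
  // The stored `favorites` array is never modified here.
  const live = new Set(providerSections.flatMap((s) => s.models));
  const visibleFavorites = favorites.filter((id) => live.has(id));

  return [{ title: "Favorites", models: visibleFavorites }, ...providerSections];
}
```

Because the favorites list is filtered at read time rather than pruned, a favorite on an offline provider reappears as soon as that provider is present in `available` again.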

Non-goals

  • Replacing the existing ACP provider registry in data/coder-providers.json
  • Introducing llama-swap peer federation or LiteLLM as an aggregation layer
  • Adding full-text search, tags, or admin curation to the pickers in this batch
  • Cleaning up stale favorites automatically
  • Reworking session/chat schema columns from TEXT to structured provider fields

Success Criteria

  • GET /api/models returns a provider-aware catalog that can distinguish duplicate model names across hosts.
  • Existing sessions/chats that store a bare model ID still work, resolving to the default local provider without data migration.
  • embedding/deepseek-r1-qwen3-8b never routes to DeepSeek cloud and never receives the fake static 131k context window.
  • Requests for embedding/* models never go through llama-sidecar.
  • BooChat and BooCoder both render a Favorites section first, then provider groups. A favorited model also remains visible in its provider group.
  • A favorite for an offline provider disappears from the visible list but returns automatically when that provider comes back.
  • Arena, compaction, task-model, and model-context all resolve the same provider/model pair consistently.
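
The routing criteria above could be checked against a resolver that decides purely on the provider part of the ID and never on the model name. The following is a hedged sketch only: `resolveRoute`, `Route`, `LOCAL_PROVIDERS`, the `deepseek` provider name, the `useSidecar` field, the default provider, and the `http://` scheme are all assumptions, not the project's actual API.

```typescript
interface LocalProvider {
  baseUrl: string;
  useSidecar: boolean;
}

// Hosts are from the proposal; scheme and field names are assumed.
const LOCAL_PROVIDERS: Record<string, LocalProvider> = {
  "sam-desktop": { baseUrl: "http://100.101.41.16:8401", useSidecar: true },
  embedding: { baseUrl: "http://100.90.172.55:8411", useSidecar: false },
};

const CLOUD_PROVIDER = "deepseek"; // assumed provider name for DeepSeek cloud
const DEFAULT_PROVIDER = "sam-desktop"; // assumed default for legacy bare IDs

type Route =
  | { kind: "local"; provider: string; model: string; baseUrl: string; viaSidecar: boolean }
  | { kind: "cloud"; provider: string; model: string };

function resolveRoute(id: string): Route {
  const slash = id.indexOf("/");
  const prefix = slash > 0 ? id.slice(0, slash) : "";

  // Only an explicit cloud provider prefix routes to the cloud. A model whose
  // *name* starts with "deepseek" on a local provider stays local.
  if (prefix === CLOUD_PROVIDER) {
    return { kind: "cloud", provider: prefix, model: id.slice(slash + 1) };
  }

  const isLocal = Object.prototype.hasOwnProperty.call(LOCAL_PROVIDERS, prefix);
  const provider = isLocal ? prefix : DEFAULT_PROVIDER;
  const model = isLocal ? id.slice(slash + 1) : id; // bare legacy ID kept whole
  const cfg = LOCAL_PROVIDERS[provider];
  return { kind: "local", provider, model, baseUrl: cfg.baseUrl, viaSidecar: cfg.useSidecar };
}
```

If arena, compaction, task-model, and model-context all called one resolver of this kind, they would agree on the provider/model pair by construction, and a static cloud context window would only ever be applied to routes of kind `cloud`.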

Deliverables

Doc        Purpose
design.md  Registry shape, model identity rules, routing, UX, rollout
tasks.md   Ordered implementation and verification checklist