
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).

2026-06-14 12:48:47 +00:00


multi-llama-swap-providers-model-favorites — implementation analysis

Scope compared

Current state: the shipped implementation in apps/server, apps/coder, apps/web, and packages/contracts
Desired state: the behavior described in docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md and the corresponding OpenSpec batch

Purpose: determine the safest and most coherent implementation path before building the feature.

Conclusion

The best implementation path is to treat this as a shared local-model routing subsystem, not as a picker-only UI feature.

That subsystem needs two interfaces:

- An in-process resolver used directly by BooChat and native BooCoder paths.
- A gateway surface for consumers that cannot call the resolver directly and still assume one OpenAI-compatible provider contract.

Without that split, the feature looks straightforward in BooChat but stays architecturally broken in BooCoder because the existing opencode integration collapses provider identity back to one local llama-swap endpoint.

Current-state findings

F-001 — config authority is split

apps/server is driven by LLAMA_SWAP_URL, LLAMA_SIDECAR_URL, and DEFAULT_MODEL.
apps/coder reuses LLAMA_SWAP_URL for local models and has a separate data/coder-providers.json for ACP providers.

Effect: there is no single source of truth for local model providers that both apps can consume.

F-002 — model identity is still a raw string everywhere that matters

sessions.model is TEXT NOT NULL.
chats.model is TEXT.
model-context.ts caches by the raw model string.
multiple dispatchers treat the model as an opaque string and infer behavior from prefixes.

Effect: duplicate model names across hosts cannot be represented safely without composite IDs.
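The collision can be made concrete with a minimal sketch. The cache shape, the provider names (main, embedding), and the context sizes are all made up for illustration; only the raw-string keying and the provider/model composite form come from this analysis.

```typescript
// Illustrative only: two hosts both serve a model called "qwen3.5-9b",
// but (hypothetically) with different context windows.
type ContextInfo = { contextTokens: number };

// Keyed by the raw model string, as model-context.ts is described above.
const rawCache = new Map<string, ContextInfo>();
rawCache.set("qwen3.5-9b", { contextTokens: 32768 }); // learned from host A
rawCache.set("qwen3.5-9b", { contextTokens: 8192 }); // host B silently overwrites host A

// Keyed by a composite "provider/model" ID: both entries survive.
const compositeCache = new Map<string, ContextInfo>();
compositeCache.set("main/qwen3.5-9b", { contextTokens: 32768 });
compositeCache.set("embedding/qwen3.5-9b", { contextTokens: 8192 });

console.log(rawCache.size); // 1
console.log(compositeCache.size); // 2
```

The same overwrite would affect any DB column or lookup table that stores only the bare name.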

F-003 — routing logic is duplicated and heuristic-heavy

BooChat streaming uses upstreamModel() in provider.ts.
non-streaming calls use resolveModelEndpoint().
context lookup bypasses both and fetches LLAMA_SWAP_URL directly.
arena local calls bypass both and hit LLAMA_SWAP_URL directly.

Effect: even after adding a registry, call sites will diverge unless they all share one resolver.

F-004 — favorites are a UI concern backed by shared settings, not a server catalog concern

The settings table is already the right persistence surface.
BooChat already reads/writes server state.
BooCoder currently keeps picker prefs in browser localStorage, but those are provider-specific UI prefs, not a shared favorite-model feature.

Effect: favorites should be stored server-side and derived in the client from /api/settings combined with provider-aware model data.

F-005 — BooCoder has a deeper coupling than the research initially surfaced

The dangerous assumption is not only in dispatcher.ts. It is in the whole opencode local-model bridge:

the snapshot merges local llama models into the opencode provider by prefixing them as llama-swap/<model>
the dispatcher treats bare IDs as llama-swap/<model>
the opencode backend parses provider/model
current host opencode config points every local-model family at a single llama-swap base URL

Effect: translating a provider-qualified ID such as embedding/qwen3.5-9b back to llama-swap/qwen3.5-9b discards the provider and reintroduces the exact ambiguity this batch is trying to remove.

F-006 — Arena is a separate local-model consumer, not just another caller

Arena currently:

builds its "local model" set from one live llama-swap list
classifies local-vs-cloud contestants from that set
performs one-shot local calls directly against LLAMA_SWAP_URL

Effect: arena needs the same provider-aware resolver as BooChat, but it does not need the full BooChat picker/favorites work.

Gap summary

G-001 — no shared local-provider registry

What is missing:

one schema and one loader contract for named local providers consumed by both server and coder

Why it matters:

every downstream fix becomes duplicated if config remains split

G-002 — no canonical model-ref format and parser

What is missing:

a shared provider/model identity format and parse/format helpers

Why it matters:

caches, DB values, routing, and UI rendering cannot stay aligned otherwise
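One possible shape for these helpers is sketched below. The provider/model format is the one this analysis uses; the names ModelRef, parseModelRef, and formatModelRef are hypothetical. The sketch assumes a prefix only counts as a provider when it matches a configured provider name, so that bare IDs which themselves contain a slash are not misread.

```typescript
// Hypothetical shared helpers; names are assumptions, not existing code.
interface ModelRef {
  provider: string; // local provider name, e.g. "embedding"
  model: string; // wire model id sent to the upstream host
}

// Splits on the FIRST slash only, and only trusts the prefix when it names a
// known provider. Everything else is treated as a legacy bare id and assigned
// to the default provider.
function parseModelRef(
  id: string,
  knownProviders: ReadonlySet<string>,
  defaultProvider: string,
): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    const rest = id.slice(slash + 1);
    if (rest.length > 0 && knownProviders.has(prefix)) {
      return { provider: prefix, model: rest };
    }
  }
  return { provider: defaultProvider, model: id };
}

// Inverse of parseModelRef for refs whose provider is a known provider.
function formatModelRef(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}
```

Under these assumptions, parsing qwen3.5-9b with default provider main and formatting the result yields main/qwen3.5-9b, which is the form a cache key or DB value would then store.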

G-003 — no single provider-aware resolver

What is missing:

one shared resolver API for:
- route selection
- base URL selection
- sidecar selection
- wire-model extraction
- context-props endpoint selection

Why it matters:

keeping separate "streaming", "non-streaming", "context", and "arena" resolution paths will re-create subtle bugs
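A single resolver covering the five outputs listed above might look like the following sketch. Every type and function name here is hypothetical. Two assumptions are baked in: context-props lookups go to the same host as inference, and only local routes are modelled (cloud or ACP routes would extend the result type).

```typescript
// Hypothetical resolver sketch; none of these names exist in the codebase.
interface LocalProvider {
  name: string;
  baseUrl: string;
  sidecarUrl?: string;
}

interface ProviderRegistry {
  defaultProvider: string;
  providers: LocalProvider[];
}

interface ResolvedModel {
  route: "local"; // route selection (non-local routes omitted from this sketch)
  provider: string;
  baseUrl: string; // base URL selection
  sidecarUrl: string | undefined; // sidecar selection
  wireModel: string; // wire-model extraction
  contextPropsBaseUrl: string; // context-props endpoint selection (assumed: same host)
}

function resolveModel(id: string, registry: ProviderRegistry): ResolvedModel {
  const byName = new Map(registry.providers.map((p) => [p.name, p] as const));
  const slash = id.indexOf("/");
  const prefix = slash > 0 ? id.slice(0, slash) : "";
  const rest = slash > 0 ? id.slice(slash + 1) : "";
  // Only honour the prefix when it names a configured provider.
  const named = rest.length > 0 ? byName.get(prefix) : undefined;
  const provider = named ?? byName.get(registry.defaultProvider);
  if (!provider) {
    throw new Error(`default provider "${registry.defaultProvider}" is not configured`);
  }
  return {
    route: "local",
    provider: provider.name,
    baseUrl: provider.baseUrl,
    sidecarUrl: provider.sidecarUrl,
    wireModel: named ? rest : id, // legacy bare ids pass through unchanged
    contextPropsBaseUrl: provider.baseUrl,
  };
}
```

Streaming, non-streaming, context, and arena call sites would each call this one function rather than reading LLAMA_SWAP_URL themselves.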

G-004 — no neutral provider-aware catalog contract

What is missing:

a provider-aware model catalog response that exposes providers and models without baking favorites into the server payload

Why it matters:

BooChat and BooCoder both need provider metadata, but favorites are derived from user settings, not from upstream inventory

G-005 — no safe path for opencode local-model parity

What is missing:

either:
- a generated/synced opencode-facing local-model config, or
- a BooCoder-hosted OpenAI-compatible gateway that preserves provider identity under one provider namespace, or
- a deliberate scope cut that removes multi-provider local models from the opencode provider until that bridge exists

Why it matters:

without one of these, the feature is correct in BooChat but falsely advertised in the opencode provider

Recommended architecture

1. Shared local-provider registry

Add a new shared config surface for local inference providers, separate from data/coder-providers.json.

Recommendation:

schema in packages/contracts
live file such as /data/llama-providers.json
fallback synthesis from LLAMA_SWAP_URL and LLAMA_SIDECAR_URL while the file is absent

This keeps ACP provider management and local model provider management as two separate concerns.
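As a rough sketch of the loader contract, assuming a JSON file with a default provider and a list of named providers: the field names, the function name, and the synthesized provider name llama-swap are assumptions, and real schema validation would live in packages/contracts. The environment is passed in as a plain object so the function stays testable.

```typescript
// Hypothetical file shape for /data/llama-providers.json.
interface LocalProviderConfig {
  name: string;
  baseUrl: string;
  sidecarUrl?: string;
}

interface LlamaProvidersFile {
  defaultProvider: string;
  providers: LocalProviderConfig[];
}

type Env = Record<string, string | undefined>;

// fileContents is the raw text of the providers file, or null when it is absent.
function loadLocalProviders(fileContents: string | null, env: Env): LlamaProvidersFile {
  if (fileContents === null) {
    // Fallback synthesis: one provider built from the legacy env vars.
    const baseUrl = env["LLAMA_SWAP_URL"];
    if (!baseUrl) throw new Error("no providers file and LLAMA_SWAP_URL is unset");
    const sidecarUrl = env["LLAMA_SIDECAR_URL"];
    const legacy: LocalProviderConfig = {
      name: "llama-swap",
      baseUrl,
      ...(sidecarUrl ? { sidecarUrl } : {}),
    };
    return { defaultProvider: legacy.name, providers: [legacy] };
  }
  // Minimal consistency checks only; a full schema check is out of scope here.
  const parsed = JSON.parse(fileContents) as LlamaProvidersFile;
  const names = new Set(parsed.providers.map((p) => p.name));
  if (names.size !== parsed.providers.length) throw new Error("duplicate provider names");
  if (!names.has(parsed.defaultProvider)) throw new Error("defaultProvider is not listed in providers");
  return parsed;
}
```

Both apps/server and apps/coder could call the same loader, which is what removes the split config authority described in F-001.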

2. Shared model-ref and resolver helpers

Add shared helpers for:

parsing provider/model
resolving legacy bare IDs to the default provider
deciding route type
selecting upstream base URL
extracting the wire model id

All of these should be used by:

server streaming inference
server non-streaming calls
model-context lookup
arena one-shot local calls
any future control-plane or routing feature

3. Provider-aware catalog, client-derived favorites

Do not make the server return a synthetic Favorites section.

Instead:

/api/models (or a replacement contract) should return provider-grouped inventory only
/api/settings should hold favorite_models: string[]
BooChat and BooCoder should derive:
- Favorites first
- then provider sections
- hide unavailable favorites without deleting them

This keeps the server contract inventory-shaped and the favorite behavior user-shaped.
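The client-side derivation could be as small as the sketch below. The catalog shape, the section type, and the function name are hypothetical; favorite_models is assumed to hold provider/model strings. Whether a favorite should also remain listed under its provider section is not specified above, so this sketch simply keeps it in both places.

```typescript
// Hypothetical shapes: the server returns inventory only, grouped by provider.
interface ProviderModels {
  provider: string;
  models: string[]; // wire model ids as reported by that provider
}

interface PickerSection {
  title: string;
  models: string[]; // composite "provider/model" ids
}

function buildPickerSections(
  catalog: ProviderModels[],
  favoriteModels: string[],
): PickerSection[] {
  const sections: PickerSection[] = [];
  const available = new Set<string>();
  for (const p of catalog) {
    const ids = p.models.map((m) => `${p.provider}/${m}`);
    ids.forEach((id) => available.add(id));
    sections.push({ title: p.provider, models: ids });
  }
  // Unavailable favorites are hidden from the view; the stored list is not modified.
  const visibleFavorites = favoriteModels.filter((id) => available.has(id));
  return visibleFavorites.length > 0
    ? [{ title: "Favorites", models: visibleFavorites }, ...sections]
    : sections;
}
```

Because the function never writes back to favorite_models, a favorite whose host is temporarily offline reappears once that provider lists the model again.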

4. Treat BooCoder native and BooCoder external-agent paths differently

There are two different BooCoder consumers:

native boocode provider
external-agent providers like opencode

The native boocode provider can adopt the shared resolver directly.

The opencode provider cannot safely adopt provider/model by simple string translation, because its current local-model bridge still assumes one local provider.

Recommendation:

ship native boocode provider parity first
do not claim opencode parity until provider identity is preserved end-to-end there too

5. Preferred parity path for opencode: a BooCoder-hosted local-model gateway

If full opencode parity is required in the same initiative, the cleanest path is a small OpenAI-compatible gateway inside apps/coder:

accepts model ids that still carry provider identity
strips provider prefix only at the final upstream boundary
routes to the correct local provider
becomes the single local-model base URL for opencode

Why this is better than adding many direct opencode providers:

one stable provider contract for opencode
no duplicated base-URL registry in opencode config
the same gateway can serve arena/local utility calls later
it stays inside an existing always-on service, not a new third service

If this gateway is not in scope now, the correct fallback is to remove or hide multi-provider local models from the opencode provider until the bridge is real.
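The core of such a gateway is a request rewrite at the upstream boundary. The sketch below shows only that step, with no HTTP server attached. The function and type names are hypothetical, and it assumes the gateway rejects IDs that do not name a configured provider instead of guessing a default.

```typescript
// Hypothetical gateway rewrite step; no HTTP server or fetch call is shown.
interface LocalProvider {
  name: string;
  baseUrl: string;
}

interface UpstreamRequest {
  url: string;
  body: Record<string, unknown>;
}

// Takes an OpenAI-compatible request body whose model still carries provider
// identity, picks the matching provider, and strips the prefix only here.
function rewriteForUpstream(
  path: string,
  body: Record<string, unknown>,
  providers: LocalProvider[],
): UpstreamRequest {
  const model = body["model"];
  if (typeof model !== "string") {
    throw new Error("request body has no string model field");
  }
  const slash = model.indexOf("/");
  const provider =
    slash > 0 ? providers.find((p) => p.name === model.slice(0, slash)) : undefined;
  if (!provider) {
    throw new Error(`model "${model}" does not name a known local provider`);
  }
  return {
    // Trim trailing slashes so baseUrl + path never produces "//".
    url: provider.baseUrl.replace(/\/+$/, "") + path,
    body: { ...body, model: model.slice(slash + 1) },
  };
}
```

With this in place, opencode would see one base URL and one provider namespace, while the model ID it sends (for example embedding/qwen3.5-9b) still decides which host receives the request.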

Recommended sequence

Phase 1 — shared foundation

shared local-provider config schema
shared provider/model parsing helpers
shared resolver
legacy bare-id fallback

Phase 2 — BooChat + native BooCoder

provider-aware model catalog
server inference routing updates
model-context cache-key fix
compaction and task-model endpoint resolution
BooChat picker grouping + server-side favorites
BooCoder boocode provider model list grouped by local provider

Phase 3 — arena parity

local-model set built from the shared provider catalog, not from a single llama-swap list
one-shot local calls use the shared resolver
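For arena, the first item could reduce to something like this sketch, assuming the same hypothetical provider-grouped catalog shape and assuming that any contestant ID not found in the local set counts as cloud. The function names are illustrative.

```typescript
// Hypothetical catalog shape: wire model ids grouped by local provider.
interface ProviderModels {
  provider: string;
  models: string[];
}

// Builds the set of composite "provider/model" ids across ALL local providers.
function buildLocalModelSet(catalog: ProviderModels[]): Set<string> {
  const ids = new Set<string>();
  for (const p of catalog) {
    for (const m of p.models) ids.add(`${p.provider}/${m}`);
  }
  return ids;
}

// Classifies a contestant by membership in the local set (assumed rule).
function classifyContestant(
  id: string,
  localModels: ReadonlySet<string>,
): "local" | "cloud" {
  return localModels.has(id) ? "local" : "cloud";
}
```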

Phase 4 — opencode parity

Choose one:

preferred: BooCoder-hosted local-model gateway plus opencode-facing model sync
fallback: temporarily stop advertising multi-provider local models under the opencode provider

Phase 5 — BooControl

build BooControl only after the local-provider registry and canonical model identity land

What this changes in the existing OpenSpec batch

The design should treat favorites as client-derived from settings, not as a server-generated catalog section.
The design should explicitly separate native BooCoder parity from opencode parity.
The tasks should call out the opencode bridge as a dedicated risk area, not as a small dispatcher rename.

Recommendation

Implement the shared local-provider registry and resolver first, then ship BooChat plus native BooCoder on top of it. Treat opencode multi-provider support as a distinct integration seam that either gets a real gateway or stays out of scope for the first slice.

That is the fastest path that is still architecturally honest.
