---
title: "Multi-Provider Local Models — Build Phase Outline"
source_artifact: "Multiple sources: docs/research/2026-06-10-multi-llama-swap-providers-model-favorites.md; openspec/changes/multi-llama-swap-providers-model-favorites/design.md; openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md"
audience: "mixed"
generated: "2026-06-10"
generated_by: "han.core:plan-a-phased-build"
---

# Multi-Provider Local Models — Build Phase Outline

This document describes the order in which multi-provider local model support will be built. The work is broken into a sequence of phases, where each phase is a thin end-to-end deliverable that can be demonstrated to a real person, and each phase builds on the one before it. The goal is to let BooCode work cleanly with more than one local model machine today and make it straightforward to add more local machines later.

This outline is built from three sources taken together: the research note that identified the routing and identity problems, the OpenSpec batch that defines the intended behavior, and the implementation analysis that tightened the architecture around the harder integration seams. The source material describes what exists today, what the target behavior is, and where the hidden risks are. This document describes the order in which the work should be built so the system reaches that target in a controlled way.

## Table of Contents

- [Executive Summary](#executive-summary)
- [Build Phase Index](#build-phase-index)
- [How This Rollout Differs from the First Draft](#departures)
- [Phase Kinds](#phase-kinds)
- [Build Phases](#build-phases)
  - [Phase 1: Named Provider Inventory](#phase-1)
  - [Phase 2: Multi-Provider BooChat](#phase-2)
  - [Phase 3: Shared Favorites and Grouped Selection](#phase-3)
  - [Phase 4: Native BooCoder Parity](#phase-4)
  - [Phase 5: Multi-Provider Arena](#phase-5)
  - [Phase 6: External-Agent Parity](#phase-6)
  - [Phase 7: Add-a-Machine Operations](#phase-7)
  - [Phase 8 (Deferred): BooControl Fleet Layer](#phase-8)
- [Open Questions](#open-questions)

---

## Executive Summary {#executive-summary}

**The goal:** BooCode should treat local inference as a small fleet instead of a single machine. A user should be able to choose models from multiple local providers, keep favorites across BooChat and BooCoder, run coding and arena workflows against the intended provider, and add another local machine later without reopening the core design.

**The shape of the build:**

- The rollout starts by making provider identity real and visible before any routing changes are hidden behind it.
- BooChat gets multi-provider conversations before the broader coding surfaces, so the first live slice proves the model identity and routing rules end to end.
- Shared favorites and grouped pickers land before the coding parity work so the selection experience stabilizes once and is then reused.
- Native BooCoder and Arena adopt the same provider rules before the harder external-agent bridge is attempted.
- The final live phase turns “two machines supported” into “more machines are routine,” so the work ends in an operationally repeatable state instead of a one-off fix.

**Sequencing rationale, in plain language:** The order starts with the smallest user-visible slice that proves the new mental model: named providers and distinct model identities.
Once that exists, BooChat can safely route real conversations across providers and expose any mistakes early. Only after model identity, routing, and favorites are stable does it make sense to move deeper coding surfaces over, because those surfaces are less forgiving and have more hidden assumptions. The external-agent bridge comes late because it is the one place where a simple rename would look correct but still route to the wrong machine.

**Departures from the source artifact:**

- Favorites are treated as a user-level view derived from shared settings, not as a built-in section of the server’s model inventory.
- Native BooCoder parity comes before external-agent parity, because the external-agent path needs its own provider-preserving bridge.

**Phases deliberately deferred:** BooControl is listed as a deferred final phase because it depends on this registry and identity work but does not need to exist for the multi-provider rollout itself to be complete. Search, richer filtering, and other picker refinements are also intentionally left out of the live phase sequence unless real usage proves they are needed.

**Where to look next:** The [Build Phase Index](#build-phase-index) lists every phase in order. The [departures section](#departures) names the two decisions that shape the rest of the plan. Detailed write-ups follow under [Build Phases](#build-phases). Decisions the team must resolve before the phases they block can start are at [Open Questions](#open-questions).

---

## Build Phase Index {#build-phase-index}

| # | Phase | Kind | Outcome (one sentence) |
|---|---|---|---|
| 1 | [Named Provider Inventory](#phase-1) | Foundation | BooCode can see distinct local providers and distinct model identities. |
| 2 | [Multi-Provider BooChat](#phase-2) | Feature slice | A chat can run on the intended local provider without misrouting. |
| 3 | [Shared Favorites and Grouped Selection](#phase-3) | Feature slice | Favorites are stored once and appear consistently across both chat surfaces. |
| 4 | [Native BooCoder Parity](#phase-4) | Feature slice | Native coding tasks can use the same multi-provider local model pool. |
| 5 | [Multi-Provider Arena](#phase-5) | Feature slice | Arena can compare local models from more than one machine correctly. |
| 6 | [External-Agent Parity](#phase-6) | Feature slice | External coding providers can target local machines without losing provider identity. |
| 7 | [Add-a-Machine Operations](#phase-7) | Polish | Adding another local machine becomes a routine configuration change. |
| 8 | [BooControl Fleet Layer (deferred)](#phase-8) | Deferred | A fleet cockpit can build on the finished provider registry later. |

> Numbers are assigned in build order and are stable for the life of this outline. Cite them as `Phase N` in tickets, comments, and follow-up reports.

---

## How This Rollout Differs from the First Draft {#departures}

The rollout deliberately departs from the first pass of the design in the ways named below. Each departure is summarized once here so the phase write-ups can refer to it by name.

### 1. Favorites are a shared user preference, not part of the provider inventory

The first draft treated favorites as if they belonged inside the model catalog itself. The rollout instead treats them as a shared user preference layered on top of provider inventory. This matters because provider inventory answers “what exists right now,” while favorites answer “what this user prefers across devices and surfaces.”

### 2. External-agent support is a late seam, not part of the first local-model cut

The first draft grouped native and external-agent parity together too early. The rollout separates them because native surfaces can use the new provider resolver directly, while the external-agent path still assumes one local provider behind the scenes. That path needs a real bridge, not a string rewrite.

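
Departure 1 above can be illustrated with a small sketch. It assumes favorites are saved as a list of provider-plus-model pairs and that the live inventory is a mapping from provider name to the models it currently serves. Both shapes, the `ModelRef` type, and the provider and model names are illustrative assumptions, not structures taken from the design.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRef:
    """Hypothetical identity: a provider name plus a model name."""
    provider: str
    model: str


def visible_favorites(
    saved: list[ModelRef],
    inventory: dict[str, set[str]],
) -> list[ModelRef]:
    """Return the saved favorites that exist in the live inventory.

    The saved list is only read, never modified, so a favorite whose
    provider is offline is hidden now and shows up again later.
    Saved (insertion) order is preserved.
    """
    return [
        ref for ref in saved
        if ref.model in inventory.get(ref.provider, set())
    ]


# Placeholder provider and model names.
saved = [ModelRef("desk", "model-a"), ModelRef("rack", "model-b")]

# Only "desk" is reachable: the "rack" favorite is hidden, not deleted.
only_desk = visible_favorites(saved, {"desk": {"model-a", "model-c"}})

# "rack" is back: its favorite reappears without being re-added.
both = visible_favorites(saved, {"desk": {"model-a"}, "rack": {"model-b"}})
```

Because the visible list is computed from two independent inputs, the inventory can change from moment to moment without ever touching what the user saved.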

---

## Phase Kinds {#phase-kinds}

- **Foundation** — A capability that does not yet deliver the full user outcome, but is required for later phases. It must still be demonstrable on its own.
- **Feature slice** — A thin end-to-end strip of new behavior that a real user can experience.
- **Polish** — Refinement, resilience, or operational quality-of-life work that enriches a working core.
- **Deferred** — Listed for traceability; not built in the current plan.

---

## Build Phases {#build-phases}

### Phase 1: Named Provider Inventory {#phase-1}

**Kind.** Foundation.

**Builds on.** Nothing — this is the starting phase.

**What we build.** BooCode learns that “local models” are not one undifferentiated pool. The system gains a shared named-provider list, a stable way to name a selected model as “provider plus model,” a default-provider fallback for old data, and a provider-aware inventory view that can show which models belong to which machine.

**Why this is Phase 1.** No later phase is safe until provider identity exists as a first-class concept. This phase is still demonstrable on its own because a person can see two named local providers with their own model groups and confirm that existing sessions still resolve instead of breaking.

**Outcome to demonstrate.**

1. Start BooCode with two named local providers configured.
2. Open the model selection view and see separate groups for each provider.
3. Open an older session that still stores a legacy bare model value.
4. Confirm the older session still resolves to a usable default instead of failing.

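
As a rough illustration of the “provider plus model” identity and the legacy fallback, the sketch below assumes a stored selection is either a bare model name (legacy) or a small record with explicit `provider` and `model` fields. The record shape, `DEFAULT_PROVIDER`, `KNOWN_PROVIDERS`, and every name used are assumptions for illustration; the source documents do not fix a storage format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union

DEFAULT_PROVIDER = "primary"                # assumed default for legacy values
KNOWN_PROVIDERS = {"primary", "secondary"}  # hypothetical provider names


@dataclass(frozen=True)
class ModelRef:
    """Hypothetical identity: a provider name plus a model name."""
    provider: str
    model: str


def resolve_model_ref(stored: Union[str, dict]) -> ModelRef:
    """Turn a stored selection into an explicit provider-plus-model identity."""
    if isinstance(stored, str):
        # Legacy sessions stored only a bare model name. The whole string is
        # kept as the model name and is never parsed for a provider prefix.
        return ModelRef(DEFAULT_PROVIDER, stored)
    provider = stored["provider"]
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    return ModelRef(provider, stored["model"])


# A legacy bare value, here one that happens to look like a vendor prefix.
legacy = resolve_model_ref("cloudvendor/model-a")

# A new-style value that names its provider explicitly.
explicit = resolve_model_ref({"provider": "secondary", "model": "model-a"})
```

Keeping the provider in its own field, rather than behind a delimiter inside one string, is a design choice of this sketch: a model name that contains slashes or vendor-like prefixes cannot then be split at the wrong character.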

**Source citations.**

- [Research — Recommendation](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#recommendation)
- [Research — What exists today](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#what-exists-today-codebase--current-state-anchor)
- [Implementation analysis — Shared local-provider registry](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#1-shared-local-provider-registry)

**Connects to.**

- Creates the identity rules used by [Phase 2](#phase-2), [Phase 4](#phase-4), and [Phase 5](#phase-5).
- Establishes the provider list that [Phase 7](#phase-7) will operationalize for future machines.

**Preconditions to verify before starting.**

- Confirm the shared provider list lives in one new shared location rather than being split between separate app-specific settings.
- Confirm which provider is the long-term default when legacy bare model values are encountered.

---

### Phase 2: Multi-Provider BooChat {#phase-2}

**Kind.** Feature slice.

**Builds on.** Phase 1, where provider identity and fallback rules are established.

**What we build.** BooChat becomes the first live end-to-end consumer of multiple local providers. A person can choose a model from any configured provider, send a message, and trust that the response came from the intended machine. The same phase also fixes the two current routing hazards: models that happen to share a cloud-provider prefix in their name, and models that should never be sent through the sidecar path.

**Why this is Phase 2.** BooChat is the fastest way to prove the provider resolver against real behavior. It surfaces routing mistakes immediately, but it is still simpler and easier to inspect than the coding surfaces that layer more state and backend behavior on top.

**Outcome to demonstrate.**

1. Open a chat and choose a model from the first local provider.
2. Send a prompt and get a response.
3. Switch to a model from the second local provider and send the same prompt.
4. Confirm both responses arrive successfully and the second provider does not get routed through the wrong path.
5. Run a model whose name resembles a cloud model name and confirm it still uses the intended local provider.

**Source citations.**

- [Research — Recommendation constraints](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#recommendation)
- [Research — Does embedding need a llama-sidecar? No.](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#does-embedding-need-a-llama-sidecar-no)
- [OpenSpec design — Server changes](../../../openspec/changes/multi-llama-swap-providers-model-favorites/design.md#5-server-changes)

**Connects to.**

- Supplies the stable routing behavior reused in [Phase 3](#phase-3), [Phase 4](#phase-4), and [Phase 5](#phase-5).
- Proves the provider resolver before the coding flows depend on it.

**Preconditions to verify before starting.**

- Confirm the desired provider order for the user-facing list.
- Confirm the cloud-backed model group stays visibly separate from local machine groups.

---

### Phase 3: Shared Favorites and Grouped Selection {#phase-3}

**Kind.** Feature slice.

**Builds on.** Phase 1 for provider identity and Phase 2 for live multi-provider chat behavior.

**What we build.** Model selection becomes a stable, shared experience instead of a one-off list. A person can favorite models, see favorites first, still browse by provider below, and have the same favorite set follow them across chat surfaces. If a provider is temporarily unavailable, its favorites disappear from the visible list without being lost.

**Why this is Phase 3.** Once the routing rules are real, the next highest-value step is to make selection usable. Doing this before the deeper coding surfaces avoids building two different model-selection experiences and then reconciling them later.

**Outcome to demonstrate.**

1. Favorite one model from each local provider.
2. Refresh and confirm both favorites appear at the top while still remaining in their provider groups.
3. Open the other chat surface and confirm the same favorites appear there too.
4. Temporarily remove one provider from the live inventory.
5. Confirm its favorite disappears from view without being deleted, then returns when the provider comes back.

**Source citations.**

- [Research — Dropdown + favorites prior art](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#dropdown--favorites-prior-art-web)
- [Research — Favorites persistence](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#sub-decision--favorites-persistence)
- [Implementation analysis — Provider-aware catalog, client-derived favorites](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#3-provider-aware-catalog-client-derived-favorites)

**Connects to.**

- Provides the selection behavior reused by [Phase 4](#phase-4).
- Stabilizes the shared user preference model before the broader fleet tooling in [Phase 7](#phase-7).

**Preconditions to verify before starting.**

- Confirm favorites are shared for the single user across devices rather than stored per browser.
- Confirm insertion order is enough for the first favorite list and manual reordering can wait.

---

### Phase 4: Native BooCoder Parity {#phase-4}

**Kind.** Feature slice.

**Builds on.** Phase 1 for provider identity, Phase 2 for routing behavior, and Phase 3 for the grouped selection experience.

**What we build.** The native coding path in BooCoder gains the same local model pool as BooChat. A person can choose a local model from any configured provider for native coding work and trust that the coding session is using the selected provider instead of collapsing everything back to one machine.


**Why this is Phase 4.** The native coding path can use the shared provider resolver directly, so it is the safest BooCoder slice to move next. Shipping it before the external-agent bridge delivers real user value while avoiding the hardest integration seam for one more phase.

**Outcome to demonstrate.**

1. Open the native coding experience.
2. Choose a local model from the first provider and run a coding task.
3. Start a second coding task using a model from the second provider.
4. Confirm both tasks run successfully using the intended provider-specific model choice.

**Source citations.**

- [Research — Recommendation constraints](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#recommendation)
- [Implementation analysis — Treat native and external-agent paths differently](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#4-treat-boocoder-native-and-boocoder-external-agent-paths-differently)
- [OpenSpec design — BooCoder integration](../../../openspec/changes/multi-llama-swap-providers-model-favorites/design.md#7-boocoder-integration)

**Connects to.**

- Establishes the stable native coding baseline before [Phase 6](#phase-6) tackles external-agent parity.
- Shares its provider list and identity rules with [Phase 5](#phase-5).

**Preconditions to verify before starting.**

- Confirm the native coding path is the required BooCoder target for the first live parity slice.
- Confirm the same grouped-selection experience should be preserved in the coding surface without new selection concepts.

---

### Phase 5: Multi-Provider Arena {#phase-5}

**Kind.** Feature slice.

**Builds on.** Phase 1 for provider identity and Phase 2 for provider-aware local routing.

**What we build.** Arena stops treating “local” as one machine and instead treats it as a set of named providers.
A person can run local comparisons across models from different machines and get correct routing and fair local classification instead of silent misclassification.

**Why this is Phase 5.** Arena benefits from the same resolver as chat and coding, but it is a separate consumer with its own local-versus-cloud logic. It belongs after the shared routing behavior is proven, but before the harder external-agent bridge so the local evaluation surface is complete early.

**Outcome to demonstrate.**

1. Start an arena comparison using one local model from the first machine and one from the second.
2. Run the comparison to completion.
3. Confirm both contenders are treated as local candidates rather than being collapsed into one generic local lane.
4. Confirm the results still make sense when one contender uses a provider-specific route such as the sidecar-backed machine.

**Source citations.**

- [Research — Recommendation constraints](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#recommendation)
- [Implementation analysis — Arena is a separate local-model consumer](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#f-006--arena-is-a-separate-local-model-consumer-not-just-another-caller)

**Connects to.**

- Reuses the same provider resolver established earlier.
- Supplies the local evaluation surface that [Phase 7](#phase-7) will harden for future machines.

**Preconditions to verify before starting.**

- Confirm that the intended outcome is correct provider-aware behavior, not yet a richer benchmarking or reporting layer.
- Confirm that local fairness rules should still treat all named local providers as part of the local class rather than introducing provider-specific scheduling policy in this phase.

---

### Phase 6: External-Agent Parity {#phase-6}

**Kind.** Feature slice.


**Builds on.** Phases 1 through 5, because this phase depends on the final provider model being stable before it is bridged outward.

**What we build.** External coding providers gain access to the same multi-provider local fleet without losing provider identity. The user-visible outcome is simple: a local model chosen for an external coding workflow still hits the intended machine even when another machine serves a model with the same name.

**Why this is Phase 6.** This is the most failure-prone seam in the entire rollout. Shipping it earlier would make the system look complete while still hiding ambiguous routing behind the scenes. By the time this phase starts, the provider model, picker behavior, and native local routing rules are already stable.

**Outcome to demonstrate.**

1. Open an external coding workflow that can use a local model.
2. Choose a model name that also exists on another local machine.
3. Run the task and confirm the request still reaches the intended provider instead of whichever machine happens to share the name.
4. Repeat with a different local provider and confirm the same behavior.

**Source citations.**

- [Research — Validation V1 and V9](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#validation)
- [Implementation analysis — No safe path for opencode local-model parity](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#g-005--no-safe-path-for-opencode-local-model-parity)
- [Implementation analysis — Preferred parity path for opencode](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#5-preferred-parity-path-for-opencode-a-boocoder-hosted-local-model-gateway)

**Connects to.**

- Completes the coding-side multi-provider story started in [Phase 4](#phase-4).
- Creates the provider bridge that keeps future machines safe in [Phase 7](#phase-7).

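
To see why a rename alone is not enough for this phase, consider a minimal sketch in which two hypothetical providers serve a model with the same name. A lookup keyed only by model name cannot say which machine is meant, while a lookup keyed by provider and model can. The provider names, URLs, and model names are placeholders, and this is not the gateway design from the implementation analysis.

```python
from __future__ import annotations

# Hypothetical inventory: provider name -> (base URL, models served).
INVENTORY: dict[str, tuple[str, set[str]]] = {
    "desk": ("http://desk.local:8080", {"shared-model", "desk-only"}),
    "rack": ("http://rack.local:8080", {"shared-model"}),
}


def providers_serving(model: str) -> list[str]:
    """List every provider that serves the given model name."""
    return sorted(
        name for name, (_, models) in INVENTORY.items() if model in models
    )


def base_url_for(provider: str, model: str) -> str:
    """Resolve a provider-plus-model pair to exactly one machine."""
    url, models = INVENTORY[provider]
    if model not in models:
        raise LookupError(f"{provider!r} does not serve {model!r}")
    return url


# The name alone matches two machines, so it cannot pick a route.
ambiguous = providers_serving("shared-model")

# The pair is unambiguous even though the model name is shared.
target = base_url_for("rack", "shared-model")
```

Any bridge to an external agent has to carry the provider half of the pair all the way to the request, because once only the model name survives, the situation in `ambiguous` is all that is left to route on.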

**Preconditions to verify before starting.**

- Confirm whether this phase will include a provider-preserving gateway or be split into a follow-up initiative.
- Confirm external-agent parity is required for the same milestone as native parity rather than being a later enhancement.

---

### Phase 7: Add-a-Machine Operations {#phase-7}

**Kind.** Polish.

**Builds on.** Phases 1 through 6, where the provider model and all major consumers are already in place.

**What we build.** The rollout stops being “support two machines” and becomes “support a growing local fleet.” A person can add another local machine by following a repeatable operational path, see it appear in inventory, and trust that chat, coding, and arena all treat it as just another named provider instead of a custom exception.

**Why this is Phase 7.** The architecture can claim success only when adding another machine is routine rather than bespoke. This phase comes late because it is about making the completed system repeatable and low-friction, not about proving the original two-machine behavior.

**Outcome to demonstrate.**

1. Add a third local provider using the documented provider path.
2. Restart or refresh the system.
3. See the new machine appear in the provider inventory with its own model group.
4. Use one model from the new machine in chat, one in coding, and one in arena.
5. Confirm all three surfaces recognize the new machine without custom code changes.

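
One way to read “without custom code changes” is that every surface iterates over a provider list loaded from configuration, so a third machine is one more entry. The sketch below assumes a JSON list of entries with `name` and `base_url` fields. The file format, the field names, and the example hosts are assumptions for illustration, not the documented provider path.

```python
from __future__ import annotations

import json


def load_providers(config_text: str) -> dict[str, str]:
    """Parse provider entries, rejecting empty or duplicate names.

    Returns a mapping from provider name to base URL in file order.
    """
    providers: dict[str, str] = {}
    for entry in json.loads(config_text):
        name = entry["name"].strip()
        if not name:
            raise ValueError("provider name must not be empty")
        if name in providers:
            raise ValueError(f"duplicate provider name: {name!r}")
        providers[name] = entry["base_url"]
    return providers


# Hypothetical configuration before and after adding a third machine.
TWO_MACHINES = """[
  {"name": "desk", "base_url": "http://desk.local:8080"},
  {"name": "rack", "base_url": "http://rack.local:8080"}
]"""

THREE_MACHINES = """[
  {"name": "desk", "base_url": "http://desk.local:8080"},
  {"name": "rack", "base_url": "http://rack.local:8080"},
  {"name": "attic", "base_url": "http://attic.local:8080"}
]"""

before = load_providers(TWO_MACHINES)
after = load_providers(THREE_MACHINES)
```

The loader is the same in both cases; only the configuration text differs, which is the property the outcome list above is meant to demonstrate.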

**Source citations.**

- [Research — Recommendation](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md#recommendation)
- [Implementation analysis — Recommended sequence](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#recommended-sequence)
- [Implementation analysis — Shared local-provider registry](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md#1-shared-local-provider-registry)

**Connects to.**

- Turns the whole earlier rollout into an operationally repeatable capability.
- Provides the stable registry that the deferred fleet layer in [Phase 8](#phase-8) can consume later.

**Preconditions to verify before starting.**

- Confirm configuration-based provider management is acceptable for the first operational pass and a full management interface is not required yet.
- Confirm the success bar is “no code changes required to add the machine,” not “all provider administration happens inside the product.”

---

### Phase 8 (Deferred): BooControl Fleet Layer {#phase-8}

**Kind.** Deferred.

**Builds on.** Phases 1 through 7, because it consumes the finished provider registry and the settled provider names.

**What we build.** A dedicated fleet-control and observability layer that can show the state of multiple local model providers, collect live information across them, and eventually make routing and benchmarking easier to understand.

**Why this is deferred.** BooControl depends on the provider registry, but the registry does not depend on BooControl. Building the control layer earlier would either duplicate the provider model or force BooControl to sit on top of assumptions that this rollout is specifically trying to remove.

**Reopen when.** Reopen this phase once multi-provider chat, coding, arena, and add-a-machine operations are already stable and there is enough day-to-day fleet activity to justify a dedicated control surface.


**Outcome to demonstrate (when or if built).**

1. Open the fleet view.
2. See every named local provider in one place.
3. Inspect live state or history without having to visit each machine separately.

**Source citations.**

- [BooControl tasks — prerequisite note](../../../openspec/changes/boocontrol/tasks.md#p0--prerequisite-separate-batch-multi-llama-swap-provider-registry)
- [BooControl proposal — prerequisite note](../../../openspec/changes/boocontrol/proposal.md#why)

---

## Open Questions {#open-questions}

### OQ-1. Where should the shared provider list live, and who owns it? {#oq-1}

**Blocks phase(s).** Phase 1.

The first phase cannot start until there is one agreed source of truth for named local providers. If that decision stays split, every later phase inherits the split.

- **Option A — a new shared provider list used by both apps.** One place defines provider names, addresses, and any provider-specific routing attributes. This keeps the local fleet model unified.
- **Option B — keep the existing separate settings and derive one view from the other.** This lowers the immediate change but keeps the long-term drift risk alive.
- **Recommendation: Option A.** The whole point of the rollout is to make provider identity shared and durable. Keeping two authorities would repeat the same problem in a new shape.

### OQ-2. Does this initiative include external-agent parity, or does it stop after native parity? {#oq-2}

**Blocks phase(s).** Phase 6.

The rollout can reach a useful and honest midpoint after native parity, but it cannot claim full multi-provider coding parity until the external-agent path is solved too.

- **Option A — include external-agent parity in this initiative.** This produces a complete end state, but it requires a dedicated provider-preserving bridge.
- **Option B — stop after native parity and split the external-agent work into a follow-up.** This shortens the first initiative, but the end state remains intentionally incomplete.
- **Recommendation: Option A if the bridge is accepted; otherwise Option B.** If the team is willing to build the bridge properly, finishing the job now avoids a misleading halfway state. If not, native parity should ship honestly as a bounded milestone and the rest should be split explicitly.

### OQ-3. Is a product-based provider management screen required now, or is configuration-based rollout enough? {#oq-3}

**Blocks phase(s).** Phase 7.

The final live phase is about making more machines routine to add. The open question is whether “routine” means “edit the provider list and restart” or whether it already means “manage providers inside the product.”

- **Option A — configuration-based rollout first.** A trusted operator adds machines through the shared provider list and validates them using the product.
- **Option B — product-based management in the same initiative.** Provider administration becomes part of the product immediately.
- **Recommendation: Option A.** The current initiative is about correct provider identity and repeatable multi-provider behavior. A full management screen adds another feature layer before the provider model has had time to prove itself.

### Carry-over notes

- Search, tag filtering, and richer picker controls are intentionally not blockers for the main rollout.
- Full fleet control, reporting, and advanced routing policy stay deferred until the provider model is already stable in daily use.