boocode/docs/plans/multi-provider-local-models/artifacts/implementation-decision-log.md
indifferentketchup b18de2a331 chore: snapshot working tree - pty_exited notifications + in-flight inference WIP
feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00

# Implementation Decision Log: Multi-Provider Local Models
This file records the implementation decisions committed while planning the multi-provider local-model rollout.
Behavioral intent lives in [../feature-implementation-plan.md](../feature-implementation-plan.md) and the source
artifacts it cites. Round history lives in [implementation-iteration-history.md](implementation-iteration-history.md).
Source artifacts:
- [../build-phase-outline.md](../build-phase-outline.md)
- [../../../openspec/changes/multi-llama-swap-providers-model-favorites/design.md](../../../openspec/changes/multi-llama-swap-providers-model-favorites/design.md)
- [../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md](../../../openspec/changes/multi-llama-swap-providers-model-favorites/artifacts/implementation-analysis.md)
- [../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md](../../research/2026-06-10-multi-llama-swap-providers-model-favorites.md)
- [./.discovery-notes.md](./.discovery-notes.md)
### D-1: Shared local-provider config authority
- **Question:** Where does the source of truth for named local providers live, and what belongs in the shared package versus app-local loaders?
- **Decision:** Use `/data/llama-providers.json`, wired through `LLAMA_PROVIDERS_PATH`, as the shared authority for local providers. Put the schema and pure model-ref helpers in `packages/contracts`; keep file I/O and legacy env fallback in app-local registry loaders for server and coder.
- **Rationale:** This matches the existing BooCoder pattern of package-owned schemas plus app-local load/build caches, avoids duplicating config semantics, and avoids forcing Node-specific loader code into every consumer of the contracts package.
- **Evidence:** `packages/contracts/src/provider-config.ts` and `apps/coder/src/services/provider-config-registry.ts` already follow this split; the current local-provider gap is that server and coder do not share any equivalent registry.
- **Rejected alternatives:**
  - Keep local providers env-only forever. Rejected because server and coder already drift and more machines would multiply the drift.
  - Put file reading only in one app and make the other app consume it indirectly. Rejected because both server and coder need startup-time local-provider awareness.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, Working Assumptions, W1.
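
A minimal sketch of the D-1 split, under stated assumptions: the JSON field names (`defaultProvider`, `providers`, `id`, `baseUrl`) and the legacy variable name `LEGACY_LLAMA_URL` are hypothetical placeholders, not the actual schema. The pure validator is the kind of helper that could live in `packages/contracts`; the loader takes the already-read file text so that file I/O stays in the app.

```typescript
// Hypothetical shape for /data/llama-providers.json; field names are assumptions.
interface LocalProvider {
  id: string; // e.g. "sam-desktop"; must not contain "/"
  baseUrl: string; // endpoint for that machine
}

interface LocalProviderConfig {
  defaultProvider: string;
  providers: LocalProvider[];
}

// Pure validation with no Node-specific code, so any consumer can import it.
function parseLocalProviderConfig(raw: unknown): LocalProviderConfig {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("provider config must be an object");
  }
  const { defaultProvider, providers } = raw as Record<string, unknown>;
  if (typeof defaultProvider !== "string" || !Array.isArray(providers)) {
    throw new Error("expected defaultProvider and providers[]");
  }
  const parsed = providers.map((p): LocalProvider => {
    const { id, baseUrl } = (p ?? {}) as Record<string, unknown>;
    if (typeof id !== "string" || id === "" || id.includes("/")) {
      throw new Error("provider id must be a non-empty string without '/'");
    }
    if (typeof baseUrl !== "string" || baseUrl === "") {
      throw new Error(`provider ${id} needs a baseUrl`);
    }
    return { id, baseUrl };
  });
  if (!parsed.some((p) => p.id === defaultProvider)) {
    throw new Error(`defaultProvider ${defaultProvider} is not in providers[]`);
  }
  return { defaultProvider, providers: parsed };
}

// App-local loader step. `fileText` is whatever the app read from
// LLAMA_PROVIDERS_PATH (undefined if the file is absent). LEGACY_LLAMA_URL is
// a placeholder for the real legacy variable, which this log does not name.
function loadLocalProviders(
  fileText: string | undefined,
  env: Record<string, string | undefined>,
): LocalProviderConfig {
  if (fileText !== undefined) {
    return parseLocalProviderConfig(JSON.parse(fileText));
  }
  const legacyUrl = env["LEGACY_LLAMA_URL"];
  if (!legacyUrl) {
    throw new Error("no provider file and no legacy env URL");
  }
  return {
    defaultProvider: "default",
    providers: [{ id: "default", baseUrl: legacyUrl }],
  };
}
```

Server and coder would each wrap `loadLocalProviders` in their own startup cache, which keeps the config semantics in one place without either app importing the other's loader.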
### D-2: Persist and cache composite `provider/model` ids; keep wire ids bare
- **Question:** What is the canonical identity format for local model selections and caches?
- **Decision:** Persist and cache `provider/model`. Strip the provider prefix only at the final upstream call boundary. Keep indefinite support for legacy bare ids by resolving them to `defaultProvider`.
- **Rationale:** Duplicate wire model names across machines are otherwise impossible to represent safely. This also keeps DB migrations small because the existing columns are already free-form text.
- **Evidence:** `sessions.model` and `chats.model` are stringly typed; `apps/server/src/services/model-context.ts` currently keys by bare model and would otherwise cross-poison duplicate names.
- **Rejected alternatives:**
  - Keep persisted ids bare and use side metadata for provider. Rejected because many call sites already pass the model string around alone.
  - Prefix wire calls too. Rejected because upstream llama-swap and DeepSeek calls want the actual provider-native model id.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W1, W2, W3.
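
The identity rules in D-2 can be sketched as a pair of pure helpers. This is an illustration, not the shipped code: the names `parseModelRef`, `formatModelRef`, and `toWireModel` are hypothetical, and treating a prefix as a provider only when it matches a known provider id is an assumption made here so that legacy bare ids containing a `/` still resolve to `defaultProvider`.

```typescript
interface ModelRef {
  provider: string;
  model: string; // provider-native id, the only part sent upstream
}

// Parse a persisted id. "provider/model" splits on the first "/" when the
// prefix is a known provider; anything else is treated as a legacy bare id.
function parseModelRef(
  id: string,
  knownProviders: ReadonlySet<string>,
  defaultProvider: string,
): ModelRef {
  const slash = id.indexOf("/");
  if (slash > 0) {
    const prefix = id.slice(0, slash);
    if (knownProviders.has(prefix)) {
      return { provider: prefix, model: id.slice(slash + 1) };
    }
  }
  return { provider: defaultProvider, model: id };
}

// Canonical form for persistence and cache keys.
function formatModelRef(ref: ModelRef): string {
  return `${ref.provider}/${ref.model}`;
}

// Applied only at the final upstream call boundary.
function toWireModel(ref: ModelRef): string {
  return ref.model;
}
```

With providers `sam-desktop` and `lab` and `lab` as the default, `sam-desktop/qwen3.6-35b` and a legacy bare `qwen3.6-35b` format to different canonical ids while sending the same wire model name.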
### D-3: One provider-aware resolver shared across streaming, non-streaming, context, and Arena
- **Question:** Should each consumer keep its own endpoint logic once multiple local providers exist?
- **Decision:** No. Build one provider-aware resolver contract and make streaming inference, non-streaming calls, context lookup, compaction, task-model resolution, and Arena all go through it.
- **Rationale:** The current failure mode is duplicated routing logic with slightly different heuristics. Fixing only one path would leave subtle misroutes in the others.
- **Evidence:** `apps/server/src/services/inference/provider.ts`, `apps/server/src/services/model-context.ts`, `apps/server/src/services/compaction.ts`, `apps/server/src/services/task-model.ts`, and `apps/coder/src/services/arena-model-call.ts` all handle local-model identity separately today.
- **Rejected alternatives:**
  - Only unify server inference and leave context/arena separate. Rejected because that would preserve hidden correctness bugs in context limits and Arena calls.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W2, W3, W6.
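
As a rough illustration of D-3, the sketch below shows one resolver feeding two consumers: an endpoint builder and a context-limit cache. All names here (`resolveModel`, `ResolvedModel`, `ContextLimitCache`) are hypothetical, and the `/v1/chat/completions` path assumes an OpenAI-compatible upstream.

```typescript
interface ResolvedModel {
  canonicalId: string; // "provider/model", used for persistence and cache keys
  baseUrl: string; // where the call goes
  wireModel: string; // bare id sent upstream
}

// Single source of routing truth. `providers` maps provider id -> base URL.
function resolveModel(
  id: string,
  providers: ReadonlyMap<string, string>,
  defaultProvider: string,
): ResolvedModel {
  const slash = id.indexOf("/");
  const prefix = slash > 0 ? id.slice(0, slash) : "";
  const hasPrefix = providers.has(prefix);
  const provider = hasPrefix ? prefix : defaultProvider;
  const wireModel = hasPrefix ? id.slice(slash + 1) : id;
  const baseUrl = providers.get(provider);
  if (baseUrl === undefined) {
    throw new Error(`unknown provider: ${provider}`);
  }
  return { canonicalId: `${provider}/${wireModel}`, baseUrl, wireModel };
}

// Consumer 1: inference builds its endpoint from the resolved model.
function chatCompletionsUrl(m: ResolvedModel): string {
  return `${m.baseUrl}/v1/chat/completions`;
}

// Consumer 2: context limits are keyed by canonical id, so two machines that
// serve the same wire model name cannot overwrite each other's entry.
class ContextLimitCache {
  private readonly limits = new Map<string, number>();

  set(m: ResolvedModel, tokens: number): void {
    this.limits.set(m.canonicalId, tokens);
  }

  get(m: ResolvedModel): number | undefined {
    return this.limits.get(m.canonicalId);
  }
}
```

Because both consumers accept a `ResolvedModel` rather than a raw string, neither can apply its own routing heuristic by accident.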
### D-4: Favorites are a settings-backed user view, not a server catalog section
- **Question:** Where should the Favorites concept live?
- **Decision:** Store `favorite_models: string[]` in settings and derive the Favorites section client-side from settings plus provider inventory. The server catalog returns providers and models only.
- **Rationale:** Inventory answers “what exists now.” Favorites answer “what this user prefers.” Keeping them separate avoids overloading the server catalog with user-specific UI state.
- **Evidence:** `settings` already exists server-side; the OpenSpec analysis already identified favorites as a user-level concern rather than an inventory concern.
- **Rejected alternatives:**
  - Return a synthetic Favorites section from `/api/models`. Rejected because it entangles inventory with user preference and complicates offline/unavailable favorite behavior.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W2, W4.
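
The client-side derivation in D-4 could look roughly like the following. Only `favorite_models` comes from this log; the inventory shape and the `available` flag are assumptions. Keeping favorites whose provider is offline and flagging them, rather than dropping them, is one possible behavior and is not confirmed by the plan.

```typescript
// Assumed shape of one provider entry in the server catalog response.
interface ProviderInventory {
  id: string;
  models: string[]; // bare wire ids reported by that provider
}

interface FavoriteEntry {
  id: string; // composite "provider/model" id from settings.favorite_models
  available: boolean; // false when the model is not in the current inventory
}

// Join the user's preference list against what exists right now.
function deriveFavorites(
  favoriteModels: readonly string[],
  inventory: readonly ProviderInventory[],
): FavoriteEntry[] {
  const present = new Set<string>();
  for (const provider of inventory) {
    for (const model of provider.models) {
      present.add(`${provider.id}/${model}`);
    }
  }
  // Preserve the user's ordering; inventory only decides availability.
  return favoriteModels.map((id) => ({ id, available: present.has(id) }));
}
```

The server catalog stays free of per-user state under this shape: the same inventory response can back any user's Favorites section.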
### D-5: Native `boocode` parity ships before `opencode` parity
- **Question:** Should native and external-agent BooCoder paths move together?
- **Decision:** No. Native `boocode` parity is W5. `opencode` parity is W7 and does not begin until the native path is correct and the UI stops falsely advertising multi-provider local models under the old bridge.
- **Rationale:** Native `boocode` can use the shared resolver directly. `opencode` still assumes one local-provider namespace and is the riskier seam.
- **Evidence:** `apps/coder/src/services/provider-snapshot.ts` prefixes local models as `llama-swap/*`; `apps/coder/src/services/backends/opencode-server.ts` still assumes the outer provider namespace identifies the target upstream.
- **Rejected alternatives:**
  - Rename everything to `provider/model` in one pass. Rejected because the external-agent bridge would still collapse identity at the last moment.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W5, W7.
### D-6: `opencode` parity uses a `boocode-local` gateway, not a string rewrite
- **Question:** What is the safe path to external-agent parity?
- **Decision:** Add a BooCoder-hosted OpenAI-compatible local gateway and present it to `opencode` as one stable provider namespace such as `boocode-local`. The inner `modelID` carries the composite local identity like `sam-desktop/qwen3.6-35b`.
- **Rationale:** `parseModel()` in the opencode backend already splits only once at `/`, which means a stable outer provider id can safely carry the inner composite local id. That preserves provider identity without teaching opencode about every machine directly.
- **Evidence:** `apps/coder/src/services/backends/opencode-server.ts` `parseModel()` returns `{ providerID, modelID }` where `modelID` may contain additional slashes; current `llama-swap/<model>` mapping is the ambiguity seam.
- **Rejected alternatives:**
  - Keep rewriting `provider/model` back to `llama-swap/model`. Rejected because duplicate local model names would still route incorrectly.
  - Add one direct opencode provider per local machine. Rejected because it duplicates the registry and leaks fleet structure into opencode config.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W7.
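
To illustrate why a single split at `/` makes the gateway namespace safe, here is a small sketch. `splitOnce` is a stand-in written for this example, not the real `parseModel()` from `opencode-server.ts`; throwing when no `/` is present is an assumption, and `toGatewayModel` is a hypothetical helper.

```typescript
interface SplitModel {
  providerID: string;
  modelID: string; // may itself contain "/"
}

// Split at the first "/" only, so everything after it survives intact.
function splitOnce(model: string): SplitModel {
  const slash = model.indexOf("/");
  if (slash <= 0) {
    throw new Error(`expected "provider/model", got: ${model}`);
  }
  return {
    providerID: model.slice(0, slash),
    modelID: model.slice(slash + 1),
  };
}

// What BooCoder would hand to opencode: stable outer namespace, composite inner id.
function toGatewayModel(localId: string): string {
  return `boocode-local/${localId}`;
}

// Outer split: opencode sees one provider. Inner split: the gateway recovers
// the actual machine and the wire model.
const outer = splitOnce(toGatewayModel("sam-desktop/qwen3.6-35b"));
const inner = splitOnce(outer.modelID);
console.log(outer.providerID, inner.providerID, inner.modelID);
```

The outer split yields `boocode-local` as the provider and `sam-desktop/qwen3.6-35b` as the model; the gateway's inner split then recovers `sam-desktop` and `qwen3.6-35b`, so two machines serving the same model name stay distinct.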
### D-7: Add-a-machine stays config-driven in this initiative
- **Question:** Does this rollout include a control-plane UI for adding local machines?
- **Decision:** No. Adding a machine stays a config-driven operation in this initiative, documented in W8. BooControl is the later UI/control-plane consumer.
- **Rationale:** The user goal is multi-provider support now, not a new admin product before the substrate exists.
- **Evidence:** BooControl's own tasks call this registry work a prerequisite; current repo state has no stable local-provider substrate yet.
- **Rejected alternatives:**
  - Build BooControl first. Rejected because it would either duplicate registry logic or bind to today's broken single-provider assumptions.
- **Driven by rounds:** R1.
- **Referenced in plan:** Outcome, W8, Deferred.
### D-8: Work unit sequencing is contract-first, consumer-second, verification-third
- **Question:** How should this be broken down for Orchestration so branches do not constantly collide?
- **Decision:** Sequence every work unit in the following order, and forbid parallel editing of the shared contract and resolver files:
  1. contracts and config
  2. primary backend seam
  3. downstream consumers
  4. tests and smoke
- **Rationale:** The churniest files in this repo are exactly the shared contract and coordinator files. Letting multiple branches edit them in parallel is the fastest path to merge thrash and subtle drift.
- **Evidence:** Recent churn is highest in `apps/web/src/api/types.ts`, `apps/web/src/api/client.ts`, `apps/server/src/index.ts`, `apps/coder/src/services/dispatcher.ts`, and `apps/coder/src/services/provider-snapshot.ts`.
- **Rejected alternatives:**
  - Split by app only. Rejected because this feature crosses contracts, server, web, and coder in nearly every phase.
- **Driven by rounds:** R1.
- **Referenced in plan:** Orchestration Rules, Work Unit Index, all work units.