chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00
parent 0ed506f1da
commit b18de2a331
204 changed files with 25344 additions and 867 deletions


@@ -0,0 +1,126 @@
# Discovery Notes: Multi-Provider Local Models
Single source of truth for implementation context. Read this before touching the plan or code.
## Tech stack
- Monorepo with pnpm workspaces.
- `apps/server`: Fastify + Postgres, native inference, local-model routing, BooChat APIs.
- `apps/web`: React + Vite SPA, shared chat and coder UI.
- `apps/coder`: host-side BooCoder service, provider probing, native and external-agent dispatch, Arena, MCP.
- `packages/contracts`: shared cross-app schemas and types, built before consumers.
- TypeScript strict mode. Server and coder use NodeNext and `.js` import suffixes.
- Tests: `pnpm -C apps/server test`, `pnpm -C apps/coder test`. No dedicated web test harness.
## ADRs found
- `docs/adr/0001-arena-two-lane-scheduling.md`
Summary: local llama-backed contestants run serially in one lane, cloud contestants run in parallel in another lane; multi-provider work must preserve this lane model.
- `docs/adr/0002-arena-dedicated-tables-not-flow-runner.md`
Summary: Arena owns its own storage and runtime shape; reuse dispatcher machinery but do not fold Arena back into flow-runner abstractions.
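
The lane model from ADR 0001 can be sketched roughly as follows. This is a minimal illustration, not the repo's scheduler: `Contestant`, `Lane`, and `runTwoLanes` are hypothetical names, and letting the two lanes progress side by side is an assumption. The point for multi-provider work is that a contestant's lane depends on local versus cloud, not on which provider hosts it, so adding local providers should not add parallelism.

```typescript
// Hypothetical shapes for illustration; not the repo's actual Arena types.
type Lane = "local" | "cloud";

interface Contestant {
  id: string;
  lane: Lane;
  run: () => Promise<string>;
}

// Runs local contestants one at a time and cloud contestants concurrently.
// Assumption: the two lanes themselves proceed side by side.
async function runTwoLanes(
  contestants: Contestant[],
): Promise<Map<string, string>> {
  const results = new Map<string, string>();

  const localLane = async (): Promise<void> => {
    for (const c of contestants.filter((x) => x.lane === "local")) {
      // Await each call before starting the next: serial execution.
      results.set(c.id, await c.run());
    }
  };

  const cloudLane = async (): Promise<void> => {
    // Start every cloud call immediately and wait for all of them.
    await Promise.all(
      contestants
        .filter((x) => x.lane === "cloud")
        .map(async (c) => {
          results.set(c.id, await c.run());
        }),
    );
  };

  await Promise.all([localLane(), cloudLane()]);
  return results;
}
```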
## Coding standards found
- `docs/coding-standards/cross-app-contract-parity.md`
Summary: when a cross-app contract changes, update the canonical package source plus app-side secondary representations in the same batch; missing one side silently drops behavior at runtime.
- `CLAUDE.md`
Summary: `packages/contracts` is the single source for provider-snapshot and message-metadata contracts, deploy-by-surface rules matter, and contract changes must respect app-local secondary unions and renderers where they still exist.
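
One common way to make a missed secondary representation fail at compile time instead of silently dropping behavior at runtime is an exhaustive `switch` with a `never` check. The sketch below is illustrative only: `ProviderKind`, `providerLabel`, and `assertNever` are hypothetical names, not identifiers from `packages/contracts`, and the standards doc may prescribe a different mechanism.

```typescript
// Hypothetical union standing in for a contract type from packages/contracts.
type ProviderKind = "local" | "cloud";

// Accepts only `never`, so passing a still-possible variant is a type error.
function assertNever(value: never): never {
  throw new Error(`Unhandled contract variant: ${String(value)}`);
}

// Hypothetical app-side secondary representation, such as a label renderer.
function providerLabel(kind: ProviderKind): string {
  switch (kind) {
    case "local":
      return "Local";
    case "cloud":
      return "Cloud";
    default:
      // If ProviderKind gains a member that has no case above, `kind` is no
      // longer `never` here and this call stops type-checking.
      return assertNever(kind);
  }
}
```

With this pattern, adding a variant to the canonical union forces a typecheck failure in every consumer that has not been updated in the same batch.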
## Relevant architecture notes
- `apps/server/CLAUDE.md`
Summary: `services/inference/provider.ts` is the current llama-swap provider seam; `model-context.ts` and `compaction.ts` currently assume one upstream.
- `apps/coder/CLAUDE.md`
Summary: provider snapshot and `opencode` integration are the main local-model seams; `llama-swap/*` is currently the local namespace assumption.
- `apps/web/CLAUDE.md`
Summary: `ModelPicker` and `AgentComposerBar` are separate UI surfaces with different constraints; any provider snapshot loading-state change can make providers disappear from the coder UI.
## Code touch points
### Shared contracts and config patterns
- `packages/contracts/src/provider-config.ts`
Existing coder ACP provider config schema; useful precedent, but not the right place to overload with local host inventory semantics.
- `apps/coder/src/services/provider-config-registry.ts`
Existing pattern for schema-in-package plus app-local load/build cache.
- `packages/contracts/src/provider-snapshot.ts`
Shared snapshot contract used by coder and web.
### Server: catalog, routing, and downstream local-model consumers
- `apps/server/src/config.ts`
Current env config includes `LLAMA_SWAP_URL`, `LLAMA_SIDECAR_URL`, and `DEFAULT_MODEL`; multi-provider config must enter here.
- `apps/server/src/routes/models.ts`
Current `/api/models` route fetches one llama-swap and optionally DeepSeek.
- `apps/server/src/services/inference/provider.ts`
Current route selection and AI SDK provider seam; central place to remove heuristic provider detection.
- `apps/server/src/services/model-context.ts`
Current context cache keys by bare model string and assumes one `LLAMA_SWAP_URL`.
- `apps/server/src/services/compaction.ts`
Uses `resolveModelEndpoint()` today, but still contains one-provider assumptions and a DeepSeek prefix special case.
- `apps/server/src/services/task-model.ts`
Returns one resolved `{url, model}` pair today.
- `apps/server/src/index.ts`
Calls `configureModelContext({ llamaSwapUrl })`; this wiring must change when context lookup becomes provider-aware.
- `apps/server/src/routes/settings.ts`
Existing shared settings persistence surface; right place for `favorite_models`.
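
To illustrate why a cache keyed by bare model string breaks once there is more than one upstream, here is a minimal sketch of a provider-aware key. All names (`ModelRef`, `contextCacheKey`, `ContextWindowCache`) are hypothetical and do not reflect the current `model-context.ts`; the provider ids in the usage note are made up.

```typescript
// Hypothetical reference to a model on a specific provider.
interface ModelRef {
  providerId: string; // assumed: one id per configured local host or upstream
  model: string; // bare model id as reported by that provider
}

// Encoding the pair as a JSON array keeps keys unambiguous even if either
// part contains "/" or other separator characters.
function contextCacheKey(ref: ModelRef): string {
  return JSON.stringify([ref.providerId, ref.model]);
}

class ContextWindowCache {
  private readonly entries = new Map<string, number>();

  get(ref: ModelRef): number | undefined {
    return this.entries.get(contextCacheKey(ref));
  }

  set(ref: ModelRef, contextTokens: number): void {
    this.entries.set(contextCacheKey(ref), contextTokens);
  }
}
```

Two hosts serving the same model id with different context sizes, for example `{ providerId: "host-a", model: "qwen" }` and `{ providerId: "host-b", model: "qwen" }`, then occupy separate entries instead of overwriting each other.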
### Web: BooChat and coder selection UI
- `apps/web/src/components/ModelPicker.tsx`
Shared BooChat model picker component; currently assumes a flat `/api/models` list.
- `apps/web/src/components/AgentComposerBar.tsx`
Native BooCoder provider/mode/model picker surface.
- `apps/web/src/lib/model-label.ts`
Display-only model prettifier used by both pickers.
- `apps/web/src/api/client.ts`
`models()` currently expects `ModelInfo[]`.
- `apps/web/src/api/types.ts`
Holds the web-side API contract for `/api/models` and other cross-app payloads.
### Coder: native, snapshot, arena, and external-agent bridge
- `apps/coder/src/config.ts`
Current coder config still exposes `LLAMA_SWAP_URL`; multi-provider config must enter here too.
- `apps/coder/src/services/provider-snapshot.ts`
Current snapshot fetches one `LLAMA_SWAP_URL`, prefixes local models as `llama-swap/*`, and merges them into `opencode`.
- `apps/coder/src/services/dispatcher.ts`
Current native and external-agent dispatch logic still assumes local bare ids or `llama-swap/*` for local routing.
- `apps/coder/src/services/backends/opencode-server.ts`
`parseModel()` splits only once at `/`; this is good news because a stable outer provider namespace can carry an inner composite model id.
- `apps/coder/src/services/arena-model-call.ts`
Direct one-shot local model call against `LLAMA_SWAP_URL`.
- `apps/coder/src/services/arena-analyzer.ts`
Local-vs-cloud checks rely on one local model set and one upstream.
- `apps/coder/src/index.ts`
Builds the local-model set for Arena from one fetched llama-swap list.
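
The split-once behavior noted for `parseModel()` can be illustrated with a small stand-in. This is not the real function from `opencode-server.ts`: `parseModelOnce`, its return shape, its handling of ids without a slash, and the example ids are all assumptions made for the sketch.

```typescript
// Illustrative stand-in; the real parseModel() may differ in name and shape.
interface ParsedModel {
  providerID: string;
  modelID: string;
}

function parseModelOnce(spec: string): ParsedModel {
  const slash = spec.indexOf("/");
  if (slash === -1) {
    // Assumption: with no namespace, the whole string is the model id.
    return { providerID: "", modelID: spec };
  }
  // Split only at the first "/"; later slashes stay inside the model id.
  return {
    providerID: spec.slice(0, slash),
    modelID: spec.slice(slash + 1),
  };
}
```

Under these assumptions, `parseModelOnce("llama-swap/host-a/qwen")` yields `providerID` `"llama-swap"` and `modelID` `"host-a/qwen"`, which is how a stable outer namespace can carry an inner composite id.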
## Recent activity and churn
High-churn files in the last 90 days:
- `apps/web/src/api/types.ts`
- `apps/web/src/api/client.ts`
- `apps/server/src/index.ts`
- `apps/server/src/types/api.ts`
- `apps/coder/src/services/dispatcher.ts`
- `apps/coder/src/index.ts`
- `apps/coder/src/services/provider-snapshot.ts`
- `apps/web/src/components/AgentComposerBar.tsx`
- `apps/server/src/services/compaction.ts`
Implication: keep work units narrow and avoid combining unrelated refactors in these files.
## Constraints and load-bearing facts
- `packages/contracts` already owns provider-snapshot types; if the snapshot contract changes, rebuild the package before touching consumers.
- `apps/web` has no dedicated test harness, so web verification will rely on typecheck plus smoke testing.
- Arena's local lane semantics are intentional; multi-provider support must not collapse local models into parallel execution.
- `opencode` local parity is not a small rename. The current host config and snapshot behavior collapse identity to one `llama-swap` namespace.
## Gaps and unknowns
- No existing shared local-provider config file or schema exists in-repo yet.
- `/api/models` shape change is not yet specified in app-local types; W2 must settle the contract before W4 starts.
- The final `opencode` gateway path is not implemented anywhere yet; W7 is net-new code, not just adaptation.
- No dedicated docs for “add a machine” exist yet; W8 must create them.