chore: snapshot working tree - pty_exited notifications + in-flight inference WIP

feat(booterm): structured pty_exited WS notifications. Plan-validated, impl-validated, code-reviewed green (contracts build clean, contracts test 29/29, booterm + web typecheck clean).

wip: in-progress inference/provider refactor (agents.ts, provider.ts, new llama-providers.ts, removed llama-args-validator), plus arena, dispatcher, compaction, schema changes.

openspec: pty-exit-notifications complete; x-agent-flags planned (not yet implemented).
2026-06-14 12:48:47 +00:00
parent 0ed506f1da
commit b18de2a331
204 changed files with 25344 additions and 867 deletions

@@ -0,0 +1,55 @@
# Design — BooControl SSH editor verb-mode + model pull
## Files touched
- `apps/control/src/services/ssh-config.ts` — add the `RemoteOps` seam + `shellOps`/`wrapperOps`; thread `mode` through `readRemoteConfig`/`applyRemoteConfig`.
- `apps/control/src/services/model-pull.ts` (new) — non-blocking pull job runner.
- `apps/control/src/routes/ssh-config.ts` — accept `sshMode` in PATCH; pass mode to read/diff/apply; add `POST /api/hosts/:id/pull`.
- `apps/control/src/schema.sql`: `ALTER TABLE control_hosts ADD COLUMN IF NOT EXISTS ssh_mode TEXT NOT NULL DEFAULT 'shell'`.
- `apps/web/src/components/control/HostConfigEditor.tsx` — SSH-mode selector + Pull-model field.
- `apps/control/src/services/__tests__/ssh-config.test.ts` — add wrapper-mode mapping tests (keep existing shell-mode tests).
- `apps/control/src/services/__tests__/model-pull.test.ts` (new) — repo-id validation + verb emission.
## RemoteOps seam
```ts
interface RemoteOps {
read(): Promise<string>; // throws on failure
backup(now: Date): Promise<string>; // returns backup path
write(content: string): Promise<void>; // throws on failure
restart(restartCmd: string): Promise<void>;
}
// shell: today's behavior — emits `cat 'p'`, `cp 'p' 'p.bak-ts'`, `cat > 'p'`, restartCmd.
function shellOps(target, configPath, exec): RemoteOps
// wrapper: emits the verbs `read` / `backup` / `write`(stdin) / `restart`.
function wrapperOps(target, exec): RemoteOps
```
`applyRemoteConfig` selects ops from `opts.mode` (default `'shell'`). Shell `backup`
computes the name via `backupFilename` then `cp`; wrapper `backup` sends the
`backup` verb and reads the returned path from stdout (the wrapper stamps it).
Everything else (validate, diff via `computeDiff`, health-wait) is unchanged, and
`shellOps` emits byte-for-byte the same command strings as before, so the
existing shell-mode tests pass without modification.
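
A minimal sketch of how the two implementations could sit behind the seam. The `Exec` signature, the `run`, `q`, and `selectOps` helpers, and the timestamp format in `backupName` are assumptions for illustration (the real code uses `backupFilename`, whose format is not shown here); only the verb names and the shell command shapes come from the design above.

```typescript
// Assumed transport shape: runs `command` on `target`, optionally piping `stdin`.
type Exec = (target: string, command: string, stdin?: string) =>
  Promise<{ code: number; stdout: string; stderr: string }>;

interface RemoteOps {
  read(): Promise<string>;
  backup(now: Date): Promise<string>;
  write(content: string): Promise<void>;
  restart(restartCmd: string): Promise<void>;
}

// Run one command and throw on a non-zero exit so each step fails loudly.
async function run(exec: Exec, target: string, command: string, stdin?: string): Promise<string> {
  const res = await exec(target, command, stdin);
  if (res.code !== 0) throw new Error(`${command} failed (${res.code}): ${res.stderr.trim()}`);
  return res.stdout;
}

// Single-quote a path for a POSIX shell (hypothetical helper).
const q = (s: string): string => `'${s.replace(/'/g, "'\\''")}'`;

// Hypothetical stand-in for backupFilename; the real timestamp format may differ.
const backupName = (path: string, now: Date): string =>
  `${path}.bak-${now.toISOString().replace(/[:.]/g, "-")}`;

function shellOps(target: string, configPath: string, exec: Exec): RemoteOps {
  return {
    read: () => run(exec, target, `cat ${q(configPath)}`),
    backup: async (now) => {
      const dest = backupName(configPath, now);
      await run(exec, target, `cp ${q(configPath)} ${q(dest)}`);
      return dest; // path computed on the client
    },
    write: async (content) => { await run(exec, target, `cat > ${q(configPath)}`, content); },
    restart: async (restartCmd) => { await run(exec, target, restartCmd); },
  };
}

function wrapperOps(target: string, exec: Exec): RemoteOps {
  return {
    read: () => run(exec, target, "read"),
    // The `now` argument is ignored: the wrapper stamps the backup and prints its path.
    backup: async () => (await run(exec, target, "backup")).trim(),
    write: async (content) => { await run(exec, target, "write", content); },
    // The restart command is fixed on the host side, so only the verb is sent.
    restart: async () => { await run(exec, target, "restart"); },
  };
}

// Mode selection as applyRemoteConfig might do it; 'shell' is the default path.
function selectOps(mode: "shell" | "wrapper", target: string, configPath: string, exec: Exec): RemoteOps {
  return mode === "wrapper" ? wrapperOps(target, exec) : shellOps(target, configPath, exec);
}
```

Because both implementations satisfy the same interface, the rest of the pipeline never needs to know which mode a host uses.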
## Pull job
`runModelPull({ target, repo, mode }, exec, emitter)`:
1. Validate `repo` against `^[A-Za-z0-9._-]+/[A-Za-z0-9._-]+$`; reject early.
2. `exec(target, 'pull ' + repo)` (wrapper) or `exec(target, 'huggingface-cli download ' + repo + ' --local-dir <modelsDir>/...')` (shell). Wrapper mode is the supported path; shell mode requires a `models_dir` and is best-effort.
3. Publish `control_job` frames: `running` at start, `completed`/`failed` at end, `detail.kind = 'pull'`, `detail.repo`, and tail output in `detail.line`.
Reuses jobType `action` from the existing `ControlJobFrame` (no contracts change).
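
The three steps above can be sketched as follows for the wrapper path only. This is an illustration under assumptions, not the shipped `runModelPull`: the `Exec` signature, the `PullFrame` shape (an assumed subset of `ControlJobFrame`), and a plain `emit` callback in place of the emitter are all hypothetical, and the shell-mode branch is omitted.

```typescript
// Assumed transport shape; the real exec may stream output instead of buffering it.
type Exec = (target: string, command: string) =>
  Promise<{ code: number; stdout: string; stderr: string }>;

// Assumed subset of ControlJobFrame; the real contract has more fields.
interface PullFrame {
  jobType: "action";
  status: "running" | "completed" | "failed";
  detail: { kind: "pull"; repo: string; line?: string };
}

const REPO_ID = /^[A-Za-z0-9._-]+\/[A-Za-z0-9._-]+$/;

async function runModelPull(
  args: { target: string; repo: string },
  exec: Exec,
  emit: (frame: PullFrame) => void,
): Promise<void> {
  const { target, repo } = args;
  // Step 1: reject before any SSH command is issued.
  if (!REPO_ID.test(repo)) throw new Error(`invalid repo id: ${repo}`);

  const detail = { kind: "pull" as const, repo };
  emit({ jobType: "action", status: "running", detail });
  try {
    // Step 2: the repo id is passed to the wrapper as a single token.
    const res = await exec(target, `pull ${repo}`);
    // Step 3: report the last output line alongside the terminal status.
    const lines = (res.stdout + res.stderr).trim().split("\n");
    const line = lines[lines.length - 1] ?? "";
    const status = res.code === 0 ? "completed" : "failed";
    emit({ jobType: "action", status, detail: { ...detail, line } });
  } catch (err) {
    emit({ jobType: "action", status: "failed", detail: { ...detail, line: String(err) } });
  }
}
```

A route handler would call this without awaiting it and return immediately, which is what keeps the HTTP request non-blocking.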
## Backward compatibility
- `ssh_mode` defaults to `shell` -> existing hosts behave exactly as in P9.1.
- `applyRemoteConfig` `mode` defaults to `shell` -> existing call sites + tests unchanged.
- No `control_job` schema change; the web `useControlStream` already accepts `jobType: 'action'`.
## Validation lenses folded in
- **V1 (adversarial):** wrapper `backup` must return the path the wrapper chose, not a client-computed one (clock skew between control host and GPU host) -> wrapper `backup` reads stdout.
- **V2 (adversarial):** a `wrapper`-mode host without the script must fail loudly -> verbs surface the non-zero exit + stderr per pipeline step; no shell fallback.
- **JD1 (junior):** server-side repo validation duplicates the wrapper's -> intentional defense in depth; documented.
- **JD2 (junior):** reusing jobType `action` keeps the change additive; a dedicated `pull` type is deferred (would touch contracts + web union) with reopen trigger "if pull needs distinct UI filtering."

@@ -0,0 +1,53 @@
# BooControl SSH editor verb-mode + model pull — proposal
**Status:** READY. Extends BooControl P9.1 (the SSH config editor) so it works
against a forced-command-locked SSH key and can pull HuggingFace models into a
host's models directory.
## Why
P9.1 shipped the SSH config editor sending raw shell commands (`cat`, `cp`,
`cat >`, the restart command) over SSH. To restrict the BooControl key to a
single drive/folder, the operator has deployed an `authorized_keys`
**forced command** on the GPU hosts that binds the key to a wrapper script
(`apps/control/remote/boocontrol-edit.{ps1,sh}`). With a forced command, sshd runs
the wrapper in place of whatever command the client sends, and the wrapper honors
only fixed **verbs** (`read` / `backup` / `write` / `restart` / `pull <repo>`). The
editor's raw-shell commands are therefore rejected by those hosts, and the editor
has no way to drive the wrapper's `pull` verb.
This change teaches the editor to speak verbs (per host) and adds a model-pull
capability, closing the loop so a locked-down key is fully usable from the
cockpit.
## What changes
1. **Per-host SSH mode.** `control_hosts.ssh_mode` (`shell` | `wrapper`, default
`shell` for backward compatibility). `shell` keeps today's raw-command
behavior for hosts without a wrapper; `wrapper` sends verbs.
2. **Verb-mode remote ops.** `ssh-config.ts` gains a `RemoteOps` seam with two
implementations (`shellOps`, `wrapperOps`). `applyRemoteConfig` and the
read/diff paths route through it. The pipeline (validate -> read -> diff ->
backup -> write -> restart -> health-wait) is unchanged; only the wire
commands differ.
3. **Model pull.** `POST /api/hosts/:id/pull {repo}` runs a non-blocking job that
invokes the host's `pull <repo>` verb, streaming progress over the existing
`control_job` frame (jobType `action`, `detail.kind = "pull"`). The repo id is
validated server-side (`^[A-Za-z0-9._-]+/[A-Za-z0-9._-]+$`) as defense in depth
on top of the wrapper's own check.
4. **UI.** The Host config editor gains an SSH-mode selector and a "Pull model"
field that posts a repo id and shows job progress.
## Out of scope
- Changing the wrapper scripts (already in `apps/control/remote/`).
- A new `control_job` jobType (reuse `action` to avoid a contracts change).
- Progress percentage parsing from `huggingface-cli` output (stream raw lines).
## Risks
| Risk | Mitigation |
|---|---|
| Refactor breaks existing P9.1 shell-mode tests | `shellOps` emits the identical `cat`/`cp`/`cat >`/restart command strings; existing assertions hold. `mode` defaults to `shell`. |
| Repo id injection via the pull verb | server-side regex validation + the wrapper's own regex; repo passed as a single token. |
| Long pull blocks the HTTP request | non-blocking job (fire-and-forget like bench/eval), progress over `control_job`. |
| Operator points a `wrapper`-mode host at a box without the wrapper | verbs fail loudly (the forced command / shell returns "denied"/127); reported per step, no silent fallback. |

@@ -0,0 +1,42 @@
# ssh-config-editor
## ADDED Requirements
### Requirement: Per-host SSH command mode
The SSH config editor SHALL support a per-host `ssh_mode` of `shell` or
`wrapper`. In `shell` mode it issues raw shell commands as today; in `wrapper`
mode it issues fixed verbs (`read`, `backup`, `write`, `restart`, `pull`) so the
key can be bound to an `authorized_keys` forced command. The mode defaults to
`shell` for backward compatibility.
#### Scenario: Wrapper-mode host receives verbs
- **WHEN** a host configured with `ssh_mode = wrapper` has its config read
- **THEN** the editor sends the `read` verb (not a `cat` command)
#### Scenario: Shell-mode host is unchanged
- **WHEN** a host configured with `ssh_mode = shell` (the default) is edited
- **THEN** the editor sends the same `cat`/`cp`/`cat >`/restart commands as before
#### Scenario: Backup precedes write in both modes
- **WHEN** a config is applied
- **THEN** a timestamped backup is taken before the new config is written, and a write failure leaves the backup intact
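
The ordering in this scenario can be illustrated with a small sketch. `applyWithBackup`, `Step`, and `ApplyResult` are hypothetical names; the real `applyRemoteConfig` also validates, diffs, and health-waits, all of which are left out here.

```typescript
interface RemoteOps {
  read(): Promise<string>;
  backup(now: Date): Promise<string>;
  write(content: string): Promise<void>;
  restart(restartCmd: string): Promise<void>;
}

type Step = "backup" | "write" | "restart";

interface ApplyResult {
  ok: boolean;
  failedStep?: Step;
  backupPath?: string;
  error?: string;
}

// Backup always runs before write; a failure stops the pipeline at that step
// and nothing here ever touches or removes the backup that was already taken.
async function applyWithBackup(
  ops: RemoteOps,
  content: string,
  restartCmd: string,
  now: Date,
): Promise<ApplyResult> {
  let step: Step = "backup";
  let backupPath: string | undefined;
  try {
    backupPath = await ops.backup(now);
    step = "write";
    await ops.write(content);
    step = "restart";
    await ops.restart(restartCmd);
    return { ok: true, backupPath };
  } catch (err) {
    // Report which step failed; the backup path is still returned if one exists.
    return { ok: false, failedStep: step, backupPath, error: String(err) };
  }
}
```

The function only depends on `RemoteOps`, so the same ordering holds in both `shell` and `wrapper` mode.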
### Requirement: HuggingFace model pull
The editor SHALL expose a non-blocking endpoint to pull a HuggingFace model
repository onto a host into its models directory, validating the repository id
and streaming progress over the `control_job` channel.
#### Scenario: Valid repo id is accepted and runs as a job
- **WHEN** `POST /api/hosts/:id/pull` is called with a repo id matching `org/name`
- **THEN** the request returns 202 and a `control_job` (jobType `action`, `detail.kind = pull`) reports progress and a terminal status
#### Scenario: Malformed repo id is rejected
- **WHEN** the pull endpoint receives a repo id containing spaces, shell metacharacters, or path traversal
- **THEN** the request is rejected before any SSH command is issued
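
A sketch of the server-side check, using the pattern from the proposal. One caveat: the pattern alone rejects spaces, shell metacharacters, and multi-segment traversal such as `../../etc/passwd`, but it accepts a two-segment id like `../models`, because `.` is an allowed character. The extra segment check below is an assumption added for illustration, not part of the documented validation; `isValidRepoId` is a hypothetical name.

```typescript
const REPO_ID = /^[A-Za-z0-9._-]+\/[A-Za-z0-9._-]+$/;

function isValidRepoId(repo: string): boolean {
  if (!REPO_ID.test(repo)) return false;
  // Assumed hardening: refuse "." and ".." segments, which the pattern lets through.
  return repo.split("/").every((segment) => segment !== "." && segment !== "..");
}

const examples = [
  "org/model",        // accepted
  "org/my model",     // rejected by the pattern: space
  "org/model;rm",     // rejected by the pattern: shell metacharacter
  "../../etc/passwd", // rejected by the pattern: more than one slash
  "../models",        // passes the pattern, rejected by the segment check
];

for (const repo of examples) {
  console.log(repo, isValidRepoId(repo));
}
```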

@@ -0,0 +1,29 @@
# Tasks — BooControl SSH editor verb-mode + model pull
## T1 — schema
- [x] `apps/control/src/schema.sql`: `ALTER TABLE control_hosts ADD COLUMN IF NOT EXISTS ssh_mode TEXT NOT NULL DEFAULT 'shell'`. Verify: `pnpm -C apps/control build`.
## T2 — RemoteOps seam (shell + wrapper)
- [x] In `ssh-config.ts` add the `RemoteOps` interface + `shellOps(target, configPath, exec)` (current command strings) + `wrapperOps(target, exec)` (verbs `read`/`backup`/`write`/`restart`). Verify: existing `ssh-config.test.ts` still green.
## T3 — thread mode through the pipeline
- [x] `readRemoteConfig` and `applyRemoteConfig` accept `mode: 'shell'|'wrapper'` (default `'shell'`) and select ops. `applyRemoteConfig` backup uses the ops' returned path. Verify: `pnpm -C apps/control test` (ssh-config shell-mode unchanged).
## T4 — wrapper-mode tests
- [x] Add tests: wrapper ops emit `read`/`backup`/`write`(stdin)/`restart` verbs; `applyRemoteConfig({mode:'wrapper'})` reads the backup path from the `backup` verb's stdout; failure at each step reported. Verify: `pnpm -C apps/control test`.
## T5 — model pull job
- [x] `services/model-pull.ts`: `runModelPull` with server-side repo-id validation, wrapper `pull <repo>` verb (plus a best-effort shell-mode path using a `models_dir`), `control_job` (jobType `action`, `detail.kind='pull'`) progress. Verify: `model-pull.test.ts` (validation accept/reject + verb emission).
## T6 — routes
- [x] `routes/ssh-config.ts`: accept `sshMode` in `PATCH /api/hosts/:id`; pass each host's `ssh_mode` into read/diff/apply; add `POST /api/hosts/:id/pull {repo}` (202, non-blocking). Verify: `pnpm -C apps/control build`.
## T7 — UI
- [x] `HostConfigEditor.tsx`: SSH-mode selector (`shell`/`wrapper`) in the settings form; a "Pull model" repo input + button that POSTs and surfaces job status. Verify: `npx tsc -p apps/web/tsconfig.app.json --noEmit`.
## T8 — gates
- [x] Full gates: control build + test, web tsc. Verify each command above passes.
## Deferred (YAGNI)
- Dedicated `control_job` jobType `pull` (reuse `action`). Reopen trigger: pull needs distinct UI filtering from other actions.
- `huggingface-cli` progress-percent parsing. Reopen trigger: operators want a progress bar rather than streamed lines.