Files
boocode/apps/server/src/services/grant_resolver.ts
indifferentketchup b52c5df705 v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root
When the agent needed context from another repo, pathGuard rejected every read
with no recovery path. This batch adds a reactive request_read_access flow:
pathGuard's error now hints at the tool, the model emits a structured request,
the inference loop pauses (same mechanism as ask_user_input), the user picks
Allow/Deny via inline chips, and subsequent reads under the granted root succeed
for the rest of the session.

Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[]
(idempotent ADD COLUMN IF NOT EXISTS).

Grant unit (design D1): nearest registered projects.path ancestor →
nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml)
under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks
ancestors with a per-iteration whitelist invariant check so symlinked
input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask).

Path-guard: optional extraRoots arg threaded from session.allowed_read_paths
through executeToolCall to view_file / list_dir / grep / find_files. The
ToolDef.execute signature gets an optional third param; non-FS tools
ignore it. view_file re-anchors the secret-guard check on basename(real)
whenever a relative path starts with "../" so .env / id_rsa* etc. still
deny across grant roots.

Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input.
On 'allow' it re-resolves the grant root (state may have changed since
prompt — auto-falls to denial reason text on failure, not 500), array_appends
to sessions.allowed_read_paths with in-memory dedup, then publishes
tool_result + session_updated frames and enqueues the next assistant turn.

PATCH /api/sessions/:id allowed_read_paths supports revocation only. Zod
refines absolute + no traversal markers; runtime findUnauthorizedAdditions
guard rejects any entry not already present in the row, so a malicious
curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of
bypassing the grant flow (Sam's compliance-review action item).

Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny)
and answered (granted/denied summary with the resolved root) variants;
MessageList.flatten/group special-cases the tool name; SettingsPane adds a
per-session grants list with per-row revoke that PATCHes the shortened
array.

Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including
explicit cases for symlink escape mid-walk, walk-bound termination at
whitelist root, /etc bypass attempt via PATCH, and nearest-project
disambiguation. 292 total server tests green.

Pairs with v1.13.16-xml-parser — the model now self-recovers from both
a wrong tool name AND from a refused path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 21:45:52 +00:00

162 lines
6.2 KiB
TypeScript

// v1.13.17-cross-repo-reads: derives the grant root for a path the user is
// being asked to approve cross-repo read access to.
//
// Per design decision D1: grant unit = nearest registered project root,
// then nearest path-whitelist ancestor that looks like a repo root, then
// refuse. Granting the literal file path is too narrow (next file in the
// same repo re-prompts). Granting an arbitrary parent dir over-scopes.
//
// The resolver runs in two contexts:
// 1. request_read_access.execute — pre-prompt validation (cheap; bails
// early if the path can't plausibly be granted so the user is never
// asked about /etc/passwd)
// 2. POST /api/chats/:id/grant_read_access — at decision time, re-derives
// the root and persists it on sessions.allowed_read_paths
//
// Sam (2026-05-22 dispatch confirmation): "in the project-root resolver
// ancestor walk, stop the moment parent exits PROJECT_ROOT_WHITELIST or hits
// filesystem root — check on every iteration, not just final parent.
// Symlinked input must not be able to escape the whitelist during the
// walk." Hence the loop here checks both the walk bound AND the still-
// inside-whitelist invariant every step.
import { access, realpath } from 'node:fs/promises';
import { constants } from 'node:fs';
import { dirname, isAbsolute, sep } from 'node:path';
import type { Sql } from '../db.js';
// Files whose presence in a directory marks it as a repo root for grant
// purposes. Kept narrow on purpose; broader heuristics (e.g. ".project",
// "pyproject.toml") can be added with measured intent. Each entry is a
// literal basename — no globs.
const REPO_MARKERS: ReadonlyArray<string> = [
'.git',
'package.json',
'go.mod',
'Cargo.toml',
];
export type GrantResolution =
| { ok: true; root: string; source: 'project' | 'whitelist' }
| { ok: false; reason: string };
function isUnder(child: string, parent: string): boolean {
return child === parent || child.startsWith(parent + sep);
}
async function exists(path: string): Promise<boolean> {
try {
await access(path, constants.F_OK);
return true;
} catch {
return false;
}
}
async function isRepoShaped(dir: string): Promise<boolean> {
for (const marker of REPO_MARKERS) {
if (await exists(`${dir}${sep}${marker}`)) return true;
}
return false;
}
// Resolves an absolute path to its grant root or refuses with a reason
// string suitable for surfacing to the model. Pure helper — no DB writes,
// no broker publishes. Caller persists the root on session.allowed_read_paths
// if it wants the grant to stick.
//
// Arguments:
// sql — used only to read projects.path (no writes)
// requestedPath — absolute path the model wants to read
// projectRoot — the session's primary project root (already
// realpath'd by caller). Used to short-circuit
// "already in scope".
// whitelistRoot — PROJECT_ROOT_WHITELIST from config (default /opt).
// Walk bound for the repo-shape fallback.
//
// Returns { ok: true, root, source } on success; { ok: false, reason } else.
export async function resolveGrantRoot(
sql: Sql,
requestedPath: string,
projectRoot: string,
whitelistRoot: string,
): Promise<GrantResolution> {
if (typeof requestedPath !== 'string' || requestedPath.length === 0) {
return { ok: false, reason: 'path is required' };
}
if (!isAbsolute(requestedPath)) {
return { ok: false, reason: 'path must be absolute' };
}
// Resolve symlinks so subsequent ancestor checks compare apples-to-apples
// with realpath'd projectRoot. If the path doesn't exist at all, bail
// before bothering the user — the model is asking about a phantom.
let real: string;
try {
real = await realpath(requestedPath);
} catch {
return { ok: false, reason: `path does not exist: ${requestedPath}` };
}
// Whitelist guard. Symlinked inputs can resolve outside the whitelist
// even when the surface-form path looks inside it; that's why we test
// the *real* path here, not the requested one.
let realWhitelist: string;
try {
realWhitelist = await realpath(whitelistRoot);
} catch {
return { ok: false, reason: `whitelist root does not exist: ${whitelistRoot}` };
}
if (!isUnder(real, realWhitelist)) {
return { ok: false, reason: 'path outside permitted scope' };
}
// Already in scope? No prompt needed; the tool's caller should retry.
if (isUnder(real, projectRoot)) {
return { ok: false, reason: 'path already accessible without a grant' };
}
// Look for a registered project whose root is an ancestor of the
// requested path. Pick the LONGEST match (nearest ancestor wins) so
// sub-projects don't get over-broadened.
const projectRows = await sql<{ path: string }[]>`
SELECT path FROM projects WHERE status = 'open'
`;
let bestProject: string | null = null;
for (const row of projectRows) {
if (!row.path) continue;
if (!isUnder(real, row.path)) continue;
if (bestProject === null || row.path.length > bestProject.length) {
bestProject = row.path;
}
}
if (bestProject !== null) {
return { ok: true, root: bestProject, source: 'project' };
}
// Repo-shape fallback. Walk from the requested path upward toward the
// whitelist root. At every iteration: confirm we're still inside the
// whitelist (so a symlinked component can't slip the bound mid-walk)
// and confirm we haven't hit the filesystem root. The first dir with a
// REPO_MARKER child is the grant root.
let cursor = real;
while (true) {
// Don't grant the whitelist root itself — that would be far too broad.
if (cursor === realWhitelist) {
return { ok: false, reason: 'no repo-shaped ancestor found under whitelist' };
}
if (!isUnder(cursor, realWhitelist)) {
return { ok: false, reason: 'path outside permitted scope' };
}
const parent = dirname(cursor);
if (parent === cursor) {
// Hit filesystem root without finding a repo marker.
return { ok: false, reason: 'no repo-shaped ancestor found under whitelist' };
}
if (await isRepoShaped(cursor)) {
return { ok: true, root: cursor, source: 'whitelist' };
}
cursor = parent;
}
}