v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root

When the agent needed context from another repo, pathGuard rejected every read
with no recovery path. This batch adds a reactive request_read_access flow:
pathGuard's error now hints at the tool, the model emits a structured request,
the inference loop pauses (same mechanism as ask_user_input), the user picks
Allow/Deny via inline chips, and subsequent reads under the granted root succeed
for the rest of the session.

Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[]
(idempotent ADD COLUMN IF NOT EXISTS).

Grant unit (design D1): nearest registered projects.path ancestor →
nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml)
under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks
ancestors with a per-iteration whitelist invariant check so symlinked
input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask).

Path-guard: optional extraRoots arg threaded from session.allowed_read_paths
through executeToolCall to view_file / list_dir / grep / find_files. The
ToolDef.execute signature gets an optional third param; non-FS tools
ignore it. view_file re-anchors the secret-guard check on basename(real)
whenever a relative path starts with "../" so .env / id_rsa* etc. still
deny across grant roots.

Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input.
On 'allow' it re-resolves the grant root (state may have changed since
prompt — auto-falls to denial reason text on failure, not 500), array_appends
to sessions.allowed_read_paths with in-memory dedup, then publishes
tool_result + session_updated frames and enqueues the next assistant turn.

PATCH /api/sessions/:id allowed_read_paths supports revocation only. Zod
refines absolute + no traversal markers; runtime findUnauthorizedAdditions
guard rejects any entry not already present in the row, so a malicious
curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of
bypassing the grant flow (Sam's compliance-review action item).

Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny)
and answered (granted/denied summary with the resolved root) variants;
MessageList.flatten/group special-cases the tool name; SettingsPane adds a
per-session grants list with per-row revoke that PATCHes the shortened
array.

Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including
explicit cases for symlink escape mid-walk, walk-bound termination at
whitelist root, /etc bypass attempt via PATCH, and nearest-project
disambiguation. 292 total server tests green.

Pairs with v1.13.16-xml-parser — the model now self-recovers from both
a wrong tool name AND from a refused path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-22 21:45:52 +00:00
parent 2e1a81de72
commit b52c5df705
21 changed files with 1610 additions and 41 deletions

View File

@@ -0,0 +1,70 @@
// v1.13.17-cross-repo-reads: PATCH /api/sessions/:id allowed_read_paths
// subset enforcement. Sam flagged in the compliance review that without a
// runtime subset check, a malicious client could POST
// {"allowed_read_paths":["/etc"]}
// and bypass the user-consent grant flow entirely. The findUnauthorizedAdditions
// helper is the guard; tests pin its behavior so a regression in the helper
// or its callsite (PATCH handler in sessions.ts) trips CI before prod.
import { describe, it, expect } from 'vitest';
import { findUnauthorizedAdditions } from '../sessions.js';
describe('findUnauthorizedAdditions — PATCH allowed_read_paths subset guard', () => {
it('returns no extras when requested is empty (full revoke)', () => {
expect(findUnauthorizedAdditions(['/opt/forks/foo'], [])).toEqual([]);
});
it('returns no extras when requested is a strict subset (single revoke)', () => {
expect(
findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], ['/opt/forks/foo']),
).toEqual([]);
});
it('returns no extras when requested equals prior (no-op PATCH)', () => {
expect(
findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], [
'/opt/forks/foo',
'/opt/forks/bar',
]),
).toEqual([]);
});
it('flags an unauthorized addition when prior is empty', () => {
// The /etc bypass attempt — Sam's specific concern from the compliance
// review. Without this guard, the PATCH would have written /etc directly.
expect(findUnauthorizedAdditions([], ['/etc'])).toEqual(['/etc']);
});
it('flags a single unauthorized addition mixed in with valid revokes', () => {
// The attacker still tries to be sneaky: keep one legit entry, drop
// another, slip in a new one. The guard catches the addition regardless
// of how the rest of the array shrinks.
expect(
findUnauthorizedAdditions(['/opt/forks/foo', '/opt/forks/bar'], [
'/opt/forks/foo',
'/var/secrets',
]),
).toEqual(['/var/secrets']);
});
it('flags every unauthorized addition when there are multiple', () => {
expect(
findUnauthorizedAdditions(['/opt/forks/foo'], ['/opt/forks/foo', '/etc', '/root']),
).toEqual(['/etc', '/root']);
});
it('treats requested duplicates correctly (each occurrence checked)', () => {
// If the requested array has duplicates of an unauthorized entry, the
// guard surfaces each one. (A frontend would never send duplicates, but
// the guard's contract shouldn't assume that.)
expect(findUnauthorizedAdditions([], ['/etc', '/etc'])).toEqual(['/etc', '/etc']);
});
it('does not flag entries present in prior even if requested has duplicates', () => {
// Duplicate of an authorized entry passes — the membership check is by
// value, not by index. Settled by Set.has semantics.
expect(
findUnauthorizedAdditions(['/opt/forks/foo'], ['/opt/forks/foo', '/opt/forks/foo']),
).toEqual([]);
});
});

View File

@@ -1,7 +1,13 @@
import type { FastifyInstance } from 'fastify';
import { z } from 'zod';
import type { Sql } from '../db.js';
import type { Config } from '../config.js';
import type { Broker } from '../services/broker.js';
import type { Chat, Message, Session, ToolCall } from '../types/api.js';
// v1.13.17-cross-repo-reads: grant_read_access resolves the grant root at
// decision time (not at request time) so concurrent project changes don't
// stale-bind the resolution.
import { resolveGrantRoot } from '../services/grant_resolver.js';
const SendBody = z.object({
content: z.string().min(1).max(64_000),
@@ -47,6 +53,21 @@ const AskUserInputArgs = z.object({
.max(3),
});
// v1.13.17-cross-repo-reads: grant decision body. tool_call_id is the
// model-emitted id (e.g. "call_abc123"), not a UUID. decision is binary.
const GrantReadAccessBody = z.object({
tool_call_id: z.string().min(1),
decision: z.enum(['allow', 'deny']),
});
// Same shape as services/request_read_access.ts RequestReadAccessInput.
// Re-derived to avoid the services/tools.ts import (matches the
// AskUserInputArgs pattern above).
const RequestReadAccessArgs = z.object({
path: z.string().min(1),
reason: z.string().min(1).max(500),
});
interface MessageHandlers {
enqueueInference: (sessionId: string, chatId: string, assistantMessageId: string, user: string) => void;
// v1.11: returns a promise that resolves after compaction.process finishes
@@ -76,6 +97,8 @@ interface MessageHandlers {
export function registerMessageRoutes(
app: FastifyInstance,
sql: Sql,
config: Config,
broker: Broker,
handlers: MessageHandlers
): void {
app.get<{ Params: { id: string } }>(
@@ -626,4 +649,234 @@ export function registerMessageRoutes(
return result;
},
);
// v1.13.17-cross-repo-reads: resume an awaiting-grant pause. Mirror shape
// of /answer_user_input (validate, look up via message_parts, UPDATE,
// publish, enqueue). Differences vs /answer_user_input:
// - On 'allow', re-resolves the grant root via grant_resolver (state
// may have changed since the prompt fired — concurrent project add,
// etc.). Resolution failure auto-falls to a denial with reason text
// rather than 500ing.
// - On 'allow' with a valid root, appends to sessions.allowed_read_paths
// (deduplicated) inside the same transaction.
// - On success, also publishes session_updated so an open SettingsPane
// refetches the new grant list.
// Error codes match /answer:
// 400 invalid_body / mismatched_answer_shape (bad args on the tool_call)
// 404 chat_not_found / unknown_tool_call_id
// 409 tool_call_already_answered
app.post<{ Params: { id: string } }>(
'/api/chats/:id/grant_read_access',
async (req, reply) => {
const parsed = GrantReadAccessBody.safeParse(req.body);
if (!parsed.success) {
reply.code(400);
return { error: 'invalid_body', details: parsed.error.flatten() };
}
const { tool_call_id, decision } = parsed.data;
const chatRows = await sql<Chat[]>`
SELECT id, session_id FROM chats WHERE id = ${req.params.id} AND status = 'open'
`;
if (chatRows.length === 0) {
reply.code(404);
return { error: 'chat_not_found' };
}
const chat = chatRows[0]!;
const sessionId = chat.session_id;
// Mirror the /answer lookup: assistant tool_call by id via message_parts.
const callerRows = await sql<{
message_id: string;
payload: { id: string; name: string; args: Record<string, unknown> };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'assistant'
AND p.kind = 'tool_call'
AND p.payload->>'id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const callerRow = callerRows[0];
if (!callerRow) {
reply.code(404);
return { error: 'unknown_tool_call_id' };
}
const foundCall: ToolCall = {
id: callerRow.payload.id,
name: callerRow.payload.name,
args: callerRow.payload.args,
};
if (foundCall.name !== 'request_read_access') {
reply.code(400);
return { error: 'tool_call_not_request_read_access' };
}
const argsParsed = RequestReadAccessArgs.safeParse(foundCall.args);
if (!argsParsed.success) {
reply.code(400);
return { error: 'mismatched_answer_shape', detail: 'tool_call args invalid' };
}
const requestedPath = argsParsed.data.path;
// Find the pending tool row.
const toolRows = await sql<{
message_id: string;
payload: { tool_call_id: string; output: unknown };
}[]>`
SELECT p.message_id, p.payload
FROM message_parts p
JOIN messages m ON m.id = p.message_id
WHERE m.chat_id = ${chat.id}
AND m.role = 'tool'
AND p.kind = 'tool_result'
AND p.payload->>'tool_call_id' = ${tool_call_id}
ORDER BY m.created_at DESC
LIMIT 1
`;
const toolRow = toolRows[0];
if (!toolRow) {
reply.code(404);
return { error: 'unknown_tool_call_id', detail: 'tool message not found' };
}
if (toolRow.payload && toolRow.payload.output !== null) {
reply.code(409);
return { error: 'tool_call_already_answered' };
}
// Look up session + project so we can re-resolve the grant root and
// append to allowed_read_paths atomically. We don't need agent or
// history here — just the project path for the resolver.
const sessionRows = await sql<{
id: string;
project_id: string;
allowed_read_paths: string[];
project_path: string;
}[]>`
SELECT s.id, s.project_id, s.allowed_read_paths, p.path AS project_path
FROM sessions s
JOIN projects p ON p.id = s.project_id
WHERE s.id = ${sessionId}
`;
const sessionRow = sessionRows[0];
if (!sessionRow) {
reply.code(404);
return { error: 'session_not_found' };
}
// Decision branch. 'deny' is the easy path: nothing to resolve or
// persist. 'allow' resolves the grant root; if resolution fails (e.g.
// path was deleted, project removed since prompt) the tool gets a
// denial with the resolver's reason text instead of a 500.
let resultOutput: string;
let grantRoot: string | null = null;
if (decision === 'allow') {
const resolution = await resolveGrantRoot(
sql,
requestedPath,
sessionRow.project_path,
config.PROJECT_ROOT_WHITELIST,
);
if (!resolution.ok) {
resultOutput = `denied: ${resolution.reason}`;
} else {
grantRoot = resolution.root;
resultOutput = `granted: ${grantRoot}`;
}
} else {
resultOutput = 'denied';
}
const newToolResults = {
tool_call_id,
output: resultOutput,
truncated: false,
};
const toolMessageId = toolRow.message_id;
const dbResult = await sql.begin(async (tx) => {
await tx`
UPDATE messages
SET tool_results = ${tx.json(newToolResults as never)}
WHERE id = ${toolMessageId}
`;
// Same delete+insert dance as /answer — UNIQUE (message_id, sequence)
// blocks plain UPDATE on append-style parts.
await tx`DELETE FROM message_parts WHERE message_id = ${toolMessageId} AND kind = 'tool_result'`;
await tx`
INSERT INTO message_parts (message_id, sequence, kind, payload)
VALUES (${toolMessageId}, 0, 'tool_result', ${tx.json(newToolResults as never)})
`;
// Persist the grant if we have one. ARRAY-level dedup — append only
// when the root isn't already present. The session row gets
// touched (updated_at) so the post-update publish below has a
// fresh timestamp.
let allowedRootsAfter = sessionRow.allowed_read_paths;
if (grantRoot !== null) {
if (!sessionRow.allowed_read_paths.includes(grantRoot)) {
const updated = await tx<{ allowed_read_paths: string[] }[]>`
UPDATE sessions
SET allowed_read_paths = array_append(allowed_read_paths, ${grantRoot}),
updated_at = clock_timestamp()
WHERE id = ${sessionId}
RETURNING allowed_read_paths
`;
allowedRootsAfter = updated[0]?.allowed_read_paths ?? sessionRow.allowed_read_paths;
} else {
// Already present — touch updated_at so any open settings
// panel still picks up the no-op via session_updated.
await tx`UPDATE sessions SET updated_at = clock_timestamp() WHERE id = ${sessionId}`;
}
}
const [assistantMsg] = await tx<{ id: string }[]>`
INSERT INTO messages (session_id, chat_id, role, content, status, created_at)
VALUES (${sessionId}, ${chat.id}, 'assistant', '', 'streaming', clock_timestamp())
RETURNING id
`;
await tx`UPDATE chats SET updated_at = clock_timestamp() WHERE id = ${chat.id}`;
return {
tool_message_id: toolMessageId,
assistant_message_id: assistantMsg!.id,
allowed_roots_after: allowedRootsAfter,
};
});
// Publish the deferred tool_result frame so the pending card flips to
// its answered view without a refetch.
handlers.publishSessionFrame(sessionId, {
type: 'tool_result',
tool_message_id: dbResult.tool_message_id,
tool_call_id,
chat_id: chat.id,
output: resultOutput,
truncated: false,
});
// session_updated nudge so any open SettingsPane refetches and sees
// the new allowed_read_paths. We publish on the user channel to match
// the existing PATCH /api/sessions/:id behavior — frontend refetches
// via api.sessions.get on receipt.
const nowIso = new Date().toISOString();
broker.publishUserFrame('default', {
type: 'session_updated',
session_id: sessionId,
project_id: sessionRow.project_id,
// session name doesn't change on grant; we look it up fresh to
// avoid carrying stale state if a rename raced us.
name:
(
await sql<{ name: string }[]>`SELECT name FROM sessions WHERE id = ${sessionId}`
)[0]?.name ?? '',
updated_at: nowIso,
});
handlers.enqueueInference(sessionId, chat.id, dbResult.assistant_message_id, 'default');
reply.code(202);
return {
tool_message_id: dbResult.tool_message_id,
assistant_message_id: dbResult.assistant_message_id,
allowed_read_paths: dbResult.allowed_roots_after,
};
},
);
}

View File

@@ -32,6 +32,29 @@ const PatchBody = z.object({
agent_id: z.string().min(1).max(200).nullable().optional(),
// v1.9: null = inherit from project default; true/false = explicit override.
web_search_enabled: z.boolean().nullable().optional(),
// v1.13.17-cross-repo-reads: revocation pathway. PATCH with a shortened
// list deletes entries; the grant flow itself APPENDS via the separate
// grant_read_access endpoint, never via this PATCH. Frontend treats this
// as "send the new whole array". Per-entry shape validation: must be
// absolute, no NUL, no `/..` traversal segment. Server doesn't re-validate
// whitelist membership on PATCH — entries already in the array were
// placed there by the grant endpoint after a full whitelist+repo-shape
// check. THE SUBSET CHECK (every entry must already be in the current
// array) is enforced at runtime in the PATCH handler below, NOT in this
// zod refinement, because the refinement has no access to the existing
// session row.
allowed_read_paths: z
.array(
z
.string()
.min(1)
.max(1024)
.refine((p) => p.startsWith('/') && !p.includes('\0') && !p.includes('/..'), {
message: 'must be an absolute path without traversal markers',
}),
)
.max(64)
.optional(),
});
async function resolveDefaultModel(sql: Sql, config: Config): Promise<string> {
@@ -40,6 +63,19 @@ async function resolveDefaultModel(sql: Sql, config: Config): Promise<string> {
return config.DEFAULT_MODEL;
}
// v1.13.17-cross-repo-reads: subset enforcement for PATCH allowed_read_paths.
// The PATCH route can only SHRINK the array; growth happens exclusively via
// POST /api/chats/:id/grant_read_access (which requires user consent).
// Returns the list of disallowed-additions; an empty list means the request
// is a valid shrink-or-no-op. Exported for the unit test.
export function findUnauthorizedAdditions(
prior: readonly string[],
requested: readonly string[],
): string[] {
const priorSet = new Set(prior);
return requested.filter((p) => !priorSet.has(p));
}
export function registerSessionRoutes(
app: FastifyInstance,
sql: Sql,
@@ -56,7 +92,7 @@ export function registerSessionRoutes(
}
const status = req.query.status === 'archived' ? 'archived' : 'open';
const rows = await sql<Session[]>`
SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes, allowed_read_paths
FROM sessions
WHERE project_id = ${req.params.id} AND status = ${status}
ORDER BY updated_at DESC
@@ -124,7 +160,7 @@ export function registerSessionRoutes(
app.get<{ Params: { id: string } }>('/api/sessions/:id', async (req, reply) => {
const rows = await sql<Session[]>`
SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes
SELECT id, project_id, name, model, system_prompt, status, created_at, updated_at, agent_id, web_search_enabled, workspace_panes, allowed_read_paths
FROM sessions WHERE id = ${req.params.id}
`;
if (rows.length === 0) {
@@ -150,15 +186,53 @@ export function registerSessionRoutes(
const newAgentId = parsed.data.agent_id ?? null;
const wseProvided = parsed.data.web_search_enabled !== undefined;
const newWse = parsed.data.web_search_enabled ?? null;
// Read the prior name so the post-update publish can skip no-op renames
// (PATCH { name: "Foo" } where the session is already "Foo"). The window
// between SELECT and UPDATE is sub-millisecond in the same request handler;
// a concurrent rename in that gap would just mean one stale publish, which
// existing clients dedup by id.
const before = await sql<{ name: string }[]>`
SELECT name FROM sessions WHERE id = ${req.params.id}
// v1.13.17-cross-repo-reads: tri-state on the wire (undefined = no
// change, [] = clear). Frontend currently uses this PATCH only for
// revocation (delete a single entry from the existing array, send
// shortened result). Append-style grants go through the dedicated
// grant_read_access endpoint inside the inference loop.
const arpProvided = parsed.data.allowed_read_paths !== undefined;
const newArp = parsed.data.allowed_read_paths ?? [];
// Read the prior name + grants so the post-update publish can skip no-op
// renames (PATCH { name: "Foo" } where the session is already "Foo") AND
// so the subset check below has the current grant list to compare against.
// The window between SELECT and UPDATE is sub-millisecond in the same
// request handler; a concurrent rename in that gap would just mean one
// stale publish, which existing clients dedup by id.
const before = await sql<{ name: string; allowed_read_paths: string[] }[]>`
SELECT name, allowed_read_paths FROM sessions WHERE id = ${req.params.id}
`;
const priorName = before[0]?.name;
const priorArp = before[0]?.allowed_read_paths ?? [];
// v1.13.17-cross-repo-reads: subset enforcement. The grant flow is the
// ONLY path that can add entries to allowed_read_paths — PATCH can only
// shrink the array, never grow it. Without this guard, a malicious
// client could POST {"allowed_read_paths":["/etc"]} and bypass the
// user-consent prompt entirely. Sam flagged this in the v1.13.17
// compliance review (2026-05-22).
// Race note: a concurrent grant landing between this SELECT and the
// UPDATE below would briefly make a "shouldn't-have-been-valid" PATCH
// succeed (the newly-granted root sneaks in). Inverse race — a
// legitimate revoke happening alongside a concurrent grant — could
// briefly reject the revoke; the user retries. Both are acceptable
// given the single-user threat model + sub-millisecond window.
if (arpProvided) {
const extras = findUnauthorizedAdditions(priorArp, newArp);
if (extras.length > 0) {
reply.code(400);
return {
error: 'invalid body',
details: {
fieldErrors: {
allowed_read_paths: [
`entries must already be granted; cannot add via PATCH: ${extras.join(', ')}`,
],
},
},
};
}
}
const rows = await sql<Session[]>`
UPDATE sessions
SET
@@ -167,10 +241,11 @@ export function registerSessionRoutes(
system_prompt = COALESCE(${system_prompt ?? null}, system_prompt),
agent_id = CASE WHEN ${agentIdProvided} THEN ${newAgentId} ELSE agent_id END,
web_search_enabled = CASE WHEN ${wseProvided} THEN ${newWse} ELSE web_search_enabled END,
allowed_read_paths = CASE WHEN ${arpProvided} THEN ${sql.array(newArp, 25)} ELSE allowed_read_paths END,
updated_at = clock_timestamp()
WHERE id = ${req.params.id}
RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
agent_id, web_search_enabled, workspace_panes
agent_id, web_search_enabled, workspace_panes, allowed_read_paths
`;
if (rows.length === 0) {
reply.code(404);
@@ -213,7 +288,7 @@ export function registerSessionRoutes(
updated_at = clock_timestamp()
WHERE id = ${req.params.id}
RETURNING id, project_id, name, model, system_prompt, status, created_at, updated_at,
agent_id, web_search_enabled, workspace_panes
agent_id, web_search_enabled, workspace_panes, allowed_read_paths
`;
if (rows.length === 0) {
reply.code(404);