v1.13.17-cross-repo-reads: on-demand read access to paths outside the project root

When the agent needed context from another repo, pathGuard rejected every read with no recovery path. This batch adds a reactive request_read_access flow: pathGuard's error now hints at the tool, the model emits a structured request, the inference loop pauses (same mechanism as ask_user_input), the user picks Allow/Deny via inline chips, and subsequent reads under the granted root succeed for the rest of the session.

Schema: sessions.allowed_read_paths TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[] (idempotent ADD COLUMN IF NOT EXISTS).

Grant unit (design D1): nearest registered projects.path ancestor → nearest repo-shaped ancestor (.git/ / package.json / go.mod / Cargo.toml) under PROJECT_ROOT_WHITELIST → else refuse. grant_resolver.ts walks ancestors with a per-iteration whitelist invariant check so symlinked input can't escape the whitelist mid-walk (Sam's checkpoint-1 ask).

Path-guard: optional extraRoots arg threaded from session.allowed_read_paths through executeToolCall to view_file / list_dir / grep / find_files. The ToolDef.execute signature gets an optional third param; non-FS tools ignore it. view_file re-anchors the secret-guard check on basename(real) whenever a relative path starts with "../" so .env / id_rsa* etc. still deny across grant roots.

Endpoint: POST /api/chats/:id/grant_read_access mirrors /answer_user_input. On 'allow' it re-resolves the grant root (state may have changed since the prompt; on failure it falls back to the denial reason text rather than a 500), array_appends to sessions.allowed_read_paths with in-memory dedup, then publishes tool_result + session_updated frames and enqueues the next assistant turn. PATCH /api/sessions/:id allowed_read_paths supports revocation only. Zod refines absolute + no traversal markers; runtime findUnauthorizedAdditions guard rejects any entry not already present in the row, so a malicious curl -X PATCH -d '{"allowed_read_paths":["/etc"]}' returns 400 instead of bypassing the grant flow (Sam's compliance-review action item).

Frontend: RequestReadAccessCard renders pending (path + reason + Allow/Deny) and answered (granted/denied summary with the resolved root) variants; MessageList.flatten/group special-cases the tool name; SettingsPane adds a per-session grants list with per-row revoke that PATCHes the shortened array.

Tests: 11 grant_resolver, 8 path_guard, 8 sessions PATCH subset, including explicit cases for symlink escape mid-walk, walk-bound termination at whitelist root, /etc bypass attempt via PATCH, and nearest-project disambiguation. 292 total server tests green.

Pairs with v1.13.16-xml-parser: the model now self-recovers from both a wrong tool name AND from a refused path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
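The grant-unit rule from design D1 (registered project ancestor first, then repo-shaped ancestor, else refuse) can be illustrated with a small pure function. This is only a sketch under stated assumptions, not the code in grant_resolver.ts: the name resolveGrantRootSketch, the GrantResolution shape, and the isRepoShaped callback are hypothetical, the input is assumed to be an already-canonicalised absolute POSIX path (the real resolver also deals with symlinks and queries the projects table), and applying the whitelist check to registered projects as well is a guess.

```typescript
// Hypothetical result shape; the real resolver's types are not shown in this commit excerpt.
type GrantResolution =
  | { ok: true; root: string }
  | { ok: false; reason: string };

// Ancestors of an absolute POSIX path, nearest first, including the path itself.
function ancestorsOf(absPath: string): string[] {
  const parts = absPath.split('/').filter((s) => s.length > 0);
  const out: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    out.push('/' + parts.slice(0, i).join('/'));
  }
  return out;
}

// True when dir equals root or sits beneath it (lexical comparison only).
function isWithin(root: string, dir: string): boolean {
  const prefix = root.endsWith('/') ? root : root + '/';
  return dir === root || dir.startsWith(prefix);
}

function resolveGrantRootSketch(
  requested: string,
  registeredProjects: readonly string[],
  whitelist: readonly string[],
  isRepoShaped: (dir: string) => boolean, // e.g. "contains .git/ or package.json"
): GrantResolution {
  if (!requested.startsWith('/')) {
    return { ok: false, reason: 'path must be absolute' };
  }
  if (requested.split('/').some((s) => s === '..' || s === '.')) {
    return { ok: false, reason: 'path contains traversal markers' };
  }
  const underWhitelist = (dir: string): boolean =>
    whitelist.some((root) => isWithin(root, dir));
  const candidates = ancestorsOf(requested);
  // Rule order mirrors D1: registered project first, repo shape second.
  const rules: Array<(dir: string) => boolean> = [
    (dir) => registeredProjects.includes(dir),
    isRepoShaped,
  ];
  for (const matches of rules) {
    for (const dir of candidates) {
      // Invariant re-checked on every step: stop once the walk leaves the whitelist.
      if (!underWhitelist(dir)) break;
      if (matches(dir)) return { ok: true, root: dir };
    }
  }
  return {
    ok: false,
    reason: 'no registered project or repo-shaped ancestor under the whitelist',
  };
}
```

Because the whitelist check runs before each candidate is examined, a request for /etc/passwd is refused on the first iteration, and the walk for a whitelisted path never climbs above the whitelist root.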
2026-05-22 21:45:52 +00:00
parent 2e1a81de72
commit b52c5df705
21 changed files with 1610 additions and 41 deletions
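The revocation-only rule for PATCH /api/sessions/:id described in the commit message comes down to a subset check against the stored row. The sketch below shows one way such a guard could look; the body of findUnauthorizedAdditions, the applyRevocationPatch wrapper, and the response shape are assumptions for illustration and are not taken from the actual handler.

```typescript
// Assumed behaviour: return every proposed entry that is not already granted.
function findUnauthorizedAdditions(
  current: readonly string[],
  proposed: readonly string[],
): string[] {
  const granted = new Set(current);
  return proposed.filter((p) => !granted.has(p));
}

// Hypothetical response shape, used only for this illustration.
type PatchResult =
  | { status: 200; allowed_read_paths: string[] }
  | { status: 400; error: string };

// Accept the PATCH only when it shrinks (or keeps) the existing grant list.
function applyRevocationPatch(
  current: readonly string[],
  proposed: readonly string[],
): PatchResult {
  const additions = findUnauthorizedAdditions(current, proposed);
  if (additions.length > 0) {
    return {
      status: 400,
      error: `cannot add read paths via PATCH: ${additions.join(', ')}`,
    };
  }
  // De-duplicate so the stored array stays minimal.
  return { status: 200, allowed_read_paths: [...new Set(proposed)] };
}
```

With a stored row of two grants, sending back one of them revokes the other, while sending ["/etc"] is rejected with 400 because /etc was never granted through the request_read_access flow.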
--- a/apps/server/src/services/inference/tool-phase.ts
+++ b/apps/server/src/services/inference/tool-phase.ts
@@ -10,6 +10,10 @@ import { insertParts, partsFromAssistantMessage, partsFromToolMessage } from './
 // dispatch layer we no longer know which format produced the call, and the
 // extra signal is harmless for Qwen-derived calls.
 import { formatUnknownToolError } from './tool-suggestions.js';
+// v1.13.17-cross-repo-reads: pre-prompt validation for request_read_access.
+// Resolves the grant root before pausing the loop so the user is never
+// prompted about paths we couldn't grant anyway (e.g. /etc/passwd).
+import { resolveGrantRoot } from '../grant_resolver.js';
 import type {
  InferenceContext,
  StreamResult,
@@ -28,7 +32,8 @@ import { SYNTHESIS_TOOLS, runSynthesisPass } from '../synthesisPipeline.js';

 async function executeToolCall(
  projectRoot: string,
-  toolCall: ToolCall
+  toolCall: ToolCall,
+  extraRoots: readonly string[],
 ): Promise<{ output: unknown; truncated: boolean; error?: string }> {
  const tool = TOOLS_BY_NAME[toolCall.name];
  if (!tool) {
@@ -63,7 +68,7 @@ async function executeToolCall(
    };
  }
  try {
-    const output = await tool.execute(parsed.data, projectRoot);
+    const output = await tool.execute(parsed.data, projectRoot, extraRoots);
    const truncated =
      typeof output === 'object' && output !== null && 'truncated' in output
        ? Boolean((output as { truncated: unknown }).truncated)
@@ -206,7 +211,71 @@ export async function executeToolPhase(
        );
        return;
      }
-      const tres = await executeToolCall(projectRoot, tc);
+      // v1.13.17-cross-repo-reads: request_read_access pauses identically to
+      // ask_user_input EXCEPT for an up-front validation pass — if the path
+      // can't be granted under the whitelist / repo-shape rules, surface an
+      // immediate denial without prompting the user. Per design D1, we never
+      // ask the user about /etc/passwd or paths outside PROJECT_ROOT_WHITELIST.
+      if (tc.name === 'request_read_access') {
+        const tcArgs = tc.args as { path?: unknown; reason?: unknown };
+        const requested =
+          typeof tcArgs.path === 'string' ? tcArgs.path : '';
+        const resolution = await resolveGrantRoot(
+          ctx.sql,
+          requested,
+          projectRoot,
+          ctx.config.PROJECT_ROOT_WHITELIST,
+        );
+        if (!resolution.ok) {
+          // Auto-deny without pausing. The model sees the reason on its
+          // next turn and decides what to do.
+          const stored = {
+            tool_call_id: tc.id,
+            output: `denied: ${resolution.reason}`,
+            truncated: false,
+          };
+          await ctx.sql`
+            UPDATE messages
+            SET tool_results = ${ctx.sql.json(stored as never)}
+            WHERE id = ${toolMessageId}
+          `;
+          await insertParts(
+            ctx.sql,
+            partsFromToolMessage({ tool_results: stored }).map((p) => ({
+              ...p,
+              message_id: toolMessageId,
+            })),
+          );
+          ctx.publish(sessionId, {
+            type: 'tool_result',
+            tool_message_id: toolMessageId,
+            chat_id: chatId,
+            tool_call_id: tc.id,
+            output: stored.output,
+            truncated: false,
+          });
+          return;
+        }
+        // Path is plausibly grantable — install the pending sentinel and
+        // pause. The grant endpoint re-derives the root at decision time
+        // (state may have changed in the meantime) so we don't stash it here.
+        pausingForUserInput = true;
+        const sentinel = { tool_call_id: tc.id, output: null, truncated: false };
+        await ctx.sql`
+          UPDATE messages
+          SET tool_results = ${ctx.sql.json(sentinel as never)}
+          WHERE id = ${toolMessageId}
+        `;
+        await insertParts(
+          ctx.sql,
+          partsFromToolMessage({ tool_results: sentinel }).map((p) => ({
+            ...p,
+            message_id: toolMessageId,
+          })),
+        );
+        return;
+      }
+      const tres = await executeToolCall(projectRoot, tc, session.allowed_read_paths);
      if (SYNTHESIS_TOOLS.has(tc.name)) {
        synthEntries.push({ tc, output: tres.output, ...(tres.error ? { error: tres.error } : {}) });
      }