Codecontext + TypeScript: recon and plan
Date: 2026-05-22 Author: read-only recon, evidence-first
Part A — Current codecontext usage in BooCode
A1. Server-side synthesis pipeline
BooCode runs a forced second-inference synthesis pass after a model emits any of three codecontext tool calls. The list is hard-coded:
/opt/boocode/apps/server/src/services/synthesisPipeline.ts:34-38
export const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
'get_codebase_overview',
'get_framework_analysis',
'get_semantic_neighborhoods',
]);
The pipeline is triggered from the tool-phase, not by the model:
/opt/boocode/apps/server/src/services/inference/tool-phase.ts:200-279.
After tool-phase records the tool_call/tool_result rows it picks the first
synth-eligible entry, expands the inline-truncated head via tmpfs
(readTruncation), pulls top-N referenced files + project docs
(BOOCHAT.md, AGENTS.md, CONTEXT.md, roadmap.md), token-budgets to
32k chars/4 (synthesisPipeline.ts:45-46), streams a second model
inference with a 90s timeout (synthesisPipeline.ts:50), and either
emits a kind='synthesis' message-part or falls through to the
recursive turn on failure (synthesisPipeline.ts:250-272).
The pipeline is invoked once per turn that contains a SYNTHESIS_TOOLS
call — at most one synthesis pass per turn (the loop picks the first
synth-eligible entry, tool-phase.ts:256).
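The "first synth-eligible entry" rule can be illustrated with a minimal sketch. The `ToolResult` shape and the `pickSynthesisCandidate` name are hypothetical and are not taken from tool-phase.ts; only the three tool names come from the set quoted above. The sketch assumes failed tool calls are skipped, in line with the failure path described in A4.

```typescript
// Tool names copied from the SYNTHESIS_TOOLS set quoted above.
const SYNTHESIS_TOOLS: ReadonlySet<string> = new Set([
  'get_codebase_overview',
  'get_framework_analysis',
  'get_semantic_neighborhoods',
]);

// Hypothetical record shape for one tool_call/tool_result pair in a turn.
interface ToolResult {
  name: string;   // tool name as recorded on the tool_call row
  ok: boolean;    // false when the tool returned an error
  output: string; // inline (possibly truncated) result text
}

// Returns the first successful synth-eligible result, or undefined when the
// turn has none. Later eligible entries are ignored, so there is at most one
// synthesis pass per turn.
function pickSynthesisCandidate(
  results: readonly ToolResult[],
): ToolResult | undefined {
  return results.find((r) => r.ok && SYNTHESIS_TOOLS.has(r.name));
}
```

A turn that calls grep, then a failing get_framework_analysis, then get_codebase_overview would synthesise on the overview result only.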
The codecontext tools themselves are HTTP wrappers over the sidecar:
/opt/boocode/codecontext/shim.go:412-419 registers eight POST routes
(/v1/get_codebase_overview … /v1/get_framework_analysis). The shim
serialises calls under callMu and forwards JSON-RPC to a single
codecontext mcp child (shim.go:194, shim.go:328-333). The child
binary is built from github.com/nmakod/codecontext tag v3.2.1
(/opt/boocode/codecontext/Dockerfile:18-22), NOT from the local fork at
/opt/forks/codecontext (which is github.com/nuthan-ms/codecontext,
fork go.mod: /opt/forks/codecontext/go.mod:1). Container reports
codecontext version dev (recon: docker exec boocode_codecontext codecontext --version returned codecontext version dev / Build Date: unknown / Git Commit: unknown).
Wrapper boundaries:
- /opt/boocode/apps/server/src/services/codecontext_client.ts:68-70: hard timeout REQUEST_TIMEOUT_MS = 30_000, inline truncation TRUNCATION_LIMIT = 32_000.
- Same file, lines 80-95: realpath project + target_dir, reject any target_dir that escapes the project root. The eight wrappers never pass target_dir (callCodecontext injects it server-side, line 99).
- Lines 130-141 surface the upstream "content is empty" parser bug (issue #37) with an actionable hint pointing at .codecontextignore.
A2. Agent-exposed tool surface
Source of truth: /opt/boocode/data/AGENTS.md (six agents) plus the
DEFAULT_TOOLS fallback in
/opt/boocode/apps/server/src/services/agents.ts:19-20 (every tool in
ALL_TOOLS).
Per-agent codecontext exposure (cited from
/opt/boocode/data/AGENTS.md:6,41,62,100,138,179):
| Agent | Codecontext tools exposed |
|---|---|
| Code Reviewer (line 3) | get_codebase_overview, get_dependencies, get_file_analysis, get_framework_analysis, get_semantic_neighborhoods, get_symbol_info, search_symbols, watch_changes |
| Debugger (line 38) | same eight |
| Refactorer (line 59) | same eight |
| Architect (line 97) | same eight |
| Security Auditor (line 135) | same eight |
| Prompt Builder (line 176) | none — tools: [view_file, list_dir, grep, find_files] |
Every project-less or no-agent chat falls back to DEFAULT_TOOLS =
ALL_TOOLS (all 21 tools including the eight codecontext ones)
(agents.ts:19-20,196). The BOOCODE_TOOLS env var can narrow further
via resolveToolTier() (tools.ts:712-732): core (4 tools, no
codecontext) / standard (16, all eight codecontext) / all (21).
STANDARD_TOOL_NAMES includes all eight codecontext tools
(tools.ts:719-732).
The eight codecontext tool registrations live in tools.ts:653-660 and
are all marked read-only in READ_ONLY_TOOL_NAMES (tools.ts:689-696).
A3. Actual usage (DB)
Tool-call frequency from message_parts (all-time; DB only has data
back to 2026-05-22 today — see "Claims I did not verify" for the
retention question):
Query: SELECT payload->>'name', COUNT(*) FROM message_parts WHERE kind='tool_call' GROUP BY 1 ORDER BY 2 DESC
| Tool | Calls | Chats |
|---|---|---|
| view_file | 129 | — |
| grep | 81 | — |
| list_dir | 78 | — |
| find_files | 25 | — |
| get_codebase_overview | 24 | 23 |
| search_symbols | 8 | 5 |
| ask_user_input | 5 | 3 |
| foo (typo/invalid) | 4 | 2 |
| view_truncated_output | 4 | 2 |
| git_status | 3 | 2 |
| get_file_analysis | 3 | 1 |
| get_framework_analysis | 1 | 1 |
| ([^ (typo/invalid) | 1 | 1 |
Codecontext-tool calls observed: only 4 of 8 were ever invoked
(get_codebase_overview, search_symbols, get_file_analysis,
get_framework_analysis).
Never called (in the recorded window): get_dependencies,
get_symbol_info, get_semantic_neighborhoods, watch_changes.
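The invoked/never-called split can be derived mechanically from the table. The sketch below hard-codes the codecontext rows of the A3 table; the function name `neverCalled` is hypothetical.

```typescript
// The eight codecontext tools exposed to agents (from the A2 table).
const CODECONTEXT_TOOLS: readonly string[] = [
  'get_codebase_overview',
  'get_dependencies',
  'get_file_analysis',
  'get_framework_analysis',
  'get_semantic_neighborhoods',
  'get_symbol_info',
  'search_symbols',
  'watch_changes',
];

// Call counts copied from the A3 table (codecontext rows only).
const observedCalls: Record<string, number> = {
  get_codebase_overview: 24,
  search_symbols: 8,
  get_file_analysis: 3,
  get_framework_analysis: 1,
};

// Returns every exposed tool with no recorded call.
function neverCalled(
  exposed: readonly string[],
  counts: Record<string, number>,
): string[] {
  return exposed.filter((name) => (counts[name] ?? 0) === 0);
}

console.log(neverCalled(CODECONTEXT_TOOLS, observedCalls));
```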
Per-call args sample (mp.created_at desc, last 12 calls;
recon-verified by query against message_parts):
- get_codebase_overview invoked ~9 times in a row with {"include_stats":true}: repeated overview fetches within minutes.
- search_symbols examples: {"limit":20,"query":"Kind"}, {"limit":20,"query":"SymbolKind"}, {"limit":20,"query":"Kind","framework_type":"typescript"}.
- get_file_analysis invoked 3 times in one chat with file_path = apps/server/src/services/inference.ts, apps/server/src/services/inference/parts.ts, apps/server/src/services/system-prompt.ts; all three failed with "File not found in graph" (see C3).
A4. Hang and drift correlation
Cohort analysis (query against messages joined to chats that
ever used any codecontext tool):
| Cohort | status | rows |
|---|---|---|
| no_codecontext | complete | 24 |
| no_codecontext | cancelled | 1 |
| used_codecontext | complete | 191 |
| used_codecontext | streaming | 2 |
| used_codecontext | failed | 2 |
Two failed assistant messages, both in chats that used codecontext.
Both have empty content — characteristic of a synth pass that aborted
before any deltas streamed (see synthesisPipeline.ts:278-303,
markSynthFailed). DB query:
SELECT id, status, created_at, LEFT(content,200)
FROM messages WHERE role='assistant' AND status IN ('failed','streaming')
returned two failed rows with empty content at 2026-05-22 18:43:39 and
2026-05-22 19:59:56. The 18:43 failure correlates with the codecontext
sidecar log line 2026/05/22 18:44:10.842554 get_framework_analysis target_dir=/opt/boocode duration_ms=30002 status=rpc_error — a 30 s
timeout (codecontext_client.ts:70) under a get_framework_analysis
call (synthesisPipeline.ts:34-38 would have triggered synthesis on
success — failure path skipped synthesis and surfaced the error).
Drift / format leakage: the query
SELECT * FROM messages WHERE role='assistant' AND (content LIKE '%<invoke%' OR content LIKE '%<tool_call%') returned 8 rows; manual
review showed 7 are recon/discussion content where the model is
quoting <invoke> as a topic, not actually emitting a tool call as
text. One real drift case at 2026-05-22 19:05:03 — content begins
"I need to investigate the codecontext fork to write this design
document. Let me start by reading the key files.\n\n<invoke
name="read_file">…" — an Anthropic-format leak. This message is in a
chat that did use codecontext, but the drift evidence is too thin
(n=1) to claim a correlation.
Part B — TypeScript parsing gap
B1. TS-targeted workload
Per-language breakdown of codecontext calls that target a specific file or framework (DB query):
| Language hint | Calls |
|---|---|
| no file_path (overview/framework/symbol search) | 33 |
| ts/tsx | 3 |
| (no other extension observed) | — |
The three TS-targeted calls were all get_file_analysis in a single
chat: inference.ts, inference/parts.ts, system-prompt.ts. All
three failed with File not found in graph (see C3 — relative path
mishandling). One search_symbols call carried
framework_type=typescript (Q="Kind").
So TS is the actual workload for narrow codecontext use; the rest is whole-repo overview/framework analysis with no specific language filter.
B2. Symbol recovery quality
I called the live container against three load-bearing BooCode TS files and compared the symbol list against a manual grep of top-level declarations.
File 1: /opt/boocode/apps/server/src/types/api.ts (371 lines)
Manual count (grep ^(export )?(interface|type|const) ):
- interfaces: 36
- top-level types: 15
- top-level consts: 5
- total significant: 56
Codecontext output (live HTTP call to
http://codecontext:8080/v1/get_file_analysis):
{
"result": "# File Analysis: ...\n**Lines:** 372\n**Symbols:** 10\n\n## Symbols\n\n- **PROJECT_STATUSES** () - Line 2\n- **PROJECT_STATUSES** () - Line 2\n- **CHAT_STATUSES** () - Line 91\n..."
}
Total reported: 10 symbols, all five *_STATUSES consts duplicated
(line 2 appears twice, etc.). After regex-extracting names:
- Unique symbols reported by codecontext: 8 (5 *_STATUSES consts + 3 header strings Language:/Lines:/Symbols:).
- Interfaces / types found: 0 of 51.
- Symbol-recovery rate: 5/56 = ~9% (only the const arrays the JS grammar understands).
Specific misses checked against the actual file
(grep -nE on /opt/boocode/apps/server/src/types/api.ts):
- Line 5 export interface Project: MISSED
- Line 26 export type SessionStatus: MISSED
- Line 28 export interface Session: MISSED
- Line 47 export type WorkspacePaneKind: MISSED
- All 36 interface declarations and 15 type aliases: MISSED.
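The recovery-rate arithmetic used for this file (and for the two files that follow) can be sketched as below. This is a minimal illustration, not the script used during recon; `recoveryRate` is a hypothetical helper, and the sample names are a small subset taken from the lists above.

```typescript
// Fraction of manually counted declarations that appear in the tool output.
// Duplicates on either side are collapsed, which matters here because
// codecontext reported each *_STATUSES const twice.
function recoveryRate(
  expected: readonly string[],
  reported: readonly string[],
): number {
  const expectedUnique = Array.from(new Set(expected));
  if (expectedUnique.length === 0) return 0;
  const reportedSet = new Set(reported);
  const found = expectedUnique.filter((name) => reportedSet.has(name)).length;
  return found / expectedUnique.length;
}

// Small subset of api.ts declarations named in this section.
const expectedSample = [
  'Project', 'SessionStatus', 'Session', 'WorkspacePaneKind',
  'PROJECT_STATUSES', 'CHAT_STATUSES',
];
// Shape of what codecontext returned: duplicated consts plus header noise.
const reportedSample = [
  'PROJECT_STATUSES', 'PROJECT_STATUSES', 'CHAT_STATUSES', 'Language:',
];

console.log(recoveryRate(expectedSample, reportedSample).toFixed(2));
```

With the full counts from this file (5 recovered out of 56 expected) the same function yields 5/56, roughly 9%.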
File 2: /opt/boocode/apps/server/src/services/tools.ts (763 lines)
Manual count: 47 top-level decls
(grep ^(export )?(interface|type|enum|namespace|const|function|class|async function) ).
Codecontext output: 112 symbols reported (but many are noise:
local function-scope variables, the literal token "unknown" from
type cast positions, even raw labels like out:).
Python-extracted from result: 71 unique names. Cross-checked against 20 significant TS exports the file declares:
- Found: ListDirInput, READ_ONLY_TOOL_NAMES, CORE_TOOL_NAMES, STANDARD_TOOL_NAMES (4 / 20)
- MISSED: ToolDef, ViewFileInput, viewFile, listDir, grep, findFiles, viewTruncatedOutput, gitStatus, skillFind, skillUse, skillResource, askUserInput, ALL_TOOLS, TOOLS_BY_NAME, resolveToolTier, toolJsonSchemas. Every exported ToolDef<…>-typed constant is missed because the JS grammar can't parse the TS type annotation (: ToolDef<…>) that precedes the = and bails out of recognising the const at top level.
- Symbol-recovery rate (significant): 4/20 = 20%.
File 3: /opt/boocode/apps/server/src/services/inference/stream-phase.ts (482 lines)
Manual count: 5 top-level decls (2 are export async function,
1 interface, 1 type, 1 const).
Codecontext output: 53 symbols extracted, but the first 20 are header
strings (Language:, Lines:, Symbols:), imports (api.js,
model-context.js, …), local function names from inside bodies
(toolNameById, out:, hasTools), and string literals
(parts:). Neither streamCompletion nor executeStreamPhase (the
two export async function declarations at lines 145, 346) appear in
the symbol list explicitly.
Aggregate: across the three files, codecontext recovers type/interface/enum symbols at effectively 0%, and function/const symbols at roughly 20%. The 9596-symbol whole-repo overview is heavily noise-padded. Generic type parameters and decorators were not checked individually because they're a strict subset of the already-broken case.
B3. Fork status
docs/ts-bindings-design.md does NOT exist. Verified by
ls /opt/forks/codecontext/docs/ts-bindings-design.md → No such file or directory. The /opt/forks/codecontext/docs/ tree has 23 markdown
files; none mention TypeScript bindings work (greps under
/opt/forks/codecontext/docs/ for TypescriptLanguage|tree-sitter-tsx
returned nothing beyond a CodeContext example in HLD.md:831 and
config mentions in ARCHITECTURE.md:297).
go.mod dependencies (/opt/forks/codecontext/go.mod:5-18):
- github.com/tree-sitter/tree-sitter-javascript v0.23.1 (present)
- github.com/tree-sitter/tree-sitter-typescript: NOT present.
TS-as-JS fallback in internal/parser/manager.go:72-79:
// TypeScript - use JavaScript grammar as fallback until TypeScript bindings are fixed
// Both JS and TS have similar syntax and this provides basic parsing capability
tsLang := sitter.NewLanguage(javascript.Language())
m.languages["typescript"] = tsLang
tsParser := sitter.NewParser()
tsParser.SetLanguage(tsLang)
m.parsers["typescript"] = tsParser
The comment claims this provides "basic parsing capability". B2 shows
that interface/type recovery is effectively zero — the JS grammar does
not recognise interface, type, generic params, decorators, or even
TS-typed const declarations.
Downstream code IS prepared for TS-specific nodes. In
internal/parser/manager.go:746-765 nodeToSymbolJS already has
cases for interface_declaration and type_alias_declaration:
case "interface_declaration", "interface":
return &types.Symbol{Type: types.SymbolTypeInterface, ...}
case "type_alias_declaration", "type_declaration":
return &types.Symbol{Type: types.SymbolTypeType, ...}
These cases are dead code with the JS grammar — they only fire when the parser is the TypeScript grammar. The fork already has the symbol extraction wiring; it's just missing the grammar.
SymbolType is open (string), not an iota —
/opt/forks/codecontext/pkg/types/graph.go:14:
type SymbolType string
with constants like SymbolTypeInterface, SymbolTypeType,
SymbolTypeNamespace already declared (graph.go:16-48). No code
changes needed there to add TS-aware symbol types.
Upstream tree-sitter-typescript Go bindings exist. Context7 docs
for /tree-sitter/tree-sitter-typescript show the Go package
github.com/tree-sitter/tree-sitter-typescript exporting
LanguageTypescript() and LanguageTSX():
typescript := sitter.NewLanguage(tree_sitter_typescript.LanguageTypescript())
tsx := sitter.NewLanguage(tree_sitter_typescript.LanguageTSX())
(Context7 query /tree-sitter/tree-sitter-typescript,
"Go bindings package name and how to import…", returned a working
sample.)
The fork (/opt/forks/codecontext) is not what runs in production.
The deployed image is built from github.com/nmakod/codecontext tag
v3.2.1 (/opt/boocode/codecontext/Dockerfile:18-22). The fork is a
separate working tree at /opt/forks/codecontext on
github.com/nuthan-ms/codecontext (/opt/forks/codecontext/go.mod:1).
Any TS-grammar work landing in either repo requires a Dockerfile
update to point at the right source.
Fork HEAD: ba6b94c 2025-09-01 12:43:09 +0530 Merge pull request #29 from nmakod/release-please--branches--main — newer than the
deployed v3.2.1 tag but on the same upstream lineage.
B4. Existing TS-aware alternatives
Searches in /opt/boocode:
- grep -rln 'ts-morph|@typescript/vfs|createCompilerHost' /opt/boocode/apps → no matches in source (only types).
- Only the typescript package is depended on (/opt/boocode/package.json, /opt/boocode/apps/booterm/package.json, /opt/boocode/apps/server/package.json, /opt/boocode/apps/web/package.json; each declares "typescript": "^5.5.0"). That's the tsc compiler, used for building, not for runtime symbol extraction.
- No tool in /opt/boocode/apps/server/src parses TS at runtime for any reason other than what codecontext provides.
So BooCode has no existing fallback for TS symbol data: if codecontext can't extract it, nobody else does.
Part C — Optimization opportunities
C1. Tool surface review
Cross-referencing the agent whitelist (A2) with actual usage (A3):
| Tool | Exposed to 5 agents? | Calls observed | Recommendation |
|---|---|---|---|
| get_codebase_overview | yes | 24 | Keep — load-bearing, synth-triggering |
| search_symbols | yes | 8 | Keep — only viable TS query path |
| get_file_analysis | yes | 3 | Keep but fix relative-path bug (C3) |
| get_framework_analysis | yes | 1 | Low-use; keep for synth signalling |
| get_dependencies | yes | 0 | Demote — unused, considered for removal |
| get_symbol_info | yes | 0 | Demote — unused, considered for removal |
| get_semantic_neighborhoods | yes | 0 | Demote — unused, considered for removal |
| watch_changes | yes | 0 | Remove from agent whitelists; not in SYNTHESIS_TOOLS, so the synthesis pipeline is unaffected |
watch_changes in particular is a state-changing async tool with no
sensible LLM consumer (the model can't await fsnotify events). It
should not be in the 5 agents' whitelists; the synthesis pipeline only
calls 3 specific tools (synthesisPipeline.ts:34-38) so removing
watch_changes from agent whitelists does not affect the pipeline.
get_dependencies, get_symbol_info, get_semantic_neighborhoods
are credible tools but the model never reaches for them — likely a
descriptions/discoverability issue. Either improve their tool
descriptions (the .description strings registered in
tools/codecontext/*.ts) or remove them from agent whitelists.
C2. Latency and token cost
Latencies parsed from the codecontext sidecar access log
(docker logs boocode_codecontext --since 24h | grep duration_ms=):
- Total calls observed: 40 in 24h
- Total time: 610,404 ms
- Avg: 15,260 ms per call
- Min: 1,379 ms
- p50: 9,417 ms
- p90: 27,611 ms
- Max: 30,002 ms (= the 30 s rpc_error timeout)
Sampled MCP-server log lines confirm overview rebuilds cost 2–8 s on
/opt/boocode (6575 files, 115601 symbols, 1186758 chars markdown
in 8.22 s). The shim's per-tool log shows the analysis dominates;
markdown serialization is sub-second.
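The summary figures above can be reproduced from raw log lines with a short script. The sketch below assumes only the `duration_ms=<n>` field visible in the sidecar log line quoted in A4. The nearest-rank percentile method is an assumption; the recon does not state which method produced the p50/p90 values, so small differences are possible.

```typescript
interface LatencySummary {
  count: number;
  totalMs: number;
  avgMs: number;
  minMs: number;
  p50Ms: number;
  p90Ms: number;
  maxMs: number;
}

// Pulls the duration_ms=<n> field out of each log line; other lines are skipped.
function parseDurations(lines: readonly string[]): number[] {
  const out: number[] = [];
  for (const line of lines) {
    const m = /duration_ms=(\d+)/.exec(line);
    if (m) out.push(Number(m[1]));
  }
  return out;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: readonly number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[index];
}

function summarize(durations: readonly number[]): LatencySummary {
  if (durations.length === 0) throw new Error('no duration samples');
  const sorted = [...durations].sort((a, b) => a - b);
  const totalMs = sorted.reduce((sum, d) => sum + d, 0);
  return {
    count: sorted.length,
    totalMs,
    avgMs: totalMs / sorted.length,
    minMs: sorted[0],
    p50Ms: percentile(sorted, 50),
    p90Ms: percentile(sorted, 90),
    maxMs: sorted[sorted.length - 1],
  };
}
```

Feeding it the output of the docker logs command above (one string per line) gives the same count, total, average, min and max; only the percentile rows depend on the method chosen.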
Synthesis pipeline expansion (from docker logs boocode):
Five completed synthesis passes today, sample sizes:
- originalChars (truncated head shipped to synth): 32,078 in every case (= the wrapper's 32 kB cap).
- fullChars (full overview after re-expansion from tmpfs): 83,406 / 83,408 / 83,410 / 97,283 / 97,464.
In other words, every overview is over the wrapper cap and synthesis
always pays a tmpfs round-trip to recover the full content for
reference-file extraction. The full content is not shipped to the
synth model (the truncated head is — synthesisPipeline.ts:141), so
the token-budget contract holds, but the synth still has to wait on
the file I/O.
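The relationship between the inline cap and the token budget can be sketched as follows. TRUNCATION_LIMIT and the chars/4 estimate come from A1; `truncateInline` and `estimateTokens` are hypothetical names, and the real wrapper may append a short truncation notice (the observed 32,078 chars is slightly above the cap), which this sketch leaves out.

```typescript
// Inline cap quoted in A1 (codecontext_client.ts).
const TRUNCATION_LIMIT = 32_000;

interface InlineResult {
  head: string;       // what is shipped inline to the model
  truncated: boolean; // true when the full text exceeded the cap
  fullChars: number;  // size of the untruncated text
}

// Keeps only the first `limit` characters of an oversized result.
function truncateInline(
  full: string,
  limit: number = TRUNCATION_LIMIT,
): InlineResult {
  if (full.length <= limit) {
    return { head: full, truncated: false, fullChars: full.length };
  }
  return { head: full.slice(0, limit), truncated: true, fullChars: full.length };
}

// Rough token estimate using the chars/4 rule of thumb from A1.
function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4);
}
```

Under these assumptions an 83,406-char overview is shipped as a 32,000-char head, about 8,000 estimated tokens instead of roughly 20,900.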
One synthesis timeout in the day (synthesis pass timed out; falling through to recursive turn, chatId a74bfecb…, toolName
get_codebase_overview, 90 s after expansion completed — the synth
inference itself was too slow). The retry inside the same chat then
completed in 31 s with files: 0 (no referenced files extracted),
suggesting the timeout repeated until reference extraction was
empty.
I have no cache-hit statistics to report — the shim does not log
cache hits. The codecontext binary itself logs Refreshing analysis for codebase overview… on every call ([MCP] Refreshing analysis…
appears for each get_codebase_overview in the sidecar log), so the
analysis is rebuilt per call.
C3. Failure modes
Sidecar errors in the last 7 days
(docker logs boocode_codecontext --since 168h | grep -E "status=tool_error|content is empty|panic"):
1. "content is empty" parser bug: 2026-05-22 17:37:41 and 17:43:41, both against /opt/homelabhealth, on frontend/node_modules/hono/dist/adapter/aws-lambda/types.js. The wrapper's .codecontextignore template installation (codecontext_client.ts:30-52) didn't help, even though the file is under node_modules, which is supposedly in the template. This suggests either that the template hadn't been copied yet or that the template's ignore list doesn't cover the path. Each failed call cost ~25 s.
2. Relative-path failures: 2026-05-22 17:56:51 through 17:57:07 (three back-to-back), all get_file_analysis:
       [MCP] ERROR: File not found in graph: apps/server/src/services/inference.ts (available files: 6575)
   The wrapper resolves target_dir to an absolute realpath (codecontext_client.ts:80-99) but file_path is forwarded unchanged. The codecontext binary's file index is keyed on absolute paths (the 115,876-symbol overview reports absolute paths). The model passed apps/server/src/services/inference.ts and the binary couldn't find it. Each failure cost 8–24 s.
3. 30 s rpc_error timeout: 2026-05-22 18:44:10 (get_framework_analysis) and 19:38:06 (search_symbols vs /opt/forks/codecontext). The shim's per-call context timeout is 60 s (shim.go:325) but the wrapper aborts at 30 s (codecontext_client.ts:70), so the client gives up before the shim does; the call still runs to completion on the codecontext side, wasting CPU.
4. Panic in searchSymbols: concurrent map iteration crash in internal/mcp/server.go:1305 (getFilePathForSymbol) under matchesFramework, captured in docker logs boocode_codecontext --since 24h:
       internal/runtime/maps.fatal(...)
       github.com/nuthan-ms/codecontext/internal/mcp.(*CodeContextMCPServer).getFilePathForSymbol(...)
       /build/codecontext/internal/mcp/server.go:1305
   This is an upstream bug in v3.2.1: concurrent map access without a lock. The shim's callMu serialises its calls but the codecontext binary itself appears to have internal concurrency that hits this.
Pattern: the 2 failed assistant messages in A4 align with the 30 s
rpc_error timeout (18:44:10) and one other failure window. Failed
turns leave empty content because synthesis aborts before any
deltas — the model never sees the codecontext error.
Part D — Plan
D1. Tool surface decisions
Title: Trim agent codecontext exposure to the four tools that earn their keep; demote the rest until evidence justifies them.
Why: A3 shows 4 of 8 codecontext tools have zero observed calls,
and watch_changes (a fsnotify-coupled tool) has no LLM consumer.
The synthesis pipeline only auto-triggers on three tools
(synthesisPipeline.ts:34-38), so removing tools from agent
whitelists does not affect the server-side synth path.
Scope: edit /opt/boocode/data/AGENTS.md lines 6, 41, 62, 100,
138 (Code Reviewer, Debugger, Refactorer, Architect, Security
Auditor) to drop get_dependencies, get_symbol_info,
get_semantic_neighborhoods, watch_changes from each tools:
array. Roughly 5 line edits.
Risk: if there's a legitimate workflow not yet captured in 24 h
of DB data, dropping these tools removes that affordance. Mitigation:
keep them registered in tools.ts (the server-side wrappers stay) so
the synth pipeline can still call them if SYNTHESIS_TOOLS expands
later, and so the BOOCODE_TOOLS=standard tier continues to expose
them via the tier filter. Tests: agents.test.ts, tools.test.ts,
any agent-roundtrip tests.
Effort: 30 min.
Sequence: standalone. Unblocks D3 (smaller tool list = smaller
system prompt = better prompt-cache stability per tools.ts:629-632).
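The five edits are mechanical, so they can be scripted. The sketch below assumes each tools: entry in AGENTS.md is a single-line bracketed list in the form shown for the Prompt Builder in A2 (tools: [view_file, list_dir, grep, find_files]); `rewriteToolsLine` is a hypothetical helper, not existing BooCode code.

```typescript
// Tools proposed for removal from agent whitelists in D1.
const DEMOTED_TOOLS: ReadonlySet<string> = new Set([
  'get_dependencies',
  'get_symbol_info',
  'get_semantic_neighborhoods',
  'watch_changes',
]);

// Rewrites one "tools: [a, b, c]" line, dropping demoted names.
// Lines that are not a single-line tools: array are returned unchanged.
function rewriteToolsLine(
  line: string,
  demoted: ReadonlySet<string> = DEMOTED_TOOLS,
): string {
  const m = /^(\s*tools:\s*)\[(.*)\]\s*$/.exec(line);
  if (!m) return line;
  const kept = m[2]
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0 && !demoted.has(name));
  return `${m[1]}[${kept.join(', ')}]`;
}

console.log(
  rewriteToolsLine('tools: [view_file, get_dependencies, search_symbols, watch_changes]'),
);
```

Running it over each line of the file leaves prose and other front-matter keys untouched, which keeps the diff limited to the five tools: lines.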
D2. TypeScript support path
Title: Narrow the TS fork scope to "interfaces, types, enums, top-level typed consts"; defer generics and decorators.
Why: Evidence from B1 (3 TS-targeted calls — all
get_file_analysis — and 1 search_symbols framework_type=typescript)
shows TS is in the workload but at low volume. Evidence from B2
shows symbol recovery is ~0% for interfaces/types and ~20% for
typed consts. That gap is what actually breaks model behaviour:
when the model asks get_file_analysis for api.ts (which IS what
happened today) it gets 10 noise symbols and no interface Project,
interface Session, type SessionStatus. The narrow scope
(declarations only; skip generics, JSX, decorators) covers ~90% of
the recovered-symbol gap and is achievable with one new dependency
and one parser-init change.
Scope:
- /opt/forks/codecontext/go.mod: add github.com/tree-sitter/tree-sitter-typescript v0.23.x to the require block.
- /opt/forks/codecontext/internal/parser/manager.go:72-79: replace the JS-fallback init with
      typescript "github.com/tree-sitter/tree-sitter-typescript/bindings/go"
      ...
      tsLang := sitter.NewLanguage(typescript.LanguageTypescript())
      m.languages["typescript"] = tsLang
      tsxLang := sitter.NewLanguage(typescript.LanguageTSX())
      m.languages["tsx"] = tsxLang
  Plus parser registrations.
- nodeToSymbolJS already handles interface_declaration and type_alias_declaration (lines 746-765), so no extraction code changes are needed for the narrow scope.
- /opt/forks/codecontext/internal/parser/manager.go:357-395 detectLanguage (skim verified to live around line 357): ensure .tsx maps to "tsx" not "typescript". Likely already correct; verify.
- Tests in internal/parser/: add TS-grammar fixtures (a small .ts file with interface, type, enum) to assert recovery.
- Update /opt/boocode/codecontext/Dockerfile:18-22 to clone from the fork instead of github.com/nmakod/codecontext v3.2.1 once the TS-grammar branch lands. Or PR the change upstream first if nmakod/codecontext is open to it.
- Drop the fork's own tree-sitter-javascript dependency? No: the tree-sitter-typescript Go binding is separate and the JS grammar is still needed for .js/.jsx files.
Rough LoC: ~20 lines in manager.go, +1 line go.mod, +1 import, +1 language-detect entry; ~50 lines of tests; ~5 lines in Dockerfile.
Risk: TS grammar parses superset syntax; some TS files may now
hit ERROR nodes the JS grammar happily accepted. Mitigate by
keeping the JS grammar registered for .js/.jsx and not changing
JS handling. Regression risk lives in the codecontext-binary CI
(JS+TS combined corpus) — verify their existing tests still pass.
Tests to add: a fixture file containing each B2 missed symbol and a
manager_test that asserts the symbols are recovered.
Effort: Phase A (grammar swap + tests + Dockerfile pin): 90 min once a build-and-test loop is set up in the fork.
Sequence: Blocked on a decision about whether to PR upstream
(nmakod/codecontext) or fork-and-deploy (nuthan-ms/codecontext).
Unblocks D3 (cleaner TS results = smaller noise in synthesis output
= smaller token cost).
Decision: Narrow, not "drop" and not "full TS support". Drop is wrong because TS is the workload (A2 + B1 show every agent and the codebase under analysis are TS-heavy). Full Phase 3-4 TS support (generics, decorators, full type queries) is overkill for current usage — interface/type/enum recovery captures the model's actual need.
D3. Synthesis pipeline optimizations
Title: Reduce per-turn codecontext latency and cache the overview.
Why: C2 shows avg 15.3 s per codecontext call and an overview that rebuilds on every call. Synthesis always pays the 30 s wrapper timeout when the codecontext binary panics (C3 case 4) or hangs.
Three sub-items:
D3a. Cache the overview at the shim layer. The shim already
serialises calls under callMu (shim.go:74-77). Add a per-target_dir
overview cache keyed on a directory-mtime hash, TTL ~60s.
Sub-second cache hits for repeated get_codebase_overview calls
(today shows ~9 in a single chat over a few minutes).
- File: /opt/boocode/codecontext/shim.go
- LoC: ~80
- Effort: 90 min
- Risk: invalidation. Use the fastest cheap invalidator (mtime of
target_dir + a hash of the file count via
os.ReadDir). On any doubt, bypass cache.
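The cache logic proposed in D3a can be sketched independently of the shim. The shim itself is Go; the TypeScript below only illustrates the lookup and invalidation rules, and every name in it (`OverviewCache`, the fingerprint string format) is hypothetical. The caller is assumed to compute the fingerprint from the target_dir mtime and file count as described above.

```typescript
interface CacheEntry {
  value: string;       // cached overview markdown
  fingerprint: string; // e.g. "mtime:<n>|files:<n>", computed by the caller
  storedAtMs: number;  // clock reading when the entry was stored
}

class OverviewCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without sleeping.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached overview, or undefined on a miss. A stale TTL or a
  // changed fingerprint evicts the entry ("on any doubt, bypass cache").
  get(targetDir: string, fingerprint: string): string | undefined {
    const entry = this.entries.get(targetDir);
    if (!entry) return undefined;
    const expired = this.now() - entry.storedAtMs > this.ttlMs;
    if (expired || entry.fingerprint !== fingerprint) {
      this.entries.delete(targetDir);
      return undefined;
    }
    return entry.value;
  }

  set(targetDir: string, fingerprint: string, value: string): void {
    this.entries.set(targetDir, { value, fingerprint, storedAtMs: this.now() });
  }
}
```

Keying on target_dir and comparing the fingerprint on read (rather than folding it into the key) keeps at most one overview per directory in memory, which matters when each overview is over a megabyte of markdown.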
D3b. Align wrapper and shim timeouts. Wrapper 30 s
(codecontext_client.ts:70), shim ctx 60 s (shim.go:325). The
mismatch wastes CPU when the wrapper gives up but the shim keeps
running. Either drop the shim ctx to 30 s, or raise the wrapper
to 60 s (depending on which budget is right). Recommended: align
both to 45 s, abort upstream on wrapper cancel.
- LoC: 2 lines
- Effort: 30 min
D3c. Fix the relative-path bug in get_file_analysis. The
wrapper resolves target_dir but not file_path. Three failures
in one chat today wasted 48 s of CPU. Fix:
- File: /opt/boocode/apps/server/src/services/tools/codecontext/get_file_analysis.ts (and possibly the shared client at codecontext_client.ts).
- Have the wrapper resolve file_path against the realpath'd project root before forwarding, mirroring target_dir. Error out if the resolved path doesn't start with the project root.
- LoC: ~20
- Effort: 60 min
- Risk: low; the model loses no affordance, since absolute and relative paths both work.
- Tests: codecontext_client.test.ts.
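The resolution rule for D3c can be sketched as a pure function. This is a lexical-only illustration with hypothetical names (`normalizePosix`, `resolveFilePath`); it assumes the project root is already an absolute, realpath'd POSIX path, and a real fix would presumably reuse the wrapper's existing realpath handling so that symlinks are also covered.

```typescript
// Collapses ".", ".." and repeated slashes in an absolute POSIX path.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') {
      out.pop();
      continue;
    }
    out.push(segment);
  }
  return '/' + out.join('/');
}

// Resolves file_path against the project root and rejects anything that
// lands outside it. Relative and absolute inputs are both accepted.
function resolveFilePath(projectRoot: string, filePath: string): string {
  const root = normalizePosix(projectRoot);
  const candidate = filePath.startsWith('/')
    ? normalizePosix(filePath)
    : normalizePosix(`${root}/${filePath}`);
  // Compare against "root/" so that /opt/boocode-other is not treated as
  // being inside /opt/boocode.
  const prefix = root === '/' ? '/' : root + '/';
  if (candidate !== root && !candidate.startsWith(prefix)) {
    throw new Error(`file_path escapes project root: ${filePath}`);
  }
  return candidate;
}

console.log(resolveFilePath('/opt/boocode', 'apps/server/src/services/inference.ts'));
```

With this in place, the relative path from the C3 failures resolves to /opt/boocode/apps/server/src/services/inference.ts, which matches the absolute-path keys the binary reports.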
Sequence: D3c is independent and high-ROI. D3a depends on nothing. D3b is independent. Recommended order: D3c → D3b → D3a.
D4. Removal candidates
1. watch_changes agent exposure (A3 + A2). Server-side handler stays for completeness; it should not appear in agent tools: arrays. Edit /opt/boocode/data/AGENTS.md lines 6, 41, 62, 100, 138.
2. The dead "csharp" comment-out block in /opt/forks/codecontext/internal/parser/manager.go:146-152: delete-on-touch when D2 lands; not part of D2's core scope.
3. The 3 zero-use codecontext tool exposures: get_dependencies, get_symbol_info, get_semantic_neighborhoods. Same surgical edits as item 1. Consider keeping get_dependencies on the Refactorer because the agent description explicitly invokes "Use get_dependencies to map call sites" (AGENTS.md:92-93); if the model isn't using it despite the system-prompt nudge, the description in tools/codecontext/get_dependencies.ts likely needs the same verb-forward rewrite.
Claims I did not verify
- DB retention horizon. All message_parts rows are dated 2026-05-22. That could mean (a) the DB was wiped today, (b) the schema/path moved today, or (c) the project is brand-new and 24 h is genuinely the full history. The CLAUDE.md project context references "v1.13.15-codecontext-synth", which is recent. To verify: docker exec boocode_db psql -U boocode -d boocode -c "SELECT MIN(created_at), MAX(created_at), COUNT(*) FROM messages;" then cross-check against the BooCode roadmap's release dates. A3's all-time query may simply have no older data to find.
- Whether nmakod/codecontext v3.2.1 hosts the same nodeToSymbolJS switch I read in the fork. The fork at /opt/forks/codecontext is nuthan-ms/codecontext per go.mod. The deployed v3.2.1 is nmakod/codecontext. The Dockerfile comment (/opt/boocode/codecontext/Dockerfile:13-16) says the module path differs but "the tagged v3.2.1 source tree is the same either way." To verify, clone https://github.com/nmakod/codecontext at tag v3.2.1 and diff internal/parser/manager.go against the fork (outside this recon's read-only scope).
- Whether tree-sitter-typescript v0.23.x Go bindings actually build under the fork's go 1.24.5 + tree-sitter v0.25.0 combination. Context7 docs confirm the API exists. Confirm by go get github.com/tree-sitter/tree-sitter-typescript@latest followed by go build ./... in a scratch worktree.
- Whether the codecontext panic in searchSymbols is reproducible on /opt/boocode or only on /opt/forks/codecontext (the panic was captured against target_dir /opt/forks/codecontext). Reproduce via docker exec boocode_codecontext wget -qO - --post-data='{"target_dir":"/opt/boocode","query":"foo","limit":10}' --header='Content-Type: application/json' http://localhost:8080/v1/search_symbols.
- Cache hit rate of codecontext analysis (per call vs reused). The MCP-server log line Refreshing analysis for codebase overview… suggests rebuild-every-call, but I did not confirm by reading the codecontext source, only the deployed binary's log output. To verify, read /opt/forks/codecontext/internal/mcp/server.go around the Refreshing analysis… log lines.
- Drift correlation strength. N=1 confirmed drift case is too small to call a correlation with codecontext use. To raise the signal: extend retention, re-query after a week of synthetic load with and without codecontext tools.
- Whether the synth pipeline's "truncated head only" approach ships fewer tokens than a full inlined codecontext result would. Today's budget contract assumes yes (synthesisPipeline.ts:138-145 comment "Truncated head only — full content was used for reference extraction above"). To verify: instrument the per-pass promptTokens and compare against a one-off pass with the full content.
- The Architect/Code-Reviewer agents' system-prompt copy versus actual tool usage. AGENTS.md text claims agents will "Use get_dependencies to map call sites" (line 92) and "Use get_semantic_neighborhoods to find related components" (line 132), but A3 shows neither is called. To verify whether the model is ignoring the prompt or whether these agents simply aren't being invoked, query SELECT s.name, COUNT(*) FROM sessions s JOIN chats c ON c.session_id=s.id JOIN messages m ON m.chat_id=c.id WHERE m.role='assistant' GROUP BY 1 ORDER BY 2 DESC; and compare named agents to chat counts.