## ADDED Requirements

### Requirement: Agent llama_flags frontmatter field

The system SHALL parse a `llama_flags` string field from agent AGENTS.md frontmatter.

#### Scenario: Agent with llama_flags set

- **GIVEN** an agent with `llama_flags: "--cache-type-k q8_0 -c 16384"`
- **WHEN** the agent is parsed from AGENTS.md
- **THEN** `agent.llama_flags` equals `"--cache-type-k q8_0 -c 16384"`

#### Scenario: Agent without llama_flags

- **GIVEN** an agent with no `llama_flags` field in frontmatter
- **WHEN** the agent is parsed from AGENTS.md
- **THEN** `agent.llama_flags` equals `null`

### Requirement: X-Agent-Flags header emission

The inference pipeline SHALL emit an `X-Agent-Flags` HTTP header when the agent has `llama_flags` set.

#### Scenario: Header emitted for agent with flags

- **GIVEN** an agent with `llama_flags: "--cache-type-k q8_0"`
- **WHEN** `streamCompletion()` is called with that agent
- **THEN** the `streamText()` call receives `headers: { 'X-Agent-Flags': '--cache-type-k q8_0' }`

#### Scenario: No header when agent has no flags

- **GIVEN** an agent with `llama_flags: null`
- **WHEN** `streamCompletion()` is called with that agent
- **THEN** no `X-Agent-Flags` header is included in the request

#### Scenario: No header when agent is null

- **GIVEN** no agent (raw chat session)
- **WHEN** `streamCompletion()` is called
- **THEN** no `X-Agent-Flags` header is included in the request

#### Scenario: Whitespace-only flags produce no header

- **GIVEN** an agent with `llama_flags: " "`
- **WHEN** `streamCompletion()` is called with that agent
- **THEN** no `X-Agent-Flags` header is included in the request

### Requirement: Existing sampler fields unchanged

The existing sampler fields (top_k, min_p, etc.) SHALL continue to flow through `providerOptions.openaiCompatible` in the request body, independent of the `X-Agent-Flags` header channel.

#### Scenario: Dual-channel sampling

- **GIVEN** an agent with `top_k: 20` and `llama_flags: "--cache-type-k q8_0"`
- **WHEN** an inference request is made
- **THEN** the request body contains `top_k: 20` via providerOptions
- **AND** the request header contains `X-Agent-Flags: --cache-type-k q8_0`
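
The scenarios above can be illustrated with a minimal TypeScript sketch of how the two channels might be assembled before the `streamText()` call. This is not the actual implementation: the `Agent` and `RequestOptions` shapes and the `buildRequestOptions` helper are hypothetical names introduced for illustration, and only `top_k` and `min_p` are shown as sampler fields. The sketch also assumes the header carries the `llama_flags` value unmodified (not trimmed), which the requirements do not specify.

```typescript
// Hypothetical agent shape: only the fields named in this spec.
interface Agent {
  llama_flags: string | null;
  top_k?: number;
  min_p?: number;
}

// Hypothetical options object that would be spread into the streamText() call.
interface RequestOptions {
  headers?: Record<string, string>;
  providerOptions: { openaiCompatible: Record<string, number> };
}

function buildRequestOptions(agent: Agent | null): RequestOptions {
  // Body channel: sampler fields flow through providerOptions.openaiCompatible.
  const sampler: Record<string, number> = {};
  const topK = agent?.top_k;
  const minP = agent?.min_p;
  if (topK !== undefined) sampler.top_k = topK;
  if (minP !== undefined) sampler.min_p = minP;

  const options: RequestOptions = {
    providerOptions: { openaiCompatible: sampler },
  };

  // Header channel: emitted only when llama_flags is a non-blank string.
  // A null agent, null flags, and whitespace-only flags all produce no header.
  const flags = agent?.llama_flags;
  if (flags != null && flags.trim() !== "") {
    // Assumption: the value is passed through as written, without trimming.
    options.headers = { "X-Agent-Flags": flags };
  }

  return options;
}
```

Because the header and the sampler fields are built independently, an agent that sets both `top_k` and `llama_flags` yields both channels, and removing either field leaves the other untouched.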