indifferentketchup c935687725 chore(openspec): drop 9 superseded proposals + 11 stub archive files
Drop 9 batch proposals that are superseded by the boocode-lift-analysis
(boocontext-audit, conductor upgrades, self-healing/verify-gate skills):
add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform,
conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul,
agent-reliability.

Delete 11 stub archive files (49-66B each, 'Status: Shipped. Archived.' only)
that provide zero documentation value over the existing CHANGELOG.md + git tags.
2026-06-07 22:15:38 +00:00


ADDED Requirements

Requirement: Built-in evaluation prompt templates

The system SHALL ship with a library of prompt templates organized by domain, ready for use with create_llm_as_judge().

Domains and included prompts:

Quality:

  • CORRECTNESS_PROMPT — factual accuracy and completeness
  • CONCISENESS_PROMPT — concise responses without hedging or fluff
  • HALLUCINATION_PROMPT — claims verifiable from context
  • ANSWER_RELEVANCE_PROMPT — output addresses the input question
  • PLAN_ADHERENCE_PROMPT — agent actions match declared plan
  • LAZINESS_PROMPT — detects blank or low-effort responses

RAG:

  • RAG_GROUNDEDNESS_PROMPT — output claims supported by retrieved context
  • RAG_HELPFULNESS_PROMPT — output addresses core question
  • RAG_RETRIEVAL_RELEVANCE_PROMPT — retrieved context is relevant to input

Safety:

  • TOXICITY_PROMPT — personal attacks, hate speech
  • FAIRNESS_PROMPT — stereotyping, discrimination

Security:

  • PII_LEAKAGE_PROMPT — names, contact info, credentials in output
  • PROMPT_INJECTION_PROMPT — delimiter manipulation, roleplay bypass
  • CODE_INJECTION_PROMPT — SQL injection, XSS, path traversal

Trajectory:

  • TRAJECTORY_ACCURACY_PROMPT — logical progression, goal alignment
  • TRAJECTORY_ACCURACY_PROMPT_WITH_REFERENCE — semantically equivalent to reference
  • TOOL_SELECTION_PROMPT — right tools, right order, no redundant calls

Conversation:

  • USER_SATISFACTION_PROMPT — gratitude, resolution, engagement
  • TASK_COMPLETION_PROMPT — was the user's goal achieved
  • AGENT_TONE_PROMPT — appropriate tone and professionalism
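
One way to picture "organized by domain" is a small registry that maps each domain to the prompt constants listed above. This is only a sketch: the names `PROMPTS_BY_DOMAIN`, `all_prompt_names`, and `domain_of` are hypothetical and do not come from the spec; only the domain and prompt names do.

```python
# Hypothetical registry: domain name -> prompt constant names from this spec.
PROMPTS_BY_DOMAIN: dict[str, tuple[str, ...]] = {
    "quality": (
        "CORRECTNESS_PROMPT",
        "CONCISENESS_PROMPT",
        "HALLUCINATION_PROMPT",
        "ANSWER_RELEVANCE_PROMPT",
        "PLAN_ADHERENCE_PROMPT",
        "LAZINESS_PROMPT",
    ),
    "rag": (
        "RAG_GROUNDEDNESS_PROMPT",
        "RAG_HELPFULNESS_PROMPT",
        "RAG_RETRIEVAL_RELEVANCE_PROMPT",
    ),
    "safety": ("TOXICITY_PROMPT", "FAIRNESS_PROMPT"),
    "security": (
        "PII_LEAKAGE_PROMPT",
        "PROMPT_INJECTION_PROMPT",
        "CODE_INJECTION_PROMPT",
    ),
    "trajectory": (
        "TRAJECTORY_ACCURACY_PROMPT",
        "TRAJECTORY_ACCURACY_PROMPT_WITH_REFERENCE",
        "TOOL_SELECTION_PROMPT",
    ),
    "conversation": (
        "USER_SATISFACTION_PROMPT",
        "TASK_COMPLETION_PROMPT",
        "AGENT_TONE_PROMPT",
    ),
}


def all_prompt_names() -> list[str]:
    """Flatten the registry into one list, preserving domain order."""
    return [name for names in PROMPTS_BY_DOMAIN.values() for name in names]


def domain_of(prompt_name: str) -> str:
    """Return the domain a prompt belongs to, or raise KeyError if unknown."""
    for domain, names in PROMPTS_BY_DOMAIN.items():
        if prompt_name in names:
            return domain
    raise KeyError(prompt_name)
```

A registry like this would let a test suite iterate over every shipped prompt and apply the scenario checks uniformly.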

Scenario: Each prompt is a string with {inputs}, {outputs}, {reference_outputs} placeholders

  • WHEN a prompt template is inspected
  • THEN it SHALL be a string compatible with str.format() that contains at least the {outputs} placeholder; {inputs} and {reference_outputs} MAY also appear
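
The scenario above could be verified with the standard library alone. The sketch below uses `string.Formatter().parse()` to list the replacement fields in a template; the helper names `placeholder_names` and `check_template` are hypothetical, and the check covers only what the scenario states (valid `str.format()` syntax and the presence of `{outputs}`).

```python
from string import Formatter


def placeholder_names(template: str) -> set[str]:
    """Return the replacement-field names used in a str.format() template.

    Raises ValueError if the template has unbalanced braces.
    """
    return {
        field
        for _literal, field, _spec, _conversion in Formatter().parse(template)
        if field is not None
    }


def check_template(template: str) -> list[str]:
    """Return a list of problems; an empty list means the template passes."""
    try:
        names = placeholder_names(template)
    except ValueError as exc:
        return [f"not a valid str.format() template: {exc}"]
    if "outputs" not in names:
        return ["missing required {outputs} placeholder"]
    return []
```

Note that literal braces inside a template (for example, a JSON snippet in the rubric) would need to be doubled as `{{` and `}}` to stay compatible with `str.format()`.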

Scenario: Prompt templates follow rubric structure

  • WHEN a prompt template is read
  • THEN it SHALL contain <Rubric>, <Instructions>, and <Reminder> XML sections
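
A structural check for the rubric scenario might look like the following. It assumes each section is written as a paired tag (`<Rubric>...</Rubric>` and so on), which the spec does not state explicitly. `EXAMPLE_PROMPT` is a made-up template for illustration, not one of the shipped prompts, and `missing_sections` is a hypothetical helper name.

```python
import re

REQUIRED_SECTIONS = ("Rubric", "Instructions", "Reminder")

# Illustrative template only; the wording is invented for this sketch.
EXAMPLE_PROMPT = """You are an expert evaluator.

<Rubric>
A good response directly answers the question.
</Rubric>

<Instructions>
Read the input and the output, then judge the output against the rubric.
</Instructions>

<Reminder>
Judge only what is written in the output.
</Reminder>

<input>
{inputs}
</input>

<output>
{outputs}
</output>
"""


def missing_sections(template: str) -> list[str]:
    """Return the required section names that lack a paired open/close tag."""
    missing = []
    for name in REQUIRED_SECTIONS:
        # DOTALL lets the section body span multiple lines.
        if not re.search(rf"<{name}>.*?</{name}>", template, flags=re.DOTALL):
            missing.append(name)
    return missing
```

Calling `missing_sections(EXAMPLE_PROMPT)` returns an empty list, and the same template still formats cleanly with `EXAMPLE_PROMPT.format(inputs=..., outputs=...)` because the angle-bracket tags do not interfere with brace placeholders.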