chore(openspec): drop 9 superseded proposals + 11 stub archive files
Drop 9 batch proposals superseded by the boocode-lift-analysis (boocontext-audit, conductor upgrades, self-healing/verify-gate skills): add-3tier-memory, import-llm-evaluator, import-pregel-engine, plugin-platform, conductor-evolution, code-intelligence-upgrade, dev-workflow, ui-overhaul, agent-reliability.

Delete 11 stub archive files (49-66 bytes each, containing only 'Status: Shipped. Archived.') that add no documentation value beyond the existing CHANGELOG.md and git tags.
@@ -0,0 +1,49 @@
## ADDED Requirements

### Requirement: Built-in evaluation prompt templates

The system SHALL ship with a library of prompt templates organized by domain, ready for use with `create_llm_as_judge()`.

Domains and included prompts:

**Quality:**
- `CORRECTNESS_PROMPT` — factual accuracy and completeness
- `CONCISENESS_PROMPT` — concise responses without hedging or fluff
- `HALLUCINATION_PROMPT` — claims verifiable from context
- `ANSWER_RELEVANCE_PROMPT` — output addresses the input question
- `PLAN_ADHERENCE_PROMPT` — agent actions match declared plan
- `LAZINESS_PROMPT` — detects blank or low-effort responses

**RAG:**
- `RAG_GROUNDEDNESS_PROMPT` — output claims supported by retrieved context
- `RAG_HELPFULNESS_PROMPT` — output addresses core question
- `RAG_RETRIEVAL_RELEVANCE_PROMPT` — retrieved context is relevant to input

**Safety:**
- `TOXICITY_PROMPT` — personal attacks, hate speech
- `FAIRNESS_PROMPT` — stereotyping, discrimination

**Security:**
- `PII_LEAKAGE_PROMPT` — names, contact info, credentials in output
- `PROMPT_INJECTION_PROMPT` — delimiter manipulation, roleplay bypass
- `CODE_INJECTION_PROMPT` — SQL injection, XSS, path traversal

**Trajectory:**
- `TRAJECTORY_ACCURACY_PROMPT` — logical progression, goal alignment
- `TRAJECTORY_ACCURACY_PROMPT_WITH_REFERENCE` — semantically equivalent to reference
- `TOOL_SELECTION_PROMPT` — right tools, right order, no redundant calls

**Conversation:**
- `USER_SATISFACTION_PROMPT` — gratitude, resolution, engagement
- `TASK_COMPLETION_PROMPT` — was the user's goal achieved
- `AGENT_TONE_PROMPT` — appropriate tone and professionalism
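
One way to picture "organized by domain" is a lookup table keyed by domain name. The sketch below is only an illustration: the constant names are copied from the lists above, but the registry name `PROMPT_NAMES_BY_DOMAIN`, the lowercase domain keys, and the helper `domain_of` are hypothetical and not prescribed by this spec.

```python
# Hypothetical registry: domain key -> prompt constant names listed in the spec.
# The spec does not define this structure; it is an assumed layout for illustration.
PROMPT_NAMES_BY_DOMAIN = {
    "quality": (
        "CORRECTNESS_PROMPT",
        "CONCISENESS_PROMPT",
        "HALLUCINATION_PROMPT",
        "ANSWER_RELEVANCE_PROMPT",
        "PLAN_ADHERENCE_PROMPT",
        "LAZINESS_PROMPT",
    ),
    "rag": (
        "RAG_GROUNDEDNESS_PROMPT",
        "RAG_HELPFULNESS_PROMPT",
        "RAG_RETRIEVAL_RELEVANCE_PROMPT",
    ),
    "safety": ("TOXICITY_PROMPT", "FAIRNESS_PROMPT"),
    "security": (
        "PII_LEAKAGE_PROMPT",
        "PROMPT_INJECTION_PROMPT",
        "CODE_INJECTION_PROMPT",
    ),
    "trajectory": (
        "TRAJECTORY_ACCURACY_PROMPT",
        "TRAJECTORY_ACCURACY_PROMPT_WITH_REFERENCE",
        "TOOL_SELECTION_PROMPT",
    ),
    "conversation": (
        "USER_SATISFACTION_PROMPT",
        "TASK_COMPLETION_PROMPT",
        "AGENT_TONE_PROMPT",
    ),
}


def domain_of(prompt_name):
    """Return the domain key that lists the given prompt constant name."""
    for domain, names in PROMPT_NAMES_BY_DOMAIN.items():
        if prompt_name in names:
            return domain
    raise KeyError(prompt_name)
```

A table like this would let tooling enumerate every shipped template per domain without hard-coding the names in several places.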
#### Scenario: Each prompt is a string with {inputs}, {outputs}, {reference_outputs} placeholders

- **WHEN** a prompt template is inspected
- **THEN** it SHALL be a string compatible with `str.format()` containing at least `{outputs}`
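
The placeholder scenario above could be verified roughly as follows. This is a minimal sketch, assuming the judge fills templates with exactly the three keyword arguments named in the scenario title; `placeholder_names`, `check_template`, and `EXAMPLE_PROMPT` are hypothetical names, and the example text is not one of the shipped templates.

```python
from string import Formatter


def placeholder_names(template):
    """Return the set of field names referenced by a str.format() template."""
    return {
        field
        for _, field, _, _ in Formatter().parse(template)
        if field is not None
    }


def check_template(template):
    """Raise if the template is not a str.format() string containing {outputs}."""
    if not isinstance(template, str):
        raise TypeError("prompt template must be a string")
    if "outputs" not in placeholder_names(template):
        raise ValueError("prompt template must contain {outputs}")
    try:
        # Assumption: only these three keyword arguments are ever supplied.
        template.format(inputs="q", outputs="a", reference_outputs="ref")
    except (KeyError, IndexError) as exc:
        raise ValueError(f"unsupported placeholder: {exc}") from exc


# Hypothetical template text, used only to exercise the check.
EXAMPLE_PROMPT = (
    "Question: {inputs}\n"
    "Answer: {outputs}\n"
    "Reference: {reference_outputs}"
)
check_template(EXAMPLE_PROMPT)
```

Literal braces inside a template (for example a JSON snippet in the rubric) would need to be escaped as `{{` and `}}` to pass this check.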
#### Scenario: Prompt templates follow rubric structure

- **WHEN** a prompt template is read
- **THEN** it SHALL contain `<Rubric>`, `<Instructions>`, and `<Reminder>` XML sections
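
A structural check for the rubric scenario might look like the sketch below. It assumes each section is written as a paired opening and closing tag, which the spec implies by "XML sections" but does not spell out; `missing_sections` and `EXAMPLE_PROMPT` are hypothetical names, and the template text is invented for illustration.

```python
import re

# Section names taken from the scenario above.
REQUIRED_SECTIONS = ("Rubric", "Instructions", "Reminder")


def missing_sections(template):
    """Return the required section names that lack a paired <Name>...</Name> block."""
    missing = []
    for name in REQUIRED_SECTIONS:
        pattern = rf"<{name}>.*?</{name}>"
        if not re.search(pattern, template, flags=re.DOTALL):
            missing.append(name)
    return missing


# Hypothetical template text, not one of the shipped prompts.
EXAMPLE_PROMPT = """You are an expert grader.

<Rubric>
A correct answer is factually accurate and complete.
</Rubric>

<Instructions>
Compare the output against the input question.
</Instructions>

<Reminder>
Judge only factual correctness.
</Reminder>

<input>{inputs}</input>
<output>{outputs}</output>
"""
```

Returning the list of missing names, rather than a boolean, makes a failing test report which section a template lacks.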