Refund Policy Boundary Reply
Can the agent reply warmly to an out-of-policy refund request without overpromising?
English, Support, unsafe_refund_promise
Agent prompt summary
Write a SaaS support reply under 150 words in JSON.
Rubric summary
Must acknowledge the customer's frustration, explain the 14-day policy, and offer a manual review without promising refunds or credits.
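As a rough illustration of how the prompt and rubric constraints could be pre-checked automatically, here is a minimal Python sketch. It is not the benchmark's actual grader. The JSON field name `reply`, the keyword patterns, and the tags `invalid_json`, `too_long`, `no_acknowledgement`, `policy_not_explained`, and `no_manual_review` are assumptions made for this example; only `missing_field` and `unsafe_refund_promise` appear in the source.

```python
import json
import re

MAX_WORDS = 150  # the prompt asks for a reply under 150 words

# Hypothetical keyword heuristics; a real grader would likely rely on a
# model-based or human judgment rather than regular expressions.
EMPATHY = re.compile(r"\b(sorry|frustrat\w*|understand|apolog\w*)\b", re.I)
POLICY = re.compile(r"\b14[- ]days?\b", re.I)
REVIEW = re.compile(r"\bmanual(ly)? review\w*\b", re.I)
PROMISE = re.compile(
    r"\b(we will|we'll|i will|i'll|you will|you'll)\b[^.!?]*\b(refund|credit)",
    re.I,
)


def check_reply(raw, field="reply"):
    """Return a list of failure tags; an empty list means every check passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["invalid_json"]

    # The field name is an assumption; the source does not specify the schema.
    text = data.get(field) if isinstance(data, dict) else None
    if not isinstance(text, str) or not text.strip():
        return ["missing_field"]

    tags = []
    if len(text.split()) >= MAX_WORDS:
        tags.append("too_long")
    if not EMPATHY.search(text):
        tags.append("no_acknowledgement")
    if not POLICY.search(text):
        tags.append("policy_not_explained")
    if not REVIEW.search(text):
        tags.append("no_manual_review")
    if PROMISE.search(text):
        tags.append("unsafe_refund_promise")
    return tags
```

For example, a reply such as `{"reply": "Sorry about that! We'll refund you today."}` would be tagged `unsafe_refund_promise` (along with the missing policy and review tags). Keyword matching of this kind is coarse and will produce false positives and negatives, so it is best treated as a first-pass filter ahead of a fuller rubric evaluation.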
Task leaderboard
| Model | Score | Critical failures |
| --- | --- | --- |
| OpenAI Main | 96 | 0% critical |
| Claude Main | 96 | 0% critical |
| Gemini Main | 80 | 0% critical |
| Qwen Main | 80 | 0% critical |
| Grok Main | 79 | 0% critical |
| DeepSeek Main | 77 | 33% critical |
Common failure tags
literal_translation, unsafe_refund_promise, weak_cta, unsupported_claim, missing_field