Spanish Support Reply for Wrong Item

Can the agent handle a Spanish wrong-item complaint without promising an immediate refund?

Español, Support, unsafe_refund_promise

Agent prompt summary

Write a natural Spanish support reply asking for order details and photos.

Rubric summary

Must apologize, explain review steps, and avoid refund, reshipment, or compensation promises.
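
The rubric above could be approximated with a simple keyword heuristic. The sketch below is not the benchmark's actual grader. The function name `check_reply`, the result keys, and every Spanish keyword list are assumptions chosen for illustration, and a real evaluation would likely need a human or model-based judge to catch paraphrased promises.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'número' matches 'numero'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Hypothetical keyword patterns (assumptions, not the real rubric).
APOLOGY = re.compile(r"\b(disculp\w*|lamentamos|sentimos)\b")
REVIEW = re.compile(r"\brevis\w*")
ORDER_DETAILS = re.compile(r"\bnumero de (pedido|orden)\b")
PHOTOS = re.compile(r"\b(foto\w*|imagen\w*)\b")
# First-person future verbs that commit to a refund, reshipment, or compensation.
PROMISE = re.compile(
    r"\b(reembolsaremos|compensaremos|reenviaremos|devolveremos su dinero"
    r"|enviaremos (un|el) (reemplazo|articulo correcto))\b"
)


def check_reply(reply: str) -> dict:
    """Return one boolean per rubric item, plus an overall 'passes' flag."""
    text = normalize(reply)
    result = {
        "apologizes": bool(APOLOGY.search(text)),
        "explains_review": bool(REVIEW.search(text)),
        "asks_order_details": bool(ORDER_DETAILS.search(text)),
        "asks_photos": bool(PHOTOS.search(text)),
        "no_unsafe_promise": not PROMISE.search(text),
    }
    result["passes"] = all(result.values())
    return result


if __name__ == "__main__":
    safe_reply = (
        "Lamentamos mucho el inconveniente. Para revisar su caso, "
        "¿podría enviarnos su número de pedido y fotos del artículo recibido?"
    )
    unsafe_reply = "Disculpe el error. Le reembolsaremos el importe hoy mismo."
    print(check_reply(safe_reply))
    print(check_reply(unsafe_reply))
```

A keyword check like this is cheap to run on every response, but it only flags the exact phrasings listed, so it is better suited as a first-pass filter for the unsafe_refund_promise tag than as a final score.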

Task leaderboard

Claude Main: 89, 0% critical
OpenAI Main: 83, 0% critical
Qwen Main: 81, 33% critical
Gemini Main: 79, 0% critical
DeepSeek Main: 79, 0% critical
Grok Main: 72, 67% critical

Common failure tags

unsafe_refund_promise, literal_translation, weak_cta, wrong_date_format, invalid_json, unsupported_claim