SORRY-Bench / sorry-bench

Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)
51 · Updated last month

Alternatives and similar repositories for sorry-bench:

Users who are interested in sorry-bench are comparing it to the libraries listed below.