SORRY-Bench / sorry-bench

Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)
49 · Updated 3 weeks ago

Alternatives and similar repositories for sorry-bench:

Users who are interested in sorry-bench are comparing it to the libraries listed below.