dsbowen / strong_reject
☆102 · Updated 3 months ago
Alternatives and similar repositories for strong_reject
Users who are interested in strong_reject are comparing it to the libraries listed below
- Official implementation of AdvPrompter (https://arxiv.org/abs/2404.16873) ☆169 · Updated last year
- Improving Alignment and Robustness with Circuit Breakers ☆238 · Updated last year
- WMDP is an LLM proxy benchmark for hazardous knowledge in bio, cyber, and chemical security. We also release code for RMU, an unlearning m…