qizhangli / Gradient-based-Jailbreak-Attacks
Code for our NeurIPS 2024 paper "Improved Generation of Adversarial Examples Against Safety-aligned LLMs"
☆12 · Updated 11 months ago
Alternatives and similar repositories for Gradient-based-Jailbreak-Attacks
Users interested in Gradient-based-Jailbreak-Attacks are comparing it to the repositories listed below.
- Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight) · ☆26 · Updated 10 months ago
- The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem. | Oyster …