Breakend / SelfDestructingModels
☆12 Updated 2 years ago
Alternatives and similar repositories for SelfDestructingModels
Users who are interested in SelfDestructingModels are comparing it to the repositories listed below.
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs" ☆59 Updated 2 months ago
- ☆22 Updated last year
- Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024. ☆114 Updated last year
- ☆42 Updated 11 months ago
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs ☆13 Updated 2 months ago
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition.