VITA-Group / Robust_Weight_Signatures
[ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
☆16 Updated 2 years ago
Alternatives and similar repositories for Robust_Weight_Signatures
Users who are interested in Robust_Weight_Signatures are comparing it to the repositories listed below.
- ☆21 Updated last year
- ☆20 Updated 5 months ago
- ☆44 Updated 3 months ago
- ☆43 Updated 2 years ago
- ☆23 Updated 9 months ago
- ☆14 Updated 7 months ago
- ☆12 Updated 4 months ago
- Code for "Universal Adversarial Triggers Are Not Universal." ☆17 Updated last year
- ☆19 Updated 3 weeks ago
- ☆54 Updated 2 years ago
- ☆41 Updated 8 months ago
- ☆34 Updated 5 months ago
- Host CIFAR-10.2 Data Set ☆13 Updated 3 years ago
- ☆22 Updated 10 months ago
- Codebase for decoding compressed trust. ☆23 Updated last year
- ☆22 Updated 3 months ago
- ECSO (Make MLLM safe without neither training nor any external models!) (https://arxiv.org/abs/2403.09572) ☆23 Updated 7 months ago
- Code for the paper "Evading Black-box Classifiers Without Breaking Eggs" [SaTML 2024] ☆20 Updated last year
- Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates" ☆18 Updated last year
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆35 Updated last week
- ☆53 Updated 2 years ago
- [NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models ☆17 Updated 10 months ago
- ☆39 Updated 9 months ago
- AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies ☆21 Updated 9 months ago
- ☆21 Updated 2 months ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆93 Updated last year
- SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities ☆15 Updated 2 months ago
- OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024 ☆21 Updated 10 months ago
- ☆15 Updated 9 months ago
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆37 Updated last year