Jayfeather1024 / Backdoor-Enhanced-Alignment
☆10 Updated 8 months ago
Related projects
Alternatives and complementary repositories for Backdoor-Enhanced-Alignment
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" ☆35 Updated 4 months ago
- ☆14 Updated 6 months ago
- Stable Backdoor Purification (NeurIPS 2023 & 2024) ☆23 Updated this week
- Improved techniques for optimization-based jailbreaking on large language models ☆42 Updated 5 months ago
- [NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes ☆11 Updated last year
- This is the repository that introduces research topics related to protecting intellectual property (IP) of AI from a data-centric perspec… ☆22 Updated last year
- ☆18 Updated 6 months ago
- ☆11 Updated 2 years ago
- ☆53 Updated last year
- ☆13 Updated 4 months ago
- Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023) ☆28 Updated 10 months ago
- Github repo for One-shot Neural Backdoor Erasing via Adversarial Weight Masking (NeurIPS 2022) ☆14 Updated last year
- Code relative to "Adversarial robustness against multiple and single $l_p$-threat models via quick fine-tuning of robust classifiers" ☆15 Updated last year
- [ICLR 2022] Boosting Randomized Smoothing with Variance Reduced Classifiers ☆11 Updated 2 years ago
- [CVPR 2022] "Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free" by Tianlong Chen*, Zhenyu Zhang*, Yihua Zhang*, Shiyu C… ☆25 Updated 2 years ago
- ☆29 Updated 2 years ago
- The official implementation of USENIX Security'23 paper "Meta-Sift" -- Ten minutes or less to find a 1000-size or larger clean subset on … ☆18 Updated last year
- ☆38 Updated last year
- Official implementation of Towards Robust Model Watermark via Reducing Parametric Vulnerability ☆12 Updated 5 months ago
- Implementation for <Understanding Robust Overfitting of Adversarial Training and Beyond> in ICML'22. ☆12 Updated 2 years ago
- Implementation for <Robust Weight Perturbation for Adversarial Training> in IJCAI'22. ☆14 Updated 2 years ago
- [ICLR 2022 official code] Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? ☆26 Updated 2 years ago
- ☆26 Updated 2 weeks ago
- [NeurIPS 2022] "Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets" by Ruisi Cai*, Zhenyu Zh… ☆18 Updated 2 years ago
- Official Implementation of NIPS 2022 paper Pre-activation Distributions Expose Backdoor Neurons ☆14 Updated last year
- Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback" ☆41 Updated 6 months ago
- [ICLR 2023, Spotlight] Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning ☆28 Updated 11 months ago
- Camouflage poisoning via machine unlearning ☆15 Updated last year
- Codes for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models" ☆55 Updated last year
- RAB: Provable Robustness Against Backdoor Attacks ☆39 Updated last year