sail-sg / AnyDoor
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
☆60 · Apr 8, 2024 · Updated last year
Alternatives and similar repositories for AnyDoor
Users interested in AnyDoor are comparing it to the repositories listed below.
- The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs' ☆24 · May 20, 2025 · Updated 8 months ago
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆118 · Mar 26, 2024 · Updated last year
- Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024) ☆37 · Jan 23, 2024 · Updated 2 years ago
- A Self-Consistent Robust Error (ICML 2022) ☆69 · Jun 25, 2023 · Updated 2 years ago
- PyTorch implementation of NPAttack ☆12 · Jul 7, 2020 · Updated 5 years ago
- ReColorAdv and other attacks from the NeurIPS 2019 paper "Functional Adversarial Attacks" ☆38 · May 31, 2022 · Updated 3 years ago
- Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection" ☆27 · Jul 6, 2024 · Updated last year
- [CVPR23W] "A Pilot Study of Query-Free Adversarial Attack against Stable Diffusion" by Haomin Zhuang, Yihua Zhang and Sijia Liu ☆26 · Aug 27, 2024 · Updated last year
- [NeurIPS-2023] Annual Conference on Neural Information Processing Systems ☆227 · Dec 22, 2024 · Updated last year
- [CCS-LAMPS'24] LLM IP Protection Against Model Merging ☆16 · Oct 14, 2024 · Updated last year
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆65 · Jan 11, 2025 · Updated last year
- ☆40 · Dec 16, 2025 · Updated last month
- ☆109 · Feb 16, 2024 · Updated last year
- Backdooring Multimodal Learning ☆30 · May 4, 2023 · Updated 2 years ago
- ☆22 · Sep 2, 2025 · Updated 5 months ago
- ☆12 · May 6, 2022 · Updated 3 years ago
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images ☆42 · Jan 25, 2024 · Updated 2 years ago
- [NeurIPS 2021] Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training ☆32 · Jan 9, 2022 · Updated 4 years ago
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆274 · Feb 2, 2026 · Updated last week
- Implementation of BadCLIP (https://arxiv.org/pdf/2311.16194.pdf) ☆23 · Mar 23, 2024 · Updated last year
- [ICLR 2021] Unlearnable Examples: Making Personal Data Unexploitable ☆169 · Jul 5, 2024 · Updated last year
- ☆74 · Jan 21, 2026 · Updated 3 weeks ago
- SurFree: a fast surrogate-free black-box attack ☆44 · Jun 27, 2024 · Updated last year
- Code for the paper "Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction" … ☆12 · Sep 6, 2023 · Updated 2 years ago
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi… ☆20 · Jun 12, 2025 · Updated 8 months ago
- ☆10 · Dec 18, 2024 · Updated last year
- Code for Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks (TIFS 2024) ☆13 · Mar 29, 2024 · Updated last year
- ☆10 · Mar 20, 2023 · Updated 2 years ago
- 😎 Up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources ☆490 · Jan 27, 2026 · Updated 2 weeks ago
- Code for the paper "Better Diffusion Models Further Improve Adversarial Training" (ICML 2023) ☆146 · Jul 31, 2023 · Updated 2 years ago
- SaTML 2023, 1st place in CVPR'21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet ☆27 · Dec 29, 2022 · Updated 3 years ago
- Code for ICLR 2025 "Failures to Find Transferable Image Jailbreaks Between Vision-Language Models" ☆37 · Jun 1, 2025 · Updated 8 months ago
- Unofficial implementation of the DeepMind papers "Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples… ☆98 · Mar 4, 2022 · Updated 3 years ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆47 · Apr 15, 2025 · Updated 10 months ago
- [ICLR 2022] Official repository for "Robust Unlearnable Examples: Protecting Data Against Adversarial Learning"