☆37 · Updated Oct 17, 2024
Alternatives and similar repositories for BadEdit
Users who are interested in BadEdit are comparing it to the repositories listed below.
- ☆15 · Updated Dec 12, 2023
- [USENIX Security 2025] SOFT: Selective Data Obfuscation for Protecting LLM Fine-tuning against Membership Inference Attacks ☆19 · Updated Sep 18, 2025
- Code and dataset for the paper: "Can Editing LLMs Inject Harm?" ☆21 · Updated Dec 26, 2025
- 🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access ☆52 · Updated Jun 2, 2025
- Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection" ☆26 · Updated Jul 6, 2024
- Reverse Engineering Imperceptible Backdoor Attacks on Deep Neural Networks for Detection and Training Set Cleansing ☆14 · Updated Feb 18, 2021
- ☆11 · Updated Feb 21, 2022
- ☆14 · Updated May 8, 2024
- ☆70 · Updated Feb 16, 2025
- ☆13 · Updated Oct 20, 2022
- Distribution Preserving Backdoor Attack in Self-supervised Learning ☆20 · Updated Jan 27, 2024
- [CVPR 2024] Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers ☆16 · Updated Oct 24, 2024
- ☆17 · Updated Sep 4, 2024
- ☆18 · Updated Aug 15, 2022
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ☆51 · Updated Dec 23, 2024
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang… ☆21 · Updated Jul 3, 2024
- [ICML 2023] Official code implementation of "Chameleon: Adapting to Peer Images for Planting Durable Backdoors in Federated Learning (htt… ☆43 · Updated Sep 9, 2025
- A list of recent adversarial attack and defense papers (including those on large language models) ☆45 · Updated Jan 25, 2026
- ☆20 · Updated Feb 11, 2024
- Backdooring Multimodal Learning ☆30 · Updated May 4, 2023
- Composite Backdoor Attacks Against Large Language Models ☆22 · Updated Apr 12, 2024
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo: //124.220.228.133:11107 ☆20 · Updated Aug 10, 2024
- ☆22 · Updated Nov 19, 2024
- [ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models ☆48 · Updated Jul 24, 2024
- Code for identifying natural backdoors in existing image datasets. ☆15 · Updated Aug 24, 2022
- [NeurIPS 2025] Mask Image Watermarking (Official Implementation) ☆43 · Updated Nov 9, 2025
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models ☆276 · Updated Feb 2, 2026
- Official Code for ACL 2023 paper: "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid… ☆23 · Updated May 8, 2023
- [S&P'24] Test-Time Poisoning Attacks Against Test-Time Adaptation Models ☆19 · Updated Feb 18, 2025
- A toolbox for backdoor attacks. ☆23 · Updated Jan 13, 2023
- Official code for the ICCV 2023 paper "One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training" ☆20 · Updated Aug 9, 2023
- Example TrojAI Submission ☆27 · Updated Dec 6, 2024
- ☆19 · Updated Mar 9, 2024
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ☆59 · Updated Jan 15, 2025
- Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger" ☆43 · Updated Sep 11, 2022
- Comprehensive Assessment of Trustworthiness in Multimodal Foundation Models ☆27 · Updated Mar 15, 2025
- ☆25 · Updated Jun 16, 2024
- TFLlib: Trustworthy Federated Learning Library and Benchmark ☆62 · Updated Nov 15, 2025
- AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models ☆60 · Updated Apr 8, 2024