zhangrui4041 / Instruction_Backdoor_Attack
☆26Aug 21, 2024Updated last year
Alternatives and similar repositories for Instruction_Backdoor_Attack
Users who are interested in Instruction_Backdoor_Attack are comparing it to the libraries listed below
- Composite Backdoor Attacks Against Large Language Models☆22Apr 12, 2024Updated last year
- [NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models☆274Feb 2, 2026Updated 2 weeks ago
- This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…☆11Aug 24, 2022Updated 3 years ago
- [USENIX Security '24] An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities agai…☆56Mar 22, 2025Updated 10 months ago
- ☆14May 8, 2024Updated last year
- ICL backdoor attack☆17Nov 4, 2024Updated last year
- Working Memory Attack on LLMs☆17May 27, 2025Updated 8 months ago
- ☆13Oct 21, 2021Updated 4 years ago
- Efficient Secure Computation Protocols for Trigonometric Functions via Function Secret Sharing☆20Nov 8, 2022Updated 3 years ago
- Code for paper "Membership Inference Attacks Against Vision-Language Models"☆26Jan 25, 2025Updated last year
- TextGuard: Provable Defense against Backdoor Attacks on Text Classification☆13Nov 7, 2023Updated 2 years ago
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆21Jul 3, 2024Updated last year
- ☆47Sep 29, 2024Updated last year
- Query-Efficient Data-Free Learning from Black-Box Models☆23Mar 20, 2023Updated 2 years ago
- ☆24Jul 25, 2024Updated last year
- Code for paper "The Philosopher’s Stone: Trojaning Plugins of Large Language Models"☆27Sep 11, 2024Updated last year
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]☆109Sep 27, 2024Updated last year
- An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)☆200Apr 10, 2023Updated 2 years ago
- Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)☆24Dec 9, 2021Updated 4 years ago
- ☆28Aug 21, 2023Updated 2 years ago
- [USENIX Security'24] REMARK-LLM: A robust and efficient watermarking framework for generative large language models☆27Oct 23, 2024Updated last year
- ☆31Sep 22, 2024Updated last year
- ☆37Sep 30, 2024Updated last year
- Overcooked! 2 TAS Development Framework☆10Aug 18, 2023Updated 2 years ago
- ☆37Feb 7, 2024Updated 2 years ago
- Cybersecurity Ontology (CyberOnto) and Situational Awareness (CyberSA) help teamwork in Cyber Incident Responses, Control, Containment, a…☆10Sep 15, 2022Updated 3 years ago
- ☆46Aug 4, 2023Updated 2 years ago
- A project from EECS6414M of Winter 2020 at York University☆11Mar 26, 2020Updated 5 years ago
- Fingerprint large language models☆49Jul 11, 2024Updated last year
- SurFree: a fast surrogate-free black-box attack☆44Jun 27, 2024Updated last year
- Cloak, Honey, Trap: Proactive Defenses Against LLM Agents☆15Jul 9, 2025Updated 7 months ago
- ☆13Mar 9, 2025Updated 11 months ago
- Repository of reference Gabriel graph, Internet Topology Zoo, SNDlib, CAIDA and synthetic backbone topologies for networking research☆12Sep 30, 2025Updated 4 months ago
- Official TensorFlow implementation of "Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization" (ICML 2019)☆41Dec 7, 2020Updated 5 years ago
- Official PyTorch implementation of "MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks"☆12Dec 4, 2025Updated 2 months ago
- ☆10Oct 31, 2022Updated 3 years ago
- Disguising Attacks with Explanation-Aware Backdoors (IEEE S&P 2023)☆11Jan 3, 2026Updated last month
- Implementation of our ICLR 2021 paper: Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples.☆11Mar 9, 2021Updated 4 years ago
- The Universal Algebra Calculator☆16Jun 11, 2022Updated 3 years ago