Unofficial implementation of "Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection"
☆26 Jul 6, 2024 Updated last year
Alternatives and similar repositories for virtual-prompt-injection
Users who are interested in virtual-prompt-injection are comparing it to the libraries listed below.
- AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models ☆60 Apr 8, 2024 Updated last year
- ☆11 Oct 3, 2021 Updated 4 years ago
- Code for the paper "Exploring Backdoor Vulnerabilities of Chat Models" ☆18 Apr 13, 2024 Updated last year
- Working Memory Attack on LLMs ☆17 May 27, 2025 Updated 9 months ago
- ☆37 Oct 17, 2024 Updated last year
- ☆15 Jul 8, 2023 Updated 2 years ago
- ☆22 Sep 2, 2025 Updated 6 months ago
- ☆18 Jul 1, 2021 Updated 4 years ago
- Implementation of BadCLIP https://arxiv.org/pdf/2311.16194.pdf ☆23 Mar 23, 2024 Updated last year
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024] ☆109 Sep 27, 2024 Updated last year
- ☆58 May 30, 2024 Updated last year
- [USENIX Security'24] REMARK-LLM: A robust and efficient watermarking framework for generative large language models ☆27 Oct 23, 2024 Updated last year
- ☆24 Feb 2, 2026 Updated last month
- ☆30 Sep 3, 2024 Updated last year
- ☆28 Aug 21, 2023 Updated 2 years ago
- ☆585 Jul 4, 2025 Updated 8 months ago
- [CIKM 2024] Trojan Activation Attack: Attack Large Language Models using Activation Steering for Safety-Alignment. ☆29 Jul 29, 2024 Updated last year
- Re-thinking Federated Active Learning based on Inter-class Diversity (CVPR 2023) ☆32 May 31, 2023 Updated 2 years ago
- Flowlyt is a security analyzer that scans GitHub Actions workflows to detect malicious patterns, misconfigurations, and secrets exposure,… ☆15 Feb 25, 2026 Updated last week
- ☆14 Feb 18, 2026 Updated 2 weeks ago
- You can use it to modify HTTP (S) response values, redirect static file requests to the local file directory, and support batch modificat… ☆18 Nov 30, 2022 Updated 3 years ago
- Python code to automatically produce a summary of a piece of text. ☆12 Sep 8, 2016 Updated 9 years ago
- ☆52 Oct 23, 2023 Updated 2 years ago
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding ☆151 Jul 19, 2024 Updated last year
- ☆16 Nov 8, 2024 Updated last year
- Implementation of an X86 mini OS from scratch. Reference: https://github.com/yyu/osfs00 ☆11 Jan 9, 2023 Updated 3 years ago
- BrainWash: A Poisoning Attack to Forget in Continual Learning ☆12 Apr 15, 2024 Updated last year
- ☆12 Aug 15, 2023 Updated 2 years ago
- [CVPRW'22] A privacy attack that exploits Adversarial Training models to compromise the privacy of Federated Learning systems. ☆12 Jul 7, 2022 Updated 3 years ago
- [AAMAS 2025] Privacy-preserving and Personalized RLHF, with convergence guarantees. The Code contains experiments for training multiple i… ☆15 Apr 16, 2025 Updated 10 months ago
- Rapid Response sample Foundry app ☆17 Updated this week
- Hidden backdoor attack on NLP systems ☆47 Nov 14, 2021 Updated 4 years ago
- SJTU minimalist Chinese LaTeX report template ☆10 Jun 7, 2021 Updated 4 years ago
- ☆10 Jun 24, 2021 Updated 4 years ago
- Multimodal dataset for evaluating continuous authentication performance in smartphones ☆11 Feb 1, 2021 Updated 5 years ago
- The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning… ☆13 Jan 14, 2026 Updated last month
- ☆10 Jan 21, 2019 Updated 7 years ago
- [ICML 2023] Protecting Language Generation Models via Invisible Watermarking ☆13 Sep 8, 2023 Updated 2 years ago
- Blockchain explorer ☆13 May 31, 2018 Updated 7 years ago