papersPapers / BadPromptLinks

Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"

☆40

Alternatives and similar repositories for BadPrompt

Users that are interested in BadPrompt are comparing it to the libraries listed below

Sorting:

Jayfeather1024 / Backdoor-Enhanced-Alignment
☆23Updated 10 months ago
lancopku / SOS
Code for the paper "Rethinking Stealthiness of Backdoor Attack against NLP Models" (ACL-IJCNLP 2021)
☆24Updated 3 years ago
grasses / PoisonPrompt
Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107
☆17Updated last year
byerose / Awesome-Foundation-Model-Security
A curated list of trustworthy Generative AI papers. Daily updating...
☆74Updated last year
lancopku / RAP
Code for the paper "RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models" (EMNLP 2021)
☆25Updated 4 years ago
Vaidehi99 / InfoDeletionAttacks
☆46Updated 8 months ago
ethz-spylab / rlhf-poisoning
Code for paper "Universal Jailbreak Backdoors from Poisoned Human Feedback"
☆61Updated last year
David-Li0406 / AI-Supervision-Risk
☆21Updated 7 months ago
THU-BPM / Robust_Watermark
Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024.
☆34Updated 11 months ago
csdongxian / ANP_backdoor
Codes for NeurIPS 2021 paper "Adversarial Neuron Pruning Purifies Backdoored Deep Models"
☆58Updated 2 years ago
AI45Lab / CodeAttack
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆53Updated 3 weeks ago
umd-huang-lab / VLM-Poisoning
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆55Updated 9 months ago
thunlp / HiddenKiller
Code and data of the ACL-IJCNLP 2021 paper "Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger"
☆43Updated 3 years ago
CryptoAILab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆15Updated last year
AISafety-HKUST / Backdoor_Safety_Tuning
Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight)
☆26Updated 11 months ago
sail-sg / AnyDoor
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
☆58Updated last year
lapisrocks / rpo
Official repository for "Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks"
☆58Updated last year
lancopku / agent-backdoor-attacks
Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]
☆93Updated last year
YancyKahn / CoA
Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM
☆37Updated 9 months ago
HKUST-KnowComp / LLM-Multistep-Jailbreak
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
☆35Updated 2 years ago
cnut1648 / Model-Fingerprint
Fingerprint large language models
☆44Updated last year
Alibaba-AAIG / Oyster
The Oyster series is a set of safety models developed in-house by Alibaba-AAIG, devoted to building a responsible AI ecosystem. | Oyster …
☆52Updated last month
qingjiesjtu / USC
This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.
☆63Updated 10 months ago
thu-ml / Attack-Bard
☆102Updated last year
BrachioLab / adversarial_prompting
☆53Updated 2 years ago
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆19Updated 2 years ago
leix28 / prompt-universal-vulnerability
Implementation of the paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" on Findings of NAACL 2022
☆30Updated 3 years ago
bboylyg / RNP
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023)
☆39Updated last year
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆29Updated last year
inspire-group / RobustRAG
☆21Updated last year