grasses / PoisonPromptLinks

Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107

☆18

Alternatives and similar repositories for PoisonPrompt

Users that are interested in PoisonPrompt are comparing it to the libraries listed below

Sorting:

LiangSiyuan21 / BadCLIP
☆29Updated last year
uw-nsl / CleanGen
[EMNLP 24] Official Implementation of CLEANGEN: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
☆20Updated 8 months ago
bboylyg / RNP
Reconstructive Neuron Pruning for Backdoor Defense (ICML 2023)
☆39Updated last year
papersPapers / BadPrompt
Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"
☆40Updated last year
shiningrain / JailGuard
☆24Updated 8 months ago
NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt
☆53Updated last year
umd-huang-lab / VLM-Poisoning
Code for Neurips 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models"
☆57Updated 10 months ago
PKU-ML / PAT
Code for NeurIPS 2024 Paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning"
☆21Updated 7 months ago
MaTengSYSU / HIMRD-jailbreak
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆14Updated 3 months ago
clearloveclearlove / BEAT
☆14Updated 9 months ago
KuofengGao / Verbose_Images
[ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
☆41Updated last year
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆52Updated last year
YitingQu / unsafe-diffusion
☆40Updated last year
IBM / VillanDiffusion
Code Repo for the NeurIPS 2023 paper "VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models"
☆27Updated 2 months ago
reds-lab / ASSET
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…
☆19Updated 2 years ago
sail-sg / AnyDoor
AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models
☆60Updated last year
grasses / PromptCARE
Code for paper: "PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification", IEEE S&P 2024.
☆33Updated last year
rain152 / PAT
[NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning
☆10Updated last year
TeamPigeonLab / CS-DJ
Accept by CVPR 2025 (highlight)
☆22Updated 5 months ago
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
☆66Updated 8 months ago
Jayfeather1024 / Backdoor-Enhanced-Alignment
☆24Updated 11 months ago
jiawangbai / BadCLIP
Implementation of BadCLIP https://arxiv.org/pdf/2311.16194.pdf
☆23Updated last year
RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆33Updated last year
ZrW00 / MuScleLoRA
The code implementation of MuScleLoRA (Accepted in ACL 2024)
☆10Updated last year
KuofengGao / ASD
[CVPR 2023] Backdoor Defense via Adaptively Splitting Poisoned Dataset
☆48Updated last year
AISafety-HKUST / Backdoor_Safety_Tuning
Backdoor Safety Tuning (NeurIPS 2023 & 2024 Spotlight)
☆27Updated last year
SCLBD / DBD
☆33Updated 3 years ago
tmllab / 2025_ICLR_PiF
☆37Updated 6 months ago
RU-System-Software-and-Security / FeatureRE
☆27Updated 3 years ago
guanjiyang / SAC
☆18Updated 3 years ago