NY1024/BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt

NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt

☆59

Alternatives and similar repositories for BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt

Users who are interested in BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt are comparing it to the repositories listed below.

RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆35 · Oct 23, 2024 · Updated last year
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆58 · Aug 11, 2024 · Updated last year
NY1024 / Jailbreak_GPT4o
☆26 · Jun 5, 2024 · Updated last year
roywang021 / UMK
Code for the ACM MM 2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models
☆31 · Dec 30, 2024 · Updated last year
RylanSchaeffer / AstraFellowship-When-Do-VLM-Image-Jailbreaks-Transfer
Code for the ICLR 2025 paper "Failures to Find Transferable Image Jailbreaks Between Vision-Language Models"
☆37 · Jun 1, 2025 · Updated 9 months ago
Haochen-Luo / CroPA
☆55 · Dec 7, 2024 · Updated last year
MaTengSYSU / HIMRD-jailbreak
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆15 · Aug 7, 2025 · Updated 7 months ago
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
☆73 · Mar 30, 2025 · Updated 11 months ago
Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the paper (AAAI 2024, Oral): Visual Adversarial Examples Jailbreak Large Language Models
☆266 · May 13, 2024 · Updated last year
TeamPigeonLab / CS-DJ
Accepted by CVPR 2025 (Highlight)
☆22 · Jun 8, 2025 · Updated 8 months ago
SaFo-Lab / JailBreakV_28K
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…
☆88 · May 9, 2025 · Updated 9 months ago
itsvaibhav01 / Immune
[CVPR 2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
☆27 · Jun 11, 2025 · Updated 8 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆192 · Oct 15, 2024 · Updated last year
jiaxiaojunQAQ / FOA-Attack
Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment (NeurIPS 2025)
☆49 · Nov 5, 2025 · Updated 4 months ago
AoiDragon / HADES
[ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆38 · Oct 17, 2024 · Updated last year
SaFo-Lab / AdaShield
[ECCV 2024] The official code for "AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shi…
☆71 · Feb 9, 2026 · Updated 3 weeks ago
erfanshayegani / Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥] - [Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆80 · Jun 6, 2024 · Updated last year
tmllab / 2025_ICLR_PiF
☆40 · May 17, 2025 · Updated 9 months ago
UCSC-VLAA / AttnGCG-attack
☆24 · Jun 17, 2025 · Updated 8 months ago
CryptoAILab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆192 · Jun 26, 2025 · Updated 8 months ago
ASTRAL-Group / ASTRA
[CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre…
☆53 · Jul 5, 2025 · Updated 8 months ago
wangyu-ovo / MML
Code for the paper "Jailbreak Large Vision-Language Models Through Multi-Modal Linkage"
☆27 · Dec 6, 2024 · Updated last year
isXinLiu / Awesome-MLLM-Safety
Accepted by IJCAI-24 Survey Track
☆231 · Aug 25, 2024 · Updated last year
euanong / image-hijacks
Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime
☆54 · Sep 19, 2023 · Updated 2 years ago
mbzuai-nlp / AudioJailbreak
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models
☆30 · Oct 6, 2025 · Updated 5 months ago
serendipity1122 / Pre-trained-Model-Guided-Fine-Tuning-for-Zero-Shot-Adversarial-Robustness
Code repository for the CVPR 2024 paper "Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness"
☆25 · May 29, 2024 · Updated last year
YitingQu / unsafe-diffusion
☆46 · Jul 14, 2024 · Updated last year
ys-zong / VLGuard
[ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models.
☆85 · Jan 19, 2025 · Updated last year
Lucas-TY / llm_Implicit_reference
Official implementation of the implicit reference attack
☆11 · Oct 16, 2024 · Updated last year
jiah-li / magic
The repo for the paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.
☆13 · Dec 16, 2024 · Updated last year
SheltonLiu-N / Universal-Prompt-Injection
The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".
☆69 · Oct 23, 2024 · Updated last year
liudaizong / Awesome-LVLM-Attack
😎 Up-to-date & curated list of awesome papers, methods & resources on attacks against Large Vision-Language Models.
☆505 · Feb 17, 2026 · Updated 2 weeks ago
liuxuannan / Awesome-Multimodal-Jailbreak
A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models
☆308 · Jan 11, 2026 · Updated last month
usail-hkust / JailTrickBench
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)
☆163 · Nov 30, 2024 · Updated last year
yuplin2333 / representation-space-jailbreak
Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…
☆23 · Jul 26, 2024 · Updated last year
roywang021 / IDEATOR
Code for the ICCV 2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves"
☆17 · Jul 11, 2025 · Updated 7 months ago
TreeLLi / APT
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
☆58 · Dec 20, 2024 · Updated last year
yunqing-me / AttackVLM
[NeurIPS-2023] Annual Conference on Neural Information Processing Systems
☆228 · Dec 22, 2024 · Updated last year
HanxunH / Detect-CLIP-Backdoor-Samples
[ICLR 2025] Detecting Backdoor Samples in Contrastive Language Image Pretraining
☆19 · Feb 26, 2025 · Updated last year