Code for the paper "Jailbreak Large Vision-Language Models Through Multi-Modal Linkage"
☆33Dec 6, 2024Updated last year
Alternatives and similar repositories for MML
Users who are interested in MML are comparing it to the repositories listed below.
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …☆35Oct 23, 2024Updated last year
- Accepted by CVPR 2025 (highlight)☆25Jun 8, 2025Updated 10 months ago
- [ICLR 2025] BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks☆31Nov 2, 2025Updated 6 months ago
- Official implementation of Visco-Attack (EMNLP 2025 Main). An open-source one-click reproduction script is also provided.☆30Apr 11, 2026Updated 3 weeks ago
- [CVPR 2025] Official implementation for JOOD "Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy"☆22Jun 11, 2025Updated 10 months ago
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail…☆12Jul 26, 2024Updated last year
- ACL 2025 (Main) HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States☆164Jun 8, 2025Updated 10 months ago
- Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"☆17Aug 7, 2025Updated 8 months ago
- ☆13Jan 22, 2026Updated 3 months ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13May 12, 2023Updated 2 years ago
- The implementation of our IEEE S&P 2024 paper "Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples".☆11Jun 28, 2024Updated last year
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".☆70Oct 23, 2024Updated last year
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…☆90May 9, 2025Updated 11 months ago
- [ECCV2022] Rethinking Data Augmentation for Robust Visual Question Answering☆13Nov 23, 2022Updated 3 years ago
- [ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"☆10Sep 3, 2024Updated last year
- Code for our ACL-2023 paper: "Combo of Thinking and Observing for Outside-Knowledge VQA"☆12Jun 30, 2023Updated 2 years ago
- Accepted by ECCV 2024☆203Oct 15, 2024Updated last year
- ☆20May 14, 2025Updated 11 months ago
- [AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning".☆22Mar 18, 2026Updated last month
- ☆14Oct 6, 2024Updated last year
- [AAMAS 2025] Privacy-preserving and Personalized RLHF, with convergence guarantees. The Code contains experiments for training multiple i…☆16Apr 16, 2025Updated last year
- Automated Simulations of Adversarial Attacks on Arbitrary Objects in Realistic Scenes☆14Oct 5, 2025Updated 7 months ago
- ☆40May 17, 2025Updated 11 months ago
- [AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts☆202Jun 26, 2025Updated 10 months ago
- One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models☆59Apr 25, 2026Updated last week
- The official repo for the paper "An Adaptive Model Ensemble Adversarial Attack for Boosting Adversarial Transferability"☆43Oct 12, 2023Updated 2 years ago
- Official Tensorflow implementation for "Improving the Transferability of Adversarial Samples by Path-Augmented Method" (CVPR 2023).☆12Jun 16, 2023Updated 2 years ago
- [ACL 2025 Findings] The official GitHub repo for the paper "Nuclear Deployed: Analyzing Catastrophic Risks in Decision-making of Autonomo…☆21May 20, 2025Updated 11 months ago
- A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models☆317Jan 11, 2026Updated 3 months ago
- Prompt Generator model for Stable Diffusion Models☆12Jun 20, 2023Updated 2 years ago
- ☆65May 21, 2025Updated 11 months ago
- [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization"☆18Jun 1, 2024Updated last year
- Authors' code for "Variational Causal Inference Network for Explanatory Visual Question Answering" and "Integrating Neural-Symbolic Reas…☆12Apr 13, 2026Updated 3 weeks ago
- [Findings of ACL 2023] Bridge the Gap Between CV and NLP! An Optimization-based Textual Adversarial Attack Framework.☆14Aug 27, 2023Updated 2 years ago
- LoRA supervised fine-tuning, RLHF (PPO) and RAG with llama-3-8B on the TLDR summarization dataset☆14Feb 2, 2025Updated last year
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Mar 8, 2023Updated 3 years ago
- ☆12Mar 24, 2023Updated 3 years ago
- ☆19May 31, 2023Updated 2 years ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year