RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models"
★34 · Updated last year
Alternatives and similar repositories for HADES
Users interested in HADES are comparing it to the repositories listed below.
- ★55 · Updated last year
- [ICLR 2024 Spotlight 🔥] - [Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal… ★77 · Updated last year
- ★54 · Updated last year
- ★68 · Updated 9 months ago
- Accepted by CVPR 2025 (highlight) ★22 · Updated 7 months ago
- Code for ACM MM 2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models ★31 · Updated last year
- ★57 · Updated last year
- Code for the paper "Jailbreak Large Vision-Language Models Through Multi-Modal Linkage" ★25 · Updated last year
- ★38 · Updated 8 months ago
- This is an official repository of "VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models" (NeurIPS 2… ★65 · Updated 9 months ago
- Code for ICCV 2025 paper "IDEATOR: Jailbreaking and Benchmarking Large Vision-Language Models Using Themselves" ★15 · Updated 6 months ago
- [AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts ★189 · Updated 6 months ago
- Code for NeurIPS 2024 paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning" ★22 · Updated 8 months ago
- [COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur… ★84 · Updated 8 months ago
- Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models" ★15 · Updated 5 months ago
- [ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images ★42 · Updated last year
- Accepted by ECCV 2024 ★180 · Updated last year
- ★109 · Updated last year
- ★25 · Updated last year
- A package that achieves a 95%+ transfer attack success rate against GPT-4 ★26 · Updated last year
- ★22 · Updated 7 months ago
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models ★32 · Updated 7 months ago
- [ICLR 2025] BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks ★30 · Updated 2 months ago
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ★37 · Updated last year
- [NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning ★10 · Updated last year
- Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime ★54 · Updated 2 years ago
- Code for NeurIPS 2024 paper "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ★58 · Updated last year
- [ICML 2024] Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models ★84 · Updated last year
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre… ★47 · Updated 6 months ago
- [ICML 2025] X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP ★34 · Updated 6 months ago