MaTengSYSU / HIMRD-jailbreakView external linksLinks
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆15Aug 7, 2025Updated 6 months ago
Alternatives and similar repositories for HIMRD-jailbreak
Users that are interested in HIMRD-jailbreak are comparing it to the libraries listed below
Sorting:
- Accept by CVPR 2025 (highlight)☆22Jun 8, 2025Updated 8 months ago
- this is for the ACM MM paper---Backdoor Attack on Crowd Counting☆17Jul 10, 2022Updated 3 years ago
- ☆57Jun 5, 2024Updated last year
- Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment (NeurIPS 2025)☆47Nov 5, 2025Updated 3 months ago
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …☆34Oct 23, 2024Updated last year
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment☆27Jun 11, 2025Updated 8 months ago
- ☆12May 6, 2022Updated 3 years ago
- Data-Efficient Backdoor Attacks☆20Jun 15, 2022Updated 3 years ago
- This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning…☆19Jun 7, 2023Updated 2 years ago
- ☆20May 6, 2022Updated 3 years ago
- Defending Against Backdoor Attacks Using Robust Covariance Estimation☆22Jul 12, 2021Updated 4 years ago
- ☆27Feb 1, 2023Updated 3 years ago
- Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…☆23Jul 26, 2024Updated last year
- A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack–Defense Evaluation☆58Jan 23, 2026Updated 3 weeks ago
- Code for Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks (TIFS2024)☆13Mar 29, 2024Updated last year
- This work corroborates a run-time Trojan detection method exploiting STRong Intentional Perturbation of inputs, is a multi-domain Trojan …☆10Mar 7, 2021Updated 4 years ago
- ☆10Oct 31, 2022Updated 3 years ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆13Dec 16, 2024Updated last year
- ☆10Dec 18, 2024Updated last year
- Prompt Generator model for Stable Diffusion Models☆11Jun 20, 2023Updated 2 years ago
- ☆12Jan 25, 2025Updated last year
- ☆14Feb 26, 2025Updated 11 months ago
- Official Implementation of implicit reference attack☆11Oct 16, 2024Updated last year
- Implemention of "Piracy Resistant Watermarks for Deep Neural Networks" in TensorFlow.☆12Dec 5, 2020Updated 5 years ago
- Improving fast adversarial training with prior-guided knowledge (TPAMI2024)☆43Apr 21, 2024Updated last year
- ☆26Jun 5, 2024Updated last year
- Codes for the ICLR 2022 paper: Trigger Hunting with a Topological Prior for Trojan Detection☆11Sep 19, 2023Updated 2 years ago
- ☆12May 27, 2022Updated 3 years ago
- ☆11Jan 25, 2022Updated 4 years ago
- ☆29Mar 3, 2021Updated 4 years ago
- Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models☆31Dec 30, 2024Updated last year
- Data and code for the paper: Finding Safety Neurons in Large Language Models☆21Jan 29, 2026Updated 2 weeks ago
- Official implementation of the ICCV2023 paper: Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregatio…☆27Aug 17, 2023Updated 2 years ago
- Pytorch implementation of NPAttack☆12Jul 7, 2020Updated 5 years ago
- ☆15Apr 7, 2023Updated 2 years ago
- ☆14Jan 4, 2025Updated last year
- [NeurIPS'22] Trap and Replace: Defending Backdoor Attacks by Trapping Them into an Easy-to-Replace Subnetwork. Haotao Wang, Junyuan Hong,…☆15Nov 27, 2023Updated 2 years ago
- Code repository for the paper --- [USENIX Security 2023] Towards A Proactive ML Approach for Detecting Backdoor Poison Samples☆30Jul 11, 2023Updated 2 years ago
- Source code for ECCV 2022 Poster: Data-free Backdoor Removal based on Channel Lipschitzness☆35Jan 9, 2023Updated 3 years ago