TrustAIRLab / VoiceJailbreakAttack
Code for Voice Jailbreak Attacks Against GPT-4o.
☆27Updated 8 months ago
Alternatives and similar repositories for VoiceJailbreakAttack:
Users who are interested in VoiceJailbreakAttack are comparing it to the repositories listed below
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆13Updated 3 months ago
- ☆16Updated 2 weeks ago
- [ECCV'24 Oral] The official GitHub page for "Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …☆16Updated 3 months ago
- ☆17Updated 3 months ago
- [Arxiv 2024] Dissecting Adversarial Robustness of Multimodal LM Agents☆55Updated 2 weeks ago
- Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks☆22Updated 6 months ago
- AnyDoor: Test-Time Backdoor Attacks on Multimodal Large Language Models☆48Updated 9 months ago
- ☆20Updated 11 months ago
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024]☆60Updated 4 months ago
- [AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts☆104Updated last month
- [USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a…☆62Updated 3 months ago
- The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".☆39Updated 3 months ago
- All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks☆16Updated 9 months ago
- [USENIX'24] Prompt Stealing Attacks Against Text-to-Image Generation Models☆31Updated 2 weeks ago
- Code and data to go with the Zhu et al. paper "An Objective for Nuanced LLM Jailbreaks"☆22Updated last month
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts"☆36Updated 6 months ago
- [ArXiv 2024] Denial-of-Service Poisoning Attacks on Large Language Models☆16Updated 3 months ago
- Code to conduct an embedding attack on LLMs☆21Updated 3 weeks ago