The most comprehensive and accurate LLM jailbreak attack benchmark by far
☆21Mar 22, 2025Updated last year
Alternatives and similar repositories for jailbreak-bench
Users that are interested in jailbreak-bench are comparing it to the libraries listed below.
- An easy-to-use Python framework to defend against jailbreak prompts.☆21Mar 22, 2025Updated last year
- Red Queen Dataset and data generation template☆26Dec 26, 2025Updated 5 months ago
- Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting☆21Mar 25, 2024Updated 2 years ago
- Code of paper: "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"☆18Apr 3, 2026Updated last month
- ☆48May 9, 2024Updated 2 years ago
- PyTorch Implementation of the paper "Defining and Quantifying the Emergence of Sparse Concepts in DNNs" (CVPR 2023)☆12Dec 24, 2023Updated 2 years ago
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM☆39Jan 17, 2025Updated last year
- SVIP: Towards Verifiable Inference of Open-Source Large Language Models☆15Jun 3, 2025Updated 11 months ago
- Code for paper "Defending against LLM Jailbreaking via Backtranslation"☆35Aug 16, 2024Updated last year
- This is the official code repository for paper "Quantization Aware Attack: Enhancing Transferable Adversarial Attacks by Model Quantizati…☆14Sep 21, 2025Updated 8 months ago
- ☆12Sep 29, 2024Updated last year
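- C++ lab exercises for the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications☆14Feb 20, 2021Updated 5 years ago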
- ☆15Mar 9, 2025Updated last year
- ☆12Apr 25, 2025Updated last year
- ☆27Jun 5, 2024Updated last year
- BASAR: Black-box Attack on Skeletal Action Recognition, CVPR 2021☆19Feb 18, 2025Updated last year
- [CVPR2025] Official Repository for IMMUNE: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment☆28Jun 11, 2025Updated 11 months ago
- Code for paper: PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, IEEE ICASSP 2024. Demo//124.220.228.133:11107☆21Aug 10, 2024Updated last year
- Batch vulnerability discovery☆19May 23, 2021Updated 5 years ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆31Oct 5, 2025Updated 7 months ago
- The purpose of this project is to build a Decentralized IoT STORage space to Store data of diAPP, enOS and lenOS☆10Feb 24, 2023Updated 3 years ago
- [ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models"☆105Mar 7, 2024Updated 2 years ago
- ☆25Jun 16, 2024Updated last year
- Code for NDSS paper: Stealthy Adversarial Perturbations Against Real-Time Video Classification Systems☆21Nov 24, 2018Updated 7 years ago
- Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechan…☆16Aug 6, 2024Updated last year
- ☆14Jan 23, 2023Updated 3 years ago
- An evolutionary, coverage-guided greybox network protocol fuzzer☆21Aug 31, 2021Updated 4 years ago
- A repo for LLM jailbreak☆14Sep 5, 2023Updated 2 years ago
- ☆13Nov 11, 2022Updated 3 years ago
- ☆24Feb 17, 2026Updated 3 months ago
- [NeurIPS 2023] Differentially Private Image Classification by Learning Priors from Random Processes☆12Jun 12, 2023Updated 2 years ago
- ☆202Nov 26, 2023Updated 2 years ago
- Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers☆66Aug 25, 2024Updated last year
- [ECCV2024] Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajector…☆31Nov 15, 2025Updated 6 months ago
- ICL backdoor attack☆17Nov 4, 2024Updated last year
- ☆18Mar 30, 2025Updated last year
- A convenient tool that sends OCR-recognized content directly to GPT and returns the result.☆10May 5, 2023Updated 3 years ago
- ☆39May 21, 2024Updated 2 years ago
- Secondary development of Xiaomi Notes☆17Sep 7, 2024Updated last year