AIM-Intelligence/Automated-Multi-Turn-Jailbreaks

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/AIM-Intelligence/Automated-Multi-Turn-Jailbreaks)

AIM-Intelligence / Automated-Multi-Turn-Jailbreaks

☆139

Alternatives and similar repositories for Automated-Multi-Turn-Jailbreaks

Users that are interested in Automated-Multi-Turn-Jailbreaks are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AI45Lab / ActorAttack
View on GitHub
☆135Jun 29, 2026Updated 3 weeks ago
HanjiangHu / NBF-LLM
View on GitHub
The official code for "Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks".
☆18Jun 24, 2026Updated last month
Jinxiaolong1129 / Foot-in-the-door-Jailbreak
View on GitHub
☆23May 14, 2025Updated last year
tmllab / 2025_ICLR_PiF
View on GitHub
☆40May 17, 2025Updated last year
xirui-li / DrAttack
View on GitHub
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
☆68Aug 25, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
salman-lui / x-teaming
View on GitHub
☆68May 21, 2025Updated last year
NY1024 / RACE
View on GitHub
☆27Mar 17, 2025Updated last year
kriti-hippo / red_queen
View on GitHub
Red Queen Dataset and data generation template
☆27Dec 26, 2025Updated 6 months ago
AIM-Intelligence / awesome-mcp-security
View on GitHub
Security Threats related with MCP (Model Context Protocol), MCP Servers and more
☆50Apr 24, 2025Updated last year
weiyezhimeng / SQL-Injection-Jailbreak
View on GitHub
☆22Jul 26, 2025Updated 11 months ago
dsbowen / strong_reject
View on GitHub
☆146Jul 7, 2025Updated last year
SproutNan / AI-Safety_Benchmark
View on GitHub
The official repository for guided jailbreak benchmark
☆31Jul 28, 2025Updated 11 months ago
ErxinYu / CoSafe-Dataset
View on GitHub
☆13Nov 12, 2024Updated last year
yueliu1999 / FlipAttack
View on GitHub
[ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs via Flipping".
☆178May 2, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
AIM-Intelligence / RepBend
View on GitHub
Code for Representation Bending Paper
☆16Jul 15, 2025Updated last year
akumar2709 / OVERTHINK_public
View on GitHub
☆52Feb 25, 2026Updated 4 months ago
centerforaisafety / HarmBench
View on GitHub
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
☆1,011Aug 16, 2024Updated last year
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
View on GitHub
☆81Mar 30, 2025Updated last year
drivetosouth / SafeDialBench-Dataset
View on GitHub
Official github repo for SafeDialBench, a comprehensive multi-turn dialogue benchmark to evaluate LLMs' safety.
☆54May 12, 2025Updated last year
RICommunity / TAP
View on GitHub
TAP: An automated jailbreaking method for black-box LLMs
☆241Dec 10, 2024Updated last year
huizhang-L / CodeChameleon
View on GitHub
☆30Mar 20, 2024Updated 2 years ago
patrickrchao / JailbreakingLLMs
View on GitHub
☆757Jul 2, 2025Updated last year
kangmintong / R-2-Guard
View on GitHub
[ICLR 2025] Code implementation of R^2-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
☆23Jul 8, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
theshi-1128 / jailbreak-bench
View on GitHub
The most comprehensive and accurate LLM jailbreak attack benchmark by far
☆21Mar 22, 2025Updated last year
HKUST-KnowComp / LLM-Multistep-Jailbreak
View on GitHub
Code for Findings-EMNLP 2023 paper: Multi-step Jailbreaking Privacy Attacks on ChatGPT
☆37Oct 15, 2023Updated 2 years ago
GraySwanAI / nanoGCG
View on GitHub
A fast + lightweight implementation of the GCG algorithm in PyTorch
☆344May 13, 2025Updated last year
byt3bl33d3r / Leek-LLM
View on GitHub
Use a LLM to create the ultimate Leek AI for Leek Wars! 🥬
☆17May 2, 2024Updated 2 years ago
yiksiu-chan / SpeakEasy
View on GitHub
[ICML 2025] Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
☆14Mar 7, 2026Updated 4 months ago
alexandrasouly / strongreject
View on GitHub
Repository for "StrongREJECT for Empty Jailbreaks" paper
☆157Nov 3, 2024Updated last year
yibo-miao / T2VSafetyBench
View on GitHub
☆28Nov 4, 2024Updated last year
EasyJailbreak / EasyJailbreak
View on GitHub
An easy-to-use Python framework to generate adversarial jailbreak prompts.
☆873Mar 30, 2026Updated 3 months ago
rkbennett / icmpsh
View on GitHub
Simple reverse ICMP shell
☆14Apr 30, 2024Updated 2 years ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
dukeceicenter / jailbreak-reasoning-openai-o1o3-deepseek-r1
View on GitHub
☆121Apr 27, 2025Updated last year
LLM-DRA / DRA
View on GitHub
[USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a…
☆116Oct 11, 2024Updated last year
JailbreakBench / jailbreakbench
View on GitHub
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆634Apr 4, 2025Updated last year
facebookresearch / SecAlign
View on GitHub
Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization"
☆98Jul 2, 2026Updated 3 weeks ago
SaFo-Lab / AutoDAN-Turbo
View on GitHub
[ICLR 2025 Spotlight] The official implementation of our ICLR2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to…
☆381Oct 8, 2025Updated 9 months ago
facebookresearch / multimodal-fusion-jailbreaks
View on GitHub
Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https//arxiv.org/abs/2410.03489)
☆20Oct 22, 2024Updated last year
wollschlager / geometry-of-refusal
View on GitHub
Code to the paper: The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence
☆35Jul 31, 2025Updated 11 months ago