SaFoLab-WISC / AutoDAN-Turbo

[ICLR 2025 Spotlight] The official implementation of our ICLR2025 paper "AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs".

☆321

Alternatives and similar repositories for AutoDAN-Turbo

Users interested in AutoDAN-Turbo also compare it to the repositories listed below.

yueliu1999 / GuardReasoner
[ICLR Workshop 2025] An official source code for paper "GuardReasoner: Towards Reasoning-based LLM Safeguards".
☆160Updated 6 months ago
thu-coai / AISafetyLab
AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
☆212Updated 2 months ago
yueliu1999 / Awesome-Efficient-Inference-for-LRMs
Awesome-Efficient-Inference-for-LRMs is a collection of state-of-the-art, novel, exciting, token-efficient methods for Large Reasoning Mo…
☆233Updated 5 months ago
LZY-the-boys / Twin-Merging
[NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
☆138Updated 8 months ago
yueliu1999 / FlipAttack
[ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs via Flipping".
☆151Updated 6 months ago
THU-BPM / MarkDiffusion
MarkDiffusion: An Open-Source Toolkit for Generative Watermarking of Latent Diffusion Models
☆166Updated this week
Elfsong / Mercury
Code Efficiency Benchmark
☆85Updated 6 months ago
URSA-MATH / URSA-MATH
☆125Updated 2 months ago
SUSTechBruce / SRPO_MLLMs
[NeurIPS 2025🔥] Main source code of SRPO framework.
☆180Updated 2 months ago
codefuse-ai / CodeFuse-CGM
[NeurIPS 2025] A Graph-based LLM Framework for Real-world SE Tasks
☆490Updated 2 months ago
pat-jj / RAS
RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation
☆57Updated last month
trestad / Noisy-Rewards-in-Learning-to-Reason
☆105Updated 5 months ago
HJYao00 / Mulberry
[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS
☆1,224Updated 2 months ago
KodCode-AI / kodcode
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
☆289Updated 2 months ago
S1s-Z / NOVA
[ACL'25] Code for "Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering"
☆20Updated 4 months ago
gersteinlab / ML-Bench
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://arxiv.org/abs/2311.098…
☆304Updated 3 months ago
jiaxiaojunQAQ / I-GCG
Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)
☆133Updated 7 months ago
Wuyxin / collabllm
(ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators
☆256Updated last month
rllm-team / rllm
Pytorch Library for Relational Table Learning with LLMs.
☆438Updated this week
Facico / GOAT-PEFT
[ICML2025] Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment
☆132Updated 2 weeks ago
THU-BPM / MarkLLM
MarkLLM: An Open-Source Toolkit for LLM Watermarking. (EMNLP 2024 System Demonstration)
☆659Updated last month
uclaml / SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
☆582Updated 10 months ago
HKUST-KnowComp / Awesome-LLM-Scientific-Discovery
[EMNLP2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
☆260Updated 2 weeks ago
LetterLiGo / SafeGen_CCS2024
[CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
☆137Updated 4 months ago
RLHFlow / Online-DPO-R1
Codebase for Iterative DPO Using Rule-based Rewards
☆261Updated 7 months ago
OpenGVLab / ScaleCUA
ScaleCUA is a family of open-source computer use agents that can operate in cross-platform environments (Windows, macOS, Ubuntu, Android).
☆881Updated last month
yueliu1999 / Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, data…
☆1,043Updated 2 weeks ago
EasyJailbreak / EasyJailbreak
An easy-to-use Python framework to generate adversarial jailbreak prompts.
☆756Updated 7 months ago
dayuyang1999 / Awesome-Code-Reasoning
☆356Updated 5 months ago
facebookresearch / DocAgent
DocAgent is a system designed to generate high-quality, context-aware code documentation for Python codebases using a multi-agent approac…
☆402Updated 7 months ago