huizhang-L/CodeChameleon

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/huizhang-L/CodeChameleon)

huizhang-L / CodeChameleon

☆30

Alternatives and similar repositories for CodeChameleon

Users that are interested in CodeChameleon are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

xirui-li / DrAttack
View on GitHub
Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
☆68Aug 25, 2024Updated last year
yuplin2333 / representation-space-jailbreak
View on GitHub
Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…
☆24Jul 26, 2024Updated last year
NJUNLP / ReNeLLM
View on GitHub
The official implementation of our NAACL 2024 paper "A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Lang…
☆163Sep 2, 2025Updated 10 months ago
InvokerStark / OverKill
View on GitHub
☆15Jun 13, 2024Updated 2 years ago
zhxieml / remiss-jailbreak
View on GitHub
☆33Jun 24, 2024Updated 2 years ago
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
TeamPigeonLab / CS-DJ
View on GitHub
Accept by CVPR 2025 (highlight)
☆25Jun 8, 2025Updated last year
facebookresearch / multimodal-fusion-jailbreaks
View on GitHub
Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https//arxiv.org/abs/2410.03489)
☆20Oct 22, 2024Updated last year
PKU-ML / PAT
View on GitHub
Code for NeurIPS 2024 Paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning"
☆22May 6, 2025Updated last year
zhaoshiji123 / SI-Attack
View on GitHub
Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency
☆16Aug 6, 2025Updated 11 months ago
jiah-li / magic
View on GitHub
The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.
☆15Dec 16, 2024Updated last year
NY1024 / RACE
View on GitHub
☆27Mar 17, 2025Updated last year
kriti-hippo / red_queen
View on GitHub
Red Queen Dataset and data generation template
☆27Dec 26, 2025Updated 6 months ago
facebookresearch / advprompter
View on GitHub
Official implementation of AdvPrompter https//arxiv.org/abs/2404.16873
☆183May 6, 2024Updated 2 years ago
Princeton-SysML / Jailbreak_LLM
View on GitHub
☆203Nov 26, 2023Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
LLM-DRA / DRA
View on GitHub
[USENIX Security'24] Official repository of "Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise a…
☆116Oct 11, 2024Updated last year
thefcraft / prompt-generator-stable-diffusion
View on GitHub
Prompt Generator model for Stable Diffusion Models
☆12Jun 20, 2023Updated 3 years ago
DSN-2024 / DSN
View on GitHub
DSN jailbreak Attack & Evaluation Ensemble
☆17Feb 7, 2026Updated 5 months ago
salman-lui / x-teaming
View on GitHub
☆68May 21, 2025Updated last year
RUCAIBox / HADES
View on GitHub
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆39Oct 23, 2024Updated last year
AI45Lab / CodeAttack
View on GitHub
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆62Oct 1, 2025Updated 9 months ago
AI45Lab / ActorAttack
View on GitHub
☆135Jun 29, 2026Updated 3 weeks ago
RICommunity / TAP
View on GitHub
TAP: An automated jailbreaking method for black-box LLMs
☆241Dec 10, 2024Updated last year
weizeming / momentum-attack-llm
View on GitHub
☆25Jan 17, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
uw-nsl / ArtPrompt
View on GitHub
[ACL24] Official Repo of Paper `ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs`
☆102Aug 15, 2025Updated 11 months ago
EVIGBYEN / Mousetrap
View on GitHub
☆17Jul 3, 2025Updated last year
HanjiangHu / NBF-LLM
View on GitHub
The official code for "Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks".
☆18Jun 24, 2026Updated last month
yuki-younai / MTSA
View on GitHub
offical implementation of MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming
☆16Jun 2, 2025Updated last year
MaTengSYSU / HIMRD-jailbreak
View on GitHub
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆19Aug 7, 2025Updated 11 months ago
Lucas-TY / llm_Implicit_reference
View on GitHub
Official Implementation of implicit reference attack
☆11Oct 16, 2024Updated last year
Jinxiaolong1129 / Foot-in-the-door-Jailbreak
View on GitHub
☆23May 14, 2025Updated last year
EasyJailbreak / EasyJailbreak
View on GitHub
An easy-to-use Python framework to generate adversarial jailbreak prompts.
☆873Mar 30, 2026Updated 3 months ago
JailbreakBench / jailbreakbench
View on GitHub
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆634Apr 4, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
DAMO-NLP-SG / multilingual-safety-for-LLMs
View on GitHub
[ICLR 2024]Data for "Multilingual Jailbreak Challenges in Large Language Models"
☆106Mar 7, 2024Updated 2 years ago
dukeceicenter / jailbreak-reasoning-openai-o1o3-deepseek-r1
View on GitHub
☆121Apr 27, 2025Updated last year
GraySwanAI / nanoGCG
View on GitHub
A fast + lightweight implementation of the GCG algorithm in PyTorch
☆344May 13, 2025Updated last year
thu-coai / TransferAttack
View on GitHub
[ACL 2025] Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
☆19May 23, 2025Updated last year
vinusankars / BEAST
View on GitHub
Implementation of BEAST adversarial attack for language models (ICML 2024)
☆89May 14, 2024Updated 2 years ago
kztakemoto / simbaja
View on GitHub
All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks
☆17Apr 24, 2024Updated 2 years ago
Allen-piexl / JailbreakZoo
View on GitHub
☆171Sep 2, 2024Updated last year