jiaxiaojunQAQ / I-GCGLinks

Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)

☆132

Alternatives and similar repositories for I-GCG

Users that are interested in I-GCG are comparing it to the libraries listed below

Sorting:

yueliu1999 / FlipAttack
[ICML 2025] An official source code for paper "FlipAttack: Jailbreak LLMs via Flipping".
☆149Updated 6 months ago
LetterLiGo / SafeGen_CCS2024
[CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
☆137Updated 4 months ago
thu-coai / AISafetyLab
AISafetyLab: A comprehensive framework covering safety attack, defense, evaluation and paper list.
☆208Updated 2 months ago
cure-lab / MMA-Diffusion
[CVPR2024] MMA-Diffusion: MultiModal Attack on Diffusion Models
☆312Updated last month
tmllab / 2025_ICLR_PiF
☆36Updated 5 months ago
XuankunRong / Awesome-LVLM-Safety
A curated list of resources dedicated to the safety of Large Vision-Language Models. This repository aligns with our survey titled A Surv…
☆156Updated 3 weeks ago
jiaxiaojunQAQ / FGSM-PGK
Improving fast adversarial training with prior-guided knowledge (TPAMI2024)
☆43Updated last year
zhipeng-wei / EmojiAttack
Emoji Attack [ICML 2025]
☆34Updated 3 months ago
AI45Lab / CodeAttack
[ACL 2024] CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion
☆54Updated last month
AI45Lab / ActorAttack
☆109Updated 9 months ago
wonderNefelibata / Awesome-LRM-Safety
Awesome Large Reasoning Model(LRM) Safety.This repository is used to collect security-related research on large reasoning models such as …
☆76Updated this week
CryptoAILab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆177Updated 4 months ago
thunxxx / MLLM-Jailbreak-evaluation-MMJ-Bench
☆63Updated 7 months ago
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆29Updated last year
SproutNan / AI-Safety_SCAV
This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"
☆46Updated 3 weeks ago
XuanChen-xc / RLbreaker
Code for "When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search" (NeurIPS 2024)
☆13Updated last year
MaTengSYSU / HIMRD-jailbreak
Code repository for the paper "Heuristic Induced Multimodal Risk Distribution Jailbreak Attack for Multimodal Large Language Models"
☆10Updated 2 months ago
TeamPigeonLab / CS-DJ
Accept by CVPR 2025 (highlight)
☆20Updated 4 months ago
yuplin2333 / representation-space-jailbreak
Code repo of our paper Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis (https://arxiv.org/abs/2406.10794…
☆22Updated last year
ZrW00 / MuScleLoRA
The code implementation of MuScleLoRA (Accepted in ACL 2024)
☆10Updated 11 months ago
alipay / YiJian-Community
YiJian-Comunity: a full-process automated large model safety evaluation tool designed for academic research
☆114Updated last year
SaFoLab-WISC / JailBreakV_28K
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…
☆80Updated 5 months ago
NY1024 / Foundation-Model-Paper-Notes
☆66Updated 5 months ago
NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt
☆53Updated last year
PKU-ML / PAT
Code for NeurIPS 2024 Paper "Fight Back Against Jailbreaking via Prompt Adversarial Tuning"
☆20Updated 5 months ago
jiaxiaojunQAQ / FP-Better
Code for Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks (TIFS2024)
☆13Updated last year
rain152 / PAT
[NeurIPS 2024] Fight Back Against Jailbreaking via Prompt Adversarial Tuning
☆10Updated last year
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆169Updated last year
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆48Updated last year
RUCAIBox / HADES
[ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking …
☆31Updated last year