xingjunm / Awesome-Large-Model-Safety

Safety at Scale: A Comprehensive Survey of Large Model Safety

☆213

Alternatives and similar repositories for Awesome-Large-Model-Safety

Users who are interested in Awesome-Large-Model-Safety are comparing it to the repositories listed below.

liudaizong / Awesome-LVLM-Attack
😎 up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources.
☆440 Updated last week
NY1024 / Foundation-Model-Paper-Notes
☆69 Updated 6 months ago
liuxuannan / Awesome-Multimodal-Jailbreak
A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models
☆279 Updated 2 weeks ago
chen37058 / Red-Team-Arxiv-Paper-Update
Awesome jailbreak and red-teaming arXiv papers (automatically updated every 12 hours)
☆80 Updated this week
isXinLiu / Awesome-MLLM-Safety
Accepted by IJCAI-24 Survey Track
☆224 Updated last year
WUSTL-CSPL / LLMJailbreak
☆37 Updated last year
CryptoAILab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆181 Updated 5 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆176 Updated last year
roywang021 / UMK
Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models
☆30 Updated 11 months ago
bboylyg / BackdoorLLM
[NeurIPS 2025] BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models
☆254 Updated last month
Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
☆256 Updated last year
thu-ml / Attack-Bard
☆107 Updated last year
AI45Lab / ActorAttack
☆112 Updated 10 months ago
Trustworthy-AI-Group / Adversarial_Examples_Papers
A list of recent papers about adversarial learning
☆257 Updated this week
abc03570128 / Jailbreaking-Attack-against-Multimodal-Large-Language-Model
☆52 Updated last year
yibo-miao / T2VSafetyBench
☆25 Updated last year
wonderNefelibata / Awesome-LRM-Safety
Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as …
☆78 Updated this week
ltroin / llm_attack_defense_arena
☆82 Updated 3 months ago
sleeepeer / PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
☆218 Updated 3 weeks ago
NY1024 / BAP-Jailbreak-Vision-Language-Models-via-Bi-Modal-Adversarial-Prompt
☆52 Updated last year
SproutNan / AI-Safety_SCAV
This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector"
☆46 Updated last month
tmllab / 2025_ICLR_PiF
☆37 Updated 6 months ago
Lyz1213 / BadEdit
☆36 Updated last year
mengtong0110 / InferDPT
☆32 Updated 3 weeks ago
KuofengGao / Verbose_Images
[ICLR 2024] Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
☆41 Updated last year
Django-Jiang / BadChain
[ICLR24] Official Repo of BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
☆43 Updated last year
tmlr-group / DeepInception
[arXiv:2311.03191] "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"
☆164 Updated last year
Haochen-Luo / CroPA
☆54 Updated last year
SaFoLab-WISC / JailBreakV_28K
[COLM 2024] JailBreakV-28K: A comprehensive benchmark designed to evaluate the transferability of LLM jailbreak attacks to MLLMs, and fur…
☆82 Updated 7 months ago
lancopku / codable-watermarking-for-llm
Repository for "Towards Codable Watermarking for Large Language Models"
☆38 Updated 2 years ago