CryptoAILab / Awesome-LM-SSP

A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).

☆1,856

Alternatives and similar repositories for Awesome-LM-SSP

Users who are interested in Awesome-LM-SSP are comparing it to the repositories listed below.


ydyjya / Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide…
☆1,769 · Feb 1, 2026 · Updated 2 weeks ago
chawins / llm-sp
Papers and resources related to the security and privacy of LLMs 🤖
☆561 · Jun 8, 2025 · Updated 8 months ago
corca-ai / awesome-llm-security
A curation of awesome tools, documents and projects about LLM Security.
☆1,525 · Aug 20, 2025 · Updated 5 months ago
yueliu1999 / Awesome-Jailbreak-on-LLMs
Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, codes, data…
☆1,205 · Feb 6, 2026 · Updated last week
liudaizong / Awesome-LVLM-Attack
😎 up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources.
☆490 · Jan 27, 2026 · Updated 2 weeks ago
EasyJailbreak / EasyJailbreak
An easy-to-use Python framework to generate adversarial jailbreak prompts.
☆815 · Mar 27, 2025 · Updated 10 months ago
isXinLiu / Awesome-MLLM-Safety
Accepted by IJCAI-24 Survey Track
☆231 · Aug 25, 2024 · Updated last year
CryptoAILab / JailbreakEval
[NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts.
☆184 · Apr 1, 2025 · Updated 10 months ago
liuxuannan / Awesome-Multimodal-Jailbreak
A Survey on Jailbreak Attacks and Defenses against Multimodal Generative Models
☆302 · Jan 11, 2026 · Updated last month
SheltonLiu-N / AutoDAN
[ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M…
☆427 · Jan 22, 2025 · Updated last year
JailbreakBench / jailbreakbench
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
☆527 · Apr 4, 2025 · Updated 10 months ago
isXinLiu / MM-SafetyBench
Accepted by ECCV 2024
☆186 · Oct 15, 2024 · Updated last year
CryptoAILab / FigStep
[AAAI'25 (Oral)] Jailbreaking Large Vision-language Models via Typographic Visual Prompts
☆191 · Jun 26, 2025 · Updated 7 months ago
llm-attacks / llm-attacks
Universal and Transferable Attacks on Aligned Language Models
☆4,493 · Aug 2, 2024 · Updated last year
Unispac / Visual-Adversarial-Examples-Jailbreak-Large-Language-Models
Repository for the Paper (AAAI 2024, Oral) --- Visual Adversarial Examples Jailbreak Large Language Models
☆266 · May 13, 2024 · Updated last year
LLM-Tuning-Safety / LLMs-Finetuning-Safety
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20…
☆338 · Feb 23, 2024 · Updated last year
wonderNefelibata / Awesome-LRM-Safety
Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as …
☆82 · Updated this week
centerforaisafety / HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
☆854 · Aug 16, 2024 · Updated last year
git-disl / awesome_LLM-harmful-fine-tuning-papers
A survey on harmful fine-tuning attack for large language model
☆232 · Jan 9, 2026 · Updated last month
patrickrchao / JailbreakingLLMs
☆696 · Jul 2, 2025 · Updated 7 months ago
GraySwanAI / nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
☆317 · May 13, 2025 · Updated 9 months ago
niconi19 / LLM-Conversation-Safety
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
☆109 · Aug 7, 2024 · Updated last year
chrisliu298 / awesome-llm-unlearning
A resource repository for machine unlearning in large language models
☆534 · Jan 6, 2026 · Updated last month
tml-epfl / llm-adaptive-attacks
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
☆377 · Jan 23, 2025 · Updated last year
liu00222 / Open-Prompt-Injection
This repository provides a benchmark for prompt injection attacks and defenses in LLMs
☆391 · Oct 29, 2025 · Updated 3 months ago
xingjunm / Awesome-Large-Model-Safety
Safety at Scale: A Comprehensive Survey of Large Model Safety
☆227 · Feb 3, 2026 · Updated last week
gnipping / Awesome-ML-SP-Papers
A curated list of Machine learning Security & Privacy papers published in security top-4 conferences (IEEE S&P, ACM CCS, USENIX Security…
☆332 · Nov 11, 2025 · Updated 3 months ago
RICommunity / TAP
TAP: An automated jailbreaking method for black-box LLMs
☆220 · Dec 10, 2024 · Updated last year
sherdencooper / GPTFuzz
Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
☆565 · Sep 24, 2024 · Updated last year
THUYimingLi / backdoor-learning-resources
A list of backdoor learning resources
☆1,158 · Jul 31, 2024 · Updated last year
uw-nsl / SafeDecoding
Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
☆151 · Jul 19, 2024 · Updated last year
SheltonLiu-N / Universal-Prompt-Injection
The official implementation of our pre-print paper "Automatic and Universal Prompt Injection Attacks against Large Language Models".
☆68 · Oct 23, 2024 · Updated last year
usail-hkust / JailTrickBench
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)
☆162 · Nov 30, 2024 · Updated last year
THUYimingLi / BackdoorBox
The open-sourced Python toolbox for backdoor attacks and defenses.
☆641 · Sep 27, 2025 · Updated 4 months ago
erfanshayegani / Jailbreak-In-Pieces
[ICLR 2024 Spotlight 🔥] - [Best Paper Award SoCal NLP 2023 🏆] - Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal…
☆79 · Jun 6, 2024 · Updated last year
HowieHwong / TrustLLM
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
☆618 · Jun 24, 2025 · Updated 7 months ago
Unispac / shallow-vs-deep-alignment
Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep
☆173 · Apr 23, 2025 · Updated 9 months ago
ChenWu98 / agent-attack
[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents
☆123 · Feb 19, 2025 · Updated 11 months ago
SaFo-Lab / Awesome-T2I-safety-Papers
List of T2I safety papers, updated daily, welcome to discuss using Discussions
☆67 · Aug 12, 2024 · Updated last year
