WhileBug / AwesomeLLMJailBreakPapers
Awesome LLM Jailbreak academic papers
☆102 Updated last year
Alternatives and similar repositories for AwesomeLLMJailBreakPapers
Users who are interested in AwesomeLLMJailBreakPapers are comparing it to the repositories listed below
- [NDSS'25 Best Technical Poster] A collection of automated evaluators for assessing jailbreak attempts. ☆158 Updated 2 months ago
- [NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey ☆103 Updated 10 months ago
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track] ☆359 Updated 2 months ago
- ☆66 Updated 11 months ago
- TAP: An automated jailbreaking method for black-box LLMs ☆173 Updated 6 months ago
- [ICLR 2024] The official implementation of our ICLR2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M… ☆341 Updated 5 months ago
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024) ☆89 Updated 5 months ago
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs. Empirical tricks for LLM Jailbreaking. (NeurIPS 2024)