chujiezheng / LLM-Safeguard
Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"
☆99 Updated 5 months ago
Alternatives and similar repositories for LLM-Safeguard
Users who are interested in LLM-Safeguard are comparing it to the libraries listed below.
- A lightweight library for large language model (LLM) jailbreaking defense. ☆59 Updated 2 months ago
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆86 Updated 7 months ago
- ICLR2024 Paper. Showing properties of safety tuning and exaggerated safety. ☆88 Updated last year
- ☆55 Updated last year
- Official Repository for ACL 2024 Paper SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding ☆146 Updated last year
- [NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey