ydyjya / Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
☆1,346 Updated this week
Alternatives and similar repositories for Awesome-LLM-Safety:
Users interested in Awesome-LLM-Safety are comparing it to the repositories listed below.
- A reading list for large model safety, security, and privacy (including Awesome LLM Security, Safety, etc.). ☆1,377 Updated this week
- A curated collection of awesome tools, documents, and projects about LLM security. ☆1,186 Updated last week
- Papers and resources related to the security and privacy of LLMs 🤖 ☆496 Updated 4 months ago
- Must-read papers on knowledge editing for large language models. ☆1,070 Updated last month
- 😎 An up-to-date, curated list of papers, methods, and resources on attacks against large vision-language models. ☆271 Updated 2 weeks ago
- An easy-to-use Python framework for generating adversarial jailbreak prompts. ☆629 Updated 3 weeks ago
- An awesome collection of LLM surveys. ☆338 Updated 2 weeks ago
- Awesome papers on LLM interpretability. ☆443 Updated 2 weeks ago
- Reading list on hallucination in LLMs. Check out the survey paper "Siren's Song in the AI Ocean: A Survey on Hallucination in Large …" ☆1,010 Updated 5 months ago
- A resource repository for machine unlearning in large language models. ☆377 Updated 3 weeks ago
- Accepted by the IJCAI-24 Survey Track. ☆200 Updated 8 months ago
- Daily updated LLM papers; subscriptions welcome 👏, and give it a star 🌟 if you like it. ☆1,106 Updated 8 months ago
- Latest Advances on System-2 Reasoning. ☆956 Updated this week
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts. ☆483 Updated 7 months ago
- Awesome-LLM-Robustness: a curated list of uncertainty, reliability, and robustness in large language models. ☆741 Updated last month
- [ICML 2024] TrustLLM: Trustworthiness in Large Language Models. ☆552 Updated last month
- Chinese safety prompts for evaluating and improving the safety of LLMs. ☆1,001 Updated last year
- Safety at Scale: A Comprehensive Survey of Large Model Safety. ☆149 Updated 2 months ago
- "他山之石、可以攻玉":复旦白泽智能发布面向国内开源和国外商用大模型的Demo数据集JADE-DB☆399Updated last month
- MarkLLM: An Open-Source Toolkit for LLM Watermarking (EMNLP 2024 Demo). ☆383 Updated last month
- Up-to-date LLM watermarking papers. 🔥🔥🔥 ☆337 Updated 4 months ago
- Official GitHub repo for SafetyBench, a comprehensive benchmark for evaluating LLMs' safety. [ACL 2024] ☆212 Updated 10 months ago
- [ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future. ☆441 Updated 3 months ago
- [TMLR 2024] Efficient Large Language Models: A Survey. ☆1,140 Updated 3 weeks ago
- BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models. ☆134 Updated 2 months ago
- A survey on harmful fine-tuning attacks against large language models. ☆161 Updated last week
- SecProbe: a task-driven safety-capability evaluation system for large models. ☆13 Updated 4 months ago
- Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel jailbreak methods on LLMs. It contains papers, code, data… ☆626 Updated this week
- Collecting awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-… ☆1,603 Updated 8 months ago