ydyjya / Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
☆1,122 · Updated 2 weeks ago
Alternatives and similar repositories for Awesome-LLM-Safety:
Users interested in Awesome-LLM-Safety are comparing it to the repositories listed below.
- A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.). ☆1,098 · Updated this week
- A curation of awesome tools, documents and projects about LLM Security. ☆1,027 · Updated last month
- Papers and resources related to the security and privacy of LLMs 🤖 ☆467 · Updated last month
- Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large …" ☆966 · Updated last month
- An Awesome Collection for LLM Survey ☆321 · Updated 4 months ago
- Daily updated LLM papers. Subscriptions welcome 👏; give it a star if you like it 🌟. ☆1,049 · Updated 5 months ago
- Awesome papers in LLM interpretability ☆378 · Updated this week
- 😎 Up-to-date & curated list of awesome Attacks on Large-Vision-Language-Models papers, methods & resources. ☆184 · Updated last week
- Must-read Papers on Knowledge Editing for Large Language Models. ☆986 · Updated 3 weeks ago
- This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicit… ☆732 · Updated 2 months ago
- A resource repository for machine unlearning in large language models ☆285 · Updated this week
- Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models ☆695 · Updated 7 months ago
- LLM hallucination paper list ☆299 · Updated 10 months ago
- Neural Code Intelligence Survey 2024; Reading lists and resources ☆236 · Updated this week
- Accepted by IJCAI-24 Survey Track ☆182 · Updated 4 months ago
- MarkLLM: An Open-Source Toolkit for LLM Watermarking. (EMNLP 2024 Demo) ☆325 · Updated last week
- "Stones from other hills may serve to polish jade": Fudan's Whitzard team (复旦白泽智能) releases JADE-DB, a demo dataset targeting Chinese open-source and foreign commercial LLMs. ☆351 · Updated last month
- UP-TO-DATE LLM Watermark papers. 🔥🔥🔥 ☆316 · Updated last month
- The official implementation of our ICLR 2024 paper "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models". ☆275 · Updated 2 months ago
- The latest papers on detection of LLM-generated text and code. ☆244 · Updated last week
- SecProbe: a task-driven safety-capability evaluation system for large models. ☆10 · Updated last month
- [NAACL 2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey ☆86 · Updated 5 months ago
- Chinese safety prompts for evaluating and improving the safety of LLMs. ☆898 · Updated 10 months ago
- Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024] ☆182 · Updated 6 months ago
- Paper list about multimodal and large language models, only used to record papers I read in the daily arXiv for personal needs. ☆578 · Updated this week
- This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback. ☆485 · Updated 2 months ago
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,073 · Updated this week
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings] ☆170 · Updated 3 months ago
- [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs. ☆2,033 · Updated 2 weeks ago
- [ICML 2024] TrustLLM: Trustworthiness in Large Language Models ☆502 · Updated 3 months ago