ydyjya / Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
☆1,378 · Updated 2 weeks ago
Alternatives and similar repositories for Awesome-LLM-Safety
Users interested in Awesome-LLM-Safety are comparing it to the repositories listed below.
- A reading list for large-model safety, security, and privacy (including Awesome LLM Security, Safety, etc.). ☆1,419 · Updated this week
- Papers and resources related to the security and privacy of LLMs 🤖 ☆501 · Updated 5 months ago
- A curated list of awesome tools, documents, and projects about LLM security. ☆1,217 · Updated last month
- 😎 Up-to-date, curated list of awesome papers, methods, and resources on attacks against large vision-language models. ☆285 · Updated 3 weeks ago
- Daily updated LLM papers; subscriptions welcome 👏, and give it a star 🌟 if you like it. ☆1,114 · Updated 9 months ago
- MarkLLM: An Open-Source Toolkit for LLM Watermarking (EMNLP 2024 Demo). ☆395 · Updated 2 months ago
- Must-read Papers on Knowledge Editing for Large Language Models. ☆1,082 · Updated 2 months ago
- An awesome collection of LLM surveys ☆357 · Updated last month
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track] ☆339 · Updated last month
- An easy-to-use Python framework to generate adversarial jailbreak prompts. ☆638 · Updated last month
- Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large … ☆1,014 · Updated 5 months ago
- [ICLR 2024] Official implementation of "AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language M… ☆330 · Updated 3 months ago
- A resource repository for machine unlearning in large language models ☆397 · Updated last month
- A survey accepted by the IJCAI-24 Survey Track ☆202 · Updated 8 months ago
- Awesome papers on LLM interpretability ☆454 · Updated this week
- Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art and novel jailbreak methods on LLMs. It contains papers, codes, data… ☆677 · Updated this week
- [ICML 2024] TrustLLM: Trustworthiness in Large Language Models ☆561 · Updated 2 months ago
- "他山之石、可以攻玉":复旦白泽智能发布面向国内开源和国外商用大模型的Demo数据集JADE-DB☆406Updated 2 months ago
- Latest Advances on System-2 Reasoning ☆995 · Updated 3 weeks ago
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges ☆975 · Updated last year
- Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate the safety of LLMs. [ACL 2024] ☆218 · Updated 10 months ago
- Safety at Scale: A Comprehensive Survey of Large Model Safety ☆153 · Updated 2 months ago
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings] ☆189 · Updated 7 months ago
- LLM hallucination paper list ☆316 · Updated last year
- Aligning Large Language Models with Human: A Survey ☆730 · Updated last year
- Chinese safety prompts for evaluating and improving the safety of LLMs. ☆1,012 · Updated last year
- A collection of interview questions for large language model (LLM) algorithm engineers ☆1,987 · Updated 4 months ago
- Official repo for GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆488 · Updated 7 months ago
- Collecting awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-… ☆1,619 · Updated 8 months ago
- [ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future ☆444 · Updated 4 months ago