Giskard-AI / awesome-ai-safety
A curated list of papers & technical articles on AI Quality & Safety
⭐ 178 · Updated last month

Alternatives and similar repositories for awesome-ai-safety
Users interested in awesome-ai-safety are comparing it to the libraries listed below.
- ⭐ 267 · Updated 3 months ago
- Fiddler Auditor is a tool to evaluate language models. ⭐ 179 · Updated last year
- Red-Teaming Language Models with DSPy ⭐ 192 · Updated 3 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ⭐ 91 · Updated this week
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ⭐ 94 · Updated last year
- Creating the tools and data sets necessary to evaluate vulnerabilities in LLMs. ⭐ 23 · Updated 2 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker. ⭐ 109 · Updated last month
- A repository of Language Model Vulnerabilities and Exposures (LVEs). ⭐ 109 · Updated last year
- Datasets and models for instruction-tuning ⭐ 237 · Updated last year
- ⭐ 43 · Updated 9 months ago
- PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to a… ⭐ 365 · Updated last year
- An open-source compliance-centered evaluation framework for Generative AI models ⭐ 148 · Updated last week
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ⭐ 522 · Updated 10 months ago
- This repo contains the code for generating the ToxiGen dataset, published at ACL 2022. ⭐ 317 · Updated 10 months ago
- The Foundation Model Transparency Index ⭐ 78 · Updated 11 months ago
- A Comprehensive Assessment of Trustworthiness in GPT Models ⭐ 290 · Updated 8 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ⭐ 235 · Updated 2 months ago
- Papers about red teaming LLMs and Multimodal models. ⭐ 115 · Updated 5 months ago
- Collection of evals for Inspect AI ⭐ 132 · Updated this week
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ⭐ 116 · Updated last week
- Open-Source Evaluation & Testing for Computer Vision AI systems ⭐ 27 · Updated 7 months ago
- Improving Alignment and Robustness with Circuit Breakers ⭐ 203 · Updated 7 months ago
- ⭐ 100 · Updated 2 months ago
- Data cleaning and curation for unstructured text ⭐ 328 · Updated 9 months ago
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ⭐ 186 · Updated last year
- AI Data Management & Evaluation Platform ⭐ 215 · Updated last year
- LLM Self Defense: By Self Examination, LLMs know they are being tricked ⭐ 32 · Updated 11 months ago
- Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025] ⭐ 304 · Updated 3 months ago
- RuLES: a benchmark for evaluating rule-following in language models ⭐ 223 · Updated 2 months ago
- Code for the paper "Fishing for Magikarp" ⭐ 155 · Updated 2 weeks ago