Giskard-AI / awesome-ai-safety
A curated list of papers & technical articles on AI Quality & Safety
★161 · Updated last year
Related projects
Alternatives and complementary repositories for awesome-ai-safety
- Fiddler Auditor is a tool to evaluate language models. ★171 · Updated 8 months ago
- ★258 · Updated this week
- Red-Teaming Language Models with DSPy ★142 · Updated 7 months ago
- Datasets and models for instruction-tuning ★233 · Updated last year
- Sample notebooks and prompts for LLM evaluation ★114 · Updated this week
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ★100 · Updated 2 months ago
- Data cleaning and curation for unstructured text ★327 · Updated 3 months ago
- A framework to empower forecasting using Large Language Models (LLMs) ★101 · Updated 4 months ago
- The Foundation Model Transparency Index ★71 · Updated 5 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ★106 · Updated 3 weeks ago
- A tool for evaluating LLMs ★392 · Updated 6 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ★97 · Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ★48 · Updated 4 months ago
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ★41 · Updated 11 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ★257 · Updated 4 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ★97 · Updated last year
- Web UI & Backend for Data Annotations in Aya ★26 · Updated 8 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ★92 · Updated last year
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ★203 · Updated 6 months ago
- A trace analysis tool for AI agents. ★124 · Updated last month
- awesome synthetic (text) datasets ★242 · Updated 3 weeks ago
- Large Language Model (LLM) Inference API and Chatbot ★122 · Updated 7 months ago
- Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query. ★162 · Updated 6 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ★61 · Updated 9 months ago
- Let's build better datasets, together! ★205 · Updated this week
- ★199 · Updated this week
- Tutorial for building an LLM router ★163 · Updated 4 months ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ★66 · Updated this week
- Mistral + Haystack: build RAG pipelines that rock ★100 · Updated 9 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ★74 · Updated 2 months ago