Giskard-AI / awesome-ai-safety
A curated list of papers & technical articles on AI Quality & Safety
☆187 · Updated 3 months ago
Alternatives and similar repositories for awesome-ai-safety
Users who are interested in awesome-ai-safety are comparing it to the repositories listed below.
- Fiddler Auditor is a tool to evaluate language models. ☆183 · Updated last year
- Datasets and models for instruction-tuning ☆238 · Updated last year
- ☆268 · Updated 5 months ago
- An open-source compliance-centered evaluation framework for Generative AI models ☆158 · Updated last week
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆94 · Updated last year
- Red-Teaming Language Models with DSPy ☆202 · Updated 5 months ago
- A curated list of resources dedicated to synthetic data ☆130 · Updated 2 years ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆111 · Updated 10 months ago
- The Foundation Model Transparency Index ☆81 · Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated last year
- A curated list of awesome synthetic data tools (open source and commercial). ☆191 · Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆113 · Updated this week
- ☆239 · Updated 3 months ago
- Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆206 · Updated this week
- Mixing Language Models with Self-Verification and Meta-Verification ☆106 · Updated 7 months ago
- Data cleaning and curation for unstructured text ☆328 · Updated 11 months ago
- A tool for evaluating LLMs ☆424 · Updated last year
- Mistral + Haystack: build RAG pipelines that rock ☆105 · Updated last year
- A framework for fine-tuning retrieval-augmented generation (RAG) systems. ☆122 · Updated this week
- Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning) ☆397 · Updated last year
- This is the reproduction repository for my Hugging Face blog post on synthetic data ☆68 · Updated last year
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… ☆114 · Updated last year
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆324 · Updated 8 months ago
- Continuous Integration for LLM powered applications ☆246 · Updated last year
- Large Language Model (LLM) Inference API and Chatbot ☆126 · Updated last year
- Google DeepMind's PromptBreeder for automated prompt engineering implemented in LangChain Expression Language. ☆125 · Updated 11 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc. on task… ☆173 · Updated 9 months ago
- ☆168 · Updated last year
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆46 · Updated last year
- Curation of prompts that are known to be adversarial to large language models ☆180 · Updated 2 years ago