Giskard-AI / awesome-ai-safety
A curated list of papers & technical articles on AI Quality & Safety
☆155 · Updated 11 months ago
Related projects:
- Fiddler Auditor is a tool to evaluate language models. ☆163 · Updated 6 months ago
- ☆256 · Updated this week
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language. ☆52 · Updated last month
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆92 · Updated 11 months ago
- Datasets and models for instruction-tuning ☆228 · Updated 11 months ago
- The Foundation Model Transparency Index ☆65 · Updated 3 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, LangChain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆117 · Updated 3 weeks ago
- Mistral + Haystack: build RAG pipelines that rock 🤗 ☆99 · Updated 7 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ☆57 · Updated 7 months ago
- Building a chatbot powered by a RAG pipeline to read, summarize and quote the most relevant papers related to the user query. ☆161 · Updated 4 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆101 · Updated last week
- Let's build better datasets, together! ☆195 · Updated last month
- A trace analysis tool for AI agents. ☆97 · Updated this week
- Red-Teaming Language Models with DSPy ☆116 · Updated 5 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆192 · Updated 4 months ago
- A framework to empower forecasting using Large Language Models (LLMs) ☆97 · Updated 2 months ago
- A tool for evaluating LLMs ☆377 · Updated 4 months ago
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate. ☆91 · Updated last week
- awesome synthetic (text) datasets ☆213 · Updated last week
- This repository implements the chain of verification paper by Meta AI ☆151 · Updated 11 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆362 · Updated 7 months ago
- Automating enterprise workflows with multimodal agents ☆83 · Updated last month
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆143 · Updated 5 months ago
- data cleaning and curation for unstructured text ☆326 · Updated last month
- Mixing Language Models with Self-Verification and Meta-Verification ☆96 · Updated 10 months ago
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs ☆194 · Updated 3 weeks ago
- Additional packages (components, document stores and the like) to extend the capabilities of Haystack version 2.0 and onwards ☆105 · Updated this week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 · Updated 2 months ago
- PanML is a high level generative AI/ML development and analysis library designed for ease of use and fast experimentation. ☆113 · Updated last year
- Sample notebooks and prompts for LLM evaluation ☆104 · Updated 5 months ago