fiddler-labs / fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
☆171 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for fiddler-auditor
- Sample notebooks and prompts for LLM evaluation ☆114 · Updated last week
- ☆75 · Updated 5 months ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆68 · Updated last week
- Red-Teaming Language Models with DSPy ☆142 · Updated 7 months ago
- ☆83 · Updated last year
- Low latency, High Accuracy, Custom Query routers for Co-pilots and Agents. Built by Prithivi Da ☆52 · Updated this week
- ☆88 · Updated 10 months ago
- A trace analysis tool for AI agents. ☆124 · Updated last month
- A tool for evaluating LLMs ☆392 · Updated 6 months ago
- Leverage your LangChain trace data for fine tuning ☆38 · Updated 3 months ago
- 📚 A curated list of papers & technical articles on AI Quality & Safety ☆161 · Updated last year
- ☆67 · Updated last month
- ☆47 · Updated 5 months ago
- Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning) ☆390 · Updated 11 months ago
- Framework for LLM evaluation, guardrails and security ☆96 · Updated 2 months ago
- ☆165 · Updated this week
- ☆162 · Updated 5 months ago
- ☆179 · Updated last year
- ☆182 · Updated 6 months ago
- ☆144 · Updated 10 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act ☆92 · Updated last year
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more! ☆79 · Updated 9 months ago
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆100 · Updated 9 months ago
- Tutorial for building LLM router ☆163 · Updated 4 months ago
- Automatic Evals for Instruction-Tuned Models ☆65 · Updated this week
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆36 · Updated 7 months ago
- data cleaning and curation for unstructured text ☆327 · Updated 3 months ago
- 🚀 A list of Haystack Integrations, maintained by the community or deepset. ☆64 · Updated last week
- Python SDK for running evaluations on LLM generated responses ☆221 · Updated last week
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆312 · Updated 2 weeks ago