fiddler-labs / fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
☆183, updated last year
Alternatives and similar repositories for fiddler-auditor
Users who are interested in fiddler-auditor compare it to the libraries listed below.
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆121, updated last month
- Red-Teaming Language Models with DSPy ☆198, updated 4 months ago
- A tool for evaluating LLMs ☆419, updated last year
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆78, updated 4 months ago
- Sample notebooks and prompts for LLM evaluation ☆135, updated 2 weeks ago
- ☆167, updated last year
- ☆185, updated last year
- Mistral + Haystack: build RAG pipelines that rock 🤘 ☆105, updated last year
- Framework for building data agent workflows ☆82, updated 10 months ago
- Leverage your LangChain trace data for fine tuning ☆41, updated 10 months ago
- Recipes and resources for building, deploying, and fine-tuning generative AI with Fireworks. ☆113, updated this week
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆322, updated 7 months ago
- ☆77, updated last year
- AI Evaluation Platform ☆46, updated 3 weeks ago
- ☆72, updated 7 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da ☆105, updated 2 months ago
- ☆78, updated last year
- ☆72, updated 8 months ago
- Writing Blog Posts with Generative Feedback Loops! ☆48, updated last year
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more! ☆81, updated last year
- ☆170, updated last year
- Agent that routes to different tools - LLM classifier SDK ☆44, updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99, updated last year
- 📚 Datasets and models for instruction-tuning ☆238, updated last year
- ☆53, updated last year
- ☆75, updated last year
- ☆89, updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49, updated 11 months ago
- This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine… ☆150, updated 8 months ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37, updated last year