fiddler-labs / fiddler-auditor
Fiddler Auditor is a tool to evaluate language models.
☆179Updated last year
Alternatives and similar repositories for fiddler-auditor:
Users who are interested in fiddler-auditor are comparing it to the libraries listed below:
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use…☆115Updated last week
- Sample notebooks and prompts for LLM evaluation☆124Updated 2 weeks ago
- A tool for evaluating LLMs☆418Updated 11 months ago
- Red-Teaming Language Models with DSPy☆188Updated 2 months ago
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more!☆81Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆99Updated last year
- ☆185Updated last year
- Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.☆108Updated 7 months ago
- ☆51Updated 11 months ago
- ☆75Updated last year
- Domain Adapted Language Modeling Toolkit - E2E RAG☆320Updated 5 months ago
- Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act☆94Updated last year
- ☆67Updated 5 months ago
- ☆163Updated last year
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.☆297Updated 5 months ago
- AI Evaluation Platform☆46Updated 3 weeks ago
- ☆143Updated 9 months ago
- ☆78Updated 11 months ago
- AI Verify☆8Updated this week
- ☆72Updated 6 months ago
- Build Enterprise RAG (Retrieval Augmented Generation) Pipelines to tackle various Generative AI use cases with LLMs by simply plugging co…☆109Updated 9 months ago
- The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati…☆237Updated 7 months ago
- Python SDK for running evaluations on LLM generated responses☆278Updated last week
- Leverage your LangChain trace data for fine tuning☆41Updated 9 months ago
- LLM Evals for Text Summarization and RAG use-cases.☆35Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆108Updated 3 weeks ago
- wandbot is a technical support bot for Weights & Biases' AI developer tools that can run in Discord, Slack, ChatGPT and Zendesk☆296Updated 2 weeks ago
- ☆195Updated last year
- ☆204Updated last year
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling☆130Updated 6 months ago