fiddler-labs / fiddler-auditor

Fiddler Auditor is a tool to evaluate language models.

☆184

Alternatives and similar repositories for fiddler-auditor

Users who are interested in fiddler-auditor are comparing it to the libraries listed below.

arthur-ai / bench
A tool for evaluating LLMs
☆424 Updated last year
rajshah4 / LLM-Evaluation
Sample notebooks and prompts for LLM evaluation
☆138 Updated last month
athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
☆291 Updated 2 months ago
deepset-ai / prompthub
☆173 Updated last year
rungalileo / hallucination-index
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
☆113 Updated last week
TonicAI / tonic_validate
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
☆315 Updated 3 weeks ago
alopatenko / LLMEvaluation
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use…
☆130 Updated this week
Aggregate-Intellect / sherpa
☆166 Updated this week
Giskard-AI / awesome-ai-safety
📚 A curated list of papers & technical articles on AI Quality & Safety
☆188 Updated 3 months ago
run-llama / ai-engineer-workshop
☆185 Updated last year
haizelabs / dspy-redteam
Red-Teaming Language Models with DSPy
☆203 Updated 5 months ago
tigerlab-ai / tiger
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
☆398 Updated last year
langchain-ai / langchain-benchmarks
🦜💯 Flex those feathers!
☆253 Updated 9 months ago
whyhow-ai / rule-based-retrieval
The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati…
☆245 Updated 10 months ago
run-llama / llamaindex_aws_ingestion
☆89 Updated last year
andrewnguonly / ChatAbstractions
LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more!
☆82 Updated last year
CYQIQ / MultiCoT
Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph
☆147 Updated last year
anyscale / ray-summit-2023-training
☆87 Updated last year
NumbersStationAI / meadow
Framework for building data agent workflows
☆82 Updated 11 months ago
parea-ai / parea-sdk-py
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
☆78 Updated 5 months ago
hwchase17 / langfuzz
☆71 Updated 9 months ago
stanford-crfm / EUAIActJune15
Stanford CRFM's initiative to assess potential compliance with the draft EU AI Act
☆94 Updated last year
arcee-ai / DALM
Domain Adapted Language Modeling Toolkit - E2E RAG
☆325 Updated 8 months ago
cfahlgren1 / observers
A Lightweight Library for AI Observability
☆250 Updated 5 months ago
anakin87 / mistral-haystack
Mistral + Haystack: build RAG pipelines that rock 🤘
☆105 Updated last year
raga-ai-hub / raga-llm-hub
Framework for LLM evaluation, guardrails and security
☆112 Updated 10 months ago
cohere-ai / quick-start-connectors
This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine…
☆151 Updated 9 months ago
amogkam / llama_index_ray
Using LlamaIndex with Ray for productionizing LLM applications
☆71 Updated 2 years ago
parlance-labs / langfree
Leverage your LangChain trace data for fine tuning
☆42 Updated last year
titanml / takeoff-community
TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access…
☆114 Updated last year