athina-ai / athina-evals

Python SDK for running evaluations on LLM generated responses

☆289

Alternatives and similar repositories for athina-evals

Users who are interested in athina-evals are comparing it to the libraries listed below.

AI-Northstar-Tech / vector-io
Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba…
☆252 Updated this week
zenbase-ai / core
Prompt engineering, automated.
☆335 Updated 3 months ago
whyhow-ai / rule-based-retrieval
The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati…
☆246 Updated 9 months ago
ganarajpr / awesome-dspy
An Awesome list of curated DSPy resources.
☆390 Updated 5 months ago
arthur-ai / bench
A tool for evaluating LLMs
☆423 Updated last year
cohere-ai / cohere-terrarium
A simple Python sandbox for helpful LLM data agents
☆276 Updated last year
superagent-ai / super-rag
Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API.
☆380 Updated last year
relari-ai / continuous-eval
Data-Driven Evaluation for LLM-Powered Applications
☆501 Updated 6 months ago
sheet0 / npi
Action library for AI Agent
☆222 Updated 4 months ago
mendableai / rag-arena
Open-source RAG evaluation through users' feedback
☆194 Updated last year
stanford-oval / suql
SUQL: Conversational Search over Structured and Unstructured Data with LLMs
☆276 Updated 2 weeks ago
misbahsy / RAGTune
Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon)
☆264 Updated last year
phospho-app / text-analytics-legacy
Legacy project
☆438 Updated 2 weeks ago
anyscale / llm-router
Tutorial for building LLM router
☆220 Updated last year
diicellman / dspy-rag-fastapi
FastAPI wrapper around DSPy
☆258 Updated last year
TonicAI / tonic_validate
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
☆314 Updated 3 weeks ago
567-labs / kura
Kura is a simple reproduction of the CLIO paper which uses language models to label user behaviour before clustering them based on embedd…
☆259 Updated last month
cohere-ai / quick-start-connectors
This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine…
☆151 Updated 9 months ago
saharmor / voice-lab
Testing and evaluation framework for voice agents
☆129 Updated last month
parea-ai / parea-sdk-py
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
☆78 Updated 5 months ago
langchain-ai / langchain-benchmarks
🦜💯 Flex those feathers!
☆252 Updated 9 months ago
aryn-ai / sycamore
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
☆547 Updated this week
sydverma123 / awesome-ai-repositories
A curated list of open source repositories for AI Engineers
☆116 Updated 4 months ago
SciPhi-AI / agent-search
AgentSearch is a framework for powering search agents and enabling customizable local search.
☆496 Updated last year
jxnl / n-levels-of-rag
☆195 Updated last year
ammirsm / llamaindex-omakase-rag
This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d…
☆146 Updated last year
eidolon-ai / eidolon
The first AI Agent Server, Eidolon is a pluggable Agent SDK and enterprise ready, deployment server for Agentic applications
☆463 Updated 4 months ago
langwatch / langevals
LangEvals aggregates various language model evaluators into a single platform, providing a standard interface for a multitude of scores a…
☆63 Updated this week
simbianai / taskgen
Task-based Agentic Framework using StrictJSON as the core
☆455 Updated 2 weeks ago
hwchase17 / langfuzz
☆71 Updated 9 months ago