athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
☆272 Updated this week
Alternatives and similar repositories for athina-evals:
Users who are interested in athina-evals are comparing it to the libraries listed below:
- Prompt engineering, automated. ☆288 Updated 3 months ago
- Data-Driven Evaluation for LLM-Powered Applications ☆484 Updated last month
- Comprehensive Vector Data Tooling. The universal interface for all vector database, datasets and RAG platforms. Easily export, import, ba… ☆230 Updated this week
- OpenTelemetry Instrumentation for AI Observability ☆334 Updated this week
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ☆363 Updated 10 months ago
- A simple Python sandbox for helpful LLM data agents ☆235 Updated 9 months ago
- Action library for AI Agent ☆211 Updated this week
- The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applicati… ☆235 Updated 5 months ago
- ☆195 Updated 10 months ago
- Tutorial for building LLM router ☆187 Updated 8 months ago
- Open-source RAG evaluation through users' feedback ☆179 Updated 11 months ago
- A tool for evaluating LLMs ☆407 Updated 10 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆76 Updated last month
- An Intelligence Operating System ☆323 Updated this week
- AgentSearch is a framework for powering search agents and enabling customizable local search. ☆479 Updated 10 months ago
- Synthetic Data for LLM Fine-Tuning ☆112 Updated last year
- Text analytics for LLM apps. Cluster messages to detect use cases, outliers, power users. Detect intents and run evals with LLM (OpenAI, … ☆423 Updated 2 months ago
- ☆273 Updated last week
- An Awesome list of curated DSPy resources. ☆297 Updated last month
- Infrastructure for AI code interpreting that's powering E2B. ☆339 Updated this week
- Testing and evaluation framework for voice agents ☆98 Updated last month
- 🦜💯 Flex those feathers! ☆242 Updated 5 months ago
- This open-source repository offers reference code for integrating workplace datastores with Cohere's LLMs, enabling developers and busine… ☆148 Updated 5 months ago
- Tuning and Evaluation of RAG pipeline. (Automated optimization to be added soon) ☆263 Updated last year
- AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices. ☆429 Updated this week
- LLM Evals for Text Summarization and RAG use-cases. ☆35 Updated last year
- Open source AI Agent evaluation framework for web tasks 🐒🍌 ☆289 Updated 2 months ago
- The Identity layer for the agentic world ☆171 Updated this week
- Routing on Random Forest (RoRF) ☆135 Updated 5 months ago
- ☆221 Updated last year