Python SDK for running evaluations on LLM-generated responses
★300; Jun 6, 2025; updated last year
Alternatives and similar repositories for athina-evals
Users who are interested in athina-evals are comparing it to the libraries listed below.
- Data-Driven Evaluation for LLM-Powered Applications (★516; Jan 22, 2025; updated last year)
- Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 (★5,838; Jun 11, 2026; updated last week)
- A tool for evaluating LLMs (★429; Mar 15, 2026; updated 3 months ago)
- UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured ch… (★2,350; Aug 18, 2024; updated last year)
- Supercharge Your LLM Application Evaluations (★14,430; Feb 24, 2026; updated 3 months ago)
- LLM Testing SDK that helps you write and run tests to monitor your LLM app in production (★131; Jan 22, 2024; updated 2 years ago)
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) (★82; Feb 13, 2025; updated last year)
- The LLM Evaluation Framework (★16,192; updated this week)
- Small, simple agent task environments for training and evaluation (★20; Nov 1, 2024; updated last year)
- AI Observability & Evaluation (★10,174; updated this week)
- (★20; Jul 19, 2023; updated 2 years ago)
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. (★326; Jul 10, 2025; updated 11 months ago)
- Deepmark AI enables a unique testing environment for language models (LLM) assessment on task-specific metrics and on your own data so yo… (★104; Nov 24, 2023; updated 2 years ago)
- AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices. (★937; updated this week)
- Clean and functional LLM frontend (★11; Mar 7, 2025; updated last year)
- AI Evaluation Platform (★49; May 26, 2025; updated last year)
- REST API for Large Language Models using FastAPI, Redis and LiteLLM (★14; Nov 30, 2023; updated 2 years ago)
- LLM evaluation. (★16; Nov 7, 2023; updated 2 years ago)
- structured outputs for llms (★13,181; updated this week)
- Langtrace is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, ev… (★1,205; Nov 17, 2025; updated 7 months ago)
- The platform for LLM evaluations and AI agent testing (★3,310; updated this week)
- Open source AI engineering platform: LLM evals, observability, metrics, prompt management, playground, datasets. Integrates with OpenT… (★29,372; updated this week)
- Audio tokenization, in the fastest way possible! (★54; Aug 26, 2024; updated last year)
- OpenTelemetry Instrumentation for AI Observability (★1,028; updated this week)
- A blazing fast AI Gateway with integrated guardrails. Route to 1,600+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. (★12,132; May 25, 2026; updated 3 weeks ago)
- An open-source visual programming environment for battle-testing prompts to LLMs. (★2,997; Jun 10, 2026; updated last week)
- This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems. (★2,536; Feb 17, 2025; updated last year)
- Simple AI agents / assistants (★52; Oct 8, 2024; updated last year)
- An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design (★22; Dec 13, 2024; updated last year)
- A super framework for prompt engineering. (★15; Nov 20, 2024; updated last year)
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including C… (★5,640; Mar 19, 2026; updated 3 months ago)
- Artificial Intelligence courses, projects, and resources (★11; Nov 28, 2016; updated 9 years ago)
- AI Infrastructure Engage & Think Layers for Voice & Vision Interactions (★22; Jul 28, 2025; updated 10 months ago)
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. (★3,178; Mar 31, 2026; updated 2 months ago)
- An attribution library for LLMs (★46; Sep 17, 2024; updated last year)
- The repository contains code for Adaptive Data Optimization (★36; Dec 9, 2024; updated last year)
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… (★50,785; updated this week)
- (★29; May 30, 2023; updated 3 years ago)
- The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place. (★4,209; updated this week)