Python SDK for running evaluations on LLM generated responses
☆297 · Jun 6, 2025 · Updated 9 months ago
Alternatives and similar repositories for athina-evals
Users that are interested in athina-evals are comparing it to the libraries listed below.
- LLM Evals for Text Summarization and RAG use-cases. ☆35 · Jan 22, 2024 · Updated 2 years ago
- Data-Driven Evaluation for LLM-Powered Applications ☆515 · Jan 22, 2025 · Updated last year
- 🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓 ☆5,325 · Mar 19, 2026 · Updated last week
- A tool for evaluating LLMs ☆428 · Mar 15, 2026 · Updated 2 weeks ago
- UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured ch… ☆2,339 · Aug 18, 2024 · Updated last year
- Supercharge Your LLM Application Evaluations 🚀 ☆13,106 · Feb 24, 2026 · Updated last month
- LLM Testing SDK that helps you write and run tests to monitor your LLM app in production ☆132 · Jan 22, 2024 · Updated 2 years ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆82 · Feb 13, 2025 · Updated last year
- The LLM Evaluation Framework ☆14,227 · Mar 20, 2026 · Updated last week
- AI Observability & Evaluation ☆9,020 · Updated this week
- Small, simple agent task environments for training and evaluation ☆19 · Nov 1, 2024 · Updated last year
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ☆325 · Jul 10, 2025 · Updated 8 months ago
- Deepmark AI enables a unique testing environment for language models (LLM) assessment on task-specific metrics and on your own data so yo… ☆104 · Nov 24, 2023 · Updated 2 years ago
- AI Evaluation Platform ☆48 · May 26, 2025 · Updated 10 months ago
- REST API for Large Language Models using FastAPI, Redis and LiteLLM ☆14 · Nov 30, 2023 · Updated 2 years ago
- LLM evaluation. ☆16 · Nov 7, 2023 · Updated 2 years ago
- structured outputs for llms ☆12,589 · Updated this week
- Langtrace 🔍 is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, ev… ☆1,187 · Nov 17, 2025 · Updated 4 months ago
- The platform for LLM evaluations and AI agent testing ☆3,161 · Updated this week
- 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ☆23,868 · Updated this week
- Audio tokenization, in the fastest way possible! ☆53 · Aug 26, 2024 · Updated last year
- OpenTelemetry Instrumentation for AI Observability ☆893 · Mar 23, 2026 · Updated last week
- A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. ☆11,080 · Updated this week
- Get 100% uptime, reliability from OpenAI. Handle Rate Limit, Timeout, API, Keys Errors ☆701 · Nov 20, 2023 · Updated 2 years ago
- An open-source visual programming environment for battle-testing prompts to LLMs. ☆2,964 · Jan 2, 2026 · Updated 2 months ago
- This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems. ☆2,474 · Feb 17, 2025 · Updated last year
- Simple AI agents / assistants ☆51 · Oct 8, 2024 · Updated last year
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including C… ☆5,395 · Mar 19, 2026 · Updated last week
- An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design ☆22 · Dec 13, 2024 · Updated last year
- A super framework for prompt engineering. ☆14 · Nov 20, 2024 · Updated last year
- Artificial Intelligence courses, projects, and resources ☆12 · Nov 28, 2016 · Updated 9 years ago
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ☆40,834 · Updated this week
- AI Infrastructure Engage & Think Layers for Voice & Vision Interactions ☆22 · Jul 28, 2025 · Updated 8 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆3,160 · Feb 11, 2026 · Updated last month
- An attribution library for LLMs ☆46 · Sep 17, 2024 · Updated last year
- The repository contains code for Adaptive Data Optimization ☆33 · Dec 9, 2024 · Updated last year
- Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Ll… ☆18,597 · Updated this week
- The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place. ☆3,947 · Mar 23, 2026 · Updated last week
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆29 · Dec 10, 2024 · Updated last year