athina-ai / athina-evals
Python SDK for running evaluations on LLM generated responses
★297 · Jun 6, 2025 · Updated 8 months ago
Alternatives and similar repositories for athina-evals
Users interested in athina-evals are comparing it to the libraries listed below.
- Data-Driven Evaluation for LLM-Powered Applications ★517 · Jan 22, 2025 · Updated last year
- Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 ★5,111 · Updated this week
- A tool for evaluating LLMs ★428 · May 10, 2024 · Updated last year
- UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured ch… ★2,338 · Aug 18, 2024 · Updated last year
- AI Observability & Evaluation ★8,530 · Updated this week
- The LLM Evaluation Framework ★13,613 · Feb 10, 2026 · Updated last week
- Small, simple agent task environments for training and evaluation ★19 · Nov 1, 2024 · Updated last year
- Supercharge Your LLM Application Evaluations ★12,605 · Jan 31, 2026 · Updated 2 weeks ago
- Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications. ★324 · Jul 10, 2025 · Updated 7 months ago
- The platform for LLM evaluations and AI agent testing ★2,823 · Updated this week
- A realtime serving engine for Data-Intensive Generative AI Applications ★1,096 · Updated this week
- structured outputs for llms ★12,357 · Feb 10, 2026 · Updated last week
- Langtrace is an open-source, Open Telemetry based end-to-end observability tool for LLM applications, providing real-time tracing, ev… ★1,183 · Nov 17, 2025 · Updated 3 months ago
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ★82 · Feb 13, 2025 · Updated last year
- An attribution library for LLMs ★46 · Sep 17, 2024 · Updated last year
- Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with Open… ★21,935 · Updated this week
- Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks including C… ★5,282 · Oct 30, 2025 · Updated 3 months ago
- An open-source visual programming environment for battle-testing prompts to LLMs. ★2,922 · Jan 2, 2026 · Updated last month
- Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude,… ★10,462 · Updated this week
- Get 100% uptime, reliability from OpenAI. Handle Rate Limit, Timeout, API, Keys Errors ★695 · Nov 20, 2023 · Updated 2 years ago
- OpenTelemetry Instrumentation for AI Observability ★851 · Feb 10, 2026 · Updated last week
- AI Evaluation Platform ★47 · May 26, 2025 · Updated 8 months ago
- A blazing fast AI Gateway with integrated guardrails. Route to 200+ LLMs, 50+ AI Guardrails with 1 fast & friendly API. ★10,614 · Jan 26, 2026 · Updated 3 weeks ago
- Deepmark AI enables a unique testing environment for language models (LLM) assessment on task-specific metrics and on your own data so yo… ★104 · Nov 24, 2023 · Updated 2 years ago
- Open source platform for AI Engineering: OpenTelemetry-native LLM Observability, GPU Monitoring, Guardrails, Evaluations, Prompt Manageme… ★2,207 · Updated this week
- [ACL 2024] A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset ★25 · May 29, 2025 · Updated 8 months ago
- fork of litellm that is open source ★21 · Jan 22, 2026 · Updated 3 weeks ago
- LLM evaluation. ★16 · Nov 7, 2023 · Updated 2 years ago
- The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place. ★3,840 · Updated this week
- AI-powered tools to automate code documentation and optimize developer operations. ★40 · Feb 9, 2026 · Updated last week
- Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chro… ★3,003 · Updated this week
- Open-source, developer-first LLMOps platform designed to streamline prompt design, version management, instant delivery, collaboratio… ★3,190 · Jun 28, 2025 · Updated 7 months ago
- Vision utilities for web interaction agents ★1,753 · Nov 25, 2024 · Updated last year
- Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a… ★35,968 · Updated this week
- Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app … ★6,407 · Feb 3, 2026 · Updated 2 weeks ago
- This repository contains various advanced techniques for Retrieval-Augmented Generation (RAG) systems. ★2,434 · Feb 17, 2025 · Updated last year
- LLM Testing SDK that helps you write and run tests to monitor your LLM app in production ★132 · Jan 22, 2024 · Updated 2 years ago
- SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API. ★7,673 · Nov 7, 2025 · Updated 3 months ago
- Laminar - open-source observability platform purpose-built for AI agents. YC S24. ★2,590 · Feb 10, 2026 · Updated last week