GitHub Action to evaluate AI agent applications using model-as-a-judge, content safety, and mathematical metrics.
☆65 Jan 16, 2026 Updated last month
Alternatives and similar repositories for ai-agent-evals
Users who are interested in ai-agent-evals are comparing it to the repositories listed below.
- The LLMAgentOps Toolkit is a repository that provides a foundational structure for building LLM Agent-based applications using the Semant… ☆16 Feb 20, 2026 Updated last week
- Playground for building AI Agents on Azure ☆30 Mar 31, 2025 Updated 11 months ago
- eShopLite - Semantic Search is a reference .NET application implementing an eCommerce site with Search features using Keyword Search and … ☆13 Apr 24, 2025 Updated 10 months ago
- ReMe: A Personalized Cognitive Training Framework Based on an LLM Voice Chatbot for Research ☆17 Jul 3, 2025 Updated 8 months ago
- SK Multi agentic advanced orchestration example ☆15 Feb 20, 2026 Updated last week
- GitAGU (Git Agent Unblock) - A centralized platform for discovering, configuring, and integrating AI agents into your development workflo… ☆23 Jul 9, 2025 Updated 7 months ago
- ☆12 Aug 6, 2020 Updated 5 years ago
- This lab is a starter for quickly and easily applying SLM/LLM fine-tuning, evaluation, and quantization with torchtune on Azure ML. ☆15 Sep 23, 2025 Updated 5 months ago