GitHub Action to evaluate AI agent applications using model-as-a-judge, content safety, and mathematical metrics.
☆77 · Mar 13, 2026 · Updated last month
Alternatives and similar repositories for ai-agent-evals
Users that are interested in ai-agent-evals are comparing it to the libraries listed below.
- GitAGU (Git Agent Unblock) - A centralized platform for discovering, configuring, and integrating AI agents into your development workflo… ☆29 · Apr 13, 2026 · Updated 3 weeks ago
- The LLMAgentOps Toolkit is a repository that provides a foundational structure for building LLM Agent-based applications using the Semant… ☆17 · Apr 1, 2026 · Updated last month
- Tayra is a sophisticated call center analytics platform designed to systematically evaluate and score call center audio interactions. By … ☆14 · Dec 19, 2025 · Updated 4 months ago
- SK Multi agentic advanced orchestration example ☆15 · Feb 20, 2026 · Updated 2 months ago
- eShopLite - Semantic Search is a reference .NET application implementing an eCommerce site with Search features using Keyword Search and … ☆13 · Apr 24, 2025 · Updated last year
- Implementation of 12 AI agents evaluation techniques ☆43 · Jul 31, 2025 · Updated 9 months ago
- ReMe: A Personalized Cognitive Training Framework Based on an LLM Voice Chatbot for Research ☆18 · Jul 3, 2025 · Updated 10 months ago
- The Doc Intelligence in-a-Box project leverages Azure AI Document Intelligence to extract data from PDF forms and store the data in a Azu… ☆46 · Mar 27, 2026 · Updated last month
- ☆42 · Apr 9, 2026 · Updated 3 weeks ago
- This lab is a starter for quickly and easily applying SLM/LLM fine-tuning, evaluation, and quantization with torchtune on Azure ML. ☆15 · Apr 21, 2026 · Updated last week
- End-to-end solution sample for a travel assistant built with the Azure Agent Runtime ☆31 · Apr 2, 2026 · Updated last month
- An exploration of the capabilities of GPT-5 ☆37 · Sep 4, 2025 · Updated 8 months ago
- VS Code Extension for Copilot Studio ☆89 · Updated this week
- Azure Computer Vision 4 (March 2023 - Florence) workshop in a day ☆42 · May 11, 2023 · Updated 2 years ago
- ☆12 · Aug 6, 2020 · Updated 5 years ago
- Examples of how to use the Azure OpenAI Log Probabilities (LogProbs) feature to enhance Generative AI - Q&A grounding. ☆24 · May 10, 2025 · Updated 11 months ago
- Microsoft AI Value Accelerator ☆33 · Jul 30, 2024 · Updated last year
- A service for end-to-end (functional) testing of a bot. Programmatically simulate a user’s back-and-forth conversation with a bot, to tes… ☆18 · Apr 20, 2026 · Updated 2 weeks ago
- Windows Data and Analytics Shared Code - JSON Processing ☆15 · Jun 12, 2023 · Updated 2 years ago
- VS Code native module for loading and reading OS policies ☆16 · Jan 13, 2026 · Updated 3 months ago
- Hyperparameter Tuning for Deep Learning ☆16 · Feb 5, 2020 · Updated 6 years ago
- ☆27 · Nov 27, 2025 · Updated 5 months ago
- Magentic-Marketplace: Simulate Agentic Markets and See How They Evolve ☆156 · Mar 1, 2026 · Updated 2 months ago
- ☆81 · Updated this week
- Azure AI Agents Playbook ☆33 · Apr 15, 2026 · Updated 2 weeks ago
- VS Code extension to preview a theme without installing it ☆15 · Mar 26, 2026 · Updated last month
- ☆30 · Mar 26, 2026 · Updated last month
- ☆20 · Nov 11, 2025 · Updated 5 months ago
- ☆42 · Feb 11, 2026 · Updated 2 months ago
- Implement GenAIOps using Azure AI Foundry with ease and jumpstart ☆27 · Apr 13, 2026 · Updated 3 weeks ago
- A sample OpenAI plugin using ASP.NET Core API ☆17 · Jun 22, 2023 · Updated 2 years ago
- ☆36 · Nov 15, 2024 · Updated last year
- A refactoring benchmark for software engineering agents. [ICLR 2025] ☆23 · Feb 20, 2026 · Updated 2 months ago
- Get the assets and code here, and then follow our Bee Control tutorial to learn more about how to work with Unity, C#, and Visual Studio … ☆15 · Jun 30, 2016 · Updated 9 years ago
- Activate GenAI with Azure ☆23 · Jan 26, 2026 · Updated 3 months ago
- A Mixture‑of‑Experts Educational Framework for Adaptive Cybersecurity ☆22 · Feb 8, 2026 · Updated 2 months ago
- Upgrade a legacy Python project with GitHub Copilot ☆19 · Sep 24, 2025 · Updated 7 months ago
- A Ruby lib to achieve consensus with Cassandra ☆11 · Feb 28, 2020 · Updated 6 years ago
- Scaling AOAI using APIM, PTUs and TPMs ☆114 · May 17, 2024 · Updated last year