GitHub Action to evaluate AI agent applications using model-as-judge, content safety, and mathematical metrics.
☆70Mar 13, 2026Updated last week
Alternatives and similar repositories for ai-agent-evals
Users that are interested in ai-agent-evals are comparing it to the libraries listed below.
- GitAGU (Git Agent Unblock) - A centralized platform for discovering, configuring, and integrating AI agents into your development workflo…☆24Mar 12, 2026Updated last week
- The LLMAgentOps Toolkit is a repository that provides a foundational structure for building LLM Agent-based applications using the Semant…☆16Feb 20, 2026Updated last month
- Playground for building AI Agents on Azure☆31Mar 31, 2025Updated 11 months ago
- SK Multi agentic advanced orchestration example☆15Feb 20, 2026Updated last month
- ReMe: A Personalized Cognitive Training Framework Based on an LLM Voice Chatbot for Research☆17Jul 3, 2025Updated 8 months ago
- The Doc Intelligence in-a-Box project leverages Azure AI Document Intelligence to extract data from PDF forms and store the data in an Azu…☆44Jan 31, 2025Updated last year
- ☆42Updated this week
- End-to-end solution sample for a travel assistant built with the Azure Agent Runtime☆30Jan 13, 2026Updated 2 months ago
- This lab is a starter for quickly and easily applying SLM/LLM fine-tuning, evaluation, and quantization with torchtune on Azure ML.☆15Sep 23, 2025Updated 6 months ago
- An exploration of the capabilities of GPT-5☆37Sep 4, 2025Updated 6 months ago
- VS Code Extension for Copilot Studio☆77Updated this week
- Azure Computer Vision 4 (March 2023 - Florence) workshop in a day☆42May 11, 2023Updated 2 years ago
- Examples of how to use the Azure OpenAI Log Probabilities (LogProbs) feature to enhance Generative AI - Q&A grounding.☆25May 10, 2025Updated 10 months ago
- A service for end-to-end (functional) testing of a bot. Programmatically simulate a user’s back-and-forth conversation with a bot, to tes…☆19Feb 12, 2026Updated last month
- Windows Data and Analytics Shared Code - JSON Processing☆15Jun 12, 2023Updated 2 years ago
- Magentic-Marketplace: Simulate Agentic Markets and See How They Evolve☆148Mar 1, 2026Updated 3 weeks ago
- ☆67Mar 18, 2026Updated last week
- A refactoring benchmark for software engineering agents. [ICLR 2025]☆21Feb 20, 2026Updated last month
- Hyperparameter Tuning for Deep Learning☆16Feb 5, 2020Updated 6 years ago
- ☆29Nov 27, 2025Updated 3 months ago
- Azure AI Agents Playbook☆34Jan 27, 2026Updated last month
- VS Code extension to preview a theme without installing it☆15Updated this week
- ☆20Nov 11, 2025Updated 4 months ago
- Implement GenAIOps using Azure AI Foundry with ease and jumpstart☆26Apr 23, 2025Updated 11 months ago
- A sample OpenAI plugin using ASP.NET Core API☆18Jun 22, 2023Updated 2 years ago
- ☆36Nov 15, 2024Updated last year
- Get the assets and code here, and then follow our Bee Control tutorial to learn more about how to work with Unity, C#, and Visual Studio …☆15Jun 30, 2016Updated 9 years ago
- Activate GenAI with Azure☆23Jan 26, 2026Updated last month
- Open source framework for evaluating AI Agents☆29Feb 24, 2026Updated last month
- A Mixture‑of‑Experts Educational Framework for Adaptive Cybersecurity☆22Feb 8, 2026Updated last month
- A Ruby library to achieve consensus with Cassandra☆11Feb 28, 2020Updated 6 years ago
- Create an MCP Server for your API using the TypeSpec MCP Server☆47Feb 4, 2026Updated last month
- Scaling AOAI using APIM, PTUs and TPMs☆113May 17, 2024Updated last year
- Bundle of security analysis scripts for Keras/TensorFlow models☆16Apr 15, 2024Updated last year
- MCP server for the Windows API.☆22Apr 22, 2025Updated 11 months ago
- ☆58Updated this week
- Convert issue form responses to JSON☆22Updated this week
- A deployment of a secure, extensible and integrated environment for running AI Foundry workloads in Production. It simplifies the process…☆180Mar 16, 2026Updated last week
- Retail Search with AI☆14Feb 14, 2026Updated last month