Evaluation package that allows benchmarking of agentic AIs from various sources and frameworks by producing statistical results which can be compared across different use cases and datasets.
☆75Apr 22, 2026Updated last week
Alternatives and similar repositories for agent-quality-inspect
Users that are interested in agent-quality-inspect are comparing it to the libraries listed below.
- ☆13Jun 17, 2023Updated 2 years ago
- A lightweight annotation standard that helps AI agents navigate codebases faster, with fewer file reads and tool calls☆129Updated this week
- A universal plugin framework for development tools that enables seamless browser-server communication and MCP (Model Context Protocol) in…☆32Apr 27, 2026Updated last week
- A practical AI agents handbook covering agent systems, agentic workflows, LangGraph, MCP/A2A, context engineering, agent memory, evaluati…☆113Updated this week
- The official implementation of the paper "AgentDyn: A Dynamic Open-Ended Benchmark for Evaluating Prompt Injection Attacks of Real-World …☆48Apr 19, 2026Updated 2 weeks ago
- Open-source firewall for AI agents. Policy engine that audits and controls what OpenClaw, Claude Code, Cursor, Codex, and any AI tool can…☆66Apr 28, 2026Updated last week
- NTU Learn Downloader☆18Nov 10, 2021Updated 4 years ago
- Cognithor · Agent OS: Local-first autonomous agent operating system. 19 LLM providers, 18 channels, 145 MCP tools, 6-tier memory, Agent P…☆116Updated this week
- [ACL25' Findings] SWE-Dev is an SWE agent with a scalable test case construction pipeline.☆59Jul 21, 2025Updated 9 months ago
- A Model Context Protocol (MCP) server for Langfuse, enabling AI agents to query Langfuse trace data for enhanced debugging and observabil…☆83Updated this week
- MCP Server with TMDB☆70Apr 26, 2026Updated last week
- Autonomous orchestration framework for Claude Code with MemPalace-inspired memory (4-layer stack, 818-token wake-up), parallel-first Agen…☆130Apr 20, 2026Updated 2 weeks ago
- Go framework for agentic AI app with MCP and built-in tools☆185Apr 17, 2026Updated 2 weeks ago
- The Google Ads MCP Server is an implementation of the Model Context Protocol (MCP) that enables Large Language Models (LLMs), such as Gem…☆186Updated this week
- The Rust SDK for building coding agents. Tool execution, LLM streaming, graph memory, sub-agent orchestration, MCP — as composable librar…☆281Apr 27, 2026Updated last week
- Zero-instrumentation LLM and AI agent (e.g. Claude Code, gemini-cli) observability in eBPF☆310Apr 22, 2026Updated last week
- 🦞 Official plugin for OpenClaw that exports agent traces to Opik. See and monitor agent behaviour, cost, tokens, errors and more.☆575Updated this week
- OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards☆583Updated this week
- This solution accelerator leverages Microsoft Foundry, Azure Content Understanding, Azure OpenAI Service, and Foundry IQ to enable organi…☆441Updated this week
- Vibe-Skills is an all-in-one AI skills package. It seamlessly integrates expert-level capabilities and context management into a general-…☆1,915Updated this week
- TruthfulQA: Measuring How Models Imitate Human Falsehoods☆909Jan 16, 2025Updated last year
- Model Context Protocol (MCP) server for Kubernetes and OpenShift☆1,512Updated this week
- Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.☆2,521Updated this week
- The platform for LLM evaluations and AI agent testing☆3,231Updated this week
- Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonom…☆4,287Updated this week
- A curated list of awesome skills, hooks, slash-commands, agent orchestrators, applications, and plugins for Claude Code by Anthropic☆41,745Apr 27, 2026Updated last week
- Automate the process of making money online.☆30,295Apr 4, 2026Updated last month
- Robust Speech Recognition via Large-Scale Weak Supervision☆98,662Apr 15, 2026Updated 2 weeks ago
- 21 Lessons, Get Started Building with Generative AI☆110,167Updated this week