Accenture / mcp-bench
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
☆360Updated 2 weeks ago
Alternatives and similar repositories for mcp-bench
Users that are interested in mcp-bench are comparing it to the libraries listed below
- MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents☆461Updated last week
- Anemoi: A Semi-Centralized Multi-agent System Based on Agent-to-Agent Communication MCP server from Coral Protocol☆366Updated last month
- ☆538Updated 2 months ago
- Agentic Web: Weaving the Next Web with AI Agents.☆377Updated 3 weeks ago
- 🚀 MassGen: An Open-source Multi-Agent Scaling System Inspired by Grok Heavy and Gemini Deep Think. Join the discord channel: https://dis…☆554Updated last week
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆225Updated last week
- A multi-agent LLM system for detecting and resolving cognitive dissonance.☆268Updated last week
- codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)☆650Updated 3 weeks ago
- ☆402Updated last month
- Implementation of 17+ agentic architectures designed for practical use across different stages of AI system development.☆223Updated last month
- ☆828Updated last month
- Model-agnostic plug-n-play LangChain/LangGraph agents powered entirely by MCP tools over HTTP/SSE.☆630Updated last week
- One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation t…☆346Updated last month
- On the Theoretical Limitations of Embedding-Based Retrieval☆584Updated last month
- Codes/Notebooks for AI Projects☆1,234Updated this week
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.☆682Updated last month
- Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sha…☆1,232Updated last month
- Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs☆1,631Updated 2 weeks ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆453Updated last month
- Qwen3Guard is a multilingual guardrail model series developed by the Qwen team at Alibaba Cloud.☆313Updated this week
- RAGLight is a modular framework for Retrieval-Augmented Generation (RAG). It makes it easy to plug in different LLMs, embeddings, and vec…☆598Updated this week
- Agentic RAG, Multi-Agent Systems, and Vision Reasoning are three pipelines to find the perfect LLM☆120Updated 2 months ago
- An alignment auditing agent capable of quickly exploring alignment hypotheses☆609Updated this week
- 🐉 Loong: Synthesize Long CoTs at Scale through Verifiers.☆451Updated 3 weeks ago
- OpenCUA: Open Foundations for Computer-Use Agents☆520Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models☆463Updated 2 months ago
- [EMNLP 2025] Awesome RAG Reasoning Resources☆323Updated 3 months ago
- open-source coding LLM for software engineering tasks☆990Updated 3 weeks ago
- LangGraph template for a simple ReAct agent, with MCP tools support and robust test suites.☆501Updated 3 weeks ago
- UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities☆125Updated 5 months ago