Accenture / mcp-bench
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
☆389 Updated last month
Alternatives and similar repositories for mcp-bench
Users that are interested in mcp-bench are comparing it to the libraries listed below
- MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents ☆507 Updated last week
- A general memory system for agents, powered by deep-research ☆660 Updated this week
- Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆369 Updated 3 months ago
- Agent0 Series: Self-Evolving Agents from Zero Data ☆767 Updated this week
- Next paradigm for LLM Agent. Unify plan and action through recursive code generation for adaptive, human-like decision-making. ☆452 Updated this week
- 🛠️ DeepAgent: A General Reasoning Agent with Scalable Toolsets ☆837 Updated last month
- Latent Collaboration in Multi-Agent Systems (LatentMAS) ☆142 Updated last week
- Agentic Web: Weaving the Next Web with AI Agents. ☆394 Updated 2 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆237 Updated 2 weeks ago
- 🚀 MassGen is an open-source multi-agent scaling system that runs in your terminal, autonomously orchestrating frontier models and agents… ☆625 Updated this week
- codes for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆687 Updated last month
- An agentic Machine Learning Engineer ☆1,116 Updated last week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆269 Updated last month
- Agents testing framework made easy ☆504 Updated last week
- ☆612 Updated last month
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen… ☆511 Updated 2 months ago
- Reached #13 on Stanford's Terminal Bench leaderboard. Orchestrator, explorer & coder agents working together with intelligent context sha… ☆1,286 Updated last month
- This repo contains implementation of 25+ prompt engineering techniques. ☆87 Updated last week
- Official Code of Memento: Fine-tuning LLM Agents without Fine-tuning LLMs ☆1,937 Updated 2 months ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆751 Updated 2 months ago
- Scaling Coding-Agent RL to 32x H100s. **Achieving 160% improvement** on Stanford's TerminalBench ☆88 Updated last month
- On the Theoretical Limitations of Embedding-Based Retrieval ☆605 Updated 2 months ago
- ☆413 Updated last week
- [EMNLP 2025] Awesome RAG Reasoning Resources ☆360 Updated 4 months ago
- ☆468 Updated 2 weeks ago
- AgentFlow: In-the-Flow Agentic System Optimization ☆1,376 Updated last week
- Build common agent use-cases with deepagents library ☆338 Updated this week
- One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials, ecosystem guides, and evaluation t… ☆354 Updated 2 months ago
- An Open-Source Large-Scale Reinforcement Learning Project for Search Agents ☆504 Updated last week
- OpenCUA: Open Foundations for Computer-Use Agents ☆582 Updated last week