modelscope / MCPBench
The evaluation benchmark on MCP servers
☆134 Updated last month
Alternatives and similar repositories for MCPBench
Users who are interested in MCPBench are comparing it to the repositories listed below.
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆116 Updated 3 months ago
- An open platform for enhancing the capability of LLMs in workflow orchestration. ☆148 Updated 3 months ago
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents ☆135 Updated last week
- ☆241 Updated 2 weeks ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. ☆461 Updated 2 months ago
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆219 Updated last month
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning ☆228 Updated 5 months ago
- Build, evaluate and run General Multi-Agent Assistance with ease ☆271 Updated this week
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆78 Updated last month
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆145 Updated 3 months ago
- ☆71 Updated 9 months ago
- Awesome Deep Research list ☆104 Updated last week
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆100 Updated 4 months ago
- Efficient Agent Training for Computer Use ☆106 Updated 3 weeks ago
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding ☆172 Updated 2 months ago
- ☆418 Updated last week
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆141 Updated this week
- MCP-Zero: Proactive Toolchain Construction for LLM Agents from Scratch ☆49 Updated 2 weeks ago
- Awesome Agent Training ☆164 Updated this week
- ☆86 Updated last month
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning ☆566 Updated last month
- ☆211 Updated last month
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen… ☆290 Updated 10 months ago
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆256 Updated this week
- ☆519 Updated 3 weeks ago
- ☆67 Updated 3 weeks ago
- ☆145 Updated 5 months ago
- ☆125 Updated last month
- ☆152 Updated last month
- Designing Multi-Agent Systems with Zero Supervision ☆73 Updated this week