modelscope / MCPBench
The evaluation benchmark on MCP servers
☆115Updated 2 weeks ago
Alternatives and similar repositories for MCPBench
Users who are interested in MCPBench are comparing it to the repositories listed below.
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆114Updated 2 months ago
- ☆210Updated last week
- ☆140Updated 4 months ago
- An open platform for enhancing the capability of LLMs in workflow orchestration.☆146Updated 2 months ago
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"☆142Updated 2 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆409Updated last month
- ☆83Updated 3 weeks ago
- ☆216Updated 2 weeks ago
- [ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning☆226Updated 4 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆76Updated 2 weeks ago
- ☆55Updated 3 weeks ago
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen…☆284Updated 10 months ago
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"☆208Updated last month
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆94Updated 2 months ago
- Awesome Agent Training☆141Updated this week
- ☆68Updated 8 months ago
- ☆362Updated this week
- MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models☆44Updated 3 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆94Updated 3 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆141Updated 5 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆134Updated this week
- The official code of paper “Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning”☆117Updated this week
- ☆386Updated this week
- ☆102Updated 6 months ago
- Build, evaluate and run General Multi-Agent Assistance with ease☆246Updated this week
- ☆151Updated last month
- ☆104Updated last month
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding☆161Updated 2 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning☆548Updated last week
- AN O1 REPLICATION FOR CODING☆336Updated 5 months ago