modelscope / MCPBench
The evaluation benchmark on MCP servers
☆58Updated last week
Alternatives and similar repositories for MCPBench:
Users who are interested in MCPBench are comparing it to the libraries listed below
- ☆130Updated 3 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆282Updated 2 weeks ago
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation"☆138Updated last month
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆101Updated last month
- 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability☆147Updated 2 weeks ago
- ☆143Updated 9 months ago
- Ling is a MoE LLM provided and open-sourced by InclusionAI.☆143Updated last week
- The RedStone repository includes code for preparing extensive datasets used in training large language models.☆131Updated 2 months ago
- ☆232Updated 2 months ago
- ☆218Updated last year
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆187Updated 2 months ago
- This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the…☆84Updated last week
- An open platform for enhancing the capability of LLMs in workflow orchestration.☆133Updated last month
- ☆94Updated 4 months ago
- ☆146Updated last month
- ☆51Updated 7 months ago
- ☆81Updated last year
- ☆41Updated this week
- ☆52Updated 2 months ago
- Hammer: Robust Function-Calling for On-Device Language Models via Function Masking☆74Updated 2 months ago
- ☆101Updated 4 months ago
- StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization☆126Updated 3 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆79Updated 2 months ago
- ☆153Updated 3 weeks ago
- ☆46Updated 10 months ago
- ☆93Updated 4 months ago
- a-m-team's exploration in large language modeling☆49Updated 3 weeks ago
- Code LLM data processing for pretraining, fine-tuning, and DPO: an industry SOTA processing pipeline☆38Updated 9 months ago
- connecting humans and agents☆82Updated 4 months ago
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 10 months ago