The evaluation benchmark on MCP servers
☆241 · Sep 3, 2025 · Updated 6 months ago
Alternatives and similar repositories for MCPBench
Users who are interested in MCPBench are comparing it to the repositories listed below
- Collection of model-centric MCP servers ☆26 · May 21, 2025 · Updated 9 months ago
- Twinkle✨: Training workbench to make your model glow. ☆93 · Updated this week
- MLLM @ Game ☆16 · May 12, 2025 · Updated 9 months ago
- a web logging proxy for MCP client-server communication ☆28 · Aug 17, 2025 · Updated 6 months ago
- A Model Context Protocol (MCP) server that enables natural language queries to databases ☆231 · Feb 11, 2026 · Updated 3 weeks ago
- Implemented a script that automatically adjusts Qwen3's inference and non-inference capabilities, based on an OpenAI-like API. The infere… ☆22 · May 9, 2025 · Updated 10 months ago
- XiYanSQL models for Text-to-SQL. ☆148 · Sep 3, 2025 · Updated 6 months ago
- An SSH plugin for Dify ☆13 · Jan 16, 2026 · Updated last month
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models ☆29 · Feb 4, 2026 · Updated last month
- Let's raise an AI cat with its own dedicated memory together! ☆10 · Oct 25, 2024 · Updated last year
- Creating Your Divine Agent 😇 ☆10 · Jan 26, 2026 · Updated last month
- Initial commit ☆12 · Aug 14, 2023 · Updated 2 years ago
- ☆19 · Jul 21, 2025 · Updated 7 months ago
- Prompt templates for language models ☆10 · Feb 28, 2026 · Updated last week
- An MCP tool that gets things done for you ☆13 · Dec 22, 2024 · Updated last year
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆31 · Jan 25, 2026 · Updated last month
- An MCP server providing intelligent transcript processing capabilities, featuring natural formatting, contextual repair, and smart summar… ☆19 · Mar 14, 2025 · Updated 11 months ago
- ROUGE L metric implementation using tensorflow ops ☆12 · Sep 17, 2018 · Updated 7 years ago
- entropix style sampling + GUI ☆27 · Oct 30, 2024 · Updated last year
- A Model Context Protocol server providing LLM Agents a second opinion via AI-powered Deepseek-Reasoning R1 mentorship capabilities, inclu… ☆33 · Jul 22, 2025 · Updated 7 months ago
- 🤖 Reddit-infuriating, AI-powered Shell scripts using Claude Code SDK. essentially an ADAS (Automated Design of Agentic Systems) implemen… ☆31 · Sep 17, 2025 · Updated 5 months ago
- [ICLR 2026] Official Implementation of "FeatureBench: Benchmarking Agentic Coding for Complex Feature Development" ☆25 · Mar 3, 2026 · Updated last week
- AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management ☆24 · Dec 31, 2025 · Updated 2 months ago
- MCPAgent for Grupa.AI Multi-agent Collaboration Network (MACNET) with Model Context Protocol (MCP) capabilities baked in ☆20 · Feb 17, 2026 · Updated 2 weeks ago
- A2A agent implementing OpenDeepResearch ☆19 · Apr 14, 2025 · Updated 10 months ago
- story based implementation for sequential thinking ☆15 · Dec 15, 2025 · Updated 2 months ago
- Sotopia-RL: Reward Design for Social Intelligence ☆46 · Jan 29, 2026 · Updated last month
- Automatic prompt optimization framework for multi-step agent tasks. ☆37 · Nov 12, 2024 · Updated last year
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆117 · Dec 30, 2025 · Updated 2 months ago
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei… ☆1,338 · May 16, 2025 · Updated 9 months ago
- ☆17 · Feb 27, 2025 · Updated last year
- Task management for AI agents ☆15 · Jun 25, 2025 · Updated 8 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation ☆16 · Oct 27, 2024 · Updated last year
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents ☆26 · Feb 17, 2026 · Updated 3 weeks ago
- MCP server for SecretiveShell/Awesome-llms-txt. Add documentation directly into your conversation via MCP resources. ☆24 · Mar 9, 2025 · Updated last year
- TypeScript port of the original MCP Agent framework by lastmile-ai ☆17 · Sep 22, 2025 · Updated 5 months ago
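- An e-commerce large language model from the Qwen2.5 series, fine-tuned (SFT) on e-commerce data. It is an upgraded version of https://github.com/leeguandong/EcommerceLLM. Qwen2.5 performs very well. ☆13 · Oct 4, 2024 · Updated last year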
- A Python implementation of the Sequential Thinking MCP server using the official Model Context Protocol (MCP) Python SDK. This server fac… ☆24 · Jun 1, 2025 · Updated 9 months ago
- Code for Estimating Multi-cause Treatment Effects via Single-cause Perturbation (NeurIPS 2021) ☆14 · Jan 5, 2022 · Updated 4 years ago