SalesforceAIResearch / MCP-Universe
MCP-Universe is a comprehensive framework designed for developing, testing, and benchmarking AI agents
☆445 Updated this week
Alternatives and similar repositories for MCP-Universe
Users who are interested in MCP-Universe are comparing it to the repositories listed below.
- MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers ☆333 Updated last week
- MassGen: An Open-source Multi-Agent Scaling System Inspired by Grok Heavy and Gemini Deep Think. Join the discord channel: https://dis… ☆463 Updated this week
- Agentic Web: Weaving the Next Web with AI Agents. ☆369 Updated last week
- Anemoi: A Semi-Centralized Multi-agent System Based on Agent-to-Agent Communication MCP server from Coral Protocol ☆367 Updated last month
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆189 Updated this week
- A multi-agent LLM system for detecting and resolving cognitive dissonance. ☆267 Updated last month
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆638 Updated last week
- A coding agent framework that works on its own codebase. ☆127 Updated 5 months ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation. ☆644 Updated 2 weeks ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen… ☆422 Updated 3 weeks ago
- LIMI: Less is More for Agency ☆134 Updated this week
- Ranking LLMs on agentic tasks ☆192 Updated last month
- ☆512 Updated last month
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆441 Updated last month
- ☆232 Updated 3 months ago
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆475 Updated 2 months ago
- A Text-Based Environment for Interactive Debugging ☆272 Updated this week
- Loong: Synthesize Long CoTs at Scale through Verifiers. ☆448 Updated last week
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆343 Updated 3 months ago
- An Automatic Prompt Optimization Framework for Large Language Models ☆122 Updated 2 months ago
- ☆300 Updated 2 months ago
- On the Theoretical Limitations of Embedding-Based Retrieval ☆568 Updated 3 weeks ago
- OpenCUA: Open Foundations for Computer-Use Agents ☆500 Updated this week
- All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container. ☆554 Updated last week
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆456 Updated last month
- ☆186 Updated last week
- ☆95 Updated 2 weeks ago
- ☆77 Updated last week
- An Open-Source Large-Scale Reinforcement Learning Project for Search Agents ☆442 Updated 2 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆556 Updated 3 weeks ago