facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆369Updated last week
Alternatives and similar repositories for meta-agents-research-environments
Users that are interested in meta-agents-research-environments are comparing it to the libraries listed below
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆252Updated 6 months ago
- Code for the paper: "Learning to Reason without External Rewards"☆380Updated 4 months ago
- 🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource…☆317Updated 2 weeks ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆161Updated last month
- AWM: Agent Workflow Memory☆359Updated 9 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents☆197Updated 4 months ago
- A Gym for Agentic LLMs☆364Updated 2 weeks ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆502Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆189Updated 8 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆579Updated 3 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning☆316Updated last month
- ☆224Updated 9 months ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents☆463Updated last week
- ☆117Updated 10 months ago
- ☆297Updated 4 months ago
- ☆235Updated 3 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆110Updated 3 weeks ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example☆381Updated last week
- ☆80Updated 3 weeks ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"☆268Updated last month
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.☆573Updated 3 weeks ago
- A benchmark list for the evaluation of large language models.☆151Updated 2 months ago
- ☆190Updated last week
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).☆298Updated 2 weeks ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆160Updated 2 months ago
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory☆200Updated 6 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆216Updated last year
- (ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.019…☆186Updated last month
- Reproducible, flexible LLM evaluations☆286Updated last week
- Implementation for OAgents: An Empirical Study of Building Effective Agents☆283Updated last month