facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆411Updated last month
Alternatives and similar repositories for meta-agents-research-environments
Users who are interested in meta-agents-research-environments are comparing it to the libraries listed below.
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆254Updated 8 months ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning☆334Updated 2 months ago
- 🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource…☆351Updated last month
- A Gym for Agentic LLMs☆412Updated last week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆190Updated 10 months ago
- Code for the paper: "Learning to Reason without External Rewards"☆385Updated 5 months ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆166Updated 2 months ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).☆340Updated 3 weeks ago
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents☆509Updated this week
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆117Updated last month
- ☆209Updated last week
- ☆226Updated 10 months ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…☆545Updated 3 months ago
- AWM: Agent Workflow Memory☆376Updated 2 weeks ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents☆224Updated 5 months ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆611Updated 5 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.☆609Updated this week
- τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment☆598Updated 3 weeks ago
- Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"☆271Updated 2 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.☆343Updated 2 weeks ago
- A benchmark list for evaluation of large language models.☆154Updated 4 months ago
- ☆308Updated 3 months ago
- Code for the paper 🌳 Tree Search for Language Model Agents☆218Updated last year
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory☆231Updated 7 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example☆392Updated last month
- (ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.019…☆202Updated 2 months ago
- ☆116Updated 11 months ago
- ☆255Updated 4 months ago
- ☆124Updated 10 months ago
- OpenTinker is an RL-as-a-Service infrastructure for foundation models☆499Updated last week