facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆338 · Updated last week
Alternatives and similar repositories for meta-agents-research-environments
Users interested in meta-agents-research-environments are comparing it to the libraries listed below.
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" (☆248 · Updated 6 months ago)
- AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agents, ACL'24 Best Resource… (☆302 · Updated last week)
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning (☆301 · Updated last week)
- ☆221 · Updated 8 months ago
- Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcemen…" (☆467 · Updated last month)
- Code for the paper "Learning to Reason without External Rewards" (☆370 · Updated 3 months ago)
- AWM: Agent Workflow Memory (☆343 · Updated 9 months ago)
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. (☆188 · Updated 8 months ago)
- A Gym for Agentic LLMs (☆347 · Updated last week)
- Resources for our paper "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training" (☆161 · Updated 2 weeks ago)
- A benchmark list for the evaluation of large language models. (☆146 · Updated 2 months ago)
- ☆178 · Updated this week
- ☆116 · Updated 9 months ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). (☆296 · Updated 2 weeks ago)
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans. (☆104 · Updated last week)
- ☆77 · Updated last week
- OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. (☆558 · Updated last week)
- Code for the paper "Training Software Engineering Agents and Verifiers with SWE-Gym" [ICML 2025] (☆568 · Updated 3 months ago)
- ☆122 · Updated 8 months ago
- Code for the paper "Tree Search for Language Model Agents" (☆217 · Updated last year)
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents (☆181 · Updated 3 months ago)
- [NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents (☆442 · Updated this week)
- Reproducible, flexible LLM evaluations (☆264 · Updated last week)
- Official repo for the paper "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't" (☆268 · Updated 3 weeks ago)
- Code and example data for the paper "Rule Based Rewards for Language Model Safety" (☆201 · Updated last year)
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples (☆109 · Updated 3 months ago)
- ☆254 · Updated last month
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example (☆372 · Updated 3 weeks ago)
- A simple unified framework for evaluating LLMs (☆254 · Updated 6 months ago)
- ☆291 · Updated 3 months ago