facebookresearch / meta-agents-research-environments
Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆158Updated this week
Alternatives and similar repositories for meta-agents-research-environments
Users who are interested in meta-agents-research-environments are comparing it to the libraries listed below.
- ☆73Updated 3 weeks ago
- ☆116Updated 8 months ago
- [COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models☆128Updated last month
- ☆122Updated 7 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆183Updated 6 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆245Updated 4 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System"☆62Updated 10 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆106Updated last month
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning☆280Updated this week
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆143Updated 9 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents☆159Updated 2 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆151Updated 7 months ago
- Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory☆74Updated 4 months ago
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning☆147Updated last week
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆250Updated last month
- A Gym for Generalist LLMs☆125Updated this week
- Code for the paper: "Learning to Reason without External Rewards"☆354Updated 2 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆98Updated 5 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆64Updated 7 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆84Updated 4 months ago
- Can Language Models Solve Olympiad Programming?☆118Updated 8 months ago
- A simple unified framework for evaluating LLMs☆246Updated 5 months ago
- ☆104Updated last year
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study☆54Updated 10 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.☆140Updated last week
- ☆93Updated 4 months ago
- A benchmark list for evaluation of large language models.☆141Updated 2 weeks ago
- Code for "Reasoning to Learn from Latent Thoughts"☆118Updated 5 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆105Updated last month
- Replicating O1 inference-time scaling laws☆90Updated 9 months ago