Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆475 · Apr 16, 2026 · Updated this week
Alternatives and similar repositories for meta-agents-research-environments
Users that are interested in meta-agents-research-environments are comparing it to the libraries listed below.
- ☆25 · Oct 9, 2025 · Updated 6 months ago
- ☆58 · Aug 5, 2025 · Updated 8 months ago
- A Searching-based Agent Model for Open-Domain Open-Ended Question Answering ☆34 · Jun 20, 2025 · Updated 10 months ago
- [ICLR 2026] AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆40 · Updated this week
- A Practitioner's Guide to M(eow)ti Turn Agentic ReinfOrcement learning ☆81 · Jan 16, 2026 · Updated 3 months ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 · Dec 2, 2025 · Updated 4 months ago
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆108 · Feb 28, 2026 · Updated last month
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi… ☆766 · Sep 11, 2025 · Updated 7 months ago
- The agent benchmark that scores the full stack — harness, config, and model — not just the LLM. Trace-based scoring, reliability metrics,… ☆54 · Updated this week
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 · Feb 11, 2025 · Updated last year
- LIMI: Less is More for Agency ☆161 · Oct 14, 2025 · Updated 6 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,759 · Updated this week
- ☆24 · Mar 1, 2025 · Updated last year
- Lean evaluation and metaprogramming utilities for provers. ☆79 · Updated this week
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model ☆13 · Dec 29, 2024 · Updated last year
- ☆11 · Oct 25, 2024 · Updated last year
- A Gym for Agentic LLMs ☆477 · Jan 21, 2026 · Updated 2 months ago
- Single-file truly minimal implementation of state-of-the-art reinforcement learning algorithms. ☆21 · Feb 13, 2023 · Updated 3 years ago
- AllenAI's post-training codebase ☆3,683 · Apr 13, 2026 · Updated last week
- Bayes-Adaptive RL for LLM Reasoning ☆46 · May 28, 2025 · Updated 10 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆20,789 · Updated this week
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ☆25 · Aug 8, 2024 · Updated last year
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 9 months ago
- Our library for RL environments + evals ☆4,016 · Updated this week
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆152 · Apr 7, 2026 · Updated last week
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆608 · Apr 9, 2026 · Updated last week
- An enterprise deep research benchmark ☆35 · Apr 8, 2026 · Updated last week
- Simple repository for training small reasoning models ☆50 · Feb 17, 2026 · Updated 2 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆34 · Mar 7, 2025 · Updated last year
- Code for the paper 🌳 Tree Search for Language Model Agents ☆221 · Jul 25, 2024 · Updated last year
- Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI. ☆264 · Oct 4, 2025 · Updated 6 months ago
- ☆12 · Jul 31, 2025 · Updated 8 months ago
- ☆22 · May 3, 2025 · Updated 11 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,472 · Mar 20, 2026 · Updated 3 weeks ago
- ☆30 · Jun 5, 2025 · Updated 10 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆594 · Aug 10, 2025 · Updated 8 months ago
- A collection of classic humblebrag quotes from top CS overachievers ☆11 · May 16, 2021 · Updated 4 years ago
- A DL compiler fuzzer ☆15 · Nov 1, 2024 · Updated last year
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆71 · Nov 14, 2024 · Updated last year