Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆462 · Mar 26, 2026 · Updated this week
Alternatives and similar repositories for meta-agents-research-environments
Users who are interested in meta-agents-research-environments are comparing it to the libraries listed below.
- ☆25 · Oct 9, 2025 · Updated 5 months ago
- ☆56 · Aug 5, 2025 · Updated 7 months ago
- A Searching-based Agent Model for Open-Domain Open-Ended Question Answering ☆33 · Jun 20, 2025 · Updated 9 months ago
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆39 · Oct 7, 2025 · Updated 5 months ago
- A Practitioner's Guide to M(eow)ti Turn Agentic ReinfOrcement learning ☆79 · Jan 16, 2026 · Updated 2 months ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 · Dec 2, 2025 · Updated 3 months ago
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆104 · Feb 28, 2026 · Updated last month
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi… ☆749 · Sep 11, 2025 · Updated 6 months ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 · Feb 11, 2025 · Updated last year
- LIMI: Less is More for Agency ☆161 · Oct 14, 2025 · Updated 5 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,713 · Updated this week
- [EMNLP 2022] The baseline code for META-GUI dataset ☆14 · Jul 9, 2024 · Updated last year
- Lean evaluation and metaprogramming utilities for provers. ☆50 · Mar 18, 2026 · Updated last week
- ☆24 · Mar 1, 2025 · Updated last year
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model ☆12 · Dec 29, 2024 · Updated last year
- ☆11 · Oct 25, 2024 · Updated last year
- A Gym for Agentic LLMs ☆471 · Jan 21, 2026 · Updated 2 months ago
- Single-file truly minimal implementation of state-of-the-art reinforcement learning algorithms. ☆21 · Feb 13, 2023 · Updated 3 years ago
- AllenAI's post-training codebase ☆3,643 · Mar 23, 2026 · Updated last week
- Bayes-Adaptive RL for LLM Reasoning ☆46 · May 28, 2025 · Updated 10 months ago
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ☆25 · Aug 8, 2024 · Updated last year
- verl: Volcano Engine Reinforcement Learning for LLMs ☆20,286 · Updated this week
- Our library for RL environments + evals ☆3,930 · Updated this week
- The original Shared Recurrent Memory Transformer implementation ☆34 · Jul 11, 2025 · Updated 8 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆137 · Apr 12, 2025 · Updated 11 months ago
- An enterprise deep research benchmark ☆35 · Mar 22, 2026 · Updated last week
- Simple repository for training small reasoning models ☆49 · Feb 17, 2026 · Updated last month
- Code for the paper 🌳 Tree Search for Language Model Agents ☆221 · Jul 25, 2024 · Updated last year
- ☆12 · Jul 31, 2025 · Updated 7 months ago
- Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI. ☆265 · Oct 4, 2025 · Updated 5 months ago
- ☆22 · May 3, 2025 · Updated 10 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,419 · Mar 20, 2026 · Updated last week
- ☆29 · Jun 5, 2025 · Updated 9 months ago
- MLGym: A New Framework and Benchmark for Advancing AI Research Agents ☆592 · Aug 10, 2025 · Updated 7 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆69 · Nov 14, 2024 · Updated last year
- τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment ☆881 · Mar 20, 2026 · Updated last week
- ☆13 · Mar 14, 2026 · Updated 2 weeks ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 9 months ago
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆17 · Feb 21, 2025 · Updated last year