Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.
☆497 May 21, 2026 Updated last week
Alternatives and similar repositories for meta-agents-research-environments
Users that are interested in meta-agents-research-environments are comparing it to the libraries listed below.
- ☆26 Oct 9, 2025 Updated 7 months ago
- ☆63 Aug 5, 2025 Updated 9 months ago
- A Searching-based Agent Model for Open-Domain Open-Ended Question Answering ☆36 Jun 20, 2025 Updated 11 months ago
- [ICLR 2026] AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆42 Apr 17, 2026 Updated last month
- A Practitioner's Guide to M(eow)ti Turn Agentic ReinfOrcement learning ☆82 Jan 16, 2026 Updated 4 months ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 Dec 2, 2025 Updated 5 months ago
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆111 May 17, 2026 Updated last week
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi… ☆791 Sep 11, 2025 Updated 8 months ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 Feb 11, 2025 Updated last year
- LIMI: Less is More for Agency ☆161 Oct 14, 2025 Updated 7 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs ☆1,894 Updated this week
- [EMNLP 2022] The baseline code for META-GUI dataset ☆15 Jul 9, 2024 Updated last year
- ☆24 Mar 1, 2025 Updated last year
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model ☆13 Dec 29, 2024 Updated last year
- ☆11 Oct 25, 2024 Updated last year
- Single-file truly minimal implementation of state-of-the-art reinforcement learning algorithms. ☆21 Feb 13, 2023 Updated 3 years ago
- A Gym for Agentic LLMs ☆488 Jan 21, 2026 Updated 4 months ago
- AllenAI's post-training codebase ☆3,729 Updated this week
- Bayes-Adaptive RL for LLM Reasoning ☆45 May 28, 2025 Updated last year
- BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ☆25 Aug 8, 2024 Updated last year
- verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework ☆21,514 Updated this week
- Our library for RL environments + evals ☆4,125 May 22, 2026 Updated last week
- A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science. ☆218 May 13, 2026 Updated 2 weeks ago
- The original Shared Recurrent Memory Transformer implementation ☆36 Jul 11, 2025 Updated 10 months ago
- An enterprise deep research benchmark ☆36 Apr 22, 2026 Updated last month
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆633 May 21, 2026 Updated last week
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆34 Mar 7, 2025 Updated last year
- Code for the paper 🌳 Tree Search for Language Model Agents ☆222 Jul 25, 2024 Updated last year
- ☆12 Jul 31, 2025 Updated 9 months ago
- ☆22 May 3, 2025 Updated last year
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆160 Apr 7, 2026 Updated last month
- ☆32 Jun 5, 2025 Updated 11 months ago
- Simple repository for training small reasoning models ☆52 Feb 17, 2026 Updated 3 months ago
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,549 Apr 24, 2026 Updated last month
- MLGym A New Framework and Benchmark for Advancing AI Research Agents ☆600 Aug 10, 2025 Updated 9 months ago
- Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI. ☆270 Oct 4, 2025 Updated 7 months ago
- A DL compiler fuzzer ☆14 Nov 1, 2024 Updated last year
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆72 Nov 14, 2024 Updated last year
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆397 Jan 19, 2025 Updated last year