FoundationAgents / AutoEnv
Scaling Agentic Environments Automatically.
☆33 Updated last week
Alternatives and similar repositories for AutoEnv
Users who are interested in AutoEnv are comparing it to the libraries listed below.
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆238 Updated 2 weeks ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆138 Updated 6 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆70 Updated 6 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆60 Updated last month
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows" ☆117 Updated 3 weeks ago
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆88 Updated 6 months ago
- (ACL-2025 main conference) Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback ☆36 Updated 5 months ago
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆42 Updated 11 months ago
- Towards a Unified View of Large Language Model Post-Training ☆192 Updated 3 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆66 Updated 6 months ago
- ☆142 Updated 7 months ago
- Official implementation of MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems ☆70 Updated 5 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated this week
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆61 Updated last year
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆117 Updated 3 weeks ago
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆130 Updated 8 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆154 Updated 5 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆139 Updated 9 months ago
- ☆36 Updated 2 months ago
- VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking ☆82 Updated this week
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆123 Updated 4 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆293 Updated last month
- SSRL: Self-Search Reinforcement Learning ☆158 Updated 3 months ago
- ☆51 Updated 10 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆360 Updated 3 months ago
- ☆53 Updated 4 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 6 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆245 Updated 2 months ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning ☆139 Updated 2 months ago
- ☆182 Updated last month