RAGEN-AI / VAGEN
☆145 Updated last week
Alternatives and similar repositories for VAGEN
Users who are interested in VAGEN are comparing it to the libraries listed below.
- ☆198 Updated last week
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆205 Updated this week
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆94 Updated 2 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆73 Updated last month
- Repo of paper "Free Process Rewards without Process Labels" ☆149 Updated 2 months ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆207 Updated 3 weeks ago
- ☆293 Updated this week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆179 Updated 2 months ago
- ☆208 Updated last week
- A comprehensive collection of process reward models. ☆85 Updated last week
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆120 Updated this week
- An Illusion of Progress? Assessing the Current State of Web Agents ☆52 Updated last week
- ☆173 Updated this week
- ☆173 Updated 2 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆213 Updated 2 weeks ago
- ☆193 Updated this week
- A Comprehensive Survey on Long Context Language Modeling ☆147 Updated 2 weeks ago
- verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in… ☆232 Updated this week
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆102 Updated 4 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆93 Updated this week
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆106 Updated last month
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆54 Updated 6 months ago
- repo for paper https://arxiv.org/abs/2504.13837 ☆139 Updated last week
- Towards Large Multimodal Models as Visual Foundation Agents ☆216 Updated last month
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆59 Updated 3 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆135 Updated this week
- ☆201 Updated 3 months ago
- official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆223 Updated last week
- A version of verl to support tool use ☆41 Updated this week
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆141 Updated 7 months ago