RAGEN-AI / VAGEN

☆220

Alternatives and similar repositories for VAGEN

Users who are interested in VAGEN are comparing it to the libraries listed below.

ltzheng / SimpleTIR
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆292 Updated last week
dvlab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆127 Updated 4 months ago
TIGER-AI-Lab / verl-tool
A version of verl to support diverse tool use
☆570 Updated this week
ElliottYan / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆310 Updated 2 weeks ago
GAIR-NLP / ToRL
☆295 Updated 4 months ago
ruixin31 / Spurious_Rewards
☆333 Updated 2 months ago
THUDM / VisualAgentBench
Towards Large Multimodal Models as Visual Foundation Agents
☆237 Updated 5 months ago
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆337 Updated 2 months ago
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆196 Updated 3 months ago
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆144 Updated 3 weeks ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆256 Updated 4 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
☆245 Updated 4 months ago
RyanLiu112 / GenPRM
Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆81 Updated 4 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆360 Updated this week
RyanLiu112 / Awesome-Process-Reward-Models
A comprehensive collection of process reward models.
☆110 Updated 2 months ago
TsinghuaC3I / MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
☆276 Updated this week
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆215 Updated last week
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆136 Updated 2 months ago
ritzz-ai / GUI-R1
Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
☆185 Updated 4 months ago
InternLM / OREAL
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆190 Updated 6 months ago
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆164 Updated 6 months ago
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆162 Updated last month
lll6gg / UI-R1
Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"
☆130 Updated 4 months ago
DigiRL-agent / digiq
☆112 Updated 5 months ago
qiancheng0 / ToolRL
☆352 Updated 3 months ago
OpenRLHF / OpenRLHF-M
An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
☆146 Updated 5 months ago
eddycmu / demystify-long-cot
☆318 Updated 4 months ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆107 Updated last month
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆168 Updated last month
RUC-NLPIR / Tool-Star
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆262 Updated 3 weeks ago