hengzzzhou / ReSo
☆16Updated last month
Alternatives and similar repositories for ReSo
Users that are interested in ReSo are comparing it to the libraries listed below
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025)☆29Updated 9 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆26Updated 10 months ago
- ☆48Updated 4 months ago
- ☆49Updated 6 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆85Updated 4 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay☆122Updated 3 months ago
- Natural Language Reinforcement Learning☆97Updated last month
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training☆36Updated last month
- ☆21Updated 4 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆61Updated 4 months ago
- ☆61Updated 3 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆149Updated 10 months ago
- ☆36Updated last month
- ☆48Updated 7 months ago
- A Self-Training Framework for Vision-Language Reasoning☆84Updated 8 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25☆64Updated 3 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆81Updated 3 months ago
- ☆62Updated this week
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆44Updated this week
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain☆104Updated last year
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆69Updated 2 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆56Updated 3 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆108Updated 4 months ago
- [EMNLP 2025] WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning☆43Updated 3 weeks ago
- ☆35Updated last week
- ☆57Updated 3 months ago
- Reinforced Multi-LLM Agents training☆45Updated 3 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆69Updated 5 months ago
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution"☆43Updated 2 weeks ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆117Updated last week