ZJU-REAL / LAPO
☆36Updated last month
Alternatives and similar repositories for LAPO
Users that are interested in LAPO are comparing it to the libraries listed below
- ☆31Updated 3 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆51Updated last month
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆123Updated 7 months ago
- ☆38Updated 3 months ago
- ☆51Updated 9 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆65Updated 6 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆70Updated 6 months ago
- ☆21Updated 7 months ago
- ☆65Updated 5 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆26Updated 3 months ago
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning. https://arxiv.org/abs/2505.14684 ☆45Updated last month
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆58Updated 5 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model☆60Updated last month
- A Unified Framework for High-Performance and Extensible LLM Steering☆133Updated this week
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆152Updated 5 months ago
- ☆51Updated 9 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"