zxiangx / LC-R1
Code for paper: Optimizing Length Compression in Large Reasoning Models
☆26 Updated 2 weeks ago
Alternatives and similar repositories for LC-R1
Users who are interested in LC-R1 are comparing it to the libraries listed below:
- ☆32 Updated 3 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆27 Updated last month
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆24 Updated 2 months ago
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments ☆42 Updated last month
- ☆44 Updated last month
- ☆36 Updated last month
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆46 Updated 4 months ago
- ☆14 Updated 10 months ago
- ☆30 Updated last month
- ☆51 Updated 3 months ago
- ☆38 Updated 2 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆40 Updated 2 months ago
- ☆15 Updated 4 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆19 Updated 8 months ago
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning" ☆26 Updated 5 months ago
- DCPO: Dynamic Adaptive Clipping for RL ☆40 Updated last month
- ☆29 Updated last month
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models ☆41 Updated last month
- [ICML'25] Official code of paper "Fast Large Language Model Collaborative Decoding via Speculation" ☆28 Updated 4 months ago
- ☆16 Updated last year
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning ☆46 Updated 2 weeks ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 Updated 5 months ago
- ☆24 Updated 2 months ago
- ☆29 Updated 2 months ago
- ☆14 Updated 9 months ago
- This is the code repo for the paper "Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning". ☆26 Updated 2 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆71 Updated 6 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆64 Updated 5 months ago
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆27 Updated last month
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆23 Updated last month