cmu-l3 / l1

L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning

☆257

Alternatives and similar repositories for l1

Users who are interested in l1 are comparing it to the repositories listed below.

eddycmu / demystify-long-cot
☆327 Updated 6 months ago
ruixin31 / Spurious_Rewards
☆344 Updated 4 months ago
ltzheng / SimpleTIR
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆330 Updated 2 months ago
GAIR-NLP / LIMR
☆213 Updated 9 months ago
kanishkg / cognitive-behaviors
☆216 Updated 8 months ago
GAIR-NLP / ToRL
☆316 Updated 6 months ago
CMU-AIRe / MRT
Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".
☆116 Updated 3 months ago
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆167 Updated 8 months ago
GeniusHTX / TALE
☆137 Updated 2 months ago
StarDewXXX / O1-Pruner
Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
☆98 Updated 9 months ago
ElliottYan / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆380 Updated 2 months ago
hemingkx / TokenSkip
[EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs
☆193 Updated this week
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆290 Updated 3 weeks ago
TIGER-AI-Lab / verl-tool
A version of verl to support diverse tool use
☆714 Updated last week
THU-KEG / AdaptThink
☆169 Updated last month
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆140 Updated 3 weeks ago
Zanette-Labs / efficient-reasoning
☆68 Updated 7 months ago
Blueyee / Efficient-CoT-LRMs
Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process!
☆70 Updated 8 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆204 Updated last week
LCLM-Horizon / A-Comprehensive-Survey-For-Long-Context-Language-Modeling
A Comprehensive Survey on Long Context Language Modeling
☆209 Updated last week
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆381 Updated last week
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆132 Updated 8 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆180 Updated 4 months ago
TIGER-AI-Lab / CritiqueFineTuning
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆179 Updated 4 months ago
XiaoYee / Awesome_Efficient_LRM_Reasoning
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
☆318 Updated last month
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆271 Updated 5 months ago
multimodal-art-projection / REER_DeepWriter
REverse-Engineered Reasoning for Open-Ended Generation
☆83 Updated 2 months ago
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆390 Updated 4 months ago
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆142 Updated last month
NVlabs / Tool-N1
☆210 Updated 6 months ago