sail-sg / understand-r1-zeroLinks

Understanding R1-Zero-Like Training: A Critical Perspective

☆1,055

Alternatives and similar repositories for understand-r1-zero

Users that are interested in understand-r1-zero are comparing it to the libraries listed below

Sorting:

GAIR-NLP / LIMO
[COLM 2025] LIMO: Less is More for Reasoning
☆993Updated this week
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆804Updated 8 months ago
microsoft / rStar
☆608Updated 3 weeks ago
huggingface / Math-Verify
☆870Updated last month
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆2,015Updated 2 months ago
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆418Updated last week
PRIME-RL / TTRL
TTRL: Test-Time Reinforcement Learning
☆745Updated 3 weeks ago
THUDM / slime
slime is a LLM post-training framework aiming for RL Scaling.
☆975Updated this week
BytedTsinghua-SIA / DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
☆1,479Updated 2 months ago
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,672Updated 4 months ago
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical report on Slow Thinking with LLM
☆713Updated last month
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆679Updated this week
ByteDance-Seed / Seed-Thinking-v1.5
☆800Updated last month
facebookresearch / coconut
Training Large Language Model to Reason in a Continuous Latent Space
☆1,224Updated 6 months ago
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,110Updated 2 months ago
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆654Updated 6 months ago
allenai / OLMoE
OLMoE: Open Mixture-of-Experts Language Models
☆823Updated 4 months ago
zhentingqi / rStar
☆954Updated 6 months ago
MoonshotAI / Moonlight
Muon is Scalable for LLM Training
☆1,240Updated this week
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆512Updated 3 weeks ago
DreamLM / Dream
Dream 7B, a large diffusion language model
☆873Updated last month
srush / awesome-o1
A bibliography and survey of the papers surrounding o1
☆1,207Updated 8 months ago
SkyworkAI / Skywork-OR1
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
☆691Updated last month
eddycmu / demystify-long-cot
☆309Updated 2 months ago
GAIR-NLP / O1-Journey
O1 Replication Journey
☆1,998Updated 6 months ago
RAGEN-AI / RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆2,182Updated this week
NVIDIA / NeMo-Skills
A project to improve skills of large language models
☆501Updated this week
Gen-Verse / ReasonFlux
ReasonFlux Series - A family of LLM post-training algorithms focusing on data selection, reinforcement learning, and inference scaling
☆470Updated 2 weeks ago
TsinghuaC3I / Awesome-RL-Reasoning-Recipes
Awesome RL Reasoning Recipes ("Triple R")
☆762Updated last month
Open-Source-O1 / Open-O1
☆1,356Updated 8 months ago