microsoft / rStar

☆604

Alternatives and similar repositories for rStar

Users who are interested in rStar are comparing it to the repositories listed below.

SimpleBerry / LLaMA-O1
Large Reasoning Models
☆804 Updated 7 months ago
huggingface / Math-Verify
☆857 Updated last month
zhentingqi / rStar
☆953 Updated 6 months ago
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆651 Updated this week
facebookresearch / swe-rl
Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"
☆571 Updated 4 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,048 Updated last week
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆713 Updated last month
knoveleng / open-rs
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆245 Updated 2 months ago
PRIME-RL / TTRL
TTRL: Test-Time Reinforcement Learning
☆732 Updated 3 weeks ago
GAIR-NLP / LIMO
[COLM 2025] LIMO: Less is More for Reasoning
☆986 Updated 3 weeks ago
NVIDIA / NeMo-Skills
A project to improve skills of large language models
☆490 Updated this week
Gen-Verse / ReasonFlux
ReasonFlux Series - A family of LLM post-training algorithms focusing on data selection, reinforcement learning, and inference scaling
☆462 Updated last week
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆654 Updated 6 months ago
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆418 Updated this week
RyanLiu112 / compute-optimal-tts
Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling".
☆268 Updated 5 months ago
eddycmu / demystify-long-cot
☆306 Updated 2 months ago
hkust-nlp / CodeIO
[ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
☆537 Updated 2 months ago
ADaM-BJTU / O1-CODER
AN O1 REPLICATION FOR CODING
☆335 Updated 7 months ago
sunblaze-ucb / Intuitor
Code for the paper: "Learning to Reason without External Rewards"
☆337 Updated 3 weeks ago
allenai / OLMoE
OLMoE: Open Mixture-of-Experts Language Models
☆823 Updated 4 months ago
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,110 Updated 2 months ago
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,668 Updated 4 months ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆232 Updated 2 months ago
SWE-Gym / SWE-Gym
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆513 Updated this week
mlfoundations / evalchemy
Automatic evals for LLMs
☆488 Updated last month
trotsky1997 / MathBlackBox
☆1,028 Updated 7 months ago
THUDM / WebRL
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL
☆424 Updated last month
facebookresearch / sweet_rl
Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"
☆231 Updated 2 months ago
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆619 Updated last month
project-numina / aimo-progress-prize
☆460 Updated last year