NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

☆679

Alternatives and similar repositories for SkyRL

Users who are interested in SkyRL are comparing it to the libraries listed below.

sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆425 Updated last week
THUDM / slime
slime is an LLM post-training framework aiming for RL scaling.
☆975 Updated this week
huggingface / Math-Verify
☆870 Updated last month
microsoft / rStar
☆608 Updated 3 weeks ago
facebookresearch / swe-rl
Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"
☆573 Updated 4 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"
☆233 Updated 3 months ago
NVIDIA-NeMo / RL
Scalable toolkit for efficient model reinforcement
☆558 Updated this week
eddycmu / demystify-long-cot
☆309 Updated 2 months ago
SWE-Gym / SWE-Gym
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆516 Updated last week
mlfoundations / evalchemy
Automatic evals for LLMs
☆496 Updated last month
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆804 Updated 8 months ago
knoveleng / open-rs
Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆248 Updated 2 months ago
openpsi-project / ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
☆307 Updated 3 months ago
TIGER-AI-Lab / verl-tool
A version of verl to support tool use
☆312 Updated this week
THUDM / ReST-MCTS
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
☆654 Updated 6 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,055 Updated last week
NVIDIA / NeMo-Skills
A project to improve skills of large language models
☆501 Updated this week
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆234 Updated 2 months ago
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆713 Updated last month
zhentingqi / rStar
☆954 Updated 6 months ago
sunblaze-ucb / Intuitor
Code for the paper: "Learning to Reason without External Rewards"
☆337 Updated 3 weeks ago
BytedTsinghua-SIA / MemAgent
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
☆548 Updated this week
allenai / reward-bench
RewardBench: the first evaluation tool for reward models.
☆619 Updated last month
PRIME-RL / TTRL
TTRL: Test-Time Reinforcement Learning
☆745 Updated 3 weeks ago
ypwang61 / One-Shot-RLVR
Official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example”
☆330 Updated last week
dongguanting / ARPO
The official code of “Agentic Reinforced Policy Optimization”, an agentic RL algorithm optimization.
☆229 Updated this week
sail-sg / oat-zero
A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
☆245 Updated 3 months ago
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆417 Updated 2 months ago
GAIR-NLP / LIMO
[COLM 2025] LIMO: Less is More for Reasoning
☆993 Updated this week
Agent-RL / ReCall
ReCall: Learning to Reason with Tool Call for LLMs via Reinforcement Learning
☆1,116 Updated 2 months ago