agentica-project / deepscaler

Democratizing Reinforcement Learning for LLMs

☆2,113

Alternatives and similar repositories for deepscaler:

Users who are interested in deepscaler are comparing it to the libraries listed below.

hkust-nlp / simpleRL-reason
This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
☆3,223 · Updated this week
volcengine / verl
verl: Volcano Engine Reinforcement Learning for LLMs
☆5,693 · Updated this week
open-thoughts / open-thoughts
Fully open data curation for reasoning models
☆1,576 · Updated last week
PRIME-RL / PRIME
Scalable RL solution for advanced reasoning of language models
☆1,419 · Updated last week
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆1,667 · Updated 3 weeks ago
PeterGriffinJin / Search-R1
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
☆1,389 · Updated this week
AIDC-AI / Marco-o1
An Open Large Reasoning Model for Real-World Solutions
☆1,475 · Updated 3 weeks ago
RAGEN-AI / RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆1,210 · Updated this week
MoonshotAI / Moonlight
Muon is Scalable for LLM Training
☆974 · Updated last month
openreasoner / openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
☆1,732 · Updated 2 months ago
MoonshotAI / MoBA
MoBA: Mixture of Block Attention for Long-Context LLMs
☆1,687 · Updated 3 weeks ago
ML-GSAI / LLaDA
Official PyTorch implementation for "Large Language Diffusion Models"
☆1,313 · Updated 2 weeks ago
Deep-Agent / R1-V
Witness the aha moment of VLM with less than $3.
☆3,376 · Updated 3 weeks ago
GAIR-NLP / LIMO
LIMO: Less is More for Reasoning
☆864 · Updated last month
GAIR-NLP / O1-Journey
O1 Replication Journey
☆1,977 · Updated 2 months ago
Open-Source-O1 / Open-O1
☆1,348 · Updated 4 months ago
BytedTsinghua-SIA / DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
☆767 · Updated last week
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,044 · Updated last month
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
☆5,919 · Updated this week
allenai / open-instruct
AllenAI's post-training codebase
☆2,840 · Updated this week
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆1,681 · Updated this week
facebookresearch / coconut
Training Large Language Models to Reason in a Continuous Latent Space
☆998 · Updated 2 months ago
SimpleBerry / LLaMA-O1
Large Reasoning Models
☆800 · Updated 3 months ago
zhentingqi / rStar
☆910 · Updated 2 months ago
Unakar / Logic-RL
Reproduce R1 Zero on Logic Puzzle
☆2,208 · Updated last week
microsoft / rStar
☆485 · Updated last week
MiniMax-AI / MiniMax-01
The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention
☆2,415 · Updated last week
mbzuai-oryx / Awesome-LLM-Post-training
Awesome Reasoning LLM Tutorial/Survey/Guide
☆1,164 · Updated last week
deepseek-ai / DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
☆1,609 · Updated last year
bespokelabsai / curator
Synthetic data curation for post-training and structured data extraction
☆1,065 · Updated this week