NVIDIA-NeMo / RLLinks

Scalable toolkit for efficient model reinforcement

☆1,054

Alternatives and similar repositories for RL

Users that are interested in RL are comparing it to the libraries listed below

Sorting:

thinking-machines-lab / batch_invariant_ops
☆917Updated last month
meta-pytorch / torchforge
PyTorch-native post-training at scale
☆549Updated last week
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆1,287Updated last week
NVIDIA-NeMo / Skills
A project to improve skills of large language models
☆628Updated this week
NVIDIA / NeMo-Aligner
Scalable toolkit for efficient model alignment
☆847Updated last month
MoonshotAI / checkpoint-engine
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
☆851Updated last week
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆2,612Updated last week
allenai / OLMo-core
PyTorch building blocks for the OLMo ecosystem
☆482Updated this week
PrimeIntellect-ai / prime-rl
Async RL Training at Scale
☆867Updated this week
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆576Updated last month
MoonshotAI / Moonlight
Muon is Scalable for LLM Training
☆1,372Updated 4 months ago
allenai / OLMoE
OLMoE: Open Mixture-of-Experts Language Models
☆919Updated 2 months ago
fla-org / native-sparse-attention
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
☆928Updated 8 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,164Updated 3 months ago
NVIDIA / kvpress
LLM KV cache compression made easy
☆701Updated this week
zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
☆923Updated 2 months ago
apple / ml-cross-entropy
☆555Updated 2 months ago
openpsi-project / ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
☆326Updated 7 months ago
changjonathanc / flex-nano-vllm
FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.
☆313Updated last month
NVIDIA-NeMo / Megatron-Bridge
HuggingFace conversion and training library for Megatron-based models
☆228Updated this week
facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆360Updated 11 months ago
foundation-model-stack / fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash…
☆271Updated last week
huggingface / search-and-learn
Recipes to scale inference-time compute of open models
☆1,118Updated 6 months ago
huggingface / picotron_tutorial
☆224Updated last week
radixark / miles
☆344Updated this week
ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆316Updated last week
ScalingIntelligence / KernelBench
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA (+ more DSLs)
☆683Updated this week
McGill-NLP / nano-aha-moment
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆561Updated last month
NVIDIA / Star-Attention
Efficient LLM Inference over Long Sequences
☆392Updated 5 months ago
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆460Updated 6 months ago