NVIDIA / NeMo-RL
Scalable toolkit for efficient model reinforcement
☆361 · Updated this week
Alternatives and similar repositories for NeMo-RL
Users interested in NeMo-RL are comparing it to the libraries listed below.
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… (☆249, updated this week)
- (☆188, updated 3 months ago)
- A project to improve the skills of large language models (☆413, updated this week)
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation (☆299, updated last month)
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning (☆343, updated last week)
- ByteCheckpoint: A Unified Checkpointing Library for LFMs (☆215, updated 2 months ago)
- LLM KV cache compression made easy (☆493, updated 3 weeks ago)
- A minimal training framework for scaling FLA models (☆146, updated 3 weeks ago)
- Triton-based implementation of Sparse Mixture of Experts (☆216, updated 6 months ago)
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training (☆203, updated 2 weeks ago)
- Ring attention implementation with flash attention (☆771, updated last week)
- Scalable toolkit for efficient model alignment (☆803, updated 2 weeks ago)
- Triton implementation of FlashAttention2 that adds custom masks (☆117, updated 9 months ago)
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training (☆208, updated 9 months ago)
- Efficient Triton implementation of Native Sparse Attention (☆155, updated last week)
- OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. (☆367, updated this week)
- VeOmni: Scaling any-modality model training to any accelerator with a PyTorch-native training framework (☆339, updated 3 weeks ago)
- Async pipelined version of Verl (☆91, updated last month)
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (☆333, updated 5 months ago; a minimal sketch of this lookup appears after this list)
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates (☆117, updated this week)
- Large Context Attention (☆711, updated 4 months ago)
- kernels, of the mega variety (☆184, updated this week)
- Explorations into some recent techniques surrounding speculative decoding (☆266, updated 5 months ago)
- Normalized Transformer (nGPT) (☆181, updated 6 months ago)
- Efficient LLM Inference over Long Sequences (☆376, updated this week)
- Zero Bubble Pipeline Parallelism (☆395, updated 3 weeks ago)
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 (☆202, updated 6 months ago)
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference (☆291, updated 6 months ago)
- Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components (☆196, updated this week)
- (☆450, updated this week)
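
The memory-layer entry above describes a trainable key-value lookup that adds parameters without increasing FLOPs. As a rough illustration of that idea (a minimal sketch, not code from the listed repository), the PyTorch snippet below stores learnable key and value tables, routes each token to its top-k keys, and mixes only the selected values, so most parameters stay untouched on any given forward pass. All class names and sizes here are illustrative.

```python
# Minimal sketch of a sparse key-value memory layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    """Only top_k of num_keys memory slots fire per token."""
    def __init__(self, d_model: int, num_keys: int = 4096, top_k: int = 8):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_keys, d_model) / d_model**0.5)
        self.values = nn.Parameter(torch.randn(num_keys, d_model) / d_model**0.5)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); score every token against every key
        scores = x @ self.keys.T                            # (batch, seq, num_keys)
        top_scores, top_idx = scores.topk(self.top_k, -1)   # pick top_k keys per token
        weights = F.softmax(top_scores, dim=-1)             # normalize over selected keys
        selected = self.values[top_idx]                     # (batch, seq, top_k, d_model)
        return (weights.unsqueeze(-1) * selected).sum(dim=-2)

x = torch.randn(2, 16, 64)
print(MemoryLayer(d_model=64)(x).shape)  # torch.Size([2, 16, 64])
```

Note that the dense scoring step above still touches all num_keys entries; real memory-layer designs such as product-key memories factor the key table so even the lookup avoids scoring every key. The dense version is used here only for clarity.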