yaof20/Flash-RL

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yaof20/Flash-RL)

yaof20 / Flash-RL

Implementation for FP8/INT8 Rollout for RL training without performence drop.

☆306

Alternatives and similar repositories for Flash-RL

Users that are interested in Flash-RL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

radixark / miles
View on GitHub
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
☆1,767Updated this week
NVlabs / QeRL
View on GitHub
[ICLR 2026]QeRL enables RL for 32B LLMs on a single H100 GPU.
☆511Mar 30, 2026Updated 3 months ago
Dao-AILab / sonic-moe
View on GitHub
Accelerating MoE with IO and Tile-aware Optimizations
☆732Jul 4, 2026Updated 2 weeks ago
RLsys-Foundation / APRIL
View on GitHub
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…
☆60Oct 11, 2025Updated 9 months ago
NovaSky-AI / SkyRL
View on GitHub
SkyRL: A Modular Full-stack RL Library for LLMs
☆2,086Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
THUDM / slime
View on GitHub
slime is an LLM post-training framework for RL Scaling.
☆7,594Updated this week
ChenxinAn-fdu / POLARIS
View on GitHub
Scaling RL on advanced reasoning models
☆691Oct 20, 2025Updated 9 months ago
li-plus / flash-preference
View on GitHub
Accelerate LLM preference tuning via prefix sharing with a single line of code
☆52Jul 4, 2025Updated last year
sail-sg / Precision-RL
View on GitHub
Defeating the Training-Inference Mismatch via FP16
☆197Nov 14, 2025Updated 8 months ago
alibaba / ROLL
View on GitHub
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
☆3,316Updated this week
NVIDIA-NeMo / RL
View on GitHub
Scalable toolkit for efficient model reinforcement
☆1,843Updated this week
yaof20 / verl
View on GitHub
verl: Volcano Engine Reinforcement Learning for LLMs
☆22Nov 6, 2025Updated 8 months ago
areal-project / AReaL
View on GitHub
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
☆5,586Updated this week
MoonshotAI / checkpoint-engine
View on GitHub
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
☆975Jul 4, 2026Updated 2 weeks ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Terra-Flux / PolyRL
View on GitHub
[NSDI'26] PolyRL is a reinforcement learning framework for LLM that harvest spot instances on the cloud to reduce cost.
☆19Mar 30, 2026Updated 3 months ago
mit-han-lab / flash-moba
View on GitHub
☆251Nov 19, 2025Updated 8 months ago
PRIME-RL / Entropy-Mechanism-of-RL
View on GitHub
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆443Jul 11, 2025Updated last year
ShopeeLLM / Spec-RL
View on GitHub
SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts
☆66Dec 1, 2025Updated 7 months ago
NVlabs / Long-RL
View on GitHub
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726Sep 24, 2025Updated 9 months ago
ltzheng / SimpleTIR
View on GitHub
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401Mar 30, 2026Updated 3 months ago
rllm-org / rllm
View on GitHub
Democratizing Reinforcement Learning for LLMs
☆5,715Updated this week
mit-han-lab / fastrl
View on GitHub
[ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter
☆174Feb 27, 2026Updated 4 months ago
ServiceNow / PipelineRL
View on GitHub
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆427Updated this week
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
RLsys-Foundation / TritonForge
View on GitHub
🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation…
☆146Nov 10, 2025Updated 8 months ago
TIGER-AI-Lab / verl-tool
View on GitHub
A version of verl to support diverse tool use [TMLR 2026]
☆1,022Jul 15, 2026Updated last week
yaof20 / DenseMixer
View on GitHub
Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
☆68Aug 3, 2025Updated 11 months ago
MiroMindAI / MiroRL
View on GitHub
MiroRL is an MCP-first reinforcement learning framework for deep research agent.
☆246Aug 27, 2025Updated 10 months ago
tilde-research / nsa-release
View on GitHub
An efficient implementation of the NSA (Native Sparse Attention) kernel
☆133Jun 24, 2025Updated last year
PrimeIntellect-ai / prime-rl
View on GitHub
Agentic RL Training at Scale
☆1,705Updated this week
nex-agi / NexRL
View on GitHub
NexRL is an ultra-loosely-coupled LLM post-training framework.
☆114May 13, 2026Updated 2 months ago
uccl-project / mKernel
View on GitHub
mKernel: fast multi-node, multi-GPU fused kernels
☆253Jun 21, 2026Updated last month
mlc-ai / pith-train
View on GitHub
Compact and Agent-Native MoE Training System
☆296Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
hao-ai-lab / DistCA
View on GitHub
Efficient Long-context Language Model Training by Core Attention Disaggregation
☆106Apr 7, 2026Updated 3 months ago
fla-org / flash-linear-attention
View on GitHub
🚀 Efficient implementations for emerging model architectures
☆5,394Updated this week
yzlnew / infra-skills
View on GitHub
A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo…
☆139Jul 9, 2026Updated 2 weeks ago
sgl-project / SpecForge
View on GitHub
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
☆1,004Updated this week
verl-project / verl
View on GitHub
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,587Updated this week
liushulinle / UloRL
View on GitHub
An Ultra-Long Output Reinforcement Learning Approach
☆23Jul 31, 2025Updated 11 months ago
ruixin31 / Spurious_Rewards
View on GitHub
☆361Jul 29, 2025Updated 11 months ago