Tencent-Hunyuan/SAGE-GRPO

Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation

☆127

Alternatives and similar repositories for SAGE-GRPO

Users who are interested in SAGE-GRPO are comparing it to the repositories listed below.

vvvvvjdy / dmdr
[ECCV 2026] Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
☆287 · Feb 1, 2026 · Updated 5 months ago
DuNGEOnmassster / VideoGen-of-Thought
[NeurIPS 2025 NextVid Workshop Oral✨] Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minim…
☆63 · Sep 22, 2025 · Updated 10 months ago
GongyeLiu / Awesome-Alignment-of-Diffusion-Models
Paper collection: alignment of diffusion models
☆29 · Mar 6, 2026 · Updated 4 months ago
JaydenLyh / Reward-Forcing
[CVPR 2026 Highlight] Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
☆352 · Dec 15, 2025 · Updated 7 months ago
lian700 / SoliReward
Official Code for "SoliReward: Mitigating Susceptibility to Reward Hacking and Annotation Noise in Video Generation Reward Models" [CVPR2…
☆21 · Jul 13, 2026 · Updated 2 weeks ago
KlingAIResearch / VideoAlign
[NeurIPS 2025] Improving Video Generation with Human Feedback
☆489 · Sep 24, 2025 · Updated 10 months ago
G-U-N / UniRL
[ICML 2026] A unified reinforcement learning toolbox for joint RL on language models and diffusion models
☆91 · May 26, 2026 · Updated 2 months ago
NVlabs / DiffusionNFT
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
☆990 · Feb 10, 2026 · Updated 5 months ago
PKU-YuanGroup / OSP-Next
OSP-Next
☆68 · Jun 22, 2026 · Updated last month
thu-ml / Causal-Forcing
[ICML 2026] Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactiv…
☆882 · Updated this week
yuyangyou / Adaptive-Video-Distillation
Official code repository of "Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation"
☆18 · Jul 10, 2026 · Updated 2 weeks ago
xbyym / StableWorld
StableWorld: Towards Stable and Consistent Long Interactive Video Generation
☆97 · Mar 18, 2026 · Updated 4 months ago
CostaliyA / Flow-OPD
Official Repo of "Flow-OPD: On-Policy Distillation for Flow Matching Models"
☆265 · Jun 24, 2026 · Updated last month
NVlabs / AnyFlow
Flow Map OPD for AnyStep Video Diffusion
☆399 · May 23, 2026 · Updated 2 months ago
CIntellifusion / VideoDPO
Official Implementation of VideoDPO
☆169 · Jun 1, 2025 · Updated last year
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,437 · May 7, 2026 · Updated 2 months ago
Luo-Yihong / TDM-R1
[ICML 2026][Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
☆116 · May 25, 2026 · Updated 2 months ago
shawn0728 / Unify-Agent
🐧 Unify-Agent: An end-to-end unified multimodal agent for faithful, knowledge-grounded image generation.
☆86 · May 2, 2026 · Updated 2 months ago
RuoyuWang-2077 / FlowBP
[arXiv 2026] FlowBP: Exploring the Design Space of Reward Backpropagation for Flow Matching
☆21 · Jul 7, 2026 · Updated 3 weeks ago
XueZeyue / DanceGRPO
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
☆1,642 · Oct 16, 2025 · Updated 9 months ago
CodeGoat24 / Pref-GRPO
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
☆275 · Feb 10, 2026 · Updated 5 months ago
Jiawei-Yang / FD-Loss
☆548 · May 1, 2026 · Updated 2 months ago
tinnerhrhe / GARDO
Official code for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking"
☆61 · May 3, 2026 · Updated 2 months ago
leeruibin / hybrid-forcing
☆32 · Apr 29, 2026 · Updated 3 months ago
vvvvvjdy / D-OPSD
Official Repo of "D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models"
☆291 · May 22, 2026 · Updated 2 months ago
knightyxp / VideoCoF
[CVPR 2026 Highlight] VideoCoF: Unified Video Editing with Temporal Reasoner
☆205 · Jun 17, 2026 · Updated last month
tang-bd / v-grpo
[CVPR 2026 Findings] V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
☆56 · Apr 28, 2026 · Updated 3 months ago
scxue / advantage_weighted_matching
Official code for the paper "Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models"
☆93 · Apr 23, 2026 · Updated 3 months ago
Seeing-Fast-and-Slow / Seeing-Fast-and-Slow
☆16 · May 28, 2026 · Updated 2 months ago
franklinz233 / Astrolabe
Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
☆147 · Mar 24, 2026 · Updated 4 months ago
KlingAIResearch / VANS
[CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO
☆119 · Feb 28, 2026 · Updated 5 months ago
TencentARC / RollingForcing
[ICLR 2026] Official Repo for Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
☆449 · Oct 31, 2025 · Updated 8 months ago
HKUST-C4G / diffusion-rm
The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"
☆66 · Jun 30, 2026 · Updated 3 weeks ago
X-GenGroup / Flow-Factory
A unified framework for easy reinforcement learning in Flow-Matching models
☆641 · Jul 12, 2026 · Updated 2 weeks ago
Harahan / MeanFlowNFT
[arXiv 2026] Official PyTorch implementation of "MeanFlowNFT: Bringing Forward-Process RL to Average-Velocity Generators".
☆76 · Jul 18, 2026 · Updated last week
MC-E / InstructX
☆86 · Oct 10, 2025 · Updated 9 months ago
byliutao / CDM
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation👏
☆147 · May 11, 2026 · Updated 2 months ago
xgen-universe / Capybara
☆203 · Feb 27, 2026 · Updated 5 months ago
MizzenAI / HPSv3
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV 2025)
☆330 · Dec 5, 2025 · Updated 7 months ago