CodeGoat24/Pref-GRPO

Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

☆274

Alternatives and similar repositories for Pref-GRPO

Users who are interested in Pref-GRPO are comparing it to the repositories listed below.

CodeGoat24 / UniGenBench
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
☆139 · Jun 19, 2026 · Updated last month
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
CodeGoat24 / LiFT
Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.
☆85 · May 4, 2025 · Updated last year
NVlabs / DiffusionNFT
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
☆974 · Feb 10, 2026 · Updated 5 months ago
bcmi / Granular-GRPO
[CVPR 2026] Fine-Grained GRPO for Precise Preference Alignment in Flow Models
☆64 · Jun 1, 2026 · Updated last month
MizzenAI / HPSv3
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
☆325 · Dec 5, 2025 · Updated 7 months ago
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,420 · May 7, 2026 · Updated 2 months ago
KlingAIResearch / VideoAlign
[NeurIPS 2025] Improving Video Generation with Human Feedback
☆483 · Sep 24, 2025 · Updated 9 months ago
XueZeyue / DanceGRPO
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
☆1,635 · Oct 16, 2025 · Updated 9 months ago
IamCreateAI / FlowCPS
An official implementation of Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching
☆80 · Sep 11, 2025 · Updated 10 months ago
vvvvvjdy / dmdr
[ECCV 2026] Official Code of "Distribution Matching Distillation Meets Reinforcement Learning"
☆281 · Feb 1, 2026 · Updated 5 months ago
VectorSpaceLab / EditScore
[ICLR 2026] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling
☆253 · Mar 20, 2026 · Updated 4 months ago
GongyeLiu / Awesome-Alignment-of-Diffusion-Models
paper collection: alignment of diffusion models
☆29 · Mar 6, 2026 · Updated 4 months ago
X-GenGroup / Flow-Factory
A unified framework for easy reinforcement learning in Flow-Matching models
☆630 · Jul 12, 2026 · Updated last week
Luo-Yihong / TDM-R1
[ICML 2026][Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
☆116 · May 25, 2026 · Updated last month
zai-org / VisionReward
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆421 · Mar 26, 2025 · Updated last year
Luo-Yihong / DGPO
[ICLR 2026][Ultra Fast&Powerful Diffusion RL] Reinforcing Diffusion Models by Direct Group Preference Optimization
☆84 · May 26, 2026 · Updated last month
tinnerhrhe / GARDO
Official codes for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking"
☆61 · May 3, 2026 · Updated 2 months ago
PKU-YuanGroup / Edit-R1
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
☆294 · Jan 24, 2026 · Updated 5 months ago
Tencent-Hunyuan / SRPO
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
☆1,278 · May 11, 2026 · Updated 2 months ago
InternLM / Spark
An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"
☆25 · Oct 23, 2025 · Updated 8 months ago
X-Omni-Team / X-Omni
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
☆426 · Aug 26, 2025 · Updated 10 months ago
G-U-N / UniRL
[ICML 2026] a unified reinforcement learning toolbox for joint RL on language models and diffusion models
☆91 · May 26, 2026 · Updated last month
BarretBa / ICTHP
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment [ICCV 2025] - Official implementation
☆45 · Aug 5, 2025 · Updated 11 months ago
CodeGoat24 / MagicFace
Official implementation of MagicFace: Training-free Universal-Style Human Image Customized Synthesis.
☆66 · Dec 24, 2024 · Updated last year
Kwai-Kolors / LPO
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
☆68 · Sep 19, 2025 · Updated 10 months ago
VisionXLab / FIRM-Reward
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation
☆40 · Mar 13, 2026 · Updated 4 months ago
Cooperx521 / ScaleCap
(ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'
☆60 · Jan 26, 2026 · Updated 5 months ago
CIntellifusion / VideoDPO
Official Implementation of VideoDPO
☆169 · Jun 1, 2025 · Updated last year
CostaliyA / Flow-OPD
Official Repo of "Flow-OPD: On-Policy Distillation for Flow Matching Models"
☆265 · Jun 24, 2026 · Updated 3 weeks ago
ali-vilab / DiffusionOPD
[SIGGRAPH Asia 2026] DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models
☆139 · Updated this week
PKU-YuanGroup / UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆883 · Dec 23, 2025 · Updated 6 months ago
Tencent-Hunyuan / MixGRPO
[ECCV 2026] MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
☆1,151 · Jul 1, 2026 · Updated 2 weeks ago
vvvvvjdy / D-OPSD
Official Repo of "D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models"
☆286 · May 22, 2026 · Updated last month
showlab / Adv-GRPO
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image…
☆88 · Feb 26, 2026 · Updated 4 months ago
Tencent-Hunyuan / SAGE-GRPO
Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation
☆126 · Apr 2, 2026 · Updated 3 months ago
Bujiazi / ByTheWay
[CVPR 2025] Official implementation of ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
☆48 · Oct 10, 2025 · Updated 9 months ago
tianweiy / DMD2
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
☆1,402 · Mar 5, 2025 · Updated last year
Maplebb / UniREditBench
[ECCV 2026] Official implementation of UniREditBench: A Unified Reasoning-based Image Editing Benchmark.
☆58 · Jun 21, 2026 · Updated 3 weeks ago