G-U-N/UniRL

[ICML 2026] A unified reinforcement learning toolbox for joint RL on language models and diffusion models

☆91

Alternatives and similar repositories for UniRL

Users who are interested in UniRL are comparing it to the libraries listed below.

mm-vl / ULM-R1
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
☆48 · Jul 22, 2025 · Updated last year
NVlabs / DiffusionNFT
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
☆990 · Feb 10, 2026 · Updated 5 months ago
Tencent-Hunyuan / SAGE-GRPO
Official Implementation of SAGE-GRPO: Manifold-Aware Exploration for Reinforcement Learning in Video Generation
☆127 · Apr 2, 2026 · Updated 3 months ago
Vchitect / RealDPO
☆32 · Dec 17, 2025 · Updated 7 months ago
Tencent-Hunyuan / UniRL
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
☆860 · Updated this week
CodeGoat24 / Pref-GRPO
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
☆275 · Feb 10, 2026 · Updated 5 months ago
MizzenAI / HPSv3
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
☆330 · Dec 5, 2025 · Updated 7 months ago
bcmi / Granular-GRPO
[CVPR 2026] Fine-Grained GRPO for Precise Preference Alignment in Flow Models
☆64 · Jun 1, 2026 · Updated last month
shawn0728 / Unify-Agent
🐧 Unify-Agent: An end-to-end unified multimodal agent for faithful, knowledge-grounded image generation.
☆86 · May 2, 2026 · Updated 2 months ago
facebookresearch / GenEval2
Evaluation codes and data for GenEval2
☆80 · Jan 8, 2026 · Updated 6 months ago
EvolvingLMMs-Lab / Evolving-Visual-Generation
[Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
☆125 · Jun 9, 2026 · Updated last month
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,437 · May 7, 2026 · Updated 2 months ago
DAGroup-PKU / SpatialT2I
[CVPR 2026🔥] Enhancing Spatial Understanding in Image Generation via Reward Modeling
☆86 · Mar 2, 2026 · Updated 4 months ago
TIGER-AI-Lab / EditReward
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]
☆156 · Updated this week
XueZeyue / Awesome-Visual-Generation-Alignment-Survey
A survey for visual generation alignment
☆144 · Nov 9, 2025 · Updated 8 months ago
HKUST-C4G / diffusion-rm
The official code of "Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling"
☆66 · Jun 30, 2026 · Updated 3 weeks ago
vvvvvjdy / D-OPSD
Official Repo of "D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models"
☆291 · May 22, 2026 · Updated 2 months ago
huangrh99 / AlphaGRPO
[ICML2026] Official Implementation of AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in Unified Multimodal Models via Decompo…
☆73 · Updated this week
RuoyuWang-2077 / FlowBP
[arXiv 2026] FlowBP: Exploring the Design Space of Reward Backpropagation for Flow Matching
☆21 · Jul 7, 2026 · Updated 3 weeks ago
X-GenGroup / Flow-Factory
A unified framework for easy reinforcement learning in Flow-Matching models
☆641 · Jul 12, 2026 · Updated 2 weeks ago
zai-org / VisionReward
[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation
☆422 · Mar 26, 2025 · Updated last year
GongyeLiu / Awesome-Alignment-of-Diffusion-Models
paper collection: alignment of diffusion models
☆29 · Mar 6, 2026 · Updated 4 months ago
w-yibo / VTC-R1
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning.
☆26 · Jul 20, 2026 · Updated last week
PKU-YuanGroup / UAE
Official repository for the UAE paper, unified-GRPO, and unified-Bench
☆166 · Sep 12, 2025 · Updated 10 months ago
JaydenLyh / Reward-Forcing
[CVPR 2026 Highlight] Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
☆352 · Dec 15, 2025 · Updated 7 months ago
NVlabs / TCM
Codebase of Truncated Consistency Models (ICLR 2025)
☆34 · Jan 24, 2025 · Updated last year
Cominclip / OmniVerifier
[ICLR 2026 Oral & ICML 2026] Generative Universal Verifier as Multimodal Meta-Reasoner
☆64 · May 29, 2026 · Updated 2 months ago
Multimedia-Analytics-Laboratory / dpdmd
[ICML 2026] The official code of Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
☆87 · Jun 2, 2026 · Updated last month
G-U-N / Diffusion-NPO
[ICLR 2025, AAAI 2026] official implementation of "Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generati…
☆39 · Jan 26, 2026 · Updated 6 months ago
showlab / Adv-GRPO
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image…
☆88 · Feb 26, 2026 · Updated 5 months ago
Gen-Verse / Diffusion-Sharpening
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
☆72 · May 18, 2025 · Updated last year
black-yt / ReaLS
Exploring Representation-Aligned Latent Space for Better Generation
☆19 · Mar 17, 2026 · Updated 4 months ago
wusize / OpenUni
☆189 · Jun 27, 2025 · Updated last year
tulerfeng / Gen-Searcher
Gen-Searcher: Reinforcing Agentic Search for Image Generation
☆377 · Apr 7, 2026 · Updated 3 months ago
rongyaofang / prism-bench
This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe…
☆131 · Jan 29, 2026 · Updated 6 months ago
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
wyhlovecpp / GPT-Image-Edit
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
☆243 · Aug 15, 2025 · Updated 11 months ago
showlab / Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
☆830 · Oct 10, 2025 · Updated 9 months ago
NVlabs / AnyFlow
Flow Map OPD for AnyStep Video Diffusion
☆399 · May 23, 2026 · Updated 2 months ago