hu-zijing/B2-DiffuRL

[CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning.

☆57

Alternatives and similar repositories for B2-DiffuRL

Users who are interested in B2-DiffuRL are comparing it to the libraries listed below.

hu-zijing / AsynDM
[ICLR 26] Asynchronous diffusion models allocate individual pixels with varying timestep schedules, yielding improved text-to-image align…
☆19 · Oct 7, 2025 · Updated 9 months ago
ishitaaagupta / tailwind-portfolio
☆19 · Dec 9, 2024 · Updated last year
UCSC-VLAA / STAR-1
[AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data
☆38 · Apr 7, 2025 · Updated last year
casiatao / LPO
The official PyTorch implementation of “Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization”.
☆19 · May 22, 2025 · Updated last year
yk7333 / d3po
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
☆244 · Apr 6, 2024 · Updated 2 years ago
micky-li-hd / CoCo
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation
☆54 · Apr 9, 2026 · Updated 3 months ago
Gen-Verse / Diffusion-Sharpening
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
☆72 · May 18, 2025 · Updated last year
G-U-N / Diffusion-NPO
[ICLR 2025, AAAI 2026] official implementation of "Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generati…
☆39 · Jan 26, 2026 · Updated 6 months ago
martian422 / MaskGRPO
The official implementation of MaskGRPO: Consolidating Reinforcement Learning for Multimodal Discrete Diffusion Models. (ICLR 2026, arxiv…
☆19 · Jan 27, 2026 · Updated 6 months ago
jacklishufan / diffusion-kto
The official implementation of Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility
☆69 · Aug 16, 2025 · Updated 11 months ago
Luo-Yihong / TDM-R1
[ICML 2026][Ultra Powerful Few-Step Diffusion RL] TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
☆116 · May 25, 2026 · Updated 2 months ago
JustinXu0 / AnimateZoo
☆22 · Mar 27, 2026 · Updated 4 months ago
showlab / SMS
[ICCV 2025] Balanced Image Stylization with Style Matching Score
☆69 · Mar 9, 2026 · Updated 4 months ago
kvablack / ddpo-pytorch
DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
☆768 · Mar 22, 2024 · Updated 2 years ago
jannerm / ddpo
Code for the paper "Training Diffusion Models with Reinforcement Learning"
☆574 · Jul 5, 2023 · Updated 3 years ago
Tencent-Hunyuan / GEAR
☆65 · Jul 1, 2026 · Updated 3 weeks ago
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆317 · Sep 28, 2025 · Updated 10 months ago
Fredreic1849 / BranchGRPO
BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
☆47 · Oct 30, 2025 · Updated 8 months ago
STARE-bench / STARE
☆19 · Oct 12, 2025 · Updated 9 months ago
siddharthverma314 / clcp-neurips-2020
Code for Continual Learning of Control Primitives
☆18 · Nov 11, 2020 · Updated 5 years ago
yifan123 / flow_grpo
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
☆2,440 · May 7, 2026 · Updated 2 months ago
Franklin-Zhang0 / ReasonGen-R1
Official repository for ReasonGen-R1
☆75 · Jun 23, 2025 · Updated last year
CodeGoat24 / Pref-GRPO
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
☆276 · Feb 10, 2026 · Updated 5 months ago
wdrink / SimpleAR
PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"
☆431 · Jun 20, 2025 · Updated last year
ZhengrongYue / PAE
Official Implementation of "What Matters for Diffusion-Friendly Latent Manifold? Prior-Aligned Autoencoders for Latent Diffusion"
☆80 · May 27, 2026 · Updated 2 months ago
notmahi / disk
PyTorch implementation for "Discovery of Incremental Skills" (DISk) algorithm from ICLR 2022 paper "One After Another: Learning Increment…
☆21 · Mar 22, 2022 · Updated 4 years ago
Luo-Yihong / DGPO
[ICLR 2026][Ultra Fast&Powerful Diffusion RL] Reinforcing Diffusion Models by Direct Group Preference Optimization
☆86 · May 26, 2026 · Updated 2 months ago
HaroldChen19 / VistaDPO
[ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models
☆42 · Jun 14, 2025 · Updated last year
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
SalesforceAIResearch / DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
☆706 · Jun 2, 2026 · Updated last month
TIGER-AI-Lab / EditReward
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing [ICLR 2026]
☆156 · Updated this week
KlingAIResearch / VideoAlign
[NeurIPS 2025] Improving Video Generation with Human Feedback
☆489 · Sep 24, 2025 · Updated 10 months ago
PKU-YuanGroup / Edit-R1
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
☆295 · Jan 24, 2026 · Updated 6 months ago
wookiekim / SOLACE
SOLACE: Improving Text-to-Image Generation with Intrinsic Self-Confidence Rewards (CVPR 2026)
☆17 · Jun 2, 2026 · Updated last month
RockeyCoss / SPO
[CVPR 2025] Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
☆271 · Apr 7, 2025 · Updated last year
xie-lab-ml / CoRe2
[TPAMI] The official implementation of our paper "Improved and Accelerated Text-to-Image Generation with Collect, Reflect, and Refine".
☆30 · Mar 8, 2026 · Updated 4 months ago
gulucaptain / videoassembler
[ECCV'24] Official project of paper "MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing"
☆30 · Dec 22, 2024 · Updated last year
wangqiang9 / Awesome-RLHF-Video-Diffusion
RLHF for Video Diffusion Models
☆26 · Jul 30, 2025 · Updated 11 months ago
showlab / Adv-GRPO
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image…
☆88 · Feb 26, 2026 · Updated 5 months ago