mihirp1998 / AlignProp
AlignProp uses direct reward backpropagation to align large-scale text-to-image diffusion models. Our method is 25x more sample- and compute-efficient than reinforcement learning methods (PPO) for finetuning Stable Diffusion.
☆273 · Updated 4 months ago
Alternatives and similar repositories for AlignProp:
Users who are interested in AlignProp are comparing it to the libraries listed below:
- Code for "Diffusion Model Alignment Using Direct Preference Optimization" ☆364 · Updated 3 weeks ago
- [CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model" ☆196 · Updated 10 months ago
- [ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy ☆240 · Updated 2 months ago
- Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models arXiv 2023 / CVPR 2024