YangLing0818/RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

☆1,840

Alternatives and similar repositories for RPG-DiffusionMaster

Users who are interested in RPG-DiffusionMaster are comparing it to the repositories listed below.

TencentQQGYLab / ELLA
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
☆1,285 · Jul 17, 2024 · Updated 2 years ago
PixArt-alpha / PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
☆3,299 · Oct 31, 2024 · Updated last year
Alpha-VLLM / Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
☆2,247 · Feb 16, 2025 · Updated last year
ChenyangSi / FreeU
FreeU: Free Lunch in Diffusion U-Net (CVPR 2024 Oral)
☆1,899 · Dec 24, 2024 · Updated last year
google / style-aligned
Official code for "Style Aligned Image Generation via Shared Attention"
☆1,315 · Dec 29, 2023 · Updated 2 years ago
ali-vilab / VGen
Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models
☆3,155 · Jan 10, 2025 · Updated last year
tencent-ailab / IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
☆6,646 · Jun 28, 2024 · Updated 2 years ago
luosiallen / latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
☆4,615 · Jun 14, 2024 · Updated 2 years ago
instantX-research / InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
☆11,979 · Jul 18, 2024 · Updated 2 years ago
zai-org / ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
☆1,695 · Oct 29, 2025 · Updated 8 months ago
Doubiiu / DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
☆3,007 · Sep 8, 2024 · Updated last year
lllyasviel / LayerDiffuse
Transparent Image Layer Diffusion using Latent Transparency
☆2,218 · Jun 16, 2024 · Updated 2 years ago
Tencent-Hunyuan / HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
☆4,292 · Nov 27, 2025 · Updated 8 months ago
guoyww / AnimateDiff
Official implementation of AnimateDiff.
☆12,194 · Jul 31, 2024 · Updated last year
instantX-research / InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
☆2,011 · Sep 18, 2024 · Updated last year
TencentARC / MasaCtrl
[ICCV 2023] Consistent Image Synthesis and Editing
☆843 · Aug 19, 2024 · Updated last year
TencentARC / MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
☆1,497 · Feb 19, 2025 · Updated last year
frank-xwang / InstanceDiffusion
[CVPR 2024] Code release for "InstanceDiffusion: Instance-level Control for Image Generation"
☆614 · Jun 17, 2025 · Updated last year
AILab-CVC / VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
☆5,067 · Jan 9, 2026 · Updated 6 months ago
showlab / X-Adapter
[CVPR 2024] X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
☆770 · Aug 14, 2024 · Updated last year
genforce / freecontrol
Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Con…
☆480 · Oct 21, 2024 · Updated last year
TencentARC / T2I-Adapter
T2I-Adapter
☆3,804 · Jun 21, 2024 · Updated 2 years ago
TonyLianLong / LLM-groundedDiffusion
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusi…
☆483 · Sep 9, 2024 · Updated last year
MooreThreads / Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
☆3,513 · May 31, 2024 · Updated 2 years ago
TencentARC / PhotoMaker
PhotoMaker [CVPR 2024]
☆10,098 · Oct 31, 2024 · Updated last year
lllyasviel / Omost
Your image is almost there!
☆7,607 · Jul 26, 2024 · Updated 2 years ago
open-mmlab / PIA
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos…
☆975 · Aug 5, 2024 · Updated last year
TianxingWu / FreeInit
[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models
☆544 · Jan 18, 2024 · Updated 2 years ago
Vchitect / LaVie
[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
☆952 · Nov 13, 2024 · Updated last year
tyxsspa / AnyText
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
☆4,868 · Mar 7, 2025 · Updated last year
FoundationVision / LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
☆1,960 · Aug 15, 2024 · Updated last year
ali-vilab / AnyDoor
Official implementations for paper: Anydoor: zero-shot object-level image customization
☆4,237 · Apr 8, 2024 · Updated 2 years ago
HVision-NKU / StoryDiffusion
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
☆6,441 · Sep 26, 2024 · Updated last year
megvii-research / HiDiffusion
[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!
☆841 · Jan 7, 2026 · Updated 6 months ago
openai / consistencydecoder
Consistency Distilled Diff VAE
☆2,213 · Nov 7, 2023 · Updated 2 years ago
PixArt-alpha / PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
☆1,933 · Oct 31, 2024 · Updated last year
lllyasviel / IC-Light
More relighting!
☆8,475 · Feb 20, 2025 · Updated last year
gligen / GLIGEN
Open-Set Grounded Text-to-Image Generation
☆2,226 · Mar 6, 2024 · Updated 2 years ago
PRIV-Creation / Awesome-Controllable-T2I-Diffusion-Models
A collection of resources on controllable generation with text-to-image diffusion models.
☆1,111 · Dec 31, 2024 · Updated last year