atfortes / Awesome-Controllable-DiffusionLinks

Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, IP-Adapter.

☆495

Alternatives and similar repositories for Awesome-Controllable-Diffusion

Users that are interested in Awesome-Controllable-Diffusion are comparing it to the libraries listed below

Sorting:

YingqingHe / Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
☆512Updated 6 months ago
RunpeiDong / DreamLLM
[ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation
☆459Updated 10 months ago
AILab-CVC / SEED
Official implementation of SEED-LLaMA (ICLR 2024).
☆631Updated last year
kohjingyu / gill
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
☆467Updated last year
jy0205 / LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
☆594Updated last year
ziqihuangg / Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
☆370Updated last month
showlab / Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
☆729Updated 2 weeks ago
HenryHZY / Awesome-Multimodal-LLM
Research Trends in LLM-guided Multimodal Learning.
☆355Updated 2 years ago
TonyLianLong / LLM-groundedDiffusion
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusi…
☆478Updated last year
AILab-CVC / SEED-Bench
(CVPR2024)A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
☆352Updated 9 months ago
llava-rlhf / LLaVA-RLHF
Aligning LMMs with Factually Augmented RLHF
☆382Updated last year
yzhang2016 / video-generation-survey
A reading list of video generation
☆628Updated this week
Atomic-man007 / Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod…
☆343Updated 7 months ago
AILab-CVC / SEED-X
Multimodal Models in Real World
☆549Updated 8 months ago
haoningwu3639 / StoryGen
[CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models
☆258Updated 10 months ago
tsb0601 / MMVP
☆356Updated last year
OpenGVLab / Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…
☆543Updated last year
TIGER-AI-Lab / Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
☆231Updated 7 months ago
PRIV-Creation / Awesome-Controllable-T2I-Diffusion-Models
A collection of resources on controllable generation with text-to-image diffusion models.
☆1,085Updated 10 months ago
snap-research / Panda-70M
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
☆637Updated last year
soraw-ai / Awesome-Text-to-Video-Generation
A list for Text-to-Video, Image-to-Video works
☆243Updated 4 months ago
SiatMMLab / Awesome-Diffusion-Model-Based-Image-Editing-Methods
Diffusion Model-Based Image Editing: A Survey (TPAMI 2025)
☆675Updated 3 months ago
lichao-sun / SoraReview
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision M…
☆501Updated last year
Zhendong-Wang / Prompt-Diffusion
Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"
☆410Updated last year
yuweihao / MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
☆310Updated 9 months ago
cientgu / InstructDiffusion
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
☆439Updated last year
AlonzoLeeeooo / awesome-text-to-image-studies
A collection of awesome text-to-image generation studies.
☆684Updated last week
HaozheZhao / MIC
MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU
☆356Updated last year
LMM101 / Awesome-Multimodal-Next-Token-Prediction
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
☆453Updated 9 months ago
Meituan-AutoML / VisionLLaMA
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
☆389Updated last year