atfortes / Awesome-Controllable-Diffusion
Papers and resources on Controllable Generation using Diffusion Models, including ControlNet, DreamBooth, IP-Adapter.
★502 · Jun 24, 2025 · Updated 7 months ago
Alternatives and similar repositories for Awesome-Controllable-Diffusion
Users who are interested in Awesome-Controllable-Diffusion are comparing it to the repositories listed below.
- From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 ★3,534 · May 7, 2025 · Updated 9 months ago
- A collection of resources on controllable generation with text-to-image diffusion models. ★1,111 · Dec 31, 2024 · Updated last year
- Diffusion Model-Based Image Editing: A Survey (TPAMI 2025) ★706 · Jul 15, 2025 · Updated 6 months ago
- Collection of diffusion model papers categorized by their subareas ★2,149 · Updated this week
- [ACL 2023] Reasoning with Language Model Prompting: A Survey ★994 · May 21, 2025 · Updated 8 months ago
- [CSUR] A Survey on Video Diffusion Models ★2,267 · Jun 27, 2025 · Updated 7 months ago
- Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Mod… ★361 · Mar 19, 2025 · Updated 10 months ago
- A curated list of recent diffusion models for video generation, editing, and various other applications. ★5,451 · Feb 3, 2026 · Updated last week
- ★17 · Aug 8, 2024 · Updated last year
- (ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis. ★2,425 · Updated this week
- A trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models". ★2,101 · Oct 5, 2023 · Updated 2 years ago
- Latest Advances on Multimodal Large Language Models ★17,337 · Updated this week
- Lumina-T2X is a unified framework for Text to Any Modality Generation ★2,251 · Feb 16, 2025 · Updated 11 months ago
- A collection of resources and papers on Diffusion Models ★12,273 · Aug 1, 2024 · Updated last year
- This repository contains a collection of papers and resources on Reasoning in Large Language Models. ★567 · Nov 13, 2023 · Updated 2 years ago
- Paper List for In-context Learning ★875 · Oct 8, 2024 · Updated last year
- LAVIS - A One-stop Library for Language-Vision Intelligence ★11,166 · Nov 18, 2024 · Updated last year
- A library for advanced large language model reasoning ★2,330 · Jun 10, 2025 · Updated 8 months ago
- Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion" ★1,473 · May 31, 2023 · Updated 2 years ago
- 🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio). ★540 · Apr 4, 2025 · Updated 10 months ago
- Official implementation of "Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance" (NeurIPS 2024) ★304 · Sep 12, 2025 · Updated 5 months ago
- Multimodal-GPT ★1,518 · Jun 4, 2023 · Updated 2 years ago
- [ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models. ★1,038 · Aug 21, 2024 · Updated last year
- Code release for our NeurIPS 2024 Spotlight paper "GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing" ★160 · Oct 23, 2024 · Updated last year
- The official code and dataset for EMNLP 2022 paper "COPEN: Probing Conceptual Knowledge in Pre-trained Language Models". ★21 · Mar 9, 2023 · Updated 2 years ago
- A Survey of Image Editing ★465 · Aug 24, 2025 · Updated 5 months ago
- Awesome diffusion Video-to-Video (V2V). A collection of paper on diffusion model-based video editing, aka. video-to-video (V2V) translati… ★276 · Nov 24, 2025 · Updated 2 months ago
- [CVPR`2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization ★109 · Apr 10, 2024 · Updated last year
- paper list on reasoning in NLP ★195 · Apr 7, 2025 · Updated 10 months ago
- Fine-Grained Subject-Specific Attribute Expression Control in T2I Models ★134 · Feb 27, 2025 · Updated 11 months ago
- Diffusion model papers, survey, and taxonomy ★3,322 · Sep 27, 2025 · Updated 4 months ago
- A reading list of video generation ★665 · Feb 4, 2026 · Updated last week
- Official implementation of CVPR 2024 paper: "FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Con… ★475 · Oct 21, 2024 · Updated last year
- Emu Series: Generative Multimodal Models from BAAI ★1,765 · Jan 12, 2026 · Updated last month
- The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. ★6,462 · Jun 28, 2024 · Updated last year
- [ICLR 2025] Official code implementation of DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation ★130 · Feb 23, 2025 · Updated 11 months ago
- Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models". ★1,139 · Dec 23, 2023 · Updated 2 years ago
- implementation for Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering ★10 · Mar 17, 2022 · Updated 3 years ago
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC ★14 · Apr 17, 2023 · Updated 2 years ago