[ICCV 2025] Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
☆620 · Dec 12, 2025 · Updated 3 months ago
Alternatives and similar repositories for RAG-Diffusion
Users interested in RAG-Diffusion are comparing it to the repositories listed below.
- Training-free Regional Prompting for Diffusion Transformers 🔥 ☆693 · Nov 28, 2024 · Updated last year
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption 🔍 ☆46 · Jul 5, 2025 · Updated 8 months ago
- Official repository of In-Context LoRA for Diffusion Transformers ☆2,061 · Dec 20, 2024 · Updated last year
- CoDi: Subject-Consistent and Pose-Diverse Text-to-Image Generation ☆37 · Aug 1, 2025 · Updated 7 months ago
- [ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer ☆1,905 · Jul 3, 2025 · Updated 8 months ago
- TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes ☆94 · Nov 26, 2025 · Updated 3 months ago
- Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models" ☆957 · Dec 23, 2025 · Updated 2 months ago
- Customized ID Consistent for human ☆1,018 · Jan 2, 2026 · Updated 2 months ago
- [🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing! ☆619 · May 1, 2025 · Updated 10 months ago
- It is an Android-based application that enables managing hotspot properties through a web interface, providing mobile routing functionali… ☆154 · Dec 19, 2024 · Updated last year
- User Identity Scaffolding for Multiple OIDC Authentications for User ☆95 · Jun 14, 2025 · Updated 9 months ago
- [ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG) ☆1,843 · Feb 1, 2025 · Updated last year
- [CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition ☆835 · Mar 8, 2026 · Updated 2 weeks ago
- [CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer ☆1,376 · Mar 13, 2025 · Updated last year
- ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment ☆1,281 · Jul 17, 2024 · Updated last year
- [ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion" ☆1,716 · Dec 17, 2024 · Updated last year
- Code for SCIS-2025 Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation". ☆1,189 · Apr 15, 2025 · Updated 11 months ago
- ☆247 · Nov 24, 2024 · Updated last year
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ☆1,354 · Sep 12, 2025 · Updated 6 months ago
- [AAAI 2026] Personalize Anything for Free with Diffusion Transformer ☆357 · Mar 20, 2025 · Updated last year
- Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple te… ☆1,131 · Feb 7, 2025 · Updated last year
- [ICML 2023 Oral, NeurIPS 2023] Official implementations for paper: Customizable Image Synthesis with Multiple Subjects ☆447 · Sep 12, 2023 · Updated 2 years ago
- kight is a static analysis tool for C/C++ programs. ☆214 · Dec 27, 2024 · Updated last year
- [ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation" ☆263 · Mar 6, 2026 · Updated 2 weeks ago
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. ☆274 · Dec 3, 2024 · Updated last year
- Official Implementation of AttentionShift: Iteratively Estimated Part-based Attention Map for Pointly Supervised Instance Segmentation ☆155 · Oct 18, 2024 · Updated last year
- Unofficial Implementation of ReplaceAnything: https://aigcdesigngroup.github.io/replace-anything/ ☆400 · May 27, 2024 · Updated last year
- Advanced Unsupervised Image Enhancement with GAN ☆247 · Nov 11, 2024 · Updated last year
- [AAAI 2025] Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking ☆117 · May 18, 2025 · Updated 10 months ago
- An open-source library with a powerful Contrastive Language-and-Motion (CLaM) pre-training evaluator ☆97 · Nov 23, 2025 · Updated 3 months ago
- ☆252 · Feb 11, 2025 · Updated last year
- [ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation ☆3,685 · Feb 27, 2025 · Updated last year
- Efficient DiT architecture for text2any tasks, ICLR 2025 ☆447 · May 10, 2025 · Updated 10 months ago
- A curated list of papers, code and resources pertaining to image composition/compositing or object insertion/addition/compositing, which … ☆533 · Feb 24, 2026 · Updated 3 weeks ago
- Welcome to the 'Open-Alteryx-Macro' project. This project is aimed at providing an open-source solution for managing and updating Alteryx… ☆156 · May 25, 2024 · Updated last year
- Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models" ☆242 · May 24, 2024 · Updated last year
- ☆1,053 · May 14, 2025 · Updated 10 months ago
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation ☆8,650 · Sep 14, 2024 · Updated last year
- ☆135 · May 6, 2024 · Updated last year