[ICCV 2025] Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥
★621 · Dec 12, 2025 · Updated 2 months ago
Alternatives and similar repositories for RAG-Diffusion
Users that are interested in RAG-Diffusion are comparing it to the libraries listed below
- Training-free Regional Prompting for Diffusion Transformers 🔥 · ★694 · Nov 28, 2024 · Updated last year
- [ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer · ★1,903 · Jul 3, 2025 · Updated 7 months ago
- [CVPR 2025] InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption · ★47 · Jul 5, 2025 · Updated 7 months ago
- Official repository of In-Context LoRA for Diffusion Transformers · ★2,058 · Dec 20, 2024 · Updated last year
- Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models" · ★942 · Dec 23, 2025 · Updated 2 months ago
- Customized ID Consistent for human · ★1,017 · Jan 2, 2026 · Updated 2 months ago
- [ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing! · ★617 · May 1, 2025 · Updated 10 months ago
- [CVPR 2025 Highlight 🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition · ★828 · Aug 30, 2025 · Updated 6 months ago
- [CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer · ★1,368 · Mar 13, 2025 · Updated 11 months ago
- [ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion" · ★1,713 · Dec 17, 2024 · Updated last year
- Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple te… · ★1,117 · Feb 7, 2025 · Updated last year
- [ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation" · ★262 · Jan 12, 2026 · Updated last month
- It is an Android-based application that enables managing hotspot properties through a web interface, providing mobile routing functionali… · ★154 · Dec 19, 2024 · Updated last year
- Code for SCIS-2025 Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation". · ★1,187 · Apr 15, 2025 · Updated 10 months ago
- User Identity Scaffolding for Multiple OIDC Authentications for User · ★95 · Jun 14, 2025 · Updated 8 months ago
- Official code for VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control · ★191 · Dec 31, 2024 · Updated last year
- [ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG) · ★1,844 · Feb 1, 2025 · Updated last year
- Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (ICCV, 2025) · ★52 · Jan 14, 2026 · Updated last month
- ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment · ★1,277 · Jul 17, 2024 · Updated last year
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning · ★1,350 · Sep 12, 2025 · Updated 5 months ago
- [AAAI 2026] Personalize Anything for Free with Diffusion Transformer · ★355 · Mar 20, 2025 · Updated 11 months ago
- [ICCV 2025] SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree · ★549 · Jul 29, 2025 · Updated 7 months ago
- ★1,049 · May 14, 2025 · Updated 9 months ago
- [ICML 2023 Oral, NeurIPS 2023] Official implementations for paper: Customizable Image Synthesis with Multiple Subjects · ★446 · Sep 12, 2023 · Updated 2 years ago
- [CVPR 2025 Highlight] X-Dyna: Expressive Dynamic Human Image Animation · ★261 · Jan 30, 2025 · Updated last year
- CoDi: Subject-Consistent and Pose-Diverse Text-to-Image Generation · ★37 · Aug 1, 2025 · Updated 7 months ago
- [under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing" · ★586 · Sep 3, 2025 · Updated 5 months ago
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective [NeurIPS 2024]. · ★274 · Dec 3, 2024 · Updated last year
- ★247 · Nov 24, 2024 · Updated last year
- Rectified Flow Inversion (RF-Inversion) - ICLR 2025 · ★469 · Mar 19, 2025 · Updated 11 months ago
- Efficient DiT architecture for text2any tasks, ICLR2025 · ★447 · May 10, 2025 · Updated 9 months ago
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥 · ★2,007 · Sep 18, 2024 · Updated last year
- Enhance-A-Video: Better Generated Video for Free · ★594 · Mar 17, 2025 · Updated 11 months ago
- kight is a static analysis tool for C/C++ programs. · ★214 · Dec 27, 2024 · Updated last year
- Advanced Unsupervised Image Enhancement with GAN · ★247 · Nov 11, 2024 · Updated last year
- (Accepted by IJCV) Liquid: Language Models are Scalable and Unified Multi-modal Generators · ★640 · Nov 10, 2025 · Updated 3 months ago
- Highly encapsulated for effortless usage, this state machine kernel is realized with just a single function call! · ★55 · Aug 21, 2025 · Updated 6 months ago
- [ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code! · ★838 · Jan 7, 2026 · Updated last month
- MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation · ★234 · Jul 11, 2024 · Updated last year