muzishen / RCDMs
[AAAI 2025] 🎬RCDMs🎬: Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models. RCDMs improve story generation with strong semantic and temporal consistency, integrating rich contextual conditions and enabling one-pass inference for enhanced coherence.
☆88 Updated last month
Alternatives and similar repositories for RCDMs
Users who are interested in RCDMs are comparing it to the repositories listed below.
- [NeurIPS 2024] 🕺IMAGPose🕺: A Unified Conditional Framework for Pose-Guided Person Generation. IMAGPose enables versatile pose-guided im… ☆108 Updated last month
- [AAAI 2025] MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration ☆39 Updated 5 months ago
- [NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory ☆69 Updated last week
- 🎨IMAGGarment🎨: Fine-Grained Garment Generation with Controllable Structure, Color, and Logo. It supports precise and customizable ga… ☆262 Updated 2 weeks ago
- Multi-Reward as Condition for Instruction-Based Image Editing ☆56 Updated 7 months ago
- Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos ☆296 Updated last month
- [NeurIPS 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent ☆683 Updated this week
- [CVPR 2025] JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration ☆242 Updated last month
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models ☆139 Updated 6 months ago
- ☆26 Updated 6 months ago
- [ICLR2025] ClassDiffusion: Official impl. of Paper "ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance" ☆46 Updated 8 months ago
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing ☆70 Updated 4 months ago
- ☆21 Updated 11 months ago
- [CVPR2025] Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing ☆21 Updated 2 months ago
- Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion ☆298 Updated 5 months ago
- [CVPR 2024] U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation ☆18 Updated last year
- This repo contains the code for PreciseControl project [ECCV'24] ☆69 Updated last year
- PixelHacker: Image Inpainting with Structural and Semantic Consistency ☆454 Updated 5 months ago
- ☆41 Updated 10 months ago
- Official implementation of the paper "Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirect… ☆197 Updated 6 months ago
- ☆33 Updated last year
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆125 Updated 11 months ago
- Official code for "DiffX: Guide Your Layout to Cross-Modal Generative Modeling" ☆22 Updated 8 months ago
- Replication in Visual Diffusion Models: A Survey and Outlook ☆31 Updated last year
- Official code for K-LoRA (CVPR 2025) ☆131 Updated last month
- An official implementation of "Re-Attentional Controllable Video Diffusion Editing" in PyTorch. (AAAI 2025) ☆27 Updated 10 months ago
- ☆14 Updated last year
- [CVPR 2025] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation ☆42 Updated 4 months ago
- ☆51 Updated 10 months ago
- OmniStyle: Filtering High Quality Style Transfer Data at Scale (CVPR 2025) ☆32 Updated 3 months ago