bytedance / UMO
🔥🔥 Official Repo of UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
☆179 · Sep 15, 2025 · Updated 4 months ago
Alternatives and similar repositories for UMO
Users who are interested in UMO are comparing it to the repositories listed below.
- ☆20 · Apr 15, 2025 · Updated 9 months ago
- 🔥🔥 Open-sourced unified customization model ☆1,201 · Sep 12, 2025 · Updated 5 months ago
- Cut2Next: Generating Next Shot via In-Context Tuning ☆31 · Aug 21, 2025 · Updated 5 months ago
- ☆108 · Sep 3, 2025 · Updated 5 months ago
- Xiaohongshu's Flux version of transparent image generation (layerdiffuse), supporting text-to-image and image-to-image ☆17 · Mar 17, 2025 · Updated 10 months ago
- Create Latents with Perlin Noise in any shape (dimensionality). Works with Flux, SD3 and other 16d latent models. ☆34 · Aug 6, 2024 · Updated last year
- The official code for paper "UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation" ☆35 · Jul 29, 2024 · Updated last year
- Official implementation for "Nested Attention: Semantic-aware Attention Values for Concept Personalization" [SIGGRAPH 2025] ☆27 · Aug 4, 2025 · Updated 6 months ago
- OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models ☆153 · Sep 24, 2025 · Updated 4 months ago
- ☆227 · Jul 17, 2025 · Updated 6 months ago
- Official code for ICCV 2025 paper, X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distill… ☆89 · Jun 26, 2025 · Updated 7 months ago
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis" ☆20 · Jan 26, 2025 · Updated last year
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ☆1,350 · Sep 12, 2025 · Updated 5 months ago
- [NeurIPS 2025] Official code for Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing ☆72 · Oct 12, 2025 · Updated 4 months ago
- ☆103 · Jan 6, 2026 · Updated last month
- ☆44 · Jun 7, 2024 · Updated last year
- Implementation of layer diffuse inference using refiners ☆25 · Apr 25, 2024 · Updated last year
- Official PyTorch/Diffusers implementation of "RectifiedHR: Enable Efficient High Resolution Image Generation via Energy Rectification" ☆30 · Oct 11, 2025 · Updated 4 months ago
- Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058). ☆420 · Aug 26, 2025 · Updated 5 months ago
- Advanced drum machine for ComfyUI featuring a 64-step sequencer, custom sample support, and retro hardware aesthetics. ☆20 · Jan 19, 2026 · Updated 3 weeks ago
- 3D Editing via Propagation of Image Prompts to Multi-View ☆19 · Nov 30, 2025 · Updated 2 months ago
- ☆65 · Jun 16, 2025 · Updated 7 months ago
- [ICLR 2026] NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks ☆134 · Oct 20, 2025 · Updated 3 months ago
- ☆716 · Nov 7, 2025 · Updated 3 months ago
- ☆29 · May 7, 2025 · Updated 9 months ago
- [NeurIPS'2024] Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps ☆101 · Jul 4, 2024 · Updated last year
- [ICCV 2025] Official implementation of the paper "DreamCube: 3D Panorama Generation via Multi-plane Synchronization". ☆168 · Feb 4, 2026 · Updated last week
- [ICLR 2026] SparseD: Sparse Attention for Diffusion Language Models ☆57 · Oct 7, 2025 · Updated 4 months ago
- Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference ☆1,249 · Oct 26, 2025 · Updated 3 months ago
- [ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing ☆165 · Jun 26, 2025 · Updated 7 months ago
- ComfyUI custom_node for ByteDance's InfiniteYou ☆11 · Apr 16, 2025 · Updated 9 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Nov 4, 2025 · Updated 3 months ago
- ComfyUI-Step1X-3D is now available in ComfyUI, delivering high-fidelity 3D asset generation with consistent geometry-texture alignment. I… ☆13 · May 16, 2025 · Updated 8 months ago
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model ☆13 · Dec 29, 2024 · Updated last year
- OmniSVG: A Unified Scalable Vector Graphics Generation Model, you can try it in ComfyUI ☆24 · Dec 5, 2025 · Updated 2 months ago
- For ACL25 paper "WAFFLE: Multi-Modal Model for Automated Front-End Development" - by Shanchao Liang and Nan Jiang and Shangshu Qian and L… ☆11 · May 28, 2025 · Updated 8 months ago
- ☆19 · Apr 23, 2025 · Updated 9 months ago
- ☆14 · Nov 24, 2023 · Updated 2 years ago
- (ICCV'25) TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models (Au… ☆14 · Aug 22, 2025 · Updated 5 months ago