bytedance / USO
🔥🔥 Open-sourced unified customization model
★1,181 · Updated 2 months ago
Alternatives and similar repositories for USO
Users who are interested in USO are comparing it to the libraries listed below.
- Qwen-Image-Lightning: Speed up Qwen-Image model with distillation ★972 · Updated last month
- HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation ★656 · Updated last month
- Official GitHub repository for FLUX.1 Krea [dev]. ★355 · Updated 3 months ago
- HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning ★872 · Updated last month
- ComfyUI node for highly expressive speech and realistic zero-shot voice cloning ★313 · Updated last month
- In-context subject-driven image generation while preserving foreground fidelity ★351 · Updated 5 months ago
- ★1,300 · Updated this week
- ★781 · Updated 4 months ago
- ComfyDeployed ★426 · Updated 2 months ago
- ★670 · Updated last week
- HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation ★2,462 · Updated 3 weeks ago
- Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation. ★663 · Updated 2 months ago
- Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives ★433 · Updated last week
- ★1,156 · Updated 2 weeks ago
- [SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off ★321 · Updated last month
- [Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset ★489 · Updated 3 weeks ago
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ★1,324 · Updated 2 months ago
- Pusa: Thousands Timesteps Video Diffusion Model ★661 · Updated 2 months ago
- Towards Real-Time Diffusion-Based Streaming Video Super-Resolution - An efficient one-step diffusion framework for streaming VSR with loc… ★806 · Updated 2 weeks ago
- FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers ★480 · Updated 3 months ago
- ★1,043 · Updated 6 months ago
- ★753 · Updated 9 months ago
- Lumina-Image 2.0: A Unified and Efficient Image Generative Framework ★820 · Updated 2 weeks ago
- Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation. ★725 · Updated 2 months ago
- MoCha: End-to-End Video Character Replacement without Structural Guidance ★433 · Updated this week
- ObjectClear: Complete Object Removal via Object-Effect Attention ★499 · Updated last month
- ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio ★511 · Updated last month
- HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation ★1,192 · Updated last month
- ★1,933 · Updated last month
- [ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling ★519 · Updated 2 weeks ago