bytedance / USO
🔥🔥 Open-sourced unified customization model
☆1,201 · Updated 5 months ago
Alternatives and similar repositories for USO
Users who are interested in USO are comparing it to the repositories listed below
- HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation ☆672 · Updated 3 months ago
- [ICLR 26] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling ☆1,961 · Updated 3 weeks ago
- HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning ☆1,133 · Updated 2 weeks ago
- Qwen-Image-Lightning: Speed up Qwen-Image model with distillation ☆1,219 · Updated last month
- ComfyUI node for highly expressive speech and realistic zero-shot voice cloning ☆381 · Updated last month
- Official inference repo for FLUX.2 models ☆1,762 · Updated 3 weeks ago
- Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives ☆627 · Updated 2 months ago
- ☆2,053 · Updated last month
- ☆787 · Updated 6 months ago
- Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation. ☆725 · Updated last month
- Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length" ☆1,750 · Updated last week
- Official GitHub repository for FLUX.1 Krea [dev]. ☆360 · Updated 6 months ago
- Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation. ☆877 · Updated 5 months ago
- ComfyDeployed ☆441 · Updated 4 months ago
- HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation ☆2,827 · Updated last week
- ☆714 · Updated 3 months ago
- MoCha: End-to-End Video Character Replacement without Structural Guidance ☆635 · Updated 3 weeks ago
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ☆1,350 · Updated 5 months ago
- [SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off ☆333 · Updated 3 months ago
- ☆1,592 · Updated 2 months ago
- In-context subject-driven image generation while preserving foreground fidelity ☆351 · Updated 8 months ago
- [Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset ☆566 · Updated 3 months ago
- Pusa: Thousands Timesteps Video Diffusion Model ☆672 · Updated this week
- FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers ☆498 · Updated 5 months ago
- ☆1,046 · Updated 8 months ago
- Official Implementation of SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations ☆826 · Updated last month
- HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation ☆1,204 · Updated 3 months ago
- Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment ☆1,479 · Updated 5 months ago
- ObjectClear: Complete Object Removal via Object-Effect Attention ☆532 · Updated 2 months ago
- A high-quality rapid TTS voice cloning model that reaches speeds of 150x realtime. ☆694 · Updated 2 weeks ago