bytedance / USO
🔥🔥 Open-sourced unified customization model
★1,199 · Updated 4 months ago
Alternatives and similar repositories for USO
Users that are interested in USO are comparing it to the libraries listed below
- [ArXiv 25] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling ★1,403 · Updated last week
- HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning ★1,080 · Updated last month
- HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation ★671 · Updated 3 months ago
- Official inference repo for FLUX.2 models ★1,656 · Updated last week
- Qwen-Image-Lightning: Speed up Qwen-Image model with distillation ★1,192 · Updated 3 weeks ago
- ComfyUI node for highly expressive speech and realistic zero-shot voice cloning ★371 · Updated last month
- ★784 · Updated 6 months ago
- Official Implementations for Paper - HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives ★597 · Updated 2 months ago
- ★1,566 · Updated 2 months ago
- Official GitHub repository for FLUX.1 Krea [dev]. ★359 · Updated 5 months ago
- Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model. ★2,900 · Updated last week
- HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation ★2,665 · Updated 2 months ago
- Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation. ★719 · Updated last month
- ★1,960 · Updated last month
- [SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off ★329 · Updated 3 months ago
- Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation. ★869 · Updated 4 months ago
- ComfyDeployed ★439 · Updated 4 months ago
- ★709 · Updated 2 months ago
- Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length" ★1,460 · Updated this week
- [Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset ★559 · Updated 2 months ago
- MoCha: End-to-End Video Character Replacement without Structural Guidance ★597 · Updated last week
- [ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ★1,346 · Updated 4 months ago
- ObjectClear: Complete Object Removal via Object-Effect Attention ★533 · Updated 2 months ago
- FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers ★497 · Updated 5 months ago
- PersonaLive! : Expressive Portrait Image Animation for Live Streaming ★1,509 · Updated 3 weeks ago
- The official repository of paper "Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion" ★259 · Updated 2 weeks ago
- Official Implementation of SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations ★783 · Updated 2 weeks ago
- ★1,046 · Updated 8 months ago
- Light Image Video Generation Inference Framework ★1,822 · Updated last week
- ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio ★553 · Updated 4 months ago