bytedance/USO


bytedance / USO

[CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning

☆1,227

Alternatives and similar repositories for USO

Users who are interested in USO are comparing it to the repositories listed below.

bytedance / UMO
[CVPR 2026] 🔥🔥 Official Repo of UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward
☆190 · Sep 15, 2025 · Updated 10 months ago
Tencent-Hunyuan / HunyuanImage-2.1
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
☆673 · Oct 14, 2025 · Updated 9 months ago
bytedance / UNO
[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
☆1,360 · Sep 12, 2025 · Updated 10 months ago
bytedance / XVerse
[NeurIPS 2025] Official implementation of "XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulatio…
☆627 · Oct 22, 2025 · Updated 9 months ago
WeChatCV / Stand-In
[CVPR2026 🎉] Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation.
☆777 · Feb 21, 2026 · Updated 5 months ago
nv-tlabs / ChronoEdit
[ICLR 2026] ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
☆698 · Nov 20, 2025 · Updated 8 months ago
Fantasy-AMAP / fantasy-portrait
FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers
☆511 · Aug 20, 2025 · Updated 11 months ago
blurgyy / CoMPaSS
[ICCV 2025] Enhancing spatial understanding in text-to-Image diffusion models
☆94 · Sep 11, 2025 · Updated 10 months ago
Saquib764 / omini-kontext
An inference and training framework for multiple image input in Flux Kontext dev
☆441 · Sep 1, 2025 · Updated 10 months ago
showlab / OmniConsistency
The official code implementation of the paper "OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data."
☆423 · Jun 8, 2025 · Updated last year
bytedance / OneReward
☆348 · Sep 15, 2025 · Updated 10 months ago
Yaofang-Liu / Pusa-VidGen
Pusa: Thousands Timesteps Video Diffusion Model
☆686 · Feb 13, 2026 · Updated 5 months ago
Phantom-video / Phantom
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
☆1,512 · Sep 11, 2025 · Updated 10 months ago
ModelTC / LightX2V-Qwen-Image-Lightning
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
☆1,340 · Jan 1, 2026 · Updated 6 months ago
Phantom-video / HuMo
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
☆1,274 · Jan 25, 2026 · Updated 6 months ago
guyyariv / DyPE
[ICML 2026] Official implementation for "DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion".
☆356 · May 18, 2026 · Updated 2 months ago
Tencent-Hunyuan / SRPO
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
☆1,278 · May 11, 2026 · Updated 2 months ago
inclusionAI / TwinFlow
[ICLR 2026] Taming large-scale few-step training with self-adversarial flows! 👏🏻
☆536 · Feb 24, 2026 · Updated 5 months ago
bytedance / RealCustom
☆97 · Nov 6, 2025 · Updated 8 months ago
Tencent-Hunyuan / HunyuanImage-3.0
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
☆3,200 · Jun 23, 2026 · Updated last month
bytedance / DreamO
[SIGGRAPH Asia 2025] DreamO: A Unified Framework for Image Customization
☆1,652 · Aug 14, 2025 · Updated 11 months ago
QwenLM / Qwen-Image
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
☆8,167 · Feb 10, 2026 · Updated 5 months ago
WeChatCV / Wan-Alpha
[CVPR 2026 Highlight] High-Quality Text-to-Video Generation with Alpha Channel
☆392 · Apr 9, 2026 · Updated 3 months ago
Coral-Protocol / Anemoi
Anemoi: A Semi-Centralized Multi-agent Systems Based on Agent-to-Agent Communication MCP server from Coral Protocol
☆370 · Aug 27, 2025 · Updated 10 months ago
little-misfit / GRAG-Image-Editing
https://little-misfit.github.io/GRAG-Image-Editing/
☆119 · Nov 27, 2025 · Updated 7 months ago
bytedance / lynx
Lynx: Towards High-Fidelity Personalized Video Generation
☆336 · Feb 27, 2026 · Updated 4 months ago
Eyeline-Labs / CineScale
Tuning-Free 4K Video Generation
☆186 · Updated this week
allenai / OLMoASR
An open-source implementation of Whisper
☆492 · Oct 29, 2025 · Updated 8 months ago
ali-vilab / VACE
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
☆3,883 · Oct 17, 2025 · Updated 9 months ago
X-Omni-Team / X-Omni
Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058).
☆426 · Aug 26, 2025 · Updated 11 months ago
MCG-NJU / SteadyDancer
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation
☆638 · Dec 23, 2025 · Updated 7 months ago
Kunbyte-AI / DRA-Ctrl
Official Implementation of DRA-Ctrl (Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis)
☆119 · Aug 15, 2025 · Updated 11 months ago
Tencent-Hunyuan / HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
☆1,226 · Oct 15, 2025 · Updated 9 months ago
stepfun-ai / Step1X-Edit
A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gem…
☆2,238 · Apr 29, 2026 · Updated 2 months ago
VectorSpaceLab / OmniGen2
OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871
☆4,107 · Mar 20, 2026 · Updated 4 months ago
River-Zhang / ICEdit
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ …
☆2,101 · Dec 19, 2025 · Updated 7 months ago
zjx0101 / ObjectClear
[CVPR'26] ObjectClear: Precise Object and Effect Removal with Adaptive Target-Aware Attention
☆606 · Feb 26, 2026 · Updated 5 months ago
Tencent-Hunyuan / InstantCharacter
☆1,047 · May 14, 2025 · Updated last year
PKU-YuanGroup / UniWorld
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
☆884 · Dec 23, 2025 · Updated 7 months ago