bytedance / XVerse
[NeurIPS 2025] Official implementation of "XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation".
☆621 Updated 3 months ago
Alternatives and similar repositories for XVerse
Users who are interested in XVerse are comparing it to the repositories listed below.
- Implementation of "FLUX-Text: A Simple and Advanced Diffusion Transformer Baseline for Scene Text Editing" ☆435 Updated 2 months ago
- [CVPR 2025 Highlight🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition ☆812 Updated 5 months ago
- UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer ☆832 Updated 9 months ago
- The official implementation of RealisDance ☆610 Updated 7 months ago
- [ICLR'26] Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework ☆527 Updated 2 weeks ago
- ☆283 Updated 6 months ago
- [AAAI 2025] Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation ☆171 Updated 7 months ago
- [TIP 2025] From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation ☆196 Updated 4 months ago
- The official code implementation of the paper "OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data." ☆425 Updated 8 months ago
- Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching ☆286 Updated 5 months ago
- [ICCV 2025] Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement 🔥 ☆620 Updated 2 months ago
- ☆414 Updated 11 months ago
- Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025) and UltraViCo (IC… ☆785 Updated last week
- 🔥 [ICCV 2025 Highlight] Official ComfyUI native node supporting InfiniteYou with FLUX ☆281 Updated 6 months ago
- Official implementation of MAGREF: Masked Guidance for Any-Reference Video Generation with Subject Disentanglement ☆285 Updated last month
- Official project page of MTVCrafter, a new paradigm for animating arbitrary characters with 4D motion tokens. ☆278 Updated last week
- AnyTalker: Scaling Multi-person Talking Video Generation with Interactivity Refinement ☆278 Updated 2 months ago
- Calligrapher: Freestyle Text Image Customization ☆296 Updated 5 months ago
- Official code for AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset ☆271 Updated 8 months ago
- [SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off ☆333 Updated 3 months ago
- [ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation ☆298 Updated last year
- [NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation ☆192 Updated last month
- Pusa: Thousands Timesteps Video Diffusion Model ☆672 Updated this week
- We present FlashPortrait, an end-to-end video diffusion transformer capable of synthesizing ID-preserving, infinite-length videos while a… ☆434 Updated last month
- [Preprint 2025] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset ☆566 Updated 3 months ago
- Official implementation for "Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model" (NeurIPS 2024) ☆258 Updated 9 months ago
- We achieve high-quality first-frame guided video editing given a reference image, while maintaining flexibility for incorporating additi… ☆322 Updated 5 months ago
- Codes for ID-Specific Video Customized Diffusion ☆462 Updated last year
- Official code for StoryMem: Multi-shot Long Video Storytelling with Memory ☆644 Updated 3 weeks ago
- [SIGGRAPH 2025] Official code of the paper "FlexiAct: Towards Flexible Action Control in Heterogeneous Scenarios" ☆344 Updated 3 months ago