Tencent-Hunyuan / HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
☆4,289 · Updated last month
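For context, Hunyuan-DiT can be run through the Diffusers library. Below is a minimal text-to-image sketch, assuming a recent diffusers release that ships `HunyuanDiTPipeline` and the `Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers` checkpoint; class and model names may differ for your version.

```python
# Minimal sketch: text-to-image with Hunyuan-DiT via Diffusers.
# Assumes diffusers >= 0.28 (HunyuanDiTPipeline) and the
# Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers checkpoint; adjust as needed.
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers", torch_dtype=torch.float16
)
pipe.to("cuda")

# Hunyuan-DiT is trained for fine-grained Chinese understanding,
# so both Chinese and English prompts work.
prompt = "一个宇航员在骑马"  # "an astronaut riding a horse"
image = pipe(prompt, num_inference_steps=50).images[0]
image.save("hunyuan_dit_sample.png")
```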
Alternatives and similar repositories for HunyuanDiT
Users interested in HunyuanDiT are comparing it to the repositories listed below
- [CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text ☆1,620 · Updated 9 months ago
- 📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion ☆2,243 · Updated 10 months ago
- Kolors Team ☆4,590 · Updated last year
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,250 · Updated 11 months ago
- InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥 ☆2,003 · Updated last year
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,260 · Updated last year
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising ☆2,810 · Updated last year
- [TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation. ☆1,901 · Updated 2 months ago
- [ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors ☆2,980 · Updated last year
- Transparent Image Layer Diffusion using Latent Transparency ☆2,186 · Updated last year
- Accepted as [NeurIPS 2024] Spotlight Presentation Paper ☆6,373 · Updated last year
- VideoSys: An easy and efficient system for video generation ☆2,016 · Updated 4 months ago
- Character Animation (AnimateAnyone, Face Reenactment) ☆3,476 · Updated last year
- OneDiff: An out-of-the-box acceleration library for diffusion models. ☆1,963 · Updated last month
- High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance ☆2,499 · Updated last month
- PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation ☆1,891 · Updated last year
- Unofficial Implementation of Animate Anyone ☆2,933 · Updated last year
- [ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG) ☆1,842 · Updated 11 months ago
- The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt. ☆6,411 · Updated last year
- Official implementations for paper: Zero-shot Image Editing with Reference Imitation ☆1,305 · Updated last year
- Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA ☆1,628 · Updated last year
- [AAAI 2025] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via S…" ☆909 · Updated 4 months ago
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,427 · Updated 10 months ago
- CogView4, CogView3-Plus and CogView3 (ECCV 2024) ☆1,102 · Updated 9 months ago
- Enjoy the magic of Diffusion models! ☆11,430 · Updated this week
- Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models ☆3,150 · Updated last year
- [ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion" ☆1,701 · Updated last year
- ☆1,590 · Updated last year
- MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation ☆2,645 · Updated 10 months ago
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation ☆5,019 · Updated last year