feizc / DiT-MoE
Scaling Diffusion Transformers with Mixture of Experts
☆293 Updated 6 months ago
Alternatives and similar repositories for DiT-MoE:
Users who are interested in DiT-MoE are comparing it to the libraries listed below.
- MoVQGAN - model for the image encoding and reconstruction ☆223 Updated last year
- [ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need ☆195 Updated last week
- [NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers" ☆190 Updated 5 months ago
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation ☆253 Updated 2 weeks ago
- [CVPR 2025] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆471 Updated last week
- [ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi… ☆229 Updated 10 months ago
- Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch ☆322 Updated 2 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆413 Updated this week
- ☆119 Updated 8 months ago
- ☆164 Updated last month
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆268 Updated 3 months ago
- SpeeD: A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training ☆177 Updated last month
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆406 Updated 4 months ago
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆142 Updated 4 months ago
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆420 Updated 5 months ago
- VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE ☆296 Updated 2 months ago
- Implementation of MagViT2 Tokenizer in Pytorch ☆596 Updated 2 months ago
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ☆137 Updated last month
- This is a repo to track the latest autoregressive visual generation papers. ☆164 Updated this week
- Scalable Diffusion Models with State Space Backbone ☆151 Updated last year
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆294 Updated 2 months ago
- This repo contains the code for 1D tokenizer and generator ☆709 Updated 3 weeks ago
- Transformer-Mamba Diffusion Models ☆103 Updated 8 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆286 Updated 2 weeks ago
- ☆144 Updated 3 months ago
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆843 Updated 3 weeks ago
- Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆199 Updated last month
- ☆82 Updated 5 months ago
- [CVPR2025] PAR: Parallelized Autoregressive Visual Generation. https://epiphqny.github.io/PAR-project/ ☆126 Updated 2 months ago