lucidrains / titok-pytorch
Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
☆161 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for titok-pytorch
- Implementation of the proposed MaskBit from Bytedance AI ☆62 · Updated last week
- Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch ☆255 · Updated 2 months ago
- Implementation of a multimodal diffusion transformer in Pytorch ☆97 · Updated 4 months ago
- ☆102 · Updated 4 months ago
- Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba" ☆110 · Updated 2 months ago
- Train VAE like a boss ☆246 · Updated 3 weeks ago
- This repo contains the code for 1D tokenizer and generator ☆548 · Updated this week
- Implementation of rectified flow and some of its followup research / improvements in Pytorch ☆191 · Updated 2 weeks ago
- Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers". ☆124 · Updated last week
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆340 · Updated last month
- ☆78 · Updated 10 months ago
- VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆137 · Updated 3 weeks ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆389 · Updated last week
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆240 · Updated last month
- ☆193 · Updated 4 months ago
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆91 · Updated 2 weeks ago
- Explorations into improving ViTArc with Slot Attention ☆37 · Updated last month
- Implementation of Autoregressive Diffusion in Pytorch ☆300 · Updated 2 weeks ago
- Scaling Diffusion Transformers with Mixture of Experts ☆207 · Updated 2 months ago
- FMBoost: Boosting Latent Diffusion with Flow Matching (ECCV 2024 Oral) ☆185 · Updated last month
- Object Recognition as Next Token Prediction (CVPR 2024 Highlight) ☆161 · Updated last month
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook ☆42 · Updated 4 months ago
- Transformer implementation for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆55 · Updated last month
- Rectified Diffusion: Straightness Is Not Your Need ☆128 · Updated this week
- The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A su… ☆178 · Updated 3 weeks ago
- ☆195 · Updated last month
- Implementation of Lumiere, SOTA text-to-video generation from Google Deepmind, in Pytorch ☆250 · Updated 3 months ago
- MoVQGAN - model for the image encoding and reconstruction ☆197 · Updated last year
- An in-context conditioning version of MUSE with pre-trained checkpoints. ☆111 · Updated last year