An unofficial implementation of both ViT-VQGAN and RQ-VAE in Pytorch
☆321 · Apr 7, 2025 · Updated 11 months ago
Alternatives and similar repositories for enhancing-transformers
Users who are interested in enhancing-transformers compare it to the libraries listed below.
- JAX implementation ViT-VQGAN ☆82 · Sep 21, 2022 · Updated 3 years ago
- The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22) ☆1,004 · Jan 3, 2024 · Updated 2 years ago
- [ICLR2025] Halton Scheduler for Masked Generative Image Transformer ☆282 · Oct 28, 2025 · Updated 4 months ago
- Taming Transformers for High-Resolution Image Synthesis ☆6,438 · Jul 30, 2024 · Updated last year
- Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dyna… ☆192 · Jul 23, 2023 · Updated 2 years ago
- Pytorch implementation of VQGAN (Taming Transformers for High-Resolution Image Synthesis) (https://arxiv.org/pdf/2012.09841.pdf) ☆545 · Jul 17, 2024 · Updated last year
- Official Jax Implementation of MaskGIT ☆554 · Nov 18, 2022 · Updated 3 years ago
- Code for the ECCV 2022 paper "Unleashing Transformers" ☆185 · Apr 17, 2023 · Updated 2 years ago
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer) ☆28 · Feb 14, 2024 · Updated 2 years ago
- Pytorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf) ☆468 · Sep 3, 2023 · Updated 2 years ago
- ☆141 · Jun 28, 2024 · Updated last year
- Official implementation of VQ-Diffusion ☆978 · Apr 17, 2024 · Updated last year
- [ICCV 2023] Online Clustered Codebook ☆183 · Sep 19, 2024 · Updated last year
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆997 · Nov 25, 2025 · Updated 3 months ago
- ☆484 · Jun 30, 2022 · Updated 3 years ago
- MoVQGAN - model for the image encoding and reconstruction ☆261 · Oct 31, 2023 · Updated 2 years ago
- A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis ☆576 · Mar 10, 2023 · Updated 2 years ago
- Open reproduction of MUSE for fast text2image generation. ☆359 · Jun 1, 2024 · Updated last year
- High-performance Image Tokenizers for VAR and AR ☆303 · Apr 25, 2025 · Updated 10 months ago
- Fast and controllable text-to-image model. ☆41 · Jun 16, 2023 · Updated 2 years ago
- PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838 ☆1,872 · Feb 20, 2026 · Updated 2 weeks ago
- This repo contains the code for 1D tokenizer and generator ☆1,120 · Mar 20, 2025 · Updated 11 months ago
- Official Implementation of Paella https://arxiv.org/abs/2211.07292v2 ☆748 · Oct 4, 2023 · Updated 2 years ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆83 · Jan 30, 2023 · Updated 3 years ago
- ☆239 · Jul 24, 2023 · Updated 2 years ago
- [ICCV 2025] Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation" ☆198 · Jan 7, 2026 · Updated last month
- FlexTok: Resampling Images into 1D Token Sequences of Flexible Length ☆300 · Jun 2, 2025 · Updated 9 months ago
- ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis ☆126 · Mar 14, 2022 · Updated 3 years ago
- Official implementation of Diffusion Autoencoders ☆959 · Sep 12, 2024 · Updated last year
- Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch ☆1,796 · Feb 15, 2023 · Updated 3 years ago
- ☆144 · Feb 27, 2024 · Updated 2 years ago
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,937 · Aug 15, 2024 · Updated last year
- Implementation of Denoising Diffusion Probabilistic Models in PyTorch ☆394 · Jun 14, 2022 · Updated 3 years ago
- Official JAX implementation of MAGVIT: Masked Generative Video Transformer ☆995 · Jan 17, 2024 · Updated 2 years ago
- This repo contains the implementation of VQGAN, Taming Transformers for High-Resolution Image Synthesis in PyTorch from scratch. I have a… ☆40 · Aug 20, 2024 · Updated last year
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆8,393 · May 31, 2024 · Updated last year
- A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models". ☆1,097 · Mar 25, 2023 · Updated 2 years ago
- [ICML 2025 Tokshop] One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression ☆77 · Jul 30, 2025 · Updated 7 months ago
- Official implementation for SSDD Single-Step Diffusion Decoder for Efficient Image Tokenization. ☆55 · Nov 12, 2025 · Updated 3 months ago