lucidrains / RQ-Transformer
Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"
☆95 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for RQ-Transformer
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆82 · Updated last month
- An implementation of simple diffusion in PyTorch (and JAX) ☆35 · Updated last year
- Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch ☆64 · Updated 2 years ago
- JAX implementation of ViT-VQGAN ☆77 · Updated 2 years ago
- Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency" ☆153 · Updated 4 months ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆87 · Updated 2 years ago
- Implementation of a multimodal diffusion transformer in Pytorch ☆97 · Updated 5 months ago
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆84 · Updated 2 years ago
- Jax/Flax implementation of Variational-DiffWave. ☆40 · Updated 2 years ago
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer) ☆26 · Updated 9 months ago
- [NeurIPS 2023] Official Implementation: "Consistent Diffusion Models" ☆54 · Updated last year
- Unofficial implementation of Neural Analysis and Synthesis ☆7 · Updated 3 years ago
- Implementation of Discrete Key / Value Bottleneck, in Pytorch ☆87 · Updated last year
- Implementation of "Analyzing and Improving the Training Dynamics of Diffusion Models" ☆89 · Updated 9 months ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆94 · Updated this week
- Implementation of the proposed MaskBit from Bytedance AI ☆62 · Updated last week
- Official PyTorch implementation for FastDPM, a fast sampling algorithm for diffusion probabilistic models ☆81 · Updated 3 years ago
- Implementation of rectified flow and some of its followup research / improvements in Pytorch ☆196 · Updated this week
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆110 · Updated 2 years ago
- The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23) ☆26 · Updated 10 months ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆77 · Updated 5 months ago
- Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch) ☆14 · Updated last year
- Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models" ☆26 · Updated 8 months ago
- An official pytorch implementation of AAAI 2024 paper "Latent Space Editing in Transformer-based Flow Matching" ☆27 · Updated 7 months ago
- [ICCV 2023] Online Clustered Codebook ☆148 · Updated 2 months ago