kakaobrain / magvlt
The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23)
☆27 Updated last year
Alternatives and similar repositories for magvlt
Users who are interested in magvlt are comparing it to the libraries listed below.
- Official Implementation (Pytorch) of "Constant Acceleration Flow", NeurIPS 2024 ☆33 Updated 8 months ago
- Official Pytorch Implementation of Our CVPR2023 Paper: "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image… ☆61 Updated 2 years ago
- Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization" ☆121 Updated 3 years ago
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆85 Updated last year
- ☆137 Updated last year
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 Updated last year
- MIO: A Foundation Model on Multimodal Tokens ☆30 Updated 9 months ago
- An official pytorch implementation of AAAI 2024 paper "Latent Space Editing in Transformer-based Flow Matching" ☆44 Updated last year
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆58 Updated 11 months ago
- The official PyTorch implementation of Fast Diffusion Model ☆95 Updated 2 years ago
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer) ☆29 Updated last year
- ☆55 Updated 2 years ago
- ☆31 Updated last year
- [ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD). ☆162 Updated 2 years ago
- Minimal multi-gpu implementation of EDM2: "Analyzing and Improving the Training Dynamics of Diffusion Models" ☆36 Updated last year
- ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021. ☆62 Updated 3 years ago
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models' ☆20 Updated last year
- [ICCV 2023] Online Clustered Codebook ☆176 Updated last year
- [ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph" ☆54 Updated 2 years ago
- AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model ☆44 Updated 3 months ago
- Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch ☆65 Updated 3 years ago
- Language Quantized AutoEncoders ☆109 Updated 2 years ago
- An official pytorch implementation of EACL2024 short paper "Flow Matching for Conditional Text Generation in a Few Sampling Steps" ☆25 Updated 2 months ago
- This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV2023) ☆23 Updated last year
- Implementation of a multimodal diffusion transformer in Pytorch ☆105 Updated last year
- ☆151 Updated 6 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆56 Updated 2 years ago
- Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dyna… ☆185 Updated 2 years ago
- Official implementation for the paper "A Cheaper and Better Diffusion Language Model with Soft-Masked Noise" ☆59 Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆88 Updated last year