kakaobrain / magvlt
The official implementation of MAGVLT: Masked Generative Vision-and-Language Transformer (CVPR'23)
☆24 Updated 8 months ago
Related projects:
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer) ☆26 Updated 7 months ago
- ☆44 Updated 4 months ago
- This is an official implementation of GRIT-VLP ☆20 Updated 2 years ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT ☆51 Updated last month
- ☆24 Updated 11 months ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023) ☆32 Updated last year
- ☆46 Updated last year
- https://arxiv.org/abs/2209.15162 ☆48 Updated last year
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆22 Updated last month
- Data-Efficient Multimodal Fusion on a Single GPU ☆45 Updated 4 months ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models" ☆11 Updated 3 months ago
- [ICLR-2023] Rarity Score : A New Metric to Evaluate the Uncommonness of Synthesized Images ☆59 Updated 2 years ago
- Official Pytorch Implementation of Our CVPR2023 Paper: "Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image… ☆49 Updated last year
- Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters". ☆40 Updated 2 months ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆68 Updated 7 months ago
- ☆18 Updated 3 weeks ago
- ☆29 Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆51 Updated last year
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆30 Updated 6 months ago
- Language Quantized AutoEncoders ☆94 Updated last year
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆41 Updated 3 weeks ago
- A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models with OPT. ☆41 Updated 2 years ago
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES" ☆24 Updated last week
- [ICLR 2023] RC-MAE ☆51 Updated 9 months ago
- ☆27 Updated last year
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 Updated last year
- Official PyTorch implementation of "Energy-Based Contrastive Learning of Visual Representations", NeurIPS 2022 Oral Paper ☆9 Updated last year
- Censored Sampling of Diffusion Models Using 3 Minutes of Human Feedback ☆25 Updated last year
- Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch ☆64 Updated 2 years ago