kyegomez / MAGVIT2
Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION"
☆15 · Updated last week
Related projects:
- REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets ☆11 · Updated 11 months ago
- Video Diffusion State Space Models ☆19 · Updated 5 months ago
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 · Updated last week
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆22 · Updated last month
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆16 · Updated 2 months ago
- ☆32 · Updated 3 months ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆22 · Updated 7 months ago
- ☆19 · Updated 11 months ago
- Official Implementation of the paper: A Complete Recipe for Diffusion Generative Models ☆28 · Updated 11 months ago
- Official repo for the paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆26 · Updated 4 months ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆28 · Updated 2 months ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment ☆9 · Updated 8 months ago
- [ECCV 2024] Official pytorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts" ☆30 · Updated 2 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆30 · Updated 6 months ago
- ☆24 · Updated last year
- ☆15 · Updated 2 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆42 · Updated 9 months ago
- ☆16 · Updated this week
- [ACL 2023] PuMer: Pruning and Merging Tokens for Efficient Vision Language Models ☆25 · Updated last year
- Code for paper "Unsegment Anything by Simulating Deformation" (CVPR 2024) ☆21 · Updated 3 months ago
- ☆43 · Updated 2 weeks ago
- The codebase of our paper "Improving the Training of Rectified Flows" ☆65 · Updated 2 months ago
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching" ☆30 · Updated last month
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023. ☆20 · Updated last year
- ☆43 · Updated 5 months ago
- ☆29 · Updated last year
- Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024" ☆19 · Updated 4 months ago
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆26 · Updated 3 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024). ☆22 · Updated 6 months ago