Weixin-Liang / Modality-Gap
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆123 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for Modality-Gap
- ☆55 · Updated last year
- ☆113 · Updated last year
- ☆60 · Updated last year
- Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆51 · Updated 5 months ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022) ☆202 · Updated last year
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆28 · Updated last month
- [ICLR 2023] PLOT: Prompt Learning with Optimal Transport for Vision-Language Models ☆145 · Updated 10 months ago
- Repository for the paper: Teaching Structured Vision & Language Concepts to Vision & Language Models ☆44 · Updated last year
- A PyTorch implementation of Multimodal Few-Shot Learning with Frozen Language Models with OPT. ☆43 · Updated 2 years ago
- CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification ☆80 · Updated 5 months ago
- Code and results accompanying our paper titled CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets ☆54 · Updated last year
- Official repository for the ICCV 2023 paper: "Waffling around for Performance: Visual Classification with Random Words and Broad Concepts… ☆52 · Updated last year
- [NeurIPS 2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?" ☆166 · Updated 8 months ago
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models ☆25 · Updated 11 months ago
- Code for Finetune like you pretrain: Improved finetuning of zero-shot vision models ☆89 · Updated last year
- Toolkit for Elevater Benchmark ☆67 · Updated last year
- Learning to compose soft prompts for compositional zero-shot learning. ☆84 · Updated last year
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆258 · Updated 9 months ago
- NegCLIP. ☆26 · Updated last year
- ☆159 · Updated 10 months ago
- This repo is the official implementation of UPL (Unsupervised Prompt Learning for Vision-Language Models). ☆106 · Updated 2 years ago
- Source code for NeurIPS'23 paper "Dream the Impossible: Outlier Imagination with Diffusion Models" ☆61 · Updated last week
- Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition" ☆164 · Updated last month
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆71 · Updated 6 months ago
- ☆170 · Updated last year
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆32 · Updated last year
- [ICCV 2023] Prompt-aligned Gradient for Prompt Tuning ☆150 · Updated last year
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆26 · Updated 2 years ago
- Test-time Prompt Tuning (TPT) for zero-shot generalization in vision-language models (NeurIPS 2022) ☆144 · Updated 2 years ago
- Compress conventional Vision-Language Pre-training data ☆49 · Updated last year