Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆170Sep 26, 2022Updated 3 years ago
Alternatives and similar repositories for Modality-Gap
Users that are interested in Modality-Gap are comparing it to the libraries listed below
Sorting:
- ☆59Aug 30, 2023Updated 2 years ago
- ☆11Jul 31, 2022Updated 3 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆155Apr 30, 2024Updated last year
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]☆106Aug 22, 2023Updated 2 years ago
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆292Jun 7, 2023Updated 2 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆203Jan 28, 2024Updated 2 years ago
- ☆54Jul 31, 2022Updated 3 years ago
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)☆34Aug 12, 2024Updated last year
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…☆22Jul 5, 2024Updated last year
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆102Mar 23, 2025Updated 11 months ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)☆34Jun 8, 2023Updated 2 years ago
- [NeurIPS 2021] Official codes for "Efficient Training of Visual Transformers with Small Datasets".☆144Jan 1, 2025Updated last year
- [ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"☆25Feb 21, 2025Updated last year
- Code for "Can We Characterize Tasks Without Labels or Features?" (CVPR 2021)☆11Aug 31, 2021Updated 4 years ago
- NegCLIP.☆39Feb 6, 2023Updated 3 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 2 months ago
- ☆661Nov 28, 2023Updated 2 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆138Mar 16, 2023Updated 2 years ago
- A pytorch implementation of the ICCV2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models☆14Jul 15, 2021Updated 4 years ago
- PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)☆99May 2, 2022Updated 3 years ago
- CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification☆106May 28, 2024Updated last year
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"☆289Jan 14, 2024Updated 2 years ago
- [AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models☆25Jun 1, 2023Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆138May 8, 2025Updated 9 months ago
- ☆15Dec 31, 2020Updated 5 years ago
- Medical multi-modal learning with missing modality data (MLHC 2023)☆14Aug 1, 2023Updated 2 years ago
- This repository contains the implementation of Concept Activation Regions, a new framework to explain deep neural networks with human con…☆16Oct 7, 2022Updated 3 years ago
- ☆29Oct 18, 2022Updated 3 years ago
- ☆81Nov 6, 2023Updated 2 years ago
- A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework".☆84Feb 13, 2024Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆209Dec 18, 2022Updated 3 years ago
- Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)☆56Mar 1, 2024Updated 2 years ago
- ☆16May 23, 2023Updated 2 years ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training☆226Mar 20, 2025Updated 11 months ago
- ☆195Mar 5, 2025Updated last year
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation☆46Jul 10, 2023Updated 2 years ago
- ☆360Jan 27, 2024Updated 2 years ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32May 15, 2023Updated 2 years ago