Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆176Sep 26, 2022Updated 3 years ago
Alternatives and similar repositories for Modality-Gap
Users that are interested in Modality-Gap are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆293Jun 7, 2023Updated 3 years ago
- ☆59Aug 30, 2023Updated 2 years ago
- ☆11Jul 31, 2022Updated 3 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆156Apr 30, 2024Updated 2 years ago
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]☆104Aug 22, 2023Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆206Jan 28, 2024Updated 2 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆30Dec 1, 2022Updated 3 years ago
- ☆54Jul 31, 2022Updated 3 years ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆104Mar 23, 2025Updated last year
- Official repository for the ACL 2025 Findings paper "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal M…☆26May 12, 2026Updated last month
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 5 months ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"☆289Jan 14, 2024Updated 2 years ago
- Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)☆45Jul 23, 2024Updated last year
- CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification☆108May 28, 2024Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)☆34Aug 12, 2024Updated last year
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆35Oct 16, 2024Updated last year
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)☆34Jun 8, 2023Updated 3 years ago
- ☆197May 9, 2026Updated last month
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆139May 8, 2025Updated last year
- [ICCV 2023] Black Box Few-Shot Adaptation for Vision-Language models☆27May 14, 2024Updated 2 years ago
- ☆363Jan 27, 2024Updated 2 years ago
- [CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"☆52Jun 16, 2025Updated 11 months ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)☆110Aug 29, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆143Mar 16, 2023Updated 3 years ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆93Apr 16, 2024Updated 2 years ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…☆85Jan 27, 2025Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆39Jan 25, 2024Updated 2 years ago
- Cross Modal Retrieval with Querybank Normalisation☆57Nov 21, 2023Updated 2 years ago
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation☆44Jul 10, 2023Updated 2 years ago
- ☆677Nov 28, 2023Updated 2 years ago
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…☆21Jul 5, 2024Updated last year
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"☆88Jul 4, 2024Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- Medical multi-modal learning with missing modality data (MLHC 2023)☆15Aug 1, 2023Updated 2 years ago
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning"☆35Sep 24, 2024Updated last year
- A pytorch implementation of the ICCV2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models☆14Jul 15, 2021Updated 4 years ago
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners☆379Jun 1, 2023Updated 3 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- Official code for ICCV 2023 paper, "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts"☆104Mar 6, 2024Updated 2 years ago
- ☆82Nov 6, 2023Updated 2 years ago