Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆171Sep 26, 2022Updated 3 years ago
Alternatives and similar repositories for Modality-Gap
Users that are interested in Modality-Gap are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆292Jun 7, 2023Updated 2 years ago
- ☆59Aug 30, 2023Updated 2 years ago
- ☆11Jul 31, 2022Updated 3 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models☆155Apr 30, 2024Updated last year
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]☆105Aug 22, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆204Jan 28, 2024Updated 2 years ago
- NegCLIP.☆39Feb 6, 2023Updated 3 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- ☆54Jul 31, 2022Updated 3 years ago
- [ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"☆25Feb 21, 2025Updated last year
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.☆103Mar 23, 2025Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 3 months ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"☆288Jan 14, 2024Updated 2 years ago
- Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)☆44Jul 23, 2024Updated last year
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification☆106May 28, 2024Updated last year
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)☆34Aug 12, 2024Updated last year
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy☆73Nov 13, 2023Updated 2 years ago
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)☆34Oct 16, 2024Updated last year
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)☆34Jun 8, 2023Updated 2 years ago
- [AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models☆25Jun 1, 2023Updated 2 years ago
- ☆196Mar 5, 2025Updated last year
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions☆138May 8, 2025Updated 10 months ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)☆108Aug 29, 2022Updated 3 years ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆38Jan 25, 2024Updated 2 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning☆139Mar 16, 2023Updated 3 years ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval☆92Apr 16, 2024Updated last year
- Cross Modal Retrieval with Querybank Normalisation☆57Nov 21, 2023Updated 2 years ago
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation☆46Jul 10, 2023Updated 2 years ago
- ☆666Nov 28, 2023Updated 2 years ago
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…☆22Jul 5, 2024Updated last year
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"☆84Jul 4, 2024Updated last year
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning"☆36Sep 24, 2024Updated last year
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- A pytorch implementation of the ICCV2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models☆14Jul 15, 2021Updated 4 years ago
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners☆381Jun 1, 2023Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆211Dec 18, 2022Updated 3 years ago
- Official code for ICCV 2023 paper, "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts"☆104Mar 6, 2024Updated 2 years ago
- ☆82Nov 6, 2023Updated 2 years ago
- This repository contains the implementation of Concept Activation Regions, a new framework to explain deep neural networks with human con…☆16Oct 7, 2022Updated 3 years ago
- PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)☆99May 2, 2022Updated 3 years ago