Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
☆175 · Sep 26, 2022 · Updated 3 years ago
Alternatives and similar repositories for Modality-Gap
Users that are interested in Modality-Gap are comparing it to the libraries listed below.
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR … ☆294 · Jun 7, 2023 · Updated 2 years ago
- ☆59 · Aug 30, 2023 · Updated 2 years ago
- ☆11 · Jul 31, 2022 · Updated 3 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆156 · Apr 30, 2024 · Updated last year
- Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23] ☆105 · Aug 22, 2023 · Updated 2 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings) ☆204 · Jan 28, 2024 · Updated 2 years ago
- NegCLIP. ☆40 · Feb 6, 2023 · Updated 3 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022 ☆29 · Dec 1, 2022 · Updated 3 years ago
- ☆54 · Jul 31, 2022 · Updated 3 years ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs. ☆104 · Mar 23, 2025 · Updated last year
- Official repository for the ACL 2025 Findings paper "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal M… ☆26 · Feb 21, 2025 · Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆141 · Dec 16, 2025 · Updated 4 months ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆289 · Jan 14, 2024 · Updated 2 years ago
- CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification ☆106 · May 28, 2024 · Updated last year
- Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024) ☆34 · Aug 12, 2024 · Updated last year
- Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024) ☆34 · Oct 16, 2024 · Updated last year
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆34 · Jun 8, 2023 · Updated 2 years ago
- [AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models ☆25 · Jun 1, 2023 · Updated 2 years ago
- ☆195 · Mar 5, 2025 · Updated last year
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆138 · May 8, 2025 · Updated 11 months ago
- [ICCV 2023] Black Box Few-Shot Adaptation for Vision-Language models ☆27 · May 14, 2024 · Updated last year
- ☆360 · Jan 27, 2024 · Updated 2 years ago
- ACL 2024 (SRW), Official Codebase of our Paper: "MoExtend: Tuning New Experts for Modality and Task Extension" ☆15 · Dec 3, 2024 · Updated last year
- [CVPR 2024] The official implementation of paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding" ☆52 · Jun 16, 2025 · Updated 10 months ago
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022) ☆110 · Aug 29, 2022 · Updated 3 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆141 · Mar 16, 2023 · Updated 3 years ago
- 【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval ☆93 · Apr 16, 2024 · Updated 2 years ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆86 · Jan 27, 2025 · Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift ☆39 · Jan 25, 2024 · Updated 2 years ago
- Cross Modal Retrieval with Querybank Normalisation ☆57 · Nov 21, 2023 · Updated 2 years ago
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation ☆45 · Jul 10, 2023 · Updated 2 years ago
- ☆668 · Nov 28, 2023 · Updated 2 years ago
- This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https… ☆22 · Jul 5, 2024 · Updated last year
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval" ☆85 · Jul 4, 2024 · Updated last year
- Medical multi-modal learning with missing modality data (MLHC 2023) ☆14 · Aug 1, 2023 · Updated 2 years ago
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆36 · Sep 24, 2024 · Updated last year
- A PyTorch implementation of the ICCV 2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models ☆14 · Jul 15, 2021 · Updated 4 years ago
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners ☆380 · Jun 1, 2023 · Updated 2 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022) ☆211 · Dec 18, 2022 · Updated 3 years ago