Weixin-Liang/Modality-Gap

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning

☆178

Alternatives and similar repositories for Modality-Gap

Users who are interested in Modality-Gap are comparing it to the libraries listed below.

mertyg / vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …
☆294 · Jun 7, 2023 · Updated 3 years ago
allenai / close
☆59 · Aug 30, 2023 · Updated 2 years ago
NVlabs / PerVLBenchmark
☆11 · Jul 31, 2022 · Updated 3 years ago
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆158 · Apr 30, 2024 · Updated 2 years ago
vishaal27 / SuS-X
Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]
☆104 · Aug 22, 2023 · Updated 2 years ago
DavidHuji / CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
☆209 · Jan 28, 2024 · Updated 2 years ago
vinid / neg_clip
NegCLIP.
☆41 · Feb 6, 2023 · Updated 3 years ago
aimagelab / camel
CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022
☆30 · Dec 1, 2022 · Updated 3 years ago
NVlabs / PALAVRA
☆54 · Jul 31, 2022 · Updated 3 years ago
ytaek-oh / fsc-clip
[EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
☆23 · Oct 8, 2024 · Updated last year
hammoudhasan / SynthCLIP
Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs.
☆104 · Mar 23, 2025 · Updated last year
facebookresearch / diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
☆141 · Dec 16, 2025 · Updated 7 months ago
LijieFan / LaCLIP
[NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites"
☆291 · Jan 14, 2024 · Updated 2 years ago
UCSB-AI / ProbMed
Official repository for the ACL 2025 Findings paper "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal M…
☆25 · May 12, 2026 · Updated 2 months ago
vladan-stojnic / ZLaP
Code for Label Propagation for Zero-shot Classification with Vision-Language Models (CVPR2024)
☆45 · Jul 23, 2024 · Updated 2 years ago
YueYANG1996 / LaBo
CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
☆108 · May 28, 2024 · Updated 2 years ago
TAU-VAILab / hierarcaps
Code and data for the paper "Emergent Visual-Semantic Hierarchies in Image-Text Representations" (ECCV 2024)
☆34 · Aug 12, 2024 · Updated last year
yuhui-zh15 / C3
Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)
☆36 · Oct 16, 2024 · Updated last year
yuhui-zh15 / drml
Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023)
☆34 · Jun 8, 2023 · Updated 3 years ago
ConceptBed / evaluations
[AAAI 2024] ConceptBed Evaluations for Personalized Text-to-Image Diffusion Models
☆25 · Jun 1, 2023 · Updated 3 years ago
ant-research / DreamLIP
[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions
☆138 · May 8, 2025 · Updated last year
google-research / composed_image_retrieval
☆197 · Updated this week
AITRICS / Medical_Tri_Modal_Pilot
Medical multi-modal learning with missing modality data (MLHC 2023)
☆15 · Aug 1, 2023 · Updated 2 years ago
saic-fi / LFA
[ICCV 2023] Black Box Few-Shot Adaptation for Vision-Language models
☆27 · May 14, 2024 · Updated 2 years ago
JonathanCrabbe / CARs
This repository contains the implementation of Concept Activation Regions, a new framework to explain deep neural networks with human con…
☆17 · Oct 7, 2022 · Updated 3 years ago
Weixin-Liang / MetaShift
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)
☆110 · Aug 29, 2022 · Updated 3 years ago
tsb0601 / MMVP
☆364 · Jan 27, 2024 · Updated 2 years ago
wjpoom / SPEC
[CVPR 2024] The official implementation of paper "synthesize, diagnose, and optimize: towards fine-grained vision-language understanding"
☆52 · Jun 16, 2025 · Updated last year
dhg-wei / DeCap
ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning
☆144 · Mar 16, 2023 · Updated 3 years ago
zihuixue / MFH
[ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
☆44 · Jul 10, 2023 · Updated 3 years ago
gaopengcuhk / Tip-Adapter
☆677 · Nov 28, 2023 · Updated 2 years ago
zhongshsh / MoExtend
ACL 2024 (SRW), Official Codebase of our Paper: "MoExtend: Tuning New Experts for Modality and Task Extension"
☆15 · Dec 3, 2024 · Updated last year
chunmeifeng / SPRC
【ICLR 2024, Spotlight】Sentence-level Prompts Benefit Composed Image Retrieval
☆94 · Apr 16, 2024 · Updated 2 years ago
saibr / hypvl
This repository is related to 'Intriguing Properties of Hyperbolic Embeddings in Vision-Language Models', published at TMLR (2024), https…
☆21 · Jul 5, 2024 · Updated 2 years ago
ioanacroi / qb-norm
Cross Modal Retrieval with Querybank Normalisation
☆57 · Nov 21, 2023 · Updated 2 years ago
ganjiro / OfflineMania
[COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games"
☆12 · Jul 15, 2024 · Updated 2 years ago
XLearning-SCU / 2021-NeurIPS-NCR
☆82 · Nov 6, 2023 · Updated 2 years ago
JindongGu / SimDis
A pytorch implementation of the ICCV2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models
☆14 · Jul 15, 2021 · Updated 5 years ago
ExplainableML / Vision_by_Language
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
☆89 · Jul 4, 2024 · Updated 2 years ago