Heidelberg-NLP / MM-SHAP
This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks"
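At its core, MM-SHAP aggregates the absolute Shapley values of a model's text tokens and image tokens and reports each modality's share of the total contribution, independent of task accuracy. A minimal sketch of that ratio, assuming per-token Shapley values have already been computed by an explainer (the function name and input values below are illustrative, not this repository's API):

```python
import numpy as np

def mm_shap(text_shap, image_shap):
    """Return (T-SHAP, V-SHAP): each modality's share of the total
    absolute Shapley contribution. The two values sum to 1."""
    t = np.abs(np.asarray(text_shap, dtype=float)).sum()
    v = np.abs(np.asarray(image_shap, dtype=float)).sum()
    total = t + v
    return t / total, v / total

# Illustrative numbers only; real values would come from a SHAP
# explainer run over masked text tokens and image patches of a
# vision-and-language model.
t_shap, v_shap = mm_shap([0.4, -0.1, 0.2], [0.2, -0.1])
```

A higher T-SHAP indicates the model leans more on the text input than on the image for its predictions; because only contribution shares are compared, the measure stays meaningful even when the prediction is wrong.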
Related projects:
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
- Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Work…
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models
- MedViLL official code. (Published IEEE JBHI 2021)
- Fine-tuning CLIP using ROCO dataset which contains image-caption pairs from PubMed articles.
- VQA-Med 2020
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral)
- An Open-source Factuality Evaluation Demo for LLMs
- EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
- [ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"
- Code and data for ImageCoDe, a contextual vision-and-language benchmark
- [ACL 2024] FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
- Repository of the paper Consistency-preserving Visual Question Answering in Medical Imaging (MICCAI 2022)
- Repository for the paper: Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models (https://arxiv.org/abs/23…
- A curated list of vision-and-language pre-training (VLP). :-)
- Visual Question Answering in the Medical Domain VQA-Med 2019
- Repository for the paper: Self-supervised vision-language pretraining for Medical visual question answering
- Code, data, and models for the Sherlock corpus