Heidelberg-NLP / MM-SHAP
This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks"
☆23 · Updated 11 months ago
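The paper's metric scores how much each modality contributes to a prediction as that modality's share of the total absolute Shapley values over all input tokens, independent of whether the prediction is correct. Below is a minimal sketch of that computation, assuming per-token Shapley values have already been obtained; the function name and toy arrays are illustrative assumptions, not the repository's actual API.

```python
# Minimal sketch of the MM-SHAP idea: a modality's contribution is its share
# of absolute Shapley values over the model's input tokens. Illustrative only;
# not the repository's API.
import numpy as np

def mm_shap(shap_values: np.ndarray, is_text_token: np.ndarray) -> tuple[float, float]:
    """Return (T-SHAP, V-SHAP): textual and visual contribution shares.

    shap_values   -- per-token Shapley values for one prediction
    is_text_token -- boolean mask, True for text tokens, False for image tokens
    """
    magnitudes = np.abs(shap_values)
    total = magnitudes.sum()
    t_shap = magnitudes[is_text_token].sum() / total   # textual share
    v_shap = magnitudes[~is_text_token].sum() / total  # visual share
    return t_shap, v_shap

# Toy example: 4 text tokens followed by 4 image patches
phi = np.array([0.30, -0.10, 0.05, 0.15, -0.20, 0.08, 0.02, 0.10])
mask = np.array([True, True, True, True, False, False, False, False])
t_shap, v_shap = mm_shap(phi, mask)
print(f"T-SHAP = {t_shap:.2f}, V-SHAP = {v_shap:.2f}")
```

By construction the two shares sum to 1, which is what makes the score comparable across models regardless of task accuracy.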
Alternatives and similar repositories for MM-SHAP:
Users interested in MM-SHAP are comparing it to the repositories listed below.
- [ICML 2022] This is the PyTorch implementation of "Rethinking Attention-Model Explainability through Faithfulness Violation Test" (https:… ☆19 · Updated 2 years ago
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆47 · Updated last year
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data ☆39 · Updated last year
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆94 · Updated 6 months ago
- Official implementation for NeurIPS'23 paper "Geodesic Multi-Modal Mixup for Robust Fine-Tuning" ☆32 · Updated 4 months ago
- [ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages" ☆49 · Updated 2 years ago
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning ☆144 · Updated 2 years ago
- Official repository for the ICCV 2023 paper: "Waffling around for Performance: Visual Classification with Random Words and Broad Concepts… ☆56 · Updated last year
- ☆21 · Updated 8 months ago
- Implementation for the paper "Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly" (ECCV 2022: https://arxiv.org/abs… ☆33 · Updated last year
- Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Work… ☆47 · Updated 2 years ago
- visual question answering prompting recipes for large vision-language models ☆24 · Updated 5 months ago
- Official Code Implementation of the paper: XAI for Transformers: Better Explanations through Conservative Propagation ☆63 · Updated 3 years ago
- Code and data for ImageCoDe, a contextual vision-and-language benchmark ☆39 · Updated 11 months ago
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 · Updated last year
- Code and datasets for "What’s “up” with vision-language models? Investigating their struggle with spatial reasoning". ☆40 · Updated 11 months ago
- ☆29 · Updated 2 years ago
- ☆58 · Updated last year
- Source code and data used in the papers ViQuAE (Lerner et al., SIGIR'22), Multimodal ICT (Lerner et al., ECIR'23) and Cross-modal Retriev… ☆31 · Updated 2 months ago
- Retrieval-augmented Image Captioning ☆13 · Updated 2 years ago
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆79 · Updated 9 months ago
- ☆117 · Updated 2 years ago
- This is the official repository for "Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts" (EMNLP 2022) ☆100 · Updated 2 years ago
- The SVO-Probes Dataset for Verb Understanding ☆31 · Updated 3 years ago
- ☆29 · Updated last year
- AlignCLIP: Improving Cross-Modal Alignment in CLIP ☆20 · Updated 7 months ago
- ☆63 · Updated 3 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024) ☆15 · Updated 9 months ago
- Official Code Release for "Diagnosing and Rectifying Vision Models using Language" (ICLR 2023) ☆33 · Updated last year
- Code for the paper "A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others" ☆47 · Updated 7 months ago