pliang279 / MultiViz
[ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models
☆87 · Updated 3 weeks ago
Related projects:
- Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning ☆113 · Updated last year
- [TMLR 2022] High-Modality Multimodal Transformer ☆104 · Updated 11 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆55 · Updated 10 months ago
- [NeurIPS 2023, ICMI 2023] Quantifying & Modeling Multimodal Interactions ☆52 · Updated 8 months ago
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆26 · Updated 2 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆65 · Updated 11 months ago
- Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers ☆85 · Updated last year
- NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022 (Oral) ☆44 · Updated 7 months ago
- Holistic evaluation of multimodal foundation models ☆36 · Updated last month
- A curated list of vision-and-language pre-training (VLP). :-) ☆56 · Updated 2 years ago
- This is the official implementation of the paper "MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision… ☆19 · Updated 6 months ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023) ☆30 · Updated 11 months ago
- Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks ☆27 · Updated 2 years ago
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆61 · Updated 4 months ago
- Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation ☆100 · Updated 5 months ago
- PyTorch implementation of LIMoE ☆49 · Updated 5 months ago
- ViLLA: Fine-grained vision-language representation learning from real-world data ☆38 · Updated 11 months ago
- The Social-IQ 2.0 Challenge Release for the Artificial Social Intelligence Workshop at ICCV '23 ☆19 · Updated 11 months ago
- Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Work… ☆40 · Updated last year
- [ICLR 2024 spotlight] Making Pre-trained Language Models Great on Tabular Prediction ☆38 · Updated 2 months ago
- Implementation of Zorro, Masked Multimodal Transformer, in PyTorch ☆95 · Updated 11 months ago
- CVPR 2022, Robust Contrastive Learning against Noisy Views ☆81 · Updated 2 years ago
- Code for paper "UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning", ACL 2022 ☆58 · Updated 2 years ago
- This is the official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval". CVPR 2022 ☆93 · Updated 2 years ago
- The Continual Learning in Multimodality Benchmark ☆58 · Updated last year
- On the Effectiveness of Parameter-Efficient Fine-Tuning ☆38 · Updated 10 months ago
- A new collection of medical VQA dataset based on MIMIC-CXR. Part of the work 'EHRXQA: A Multi-Modal Question Answering Dataset for Electr… ☆62 · Updated 3 weeks ago