divyam3897 / I2M2
I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024)
☆22 Updated last year
Alternatives and similar repositories for I2M2
Users who are interested in I2M2 are comparing it to the repositories listed below.
- [NeurIPS 2023, ICMI 2023] Quantifying & Modeling Multimodal Interactions ☆84 Updated last year
- Symile is a flexible, architecture-agnostic contrastive loss that enables training modality-specific representations for any number of mo… ☆43 Updated 8 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆74 Updated 2 years ago
- Active Learning in the era of Foundation Models ☆10 Updated 7 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆70 Updated last year
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants ☆39 Updated 2 months ago
- Sparse Linear Concept Embeddings ☆120 Updated 8 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"… ☆27 Updated last month
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆99 Updated last year
- [ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆46 Updated 7 months ago
- Expert-level AI radiology report evaluator ☆35 Updated 7 months ago
- Bilingual Medical Mixture of Experts LLM ☆31 Updated last year
- [NeurIPS 2023] Official Codebase for "Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback" ☆19 Updated 2 years ago
- [NeurIPS 2025 D&B Spotlight] CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays ☆25 Updated last month
- Accompanying code for "Analyzing Vision Transformers in Class Embedding Space" (NeurIPS '23) ☆15 Updated last year
- Implementation of Zorro, Masked Multimodal Transformer, in PyTorch ☆97 Updated 2 years ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023) ☆35 Updated 2 years ago
- [CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning ☆29 Updated 7 months ago
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆28 Updated 10 months ago
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆41 Updated 7 months ago
- Code and benchmark for the paper "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆60 Updated 11 months ago
- A regression-like loss to improve numerical reasoning in language models (ICML 2025) ☆26 Updated 3 months ago
- [ICLR 23] A new framework to transform any neural networks into an interpretable concept-bottleneck-model (CBM) without needing labeled c… ☆123 Updated last year
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence" ☆16 Updated 5 months ago
- Holistic evaluation of multimodal foundation models ☆47 Updated last year
- [CVPR 2025] BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature ☆85 Updated 8 months ago