rajesh-lab / symile
Symile is a flexible, architecture-agnostic contrastive loss that enables training modality-specific representations for any number of modalities.
☆30 · Updated 2 months ago
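Symile scores a tuple of modalities with a multilinear inner product (the elementwise product of all modality embeddings, summed over the feature dimension), which reduces to the usual dot product in the two-modality CLIP case. The sketch below is a minimal, simplified illustration of that idea for three modalities, not the repository's actual implementation: the function names are hypothetical, and the full method additionally symmetrizes the loss over which modality is contrasted against the others.

```python
import numpy as np

def log_softmax(x):
    """Row-wise log-softmax, numerically stabilized."""
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def symile_loss_sketch(za, zb, zc):
    """Toy three-modality contrastive loss built on the multilinear
    inner product MIP(a, b, c) = sum_d a_d * b_d * c_d.

    za, zb, zc: (batch, dim) embeddings, one matrix per modality.
    Row i across the three matrices is a positive triple; the other
    rows of zc serve as in-batch negatives (a simplified scheme).
    """
    anchor = za * zb                 # elementwise product, (batch, dim)
    logits = anchor @ zc.T           # logits[i, j] = MIP(za_i, zb_i, zc_j)
    log_probs = log_softmax(logits)  # contrast the true zc_i against all zc_j
    return -np.mean(np.diag(log_probs))
```

With aligned embeddings the diagonal logits dominate and the loss approaches zero; permuting one modality's rows breaks the positive triples and drives the loss up, which is the behavior a contrastive objective over tuples should exhibit.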
Alternatives and similar repositories for symile:
Users interested in symile are comparing it to the repositories listed below.
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024) ☆19 · Updated 5 months ago
- Holistic evaluation of multimodal foundation models ☆43 · Updated 7 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆75 · Updated 6 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"… ☆17 · Updated last week
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants ☆28 · Updated 2 months ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023) ☆31 · Updated last year
- Code and benchmark for the paper "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆54 · Updated 3 months ago
- ☆43 · Updated 6 months ago
- More dimensions = More fun ☆21 · Updated 8 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis ☆124 · Updated 2 months ago
- ☆42 · Updated last year
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" (CVPR 202… ☆25 · Updated last week
- ☆40 · Updated 8 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆67 · Updated last year
- Implementation of the paper "BRAVE: Broadening the visual encoding of vision-language models" ☆26 · Updated last week
- MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning ☆25 · Updated last week
- Official PyTorch implementation of "Task Vectors are Cross-Modal" ☆22 · Updated 3 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆59 · Updated 6 months ago
- [NeurIPS 2023, ICMI 2023] Quantifying & Modeling Multimodal Interactions ☆70 · Updated 5 months ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆95 · Updated 7 months ago
- Expert-level AI radiology report evaluator ☆21 · Updated last week
- ☆44 · Updated last month
- "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA" ☆15 · Updated last month
- Evaluation and dataset construction code for the CVPR 2025 paper "Vision-Language Models Do Not Understand Negation" ☆19 · Updated 2 weeks ago
- Official implementation of MAIA, a Multimodal Automated Interpretability Agent ☆76 · Updated 3 weeks ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges ☆30 · Updated last year
- ☆48 · Updated 4 months ago
- ☆43 · Updated 2 months ago
- An Enhanced CLIP Framework for Learning with Synthetic Captions ☆28 · Updated 3 months ago
- Official implementation of DiffCLIP: Differential Attention Meets CLIP ☆20 · Updated 2 weeks ago