rajesh-lab / symile
Symile is a flexible, architecture-agnostic contrastive loss that enables training modality-specific representations for any number of modalities.
☆27 · Updated 2 months ago
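For context, the description above amounts to a CLIP-style contrastive objective whose similarity score extends beyond pairs of modalities. Below is a minimal, hypothetical PyTorch sketch of such a loss, assuming a multilinear-inner-product score and a simplified negative-sampling scheme; the function name, signature, and default temperature are illustrative assumptions, not the repository's actual API:

```python
import torch
import torch.nn.functional as F

def symile_style_loss(embeddings, temperature=0.07):
    """Hypothetical sketch of a Symile-style contrastive loss.

    `embeddings` is a list of N tensors, one per modality, each of
    shape (batch, dim) and assumed L2-normalized. Aligned tuples sit
    at the same batch index; other indices serve as negatives.
    """
    batch = embeddings[0].shape[0]
    labels = torch.arange(batch, device=embeddings[0].device)
    loss = 0.0
    for i, anchor in enumerate(embeddings):
        # Elementwise product of the remaining modalities; with two
        # modalities the score reduces to the usual dot product.
        rest = [z for j, z in enumerate(embeddings) if j != i]
        prod = rest[0]
        for z in rest[1:]:
            prod = prod * z
        # logits[a, b] = multilinear inner product of anchor sample a
        # with the other modalities' sample b (negatives vary jointly,
        # a simplification made for this sketch).
        logits = (anchor @ prod.T) / temperature
        loss = loss + F.cross_entropy(logits, labels)
    return loss / len(embeddings)

# Toy usage with three modalities from any encoders:
B, D = 8, 64
za, zb, zc = (F.normalize(torch.randn(B, D), dim=-1) for _ in range(3))
print(symile_style_loss([za, zb, zc]))
```

Averaging the per-anchor terms keeps the objective symmetric across modalities, which matches the "any number of modalities" claim in the description; the loss works with any encoders, hence "architecture-agnostic".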
Alternatives and similar repositories for symile:
Users interested in symile are comparing it to the libraries listed below.
- BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature ☆28 · Updated this week
- ☆41 · Updated last year
- "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA" ☆15 · Updated 6 months ago
- MultiModN – Multimodal, Multi-Task, Interpretable Modular Networks (NeurIPS 2023) ☆30 · Updated last year
- More dimensions = More fun ☆21 · Updated 5 months ago
- [ICLR 2023] MultiViz: Towards Visualizing and Understanding Multimodal Models ☆94 · Updated 4 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆72 · Updated 4 months ago
- Holistic evaluation of multimodal foundation models ☆42 · Updated 5 months ago
- ☆43 · Updated 3 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆63 · Updated last year
- ☆40 · Updated this week
- ViLLA: Fine-grained vision-language representation learning from real-world data ☆39 · Updated last year
- [NeurIPS 2024] MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models ☆42 · Updated last month
- ☆36 · Updated last week
- PyTorch implementation of the paper "Learning to (Learn at Test Time): RNNs with Expressive Hidden States" ☆24 · Updated last week
- Code for the paper "VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning" ☆34 · Updated last week
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants ☆25 · Updated last month
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆19 · Updated 3 months ago
- Code and benchmark for the paper "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24] ☆47 · Updated last month
- [NeurIPS 2023, ICMI 2023] Quantifying & Modeling Multimodal Interactions ☆66 · Updated 2 months ago
- Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning" ☆79 · Updated 9 months ago
- Visual question answering prompting recipes for large vision-language models ☆23 · Updated 4 months ago
- Official repository of the paper "UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities" ☆62 · Updated 3 weeks ago
- MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization ☆15 · Updated last month
- [arXiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning ☆78 · Updated 8 months ago
- Official implementation of "Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation" ☆18 · Updated last week
- Code and datasets for "What's "up" with vision-language models? Investigating their struggle with spatial reasoning" ☆38 · Updated 10 months ago
- Implementation of the paper "BRAVE: Broadening the visual encoding of vision-language models" ☆22 · Updated last week
- ☆22 · Updated 8 months ago
- ☆61 · Updated 6 months ago