microsoft/LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

☆2,223

Alternatives and similar repositories for LLaVA-Med

Users that are interested in LLaVA-Med are comparing it to the libraries listed below.

xiaoman-zhang / PMC-VQA
PMC-VQA is a large-scale medical visual question-answering dataset, which contains 227k VQA pairs of 149k images that cover various modal…
☆236 · Dec 6, 2024 · Updated last year
taokz / BiomedGPT
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
☆708 · Jul 8, 2025 · Updated last year
chaoyi-wu / RadFM
The official code for "Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data".
☆559 · Jul 25, 2025 · Updated 11 months ago
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,932 · Aug 12, 2024 · Updated last year
snap-stanford / med-flamingo
☆451 · Aug 23, 2023 · Updated 2 years ago
RyanWangZf / MedCLIP
EMNLP'22 | MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts
☆694 · Apr 12, 2024 · Updated 2 years ago
UCSC-VLAA / MedTrinity-25M
[ICLR 2025] This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations…
☆413 · Jul 11, 2025 · Updated last year
richard-peng-xia / awesome-multimodal-in-medical-imaging
A collection of resources on applications of multi-modal learning in medical imaging.
☆971 · Jun 4, 2026 · Updated last month
cambridgeltl / visual-med-alpaca
Visual Med-Alpaca is an open-source, multi-modal foundation model designed specifically for the biomedical domain, built on the LLaMa-7B.…
☆394 · Mar 11, 2024 · Updated 2 years ago
bowang-lab / MedSAM
Segment Anything in Medical Images
☆4,354 · May 7, 2025 · Updated last year
mbzuai-oryx / XrayGPT
[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.
☆529 · Aug 8, 2024 · Updated last year
ljwztc / CLIP-Driven-Universal-Model
[ICCV 2023] CLIP-Driven Universal Model; Rank first in MSD Competition.
☆677 · Oct 24, 2025 · Updated 8 months ago
razorx89 / roco-dataset
Radiology Objects in COntext (ROCO): A Multimodal Image Dataset
☆249 · Apr 5, 2022 · Updated 4 years ago
MediaBrain-SJTU / MedKLIP
The official code for MedKLIP: Medical Knowledge Enhanced Language-Image Pre-Training in Radiology. We propose to leverage medical specif…
☆181 · Sep 4, 2023 · Updated 2 years ago
FreedomIntelligence / HuatuoGPT-Vision
Medical Multimodal LLMs
☆400 · Apr 23, 2025 · Updated last year
WeixiongLin / PMC-CLIP
The official codes for "PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents"
☆241 · Aug 30, 2024 · Updated last year
BAAI-DCAI / M3D
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models
☆451 · Apr 13, 2025 · Updated last year
Stanford-AIMI / CheXagent
[Arxiv-2024] CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation
☆229 · Jan 7, 2025 · Updated last year
LLaVA-VL / LLaVA-NeXT
☆4,710 · Jun 15, 2026 · Updated last month
richard-peng-xia / MMed-RAG
[ICLR'25] MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
☆335 · Jan 22, 2025 · Updated last year
microsoft / BiomedParse
BiomedParse: A Foundation Model for Joint Segmentation, Detection, and Recognition of Biomedical Objects Across Nine Modalities
☆685 · Jan 22, 2026 · Updated 6 months ago
HKU-MedAI / MGCA
[NeurIPS'22] Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning
☆180 · May 16, 2024 · Updated 2 years ago
LLaVA-VL / LLaVA-Med-preview
☆39 · Nov 10, 2023 · Updated 2 years ago
OpenGVLab / SAM-Med2D
Official implementation of SAM-Med2D
☆1,130 · Jun 18, 2024 · Updated 2 years ago
Holipori / MIMIC-Diff-VQA
☆73 · Feb 3, 2025 · Updated last year
Vision-CAIR / MiniGPT-Med
Open-sourced code of MiniGPT-Med
☆140 · Apr 22, 2026 · Updated 3 months ago
zhaozh10 / Awesome-CLIP-in-Medical-Imaging
A Survey on CLIP in Medical Imaging
☆515 · Mar 26, 2025 · Updated last year
ibrahimethemhamamci / CT-CLIP
Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography
☆405 · Jul 18, 2025 · Updated last year
LLaVA-VL / LLaVA-Plus-Codebase
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
☆769 · Feb 1, 2024 · Updated 2 years ago
OpenGVLab / Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing imag…
☆564 · Apr 21, 2024 · Updated 2 years ago
ttanida / rgrg
Code for the CVPR paper "Interactive and Explainable Region-guided Radiology Report Generation"
☆214 · Jun 23, 2024 · Updated 2 years ago
pengfeiliHEU / MUMC
This repository is made for the paper: Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medica…
☆48 · Jul 10, 2024 · Updated 2 years ago
sarahESL / PubMedCLIP
Fine-tuning CLIP using ROCO dataset which contains image-caption pairs from PubMed articles.
☆183 · Aug 13, 2024 · Updated last year
williamliujl / Qilin-Med-VL
The first Chinese medical large vision-language model designed to integrate the analysis of textual and visual data
☆65 · Dec 1, 2023 · Updated 2 years ago
BradyFU / Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
☆17,954 · Jul 2, 2026 · Updated 2 weeks ago
allenai / medicat
Dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references
☆175 · Feb 19, 2026 · Updated 5 months ago
hyn2028 / llm-cxr
Official code for "LLM-CXR: Instruction-Finetuned LLM for CXR Image Understanding and Generation"
☆143 · Nov 11, 2023 · Updated 2 years ago
BUAADreamer / Chinese-LLaVA-Med
Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine
☆112 · May 22, 2024 · Updated 2 years ago
chaoyi-wu / PMC-LLaMA
The official codes for "PMC-LLaMA: Towards Building Open-source Language Models for Medicine"
☆679 · Jul 8, 2024 · Updated 2 years ago