CAMMA-public / SSG-VQA
[IPCAI'24 Best Paper] Advancing Surgical VQA with Scene Graph Knowledge
☆47 · May 23, 2025 · Updated 8 months ago
Alternatives and similar repositories for SSG-VQA
Users that are interested in SSG-VQA are comparing it to the libraries listed below
- [MedIA'25] Learning multi-modal representations by watching hundreds of surgical video lectures ☆79 · Sep 14, 2025 · Updated 4 months ago
- Official repository of the GraSP dataset and implementation of TAPIS ☆50 · Dec 31, 2024 · Updated last year
- MCPL: Multi-modal Collaborative Prompt Learning for Medical Vision-Language Model (Initial Version) ☆13 · Apr 17, 2024 · Updated last year
- ☆11 · Jun 21, 2025 · Updated 7 months ago
- [MICCAI 2024] Official dataset release for "EgoSurgery: A Dataset for Surgical Video Understanding from Egocentric Open Surgery Videos" ☆28 · Nov 25, 2024 · Updated last year
- ☆30 · Sep 16, 2024 · Updated last year
- [ACL 2025] ⚖️ Temporally-aware MLLM for Biomedical Radiology Analysis and Report Generation. Flexible toolkit with MLLM backbone support,… ☆27 · Jan 10, 2026 · Updated last month
- [ 🎯 NeurIPS 2025 ] 3D-RAD 🩻: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks ☆27 · Oct 28, 2025 · Updated 3 months ago
- Code for the paper "RECAP: Towards Precise Radiology Report Generation via Dynamic Disease Progression Reasoning" (EMNLP'23 Findings). ☆28 · Jun 12, 2025 · Updated 8 months ago
- ☆32 · Mar 25, 2025 · Updated 10 months ago
- Code for the paper "RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection" (ACL'25). ☆33 · Jul 23, 2025 · Updated 6 months ago
- The repo of ASGMVLP ☆18 · Jan 16, 2026 · Updated 3 weeks ago
- The official repository of paper named 'A Refer-and-Ground Multimodal Large Language Model for Biomedicine' ☆34 · Nov 5, 2024 · Updated last year
- ☆37 · Apr 5, 2025 · Updated 10 months ago
- Code for the paper "ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation" (EMNLP'2… ☆17 · Dec 11, 2024 · Updated last year
- ☆16 · Sep 17, 2025 · Updated 4 months ago
- The official repository of the paper 'Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine' ☆119 · Jan 9, 2025 · Updated last year
- ☆20 · Jan 3, 2025 · Updated last year
- [ACL 2025 Findings] "Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA" ☆25 · Feb 21, 2025 · Updated 11 months ago
- [MedIA 2025] - Official repo for the paper: "Scaling up self-supervised learning for improved surgical foundation models" ☆49 · Nov 25, 2025 · Updated 2 months ago
- [EMNLP 2024] This is the code for our paper "BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers". ☆23 · Sep 19, 2024 · Updated last year
- [MICCAI 2024] Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition ☆44 · Aug 28, 2025 · Updated 5 months ago
- Official code of the paper MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environ… ☆52 · Aug 27, 2025 · Updated 5 months ago
- Official code of the paper ORacle: Large Vision-Language Models for Knowledge-Guided Holistic OR Domain Modeling accepted at MICCAI 2024. ☆24 · Jan 6, 2025 · Updated last year
- ☆46 · Apr 25, 2024 · Updated last year
- LLaVa Version of RaDialog ☆25 · May 27, 2025 · Updated 8 months ago
- [NeurIPS'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models ☆77 · Dec 4, 2024 · Updated last year
- The official codes for "Can Modern LLMs Act as Agent Cores in Radiology Environments?" ☆28 · Jan 22, 2025 · Updated last year
- ☆32 · Oct 6, 2024 · Updated last year
- Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning, release the dataset and the model weight ☆13 · May 26, 2025 · Updated 8 months ago
- 🩻 NV-Reason-CXR-3B is a specialized vision-language model designed for medical reasoning and interpretation of chest X-ray images. ☆39 · Oct 29, 2025 · Updated 3 months ago
- Papers from the intersection of surgery and data science / machine learning ☆15 · Jan 28, 2024 · Updated 2 years ago
- [ACM MM 2025 🔥🔥 ] MIRA: A first-of-its-kind medical RAG framework that fuses image features and retrieved knowledge with dynamic contex… ☆18 · Aug 28, 2025 · Updated 5 months ago
- Rethinking Whole-Body CT Image Interpretation: An Abnormality-Centric Approach ☆19 · Nov 17, 2025 · Updated 2 months ago
- ☆29 · Feb 7, 2024 · Updated 2 years ago
- Codes and Pre-trained models for RAMM: Retrieval-augmented Biomedical Visual Question Answering with Multi-modal Pre-training [ACM MM 202… ☆29 · Nov 2, 2023 · Updated 2 years ago
- Learning by Aligning Videos in Time (CVPR 2021) ☆14 · Sep 10, 2023 · Updated 2 years ago
- ☆18 · Nov 11, 2022 · Updated 3 years ago
- [CHIL 2024] ViewXGen: Vision-Language Generative Model for View-Specific Chest X-ray Generation ☆55 · Dec 4, 2024 · Updated last year