longbai1006 / CAT-ViL
Official implementation of “CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery”, MICCAI 2023
☆15 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for CAT-ViL
- OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding ☆29 · Updated last week
- Official implementation of "Surgical-VQLA: Transformer with Gated Vision-Language Embedding for Visual Question Localized-Answering in Ro… ☆20 · Updated 4 months ago
- This is the official implementation of "Clustering Propagation for Universal Medical Image Segmentation" (accepted at CVPR 2024). ☆30 · Updated 7 months ago
- [MICCAI 2024] Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition ☆20 · Updated 3 weeks ago
- The implementation of SSTAN on the SUN-SEG dataset. (Semi-supervised Spatial Temporal Attention Network for Video Polyp Segmentation, MICCAI … ☆10 · Updated 3 months ago
- This code is the implementation of the MICCAI 2024 paper "Robust Semi-Supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration… ☆17 · Updated last week
- This repository contains the code accompanying the paper "A Self-Guided Framework for Radiology Report Generation", accepted by MICCAI 20… ☆17 · Updated 8 months ago
- Official repository of the GraSP dataset and implementation of TAPIS ☆17 · Updated last month
- Localized representation learning from Vision and Text (LoVT) ☆26 · Updated 4 months ago
- TMI 2023: Less is More: Surgical Phase Recognition from Timestamp Supervision ☆15 · Updated last year
- AI-SAM: Automatic and Interactive Segment Anything Model ☆15 · Updated 11 months ago
- This repository contains the code associated with our 2023 TMI paper "Latent Graph Representations for Critical View of Safety Assessment… ☆24 · Updated last month
- Official code of Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [ICML 2024] ☆12 · Updated 5 months ago
- Multi-Aspect Vision Language Pretraining - CVPR2024 ☆64 · Updated 3 months ago
- Implementation of "VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation" ☆13 · Updated last year
- [MICCAI'22] Contrastive Transformer-based Multiple Instance Learning for Weakly Supervised Polyp Frame Detection. ☆40 · Updated 11 months ago
- This is the repository for the ICLR 2023 accepted paper "Medical Image Understanding With Pretrained VLM" ☆29 · Updated last year
- MICCAI 2022: Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions ☆11 · Updated 2 years ago
- Official Implementation of "CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning" on MIC… ☆13 · Updated 2 months ago
- Official code for "BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification" ☆22 · Updated 7 months ago
- Official implementation of SurgicalPart-SAM (SP-SAM) ☆11 · Updated 7 months ago
- [CVPR2024] PairAug: What Can Augmented Image-Text Pairs Do for Radiology? ☆27 · Updated last week