longbai1006/CAT-ViL

Official implementation of “CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery”, MICCAI 2023

☆18

Alternatives and similar repositories for CAT-ViL

Users who are interested in CAT-ViL are comparing it to the repositories listed below.

longbai1006 / Surgical-VQLA
Official implementation of "Surgical-VQLA: Transformer with Gated Vision-Language Embedding for Visual Question Localized-Answering in Ro…
☆27 · Jul 7, 2024 · Updated 2 years ago
longbai1006 / EndoUIC
Official implementation of "EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy", MICCAI 2…
☆12 · Jan 29, 2026 · Updated 5 months ago
longbai1006 / Surgical-VQLAPlus
Official Implementation of "Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question Localized-Answering i…
☆15 · May 6, 2025 · Updated last year
xmed-lab / TimeStamp-Surgical
TMI 2023: Less is More: Surgical Phase Recognition from Timestamp Supervision
☆22 · Feb 9, 2023 · Updated 3 years ago
Lornatang / RCAN-PyTorch
PyTorch implements `Image Super-Resolution Using Very Deep Residual Channel Attention Networks` paper.
☆15 · Dec 6, 2022 · Updated 3 years ago
lalithjets / Surgical_VQA
Surgical Visual Question Answering. A transformer-based surgical VQA model. Official Implementation of "Surgical-VQA: Visual Question Answ…
☆68 · Mar 27, 2023 · Updated 3 years ago
CAMMA-public / SurgLatentGraph
This repository contains the code associated with our 2023 TMI paper "Latent Graph Representations for Critical View of Safety Assessment…
☆38 · Sep 17, 2025 · Updated 10 months ago
haozhiwen-fighting / Contrast-enhanced-Ultrasound-for-Thyroid-Nodules-Diagnosis
☆10 · Jun 6, 2024 · Updated 2 years ago
lrhan / CIDH-caffe
This is a source code for cohesion intensive deep hashing (CIDH)
☆10 · Feb 21, 2020 · Updated 6 years ago
RoyHirsch / endossl
Code and models for MICCAI23 paper: "Self-Supervised Learning for Endoscopy Video Analysis".
☆25 · Oct 2, 2023 · Updated 2 years ago
olmozavala / DCE_MRI_Preproc
This is the Matlab code that pre-process a series of DCE MRI images of the breast. It first does image registration and then image classi…
☆16 · Oct 17, 2015 · Updated 10 years ago
arnaudjudge / RL4Seg
Domain adaptation framework for segmentation via reinforcement learning.
☆16 · Jul 17, 2026 · Updated last week
HERIUN / vsumm-reinforce_re
This repo contains the Pytorch implementation of the AAAI'18 paper - Deep Reinforcement Learning for Unsupervised Video Summarization wit…
☆11 · Jun 5, 2023 · Updated 3 years ago
Tim-101 / Text-and-Image-Classification
Classify image and text with ResNet and BERT models using Pytorch
☆13 · Jul 7, 2020 · Updated 6 years ago
hotaki-lab / Product-Review-Sentiment-Analysis
The goal of this project is to design a classifier to use for sentiment analysis of product reviews. Our training set consists of reviews…
☆10 · Jul 8, 2021 · Updated 5 years ago
CAMMA-public / Endoscapes
Official Repository for the Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment
☆63 · Sep 17, 2025 · Updated 10 months ago
leeh43 / Singularity_Deeplesion
☆11 · Jun 5, 2021 · Updated 5 years ago
aaronhan223 / FuseMoE
Implementation of FuseMoE for FlexiModal Fusion, NeurIPS'24
☆35 · Mar 26, 2026 · Updated 3 months ago
cl-victor1 / Me
I fine-tuned (p-tuning) Tsinghua’s open-source large language model, ChatGLM2-6B, using several years of my WeChat chat history. Inspired…
☆12 · Mar 6, 2024 · Updated 2 years ago
franciszchen / SCA-Net
☆10 · Oct 7, 2023 · Updated 2 years ago
JeunyuLi / MUAF
☆15 · Jun 27, 2023 · Updated 3 years ago
ci-ber / RA
Generalizing Unsupervised Anomaly Detection: Towards Unbiased Pathology Screening. #MIDL2023.
☆30 · Sep 1, 2023 · Updated 2 years ago
BearCleverProud / MoME
Repository for Mixture of Multimodal Experts
☆52 · Aug 3, 2024 · Updated last year
AlfredQin / STNet
☆17 · Jul 18, 2023 · Updated 3 years ago
sha168 / ADNet
Code for the paper "Anomaly Detection-Inspired Few-Shot Medical Image Segmentation Through Self-Supervision With Supervoxels".
☆43 · Sep 26, 2022 · Updated 3 years ago
XuMengyaAmy / SwinMLP_TranCAP
☆13 · Jun 26, 2022 · Updated 4 years ago
GauravGajbhiye / SCAMET_RSIC
This is tensorflow 2.2 based SCAMET framework for remote sensing image captioning.
☆13 · Aug 10, 2023 · Updated 2 years ago
CAMMA-public / cholect50
A repository for surgical action triplet dataset. Data are videos of laparoscopic cholecystectomy that have been annotated with <instrume…
☆85 · Sep 17, 2025 · Updated 10 months ago
cleary-lab / CISI
code for composite in situ imaging (cisi) analysis
☆12 · Oct 26, 2020 · Updated 5 years ago
nchucvml / STVT
Video Summarization With Spatiotemporal Vision Transformer
☆23 · Jul 5, 2023 · Updated 3 years ago
huwan / CityU-Thesis
A collection of LaTeX thesis template for students at City University of Hong Kong.
☆22 · Jun 14, 2025 · Updated last year
robot-Yang / Ewenwan_vision
☆17 · Feb 19, 2019 · Updated 7 years ago
COMP6248-Reproducability-Challenge / HOW-MUCH-POSITION-INFORMATION-DO-CONVOLUTIONAL-NEURAL-NETWORKS-ENCODE-
Reproduce the paper named 'HOW MUCH POSITION INFORMATION DO CONVOLUTIONAL NEURAL NETWORKS ENCODE?', which published as a conference paper…
☆24 · May 29, 2020 · Updated 6 years ago
zjr2000 / Untrimmed-Video-Feature-Extractor
A simple and effective feature extractor for untrimmed videos
☆13 · Sep 1, 2022 · Updated 3 years ago
GX77 / TextKG
☆11 · Jun 27, 2023 · Updated 3 years ago
jinhong-ni / DEQFusion
PyTorch Implementation of Deep Equilibrium Multimodal Fusion
☆20 · Aug 8, 2023 · Updated 2 years ago
smallboy-code / Breast-cancer-dataset
☆46 · Sep 14, 2023 · Updated 2 years ago
theopsall / Video-Summarization
Multimodal summarization of user-generated videos from wearable cameras
☆23 · Jun 22, 2025 · Updated last year
xmed-lab / DistillingSelf
MICCAI 2022: Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions
☆13 · Sep 17, 2022 · Updated 3 years ago