cilinyan/VISA

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/cilinyan/VISA)

cilinyan / VISA

[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model

☆213

Alternatives and similar repositories for VISA

Users that are interested in VISA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

cilinyan / ReVOS-api
View on GitHub
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆22Jul 20, 2024Updated 2 years ago
showlab / VideoLISA
View on GitHub
[NeurlPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
☆148Dec 26, 2024Updated last year
haochenheheda / LVVIS
View on GitHub
Large-Vocabulary Video Instance Segmentation dataset
☆99Jul 5, 2024Updated 2 years ago
rkzheng99 / ViLLa
View on GitHub
Video Reasoning Segmentation
☆26Nov 29, 2024Updated last year
congvvc / InstructSeg
View on GitHub
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆56Feb 10, 2025Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
dogehhh / ReCLIP
View on GitHub
[CVPR'24 & IJCV'25] Pytorch Implementation for ReCLIP
☆58Aug 27, 2025Updated 10 months ago
heshuting555 / DsHmp
View on GitHub
[CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
☆83Jul 24, 2024Updated 2 years ago
cilinyan / PiClick
View on GitHub
Official PyTorch implementation of PiClick: Picking the desired mask in click-based interactive segmentation.
☆26Jul 2, 2024Updated 2 years ago
Tapall-AI / MeViS_Track_Solution_2024
View on GitHub
[CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
☆31Oct 18, 2024Updated last year
buxiangzhiren / VD-IT
View on GitHub
Code for the paper "Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation", ECCV 2024
☆48Sep 28, 2024Updated last year
congvvc / HyperSeg
View on GitHub
[CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
☆182Dec 13, 2024Updated last year
gaomingqi / Awesome-Video-Object-Segmentation
View on GitHub
🔥 Latest advances in Video Object Segmentation (VOS) – papers, datasets, and projects.
☆515Jul 13, 2026Updated last week
zamling / PSALM
View on GitHub
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269Dec 30, 2024Updated last year
GLUS-video / GLUS
View on GitHub
[CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…
☆70Jun 23, 2025Updated last year
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
Tavarich / Awesome-Referring-Video-Object-Segmentation
View on GitHub
A list of referring video object segmentation papers
☆63Jun 28, 2026Updated 3 weeks ago
dahyun-kang / lavg
View on GitHub
[ECCV'24] Official PyTorch implementation of In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation
☆51Sep 24, 2024Updated last year
JIA-Lab-research / Seg-Zero
View on GitHub
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆635Jan 17, 2026Updated 6 months ago
SitongGong / VRS-HQ
View on GitHub
High Quality Video Reasoning Segmentation
☆151Nov 24, 2025Updated 8 months ago
Shengcao-Cao / groundLMM
View on GitHub
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆47Oct 19, 2025Updated 9 months ago
see-say-segment / sesame
View on GitHub
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47Jun 16, 2024Updated 2 years ago
kumuji / Sa2VA-i
View on GitHub
Sa2VA-i is an improved version of the popular Sa2VA model
☆17Nov 25, 2025Updated 7 months ago
bo-miao / SgMg
View on GitHub
[ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation.
☆117Apr 9, 2025Updated last year
JIA-Lab-research / LISA
View on GitHub
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
☆2,667Feb 16, 2025Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
congvvc / LaSagnA
View on GitHub
Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries".
☆63Apr 29, 2024Updated 2 years ago
yuxiangbao / NBN
View on GitHub
[IEEE TIP 2024] Normalizing Batch Normalization for Long-Tailed Recognition
☆19Jan 8, 2025Updated last year
henghuiding / MeViS
View on GitHub
[ICCV 2023 & TPAMI 2025] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
☆524Jan 8, 2026Updated 6 months ago
gaomingqi / VOS-Review
View on GitHub
Datasets and Papers (with codes) discussed in "Deep Learning for Video Object Segmentation: A Review", Artificial Intelligence Review, 20…
☆54Oct 30, 2023Updated 2 years ago
heshuting555 / RefMask3D
View on GitHub
[ACM MM-2024] RefMask3D: Language-Guided Transformer for 3D Referring Segmentation
☆65Jul 29, 2024Updated last year
KainingYing / CTVIS
View on GitHub
[ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation
☆83Oct 15, 2023Updated 2 years ago
qirui-chen / RGA3-release
View on GitHub
[ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring
☆24Aug 8, 2025Updated 11 months ago
LeapLabTHU / GSVA
View on GitHub
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
☆166Sep 12, 2024Updated last year
bytedance / Sa2VA
View on GitHub
Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1-st solution for LSVOS)
☆1,650Jun 19, 2026Updated last month
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
lizhou-cs / mglmm
View on GitHub
☆32Jun 14, 2026Updated last month
dengandong / GroundMoRe
View on GitHub
☆18May 18, 2026Updated 2 months ago
mrwu-mac / R-Bench
View on GitHub
[ICML2024] Repo for the paper `Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models'
☆24Jan 1, 2025Updated last year
MaverickRen / PixelLM
View on GitHub
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆273Feb 11, 2025Updated last year
mc-lan / ClearCLIP
View on GitHub
[ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference
☆99Mar 26, 2025Updated last year
RobertLuo1 / NeurIPS2023_SOC
View on GitHub
[NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation
☆33Mar 16, 2024Updated 2 years ago
John-Wendell / USB_Rec
View on GitHub
The official code of Recsys'25 paper 'USB-Rec: An Effective Framework for Improving Conversational Recommendation Capability of Large Lan…
☆24Mar 13, 2026Updated 4 months ago