rkzheng99/ViLLa

Video Reasoning Segmentation

☆26

Alternatives and similar repositories for ViLLa

Users that are interested in ViLLa are comparing it to the libraries listed below.

showlab / VideoLISA
[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
☆148 · Dec 26, 2024 · Updated last year
cilinyan / VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆213 · Aug 5, 2024 · Updated last year
cilinyan / ReVOS-api
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆22 · Jul 20, 2024 · Updated 2 years ago
Shengcao-Cao / groundLMM
Emergent Visual Grounding in Large Multimodal Models Without Grounding Supervision
☆47 · Oct 19, 2025 · Updated 9 months ago
rkzheng99 / TMT-VIS
Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation (NeurIPS 23)
☆12 · May 7, 2025 · Updated last year
SitongGong / VRS-HQ
High Quality Video Reasoning Segmentation
☆151 · Nov 24, 2025 · Updated 8 months ago
GLUS-video / GLUS
[CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…
☆70 · Jun 23, 2025 · Updated last year
zamling / PSALM
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆269 · Dec 30, 2024 · Updated last year
baoxiaoyi / CoReS
code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆23 · Nov 24, 2025 · Updated 8 months ago
haochenheheda / LVVIS
Large-Vocabulary Video Instance Segmentation dataset
☆99 · Jul 5, 2024 · Updated 2 years ago
mbzuai-oryx / VideoGLaMM
[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
☆104 · Apr 14, 2025 · Updated last year
LeapLabTHU / GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
☆166 · Sep 12, 2024 · Updated last year
MCG-NJU / SAM2-Plus
SAM 2++: Tracking Anything at Any Granularity
☆70 · Dec 15, 2025 · Updated 7 months ago
jimmy-dq / SimVOS
☆14 · May 25, 2024 · Updated 2 years ago
berkeley-hipie / segllm
Code release for "SegLLM: Multi-round Reasoning Segmentation"
☆129 · Feb 20, 2025 · Updated last year
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47 · Jun 16, 2024 · Updated 2 years ago
bo-miao / HTR
[TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory
☆19 · Apr 9, 2025 · Updated last year
weimengmeng1999 / AdapterSIS
Enhancing Surgical Instrument Segmentation: Integrating Vision Transformer Insights with Adapter
☆13 · Mar 21, 2024 · Updated 2 years ago
skynbe / Refer-Youtube-VOS
Refer-Youtube-VOS dataset
☆27 · Mar 10, 2026 · Updated 4 months ago
RobertLuo1 / iccv2023_RVOS_Challenge
[ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition
☆14 · Jan 1, 2024 · Updated 2 years ago
WeitaiKang / SegVG
[ECCV 2024] SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding
☆63 · Oct 22, 2024 · Updated last year
jingjing0419 / Efficient-SAM2
[ICLR 2026] The official implementation of "Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval"
☆42 · Feb 9, 2026 · Updated 5 months ago
congvvc / LaSagnA
Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries".
☆63 · Apr 29, 2024 · Updated 2 years ago
Ali2500 / ViCaS
ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)
☆21 · Apr 2, 2025 · Updated last year
NiFangBaAGe / DATTT
[CVPR 2024] Depth-aware Test-Time Training for Zero-shot Video Object Segmentation
☆29 · Apr 28, 2025 · Updated last year
AI-Application-and-Integration-Lab / SAM4MLLM
[ECCV 2024] SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation
☆51 · Mar 20, 2025 · Updated last year
Hydragon516 / GSANet
[CVPR 2024] Guided Slot Attention for Unsupervised Video Object Segmentation
☆66 · Dec 23, 2024 · Updated last year
yoxu515 / MITS
☆21 · Jul 25, 2024 · Updated 2 years ago
linsun449 / iseg.code
This repo is the official implementation of iSeg: An Iterative Refinement-based Framework for Training-free Segmentation.
☆42 · May 25, 2026 · Updated 2 months ago
jasongief / TGS-Agent
[2026 AAAI] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
☆20 · Nov 8, 2025 · Updated 8 months ago
Dmmm1997 / MomentSeg
[ECCV2026] MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
☆24 · Jun 19, 2026 · Updated last month
bhpfelix / Path-Aggregation-Network-for-Monocular-Depth-Estimation
Adopting Path Aggregation Network for Monocular Depth Estimation - PyTorch
☆26 · Mar 30, 2018 · Updated 8 years ago
Restricted-Memory / RMem
official repository of CVPR 2024 paper, RMem: Restricted Memory Banks Improve Video Object Segmentation
☆53 · Jun 18, 2026 · Updated last month
thunlp / Migician
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
☆90 · May 20, 2025 · Updated last year
kagawa588 / GvSeg
This is the official implementation of "GvSeg: General and Task-Oriented Video Segmentation" (Accepted at ECCV 2024).
☆18 · Jul 15, 2024 · Updated 2 years ago
fanghaook / Awesome-Video-Instance-Segmentation
Awesome video instance segmentation papers
☆58 · Mar 12, 2026 · Updated 4 months ago
zoezheng126 / Spatio-Temporal-LLM
☆19 · Aug 7, 2025 · Updated 11 months ago
rui-qian / READ
Rui Qian, Xin Yin, Dejing Dou†: Reasoning to Attend: Try to Understand How <SEG> Token Works (CVPR 2025)
☆54 · Feb 4, 2026 · Updated 5 months ago
keeplearning-again / MatchSeg
Official repository of "MatchSeg"
☆12 · Mar 22, 2024 · Updated 2 years ago