OpenGVLab/MUTR

「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation

☆85

Alternatives and similar repositories for MUTR

Users interested in MUTR compare it with the repositories listed below.

shilinyan99 / PanoVOS
「ECCV 2024」 PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation
☆21 · Jul 2, 2024 · Updated 2 years ago
shilinyan99 / CrossLMM
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
☆25 · Dec 21, 2025 · Updated 7 months ago
buxiangzhiren / VD-IT
Code for the paper "Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation", ECCV 2024
☆48 · Sep 28, 2024 · Updated last year
RobertLuo1 / iccv2023_RVOS_Challenge
[ICCV 2023 Workshop] The Official Implementation of The First Prize Solution for RVOS Competition
☆14 · Jan 1, 2024 · Updated 2 years ago
dzh19990407 / LBDT
CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
☆24 · Aug 12, 2022 · Updated 3 years ago
JerryX1110 / awesome-rvos
Referring Video Object Segmentation / Multi-Object Tracking Repo
☆91 · Jul 27, 2023 · Updated 2 years ago
JaaackHongggg / WorldSense
WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs
☆50 · Jul 12, 2026 · Updated last week
CaraJ7 / DraCo
Official Repository for Paper: DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation
☆17 · Dec 7, 2025 · Updated 7 months ago
wjn922 / ReferFormer
[CVPR2022] Official Implementation of ReferFormer
☆355 · Feb 15, 2025 · Updated last year
wudongming97 / OnlineRefer
[ICCV 2023] OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation
☆58 · Oct 7, 2023 · Updated 2 years ago
GeWu-Lab / Ref-AVS
The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024
☆50 · Oct 12, 2025 · Updated 9 months ago
asudahkzj / Wnet
Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
☆24 · Sep 6, 2022 · Updated 3 years ago
GeWu-Lab / TSPM
Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024.
☆17 · Oct 25, 2024 · Updated last year
heshuting555 / DsHmp
[CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
☆83 · Jul 24, 2024 · Updated last year
bo-miao / SgMg
[ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation.
☆112 · Apr 9, 2025 · Updated last year
southnx / ACoLP
Open Set Video HOI detection from Action-centric Chain-of-Look Prompting, ICCV2023
☆12 · Oct 3, 2023 · Updated 2 years ago
lxa9867 / R2VOS
Robust Referring Video Object Segmentation with Cyclic Structural Consistency [ICCV 2023]
☆30 · Mar 13, 2024 · Updated 2 years ago
vvvb-github / AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
☆74 · Mar 6, 2025 · Updated last year
rongfu-dsb / MPG-SAM2
[ICCV 2025] MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
☆23 · Sep 5, 2025 · Updated 10 months ago
Tapall-AI / MeViS_Track_Solution_2024
[CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
☆31 · Oct 18, 2024 · Updated last year
RobertLuo1 / NeurIPS2023_SOC
[NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation
☆33 · Mar 16, 2024 · Updated 2 years ago
haochenheheda / Training-Code-of-STM
Training code of Spatial Time Memory Network. Semi-supervised video object segmentation.
☆170 · Apr 22, 2021 · Updated 5 years ago
AlyssaYoung / AVQA
ACM MM 2022 paper: AVQA: A Dataset for Audio-Visual Question Answering on Videos
☆15 · Aug 17, 2023 · Updated 2 years ago
OpenGVLab / DDPS
Official Implementation of "Denoising Diffusion Semantic Segmentation with Mask Prior Modeling"
☆76 · Jul 27, 2023 · Updated 2 years ago
ShuangLI59 / weakly-supervised-human-object-detection-video
☆26 · Oct 8, 2021 · Updated 4 years ago
OpenGVLab / Official-ConvMAE-Det
☆18 · Aug 23, 2022 · Updated 3 years ago
FeipengMa6 / VLoRA
[NeurIPS 2024] Visual Perception by Large Language Model’s Weights
☆56 · Mar 31, 2025 · Updated last year
wangbo-zhao / 2022CVPR-MMMMTBVS
Code for the CVPR2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation"
☆19 · Feb 19, 2023 · Updated 3 years ago
rt219 / The-Emergence-of-Objectness
This is the official released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos, which has bee…
☆53 · Apr 14, 2023 · Updated 3 years ago
gaomingqi / Awesome-Video-Object-Segmentation
🔥 Latest advances in Video Object Segmentation (VOS) – papers, datasets, and projects.
☆513 · Jul 13, 2026 · Updated last week
gaomingqi / VOS-Review
Datasets and Papers (with codes) discussed in "Deep Learning for Video Object Segmentation: A Review", Artificial Intelligence Review, 20…
☆54 · Oct 30, 2023 · Updated 2 years ago
end-of-the-century / Cardiac
☆10 · Nov 12, 2020 · Updated 5 years ago
Tavarich / Awesome-Referring-Video-Object-Segmentation
A list of referring video object segmentation papers
☆63 · Jun 28, 2026 · Updated 3 weeks ago
miranheo / GenVIS
[CVPR'23] A Generalized Framework for Video Instance Segmentation
☆136 · Jan 4, 2024 · Updated 2 years ago
yannqi / COMBO-AVS
[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…
☆40 · Apr 20, 2025 · Updated last year
mttr2021 / MTTR
☆655 · Mar 4, 2024 · Updated 2 years ago
wusize / F-LMM
[CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models
☆115 · May 29, 2025 · Updated last year
jibo27 / MemDeblur
Multi-Scale Memory-Based Video Deblurring, CVPR 2022
☆31 · May 9, 2022 · Updated 4 years ago
appletea233 / AL-Ref-SAM2
[AAAI 2025] AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video…
☆93 · Dec 23, 2024 · Updated last year