henghuiding / Awesome-Multimodal-Referring-Segmentation
Multimodal Referring Segmentation
☆197 Updated last month
Alternatives and similar repositories for Awesome-Multimodal-Referring-Segmentation
Users who are interested in Awesome-Multimodal-Referring-Segmentation are comparing it to the repositories listed below.
- [CVPR-2024] Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation ☆86 Updated last year
- [CVPR-2023] Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation ☆190 Updated 2 years ago
- [ICCV 2025] MOVE: Motion-Guided Few-Shot Video Object Segmentation ☆85 Updated 3 months ago
- A benchmark dataset for GRES and GREC [CVPR2023 Highlight] ☆242 Updated last month
- Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens" ☆241 Updated 3 weeks ago
- [ACM MM-2024] RefMask3D: Language-Guided Transformer for 3D Referring Segmentation ☆66 Updated last year
- [NeurIPS 2025] Composed Person Retrieval (CPR) is a new cross-modal retrieval task that aims to identify individuals in large-scale perso… ☆71 Updated 2 months ago
- [ICCV 2023] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes ☆362 Updated 3 months ago
- [CVPR-2023] Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation ☆18 Updated 2 years ago
- [TIP-2023] Prototype Adaption and Projection for Few- and Zero-shot 3D Point Cloud Semantic Segmentation ☆82 Updated 2 years ago
- [ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation ☆82 Updated 3 months ago
- [ICCV 2025] AnyI2V: Animating Any Conditional Image with Motion Control Generation ☆120 Updated 4 months ago
- [ICCV2021 & TPAMI2023] Vision-Language Transformer and Query Generation for Referring Segmentation ☆360 Updated 4 years ago
- [ICCV 2025] Free-Form Motion Control: Controlling the 6D Poses of Camera and Objects in Video Generation ☆54 Updated 4 months ago
- [ICCV 2023 & TPAMI 2025] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions ☆520 Updated 3 weeks ago
- A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of… ☆180 Updated 3 weeks ago
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati… ☆72 Updated last year
- A list of referring video object segmentation papers ☆57 Updated 7 months ago
- [CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models ☆156 Updated last year
- [CVPR2023 Highlight] GRES: Generalized Referring Expression Segmentation ☆690 Updated last month
- [ICLR2025] Text4Seg: Reimagining Image Segmentation as Text Generation ☆156 Updated last month
- Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference ☆180 Updated last year
- [ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation ☆137 Updated 6 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆200 Updated last year
- [AAAI-2025] The official code of Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation ☆62 Updated 7 months ago
- ☆59 Updated last year
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆75 Updated last year
- The official PyTorch implementation of the CVPR 2023 paper "Contrastive Grouping with Transformer for Referring Image Segmentation". ☆50 Updated last year
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆109 Updated 7 months ago
- ☆157 Updated 2 years ago