[AAAI 2025] AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
☆92Dec 23, 2024Updated last year
Alternatives and similar repositories for AL-Ref-SAM2
Users that are interested in AL-Ref-SAM2 are comparing it to the libraries listed below.
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024☆50Oct 12, 2025Updated 5 months ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆38Oct 11, 2024Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- Code for Linguistic Structure Guided Context Modeling for Referring Image Segmentation, ECCV2020.☆15Oct 2, 2020Updated 5 years ago
- Code for the paper "Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation", ECCV 2024☆47Sep 28, 2024Updated last year
- Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.☆63Feb 2, 2021Updated 5 years ago
- Robust Referring Video Object Segmentation with Cyclic Structural Consistency [ICCV 2023]☆30Mar 13, 2024Updated 2 years ago
- Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024☆27Mar 14, 2026Updated last week
- [ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation.☆111Apr 9, 2025Updated 11 months ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- ☆10Apr 7, 2025Updated 11 months ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆35Nov 2, 2024Updated last year
- This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"☆14Aug 22, 2025Updated 7 months ago
- Tracking with Human-Intent Reasoning☆76Nov 4, 2024Updated last year
- ☆32Mar 1, 2024Updated 2 years ago
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…☆40Apr 20, 2025Updated 11 months ago
- ☆11Mar 11, 2025Updated last year
- [ICCV 2023] OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation☆58Oct 7, 2023Updated 2 years ago
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation☆42Dec 15, 2024Updated last year
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆67Jun 23, 2025Updated 9 months ago
- TrackGPT: Track What You Need in Videos via Text Prompts☆25May 16, 2023Updated 2 years ago
- [CVPR 2024] Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training☆44Apr 13, 2024Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Jul 20, 2024Updated last year
- ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding☆11Aug 12, 2022Updated 3 years ago
- ☆20Jul 25, 2024Updated last year
- [ICCV 2025] MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation☆22Sep 5, 2025Updated 6 months ago
- ☆49Jun 19, 2024Updated last year
- [ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation☆31Dec 4, 2024Updated last year
- [EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆140Aug 21, 2025Updated 7 months ago
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation☆24Aug 12, 2022Updated 3 years ago
- A list of referring video object segmentation papers☆59Jun 6, 2025Updated 9 months ago
- [TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory☆19Apr 9, 2025Updated 11 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆146Dec 26, 2024Updated last year
- [AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer☆73Mar 6, 2025Updated last year
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆32Mar 29, 2024Updated last year
- Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning☆15Dec 12, 2023Updated 2 years ago
- Referring Video Object Segmentation / Multi-Object Tracking Repo☆89Jul 27, 2023Updated 2 years ago
- [NeurIPS 2024] Repository for the paper "OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking".☆27Nov 9, 2024Updated last year