[AAAI 2025] AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation
☆91Dec 23, 2024Updated last year
Alternatives and similar repositories for AL-Ref-SAM2
Users that are interested in AL-Ref-SAM2 are comparing it to the libraries listed below
Sorting:
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆37Oct 11, 2024Updated last year
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024☆50Oct 12, 2025Updated 4 months ago
- Code for the paper "Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation", ECCV 2024☆47Sep 28, 2024Updated last year
- ☆10Apr 7, 2025Updated 10 months ago
- Robust Referring Video Object Segmentation with Cyclic Structural Consistency [ICCV 2023]☆30Mar 13, 2024Updated last year
- [ICCV 2023] Spectrum-guided Multi-granularity Referring Video Object Segmentation.☆111Apr 9, 2025Updated 10 months ago
- TrackGPT: Track What You Need in Videos via Text Prompts☆25May 16, 2023Updated 2 years ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024☆27Mar 26, 2024Updated last year
- [ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation☆31Dec 4, 2024Updated last year
- ☆19Jul 25, 2024Updated last year
- This is a repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"☆13Aug 22, 2025Updated 6 months ago
- ☆11Mar 11, 2025Updated 11 months ago
- Code for Referring Image Segmentation via Cross-Modal Progressive Comprehension, CVPR2020.☆63Feb 2, 2021Updated 5 years ago
- [ICCV 2023] OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation☆58Oct 7, 2023Updated 2 years ago
- Code for Linguistic Structure Guided Context Modeling for Referring Image Segmentation, ECCV2020.☆15Oct 2, 2020Updated 5 years ago
- [NeurIPS 2024] Repository for the paper "OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking".☆27Nov 9, 2024Updated last year
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation☆42Dec 15, 2024Updated last year
- Tracking with Human-Intent Reasoning☆76Nov 4, 2024Updated last year
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…☆40Apr 20, 2025Updated 10 months ago
- Segment Anything with Deictic Prompting☆27May 13, 2025Updated 9 months ago
- ☆48Jun 19, 2024Updated last year
- ☆13Jul 20, 2024Updated last year
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].☆35Nov 2, 2024Updated last year
- A list of referring video object segmentation papers☆57Jun 6, 2025Updated 8 months ago
- ☆18Nov 15, 2024Updated last year
- [TCSVT 2024] Temporally Consistent Referring Video Object Segmentation with Hybrid Memory☆19Apr 9, 2025Updated 10 months ago
- ☆14Jul 15, 2024Updated last year
- [ICRA 2025] LaMOT: Language-Guided Multi-Object Tracking☆29Feb 10, 2025Updated last year
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆66Jun 23, 2025Updated 8 months ago
- [NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation☆19Dec 22, 2024Updated last year
- [EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆138Aug 21, 2025Updated 6 months ago
- ☆33Sep 27, 2024Updated last year
- ☆32Mar 1, 2024Updated 2 years ago
- Referring Video Object Segmentation / Multi-Object Tracking Repo☆90Jul 27, 2023Updated 2 years ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆32Mar 29, 2024Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Jul 20, 2024Updated last year