FudanCVL / SAAS
[AAAI 2026] Segment Anything Across Shots: A Method and Benchmark
☆18Updated this week
Alternatives and similar repositories for SAAS
Users who are interested in SAAS are comparing it to the repositories listed below
- [ACM MM-2024] RefMask3D: Language-Guided Transformer for 3D Referring Segmentation☆65Updated last year
- 「ECCV 2024」 PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation☆21Updated last year
- (ICCV 2025) ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations☆119Updated last week
- Code for the paper "Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation", ECCV 2024☆45Updated last year
- Video Reasoning Segmentation☆27Updated 11 months ago
- [CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…☆60Updated 4 months ago
- Official implementation of "VIRAL: Visual Representation Alignment for MLLMs".☆136Updated 2 months ago
- [MM2024 Oral] 3D-GRES: Generalized 3D Referring Expression Segmentation☆40Updated 11 months ago
- [AAAI 2025] AL-Ref-SAM 2: Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video…☆90Updated 11 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Updated last year
- [IEEE TCSVT] Official PyTorch Implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation.☆47Updated 10 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆140Updated 10 months ago
- [CVPR'2025] EntitySAM: Segment Everything in Video☆53Updated 4 months ago
- [CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"☆112Updated 3 weeks ago
- code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"☆19Updated 8 months ago
- [ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness☆61Updated 4 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆195Updated last year
- ☆58Updated last year
- [ICCV 2023] CTVIS: Consistent Training for Online Video Instance Segmentation☆79Updated 2 years ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception☆142Updated 5 months ago
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding☆58Updated 4 months ago
- Large-Vocabulary Video Instance Segmentation dataset☆95Updated last year
- [CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering☆41Updated 5 months ago
- Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation☆57Updated 6 months ago
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…☆59Updated 3 weeks ago
- [CVPR 2025] The code for the paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".☆177Updated 5 months ago
- [NeurIPS 2024] Understanding Multi-Granularity for Open-Vocabulary Part Segmentation☆55Updated 10 months ago
- OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models☆71Updated last month
- [CVPR-2023] Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation☆18Updated 2 years ago
- [ECCV2024] PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects☆54Updated last year