alibaba / UVOSAM
The official repository of UVOSAM
☆11 Updated 3 months ago
Related projects:
- The official repository of EffiVED ☆13 Updated 3 months ago
- ☆52 Updated 8 months ago
- The official repo for "Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes", ECCV 2024 ☆20 Updated last month
- GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024) ☆51 Updated 8 months ago
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation ☆22 Updated 2 years ago
- Official Codes for Fine-Grained Visual Prompting, NeurIPS 2023 ☆32 Updated 7 months ago
- Visual Prompt Augmentation ☆25 Updated 9 months ago
- [CVPR2024] UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity ☆46 Updated this week
- ☆12 Updated 10 months ago
- RefTeacher is a strong baseline method for Semi-Supervised Referring Expression Comprehension. ☆12 Updated last year
- LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning ☆70 Updated 5 months ago
- Multi-Class Few-Shot Semantic Segmentation with Visual Prompts ☆26 Updated this week
- [CVPR 2024] Official implementation of "Universal Segmentation at Arbitrary Granularity with Language Instruction" ☆75 Updated 6 months ago
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation ☆25 Updated 7 months ago
- ☆76 Updated 7 months ago
- [CVPR 2024 Challenge] 1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation ☆24 Updated 3 months ago
- CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation ☆56 Updated last month
- ☆46 Updated last month
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆128 Updated 3 months ago
- [TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding. ☆104 Updated 2 months ago
- ☆123 Updated 8 months ago
- Our public repo ranked 1st 🏆🏆 at MMSports2023 challenge on segmentation task ☆16 Updated 10 months ago
- Official implementation of the paper "OvarNet: Towards Open-vocabulary Object Attribute Recognition" ☆98 Updated last year
- TRT for WSOL ☆29 Updated 10 months ago
- Self-Supervised Video Representation Learning with Motion-Aware Masked Autoencoders ☆23 Updated last month
- [AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition ☆25 Updated 2 months ago
- DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution ☆34 Updated 2 months ago
- [AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation ☆62 Updated 2 months ago
- Fast and general video object segmentation evaluation. ☆24 Updated 7 months ago
- [CVPR 2024 Highlight] Official GraCo: Granularity-Controllable Interactive Segmentation. ☆41 Updated 2 months ago