983632847 / All-in-One
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
☆17 · Updated 2 months ago
Alternatives and similar repositories for All-in-One:
Users who are interested in All-in-One are comparing it to the repositories listed below:
- ☆11 · Updated last month
- [NeurIPS'24] MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts ☆15 · Updated 7 months ago
- ☆19 · Updated 9 months ago
- Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking ☆10 · Updated 8 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024 ☆15 · Updated 6 months ago
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding. ☆47 · Updated 3 weeks ago
- High Quality Video Reasoning Segmentation ☆20 · Updated this week
- The official implementation for the paper [Towards Unified Token Learning for Vision-Language Tracking]. ☆16 · Updated last year
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral]. ☆25 · Updated 6 months ago
- A list of referring video object segmentation papers ☆35 · Updated 2 weeks ago
- The official pytorch implementation of our AAAI 2024 paper "Unifying Visual and Vision-Language Tracking via Contrastive Learning" ☆42 · Updated 6 months ago
- ☆20 · Updated 8 months ago
- Awesome video instance segmentation papers ☆40 · Updated 3 weeks ago
- ☆13 · Updated 9 months ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆16 · Updated 9 months ago
- The official implementation for the CVPR 2023 paper Joint Visual Grounding and Tracking with Natural Language Specification. ☆67 · Updated last year
- [CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection ☆24 · Updated 7 months ago
- ☆15 · Updated 4 months ago
- [TPAMI 2024] This is the Pytorch code for our paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding". ☆17 · Updated last month
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training. ☆10 · Updated last year
- Tracking with Human-Intent Reasoning ☆70 · Updated 6 months ago
- This is the official implementation of "GvSeg: General and Task-Oriented Video Segmentation" (Accepted at ECCV 2024). ☆18 · Updated 9 months ago
- [CVPR 2025] Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking ☆16 · Updated 3 weeks ago
- [NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion ☆74 · Updated 4 months ago
- ☆16 · Updated 6 months ago
- ☆69 · Updated 7 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation ☆31 · Updated last year
- Code for paper "LLMs Can Evolve Continually on Modality for X-Modal Reasoning" NeurIPS2024 ☆35 · Updated 4 months ago
- ☆8 · Updated 11 months ago
- EPCFormer: Expression Prompt Collaboration Transformer for Universal Referring Video Object Segmentation ☆9 · Updated last year