Xuchen-Li / Awesome-Vision-Language-Tracking
A vision-language tracking paper list that documents articles related to visual language tracking.
☆42Dec 15, 2024Updated last year
Alternatives and similar repositories for Awesome-Vision-Language-Tracking
Users that are interested in Awesome-Vision-Language-Tracking are comparing it to the libraries listed below
- (NeurIPS 2023) Open-set visual object query search & localization in long-form videos☆26Feb 1, 2024Updated 2 years ago
- ☆16Oct 4, 2024Updated last year
- The official pytorch implementation of our AAAI 2024 paper "Unifying Visual and Vision-Language Tracking via Contrastive Learning"☆45Nov 4, 2024Updated last year
- A visual object tracking paper list that documents articles related to visual object tracking.☆58Nov 6, 2024Updated last year
- [NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion☆100Oct 29, 2025Updated 3 months ago
- [AAAI-25]Code for SEAL☆15Sep 25, 2025Updated 4 months ago
- ☆11Dec 6, 2024Updated last year
- ☆10Oct 13, 2024Updated last year
- A series of improved methods are used for visual tracking☆10Nov 29, 2025Updated 2 months ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval☆22Jun 23, 2025Updated 7 months ago
- watermark video delogo☆11Nov 27, 2020Updated 5 years ago
- Code for the ICRA2018 paper "Trajectory-Optimized Sensing for Active Search of Tissue Abnormalities in Robotic Surgery"☆11May 22, 2018Updated 7 years ago
- ☆12Mar 24, 2024Updated last year
- ☆12Jan 4, 2026Updated last month
- Empowering Small VLMs to Think with Dynamic Memorization and Exploration☆15Nov 18, 2025Updated 2 months ago
- ☆14Dec 2, 2025Updated 2 months ago
- [ICCV'23] CiteTracker: Correlating Image and Text for Visual Tracking☆43Jun 20, 2024Updated last year
- LoRAT_pytracking: reproduction of [ECCV2024] LoRAT☆46Dec 9, 2024Updated last year
- This is a Uyghur language translator that supports speech-to-text in Uyghur language, machine translation to Uyghur language text, and te…☆14Nov 28, 2024Updated last year
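- This project has three main parts: simulation of the image motion-blur process under uniform linear motion, Wiener filtering and the factors that affect its restoration quality, and motion-blur parameter estimation.☆10Jun 15, 2023Updated 2 years ago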
- Official Implementation of Edit2Perceive☆25Dec 28, 2025Updated last month
- Code and Dataset for our CVPR 2022 paper "Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training"☆12Jul 8, 2022Updated 3 years ago
- Unsupervised Vehicle Re-Identification via Self-Supervised Metric Learning using Feature Dictionary☆11Jul 30, 2021Updated 4 years ago
- Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking☆11Sep 3, 2024Updated last year
- ☆12Nov 6, 2025Updated 3 months ago
- [TIM 2024] Handling Occlusion in UAV Visual Tracking with Query-Guided Re-Detection☆18Mar 18, 2025Updated 10 months ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- [NeurIPS 2023] Official Implementation of "PaintSeg: Painting Pixels for Training-free Segmentation"☆14Dec 31, 2023Updated 2 years ago
- ☆13Sep 16, 2022Updated 3 years ago
- Official implementation of NeurIPS 2022 paper "Learning Active Camera for Multi-Object Navigation"☆10Apr 23, 2023Updated 2 years ago
- PiVOT uses a foundational model for online automatic visual prompt refinement to aid tracking.☆15May 15, 2025Updated 8 months ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆16May 8, 2025Updated 9 months ago
- "From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models" [Uziel, Dinari, and Freifeld, NeurIPS 20…☆13Jan 16, 2024Updated 2 years ago
- 4th place solution for the PBVS 2022 Multi-modal Aerial View Object Classification Challenge - Track 1 (SAR) at PBVS2022☆13Apr 18, 2022Updated 3 years ago
- This is the official repository of the paper 'A Level Set Annotation Framework With Single-Point Supervision for Infrared Small Target De…☆13Aug 18, 2024Updated last year
- Code of the CVPR 2024 paper "Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models"☆14Mar 31, 2025Updated 10 months ago
- Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries☆34Nov 19, 2025Updated 2 months ago
- dense feature descriptor for image matching☆15Nov 6, 2023Updated 2 years ago
- The official implementation of the ECCV 2024 paper "Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL"☆19Oct 17, 2025Updated 3 months ago