jcwang0602 / PLVL
Progressive Language-guided Visual Learning for Multi-Task Visual Grounding
☆11 Updated 2 months ago
Alternatives and similar repositories for PLVL
Users who are interested in PLVL are comparing it to the repositories listed below
- All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment ☆17 Updated 5 months ago
- ☆20 Updated 10 months ago
- Official implementation of "SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Trackin… ☆22 Updated last week
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding. ☆51 Updated 3 months ago
- ☆13 Updated last year
- ☆19 Updated 11 months ago
- The official implementation for the CVPR 2023 paper Joint Visual Grounding and Tracking with Natural Language Specification. ☆68 Updated 2 years ago
- ☆71 Updated 9 months ago
- Robust Referring Video Object Segmentation with Cyclic Structural Consistency [ICCV 2023] ☆30 Updated last year
- [AAAI2025 selected as oral] - Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints ☆30 Updated 2 weeks ago
- Code for "CARIS: Context-Aware Referring Image Segmentation" [ACM MM2023] ☆27 Updated 7 months ago
- ☆10 Updated last year
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training. ☆10 Updated last year
- Related papers about Referring Image Segmentation (RIS) ☆16 Updated last year
- Repo of NeurIPS23 ☆15 Updated last year
- Tracking with Human-Intent Reasoning ☆71 Updated 8 months ago
- [NeurIPS'24] MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts ☆16 Updated 9 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆67 Updated 9 months ago
- CVPR2024: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models ☆76 Updated last year
- [CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection ☆26 Updated 9 months ago
- ☆21 Updated 2 years ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 Updated last year
- [NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion ☆82 Updated last month
- ☆92 Updated last year
- [TPAMI 2023] Local-Global Context Aware Transformer for Language-Guided Video Segmentation ☆48 Updated last year
- [ICCV'2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition ☆37 Updated last year
- ☆35 Updated last year
- PyTorch implementation of "Efficient Motion Prompt Learning for Robust Visual Tracking" (ICML2025) ☆17 Updated last month
- Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks ☆22 Updated 2 years ago
- [ECCV 2024 oral] - C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition ☆35 Updated 7 months ago