☆20Apr 2, 2024Updated last year
Alternatives and similar repositories for CPL
Users that are interested in CPL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆39Jun 28, 2023Updated 2 years ago
- [ECCV 2024] The first zero-shot setting for spatio-temporal video grounding.☆11Jul 16, 2024Updated last year
- Visual Grounding with Multi-modal Conditional Adaptation (ACMMM 2024 Oral)☆26Jun 11, 2025Updated 9 months ago
- Code for "Distilling Coarse-to-fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding" (ICCV 2023)☆14Oct 2, 2024Updated last year
- [ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding☆23Feb 26, 2025Updated last year
- Implementation for MAF: Multimodal Alignment Framework☆46Nov 25, 2020Updated 5 years ago
- [CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding☆153Jul 13, 2024Updated last year
- [ACL'25 Oral] Code for the paper "UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban…☆26Jul 15, 2025Updated 8 months ago
- [AAAI 2024]Weakly Supervised Multimodal Affordance Grounding for Egocentric Images☆13Nov 10, 2024Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆66Jun 28, 2024Updated last year
- Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)☆28Jun 21, 2024Updated last year
- ☆11Dec 6, 2024Updated last year
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)☆55Sep 7, 2023Updated 2 years ago
- [ACM MM 22] Correspondence Matters for Video Referring Expression Comprehension☆15Sep 4, 2022Updated 3 years ago
- End-to-End Adversarial Erasing for Weakly Supervised Semantic Segmentation☆15Nov 15, 2020Updated 5 years ago
- Official Implementation for CVPR 2022 paper "Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language …