WangWenhao0716 / VSC-DescriptorTrack-Submission
[CVPR Challenge Rank 2nd] The code and related files to reproduce the results for the Video Similarity Challenge Descriptor Track.
☆19Updated 8 months ago
Alternatives and similar repositories for VSC-DescriptorTrack-Submission
Users that are interested in VSC-DescriptorTrack-Submission are comparing it to the libraries listed below
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension☆28Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆57Updated 2 years ago
- BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild☆33Updated last year
- Lion: Kindling Vision Intelligence within Large Language Models☆51Updated last year
- Code for the Video Similarity Challenge.☆81Updated last year
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P…☆25Updated 2 years ago
- Precision Search through Multi-Style Inputs☆73Updated 4 months ago
- ☆54Updated last year
- WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning☆36Updated 6 months ago
- ☆87Updated last year
- Official PyTorch implementation of `[ACMMM 2023] Relational Contrastive Learning for Scene Text Recognition`☆17Updated 2 years ago
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"☆107Updated last year
- Video dataset dedicated to portrait-mode video recognition.☆55Updated 2 months ago
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model☆43Updated last year
- ☆72Updated 2 years ago
- Code for CVPR 2022 paper "Scene Consistency Representation Learning for Video Scene Segmentation"☆104Updated 2 years ago
- Facebook Image Similarity Challenge 2021☆19Updated 4 years ago
- [CVPR 2022 - Demo Track] - Effective conditioned and composed image retrieval combining CLIP-based features☆82Updated last year
- Toward Universal Multimodal Embedding☆72Updated 4 months ago
- Official PyTorch implementation of “Spatial-Semantic Collaborative Cropping for User Generated Content” (AAAI24)☆70Updated last year
- An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis https://arxiv.org/abs/2503.08…☆35Updated 6 months ago
- Frame Flexible Network (CVPR2023)☆56Updated 2 years ago
- [IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer☆106Updated last year
- Masked Vision-Language Transformer in Fashion☆38Updated 2 years ago
- Narrative movie understanding benchmark☆77Updated 6 months ago
- [CVPR 2022 Challenge Rank 1st] The official code for V2L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval…☆29Updated 3 years ago
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval☆38Updated 2 years ago
- Research Code for Multimodal-Cognition Team in Ant Group☆169Updated 2 months ago
- 2019 CCF Big Data & Computing Intelligence Contest, video copyright detection algorithm, 8th place solution in the semi-final round | 8th place solution of Video Copyright Detection Algorithm Track, 2019 CCF Big Data & Computing Int…☆30Updated 6 years ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi…☆52Updated last year