WangWenhao0716 / VSC-DescriptorTrack-Submission
[CVPR Challenge Rank 2nd] The codes and related files to reproduce the results for Video Similarity Challenge Descriptor Track.
☆19 Updated 6 months ago
Alternatives and similar repositories for VSC-DescriptorTrack-Submission
Users who are interested in VSC-DescriptorTrack-Submission are comparing it to the repositories listed below.
- [PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension ☆28 Updated last year
- Precision Search through Multi-Style Inputs ☆72 Updated 3 months ago
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral] ☆56 Updated 2 years ago
- Code for the Video Similarity Challenge. ☆80 Updated last year
- Code for CVPR 2022 paper "Scene Consistency Representation Learning for Video Scene Segmentation" ☆102 Updated 2 years ago
- This repository contains the dataset, codebase, and benchmarks for our paper: <CNVid-3.5M: Build, Filter, and Pre-train the Large-scale P… ☆25 Updated last year
- ☆72 Updated 2 years ago
- Video dataset dedicated to portrait-mode video recognition. ☆52 Updated 2 weeks ago
- WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning ☆35 Updated 4 months ago
- BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild ☆32 Updated last year
- Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022] ☆130 Updated last year
- ☆87 Updated last year
- [ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model ☆43 Updated 10 months ago
- Research Code for Multimodal-Cognition Team in Ant Group ☆168 Updated 2 weeks ago
- Lion: Kindling Vision Intelligence within Large Language Models ☆51 Updated last year
- [ACM MM2025] The official repository for the RealSyn dataset ☆37 Updated 3 months ago
- Masked Vision-Language Transformer in Fashion ☆36 Updated 2 years ago
- Chinese CLIP models with SOTA performance. ☆59 Updated 2 years ago
- ☆53 Updated last year
- ☆17 Updated last year
- A Dead Simple and Modularized Multi-Modal Training and Finetune Framework. Compatible to any LLaVA/Flamingo/QwenVL/MiniGemini etc series … ☆19 Updated last year
- Official PyTorch implementation of the “Spatial-Semantic Collaborative Cropping for User Generated Content”. (AAAI24) ☆70 Updated last year
- [CVPR 2023 Workshop] The code reproduce the results of our solutions on both tracks for Meta AI Video Similarity Challenge (CVPR 2023 Wor… ☆53 Updated 2 years ago
- Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Groundi… ☆51 Updated last year
- Gradio demo used in our Osprey:Pixel Understanding with Visual Instruction Tuning. ☆16 Updated last year
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation ☆13 Updated 2 years ago
- official code for paper: Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark ☆39 Updated last year
- Official PyTorch implementation of `[ACMMM 2023]Relational Contrastive Learning for Scene Text Recognition` ☆17 Updated 2 years ago
- Official implementation of TagAlign ☆35 Updated 10 months ago
- Narrative movie understanding benchmark ☆76 Updated 4 months ago