yugaljain1999 / Video_Captioning_Pytorch
Video captioning on MSR-VTT Dataset
☆12 Updated 3 years ago
Alternatives and similar repositories for Video_Captioning_Pytorch:
Users that are interested in Video_Captioning_Pytorch are comparing it to the libraries listed below
- PyTorch implementation of HANet: Hierarchical Alignment Networks for Video-Text Retrieval (ACM MM 2021). ☆47 Updated 3 years ago
- Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model (CVPR 2023) ☆40 Updated 2 years ago
- [CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval ☆38 Updated last year
- Some papers about *diverse* image (a few videos) captioning ☆26 Updated last year
- ☆17 Updated 2 years ago
- PyTorch Implementation on Paper [CVPR2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning ☆84 Updated 3 years ago
- Cross Modal Retrieval with Querybank Normalisation ☆55 Updated last year
- Official Code for VideoLT: Large-scale Long-tailed Video Recognition (ICCV 2021) ☆33 Updated 2 years ago
- A PyTorch implementation of EmpiricalMVM ☆40 Updated last year
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration ☆56 Updated last year
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 Updated 2 years ago
- ☆34 Updated 5 years ago
- [ECCV'22 Poster] Explicit Image Caption Editing ☆21 Updated 2 years ago
- [CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning ☆22 Updated 2 years ago
- [ECCV2022] Motion Sensitive Contrastive Learning for Self-supervised Video Representation ☆17 Updated 2 years ago
- Weakly Supervised Video Moment Retrieval from Text Queries ☆42 Updated 4 years ago
- ☆19 Updated 2 years ago
- [CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades conventional diffusion m… ☆60 Updated 8 months ago
- Source code of Universal Weighting Metric Learning for Cross-Modal Matching. The paper is accepted by CVPR2020. ☆22 Updated 2 years ago
- Gender/age attribute grounding in a weakly supervised manner. ☆12 Updated 5 years ago
- The code for the paper "Hybrid Contrastive Quantization for Efficient Cross-View Video Retrieval" (WWW'22, Oral). ☆18 Updated 2 years ago
- AAAI2020 - Learning 2D Temporal Localization Networks for Moment Localization with Natural Language ☆17 Updated 5 years ago
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021 ☆61 Updated 3 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval ☆26 Updated 2 years ago
- A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval ☆42 Updated 2 years ago
- The codes and features of the re-implementation of SIGIR 2021 work "Deconfounded Video Moment Retrieval with Causal Intervention" ☆34 Updated 3 years ago
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies" ☆17 Updated last year
- Phrase Localization Evaluation Toolkit ☆19 Updated 5 years ago
- source code of our RaNet in EMNLP 2021 ☆30 Updated 2 years ago
- [AAAI 2023] Contrastive Masked Autoencoders for Self-Supervised Video Hashing ☆26 Updated last year