niluthpol / multimodal_vtt
Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
☆68 · Updated 5 years ago
Alternatives and similar repositories for multimodal_vtt
Users who are interested in multimodal_vtt are comparing it to the libraries listed below.
- Implementation for "Multilevel Language and Vision Integration for Text-to-Clip Retrieval" ☆49 · Updated 7 years ago
- Heterogeneous Memory Enhanced Multimodal Attention Model for VideoQA ☆54 · Updated 4 years ago
- [CVPR2019] Dual Encoding for Zero-Example Video Retrieval ☆153 · Updated 3 years ago
- Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation" ☆23 · Updated 5 years ago
- Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised man… ☆104 · Updated 5 years ago
- Code and Models for paper "Reinforced Video Captioning with Entailment Rewards (EMNLP 2017)" ☆44 · Updated 6 years ago
- Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos ☆87 · Updated 5 years ago
- Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019) ☆134 · Updated last year
- The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch ☆16 · Updated 6 years ago
- This repository contains the main baselines introduced in WSSTG (ACL 2019). ☆56 · Updated last year
- Feature Extraction Toolbox from CUHK&ETHZ&SIAT submission to ActivityNet 2016