ArrowLuo / VideoFeatureExtractor
Video Feature Extractor for S3D-HowTo100M
☆29 · Updated 4 years ago
Alternatives and similar repositories for VideoFeatureExtractor
Users who are interested in VideoFeatureExtractor are comparing it to the libraries listed below.
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts ☆188 · Updated 2 months ago
- Starter Code for VALUE benchmark ☆80 · Updated 2 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆90 · Updated last year
- The codes and features of the re-implementation of SIGIR 2021 work "Deconfounded Video Moment Retrieval with Causal Intervention" ☆34 · Updated 3 years ago
- Implementation for MAF: Multimodal Alignment Framework ☆46 · Updated 4 years ago
- Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training" ☆233 · Updated 3 years ago
- [ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval ☆159 · Updated last year
- Codes for paper "Towards Diverse Paragraph Captioning for Untrimmed Videos". CVPR 2021 ☆65 · Updated 3 years ago
- Data Release for VALUE Benchmark ☆31 · Updated 3 years ago
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training ☆34 · Updated 3 years ago
- ☆74 · Updated 2 years ago
- [EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction ☆49 · Updated 2 years ago
- Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training (ACL 2023) ☆91 · Updated 2 years ago
- The code repository for "Cross-Modal and Hierarchical Modeling of Video and Text" in PyTorch ☆16 · Updated 6 years ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval ☆27 · Updated 2 years ago
- Official code and dataset link for ''VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles'' ☆36 · Updated 3 years ago
- The source code of ACL 2020 paper: "Cross-Modality Relevance for Reasoning on Language and Vision" ☆27 · Updated 4 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆115 · Updated 2 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning" ☆135 · Updated 2 years ago
- Code for ACM MM2020 paper: Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization ☆34 · Updated 4 years ago
- Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral) ☆133 · Updated 11 months ago
- Span-based Localizing Network for Natural Language Video Localization (ACL 2020) ☆108 · Updated 3 years ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence" ☆12 · Updated 4 years ago
- Unpaired Image Captioning ☆36 · Updated 4 years ago
- A PyTorch implementation of VIOLET ☆137 · Updated last year
- Code for the paper "Zero-shot Natural Language Video Localization" (ICCV2021, Oral). ☆48 · Updated 2 years ago
- Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019) ☆65 · Updated 4 years ago
- Using VideoBERT to tackle video prediction ☆130 · Updated 4 years ago
- Official implementation for Hierarchical Deep Residual Reasoning for Temporal Moment Localization ☆9 · Updated 3 years ago
- [arXiv22] Disentangled Representation Learning for Text-Video Retrieval ☆96 · Updated 3 years ago