aranciokov / FSMMDA_VideoRetrieval
☆10Updated 2 years ago
Alternatives and similar repositories for FSMMDA_VideoRetrieval
Users who are interested in FSMMDA_VideoRetrieval are comparing it to the repositories listed below.
- [ECCV'22 Poster] Explicit Image Caption Editing☆22Updated 3 years ago
- ☆22Updated 3 years ago
- Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"☆34Updated 3 years ago
- Recent Advances in Visual Dialog☆30Updated 3 years ago
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration☆56Updated 2 years ago
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"☆13Updated 2 years ago
- Official Implementation for CVPR 2023 paper "Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasonin…☆10Updated last year
- [CVPR 2022] A large-scale public benchmark dataset for video question-answering, especially about evidence and commonsense reasoning. The…☆75Updated 5 months ago
- Implementation for the paper "Dynamic Language Binding in Relational Visual Reasoning" (Le et al., IJCAI 2020)☆13Updated last year
- Video Graph Transformer for Video Question Answering (ECCV'22)☆49Updated 2 years ago
- The official code for "Visual Relationship Detection with Visual-Linguistic Knowledge from Multimodal Representations" (IEEE Access, 2021…☆17Updated 3 years ago
- ☆30Updated 3 years ago
- DeVLBert: Learning Deconfounded Visio-Linguistic Representations☆27Updated 3 years ago
- ☆79Updated 3 years ago
- Source code of our TCSVT'22 paper Reading-strategy Inspired Visual Representation Learning for Text-to-Video Retrieval☆19Updated 3 years ago
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos☆36Updated 10 months ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Updated 3 years ago
- CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment