MDMMT: Multidomain Multimodal Transformer for Video Retrieval
☆26Jun 28, 2021Updated 4 years ago
Alternatives and similar repositories for mdmmt
Users who are interested in mdmmt are comparing it to the libraries listed below
- Use CLIP to represent video for Retrieval Task☆70Mar 1, 2021Updated 5 years ago
- Code and benchmarks for the Semantic Video Retrieval Task☆53Oct 18, 2022Updated 3 years ago
- Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".☆211Jun 12, 2020Updated 5 years ago
- ☆32Jun 22, 2022Updated 3 years ago
- Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrie…☆88Jan 10, 2023Updated 3 years ago
- A Simple Framework for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- ☆35Mar 22, 2019Updated 6 years ago
- A PyTorch implementation of TVC☆24Dec 18, 2023Updated 2 years ago
- Weakly Supervised Video Moment Retrieval from Text Queries☆43Jul 20, 2020Updated 5 years ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment☆10Mar 1, 2025Updated last year
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- Pipeline to scrape prompt + image url pairs from LAION `share-dalle-3` discord channel☆11Oct 10, 2023Updated 2 years ago
- This repo contains all the code for the SEScore implementation☆15Mar 3, 2025Updated last year
- [CVPR2019] Dual Encoding for Zero-Example Video Retrieval☆153Jan 10, 2023Updated 3 years ago
- Recursive Neural Networks implemented with Tensorflow☆13Nov 5, 2019Updated 6 years ago
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago
- ☆259Dec 10, 2022Updated 3 years ago
- ☆18Jan 10, 2024Updated 2 years ago
- [arXiv22] Disentangled Representation Learning for Text-Video Retrieval☆98Apr 7, 2022Updated 3 years ago
- ☆19Apr 28, 2023Updated 2 years ago
- ☆62May 11, 2021Updated 4 years ago
- [ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval☆161May 28, 2024Updated last year
- Extension of Self-Supervised Temporal Hashing☆14Apr 15, 2021Updated 4 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- PyTorch GPU distributed training code for MIL-NCE HowTo100M☆219Jul 5, 2022Updated 3 years ago
- COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning☆291Sep 6, 2022Updated 3 years ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Jun 3, 2024Updated last year
- ☆15Mar 20, 2020Updated 5 years ago
- Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval☆68Apr 10, 2020Updated 5 years ago
- ☆47Apr 29, 2024Updated last year
- [CVPR2022 Oral] The official code for "TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognit…☆18Aug 1, 2022Updated 3 years ago
- Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning☆20Dec 21, 2023Updated 2 years ago
- [CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning☆24Jul 5, 2022Updated 3 years ago
- Published in CVPR 2020; MATLAB code☆22Sep 15, 2024Updated last year
- A light-weight data management system for large-scale pretraining☆21May 17, 2025Updated 9 months ago
- PyTorch implementation of HANet: Hierarchical Alignment Networks for Video-Text Retrieval (ACM MM 2021).☆47Aug 19, 2021Updated 4 years ago
- Video embeddings for retrieval with natural language queries☆342Feb 15, 2023Updated 3 years ago
- IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning☆79Nov 23, 2020Updated 5 years ago
- ☆52Oct 17, 2023Updated 2 years ago