niluthpol/multimodal_vtt

Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval

☆68

Alternatives and similar repositories for multimodal_vtt

Users that are interested in multimodal_vtt are comparing it to the libraries listed below.

danieljf24 / dual_encoding
[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
☆153 · Jan 10, 2023 · Updated 3 years ago
cshizhe / hgr_v2t
Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".
☆211 · Jun 12, 2020 · Updated 6 years ago
yalesong / pvse
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
☆135 · Mar 15, 2024 · Updated 2 years ago
albanie / collaborative-experts
Video embeddings for retrieval with natural language queries
☆344 · Feb 15, 2023 · Updated 3 years ago
danieljf24 / w2vv
Word2VisualVec: Predicting Visual Features from Text for Image and Video Caption Retrieval
☆70 · Jan 27, 2020 · Updated 6 years ago
danieljf24 / hybrid_space
Source code of our TPAMI'21 paper Dual Encoding for Video Retrieval by Text and CVPR'19 paper Dual Encoding for Zero-Example Video Retrie…
☆88 · Jan 10, 2023 · Updated 3 years ago
niluthpol / weak_supervised_video_moment
Weakly Supervised Video Moment Retrieval from Text Queries
☆43 · Jul 20, 2020 · Updated 6 years ago
ikuinen / CMIN_moment_retrieval
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
☆87 · Nov 22, 2020 · Updated 5 years ago
tgc1997 / RMN
IJCAI2020: Learning to Discretely Compose Reasoning Module Networks for Video Captioning
☆79 · Nov 23, 2020 · Updated 5 years ago
KunpengLi1994 / VSRN
PyTorch code for ICCV'19 paper "Visual Semantic Reasoning for Image-Text Matching"
☆304 · Jan 14, 2020 · Updated 6 years ago
chitwansaharia / HACAModel
Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.…
☆26 · Nov 3, 2018 · Updated 7 years ago
lelan-li / SSAH
Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval (CVPR 2018)
☆164 · Jul 20, 2018 · Updated 8 years ago
yj-yu / lsmdc
☆33 · Nov 12, 2018 · Updated 7 years ago
sunpeng981712364 / ACMR_demo
☆93 · Oct 20, 2017 · Updated 8 years ago
gabeur / mmt
Multi-Modal Transformer for Video Retrieval
☆265 · Oct 9, 2024 · Updated last year
ZihaoWang-CV / CAMP_iccv19
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
☆127 · Feb 26, 2020 · Updated 6 years ago
yiling2018 / saem
Learning Fragment Self-Attention Embeddings for Image-Text Matching, in ACM MM 2019
☆41 · Sep 24, 2019 · Updated 6 years ago
dazhang-cv / MAN
This is the official repo for "MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment"
☆17 · May 27, 2019 · Updated 7 years ago
AndresPMD / semantic_adaptive_margin
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
☆16 · Dec 10, 2021 · Updated 4 years ago
li-xirong / video-retrieval
Deep Learning for Video Retrieval by Natural Language
☆11 · Oct 20, 2019 · Updated 6 years ago
BruceW91 / CVSE
The official source code for the paper Consensus-Aware Visual-Semantic Embedding for Image-Text Matching (ECCV 2020)
☆168 · Feb 7, 2022 · Updated 4 years ago
StanfordVL / STGraph
Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"
☆23 · Mar 4, 2020 · Updated 6 years ago
mynlp / cst_captioning
PyTorch Implementation of Consensus-based Sequence Training for Video Captioning
☆60 · May 15, 2018 · Updated 8 years ago
xuchaoxi / video-cnn-feat
☆31 · Jun 22, 2022 · Updated 4 years ago
VisionLearningGroup / Text-to-Clip_Retrieval
Implementation for "Multilevel Language and Vision Integration for Text-to-Clip Retrieval"
☆49 · Jan 21, 2019 · Updated 7 years ago
JaywongWang / CBP
Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P…
☆59 · Mar 24, 2023 · Updated 3 years ago
fartashf / vsepp
PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"
☆523 · Dec 8, 2021 · Updated 4 years ago
yytzsy / SCDM
Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
☆71 · Sep 7, 2021 · Updated 4 years ago
caoyue10 / aaai17-cdq
The implementation of AAAI-17 paper "Collective Deep Quantization for Efficient Cross-modal Retrieval"
☆34 · Mar 15, 2017 · Updated 9 years ago
HuiChen24 / IMRAM
Code for our CVPR2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval"
☆95 · Mar 8, 2020 · Updated 6 years ago
li-xirong / avs
Ad-hoc Video Search
☆29 · Feb 18, 2021 · Updated 5 years ago
facebookresearch / grounded-video-description
Video Grounding and Captioning
☆331 · Oct 12, 2021 · Updated 4 years ago
simon-ging / coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
☆291 · Sep 6, 2022 · Updated 3 years ago
PKU-ICST-MIPL / UGACH_AAAI2018
Source code of our AAAI 2018 paper "Unsupervised Generative Adversarial Cross-modal Hashing"
☆53 · Oct 3, 2019 · Updated 6 years ago
sunnychencool / AOQ
Adaptive Offline Quintuplet Loss for Image-Text Matching (AOQ)
☆34 · Jul 2, 2020 · Updated 6 years ago
kuanghuei / SCAN
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
☆579 · May 18, 2023 · Updated 3 years ago
zyfsa / cvpr2018-SSAH
The TensorFlow code for Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
☆39 · Oct 24, 2018 · Updated 7 years ago
JaywongWang / DenseVideoCaptioning
Official Tensorflow Implementation of the paper "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning" in CVPR 2…
☆151 · Jul 8, 2019 · Updated 7 years ago
hardyqr / HAL
[AAAI'20] Code release for "HAL: Improved Text-Image Matching by Mitigating Visual Semantic Hubs".
☆38 · Oct 4, 2023 · Updated 2 years ago