Brucewang91 / SAN

☆13

Related projects:

niluthpol / multimodal_vtt
Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval
☆69 Updated 4 years ago
VisionLearningGroup / Text-to-Clip_Retrieval
Implementation for "Multilevel Language and Vision Integration for Text-to-Clip Retrieval"
☆47 Updated 5 years ago
vsislab / Controllable_XGating
ICCV2019: Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network
☆67 Updated 4 years ago
asdf0982 / vqa-mfb.pytorch
This project is out of date, I don't remember the details inside...
☆85 Updated 6 years ago
danieljf24 / dual_encoding
[CVPR2019] Dual Encoding for Zero-Example Video Retrieval
☆155 Updated last year
sunpeng981712364 / ACMR_demo
☆92 Updated 6 years ago
YingZhangDUT / Cross-Modal-Projection-Learning
TensorFlow Implementation of Deep Cross-Modal Projection Learning
☆94 Updated 4 years ago
zfchenUnique / WSSTG
This repository contains the main baselines introduced in WSSTG (ACL 2019).
☆55 Updated 2 months ago
gujiuxiang / Multimodal_Retrieval.pytorch
☆109 Updated this week
szq0214 / MSR-VTT-Challenge
Video to Language Challenge (MSR-VTT Challenge 2016)
☆32 Updated 6 years ago
ZihaoWang-CV / CAMP_iccv19
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
☆124 Updated 4 years ago
fanchenyou / HME-VideoQA
Heterogeneous Memory Enhanced Multimodal Attention Model for VideoQA
☆54 Updated 3 years ago
sunnychencool / AOQ
Adaptive Offline Quintuplet Loss for Image-Text Matching (AOQ)
☆34 Updated 4 years ago
pawandeep2155 / Video-Captioning-Using-Object-Trajectory-Features
Video Captioning on MSR-VTT and MSVD datasets using Deep Learning
☆21 Updated 4 years ago
yytzsy / SCDM
Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
☆68 Updated 3 years ago
cswhjiang / Recurrent_Fusion_Network
Source code for "Recurrent Fusion Network for Image Captioning".
☆23 Updated 5 years ago
yalesong / pvse
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
☆132 Updated 6 months ago
mynlp / cst_captioning
PyTorch Implementation of Consensus-based Sequence Training for Video Captioning
☆60 Updated 6 years ago
XgDuan / WSDEC
Weakly Supervised Dense Event Captioning in Videos, i.e. generating multiple sentence descriptions for a video in a weakly-supervised man…
☆104 Updated 4 years ago
mrsalehi / ground-sentence-video
Implementation of the EMNLP 2018 paper "Temporally Grounding Natural Sentence in Video" using PyTorch
☆2 Updated last year
yytzsy / GTP
Code for the paper: "Sentence Specified Dynamic Video Thumbnail Generation"
☆33 Updated 5 years ago
yytzsy / ABLR_code
The source code of the paper: "To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression"
☆30 Updated 5 years ago
yiling2018 / saem
Learning Fragment Self-Attention Embeddings for Image-Text Matching, in ACM MM 2019
☆41 Updated 4 years ago
ramakanth-pasunuru / video_captioning_rl
Code and Models for paper "Reinforced Video Captioning with Entailment Rewards (EMNLP 2017)"
☆43 Updated 4 years ago
Sundrops / video-caption.pytorch
☆32 Updated 6 years ago
daqingliu / CAVP
Code release for Context-Aware Visual Policy Network for Sequence-Level Image Captioning (MM 2018) and Context-Aware Visual Policy Networ…
☆47 Updated 5 years ago
Jumpin2 / HGA
☆27 Updated this week
SydCaption / SAAT
☆62 Updated 3 years ago
PKU-ICST-MIPL / UGACH_AAAI2018
Source code of our AAAI 2018 paper "Unsupervised Generative Adversarial Cross-modal Hashing"
☆52 Updated 4 years ago
JaywongWang / TGN
TensorFlow Reproduction of the EMNLP-2018 paper "Temporally Grounding Natural Sentence in Video"
☆17 Updated last year