srvk / how2-dataset
This repository contains code and metadata for the How2 dataset.
☆172Updated 3 months ago
Alternatives and similar repositories for how2-dataset:
Users who are interested in how2-dataset are comparing it to the repositories listed below
- Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19)☆99Updated 2 years ago
- Zero -- A neural machine translation system☆150Updated last year
- Speech2vec pre-trained word vectors☆76Updated 6 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"☆139Updated 2 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆50Updated 3 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".☆55Updated 3 years ago
- Official code and dataset link for "VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles"☆36Updated 3 years ago
- Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.☆53Updated last year
- Multi-modal Neural Machine Translation in PyTorch☆44Updated 6 years ago
- Temporal Reasoning via Audio Question Answering☆24Updated 5 years ago
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)☆27Updated 2 years ago
- Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"☆231Updated 3 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset☆56Updated 4 years ago
- Implementation of "Audio Retrieval with Natural Language Queries", INTERSPEECH 2021, PyTorch☆27Updated last year
- Implementation of meta-transfer-learning for ASR and LM (ACL 2020)☆50Updated 4 years ago
- [EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering☆173Updated 2 years ago
- Code base for the paper "Latent variable model for multi-modal translation".☆16Updated 8 months ago
- Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch☆65Updated 6 years ago
- A PyTorch implementation of the paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021☆47Updated 3 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset☆90Updated last year
- A curated list of AWESOME papers, datasets and tutorials within Multimodal Machine Translation.☆36Updated 3 years ago
- Multilingual speech translation☆41Updated 3 years ago
- Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"☆27Updated 3 years ago
- This is the official code repository for the paper 'Cross-modality Data Augmentation for End-to-End Sign Language Translation'. Accepted…☆15Updated last year
- Switchboard Dialog Act Corpus with Penn Treebank links☆144Updated 4 years ago