srvk / how2-dataset
This repository contains code and metadata for the How2 dataset.
☆178Updated 5 months ago
Alternatives and similar repositories for how2-dataset
Users who are interested in how2-dataset are comparing it to the repositories listed below.
- Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.☆53Updated 2 years ago
- Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"☆140Updated 3 years ago
- ☆53Updated 3 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆52Updated 3 years ago
- ☆53Updated 5 years ago
- Multi-modal Neural Machine Translation in PyTorch☆44Updated 7 years ago
- ☆15Updated 4 years ago
- Code for the paper Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems (ACL19)☆100Updated 2 years ago
- Official code and dataset link for "VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles"☆36Updated 3 years ago
- Speech2vec pre-trained word vectors☆76Updated 6 years ago
- Cross-lingual Visual Pre-training for Multimodal Machine Translation☆18Updated 3 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".☆55Updated 3 years ago
- A curated list of AWESOME papers, datasets and tutorials within Multimodal Machine Translation.☆36Updated 3 years ago
- A Large-Scale Open-Domain Sign Language Translation Dataset (ASL-English)☆66Updated 11 months ago
- Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)☆27Updated 2 years ago
- ☆28Updated 3 years ago
- [EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering☆178Updated 2 years ago
- Zero -- A neural machine translation system☆152Updated 2 years ago
- Multilingual speech translation☆41Updated 4 years ago
- M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL 2022☆109Updated 2 years ago
- Frozen Pretrained Transformers for Neural Sign Language Translation☆15Updated 3 years ago
- Code base for the paper "Latent variable model for multi-modal translation".☆16Updated 11 months ago
- The code for our INTERSPEECH 2020 paper - Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion R…☆121Updated 4 years ago
- Implementation of meta-transfer-learning for ASR and LM (ACL 2020)☆50Updated 4 years ago
- Code and Data for the ACL22 main conference paper "MSCTD: A Multimodal Sentiment Chat Translation Dataset"☆41Updated 6 months ago
- Humor Knowledge Enriched Transformer☆30Updated 3 years ago
- SLTUNET: A Simple Unified Model for Sign Language Translation (ICLR 2023)☆31Updated last year
- Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval…☆21Updated 2 years ago
- Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Pe…☆124Updated last year
- This is the official code repository for the paper 'Cross-modality Data Augmentation for End-to-End Sign Language Translation'. Accepted…☆16Updated last year