This repository contains code and metadata for the How2 dataset.
☆192 · Dec 30, 2024 · Updated last year
Alternatives and similar repositories for how2-dataset
Users that are interested in how2-dataset are comparing it to the libraries listed below.
- ☆10 · Dec 21, 2022 · Updated 3 years ago
- Scripts to download and explore the How2Sign dataset. If you have any questions, please contact: amanda.duarte@upc.edu ☆26 · Jan 25, 2023 · Updated 3 years ago
- Simple Kaldi recipe for forced alignment ☆11 · Jul 16, 2023 · Updated 2 years ago
- The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization". ☆57 · Jan 14, 2022 · Updated 4 years ago
- ☆29 · Jul 23, 2025 · Updated 8 months ago
- Spoken Language Translation System ☆20 · Jul 26, 2021 · Updated 4 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆91 · Sep 6, 2023 · Updated 2 years ago
- ☆37 · Updated this week
- Multistage Fusion with Forget Gate for Multimodal Summarization in Open-Domain Videos ☆12 · Oct 8, 2020 · Updated 5 years ago
- ☆14 · Nov 16, 2022 · Updated 3 years ago
- EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles" ☆12 · Nov 7, 2023 · Updated 2 years ago
- This is an extension of kaldi speech recognition software which allows to perform decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- ☆17 · Apr 14, 2023 · Updated 2 years ago
- Transformer-based online speech recognition system with TensorFlow 2 ☆26 · Jan 22, 2021 · Updated 5 years ago
- Code for the ACL2022 main conference paper "A Variational Hierarchical Model for Neural Cross-Lingual Summarization" ☆18 · Sep 5, 2022 · Updated 3 years ago
- Summarization of Multimodal articles ☆10 · Oct 14, 2022 · Updated 3 years ago
- Official code and dataset link for ''VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles'' ☆36 · Jul 30, 2021 · Updated 4 years ago
- ☆15 · Jun 17, 2019 · Updated 6 years ago
- Large scale (>200h) and publicly available read audio book corpus. This corpus is an augmentation of LibriSpeech ASR Corpus (1000h) and c… ☆44 · Jul 9, 2022 · Updated 3 years ago
- Sequence-to-Sequence Framework in PyTorch ☆392 · Jan 5, 2023 · Updated 3 years ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- Code for the paper Multimodal Abstractive Summarization with Trimodal Hierarchical Attention ☆20 · Jan 25, 2022 · Updated 4 years ago
- TFDS data loaders for sign language datasets. ☆106 · Feb 9, 2026 · Updated last month
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library. ☆15 · May 19, 2020 · Updated 5 years ago
- A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based … ☆16 · Sep 5, 2017 · Updated 8 years ago
- BurrMill core ☆22 · Nov 2, 2021 · Updated 4 years ago
- Data and code for replicating WMT17 Multimodal Translation results ☆16 · Oct 10, 2018 · Updated 7 years ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆39 · Jan 6, 2024 · Updated 2 years ago
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR. ☆19 · Nov 23, 2021 · Updated 4 years ago
- Starter code for the VMT task and challenge ☆51 · Jul 29, 2020 · Updated 5 years ago
- Automatically exported from code.google.com/p/transducersaurus ☆11 · Apr 1, 2015 · Updated 10 years ago
- ☆12 · Feb 9, 2021 · Updated 5 years ago
- Zero -- A neural machine translation system ☆152 · May 8, 2023 · Updated 2 years ago
- ☆20 · Jul 22, 2022 · Updated 3 years ago
- ☆10 · Oct 16, 2025 · Updated 5 months ago
- ☆53 · Dec 6, 2021 · Updated 4 years ago
- Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining ☆12 · Mar 23, 2021 · Updated 5 years ago
- Phonetically-Oriented Word Error Rate ☆36 · May 4, 2019 · Updated 6 years ago
- asr2k ☆52 · Jun 2, 2024 · Updated last year