This repository contains the code and metadata of the How2 dataset.
☆192 · Dec 30, 2024 · Updated last year
Alternatives and similar repositories for how2-dataset
Users who are interested in how2-dataset are comparing it to the repositories listed below.
- This is an extension of the Kaldi speech recognition software that allows decoding speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- Simple Kaldi recipe for forced alignment ☆11 · Jul 16, 2023 · Updated 2 years ago
- ☆10 · Dec 21, 2022 · Updated 3 years ago
- ☆37 · Nov 22, 2025 · Updated 3 months ago
- A bunch of scripts exploiting several tools to perform inverse text normalization (ITN) ☆21 · Sep 27, 2017 · Updated 8 years ago
- ☆17 · Apr 14, 2023 · Updated 2 years ago
- Phonetically-Oriented Word Error Rate ☆36 · May 4, 2019 · Updated 6 years ago
- The code repository for the EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization". ☆56 · Jan 14, 2022 · Updated 4 years ago
- DSing ASR task: Resources and Baseline for an unaccompanied singing ASR. ☆19 · Nov 23, 2021 · Updated 4 years ago
- BurrMill core ☆22 · Nov 2, 2021 · Updated 4 years ago
- Spoken Language Translation System ☆20 · Jul 26, 2021 · Updated 4 years ago
- ☆28 · Jul 23, 2025 · Updated 7 months ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆90 · Sep 6, 2023 · Updated 2 years ago
- A Python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based … ☆16 · Sep 5, 2017 · Updated 8 years ago
- Transformer-based online speech recognition system with TensorFlow 2 ☆26 · Jan 22, 2021 · Updated 5 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio… ☆36 · Apr 25, 2025 · Updated 10 months ago
- Punctuation generation for speech transcripts using lexical and prosodic features ☆42 · Mar 5, 2019 · Updated 6 years ago
- ☆22 · Apr 8, 2022 · Updated 3 years ago
- Convert words to numbers ☆21 · Apr 13, 2022 · Updated 3 years ago
- ☆14 · Nov 16, 2022 · Updated 3 years ago
- ☆17 · Jun 30, 2020 · Updated 5 years ago
- ☆17 · Nov 25, 2019 · Updated 6 years ago
- Adapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different language model ☆33 · Jan 26, 2020 · Updated 6 years ago
- ☆37 · Jun 28, 2021 · Updated 4 years ago
- Automatically constructing corpus for automatic speech recognition from YouTube videos ☆157 · Feb 15, 2020 · Updated 6 years ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆18 · Nov 30, 2022 · Updated 3 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library. ☆15 · May 19, 2020 · Updated 5 years ago
- ☆20 · Jul 22, 2022 · Updated 3 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION ☆46 · May 12, 2023 · Updated 2 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fast. ☆16 · Jun 17, 2022 · Updated 3 years ago
- Large scale (>200h) and publicly available read audio book corpus. This corpus is an augmentation of LibriSpeech ASR Corpus (1000h) and c… ☆44 · Jul 9, 2022 · Updated 3 years ago
- Starter code for the VMT task and challenge ☆51 · Jul 29, 2020 · Updated 5 years ago
- ☆10 · Mar 20, 2021 · Updated 4 years ago
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- ☆11 · Nov 5, 2021 · Updated 4 years ago
- Multistage Fusion with Forget Gate for Multimodal Summarization in Open-Domain Videos ☆12 · Oct 8, 2020 · Updated 5 years ago
- ☆10 · Oct 16, 2025 · Updated 4 months ago
- A corpus of diacritized Hebrew texts (טקסט מנוקד) ☆11 · May 4, 2022 · Updated 3 years ago
- WeNet speech-to-text for React Native ☆10 · Nov 1, 2022 · Updated 3 years ago