getalp/mass-dataset

MaSS - Multilingual corpus of Sentence-aligned Spoken utterances

☆50

Alternatives and similar repositories for mass-dataset

Users who are interested in mass-dataset are comparing it to the repositories listed below.

indonesian-nlp / wav2vec2-indonesian
☆20 · Apr 5, 2021 · Updated 5 years ago
coqui-ai / open-bible-scripts
scripts for working with open.bible data
☆26 · Jan 24, 2022 · Updated 4 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
sarahjuan / iban
☆14 · Jun 12, 2015 · Updated 11 years ago
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
eastonYi / Unsupervised-ASR
unsupervised ASR (mainly phone classifier) using EODM and GAN
☆12 · Oct 22, 2020 · Updated 5 years ago
revdotcom / words2num
Convert words to numbers
☆21 · Apr 13, 2022 · Updated 4 years ago
besacier / mboshi-french-parallel-corpus
☆23 · Apr 8, 2022 · Updated 4 years ago
gentaiscool / indonesian-nlp
A curated list of research papers and resources on Indonesian languages
☆41 · Mar 21, 2024 · Updated 2 years ago
Yangyangii / TPGST-Tacotron
Google's TPGST reimplementation.
☆34 · Dec 11, 2019 · Updated 6 years ago
coqui-ai / data-checker
🫠 check your data, before you wreck your model
☆16 · Aug 11, 2022 · Updated 3 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
BUTSpeechFIT / ASR-hybrid-decoding
☆17 · Nov 25, 2019 · Updated 6 years ago
xinjli / phonepiece
phone inventory library
☆17 · May 15, 2023 · Updated 3 years ago
rhasspy / phonetisaurus-pypi
Python wrapper for phonetisaurus grapheme to phoneme tool
☆12 · Mar 11, 2021 · Updated 5 years ago
bootphon / pygamma-agreement
Gamma Agreement in Python
☆46 · Mar 4, 2024 · Updated 2 years ago
dqqcasia / st
End-to-end Speech Translation
☆35 · Apr 12, 2021 · Updated 5 years ago
KathyReid / opensource-voice-tools
A repo listing known open source voice tools, ordered by where they sit in the voice stack
☆28 · Sep 23, 2022 · Updated 3 years ago
charlesliucn / LanMIT
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22 · Jul 12, 2019 · Updated 7 years ago
alpoktem / bible2speechDB
Scripts to create speech corpora from open.bible
☆13 · Jan 3, 2022 · Updated 4 years ago
lingjzhu / zipa
A family of efficient speech models for multilingual phone recognition
☆68 · Jul 18, 2026 · Updated last week
i3thuan5 / hts_engine_python
python wrap for hts engine
☆14 · Jan 30, 2018 · Updated 8 years ago
Koziev / StressModel
Neural model for prediction of stress position in Russian words
☆13 · Jun 22, 2025 · Updated last year
beer-asr / beer
Bayesian spEEch Recognizer
☆55 · Jan 11, 2021 · Updated 5 years ago
speechcatcher-asr / speechcatcher-data
☆11 · Sep 5, 2025 · Updated 10 months ago
bpopeters / mg2p
Multilingual grapheme-to-phoneme conversion
☆20 · Feb 23, 2018 · Updated 8 years ago
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
sigmorphon / 2020
SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio…
☆36 · Apr 25, 2025 · Updated last year
motazsaad / ara-pronunciation-tool
A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
speechio / asr-noises
A handy dataset of noises for ASR
☆22 · May 29, 2019 · Updated 7 years ago
isca-sig-rosp / ISCA-SIG-RoSP
Web page for ISCA Special Interest Group: Robust Speech Processing (RoSP)
☆11 · Dec 4, 2023 · Updated 2 years ago
fengxin-bupt / Application-of-Word2vec-in-Phoneme-Recognition
Build an attention-based model for speech recognition. Use the Word2vec model to help train the attention model.
☆30 · Dec 18, 2019 · Updated 6 years ago
seungheondoh / hi_kia
wake-up word emotion recognition [APSIPA 2022]
☆17 · Nov 11, 2022 · Updated 3 years ago
ftyers / commonvoice-utils
Linguistic processing for Common Voice
☆59 · Jan 18, 2024 · Updated 2 years ago
prosodylab / prosodylab.dictionaries
A repository for dictionaries to be used with the Prosodylab-Aligner
☆17 · May 13, 2014 · Updated 12 years ago
davidmarttila / vocal-tract-grad
Vocal Tract Area Estimation by Gradient Descent
☆39 · Jul 16, 2023 · Updated 3 years ago
ryanrudes / YTTTS
The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions
☆53 · Apr 1, 2021 · Updated 5 years ago
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 4 years ago
dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 · Jul 16, 2023 · Updated 3 years ago