PHOIBLE data and development.
☆141, Jul 5, 2024, updated last year
Alternatives and similar repositories for dev
Users that are interested in dev are comparing it to the libraries listed below
- A phoneme-allophone database for many languages (☆53, May 19, 2020, updated 5 years ago)
- Dataset of ICASSP 2021 "Multilingual Phonetic Dataset for Low Resource Speech Recognition" (☆46, May 12, 2023, updated 2 years ago)
- Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994) (☆13, Oct 8, 2020, updated 5 years ago)
- Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features. (☆295, Oct 22, 2025, updated 4 months ago)
- A family of efficient speech models for multilingual phone recognition (☆45, Feb 12, 2026, updated 2 weeks ago)
- Moved to cldf-datasets/wals (☆14, Jan 14, 2020, updated 6 years ago)
- Feature extraction for accented speech or pathological speech (☆17, Apr 2, 2019, updated 6 years ago)
- The Unicode Cookbook for Linguists (☆56, Nov 21, 2020, updated 5 years ago)
- PyTorch model for contextless phoneme prediction from speech audio (☆30, Oct 30, 2025, updated 4 months ago)
- Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021 (☆11, Jun 13, 2021, updated 4 years ago)
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. (☆11, Nov 3, 2020, updated 5 years ago)
- A tool for transcribing orthographic text as IPA (International Phonetic Alphabet) (☆799, Dec 24, 2025, updated 2 months ago)
- Steps to perform text-based speaker diarization with the Kaldi toolkit (☆12, Nov 2, 2018, updated 7 years ago)
- Massively multilingual pronunciation mining (☆362, Jan 13, 2026, updated last month)
- Phonological CorpusTools (☆121, May 24, 2025, updated 9 months ago)
- Allosaurus is a pretrained universal phone recognizer for more than 2000 languages (☆706, Apr 26, 2024, updated last year)
- IPA tokeniser (☆19, Jul 28, 2025, updated 7 months ago)
- Labeled data for homograph disambiguation (☆62, Jun 1, 2023, updated 2 years ago)
- Glossa LaTeX resources (☆14, Jan 9, 2026, updated last month)
- (☆14, Aug 19, 2024, updated last year)
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment (☆16, Apr 13, 2022, updated 3 years ago)
- Trainable algorithm for automatic measurement of voice onset time (☆68, Jul 26, 2023, updated 2 years ago)
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core (☆15, Jun 19, 2023, updated 2 years ago)
- Phone inventory library (☆17, May 15, 2023, updated 2 years ago)
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages (☆174, Jun 9, 2023, updated 2 years ago)
- Convert English text from written expressions into spoken forms (☆28, Jun 22, 2022, updated 3 years ago)
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. (☆28, Mar 14, 2025, updated 11 months ago)
- Universal multilingual automatic speech transcription into IPA (☆77, Feb 28, 2025, updated last year)
- Data and code for grapheme-to-phoneme transducers in lots of languages (☆147, Apr 5, 2024, updated last year)
- This is a balanced dataset for English homograph disambiguation (HD), generated with Meta's Llama 2-Chat 70B model. (☆22, Jan 22, 2024, updated 2 years ago)
- Hosts text-to-speech corpus and speech synthesizers for African languages. (☆18, May 31, 2023, updated 2 years ago)
- Megatts2 using HierSpeechpp's vocoder (☆18, Dec 2, 2024, updated last year)
- A neural vocoder supporting Ring Attention, Conformer and NSF. (☆24, Aug 1, 2025, updated 6 months ago)
- Read in a 'Praat' 'TextGrid' File (☆17, Oct 28, 2025, updated 4 months ago)
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format (☆33, Jul 5, 2019, updated 6 years ago)
- asr2k (☆52, Jun 2, 2024, updated last year)
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. (☆18, Aug 1, 2025, updated 6 months ago)
- (☆15, May 8, 2021, updated 4 years ago)
- Repository for multilingual speech data resources for native languages of Zambia. (☆20, Oct 9, 2024, updated last year)