CUNY-CL / wikipron
Massively multilingual pronunciation mining
☆321 · Updated this week
Related projects
Alternatives and complementary repositories for wikipron
- Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features. ☆221 · Updated 3 months ago
- Grapheme to phoneme conversion with deep learning. ☆358 · Updated 11 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆135 · Updated this week
- A tool for transcribing orthographic text as IPA (International Phonetic Alphabet) ☆655 · Updated 2 months ago
- 🙊 software for creating speech recognition models. ☆152 · Updated 5 months ago
- Universal Romanizer that can convert any Unicode script to Roman (Latin) script ☆154 · Updated 3 months ago
- Data and code for grapheme-to-phoneme transducers in lots of languages ☆130 · Updated 7 months ago
- g2p: English Grapheme To Phoneme Conversion ☆811 · Updated last year
- Read, write, and manipulate Praat TextGrid files with Python ☆126 · Updated 11 months ago
- ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings ☆81 · Updated 6 months ago
- Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA) ☆80 · Updated last year
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆235 · Updated 5 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆285 · Updated this week
- Phonetisaurus G2P ☆453 · Updated 5 months ago
- A tool for automatic phoneme transcription ☆157 · Updated last year
- A Python library for working with Praat, TextGrids, time-aligned audio transcripts, and audio files. It is primarily used for extracting … ☆313 · Updated 11 months ago
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆144 · Updated last year
- Charsiu: A neural phonetic aligner. ☆278 · Updated 2 years ago
- A Python module for interacting with Praat TextGrid files. Also includes a class for reading HTK .mlf files into Praat ☆285 · Updated last year
- CMU Wilderness Multilingual Speech Dataset ☆272 · Updated 5 years ago
- Allosaurus is a pretrained universal phone recognizer for more than 2000 languages ☆565 · Updated 6 months ago
- DeepSpeech based forced alignment tool ☆234 · Updated 3 years ago
- Multilingual G2P in 100 languages ☆288 · Updated last year
- Segment an audio file and obtain utterance alignments. (Python package) ☆321 · Updated 6 months ago
- This is a GitHub repository of the abandonware Sequitur G2P by Bisani & Ney ☆155 · Updated 4 months ago
- A phoneme-allophone database for many languages ☆48 · Updated 4 years ago
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation ☆512 · Updated last year
- A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech. ☆237 · Updated last year
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones. ☆30 · Updated 9 months ago
- Linguistic processing for Common Voice ☆52 · Updated 10 months ago