isi-nlp / uroman
Universal Romanizer that can convert any Unicode script to Roman (Latin) script
☆203 Updated 10 months ago
Alternatives and similar repositories for uroman
Users that are interested in uroman are comparing it to the libraries listed below
- Multilingual G2P in 100 languages ☆327 Updated 2 years ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆163 Updated 3 weeks ago
- SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ☆38 Updated 2 years ago
- Various speech datasets made available to the public ☆118 Updated 5 months ago
- A phoneme-allophone database for many languages ☆52 Updated 5 years ago
- Massively multilingual pronunciation mining ☆340 Updated 2 weeks ago
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆162 Updated last year
- Python package and data files for manipulating phonological segments (phones, phonemes) in terms of universal phonological features. ☆259 Updated this week
- Segment an audio file and obtain utterance alignments. (Python package) ☆336 Updated last year
- CVSS: A Massively Multilingual Speech-to-Speech Translation Corpus ☆195 Updated 2 years ago
- Data and code for grapheme-to-phoneme transducers in lots of languages ☆137 Updated last year
- ☆56 Updated 2 years ago
- Tracking the progress in end-to-end speech translation ☆260 Updated last year
- Read, write, and manipulate Praat TextGrid files with Python ☆128 Updated last year
- Universal multilingual automatic speech transcription into IPA ☆65 Updated 3 months ago
- 🙊 software for creating speech recognition models. ☆159 Updated last year
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆243 Updated 5 years ago
- Linguistic processing for Common Voice ☆55 Updated last year
- ☆43 Updated 2 years ago
- Complementary code for our paper "Automatic punctuation restoration with BERT models" ☆49 Updated last year
- ☆178 Updated 3 years ago
- Charsiu: A neural phonetic aligner. ☆301 Updated 2 years ago
- A curated list of awesome disfluency detection publications along with the released code and bibliographical information ☆76 Updated 4 years ago
- Large scale (>200h) and publicly available read audio book corpus. This corpus is an augmentation of LibriSpeech ASR Corpus (1000h) and c… ☆43 Updated 2 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆26 Updated 2 years ago
- Covering grammars for English and Russian text normalization ☆61 Updated 5 years ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition. ☆444 Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆47 Updated 2 years ago
- ☆36 Updated last month
- CMU Wilderness Multilingual Speech Dataset ☆281 Updated 6 years ago