Unicode Standard tokenization routines and orthography profile segmentation
☆39 · Feb 20, 2025 · Updated last year
Alternatives and similar repositories for segments
Users interested in segments compare it to the libraries listed below.
- Converts Mandarin Chinese pinyin notation to IPA (International Phonetic Alphabet) notation ☆18 · Nov 28, 2023 · Updated 2 years ago
- A compact audio-to-phoneme aligner for singing voice ☆12 · Jan 17, 2024 · Updated 2 years ago
- This repository contains the files used for our Interspeech 2017 paper. ☆16 · May 30, 2017 · Updated 8 years ago
- Data and code for grapheme-to-phoneme transducers in lots of languages ☆147 · Apr 5, 2024 · Updated last year
- This is a balanced dataset for English homograph disambiguation (HD), generated with Meta's Llama 2-Chat 70B model. ☆22 · Jan 22, 2024 · Updated 2 years ago
- PPSpeech: Phrase based Parallel End-to-End TTS System ☆35 · Aug 31, 2020 · Updated 5 years ago
- SIGMORPHON 2020 Shared Task: Grapheme-to-Phoneme, Unsupervised Induction of Morphology, and Typologically Diverse Morphological Inflectio… ☆36 · Apr 25, 2025 · Updated 10 months ago
- ☆19 · Mar 22, 2024 · Updated last year
- Breaks a word into syllables using an LSTM-based neural network. ☆20 · Aug 14, 2023 · Updated 2 years ago
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020) ☆83 · Nov 13, 2021 · Updated 4 years ago
- Text-to-Speech tutorial at SLTU 2016 ☆35 · May 10, 2016 · Updated 9 years ago
- Massively multilingual pronunciation mining ☆362 · Jan 13, 2026 · Updated last month
- A database of number names for 186 languages, locales, and scripts ☆67 · Mar 3, 2023 · Updated 2 years ago
- Charsiu: A neural phonetic aligner. ☆331 · Sep 19, 2022 · Updated 3 years ago
- Code for paper titled "Using generative modelling to produce varied intonation for speech synthesis" submitted to the Speech Synthesis Wo… ☆24 · Dec 8, 2019 · Updated 6 years ago
- 🗣️ Convert between phonetic alphabets ☆11 · Feb 7, 2022 · Updated 4 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆45 · May 25, 2021 · Updated 4 years ago
- Mirror of GlottHMM ☆10 · Jun 7, 2016 · Updated 9 years ago
- Grapheme to phoneme converter for Estonian ☆14 · May 27, 2021 · Updated 4 years ago
- Festvox voice building tools ☆108 · Updated this week
- Collection of small Lua modules ☆10 · Feb 15, 2026 · Updated last week
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 · Dec 3, 2024 · Updated last year
- ☆26 · Apr 21, 2021 · Updated 4 years ago
- 24-hour Automatic Speech Recognition ☆27 · Jun 4, 2021 · Updated 4 years ago
- ☆30 · May 3, 2023 · Updated 2 years ago
- Data processing tools for preparing speech and labels for training TTS voices ☆29 · Aug 13, 2020 · Updated 5 years ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023 ☆27 · Apr 27, 2023 · Updated 2 years ago
- Links to data used in Sproat & Jaitly (https://arxiv.org/abs/1611.00068) experiments. ☆77 · Jul 9, 2021 · Updated 4 years ago
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated last year
- A TeX implementation in a single C++11 class. ☆19 · Sep 19, 2020 · Updated 5 years ago
- Cross-Linguistic Transcription Systems ☆17 · Dec 17, 2024 · Updated last year
- A minimal modern (Lua)TeX distribution ☆15 · May 12, 2024 · Updated last year
- Sequence algorithms for use in Flashlight. ☆14 · Jan 12, 2026 · Updated last month
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning ☆35 · Updated this week
- A lexicon compiler for non-suffixational morphologies ☆13 · Jan 29, 2026 · Updated last month
- Implementation of Android's TextToSpeechService that provides Estonian text-to-speech ☆17 · Jan 19, 2019 · Updated 7 years ago
- Readers that enable reading Kaldi ark in TensorFlow ☆17 · Mar 7, 2018 · Updated 7 years ago
- ☆14 · Jun 12, 2015 · Updated 10 years ago
- Fast and differentiable hidden Markov model in C++ ☆19 · Jan 20, 2023 · Updated 3 years ago