as-ideas / DeepPhonemizer
Grapheme to phoneme conversion with deep learning.
☆346 · Updated 9 months ago
Related projects:
- Multilingual G2P in 100 languages ☆274 · Updated last year
- Charsiu: A neural phonetic aligner. ☆267 · Updated 2 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆319 · Updated 4 months ago
- g2p: English Grapheme To Phoneme Conversion ☆790 · Updated last year
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆138 · Updated last year
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S… ☆375 · Updated last year
- UniSpeech - Large Scale Self-Supervised Learning for Speech ☆419 · Updated 5 months ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ☆318 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆272 · Updated 2 months ago
- Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text ☆229 · Updated 4 years ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen… ☆194 · Updated 2 years ago
- Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo! ☆334 · Updated 2 years ago
- ☆250 · Updated last year
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf ☆359 · Updated 2 years ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ☆223 · Updated 2 years ago
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation ☆508 · Updated last year
- Large, modern dataset for speech recognition ☆629 · Updated 6 months ago
- Massively multilingual pronunciation mining ☆314 · Updated 2 weeks ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition. ☆421 · Updated last year
- A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis ☆357 · Updated last year
- This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab. ☆555 · Updated last year
- A Python module for interacting with Praat TextGrid files. Also includes a class for reading HTK .mlf files into Praat ☆281 · Updated 10 months ago
- Allosaurus is a pretrained universal phone recognizer for more than 2000 languages ☆546 · Updated 4 months ago
- Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed. ☆143 · Updated 2 years ago
- Speaker embedding (d-vector) trained with GE2E loss ☆270 · Updated 8 months ago
- Data and code for grapheme-to-phoneme transducers in lots of languages ☆128 · Updated 5 months ago
- This is the GitHub page for publicly available emotional speech data. ☆314 · Updated 2 years ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ☆276 · Updated 3 years ago
- VCTK multi-speaker tacotron for ICASSP 2020 ☆265 · Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization ☆251 · Updated 8 months ago