petronny / g2p
Pre-trained grapheme-to-phoneme (G2P) models
☆25 · Updated 4 years ago
Alternatives and similar repositories for g2p
Users who are interested in g2p are comparing it to the repositories listed below:
- Neural network-based forced alignment with bidirectional attention mechanism ☆78 · Updated 9 months ago
- ☆44 · Updated 4 years ago
- Multilingual speech aligner ☆77 · Updated last year
- Chinese Text Normalization and Dataset ☆86 · Updated 3 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆95 · Updated 11 months ago
- "SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts" ☆76 · Updated 2 years ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆40 · Updated 2 years ago
- Text frontend for ESPnet TTS recipes ☆34 · Updated 4 years ago
- Phoneme segmentation using pre-trained speech models ☆55 · Updated 2 years ago
- ☆111 · Updated 3 years ago
- [ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels ☆42 · Updated last year
- ☆64 · Updated 3 years ago
- ☆64 · Updated 3 years ago
- Chinese prosody prediction model based on random forests and conditional random fields ☆28 · Updated last year
- ☆69 · Updated 4 years ago
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup ☆76 · Updated 3 months ago
- Implementation of the subscale framework from the WaveRNN paper, building on top of Fatchord's WaveRNN repo ☆19 · Updated 5 years ago
- Speech samples and code of BEdit-TTS ☆34 · Updated 2 years ago
- Speech (audio) subjective evaluation system ☆42 · Updated 5 years ago
- Code for paper "Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion" ☆36 · Updated 5 years ago
- AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data ☆33 · Updated last year
- TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis ☆87 · Updated 4 years ago
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ☆102 · Updated last year
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth… ☆86 · Updated 2 years ago
- [InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by … ☆39 · Updated 2 years ago
- VoicePAT is a modular and efficient toolkit for voice privacy research, with a main focus on speaker anonymization. ☆51 · Updated last year
- A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augment Techniques ☆62 · Updated 4 years ago
- ☆22 · Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 · Updated 2 years ago
- ☆52 · Updated 4 years ago