Edresson / TTS
Deep learning for Text to Speech
☆26 Updated 3 years ago
Related projects
Alternatives and complementary repositories for TTS
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 Updated last year
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition. ☆29 Updated 3 years ago
- VAE Tacotron 2, an alternative to GST Tacotron ☆87 Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆100 Updated last year
- Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19) ☆138 Updated last year
- Includes Basis-MelGAN, MelGAN, HifiGAN, and Multiband-HifiGAN; NHV may be added in the future. ☆154 Updated 3 years ago
- Interface for Controllable Expressive Talking Machine ☆38 Updated 9 months ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis ☆101 Updated 3 years ago
- A PyTorch Implementation of MelGAN ☆67 Updated 5 years ago
- Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y… ☆25 Updated 5 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆57 Updated 4 years ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model ☆106 Updated 3 years ago
- This is the implementation of our Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conver… ☆87 Updated 3 years ago
- ☆77 Updated 5 months ago
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020) ☆79 Updated 2 years ago
- Improving the Goodness of Pronunciation with DNNs and RNNs ☆31 Updated 6 years ago
- Text to Speech Synthesis based on controllable latent representation ☆14 Updated 5 years ago
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal. ☆53 Updated 2 years ago
- Online streaming speaker change detection model in PyTorch ☆36 Updated last year
- A tutorial implementation of voice conversion using PyTorch ☆35 Updated 6 years ago
- Implementation code of non-parallel sequence-to-sequence VC ☆250 Updated last year
- Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It has a highly es… ☆18 Updated 3 years ago
- Code for paper "Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion" ☆35 Updated 4 years ago
- A collection of voice conversion research ☆90 Updated this week
- This is the M-AILABS Speech Dataset ☆16 Updated 4 months ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors. ☆38 Updated 2 months ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder ☆115 Updated 2 years ago
- An unofficial implementation of https://arxiv.org/abs/2005.05106 ☆46 Updated 3 years ago
- PyTorch Implementation of Generalized End-to-End Loss for Speaker Verification ☆28 Updated 3 years ago
- Scripts to align a given wave to its transcription using models trained with Kaldi ☆32 Updated 5 years ago