keonlee9420/Comprehensive-Tacotron2

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/keonlee9420/Comprehensive-Tacotron2)

keonlee9420 / Comprehensive-Tacotron2

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single-, multi-speaker TTS and several techniques to enforce the robustness and efficiency of the model.

☆49

Alternatives and similar repositories for Comprehensive-Tacotron2

Users that are interested in Comprehensive-Tacotron2 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

miccio-dk / NISQA
View on GitHub
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16Apr 13, 2022Updated 4 years ago
LeoniusChen / Attentions-in-Tacotron
View on GitHub
☆69Mar 31, 2021Updated 5 years ago
choiHkk / Transformer-TTS-V2
View on GitHub
☆25Mar 6, 2024Updated 2 years ago
keonlee9420 / Stepwise_Monotonic_Multihead_Attention
View on GitHub
PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer …
☆39May 16, 2021Updated 5 years ago
awni / future_speech
View on GitHub
The History of Speech Recognition to the Year 2030
☆13Aug 14, 2021Updated 4 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
idiap / zff_vad
View on GitHub
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
☆23Oct 19, 2023Updated 2 years ago
rishikksh20 / Phone-Level-Mixture-Density-Network-for-TTS
View on GitHub
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
☆45Dec 1, 2021Updated 4 years ago
BridgetteSong / ExpressiveTacotron
View on GitHub
This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN…
☆74Sep 21, 2022Updated 3 years ago
keonlee9420 / VAENAR-TTS
View on GitHub
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
☆74Aug 3, 2021Updated 4 years ago
keonlee9420 / FastPitchFormant
View on GitHub
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
☆74Aug 3, 2021Updated 4 years ago
keonlee9420 / Expressive-FastSpeech2
View on GitHub
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,…
☆321Aug 25, 2021Updated 4 years ago
keonlee9420 / Parallel-Tacotron2
View on GitHub
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
☆191Nov 18, 2021Updated 4 years ago
NeuroWave-ai / CUCVAE-TTS
View on GitHub
☆25Mar 12, 2022Updated 4 years ago
ga642381 / FastSpeech2
View on GitHub
Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech
☆99Oct 14, 2022Updated 3 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
rishikksh20 / vae_tacotron2
View on GitHub
VAE Tacotron 2, an alternative of GST Tacotron
☆91Jul 6, 2023Updated 3 years ago
keonlee9420 / WaveGrad2
View on GitHub
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
☆68Aug 3, 2021Updated 4 years ago
keonlee9420 / Daft-Exprt
View on GitHub
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
☆55Oct 15, 2021Updated 4 years ago
keonlee9420 / StyleSpeech
View on GitHub
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation
☆197Feb 10, 2022Updated 4 years ago
hash2430 / pitchtron
View on GitHub
TTS for pitch-accented language. Korean dialect DB.
☆155May 12, 2023Updated 3 years ago
Tomiinek / Blizzard2013_Segmentation
View on GitHub
Transcripts and segmentation for the Blizzard 2013 audiobooks also known as the Lessac or Blizzard 2013 dataset.
☆45Nov 13, 2019Updated 6 years ago
ZackHodari / discrete_intonation
View on GitHub
Code for paper titled "Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0" submitt…
☆17May 24, 2020Updated 6 years ago
vara-tts / VARA-TTS
View on GitHub
Demo audio of VARA-TTS model
☆20Jun 11, 2021Updated 5 years ago
cnlinxi / tpse_tacotron2
View on GitHub
TPSE-GST Tacotron2
☆14May 1, 2019Updated 7 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
keonlee9420 / STYLER
View on GitHub
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllabl…
☆159Jun 5, 2025Updated last year
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year
KevinMIN95 / StyleSpeech
View on GitHub
Official implementation of Meta-StyleSpeech and StyleSpeech
☆254Feb 9, 2022Updated 4 years ago
mutiann / neural-lexicon-reader
View on GitHub
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
☆21Jul 25, 2022Updated 4 years ago
rishikksh20 / Avocodo-pytorch
View on GitHub
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
☆122Jul 14, 2022Updated 4 years ago
maum-ai / wavegrad2
View on GitHub
Unofficial Pytorch Implementation of WaveGrad2
☆111Aug 18, 2021Updated 4 years ago
spring-media / DeepForcedAligner
View on GitHub
☆81Aug 8, 2025Updated 11 months ago
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
msalhab96 / MultiSpeech
View on GitHub
pytorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper
☆21Jun 23, 2022Updated 4 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
roholazandie / ryan-tts
View on GitHub
☆18Jan 17, 2022Updated 4 years ago
zzw922cn / wesinger2
View on GitHub
Synthesized singing voice demos of WeSinger 2 paper.
☆26Feb 20, 2023Updated 3 years ago
BogiHsu / Tacotron2-PyTorch
View on GitHub
Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.
☆148Apr 12, 2022Updated 4 years ago
v-nhandt21 / MusicVoiceConversion
View on GitHub
Sing any popular song with your voice
☆11Jul 10, 2022Updated 4 years ago
nii-yamagishilab / multi-speaker-tacotron
View on GitHub
VCTK multi-speaker tacotron for ICASSP 2020
☆266Mar 29, 2022Updated 4 years ago
b04901014 / MQTTS
View on GitHub
☆260May 15, 2023Updated 3 years ago
NVIDIA / radtts
View on GitHub
Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, …
☆291Apr 6, 2023Updated 3 years ago