zsl24 / Tacotron2-Mandarin-HiFiGAN
Implementation of TTS using a combination of Tacotron2 and HiFi-GAN
☆9 · Updated 2 years ago
Related projects:
- Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. ☆86 · Updated 2 years ago
- Computes the Mel-Cepstral Distance of two WAV files based on the paper "Mel-Cepstral Distance Measure for Objective Speech Quality Assess… ☆46 · Updated 7 months ago
- ☆39 · Updated last year
- UT-Sarulab MOS prediction system using SSL models ☆163 · Updated 5 months ago
- A pytorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK ☆60 · Updated 2 years ago
- ☆74 · Updated 2 years ago
- Materials accompanying the paper "Phonological features for 0-shot multilingual speech synthesis" ☆31 · Updated 4 years ago
- [InterSpeech 2020] "Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition" by … ☆39 · Updated last year
- ☆69 · Updated 3 years ago
- ☆72 · Updated last year
- A toolkit for any-to-any encoder-decoder voice conversion systems ☆80 · Updated last year
- The baseline system for the ICASSP2024 ICMC-ASR Challenge. ☆42 · Updated 9 months ago
- ☆40 · Updated 2 weeks ago
- a curated list of speech datasets (110+ datasets, 75+ easy to download) ☆76 · Updated last year
- Calculation of MCD (dB) between two speech waveforms ☆55 · Updated 3 years ago
- MagicData-RAMC Dataset and Baseline ☆49 · Updated 2 years ago
- Speech (audio) subjective evaluation system ☆37 · Updated 4 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆78 · Updated 11 months ago
- used to evaluate wavenet vocoder by rmse f0, MCD, rmse ap... ☆14 · Updated 4 years ago
- Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software ☆37 · Updated 3 months ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis ☆101 · Updated 3 years ago
- ☆29 · Updated 2 years ago
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth… ☆81 · Updated last year
- Implementation of the AlignTTS ☆76 · Updated last year
- UTokyo-SaruLab MOS Prediction System ☆49 · Updated this week
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization. ☆45 · Updated 4 months ago
- Predict prosody labels for Chinese sentences. ☆41 · Updated 2 years ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆110 · Updated 7 months ago
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit… ☆73 · Updated last year
- A sequence-to-sequence voice conversion toolkit. ☆84 · Updated 2 months ago