roedoejet / FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
☆22 · Updated last year
Alternatives and similar repositories for FastSpeech2:
Users who are interested in FastSpeech2 are comparing it to the repositories listed below.
- ☆74 · Updated 2 years ago
- The code for the AISHELL-3 baseline acoustic model ☆67 · Updated 4 years ago
- ☆64 · Updated last year
- ☆56 · Updated last year
- Chinese Text Normalization and Dataset ☆82 · Updated 2 years ago
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth… ☆82 · Updated 2 years ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",… ☆79 · Updated last year
- Predict prosody labels for Chinese sentences. ☆41 · Updated 2 years ago
- TTS-frontend with BERT and CRF/LSTM (For Tacotron) ☆52 · Updated 4 years ago
- Huawei Grad-TTS for Chinese ☆46 · Updated last year
- Implementation of TTS with a combination of Tacotron2 and HiFi-GAN ☆9 · Updated 3 years ago
- Chinese and English Bilingual G2P ☆20 · Updated last year
- The Implementation of FastSpeech2 Based on PyTorch. ☆52 · Updated last year
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 · Updated last year
- Target Speaker Extraction Toolkit ☆146 · Updated this week
- Command-line interface and Python library to transcribe pinyin to IPA. The tones are attached to the vowel of the syllable. ☆34 · Updated 8 months ago
- MagicData-RAMC Dataset and Baseline ☆53 · Updated 2 years ago
- CHiME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar… ☆76 · Updated 9 months ago
- ☆91 · Updated last year
- Materials accompanying the paper "Phonological features for 0-shot multilingual speech synthesis" ☆32 · Updated 4 years ago
- End-to-End Keyword Spotting (E2E-KWS) using a character-level LSTM ☆39 · Updated 2 years ago
- TransferTTS (Zero-Shot learning of VITS) ☆94 · Updated 2 years ago
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis ☆41 · Updated 4 years ago
- ☆37 · Updated 7 months ago
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ☆94 · Updated last year
- The baseline system for the ICASSP2024 ICMC-ASR Challenge. ☆47 · Updated last year
- WeNet online decode demo ☆29 · Updated 3 years ago
- Computes the Mel-Cepstral Distance of two WAV files based on the paper "Mel-Cepstral Distance Measure for Objective Speech Quality Assess… ☆51 · Updated 2 months ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆39 · Updated last year
- ☆50 · Updated 4 months ago