roedoejet / FastSpeech2_ACL2022_reproducibilityView external linksLinks
☆21Feb 27, 2024Updated last year
Alternatives and similar repositories for FastSpeech2_ACL2022_reproducibility
Users that are interested in FastSpeech2_ACL2022_reproducibility are comparing it to the libraries listed below
Sorting:
- TTS Text Analyzer☆32Jul 20, 2023Updated 2 years ago
- ☆25Mar 12, 2022Updated 3 years ago
- Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023☆27Apr 27, 2023Updated 2 years ago
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth…☆87Dec 20, 2022Updated 3 years ago
- ☆99Jan 19, 2026Updated 3 weeks ago
- A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis☆44Jul 24, 2023Updated 2 years ago
- Based on https://github.com/fatchord/WaveRNN☆24May 3, 2020Updated 5 years ago
- Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale☆28Aug 4, 2023Updated 2 years ago
- Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization☆11Dec 21, 2024Updated last year
- Multi-Speaker Pytorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech☆98Oct 14, 2022Updated 3 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆142Apr 27, 2024Updated last year
- ☆13Nov 16, 2020Updated 5 years ago
- High-level API for tar-based dataset☆12Feb 3, 2024Updated 2 years ago
- Automatic speech annotator processing speech with voice activaty detection, overlapping speech detection, speaker diarization and automat…☆33Jun 14, 2024Updated last year
- A tensorflow based implementation of DeepVoice3 https://arxiv.org/abs/1710.07654☆13Jun 5, 2018Updated 7 years ago
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders"☆14Sep 20, 2024Updated last year
- Official Implement of Multi-Stage Multi-Codebook (MSMC) TTS☆167Apr 10, 2024Updated last year
- Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, …☆291Apr 6, 2023Updated 2 years ago
- Vocoder NSF-HiFiGAN (Moved into deepaudio)☆55Dec 11, 2022Updated 3 years ago
- Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023☆251Jun 5, 2025Updated 8 months ago
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training☆146Aug 22, 2022Updated 3 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆64May 30, 2023Updated 2 years ago
- Clean and modernized implementation of FastSpeech2/LightSpeech using IPA☆18Aug 16, 2024Updated last year
- ☆61Nov 4, 2023Updated 2 years ago
- Demo page of our paper Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks With Guided Attention, ICASSP 201…☆15May 30, 2021Updated 4 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- A real time implementation of the ddsp from google magenta.☆15Nov 8, 2021Updated 4 years ago
- Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)☆121Jan 24, 2023Updated 3 years ago
- ☆34Jul 16, 2019Updated 6 years ago
- A neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed layerwis…☆50Feb 6, 2026Updated last week
- PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation☆197Feb 10, 2022Updated 4 years ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆77Dec 3, 2025Updated 2 months ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert☆40Jul 10, 2023Updated 2 years ago
- Datasets for turn-taking research☆17Dec 21, 2023Updated 2 years ago
- End-To-End SpeechSynthesis system with knowledge distillation☆18Jul 16, 2022Updated 3 years ago
- This is a balanced dataset for English homograph disambiguation (HD), generated with Meta's Llama 2-Chat 70B model.☆22Jan 22, 2024Updated 2 years ago
- Blazing fast data loading with HuggingFace Dataset and Ray Data☆16Jan 12, 2024Updated 2 years ago
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor☆18Jun 5, 2023Updated 2 years ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆152Sep 14, 2023Updated 2 years ago