HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
☆45Mar 2, 2021Updated 5 years ago
Alternatives and similar repositories for multiband-hifigan
Users that are interested in multiband-hifigan are comparing it to the libraries listed below
Sorting:
- ☆15Nov 11, 2024Updated last year
- PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.☆73Aug 3, 2021Updated 4 years ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- Synthesized singing voice demos of WeSinger 2 paper.☆26Feb 20, 2023Updated 3 years ago
- ☆15May 8, 2021Updated 4 years ago
- Incorporating AutoVocoder to MB-iSTFT-VITS☆48Dec 1, 2022Updated 3 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆64May 30, 2023Updated 2 years ago
- The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synth…☆87Dec 20, 2022Updated 3 years ago
- Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.☆157Jul 2, 2021Updated 4 years ago
- Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.☆71Mar 19, 2021Updated 4 years ago
- The source code for the paper XiaoiceSing2 (interspeech2023)☆49Jan 15, 2024Updated 2 years ago
- ☆19Feb 2, 2023Updated 3 years ago
- ☆11May 15, 2025Updated 9 months ago
- RepVgg + HiFiGAN☆36Aug 10, 2022Updated 3 years ago
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation☆80Feb 24, 2021Updated 5 years ago
- An implement of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"☆125Nov 4, 2020Updated 5 years ago
- ☆54Mar 2, 2023Updated 3 years ago
- A low-bitrate single-codebook 16 / 24 kHz speech codec based on focal modulation☆145Nov 30, 2025Updated 3 months ago
- Demo audio of VARA-TTS model☆20Jun 11, 2021Updated 4 years ago
- This repository implement a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …☆57Aug 9, 2025Updated 6 months ago
- Rich Prosody Diversity Modelling with Phone-level Mixture Density Network☆45Dec 1, 2021Updated 4 years ago
- ☆19Mar 22, 2024Updated last year
- ☆134Feb 4, 2023Updated 3 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)☆125Jun 16, 2022Updated 3 years ago
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training☆146Aug 22, 2022Updated 3 years ago
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation☆76Aug 30, 2021Updated 4 years ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆36Jan 17, 2024Updated 2 years ago
- ☆55Aug 11, 2022Updated 3 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- ☆13Sep 1, 2023Updated 2 years ago
- HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks☆225Apr 8, 2021Updated 4 years ago
- This is the GitHub page for publicly available emotional speech data.☆381Jan 6, 2022Updated 4 years ago
- A pitch detection model trained to be robust against noise and reverberation environments.☆27Jan 21, 2025Updated last year
- ICASSP 2023 Accepted☆190May 6, 2024Updated last year
- ☆69Mar 31, 2021Updated 4 years ago
- Unofficial pytorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training☆136Feb 18, 2023Updated 3 years ago
- LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)☆43Jun 13, 2024Updated last year
- ☆31Jul 13, 2023Updated 2 years ago
- UT-Sarulab MOS prediction system using SSL models☆296Apr 11, 2024Updated last year