coqui-ai / TTS-papers
🐸 collection of TTS papers
★683 · Updated 10 months ago
Alternatives and similar repositories for TTS-papers:
Users who are interested in TTS-papers are comparing it to the repositories listed below.
- List of speech synthesis papers. ★1,038 · Updated last year
- An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning. ★836 · Updated last year
- A Generative Flow for Text-to-Speech via Monotonic Alignment Search ★692 · Updated 2 years ago
- Tools for handling speech data in machine learning projects. ★1,016 · Updated this week
- Large, modern dataset for speech recognition ★673 · Updated last year
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies ★1,324 · Updated 11 months ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ★968 · Updated 6 months ago
- VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design ★560 · Updated last year
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis ★2,122 · Updated 9 months ago
- A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech ★447 · Updated 10 months ago
- FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion ★661 · Updated 3 months ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ★301 · Updated 3 years ago
- Grapheme to phoneme conversion with deep learning. ★381 · Updated last year
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf ★369 · Updated 3 years ago
- Command line utility for forced alignment using Kaldi ★1,458 · Updated last month
- unofficial vits2-TTS implementation in pytorch ★518 · Updated last year
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ★2,010 · Updated last year
- g2p: English Grapheme To Phoneme Conversion ★849 · Updated 2 years ago
- A fully working pytorch implementation of NaturalSpeech (Tan et al., 2022) ★475 · Updated last year
- Unsupervised Speech Decomposition Via Triple Information Bottleneck ★677 · Updated 6 months ago
- This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab. ★582 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ★310 · Updated 5 months ago
- UniSpeech - Large Scale Self-Supervised Learning for Speech ★458 · Updated last year
- Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch ★1,603 · Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ★412 · Updated last month
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch ★650 · Updated 7 months ago
- Official Implementation of StyleTTS ★431 · Updated 3 months ago
- FSA/FST algorithms, differentiable, with PyTorch compatibility. ★1,195 · Updated last week
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ★922 · Updated 8 months ago
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ★1,343 · Updated last year