TouchSky-Lab/Awesome-Text-to-Speech-TTS

Awesome TTS

☆63

Alternatives and similar repositories for Awesome-Text-to-Speech-TTS

Users who are interested in Awesome-Text-to-Speech-TTS are comparing it to the repositories listed below.

iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
nonverbalspeech38k / nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…
☆68 · Dec 26, 2025 · Updated 7 months ago
SWivid / AUV
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28 · Oct 11, 2025 · Updated 9 months ago
noetits / ICE-Talk
Interface for Controllable Expressive Talking Machine
☆40 · Sep 20, 2025 · Updated 10 months ago
RaphaelOlivier / robust_speech
☆43 · May 19, 2023 · Updated 3 years ago
CODEJIN / Glow_TTS
An implement of GlowTTS model. Several modes are added: speaker embedding, prosody encoder(GST), and gradient reversal.
☆55 · Sep 14, 2022 · Updated 3 years ago
b04901014 / FG-transformer-TTS
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
☆90 · Mar 5, 2022 · Updated 4 years ago
naver-ai / RapFlow-TTS
☆56 · Jul 16, 2025 · Updated last year
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
HappyColor / Vesper
A Compact and Effective Pretrained Model for Speech Emotion Recognition
☆55 · Apr 10, 2026 · Updated 3 months ago
Helw150 / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆16 · Jun 16, 2024 · Updated 2 years ago
deepvk / muse
🎵 muse: Music Separation
☆11 · Feb 14, 2024 · Updated 2 years ago
seungwonpark / awesome-tts-samples
Awesome list of TTS papers with audio samples
☆61 · Aug 18, 2020 · Updated 5 years ago
nivibilla / efficient-vits-finetuning
Finetuning VITS Efficiently
☆32 · Nov 6, 2023 · Updated 2 years ago
ex3ndr / supervoice-librilight-preprocessed
60k hours of phoneme-aligned audio from audio books
☆19 · Jul 27, 2024 · Updated last year
Jackson-Kang / Awesome-DL-based-Text-to-speech-Papers-and-Resources
Various Text-to-speech (TTS) papers based on Deep-learning
☆14 · Feb 26, 2021 · Updated 5 years ago
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
hcy71o / SC-CNN
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems
☆39 · Nov 1, 2023 · Updated 2 years ago
OmniMMI / OpenOmniNexus
a fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.
☆38 · Apr 7, 2025 · Updated last year
bastibe / MAPS-Scripts
A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram.
☆25 · Mar 29, 2021 · Updated 5 years ago
Edresson / GE2E-Speaker-Encoder
GE2E Speaker Encoder - Generalized End-To-End Loss for Speaker Verification
☆14 · May 17, 2020 · Updated 6 years ago
Scarfmonster / HiFiPLN
Multispeaker Community Vocoder Model for DiffSinger
☆39 · Aug 11, 2025 · Updated 11 months ago
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19 · Jun 5, 2023 · Updated 3 years ago
the-bird-F / Expressive-Vectors
[ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis
☆40 · Dec 24, 2025 · Updated 7 months ago
sarulab-speech / multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
☆24 · Mar 23, 2021 · Updated 5 years ago
EMOsuperb / EMO-SUPERB-submission
EMO-SUPERB submission
☆51 · Oct 13, 2025 · Updated 9 months ago
ml-for-speech / speechtoolkit
[Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit…
☆22 · Jan 10, 2025 · Updated last year
marianne-m / brouhaha-vad
Predicts the level of noise and reverberation on your audiofiles
☆190 · May 23, 2026 · Updated 2 months ago
jinhan / tacotron2-gst
Tacotron2 with Global Style Tokens
☆64 · Apr 19, 2019 · Updated 7 years ago
LeoniusChen / Attentions-in-Tacotron
☆69 · Mar 31, 2021 · Updated 5 years ago
shkim816 / acnn_speaker_recog
acnn for text-independent speaker recognition
☆10 · Feb 8, 2022 · Updated 4 years ago
hlt-mt / Speech-MASSIVE
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆25 · Oct 8, 2025 · Updated 9 months ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
KevinMIN95 / StyleSpeech
Official implementation of Meta-StyleSpeech and StyleSpeech
☆254 · Feb 9, 2022 · Updated 4 years ago
koudounasalkis / voc2vec
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
☆57 · Apr 14, 2025 · Updated last year
keonlee9420 / Comprehensive-Tacotron2
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…
☆49 · Jul 31, 2023 · Updated 2 years ago
Deep-unlearning / Finetune-Parakeet
☆25 · Oct 22, 2025 · Updated 9 months ago
chaitanya100100 / Relative-Attributes-Zero-Shot-Learning
Python Implementation of Visual Relative Attributes for Image Classification and Zero Shot Learning
☆22 · Jun 14, 2018 · Updated 8 years ago
NN-Project-2 / Emotion-TTS-Emebddings
This project explores zero-shot emotional speech synthesis using EMOD, a novel approach combining emotion and content embeddings for mult…
☆19 · Jun 26, 2026 · Updated last month