Awesome TTS
☆63 · Sep 16, 2021 · Updated 4 years ago
Alternatives and similar repositories for Awesome-Text-to-Speech-TTS
Users that are interested in Awesome-Text-to-Speech-TTS are comparing it to the libraries listed below.
- ☆14 · Aug 19, 2024 · Updated last year
- Interface for Controllable Expressive Talking Machine ☆40 · Sep 20, 2025 · Updated 8 months ago
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal. ☆55 · Sep 14, 2022 · Updated 3 years ago
- Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis. ☆90 · Mar 5, 2022 · Updated 4 years ago
- Models and code for the INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model ☆13 · Mar 30, 2025 · Updated last year
- ☆43 · May 19, 2023 · Updated 3 years ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆53 · Apr 10, 2026 · Updated 2 months ago
- ☆24 · Updated this week
- Awesome list of TTS papers with audio samples ☆61 · Aug 18, 2020 · Updated 5 years ago
- Finetuning VITS Efficiently ☆33 · Nov 6, 2023 · Updated 2 years ago
- 60k hours of phoneme-aligned audio from audio books ☆19 · Jul 27, 2024 · Updated last year
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi… ☆66 · Dec 26, 2025 · Updated 5 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆16 · Jun 16, 2024 · Updated 2 years ago
- A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model. ☆38 · Apr 7, 2025 · Updated last year
- Various Text-to-speech (TTS) papers based on Deep-learning ☆14 · Feb 26, 2021 · Updated 5 years ago
- This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025. ☆55 · Apr 14, 2025 · Updated last year
- One command to start a streaming ASR server. ☆12 · Oct 2, 2024 · Updated last year
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems ☆39 · Nov 1, 2023 · Updated 2 years ago
- 🎵 muse: Music Separation ☆11 · Feb 14, 2024 · Updated 2 years ago
- A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram. ☆24 · Mar 29, 2021 · Updated 5 years ago
- GE2E Speaker Encoder - Generalized End-To-End Loss for Speaker Verification ☆14 · May 17, 2020 · Updated 6 years ago
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis ☆40 · Dec 24, 2025 · Updated 5 months ago
- A curated list of full-duplex spoken dialogue models & benchmarks ☆94 · Updated this week
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆19 · Jun 5, 2023 · Updated 3 years ago
- Multispeaker Community Vocoder Model for DiffSinger ☆38 · Aug 11, 2025 · Updated 10 months ago
- EMO-SUPERB submission ☆51 · Oct 13, 2025 · Updated 8 months ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆25 · Oct 8, 2025 · Updated 8 months ago
- [Early Alpha] A unified framework for text-to-speech, voice conversion, automatic speech recognition, audio classification, voice activit… ☆22 · Jan 10, 2025 · Updated last year
- ☆69 · Mar 31, 2021 · Updated 5 years ago
- Tacotron2 with Global Style Tokens ☆64 · Apr 19, 2019 · Updated 7 years ago
- Predicts the level of noise and reverberation in your audio files ☆189 · May 23, 2026 · Updated 3 weeks ago
- acnn for text-independent speaker recognition ☆10 · Feb 8, 2022 · Updated 4 years ago
- ☆40 · Nov 18, 2025 · Updated 6 months ago
- ☆12 · Mar 11, 2025 · Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 · Jul 31, 2023 · Updated 2 years ago
- [INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment" ☆67 · Jun 16, 2025 · Updated last year
- Official implementation of Meta-StyleSpeech and StyleSpeech ☆253 · Feb 9, 2022 · Updated 4 years ago
- Python Implementation of Visual Relative Attributes for Image Classification and Zero Shot Learning ☆22 · Jun 14, 2018 · Updated 8 years ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment. ☆25 · Nov 25, 2025 · Updated 6 months ago