Tobertz-max/DiFlow-TTS

DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast voice synthesis.🐙

☆53

Alternatives and similar repositories for DiFlow-TTS

Users who are interested in DiFlow-TTS are comparing it to the repositories listed below.

YangXusheng-yxs / CodecFormer_5Hz
☆32 · Oct 23, 2025 · Updated 5 months ago
choiHkk / Transformer-TTS-V2
☆25 · Mar 6, 2024 · Updated 2 years ago
zhai-lw / SQCodec
A lightweight audio codec based on a single quantizer
☆69 · Aug 15, 2025 · Updated 7 months ago
SesameAILabs / silentcipher
☆18 · Mar 17, 2025 · Updated last year
Choddeok / DiEmo-TTS
[INTERSPEECH 2025] The official implementation of DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for…
☆16 · Sep 7, 2025 · Updated 6 months ago
yukara-ikemiya / Open-Miipher-2
PyTorch implementation of Miipher-2 [2025], a speech restoration model by Google DeepMind
☆64 · Sep 22, 2025 · Updated 6 months ago
lifeiteng / VoiceBox
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆28 · Aug 4, 2023 · Updated 2 years ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 4 months ago
mush42 / mantoq
Arabic Grapheme-to-Phoneme (G2P) Conversion
☆13 · Mar 15, 2025 · Updated last year
manmay-nakhashi / TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
☆17 · May 20, 2025 · Updated 10 months ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆50 · May 1, 2025 · Updated 10 months ago
WangHelin1997 / Aty-TTS
Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech
☆11 · May 14, 2025 · Updated 10 months ago
Audio-Foundation-Models / ConversationTTS
☆100 · Jan 19, 2026 · Updated 2 months ago
youngsheen / GPST
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆69 · Nov 1, 2024 · Updated last year
seastar105 / pflow-encodec
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
☆77 · May 12, 2024 · Updated last year
liuhuang31 / HiFTNet-sr
HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz
☆24 · Jan 2, 2024 · Updated 2 years ago
ZhikangNiu / A-DMA
[INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆64 · Jun 16, 2025 · Updated 9 months ago
Tera2Space / AudioAE
Simple audio AE
☆13 · Nov 10, 2024 · Updated last year
Berkeley-Speech-Group / sylber
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
☆75 · Mar 17, 2025 · Updated last year
Eps-Acoustic-Revolution-Lab / EAR_VAE
This is the official implementation of the εar-VAE model, including inference and evaluation parts; more details coming soon...
☆68 · Feb 13, 2026 · Updated last month
thuhcsi / DiffVar
☆30 · Aug 12, 2023 · Updated 2 years ago
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆15 · Mar 13, 2024 · Updated 2 years ago
andybi7676 / reborn-uasr
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR
☆14 · Dec 11, 2024 · Updated last year
ryota-komatsu / speech_resynth
Speech Resynthesis and Language Modeling
☆27 · Jun 11, 2025 · Updated 9 months ago
P1ping / TokAN
☆23 · Jul 30, 2025 · Updated 7 months ago
WingZLeung / TTDS
Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.
☆12 · Mar 15, 2025 · Updated last year
Stylish-TTS / stylish-tts
High quality text-to-speech based on StyleTTS 2.
☆75 · Feb 25, 2026 · Updated last month
naver-ai / RapFlow-TTS
☆55 · Jul 16, 2025 · Updated 8 months ago
walker-hyf / FCTalker
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis (Accepted by ISCSLP'2024)
☆26 · Feb 22, 2024 · Updated 2 years ago
yxlu-0102 / IDEA-TTS
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
☆27 · Mar 21, 2025 · Updated last year
Mddct / simple-tts
(WIP) long-form speech generation
☆31 · Apr 2, 2025 · Updated 11 months ago
xi-j / Style-Talker
An official implementation of Style-Talker for Spoken Dialogue Generation
☆23 · Jan 12, 2025 · Updated last year
asuni / PitchSqueezer
A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation
☆36 · Jan 17, 2024 · Updated 2 years ago
p0p4k / pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
☆230 · Dec 24, 2024 · Updated last year
kaistmm / fregrad
[ICASSP 2024] Official code for FreGrad
☆35 · May 13, 2024 · Updated last year
opendilab / HH-Codec
[ICML 2025 Tokenization Workshop] HH-Codec: High Compression High-fidelity Discrete Neural Codec for Spoken Language Modeling
☆85 · Sep 28, 2025 · Updated 5 months ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Oct 14, 2025 · Updated 5 months ago
p0p4k / vits3_pytorch
☆28 · Nov 15, 2023 · Updated 2 years ago
ZhangXinWhut / SimWhisper-Codec
Official code for paper: "Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding"
☆36 · Jan 28, 2026 · Updated last month