pengzhendong/streaming-tts-webui

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/pengzhendong/streaming-tts-webui)

pengzhendong / streaming-tts-webui

Streaming Text to Speech Web UI

☆22

Alternatives and similar repositories for streaming-tts-webui

Users that are interested in streaming-tts-webui are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

pengzhendong / audiolab
View on GitHub
A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)
☆39Mar 31, 2026Updated 3 months ago
pengzhendong / speaker-diarization
View on GitHub
Offline Speaker Diarization with SenseVoice by Sherpa ONNX.
☆15Dec 23, 2024Updated last year
pengzhendong / wetext
View on GitHub
Python runtime for WeTextProcessing (does not depend on Pynini)
☆53Jun 11, 2026Updated last month
Mddct / usm-tokenizer
View on GitHub
semantic tokenizer for speech and music
☆20Jul 6, 2025Updated last year
Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
AI-S2-Lab / GPT-Talker
View on GitHub
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆45Oct 28, 2024Updated last year
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
LAION-AI / emotion-annotations
View on GitHub
☆110Jul 15, 2026Updated 2 weeks ago
SparkAudio / VoxBox
View on GitHub
A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.
☆115May 5, 2025Updated last year
pengzhendong / g2p-mix
View on GitHub
Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.
☆115Updated this week
xiquan-li / Resonate
View on GitHub
[INTERSPEECH 2026] Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation
☆48Apr 17, 2026Updated 3 months ago
wenet-e2e / wesr
View on GitHub
We Speech Transcript based on LLM, in 300 lines of code.
☆182Jun 20, 2025Updated last year
BiSinger-SVS / BiSinger
View on GitHub
Bilingual Singing Voice Synthesis
☆18Mar 25, 2024Updated 2 years ago
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated last year
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
pengzhendong / audio-pipeline
View on GitHub
☆23Oct 17, 2024Updated last year
pengzhendong / wavesurfer
View on GitHub
For audio visualization and playback in Jupyter notebooks.
☆18Nov 25, 2025Updated 8 months ago
MaxMax2016 / max-vc
View on GitHub
singing voice conversion without f0
☆23May 10, 2023Updated 3 years ago
pengzhendong / streaming-vocos
View on GitHub
Streaming Vocos
☆31Jun 10, 2025Updated last year
luotianze666 / WaveFM
View on GitHub
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
☆133Apr 8, 2026Updated 3 months ago
youngsheen / GPST
View on GitHub
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆70Nov 1, 2024Updated last year
Cr-Fish / WESR
View on GitHub
Official implementation of ACL'26 (findings) paper WESR (Word-level Event-Speech Recognition): A comprehensive benchmark and baseline for…
☆39Jan 30, 2026Updated 5 months ago
thuhcsi / SpeechCraft
View on GitHub
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
☆198Feb 28, 2026Updated 5 months ago
pengzhendong / streaming-ChatTTS
View on GitHub
☆23Oct 30, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
TanUkkii007 / deepvoice3-tensorflow
View on GitHub
A tensorflow based implementation of DeepVoice3 https://arxiv.org/abs/1710.07654
☆13Jun 5, 2018Updated 8 years ago
Harry-Yu-Shuhang / Step-Audio-tts
View on GitHub
☆11Feb 20, 2025Updated last year
inclusionAI / MingTok-Audio
View on GitHub
☆88Feb 24, 2026Updated 5 months ago
yukara-ikemiya / Open-Miipher-2
View on GitHub
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆70Sep 22, 2025Updated 10 months ago
ScottishFold007 / TTSAudioNormalizer
View on GitHub
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…
☆113Dec 20, 2024Updated last year
light1726 / Speech-Tokenization-Papers
View on GitHub
This repository follows papers and reports on discrete speech representation learning and speech tokenization methods for speech language…
☆15Dec 1, 2023Updated 2 years ago
DaiYvhang / AISHELL-5
View on GitHub
In-car multi-channel speech transcription system of AISHELL-5.
☆48Jun 9, 2025Updated last year
aask1357 / hilcodec
View on GitHub
High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec
☆120Jun 23, 2025Updated last year
liuhuang31 / HiFTNet-sr
View on GitHub
HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz
☆24Jan 2, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
xingchensong / FlashCosyVoice
View on GitHub
FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.
☆250Feb 25, 2026Updated 5 months ago
uthree / tinyvc
View on GitHub
a lightweight voice conversion
☆87Feb 25, 2026Updated 5 months ago
wenet-e2e / west
View on GitHub
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
☆206Jul 17, 2026Updated last week
innnky / vispeech
View on GitHub
基于vits fastspeech2 visinger的tts模型
☆24Mar 9, 2023Updated 3 years ago
Mddct / simple-tts
View on GitHub
（WIP）long form speech generatoins
☆30Apr 2, 2025Updated last year
ScottishFold007 / Cosyvoice_DPO_NOTES
View on GitHub
CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning!
☆126Aug 8, 2025Updated 11 months ago
ASLP-lab / LLaSA_Plus
View on GitHub
Llasa Speed Up
☆64Jan 18, 2026Updated 6 months ago