herimor/voxtream

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/herimor/voxtream)

herimor / voxtream

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control

☆241

Alternatives and similar repositories for voxtream

Users that are interested in voxtream are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Shy-98 / MELLE
View on GitHub
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41Jun 28, 2025Updated last year
luotianze666 / WaveFM
View on GitHub
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
☆130Apr 8, 2026Updated 3 months ago
primepake / dac_vae
View on GitHub
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆37Aug 30, 2025Updated 10 months ago
ozspeech / OZSpeech
View on GitHub
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45Feb 9, 2025Updated last year
ZhikangNiu / Semantic-VAE
View on GitHub
[INTERSPEECH 2026]Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120Jun 21, 2026Updated 2 weeks ago
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
e-c-k-e-r / vall-e
View on GitHub
An unofficial PyTorch implementation of VALL-E
☆88Aug 3, 2025Updated 11 months ago
jdh-algo / JoyTTS
View on GitHub
☆41Jul 15, 2025Updated 11 months ago
X-LANCE / LSCodec-Inference
View on GitHub
Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"
☆36Oct 23, 2025Updated 8 months ago
IDEA-Emdoor-Lab / UniTTS
View on GitHub
A TTS Trained on Universal Audio.
☆41Jun 6, 2025Updated last year
rishikksh20 / MiniMax-TTS-pytorch
View on GitHub
Try to replicate the architecture of MiniMaxTTS mentioned in it's technical report
☆47Sep 2, 2025Updated 10 months ago
MuyangDu / T5Voice
View on GitHub
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28Nov 7, 2025Updated 8 months ago
Audio-Foundation-Models / ConversationTTS
View on GitHub
☆101Jan 19, 2026Updated 5 months ago
hs-oh-prml / DurFlexEVC
View on GitHub
☆81Jan 22, 2025Updated last year
IDRnD / redimnet
View on GitHub
The official pytorch implemention of the Intespeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"
☆199Sep 24, 2025Updated 9 months ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
ydqmkkx / ShallowFlowMatching-TTS
View on GitHub
Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
☆55Sep 20, 2025Updated 9 months ago
P1ping / TokAN-Legacy
View on GitHub
☆25Jun 22, 2026Updated 2 weeks ago
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated 11 months ago
BakerBunker / FreeV
View on GitHub
[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
☆98Jul 4, 2024Updated 2 years ago
SonyCSLParis / codicodec
View on GitHub
Encode and decode audio samples to/from continuous and discrete compressed representations!
☆118Nov 25, 2025Updated 7 months ago
ZhikangNiu / A-DMA
View on GitHub
[INTERSPEECH 2025 Oral]Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67Jun 16, 2025Updated last year
X-LANCE / UniCATS-CTX-vec2wav
View on GitHub
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆130Jun 11, 2024Updated 2 years ago
bfs18 / armel
View on GitHub
poorman's ar-dit tts
☆45Dec 31, 2025Updated 6 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
3loi / NaturalVoices
View on GitHub
☆60Oct 22, 2025Updated 8 months ago
winddori2002 / DEX-TTS
View on GitHub
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
☆108Jan 17, 2025Updated last year
adelacvg / detail_tts
View on GitHub
All generative model in one for better TTS model
☆74Sep 8, 2024Updated last year
youngsheen / GPST
View on GitHub
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆70Nov 1, 2024Updated last year
meaningTeam / tidy-tunes
View on GitHub
Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …
☆23May 19, 2026Updated last month
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
zhenye234 / X-Codec-2.0
View on GitHub
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
☆359Jun 25, 2026Updated 2 weeks ago
Aria-K-Alethia / BigCodec
View on GitHub
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆217Sep 19, 2024Updated last year
AI-S2-Lab / FluentEditor
View on GitHub
[InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62Oct 23, 2024Updated last year
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
hyama5 / vae_align
View on GitHub
Alignment examples for Interspeech 2024
☆27Jul 5, 2024Updated 2 years ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
theodorblackbird / lina-speech
View on GitHub
Official implementation of the TTS model Lina-Speech
☆180Jan 9, 2025Updated last year
yukara-ikemiya / Open-Miipher-2
View on GitHub
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆69Sep 22, 2025Updated 9 months ago
caizexin / GenVC
View on GitHub
Self-supervised Generative LM-based Voice Conversion
☆58Apr 24, 2025Updated last year
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆53May 1, 2025Updated last year
X-E-Speech / X-E-Speech-code
View on GitHub
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112Apr 1, 2024Updated 2 years ago