anan235/dia-multilingual

A TTS model capable of generating ultra-realistic dialogue in one pass.

☆222

Alternatives and similar repositories for dia-multilingual

Users who are interested in dia-multilingual are comparing it to the libraries listed below.

stlohrey / dia-finetuning
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆131 · Jul 25, 2025 · Updated last year
yl4579 / DMOSpeech2
☆302 · Jul 22, 2025 · Updated last year
Mddct / transformer-vocos
☆35 · Sep 6, 2025 · Updated 10 months ago
stlohrey / chatterbox-finetuning
SoTA open-source TTS
☆136 · Jun 7, 2025 · Updated last year
herimor / voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control
☆245 · May 30, 2026 · Updated last month
RobertAgee / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆16 · Jun 28, 2025 · Updated last year
yynil / RWKVTTS
This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).
☆101 · Oct 8, 2025 · Updated 9 months ago
ZhikangNiu / A-DMA
[INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67 · Jun 16, 2025 · Updated last year
EZ-VC / EZ-VC
[EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
☆43 · Sep 9, 2025 · Updated 10 months ago
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
yuriak / SpeechDialogueFactory
☆40 · Apr 3, 2025 · Updated last year
davidbrowne17 / chatterbox-streaming
Streaming and Fine-tuning for Chatterbox TTS
☆292 · Jun 15, 2025 · Updated last year
fluxions-ai / stftvae
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43 · Jul 12, 2026 · Updated last week
Vyvo-Labs / VyvoTTS
VyvoTTS: LLM-Based Text-to-Speech Training Framework
☆257 · Apr 8, 2026 · Updated 3 months ago
devnen / Dia-TTS-Server
Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), suppor…
☆352 · Mar 28, 2026 · Updated 3 months ago
nari-labs / dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
☆19,358 · Nov 19, 2025 · Updated 8 months ago
zhenye234 / X-Codec-2.0
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
☆360 · Jun 25, 2026 · Updated last month
Blinorot / utmos-pytorch
Unofficial fairseq-free PyTorch implementation of UTMOS (v1, 2022), matching the original system.
☆35 · Jun 6, 2026 · Updated last month
ASLP-lab / FlashTTS
Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation
☆67 · Jun 16, 2026 · Updated last month
LEMAS-Project / LEMAS-TTS
LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10 languages: Chinese English Spanish Russian French German Ital…
☆101 · Mar 31, 2026 · Updated 3 months ago
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
p0p4k / pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
☆228 · Dec 24, 2024 · Updated last year
randombk / chatterbox-vllm
VLLM Port of the Chatterbox TTS model
☆379 · Oct 18, 2025 · Updated 9 months ago
seastar105 / pflow-encodec
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
☆77 · Jul 13, 2026 · Updated last week
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77 · Jul 16, 2023 · Updated 3 years ago
yzGuu830 / efficient-speech-codec
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆126 · Mar 20, 2025 · Updated last year
alobashev / mkl-vc
[Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"
☆45 · Sep 24, 2025 · Updated 10 months ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
jjunak-yun / FLowHigh_code
[ICASSP 2025] "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching"
☆118 · Jan 17, 2025 · Updated last year
zelaki / DisfluentFA
A Weakly Supervised Forced Alignment for disfluent speech
☆15 · Nov 12, 2023 · Updated 2 years ago
ydqmkkx / ShallowFlowMatching-TTS
Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis
☆55 · Sep 20, 2025 · Updated 10 months ago
manmay-nakhashi / TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
☆18 · May 20, 2025 · Updated last year
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32 · Jan 26, 2024 · Updated 2 years ago
Mddct / cosyvoice2-flow-optimized
faster inference
☆27 · Jan 20, 2025 · Updated last year
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 6 months ago
kandinskylab / kvae-audio
KVAE-Audio: a continuous full-band audio waveform autoencoder
☆101 · Updated this week
canopyai / Orpheus-TTS
Towards Human-Sounding Speech
☆6,260 · Dec 5, 2025 · Updated 7 months ago
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 · Jul 20, 2025 · Updated last year
SmoothKen / knn-svc
kNN-SVC: Robust Zero-Shot Singing Voice Conversion with Additive Synthesis and Concatenation Smoothness Optimization
☆16 · Nov 7, 2025 · Updated 8 months ago