XinyuZhou2000/Spoken-Dialogue

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/XinyuZhou2000/Spoken-Dialogue)

XinyuZhou2000 / Spoken-Dialogue

☆18

Alternatives and similar repositories for Spoken-Dialogue

Users that are interested in Spoken-Dialogue are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

cnaigithub / SpeechDewarping
View on GitHub
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27Apr 27, 2023Updated 3 years ago
acids-ircam / Expressive_WAE_FADER
View on GitHub
companion repository to the DAFx-19 paper "Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders" by Ad…
☆11Jun 22, 2019Updated 7 years ago
AlexandaJerry / SingingVoice-MFA-Training
View on GitHub
MFA acoustic model training based on Opencpop
☆15Sep 23, 2022Updated 3 years ago
lucasjinreal / textfrontend
View on GitHub
单独维护的中文TTS
☆34Oct 28, 2022Updated 3 years ago
BiSinger-SVS / BiSinger
View on GitHub
Bilingual Singing Voice Synthesis
☆18Mar 25, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
freds0 / CML-TTS-Dataset
View on GitHub
CML-TTS: A Multilingual Dataset for Speech Synthesis
☆36Jul 31, 2024Updated last year
Yangyangii / TPGST-Tacotron
View on GitHub
Google's TPGST reimplementation.
☆34Dec 11, 2019Updated 6 years ago
Yoshifumi-Nakano / visual-text-to-speech
View on GitHub
visual-text to speech
☆14Apr 3, 2022Updated 4 years ago
thuhcsi / SpanPSP
View on GitHub
☆76Apr 26, 2022Updated 4 years ago
Agions / caption-fab
View on GitHub
CaptionFab - 专业硬编码字幕提取工具，从视频中精准提取字幕，支持多种格式输出。基于 Tauri + Vue 3 + Rust 构建。
☆20Jun 24, 2026Updated last month
asuni / wavelet_prosody_toolkit
View on GitHub
☆200May 3, 2024Updated 2 years ago
yuboona / some-script-to-help-using-Montreal-Forced-Aligner
View on GitHub
Some script for helping using Montreal Forced Aligner, maily for transforming Hanzi character to pinyin and extrat pause time from .textg…
☆14Feb 9, 2024Updated 2 years ago
keonlee9420 / Comprehensive-E2E-TTS
View on GitHub
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g…
☆147Jun 6, 2022Updated 4 years ago
vlarine / wav2vec
View on GitHub
vq-wav2vec inference
☆15Dec 13, 2021Updated 4 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
taresh18 / TTSizer
View on GitHub
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets
☆142Aug 10, 2025Updated 11 months ago
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago
Dapwner / CVAE-Tacotron
View on GitHub
☆26Jun 5, 2024Updated 2 years ago
CODEJIN / VITS_Diffusion
View on GitHub
☆26Sep 22, 2022Updated 3 years ago
adelacvg / diff-vits
View on GitHub
☆39Oct 1, 2023Updated 2 years ago
roedoejet / FastSpeech2_ACL2022_reproducibility
View on GitHub
☆21Feb 27, 2024Updated 2 years ago
hcy71o / TransferTTS
View on GitHub
TransferTTS (Zero-Shot learning of VITS)
☆102Sep 23, 2022Updated 3 years ago
ZackHodari / discrete_intonation
View on GitHub
Code for paper titled "Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0" submitt…
☆17May 24, 2020Updated 6 years ago
xiaomi-research / tts-prism
View on GitHub
☆47Apr 27, 2026Updated 3 months ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
Shy-98 / MELLE
View on GitHub
Unofficial PyTorch implementation of "Autoregressive Speech Synthesis without Vector Quantization (MELLE)"
☆41Jun 28, 2025Updated last year
light1726 / SpeechTripleNet
View on GitHub
The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆33Nov 23, 2023Updated 2 years ago
channel-io / ch-tts-llasa-rl-grpo
View on GitHub
☆51Apr 20, 2026Updated 3 months ago
JSALT-2022-SSL / superb-prosody
View on GitHub
☆31Jul 13, 2023Updated 3 years ago
ORI-Muchim / BEGANSing
View on GitHub
BEGANSing - Korean SVS + SVC + AudioSR
☆11Feb 17, 2024Updated 2 years ago
Wataru-Nakata / latentlm-tts
View on GitHub
☆29Jul 3, 2026Updated 3 weeks ago
xiquan-li / Resonate
View on GitHub
[INTERSPEECH 2026] Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation
☆48Apr 17, 2026Updated 3 months ago
primepake / dac_vae
View on GitHub
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆38Aug 30, 2025Updated 10 months ago
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
ga642381 / SpeechPrompt
View on GitHub
**Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…
☆102Apr 10, 2025Updated last year
lifeiteng / TTS-TextAnalyzer
View on GitHub
TTS Text Analyzer
☆31Jul 20, 2023Updated 3 years ago
BayLing-Models / BayLing-Duplex
View on GitHub
Native full-duplex speech dialogue inference for BayLing-Duplex.
☆63Jun 22, 2026Updated last month
ga642381 / SpeechGen
View on GitHub
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
☆77Jun 9, 2023Updated 3 years ago
bshall / urhythmic
View on GitHub
Unsupervised Rhythm Modeling for Voice Conversion
☆85Aug 3, 2023Updated 2 years ago
adelacvg / DPTTS
View on GitHub
An AR+AR TTS attempt.
☆18Jan 13, 2025Updated last year
liuhuang31 / g2pw_once
View on GitHub
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14Dec 30, 2023Updated 2 years ago