imxtx / awesome-controllabe-speech-synthesis

This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".

☆127

Alternatives and similar repositories for awesome-controllabe-speech-synthesis:

Users interested in awesome-controllabe-speech-synthesis are comparing it to the repositories listed below.

mct10 / RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆169 Updated 8 months ago
lmxue / Audio-FLAN
Audio-FLAN
☆140 Updated 2 weeks ago
yangdongchao / SimpleSpeech
The open-source code for the SimpleSpeech series
☆136 Updated 5 months ago
0nutation / USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
☆138 Updated last year
thuhcsi / SpeechCraft
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
☆110 Updated 2 months ago
zhenye234 / FlashSpeech
ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis
☆133 Updated 6 months ago
X-LANCE / UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆128 Updated 9 months ago
yangdongchao / LLM-Codec
The open-source code for LLM-Codec
☆132 Updated 7 months ago
ajd12342 / paraspeechcaps
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆109 Updated this week
Aria-K-Alethia / BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆147 Updated 6 months ago
keonlee9420 / evaluate-zero-shot-tts
Evaluation Protocol for Large-Scale Zero-Shot TTS Literature
☆76 Updated last week
lifeiteng / naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆194 Updated 11 months ago
Takaaki-Saeki / DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
☆144 Updated 3 months ago
amphionspace / SD-Eval
[NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
☆48 Updated 9 months ago
Jiang-Yidi / UniCodec
UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound
☆114 Updated 3 weeks ago
sarulab-speech / UTMOSv2
UTokyo-SaruLab MOS Prediction System
☆160 Updated 3 weeks ago
lucadellalib / focalcodec
A low-bitrate single-codebook 16 kHz speech codec based on focal modulation
☆79 Updated last month
yanghaha0908 / FastHuBERT
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
☆86 Updated 4 months ago
y-ren16 / TiCodec
☆69 Updated last year
line / LibriTTS-P
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆126 Updated 9 months ago
jzq2000 / MoonCast
☆67 Updated this week
RicherMans / Dasheng
Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"
☆57 Updated last month
X-LANCE / UniCATS-CTX-txt2vec
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
☆63 Updated 4 months ago
thuhcsi / VoxInstruct
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
☆67 Updated 4 months ago
LqNoob / Neural-Codec-and-Speech-Language-Models
Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models
☆122 Updated this week
JishengBai / AudioSetCaps
A 6-Million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline
☆120 Updated 3 months ago
zhenye234 / xcodec
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
☆186 Updated 2 months ago
cantabile-kwok / vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆75 Updated 3 months ago
Choddeok / EmoSphere-TTS
The official implementation of EmoSphere-TTS
☆112 Updated 2 months ago
Choddeok / EmoSpherepp
The official implementation of EmoSphere++
☆80 Updated last week