cantabile-kwok / vec2wav2.0
View external linksLinks

Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995

☆78

Alternatives and similar repositories for vec2wav2.0

Users that are interested in vec2wav2.0 are comparing it to the libraries listed below

Sorting:

aask1357 / hilcodec
View on GitHub
High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec
☆115Jun 23, 2025Updated 7 months ago
X-LANCE / UniCATS-CTX-vec2wav
View on GitHub
[AAAI 2024] Code for CTX-vec2wav in UniCATS
☆130Jun 11, 2024Updated last year
walker-hyf / NCSSD
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆62Nov 1, 2024Updated last year
justinlovelace / SESD
View on GitHub
☆61Oct 28, 2024Updated last year
luotianze666 / WaveFM
View on GitHub
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
☆121Mar 27, 2025Updated 10 months ago
ajd12342 / paraspeechcaps
View on GitHub
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆153Mar 24, 2025Updated 10 months ago
mushanshanshan / ESLTTS
View on GitHub
ESLTTS dataset
☆16Feb 6, 2025Updated last year
Aria-K-Alethia / BigCodec
View on GitHub
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
☆212Sep 19, 2024Updated last year
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated last year
0nutation / USLM
View on GitHub
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)
☆152Sep 14, 2023Updated 2 years ago
yzGuu830 / efficient-speech-codec
View on GitHub
[EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers
☆125Mar 20, 2025Updated 10 months ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆145Oct 8, 2024Updated last year
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆51May 1, 2025Updated 9 months ago
youngsheen / GPST
View on GitHub
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆67Nov 1, 2024Updated last year
adelacvg / detail_tts
View on GitHub
All generative model in one for better TTS model
☆74Sep 8, 2024Updated last year
zjlww / ardit-web
View on GitHub
☆25Aug 2, 2024Updated last year
hayeong0 / Diff-HierVC
View on GitHub
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr…
☆235Jul 3, 2024Updated last year
line / LibriTTS-P
View on GitHub
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆159Jun 13, 2024Updated last year
X-E-Speech / X-E-Speech-code
View on GitHub
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆111Apr 1, 2024Updated last year
X-LANCE / VoiceFlow-TTS
View on GitHub
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆366Sep 3, 2024Updated last year
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆145Aug 18, 2024Updated last year
thuhcsi / VoxInstruct
View on GitHub
VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
☆96Nov 9, 2024Updated last year
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 2 years ago
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆191Jul 12, 2024Updated last year
mubtasimahasan / DM-Codec
View on GitHub
Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”
☆56Jun 1, 2025Updated 8 months ago
hayeong0 / DDDM-VC
View on GitHub
Official Pytorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V…
☆243Jul 31, 2024Updated last year
hs-oh-prml / DurFlexEVC
View on GitHub
☆82Jan 22, 2025Updated last year
AI-S2-Lab / FluentEditor
View on GitHub
[InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆59Oct 23, 2024Updated last year
walker-hyf / GPT-Talker
View on GitHub
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆78Nov 1, 2024Updated last year
SparkAudio / VoxBox
View on GitHub
A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.
☆105May 5, 2025Updated 9 months ago
e-c-k-e-r / vall-e
View on GitHub
An unofficial PyTorch implementation of VALL-E
☆88Aug 3, 2025Updated 6 months ago
AlanBaade / SyllableLM
View on GitHub
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆59Jul 1, 2025Updated 7 months ago
redmist328 / APNet2
View on GitHub
Source code of APNet2, a vocoder
☆58Nov 23, 2023Updated 2 years ago
lifeiteng / naturalspeech3_facodec
View on GitHub
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆235Apr 20, 2024Updated last year
hs-oh-prml / DiffProsody
View on GitHub
☆68Jul 29, 2023Updated 2 years ago
BakerBunker / FreeV
View on GitHub
[InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
☆93Jul 4, 2024Updated last year
X-LANCE / UniCATS-CTX-txt2vec
View on GitHub
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
☆64Nov 18, 2024Updated last year
MWM-io / nansypp
View on GitHub
Unofficial implementation of NANSY++ in Pytorch Lightning
☆50Mar 11, 2024Updated last year
exercise-book-yq / Supercodec
View on GitHub
☆49Apr 1, 2025Updated 10 months ago

cantabile-kwok / vec2wav2.0View external linksLinks

Alternatives and similar repositories for vec2wav2.0

cantabile-kwok / vec2wav2.0
View external linksLinks