VoiceBank-NTPU-TW / VoiceBank-2023
VoiceBank-2023 is a speech corpus designed specifically for building personalized Mandarin text-to-speech (TTS) systems.
☆41 · Jan 4, 2026 · Updated last month
Alternatives and similar repositories for VoiceBank-2023
Users who are interested in VoiceBank-2023 are comparing it to the libraries listed below:
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated 10 months ago
- text to speech ☆10 · Mar 19, 2024 · Updated last year
- Official release of StyleTalk dataset. ☆72 · Jul 1, 2024 · Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Sing any popular song with your voice ☆11 · Jul 10, 2022 · Updated 3 years ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆88 · Apr 2, 2024 · Updated last year
- ESLTTS dataset ☆16 · Feb 6, 2025 · Updated last year
- ☆22 · Jan 29, 2026 · Updated 2 weeks ago
- Train the next generation of TTS systems. ☆171 · Sep 13, 2024 · Updated last year
- The open source code for SimpleSpeech series ☆145 · Oct 8, 2024 · Updated last year
- G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b… ☆14 · Dec 30, 2023 · Updated 2 years ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆212 · Sep 19, 2024 · Updated last year
- [ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models ☆183 · Nov 22, 2024 · Updated last year
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆33 · Sep 2, 2022 · Updated 3 years ago
- An unofficial implementation of the paper "One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Norm… ☆117 · May 27, 2021 · Updated 4 years ago
- EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System ☆15 · Mar 31, 2019 · Updated 6 years ago
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆59 · Jul 1, 2025 · Updated 7 months ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆34 · Nov 23, 2023 · Updated 2 years ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",… ☆80 · May 29, 2023 · Updated 2 years ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆214 · Sep 10, 2024 · Updated last year
- ☆100 · Jul 22, 2021 · Updated 4 years ago
- Open Source Speech/Text Data on AI ☆19 · Sep 13, 2022 · Updated 3 years ago
- Text-To-Speech for NotebookLM ☆39 · Jul 20, 2025 · Updated 6 months ago
- ☆55 · Aug 11, 2022 · Updated 3 years ago
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g… ☆146 · Jun 6, 2022 · Updated 3 years ago
- singing voice conversion without f0 ☆23 · May 10, 2023 · Updated 2 years ago
- [ACMMM'2024] Generative Expressive Conversational Speech Synthesis ☆44 · Oct 28, 2024 · Updated last year
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆93 · Mar 12, 2025 · Updated 11 months ago
- ☆23 · Oct 17, 2024 · Updated last year
- GPT-style network for phonemization with durations of text ☆68 · Mar 21, 2024 · Updated last year
- Official implementation of the source-filter HiFiGAN vocoder ☆268 · Jul 29, 2023 · Updated 2 years ago
- Temporary anonymous version ☆22 · Mar 20, 2024 · Updated last year
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆61 · Apr 4, 2024 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 · Sep 4, 2023 · Updated 2 years ago
- An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data" ☆137 · Aug 17, 2023 · Updated 2 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE" ☆44 · Apr 10, 2023 · Updated 2 years ago
- Wenet speech to text for react native ☆10 · Nov 1, 2022 · Updated 3 years ago
- ☆11 · Nov 7, 2024 · Updated last year
- Onset-and-Offset-Aware Sound Event Detection ☆20 · Feb 10, 2025 · Updated last year