Tools to create your own voice dataset for TTS training
☆70 Oct 26, 2020 Updated 5 years ago
Alternatives and similar repositories for voice-dataset-creation
Users who are interested in voice-dataset-creation are comparing it to the repositories listed below
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts ☆16 Dec 3, 2024 Updated last year
- ☆14 Aug 19, 2024 Updated last year
- This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI Whisper, enabling users to capture audio, transcribing it, on t… ☆32 Nov 26, 2024 Updated last year
- Create training data for training a voice cloner for bark text to speech. ☆48 Jun 13, 2023 Updated 2 years ago
- A unified model for zero-shot singing voice conversion and synthesis ☆22 Nov 30, 2022 Updated 3 years ago
- Run Retrieval-based Voice Conversion training and inference with ease. ☆11 Jan 24, 2025 Updated last year
- Code for ACL 2024 findings paper "wav2vec-S: Adapting Pre-trained Speech Models for Streaming" ☆10 Apr 20, 2025 Updated 10 months ago
- ☆11 Mar 28, 2024 Updated last year
- text to speech ☆10 Mar 19, 2024 Updated last year
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 Mar 24, 2023 Updated 2 years ago
- Audio tokenization, in the fastest way possible! ☆53 Aug 26, 2024 Updated last year
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆33 Jul 31, 2024 Updated last year
- A simple app for recording speech datasets. ☆26 Jun 27, 2022 Updated 3 years ago
- Automatically generates TTS dataset using audio and associated text. Make cuts under a custom length. Uses Google Speech to text API to p… ☆52 Apr 17, 2022 Updated 3 years ago
- ☆14 Aug 16, 2023 Updated 2 years ago
- ☆82 Jan 22, 2025 Updated last year
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm ☆81 Oct 19, 2023 Updated 2 years ago
- PyTorch Implementation of Generalized End-to-End Loss for Speaker Verification ☆28 Jan 23, 2021 Updated 5 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Jul 2, 2024 Updated last year
- Arabic deep-learning based diacritization models (Shakkala, Shakkelha) ported to PyTorch ☆14 May 30, 2023 Updated 2 years ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs ☆35 Feb 26, 2026 Updated last week
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆36 May 1, 2024 Updated last year
- Code for the paper: How Much Context Does My Attention-Based ASR System Need? ☆10 Updated this week
- Multispeaker Community Vocoder Model for DiffSinger ☆39 Aug 11, 2025 Updated 6 months ago
- [WIP] VoiceSmith makes training text to speech models easy. ☆229 Oct 10, 2022 Updated 3 years ago
- TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis ☆88 Feb 23, 2021 Updated 5 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 May 30, 2023 Updated 2 years ago
- BeADataScientist ☆13 Sep 4, 2020 Updated 5 years ago
- ☆61 Nov 4, 2023 Updated 2 years ago
- [AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS ☆64 Nov 18, 2024 Updated last year
- A GUI Toolkit for SVS Label Generation. Heavily utilizes SOFA & Whisper to generate htk-style force-aligned labels with a focus on singin… ☆39 Aug 18, 2024 Updated last year
- (WIP) A retrain of F5-TTS on permissively-licensed data ☆13 Apr 6, 2025 Updated 11 months ago
- UTAUTAI(Unrestricted Tune Automated Technology Artificial Interigence) ☆15 Oct 27, 2023 Updated 2 years ago
- This is a repository dedicated for pre-trained acoustic models of Hong Kong Cantonese and Cantonese forced alignment. ☆23 Nov 14, 2024 Updated last year
- A pipeline to generate user-preferred photo-realistic avatars using stable-diffusion and bayesian-optimization. ☆18 May 15, 2025 Updated 9 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 Aug 18, 2023 Updated 2 years ago
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS ☆40 Aug 4, 2023 Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi ☆18 Jun 12, 2022 Updated 3 years ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆18 Oct 20, 2024 Updated last year