Automatically generates a TTS dataset from audio and its associated text. Cuts the audio into clips under a custom length. Uses the Google Speech-to-Text API to perform diarization and transcription, or aeneas to force-align text to audio.
☆52 Apr 17, 2022 Updated 3 years ago
Alternatives and similar repositories for TTS-dataset-tools
Users that are interested in TTS-dataset-tools are comparing it to the libraries listed below
- Split long audio files based on subtitle-info in SRT File (Transcript saved in CSV) ☆20 Nov 14, 2019 Updated 6 years ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 Mar 24, 2023 Updated 2 years ago
- Obama singing any song. Check ReadMe ☆10 Dec 10, 2018 Updated 7 years ago
- A curated list of other awesome open-source governments organisations and projects ☆13 Apr 28, 2022 Updated 3 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Oct 12, 2022 Updated 3 years ago
- Tools to create your own voice dataset for TTS training ☆70 Oct 26, 2020 Updated 5 years ago
- ☆11 Sep 12, 2025 Updated 5 months ago
- ☆18 Aug 17, 2022 Updated 3 years ago
- An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://… ☆15 Feb 17, 2022 Updated 4 years ago
- Heteronym to Phoneme Parser ☆19 Nov 4, 2023 Updated 2 years ago
- ☆63 Feb 5, 2021 Updated 5 years ago
- speaker-disentangled speech linguistic content quantizer ☆24 Mar 19, 2025 Updated 11 months ago
- Detectron2 Toolbox and Benchmark for V3Det ☆18 Jun 2, 2024 Updated last year
- Performant and accurate speech recognition built on Pytorch ☆254 May 19, 2022 Updated 3 years ago
- ☆19 Jul 11, 2024 Updated last year
- TAPE: An End-to-End Timbre-Aware Pitch Estimator ☆23 Nov 25, 2023 Updated 2 years ago
- An Alexa skill providing a conversational interface to any public figure (as mimicked by GPT3). The legacy GUI is no longer maintained. ☆19 Nov 6, 2023 Updated 2 years ago
- Multivoice: Enhance your foreign-language movie and TV show experience with personalized dubbed versions. Our project uses voice cloning … ☆27 Aug 1, 2023 Updated 2 years ago
- ☆20 Mar 16, 2023 Updated 2 years ago
- This repository will contain code for the paper "CLIP meets GamePhysics: Towards bug identification in gameplay videos using zero-shot tr… ☆26 Dec 23, 2023 Updated 2 years ago
- A gui to help make a text to speech dataset. ☆18 Dec 10, 2022 Updated 3 years ago
- Train neural networks to generate watercolour paintings from pencil sketches. ☆20 Oct 30, 2018 Updated 7 years ago
- Non Parallel Voice Conversion based on VITS ☆24 Mar 31, 2023 Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆64 May 30, 2023 Updated 2 years ago
- Translated vocal synthesis - Clone a voice and output speech in another language ☆26 May 3, 2022 Updated 3 years ago
- Vecna is a Python chatbot which recommends songs and movies depending upon your feelings ☆12 Jun 28, 2022 Updated 3 years ago
- ☆26 Aug 8, 2024 Updated last year
- Talking head animation ☆28 Dec 8, 2023 Updated 2 years ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆27 Mar 21, 2025 Updated 11 months ago
- StyleTTS 2 Optimized Training Fork ☆33 Feb 2, 2025 Updated last year
- This is a collection of resources on AI-AR-ART generation. ☆28 Dec 14, 2022 Updated 3 years ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model ☆107 Sep 10, 2021 Updated 4 years ago
- GUI Wrapper for 'A TensorFlow Implementation of DC-TTS: yet another text-to-speech model' ☆26 Jul 16, 2020 Updated 5 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆78 Nov 1, 2024 Updated last year
- Agile metrics tools allows you to track metrics from different sources in order to identify trends and patterns on how your team performa… ☆11 Feb 13, 2026 Updated 3 weeks ago
- Finetune the 1.4B latent diffusion text2img-large checkpoint from CompVis using deepspeed. (work-in-progress) ☆36 Apr 17, 2022 Updated 3 years ago
- HiFi-SR is a Python-based pipeline for the detection of plant mitochondrial structural rearrangements based on the mapping of PacBio high… ☆10 Apr 15, 2025 Updated 10 months ago
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513 ☆64 Feb 13, 2023 Updated 3 years ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆81 Oct 3, 2024 Updated last year