Text to speech is an emerging area of AI. This repository helps create a dataset of audio and transcripts for personalized text to speech.
☆28 · Mar 14, 2023 · Updated 3 years ago
Alternatives and similar repositories for TTS_Data_Maker
Users that are interested in TTS_Data_Maker are comparing it to the libraries listed below.
- Wenet speech to text for react native ☆10 · Nov 1, 2022 · Updated 3 years ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core ☆15 · Jun 19, 2023 · Updated 2 years ago
- From a large speech audio file and its corresponding body of text, automatically chunk the audio and text into (phrase, audio_snippet) pa… ☆17 · May 15, 2015 · Updated 10 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly. ☆16 · Jun 17, 2022 · Updated 3 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆33 · Sep 2, 2022 · Updated 3 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE ☆15 · Nov 30, 2022 · Updated 3 years ago
- ☆55 · Jan 13, 2023 · Updated 3 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 · Jun 5, 2020 · Updated 5 years ago
- The History of Speech Recognition to the Year 2030 ☆13 · Aug 14, 2021 · Updated 4 years ago
- A unified model for zero-shot singing voice conversion and synthesis ☆22 · Nov 30, 2022 · Updated 3 years ago
- Hume AI ML Competitions ☆28 · Oct 28, 2022 · Updated 3 years ago
- A utility to split tarballs into smaller pieces while keeping files intact. ☆18 · Jun 19, 2022 · Updated 3 years ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆34 · Jul 31, 2024 · Updated last year
- ☆26 · Sep 22, 2022 · Updated 3 years ago
- A free & open tool for transcribing audio interviews with offline ASR support ☆25 · Dec 21, 2023 · Updated 2 years ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆119 · Feb 7, 2024 · Updated 2 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Apr 13, 2022 · Updated 3 years ago
- Proposed splits for the LREC Wikipron paper ☆15 · Apr 7, 2020 · Updated 5 years ago
- Speechflow for emotion recognition related information decomposition ☆10 · Jul 27, 2021 · Updated 4 years ago
- Non Parallel Voice Conversion based on VITS ☆24 · Mar 31, 2023 · Updated 2 years ago
- Separately maintained Chinese TTS ☆34 · Oct 28, 2022 · Updated 3 years ago
- Coqui STT (🐸STT) based forced alignment tool ☆13 · Feb 24, 2022 · Updated 4 years ago
- Goodness of Pronunciation algorithm using PyKaldi ☆18 · Jun 12, 2022 · Updated 3 years ago
- S3PRL for Speech Emotion Recognition (see s3prl > downstream) ☆15 · Feb 28, 2026 · Updated 3 weeks ago
- Coqui Inference Engine ☆40 · Aug 3, 2021 · Updated 4 years ago
- Pytorch implementation of LearnableUpsamplingLayer (NaturalSpeech, Tan et al., 2022) ☆57 · Mar 12, 2024 · Updated 2 years ago
- Deep understanding and modelling of the hierarchical structure of prosody ☆24 · May 12, 2019 · Updated 6 years ago
- llmon-py is a multimodal webui for Llama 3-8B. ☆16 · Jul 1, 2024 · Updated last year
- Taiwanese Speech Synthesis with Tacotron2 ☆25 · Oct 2, 2022 · Updated 3 years ago
- ☆46 · Apr 16, 2023 · Updated 2 years ago
- TTS Android demo of PaddleSpeech, merged into https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos ☆28 · Nov 30, 2022 · Updated 3 years ago
- ☆80 · Aug 8, 2025 · Updated 7 months ago
- The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions ☆52 · Apr 1, 2021 · Updated 4 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆16 · Mar 28, 2023 · Updated 2 years ago
- Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION" that was accepted on EUSIPCO2022 ☆15 · Jun 18, 2022 · Updated 3 years ago
- Unsupervised speech activity detection system. ☆11 · Jul 2, 2018 · Updated 7 years ago
- Text to Speech for Indic languages ☆52 · Mar 23, 2022 · Updated 4 years ago
- A pipeline to generate user-preferred photo-realistic avatars using stable-diffusion and bayesian-optimization. ☆18 · May 15, 2025 · Updated 10 months ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆45 · May 25, 2021 · Updated 4 years ago