souvikg544/TTS_Data_Maker

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/souvikg544/TTS_Data_Maker)

souvikg544 / TTS_Data_Maker

Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to speech .

☆28

Alternatives and similar repositories for TTS_Data_Maker

Users that are interested in TTS_Data_Maker are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Hannes1 / react-native-wenet
View on GitHub
Wenet speech to text for react native
☆10Nov 1, 2022Updated 3 years ago
tiro-is / tiro-speech-core
View on GitHub
This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core
☆15Jun 19, 2023Updated 3 years ago
patyork / AutomaticSpeechChunker
View on GitHub
From a large speech audio file and its corresponding body of text, automatically chunk the audio and text into (phrase, audio_snippet) pa…
☆17May 15, 2015Updated 11 years ago
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
babe269 / performant
View on GitHub
A toolset for easy formant extraction and visualization from wav files and TTS models
☆33Sep 2, 2022Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
nii-yamagishilab / speaker_sex_attribute_privacy
View on GitHub
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15Nov 30, 2022Updated 3 years ago
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
cyfer0618 / kaldi-pytorch-rnnlm
View on GitHub
Enable RNNLM lattice rescoring with Pytorch [kaldi]
☆12Jun 5, 2020Updated 6 years ago
awni / future_speech
View on GitHub
The History of Speech Recognition to the Year 2030
☆13Aug 14, 2021Updated 4 years ago
bryan051003 / USVG
View on GitHub
A unified model for zero-shot singing voice conversion and synthesis
☆22Nov 30, 2022Updated 3 years ago
HumeAI / competitions
View on GitHub
Hume AI ML Competitions
☆31Apr 7, 2026Updated 3 months ago
dmuth / tarsplit
View on GitHub
A utility to split tarballs into smaller pieces while keeping files intact.
☆18Jun 19, 2022Updated 4 years ago
projecte-aina / oTranscribe-plus
View on GitHub
A free & open tool for transcribing audio interviews with offline ASR support
☆25Dec 21, 2023Updated 2 years ago
freds0 / CML-TTS-Dataset
View on GitHub
CML-TTS: A Multilingual Dataset for Speech Synthesis
☆35Jul 31, 2024Updated last year
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
CODEJIN / VITS_Diffusion
View on GitHub
☆26Sep 22, 2022Updated 3 years ago
YoungSeng / SRD-VC
View on GitHub
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)
☆119Feb 7, 2024Updated 2 years ago
miccio-dk / NISQA
View on GitHub
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16Apr 13, 2022Updated 4 years ago
CUNY-CL / wikipron-modeling
View on GitHub
Proposed splits for the LREC Wikipron paper
☆15Apr 7, 2020Updated 6 years ago
FantSun / Speechflow
View on GitHub
Speechflow for emotion recognition related information decomposition
☆10Jul 27, 2021Updated 4 years ago
vtuber-plan / vcvits
View on GitHub
Non Parallel Voice Conversion based on VITS
☆24Mar 31, 2023Updated 3 years ago
lucasjinreal / textfrontend
View on GitHub
单独维护的中文TTS
☆34Oct 28, 2022Updated 3 years ago
gullabi / STT-align
View on GitHub
Coqui STT (🐸STT) based forced alignment tool
☆13Feb 24, 2022Updated 4 years ago
JazminVidal / gop-pykaldi
View on GitHub
Goodness of Pronunciation algorithm using PyKaldi
☆18Jun 12, 2022Updated 4 years ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
bagustris / s3prl-ser
View on GitHub
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15Feb 28, 2026Updated 4 months ago
gerazov / prosodeep
View on GitHub
Deep understanding and modelling of the hierarchical structure of prosody
☆25May 12, 2019Updated 7 years ago
coqui-ai / inference-engine
View on GitHub
Coqui Inference Engine
☆41Aug 3, 2021Updated 4 years ago
Joshua-1995 / LearnableUpsamplingLayer-Pytorch
View on GitHub
Pytorch implementation of LearnableUpsamplingLayer (NaturalSpeech, Tan et al., 2022)
☆57Mar 12, 2024Updated 2 years ago
3eeps / llmon-py
View on GitHub
llmon-py is a multimodal webui for Llama 3-8B.
☆16Jul 1, 2024Updated 2 years ago
ga642381 / Taiwanese-Speech-Synthesis
View on GitHub
Taiwanese Speech Synthesis with Tacotron2
☆26Oct 2, 2022Updated 3 years ago
idnavid / speech_activity_detection
View on GitHub
Unsupervised speech activity detection system.
☆11Jul 2, 2018Updated 8 years ago
Open-Speech-EkStep / vakyansh-tts
View on GitHub
Text to Speech for Indic languages
☆53Mar 23, 2022Updated 4 years ago
lukassteinwender / avatair
View on GitHub
A pipeline to generate user-preferred photo-realistic avatars using stable-diffusion and bayesian-optimization.
☆18Jun 12, 2026Updated last month
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
yt605155624 / TTSAndroid
View on GitHub
TTS Android demo of PaddleSpeech, merged into https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos
☆28Nov 30, 2022Updated 3 years ago
ryanrudes / YTTTS
View on GitHub
The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions
☆53Apr 1, 2021Updated 5 years ago
shang0712 / HierTTS
View on GitHub
☆47Apr 16, 2023Updated 3 years ago
spring-media / DeepForcedAligner
View on GitHub
☆81Aug 8, 2025Updated 11 months ago
kate-egorova / ASR-hybrid-decoding
View on GitHub
This is an extension of kaldi speech recognition software which allows to perform decoding of speech with hybrid word and phoneme graphs.…
☆11Feb 4, 2020Updated 6 years ago
amazon-science / proteno
View on GitHub
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…
☆45May 25, 2021Updated 5 years ago
PanagiotisP / svs-multiband
View on GitHub
Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION" that was accepted on EUSIPCO2022
☆15Jun 18, 2022Updated 4 years ago