hetpandya/youtube_tts_data_generator

A Python library for generating speech datasets from YouTube videos

☆37

Alternatives and similar repositories for youtube_tts_data_generator

Users who are interested in youtube_tts_data_generator are comparing it to the libraries listed below.

TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
komari6 / Arabic-twitter-corpus-AJGT
Introduces an Arabic Jordanian General Tweets (AJGT) Corpus consisting of 1,800 tweets annotated as positive and negative. Modern Standar…
☆12 · Sep 26, 2023 · Updated 2 years ago
souvikg544 / TTS_Data_Maker
Text to speech is an emerging area of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s…
☆28 · Mar 14, 2023 · Updated 3 years ago
auspicious3000 / SpeechSplit-Demo
Unsupervised Speech Decomposition via Triple Information Bottleneck
☆14 · Apr 29, 2020 · Updated 6 years ago
narVidhai / Speech-Transcription-Benchmarking
Example Python scripts to evaluate various ASR methods
☆11 · Dec 22, 2021 · Updated 4 years ago
klintan / swedish-asr-dataset
Jupyter Notebooks for creating Speech datasets
☆46 · Mar 3, 2019 · Updated 7 years ago
OptimusPrimus / dcase2020_workshop
☆12 · Aug 3, 2020 · Updated 5 years ago
BridgetteSong / Tacotron2
☆13 · Sep 21, 2022 · Updated 3 years ago
alxmamaev / ultimate_tts
☆13 · Aug 7, 2021 · Updated 4 years ago
goronfreeman / alfred-genius
A Genius workflow for Alfred 3
☆11 · Nov 17, 2017 · Updated 8 years ago
coqui-ai / data-checker
🫠 check your data, before you wreck your model
☆16 · Aug 11, 2022 · Updated 3 years ago
pengzhendong / ngram-punctuator
An N-gram punctuator for Chinese and English.
☆18 · Oct 14, 2025 · Updated 9 months ago
nttcslab / ToyADMOS2-dataset
ToyADMOS2: Another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions 🚗 🚃
☆21 · Apr 16, 2024 · Updated 2 years ago
codebyzeb / g2p-plus
Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories
☆19 · Apr 10, 2025 · Updated last year
prateekralhan / Youtube-Whisper-Streamlit
A Streamlit-based web app to generate subtitles for YouTube videos.
☆17 · Nov 12, 2022 · Updated 3 years ago
deepvk / muse
🎵 muse: Music Separation
☆11 · Feb 14, 2024 · Updated 2 years ago
kstenerud / depixelate
Implementation of various scaling/depixelating technologies
☆13 · Jun 24, 2017 · Updated 9 years ago
rasenganai / Illegal_Parking
Using an AI-based approach to detect illegal parking of vehicles (cars) from an image. The model will receive an image of parked car through…
☆11 · Jun 2, 2020 · Updated 6 years ago
Syuparn / TextGridConverter
Convert .lab files to .TextGrid files, which can be used in Praat
☆14 · Nov 2, 2018 · Updated 7 years ago
verma-anushka / Gaming-Zone
The Gaming Zone is a web application that provides you with a collection of classic retro games, including puzzle games, trivia games, bo…
☆10 · Feb 11, 2020 · Updated 6 years ago
pgys / NoIze
A selective noise filter architecture driven by a CNN and Wiener filter
☆17 · Nov 21, 2019 · Updated 6 years ago
nonverbalspeech38k / nonverspeech38k
The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…
☆68 · Dec 26, 2025 · Updated 6 months ago
EndlessReform / smoltts
Open TTS models, built for streaming on the edge
☆45 · Mar 16, 2025 · Updated last year
pzelasko / kaldialign
Python wrappers for Kaldi's Levenshtein distance and alignment code.
☆70 · Jun 15, 2026 · Updated last month
Speech-Lab-IITM / data2vec-aqc
Repository containing the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini…
☆13 · Mar 18, 2024 · Updated 2 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
Open-Speech-EkStep / indic-punct
☆45 · Dec 15, 2022 · Updated 3 years ago
Den4ikAI / Anfice-chatbot
Dialogue system based on FRED-T5
☆38 · Jul 10, 2023 · Updated 3 years ago
elnagara / BRAD-Arabic-Dataset
BRAD: Books Reviews in Arabic Dataset
☆15 · Feb 4, 2018 · Updated 8 years ago
r9y9 / segmentation-kit
Speech Segmentation Toolkit using Julius
☆18 · Aug 19, 2021 · Updated 4 years ago
Helw150 / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
☆16 · Jun 16, 2024 · Updated 2 years ago
ZehuaKcrissLi / GTR-Voice
☆16 · Nov 11, 2024 · Updated last year
cleoag / KinectGate
KinectSDK to AS3 socket gate
☆14 · Sep 3, 2011 · Updated 14 years ago
nivibilla / efficient-vits-finetuning
Finetuning VITS Efficiently
☆32 · Nov 6, 2023 · Updated 2 years ago
axelspringer / DeepPhonemizer
Grapheme to phoneme conversion with deep learning.
☆432 · Dec 8, 2023 · Updated 2 years ago
ajd12342 / paraspeechclap
Codebase for 'ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining'
☆23 · Jun 20, 2026 · Updated last month
shanemcandrewai / Speech-to-Facial-Landmarks
Replication of speech to facial landmarks results
☆11 · Jun 17, 2020 · Updated 6 years ago
deepvk / vitrina
👀 VITRina: VIsual Token Representations
☆11 · Jun 15, 2023 · Updated 3 years ago
xih108 / Video_Completion
☆10 · Dec 14, 2020 · Updated 5 years ago