dscripka / synthetic_speech_dataset_generation
This repository contains text-to-speech (TTS) models and utilities designed to produce synthetic training datasets for other speech-related models.
★13 · Updated last year
Related projects:
- Indic-Conformer models for ASR ★15 · Updated 2 months ago
- Speech Recognition Challenge by Speech Lab - IIT Madras ★11 · Updated 3 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition ★27 · Updated 3 years ago
- ★38 · Updated last year
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ★27 · Updated 7 months ago
- ★10 · Updated 11 months ago
- ★11 · Updated 2 years ago
- ★16 · Updated 3 years ago
- English ASR Challenge organized by Speech Lab, IIT Madras ★11 · Updated 3 years ago
- Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recogni… ★24 · Updated 3 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ★27 · Updated 4 months ago
- Final training script from HuggingFace Whisper Fine tuning event - to get best results on finetuned model. ★12 · Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ★13 · Updated last year
- Speaker diarization service ★17 · Updated last week
- check your data, before you wreck your model ★16 · Updated 2 years ago
- Using YouTube to prepare a speech recognition dataset for any language ★10 · Updated 3 years ago
- A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund. ★10 · Updated 4 months ago
- Enable RNNLM lattice rescoring with PyTorch [kaldi] ★12 · Updated 4 years ago
- ★10 · Updated last year
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. ★10 · Updated 3 years ago
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge ★14 · Updated 2 years ago
- A pipeline to isolate and transcribe one language in mixed-language speech ★18 · Updated last year
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ★12 · Updated last year
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ★12 · Updated 3 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ★17 · Updated last year
- Word Error Rate Estimation ★10 · Updated 4 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ★24 · Updated 3 years ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ★16 · Updated 6 months ago
- A handy dataset of noises for ASR ★19 · Updated 5 years ago
- ★40 · Updated last year