revdotcom/speech-datasets

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/revdotcom/speech-datasets)

revdotcom / speech-datasets

Various speech datasets made available to the public

☆135

Alternatives and similar repositories for speech-datasets

Users that are interested in speech-datasets are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

revdotcom / fstalign
View on GitHub
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
☆169May 12, 2026Updated 2 months ago
huangruizhe / ConEC
View on GitHub
☆14Jun 17, 2024Updated 2 years ago
m-wiesner / nnet_pytorch
View on GitHub
Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.
☆26Jul 25, 2024Updated last year
qiujiali / lattice-rescore
View on GitHub
☆16Jun 13, 2022Updated 4 years ago
revdotcom / words2num
View on GitHub
Convert words to numbers
☆21Apr 13, 2022Updated 4 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
bagustris / s3prl-ser
View on GitHub
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15Feb 28, 2026Updated 4 months ago
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
sarahjuan / iban
View on GitHub
☆14Jun 12, 2015Updated 11 years ago
asappresearch / multistream-cnn
View on GitHub
Multistream CNN for Robust Acoustic Modeling
☆40Jun 17, 2021Updated 5 years ago
ICASSP2021-tutorial9 / Distant_conversational_ASR_and_analysis
View on GitHub
☆12Jun 10, 2021Updated 5 years ago
waldo-seg / waldo
View on GitHub
image-segmentation and text-localization
☆12Aug 22, 2018Updated 7 years ago
SpeechColab / GigaSpeech
View on GitHub
Large, modern dataset for speech recognition
☆731Feb 26, 2024Updated 2 years ago
SpeechColab / PySpeechColab
View on GitHub
A library of speech gadgets.
☆15Oct 15, 2022Updated 3 years ago
robflynnyh / long-context-asr
View on GitHub
Code for the paper: How Much Context Does My Attention-Based ASR System Need?
☆11Jul 3, 2026Updated 2 weeks ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
csukuangfj / kaldilm
View on GitHub
Python wrapper for kaldi's arpa2fst
☆38Aug 27, 2025Updated 10 months ago
Hannes1 / react-native-wenet
View on GitHub
Wenet speech to text for react native
☆10Nov 1, 2022Updated 3 years ago
BriansIDP / WhisperBiasing
View on GitHub
☆88Jul 31, 2025Updated 11 months ago
skakouros / s3prl_attentive_correlation
View on GitHub
Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
☆13Nov 18, 2022Updated 3 years ago
burchim / EfficientConformer
View on GitHub
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
☆221Jun 22, 2023Updated 3 years ago
fyvo / WMT-Biomed-Test
View on GitHub
☆13Aug 23, 2024Updated last year
lhotse-speech / lhotse
View on GitHub
Tools for handling multimodal data in machine learning projects.
☆1,143Jun 22, 2026Updated 3 weeks ago
navana-tech / baseline_recipe_is21s_indic_asr_challenge
View on GitHub
Multilingual and code-switching ASR challenges for low resource Indian languages.
☆23Jul 26, 2021Updated 4 years ago
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
XapaJIaMnu / gLM
View on GitHub
A GPU language model, based on btree backed tries.
☆30Mar 6, 2018Updated 8 years ago
dogancan / expected-edit-distance
View on GitHub
Expected edit distance implementation using OpenFst tools
☆11May 13, 2015Updated 11 years ago
farisalasmary / wav2vec2-kenlm
View on GitHub
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74Oct 11, 2021Updated 4 years ago
jtrmal / kaldi2020
View on GitHub
☆27Jan 19, 2021Updated 5 years ago
huggingface / open_asr_leaderboard
View on GitHub
☆229Updated this week
csukuangfj / kaldifeat
View on GitHub
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…
☆215Jul 10, 2026Updated last week
k2-fsa / fast_rnnt
View on GitHub
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149Aug 25, 2023Updated 2 years ago
JazminVidal / gop-ft
View on GitHub
Transfer learning approach to pronunciation scoring
☆12Jan 17, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
nvidia-riva / riva-asrlib-decoder
View on GitHub
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
☆91Feb 18, 2025Updated last year
isca-sig-rosp / ISCA-SIG-RoSP
View on GitHub
Web page for ISCA Special Interest Group: Robust Speech Processing (RoSP)
☆11Dec 4, 2023Updated 2 years ago
flashlight / text
View on GitHub
Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.
☆78Mar 31, 2026Updated 3 months ago
lumaku / ctc-segmentation
View on GitHub
Segment an audio file and obtain utterance alignments. (Python package)
☆348May 15, 2024Updated 2 years ago
tiro-is / tiro-speech-core
View on GitHub
This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core
☆15Jun 19, 2023Updated 3 years ago
wavlab-speech / cmu_multilingual_speech
View on GitHub
CMU multilingual speech repository
☆30Apr 15, 2022Updated 4 years ago
desh2608 / gss
View on GitHub
A simple package for Guided source separation (GSS)
☆134May 20, 2024Updated 2 years ago