salesforce/speech-datasets

salesforce / speech-datasets

Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard feature computations & data augmentations.

☆15

Alternatives and similar repositories for speech-datasets

Users that are interested in speech-datasets are comparing it to the libraries listed below.

yoyolicoris / variational-diffwave
☆32 · Jul 27, 2022 · Updated 3 years ago
avi33 / universalmelgan
An unofficial implementation of Universal MelGAN, following https://arxiv.org/abs/2011.09631
☆23 · Aug 15, 2022 · Updated 3 years ago
rishabhjain16 / whisper_child_asr
☆12 · May 23, 2023 · Updated 3 years ago
harveenchadha / bol
Open Source Speech Inferencing Library for Indic Languages
☆12 · Apr 11, 2022 · Updated 4 years ago
BridgetteSong / Tacotron2
☆13 · Sep 21, 2022 · Updated 3 years ago
chloelavrat / torch-ddsp
A real-time implementation of DDSP from Google Magenta.
☆16 · Nov 8, 2021 · Updated 4 years ago
MiuLab / Lattice-ELMo
Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"
☆18 · Feb 11, 2023 · Updated 3 years ago
s-nlp / parallel_detoxification_dataset
Data from "Crowdsourcing of Parallel Corpora: the Case of Style Transfer for Detoxification" paper
☆14 · Apr 3, 2025 · Updated last year
jyhan03 / dpccn
This repository provides an implementation of the DPCCN model for single-channel speech separation. More details will be updated soon.
☆13 · Dec 8, 2021 · Updated 4 years ago
hBPF / hBPF
Simple DSL that compiles to BPF assembly
☆18 · Apr 19, 2018 · Updated 8 years ago
k2-fsa / multi_quantization
☆46 · Nov 2, 2023 · Updated 2 years ago
emirdemirel / ASA_ICASSP2021
A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi…
☆15 · Oct 13, 2022 · Updated 3 years ago
mugen-org / MUGEN_coinrun
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …
☆13 · Jul 13, 2022 · Updated 4 years ago
tqbl / ood_audio
An audio classification system for learning with out-of-distribution data
☆33 · Dec 8, 2022 · Updated 3 years ago
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆32 · Apr 8, 2022 · Updated 4 years ago
vliu15 / adversarial-tts
End-to-end Text-to-Speech with Generative Adversarial Networks
☆20 · Feb 6, 2021 · Updated 5 years ago
UniversalDataTool / udt-format
A simple universal data description format for datasets, tailored for interfacing with humans.
☆25 · Feb 16, 2021 · Updated 5 years ago
rabbitmetrics / langchain-agents-explained
☆21 · May 26, 2023 · Updated 3 years ago
mechanicalsea / sugar
Efficient Speech Processing Toolkit for Automatic Speaker Recognition
☆17 · Feb 8, 2023 · Updated 3 years ago
xuancong84 / singapore-address-heatmap
A database and crawling script for Singapore postal code, address name and geo-coordinates
☆14 · Jul 29, 2020 · Updated 5 years ago
pjones / nix-hs
Haskell + nixpkgs = nix-hs
☆24 · Jun 2, 2021 · Updated 5 years ago
NeuroWave-ai / CUCVAE-TTS
☆25 · Mar 12, 2022 · Updated 4 years ago
aknutas / nails
Network Analysis Interface for Literature Studies
☆23 · May 12, 2023 · Updated 3 years ago
amazon-science / proteno
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…
☆45 · May 25, 2021 · Updated 5 years ago
dafyddg / RFA
Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…
☆17 · Apr 27, 2023 · Updated 3 years ago
lpeterse / haskell-ssh
An SSH implementation in pure Haskell
☆17 · Feb 14, 2022 · Updated 4 years ago
mutiann / speech_rankings
A CSRankings-like index for speech researchers
☆35 · Oct 16, 2024 · Updated last year
langcog / childes-db
A SQL interface for the CHILDES child language corpora
☆14 · Sep 30, 2022 · Updated 3 years ago
keonlee9420 / WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
☆68 · Aug 3, 2021 · Updated 4 years ago
vincentpierre / PythonUnitySharedMemory
Using a shared file to exchange data between Unity and Python
☆13 · Oct 30, 2021 · Updated 4 years ago
alvations / kopitiam
How to Order Coffee in Singapore?
☆11 · Jul 28, 2023 · Updated 2 years ago
sushant-t / tts-trainer
Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…
☆30 · May 27, 2023 · Updated 3 years ago
daanzu / kaldi_ag_training
Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma…
☆21 · Jan 24, 2022 · Updated 4 years ago
neurlang / dataset
IPA Phonetic dataset lexicon
☆18 · Jun 20, 2026 · Updated last month
haideraltahan / CLAR
☆18 · Apr 12, 2021 · Updated 5 years ago
HazyResearch / ludwig-benchmarking-toolkit
Ludwig benchmark
☆20 · May 11, 2026 · Updated 2 months ago
nytopop / csm
A Conversational Speech Generation Model
☆14 · Mar 16, 2025 · Updated last year
ga642381 / RobustVC
**ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》 Using speech enhancement and end-to-end denoising training to improve degrada…
☆24 · Sep 27, 2022 · Updated 3 years ago
nytopop / illu
Real-time conversational dynamics
☆19 · Mar 19, 2025 · Updated last year