Simplified recipes for preparing commonly used speech datasets, and a PyTorch-compatible Python data loader that can perform standard feature computations & data augmentations.
☆15Jun 12, 2023Updated 2 years ago
Alternatives and similar repositories for speech-datasets
Users that are interested in speech-datasets are comparing it to the libraries listed below.
- ☆32Jul 27, 2022Updated 3 years ago
- This is an unofficial implementation of Universal MelGAN according to https://arxiv.org/abs/2011.09631☆23Aug 15, 2022Updated 3 years ago
- Open Source Speech Inferencing Library for Indic Languages☆13Apr 11, 2022Updated 3 years ago
- ☆13Sep 21, 2022Updated 3 years ago
- An SSH implementation in pure Haskell☆17Feb 14, 2022Updated 4 years ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- A real-time implementation of DDSP from Google Magenta.☆15Nov 8, 2021Updated 4 years ago
- Data from "Crowdsourcing of Parallel Corpora: the Case of Style Transfer for Detoxification" paper☆14Apr 3, 2025Updated 11 months ago
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀☆20May 20, 2025Updated 10 months ago
- Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"☆17Feb 11, 2023Updated 3 years ago
- This repository provides an implementation of the DPCCN model for single-channel speech separation. More details will be updated soon.☆13Dec 8, 2021Updated 4 years ago
- A duration-invariant audio-to-lyrics alignment pipeline with low memory footprint which segments long music recordings via a recursive bi…☆15Oct 13, 2022Updated 3 years ago
- An audio classification system for learning with out-of-distribution data☆33Dec 8, 2022Updated 3 years ago
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts …☆13Jul 13, 2022Updated 3 years ago
- ☆46Nov 2, 2023Updated 2 years ago
- A simple universal data description format for datasets, tailored for interfacing with humans.☆25Feb 16, 2021Updated 5 years ago
- Efficient Speech Processing Toolkit for Automatic Speaker Recognition☆17Feb 8, 2023Updated 3 years ago
- Haskell + nixpkgs = nix-hs☆24Jun 2, 2021Updated 4 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 3 years ago
- End-to-end Text-to-Speech with Generative Adversarial Networks☆20Feb 6, 2021Updated 5 years ago
- ☆25Mar 12, 2022Updated 4 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45May 25, 2021Updated 4 years ago
- A CSRankings-like index for speech researchers