DagsHub/audio-datasets

open-source audio datasets

☆158

Alternatives and similar repositories for audio-datasets

Users who are interested in audio-datasets are comparing it to the repositories listed below.

RicherMans / SpokenLanguageClassifiers
Pretrained spoken language classifiers from audio.
☆10 · Jan 21, 2021 · Updated 5 years ago
zceng / LVCNet
LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation
☆80 · Feb 24, 2021 · Updated 5 years ago
keonlee9420 / WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
☆68 · Aug 3, 2021 · Updated 4 years ago
khanld / Dynamic-Mixing
Dynamic Mixing For Speech Processing (mix-on-the-fly)
☆22 · Jul 19, 2022 · Updated 4 years ago
WangHelin1997 / SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…
☆83 · Jun 7, 2024 · Updated 2 years ago
innnky / descript-audio-vae
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
☆92 · Apr 2, 2024 · Updated 2 years ago
rishikksh20 / UnivNet-pytorch
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
☆76 · Aug 30, 2021 · Updated 4 years ago
SuperKogito / SER-datasets
A collection of datasets for the purpose of emotion recognition/detection in speech.
☆420 · Sep 30, 2024 · Updated last year
chomeyama / SiFiGAN
Official implementation of the source-filter HiFiGAN vocoder
☆275 · Jul 29, 2023 · Updated 3 years ago
FrancoisGrondin / BIRD
Big Impulse Response Dataset
☆159 · Oct 19, 2022 · Updated 3 years ago
dr-pato / SSGD
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆15 · Dec 22, 2022 · Updated 3 years ago
keonlee9420 / DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
☆349 · Feb 21, 2022 · Updated 4 years ago
tencent-ailab / bddm
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
☆238 · Jul 13, 2022 · Updated 4 years ago
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 · Jan 20, 2025 · Updated last year
babe269 / performant
A toolset for easy formant extraction and visualization from wav files and TTS models
☆33 · Sep 2, 2022 · Updated 3 years ago
JinhuaLiang / lam4fsl
An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"
☆31 · May 31, 2023 · Updated 3 years ago
iver56 / torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
☆1,162 · Nov 24, 2025 · Updated 8 months ago
jim-schwoebel / voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
☆2,212 · Jun 6, 2024 · Updated 2 years ago
KdaiP / DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
☆57 · Nov 19, 2025 · Updated 8 months ago
Mddct / usm-tokenizer
semantic tokenizer for speech and music
☆20 · Jul 6, 2025 · Updated last year
voidful / vall-e-encodec
☆41 · May 15, 2023 · Updated 3 years ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
Eps-Acoustic-Revolution-Lab / EAR_HEAR
☆15 · Jan 9, 2026 · Updated 6 months ago
snu-mllab / DisentanglementICML19
"Learning Discrete and Continuous Factors of Data via Alternating Disentanglement" accepted at ICML2019
☆22 · Aug 22, 2019 · Updated 6 years ago
facebookresearch / novel-view-acoustic-synthesis
Code for Novel View Acoustic Synthesis paper
☆54 · Aug 14, 2023 · Updated 2 years ago
ncsoft / avocodo
Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)
☆154 · Feb 1, 2023 · Updated 3 years ago
FrancoisGrondin / gccphat
☆17 · Oct 26, 2018 · Updated 7 years ago
X-LANCE / StoryTTS
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
☆141 · Apr 27, 2024 · Updated 2 years ago
felixgontier / dcase-2023-baseline
☆14 · Mar 25, 2023 · Updated 3 years ago
mutiann / neural-lexicon-reader
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
☆21 · Jul 25, 2022 · Updated 4 years ago
dpwe / pitchfilter
Speech enhancement by time-varying pitch-dependent filtering of harmonics
☆27 · Jul 3, 2014 · Updated 12 years ago
0nutation / USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)
☆152 · Sep 14, 2023 · Updated 2 years ago
IsraelCohenLab / ConstantBeamwidthBeamformingNonuniform
☆15 · May 9, 2022 · Updated 4 years ago
hhc1997 / vggsound_download
download the vggsound dataset
☆22 · Feb 22, 2022 · Updated 4 years ago
yanghaha0908 / FastHuBERT
Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning
☆100 · Nov 20, 2024 · Updated last year
noiseux1523 / NIST-SRE-2019
Score Normalization for NIST 2019 Speaker Recognition Evaluation
☆10 · Nov 8, 2019 · Updated 6 years ago
xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
skit-ai / emotion-tts-dataset
Dataset release for Emotional TTS in Indian Accent
☆41 · Mar 25, 2026 · Updated 4 months ago
keonlee9420 / DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
☆260 · Jun 5, 2025 · Updated last year