open-source audio datasets
☆157Sep 7, 2023Updated 2 years ago
Alternatives and similar repositories for audio-datasets
Users who are interested in audio-datasets are comparing it to the repositories listed below.
- Pretrained spoken language classifiers from audio.☆10Jan 21, 2021Updated 5 years ago
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation☆80Feb 24, 2021Updated 5 years ago
- PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis☆68Aug 3, 2021Updated 4 years ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆82Jun 7, 2024Updated last year
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE☆92Apr 2, 2024Updated 2 years ago
- Dynamic Mixing For Speech Processing (mix-on-the-fly)☆21Jul 19, 2022Updated 3 years ago
- Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum☆34Dec 15, 2024Updated last year
- A collection of datasets for the purpose of emotion recognition/detection in speech.☆414Sep 30, 2024Updated last year
- Whisper Speech Quality Assessment (WhiSQA)☆16Apr 14, 2026Updated last month
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation☆76Aug 30, 2021Updated 4 years ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆15Dec 22, 2022Updated 3 years ago
- A simple app for recording speech datasets.☆26Jun 27, 2022Updated 3 years ago
- Official implementation of the source-filter HiFiGAN vocoder☆272Jul 29, 2023Updated 2 years ago
- Big Impulse Response Dataset☆159Oct 19, 2022Updated 3 years ago
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs☆349Feb 21, 2022Updated 4 years ago
- BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis☆238Jul 13, 2022Updated 3 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models☆33Sep 2, 2022Updated 3 years ago
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).☆2,191Jun 6, 2024Updated last year
- Code for Novel View Acoustic Synthesis paper☆54Aug 14, 2023Updated 2 years ago
- ☆17Oct 26, 2018Updated 7 years ago
- Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.☆1,148Nov 24, 2025Updated 6 months ago
- "Learning Discrete and Continuous Factors of Data via Alternating Disentanglement" accepted at ICML2019☆22Aug 22, 2019Updated 6 years ago
- ☆41May 15, 2023Updated 3 years ago
- Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)☆154Feb 1, 2023Updated 3 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆141Apr 27, 2024Updated 2 years ago
- ☆14Mar 25, 2023Updated 3 years ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆151Sep 14, 2023Updated 2 years ago
- Speech enhancement by time-varying pitch-dependent filtering of harmonics☆27Jul 3, 2014Updated 11 years ago
- Reference Implementations of Waveform Evaluation Networks (WEnets)☆27Sep 18, 2023Updated 2 years ago
- ☆16Feb 19, 2026Updated 3 months ago
- ☆15May 9, 2022Updated 4 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆97Nov 20, 2024Updated last year
- Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023☆254Jun 5, 2025Updated 11 months ago
- ☆11Jun 6, 2022Updated 3 years ago
- Dataset release for Emotional TTS in Indian Accent☆41Mar 25, 2026Updated 2 months ago
- ☆18Jun 24, 2025Updated 11 months ago
- Code for paper submission under review.☆35Oct 30, 2017Updated 8 years ago
- Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP.☆17Aug 8, 2021Updated 4 years ago