open-source audio datasets
☆156Sep 7, 2023Updated 2 years ago
Alternatives and similar repositories for audio-datasets
Users that are interested in audio-datasets are comparing it to the libraries listed below
Sorting:
- Pretrained spoken language classifiers from audio.☆10Jan 21, 2021Updated 5 years ago
- PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis☆69Aug 3, 2021Updated 4 years ago
- LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation☆80Feb 24, 2021Updated 5 years ago
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee…☆81Jun 7, 2024Updated last year
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE☆88Apr 2, 2024Updated last year
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆15Dec 22, 2022Updated 3 years ago
- Official implementation of the source-filter HiFiGAN vocoder☆268Jul 29, 2023Updated 2 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models☆33Sep 2, 2022Updated 3 years ago
- A collection of datasets for the purpose of emotion recognition/detection in speech.☆400Sep 30, 2024Updated last year
- UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation☆76Aug 30, 2021Updated 4 years ago
- ☆41May 15, 2023Updated 2 years ago
- PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs☆347Feb 21, 2022Updated 4 years ago
- Dynamic Mixing For Speech Processing (mix-on-the-fly)☆21Jul 19, 2022Updated 3 years ago
- "Learning Discrete and Continuous Factors of Data via Alternating Disentanglement" accepted at ICML2019☆22Aug 22, 2019Updated 6 years ago
- Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)☆154Feb 1, 2023Updated 3 years ago
- BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis☆230Jul 13, 2022Updated 3 years ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆152Sep 14, 2023Updated 2 years ago
- Official implementation of MelHuBERT☆68Feb 21, 2026Updated last week
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆96Nov 20, 2024Updated last year
- Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation☆39Jul 16, 2020Updated 5 years ago
- Code for paper submission under review.☆35Oct 30, 2017Updated 8 years ago
- Dataset release for Emotional TTS in Indian Accent☆40Sep 2, 2022Updated 3 years ago
- Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units" (EMNLP 2023). https://arxiv.org/abs/2212.09730☆131Dec 8, 2023Updated 2 years ago
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).☆2,133Jun 6, 2024Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'☆101Jul 24, 2024Updated last year
- Official Implementation of StyleTTS☆462Jan 13, 2025Updated last year
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations☆184Mar 6, 2024Updated last year
- Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation☆410Nov 2, 2025Updated 4 months ago
- MMM 2021: Crossed-Time Delay Neural Network for Speaker Recognition☆11Dec 4, 2021Updated 4 years ago
- ☆17Jun 24, 2025Updated 8 months ago
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge☆21Jul 25, 2022Updated 3 years ago
- ☆11Jun 6, 2022Updated 3 years ago
- ☆47Aug 31, 2024Updated last year
- Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.☆1,135Nov 24, 2025Updated 3 months ago
- Audio Codec Speech processing Universal PERformance Benchmark☆297Jan 8, 2026Updated last month
- Collect Voice Conversion researches☆96Feb 26, 2026Updated last week
- Speech enhancement by time-varying pitch-dependent filtering of harmonics☆27Jul 3, 2014Updated 11 years ago
- Incorporating AutoVocoder to MB-iSTFT-VITS☆48Dec 1, 2022Updated 3 years ago
- Code for Novel View Acoustic Synthesis paper☆51Aug 14, 2023Updated 2 years ago