AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
★944 · Jul 8, 2025 · Updated 11 months ago
Alternatives and similar repositories for ai-audio-datasets
Users that are interested in ai-audio-datasets are comparing it to the libraries listed below.
- Audio Codec Speech processing Universal PERformance Benchmark ★306 · May 5, 2026 · Updated last month
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ★1,814 · Jan 26, 2026 · Updated 4 months ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ★254 · Mar 7, 2025 · Updated last year
- Unified automatic quality assessment for speech, music, and sound. ★727 · Jun 5, 2025 · Updated last year
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ★103 · Feb 5, 2024 · Updated 2 years ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ★374 · Sep 3, 2024 · Updated last year
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ★347 · Aug 4, 2025 · Updated 10 months ago
- Audio Dataset for training CLAP and other models ★740 · Jan 8, 2026 · Updated 5 months ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ★204 · Dec 13, 2024 · Updated last year
- The Open Source Code of UniAudio ★604 · Jul 22, 2024 · Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ★194 · Jul 12, 2024 · Updated last year
- Awesome speech/audio LLMs, representation learning, and codec models ★1,230 · Jun 1, 2026 · Updated last week
- music generation with masked transformers! ★354 · May 16, 2025 · Updated last year
- a list of demo websites for automatic music generation research ★787 · May 20, 2026 · Updated 3 weeks ago
- A comprehensive list of open-source datasets for voice and sound computing (95+ datasets). ★2,202 · Jun 6, 2024 · Updated 2 years ago
- Fine-tune Stable Audio Open with DiT ControlNet. ★252 · May 16, 2025 · Updated last year
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ★1,129 · Aug 7, 2024 · Updated last year
- ★263 · Feb 14, 2024 · Updated 2 years ago
- Keep track of big models in audio domain, including speech, singing, music etc. ★512 · Sep 26, 2024 · Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ★674 · Dec 27, 2023 · Updated 2 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ★92 · Dec 20, 2024 · Updated last year
- A lightweight library for Frechet Audio Distance calculation. ★315 · Feb 11, 2026 · Updated 4 months ago
- A timeline of the latest AI models for audio generation, starting in 2023! ★1,910 · Jan 4, 2024 · Updated 2 years ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ★659 · Jun 9, 2024 · Updated 2 years ago
- Contrastive Language-Audio Pretraining ★2,178 · May 15, 2025 · Updated last year
- Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation ★408 · Nov 2, 2025 · Updated 7 months ago
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ★1,140 · Dec 15, 2025 · Updated 5 months ago
- Audiogen Codec ★145 · Jul 9, 2024 · Updated last year
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ★92 · Apr 2, 2024 · Updated 2 years ago
- Audio generation using diffusion models, in PyTorch. ★2,099 · Jun 12, 2023 · Updated 3 years ago
- Collection of audio-focused loss functions in PyTorch ★867 · Jul 30, 2024 · Updated last year
- A paper and project list about the cutting edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (… ★483 · Sep 28, 2022 · Updated 3 years ago
- Audio-FLAN ★161 · Sep 23, 2025 · Updated 8 months ago
- The open source code for LLM-Codec ★147 · Aug 18, 2024 · Updated last year
- Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models ★242 · Dec 18, 2025 · Updated 5 months ago
- Encode and decode audio samples to/from compressed latent representations! ★258 · Sep 19, 2025 · Updated 8 months ago
- FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener… ★444 · Jan 25, 2024 · Updated 2 years ago
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ★161 · Mar 26, 2026 · Updated 2 months ago
- The open source code for SimpleSpeech series ★146 · Oct 8, 2024 · Updated last year