AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
★950 · Jul 8, 2025 · Updated 11 months ago
Alternatives and similar repositories for ai-audio-datasets
Users who are interested in ai-audio-datasets are comparing it to the libraries listed below.
- Audio Codec Speech processing Universal PERformance Benchmark ★307 · May 5, 2026 · Updated last month
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ★1,828 · Jan 26, 2026 · Updated 5 months ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ★254 · Mar 7, 2025 · Updated last year
- Unified automatic quality assessment for speech, music, and sound. ★733 · Jun 5, 2025 · Updated last year
- TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization ★103 · Feb 5, 2024 · Updated 2 years ago
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching" ★376 · Sep 3, 2024 · Updated last year
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ★351 · Aug 4, 2025 · Updated 10 months ago
- Audio Dataset for training CLAP and other models ★744 · Jan 8, 2026 · Updated 5 months ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ★207 · Dec 13, 2024 · Updated last year
- The Open Source Code of UniAudio ★607 · Jul 22, 2024 · Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ★195 · Jul 12, 2024 · Updated last year
- Awesome speech/audio LLMs, representation learning, and codec models ★1,232 · Jun 1, 2026 · Updated last month
- music generation with masked transformers! ★356 · May 16, 2025 · Updated last year
- a list of demo websites for automatic music generation research ★790 · May 20, 2026 · Updated last month
- A comprehensive list of open-source datasets for voice and sound computing (95+ datasets). ★2,204 · Jun 6, 2024 · Updated 2 years ago
- Fine-tune Stable Audio Open with DiT ControlNet. ★254 · May 16, 2025 · Updated last year
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ★1,132 · Aug 7, 2024 · Updated last year
- ★265 · Feb 14, 2024 · Updated 2 years ago
- Keep track of big models in audio domain, including speech, singing, music etc. ★514 · Sep 26, 2024 · Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ★675 · Dec 27, 2023 · Updated 2 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications ★92 · Dec 20, 2024 · Updated last year
- A lightweight library for Frechet Audio Distance calculation. ★315 · Feb 11, 2026 · Updated 4 months ago
- A timeline of the latest AI models for audio generation, starting in 2023! ★1,909 · Jan 4, 2024 · Updated 2 years ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ★659 · Jun 9, 2024 · Updated 2 years ago
- Contrastive Language-Audio Pretraining ★2,197 · May 15, 2025 · Updated last year
- Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation ★409 · Nov 2, 2025 · Updated 8 months ago
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ★1,146 · Dec 15, 2025 · Updated 6 months ago
- Audiogen Codec ★146 · Jul 9, 2024 · Updated last year
- Collection of audio-focused loss functions in PyTorch ★871 · Jul 30, 2024 · Updated last year
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ★92 · Apr 2, 2024 · Updated 2 years ago
- Audio generation using diffusion models, in PyTorch. ★2,100 · Jun 12, 2023 · Updated 3 years ago
- A paper and project list about the cutting edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (… ★483 · Sep 28, 2022 · Updated 3 years ago
- Audio-FLAN ★161 · Sep 23, 2025 · Updated 9 months ago
- The open source code for LLM-Codec ★147 · Aug 18, 2024 · Updated last year
- Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models ★243 · Dec 18, 2025 · Updated 6 months ago
- Encode and decode audio samples to/from compressed latent representations! ★264 · Sep 19, 2025 · Updated 9 months ago
- FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener… ★443 · Jan 25, 2024 · Updated 2 years ago
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets' ★162 · Mar 26, 2026 · Updated 3 months ago
- The open source code for SimpleSpeech series ★147 · Oct 8, 2024 · Updated last year