Yuan-ManX / ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵: a collection of speech, music, and sound-effect datasets that provide training data for generative AI, AIGC, AI model training, intelligent audio tool development, and other audio applications.
⭐ 870 · Updated 5 months ago
Alternatives and similar repositories for ai-audio-datasets
Users interested in ai-audio-datasets are comparing it to the libraries listed below.
- Audio Dataset for training CLAP and other models ⭐ 721 · Updated last year
- Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training". ⭐ 420 · Updated 6 months ago
- AudioLDM training, finetuning, evaluation and inference. ⭐ 284 · Updated 11 months ago
- Learning audio concepts from natural language supervision ⭐ 618 · Updated last year
- Official PyTorch implementation of BigVGAN (ICLR 2023) ⭐ 1,153 · Updated last year
- A paper and project list about cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (… ⭐ 456 · Updated 3 years ago
- A list of demo websites for automatic music generation research ⭐ 756 · Updated last month
- Keep track of big models in the audio domain, including speech, singing, music, etc. ⭐ 503 · Updated last year
- This toolbox aims to unify audio generation model evaluation for easier comparison. ⭐ 365 · Updated last year
- Unified automatic quality assessment for speech, music, and sound. ⭐ 643 · Updated 6 months ago
- PyTorch implementation of the CREPE pitch tracker ⭐ 492 · Updated 6 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ⭐ 1,021 · Updated last year
- Collection of resources on the applications of Large Language Models (LLMs) in Audio AI. ⭐ 705 · Updated last month
- Implementation of Band Split Roformer, SOTA attention network for music source separation, out of ByteDance AI Labs ⭐ 674 · Updated 3 months ago
- All-In-One Music Structure Analyzer ⭐ 674 · Updated last year
- Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, audio generation ⭐ 406 · Updated last month
- MU-LLaMA: Music Understanding Large Language Model ⭐ 296 · Updated 3 months ago
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ⭐ 897 · Updated 3 weeks ago
- Metadata, scripts and baselines for the MTG-Jamendo dataset ⭐ 341 · Updated this week
- LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23] ⭐ 340 · Updated last year
- An Open-source Streaming High-fidelity Neural Audio Codec ⭐ 497 · Updated 9 months ago
- Implementation of MusicLM, a text-to-music model published by Google Research, with a few modifications. ⭐ 551 · Updated 2 years ago
- Mustango: Toward Controllable Text-to-Music Generation ⭐ 382 · Updated 6 months ago
- State-of-the-art audio codec with 90x compression factor. Supports 44.1 kHz, 24 kHz, and 16 kHz mono/stereo audio. ⭐ 1,636 · Updated this week
- A lightweight library for Frechet Audio Distance calculation. ⭐ 302 · Updated last week
- Collection of audio-focused loss functions in PyTorch ⭐ 823 · Updated last year
- Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, an… ⭐ 369 · Updated last year
- VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer ⭐ 351 · Updated last year
- The Open Source Code of UniAudio ⭐ 587 · Updated last year
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ⭐ 461 · Updated last year