Yuan-ManX / ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
★641 · Updated last week
Alternatives and similar repositories for ai-audio-datasets:
Users who are interested in ai-audio-datasets are comparing it to the libraries listed below:
- Audio Dataset for training CLAP and other models ★668 · Updated last year
- Keep track of big models in audio domain, including speech, singing, music etc. ★470 · Updated 5 months ago
- PyTorch implementation of the CREPE pitch tracker ★426 · Updated 8 months ago
- Learning audio concepts from natural language supervision ★528 · Updated 5 months ago
- Collection of resources on the applications of Large Language Models (LLMs) in Audio AI. ★653 · Updated 7 months ago
- Official PyTorch implementation of BigVGAN (ICLR 2023) ★961 · Updated 5 months ago
- AudioLDM training, finetuning, evaluation and inference. ★234 · Updated 2 months ago
- Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training". ★353 · Updated 10 months ago
- DeepAFx-ST - Style transfer of audio effects with differentiable signal processing. Please see https://csteinmetz1.github.io/DeepAFx-ST/ ★381 · Updated last year
- Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, audio generation ★369 · Updated this week
- VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer ★334 · Updated 4 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ★884 · Updated 6 months ago
- Object-oriented handling of audio data, with GPU-powered augmentations, and more. ★258 · Updated 2 months ago
- Contrastive Language-Audio Pretraining ★1,543 · Updated 3 months ago
- An Open-source Streaming High-fidelity Neural Audio Codec ★458 · Updated 4 months ago
- MU-LLaMA: Music Understanding Large Language Model ★264 · Updated 11 months ago
- This toolbox aims to unify audio generation model evaluation for easier comparison. ★320 · Updated 5 months ago
- A list of publicly available room impulse response datasets and scripts to download them. ★441 · Updated 4 months ago
- A paper and project list about the cutting edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (… ★414 · Updated 2 years ago
- A list of demo websites for automatic music generation research ★663 · Updated this week
- Metadata, scripts and baselines for the MTG-Jamendo dataset ★296 · Updated 7 months ago
- This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf ★378 · Updated 2 years ago
- Mustango: Toward Controllable Text-to-Music Generation ★354 · Updated 7 months ago
- A lightweight library for Frechet Audio Distance calculation. ★254 · Updated 6 months ago
- Collection of audio-focused loss functions in PyTorch ★765 · Updated 7 months ago
- Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs ★500 · Updated last month
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ★1,309 · Updated 7 months ago
- Pitch Estimating Neural Networks (PENN) ★242 · Updated 7 months ago
- A simple library for Fréchet Audio Distance (FAD) calculation ★180 · Updated 2 weeks ago
- Repository for training models for music source separation. ★643 · Updated 2 weeks ago