DagsHub / audio-datasets
open-source audio datasets
☆141 Updated last year
Related projects
Alternatives and complementary repositories for audio-datasets
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆81 Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models ☆140 Updated 10 months ago
- A collection of useful audio datasets and transforms for PyTorch. ☆132 Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆87 Updated 3 months ago
- The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022 ☆186 Updated 2 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆58 Updated 2 years ago
- Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset ☆77 Updated 9 months ago
- Object-oriented handling of audio data, with GPU-powered augmentations, and more. ☆233 Updated 2 weeks ago
- ☆63 Updated last month
- A lightweight library for Frechet Audio Distance calculation. ☆236 Updated 2 months ago
- Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022) ☆106 Updated last year
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆84 Updated last year
- GOMIN; Gaudio Open Mel-spectrogram Inversion Network ☆109 Updated 9 months ago
- A DDSP-based neural voice synthesiser. ☆107 Updated last week
- This project is about performing Speaker diarization for Hindi Language. ☆45 Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆47 Updated last year
- Toward Universal Text-to-Music-Retrieval (TTMR) [ICASSP23] ☆111 Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆71 Updated 3 years ago
- Dataset and baseline code for the VocalSound dataset (ICASSP2022). ☆122 Updated last year
- 2021 ISMIR tutorial - music classification ☆143 Updated 2 years ago
- ☆59 Updated last month
- Predicts the level of noise and reverberation on your audiofiles ☆138 Updated 5 months ago
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021 ☆63 Updated 2 years ago
- Pytorch implementation of deep audio embedding calculation ☆98 Updated last year
- Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021) ☆187 Updated 7 months ago
- A simple library for Fréchet Audio Distance (FAD) calculation ☆145 Updated 3 weeks ago
- A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch. ☆83 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆44 Updated 4 months ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆63 Updated 3 years ago
- Wav2Vec for speech recognition, classification, and audio classification ☆249 Updated 2 years ago