archinetai/audio-data-pytorch

A collection of useful audio datasets and transforms for PyTorch.

☆144

Alternatives and similar repositories for audio-data-pytorch

Users that are interested in audio-data-pytorch are comparing it to the libraries listed below.

archinetai / audio-diffusion-pytorch-trainer
Trainer for audio-diffusion-pytorch
☆129 · Jan 13, 2023 · Updated 3 years ago
archinetai / audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
☆2,098 · Jun 12, 2023 · Updated 3 years ago
archinetai / audio-encoders-pytorch
A collection of audio autoencoders, in PyTorch.
☆44 · Mar 7, 2023 · Updated 3 years ago
Kinyugo / msanii
A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently.
☆196 · Apr 27, 2023 · Updated 3 years ago
archinetai / archisound
A collection of pre-trained audio models, in PyTorch.
☆116 · Jan 27, 2023 · Updated 3 years ago
archinetai / a-unet
A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch.
☆88 · Jun 12, 2023 · Updated 3 years ago
LAION-AI / audio-dataset
Audio Dataset for training CLAP and other models
☆747 · Jan 8, 2026 · Updated 6 months ago
yoyolicoris / music-spectrogram-diffusion-pytorch
☆88 · Jan 29, 2023 · Updated 3 years ago
genisplaja / diffusion-vocal-sep
Code for "A diffusion-inspired training strategy for singing voice extraction in the waveform domain" (ISMIR 2022)
☆17 · Feb 16, 2023 · Updated 3 years ago
YoonjinXD / kadtk
A standardized toolkit of Kernel Audio Distance (KAD)—a distribution-free, unbiased, and computationally efficient metric for evaluating …
☆104 · Jun 12, 2025 · Updated last year
york135 / MIRMLPop
The MIR-MLPop dataset and the official implementation of the paper "MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics …
☆35 · Apr 22, 2024 · Updated 2 years ago
archinetai / cqt-pytorch
An invertible and differentiable implementation of the Constant-Q Transform (CQT).
☆73 · Dec 9, 2022 · Updated 3 years ago
eloimoliner / unconditional-diff-STFT
Unconditional music synthesis using a diffusion model in the STFT domain
☆12 · May 31, 2022 · Updated 4 years ago
Harmonai-org / oobleck
open soundstream-ish VAE codecs for downstream neural audio synthesis
☆124 · Jun 12, 2023 · Updated 3 years ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
liusongxiang / Large-Audio-Models
Keep track of big models in audio domain, including speech, singing, music etc.
☆515 · Jul 3, 2026 · Updated 2 weeks ago
XiaoyuBIE1994 / SDCodec
(ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec
☆48 · May 16, 2025 · Updated last year
reseval / reseval
Reproducible Subjective Evaluation
☆61 · Mar 3, 2024 · Updated 2 years ago
gudgud96 / frechet-audio-distance
A lightweight library for Frechet Audio Distance calculation.
☆317 · Feb 11, 2026 · Updated 5 months ago
NVIDIA / diffusion-audio-restoration
Audio-to-Audio Schrodinger Bridges is a diffusion-based audio restoration model for bandwidth extension and inpainting.
☆145 · Aug 13, 2025 · Updated 11 months ago
nii-yamagishilab / midi-to-audio
Project for MIDI to Audio Synthesis
☆27 · Mar 13, 2023 · Updated 3 years ago
aim-qmul / sdx23-aimless
Source Separation training codebase for the Sound Demixing Challenge 2023.
☆45 · May 18, 2023 · Updated 3 years ago
archinetai / audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
☆1,906 · Jan 4, 2024 · Updated 2 years ago
iamycy / diffwave-sr
☆87 · May 21, 2023 · Updated 3 years ago
csteinmetz1 / auraloss
Collection of audio-focused loss functions in PyTorch
☆874 · Jul 30, 2024 · Updated last year
SuperKogito / pydiogment
Python library for audio augmentation
☆84 · Jul 6, 2023 · Updated 3 years ago
drscotthawley / audio-algebra
alchemy with embeddings
☆34 · Jun 14, 2023 · Updated 3 years ago
descriptinc / audiotools
Object-oriented handling of audio data, with GPU-powered augmentations, and more.
☆350 · Apr 1, 2025 · Updated last year
Aratako / CALM-DACVAE
An attempt to reproduce CALM (Continuous Audio Language Models) using DACVAE as the audio VAE.
☆17 · Feb 20, 2026 · Updated 5 months ago
vivjay30 / pnf-sampling
☆22 · Jun 8, 2021 · Updated 5 years ago
fgnt / paderbox
Paderbox: A collection of utilities for audio / speech processing
☆43 · Jul 21, 2025 · Updated last year
teticio / audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
☆792 · Sep 25, 2024 · Updated last year
lucidrains / audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
☆2,621 · Jan 12, 2025 · Updated last year
nomonosound / log-wmse-audio-quality
logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even whe…
☆39 · Jun 24, 2025 · Updated last year
salu133445 / deepperformer
Deep Performer: Score-to-audio music performance synthesis
☆47 · Jun 26, 2023 · Updated 3 years ago
albertfgu / diffwave-sashimi
Implementation of DiffWave and SaShiMi audio generation models
☆128 · Apr 4, 2023 · Updated 3 years ago
Netflix-Skunkworks / listening-test-app
☆21 · May 23, 2024 · Updated 2 years ago
acids-ircam / cached_conv
☆58 · May 31, 2023 · Updated 3 years ago
hugofloresgarcia / vampnet
music generation with masked transformers!
☆357 · May 16, 2025 · Updated last year