sakemin / demucs_batch-multigpu
[Batching/MultiGPU/DataLoader Implemented] Code for the paper Hybrid Spectrogram and Waveform Source Separation
☆23 Updated 2 years ago
Alternatives and similar repositories for demucs_batch-multigpu
Users that are interested in demucs_batch-multigpu are comparing it to the libraries listed below
- million song dataset split for extended clean tag & artist-level stratified ☆52 Updated 2 years ago
- JamendoMaxCaps is a large-scale dataset of 362,000 instrumental creative commons tracks ☆44 Updated 6 months ago
- Source Separation training codebase for the Sound Demixing Challenge 2023. ☆44 Updated 2 years ago
- ☆45 Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆99 Updated last year
- Official repository for the paper - SLAP: Siamese Language-Audio Pretraining without negative samples for Music Understanding ☆53 Updated 2 months ago
- ☆72 Updated last year
- music semantic understanding evaluation benchmark ☆25 Updated 2 years ago
- Unofficial implementation JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models(https://arxiv.org/abs/2308.… ☆54 Updated last year
- AudioSR-Upsampling (any -> 48kHz) ☆42 Updated last year
- ☆64 Updated 5 months ago
- Frechet Audio Distance evaluation in PyTorch ☆36 Updated 2 years ago
- [ISMIR 2023] LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT ☆51 Updated 2 years ago
- PyTorch implementation of DiffRoll, a diffusion-based generative automatic music transcription (AMT) model ☆80 Updated 2 years ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆71 Updated 3 years ago
- ☆60 Updated 2 years ago
- Audiogen Codec ☆143 Updated last year
- A piano music dataset with Audio, Symbolic and Text labels ☆33 Updated 9 months ago
- PyTorch implementation of "Source Separation by Flow Matching (FLOSS)" by Google DeepMind ☆82 Updated 2 weeks ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆54 Updated 2 years ago
- ☆29 Updated 2 years ago
- logWMSE, an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source… ☆44 Updated 7 months ago
- The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset" ☆60 Updated 3 years ago
- A standardized toolkit of Kernel Audio Distance (KAD), a distribution-free, unbiased, and computationally efficient metric for evaluating … ☆92 Updated 6 months ago
- ☆110 Updated 3 months ago
- ☆83 Updated 2 years ago
- Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti… ☆88 Updated last month
- Project for MIDI to Audio Synthesis ☆25 Updated 2 years ago
- Prosody and Pronunciation Modification Network ☆60 Updated 7 months ago
- ☆85 Updated 2 years ago