seungheondoh / music_caps_dl
Unofficial download repository for MusicCaps
☆44 Updated last year
Related projects
Alternatives and complementary repositories for music_caps_dl
- Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.… ☆52 Updated 10 months ago
- Million Song Dataset split for extended clean tag & artist-level stratified ☆47 Updated last year
- Robust Singing Voice Transcription and MIDI Extraction ☆56 Updated 3 months ago
- A collection of audio autoencoders, in PyTorch. ☆39 Updated last year
- ☆40 Updated 5 months ago
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆65 Updated 4 months ago
- ☆51 Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆88 Updated 3 months ago
- "Music Style Transfer with Time-Varying Inversion of Diffusion Models" ☆35 Updated 3 months ago
- A simple library for Fréchet Audio Distance (FAD) calculation ☆145 Updated this week
- A DDSP-based neural voice synthesiser. ☆109 Updated last week
- ☆34 Updated 5 months ago
- ☆31 Updated 7 months ago
- ☆74 Updated last year
- Chorale Music Separation Dataset and Model Framework ☆32 Updated last year
- ☆79 Updated last year
- [ISMIR 2023] LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT ☆38 Updated last year
- Implementation of the paper, T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis, ac… ☆26 Updated 5 months ago
- Toward Universal Text-to-Music-Retrieval (TTMR) [ICASSP23] ☆112 Updated last year
- Unsupervised Music Source Separation Using Differentiable Parametric Source Models ☆60 Updated last year
- ☆61 Updated 7 months ago
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation ☆32 Updated this week
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆37 Updated last month
- Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls ☆74 Updated 4 months ago
- Code for the ISMIR 2022 paper: Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention ☆91 Updated 7 months ago
- ☆44 Updated last month
- Source Separation training codebase for the Sound Demixing Challenge 2023. ☆37 Updated last year
- An invertible and differentiable implementation of the Constant-Q Transform (CQT). ☆54 Updated last year
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆64 Updated 2 weeks ago
- Official Implementation of EnCLAP (ICASSP 2024) ☆90 Updated 5 months ago