Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.
☆13Sep 13, 2024Updated last year
Alternatives and similar repositories for audio_mod_idessai
Users who are interested in audio_mod_idessai are comparing it to the libraries listed below
- The implementation of "Systematic Analysis of Music Representations from BERT"☆27May 23, 2023Updated 2 years ago
- Official Repository for "Music Source Restoration"☆32Jun 1, 2025Updated 9 months ago
- ☆33Dec 23, 2025Updated 2 months ago
- Demucs Lightning: A PyTorch lightning version of Demucs with Hydra and Tensorboard features☆84May 3, 2023Updated 2 years ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- This study converts piano recordings to mel spectrogram and classifies them by SOTA pre-trained neural network backbones in CV. Comparati…☆23Oct 31, 2025Updated 4 months ago
- A spoken version of the textual story cloze benchmark☆20Aug 6, 2023Updated 2 years ago
- Official repository for Aria-MIDI: a MIDI dataset of 1,186,253 transcribed solo-piano recordings.☆78Jun 19, 2025Updated 8 months ago
- My Master's Project, a function/system/program that gives the structure of a given song (The pattern of repetition of verse, chorus, etc.…☆14Jun 21, 2019Updated 6 years ago
- Unconditional music synthesis using a diffusion model in the STFT domain☆12May 31, 2022Updated 3 years ago
- LibriVoc is a new open-source, large-scale dataset for vocoder artifact detection. LibriVoc is derived from the LibriTTS speech corpus, w…☆16Nov 6, 2025Updated 4 months ago
- The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022☆210Jul 14, 2022Updated 3 years ago
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform☆18May 17, 2024Updated last year
- We propose a novel approach for reconstructing human expressiveness in piano performance with a multi-layer bi-directional Transformer. (…☆20May 16, 2024Updated last year
- https://arxiv.org/abs/2111.00195☆16Mar 30, 2022Updated 3 years ago
- Code implementation for the paper titled MusicLIME: Explainable Multimodal Music Understanding☆23Jan 27, 2025Updated last year
- Deep Performer: Score-to-audio music performance synthesis☆44Jun 26, 2023Updated 2 years ago
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif…☆28Jan 21, 2025Updated last year
- Frontend filterbank learning module with HVQT initialization capabilities.☆21Feb 27, 2024Updated 2 years ago
- Singing voice conversion based on FreeVC☆21Dec 16, 2022Updated 3 years ago
- Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024]☆58Nov 10, 2025Updated 3 months ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986☆48Jan 19, 2026Updated last month
- Supervoice diffusion enhance☆28Jul 15, 2024Updated last year
- "Fx-Encoder++: Extracting Instrument-wise Audio Effect Representations from Mixtures"☆49Aug 23, 2025Updated 6 months ago
- Profile your CoreML models directly from Python 🐍☆30Sep 8, 2025Updated 6 months ago
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues☆87Jan 4, 2026Updated 2 months ago
- ☆61Nov 4, 2023Updated 2 years ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis☆33Jul 31, 2024Updated last year
- Implements ML audio separation algorithm on audio from YouTube or Spotify resulting in "stems" for download (e.g. vocals, drums, bass) in…☆37Dec 8, 2025Updated 3 months ago
- A toolset for easy formant extraction and visualization from wav files and TTS models☆33Sep 2, 2022Updated 3 years ago
- [DEPRECATED] [PyTorch 2.0] [638M] [85.33% acc] Full-attention multi-instrumental music transformer for supervised music generation, opti…☆32Nov 23, 2023Updated 2 years ago
- Codebase and project page for EDMSound☆35Nov 20, 2023Updated 2 years ago
- Full models and training code for PESTO☆75Jun 12, 2024Updated last year
- The MIR-MLPop dataset and the official implementation of the paper "MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics …☆33Apr 22, 2024Updated last year
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆64Nov 5, 2025Updated 4 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆38Oct 28, 2024Updated last year
- An implementation of simple diffusion in PyTorch (and JAX)☆34Jan 28, 2023Updated 3 years ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion☆91Jul 23, 2025Updated 7 months ago
- ☆87Jan 29, 2023Updated 3 years ago