SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning
☆50Jul 28, 2025Updated 7 months ago
Alternatives and similar repositories for SonicVerse
Users that are interested in SonicVerse are comparing it to the libraries listed below
- Improving Symbolic Music Generation with Inference-Time Alignment☆20Aug 2, 2025Updated 7 months ago
- ☆25Jun 19, 2025Updated 8 months ago
- JamendoMaxCaps is a large-scale dataset of 362,000 instrumental Creative Commons tracks☆46May 24, 2025Updated 9 months ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆34Sep 9, 2025Updated 5 months ago
- "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025☆30Sep 11, 2025Updated 5 months ago
- ☆32Nov 25, 2023Updated 2 years ago
- Official repository for the paper: Scaling Self-Supervised Representation Learning for Symbolic Piano Performance (ISMIR 2025)☆95Dec 23, 2025Updated 2 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering☆152Aug 25, 2025Updated 6 months ago
- Codes for ICASSP 2024 paper: BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. An online beat tracking syste…☆41Sep 11, 2024Updated last year
- Beat annotations for the beat tracker Beat This!☆13Dec 27, 2025Updated 2 months ago
- Official repository for Aria-MIDI: a MIDI dataset of 1,186,253 transcribed solo-piano recordings.☆76Jun 19, 2025Updated 8 months ago
- This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…☆76Jan 25, 2026Updated last month
- Lyrics and Vocal Melody Generation conditioned on Accompaniment☆29Aug 27, 2022Updated 3 years ago
- Variable Bitrate Residual Vector Quantization for Audio Coding☆51May 1, 2025Updated 10 months ago
- Source code for the paper XiaoiceSing2 (Interspeech 2023)☆49Jan 15, 2024Updated 2 years ago
- Estimating musical surprisal/information content in Audio☆23Jan 19, 2026Updated last month
- ☆15Sep 20, 2023Updated 2 years ago
- Code and demo for paper: Zhao et al., "Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music re-Arrangement," IJCAI 202…☆20May 2, 2024Updated last year
- [NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching☆121Mar 27, 2025Updated 11 months ago
- Elucidated Text-To-Audio (ETTA) is a SOTA text-to-audio model with a holistic understanding of the design space and trained with syntheti…☆102Oct 15, 2025Updated 4 months ago
- ☆19Feb 2, 2023Updated 3 years ago
- Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.…☆54Jan 18, 2024Updated 2 years ago
- A Python algorithm to change the pitch of the voice in real time☆13Dec 13, 2020Updated 5 years ago
- A repo that builds text-to-music datasets from scratch, used in MuseControlLite [ICML 2025]☆27May 20, 2025Updated 9 months ago
- Tool to aid in the creation of mashups☆19Apr 7, 2020Updated 5 years ago
- A lightweight audio codec based on a single quantizer☆69Aug 15, 2025Updated 6 months ago
- FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation☆29Dec 19, 2024Updated last year
- Additional material for the paper ADTOF: A large dataset of non-synthetic music for automatic drum transcription☆68Sep 18, 2025Updated 5 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Where is the "main theme" in an orchestral score?☆12Oct 25, 2025Updated 4 months ago
- Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking☆45Aug 23, 2024Updated last year
- Joint Embedding Predictive Architecture for Musical Stem Compatibility Estimation☆48Aug 6, 2024Updated last year
- LEMAS‑TTS is a multilingual zero‑shot text‑to‑speech system, supporting 10 languages: Chinese English Spanish Russian French German Ital…☆91Jan 14, 2026Updated last month
- ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models☆34Nov 18, 2025Updated 3 months ago
- ☆115Sep 18, 2025Updated 5 months ago
- A pitch detection model trained to be robust in noisy and reverberant environments.☆27Jan 21, 2025Updated last year
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"☆38Oct 28, 2025Updated 4 months ago
- PAM is a no-reference audio quality metric for audio generation tasks☆77Jul 19, 2024Updated last year