Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15Jun 28, 2024Updated last year
Alternatives and similar repositories for MultilingualALT
Users that are interested in MultilingualALT are comparing it to the libraries listed below.
- Official implementation of the paper - GD-Retriever: Controllable generative text-music retrieval with diffusion models (Accepted at ISMI…☆17Sep 25, 2025Updated 5 months ago
- Perceived Music Quality Dataset☆12Jul 1, 2024Updated last year
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986☆49Jan 19, 2026Updated 2 months ago
- Latent Space Sound Design Tool based on the VAE of stable-audio-open☆15Aug 23, 2024Updated last year
- Text-to-Speech Latency Benchmark☆22Updated this week
- Estimating musical surprisal/information content in Audio☆23Jan 19, 2026Updated 2 months ago
- Code for ChordSync, a conformer-based audio-to-chord synchroniser☆13Oct 17, 2025Updated 5 months ago
- Official implementation of WildFX Dataset Generating pipeline.☆15Oct 21, 2025Updated 5 months ago
- A web app for annotating Freesound loops, and the tools to analyse the dataset created.☆20Jul 6, 2023Updated 2 years ago
- Forced alignment decoder for Whisper.☆15Mar 13, 2024Updated 2 years ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024]☆58Nov 10, 2025Updated 4 months ago
- Machine learning tools and framework for automatic music transcription.☆36Jun 17, 2024Updated last year
- The code repository for our paper "Interpreting Song Lyrics with a Music-Informed Pre-trained Language Model".☆24Dec 12, 2022Updated 3 years ago
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.☆44Dec 3, 2024Updated last year
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-…☆24Feb 1, 2026Updated last month
- Implementation of an attack/decay model for piano transcription☆11Feb 1, 2018Updated 8 years ago
- A toolkit for benchmarking on a wide variety of audio deepfake datasets.☆29Oct 9, 2025Updated 5 months ago
- Code implementation for the paper titled MusicLIME: Explainable Multimodal Music Understanding☆24Jan 27, 2025Updated last year
- Official Implementation of Jointist☆37Jul 26, 2023Updated 2 years ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE☆90Apr 2, 2024Updated last year
- SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning☆52Jul 28, 2025Updated 7 months ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.☆13Sep 13, 2024Updated last year
- Code for the paper "Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription"☆40May 5, 2024Updated last year
- The official repo of "WhiStress: Enriching Transcriptions with Sentence Stress Detection" (Interspeech 2025)☆37Jul 24, 2025Updated 8 months ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch☆10Sep 30, 2024Updated last year
- Where is the "main theme" in an orchestral score?☆13Oct 25, 2025Updated 4 months ago
- A Python Library for Fundamental Frequency Estimation in Music Recordings☆56Jan 16, 2026Updated 2 months ago
- Distillation of Self-Supervised Representation-Based Speech Quality Assessment☆44May 15, 2025Updated 10 months ago
- The code for the ISMIR 2019 paper “Supervised symbolic music style translation using synthetic data”.☆28Nov 21, 2022Updated 3 years ago
- Official repository for the paper Singing Voice Graph Modeling for SingFake Detection (Interspeech 2024).☆25Sep 19, 2025Updated 6 months ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows☆130Sep 2, 2025Updated 6 months ago