Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15 · Jun 28, 2024 · Updated last year
Alternatives and similar repositories for MultilingualALT
Users that are interested in MultilingualALT are comparing it to the libraries listed below
- ☆18 · Sep 22, 2025 · Updated 5 months ago
- ☆18 · May 4, 2025 · Updated 10 months ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆16 · Feb 1, 2026 · Updated last month
- Perceived Music Quality Dataset ☆12 · Jul 1, 2024 · Updated last year
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆48 · Jan 19, 2026 · Updated last month
- ☆17 · Jan 20, 2025 · Updated last year
- Forced alignment decoder for Whisper. ☆14 · Mar 13, 2024 · Updated last year
- Latent Space Sound Design Tool based on the VAE of stable-audio-open ☆15 · Aug 23, 2024 · Updated last year
- ☆18 · Oct 20, 2023 · Updated 2 years ago
- ☆15 · Aug 22, 2025 · Updated 6 months ago
- Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024] ☆58 · Nov 10, 2025 · Updated 3 months ago
- Estimating musical surprisal/information content in Audio ☆23 · Jan 19, 2026 · Updated last month
- Machine learning tools and framework for automatic music transcription. ☆36 · Jun 17, 2024 · Updated last year
- Code implementation for the paper titled MusicLIME: Explainable Multimodal Music Understanding ☆23 · Jan 27, 2025 · Updated last year
- Text-to-Speech Latency Benchmark ☆22 · Jan 16, 2026 · Updated last month
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models. ☆44 · Dec 3, 2024 · Updated last year
- Official Implementation of Jointist ☆37 · Jul 26, 2023 · Updated 2 years ago
- Audio-JEPA is an adaptation of the Joint-Embedding Predictive Architecture (JEPA) for self-supervised audio representation learning. Buil… ☆40 · Jun 17, 2025 · Updated 8 months ago
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching ☆45 · Feb 9, 2025 · Updated last year
- Where is the "main theme" in an orchestral score? ☆12 · Oct 25, 2025 · Updated 4 months ago
- The code repository for our paper "Interpreting Song Lyrics with a Music-Informed Pre-trained Language Model". ☆24 · Dec 12, 2022 · Updated 3 years ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 8 months ago
- Code for ChordSync, a conformer-based audio-to-chord synchroniser ☆13 · Oct 17, 2025 · Updated 4 months ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ☆13 · Sep 13, 2024 · Updated last year
- Data manipulation and transformation for audio signal processing, powered by PyTorch ☆10 · Sep 30, 2024 · Updated last year
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 · Mar 24, 2023 · Updated 2 years ago
- A Python Library for Fundamental Frequency Estimation in Music Recordings ☆55 · Jan 16, 2026 · Updated last month
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 3 months ago
- (We are still working on code refactoring and amending the necessary training and inferencing cli) An electric guitar transcription model… ☆13 · Jan 11, 2023 · Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated 10 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 · Mar 11, 2025 · Updated 11 months ago
- A neural speech codec based on discrete WavLM representations ☆24 · Aug 28, 2024 · Updated last year
- ☆33 · Dec 23, 2025 · Updated 2 months ago
- SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech ☆11 · Jun 30, 2023 · Updated 2 years ago
- Implementation of an attack/decay model for piano transcription ☆11 · Feb 1, 2018 · Updated 8 years ago
- ☆15 · Nov 10, 2025 · Updated 3 months ago
- A toolkit for benchmarking on a wide variety of audio deepfake datasets. ☆29 · Oct 9, 2025 · Updated 4 months ago
- Official implementation of WildFX Dataset Generating pipeline. ☆15 · Oct 21, 2025 · Updated 4 months ago
- ☆14 · Aug 16, 2023 · Updated 2 years ago