Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".
☆434 · May 25, 2025 · Updated 9 months ago
Alternatives and similar repositories for MERT
Users who are interested in MERT are comparing it to the libraries listed below.
- ☆251 · Feb 14, 2024 · Updated 2 years ago
- State-of-the-art pretrained music models for training, evaluation, inference ☆164 · Jan 20, 2026 · Updated last month
- MU-LLaMA: Music Understanding Large Language Model ☆304 · Aug 18, 2025 · Updated 6 months ago
- Official repository of the paper "MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization". ☆311 · Aug 4, 2025 · Updated 7 months ago
- music generation with masked transformers! ☆350 · May 16, 2025 · Updated 9 months ago
- LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23] ☆344 · Apr 8, 2024 · Updated last year
- Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, an… ☆375 · May 30, 2024 · Updated last year
- The Open Source Code of UniAudio ☆605 · Jul 22, 2024 · Updated last year
- The open source code for LLM-Codec ☆145 · Aug 18, 2024 · Updated last year
- A simple library for Fréchet Audio Distance (FAD) calculation ☆246 · Aug 22, 2025 · Updated 6 months ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆101 · Jul 24, 2024 · Updated last year
- AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model ☆293 · Oct 12, 2025 · Updated 4 months ago
- State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio. ☆1,718 · Jan 26, 2026 · Updated last month
- All-In-One Music Structure Analyzer ☆722 · May 9, 2024 · Updated last year
- Contrastive Language-Audio Pretraining ☆2,039 · May 15, 2025 · Updated 9 months ago
- Official PyTorch implementation of Contrastive Learning of Musical Representations ☆335 · Jul 25, 2024 · Updated last year
- ☆156 · Nov 22, 2024 · Updated last year
- Robust Singing Voice Transcription and MIDI Extraction ☆112 · Nov 20, 2024 · Updated last year
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆194 · Jul 12, 2024 · Updated last year
- This is the official repository for M2UGen ☆513 · Jan 2, 2025 · Updated last year
- The latent diffusion model for text-to-music generation. ☆185 · Jan 26, 2024 · Updated 2 years ago
- a list of demo websites for automatic music generation research ☆772 · Updated this week
- A lightweight library for Frechet Audio Distance calculation. ☆310 · Feb 11, 2026 · Updated 3 weeks ago
- CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages [ACL 2025] ☆220 · May 11, 2025 · Updated 9 months ago
- MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing [ISMIR 2024] ☆46 · Jan 23, 2025 · Updated last year
- VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer ☆353 · Nov 4, 2024 · Updated last year
- Mustango: Toward Controllable Text-to-Music Generation ☆386 · Jun 2, 2025 · Updated 9 months ago
- ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model ☆213 · Apr 26, 2024 · Updated last year
- An Open-source Streaming High-fidelity Neural Audio Codec ☆498 · Mar 4, 2025 · Updated last year
- PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models ☆1,006 · Dec 15, 2025 · Updated 2 months ago
- The source code for the paper XiaoiceSing2 (interspeech2023) ☆49 · Jan 15, 2024 · Updated 2 years ago
- Results and Models for Learning Audio Representations of Music Content ☆107 · Dec 3, 2024 · Updated last year
- Encode and decode audio samples to/from compressed latent representations! ☆248 · Sep 19, 2025 · Updated 5 months ago
- Unified automatic quality assessment for speech, music, and sound. ☆681 · Jun 5, 2025 · Updated 8 months ago
- A DDSP-based neural voice synthesiser. ☆129 · Nov 14, 2024 · Updated last year
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆38 · Jan 6, 2024 · Updated 2 years ago
- Official Implementation of "Multitrack Music Transformer" (ICASSP 2023) ☆153 · Mar 14, 2024 · Updated last year
- Self-supervised learning for real-time pitch estimation ☆280 · Oct 15, 2025 · Updated 4 months ago
- Audio generation using diffusion models, in PyTorch. ☆2,094 · Jun 12, 2023 · Updated 2 years ago