qiuqiangkong/music_llm

qiuqiangkong / music_llm

☆56

Alternatives and similar repositories for music_llm

Users that are interested in music_llm are comparing it to the libraries listed below.

qiuqiangkong / mini_llm
☆29 · Jul 4, 2025 · Updated last year
qiuqiangkong / audioflow
☆128 · Updated this week
AudioFans / audidata
☆21 · Apr 24, 2025 · Updated last year
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
york135 / MIRMLPop
The MIR-MLPop dataset and the official implementation of the paper "MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics …
☆35 · Apr 22, 2024 · Updated 2 years ago
Tayjsl97 / RL-Chord
This is the official implementation of RL-Chord (TNNLS).
☆13 · Jan 2, 2024 · Updated 2 years ago
kaistmm / AlignDiT
[ACM MM 2025] AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation
☆24 · Oct 28, 2025 · Updated 8 months ago
gwh22 / LAFMA
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆44 · Jun 13, 2024 · Updated 2 years ago
kyutai-labs / nanoGPTaudio
Code for the blog "Neural audio codecs: how to get audio into LLMs"
☆173 · Oct 20, 2025 · Updated 9 months ago
Sreyan88 / ReCLAP
☆33 · Dec 23, 2025 · Updated 6 months ago
xiquan-li / FineLAP
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36 · Apr 20, 2026 · Updated 3 months ago
eloimoliner / unconditional-diff-STFT
Unconditional music synthesis using a diffusion model in the STFT domain
☆12 · May 31, 2022 · Updated 4 years ago
salgado / music-search
Code from blog 'Searching by Music: Leveraging Vector Search for Music Information Retrieval'
☆16 · Nov 16, 2023 · Updated 2 years ago
qiuqiangkong / music_source_separation
☆60 · Jun 15, 2026 · Updated last month
Sreyan88 / CompA
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23 · Jul 10, 2024 · Updated 2 years ago
lonzi / mrflow_dpo
☆22 · Jan 3, 2026 · Updated 6 months ago
zengchang233 / xiaoicesing2
The source code for the paper XiaoiceSing2 (INTERSPEECH 2023)
☆49 · Jan 15, 2024 · Updated 2 years ago
ZhikangNiu / A-DMA
[INTERSPEECH 2025 Oral] Official code for "Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment"
☆67 · Jun 16, 2025 · Updated last year
Eps-Acoustic-Revolution-Lab / EAR_HEAR
☆15 · Jan 9, 2026 · Updated 6 months ago
ilya16 / PianoCoRe
PianoCoRe: Combined and Refined Piano MIDI Dataset (TISMIR)
☆20 · May 8, 2026 · Updated 2 months ago
qiuqiangkong / materials_for_students
☆16 · Aug 10, 2025 · Updated 11 months ago
lmxue / Audio-FLAN
Audio-FLAN
☆161 · Sep 23, 2025 · Updated 9 months ago
FreedomIntelligence / FusionAudio
Towards Fine-grained Audio Captioning with Multimodal Contextual Cues
☆87 · Jan 4, 2026 · Updated 6 months ago
Andong-Li-speech / BridgeVoC
This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".
☆67 · Nov 5, 2025 · Updated 8 months ago
fundwotsai2001 / AP-adapter
Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024]
☆57 · Nov 10, 2025 · Updated 8 months ago
microsoft / fadtk
A simple library for Fréchet Audio Distance (FAD) calculation
☆266 · Aug 22, 2025 · Updated 10 months ago
rishikksh20 / MiniMax-TTS-pytorch
An attempt to replicate the architecture of MiniMaxTTS described in its technical report
☆47 · Sep 2, 2025 · Updated 10 months ago
Labbeti / aac-metrics
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
☆75 · Mar 22, 2026 · Updated 3 months ago
habla-liaa / encodecmae
Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning'
☆101 · Jul 24, 2024 · Updated last year
SWivid / AUV
An All-in-One Speech, Sound, Music Codec with Single Nested Codebook
☆28 · Oct 11, 2025 · Updated 9 months ago
mulab-mir / muchomusic
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
☆46 · Dec 3, 2024 · Updated last year
genisplaja / diffusion-vocal-sep
Code for "A diffusion-inspired training strategy for singing voice extraction in the waveform domain" (ISMIR 2022)
☆17 · Feb 16, 2023 · Updated 3 years ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆120 · Jun 21, 2026 · Updated last month
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 · Aug 5, 2024 · Updated last year
tpt-adasp / salt
SALT: Standardized Audio Event Label Taxonomy
☆15 · Nov 28, 2024 · Updated last year
CarlWangChina / QwenFeat-Vocal-Score
VocalVerse: A powerful vocal evaluation framework powered by the Qwen LLMs
☆48 · May 11, 2026 · Updated 2 months ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
gzhu06 / Cacophony
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49 · Jan 19, 2026 · Updated 6 months ago
zhenye234 / CoMoSpeech
ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
☆214 · Apr 26, 2024 · Updated 2 years ago