wbs2788/MTM

Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.

☆28

Alternatives and similar repositories for MTM

Users that are interested in MTM are comparing it to the libraries listed below.

chouliuzuo / GVMGen
☆32 · Nov 10, 2025 · Updated 8 months ago
zxxwxyyy / sonique
Video Background Music Generation Using Unpaired Audio-Visual Data
☆33 · Oct 8, 2024 · Updated last year
Littleor / Personalized-DMER
Source code for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p…
☆14 · Mar 24, 2025 · Updated last year
migperfer / TriAD-ISMIR2023
Code accompanying the ISMIR 2023 paper "TriAD: Capturing harmonics with 3D convolutions"
☆20 · Jul 19, 2024 · Updated 2 years ago
sizhelee / Diff-BGM
Official code for the CVPR'24 paper Diff-BGM
☆71 · Oct 12, 2024 · Updated last year
madhavlab / wav2tok
Codebase for the ICLR'23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"
☆36 · Jun 30, 2026 · Updated 3 weeks ago
keshavbhandari / yinyang
☆20 · May 7, 2025 · Updated last year
Sreyan88 / ReCLAP
☆33 · Dec 23, 2025 · Updated 7 months ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
SLIT-AI / WRPO
[ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion
☆14 · Mar 17, 2025 · Updated last year
ilya16 / ScorePerformer
ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)
☆42 · Mar 10, 2025 · Updated last year
zhuole1025 / SymMV
[ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation
☆78 · Mar 29, 2024 · Updated 2 years ago
aioz-ai / GCD
Controllable Group Choreography using Contrastive Diffusion (SIGGRAPH ASIA 2023)
☆19 · Nov 25, 2025 · Updated 8 months ago
ETH-DISCO / audio-atlas
☆15 · Feb 6, 2026 · Updated 5 months ago
ismir-24-sub / unsupervised_compositional_representations
ISMIR 24 Supplementary Material
☆14 · Oct 28, 2024 · Updated last year
MTG / SingWithExpressions
This is the accompanying repository to the paper - Automatic Estimation of Singing Voice Musical Dynamics
☆16 · Oct 28, 2024 · Updated last year
adefossez / audio_mod_idessai
Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.
☆13 · Sep 13, 2024 · Updated last year
mulab-mir / muchomusic
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
☆46 · Dec 3, 2024 · Updated last year
VincentHancoder / AToM
The official implementation of work "AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward".
☆19 · Mar 25, 2025 · Updated last year
raraz15 / neural-music-fp
"Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025
☆38 · Sep 11, 2025 · Updated 10 months ago
XulongT / MEGADance
Code for NeurIPS 2025 Paper “MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation”
☆17 · May 21, 2026 · Updated 2 months ago
mubtasimahasan / DM-Codec
Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”
☆57 · Jun 1, 2025 · Updated last year
fundwotsai2001 / AP-adapter
Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024]
☆57 · Nov 10, 2025 · Updated 8 months ago
boblsturm / aimusicgenerationchallenge2022
The Ai Music Generation Challenge 2022
☆27 · May 29, 2024 · Updated 2 years ago
WildHoneyPie / BEAST
Code for the ICASSP 2024 paper: BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. An online beat tracking syste…
☆44 · Sep 11, 2024 · Updated last year
ZeyueT / VidMuse
[CVPR 2025] Repository of VidMuse
☆140 · Jun 7, 2025 · Updated last year
NilsDem / control-transfer-diffusion
Repository for the paper "Combining audio control and style transfer using latent diffusion", accepted at ISMIR 2024
☆67 · Feb 19, 2025 · Updated last year
WingSingFung / TISDiSS
Official implementation of TISDiSS, a scalable framework for discriminative source separation.
☆16 · Oct 19, 2025 · Updated 9 months ago
CarstenEpic / humos
Humos paper repository
☆26 · Sep 6, 2025 · Updated 10 months ago
jhuang448 / MultilingualALT
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"
☆15 · Jun 28, 2024 · Updated 2 years ago
ludc506 / InternVL-X
☆16 · Mar 26, 2025 · Updated last year
wonderNo / crossdiff
☆24 · Dec 10, 2024 · Updated last year
chymaera96 / GraFP
Official repository for GraFPrint: an audio identification framework based on graph neural networks.
☆41 · Sep 18, 2025 · Updated 10 months ago
RS2002 / Adversarial-MidiBERT
[ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …
☆19 · Aug 17, 2025 · Updated 11 months ago
ldzhangyx / instruct-MusicGen
The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu…
☆109 · Jan 14, 2026 · Updated 6 months ago
qiuk2 / AAR
[Official Implementation] Acoustic Autoregressive Modeling 🔥
☆74 · Aug 24, 2024 · Updated last year
samsad35 / code-ancogen
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14 · Mar 11, 2025 · Updated last year
ETH-DISCO / blap
Official repo for BLAP: Bootstrapping Language-Audio Pre-training for Music Captioning presented at ICASSP 2025
☆16 · Nov 18, 2024 · Updated last year
nicolaus625 / FM4Music
The official GitHub page for the survey paper "Foundation Models for Music: A Survey".
☆224 · Sep 4, 2024 · Updated last year