Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.
☆28Jan 21, 2025Updated last year
Alternatives and similar repositories for MTM
Users that are interested in MTM are comparing it to the libraries listed below.
- Code accompanying the ISMIR23 paper "TriAD: Capturing harmonics with 3D convolutions"☆19Jul 19, 2024Updated last year
- ☆28Mar 17, 2026Updated 3 months ago
- Controllable Group Choreography using Contrastive Diffusion (SIGGRAPH ASIA 2023)☆19Nov 25, 2025Updated 7 months ago
- Official code for the CVPR'24 paper Diff-BGM☆71Oct 12, 2024Updated last year
- ☆20Jun 10, 2025Updated last year
- ISMIR 24 Supplementary Material☆14Oct 28, 2024Updated last year
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.☆13Sep 13, 2024Updated last year
- The official implementation of work "AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward".☆19Mar 25, 2025Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated 2 years ago
- Humos paper repository☆26Sep 6, 2025Updated 9 months ago
- ☆24Dec 10, 2024Updated last year
- "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025☆38Sep 11, 2025Updated 9 months ago
- ☆20May 7, 2025Updated last year
- Just a copy of https://github.com/RobynE23/CodeHS-Java-APCSA, but I added folders and some extra files that didn't exist. Another option …☆27Jan 23, 2024Updated 2 years ago
- The ArtificialSongGenerator automatically composes and compiles the Artificial Audio Multitrack dataset (AAM).☆27Nov 17, 2025Updated 7 months ago
- ☆55Jul 16, 2025Updated 11 months ago
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.☆46Dec 3, 2024Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆60Apr 11, 2024Updated 2 years ago
- ☆22Aug 5, 2025Updated 11 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- Codebase for the ICLR '23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"☆36Updated this week
- a notebook containing scripts, documentation, and examples for finetuning musicgen☆100Apr 10, 2024Updated 2 years ago
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation☆24Mar 25, 2025Updated last year
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆57Jun 1, 2025Updated last year
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?☆24Apr 9, 2026Updated 2 months ago
- ☆15Mar 15, 2022Updated 4 years ago
- ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)☆41Mar 10, 2025Updated last year
- This is the accompanying repository to the paper - Automatic Estimation of Singing Voice Musical Dynamics☆15Oct 28, 2024Updated last year
- Music Generation in MIDI format using Deep Learning.☆17Jun 22, 2024Updated 2 years ago
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 10 months ago
- A library for computing Frechet Music Distance.☆31Feb 4, 2025Updated last year
- Salesforce AI Research's open diffusion language model☆65Jun 2, 2026Updated last month
- CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models [NAACL 2025]☆65Feb 28, 2025Updated last year
- [Official Implementation] Acoustic Autoregressive Modeling 🔥☆75Aug 24, 2024Updated last year
- ☆58Oct 10, 2024Updated last year
- Official implementation of ICCV 2023 Oral Paper "Role-Aware Interaction Generation from Textual Description"☆34Oct 20, 2023Updated 2 years ago
- TheGlueNote is a representation model for note-wise music alignment.☆14Jul 19, 2024Updated last year
- Official repository for GraFPrint: an audio identification framework based on graph neural networks.☆41Sep 18, 2025Updated 9 months ago
- ☆11Jul 30, 2025Updated 11 months ago