Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.
☆28Jan 21, 2025Updated last year
Alternatives and similar repositories for MTM
Users that are interested in MTM are comparing it to the libraries listed below.
- Code accompanying ISMIR23 paper; TriAD: Capturing harmonics with 3D convolutions☆19Jul 19, 2024Updated last year
- Controllable Group Choreography using Contrastive Diffusion (SIGGRAPH ASIA 2023)☆18Nov 25, 2025Updated 6 months ago
- official code for CVPR'24 paper Diff-BGM☆71Oct 12, 2024Updated last year
- ☆20Jun 10, 2025Updated 11 months ago
- ISMIR 24 Supplementary Material☆14Oct 28, 2024Updated last year
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.☆13Sep 13, 2024Updated last year
- The official implementation of work "AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward".☆19Mar 25, 2025Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆15Jun 28, 2024Updated last year
- Humos paper repository☆26Sep 6, 2025Updated 8 months ago
- "Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification" ISMIR2025☆36Sep 11, 2025Updated 8 months ago
- ☆20May 7, 2025Updated last year
- This is the official repository of Emotion-Driven Melody Harmonization via Melodic Variation and Functional Representation.☆12Sep 25, 2024Updated last year
- Just a copy of https://github.com/RobynE23/CodeHS-Java-APCSA, but I added folders and some extra files that didn't exist. Another option …☆27Jan 23, 2024Updated 2 years ago
- The ArtificialSongGenerator automatically composes and compiles the Artificial Audio Multitrack dataset (AAM).☆27Nov 17, 2025Updated 6 months ago
- ☆55Jul 16, 2025Updated 10 months ago
- MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.☆44Dec 3, 2024Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…☆60Apr 11, 2024Updated 2 years ago
- My Master's Project, a function/system/program that gives the structure of a given song (The pattern of repetition of verse, chorus, etc.…☆14Jun 21, 2019Updated 6 years ago
- ☆22Aug 5, 2025Updated 9 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- Codebase for ICLR '23 paper "wav2tok: Deep Sequence Tokenizer for Audio Retrieval"☆36Feb 10, 2026Updated 3 months ago
- a notebook containing scripts, documentation, and examples for finetuning musicgen☆100Apr 10, 2024Updated 2 years ago
- [ICME 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation☆24Mar 25, 2025Updated last year
- The official PyTorch implementation of "The 18th European Conference on Computer Vision" (ECCV 2024) paper Length-Aware Motion Synthesis …☆19Dec 15, 2024Updated last year
- Source code for the EMNLP 2025 paper “DM-Codec: Distilling Multimodal Representations for Speech Tokenization”☆57Jun 1, 2025Updated 11 months ago
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?☆23Apr 9, 2026Updated last month
- ☆15Mar 15, 2022Updated 4 years ago
- ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)☆41Mar 10, 2025Updated last year
- This is the accompanying repository to the paper - Automatic Estimation of Singing Voice Musical Dynamics☆15Oct 28, 2024Updated last year
- Music Generation in MIDI format using Deep Learning.☆17Jun 22, 2024Updated last year
- [ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …☆18Aug 17, 2025Updated 9 months ago
- arPLS algorithm from "Baseline correction using asymmetrically reweighted penalized least squares smoothing"☆16Mar 25, 2022Updated 4 years ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation☆21Feb 14, 2025Updated last year
- A library for computing Frechet Music Distance.☆31Feb 4, 2025Updated last year
- CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models [NAACL 2025]☆64Feb 28, 2025Updated last year
- [Official Implementation] Acoustic Autoregressive Modeling 🔥☆75Aug 24, 2024Updated last year
- ☆58Oct 10, 2024Updated last year
- Official implementation of ICCV 2023 Oral Paper "Role-Aware Interaction Generation from Textual Description"☆34Oct 20, 2023Updated 2 years ago
- TheGlueNote is a representation model for note-wise music alignment.☆13Jul 19, 2024Updated last year