wbs2788 / VMB

Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.

☆28

Alternatives and similar repositories for VMB

Users who are interested in VMB are comparing it to the repositories listed below.

chouliuzuo / GVMGen
☆25 Updated 5 months ago
ldzhangyx / MusicMagus
The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".
☆44 Updated 11 months ago
sizhelee / Diff-BGM
official code for CVPR'24 paper Diff-BGM
☆68 Updated 10 months ago
keshavbhandari / yinyang
☆17 Updated 3 months ago
XZWY / MSLDM
Implementation of Multi-Source Music Generation with Latent Diffusion.
☆26 Updated 11 months ago
SonyResearch / diffvox
Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"
☆33 Updated 2 weeks ago
Yuer867 / EMO-Disentanger
This is the official repository of ISMIR 2024 paper "Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional R…
☆58 Updated 11 months ago
sony / diffusion-timbre-transfer
☆51 Updated 10 months ago
FreedomIntelligence / FusionAudio
Towards Fine-grained Audio Captioning with Multimodal Contextual Cues
☆80 Updated 2 months ago
OpenGVLab / LORIS
[ICML2023] Long-Term Rhythmic Video Soundtracker
☆61 Updated last month
seungheondoh / musical-word-embedding
Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP]
☆25 Updated last year
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆23 Updated 11 months ago
YatingMusic / MusiConGen
☆80 Updated 10 months ago
qiuqiangkong / audio_understanding
☆116 Updated 6 months ago
zihaod / MusiLingo
☆47 Updated last year
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025)
☆28 Updated 8 months ago
Kikyo-16 / airgen
Official source codes of airsep
☆37 Updated last year
Hannieliao / Baton
Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"
☆29 Updated 6 months ago
Sreyan88 / ReCLAP
☆30 Updated 5 months ago
guxm2021 / SVT_SpeechBrain
[TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing
☆25 Updated last year
ldzhangyx / instruct-MusicGen
The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu…
☆94 Updated last year
xmusic-project / XMIDI_Dataset
XMIDI Dataset: A large-scale symbolic music dataset with emotion and genre labels.
☆29 Updated 7 months ago
snap-research / GenAU
☆40 Updated 4 months ago
ryota-komatsu / speaker_disentangled_hubert
Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"
☆42 Updated last week
zeyuxie29 / PicoAudio
☆42 Updated 7 months ago
NKU-HLT / AudioEditor
☆37 Updated 5 months ago
Pliploop / SLAP
Official repository for the paper - SLAP: Siamese Language-Audio Pretraining without negative samples for Music Understanding
☆31 Updated last week
zcli-charlie / ZIQI-Eval
ZIQI-Eval: A Music Evaluation Benchmark for Large Language Models
☆14 Updated last year
soham97 / mellow
small audio language model for reasoning
☆74 Updated 4 months ago
shivammehta25 / Diff-TTSG
Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis
☆39 Updated last year