wbs2788 / VMB
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging different representations and enhancing generation with RAG.
☆23Updated last month
Alternatives and similar repositories for VMB:
Users who are interested in VMB are comparing it to the libraries listed below.
- The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".☆37Updated 5 months ago
- ☆39Updated 3 months ago
- Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23]☆17Updated last year
- Implementation of Multi-Source Music Generation with Latent Diffusion.☆22Updated 5 months ago
- This repository contains the dataset used to train the neural network model described in the paper "Implicit HRTF Modeling Using Tempora…☆11Updated last year
- Musical Word Embedding for Music Tagging and Retrieval [IEEE TASLP]☆23Updated 10 months ago
- Long-Term Rhythmic Video Soundtracker, ICML2023☆56Updated 7 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆18Updated last week
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆76Updated 2 months ago
- Official source code of airsep☆36Updated 10 months ago
- PyTorch Implementation of [AudioLCM]: an efficient and high-quality text-to-audio generation with latent consistency model.☆10Updated 8 months ago
- ScorePerformer: Expressive Piano Performance Rendering with Fine-Grained Control (ISMIR 2023)☆36Updated last year
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens.☆12Updated 5 months ago
- SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis☆13Updated 7 months ago
- Official code for the CVPR'24 paper Diff-BGM☆55Updated 4 months ago
- Unconditional music synthesis using a diffusion model in the STFT domain☆12Updated 2 years ago
- This is the official repository of the ISMIR 2024 paper "Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional R…☆53Updated 5 months ago
- Codebase and project page for EDMSound☆34Updated last year
- Code for paper "Network Bending of Diffusion Models for Audio-Visual Generation" at DAFx 2024☆13Updated 8 months ago
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model"☆11Updated 7 months ago
- The official implementation of our paper "Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tu…☆79Updated 5 months ago
- Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis☆39Updated last year
- ☆35Updated 10 months ago
- Official Repository for The Paper, PianoBART: Symbolic Piano Music Understanding and Generating with Large-Scale Pre-Training☆16Updated 4 months ago
- Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval (TTMR++) [ICASSP24]☆33Updated 4 months ago
- Blind Identification of Binaural Room Impulse Responses from Head-Worn Microphone Arrays☆15Updated 5 months ago