Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
☆193 · Jul 30, 2024 · Updated last year
Alternatives and similar repositories for Video2Music
Users who are interested in Video2Music are comparing it to the repositories listed below.
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation ☆78 · Mar 29, 2024 · Updated last year
- ☆13 · Aug 21, 2022 · Updated 3 years ago
- ☆29 · Nov 10, 2025 · Updated 4 months ago
- Mustango: Toward Controllable Text-to-Music Generation ☆387 · Jun 2, 2025 · Updated 9 months ago
- Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23] ☆17 · Aug 16, 2023 · Updated 2 years ago
- [ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer ☆323 · Jun 8, 2025 · Updated 9 months ago
- Predicting emotion from music videos: exploring the relative contribution of visual and auditory information on affective responses ☆22 · Oct 3, 2023 · Updated 2 years ago
- ☆39 · Apr 15, 2024 · Updated last year
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning" ☆15 · Jun 22, 2023 · Updated 2 years ago
- JamendoMaxCaps is a large-scale dataset of 362,000 instrumental creative commons tracks ☆47 · May 24, 2025 · Updated 9 months ago
- LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23] ☆346 · Apr 8, 2024 · Updated last year
- music generation with masked transformers! ☆351 · May 16, 2025 · Updated 10 months ago
- ☆25 · Apr 18, 2025 · Updated 11 months ago
- This is the official implementation of MusER (AAAI'24). ☆30 · Jun 4, 2025 · Updated 9 months ago
- ☆32 · Nov 25, 2023 · Updated 2 years ago
- Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training". ☆440 · May 25, 2025 · Updated 9 months ago
- The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings. ☆168 · Dec 22, 2023 · Updated 2 years ago
- SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning ☆50 · Jul 28, 2025 · Updated 7 months ago
- A large-scale dataset of caption-annotated MIDI files. ☆79 · Jul 23, 2024 · Updated last year
- [JCMS 2024] This is the official repository for the paper, MidiBERT-Piano: Large-scale Pre-training for Symbolic Music Understanding. ☆203 · Apr 10, 2024 · Updated last year
- This is the repo accompanying the paper: "A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bit… ☆12 · Jul 29, 2025 · Updated 7 months ago
- official code for CVPR'24 paper Diff-BGM ☆71 · Oct 12, 2024 · Updated last year
- VoiceLDM: Text-to-Speech with Environmental Context ☆192 · Aug 9, 2024 · Updated last year
- ☆38 · Mar 10, 2023 · Updated 3 years ago
- Chorale Music Separation Dataset and Model Framework ☆40 · Dec 5, 2022 · Updated 3 years ago
- Official Implementation of "Multitrack Music Transformer" (ICASSP 2023) ☆154 · Mar 14, 2024 · Updated 2 years ago
- MU-LLaMA: Music Understanding Large Language Model ☆305 · Aug 18, 2025 · Updated 7 months ago
- Video Background Music Generation Using Unpaired Audio-Visual Data ☆30 · Oct 8, 2024 · Updated last year
- Symphony Generation with Permutation Invariant Language Model ☆257 · Oct 7, 2022 · Updated 3 years ago
- Making an AI-generated music video from any song with Wav2CLIP and VQGAN-CLIP ☆244 · Jun 10, 2022 · Updated 3 years ago
- ☆86 · Oct 20, 2024 · Updated last year
- Code and demo for paper: Zhao et al., Structured Multi-Track Accompaniment Arrangement via Style Prior Modelling, in NeurIPS 2024. ☆40 · Jan 17, 2026 · Updated 2 months ago
- Diffusion-based singing voice pitch correction ☆137 · Sep 20, 2024 · Updated last year
- Improving Symbolic Music Generation with Inference-Time Alignment ☆20 · Aug 2, 2025 · Updated 7 months ago
- Code and Dataset for <Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases, ISMIR 2024> ☆14 · Nov 12, 2024 · Updated last year
- ☆14 · Jun 16, 2023 · Updated 2 years ago
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon… ☆18 · Jan 18, 2025 · Updated last year
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆51 · Jun 11, 2024 · Updated last year
- ☆58 · Nov 2, 2020 · Updated 5 years ago