Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
☆192 · Jul 30, 2024 · Updated last year
Alternatives and similar repositories for Video2Music
Users who are interested in Video2Music are comparing it to the libraries listed below.
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation ☆78 · Mar 29, 2024 · Updated last year
- Mustango: Toward Controllable Text-to-Music Generation ☆386 · Jun 2, 2025 · Updated 8 months ago
- ☆13 · Aug 21, 2022 · Updated 3 years ago
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code … ☆12 · Aug 25, 2023 · Updated 2 years ago
- ☆29 · Nov 10, 2025 · Updated 3 months ago
- ☆38 · Apr 15, 2024 · Updated last year
- This is the official implementation of MusER (AAAI'24). ☆30 · Jun 4, 2025 · Updated 8 months ago
- [ACM MM 2021 Best Paper Award] Video Background Music Generation with Controllable Music Transformer ☆322 · Jun 8, 2025 · Updated 8 months ago
- Textless Speech-to-Music Retrieval Using Emotion Similarity [ICASSP23] ☆17 · Aug 16, 2023 · Updated 2 years ago
- music generation with masked transformers! ☆350 · May 16, 2025 · Updated 9 months ago
- The Song Describer dataset is an evaluation dataset made of ~1.1k captions for 706 permissively licensed music recordings. ☆167 · Dec 22, 2023 · Updated 2 years ago
- LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23] ☆344 · Apr 8, 2024 · Updated last year
- JamendoMaxCaps is a large-scale dataset of 362,000 instrumental creative commons tracks ☆46 · May 24, 2025 · Updated 9 months ago
- ☆32 · Nov 25, 2023 · Updated 2 years ago
- Predicting emotion from music videos: exploring the relative contribution of visual and auditory information on affective responses ☆22 · Oct 3, 2023 · Updated 2 years ago
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon… ☆18 · Jan 18, 2025 · Updated last year
- ☆25 · Apr 18, 2025 · Updated 10 months ago
- Codes and MIDI demos of ISMIR 2022 paper: Domain Adversarial Training on Conditional Variational Auto-Encoder for Controllable Music Gene… ☆21 · Mar 28, 2023 · Updated 2 years ago
- code for "BEAT-ALIGNED SPECTROGRAM-TO-SEQUENCE GENERATION OF RHYTHM-GAME CHARTS" (ISMIR 2023 LBD) ☆18 · Jan 29, 2024 · Updated 2 years ago
- Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training". ☆434 · May 25, 2025 · Updated 9 months ago
- Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls ☆87 · Jul 16, 2024 · Updated last year
- Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models ☆43 · Mar 3, 2025 · Updated 11 months ago
- An automatic prosodic boundary annotation tool for Text-to-Speech Synthesis (TTS). ☆51 · Jun 11, 2024 · Updated last year
- VoiceLDM: Text-to-Speech with Environmental Context ☆191 · Aug 9, 2024 · Updated last year
- official code for CVPR'24 paper Diff-BGM ☆71 · Oct 12, 2024 · Updated last year
- ☆38 · Mar 10, 2023 · Updated 2 years ago
- MU-LLaMA: Music Understanding Large Language Model ☆303 · Aug 18, 2025 · Updated 6 months ago
- ☆29 · Jun 8, 2023 · Updated 2 years ago
- This is the official repository of ISMIR 2024 paper "Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional R… ☆60 · Sep 17, 2024 · Updated last year
- Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generati… ☆192 · Mar 25, 2024 · Updated last year
- This repo contains the code to reproduce the paper: "Enriched Music Representations with Multiple Cross-modal Contrastive Learning" ☆15 · Jun 22, 2023 · Updated 2 years ago
- Video Background Music Generation Using Unpaired Audio-Visual Data ☆30 · Oct 8, 2024 · Updated last year
- Diffusion-based singing voice pitch correction ☆137 · Sep 20, 2024 · Updated last year
- Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline. ☆17 · May 9, 2025 · Updated 9 months ago
- This is the official repository of Emotion-Driven Melody Harmonization via Melodic Variation and Functional Representation. ☆12 · Sep 25, 2024 · Updated last year
- Improving Symbolic Music Generation with Inference-Time Alignment ☆20 · Aug 2, 2025 · Updated 7 months ago
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge ☆21 · Jul 25, 2022 · Updated 3 years ago
- Official Implementation of "Multitrack Music Transformer" (ICASSP 2023) ☆153 · Mar 14, 2024 · Updated last year
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation ☆36 · Jan 17, 2024 · Updated 2 years ago