Littleor / Personalized-DMER
Source code for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER), published at AAAI-25
☆13 Updated 9 months ago
Alternatives and similar repositories for Personalized-DMER
Users who are interested in Personalized-DMER are comparing it to the repositories listed below
- [ICCV 2023] Video Background Music Generation: Dataset, Method and Evaluation ☆77 Updated last year
- ☆127 Updated 3 months ago
- official implementation of MGA-CLAP (ACM MM 2024) ☆25 Updated last year
- Towards Fine-grained Audio Captioning with Multimodal Contextual Cues ☆85 Updated 3 months ago
- official code for CVPR'24 paper Diff-BGM ☆72 Updated last year
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. ☆32 Updated 6 months ago
- Art2Mus is a system that generates music based on digitized artworks and text by using the AudioLDM2 architecture with an added projectio… ☆19 Updated 2 months ago
- Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback" ☆32 Updated 9 months ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆69 Updated 10 months ago
- ☆50 Updated last year
- A dataset for Audio-Visual Sound Event Detection in Movies ☆26 Updated 2 years ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025) (Oral) ☆31 Updated last year
- ☆42 Updated 2 years ago
- small audio language model for reasoning ☆83 Updated 3 weeks ago
- This is the official implementation of MusER (AAAI'24). ☆30 Updated 6 months ago
- Source code for the paper 'Audio Captioning Transformer' ☆57 Updated 3 years ago
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation ☆56 Updated 5 months ago
- [ISMIR 2025] A curated list of vision-to-music generation: methods, datasets, evaluation and challenges. ☆113 Updated 4 months ago
- A curated list of Video to Audio Generation ☆89 Updated last month
- Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models ☆42 Updated 9 months ago
- Towards Long Form Audio-visual Video Understanding ☆14 Updated 8 months ago
- code for A Large-scale Dataset for Audio-Language Representation Learning ☆14 Updated last year
- ☆24 Updated 3 months ago
- The dataset and baseline code for Text-to-Audio Grounding (TAG) ☆49 Updated 2 months ago
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction. ☆131 Updated 3 months ago
- ☆41 Updated 8 months ago
- DeepDubber-V1: Towards High Quality and Dialogue, Narration, Monologue Adaptive Movie Dubbing Via Multi-Modal Chain-of-Thoughts Reasoning… ☆27 Updated 3 months ago
- [NeurIPS 2025] Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix ☆185 Updated 2 weeks ago
- The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models". ☆47 Updated last year
- This repository contains metadata of the WavCaps dataset and code for downstream tasks. ☆252 Updated last year