shantistewart / Emo-CLIM
Emo-CLIM: Emotion-Aligned Contrastive Learning Between Images and Music [ICASSP 2024]
☆13Updated last year
Alternatives and similar repositories for Emo-CLIM:
Users who are interested in Emo-CLIM are comparing it to the repositories listed below:
- A dataset for Audio-Visual Sound Event Detection in Movies☆27Updated 2 years ago
- The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025)☆22Updated 3 months ago
- The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".☆41Updated 7 months ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at BMVC 2022)☆52Updated last year
- Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"☆139Updated 5 months ago
- Code for the paper "Learning Audio-Visual Dereverberation"☆27Updated 2 years ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)☆46Updated 2 months ago
- [CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.☆105Updated 10 months ago
- Official source codes for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.☆14Updated 3 months ago
- Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".☆55Updated 2 years ago
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…☆41Updated 3 months ago
- Source code for the paper 'Audio Captioning Transformer'☆54Updated 3 years ago
- SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis☆18Updated 9 months ago
- PyTorch implementation of "V2C: Visual Voice Cloning"☆32Updated 2 years ago
- small audio language model for reasoning☆58Updated last week
- Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models☆15Updated last year
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes☆30Updated last week
- Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence☆18Updated 10 months ago
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos☆21Updated 6 months ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".☆86Updated last year
- PyTorch implementation of the ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"☆39Updated 4 years ago
- Towards Long Form Audio-visual Video Understanding☆13Updated 6 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆19Updated last year
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation☆37Updated this week
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…☆36Updated 8 months ago