shantistewart / Emo-CLIM

Emo-CLIM: Emotion-Aligned Contrastive Learning Between Images and Music [ICASSP 2024]

☆13

Alternatives and similar repositories for Emo-CLIM:

Users who are interested in Emo-CLIM are comparing it to the repositories listed below.

usc-sail / mica-subtitle-aligned-movie-sounds
A dataset for Audio-Visual Sound Event Detection in Movies
☆27 Updated 2 years ago
Sosdatasets / SoS_Dataset
☆11 Updated 9 months ago
ilpoviertola / V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025)
☆22 Updated 3 months ago
ldzhangyx / MusicMagus
The official implementation of the IJCAI 2024 paper "MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models".
☆41 Updated 7 months ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
☆52 Updated last year
kaistmm / Audio-Mamba-AuM
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
☆139 Updated 5 months ago
facebookresearch / learning-audio-visual-dereverberation
Code for paper Learning Audio-Visual Dereverberation
☆27 Updated 2 years ago
zszheng147 / Spatial-AST
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆46 Updated 2 months ago
GalaxyCong / HPMDubbing
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
☆105 Updated 10 months ago
GalaxyCong / DubFlow
Official source code for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
☆14 Updated 3 months ago
YuanGongND / uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
☆55 Updated 2 years ago
kyegomez / AudioFlamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…
☆41 Updated 3 months ago
XinhaoMei / ACT
Source code for the paper 'Audio Captioning Transformer'
☆54 Updated 3 years ago
mcomunita / syncfusion
SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis
☆18 Updated 9 months ago
rxtan2 / AVSeT
☆16 Updated last year
DanielMengLiu / AudioVisualLip
☆22 Updated last year
chenqi008 / V2C
PyTorch implementation for “V2C: Visual Voice Cloning”
☆32 Updated 2 years ago
soham97 / mellow
small audio language model for reasoning
☆58 Updated last week
dkurzend / ClipClap-GZSL
Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
☆15 Updated last year
ta012 / SSLAM
[ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
☆30 Updated last week
showlab / AVA-AVD
☆16 Updated 2 years ago
Ego4DSounds / Ego4DSounds
Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence
☆18 Updated 10 months ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆21 Updated 6 months ago
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆86 Updated last year
kaistmm / SSLalignment
☆32 Updated last month
chuangg / Foley-Music
PyTorch implementation of the ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"
☆39 Updated 4 years ago
GeWu-Lab / LFAV
Towards Long Form Audio-visual Video Understanding
☆13 Updated 6 months ago
hmartelb / avlit
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…
☆19 Updated last year
PeiwenSun2000 / Both-Ears-Wide-Open
The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
☆37 Updated this week
umbertocappellazzo / PETL_AST
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi…
☆36 Updated 8 months ago