JustinYuu / MM_Pyramid

[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing

☆13

Alternatives and similar repositories for MM_Pyramid

Users who are interested in MM_Pyramid are comparing it to the repositories listed below.


GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆100 Updated last year
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆88 Updated 11 months ago
jasongief / PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
☆42 Updated 3 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆24 Updated last year
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 Updated 4 years ago
ExplainableML / AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆37 Updated 2 years ago
MCG-NJU / JoMoLD
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
☆27 Updated 3 years ago
ttgeng233 / UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
☆66 Updated last year
yujiangpu20 / cma_xdVioDet
Official code for "Audio-Guided Attention Network for Weakly Supervised Violence Detection" (ICCECE2022).
☆14 Updated 3 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆80 Updated 3 years ago
GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆86 Updated 2 years ago
jasongief / CPSP
[2023 TPAMI] Contrastive Positive Sample Propagation along the Audio-Visual Event Line
☆29 Updated 2 years ago
marmot-xy / CMBS
cross modal background suppression for audio-visual event localization
☆36 Updated 3 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆35 Updated 2 years ago
fyyCS / LSLD
☆14 Updated last year
JacobChalk / TIM
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
☆41 Updated 8 months ago
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆18 Updated this week
jasongief / OV-AVEL
[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization
☆22 Updated 4 months ago
schowdhury671 / meerkat
☆31 Updated last week
xiaobai1217 / DomainAdaptation
CVPR2022
☆21 Updated 2 years ago
RenHuan1999 / CVPR2023_P-MIL
The official implementation of 'Proposal-based Multiple Instance Learning for Weakly-supervised Temporal Action Localization' (CVPR 2023)
☆41 Updated 2 years ago
sauradip / TAGS
[ECCV 2022] Official Pytorch Implementation of paper : " Proposal-Free Temporal Action Detection with Global Segmentation Mask Learning "…
☆17 Updated 2 years ago
OpenNLPLab / FNAC_AVL
[CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…
☆25 Updated 2 years ago
klauscc / TALLFormer
☆52 Updated 2 years ago
brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆33 Updated 3 years ago
GeWu-Lab / MWAFM
Multi-Scale Attention for Audio Question Answering
☆28 Updated last year
zjsong / SSPL
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32 Updated last year
stoneMo / DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
☆32 Updated 11 months ago
JustinYuu / MACIL_SD
[ACM MM 2022] Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection
☆37 Updated 3 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆90 Updated 2 years ago