facebookresearch / AVID-CMALinks

Audio Visual Instance Discrimination with Cross-Modal Agreement

☆130

Alternatives and similar repositories for AVID-CMA

Users that are interested in AVID-CMA are comparing it to the libraries listed below

Sorting:

HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91Updated 3 years ago
facebookresearch / Listen-to-Look
Listen to Look: Action Recognition by Previewing Audio (CVPR 2020)
☆130Updated 4 years ago
facebookresearch / selavi
This repo covers the implementation for Labelling unlabelled videos from scratch with multi-modal self-supervision, which learns clusters…
☆117Updated 4 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆196Updated 4 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆113Updated 5 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆82Updated 3 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆59Updated 3 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90Updated last year
yunyikristy / CM-ACC
Cross-model active contrastive coding
☆22Updated 4 years ago
Yu-Wu / Modaily-Aware-Audio-Visual-Video-Parsing
Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
☆24Updated 3 years ago
BestJuly / IIC
Official implementation of ACMMM'20 paper 'Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework'
☆111Updated 4 years ago
jasongief / PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
☆41Updated 3 years ago
rohitrango / objects-that-sound
Unofficial Implementation of Google Deepmind's paper `Objects that Sound`
☆83Updated 7 years ago
antoine77340 / MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
☆219Updated 3 years ago
roudimit / AVLnet
Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.
☆53Updated 3 years ago
dharwath / DAVEnet-pytorch
Deep Audio-Visual Embedding network (DAVEnet) implementation in PyTorch
☆65Updated 7 years ago
ekazakos / temporal-binding-network
Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
☆112Updated 4 years ago
rhgao / co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆97Updated 2 years ago
airsplay / vimpac
☆73Updated 3 years ago
TengdaHan / CoCLR
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆288Updated 4 years ago
DmZhukov / CrossTask
☆93Updated 3 years ago
yanbeic / CCL
PyTorch Implementation on Paper [CVPR2021]Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
☆89Updated 4 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆162Updated 5 years ago
facebookresearch / EgoCom-Dataset
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
☆58Updated 5 years ago
facebookresearch / FAIR-Play
2.5D visual sound dataset
☆102Updated 4 years ago
cvlab-columbia / expert
Code for Learning to Learn Language from Narrated Video
☆33Updated 2 years ago
oncescuandreea / QuerYD_downloader
☆22Updated 2 years ago
rhgao / Deep-MIML-Network
Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018)
☆51Updated 6 years ago
antoine77340 / S3D_HowTo100M
S3D Text-Video model trained on HowTo100M using MIL-NCE
☆200Updated 5 years ago
xudejing / video-clip-order-prediction
Self-supervised Spatiotemporal Learning via Video Clip Order Prediction
☆106Updated 2 years ago