brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆32 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for Multimodal-Clustering-Network
- PyTorch implementation of the paper "Distilling Audio-Visual Knowledge by Compositional Contrastive Learning" (CVPR 2021) ☆86 · Updated 3 years ago
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020) ☆90 · Updated 2 years ago
- This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and … ☆34 · Updated last year
- Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight) ☆80 · Updated 3 months ago
- Code for Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization. ☆10 · Updated 3 years ago
- Vision Transformers are Parameter-Efficient Audio-Visual Learners ☆89 · Updated last year
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023) ☆61 · Updated 9 months ago
- Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023]. ☆17 · Updated 2 months ago
- Cross Modal Retrieval with Querybank Normalisation ☆54 · Updated last year
- Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval". CVPR 2022 ☆95 · Updated 2 years ago
- Cross-model active contrastive coding ☆21 · Updated 3 years ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆51 · Updated 5 months ago
- Official implementation for MGN ☆20 · Updated last year
- Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser" ☆15 · Updated last year
- This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning" ☆24 · Updated last year
- cross modal background suppression for audio-visual event localization ☆35 · Updated 2 years ago
- [ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing ☆26 · Updated 2 years ago
- CVPR2022 ☆20 · Updated 2 years ago
- [ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing ☆12 · Updated 2 years ago
- [CVPR2021] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning ☆61 · Updated 2 years ago
- Code for "Unsupervised Hyperbolic Metric Learning" in CVPR 2021. ☆19 · Updated 2 years ago
- Code for EMNLP 2021 paper: Progressively Guide to Attend: An Iterative Alignment Framework for Temporal Sentence Grounding ☆12 · Updated 3 years ago
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies" ☆17 · Updated last year
- Code for ECCV 2022 paper "Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding" ☆29 · Updated last year
- [CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception ☆34 · Updated last year
- Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning. ☆10 · Updated 2 years ago
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021 ☆57 · Updated 2 years ago