Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91Oct 24, 2022Updated 3 years ago
Alternatives and similar repositories for XDC
Users that are interested in XDC are comparing it to the libraries listed below
- This repo covers the implementation for Labelling unlabelled videos from scratch with multi-modal self-supervision, which learns clusters…☆117Apr 26, 2021Updated 4 years ago
- Official implementation of ACMMM'20 paper 'Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework'☆112Mar 22, 2021Updated 4 years ago
- [NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.☆289Oct 10, 2021Updated 4 years ago
- Audio Visual Instance Discrimination with Cross-Modal Agreement☆130Aug 13, 2021Updated 4 years ago
- MMAct Challenge☆13Jun 20, 2021Updated 4 years ago
- PyTorch implementation of the paper [CVPR2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning☆89Jul 7, 2021Updated 4 years ago
- code for CVPR-2019 paper: Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statisti…☆62Feb 9, 2021Updated 5 years ago
- SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos [CVPR 2022]☆19Jan 27, 2023Updated 3 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"☆115Nov 16, 2020Updated 5 years ago
- Code for Discriminative Sounding Objects Localization (NeurIPS 2020)☆59Jan 19, 2022Updated 4 years ago
- code for our ECCV-2020 paper: Self-supervised Video Representation Learning by Pace Prediction☆100May 13, 2021Updated 4 years ago
- VMZ: Model Zoo for Video Modeling☆1,053Jun 17, 2025Updated 8 months ago
- Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman.☆253Oct 8, 2021Updated 4 years ago
- Repository to contain the code for the CVPR 2020 publication: Multi-Modal Domain Adaptation for Fine-Grained Action Recognition☆67Sep 12, 2020Updated 5 years ago
- Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.☆49Mar 18, 2021Updated 4 years ago
- [Arxiv2020] The code for our paper "Self-Supervised Temporal-Discriminative Representation Learning for Video Action Recognition" https:/…☆76Sep 19, 2020Updated 5 years ago
- Video embeddings for retrieval with natural language queries☆342Feb 15, 2023Updated 3 years ago
- This repository contains the code implementation used in the paper Temporally Coherent Embeddings for Self-Supervised Video Representatio…☆53Mar 16, 2021Updated 4 years ago
- [ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.☆167Apr 29, 2021Updated 4 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- Learning Spatiotemporal Features via Video and Text Pair Discrimination☆60Jan 20, 2021Updated 5 years ago
- ☆41May 7, 2022Updated 3 years ago
- PyTorch 3D video classification models pre-trained on 65 million Instagram videos☆265Dec 7, 2019Updated 6 years ago
- Video Contrastive Learning with Global Context, ICCVW 2021☆162May 30, 2022Updated 3 years ago
- Weakly-supervised action segmentation in video☆16Feb 13, 2022Updated 4 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- ☆31Jun 18, 2021Updated 4 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…☆723Aug 8, 2023Updated 2 years ago
- A curated list of different papers and datasets in various areas of audio-visual processing☆766Jan 30, 2024Updated 2 years ago
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training☆22Mar 19, 2022Updated 3 years ago
- Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval with Partial Query (ICCV2021)☆20Dec 4, 2021Updated 4 years ago
- Temporal Relational Modeling with Self-Supervision for Action Segmentation☆20Feb 7, 2021Updated 5 years ago
- ☆15Mar 20, 2020Updated 5 years ago
- [CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning☆24Jul 5, 2022Updated 3 years ago
- Code and benchmarks for the Semantic Video Retrieval Task☆53Oct 18, 2022Updated 3 years ago
- HACS: Human Action Clips and Segments Dataset☆197Apr 23, 2020Updated 5 years ago
- VGGSound: A Large-scale Audio-Visual Dataset☆351Sep 13, 2021Updated 4 years ago
- Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)☆276Dec 11, 2021Updated 4 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 4 years ago