ruiwang2021/mvd

[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv.org/abs/2212.04500)

☆135

Alternatives and similar repositories for mvd

Users who are interested in mvd are comparing it to the repositories listed below.

XinyuSun / MME
official implementation of CVPR 23 paper "M3Video: Masked Motion Modeling for Self-Supervised Video Representation Learning"
☆52 · Dec 8, 2023 · Updated 2 years ago
whwu95 / ATM
【ICCV'2023】What Can Simple Arithmetic Operations Do for Temporal Modeling?
☆74 · Jan 26, 2024 · Updated 2 years ago
MCG-NJU / AMD
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
☆18 · Jan 11, 2026 · Updated 6 months ago
OpenGVLab / unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
☆348 · May 27, 2024 · Updated 2 years ago
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆803 · Oct 8, 2024 · Updated last year
daniel-code / TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
☆95 · Jul 15, 2026 · Updated last week
xyzforever / BEVT
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
☆161 · Jul 19, 2022 · Updated 4 years ago
MCG-NJU / VideoMAE-Action-Detection
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
☆70 · Feb 3, 2023 · Updated 3 years ago
XinyuSun / awesome-self-supervised-representation-learning
awesome video representation learning
☆15 · Mar 22, 2021 · Updated 5 years ago
whwu95 / BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
☆156 · Sep 9, 2024 · Updated last year
DAVEISHAN / TimeBalance
Placeholder
☆10 · Jul 17, 2023 · Updated 3 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,775 · Dec 8, 2023 · Updated 2 years ago
wengzejia1 / Semiformer
☆36 · Nov 4, 2022 · Updated 3 years ago
amirbar / StoP
☆12 · Jun 26, 2024 · Updated 2 years ago
alibaba-mmai-research / DiST
ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
☆41 · Sep 25, 2023 · Updated 2 years ago
alibaba-mmai-research / Masked-Action-Recognition
Official code for the paper: MAR: Masked Autoencoders for Efficient Action Recognition
☆32 · Dec 7, 2022 · Updated 3 years ago
OpenGVLab / InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
☆2,342 · Jul 2, 2026 · Updated 3 weeks ago
wgcban / adamae
[CVPR'23] AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders
☆84 · Feb 2, 2024 · Updated 2 years ago
wengzejia1 / Open-VCLIP
☆119 · Feb 19, 2024 · Updated 2 years ago
leexinhao / ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
☆20 · Jul 29, 2024 · Updated last year
yangle15 / DyFADet-pytorch
☆32 · Jul 4, 2024 · Updated 2 years ago
dominickrei / pi-vit
[CVPR 2024] Code and models for pi-ViT, a video transformer for understanding activities of daily living
☆31 · Nov 12, 2025 · Updated 8 months ago
fmthoker / skeleton-contrast
☆44 · Aug 31, 2022 · Updated 3 years ago
shvdiwnkozbw / Self-supervised-Video-Concept
Code for Static and Dynamic Concepts for Self-supervised Video Representation Learning.
☆11 · Jul 28, 2022 · Updated 3 years ago
facebookresearch / mae_st
Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners"
☆371 · Jan 12, 2026 · Updated 6 months ago
KHU-VLL / DEVIAS
[ECCV 2024 Oral] Official implementation of the paper "DEVIAS: Learning Disentangled Video Representations of Action and Scene"
☆29 · Nov 15, 2025 · Updated 8 months ago
UCSC-VLAA / DMAE
[CVPR 2023] This repository includes the official implementation of our paper "Masked Autoencoders Enable Efficient Knowledge Distillers"
☆109 · Jul 24, 2023 · Updated 3 years ago
naver-ai / tc-clip
[ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition"
☆102 · Feb 25, 2025 · Updated last year
Lliar-liar / Daily-Omni
This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆42 · Apr 28, 2026 · Updated 2 months ago
taoyang1122 / adapt-image-models
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
☆299 · Sep 17, 2023 · Updated 2 years ago
alibaba-mmai-research / TAdaConv
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, vi…
☆246 · Aug 23, 2023 · Updated 2 years ago
Hypnosx / Kinetics-TPS
ICCV DeeperAction Challenge - Kinetics-TPS Challenge on Part-level Action Parsing and Action Recognition.
☆14 · Jun 4, 2021 · Updated 5 years ago
OpenGVLab / UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
☆351 · Apr 2, 2024 · Updated 2 years ago
microsoft / AVGen-Bench
[ICML26] AVGen-Bench is a task-driven benchmark for multi-granular evaluation of Text-to-Audio-Video (T2AV) generation.
☆22 · Jul 2, 2026 · Updated 3 weeks ago
sming256 / AdaTAD
[CVPR2024] The official implementation of AdaTAD: End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
☆42 · Jul 9, 2024 · Updated 2 years ago
pulkitkumar95 / trokens
[ICCV 2025] Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition
☆25 · Sep 26, 2025 · Updated 9 months ago
whwu95 / Text4Vis
【AAAI'2023 & IJCV】Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
☆199 · May 30, 2024 · Updated 2 years ago
AndongDeng / BEAR
BEAR: a new BEnchmark on video Action Recognition
☆46 · Apr 21, 2024 · Updated 2 years ago
mondalanindya / MSQNet
Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]
☆24 · Oct 20, 2023 · Updated 2 years ago