OpenGVLab/unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

☆348

Alternatives and similar repositories for unmasked_teacher

Users interested in unmasked_teacher are comparing it to the repositories listed below.

klauscc / VindLU
☆108 · Dec 23, 2022 · Updated 3 years ago
OpenGVLab / InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
☆2,342 · Jul 2, 2026 · Updated 3 weeks ago
OpenGVLab / UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
☆351 · Apr 2, 2024 · Updated 2 years ago
ruiwang2021 / mvd
[CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv…
☆135 · May 21, 2023 · Updated 3 years ago
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆803 · Oct 8, 2024 · Updated last year
OpenGVLab / VideoChat-Flash
[ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
☆527 · Updated this week
OpenGVLab / VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
☆1,121 · Jul 6, 2024 · Updated 2 years ago
OpenGVLab / perception_test_iccv2023
Champion Solutions repository for Perception Test challenges in ICCV2023 workshop.
☆14 · Oct 18, 2023 · Updated 2 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,775 · Dec 8, 2023 · Updated 2 years ago
CASIA-IVA-Lab / VALOR
[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
☆311 · Dec 25, 2024 · Updated last year
OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago
zhaoyue-zephyrus / AVION
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"
☆138 · Aug 23, 2025 · Updated 11 months ago
microsoft / XPretrain
Multi-modality pre-training
☆511 · Mar 27, 2026 · Updated 3 months ago
CASIA-IVA-Lab / VAST
[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
☆302 · Mar 14, 2024 · Updated 2 years ago
OpenGVLab / Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
☆3,344 · Jul 17, 2026 · Updated last week
facebookresearch / LaViLa
Code release for "Learning Video Representations from Large Language Models"
☆534 · Oct 1, 2023 · Updated 2 years ago
llyx97 / FETV
[NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…
☆56 · Mar 4, 2024 · Updated 2 years ago
showlab / all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
☆281 · Mar 25, 2023 · Updated 3 years ago
ArrowLuo / CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
☆1,029 · Apr 12, 2024 · Updated 2 years ago
whwu95 / BIKE
【CVPR'2023】Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
☆156 · Sep 9, 2024 · Updated last year
huangb23 / VTimeLLM
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
☆295 · Jun 13, 2024 · Updated 2 years ago
daniel-code / TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
☆95 · Jul 15, 2026 · Updated last week
OpenGVLab / EgoVideo
[CVPR 2024 Champions][ICLR 2025] Solutions for EgoVis Challenges in CVPR 2024
☆136 · May 11, 2025 · Updated last year
mbzuai-oryx / Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the cap…
☆1,505 · Aug 5, 2025 · Updated 11 months ago
DCDmllm / Momentor
☆81 · Nov 24, 2024 · Updated last year
whwu95 / Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
☆256 · Nov 29, 2024 · Updated last year
baaivision / EVA
EVA Series: Visual Representation Fantasies from BAAI
☆2,685 · Aug 1, 2024 · Updated last year
taoyang1122 / adapt-image-models
[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
☆299 · Sep 17, 2023 · Updated 2 years ago
showlab / UniVTG
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
☆380 · May 8, 2024 · Updated 2 years ago
antoyang / VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
☆213 · Nov 13, 2023 · Updated 2 years ago
PKU-YuanGroup / LanguageBind
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
☆884 · Mar 25, 2024 · Updated 2 years ago
sallymmx / ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
☆614 · Dec 6, 2023 · Updated 2 years ago
showlab / EgoVLP
[NeurIPS 2022] Egocentric Video-Language Pretraining
☆261 · May 9, 2024 · Updated 2 years ago
google-deepmind / perception_test
☆253 · Jun 19, 2026 · Updated last month
xuguohai / X-CLIP
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
☆185 · Apr 6, 2024 · Updated 2 years ago
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,776 · Jan 12, 2026 · Updated 6 months ago
leexinhao / ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
☆20 · Jul 29, 2024 · Updated last year
jingwangsg / MS-DETR
An official implementation for MS-DETR in ACL'23
☆17 · Jun 3, 2023 · Updated 3 years ago
Yui010206 / SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
☆198 · Jan 14, 2024 · Updated 2 years ago