Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners"
☆362Jan 12, 2026Updated last month
Alternatives and similar repositories for mae_st
Users who are interested in mae_st are comparing it to the libraries listed below.
- [NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training☆1,683Dec 8, 2023Updated 2 years ago
- PyTorch implementation of MAE https://arxiv.org/abs/2111.06377☆8,230Jul 23, 2024Updated last year
- ConvMAE: Masked Convolution Meets Masked Autoencoders☆524Mar 14, 2023Updated 2 years ago
- Directed masked autoencoders☆14Feb 20, 2026Updated last week
- Official codes for ConMIM (ICLR 2023)☆58Feb 8, 2023Updated 3 years ago
- A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).☆860Jul 10, 2024Updated last year
- MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022☆615Dec 13, 2022Updated 3 years ago
- Hiera: A fast, powerful, and simple hierarchical vision transformer.☆1,055Mar 2, 2024Updated 2 years ago
- Official Codes for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"☆245Dec 3, 2022Updated 3 years ago
- Official Open Source code for "Scaling Language-Image Pre-training via Masking"☆427Mar 30, 2023Updated 2 years ago
- [CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking☆756Oct 8, 2024Updated last year
- This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"☆198Jan 11, 2023Updated 3 years ago
- iBOT: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)☆765Apr 14, 2022Updated 3 years ago
- [ICCV 2023] You Only Look at One Partial Sequence☆343Oct 21, 2023Updated 2 years ago
- This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".☆1,026Sep 29, 2022Updated 3 years ago
- The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"☆1,831Apr 9, 2024Updated last year
- Code release for ConvNeXt V2 model☆1,975Aug 14, 2024Updated last year
- Omnivore: A Single Model for Many Visual Modalities☆571Nov 12, 2022Updated 3 years ago
- Code release for the research paper "Exploring Long-Sequence Masked Autoencoders"☆100Oct 14, 2022Updated 3 years ago
- Reading list for research topics in Masked Image Modeling☆338Dec 3, 2024Updated last year
- PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529☆162Jul 19, 2022Updated 3 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022)☆85Nov 2, 2022Updated 3 years ago
- Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)☆464May 9, 2022Updated 3 years ago
- [CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting☆544Sep 15, 2023Updated 2 years ago
- [ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows.☆68Oct 11, 2022Updated 3 years ago
- Video Contrastive Learning with Global Context, ICCVW 2021☆162May 30, 2022Updated 3 years ago
- PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.☆7,299Feb 19, 2026Updated 2 weeks ago
- [ICLR2022] official implementation of UniFormer☆899Mar 29, 2024Updated last year
- A deep learning library for video understanding research.☆3,544Jan 12, 2026Updated last month
- Code for "Recognizing Scenes from Novel Viewpoints"☆29Sep 16, 2022Updated 3 years ago
- PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO☆7,459Jul 3, 2024Updated last year
- A method to increase the speed and lower the memory footprint of existing vision transformers.☆1,171Jun 17, 2024Updated last year
- Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.☆783May 10, 2022Updated 3 years ago
- [Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)☆353Apr 23, 2025Updated 10 months ago
- Mosaic Representation Learning for Self-supervised Visual Pre-training (ICLR2023, Spotlight)☆15Apr 7, 2023Updated 2 years ago
- Official Code of ECCV 2022 paper MS-CLIP☆91Jul 27, 2022Updated 3 years ago
- [CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv…☆135May 21, 2023Updated 2 years ago
- Video embeddings for retrieval with natural language queries☆342Feb 15, 2023Updated 3 years ago
- PyTorch implementation of MoCo v3 https://arxiv.org/abs/2104.02057☆1,317Nov 25, 2021Updated 4 years ago