innat / VideoMAE
[NeurIPS'22] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆19 Updated last year
Alternatives and similar repositories for VideoMAE:
Users who are interested in VideoMAE are comparing it to the libraries listed below.
- Keras 3 Implementation of Video Swin Transformers for 3D Video Modeling ☆32 Updated 2 months ago
- PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/implementations - Vision Transformer (ViT), Res… ☆37 Updated last year
- Self-Supervised Learning in PyTorch ☆135 Updated 11 months ago
- Vision Transformers for image classification, image segmentation, and object detection. ☆46 Updated 4 months ago
- [ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmen… ☆462 Updated last year
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard. ☆66 Updated this week
- Hiera: A fast, powerful, and simple hierarchical vision transformer. ☆957 Updated last year
- ☆66 Updated last month
- xLSTM as Generic Vision Backbone ☆464 Updated 4 months ago
- Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information ☆56 Updated 11 months ago
- EscVM YouTube Channel Repository. Start from Notebooks ⬅️ ☆64 Updated 6 months ago
- An SDK for Transformers + YOLO and other SSD family models ☆59 Updated last month
- A Simplified PyTorch Implementation of Vision Transformer (ViT) ☆166 Updated 9 months ago
- The second generation of YOWO action detector. ☆235 Updated 10 months ago
- Implementation of SegFormer in PyTorch ☆69 Updated 2 years ago
- Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks. ☆100 Updated 2 years ago
- A modular PyTorch library for vision transformer models ☆162 Updated last year
- [ICML 2023] Official PyTorch implementation of Global Context Vision Transformers ☆429 Updated last year
- A clean, modular implementation of the YOLOv7 model family, which uses the official pretrained weights, with utilities for training the m… ☆116 Updated last year
- This notebook is designed to plot the attention maps of a vision transformer trained on MNIST digits. ☆34 Updated last month
- A multi-backend (TensorFlow, PyTorch, JAX, and NumPy) implementation of the Segment Anything model in Keras 3.0 ☆32 Updated 11 months ago
- ☆27 Updated last year
- Object Detection with Transformers: DETR, Conditional DETR, Deformable DETR, Dynamic Head ☆11 Updated 2 years ago
- Continuation of an abandoned project fast-coco-eval ☆92 Updated last month
- ☆182 Updated 3 weeks ago
- ☆17 Updated last year
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆96 Updated 10 months ago
- A summarization of Transformer-based architectures for CV tasks, including image classification, object detection, segmentation, and Few-… ☆108 Updated 2 years ago
- Torch nn visualization ☆52 Updated last year
- [NeurIPS 2022] Official code for "Focal Modulation Networks" ☆723 Updated last year