facebookresearch/mvit

facebookresearch / mvit

Code Release for MViTv2 on Image Recognition.

☆456

Alternatives and similar repositories for mvit

Users who are interested in mvit are comparing it to the libraries listed below.

facebookresearch / SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
☆7,393 · Mar 16, 2026 · Updated 4 months ago
karttikeya / minREV
A simple minimal implementation of Reversible Vision Transformers
☆127 · Mar 14, 2024 · Updated 2 years ago
facebookresearch / hiera
Hiera: A fast, powerful, and simple hierarchical vision transformer.
☆1,074 · Mar 2, 2024 · Updated 2 years ago
ShoufaChen / AdaptFormer
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
☆388 · Sep 16, 2022 · Updated 3 years ago
facebookresearch / MeMViT
Code Release for MeMViT Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition, CVPR 2022
☆155 · Nov 30, 2022 · Updated 3 years ago
Sense-X / UniFormer
[ICLR 2022] Official implementation of UniFormer
☆907 · Mar 29, 2024 · Updated 2 years ago
SwinTransformer / Video-Swin-Transformer
This is an official implementation for "Video Swin Transformers".
☆1,667 · Mar 8, 2023 · Updated 3 years ago
facebookresearch / TimeSformer
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
☆1,864 · Apr 9, 2024 · Updated 2 years ago
whai362 / PVT
Official implementation of PVT series
☆1,902 · Oct 27, 2022 · Updated 3 years ago
facebookresearch / pytorchvideo
A deep learning library for video understanding research.
☆3,565 · May 5, 2026 · Updated 2 months ago
facebookresearch / deit
Official DeiT repository
☆4,359 · Mar 15, 2024 · Updated 2 years ago
Chenglin-Yang / LVT
Lite Vision Transformer (CVPR 2022)
☆144 · Oct 21, 2022 · Updated 3 years ago
microsoft / Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
☆16,010 · Jul 24, 2024 · Updated 2 years ago
ucasligang / SimViT
[ICME 2022] Code for the paper "SimViT: Exploring a Simple Vision Transformer with Sliding Windows".
☆67 · Oct 11, 2022 · Updated 3 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,775 · Dec 8, 2023 · Updated 2 years ago
Alpha-VL / ConvMAE
ConvMAE: Masked Convolution Meets Masked Autoencoders
☆531 · Mar 14, 2023 · Updated 3 years ago
facebookresearch / mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
☆8,369 · Jul 23, 2024 · Updated 2 years ago
LightDXY / BootMAE
ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining
☆97 · Nov 2, 2022 · Updated 3 years ago
google-research / scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
☆3,819 · Updated this week
microsoft / CSWin-Transformer
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped, CVPR 2022
☆586 · Nov 1, 2023 · Updated 2 years ago
czczup / ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
☆1,503 · Jun 3, 2025 · Updated last year
NVlabs / GroupViT
Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
☆788 · May 10, 2022 · Updated 4 years ago
OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago
Sense-X / MixMIM
MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning
☆145 · Jul 2, 2023 · Updated 3 years ago
youngwanLEE / MPViT
[CVPR 2022] MPViT: Multi-Path Vision Transformer for Dense Prediction
☆387 · Mar 2, 2022 · Updated 4 years ago
OliverRensu / Shunted-Transformer
☆216 · Dec 17, 2021 · Updated 4 years ago
NVlabs / FAN
Official PyTorch implementation of Fully Attentional Networks
☆484 · Mar 31, 2023 · Updated 3 years ago
open-mmlab / mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
☆5,104 · Mar 18, 2026 · Updated 4 months ago
fundamentalvision / Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
☆4,004 · May 16, 2024 · Updated 2 years ago
sail-sg / poolformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
☆1,363 · Jun 1, 2024 · Updated 2 years ago
LeapLabTHU / DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Atte…
☆940 · Apr 17, 2024 · Updated 2 years ago
raoyongming / HorNet
[NeurIPS 2022] HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
☆345 · Dec 30, 2025 · Updated 6 months ago
facebookresearch / Motionformer
Code + pre-trained models for the paper Keeping Your Eye on the Ball Trajectory Attention in Video Transformers
☆234 · Jun 13, 2022 · Updated 4 years ago
amazon-science / tubelet-transformer
This is an official implementation of TubeR: Tubelet Transformer for Video Action Detection
☆96 · Apr 14, 2023 · Updated 3 years ago
dingmyu / davit
[ECCV 2022] Code for paper "DaViT: Dual Attention Vision Transformer"
☆378 · Feb 13, 2024 · Updated 2 years ago
baaivision / EVA
EVA Series: Visual Representation Fantasies from BAAI
☆2,685 · Aug 1, 2024 · Updated last year
facebookresearch / ConvNeXt-V2
Code release for ConvNeXt V2 model
☆2,066 · Aug 14, 2024 · Updated last year
IBM / CrossViT
Official implementation of CrossViT. https://arxiv.org/abs/2103.14899
☆417 · Jan 12, 2022 · Updated 4 years ago
snap-research / EfficientFormer
EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]
☆1,116 · Aug 13, 2023 · Updated 2 years ago