SwinTransformer / Video-Swin-Transformer

This is an official implementation for "Video Swin Transformers".

☆1,594

Alternatives and similar repositories for Video-Swin-Transformer

Users who are interested in Video-Swin-Transformer are comparing it to the repositories listed below.

facebookresearch / TimeSformer
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
☆1,790 Updated last year
rishikksh20 / ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
☆556 Updated 4 years ago
Sense-X / UniFormer
[ICLR2022] official implementation of UniFormer
☆886 Updated last year
cvdfoundation / kinetics-dataset
☆907 Updated last year
haofanwang / video-swin-transformer-pytorch
Video Swin Transformer - PyTorch
☆266 Updated 3 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,604 Updated last year
lucidrains / TimeSformer-pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
☆726 Updated 4 years ago
mx-mark / VideoTransformer-pytorch
PyTorch implementation of a collection of scalable Video Transformer Benchmarks.
☆304 Updated 3 years ago
whai362 / PVT
Official implementation of PVT series
☆1,866 Updated 3 years ago
facebookresearch / mvit
Code Release for MViTv2 on Image Recognition.
☆444 Updated 11 months ago
sail-sg / poolformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
☆1,354 Updated last year
microsoft / SimMIM
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
☆1,004 Updated 3 years ago
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆693 Updated last year
yitu-opensource / T2T-ViT
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
☆1,188 Updated 2 years ago
DirtyHarryLYL / Transformer-in-Vision
Recent Transformer-based CV and related works.
☆1,334 Updated 2 years ago
happyharrycn / actionformer_release
Code release for ActionFormer (ECCV 2022)
☆522 Updated last year
jeonsworld / ViT-pytorch
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
☆2,091 Updated 3 years ago
berniwal / swin-transformer-pytorch
Implementation of the Swin Transformer in PyTorch.
☆848 Updated 4 years ago
facebookresearch / moco-v3
PyTorch implementation of MoCo v3 https://arxiv.org/abs/2104.02057
☆1,306 Updated 3 years ago
microsoft / Cream
This is a collection of our NAS and Vision Transformer work.
☆1,807 Updated last year
LeapLabTHU / DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Atte…
☆911 Updated last year
sallymmx / ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
☆590 Updated last year
OpenGVLab / UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
☆333 Updated last year
facebookresearch / ConvNeXt-V2
Code release for ConvNeXt V2 model
☆1,873 Updated last year
microsoft / VideoX
VideoX: a collection of video cross-modal models
☆1,047 Updated last year
piergiaj / pytorch-i3d
☆1,027 Updated 5 years ago
dandelin / ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
☆1,505 Updated last year
microsoft / CSWin-Transformer
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows, CVPR 2022
☆583 Updated 2 years ago
microsoft / CvT
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.
☆585 Updated 2 years ago
mit-han-lab / temporal-shift-module
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
☆2,153 Updated last year