facebookresearch / TimeSformer

The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

☆1,732

Alternatives and similar repositories for TimeSformer

Users who are interested in TimeSformer are comparing it to the repositories listed below.

SwinTransformer / Video-Swin-Transformer
This is an official implementation for "Video Swin Transformers".
☆1,568 Updated 2 years ago
cvdfoundation / kinetics-dataset
☆874 Updated last year
lucidrains / TimeSformer-pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
☆719 Updated 3 years ago
rishikksh20 / ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
☆541 Updated 4 years ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,552 Updated last year
Sense-X / UniFormer
[ICLR 2022] Official implementation of UniFormer
☆876 Updated last year
mx-mark / VideoTransformer-pytorch
PyTorch implementation of a collection of scalable Video Transformer benchmarks.
☆300 Updated 3 years ago
google-research / scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
☆3,618 Updated 3 weeks ago
OpenGVLab / VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
☆669 Updated 9 months ago
haofanwang / video-swin-transformer-pytorch
Video Swin Transformer - PyTorch
☆260 Updated 3 years ago
mit-han-lab / temporal-shift-module
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
☆2,134 Updated last year
happyharrycn / actionformer_release
Code release for ActionFormer (ECCV 2022)
☆508 Updated last year
facebookresearch / mvit
Code Release for MViTv2 on Image Recognition.
☆435 Updated 8 months ago
DirtyHarryLYL / Transformer-in-Vision
Recent Transformer-based CV and related works.
☆1,334 Updated last year
piergiaj / pytorch-i3d
☆1,018 Updated 5 years ago
microsoft / SimMIM
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
☆988 Updated 2 years ago
facebookresearch / moco-v3
PyTorch implementation of MoCo v3 https://arxiv.org/abs/2104.02057
☆1,279 Updated 3 years ago
sallymmx / ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
☆566 Updated last year
jeonsworld / ViT-pytorch
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
☆2,064 Updated 3 years ago
The-AI-Summer / self-attention-cv
Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.
☆1,210 Updated 3 years ago
v-iashin / video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and T…
☆606 Updated 6 months ago
sail-sg / poolformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
☆1,347 Updated last year
whai362 / PVT
Official implementation of PVT series
☆1,835 Updated 2 years ago
microsoft / VideoX
VideoX: a collection of video cross-modal models
☆1,036 Updated last year
facebookresearch / pytorchvideo
A deep learning library for video understanding research.
☆3,461 Updated 6 months ago
jacobgil / vit-explain
Explainability for Vision Transformers
☆986 Updated 3 years ago
hila-chefer / Transformer-Explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize …
☆1,913 Updated last year
microsoft / Cream
This is a collection of our NAS and Vision Transformer work.
☆1,785 Updated last year
yitu-opensource / T2T-ViT
[ICCV 2021] Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
☆1,187 Updated last year
zhenyingfang / Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation
Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation
☆515 Updated last month