rishikksh20/ViViT-pytorch

rishikksh20 / ViViT-pytorch

Implementation of ViViT: A Video Vision Transformer

☆558

Alternatives and similar repositories for ViViT-pytorch

Users who are interested in ViViT-pytorch are comparing it to the libraries listed below.

mx-mark / VideoTransformer-pytorch
PyTorch implementation of a collections of scalable Video Transformer Benchmarks.
☆306 · May 4, 2022 · Updated 4 years ago
drv-agwl / ViViT-pytorch
☆69 · Apr 26, 2021 · Updated 5 years ago
noureldien / vivit_pytorch
Implementation of ViViT: A Video Vision Transformer - Zipping Coding Challenge
☆33 · Jun 10, 2021 · Updated 5 years ago
lucidrains / STAM-pytorch
Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification
☆133 · Apr 1, 2021 · Updated 5 years ago
facebookresearch / TimeSformer
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
☆1,863 · Apr 9, 2024 · Updated 2 years ago
SwinTransformer / Video-Swin-Transformer
This is an official implementation for "Video Swin Transformers".
☆1,667 · Mar 8, 2023 · Updated 3 years ago
KSonPham / ViVit-a-Pytorch-implementation
☆23 · Nov 18, 2022 · Updated 3 years ago
google-research / scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
☆3,819 · Jul 9, 2026 · Updated last week
lucidrains / TimeSformer-pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
☆729 · Aug 25, 2021 · Updated 4 years ago
haofanwang / video-swin-transformer-pytorch
Video Swin Transformer - PyTorch
☆269 · Jan 4, 2022 · Updated 4 years ago
bomri / SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
☆87 · Sep 13, 2021 · Updated 4 years ago
Alibaba-MIIL / STAM
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)
☆221 · Aug 23, 2022 · Updated 3 years ago
facebookresearch / SlowFast
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
☆7,391 · Mar 16, 2026 · Updated 4 months ago
MCG-NJU / VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
☆1,773 · Dec 8, 2023 · Updated 2 years ago
open-mmlab / mmaction2
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
☆5,097 · Mar 18, 2026 · Updated 4 months ago
cvdfoundation / kinetics-dataset
☆981 · May 15, 2024 · Updated 2 years ago
facebookresearch / AVT
Code release for ICCV 2021 paper "Anticipative Video Transformer"
☆154 · Feb 11, 2022 · Updated 4 years ago
facebookresearch / Motionformer
Code + pre-trained models for the paper Keeping Your Eye on the Ball Trajectory Attention in Video Transformers
☆234 · Jun 13, 2022 · Updated 4 years ago
lucidrains / vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Py…
☆25,423 · Jun 22, 2026 · Updated 3 weeks ago
wangxiang1230 / OadTR
Code for our ICCV 2021 Paper "OadTR: Online Action Detection with Transformers".
☆97 · Jul 16, 2023 · Updated 3 years ago
facebookresearch / pytorchvideo
A deep learning library for video understanding research.
☆3,565 · May 5, 2026 · Updated 2 months ago
daniel-code / TubeViT
An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"
☆95 · Updated this week
alibaba-mmai-research / TAdaConv
[ICLR 2022] TAda! Temporally-Adaptive Convolutions for Video Understanding. This codebase provides solutions for video classification, vi…
☆246 · Aug 23, 2023 · Updated 2 years ago
sallymmx / ActionCLIP
This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"
☆613 · Dec 6, 2023 · Updated 2 years ago
V-Sense / ACTION-Net
Official PyTorch implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21)
☆208 · Apr 19, 2021 · Updated 5 years ago
deepcs233 / TIN
[AAAI 2020] Temporal Interlacing Network
☆84 · Nov 25, 2020 · Updated 5 years ago
antoine77340 / MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
☆221 · Jul 5, 2022 · Updated 4 years ago
OpenGVLab / UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
☆351 · Apr 2, 2024 · Updated 2 years ago
google-deepmind / dmvr
☆68 · Nov 3, 2022 · Updated 3 years ago
v-iashin / video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and T…
☆653 · Feb 1, 2026 · Updated 5 months ago
Axe-- / ActionBERT
Transformer for Action Recognition in PyTorch
☆39 · Mar 14, 2020 · Updated 6 years ago
mit-han-lab / temporal-shift-module
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
☆2,215 · Jul 11, 2024 · Updated 2 years ago
liu-zhy / temporal-adaptive-module
TAM: Temporal Adaptive Module for Video Recognition
☆207 · Aug 18, 2022 · Updated 3 years ago
wdrink / STTS
Official PyTorch implementation of the ECCV 2022 paper: Efficient Video Transformers with Spatial-Temporal Token Selection.
☆52 · Jul 13, 2022 · Updated 4 years ago
SforAiDl / vformer
A modular PyTorch library for vision transformer models
☆165 · Oct 28, 2023 · Updated 2 years ago
fcakyon / video-transformers
Easiest way of fine-tuning HuggingFace video classification models
☆148 · Mar 20, 2023 · Updated 3 years ago
MCG-NJU / TDN
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
☆384 · Sep 17, 2022 · Updated 3 years ago
jfzhang95 / pytorch-video-recognition
PyTorch implemented C3D, R3D, R2Plus1D models for video activity recognition.
☆1,237 · Dec 27, 2023 · Updated 2 years ago
piergiaj / pytorch-i3d
☆1,051 · Jun 28, 2020 · Updated 6 years ago