lucidrains / TimeSformer-pytorch
Implementation of TimeSformer from Facebook AI, a pure attention-based solution for video classification
☆727 · Aug 25, 2021 · Updated 4 years ago
Alternatives and similar repositories for TimeSformer-pytorch
Users who are interested in TimeSformer-pytorch are comparing it to the libraries listed below.
- The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?" ☆1,824 · Apr 9, 2024 · Updated last year
- Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper) ☆222 · Aug 23, 2022 · Updated 3 years ago
- This is an official implementation for "Video Swin Transformers". ☆1,630 · Mar 8, 2023 · Updated 2 years ago
- Implementation of ViViT: A Video Vision Transformer ☆556 · Jun 21, 2021 · Updated 4 years ago
- PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models. ☆7,286 · Feb 5, 2026 · Updated last week
- A deep learning library for video understanding research. ☆3,544 · Jan 12, 2026 · Updated last month
- code for our ECCV-2020 paper: Self-supervised Video Representation Learning by Pace Prediction ☆100 · May 13, 2021 · Updated 4 years ago
- [CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning… ☆723 · Aug 8, 2023 · Updated 2 years ago
- ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet ☆1,191 · Oct 27, 2023 · Updated 2 years ago
- [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding ☆2,180 · Jul 11, 2024 · Updated last year
- Implementation of Transformer in Transformer, pixel level attention paired with patch level attention for image classification, in Pytorc… ☆310 · Dec 27, 2021 · Updated 4 years ago
- [CVPR 2020] Temporal Pyramid Network for Action Recognition ☆393 · Jan 12, 2021 · Updated 5 years ago
- [CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition ☆381 · Sep 17, 2022 · Updated 3 years ago
- Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020. ☆49 · Mar 18, 2021 · Updated 4 years ago
- S3D Text-Video model trained on HowTo100M using MIL-NCE ☆200 · Jul 3, 2020 · Updated 5 years ago
- Implementation of Bottleneck Transformer in Pytorch ☆677 · Sep 20, 2021 · Updated 4 years ago
- An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up proc… ☆196 · Mar 27, 2021 · Updated 4 years ago
- [NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman. ☆289 · Oct 10, 2021 · Updated 4 years ago
- Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Py… ☆24,993 · Updated this week
- [NeurIPS'2021] "TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up", Yifan Jiang, Shiyu Chang, Zhangyang Wang ☆1,690 · Nov 3, 2022 · Updated 3 years ago
- [CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers ☆756 · Jul 15, 2021 · Updated 4 years ago
- 3D ResNets for Action Recognition (CVPR 2018) ☆4,042 · Jan 20, 2021 · Updated 5 years ago
- PyTorch GPU distributed training code for MIL-NCE HowTo100M ☆219 · Jul 5, 2022 · Updated 3 years ago
- Implementation of TransGanFormer, an all-attention GAN that combines the finding from the recent GanFormer and TransGan paper ☆155 · Apr 27, 2021 · Updated 4 years ago
- [ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman. ☆167 · Apr 29, 2021 · Updated 4 years ago
- Gate-Shift Networks for Video Action Recognition - CVPR 2020 ☆150 · Jun 19, 2020 · Updated 5 years ago
- ☆71 · Oct 6, 2023 · Updated 2 years ago
- Official DeiT repository ☆4,323 · Mar 15, 2024 · Updated last year
- Implementation of Multistream Transformers in Pytorch ☆54 · Jul 31, 2021 · Updated 4 years ago
- PyTorch 3D video classification models pre-trained on 65 million Instagram videos ☆265 · Dec 7, 2019 · Updated 6 years ago
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding ☆64 · Mar 9, 2022 · Updated 3 years ago
- A general video understanding codebase from SenseTime X-Lab ☆445 · Apr 1, 2021 · Updated 4 years ago
- Code release for SLIP Self-supervision meets Language-Image Pre-training ☆787 · Feb 9, 2023 · Updated 3 years ago
- Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository. ☆1,216 · Sep 14, 2021 · Updated 4 years ago
- A concise but complete implementation of CLIP with various experimental improvements from recent papers ☆722 · Oct 16, 2023 · Updated 2 years ago
- Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones ☆201 · Mar 24, 2021 · Updated 4 years ago
- An open-source toolbox for action understanding based on PyTorch ☆1,878 · Apr 8, 2022 · Updated 3 years ago
- Video Contrastive Learning with Global Context, ICCVW 2021 ☆162 · May 30, 2022 · Updated 3 years ago
- Implementation of Fast Transformer in Pytorch ☆176 · Aug 26, 2021 · Updated 4 years ago