Implementation of the paper Video Action Transformer Network
☆138 · Apr 5, 2021 · Updated 4 years ago
Alternatives and similar repositories for Video-Action-Transformer-Network-Pytorch-
Users who are interested in Video-Action-Transformer-Network-Pytorch- are comparing it to the repositories listed below.
- Transformer for Action Recognition in PyTorch ☆38 · Mar 14, 2020 · Updated 5 years ago
- An implementation of Video Transformer Network (VTN) approach for Action Recognition in TensorFlow. ☆55 · Sep 29, 2020 · Updated 5 years ago
- [CVPR 2020] Temporal Pyramid Network for Action Recognition ☆393 · Jan 12, 2021 · Updated 5 years ago
- Code repository for the paper: 'Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks' ☆148 · Aug 25, 2023 · Updated 2 years ago
- Code of the STAGE module for video action detection ☆48 · May 25, 2021 · Updated 4 years ago
- STEP: Spatio-Temporal Progressive Learning for Video Action Detection. CVPR'19 (Oral) ☆252 · Oct 19, 2019 · Updated 6 years ago
- Video Transformer Network ☆41 · Jun 8, 2021 · Updated 4 years ago
- video summarization lstm-gan pytorch implementation ☆27 · Dec 6, 2019 · Updated 6 years ago
- ☆16 · Jan 6, 2025 · Updated last year
- Learning Spatiotemporal Features via Video and Text Pair Discrimination ☆60 · Jan 20, 2021 · Updated 5 years ago
- Action-Localization, Atomic Visual Actions (AVA) Dataset ☆25 · Sep 18, 2019 · Updated 6 years ago
- A video database bridging human actions and human-object relationships ☆156 · Jun 30, 2020 · Updated 5 years ago
- Implementation of "Encouraging LSTMs to Anticipate Actions Very Early", ICCV 2017 ☆19 · Mar 25, 2018 · Updated 7 years ago
- Extension of hLSTMat ☆19 · Apr 15, 2021 · Updated 4 years ago
- Zero-shot video classification by end-to-end training of 3D convolutional neural networks ☆150 · Jun 14, 2020 · Updated 5 years ago
- Long-Term Feature Banks for Detailed Video Understanding ☆384 · Aug 30, 2021 · Updated 4 years ago
- Code for the paper: Audio-Visual Model Distillation Using Acoustic Images ☆21 · Mar 24, 2023 · Updated 2 years ago
- Implementation of the paper Unsupervised Learning of Video Representations using LSTMs ☆10 · Nov 24, 2017 · Updated 8 years ago
- Video Summarization Transformer: Implementation in PyTorch of the Transformer model for video summarisation ☆10 · Oct 27, 2020 · Updated 5 years ago
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions ☆10 · Jan 17, 2021 · Updated 5 years ago
- PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models. ☆7,297 · Feb 19, 2026 · Updated last week
- Official PyTorch implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21) ☆209 · Apr 19, 2021 · Updated 4 years ago
- This repository provides the dataset introduced by the paper "Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentenc… ☆69 · May 1, 2020 · Updated 5 years ago
- Spatio-Temporal Action Localization System ☆424 · May 21, 2022 · Updated 3 years ago
- [ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding ☆2,181 · Jul 11, 2024 · Updated last year
- Implementation of "Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning" (https://arxiv.… ☆26 · Nov 3, 2018 · Updated 7 years ago
- Codebase for "Revisiting spatio-temporal layouts for compositional action recognition" (Oral at BMVC 2021). ☆27 · Apr 3, 2022 · Updated 3 years ago
- code for our ECCV-2020 paper: Self-supervised Video Representation Learning by Pace Prediction ☆100 · May 13, 2021 · Updated 4 years ago
- I3D Nonlocal ResNets in Pytorch ☆258 · Mar 26, 2022 · Updated 3 years ago
- video captioning using 3DCNN and LSTM (pytorch) ☆11 · Sep 26, 2019 · Updated 6 years ago
- Code for Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization. ☆10 · Sep 28, 2021 · Updated 4 years ago
- ☆69 · Apr 26, 2021 · Updated 4 years ago
- dataset cleansing for Visual Genome ☆30 · Apr 26, 2017 · Updated 8 years ago
- Pytorch Implementation of Videos as Space-Time Region Graphs ☆27 · May 30, 2025 · Updated 9 months ago
- An open-source toolbox for action understanding based on PyTorch ☆1,877 · Apr 8, 2022 · Updated 3 years ago
- "Object-Region Video Transformers", Herzig et al., CVPR 2022 ☆50 · Jul 6, 2022 · Updated 3 years ago
- Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval ☆68 · Apr 10, 2020 · Updated 5 years ago
- Code for Oops! Predicting Unintentional Action in Video ☆79 · Apr 13, 2020 · Updated 5 years ago
- LSCT-PHIQNet ☆12 · Jan 19, 2023 · Updated 3 years ago