innat / VideoSwin
Keras 3 Implementation of Video Swin Transformers for 3D Video Modeling
☆30Updated 2 months ago
Alternatives and similar repositories for VideoSwin:
Users who are interested in VideoSwin compare it to the libraries listed below.
- [NeurIPS'22] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training☆18Updated last year
- Easiest way of fine-tuning HuggingFace video classification models☆137Updated last year
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning"☆89Updated 5 months ago
- Keras (TensorFlow v2) reimplementation of Swin Transformer V1 and V2 models☆22Updated 6 months ago
- Implementation of Deep Orthogonal Fusion of Local and Global Features in TensorFlow 2☆25Updated last year
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023]☆95Updated 9 months ago
- Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper☆84Updated last year
- Video Swin Transformer - PyTorch☆242Updated 3 years ago
- PyTorch implementation of a collection of scalable Video Transformer Benchmarks.☆289Updated 2 years ago
- [ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer☆303Updated 10 months ago
- menovideo: PyTorch library for video action recognition and video understanding☆28Updated 3 years ago
- Video classification exercise using UCF101 data for training an early-fusion and SlowFast architecture model, both using the PyTorch Ligh…☆14Updated 3 years ago
- Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification☆130Updated 3 years ago
- A Keras implementation of hybrid efficientnet swin transformer model.☆33Updated last year
- [BMVC 2022] Official repository for "How to Train Vision Transformer on Small-scale Datasets?"☆147Updated last year
- PyTorch and TensorFlow/Keras image models with automatic weight conversions and equal API/implementations - Vision Transformer (ViT), Res…☆37Updated last year
- Includes PyTorch -> Keras model porting code for ConvNeXt family of models with fine-tuning and inference notebooks.☆100Updated 2 years ago
- An implementation of the X3D video recognition architecture in TensorFlow/Keras☆15Updated 3 years ago
- This repository contains the MPOSE2021 Dataset for short-time pose-based Human Action Recognition (HAR).☆51Updated last year
- This repo is the official implementation of the paper "Multimodal transformer for Nurse Activity Recognition", published in CVPM2022, CVPRW.☆17Updated 8 months ago
- 2nd Place Google - Isolated Sign Language Recognition☆47Updated last year
- [CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv…☆116Updated last year
- A modular PyTorch library for vision transformer models☆162Updated last year
- The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag…☆226Updated 2 years ago
- Code Release for MViTv2 on Image Recognition.☆416Updated 2 months ago
- CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation☆17Updated 2 weeks ago