innat / VideoSwin
Keras 3 Implementation of Video Swin Transformers for 3D Video Modeling
☆33 Updated 9 months ago
Alternatives and similar repositories for VideoSwin
Users who are interested in VideoSwin are comparing it to the libraries listed below.
- Easiest way of fine-tuning HuggingFace video classification models ☆145 Updated 2 years ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆100 Updated last year
- Video classification exercise using UCF101 data for training an early-fusion and SlowFast architecture model, both using the PyTorch Ligh… ☆15 Updated 3 years ago
- Awesome Fine-Grained Image Classification ☆91 Updated last year
- [NeurIPS'22] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training ☆22 Updated last year
- Self-Supervised Learning in PyTorch ☆140 Updated last year
- Action recognition tutorial using UCF-101 dataset. ☆28 Updated 3 years ago
- Vision Transformers for image classification, image segmentation, and object detection. ☆58 Updated 11 months ago
- PyTorch implementation of a collection of scalable Video Transformer Benchmarks. ☆304 Updated 3 years ago
- This repository demonstrates how to use TensorFlow based SegFormer model in 🤗 transformers package. ☆30 Updated 3 years ago
- Official implementation of "Delving into CLIP latent space for Video Anomaly Recognition", CVIU 2024 ☆84 Updated 2 weeks ago
- Heatmap Learner Convolutional Neural Network for Object Counting and Localization ☆44 Updated last year
- ☆76 Updated 2 months ago
- Code Release for MViTv2 on Image Recognition. ☆437 Updated 10 months ago
- Easy-to-read implementation of self-supervised learning using vision transformer and knowledge distillation with no labels - DINO ☆29 Updated 2 years ago
- Fine-tune Facebook's DETR (DEtection TRansformer) on Colaboratory. ☆152 Updated 2 years ago
- A Keras implementation of hybrid efficientnet swin transformer model. ☆34 Updated last year
- LRCN approach for video regression that uses CNNs for visual input and LSTMs to process sequences of frame embeddings ☆21 Updated 4 years ago
- menovideo: pytorch library for video action recognition and video understanding ☆29 Updated 3 years ago
- Easy to use class balanced cross entropy and focal loss implementation for Pytorch ☆99 Updated 9 months ago
- Tensorflow implementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformer ☆109 Updated 4 years ago
- The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆230 Updated 3 years ago
- [BMVC 2022] Official repository for "How to Train Vision Transformer on Small-scale Datasets?" ☆162 Updated last year
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning" ☆92 Updated last year
- A Detection Toolbox for Tensorflow2 ☆56 Updated 2 years ago
- Video Swin Transformer - PyTorch ☆265 Updated 3 years ago
- Implementation of Swin Transformers in TensorFlow along with converted pre-trained models, code for off-the-shelf classification and fine… ☆60 Updated 3 years ago
- End-to-End Object Detection with Transformers ☆51 Updated last month
- Awesome Video Anomaly Detection ☆56 Updated last month
- Implementation of Deep Orthogonal Fusion of Local and Global Features in TensorFlow 2 ☆26 Updated 2 years ago