rish-16 / tokenlearner-pytorch
Unofficial PyTorch implementation of TokenLearner by Google AI
☆64 Updated last year
Related projects:
- Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks, de… ☆96 Updated 2 years ago
- A compilation of network architectures for vision and others without usage of self-attention mechanism ☆77 Updated last year
- This is an official PyTorch/GPU implementation of SupMAE. ☆76 Updated 2 years ago
- Code release for the research paper "Exploring Long-Sequence Masked Autoencoders" ☆99 Updated last year
- A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework". ☆82 Updated 7 months ago
- PyTorch code for MUST ☆105 Updated last year
- ☆111 Updated last year
- ☆72 Updated 2 years ago
- Official repository for "Intriguing Properties of Vision Transformers" (NeurIPS 2021--Spotlight) ☆176 Updated 2 years ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496 ☆75 Updated last month
- ☆62 Updated 2 years ago
- NeurIPS 2021, Official codes for "Efficient Training of Visual Transformers with Small Datasets". ☆138 Updated last year
- ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining ☆98 Updated last year
- (ICML 2022) Official PyTorch implementation of “Blurs Behave Like Ensembles: Spatial Smoothings to Improve Accuracy, Uncertainty, and Rob… ☆76 Updated 2 years ago
- Visual Language Transformer Interpreter - An interactive visualization tool for interpreting vision-language transformers ☆85 Updated last year
- ☆84 Updated 2 years ago
- Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data". ☆78 Updated last year
- [ICLR 2024] Exploring Target Representations for Masked Autoencoders ☆52 Updated 8 months ago
- Code release for "MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition", CVPR 2022 ☆144 Updated last year
- [ICME 2022] Code for the paper "SimViT: Exploring a simple vision transformer with sliding windows". ☆67 Updated last year
- (ICLR 2023) Official PyTorch implementation of "What Do Self-Supervised Vision Transformers Learn?" ☆97 Updated 6 months ago
- Multimodal Masked Autoencoders (M3AE): A JAX/Flax Implementation ☆100 Updated 5 months ago
- Implementation of STAM (Space Time Attention Model), a pure and simple attention model that reaches SOTA for video classification ☆126 Updated 3 years ago
- PyTorch implementation of the paper "MILAN: Masked Image Pretraining on Language Assisted Representation" https://arxiv.org/pdf/2208.0604… ☆79 Updated 2 years ago
- A task-agnostic vision-language architecture as a step towards General Purpose Vision ☆92 Updated 3 years ago
- [NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images ☆58 Updated 2 years ago
- Official repository for the General Robust Image Task (GRIT) Benchmark ☆48 Updated last year
- ☆48 Updated 11 months ago
- [ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation ☆97 Updated last year
- Whitening for Self-Supervised Representation Learning | Official repository ☆123 Updated last year