changsn / STViT-R
This is an official implementation for "Making Vision Transformers Efficient from A Token Sparsification View".
☆34 Updated 2 months ago
Alternatives and similar repositories for STViT-R:
Users who are interested in STViT-R are comparing it to the repositories listed below
- ☆85 Updated last year
- (AAAI 2023 Oral) PyTorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer" ☆103 Updated last year
- Adaptive Token Sampling for Efficient Vision Transformers (ECCV 2022 Oral Presentation) ☆101 Updated last year
- [CVPR 2023 Highlight] Masked Image Modeling with Local Multi-Scale Reconstruction ☆48 Updated last year
- [CVPR 2023] This repository includes the official implementation of our paper "Masked Autoencoders Enable Efficient Knowledge Distillers" ☆106 Updated last year
- Code for the ECCV 2022 paper "Contrastive Deep Supervision" ☆69 Updated 2 years ago
- [ICLR 2023] Masked Frequency Modeling for Self-Supervised Visual Pre-Training ☆75 Updated 2 years ago
- [ICLR 2024] Exploring Target Representations for Masked Autoencoders ☆55 Updated last year
- Video Test-Time Adaptation for Action Recognition (CVPR 2023) ☆44 Updated 6 months ago
- ☆59 Updated 2 years ago
- Official Implementation of AlignMixup - CVPR 2022 ☆71 Updated 3 years ago
- [AAAI 2022] This is the official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers" ☆96 Updated 2 years ago
- ☆66 Updated 2 years ago
- TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers (ECCV 2022) ☆93 Updated 2 years ago
- ☆28 Updated last year
- Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer ☆72 Updated 2 years ago
- [CVPR'23 & TPAMI'25] Hard Patches Mining for Masked Image Modeling ☆93 Updated 3 weeks ago
- [ICCV'2023 Oral] Implicit Temporal Modeling with Learnable Alignment for Video Recognition ☆35 Updated last year
- MixMIM: Mixed and Masked Image Modeling for Efficient Visual Representation Learning ☆142 Updated last year
- MSG-Transformer: Exchanging Local Spatial Information by Manipulating Messenger Tokens (CVPR 2022) ☆81 Updated 2 years ago
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions ☆60 Updated last year
- Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations ☆183 Updated last year
- Novel Class Discovery in Semantic Segmentation. CVPR 2022 ☆68 Updated 2 years ago
- [CVPR-22] This is the official implementation of the paper "AdaViT: Adaptive Vision Transformers for Efficient Image Recognition". ☆51 Updated 2 years ago
- Official PyTorch implementation of the ECCV 2022 paper: Efficient Video Transformers with Spatial-Temporal Token Selection. ☆47 Updated 2 years ago
- AFNet (NeurIPS 2022) ☆19 Updated 2 years ago
- [BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition ☆77 Updated last month
- Code for "Part-Guided Relational Transformers for Fine-Grained Visual Recognition", published in TIP 2021 ☆23 Updated last year
- Project Page for "Multi-Task Dense Prediction via Mixture of Low-Rank Experts" ☆71 Updated 4 months ago
- [CVPR 2022] PyTorch implementation of "Background Activation Suppression for Weakly Supervised Object Localization". ☆45 Updated last year