VILA-Lab / i-mae

i-mae Pytorch Repo

☆19

Alternatives and similar repositories for i-mae

Users interested in i-mae are comparing it to the libraries listed below.

OliverRensu / DeepMIM
[WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling
☆53Updated 2 months ago
enyac-group / supmae
This is an official PyTorch/GPU implementation of SupMAE.
☆78Updated 2 years ago
OpenGVLab / STM-Evaluation
☆72Updated 4 months ago
TencentARC / ConMIM
Official codes for ConMIM (ICLR 2023)
☆60Updated 2 years ago
ma-xu / FCViT
A Close Look at Spatial Modeling: From Attention to Convolution
☆91Updated 2 years ago
ChenhongyiYang / GPViT
[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
☆101Updated 2 years ago
raoyongming / AMixer
[ECCV 2022] AMixer: Adaptive Weight Mixing for Self-attention Free Vision Transformers
☆28Updated 2 years ago
fundamentalvision / Siamese-Image-Modeling
☆16Updated 2 years ago
ziplab / SN-Netv2
[ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".
☆28Updated last year
sunsmarterjie / beyond_masking
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
☆26Updated 3 years ago
dzy3 / KCD
Pytorch implementation of our paper accepted by ECCV2022 -- Knowledge Condensation Distillation https://arxiv.org/abs/2207.05409
☆30Updated 2 years ago
szq0214 / SReT
Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer"
☆65Updated 2 years ago
ucasligang / SimViT
[ICME 2022] code for the paper, SimVit: Exploring a simple vision transformer with sliding windows.
☆68Updated 2 years ago
facebookresearch / PLRC
Code for Point-Level Region Contrast (https://arxiv.org/abs/2202.04639)
☆35Updated 2 years ago
StevenGrove / vtpack
code base for vision transformers
☆36Updated 3 years ago
JunlinHan / CropMix
Code of CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
☆17Updated 2 years ago
rentainhe / pytorch-pooling
Test different pooling methods used in CNNs for computer vision tasks
☆35Updated 4 years ago
syp2ysy / prompt-SelF
[TIP] Exploring Effective Factors for Improving Visual In-Context Learning
☆19Updated last month
ariG23498 / TokenLearner
TensorFlow implementation of "TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?"
☆35Updated 3 years ago
LightDXY / BootMAE
ECCV2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining
☆97Updated 2 years ago
HubHop / vit-attention-benchmark
Benchmarking Attention Mechanism in Vision Transformers.
☆18Updated 2 years ago
LeonHLJ / Teach-DETR
Teach-DETR: Better Training DETR with Teachers
☆31Updated last year
Sense-X / TokenMix
TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers (ECCV 2022)
☆94Updated 2 years ago
lmbxmu / SuperViT
Official Pytorch implementation of Super Vision Transformer (IJCV)
☆43Updated 2 years ago
JunlinHan / MachineMem
Code of "What Images are More Memorable to Machines?"
☆15Updated 2 years ago
Alpha-VL / FastConvMAE
☆59Updated 3 years ago
weijiawu / Awesome-Synthetic-Data-for-Perception-Task
☆43Updated 2 years ago
megvii-research / revisitAIRL
[ECCV2022] Revisiting the Critical Factors of Augmentation-Invariant Representation Learning
☆12Updated 3 years ago
TencentARC / DTN
Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022.
☆28Updated 3 years ago
facebookresearch / long_seq_mae
code release of research paper "Exploring Long-Sequence Masked Autoencoders"
☆100Updated 2 years ago