ziplab / Mesa

This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".

☆120

Alternatives and similar repositories for Mesa

Users who are interested in Mesa are comparing it to the libraries listed below.

VITA-Group / AsViT
[ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…
☆76 Updated 3 years ago
facebookresearch / asym-siam
PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)
☆99 Updated 3 years ago
ChengyueGongR / PatchVisionTransformer
☆73 Updated 2 years ago
Tete-Xiao / ReSim
PyTorch Implementation of Region Similarity Representation Learning (ReSim)
☆89 Updated 3 years ago
facebookresearch / long_seq_mae
code release of research paper "Exploring Long-Sequence Masked Autoencoders"
☆100 Updated 2 years ago
facebookresearch / Generic-Grouping
Open-source code for Generic Grouping Network (GGN, CVPR 2022)
☆111 Updated 4 months ago
Jiahao000 / ORL
[NeurIPS 2021] ORL: Unsupervised Object-Level Representation Learning from Scene Images
☆58 Updated 3 years ago
yucornetto / GG-Transformer
Code and models for the paper Glance-and-Gaze Vision Transformer
☆28 Updated 4 years ago
enyac-group / supmae
This is an official PyTorch/GPU implementation of SupMAE.
☆78 Updated 2 years ago
StevenGrove / vtpack
code base for vision transformers
☆36 Updated 3 years ago
changlin31 / AutoProg
(CVPR 2022) Automated Progressive Learning for Efficient Training of Vision Transformers
☆25 Updated 4 months ago
TencentARC / ConMIM
Official codes for ConMIM (ICLR 2023)
☆60 Updated 2 years ago
gaopengcuhk / Container
Official Code Release for Container: Context Aggregation Network
☆46 Updated 3 years ago
dddzg / unimoco
UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning
☆55 Updated 3 years ago
zhoudaquan / Refiner_ViT
☆109 Updated 3 years ago
ziplab / EcoFormer
[NeurIPS 2022 Spotlight] This is the official PyTorch implementation of "EcoFormer: Energy-Saving Attention with Linear Complexity"
☆72 Updated 2 years ago
facebookresearch / OTTER
This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described …
☆69 Updated 3 years ago
yuhuixu1993 / BNET
Batch Normalization with Enhanced Linear Transformation
☆53 Updated last year
LightDXY / BootMAE
ECCV 2022, Bootstrapped Masked Autoencoders for Vision BERT Pretraining
☆97 Updated 2 years ago
FocalNet / Networks-Beyond-Attention
A compilation of network architectures for vision and others without usage of self-attention mechanism
☆80 Updated 2 years ago
mingkai-zheng / ReSSL
ReSSL: Relational Self-Supervised Learning with Weak Augmentation
☆58 Updated 3 years ago
vtddggg / Robust-Vision-Transformer
The implementation of our paper: Towards Robust Vision Transformer (CVPR2022)
☆142 Updated 2 years ago
zengarden / momentum2-teacher
Implementation of momentum^2 teacher
☆121 Updated 4 years ago
princeton-vl / SOLID
☆44 Updated 2 years ago
VITA-Group / SViTE
[NeurIPS'21] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang…
☆90 Updated last year
Alpha-VL / FastConvMAE
☆59 Updated 3 years ago
blackfeather-wang / InfoPro-Pytorch
Learning recognition/segmentation models without end-to-end training. 40%-60% less GPU memory footprint. Same training time. Better perfo…
☆90 Updated 2 years ago
zhenxingjian / Partial_Distance_Correlation
This is the official GitHub for paper: On the Versatile Uses of Partial Distance Correlation in Deep Learning, in ECCV 2022
☆175 Updated 2 years ago
HubHop / vit-attention-benchmark
Benchmarking Attention Mechanism in Vision Transformers.
☆18 Updated 2 years ago
haohang96 / bingo
Bag of Instances Aggregation Boosts Self-supervised Distillation (ICLR 2022)
☆33 Updated 3 years ago