tobna / WhatTransformerToFavor
GitHub repository for the paper "Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers".
☆31 · Updated 8 months ago
Alternatives and similar repositories for WhatTransformerToFavor
Users interested in WhatTransformerToFavor are comparing it to the libraries listed below.
- The official implementation of [NeurIPS 2024] Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation https://ar… ☆45 · Updated 11 months ago
- ☆28 · Updated 2 years ago
- Code for 'Multi-level Logit Distillation' (CVPR 2023) ☆70 · Updated last year
- Training ImageNet / CIFAR models with SOTA strategies and fancy techniques such as ViT, KD, Rep, etc. ☆86 · Updated last year
- [ECCV 2022] Implementation of the paper "Locality Guidance for Improving Vision Transformers on Tiny Datasets" ☆82 · Updated 3 years ago
- (AAAI 2023 Oral) PyTorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer" ☆106 · Updated 2 years ago
- [ICML 2024] DetKDS: Knowledge Distillation Search for Object Detectors ☆17 · Updated last year
- ☆63 · Updated 4 years ago
- ☆47 · Updated 2 years ago
- The official project website of "NORM: Knowledge Distillation via N-to-One Representation Matching" (The paper of NORM is published in IC… ☆20 · Updated 2 years ago
- ☆23 · Updated last year
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022. ☆32 · Updated 3 years ago
- Official implementation of the paper "Knowledge Distillation from A Stronger Teacher", NeurIPS 2022 ☆153 · Updated 2 years ago
- ResMLP: Feedforward networks for image classification with data-efficient training ☆45 · Updated 4 years ago
- Official PyTorch implementation of our ECCV 2022 paper "Sliced Recursive Transformer" ☆66 · Updated 3 years ago
- PyTorch code and checkpoints release for VanillaKD: https://arxiv.org/abs/2305.15781 ☆76 · Updated 2 years ago
- Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut, ICML 2022 ☆105 · Updated 2 years ago
- Official implementation of Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer ☆73 · Updated 3 years ago
- Official code for Scale Decoupled Distillation ☆43 · Updated last year
- Convolutional Initialization for Data-Efficient Vision Transformers ☆16 · Updated last year
- ☆43 · Updated 2 years ago
- [ECCV 2022] Official implementation of MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition && PyTorch Implementations of… ☆110 · Updated 3 years ago
- [AAAI 2022] Official PyTorch implementation of "Less is More: Pay Less Attention in Vision Transformers" ☆97 · Updated 3 years ago
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆30 · Updated last year
- A Close Look at Spatial Modeling: From Attention to Convolution ☆91 · Updated 2 years ago
- [ICML 2024] Official PyTorch implementation of "SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-paramete… ☆109 · Updated last year
- Official PyTorch implementation of Super Vision Transformer (IJCV) ☆43 · Updated 2 years ago
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution" ☆12 · Updated 2 years ago
- [BMVC 2022] Official repository for "How to Train Vision Transformer on Small-scale Datasets?" ☆166 · Updated 2 years ago
- [ECCV 2022] EdgeViT: Competing Light-weight CNNs on Mobile Devices with Vision Transformers ☆112 · Updated 2 years ago