xxgege / GAM
The official repo for CVPR2023 highlight paper "Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization".
☆82Updated last year
Alternatives and similar repositories for GAM:
Users that are interested in GAM are comparing it to the libraries listed below
- ☆26Updated 2 years ago
- [NeurIPS 2022] A novel 1-Lipschitz network that can be efficiently trained to achieve certified L-infinity robustness for free!☆31Updated 2 years ago
- Transformers trained on Tiny ImageNet☆53Updated 2 years ago
- Official implementation of paper "Knowledge Distillation from A Stronger Teacher", NeurIPS 2022☆140Updated 2 years ago
- [CVPR 2024] Friendly Sharpness-Aware Minimization☆28Updated 4 months ago
- This is the official GitHub for paper: On the Versatile Uses of Partial Distance Correlation in Deep Learning, in ECCV 2022☆173Updated last year
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen…☆80Updated last year
- PyTorch implementation of paper "Dataset Distillation via Factorization" in NeurIPS 2022.☆64Updated 2 years ago
- [ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain☆75Updated 2 years ago
- [CVPR 2023] This repository includes the official implementation our paper "Masked Autoencoders Enable Efficient Knowledge Distillers"☆104Updated last year
- Implementation of ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks, ICML 2021.☆141Updated 3 years ago
- Official repository of our work "Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning" accepted at CVPR 20…☆21Updated last month
- Decoupled Kullback-Leibler Divergence Loss (DKL), NeurIPS 2024 / Generalized Kullback-Leibler Divergence Loss (GKL)☆42Updated last week
- The offical implement of ImbSAM (Imbalanced-SAM)☆23Updated last year
- [CVPR-2022] Official implementation for "Knowledge Distillation with the Reused Teacher Classifier".☆92Updated 2 years ago
- Denoising Masked Autoencoders Help Robust Classification.☆62Updated last year
- The official codes of our CVPR-2023 paper: Sharpness-Aware Gradient Matching for Domain Generalization☆73Updated last year
- Implementation of HAT https://arxiv.org/pdf/2204.00993☆49Updated last year
- [NeurIPS'23] DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions☆61Updated 10 months ago
- PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32)☆143Updated 2 years ago
- Official PyTorch implementation of PS-KD☆84Updated 2 years ago
- [AAAI 2023] Official PyTorch Code for "Curriculum Temperature for Knowledge Distillation"☆166Updated 3 months ago
- Official implementation for paper "Knowledge Diffusion for Distillation", NeurIPS 2023☆81Updated last year
- [CVPR 2023 Highlight] This is the official implementation of "Stitchable Neural Networks".☆249Updated last year
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022.☆32Updated 2 years ago
- [ICCV23] Robust Mixture-of-Expert Training for Convolutional Neural Networks by Yihua Zhang, Ruisi Cai, Tianlong Chen, Guanhua Zhang, Hua…☆51Updated last year
- Training ImageNet / CIFAR models with sota strategies and fancy techniques such as ViT, KD, Rep, etc.☆82Updated last year
- Code for Paper "Self-Distillation from the Last Mini-Batch for Consistency Regularization"☆40Updated 2 years ago
- Paper List for Multi-Task Learning (focus on architectures and optimization for MTL)☆46Updated last year
- Code for 'Multi-level Logit Distillation' (CVPR2023)☆60Updated 6 months ago