zeke-xie / stable-weight-decay-regularization
[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
☆58 Updated 11 months ago
Alternatives and similar repositories for stable-weight-decay-regularization:
Users who are interested in stable-weight-decay-regularization compare it with the repositories listed below.
- [ICML 2021] The official PyTorch Implementations of Positive-Negative Momentum Optimizers. ☆28 Updated 2 years ago
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆77 Updated last year
- [Neural Computation, MIT Press] The PyTorch Implementation of Variable Optimizers / Neural Variable Risk Minimization proposed in our Neur… ☆33 Updated 3 years ago
- ☆61 Updated last year
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%) ☆22 Updated 10 months ago
- Denoising Masked Autoencoders Help Robust Classification. ☆60 Updated last year
- Learning recognition/segmentation models without end-to-end training. 40%-60% less GPU memory footprint. Same training time. Better perfo… ☆90 Updated 2 years ago
- ResMLP: Feedforward networks for image classification with data-efficient training ☆42 Updated 3 years ago
- Code for "Implicit Normalizing Flows" (ICLR 2021 spotlight) ☆34 Updated 3 years ago
- This is an official PyTorch/GPU implementation of SupMAE. ☆77 Updated 2 years ago
- Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning" (NeurIPS 2022 Outstanding Paper Award) ☆55 Updated last year
- [NeurIPS'22] What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective ☆36 Updated 2 years ago
- This repo is for our paper: Normalization Techniques in Training DNNs: Methodology, Analysis and Application ☆84 Updated 3 years ago
- [ICLR 2024] Official code for the paper 'Elucidating the Exposure Bias in Diffusion Models' ☆24 Updated 8 months ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆113 Updated 2 years ago
- A collection of differentiable SVD methods and ICCV21 "Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance P… ☆71 Updated last year
- DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training (ICLR 2023) ☆30 Updated last year
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆91 Updated last year
- ☆57 Updated last year
- ☆21 Updated 2 years ago
- Official implementation for Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models (ICML 2022), and a re… ☆106 Updated 2 years ago
- [ICLR 2022]: Fast AdvProp ☆34 Updated 2 years ago
- (Unofficial) PyTorch implementation of the paper Early Convolutions Help Transformers See Better ☆43 Updated 3 years ago
- Code for ViTAS: Vision Transformer Architecture Search ☆52 Updated 3 years ago
- [ICLR 2021] "Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning" by Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, S… ☆23 Updated 3 years ago
- Repo for the paper "Extrapolating from a Single Image to a Thousand Classes using Distillation" ☆37 Updated 6 months ago
- [ICML 2022, Oral] The PyTorch Implementation of Adaptive Inertia Methods. The algorithms are based on our paper: "Adaptive Inertia: Dise… ☆145 Updated last year
- ☆41 Updated 3 years ago
- Repository containing code for blockwise SSL training ☆28 Updated 3 months ago
- ☆25 Updated 3 years ago