zeke-xie / stable-weight-decay-regularization
[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
☆59 Updated last year
Alternatives and similar repositories for stable-weight-decay-regularization:
Users who are interested in stable-weight-decay-regularization are comparing it to the libraries listed below.
- [ICML 2021] The official PyTorch Implementations of Positive-Negative Momentum Optimizers. ☆28 Updated 2 years ago
- [Neural Computation, MIT Press] The PyTorch Implementation of Variable Optimizers/ Neural Variable Risk Minimization proposed in our Neur… ☆33 Updated 3 years ago
- ResMLP: Feedforward networks for image classification with data-efficient training ☆42 Updated 3 years ago
- [ICLR 2022] "Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice" by Peihao Wang, Wen… ☆79 Updated last year
- [ICML 2022, Oral] The PyTorch Implementation of Adaptive Inertia Methods. The algorithms are based on our paper: "Adaptive Inertia: Dise… ☆145 Updated last year
- A collection of differentiable SVD methods and ICCV21 "Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance P… ☆71 Updated last year
- Metrics for "Beyond neural scaling laws: beating power law scaling via data pruning" (NeurIPS 2022 Outstanding Paper Award) ☆55 Updated last year
- [ICLR 2024] Improving Convergence and Generalization Using Parameter Symmetries ☆29 Updated 8 months ago
- ☆61 Updated last year
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%) ☆23 Updated 10 months ago
- Repo for the paper "Extrapolating from a Single Image to a Thousand Classes using Distillation" ☆36 Updated 7 months ago
- A torch-based implementation of K-Means and K-Means++ ☆17 Updated 4 years ago
- This repo is for our paper: Normalization Techniques in Training DNNs: Methodology, Analysis and Application ☆84 Updated 3 years ago
- Official PyTorch implementation for the paper Minimizing Trajectory Curvature of ODE-based Generative Models, ICML 2023 ☆79 Updated this week
- ☆28 Updated 4 years ago
- [NeurIPS'22] What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective ☆36 Updated 2 years ago
- This is an official PyTorch/GPU implementation of SupMAE. ☆77 Updated 2 years ago
- [ICLR 2024] Official code for the paper 'Elucidating the Exposure Bias in Diffusion Models' ☆24 Updated 9 months ago
- Denoising Masked Autoencoders Help Robust Classification. ☆60 Updated last year
- Recent Advances in MLP-based Models (MLP is all you need!) ☆113 Updated 2 years ago
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation ☆12 Updated last year
- [ICLR 2022] Fast AdvProp ☆34 Updated 2 years ago
- Repository containing code for blockwise SSL training ☆28 Updated 4 months ago
- ☆41 Updated 2 years ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆91 Updated last year
- (CVPR 2022) Automated Progressive Learning for Efficient Training of Vision Transformers ☆25 Updated 2 years ago
- A repository for DenseSSMs ☆86 Updated 10 months ago
- Official code for "On Calibrating Diffusion Probabilistic Models" ☆28 Updated last year
- ☆35 Updated last year
- This is the official implementation for Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models. ☆24 Updated last year