zeke-xie / adaptive-inertia-adai

[ICML 2022, Oral] The PyTorch Implementation of Adaptive Inertia Methods. The algorithms are based on our paper: "Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum".

☆150

Alternatives and similar repositories for adaptive-inertia-adai

Users who are interested in adaptive-inertia-adai are comparing it to the libraries listed below.

xie-lab-ml / deep-learning-dynamics-paper-list
This is a list of peer-reviewed representative papers on deep learning dynamics (optimization dynamics of neural networks). The success o…
☆281Updated last year
zeke-xie / stable-weight-decay-regularization
[NeurIPS 2023] The PyTorch Implementation of Scheduled (Stable) Weight Decay.
☆60Updated last year
xxgege / GAM
The official repo for CVPR2023 highlight paper "Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization".
☆85Updated 2 years ago
ryanchankh / redunet_demo
☆84Updated 4 years ago
foocker / deeplearningtheory
☆260Updated 5 months ago
ryanchankh / mcr2
Official Implementation of Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction (2020)
☆200Updated 2 years ago
Ma-Lab-Berkeley / ReduNet
ReduNet
☆539Updated 3 years ago
AvivNavon / nash-mtl
Official implementation of "Multi-Task Learning as a Bargaining Game" [ICML 2022]
☆229Updated last month
Gsunshine / Enjoy-Hamburger
[ICLR 2021 top 3%] Is Attention Better Than Matrix Decomposition?
☆334Updated 2 years ago
zbh2047 / SortNet
[NeurIPS 2022] A novel 1-Lipschitz network that can be efficiently trained to achieve certified L-infinity robustness for free!
☆31Updated 2 years ago
transformer-vq / transformer_vq
☆196Updated last year
lucidrains / FLASH-pytorch
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
☆368Updated last year
zeke-xie / Positive-Negative-Momentum
[ICML 2021] The official PyTorch Implementations of Positive-Negative Momentum Optimizers.
☆28Updated 2 years ago
baofff / Extended-Analytic-DPM
Official implementation for Estimating the Optimal Covariance with Imperfect Mean in Diffusion Probabilistic Models (ICML 2022), and a re…
☆109Updated 3 years ago
bojone / Keras-DDPM
Keras implementation of generative diffusion models
☆301Updated 5 months ago
BIGBALLON / distribuuuu
The pure and clear PyTorch Distributed Training Framework.
☆275Updated last year
Cranial-XIX / FAMO
Official PyTorch Implementation for Fast Adaptive Multitask Optimization (FAMO)
☆94Updated last year
fkodom / fft-conv-pytorch
Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
☆501Updated last year
omihub777 / ViT-CIFAR
PyTorch implementation for Vision Transformer [Dosovitskiy, A. (ICLR'21)] modified to obtain over 90% accuracy FROM SCRATCH on CIFAR-10 wit…
☆198Updated last year
haofanwang / awesome-mlp-papers
Recent Advances in MLP-based Models (MLP is all you need!)
☆116Updated 2 years ago
miniHuiHui / awesome-high-order-neural-network
☆50Updated 10 months ago
baofff / Analytic-DPM
Code for the paper Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models (ICLR 2022 Outsta…
☆174Updated 3 years ago
zugexiaodui / torch_flops
A library for calculating the FLOPs in the forward() process based on torch.fx
☆124Updated 4 months ago
FlyEgle / MAE-pytorch
Masked Autoencoders Are Scalable Vision Learners
☆249Updated 2 years ago
xuzhiqin1990 / F-Principle
code to show F-Principle in the DNN training
☆60Updated 2 years ago
juntang-zhuang / GSAM
PyTorch repository for ICLR 2022 paper (GSAM) which improves generalization (e.g. +3.8% top-1 accuracy on ImageNet with ViT-B/32)
☆144Updated 2 years ago
serend1p1ty / core-pytorch-utils
Yet another PyTorch Trainer and some core components for deep learning.
☆222Updated last year
tsb0601 / EMP-SSL
This repository contains the implementation for the paper "EMP-SSL: Towards Self-Supervised Learning in One Training Epoch."
☆227Updated last year
scaomath / galerkin-transformer
[NeurIPS 2021] Galerkin Transformer: a linear attention without softmax for Partial Differential Equations
☆248Updated last year
nasimrahaman / SpectralBias
Code for "On the Spectral Bias of Neural Networks", to appear in ICML 2019 (Long Beach, CA).
☆108Updated 6 years ago