lonePatient / lookahead_pytorch

PyTorch implementation of the Lookahead Optimizer

☆192

Alternatives and similar repositories for lookahead_pytorch

Users who are interested in lookahead_pytorch are comparing it to the libraries listed below.

alphadl / lookahead.pytorch
lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch
☆337 Updated 5 years ago
egg-west / AdamW-pytorch
Implementation and experiments for AdamW on Pytorch
☆94 Updated 5 years ago
majumderb / rezero
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆410 Updated last year
google / bi-tempered-loss
Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. https://arxiv.org/pdf/1906.03361.pdf
☆147 Updated 3 years ago
lessw2020 / mish
Mish Deep Learning Activation Function for PyTorch / FastAI
☆161 Updated 5 years ago
pytorch / contrib
Implementations of ideas from recent papers
☆392 Updated 4 years ago
mgrankin / over9000
Over9000 optimizer
☆426 Updated 2 years ago
michaelrzhang / lookahead
Implementation for the Lookahead Optimizer.
☆241 Updated 3 years ago
cybertronai / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆376 Updated 4 years ago
sdoria / SimpleSelfAttention
A simpler version of the self-attention layer from SAGAN, and some image classification results.
☆212 Updated 5 years ago
eladhoffer / utils.pytorch
Utilities for Pytorch
☆89 Updated 2 years ago
lessw2020 / Ranger-Mish-ImageWoof-5
Repo to build on / reproduce the record breaking Ranger-Mish-SelfAttention setup on FastAI ImageWoof dataset 5 epochs
☆116 Updated 5 years ago
vfdev-5 / UDA-pytorch
Unsupervised Data Augmentation experiments in PyTorch
☆60 Updated 6 years ago
mpyrozhok / adamwr
Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neu…
☆150 Updated 6 years ago
digantamisra98 / EvoNorm
Unofficial PyTorch Implementation of EvoNorm
☆122 Updated 3 years ago
XuezheMax / apollo
Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization
☆182 Updated 3 years ago
Mrpatekful / swats
Unofficial implementation of Switching from Adam to SGD optimization in PyTorch.
☆66 Updated 2 years ago
sgugger / Adam-experiments
Experiments with Adam/AdamW/amsgrad
☆202 Updated 6 years ago
karanchahal / distiller
A large scale study of Knowledge Distillation.
☆220 Updated 5 years ago
justheuristic / prefetch_generator
Simple package that makes your generator work in background thread
☆280 Updated 3 years ago
anandsaha / pytorch.cyclic.learning.rate
Using the CLR algorithm for training (https://arxiv.org/abs/1506.01186)
☆108 Updated 7 years ago
oval-group / smooth-topk
Smooth Loss Functions for Deep Top-k Classification
☆255 Updated 3 years ago
loshchil / AdamW-and-SGDW
Decoupled Weight Decay Regularization (ICLR 2019)
☆277 Updated 6 years ago
galatolofederico / pytorch-balanced-batch
A pytorch dataset sampler for always sampling balanced batches.
☆115 Updated 4 years ago
tbachlechner / ReZero-examples
PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆61 Updated last year
prigoyal / pytorch_memonger
Experimental ground for optimizing memory of pytorch models
☆366 Updated 7 years ago
csrhddlam / pytorch-checkpoint
☆165 Updated 6 years ago
PhilJd / contiguous_pytorch_params
Accelerate training by storing parameters in one contiguous chunk of memory.
☆290 Updated 4 years ago
ymcui / LAMB_Optimizer_TF
LAMB Optimizer for Large Batch Training (TensorFlow version)
☆120 Updated 5 years ago
bojone / keras_lookahead
lookahead optimizer for keras
☆170 Updated 5 years ago