Mrpatekful / swats
Unofficial implementation of Switching from Adam to SGD optimization in PyTorch.
☆68Updated 3 years ago
Alternatives and similar repositories for swats
Users who are interested in swats are comparing it to the libraries listed below
- PyTorch implementation of Lookahead Optimizer☆195Updated 3 years ago
- Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neu…☆153Updated 6 years ago
- Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"☆99Updated 5 years ago
- This is my demo of Chen et al. "GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks", ICML 2018☆181Updated 4 years ago
- Robust Bi-Tempered Logistic Loss Based on Bregman Divergences. https://arxiv.org/pdf/1906.03361.pdf☆147Updated 4 years ago
- lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch☆338Updated 6 years ago
- Utilities for Pytorch☆88Updated 3 years ago
- Implementation and experiments for AdamW on Pytorch☆94Updated 6 years ago
- Useful PyTorch functions and modules that are not implemented in PyTorch by default☆190Updated last year
- PyTorch implementation of the basic k-means algorithm (Lloyd's method with Forgy initialization) with GPU support☆94Updated 7 years ago
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"☆62Updated last year
- ☆262Updated 6 years ago
- Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"☆416Updated last year
- Loss and accuracy go opposite ways...right?☆95Updated 5 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845☆120Updated 4 years ago
- ☆148Updated 4 years ago
- A TensorFlow re-implementation of Momentum Contrast (MoCo): https://arxiv.org/abs/1911.05722☆159Updated 2 years ago
- ☆165Updated 7 years ago
- Multi-Task Learning Framework on PyTorch. State-of-the-art methods are implemented to effectively train models on multiple tasks.☆150Updated 6 years ago
- Unofficial PyTorch Implementation of EvoNorm☆123Updated 4 years ago
- A PyTorch implementation of the 1d and 2d Sinusoidal positional encoding/embedding.☆261Updated 5 years ago
- Feature extraction made simple with torchextractor☆101Updated 4 years ago
- Bootstrapping loss function implementation in pytorch☆36Updated 5 years ago
- A pytorch dataset sampler for always sampling balanced batches.☆118Updated 5 years ago
- Mish Deep Learning Activation Function for PyTorch / FastAI☆161Updated 5 years ago
- Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization☆182Updated 4 years ago
- Learning Rate Warmup in PyTorch☆415Updated 7 months ago
- [ICML 2020] code for the flooding regularizer proposed in "Do We Need Zero Training Loss After Achieving Zero Training Error?"☆95Updated 3 years ago
- Framework for creating (partially) reversible neural networks with PyTorch☆155Updated 3 years ago
- homura is a library for fast prototyping DL research☆106Updated 3 years ago