Mrpatekful / swats
Unofficial implementation of Switching from Adam to SGD optimization in PyTorch.
☆66Updated 2 years ago
Alternatives and similar repositories for swats
Users who are interested in swats are comparing it to the repositories listed below
- PyTorch implementation of Lookahead Optimizer☆190Updated 2 years ago
- Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"☆99Updated 4 years ago
- Implements https://arxiv.org/abs/1711.05101 AdamW optimizer, cosine learning rate scheduler and "Cyclical Learning Rates for Training Neu…☆149Updated 5 years ago
- Implementations of Recent Papers in Computer Vision☆38Updated 2 years ago
- Multi-Task Learning Framework on PyTorch. State-of-the-art methods are implemented to effectively train models on multiple tasks.☆149Updated 6 years ago
- Unofficial PyTorch Implementation of EvoNorm☆122Updated 3 years ago
- This is my demo of Chen et al. "GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks" ICML 2018☆178Updated 3 years ago
- Implementation and experiments for AdamW on Pytorch☆94Updated 5 years ago
- Implementation from the paper Attention Augmented Convolutional Networks in Tensorflow (https://arxiv.org/pdf/1904.09925v1.pdf)☆46Updated 6 years ago
- lookahead optimizer (Lookahead Optimizer: k steps forward, 1 step back) for pytorch☆336Updated 5 years ago
- Loss and accuracy go opposite ways...right?☆93Updated 5 years ago
- Utilities for Pytorch☆89Updated 2 years ago
- PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"☆62Updated 10 months ago
- A pytorch dataset sampler for always sampling balanced batches.☆115Updated 4 years ago
- [ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845☆120Updated 3 years ago
- [ICML 2020] code for the flooding regularizer proposed in "Do We Need Zero Training Loss After Achieving Zero Training Error?"☆92Updated 2 years ago
- The official implementation of the paper "DIANet: Dense-and-Implicit-Attention-Network".☆102Updated last year
- Full implementation of the paper "Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator".☆101Updated 5 years ago
- An implementation of shampoo☆74Updated 7 years ago
- diffGrad: An Optimization Method for Convolutional Neural Networks☆55Updated 2 years ago
- Code for CVPR 2019 paper "Deep Metric Learning to Rank"☆96Updated 3 years ago
- A Pytorch implementation of "LegoNet: Efficient Convolutional Neural Networks with Lego Filters" (ICML 2019).☆140Updated 4 years ago
- Implementation of Sparsemax activation in Pytorch☆160Updated 5 years ago
- Implementation of Online Label Smoothing in PyTorch☆94Updated 2 years ago
- Torch implementation of the paper "ShakeDrop regularization" (https://arxiv.org/abs/1802.02375).☆21Updated 7 years ago
- Bootstrapping loss function implementation in pytorch☆36Updated 4 years ago
- Official implementation of Auxiliary Learning by Implicit Differentiation [ICLR 2021]☆84Updated 10 months ago
- The official implementation of the paper "Instance Enhancement Batch Normalization: an Adaptive Regulator of Batch Noise".☆40Updated last year
- [ICLR'20] [PyTorch] Inverted Attention Routing for Capsules☆30Updated 5 years ago
- DataLoader subclass for PyTorch to work with HDF5 files.☆49Updated 6 years ago