loshchil / AdamW-and-SGDW
Decoupled Weight Decay Regularization (ICLR 2019)
☆288 · Jan 9, 2019 · Updated 7 years ago
Alternatives and similar repositories for AdamW-and-SGDW
Users who are interested in AdamW-and-SGDW are comparing it to the repositories listed below.
- ☆255 · Nov 23, 2016 · Updated 9 years ago
- keras implementation of AdamW from Fixing Weight Decay Regularization in Adam (https://arxiv.org/abs/1711.05101) ☆71 · Jul 23, 2018 · Updated 7 years ago
- Experiments with Adam/AdamW/amsgrad ☆201 · Sep 5, 2018 · Updated 7 years ago
- Partially Adaptive Momentum Estimation method in the paper "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep … ☆39 · Apr 13, 2023 · Updated 2 years ago
- Code for "Aggregated Momentum: Stability Through Passive Damping", Lucas et al. 2018 ☆35 · Nov 6, 2018 · Updated 7 years ago
- ☆23 · Nov 24, 2018 · Updated 7 years ago
- 2.86% and 15.85% on CIFAR-10 and CIFAR-100 ☆297 · Oct 9, 2018 · Updated 7 years ago
- Unsupervised instance segmentation via active robot interaction ☆76 · Jul 1, 2022 · Updated 3 years ago
- i-RevNet Pytorch Code ☆396 · Feb 16, 2021 · Updated 5 years ago
- On the Variance of the Adaptive Learning Rate and Beyond ☆2,549 · Jul 31, 2021 · Updated 4 years ago
- ☆13 · Jul 31, 2018 · Updated 7 years ago
- Code for visualizing the loss landscape of neural nets ☆3,153 · Apr 5, 2022 · Updated 3 years ago
- TensorFlow implementation of (Momentum) Stochastic Variance-Adapted Gradient. ☆44 · May 11, 2018 · Updated 7 years ago
- AdamW optimizer for Keras ☆116 · Aug 9, 2019 · Updated 6 years ago
- Full implementation of the paper "Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator". ☆101 · Mar 9, 2020 · Updated 5 years ago
- Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization ☆182 · Nov 21, 2021 · Updated 4 years ago
- Small scale experiments with group normalization ☆58 · Apr 4, 2018 · Updated 7 years ago
- Distributed Learning by Pair-Wise Averaging ☆52 · Oct 31, 2017 · Updated 8 years ago
- ☆219 · May 23, 2018 · Updated 7 years ago
- Pytorch implementation of MaxPoolingLoss. ☆177 · Jun 9, 2018 · Updated 7 years ago
- CondenseNet: Light weighted CNN for mobile devices ☆691 · Nov 11, 2019 · Updated 6 years ago
- Implementation of Adversarial Variational Optimization in PyTorch ☆43 · Aug 7, 2018 · Updated 7 years ago
- Code for paper "Which Training Methods for GANs do actually Converge? (ICML 2018)" ☆921 · Aug 27, 2019 · Updated 6 years ago
- A Python implementation of a graph-based parser for Abstract Meaning Representation (AMR) ☆11 · Feb 2, 2018 · Updated 8 years ago
- Unofficial Pytorch implementation of the paper Filter Response Normalization. ☆19 · Dec 9, 2019 · Updated 6 years ago
- Code for Switchable Normalization from "Differentiable Learning-to-Normalize via Switchable Normalization", https://arxiv.org/abs/1806.10… ☆870 · Jun 11, 2020 · Updated 5 years ago
- An implementation of shampoo ☆77 · Mar 9, 2018 · Updated 7 years ago
- PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning" ☆347 · Dec 7, 2021 · Updated 4 years ago
- ☆29 · Feb 6, 2018 · Updated 8 years ago
- Lua implementation of Entropy-SGD ☆81 · Apr 9, 2018 · Updated 7 years ago
- Implementation and experiments for AdamW on Pytorch ☆94 · Nov 23, 2019 · Updated 6 years ago
- A PyTorch implementation of the paper Mixup: Beyond Empirical Risk Minimization in PyTorch ☆124 · Jan 8, 2018 · Updated 8 years ago
- A PyTorch Implementation of Single Shot Scale-invariant Face Detector. ☆232 · Oct 28, 2019 · Updated 6 years ago
- ☆12 · Sep 26, 2019 · Updated 6 years ago
- Differentiable Neural Computers, Sparse Access Memory and Sparse Differentiable Neural Computers, for Pytorch ☆347 · Jan 8, 2026 · Updated last month
- A plug-in replacement for DataLoader to load Imagenet disk-sequentially in PyTorch. ☆239 · Aug 18, 2021 · Updated 4 years ago
- Synchronized Multi-GPU Batch Normalization ☆222 · May 2, 2019 · Updated 6 years ago
- Manifold-Mixup implementation for fastai V1 ☆19 · Oct 1, 2020 · Updated 5 years ago
- Code for "Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation" ☆26 · Oct 30, 2019 · Updated 6 years ago