JingzhaoZhang / why-clipping-acceleratesLinks

A pytorch implementation for the LSTM experiments in the paper: Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

☆46

Alternatives and similar repositories for why-clipping-accelerates

Users that are interested in why-clipping-accelerates are comparing it to the libraries listed below

Sorting:

ssnl / PyTorch-Reparam-Module
Reparameterize your PyTorch modules
☆71Updated 4 years ago
izmailovpavel / torch_swa_examples
☆47Updated 4 years ago
toiaydcdyywlhzvlob / backpack
This repository is no longer maintained. Check
☆81Updated 5 years ago
hongyanz / TRADES-smoothing
[JMLR] TRADES + random smoothing for certifiable robustness
☆14Updated 5 years ago
yaohungt / Adaptive-Regularization-Neural-Network
[NeurIPS'19] [PyTorch] Adaptive Regularization in NN
☆68Updated 6 years ago
ColinQiyangLi / LConvNet
Implementation of Methods Proposed in Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks (NeurIPS 2019)
☆35Updated 5 years ago
mpezeshki / Gradient_Starvation
Gradient Starvation: A Learning Proclivity in Neural Networks
☆61Updated 4 years ago
bamos / HowToTrainYourMAMLPytorch
The original code for the paper "How to train your MAML" along with a replication of the original "Model Agnostic Meta Learning" (MAML) p…
☆41Updated 5 years ago
j-min / Dropouts
PyTorch Implementations of Dropout Variants
☆88Updated 7 years ago
minhtannguyen / SRSGD
Code base for SRSGD.
☆28Updated 5 years ago
minyoungg / overparam
☆42Updated 2 years ago
asteroidhouse / self-tuning-networks
Code for Self-Tuning Networks (ICLR 2019) https://arxiv.org/abs/1903.03088
☆54Updated 6 years ago
RAIVNLab / supsup
Code for "Supermasks in Superposition"
☆124Updated 2 years ago
shwinshaker / LipGrow
An adaptive training algorithm for residual network
☆17Updated 5 years ago
moskomule / anatome
Ἀνατομή is a PyTorch library to analyze representation of neural networks
☆65Updated 4 months ago
k9k2 / qSGD
SGD and Ordered SGD codes for deep learning, SVM, and logistic regression
☆36Updated 5 years ago
uber-research / loss-change-allocation
☆61Updated 2 years ago
ucla-vision / information-dropout
Implementation of Information Dropout
☆39Updated 8 years ago
briancheung / superposition
☆45Updated 6 years ago
editable-ICLR2020 / editable
☆32Updated 6 years ago
prolearner / hypertorch
☆124Updated last year
sIncerass / powernorm
[ICML 2020] code for "PowerNorm: Rethinking Batch Normalization in Transformers" https://arxiv.org/abs/2003.07845
☆120Updated 4 years ago
yukimasano / linear-probes
Evaluating AlexNet features at various depths
☆40Updated 5 years ago
Cyanogenoid / fspool
[ICLR 2020] FSPool: Learning Set Representations with Featurewise Sort Pooling
☆41Updated 2 years ago
tbachlechner / ReZero-examples
PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"
☆62Updated last year
uclaml / Padam
Partially Adaptive Momentum Estimation method in the paper "Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep …
☆39Updated 2 years ago
revbucket / geometric-certificates
Geometric Certifications of Neural Nets
☆42Updated 2 years ago
choidami / sst
☆50Updated 5 years ago
wiseodd / last_layer_laplace
Last-layer Laplace approximation code examples
☆83Updated 4 years ago
chingyaoc / fair-mixup
ICLR 2021, Fair Mixup: Fairness via Interpolation
☆59Updated 4 years ago