mknbv / adashift
AdaShift optimizer implementation in PyTorch
☆17 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for adashift
- MUSCO: MUlti-Stage COmpression of neural networks ☆71 · Updated 3 years ago
- FLOPs and other statistics COunter for Pytorch neural networks ☆23 · Updated 3 years ago
- Code for MSID, a Multi-Scale Intrinsic Distance for comparing generative models, studying neural networks, and more! ☆50 · Updated 5 years ago
- Theoretical Deep Learning: generalization ability ☆46 · Updated 4 years ago
- Pytorch implementation of Variational Dropout Sparsifies Deep Neural Networks ☆84 · Updated 2 years ago
- Skoltech 2017 NLA course ☆37 · Updated 6 years ago
- Compression schema for gradients of activations in backward pass ☆44 · Updated last year
- ☆11 · Updated 3 years ago
- ☆26 · Updated 3 years ago
- The Deep Weight Prior, ICLR 2019 ☆44 · Updated 3 years ago
- Very simple and short implementation of gradient boosting in 18 lines of code ☆9 · Updated 4 years ago
- Compression of NMT transformer model with tensor methods ☆46 · Updated 5 years ago
- Deep Generative Models course, 2021 ☆21 · Updated 2 years ago
- A fork of the official TPU models repo with fixes and a solution of the Kaggle Open Images 2019 Object Detection Challenge ☆49 · Updated 5 years ago
- ☆61 · Updated 4 years ago
- Greedy Bayesian Posterior Approximation with Deep Ensembles. A. Tiulpin and M. B. Blaschko. (2021) ☆11 · Updated 2 years ago
- NLA 2018 Skoltech course ☆51 · Updated 5 years ago
- Learning to Initialize Neural Networks for Stable and Efficient Training ☆136 · Updated 2 years ago
- Course "Theories of Deep Learning" ☆196 · Updated 5 years ago
- A course on Optimization Methods ☆150 · Updated 2 years ago
- Loss Patterns of Neural Networks ☆82 · Updated 3 years ago
- custom cuda kernel for {2, 3}d relative attention with pytorch wrapper ☆43 · Updated 4 years ago
- Implementations of quasi-hyperbolic optimization algorithms. ☆102 · Updated 4 years ago
- Utilities for Neural Network training ☆19 · Updated 3 years ago
- Implementation of Spectral Leakage and Rethinking the Kernel Size in CNNs in Pytorch ☆14 · Updated 3 years ago
- On the New method of Hessian-free second-order optimization ☆8 · Updated 4 years ago
- ☆47 · Updated 3 years ago
- Code for paper "SWALP: Stochastic Weight Averaging for Low-Precision Training". ☆62 · Updated 5 years ago
- Unofficial pytorch implementation of ReZero in ResNet ☆23 · Updated 4 years ago
- Presentations of the advanced topics in optimization ☆11 · Updated 5 years ago