mknbv / adashift
AdaShift optimizer implementation in PyTorch
☆16 Updated 5 years ago
Related projects:
- Code for MSID, a Multi-Scale Intrinsic Distance for comparing generative models, studying neural networks, and more! ☆49 Updated 5 years ago
- MUSCO: MUlti-Stage COmpression of neural networks ☆73 Updated 3 years ago
- Skoltech 2017 NLA course ☆37 Updated 5 years ago
- FLOPs and other statistics COunter for Pytorch neural networks ☆23 Updated 3 years ago
- Theoretical Deep Learning: generalization ability ☆46 Updated 4 years ago
- NLA 2018 Skoltech course ☆51 Updated 5 years ago
- Compression schema for gradients of activations in backward pass ☆43 Updated last year
- A fork of the official TPU models repo with fixes and a solution of the Kaggle Open Images 2019 Object Detection Challenge ☆49 Updated 4 years ago
- Learning to Initialize Neural Networks for Stable and Efficient Training ☆134 Updated 2 years ago
- Pytorch implementation of Variational Dropout Sparsifies Deep Neural Networks ☆83 Updated 2 years ago
- Presentations of the advanced topics in optimization ☆11 Updated 4 years ago
- Implementations of quasi-hyperbolic optimization algorithms. ☆100 Updated 4 years ago
- Utilities for Neural Network training ☆19 Updated 3 years ago
- model-in-the-loop ☆42 Updated 5 years ago
- ☆26 Updated 3 years ago
- Greedy Bayesian Posterior Approximation with Deep Ensembles. A. Tiulpin and M. B. Blaschko. (2021) ☆11 Updated 2 years ago
- Course "Theories of Deep Learning" ☆196 Updated 4 years ago
- Simple implementation of the LSUV initialization in PyTorch ☆58 Updated 8 months ago
- The Deep Weight Prior, ICLR 2019 ☆44 Updated 3 years ago
- ☆21 Updated 2 years ago
- [AAAI 2020 Oral] Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution ☆36 Updated 3 years ago
- Deep Generative Models course, 2021 ☆20 Updated 2 years ago
- ☆46 Updated 3 years ago
- Very simple and short implementation of gradient boosting in 18 lines of code ☆9 Updated 4 years ago
- ☆11 Updated 3 years ago
- ☆66 Updated 6 years ago
- Modification of PyTorch implementation of YOLOv3 Object Detection. ☆17 Updated 4 years ago
- On the New method of Hessian-free second-order optimization ☆8 Updated 4 years ago
- Compression of NMT transformer model with tensor methods ☆46 Updated 5 years ago
- Code for paper "SWALP: Stochastic Weight Averaging for Low-Precision Training". ☆62 Updated 5 years ago