lancopku / meProp
meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICML 2017)
☆110 Updated 2 years ago
Alternatives and similar repositories for meProp:
Users who are interested in meProp compare it to the libraries listed below.
- ☆64 Updated 7 years ago
- Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training o… ☆148 Updated 7 years ago
- Implement Decoupled Neural Interfaces using Synthetic Gradients in Pytorch ☆119 Updated 7 years ago
- A pytorch implementation of "Self-Normalizing Neural Networks" by Klambauer et al. (still beta) ☆59 Updated 7 years ago
- ☆75 Updated 7 years ago
- Implementation of ICLR 2017 paper "Loss-aware Binarization of Deep Networks" ☆18 Updated 5 years ago
- Source code for "Efficient Training of BERT by Progressively Stacking" ☆112 Updated 5 years ago
- Dynamic evaluation for pytorch language models, now includes hyperparameter tuning ☆105 Updated 7 years ago
- Training RNNs as Fast as CNNs (Simple Recurrent Unit) ☆30 Updated 7 years ago
- Code for SegTree Transformer (ICLR-RLGM 2019). ☆27 Updated 5 years ago
- Adaptive Softmax implementation for PyTorch ☆79 Updated 5 years ago
- Codes for "Towards Binary-Valued Gates for Robust LSTM Training". ☆76 Updated 6 years ago
- Sparse and structured neural attention mechanisms ☆223 Updated 4 years ago
- ☆79 Updated 6 years ago
- Pytorch implementation of bytenet from "Neural Machine Translation in Linear Time" paper ☆47 Updated 7 years ago
- Code for "EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis" https://arxiv.org/abs/1905.05934 ☆112 Updated 4 years ago
- Dynamic batching for variable length inputs. ☆20 Updated 7 years ago
- Highway networks implemented in PyTorch. ☆56 Updated 7 years ago
- ☆54 Updated 7 years ago
- Code for paper "Continual and Multi-Task Architecture Search (ACL 2019)" ☆41 Updated 5 years ago
- Cleaned original source code from my NIPS publication ☆154 Updated 7 years ago
- Implementation of "Learning with Random Learning Rates" in PyTorch. ☆102 Updated 5 years ago
- Implements pytorch code for the Accelerated SGD algorithm. ☆215 Updated 6 years ago
- Efficient Architecture Search by Network Transformation, in AAAI 2018 ☆170 Updated 5 years ago
- Compare outputs between layers written in Tensorflow and layers written in Pytorch ☆72 Updated 6 years ago
- Just-in-time Dynamic Batching with MXNet Gluon. ☆52 Updated 4 years ago
- Pytorch implementation of DeepMind's differentiable neural computer paper. ☆94 Updated 7 years ago
- A new kind of pooling layer for faster and sharper convergence ☆76 Updated 7 years ago