ryankiros / layer-norm
Code and models from the paper "Layer Normalization"
☆246 · Updated 8 years ago
Alternatives and similar repositories for layer-norm
Users who are interested in layer-norm are comparing it to the repositories listed below
- TensorFlow implementation of normalizations such as Layer Normalization, HyperNetworks. ☆111 · Updated 8 years ago
- Benchmarks for several RNN variations with different deep-learning frameworks ☆170 · Updated 6 years ago
- ☆167 · Updated 8 years ago
- Implementation of the paper [Using Fast Weights to Attend to the Recent Past](https://arxiv.org/abs/1610.06258) ☆172 · Updated 8 years ago
- Efficient layer normalization GPU kernel for Tensorflow ☆111 · Updated 8 years ago
- ☆64 · Updated 8 years ago
- Torch7 implementation of Grid LSTM as described here: http://arxiv.org/pdf/1507.01526v2.pdf ☆187 · Updated 9 years ago
- Batch-Normalized LSTM (Recurrent Batch Normalization) implementation in Torch. ☆90 · Updated 9 years ago
- Implementation of http://arxiv.org/abs/1511.05641 that lets one build a larger net starting from a smaller one. ☆159 · Updated 8 years ago
- Lasagne code for weight normalization ☆88 · Updated 9 years ago
- auto-tuning momentum SGD optimizer ☆288 · Updated 6 years ago
- ☆137 · Updated 7 years ago
- Batch normalized LSTM for tensorflow ☆179 · Updated 8 years ago
- Structured Prediction Energy Networks in Torch ☆132 · Updated 8 years ago
- Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 ☆331 · Updated 8 years ago
- Recreating the Deep Residual Network in Lasagne ☆117 · Updated 9 years ago
- An implementation of the RL-NTM from http://arxiv.org/abs/1505.00521 ☆157 · Updated 9 years ago
- Multi-GPU mini-framework for Theano ☆196 · Updated 7 years ago
- Review Network for Caption Generation ☆181 · Updated 7 years ago
- Study of HeXA@UNIST in Preparation for Submission ☆106 · Updated 9 years ago
- Torch implementation of DRAW: A Recurrent Neural Network For Image Generation ☆135 · Updated 9 years ago
- bidirectional lstm ☆152 · Updated 9 years ago
- Implementations of "LSTM: A Search Space Odyssey" variants and their training results on the PTB dataset. ☆95 · Updated 8 years ago
- ☆121 · Updated 8 years ago
- ByteNet for character-level language modelling ☆319 · Updated 7 years ago
- ☆89 · Updated 8 years ago
- ☆88 · Updated 10 years ago
- Deep Unsupervised Perceptual Grouping ☆131 · Updated 4 years ago
- Implementation of the DRAW network in lasagne ☆200 · Updated 9 years ago
- 🏃 Implementation of Using Fast Weights to Attend to the Recent Past. ☆269 · Updated 6 years ago