facebookresearch / adaptive-softmax
Implements the efficient softmax approximation described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309).
☆395 · Updated 5 years ago
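For context, PyTorch now ships a reimplementation of this technique as `torch.nn.AdaptiveLogSoftmaxWithLoss`; the repository above is the original Torch/Lua code. Below is a minimal sketch of the idea using that module. The vocabulary size, cutoffs, and dimensions are illustrative assumptions, not values from the paper or the repository.

```python
# Minimal sketch of adaptive softmax via PyTorch's built-in
# nn.AdaptiveLogSoftmaxWithLoss (not code from this repository).
import torch
import torch.nn as nn

hidden_dim = 512           # size of the hidden state fed into the softmax (assumed)
vocab_size = 100_000       # large output vocabulary, sorted by word frequency (assumed)
cutoffs = [2_000, 20_000]  # frequency-based cluster boundaries: head + two tails (assumed)

# Frequent words stay in the "head" cluster and are scored at full
# dimensionality; rare words are pushed into low-rank tail clusters,
# which is where the compute and memory savings come from.
adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden_dim,
    n_classes=vocab_size,
    cutoffs=cutoffs,
    div_value=4.0,         # tail k projects down to hidden_dim / 4**k dimensions
)

hidden = torch.randn(32, hidden_dim)           # a batch of 32 hidden states
targets = torch.randint(0, vocab_size, (32,))  # gold next-word indices

out = adaptive_softmax(hidden, targets)
print(out.output.shape)  # (32,): log-probability of each example's target word
print(out.loss)          # mean negative log-likelihood over the batch
```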
Alternatives and similar repositories for adaptive-softmax:
Users interested in adaptive-softmax are comparing it to the libraries listed below.
- Mixed Incremental Cross-Entropy REINFORCE (ICLR 2016) ☆332 · Updated 7 years ago
- Code and models from the paper "Layer Normalization" ☆245 · Updated 8 years ago
- ☆165 · Updated 8 years ago
- Language Modeling ☆156 · Updated 5 years ago
- Self-contained software accompanying the paper "Learning Longer Memory in Recurrent Neural Networks": http://arxiv.org/ab… ☆169 · Updated 6 years ago
- Auto-tuning momentum SGD optimizer ☆287 · Updated 5 years ago
- Benchmarks for several RNN variations with different deep-learning frameworks ☆169 · Updated 5 years ago
- ByteNet for character-level language modelling ☆319 · Updated 7 years ago
- Recurrent Highway Networks - Implementations for TensorFlow, Torch7, Theano and Brainstorm ☆404 · Updated 5 years ago
- This library provides utilities for creating and manipulating RNNs to model sequential data. ☆192 · Updated 7 years ago
- ☆395 · Updated 6 years ago
- ☆168 · Updated 8 years ago
- ☆144 · Updated 7 years ago
- Tools for PyTorch ☆221 · Updated 2 years ago
- ☆617 · Updated 7 years ago
- TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks" ☆273 · Updated 7 years ago
- OptNet - Reducing memory usage in Torch neural nets ☆282 · Updated 7 years ago
- Cleaned original source code from my NIPS publication ☆154 · Updated 7 years ago
- Standalone TensorBoard for visualization in deep learning ☆371 · Updated 4 years ago
- ☆473 · Updated 2 years ago
- Adaptive Computation Time algorithm in TensorFlow ☆255 · Updated 7 years ago
- Write PyTorch code at the level of individual examples, then run it efficiently on minibatches. ☆484 · Updated 2 years ago
- Multi-GPU mini-framework for Theano ☆195 · Updated 7 years ago
- nmtpy is a Python framework based on dl4mt-tutorial to experiment with Neural Machine Translation pipelines. ☆126 · Updated 6 years ago
- Code for Structured Attention Networks (https://arxiv.org/abs/1702.00887) ☆238 · Updated 7 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆186 · Updated 7 years ago
- Examples and scripts using Blocks ☆147 · Updated 8 years ago
- A Chainer implementation of the Transformer from "Attention Is All You Need" (Vaswani et al., 2017) ☆314 · Updated 7 years ago
- Dynamic evaluation for PyTorch language models, now including hyperparameter tuning ☆105 · Updated 7 years ago
- A Neural Turing Machine implementation in Torch ☆279 · Updated 9 years ago