facebookresearch / adaptive-softmax
Implements an efficient softmax approximation as described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309)
☆395 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for adaptive-softmax
- Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 ☆332 · Updated 7 years ago
- Code and models from the paper "Layer Normalization" ☆245 · Updated 8 years ago
- ☆165 · Updated 8 years ago
- Language Modeling ☆156 · Updated 5 years ago
- ☆394 · Updated 6 years ago
- Write PyTorch code at the level of individual examples, then run it efficiently on minibatches. ☆485 · Updated 2 years ago
- ByteNet for character-level language modelling ☆319 · Updated 7 years ago
- This library provides utilities for creating and manipulating RNNs to model sequential data. ☆192 · Updated 7 years ago
- auto-tuning momentum SGD optimizer ☆287 · Updated 5 years ago
- Benchmarks for several RNN variations with different deep-learning frameworks ☆169 · Updated 5 years ago
- Tools for PyTorch ☆221 · Updated 2 years ago
- Recurrent Highway Networks - Implementations for Tensorflow, Torch7, Theano and Brainstorm ☆404 · Updated 5 years ago
- ☆144 · Updated 7 years ago
- ☆168 · Updated 8 years ago
- This is a self contained software accompanying the paper titled: Learning Longer Memory in Recurrent Neural Networks: http://arxiv.org/ab… ☆169 · Updated 6 years ago
- Gated Attention Reader for Text Comprehension ☆185 · Updated 6 years ago
- Examples and scripts using Blocks ☆147 · Updated 8 years ago
- ☆472 · Updated 2 years ago
- TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks". ☆273 · Updated 7 years ago
- Adaptive Computation Time algorithm in Tensorflow ☆255 · Updated 7 years ago
- Example code for Weight Normalization, from "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Netw… ☆365 · Updated 5 years ago
- Standalone TensorBoard for visualizing in deep learning ☆370 · Updated 4 years ago
- Multi-GPU mini-framework for Theano ☆195 · Updated 7 years ago
- nmtpy is a Python framework based on dl4mt-tutorial to experiment with Neural Machine Translation pipelines. ☆126 · Updated 6 years ago
- Dynamic evaluation for pytorch language models, now includes hyperparameter tuning ☆105 · Updated 6 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆186 · Updated 7 years ago
- Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer. ☆313 · Updated 7 years ago
- Code for the Eager Translation Model from the paper "You May Not Need Attention" ☆293 · Updated 5 years ago
- Cleaned original source code from my NIPS publication ☆154 · Updated 6 years ago