facebookresearch / adaptive-softmax
Implements an efficient softmax approximation as described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/1609.04309)
☆395 Updated 6 years ago
Alternatives and similar repositories for adaptive-softmax
Users who are interested in adaptive-softmax compare it to the libraries listed below.
- Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 ☆331 Updated 8 years ago
- Language Modeling ☆156 Updated 5 years ago
- Write PyTorch code at the level of individual examples, then run it efficiently on minibatches. ☆484 Updated 3 years ago
- ☆165 Updated 8 years ago
- Code and models from the paper "Layer Normalization" ☆247 Updated 8 years ago
- ByteNet for character-level language modelling ☆319 Updated 7 years ago
- Recurrent Highway Networks - Implementations for Tensorflow, Torch7, Theano and Brainstorm ☆402 Updated 5 years ago
- ☆167 Updated 8 years ago
- ☆143 Updated 7 years ago
- Adaptive Computation Time algorithm in Tensorflow ☆256 Updated 8 years ago
- auto-tuning momentum SGD optimizer ☆288 Updated 6 years ago
- This library provides utilities for creating and manipulating RNNs to model sequential data. ☆191 Updated 7 years ago
- TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks". ☆273 Updated 7 years ago
- This is a self contained software accompanying the paper titled: Learning Longer Memory in Recurrent Neural Networks: http://arxiv.org/ab… ☆168 Updated 7 years ago
- Tools for PyTorch ☆222 Updated 2 years ago
- Benchmarks for several RNN variations with different deep-learning frameworks ☆169 Updated 6 years ago
- ☆395 Updated 6 years ago
- nmtpy is a Python framework based on dl4mt-tutorial to experiment with Neural Machine Translation pipelines. ☆125 Updated 7 years ago
- Transformer of "Attention Is All You Need" (Vaswani et al. 2017) by Chainer. ☆321 Updated 7 years ago
- C++/CUDA toolkit for training sequence and sequence-to-sequence models across multiple GPUs ☆186 Updated 8 years ago
- Examples and scripts using Blocks ☆148 Updated 8 years ago
- OptNet - Reducing memory usage in torch neural nets ☆283 Updated 8 years ago
- Code for Structured Attention Networks https://arxiv.org/abs/1702.00887 ☆238 Updated 8 years ago
- Multi-GPU mini-framework for Theano ☆195 Updated 7 years ago
- ☆473 Updated 3 years ago
- Dynamic evaluation for pytorch language models, now includes hyperparameter tuning ☆104 Updated 7 years ago
- Gated Attention Reader for Text Comprehension ☆188 Updated 7 years ago
- ☆617 Updated 8 years ago
- Sequence-to-Sequence learning using PyTorch ☆521 Updated 5 years ago
- Coherence + Recurrent Neural Network + Convolutional Neural Network ☆142 Updated 8 years ago