soskek / attention_is_all_you_need
Transformer from "Attention Is All You Need" (Vaswani et al. 2017), implemented in Chainer.
☆314 Updated 7 years ago
Related projects
Alternatives and complementary repositories for attention_is_all_you_need
- Language Modeling ☆156 Updated 5 years ago
- Batch normalized LSTM for tensorflow ☆180 Updated 8 years ago
- Recurrent Highway Networks - Implementations for Tensorflow, Torch7, Theano and Brainstorm ☆404 Updated 5 years ago
- ByteNet for character-level language modelling ☆319 Updated 7 years ago
- TensorFlow implementation of normalizations such as Layer Normalization, HyperNetworks. ☆112 Updated 8 years ago
- QRNN implementation for TensorFlow ☆238 Updated last year
- End-To-End Memory Network using Tensorflow ☆342 Updated 7 years ago
- Tensorflow implementation of "Language Modeling with Gated Convolutional Networks" ☆272 Updated 7 years ago
- attention model for entailment on SNLI corpus implemented in Tensorflow and Keras ☆178 Updated 8 years ago
- ☆165 Updated 8 years ago
- Tensorflow implementation for DilatedRNN ☆346 Updated 7 years ago
- Sequence-to-Sequence learning using PyTorch ☆523 Updated 5 years ago
- Dynamic Memory Networks (https://arxiv.org/abs/1603.01417) in Tensorflow ☆240 Updated 8 years ago
- A tensorflow implementation of Fairseq Convolutional Sequence to Sequence Learning (Gehring et al. 2017) ☆303 Updated 7 years ago
- ☆218 Updated 9 years ago
- Mixed Incremental Cross-Entropy REINFORCE ICLR 2016 ☆332 Updated 7 years ago
- Implements an efficient softmax approximation as described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/… ☆395 Updated 5 years ago
- Gated Attention Reader for Text Comprehension ☆186 Updated 6 years ago
- A neural conversation model ☆139 Updated 8 years ago
- Adaptive Computation Time algorithm in Tensorflow ☆255 Updated 7 years ago
- ☆137 Updated 7 years ago
- in progress ☆188 Updated 6 years ago
- Tensorflow based Neural Conversation Models ☆191 Updated 7 years ago
- ☆168 Updated 8 years ago
- fairseq: Convolutional Sequence to Sequence Learning (Gehring et al. 2017), implemented in Chainer ☆65 Updated 7 years ago
- TensorFlow implementation of "Tracking the World State with Recurrent Entity Networks". ☆273 Updated 7 years ago
- Quasi-recurrent Neural Networks for Keras ☆76 Updated 7 years ago
- Code to train state-of-the-art Neural Machine Translation systems. ☆105 Updated 8 years ago
- Code for Structured Attention Networks https://arxiv.org/abs/1702.00887 ☆238 Updated 7 years ago