cghezhang / AdamW
https://arxiv.org/abs/1711.05101
☆17Updated 6 years ago
Alternatives and similar repositories for AdamW:
Users who are interested in AdamW are comparing it to the libraries listed below:
- Highway networks implemented in PyTorch.☆56Updated 8 years ago
- Source code for "Accelerating Neural Transformer via an Average Attention Network"☆78Updated 5 years ago
- ☆26Updated 5 years ago
- Reversible Recurrent Neural Network Pytorch Implementation☆21Updated 7 years ago
- Attention is All You Need in Sonnet☆38Updated 7 years ago
- ☆74Updated 7 years ago
- ☆42Updated 6 years ago
- Training RNNs as Fast as CNNs (Simple Recurrent Unit)☆30Updated 7 years ago
- MXNet implementation of an ICLR 2018 paper: A new method of region embedding for text classification.☆10Updated 6 years ago
- A Toolkit for Training, Tracking, Saving Models and Syncing Results☆61Updated 5 years ago
- ☆11Updated 6 years ago
- An unofficial PyTorch implementation of the HAN and AdaHAN models presented in the "Learning Visual Question Answering by Bootstrapping H…☆54Updated 6 years ago
- For visual commonsense model☆34Updated 6 years ago
- Official code of our work, Robust, Transferable Sentence Representations for Text Classification [Arxiv 2018].☆21Updated 6 years ago
- Generative Adversarial Networks in Neural Machine Translation☆57Updated 7 years ago
- A TensorFlow implementation of Yin Wenpeng's TACL paper "Attentive Convolution"☆33Updated 6 years ago
- PyTorch study notes☆8Updated 6 years ago
- An Implementation of Bidirectional Attention Flow☆40Updated 7 years ago
- Tensorflow Implementation of Relation Networks for the bAbI QA Task, detailed in "A Simple Neural Network Module for Relational Reasoning…☆49Updated 7 years ago
- Bi-Directional Block Self-Attention☆123Updated 6 years ago
- A Better Way to Attend: Attention with Trees for Video Question Answering☆25Updated 6 years ago
- Training RNNs as fast as CNNs. An unofficial tensorflow implementation.☆32Updated 7 years ago
- A PyTorch Implementation of "Quasi-Recurrent Neural Networks"☆46Updated 7 years ago
- 👾 A library of state-of-the-art pretrained models for Natural Language Processing (NLP)☆9Updated 5 years ago
- Adafactor optimizer for Keras☆20Updated 3 years ago
- A PyTorch implementation of the Match-LSTM question answering model☆43Updated 11 months ago
- Sequential Matching Network implemented in MXNet☆18Updated 6 years ago
- PyTorch implementation of Attention-over-Attention Neural Networks for Reading Comprehension☆60Updated 7 years ago
- Code for EMNLP 2018 paper https://arxiv.org/pdf/1808.09075.pdf☆38Updated 6 years ago
- (Beta Version!) Experiment Code for Paper "CoT: Cooperative Training for Generative Modeling of Discrete Data"☆72Updated 5 years ago