DSE-MSU / R-transformer
PyTorch implementation of R-Transformer. Some parts of the code are adapted from the implementations of TCN and the Transformer (a rough sketch of the layer idea is shown below).
☆227 · Updated 5 years ago
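The R-Transformer paper that this repository implements replaces positional embeddings with a short local RNN run over a sliding window ending at each position, followed by standard multi-head self-attention and a position-wise feed-forward network. The sketch below is a minimal illustration of that layer structure in PyTorch, assuming a GRU for the local RNN; the class names (`LocalRNN`, `RTransformerBlock`) and the `window_size` parameter are illustrative and do not mirror this repository's actual API.

```python
import torch
import torch.nn as nn

class LocalRNN(nn.Module):
    """Runs a short RNN over the sliding window ending at each position.

    Illustrative sketch of the local-RNN idea from the R-Transformer
    paper, not this repository's actual module.
    """
    def __init__(self, d_model, window_size):
        super().__init__()
        self.window_size = window_size
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)

    def forward(self, x):                       # x: (batch, seq_len, d_model)
        batch, seq_len, d_model = x.shape
        # Left-pad with zeros so every position has a full window of predecessors.
        pad = x.new_zeros(batch, self.window_size - 1, d_model)
        x_pad = torch.cat([pad, x], dim=1)
        # Gather the window ending at each position: (batch, seq_len, window, d_model).
        windows = x_pad.unfold(1, self.window_size, 1).permute(0, 1, 3, 2)
        windows = windows.reshape(batch * seq_len, self.window_size, d_model)
        _, h = self.rnn(windows)                # last hidden state of each window
        return h.squeeze(0).view(batch, seq_len, d_model)

class RTransformerBlock(nn.Module):
    """Local RNN, then multi-head self-attention, then a feed-forward net."""
    def __init__(self, d_model=64, n_heads=4, window_size=5):
        super().__init__()
        self.local_rnn = LocalRNN(d_model, window_size)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x):
        x = self.norm1(x + self.local_rnn(x))   # local recurrence + residual
        attn_out, _ = self.attn(x, x, x)        # global multi-head attention
        x = self.norm2(x + attn_out)
        return self.norm3(x + self.ff(x))

# Usage: a random batch of 8 sequences, length 20, model dim 64.
block = RTransformerBlock()
out = block(torch.randn(8, 20, 64))
print(out.shape)  # torch.Size([8, 20, 64])
```

In this sketch, unfolding the padded input into per-position windows lets a single GRU call process all local windows in parallel rather than looping over positions; the block is shape-preserving, like a standard Transformer encoder layer.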
Alternatives and similar repositories for R-transformer:
Users interested in R-transformer are comparing it to the libraries listed below.
- Minimal RNN classifier with self-attention in PyTorch ☆150 · Updated 3 years ago
- An LSTM in PyTorch with best practices (weight dropout, forget bias, etc.) built in. Fully compatible with PyTorch LSTM. ☆132 · Updated 5 years ago
- LAnguage Modelling Benchmarks ☆137 · Updated 4 years ago
- ☆213 · Updated 4 years ago
- Implementation of Universal Transformer in PyTorch ☆259 · Updated 6 years ago
- ☆83 · Updated 5 years ago
- ☆76 · Updated 4 years ago
- [ICLR'19] Trellis Networks for Sequence Modeling ☆471 · Updated 5 years ago
- Multi-head attention in PyTorch ☆150 · Updated 5 years ago
- Fully featured implementation of Routing Transformer ☆288 · Updated 3 years ago
- Code for Multi-Head Attention: Collaborate Instead of Concatenate