RMichaelSwan / MogrifierLSTMLinks

A quick walk-through of the innards of LSTMs and a naive implementation of the Mogrifier LSTM paper in PyTorch

☆78

Alternatives and similar repositories for MogrifierLSTM

Users that are interested in MogrifierLSTM are comparing it to the libraries listed below

Sorting:

FreedomIntelligence / complex-order
☆84Updated 6 years ago
dashstander / block-recurrent-transformer
Pytorch implementation of "Block Recurrent Transformers" (Hutchins & Schlag et al., 2022)
☆84Updated 3 years ago
lancopku / Prime
A simple module consistently outperforms self-attention and Transformer model on main NMT datasets with SoTA performance.
☆86Updated 2 years ago
fawazsammani / mogrifier-lstm-pytorch
Implementation of Mogrifier LSTM in PyTorch
☆34Updated 5 years ago
CyberZHG / torch-multi-head-attention
Multi-head attention in PyTorch
☆154Updated 6 years ago
leaderj1001 / Synthesizer-Rethinking-Self-Attention-Transformer-Models
Implementing SYNTHESIZER: Rethinking Self-Attention in Transformer Models using Pytorch
☆70Updated 5 years ago
10-zin / Synthesizer
A PyTorch implementation of the paper - "Synthesizer: Rethinking Self-Attention in Transformer Models"
☆73Updated 3 years ago
takashiishida / flooding
[ICML 2020] code for the flooding regularizer proposed in "Do We Need Zero Training Loss After Achieving Zero Training Error?"
☆95Updated 3 years ago
xuanqing94 / FLOATER
Learning to Encode Position for Transformer with Continuous Dynamical Model
☆59Updated 5 years ago
yaohungt / TransformerDissection
[EMNLP'19] Summary for Transformer Understanding
☆53Updated 6 years ago
QData / LaMP
ECML 2019: Graph Neural Networks for Multi-Label Classification
☆91Updated last year
DSE-MSU / R-transformer
Pytorch implementation of R-Transformer. Some parts of the code are adapted from the implementation of TCN and Transformer.
☆230Updated 6 years ago
szhangtju / The-compression-of-Transformer
☆64Updated 4 years ago
google-deepmind / lamb
LAnguage Modelling Benchmarks
☆138Updated 5 years ago
monk1337 / Various-Attention-mechanisms
This repository contain various types of attention mechanism like Bahdanau , Soft attention , Additive Attention , Hierarchical Attention…
☆127Updated 4 years ago
epfml / collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
☆153Updated 2 years ago
lukasruff / CVDD-PyTorch
A PyTorch implementation of Context Vector Data Description (CVDD), a method for Anomaly Detection on text.
☆71Updated 3 years ago
zlinao / Variational-Transformer
Variational Transformers for Diverse Response Generation
☆82Updated last year
jackyyy0228 / Order-free-Learning-Alleviating-Exposure-Bias-in-Multi-label-Classification
☆20Updated 6 years ago
leimao / Two-Layer-Hierarchical-Softmax-PyTorch
Two-Layer Hierarchical Softmax Implementation for PyTorch
☆70Updated 4 years ago
Gladys-Zhao / mRNN-mLSTM
Code for ICML 2020 paper: Do RNN and LSTM have Long Memory?
☆17Updated 4 years ago
linzehui / Curriculum-Learning-PaperList-Materials
Curriculum Learning related papers and materials
☆53Updated 5 years ago
ywatanabe1989 / custom_losses_pytorch
Custom loss functions to use in (mainly) PyTorch.
☆39Updated 5 years ago
cloneofsimo / realformer-pytorch
Implementation of RealFormer using pytorch
☆101Updated 4 years ago
thomlake / pytorch-attention
pytorch neural network attention mechanism
☆148Updated 6 years ago
CharizardAcademy / convtransformer
Code for the ACL2020 paper Character-Level Translation with Self-Attention
☆31Updated 5 years ago
LittleChenCc / Meta_LSTM
The implementation of Meta-LSTM in "Meta Multi-Task Learning for Sequence Modeling." AAAI-18
☆33Updated 7 years ago
timerstime / SDG4DA
Selection Distribution Generator for Domain Adaptation
☆27Updated 6 years ago
lancopku / AdaNorm
Code for "Understanding and Improving Layer Normalization"
☆46Updated 6 years ago
keitakurita / Better_LSTM_PyTorch
An LSTM in PyTorch with best practices (weight dropout, forget bias, etc.) built-in. Fully compatible with PyTorch LSTM.
☆134Updated 5 years ago