keitakurita / Better_LSTM_PyTorchLinks

An LSTM in PyTorch with best practices (weight dropout, forget bias, etc.) built-in. Fully compatible with PyTorch LSTM.

☆134

Alternatives and similar repositories for Better_LSTM_PyTorch

Users that are interested in Better_LSTM_PyTorch are comparing it to the libraries listed below

Sorting:

seba-1511 / lstms.pth
PyTorch implementations of LSTM Variants (Dropout + Layer Norm)
☆137Updated 4 years ago
bastings / annotated_encoder_decoder
The Annotated Encoder Decoder with Attention
☆166Updated 4 years ago
mttk / rnn-classifier
Minimal RNN classifier with self-attention in Pytorch
☆150Updated 3 years ago
google-deepmind / lamb
LAnguage Modelling Benchmarks
☆138Updated 5 years ago
yunjey / seq2seq-dataloader
PyTorch DataLoader for seq2seq
☆85Updated 6 years ago
DSE-MSU / R-transformer
Pytorch implementation of R-Transformer. Some parts of the code are adapted from the implementation of TCN and Transformer.
☆230Updated 6 years ago
locuslab / trellisnet
[ICLR'19] Trellis Networks for Sequence Modeling
☆471Updated 5 years ago
threelittlemonkeys / seq2seq-pytorch
Sequence to Sequence Models in PyTorch
☆44Updated last year
leimao / Two-Layer-Hierarchical-Softmax-PyTorch
Two-Layer Hierarchical Softmax Implementation for PyTorch
☆69Updated 4 years ago
tacchinotacchi / distil-bilstm
Scripts to train a bidirectional LSTM with knowledge distillation from BERT
☆158Updated 5 years ago
Toni-Antonova / VAE-Text-Generation
Text Generation Using A Variational Autoencoder
☆109Updated 8 years ago
b-etienne / Seq2seq-PyTorch
☆76Updated 5 years ago
variational-attention / tf-var-attention
Variational Attention for Sequence to Sequence Models
☆20Updated 7 years ago
ngarneau / understanding-pytorch-batching-lstm
Understanding and visualizing PyTorch Batching with LSTM
☆140Updated 7 years ago
phohenecker / pytorch-transformer
A PyTorch implementation of the Transformer model from "Attention Is All You Need".
☆59Updated 6 years ago
lancopku / Prime
A simple module consistently outperforms self-attention and Transformer model on main NMT datasets with SoTA performance.
☆85Updated 2 years ago
andreamad8 / Universal-Transformer-Pytorch
Implementation of Universal Transformer in Pytorch
☆261Updated 6 years ago
epfml / collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
☆152Updated 2 years ago
harvardnlp / cascaded-generation
Cascaded Text Generation with Markov Transformers
☆129Updated 2 years ago
bplank / semi-supervised-baselines
Code for "Strong Baselines for Neural Semi-supervised Learning under Domain Shift" (Ruder & Plank, 2018 ACL)
☆61Updated 2 years ago
harvardnlp / sa-vae
☆152Updated 7 years ago
zalandoresearch / pytorch-dilated-rnn
Dilated RNNs in pytorch
☆213Updated 6 years ago
jihunchoi / recurrent-batch-normalization-pytorch
PyTorch implementation of recurrent batch normalization
☆243Updated 6 years ago
Smerity / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆15Updated 5 years ago
HarshTrivedi / packing-unpacking-pytorch-minimal-tutorial
Minimal tutorial on packing and unpacking sequences in pytorch
☆210Updated 6 years ago
harvardnlp / var-attn
Latent Alignment and Variational Attention
☆327Updated 6 years ago
shentianxiao / text-autoencoders
☆209Updated last year
AuCson / PyTorch-Batch-Attention-Seq2seq
PyTorch implementation of batched bi-RNN encoder and attention-decoder.
☆280Updated 6 years ago
rdspring1 / PyTorch_GBW_LM
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
☆123Updated 5 years ago
jxhe / vae-lagging-encoder
PyTorch implementation of "Lagging Inference Networks and Posterior Collapse in Variational Autoencoders" (ICLR 2019)
☆183Updated 4 years ago