Stonesjtu / Pytorch-NCE
The Noise Contrastive Estimation for softmax output written in Pytorch
☆318 Updated 4 years ago
Related projects:
- The entmax mapping and its loss, a family of sparse softmax alternatives. ☆406 Updated 2 months ago
- Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis… ☆249 Updated 2 years ago
- PyTorch Implementation of the paper Learning to Reweight Examples for Robust Deep Learning ☆351 Updated 5 years ago
- Latent Alignment and Variational Attention ☆326 Updated 5 years ago
- PyTorch implementations of LSTM Variants (Dropout + Layer Norm) ☆136 Updated 3 years ago
- Code for paper "Learning to Reweight Examples for Robust Deep Learning" ☆269 Updated 5 years ago
- pytorch neural network attention mechanism ☆147 Updated 5 years ago
- Implementation of Sparsemax activation in Pytorch ☆155 Updated 4 years ago
- categorical variational autoencoder using the Gumbel-Softmax estimator ☆425 Updated 7 years ago
- Code for the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks" ☆577 Updated 5 years ago
- Pytorch implementation of R-Transformer. Some parts of the code are adapted from the implementation of TCN and Transformer. ☆224 Updated 5 years ago
- Implementation of Universal Transformer in Pytorch ☆256 Updated 5 years ago
- Implementation of https://arxiv.org/abs/1904.00962 ☆366 Updated 3 years ago
- Minimal RNN classifier with self-attention in Pytorch ☆148 Updated 2 years ago
- Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth" ☆405 Updated last month
- An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch. ☆244 Updated 5 years ago
- PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset ☆122 Updated 5 years ago
- A pytorch implementation of the paper: "Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks" ☆81 Updated 5 years ago
- PyTorch Implementation of "Non-Autoregressive Neural Machine Translation" ☆269 Updated 2 years ago
- Code for "Gradient Surgery for Multi-Task Learning" ☆299 Updated 4 years ago
- Codes for "Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View" ☆146 Updated 5 years ago
- Virtual Adversarial Training (VAT) implementation for PyTorch ☆296 Updated 5 years ago
- Generative Flow based Sequence-to-Sequence Toolkit written in Python. ☆243 Updated 4 years ago
- Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention ☆252 Updated 3 years ago
- Keras implementation of Representation Learning with Contrastive Predictive Coding ☆519 Updated 5 years ago
- A PyTorch implementation of: Language Modeling with Gated Convolutional Networks. ☆100 Updated 2 years ago
- PyTorch implementation of a Variational Autoencoder with Gumbel-Softmax Distribution ☆198 Updated 6 years ago
- ☆83 Updated 4 years ago
- PyTorch implementation of batched bi-RNN encoder and attention-decoder. ☆278 Updated 5 years ago
- pytorch implementation of Attention is all you need ☆239 Updated 3 years ago