deep-spin / entmax

The entmax mapping and its loss, a family of sparse softmax alternatives.

☆449

Alternatives and similar repositories for entmax

Users who are interested in entmax are comparing it to the libraries listed below.


KrisKorrel / sparsemax-pytorch
Implementation of Sparsemax activation in Pytorch
☆164 Updated 5 years ago
lucidrains / sinkhorn-transformer
Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention
☆268 Updated 4 years ago
Stonesjtu / Pytorch-NCE
The Noise Contrastive Estimation for softmax output written in Pytorch
☆319 Updated 5 years ago
LiyuanLucasLiu / Transformer-Clinic
Understanding the Difficulty of Training Transformers
☆330 Updated 3 years ago
guolinke / TUPE
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding in Language Pre-training". Improve exis…
☆252 Updated 3 years ago
jxhe / vae-lagging-encoder
PyTorch implementation of "Lagging Inference Networks and Posterior Collapse in Variational Autoencoders" (ICLR 2019)
☆185 Updated 4 years ago
cybertronai / pytorch-lamb
Implementation of https://arxiv.org/abs/1904.00962
☆377 Updated 4 years ago
lucidrains / routing-transformer
Fully featured implementation of Routing Transformer
☆296 Updated 3 years ago
dalab / hyperbolic_nn
Source code for the paper "Hyperbolic Neural Networks", https://arxiv.org/abs/1805.09112
☆180 Updated 5 years ago
harvardnlp / pytorch-struct
Fast, general, and tested differentiable structured prediction in PyTorch
☆1,117 Updated 3 years ago
haofuml / cyclical_annealing
☆196 Updated 2 years ago
epfml / collaborative-attention
Code for Multi-Head Attention: Collaborate Instead of Concatenate
☆151 Updated 2 years ago
alex-tifrea / poincare_glove
Implementation of the "Poincare Glove: Hyperbolic word embeddings" paper
☆88 Updated 4 years ago
laiguokun / Funnel-Transformer
☆219 Updated 5 years ago
successar / AttentionExplanation
☆316 Updated 3 years ago
shentianxiao / text-autoencoders
☆212 Updated last year
google-research / long-range-arena
Long Range Arena for Benchmarking Efficient Transformers
☆765 Updated last year
jiacheng-xu / vmf_vae_nlp
Code for EMNLP18 paper "Spherical Latent Spaces for Stable Variational Autoencoders"
☆170 Updated 6 years ago
ChunyuanLI / Optimus
Optimus: the first large-scale pre-trained VAE language model
☆391 Updated 2 years ago
tatp22 / linformer-pytorch
My take on a practical implementation of Linformer for Pytorch.
☆421 Updated 3 years ago
XuezheMax / flowseq
Generative Flow based Sequence-to-Sequence Toolkit written in Python.
☆246 Updated 5 years ago
lena-voita / the-story-of-heads
This is a repository with the code for the ACL 2019 paper "Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, t…
☆314 Updated 4 years ago
andreamad8 / Universal-Transformer-Pytorch
Implementation of Universal Transformer in Pytorch
☆263 Updated 6 years ago
google-deepmind / lamb
LAnguage Modelling Benchmarks
☆138 Updated 5 years ago
ssnl / align_uniform
Open source code for paper "Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere" ICML 2…
☆457 Updated 3 years ago
wouterkool / stochastic-beam-search
Implementation of Stochastic Beam Search using Fairseq
☆105 Updated 6 years ago
mlpen / Nystromformer
☆383 Updated 2 years ago
ermongroup / neuralsort
Code for "Stochastic Optimization of Sorting Networks using Continuous Relaxations", ICLR 2019.
☆146 Updated 2 years ago
google-research / fast-soft-sort
Fast Differentiable Sorting and Ranking
☆612 Updated last year
keitakurita / Better_LSTM_PyTorch
An LSTM in PyTorch with best practices (weight dropout, forget bias, etc.) built-in. Fully compatible with PyTorch LSTM.
☆134 Updated 5 years ago