aced125 / sparsemax

A PyTorch Implementation of the Sparsemax operator (https://arxiv.org/pdf/1803.09820.pdf)

☆34

Alternatives and similar repositories for sparsemax

Users who are interested in sparsemax are comparing it to the libraries listed below.

pseeth / autoclip
Adaptive Gradient Clipping
☆136 Updated 2 years ago
ctlllll / SGConv
☆163 Updated 2 years ago
aliutkus / spe
Relative Positional Encoding for Transformers with Linear Complexity
☆64 Updated 3 years ago
tk-rusch / LEM
Official code for Long Expressive Memory (ICLR 2022, Spotlight)
☆70 Updated 3 years ago
alex-matton / causal-transformer-decoder
☆73 Updated 4 years ago
Sleepwalking / pytorch-softdtw
An implementation of SoftDTW for PyTorch.
☆223 Updated 5 years ago
gcambara / cape
Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch
☆41 Updated 2 years ago
ag1988 / dss
Sequence Modeling with Structured State Spaces
☆65 Updated 3 years ago
SarthakYadav / audax
A home for audio ML in JAX. Has common features, learnable frontends, pretrained supervised and self-supervised models.
☆68 Updated 3 years ago
google-research / diffstride
TF/Keras code for DiffStride, a pooling layer with learnable strides.
☆124 Updated 3 years ago
guillaumeBellec / multitask
☆24 Updated 9 months ago
chrischute / flowplusplus
Implementation of Flow++ in PyTorch
☆40 Updated 5 years ago
dwromero / ckconv
Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/21…
☆123 Updated 2 years ago
google-research / soft-dtw-divergences
An implementation of soft-DTW divergences.
☆135 Updated 3 years ago
borchero / pycave
Traditional Machine Learning Models for Large-Scale Datasets in PyTorch.
☆128 Updated this week
sony / sqvae
Pytorch implementation of stochastically quantized variational autoencoder (SQ-VAE)
☆190 Updated 3 years ago
dwromero / wavelet_networks
Code repository of the paper "Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series, TMLR" https://arxiv.org/abs…
☆82 Updated last year
janfreyberg / pytorch-revgrad
A minimal pytorch package implementing a gradient reversal layer.
☆158 Updated 9 months ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101 Updated 2 years ago
tatsy / normalizing-flows-pytorch
PyTorch implementations of normalizing flow and its variants.
☆79 Updated 4 years ago
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆204 Updated last year
pclucas14 / iaf-vae
Pytorch Implementation of OpenAI's "Improved Variational Inference with Inverse Autoregressive Flow"
☆82 Updated 5 years ago
thjashin / multires-conv
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
☆125 Updated last year
michaelsdr / sinkformers
Transformers with doubly stochastic attention
☆46 Updated 2 years ago
lorenlugosch / pytorch_HMM
HMMs in PyTorch
☆139 Updated 4 years ago
L0SG / NanoFlow
PyTorch implementation of the paper "NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity." (NeurIPS 2020)
☆66 Updated 4 years ago
yoyolicoris / pytorch-NMF
A pytorch package for non-negative matrix factorization.
☆241 Updated last year
NVIDIA / transformer-ls
Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).
☆225 Updated 3 years ago
revsic / jax-variational-diffwave
Jax/Flax implementation of Variational-DiffWave.
☆40 Updated 3 years ago
lucidrains / gradnorm-pytorch
A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch
☆100 Updated last year