aced125 / sparsemax
A PyTorch Implementation of the Sparsemax operator (https://arxiv.org/pdf/1803.09820.pdf)
☆32 · Updated 2 years ago
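For readers unfamiliar with the operator, sparsemax is commonly defined as the Euclidean projection of a score vector onto the probability simplex, which lets some outputs be exactly zero. The following is a minimal pure-Python sketch of the usual sort-and-threshold procedure. It is an illustration only, not this repository's code or API, and the function name `sparsemax` and its list-based signature are assumptions made for the example.

```python
def sparsemax(z):
    """Return the Euclidean projection of a non-empty score list z onto the simplex.

    Illustrative sketch only; not the API of the repository listed above.
    """
    # Sort scores in descending order.
    z_sorted = sorted(z, reverse=True)

    # Find the support size k: the largest j with 1 + j * z_(j) > sum of the top-j scores.
    cumsum = 0.0
    k = 0
    support_sum = 0.0
    for j, z_j in enumerate(z_sorted, start=1):
        cumsum += z_j
        if 1.0 + j * z_j > cumsum:
            k = j
            support_sum = cumsum

    # Threshold tau shifts the supported scores so that they sum to one.
    tau = (support_sum - 1.0) / k

    # Scores at or below the threshold are clipped to exactly zero.
    return [max(z_i - tau, 0.0) for z_i in z]


if __name__ == "__main__":
    print(sparsemax([2.0, 1.0, 0.1]))
    print(sparsemax([1.0, 0.8, 0.1]))
```

Unlike softmax, a clearly dominant score can receive all of the probability mass, while close scores share it and the remaining entries are set to zero.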
Alternatives and similar repositories for sparsemax
Users who are interested in sparsemax are comparing it to the libraries listed below:
- Sequence Modeling with Structured State Spaces ☆63 · Updated 2 years ago
- Transformers with doubly stochastic attention ☆45 · Updated 2 years ago
- PyTorch implementation of the Power Spherical distribution ☆74 · Updated 9 months ago
- [EMNLP'19] Summary for Transformer Understanding ☆53 · Updated 5 years ago
- Code repository of the paper "CKConv: Continuous Kernel Convolution For Sequential Data" published at ICLR 2022. https://arxiv.org/abs/21… ☆119 · Updated 2 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in PyTorch ☆99 · Updated 2 years ago
- Continuous Augmented Positional Embeddings (CAPE) implementation for PyTorch ☆40 · Updated 2 years ago
- Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation ☆69 · Updated 4 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆63 · Updated 3 years ago
- Implementation of Flow++ in PyTorch ☆41 · Updated 5 years ago
- Official code for Long Expressive Memory (ICLR 2022, Spotlight) ☆69 · Updated 3 years ago
- PyTorch implementations of normalizing flow and its variants. ☆76 · Updated 3 years ago
- Fast Discounted Cumulative Sums in PyTorch ☆95 · Updated 3 years ago
- CUDA kernels for generalized matrix-multiplication in PyTorch ☆79 · Updated 3 years ago
- Code for the paper PermuteFormer ☆42 · Updated 3 years ago
- Implementations of various linear RNN layers using PyTorch and Triton ☆49 · Updated last year
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆123 · Updated last year
- Structured matrices for compressing neural networks ☆66 · Updated last year
- Jax/Flax implementation of Variational-DiffWave. ☆40 · Updated 3 years ago
- Code for "Semi-Discrete Normalizing Flows through Differentiable Tessellation" ☆26 · Updated 2 years ago
- Code to reproduce the results for Compositional Attention ☆60 · Updated 2 years ago
- Stochastic Normalizing Flows ☆76 · Updated 3 years ago
- PyTorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations" ☆40 · Updated 2 years ago
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆14 · Updated 5 months ago
- Efficient Householder Transformation in PyTorch ☆64 · Updated 3 years ago
- PyTorch Implementation of OpenAI's "Improved Variational Inference with Inverse Autoregressive Flow" ☆80 · Updated 4 years ago