hyeon95y / SparseLinear
A custom PyTorch layer for implementing extremely wide, sparse linear layers efficiently
☆49 Updated last year
Alternatives and similar repositories for SparseLinear
Users that are interested in SparseLinear are comparing it to the libraries listed below
- Structured matrices for compressing neural networks ☆66 Updated last year
- Sequence Modeling with Structured State Spaces ☆64 Updated 2 years ago
- Tensorflow implementation and notebooks for Implicit Maximum Likelihood Estimation ☆67 Updated 3 years ago
- Transformers with doubly stochastic attention ☆45 Updated 2 years ago
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers. ☆105 Updated 3 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆67 Updated 2 years ago
- ☆49 Updated 4 years ago
- Layerwise Batch Entropy Regularization ☆23 Updated 2 years ago
- Easy-to-use AdaHessian optimizer (PyTorch) ☆78 Updated 4 years ago
- Pytorch library for factorized L0-based pruning. ☆45 Updated last year
- Euclidean Wasserstein-2 optimal transportation ☆47 Updated last year
- Fast Discounted Cumulative Sums in PyTorch ☆96 Updated 3 years ago
- ICML 2020 Paper: Latent Variable Modelling with Hyperbolic Normalizing Flows ☆54 Updated 2 years ago
- Easy Hypernetworks in Pytorch and Jax ☆100 Updated 2 years ago
- Padé Activation Units: End-to-end Learning of Activation Functions in Deep Neural Network ☆64 Updated 4 years ago
- [NeurIPS 2020] Official Implementation: "SMYRF: Efficient Attention using Asymmetric Clustering". ☆50 Updated last year
- repo for paper: Adaptive Checkpoint Adjoint (ACA) method for gradient estimation in neural ODE ☆55 Updated 4 years ago
- ☆99 Updated 3 years ago
- CUDA kernels for generalized matrix-multiplication in PyTorch ☆82 Updated 3 years ago
- Relative Positional Encoding for Transformers with Linear Complexity ☆63 Updated 3 years ago
- Code base for SRSGD. ☆28 Updated 5 years ago
- A minimal implementation of a VAE with BinConcrete (relaxed Bernoulli) latent distribution in TensorFlow. ☆22 Updated 5 years ago
- Code for ICLR 2021 Paper, "Anytime Sampling for Autoregressive Models via Ordered Autoencoding" ☆26 Updated 2 years ago
- Repository containing Pytorch code for EKFAC and K-FAC preconditioners. ☆143 Updated last year
- Pytorch code for "Improving Self-Supervised Learning by Characterizing Idealized Representations" ☆41 Updated 2 years ago
- An implementation of (Induced) Set Attention Block, from the Set Transformers paper ☆59 Updated 2 years ago
- ☆163 Updated 2 years ago
- ☆15 Updated 5 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021) ☆61 Updated 3 years ago
- Efficient Riemannian Optimization on Stiefel Manifold via Cayley Transform ☆40 Updated 6 years ago