hyeon95y / SparseLinear
A custom PyTorch layer that is capable of implementing extremely wide and sparse linear layers efficiently
☆49 · Updated last year
Alternatives and similar repositories for SparseLinear:
Users who are interested in SparseLinear are comparing it to the libraries listed below:
- Structured matrices for compressing neural networks ☆66 · Updated last year
- TensorFlow implementation and notebooks for Implicit Maximum Likelihood Estimation ☆67 · Updated 2 years ago
- Code for the article "What if Neural Networks had SVDs?", to be presented as a spotlight paper at NeurIPS 2020. ☆74 · Updated 7 months ago
- Code for the paper: "Tensor Programs II: Neural Tangent Kernel for Any Architecture" ☆105 · Updated 4 years ago
- Easy-to-use AdaHessian optimizer (PyTorch) ☆77 · Updated 4 years ago
- Official code repository of the paper Linear Transformers Are Secretly Fast Weight Programmers. ☆102 · Updated 3 years ago
- ☆49 · Updated 4 years ago
- Official code for UnICORNN (ICML 2021) ☆27 · Updated 3 years ago
- PyTorch implementation of the Power Spherical distribution ☆74 · Updated 7 months ago
- Code for the Thermodynamic Variational Objective ☆26 · Updated 2 years ago
- Transformers with doubly stochastic attention ☆45 · Updated 2 years ago
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆67 · Updated 2 years ago
- Repo for the paper: Adaptive Checkpoint Adjoint (ACA) method for gradient estimation in neural ODE ☆54 · Updated 4 years ago
- Estimating Gradients for Discrete Random Variables by Sampling without Replacement ☆40 · Updated 5 years ago
- Codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks" ☆50 · Updated last year
- ☆67 · Updated 5 years ago
- Sequence Modeling with Structured State Spaces ☆63 · Updated 2 years ago
- Monotone operator equilibrium networks ☆51 · Updated 4 years ago
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 · Updated 2 years ago
- Euclidean Wasserstein-2 optimal transportation ☆45 · Updated last year
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆61 · Updated 3 years ago
- Limitations of the Empirical Fisher Approximation ☆47 · Updated last week
- CUDA kernels for generalized matrix-multiplication in PyTorch ☆79 · Updated 3 years ago
- Reparameterize your PyTorch modules ☆70 · Updated 4 years ago
- Introducing diverse tasks for NAS ☆49 · Updated 2 years ago
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆108 · Updated 2 years ago
- ☆31 · Updated 4 years ago
- PyTorch-SSO: Scalable Second-Order methods in PyTorch ☆145 · Updated last year
- Padé Activation Units: End-to-end Learning of Activation Functions in Deep Neural Network ☆64 · Updated 4 years ago
- Laplace Redux -- Effortless Bayesian Deep Learning ☆42 · Updated last year