ml-research / self-expanding-neural-networks
Self-Expanding Neural Networks
☆39 Updated last year
Alternatives and similar repositories for self-expanding-neural-networks
Users who are interested in self-expanding-neural-networks are comparing it to the libraries listed below.
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆127 Updated 2 years ago
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture" ☆86 Updated 2 years ago
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆87 Updated last year
- ☆129 Updated 4 months ago
- Deep Networks Grok All the Time and Here is Why ☆38 Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆56 Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆41 Updated 8 months ago
- ☆69 Updated last year
- Modern Fixed Point Systems using Pytorch ☆125 Updated 2 years ago
- Omnigrok: Grokking Beyond Algorithmic Data ☆62 Updated 2 years ago
- Official code for the paper "Attention as a Hypernetwork" ☆46 Updated last year
- C++ and Cuda ops for fused FourierKAN ☆82 Updated last year
- Official implementation for Equivariant Architectures for Learning in Deep Weight Spaces [ICML 2023] ☆90 Updated 2 years ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆76 Updated last year
- ☆97 Updated last year
- ☆77 Updated 10 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆83 Updated last year
- ☆156 Updated last month
- ☆62 Updated last year
- ☆54 Updated last year
- Efficient Riemannian Optimization on Stiefel Manifold via Cayley Transform ☆43 Updated 6 years ago
- Parallelizing non-linear sequential models over the sequence length ☆56 Updated 6 months ago
- A PyTorch wrapper of parallel exclusive scan in CUDA ☆12 Updated 2 years ago
- Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch ☆114 Updated 2 years ago
- A centralized place for deep thinking code and experiments ☆88 Updated 2 years ago
- ☆67 Updated 4 years ago
- Rational Activation Functions - Replacing Padé Activation Units ☆103 Updated 9 months ago
- A More Fair and Comprehensive Comparison between KAN and MLP ☆176 Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆40 Updated 2 years ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆211 Updated 2 months ago