ml-research / self-expanding-neural-networks
Self-Expanding Neural Networks
☆39 Updated last year
Alternatives and similar repositories for self-expanding-neural-networks:
Users who are interested in self-expanding-neural-networks are comparing it to the repositories listed below
- ☆52 Updated 5 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆53 Updated 7 months ago
- ☆30 Updated 5 months ago
- ☆87 Updated 9 months ago
- Source code for the paper "Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning" ☆14 Updated last month
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆41 Updated last year
- ☆49 Updated last year
- ☆51 Updated 9 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆122 Updated last year
- ☆37 Updated 2 years ago
- Deep Networks Grok All the Time and Here is Why ☆31 Updated 10 months ago
- C++ and CUDA ops for fused FourierKAN ☆76 Updated 10 months ago
- Official source code for "Graph Neural Networks for Learning Equivariant Representations of Neural Networks". In ICLR 2024 (oral). ☆77 Updated 8 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆18 Updated 2 weeks ago
- Implementation of Spectral State Space Models ☆16 Updated last year
- A State-Space Model with Rational Transfer Function Representation. ☆78 Updated 10 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆42 Updated 4 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆64 Updated 8 months ago
- Official code for the paper "Attention as a Hypernetwork" ☆25 Updated 9 months ago
- Parallelizing non-linear sequential models over the sequence length ☆51 Updated 2 months ago
- ☆30 Updated 5 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆36 Updated last month
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025) ☆24 Updated 8 months ago
- ☆54 Updated 7 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR) ☆28 Updated this week
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆35 Updated 2 years ago
- ☆31 Updated 10 months ago
- Unofficial Implementation of Selective Attention Transformer ☆16 Updated 4 months ago