ml-research / self-expanding-neural-networks
Self-Expanding Neural Networks
☆39 Updated 11 months ago
Alternatives and similar repositories for self-expanding-neural-networks:
Users who are interested in self-expanding-neural-networks are comparing it to the repositories listed below.
- HGRN2: Gated Linear RNNs with State Expansion ☆52 Updated 5 months ago
- ☆50 Updated 3 months ago
- Official source code for "Graph Neural Networks for Learning Equivariant Representations of Neural Networks". In ICLR 2024 (oral). ☆76 Updated 5 months ago
- Parallelizing non-linear sequential models over the sequence length ☆49 Updated this week
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆58 Updated 3 months ago
- ☆12 Updated 2 years ago
- Deep Networks Grok All the Time and Here is Why ☆23 Updated 8 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆121 Updated last year
- Official code for our NeurIPS 2024 paper "einspace: Searching for Neural Architectures from Fundamental Operations" ☆25 Updated 2 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆20 Updated last week
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆41 Updated 11 months ago
- ☆48 Updated 11 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆76 Updated 8 months ago
- ☆64 Updated 2 months ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆31 Updated last week
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- Source code for the paper "Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning" ☆14 Updated last week
- ☆29 Updated 3 months ago
- ☆80 Updated 7 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆22 Updated last year
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" ☆17 Updated last week
- C++ and Cuda ops for fused FourierKAN ☆74 Updated 8 months ago
- ☆51 Updated 7 months ago
- Omnigrok: Grokking Beyond Algorithmic Data ☆52 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆23 Updated 6 months ago
- ☆22 Updated last year
- ☆26 Updated 10 months ago
- Official implementation for Equivariant Architectures for Learning in Deep Weight Spaces [ICML 2023] ☆86 Updated last year
- Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆72 Updated 8 months ago