FarnoushRJ / MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
☆38Updated 6 months ago
Alternatives and similar repositories for MambaLRP:
Users who are interested in MambaLRP are comparing it to the libraries listed below:
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆217Updated 11 months ago
- More dimensions = More fun☆22Updated 9 months ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation☆42Updated 2 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆65Updated 4 months ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆130Updated 3 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆45Updated last year
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"☆50Updated 11 months ago
- ☆33Updated 9 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆66Updated 7 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)☆123Updated last year
- ☆58Updated 3 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆54Updated 8 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆54Updated 4 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆106Updated 7 months ago
- State Space Models☆67Updated last year
- Official Implementation for Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data☆59Updated 10 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆52Updated 10 months ago
- The official PyTorch implementation of the paper "Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT …☆37Updated last year
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆28Updated last year
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024)☆19Updated 6 months ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆26Updated 3 weeks ago
- ☆40Updated 3 months ago
- Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.)☆46Updated last year
- Official implementation for "Targeted Cause Discovery with Data-Driven Learning"☆23Updated 8 months ago
- A repository for DenseSSMs☆87Updated last year
- Implementations of various linear RNN layers using PyTorch and Triton☆50Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆36Updated last month
- ☆52Updated 7 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy☆66Updated last year
- ☆48Updated last year