FarnoushRJ / MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
☆37Updated 3 months ago
Alternatives and similar repositories for MambaLRP:
Users who are interested in MambaLRP are comparing it to the repositories listed below:
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆211Updated 8 months ago
- More dimensions = More fun☆21Updated 6 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆36Updated 7 months ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆122Updated 2 weeks ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆60Updated last month
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆60Updated 4 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆52Updated 5 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)☆122Updated last year
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆96Updated 5 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆50Updated 2 months ago
- ☆51Updated 4 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆39Updated 9 months ago
- Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence).☆108Updated 3 months ago
- ☆45Updated 10 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"☆48Updated 8 months ago
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy☆64Updated last year
- Official Implementation Of The Paper: "DeciMamba: Exploring the Length Extrapolation Potential of Mamba"☆23Updated 6 months ago
- State Space Models☆64Updated 9 months ago
- [NeurIPS '23 Oral] Emergence of Shape Bias in Convolutional Neural Networks through Activation Sparsity☆25Updated 9 months ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆31Updated last week
- [CVPR 2024] Friendly Sharpness-Aware Minimization☆27Updated 3 months ago
- Unofficial Implementation of Selective Attention Transformer☆15Updated 3 months ago
- Official implementation for "Targeted Cause Discovery with Data-Driven Learning"☆23Updated 5 months ago
- Official PyTorch Implementation☆18Updated 2 years ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆22Updated last year
- Official Implementation for Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data☆56Updated 7 months ago
- The official Pytorch implementation of the paper "Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT …☆32Updated 11 months ago
- This repository contains the code for our paper "Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguo…☆39Updated last year
- Implementation of Agent Attention in Pytorch☆89Updated 7 months ago