FarnoushRJ / MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
☆38Updated 6 months ago
Alternatives and similar repositories for MambaLRP
Users who are interested in MambaLRP are comparing it to the libraries listed below.
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆222Updated last year
- More dimensions = More fun☆22Updated 10 months ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation☆42Updated 3 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆54Updated 9 months ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆135Updated 4 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"☆51Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆36Updated last month
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆73Updated last year
- ☆53Updated 8 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248☆54Updated 11 months ago
- I2M2: Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning (NeurIPS 2024)☆19Updated 7 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆66Updated 8 months ago
- ☆33Updated 10 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆30Updated last year
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy☆67Updated last year
- Implementations of various linear RNN layers using pytorch and triton☆51Updated last year
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning☆45Updated last year
- Minimal Implementation of Visual Autoregressive Modelling (VAR)☆33Updated 2 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆108Updated 8 months ago
- Code and benchmark for the paper: "A Practitioner's Guide to Continual Multimodal Pretraining" [NeurIPS'24]☆56Updated 5 months ago
- A toolkit for quantitative evaluation of data attribution methods.☆47Updated last month
- LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters☆34Updated 2 months ago
- ☆28Updated 3 months ago
- A Triton Kernel for incorporating Bi-Directionality in Mamba2☆68Updated 5 months ago
- Deep Networks Grok All the Time and Here is Why☆35Updated last year
- Official Implementation for Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data☆59Updated 11 months ago
- Switch EMA: A Free Lunch for Better Flatness and Sharpness☆26Updated last year
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆28Updated last month
- Code for Principal Masked Autoencoders☆27Updated 2 months ago
- This repository contains the code for our paper "Probabilistic Contrastive Learning Recovers the Correct Aleatoric Uncertainty of Ambiguo…☆40Updated 2 years ago