FarnoushRJ / MambaLRP
[NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models".
☆40 Updated 7 months ago
Alternatives and similar repositories for MambaLRP
Users who are interested in MambaLRP are comparing it to the libraries listed below.
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆223 Updated last year
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆138 Updated 4 months ago
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆55 Updated last year
- More dimensions = More fun ☆22 Updated 10 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆52 Updated last year
- ☆33 Updated 11 months ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆15 Updated 5 months ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation ☆42 Updated 3 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆66 Updated 9 months ago
- Official Code for ICLR 2024 Paper: Non-negative Contrastive Learning ☆45 Updated last year
- Recycling diverse models ☆44 Updated 2 years ago
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆40 Updated last year
- ☆53 Updated 8 months ago
- Code for Principal Masked Autoencoders ☆27 Updated 2 months ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆38 Updated 2 months ago
- ☆29 Updated 4 months ago
- Deep Networks Grok All the Time and Here is Why ☆37 Updated last year
- [NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy ☆69 Updated last year
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 Updated last year
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆108 Updated 9 months ago
- Official repo for the paper "Weight-based Decomposition: A Case for Bilinear MLPs" ☆21 Updated 6 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆124 Updated last year
- A Triton Kernel for incorporating Bi-Directionality in Mamba2 ☆70 Updated 6 months ago
- Official implementation for "Targeted Cause Discovery with Data-Driven Learning" ☆23 Updated 9 months ago
- Visualizing representations with diffusion based conditional generative model. ☆95 Updated 2 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆57 Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆55 Updated 10 months ago
- Implementations of various linear RNN layers using pytorch and triton ☆53 Updated last year
- An official pytorch implementation of EACL2024 short paper "Flow Matching for Conditional Text Generation in a Few Sampling Steps" ☆17 Updated last year
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆21 Updated this week