Hprairie / Bi-Mamba2
A Triton Kernel for incorporating Bi-Directionality in Mamba2
☆60Updated 3 weeks ago
Alternatives and similar repositories for Bi-Mamba2:
Users who are interested in Bi-Mamba2 are comparing it to the libraries listed below:
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"☆118Updated 5 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆209Updated 7 months ago
- An efficient PyTorch implementation of selective scan in one file, works with both CPU and GPU, with corresponding mathematical derivatio…☆78Updated 10 months ago
- Official Implementation for Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data☆55Updated 6 months ago
- Awesome list of papers that extend Mamba to various applications.☆129Updated 3 weeks ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆51Updated 2 months ago
- ☆45Updated 9 months ago
- A repository for DenseSSMs☆87Updated 9 months ago
- Simba☆194Updated 9 months ago
- Minimal Mamba-2 implementation in PyTorch☆164Updated 7 months ago
- This repository is the official implementation of our Autoregressive Pretraining with Mamba in Vision☆68Updated 7 months ago
- ☆42Updated last month
- Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule☆73Updated 2 weeks ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆92Updated 4 months ago
- More dimensions = More fun☆21Updated 5 months ago
- HGRN2: Gated Linear RNNs with State Expansion☆52Updated 4 months ago
- Implementation of Agent Attention in PyTorch☆89Updated 6 months ago
- ☆64Updated 2 months ago
- State Space Models☆64Updated 8 months ago
- Introduce Mamba2 to Vision.☆111Updated 4 months ago
- My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing o…☆43Updated last month
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…☆67Updated 2 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆33Updated 3 months ago
- Official Implementation of the Paper: "DeciMamba: Exploring the Length Extrapolation Potential of Mamba"☆22Updated 5 months ago
- Implementation of a multimodal diffusion transformer in PyTorch☆99Updated 6 months ago
- Implementation of ViTAR: Vision Transformer with Any Resolution in PyTorch☆30Updated 2 months ago
- ☆36Updated 7 months ago
- Transformer-Mamba Diffusion Models☆94Updated 6 months ago
- Causal depthwise conv1d in CUDA, with a PyTorch interface☆378Updated last month
- Official repository of MLLA (NeurIPS 2024)☆267Updated last month