myscience / mamba

PyTorch (Lightning) implementation of the Mamba model

☆29

Alternatives and similar repositories for mamba

Users who are interested in mamba compare it to the libraries listed below.

kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆182 Updated this week
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆201 Updated 2 weeks ago
myscience / x-lstm
Pytorch implementation of the xLSTM model by Beck et al. (2024)
☆169 Updated 11 months ago
fkodom / yet-another-retnet
A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (http…
☆106 Updated last year
lucidrains / agent-attention-pytorch
Implementation of Agent Attention in Pytorch
☆91 Updated last year
goombalab / hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
☆151 Updated 6 months ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆110 Updated 7 months ago
kyegomez / Griffin
Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆56 Updated 2 weeks ago
andrewgcodes / xlstm
my attempts at implementing various bits of Sepp Hochreiter's new xLSTM architecture
☆131 Updated last year
kyegomez / MambaFormer
Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…
☆21 Updated 2 weeks ago
kyegomez / MambaByte
Implementation of MambaByte in "MambaByte: Token-free Selective State Space Model" in Pytorch and Zeta
☆120 Updated 2 weeks ago
lucidrains / adam-atan2-pytorch
Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch
☆115 Updated 8 months ago
Hprairie / Bi-Mamba2
A Triton Kernel for incorporating Bi-Directionality in Mamba2
☆74 Updated 7 months ago
PeaBrane / mamba-tiny
Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence).
☆120 Updated 9 months ago
lucidrains / vit-arc-slot
Explorations into improving ViTArc with Slot Attention
☆42 Updated 9 months ago
kyegomez / MoE-Mamba
Implementation of MoE Mamba from the paper: "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in Pytorch and Ze…
☆109 Updated last week
lucidrains / minGRU-pytorch
Implementation of the proposed minGRU in Pytorch
☆300 Updated 4 months ago
bobby-he / simplified_transformers
☆292 Updated 7 months ago
lucidrains / hyper-connections
Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public
☆88 Updated last month
apapiu / mamba_small_bench
Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.)
☆48 Updated last year
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆226 Updated last year
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆101 Updated 7 months ago
jacobfa / fft
☆125 Updated 2 months ago
TariqAHassan / S4Torch
PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4.
☆83 Updated last year
nanowell / AdEMAMix-Optimizer-Pytorch
The AdEMAMix Optimizer: Better, Faster, Older.
☆184 Updated 10 months ago
lucidrains / block-recurrent-transformer-pytorch
Implementation of Block Recurrent Transformer - Pytorch
☆221 Updated 11 months ago
Zyphra / BlackMamba
Code repository for Black Mamba
☆252 Updated last year
kyegomez / xLSTM
Implementation of xLSTM in Pytorch from the paper: "xLSTM: Extended Long Short-Term Memory"
☆119 Updated 2 weeks ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆100 Updated 11 months ago
lucidrains / deep-cross-attention
Implementation of the proposed DeepCrossAttention by Heddes et al at Google research, in Pytorch
☆90 Updated 5 months ago