AvivBick / awesome-ssm-ml

Reading list for research topics in state-space models

☆325

Alternatives and similar repositories for awesome-ssm-ml

Users who are interested in awesome-ssm-ml are comparing it to the repositories listed below.

radarFudan / Awesome-state-space-models
Collection of papers on state-space models
☆600 Updated last month
goombalab / hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
☆161 Updated 8 months ago
PeaBrane / mamba-tiny
Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence).
☆123 Updated 11 months ago
pengzhangzhi / Awesome-Mamba
Awesome list of papers that extend Mamba to various applications.
☆138 Updated 4 months ago
hkproj / mamba-notes
Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
☆171 Updated last year
AmeenAli / HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
☆228 Updated last year
goombalab / phi-mamba
Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…
☆116 Updated last year
Hprairie / Bi-Mamba2
A Triton Kernel for incorporating Bi-Directionality in Mamba2
☆75 Updated 9 months ago
NVlabs / GatedDeltaNet
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
☆312 Updated last month
lucidrains / st-moe-pytorch
Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch
☆362 Updated last year
kyegomez / Jamba
PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model"
☆190 Updated last week
jzhang38 / LongMamba
Some preliminary explorations of Mamba's context scaling.
☆216 Updated last year
apple / ml-sigmoid-attention
☆302 Updated 5 months ago
test-time-training / ttt-lm-jax
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
☆423 Updated last year
srush / annotated-mamba
Annotated version of the Mamba paper
☆489 Updated last year
jxiw / MambaInLlama
[NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models
☆230 Updated 5 months ago
kyegomez / Griffin
Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆56 Updated last week
lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch
☆327 Updated 6 months ago
Haiyang-W / TokenFormer
[ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
☆575 Updated 8 months ago
goombalab / hnet
H-Net: Hierarchical Network with Dynamic Chunking
☆744 Updated 2 weeks ago
tensorgi / TPA
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)
☆397 Updated 3 weeks ago
kuleshov-group / awesome-discrete-diffusion-models
A curated list for awesome discrete diffusion models resources.
☆463 Updated last month
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆206 Updated last week
assafbk / DeciMamba
DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)
☆31 Updated 6 months ago
bobby-he / simplified_transformers
☆292 Updated 9 months ago
zhixuan-lin / forgetting-transformer
[ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning
☆131 Updated 2 weeks ago
tommyip / mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
☆222 Updated last year
kyegomez / SwitchTransformers
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien…
☆125 Updated last week
kyleliang919 / C-Optim
When it comes to optimizers, it's always better to be safe than sorry
☆375 Updated 2 weeks ago
louaaron / Score-Entropy-Discrete-Diffusion
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
☆642 Updated last year