johnma2006/mamba-minimal

johnma2006 / mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

☆2,964

Alternatives and similar repositories for mamba-minimal

Users who are interested in mamba-minimal are comparing it to the libraries listed below.

state-spaces / mamba
Mamba SSM architecture
☆18,667 • Updated this week
alxndrTL / mamba.py
A simple and efficient Mamba implementation in pure PyTorch and MLX.
☆1,472 • May 3, 2026 • Updated 2 months ago
hustvl / Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
☆3,893 • Feb 13, 2025 • Updated last year
srush / annotated-mamba
Annotated version of the Mamba paper
☆501 • Feb 27, 2024 • Updated 2 years ago
state-spaces / s4
Structured state space sequence models
☆2,911 • Jul 17, 2024 • Updated 2 years ago
midrender / mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
☆943 • Mar 3, 2024 • Updated 2 years ago
yyyujintang / Awesome-Mamba-Papers
Awesome Papers related to Mamba.
☆1,400 • Oct 17, 2024 • Updated last year
MzeroMiko / VMamba
VMamba: Visual State Space Models, code is based on mamba
☆3,209 • Mar 7, 2025 • Updated last year
Tiiny-AI / PowerInfer
High-speed Large Language Model Serving for Local Deployment
☆9,679 • May 11, 2026 • Updated 2 months ago
fla-org / flash-linear-attention
🚀 Efficient implementations for emerging model architectures
☆5,425 • Updated this week
radarFudan / Awesome-state-space-models
Collection of papers on state-space models
☆620 • Nov 4, 2025 • Updated 8 months ago
BlinkDL / RWKV-LM
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable)…
☆14,641 • Updated this week
Dao-AILab / flash-attention
Fast and memory-efficient exact attention
☆24,539 • Updated this week
radarFudan / mamba-minimal-jax
☆36 • Nov 22, 2024 • Updated last year
tommyip / mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
☆256 • Jun 17, 2024 • Updated 2 years ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆256 • Jun 6, 2025 • Updated last year
jzhang38 / LongMamba
Some preliminary explorations of Mamba's context scaling.
☆221 • Feb 8, 2024 • Updated 2 years ago
goombalab / hydra
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
☆175 • Jan 30, 2025 • Updated last year
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆278 • Mar 22, 2026 • Updated 4 months ago
microsoft / Samba
[ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
☆966 • Nov 16, 2025 • Updated 8 months ago
meta-pytorch / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,231 • Aug 22, 2025 • Updated 11 months ago
jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆9,017 • May 3, 2024 • Updated 2 years ago
srush / annotated-s4
Implementation of https://srush.github.io/annotated-s4
☆519 • Jun 20, 2025 • Updated last year
kyegomez / MambaTransformer
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
☆226 • Jul 20, 2026 • Updated last week
johnma2006 / candle
Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments.
☆56 • Apr 12, 2024 • Updated 2 years ago
HazyResearch / ThunderKittens
Tile primitives for speedy kernels
☆3,566 • Jul 13, 2026 • Updated 2 weeks ago
KindXiaoming / pykan
Kolmogorov Arnold Networks
☆16,327 • Jan 19, 2025 • Updated last year
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆94 • Jan 25, 2024 • Updated 2 years ago
openai / transformer-debugger
☆4,120 • Apr 15, 2026 • Updated 3 months ago
NX-AI / xlstm
Official repository of the xLSTM.
☆2,188 • May 28, 2026 • Updated last month
google-deepmind / recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
☆682 • Feb 6, 2026 • Updated 5 months ago
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,317 • Jun 18, 2026 • Updated last month
PeaBrane / mamba-tiny
Simple, minimal implementation of the Mamba SSM in one pytorch file. Using logcumsumexp (Heisen sequence).
☆134 • Oct 18, 2024 • Updated last year
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,249 • Jul 11, 2024 • Updated 2 years ago
OpenNLPLab / HGRN
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68 • Apr 24, 2024 • Updated 2 years ago
arcee-ai / mergekit
Tools for merging pretrained large language models.
☆7,261 • Jun 17, 2026 • Updated last month
pytorch / torchtitan
A PyTorch native platform for training generative AI models
☆5,566 • Updated this week
triton-lang / triton
Development repository for the Triton language and compiler
☆19,789 • Updated this week
LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆55 • Aug 31, 2024 • Updated last year