johnma2006 / mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
☆2,624 Updated 8 months ago
Related projects
Alternatives and complementary repositories for mamba-minimal
- A simple and efficient Mamba implementation in pure PyTorch and MLX. ☆1,012 Updated 2 months ago
- Mamba SSM architecture ☆13,239 Updated 2 weeks ago
- [ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model ☆2,993 Updated last week
- An implementation of "Retentive Network: A Successor to Transformer for Large Language Models" ☆1,163 Updated last year
- Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton ☆1,339 Updated this week
- Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆1,040 Updated 4 months ago
- Structured state space sequence models ☆2,470 Updated 4 months ago
- Schedule-Free Optimization in PyTorch ☆1,898 Updated 2 weeks ago
- Official repository of the xLSTM. ☆1,407 Updated 2 weeks ago
- Awesome Papers related to Mamba. ☆1,220 Updated last month
- Collection of papers on state-space models ☆556 Updated 2 weeks ago
- VMamba: Visual State Space Models, code is based on mamba ☆2,196 Updated 3 weeks ago
- Mamba-Chat: A chat LLM based on the state-space model architecture 🐍 ☆908 Updated 8 months ago
- ☆713 Updated 5 months ago
- Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone ☆818 Updated last month
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ☆1,435 Updated 3 weeks ago
- Implementation of Rotary Embeddings, from the Roformer paper, in PyTorch ☆571 Updated last week
- 🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in PyTorch ☆2,041 Updated 5 months ago
- Causal depthwise conv1d in CUDA, with a PyTorch interface ☆329 Updated 3 months ago
- A JAX research toolkit for building, editing, and visualizing neural networks. ☆1,679 Updated this week
- Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more. ☆2,339 Updated 2 months ago
- Foundation Architecture for (M)LLMs ☆3,034 Updated 7 months ago
- [Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications ☆618 Updated 3 weeks ago
- Vector (and Scalar) Quantization, in PyTorch ☆2,639 Updated last week
- Official PyTorch Implementation of "Scalable Diffusion Models with Transformers" ☆6,369 Updated 5 months ago
- xLSTM as Generic Vision Backbone ☆438 Updated 2 weeks ago
- Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆366 Updated 3 months ago
- Tile primitives for speedy kernels ☆1,658 Updated this week
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆939 Updated 9 months ago
- Annotated version of the Mamba paper ☆457 Updated 8 months ago