PeaBrane / mamba-tiny
A simple, minimal implementation of the Mamba SSM in a single PyTorch file, using logcumsumexp (the Heisen sequence trick).
☆102 · Updated last month
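The logcumsumexp (Heisen sequence) approach mentioned above solves the linear recurrence h_t = a_t·h_{t-1} + b_t for all t in parallel by working in log space: with A_t = Σ_{s≤t} log a_s, one gets log h_t = A_t + logcumsumexp(log b − A)_t. A minimal sketch of that idea, assuming positive coefficients passed as logs (the function name `heisen_scan` and the 1-D shape are illustrative, not the repo's actual API, which handles batched, multi-channel states):

```python
import torch

def heisen_scan(log_a: torch.Tensor, log_b: torch.Tensor) -> torch.Tensor:
    """Solve h_t = a_t * h_{t-1} + b_t with h_0 = 0, in parallel.

    Assumes a_t, b_t > 0 and both inputs are given in log space, shape (T,).
    Working in log space keeps the cumulative products numerically stable.
    """
    # A_t = log(a_1 * a_2 * ... * a_t), the log of the cumulative decay
    a_star = torch.cumsum(log_a, dim=0)
    # log h_t = A_t + log(sum_{s<=t} b_s / exp(A_s))
    log_h = a_star + torch.logcumsumexp(log_b - a_star, dim=0)
    return torch.exp(log_h)
```

For example, at t = 2 this expands to exp(A_2) · (b_1/exp(A_1) + b_2/exp(A_2)) = a_2·b_1 + b_2, matching the sequential recurrence.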
Related projects
Alternatives and complementary repositories for mamba-tiny
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆103 · Updated 3 months ago
- Trying out the Mamba architecture on small examples (CIFAR-10, Shakespeare char-level, etc.) ☆42 · Updated 11 months ago
- Reading list for research topics in state-space models ☆242 · Updated 3 weeks ago
- Annotated version of the Mamba paper ☆457 · Updated 8 months ago
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆259 · Updated 2 weeks ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆170 · Updated last week
- Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces) ☆149 · Updated 10 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆202 · Updated 5 months ago
- Implementation of the proposed minGRU in PyTorch ☆247 · Updated last week
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on the Annotated S4 ☆70 · Updated 8 months ago
- Awesome list of papers that extend Mamba to various applications ☆128 · Updated 2 months ago
- Some preliminary explorations of Mamba's context scaling ☆191 · Updated 9 months ago
- Official implementation of Phi-Mamba, a MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆79 · Updated 2 months ago
- Evaluating the Mamba architecture on the Othello game ☆43 · Updated 6 months ago
- Training small GPT-2 style models using Kolmogorov-Arnold networks ☆108 · Updated 5 months ago
- PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆64 · Updated 6 months ago
- Implementation of a multimodal diffusion transformer in PyTorch ☆97 · Updated 4 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch ☆248 · Updated 6 months ago
- PyTorch implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆137 · Updated last week
- Code repository for Black Mamba ☆232 · Updated 9 months ago
- Implementation of the proposed Adam-atan2 from Google DeepMind in PyTorch ☆94 · Updated this week
- Implementation of Griffin from the paper "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆50 · Updated last week
- The official repository for "HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction" ☆31 · Updated 2 months ago
- The AdEMAMix Optimizer: Better, Faster, Older ☆173 · Updated 2 months ago
- Build high-performance AI models with modular building blocks ☆425 · Updated last week
- Minimal Mamba-2 implementation in PyTorch ☆137 · Updated 5 months ago
- A simple implementation of Bayesian Flow Networks (BFN) ☆239 · Updated 10 months ago
- A More Fair and Comprehensive Comparison between KAN and MLP ☆150 · Updated 3 months ago