srush/mamba-primer

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/srush/mamba-primer)

srush / mamba-primer

☆38

Alternatives and similar repositories for mamba-primer

Users that are interested in mamba-primer are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

srush / triton-autodiff
View on GitHub
Experiment of using Tangent to autodiff triton
☆81Jan 22, 2024Updated 2 years ago
srush / do-we-need-attention
View on GitHub
☆167Jul 5, 2023Updated 3 years ago
OpenNLPLab / HGRN
View on GitHub
[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se…
☆68Apr 24, 2024Updated 2 years ago
OpenNLPLab / HGRN2
View on GitHub
HGRN2: Gated Linear RNNs with State Expansion
☆58Aug 20, 2024Updated last year
emalach / LinearLM
View on GitHub
Code for the paper: https://arxiv.org/pdf/2309.06979.pdf
☆21Jul 29, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
vvvm23 / mamba-jax
View on GitHub
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆94Jan 25, 2024Updated 2 years ago
srush / annotated-mamba
View on GitHub
Annotated version of the Mamba paper
☆501Feb 27, 2024Updated 2 years ago
ermongroup / fast_feedforward_computation
View on GitHub
Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021
☆30Sep 25, 2021Updated 4 years ago
phddamuge / UniRPG
View on GitHub
This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".
☆10Apr 30, 2023Updated 3 years ago
EleutherAI / rnngineering
View on GitHub
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33May 25, 2024Updated 2 years ago
sustcsonglin / disco-pointer
View on GitHub
Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …
☆14Aug 25, 2023Updated 2 years ago
catid / spectral_ssm
View on GitHub
Implementation of Spectral State Space Models
☆16Feb 23, 2024Updated 2 years ago
mishajw / repeng
View on GitHub
Experiments with representation engineering
☆14Feb 28, 2024Updated 2 years ago
Noahs-ARK / PaLM
View on GitHub
PyTorch implementation for PaLM: A Hybrid Parser and Language Model.
☆10Jan 7, 2020Updated 6 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
BlinkDL / LinearAttentionArena
View on GitHub
Here we will test various linear attention designs.
☆62Apr 25, 2024Updated 2 years ago
Cadenza-Labs / sleeper-agents
View on GitHub
☆15Jul 12, 2024Updated 2 years ago
kazuki-irie / kv-memory-brain
View on GitHub
Official Code Repository for the paper "Key-value memory in the brain"
☆32Feb 25, 2025Updated last year
Doraemonzzz / xmixers
View on GitHub
Xmixers: A collection of SOTA efficient token/channel mixers
☆29Sep 4, 2025Updated 10 months ago
expz / annotated-hyena
View on GitHub
An annotated implementation of the Hyena Hierarchy paper
☆34May 28, 2023Updated 3 years ago
dtunai / Griffin-Jax
View on GitHub
Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆15May 10, 2024Updated 2 years ago
Itamarzimm / UnifiedImplicitAttnRepr
View on GitHub
[ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation
☆50Mar 1, 2025Updated last year
renll / SeqBoat
View on GitHub
[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling
☆40Dec 2, 2023Updated 2 years ago
mit-han-lab / SMEPO
View on GitHub
☆16May 27, 2026Updated last month
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
srush / Transformer-Puzzles
View on GitHub
Puzzles for exploring transformers
☆398May 4, 2023Updated 3 years ago
IBM / selective-dense-state-space-model
View on GitHub
Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …
☆16Sep 18, 2025Updated 10 months ago
goombalab / hydra
View on GitHub
Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers"
☆175Jan 30, 2025Updated last year
bdusell / stack-attention
View on GitHub
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18Mar 15, 2024Updated 2 years ago
srush / tangent
View on GitHub
Source-to-Source Debuggable Derivatives in Pure Python
☆15Jan 23, 2024Updated 2 years ago
hkproj / mamba-notes
View on GitHub
Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
☆183Jan 7, 2024Updated 2 years ago
yafuly / SyntacticGen
View on GitHub
☆16Jul 11, 2023Updated 3 years ago
rycolab / aflt-f2023
View on GitHub
Advanced Formal Language Theory (263-5352-00L; Frühjahr 2023)
☆10Feb 21, 2023Updated 3 years ago
QZ1-boy / CPGA
View on GitHub
[CVPR2024] Dataset and Code of "CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement".
☆14Dec 14, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
amirzandieh / HyperAttention
View on GitHub
Triton Implementation of HyperAttention Algorithm
☆48Dec 11, 2023Updated 2 years ago
LouChao98 / nner_as_parsing
View on GitHub
☆16Mar 22, 2023Updated 3 years ago
srush / torch-golf
View on GitHub
Silly twitter torch implementations.
☆48Oct 14, 2022Updated 3 years ago
sustcsonglin / flash-linear-rnn
View on GitHub
Implementations of various linear RNN layers using pytorch and triton
☆55Aug 4, 2023Updated 2 years ago
berlino / gated_linear_attention
View on GitHub
☆107Mar 9, 2024Updated 2 years ago
Deep-Learning-Profiling-Tools / triton-viz
View on GitHub
☆350Jul 16, 2026Updated last week
wilson1yan / cs294-158-ssl
View on GitHub
☆16Jun 25, 2022Updated 4 years ago