srush/mamba-scans

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/srush/mamba-scans)

srush / mamba-scans

Blog post

☆17

Alternatives and similar repositories for mamba-scans

Users that are interested in mamba-scans are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sustcsonglin / mamba-triton
View on GitHub
☆52Jan 28, 2024Updated 2 years ago
Aleph-Alpha-Research / NeurIPS-WANT-submission-efficient-parallelization-layouts
View on GitHub
☆22Dec 15, 2023Updated 2 years ago
srush / triton-autodiff
View on GitHub
Experiment of using Tangent to autodiff triton
☆81Jan 22, 2024Updated 2 years ago
srush / tangent
View on GitHub
Source-to-Source Debuggable Derivatives in Pure Python
☆15Jan 23, 2024Updated 2 years ago
EleutherAI / rnngineering
View on GitHub
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆33May 25, 2024Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
srush / LLM-Talk
View on GitHub
☆53Jan 30, 2024Updated 2 years ago
HazyResearch / embroid
View on GitHub
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
☆11Aug 12, 2023Updated 2 years ago
glassroom / heinsen_attention
View on GitHub
Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)
☆25Jun 6, 2024Updated 2 years ago
f-dangel / einconv
View on GitHub
Convolutions and more as einsum for PyTorch
☆18Jun 6, 2024Updated 2 years ago
VITA-Group / SSM-Bottleneck
View on GitHub
[ICLR'25] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yue…
☆18Mar 21, 2025Updated last year
johnryan465 / pscan
View on GitHub
☆40Jan 5, 2024Updated 2 years ago
zomux / lanmt-ebm
View on GitHub
lanmt ebm
☆12Jun 19, 2020Updated 6 years ago
IDSIA / rtrl-elstm
View on GitHub
Official repository for the paper "Exploring the Promise and Limits of Real-Time Recurrent Learning" (ICLR 2024)
☆13Jun 11, 2025Updated last year
dtunai / Griffin-Jax
View on GitHub
Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
☆15May 10, 2024Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
berlino / seq_icl
View on GitHub
☆54May 20, 2024Updated 2 years ago
rycolab / aflt-f2023
View on GitHub
Advanced Formal Language Theory (263-5352-00L; Frühjahr 2023)
☆10Feb 21, 2023Updated 3 years ago
polo5 / FDS
View on GitHub
Gradient-based Hyperparameter Optimization Over Long Horizons
☆14Sep 29, 2021Updated 4 years ago
yikangshen / megablocks
View on GitHub
☆20May 30, 2024Updated 2 years ago
siyuanseever / llama2Rnn.c
View on GitHub
☆13Apr 15, 2024Updated 2 years ago
codefuse-ai / Collinear-Constrained-Attention
View on GitHub
☆62Jun 17, 2024Updated 2 years ago
Leooyii / LCEG
View on GitHub
[COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs
☆65Mar 9, 2026Updated 4 months ago
alan-hpc / cuda_op_benchmark
View on GitHub
方便扩展的Cuda算子理解和优化框架，仅用在学习使用
☆18Jun 13, 2024Updated 2 years ago
katalinic / sdflows
View on GitHub
JAX exponential map normalising flows on sphere
☆17Oct 4, 2020Updated 5 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
proger / nanokitchen
View on GitHub
Parallel Associative Scan for Language Models
☆18Jan 8, 2024Updated 2 years ago
proger / accelerated-scan
View on GitHub
Accelerated First Order Parallel Associative Scan
☆198Jan 7, 2026Updated 6 months ago
TRI-ML / linear_open_lm
View on GitHub
A repository for research on medium sized language models.
☆78May 23, 2024Updated 2 years ago
joelouismarino / variational_rl
View on GitHub
Variational Reinforcement Learning
☆18Jul 25, 2024Updated last year
acosharma / elita-transformer
View on GitHub
Official Repository for Efficient Linear-Time Attention Transformers.
☆18Jun 2, 2024Updated 2 years ago
lindermanlab / ssm-jax
View on GitHub
Bayesian learning and inference for state space models (SSMs) using Google Research's JAX as a backend
☆63Jun 17, 2024Updated 2 years ago
radarFudan / Curse-of-memory
View on GitHub
Curse-of-memory phenomenon of RNNs in sequence modelling
☆19May 8, 2025Updated last year
maximzubkov / fft-scan
View on GitHub
Efficient PScan implementation in PyTorch
☆17Jan 2, 2024Updated 2 years ago
OpenNLPLab / HGRN2
View on GitHub
HGRN2: Gated Linear RNNs with State Expansion
☆58Aug 20, 2024Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
BlinkDL / LinearAttentionArena
View on GitHub
Here we will test various linear attention designs.
☆62Apr 25, 2024Updated 2 years ago
Lingkai-Kong / so-ebm
View on GitHub
Code for paper: End-to-end Stochastic Optimization with Energy-based Model
☆16Feb 14, 2023Updated 3 years ago
haorannlp / mix
View on GitHub
Code for "Mixed Cross Entropy Loss for Neural Machine Translation"
☆20Jul 23, 2021Updated 5 years ago
KurochkinAlexey / AntisymmetricRNN
View on GitHub
Python implementation of paper "AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks"
☆15Aug 2, 2019Updated 6 years ago
Benjamin-Walker / selective-ssms-and-linear-cdes
View on GitHub
Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)
☆17Jan 7, 2025Updated last year
sayakpaul / big_vision_experiments
View on GitHub
Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k.
☆22Jan 16, 2023Updated 3 years ago
gisilvs / AEF
View on GitHub
☆33Mar 1, 2023Updated 3 years ago