vvvm23 / mamba-jax

Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX

☆85

Alternatives and similar repositories for mamba-jax

Users who are interested in mamba-jax are comparing it to the libraries listed below.

radarFudan / mamba-minimal-jax
☆31 Updated 8 months ago
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆89 Updated last year
google-deepmind / spectral_ssm
☆33 Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆140 Updated last year
irhum / hyena
JAX/Flax implementation of the Hyena Hierarchy
☆34 Updated 2 years ago
martin-marek / batch-size
📄Small Batch Size Training for Language Models
☆41 Updated this week
shikaiqiu / compute-better-spent
☆53 Updated 10 months ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆100 Updated 11 months ago
machine-discovery / deer
Parallelizing non-linear sequential models over the sequence length
☆53 Updated last month
dvruette / barrel-rec-pytorch
☆53 Updated last year
johnryan465 / pscan
☆40 Updated last year
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆36 Updated 2 years ago
srush / mamba-primer
☆37 Updated last year
thjashin / multires-conv
Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)
☆125 Updated last year
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 6 months ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48 Updated last year
srush / mamba-scans
Blog post
☆17 Updated last year
lucidrains / grokfast-pytorch
Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"
☆101 Updated 7 months ago
young-geng / mlxu
Machine Learning eXperiment Utilities
☆46 Updated last week
AndPotap / einsum-search
☆32 Updated 10 months ago
sustcsonglin / mamba-triton
☆49 Updated last year
EleutherAI / rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
☆32 Updated last year
ruke1ire / RTF
A State-Space Model with Rational Transfer Function Representation.
☆79 Updated last year
young-geng / scalax
A simple library for scaling up JAX programs
☆140 Updated 9 months ago
kvfrans / splus
☆115 Updated last month
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆53 Updated last year
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆110 Updated 7 months ago
epfml / DenseFormer
☆81 Updated last year
edwardmilsom / function-space-learning-rates-paper
Code for the paper "Function-Space Learning Rates"
☆23 Updated 2 months ago
jopetty / word-problem
Experiments on the impact of depth in transformers and SSMs.
☆32 Updated 9 months ago