NX-AI / xlstm-jax
Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.co/NX-AI/xLSTM-7b.
☆91 · Updated 3 months ago
Alternatives and similar repositories for xlstm-jax:
Users interested in xlstm-jax are comparing it to the libraries listed below.
- 🧱 Modula software package ☆188 · Updated 2 weeks ago
- ☆215 · Updated 9 months ago
- ☆289 · Updated 3 months ago
- Cost-aware hyperparameter tuning algorithm ☆150 · Updated 9 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆78 · Updated 11 months ago
- ☆173 · Updated 4 months ago
- ☆150 · Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆103 · Updated 4 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM kernels. ☆53 · Updated last week
- Simple, minimal implementation of the Mamba SSM in one PyTorch file. Using logcumsumexp (Heisen sequence). ☆112 · Updated 5 months ago
- Normalized Transformer (nGPT) ☆167 · Updated 4 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆180 · Updated 7 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆80 · Updated last month
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆98 · Updated 3 months ago
- Supporting PyTorch FSDP for optimizers ☆80 · Updated 4 months ago
- Accelerated First Order Parallel Associative Scan ☆180 · Updated 7 months ago
- Understand and test language model architectures on synthetic tasks. ☆191 · Updated last month
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆135 · Updated last month
- WIP ☆93 · Updated 8 months ago
- Evaluating the Mamba architecture on the Othello game ☆46 · Updated 11 months ago
- ☆98 · Updated last week
- Quick implementation of nGPT, learning entirely on the hypersphere, from Nvidia AI ☆279 · Updated 3 weeks ago
- Scalable and Performant Data Loading ☆235 · Updated this week
- ☆212 · Updated this week
- DeMo: Decoupled Momentum Optimization ☆186 · Updated 4 months ago
- Minimal (400 LOC) implementation of Maximum (multi-node, FSDP) GPT training ☆123 · Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆109 · Updated 4 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆228 · Updated last month
- seqax = sequence modeling + JAX ☆153 · Updated last week
- ☆92 · Updated 2 months ago