NX-AI / xlstm-jax
Official JAX implementation of xLSTM, including fast and efficient training and inference code. The 7B model is available at https://huggingface.co/NX-AI/xLSTM-7b.
⭐91 · Updated 4 months ago
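For reference, a minimal sketch of loading the published 7B checkpoint through the Hugging Face `transformers` API. This follows the usual AutoModel pattern rather than this repository's JAX training/inference API, and it assumes a `transformers` version with xLSTM support; check the model card for the exact requirements.

```python
# Hedged sketch: load NX-AI/xLSTM-7b via Hugging Face transformers.
# Assumes a transformers release with xLSTM support; the model card at
# https://huggingface.co/NX-AI/xLSTM-7b lists the exact requirements
# (it may additionally need the xlstm / mlstm_kernels packages).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NX-AI/xLSTM-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The xLSTM architecture", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```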
Alternatives and similar repositories for xlstm-jax:
Users interested in xlstm-jax are comparing it to the libraries listed below.
- 🧱 Modula software package ⭐188 · Updated last month
- ⭐290 · Updated 4 months ago
- ⭐150 · Updated 8 months ago
- ⭐178 · Updated 5 months ago
- A State-Space Model with Rational Transfer Function Representation. ⭐78 · Updated 11 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ⭐114 · Updated this week
- Simple, minimal implementation of the Mamba SSM in one PyTorch file, using logcumsumexp (Heisen sequence). ⭐116 · Updated 6 months ago
- ⭐217 · Updated 9 months ago
- Supporting PyTorch FSDP for optimizers. ⭐80 · Updated 5 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ⭐114 · Updated 4 months ago
- DeMo: Decoupled Momentum Optimization ⭐185 · Updated 5 months ago
- ⭐81 · Updated last year
- Accelerated First Order Parallel Associative Scan (see the sketch after this list). ⭐182 · Updated 8 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds. ⭐233 · Updated 2 months ago
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients". ⭐99 · Updated 4 months ago
- An implementation of the PSGD Kron second-order optimizer for PyTorch. ⭐89 · Updated last month
- Understand and test language model architectures on synthetic tasks. ⭐195 · Updated 2 months ago
- Annotated version of the Mamba paper. ⭐483 · Updated last year
- Efficient optimizers. ⭐193 · Updated this week
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training. ⭐123 · Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper). ⭐82 · Updated last month
- Implementation of GateLoop Transformer in PyTorch and JAX. ⭐87 · Updated 10 months ago
- ⭐60 · Updated 5 months ago
- Tiled Flash Linear Attention library for fast and efficient mLSTM kernels. ⭐56 · Updated last month
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ⭐60 · Updated 6 months ago
- WIP ⭐93 · Updated 8 months ago
- Getting crystal-like representations with harmonic loss. ⭐182 · Updated last month
- Implementation of the PSGD optimizer in JAX. ⭐33 · Updated 4 months ago
- ⭐94 · Updated 3 months ago
- Cost-aware hyperparameter tuning algorithm. ⭐151 · Updated 10 months ago
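Several entries above (the parallel associative scan and the minimal Mamba implementation) revolve around the same primitive: evaluating a first-order linear recurrence in parallel rather than sequentially. Below is a minimal, self-contained sketch using `jax.lax.associative_scan`; it is not code from any listed repository, and the combine rule and variable names are illustrative.

```python
# Hedged sketch (not taken from any listed repo): evaluating the first-order
# linear recurrence h_t = a_t * h_{t-1} + b_t with a parallel associative scan.
import jax
import jax.numpy as jnp

def combine(left, right):
    # Composing the affine maps h -> a1*h + b1 and then h -> a2*h + b2
    # gives h -> (a1*a2)*h + (a2*b1 + b2); this composition is associative.
    a1, b1 = left
    a2, b2 = right
    return a1 * a2, a2 * b1 + b2

a = jax.random.uniform(jax.random.PRNGKey(0), (8,))   # gate/decay coefficients
b = jax.random.normal(jax.random.PRNGKey(1), (8,))    # inputs

# All prefix compositions in O(log T) depth; with h_0 = 0 the b-component
# of each prefix composition is exactly h_t.
_, h = jax.lax.associative_scan(combine, (a, b))

# Sequential reference implementation for comparison.
def step(h_prev, ab):
    a_t, b_t = ab
    h_t = a_t * h_prev + b_t
    return h_t, h_t

_, h_seq = jax.lax.scan(step, jnp.zeros((), a.dtype), (a, b))
assert jnp.allclose(h, h_seq, atol=1e-5)
```

Because `combine` is associative, the scan evaluates all prefixes in logarithmic depth, which is what makes scan-based SSM and linear-attention kernels fast on accelerators.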