NX-AI / xlstm-jax
Official JAX implementation of xLSTM, including fast and efficient training and inference code. A 7B model is available at https://huggingface.co/NX-AI/xLSTM-7b.
★97 · Updated 6 months ago
Alternatives and similar repositories for xlstm-jax
Users interested in xlstm-jax are comparing it to the libraries listed below.
- 🧱 Modula software package · ★204 · Updated 3 months ago
- ★197 · Updated 7 months ago
- ★273 · Updated last year
- Cost-aware hyperparameter tuning algorithm · ★162 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. · ★147 · Updated 2 weeks ago
- The AdEMAMix Optimizer: Better, Faster, Older. · ★183 · Updated 10 months ago
- ★98 · Updated 5 months ago
- ★295 · Updated 6 months ago
- A MAD laboratory to improve AI architecture designs 🧪 · ★123 · Updated 6 months ago
- Efficient optimizers · ★232 · Updated last week
- Latent Program Network (from the "Searching Latent Program Spaces" paper) · ★91 · Updated 4 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds · ★263 · Updated 4 months ago
- Understand and test language model architectures on synthetic tasks. · ★219 · Updated last month
- Accelerated First Order Parallel Associative Scan · ★182 · Updated 10 months ago
- ★221 · Updated 2 weeks ago
- Accelerate and optimize performance with streamlined training and serving options in JAX. · ★288 · Updated this week
- An implementation of the PSGD Kron second-order optimizer for PyTorch · ★92 · Updated 3 months ago
- Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" · ★555 · Updated last year
- Supporting PyTorch FSDP for optimizers · ★82 · Updated 7 months ago
- Getting crystal-like representations with harmonic loss · ★191 · Updated 3 months ago
- Evaluating the Mamba architecture on the Othello game · ★47 · Updated last year
- Normalized Transformer (nGPT) · ★184 · Updated 7 months ago
- A state-space model with rational transfer function representation. · ★79 · Updated last year
- Simple, minimal implementation of the Mamba SSM in one PyTorch file, using logcumsumexp (Heisen sequence). · ★120 · Updated 8 months ago
- ★110 · Updated last month
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources · ★140 · Updated last month
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. · ★103 · Updated 2 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" · ★101 · Updated 6 months ago
- Library for text-to-text regression, applicable to any input string representation; allows pretraining and fine-tuning over multiple r… · ★86 · Updated this week
- For optimization algorithm research and development. · ★521 · Updated this week