lindermanlab / elk
Scalable and Stable Parallelization of Nonlinear RNNs
☆17Updated 6 months ago
Alternatives and similar repositories for elk
Users that are interested in elk are comparing it to the libraries listed below
- ☆51Updated last year
- Implementation of PSGD optimizer in JAX☆34Updated 7 months ago
- ☆32Updated 10 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆93Updated 4 months ago
- ☆115Updated last month
- A simple library for scaling up JAX programs☆140Updated 9 months ago
- ☆53Updated 10 months ago
- 🧱 Modula software package☆216Updated last week
- LoRA for arbitrary JAX models and functions☆140Updated last year
- A MAD laboratory to improve AI architecture designs 🧪☆123Updated 7 months ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆85Updated last year
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons☆95Updated last month
- PyTorch-like dataloaders for JAX.☆94Updated 2 months ago
- Minimal but scalable implementation of large language models in JAX☆35Updated 2 weeks ago
- Maximal Update Parametrization (μP) with Flax & Optax.☆16Updated last year
- Code for the paper "Function-Space Learning Rates"☆23Updated 2 months ago
- The Energy Transformer block, in JAX☆59Updated last year
- ☆31Updated 8 months ago
- seqax = sequence modeling + JAX☆165Updated 2 weeks ago
- Turn jitted jax functions back into python source code☆22Updated 7 months ago
- Use JAX functions in PyTorch☆248Updated 2 years ago
- ☆206Updated 8 months ago
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"☆14Updated 2 months ago
- supporting pytorch FSDP for optimizers☆84Updated 7 months ago
- This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio…☆25Updated last year
- 📄Small Batch Size Training for Language Models☆36Updated last week
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆149Updated last month
- Accelerated First Order Parallel Associative Scan☆184Updated 11 months ago
- nanoGPT using Equinox☆13Updated 2 years ago
- Parallelizing non-linear sequential models over the sequence length☆53Updated last month