glassroom / heinsen_sequence
Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)
☆92 Updated 4 months ago
Alternatives and similar repositories for heinsen_sequence:
Users who are interested in heinsen_sequence are comparing it to the libraries listed below.
- ☆52 Updated 6 months ago
- ☆215 Updated 9 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆111 Updated 4 months ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆173 Updated this week
- seqax = sequence modeling + JAX ☆154 Updated 2 weeks ago
- Accelerated First Order Parallel Associative Scan ☆181 Updated 8 months ago
- ☆53 Updated last year
- Experiment of using Tangent to autodiff triton ☆78 Updated last year
- ☆39 Updated last year
- supporting pytorch FSDP for optimizers ☆80 Updated 4 months ago
- 🧱 Modula software package ☆188 Updated 3 weeks ago
- Implementation of PSGD optimizer in JAX ☆31 Updated 3 months ago
- ☆246 Updated 6 months ago
- A library for unit scaling in PyTorch ☆125 Updated 4 months ago
- train with kittens! ☆57 Updated 5 months ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆87 Updated 10 months ago
- ☆175 Updated 4 months ago
- A simple library for scaling up JAX programs ☆134 Updated 5 months ago
- LoRA for arbitrary JAX models and functions ☆136 Updated last year
- ☆17 Updated 8 months ago
- Parallel Associative Scan for Language Models ☆18 Updated last year
- Automatically take good care of your preemptible TPUs ☆36 Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆232 Updated 2 months ago
- Named Tensors for Legible Deep Learning in JAX ☆172 Updated this week
- Running Jax in PyTorch Lightning ☆94 Updated 4 months ago
- Understand and test language model architectures on synthetic tasks. ☆192 Updated last month
- ☆31 Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT ☆57 Updated 2 years ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆51 Updated last year
- ☆99 Updated this week