vpj / jax_transformer
Autoregressive transformer in JAX from scratch
★22 Updated 3 years ago
Alternatives and similar repositories for jax_transformer:
Users who are interested in jax_transformer are comparing it to the libraries listed below.
- minGPT in JAX ★47 Updated 3 years ago
- LoRA for arbitrary JAX models and functions ★135 Updated 11 months ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ★44 Updated 8 months ago
- some common Huggingface transformers in maximal update parametrization (µP) ★78 Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ★82 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ★33 Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT ★55 Updated 2 years ago
- ★111 Updated last week
- Scaling scaling laws with board games. ★47 Updated last year
- A port of muP to JAX/Haiku ★25 Updated 2 years ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ★71 Updated 2 years ago
- Image augmentation library for Jax ★37 Updated 10 months ago
- JMP is a Mixed Precision library for JAX. ★191 Updated 2 weeks ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ★30 Updated 2 months ago
- A functional training loops library for JAX ★86 Updated last year
- Running Jax in PyTorch Lightning ★86 Updated 2 months ago
- Fast Discounted Cumulative Sums in PyTorch ★95 Updated 3 years ago
- If it quacks like a tensor... ★56 Updated 3 months ago
- Neural Networks for JAX ★83 Updated 4 months ago
- Minimal but scalable implementation of large language models in JAX ★31 Updated 3 months ago
- Meta-learning inductive biases in the form of useful conserved quantities. ★37 Updated 2 years ago
- This is a port of Mistral-7B model in JAX ★31 Updated 7 months ago
- A simple library for scaling up JAX programs ★129 Updated 3 months ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ★48 Updated last year
- Automatically take good care of your preemptible TPUs ★36 Updated last year
- Lightning-like training API for JAX with Flax ★38 Updated 2 months ago
- Train very large language models in Jax. ★201 Updated last year
- Pytorch-like dataloaders for JAX. ★73 Updated 3 months ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ★59 Updated 2 years ago
- Machine Learning eXperiment Utilities ★46 Updated 8 months ago