vpj / jax_transformer
Autoregressive transformer in JAX from scratch
☆22 Updated 3 years ago
Alternatives and similar repositories for jax_transformer:
Users interested in jax_transformer are comparing it to the libraries listed below.
- LoRA for arbitrary JAX models and functions ☆136 Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ☆44 Updated 10 months ago
- Machine Learning eXperiment Utilities ☆46 Updated 10 months ago
- Inference code for LLaMA models in JAX ☆118 Updated 11 months ago
- Jax/Flax rewrite of Karpathy's nanoGPT ☆57 Updated 2 years ago
- minGPT in JAX ☆48 Updated 3 years ago
- Train very large language models in Jax. ☆204 Updated last year
- If it quacks like a tensor... ☆58 Updated 5 months ago
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆49 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆33 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆80 Updated 3 years ago
- ☆60 Updated 3 years ago
- JAX Synergistic Memory Inspector ☆172 Updated 9 months ago
- Automatically take good care of your preemptible TPUs ☆36 Updated last year
- ☆20 Updated last year
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆78 Updated 2 years ago
- Fast Discounted Cumulative Sums in PyTorch ☆95 Updated 3 years ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆60 Updated 3 years ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆30 Updated this week
- Resources from the EleutherAI Math Reading Group ☆53 Updated 2 months ago
- JAX implementation of the Mistral 7b v0.2 model ☆35 Updated 9 months ago
- Scaling scaling laws with board games. ☆48 Updated last year
- Minimal but scalable implementation of large language models in JAX ☆34 Updated 5 months ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆83 Updated last year
- A simple library for scaling up JAX programs ☆134 Updated 5 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated 2 years ago
- Image augmentation library for Jax ☆39 Updated last year
- ☆102 Updated this week
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 3 years ago
- A metrics library for the JAX ecosystem ☆40 Updated 2 years ago