vpj / jax_transformer
Autoregressive transformer in JAX from scratch
☆23 · Updated 4 years ago
Alternatives and similar repositories for jax_transformer
Users who are interested in jax_transformer are comparing it to the libraries listed below.
- LoRA for arbitrary JAX models and functions ☆144 · Updated last year
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗 `safetensors` ☆47 · Updated last year
- Train very large language models in Jax. ☆210 · Updated 2 years ago
- minGPT in JAX ☆48 · Updated 4 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆92 · Updated 2 years ago
- JAX Synergistic Memory Inspector ☆184 · Updated last year
- ☆63 · Updated 3 years ago
- Amos optimizer with JEstimator lib. ☆82 · Updated last year
- JMP is a Mixed Precision library for JAX. ☆211 · Updated last year
- Fast Discounted Cumulative Sums in PyTorch ☆97 · Updated 4 years ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆37 · Updated 2 years ago
- If it quacks like a tensor... ☆59 · Updated last year
- Neural Networks for JAX ☆84 · Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆51 · Updated 2 years ago
- A functional training loops library for JAX ☆88 · Updated last year
- Inference code for LLaMA models in JAX ☆120 · Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT ☆63 · Updated 2 years ago
- A simple library for scaling up JAX programs ☆144 · Updated 2 months ago
- JAX implementation of the Llama 2 model ☆216 · Updated last year
- Machine Learning eXperiment Utilities ☆48 · Updated 6 months ago
- Implementation of Flash Attention in Jax ☆225 · Updated last year
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆190 · Updated 3 years ago
- ☆120 · Updated this week
- Functional local implementations of main model parallelism approaches ☆95 · Updated 2 years ago
- Running Jax in PyTorch Lightning ☆119 · Updated last year
- Implementation of VQ-VAE with a GPT-style sampler in the JAX and Haiku ecosystem. ☆12 · Updated 2 years ago
- ☆167 · Updated 2 years ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 · Updated 3 years ago
- Scaling scaling laws with board games. ☆53 · Updated 2 years ago
- ☆66 · Updated 3 years ago