NTT123 / pax
A stateful pytree library for training neural networks.
☆22 Updated 2 years ago
Alternatives and similar repositories for pax:
Users who are interested in pax are comparing it to the libraries listed below.
- Image augmentation library for Jax ☆39 Updated last year
- Re-implementation of 'Grokking: Generalization beyond overfitting on small algorithmic datasets' ☆38 Updated 3 years ago
- Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors` ☆44 Updated 10 months ago
- ☆31 Updated 2 weeks ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆31 Updated last year
- HomebrewNLP in JAX flavour for maintainable TPU-Training ☆49 Updated last year
- A selection of neural network models ported from torchvision for JAX & Flax. ☆44 Updated 4 years ago
- Automatically take good care of your preemptible TPUs ☆36 Updated last year
- ☆17 Updated 7 months ago
- AdaCat ☆49 Updated 2 years ago
- A GPT, made only of MLPs, in Jax ☆57 Updated 3 years ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated 2 years ago
- A functional training loops library for JAX ☆86 Updated last year
- A metrics library for the JAX ecosystem ☆40 Updated 2 years ago
- If it quacks like a tensor... ☆58 Updated 5 months ago
- PyTorch interface for TrueGrad Optimizers ☆41 Updated last year
- JAX implementation of Learning to learn by gradient descent by gradient descent ☆27 Updated 5 months ago
- minGPT in JAX ☆48 Updated 3 years ago
- A generative modelling toolkit for PyTorch. ☆70 Updated 3 years ago
- Neural Networks for JAX ☆84 Updated 6 months ago
- ☆24 Updated 6 years ago
- ☆33 Updated 2 years ago
- ☆64 Updated 7 months ago
- 👑 Pytorch code for the Nero optimiser. ☆20 Updated 2 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆48 Updated 3 years ago
- A small library for creating and manipulating custom JAX Pytree classes ☆56 Updated 2 years ago
- RWKV model implementation ☆37 Updated last year
- ☆60 Updated 3 years ago
- ☆101 Updated 9 months ago
- Codes accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient" ☆28 Updated 4 years ago