mgrankin / minGPT

minGPT in JAX

☆48

Alternatives and similar repositories for minGPT

Users who are interested in minGPT are comparing it to the libraries listed below.

davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆140 Updated last year
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆180 Updated last week
cgarciae / nanoGPT-jax
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆35 Updated last year
davisyoshida / haiku-mup
A port of muP to JAX/Haiku
☆25 Updated 2 years ago
cgarciae / ciclo
A functional training loops library for JAX
☆88 Updated last year
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆85 Updated last year
young-geng / scalax
A simple library for scaling up JAX programs
☆140 Updated 9 months ago
toshas / torch-discounted-cumsum
Fast Discounted Cumulative Sums in PyTorch
☆96 Updated 3 years ago
aks2203 / easy-to-hard
Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks"
☆59 Updated 3 years ago
jenkspt / gpt-jax
Jax/Flax rewrite of Karpathy's nanoGPT
☆59 Updated 2 years ago
davisyoshida / qax
If it quacks like a tensor...
☆58 Updated 8 months ago
andyljones / boardlaw
Scaling scaling laws with board games.
☆51 Updated 2 years ago
google-deepmind / tf2jax
☆115 Updated this week
google-deepmind / jmp
JMP is a Mixed Precision library for JAX.
☆207 Updated 6 months ago
ayaka14732 / jax-smi
JAX Synergistic Memory Inspector
☆177 Updated last year
cgarciae / nnx
Neural Networks for JAX
☆84 Updated 10 months ago
nestordemeure / jochastic
A JAX implementation of stochastic addition.
☆14 Updated 2 years ago
dylandoblar / noether-networks
Meta-learning inductive biases in the form of useful conserved quantities.
☆37 Updated 2 years ago
alvarobartt / safejax
Serialize JAX, Flax, Haiku, or Objax model params with 🤗`safetensors`
☆45 Updated last year
shyamsn97 / hyper-nn
Easy Hypernetworks in Pytorch and Jax
☆103 Updated 2 years ago
JesseFarebro / flax-mup
Maximal Update Parametrization (μP) with Flax & Optax.
☆16 Updated last year
Sea-Snell / grokking
unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆78 Updated 3 years ago
sholtodouglas / scalingExperiments
☆61 Updated 3 years ago
attentionneuron / attentionneuron.github.io
☆26 Updated 2 years ago
ssokota / mec
Code for minimum-entropy coupling.
☆32 Updated last year
google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…
☆71 Updated last month
GallagherCommaJack / modulax
☆17 Updated 11 months ago
irhum / hyena
JAX/Flax implementation of the Hyena Hierarchy
☆34 Updated 2 years ago
google-deepmind / jaxline
☆158 Updated last year
ludwigwinkler / JaxLightning
Running Jax in PyTorch Lightning
☆109 Updated 7 months ago