davisyoshida / haiku-mup

A port of muP to JAX/Haiku

☆25

Alternatives and similar repositories for haiku-mup

Users who are interested in haiku-mup are comparing it to the libraries listed below.

GallagherCommaJack / modulax
☆17 Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆143 Updated last year
nestordemeure / jochastic
A JAX implementation of stochastic addition.
☆14 Updated 3 years ago
davisyoshida / qax
If it quacks like a tensor...
☆59 Updated last year
mgrankin / minGPT
minGPT in JAX
☆48 Updated 3 years ago
google-deepmind / jmp
JMP is a Mixed Precision library for JAX.
☆211 Updated 10 months ago
cgarciae / ciclo
A functional training loops library for JAX
☆88 Updated last year
ayaka14732 / jax-smi
JAX Synergistic Memory Inspector
☆182 Updated last year
young-geng / scalax
A simple library for scaling up JAX programs
☆144 Updated last month
joschu / jax-exp
☆24 Updated 6 years ago
lixilinx / psgd_torch
Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆188 Updated last month
evanatyourservice / psgd_jax
Implementation of PSGD optimizer in JAX
☆35 Updated 11 months ago
JesseFarebro / flax-mup
Maximal Update Parametrization (μP) with Flax & Optax.
☆16 Updated last year
johnryan465 / pscan
☆40 Updated last year
sholtodouglas / scalingExperiments
☆62 Updated 3 years ago
toshas / torch-discounted-cumsum
Fast Discounted Cumulative Sums in PyTorch
☆96 Updated 4 years ago
google-deepmind / jaxline
☆161 Updated last year
google-deepmind / einshape
☆107 Updated last year
BirkhoffG / jax-dataloader
Pytorch-like dataloaders for JAX.
☆97 Updated 6 months ago
akbir / deq-jax
[NeurIPS'19] Deep Equilibrium Models Jax Implementation
☆42 Updated 5 years ago
srush / torch-golf
Silly twitter torch implementations.
☆46 Updated 3 years ago
shyamsn97 / hyper-nn
Easy Hypernetworks in Pytorch and Jax
☆106 Updated 2 years ago
xl0 / lovely-jax
JAX Arrays for human consumption
☆110 Updated last month
ag1988 / dss
Sequence Modeling with Structured State Spaces
☆66 Updated 3 years ago
vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆92 Updated last year
dylandoblar / noether-networks
Meta-learning inductive biases in the form of useful conserved quantities.
☆38 Updated 3 years ago
jenkspt / gpt-jax
Jax/Flax rewrite of Karpathy's nanoGPT
☆62 Updated 2 years ago
Sea-Snell / JAXSeq
Train very large language models in Jax.
☆210 Updated 2 years ago
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆37 Updated 2 years ago
cgarciae / nnx
Neural Networks for JAX
☆84 Updated last year