google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural network models (and their initializations) to make them easier to train.
☆66 · Updated 2 months ago
Alternatives and similar repositories for dks:
Users who are interested in dks compare it to the libraries listed below.
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 · Updated 2 years ago
- JAX implementation of "Learning to learn by gradient descent by gradient descent" ☆26 · Updated 3 months ago
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆154 · Updated last month
- ☆50 · Updated 3 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆44 · Updated last year
- ☆30 · Updated this week
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆14 · Updated 2 months ago
- Open source code for EigenGame. ☆30 · Updated last year
- ☆111 · Updated 3 weeks ago
- Transformers with doubly stochastic attention ☆44 · Updated 2 years ago
- A selection of neural network models ported from torchvision for JAX & Flax. ☆44 · Updated 4 years ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆43 · Updated 6 months ago
- Latent Diffusion Language Models ☆68 · Updated last year
- AdaCat ☆49 · Updated 2 years ago
- ☆58 · Updated 2 years ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 · Updated last year
- Automatically take good care of your preemptible TPUs ☆35 · Updated last year
- ☆100 · Updated 7 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆29 · Updated 3 weeks ago
- ☆26 · Updated 11 months ago
- JMP is a Mixed Precision library for JAX. ☆189 · Updated last month
- Quantification of Uncertainty with Adversarial Models ☆27 · Updated last year
- minGPT in JAX ☆46 · Updated 3 years ago
- Experiment of using Tangent to autodiff triton ☆74 · Updated last year
- Official Implementation of the ICML 2023 paper: "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally … ☆69 · Updated last year
- Usable implementation of Emerging Symbol Binding Network (ESBN), in PyTorch ☆23 · Updated 4 years ago
- ☆58 · Updated 2 years ago
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) ☆21 · Updated 2 months ago
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆106 · Updated 2 years ago
- A port of muP to JAX/Haiku ☆25 · Updated 2 years ago