google-deepmind / dks
Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural network models (and their initializations) to make them easier to train.
☆71, updated 2 weeks ago
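To make the description above concrete, here is a minimal, illustrative sketch of the transformed-activation idea behind DKS/TAT: an existing activation is composed with affine maps whose constants are chosen so that the network's initialization-time kernel behaves well. The helper `transformed_activation` and the constant values below are hypothetical placeholders for illustration only, not the dks library's API or the values it would compute.

```python
import jax
import jax.numpy as jnp

def transformed_activation(phi, alpha, beta, gamma, delta):
    """Compose an activation with affine maps: x -> gamma * phi(alpha * x + beta) + delta.

    Hypothetical helper for illustration; DKS/TAT solve for these constants to
    control the network's initialization-time kernel, rather than using
    hand-picked values like the ones below.
    """
    return lambda x: gamma * phi(alpha * x + beta) + delta

# Placeholder constants, not values produced by the dks library.
act = transformed_activation(jax.nn.softplus, alpha=1.1, beta=-0.2, gamma=1.3, delta=0.05)
print(act(jnp.linspace(-3.0, 3.0, 7)))
```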
Alternatives and similar repositories for dks
Users interested in dks are comparing it to the libraries listed below.
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆179, updated last month
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37, updated 2 years ago
- ☆53, updated 9 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆59, updated 3 years ago
- JAX implementation of "Learning to learn by gradient descent by gradient descent" ☆27, updated 9 months ago
- ☆60, updated 3 years ago
- Automatically take good care of your preemptible TPUs ☆36, updated 2 years ago
- Experiment of using Tangent to autodiff Triton ☆79, updated last year
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆44, updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68, updated 2 years ago
- ☆31, updated last month
- Fast training of unitary deep network layers from low-rank updates ☆28, updated 2 years ago
- ☆32, updated 9 months ago
- microjax: a JAX-like function transformation engine, but micro ☆33, updated 8 months ago
- ☆114, updated last week
- A port of muP to JAX/Haiku ☆25, updated 2 years ago
- Open source code for EigenGame. ☆30, updated 2 years ago
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆112, updated 3 years ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆46, updated last year
- Latent Diffusion Language Models ☆68, updated last year
- Differentiable Algorithms and Algorithmic Supervision. ☆115, updated 2 years ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆62, updated 4 years ago
- DiCE: The Infinitely Differentiable Monte-Carlo Estimator ☆31, updated last year
- A simple hypernetwork implementation in jax using haiku. ☆23, updated 2 years ago
- A GPT, made only of MLPs, in Jax ☆58, updated 4 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆37, updated 2 years ago
- Neural Networks for JAX ☆84, updated 9 months ago
- Running Jax in PyTorch Lightning ☆106, updated 7 months ago
- A collection of optimizers, some arcane, others well known, for Flax. ☆29, updated 3 years ago
- Image augmentation library for Jax