KellerJordan / top-sgd
Optimization algorithm which fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization)
☆11 Updated 11 months ago
Related projects:
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated last year
- ☆42 Updated 3 months ago
- Code for minimum-entropy coupling. ☆29 Updated 2 months ago
- ☆40 Updated 2 months ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆74 Updated 7 months ago
- Euclidean Wasserstein-2 optimal transportation ☆43 Updated last year
- Lightning-like training API for JAX with Flax ☆28 Updated 4 months ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆57 Updated last month
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- Pytorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network ☆45 Updated last month
- Clean RL implementation using MLX ☆26 Updated 6 months ago
- Neural Optimal Transport with Lagrangian Costs ☆37 Updated 2 months ago
- ☆28 Updated last week
- Open source code for EigenGame. ☆28 Updated last year
- minGPT in JAX ☆45 Updated 2 years ago
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆20 Updated 2 months ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆58 Updated 2 years ago
- ☆46 Updated 7 months ago
- Pytorch implementation of preconditioned stochastic gradient descent (affine group preconditioner, low-rank approximation preconditioner … ☆105 Updated this week
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf… ☆22 Updated 10 months ago
- ☆17 Updated 4 months ago
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes" ☆20 Updated last year
- Experiment of using Tangent to autodiff triton ☆66 Updated 7 months ago
- flexible meta-learning in jax ☆12 Updated 11 months ago
- ☆23 Updated this week
- A metrics library for the JAX ecosystem ☆36 Updated last year
- ☆27 Updated this week
- Generative cellular automaton-like learning environments for RL. ☆19 Updated last month
- ☆30 Updated this week
- A system for automating selection and optimization of pre-trained models from the TAO Model Zoo ☆19 Updated 2 months ago