kach / gradient-descent-the-ultimate-optimizer
Code for our NeurIPS 2022 paper "Gradient Descent: The Ultimate Optimizer"
☆363 · Updated last year
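The paper's core trick is to treat the optimizer's own hyperparameters, such as the learning rate, as quantities that can themselves be tuned by gradient descent, by backpropagating through the weight-update step (and the construction can be stacked, putting an optimizer on top of an optimizer). Below is a minimal PyTorch sketch of that idea, not the library's actual API; the toy linear model, synthetic data, and meta step size `kappa` are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
w = torch.randn(10, requires_grad=True)  # model parameters (toy linear model)
lr = torch.tensor(0.01)                  # hyperparameter we will also learn
kappa = 1e-4                             # fixed step size for updating lr itself

x, y = torch.randn(100, 10), torch.randn(100)  # synthetic regression data

for step in range(50):
    loss = ((x @ w - y) ** 2).mean()
    # create_graph=True keeps the graph so the next loss can be
    # differentiated with respect to lr through this update
    (g,) = torch.autograd.grad(loss, w, create_graph=True)
    lr = lr.detach().requires_grad_(True)
    w_new = w - lr * g                           # differentiable SGD step
    next_loss = ((x @ w_new - y) ** 2).mean()
    (dlr,) = torch.autograd.grad(next_loss, lr)  # hypergradient d(loss)/d(lr)
    lr = (lr - kappa * dlr).detach()             # gradient descent on lr itself
    w = w_new.detach().requires_grad_(True)
```

Here `lr` receives its own gradient from the loss one step ahead; the paper generalizes this to arbitrary optimizers (e.g. Adam tuning SGD's hyperparameters) stacked to any depth.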
Related projects
Alternatives and complementary repositories for gradient-descent-the-ultimate-optimizer
- A library to inspect and extract intermediate layers of PyTorch models. ☆470 · Updated 2 years ago
- Named tensors with first-class dimensions for PyTorch. ☆322 · Updated last year
- Laplace approximations for Deep Learning. ☆471 · Updated this week
- This library would form a permanent home for reusable components for deep probabilistic programming. The library would form and harness a… ☆301 · Updated 3 weeks ago
- BackPACK: a backpropagation package built on top of PyTorch that efficiently computes quantities other than the gradient. ☆561 · Updated this week
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) optimizer in PyTorch. ☆247 · Updated 2 years ago
- Implementation of the Annotated S4 (https://srush.github.io/annotated-s4). ☆469 · Updated last year
- Unofficial JAX implementations of deep learning research papers. ☆151 · Updated 2 years ago
- Tensors, for human consumption. ☆1,113 · Updated 3 weeks ago
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions. ☆258 · Updated last year
- Constrained optimization toolkit for PyTorch. ☆661 · Updated 2 years ago
- A parallel ODE solver for PyTorch. ☆232 · Updated last month
- Automatic gradient descent. ☆206 · Updated last year
- Compositional Linear Algebra. ☆432 · Updated 3 weeks ago
- ASDL: Automatic Second-order Differentiation Library for PyTorch. ☆179 · Updated 3 months ago
- FFCV-SSL: Fast Forward Computer Vision for Self-Supervised Learning. ☆203 · Updated last year
- Cockpit: A Practical Debugging Tool for Training Deep Neural Networks. ☆473 · Updated 2 years ago
- D-Adaptation for SGD, Adam and AdaGrad. ☆502 · Updated 11 months ago
- Optimal transport tools implemented with the JAX framework, to get differentiable, parallel and jit-able computations. ☆526 · Updated this week
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆333 · Updated 3 weeks ago
- Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries". ☆471 · Updated last year
- Use JAX functions in PyTorch. ☆228 · Updated last year
- Betty: an automatic differentiation library for generalized meta-learning and multilevel optimization. ☆332 · Updated 4 months ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory. ☆428 · Updated 2 months ago
- Implementation of a memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory". ☆360 · Updated last year
- Pretrained deep learning models for JAX/Flax: StyleGAN2, GPT2, VGG, ResNet, etc. ☆238 · Updated last year