kach / gradient-descent-the-ultimate-optimizer
Code for our NeurIPS 2022 paper
☆366 · Updated 2 years ago
Alternatives and similar repositories for gradient-descent-the-ultimate-optimizer:
Users who are interested in gradient-descent-the-ultimate-optimizer are comparing it to the libraries listed below:
- Named tensors with first-class dimensions for PyTorch ☆322 · Updated last year
- A library to inspect and extract intermediate layers of PyTorch models. ☆470 · Updated 2 years ago
- Implementation of https://srush.github.io/annotated-s4 ☆477 · Updated last year
- D-Adaptation for SGD, Adam and AdaGrad ☆510 · Updated last year
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆250 · Updated 2 years ago
- Laplace approximations for Deep Learning. ☆483 · Updated last month
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions ☆257 · Updated last year
- Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries" ☆477 · Updated last year
- ☆759 · Updated last week
- This library would form a permanent home for reusable components for deep probabilistic programming. The library would form and harness a… ☆302 · Updated last month
- functorch is JAX-like composable function transforms for PyTorch. ☆1,403 · Updated this week
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆345 · Updated this week
- BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient. ☆568 · Updated 2 weeks ago
- For optimization algorithm research and development. ☆484 · Updated this week
- Constrained optimization toolkit for PyTorch ☆664 · Updated 2 years ago
- Betty: an automatic differentiation library for generalized meta-learning and multilevel optimization ☆337 · Updated 6 months ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory ☆432 · Updated 4 months ago
- A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training ☆446 · Updated last year
- Automatic gradient descent ☆206 · Updated last year
- Tensors, for human consumption ☆1,178 · Updated last month
- VQVAEs, GumbelSoftmaxes and friends ☆548 · Updated 3 years ago
- Compositional Linear Algebra ☆456 · Updated this week
- A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets. ☆696 · Updated 8 months ago
- Pretrained deep learning models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet, etc. ☆240 · Updated last year
- Easy Hypernetworks in Pytorch and Jax ☆96 · Updated last year
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆370 · Updated last year
- Cockpit: A Practical Debugging Tool for Training Deep Neural Networks ☆473 · Updated 2 years ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning. ☆202 · Updated last year
- ☆164 · Updated last year
- Annotated version of the Mamba paper ☆469 · Updated 10 months ago