kach / gradient-descent-the-ultimate-optimizer
Code for our NeurIPS 2022 paper
☆368 · Updated 2 years ago
Alternatives and similar repositories for gradient-descent-the-ultimate-optimizer
Users who are interested in gradient-descent-the-ultimate-optimizer are comparing it to the libraries listed below.
- Named tensors with first-class dimensions for PyTorch ☆320 · Updated last year
- ☆776 · Updated 3 weeks ago
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions ☆258 · Updated last year
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆252 · Updated 2 years ago
- Automatic gradient descent ☆207 · Updated last year
- This library would form a permanent home for reusable components for deep probabilistic programming. The library would form and harness a… ☆307 · Updated 2 months ago
- Cockpit: A Practical Debugging Tool for Training Deep Neural Networks ☆477 · Updated 2 years ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory ☆436 · Updated 8 months ago
- A library to inspect and extract intermediate layers of PyTorch models. ☆472 · Updated 3 years ago
- functorch is JAX-like composable function transforms for PyTorch. ☆1,424 · Updated this week
- Tensors, for human consumption ☆1,251 · Updated 5 months ago
- BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient. ☆581 · Updated 4 months ago
- Implementation of https://srush.github.io/annotated-s4 ☆494 · Updated 2 years ago
- Laplace approximations for Deep Learning. ☆502 · Updated 3 weeks ago
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆382 · Updated last week
- Constrained optimization toolkit for PyTorch ☆678 · Updated 3 years ago
- TorchOpt is an efficient library for differentiable optimization built upon PyTorch. ☆586 · Updated last week
- Compositional Linear Algebra ☆476 · Updated last month
- A parallel ODE solver for PyTorch ☆255 · Updated 7 months ago
- A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training ☆466 · Updated last year
- Fast, differentiable sorting and ranking in PyTorch ☆811 · Updated last year
- Type annotations and dynamic checking for a tensor's shape, dtype, names, etc. ☆1,432 · Updated last week
- Unofficial JAX implementations of deep learning research papers ☆156 · Updated 2 years ago
- ☆291 · Updated 4 months ago
- TensorDict is a pytorch dedicated tensor container. ☆925 · Updated this week
- Easy Hypernetworks in Pytorch and Jax ☆100 · Updated 2 years ago
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory" ☆378 · Updated last year
- Drop-in replacement for any ResNet with a significantly reduced memory footprint and better representation capabilities ☆209 · Updated last year
- Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries" ☆479 · Updated 2 years ago
- D-Adaptation for SGD, Adam and AdaGrad ☆521 · Updated 3 months ago