kach / gradient-descent-the-ultimate-optimizer
Code for our NeurIPS 2022 paper
☆368Updated 2 years ago
Alternatives and similar repositories for gradient-descent-the-ultimate-optimizer
Users who are interested in gradient-descent-the-ultimate-optimizer often compare it to the libraries listed below.
- Named tensors with first-class dimensions for PyTorch☆331Updated 2 years ago
- This library would form a permanent home for reusable components for deep probabilistic programming. The library would form and harness a…☆305Updated 3 weeks ago
- A library to inspect and extract intermediate layers of PyTorch models.☆473Updated 3 years ago
- Laplace approximations for Deep Learning.☆508Updated 2 months ago
- Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions☆258Updated last year
- ☆778Updated 3 weeks ago
- BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient.☆587Updated 5 months ago
- Constrained optimization toolkit for PyTorch☆683Updated 3 years ago
- Cockpit: A Practical Debugging Tool for Training Deep Neural Networks☆480Updated 2 years ago
- Implementation of https://srush.github.io/annotated-s4☆499Updated this week
- functorch is JAX-like composable function transforms for PyTorch.☆1,432Updated this week
- ☆152Updated 2 years ago
- Helps you write algorithms in PyTorch that adapt to the available (CUDA) memory☆438Updated 9 months ago
- FFCV-SSL Fast Forward Computer Vision for Self-Supervised Learning.☆208Updated last year
- For optimization algorithm research and development.☆521Updated this week
- Code release for "Git Re-Basin: Merging Models modulo Permutation Symmetries"☆480Updated 2 years ago
- Tensors, for human consumption☆1,256Updated this week
- D-Adaptation for SGD, Adam and AdaGrad☆522Updated 5 months ago
- Optimal transport tools implemented with the JAX framework, to solve large scale matching problems of any flavor.☆606Updated this week
- ☆312Updated 3 months ago
- Fast Differentiable Sorting and Ranking☆605Updated last year
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in PyTorch☆252Updated 2 years ago
- Fast, differentiable sorting and ranking in PyTorch☆815Updated 2 weeks ago
- Pretrained deep learning models for Jax/Flax: StyleGAN2, GPT2, VGG, ResNet, etc.☆254Updated 3 months ago
- Automatic gradient descent☆208Updated 2 years ago
- A simple way to keep track of an Exponential Moving Average (EMA) version of your PyTorch model☆592Updated 6 months ago
- Normalizing flows in PyTorch☆391Updated 3 weeks ago
- VQVAEs, GumbelSoftmaxes and friends☆572Updated 3 years ago
- Unofficial JAX implementations of deep learning research papers☆156Updated 3 years ago
- Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"☆379Updated last year