ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
☆283 · Feb 27, 2023 · Updated 3 years ago
Alternatives and similar repositories for adahessian
Users that are interested in adahessian are comparing it to the libraries listed below
- PyHessian is a Pytorch library for second-order based analysis and training of Neural Networks ☆778 · Jul 10, 2025 · Updated 8 months ago
- PyTorch-SSO: Scalable Second-Order methods in PyTorch ☆148 · Oct 1, 2023 · Updated 2 years ago
- torch-optimizer -- collection of optimizers for Pytorch ☆3,163 · Mar 22, 2024 · Updated last year
- diffGrad: An Optimization Method for Convolutional Neural Networks ☆55 · Oct 12, 2022 · Updated 3 years ago
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) ☆24 · Nov 4, 2024 · Updated last year
- SKFAC Preconditioner for MindSpore ☆12 · Jul 2, 2021 · Updated 4 years ago
- BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient. ☆606 · Nov 28, 2025 · Updated 3 months ago
- Hessian backpropagation (HBP): PyTorch extension of backpropagation for block-diagonal curvature matrix approximations ☆21 · Mar 25, 2023 · Updated 2 years ago
- PyTorch implementation of Hessian Free optimisation ☆43 · Dec 19, 2019 · Updated 6 years ago
- Scalable Computation of Hessian Diagonals ☆14 · Jun 2, 2024 · Updated last year
- Limitations of the Empirical Fisher Approximation ☆49 · Mar 3, 2025 · Updated last year
- Code accompanying the NeurIPS 2020 paper: WoodFisher (Singh & Alistarh, 2020) ☆53 · Mar 8, 2021 · Updated 5 years ago
- Hessian trace estimation using PyTorch and Hutch++ ☆20 · Oct 29, 2020 · Updated 5 years ago
- Ranger deep learning optimizer rewrite to use newest components ☆341 · Feb 18, 2024 · Updated 2 years ago
- Analyze AdaHessian optimizer on 2D functions. ☆13 · Aug 13, 2021 · Updated 4 years ago
- (unofficial) - customized fork of DETR, optimized for intelligent obj detection on 'real world' custom datasets ☆12 · Aug 22, 2020 · Updated 5 years ago
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆983 · Jan 30, 2024 · Updated 2 years ago
- A collection of optimizers, some arcane others well known, for Flax. ☆29 · Aug 6, 2021 · Updated 4 years ago
- An adaptive training algorithm for residual network ☆17 · Aug 22, 2020 · Updated 5 years ago
- [TMLR 2022] Curvature access through the generalized Gauss-Newton's low-rank structure: Eigenvalues, eigenvectors, directional derivative… ☆17 · Jul 19, 2023 · Updated 2 years ago
- Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization ☆182 · Nov 21, 2021 · Updated 4 years ago
- Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients" ☆1,068 · Aug 9, 2024 · Updated last year
- Pytorch implementation of KFAC and E-KFAC (Natural Gradient). ☆133 · Jul 2, 2019 · Updated 6 years ago
- Unofficially Implements https://arxiv.org/abs/2112.05682 to get Linear Memory Cost on Attention for PyTorch ☆12 · Jan 16, 2022 · Updated 4 years ago
- [ICML2023] Instant Soup Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti… ☆11 · Nov 28, 2023 · Updated 2 years ago
- This repository contains the results for the paper: "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers" ☆184 · Jul 17, 2021 · Updated 4 years ago
- Repository containing Pytorch code for EKFAC and K-FAC preconditioners. ☆153 · Jun 22, 2023 · Updated 2 years ago
- Manifold-Mixup implementation for fastai V1 ☆19 · Oct 1, 2020 · Updated 5 years ago
- Hypergradient descent ☆146 · May 31, 2024 · Updated last year
- A Chainer extension for K-FAC ☆20 · Jun 16, 2019 · Updated 6 years ago
- Convolutions and more as einsum for PyTorch ☆17 · Jun 6, 2024 · Updated last year
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆15 · Nov 4, 2024 · Updated last year
- A Sparse-tensor Communication Framework for Distributed Deep Learning ☆13 · Nov 1, 2021 · Updated 4 years ago
- [ICLR 2023] Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation ☆12 · Jul 31, 2023 · Updated 2 years ago
- Implemented image caption generation method proposed in Show, Attend, and Tell paper using the Fastai framework to describe the content … ☆24 · Dec 8, 2022 · Updated 3 years ago