gpauloski / kfac-pytorch
Distributed K-FAC preconditioner for PyTorch
☆95 · Updated this week
Alternatives and similar repositories for kfac-pytorch
Users who are interested in kfac-pytorch are comparing it to the libraries listed below:
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning ☆283 · Feb 27, 2023 · Updated 2 years ago
- Pytorch implementation of KFAC - this is a port of https://github.com/tensorflow/kfac/ ☆30 · Jun 6, 2024 · Updated last year
- BERT for Distributed PyTorch + AMP Training ☆12 · Mar 15, 2023 · Updated 2 years ago
- A Chainer extension for K-FAC ☆20 · Jun 16, 2019 · Updated 6 years ago
- SKFAC Preconditioner for MindSpore ☆12 · Jul 2, 2021 · Updated 4 years ago
- Repository containing Pytorch code for EKFAC and K-FAC preconditioners. ☆152 · Jun 22, 2023 · Updated 2 years ago
- {KFAC,EKFAC,Diagonal,Implicit} Fisher Matrices and finite width NTKs in PyTorch ☆221 · Updated this week
- PyTorch-SSO: Scalable Second-Order methods in PyTorch ☆148 · Oct 1, 2023 · Updated 2 years ago
- An implementation of KFAC for TensorFlow ☆199 · Feb 11, 2022 · Updated 4 years ago
- Pytorch optimizers implementing Hilbert Constrained Gradient Descent ☆19 · May 9, 2019 · Updated 6 years ago
- Regularization, Neural Network Training Dynamics ☆14 · Jan 13, 2020 · Updated 6 years ago
- ☆135 · Oct 23, 2017 · Updated 8 years ago
- Efficient reference implementations of the static & dynamic M-FAC algorithms (for pruning and optimization) ☆17 · Feb 23, 2022 · Updated 3 years ago
- ☆30 · Feb 11, 2021 · Updated 5 years ago
- ☆19 · Jan 27, 2021 · Updated 5 years ago
- Computing gradients and Hessians of feed-forward networks with GPU acceleration ☆20 · Feb 14, 2024 · Updated 2 years ago
- ☆13 · Jun 2, 2022 · Updated 3 years ago
- Layer-wise Sparsification of Distributed Deep Learning ☆10 · Jul 6, 2020 · Updated 5 years ago
- This repository contains the results for the paper: "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers" ☆184 · Jul 17, 2021 · Updated 4 years ago
- ☆10 · Apr 29, 2023 · Updated 2 years ago
- Repository for the paper: "TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining" ACL Oral 2025 ☆20 · Jan 31, 2026 · Updated 2 weeks ago
- Official Pytorch Implementation for the paper 'SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients' ☆17 · Jan 12, 2022 · Updated 4 years ago
- Code for reproducing experiments performed for Accordion ☆13 · Jun 11, 2021 · Updated 4 years ago
- Universal Python binding for the LMDB 'Lightning' Database ☆13 · Nov 7, 2017 · Updated 8 years ago
- Source code for "Taming GANs with Lookahead–Minmax", ICLR 2021. ☆15 · Mar 28, 2021 · Updated 4 years ago
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths ☆15 · Jul 10, 2025 · Updated 7 months ago
- The code for the NeurIPS19 paper and blog on "Uniform convergence may be unable to explain generalization in deep learning". ☆10 · Oct 26, 2019 · Updated 6 years ago
- Code for the ICML 2021 paper "Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer" ☆12 · Aug 17, 2021 · Updated 4 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 · Dec 10, 2022 · Updated 3 years ago
- AN EFFICIENT AND GENERAL FRAMEWORK FOR LAYERWISE-ADAPTIVE GRADIENT COMPRESSION ☆14 · Oct 27, 2023 · Updated 2 years ago
- ☆12 · Nov 5, 2019 · Updated 6 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining ☆12 · Dec 4, 2023 · Updated 2 years ago
- Last-layer Laplace approximation code examples ☆83 · Oct 18, 2021 · Updated 4 years ago
- Analyze AdaHessian optimizer on 2D functions. ☆13 · Aug 13, 2021 · Updated 4 years ago
- Official code for the paper "Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks". ☆16 · Dec 7, 2021 · Updated 4 years ago
- code for experiments in Grosse and Salakhutdinov, 2015. ☆12 · Oct 9, 2016 · Updated 9 years ago
- Look for arXiv papers in a Zotero library and find available DOIs of published versions. ☆14 · Apr 11, 2024 · Updated last year
- Our implementation of Shampoo optimizer based on https://arxiv.org/pdf/1802.09568.pdf ☆12 · Dec 23, 2019 · Updated 6 years ago
- The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training” ☆981 · Jan 30, 2024 · Updated 2 years ago