Distributed K-FAC preconditioner for PyTorch
☆95 · Mar 17, 2026 · Updated last week
Alternatives and similar repositories for kfac-pytorch
Users who are interested in kfac-pytorch are comparing it to the libraries listed below.
- BERT for Distributed PyTorch + AMP Training ☆12 · Mar 15, 2023 · Updated 3 years ago
- Pytorch implementation of KFAC and E-KFAC (Natural Gradient). ☆133 · Jul 2, 2019 · Updated 6 years ago
- A Chainer extension for K-FAC ☆20 · Jun 16, 2019 · Updated 6 years ago
- Pytorch implementation of KFAC - this is a port of https://github.com/tensorflow/kfac/ ☆30 · Jun 6, 2024 · Updated last year
- ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning ☆284 · Feb 27, 2023 · Updated 3 years ago
- ☆135 · Oct 23, 2017 · Updated 8 years ago
- BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient. ☆607 · Nov 28, 2025 · Updated 3 months ago
- An implementation of KFAC for TensorFlow ☆199 · Feb 11, 2022 · Updated 4 years ago
- Second Order Optimization and Curvature Estimation with K-FAC in JAX. ☆317 · Mar 16, 2026 · Updated last week
- {KFAC,EKFAC,Diagonal,Implicit} Fisher Matrices and finite width NTKs in PyTorch ☆221 · Mar 17, 2026 · Updated last week
- PyTorch-SSO: Scalable Second-Order methods in PyTorch ☆149 · Oct 1, 2023 · Updated 2 years ago
- ☆33 · Jul 8, 2024 · Updated last year
- PyHessian is a Pytorch library for second-order based analysis and training of Neural Networks ☆778 · Jul 10, 2025 · Updated 8 months ago
- Code to simulate energy-based analog systems and equilibrium propagation ☆32 · Apr 6, 2025 · Updated 11 months ago
- Efficient reference implementations of the static & dynamic M-FAC algorithms (for pruning and optimization) ☆17 · Feb 23, 2022 · Updated 4 years ago
- ☆14 · Sep 14, 2021 · Updated 4 years ago
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705) ☆24 · Nov 4, 2024 · Updated last year
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions. ☆13 · Apr 6, 2021 · Updated 4 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 · Dec 10, 2022 · Updated 3 years ago
- ☆10 · Apr 29, 2023 · Updated 2 years ago
- Limitations of the Empirical Fisher Approximation ☆49 · Mar 3, 2025 · Updated last year
- In this project, we propose to study Vision Transformers trained using the Barlow Twins self-supervised method, and compare the results w… ☆16 · Oct 3, 2023 · Updated 2 years ago
- Advanced optimizer with Gradient-Centralization ☆21 · Aug 26, 2020 · Updated 5 years ago
- ☆47 · Nov 18, 2022 · Updated 3 years ago
- Anonymized code for Igeood: An Information Geometry Approach to Out-of-Distribution Detection ☆12 · Jan 25, 2022 · Updated 4 years ago
- Repository to reproduce the results of the paper "Holomorphic Equilibrium Propagation Computes Exact Gradients Through Finite Size Oscill… ☆11 · Oct 20, 2024 · Updated last year
- ☆16 · Nov 24, 2025 · Updated 4 months ago
- Repository for the paper: "TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining" ACL Oral 2025 ☆22 · Mar 6, 2026 · Updated 2 weeks ago
- Computing gradients and Hessians of feed-forward networks with GPU acceleration ☆20 · Feb 14, 2024 · Updated 2 years ago
- Collection of algorithms for approximating Fisher Information Matrix for Natural Gradient (and second order method in general) ☆143 · May 26, 2019 · Updated 6 years ago
- The code for the NeurIPS19 paper and blog on "Uniform convergence may be unable to explain generalization in deep learning". ☆10 · Oct 26, 2019 · Updated 6 years ago
- This repository contains the results for the paper: "Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers" ☆184 · Jul 17, 2021 · Updated 4 years ago
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆16 · Sep 15, 2023 · Updated 2 years ago
- MANOVA.RM ☆11 · Feb 7, 2025 · Updated last year
- ☆33 · Dec 3, 2019 · Updated 6 years ago
- A LARS implementation in PyTorch ☆353 · Feb 21, 2020 · Updated 6 years ago
- Pytorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition… ☆192 · Updated this week
- ☆13 · Feb 24, 2020 · Updated 6 years ago
- This is the code associated with the paper A Variational Inequality Perspective for Generative Adversarial Networks. ☆43 · May 1, 2019 · Updated 6 years ago