lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation preconditioner and more)
☆171 Updated this week
Alternatives and similar repositories for psgd_torch:
Users who are interested in psgd_torch are comparing it to the libraries listed below:
- ☆52 Updated 6 months ago
- ASDL: Automatic Second-order Differentiation Library for PyTorch ☆185 Updated 3 months ago
- Modula software package ☆187 Updated this week
- Implementation of PSGD optimizer in JAX ☆30 Updated 3 months ago
- ☆172 Updated 4 months ago
- Code for the article "What if Neural Networks had SVDs?", to be presented as a spotlight paper at NeurIPS 2020. ☆74 Updated 8 months ago
- Efficient optimizers ☆185 Updated this week
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆108 Updated 2 years ago
- ☆60 Updated 3 years ago
- Experiment of using Tangent to autodiff triton ☆78 Updated last year
- A library for unit scaling in PyTorch ☆125 Updated 4 months ago
- A simple library for scaling up JAX programs ☆134 Updated 5 months ago
- ☆215 Updated 8 months ago
- A functional training loops library for JAX ☆86 Updated last year
- JMP is a Mixed Precision library for JAX. ☆193 Updated 2 months ago
- ☆87 Updated 3 weeks ago
- supporting pytorch FSDP for optimizers ☆80 Updated 3 months ago
- Hessian spectral density estimation in TF and Jax ☆122 Updated 4 years ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆68 Updated last week
- seqax = sequence modeling + JAX ☆151 Updated 2 weeks ago
- Unofficial JAX implementations of deep learning research papers ☆154 Updated 2 years ago
- Jax/Flax rewrite of Karpathy's nanoGPT ☆57 Updated 2 years ago
- Automatically take good care of your preemptible TPUs ☆36 Updated last year
- Accelerated First Order Parallel Associative Scan ☆180 Updated 7 months ago
- ☆221 Updated last month
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆14 Updated 4 months ago
- LoRA for arbitrary JAX models and functions ☆135 Updated last year
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule ☆60 Updated last year
- Neural Networks for JAX ☆83 Updated 6 months ago