lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioners, low-rank approximation preconditioner, and more)
☆174 Updated this week
Alternatives and similar repositories for psgd_torch
Users who are interested in psgd_torch are comparing it to the libraries listed below.
- ☆180 Updated 5 months ago
- 🧱 Modula software package ☆189 Updated last month
- ☆60 Updated 3 years ago
- Automatically take good care of your preemptible TPUs ☆36 Updated 2 years ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆70 Updated 2 weeks ago
- A library for unit scaling in PyTorch ☆125 Updated 5 months ago
- Efficient optimizers ☆193 Updated this week
- Experiment of using Tangent to autodiff triton ☆78 Updated last year
- A functional training loops library for JAX ☆88 Updated last year
- LoRA for arbitrary JAX models and functions ☆136 Updated last year
- Supporting PyTorch FSDP for optimizers ☆80 Updated 5 months ago
- JMP is a Mixed Precision library for JAX. ☆198 Updated 3 months ago
- ☆217 Updated 10 months ago
- Implementation of PSGD optimizer in JAX ☆33 Updated 4 months ago
- ☆53 Updated 7 months ago
- Parameter-Free Optimizers for PyTorch ☆127 Updated last year
- A simple library for scaling up JAX programs ☆134 Updated 6 months ago
- minGPT in JAX ☆48 Updated 3 years ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆92 Updated 5 months ago
- JAX Synergistic Memory Inspector ☆173 Updated 9 months ago
- ASDL: Automatic Second-order Differentiation Library for PyTorch ☆185 Updated 5 months ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆127 Updated 2 years ago
- Run PyTorch in JAX. 🤝 ☆242 Updated 2 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- Hessian spectral density estimation in TF and JAX ☆123 Updated 4 years ago
- seqax = sequence modeling + JAX ☆155 Updated last month
- Code for the article "What if Neural Networks had SVDs?", to be presented as a spotlight paper at NeurIPS 2020. ☆75 Updated 9 months ago
- ☆109 Updated this week
- ☆43 Updated last month
- ☆226 Updated 3 months ago