lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioners, low-rank approximation preconditioner, and more)
☆154 · Updated last month
Alternatives and similar repositories for psgd_torch:
Users who are interested in psgd_torch are comparing it to the libraries listed below:
- ☆149 · Updated last month
- Implementation of PSGD optimizer in JAX ☆27 · Updated 3 weeks ago
- Modula software package ☆134 · Updated this week
- ASDL: Automatic Second-order Differentiation Library for PyTorch ☆182 · Updated last month
- Experiment of using Tangent to autodiff triton ☆74 · Updated last year
- ☆50 · Updated 3 months ago
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆106 · Updated 2 years ago
- LoRA for arbitrary JAX models and functions ☆135 · Updated 11 months ago
- Automatically take good care of your preemptible TPUs ☆34 · Updated last year
- Efficient optimizers ☆154 · Updated this week
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆89 · Updated last month
- JMP is a Mixed Precision library for JAX. ☆189 · Updated last month
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆14 · Updated 2 months ago
- ☆217 · Updated 9 months ago
- ☆203 · Updated 6 months ago
- A functional training loops library for JAX ☆86 · Updated 11 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 · Updated 2 years ago
- Easy-to-use AdaHessian optimizer (PyTorch) ☆77 · Updated 4 years ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆29 · Updated 3 weeks ago
- A library for unit scaling in PyTorch ☆122 · Updated 2 months ago
- Running Jax in PyTorch Lightning ☆86 · Updated last month
- Optimization algorithm which fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization) ☆12 · Updated last year
- Accelerated First Order Parallel Associative Scan ☆170 · Updated 5 months ago
- A simple library for scaling up JAX programs ☆129 · Updated 2 months ago
- supporting pytorch FSDP for optimizers ☆75 · Updated last month
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆66 · Updated 2 months ago
- ☆58 · Updated 2 years ago
- Unofficial JAX implementations of deep learning research papers ☆153 · Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆82 · Updated last year
- PyTorch code for the Fromage optimiser. ☆123 · Updated 6 months ago