lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioners, low-rank approximation preconditioner, and more)
★180 · Updated last week
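Since the entry above only names the technique, here is a minimal, self-contained sketch of the core idea behind preconditioned SGD: fit a preconditioner from Hessian-vector products and take preconditioned gradient steps. This is an illustrative toy, not psgd_torch's actual API; the least-squares problem, the diagonal preconditioner, and all variable names (`A`, `b`, `theta`, `p`) are assumptions made for the example.

```python
import torch

# Hypothetical toy problem: least squares on random data (illustration only).
torch.manual_seed(0)
A = torch.randn(64, 16)
b = torch.randn(64)
theta = torch.zeros(16, requires_grad=True)

# Diagonal preconditioner estimate P = diag(p). The idea (in the spirit of
# PSGD, but not using the library) is to fit p from Hessian-vector products:
# the single-sample closed-form diagonal fit is p_i = |v_i| / |h_i|,
# smoothed here with an exponential moving average.
p = torch.ones(16)
lr, beta = 0.05, 0.9

for step in range(100):
    loss = ((A @ theta - b) ** 2).mean()
    (g,) = torch.autograd.grad(loss, theta, create_graph=True)

    # Hessian-vector product h = H v with a random probe vector v.
    v = torch.randn(16)
    (h,) = torch.autograd.grad(g, theta, grad_outputs=v)

    # Update the diagonal preconditioner estimate.
    p = beta * p + (1 - beta) * (v.abs() / h.abs().clamp_min(1e-8))

    # Preconditioned gradient step: theta <- theta - lr * P * g.
    with torch.no_grad():
        theta -= lr * p * g
```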
Alternatives and similar repositories for psgd_torch
Users interested in psgd_torch are comparing it to the libraries listed below.
- ASDL: Automatic Second-order Differentiation Library for PyTorch ★188 · Updated 8 months ago
- 🧱 Modula software package ★216 · Updated last week
- ★206 · Updated 8 months ago
- LoRA for arbitrary JAX models and functions ★140 · Updated last year
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ★112 · Updated 3 years ago
- Parameter-Free Optimizers for Pytorch ★130 · Updated last year
- ★60 · Updated 3 years ago
- JMP is a Mixed Precision library for JAX. ★207 · Updated 6 months ago
- A simple library for scaling up JAX programs ★140 · Updated 9 months ago
- A functional training loops library for JAX ★88 · Updated last year
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ★94 · Updated 8 months ago
- Run PyTorch in JAX. ★266 · Updated 3 weeks ago
- ★115 · Updated this week
- Neural Networks for JAX ★84 · Updated 10 months ago
- ★53 · Updated 10 months ago
- ★40 · Updated last year
- minGPT in JAX ★48 · Updated 3 years ago
- ★232 · Updated 5 months ago
- Pytorch-like dataloaders for JAX. ★94 · Updated 2 months ago
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ★62 · Updated 4 years ago
- A library for unit scaling in PyTorch ★128 · Updated 3 weeks ago
- JAX Synergistic Memory Inspector ★177 · Updated last year
- Named tensors with first-class dimensions for PyTorch ★332 · Updated 2 years ago
- Running Jax in PyTorch Lightning ★109 · Updated 7 months ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ★128 · Updated 3 years ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ★71 · Updated last month
- ★158 · Updated last year
- Experiment of using Tangent to autodiff triton ★79 · Updated last year
- ★104 · Updated last year
- Jax/Flax rewrite of Karpathy's nanoGPT ★59 · Updated 2 years ago