lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioners, low-rank approximation preconditioner, and more)
☆188 · Updated last week
Alternatives and similar repositories for psgd_torch
Users who are interested in psgd_torch are comparing it to the libraries listed below.
- ASDL: Automatic Second-order Differentiation Library for PyTorch ☆190 · Updated 10 months ago
- LoRA for arbitrary JAX models and functions ☆141 · Updated last year
- ☆218 · Updated 10 months ago
- Parameter-Free Optimizers for PyTorch ☆131 · Updated last year
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆74 · Updated 3 months ago
- minGPT in JAX ☆48 · Updated 3 years ago
- ☆234 · Updated 8 months ago
- JMP is a Mixed Precision library for JAX. ☆208 · Updated 8 months ago
- A library for unit scaling in PyTorch ☆132 · Updated 3 months ago
- JAX Synergistic Memory Inspector ☆179 · Updated last year
- 🧱 Modula software package ☆291 · Updated 2 months ago
- JAX/Flax rewrite of Karpathy's nanoGPT ☆62 · Updated 2 years ago
- ☆60 · Updated 3 years ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆95 · Updated 10 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆35 · Updated 3 years ago
- A simple library for scaling up JAX programs ☆144 · Updated 11 months ago
- Running JAX in PyTorch Lightning ☆113 · Updated 10 months ago
- A functional training loops library for JAX ☆88 · Updated last year
- A Python package of computer vision models for the Equinox ecosystem. ☆109 · Updated last year
- Code accompanying our paper "Feature Learning in Infinite-Width Neural Networks" (https://arxiv.org/abs/2011.14522) ☆63 · Updated 4 years ago
- Implementations and checkpoints for ResNet, Wide ResNet, ResNeXt, ResNet-D, and ResNeSt in JAX (Flax). ☆114 · Updated 3 years ago
- Simple and efficient RevNet-Library for PyTorch with XLA and DeepSpeed support and parameter offload ☆131 · Updated 3 years ago
- ☆283 · Updated last year
- ☆17 · Updated last year
- ☆115 · Updated last month
- ☆58 · Updated last year
- If it quacks like a tensor... ☆59 · Updated 11 months ago
- Supporting PyTorch FSDP for optimizers ☆83 · Updated 10 months ago
- Experiment in using Tangent to autodiff Triton ☆80 · Updated last year
- MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvement… ☆400 · Updated this week