ZQZCalin / trainit
☆13Updated last month
Alternatives and similar repositories for trainit
Users that are interested in trainit are comparing it to the libraries listed below
- PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…☆190Updated 3 weeks ago
- [ICML 2024] SINGD: KFAC-like Structured Inverse-Free Natural Gradient Descent (http://arxiv.org/abs/2312.05705)☆24Updated last year
- Minimal pretraining script for language modeling in PyTorch. Supporting torch compilation and DDP. It includes a model implementation and…☆41Updated 2 months ago
- PyTorch linear operators for curvature matrices (Hessian, Fisher/GGN, KFAC, ...)☆62Updated this week
- ☆53Updated last month
- Sketched linear operations for PyTorch☆100Updated 3 months ago
- IVON optimizer for neural networks based on variational learning.☆81Updated last year
- PyTorch-like dataloaders for JAX.☆99Updated last month
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496)☆15Updated last year
- Implementation of PSGD optimizer in JAX☆35Updated last year
- 🧱 Modula software package☆322Updated 5 months ago
- A library for unit scaling in PyTorch☆133Updated 6 months ago
- ASDL: Automatic Second-order Differentiation Library for PyTorch☆191Updated last year
- Parameter-Free Optimizers for PyTorch☆130Updated last year
- ☆18Updated last year
- ☆246Updated last year
- Distributed K-FAC preconditioner for PyTorch☆94Updated last week
- Amortized Probabilistic Conditioning for Optimization, Simulation and Inference (Chang et al., AISTATS 2025)☆21Updated 2 weeks ago
- ☆62Updated last year
- LoRA for arbitrary JAX models and functions☆144Updated last year
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule☆64Updated 2 years ago
- Code to reproduce experiments in Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows☆13Updated last year
- Lightning-like training API for JAX with Flax☆45Updated last year
- ☆16Updated 2 years ago
- ☆21Updated 2 years ago
- Maximal Update Parametrization (μP) with Flax & Optax.☆16Updated 2 years ago
- ☆124Updated 7 months ago
- Unofficial implementation of the Linear Recurrent Unit (LRU, Orvieto et al. 2023)☆61Updated 5 months ago
- ☆11Updated 4 years ago
- ☆33Updated last year