evanatyourservice / kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
☆96 · Updated 3 months ago
Alternatives and similar repositories for kron_torch
Users who are interested in kron_torch are comparing it to the libraries listed below.
- 🧱 Modula software package ☆291 · Updated 2 months ago
- Efficient optimizers ☆275 · Updated last week
- supporting pytorch FSDP for optimizers ☆83 · Updated 10 months ago
- Scalable and Performant Data Loading ☆311 · Updated last week
- ☆218 · Updated 10 months ago
- Getting crystal-like representations with harmonic loss ☆192 · Updated 6 months ago
- DeMo: Decoupled Momentum Optimization ☆194 · Updated 10 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆186 · Updated last year
- ☆67 · Updated 11 months ago
- ☆91 · Updated last year
- WIP ☆93 · Updated last year
- ☆81 · Updated last year
- Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training". ☆136 · Updated last week
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆102 · Updated 10 months ago
- ☆150 · Updated last year
- Focused on fast experimentation and simplicity ☆75 · Updated 10 months ago
- Dion optimizer algorithm ☆369 · Updated 3 weeks ago
- ☆120 · Updated 4 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆132 · Updated last year
- For optimization algorithm research and development. ☆543 · Updated last week
- research impl of Native Sparse Attention (2502.11089) ☆62 · Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆168 · Updated 4 months ago
- Supporting code for the blog post on modular manifolds. ☆94 · Updated last month
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆147 · Updated 3 weeks ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆320 · Updated 3 months ago
- ☆102 · Updated 3 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆72 · Updated 6 months ago
- ☆58 · Updated last year
- Normalized Transformer (nGPT) ☆192 · Updated 11 months ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆123 · Updated last year