HomebrewML/HeavyBall

Efficient optimizers

☆335

Alternatives and similar repositories for HeavyBall

Users that are interested in HeavyBall are comparing it to the libraries listed below.

ethansmith2000 / fsdp_optimizers
Supporting PyTorch FSDP for optimizers
☆84 · Dec 8, 2024 · Updated last year
nikhilvyas / SOAP
☆273 · Dec 2, 2024 · Updated last year
evanatyourservice / kron_torch
An implementation of the PSGD Kron second-order optimizer for PyTorch
☆102 · Jul 24, 2025 · Updated 11 months ago
fal-ai / diffusion-speedrun
Focused on fast experimentation and simplicity
☆77 · Dec 24, 2024 · Updated last year
evanatyourservice / llm-jax
Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.
☆19 · Jul 24, 2025 · Updated 11 months ago
lixilinx / psgd_torch
PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioner, low-rank approximation precondition…
☆198 · May 30, 2026 · Updated last month
facebookresearch / schedule_free
Schedule-Free Optimization in PyTorch
☆2,314 · Jun 18, 2026 · Updated last month
facebookresearch / optimizers
For optimization algorithm research and development.
☆578 · Updated this week
microsoft / dion
Dion optimizer algorithm
☆494 · Jul 12, 2026 · Updated last week
cloneofsimo / repa-rf
☆32 · Nov 4, 2024 · Updated last year
ClashLuke / SOAP
☆22 · Nov 9, 2024 · Updated last year
BlinkDL / SmallInitEmb
LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence
☆61 · Feb 21, 2022 · Updated 4 years ago
riverstone496 / awesome-second-order-optimization
☆32 · May 17, 2026 · Updated 2 months ago
cloneofsimo / zeroshampoo
☆33 · Sep 10, 2024 · Updated last year
nikhilvyas / SOAP_MUON
Combining SOAP and MUON
☆22 · Feb 11, 2025 · Updated last year
KellerJordan / Muon
Muon is an optimizer for hidden layers in neural networks
☆2,714 · May 24, 2026 · Updated last month
NVIDIA-NeMo / Emerging-Optimizers
☆209 · Updated this week
modula-systems / modula
🧱 Modula software package
☆337 · Aug 18, 2025 · Updated 11 months ago
cloneofsimo / scaling-guide
WIP
☆96 · Aug 13, 2024 · Updated last year
NoahAmsel / PolarExpress
☆32 · Jul 6, 2026 · Updated 2 weeks ago
lindermanlab / elk
Scalable and Stable Parallelization of Nonlinear RNNs
☆33 · Jun 28, 2026 · Updated 3 weeks ago
proger / accelerated-scan
Accelerated First Order Parallel Associative Scan
☆198 · Jan 7, 2026 · Updated 6 months ago
LIONS-EPFL / scion
☆70 · Apr 8, 2026 · Updated 3 months ago
cloneofsimo / ezmup
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam.
☆88 · Jul 28, 2024 · Updated last year
apple / ml-cross-entropy
☆610 · Sep 23, 2025 · Updated 9 months ago
graphcore-research / unit-scaling
A library for unit scaling in PyTorch
☆134 · Jul 11, 2025 · Updated last year
warner-benjamin / optimi
Fast, Modern, and Low Precision PyTorch Optimizers
☆129 · May 16, 2026 · Updated 2 months ago
Noumena-Network / nmoe
MoE training for Me and You and maybe other people
☆394 · Mar 15, 2026 · Updated 4 months ago
xjdr-alt / mla_blog_translation
☆13 · Jun 18, 2024 · Updated 2 years ago
ClashLuke / tpucare
Automatically take good care of your preemptible TPUs
☆37 · May 15, 2023 · Updated 3 years ago
proger / nanokitchen
Parallel Associative Scan for Language Models
☆18 · Jan 8, 2024 · Updated 2 years ago
Sike-Wang / low-bit-Shampoo
4-bit Shampoo for Memory-Efficient Network Training (NeurIPS 2024)
☆13 · Feb 13, 2025 · Updated last year
thib-s / flash-newton-schulz
My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation.
☆38 · Apr 30, 2026 · Updated 2 months ago
kyleliang919 / C-Optim
[ICLR 2026] When it comes to optimizers, it's always better to be safe than sorry
☆417 · Sep 26, 2025 · Updated 9 months ago
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 · Oct 5, 2024 · Updated last year
patrick-kidger / patdb
A snappy + easy + pretty TUI debugger for Python.
☆71 · May 22, 2026 · Updated last month
euclaise / supertrainer2000
☆50 · Mar 14, 2024 · Updated 2 years ago
leloykun / adaptive-muon
A single-line modification to any (dualizer-based) optimizer that allows the optimizer to adapt to the scale of the gradients as they cha…
☆19 · Jan 11, 2025 · Updated last year
kvfrans / matrix-whitening
Code for "What really matters in matrix-whitening optimizers?"
☆25 · Oct 31, 2025 · Updated 8 months ago