KellerJordan / Muon

Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead
69 · Updated this week

Related projects

Alternatives and complementary repositories for Muon