cofe-ai / Mu-scaling

Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
31 · Updated last year

Alternatives and similar repositories for Mu-scaling:

Users who are interested in Mu-scaling are comparing it to the libraries listed below.