cofe-ai / Mu-scaling

Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
32 · Updated last year

Alternatives and similar repositories for Mu-scaling

Users who are interested in Mu-scaling compare it to the libraries listed below.
