lmsdss / LayerNorm-Scaling

Official PyTorch implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, and Shiwei Liu

Alternatives and similar repositories for LayerNorm-Scaling:

Users who are interested in LayerNorm-Scaling compare it to the libraries listed below.