lmsdss / LayerNorm-Scaling

Official PyTorch implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu
38 · Updated 3 weeks ago

Alternatives and similar repositories for LayerNorm-Scaling:

Users who are interested in LayerNorm-Scaling are comparing it to the libraries listed below.