sanyalsunny111 / Early_Weight_Avg
[COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training
☆17, updated 11 months ago
Alternatives and similar repositories for Early_Weight_Avg
Users who are interested in Early_Weight_Avg are comparing it to the libraries listed below
- Embedding Recycling for Language models (☆39, updated 2 years ago)
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" (☆40, updated 10 months ago)
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch (☆29, updated this week)
- (☆14, updated 3 years ago)
- (☆14, updated 11 months ago)
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch (☆39, updated 3 years ago)
- (☆65, updated last year)
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning