apple / ml-hypercloning
☆52 Updated last year
Alternatives and similar repositories for ml-hypercloning
Users that are interested in ml-hypercloning are comparing it to the libraries listed below
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- ☆108 Updated 5 months ago
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS … ☆60 Updated last year
- Train, tune, and infer Bamba model ☆137 Updated 7 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 Updated 2 years ago
- ☆82 Updated last year
- ☆48 Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"