Stonesjtu / pytorch-learning
Learning notes from studying the PyTorch source code
☆24 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for pytorch-learning
- Distributed ML Optimizer ☆31 · Updated 3 years ago
- Implementation of a Transformer, but completely in Triton ☆248 · Updated 2 years ago
- Profile the GPU memory usage of every line in a Pytorch code ☆82 · Updated 6 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021) ☆100 · Updated 4 years ago
- ☆76 · Updated 5 months ago
- Fast Discounted Cumulative Sums in PyTorch ☆95 · Updated 3 years ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆61 · Updated 7 months ago
- ☆47 · Updated 4 years ago
- ☆132 · Updated last year
- Inference code for LLaMA models ☆19 · Updated 5 months ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆76 · Updated 2 years ago
- Torch Distributed Experimental ☆116 · Updated 3 months ago
- Block Sparse movement pruning ☆78 · Updated 3 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 2 years ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆80 · Updated 10 months ago
- Method to improve inference time for BERT. This is an implementation of the paper titled "PoWER-BERT: Accelerating BERT Inference via Pro… ☆58 · Updated last year
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆63 · Updated 2 years ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry ☆38 · Updated 9 months ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion ☆40 · Updated 3 years ago
- NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆60 · Updated 2 weeks ago
- Triton-based implementation of Sparse Mixture of Experts. ☆184 · Updated last month
- ☆42 · Updated 4 years ago
- Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers ☆196 · Updated 2 months ago
- Example python package with pybind11 cpp extension ☆57 · Updated 3 years ago
- A library to create and manage configuration files, especially for machine learning projects. ☆77 · Updated 2 years ago
- A block oriented training approach for inference time optimization. ☆29 · Updated 2 months ago
- Pytorch library for factorized L0-based pruning. ☆43 · Updated last year
- Utilities for Training Very Large Models ☆56 · Updated last month