Stonesjtu / pytorch-learning
Learning notes from reading the PyTorch source code
☆24 Updated 6 years ago
Alternatives and similar repositories for pytorch-learning
Users that are interested in pytorch-learning are comparing it to the libraries listed below
- Efficient, check-pointed data loading for deep learning with massive data sets. ☆210 Updated 2 years ago
- Implementation of a Transformer, but completely in Triton ☆277 Updated 3 years ago
- ☆121 Updated last year
- Fast Discounted Cumulative Sums in PyTorch ☆96 Updated 4 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 3 years ago
- Distributed ML Optimizer ☆34 Updated 4 years ago
- Train very large language models in Jax. ☆210 Updated 2 years ago
- Profile the GPU memory usage of every line in a Pytorch code ☆83 Updated 7 years ago
- ☆252 Updated last year
- Torch Distributed Experimental ☆117 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- ☆190 Updated this week
- A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training! ☆113 Updated 2 years ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆162 Updated 3 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆79 Updated last year
- ☆68 Updated 8 months ago
- PyTorch centric eager mode debugger ☆48 Updated last year
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆117 Updated 3 years ago
- ☆62 Updated 3 years ago
- ☆29 Updated 3 years ago
- ☆150 Updated 2 years ago
- Experiment of using Tangent to autodiff triton ☆81 Updated last year
- Research and development for optimizing transformers ☆131 Updated 4 years ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆73 Updated last year
- PyTorch RFCs (experimental) ☆136 Updated 6 months ago
- Block-sparse primitives for PyTorch ☆160 Updated 4 years ago
- ☆364 Updated last year
- Python pdb for multiple processes ☆70 Updated 6 months ago
- JORA: JAX Tensor-Parallel LoRA Library (ACL 2024) ☆36 Updated last year
- Pytorch library for factorized L0-based pruning. ☆45 Updated 2 years ago