Stonesjtu / pytorch-learning
Learning notes from studying the PyTorch source code
☆24 · Updated 6 years ago
Alternatives and similar repositories for pytorch-learning
Users interested in pytorch-learning are comparing it to the libraries listed below:
- Distributed ML Optimizer · ☆32 · Updated 3 years ago
- Research and development for optimizing transformers · ☆126 · Updated 4 years ago
- Training neural networks in TensorFlow 2.0 with 5x less memory · ☆131 · Updated 3 years ago
- ☆108 · Updated last year
- ☆105 · Updated 9 months ago
- Fast Discounted Cumulative Sums in PyTorch · ☆96 · Updated 3 years ago
- PyTorch library for factorized L0-based pruning · ☆45 · Updated last year
- ☆38 · Updated last year
- Experiment in using Tangent to autodiff Triton · ☆79 · Updated last year
- Official PyTorch implementation of Length-Adaptive Transformer (ACL 2021) · ☆101 · Updated 4 years ago
- Customized matrix multiplication kernels · ☆54 · Updated 3 years ago
- Implementation of a Transformer, but completely in Triton · ☆266 · Updated 3 years ago
- Torch Distributed Experimental · ☆117 · Updated 10 months ago
- Python pdb for multiple processes · ☆44 · Updated last week
- Efficient, check-pointed data loading for deep learning with massive data sets · ☆208 · Updated last year
- Simple and efficient PyTorch-native transformer training and inference (batched) · ☆75 · Updated last year
- Example Python package with a pybind11 C++ extension · ☆57 · Updated 4 years ago
- Block-sparse primitives for PyTorch · ☆155 · Updated 4 years ago
- Profile the GPU memory usage of every line in PyTorch code · ☆82 · Updated 6 years ago
- A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton · ☆67 · Updated 10 months ago
- See details in https://github.com/pytorch/xla/blob/r1.12/torch_xla/distributed/fsdp/README.md · ☆24 · Updated 2 years ago
- Block Sparse movement pruning · ☆79 · Updated 4 years ago
- [JMLR'20] NeurIPS 2019 MicroNet Challenge Efficient Language Modeling, Champion · ☆40 · Updated 4 years ago
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training · ☆210 · Updated 9 months ago
- A minimal PyTorch Lightning OpenAI GPT with DeepSpeed training! · ☆111 · Updated 2 years ago
- Best practices for testing advanced Mixtral, DeepSeek, and Qwen series MoE models using Megatron Core MoE · ☆17 · Updated this week
- ☆250 · Updated 10 months ago
- Transformer with Mu-Parameterization, implemented in JAX/Flax. Supports FSDP on TPU pods · ☆30 · Updated last week
- ☆47 · Updated 4 years ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8 · ☆44 · Updated 10 months ago