Distributed-AI / PipeTransformerLinks

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021

☆56

Alternatives and similar repositories for PipeTransformer

Users that are interested in PipeTransformer are comparing it to the libraries listed below

Sorting:

lzhangbv / dear_pytorch
[ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
☆11Updated last year
spcl / substation
Research and development for optimizing transformers
☆131Updated 4 years ago
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆134Updated 4 months ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41Updated 2 years ago
Youhe-Jiang / IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages from both data and model parallelism. If you have any inte…
☆52Updated 2 years ago
HKBU-HPML / ddl-benchmarks
ddl-benchmarks: Benchmarks for Distributed Deep Learning
☆36Updated 5 years ago
facebookresearch / fairring
Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large …
☆65Updated 3 years ago
GuanhuaWang / sensAI
sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
☆65Updated last year
DS3Lab / DT-FM
☆93Updated 3 years ago
mlbench / mlbench-benchmarks
Distributed ML Training Benchmarks
☆27Updated 2 years ago
hpcaitech / TensorNVMe
A Python library transfers PyTorch tensors between CPU and NVMe
☆120Updated 10 months ago
epfml / powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
☆148Updated 11 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43Updated 2 years ago
cli99 / flops-profiler
pytorch-profiler
☆51Updated 2 years ago
parasj / checkmate
Training neural networks in TensorFlow 2.0 with 5x less memory
☆136Updated 3 years ago
awslabs / raf
☆145Updated 8 months ago
thu-pacman / FasterMoE
☆87Updated 3 years ago
mlcommons / training_results_v1.0
This repository contains the results and code for the MLPerf™ Training v1.0 benchmark.
☆37Updated last year
xuqifan897 / Optimus
☆28Updated 4 years ago
zhuohan123 / terapipe
☆75Updated 4 years ago
HPDL-Group / Merak
☆81Updated 5 months ago
uwsampl / dtr-prototype
Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616
☆132Updated 2 years ago
stanford-futuredata / stk
☆112Updated last year
awslabs / lorien
☆42Updated 2 years ago
mit-han-lab / inter-operator-scheduler
[MLSys 2021] IOS: Inter-Operator Scheduler for CNN Acceleration
☆199Updated 3 years ago
geoffxy / habitat
🔮 Execution time predictions for deep neural network training iterations across different GPUs.
☆62Updated 2 years ago
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67Updated 7 months ago
meta-pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆161Updated last month
stanford-mast / INFaaS
Model-less Inference Serving
☆92Updated last year
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆76Updated 4 years ago