hku-systems / vpipeLinks

☆25

Alternatives and similar repositories for vpipe

Users that are interested in vpipe are comparing it to the libraries listed below

Sorting:

AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆76Updated 4 years ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆88Updated 2 years ago
parasailteam / coconet
☆83Updated 2 years ago
microsoft / SuperScaler
An experimental parallel training platform
☆54Updated last year
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆126Updated 3 years ago
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆54Updated last year
SymbioticLab / Oobleck
A resilient distributed training framework
☆95Updated last year
HPMLL / BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆213Updated 3 months ago
zhuohan123 / terapipe
☆75Updated 4 years ago
mutinifni / splitwise-sim
LLM serving cluster simulator
☆116Updated last year
UofT-EcoSystem / hfta
Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion
☆32Updated last year
HPDL-Group / Merak
☆81Updated 5 months ago
microsoft / msccl-tools
Synthesizer for optimal collective communication algorithms
☆118Updated last year
uclasystem / bamboo
Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.
☆51Updated 2 years ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43Updated 2 years ago
DachengLi1 / AMP
(NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters.
☆41Updated 2 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆46Updated last year
Raphael-Hao / Abacus
☆38Updated 3 months ago
ConnollyLeon / awesome-Auto-Parallelism
A baseline repository of Auto-Parallelism in Training Neural Networks
☆147Updated 3 years ago
geoffxy / habitat
🔮 Execution time predictions for deep neural network training iterations across different GPUs.
☆62Updated 2 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆146Updated 2 months ago
microsoft / nnscaler
nnScaler: Compiling DNN models for Parallel Training
☆118Updated last month
S-Lab-System-Group / HeliosData
Helios Traces from SenseTime
☆58Updated 3 years ago
casys-kaist / glet
☆52Updated 9 months ago
stanford-futuredata / gavel
Code for "Heterogenity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
☆130Updated last year
kungfu-team / tenplex
Dynamic resources changes for multi-dimensional parallelism training
☆29Updated 2 months ago
Relaxed-System-Lab / HexGen
[ICML 2024] Serving LLMs on heterogeneous decentralized clusters.
☆30Updated last year
msr-fiddle / synergy
☆51Updated 2 years ago
ParCIS / Chimera
Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.
☆67Updated 7 months ago
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆35Updated 2 years ago