lsds / Crossbow

Crossbow: A Multi-GPU Deep Learning System for Training with Small Batch Sizes

☆56

Alternatives and similar repositories for Crossbow

Users who are interested in Crossbow are comparing it to the repositories listed below.

netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆126 Updated 3 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆146 Updated 3 months ago
Funatiq / gossip
gossip: Efficient Communication Primitives for Multi-GPU Systems
☆59 Updated 3 years ago
msr-fiddle / DS-Analyzer
☆38 Updated 4 years ago
byteps / examples
BytePS examples (Vision, NLP, GAN, etc)
☆19 Updated 2 years ago
anandj91 / p3
☆21 Updated 2 years ago
uclasystem / dorylus
Dorylus: Affordable, Scalable, and Accurate GNN Training
☆76 Updated 4 years ago
tbd-ai / tbd-suite
☆47 Updated 2 years ago
czkkkkkk / gccl
☆14 Updated 4 years ago
suquark / hoplite
☆44 Updated 4 years ago
uwsampl / nexus
☆83 Updated 4 months ago
msr-fiddle / CoorDL
☆24 Updated 2 years ago
stanford-mast / INFaaS
Model-less Inference Serving
☆92 Updated last year
msr-fiddle / CheckFreq
☆56 Updated 4 years ago
netx-repo / training-bottleneck
Analyze network performance in distributed training
☆19 Updated 5 years ago
uw-mad-dash / shockwave
Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]
☆45 Updated 2 years ago
SymbioticLab / Fluid
A Generic Resource-Aware Hyperparameter Tuning Execution Engine
☆15 Updated 3 years ago
SJTU-IPADS / reef-artifacts
A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.
☆43 Updated 3 years ago
xldrx / tictac
☆22 Updated 6 years ago
SymbioticLab / Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.
☆163 Updated 5 years ago
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆76 Updated 4 years ago
CGCL-codes / Tensorflow-RDMA
Tensorflow is a computational library using data flow graphs for scalable machine learning, and Tensorflow-RDMA is the implementation ov…
☆58 Updated 2 years ago
uclasystem / MemLiner
MemLiner is a remote-memory-friendly runtime system.
☆31 Updated 2 years ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆43 Updated 2 years ago
shriramsb / vDNN
☆22 Updated 6 years ago
kanonjz / paper
Machine Learning System
☆14 Updated 5 years ago
sands-lab / omnireduce
☆68 Updated 2 years ago
rkhan055 / SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
☆35 Updated 2 years ago
jasperzhong / read-papers-and-code
My paper/code reading notes in Chinese
☆46 Updated 4 months ago
parasailteam / coconet
☆83 Updated 2 years ago