mcanini / SysML-reading-list

Systems for ML/AI & ML/AI for Systems paper reading list: A curated reading list of computer science research for work at the intersection of machine learning and systems. PRs are welcome.

☆278

Alternatives and similar repositories for SysML-reading-list

Users who are interested in SysML-reading-list are comparing it to the repositories listed below

ucbrise / cs294-ai-sys-sp19
CS294; AI For Systems and Systems For AI
☆224 Updated 5 years ago
SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆143 Updated last week
mosharaf / eecs598
Advanced Topics on Systems for X
☆277 Updated last year
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 Updated 3 years ago
bharathgs / Awesome-Distributed-Deep-Learning
A curated list of awesome Distributed Deep Learning resources.
☆427 Updated last year
lsds / KungFu
Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.
☆297 Updated last year
guanh01 / CS692-mlsys
This is the (evolving) reading list for the seminar.
☆59 Updated 4 years ago
stanford-mast / INFaaS
Model-less Inference Serving
☆90 Updated last year
uwsampl / nexus
☆83 Updated last month
GHGmc2 / awesome-ml-infra
Building Machine Learning Infrastructure!
☆44 Updated 6 years ago
petuum / autodist
Simple Distributed Deep Learning on TensorFlow
☆133 Updated last month
ucbrise / cs294-ai-sys-fa19
CS294-162; Machine Learning Systems Seminar
☆31 Updated 2 years ago
byteps / examples
BytePS examples (Vision, NLP, GAN, etc)
☆19 Updated 2 years ago
msr-fiddle / pipedream
☆393 Updated 2 years ago
SymbioticLab / Tiresias
Tiresias is a GPU cluster manager for distributed deep learning training.
☆155 Updated 5 years ago
tbd-ai / tbd-suite
☆47 Updated 2 years ago
kzhang28 / Optimus
An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
☆42 Updated 7 years ago
lsds / Crossbow
Crossbow: A Multi-GPU Deep Learning System for Training with Small Batch Sizes
☆55 Updated 2 years ago
anandj91 / p3
☆21 Updated 2 years ago
AlibabaPAI / DAPPLE
An Efficient Pipelined Data Parallel Approach for Training Large Model
☆77 Updated 4 years ago
xldrx / tictac
☆22 Updated 6 years ago
dlsys-course / tinyflow
Tutorial code on how to build your own Deep Learning System in 2k Lines
☆125 Updated 8 years ago
ucbrise / cirrus
Serverless ML Framework
☆106 Updated 3 years ago
marcoszh / MArk-Project
Exploiting Cloud Services for Cost-Effective, SLO-Aware Machine Learning Inference Serving
☆37 Updated 5 years ago
suquark / hoplite
☆45 Updated 3 years ago
netx-repo / training-bottleneck
Analyze network performance in distributed training
☆18 Updated 4 years ago
petuum / adaptdl
Resource-adaptive cluster scheduler for deep learning training.
☆447 Updated 2 years ago
CGCL-codes / Tensorflow-RDMA
Tensorflow is a computational library using data flow graphs for scalable machine learning, and Tensorflow-RDMA is the implementation ov…
☆58 Updated 2 years ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆210 Updated 4 years ago
geoffxy / habitat
🔮 Execution time predictions for deep neural network training iterations across different GPUs.
☆63 Updated 2 years ago